Tuesday, November 15, 2011

SAS and Ampersands

I realized I'm a SAS geek today. On my way to work, near the UW and VA hospitals, I saw a street sign that displayed "UW&VA" and I wondered, "What does &VA resolve to?"

Road sign: 'UW&VA Hospitals Parking Right Lane'
What does &VA resolve to?

So you know when you're a SAS geek when you see a sign with an ampersand immediately followed by characters, and wonder, "What does that resolve to?" Bad joke, go to your work library!

Seriously, though, this brings up a good topic: How do you resolve macro variables with special characters in the value? For example, if we have the macro variable parking_sign with the value "UW&VA", and this value is assigned automatically in a data step using the CALL SYMPUT routine or SQL using the INTO statement, neither of which would generate an error in the log regarding the value of the macro variable VA.

There are a couple of solutions to solve this and avoid an error in the log:
  1. Define the macro variable VA as "VA" (i.e., %let va=VA;). This is a quick fix that allows the process to continue by creating a value for the macro variable that the code is mistakenly trying to look up. But it does not solve the problem completely, since new strings with ampersands or even percent signs could be introduced into your source data. This process could be automated to generate multiple macro variables for each string with an ampersand found, but it would not work for macro programs (which use percent signs).
  2. Edit the source data and mask the characters after the ampersand (or percent sign) with the %STR (or related) function. This solution would work, but it would be rather awkward, and your source data would have a strange-looking value (i.e. "UW&%str(V)A"), which might not be a good solution if you have to use the data in a different way later, in which the %STR function would not execute, leaving the odd-looking raw value. It gets even more complicated when percent signs are involved.
  3. Instead of resolving the macro variable with the ampersand, resolve it with the %SUPERQ function. This function is a better solution, since it avoids any special characters problems and does not edit the source. Basically it tells SAS to resolve only the first level of the macro variable, and no further. However, it is not as easy to implement, since it requires replacing all references to the macro variable (e.g., &PARKING_SIGN) with different code (i.e., %SUPERQ(PARKING_SIGN));
  4. Use the %SUPERQ function with an additional step that redefines the PARKING_SIGN macro variable (see below). This is the best solution, since the rest of the code does not have to be re-written. The %LET statement is used to immediately redefine the macro variable using the %SUPERQ function, which will not resolve the values any further. Better yet, this is only one line of very simple code, and it fixes the problem forever!
/* Dummy data */
data test;
    input parking_sign $;
    format parking_sign $15.;
    infile datalines;
datalines;
UW&VA
;
run;

/* Generate macro variable */
data _null_;
    set test;
    call symputx('parking_sign', parking_sign);
run;

/* Use macro variable by writing it to the log. */
/* Note: Generates a (w)arning on the first run. */
%put &PARKING_SIGN;

/* Solution: Redefine macro variable */
%let parking_sign=%superq(PARKING_SIGN);

/* Use macro variable by writing it to the log. */
/* Note: This time there is no (w)arning. */
%put &PARKING_SIGN;

This last solution is by far the simplest and least amount of work to fix the problem.

For more on macro variable quoting (the %STR and %SUPERQ functions), see the SAS website:

http://support.sas.com/documentation/cdl/en/mcrolref/61885/HTML/default/viewer.htm#tw3514-overvwqt.htm

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.