Thursday, February 3, 2011

The Powerful SAS Macro

In SAS, there are three main types of code: the DATA step, procedures (PROC), and the macro language. The macro language is incredibly powerful. It lets users write dynamic programs through conditional logic, iteration, or a combination of the two. A macro program can be as simple as running the exact same DATA step the same way every time, or as dynamic as generating custom code based on its input. (See here and here for more information on the macro language. For some examples of general-use macros, see my macro site.)
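To make the "conditional logic plus iteration" idea concrete, here is a minimal sketch. The macro name PRINT_ALL and its parameters DSETS and N are hypothetical, made up for illustration only, and are not taken from the project described in this post. It loops over a space-separated list of datasets and generates a PROC PRINT step only for the ones that exist.

```sas
/* Hypothetical macro for illustration only (not from the project discussed here). */
/* Prints the first &n rows of each dataset named in &dsets.                       */
%macro print_all(dsets=, n=5);
    %local i ds;
    %let i = 1;
    /* Use a blank as the only delimiter so that libref.member names stay whole. */
    %let ds = %scan(&dsets, &i, %str( ));

    /* Iteration: one pass per dataset name in the list. */
    %do %while (%length(&ds) > 0);

        /* Conditional logic: generate a PROC PRINT only if the dataset exists. */
        %if %sysfunc(exist(&ds)) %then %do;
            title "First &n rows of &ds";
            proc print data=&ds (obs=&n);
            run;
        %end;
        %else %put NOTE: Dataset &ds does not exist, skipping.;

        %let i = %eval(&i + 1);
        %let ds = %scan(&dsets, &i, %str( ));
    %end;

    title; /* clear the title set inside the loop */
%mend print_all;

/* One line of code at the call site generates one PROC PRINT step per dataset. */
%print_all(dsets=sashelp.class sashelp.cars, n=3);
```

The call at the bottom is a single line, but the code SAS actually runs (and the log it writes) grows with the number of datasets in the list.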

As a visual demonstration of the power of the SAS macro language, here is a comparison of SAS code and log output, printed on paper with a similar format:


SAS code (33 pages), including project-specific macros (13 pages)
(Courier New font, size 10; 1.5 line spacing; .5" margins)


SAS log from above code (391 pages)
(Courier New font, size 10; original line spacing; .5" margins)

(Note: The SAS log is a record of what SAS has done. When code is run, it documents input and output record counts, issues like warnings and errors, run time, and other information. See this page for more information.)

The code was formatted with 1.5 line spacing, whereas the line spacing in the log was left as-is. Additionally, the code called general-use macros (macros not specific to this project). Their source was not included in the printed code, but each call still wrote output to the log. For example, the DUPCHECK macro, which identifies and resolves duplicates, was used often. A single call, one line of code, generates 42 lines of code in the simplest case. (For an example of its use, see code and log.)

Even so, the ability to do so much work with so few lines of code is incredibly powerful and efficient. I am able to save time and energy, make processing more efficient and effective, simplify and standardize code, and share processes with other users. If you are a SAS user and you do not know the macro language, I encourage you to start learning!

2 comments:

  1. The three types of code that you mention (DATA step, procs, macro) are important for Base SAS programmers to master. For statistical programmers, there is a fourth: the SAS/IML language. The SAS/IML language is used by statisticians to compute analyses that are not available in any SAS procedure. I also use it for data manipulation. As I show in my blog and book, sometimes a simple IML program can replace a complicated macro.

  2. @ Rick: Ah yes, but that doesn't come with Base SAS, right? I've never had the chance to use products other than Base, STAT, and Graph.

