Monday, September 12, 2011

SAS Passwords Suck

There, I said it. Apparently anyone else who uses SAS passwords hasn't had the guts to say that. Here's why I say so:
  • SAS data set passwords must adhere to the rules for SAS names. These rules include:
    • The first character must be a letter or the underscore (1)
    • All other characters can be letters, numbers, or the underscore (1)
    • The length cannot exceed 8 characters (2)
  • Using PROC SQL, programmers can enter longer passwords than are acceptable, without any notification (unlike in the DATA step), and these passwords become truncated. This lets the programmer think they are applying good passwords. Worse, if someone else has to use the password other than the programmer who created it, they will find it is broken and will not be able to continue with their work.
  • There is an option to encode passwords, but as the SAS documentation notes (3): "With encoding, one character set is translated to another character set through some form of table lookup. An encoded password is intended to prevent casual, non-malicious viewing of passwords. You should not depend on encoded passwords for all your data security needs; a determined and knowledgeable attacker can decode the encoded passwords." So I'm not quite sure what the point is of such an option, since it's easily hacked.
  • SAS data set passwords are typically stored as plain text within SAS code. For anyone who wants to get the password, find the source code. Usually the directory is not secured in the OS.
  • If the password is incorrect, a pop-up appears, which can be turned off (4). So a hacker could turn off the option and keep cycling through passwords until the correct one is reached.
  • Requiring passwords on data sets implies that other, standard systems of securing and backing up data are not trusted; furthermore, it conveys a level of distrust for the SAS programmers who are using the data sets. Other than human error, what is a well-meaning SAS programmer going to do with the data?
Someone out there might say, why not use data set generations (5)? (Generations are basically automatic copies of SAS data sets that have a limit on the number of copies.) There are several reasons against this:
  1. It's a waste of disk space, especially with large data sets, and especially if you have a good backup system.
  2. The generations are eventually deleted. Once an item reaches the maximum number, it's deleted. If that data set was important, why would you want to delete it, ever?
  3. It's confusing. There are no explicit explanation of what a generation may mean. It could be that one value was off in one generation so the data had to be re-run. It could be the whole data set was the wrong one. Why would you want to keep a history of that? It's simply garbage.
So what's a SAS administrator to do? First, use the operating system to manage rights by folder access. If only certain people should access the SAS data sets and code, lock the files for only those users. Second, implement good OS password policies. Those passwords can be by far better than SAS's! Third, back up hard drives automatically. Fourth, set a coding standard for making backups/copies in SAS before overwriting important data sets, which simply makes it easier to restore from a bad run (instead of finding the actual backup). (The DATASETS procedure is great for bulk copying and modifying SAS data sets.) Fifth, trust your programmers. If you can't trust your programmers, why are they working for you? Why do you give them important work? Do you really want to convey that to them? What does it do for their morale and self worth?

I hope by posting this SAS realizes that their password mechanism is essentially broken and useless. Even a novice hacker could figure out how to break into such a system. Other options in the system don't really help, which leads back to the OS itself. And the OS probably has much better tools for managing access and passwords that SAS does.


1 comment:

  1. Chris, I think you recognize the key limitation: SAS data set passwords are not meant to be a substitute for more robust security mechanisms, such as file system and database permissions.

    See for some more ideas to reduce reliance on passwords in your code.


Note: Only a member of this blog may post a comment.