Something You Know (Password)
The idea here is that you know a secret — often called a password — that nobody else does. Thus, knowledge of a secret distinguishes you from all other individuals. And the authentication system simply needs to check to see if the person claiming to be you knows the secret.
Unfortunately, use of secrets is not a panacea. If the secret is entered at some sort of keyboard, an eavesdropper (“shoulder surfing”) might see the secret being typed. For authenticating machines, we used challenge/response protocols to avoid sending a secret (key) over the wire where it could be intercepted by a wiretapper. But we can’t force humans to engage in a challenge/response protocol on their own, because people cannot be expected to do cryptographic calculations.
Furthermore, people will tend to choose passwords that are easy to remember, which usually means that the password is easy to guess. Or they choose passwords that are difficult to guess but are also difficult to remember (so the passwords must be written down and then are easy for an attacker to find).
Even if a password is not trivial to guess, it might succumb to an offline search of the password space. An offline search needs some way to check a guess without using the system itself, and some methods used today for storing passwords do provide such a way. (See below.)
Finally, changing a password requires human intervention. Thus, compromised passwords could remain valid for longer than is desirable. And there must be some mechanism for resetting the password (because passwords will get forgotten and compromised). This mechanism could itself be vulnerable to social-engineering attacks, which rely on convincing a human with the authority to change or access information that it is necessary to do so.
With all these concerns about passwords, you might wonder what is required for a password to be considered a good one. There are three dimensions, and they interact so that strengthening one can be used to offset a weakness in another.
- Length. This is the easiest dimension for people to strengthen. Longer passwords are better. A good way to get a long password that is seemingly random yet easy to remember is to think of a passphrase (like the first words of a song) and then generate the password from the first letters of the passphrase.
- Character set. The more characters that can be used in a password, the greater the number of possible combinations of characters, so the larger the password space. Searching a larger password space requires more work from an attacker.
- Randomness. Choose a password from a language (English, say) and an attacker can leverage regularities in this language to reduce the work needed in searching the password space (because certain passwords are now “impossible”). For instance, given the phonotactic and orthographic constraints of English, an attacker searching for an English word need not try passwords containing sequences like krz (although this would be a perfectly reasonable sequence to try if the password were known to be in Polish). Mathematically, it turns out that English has about 1.3 bits of information per character. Thus it takes about 49 characters to get 64 bits of “secret”, which comes out to about 10 words (at 5 characters on average per word).
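The interaction between length, character set, and randomness can be made concrete with a little arithmetic. The following is a minimal sketch, assuming each character of a random password is chosen uniformly and independently; the function names are hypothetical and chosen only for illustration.

```python
import math

def password_bits(length: int, charset_size: int) -> float:
    """Bits of secret in a password whose characters are chosen
    uniformly at random from a set of charset_size symbols."""
    return length * math.log2(charset_size)

def chars_for_bits(target_bits: float, bits_per_char: float) -> float:
    """Approximate number of characters needed to reach target_bits
    when each character carries bits_per_char bits of information."""
    return target_bits / bits_per_char

# Same length, larger character set: more bits of secret.
print(f"8 lowercase letters:     {password_bits(8, 26):.1f} bits")
print(f"8 letters and digits:    {password_bits(8, 62):.1f} bits")

# English text at about 1.3 bits per character (figure from the text).
print(f"characters for 64 bits:  {chars_for_bits(64, 1.3):.1f}")
```

The last line comes to a little over 49 characters, which is the figure quoted above.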
When passwords are used for authenticating a user, the system must have a way to check whether the password entered is valid. Simply storing a file with the list of usernames and associated passwords, however, is a bad idea because if the confidentiality of this file were ever compromised all would be lost. (Similarly, backup copies of this file would have to be afforded the same level of protection, since people rarely ever change their passwords.) Better not to store actual passwords on-line. So instead we might compute a cryptographic hash of the password, and store that. Now, the user enters a password; the system computes a hash of that password; and the system then compares that hash with what has been stored in the password file.
Even when password hashes instead of actual passwords are what is being stored, the integrity of this file of hashes must still be protected. Otherwise an attacker could insert a different hash (for a password the attacker knows) and log into the system using that new password.
The problem with having a password file that is not confidential — even if cryptographic hashes are what is being stored — is the possibility of offline dictionary attacks. Here, the attacker computes the hash of every word in some dictionary and then compares each hash with the stored password hashes. If any match, the attacker has learned a password. An alternative to confidentiality for defending against offline dictionary attacks is use of salt. Salt is a random number that is associated with a user and is added to that user’s password when the hash is computed. With high probability, a given pair of users will not have the same salt value. And the system stores both h(password + salt) and the salt for each account.
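Storing h(password + salt) alongside the salt might look like the sketch below. It assumes SHA-256 as a stand-in for h and a 16-byte salt; neither choice comes from the text, and the function names are hypothetical. A deployed system would normally use a deliberately slow password-hashing function in place of a single fast hash.

```python
import hashlib
import hmac
import secrets

SALT_BYTES = 16  # assumed salt size, not taken from the text

def new_record(password: str) -> tuple[bytes, bytes]:
    """Return the (salt, hash) pair the system would store for an account."""
    salt = secrets.token_bytes(SALT_BYTES)
    digest = hashlib.sha256(password.encode("utf-8") + salt).digest()
    return salt, digest

def check(password: str, salt: bytes, stored: bytes) -> bool:
    """Recompute h(password + salt) and compare it with the stored hash."""
    candidate = hashlib.sha256(password.encode("utf-8") + salt).digest()
    # compare_digest avoids leaking where the two values first differ
    return hmac.compare_digest(candidate, stored)

salt, stored = new_record("correct horse")
print(check("correct horse", salt, stored))  # the right password is accepted
print(check("wrong guess", salt, stored))    # any other password is rejected
```

Because the salt is drawn at random per account, two users with the same password will, with high probability, end up with different stored hashes.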
Salt does not make it more difficult for an attacker to guess the password for a given account, since the salt for each account is stored in the clear. What salt does, however, is make it harder for the attacker to perpetrate an offline dictionary attack against all users. When salt is used, all the words in the dictionary would have to be rehashed for every user. What formerly could be seen as a “wholesale” attack has been transformed into a “retail” one.
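The difference between the wholesale and retail attacks can be seen by counting hash computations. This is a toy sketch under stated assumptions: SHA-256 stands in for h, the salts are short fixed strings, and the dictionary, accounts, and function names are all invented for illustration.

```python
import hashlib

def h(password: str, salt: str = "") -> str:
    return hashlib.sha256((password + salt).encode("utf-8")).hexdigest()

def attack_unsalted(stored, dictionary):
    """stored maps user -> h(password). The dictionary is hashed once
    and the resulting table is reused against every account."""
    table = {h(word): word for word in dictionary}
    cracked = {u: table[d] for u, d in stored.items() if d in table}
    return cracked, len(dictionary)

def attack_salted(stored, dictionary):
    """stored maps user -> (salt, h(password + salt)). The dictionary
    must be rehashed for each account using that account's salt."""
    cracked, hashes = {}, 0
    for user, (salt, digest) in stored.items():
        for word in dictionary:
            hashes += 1
            if h(word, salt) == digest:
                cracked[user] = word
                break
    return cracked, hashes

dictionary = ["apple", "banana", "cherry", "dragon"]
passwords = {"alice": "dragon", "bob": "cherry", "carol": "zq9!x"}
unsalted_db = {u: h(p) for u, p in passwords.items()}
salted_db = {u: (f"salt-{u}", h(p, f"salt-{u}")) for u, p in passwords.items()}

print(attack_unsalted(unsalted_db, dictionary))
print(attack_salted(salted_db, dictionary))
```

Both attacks recover the same two passwords, but the unsalted attack costs one pass over the dictionary no matter how many accounts exist, while the cost of the salted attack grows with the number of accounts.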
Salt is used in most UNIX implementations. The salt in early versions of UNIX was 12 bits, and it was formed from the system time and the process identifier when an account was created. Unfortunately, 12 bits is hopelessly small nowadays. Even an old PC can perform 13,000 crypt/sec, which means such a PC can hash a 20k word dictionary with every possible value of a 12-bit salt in under two hours.
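The cost of exhausting a 12-bit salt can be worked out directly from the figures quoted above (13,000 crypt operations per second and a 20,000-word dictionary). The variable names below are chosen only for illustration.

```python
SALT_BITS = 12
CRYPTS_PER_SEC = 13_000     # rate quoted in the text
DICTIONARY_WORDS = 20_000   # dictionary size quoted in the text

salt_values = 2 ** SALT_BITS                   # 4096 possible salts
total_hashes = salt_values * DICTIONARY_WORDS  # every word under every salt
seconds = total_hashes / CRYPTS_PER_SEC
hours = seconds / 3600

print(f"{salt_values} salts x {DICTIONARY_WORDS} words = {total_hashes} hashes")
print(f"about {hours:.2f} hours at {CRYPTS_PER_SEC} crypt/sec")
```

This works out to roughly 1.75 hours of computation, after which the attacker holds a precomputed table covering every salt value.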
Another defense against offline dictionary attacks is to use secret salt (invented by Manber and independently by Abadi and Needham). In this scheme, we select a small set of possible “secret salt” values from a large space. The password file then stores for each user: userid, h(password, public salt, secret salt), public salt. Note that the value of the secret salt used in computing the hash is not saved anyplace. When secret salt is being employed, a user login involves having the system guess the value of secret salt that was used in computing the stored, hashed password; the guess involves checking through the possible secret salt values. The effect is to make computing a hashed password very expensive for attackers.
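A login under the secret-salt scheme might be sketched as follows. The assumptions here are not from the text: SHA-256 stands in for h, the set of possible secret salts is simply the integers 0 to 4095, and the function names are hypothetical. The point is only that the secret salt is never stored, so both the system and an attacker must search for it.

```python
import hashlib
import hmac
import secrets

SECRET_SALT_VALUES = 1 << 12  # assumed size of the secret-salt set

def h(password: str, public_salt: bytes, secret_salt: int) -> bytes:
    data = password.encode("utf-8") + public_salt + secret_salt.to_bytes(2, "big")
    return hashlib.sha256(data).digest()

def enroll(password: str) -> tuple[bytes, bytes]:
    """Return (public salt, hash). The secret salt is used once and discarded."""
    public_salt = secrets.token_bytes(16)
    secret_salt = secrets.randbelow(SECRET_SALT_VALUES)
    return public_salt, h(password, public_salt, secret_salt)

def login(password: str, public_salt: bytes, stored: bytes) -> bool:
    """Try every possible secret salt until one reproduces the stored hash."""
    for guess in range(SECRET_SALT_VALUES):
        if hmac.compare_digest(h(password, public_salt, guess), stored):
            return True
    return False

public_salt, stored = enroll("correct horse")
print(login("correct horse", public_salt, stored))
print(login("wrong guess", public_salt, stored))
```

A legitimate login pays for at most one search over the secret-salt set, whereas an attacker running a dictionary attack pays that cost again for every word tried.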
Examples of Password Systems
We now outline several widely-used password systems.
- Unix. Unix stores a hashed salted password and salt. For the hash, it iterates DES 25 times with an input of “0” and with the password as the key; it then adds the 12-bit salt. As discussed above, this is not strong enough for today’s machines. Some versions of Unix employ a shadow password file, so that it is harder for an attacker to retrieve the hashed passwords. There are then two files: /etc/shadow and /etc/master.password.
- FreeBSD. FreeBSD stores a hashed password (where the hash is based on MD5). There is no limit to the length of the password, and 48 bits of salt are used.
- OpenBSD. OpenBSD does a hash based on Blowfish encryption, and then stores the hashed password along with 128 bits of salt. The system guarantees that no two accounts will have the same salt value.
- Windows NT/2000/XP. NT stores 2 password hashes: one called the LanMan hash and another called the NT hash. The LanMan hash is used for backwards compatibility with Windows 95/98, and it is a very weak scheme. It works as follows: the password is converted to uppercase, padded to 14 characters, and split into two 7-character halves, each of which is hashed independently.
To see the weakness, consider how much work an attacker would have to do to break this scheme. The numbers and uppercase letters together make up 36 characters. Each half of a 14-character password then has $36^7$ possible values, which comes out as 78,364,164,096. The actual work factor then is $2 \times 36^7$ (whereas the theoretical work factor for 14 characters is $36^{14} = 36^7 \times 36^7$).
Note that if upper and lower case were both allowed, then there would be $(2 \times 26) + 10 = 62$ possible characters and thus $62^7$ = 3,521,614,606,208 possible values for each half, which is about 45 times greater than the LanMan value.
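These work factors are easy to check numerically. The short calculation below uses only the figures from the text (36 or 62 possible characters, two 7-character halves); the variable names are illustrative.

```python
LANMAN_CHARS = 36   # digits plus uppercase letters
MIXED_CHARS = 62    # digits plus upper- and lowercase letters
HALF_LENGTH = 7

lanman_half = LANMAN_CHARS ** HALF_LENGTH        # values for one half
lanman_work = 2 * lanman_half                    # halves attacked separately
lanman_ideal = LANMAN_CHARS ** (2 * HALF_LENGTH) # if all 14 had to be searched
mixed_half = MIXED_CHARS ** HALF_LENGTH
ratio = mixed_half / lanman_half

print(f"one LanMan half:       {lanman_half:,}")
print(f"actual work factor:    {lanman_work:,}")
print(f"ideal for 14 chars:    {lanman_ideal:,}")
print(f"one mixed-case half:   {mixed_half:,}")
print(f"mixed-case / LanMan:   {ratio:.1f}")
```

Splitting the password reduces the attacker's work from the product of the two halves to merely their sum, which is why the scheme is so much weaker than its 14-character limit suggests.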
The NT hash is somewhat better. In the NT operating system, there was still a 14 character limit, although this limit was removed in Windows 2000 and XP. The password is then passed through 48 iterations of MD4 to get a 128 bit hash. This hash is stored in the system, but no salt is used at all.
Defense Against Password Theft: A Trusted Path
Given schemes that make passwords hard to guess, an attacker might be tempted to try theft. The attack is: install some sort of program that produces a window resembling a login prompt or otherwise invites the user to reveal a password. Users will then type their passwords into this program, which saves them for later use by the attacker.
How can you defend against such attacks? What we would like is some way for a user to determine the pedigree of any window purporting to be a login prompt. If each point in the pedigree is trusted, then the login prompt window must be trusted and it is safe to enter a password. This idea is called a trusted path.
To implement a trusted path, the keyboard driver recognizes a certain key sequence (Ctrl-Alt-Del in Windows) and then always transfers control to some trusted software that displays a (password prompt) window and reads the contents. Users are educated to type passwords only into windows that appear after typing that special key sequence.
Notice, however, that this scheme requires that a trusted keyboard driver is executing. So, that means the system must be running an operating system that is trusted to prevent keyboard driver substitutions. One might expect that rebooting the machine would be a way to ensure that a trusted operating system is executing (presuming you trust whatever operating system is installed), but what if the OS image on the disk had been altered by an attacker? So, one must be certain that the operating system software stored on the disk has not been modified, too. But even that’s not enough. What about the boot loader, which might have been altered to read a boot block from a non-standard location on the disk? And so it goes. Even if you start each session by booting from your own fresh OS CD, a ROM or even the hardware might have been hacked by an attacker. Physical security of the hardware then must also have been maintained. In the end, though, to the extent that you can trust all layers from the hardware to the keyboard driver, the resulting trusted path provides a way to defend against attacks implemented by programs that attempt to steal passwords by spoofing.