Theoretical and Practical Password Entropy
We got a number of submissions pointing to today's XKCD cartoon [1] . I think the cartoon is great, and illustrates a nice dilemma in password security. Yes, I know passwords don't work, but we still all use them and we still have to come up with reasonable passwords.
Even if you are using a password safe tool that comes up with new random passwords for each application and website, you still need to remember the password for the password safe, and there are a few applications (e.g. logging in to your system) that can't be covered by a password safe.
The basic dilemma is that you need to come up with a password that is hard to guess for others but easy enough for you to remember. Most password policies try to enforce a hard to guess password by forcing you to extend the range of characters from which you pick (different case letters, numbers, special characters). However, in real life, this may actually reduce the space of "memorable" passwords, or the total number of possible passwords.
Pass phrases, as suggested by the cartoon, are one solution. But once an attacker knows that you use a pass phrase, the key space is all for sudden limited again. There has been some research showing that a library of 3 word phrases pulled from wikipedia makes a decent dictionary to crack these passwords.
The qualify of a password is usually expressed in "bits of entropy". The "bits of entropy" are calculated by the number of bits it would take to represent all possible passwords. Lets look at some common schemes:
a 4 digit PIN: 10,000 possible passwords, or 13.3 bits (ln2(10,000)=13.3)
12 characters using the full 95 characters ASCII set: 5.4 10^23, or 78.8 bits. (this is the current NIST recommendation)
Pass phrases are harder to evaluate. It depends on the size of the vocabulary of the user, and of course the constraints of grammar. People will likely not choose some random words, but a phrase that makes some sense to them. One model that can be used to obtain a passphrase is called "Diceware", but it assumes random phrases from 6^5 words (7,776).If you consider Diceware's 7,776 words, you would need 6 words to arrive at the same 77.5 bits, close to strength that NIST asks for.
What it all comes down to: How are people actually selecting passwords? People make pretty bad random number generators, in particular if you ask them to remember the result. A good password cracking algorithm takes this into account and tailors the password list based on password requirements and the targets background. For example, for web application pen testing, the simple ruby script "cewl" will create a custom password list from words it finds on the targets website. In past tests, I was easily able to double my password cracking success using this technique if compared to normal dictionaries.
In order to solve this, we need to figure out what passwords people really use. How about asking them for their password and offering them a candy bar in return :). And then there is always another XKCD cartoon for you [2]
[1] http://www.xkcd.org/936/
[2] http://xkcd.com/538/
------
Johannes B. Ullrich, Ph.D.
SANS Technology Institute
Twitter
Network Monitoring and Threat Detection In-Depth | Baltimore | Mar 3rd - Mar 8th 2025 |
Comments
Rick
Aug 10th 2011
1 decade ago
Jason
Aug 10th 2011
1 decade ago
Steven Chamberlain
Aug 11th 2011
1 decade ago
Steven Chamberlain
Aug 11th 2011
1 decade ago
JJ
Aug 11th 2011
1 decade ago
there are perhaps ~1,000 common words
If you pick 4 random ones, there are
1000 * 1000 * 1000 * 1000 = 1000 ^ 4 = 10^12 possible choices.
That's approximately 2^39 or about 39 bits worth of 'choice'.
The problem is, that amount is only the entropy if your selection actually _IS_ truly random, which means you use a computer to make a truly random selection of 4 words; don't pick words off the top of your head.
As a human, there is likely to be some bias, in the words you would choose, reducing the entropy; if you pick them without a randomizer, there will be less than 39 bits of entropy.
A punctuation mark added, and a random letter capitalized somewhere in the passphrase also helps.
A combination of _both_ approaches is possible, without going overboard and making the password hard to remember.
Mysid
Aug 11th 2011
1 decade ago
1. If you implement account lockouts, even if for ten or twenty attempts, entropy doesn't matter.
2. If they've swiped a password file, they've got all the time in the world without any lockout concerns and a graphics card password cracker can try billions of permutations per second. If it's taking too long, add in a few more graphics cards.
Passwords are passé. The only reason companies use them is because they're free, and you get what you pay for.
JJ
Aug 11th 2011
1 decade ago
For using all of the character types, the max entropy is 70.3 bits at 12 characters. There are 1.48E21 possible permutations here. Assuming a random distribution, the random time to brute force at 3.3 billion passwords per second (achievable with current GPUs) is an average of about 24.8 GPU-years. To get that down to a month, you would need about 27,500 cards--not impossible, but not trivial, either. This increases to an average of 17,900 GPU-years at 13 characters.
I don't yet have complete subsets (e.g., 2 of 4 character classes) as I've not worked out the spreadsheet structure, but it still won't reach the 78.8 bits recommended by NIST. A quick calculation suggests 72.2 bits, but I'm not swearing on that number.
I've thought about doing some research into passphrase strengths where spelling is correct, but the calculations get daunting. The Oxford English Dictionary discusses word count just for English[1] and comes up with 250,000 to 750,000 words, depending on context. Then you get into sentence structure, vocabulary statistics, and regional variations, and it becomes clear why, when no other methods are available, passphrases are by far the better solution.
http://oxforddictionaries.com/page/93
Jarrod
Aug 11th 2011
1 decade ago
Have you added one word per word? I mean; There are permutations of words, like lowercase, uppercase, first letter uppercase which would at least tripple the amount. Have you added whitespaces? What if someone replaces Whitespace with lets say dots or 'x' or whatever. You see, the complexity is even higher!
alibert
Aug 11th 2011
1 decade ago
The number of words in the Oxford English Dictionary is not relevant here. What is relevant here is the number of words in the average person vocabulary. That is only a few thousand at best.
Next you can decimate the permutations, because most people will choose a grammatically correct phrase, e.g. subject transitive-verb object, or subject intransitive-verb adjective. Also you might need to ban "I", "The" and "and".
Greg
Aug 11th 2011
1 decade ago