936: Password Strength

Explain xkcd: It's 'cause you're dumb.
Revision as of 01:29, 4 February 2014 by Davidy22 (Talk | contribs)

Jump to: navigation, search
Password Strength
To anyone who understands information theory and security and is in an infuriating argument with someone who does not (possibly involving mixed case), I sincerely apologize.
Title text: To anyone who understands information theory and security and is in an infuriating argument with someone who does not (possibly involving mixed case), I sincerely apologize.

Explanation

Computer security consultant Mark Burnett has posted a discussion and analysis of this comic on his blog.

This comic is saying that the password in the top frames "Tr0ub4dor&3" is easier for password cracking software to guess because it's shorter in length than "correcthorsebatterystaple" and also more difficult for a human to remember, leading to insecure practices like writing the password down on a post-it attached to the monitor (Tr0ub4dor&3).

It is absolutely true that people make passwords hard to remember because they think they are "safer", and it is certainly true that length, all other things being equal, tends to make for very strong passwords and this can confirmed by using rumkin.com's password strength checker. Even if the individual characters are all limited to [a-z], the exponent implied in "we added another lowercase character, so multiply by 26 again" tends to dominate the results.

In addition to being easier to remember, and long strings of lowercase characters are also easier to type on smartphones and soft keyboards.

xkcd's entropy estimate of 11 bits per word assumes that the password is being brute-forced with a dictionary attack, and that the words are being chosen from a dictionary of 2000 words (log2(2000) ≈ 11). For comparison, the entropy offered by Diceware's 7776 word dictionary is 13 bits per word. If a dictionary attack were not used, the "common words" password would take even longer to crack than depicted. 25 random lowercase characters would have 117 bits of entropy, vs 44 bits for the dictionary words.

Steve Gibson from the Security Now podcast did a lot of work in this arena and found that the password D0g..................... (24 characters long) is stronger than PrXyc.N(n4k77#L!eVdAfp9 (23 characters long) because both have at least one uppercase letter, lowercase letter, number, and "special" character, so length trumps perceived complexity. Steve Gibson makes this very clear in his password haystack reference guide and tester:

"Once an exhaustive password search begins, the most important factor is password length!"

The important thing to take away from this comic is that longer passwords are better because each additional character adds much more time to the breaking of the password. That's what Randall is trying to get through here. Complexity does not matter unless you have length in passwords. Complexity is more difficult for humans to remember, but length is not.

Transcript

The comic illustrates the relative strength of passwords assuming basic knowledge of the system used to generate them.
A set of boxes is used to indicate how many bits of entropy a section of the password provides.
The comic is laid out with 6 panels arranged in a 3x2 grid.
On each row, the first panel explains the breakdown of a password, the second panel shows how long it would take for a computer to guess, and the third panel provides an example scene showing someone trying to remember the password.
[The password "Tr0ub4dor&3" is shown in the center of the panel. A line from each annotation indicates the word section the comment applies to.]
Uncommon (non-gibberish) base word
[Highlighting the base word - 16 bits of entropy.]
Caps?
[Highlighting the first letter - 1 bit of entropy.]
Common Substitutions
[Highlighting the letters 'a' (substituted by '4') and both 'o's (the first of which is substituted by '0') - 3 bits of entropy.]
Punctuation
[Highlighting the symbol appended to the word - 4 bits of entropy.]
Numeral
[Highlighting the number appended to the word - 3 bits of entropy.]
Order unknown
[Highlighting the appended characters - 1 bit of entropy.]
(You can add a few more bits to account for the fact that this is only one of a few common formats.)
~28 bits of entropy
228 = 3 days at 1000 guesses sec
(Plausible attack on a weak remote web service. Yes, cracking a stolen hash is faster, but it's not what the average user should worry about.)
Difficulty to guess: Easy.
[Cueball stands scratching his head trying to remember the password.]
Cueball: Was it trombone? No, Troubador. And one of the O's was a zero?
Cueball: And there was some symbol...
Difficulty to remember: Hard.
[The passphrase "correct horse battery staple" is shown in the center of the panel.]
Four random common words {Each word has 11 bits of entropy.}
~44 bits of entropy
244 = 550 years at 1000 guesses sec
Difficulty to guess: Hard.
[Cueball is thinking, in his thought bubble a horse is standing to one side talking to an off-screen observer. An arrow points to a staple attached to the side of a battery.]
Horse: That's a battery staple.
Observer: Correct!
Difficulty to remember: You've already memorized it
Through 20 years of effort, we've successfully trained everyone to use passwords that are hard for humans to remember, but easy for computers to guess.

External links

  • Some info was used from the highest voted answer given to the question of "how accurate is this XKCD comic" at StackExchange [1].
  • Similarly, a question of "how right this comic is" was made at AskMetaFilter [2] and Randall responded there.
  • Also the Wikipedia article on 'Passphrase' is useful.
  • In case you missed it in the explanation, GRC's Steve Gibson has a fantastic page [3] about this (and may have prompted this comic, as his podcast [4] about this was posted the month before this comic).
comment.png add a comment! ⋅ Icons-mini-action refresh blue.gif refresh comments!

Discussion

You still have to vary the words with a bit of capitalization, punctuation and numbers a bit, or hackers can just run a dictionary attack against your string of four words. Davidy²²[talk] 09:12, 9 March 2013 (UTC)

No you don't. Hackers cannot run a dictionary attack against a string of four randomly picked words. Look at the number of bits displayed in the image: 11 bits for each word. That means he's assuming a dictionary of 2048 words, from which each word is picked randomly. The assumption is that the cracker knows your password scheme. 86.81.151.19 20:17, 28 April 2013 (UTC) Willem

I just wrote a program to bruteforce this password creation method. https://github.com/KrasnayaSecurity/xkcd936/blob/master/listGen936.py Once I get it I'll try coming up with more bruteforcing algorithms such as substituting symbols, numbers, camel case, and the like. Point is, don't rely on this or any one method. I wouldn't be surprised if the crackers are already working on something like this. Lieutenant S. (talk) 07:03, 8 September 2014 (UTC)
It took 1.25 hours to bruteforce "correcthorsebatterystaple" using the 2,000 most common words with one CPU. Lieutenant S. (talk) 07:09, 9 September 2014 (UTC)
1) ... as compared to 69 milliseconds for the other method. 2) Since you are able to test 3,9 billion passwords as second (very impressive!) I am guessing that your setup is not performing its attack over a ”weak remote service”, which is breaking the rules of the #936 game. 3) five words and a 20k-wordlist would get you 9400 years (still breaking the weak remote service rule).--Gnirre (talk) 09:13, 14 October 2014 (UTC)

Sometimes this is not possible. (I'm looking at you, local banks with 8-12 character passwords and PayPal) If I can, I use a full sentence. A compound sentence for the important stuff. This adds the capitalization, punctuation and possibly the use of numbers while it's even easier to remember then Randall's scheme. I think it might help against the keyloggers too, if your browser/application autofills the username filed, because you password doesn't stand out from the feed with being gibberish. 195.56.58.169 09:01, 30 August 2013 (UTC)

The basic concept can be adapted to limited-length passwords easily enough: memorize a phrase and use the first letter of each word. It'll require about a dozen words (you're only getting 4.7 bits per letter at best, actually less because first letters of words are not truly random, though they are weakly if at all correlated with their neighbors -- based on the frequencies of first letters of words in English, and assuming no correlation between each first letter and the next, I calculate about 4 bits per character of Shannon entropy). SteveMB 18:35, 30 August 2013 (UTC)

Followup: The results of extracting the first letters of words in sample texts (the Project Gutenberg texts of The Adventures of Huckleberry Finn, The War of the Worlds, and Little Fuzzy) and applying a Shannon entropy calculation were 4.07 bits per letter (i.e. first letter in word) and 8.08 bits per digraph (i.e. first letters in two consecutive words). These results suggest that first-letter-of-phrase passwords have approximately 4 bits per letter of entropy. --SteveMB (talk) 14:21, 4 September 2013 (UTC)

Addendum: The above test was case-insensitive (all letters converted to lowercase before feeding them to the [frequency counter]). Thus, true-random use of uppercase and lowercase would have 5 bits per letter of entropy, and any variation in case (e.g. preserving the case of the original first letter) would fall between 4 and 5 bits per letter. --SteveMB (talk) 14:28, 4 September 2013 (UTC)

I just have RANDOM.ORG print me ten pages of 8-character passwords and tape it to the wall, then highlight some of them and use others (say two down and to the right or similar) for my passwords, maybe a given line a line a little jumbled for more security. 70.24.167.3 13:27, 30 September 2013 (UTC)

Remind me to visit your office and secretly replace your wall-lists by a list of very similar looking strings ;) --Chtz (talk) 13:53, 30 September 2013 (UTC)

Simple.com (online banking site) had the following on it’s registration page:

“Passphrase? Yes. Passphrases are easier to remember and more secure than traditional passwords. For example, try a group of words with spaces in between, or a sentence you know you'll remember. "correct horse battery staple" is a better passphrase than r0b0tz26.”

Online security for a banking site has been informed by an online comic. Astounding. 173.245.54.78 21:22, 11 November 2013 (UTC)

The Web service Dropbox has an Easter egg related to this comic on their sign-up page. That page has a password strength indicator (powered by JavaScript) which changes as you type your password. This indicator also shows hints when hovering the mouse cursor over it. Entering "Tr0ub4dor&3" or "Tr0ub4dour&3" as the password causes the password strength indicator to fall to zero, with the hint saying, "Guess again." Entering "correcthorsebatterystaple" as the password also causes the strength indicator to fall to zero, but the hint says, "Whoa there, don't take advice from a webcomic too literally ;)." 108.162.218.95 15:17, 11 February 2014 (UTC)

The explanation said that the comic uses a dictionary[5]. In fact it's a word list, which seems similar but it's not. All the words in the word list must be easy to memorize. This means it's better not to have words such as than or if. Also, it's better not to have homophones (wood and would, for example). The sentence dictionary attack doesn't apply here. A dictionary attack requires the attacker to use all the words in the dictionary (e.g. 100,000 words). Here we must generate the 17,592,186,044,416 combinations of 4 common words. Those combinations can't be found in any dictionary. At 25 bytes per "word" that dictionary would need 400 binary terabytes to be stored. Xhfz (talk) 21:37, 11 March 2014 (UTC)

This comic was mentioned in a TED talk by Lorrie Faith Cranor on in March 2014. After performing a lot of studies and analysis, she concludes that "pass phrase" passwords are no easier to remember than complex passwords and that the increased length of the password increases the number of errors when typing it. There is a lot of other useful information from her studies that can be gleaned from the talk. Link. What she doesn't mention is the frequency of changing passwords - in most organizations it's ~90 days. I don't know where that standard originated, but (as a sys admin) I suspect it's about as ineffective as most of our other password trickery - that is that it does nothing. Today's password thieves don't bash stolen password hash tables, they bundle keyloggers with game trainers and browser plugins.--173.245.50.75 18:14, 2 July 2014 (UTC)

Lorrie Faith Cranor gets the random part of #936 word generation correct, which is great. Regarding memorizability, this study (https://cups.cs.cmu.edu/soups/2012/proceedings/a7_Shay.pdf) does not address #936. The study uses no generator for gibberish of length 11. Most comparable are perhaps two classes of five or six randomly assigned characters. None of the study's generators has 44 bits of entropy – its dictionary for the method closest to #936 – noun-instr – contains only 181 nouns. The article contains no discussion of the significance of these differences to #936. In her TED Lorrie Faith Cranor says ”sorry all you xkcd fans” which could be interpreted as judgement of #936, but there is no basis in the above article for that. It does however seem plausible that the report could be reworked to address #936. --Gnirre (talk) 10:42, 14 October 2014 (UTC)
Password-changing frequency isn't about making passwords more secure, but instead it's about mitigating the damage of a successfully cracked password. If a hacker gets your password (through any means) and your password changes every 90 days, the password the hacker has obtained is only useful for a few months at most. That might be enough, but it might not. If the hacker is brute forcing the passwords to get them, that cuts into the time the password is useful. --173.245.54.168 22:22, 13 October 2014 (UTC)
However, brute-forcing gets much easier that way.
Say the average employee is around for 10 years, which is reasonable for some companies , absurdly high for others, and a bit low for a family business. That's 40 password changes.
Now if you have to remember another password every now and then, you sacrifice complexity, lest you forget it. A factor of 40 is like one character less. But how much shorter will the password be? It's more likely that it's gonna be 3 or 4 characters less. Congrats, you just a factor of 1000's for a perceived "mitigation", which doesn't even work. Pro attackers can vacuum your server in a DAY once they have the PW. 141.101.104.53 13:03, 4 December 2014 (UTC)

Just because you are required to have a password that has letters and numbers in it doesn't mean you can't make it memorable. When caps are required, use CamelCase. When punctuation is required, make it an ampersand (&) or include a contraction. When numbers are required, pick something that has significance to you (your birthday, the resolution of your television, ect.). Keep in mind that, if your phrase is an actual sentence, the password entropy is 1.1 bits per character (http://what-if.xkcd.com/34), so length is key if you want your password to be secure. (Though no known algorithm can actually exploit the 1.1 bits of entropy to gain time, so it might be more like 11 bits of entropy per word. Even then, my passwords have nonexistent and uncommon words in them, (like doge or trope), which also adds some entropy.) 108.162.246.213 22:18, 1 September 2014 (UTC)

Flip side of the story, the "capital plus small plus other char" policy doesn't make your password any safer.
The German company T-online had an experimental gateway with the password, "internet". Now that sucked. No problem, tho, because that gateway wasn't accessible from outside. When they went live, they "improved" the password to "Internet1". There are still lots of these passwords around: first letter is a Cap, and the only non-alphabetic char is a 1 at the end. This doesn't add any entropy. 141.101.104.53 13:03, 4 December 2014 (UTC)
This shows that about one third of all digits in a sample of passwords was "1" . 141.101.104.53 13:14, 4 December 2014 (UTC)

You can also troll the brute-force engine by using words from other languages, fictional books and video games.--Horsebattery (talk) 03:04, 3 November 2014 (UTC)

That's a good idea; it adds to the entropy bits per word. If you really want to throw them off, mix different languages. Just don't use very well-known words; I'm sure the hackers have cojones and Blitzkrieg in their dictionaries. 141.101.104.53 13:03, 4 December 2014 (UTC)
Personal tools
Namespaces

Variants
Actions
Navigation
Tools

It seems you are using noscript, which is stopping our project wonderful ads from working. Explain xkcd uses ads to pay for bandwidth, and we manually approve all our advertisers, and our ads are restricted to unobtrusive images and slow animated GIFs. If you found this site helpful, please consider whitelisting us.

Want to advertise with us, or donate to us with Paypal or Bitcoin?