Editing 936: Password Strength

Latest revision Your text
Line 8: Line 8:
  
 
==Explanation==
 
==Explanation==
This comic says that a password such as "Tr0ub4dor&3" is bad because it is easy for password cracking software and hard for humans to remember, leading to insecure practices like writing the password down on a post-it attached to the monitor. On the other hand, a password such as "correct horse battery staple" is hard for computers to guess due to having more entropy but quite easy for humans to remember.
+
Computer security consultant Mark Burnett has posted a [http://xato.net/passwords/analyzing-the-xkcd-comic/ discussion and analysis] of this comic on his blog.
  
{{w|Entropy (information theory)|Entropy}} is a measure of "uncertainty" in an outcome. In this context, it can be thought of as a value representing how unpredictable the next character of a password is. For a password whose symbols are chosen independently and uniformly at random, it is calculated as ''log2(a^b)'', where ''a'' is the number of allowed symbols and ''b'' is the password's length.
+
This comic is saying that the password in the top frames "Tr0ub4dor&3" is easier for password cracking software to guess because it has less entropy than "correcthorsebatterystaple" and also more difficult for a human to remember, leading to insecure practices like writing the password down on a post-it attached to the monitor.
  
A truly random string of length 11 (not like "Tr0ub4dor&3", but more like "J4I/tyJ&Acy") has log2(94^11) = 72.1 bits, with 94 being the total number of letters, numbers, and symbols one can choose. However, the comic shows that "Tr0ub4dor&3" has only 28 bits of entropy. This is because the password follows a simple pattern of a dictionary word plus a couple of extra numbers or symbols, so the entropy is more appropriately estimated as log2(65000*94*94), with 65000 representing a rough estimate of all dictionary words people are likely to choose. (For related info, see https://what-if.xkcd.com/34/.)
+
In simple cases the {{w|Entropy (information theory)|entropy}} of a password is calculated as ''log2(a^b)'' where ''a'' is the number of allowed symbols and ''b'' is its length. A dictionary word (however long) is one of around 65000 choices, i.e. about 16 bits. A truly random string of length 11 (not like "Tr0ub4dor&3", but more like "J4I/tyJ&Acy") has log2(94^11) = 72.1 bits. However, the comic shows that "Tr0ub4dor&3" has only 28 bits of entropy. Another way of selecting a password is to have 2048 "symbols" (common words) and select only 4 of those symbols. log2(2048^4) = 44 bits, much better than 28.
  
Another way of selecting a password is to have 2048 "symbols" (common words) and select only 4 of those symbols. log2(2048^4) = 44 bits, much better than 28. Using such symbols was again visited in one of the tips in [[1820: Security Advice]].
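
The figures quoted in this explanation can be checked with a few lines of Python. This is only a sketch: the function name <code>entropy_bits</code> is made up for illustration, and the 65000-word dictionary size is the rough estimate used in the text, not a measured value. Note that the word-plus-two-characters estimate comes out near 29 bits, slightly above the ~28 bits the comic assigns to "Tr0ub4dor&3".

```python
import math


def entropy_bits(choices_per_position, positions=1):
    """Entropy in bits of a secret whose positions are each chosen
    independently and uniformly at random: log2(a^b) = b * log2(a)."""
    return positions * math.log2(choices_per_position)


# 11 characters drawn at random from the 94 printable ASCII symbols
random_11_chars = entropy_bits(94, 11)

# 4 words drawn at random from a 2048-word list
four_common_words = entropy_bits(2048, 4)

# One dictionary word (rough 65000-word estimate) plus two arbitrary characters
word_plus_two = entropy_bits(65000) + entropy_bits(94, 2)

print(f"random 11 characters: {random_11_chars:.1f} bits")
print(f"four common words:    {four_common_words:.1f} bits")
print(f"word + 2 characters:  {word_plus_two:.1f} bits")
```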
+
It is absolutely true that people make passwords hard to remember because they think they are "safer", and it is certainly true that length, all other things being equal, tends to make for very strong passwords and this can be confirmed by using [http://rumkin.com/tools/password/passchk.php rumkin.com's password strength checker]. Even if the individual characters are all limited to [a-z], the exponent implied in "we added another lowercase character, so multiply by 26 again" tends to dominate the results.
  
It is absolutely true that people make passwords hard to remember because they think they are "safer", and it is certainly true that length, all other things being equal, tends to make for very strong passwords and this can be confirmed by using [https://rumkin.com/tools/password/passchk.php rumkin.com's password strength checker]. Even if the individual characters are all limited to [a-z], the exponent implied in "we added another lowercase character, so multiply by 26 again" tends to dominate the results.
+
In addition to being easier to remember, long strings of lowercase characters are also easier to type on smartphones and {{w|Virtual keyboard|soft keyboards}}.
 +
 
 +
xkcd's password generation scheme requires the user to have a list of 2048 common words (log<sub>2</sub>(2048) = 11). For any attack we must assume that the attacker knows our password generation algorithm, but not the exact password. In this case the attacker knows the 2048 words, and knows that we selected 4 words, but not which words. The number of combinations of 4 words from this list of words is (2<sup>11</sup>)<sup>4</sup> = 2<sup>44</sup>, i.e. 44 bits. For comparison, the [http://world.std.com/~reinhold/dicewarefaq.html#calculatingentropy entropy offered by Diceware's 7776 word list is 13 bits per word]. If the attacker doesn't know the algorithm used, and only knows that lowercase letters are selected, the "common words" password would take even longer to crack than depicted. 25 ''random'' lowercase characters would have [http://www.wolframalpha.com/input/?i=log2%2826^25%29 117 bits of entropy], vs 44 bits for the common words list.
  
In addition to being easier to remember, long strings of lowercase characters are also easier to type on smartphones and {{w|Virtual keyboard|soft keyboards}}.
+
{{w|Steve Gibson (computer programmer)|Steve Gibson}} from the {{w|Security Now}} podcast did a lot of work in this area and found that the password <code>D0g.....................</code> (24 characters long) is stronger against an exhaustive search than <code>PrXyc.N(n4k77#L!eVdAfp9</code> (23 characters long). Both contain at least one uppercase letter, lowercase letter, number, and "special" character, so the search has to cover the same character set for each, and length trumps perceived complexity. Steve Gibson makes this very clear in his password haystack [https://www.grc.com/haystack.htm reference guide and tester]:
 +
:"Once an exhaustive password search begins, '''the most important factor''' is password length!"
  
xkcd's password generation scheme requires the user to have a list of 2048 common words (log<sub>2</sub>(2048) = 11). For any attack we must assume that the attacker knows our password generation algorithm, but not the exact password. In this case the attacker knows the 2048 words, and knows that we selected 4 words, but not which words. The number of combinations of 4 words from this list of words is (2<sup>11</sup>)<sup>4</sup> = 2<sup>44</sup>, i.e. 44 bits. For comparison, the [https://world.std.com/~reinhold/dicewarefaq.html#calculatingentropy entropy offered by Diceware's 7776 word list is 13 bits per word]. If the attacker doesn't know the algorithm used, and only knows that lowercase letters are selected, the "common words" password would take even longer to crack than depicted. 25 ''random'' lowercase characters would have [https://www.wolframalpha.com/input/?i=log2%2826^25%29 117 bits of entropy], vs 44 bits for the common words list.
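
The scheme described above can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: the function names are made up, and <code>demo_words</code> is a hypothetical stand-in for a real list of 2048 distinct common words. The standard-library <code>secrets</code> module is used because the words must be picked at random rather than chosen by the user.

```python
import math
import secrets


def generate_passphrase(word_list, num_words=4, separator=" "):
    """Pick num_words words independently and uniformly at random."""
    return separator.join(secrets.choice(word_list) for _ in range(num_words))


def passphrase_entropy_bits(word_list, num_words=4):
    """Entropy assuming the attacker knows the list and the number of words."""
    return num_words * math.log2(len(word_list))


# Hypothetical placeholder list; a real list would hold 2048 distinct common words.
demo_words = [f"word{i:04d}" for i in range(2048)]

print(generate_passphrase(demo_words))
print(passphrase_entropy_bits(demo_words), "bits")
```

The entropy figure only holds if the selection really is random: a phrase the user picks because it "sounds good" is drawn from a far smaller effective set.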
+
The important thing to take away from this comic is that longer passwords are better because each additional character adds much more time to the breaking of the password. That's what [[Randall]] is trying to get through here. Complexity does not matter unless you have length in passwords. Complexity is more difficult for humans to remember, but length is not.
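
How quickly length adds cracking time can be illustrated with a short Python sketch. It assumes the comic's rate of 1000 guesses per second and an attacker who tries every candidate (the worst case for the attacker); the function name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 3600
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY


def exhaustive_search_seconds(alphabet_size, length, guesses_per_second=1000):
    """Seconds needed to try every string of `length` symbols over the alphabet."""
    return alphabet_size ** length / guesses_per_second


# The comic's two figures, expressed as 28 and 44 binary choices
days_28_bits = exhaustive_search_seconds(2, 28) / SECONDS_PER_DAY
years_44_bits = exhaustive_search_seconds(2, 44) / SECONDS_PER_YEAR
print(f"2^28 guesses: about {days_28_bits:.1f} days")
print(f"2^44 guesses: about {years_44_bits:.0f} years")

# Each extra lowercase letter multiplies the search time by 26
for length in range(8, 13):
    years = exhaustive_search_seconds(26, length) / SECONDS_PER_YEAR
    print(f"{length} random lowercase letters: about {years:,.0f} years")
```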
  
 
;Example
 
;Example
 
Below there is a detailed example which shows how different rules of complexity work to generate a password with supposed 44 bits of entropy. The examples of expected passwords were generated in random.org.(*)
 
Below there is a detailed example which shows how different rules of complexity work to generate a password with supposed 44 bits of entropy. The examples of expected passwords were generated in random.org.(*)
  
If ''n'' is the number of symbols and ''L'' is the length of the password, then ''L'' = 44 / log<sub>2</sub>(n).
+
If ''n'' is the number of symbols and ''L'' is the length of the password, then ''L'' = 44 / log<sub><small>2</small></sub>(n).
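
As a rough check of this formula, the following Python sketch computes ''L'' for the symbol counts 26, 52, 62, 68 and 94 that appear in the table rows shown here. The function name and the rounding up to a whole number of characters are illustrative assumptions; the fractional values may differ from the table's in the last digit, depending on how they were rounded.

```python
import math

TARGET_BITS = 44


def required_length(num_symbols, target_bits=TARGET_BITS):
    """Password length L such that L * log2(n) equals the target entropy."""
    return target_bits / math.log2(num_symbols)


for n in (26, 52, 62, 68, 94):
    exact = required_length(n)
    # A real password needs a whole number of characters, so round up.
    print(f"n = {n:2d}: L = {exact:.2f}, i.e. {math.ceil(exact)} characters")
```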
  
 
{|class="wikitable"
 
{|class="wikitable"
Line 36: Line 39:
 
!Comment
 
!Comment
 
|-
 
|-
|a||26||9.3||mdniclapwz||jxtvesveiv||troubadorx||16+4.7 = 20.7||Extra letter to meet length requirement; log<sub>2</sub>(26) = 4.7
+
|a||26||9.3||mdniclapwz||jxtvesveiv||troubadorx||16+4.7 = 20.7||Extra letter to meet length requirement; log<sub><small>2</small></sub>(26) = 4.7
 
|-
 
|-
 
|rowspan="2"|a 9
 
|rowspan="2"|a 9
Line 45: Line 48:
 
|tr0ub4d0r||16+3=19||3 = common substitutions in the comic
 
|tr0ub4d0r||16+3=19||3 = common substitutions in the comic
 
|-
 
|-
|troubador1||16+3.3=19.3||log<sub>2</sub>(10) = 3.3
+
|troubador1||16+3.3=19.3||log<sub><small>2</small></sub>(10) = 3.3
 
|-
 
|-
 
|a A||52||7.7||jAwwBYne||NeTvgcrq||Troubador||16+1=17||1 = caps? in the comic
 
|a A||52||7.7||jAwwBYne||NeTvgcrq||Troubador||16+1=17||1 = caps? in the comic
Line 53: Line 56:
 
|a A 9||62||7.3||cDe8CgAf||RONygLMi||Tr0ub4d0r||16+1+3=20||1 = caps?; 3 = common substitutions
 
|a A 9||62||7.3||cDe8CgAf||RONygLMi||Tr0ub4d0r||16+1+3=20||1 = caps?; 3 = common substitutions
 
|-
 
|-
|a 9 &amp;||68||7.2||_@~"#^.2||un$l&#x7c;!f]||tr0ub4d0r&amp;||16+3+4=23||3 = common substitutions; 4 = punctuation
+
|a 9 &amp;||68||7.2||_@~"#^.2||un$l&#x7c;!f]||tr0ub4d0r&amp;||16+3+4=23||3 = common substitutions; 4 = punctuation
 
|-
 
|-
 
|a A 9 &amp;||94||6.7||Re-:aRo||^$rV{3?||Tr0ub4d0r&||16+1+3+4=24||1 = caps?; 3 = common substitutions; 4 = punctuation
 
|a A 9 &amp;||94||6.7||Re-:aRo||^$rV{3?||Tr0ub4d0r&||16+1+3+4=24||1 = caps?; 3 = common substitutions; 4 = punctuation
Line 64: Line 67:
 
|-
 
|-
 
|correct&#8203;horse&#8203;battery&#8203;staple
 
|correct&#8203;horse&#8203;battery&#8203;staple
|1
+
|0
|Thanks to this comic, this is now one of the first passwords a hacker will try. The only entropy left is a boolean statement: "Is this password correct&#8203;horse&#8203;battery&#8203;staple, yes or no?"
+
|Because of this comic this password has no entropy
 
|}
 
|}
  
Line 73: Line 76:
 
:&amp; = the 32 special characters in an American keyboard; Randall assumes only the 16 most common characters are used in practice (4 bits)
 
:&amp; = the 32 special characters in an American keyboard; Randall assumes only the 16 most common characters are used in practice (4 bits)
  
:(*)&nbsp;The use of random.org explains why <code>jAwwBYne</code> has two consecutive w's, why <code>Re-:aRo</code> has two R's, why <code>_@~"#^.2</code> has no letters, why <code>ewpltiayq</code> has no numbers, why "constant yield" is part of a password, etc. A human would have tried to come up with passwords that looked random.
+
:(*)&nbsp;The use of random.org explains why <code>jAwwBYne</code> has two consecutive w's, why <code>Re-:aRo</code> has two R's, why <code>_@~"#^.2</code> has no letters, why <code>ewpltiayq</code> has no numbers, etc. A human would have tried to come up with passwords that looked random.
 
 
==People who don't understand information theory and security==
 
 
 
The title text likely refers to the fact that this comic could cause people who understand information theory and agree with the message of the comic to get into an infuriating argument with people who do not — and disagree with the comic.
 
 
 
If you're confused, don't worry; you're in good company; even security "experts" don't understand the comic:
 
 
 
*  Bruce Schneier thinks that dictionary attacks make this method "obsolete", despite the comic ''assuming'' perfect knowledge of the user's dictionary from the get-go.  He advocates his own low-entropy "first letters of common plain English phrases" method instead:  [https://www.schneier.com/blog/archives/2014/03/choosing_secure_1.html#:~:text=xkcd Schneier original article] and rebuttals: [https://web.archive.org/web/20160305001236/https://robinmessage.com/2014/03/why-bruce-schneier-is-wrong-about-passwords/ 1] [https://security.stackexchange.com/a/62881/10616 2] [https://www.reddit.com/r/technology/comments/1yxgqo/bruce_schneier_on_choosing_a_secure_password/cfp2z9k 3] [https://www.reddit.com/r/YouShouldKnow/comments/232uch/ysk_how_to_properly_choose_a_secure_password_the/cgte7lp 4] [https://www.reddit.com/r/YouShouldKnow/comments/232uch/ysk_how_to_properly_choose_a_secure_password_the/cgszp62 5] [https://www.reddit.com/r/YouShouldKnow/comments/232uch/ysk_how_to_properly_choose_a_secure_password_the/cgt6ohq 6]
 
* Steve Gibson basically gets it, but calculates entropy incorrectly in order to promote his own method and upper-bound password-checking tool: [https://www.grc.com/sn/sn-313.htm#:~:text=math%20is%20wrong Steve Gibson Security Now transcript] and [https://subrabbit.wordpress.com/2011/08/26/how-much-entropy-in-that-password/ rebuttal]
 
* Computer security consultant Mark Burnett ''almost'' understands the comic, but then advocates adding numerals and other crud to make passphrases less memorable, which completely defeats the point (that it is human-friendly) in the first place: [https://web.archive.org/web/20150319220514/https://xato.net/passwords/analyzing-the-xkcd-comic/ Analyzing the XKCD Passphrase Comic]
 
* Ken Grady incorrectly thinks that user-selected sentences like "I have really bright children" have the same entropy as randomly-selected words: [https://www.hellersearch.com/blog/bid/141527/is-your-password-policy-stupid Is Your Password Policy Stupid?]
 
* Diogo Mónica is correct that a truly random 8-character string is still stronger than a truly random 4-word string (52.4 vs 44), but doesn't understand that the words have to be truly random, not user-selected phrases like "let me in facebook":  [https://diogomonica.com/posts/password-security-why-the-horse-battery-staple-is-not-correct/ Password Security: Why the horse battery staple is not correct]
 
* Ken Munro confuses entropy with permutations and undermines his own argument that "correct horse battery staple" is weak due to dictionary attacks by giving an example "strong" password that still consists of English words. He also doesn't realize that using capital letters in predictable places (first letter of every word) only increases password strength by a bit (figuratively and literally): [https://www.pentestpartners.com/security-blog/correcthorsebatterystaple-isnt-a-good-password-heres-why/ CorrectHorseBatteryStaple isn’t a good password. Here’s why.]
 
 
 
Sigh. 🤦‍♂️
 
  
 
==Transcript==
 
==Transcript==
Line 113: Line 101:
  
 
:~28 bits of entropy  
 
:~28 bits of entropy  
:2<sup>28</sup> = 3 days at 1000 guesses/sec
+
:2<sup>28</sup> = 3 days at 1000 guesses/sec
 
:(Plausible attack on a weak remote web service. Yes, cracking a stolen hash is faster, but it's not what the average user should worry about.)
 
:(Plausible attack on a weak remote web service. Yes, cracking a stolen hash is faster, but it's not what the average user should worry about.)
:Difficulty to guess: Easy
+
:Difficulty to guess: Easy.
  
 
:[Cueball stands scratching his head trying to remember the password.]
 
:[Cueball stands scratching his head trying to remember the password.]
 
:Cueball: Was it trombone? No, Troubador. And one of the O's was a zero?
 
:Cueball: Was it trombone? No, Troubador. And one of the O's was a zero?
 
:Cueball: And there was some symbol...
 
:Cueball: And there was some symbol...
:Difficulty to remember: Hard
+
:Difficulty to remember: Hard.
  
 
:[The passphrase "correct horse battery staple" is shown in the center of the panel.]
 
:[The passphrase "correct horse battery staple" is shown in the center of the panel.]
 
:Four random common words {Each word has 11 bits of entropy.}
 
:Four random common words {Each word has 11 bits of entropy.}
  
:~44 bits of entropy
+
:~44 bits of entropy
:2<sup>44</sup> = 550 years at 1000 guesses/sec
+
:2<sup>44</sup> = 550 years at 1000 guesses/sec
:Difficulty to guess: Hard
+
:Difficulty to guess: Hard.
  
 
:[Cueball is thinking, in his thought bubble a horse is standing to one side talking to an off-screen observer. An arrow points to a staple attached to the side of a battery.]
 
:[Cueball is thinking, in his thought bubble a horse is standing to one side talking to an off-screen observer. An arrow points to a staple attached to the side of a battery.]
 
:Horse: That's a battery staple.
 
:Horse: That's a battery staple.
:Observer: ''Correct!''
+
:Observer: Correct!
 
:Difficulty to remember: You've already memorized it
 
:Difficulty to remember: You've already memorized it
  
Line 137: Line 125:
  
 
==External links==
 
==External links==
*An [https://en.wikipedia.org/wiki/Request_for_Comments RFC], RFC7997 ''The Use of Non-ASCII Characters in RFCs'', uses "Correct Horse Battery Staple" in ''Table 3: A sample of legal passwords'' on page 10. [https://www.rfc-editor.org/rfc/pdfrfc/rfc7997.txt.pdf#page=10]
+
*Some info was used from the highest voted answer given to the question of "how accurate is this XKCD comic" at StackExchange [http://security.stackexchange.com/questions/6095/xkcd-936-short-complex-password-or-long-dictionary-passphrase].
*Some info was used from the highest voted answer given to the question of "how accurate is this XKCD comic" at StackExchange [https://security.stackexchange.com/questions/6095/xkcd-936-short-complex-password-or-long-dictionary-passphrase].
+
*Similarly, a question of "how right this comic is" was made at AskMetaFilter [http://ask.metafilter.com/193052/Oh-Randall-you-do-confound-me-so] and [[Randall]] responded [http://ask.metafilter.com/193052/Oh-Randall-you-do-confound-me-so#2779020 there].
*Similarly, a question of "how right this comic is" was made at AskMetaFilter [https://ask.metafilter.com/193052/Oh-Randall-you-do-confound-me-so] and [[Randall]] responded [https://ask.metafilter.com/193052/Oh-Randall-you-do-confound-me-so#2779020 there].
 
 
*Also the Wikipedia article on '{{w|Passphrase}}' is useful.
 
*Also the Wikipedia article on '{{w|Passphrase}}' is useful.
*In case you missed it in the explanation, GRC's Steve Gibson has a fantastic page [https://www.grc.com/haystack.htm] about this (and may have prompted this comic, as his podcast [https://www.grc.com/sn/sn-303.htm] about this was posted the month before this comic).
+
*In case you missed it in the explanation, GRC's Steve Gibson has a fantastic page [https://www.grc.com/haystack.htm] about this (and may have prompted this comic, as his podcast [http://www.grc.com/sn/sn-303.htm] about this was posted the month before this comic).
* This comic inspired the scientific paper [https://blog.acolyer.org/2015/10/29/how-to-memorize-a-random-60-bit-string/ How to memorize a random 60-bit string] (the link goes to an article about the paper, which links to the paper itself).
 
* The [https://github.com/dropbox/zxcvbn zxcvbn password strength estimator] credits this comic as its inspiration in its acknowledgements.
 
* CMU paper: [https://cups.cs.cmu.edu/soups/2012/proceedings/a7_Shay.pdf Correct horse battery staple: Exploring the usability of system-assigned passphrases]
 
* [https://www.microsoft.com/en-us/research/wp-content/uploads/2016/06/Microsoft_Password_Guidance-1.pdf Microsoft Password Guidance] (page 8)
 
* [https://gizmodo.com/the-guy-who-invented-those-annoying-password-rules-now-1797643987 The Guy Who Invented Those Annoying Password Rules Now Regrets Wasting Your Time], August 8, 2017 (this comic is reproduced in the article).
 
  
 
{{comic discussion}}
 
{{comic discussion}}
Line 152: Line 134:
 
[[Category:Math]]
 
[[Category:Math]]
 
[[Category:Computers]]
 
[[Category:Computers]]
[[Category:Psychology]]
 
[[Category:Computer security]]
 
