Password Strength

Title text: To anyone who understands information theory and security and is in an infuriating argument with someone who does not (possibly involving mixed case), I sincerely apologize.

## Explanation

Computer security consultant Mark Burnett has posted a discussion and analysis of this comic on his blog.

This comic argues that the password in the top frames, "Tr0ub4dor&3", is easier for password-cracking software to guess than "correcthorsebatterystaple" because it has less entropy. It is also harder for a human to remember, which leads to insecure practices such as writing the password on a post-it note attached to the monitor.

In simple cases the number of possible passwords is a^b, where a is the number of allowed symbols and b is the password's length, and the entropy in bits is log2(a^b) = b × log2(a). A dictionary word (however long) is one of around 65000 possibilities, i.e. about 16 bits. A truly random string of length 11 (not like "Tr0ub4dor&3", but more like "J4I/tyJ&Acy") is one of 94^11 possibilities, i.e. 72.1 bits. However, the comic shows that "Tr0ub4dor&3" has only 28 bits of entropy. Another way of selecting a password is to have 2048 "symbols" (common words) and select only 4 of those symbols. That gives 2048^4 possibilities, i.e. 44 bits, much better than 28.
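The arithmetic in this paragraph can be checked with a few lines of Python. This is only a sketch: the function name `entropy_bits` is made up for illustration, and it assumes every symbol is chosen uniformly and independently, which is exactly the assumption that fails for a human-chosen password like "Tr0ub4dor&3".

```python
import math

def entropy_bits(num_symbols: int, length: int) -> float:
    """Bits of entropy for `length` symbols, each drawn uniformly and
    independently from `num_symbols` choices: length * log2(num_symbols)."""
    return length * math.log2(num_symbols)

print(round(entropy_bits(94, 11), 1))   # 11 random printable characters -> 72.1
print(round(entropy_bits(2048, 4), 1))  # 4 words from a 2048-word list  -> 44.0
print(round(math.log2(65000), 1))       # 1 word from ~65000 words       -> 16.0
```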

It is absolutely true that people make passwords hard to remember because they think they are "safer". It is also true that length, all other things being equal, tends to make for very strong passwords; this can be confirmed using rumkin.com's password strength checker. Even if the individual characters are all limited to [a-z], the exponent implied in "we added another lowercase character, so multiply by 26 again" tends to dominate the results.
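To make the "multiply by 26 again" point concrete, the sketch below asks how many random lowercase letters are needed to match 11 random characters drawn from all 94 printable ones. The helper name `lowercase_length_to_match` is hypothetical, and the calculation assumes truly random characters in both cases.

```python
import math

def lowercase_length_to_match(target_bits: float, num_symbols: int = 26) -> int:
    """Smallest whole number of random symbols whose entropy reaches target_bits."""
    return math.ceil(target_bits / math.log2(num_symbols))

target = 11 * math.log2(94)               # about 72.1 bits
print(lowercase_length_to_match(target))  # 16
print(26 ** 16 > 94 ** 11)                # True: 16 lowercase letters beat 11 mixed characters
```

Under these assumptions, five extra characters are enough to make up for dropping uppercase letters, digits, and punctuation entirely.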

In addition to being easier to remember, long strings of lowercase characters are also easier to type on smartphones and soft keyboards.

xkcd's password generation scheme requires the user to have a list of 2048 common words (log2(2048) = 11). For any attack we must assume that the attacker knows our password generation algorithm, but not the exact password. In this case the attacker knows the 2048 words, and knows that we selected 4 words, but not which words. The number of combinations of 4 words from this list of words is (2^11)^4 = 2^44, i.e. 44 bits. For comparison, the entropy offered by Diceware's 7776-word list is about 12.9 bits per word. If the attacker doesn't know the algorithm used, and only knows that lowercase letters are selected, the "common words" password would take even longer to crack than depicted. 25 random lowercase characters would have 117 bits of entropy, vs 44 bits for the common words list.
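A minimal sketch of such a generator follows, assuming a 2048-entry word list is available. The list used here is a placeholder (`word0000` to `word2047`) and the function names are invented for illustration; a real list of common words would have to be supplied. Words are drawn with replacement using Python's `secrets` module, which matches the (2^11)^4 count above.

```python
import math
import secrets

def make_passphrase(words, count=4, sep=" "):
    """Pick `count` words uniformly at random (with replacement) from `words`."""
    return sep.join(secrets.choice(words) for _ in range(count))

def passphrase_bits(list_size, count):
    """Entropy in bits when the attacker knows the list and the word count."""
    return count * math.log2(list_size)

# Placeholder standing in for a real list of 2048 common words (assumption).
placeholder_words = [f"word{i:04d}" for i in range(2048)]

print(make_passphrase(placeholder_words))
print(passphrase_bits(2048, 4))           # 44.0
print(round(math.log2(7776), 1))          # 12.9 bits per Diceware word
print(round(passphrase_bits(26, 25), 1))  # 117.5 bits for 25 random lowercase letters
```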

Steve Gibson from the Security Now podcast did a lot of work in this arena and found that the password `D0g.....................` (24 characters long) is stronger than `PrXyc.N(n4k77#L!eVdAfp9` (23 characters long). Both contain at least one uppercase letter, lowercase letter, number, and "special" character, so length trumps perceived complexity. Steve Gibson makes this very clear in his password haystack reference guide and tester:

"Once an exhaustive password search begins, the most important factor is password length!"

The important thing to take away from this comic is that longer passwords are better because each additional character adds much more time to the breaking of the password. That's what Randall is trying to get through here. Complexity does not matter unless you have length in passwords. Complexity is more difficult for humans to remember, but length is not.
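The counting behind the "length" argument can be sketched as follows. The code assumes a blind exhaustive search over all strings built from 94 printable characters, tried in order of length; the function name `search_space` is hypothetical, and the model says nothing about an attacker who guesses the padding pattern.

```python
def search_space(num_symbols: int, max_length: int) -> int:
    """Number of strings of length 1..max_length over num_symbols characters."""
    return sum(num_symbols ** k for k in range(1, max_length + 1))

space_23 = search_space(94, 23)  # covers every 23-character password
space_24 = search_space(94, 24)  # covers every 24-character password

# One extra character multiplies the exhaustive search space by about 94.
print(space_24 // space_23)  # 94
```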

### Example

The example below shows in detail how different complexity rules work when generating a password with a supposed 44 bits of entropy. The examples of expected passwords were generated at random.org. (*)

If n is the number of symbols and L is the length of the password, then L = 44 / log2(n).

| Symbols | Number of symbols | Minimum length | Examples of expected passwords | Example of an actual password | Actual bits of entropy | Comment |
|---|---|---|---|---|---|---|
| a | 26 | 9.3 | `mdniclapwz` `jxtvesveiv` | `troubadorx` | 16+4.7 = 20.7 | Extra letter to meet length requirement; log2(26) = 4.7 |
| a 9 | 36 | 8.5 | `qih7cbrmd` `ewpltiayq` | `tr0ub4d0r` | 16+3 = 19 | 3 = common substitutions in the comic |
| a A | 52 | 7.7 | `jAwwBYne` `NeTvgcrq` | `Troubador` | 16+1 = 17 | 1 = caps? in the comic |
| a & | 58 | 7.5 | `j.h?nv),` `c/~/fg\:` | `troubador&` | 16+4 = 20 | 4 = punctuation in the comic |
| a A 9 | 62 | 7.3 | `cDe8CgAf` `RONygLMi` | `Tr0ub4d0r` | 16+1+3 = 20 | 1 = caps?; 3 = common substitutions |
| a 9 & | 68 | 7.2 | `_@~"#^.2` `un$l\|!f]` | `tr0ub4d0r&` | 16+3+4 = 23 | 3 = common substitutions; 4 = punctuation |
| a A 9 & | 94 | 6.7 | `Re-:aRo` `^$rV{3?` | `Tr0ub4d0r&` | 16+1+3+4 = 24 | 1 = caps?; 3 = common substitutions; 4 = punctuation |
| common words | 2048 | 4 | `reasonableretailsometimespossibly` `constantyieldspecifypriority` | `reasonableretailsometimespossibly` | 11×4 = 44 | Go to random.org and select 4 random integers between 1 and 2048; then go to your list of common words |
| | | | | `correcthorsebatterystaple` | 0 | Because of this comic, this password has no entropy |

- a = lowercase letters
- A = uppercase letters
- 9 = digits
- & = the 32 special characters in an American keyboard; Randall assumes only the 16 most common characters are used in practice (4 bits)

(*) The use of random.org explains why `jAwwBYne` has two consecutive w's, why `Re-:aRo` has two R's, why `_@~"#^.2` has no letters, why `ewpltiayq` has no numbers, why "constant yield" is part of a password, etc. A human would have tried to produce passwords that merely looked random.
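The "Minimum length" column can be reproduced from the formula L = 44 / log2(n). The sketch below is illustrative only: the dictionary keys reuse the symbol-set labels from the legend, and the names `TARGET_BITS` and `min_length` are invented here. Since a password cannot contain a fraction of a symbol, the practical length is the value rounded up.

```python
import math

TARGET_BITS = 44

# Symbol-set sizes as given in the table above.
symbol_sets = {
    "a": 26, "a 9": 36, "a A": 52, "a &": 58,
    "a A 9": 62, "a 9 &": 68, "a A 9 &": 94, "common words": 2048,
}

def min_length(num_symbols: int, bits: float = TARGET_BITS) -> float:
    """Length L such that L random symbols carry `bits` of entropy."""
    return bits / math.log2(num_symbols)

for name, n in symbol_sets.items():
    length = min_length(n)
    print(f"{name:12}  n={n:4}  L={length:5.2f}  rounded up: {math.ceil(length)}")
```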

The title text likely refers to the fact that this comic could cause people who understand information theory and agree with the message of the comic to get into an infuriating argument with people who do not understand it and who disagree with the comic.

If you're confused, don't worry; you're in good company. Even security "experts" like Steve Gibson and Bruce Schneier misunderstand the comic's point; see the Steve Gibson Security Now transcript and rebuttal, and the Schneier original article and its rebuttals.

## Transcript

The comic illustrates the relative strength of passwords assuming basic knowledge of the system used to generate them.
A set of boxes is used to indicate how many bits of entropy a section of the password provides.
The comic is laid out with 6 panels arranged in a 3x2 grid.
On each row, the first panel explains the breakdown of a password, the second panel shows how long it would take for a computer to guess, and the third panel provides an example scene showing someone trying to remember the password.
[The password "Tr0ub4dor&3" is shown in the center of the panel. A line from each annotation indicates the word section the comment applies to.]
Uncommon (non-gibberish) base word
[Highlighting the base word - 16 bits of entropy.]
Caps?
[Highlighting the first letter - 1 bit of entropy.]
Common Substitutions
[Highlighting the letters 'a' (substituted by '4') and both 'o's (the first of which is substituted by '0') - 3 bits of entropy.]
Punctuation
[Highlighting the symbol appended to the word - 4 bits of entropy.]
Numeral
[Highlighting the number appended to the word - 3 bits of entropy.]
Order unknown
[Highlighting the appended characters - 1 bit of entropy.]
(You can add a few more bits to account for the fact that this is only one of a few common formats.)
~28 bits of entropy
2^28 = 3 days at 1000 guesses/sec
(Plausible attack on a weak remote web service. Yes, cracking a stolen hash is faster, but it's not what the average user should worry about.)
Difficulty to guess: Easy.
Cueball: Was it trombone? No, Troubador. And one of the O's was a zero?
Cueball: And there was some symbol...
Difficulty to remember: Hard.
[The passphrase "correct horse battery staple" is shown in the center of the panel.]
Four random common words {Each word has 11 bits of entropy.}
~44 bits of entropy
2^44 = 550 years at 1000 guesses/sec
Difficulty to guess: Hard.
[Cueball is thinking, in his thought bubble a horse is standing to one side talking to an off-screen observer. An arrow points to a staple attached to the side of a battery.]
Horse: That's a battery staple.
Observer: Correct!
Difficulty to remember: You've already memorized it
Through 20 years of effort, we've successfully trained everyone to use passwords that are hard for humans to remember, but easy for computers to guess.
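
The guess-time figures quoted in the panels can be checked with a short calculation. This sketch assumes the attacker must try every one of the 2^bits possibilities at the comic's rate of 1000 guesses per second; the function and constant names are made up for illustration.

```python
SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY

def seconds_to_exhaust(bits: int, guesses_per_second: int = 1000) -> float:
    """Time to try all 2**bits possibilities at the given guess rate."""
    return 2 ** bits / guesses_per_second

print(round(seconds_to_exhaust(28) / SECONDS_PER_DAY, 1))  # 3.1 days
print(round(seconds_to_exhaust(44) / SECONDS_PER_YEAR))    # 557 years
```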