Password Strength

Title text: To anyone who understands information theory and security and is in an infuriating argument with someone who does not (possibly involving mixed case), I sincerely apologize.

## Explanation

This comic says that a password such as "Tr0ub4dor&3" is bad because it is easy for password-cracking software to guess and hard for humans to remember, leading to insecure practices like writing the password down on a post-it attached to the monitor. On the other hand, a password such as "correcthorsebatterystaple" is hard for computers to guess because it has more entropy, yet it is quite easy for humans to remember.

In simple cases the number of possible passwords is a^b, where a is the number of allowed symbols and b is the password's length; the entropy in bits is log2(a^b) = b × log2(a). A dictionary word (however long) has a password space of around 65000, i.e. about 16 bits. A truly random string of length 11 (not like "Tr0ub4dor&3", but more like "J4I/tyJ&Acy") has 94^11 possibilities, i.e. about 72.1 bits. However, the comic shows that "Tr0ub4dor&3" has only 28 bits of entropy. Another way of selecting a password is to have 2048 "symbols" (common words) and select only 4 of those symbols: 2048^4 = 2^44, i.e. 44 bits, much better than 28. Using words as symbols was revisited in one of the tips in 1820: Security Advice.
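The arithmetic in the paragraph above can be checked with a short sketch. The function name `entropy_bits` is made up for illustration, and the calculation assumes every symbol is chosen uniformly and independently, which is exactly what does not hold for a password like "Tr0ub4dor&3".

```python
import math


def entropy_bits(num_symbols: int, length: int) -> float:
    """Entropy in bits of `length` symbols, each drawn uniformly and
    independently from `num_symbols` possibilities: log2(n ** L)."""
    return length * math.log2(num_symbols)


# One word picked from a dictionary of roughly 65000 words.
print(round(entropy_bits(65000, 1), 1))   # 16.0
# Eleven truly random characters from the 94 printable symbols.
print(round(entropy_bits(94, 11), 1))     # 72.1
# Four words picked from a list of 2048 common words.
print(round(entropy_bits(2048, 4), 1))    # 44.0
```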

It is absolutely true that people make passwords hard to remember because they think they are "safer". It is also true that length, all other things being equal, tends to make for very strong passwords; this can be confirmed by using rumkin.com's password strength checker. Even if the individual characters are all limited to [a-z], the exponent implied in "we added another lowercase character, so multiply by 26 again" tends to dominate the results.

In addition to being easier to remember, long strings of lowercase characters are also easier to type on smartphones and soft keyboards.

xkcd's password generation scheme requires the user to have a list of 2048 common words (log2(2048) = 11). For any attack we must assume that the attacker knows our password generation algorithm, but not the exact password. In this case the attacker knows the 2048 words, and knows that we selected 4 words, but not which words. The number of possible 4-word sequences from this list is (2^11)^4 = 2^44, i.e. 44 bits. For comparison, the entropy offered by Diceware's 7776-word list is about 12.9 bits per word (log2(7776) ≈ 12.9). If the attacker doesn't know the algorithm used, and only knows that lowercase letters are selected, the "common words" password would take even longer to crack than depicted: 25 random lowercase characters would have about 117 bits of entropy, vs 44 bits for the common words list.
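A minimal sketch of the scheme described above, assuming a 2048-entry word list is available. The list built here is a placeholder (`word0000` to `word2047`), not a real list of common words, and the function names are hypothetical. Python's `secrets` module is used because the entropy figures only hold if the words are drawn uniformly at random rather than picked by a human.

```python
import math
import secrets


def generate_passphrase(words, count=4, sep=" "):
    """Join `count` words drawn uniformly at random (with replacement)
    from `words`, using a cryptographically secure random source."""
    return sep.join(secrets.choice(words) for _ in range(count))


def passphrase_bits(list_size, count):
    """Entropy in bits when the attacker knows the list and the word count."""
    return count * math.log2(list_size)


# Placeholder word list (assumption): substitute a real list of 2048 words.
word_list = [f"word{i:04d}" for i in range(2048)]

print(generate_passphrase(word_list))
print(passphrase_bits(2048, 4))            # 44.0
print(round(passphrase_bits(7776, 1), 1))  # 12.9 bits per Diceware word
```

Drawing with replacement keeps the count at exactly 2048^4 sequences; forbidding repeated words would reduce the entropy very slightly.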

### Example

The detailed example below shows how different complexity rules work out when generating a password with a nominal 44 bits of entropy. The examples of expected passwords were generated with random.org.(*)

If n is the number of symbols and L is the length of the password, then L = 44 / log2(n).
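The formula can be evaluated for each symbol set as a quick check. This is only a sketch: the dictionary name `alphabets` and the function `min_length` are invented here, and the symbol counts are the ones used in this example (26 lowercase letters, 26 uppercase letters, 10 digits, 32 special characters, 2048 common words). Results are printed to two decimals, so the last digit can differ from a figure that was truncated rather than rounded.

```python
import math

TARGET_BITS = 44

# Symbol-set sizes used in this example (a = lowercase, A = uppercase,
# 9 = digits, & = special characters).
alphabets = {
    "a": 26,
    "a 9": 36,
    "a A": 52,
    "a &": 58,
    "a A 9": 62,
    "a 9 &": 68,
    "a A 9 &": 94,
    "common words": 2048,
}


def min_length(num_symbols, target_bits=TARGET_BITS):
    """Length L such that L * log2(n) equals the target entropy."""
    return target_bits / math.log2(num_symbols)


for label, n in alphabets.items():
    exact = min_length(n)
    # A real password needs a whole number of symbols, so round up.
    print(f"{label:12} n={n:4}  L={exact:.2f}  -> {math.ceil(exact)} symbols")
```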

| Symbols | Number of symbols | Minimum length | Examples of expected passwords | Example of an actual password | Actual bits of entropy | Comment |
| --- | --- | --- | --- | --- | --- | --- |
| a | 26 | 9.3 | mdniclapwz jxtvesveiv | troubadorx | 16+4.7 = 20.7 | Extra letter to meet length requirement; log2(26) = 4.7 |
| a 9 | 36 | 8.5 | qih7cbrmd ewpltiayq | tr0ub4d0r | 16+3=19 | 3 = common substitutions in the comic |
| a A | 52 | 7.7 | jAwwBYne NeTvgcrq | Troubador | 16+1=17 | 1 = caps? in the comic |
| a & | 58 | 7.5 | j.h?nv), c/~/fg\: | troubador& | 16+4=20 | 4 = punctuation in the comic |
| a A 9 | 62 | 7.3 | cDe8CgAf RONygLMi | Tr0ub4d0r | 16+1+3=20 | 1 = caps?; 3 = common substitutions |
| a 9 & | 68 | 7.2 | [email protected]~"#^.2 un\$l\|!f] | tr0ub4d0r& | 16+3+4=23 | 3 = common substitutions; 4 = punctuation |
| a A 9 & | 94 | 6.7 | Re-:aRo ^\$rV{3? | Tr0ub4d0r& | 16+1+3+4=24 | 1 = caps?; 3 = common substitutions; 4 = punctuation |
| common words | 2048 | 4 | reasonable​retail​sometimes​possibly constant​yield​specify​priority | reasonable​retail​sometimes​possibly | 11×4=44 | Go to random.org and select 4 random integers between 1 and 2048; then go to your list of common words |
| | | | | correct​horse​battery​staple | 1 | Thanks to this comic, this is now one of the first passwords a hacker will try. The only entropy left is a boolean statement: "Is this password correct​horse​battery​staple, yes or no?" |
a = lowercase letters
A = uppercase letters
9 = digits
& = the 32 special characters on an American keyboard; Randall assumes only the 16 most common characters are used in practice (4 bits)
(*) The use of random.org explains why `jAwwBYne` has two consecutive w's, why `Re-:aRo` has two R's, why `[email protected]~"#^.2` has no letters, why `ewpltiayq` has no numbers, why "constant yield" is part of a password, etc. A human would have tried to produce passwords that merely looked random.

## People who don't understand information theory and security

The title text likely refers to the fact that this comic could cause people who understand information theory and agree with the message of the comic to get into an infuriating argument with people who do not understand it and who disagree with the comic.

If you're confused, don't worry; you're in good company; even security "experts" don't understand the comic:

• Bruce Schneier thinks that dictionary attacks make this method "obsolete", despite the comic assuming perfect knowledge of the user's dictionary from the get-go. He advocates his own low-entropy "first letters of common plain English phrases" method instead: Schneier original article and rebuttals: 1 2 3 4 5 6
• Steve Gibson basically gets it, but calculates entropy incorrectly in order to promote his own method and upper-bound password-checking tool: Steve Gibson Security Now transcript and rebuttal
• Computer security consultant Mark Burnett almost understands the comic, but then advocates adding numerals and other crud to make passphrases less memorable, which completely defeats the point (that it is human-friendly) in the first place: Analyzing the XKCD Passphrase Comic
• Ken Grady incorrectly thinks that user-selected sentences like "I have really bright children" have the same entropy as randomly-selected words: Is Your Password Policy Stupid?
• Diogo Mónica is correct that a truly random 8-character string is still stronger than a truly random 4-word string (52.4 vs 44 bits), but doesn't understand that the words have to be truly random, not user-selected phrases like "let me in facebook": Password Security: Why the horse battery staple is not correct
• Ken Munro confuses entropy with permutations and undermines his own argument that "correct horse battery staple" is weak due to dictionary attacks by giving an example "strong" password that still consists of English words. He also doesn't realize that using capital letters in predictable places (first letter of every word) does not increase password strength: CorrectHorseBatteryStaple isn’t a good password. Here’s why.

Sigh.

## Transcript

The comic illustrates the relative strength of passwords assuming basic knowledge of the system used to generate them.
A set of boxes is used to indicate how many bits of entropy a section of the password provides.
The comic is laid out with 6 panels arranged in a 3x2 grid.
On each row, the first panel explains the breakdown of a password, the second panel shows how long it would take for a computer to guess, and the third panel provides an example scene showing someone trying to remember the password.
[The password "Tr0ub4dor&3" is shown in the center of the panel. A line from each annotation indicates the word section the comment applies to.]
Uncommon (non-gibberish) base word
[Highlighting the base word - 16 bits of entropy.]
Caps?
[Highlighting the first letter - 1 bit of entropy.]
Common Substitutions
[Highlighting the letters 'a' (substituted by '4') and both 'o's (the first of which is substituted by '0') - 3 bits of entropy.]
Punctuation
[Highlighting the symbol appended to the word - 4 bits of entropy.]
Numeral
[Highlighting the number appended to the word - 3 bits of entropy.]
Order unknown
[Highlighting the appended characters - 1 bit of entropy.]
(You can add a few more bits to account for the fact that this is only one of a few common formats.)
~28 bits of entropy
2^28 = 3 days at 1000 guesses/sec
(Plausible attack on a weak remote web service. Yes, cracking a stolen hash is faster, but it's not what the average user should worry about.)
Difficulty to guess: Easy