Main Page

Explain xkcd: It's 'cause you're dumb.

Welcome to the explain xkcd wiki!
We have an explanation for all 3271 xkcd comics, and only 44 (1.3%) are incomplete. Help us finish them!


Latest comic

Sports Commentary
Title text: The plural of anecdote may not be data, but the singular of data is anecdote.

Explanation

P-hacking is the academically problematic practice of attempting to come up with a question for which the data offers a significant p-value (probability value), a subject previously covered in comic form. This is in contrast to correct scientific analysis, in which a realistic question is formulated clearly and then answered (or shown to be unjustified) with data. There are several issues with p-hacking. One is that larger data sets usually give more reliable results, so shrinking the data set indicates an effort to justify a conclusion, rather than a desire for accuracy. Another issue is that the more different data sets you compare, the greater the odds of one of them showing a false correlation, simply due to statistical noise. An honest researcher would want to avoid such pitfalls, but someone trying to justify a conclusion might not care.

A method of p-hacking involves analyzing subgroups to attempt to find significance when the full dataset does not yield statistically significant results; for instance, if a medical study didn't show an expected correlation, one might look only at data for male patients, and then only at male patients of certain age ranges, and so on, until they found a group that showed the desired correlation. Restricting data is warranted in some situations, but doing it to look for a particular result greatly increases the chances of misinterpreting statistical noise as a real result.
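
The effect of checking many subgroups can be illustrated with a small calculation. The sketch below is not from the comic; it assumes each subgroup test is independent and uses the conventional 0.05 significance threshold (real subgroups usually overlap, so this is a simplification). The function name family_wise_error_rate is made up for illustration.

```python
def family_wise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of num_tests tests looks 'significant'
    purely by luck, assuming the tests are independent and no real
    effect exists in any of them."""
    return 1.0 - (1.0 - alpha) ** num_tests


# Assumed subgroup counts, chosen only to show how quickly the risk grows.
for k in (1, 5, 20, 100):
    rate = family_wise_error_rate(k)
    print(f"{k:>3} subgroups tested: {rate:.1%} chance of a spurious hit")
```

Under these assumptions, a researcher who keeps slicing the data until something crosses the threshold is more likely than not to find a "result" after a couple of dozen attempts, even when nothing real is there.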

A similar effect is seen with sports commentators, and this is lampooned in the strip. Commentators often try to make predictions about developing situations by comparing them to past situations, such as previous competitions between the same teams. If commentators are trying to support a pet theory, however, they deliberately restrict themselves to situations that ended in a particular way. By narrowing down the historical record with multiple qualifiers, they can justify a particular prediction. (A similar tactic was portrayed in 2901: Geographic Qualifiers.)

Randall satirizes this with an example in which the restriction uses very specific criteria, largely irrelevant to gameplay patterns, to narrow the subgroup down to a mere two games. The 0-2 record (two situations were considered comparable, and neither produced the result hoped for in the current case) reflects random noise much more than any significant insight. As well as being irrelevant to gameplay, the p-hacking also makes the commentary sound like jargon, which can be confusing and difficult to understand. This is ironic, given that a sports commentator's job is supposed to be to explain the situation they are covering rather than to make it more vague and incomprehensible. However, this may be the inevitable response to being left in front of the camera during breaks in play, or even during periods of gameplay that are nominally unremarkable: feeling the pressure to say something, they draw upon ever more obscure and irrelevant details to justify their (or their off-screen advisors') efforts and expertise to entertain and inform the viewing public.
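
To see why a 0-for-2 record says so little, one can compute how often it would occur by chance. The sketch below uses the standard binomial formula and assumes the two games are independent. The 70% figure for how often a team holds a 2-1 lead is a made-up number for illustration, not a real statistic, and prob_record is a hypothetical helper name.

```python
from math import comb


def prob_record(wins: int, games: int, p_win: float) -> float:
    """Binomial probability of exactly `wins` wins in `games` independent
    games, when each game is won with probability `p_win`."""
    return comb(games, wins) * p_win**wins * (1.0 - p_win) ** (games - wins)


# Hypothetical assumption: the team really does hold a 2-1 lead 70% of the time.
assumed_p_win = 0.7
chance = prob_record(wins=0, games=2, p_win=assumed_p_win)
print(f"Chance of going 0 for 2 by luck alone: {chance:.0%}")
```

With that assumed win rate, a strong team would still go 0 for 2 about one time in eleven, so the commentator's record is entirely compatible with the team being in no trouble at all.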

The title text references an old saying in statistics: "The plural of anecdote is not data". This saying means that a set of anecdotes does not constitute significant data, because anecdotes are heavily subject to selection bias, may be unreliable (as they're often not rigorously recorded or controlled) and usually don't come in large enough numbers to be significant. Randall, however, argues that the reverse is true. By reducing the body of data to a single point (which is the ultimate extreme of p-hacking), all you are left with is an anecdote, statistically worth nothing.

This comic was published 11 days into the 2026 FIFA World Cup. The World Cup was also the subject of 3260: Messi, published the previous Wednesday. Sports commentary was also the subject of 904: Sports.

Transcript

[Cueball and Ponytail are sitting at a table, looking at the wall behind them. On the wall is a screen showing a soccer field with some mostly unreadable score information above it. The only readable information is that the score is 2-1.]
Cueball: They could be in trouble. Over the last 36 years, they've gone 0 for 2 when they've scored in the 37th minute to lead 2-1 against a team whose country comes before theirs alphabetically.
[Caption below comic:]
I wish sports commentators hadn't discovered p-hacking.



New here?


Lots of people contribute to make this wiki a success. Many of the recent contributors above have just joined. You can do it too! Create your account here.

You can read a brief introduction about this wiki at explain xkcd. Feel free to create an account and contribute to the wiki! We need explanations for xkcd comics, characters, What If? articles, and everything in between. If it is referenced in an xkcd comic, it should be here.

  • The incomplete explanations are listed here. Feel free to help out by expanding them!

Rules

Don't be a jerk!

There are a lot of comics that don't have set-in-stone explanations; feel free to put multiple interpretations in the wiki page for each comic.

If you want to talk about a specific comic, use its discussion page.

Please only submit material directly related to xkcd and, of course, only submit material that can legally be posted and freely edited. Off-topic or other inappropriate content is subject to removal or modification at admin discretion, and users who repeatedly post such content will be blocked.

If you need assistance from an admin, post a message to the Admin requests board.