Editing 882: Significant


The edit can be undone. Please check the comparison below to verify that this is what you want to do, and then save the changes below to finish undoing the edit.
Latest revision Your text
Line 4: Line 4:
 
| title    = Significant
 
| title    = Significant
 
| image    = significant.png
 
| image    = significant.png
 +
| imagesize =
 
| titletext = So, uh, we did the green study again and got no link. It was probably a-- "RESEARCH CONFLICTED ON GREEN JELLY BEAN/ACNE LINK; MORE STUDY RECOMMENDED!"
 
| titletext = So, uh, we did the green study again and got no link. It was probably a-- "RESEARCH CONFLICTED ON GREEN JELLY BEAN/ACNE LINK; MORE STUDY RECOMMENDED!"
 
}}
 
}}
  
 
==Explanation==
 
==Explanation==
This comic is about {{w|data dredging}} (aka ''p''-hacking), and the misrepresentation of science and statistics in the media. A girl with a black ponytail comes to [[Cueball]] with her claim that {{w|jelly beans}} cause {{w|acne}}, and Cueball then commissions two scientists (a man with goggles and [[Megan]]) to do some research on the link between jelly beans and acne. They find no link, but in the end the real result of this research is bad news reporting!
+
This comic is about the often difficult relationship between science, statistics and the media. [[Megan]] and [[Cueball]] commission some research on the link between jelly beans and acne. At first the scientists do not want to stop playing the addictive game {{w|Minecraft}} (which has been referenced in a [[861|previous xkcd]]), but they do eventually start.
  
First, some basic statistical theory. Let's imagine you are trying to find out if jelly beans cause acne. To do this you could find a group of people and randomly split them into two groups - one group who you get to eat lots of jelly beans and a second group who are banned from eating jelly beans. After some time you compare whether the group that eat jelly beans have more acne than those who do not. If more people in the group that eat jelly beans have acne, then you might think that jelly beans cause acne. However, there is a problem.
+
First some basic statistical theory. Let's imagine you are trying to find out if jelly beans cause acne. To do this you could find a group of people and randomly split them into two groups - one group who you get to eat lots of jelly beans and a second group who are banned from eating jelly beans. After some time you compare whether the group that eat jelly beans have more acne than those who do not. If more people in the group that eat jelly beans have acne then you might think that jelly beans cause acne. However, there is a problem.
  
Some people will suffer from acne whether they eat jelly beans or not, and some will never have acne even if they do eat jelly beans. There is an element of chance in how many people prone to acne are in each group. What if, purely by chance, all the group we selected to eat jelly beans would have had acne anyway while those who didn't eat jelly beans were the lucky sort of people who never get spots? Then, even if jelly beans did not cause acne, we would conclude that jelly beans did cause acne. Of course, it is very unlikely that all the acne-prone people end up in one group by chance, especially if we have enough people in each group. However, to give more confidence in the result of this type of experiment, scientists use statistics to see how likely it is that a result like the one they find would arise purely by chance. This is known as {{w|statistical hypothesis testing}}. Before we start the experiment, we choose a threshold known as the significance level. In the comic the scientists choose a threshold of 5%. If they find that more of the people who ate jelly beans had acne, and the chance of seeing a difference at least that large purely at random (if jelly beans had no effect) is less than 1 in 20, they will say that jelly beans do cause acne. If, however, that chance is greater than 5%, they will say they have found no evidence of a link. The important point is this: '''even if jelly beans have no effect at all, there is still a 1 in 20 chance of getting a "significant" result that is purely a statistical fluke'''.
+
Some people will suffer from acne whether they eat jelly beans or not and some will never have acne even if they do eat jelly beans. There is an element of chance in how many people prone to acne are in each group. What if, purely by chance, all the group we selected to eat jelly beans would have had acne anyway while those who didn't eat jelly beans were the lucky sort of people who never get spots? Then, even if jelly beans did not cause acne, we would conclude that jelly beans did cause acne. Of course it is very unlikely that all the acne-prone people end up in one group by chance, especially if we have enough people in each group. However, to give more confidence in the result of this type of experiment, scientists use statistics to see how likely it is that the result they find is purely by chance. This is known as {{w|statistical hypothesis testing}}. Before we start the experiment, we choose a threshold known as the significance level. In the comic the scientists choose a threshold of 5%. If they find that more of the people who ate jelly beans had acne and the chance it was a purely random result is less than 1 in 20, they will say that jelly beans do cause acne. If, however, the chance that their result was purely by random chance is greater than 5%, they will say they have found no evidence of a link. The important point is this - '''there could still be a 1 in 20 chance that this result was purely a statistical fluke'''.
  
At first, the scientists do not want to stop playing the addictive game ''{{w|Minecraft}}'', but they do eventually start. Minecraft was previously referenced in [[861: Wisdom Teeth]].
+
In the comic, the scientists find no link between jelly beans and acne (the probability that the result is by chance is more than 5%, i.e. p > 0.05), but then Megan and Cueball ask them to see if only one color of jelly beans is responsible. They test 20 different colors, each at a significance level of 5%. If the probability that each trial gives a false positive result is 1 in 20, then by testing 20 different colors it is now highly likely that at least one jelly bean test will give a false positive. In this case they find that green jelly beans do cause acne.
  
The scientists find no link between jelly beans and acne (the probability that the result is by chance is more than 5% i.e. ''p'' > 0.05), but then Megan and Cueball ask them to see if only one color of jelly beans is responsible. They test 20 different colors, each at a significance level of 5%.  
+
This leads to a big newspaper headline saying '''GREEN JELLY BEANS CAUSE ACNE''' but when the scientists {{w|Reproducibility|repeat the experiment}} (another key part of the scientific method) they find no evidence for a link. They try to tell the reporter that it was probably a coincidence but that is not news. Instead it leads to another major headline saying '''RESEARCH CONFLICTED'''.
  
This finding leads to a big newspaper headline saying '''Green Jelly Beans Linked To Acne''', which claims 95 percent confidence with only a 5% chance of a coincidence. Unfortunately, while the ''p''-values reported by the scientists are (presumably) mathematically correct, the wording in the newspaper is misleading and would only apply if green jelly beans were the only ones tested. When many tests are run, it becomes much more likely that at least one of them will give a false positive result.
+
This can be an issue with more serious matters than jelly beans and acne - at any one time there are many studies about possible links between substances (e.g. red wine) and illness (e.g. cancer). Because only the positive results get reported, this limits the value any single study has - especially if the mechanism linking the two things is not known. For more information see: http://www.fallacyfiles.org/multcomp.html
 
 
In the title text, we find out that the scientists {{w|Reproducibility|repeated the experiment}} (another key part of the scientific method), but now they no longer find any evidence for the link between acne and green jelly beans. They try to tell the reporter something, maybe that it was probably a coincidence, but the reporters are not interested since that is not news, and refuse to listen. Instead, they make another major headline from the repeat study saying '''Research conflicted''' (which is not accurate: the scientists doubted their results and had their doubts ''confirmed'') and recommend more study on the link (which is what the scientists just did).
 
 
 
To elaborate on the statistical theory behind this issue:
 
If the probability that each trial gives a false positive result is 1 in 20, then by testing 20 different colors it is now likely that at least one jelly bean test will give a false positive. To be precise, assuming the tests are independent, the probability of having ''zero'' false positives in 20 tests is 0.95<sup>20</sup> = 35.85%, while the probability of having at least 1 false positive in 20 tests is 64.15% (the probability of having ''zero'' false positives in 21 tests, counting the test without color discrimination, is 0.95<sup>21</sup> = 34.06%).
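The arithmetic above can be checked with a short Python sketch. It assumes the tests are independent and that no jelly bean color has any real effect, so each test's ''p''-value is uniformly distributed between 0 and 1. The function names, the seed and the number of simulated trials are illustrative choices, not anything taken from the comic.

```python
import random


def prob_at_least_one_false_positive(num_tests, alpha=0.05):
    """Exact chance that at least one of num_tests independent tests
    comes out 'significant' when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** num_tests


def simulate_false_positive_rate(num_tests, alpha=0.05, trials=100_000, seed=882):
    """Monte Carlo estimate of the same quantity.

    Assumption: under a true null hypothesis a p-value is uniform on [0, 1),
    so each test is modelled as a single random.random() draw.
    """
    rng = random.Random(seed)
    studies_with_a_hit = 0
    for _ in range(trials):
        # One "study" = num_tests colors tested; count it if any color looks significant.
        if any(rng.random() < alpha for _ in range(num_tests)):
            studies_with_a_hit += 1
    return studies_with_a_hit / trials


if __name__ == "__main__":
    print(f"20 tests: {prob_at_least_one_false_positive(20):.4f}")  # 0.6415
    print(f"21 tests: {prob_at_least_one_false_positive(21):.4f}")  # 0.6594
    print(f"simulated, 20 tests: {simulate_false_positive_rate(20):.4f}")
```

The simulated figure should land close to the exact 64.15%, which shows that a "green jelly bean" style result is the expected outcome of testing 20 colors rather than a surprise.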
 
In scientific fields that perform many simultaneous tests on large amounts of data, it is therefore common to adjust for the effect of {{w|Multiple_comparisons_problem|multiple testing}}, typically by controlling the {{w|False_discovery_rate|False Discovery Rate}}, which is the expected proportion of false positives among all positive results (here, it would be 1/1 = 1). A related, stricter approach controls the {{w|Family-wise error rate|family-wise error rate}}: you bundle your tests into a single "test of tests" and adjust your single-test ''p''-values such that the chance of your ''"test of tests"'' reporting any significant result falls below a certain threshold. Typically, that threshold is 0.05, the same as the conventional significance level for a single test, and it can be interpreted the same way: only 1 in 20 "tests of tests" would report a result at this level of significance even if the null hypothesis were true.
 
Applying the {{w|False_discovery_rate#Benjamini–Hochberg_procedure|Benjamini–Hochberg procedure}}, the lowest ''p''-value of a set of 20 tests would need to be no larger than (1/20)*0.05 = 0.0025 to be accepted as significant, given that all the other tests gave ''p'' > 0.05. Such an adjustment would likely have prevented the situation depicted in the comic.
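The following is a minimal Python sketch of the Benjamini–Hochberg step-up rule, to show where the 0.0025 threshold comes from. The comic only reports ''p'' < 0.05 or ''p'' > 0.05, so the ''p''-values in the example are invented purely for illustration, and the function name is hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans, True where a test is still accepted as
    significant after the Benjamini-Hochberg adjustment."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k (1-based) with p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank

    # Every test ranked at or below the cutoff is accepted.
    significant = [False] * m
    for idx in order[:cutoff_rank]:
        significant[idx] = True
    return significant


if __name__ == "__main__":
    # Hypothetical p-values: "green" at 0.03, the other 19 colors between 0.06 and 0.78.
    others = [0.06 + 0.04 * i for i in range(19)]
    print(any(benjamini_hochberg([0.03] + others)))   # False: 0.03 > 0.0025
    print(benjamini_hochberg([0.001] + others)[0])    # True: 0.001 <= 0.0025
```

With these made-up numbers, a green result of ''p'' = 0.03 no longer counts as significant, while a much stronger result of ''p'' = 0.001 would survive the adjustment.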
 
 
 
 
 
This general situation is (sadly) often an issue with more serious matters than jelly beans and acne: at any one time there are many studies about possible links between substances (e.g. red wine) and illness (e.g. cancer). Because only the positive results get reported, this limits the value any single study has - especially if the mechanism linking the two things is not known.
 
 
 
=== p-hacking and bad news reporting in real life ===
 
In 2015, some journalists demonstrated how gullible news outlets can be when presented with the same sort of flawed "experimental design": [http://www.washingtonpost.com/news/morning-mix/wp/2015/05/28/how-and-why-a-journalist-tricked-news-outlets-into-thinking-chocolate-makes-you-thin/?hpid=z5 How, and why, a journalist tricked news outlets into thinking chocolate makes you thin - The Washington Post]
 
  
 
==Transcript==
 
==Transcript==
:[A girl with a black ponytail runs up to Cueball, who subsequently points off-panel where there are presumably scientists.]
+
:[Ponytail runs up to another person, who subsequently points off-panel where there are presumably scientists.]
:Girl with black ponytail: Jelly beans cause acne!
+
:Ponytail: Jelly beans cause acne!
:Cueball: Scientists! Investigate!
+
:Another: Scientists! Investigate!
:Scientist (off panel): But we're playing Minecraft!  
+
:Scientists: But we're playing Minecraft! ...Fine.
:Scientist (off panel): ...Fine.
 
 
 
:[Two scientists. The man has safety goggles on, Megan has a sheet of notes.]
 
:Scientist with goggles: We found no link between jelly beans and acne (p > 0.05).
 
  
 +
:[Two scientists. Cueball has safety goggles, Megan has a sheet of notes.]
 +
:Cueball: We found no link between jelly beans and acne (p > 0.05).
 
:[Back to the original two.]
 
:[Back to the original two.]
:Cueball: That settles that.
+
:Another: That settles that.
:Girl with black ponytail: I hear it's only a certain color that causes it.
+
:Ponytail: I hear it's only a certain color that causes it.
:Cueball: Scientists!
+
:Another: Scientists!
:Scientist (off screen): But Miiiinecraft!
+
:Scientists: But Miiiinecraft!
 
 
:[20 nearly identical small panels follow, in 4 rows of 5 columns. Each shows the same picture as panel 2 above: the scientist with goggles states the results and Megan holds some notes in her hand. The only difference from panel to panel is the color, except in the 14th panel, where the result is positive and there is an exclamation from off-panel.]
 
:Scientist with goggles: We found no link between purple jelly beans and acne (p > 0.05).
 
 
 
:Scientist with goggles: We found no link between brown jelly beans and acne (p > 0.05).
 
 
 
:Scientist with goggles: We found no link between pink jelly beans and acne (p > 0.05).
 
 
 
:Scientist with goggles: We found no link between blue jelly beans and acne (p > 0.05).
 
  
:Scientist with goggles: We found no link between teal jelly beans and acne (p > 0.05).
+
:[20 near identical small panels follow, 4 rows 5 columns.]
 +
:Cueball: We found no link between purple jelly beans and acne (p > 0.05).
 +
:Cueball: We found no link between brown jelly beans and acne (p > 0.05).
 +
:Cueball: We found no link between pink jelly beans and acne (p > 0.05).
 +
:Cueball: We found no link between blue jelly beans and acne (p > 0.05).
 +
:Cueball: We found no link between teal jelly beans and acne (p > 0.05).
 +
:Cueball: We found no link between salmon jelly beans and acne (p > 0.05).
 +
:Cueball: We found no link between red jelly beans and acne (p > 0.05).
 +
:Cueball: We found no link between turquoise jelly beans and acne (p > 0.05).
 +
:Cueball: We found no link between magenta jelly beans and acne (p > 0.05).
 +
:Cueball: We found no link between yellow jelly beans and acne (p > 0.05).
 +
:Cueball: We found no link between grey jelly beans and acne (p > 0.05).
 +
:Cueball: We found no link between tan jelly beans and acne (p > 0.05).
 +
:Cueball: We found no link between cyan jelly beans and acne (p > 0.05).
 +
:Cueball: We found a link between green jelly beans and acne (p < 0.05).
 +
:Off-panel: ''WHOA!''
 +
:Cueball: We found no link between yellow jelly beans and acne (p > 0.05).
 +
:Cueball: We found no link between beige jelly beans and acne (p > 0.05).
 +
:Cueball: We found no link between lilac jelly beans and acne (p > 0.05).
 +
:Cueball: We found no link between black jelly beans and acne (p > 0.05).
 +
:Cueball: We found no link between peach jelly beans and acne (p > 0.05).
 +
:Cueball: We found no link between orange jelly beans and acne (p > 0.05).
  
:Scientist with goggles: We found no link between salmon jelly beans and acne (p > 0.05).
+
:[Newspaper front page.]
 
+
:NEWS Green Jelly Beans Linked To Acne! 95% Confidence
:Scientist with goggles: We found no link between red jelly beans and acne (p > 0.05).
+
:[There is a picture of 3 green jelly beans.]
 
 
:Scientist with goggles: We found no link between turquoise jelly beans and acne (p > 0.05).
 
 
 
:Scientist with goggles: We found no link between magenta jelly beans and acne (p > 0.05).
 
 
 
:Scientist with goggles: We found no link between yellow jelly beans and acne (p > 0.05).
 
 
 
:Scientist with goggles: We found no link between grey jelly beans and acne (p > 0.05).
 
 
 
:Scientist with goggles: We found no link between tan jelly beans and acne (p > 0.05).
 
 
 
:Scientist with goggles: We found no link between cyan jelly beans and acne (p > 0.05).
 
 
 
:Scientist with goggles: We found a link between green jelly beans and acne (p < 0.05).
 
:Voice (off panel): ''Whoa!''
 
 
 
:Scientist with goggles: We found no link between mauve jelly beans and acne (p > 0.05).
 
 
 
:Scientist with goggles: We found no link between beige jelly beans and acne (p > 0.05).
 
 
 
:Scientist with goggles: We found no link between lilac jelly beans and acne (p > 0.05).
 
 
 
:Scientist with goggles: We found no link between black jelly beans and acne (p > 0.05).
 
 
 
:Scientist with goggles: We found no link between peach jelly beans and acne (p > 0.05).
 
 
 
:Scientist with goggles: We found no link between orange jelly beans and acne (p > 0.05).
 
 
 
:[Newspaper front page with a picture with three green jelly beans. There are several sections with unreadable text below each of the last three readable sentences.]
 
:'''News'''
 
:'''Green Jelly Beans Linked To Acne!'''
 
:95% Confidence
 
 
:Only 5% chance of coincidence!
 
:Only 5% chance of coincidence!
 
:Scientists...
 
:Scientists...
  
 
{{comic discussion}}
 
{{comic discussion}}
[[Category:Comics with color‏‎]]
 
 
[[Category:Comics featuring Cueball]]
 
[[Category:Comics featuring Cueball]]
 
[[Category:Comics featuring Megan]]
 
[[Category:Comics featuring Megan]]
[[Category:Statistics]]
+
[[Category:Comics featuring Ponytail]]
[[Category:Scientific research]]
+
[[Category:Math‏‎]]
[[Category:Minecraft]]
+
[[Category:Comics with color‏‎]]
