Difference between revisions of "User:NHSavage/Sandbox"

(Explanation)
(Explanation)
Line 10: Line 10:
 
==Explanation==
 
  
This comic is about the often difficult relationship between science, statistics and the media.  
+
This comic is about the often difficult relationship between science, statistics and the media. Megan and Cueball commission some research on the link between jelly beans and acne. At first the scientists do not want to stop playing the addictive game {{w|Minecraft}} (which has been referenced in a [[861|previous xkcd]]), but they do eventually start.
  
 
First some basic statistical theory. Let's imagine you are trying to find out if jelly beans cause acne. To do this you could find a group of people and randomly split them into two groups - one group who you get to eat lots of jelly beans and a second group who are banned from eating jelly beans. After some time you compare whether the group that eat jelly beans have more acne than those who do not. If more people in the group that eat jelly beans have acne then you might think that jelly beans cause acne. However, there is a problem.
 
  
Some people will suffer from acne whether they eat jelly beans or not and some will never have acne even if they do eat jelly beans. There is an element of chance in how many people prone to acne are in each group. What if, purely by chance, all the group we selected to eat jelly beans would have had acne anyway while those who didn't eat jelly beans were the lucky sort of people who never get spots? Then, even if jelly beans did not cause acne, we would conclude that jelly beans did cause acne. Of course it is very unlikely that all the acne prone people end up in one group by chance, especially if we have enough people in each group. However, to give more confidence in the result of this type of experiment, scientists use statistics to see how likely it is that the result they find is purely by chance. This is known as {{w|statistical hypothesis testing}}. Before we start the experiment, we choose a threshold known as the significance level. In the comic the scientists choose a threshold of 5%. If they find that there is a result that jelly beans cause acne and the chance it was a purely random result is less than 1 in 20, they will say that jelly beans do cause acne. If however, the chance that their result was purely by random chance is greater than 5% they will say they have found no evidence of a link. The important point is this - '''there is still a 1 in 20 chance that this result was purely a statistical fluke''' - more people who would have got acne anyway are in the group that we fed jelly beans.
+
Some people will suffer from acne whether they eat jelly beans or not and some will never have acne even if they do eat jelly beans. There is an element of chance in how many people prone to acne are in each group. What if, purely by chance, all the group we selected to eat jelly beans would have had acne anyway while those who didn't eat jelly beans were the lucky sort of people who never get spots? Then, even if jelly beans did not cause acne, we would conclude that jelly beans did cause acne. Of course it is very unlikely that all the acne-prone people end up in one group by chance, especially if we have enough people in each group. However, to give more confidence in the result of this type of experiment, scientists use statistics to see how likely it is that a result like the one they find could arise purely by chance. This is known as {{w|statistical hypothesis testing}}. Before we start the experiment, we choose a threshold known as the significance level. In the comic the scientists choose a threshold of 5%. If they find that more of the people who ate jelly beans had acne, and the chance of a difference that large arising purely at random is less than 1 in 20, they will say that jelly beans do cause acne. If, however, that chance is greater than 5%, they will say they have found no evidence of a link. The important point is this - '''even when jelly beans have no effect at all, there is still up to a 1 in 20 chance that an experiment reports a link purely as a statistical fluke'''.
  
In this comic, Megan and Cueball come to see some scientists to investigate their theory that jelly beans cause acne. The scientists of course do not want to stop playing the addictive game {{w|Minecraft}} (which has been referenced in a [[861|previous xkcd]]).  
+
In the comic, the scientists find no link between jelly beans and acne, but then Megan and Cueball ask them to see if only one color of jelly beans is responsible. They test 20 different colors, each at a significance level of 5%. If the probability that each trial gives a false positive result is 1 in 20, then by testing 20 different colors it is now more likely than not that at least one test will give a false positive (about 64%, if the tests are independent). In this case they find a link between green jelly beans and acne.
  
When the scientists come back and say there is no link between jelly beans and acne, Megan and Cueball have heard that it is specific colors that cause acne.  So, the scientists tear themselves away from Minecraft again and study twenty different colors.  Green is the only color for which they find any link at a 95% {{w|confidence interval}}. So, obviously the news blows the coverage out of control, even though as we see in the image text that the green jelly beans study may have been a coincidence.
+
This leads to a big newspaper headline saying '''GREEN JELLY BEANS LINKED TO ACNE'''. However, when they {{w|Reproducibility|repeat the experiment}} (another key part of the scientific method) they find no evidence for a link. This just leads to another major headline saying '''RESEARCH CONFLICTED''' instead of recognising that the original result was most probably pure chance.
  
But, of course, the news coverage in all caps in the image text blows it out of proportion again.
+
This can be an issue with more serious matters than jelly beans and acne - at any one time there are many studies about possible links between substances (e.g. red wine) and illness (e.g. cancer). Because positive results are the ones that tend to get reported, this limits the value any single study has - especially if the mechanism linking the two things is not known. For more information see: http://www.fallacyfiles.org/multcomp.html
 
 
More to the point, the scientists studied 20 different colors, and only one showed a link at the 5% significance level. A 5% significance level means that, when there is no real link, a test will still report one about 5% of the time. So by definition we would expect about 1 out of 20 such studies to show a link at this level by pure random chance.
 
This is a big problem in correlation research generally, because scientists study many different possible causal factors, and then report the ones that show significance. Without a plausible causation mechanism it’s not clear what value such studies have, given that 1/20 will randomly show a correlation at (p<.05). For more: http://www.fallacyfiles.org/multcomp.html
 
 
 
Also comment on media coverage of science.
 
 
 
Now you are about to learn something about statistics (or at least, how experiments and scientists use statistics).
 
 
 
When a scientist conducts an experiment like those in the comic, they compare the test outcome (the rate of acne among green jellybean eaters) with a control outcome (the rate of acne among folks who don’t eat green jellybeans). Since there is always a certain amount of randomness and error in such tests, they compute the probability p that the observed test outcome could have come about purely by chance assuming that the hypothesis they are testing (green jellybeans give acne) is false. If the probability p is small enough, they conclude the result is “significant”.
 
 
 
By saying that for mauve jellybeans p is greater than 0.05 (a common cutoff for significance), they are stating that the results weren’t statistically significant. On the other hand, the results for green jellybeans (p less than 0.05) are statistically significant.
 
 
 
One would expect that about 1 out of 20 tests would give results with a p less than or equal to 0.05 if the hypothesis was false. If one didn’t, one could reasonably suspect that one's method of calculating p values was flawed.
 
  
 
==Transcript==
 

Revision as of 20:08, 31 October 2012


Explanation

This comic is about the often difficult relationship between science, statistics and the media. Megan and Cueball commission some research on the link between jelly beans and acne. At first the scientists do not want to stop playing the addictive game Minecraft (which has been referenced in a previous xkcd), but they do eventually start.

First some basic statistical theory. Let's imagine you are trying to find out if jelly beans cause acne. To do this you could find a group of people and randomly split them into two groups - one group who you get to eat lots of jelly beans and a second group who are banned from eating jelly beans. After some time you compare whether the group that eat jelly beans have more acne than those who do not. If more people in the group that eat jelly beans have acne then you might think that jelly beans cause acne. However, there is a problem.

Some people will suffer from acne whether they eat jelly beans or not and some will never have acne even if they do eat jelly beans. There is an element of chance in how many people prone to acne are in each group. What if, purely by chance, all the group we selected to eat jelly beans would have had acne anyway while those who didn't eat jelly beans were the lucky sort of people who never get spots? Then, even if jelly beans did not cause acne, we would conclude that jelly beans did cause acne. Of course it is very unlikely that all the acne-prone people end up in one group by chance, especially if we have enough people in each group. However, to give more confidence in the result of this type of experiment, scientists use statistics to see how likely it is that a result like the one they find could arise purely by chance. This is known as statistical hypothesis testing. Before we start the experiment, we choose a threshold known as the significance level. In the comic the scientists choose a threshold of 5%. If they find that more of the people who ate jelly beans had acne, and the chance of a difference that large arising purely at random is less than 1 in 20, they will say that jelly beans do cause acne. If, however, that chance is greater than 5%, they will say they have found no evidence of a link. The important point is this - even when jelly beans have no effect at all, there is still up to a 1 in 20 chance that an experiment reports a link purely as a statistical fluke.
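
The chance calculation described above can be sketched in code. This is only an illustration, not the method used in the comic: it uses a permutation test (one of several ways to get a p-value), and the function name `permutation_p_value` and the acne counts are made up for the example. Each person is recorded as 1 (has acne) or 0 (no acne), and the people are repeatedly reshuffled between the two groups to see how often chance alone produces a gap at least as large as the one observed.

```python
import random


def permutation_p_value(treated, control, n_permutations=10_000, seed=0):
    """Estimate a one-sided p-value by reshuffling people between groups.

    treated, control: lists of 0/1 outcomes (1 = has acne).
    Returns the fraction of random regroupings in which the jelly bean
    group's acne rate exceeds the other group's by at least the observed gap.
    """
    rng = random.Random(seed)
    n_treated = len(treated)
    n_control = len(control)
    observed = sum(treated) / n_treated - sum(control) / n_control

    pooled = list(treated) + list(control)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # pretend group membership was assigned at random
        diff = sum(pooled[:n_treated]) / n_treated - sum(pooled[n_treated:]) / n_control
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            at_least_as_extreme += 1
    return at_least_as_extreme / n_permutations


# Hypothetical data, invented for illustration only.
jelly_bean_group = [1] * 30 + [0] * 70   # 30 of 100 have acne
no_jelly_bean_group = [1] * 28 + [0] * 72  # 28 of 100 have acne

if __name__ == "__main__":
    p = permutation_p_value(jelly_bean_group, no_jelly_bean_group)
    verdict = "link reported" if p < 0.05 else "no evidence of a link"
    print(f"p = {p:.3f}: {verdict}")
```

With groups this similar the estimated p-value comes out well above 0.05, so the scientists would report no evidence of a link.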

In the comic, the scientists find no link between jelly beans and acne, but then Megan and Cueball ask them to see if only one color of jelly beans is responsible. They test 20 different colors, each at a significance level of 5%. If the probability that each trial gives a false positive result is 1 in 20, then by testing 20 different colors it is now more likely than not that at least one test will give a false positive (about 64%, if the tests are independent). In this case they find a link between green jelly beans and acne.
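
The arithmetic behind the 20-color problem can be checked with a short sketch. It assumes the 20 tests are independent and that no color has any real effect, in which case each p-value is equally likely to fall anywhere between 0 and 1. The function names are made up for this example, and the Bonferroni correction shown at the end is a standard remedy that the comic itself does not mention.

```python
import random


def family_wise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across independent tests
    when none of the tested effects is real."""
    return 1 - (1 - alpha) ** n_tests


def simulate_false_positive_rate(alpha, n_tests, n_experiments=20_000, seed=0):
    """Check the formula by simulation: draw n_tests p-values uniformly at
    random (the no-effect assumption) and count how often at least one
    falls below alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_experiments):
        if any(rng.random() < alpha for _ in range(n_tests)):
            hits += 1
    return hits / n_experiments


def bonferroni_threshold(alpha, n_tests):
    """Stricter per-test threshold that keeps the overall chance of any
    false positive at or below alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    print(f"exact:     {family_wise_error_rate(0.05, 20):.3f}")
    print(f"simulated: {simulate_false_positive_rate(0.05, 20):.3f}")
    print(f"Bonferroni per-test threshold: {bonferroni_threshold(0.05, 20):.4f}")
```

The exact value is 1 - 0.95^20, roughly 0.64, so a "significant" color is the expected outcome rather than a surprise. Testing each color at 0.05 / 20 = 0.0025 instead would bring the overall false positive chance back to just under 5%.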

This leads to a big newspaper headline saying GREEN JELLY BEANS LINKED TO ACNE. However, when they repeat the experiment (another key part of the scientific method) they find no evidence for a link. This just leads to another major headline saying RESEARCH CONFLICTED instead of recognising that the original result was most probably pure chance.

This can be an issue with more serious matters than jelly beans and acne - at any one time there are many studies about possible links between substances (e.g. red wine) and illness (e.g. cancer). Because positive results are the ones that tend to get reported, this limits the value any single study has - especially if the mechanism linking the two things is not known. For more information see: http://www.fallacyfiles.org/multcomp.html

Transcript

[Person with a pony tail runs up to another person, who subsequently points off-panel where there are presumably scientists.]
Ponytail: Jelly beans cause acne!
Another: Scientists! Investigate!
Scientists: But we're playing Minecraft!
Scientists: ... Fine.
[Two scientists. One has safety goggles, the other has a sheet of notes.]
Goggles: We found no link between jelly beans and acne (p > 0.05).
[Back to the original two.]
Another: That settles that.
Ponytail: I hear it's only a certain color that causes it.
Another: Scientists!
Scientists: But Miiiinecraft!
[20 near identical small panels follow, 4 rows 5 columns.]
Goggles: We found no link between purple jelly beans and acne (p > 0.05).
Goggles: We found no link between brown jelly beans and acne (p > 0.05).
Goggles: We found no link between pink jelly beans and acne (p > 0.05).
Goggles: We found no link between blue jelly beans and acne (p > 0.05).
Goggles: We found no link between teal jelly beans and acne (p > 0.05).
Goggles: We found no link between salmon jelly beans and acne (p > 0.05).
Goggles: We found no link between red jelly beans and acne (p > 0.05).
Goggles: We found no link between turquoise jelly beans and acne (p > 0.05).
Goggles: We found no link between magenta jelly beans and acne (p > 0.05).
Goggles: We found no link between yellow jelly beans and acne (p > 0.05).
Goggles: We found no link between grey jelly beans and acne (p > 0.05).
Goggles: We found no link between tan jelly beans and acne (p > 0.05).
Goggles: We found no link between cyan jelly beans and acne (p > 0.05).
Goggles: We found a link between green jelly beans and acne (p < 0.05).
Off-panel: WHOA!
Goggles: We found no link between mauve jelly beans and acne (p > 0.05).
Goggles: We found no link between beige jelly beans and acne (p > 0.05).
Goggles: We found no link between lilac jelly beans and acne (p > 0.05).
Goggles: We found no link between black jelly beans and acne (p > 0.05).
Goggles: We found no link between peach jelly beans and acne (p > 0.05).
Goggles: We found no link between orange jelly beans and acne (p > 0.05).
[Newspaper front page.]
NEWS Green Jelly Beans Linked To Acne! 95% Confidence
[There is a picture of 3 green jelly beans.]
Only 5% chance of coincidence!
Scientists...