552: Correlation

Explain xkcd: It's 'cause you're dumb.

Revision as of 16:31, 14 February 2014

Correlation
Title text: Correlation doesn't imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing 'look over there'.

Explanation

This comic focuses on the apparent difficulty people have in understanding the difference between correlation and causation. When two variables (like blood cholesterol levels and heart disease) are positively correlated, low values of one variable tend to occur with low values of the other, and high values with high. The human brain is very good at seeing patterns and deducing rules, so the seemingly natural conclusion is that one variable is causing the other: in this example, that high blood cholesterol causes heart disease.

This may well be true. The positive correlation is certainly not an argument against such a conclusion. But it is only one type of evidence, and is certainly not proof.

The relationship between diet and blood chemistry and heart disease is a complex one, but simpler examples abound. For example, if you tallied the sales of sunglasses and incidence of skin cancer by region, you would probably find that there is a high positive correlation. That is, in locations where many people buy sunglasses, there are also many cases of skin cancer. Here it would seem silly to believe that wearing sunglasses can cause skin cancer, but this is exactly the same thinking that allowed us to conclude that blood cholesterol causes heart disease. Correlations do have the ability to mislead us. In this example, both sunglasses and skin cancer are directly affected by a third factor (specifically, a climate where many people expose themselves to the sun).
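The sunglasses example can be sketched as a small simulation. This is only an illustration under made-up assumptions: the function names, the coefficients and the noise levels are all hypothetical, and no real sales or cancer data is involved. Sun exposure drives both quantities, neither affects the other, and yet they come out strongly correlated until sun exposure is accounted for.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, xs):
    """Subtract the best straight-line fit of ys on xs, keeping what xs cannot explain."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [(y - mean_y) - slope * (x - mean_x) for x, y in zip(xs, ys)]


def simulate(regions=2000, seed=552):
    """Return (raw correlation, correlation after controlling for sun exposure)."""
    rng = random.Random(seed)
    # Hypothetical third factor: how sunny each region is.
    sun = [rng.uniform(0, 10) for _ in range(regions)]
    # Both variables depend on sun plus independent noise, never on each other.
    sunglasses = [3.0 * s + rng.gauss(0, 3) for s in sun]
    skin_cancer = [2.0 * s + rng.gauss(0, 3) for s in sun]
    raw = pearson(sunglasses, skin_cancer)
    controlled = pearson(residuals(sunglasses, sun), residuals(skin_cancer, sun))
    return raw, controlled


if __name__ == "__main__":
    raw, controlled = simulate()
    print(f"sunglasses vs. skin cancer: {raw:.2f}")
    print(f"same, after removing the effect of sun: {controlled:.2f}")
```

In this toy setup the raw correlation is high, while the correlation left over after removing the effect of sun exposure is close to zero, which is what one would expect if the third factor is the whole story.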

In essence, a correlation between two variables does not by itself prove that one has caused the other. All it says is that the two tend to move together. A positive correlation means that as one variable increases so does the other, while a negative correlation means that as one variable increases, the other decreases. The correlation could be due to causality, but it could equally be due to other factors, or it could even be a chance result.
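As a rough sketch of these ideas, the snippet below computes the Pearson correlation coefficient for a rising and a falling series, and then estimates how often two completely unrelated random series look correlated by chance. The helper names, the sample size of 10 and the 0.5 threshold are arbitrary choices made for illustration.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient: +1 is a perfect positive, -1 a perfect negative correlation."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def chance_correlation_rate(trials=2000, sample_size=10, threshold=0.5, seed=552):
    """Fraction of independent random pairs whose correlation exceeds the threshold in size."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # xs and ys are drawn independently, so any correlation is pure chance.
        xs = [rng.gauss(0, 1) for _ in range(sample_size)]
        ys = [rng.gauss(0, 1) for _ in range(sample_size)]
        if abs(pearson(xs, ys)) > threshold:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    rising = [1, 2, 3, 4, 5]
    print(pearson(rising, [2, 4, 6, 8, 10]))   # both increase together: positive
    print(pearson(rising, [10, 8, 6, 4, 2]))   # one increases, the other decreases: negative
    print(chance_correlation_rate())           # share of unrelated pairs that still look correlated
```

With samples this small, a noticeable share of unrelated pairs clears the threshold, which is one reason a single correlation from a small data set deserves caution.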

In this situation Cueball is explaining to Megan his realization that correlation is not the same thing as causation. He further explains that his belief changed after taking a statistics class. Megan then makes the seemingly obvious leap and declares that his realization was the result of taking the statistics class. Cueball’s final response of “Well, maybe.” is ironic. In his heart, he probably knows that what Megan is saying is true. However, his newfound skepticism is making him doubt even himself.

The title text makes it clear that even though correlation does not imply causation, it is supporting evidence and it cannot be dismissed out of hand. The correlation must be investigated further, perhaps in a wider scope or with the consideration of more variables, so that the reason for it is understood. For example, Barry Marshall and Robin Warren noticed that the presence of Helicobacter pylori was highly correlated with duodenal ulcers. They investigated further. Result: the 2005 Nobel Prize in Physiology or Medicine.

Transcript

[Cueball is talking to Megan.]
Cueball: I used to think correlation implied causation.
Cueball: Then I took a statistics class. Now I don't.
Megan: Sounds like the class helped.
Cueball: Well, maybe.

Discussion

It is stated that Cueball is doubting himself because of his newfound scepticism, which I believe is incorrect.

By stating that "the class helped", Megan is implying there is a causal relation between Cueball taking a statistics class and him no longer believing correlation implies causation. However, Cueball replies "well, maybe" to indicate that there is only a correlation between them, showing he correctly understood the distinction.

173.245.53.104 15:29, 13 May 2014 (UTC)