Difference between revisions of "2001: Clickbait-Corrected p-Value"

Explain xkcd: It's 'cause you're dumb.
(Explanation)
Line 13: Line 13:
 
Clickbait is the practice of using deceptive or manipulative headlines to entice readers to click on a dubious news story, often with the purpose of generating ad revenue.
 
  
Hypothesis testing in statistics is a standard method to determine whether a particular hypothesis is supported by the data. For the topic given in this comic, a researcher might compare data on athletic performance with data on chocolate consumption by those athletes to determine whether the two trend together. By convention, the "null hypothesis" (designated H<sub>0</sub>) is that there's no correlation (that chocolate isn't correlated with athletic performance, in this case) and the "alternate hypothesis" (H<sub>1</sub>) is that they are correlated. (If the study consists of ''feeding'' chocolate to one of two identical groups and not the other, rather than tracking what they'd be eating anyway, then the alternative hypothesis can be strengthened to be that chocolate *causes* improved performance.) These sets are subjected to statistical tests which return a "p-value", which indicates the probability of observing the obtained results (or any more extreme value), when all assumptions of the test are true (including the null hypothesis). In layman's terms: <b> the p-value is the probability that the researcher sees results as extreme or more extreme than the observed result given the null hypothesis is true; [http://www.perfendo.org/docs/BayesProbability/twelvePvaluemisconceptions.pdf the p-value is NOT the probability that the null hypothesis is correct].</b> It answers the question: If there is no difference, how likely was it that I saw a difference at least this big? Hence, if the p-value is low enough, the null hypothesis is rejected, and we conclude that the alternate hypothesis is supported by the data.  
[https://en.wikipedia.org/wiki/Statistical_hypothesis_testing Hypothesis testing] in statistics is a standard method to determine whether a particular hypothesis is supported by the data. For the topic given in this comic, a researcher might compare data on athletic performance with data on chocolate consumption by those athletes to determine whether the two trend together. By convention, the "null hypothesis" (designated H<sub>0</sub>) is that there's no correlation (that chocolate isn't correlated with athletic performance, in this case) and the "alternate hypothesis" (H<sub>1</sub>) is that they are correlated. (If the study consists of ''feeding'' chocolate to one of two identical groups and not the other, rather than tracking what they'd be eating anyway, then the alternative hypothesis can be strengthened to be that chocolate *causes* improved performance.) These sets are subjected to statistical tests which return a "test statistic". From that test statistic a [https://en.wikipedia.org/wiki/P-value "p-value"] is calculated. The p-value indicates the probability of observing the obtained results (or any more extreme value), when all assumptions of the test are true (including the null hypothesis).  
In layman's terms: <b> the p-value is the probability that the researcher sees results as extreme or more extreme than the observed result given the null hypothesis is true; [http://www.perfendo.org/docs/BayesProbability/twelvePvaluemisconceptions.pdf the p-value is NOT the probability that the null hypothesis is correct].</b> It answers the question: If there is no correlation, how likely was it that I saw a correlation at least this big? Hence, if the p-value is low enough (by convention < 0.05), the null hypothesis is rejected, and we conclude that the alternate hypothesis is supported by the data (NOT that it is "correct" or "true").  
  
 
In this comic, the p-value is corrected by a factor that takes clickbait into account. This factor has the effect of increasing the p-value if H<sub>1</sub> is more clickbaity than H<sub>0</sub>, and decreases the p-value if H<sub>0</sub> is more clickbaity than H<sub>1</sub>. This suggests that whatever clickers of clickbait believe, the reverse is likely to be true.
 

Revision as of 14:16, 29 September 2018

Clickbait-Corrected p-Value
Title text: When comparing hypotheses with Bayesian methods, the similar 'clickbayes factor' can account for some harder-to-quantify priors.

Explanation

This explanation may be incomplete or incorrect: Click here to learn more about the influence of Clickbait... But please first explain p-value. Most people don't know. And more wiki links.
If you can address this issue, please edit the page! Thanks.

This is yet another comic dealing with clickbait. It satirizes researchers, journalists, and publishers who fudge research data based on what brings in the most advertising revenue. The topic of fudging research data in academia has previously appeared in 882: Significant and 1478: P-Values.

Clickbait is the practice of using deceptive or manipulative headlines to entice readers to click on a dubious news story, often with the purpose of generating ad revenue.

Hypothesis testing in statistics is a standard method for determining whether a particular hypothesis is supported by the data. For the topic given in this comic, a researcher might compare data on athletic performance with data on chocolate consumption by those athletes to determine whether the two trend together. By convention, the "null hypothesis" (designated H0) is that there is no correlation (in this case, that chocolate consumption is not correlated with athletic performance), and the "alternative hypothesis" (H1) is that they are correlated. (If the study consists of feeding chocolate to one of two otherwise identical groups and not the other, rather than tracking what they would be eating anyway, then the alternative hypothesis can be strengthened to say that chocolate causes improved performance.) The data sets are subjected to a statistical test, which returns a "test statistic". From that test statistic a "p-value" is calculated. The p-value is the probability of observing the obtained results (or any more extreme ones) when all assumptions of the test, including the null hypothesis, are true.

In layman's terms: the p-value is the probability that the researcher sees results as extreme as or more extreme than the observed result, given that the null hypothesis is true; the p-value is NOT the probability that the null hypothesis is correct. It answers the question: if there is no correlation, how likely is it that I would see a correlation at least this big? Hence, if the p-value is low enough (by convention, below 0.05), the null hypothesis is rejected, and we conclude that the alternative hypothesis is supported by the data (NOT that it is "correct" or "true").
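The definition of the p-value can be made concrete with a permutation test. The following is only a minimal sketch, not the method of any particular study: the chocolate and performance numbers are invented for illustration, and the function names pearson_r and permutation_p_value are hypothetical. Shuffling the performance scores simulates the null hypothesis (no correlation), and the p-value is estimated as the fraction of shuffles whose correlation is at least as extreme as the observed one.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def permutation_p_value(xs, ys, n_shuffles=10_000, seed=0):
    """Estimate a two-sided p-value for 'xs and ys are uncorrelated'."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_shuffles):
        # Shuffling breaks any real link between xs and ys (the null hypothesis).
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Add 1 to both counts so the estimate is never exactly zero.
    return (hits + 1) / (n_shuffles + 1)


# Invented data: grams of chocolate per day and a made-up performance score.
chocolate = [0, 10, 20, 30, 40, 50, 60, 70]
performance = [61, 58, 65, 63, 70, 66, 72, 71]

p = permutation_p_value(chocolate, performance)
print(f"estimated p-value: {p:.4f}")
print("reject H0 at the 0.05 level" if p < 0.05 else "fail to reject H0")
```

A small p-value here only says that such a strong correlation would be rare if the null hypothesis were true; it is not the probability that the null hypothesis is correct.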

In this comic, the p-value is corrected by a factor that takes clickbait into account. This factor increases the p-value if H1 is more clickbaity than H0, and decreases it if H0 is more clickbaity than H1. This suggests that whatever clickers of clickbait believe, the reverse is likely to be true.
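The comic's formula is simple enough to write out directly. This is a sketch of the formula as given in the comic; the function name clickbait_corrected_p and the example click fractions are made up for illustration.

```python
def clickbait_corrected_p(p_traditional, click_h1, click_h0):
    """Apply the comic's correction: p_traditional * click(H1) / click(H0).

    click_h1 and click_h0 are the fractions of test subjects who click on a
    headline announcing that H1 (respectively H0) is true.
    """
    if not 0 <= p_traditional <= 1:
        raise ValueError("p_traditional must be between 0 and 1")
    if not 0 <= click_h1 <= 1:
        raise ValueError("click_h1 must be between 0 and 1")
    if not 0 < click_h0 <= 1:
        raise ValueError("click_h0 must be in (0, 1]; it is the divisor")
    return p_traditional * click_h1 / click_h0


# Hypothetical numbers: 30% click on "Chocolate boosts athletic performance",
# 5% click on "Chocolate has no effect on athletic performance".
corrected = clickbait_corrected_p(0.04, click_h1=0.30, click_h0=0.05)
print(f"corrected p-value: {corrected:.2f}")  # about 0.24, no longer below 0.05
```

Note that the corrected value can exceed 1 when H1 is much more clickable than H0, so it is no longer a probability in the usual sense.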

Another interpretation is that this factor corrects for a selection bias effect in which the p-values for more clickbaity H1s tend to be lower than they should be and the p-values for non-clickbaity H0s tend to be higher than they should be. For example, when a p-value is on the cusp of significance, researchers may be more incentivized to fudge and adjust the data to get the p-value down if the H1 is highly sensational, since such an H1 would make the research more likely to get published and attract attention. (See also FiveThirtyEight's article on p-hacking and this Stack Exchange question about p-hacking in the wild.)

As the statistical results now depend on people's beliefs about the hypothesis, this is as far from actual science as one can get. However, in a way, it is more in tune with a quote by Arbuthnot (one of the originators of the use of p-values) attributing variation to active thought rather than chance: "From whence it follows, that it is Art, not Chance, that governs." Randall applies that quote to the thoughts of the masses, bringing it in line with "Art".

Technically, the comic's depiction of null and alternative hypotheses is not entirely correct. As the alternative hypothesis (H1) predicts that chocolate will improve performance (i.e., a one-tailed, directional hypothesis), the null hypothesis (H0) should predict that chocolate will do nothing or make performance worse. In other words, the alternative hypothesis should be true if and only if the null hypothesis is false. Alternatively, if H1 said that chocolate will change performance (for better or worse; i.e., a two-tailed hypothesis), then H0 should say that chocolate will do nothing.
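The difference between one-tailed and two-tailed hypotheses shows up in how "at least as extreme" is counted. The following sketch uses a permutation test on the difference in group means; the scores are invented, and the function name one_and_two_tailed_p is hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def one_and_two_tailed_p(treated, control, n_shuffles=10_000, seed=0):
    """Return (one_tailed_p, two_tailed_p) for mean(treated) - mean(control).

    One-tailed H1: the treated group scores higher.
    Two-tailed H1: the treated group scores differently (higher or lower).
    """
    rng = random.Random(seed)
    observed = mean(treated) - mean(control)
    pooled = list(treated) + list(control)
    k = len(treated)
    one_tail_hits = 0
    two_tail_hits = 0
    for _ in range(n_shuffles):
        # Under H0 the group labels are arbitrary, so reassign them at random.
        rng.shuffle(pooled)
        diff = mean(pooled[:k]) - mean(pooled[k:])
        if diff >= observed:
            one_tail_hits += 1  # only improvements count as extreme
        if abs(diff) >= abs(observed):
            two_tail_hits += 1  # changes in either direction count
    denom = n_shuffles + 1
    return (one_tail_hits + 1) / denom, (two_tail_hits + 1) / denom


# Invented performance scores (higher is better).
chocolate_group = [72, 75, 71, 78, 74, 77]
control_group = [70, 73, 69, 72, 71, 74]

p_one, p_two = one_and_two_tailed_p(chocolate_group, control_group)
print(f"one-tailed p: {p_one:.4f}, two-tailed p: {p_two:.4f}")
```

When the observed difference is positive, every shuffle counted by the one-tailed test is also counted by the two-tailed test, so the one-tailed p-value is never larger than the two-tailed one.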

For the mouseover text: Bayesian methods start with a "prior", which is the set of probabilities believed before seeing new evidence (e.g. before conducting an experiment). Time spent reading clickbait would probably cause people to have unusual beliefs about what is likely before seeing evidence.
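The role of the prior can be illustrated with the odds form of Bayes' rule, in which the posterior odds equal the prior odds multiplied by a Bayes factor. The comic does not define the 'clickbayes factor', so this sketch shows only the standard Bayesian update; the function name posterior_probability and all the numbers are assumptions made for illustration.

```python
def posterior_probability(prior_h1, bayes_factor):
    """Update a prior probability for H1 using a Bayes factor.

    bayes_factor = P(data | H1) / P(data | H0); values above 1 favor H1.
    """
    if not 0 < prior_h1 < 1:
        raise ValueError("prior_h1 must be strictly between 0 and 1")
    if bayes_factor < 0:
        raise ValueError("bayes_factor must be non-negative")
    prior_odds = prior_h1 / (1 - prior_h1)
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1 + posterior_odds)


# The same evidence (a hypothetical Bayes factor of 3) seen by two readers.
skeptic = posterior_probability(prior_h1=0.1, bayes_factor=3.0)
clickbait_reader = posterior_probability(prior_h1=0.9, bayes_factor=3.0)
print(f"skeptic: {skeptic:.2f}, clickbait reader: {clickbait_reader:.2f}")
```

The two readers end up with very different posterior beliefs from identical evidence, which is why a prior shaped by clickbait is hard to quantify and easy to joke about.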

Transcript

[Under a heading that says Clickbait-Corrected p-Value there is a mathematical formula. Below that is the description of the variables used and what they mean:]
Clickbait-corrected p-value:
P_CL = P_traditional ∙ click(H1) / click(H0)
H0: NULL hypothesis ("Chocolate has no effect on athletic performance")
H1: Alternative hypothesis ("Chocolate boosts athletic performance")
click(H): Fraction of test subjects who click on a headline announcing that H is true



Discussion

I thought this comic was about correcting for any p-hacking that aimed to increase the media presence (and thus the clickbait) of the study. 172.68.94.10 17:32, 1 June 2018 (UTC)

The explanation for null hypothesis is correct semantically, it would be accepted if there was no OR negative improvement, however, this is usually stated more succinctly as "will not improve performance" or (in keeping with the language of the comic) "does not boost performance", since that has the same meaning without the unnecessary verbosity. ---- 162.158.186.42 (talk) (please sign your comments with ~~~~)

I can't believe I clicked on this 172.68.86.46 20:28, 1 June 2018 (UTC)

I've removed a paragraph which claimed that this was an instance of Bayes theorem. Despite some similarity in structure, it is not. Winstonewert (talk) 01:39, 2 June 2018 (UTC)

I was honestly expecting a comic about (or at least referencing) 2001: A Space Odyssey. Herobrine (talk) 07:41, 2 June 2018 (UTC)

If researchers were to use this adjusted formula, it would make sensational results much harder to demonstrate as significant, and uninteresting results much easier. Seems to me it’s a good adjustment for a lot of things. I wonder about p-values, though ... seems to me a value that is at all borderline just means you don’t have enough data yet for the actual size of the effect you’re measuring, but I don’t know much about statistics. 172.68.54.130 02:08, 3 June 2018 (UTC)

Ummm. I use a Gecko engine* with "Block Advertisement" checked. *(K-Meleon 76.0) I can see the image from "xkcd Phone 2000" and "LeBron James and Stephen Curry", but NOT THIS PAGE. Unless I uncheck "Block Advertisement". Obviously this is to encourage clicking on things? 172.68.2.70 09:29, 4 June 2018 (UTC)

This could be an attempt to correct for the effects described in the infamous Ioannidis paper:

In this framework, a research finding is less likely to be true when the studies conducted in a field are smaller[...] where there is greater flexibility in designs, [...] where there is greater financial and other interest and prejudice; and when more teams are involved in a scientific field in chase of statistical significance. Simulations show that for most study designs and settings, it is more likely for a research claim to be false than true.

--162.158.90.192 23:04, 19 June 2018 (UTC)

Incomplete?

This comic is labeled as incomplete, but the explanation seems pretty thorough as it is. Any explanation can be cleaned up ad infinitum to suit people's liking, but this one seems pretty good as it is. Is the incomplete tag still warranted at this point?--Sensorfire (talk) 18:46, 1 October 2018 (UTC)

There were many edits recently because this comic is mentioned in the sitenotice at the top here. If you now understand what a p-value is, feel free to remove that incomplete tag. I personally prefer a more straightforward and shorter explanation. But that's only my opinion. When this comic is not labeled incomplete anymore I will put something else in that sitenotice. --Dgbrt (talk) 21:23, 1 October 2018 (UTC)
If this wiki tracked pageviews, somebody could put forth a hypothesis of something measurable on the site, see how many clicks each hypothesis got, and produce a real clickbait-adjusted p-value for it. 162.158.79.107 02:52, 5 October 2018 (UTC)
We don't explain clickbait here...--Dgbrt (talk) 19:20, 5 October 2018 (UTC)

Still incomplete because if you google for this "chocolate health" you will understand. --Dgbrt (talk) 19:20, 5 October 2018 (UTC)