2001: Clickbait-Corrected p-Value
Title text: When comparing hypotheses with Bayesian methods, the similar 'clickbayes factor' can account for some harder-to-quantify priors.
Clickbait is the practice of using deceptive or hyperbolic headlines to entice readers to click on a dubious or sensationalist news story, often with the purpose of generating site traffic and ad revenue. Randall uses the scientific controversy regarding the health effects of chocolate on humans as an example, as there is widespread misinformation on the health effects of chocolate online. In fact, no reliable studies confirm any particular health effect, and no medical authority has approved any health claims regarding chocolate.
Hypothesis testing in statistics is a standard method to determine whether a particular hypothesis is supported by the data. For the topic given in this comic, a researcher might compare data on athletic performance with data on chocolate consumption by those athletes to determine whether the two trend together. By convention, the "null hypothesis" (denoted H0) is that there is no correlation (e.g. chocolate doesn't affect athletic performance), while the "alternative hypothesis" (H1) is that they are correlated. (If the study consists of feeding chocolate to one of two identical groups and not the other, rather than tracking what they'd be eating anyway, then the alternative hypothesis can be strengthened to be that chocolate causes improved performance.) The data are subjected to statistical tests which return a "test statistic", from which a "p-value" is calculated. The p-value indicates the probability of observing the obtained results (or any more extreme value) when the null hypothesis is true (e.g. chocolate has no effect on athletic performance).
In other words, the p-value is an indicator of the statistical significance, and hence the reliability, of results supporting the "alternative hypothesis" (not the probability that the null hypothesis is correct). It answers the question: if there is no correlation, how likely was it that I saw a correlation at least this big? Hence, if the p-value is low enough (by convention < 0.05), the null hypothesis is rejected, and we conclude that the alternative hypothesis is supported by the data (NOT that it is "correct" or "true").
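The "how likely under the null" reading of a p-value can be illustrated with a small permutation test; all the performance numbers below are invented for illustration:

```python
import random

# Hypothetical performance scores for athletes who did / did not eat
# chocolate (made-up numbers, purely for illustration).
chocolate = [12.1, 11.8, 12.5, 12.9, 12.3]
control   = [11.9, 11.7, 12.0, 12.2, 11.6]

observed_diff = sum(chocolate) / len(chocolate) - sum(control) / len(control)

# Under H0 the group labels are interchangeable, so shuffle the labels
# many times and count how often a difference at least as big as the
# observed one appears by chance alone.
random.seed(0)
pooled = chocolate + control
trials = 10_000
n_extreme = 0
for _ in range(trials):
    random.shuffle(pooled)
    diff = sum(pooled[:5]) / 5 - sum(pooled[5:]) / 5
    if diff >= observed_diff:
        n_extreme += 1

# The estimated p-value: probability of a result this extreme under H0.
p_value = n_extreme / trials
print(p_value)
```

With this particular made-up data the shuffled labels rarely reproduce a gap as large as the observed one, so the printed p-value comes out small, which is exactly the situation in which the null hypothesis would be rejected.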
In this comic, the p-value is corrected by a factor that takes clickbait into account. This factor increases the p-value if H1 is more clickbaity than H0, and decreases it if H0 is more clickbaity than H1. This suggests that whatever clickers of clickbait believe, the reverse is likely to be true.
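The correction is just the formula from the comic's panel; a minimal sketch (the click fractions below are made up) shows how a sensational H1 pushes a marginally significant result back over the 0.05 threshold:

```python
def clickbait_corrected_p(p_traditional, click_h1, click_h0):
    # The comic's formula: P_CL = P_traditional * click(H1) / click(H0)
    return p_traditional * click_h1 / click_h0

# Hypothetical: the sensational H1 ("chocolate boosts performance")
# draws three times the clicks of the dull H0, so a traditional
# p = 0.04 triples to roughly 0.12 and is no longer below 0.05.
print(clickbait_corrected_p(0.04, 0.30, 0.10))
```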
Furthermore, this factor may be interpreted as a normalisation for an inherent selection bias, whereby the p-values for more clickbaity H1s tend to be lower than they should be and the p-values for less clickbaity H1s tend to be higher than they should be. For example, for p-values on the cusp of significance, researchers may be more incentivized to fudge and adjust the data to get the p-value down if the H1 is highly sensational, since a sensational H1 makes the research more likely to get published and attract attention. (See also FiveThirtyEight's article on p-hacking and this Stack Exchange question about p-hacking in the wild.) P-hacking has already been associated with chocolate and media sensationalism.
As the statistical results now depend on people's beliefs about the hypothesis, this might seem as far from actual science as one can get. However, in a way, it is in tune with a quote by John Arbuthnot (one of the originators of the use of p-values), attributing variation to active thought rather than chance: "from whence it follows, that it is Art, not Chance, that governs." Randall, by applying that quote to the thoughts of the masses, brings the clickbait correction in line with "Art".
If this correction could somehow be enforced on the scientific world, it would have the effect of keeping the popular view of scientific results more in line with reality. Often a single study shows an exciting result and is sensationalised by the media before further studies can verify it, in part because of the conflicting interests of the scientific community and the media. The clickbait correction may aid a reader in exercising caution when interpreting sensationalist scientific discoveries in news media. Additionally, in some areas of science more mundane results never undergo third-party replication studies (see replication crisis), or are perhaps never studied in the first place. The clickbait correction factor has the opposite effect on these more mundane topics, lowering the statistical barrier for demonstrating effects within them, perhaps in the hope that more will get studied, published, and exposed to the public.
Technically, the comic's depiction of the null and alternative hypotheses is not entirely correct. As the alternative hypothesis (H1) predicts that chocolate will improve performance (i.e., a one-tailed, directional hypothesis), the null hypothesis (H0) should predict that chocolate will do nothing or make performance worse; the alternative hypothesis should be true if and only if the null hypothesis is false. Alternatively, if H1 were to say that chocolate will change performance (for better or worse; i.e., a two-tailed hypothesis), then H0 should say that chocolate will do nothing.
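The one-tailed versus two-tailed distinction shows up directly in how a p-value is computed from a test statistic. For a standard normal test statistic z (the value 1.8 below is chosen purely for illustration), a sketch using only the standard library:

```python
from math import erfc, sqrt

def one_tailed_p(z):
    # P(Z >= z) for a standard normal test statistic:
    # the upper-tail area, for a directional ("improves") hypothesis.
    return 0.5 * erfc(z / sqrt(2))

def two_tailed_p(z):
    # P(|Z| >= |z|): both tails, for a "changes either way" hypothesis.
    # Exactly double the one-tailed area by symmetry.
    return erfc(abs(z) / sqrt(2))

z = 1.8  # hypothetical test statistic
print(one_tailed_p(z))  # about 0.036: below 0.05 one-tailed
print(two_tailed_p(z))  # about 0.072: above 0.05 two-tailed
```

The same data can thus be "significant" under a one-tailed test but not under a two-tailed one, which is why the direction of H1 matters to how H0 must be stated.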
The title text refers to Bayesian statistics, a statistical technique which involves considering (before you see the new data) how likely you think it is that the hypothesis is true. (It is worth noting that the traditional statistical analysis described above doesn't directly say anything about how likely the hypothesis is to be *true*. It simply assesses whether the data is consistent with the null hypothesis.) Under Bayesian analysis, you begin with a prior probability, or simply a "prior", which expresses how likely you think the alternative hypothesis is. Then, after seeing the new data, you apply Bayes' theorem to *update* your belief about the hypothesis, and as a result you should then consider the hypothesis to be more likely (or less likely) than you considered it before.
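The update step can be sketched for two competing hypotheses; the prior and the likelihoods below are invented for illustration:

```python
def bayes_update(prior_h1, p_data_given_h1, p_data_given_h0):
    # Bayes' theorem: posterior is proportional to likelihood * prior.
    joint_h1 = p_data_given_h1 * prior_h1
    joint_h0 = p_data_given_h0 * (1 - prior_h1)
    return joint_h1 / (joint_h1 + joint_h0)

# Hypothetical numbers: a skeptical prior of 0.10 for "chocolate boosts
# performance", and new data four times as likely under H1 as under H0.
# The posterior rises to about 0.31: more plausible than before, but
# still short of convincing.
print(bayes_update(0.10, 0.8, 0.2))
```

Running the same update again on further confirming data would push the posterior higher still, which is the "enough evidence, gathered step by step" point made below.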
Bayesian statistics therefore recognizes that an extraordinary claim should require more evidence to convince you than a "reasonable" claim would. (Which is, arguably, sort of, the same point being made by the Clickbait-correction.) But also that *enough* evidence, perhaps gathered step by step over time, should be sufficient to convince you even of extraordinary claims.
The technique can be hard to apply in science, however, because of the difficulty of agreeing upon reasonable priors. Here it's suggested that an alternative "clickbayes factor" (a pun and portmanteau of clickbait and Bayesian) could be used to approximate hard-to-quantify priors.
- [Under a heading that says Clickbait-Corrected p-Value there is a mathematical formula. Below that is a description of the variables used and what they mean:]
- Clickbait-corrected p-value:
- P_CL = P_traditional · click(H1) / click(H0)
- H0: NULL hypothesis ("Chocolate has no effect on athletic performance")
- H1: Alternative hypothesis ("Chocolate boosts athletic performance")
- click(H): Fraction of test subjects who click on a headline announcing that H is true