Editing 1132: Frequentists vs. Bayesians
Latest revision | Your text | ||
Line 12: | Line 12: | ||
<blockquote>I seem to have stepped on a hornet’s nest, though, by adding “Frequentist” and “Bayesian” titles to the panels. This came as a surprise to me, in part because I actually added them as an afterthought, along with the final punchline. … The truth is, I genuinely didn’t realize Frequentists and Bayesians were actual camps of people—all of whom are now emailing me. I thought they were loosely-applied labels—perhaps just labels appropriated by the books I had happened to read recently—for the standard textbook approach we learned in science class versus an approach which more carefully incorporates the ideas of prior probabilities.</blockquote> | <blockquote>I seem to have stepped on a hornet’s nest, though, by adding “Frequentist” and “Bayesian” titles to the panels. This came as a surprise to me, in part because I actually added them as an afterthought, along with the final punchline. … The truth is, I genuinely didn’t realize Frequentists and Bayesians were actual camps of people—all of whom are now emailing me. I thought they were loosely-applied labels—perhaps just labels appropriated by the books I had happened to read recently—for the standard textbook approach we learned in science class versus an approach which more carefully incorporates the ideas of prior probabilities.</blockquote> | ||
− | The " | + | The "frequentist" statistician is (mis)applying the common standard of "{{w|P-value|p}}<0.05". In a scientific study, a result is conventionally treated as statistically significant if a result at least as extreme would arise less than 5% of the time by chance alone. (More formally, this probability is computed under the {{w|null hypothesis}}, a default position that the observations are unrelated. The null hypothesis was also referenced in [[892: Null Hypothesis]].) |
− | Since the likelihood of rolling double sixes is below this 5% threshold, the "frequentist" decides (by this rule of thumb) to accept the detector's output as correct. The " | + | Since the likelihood of rolling double sixes is below this 5% threshold, the "frequentist" decides (by this rule of thumb) to accept the detector's output as correct. The "Bayesian" statistician has, instead, applied at least a small measure of probabilistic reasoning ({{w|Bayesian inference}}) to determine that the unlikeliness of the detector lying is greatly outweighed by the unlikeliness of the sun exploding. Therefore, he concludes that the sun has ''not'' exploded and the detector is lying. |
− | + | The line, "Bet you $50 it hasn't", is a reference to the approach of a leading Bayesian scholar, {{w|Bruno de Finetti}}, who made extensive use of bets in his examples and thought experiments. See {{w|Coherence (philosophical gambling strategy)}} for more information on his work. In this case, however, the bet is also a joke: if the sun had exploded, we would all be dead and the money would be worthless. This is a tongue-in-cheek reference to the absurdity of the premise and to the need to consider things in context. |
− | The | + | The title text refers to a classic series of logic puzzles known as {{w|Knights and Knaves#Fork in the road|Knights and Knaves}}, where there are two guards in front of two exit doors, one of which is real and the other leads to death. One guard is a liar and the other tells the truth. The visitor doesn't know which is which, and is allowed to ask one question to one guard. The solution is to ask either guard what the other one would say is the real exit, then choose the opposite. Two such guards were featured in the 1986 Jim Henson movie ''[[246|Labyrinth]]'', hence the mention of "A LABYRINTH GUARD" here. |
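The two-guard solution can be checked by brute force. The following is a minimal sketch, assuming the doors are numbered 0 and 1 and that a liar always names the wrong door; the helper names <code>named_door</code> and <code>choose</code> are hypothetical and used only for illustration.

```python
from itertools import product

def named_door(lies, door):
    """Door a guard names when the honest answer is `door` (doors are 0 and 1)."""
    return 1 - door if lies else door

def choose(real_door, asked_lies):
    """Ask one guard which door the other guard would call the exit, then take the opposite."""
    # Exactly one of the two guards lies, so the other guard's honesty is the inverse.
    other_says = named_door(not asked_lies, real_door)
    # The guard we asked relays that answer, truthfully or not.
    reported = named_door(asked_lies, other_says)
    return 1 - reported

# Check every combination of real door and which kind of guard was asked.
for real_door, asked_lies in product((0, 1), (False, True)):
    assert choose(real_door, asked_lies) == real_door
```

Because the answer always passes through exactly one liar, the reported door is always the wrong one, whichever guard is asked.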
− | + | ===Mathematical and scientific details=== | |
− | |||
− | |||
− | |||
− | === | ||
As mentioned, this is an instance of the {{w|base rate fallacy}}. If we treat the "truth or lie" setup as simply modelling an inaccurate test, then it is also specifically an illustration of the {{w|false positive paradox}}: A test that is rarely wrong, but which tests for an event that is even rarer, will be more often wrong than right when it says that the event has occurred. | As mentioned, this is an instance of the {{w|base rate fallacy}}. If we treat the "truth or lie" setup as simply modelling an inaccurate test, then it is also specifically an illustration of the {{w|false positive paradox}}: A test that is rarely wrong, but which tests for an event that is even rarer, will be more often wrong than right when it says that the event has occurred. | ||
− | The test | + | The test in this case is a neutrino detector. It relies on the fact that neutrinos can pass through the earth, so a neutrino detector would detect neutrinos from the sun at all times, day and night. The detector is stated to give false results ("lie") 1/36th of the time. |
There is no record of any star ever spontaneously exploding—they always show signs of deterioration long before their explosion—so the probability is near zero. For the sake of a number, though, consider that the sun's estimated lifespan is 10 billion years. Let's say the test is run every hour, twelve hours a day (at night time). This gives us a probability of the Sun exploding at one in 4.38×10<sup>13</sup>. Assuming this detector is otherwise reliable, when the detector reports a solar explosion, there are two possibilities: | There is no record of any star ever spontaneously exploding—they always show signs of deterioration long before their explosion—so the probability is near zero. For the sake of a number, though, consider that the sun's estimated lifespan is 10 billion years. Let's say the test is run every hour, twelve hours a day (at night time). This gives us a probability of the Sun exploding at one in 4.38×10<sup>13</sup>. Assuming this detector is otherwise reliable, when the detector reports a solar explosion, there are two possibilities: | ||
− | # The sun '''has''' exploded (one in 4.38×10<sup>13</sup>) and the detector '''is''' telling the truth (35 in 36). This event has a total probability of about 1/(4.38×10<sup>13</sup>) × 35/36 or about one in 4.50×10<sup>13</sup> | + | # The sun '''has''' exploded (one in 4.38×10<sup>13</sup>) and the detector '''is''' telling the truth (35 in 36). This event has a total probability of about 1/(4.38×10<sup>13</sup>) × 35/36 or about one in 4.50×10<sup>13</sup>. |
# The sun '''hasn't''' exploded (4.38×10<sup>13</sup> − 1 in 4.38×10<sup>13</sup>) and the detector '''is not''' telling the truth (1 in 36). This event has a total probability of about (4.38×10<sup>13</sup> − 1) / 4.38×10<sup>13</sup> × 1/36 or about one in 36. | # The sun '''hasn't''' exploded (4.38×10<sup>13</sup> − 1 in 4.38×10<sup>13</sup>) and the detector '''is not''' telling the truth (1 in 36). This event has a total probability of about (4.38×10<sup>13</sup> − 1) / 4.38×10<sup>13</sup> × 1/36 or about one in 36. | ||
− | Clearly the sun exploding is not the most likely option. | + | Clearly the sun exploding is not the most likely option. |
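The comparison between the two cases can be made explicit with Bayes' theorem. The following is a minimal sketch, assuming the figures used above (a 10-billion-year lifespan, one test per hour for twelve hours a night, and a 1/36 lie rate). The 365.25-day year and the names <code>posterior_exploded</code>, <code>prior</code> and <code>lie_rate</code> are illustrative assumptions, not part of the comic.

```python
from fractions import Fraction

# Figures from the text: 10 billion years, 12 hourly tests per night.
YEARS = 10_000_000_000
# Assumption: 365.25 days per year, giving 4383 tests per year.
TESTS_PER_YEAR = Fraction(36525, 100) * 12

def posterior_exploded(prior, lie_rate):
    """P(sun exploded | detector says YES), by Bayes' theorem."""
    true_positive = prior * (1 - lie_rate)    # exploded, detector truthful
    false_positive = (1 - prior) * lie_rate   # not exploded, detector lying
    return true_positive / (true_positive + false_positive)

prior = 1 / (YEARS * TESTS_PER_YEAR)   # chance the sun explodes in a given test hour
lie_rate = Fraction(1, 36)             # chance of rolling double sixes
posterior = posterior_exploded(prior, lie_rate)

print(f"prior:     1 in {float(1 / prior):.3g}")
print(f"posterior: 1 in {float(1 / posterior):.3g}")
```

Under these assumptions, a "YES" from the detector leaves the chance that the sun has actually exploded at roughly one in 1.25×10<sup>12</sup>: about 35 times more likely than before the reading, but still negligible.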
+ | |||
+ | ===Presidential election predictions=== | ||
+ | [[File:Nate Silver Tweet.png|.@JoeNBC: If you think it's a toss-up, let's bet. If Obama wins, you donate $1,000 to the American Red Cross. If Romney wins, I do. Deal?|right]] | ||
+ | |||
+ | This comic may be about the accuracy of presidential election predictions that used statistical models, such as Nate Silver's ''FiveThirtyEight'' (''538'') and Professor Sam Wang's ''Princeton Election Consortium'' (''PEC''). The bet may refer to a well-publicized bet that Nate Silver tried to make with Joe Scarborough regarding the outcome of the election (see tweet on the right). |
− | : < | + | ==Trivia== |
− | + | *In the same blog comment as cited above<ref name="munroe-on-gelman"/>, Randall explains that he chose the "sun exploding" scenario as a more clearly absurd example than those usually used: | |
− | + | <blockquote>…I realized that in the common examples used to illustrate this sort of error, like the cancer screening/drug test false positive ones, the correct result is surprising or unintuitive. So I came up with the sun-explosion example, to illustrate a case where naïve application of that significance test can give a result that’s obviously nonsense.</blockquote> | |
− | + | *"Bayesian" statistics is named for Thomas Bayes, who studied conditional probability — the likelihood that one event is true when given information about some other related event. From {{w|Bayes Theorem|Wikipedia}}: "Bayesian interpretation expresses how a subjective degree of belief should rationally change to account for evidence". | |
− | + | * The "frequentist" says that 1/36 = 0.027. It's actually 0.02777…, which should round to 0.028. | |
− | + | * Using neutrino detectors as an advance warning of a supernova is possible, and the {{w|Supernova Early Warning System}} does just this. The neutrinos arrive ahead of the photons, because they can escape from the core of the star before the supernova explosion reaches the mantle. | |
− | |||
==Transcript== | ==Transcript== | ||
− | : | + | :Did the sun just explode? (It's night, so we're not sure) |
− | |||
− | |||
− | :[Two | + | :[Two statisticians stand alongside an adorable little computer, suspiciously similar to K-9, that speaks in Westminster typeface.] |
:Frequentist Statistician: This neutrino detector measures whether the sun has gone nova. | :Frequentist Statistician: This neutrino detector measures whether the sun has gone nova. | ||
:Bayesian Statistician: Then, it rolls two dice. If they both come up as six, it lies to us. Otherwise, it tells the truth. | :Bayesian Statistician: Then, it rolls two dice. If they both come up as six, it lies to us. Otherwise, it tells the truth. | ||
− | :Frequentist Statistician: Let's try. | + | :Frequentist Statistician: Let's try. [to the detector] Detector! Has the sun gone nova? |
− | : | + | :Detector: ''roll'' YES. |
− | |||
− | |||
:Frequentist Statistician: | :Frequentist Statistician: | ||
:Frequentist Statistician: The probability of this result happening by chance is 1/36=0.027. Since p<0.05, I conclude that the sun has exploded. | :Frequentist Statistician: The probability of this result happening by chance is 1/36=0.027. Since p<0.05, I conclude that the sun has exploded. | ||
Line 61: | Line 57: | ||
:Bayesian Statistician: | :Bayesian Statistician: | ||
:Bayesian Statistician: Bet you $50 it hasn't. | :Bayesian Statistician: Bet you $50 it hasn't. | ||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
==References== | ==References== | ||
Line 74: | Line 62: | ||
{{comic discussion}} | {{comic discussion}} | ||
− | |||
[[Category:Comics featuring Cueball]] | [[Category:Comics featuring Cueball]] | ||
[[Category:Multiple Cueballs]] | [[Category:Multiple Cueballs]] | ||
+ | [[Category:Math]] | ||
[[Category:Statistics]] | [[Category:Statistics]] | ||
[[Category:Physics]] | [[Category:Physics]] | ||
− |