2400: Statistics

Revision as of 10:16, 19 December 2020 by 141.101.98.138 (talk) (Explanation: Could go into more about whether the "steep line" is just steep because of y-scale multiplication to fit, or t-scale is such that divergence takes a century, but for now let's keep it simple.)
Title text: We reject the null hypothesis based on the 'hot damn, check out this chart' test.

Explanation


This comic is another in a series of comics related to the 2020 pandemic of the coronavirus SARS-CoV-2, which causes COVID-19.

The main focus of the comic is a graph showing cases of COVID-19 versus time for two groups: one group was vaccinated and the other group was not. Graphs are ways to visualize data, and almost always indicate specific values. This graph does not; it simply has two lines, and no indication of any of the scale values involved. The higher line ("placebo group") rises in a steep curve. The lower line ("vaccine group") follows the first for a bit but then levels out to a much slower rate of climb. The caption eschews statistical analysis in favor of a holistic assessment: the vaccine is clearly working; just look how much those lines diverge.

This comic was released one day after the FDA's Dec 17th briefing document for the Moderna COVID-19 vaccine. The document includes the following chart: File:FDA Modena Dec17.png. The charts plot the integral of the incidence data rather than the data itself ("cumulative" rather than "rate"). As a result, any difference in disease rate towards the left side of the chart is carried into the totals on the right side, amplifying the visible gap between the lines. This technique for emphasizing the data is valid: the spread between the lines only continues to increase if the effect continues happening, so the total spread at the right is proportional to the total effect the vaccine had. The charts do not show any information on other possible variables. Randall has previously described in his comics how very clear charts can be made to hide misleading data. The linked graph does not leave the numbers out [and this editor isn't a statistics major, but it looks like the numbers indicate the vaccine has a 95% chance of being at least 91% effective at completely preventing the disease].
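The "cumulative" versus "rate" point can be illustrated with a minimal sketch. The daily counts below are made-up numbers chosen only for illustration; they are not taken from the FDA document or any trial, and the variable names are hypothetical. The sketch shows that the gap between the cumulative curves is the running sum of the daily gaps, so it keeps widening for as long as the placebo group has the higher rate.

```python
from itertools import accumulate

# ASSUMPTION: hypothetical daily new-case counts, invented for illustration.
# Both groups start out alike; the vaccine group's rate drops after day 2.
placebo_daily = [2, 2, 2, 2, 2, 2, 2, 2, 2, 2]
vaccine_daily = [2, 2, 1, 0, 0, 1, 0, 0, 0, 0]

# Cumulative cases are the running sum (a discrete integral) of the daily rate.
placebo_cumulative = list(accumulate(placebo_daily))
vaccine_cumulative = list(accumulate(vaccine_daily))

# Gap between the two groups on the "rate" chart and on the "cumulative" chart.
daily_gap = [p - v for p, v in zip(placebo_daily, vaccine_daily)]
cumulative_gap = [p - v for p, v in zip(placebo_cumulative, vaccine_cumulative)]

print("daily gap:     ", daily_gap)
print("cumulative gap:", cumulative_gap)
```

On a rate chart the two lines would differ by at most 2 cases per day here, while on the cumulative chart the final spread equals the sum of all the daily differences.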

The advice here could be seen as the inverse of the "science tip" in 2311: Confidence Interval, in which the data was so bad that its error bars fell outside of the graph and were not shown.

Transcript

[Shown is a graph with the x-axis labeled "time" and the y-axis labeled "COVID cases." There is a black line on the graph labeled "placebo group", which has a roughly linear slope moving toward the top right corner. There is a red line labeled "vaccine group", which follows the black line for about an eighth of the width of the graph before leveling off.]
[Caption beneath the graph]:
Statistics tip: Always try to get data that's good enough that you don't need to do statistics on it



Discussion

This is a representation of the actual graph showing the efficacy of the Pfizer/BioNTech coronavirus vaccine, based on data from Deutsche Bank AG and the FDA as published in John Authers' Bloomberg Opinion column.  And yes, the results are just that clear and graphically obvious (pun unintended). RAGBRAIvet (talk) 00:51, 19 December 2020 (UTC)

I agree, but the original graph can be found in this paper: https://www.nejm.org/doi/full/10.1056/NEJMoa2034577#figures_media --162.158.203.25 09:11, 19 January 2021 (UTC)
So, the value on bottom right of the graph ... is it three days? -- Hkmaly (talk) 03:55, 19 December 2020 (UTC)
The corresponding graph in the FDA report covers about 100 days. Barmar (talk) 05:00, 19 December 2020 (UTC)

When I saw this comic I immediately thought of this bit about doublespeak in graphs. Not saying I inherently believe or disbelieve numbers/statistics about covid but an impressive graph with no numbers... Apparently it is actually that clear though. https://youtu.be/qP07oyFTRXc?t=292 DarkVex9 (talk) 01:05, 19 December 2020 (UTC)

The graph really is a scientist's dream. It's so pretty that I had to add it to the explanation, but I'm not sure my upload worked (permissions?). Someone should screen grab fig 2 from the FDA briefing and add it. Mperrotta (talk) 03:56, 19 December 2020 (UTC)

I dispute that graphs are only a way of visualizing data; this graph is actually the platonic graph talked about in a textbook about graphs which funnily I found on xkcd. tldr: a good graph makes the truth obvious. This is everything working out as it should be. 172.69.63.135 08:28, 19 December 2020 (UTC)

In the kinds of statistical analyses I have been involved with, this is what's called a "bridge of the nose" analysis. It hits you right between the eyes. Roll on science. (brad)

Interestingly, the "Statistical Analysis" section of the cited study reads, in its entirety: "No formal statistical hypothesis was tested in this study and all results were descriptive." Even they went by the "hot damn check out this chart" test. Anyhow, is that notable enough to put somewhere in the explanation? 172.69.248.144 18:12, 21 December 2020 (UTC)