2239: Data Error

Explain xkcd: It's 'cause you're dumb.
Revision as of 18:54, 9 December 2019 by Pere prlpz (talk | contribs) (Explanation: Mouse-over text is about the {{w|Great Oxidation Event}} when prokaryotic photosynthetic organisms built up oxygen in Earth atmosphere for the first time and most organisms, which weren't adapted to oxygen, went extinct.)
Data Error
Title text: Cyanobacteria wiped out nearly all life on Earth once before, and they can do it again!

Explanation

This explanation may be incomplete or incorrect: Created by some anomalous perfectly normal algae. Please mention here why this explanation isn't complete. Do NOT delete this tag too soon.
If you can address this issue, please edit the page! Thanks.

The title text refers to the Great Oxidation Event, when photosynthetic prokaryotes (cyanobacteria) built up oxygen in Earth's atmosphere for the first time and most organisms, which weren't adapted to oxygen, went extinct.

Transcript

This transcript is incomplete. Please help edit it! Thanks.
[Black Hat and Megan stand facing each other.]
Megan: I can't believe this data error invalidates a year and a half of my research.
Megan: I was about to publish.
Black Hat: Don't panic. You have two options.
Megan: Yeah?
[Closeup shot of Black Hat holding one hand up.]
Black Hat: 1) Redo your analysis and share whatever results you can, whether positive or negative. It's disappointing, but these things happen.
[Black Hat has closed his fist. Megan holds her hands up.]
Black Hat: 2) Destroy the evidence. Use your materials and research methods to build a superweapon. Conquer Earth and rule with an iron fist.
Megan: Tremble before my anomalously productive algae!
Megan: Except the anomaly was an artifact.
Megan: Tremble before my normal algae!



Discussion

Randall's comics are usually relevant to recent events on or near the day comics are posted. I was wondering if this Data Error comic might be referencing some recent event, some data error at NASA or something. Does anyone know what it might be in reference to? 108.162.219.40 21:13, 9 December 2019 (UTC) ... Sorry, forgot to sign in. Saibot84 21:14, 9 December 2019 (UTC)

I'm not aware of anything in the news. However, this is not the first time Randall has commented on research publication in a comic, so I suspect it's just another in that series. It seems obvious that he feels the first option is the appropriate choice, and the second option is the joke. Ianrbibtitlht (talk) 21:22, 9 December 2019 (UTC)
I believe there was a relatively recent issue where a Python script used for processing data sets assumed a particular order in which the host operating system would return data files. That assumption turned out not to always hold, which threw off the results of several analyses. Could he be referring to that? The scripts in question were used to obtain results in cyanobacteria studies... https://arstechnica.com/information-technology/2019/10/chemists-discover-cross-platform-python-scripts-not-so-cross-platform/ 162.158.34.222 15:03, 13 December 2019 (UTC)
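To illustrate the kind of file-ordering assumption described in the comment above, here is a minimal Python sketch. It is not the code from the scripts in question; the function names and the `*.out` pattern are hypothetical and chosen only for illustration. The point is that `glob.glob` makes no ordering guarantee, so a script that pairs files by position should sort the list explicitly.

```python
import glob
import os
import tempfile


def list_data_files(directory, pattern="*.out"):
    """Return matching file paths in a deterministic (sorted) order.

    The function name and default pattern are hypothetical.
    """
    # glob.glob returns matches in arbitrary, platform-dependent order,
    # so sort explicitly instead of trusting the operating system.
    return sorted(glob.glob(os.path.join(directory, pattern)))


def demo_sorted_listing():
    """Create a few empty files and list them by base name."""
    with tempfile.TemporaryDirectory() as tmp:
        # Create the files in a deliberately non-alphabetical order.
        for name in ("b.out", "a.out", "c.out"):
            open(os.path.join(tmp, name), "w").close()
        return [os.path.basename(p) for p in list_data_files(tmp)]


print(demo_sorted_listing())
```

With the explicit `sorted` call the listing no longer depends on the platform; without it, the same script could pair files differently on different machines.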

I think the stickwoman is not "excited" but sarcastic, although you can't be sure in text. It is a joke based on the discrepancy in capabilities between real scientists and fictional mad scientists. 108.162.238.119 22:23, 9 December 2019 (UTC)

I agree, Megan is being a smart-ass 108.162.245.202 15:46, 10 December 2019 (UTC)
For a start, "mad scientists" are usually more like mad engineers ... you can't get world domination by researching something and writing a paper about it, you need to USE that research, usually by building something. -- Hkmaly (talk) 23:10, 9 December 2019 (UTC)
Are you suggesting scientists can't build things? I don't actually know, since I'm an engineer! Ianrbibtitlht (talk) 23:43, 9 December 2019 (UTC)

What is a data error in general? Could someone explain the term? :) 172.69.22.74 02:39, 10 December 2019 (UTC)

The discovery that the data you used was sampled below the Nyquist rate (that is, at less than twice the highest frequency present in the signal) pretty much kills your thesis until you can get data that was properly acquired. All your results will be contaminated with artifacts produced by the sampling rate, rather than by variations in the quantity that you imagined you were observing. 173.245.52.209 12:37, 10 December 2019 (UTC)
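A small Python sketch of the sampling artifact described in the comment above, under assumed numbers that do not come from the discussion: a 9 Hz sine sampled at only 10 Hz produces exactly the same samples as an inverted 1 Hz sine, so the recorded data cannot distinguish the two. The helper name `sample_sine` is hypothetical.

```python
import math


def sample_sine(freq_hz, rate_hz, n_samples):
    """Sample sin(2*pi*f*t) at t = n / rate_hz for n = 0 .. n_samples - 1."""
    return [math.sin(2 * math.pi * freq_hz * n / rate_hz)
            for n in range(n_samples)]


# Assumed illustration values: a 9 Hz signal needs more than 18 samples
# per second, but here it is sampled at only 10 per second.
fast = sample_sine(9.0, 10.0, 20)
slow = sample_sine(1.0, 10.0, 20)

# sin(2*pi*9*n/10) = sin(2*pi*n - 2*pi*n/10) = -sin(2*pi*n/10),
# so each 9 Hz sample is the negative of the matching 1 Hz sample.
max_diff = max(abs(f + s) for f, s in zip(fast, slow))
print(max_diff < 1e-9)
```

Any apparent 1 Hz variation in such data is an artifact of the sampling rate, which is the situation the comment describes.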
I thought I knew what a data error is, but after that reply I'm not sure - although I'm almost sure that it did not help the one asking the question ;-) --Kynde (talk) 15:55, 10 December 2019 (UTC)
Well, that is a type of data error (bad sampling technique), but not the only type. The data itself could have had corruption problems, such as maybe some rogue second species of algae contaminated the samples, etc. 172.69.62.46 21:39, 10 December 2019 (UTC)
Also, malfunctioning or miscalibrated measuring equipment (transducers, cabling, etc.) would be another type of data error. Ianrbibtitlht (talk) 22:17, 10 December 2019 (UTC)
More about data errors. Yes, I listed just one kind, and a fellow I knew had to re-do his thesis because of that particular error. The careful researcher investigates many possible sources of error. The poor researcher simply throws away the data points that do not match his preconceptions. HERE WE GO, enumerating some errors: (1) Noise from physically sloppy equipment. (2) Lack of calibration of measuring device. (3) Device loses calibration over time. (4) Manually recorded data errors, such as transposed digits. (5) Incorrect assumptions of linearity in the design of measurement. (6) Failure to record crucial environmental parameters. [That's just six minutes of thinking. Surely there are others.]
Yes, I omitted an important source of error: Sabotage! You're not paranoid, someone really is messing with your data. 162.158.79.113 01:34, 11 December 2019 (UTC)
So, a data error is an error in your data, instead of in your analysis? 172.68.132.107 11:35, 11 December 2019 (UTC)

If it were merely an error in analysis (see the recent mess with Python, [1] ), then you simply fix your analysis code and re-run. So, yes, a "data error" means the original data values were flawed or invalid or whatever. Most likely sabotage inflicted by sophons. Cellocgw (talk) 12:29, 11 December 2019 (UTC)

I'm happy that he said "two options" instead of "two choices", which of course would involve around four options. Watching the horrific Star Trek: Discovery for completist purposes, I was annoyed when someone said "you have only one alternative" when they meant "you have only one option". — Kazvorpal (talk) 18:39, 22 January 2020 (UTC)