Editing 2494: Flawed Data
Latest revision | Your text | ||
Line 4: | Line 4: | ||
| title = Flawed Data | | title = Flawed Data | ||
| image = flawed_data.png | | image = flawed_data.png | ||
− | | titletext = We trained it to produce data that looked convincing, and we have to admit | + | | titletext = We trained it to produce data that looked convincing, and we have to admit the results look convincing! |
}} | }} | ||
==Explanation== | ==Explanation== | ||
+ | {{incomplete|Created by a flawed but CONVINCING AI. Please mention here why this explanation isn't complete. Do NOT delete this tag too soon.}} | ||
+ | This is another comic about the right and wrong ways to do research when your data is inadequate. | |
− | This is | + | This time we see [[Cueball]] openly admit that they have realized that all of their data is actually flawed. In the first panel he presents the data on a poster with two graphs showing data points and possible fitted curves. |
− | + | From there, three different reactions to this realization are shown, ordered from the best decision to the worst. |
− | |||
− | From there | ||
;Good | ;Good | ||
− | In the first scenario Cueball | + | In the first scenario Cueball admits that they are no longer sure about the conclusions they had drawn from the flawed data. That is, they cannot really draw any conclusions, which is the right (good) decision once you realize that your data is not valid. |
;Bad | ;Bad | ||
− | In the second scenario Cueball then explains that after | + | In the second scenario Cueball explains that after doing a lot of math (manipulating their flawed data), they decided the data was actually fine. Since the data is flawed, no amount of math will make it valid, so hiding the flaws behind lots of math is a bad approach. Alternatively, they may have looked for mathematical reasons why their bad data is correct, effectively changing the model and the expected outcome so that the bad data fits well. While statistical analysis can be used to discard "flaws", e.g. outliers in a data set, it is not valid to do this only after the results failed to match your expectations. Since there are many different statistical methods and tests, trying one after another will almost guarantee that you eventually get the outcome you are after, even if the data is flawed. |
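The claim about trying many tests can be illustrated with a little arithmetic. As a rough sketch, assume that every test is independent and uses the conventional 5% significance threshold; both are simplifying assumptions for illustration, not something stated in the comic, and the function name is made up. The chance that at least one test reports a "significant" result purely by accident then grows quickly with the number of tests tried:

```python
def chance_of_false_positive(num_tests, alpha=0.05):
    """Probability that at least one of num_tests independent tests looks
    'significant' even though there is no real effect.

    Assumes independent tests and a shared threshold alpha (illustrative
    simplifications, not taken from the comic).
    """
    # Each test stays quiet with probability (1 - alpha); all of them
    # staying quiet has probability (1 - alpha) ** num_tests.
    return 1 - (1 - alpha) ** num_tests


if __name__ == "__main__":
    for k in (1, 5, 20, 50):
        print(f"{k:2d} tests tried: {chance_of_false_positive(k):.0%} chance of a spurious 'finding'")
```

Under these assumptions, a single test misleads 5% of the time, while twenty different tests give roughly a two-in-three chance that at least one of them "confirms" the desired result.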
;Very bad | ;Very bad | ||
− | In the third and final scenario | + | In the third and final scenario Cueball explains that they scrapped all the flawed data. But instead of collecting new data through research, measurements or tests, they trained an {{w|Artificial Intelligence}} (AI) to generate better data. This is of course not real data, just a simulation of data. And since they are probably looking for a specific result, they could train the AI to generate data that supports it. This has nothing to do with researching the problem they are actually investigating and is thus very bad. They do gain some insight into programming the AI (see [[2173: Trained a Neural Net]]). AI is a recurring [[:Category:Artificial Intelligence|theme]] on xkcd. |
− | In the title text | + | In the title text the results of the very bad approach are mentioned, and the fact that they got the data they were looking for is made clear when they state that ''We trained it to produce data that looked convincing, and we have to admit the results look convincing!'' Of course, if they successfully ask the AI for data that supports their theory in a way that looks convincing, that is what they get back. |
==Transcript== | ==Transcript== | ||
Line 30: | Line 30: | ||
:Cueball: We realized all our data is flawed. | :Cueball: We realized all our data is flawed. | ||
− | :[The three next panels all have a label in a frame going over the top of each panels frame. The poster can no longer be seen in the rest of the panels.] | + | :[The next three panels all have a label in a frame going over the top of each panel's frame. The poster can no longer be seen in the rest of the panels. Cueball has taken the stick down.] |
:Label: Good | :Label: Good | ||
− | |||
:Cueball: ...So we're not sure about our conclusions. | :Cueball: ...So we're not sure about our conclusions. | ||
+ | :[Cueball holds the pointer almost as in the first panel.] | ||
:Label: Bad | :Label: Bad | ||
− | |||
:Cueball: ...So we did lots of math and then decided our data is actually fine. | :Cueball: ...So we did lots of math and then decided our data is actually fine. | ||
+ | :[Cueball holds the pointer so it points upwards. He also lifts his other hand slightly.] | |
:Label: Very bad | :Label: Very bad | ||
− | |||
:Cueball: ...So we trained an AI to generate better data. | :Cueball: ...So we trained an AI to generate better data. | ||
Line 49: | Line 48: | ||
[[Category:Statistics]] | [[Category:Statistics]] | ||
[[Category:Artificial Intelligence]] | [[Category:Artificial Intelligence]] | ||
− |