2295: Garbage Math

==Explanation==
This comic illustrates the "{{w|garbage in, garbage out}}" concept using mathematical expressions. It shows how, if you have garbage as inputs to your calculations, you will likely get garbage as a result, except when you multiply by zero, which eliminates all uncertainty from the result.
  
The propagation of errors in {{w|arithmetic}}, other {{w|mathematical operations}}, and {{w|statistics}} is described in colloquial terms. Numbers with low precision are termed garbage, while numbers with high precision are called precise. The table below quantifies the change in precision from the operands to their result in terms of their {{w|standard deviation}}, represented by σ, the Greek lowercase letter sigma, which equals the square root of the {{w|variance}}. Variance and standard deviation are common specifications of uncertainty (as an alternative to, for example, a {{w|tolerance interval}}).
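The relationship between σ and the variance can be illustrated with a short sketch using only Python's standard library (the measurement values are made up for the example):

```python
import math
import statistics

# A small set of hypothetical repeated measurements of the same quantity.
measurements = [9.8, 10.1, 10.0, 9.9, 10.2]

variance = statistics.pvariance(measurements)  # population variance
sigma = statistics.pstdev(measurements)        # population standard deviation, sigma

# The standard deviation is exactly the square root of the variance.
assert math.isclose(sigma, math.sqrt(variance))
```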
Some of these rules correspond to the rules of {{w|floating point arithmetic}}, while the {{w|accuracy and precision}} of the mathematical operations correspond to the rules of {{w|Propagation_of_uncertainty#Example_formulae|propagation of uncertainty}}, where a "garbage" number would correspond to an estimate with a high degree of uncertainty and a precise number has low uncertainty. The uncertainty of the result of such operations will usually be dominated by the term with the highest uncertainty. The rule about N pieces of independent garbage used to calculate an {{w|arithmetic mean}} reflects how the {{w|central limit theorem}} predicts that the uncertainty (or {{w|standard error}}) of an estimate is reduced when independent estimates are averaged. The comic oddly omits raising garbage to the 0th power, which transforms even NaN, the platonic ideal of garbage, into exactly 1.
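The averaging prediction of the central limit theorem is easy to check numerically. The following sketch (assuming Python's standard `random` and `statistics` modules; the constants are arbitrary) shows that the spread of an N-sample mean shrinks to roughly σ/√N:

```python
import random
import statistics

random.seed(42)
TRUE_VALUE, SIGMA, N, TRIALS = 10.0, 2.0, 100, 2000

# For each trial, average N independent noisy ("garbage") estimates.
means = [
    statistics.fmean(random.gauss(TRUE_VALUE, SIGMA) for _ in range(N))
    for _ in range(TRIALS)
]

observed = statistics.pstdev(means)
predicted = SIGMA / N ** 0.5  # central limit theorem: sigma / sqrt(N)

# The observed spread of the means matches the prediction closely.
assert abs(observed - predicted) / predicted < 0.1
```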
This comic is not related to the {{w|2019–20 coronavirus outbreak|2020 pandemic}} of the {{w|coronavirus}} {{w|SARS-CoV-2}}, which causes {{w|COVID-19}}, breaking the streak of comics preceding this on [[:Category:COVID-19|topics relating to COVID-19]], after (rather appropriately) 19 comics (not counting the [[2288: Collector's Edition|April Fools' comic]]).
  
 
{| class="wikitable"
!Formula as shown
!Resulting uncertainty
!Explanation
|-
|Precise number + Precise number = Slightly less precise number
|<math>\mathop\sigma(X+Y)=\sqrt{\mathop\sigma(X)^2+\mathop\sigma(Y)^2}</math>
|If we know absolute error bars, then adding two precise numbers will at worst add the sizes of the two error bars. For example, if our precise numbers are 1 (±10<sup>-6</sup>) and 1 (±10<sup>-6</sup>), then our sum is 2 (±2·10<sup>-6</sup>). It is possible to lose a lot of relative precision if the resulting sum is close to zero as a result of adding a number to its approximate negation, a phenomenon known as {{w|catastrophic cancellation}}; the stated assertion therefore assumes both numbers are positive.
|-
|Precise number × Precise number = Slightly less precise number
|<math>\mathop\sigma(X\times Y)\cong\sqrt{(\mathop\sigma(X)\times Y)^2+(\mathop\sigma(Y)\times X)^2}</math>
|Here, instead of absolute errors, relative errors add. For example, if our precise numbers are 1 (±10<sup>-6</sup>) and 1 (±10<sup>-6</sup>), then our product is 1 (±2·10<sup>-6</sup>).
|-
|Precise number + Garbage = Garbage
|<math>\mathop\sigma(X+Y)=\sqrt{\mathop\sigma(X)^2+\mathop\sigma(Y)^2}</math>
|If one of the numbers has a high absolute error, and the numbers being added are of comparable size, then this error will be propagated to the sum.
|-
|Precise number × Garbage = Garbage
|<math>\mathop\sigma(X\times Y)\cong\sqrt{(\mathop\sigma(X)\times Y)^2+(\mathop\sigma(Y)\times X)^2}</math>
|Likewise, if one of the numbers has a high relative error, then this error will be propagated to the product, independent of the sizes of the numbers.
|-
|√<span style="border-top:1px solid; padding:0 0.1em;">Garbage</span> = Less bad garbage
|<math>\mathop\sigma(\sqrt X)\cong\frac{\mathop\sigma(X)}{2\sqrt X}</math>
|When the square root of a number is computed, its relative error will be halved. Depending on the application, this might not be all that much ''better'', but it's at least ''less bad''.
|-
|Garbage<sup>2</sup> = Worse garbage
|<math>\mathop\sigma(X^2)\cong 2\times X\times\mathop\sigma(X)</math>
|Likewise, when a number is squared, its relative error will be doubled. This is a corollary of multiplication adding relative errors.
|-
|<math>\frac{1}{N}\sum(</math>N pieces of statistically independent garbage<math>)</math> = Better garbage
|<math>\sigma_{\bar{X}} = \frac{\sigma_X}{\sqrt{N}}</math>
|By aggregating many statistically independent observations (for instance, surveying many individuals), it is possible to reduce the uncertainty of the mean to the {{w|Standard_error#Standard_error_of_the_mean|standard error of the mean}}. This is the basis of statistical sampling and the {{w|central limit theorem}}.
|-
|Precise number<sup>Garbage</sup> = Much worse garbage
|<math>\mathop\sigma(b^X)\cong|b^X|\times\ln b\times\mathop\sigma(X)</math>
|The result is very sensitive to changes in the exponent, and the effect is magnified further by the magnitude of the precise base.
|-
|Garbage – Garbage = Much worse garbage
|<math>\mathop\sigma(X-Y)=\sqrt{\mathop\sigma(X)^2+\mathop\sigma(Y)^2}</math>
|This line involves {{w|catastrophic cancellation}}. If both pieces of garbage are about the same size (e.g. if their error bars overlap), then the answer could be positive, zero, or negative.
|-
|<math>\frac{\text{Precise number}}{\text{Garbage}-\text{Garbage}}</math> = Much worse garbage, possible division by zero
|<math>\mathop\sigma\left(\frac{a}{X-Y}\right)\cong\frac{|a|}{(X-Y)^2}\times\sqrt{\mathop\sigma(X)^2+\mathop\sigma(Y)^2}</math>
|As above, if the error bars of the two pieces of garbage overlap, then we might end up dividing by zero.
|}
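Several of the table's rules can be sanity-checked with a small Monte Carlo simulation. This sketch (standard-library Python; the sample size and the 5% tolerance are arbitrary choices) confirms the quadrature rule for addition and the factor-of-two rules for square roots and squares:

```python
import random
import statistics

random.seed(0)
X0, SIGMA, TRIALS = 100.0, 1.0, 100_000

# Two independent "garbage" quantities centered on X0 with uncertainty SIGMA.
xs = [random.gauss(X0, SIGMA) for _ in range(TRIALS)]
ys = [random.gauss(X0, SIGMA) for _ in range(TRIALS)]

def close(a, b, tol=0.05):
    """True if a is within tol (relative) of b."""
    return abs(a - b) <= tol * b

# Addition: uncertainties combine in quadrature.
assert close(statistics.pstdev(x + y for x, y in zip(xs, ys)),
             (SIGMA**2 + SIGMA**2) ** 0.5)

# Square root: absolute uncertainty shrinks by the factor 1 / (2 * sqrt(X)).
assert close(statistics.pstdev(x ** 0.5 for x in xs),
             SIGMA / (2 * X0 ** 0.5))

# Squaring: absolute uncertainty grows by the factor 2 * X.
assert close(statistics.pstdev(x ** 2 for x in xs),
             2 * X0 * SIGMA)
```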
  
The title text refers to the computer science maxim of "garbage in, garbage out," which states that supplying incorrect input data to a program will produce incorrect results, even if the code itself accurately does what it is supposed to do. As the table above shows, however, plugging data into mathematical formulas can magnify the error of the input data, and there are also ways to reduce that error (such as aggregating data). The quantity of garbage is therefore not necessarily {{w|Conservation law|conserved}}, in contrast to other scientific quantities like energy and momentum that are always conserved. Alternatively, this could be taken as a pun on environmental conservation efforts, which often involve recycling one's trash, although the computing maxim has nothing to do with actual garbage.
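Catastrophic cancellation, the mechanism behind the "Garbage – Garbage" rows above, also shows up in ordinary floating-point arithmetic, where every stored number carries a tiny rounding error. A minimal sketch in Python:

```python
# Two nearly equal IEEE 754 doubles; each is accurate to ~16 significant digits.
a = 1.0 + 1e-12
b = 1.0

diff = a - b  # mathematically exactly 1e-12

# The subtraction keeps only the few digits where a and b disagreed,
# so the *relative* error of the result is hugely magnified compared
# with the ~2.2e-16 relative rounding error of the inputs.
rel_err = abs(diff - 1e-12) / 1e-12
assert rel_err > 1e-6  # many orders of magnitude worse than machine epsilon
```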
  
 
==Transcript==

{{incomplete transcript|Do NOT delete this tag too soon.}}

[A series of mathematical equations are written from top to bottom]
  
 
√<span style="border-top:1px solid; padding:0 0.1em;">Garbage</span> = Less bad garbage

Garbage² = Worse garbage

1/N Σ (N pieces of statistically independent garbage) = Better garbage
