Talk:2048: Curve-Fitting

House of Cards: Not a real method, but a common consequence of misapplying statistical methods: a curve can be generated that fits the data extremely well, but it becomes absurd as soon as one glances outside the range of the training data, and the analysis comes crashing down "like a house of cards". This is a type of overfitting.
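
A minimal numerical sketch of that effect (Python with NumPy; the data and the polynomial degree below are made up purely for illustration): a needlessly high-degree polynomial passes almost exactly through a handful of roughly linear points, yet gives absurd values the moment it is evaluated just outside the range it was fitted on, while a plain linear fit stays sensible.

import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 8)                         # 8 training points
y = 2.0 * x + 1.0 + rng.normal(0.0, 1.0, x.size)  # roughly linear data plus noise

linear = np.polyfit(x, y, 1)   # sensible model: straight line
wiggly = np.polyfit(x, y, 7)   # "house of cards": degree-7 polynomial through 8 points

print(np.polyval(linear, 12))  # just past the data range: still close to the 2*x + 1 trend
print(np.polyval(wiggly, 12))  # typically wildly off, often by orders of magnitude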

I'm pretty sure it refers to the TV show House of Cards, with the dots representing the quality of the series increasing until Netflix renewed it a bit too much. 172.68.26.65 (talk) (please sign your comments with ~~~~)

I'm a little mystified by the alt-text. Cauchy and Lorentz both seem like mathematically capable people. What am I missing? 172.69.62.226 17:46, 19 September 2018 (UTC)

Google-Fu reveals that it's a continuous probability distribution. This isn't bad per se, but it is quite visually distinctive and also can be quite...concerning if the data set isn't one where probability should be an issue. Werhdnt (talk) 18:00, 19 September 2018 (UTC)
This is not the issue; the problem is that the moments of the distribution (such as the mean and variance) don't exist, i.e. the defining integrals don't converge. See the edited explanation. So if you wanted to estimate the parameters of the distribution, taking the sample mean, for example, will not converge as the number of data points grows, and is therefore a bad thing to attempt. It is more mathematically alarming than alarmingly mathematical. GamesAndMath
My own Google-Fu brought me to a page with this information: “The distribution is important in physics as it is the solution to the differential equation describing forced resonance, while in spectroscopy it is the description of the line shape of spectral lines.” (from here: https://www.boost.org/doc/libs/1_53_0/libs/math/doc/sf_and_dist/html/math_toolkit/dist/dist_ref/dists/cauchy_dist.html) Justinjustin7 (talk) 18:09, 19 September 2018 (UTC)
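
To make GamesAndMath's point about undefined moments concrete, here is a small sketch (Python with NumPy; the seed and sample sizes are arbitrary): the mean of standard normal draws settles toward 0 as the sample grows, while the mean of standard Cauchy draws never settles, no matter how many points are drawn, because the distribution has no mean for it to settle toward.

import numpy as np

rng = np.random.default_rng(1)
for n in (100, 10_000, 1_000_000):
    normal_mean = rng.normal(size=n).mean()           # approaches 0 as n grows
    cauchy_mean = rng.standard_cauchy(size=n).mean()  # stays erratic at every n
    print(f"n={n:>9}: normal mean {normal_mean:+.4f}, Cauchy mean {cauchy_mean:+.4f}")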

To be honest, I'm a bit disappointed. I kinda expected a special comic with such a nice round number... Been counting down since comic #2000... 162.158.92.184 18:14, 19 September 2018 (UTC)
Different anon here: I think this is very special, and if Randall makes a poster available I will be buying several to give away. Of course, part of my business is experimental data analysis and modeling... and this is a fantastic summary of common errors.

Curve-Fitting

How fitting works needs to be explained. f(x)=mx+b works fine for single values, but how do we get that red line from the data set? --Dgbrt (talk) 20:12, 19 September 2018 (UTC)
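
For what it's worth, the standard answer is ordinary least squares: choose the m and b that minimize the sum of squared vertical distances between the data points and the line y = mx + b. A minimal sketch in Python with NumPy (the data points are made up; I'm assuming the comic's linear panel shows a plain least-squares fit):

import numpy as np

# made-up (x, y) points standing in for the scatter plot in the comic
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9])

# closed-form least-squares solution for y ≈ m*x + b:
#   m = cov(x, y) / var(x),  b = mean(y) - m * mean(x)
m = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b = y.mean() - m * x.mean()

print(m, b)                 # slope and intercept of the fitted line
print(np.polyfit(x, y, 1))  # NumPy's built-in degree-1 fit gives the same pair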