Editing Talk:2023: Y-Axis

"There are four kinds of lies: lies, damned lies, graphs, and statistics." [[User:Andyd273|Andyd273]] ([[User talk:Andyd273|talk]]) 13:37, 23 July 2018 (UTC)
:Lies by omission! ...not very funny, though. [[Special:Contributions/162.158.106.66|162.158.106.66]] 13:50, 25 July 2018 (UTC)
 
  
 
To me this graph stands out as having something very wrong far more than those that limit the y axis to a short range. If the grid lines were several shades lighter however...  [[User:PotatoGod|PotatoGod]] ([[User talk:PotatoGod|talk]]) 15:44, 23 July 2018 (UTC)
 
 
Also I wonder if anyone can find a legitimate (non-misleading) use for the semi-semi-log plot? I’m sure there’s some scenario where it could be useful. Perhaps showing the population growth of a species, then when the growth levels out at the maximum sustainable level for its environment (I forget the proper term from high school biology) showing more detail of the small population changes or something like that? [[User:PotatoGod|PotatoGod]] ([[User talk:PotatoGod|talk]]) 15:52, 23 July 2018 (UTC)
 
 
:Frankly, it would be better to just use 2 separate graphs. Even if you explain to the reader that the scale changes mid-way, it would still be misleading on the subconscious level. The whole point of visualization is to allow the reader to utilize that sweet auto-processing power of our brains so that we don't have to think about what we are looking at too much. [[User:Jaalenja|Jaalenja]] ([[User talk:Jaalenja|talk]]) 17:59, 23 July 2018 (UTC)
 
:Yes, specifically in anomaly or outlier detection, before doing any feature scaling/normalization, regression, sampling, or replacement of missing values. For data modeling, a semi-log plot can help you detect whether outliers affect your model or whether you are p-hacking based on outliers. Semi-log plots had their place when your programming language or software could not produce quantile-quantile plots, heteroskedasticity plots, etc. In layman's terms, it can be beneficial to compare the semi-log and non-logarithmic plots side by side to see how removing outliers or large values might change the plot or the results. However, there are now easily accessible heteroskedasticity and outlier functions in R, and cookbooks in Python, that let you test for outliers and data dredging more rigorously than semi-log plots do. Therefore, semi-log plots for outlier/anomaly detection may be going out of style. I am not sure whether any sciences still rely on semi-log plots in the data exploration step. Does anyone know of a specific science where semi-log plots are still used today? --[[Special:Contributions/162.158.186.36|162.158.186.36]] 22:51, 24 July 2018 (UTC)
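::A minimal sketch of the linear-versus-log comparison mentioned above, in plain Python with made-up data. The numbers and the helper name <code>bulk_fraction</code> are assumptions for illustration only, not taken from any real dataset or library. It measures how much of the axis the "ordinary" points occupy when one large outlier is present:

```python
import math

def bulk_fraction(values, transform=lambda v: v):
    """Fraction of the axis span used by all points except the largest one."""
    scaled = sorted(transform(v) for v in values)
    full_span = scaled[-1] - scaled[0]   # span of the whole axis
    bulk_span = scaled[-2] - scaled[0]   # span once the largest point is dropped
    return bulk_span / full_span

# Hypothetical data: five ordinary readings and one large outlier.
data = [1, 2, 3, 4, 5, 1000]

linear_share = bulk_fraction(data)            # linear y-axis
log_share = bulk_fraction(data, math.log10)   # semi-log y-axis

print(f"linear axis: bulk uses {linear_share:.1%} of the axis")
print(f"log axis:    bulk uses {log_share:.1%} of the axis")
```

::On the linear axis the five ordinary points are squeezed into well under 1% of the axis, while on the log axis they take up roughly a quarter of it. A large gap between the two numbers is a hint that a few extreme values dominate the plot.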
 
:I would use a semi-semi-log plot to compare the exponential behavior of one dataset with the linear behavior of another, but this would not be the intention of the comic, because the two axes would be used for distinct datasets. [[Special:Contributions/162.158.63.118|162.158.63.118]] 14:34, 25 July 2018 (UTC)
 
  
 
Are there any IRL examples of this type of plot trick? I've never seen it.
 
  
At first, I thought the X-axis was logarithmic, because it lacks labels. This can also cause the sudden data jump.
 
  
 
There are no Y-axis labels and values, the x-axis dates are questionable, and the data points are even more questionable, resembling linear growth at really convenient spots. [https://amp.businessinsider.com/images/50b62c2669beddc340000005-320-185.jpg Fox News misleading graph]
 
:I think you were onto something about the X-axis being logarithmic. The X-axis AND the Y-axis are both logarithmic. The trick is to realize that the X-axis is reversed. The Y-axis is logarithmic between 50% and 100%, but the X-axis is logarithmic on the LEFT and AFTER the first tick mark. A readable symlog or x-axis semi-log plot has the logarithmic part on the LEFT or AFTER the first tick mark. This, I think, really highlights an important point that Randall is making with this comic: '''Whether you exaggerate tick marks to fit the range of the data or adjust ticks to a range outside of the data, you ultimately skew the meaning of the plot.''' Both the Y-axis trick and log scaling are bad. --[[Special:Contributions/162.158.186.36|162.158.186.36]] 22:51, 24 July 2018 (UTC)
 
:Yes, there is a programming example in Python besides the Fox News one shown above. You can reproduce this plot using the symlog scale in matplotlib. This is my first time posting on this wiki, so I am not sure if I should edit the page to include this example. Here is a link: https://matplotlib.org/gallery/scales/symlog_demo.html . Specifically, the double symlog plot has axes similar to those in Randall's picture. You can also do this in R; however, it is intentionally much harder to do because of the very point Randall is making. --[[Special:Contributions/162.158.186.36|162.158.186.36]] 22:51, 24 July 2018 (UTC)
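::To show what a mixed linear/logarithmic axis does to values, here is a simplified sketch in plain Python. It is an assumption-laden illustration, not matplotlib's exact transform (the real symlog scale also has a <code>linscale</code> setting that stretches the linear region). The function name <code>symlog</code> and the threshold of 1.0 are chosen for this example only:

```python
import math

def symlog(x, linthresh=1.0, base=10.0):
    """Simplified symmetric-log mapping.

    Values within [-linthresh, linthresh] are kept linear; values outside
    are compressed logarithmically. The two pieces meet at +/-linthresh.
    """
    if abs(x) <= linthresh:
        return x
    magnitude = linthresh * (1.0 + math.log(abs(x) / linthresh, base))
    return math.copysign(magnitude, x)

# Equal steps on the output axis correspond to factors of 10 outside the
# linear region, but to plain differences inside it.
for value in [-100, -10, -1, -0.5, 0, 0.5, 1, 10, 100]:
    print(f"{value:>7} -> {symlog(value):+.2f}")
```

::The point where the rule switches is exactly where the curve in the comic appears to jump: the data do not change there, only the meaning of the distance between ticks.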
 
:There is an interesting color version of the point Randall is making that was published today in livescience: [https://www.livescience.com/63153-brain-color-distortion-maps.html].  Turns out our eyes for color expect this kind of scaling distortion. --[[Special:Contributions/162.158.186.36|162.158.186.36]] 22:51, 24 July 2018 (UTC)
 
:There is also a related problem for discrete plots like bar charts: waterfall charts. Waterfall charts are so bad that there is a saying in business, "Waterfall charts are how you lie to stakeholders". Here is a deeper explanation: https://zebrabi.com/excel-waterfall-chart/ --[[Special:Contributions/162.158.186.36|162.158.186.36]] 22:51, 24 July 2018 (UTC)
 
 
Here is an example of a peer-reviewed scientific paper using a mixed linear/logarithmic scale on both axes: http://dx.doi.org/10.1029/2004JA010829 (Figure 9, page 8) [[Special:Contributions/162.158.222.52|162.158.222.52]] 12:17, 30 July 2018 (UTC)
 
