2731: K-Means Clustering

==Explanation==
{{incomplete|Created by 3 TYPES OF EDITORS - Please change this comment when editing this page. Do NOT delete this tag too soon.}}
 
[[Ponytail]] is giving a talk about her research group's analysis of the different types of people in the world.
  
A popular class of wry observations uses the {{wiktionary|snowclone}} "There are two types of people in the world... those who do A, and those who do B". Here B is usually, though not always, some antithesis of A. The most self-referential version is the joke "There are two types of people in the world - those who divide people into two types, and those who don't". Other well-known versions include "There are three types of people in the world - those who can count, and those who can't", "There are two types of people in the world - those who can extrapolate... ", and "There are 10 types of people in the world - those who understand binary and those who don't."
  
 
Ponytail uses {{w|K-means_clustering|''k''-means clustering}} with k=3. This is a method of partitioning data into k groups. To see how it works, imagine a set of people of various heights and weights who should be split into 3 groups (which gives k the value 3). One way to do this is to plot the data on a scatter chart, pick three points at random as reference points, and sort the people according to which reference point they are closest to, forming 3 initial groups. The mean of the data points in each group is then computed, and these means are used as new reference points to sort all the data into 3 new groups. This process is repeated until it converges; that is, until no data point changes group after the reference points are recomputed.
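
The procedure described above can be sketched in a few lines of Python. This is only a minimal illustration, not the method used in the comic: the function name <code>kmeans</code>, the height and weight values, and the choice to draw the initial reference points from the data itself (rather than from anywhere on the chart) are all assumptions made for the example.

```python
import random

def kmeans(points, k, seed=0, max_iter=100):
    """Basic k-means on a list of (x, y) tuples (hypothetical helper)."""
    rng = random.Random(seed)
    # Assumption: use k distinct data points as the initial reference points.
    centroids = rng.sample(points, k)
    assignment = None
    for _ in range(max_iter):
        # Sort every point into the group of its nearest reference point.
        new_assignment = [
            min(range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                            + (p[1] - centroids[c][1]) ** 2)
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed group
        assignment = new_assignment
        # Move each reference point to the mean of its group.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty group keeps its old reference point
                centroids[c] = (sum(x for x, _ in members) / len(members),
                                sum(y for _, y in members) / len(members))
    return assignment, centroids

# Made-up (height in cm, weight in kg) pairs, purely for illustration.
people = [(150, 50), (152, 53), (149, 48),
          (170, 70), (172, 68), (169, 72),
          (190, 95), (192, 98), (188, 93)]

groups, centers = kmeans(people, k=3)
print(groups)
print(centers)
```

Because the starting points are random, different seeds can settle on different groupings; the algorithm only guarantees that it stops at a stable one, not that it finds the best one.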
 
Ponytail's finding that there are three clusters is unsurprising if she herself falls into the category of those who use k=3 as a fixed value, since that inevitably produces three clusters. The joke, however, is that if one group's trait is "uses k=3", then everyone outside that group does not use k=3. With two other groups, that description applies to both, so it is unclear what distinguishes them from each other.

This could easily have been fixed by naming the groups as those who use k-means clustering with k=3, those who use k<3, and those who use k>3. Splitting the rest into just two groups would then seem to be no problem... ''except'' that it fails to account for those who have no preconceived value of k at all! (Ideally, one looks for the lowest practical k that gives the least total scatter around each cluster's center; there are various competing ways of doing this, depending on the details of the analysis.)
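
To make the idea of "total scatter" concrete, the sketch below runs a basic k-means for every k from 1 up to the number of data points and records the summed squared distance from each point to its cluster's center. Everything here is an assumption for illustration: the helper names <code>kmeans_scatter</code> and <code>best_scatter</code>, the made-up height and weight data, and the use of a few random restarts to avoid poor starting points.

```python
import random

def dist2(p, q):
    """Squared Euclidean distance between two (x, y) points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def kmeans_scatter(points, k, seed=0, max_iter=100):
    """Run a basic k-means and return the total squared distance of
    all points to their cluster centers (hypothetical helper)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: start from data points
    assignment = None
    for _ in range(max_iter):
        new_assignment = [min(range(k), key=lambda c: dist2(p, centroids[c]))
                          for p in points]
        if new_assignment == assignment:
            break  # converged
        assignment = new_assignment
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = (sum(x for x, _ in members) / len(members),
                                sum(y for _, y in members) / len(members))
    return sum(dist2(p, centroids[a]) for p, a in zip(points, assignment))

def best_scatter(points, k, restarts=10):
    """Lowest scatter found over several random starts."""
    return min(kmeans_scatter(points, k, seed=s) for s in range(restarts))

# Made-up (height in cm, weight in kg) pairs, purely for illustration.
people = [(150, 50), (152, 53), (149, 48),
          (170, 70), (172, 68), (169, 72),
          (190, 95), (192, 98), (188, 93)]

scatter_by_k = {k: best_scatter(people, k) for k in range(1, len(people) + 1)}
for k, s in scatter_by_k.items():
    print(k, round(s, 2))
```

The scatter drops sharply up to the "natural" number of groups and only slowly after that, which is why one typically looks for the smallest k beyond which little is gained. At the extreme, when k equals the number of data points, every point is its own cluster and the scatter is exactly zero.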
  
 
In the title text, Ponytail (or perhaps [[Randall]]) claims: "According to my especially unsupervised K-means clustering algorithm, there are currently about 8 billion types of people in the world."
