Talk:2739: Data Quality

Explain xkcd: It's 'cause you're dumb.
Revision as of 01:00, 18 February 2023 by 172.71.151.99 (talk)

Hash tables aren't lossy, maybe Randall means hash functions? Barmar (talk) 17:06, 17 February 2023 (UTC)

I was thinking more of (a subset of) a rainbow table than an associative array... Although such things tend not to preserve/respect item order (in reading, writing and altering in general), which is potentially information-lossy. 172.69.79.185 18:50, 17 February 2023 (UTC)
Hash tables have an ultra-low collision rate, as compared to the transforms used in packetwise error-correction... Since the comic is primarily focused on contrasting media fidelity with direct alteration of the content, ciphers seem a less direct association than content distribution networks? Given the context presented, my immediate association was the use of both piece & whole-pack hash verification, which has a collision rate so low terms like "number of particles in the universe" start entering the conversation. Upon further consideration, I wonder if Randall is referring to plain old CRC32 hash checking? Or the SHA hashes commonly used to verify disc downloads? (If it passes SHA *and* torrent content checking, I'd say you've probably got better chances of 1:1 integrity, than any original medium has of retaining it?)
ProphetZarquon (talk) 22:51, 17 February 2023 (UTC)
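
To make the distinction in this thread concrete, here is a minimal sketch in Python. It assumes Python's built-in dict as the hash table and the standard hashlib and zlib modules for SHA-256 and CRC32; the helper names sha256_hex, crc32_hex and verify are made up for illustration and do not come from the comic or the explanation. The table hands back exactly what was stored, while the hash functions squeeze any input into a fixed-size digest that cannot be reversed, only compared.

```python
import hashlib
import zlib

# A hash table (here: Python's dict) keeps the full key and value,
# so a lookup returns exactly what was stored. Nothing is lost.
table = {"cat.gif": b"original bytes"}
stored = table["cat.gif"]

def sha256_hex(data: bytes) -> str:
    """Hypothetical helper: SHA-256 digest as 64 hex characters."""
    return hashlib.sha256(data).hexdigest()

def crc32_hex(data: bytes) -> str:
    """Hypothetical helper: CRC32 checksum as 8 hex characters."""
    return format(zlib.crc32(data) & 0xFFFFFFFF, "08x")

def verify(data: bytes, expected_sha256: str) -> bool:
    """Check a download against a published digest (illustrative only)."""
    return sha256_hex(data) == expected_sha256

# A hash function is many-to-one: one byte and a million bytes
# both come out as a digest of the same fixed size.
short_digest = sha256_hex(b"a")
long_digest = sha256_hex(b"a" * 1_000_000)
```

A matching digest only says the copy is almost certainly identical to whatever was hashed originally. It cannot restore anything that is missing, which is the sense in which the hash itself is "lossy".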

GIFs aren't lossy either, though often other formats can't be converted to GIF without discarding information. Bemasher (talk) 18:27, 17 February 2023 (UTC)

I think that's the point. 172.68.50.203 20:12, 17 February 2023 (UTC)
GIFs are lossy in the very act of creating them: the actual colors of the real object have to be smashed down into (I think it’s) 256 different colors, resulting in an image that even human perception recognizes as crappy. Even the so-called ‘lossless’ formats such as PNG are lossy in the act of creation, just not as drastically as GIFs. A truly ‘lossless’ format would have to specify the exact intensity of every wavelength of electromagnetic radiation emanating from every atom of the original object. Good luck with that. 172.71.151.99 01:00, 18 February 2023 (UTC)
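
A rough sketch of the point both sides are making, in Python. GIF does limit each image to a palette of at most 256 colors. The fixed 3-3-2 bit palette below is only an assumption for illustration (real encoders usually pick a palette adapted to the image), and the names quantize_332 and palette_color are hypothetical. The loss happens when true-color pixels are mapped onto the palette; once the pixels are palette indices, storing them loses nothing further.

```python
def quantize_332(rgb):
    """Map a 24-bit (r, g, b) pixel to one of 256 palette indices.

    Keeps the top 3 bits of red, 3 of green and 2 of blue.
    This is the lossy step: many colors share one index.
    """
    r, g, b = rgb
    return ((r >> 5) << 5) | ((g >> 5) << 2) | (b >> 6)

def palette_color(index):
    """Return the (r, g, b) color that a palette index stands for."""
    r = (index >> 5) & 0b111
    g = (index >> 2) & 0b111
    b = index & 0b11
    return (r << 5, g << 5, b << 6)

# Two visibly different source colors land on the same palette entry.
a = quantize_332((200, 100, 50))
b = quantize_332((210, 110, 60))

# After quantization, index -> color -> index is an exact round trip.
round_trip_ok = all(quantize_332(palette_color(i)) == i for i in range(256))
```

So "GIF isn't lossy" and "making a GIF is lossy" can both hold: the format stores the indexed image exactly, but converting a true-color image into that indexed form discards information.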

Someone needs to add a table describing all the formats in the chart. Barmar (talk) 19:29, 17 February 2023 (UTC)

Yep. It needs a description of each point on the graph. I'm on my phone though... and feeling lazy after shoveling snow.
ProphetZarquon (talk) 22:54, 17 February 2023 (UTC)

It seems there are two definitions of data quality that Randall is juxtaposing for comic effect: in one, quality data is data that represents the original phenomenon without error or degradation. In the other, he's applying the concept of quality to the phenomenon itself – data is better if it describes a better phenomenon. My cat is better than your cat, therefore data about my cat is better than data about your cat. I'd like to see this concept in the explanation of the page but don't know how to add it into the flow of the current text. K95 (talk) 19:33, 17 February 2023 (UTC)

I already put that in earlier. See the second sentence of the second paragraph, I called it "general excellence". Barmar (talk) 21:45, 17 February 2023 (UTC)

"Data are transferred in bits"...Hear, hear. I'm over 60, I still remember stuff that was called "analog" ;-) -- 172.71.160.37 (talk) 20:07, 17 February 2023 (UTC) (please sign your comments with ~~~~)

Note, however, that we have been transferring data digitally for over four thousand years. That's how long it has been technically possible to make a lossless copy of a written story. -- Hkmaly (talk) 22:19, 17 February 2023 (UTC)
That's only if you're lucky enough to be still reading it in the original Klingon language, etc... 172.69.79.184 22:53, 17 February 2023 (UTC)
"It is a Klingon name!" 😾
Transcription definitely suffers from a Darmok & Jalad type contextual dependency.
ProphetZarquon (talk) 22:59, 17 February 2023 (UTC)

I think that "Better data" is a reference to gainful compression, and that "my better cat" doesn't specifically refer to the author but to the lyrical subject (as in poems). 172.68.50.203 20:12, 17 February 2023 (UTC)