Editing Talk:2739: Data Quality
::Hash tables have an ultra-low collision rate, as compared to the transforms used in packetwise error-correction... Since the comic is primarily focused on contrasting media fidelity with direct alteration of the content, ciphers seem a less direct association than content distribution networks? Given the context presented, my immediate association was the use of both piece & whole-pack hash verification, which has a collision rate so low terms like "number of particles in the universe" start entering the conversation. Upon further consideration, I wonder if Randall is referring to plain old CRC32 hash checking? Or the SHA hashes commonly used to verify disc downloads? (If it passes SHA *and* torrent content checking, I'd say you've probably got better chances of 1:1 integrity, than any original medium has of retaining it?)
::[[User:ProphetZarquon|ProphetZarquon]] ([[User talk:ProphetZarquon|talk]]) 22:51, 17 February 2023 (UTC)
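For anyone wanting to see what the CRC32 and SHA checks mentioned above do in practice, here is a minimal sketch using Python's standard library. The byte string and the helper names (<code>checksums</code>, <code>flip_bit</code>) are made up for illustration; real download tools hash whole files or torrent pieces, and nothing here claims to be what any particular tool does internally.

```python
import hashlib
import zlib


def checksums(data: bytes) -> tuple[int, str]:
    """Return (CRC32, SHA-256 hex digest) for a byte string."""
    return zlib.crc32(data), hashlib.sha256(data).hexdigest()


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with one bit inverted, simulating corruption."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


# Hypothetical payload standing in for a downloaded file.
original = b"some downloaded disc image bytes"
damaged = flip_bit(original, 5)

crc_ok, sha_ok = checksums(original)
crc_bad, sha_bad = checksums(damaged)

# Both checks notice the single flipped bit.
print("CRC32 differs:", crc_ok != crc_bad)
print("SHA-256 differs:", sha_ok != sha_bad)
```

CRC32 is only 32 bits wide, so it is good at catching accidental damage but easy to collide on purpose; SHA-256 produces a 256-bit digest, which is where the "number of particles in the universe" style of odds comes from.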
GIFs aren't lossy either, though often other formats can't be converted to GIF without discarding information. [[User:Bemasher|Bemasher]] ([[User talk:Bemasher|talk]]) 18:27, 17 February 2023 (UTC)
:I think that's the point. [[Special:Contributions/172.68.50.203|172.68.50.203]] 20:12, 17 February 2023 (UTC)
:GIFs are lossy in the very act of creating them: the actual colors of the real object have to be smashed down into (I think it’s) 256 different colors, resulting in an image that even human perception recognizes as crappy. Even the so-called ‘lossless’ formats such as PNG are lossy in the act of creation, just not as drastically as GIFs. A truly ‘lossless’ format would have to specify the exact intensity of every wavelength of electromagnetic radiation emanating from every atom of the original object. Good luck with that. [[Special:Contributions/172.71.151.99|172.71.151.99]] 01:00, 18 February 2023 (UTC)
::It's subjective whether formats (even .gif) can be recognised as 'crappy'. The display format may further tune down everything so that something defined with 65536 colours is more like 256, or it could work well with any given stippling/halftoning/dithering to produce something more like the better original than the file data strictly allows (even from 6 bits-per-pixel, or 3) when viewed at sufficient remove. And a .gif of a block-coloured diagram is notably better than a typical .jpg of one, despite the technically superior palette the latter has. (Nobody says that an image has to be from a real-life subject, with all kinds of missing data, such as photons that happen to hit the gap between CCD pixels but might be considered important and might well have been captured with the Mk 1 Eyeball and significantly 'noticed' by the nerves and ultimately the respective processing clusters of the brain behind it... Which has a complete set of 'analogue lossiness' to it, anyway.) [[Special:Contributions/172.71.242.203|172.71.242.203]] 16:37, 18 February 2023 (UTC)
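The two positions in this thread (GIF storage is lossless, but getting an image down to 256 colours may not be) can be illustrated with a small sketch. It assumes a fixed 3-3-2 bit palette purely for simplicity; real GIF encoders normally build an adaptive palette per image, and the function names <code>to_332</code> and <code>from_332</code> are invented for this example.

```python
def to_332(rgb):
    """Map a 24-bit (r, g, b) colour to an 8-bit index in a fixed 3-3-2 palette."""
    r, g, b = rgb
    return ((r >> 5) << 5) | ((g >> 5) << 2) | (b >> 6)


def from_332(index):
    """Expand an 8-bit 3-3-2 index back to its representative 24-bit colour."""
    r = (index >> 5) & 0b111
    g = (index >> 2) & 0b111
    b = index & 0b11
    return (r * 255 // 7, g * 255 // 7, b * 255 // 3)


# An arbitrary "photographic" colour does not survive the reduction...
photo_pixel = (200, 117, 33)
print(photo_pixel, "->", from_332(to_332(photo_pixel)))

# ...but a colour that is already in the palette round-trips exactly,
# which is why flat-coloured diagrams suit a palette format so well.
diagram_pixel = (255, 0, 0)
print(diagram_pixel, "->", from_332(to_332(diagram_pixel)))
```

Once every pixel is a palette index, storing and reloading those indices changes nothing; the information loss happens only in the conversion step, and only for colours outside the palette.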
Someone needs to add a table describing all the formats in the chart. [[User:Barmar|Barmar]] ([[User talk:Barmar|talk]]) 19:29, 17 February 2023 (UTC)
:::Tables are actually [https://mediawiki.org/wiki/Help:Tables quite easy to do] (if you don't intend to do much complex stuff), but also very easy to slightly mess up (temporarily - Preview is your friend, especially if you need to rowspan/colspan at all). For this purpose, nothing fancy. Header row, other rows, nothing particularly special in alignment, sorting, colour (foreground and/or background), etc. It'll be fairly intelligently fitted to the browser window, according to the contents.
:::However, here (when you might have large amounts of narrative in one column), perhaps just ";"-prefix a mini-header (can include "(in Title text)" or other shorthand details) and then have ":"-prefixed 'definition' prose that rambles on about each item in freehand text. I would suggest that's as complicated as you need it, no real need for tabling at all. (But, without wanting to show you how to use a hammer, then making every problem now look like a nail to you, I think you could handle ''learning'' the basic table-markup/learning where to get the more complex stuff. So there you are.) [[Special:Contributions/172.70.91.197|172.70.91.197]] 16:54, 19 February 2023 (UTC)
It seems there are two definitions of data quality that Randall is juxtaposing for comic effect: in one, quality data is data that represents the original phenomenon without error or degradation. In the other, he's applying the concept of quality to the phenomenon itself – data is better if it describes a better phenomenon. My cat is better than your cat, therefore data about my cat is better than data about your cat. I'd like to see this concept in the explanation of the page but don't know how to add it into the flow of the current text. [[User:K95|K95]] ([[User talk:K95|talk]]) 19:33, 17 February 2023 (UTC)
The opening sentence of the explanation, about data loss in transit, seems a bit irrelevant to the comic, which is only concerned with lossiness in information due to format. [[Special:Contributions/172.70.91.197|172.70.91.197]] 10:40, 20 February 2023 (UTC)
:''Very'' relevant to the parity ones. (Leads me to believe it's a scale of "amount of provided data to represent original data". You send less than you really ought to, the more left you go, you send more than you should ''technically'' need to as you go to the right. Checksums add a little bit extra, once you get to them, and ''correcting'' checksums (Hamming bits, etc) are significantly extra overhead. The whole 'better data' is basically "send a similar amount of newer information, or even more, on top of the original".) [[Special:Contributions/162.158.34.71|162.158.34.71]] 12:55, 20 February 2023 (UTC)
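To put rough numbers on the "a little bit extra" versus "significantly extra overhead" distinction, here is a small sketch comparing a single even-parity bit with a Hamming(7,4) code. The function names are made up for this example, and the bit layout (parity bits at positions 1, 2 and 4) is one common textbook convention rather than anything the comic specifies.

```python
def add_parity(bits):
    """Append one even-parity bit: detects, but cannot locate, a single flipped bit."""
    return bits + [sum(bits) % 2]


def parity_ok(word):
    """True if the word still has an even number of 1 bits."""
    return sum(word) % 2 == 0


def hamming74_encode(data):
    """Encode 4 data bits as 7 bits, with parity bits at positions 1, 2 and 4."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(code):
    """Correct at most one flipped bit, then return the 4 data bits."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4  # 1-based position of the bad bit, 0 if none
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


data = [1, 0, 1, 1]

# Parity: 1 extra bit per word; damage is detected but not repaired.
word = add_parity(data)
word[2] ^= 1
print("parity check passes after damage:", parity_ok(word))

# Hamming(7,4): 3 extra bits per 4 data bits; a single flipped bit is repaired.
code = hamming74_encode(data)
code[5] ^= 1
print("recovered data matches:", hamming74_decode(code) == data)
```

The parity scheme adds one bit to every word, while Hamming(7,4) adds three bits to every four, which is the extra cost of being able to fix an error rather than merely notice it.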