Talk:2739: Data Quality

 
: Maybe it was meant to be about '''cuckoo filters''', a probabilistic data structure that is an alternative to the classic Bloom filter and is based on space-efficient variants of cuckoo hashing? --[[User:JakubNarebski|JakubNarebski]] ([[User talk:JakubNarebski|talk]]) 14:05, 20 February 2023 (UTC)
 
 
::Hash tables don't have to store the original data at all, technically; they are commonly done as hash table->KEY:DATA or hash table->KEY:Pointer to data (or suchlike), but hash table->present is a valid hashing scheme, which gives you a likely verification that you have the right data (but not guaranteed, because of collisions) and no way of reconstructing the data itself. [[User:Mneme|Mneme]] ([[User talk:Mneme|talk]]) 02:25, 21 February 2023 (UTC)
 
:He’s casually referring to the hash conflict situation in common implementations of hash tables: the table of hashes, not the whole structure. You have O(n) lookup speed proportional to the impact of uniqueness lost in the hash lookup. The point is that this is the same way that Bloom filters (which also usually need a source of truth to be useful) are used. The two concepts perform the same function but with different degrees of lossiness and different breadths of matching. [[Special:Contributions/162.158.62.140|162.158.62.140]] 16:40, 24 February 2023 (UTC) EDIT: it also leaves it ambiguous whether it could mean a table of hash function outputs, as you suggest, where hashes have often been thought of as uniquely identifying data that is not recoverable (this does require a sufficiently constrained situation but is often used), whereas Bloom filters are thought of as ambiguously referring to multiple items. I can imagine it being clearer to leave out the word "table". [[Special:Contributions/172.70.114.78|172.70.114.78]] 16:48, 24 February 2023 (UTC)
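
A minimal Python sketch of the "hash table->present" idea from the comments above: only hashes are stored, so membership can be checked with some chance of a false positive, but the original data cannot be reconstructed. The names <code>PresenceTable</code> and <code>short_hash</code> are hypothetical, and truncating SHA-256 to a small number of bits is an assumption made here purely so that collisions are easy to provoke; it is not how any particular library does it.

```python
import hashlib


def short_hash(item: str, bits: int = 16) -> int:
    """Hypothetical helper: reduce a SHA-256 digest to `bits` bits."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % (1 << bits)


class PresenceTable:
    """Stores only the hashes of items, never the items themselves."""

    def __init__(self, bits: int = 16) -> None:
        self.bits = bits
        self._seen = set()

    def add(self, item: str) -> None:
        self._seen.add(short_hash(item, self.bits))

    def probably_contains(self, item: str) -> bool:
        # True means "likely present": a colliding item also returns True.
        return short_hash(item, self.bits) in self._seen

    def stored_hashes(self) -> frozenset:
        # Everything the table keeps: integers, from which the
        # original strings cannot be recovered.
        return frozenset(self._seen)


table = PresenceTable()
table.add("xkcd 2739")
print(table.probably_contains("xkcd 2739"))

# With bits=0 every item hashes to 0, so everything "matches":
# an extreme illustration of the collision problem.
lossy = PresenceTable(bits=0)
lossy.add("a")
print(lossy.probably_contains("something never added"))
```

Shrinking <code>bits</code> trades storage for more false positives, which is the same kind of trade-off a Bloom filter makes, though a Bloom filter uses several hash functions over a bit array rather than a set of truncated hashes.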
 
  
 
GIFs aren't lossy either, though often other formats can't be converted to GIF without discarding information. [[User:Bemasher|Bemasher]] ([[User talk:Bemasher|talk]]) 18:27, 17 February 2023 (UTC)
 
 
 
:I think that's the point. [[Special:Contributions/172.68.50.203|172.68.50.203]] 20:12, 17 February 2023 (UTC)
 
 
:GIFs are lossy in the very act of creating them: the actual colors of the real object have to be smashed down into (I think it’s) 256 different colors, resulting in an image that even human perception recognizes as crappy. Even the so-called ‘lossless’ formats such as PNG are lossy in the act of creation, just not as drastically as GIFs. A truly ‘lossless’ format would have to specify the exact intensity of every wavelength of electromagnetic radiation emanating from every atom of the original object. Good luck with that. [[Special:Contributions/172.71.151.99|172.71.151.99]] 01:00, 18 February 2023 (UTC)
 
:::GIFs can only have 256 colors per *frame*, but can have many frames, so 16,777,216 (256^3) colors total should be possible. [[User:SDSpivey|SDSpivey]] ([[User talk:SDSpivey|talk]]) 01:39, 7 March 2023 (UTC)
 
::::Temporal dithering? Don't know if that's the term for it, but it's the one I'd use to describe it.
 
::::And I remember trying that on a BBC Microcomputer, messing with fast direct video-memory copying and also the interrupts to get the high-res but monochrome MODE 0 (1-bitplane, but with some choice of foreground and background colours that can be changed fairly rapidly, as well as in horizontal bands) to create a disconcerting effect (I wouldn't subject an epileptic to it!) that could still approximate at least a 3-bit colour-mode. Half the colour-res of MODE 2, twice that of MODE 1, but vertical dot-res twice that of the latter and ''quadruple'' that of the former. IIRC. [[Special:Contributions/172.70.85.225|172.70.85.225]] 02:01, 7 March 2023 (UTC)
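
The 256-colours-per-frame arithmetic above can be checked with a short Python sketch. It assumes every frame carries its own 256-entry colour table and that no colour is repeated between frames; <code>frame_palette</code> is a hypothetical helper that just enumerates 24-bit RGB values in order and does not write an actual GIF file.

```python
COLORS_PER_FRAME = 256
TOTAL_RGB_COLORS = 256 ** 3  # 24-bit RGB: 16,777,216 distinct colours

# Frames needed if each frame contributes 256 colours no other frame uses.
frames_needed = TOTAL_RGB_COLORS // COLORS_PER_FRAME


def frame_palette(frame_index: int) -> list:
    """Hypothetical helper: the 256 (r, g, b) colours assigned to one frame."""
    start = frame_index * COLORS_PER_FRAME
    return [
        ((c >> 16) & 0xFF, (c >> 8) & 0xFF, c & 0xFF)
        for c in range(start, start + COLORS_PER_FRAME)
    ]


print(TOTAL_RGB_COLORS)   # 16777216
print(frames_needed)      # 65536
print(frame_palette(0)[0], frame_palette(frames_needed - 1)[-1])
```

Under those assumptions it takes 65,536 frames to cover every 24-bit colour, so the total is reachable in principle but only across the whole animation, never within a single frame.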
 
 
::It's subjective whether formats (even .gif) can be recognised as 'crappy'. The display format may further tune down everything so that something defined with 65536 colours is more like 256, or it could work well with any given stippling/halftoning/dithering to produce something more like the better original than the file data strictly allows (even from 6 bits-per-pixel, or 3) when viewed at sufficient remove. And a .gif of a block-coloured diagram is notably better than a typical .jpg of one, despite the technically superior palette the latter has. (Nobody says that an image has to be from a real-life subject, with all kinds of missing data, such as photons that happen to hit the gap between CCD pixels but might be considered important and might well have been captured with the Mk 1 Eyeball and significantly 'noticed' by the nerves and ultimately the respective processing clusters of the brain behind it... Which has a complete set of 'analogue lossiness' to it, anyway.) [[Special:Contributions/172.71.242.203|172.71.242.203]] 16:37, 18 February 2023 (UTC)
 
:I encoded the records you wanted transferred to your department's systems into a standard GIF format file. Would you prefer an MJPEG video? [[Special:Contributions/172.70.114.79|172.70.114.79]] 16:51, 24 February 2023 (UTC) EDIT: You're right, though. Maybe Randall has experience with color loss using GIF. In the '90s GIF was a compressed photography format, smaller than BMP. 16:54, 24 February 2023 (UTC)
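
To illustrate the colour loss discussed in this thread, here is a minimal Python sketch of squeezing 24-bit RGB into 256 palette entries. It assumes a fixed 3-3-2 palette (3 bits red, 3 bits green, 2 bits blue), chosen only because it is easy to follow; real GIF encoders typically pick a palette adapted to the image, and <code>quantize_332</code> and <code>palette_color</code> are hypothetical names.

```python
def quantize_332(r: int, g: int, b: int) -> int:
    """Map a 24-bit colour to one of 256 palette indices (3-3-2 bits)."""
    return ((r >> 5) << 5) | ((g >> 5) << 2) | (b >> 6)


def palette_color(index: int) -> tuple:
    """Representative colour for a palette index (low end of each bucket)."""
    r = (index >> 5) & 0b111
    g = (index >> 2) & 0b111
    b = index & 0b11
    return (r << 5, g << 5, b << 6)


# Two visibly different source colours land on the same palette entry,
# so the distinction between them cannot be recovered from the index.
a = quantize_332(200, 100, 50)
b = quantize_332(210, 110, 60)
print(a, b, palette_color(a))
```

The file format itself stores the chosen indices exactly; the information is discarded in the conversion step, which is the distinction the comments above are drawing.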
 
  
 
Someone needs to add a table describing all the formats in the chart. [[User:Barmar|Barmar]] ([[User talk:Barmar|talk]]) 19:29, 17 February 2023 (UTC)
 
