1683: Digital Data

Revision as of 04:15, 20 May 2016 by (talk) (Transcript)
Title text: “If you can read this, congratulations—the archive you’re you're using still knows about the mouseover text”!



[Cueball and White Hat are walking.]

Cueball: The great thing about digital data is that it never degrades.

[The next panel is slightly pixelated.]

Cueball: Hard drives fail, of course, but their bits can be copied forever without loss.

[The third panel is more pixelated, the white is slightly discolored, and it contains part of the interface for a viewer program.]

Cueball: Film degrades, paint cracks, but a copy of a century-old data file is identical to the original.

[The fourth panel is even more pixelated and discolored, and contains watermarks and more 'frame' elements.]

Cueball: If humanity has a permanent record, we are the first generation in it.

White Hat: Amazing.



Ewww, Verizon? **** them International Space Station (talk) 04:58, 20 May 2016 (UTC)

Don't forget the whole "Verizon Math" incident and Randall's much passed around check image. I'd be surprised if it isn't on 9GAG somewhere.... Psu256 (talk) 17:12, 23 May 2016 (UTC)
https://xkcd.com/verizon/ 02:30, 15 July 2017 (UTC)

Ironically, the title text on explainxkcd is different from the one on xkcd.com, demonstrating the reinterpretation of text encoded in UTF-8 as if it were encoded in ISO 8859-1. 05:45, 20 May 2016 (UTC)

-Exactly; this nicely proves Randall's point. On my computer, different characters appear in different browsers, but of course in one browser the characters are reproducible.--Jkrstrt (talk) 07:26, 20 May 2016 (UTC)

Here is the decoded title text:

   “If you can read this, congratulations–the archive youʼre you're using still knows about the mouseover text”! 07:51, 20 May 2016 (UTC)
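The misreading described above can be reproduced in a few lines. This is only a minimal sketch: it assumes the wrong decoder was Windows-1252 (Python's `cp1252` codec) rather than strict ISO 8859-1, and the helper names `garble` and `repair` are made up for illustration.

```python
def garble(text: str) -> str:
    """Encode as UTF-8, then (wrongly) decode those bytes as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


def repair(text: str) -> str:
    """Undo garble(): re-encode as Windows-1252, then decode as UTF-8."""
    return text.encode("cp1252").decode("utf-8")


original = "you\u2019re"      # "you're" written with the curly apostrophe U+2019
garbled = garble(original)    # U+2019 is the UTF-8 bytes E2 80 99

print(garbled)                       # the apostrophe has become three characters
print(repair(garbled) == original)   # the round trip restores the text
```

Under strict ISO 8859-1 the second and third bytes would instead decode to invisible control characters. Note that `garble` raises an error for characters whose UTF-8 bytes include a value that Python's `cp1252` codec leaves undefined, such as 0x9D.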

Grungy details:
Odysseus654 (talk) 17:31, 20 May 2016 (UTC)
The "convert to hex" step is really "encode with Windows-1252". Also, in the last sequence, the "!" is not part of the encoded quotation mark. The third byte of the quotation mark comes from an unprintable U+009D between the "â€" and the "!". U+009D isn't a valid Windows-1252 character, so either the encoding is actually a superset of Windows-1252 that includes U+009D, or the encoding process just allowed it. 17:26, 21 May 2016 (UTC)
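The second possibility (an encoder that lets the undefined byte through) can be sketched as follows. This is an assumption about how such a lenient step might behave, not a description of what any particular site does, and `sloppy_cp1252_encode` is a hypothetical helper name.

```python
def sloppy_cp1252_encode(text: str) -> bytes:
    """Encode with Windows-1252, but pass characters below U+0100 that the
    codec leaves undefined (such as U+009D) through as the same byte value.
    This pass-through rule is an assumption made for illustration."""
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode("cp1252")
        except UnicodeEncodeError:
            if ord(ch) < 0x100:
                out.append(ord(ch))   # e.g. U+009D becomes byte 0x9D
            else:
                raise
    return bytes(out)


# The garbled closing quotation mark: "â", "€", then the unprintable U+009D.
garbled_quote = "\u00e2\u20ac\u009d"

# Python's strict cp1252 codec rejects U+009D.
try:
    garbled_quote.encode("cp1252")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# The lenient version recovers bytes E2 80 9D, which are U+201D in UTF-8.
repaired_quote = sloppy_cp1252_encode(garbled_quote).decode("utf-8")
print(strict_ok, repaired_quote == "\u201d")
```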

He's written "you're" twice, but one is with a curly apostrophe, often favoured by Americans (and maybe Brits?), possibly because of their keyboards. The simple apostrophe is “just” HTML-formatted, whereas the curly one has been mangled by a UTF-8 / ISO-8859-1 misreading. -- 07:51, 20 May 2016 (UTC)

I'm British, and I don't have the curly apostrophe anywhere on my keyboard. Enchantedsleeper (talk) 11:01, 20 May 2016 (UTC)
I'm American, and I also don't have the curly apostrophe anywhere on my keyboard, but word processing programs (like MS Word) are configured by default to automatically replace an ASCII apostrophe in a contraction with the fancy right single quotation mark. Also, when you put quotation marks around text, those programs automatically replace the straight ASCII quotation marks with the fancy left and right quotation marks (single if using single quotes, double if using double quotes). Most people don't care enough to disable that "feature"... 15:13, 20 May 2016 (UTC)
Ok. I've never experienced that from any text processor (incl. MS Word), so maybe it's dependent on the system locale or another mysterious factor. I've just noticed a prevalence in English-language texts online, but an absence in other European languages, not even in French, which has as many or more contractions. 08:11 21 May 2016 (UTC)

This is a phenomenon that has always both fascinated me and frustrated me. I find it fascinating how, even today, data degrades as more and more people copy it (remember the old days when people used to copy VHS tapes, and the further you were from the original tape the more copying artefacts your copy had in it?). It also frustrates me, though, when I'm trying to find an original, undegraded image or video and it seems impossible to find. It's also annoying because it's actually pretty easy to copy something without causing any quality loss, yet practically every copied image on the internet has been degraded in some way or another. 07:08, 20 May 2016 (UTC)
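A byte-for-byte copy really is lossless, and that is easy to check with a hash. The sketch below is illustrative only: the file names, the generated sample data and the helper names `sha256_of` and `copy_generations` are all assumptions, and it covers plain file copies, not re-encoding such as saving a PNG as a JPEG.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def copy_generations(source: Path, workdir: Path, generations: int) -> Path:
    """Copy the file, then copy the copy, and so on; return the last copy."""
    current = source
    for i in range(generations):
        next_copy = workdir / f"copy_{i}.bin"
        shutil.copyfile(current, next_copy)
        current = next_copy
    return current


with tempfile.TemporaryDirectory() as tmp:
    workdir = Path(tmp)
    original = workdir / "original.bin"
    original.write_bytes(bytes(range(256)) * 64)   # stand-in for an image file

    last = copy_generations(original, workdir, 20)
    digests_match = sha256_of(original) == sha256_of(last)

print(digests_match)   # the 20th-generation copy has the same digest
```

The degradation only appears when a step re-encodes the data instead of copying its bytes.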

If you haven't yet, you should check out this guy who ripped and reuploaded his own Youtube video 1000 times: https://www.youtube.com/watch?v=jEIzS_27Vt0 08:28, 20 May 2016 (UTC)
...and after 100 iterations https://www.youtube.com/watch?v=k6GMvihskBQ ...and the summary of all of them https://www.youtube.com/watch?v=icruGcSsPp0 Odysseus654 (talk) 16:50, 20 May 2016 (UTC)
It can be frustrating to try to convince new people drawing schematics on the computer to not use 4-way junctions because they don't expect digital images to degrade over multiple generations of copying. This xkcd demonstrates the way multiple generations can degrade even digital images, potentially making it difficult to differentiate two crossing (but electrically separate) signal lines from a 4-way junction on a schematic. Sorry, I'll get off my soap box now. ;-) 15:13, 20 May 2016 (UTC)

It's also funny because just a few moments ago I was trying to compress some video to send to someone. 07:12, 20 May 2016 (UTC)

This page highlights the encoding blocks so that the degradation in quality is easier to see: http://fotoforensics.com/analysis.php?id=274fcf46426f2da31b057f1652ae5269cfdbd70a.190103 09:42, 20 May 2016 (UTC)

Nice example. Their picture is already a bad copy. While it's still a PNG, it's already reduced in size (600x228 instead of 720x282, 131381 bytes instead of 190103). Btw, the file used in this wiki is also slightly different from what I see on xkcd. It's just 3 minutes older and 308 bytes larger. 01:28, 23 May 2016 (UTC)

The phenomenon that Randall is making fun of in this comic is actually called a "shitpic" http://www.theawl.com/2014/12/the-triumphant-rise-of-the-shitpic The explanation should probably make reference to that. Enchantedsleeper (talk) 10:57, 20 May 2016 (UTC)

I think the watermarks on the last frame are from an unregistered screenshot tool, not "9gag" or similar. The references to shitpics are interesting, but aren't you over-interpreting the whole thing? (talk) (please sign your comments with ~~~~)

...You realise that over-interpreting is what this wiki is for, right? Also, not really, since all I said was that a "shitpic" is what this type of degraded image is called. Enchantedsleeper (talk) 15:03, 23 May 2016 (UTC)

There's a 9gag thing in the image, clean your glasses and look again. -- 12:15, 20 May 2016 (UTC)

Both screenshots are definitely from iOS. One is the Safari browser and the other... does anybody know? Some other kind of web browser? Maybe Chrome or Opera? <Need to finally create account> 15:32, 20 May 2016 (UTC)

Apparently Russians have been getting this a lot, as (up until Unicode existed) they had to deal a lot with people using bad codepages. Example of their post office dealing with a physical package addressed with a bad codepage: http://worldlanguages.wikia.com/wiki/Mojibake?file=Letter_to_Russia_with_krokozyabry.jpg Odysseus654 (talk) 16:54, 20 May 2016 (UTC)

Here is the progression as I see it:

  • Frame 1 - The original PNG
  • Frame 2 - The PNG converted to a JPEG
  • Frame 3 - The JPEG as viewed on a mobile browser (Safari on iOS in this case)
  • Frame 4 - A screen-shot of the mobile browser uploaded to Tumblr and then stolen by 9GAG 19:37, 20 May 2016 (UTC)

Note that while the term "digital" is new, the first digital format of information appeared long ago, with the development of a standard alphabet. Images hand-drawn on paper can't be copied without loss, but if you write letters in a fixed alphabet, the text can be copied without errors forever (not counting errors caused by some letters falling out of use through history). Egyptian literature is probably lost because we don't know the (very big) full set of hieroglyphs, but the Odyssey could be (and hopefully even was) stored exactly as it was written. That wouldn't help us read it, of course: the language has changed since then and it would need to be translated, which, again, can lose some meaning ... -- Hkmaly (talk) 16:16, 21 May 2016 (UTC)

There's a much, much older example. RNA and subsequently DNA are digital representations of the protein structures (also digital representations of 3-D molecular shapes). Degradation through copying is one source of the variation that evolution selects on. MerlinMM (talk) 11:28, 23 May 2016 (UTC)

Right. Humans were using digital data for their own reproduction long before they knew what "digital", "data" or even just "letter" is. DNA even uses primitive error correction techniques. Although when humans finally found out about RNA being digital, they already had other digital formats. -- Hkmaly (talk) 00:21, 15 July 2017 (UTC)
There's nothing primitive at all about DNA error correction techniques, just some people's understanding of them. 02:35, 15 July 2017 (UTC)

Is it possible that the watermark in the bottom left of the last panel is supposed to read "drama.tumblr.com"? -- 20:42, 23 May 2016 (UTC)

The alt text has been fixed; the second "you're" has been removed. (talk) (please sign your comments with ~~~~)

The phenomenon is related to generation loss. --JakubNarebski (talk) 14:50, 27 May 2016 (UTC)

Btw, does anybody know a digital archive that actually "knows about the title-text"? (talk) (please sign your comments with ~~~~)

Source image updating?

If you look at the comic on the website, the first couple of frames are much more "decayed" than they are on the wiki copy. -- 01:47, 19 December 2016 (UTC)

The source image has definitely been changed. Here's the original image, and here's the new one. -- 01:13, 23 December 2016 (UTC)