1459: Documents

Revision as of 15:07, 15 December 2014 by 108.162.216.209 (talk) (Explanation: window 7 did not exist at the time of the comic)
Documents
Title text: Copy of Copy of Copy of Copy of Copy of Copy of Copy of Copy of Copy of Copy of Copy of Copy of Copy of Copy of Copy of Copy of Copy of Copy of Copy of Copy of Copy of Copy of Copy of Copy of Copy of Copy of Copy of Copy of Copy of Copy of Copy of Copy of Copy of Untitled.doc

Explanation

The comic portrays the type of naming conventions used by some people (in this case, White Hat). When saving documents, the user is typically prompted to choose a filename, which may seem like a trivial choice. However, the filename is often the primary way of identifying the document you are looking for, and a descriptive title is of huge benefit when trying to find a certain document. Those who are too rushed or too lazy to create a useful filename, or who don't understand what constitutes a useful filename, are setting themselves up for future frustration.

When a user creates a new copy of a file in the same directory, the operating system may automatically add "copy" or "Copy of" to the filename. Subsequent copies of the file have "copy 2", "copy 3", etc. appended. When searching documents later, the user may struggle to remember which copy is the correct one to use.

Cueball has a severe distaste for these types of saved documents, hence the protip never to look in someone else's documents folder for fear of finding these irritating details.

The .doc and .docx extensions are given to documents created in Microsoft Word, with .docx being the default option from Microsoft Office 2007 onwards. When first saving a document, many programs will default to "Untitled", adding numbers to the end as more are created. However, in Microsoft Word the default filename is the first sentence of the document; if the document is still empty, the default filename is "Doc1" with the number increasing each time. In order to get such a file directory, White Hat would have to manually title all of his documents "Untitled". He appears to make copies frequently, and occasionally makes copies of the copies, only very rarely adding a keyword such as "IMPORTANT" to the file name.

In some cases he has added a minimal amount of detail to the filename, though hasn't removed the redundant "untitled copy" portion, which probably only adds to Cueball's frustration, as it demonstrates that White Hat does have at least a basic understanding of the importance of meaningful filenames, but still hasn't made any attempt to address the systemic problem.

The Untitled 40 MOM ADDRESS.jpg is an image file (jpg), not something that would normally be used to store someone's address (though it could be a map or a picture of an envelope). It is the first jpg file on the list, but the last full filename is also a jpg, with number 41, and in the partly hidden lines below it the next three files have the numbers 42, 43 and something beginning with 4. So the numbering of the jpg files continues there.

The .doc numbering goes from 241 to 243, and then 243 IMPORTANT. The .docx only increases from 138 to 139, but there are two extra copies of the 138 document.

The filenames are not in alphabetical order, as 241 and 40 fall out of place. This likely means that there is no automatic sorting at all (i.e., they are sorted by hand), or that they are sorted by time stamp. Sorting by time stamp can be very useful, especially if you use White Hat's naming scheme. But if the list is sorted by time stamp, it also means that he still creates .doc files (perhaps by copying old ones) even after obtaining Microsoft Office 2007, which defaults to .docx.
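The ordering claim can be checked with a small sketch. It takes the ten fully visible file names from the comic and compares their displayed order with a plain alphabetical sort and with a "natural" sort that treats runs of digits as numbers. The helper name natural_key is hypothetical, and the sketch only approximates how a real file manager orders names.

```python
import re

# The ten fully visible file names, in the order shown in the comic.
displayed = [
    "Untitled 138.docx",
    "Untitled 241.doc",
    "Untitled 138 copy.docx",
    "Untitled 138 copy2.docx",
    "Untitled 139.docx",
    "Untitled 40 MOM ADDRESS.jpg",
    "Untitled 242.doc",
    "Untitled 243.doc",
    "Untitled 243 IMPORTANT.doc",
    "Untitled 41.jpg",
]


def natural_key(name):
    """Hypothetical helper: split a name into text and number chunks,
    so that 40 sorts before 138 instead of after it."""
    parts = re.split(r"(\d+)", name.lower())
    return [int(part) if part.isdigit() else part for part in parts]


# Plain character-by-character ordering, ignoring case.
alphabetical = sorted(displayed, key=str.lower)
# Ordering that compares embedded numbers by value.
natural = sorted(displayed, key=natural_key)

print("matches alphabetical order:", displayed == alphabetical)
print("matches natural order:", displayed == natural)
```

Neither comparison succeeds for this list, which is consistent with the files being sorted by something other than name.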

The title text refers to a common quirk of copying and pasting within the same folder on a Windows PC. The copy of the file defaults to the name "Copy of <original title>", a copy of that copy becomes "Copy of Copy of <original title>", and so forth. It is rather extreme to get to a 33rd copy of the original Untitled.doc file as shown here. As a result the file name is 276 characters long (33 times the 8 characters of "Copy of ", plus 12 for "Untitled.doc", which includes the four from the .doc extension), an impossible file name in most operating environments because it is too long. 255 characters is the limit for any single file or folder name on most Linux file systems, so the name is 21 characters too long there. Windows also limits each individual name to 255 characters and, by default, limits a fully defined file name (file name, extension and the full folder path in which the file is stored) to 260 characters (MAX_PATH). The situation is therefore even worse on Windows, since being in the root of a drive takes 3 characters and each folder adds at least 2 more (one chosen character and the backslash). While such long names for a single file may be uncommon, it is not uncommon in Windows for users to run out of characters for the full name and path if they have several nested subfolders.
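The length arithmetic can be verified with a short sketch. It assumes the title text is exactly 33 repetitions of "Copy of " followed by "Untitled.doc", and it uses 255 characters as the per-name limit; the constant and function names are illustrative only.

```python
PREFIX = "Copy of "        # 8 characters, including the trailing space
ORIGINAL = "Untitled.doc"  # 12 characters, including the extension
COPIES = 33                # assumed number of nested copies in the title text
NAME_LIMIT = 255           # assumed limit for a single file name


def nested_copy_name(original, copies):
    """Build the name produced by copying a copy `copies` times."""
    return PREFIX * copies + original


title_text = nested_copy_name(ORIGINAL, COPIES)
length = len(title_text)
excess = length - NAME_LIMIT

# Largest number of nested copies whose name still fits within the limit.
max_copies = (NAME_LIMIT - len(ORIGINAL)) // len(PREFIX)

print(length)      # 276
print(excess)      # 21
print(max_copies)  # 30
```

Under these assumptions the chain would already have hit the limit at the 31st copy, three copies before the name in the title text.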

Transcript

[White Hat is sitting at his PC browsing through the Documents folder, which contains files with the names listed below. The first document name is partly blocked by the upper boundary of the square around the names. The same goes for the last full name. Then for the last three visible names below only a small part of the file name is visible.]
Untitled 138.docx
Untitled 241.doc
Untitled 138 copy.docx
Untitled 138 copy2.docx
Untitled 139.docx
Untitled 40 MOM ADDRESS.jpg
Untitled 242.doc
Untitled 243.doc
Untitled 243 IMPORTANT.doc
Untitled 41.jpg
42
43
4
[Cueball stands behind White Hat looking over his shoulder at the screen.]
Cueball: Oh my god.
Protip: Never look in someone else's documents folder.



Discussion

742 Evergreen Terrace.docx 742 Evergreen Terrace (2).docx

141.101.99.51 07:24, 12 December 2014 (UTC)

I'm sure everyone can relate to using poor filenames occasionally. As far as default filenames go:

  • Notepad (XP) = *.txt - Cannot save without choosing a new filename.
  • Word (2003) = Title (if set by template) > First sentence of document > Doc1.doc, Doc2.doc, etc
  • Paint (XP) = untitled.bmp

--Pudder (talk) 08:58, 12 December 2014 (UTC)

Using the image format (.jpg) to store text information (like addresses) will also contribute to an annoying future if you ever need to copy data from that file into some other programme. sirKitKat (talk) 09:58, 12 December 2014 (UTC)

oo good point -- Brettpeirce (talk) 13:13, 12 December 2014 (UTC)
Could be a JPEG because it's a camera photo of the address on something. That'd make it even more perverse because most cameras create files with names like DSC01234.jpg meaning he's given it the "Untitled" moniker on purpose. 141.101.99.78 14:23, 12 December 2014 (UTC)
It's a screenshot. 173.245.62.169 18:15, 12 December 2014 (UTC)
Screenshots begin with "IMG_XXXX". 108.162.216.41 05:23, 26 December 2014 (UTC)
Placing an address in a graphic is often used when the address is to be displayed on a web page, to make it difficult for email-address harvesting programs to grab the address for spamming. But that's probably not relevant here.--RenniePet (talk) 15:28, 12 December 2014 (UTC)

Something I come across now and then is the result of the following situation: You are in the process of selecting multiple files while holding CTRL. While quickly selecting the next file, you accidentally move your cursor/mouse while clicking it, resulting in all the selected files being copied into the same location :) sirKitKat (talk) 13:36, 12 December 2014 (UTC)

Title is an impossible file name in most operating environments because it is too long at 277 characters. 255 characters is the limit for any file or folder name in Linux, and is the limit for a fully defined file name (file and full path the file is in) in Windows. So the Title/Alt text is 22 characters too long for Linux and at least 25 characters too long for Windows since being in the root of drive takes 3 characters, each folder adds at least 2 characters (a letter and the slash). I encounter clients pushing this limit all the time, complaining why they can't access their files with the novel length file names, so this comic REALLLYYY spoke to me. As an IT consultant, I get to see and occasionally cleanup such poor file naming conventions. Chaosadventurer (talk) 15:34, 12 December 2014 (UTC)

Technically, Windows can handle paths longer than 260 characters (the definition of MAX_PATH in the Windows API), but it requires special nomenclature (e.g. "\\?\D:\very-long-path"), and each individual backslash-delimited component is still limited to 255 chars. The maximum length of that type of path is 32,767 characters AFTER Unicode expansion. Most Unix-based file systems have a max filename length of 255 chars and a max path length of 4,096 chars. KieferSkunk (talk) 20:54, 12 December 2014 (UTC)
Mostly correct. ReFS supports up to 32,767 Unicode characters, but is limited in Windows 8/8.1 (and I guess by extension 2012 and 2012 R2) to 255 characters. Most filesystems specify bytes and not characters, so it could vary based on whether it's Unicode or not. TuxyQ (talk) 09:57, 17 December 2014 (UTC)

I suppose it's just the OCD but the fact that the filenames are not in alphabetical order is the first thing that hit me. They're not even alphabetical by file type/extension. About the only thing that would result in this ordering is if the files were sorted by timestamp (which we don't see). Of course, if I were looking over someone's shoulder at their timestamp sorted list of files, I might be just as horrified by the ordering as I would by the names. MrBigDog2U (talk) 15:40, 12 December 2014 (UTC)

Sometimes it is useful to sort by timestamp. When looking for the file, for example. Given the filenames are near useless in this example, sorting by timestamp could be the easiest way to find something. ("I'm looking for the file I worked on about two weeks ago.") -- Equinox 199.27.128.117 18:53, 12 December 2014 (UTC)

Does anyone know why "Untitled 241.doc" and "Untitled 40 MOM ADDRESS.jpg" are out of order? The rest seem to be in ascending order. 108.162.221.135 (talk) (please sign your comments with ~~~~)

Assuming it wasn't just an oversight on Randall's part, it's likely using a non-alphanumeric sort on the directory listing. The operating system (likely Windows) usually sorts things alphanumerically, but can also sort them by date (created or modified). In a DOS-style listing, you can also list them in the order they were inserted into the file system (effectively unsorted). On the other hand, Windows listings also contain special logic to process numbers in "natural order" rather than alphanumeric order, so that (1, 2, 3, 10, 11, 20) would be listed in that order instead of (1, 10, 11, 2, 20, 3). However, that doesn't appear to be happening in this case. KieferSkunk (talk) 20:48, 12 December 2014 (UTC)

Could the alt text be a reference to "successor of" notation from set theory? I'm not an expert at all, but the explicit use of "copy of" over and over makes sense as another mathematical but absurd document naming schema. I think it's called successor ordinals or something like that. 173.245.50.179 (talk) (please sign your comments with ~~~~)

The "copy of copy of copy of" thing is actually a quirk related to passing around files via email (happens often within an office network), where the other person does not save the file but rather opens it first and then proceeds to save it after reading/editing. Since MS Office has designated that file as 'from another computer'/read only, it adds the prefix 'copy of' in order to save a copy of the original file. This file is then forwarded to someone else, continuing the chain. With a file that is heavily edited you can often get names with 4 or 5 "copy of"s before the actual name.

Someone may want to edit the explanation to add this detail as it is the most common reason for multiple "copy of"s in front of each other. TjPhysicist (talk) 05:28, 19 December 2014 (UTC)

A more likely/common reason for "copy of copy of copy of" relates to files opened directly from programs (instead of saved then opened). Upon saving them after editing like this, the phrase "copy of" is added to the filename, indicating that this is a copy of the original file (the original file being somewhere in a temp folder, since it was never saved). This trend often continues, especially in office settings where files are passed around via email a lot, with every user who edits the file adding one extra "copy of". Editing to mention this.

TjPhysicist (talk) 05:42, 19 December 2014 (UTC)

"Copy of copy of copy of copy ..." also reminded me of NIN song "Copy of a" (http://youtu.be/pVB_DI4ajKA) 141.101.93.218 20:06, 3 January 2015 (UTC)

The "Untitled 40 Mom Address" looks like a "yo mom address". SilverMagpie (talk) 20:33, 16 November 2016 (UTC)