394: Kilobyte

Revision as of 23:15, 7 November 2015 by 141.101.80.86 (talk) (Explanation: It's absurd and wrong to claim that "computer experts" all misuse SI units.)
Title text: I would take 'kibibyte' more seriously if it didn't sound so much like 'Kibbles N Bits'.

Explanation

This comic pokes fun at the confusion over the definition of a kilobyte. Some take the SI prefix literally, so that a kilobyte is 10³ = 1000 bytes. Others define it as 2¹⁰ = 1024 bytes, because powers of two are computationally easier to deal with.

The first row of the table is simply mocking this discrepancy.

The second row is Randall's interpretation of how Stan Kelly-Bootle would approach this problem. Kelly-Bootle is known for writing The Computer Contradictionary, which satirizes the jargon and language of the computer industry. Kelly-Bootle was likely motivated to write this work after working for several years at IBM, a company infamous for its excessive use of acronyms in the workplace. Averaging the two definitions, (1000 + 1024) / 2 = 1012 bytes, is the kind of humorous compromise Kelly-Bootle would likely have proposed, as in his quip: "Should array indices start at 0 or 1? My compromise of 0.5 was rejected without, I thought, proper consideration." The serendipitous fact that Kelly-Bootle's initials are "KB", the same letters used to abbreviate "kilobyte", adds a layer of plausibility to the joke.

The imaginary kilobyte plays on the fact that quantum mechanics, and therefore quantum computing, relies on complex numbers. The imaginary unit is written i and is defined as the square root of -1. The joke is a pun on the symbol KiB, which stands for the "binary kilobyte" (officially "kibibyte"), standardized at 1024 bytes: the comic reads the "i" in KiB as the imaginary unit.

The Intel kilobyte mocks the Pentium floating point unit which, in 1994, became notorious for having a major flaw in its floating point division algorithm that gave slightly erroneous results. (For the non-computer folk, a floating point number is a real number like 4.0 or -13.387.)

The smaller drivemaker's kilobyte mocks a business practice for coping with rising costs: keep the price constant but reduce the quantity. The food industry has been notorious for shrinking the amount of food in a package while keeping the price the same, instead of raising the price and keeping the quantity the same. Randall is suggesting that if the computer industry tried this with hard drives, it could have humorous results, such as a smaller number of bytes in a kilobyte. In reality, hard drive capacity is specified in units based on 10³ bytes (kB), while the content you put on the drive (programs etc.) is usually measured in units based on 2¹⁰ bytes (KiB). Formatting the drive, i.e. making it usable for storage, further decreases the available space. Thus a 250 GB drive might be reported by the operating system as having a capacity of only 232 GB (really GiB). This discrepancy grows with drive size; however, the trend humorously suggested in the comic, where real storage per advertised kilobyte decreases linearly with time, would cause the drivemaker's kilobyte to reach zero in the year 2235!
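The capacity arithmetic above can be checked with a short Python sketch. The function names are made up for illustration, and the starting year of 2008 is an assumption: it is the year for which the comic's "currently 908 bytes" reproduces the 2235 figure quoted above.

```python
# Decimal (SI) and binary (IEC) multipliers.
GB = 10**9    # gigabyte, as printed on the drive's box
GiB = 2**30   # gibibyte, what many operating systems label "GB"


def advertised_gb_to_gib(gigabytes):
    """Convert a capacity in decimal gigabytes to binary gibibytes."""
    return gigabytes * GB / GiB


def drivemaker_zero_year(current_bytes, shrink_per_year, start_year):
    """Year in which a linearly shrinking kilobyte reaches zero bytes.

    start_year is an assumption (hypothetical parameter), not stated in the comic.
    """
    return start_year + current_bytes // shrink_per_year


print(f"{advertised_gb_to_gib(250):.1f} GiB")   # 232.8 GiB, before formatting overhead
print(drivemaker_zero_year(908, 4, 2008))       # 2235
```

The unit conversion alone accounts for the drop from 250 to about 232; any space lost to formatting comes on top of that.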

The baker's kilobyte is a play on the baker's dozen, which is 13 instead of 12. A baker's byte with 9 bits to the byte would result in a total of 9216 bits in a 1024 byte kilobyte. Converting this into "normal" bytes (with 8 bits), we divide 9216 bits by 8 bits per byte to get 1152 8-bit bytes to the baker's kilobyte.
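As a minimal sketch of that conversion (the constant and function names are illustrative, not taken from the comic):

```python
BITS_PER_BAKERS_BYTE = 9          # the baker's generous byte
BITS_PER_NORMAL_BYTE = 8
BYTES_PER_BINARY_KILOBYTE = 1024


def bakers_kilobyte_in_normal_bytes():
    """Express 1024 nine-bit bytes as a count of ordinary 8-bit bytes."""
    total_bits = BYTES_PER_BINARY_KILOBYTE * BITS_PER_BAKERS_BYTE  # 9216 bits
    return total_bits // BITS_PER_NORMAL_BYTE


print(bakers_kilobyte_in_normal_bytes())  # 1152
```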

In the title text Randall mentions the kibibyte, the unambiguous unit defined as exactly 1024 bytes. The binary prefix kibi is a portmanteau of the words kilo and binary. Randall says he cannot take the word seriously because it sounds like the dog food Kibbles 'n Bits.

Transcript

There's been a lot of confusion over 1024 vs 1000,
kbyte vs kbit, and the capitalization for each.
Here, at last, is a single, definitive standard:
[table of various kinds of kilobytes]
SYMBOL | NAME | SIZE | NOTES
kB | Kilobyte | 1024 bytes OR 1000 bytes | 1000 bytes during leap years, 1024 otherwise
KB | Kelly-Bootle standard unit | 1012 bytes | compromise between 1000 and 1024 bytes
KiB | Imaginary kilobyte | 1024 √-1 bytes | used in quantum computing
kb | Intel kilobyte | 1023.937528 bytes | calculated on Pentium F.P.U.
Kb | Drivemaker's kilobyte | currently 908 bytes | shrinks by 4 bytes each year for marketing reasons
KBa | Baker's kilobyte | 1152 bytes | 9 bits to the byte since you're such a good customer



Discussion

The drivemaker's version here does 'depreciate' their kilobyte, indeed, but rather than based on slipping food-standards (which are often highly regulated) I think this is actually based upon the actual age-old practice of them sometimes using 10³ⁿ (1,000s, 1,000,000s, etc) measures of byte-multiples in preference to 2¹⁰ⁿ ones (1,024, 1,048,576, etc) in order to get a better figure. For example 20MB drives (back in the old days, this is) with 971,520 bytes (almost 1Mb, by either measure) less than the true binary-matching 20MiB value which various computer OSes would work with. (Or a 'binarily' 20MB drive gets advertised as a "20.1MB" one.) On the other hand, something that "needs 20Mb of installation space" might have deliberately been given the binary-divisible version of the unit to make it look marginally less resource-hungry than the decimalised measure would have indicated. Minor differences in their own right, but on a bad day when the competing standards mesh badly you might find yourself just short of storage space when you thought you'd be Ok.

Although in real-life the difference between any given unit's interpretation has not changed, as equipment capacities increase and we start to use increasing degrees of prefix upwards, any discrepancy becomes more significant. 1KB is plus or minus 24 bytes (~2%), 1MB is plus or minus around 48KB (~5%), 1GB is plus or minus 73MB (~7%) and 1TB could be very nearly 100Gb short (~10%). For those that care about these things that's at the very least annoying. It's like with CRT monitor sizes, which were often more an indicator of tube-end size than the true size of the visible/illuminatable portion, giving them an inch or two less of effective display than you might expect. 178.98.31.27 13:55, 18 June 2013 (UTC)
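The growing gap described in the comment above can be tabulated with a short Python sketch. The function name is made up for illustration, and the percentages are taken relative to the decimal unit, which is an assumption about how the rounded figures were obtained.

```python
def binary_vs_decimal_gap(power):
    """Return (gap in bytes, gap as a percentage of the decimal unit).

    power=1 compares kilo with kibi, power=2 mega with mebi, and so on.
    """
    decimal = 1000**power
    binary = 1024**power
    gap = binary - decimal
    return gap, 100 * gap / decimal


for power, name in enumerate(["kilo", "mega", "giga", "tera"], start=1):
    gap, percent = binary_vs_decimal_gap(power)
    print(f"{name}: {gap:,} bytes ({percent:.1f}%)")
```

Each step up in prefix multiplies the binary unit by 1.024 relative to the decimal one, which is why the relative gap compounds from about 2.4% at kilo to about 10% at tera.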

Just to follow-up to myself, based upon a unit capitalisation discrepancy that I only spotted post-posting, but that I won't bother fixing, there's also the old confusion between "kilobits-per-second" and "kilobytes-per-second" (and mega- and giga- versions, more recently with broadband and more advanced ethernets/etc) when it comes to bandwidths and expected speeds. Although you don't necessarily expect to exactly hit the stated limit (with contentions and collisions and latencies and overheads), getting a factor of 8 less than you might have expected has caught people out before, thinking they're getting a far poorer service than advertised... (Not that this has much to do with the above comic, just saying. And, oh lookie here on my desk. A 28,800 'Sportster' PCMCIA faxmodem card (V34, V32bis) with an XJACK® pop-out socket. Why have I still got that?) 178.98.31.27 14:16, 18 June 2013 (UTC)

This table fails to mention, of course, that while a Baker's Kilobyte is 1152 bytes normally, it's 1125 on leap years. Hppavilion1 (talk) 23:21, 26 October 2017 (UTC)

What is the source of the "official" definition of the kilobyte?

From the article at time of creating this topic: "the official definition now states that 1 kilobyte is 1000 bytes". This is official according to whom exactly? I may have missed a citation somewhere but I think this clause needs a citation in the text of the article. AzureArmageddon 13:50, 3 October 2023 (UTC)

Well, SI and IEC either state or 'recommend' that a kilobyte (kB) is 10³ bytes, while tradition has tended to use KB (capital-K) for 2¹⁰ bytes (obviously open for confusion) while IEC defines this as a 'kibibyte' (KiB). There's several possible cites for that, one really would need to decide which look best/official.
As a general hint to people, though, I think that makes it probably best to just always use explicit KiBs, and mebi/gibi/tebi/etc equivalents, in full or as unit abbreviations, because there might be people who haven't got the memo/don't know whether you got the memo, otherwise. And at least there's a chance that even those unaware of "FOObibytes" will try to find out what these are... unless they just mistake them for typos or read them unconsciously wrongly, but then there's probably more problems than just assuming the wrong base-multiple... ;) 172.70.86.54 17:26, 3 October 2023 (UTC)
I agree with you on that best practice for sure. I don't feel qualified to determine which authority is most authoritative, though. Hoping someone with relevant industry credentials can make a qualified opinion. AzureArmageddon 15:56, 5 October 2023 (UTC)