Talk:1726: Unicode

  1. Proposal by Courtney Milan - 3 dinosaurs:
  2. Feedback by Andrew West - 13 dinosaurs:
  3. Article by Becky Ferreira - they should have feathers:

Sebastian 12:14, 29 August 2016 (UTC)--

Regarding the brontosaurus reference, there is also some material in the intro of the wikipedia page. Chtit draco (talk) 14:33, 29 August 2016 (UTC)
The comic could be a reference to "WE’RE ALL USING THESE EMOJI WRONG", which points out that the 😪 emoji is supposed to be a sleepy face and not a side-tear face; compare Facebook's interpretation with Samsung's. (talk) (please sign your comments with ~~~~)
Indeed. However IMHO the problem lies not in the standardisation attempt, but in the choice of non-obvious pictograms (which is a font-designer problem). The sleepy emoji would not be used wrongly if it unquestionably looked sleepy. Chinese solved this problem long ago by switching from pictograms to abstract ideogram designs. 14:13, 30 August 2016 (UTC) Sylvain M.

I thought it was funny that the two people in the upper left (who, at the time of this comment, were noted to be "helping" Cueball) are actually impeding the quixotic quest by arguing amongst themselves. 23:38, 29 August 2016 (UTC)

Personally, I'm still dumbfounded by the lack of a marijuana leaf. There are pills, a syringe, a cigarette, rice wine, plus *multiple* Emoji for both wine & beer. I hate the fact that Emoji are *not* implemented in a sensible, standardized fashion: For instance, the guy Emoji may or may not have a mustache, or gray hair. The "short hair" female may be blonde, or brunette & may even have a coiffure instead of short hair! I think they should be far more specific with their definitions. Personally, I'm sticking with emoticons until they get this sorted out.  ; P As for dinosaur Emoji, contrary to my previous statement about specificity, I believe you only need three dinomoji: Carnivore head (raptor or T-rex, non-specific), long-neck herbivore in profile, & winged. Anything more specific than that should probably be expressed with, y'know, WORDS. 07:35, 30 August 2016 (UTC)

Words? Weird concept ;) Elektrizikekswerk (talk) 07:47, 30 August 2016 (UTC)
There's already a winged dinosaur emoji and has been since 2010 Jeremyp (talk) 09:33, 30 August 2016 (UTC)

There is a good amount of detail regarding why/how the Unicode people are arguing over emoji (in reference to the title text), but there is not much information about what Randall is referring to in the main strip, e.g. an example of what kind of language regulations the Unicode group tries to impose. While the current explanation does a good job of explaining why there is a lot of drama regarding a Brontosaurus emoji, the meat and potatoes of the comic is about language itself. I have never encountered anyone trying to communicate in English using letters that are not part of the current alphabet. Since English uses predefined Roman symbols for sound representation, and the Unicode people only deal with the representation of symbols, I am having a difficult time comprehending how the group in charge of rendering English into text would have any part in the changes that (at least) English is undergoing (which are largely related to spelling and grammar, not the symbols themselves). Snowblinded (talk) 08:19, 30 August 2016 (UTC)

I think the main point of this comic is about using characters from different alphabets to get a funny look (or to fool anti-spam filters). In Unicode, characters sharing the same design but belonging to different alphabets have separate code points. For example: U+0041 (Latin "A"), U+0391 (Greek "Alpha") and U+0410 (Cyrillic "A") look exactly the same but are not interchangeable... neither in Unicode nor in real life, since writing English with Greek letters doesn't make sense anyway. Example 2: U+0049 (Latin "I"), U+2160 (Roman numeral one) and U+30A8 (Japanese katakana "E") have a similar yet different look (and very different meanings), and so have different code points (which seems logical). One may want to mix them to get funny-looking text... as long as writing proper English is not a concern. Conclusion: I hardly see how Unicode restricts anything, since the "consistent technical standards" pretty much already exist in any language. 11:55, 30 August 2016 (UTC) Sylvain M.
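
The first example above (three look-alike capital letters with separate code points) can be checked with a short sketch. It uses Python's standard `unicodedata` module; the helper name `describe` and the list name `lookalikes` are only illustrative and do not come from the discussion.

```python
import unicodedata

def describe(ch):
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

# Three capital letters that usually render identically.
lookalikes = ["\u0041", "\u0391", "\u0410"]
for ch in lookalikes:
    print(describe(ch))

# They are distinct characters, so a set keeps all three of them.
print(len(set(lookalikes)))
```

Because the three characters compare as different, a word spelled with a mix of them will not match its plain Latin spelling in a simple string comparison, which is presumably why such mixing can slip past naive filters.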

I feel like he isn't trying to steer the river but the two confused-looking people across the river. What else would their role be, if not that? 14:01, 30 August 2016 (UTC)

They have another sign lying on the ground, so they seem to be fighting about where to put said sign. Psu256 (talk) 17:45, 30 August 2016 (UTC)

I think that the "Hey! That's not what that area is for!" line is about how people use features of Unicode in unintended ways.--Henke37 (talk) 12:33, 31 August 2016 (UTC)

You don't need to go as far as emoji to show how Unicode is doomed; the CJK (Chinese, Japanese, Korean) charsets, used in probably most developed countries outside of America/Europe, had a pretty tough time getting settled and still have a few problems. 18:01, 31 August 2016 (UTC)

Can you elaborate or give a reference? Thanks 20:45, 31 August 2016 (UTC) Sylvain M.

Okay. Since I'm Korean, let me start with Hangul, which is used to write the Korean language. The beauty of Hangul is that a complete letter consists of 2~3 'jamo's (consonants or vowels). The first one is a consonant, called 'chosung'; the second is a vowel, called 'joongsung'; the last is a consonant, called 'jongsung'. The possible numbers for each are 125, 95 and 138, so the total possible number of letters is 125 × 95 × 138 = 1,638,750. But that's a theoretical number, and the letters actually in frequent use are far fewer. So in Unicode 1.0 there were 2,350 complete letters. However, that trimmed too much, and quite a lot of letters were missing. So 4,516 letters were added in Unicode 1.1. Unfortunately, this time the order of the charset table was all messed up. You need a program to construct a letter from jamos, and it was almost impossible to write one that does consistent conversion. So in Unicode 2.0 these areas were totally scrapped, and 11,172 letters were allocated in a new area.
The Hangul charset was mostly settled there. The rest of the 1,638,750 Hangul letters, which are rarely used, are constructed by another method: writing three jamos in sequence. You might ask why we didn't use this method in the first place; that's because there would be too much overhead. We could have ended up using 4~6 bytes per complete letter, instead of 2 bytes per letter...
You can still find "CJK unified ideographs" being added even in recent Unicode versions. Since these ideographs are used across such a vast area and in different countries, there are many similar but different characters. AFAIK these are mostly needed for Japanese names. 15:08, 1 September 2016 (UTC)
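
A minimal sketch of the Hangul arithmetic described above, in Python. Some of it is assumption rather than something stated in the comment: the figures 19, 21 and 28 are the modern jamo counts behind the 11,172-letter block (19 × 21 × 28 = 11,172), U+AC00 is taken as the start of that block, and the byte counts assume UTF-16. The names `compose`, `han` and `seq` are illustrative only.

```python
import unicodedata

# Modern jamo counts assumed for the precomposed block.
# The tail count includes the "no final consonant" case.
N_LEAD, N_VOWEL, N_TAIL = 19, 21, 28
S_BASE = 0xAC00  # first precomposed syllable (assumed block start)

def compose(lead, vowel, tail=0):
    """Map zero-based jamo indices to a precomposed Hangul syllable."""
    return chr(S_BASE + (lead * N_VOWEL + vowel) * N_TAIL + tail)

print(N_LEAD * N_VOWEL * N_TAIL)  # size of the precomposed block
print(125 * 95 * 138)             # theoretical total quoted above

han = compose(18, 0, 4)                  # lead index 18, vowel 0, tail 4
seq = unicodedata.normalize("NFD", han)  # same syllable as a jamo sequence

# Compare storage cost: one precomposed code point vs. three jamo.
print(f"U+{ord(han):04X}", len(han.encode("utf-16-be")), "bytes")
print(len(seq), "jamo,", len(seq.encode("utf-16-be")), "bytes")
```

Under these assumptions the precomposed letter takes 2 bytes and the three-jamo sequence takes 6, which matches the "4~6 bytes instead of 2" overhead mentioned above; normalizing the sequence with NFC gives back the single precomposed letter.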

Would a brontosaurus have feathers? 01:21, 2 September 2016 (UTC)

Also, it's possible that people who like to argue over how Unicode should define things could get drawn in? 04:23, 2 September 2016 (UTC)
