Editing 2109: Invisible Formatting

==Explanation==
 
In various word processor programs, when highlighting text, whether by clicking-and-dragging or double-clicking, it is easy to select characters on which markup (e.g. ''italics'' or '''bold''') has no visible effect, such as a space or the end-of-paragraph mark. Since in most fonts the word space looks identical in bold, italic, and regular, this has no effect on how the end user will read the document, but it could theoretically cause a problem on certain occasions, most notably on computers which might parse a bold space differently or incorrectly. This problem is compounded if the text cursor does not clearly indicate that the space is bold or italic when a user hovers the mouse over it. [[Randall]] worries about this.
  
 
In the pictured case, Randall does not appear to have selected the word by double-clicking, since the cursor is depicted past the end of the word instead of on top of it; rather, he has clicked-and-dragged the mouse cursor to select it. The space character is relatively thin, which makes it both hard to avoid selecting and hard to notice; even so, most people don't worry if they've selected it and tend not to bother fixing it. Randall later uses the same click-and-drag method to remove the bold, but this time omits the space, which therefore retains its bold formatting. Since it is a blank character, there is no easy way to tell that it is still bold; even if it is slightly wider in the bold font, this may be hard to notice. This is the situation the comic is highlighting, [[559: No Pun Intended|no pun intended]].
  
Usually, if one highlights a word by double-clicking, both the word and the space following it become highlighted. Therefore, this problem could have been avoided if Randall had used this method, as the space would have been automatically included both times, and the markup would have been removed from the space character as well.
  
The comic also indicates that Randall bolds text by clicking the "bold" button in the word processor, rather than using a keyboard shortcut (usually Ctrl+B or Cmd+B). Since the keyboard shortcuts for bold, as well as for italics, copy, paste, and print, are very commonly known, clicking a GUI button and dragging to highlight a single word rather than double-clicking could indicate that Randall is not very familiar with word processors.

Though Randall is likely thinking of computer-related problems caused by his invisible formatting, there is also a chance that his bold space would cause other, non-computer-related issues. As Randall bolded the word "not" but then changed his mind, he apparently believes that writing '''not''' is too strongly worded. With an invisible bold space, whoever the document was intended for could notice Randall's bold space and figure out that the word "not" was originally bolded. Depending on the context, a bolded "not" could be enough to change the tone of the text from polite and formal to dismissive (e.g. "We believe you are not suitable for this position." vs "We believe you are '''not''' suitable for this position.").
  
 
In the title text, Randall says that he “fixes” this by running the text through {{w|Optical character recognition|OCR}}, which turns physical copies or images into text. Although this would "fix" the invisible formatting (since the OCR is unable to detect it), this would usually ruin even more formatting, and add inaccuracies to the text. This way, no one can tell which bugs were introduced by him and which ones by the OCR, which he facetiously suggests is better somehow.
 
  
As the title text explains, Randall finds it very important to control all information he publishes. Real-world examples include governments changing the impact of reports for political reasons; attempted tampering of this kind can be revealed by leftover bold spaces. Another example would be a short, casual one-sentence reply, e.g. to a romantic interest, which one spends an hour formulating so that it sounds as natural as possible.
  
There are also other occasions where a hidden bold space may be a problem for later editors (see the [[#Trivia|Trivia]] section below). Randall’s background in {{w|computer programming}} could also make him more attentive to these types of technical problems, which may be another reason for his worries about invisible formatting.
  
 
==Transcript==
 
  
 
:[The cursor is next to the "to". No text is highlighted.]
 
:Text: ...ere, but would '''not '''have to mo...
:Thought bubble: ...Nah, the bold is too much.
 
  
 
:[The word "not" is now highlighted in blue again, but the following space is not.]
 
 
*Exporting to plain text files.  If for example a {{w|markdown}} style is used, there will be characters in the output that do not make sense.
 
 
*Scraping, data mining, and linguistics processing by computer algorithms.  Often (although not always) these algorithms are written based on samples of training or testing text that may not have spurious formatting present, and may misprocess something when encountering the spurious formatting.
 
*Wikis. In''' '''this''' '''sentence,''' '''every''' '''space''' '''is''' '''a''' '''hidden''' '''bold''' '''space. From the editing view, all the spaces look <code><nowiki>like''' '''this</nowiki></code>. This will annoy all future editors of this article, due to the hidden apostrophes which are formatting the spaces. They may also accidentally introduce bold words.
 
**By default, MediaWiki attempts to prevent this by not including the trailing spaces in the bold formatting when you click the “bold” button, so someone has to manually type the formatting apostrophes to do this.
 
 
*A situation where formatted text is not allowed, and is rejected, but the user failed to strip formatting from the spaces, and this is noticed.
 
 
*Bold (or italic or non-breaking) spaces are also popular in {{w|Steganography|steganography}}. By using bold spaces in some places and not in others it is possible to hide secret information in a public text, that will not be visible to the casual reader, who does not explicitly search for the hidden information. Additionally if such a document is found with a person, that person can {{w|Plausible_deniability|plausibly deny}} all knowledge of the encoded information.
 
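
The hidden bold spaces described in the wiki item above can, in principle, be found mechanically. The following is only a rough sketch in Python, not a description of how MediaWiki itself handles markup: it assumes that bold is written purely with triple apostrophes, and it ignores bold-italic (five apostrophes), <code><nowiki><nowiki></nowiki></code> sections, and templates. The function names <code>find_bold_whitespace</code> and <code>strip_bold_whitespace</code> are made up for this illustration.

```python
BOLD = "'''"  # wikitext bold delimiter (assumption: no bold-italic or <nowiki> handling)


def find_bold_whitespace(wikitext: str) -> list[int]:
    """Return the offsets of whitespace-only runs that sit inside bold markup."""
    offsets = []
    pos = 0
    # Splitting on the delimiter makes every odd-indexed part "inside bold".
    for i, part in enumerate(wikitext.split(BOLD)):
        if i % 2 == 1 and part.isspace():
            offsets.append(pos)
        pos += len(part) + len(BOLD)
    return offsets


def strip_bold_whitespace(wikitext: str) -> str:
    """Remove the bold markup around whitespace-only runs, keeping other bold text."""
    parts = wikitext.split(BOLD)
    if len(parts) % 2 == 0:
        # Odd number of delimiters: the markup is unbalanced, so leave it alone.
        return wikitext
    out = []
    for i, part in enumerate(parts):
        if i % 2 == 1 and not part.isspace():
            out.append(BOLD + part + BOLD)  # real bold text: keep its markup
        else:
            out.append(part)  # plain text, or whitespace freed from bold
    return "".join(out)


if __name__ == "__main__":
    sample = "would not''' '''have to"
    print(find_bold_whitespace(sample))
    print(strip_bold_whitespace(sample))
```

Tracking whether each segment is inside or outside bold, rather than simply searching for an apostrophe-space-apostrophe pattern, avoids mistaking the ordinary space between two adjacent bold words for a bold space.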
  
Popular modern word processing programs have features which may make it easier to notice improperly formatted invisible characters. The tutorials linked here explain how to view invisible characters in [https://support.microsoft.com/en-au/office/show-or-hide-tab-marks-in-word-84a53213-5d02-404a-b022-09cae1a3958b Microsoft Word], [https://support.apple.com/kb/PH23650?locale=en_US&viewlocale=en_US Pages] and [https://help.libreoffice.org/latest/en-US/text/swriter/01/03100000.html LibreOffice Writer]. However, even with this feature on, it would be difficult to spot a bolded space (which looks like a bolded dot &ndash; now visible, but so small that it is still hard to tell whether it is bold or not). In the older word processor {{w|WordPerfect}}, one could do this with the “Reveal Codes” feature, which showed character codes, separate from the characters themselves, around the characters.  For example, a bolded space would look something like "<span style="background:#34F5FF">[BOLD&#8827;</span>&ensp;<span style="background:#34F5FF">&#8826;BOLD]</span>".
  
 
Web sites which allow content to be edited by users but generate the formatting code automatically often have versions of the invisible formatting problem; for example, eBay listings which use anything other than the default font rapidly accumulate hard spaces, font end and begin transitions, and other invisible formatting if they are subsequently edited, which can slow page loading and cause other problems. This is also seen in blogs etc.
 
