Editing 2298: Coronavirus Genome

Latest revision Your text
Line 18: Line 18:
 
[[Cueball]] is surprised that Megan and her colleagues actually use {{w|Microsoft Notepad}}, a simple {{w|text editor}}, to look at the genome, instead of more modern technology. She explains that better research institutions use {{w|Microsoft Word}}, a more advanced editor, to allow additional formatting (such as '''bolding''' and ''italics''), and humorously calls this "{{w|epigenetics}}". In the real world, epigenetics is the study of changes that are not caused by changes in nucleotides, but by chemical modifications of DNA or chromosomes that cause changes in patterns of gene expression and activation, sometimes several generations down.  This might be considered analogous to altering the meaning of a text by changing its formatting rather than the content; for example, content can be moved into parentheses or footnotes to be de-emphasized, or rendered in boldface or enlarged to attract attention and emphasize key points. Much as text can be wrapped in HTML tags or similar markup to change its formatting, nucleotides can be {{w|DNA methylation|methylated}} to prevent transcription, and the {{w|histone}}s around which DNA is wound can also be modified to promote or repress gene expression. During DNA replication, these modifications are often also reproduced.  
 
  
−
The real punchline comes when Megan uses {{w|Spell checker|spellcheck}} to detect mutations in the genome by adding the previous genome to spellcheck and comparing them. Overall, Megan uses ridiculously and humorously crude methods to analyze a major genetic item. The genome of SARS-CoV-2 is almost 30,000 base-pairs long, which exceeds the {{w|longest words}} of any natural language by two orders of magnitude (the longest words ever used in literature -- i.e. not constructed in isolation simply for the purpose of being a long word, or chemical formulas -- approach 200 letters), and may exceed the capabilities of any available spell-checking program. Furthermore, a spellcheck program underlines the whole word if a single letter is wrong and not just the letter itself. Thus, it would not be able to highlight individual mutated base pairs.  Megan might be better served by using a {{w|diff}} tool, but most scientists generally use commercial software that is designed to view, annotate, and edit DNA sequences (eg: Snapgene, Geneious, DNAstrider, ApE).
+
The real punchline comes when Megan uses {{w|Spell checker|spellcheck}} to detect mutations in the genome by adding the previous genome to spellcheck and comparing them. Overall, Megan uses ridiculously and humorously crude methods to analyze a major genetic item. The genome of SARS-CoV-2 is almost 30,000 base-pairs long, which exceeds the {{w|longest words}} of any natural language by two orders of magnitude (the longest words ever used in literature -- i.e. not constructed in isolation simply for the purpose of being a long word, or chemical formulas -- approach 200 letters), and may exceed the capabilities of any available spell-checking program. Furthermore, a spellcheck program underlines the whole word if a single letter is wrong and not just the letter itself. Thus, it would not be able to highlight individual mutated base pairs.  Megan might be better served by using a {{w|diff}} tool, but most scientists generally use commercial software that is designed to view, annotate and edit DNA sequences (eg: Snapgene, Geneious, DNAstrider, ApE).
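The diff-based comparison suggested above can be sketched in a few lines. This is only an illustration, assuming Python's standard <code>difflib</code> module stands in for a diff tool; the function name <code>find_mutations</code> and the short sequences are hypothetical and are not real SARS-CoV-2 data or the workflow of any of the programs named above.

```python
from difflib import SequenceMatcher


def find_mutations(reference: str, sample: str) -> list:
    """List the differences between two nucleotide strings.

    Each entry is (kind, reference_position, reference_bases, sample_bases),
    where kind is 'replace', 'delete' or 'insert' as reported by difflib.
    Positions are 0-based offsets into the reference string.
    """
    # autojunk=False switches off difflib's heuristic that treats very
    # frequent elements as junk, which matters for a four-letter alphabet.
    matcher = SequenceMatcher(None, reference, sample, autojunk=False)
    return [
        (tag, i1, reference[i1:i2], sample[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Hypothetical ten-base fragments that differ by one substitution.
reference = "TACTAGCGTG"
sample = "TACTAGCATG"

for kind, position, ref_bases, sample_bases in find_mutations(reference, sample):
    print(kind, position, ref_bases, "->", sample_bases)
```

Unlike a spell checker, which flags the whole "word", this reports the position of each changed base. For genome-length sequences, a dedicated alignment tool would be the more realistic choice, since <code>difflib</code> was not designed for inputs of that size.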
  
 
The title text mentions {{w|Grammar checker|grammar checking}} and claims that whoever discovers how to use that to compare genomic material should be awarded a {{w|Nobel Prize}}. Spell-checking is analogous to comparing sequences against ones previously known, an activity that is the bread and butter of bioinformatics nowadays. Grammar checking would be analogous to having some sort of sense as to how well all the sequences generally cooperate and interact to create possibly viable functionality in an organism, something we are unable to do at the moment except in very limited ways and only in a few simple cases. It may also be a snarky commentary on the untrustworthy nature of grammar-check programs in general, which often follow grammatical rules far more strictly than is practical; it's not uncommon for an author to follow a grammar-check recommended correction only to find the corrected portion is now part of a longer portion that the checker deems "incorrect".
 
Line 25: Line 25:
  
 
==Transcript==
 +
{{incomplete transcript|Do NOT delete this tag too soon.}}
  
 
:[Megan sits at a desk, working on a laptop. A genome sequence is displayed on her laptop screen, shown with a jagged line in a text bubble.]
 
:Cueball (off-screen): So that's the coronavirus genome, huh?
 
:Megan: It is!
−
:Laptop: ''<A long string of unintelligible letters, presumably the genome>''
+
:Laptop: TACTAGCGTGCCTTTGTAAGCACAAGCTGATTAGTACGAACTTATGTACTCATTCGTTTCGGAAGAGACAGGTACGTTA
  
 
:[Cueball walks up and stands behind Megan, still working on the laptop.]
Line 42: Line 43:
 
:Megan: That extra formatting is called "epigenetics".
  
−
:[A regular panel. Cueball still stands behind Megan, this time with his hand on his chin.]
+
:[A regular panel, Cueball still stands behind Megan. He has his hand on his chin.]
 
:Cueball: Hey, why does that one have a red underline?
 
:Megan: When we identify a virus, we add its genome to spellcheck. That's how we spot mutations.