1605: DNA

Title text: Researchers just found the gene responsible for mistakenly thinking we've found the gene for specific things. It's the region between the start and the end of every chromosome, plus a few segments in our mitochondria.


Because we have pretty much mapped the entire human genome, it's tempting to think we now know what makes our bodies tick and can start changing things. But just knowing what the individual pieces are doesn't mean we know how they interact and behave in a complex system like our bodies.

In the comic, White Hat thinks that mapping the human genome is the same as knowing the source code for a computer program. By studying the source code for a program, a person can often understand why it does what it does, and make effective and fundamental changes to the program's operation. This may be a reference to the hyperbolic claims of Raymond Kurzweil, author of The Singularity is Near, that DNA is closely analogous to a computer program. Kurzweil believes that since we have sequenced DNA, we will soon be able to reverse engineer the brain and program a computer to completely simulate all its functions.

Megan points out that even a complete knowledge of DNA would only provide a partial understanding of our body's workings. Complete knowledge would require an understanding of feedbacks and external processing (such as the interactions of the proteins created by DNA). In addition, the comparison is not valid because the human body is many orders of magnitude more complicated than the computers on which we run programs. White Hat is not persuaded, even though Megan points out that DNA has been developed in the most aggressive optimization process in the universe (natural evolution), running for billions of years. White Hat's thought process may be similar to that of the physicist in 793: Physicists, who assumes that any other field is simple because it appears to be similar to something he's seen before.

Finally, Megan enlightens White Hat by making him look at the source code for Google's front page. In a web browser, the page looks simple: a very plain white page with a search box in the middle plus a few text links and icons. Indeed, back in the 1990s Google's HTML code for the page was quite simple. But in less than 20 years, Google developers have vastly expanded it, with over 300 kilobytes of minified JavaScript and CSS. Looking at some obfuscated source code may make it clearer how misleading even simple-looking code can be, and how unreadable correct and well-functioning code can be. This analogy causes White Hat to consider how much more complexity could evolve over billions of years through the relentless forces of nature.
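As a rough illustration of how minification alone hurts readability, here is a minimal Python sketch. The two JavaScript snippets are hypothetical examples written for this purpose, not taken from Google's page, and `summarize` is a made-up helper name. The keyword counts are naive regular-expression matches, so on real code they would also count keywords that appear inside strings or comments.

```python
import re

# Hypothetical readable JavaScript snippet (NOT Google's actual code).
READABLE = """\
function countVowels(text) {
  // Count the vowels in a string.
  let total = 0;
  for (const ch of text) {
    if ("aeiou".includes(ch)) {
      total += 1;
    }
  }
  return total;
}
"""

# The same logic, minified by hand: comment and whitespace removed,
# names shortened to single characters.
MINIFIED = 'function c(t){let n=0;for(const h of t){if("aeiou".includes(h)){n+=1}}return n}'


def summarize(source: str) -> dict:
    """Return rough size statistics for a piece of source text."""
    return {
        "lines": len(source.splitlines()),
        "chars": len(source),
        # Naive whole-word matches; strings and comments are not excluded.
        "if": len(re.findall(r"\bif\b", source)),
        "for": len(re.findall(r"\bfor\b", source)),
    }


if __name__ == "__main__":
    print("readable:", summarize(READABLE))
    print("minified:", summarize(MINIFIED))
```

Both versions contain the same control flow (one `for` loop and one `if`), but the minified one collapses onto a single line with no comments or meaningful names, which is the effect the comic relies on at a much larger scale.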

What makes this even worse with DNA is that although it can be thought of as 'source code' it isn't for a language we fully understand, and this code was generated through various natural mechanisms such as natural selection, feedback loops like homeostasis, etc.; possibly even including processes that are not currently known to science. Further, program maintainability is not an issue, so there is no reason for the code to be easy to understand. Additionally, there are many other non-genetic factors such as epigenetics, maternal effect and environment, which change how the genetic code is used. This means that not all parts make sense and that there may be all kinds of side effects and things that have several purposes.

The title text's reference to finding the gene that is responsible "for mistakenly thinking we've found the gene for specific things" alludes to the tendency of news organizations to run headlines making similar claims, often by oversimplifying or misrepresenting the actual study. These claims are based on the common belief that since DNA is a 'source code' for our body, it should be possible to pinpoint the effect of individual genes in much the same way that we could describe the effect each line of code has in a very simple program, leading people to expect one gene to be associated with each observable human trait. In reality, even small traits are the results of hundreds of genes, sometimes spread across multiple chromosomes, interacting through complex mechanisms. This makes it rare that a single gene, or gene sequence, can be definitively stated to be the sole, or primary, cause of a given trait.

The joke of the title text is that the responsible gene is located in "the region between the start and the end of every chromosome", meaning that the whole genome, not any one gene or DNA segment, must be considered responsible for the referenced trait. Because DNA and environment are interconnected during development, every gene is at least partially responsible for generating any complex trait. Randall even includes the mitochondria, recognizing that the short DNA sequences present in these organelles, which are located outside the cell nucleus, also contribute to development. The organismal chromosome or chromosomes are located in the nucleus, but mitochondria have their own tiny independent genome, reflecting their distant ancestry as separate but symbiotic organisms. This means that the DNA segments coding for any given human trait are not even necessarily all found on the main chromosomes in the nucleus.

Technically a gene is "a locus (or region) of DNA that encodes a functional RNA or protein product", which means that it is a single discrete unit of DNA, with human DNA containing over 20,000 genes. Thus the theoretical gene could not include the entire region between the start and the end of every chromosome since that region contains thousands of genes, any more than it's possible to say that the ace of clubs is the card everywhere from the top of the full deck of cards to the bottom of it.

Of course, if such a gene actually did exist, we would never be able to correctly identify where it was, since we would make a mistake every time we thought we had found a gene for something specific. So the whole title text is either a contradiction (they could never find this gene if it was there) or a tautology (if the gene did exist, then of course it has to be part of our entire DNA), or both. (If it is a tautology, it is the second title text using this in just two weeks, the last being 1602: Linguistics Club.)

Google's home page for the date this cartoon appeared can be seen at the internet archive: www.google.com homepage (18 Nov 2015).

Similar discussions between White Hat and Megan can be found in 1255: Columbus and 1731: Wrong.


[White Hat, holding a laptop, is talking to Megan who looks at her smart phone.]
White Hat: Biology is largely solved. DNA is the source code for our bodies. Now that gene sequencing is easy, we just have to read it.
Megan: It's not just "source code". There's a ton of feedback and external processing.
[White Hat, opening his laptop, walks toward a desk and chair past Megan who holds her arms out.]
Megan: But even if it were, DNA is the result of the most aggressive optimization process in the universe, running in parallel at every level, in every living thing, for four billion years.
White Hat: It's still just code.
[White Hat sits down at the desk with his open laptop, while Megan looks over his shoulder.]
Megan: OK, try opening google.com and clicking "View Source."
White Hat: OK, I-... Oh my god.
Megan: That's just a few years of optimization by Google devs. DNA is thousands of times longer and way, way worse.
White Hat: Wow, biology is impossible.



The source for Google.com can be found at `view-source:https://www.google.com/` for Firefox and Chrome. Also here. —Artyer (talk|ctb) 16:06, 18 November 2015 (UTC)

Should there be a link to the code in the explain. I do not understand these links or the source code, and would not like to place these links in the explanation. --Kynde (talk) 18:43, 18 November 2015 (UTC)

I really like this comic. IMHO, just another good example of intelligent design. Google's dev had to design, plan and carefully code. If that is seemingly simple compared to DNA and biology then how much more intelligence and thought was needed for the coding of all living things?--R0hrshach (talk) 17:18, 18 November 2015 (UTC)

With all the stupid things going on in our bodies (rendered useless by natural selection but staying put anyway, like the appendix or our tailbone), it is to me just a clear example that there has been no intelligence behind our genome, just trial and error, and then 4 billion years to get it right enough that it works, but not smart. And don't get me started on how our air and food/drink have to go in the same way, with the risk of being (nearly) killed by a pretzel... (even if you are the president of the US ;-) That is just plain stupid design. But few enough die from this that it was not necessary for nature to change it once it was working. Humans and the genes survived long enough to reproduce. --Kynde (talk) 18:43, 18 November 2015 (UTC)
Without an appendix how would our gut immune system develop properly? Without a tail bone how would we stand upright? It's a fallacy to think that just because we don't understand something it must have no purpose. 00:53, 19 November 2015 (UTC)
I think it somewhat illogical and incomprehensible that someone could point to the human body and call it just plain stupid design. So stupid that humans cannot replicate it on a mechanical or software level, yet its complex design works independently of our conscious thought and exists for the most part on its own. When you look at it from an evolutionary standpoint you spot design flaws; however, we don't even fully understand the full scope of our own biology and we still run trial and error studies. My profession is in aircraft parts design and manufacture. I look at the parts we create and build to put on aircraft, and at how much time, engineering, design and testing goes into them, while we ultimately still discover small errors in tolerance stack-ups and cascading events, and still these parts are considerably crude in nature compared to living cells and the entire ecosystem that is us. It is mind blowing. Believe me. The fact that surgeons can go in, move things around, cut things out, insert things for goodness sake, and still the body operates is a wonderful testament to the truly awe-inspiring mechanical design that is our bodies.--R0hrshach (talk) 22:31, 23 November 2015 (UTC)
Without a pretzel-choking mechanism, how could we ever hope to weed out less-desirable presidents? 21:59, 19 November 2015 (UTC)

Nothing to do with choking; the pretzel interfered with the vagus nerve, interfering with his heart rate Gearóid (talk) 08:05, 20 November 2015 (UTC).
I am sure you are right about intelligent design being involved: clearly DNA's tangled structure is a deliberate nod to a plate of spaghetti, proof indeed that the Flying Spaghetti Monster has had a hand (well, a noodly appendage) in all of creation. Martin (talk) 00:16, 20 November 2015 (UTC)

Apologies, when I saved my comments it blitzed someone else's that must have been being written at the same time :'-( RIIW - Ponder it (talk) 19:13, 18 November 2015 (UTC)

Yeah, it was my two comments above. I have now moved the one right here above back in place from the bottom, where Davidy22 had placed it when he tried to fix it. No harm done, but as he says: Read error messages, I know mediawiki gives them to you. You can always see in the history what you have changed. --Kynde (talk) 21:08, 18 November 2015 (UTC)
I, for my part read the edit conflict (with Kynde, 18:43) like a good little boy, re-edited in light of that, resubmitted and... forgot to answer the security question. For completeness I wrote the following. If it's still helpful...
Had the same thought.  Although I just use "View | Source" from the menu or right-click and "View Page Source", or whatever that browser tends to want to give me.  And, having had that same thought: For reference, in case anybody wants it, the source of the google.co.uk main page (assumed not far off google.com in its nature) is 51 lines.  But that's 51 long lines of mostly javascript, with much of the unnecessary whitespace (including line-feeds) taken out of it, overwhelmingly single-character variable names, over 150 'if' statements (including 'else if' ones, in continuation to a prior one) and perhaps 56 'for' loops, at first glance.  Whether 'optimised' or obfuscated, it certainly could be a challenge to fully understand.
HTH, HAND 21:43, 18 November 2015 (UTC)
I don't think that Google's homepage is only optimized. I'm sure part of obfuscation is deliberate. That said, just removing comments and changing variable names is usually enough to make program unreadable. -- Hkmaly (talk) 13:18, 19 November 2015 (UTC)

IMHO DNA with its redundant sections for things not currently used and the bodges in biological design are a good example of unintelligent design. For example the blood supply to the retina is between the iris and the retina, so it is in the way. An intelligent designer would do an eye mark II. But this has nothing to do with the comic. RIIW - Ponder it (talk) 19:07, 18 November 2015 (UTC)

Sigh. Another of the "I could do a better job" brigade. Go ahead. Try it. Post back here after you learn enough about the existing eye design that you recognise just how incredible it is. 00:57, 19 November 2015 (UTC)
Eye mark II is used in octopuses: Cephalopod eye. Solves multiple problems of our eyes. -- Hkmaly (talk) 13:15, 19 November 2015 (UTC)
Cephalopod vs Vertebrate eyes is a classic example of convergent evolution, therefore eye structure proves evolution not intelligent design. Martin (talk) 00:16, 20 November 2015 (UTC)
Eye structure disproves intelligent design BECAUSE no intelligent designer would use two things which are so similar and yet so different. Disproving intelligent design is easy. The real content is between evolution and STUPID design. Or, well ... Cephalopod vs Vertebrate eyes looks EXACTLY like something which would happen if two designers try to compete without directly copying from each other. -- Hkmaly (talk) 13:34, 20 November 2015 (UTC)
Nothing about Intelligent Design leads one to assume that the human anatomy would have been designed foolproof and without any inherent weak points. Why would you design a superhuman?--R0hrshach (talk) 22:37, 23 November 2015 (UTC)

White Hat is showing the hubris often seen in people who think their (often limited) knowledge in one field can be used as an analogy for something very different. Megan only manages to show him his error by showing that a "simple" web page, which has only been evolving for a few years, is more complex than he thinks, and that the role of any one line/command in the page is probably far from clear without deep analysis. RIIW - Ponder it (talk) 19:07, 18 November 2015 (UTC)

The evolution of life is composed exclusively of copy-paste programming on top of legacy code, global variables, and hacks on hacks on hacks at every level, from telomeres and DNA looping, to the structure of the human hip (childbirth), to our breathing tract, optic nerve, and brain structure and cognition. -- 21:47, 18 November 2015 (UTC)

That's what you get when you hack the universe together with perl. -- Dsollen (talk) (please sign your comments with ~~~~)
Or C. Reminds me of a joke going around in the 90s ... http://www.gnu.org/fun/jokes/dna.en.html

So much for Gattaca then... Martin (talk) 00:16, 20 November 2015 (UTC)

I wasn't quite sure what was meant by a comprehensive language in this line:

"What makes this even worse with DNA is that although it can be thought of 'source code' it isn't for a comprehensive language"

Whether it meant that the language of DNA is incomplete in some way (e.g. relies on other linguistic components), or what. I changed it to:

"What makes this even worse with DNA is that although it can be thought of as 'source code' it isn't for a language we fully understand"

Which I think is clearer, and I hope captures the intended meaning. If not, please clarify. Thanks. 20:54, 21 November 2015 (UTC)

To maintain historical context, it would be prudent to add links to the current version of the Google homepage, both as it appears in a web browser, and Google's current code for the page (e.g. via the Wayback machine). Google could change the page to make it visually more complex, or change the code to make it simpler. Preserving samples of both would futureproof this explanation. Here is a link to the archive for the Google homepage on the day this comic was posted [Archive of www.google.com homepage, on 18 Nov 2015]

I do not know how to easily provide either an image of what this page looks like in a web browser from this time period, or how to provide a link to just the google code from the archived page, without violating copyright. 20:54, 21 November 2015 (UTC)

I thought it notable that the source code for the wiki page for 'Minification (programming)' [1], itself contained minified code. These Are Not The Coments You Are Looking For (talk) 00:12, 23 November 2015 (UTC)

I do not think DNA is 'source' code. A more apt comparison would be compiled (binary) code of a self-modifying program for which no source code is available. Anyone who has dabbled in reverse engineering is probably familiar with the Chinese-crossword level of confusion when first reading an unannotated binary (although it does get better with experience). Now imagine you also don't know most of the assembly language, multiply by at least 1000, and you've got the genetic engineering problem. Biologists who study it now are at least as hardcore as programmers in the '60s. Probably more. 13:29, 27 June 2022 (UTC)

Being a programmer (and especially one that works with hardware a lot), I love to try to make analogies of biology to computers. Of course that really, really only goes so far. Having done the briefest amount of research into DNA, it's definitely not "source code". It's machine code for the worst ISA in the world that has no documentation, and the code uses every single quirk to its extreme. Talk about self-modifying code, this is a self-modifying processor! Multiple "instructions" seem to code for the same thing except they actually impact the "performance" significantly. The interactions between every part of the system are so wild and sensitive that not only is the outcome different "run to run", but even when replicating the code exactly, environmental conditions make the outcome possibly very different. Not to mention that every single "processor" is slightly different and will run the same thing differently! The fact that we can be sure about almost anything is wild to me, but clearly we know way less about complex organisms than simple ones. IIRC, we've managed to "compile" a simple bacterium genome and get it to work at least somewhat, but not as well as the original! Brycemw (talk) 16:13, 23 April 2024 (UTC)