The topic of junk DNA has come up a number of times in posts on this site over the last year or so. In particular, one commenter has referred on several occasions to the supposed debunking of junk DNA as a serious blow to the theory of evolution. This is a fairly common misunderstanding. But the recent flurry of publicity over the release of a large series of papers by the ENCODE project in the first week of September, along with a truly exceptional claim that 80% of DNA exhibits biochemical function, has brought junk DNA to the fore once again.
A typical report from the New York Times: Bits of Mystery DNA, Far From ‘Junk,’ Play Crucial Role.
And from the Washington Post (one of the worst examples of hype I found): ‘Junk DNA’ concept debunked by new analysis of human genome.
Another from Forbes looks at the reaction to the press surrounding ENCODE: Reports of Junk DNA’s Demise Have Been Greatly Exaggerated.
A response on the Nature News Blog: Fighting about ENCODE and junk.
This is a real mess – a mixture of hype and real breakthroughs, straightforward commentary and polemic.
What does this mean for the arguments for evolution and common descent?
What is the non-scientist to think?
As a result the topic of junk DNA is probably worth a post.
We all learned in high school or college biology that the sequence of DNA codes for proteins. Each of the 20 natural amino acids is specified by one or more three-base nucleotide sequences (codons) according to the genetic code. The stretch of DNA that codes for a given protein is called a gene. The human genome contains some 23,000 genes. Most of the DNA in each chromosome, however, does not code for protein. The remainder either serves a different function or perhaps serves no function. This DNA has often been referred to as “junk” DNA, as though it were uniformly without function in biology.
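For readers who like to see the idea concretely, here is a minimal Python sketch of how the genetic code maps three-base sequences to amino acids. The table is the standard genetic code written in its usual compact form (one-letter amino acid codes, `*` for a stop signal). The function names and the short example sequence are made up for illustration and do not come from any particular gene or library.

```python
from itertools import product

# Standard genetic code in compact form: codons are enumerated with the
# bases in the order T, C, A, G, and each position in AMINO_ACIDS gives the
# one-letter amino acid for that codon ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Read a coding-strand DNA string three bases at a time.

    Translation stops at the first stop codon; any trailing bases that do
    not form a complete codon are ignored.
    """
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def codons_for(amino_acid):
    """Return every codon that specifies the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


if __name__ == "__main__":
    # Hypothetical nine-base sequence: start codon, one alanine codon, stop.
    print(translate("ATGGCCTAA"))
    # Leucine (L) is specified by several codons, tryptophan (W) by one.
    print(codons_for("L"))
    print(codons_for("W"))
```

The point of the sketch is the "one or more" in the paragraph above: 64 possible codons map onto 20 amino acids plus stop signals, so most amino acids can be spelled more than one way.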
The presence and pattern of this so-called “junk” DNA is a persuasive argument for evolution and common descent. In his book The Language of God Francis Collins describes transposable elements or ancient repetitive elements (AREs) as one of many threads of evidence for common ancestors and common descent (p. 135-136).
Even more compelling evidence for a common ancestor comes from the study of what is known as ancient repetitive elements (AREs). These arise from “jumping genes,” which are capable of copying and inserting themselves in various other locations in the genome, usually without any functional consequences. Mammalian genomes are littered with such AREs, with roughly 45% of the human genome made up of such genetic flotsam and jetsam. When one aligns sections of the human and mouse genomes, anchored by the appearance of genetic counterparts that occur in the same order, one can usually also identify AREs in approximately the same location in these two genomes.
Some of these may have been lost in one species or the other, but many of them remain in a position that is most consistent with their having arrived in the genome of a common mammalian ancestor, and having been carried along ever since. Of course, some might argue that these are actually functional elements placed there by the Creator for a good reason, and our discounting them as “junk DNA” just betrays our current level of ignorance. And indeed some small fraction of them may play important regulatory roles. But certain examples strain the credulity of that explanation. The process of transposition often damages the jumping gene. There are AREs throughout the human and mouse genomes that were truncated when they landed, removing any possibility of their functioning. In many instances, one can identify a decapitated and utterly defunct ARE in parallel positions in the mouse and human genome.
Do new discoveries of function for the so-called “junk” undermine this argument? Dennis Venema, an associate professor of biology at Trinity Western University in British Columbia, had a good series of posts on the BioLogos blog earlier this year discussing “junk DNA”: Is there Junk in your Genome. I recommend these posts to anyone with interest in more of the details.
The label “junk” is not entirely accurate, as both Dr. Collins and Dr. Venema note, and functions of segments of the so-called “junk DNA” are being identified with some regularity. In fact, it is not uncommon to see articles in the scientific literature, or more often in the popular press, with an opening paragraph that highlights the label and continues on to describe some function of some portion of the so-called “junk”. Many small sequences of conserved noncoding elements (often abbreviated CNC or CNE) appear to serve regulatory functions, turning genes on and off.
An NPR story last year, Don’t Throw It Out: ‘Junk DNA’ Essential In Evolution, highlighted the results of a paper published in Science, Three Periods of Regulatory Innovation During Vertebrate Evolution by Lowe et al. (Science 333, 1019-1024, 2011). This article discusses how the evolution of gene regulatory elements in the “junk” may be responsible for changes in the phenotypes (characteristic traits) of animals.
So what is the fuss? The headlines for ENCODE claim something like 80% of the DNA exhibits features of biochemical function. This is a truly phenomenal claim, an excellent headliner and an attention grabber. But the definition of function used in the ENCODE study is, as Dennis Venema points out, very fuzzy. The kinds of function found range from a strict definition involving protein coding and gene regulation to a very loose definition. Dennis works through the details of this in two new articles posted just this week dealing with questions coming into BioLogos following the publicity arising from the ENCODE findings: ENCODE and “Junk DNA” Part 1: All Good Concepts are Fuzzy and Part 2: Function: What’s in a Word?
The paper outlining the primary conclusions of the ENCODE project so far is available Open Access at Nature: An integrated encyclopedia of DNA elements in the human genome. I’m not the expert – so I may get some of this wrong. I will correct anything pointed out by a reader with more expertise. But here is the summary I can cull from the paper about the conclusions from ENCODE. Numbers are a little hard to pin down.
Something like 1.2% or so of our DNA codes for proteins.
Perhaps another 10-20% (many favor the low end of this range) functions as gene regulatory elements or serves other specific biochemical functions. According to the more conservative definitions of function the ENCODE project identifies 2.9% as “protein-coding gene exons” and 8.5% involved in “specific protein DNA binding.” The gene regulatory elements are critical, but operate under a different kind of evolutionary pressure for conservation of sequence. It appears, however, that more (much more) of our DNA is involved in gene regulation than in protein coding. This is a fascinating discovery. Biology is an incredibly complex topic. Quantum physics is trivial in comparison (perhaps).
Much of the rest of the DNA has function according to a very loose definition of function (see Dennis’s article for some more detail). Some of this “functional” DNA may have specific useful function to be uncovered in the future, but much of it probably doesn’t. I would bet very little of it does. Most, if not all, of it is under little to no selective pressure conserving the sequence of bases. In this 60% or so of the DNA that is loosely functional, and in the 20% for which ENCODE found no sign of biochemical function, we find ancient repeat units and pseudogenes, and other evidence for evolution recorded in the genome.
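The rough numbers above can be checked with a little arithmetic. The Python sketch below uses only figures quoted in this post (the 80% headline, about 1.2% protein coding, and 10-20% with specific function). It treats the categories as non-overlapping slices of the genome, which is a simplifying assumption on my part, and the names are purely illustrative.

```python
# Percentages quoted in the post; treating them as non-overlapping slices of
# the genome is a simplifying assumption made for this illustration only.
HEADLINE_FUNCTIONAL = 80.0             # ENCODE headline: "biochemical function"
PROTEIN_CODING = 1.2                   # codes for proteins
SPECIFIC_FUNCTION_RANGE = (10.0, 20.0) # regulatory or other specific function


def loosely_functional(specific):
    """Share of the genome that is 'functional' only by the loose definition."""
    return round(HEADLINE_FUNCTIONAL - PROTEIN_CODING - specific, 1)


def breakdown():
    """Return the rough partition implied by the quoted percentages."""
    low, high = SPECIFIC_FUNCTION_RANGE
    return {
        "no sign of biochemical function": round(100.0 - HEADLINE_FUNCTIONAL, 1),
        "loosely functional, if 20% is specific": loosely_functional(high),
        "loosely functional, if 10% is specific": loosely_functional(low),
    }


if __name__ == "__main__":
    for label, percent in breakdown().items():
        print(f"{label}: about {percent}%")
```

Depending on which end of the 10-20% range one favors, the loosely functional share comes out somewhere between roughly 59% and 69%, which is where the "60% or so" figure comes from.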
History in the Genome. It is interesting that as this story was breaking I was in Israel at a conference. One of the places I visited, and not for the first time, was the ancient site of Caesarea on the coast of the Mediterranean. Here we see both Roman ruins and the remains of a Crusader fortress. One interesting feature of the Crusader fortress is the recycling of material from the Roman city. The picture at the top of the post shows Roman pillars incorporated into the foundations and structures of the fortress. The picture to the right was taken in about the same location about a decade earlier (my camera wasn’t as good then – but the tree didn’t block some of the view). These pillars served a function for the Crusaders. They were not merely refuse carried along for no good purpose. Yet there is a history recorded in the details of the pillars independent of the function they served for the Crusaders, a history that points back to the Roman city.
A few percent of DNA codes for proteins; another 10-20% has specific function of various sorts, most importantly gene regulation. Much of the 80% of the DNA in the human genome that the ENCODE project finds to be functional only by the most liberal of definitions, or not functional even by that definition, carries a history written into the sequence of bases in the DNA chain. This history is something like the details, shapes, and carvings of the pillars reused in the Crusader fortress. The details tell us something about the history of the genome. If this DNA serves a function, it is not a sequence-specific function, and the function neither obliterates the history encoded in the DNA sequence nor disproves the theory of evolution.
Did the recent hype about Junk DNA cause you to wonder about the strength of arguments for evolution?
What is your reaction to the press and hype? Where do we look for good information?
If you wish to contact me directly you may do so at rjs4mail [at] att.net.