The ENCODE Data and Pseudogenes

As I mentioned in two previous posts (here and here), the coordinated release of scientific papers from the ENCODE project has produced an enormous amount of amazing data when it comes to the human genome and how cells in the body use the information stored there. While the majority of commentary regarding these data has focused on the fact that human cells use more than 80% of the DNA found in them, I think some of the most interesting scientific results have gotten very little attention. They are contained in a paper that was published in a journal named Genome Biology, and they relate to the pseudogenes found in human DNA.

For those who are not aware, a pseudogene is a DNA sequence that looks a lot like a gene, but because of some details in the sequence, it cannot be used to make a protein. Remember, a gene’s job is to provide a “recipe” for the cell so that it can make a protein. Well, a pseudogene looks a lot like a recipe for a protein, but it cannot be used that way. Think of your favorite recipe in a cookbook. If you use it a lot, it probably has stains on it because it has been open while you were cooking. Imagine what would happen if the recipe got so stained that certain important instructions were rendered unreadable. Someone who has never looked at the recipe before might recognize that it is a recipe, but because certain important instructions are unreadable, he will never be able to use it to make the dish. That’s what a pseudogene is like. It looks like a recipe for a protein, but certain important parts have been damaged so that they cannot be read properly anymore. As a result, the recipe cannot be used by the cell to make a protein.

Pseudogenes have been promoted by evolutionists as completely functionless and as evidence against the idea that the human genome is the result of design. Here is how Dr. Kenneth R. Miller put it back in 1994 [1]:

From a design point of view, pseudogenes are indeed mistakes. So why are they there? Intelligent design cannot explain the presence of a nonfunctional pseudogene, unless it is willing to allow that the designer made serious errors, wasting millions of bases of DNA on a blueprint full of junk and scribbles. Evolution, however, can explain them easily. Pseudogenes are nothing more than chance experiments in gene duplication that have failed, and they persist in the genome as evolutionary remnants…

Obviously, Dr. Miller didn’t understand intelligent design or creationism when he wrote that, as they can both explain nonfunctional pseudogenes. Before I discuss that, however, I need to point out that since 1994, functions have been found for certain pseudogenes. As far as I can tell, the first definitive evidence for function in a pseudogene came in 2003, when Shinji Hirotsune and colleagues found that a specific pseudogene was involved in regulating the functional gene that it resembles [2]. Since then, functions for several other pseudogenes have been found. In fact, a recent paper in RNA Biology suggests that the use of pseudogenes as regulatory agents is “widespread” [3].

Even though functions have been found for many pseudogenes, the question remains: Are most pseudogenes functional, or are most of them non-functional? Well, based on the ENCODE results, we might have the answer. While the ENCODE results indicate that the vast majority of the genome is functional, they also indicate that the vast majority of pseudogenes are, in fact, non-functional.

While this result might sound counter-intuitive, it seems to be borne out by the data. First, let’s look at the numbers. The researchers found more than 12,000 sequences in human DNA that might be reasonably called pseudogenes. Based on several lines of analysis, they suggest that there might be as many as 14,112 pseudogenes in the human genome. However, identification of pseudogenes can be complicated, so in the end, they say that they are only confident in the identification of 11,216 actual pseudogenes. They call this group their “survey set.”

As I discussed previously, the ENCODE team assumes that any sequence of DNA that is transcribed by the cell is functional. After all, the process of transcription consumes a lot of resources and energy, and it is hard to believe that the cell would waste it all on DNA sequences that aren’t used. So this part of the ENCODE team specifically wanted to see how many of the pseudogenes they identified were actually transcribed. The answer is only 876 of them. Out of all 11,216 pseudogenes that they confidently identified, fewer than 8% (about 7.8%) are actually transcribed. Based on their definition of functional, then, fewer than 8% of pseudogenes in the human genome are functional [4].
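For readers who want to check the arithmetic, here is a minimal sketch in Python. It uses only the two counts quoted above; the variable and function names are my own, chosen for illustration, and are not taken from the paper. It also works out how many transcribed pseudogenes it would take to reach a bare majority of the survey set.

```python
# Counts quoted in the post (from the GENCODE pseudogene paper, reference 4).
SURVEY_SET = 11_216   # pseudogenes the authors confidently identified
TRANSCRIBED = 876     # pseudogenes found to be transcribed


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Share of the survey set that is transcribed.
transcribed_pct = percent(TRANSCRIBED, SURVEY_SET)

# Smallest number of transcribed pseudogenes that would be a majority.
majority_needed = SURVEY_SET // 2 + 1

print(f"Transcribed: {transcribed_pct:.2f}% of the survey set")
print(f"Needed for a majority: {majority_needed} pseudogenes")
```

The transcribed count would have to grow more than sixfold before most of the survey set counted as functional under the ENCODE definition.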

Now the authors are quick to point out that their identification of transcribed pseudogenes is conservative, so there may be many more. Also, as I pointed out previously, the ENCODE team has not studied all cell types in the human body, so some cell types that have not yet been studied might use pseudogenes that the ENCODE results currently indicate are non-functional. However, based on these results, it is hard to believe that we will ever see the percentage of functional pseudogenes increase to anywhere near 51%. As a result, I think the ENCODE data indicate that most pseudogenes are, indeed, non-functional.

So let’s now revisit Dr. Miller’s statement. He claims that intelligent design can’t explain non-functional pseudogenes. Of course, that’s just not true. In fact, intelligent design would predict the existence of pseudogenes. After all, we know that mutations occur in the genome and often, those mutations result in the destruction of genetic information. Thus, it is not surprising that some genes have been “broken” as a result of mutation. Creationism would also expect pseudogenes, since creationism says that after God created the world and the life in it, there was a terrible event called the Fall, and creation groans as a result (Romans 8:22). Thus, it is not surprising that over time, the Fall has resulted in the destruction of some genes.

What both intelligent design and creationism predict is that the vast majority of the genome is functional, because if too much damage is done to an intricately-designed system, it simply won’t work anymore. So if the vast majority of the genome is non-functional, that would be difficult (if not impossible) for intelligent design or creationism to explain. Of course, the ENCODE results indicate that this is not an issue, since they find evidence that human cells use more than 80% of the genome.

In the end, then, the ENCODE data tell us that while the majority of the human genome is probably functional, the majority of pseudogenes are probably not. This tells me something important. Most likely, some DNA sequences that have been identified as pseudogenes are not broken versions of functional genes. Instead, they are most likely regulatory elements that were designed into the genome. At the same time, however, most of what have been identified as pseudogenes are, indeed, broken genes, and they do appear to be useless. Obviously, this conclusion is subject to change based on new information, but it does seem to be what the ENCODE data are telling us.


1. Kenneth R. Miller, “Life’s Grand Design,” Technology Review 97(2):24-32, 1994.

2. Shinji Hirotsune et al., “An expressed pseudogene regulates the messenger-RNA stability of its homologous coding gene,” Nature 423:91-96, 2003.

3. Yan-Zi Wen, Ling-Ling Zheng, Liang-Hu Qu, Francisco J. Ayala and Zhao-Rong Lun, “Pseudogenes are not pseudo any more,” RNA Biology 9(1):27-32, 2012.

4. Baikang Pei et al., “The GENCODE pseudogene resource,” Genome Biology 13:R51, 2012.

25 thoughts on “The ENCODE Data and Pseudogenes”

  1. Over and over again, on your articles regarding the ENCODE project and this one, you mention the fact that cells wouldn’t waste energy on DNA sequences that aren’t used.

    My question is, how does the cell “know” whether or not its components are going to be used? Is this natural selection in action?

    1. Seth, the cell is filled with feedback systems, telling it when too much of one protein is made or too little of another protein is made. I would think if a bunch of useless RNA transcripts were floating around the cell, similar feedback systems would end up sending information to the nucleus telling it to turn off transcription for those DNA sequences. Since this would presumably happen in all cells, including germline cells, that would stop transcription in future generations as well. I think these pseudogene results support that idea. Since less than 8% of pseudogenes are transcribed, that tells me the cell is good at identifying broken genes and shutting off their transcription.

      So yes, I think this is natural selection in action. Any cell that doesn’t turn off the transcription of useless DNA sequences will be at a severe disadvantage compared to a cell that does turn off such transcription. As a result, the cells that transcribe useless DNA sequences tend to die off, and the ones that don’t transcribe those sequences tend to survive.

  2. You can’t have it both ways, Jay. ID and creationism did not predict pseudogenes or “junk” DNA. They predicted that it would not be there and according to your post above, they were wrong. It seems that you guys may have started the party too early when ENCODE first reported their findings. Now who’s making the just-so stories for their theories? Original sin isn’t a scientific concept to explain things. Time to convert to Darwinism, Jay.

    1. JLAfan2001, intelligent design did, indeed, predict nonfunctional pseudogenes. Here is a quote from an article on the intelligent design website Uncommon Descent that suggested many pseudogenes might be nonfunctional before the ENCODE data were revealed:

      What if the ‘compiled’ genome contains whole code-libraries that have been linked in, but not all functions are used, and some were deactivated? Wouldn’t that be a natural explanation for true pseudogenes, that is pseudogenes that really are totally non-functional?

      As Dr. Paul Nesselroade tells us, the attempts by Darwinists to explain the existence of functional pseudogenes is a classic example of an ad hoc modification of a hypothesis specifically so the hypothesis can have it both ways:

      Without batting an eye, evolutionary theory has been modified to fit the data. Whereas prior to this research only pseudogene non-functionality was expected by evolution, now we see that pseudogene functionality is also perfectly compatible. This is a good example of the main point in last month’s essay, “Betting on All the Horses.” The horse of pseudogene functionality has been seamlessly added to the Darwinian stable. Now, according to the new model, pseudogenes are expected to be non-functional except, well… except when they’re not!

      If any group is trying to “have it both ways” when it comes to pseudogenes, it is the Darwinists.

      You say that original sin isn’t a scientific concept to explain things. However, if original sin is a fact, it must be incorporated into any model of nature. This is why the anti-creationist movement is blatantly anti-science. By artificially excluding certain possibilities that its proponents don’t want to be true, it stifles serious scientific inquiry.

      Also, if you think that the concept of original sin is not a scientific concept, you need to explain why there are serious historians, like Dr. Peter Harrison of Oxford University, who argue that the concept was central to the development of modern science:

      Peter Harrison provides an account of the religious foundations of scientific knowledge. He shows how the approaches to the study of nature that emerged in the sixteenth and seventeenth centuries were directly informed by theological discussions about the Fall of Man and the extent to which the mind and the senses had been damaged by that primeval event. Scientific methods, he suggests, were originally devised as techniques for ameliorating the cognitive damage wrought by human sin. At its inception, modern science was conceptualized as a means of recapturing the knowledge of nature that Adam had once possessed. Contrary to a widespread view that sees science emerging in conflict with religion, Harrison argues that theological considerations were of vital importance in the framing of the scientific method.

  3. Jay

    I’m confused. You said that ID predicted the presence of non-functional pseudogenes and yet the people at UD, Evolution News and Reasons to Believe were happy when ENCODE said that they do have a function. Look at “The myth of Junk DNA” by Jonathan Wells. The Darwinists were up in arms with ENCODE because of these results and now you are saying it was actually the other way around? The Darwinists are now trying to explain the presence of non-functional when they predicted functional? In fact look at the quote you posted from Ken Miller. He says right there that pseudogenes are present so according to your answer Darwinists and ID were on the same page to begin with. Something seems out of kilter to me.

    1. JLAfan2001, it seems out of kilter to you because you obviously haven’t followed the ID position. The ID position has never been that all pseudogenes are functional. Indeed, as the quote I gave you showed, ID has never had a problem with functionless pseudogenes. All the ID advocates have done is point out time and time again that many of the DNA sequences that Darwinists have definitively pronounced as “junk” actually have function. Since some pseudogenes do have function, ID advocates have pointed that out.

      Also, the people at UD, ENV, and Reasons to Believe are happy because ENCODE said that more than 80% of the human genome is functional. This includes some functional pseudogenes. However, it is not restricted to pseudogenes.

      I think you didn’t read my response very well. I did not claim that Darwinists “are now trying to explain the presence of non-functional when they predicted functional.” Instead, I am saying that Darwinists did not predict functional pseudogenes at all. Thus, when functional pseudogenes were found, Darwinists had to insert an ad hoc explanation for them. That’s the point of the article I posted. If anyone is trying to “have it both ways” when it comes to pseudogenes, it is the Darwinists.

  4. Let’s use an example of a car. The Darwinists are saying that all four tires on the car are flat. ID says we don’t have a problem with that but we don’t think that all the tires will be flat. ENCODE says that one of the four tires is fine and the engine works too. ID says “See, we win. Not ALL the tires are flat”! The Darwinists ask “how did the other three tires come to be flat if the designer is so smart?” ID then says because of misuse through a supernatural occurrence that science can’t currently or has ever been able to detect empirically. Keep in mind that Creationism can say it’s original sin but ID can’t, because that would mean identifying the designer, and ID posits that it doesn’t know who the designer is.

    1. But JLAfan2001, that’s not what ID says at all. ID doesn’t say that the functionless pseudogenes are the result of “misuse through a supernatural occurrence that science can’t currently or has ever been able to detect empirically.” As the link in my first reply demonstrates, ID proponents think that even the functionless pseudogenes are “whole code-libraries” of useful information that are a part of the genome’s initial programming.

      I agree with you that creationism’s explanation for functionless pseudogenes is much more compelling than ID’s explanation. However, that doesn’t take away from the fact that ID had an explanation for functionless and functional pseudogenes, while Darwinists had to add an ad hoc explanation to get around the fact that there are some functional pseudogenes.

      Once again, then, if anyone is trying to “have it both ways” with pseudogenes, it is the Darwinists.

  5. “As the link in my first reply demonstrates, ID proponents think that even the functionless pseudogenes are “whole code-libraries” of useful information that are a part of the genome’s initial programming.”

    Are you saying that the functionless pseudogenes once had a job to do in the forming of the gene but that job was done when the gene was formed and hence became functionless?

    BTW, thank you for replying to my posts. I apologize for the tone of my initial post. I’m not really a Darwinist as my messages would imply. I’m just a struggling Christian who is trying to come to terms with the evidence for evolution. I thought that the ENCODE results were a blow to Darwinism. When I read this post, it literally made me sick to my stomach because I thought you were saying that it turns out ID was wrong on the ENCODE findings.

    1. JLAfan2001, that’s not exactly what the ID people are saying. They view the genome as a large computer program that was built to deal with many eventualities. Well, imagine designing a computer code like that. If you did, there would probably be a lot of “modules” of code that would be available for the program to use, if it encountered data that needed the code for analysis. However, if it didn’t encounter such data, it wouldn’t use those modules of code. Now…apply that thinking to the genome. Pseudogenes represent those “modules” of instructions. That’s why they look like genes. In certain situations, the genome will use them. That’s why some are transcribed. However, unless the specific situation arises, many of those modules may be entirely unused. That’s why many are not transcribed. So it’s not that they are completely functionless. They can become functional, but only if certain conditions are met. Obviously, within the scope of the ENCODE experiments, the conditions were not met, so they showed up as being nonfunctional. As I said, I don’t think that view is as compelling as a creationist view where you would expect some genetic deterioration after the Fall. Nevertheless, the explanation was in place before the ENCODE data came out, so it doesn’t represent the ad hoc kind of reasoning that Darwinists are forced to use all the time.

      The ENCODE results most certainly are a blow to Darwinism. As I said in my initial posts on the subject, evolution not only predicted that the vast majority of the genome is junk, it really depends on the idea. Remember, the “gold standard” computer simulation of evolution requires that 85% of the simulated genome is junk. Now that ENCODE has shown that the cell uses more than 80% of the genome, we know that such predictions and requirements are scientifically indefensible. That is a huge blow to Darwinism. Pseudogenes are a real side issue. They account for well under 1% of the genome, and the fact that many are not functional is easy to understand in the context of both ID and creationism.

      No need to apologize for your tone. I get a lot worse! I find that most people who write nasty comments are writing as a result of emotion, not rational thinking. Many Darwinists, for example, can’t stand the fact that there is so much scientific evidence against evolution and for creation. As a result, they get really cranky when people like me point that out!

      I do have a question for you, though. You say that you are “trying to come to terms with the evidence for evolution.” As a scientist, I find very little evidence for evolution and a large amount of evidence against it. Thus, I find it very hard to believe in evolution on scientific grounds. Could you tell me what evidence you are trying to “come to terms with?”

  6. I started looking into evolution about a year ago, when I kept hearing that evolution is a fact even though I had always thought it was just a “theory”. I found the evidence to be compelling: the fossil record of human, whale and horse evolution, the findings of Archaeopteryx and Tiktaalik, homology, embryonic research, chromosome 2 fusion, genetic similarities in all species, biogeography and the geological record of simple to complex lifeforms, not to mention that it is widely accepted among Christian and non-Christian scientists. I know that some of the evidence has been refuted, like Haeckel’s drawings, Piltdown Man, Nebraska Man and the dark moths, but all the other stuff I mentioned can’t all be wrong. I accept that species change and adapt over time, but the evidence of common ancestry is quite vast.

    1. JLAfan2001, I would strongly encourage you to investigate these issues more thoroughly, as they provide virtually no evidence for evolution. In fact, they mostly show how difficult it is to make evolution compatible with the data. For example, here is an excellent discussion of whale evolution, showing how it is not possible, even in evolutionary terms. In addition, recent fossil evidence has thrown a real monkey wrench into the whale evolution story. As this article and the book it references show, human evolution has virtually no evidence supporting it. As this article shows, horse evolution is not what it is presented to be. As this article shows, Archaeopteryx is almost certainly not an intermediate between reptiles and birds. As this article shows, Tiktaalik also has several problems as an intermediate between fish and amphibians. As I have pointed out, homology suffers from the severe problem of being evidence for evolution when it can be reconciled with evolution and being the result of sheer coincidence when it cannot be reconciled with evolution. As I have written (here and here) embryonic research provides no evidence for evolution. As this article shows, the chromosome fusion argument is not nearly as clear-cut as what it is generally made out to be. I have also already pointed out that genetic comparisons have shown how inconsistent the evolutionary view is. I don’t find anything compelling just because a lot of people accept it. A lot of people accepted spontaneous generation for a long, long time, and they were all shown to be wrong.

      Like you, I accept that species can adapt and change over time. The evidence for that is, indeed, vast. However, the evidence for common ancestry is partial and very contradictory. Please understand that I don’t find evolution to be inconsistent with Christianity. I just find it to be inconsistent with modern science. If you come to a different conclusion from me, it won’t bother me. I just ask that you read both sides to see who has the stronger arguments. To me, the answer is rather obvious.

  7. JLAfan2001, I for one really appreciate your honesty. I also find myself stuck in the middle trying to make sense of it all at times.

    Dr Wile is great, I find he always has a well researched, easy to understand answer to all the difficult questions.

    Good luck with your research friend.

  8. Since nothing was killed prior to the fall, perhaps some of these pseudogenes are the remnants of the genes that allowed carnivores like lions to be vegetarians.

    1. That’s a good thought, Evan. I am not convinced that no animal was killed prior to the Fall. The Bible says that no people died before the Fall, and it also says that plants died before the Fall. Whether or not animals died before the Fall isn’t really spelled out in the Bible. It is certainly possible, of course. We just don’t know. It certainly is possible that the pseudogenes had some function before the Fall and have since mostly lost their function.

  9. Jay you are putting it much too lightly when you say that Creationism predicts broken genes. It flat out DEMANDS that our genetic machinery have breaks in function

    As every Christian knows we MUST have broken machinery. It is REQUIRED by the doctrine of the fall of man and of sin. It is the position of the Bible that genes NEVER wore down and man would never die before the fall AND that we INHERIT death/weakness both spiritually and physically so some of that defect HAS TO BE in the human genes. It must be or a central part of Biblical teaching would be a complete lie.

    1. Mike, the doctrine of the Fall is critical to Christian theology, I agree. However, exactly how the Fall causes creation to groan is at least somewhat speculative. Thus, saying that it must come in the form of broken genes is probably a bit strong. Certainly, broken genes make sense in the context of the Fall, but human death and suffering could come from a source other than broken genes. To definitively say that the Fall must be manifest in broken genes is going a bit beyond what we know, in my opinion.

  10. I respect your disagreement, but human death is clearly inherited through Adam biblically, and genes, as far as we know now, are a major source of inheritance. I can’t see how genes would not be involved or affected in human death resulting from the fall. That to me would be speculative. I am not saying that genes are all that is involved, but to say that the fall and sin would have no effect on the function of genes would be quite a reach.

    1. I agree, Mike. The Fall must have affected genes, but that doesn’t mean it caused us to have broken genes. It might have affected them in some other way to produce human death. That’s why I said that it is too strong to say that “broken genes” are required by the Fall. Perhaps “corrupted genes” would be a better way to phrase it.

  11. Fair enough. I get your point. By broken I meant more “not functioning as originally designed” and was not being scientifically precise, so that was a bit misleading, particularly in the present context. What I was clumsily stating was that creationists do expect to find flaws at the genetic level, so atheists pointing to them as evidence against design or creation are not really making a strong point, and that it would potentially create another set of theological problems if we found no evidence for the effect of sin at that level.

    As a nonscientist but being aware of the debate I never ever thought as JLA seems to believe that the big issue in the pseudogenes debate was their existence or non existence but more about the shared ones between Man and Ape. That to me seemed a far stronger argument for common descent but it is my understanding that quite a bit of those have function.

    As this is my first time commenting on your blog, I did want to say that there are two science stops I make every week: yours and Evolution Creation news. You read about things in both places that you would never hear anywhere else. As a father who never had the privilege of having you teach my children, I came across your blog a few months back, having never heard of you before, and it’s been quite a pleasure.

    1. Thank you, Mike. I agree that the “shared mistakes” in certain pseudogenes probably are the best evidence for common descent. However, you have to realize that there are shared mistakes in pseudogenes that are not consistent with the evolutionary story. When that happens, evolutionists simply state that those specific shared mistakes are not the result of common ancestry. Instead, there are multiple origins for those specific shared mistakes. In the end, then, even the strongest evolutionary argument is rather weak.

  12. Hey Dr. Wile,

    I actually did your science curriculum in my high school years and am now studying Biomedical Science in preparation for medicine. I just wanted to thank you for your blog and for your science books. In a world that is so staunchly evolutionist, it is totally refreshing to read science articles that present both sides of the argument and give all the information. Thank you for the amazing impact that you are having in the science community.

    God bless.

  13. I guess creation science can’t be all that bad if it’s contributed to someone studying Biomedical Science in preparation for medicine. Wouldn’t this kinda refute Bill Nye’s argument in a way?

    1. JLAfan2001, it certainly does, and there are many, many such examples. Creationism is very good for science, on many levels. I have written about this a couple of times before (here and here, for example).
