The genetic code is degenerate, but that doesn’t mean it is immoral or corrupt. In fact, in the case of the genetic code, degeneracy is a good thing! Let me explain. One of DNA’s jobs is to tell the cell what proteins to make and how to make them. As a result, it stores “recipes” for proteins, and we call those recipes genes. Well, a protein is produced when smaller chemicals, called amino acids, are linked together in long chains that then fold into intricate shapes. So in order to tell a cell how to make a protein, a gene needs to list a string of amino acids. If the cell puts those amino acids together in the order specified by the gene, the correct protein can then be produced.
How does a gene list the amino acids? As shown in the illustration above, it does so by using the four nucleotide bases known as cytosine (C), guanine (G), thymine (T), and adenine (A). A group of three nucleotide bases codes for a specific amino acid. For example, when a gene has three thymines in a row (TTT), this means “use the amino acid called lysine.” When it has three guanines in a row (GGG), it means “use the amino acid called proline.” So by grouping its four nucleotide bases three at a time, a gene specifies which amino acid should be used in building a protein.
Here’s the catch: There are only 20 amino acids in the standard proteins of life. As a result, there need to be only 20 codes to specify them. However, there are 64 possible ways you can group four nucleotide bases three at a time. Thus, there are 64 different possibilities for how a gene can specify an amino acid, but there are only 20 amino acids the gene needs to specify. As a result, most amino acids are specified by more than one set of three nucleotide bases. As I said above, a sequence of three thymines (TTT) means “use the amino acid called lysine.” However, two thymines followed by a cytosine (TTC) means the same thing. This is why we say the genetic code is degenerate. It has multiple ways it can specify most amino acids.
Why is the genetic code degenerate? There is one obvious reason. If the genetic code used only 20 of its 64 possibilities*, the remaining ones would be meaningless. As a result, if a mutation caused a change in one three-nucleotide-base sequence, the result would most likely be something meaningless, which would terminate the production of the protein. To avoid this, all combinations are used. That way mutations don’t stop the protein from being produced. It might change the amino acid that is used, which might change the protein a bit, but at least the protein would still be produced. As a result, there would be a chance for it to continue to do its job. As one biochemistry textbook succinctly puts it:1
Thus, degeneracy minimizes the deleterious effects of mutations.
For many years, scientists have thought that was the only reason for DNA’s degeneracy. However, new research has uncovered a completely different reason!
A team made up of two Harvard biologists and one biologist from the University of Chicago decided to specifically look at the differences in how quickly proteins are made when different sequences are used for the same amino acid. The results were surprising, to say the least. When there was an ample supply of a given amino acid, the rate at which a protein was produced was not strongly affected by which specific sequence was used to code for that amino acid. However, when the amino acid was in short supply, the rate of protein production changed dramatically. Genes that used one sequence to specify the amino acid ended up producing almost no protein at all, while genes that used a different sequence for the same amino acid ended up producing proteins up to 100 times faster!
This seems to indicate that there is a hierarchy of importance among the different sequences that code for the same amino acid. Some sequences seem to be used in “low-priority” proteins that are simply not made if the amino acid is in short supply. Other sequences seem to be used in “high-priority” proteins that are made regardless of how much amino acid is available. The authors say that this lifts the degeneracy of the genetic code, showing that different sequences for the same amino acid are not necessarily the same.
Why would the code be built this way? Here’s what the authors suggest:2
Therefore lifting the degeneracy of the genetic code might emerge as a general strategy for biological systems to expand their repertoire of responses to environmental perturbations.
In other words, this might be a way for the cell to know what proteins must be made and what proteins can be ignored when the supplies for making that protein run low. If this is true, it represents yet another level of information that is stored in the genetic code. As I wrote previously, the information storage capacity of DNA is already mind-blowing. This research suggests it might be even more mind-blowing than previously thought.
The more I study science, the more amazed I am at the handiwork of the Creator!
1. John L. Tymoczko, Jeremy M. Berg, and Lubert Stryer, Biochemistry: A Short Course, Second Edition, W. H. Freeman, 2013, p. 676
Return to Text
2. Arvind R. Subramaniam, Tao Pan, and Philippe Cluzela, “Environmental perturbations lift the degeneracy of the genetic code to regulate protein levels in bacteria,” Proceedings of the National Academy of Sciences of the United States of America doi:10.1073/pnas.1211077110, 2012
Return to Text
* Please note that there are really only 61 possibilities for amino acid codes, because three of the possibilities (ATC, ATT, and ACT) are used to tell the cell that the sequence is finished.
Return to Text