Everyone has heard of DNA, but many don’t appreciate its marvelous design. It stores all the information an organism needs to make proteins, regulate how they are made, and control how they are used. It does this by coding biological information in sequences of four nucleotide bases: adenine (A), thymine (T), guanine (G), and cytosine (C). The nucleotide bases link to one another in order to hold DNA’s familiar double-helix structure together. A can only link to T, and C can only link to G. As a result, the two linking nucleotide bases are often called a base pair. DNA’s ingenious design allows it to store information in these base pairs more efficiently than any piece of human technology that has ever been devised.
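To make the pairing rule concrete, here is a minimal Python sketch. The names `PAIRS` and `complement` are my own, chosen for illustration, and the sketch ignores the fact that the two strands of a real double helix run in opposite directions.

```python
# Pairing rule from the text: A links only to T, and C links only to G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand):
    """Return the bases that would pair with each base in `strand`.

    Simplification: strand direction is ignored.
    """
    try:
        return "".join(PAIRS[base] for base in strand)
    except KeyError as err:
        raise ValueError(f"not a DNA base: {err.args[0]!r}") from None


if __name__ == "__main__":
    print(complement("ATGC"))
```

Because each base has exactly one partner, applying `complement` twice returns the original strand.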
What you might not realize is that pretty much any information can be stored in DNA. While the information necessary for life involves the production, use, and regulation of proteins, DNA is such a wonderfully designed storage system that it can efficiently store almost any kind of data. A scientist recently demonstrated this by storing his own book (which contained words, illustrations, and a JavaScript program) in the form of DNA.[1]
The way he and his colleagues did this was very clever. They took the digital version of their book, which was 5.27 megabits of 1s and 0s, and used it as a template for producing strands of DNA. Every time there was a “1” in the digital version of the book, they added a guanine (G) or a thymine (T) to the DNA strand. Every time the digital version of the book had a “0,” they added an adenine (A) or a cytosine (C). Unfortunately, human technology cannot come close to matching the incredible design of even the simplest living organism. While living organisms can produce DNA that is billions of base pairs long, human technology can produce only short strands of DNA.[2] So while a single-celled organism could have produced one strand of DNA that contained the entire book (and then some), the scientists had to use 54,898 small strands of DNA to store the entire book.
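The bit-to-base rule is simple enough to sketch in a few lines of Python. This is only an illustration under my own assumptions, not the authors' actual procedure: the function names are hypothetical, the text doesn't say how they chose between the two bases allowed for each bit (this sketch picks at random), and it doesn't split the result into short strands.

```python
import random

ONE_BASES = "GT"   # a 1 bit becomes G or T
ZERO_BASES = "AC"  # a 0 bit becomes A or C


def text_to_bits(text):
    """Turn text into a string of 1s and 0s (8 bits per UTF-8 byte)."""
    return "".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def bits_to_text(bits):
    """Inverse of text_to_bits."""
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


def encode_bits(bits, rng):
    """Map each bit to one base. Assumption: the choice between the
    two allowed bases is random, which the source does not specify."""
    bases = []
    for bit in bits:
        if bit == "1":
            bases.append(rng.choice(ONE_BASES))
        elif bit == "0":
            bases.append(rng.choice(ZERO_BASES))
        else:
            raise ValueError(f"not a bit: {bit!r}")
    return "".join(bases)


def decode_bases(strand):
    """Map each base back to its bit: G or T gives 1, A or C gives 0."""
    bits = []
    for base in strand:
        if base in ONE_BASES:
            bits.append("1")
        elif base in ZERO_BASES:
            bits.append("0")
        else:
            raise ValueError(f"not a DNA base: {base!r}")
    return "".join(bits)


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so repeated runs give the same strand
    strand = encode_bits(text_to_bits("DNA"), rng)
    print(strand)
    print(bits_to_text(decode_bases(strand)))
```

Notice that each base carries only one bit here, even though four possible bases could in principle carry two. Many different strands decode to the same bits, which leaves whoever builds the strand free to choose between two bases at every position.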
Of course, just storing the information in DNA form wasn’t enough. In order to show that the data storage actually worked, they used a completely separate process to read the DNA, convert the information in the DNA back into 1s and 0s, and produce a new copy of the book. The entire process worked incredibly well, but it was very slow. It took about two weeks to store the book in DNA form and then read the DNA back to reproduce the book. Nevertheless, it demonstrated that DNA can store pretty much any kind of information, and that it is incredibly efficient at doing so.
The entire book was stored in less than a trillionth of a gram of DNA, leading the authors to state that the theoretical limit of DNA’s storage capabilities is 455 exabytes per gram. In case you aren’t familiar with the term, an exabyte is a billion gigabytes. Think about that for a moment. I am incredibly impressed with my little flash drive that can hold 16 gigabytes of information. At that density, DNA could store more than a billion times that amount in a fraction of the flash drive’s mass. In fact, if the authors’ estimate is correct, all the information produced in the entire world over the course of a year could be stored in a mere 4 grams of DNA!
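If you want to check that arithmetic, here is a short Python sketch that uses only the figures quoted above. It assumes decimal units (1 exabyte = 10^18 bytes, 1 gigabyte = 10^9 bytes) and 8 bits per byte; the variable names are mine.

```python
# Figures quoted in the text.
DENSITY_EB_PER_GRAM = 455   # authors' theoretical limit
FLASH_DRIVE_GB = 16         # my little flash drive
BOOK_MEGABITS = 5.27        # size of the digital book
WORLD_DATA_GRAMS = 4        # DNA needed for a year of the world's data

# Unit assumptions (decimal prefixes).
GB_PER_EB = 1e9
BYTES_PER_EB = 1e18

# How many 16-gigabyte flash drives fit in one gram of DNA?
flash_drives_per_gram = DENSITY_EB_PER_GRAM * GB_PER_EB / FLASH_DRIVE_GB

# How much DNA would the book need at the theoretical limit?
book_bytes = BOOK_MEGABITS * 1e6 / 8
book_grams = book_bytes / (DENSITY_EB_PER_GRAM * BYTES_PER_EB)

# What yearly world data volume does "4 grams" imply?
world_data_eb = WORLD_DATA_GRAMS * DENSITY_EB_PER_GRAM

if __name__ == "__main__":
    print(f"flash drives per gram: {flash_drives_per_gram:.3g}")
    print(f"grams for the book:    {book_grams:.3g}")
    print(f"implied world data:    {world_data_eb} exabytes per year")
```

The numbers hang together: one gram works out to tens of billions of flash drives, the book needs far less than a trillionth of a gram, and 4 grams corresponds to 1,820 exabytes of data per year.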
It’s no wonder the authors wanted to show that their storage and retrieval process could work. If we ever get to the point where human technology can produce and read DNA at even a fraction of the speed that a single-celled organism can, the information storage possibilities will be mind-blowing! As the authors state:
DNA is particularly suitable for immutable, high-latency, sequential access applications such as archival storage. Density, stability, and energy efficiency are all potential advantages of DNA storage…
Indeed. The design of DNA is truly astounding. It only makes sense that we should try to use it in our own technology, which is primitive by comparison.
2. Daniel G. Gibson, et al., “Complete Chemical Synthesis, Assembly, and Cloning of a Mycoplasma genitalium Genome,” Science 319:1215-1220, 2008.