Groundbreaking DNA drive saves data in the building blocks of life

Collaborating with the University of Washington, Microsoft has demonstrated a system that can store information in synthetic DNA and then retrieve the data afterward. The proof-of-concept device is among the first of its kind and can automatically encode data into a DNA sequence and then convert the information back to ones and zeros again so it's readable by digital machines.

Scientists have long considered the potential for DNA to be used as a storage medium, and various projects have been launched in recent years. The density of information that can fit in a strand of DNA is unmatched by anything offered by man-made technologies. It's said that a single gram of DNA can hold roughly a billion terabytes, or one zettabyte, of data.

For reference, Cisco estimates that with the rollout of 5G networks, mobile Internet traffic from more than 12 billion devices will approach one zettabyte in 2022.

The machine created by Microsoft and UW encoded only the word "hello" into the manufactured DNA, which is far less data than previous demonstrations have stored, but the researchers note that this latest project involves a fully automated process. "Our device encodes data into a DNA sequence," the team explains, "which is then written to a DNA oligonucleotide using a custom DNA synthesizer, pooled for liquid storage, and read using a nanopore sequencer and a novel, minimal preparation protocol."
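To give a rough sense of what "encoding data into a DNA sequence" can mean, here is a minimal Python sketch. It is not the Microsoft and UW scheme, which the article does not describe. It assumes a simple, hypothetical mapping of two bits to each of the four bases (A, C, G, T), and the names `encode` and `decode` are illustrative only. Real systems typically layer on error correction and avoid sequences that are hard to synthesize or read, none of which is modeled here.

```python
# Hypothetical 2-bits-per-base mapping, chosen only for illustration.
# The actual encoding used by the Microsoft/UW device is not given in the article.
BITS_TO_BASE = {0b00: "A", 0b01: "C", 0b10: "G", 0b11: "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn each byte into four bases, most significant bit pair first."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BITS_TO_BASE[(byte >> shift) & 0b11])
    return "".join(bases)


def decode(sequence: str) -> bytes:
    """Reverse encode(): every group of four bases becomes one byte."""
    if len(sequence) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(sequence), 4):
        byte = 0
        for base in sequence[i:i + 4]:
            byte = (byte << 2) | BASE_TO_BITS[base]
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    strand = encode(b"hello")
    print(strand, len(strand))
    print(decode(strand))
```

Under this assumed mapping, the five bytes of "hello" become a strand of 20 bases, and decoding the strand recovers the original bytes.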

The machine is described as having a "bench-top footprint" and has three main parts: encode/decode software, a DNA synthesis module, and another module for preparing and sequencing DNA. Together, these can write, store and read five bytes of data at a time, though Microsoft notes that the system is designed in a modular fashion so that this capacity can be expanded as the technology advances. It's constructed from parts costing around $10,000 including a mini sequencing machine, though it's anticipated that this cost will be reduced to somewhere around $3,000 to $4,000.

Previous attempts to use DNA for data storage have written greater amounts of information, but this process has generally been done by hand in a lab, which isn't feasible for real-world use cases involving large-scale data storage. Wikipedia has an interesting page chronicling much of the history behind these efforts, among them research published by George Church at Harvard University in 2012 that encoded a 53,400-word book, 11 JPG images and a JavaScript program into DNA.