DNA storage group publishes first standards for DNA-based storage — paves the way for broader use by standardizing vendor and codec data


The DNA Data Storage Alliance has released its first DNA storage specifications, which aim to standardize the storage of vendor info and CODECs, paving the way for further standardization that could help cement DNA as a viable form of data storage. DNA storage technology is still in its infancy, and the two new specifications from the DNA Data Storage Alliance, whose members include companies such as HDD- and SSD-maker Western Digital, represent the first standard capable of achieving industry-wide acceptance.

The key problem with DNA as a storage medium is that it's geared towards storing genetic data, not human-created data such as photos and videos. "DNA does not have a fixed physical structure, a built-in controller, or a way to address different regions of the media linearly," according to the DNA Data Storage Alliance. To get around this issue, the group says a new way of "booting up" DNA for storage purposes is required.

To this end, the group has developed two specifications to standardize the storage of vendor data and the CODEC inside of DNA storage: Sector Zero and Sector One. Sector Zero contains the info necessary to identify which vendor made the DNA and what CODEC (the method of converting DNA data to digital data and vice versa) is employed for encoding Sector One. In turn, Sector One includes metadata and is meant to enable the reading of actual DNA-stored data.
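
As a rough illustration of that two-step boot process, here is a minimal Python sketch. The field names (vendor_id, codec_id), the "toy-2bit" CODEC, and the registry are all hypothetical, chosen only for this example; the actual specifications define their own layouts and encodings.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Everything below is a hypothetical illustration, not the Alliance's format.

@dataclass
class SectorZero:
    vendor_id: str  # identifies which vendor made the DNA
    codec_id: str   # names the CODEC used to encode Sector One

def toy_two_bit_decode(bases: str) -> bytes:
    """Toy CODEC (assumed mapping): A=00, C=01, G=10, T=11, four bases per byte."""
    value = {"A": 0, "C": 1, "G": 2, "T": 3}
    if len(bases) % 4:
        raise ValueError("expected four bases per byte")
    out = bytearray()
    for i in range(0, len(bases), 4):
        byte = 0
        for base in bases[i:i + 4]:
            byte = (byte << 2) | value[base]
        out.append(byte)
    return bytes(out)

# Registry mapping a CODEC identifier to a decoder function.
CODECS: Dict[str, Callable[[str], bytes]] = {"toy-2bit": toy_two_bit_decode}

def boot(sector_zero: SectorZero, sector_one_bases: str) -> bytes:
    """Use Sector Zero to choose a CODEC, then decode Sector One's metadata."""
    try:
        decode = CODECS[sector_zero.codec_id]
    except KeyError:
        raise ValueError(f"unknown CODEC: {sector_zero.codec_id}") from None
    return decode(sector_one_bases)
```

The point of the split is that a reader only needs to understand Sector Zero up front; everything else, including how to read Sector One, is looked up from what Sector Zero declares.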

DNA storage technology already exists, and you can apparently even buy 1KB worth of DNA storage for $1,100. What the DNA Data Storage Alliance promises is standardization, which could be as important as the DNA storage technology itself. A standard format for reading and writing DNA data would, in theory, keep the industry from fragmenting to the point where incompatible formats become an obstacle to the adoption of DNA storage.

DNA is seen as a potentially attractive medium for data storage thanks to its molecular size. Additionally, the fact that it's made of four bases (adenine, thymine, guanine, and cytosine) could mean DNA is more efficient at storing data than binary-based storage, since each position can in principle take one of four values rather than two.
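
To make the density argument concrete, the sketch below assumes a simple mapping of two bits per base (A=00, C=01, G=10, T=11). That mapping is an illustrative assumption, not something taken from the Alliance's specifications, and practical encodings typically give up some of that density for error correction.

```python
import math

BASES = "ACGT"  # assumed mapping: index 0..3 corresponds to bits 00, 01, 10, 11

def bits_per_symbol(alphabet_size: int) -> float:
    """Upper bound on the information carried by one symbol."""
    return math.log2(alphabet_size)

def encode(data: bytes) -> str:
    """Turn each byte into four bases, most significant bit pair first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)

def decode(bases: str) -> bytes:
    """Inverse of encode(): four bases back into one byte."""
    if len(bases) % 4:
        raise ValueError("expected four bases per byte")
    out = bytearray()
    for i in range(0, len(bases), 4):
        byte = 0
        for base in bases[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)
```

Under this mapping, encode(b"hi") gives "CGGACGGC": two bytes fit in eight bases, whereas a binary medium needs sixteen positions for the same data.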

DNA isn't just being experimented with for storage but is potentially capable of being used as a processor as well. One research paper submitted to Nature, one of the most prestigious science journals, says DNA could be used to make a CPU-like processor. Another experiment was able to create a DNA processor capable of storage as well.

Matthew Connatser

Matthew Connatser is a freelance writer for Tom's Hardware US. He writes articles about CPUs, GPUs, SSDs, and computers in general.

  • Kamen Rider Blade
    1 KiB @ $1,100.

    We're a long way from mass consumer adoption then.
  • usertests
    Kamen Rider Blade said:
    1 KiB @ $1,100.

    We're a long way from mass consumer adoption then.
    You should see the read speeds.

    I think this could catch on for certain companies that want very cold storage, since DNA could last for centuries depending on storage conditions. Way longer than tapes, HDDs, or current optical storage. Sufficient error correction and redundancy can ensure that it's usable even after the sequence degrades in random places.

    https://www.sciencefocus.com/the-human-body/how-long-does-dna-last
    https://slate.com/technology/2013/02/dna-testing-richard-iii-how-long-does-dna-last.html
  • slightnitpick
    usertests said:
    I think this could catch on for certain companies that want very cold storage, since DNA could last for centuries depending on storage conditions.
    Proteins undergoing the Maillard reaction have so far lasted half a billion years in fossilized samples. There's a loss of fidelity, but this loss is known and can be compensated for.

    Re: the DNA, though. As this DNA probably doesn't need to be propagated, we aren't limited to just the canonical nucleotides, but can use a variety of nucleoside analogs in easily synthesizable single-stranded DNA, or unnatural base pairs in double-stranded DNA.

    Sufficient error correction and redundancy can ensure that it's usable even after the sequence degrades in random places.

    And of course not all degradation is random. So certain things would become less readable over time than others.

    Kamen Rider Blade said:
    1 KiB @ $1,100.

    We're a long way from mass consumer adoption then.
    A big issue with those costs is the sheer redundancy in the "bytes" used to store the data. They use 8 base pairs, capable of encoding 16 bits of data, to encode each byte of 8 bits of data. As I mentioned in the earlier article, I think this might be because they're encoding Unicode, not just ASCII. I wonder if the initial standard will include compression built in?

    Regardless of anything, whatever this standard is going to be, I hope it doesn't lock things in too much for further standards. And I bet there will still be specialized solutions that don't use the standards. I can see uses for massively parallel reading and writing (or dispensing) of shorter 10 - 20 nucleotide DNA strands in very large chip arrays. 100 million wells on a chip could allow parallel sequencing of 1 - 2 billion bases simultaneously. With nucleoside analogs this could yield on the order of a gigabyte of storage per chip which could be read in less than a day with adapted current technology, and hopefully much faster with future technology.
  • duffer9999
    Kamen Rider Blade said:
    1 KiB @ $1,100.

    We're a long way from mass consumer adoption then.
    So a 1TB drive would run about a trillion dollars.