AI breakthrough claimed to make DNA data retrieval 3,200x faster with better accuracy, but still slower than standard storage
Speed improvement delivers a big step in the right direction.
 
Storing digital content in DNA is an emerging technology that takes advantage of the molecule's density, durability, and low power needs. DNA can last for generations, whereas NAND flash and HDDs degrade within years or, at best, decades. It also offers a data density about 100 million times higher than that of typical data storage systems. However, retrieving data from DNA storage is a complex and rather slow process. The good news is that a breakthrough by Israeli researchers speeds up retrieval by a factor of 3,200, reports TechXplore.
A team at Technion – Israel Institute of Technology has created an AI tool that makes retrieving digital information stored in DNA dramatically faster and more accurate. The system, named DNAformer, is 3,200 times faster than the most accurate previous methods and is claimed to deliver excellent results, showing promise for efficient, large-scale data storage using biological material. It is still too slow for the commercial market, but the researchers believe that they are moving in the right direction.
  
The new approach processes 100MB in about 10 minutes, compared with several days for the best current methods. In a test involving 3.1MB of data, the tool handled several types of content: a color still image, a short sound recording of Neil Armstrong on the Moon, a text about DNA's storage advantages, and randomly generated data mimicking encrypted or compressed files.
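To put those figures in perspective, the short Python sketch below works out the average throughput and the time a 3,200x slower method would need. It assumes the 3,200x factor applies to the same 100MB workload, which the report does not state explicitly; the function names are illustrative only.

```python
# Figures quoted in the article.
DATA_MB = 100    # megabytes retrieved
TIME_MIN = 10    # minutes reported for DNAformer
SPEEDUP = 3200   # claimed speedup over the most accurate previous methods


def throughput_mb_per_s(data_mb: float, minutes: float) -> float:
    """Average retrieval throughput in megabytes per second."""
    return data_mb / (minutes * 60)


def implied_baseline_days(minutes: float, speedup: float) -> float:
    """Days a method `speedup` times slower would need for the same job.

    Assumption: the speedup factor applies to this exact workload.
    """
    return minutes * speedup / (60 * 24)


if __name__ == "__main__":
    rate = throughput_mb_per_s(DATA_MB, TIME_MIN)
    days = implied_baseline_days(TIME_MIN, SPEEDUP)
    print(f"DNAformer throughput: {rate:.3f} MB/s")
    print(f"Implied baseline time: {days:.1f} days")
```

At roughly 0.17MB/s, the quoted rate makes it clear why the headline still describes DNA retrieval as slower than standard storage.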
To store data, customized DNA molecules are synthesized. Reading the information back requires sequencing, which introduces errors such as deletions and substitutions and returns unordered, sometimes corrupted copies of the data. DNAformer handles these issues using algorithms that identify correct patterns from flawed inputs. The model includes tailored error-correcting codes and a safety layer that detects highly noisy sequences. Specialized tools then clean up the sequencing errors before the sequences are translated back into digital form.
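The general idea of recovering data from several flawed copies can be illustrated with a toy Python sketch. This is not DNAformer: it assumes a hypothetical two-bits-per-base mapping, handles substitution errors only (no insertions or deletions), and replaces the transformer model with a simple per-position majority vote.

```python
from collections import Counter

# Hypothetical mapping for illustration: 00 -> A, 01 -> C, 10 -> G, 11 -> T.
BASES = "ACGT"


def encode(data: bytes) -> str:
    """Turn each byte into four bases, most significant bit pair first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def decode(strand: str) -> bytes:
    """Inverse of encode(); expects a length that is a multiple of four."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


def consensus(reads: list[str]) -> str:
    """Per-position majority vote over equal-length noisy reads.

    Assumption: reads differ only by substitutions, so positions line up.
    """
    if not reads:
        raise ValueError("need at least one read")
    if any(len(read) != len(reads[0]) for read in reads):
        raise ValueError("all reads must have the same length")
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*reads))


if __name__ == "__main__":
    original = encode(b"Hi")
    # Three copies of the strand, each with one substitution error.
    noisy_reads = ["TAGACGGC", "CAGACGGA", "CAGTCGGC"]
    recovered = consensus(noisy_reads)
    print(original, recovered, decode(recovered))
```

Real sequencing output also contains insertions and deletions, which shift positions and defeat a plain column-wise vote. That harder case is the one the article says DNAformer is built to handle.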
DNAformer is based on a transformer model trained on synthetic datasets produced by a simulator also built at Technion. Besides the speed improvement, DNAformer showed up to 40% higher accuracy than previous fast retrieval methods. This performance marks a breakthrough in handling real-world DNA storage data, especially when dealing with incomplete or noisy sequences that challenge traditional correction methods.
The researchers plan to adjust DNAformer for specific needs and believe the system can scale for industrial and research applications. It is designed to be flexible and can evolve with future progress in how DNA is written and read, helping meet the growing demand for sustainable and high-capacity storage solutions.

Anton Shilov is a contributing writer at Tom’s Hardware. Over the past couple of decades, he has covered everything from CPUs and GPUs to supercomputers and from modern process technologies and latest fab tools to high-tech industry trends.