Wow! DNA could store Petabytes and is only 5 years away, new report says
Based on Sabine Hossenfelder's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
Briefing
DNA is being positioned as a high-density, long-life medium for archival data storage, and a new industry assessment suggests real deployments could arrive within 3 to 5 years. DNA can be synthesized and, in theory, packs about 200 petabytes per gram (roughly 100 petabytes per cubic centimeter). Achieved densities today are about 10 times lower than that theoretical ceiling, yet they still amount to around 10,000 times the storage density of current solid-state devices. Its other standout trait is longevity: DNA can persist for a thousand years or longer, making it attractive for “keep it forever” backup and record-keeping rather than everyday cloud use.
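To make the density figures concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted above (200 petabytes per gram in theory, about 10 times less in practice). The 1-exabyte archive size and the function names are assumptions chosen for illustration, not figures from the report.

```python
# Figures quoted in the briefing; treat them as rough estimates.
THEORETICAL_PB_PER_GRAM = 200.0
PRACTICAL_SHORTFALL = 10.0  # achieved density is ~10x below the ceiling


def practical_pb_per_gram() -> float:
    """Density actually achieved today, per the 10x figure above."""
    return THEORETICAL_PB_PER_GRAM / PRACTICAL_SHORTFALL


def grams_needed(archive_pb: float, pb_per_gram: float) -> float:
    """Mass of DNA needed to hold an archive of the given size."""
    if pb_per_gram <= 0:
        raise ValueError("density must be positive")
    return archive_pb / pb_per_gram


if __name__ == "__main__":
    archive_pb = 1000.0  # hypothetical archive: 1 exabyte = 1000 petabytes
    print(f"practical density: {practical_pb_per_gram():.0f} PB/g")
    print(f"theoretical mass:  {grams_needed(archive_pb, THEORETICAL_PB_PER_GRAM):.0f} g")
    print(f"practical mass:    {grams_needed(archive_pb, practical_pb_per_gram()):.0f} g")
```

Under these assumptions, an exabyte would need about 5 grams of DNA at the theoretical limit and about 50 grams at today's achieved density.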
The push toward commercialization is backed by a consortium known as the DNA Storage Alliance, active since 2020 and now listing 41 members, including major technology companies such as Microsoft, Dell, IBM, Samsung, and Lenovo, alongside universities. In a June whitepaper, the group evaluates readiness and concludes that archival data storage use cases are likely to emerge over the next 3–5 years. That timeline matters because DNA storage has long been constrained less by the concept than by engineering bottlenecks—especially the process of converting data into DNA sequences and synthesizing the corresponding molecules.
DNA storage is straightforward in principle but complex in execution. Data is translated into a code built from DNA’s four nucleotides, and custom DNA strands carrying that code are chemically synthesized. Those strands are typically stored as a powder and later read back using DNA sequencing. The catch is speed: writing custom DNA is currently extremely time-consuming. Current write throughput is roughly 100 megabytes per day, while fast conventional storage systems can write around a gigabyte per second. As a result, DNA storage is not poised to replace cloud storage. Its role is likely to be specialized: safety backups, rare archives, and long-term preservation, where data is rarely retrieved and durability matters more than write speed.
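The translation step and the write-speed gap can be sketched in a few lines of Python. The two-bits-per-nucleotide mapping below is a deliberately naive assumption for illustration; the source does not say which code is used, and practical schemes would also need error correction and constraints on which sequences can be synthesized. The throughput constants are the figures quoted above, taken in decimal units.

```python
# Hypothetical mapping: 2 bits per nucleotide (not from the source).
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}

SECONDS_PER_DAY = 86_400
DNA_WRITE_BYTES_PER_DAY = 100_000_000                # ~100 MB per day
CONVENTIONAL_WRITE_BYTES_PER_SECOND = 1_000_000_000  # ~1 GB per second


def encode(data: bytes) -> str:
    """Translate bytes into a nucleotide sequence, 4 bases per byte."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Recover the original bytes from a sequenced strand."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4 bases")
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def dna_write_days(size_bytes: int) -> float:
    """Days needed to write the data at the quoted DNA throughput."""
    return size_bytes / DNA_WRITE_BYTES_PER_DAY


def speed_ratio() -> float:
    """How many times faster the conventional system writes per day."""
    return CONVENTIONAL_WRITE_BYTES_PER_SECOND * SECONDS_PER_DAY / DNA_WRITE_BYTES_PER_DAY


if __name__ == "__main__":
    strand = encode(b"Hi")
    print(strand)                         # CAGACGGC
    print(decode(strand))                 # b'Hi'
    print(dna_write_days(1_000_000_000))  # 10.0 days for one gigabyte
    print(speed_ratio())                  # 864000.0
```

With these figures, one gigabyte takes about ten days to write to DNA versus about one second conventionally, a gap of roughly 864,000 times.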
Beyond storage, DNA is also being treated as a programmable computing substrate. Researchers have developed DNA-based neural networks and logical processing schemes where single-strand DNA interacts with double-strand DNA to form input, gate, and output strands through controlled chemical reactions. Because base pairing determines the behavior of these molecular “circuits,” programmed DNA networks can carry out simple calculations. The appeal extends to biocompatibility, opening potential pathways for medical diagnostics and other in-tissue applications.
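The input, gate, and output idea can be illustrated with a toy Python model. This is a heavy abstraction and an assumption on my part, not the researchers' method: each gate simply releases its output strand when the input is the exact reverse complement of the gate's template strand. Real strand-displacement systems involve partial binding regions, reaction kinetics, and concentrations, all of which are ignored here. The `Gate` class, `run_cascade` function, and every sequence are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand: str) -> str:
    """Watson-Crick partner of a strand, read in the opposite direction."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


@dataclass
class Gate:
    """Toy gate: a template strand holding an output strand until displaced."""
    template: str
    output: str
    released: bool = False

    def react(self, input_strand: str) -> Optional[str]:
        """Release the output once, if the input pairs with the template."""
        if not self.released and input_strand == reverse_complement(self.template):
            self.released = True
            return self.output
        return None


def run_cascade(signal: Optional[str], gates: List[Gate]) -> Optional[str]:
    """Feed each gate's released output to the next gate as its input."""
    for gate in gates:
        if signal is None:
            return None
        signal = gate.react(signal)
    return signal


if __name__ == "__main__":
    second = Gate(template="TTGACC", output="GATTACA")
    # The first gate's output is designed to pair with the second gate.
    first = Gate(template="ACGTAG", output=reverse_complement(second.template))
    print(run_cascade(reverse_complement("ACGTAG"), [first, second]))  # GATTACA
```

The point of the sketch is only that base pairing decides which strand triggers which gate, so chaining gates yields simple, programmable behavior.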
Separate lines of work weave DNA into larger lattices to create materials with custom properties—such as guiding light or sound—by interlocking DNA structures. Taken together, the momentum points to a broader convergence of biology and computer science: DNA as both a storage medium and a building block for computation and engineered materials. The near-term expectation is archival storage first; the longer-term vision ranges from programmable nano-scale systems to DNA-designed materials that could reshape technology within a decade or two.
Cornell Notes
DNA is gaining traction as a long-term data storage medium because it can be synthesized, packed at extremely high theoretical densities (about 200 petabytes per gram), and preserved for a thousand years or more. A June assessment by the DNA Storage Alliance—an industry consortium with 41 members including Microsoft, Dell, IBM, Samsung, and Lenovo—predicts archival DNA storage use cases will emerge in the next 3–5 years. The workflow converts data into nucleotide sequences, chemically synthesizes the DNA, stores it as powder, and later retrieves it via DNA sequencing. Practical write speeds remain slow (around 100 MB per day), so DNA is unlikely to replace cloud storage; it’s better suited for specialized backups and archives. Separately, DNA is also being explored for molecular computing and programmable materials.
Why does DNA storage attract attention even though it’s not fast enough to replace cloud storage?
What does the DNA Storage Alliance’s readiness assessment say, and who is involved?
What are the main steps in storing and retrieving data using synthetic DNA?
How can DNA strands function like logic gates or neural-network components?
What other non-storage applications are being pursued with DNA beyond data archiving?
Review Questions
- What specific properties of DNA make it suitable for archival storage, and how do those properties compare with solid-state storage?
- Why is DNA storage unlikely to replace cloud storage in the near term, and what metric illustrates that gap?
- Describe the basic pipeline from data to synthesized DNA to sequencing-based retrieval. What role do the four nucleotides play?
Key Points
1. DNA can be synthesized and offers extremely high theoretical storage density (about 200 petabytes per gram); even today’s achieved densities remain far above those of solid-state storage.
2. DNA’s long lifetime, likely a thousand years or more, makes it especially suitable for archival backups rather than everyday storage.
3. The DNA Storage Alliance (active since 2020, with 41 members including Microsoft, Dell, IBM, Samsung, and Lenovo) projects archival DNA storage use cases will emerge within 3–5 years.
4. DNA storage works by translating data into nucleotide sequences, chemically synthesizing the DNA, storing it as powder, and later reading it via DNA sequencing.
5. Write speed is the major limitation: DNA storage manages around 100 megabytes per day, compared with roughly a gigabyte per second for fast conventional storage.
6. DNA is also being explored as a programmable computing medium, using strand interactions to create logic-like input–gate–output behavior.
7. Separate research efforts use DNA lattices to design materials with custom optical or acoustic properties, showing DNA’s broader role beyond data storage.