2021 Nov 24;7(48):eabi6714. doi: 10.1126/sciadv.abi6714

Fig. 1. DNA data storage requires higher synthesis throughput than is possible with current techniques.

(A to D) Overview of the DNA data storage pipeline. (A) Digital data are encoded from their binary representation into sequences of DNA bases, with an identifier that correlates them with a data object, addressing information that is used to reorder the data when reading, and redundant information that is used for error correction. (B) These sequences are synthesized into DNA oligonucleotides and stored. (C) At retrieval time, the DNA molecules are selected and copied via PCR or other methods and sequenced back into electronic representations of the bases in these sequences. (D) The decoding process takes this noisy and sometimes incomplete set of sequencing reads, corrects for errors and missing sequences, and decodes the information to recover the data. (E) Summary of the commercial synthesis processes and corresponding estimated oligonucleotide densities, as reported in the literature or by the companies themselves (see text S2). Our electrochemical method density is highlighted in dark red.
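The encode and decode steps in panels (A) and (D) can be illustrated with a toy sketch. This is not the codec used in the paper: the 2-bits-per-base mapping, the one-byte `object_id` and `address` fields, and the one-byte checksum are all assumptions chosen for brevity. The checksum stands in for the redundant information but only detects a damaged strand; it cannot correct one, whereas a real system would use an error-correcting code.

```python
BASES = "ACGT"  # assumed mapping: 00->A, 01->C, 10->G, 11->T


def bytes_to_bases(data: bytes) -> str:
    """Map each byte to four bases, most significant bit pair first."""
    return "".join(BASES[(b >> shift) & 0b11] for b in data for shift in (6, 4, 2, 0))


def bases_to_bytes(seq: str) -> bytes:
    """Inverse of bytes_to_bases; len(seq) must be a multiple of 4."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        value = 0
        for base in seq[i:i + 4]:
            value = (value << 2) | BASES.index(base)
        out.append(value)
    return bytes(out)


def encode(data: bytes, object_id: int, chunk_size: int = 8) -> list[str]:
    """Split data into strands: [object_id][address][payload][checksum].

    Hypothetical layout: object_id and address are one byte each, so this
    sketch supports at most 256 objects and 256 chunks per object.
    """
    strands = []
    for address, start in enumerate(range(0, len(data), chunk_size)):
        body = bytes([object_id, address]) + data[start:start + chunk_size]
        checksum = sum(body) % 256  # detection only, not correction
        strands.append(bytes_to_bases(body + bytes([checksum])))
    return strands


def decode(reads: list[str], object_id: int) -> bytes:
    """Rebuild the data from unordered, possibly duplicated or damaged reads."""
    chunks = {}
    for read in reads:
        if len(read) % 4 != 0 or set(read) - set(BASES):
            continue  # malformed read
        raw = bases_to_bytes(read)
        if len(raw) < 3:
            continue  # too short to hold header and checksum
        body, checksum = raw[:-1], raw[-1]
        if sum(body) % 256 != checksum or body[0] != object_id:
            continue  # failed checksum, or belongs to another data object
        chunks[body[1]] = body[2:]  # duplicate reads overwrite identically
    # The address field restores the original order.
    return b"".join(chunks[address] for address in sorted(chunks))
```

Because each strand carries its own address, `decode(encode(data, 7), 7)` returns `data` regardless of the order in which the reads arrive. A read that fails the checksum is dropped, leaving a gap that an error-correcting code would be needed to fill.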