Fig. 5. Demonstration of digital music data storage in DNA oligonucleotides enzymatically synthesized in multiplex: decoding and data analysis.
a During the decoding process, several filters are applied to extract the reads that contain the digital music data and align them to the expected template sequences. Template alignment was performed using the Smith–Waterman algorithm. b From the aligned reads, error analysis can be performed to determine the quality of multiplex synthesis. The upper histogram indicates the percentage of insertions, deletions, or mismatches that occurred in the filtered sequencing reads. The bottom histogram indicates the number of reads containing errors for each possible base transition across the array. c Sequencing data also yield statistical information on the extension length distribution for each base transition for all 12 oligonucleotides synthesized in multiplex. As an example, the box plot shows subset 6, which corresponds to sequence index 6. All other subsets are shown in Supplementary Figs. 10 and 11. d The extension length distribution for all possible transitions across the entire array is shown as a box plot, with red pluses marking statistical outliers. Source data are provided as a Source Data file.
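The alignment and error-counting steps described in panels a and b can be illustrated with a minimal Python sketch. It is not the pipeline used for this figure: the function names `smith_waterman` and `count_errors`, the scoring values (match 2, mismatch -1, linear gap -2), and the example sequences are all assumptions chosen for illustration. It performs a Smith–Waterman local alignment of one read against one template and then tallies mismatches, insertions, and deletions from the aligned pair.

```python
def smith_waterman(read, template, match=2, mismatch=-1, gap=-2):
    """Local alignment of read vs. template (scores are illustrative assumptions).

    Returns (best_score, aligned_read, aligned_template), with '-' marking gaps.
    """
    rows, cols = len(read) + 1, len(template) + 1
    H = [[0] * cols for _ in range(rows)]  # scoring matrix, first row/column stay 0
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if read[i - 1] == template[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + s, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    a_read, a_tmpl = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if read[i - 1] == template[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:      # match or mismatch
            a_read.append(read[i - 1]); a_tmpl.append(template[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:      # extra base in the read
            a_read.append(read[i - 1]); a_tmpl.append("-")
            i -= 1
        else:                                   # base missing from the read
            a_read.append("-"); a_tmpl.append(template[j - 1])
            j -= 1
    return best, "".join(reversed(a_read)), "".join(reversed(a_tmpl))


def count_errors(aligned_read, aligned_template):
    """Tally error types column by column over an aligned read/template pair."""
    counts = {"mismatch": 0, "insertion": 0, "deletion": 0}
    for r, t in zip(aligned_read, aligned_template):
        if t == "-":
            counts["insertion"] += 1
        elif r == "-":
            counts["deletion"] += 1
        elif r != t:
            counts["mismatch"] += 1
    return counts


# Hypothetical example: the read carries one extra T relative to the template.
score, a_read, a_tmpl = smith_waterman("ACGTTACG", "ACGTACG")
errors = count_errors(a_read, a_tmpl)
```

Summing such per-read tallies over all filtered reads, and dividing by the number of aligned positions, would give percentages of the kind plotted in the upper histogram of panel b, under the assumptions stated above.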