Table 1.
Error counts in sequences derived from untreated oligonucleotides
Observed nucleotide: count (%; 95% CI) | Expected A (n = 1232) | Expected T (n = 1064) | Expected C (n = 1120) | Expected G (n = 2632) | Expected 5mC (n = 560) |
---|---|---|---|---|---|
A | 1230 (99.8%; 99.1, 99.9%) | 3 (0.28%; 0.06, 0.78%) | 0 (0.09%; 0.002, 0.32%) | 0 (0.04%; 0.008, 0.14%) | 0 (0.2%; 0.005, 0.63%) |
T | 0 (0.08%; 0.002, 0.3%) | 1057 (99.3%; 98.7, 99.7%) | 10 (1%; 0.45, 1.7%) | 0 (0.03%; 0.001, 0.14%) | 0 (0.17%; 0.04, 0.64%) |
C | 0 (0.08%; 0.002, 0.29%) | 3 (0.28%; 0.055, 0.82%) | 1108 (98.7%; 98, 99.4%) | 2 (0.11%; 0.02, 0.27%) | 560 (99.4%; 98.9, 99.9%) |
G | 2 (0.24%; 0.04, 0.58%) | 1 (0.09%; 0.002, 0.52%) | 2 (0.1%; 0.05, 0.66%) | 2630 (0.2%; 0.004, 0.65%) | 0 (0.2%; 0.1, 0.7%) |
Error counts were recorded from 56 single-stranded oligonucleotide molecules, excluding the 5′-overhang and primer-binding regions. For each molecule, we collected information from 22 adenines, 19 thymines, 20 unmethylated non-CpG cytosines, 47 guanines and 10 methylated CpG cytosines. For columns in which one or more values were 0, percentages were calculated using the ‘pseudocounts’ method originally introduced by Laplace (21). Under this method, each count is treated as if it were greater by 1, and the denominator is the true column total plus the total number of groups. Values in parentheses accompanying the event counts for each nucleotide indicate mean error rates with either 95% confidence intervals (CI) on point estimates computed directly, or 95% credible intervals on point estimates computed using the pseudocounts method (see Methods section).
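The arithmetic behind the column totals and the pseudocount percentages can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the number of groups is assumed to be the four observed nucleotides (A, T, C, G), which appears consistent with the percentages reported in columns containing a zero. The intervals are not reproduced here, since their computation is described only in the Methods section.

```python
# Per-molecule site counts and number of molecules, as stated in the footnote.
MOLECULES = 56
SITES_PER_MOLECULE = {"A": 22, "T": 19, "C": 20, "G": 47, "5mC": 10}


def column_totals(sites=SITES_PER_MOLECULE, molecules=MOLECULES):
    """Total expected sites per nucleotide: sites per molecule x molecules."""
    return {base: n * molecules for base, n in sites.items()}


def column_rates(observed_counts):
    """Percentages for one expected-nucleotide column.

    observed_counts maps each observed nucleotide to its count.
    If any count is 0, Laplace pseudocounts are applied: every count is
    incremented by 1 and the denominator becomes the true total plus the
    number of groups (assumed here to be len(observed_counts), i.e. 4).
    Otherwise the percentages are computed directly.
    """
    total = sum(observed_counts.values())
    groups = len(observed_counts)
    if any(count == 0 for count in observed_counts.values()):
        return {
            base: 100.0 * (count + 1) / (total + groups)
            for base, count in observed_counts.items()
        }
    return {base: 100.0 * count / total for base, count in observed_counts.items()}


if __name__ == "__main__":
    print(column_totals())
    # Expected-C column from Table 1 (contains a zero, so pseudocounts apply).
    print(column_rates({"A": 0, "T": 10, "C": 1108, "G": 2}))
    # Expected-T column from Table 1 (no zeros, so rates are computed directly).
    print(column_rates({"A": 3, "T": 1057, "C": 3, "G": 1}))
```

For the expected-C column, for example, the zero count for observed A becomes (0 + 1) / (1120 + 4), or roughly 0.09%, which matches the tabulated value.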