Table 2. Data sets and their relative coverage on bias motifs

Relative coverage is given for three GC extremes (GC ≤ 10%, GC ≥ 75%, GC ≥ 85%) and three special motifs ((AT)15, G|C ≥ 80%, bad promoters).

| Sample | # | Library method | Sequencing platform | Coverage (×) | GC ≤ 10% | GC ≥ 75% | GC ≥ 85% | (AT)15 | G\|C ≥ 80% | Bad promoters |
|---|---|---|---|---|---|---|---|---|---|---|
| P. falciparum 3D7 | 1 | Fisher et al.ᵃ with Kapa reagents | Illumina MiSeq | 150 | 0.58 | - | - | 0.43 | - | - |
| | 2 | Ion Torrent standard | Ion Torrent PGM | 103 | 0.39 | - | - | 0.11 | - | - |
| | 3 | Pacific Biosciences standard | Pacific Biosciences RS | 104 | 0.89 | - | - | 0.85 | - | - |
| E. coli K12 MG1655 | 4 | Fisher et al.ᵃ with Kapa reagents | Illumina MiSeq | 380 | - | 0.82 | - | - | - | - |
| | 5 | Ion Torrent standard | Ion Torrent PGM | 311 | - | 0.31 | - | - | - | - |
| | 6 | Pacific Biosciences standard | Pacific Biosciences RS | 115 | - | 0.97 | - | - | - | - |
| R. sphaeroides 2.4.1 | 7 | Fisher et al.ᵃ with Kapa reagents | Illumina MiSeq | 388 | - | 0.94 | 0.60 | - | - | - |
| | 8 | Ion Torrent standard | Ion Torrent PGM | 302 | - | 0.39 | 0.10 | - | - | - |
| | 9 | Pacific Biosciences standard | Pacific Biosciences RS | 142 | - | 0.97 | 0.87 | - | - | - |
| Human NA12878 | 10 | Aird et al. with Phusion | Illumina HiSeq v2 | 28 | 0.58 | 0.27 | 0.071 | 0.38 | 0.19 | 0.027 |
| | 11 | Aird et al. with Phusion+betaine | Illumina HiSeq v2 | 48 | 0.44 | 0.44 | 0.28 | 0.26 | 0.20 | 0.14 |
| | 12 | Aird et al. with AccuPrime | Illumina HiSeq v2 | 75 | 0.42 | 0.42 | 0.23 | 0.23 | 0.38 | 0.16 |
| | 13 | Fisher et al.ᵃ | Illumina HiSeq v3 | 70 | 0.29 | 1.1 | 0.56 | 0.23 | 0.44 | 0.39 |
| | 14 | Fisher et al.ᵃ with Kapa reagents | Illumina HiSeq v3 | 120 | 0.41 | 0.88 | 0.48 | 0.25 | 0.65 | 0.36 |
| | 14' | Fisher et al.ᵃ with Kapa reagents | Illumina HiSeq v3 | 0.5 | 0.41 ± 0.0032 | 0.88 ± 0.0047 | 0.48 ± 0.0067 | 0.25 ± 0.0042 | 0.65 ± 0.012 | 0.37 ± 0.022 |
| | 15 | Ion Torrent standard | Ion Torrent PGM | 1.1 | 0.27 | 0.36 | 0.068 | 0.19 | 0.26 | 0.046 |
| | 16 | Complete Genomics standard | Complete Genomics | 79 | 0.24 | 0.53 | 0.18 | 0.28 | 0.61 | 0.092 |
ᵃ Low-input variation of Fisher et al. [31] (see Materials and methods). Each data set is listed with its sample, library construction method and sequencing platform, along with its total coverage of the genome and its relative coverage for each of five bias motifs and a set of 'bad promoters' (see text). Entries are marked '-' if the sample's genome had no instances of the given motif. Data set 14' is the summary of ten random subsamplings from data set 14, with coverage reduced to 0.5×; for it we show the mean and standard deviation of each relative coverage measurement (see text).
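
The subsampling summary reported for data set 14' can be illustrated with a small sketch. It assumes that relative coverage means the mean depth over motif positions divided by the genome-wide mean depth (the definition is given in the main text, not in this table). It also thins a per-base depth vector independently at each position rather than subsampling reads, which is a simplification. All function names and toy numbers are hypothetical and are not taken from the paper.

```python
import random
import statistics


def relative_coverage(depth, motif_positions):
    """Mean depth over motif positions divided by mean depth over all positions.

    Assumed definition of 'relative coverage'; see the lead-in.
    """
    genome_mean = sum(depth) / len(depth)
    if genome_mean == 0:
        raise ValueError("genome-wide mean depth is zero")
    motif_mean = sum(depth[i] for i in motif_positions) / len(motif_positions)
    return motif_mean / genome_mean


def subsample_depth(depth, keep_fraction, rng):
    """Thin each position's depth, keeping every covering base with
    probability keep_fraction (a stand-in for read-level subsampling)."""
    return [sum(1 for _ in range(d) if rng.random() < keep_fraction)
            for d in depth]


def subsampled_relative_coverage(depth, motif_positions, target_coverage,
                                 n_subsamples=10, seed=0):
    """Return (mean, sample standard deviation) of relative coverage over
    n_subsamples random thinnings down to target_coverage."""
    rng = random.Random(seed)
    full_coverage = sum(depth) / len(depth)
    keep_fraction = target_coverage / full_coverage
    if not 0 < keep_fraction <= 1:
        raise ValueError("target coverage must be positive and not exceed the full coverage")
    values = []
    for _ in range(n_subsamples):
        thinned = subsample_depth(depth, keep_fraction, rng)
        values.append(relative_coverage(thinned, motif_positions))
    return statistics.mean(values), statistics.stdev(values)


if __name__ == "__main__":
    # Toy genome: 9,000 positions at depth 40 and a 1,000-position
    # "motif" region at depth 20 (hypothetical numbers).
    toy_depth = [40] * 9000 + [20] * 1000
    toy_motif = range(9000, 10000)
    print("full data:", relative_coverage(toy_depth, toy_motif))
    print("ten subsamples at 0.5x (mean, sd):",
          subsampled_relative_coverage(toy_depth, toy_motif, 0.5))
```

In this toy setting the subsampled mean should stay close to the full-data value while the standard deviation reflects sampling noise at low coverage, which is the comparison the 14 and 14' rows invite.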