Table 1.
Data type | Number of reads | Number of mapped reads | Total bases (Gb) | Mapped bases (Gb) | Effective depth (fold) | Percentage with unique placement | Rate of nucleotide mismatches (%) |
---|---|---|---|---|---|---|---|
SE | 2,019,025,890 | 1,921,271,902 | 72 | 64.4 | 22.5 | 83.60 | 1.62 |
PE | 1,315,249,404 | 1,028,695,924 | 45.7 | 38.5 | 13.5 | 90.20 | 1.16 |
Total | 3,334,275,294 | 2,949,967,826 | 117.7 | 102.9 | 36 | 86.10 | 1.45 |
Single-end (SE) and paired-end (PE) sequencing reads were aligned to the reference assembly in NCBI build 36.1, allowing at most two mismatches or one continuous gap of 1–3 bp. Effective depth was calculated as the total number of mapped bases divided by the length of NCBI36 (2,858,013,089 bp, excluding Ns). ‘Unique placement’ means that a read had only one best placement, that is, a single placement with the fewest mismatches and gaps. The rate of nucleotide mismatches is the percentage of mismatched nucleotides among all mapped nucleotides; it includes both sequencing errors and real genetic variation. In total, 487 million reads (14.6%) could not be aligned to the reference genome.
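The effective-depth column can be reproduced from the mapped-bases column and the stated NCBI36 length. The following is a minimal sketch of that arithmetic only; the function name `effective_depth` and the dictionary layout are illustrative assumptions, not part of the original analysis pipeline, and the inputs are the rounded Gb values from the table rather than exact base counts.

```python
# Length of NCBI36 excluding Ns, as stated in the table note.
GENOME_LENGTH_BP = 2_858_013_089

# Mapped bases in Gb, copied from the table (rounded values).
MAPPED_GB = {"SE": 64.4, "PE": 38.5, "Total": 102.9}


def effective_depth(mapped_gb, genome_length_bp=GENOME_LENGTH_BP):
    """Return fold coverage: mapped bases divided by genome length.

    Hypothetical helper for illustration; takes mapped bases in Gb.
    """
    return mapped_gb * 1e9 / genome_length_bp


# Round to one decimal place to match the precision used in the table.
depths = {name: round(effective_depth(gb), 1) for name, gb in MAPPED_GB.items()}

for name, depth in depths.items():
    print(f"{name}: {depth}-fold")
```

Because the inputs are already rounded to 0.1 Gb, the result agrees with the table to one decimal place but should not be read as more precise than that.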