2015 Sep 3;10(9):e0137549. doi: 10.1371/journal.pone.0137549

Table 2. Comparison of reference genomic sequence datasets for mapping captured reads.

Numbers of SNPs by allele frequency (AF)

Reference   Min. variant reads  SNP type  0.05–0.1   0.1–0.2   0.2–0.3   0.3–0.4   0.4–0.5   0.5–0.6   0.6–0.7   0.7–0.8   0.8–0.9   0.9–1.0  Total  Total (AF>0.1)
CSS         3                   EMS            427       163        64        56        46        30        23        19        18       257   1103             676
CSS         3                   NON-EMS        294       113        28        11         1         1         0         1         0         4    453             159
CSS         8                   EMS             13        22        31        38        35        26        20        18        17       246    466             453
CSS         8                   NON-EMS         16         3         3         1         0         0         0         0         0         4     27              11
Ensembl     3                   EMS            923       419       127        74        53        27        26        16        18       253   1936            1013
Ensembl     3                   NON-EMS        660       349        91        32         3         1         0         1         0         3   1140             480
Ensembl     8                   EMS             24        33        34        43        34        22        23        16        17       242    488             464
Ensembl     8                   NON-EMS         17         1         5         1         0         0         0         1         0         3     28              11
Ensembl-RM  3                   EMS            426       163        61        54        47        22        27        17        16       231   1064             638
Ensembl-RM  3                   NON-EMS        279       130        31         8         2         0         1         1         0         2    454             175
Ensembl-RM  8                   EMS             17        25        32        40        37        19        24        17        16       221    448             431
Ensembl-RM  8                   NON-EMS         12         9         2         1         0         0         0         0         0         2     26               3
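
The two summary columns appear to follow from the frequency bins: Total is the sum of the ten bins, and Total (AF>0.1) is that sum without the 0.05–0.1 bin. A minimal Python sketch of this arithmetic, using the two CSS rows with a minimum of 3 variant reads, is given here. The function name `summarize` and the dictionary layout are illustrative only and do not come from the paper.

```python
# Bin counts copied from the CSS rows (minimum 3 variant reads) of Table 2.
# Bin order: 0.05-0.1, 0.1-0.2, ..., 0.9-1.0
css_min3 = {
    "EMS": [427, 163, 64, 56, 46, 30, 23, 19, 18, 257],
    "NON-EMS": [294, 113, 28, 11, 1, 1, 0, 1, 0, 4],
}


def summarize(bins):
    """Return (total, total with AF > 0.1) for one row of bin counts."""
    total = sum(bins)
    # Assumption: "Total (AF>0.1)" excludes only the lowest (0.05-0.1) bin.
    return total, total - bins[0]


for snp_type, bins in css_min3.items():
    total, above = summarize(bins)
    print(f"{snp_type}: total={total}, AF>0.1={above}")
# EMS: total=1103, AF>0.1=676
# NON-EMS: total=453, AF>0.1=159
```

Both results match the Total and Total (AF>0.1) values reported for these rows.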

Reads were mapped with Novoalign using parameter t = 60, equivalent to a mismatch setting of approximately 2. Novoalign's hard-clipping option was used with a base-quality threshold of 15. Reads with a mapping score below 20 were removed. The references used were the full IWGSC chromosome arm survey sequence (“CSS”), the Ensembl v21 subset of CSS (“Ensembl”), or a repeat-masked version of the latter (“Ensembl-RM”). Minimum total read coverage was 8, minimum SNP read coverage was 3 or 8, and minimum SNP base quality was 20.
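
The coverage and quality thresholds in this note can be sketched as a simple filter over candidate SNP sites. This is only an illustration under stated assumptions: the `Candidate` class, its field names, and the use of a single base-quality value per site are hypothetical and are not the authors' pipeline. Only the numeric thresholds come from the note.

```python
from dataclasses import dataclass

# Thresholds taken from the table note.
MIN_TOTAL_READS = 8    # minimum total read coverage at the site
MIN_BASE_QUALITY = 20  # minimum SNP base quality


@dataclass
class Candidate:
    """Hypothetical per-site summary; field names are illustrative."""
    total_reads: int    # reads covering the site (after mapping-score filtering)
    variant_reads: int  # reads carrying the variant base
    base_quality: int   # assumed: one quality value summarizing the variant bases


def passes(c: Candidate, min_variant_reads: int) -> bool:
    """Apply the coverage and quality thresholds; min_variant_reads is 3 or 8."""
    return (
        c.total_reads >= MIN_TOTAL_READS
        and c.variant_reads >= min_variant_reads
        and c.base_quality >= MIN_BASE_QUALITY
    )


site = Candidate(total_reads=20, variant_reads=4, base_quality=30)
print(passes(site, min_variant_reads=3))  # True
print(passes(site, min_variant_reads=8))  # False
```

The same site is kept under the lenient setting and dropped under the strict one, which is the kind of difference the paired rows for 3 and 8 minimum variant reads compare.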