2015 Mar 20;16(1):218. doi: 10.1186/s12864-015-1368-9

Table 5.

Comparison of Illumina read mapping efficacy for clinical isolates derived from different lineages, using Bowtie2 and SAMtools

i) Comparison of the numbers of mapped and unmapped reads to the H37Rv sequence or consensus sequence
Isolate Lineage Measure H37Rv Consensus sequence (CS) Difference in ratio (%), CS minus H37Rv
Mapping stringency* Local End to end Local End to end Local End to end
F092 EAI mapped 681561 664376 684952 667250
unmapped 22261 39446 18870 36572
ratio (%) 96.837 94.395 97.319 94.804 0.482 0.408
J156 EAI mapped 1680156 1650866 1689673 1658917
unmapped 40162 69452 30645 61401
ratio (%) 97.665 95.963 98.219 96.431 0.553 0.468
F038 Haarlem, LAM, X etc. mapped 1024873 997301 1029625 1000714
unmapped 75113 102685 70361 99272
ratio (%) 93.171 90.665 93.603 90.975 0.432 0.310
F070 Haarlem, LAM, X etc. mapped 858126 840921 861393 843463
unmapped 27822 45027 24555 42485
ratio (%) 96.860 94.918 97.228 95.205 0.369 0.287
J073 Haarlem, LAM, X etc. mapped 1534315 1503494 1537891 1504256
unmapped 11979 42800 8403 42038
ratio (%) 99.225 97.232 99.457 97.281 0.231 0.049
J147 Haarlem, LAM, X etc. mapped 847807 836269 849747 836489
unmapped 11775 23313 9835 23093
ratio (%) 98.630 97.288 98.856 97.313 0.226 0.026
F081 other non-Beijing mapped 1004912 974425 1008107 976556
unmapped 43978 74465 40783 72334
ratio (%) 95.807 92.901 96.112 93.104 0.305 0.203
J020 other non-Beijing mapped 1081365 1062537 1085065 1065704
unmapped 11293 30121 7593 26954
ratio (%) 98.966 97.243 99.305 97.533 0.339 0.290
J027 other non-Beijing mapped 751633 741219 754861 744254
unmapped 5259 15673 2031 12638
ratio (%) 99.305 97.929 99.732 98.330 0.426 0.401
F022 Ancestral Beijing mapped 1162270 1143243 1166545 1147960
unmapped 26600 45627 22325 40910
ratio (%) 97.763 96.162 98.122 96.559 0.360 0.397
J090 Ancestral Beijing mapped 490815 484340 492424 486326
unmapped 5153 11628 3544 9642
ratio (%) 98.961 97.655 99.285 98.056 0.324 0.400
J002 Ancestral Beijing mapped 736473 727044 739288 730539
unmapped 5757 15186 2942 11691
ratio (%) 99.224 97.954 99.604 98.425 0.379 0.471
J029 Modern Beijing mapped 953792 936476 957539 941221
unmapped 10220 27536 6473 22791
ratio (%) 98.940 97.144 99.329 97.636 0.389 0.492
F076 Modern Beijing mapped 532526 519473 534742 522431
unmapped 16374 29427 14158 26469
ratio (%) 97.017 94.639 97.421 95.178 0.404 0.539
J111 Modern Beijing mapped 719693 708076 722895 712304
unmapped 14341 25958 11139 21730
ratio (%) 98.046 96.464 98.482 97.040 0.436 0.576
ii) Comparison of mapping frequency ratio (%) among the MTBC lineages
EAI Haarlem, LAM, X etc. other non-Beijing
Haarlem, LAM, X etc. ns - -
other non-Beijing ns ns -
Beijing P < 0.05 ns ns

In this analysis, the consensus sequence (CS) was based on 13 M. tuberculosis strains (Table 1). i) After mapping with Bowtie2 [42] against H37Rv or CS, the idxstats command of SAMtools [43] was used to calculate the mapping efficacy (Table 5). Bowtie2 was run in both local and end-to-end mapping modes, with all other parameters left at their default values. Significant differences in mapping frequencies were assessed using multiple comparisons of proportions tests [44]. For every isolate, the mapping frequency differed significantly between the H37Rv and CS references (p < 0.0001). For each mapping mode, the ratio of mapped to total reads was calculated for each reference, and the difference in mapping frequency was obtained by subtracting the H37Rv ratio from the CS ratio.
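
The ratio and subtraction described in i) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' script: the function names, the reference names and the reference lengths in the sample text are hypothetical placeholders, and only the read counts (isolate F092, local mode) come from the table. It assumes the standard four-column, tab-separated output of `samtools idxstats` (reference name, reference length, mapped reads, unmapped reads, with a final `*` line for reads assigned to no reference).

```python
def parse_idxstats(text):
    """Sum mapped and unmapped read counts over all lines of idxstats output."""
    mapped = unmapped = 0
    for line in text.strip().splitlines():
        _name, _length, n_mapped, n_unmapped = line.split("\t")
        mapped += int(n_mapped)
        unmapped += int(n_unmapped)
    return mapped, unmapped


def mapping_ratio(mapped, unmapped):
    """Percentage of all reads that were mapped."""
    return 100.0 * mapped / (mapped + unmapped)


# Hypothetical idxstats text; names and lengths are placeholders,
# counts are the F092 local-mode values from part i) of the table.
h37rv_stats = "ref_H37Rv\t1000\t681561\t0\n*\t0\t0\t22261\n"
cs_stats = "ref_CS\t1000\t684952\t0\n*\t0\t0\t18870\n"

h37rv_ratio = mapping_ratio(*parse_idxstats(h37rv_stats))
cs_ratio = mapping_ratio(*parse_idxstats(cs_stats))
difference = cs_ratio - h37rv_ratio  # CS minus H37Rv

print(f"H37Rv {h37rv_ratio:.3f}  CS {cs_ratio:.3f}  difference {difference:.3f}")
```

With these counts the three printed values are 96.837, 97.319 and 0.482, matching the F092 local-mode entries in the table.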

ii) Based on the differences in mapping frequency from i), the mapping frequencies of the MTBC lineages were compared using Mann–Whitney U tests. Only the comparison between Beijing and EAI isolates showed a significant difference (p < 0.05) in mapping frequency against the consensus sequence relative to H37Rv; H37Rv itself belongs to the Haarlem, LAM, X etc. lineage (lineage 4).
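
For readers unfamiliar with the test in ii), the following is a rough pure-Python sketch of a Mann–Whitney U comparison between two groups, with an exact two-sided p-value obtained by enumerating every relabelling of the pooled values. The function names and the two input lists are hypothetical and made up for illustration; the caption does not state which table columns were pooled per lineage, so this sketch does not attempt to reproduce the reported p-values.

```python
from itertools import combinations


def mann_whitney_u(x, y):
    """U statistic for x: count of (a, b) pairs with a > b; ties count 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


def exact_two_sided_p(x, y):
    """Exact two-sided p-value over all ways of splitting the pooled values."""
    pooled = list(x) + list(y)
    n, n_x = len(pooled), len(x)
    centre = n_x * len(y) / 2.0  # expected U under the null hypothesis
    observed = abs(mann_whitney_u(x, y) - centre)
    hits = total = 0
    for idx in combinations(range(n), n_x):
        chosen = set(idx)
        gx = [pooled[i] for i in idx]
        gy = [pooled[i] for i in range(n) if i not in chosen]
        total += 1
        # Count relabellings at least as extreme as the observed split.
        if abs(mann_whitney_u(gx, gy) - centre) >= observed - 1e-12:
            hits += 1
    return hits / total


# Hypothetical per-isolate differences (CS minus H37Rv, %) for two lineages.
group_a = [0.51, 0.55, 0.60, 0.62]
group_b = [0.30, 0.33, 0.36, 0.41]

u = mann_whitney_u(group_a, group_b)
p = exact_two_sided_p(group_a, group_b)
print(f"U = {u}, exact two-sided p = {p:.4f}")
```

Exhaustive enumeration is only practical for small groups such as those in this table; for larger samples a library routine using a normal approximation would be the usual choice.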