Table 5.
i) Comparison of the numbers of mapped and unmapped reads to the H37Rv sequence or consensus sequence

| Isolate | Lineage | Reads | H37Rv | H37Rv | Consensus sequence (CS) | Consensus sequence (CS) | Subtraction of ratio (%) (CS minus H37Rv) | Subtraction of ratio (%) (CS minus H37Rv) |
|---|---|---|---|---|---|---|---|---|
| Mapping stringency* | | | Local | End to end | Local | End to end | Local | End to end |
| F092 | EAI | mapped | 681561 | 664376 | 684952 | 667250 | | |
| | | unmapped | 22261 | 39446 | 18870 | 36572 | | |
| | | ratio (%) | 96.837 | 94.395 | 97.319 | 94.804 | 0.482 | 0.408 |
| J156 | EAI | mapped | 1680156 | 1650866 | 1689673 | 1658917 | | |
| | | unmapped | 40162 | 69452 | 30645 | 61401 | | |
| | | ratio (%) | 97.665 | 95.963 | 98.219 | 96.431 | 0.553 | 0.468 |
| F038 | Haarlem, LAM, X etc. | mapped | 1024873 | 997301 | 1029625 | 1000714 | | |
| | | unmapped | 75113 | 102685 | 70361 | 99272 | | |
| | | ratio (%) | 93.171 | 90.665 | 93.603 | 90.975 | 0.432 | 0.310 |
| F070 | Haarlem, LAM, X etc. | mapped | 858126 | 840921 | 861393 | 843463 | | |
| | | unmapped | 27822 | 45027 | 24555 | 42485 | | |
| | | ratio (%) | 96.860 | 94.918 | 97.228 | 95.205 | 0.369 | 0.287 |
| J073 | Haarlem, LAM, X etc. | mapped | 1534315 | 1503494 | 1537891 | 1504256 | | |
| | | unmapped | 11979 | 42800 | 8403 | 42038 | | |
| | | ratio (%) | 99.225 | 97.232 | 99.457 | 97.281 | 0.231 | 0.049 |
| J147 | Haarlem, LAM, X etc. | mapped | 847807 | 836269 | 849747 | 836489 | | |
| | | unmapped | 11775 | 23313 | 9835 | 23093 | | |
| | | ratio (%) | 98.630 | 97.288 | 98.856 | 97.313 | 0.226 | 0.026 |
| F081 | other non-Beijing | mapped | 1004912 | 974425 | 1008107 | 976556 | | |
| | | unmapped | 43978 | 74465 | 40783 | 72334 | | |
| | | ratio (%) | 95.807 | 92.901 | 96.112 | 93.104 | 0.305 | 0.203 |
| J020 | other non-Beijing | mapped | 1081365 | 1062537 | 1085065 | 1065704 | | |
| | | unmapped | 11293 | 30121 | 7593 | 26954 | | |
| | | ratio (%) | 98.966 | 97.243 | 99.305 | 97.533 | 0.339 | 0.290 |
| J027 | other non-Beijing | mapped | 751633 | 741219 | 754861 | 744254 | | |
| | | unmapped | 5259 | 15673 | 2031 | 12638 | | |
| | | ratio (%) | 99.305 | 97.929 | 99.732 | 98.330 | 0.426 | 0.401 |
| F022 | Ancestral Beijing | mapped | 1162270 | 1143243 | 1166545 | 1147960 | | |
| | | unmapped | 26600 | 45627 | 22325 | 40910 | | |
| | | ratio (%) | 97.763 | 96.162 | 98.122 | 96.559 | 0.360 | 0.397 |
| J090 | Ancestral Beijing | mapped | 490815 | 484340 | 492424 | 486326 | | |
| | | unmapped | 5153 | 11628 | 3544 | 9642 | | |
| | | ratio (%) | 98.961 | 97.655 | 99.285 | 98.056 | 0.324 | 0.400 |
| J002 | Ancestral Beijing | mapped | 736473 | 727044 | 739288 | 730539 | | |
| | | unmapped | 5757 | 15186 | 2942 | 11691 | | |
| | | ratio (%) | 99.224 | 97.954 | 99.604 | 98.425 | 0.379 | 0.471 |
| J029 | Modern Beijing | mapped | 953792 | 936476 | 957539 | 941221 | | |
| | | unmapped | 10220 | 27536 | 6473 | 22791 | | |
| | | ratio (%) | 98.940 | 97.144 | 99.329 | 97.636 | 0.389 | 0.492 |
| F076 | Modern Beijing | mapped | 532526 | 519473 | 534742 | 522431 | | |
| | | unmapped | 16374 | 29427 | 14158 | 26469 | | |
| | | ratio (%) | 97.017 | 94.639 | 97.421 | 95.178 | 0.404 | 0.539 |
| J111 | Modern Beijing | mapped | 719693 | 708076 | 722895 | 712304 | | |
| | | unmapped | 14341 | 25958 | 11139 | 21730 | | |
| | | ratio (%) | 98.046 | 96.464 | 98.482 | 97.040 | 0.436 | 0.576 |

ii) Comparison of mapping frequency ratio (%) among the MTBC lineages

| | EAI | Haarlem, LAM, X etc. | other non-Beijing |
|---|---|---|---|
| Haarlem, LAM, X etc. | ns | - | - |
| other non-Beijing | ns | ns | - |
| Beijing | P < 0.05 | ns | ns |

In this analysis, the consensus sequence (CS) was based on 13 M. tuberculosis strains (Table 1). i) After mapping with Bowtie2 [42] against H37Rv or CS, the idxstats command of SAMtools [43] was used to calculate the mapping efficiency (Table 5). Reads were mapped with Bowtie2 in both local and end-to-end modes, with all other parameters left at their default values. Significant differences in mapping frequencies were assessed using multiple comparisons of proportions tests [44]. For all isolates, the mapping frequency differed significantly between H37Rv and CS as the reference (p < 0.0001). For both mapping modes, the ratio of mapped to total reads was calculated, and the difference in mapping frequency between the consensus and H37Rv sequences was obtained by simple subtraction (CS minus H37Rv).
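The ratio and subtraction described in i) can be sketched as follows. This is a minimal illustration, not the authors' script: the function names are hypothetical, the reference names and the consensus length in the sample text are placeholders, and the read counts are the local-mode values for isolate F092 from the table. It assumes the usual tab-separated layout of `samtools idxstats` output (reference name, sequence length, mapped reads, unmapped reads, with a final `*` line for reads that have no reference).

```python
def parse_idxstats(text):
    """Sum mapped and unmapped read counts over all lines of idxstats output."""
    mapped = 0
    unmapped = 0
    for line in text.strip().splitlines():
        _name, _length, n_mapped, n_unmapped = line.split("\t")
        mapped += int(n_mapped)
        unmapped += int(n_unmapped)
    return mapped, unmapped


def mapped_ratio(mapped, unmapped):
    """Percentage of mapped reads among all reads."""
    return 100.0 * mapped / (mapped + unmapped)


# Illustrative idxstats text; reference names and the consensus length are
# placeholders, the counts are the F092 local-mode values from the table.
h37rv_idxstats = "NC_000962.3\t4411532\t681561\t0\n*\t0\t0\t22261\n"
cs_idxstats = "consensus\t4400000\t684952\t0\n*\t0\t0\t18870\n"

ratio_h37rv = mapped_ratio(*parse_idxstats(h37rv_idxstats))
ratio_cs = mapped_ratio(*parse_idxstats(cs_idxstats))
difference = ratio_cs - ratio_h37rv  # CS minus H37Rv

print(round(ratio_h37rv, 3), round(ratio_cs, 3), round(difference, 3))
# prints 96.837 97.319 0.482
```

The difference is taken on the unrounded ratios, so the third decimal can differ by one unit from a subtraction of the rounded table values.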
ii) Based on the differences in mapping frequency obtained in i), the MTBC lineages were compared using Mann–Whitney U tests. The comparison between Beijing and EAI isolates showed a significant difference (p < 0.05) in the change in mapping frequency obtained with the consensus sequence relative to H37Rv; H37Rv itself belongs to the Haarlem, LAM, X etc. lineage (lineage 4).
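As a rough illustration of the statistic behind the comparison in ii), the sketch below computes the Mann–Whitney U values for two groups of CS-minus-H37Rv differences. Several choices here are assumptions rather than the authors' procedure: the function name is hypothetical, only the local-mode differences from the table are used, and Ancestral and Modern Beijing isolates are pooled as "Beijing". The source does not state which mapping mode or pooling was used, so this does not attempt to reproduce the reported p-value.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y). U_x counts pairs (a, b) with a > b; ties count 0.5."""
    u_x = 0.0
    for a in x:
        for b in y:
            if a > b:
                u_x += 1.0
            elif a == b:
                u_x += 0.5
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


# Local-mode differences (CS minus H37Rv, in percentage points) from the table.
eai_local = [0.482, 0.553]
# Assumption: Ancestral and Modern Beijing isolates pooled into one group.
beijing_local = [0.360, 0.324, 0.379, 0.389, 0.404, 0.436]

u_eai, u_beijing = mann_whitney_u(eai_local, beijing_local)
print(u_eai, u_beijing)
```

A p-value would normally be taken from a statistics library, for example `scipy.stats.mannwhitneyu`, rather than derived by hand.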