. 2015 Mar 20;16(1):218. doi: 10.1186/s12864-015-1368-9

Table 5.

Comparison of Illumina read mapping efficacy for clinical isolates derived from different lineages, assessed with Bowtie2 and SAMtools

i) Comparison of the numbers of mapped and unmapped reads to the H37Rv sequence or consensus sequence
Isolate	Lineage		H37Rv		Consensus sequence		Subtraction of ratio (%), CS minus H37Rv
		Mapping stringency*	Local	End to end	Local	End to end	Local	End to end
F092	EAI	mapped	681561	664376	684952	667250
		unmapped	22261	39446	18870	36572
		ratio (%)	96.837	94.395	97.319	94.804	0.482	0.408
J156	EAI	mapped	1680156	1650866	1689673	1658917
		unmapped	40162	69452	30645	61401
		ratio (%)	97.665	95.963	98.219	96.431	0.553	0.468
F038	Haarlem, LAM, X etc.	mapped	1024873	997301	1029625	1000714
		unmapped	75113	102685	70361	99272
		ratio (%)	93.171	90.665	93.603	90.975	0.432	0.310
F070	Haarlem, LAM, X etc.	mapped	858126	840921	861393	843463
		unmapped	27822	45027	24555	42485
		ratio (%)	96.860	94.918	97.228	95.205	0.369	0.287
J073	Haarlem, LAM, X etc.	mapped	1534315	1503494	1537891	1504256
		unmapped	11979	42800	8403	42038
		ratio (%)	99.225	97.232	99.457	97.281	0.231	0.049
J147	Haarlem, LAM, X etc.	mapped	847807	836269	849747	836489
		unmapped	11775	23313	9835	23093
		ratio (%)	98.630	97.288	98.856	97.313	0.226	0.026
F081	other non-Beijing	mapped	1004912	974425	1008107	976556
		unmapped	43978	74465	40783	72334
		ratio (%)	95.807	92.901	96.112	93.104	0.305	0.203
J020	other non-Beijing	mapped	1081365	1062537	1085065	1065704
		unmapped	11293	30121	7593	26954
		ratio (%)	98.966	97.243	99.305	97.533	0.339	0.290
J027	other non-Beijing	mapped	751633	741219	754861	744254
		unmapped	5259	15673	2031	12638
		ratio (%)	99.305	97.929	99.732	98.330	0.426	0.401
F022	Ancestral Beijing	mapped	1162270	1143243	1166545	1147960
		unmapped	26600	45627	22325	40910
		ratio (%)	97.763	96.162	98.122	96.559	0.360	0.397
J090	Ancestral Beijing	mapped	490815	484340	492424	486326
		unmapped	5153	11628	3544	9642
		ratio (%)	98.961	97.655	99.285	98.056	0.324	0.400
J002	Ancestral Beijing	mapped	736473	727044	739288	730539
		unmapped	5757	15186	2942	11691
		ratio (%)	99.224	97.954	99.604	98.425	0.379	0.471
J029	Modern Beijing	mapped	953792	936476	957539	941221
		unmapped	10220	27536	6473	22791
		ratio (%)	98.940	97.144	99.329	97.636	0.389	0.492
F076	Modern Beijing	mapped	532526	519473	534742	522431
		unmapped	16374	29427	14158	26469
		ratio (%)	97.017	94.639	97.421	95.178	0.404	0.539
J111	Modern Beijing	mapped	719693	708076	722895	712304
		unmapped	14341	25958	11139	21730
		ratio (%)	98.046	96.464	98.482	97.040	0.436	0.576
ii) Comparison of mapping frequency ratio (%) among the MTBC lineages
		EAI	Haarlem, LAM, X etc.	other non-Beijing
	Haarlem, LAM, X etc.	ns	-	-
	other non-Beijing	ns	ns	-
	Beijing	P < 0.05	ns	ns

In this analysis, the CS based on 13 M. tuberculosis strains (Table 1) was used as the consensus sequence. i) After mapping with Bowtie2 [42] against H37Rv or CS, the idxstats command of SAMtools [43] was used to calculate the mapping efficacy (Table 5). Read mapping with Bowtie2 was performed in both local and end-to-end modes, with all other parameters left at their default values. Significant differences in mapping frequencies were assessed using multiple comparisons of proportions tests [44]. For all isolates, mapping frequencies differed significantly between the H37Rv and CS references (p < 0.0001). For both mapping modes, the ratio of mapped to total reads was calculated, and these values were used to calculate the difference in mapping frequency between the consensus and H37Rv sequences by simple subtraction.
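The ratio and subtraction described above can be sketched in a few lines of Python. This is only an illustration, not the authors' script. It assumes the standard four-column, tab-separated `samtools idxstats` layout (reference name, length, mapped reads, unmapped reads) and that the totals are obtained by summing the mapped and unmapped columns over all lines. The function names, the consensus reference name and the consensus length are hypothetical; the read counts are the local-mode values for isolate F092 from the table.

```python
def parse_idxstats(text):
    """Sum mapped and unmapped read counts over all lines of idxstats output.

    Each line is assumed to be: name <TAB> length <TAB> mapped <TAB> unmapped.
    """
    mapped = 0
    unmapped = 0
    for line in text.strip().splitlines():
        fields = line.split("\t")
        mapped += int(fields[2])
        unmapped += int(fields[3])
    return mapped, unmapped


def mapped_ratio(mapped, unmapped):
    """Percentage of mapped reads among all reads."""
    return 100.0 * mapped / (mapped + unmapped)


# Illustrative idxstats text built from the F092 local-mode counts in the table.
# The "consensus" name and its length are placeholders, not values from the paper.
H37RV_IDXSTATS = "NC_000962.3\t4411532\t681561\t0\n*\t0\t0\t22261\n"
CS_IDXSTATS = "consensus\t4400000\t684952\t0\n*\t0\t0\t18870\n"

h37rv_ratio = mapped_ratio(*parse_idxstats(H37RV_IDXSTATS))
cs_ratio = mapped_ratio(*parse_idxstats(CS_IDXSTATS))

print(f"H37Rv ratio (%): {h37rv_ratio:.3f}")
print(f"CS ratio (%):    {cs_ratio:.3f}")
print(f"CS minus H37Rv:  {cs_ratio - h37rv_ratio:.3f}")
```

With the F092 counts, the three printed values round to the 96.837, 97.319 and 0.482 reported in the corresponding table row.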

ii) Based on the differences in mapping frequency obtained in i), the MTBC lineages were compared using Mann–Whitney U tests. The comparison of the Beijing and EAI lineages showed a significant difference (p < 0.05) in mapping frequency relative to the consensus and H37Rv sequences, the latter belonging to the Haarlem, LAM, X etc. lineage (lineage 4).
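As a rough illustration of the lineage comparison, the Mann–Whitney U statistic can be computed directly from pairwise comparisons. The sketch below is not the authors' procedure. It assumes that the local-mode "CS minus H37Rv" values are the quantities compared and that the ancestral and modern Beijing isolates are pooled into one Beijing group; the legend does not state either choice. The function name is hypothetical, and converting U into a p-value is left to a statistics library (for example `scipy.stats.mannwhitneyu`).

```python
def mann_whitney_u(xs, ys):
    """Return (U_x, U_y) for two samples.

    U_x counts the pairs (x, y) with x > y; ties contribute 0.5.
    U_x + U_y always equals len(xs) * len(ys).
    """
    u_x = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u_x += 1.0
            elif x == y:
                u_x += 0.5
    return u_x, len(xs) * len(ys) - u_x


# Local-mode "CS minus H37Rv" differences (%) taken from the table.
# Pooling ancestral and modern Beijing isolates is an assumption.
eai = [0.482, 0.553]
beijing = [0.360, 0.324, 0.379, 0.389, 0.404, 0.436]

u_eai, u_beijing = mann_whitney_u(eai, beijing)
print(f"U (EAI) = {u_eai}, U (Beijing) = {u_beijing}")
```

A U value of zero for one group means that every value in that group is smaller than every value in the other group.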