Journal of Clinical Microbiology. 2004 Jun;42(6):2573–2580. doi: 10.1128/JCM.42.6.2573-2580.2004

Genomic Characterization of an Endemic Mycobacterium tuberculosis Strain: Evolutionary and Epidemiologic Implications

Dao Nguyen 1, Paul Brassard 1,2, Dick Menzies 3, Louise Thibert 4, Rob Warren 5, Serge Mostowy 2, Marcel Behr 1,2,*
PMCID: PMC427889  PMID: 15184436

Abstract

In a study of 302 Mycobacterium tuberculosis clinical isolates from the low-incidence Canadian-born population of Quebec, we characterized a large endemic strain family by using genomic deletions. The DS6Quebec deleted region (11.4 kb) defined a strain family of 143 isolates encompassing two subgroups: one characterized by pyrazinamide (PZA) susceptibility and the other marked by a PZA-monoresistant phenotype. A second deletion (8 bp) in the pncA gene was shared by all 76 isolates with the PZA resistance phenotype, whereas a third DRv0961 deletion (970 bp) defined a further subset of 15 isolates. From their deletion profiles, we derived a most parsimonious evolutionary scenario and compared multiple standard genotyping modalities (using IS6110 restriction fragment length polymorphism [RFLP], spoligotyping, and mycobacterial interspersed repetitive units [MIRU]) across the deletion-based subgroups. The use of a single genotyping modality yielded an unexpectedly high proportion of clustered isolates for a high IS6110 copy strain (27% by IS6110 RFLP, 61% by MIRU, and 77% by spoligotyping). By combining all three modalities, only 14% were genotypically clustered overall, a result more congruent with the epidemiologic profile of reactivation tuberculosis, as suggested by the older age (mean age, 60 years), rural setting, and low proportion of epidemiologic links. These results provide insight into the evolution of genotypes in endemic strains and the potential for false clustering in molecular epidemiologic studies.


In the past decade, population-based molecular epidemiologic studies of tuberculosis (TB) have been performed in a number of geographically and socially diverse populations (4, 14, 33). Through the use of different genotyping methods (predominantly IS6110-based restriction fragment length polymorphism [RFLP]), various studies have determined the proportion of genotypically clustered isolates and thereby inferred the extent of recent transmission in a population (3). In the process, a number of widespread Mycobacterium tuberculosis strains have been recognized and further characterized in an attempt to better define their global prevalence and epidemiologic impact (5, 11, 12, 23, 43).

To date, strain families have primarily been recognized by similar patterns of genetic markers (e.g., IS6110 patterns) (23), with further refinements including the demonstration of shared IS6110 insertion sites (24) and/or characteristic drug resistance sequence polymorphisms where present (5). More recently, genomic deletions have emerged as unidirectional evolutionary polymorphisms useful for deriving phylogenies of the M. tuberculosis complex and M. tuberculosis sensu stricto isolates (22, 27). Since M. tuberculosis has a clonal population structure, and strains studied to date have revealed an average of 2.9 such deletions per clone, it follows that documenting a signature region of DNA missing only from certain isolates would effectively define the strain and/or family of interest (22).

In Quebec, Canada, we have characterized a prevalent pyrazinamide (PZA) monoresistant M. tuberculosis family based on a distinct mutation profile in the pncA gene (28). The epidemiologic profile of pncA mutant cases and randomly sampled Canadian-born controls was reported previously and provided strong evidence against recent epidemic spread. On closer inspection of the IS6110 RFLP patterns of isolates without the pncA mutation, there appeared to be evidence of a much larger M. tuberculosis family, encompassing both PZA-resistant and PZA-susceptible organisms. To better define this larger endemic M. tuberculosis family of Quebec, we used genomic analysis to look for deleted regions shared by all or some isolates under study. By testing for the presence or absence of specific genomic regions, we have inferred an evolutionary scenario for these closely related bacteria, thereby defining subgroups of increasing genetic similarity. Our results of an endemic strain permit us to compare the microevolution of multiple standard genotyping modalities (IS6110 RFLP, spoligotyping, and mycobacterial interspersed repetitive units [MIRU]) and provide insights regarding the impact of endemic strains in molecular epidemiologic studies.

MATERIALS AND METHODS

Setting.

The province of Quebec, its population, and the current epidemiology of TB have been described in a previous publication (28). Briefly, there are 7.2 million inhabitants living in a very large territory (600,000 square miles), and the Canadian-born population in Quebec today has one of the world's lowest TB incidence rates (1.9 cases per 100,000). The island of Montreal is the largest urban center, with 1.8 million inhabitants.

Study design.

All cases of PZA-resistant (PZA-R) M. tuberculosis and a 20% random sampling of PZA-susceptible (PZA-S) M. tuberculosis isolates in Canadian-born subjects during the 1990 to 2000 period were initially evaluated. Among PZA-R isolates, only those harboring a specific “Quebec” mutation profile in the pncA gene (described below) were included in the study. Isolates were excluded from the study if the patients did not live in Quebec at the time of diagnosis.

Materials.

All culture-positive active TB cases (ca. 90% of reported cases) in the province of Quebec are reported and sent to the provincial public health laboratory (Laboratoire de Santé Publique du Québec) for culture confirmation and routine drug susceptibility testing (including pyrazinamide). Susceptibility to PZA was determined by the radiometric method (BACTEC 460) (17). PZA resistance was defined as an MIC of >100 mg/liter and was confirmed with the pyrazinamidase activity assay (45). Additional biochemical and genetic tests were performed to exclude M. bovis as previously described (28).

Epidemiologic data.

Baseline demographic and clinical data were obtained from the provincial database of reportable diseases (Maladies A Declaration Obligatoire database). Epidemiologic links were obtained from public health chart review and defined as a family, close social, school, or workplace contact with a previous active TB case.

Genotyping.

M. tuberculosis DNA was extracted from clinical isolates by standardized methods, and all study isolates were characterized by the following tests. The “Quebec” mutation profile in the pncA gene—an 8-bp deletion (ATGGCTTG at position 446) and point mutation (C to A) at position 418—was identified by PCR-restriction fragment length polymorphism (RFLP) as described previously (28). IS6110 RFLP by Southern blot was performed by standardized methods (41). Spoligotyping (or spacer oligonucleotide typing) was done by using a PCR-based technique which detects the presence or absence of 43 unique sequence spacers in the direct repeat (DR) locus (Isogen Bioscience BV, Maarssen, The Netherlands) (20). The principal genetic groups based on polymorphisms at the katG463 and gyrA95 loci (37) were determined by using molecular beacon assays with real-time PCR (ABI Prism 7700 Sequence Detection System; Perkin-Elmer Applied Biosystems, Wellesley, Mass.) (29). MIRU genotypes based on the number of copies of repeated units at 12 independent loci across the genome were determined by PCR as previously described (38).

Identification of genomic deletions.

The genomic DNA of two prototypical clinical isolates was hybridized to the Affymetrix M. tuberculosis GeneChip Array according to previously published methodologies (22, 27, 30). To identify genomic deletions specific to the PZA-R clone, we selected one such isolate (no. 13239). To identify deletions potentially marking the larger family, we tested one PZA-S isolate (no. 19218) that shared a similar IS6110 RFLP pattern and a “signature” spoligotype deletion of direct variable repeats (DVRs) 9 and 10. Potentially deleted regions were then confirmed by PCR bridging across the deletion and mapped to the base pair against the H37Rv genome by sequencing and sequence alignment against published genome sequences (Tuberculist, http://genolist.pasteur.fr/TubercuList/). The sequences of the primers used are listed in Table 1. Once confirmed, these regions were tested across the entire panel of isolates by PCR-based analysis. Briefly, a region was considered present when it could be amplified by PCR with the internal primers within the region. A region was judged to be deleted if it failed to amplify with the internal primers and if amplification of both IS6110 flanking regions gave the expected PCR products corresponding to a deleted region. To further verify the specific deletion, the absence of this genomic region was demonstrated by (i) PCR amplification across both the IS6110 element and deleted region and/or (ii) PCR amplification with primers specific to the IS6110 junction sites bridging across the IS6110 element. IS6110 insertion-specific primers were designed as 25-bp primers that include 20 bp of the deletion flanking sequence and 5 bp of the 3′ or 5′ end of the IS6110 sequence in order to confirm that the deletion occurred at the exact same base pair.
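The presence/absence rule described above can be sketched as a small decision function. This is only an illustration of the stated logic, not the laboratory workflow itself: the names `RegionCall` and `call_region` are hypothetical, and the "indeterminate" outcome is an assumption added here for isolates in which neither criterion is met (for example, when a flanking PCR fails).

```python
from enum import Enum


class RegionCall(Enum):
    PRESENT = "present"
    DELETED = "deleted"
    INDETERMINATE = "indeterminate"  # assumption: not a category named in the text


def call_region(internal_amplified: bool,
                upstream_flank_amplified: bool,
                downstream_flank_amplified: bool) -> RegionCall:
    """Classify one genomic region in one isolate from three PCR outcomes."""
    # Present: the internal primers give a product.
    if internal_amplified:
        return RegionCall.PRESENT
    # Deleted: internal PCR fails AND both IS6110 flanking PCRs give
    # the product expected for a deleted region.
    if upstream_flank_amplified and downstream_flank_amplified:
        return RegionCall.DELETED
    # Otherwise no call is made for this isolate.
    return RegionCall.INDETERMINATE


if __name__ == "__main__":
    print(call_region(False, True, True))   # internal fails, both flanks amplify
    print(call_region(False, True, False))  # one flank fails
```

Keeping an explicit third outcome avoids scoring a failed PCR as a deletion.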

TABLE 1.

Primer sequences for identification of deleted regions

| Target region | Primer sequence (5′ to 3′) | Position in H37Rv |
| --- | --- | --- |
| DS6Quebec upstream region | ggtagcaggaacaacgtggt | 1987319 |
| DS6Quebec downstream region | ggtctcgtggcgactgttat | 1999086 |
| DS6Quebec internal region (5′ end) | gcgtcagctggaaggtgtat | 1987689 |
| DS6Quebec internal region (3′ end) | agccacgatctccacaatg | 1998384 |
| DS6Quebec IS6110 3′ flanking region | aagccccggccgcggctggatgaac | 1987438 |
| DS6Quebec IS6110 5′ flanking region | agccagccaaccccggcccttgaac | 1998868 |
| DRv0961 upstream region | ggtaacggtagcctggaaca | 1074109 |
| DRv0961 downstream region | cagcctcattctgttggaca | 1075969 |
| DRv0961 internal region (5′ end) | acaactccgtacccggcgac | 1074699 |
| DRv0961 internal region (3′ end) | gacggccaaagaataactcg | 1074778 |
| DRv0961 IS6110 3′ flanking region | ttcccggccatggtcaccattgaac | 1074254 |
| DRv0961 IS6110 5′ flanking region | atgatcagcttcggatgagtggtga | 1075265 |
| IS6110 3′ end | ttcaaccatcgccgcctct | |
| IS6110 5′ end | ggtacctcctcgatgaacc | |

Genotype analysis.

IS6110 RFLP and MIRU results were scanned into Syngene GeneTools (Synoptics, Cambridge, United Kingdom) and digitized for computer-assisted visual reading by two independent readers. IS6110 RFLP cluster analysis was restricted to high-copy-number isolates (greater than four bands) and was performed with the MFA software (Molecular Fingerprint Analyzer J, v2.0; Stanford Center for Tuberculosis Research). IS6110 RFLP patterns were considered clustered when all bands were identical matches. Dendrograms were constructed with the Dice coefficient (within 2% tolerance) by using the UPGMA (unweighted pair group method with arithmetic mean) algorithm in Gel Compar II (Applied Maths, Kortrijk, Belgium). The Dice coefficient derived from the IS6110 RFLP was used to estimate a genetic similarity index between isolates. Spoligotype and MIRU patterns were coded and analyzed manually as an Excel document by using the sort function. Cluster analyses by IS6110 RFLP, spoligotype, and MIRU were conducted in parallel and subsequently integrated for the concordance analysis. Two genotyping modalities were called concordant if a pair of isolates had matching genotypes with both modalities. In our multimodality analysis, a cluster had to have matching genotypes in all three modalities.
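As a rough illustration of the similarity index, the Dice coefficient for two banding patterns is twice the number of shared bands divided by the total number of bands in both patterns. The sketch below is not the Gel Compar II implementation: the function name `dice_similarity`, the representation of a pattern as a list of numeric band values, and the application of the 2% tolerance relative to the larger of the two band values are all assumptions made for this example.

```python
def dice_similarity(bands_a, bands_b, tolerance=0.02):
    """Dice coefficient 2*shared / (len(a) + len(b)) for two band lists.

    Two bands are counted as shared when they differ by no more than
    `tolerance` (relative to the larger band value). Matching is a simple
    greedy pass over the sorted lists; each band is matched at most once.
    """
    a = sorted(bands_a)
    b = sorted(bands_b)
    i = j = shared = 0
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tolerance * max(a[i], b[j]):
            shared += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    total = len(a) + len(b)
    # Convention chosen here: two empty patterns score 0.0.
    return 2 * shared / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical band values for two isolates.
    print(dice_similarity([1000, 2000, 3000], [1000, 2000, 3000]))
    print(dice_similarity([1000, 2000], [1010, 5000]))
```

Under the stricter clustering definition used in the text, two IS6110 RFLP patterns cluster only when every band matches, which corresponds to a coefficient of 1.0 in this sketch.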

Epidemiologic analysis.

Using genomic deletions to define the sets and subsets of related isolates, we explored the demographic characteristics of patients within each set and tested for the proportion that had epidemiologic links, suggestive of ongoing transmission. Continuous variables were compared by using the t test, and categorical variables were compared by using the chi-square test or the Mantel-Haenszel chi-square test (version 8.0; SAS Software, Cary, N.C.). Statistical testing was considered significant when the P value was <0.05.
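For readers who wish to check a categorical comparison by hand, the following sketch computes a Pearson chi-square statistic for a 2 × 2 table. It is not the SAS procedure used in the study: the function name `chi_square_2x2` is hypothetical, no continuity correction is applied (an assumption), and all row and column totals are assumed to be nonzero. The example counts are the epidemiologic-link figures reported in Table 4 (10 of 42 multiclustered versus 18 of 256 nonmulticlustered cases).

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 degree of freedom, no continuity correction).

    Table layout:
                   outcome+  outcome-
        group 1       a         b
        group 2       c         d
    Assumes every row and column total is nonzero.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the upper-tail probability of a chi-square
    # variable equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Epidemiologic links: 10 of 42 versus 18 of 256 (Table 4).
    chi2, p = chi_square_2x2(10, 42 - 10, 18, 256 - 18)
    print(f"chi-square = {chi2:.2f}, P = {p:.4f}")
```

With these counts the P value falls below 0.001, in line with the value reported in Table 4.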

RESULTS

We initially examined 354 clinical isolates (101 PZA-R and 253 PZA-S). Among the 101 PZA-R isolates, 77 were deemed clonally related because of a common “Quebec” mutation profile in the pncA gene. Furthermore, they all belonged to principal genetic group 2, had a “signature” deletion of DVR 9 and 10 in their spoligotype patterns, and shared similar, but not identical, IS6110 RFLP patterns. We excluded 23 PZA-R isolates without the “Quebec” mutation profile, 23 PZA-S isolates with fewer than five IS6110 bands, 5 isolates with incomplete genotypes, and 1 PZA-R isolate due to PCR inhibition. A total of 302 isolates remained in the study, with 77 PZA-R and 225 PZA-S isolates.

From the genomic study of two prototype isolates (PZA-R isolate 13239 and PZA-S isolate 19218), we confirmed two deletions of interest (Fig. 1). The first deletion is an 11.4-kb region (positions 1987457 to 1998849) deleted from both PZA-S and PZA-R isolates, named DS6Quebec, and associated with an IS6110 insertion. This deletion truncates the plcD (Rv1755) and Rv1765c genes, thereby eliminating the nine intervening open reading frames present in H37Rv. Several different IS6110 insertion sites and deletions (of variable locations and size) have previously been reported in that region (18, 22, 31), including the RvD2 deletion in the H37Rv strain (7) and the DS6 deletion described by Kato-Maeda et al. (22). The second deletion, named DRv0961, was observed in the PZA-R isolate (no. 13239) only. It consists of a 970-bp deletion (positions 1074273 to 1075243) compared to H37Rv and is associated with the insertion of an IS6110 element. The Rv0961 and lprP (Rv0962c) genes are disrupted by the deletion. Both deleted regions involve the truncation of open reading frames annotated in H37Rv, M. tuberculosis strain CDC1551, and M. bovis strain 2122. They can therefore be confidently classified as deletions from the strains under study and are most likely IS6110-mediated events.

FIG. 1. PCR-based identification of deleted regions in study isolates. The solid line represents genomic DNA present in all isolates, and the dashed line represents the genomic region present in H37Rv but deleted in the strain under study here.

By testing for these deletions across the study panel, we were able to successfully assign their presence or absence in 298 of 302 isolates. In 3 PZA-S isolates and 1 PZA-R isolate, the flanking regions failed to amplify, leaving 222 PZA-S and 76 PZA-R isolates for deletion analysis. The DS6Quebec region was deleted from all 76 PZA-R samples and 67 of 222 PZA-S samples. These 143 isolates were characterized by related IS6110 RFLP profiles as shown in Fig. 2 (mean similarity index = 80%; standard deviation [SD] = 8%) and the absence of DVR 9 and 10 in their spoligotype. Among these 143 DS6Quebec(−) isolates, the 76 PZA-R shared the same 8-bp deletion within the pncA gene. Finally, the DRv0961 region was deleted in 15 isolates, all of which had both the pncA deletion and the DS6Quebec deletion. This subset was designated DRv0961(−). We derived the most parsimonious evolutionary scenario based on the isolates' deletion profiles (Fig. 3). The following simple observations suggested the validity and robustness of the deletion-based phylogeny: all DRv0961(−) isolates were PZA-R with the pncA Quebec mutation, all isolates of the pncA Quebec clone were DS6Quebec(−), and all DS6Quebec(−) isolates belonged to principal genetic group 2 as described by Sreevatsan et al. (37).
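The nested structure of these deletions can be illustrated with a short sketch that maps a deletion profile onto the four groups defined in Fig. 3. The function name `assign_group` and the choice to raise an error for non-nested profiles are assumptions made for this example; any profile outside the four listed (for instance, a pncA deletion without the DS6Quebec deletion) would contradict the linear evolutionary scenario, and none is reported in the text.

```python
def assign_group(ds6quebec_deleted: bool,
                 pnca_deleted: bool,
                 drv0961_deleted: bool) -> int:
    """Return the deletion-defined group (1 to 4) for one isolate."""
    groups = {
        (False, False, False): 1,  # no regions deleted
        (True, False, False): 2,   # DS6Quebec only
        (True, True, False): 3,    # DS6Quebec and pncA
        (True, True, True): 4,     # DS6Quebec, pncA, and DRv0961
    }
    profile = (ds6quebec_deleted, pnca_deleted, drv0961_deleted)
    if profile not in groups:
        # A later deletion without the earlier ones breaks the nesting.
        raise ValueError(f"profile {profile} does not fit the nested scenario")
    return groups[profile]


if __name__ == "__main__":
    print(assign_group(True, True, False))
```

Treating non-nested profiles as errors makes the parsimony check explicit: the scenario holds only if every isolate falls into one of the four profiles.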

FIG. 2. Dendrogram of IS6110 RFLP of the 143 isolates in the DS6Quebec(−) family. Isolates belonging to the deletion-defined group 2 are identified by two dots, those in group 3 are identified by three dots, and those in group 4 are identified by four dots. The UPGMA-based dendrogram of IS6110 RFLP partly segregates the deletion-defined groups but does not define a clear and comparable phylogeny.

FIG. 3. Deletion-based evolutionary scenario of the Quebec strain. Group 1, no regions deleted; group 2, DS6Quebec region deleted but no pncA deletion; group 3, DS6Quebec region and pncA deletion but no DRv0961 deletion; group 4, DS6Quebec region, pncA and DRv0961 deletion. ✽, The DS6Quebec region was deleted in 143 of 298 isolates; ✽✽, the pncA deletion was present in 76 of 298 isolates; ✽✽✽, the DRv0961 region was deleted in 15 of 298 isolates.

Accepting that the deletion-based family tree captures the evolutionary scenario of this Quebec family of M. tuberculosis, we turned to the evolution of rapidly evolving genotyping markers used for epidemiologic studies (IS6110 RFLP, spoligotype, and MIRU). To facilitate this analysis, we arbitrarily assigned the isolates to four mutually exclusive groups (groups 1, 2, 3, and 4) as outlined in Fig. 3. The proportions of isolates with matching fingerprints by individual genotyping modality are presented for each group in Table 2. In keeping with our expectation that the deletion-based groups defined evolutionary steps of a strain family, the mean similarity index by IS6110 for each group was 47% (SD = 16%) for group 1, 77% (SD = 8%) for group 2, 87% (SD = 6%) for group 3, and 91% (SD = 5%) for group 4. The genetic diversity of each group, as represented by the number of genotyping patterns relative to the number of isolates, also decreased from group 1 to group 4. By combining all three modalities, we observed a gradient of increasing concordance across the three modalities consistent with the successive steps of our evolutionary scenario. Furthermore, all triple-matched clusters (16 clusters with n = 42) were confined to the same deletion-defined group, with the exception of one cluster of 2.
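The triple-match criterion can be sketched as grouping isolates by their combined genotype. This is a minimal illustration, not the manual spreadsheet procedure described in Materials and Methods: the function name `multimodality_clusters`, the isolate identifiers, and the genotype labels are hypothetical, and each genotype is assumed to be already coded as a comparable label.

```python
from collections import defaultdict


def multimodality_clusters(isolates):
    """Group isolates whose genotypes match in all three modalities.

    `isolates` maps an isolate ID to a tuple
    (IS6110 RFLP pattern, spoligotype, MIRU type).
    Returns a list of clusters; a cluster needs at least two isolates.
    """
    by_genotype = defaultdict(list)
    for isolate_id, genotype in isolates.items():
        by_genotype[tuple(genotype)].append(isolate_id)
    return [sorted(ids) for ids in by_genotype.values() if len(ids) >= 2]


if __name__ == "__main__":
    # Hypothetical isolates and genotype labels.
    example = {
        "A": ("rflp1", "spol1", "miru1"),
        "B": ("rflp1", "spol1", "miru1"),
        "C": ("rflp1", "spol1", "miru2"),  # differs by MIRU only
        "D": ("rflp2", "spol2", "miru3"),
    }
    print(multimodality_clusters(example))
```

In the example, isolate C matches A and B by IS6110 RFLP and spoligotyping but not by MIRU, so it is left out of the triple-matched cluster.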

TABLE 2.

Genotyping analysis by IS6110 RFLP, spoligotyping, and MIRU in each group of the Quebec family^a

| Method | Group 1 (n = 155): % clustered (no. of clusters) | Group 1: no. of patterns | Group 2 (n = 67): % clustered (no. of clusters) | Group 2: no. of patterns | Group 3 (n = 61): % clustered (no. of clusters) | Group 3: no. of patterns | Group 4 (n = 15): % clustered (no. of clusters) | Group 4: no. of patterns |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| IS6110 RFLP | 38 (13) | 130 | 6 (2)* | 64 | 36 (8)* | 47 | 60 (3) | 9 |
| Spoligotyping | 72 (24) | 67 | 74 (11) | 24 | 89 (4) | 11 | 100 (4) | 4 |
| MIRU | 54 (23) | 93 | 60 (24) | 35 | 75 (10) | 25 | 74 (5) | 9 |
| Clustered by both IS6110 RFLP and spoligotyping | 22 (13) | | 6 (2)* | | 31 (7)* | | 60 (3) | |
| Clustered by both IS6110 RFLP and MIRU | 19 (11) | | 0 (0)* | | 18 (4)* | | 27 (2) | |
| Clustered by both spoligotyping and MIRU | 35 (15) | | 36 (6)* | | 62 (8)* | | 73 (3) | |
| “Triple” clustered | 19 (11) | | 0 (0)* | | 15 (3)* | | 27 (2) | |

^a Groups 1, 2, 3, and 4 are defined in Fig. 3. The percent clustered represents the proportion with matching genotypes in the group. *, statistically significant difference (P < 0.05) when groups 2 and 3 are compared to group 4.

In this low-incidence study population, the proportion of matching genotypes was surprisingly high. By IS6110 identical matching, the proportion clustered was 27%, whereas it increased to 49% if a one-IS6110-band difference was tolerated. Using spoligotyping and MIRU, the proportion of isolates matched was 77 and 62%, respectively. To investigate whether these high rates of clustering represented ongoing transmission, we performed a more conservative analysis by requiring matching patterns across all three modalities to infer transmission events. Using this approach, only 14% (42 of 298) of all isolates would be clustered, suggesting that 9% of active TB cases were due to ongoing transmission (using the “n − 1” method). The epidemiologic profile of each deletion-based group is presented in Table 3. Of interest, we identified 11 clusters of up to seven cases in group 1. Among these 29 clustered cases, 8 had identified epidemiologic links, 6 of which agreed with genotypic matching, suggesting to us that some of these group 1 clusters represent unrecognized transmission events. Conversely, it is also noteworthy that of four cases with epidemiologic links in group 4, two were linked to cases with different genotypes and without the DRv0961 deletion. This suggested that upon close investigation for contacts, one can occasionally obtain false-positive epidemiologic links. Clustered cases by IS6110 RFLP alone are contrasted with clustered cases with matching across all three modalities in Table 4. In both analyses, the epidemiologic profile is not particularly suggestive of epidemic TB; however, the association of age appeared more pronounced when we used matching across the different modalities.
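The “n − 1” estimate quoted above can be reproduced with simple arithmetic: one case per cluster is treated as the source, so the number of cases attributed to recent transmission is the number of clustered isolates minus the number of clusters. The sketch below uses the 42 triple-matched isolates in 16 clusters among 298 isolates reported in the text; the function name `recent_transmission_fraction` is hypothetical.

```python
def recent_transmission_fraction(n_clustered: int,
                                 n_clusters: int,
                                 n_total: int) -> float:
    """'n - 1' method: (clustered isolates - clusters) / all isolates."""
    return (n_clustered - n_clusters) / n_total


if __name__ == "__main__":
    clustered = 42 / 298                                   # about 0.14
    recent = recent_transmission_fraction(42, 16, 298)     # 26 / 298, about 0.087
    print(f"clustered: {clustered:.1%}, recent transmission: {recent:.1%}")
```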

TABLE 3.

Epidemiologic comparison of the Quebec strain family and its subsets^a

| Parameter | Group 1 (n = 155) | Group 2 (n = 67) | Group 3 (n = 61) | Group 4 (n = 15) |
| --- | --- | --- | --- | --- |
| Mean age (yr) ± SD | 58 ± 21 | 66 ± 18 | 58 ± 16 | 61 ± 21 |
| % Female | 36 (56/155) | 33 (22/67) | 31 (19/61) | 27 (4/15) |
| % Smear positive^b | 63 (41/65) | 85 (22/26) | 67 (18/27) | 86 (6/7) |
| % Epidemiologic link^c | 8 (12/155) | 10 (7/67) | 8 (5/61) | 27 (4/15) |
| % Living outside Montreal^d | 78 (36/155) | 94 (63/67) | 87 (53/61) | 100 (15/15) |

^a None of the variables were statistically different among the three groups. Numbers in parentheses represent the number of cases affected versus the total number of cases.

^b Among the pulmonary TB cases.

^c From public health charts.

^d At the time of diagnosis.

TABLE 4.

Epidemiologic characteristics of clustered cases compared to nonclustered cases as defined by identical matching across IS6110 RFLP, spoligotype, and MIRU^a

| Parameter | Multiclustered (n = 42) | Nonmulticlustered (n = 256) | P | IS6110 clustered (n = 81) | IS6110 nonclustered (n = 217) | P |
| --- | --- | --- | --- | --- | --- | --- |
| Mean age (yr) ± SD | 51 ± 19 | 62 ± 19 | <0.001 | 56 ± 19 | 62 ± 19 | <0.05 |
| % Female | 40 | 34 | NS^e | 30 (24/81) | 36 (79/217) | NS |
| % Smear positive^b | 76 | 68 | NS | 72 (28/39) | 69 (59/86) | NS |
| % Epidemiologic link^c | 24 (10/42) | 7 (18/256) | <0.001 | 22 (18/81) | 5 (10/217) | <0.001 |
| % Living outside Montreal^d | 79 | 85 | NS | 83 (67/81) | 84 (183/217) | NS |

^a Numbers in parentheses represent the number of cases affected versus the total number of cases.

^b Among the pulmonary TB cases.

^c From public health charts.

^d At the time of diagnosis.

^e NS, not significant.

DISCUSSION

Molecular epidemiologic studies were initially performed in defined geographic regions (such as cities or neighborhoods) over a relatively short period of time (typically 2 to 5 years) (1, 33, 40). In these settings, IS6110 RFLP clustering analysis provided epidemiologic inferences that were well validated and became a valuable tool for understanding the transmission patterns of TB. Since then, many studies have expanded their scope to study longer periods of time (one or more decades) and span several states or even countries (19, 26). In the process, a large number of isolates sharing highly related genotypes (by IS6110 RFLP or spoligotype) have been uncovered, and prevalent strain families such as the Beijing, Haarlem, or African family have been characterized (5, 12, 23, 36). Although the DS6Quebec(−) strain family appears common in Quebec, representing perhaps one quarter of M. tuberculosis isolates in the Canadian-born population, some evidence suggests that this family may be more widespread globally. For instance, Sampson et al. have identified strains harboring the exact same IS6110 3′-end insertion site in South Africa but without characterization of the 5′ end or deleted region (31).

From comparisons of diverse M. tuberculosis strains, it is now known that M. tuberculosis has a clonal population structure and has likely undergone a relatively recent clonal expansion (34, 37). In contrast to single nucleotide polymorphisms (SNPs) that are relatively infrequent in M. tuberculosis (3.6 × 10^−4 to 1 × 10^−5 nucleotide sites) (13, 21, 37), large sequence polymorphisms (LSPs) appear to be a major source of genetic variability among M. tuberculosis strains and species of the M. tuberculosis complex (13). Among these LSPs, genomic deletions are believed to represent unique and irreversible events, where all isolates of the same strain share a truncation of open reading frames (based on H37Rv) at the same exact base pair. Consequently, these LSPs have been used to define and reconstruct the phylogeny of the M. tuberculosis complex and M. tuberculosis clinical isolates (22, 27).

Our data suggest that the DS6Quebec deletion “brands” a larger, potentially global family, within which the pncA and DRv0961 deletions define successive subsets. This indicates that our locally endemic strain is far more common in Quebec than previously suspected, representing nearly one-quarter of all isolates in this province, as opposed to the 6.2% attributed to the pncA clone. Although the success of this strain in Quebec may point to an inherent biologic advantage that we have not identified, we hypothesize instead that this strain has undergone a clonal expansion in the Quebec population, a phenomenon akin to the clonal expansion of M. bovis strains recently reported across herds in the United Kingdom (34). In the historically isolated French Canadian population of Quebec, a newly introduced M. tuberculosis strain may have met little meaningful competition. The coexistence of both the PZA-S (group 2) and PZA-R (groups 3 and 4) subsets and the absence of the pncA “Quebec” mutation profile in other geographic regions together suggest that the pncA deletion occurred at a later point in Quebec. Although it is not possible for us to provide a more accurate chronological definition of this family, all evidence points to an established endemic strain with little evidence of ongoing spread.

Since each branch of the proposed evolutionary tree serves as a genetic landmark of this strain's evolution, the proposed scenario provides a backbone to study the evolution of the genotypic profiles. From an evolutionary perspective, members of a longstanding endemic strain should demonstrate similar, but not necessarily identical, genotypes across different modalities. One would expect that the more remote the common ancestral strain, the more discordant the genetic markers should be. In contrast to studies that estimate the relative molecular clock of markers by comparing genotypic polymorphisms across a panel of potentially unrelated isolates, our results examine the parallel evolution of these markers over time among progeny of a common ancestor. Reassuringly, with just one exception, all clusters defined by matching across the three modalities (IS6110 RFLP, spoligotyping, and MIRU) were confined to one branch of the deletion-defined evolutionary scenario.

In the setting of ongoing transmission, the identification of locally or globally prevalent M. tuberculosis strains typically suggests relatively recent clonal expansion. However, the epidemiologic interpretation is challenged by the possibility that shared patterns may also represent reactivation of a locally endemic strain without epidemiologic links. For instance, in several high-incidence Asian countries, over half the isolates studied have been of the Beijing genotype, thus providing insufficient background genetic diversity to accurately discern clonal outbreaks (2, 9, 43). Conversely, matched IS6110 RFLP patterns have been inferred to occur due to reactivation in certain low incidence settings, although it has often been difficult to confidently exclude some degree of ongoing transmission (6, 10, 46). On the basis of our previous data showing no more evidence of transmission with our strain than with random controls, we now have a setting that permits a more thorough investigation of the impact of established endemic strains in molecular epidemiologic studies (28).

In our sample, the use of any single genotyping modality alone resulted in a large proportion of isolates having matching genotypes and would have failed to distinguish a recent outbreak from reactivation disease. By requiring matching across all three genotyping modalities, we were able to significantly decrease the proportion of cases clustered and the size of remaining clusters. This conservative approach may underestimate recent transmission in certain epidemic settings but was deemed appropriate in our setting. The long study period (10 years) in conjunction with the relatively slow molecular clock of spoligotyping would have further contributed to the “false clustering.” Although it has been previously observed that the use of two or more genotyping modalities enhances the discriminatory power of DNA fingerprinting, many studies have focused on isolates with a low IS6110 copy number (8, 44). Our results suggest that the enhanced resolution afforded by the concurrent use of multiple genotyping modalities is valuable where there are endemic strains, even when the isolates have a high number of IS6110 copies.

The degree of clustering observed in the present study when one genotyping modality alone was used should sound a cautionary note for the interpretation of certain molecular clustering studies. As demonstrated here, a high proportion of matching genotypic patterns may occur in the absence of recent transmission in a setting with a low pretest probability of epidemic spread. Spoligotyping, known to be significantly less discriminatory than IS6110 RFLP in high-copy IS6110 isolates, resulted in much greater clustering (77%) than IS6110 RFLP, as expected (15, 23, 39). MIRU, a recently developed genotyping modality, was believed to be nearly as discriminatory as IS6110 RFLP, although these data are derived from the study of small numbers of epidemiologically unrelated isolates and established outbreaks (16, 25, 32, 38). In our analysis, MIRU analysis also resulted in a significantly higher proportion of matching patterns (61%) than with IS6110 RFLP, and this was still the case when we accepted near-identical IS6110 RFLP patterns as a match. Whether MIRU clustering performs better in population-based studies still remains to be determined (35), but our data suggest that its ability to discriminate between unrelated isolates is unlikely to match that of IS6110 RFLP in isolates with a high copy number of IS6110. It is possible that our study of an endemic strain underestimates the discriminatory power of MIRU typing and that this genotyping modality performs better in a more diverse population. On the other hand, our study sample only represents a 20% sampling of all TB cases, thereby underestimating the proportion of clustered isolates. As MIRU typing becomes more widely used, it will be important to establish the epidemiologic validity of MIRU clusters in different settings. In our setting with an endemic strain, the results obtained suggest that both spoligotyping and MIRU may be prone to “false-positive” clustering.

Our study further supports the use of genomic deletions to define dominant strains in both high and low incidence settings, thereby facilitating the tracking and study of these strains. If we accept that the evolution of genotyping patterns may be strain dependent (42) (such as with low-IS6110-copy strains or the spoligotype pattern of the Beijing strain), this strain family nonetheless shows no unusual genotypic feature to suspect that its evolutionary pattern is unique. In describing the genotypic profile of this strain family across successive evolutionary “generations,” we provide a template that may be useful in other population-based studies to recognize and characterize endemic strains.

Acknowledgments

We thank Jennifer Westley, Micheal Purdy, and Annie Gatewood for assistance with genotyping and genetic analysis.

This study was supported by grants from the Canadian Institutes for Health Research to P.B. 43896 and from the Sequella Global Tuberculosis Foundation to M.B. D.N. is a recipient of a FRSQ Bourse de Formation, and P.B. and M.B. are New Investigators of CIHR.

REFERENCES

  • 1. Alland, D., G. E. Kalkut, A. R. Moss, R. A. McAdam, J. A. Hahn, W. Bosworth, E. Drucker, and B. R. Bloom. 1994. Transmission of tuberculosis in New York City: an analysis by DNA fingerprinting and conventional epidemiologic methods. N. Engl. J. Med. 330:1710-1716.
  • 2. Anh, D. D., M. W. Borgdorff, L. N. Van, N. T. Lan, T. van Gorkom, K. Kremer, and D. van Soolingen. 2000. Mycobacterium tuberculosis Beijing genotype emerging in Vietnam. Emerg. Infect. Dis. 6:302-305.
  • 3. Barnes, P. F., and M. D. Cave. 2003. Molecular epidemiology of tuberculosis. N. Engl. J. Med. 349:1149-1156.
  • 4. Bauer, J., Z. Yang, S. Poulsen, and A. B. Andersen. 1998. Results from 5 years of nationwide DNA fingerprinting of Mycobacterium tuberculosis complex isolates in a country with a low incidence of M. tuberculosis infection. J. Clin. Microbiol. 36:305-308.
  • 5. Bifani, P. J., B. Mathema, N. E. Kurepina, and B. N. Kreiswirth. 2002. Global dissemination of the Mycobacterium tuberculosis W-Beijing family strains. Trends Microbiol. 10:45-52.
  • 6. Braden, C. R., G. L. Templeton, M. D. Cave, S. E. Valway, I. M. Onorato, K. G. Castro, D. Moers, Z. Yang, W. W. Stead, and J. H. Bates. 1997. Interpretation of restriction fragment length polymorphism analysis of Mycobacterium tuberculosis isolates from a state with a large rural population. J. Infect. Dis. 175:1446-1452.
  • 7. Brosch, R., W. J. Philipp, E. Stavropoulos, M. J. Colston, S. T. Cole, and S. V. Gordon. 1999. Genomic analysis reveals variation between Mycobacterium tuberculosis H37Rv and the attenuated M. tuberculosis H37Ra strain. Infect. Immun. 67:5768-5774.
  • 8. Burman, W. J., R. R. Reves, A. P. Hawkes, C. A. Rietmeijer, Z. Yang, H. El-Hajj, J. H. Bates, and M. D. Cave. 1997. DNA fingerprinting with two probes decreases clustering of Mycobacterium tuberculosis. Am. J. Respir. Crit. Care Med. 155:1140-1146.
  • 9. Chan, M. Y., M. Borgdorff, C. W. Yip, P. E. de Haas, W. S. Wong, K. M. Kam, and D. van Soolingen. 2001. Seventy percent of the Mycobacterium tuberculosis isolates in Hong Kong represent the Beijing genotype. Epidemiol. Infect. 127:169-171.
  • 10. Cowan, L. S., and J. T. Crawford. 2002. Genotype analysis of Mycobacterium tuberculosis isolates from a sentinel surveillance population. Emerg. Infect. Dis. 8:1294-1302.
  • 11. Douglas, J. T., L. Qian, J. C. Montoya, J. M. Musser, J. D. van Embden, D. van Soolingen, and K. Kremer. 2003. Characterization of the Manila family of Mycobacterium tuberculosis. J. Clin. Microbiol. 41:2723-2726.
  • 12. Filliol, I., J. R. Driscoll, D. van Soolingen, B. N. Kreiswirth, K. Kremer, G. Valetudie, D. A. Dang, R. Barlow, D. Banerjee, P. J. Bifani, K. Brudey, A. Cataldi, R. C. Cooksey, D. V. Cousins, J. W. Dale, O. A. Dellagostin, F. Drobniewski, G. Engelmann, S. Ferdinand, D. Gascoyne-Binzi, M. Gordon, M. C. Gutierrez, W. H. Haas, H. Heersma, E. Kassa-Kelembho, M. L. Ho, A. Makristathis, C. Mammina, G. Martin, P. Mostrom, I. Mokrousov, V. Narbonne, O. Narvskaya, A. Nastasi, S. N. Niobe-Eyangoh, J. W. Pape, V. Rasolofo-Razanamparany, M. Ridell, M. L. Rossetti, F. Stauffer, P. N. Suffys, H. Takiff, J. Texier-Maugein, V. Vincent, J. H. De Waard, C. Sola, and N. Rastogi. 2003. Snapshot of moving and expanding clones of Mycobacterium tuberculosis and their global distribution assessed by spoligotyping in an international study. J. Clin. Microbiol. 41:1963-1970.
  • 13. Fleischmann, R. D., D. Alland, J. A. Eisen, L. Carpenter, O. White, J. Peterson, R. DeBoy, R. Dodson, M. Gwinn, D. Haft, E. Hickey, J. F. Kolonay, W. C. Nelson, L. A. Umayam, M. Ermolaeva, S. L. Salzberg, A. Delcher, T. Utterback, J. Weidman, H. Khouri, J. Gill, A. Mikula, W. Bishai, J. W. Jacobs, Jr., J. C. Venter, and C. M. Fraser. 2002. Whole-genome comparison of Mycobacterium tuberculosis clinical and laboratory strains. J. Bacteriol. 184:5479-5490.
  • 14. Godfrey-Faussett, P., P. Sonnenberg, S. C. Shearer, M. C. Bruce, C. Mee, L. Morris, and J. Murray. 2000. Tuberculosis control and molecular epidemiology in a South African gold-mining community. Lancet 356:1066-1071.
  • 15. Goyal, M., N. A. Saunders, J. D. van Embden, D. B. Young, and R. J. Shaw. 1997. Differentiation of Mycobacterium tuberculosis isolates by spoligotyping and IS6110 restriction fragment length polymorphism. J. Clin. Microbiol. 35:647-651.
  • 16. Hawkey, P. M., E. G. Smith, J. T. Evans, P. Monk, G. Bryan, H. H. Mohamed, M. Bardhan, and R. N. Pugh. 2003. Mycobacterial interspersed repetitive unit typing of Mycobacterium tuberculosis compared to IS6110-based restriction fragment length polymorphism analysis for investigation of apparently clustered cases of tuberculosis. J. Clin. Microbiol. 41:3514-3520.
  • 17. Heifets, L. 2002. Susceptibility testing of Mycobacterium tuberculosis to pyrazinamide. J. Med. Microbiol. 51:11-12.
  • 18. Ho, T. B., B. D. Robertson, G. M. Taylor, R. J. Shaw, and D. B. Young. 2000. Comparison of Mycobacterium tuberculosis genomes reveals frequent deletions in a 20-kb variable region in clinical isolates. Yeast 17:272-282.
  • 19. Ijaz, K., Z. Yang, H. S. Matthews, J. H. Bates, and M. D. Cave. 2002. Mycobacterium tuberculosis transmission between cluster members with similar fingerprint patterns. Emerg. Infect. Dis. 8:1257-1259.
  • 20. Kamerbeek, J., L. Schouls, A. Kolk, M. van Agterveld, D. van Soolingen, S. Kuijper, A. Bunschoten, H. Molhuizen, R. Shaw, M. Goyal, and J. D. A. van Embden. 1997. Simultaneous detection and strain differentiation of Mycobacterium tuberculosis for diagnosis and epidemiology. J. Clin. Microbiol. 35:907-914.
  • 21. Kapur, V., T. S. Whittam, and J. M. Musser. 1994. Is Mycobacterium tuberculosis 15,000 years old? J. Infect. Dis. 170:1348-1349.
  • 22. Kato-Maeda, M., J. T. Rhee, T. R. Gingeras, H. Salamon, J. Drenkow, N. Smittipat, and P. M. Small. 2001. Comparing genomes within the species Mycobacterium tuberculosis. Genome Res. 11:547-554.
  • 23. Kremer, K., D. van Soolingen, R. Frothingham, W. H. Haas, P. W. Hermans, C. Martin, P. Palittapongarnpim, B. B. Plikaytis, L. W. Riley, M. A. Yakrus, J. M. Musser, and J. D. van Embden. 1999. Comparison of methods based on different molecular epidemiological markers for typing of Mycobacterium tuberculosis complex strains: interlaboratory study of discriminatory power and reproducibility. J. Clin. Microbiol. 37:2607-2618.
  • 24. Kurepina, N. E., S. Sreevatsan, B. B. Plikaytis, P. J. Bifani, N. D. Connell, R. J. Donnelly, D. van Soolingen, J. M. Musser, and B. N. Kreiswirth. 1998. Characterization of the phylogenetic distribution and chromosomal insertion sites of five IS6110 elements in Mycobacterium tuberculosis: non-random integration in the dnaA-dnaN region. Tuberc. Lung Dis. 79:31-42.
  • 25. Mazars, E., S. Lesjean, A. L. Banuls, M. Gilbert, V. Vincent, B. Gicquel, M. Tibayrenc, C. Locht, and P. Supply. 2001. High-resolution minisatellite-based typing as a portable approach to global analysis of Mycobacterium tuberculosis molecular epidemiology. Proc. Natl. Acad. Sci. USA 98:1901-1906.
  • 26. McElroy, P. D., T. R. Sterling, C. R. Driver, B. Kreiswirth, C. L. Woodley, W. A. Cronin, D. X. Hardge, K. L. Shilkret, and R. Ridzon. 2002. Use of DNA fingerprinting to investigate a multiyear, multistate tuberculosis outbreak. Emerg. Infect. Dis. 8:1252-1256.
  • 27. Mostowy, S., D. Cousins, J. Brinkman, A. Aranaz, and M. A. Behr. 2002. Genomic deletions suggest a phylogeny for the Mycobacterium tuberculosis complex. J. Infect. Dis. 186:74-80.
  • 28. Nguyen, D., P. Brassard, J. Westley, L. Thibert, M. Proulx, K. Henry, K. Schwartzman, D. Menzies, and M. A. Behr. 2003. Widespread pyrazinamide-resistant Mycobacterium tuberculosis family in a low-incidence setting. J. Clin. Microbiol. 41:2878-2883.
  • 29. Rhee, J. T., A. S. Piatek, P. M. Small, L. M. Harris, S. V. Chaparro, F. R. Kramer, and D. Alland. 1999. Molecular epidemiologic evaluation of transmissibility and virulence of Mycobacterium tuberculosis. J. Clin. Microbiol. 37:1764-1770.
  • 30. Salamon, H., M. Kato-Maeda, P. M. Small, J. Drenkow, and T. R. Gingeras. 2000. Detection of deleted genomic DNA using a semiautomated computational analysis of GeneChip data. Genome Res. 10:2044-2054.
  • 31. Sampson, S. L., R. M. Warren, M. Richardson, G. D. van der Spuy, and P. D. van Helden. 1999. Disruption of coding regions by IS6110 insertion in Mycobacterium tuberculosis. Tuberc. Lung Dis. 79:349-359.
  • 32. Savine, E., R. M. Warren, G. D. Van Der Spuy, N. Beyers, P. D. Van Helden, C. Locht, and P. Supply. 2002. Stability of variable-number tandem repeats of mycobacterial interspersed repetitive units from 12 loci in serial isolates of Mycobacterium tuberculosis. J. Clin. Microbiol. 40:4561-4566.
  • 33. Small, P. M., P. C. Hopewell, S. P. Singh, A. Paz, J. Parsonnet, D. C. Ruston, G. F. Schecter, C. L. Daley, and G. K. Schoolnik. 1994. The epidemiology of tuberculosis in San Francisco: a population-based study using conventional and molecular methods. N. Engl. J. Med. 330:1703-1709.
  • 34. Smith, N. H., J. Dale, J. Inwald, S. Palmer, S. V. Gordon, R. G. Hewinson, and S. J. Maynard. 2003. The population structure of Mycobacterium bovis in Great Britain: clonal expansion. Proc. Natl. Acad. Sci. USA 100:15271-15275.
  • 35. Sola, C., S. Ferdinand, C. Mammina, A. Nastasi, and N. Rastogi. 2001. Genetic diversity of Mycobacterium tuberculosis in Sicily based on spoligotyping and variable number of tandem DNA repeats and comparison with a spoligotyping database for population-based analysis. J. Clin. Microbiol. 39:1559-1565.
  • 36. Sola, C., I. Filliol, M. C. Gutierrez, I. Mokrousov, V. Vincent, and N. Rastogi. 2001. Spoligotype database of Mycobacterium tuberculosis: biogeographic distribution of shared types and epidemiologic and phylogenetic perspectives. Emerg. Infect. Dis. 7:390-396.
  • 37. Sreevatsan, S., X. Pan, K. E. Stockbauer, N. D. Connell, B. N. Kreiswirth, T. S. Whittam, and J. M. Musser. 1997. Restricted structural gene polymorphism in the Mycobacterium tuberculosis complex indicates evolutionarily recent global dissemination. Proc. Natl. Acad. Sci. USA 94:9869-9874.
  • 38. Supply, P., E. Mazars, S. Lesjean, V. Vincent, B. Gicquel, and C. Locht. 2000. Variable human minisatellite-like regions in the Mycobacterium tuberculosis genome. Mol. Microbiol. 36:762-771.
  • 39. Torrea, G., C. Offredo, M. Simonet, B. Gicquel, P. Berche, and C. Pierre-Audigier. 1996. Evaluation of tuberculosis transmission in a community by 1 year of systematic typing of Mycobacterium tuberculosis clinical isolates. J. Clin. Microbiol. 34:1043-1049.
  • 40. van Deutekom, H., J. J. J. Gerritsen, D. van Soolingen, E. J. C. van Ameijden, J. D. A. van Embden, and R. A. Coutinho. 1997. A molecular epidemiological approach to studying the transmission of tuberculosis in Amsterdam. Clin. Infect. Dis. 25:1071-1077.
  • 41. van Embden, J. D., M. D. Cave, J. T. Crawford, J. W. Dale, K. D. Eisenach, B. Gicquel, P. W. Hermans, C. Martin, R. A. McAdam, and T. M. Shinnick. 1993. Strain identification of Mycobacterium tuberculosis by DNA fingerprinting: recommendations for a standardized methodology. J. Clin. Microbiol. 31:406-409.
  • 42. van Embden, J. D., T. van Gorkom, K. Kremer, R. Jansen, B. A. van der Zeijst, and L. M. Schouls. 2000. Genetic variation and evolutionary origin of the direct repeat locus of Mycobacterium tuberculosis complex bacteria. J. Bacteriol. 182:2393-2401.
  • 43. van Soolingen, D., L. Qian, P. E. de Haas, J. T. Douglas, H. Traore, F. Portaels, H. Z. Qing, D. Enkhsaikan, P. Nymadawa, and J. D. van Embden. 1995. Predominance of a single genotype of Mycobacterium tuberculosis in countries of east Asia. J. Clin. Microbiol. 33:3234-3238.
  • 44. Warren, R., M. Richardson, S. Sampson, J. H. Hauman, N. Beyers, P. R. Donald, and P. D. van Helden. 1996. Genotyping of Mycobacterium tuberculosis with additional markers enhances accuracy in epidemiological studies. J. Clin. Microbiol. 34:2219-2224.
  • 45. Wayne, L. G. 1974. Simple pyrazinamidase and urease tests for routine identification of mycobacteria. Am. Rev. Respir. Dis. 109:147-151.
  • 46. Yang, Z. H., J. H. Bates, K. D. Eisenach, and M. D. Cave. 2001. Secondary typing of Mycobacterium tuberculosis isolates with matching IS6110 fingerprints from different geographic regions of the United States. J. Clin. Microbiol. 39:1691-1695.

Articles from Journal of Clinical Microbiology are provided here courtesy of American Society for Microbiology (ASM)
