Abstract
Background
Whole-genome sequencing (WGS) has emerged as a powerful tool for elucidating Mycobacterium tuberculosis (MTB) transmission dynamics and drug resistance patterns. In China, the application of WGS in TB surveillance has been rapidly expanding. However, molecular epidemiological studies based on WGS data from low-incidence areas remain limited. Huzhou City, located in northern Zhejiang Province, reported a TB incidence rate of 27.16 per 100,000 population in 2024, which was lower than both the national and provincial averages. In July 2023, Huzhou pioneered China’s first “TB-Free City” initiative. To support public health efforts and facilitate the development of a WGS-based molecular surveillance network tailored for low-incidence settings, we performed WGS on 350 MTB isolates obtained from culture-positive TB patients in Huzhou between March 2023 and September 2024. Phylogenetic analysis, drug resistance profiling, and transmission cluster identification (using a ≤ 12 SNP threshold) were conducted to characterize the molecular epidemiology of TB in this region.
Results
Lineage 2.2.1 (Beijing genotype) was predominant (80.0%). A total of 86 isolates (24.6%) harbored drug resistance-associated mutations, including 2.0% MDR-TB and 1.7% pre-XDR-TB, with no XDR-TB or resistance to bedaquiline, linezolid, or delamanid detected. We identified 28 genomic clusters comprising 65 isolates (18.6%), with a clustering rate of 11.6% among DR-TB cases. Furthermore, 79.1% (68/86) of drug-resistant TB (DR-TB) cases were likely attributable to recent transmission, with clustered DR-TB strains sharing identical resistance-conferring mutations. Comparative analysis revealed that patients under 60 years of age were significantly more likely to be involved in recent transmission events (P = 0.035), while lineage, gender, occupation, treatment history, and local residency were not statistically associated with clustering.
Conclusions
Our findings suggest that recent transmission, particularly among younger individuals, contributes substantially to the DR-TB burden in Huzhou. WGS-based surveillance revealed moderate resistance levels and limited transmission, supporting the ongoing “TB-Free City” initiative. Enhanced genomic monitoring and early intervention targeting younger, mobile populations may further curb TB transmission in low-incidence settings.
Supplementary Information
The online version contains supplementary material available at 10.1186/s12864-025-12202-8.
Keywords: Tuberculosis, Whole-genome sequencing, Drug resistance, Mycobacterium tuberculosis, Molecular epidemiology, Transmission dynamics, China
Introduction
Tuberculosis (TB), caused by the Mycobacterium tuberculosis (MTB) complex, remains one of the most serious public health challenges globally. Despite a gradual global decline, TB continues to impose a heavy disease burden, particularly in high-incidence countries like China. According to the WHO Global Tuberculosis Report 2024 [1], an estimated 10.8 million new TB cases occurred worldwide in 2023, with over 1.25 million TB-related deaths. China accounted for approximately 741,000 new cases, making it the country with the third-largest TB burden globally.
To tackle this challenge, both the World Health Organization (WHO) and the Chinese government have established ambitious goals for TB control. The WHO’s End TB Strategy aims to reduce TB incidence by 90% and TB-related mortality by 95% by 2035. In line with this global agenda, China has incorporated TB control into the national “Healthy China 2030” strategy and recently launched the “National TB Control Plan (2024–2030),” targeting incidence reductions to 52 per 100,000 by 2025 and 43 per 100,000 by 2030.
However, the growing prevalence of drug-resistant TB (DR-TB), including rifampicin-resistant (RR-TB), multidrug-resistant (MDR-TB), and extensively drug-resistant (XDR-TB) strains, has emerged as a major threat to TB elimination. In 2023, an estimated 400,000 cases of MDR/RR-TB occurred globally, with China contributing approximately 29,000 cases, ranking fourth worldwide [1].
Timely interruption of TB transmission, especially for MDR-TB, is considered one of the most direct and effective strategies to reduce TB incidence. Molecular epidemiological approaches, particularly whole-genome sequencing (WGS), have revolutionized our ability to understand TB transmission dynamics and resistance patterns [2, 3]. WGS not only enables high-resolution genotyping and detection of drug-resistance-conferring mutations but also helps identify recent transmission events through phylogenetic and SNP-based clustering analyses.
In China, the application of WGS in TB surveillance has expanded rapidly, offering significant advantages in monitoring resistance trends, mapping transmission chains, and guiding precision public health interventions [4–7]. However, most studies have focused on high-burden TB areas or have primarily addressed drug-resistant strains. Molecular epidemiological studies based on WGS data from low-burden areas remain limited.
Huzhou, located in northern Zhejiang Province, has a resident population of approximately 4.2 million. In 2024, its reported TB incidence was 27.16 per 100,000, lower than both the national and provincial averages. In July 2023, Huzhou became the first city in China to initiate a “TB-Free City” campaign, which combines patient-centered diagnosis and treatment with community-based management and preventive therapy for latent infections. To support these public health efforts, our study utilized WGS to characterize the molecular epidemiology, drug resistance patterns, and transmission dynamics of MTB in Huzhou. The goal was to generate a high-resolution genomic baseline to inform targeted TB control strategies and contribute to the development of a WGS-based molecular surveillance network tailored for low-incidence regions.
Methods
Study setting and sample inclusion
Huzhou City administers three counties (Anji County, Changxing County, and Deqing County) and two districts (Wuxing District and Nanxun District). All TB cases were referred to local comprehensive TB-designated hospitals, including Huzhou Central Hospital, Changxing County People’s Hospital, Deqing County People’s Hospital, and Anji County People’s Hospital. Sputum samples from TB patients were collected and cultured using Lowenstein-Jensen medium.
This study included patients with positive sputum cultures between March 2023 and September 2024. Demographic information (including sex, age, residence, and occupation) and clinical data (such as new cases or previously treated cases) were extracted and matched from the national TB registration system and the clinical records of the participating Huzhou hospitals.
WGS and bioinformatics
Genomic DNA was extracted from Mycobacterium tuberculosis (MTB) colonies scraped from Lowenstein-Jensen (L-J) medium using Mag-MK Bacterial Genomic DNA Extraction Kit (Sangon Biotech, Shanghai, China), following the manufacturer’s instructions. DNA concentrations were measured using a Qubit 2.0 Fluorometer (Invitrogen, Carlsbad, CA, USA). The extracted genomic DNA was utilized to construct 150 bp paired-end libraries, which were sequenced on an Illumina NovaSeq 6000 platform with 150 cycles, targeting a sequencing depth of 200×. To ensure specificity for MTB, the sequencing reads were analyzed using Kraken v1.1.1 with the prebuilt MiniKraken DB_8GB database. Only isolates with a minimum of 90% of the reads mapping to the MTB complex were retained for further analysis [8, 9]. The quality of the FASTQ files was assessed, and low-quality regions were trimmed using fastp v0.23, ensuring that the average read quality was ≥ Q20 [10]. The sequencing data were deposited in the Genome Sequence Archive (https://ngdc.cncb.ac.cn/gsa/) under accession number CRA026460.
Filtered reads were aligned to the MTB reference genome H37Rv (GenBank: NC_000962.3) using the BWA-MEM algorithm with default settings. Samples passed quality control if they met the following conditions: >90% of the reads were classified as MTBC, the mapping rate and 10× coverage of the H37Rv reference genome were >98% and >97%, respectively, and no mixed infection of different lineages was detected. The subsequent analysis, including base recalibration and realignment to correct potential artefacts, was performed using SAMtools and the Genome Analysis Toolkit (GATK) [11, 12]. Single nucleotide polymorphisms (SNPs) were called using SAMtools/BCFtools, with a frequency threshold of ≥ 90% and a minimum of five supporting reads [13].
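The sample-level and variant-level filters described above can be sketched as follows. This is a minimal illustration and not the TBSeqPipe implementation: the `SampleQC` and `VariantCall` records and their field names are hypothetical, and the 10× coverage cutoff is read here as "greater than 97%".

```python
from dataclasses import dataclass


@dataclass
class SampleQC:
    """Hypothetical per-sample QC summary (field names are illustrative)."""
    mtbc_read_fraction: float  # fraction of reads classified as MTBC
    mapping_rate: float        # fraction of reads mapped to H37Rv
    coverage_10x: float        # fraction of H37Rv covered at >= 10x depth
    mixed_lineages: bool       # True if more than one lineage was detected


@dataclass
class VariantCall:
    """Hypothetical summary of one candidate SNP position."""
    alt_reads: int    # reads supporting the alternate allele
    total_reads: int  # all reads covering the position


def sample_passes_qc(qc: SampleQC) -> bool:
    """Apply the sample-level thresholds stated in the text."""
    return (
        qc.mtbc_read_fraction > 0.90
        and qc.mapping_rate > 0.98
        and qc.coverage_10x > 0.97  # assumption: cutoff read as "> 97%"
        and not qc.mixed_lineages
    )


def is_fixed_snp(call: VariantCall, min_freq: float = 0.90, min_reads: int = 5) -> bool:
    """Keep a SNP if allele frequency >= 90% with at least five supporting reads."""
    if call.total_reads == 0:
        return False
    frequency = call.alt_reads / call.total_reads
    return call.alt_reads >= min_reads and frequency >= min_freq


print(sample_passes_qc(SampleQC(0.99, 0.995, 0.985, False)))
print(is_fixed_snp(VariantCall(alt_reads=95, total_reads=100)))
```

Keeping the thresholds as named parameters makes it straightforward to test how sensitive downstream clustering is to the variant-calling cutoffs.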
MTB lineages and mutations associated with resistance to anti-tuberculosis drugs were identified using TB-Profiler [14]. The full analysis pipeline, including scripts and parameters, is available at: https://github.com/KevinLYW366/TBSeqPipe. In line with WHO definitions [15], DR-TB includes several categories: isoniazid-resistant (HR)-TB, rifampicin-resistant (RR)-TB, and multidrug-resistant (MDR)-TB (resistant to both rifampicin and isoniazid). Additionally, pre-extensively drug-resistant TB (pre-XDR-TB) is defined as MDR/RR-TB with resistance to any fluoroquinolone, while extensively drug-resistant TB (XDR-TB) is defined as MDR/RR-TB with resistance to any fluoroquinolone and at least one additional Group A drug (bedaquiline or linezolid). Cases of resistance to any one anti-TB drug that do not meet the criteria for the above categories are classified as “Other” (https://github.com/jodyphelan/TBProfiler/issues/145).
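The category definitions above can be expressed as a simple decision rule. The sketch below is an illustration of those definitions only; it is not TB-Profiler's code, and the function name `classify_dr` and the lowercase drug names are assumptions made for this example.

```python
from typing import Iterable

# Assumed drug labels for illustration; real tool output may use other names.
FLUOROQUINOLONES = frozenset({"levofloxacin", "moxifloxacin"})
OTHER_GROUP_A = frozenset({"bedaquiline", "linezolid"})


def classify_dr(resistant_drugs: Iterable[str]) -> str:
    """Map a list of drugs with predicted resistance to a DR-TB category."""
    drugs = {d.lower() for d in resistant_drugs}
    if not drugs:
        return "Sensitive"

    rif = "rifampicin" in drugs
    inh = "isoniazid" in drugs
    fq = bool(drugs & FLUOROQUINOLONES)
    group_a = bool(drugs & OTHER_GROUP_A)

    if rif:  # MDR/RR-TB branch
        if fq and group_a:
            return "XDR-TB"
        if fq:
            return "pre-XDR-TB"
        return "MDR-TB" if inh else "RR-TB"
    if inh:
        return "HR-TB"
    return "Other"  # resistance that fits none of the categories above


print(classify_dr(["rifampicin", "isoniazid", "moxifloxacin"]))
```

The most severe category is checked first so that each isolate receives exactly one label.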
Phylogenetic and phylodynamic analysis
Fixed single nucleotide polymorphisms (SNPs), excluding those located within the PE/PPE gene family (proline-glutamic acid/proline-proline-glutamic acid), the PE-PGRS (polymorphic GC-rich sequence) regions, and drug resistance-associated genes, were extracted and concatenated into a single alignment [16]. Phylogenetic trees based on maximum likelihood (ML) were generated from this concatenated alignment using IQ-Tree v2.2.2 for all Mycobacterium tuberculosis (MTB) isolates [17]. Tree construction was performed with the parameters “-m TEST -B 1000”, which enabled automatic substitution model selection (in the manner of jModelTest) and 1000 ultrafast bootstrap replicates [18]. The optimal ML tree was rooted using Mycobacterium canettii (RefSeq: NC_015848.1) as the outgroup. The resulting tree was visualized using the Interactive Tree of Life (iTOL) platform [19].
Transmission cluster identification based on genetic distance
Transmission clusters were defined by setting a threshold of 12 or fewer pairwise SNP differences between sequences, as previously outlined in the literature [20]. The pairwise SNP distances between all isolates were calculated. Transmission clusters across different lineages were then identified using TransCluster, with a SNP threshold of 12 [21].
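The distance-threshold idea can be illustrated with a short sketch. It assumes equal-length concatenated SNP alignments and groups isolates by single linkage (any chain of pairs within 12 SNPs joins a cluster). This is a simplified stand-in and does not reproduce the TransCluster method; the function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, List


def pairwise_snp_distance(seq_a: str, seq_b: str) -> int:
    """Count differing positions, ignoring gaps and ambiguous bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    return sum(
        1
        for x, y in zip(seq_a, seq_b)
        if x != y and x in "ACGT" and y in "ACGT"
    )


def cluster_by_threshold(seqs: Dict[str, str], threshold: int = 12) -> List[List[str]]:
    """Single-linkage clusters of isolates within `threshold` SNPs (size >= 2)."""
    parent = {name: name for name in seqs}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if pairwise_snp_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = defaultdict(list)
    for name in seqs:
        groups[find(name)].append(name)
    return sorted(sorted(g) for g in groups.values() if len(g) >= 2)


# Toy alignment: s1-s2 differ by 5 SNPs, s2-s3 by 10, s4 is distant from all.
toy = {
    "s1": "A" * 40,
    "s2": "C" * 5 + "A" * 35,
    "s3": "C" * 15 + "A" * 25,
    "s4": "A" * 20 + "G" * 20,
}
clusters = cluster_by_threshold(toy)
clustered = sum(len(c) for c in clusters)
print(clusters, f"clustering rate = {clustered / len(toy):.1%}")
```

With single linkage, s1 and s3 fall into the same cluster even though they differ by 15 SNPs, because s2 links them; this chaining behaviour is worth keeping in mind when interpreting large clusters.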
Spatial hot spots of genomic-clustered MTB cases
Spatial analysis and visualization were performed using ArcGIS (version 10.6; Esri) and R (version 4.5.1). The residential addresses of genomic-clustered MTB cases were collected and organized, then converted into latitude and longitude coordinates using the Baidu Coordinate Picker system (http://api.map.baidu.com/lbsapi/getpoint/index.html). The geocoded data were displayed in the form of a point distribution map on the study area’s vector map. Based on kernel density estimation, a Gaussian smoothing model was applied to analyze the spatial aggregation of patients.
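For readers unfamiliar with kernel density estimation, the following generic sketch evaluates a two-dimensional Gaussian kernel density at a query location. It is not the ArcGIS or R routine used in the study; it assumes coordinates have already been projected to a planar system in consistent units, and the bandwidth and point values are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def gaussian_kde_2d(points: Sequence[Point], query: Point, bandwidth: float) -> float:
    """Density at `query` from an isotropic Gaussian kernel on each case location."""
    if not points or bandwidth <= 0:
        raise ValueError("need at least one point and a positive bandwidth")
    h2 = bandwidth ** 2
    norm = 1.0 / (2.0 * math.pi * h2 * len(points))
    qx, qy = query
    total = sum(
        math.exp(-((qx - x) ** 2 + (qy - y) ** 2) / (2.0 * h2))
        for x, y in points
    )
    return norm * total


# Hypothetical case locations (km): three close together, one isolated.
cases = [(0.0, 0.0), (0.5, 0.2), (0.3, -0.4), (10.0, 10.0)]
near = gaussian_kde_2d(cases, (0.3, 0.0), bandwidth=1.0)
far = gaussian_kde_2d(cases, (5.0, 5.0), bandwidth=1.0)
print(near > far)
```

The bandwidth controls how far each case contributes to the surface, so hotspot boundaries depend on this choice.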
Statistical analysis
Categorical variables were expressed as numbers (percentages) and compared using the Chi-square test or Fisher’s exact test, as appropriate. Data were recorded using Microsoft Excel 2016, and all statistical analyses were performed using R software (version 4.4.1). A two-sided P value of < 0.05 was considered indicative of statistical significance.
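As a worked illustration of the 2 × 2 comparisons, the sketch below computes an uncorrected Pearson chi-square statistic, its P value (1 degree of freedom), and an odds ratio with a Wald 95% confidence interval using only the Python standard library. The counts are the age-group counts reported in Table 2; the study itself used R, so the function names here are illustrative assumptions.

```python
import math
from typing import Tuple


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square (no continuity correction) and P value for [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # survival function for 1 d.f.
    return chi2, p_value


def odds_ratio_wald(a: int, b: int, c: int, d: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio (a*d)/(b*c) with a Wald confidence interval on the log scale."""
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return odds_ratio, odds_ratio * math.exp(-z * se), odds_ratio * math.exp(z * se)


# Rows: age <= 60, age > 60; columns: clustered, unique (counts from Table 2).
chi2, p = chi_square_2x2(40, 134, 25, 151)
or_, lower, upper = odds_ratio_wald(40, 134, 25, 151)
print(f"chi2 = {chi2:.3f}, P = {p:.3f}")                  # chi2 = 4.464, P = 0.035
print(f"OR = {or_:.3f} ({lower:.3f}-{upper:.3f})")        # OR = 1.803 (1.039-3.129)
```

With this orientation, the odds ratio expresses the odds of clustering for patients aged ≤ 60 years relative to those aged > 60 years.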
Results
Demographic characteristics of TB cases
From March 1, 2023, to September 30, 2024, a total of 518 sputum culture-positive strains meeting the quality control requirements for genomic sequencing were collected for WGS. After excluding 31 strains that did not pass data quality control, 77 strains identified as non-tuberculous mycobacteria, and 60 duplicate strains from the same patient, a final cohort of 350 strains with qualified WGS data was selected for further analysis (Fig. 1). Among these 350 strains, 275 (78.6%) were from male patients, and 75 (21.4%) were from female patients. The age of the patients ranged from 15 to 91 years, with a median age of 59 years (interquartile range: 39–74 years). Most of the TB cases (87.4%, 306/350) were newly diagnosed, while 12.6% (44/350) had received prior treatment. Regarding occupation, approximately half of the patients were farmers (51.4%, 180/350); the remainder were unemployed (19.7%, 69/350), factory workers (15.4%, 54/350), commercial service workers (11.7%, 41/350), or students (1.7%, 6/350).
Fig. 1.
A Sample enrollment and study flowchart. B Map of the study site in Zhejiang, China. The study site is located in Huzhou, situated in the northern part of Zhejiang Province, which lies in the eastern region of China
Phylogenetic analysis of MTB strains in Huzhou
To explore the evolutionary relationships among the strains, a ML phylogenetic tree was constructed using concatenated sequences derived from non-redundant SNP loci across all 350 MTB strains from Huzhou (Fig. 2). The phylogenetic analysis revealed the presence of three major MTB lineages. The predominant lineage was lineage 2, comprising 289 strains (82.6%), followed by lineage 4 (Euro–American) with 60 strains (17.1%), and lineage 1 (Indo–Oceanic) with a single strain (0.3%). Lineage 2 was further subdivided into two sub-lineages, lineage 2.2.1 and lineage 2.2.2, both commonly referred to as the Beijing family. Lineage 4 consisted of four sub-lineages: lineage 4.2, lineage 4.4, lineage 4.5, and lineage 4.8. Among these, lineage 2.2.1 (modern Beijing strains) was the dominant sub-lineage, accounting for 80.0% (280/350) of the strains, followed by lineage 4.5 (10.3%, 36/350), lineage 4.4 (4.3%, 15/350), lineage 2.2.2 (2.6%, 9/350) (ancient Beijing strains), and lineage 4.2 (2.3%, 8/350). Only one strain was classified as lineage 1.1, and one strain was identified as lineage 4.8.
Fig. 2.
Maximum-likelihood phylogenetic tree of 350 Mycobacterium tuberculosis (MTB) isolates derived from TB patients in Huzhou, with branches and nodes color-coded according to sub-lineages. The outer ring denotes the genotypic drug resistance profile of the strains
Molecular drug-resistant characteristics
TB-Profiler was utilized to identify gene mutations associated with resistance to 14 anti-tuberculosis (anti-TB) drugs for each strain and to predict their drug resistance profiles. Based on WGS data, a total of 86 strains exhibited drug resistance-related genetic mutations, resulting in an overall drug resistance rate of 24.6% (86/350). All drug-resistant strains were derived from lineage 2 and lineage 4. The drug resistance rates for lineage 2 and lineage 4 isolates were 24.6% (71/289) and 25.0% (15/60), respectively.
The predicted drug resistance profiles revealed that 4 strains were classified as RR-TB (1.1%, 4/350), 32 strains as HR-TB (9.1%, 32/350), 7 strains as MDR-TB (2.0%, 7/350), 6 strains as pre-XDR-TB (1.7%, 6/350), and 37 strains as other types (10.6%, 37/350). No strains classified as XDR-TB were detected.
A total of 53 distinct genetic mutations linked to 9 major classes of antimicrobial drugs were identified in this study. As outlined in Supplementary Table 1, the most frequently observed mutations included Ser315Thr (n = 29) in katG (associated with isoniazid resistance), Lys43Arg (n = 22) in rpsL (streptomycin), Ser450Leu (n = 10) in rpoB (rifampicin), His75Asn (n = 9) in thyA (para-aminosalicylic acid), −15C>T (n = 7) in fabG1 (ethionamide), and Ala90Val (n = 7) in gyrA (fluoroquinolones). Notably, no resistance-associated mutations were detected for clofazimine, cycloserine, or linezolid (95% CI: [0, 0.8525%]), the latter being a Group A drug recommended for MDR-TB treatment according to WHO guidelines. Similarly, no mutations were observed for the newly introduced drugs, bedaquiline and delamanid (95% CI: [0, 0.8525%]) [22–25].
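When no resistant isolate is observed among n isolates, an exact upper confidence bound for the true prevalence p can be obtained by solving (1 − p)^n = α. The sketch below assumes a one-sided 95% bound (α = 0.05); the method behind the interval reported above is not stated, so this is only one plausible way to arrive at a value of similar size.

```python
def zero_event_upper_bound(n: int, alpha: float = 0.05) -> float:
    """Exact one-sided upper bound on prevalence when 0 of n isolates are positive."""
    if n <= 0:
        raise ValueError("n must be positive")
    return 1.0 - alpha ** (1.0 / n)


n_isolates = 350
exact = zero_event_upper_bound(n_isolates)
rule_of_three = 3.0 / n_isolates  # common quick approximation to the same bound

print(f"exact upper bound: {exact:.4%}")
print(f"rule of three:     {rule_of_three:.4%}")
```

Both values are below 1%, which is the practical message: with 350 isolates, a resistance prevalence of roughly 1% or more would be unlikely to go entirely undetected.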
Based on these mutations, the single-drug resistance rates for different antimicrobial agents, ranked from highest to lowest, were as follows: Isoniazid (INH) (12.6%, 44/350), Streptomycin (SM) (9.4%, 33/350), Fluoroquinolones (FQs) (6.6%, 23/350), Rifampicin (RFP) (4.9%, 17/350), Para-aminosalicylic acid (PAS) (3.1%, 11/350), Ethionamide (ETO) (2.3%, 8/350), Ethambutol (EMB) (2.0%, 7/350), Pyrazinamide (PZA) (1.7%, 6/350), and Aminoglycosides (AGs) (1.1%, 4/350). We further compared the drug resistance profiles and single-drug resistance rates between the two predominant lineages identified in this study, lineage 2 and lineage 4 (Table 1). Among drug-resistant isolates, lineage 2 strains were significantly more likely to exhibit resistance to streptomycin than lineage 4 strains (P = 0.005), whereas HR-TB strains were significantly more likely to be found in lineage 4 (P = 0.044).
Table 1.
Comparison of drug-resistant profiles between lineage 2 and lineage 4
| | Lineage 2 (n = 71), no. (%) | Lineage 4 (n = 15), no. (%) | χ2 | P |
|---|---|---|---|---|
| Drugs/drug resistant patterns | | | | |
| RR-TB | 4(5.6) | 0(0.0) | NA | 0.458* |
| HR-TB | 23(32.4) | 9(60.0) | 4.039 | 0.044 |
| MDR-TB | 7(9.9) | 0(0.0) | 0.561 | 0.454 |
| Pre-XDR-TB | 5(7.0) | 1(6.7) | NA | 0.72* |
| Other | 32(45.1) | 5(33.3) | 0.086 | 0.77 |
| Drug resistance | | | | |
| RFP-resistance | 16(22.5) | 1(6.7) | 1.093 | 0.296 |
| INH-resistance | 35(49.3) | 9(60.0) | 0.568 | 0.451 |
| EMB-resistance | 7(9.9) | 0(0.0) | 0.561 | 0.454 |
| PZA-resistance | 5(7.0) | 1(6.7) | NA | 0.72* |
| SM-resistance | 32(45.1) | 1(6.7) | 7.723 | 0.005 |
| FQs-resistance | 19(26.8) | 4(26.7) | 0.001 | 0.999 |
| AGs-resistance | 3(4.2) | 1(6.7) | NA | 0.542* |
| ETO-resistance | 7(9.9) | 1(6.7) | 0.001 | 0.999 |
| PAS-resistance | 11(15.5) | 0(0.0) | 1.457 | 0.227 |
NA not applicable
* indicates P value was calculated by Fisher exact test
Identification of genomic-clustered MTB cases and their risk factors
A total of 28 putative clusters (C01–C28) comprising 65 strains were identified, resulting in a clustering rate of 18.6% (65/350) in Huzhou (Fig. 3). The largest cluster (C02) included 9 strains, while each of the remaining clusters contained 2 or 3 strains. Notably, in 35.7% (10/28) of the clusters (C06, C07, C14, C17, C18, C19, C20, C21, C25, C26), the cases had current addresses in the same county, suggesting a geographical association. To gain a comprehensive understanding of the spatial distribution, kernel density maps were generated using the residential addresses of clustered MTB cases (Fig. 4). Two hotspots of clustered patients were identified: one in Zhicheng Street and Huaxi Street in the main urban area of Changxing County, and the other in Renhuangshan Street, Huanzhu Street, Fenghuang Street, Aishan Street, and Chaoyang Street in the main urban area of Wuxing District, Huzhou City. Both areas are characterized by high population density.
Fig. 3.
Maximum-likelihood phylogenetic tree of 65 Mycobacterium tuberculosis strains grouped into 28 genomic clusters (labeled C01–C28). Strain lineages, diagnosis timelines, current residential addresses, treatment histories, and drug resistance (DR) profiles are color-coded for visualization
Fig. 4.
Kernel density maps of genomic-clustered MTB cases
To further investigate factors influencing genomic clustering, we examined the correlations between cluster membership and various variables, including bacteriological, demographic, treatment history, and geographic factors (Table 2). The age distribution showed a statistically significant difference between clustered and non-clustered patients (P = 0.035). However, lineage type, drug resistance status, gender, occupation, and local residency were not statistically associated with clustering. These findings suggest that individuals under 60 years of age may be at higher risk for local transmission.
Table 2.
Characteristics of genomic-clustered and unique cases of TB in Huzhou, China
| | Clustered (n = 65), no. (%) | Unique (n = 285), no. (%) | Total (n = 350), no. (%) | χ2 | P | Odds ratio (95% CI) |
|---|---|---|---|---|---|---|
| Bacteriological factors | | | | | | |
| Genotype | | | | 0.297 | 0.586 | |
| Beijing | 55(84.6) | 233(81.8) | 288(82.3) | | | 1.00 |
| Non-Beijing | 10(15.4) | 52(18.2) | 62(17.7) | | | 1.227 (0.587–2.567) |
| Drug resistance category | | | | NA | 0.096* | |
| Sensitive | 54(83.0) | 210(73.7) | 264(75.4) | | | 1.00 |
| RR-TB | 0(0.0) | 4(1.4) | 4(1.1) | | | 1.257 (0.001–Inf) |
| HR-TB | 4(6.2) | 28(9.8) | 32(9.1) | | | 1.8 (0.237–1.576) |
| MDR-TB | 0(0.0) | 7(2.5) | 7(2.0) | | | 1.257 (0.001–Inf) |
| Pre-XDR-TB | 3(4.6) | 3(1.1) | 6(1.7) | | | 0.257 (1.061–5.633) |
| Other | 4(6.2) | 33(11.6) | 37(10.6) | | | 2.121 (0.203–1.374) |
| Demographic factors | | | | | | |
| Gender | | | | 0.129 | 0.72 | |
| Female | 15(23.1) | 60(21.1) | 75(21.4) | | | 1.00 |
| Male | 50(76.9) | 225(78.9) | 275(78.6) | | | 1.125 (0.591–2.141) |
| Occupation | | | | NA | 0.508* | |
| Factory/Office worker | 10(15.4) | 44(15.4) | 54(15.4) | | | 1.00 |
| Farmer | 28(43.1) | 152(53.3) | 180(51.4) | | | 1.234 (0.436–1.617) |
| Business service | 10(15.4) | 31(10.9) | 41(11.7) | | | 0.705 (0.606–2.864) |
| Unemployed | 15(23.1) | 54(18.9) | 69(19.7) | | | 0.818 (0.573–2.403) |
| Student | 2(3.0) | 4(1.4) | 6(1.7) | | | 0.455 (0.509–6.361) |
| Treatment | | | | 0.295 | 0.587 | |
| New case | 59(90.8) | 252(88.4) | 311(88.9) | | | 1.00 |
| Retreated case | 6(9.2) | 33(11.6) | 39(11.1) | | | 1.288 (0.516–3.215) |
| Geographic factors | | | | 1.103 | 0.294 | |
| Resident | 57(87.7) | 236(82.8) | 293(83.7) | | | 1.00 |
| Migrant | 8(12.3) | 49(17.2) | 57(16.3) | | | 1.531 (0.688–3.408) |
| Age | | | | 4.464 | 0.035 | |
| ≤ 60 | 40(61.5) | 134(47.0) | 174(49.7) | | | 1.00 |
| > 60 | 25(38.5) | 151(53.0) | 176(50.3) | | | 1.803 (1.039–3.129) |
NA not applicable
* indicates P value was calculated by Fisher exact test
Epidemiological links of genomic-clustered cases
Among the 28 genomic clusters, only one cluster (C17) displayed confirmed epidemiological links. Cluster C17 comprised three cases, two of which (A2024204 and A2024354) were students from the same school. The third case (A2024131) involved a farmer who was not acquainted with the two students. However, the farmer resides in the same village as student A2024204, with a geographical proximity of less than 2000 m.
Transmission and acquisition of DR-TB
The genomic clusters may indicate either the transmission of drug-resistant tuberculosis (DR-TB) strains or the initial transmission of non-DR-TB strains that later developed drug resistance. When isolates within a cluster share the same resistance mutations, we infer that the emergence of DR-TB is more likely due to the transmission of DR-TB isolates rather than the de novo selection of resistance during the treatment of the index case. Additionally, the presence of DR-TB in treatment-naïve patients further suggests the transmission of DR-TB isolates.
In our study, we identified 86 DR-TB isolates, of which 10 were grouped into five distinct clusters (Fig. 3), and the remaining 76 were genomically unique isolates. This resulted in a clustering rate of 11.6% (10/86) for DR-TB cases. Upon comparing the drug-resistance mutation profiles, we observed that the 10 DR-TB strains shared identical drug resistance profiles and resistance-associated mutations with the other members of their respective clusters (C04, C06, C12, C16, C20) (Supplementary Table 2). Among these, 6 cases in clusters C04, C06, and C20 were new cases, while 2 cases in cluster C16 were relapse cases. In cluster C12, 1 case was a new case, and the other was a relapse case, with the relapse case having an earlier diagnosis date than the new case. Of the 76 genomically unique isolates, 58 were from treatment-naïve cases. Thus, if we combine the DR-TB cases among treatment-naïve patients with those in genomic clusters, up to 79.1% (68/86) of DR-TB cases in Huzhou during our study period were likely caused by transmission of DR strains.
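The arithmetic behind these two proportions can be restated in a few lines. The counts come directly from the paragraph above; the variable and function names are illustrative only.

```python
def percentage(numerator: int, denominator: int) -> float:
    """Return a percentage rounded to one decimal place."""
    return round(100.0 * numerator / denominator, 1)


total_dr = 86            # DR-TB isolates
clustered_dr = 10        # DR-TB isolates in genomic clusters
unique_dr = total_dr - clustered_dr
unique_dr_naive = 58     # genomically unique DR-TB isolates from treatment-naive cases

clustering_rate = percentage(clustered_dr, total_dr)
likely_transmitted = percentage(clustered_dr + unique_dr_naive, total_dr)

print(unique_dr, clustering_rate, likely_transmitted)  # 76 11.6 79.1
```

The second figure is an upper estimate because it counts every treatment-naïve DR-TB case as transmitted, whether or not a genomic link was found.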
Discussion
This study systematically described the molecular epidemiological characteristics of TB in Huzhou by performing WGS on 350 clinical MTB isolates. The analysis covered aspects such as lineage distribution, molecular drug resistance features, and transmission clustering. As expected, the Beijing genotype (lineage 2.2, primarily sublineage 2.2.1) was the dominant genotype among TB strains circulating in Huzhou, accounting for 82.6% of the isolates. The prevalence of L2.2-Beijing strains in our region aligns with that observed in central, southern, and eastern China, including in Hunan (74.2%) [26], Shenzhen (84.1%) [5], and Shanghai (80.2%) [7]. However, it notably differs from the prevalence in Southwest China, where L2.2-Beijing strains are less dominant, as evidenced by their prevalence of 53.9% in Sichuan [27], 59.6% in Yunnan [28], and 48.0% in Luodian City, Guizhou [29]. Studies suggest that Beijing strains possess higher adaptability and transmissibility, potentially forming covert transmission chains within immunocompetent populations through enhanced immune evasion mechanisms [30, 31].
WGS enables the detection of variations in all known drug resistance genes at the genomic level, and its effectiveness has been recognized by the World Health Organization [32]. WGS has demonstrated sensitivity and specificity exceeding 85.0% for detecting resistance to first-line and common second-line drugs. Moreover, the accuracy of predicting sensitivity to first-line drugs such as INH and RIF can reach over 95% [33, 34].
In this study, 86 strains with drug resistance-related mutations were detected (24.6%), with detection rates for MDR-TB and Pre-XDR-TB at 2.0% and 1.7%, respectively. No XDR-TB strains were identified. When compared to a 7-year study conducted in Eastern China (including Shanghai and Zhejiang Province), where the detection rates for MDR-TB and Pre-XDR-TB were 5.1% and 2.1%, respectively [35], the overall drug resistance level in Huzhou was relatively low. The most commonly identified resistance-associated mutations in this study were Ser315Thr in katG, Lys43Arg in rpsL, and Ser450Leu in rpoB, which were consistent with previous domestic studies [34, 36]. Additionally, no mutations previously reported to be associated with resistance to newly introduced drugs such as bedaquiline, delamanid, or the WHO Group A drug, linezolid, were detected, suggesting that resistance to these drugs has not yet emerged to any appreciable extent in clinical use in Huzhou.
Lineage 2 has been frequently associated with drug resistance in various studies [37]. However, in the present study, the overall drug resistance rates of Lineage 2 and Lineage 4 isolates were comparable (24.6% vs. 25.0%), with no statistically significant difference observed. This finding is consistent with the report by Zhou et al., which showed similar resistance rates among Beijing (28.1%) and non-Beijing (23.4%) lineage strains [38]. Notably, our analysis revealed that Lineage 2 isolates were significantly more likely to be resistant to streptomycin, while HR-TB was more commonly observed in Lineage 4 strains.
Studies have highlighted the significant concern of MDR-TB transmission in China. A national multi-site surveillance study conducted across 70 counties in all 31 provinces revealed that at least 61.4% of MDR isolates were likely the result of recent transmission of MDR/RR strains [4]. Similarly, Yang et al. reported that up to 72.5% of MDR-TB cases in Shanghai were attributed to the transmission of MDR-TB strains, rather than resistance acquisition due to treatment failure [7]. In the present study, 11.6% of DR-TB cases were identified within genomic clusters, with isolates sharing identical resistance-associated mutations. By combining these genomic cluster cases with treatment-naïve DR-TB cases harboring unique genomic profiles, we estimate 79.1% of DR-TB cases were likely caused by the transmission of DR-TB isolates during the study period. However, considering factors such as latent infections and delays in diagnosis, the true burden of DR-TB transmission within the local population may be underestimated. Moving forward, particular attention must be directed towards identifying and interrupting primary drug-resistant transmission chains as part of DR-TB control efforts.
In this study, transmission clusters were defined using a cutoff of 12 SNPs, resulting in the identification of 28 putative clusters containing 65 strains, yielding a clustered proportion of 18.6%. This is higher than the rates reported in low-burden countries such as Switzerland (8.0%) [39] and the Netherlands (14.2%) [40], but lower than the national clustering rate of 23.0% observed in a previous cross-sectional study [4]. Additionally, the clustering rate among DR-TB cases in our study was 11.6%, which is also lower than the rates observed in Shanghai (31.8%) [7], Shenzhen (25.2%) [5], and Ningbo (23.8%) [36]. Several factors may explain the relatively low clustering rate observed in this study. First, the lower clustering rate may be related to the demographic structure of Huzhou. The city has a relatively stable population, with a significantly lower proportion of migrants than major metropolitan areas, leading to fewer opportunities for cross-regional transmission and thus a reduced likelihood of recent transmission events. Second, the study period was relatively short, spanning only 19 months (March 2023 to September 2024). Given the long latent period of tuberculosis, which can extend from several months to years, some cases of recent transmission may not yet have developed symptoms or been diagnosed, so the true level of transmission clustering may be underestimated. Furthermore, the relatively low incidence of pulmonary tuberculosis in Huzhou, which is below both national and provincial levels, may also contribute to the lower clustering rate, with the epidemic being primarily sporadic rather than clustered. Moreover, the initiation of the “TB-Free City” pilot program in Huzhou in July 2023, which emphasizes active case detection, latent infection screening, and preventive treatment strategies, may also have reduced community transmission from infectious sources, effectively limiting the formation of new transmission chains.
After analyzing the associations between clustering and bacteriological, demographic, treatment history, and geographic variables, we did not identify any specific population groups in Huzhou with a significantly elevated risk of TB transmission, except for individuals aged under 60 years. This finding suggests that younger individuals may play a more prominent role in recent local transmission events. Our results are consistent with those reported by Che et al. in Ningbo [36], who found that young adults aged 18–37 years exhibited a significantly higher risk of transmitting drug-resistant tuberculosis. Furthermore, two hotspots for clustered patients were identified in the main urban areas of Changxing County and Huzhou City, both of which are relatively densely populated regions. Although the majority of the clustered cases did not have direct epidemiological links (only one cluster exhibited confirmed links), we hypothesize that their daily activities may involve spatial intersections, contributing to the potential transmission risk.
WGS technology has been shown to outperform traditional contact tracing in identifying transmission routes and transmission chains by detecting SNP variations between strains and applying molecular evolutionary algorithms to track their spread [41]. However, in this study, only one cluster exhibited confirmed epidemiological links, indicating a gap between WGS analysis and traditional epidemiological methods. Additionally, limitations such as high cost and interpretive uncertainty should be considered when applying WGS in routine epidemiological investigations. Moving forward, it is essential to integrate WGS-based recent transmission analysis with traditional epidemiological methods, to swiftly identify sources of infection and missing transmission links, thereby enhancing the efficiency of transmission chain identification and supporting the effective control of tuberculosis outbreaks.
This study has some limitations. First, the relatively short observation period (March 2023–September 2024) may not fully capture the long-term dynamics of tuberculosis transmission, given the disease’s extended latency. Future studies with a longer observation period would help clarify the temporal evolution of TB transmission patterns. Second, the study relied primarily on culture-positive sputum samples from designated hospitals, which may have led to the underrepresentation of extrapulmonary and smear-negative TB cases. These cases are important for a more comprehensive view of local TB epidemiology, and expanding the inclusion criteria in future studies would provide a broader perspective on the true burden of TB.
Conclusion
This study offers a comprehensive genomic characterization of MTB isolates in Huzhou, revealing a predominance of the Beijing lineage (L2.2.1) and a moderate rate of drug resistance, with 2.0% MDR-TB, 1.7% pre-XDR-TB, and no XDR-TB strains detected. Notably, no resistance-associated mutations were identified for the newer second-line drugs bedaquiline, linezolid, and delamanid, indicating minimal resistance pressure on these key agents in this region. We also found a relatively low level of recent transmission, with a clustering rate of 18.6%. Furthermore, up to 79.1% of drug-resistant TB (DR-TB) cases were likely attributable to recent transmission, with clustered DR-TB strains sharing identical resistance-conferring mutations. These findings highlight the urgent need for early detection and intervention to disrupt transmission chains of DR-TB, particularly in low-burden settings. The integration of whole-genome sequencing into routine surveillance provides a powerful tool for informing timely and targeted interventions, thereby supporting the “TB-Free City” initiative in Huzhou and contributing to broader national TB elimination efforts.
Supplementary Information
Authors’ contributions
L.J. was responsible for data analysis, data interpretation, and manuscript drafting. F.R. was responsible for data analysis and data interpretation. D.X. and X.Z. collected and contributed the MTB strains used in this study. Z.T. and J.S. performed data curation. P.Z. designed the study and revised the manuscript. All authors approved the final version of the manuscript for submission and publication.
Funding
This study was supported by the Zhejiang Science and Technology Plan for Disease Prevention and Control (Project No. 2025JK282), the “Special support plan for local high-level talents in South Taihu Lake” of Huzhou (202011019), and the Key Laboratory of Emergency Detection for Public Health of Huzhou.
Data availability
The raw sequence data reported in this paper have been deposited in the Genome Sequence Archive (Genomics, Proteomics & Bioinformatics 2021) in the National Genomics Data Center (Nucleic Acids Res 2025), China National Center for Bioinformation/Beijing Institute of Genomics, Chinese Academy of Sciences (GSA: CRA026460), and are publicly accessible at https://ngdc.cncb.ac.cn/gsa/browse/CRA026460.
Declarations
Ethics approval and consent to participate
The study protocol was reviewed and approved by the Ethics Committees of Huzhou Center for Disease Control and Prevention (HZ2024003&2025Y003). Written informed consent to participate in this study was provided by the participant’s legal guardian/next of kin. All methods were carried out in accordance with relevant guidelines and regulations.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Lei Ji and Feilin Ren contributed equally to this work.
References
- 1. World Health Organization. Global Tuberculosis Report 2024. 2024. https://www.who.int/teams/global-tuberculosis-programme/tb-reports/global-tuberculosis-report-2024
- 2. van der Werf MJ, Ködmön C. Whole-Genome sequencing as tool for investigating international tuberculosis outbreaks: A systematic review. Front Public Health. 2019;7:87.
- 3. Meehan CJ, Goig GA, Kohl TA, Verboven L, Dippenaar A, Ezewudo M, et al. Whole genome sequencing of Mycobacterium tuberculosis: current standards and open issues. Nat Rev Microbiol. 2019;17(9):533–45.
- 4. Liu D, Huang F, Li Y, Mao L, He W, Wu S, et al. Transmission characteristics in tuberculosis by WGS: nationwide cross-sectional surveillance in China. Emerg Microbes Infect. 2024;13(1):2348505.
- 5. Jiang Q, Liu Q, Ji L, Li J, Zeng Y, Meng L, et al. Citywide transmission of Multidrug-resistant tuberculosis under China’s rapid urbanization: A retrospective Population-based genomic Spatial epidemiological study. Clin Infect Dis. 2020;71(1):142–51.
- 6. Huang H, Ding N, Yang T, Li C, Jia X, Wang G, et al. Cross-sectional Whole-genome sequencing and epidemiological study of Multidrug-resistant Mycobacterium tuberculosis in China. Clin Infect Dis. 2019;69(3):405–13.
- 7. Yang C, Luo T, Shen X, Wu J, Gan M, Xu P, et al. Transmission of multidrug-resistant Mycobacterium tuberculosis in Shanghai, China: a retrospective observational study using whole-genome sequencing and epidemiological investigation. Lancet Infect Dis. 2017;17(3):275–84.
- 8. Vargas R, Freschi L, Marin M, Epperson LE, Smith M, Oussenko I, et al. In-host population dynamics of Mycobacterium tuberculosis complex during active disease. Elife. 2021;10:e61805.
- 9. Wood DE, Salzberg SL. Kraken: ultrafast metagenomic sequence classification using exact alignments. Genome Biol. 2014;15(3):R46.
- 10. Jiang X, Wang M, Wang K, Estes MK. Sequence and genomic organization of Norwalk virus. Virology. 1993;195(1):51–61.
- 11. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The sequence Alignment/Map format and samtools. Bioinformatics. 2009;25(16):2078–9.
- 12. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, et al. The genome analysis toolkit: a mapreduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20(9):1297–303.
- 13. Liu Q, Wei J, Li Y, Wang M, Su J, Lu Y, et al. Mycobacterium tuberculosis clinical isolates carry mutational signatures of host immune environments. Sci Adv. 2020;6(22):eaba4901.
- 14. Phelan JE, O’Sullivan DM, Machado D, Ramos J, Oppong YEA, Campino S, et al. Integrating informatics tools and portable sequencing technology for rapid detection of resistance to anti-tuberculous drugs. Genome Med. 2019;11(1):41.
- 15. World Health Organization. Global tuberculosis report 2020. 2020. https://www.who.int/publications/i/item/9789240013131
- 16. Luo T, Yang C, Peng Y, Lu L, Sun G, Wu J, et al. Whole-genome sequencing to detect recent transmission of Mycobacterium tuberculosis in settings with a high burden of tuberculosis. Tuberculosis (Edinb). 2014;94(4):434–40.
- 17. Nguyen LT, Schmidt HA, von Haeseler A, Minh BQ. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol Biol Evol. 2015;32(1):268–74.
- 18. Darriba D, Taboada GL, Doallo R, Posada D. jModelTest 2: more models, new heuristics and parallel computing. Nat Methods. 2012;9(8):772.
- 19. Letunic I, Bork P. Interactive tree of life (iTOL) v3: an online tool for the display and annotation of phylogenetic and other trees. Nucleic Acids Res. 2016;44(W1):W242–5.
- 20. Pan J, Li X, Zhang M, Lu Y, Zhu Y, Wu K, et al. TransFlow: a snakemake workflow for transmission analysis of Mycobacterium tuberculosis whole-genome sequencing data. Bioinformatics. 2023;39(1):btac785.
- 21. Yang C, Lu L, Warren JL, Wu J, Jiang Q, Zuo T, et al. Internal migration and transmission dynamics of tuberculosis in Shanghai, China: an epidemiological, spatial, genomic analysis. Lancet Infect Dis. 2018;18(7):788–95.
- 22. Ismail NA, Omar SV, Joseph L, Govender N, Blows L, Ismail F, et al. Defining bedaquiline Susceptibility, Resistance, Cross-Resistance and associated genetic determinants: A retrospective cohort study. EBioMedicine. 2018;28:136–42.
- 23. Ismail N, Ismail NA, Omar SV, Peters RPH. In Vitro study of Stepwise acquisition of rv0678 and AtpE mutations conferring bedaquiline resistance. Antimicrob Agents Chemother. 2019;63(8):e00292–19.
- 24. Andries K, Villellas C, Coeck N, Thys K, Gevers T, Vranckx L, et al. Acquired resistance of Mycobacterium tuberculosis to bedaquiline. PLoS ONE. 2014;9(7):e102135.
- 25. Zheng H, He W, Jiao W, Xia H, Sun L, Wang S, et al. Molecular characterization of multidrug-resistant tuberculosis against levofloxacin, moxifloxacin, bedaquiline, linezolid, clofazimine, and Delamanid in Southwest of China. BMC Infect Dis. 2021;21(1):330.
- 26. He W, Tan Y, Liu C, Wang Y, He P, Song Z, et al. Drug-Resistant Characteristics, genetic Diversity, and transmission dynamics of Rifampicin-Resistant Mycobacterium tuberculosis in Hunan, China, revealed by Whole-Genome sequencing. Microbiol Spectr. 2022;10(1):e0154321.
- 27. Jian-Ping D, Hai-Can L, Bin W, Hai-Yan D, Zheng-Dong Z, Xiu-Qin Z, Qun LI, Kang-Lin W. Spoligotyping of Mycobacterium tuberculosis in the South of Sichuan Province. Chin J Zoon. 2015;31(12):1116–9.
- 28. Chen L, Yang X, Ma L, Yang H, Chen J, Ru H. Analysis of the distribution of Beijing genotype strains of Mycobacterium tuberculosis in parts of Yunnan Province. J Path Biol. 2015;10(09):825–9.
- 29. Liu M, Xu P, Liao X, Li Q, Chen W, Gao Q, et al. Molecular epidemiology and drug-resistance of tuberculosis in Luodian revealed by whole genome sequencing. Infect Genet Evol. 2021;93:104979.
- 30. Hanekom M, van der Spuy GD, Streicher E, Ndabambi SL, McEvoy CR, Kidd M, et al. A recently evolved sublineage of the Mycobacterium tuberculosis Beijing strain family is associated with an increased ability to spread and cause disease. J Clin Microbiol. 2007;45(5):1483–90.
- 31. Parwati I, van Crevel R, van Soolingen D. Possible underlying mechanisms for successful emergence of the Mycobacterium tuberculosis Beijing genotype strains. Lancet Infect Dis. 2010;10(2):103–11.
- 32. World Health Organization. GLASS whole-genome sequencing for surveillance of antimicrobial resistance. 2020. https://www.who.int/publications/i/item/9789240011007
- 33. Chen X, He G, Wang S, Lin S, Chen J, Zhang W. Evaluation of Whole-Genome sequence method to diagnose resistance of 13 Anti-tuberculosis drugs and characterize resistance genes in clinical Multi-Drug resistance Mycobacterium tuberculosis isolates from China. Front Microbiol. 2019;10:1741.
- 34. Liu D, Huang F, Zhang G, He W, Ou X, He P, et al. Whole-genome sequencing for surveillance of tuberculosis drug resistance and determination of resistance level in China. Clin Microbiol Infect. 2022;28(5):e7319–15.
- 35. Wang L, Chen B, Zhou H, Mathema B, Chen L, Li X, et al. Emergence and evolution of drug-resistant Mycobacterium tuberculosis in Eastern China: A six-year prospective study. Genomics. 2023;115(3):110640.
- 36. Che Y, Li X, Chen T, Lu Y, Sang G, Gao J, et al. Transmission dynamics of drug-resistant tuberculosis in Ningbo, China: an epidemiological and genomic analysis. Front Cell Infect Microbiol. 2024;14:1327477.
- 37. Cox HS, Kubica T, Doshetov D, Kebede Y, Rüsch-Gerdes S, Niemann S. The Beijing genotype and drug resistant tuberculosis in the Aral sea region of central Asia. Respir Res. 2005;6(1):134.
- 38. Zhou Z, Yi H, Zhou Q, Wang L, Zhu Y, Wang W, et al. Evolution and epidemic success of Mycobacterium tuberculosis in Eastern China: evidence from a prospective study. BMC Genomics. 2023;24(1):241.
- 39. Stucki D, Ballif M, Egger M, Furrer H, Altpeter E, Battegay M, et al. Standard genotyping overestimates transmission of Mycobacterium tuberculosis among immigrants in a Low-Incidence country. J Clin Microbiol. 2016;54(7):1862–70.
- 40. Jajou R, de Neeling A, van Hunen R, de Vries G, Schimmel H, Mulder A, et al. Epidemiological links between tuberculosis cases identified twice as efficiently by whole genome sequencing than conventional molecular typing: A population-based study. PLoS ONE. 2018;13(4):e0195413.
- 41. Nikolayevskyy V, Niemann S, Anthony R, van Soolingen D, Tagliani E, Ködmön C, et al. Role and value of whole genome sequencing in studying tuberculosis transmission. Clin Microbiol Infect. 2019;25(11):1377–82.