PLOS Medicine. 2022 Feb 22;19(2):e1003933. doi: 10.1371/journal.pmed.1003933

Phylogeography and transmission of M. tuberculosis in Moldova: A prospective genomic analysis

Chongguang Yang 1,2, Benjamin Sobkowiak 3, Vijay Naidu 3, Alexandru Codreanu 4, Nelly Ciobanu 4, Kenneth S Gunasekera 2, Melanie H Chitwood 2, Sofia Alexandru 4, Stela Bivol 5, Marcus Russi 2, Joshua Havumaki 2, Patrick Cudahy 2, Heather Fosburgh 2, Christopher J Allender 6, Heather Centner 6, David M Engelthaler 6, Nicolas A Menzies 7, Joshua L Warren 8, Valeriu Crudu 4,*, Caroline Colijn 3, Ted Cohen 2,‡,*
Editor: Claudia M Denkinger 9
PMCID: PMC8903246  PMID: 35192619

Abstract

Background

The incidence of multidrug-resistant tuberculosis (MDR-TB) remains critically high in countries of the former Soviet Union, where >20% of new cases and >50% of previously treated cases have resistance to rifampin and isoniazid. Transmission of resistant strains, as opposed to resistance selected through inadequate treatment of drug-susceptible tuberculosis (TB), is the main driver of incident MDR-TB in these countries.

Methods and findings

We conducted a prospective, genomic analysis of all culture-positive TB cases diagnosed in 2018 and 2019 in the Republic of Moldova. We used phylogenetic methods to identify putative transmission clusters; spatial and demographic data were analyzed to further describe local transmission of Mycobacterium tuberculosis. Of 2,236 participants, 779 (36%) had MDR-TB, of whom 386 (50%) had never been treated previously for TB. Moreover, 92% of multidrug-resistant M. tuberculosis strains belonged to putative transmission clusters. Phylogenetic reconstruction identified 3 large clades that were comprised nearly uniformly of MDR-TB: 2 of these clades were of Beijing lineage, and 1 of Ural lineage, and each had additional distinct clade-specific second-line drug resistance mutations and geographic distributions. Spatial and temporal proximity between pairs of cases within a cluster was associated with greater genomic similarity. Our study lasted for only 2 years, a relatively short duration compared with the natural history of TB, and, thus, the ability to infer the full extent of transmission is limited.

Conclusions

The MDR-TB epidemic in Moldova is associated with the local transmission of multiple M. tuberculosis strains, including distinct clades of highly drug-resistant M. tuberculosis with varying geographic distributions and drug resistance profiles. This study demonstrates the role of comprehensive genomic surveillance for understanding the transmission of M. tuberculosis and highlights the urgency of interventions to interrupt transmission of highly drug-resistant M. tuberculosis.


In a prospective genome surveillance study, Chongguang Yang and colleagues investigate the dynamics of multidrug-resistant tuberculosis transmission in Moldova.

Author summary

Why was this study done?

  • The transmission of multidrug-resistant tuberculosis (MDR-TB) poses a major challenge for tuberculosis (TB) control in several countries, but a detailed understanding of the local dynamics of TB and MDR-TB transmission in these high MDR-TB burden settings has been elusive.

  • The increasing availability of whole genome sequencing, and the development of new statistical approaches for combining spatial, epidemiological, and genomic data to infer transmission, offers new opportunities to identify TB transmission with high resolution.

What did the researchers do and find?

  • We prospectively enrolled all individuals with incident culture-positive TB from the Republic of Moldova, a high MDR-TB burden setting, between January 2018 and December 2019 and sequenced a diagnostic Mycobacterium tuberculosis isolate from each individual.

  • We found that nearly all extant MDR-TB in Moldova is likely the result of recent transmission and that multidrug resistance (MDR) is highly concentrated within 2 M. tuberculosis lineages (Beijing and Ural).

  • Phylogeographic analyses revealed geographically distinct patterns of transmission for the Beijing MDR strains, which were predominantly localized within the Transnistrian region to the east of the country, while Ural MDR strains were less geographically restricted.

  • Each putative MDR-TB transmission cluster had distinct mutations conferring resistance to second-line drugs. Population genetic analyses revealed both long periods of local population expansion and more recent introduction of specific MDR-TB strains into the country.

What do these findings mean?

  • To our knowledge, this is the first study to comprehensively sequence all M. tuberculosis isolates from an entire high MDR incidence country, and it offers unique insights into the complexity of MDR-TB transmission in Moldova.

  • Local transmission of distinct highly drug-resistant M. tuberculosis strains suggests that public health and clinical interventions tailored to address such local heterogeneities may be needed to interrupt transmission and improve treatment outcomes.

Introduction

Multidrug-resistant tuberculosis (MDR-TB) (i.e., resistance to at least rifampin and isoniazid) poses serious threats to effective tuberculosis (TB) control in many countries. Globally, approximately 4% to 5% of incident TB cases are multidrug resistant (MDR), but this proportion is substantially higher in countries of the former Soviet Union, where MDR-TB represents >20% of new TB cases and >50% of previously treated TB [1]. MDR-TB in this region has been attributed to breakdowns in public health infrastructure, transmission of TB in hospitals and prisons, and a deterioration of living conditions coinciding with the dissolution of the Soviet Union in the early 1990s [2]. While the contributions of these factors remain uncertain, there is consensus that the transmission of MDR-TB, as opposed to resistance acquired through inadequate treatment of drug-susceptible TB, is now the predominant cause of incident MDR-TB [3]. This consensus is supported by routine surveillance data documenting that the majority of incident MDR-TB episodes are diagnosed among individuals with no prior anti-TB treatment [1]. However, these data alone do not address critical questions about where and between whom MDR-TB is transmitted or reveal the extent to which specific M. tuberculosis variants are responsible for MDR-TB transmission.

The increasing availability of next generation sequencing (NGS), coupled with the development of analytic approaches for integrating high-resolution genomic, spatial, and epidemiological data, has transformed our ability to describe transmission of pathogens in populations [4–6]. Previous genomic analyses of TB from the former Soviet Union have described the emergence and evolution of specific M. tuberculosis lineages responsible for an outsized proportion of MDR-TB in the region. In general, these studies have been conducted on isolates enriched for drug resistance phenotypes or on samples from larger cohorts [7–9], and this can challenge transmission inference.

We systematically collected and sequenced initial diagnostic isolates from all culture-positive TB cases occurring over 2 years in the Republic of Moldova, a former Soviet country experiencing a severe MDR-TB epidemic. In addition to capturing M. tuberculosis isolates from all culture-positive cases, we also collected data on home location and other demographic and epidemiological data, allowing us to study the distribution and dynamics of TB with high resolution across the entire country.

Methods

Study setting

Moldova is a small country (approximately 4 million population), which gained independence when the Soviet Union dissolved in 1991. In 2019, the World Health Organization estimated an incidence rate of 80 TB cases (68 to 92) per 100,000 persons. A total of 33% (30% to 35%) of new TB cases and 60% (56% to 64%) of previously treated TB cases were estimated to have MDR-TB [1].

Study enrollment

TB diagnosis occurs at 46 diagnostic centers located throughout the country. Between January 1, 2018 and December 31, 2019, all nonincarcerated individuals evaluated for pulmonary TB were invited to participate in this study (Fig A in S1 Appendix); written consent was provided. This consent allowed us to access routinely collected basic demographic, residential, and epidemiological data and to perform sequencing on their mycobacterial isolates should they have culture-positive TB. This study was approved by the Ethics Committee of Research of the Phthisiopneumology Institute in Moldova and the Yale University Human Investigation Committee (No. 2000023071).

Data and specimen collection and processing

Demographic data (sex, age, employment, history of incarceration, and education level), residential status (rural or urban residence and home village/locality), and epidemiological data (household contacts and date of diagnosis) were collected from each participant.

Sputum specimens were tested at diagnostic centers by microscopy and Xpert and then sent to 4 in-country laboratories for solid and liquid culture. Positive cultures were sent to the National TB Reference Laboratory in Chisinau for mycobacterial DNA extraction by the cetyl trimethyl ammonium bromide (CTAB) method [10].

Whole genome sequencing

Genomic DNA was prepared for NGS using the Illumina DNA Prep library preparation kit (S1 Appendix). Raw sequencing files were checked with FastQC [11] and mapped to the H37Rv reference strain (NC_000962.3) using BWA “mem” [12] and sorted with SAMtools v.1.10 [13] (S1 Data). Variant calling was conducted with GATK [14] to identify single nucleotide polymorphisms (SNPs), with low-quality SNPs (Phred score Q <20 and read depth <5) and sites with missing calls in >10% of isolates removed.
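The filtering step described above can be illustrated with a minimal sketch. It is not the authors' GATK pipeline: the function name `filter_snp_calls`, the array layout, and the reading that a call is masked when it fails either the quality or the depth threshold are all assumptions made for illustration.

```python
import numpy as np

def filter_snp_calls(qual, depth, min_qual=20, min_depth=5, max_missing=0.10):
    """Mask low-quality calls, then drop sites missing in too many isolates.

    qual, depth: arrays of shape (n_isolates, n_sites) holding the Phred
    quality and read depth of each call (hypothetical input layout).
    Returns (called, keep_site): the boolean matrix of retained calls at
    the kept sites, and a boolean vector marking which sites were kept.
    """
    qual = np.asarray(qual, dtype=float)
    depth = np.asarray(depth, dtype=float)
    # Assumption: a call counts as usable only if it passes both thresholds.
    called = (qual >= min_qual) & (depth >= min_depth)
    # Fraction of isolates without a usable call at each site.
    missing_fraction = 1.0 - called.mean(axis=0)
    # Sites with missing calls in more than 10% of isolates are removed.
    keep_site = missing_fraction <= max_missing
    return called[:, keep_site], keep_site
```

In practice the same thresholds would be applied to the VCF produced by the variant caller; the matrix form here only makes the two-stage logic (per-call masking, then per-site missingness) explicit.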

Samples with possible polyclonal infections were identified through a previously described method [15] and were not included in the transmission analysis, although we do provide additional details about these polyclonal infections in the Supporting information appendix (S1 Appendix). Heterogeneous sites were called as the consensus allele if present in ≥80% of mapped reads; otherwise, they were labeled as ambiguous. SNPs in repetitive regions, PE/PPE genes, and in known resistance-conferring genes were excluded from phylogenetic tree reconstruction. In silico drug resistance prediction was carried out using TB-Profiler v2.8.14 (Tables A–C in S1 Appendix) [16].
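The ≥80% consensus rule for mixed sites can be sketched as follows. The function name `call_site`, the representation of mapped reads as a string of bases, and the use of "N" as the ambiguity symbol are illustrative assumptions, not details taken from the study's pipeline.

```python
from collections import Counter

def call_site(read_bases, min_fraction=0.80, ambiguous="N"):
    """Return the consensus base at one site, or an ambiguity code.

    read_bases: the bases observed in the reads mapped to this site,
    e.g. "AAAAAAAAGG" (hypothetical representation).
    """
    counts = Counter(read_bases)
    if not counts:
        # No reads cover the site, so no call can be made.
        return ambiguous
    base, n = counts.most_common(1)[0]
    # The majority allele is accepted only if it reaches the threshold.
    return base if n / len(read_bases) >= min_fraction else ambiguous
```

For example, a site covered by 8 "A" reads and 2 "G" reads is called "A", whereas a 7-to-3 split is labeled ambiguous.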

Phylogenetic analysis and transmission cluster identification

A multiple sequence alignment of concatenated SNPs was used to construct a maximum–likelihood (M–L) phylogenetic tree with RAxML [17], using the “GTR-GAMMA” nucleotide substitution model with a Lewis ascertainment bias correction and 500 bootstrap samples. Putative transmission clusters were identified in the resulting M–L tree using TreeCluster [18], testing 2 distance thresholds of 0.001 and 0.0005 substitutions/site, corresponding to approximate SNP thresholds of 40 and 20, respectively. These thresholds reflect the maximum distance within a cluster; we also estimated the median pairwise distance within a cluster. Timed phylogenetic trees for each large cluster (≥10 cases) identified using the distance threshold of 0.001 substitutions/site were built with BEAST2 v2.6.3 (S1 Appendix) [19]. Briefly, phylogenetic trees were built using a strict molecular clock with a fixed rate of $1.0 \times 10^{-7}$ per site per year and a constant population model with a log normal [0,200] prior distribution [20]. Markov chain Monte Carlo chains were run for 250 million iterations, with 10% burn-in, to produce maximum clade credibility trees. Finally, past population events in 3 large clades identified in the study population were inferred using the Bayesian Skyline model in BEAST2.
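As a rough arithmetic check on the quantities in this paragraph, the sketch below converts the clustering thresholds into SNP counts and the clock rate into an expected number of substitutions per genome per year. It assumes that branch lengths are expressed per site of the 43,284-site SNP alignment reported in the Results, and it uses the published length of the H37Rv reference genome (4,411,532 bp). This is one plausible back-of-the-envelope reading, not necessarily how the authors derived the approximate thresholds of 40 and 20 SNPs.

```python
N_VARIABLE_SITES = 43_284        # SNP alignment length reported in the Results
H37RV_GENOME_LENGTH = 4_411_532  # bp, H37Rv reference (NC_000962.3)
CLOCK_RATE = 1.0e-7              # substitutions per site per year (fixed clock)

def threshold_to_snps(subs_per_site, n_sites=N_VARIABLE_SITES):
    """Approximate SNP distance implied by a substitutions/site threshold.

    Assumption: distances are scaled per site of the SNP alignment.
    """
    return subs_per_site * n_sites

def expected_snps_per_genome_year(rate=CLOCK_RATE,
                                  genome_length=H37RV_GENOME_LENGTH):
    """Expected substitutions per genome per year under the strict clock."""
    return rate * genome_length

if __name__ == "__main__":
    for threshold in (0.001, 0.0005):
        print(threshold, round(threshold_to_snps(threshold), 1))
    print(round(expected_snps_per_genome_year(), 2))
```

Under these assumptions the thresholds correspond to roughly 43 and 22 SNPs, in line with the approximate values of 40 and 20 quoted in the text, and the fixed clock rate corresponds to a little under half a substitution per genome per year.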

Inference of person-to-person transmission events

We identified person-to-person transmission events between sampled hosts in large transmission clusters (≥10 cases, TreeCluster distance threshold 0.001 substitutions/site) by reconstructing transmission networks using TransPhylo [21]. This R package uses a Bayesian approach to reconstruct transmission networks from timed phylogenies, including sampled and unsampled hosts, and allows for within-host diversity. We used a “multitree” method that simultaneously infers transmission trees from a selection of input phylogenetic trees while estimating a single value for shared model parameters. This accounts for uncertainty in the phylogenetic tree reconstruction [21]. The procedures for transmission inference within large clusters are detailed in the Supporting information appendix (S1 Appendix).

Spatial/genetic distance analysis

For each large transmission cluster (≥10 cases), we used a recently developed hierarchical Bayesian regression model to quantify the association between the genetic and spatial distances for unique pairs of cases, adjusting for other pair- and individual-level features and multiple sources of correlation in the data [22]. We then used a Bayesian meta-analysis framework to better understand shared trends and variability in the estimated associations across genetic clusters.

In our main analysis, we modeled the log-scaled patristic distance between each pair of cases within cluster k as a function of geographic distance and other covariates:

$$\ln Y_{kij} = x_{kij}^{T}\beta_{k} + (z_{ki} + z_{kj})^{T}\gamma_{k} + \theta_{ki} + \theta_{kj} + \epsilon_{kij}, \quad i < j,$$

where $Y_{kij}$ is the patristic distance between cases $i$ and $j$ within cluster $k$ and $\epsilon_{kij} \sim N(0, \sigma_{k\epsilon}^{2})$ are the independent, Gaussian distributed errors. We defined the expected value as a function of pair- and individual-level information, where $x_{kij}$ includes covariates based on differences between the pair (i.e., Euclidean distance in kilometers, an indicator for whether the pair is in the same home village/locality, absolute difference between the dates of diagnosis in days, absolute difference between the ages in years) and $z_{ki}$ includes individual-level covariates (i.e., age in years, number of household contacts, sex (male and female), education status (<secondary and ≥secondary), working status (employed and unemployed), residence location type (urban and not urban), and housing status (homeless and not homeless)). The $\theta_{ki}$ are spatially correlated random effect parameters that account for correlation between paired outcomes due to (i) the same individual being represented across multiple paired responses; and (ii) spatial correlation between individuals. Complete details on the statistical model, including prior distributions for the model parameters, are provided in [22].
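To make the pair-level covariates concrete, the following sketch builds them for every unique pair of cases (i < j) in one cluster. The `Case` record, its field names, and the assumption that home locations are already projected to planar coordinates in kilometers are all hypothetical; the actual model fitting was done with the GenePair R package and is not reproduced here.

```python
from dataclasses import dataclass
from datetime import date
from itertools import combinations
from math import hypot

@dataclass
class Case:
    """Hypothetical per-case record; field names are illustrative."""
    case_id: str
    x_km: float       # projected easting of home locality, in km
    y_km: float       # projected northing of home locality, in km
    locality: str
    diagnosed: date
    age: float

def pair_covariates(cases):
    """Build pair-level covariates for each unordered pair of cases."""
    rows = []
    # combinations() yields each pair exactly once, matching i < j.
    for a, b in combinations(cases, 2):
        rows.append({
            "pair": (a.case_id, b.case_id),
            "distance_km": hypot(a.x_km - b.x_km, a.y_km - b.y_km),
            "same_locality": int(a.locality == b.locality),
            "days_between_diagnoses": abs((a.diagnosed - b.diagnosed).days),
            "age_difference_years": abs(a.age - b.age),
        })
    return rows
```

A cluster of n cases yields n(n − 1)/2 rows, and each case appears in n − 1 of them, which is why the model includes case-level random effects to account for the repeated use of the same individual.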

We fit the regression model separately for each of the transmission clusters with at least 10 cases, using the “Patristic” function in the R package “GenePair” (https://github.com/warrenjl/GenePair). For each individual cluster analysis, we included a predictor if <10% of the values across the pairs were missing and if there were >4 pairs in each of the categorical variable levels, to ensure stable model fitting results. Inference was based on 10,000 samples from the joint posterior distribution after removing the first 10,000 iterations prior to convergence and thinning the remaining 100,000 by a factor of 10 to reduce correlation in the posterior samples.

To better understand shared trends and variability in the estimated associations across genetic clusters, we then used the estimates and uncertainty measures obtained from the first stage analyses within a Bayesian meta-analysis framework. The model for a single association l is given as

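$$\hat{\beta}_{kl} \mid \beta_{kl} \sim N\left(\beta_{kl}, \hat{\sigma}_{kl}^{2}\right), \quad k = 1, \ldots, m_{l},$$
$$\beta_{kl} \sim N\left(\mu_{\beta l}, \sigma_{\beta l}^{2}\right),$$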

where $\hat{\beta}_{kl}$ is the posterior mean obtained from the regression model fit to cluster $k$ for covariate $l$, $\beta_{kl}$ represents the corresponding true but unobserved value, $\hat{\sigma}_{kl}$ is the posterior standard deviation (so that $\hat{\sigma}_{kl}^{2}$ is the corresponding variance), and $m_{l}$ is the number of main analyses (out of 35 in total) where covariate $l$ was included. We note that $\gamma_{kl}$ effects are included in this same meta-analysis framework as well but describe the model in terms of $\beta_{kl}$ without loss of generality. We assumed that the true cluster-specific effects arise from a common Gaussian distribution with mean $\mu_{\beta l}$ and variance $\sigma_{\beta l}^{2}$, and estimated these parameters by giving them weakly informative prior distributions such that $\mu_{\beta l} \sim N(0, 100^{2})$ and $\sigma_{\beta l} \sim \text{Uniform}(0, 100)$. By making inference on $\mu_{\beta l}$, we determined whether covariate $l$ had a consistent impact when data were pooled across all clusters and uncertainty in the parameter estimates was correctly quantified. When reporting results from the second stage analysis, we present posterior means and 95% quantile-based credible intervals for $\exp\{\mu_{\beta l}\}$ (i.e., the pooled effect, reported as the ratio of expected patristic distances per specified change in covariate value).
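The second-stage model can be approximated numerically with a simple grid over the pooled mean and the between-cluster standard deviation, after integrating out the cluster-specific effects (so that each first-stage estimate is Gaussian with variance equal to its own posterior variance plus the between-cluster variance). The sketch below is an illustration only: the function name `pooled_effect`, the grid ranges, and the truncation of the Uniform(0, 100) prior to the grid's upper limit are assumptions, and the authors' actual sampler is not described in this section.

```python
import numpy as np

def pooled_effect(beta_hat, se_hat, mu_grid=None, sigma_grid=None):
    """Grid approximation to the pooled effect exp(mu) across clusters.

    beta_hat: first-stage posterior means, one per cluster.
    se_hat:   first-stage posterior standard deviations, one per cluster.
    Returns (posterior mean of exp(mu), (2.5% and 97.5% quantiles)).
    """
    beta_hat = np.asarray(beta_hat, dtype=float)
    se_hat = np.asarray(se_hat, dtype=float)
    if mu_grid is None:
        mu_grid = np.linspace(-2.0, 2.0, 801)      # assumed plausible range
    if sigma_grid is None:
        sigma_grid = np.linspace(1e-3, 2.0, 400)   # truncated flat prior
    mu = mu_grid[:, None, None]
    sigma = sigma_grid[None, :, None]
    # Marginal model: beta_hat_k ~ N(mu, se_k^2 + sigma^2).
    var = se_hat[None, None, :] ** 2 + sigma ** 2
    loglik = -0.5 * (np.log(2 * np.pi * var)
                     + (beta_hat - mu) ** 2 / var).sum(axis=2)
    # Prior: mu ~ N(0, 100^2); the uniform prior on sigma is flat on the grid.
    logpost = loglik - 0.5 * (mu_grid[:, None] / 100.0) ** 2
    post = np.exp(logpost - logpost.max())
    post /= post.sum()
    mu_marginal = post.sum(axis=1)                 # marginal posterior of mu
    cdf = np.cumsum(mu_marginal)
    mean_ratio = float((np.exp(mu_grid) * mu_marginal).sum())
    lower = float(np.exp(mu_grid[np.searchsorted(cdf, 0.025)]))
    upper = float(np.exp(mu_grid[np.searchsorted(cdf, 0.975)]))
    return mean_ratio, (lower, upper)
```

A grid is used here because the model has only two hyperparameters once the cluster effects are integrated out; a sampler would be the more usual choice and would not require choosing grid limits.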

As a sensitivity analysis, we repeated these analyses modeling SNP distance (instead of patristic) using a similar negative binomial regression framework (details in S1 Appendix).

Results

Study population

We invited all culture-positive TB patients (N = 2,770) over the study period to participate; 2,405 consented, and, among them, 2,236 (93%) had available isolates for NGS analysis. These patients lived in 709 named localities within 50 regions (Fig 1). Among enrolled participants with treatment history information (N = 2,182, Table 1), 31% had been previously treated for TB, 22% were female, and the median age was 43 years (interquartile range (IQR) 23 to 71). A total of 60% lived in rural regions, and 10% were previously imprisoned.

Fig 1.


(A) Map of culture-confirmed TB patients in Moldova. The center of each circle represents the geometric center of the localities/region (709 named localities within 50 regions) where the case was diagnosed and sampled. The scale indicates the number of culture-confirmed TB patients (n = 2,236). The Transnistrian region of Moldova is highlighted. The geographic distribution of the notified incidence of all culture-confirmed (B) TB and (C) MDR-TB by locality. The colors show the distribution of notified case per population and localities colored dark gray have missing population data. The map data were extracted from the GADM database (www.gadm.org/download_country.html). MDR-TB, multidrug-resistant tuberculosis; TB, tuberculosis.

Table 1. Demographic and clinical characteristics of study participants.

| Characteristics | All participants (N = 2,182*) | New cases (N = 1,514) | Previously treated cases (N = 668) |
|---|---|---|---|
| Demographic | | | |
| Female, no. (%) | 482 (22.1) | 350 (23.1) | 132 (19.8) |
| Age in years, median (IQR) | 43 (23 to 71) | 42 (27 to 65) | 43 (22 to 68) |
| Homeless | 238 (10.9) | 144 (9.5) | 94 (14.1) |
| Rural residence | 1,323 (60.6) | 991 (65.5) | 332 (49.7) |
| Unemployed | 1,437 (65.9) | 995 (65.7) | 442 (66.2) |
| Previously imprisoned# | 209 (9.6) | 90 (5.9) | 119 (17.8) |
| Education level | | | |
| Primary | 770 (35.3) | 507 (33.5) | 263 (39.4) |
| Secondary | 1,288 (59.0) | 912 (60.2) | 376 (56.3) |
| No. of household contacts, mean (SD) | 2.30 (2.33) | 2.47 (2.40) | 1.91 (2.11) |
| Clinical | | | |
| Smear positive | 975 (44.7) | 697 (46.0) | 278 (41.6) |
| Drug resistance profiles$ | | | |
| Pan-susceptible | 1,152 (52.8) | 939 (62.0) | 213 (31.9) |
| MDR | 779 (35.7) | 386 (25.5) | 393 (58.8) |

* A total of 54 participants did not report their TB treatment history.

# A total of 108 participants did not report their incarceration history.

$ The drug resistance profiles in this table were determined by detection of drug resistance–related mutations through whole genome sequencing.

IQR, interquartile range; MDR, multidrug resistance; SD, standard deviation; TB, tuberculosis.

A total of 779 participants (36%) were infected with genetic variants conferring MDR; 50% (386) of these MDR cases were treatment naive (Table 1). There was substantial geographic variation in distribution of MDR-TB. Transnistria, a small region east of the Dniester River, had localities with the highest proportions of TB cases that were MDR and among the highest incidence rates of MDR-TB in the country (Fig 1, Fig B in S1 Appendix).

Genomic analysis and phylogeny reconstruction

We obtained sequence data from pretreatment specimens of 2,220 participants. Polyclonal infections were identified in 386 participants (17.4%) (Fig A and C in S1 Appendix) and removed, resulting in a final dataset of 1,834 M. tuberculosis isolates. Among these isolates, 672 (36.6%) were genotypic MDR-TB, including 319 (17.4%) pre-extensively drug-resistant (pre-XDR) and 118 (6.4%) extensively drug-resistant (XDR) TB isolates. Aligning reads against the reference strain revealed 43,284 SNPs that were used to reconstruct a maximum likelihood phylogeny (Fig 2).

Fig 2.


(A) M–L phylogeny of 1,834 Moldova M. tuberculosis isolates based on 43,284 variable sites. The outer bands represent the in silico drug resistance profiles, the treatment history of the participant, and the region where the isolates were sampled. The tree is rooted to Mycobacterium bovis (branch in green). L2 denotes lineage 2 (light orange) and L4 lineage 4 (light blue). Three major clades from the Ural/lineage 4.2.1 (clade 1) and Beijing/lineage 2.2.1 (clades 2 to 3) are shaded. The main nodes of the tree have 100% bootstrap support. (B) Phylogenetic distribution of resistance-related genotypes. The columns depict loci associated with drug resistance. “P” followed by a subscript gene name indicates the promoter region. Colored bands of each column represent different polymorphisms. DR, drug resistance; MDR, multidrug resistance; MDR-TB, multidrug-resistant tuberculosis; M–L, maximum–likelihood.

A total of 1,014 isolates (55.3%) belonged to Lineage 4 and 804 (43.8%) belonged to Lineage 2/sublineage 2.2.1 (Fig 2A). Mapping revealed distinct geographic patterns for the 3 major MDR-TB clades: clade 1 comprising 243 Ural/lineage 4.2.1 isolates that were widely distributed, and clade 2 and clade 3 containing 102 and 121 Beijing/lineage 2.2.1 strains that were concentrated within Transnistria (Fig 2A). A high proportion of individuals (50.4%) in these 3 large MDR-TB clades had been previously treated for TB.

All Beijing/lineage 2.2.1 strains (802 consensus SNP calls, 2 heterogeneous SNP calls) had a specific nonsynonymous mutation in esxW (Thr2Ser), a gene in which mutations were found to be associated with transmission success of Beijing lineages in Vietnam [23]. In contrast, just 3% of non-Beijing strains (32/1,030) harbored this mutation (Table D in S1 Appendix). Additionally, 2 other nonsynonymous variants in esxW were found at low frequencies in non-Beijing strains: a nonsense mutation at codon 172 (6 samples) and a Thr173Ser mutation (17 samples).

Prevalence of drug resistance genotypes

The 3 large clades were composed almost entirely of MDR isolates (96%, 449 of 466) (Fig 2B); resistance-conferring mutations for isoniazid and rifampin were similar and found in the katG 315 codon and in the 81-bp rifampicin resistance-determining region (RRDR). However, each of these 3 clades had additional distinctive drug resistance mutations: the isolates in Ural strain/lineage 4 clade 1 harbored an eis promoter (−12 C>T) mutation conferring kanamycin resistance, one Beijing strain/lineage 2 clade had an ethA deletion (110–110 del) associated with ethionamide resistance, while the other had thyX (−16 C>T) and thyA (Arg222Gly) mutations associated with resistance to p-aminosalicylic acid. We also identified clusters of isolates harboring additional drug resistance mutations associated with drugs in newly recommended MDR treatment regimens, including linezolid (n = 14), bedaquiline (n = 1), and delamanid (n = 9). We also report DR mutations among the 386 mixed samples (Table A and B in S1 Appendix).

Transmission of drug-resistant M. tuberculosis

Of the 1,834 M. tuberculosis isolates, 1,551 (85.6%) formed clusters ranging in size from 2 to 105, and 1,000 (54.5%) belonged to 35 large clusters with at least 10 participants at the clustering threshold of 0.001 substitutions/site. The median SNP distance across all transmission clusters was 14 SNPs (IQR 10 to 18 SNPs), with the median within-cluster SNP distance ranging from 0 to 26 SNPs (Fig D-a in S1 Appendix). Meanwhile, the median SNP distance in a cluster defined using the threshold of 0.0005 substitutions/site was 9 SNPs (IQR 7 to 12 SNPs) (Fig D-b in S1 Appendix).

Of 672 MDR-TB isolates included in the final analysis, 619 (92.1%) were part of a cluster, and 454 (67.6%) belonged to one of the 35 large transmission clusters. Individuals with MDR-TB were more likely to be in large clusters than individuals with pan-susceptible disease (odds ratio (OR) 3.39, P-value < 0.001, Table E in S1 Appendix). Eight of the 14 MDR plus linezolid-resistant isolates were members of large clusters (Clusters 1, 2, and 21, Fig E in S1 Appendix). Among the 9 MDR isolates with delamanid resistance, 7 had the same delamanid-associated resistance mutation, forming a single subcluster (Cluster 19, Fig E in S1 Appendix) with a median pairwise SNP distance of <5 SNPs, suggesting recent transmission of this highly resistant M. tuberculosis strain in Moldova.

Closer inspection of the 35 large transmission clusters revealed distinct demographic and epidemiological differences between clusters. The largest transmission cluster (Cluster 1) included 105 participants with the sublineage 4.2.1/Ural Clade 1 strain residing throughout the entire country (Fig 3A and 3D). In contrast, the next largest cluster (Cluster 2) included 102 participants with the sublineage 2.2.1/Beijing Clade 2 strain living predominantly in Transnistria (Fig 3B and 3E). A total of 16 of the 35 large clusters were composed almost entirely of MDR-TB (Fig 3, Fig E in S1 Appendix). Notably, there were cluster-specific demographic differences observed across transmission clusters, with the largest 2 groups comprising a high proportion of previous prisoners and of participants reporting unsatisfactory living conditions (Fig F in S1 Appendix). Table E in S1 Appendix details the association of covariates and membership in large clusters, along with a sensitivity analysis defining clusters using a stricter threshold of 0.0005 substitutions/site that showed broadly the same significant associations.

Fig 3.


(A–C) Tree visualizations for 3 large putative transmission clusters (N ≥ 10 isolates), each showing the location of cases in either the Moldova or Transnistria regions along with resistance/susceptibility to 12 anti-TB drugs, as identified by in silico prediction. (D, E) Spatial distribution of 3 largest clusters (Cluster 1, 2, and 7) in the Ural/Lineage 4.2.1 and Beijing/lineage 2.2.1 clades. The map data were extracted from the GADM database (www.gadm.org/download_country.html). MDR-TB, multidrug-resistant tuberculosis; TB, tuberculosis.

Reconstructing transmission networks in the 35 large clusters using the multitree TransPhylo approach, we inferred 194 person-to-person transmission events. The relatively short study period allows for limited opportunities to capture transmission chains and pairs, and, accordingly, a minority of clustered isolates were predicted to be involved in transmission events in at least half the posterior transmission trees (338/1,000, 33.8%). Nonetheless, the identification of these transmission events supports evidence of recent, local transmission between sampled individuals in the region. We found no significant factors that were associated with inclusion in these person-to-person transmission events compared to other clustered individuals, although there was some evidence for an increased likelihood of transmission linkage between hosts in the Transnistria region compared to the rest of Moldova (OR 1.42, P = 0.02).

Bayesian Skyline analysis of large MDR-TB clades

To gain further insight into the population dynamics of MDR-TB in Moldova, we reconstructed the scaled effective population size for the 3 large MDR-TB clades (Fig 2) and estimated the time to the most recent common ancestor (TMRCA) (Table F in S1 Appendix).

We estimated the TMRCA of the Ural/4.2.1 (clade 1) to be around 1984, although with a relatively broad posterior density interval (95% highest posterior density interval (HPDI)) of 1961 to 2003 owing to the limited temporal range of the data. The 2 Beijing/L2.2.1 clades (clades 2 and 3) are estimated to have a TMRCA of 2013 (95% HPDI: 2010 to 2015) and approximately 2006 (95% HPDI: 1999 to 2012), respectively (Table G in S1 Appendix), implying more recent introduction of these clades to the region. Fig 4 shows the estimated M. tuberculosis effective population size for the 3 major clades over time. Our analysis estimated substantial growth of the Ural/clade 1 between 2012 and 2013 and of the Beijing/clade 3 in late 2013 to mid-2014. For the Beijing/clade 2, the effective population size has remained relatively constant, although the estimated date of origin falls within the time period when growth occurred in other clades. These results indicate a period of population expansion of MDR-TB in Moldova between 2012 and 2014. A sensitivity analysis using alternative clock models and rate estimates (Table G in S1 Appendix) showed similar estimates for the TMRCA and effective population sizes for each clade (Fig G in S1 Appendix).

Fig 4.


(A–C) Coalescent Bayesian Skyline plots of the 3 large clades among Ural/lineage 4.2.1 and Beijing/lineage 2.2.1 with specific resistance mutations (detailed in Fig 2B) using an uncorrelated log normal relaxed clock model. The 2 blue lines are the upper and lower bounds of the 95% HPD interval. The x-axis is the time in years and the y-axis is on a log scale. (D) Density distribution of within-clade pairwise SNP distances of clades 1 to 3. HPD, highest posterior density; SNP, single nucleotide polymorphism.

Spatial/genetic distance analysis

Table 2 shows the pooled risk ratio (RR) inference for pair- and individual-level covariates from the Bayesian meta-analysis of genetic and spatial distances. Two cases in the same locality had a 47% lower expected patristic distance compared to cases in different localities (RR: 0.53; 95% credible interval: 0.40, 0.68). For cases in different localities, each 50-kilometer increase in the distance between the localities was associated with a 6% increase in the patristic distance between the pair (RR: 1.06; 95% credible interval: 1.03, 1.08). For every half-year increase in the separation between dates of diagnosis for a pair, the patristic distance increased by 3% (RR: 1.03; 95% credible interval: 1.01, 1.07). A sensitivity analysis using SNP distances yielded similar results (Table H in S1 Appendix).
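Because the model is fit on the log scale, each pooled estimate is a multiplicative ratio, and the percentages quoted above follow directly from it. The small sketch below shows the conversion and how a point estimate could be rescaled to a different covariate increment; the helper names are hypothetical, and rescaling a reported posterior mean in this way is only an approximation to refitting with the new increment.

```python
def percent_change(ratio):
    """Percent change in expected patristic distance implied by a ratio."""
    return (ratio - 1.0) * 100.0

def rescale_ratio(ratio, factor):
    """Ratio for `factor` times the original covariate increment.

    In a log-linear model the effect of c increments is ratio ** c.
    """
    return ratio ** factor

if __name__ == "__main__":
    print(round(percent_change(0.53)))          # same locality vs. different
    print(round(rescale_ratio(1.06, 2), 4))     # per 100 km instead of 50 km
    print(round(rescale_ratio(1.03, 2), 4))     # per year instead of half-year
```

For instance, a ratio of 1.06 per 50 km corresponds to roughly 1.12 per 100 km under this approximation.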

Table 2. Pooled Bayesian meta-analysis inference for each exponentiated effect (i.e., RR interpretation).

| Effect | Estimate | 95% credible interval |
|---|---|---|
| Distance between localities (50 km) | 1.06 | (1.03, 1.08) |
| Same locality (yes versus no) | 0.53 | (0.40, 0.68) |
| Date of diagnosis distance (1/2 year) | 1.03 | (1.01, 1.07) |
| Age difference (10 years) | 1.00 | (1.00, 1.01) |
| Age (10 years) | 0.99 | (0.97, 1.01) |
| Household contacts (1 person) | 0.99 | (0.97, 1.01) |
| Sex | | |
| Mixed pair versus both female | 0.96 | (0.91, 1.02) |
| Both male versus both female | 0.91 | (0.82, 1.01) |
| Residence location | | |
| Mixed pair versus both not urban | 1.02 | (0.97, 1.09) |
| Both urban versus both not urban | 1.06 | (0.94, 1.21) |
| Housing | | |
| Mixed pair versus both not homeless | 1.02 | (0.87, 1.20) |
| Both homeless versus both not homeless | 1.11 | (0.83, 1.48) |
| Working status | | |
| Mixed pair versus both unemployed | 1.03 | (0.96, 1.12) |
| Both employed versus both unemployed | 1.13 | (0.96, 1.33) |
| Education | | |
| Mixed pair versus both ≥ secondary | 0.98 | (0.94, 1.02) |
| Both < secondary versus both ≥ secondary | 0.95 | (0.87, 1.03) |

Posterior means and 95% quantile-based credible intervals are presented.

RR, risk ratio.

Discussion

We describe the recent circulation of 3 distinct clades of M. tuberculosis (1 of Ural lineage and 2 of Beijing lineage) responsible for the vast majority of MDR-TB in Moldova. While these clades share similar isoniazid- and rifampin-conferring mutations, there are additional clade-specific mutations conferring resistance to important second-line TB antibiotics critical for MDR treatment success.

Broad transmission networks based on genomic similarity showed that >85% of all culture-positive TB cases in Moldova could be mapped to putative transmission clusters and that the majority (>54%) of these cases were found in 35 large transmission clusters. The role of recent transmission was even more pronounced for MDR-TB cases, among which >92% were found within putative transmission clusters (and >67% found within the 35 large transmission clusters). Individuals with MDR-TB had over 3-fold higher odds of being in a large transmission cluster compared with individuals with pan-susceptible TB. Other notable covariates associated with increased odds of being in a large transmission cluster included urban residence, previous incarceration, and a history of previous treatment for TB. We found that pairs with closer times of diagnosis and living within the same locality had the greatest genomic similarity and that for pairs in different localities, closer spatial proximity was associated with greater genomic similarity.

Previous analyses of surveillance data have revealed striking spatial heterogeneity of MDR-TB in Moldova with MDR-TB incidence differing by more than an order of magnitude for different localities [24], but the mechanisms driving this variation have not been described. Our analysis reveals that this heterogeneity is associated with the multiple overlapping epidemics of transmitted MDR-TB, some of which are due to clades that have extended across the entire country, while others are thus far confined to specific subregions. Most notably, the 2 largest transmission clusters of the Beijing lineage are found almost exclusively in Transnistria, where, in some localities, MDR-TB incidence rates exceed 200 cases per 100,000 persons/year. Our finding that nearly all Beijing lineage strains in Moldova have esxW mutations corroborates recent work that suggests that these variants may be under positive selection [23].

A recently reported genomic study conducted among patients diagnosed in 2013 and 2014 at a single municipal hospital in Chisinau described the local concentration of MDR-enriched lineage 4.2.1 (Ural) isolates [25]. In the current study, conducted approximately 6 years later and inclusive of the entire country, we found that MDR isolates within this lineage are present throughout Moldova and are commonly within transmission clusters, although this has thus far only been reported sporadically outside Moldova [26]. Prior work had found this lineage to be responsible for MDR-TB due to reinfection in nosocomial settings [27]; it is now apparent that these MDR strains are transmitted frequently in community settings. Regional reviews have suggested an important role of Beijing and Ural lineages in current TB epidemics [28]; our current work confirms and builds upon these insights, revealing in high resolution the overlapping dynamics of these 2 lineages in Moldova.

A major strength of our study was that we were able to include all culture-positive isolates across the country, minimizing challenges to transmission inference due to sampling biases. However, because we could collect samples for only 2 years, a short duration compared with the natural history of TB, our ability to track chains of transmission and to predict who infected whom was limited. We cannot rule out bias arising from individuals with TB who were never diagnosed or from TB cases that were not culture positive [29]. Additionally, polyclonal samples were removed from this analysis because of difficulties in producing well-resolved phylogenies. We do note that we found evidence for homogeneous and heterogeneous drug resistance mutations in these sequences at a similar proportion to the remaining study population (Table B in S1 Appendix). Further methods development and analysis are required to understand the potential role of polyclonal TB infection in transmission within Moldova.

There are urgent clinical and public health implications of these findings. While the crisis of transmitted MDR-TB was already apparent in this region, these data reveal that there are several cocirculating highly drug-resistant TB clades that differ in terms of drug resistance profiles, geographic distribution, and epidemic trajectory. These results suggest the urgency of interrupting MDR-TB transmission in Moldova, especially within specific geographic foci in the capital city of Chisinau and in the region of Transnistria. While the role of genomic surveillance for informing TB interventions in high-burden settings remains incompletely explored, this study provides an important example of how such information may be used to understand the complex epidemiology of MDR-TB in a high incidence country. We must next investigate whether this improved understanding of local transmission can inform the design of more effective and efficient interventions, a question which remains unanswered at this time.

Supporting information

S1 STROBE Checklist. STROBE Statement—Checklist of items that should be included in reports of observational studies.

STROBE, STrengthening the Reporting of OBservational studies in Epidemiology.

(DOCX)

S1 Data. Additional demographic and epidemiological data used in the analysis.

(CSV)

S1 Appendix

Table A: A summary of the lineages found in mixed M. tuberculosis samples from Moldova, as designated by TB-Profiler. Table B: A summary of the homogeny in drug resistance mutations present in mixed M. tuberculosis samples from Moldova. Table C: In silico drug resistance prediction using TBprofiler and genTB tools. Table D: Allele counts for 9 SNP variants identified in the esxW gene within the study population, showing counts within samples classified as either Beijing strains (all lineage 2.2.1) or as any other lineage. Table E: Demographic associations in cases belonging to large transmission clusters (≥10 cases), identified with patristic distance thresholds of 0.001 and 0.0005. Cases in small clusters (2 to 9 cases) are not included. ORs are calculated using logistic regression and P values by Wald chi-squared test, adjusted for age and sex. Table F: Results of the Coalescent Bayesian Skyline analyses of the 3 large clades with specific resistant mutations using an uncorrelated log normal relaxed clock model. Table G: Complete Coalescent Bayesian Skyline results of the sensitivity analysis using 3 different clock model settings (strict, log normal relaxed, and exponential relaxed) and 3 clock rate estimates of the 3 large clades with specific resistant mutations. The clock rate used log normal distribution. Table H: Pooled Bayesian meta-analysis inference for each exponentiated effect (i.e., ratio of expected SNP distances per specified change in covariate value). Posterior means and 95% quantile-based credible intervals are presented. Fig A: The study flow diagram. Fig B: Distribution of the proportion of MDR-TB by the regions where they were diagnosed. (a) Regions sorted by the proportion of MDR-TB and (b) the total numbers of MDR-TB isolates from high to low. Fig C: (a) A scatterplot showing the pairwise SNP distance (max. 50 SNP differences) plotted against the patristic distance on an M–L phylogeny produced with RAxML between all 2,236 Moldovan isolates with whole genome sequence data. (b) A scatterplot showing the pairwise SNP distance (max. 50 SNP differences) plotted against the patristic distance on an M–L phylogeny produced with RAxML between 1,834 nonmixed Moldovan isolates with whole genome sequence data. Fig D: (a) The pairwise SNP distance in 35 large transmission clusters with at least 10 participants involved with the threshold of 0.001. The box plot shows the IQR and median SNP distance of each cluster. (b) The pairwise SNP distance in 26 large transmission clusters with at least 10 participants involved with the threshold of 0.0005. The box plot shows the IQR and median SNP distance of each cluster. Fig E: Tree visualizations for remaining 32 transmission clusters (N ≥ 10 isolates), each showing the location of cases in either the Moldova or Transnistria regions along with resistance/susceptibility to anti-TB drugs, as identified by in silico prediction. Fig F: Tree visualizations for the 35 transmission clusters (N ≥ 10 isolates), each showing the location of cases in either the Moldova or Transnistria regions along with selected covariates, namely, urban residence, homeless, unsatisfactory living conditions, and former prisoner. Fig G: Coalescent Bayesian Skyline plots of the sensitivity analysis using 3 different clock model settings (strict, log normal relaxed, and exponential relaxed) and 3 clock rate estimates of the 3 large clades with specific resistant mutations. IQR, interquartile range; MDR-TB, multidrug-resistant tuberculosis; M–L, maximum–likelihood; OR, odds ratio; SNP, single nucleotide polymorphism.

(PDF)

Acknowledgments

We thank the clinical and laboratory staff of the Phthisiopneumology Institute in Chisinau and the Regional Reference Laboratories in Balti, Vorniceni, and Bender, Moldova, for their invaluable help in collecting and testing patient specimens.

Disclaimers

The contents are the responsibility of the authors and Subgrantee and do not necessarily reflect the views of USAID or the United States Government.

Abbreviations

CTAB, cetyl trimethyl ammonium bromide

HPDI, highest posterior density interval

IQR, interquartile range

MDR, multidrug resistance

MDR-TB, multidrug-resistant tuberculosis

M–L, maximum–likelihood

NGS, next generation sequencing

OR, odds ratio

RR, risk ratio

RRDR, rifampicin resistance determination region

SNP, single nucleotide polymorphism

TB, tuberculosis

TMRCA, time to the most recent common ancestor

XDR, extensive drug resistance

Data Availability

The genomic data have been made available through GenBank (PRJNA736718, https://www.ncbi.nlm.nih.gov/bioproject/PRJNA736718). Additional data used in the analysis are provided as a CSV file in the Supporting information, with the exception of location data, which cannot be provided because the small number of participants at some locations would allow linkage to individual participants.

Funding Statement

This study was made possible by the generous support of the American people through the United States Agency for International Development (USAID) through the TREAT TB Cooperative Agreement No. GHN-A-00-08-00004 (TC, CC, and VC). CY received funding from the National Institutes of Health Clinical and Translational Science Awards (CTSA) program No. UL1 TR001863. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1.Global tuberculosis report. Geneva: World Health Organization, 2020.
  • 2.The New Profile of Drug-Resistant Tuberculosis in Russia: A Global and Local Perspective: Summary of a Joint Workshop. Washington (DC); 2011.
  • 3.Kendall EA, Fofana MO, Dowdy DW. Burden of transmitted multidrug resistance in epidemics of tuberculosis: a transmission modelling analysis. Lancet Respir Med. 2015;3(12):963–72. Epub 2015/11/26. doi: 10.1016/S2213-2600(15)00458-0
  • 4.Grenfell BT, Pybus OG, Gog JR, Wood JL, Daly JM, Mumford JA, et al. Unifying the epidemiological and evolutionary dynamics of pathogens. Science. 2004;303(5656):327–32. Epub 2004/01/17. doi: 10.1126/science.1090727
  • 5.Lemey P, Rambaut A, Welch JJ, Suchard MA. Phylogeography takes a relaxed random walk in continuous space and time. Mol Biol Evol. 2010;27(8):1877–85. Epub 2010/03/06. doi: 10.1093/molbev/msq067
  • 6.Pybus OG, Suchard MA, Lemey P, Bernardin FJ, Rambaut A, Crawford FW, et al. Unifying the spatial epidemiology and molecular evolution of emerging epidemics. Proc Natl Acad Sci U S A. 2012;109(37):15066–71. Epub 2012/08/29. doi: 10.1073/pnas.1206598109
  • 7.Casali N, Nikolayevskyy V, Balabanova Y, Harris SR, Ignatyeva O, Kontsevaya I, et al. Evolution and transmission of drug-resistant tuberculosis in a Russian population. Nat Genet. 2014;46(3):279–86. Epub 2014/01/28. doi: 10.1038/ng.2878
  • 8.Wollenberg K, Harris M, Gabrielian A, Ciobanu N, Chesov D, Long A, et al. A retrospective genomic analysis of drug-resistant strains of M. tuberculosis in a high-burden setting, with an emphasis on comparative diagnostics and reactivation and reinfection status. BMC Infect Dis. 2020;20(1):17. Epub 2020/01/09. doi: 10.1186/s12879-019-4739-z
  • 9.Merker M, Barbier M, Cox H, Rasigade JP, Feuerriegel S, Kohl TA, et al. Compensatory evolution drives multidrug-resistant tuberculosis in Central Asia. Elife. 2018;7. Epub 2018/10/31. doi: 10.7554/eLife.38200
  • 10.Schiebelhut LM, Abboud SS, Gomez Daglio LE, Swift HF, Dawson MN. A comparison of DNA extraction methods for high-throughput DNA analyses. Mol Ecol Resour. 2017;17(4):721–9. Epub 2016/10/22. doi: 10.1111/1755-0998.12620
  • 11.Andrews S. FastQC: A Quality Control Tool for High Throughput Sequence Data. 2015; http://www.bioinformatics.babraham.ac.uk/projects/fastqc/.
  • 12.Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25(14):1754–60. Epub 2009/05/20. doi: 10.1093/bioinformatics/btp324
  • 13.Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25(16):2078–9. Epub 2009/06/10. doi: 10.1093/bioinformatics/btp352
  • 14.McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20(9):1297–303. Epub 2010/07/21. doi: 10.1101/gr.107524.110
  • 15.Sobkowiak B, Glynn JR, Houben R, Mallard K, Phelan JE, Guerra-Assuncao JA, et al. Identifying mixed Mycobacterium tuberculosis infections from whole genome sequence data. BMC Genomics. 2018;19(1):613. Epub 2018/08/16. doi: 10.1186/s12864-018-4988-z
  • 16.Phelan JE, O’Sullivan DM, Machado D, Ramos J, Oppong YEA, Campino S, et al. Integrating informatics tools and portable sequencing technology for rapid detection of resistance to anti-tuberculous drugs. Genome Med. 2019;11(1):41. Epub 2019/06/27. doi: 10.1186/s13073-019-0650-x
  • 17.Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics. 2014;30(9):1312–3. Epub 2014/01/24. doi: 10.1093/bioinformatics/btu033
  • 18.Balaban M, Moshiri N, Mai U, Jia X, Mirarab S. TreeCluster: Clustering biological sequences using phylogenetic trees. PLoS ONE. 2019;14(8):e0221068. Epub 2019/08/23. doi: 10.1371/journal.pone.0221068
  • 19.Bouckaert R, Vaughan TG, Barido-Sottani J, Duchene S, Fourment M, Gavryushkina A, et al. BEAST 2.5: An advanced software platform for Bayesian evolutionary analysis. PLoS Comput Biol. 2019;15(4):e1006650. Epub 2019/04/09. doi: 10.1371/journal.pcbi.1006650
  • 20.Menardo F, Duchene S, Brites D, Gagneux S. The molecular clock of Mycobacterium tuberculosis. PLoS Pathog. 2019;15(9):e1008067. Epub 2019/09/13. doi: 10.1371/journal.ppat.1008067
  • 21.Didelot X, Fraser C, Gardy J, Colijn C. Genomic Infectious Disease Epidemiology in Partially Sampled and Ongoing Outbreaks. Mol Biol Evol. 2017;34(4):997–1007. Epub 2017/01/20. doi: 10.1093/molbev/msw275
  • 22.Warren JL, Chitwood MH, Sobkowiak B, Crudu V, Colijn C, Cohen T. Statistical methods for modeling spatially-referenced paired genetic relatedness data. arXiv. 2021, 2109:14003.
  • 23.Holt KE, McAdam P, Thai PVK, Thuong NTT, Ha DTM, Lan NN, et al. Frequent transmission of the Mycobacterium tuberculosis Beijing lineage and positive selection for the EsxW Beijing variant in Vietnam. Nat Genet. 2018;50(6):849–56. Epub 2018/05/23. doi: 10.1038/s41588-018-0117-9
  • 24.Jenkins HE, Plesca V, Ciobanu A, Crudu V, Galusca I, Soltan V, et al. Assessing spatial heterogeneity of multidrug-resistant tuberculosis in a high-burden country. Eur Respir J. 2013;42(5):1291–301. Epub 2012/10/27. doi: 10.1183/09031936.00111812
  • 25.Brown TS, Eldholm V, Brynildsrud O, Osnes M, Stennis N, Stimson J, et al. Evolution and emergence of multidrug-resistant Mycobacterium tuberculosis in Chisinau, Moldova. Microb Genom. 2021;7(8):000620. doi: 10.1099/mgen.0.000620
  • 26.Sinkov V, Ogarkov O, Mokrousov I, Bukin Y, Zhdanova S, Heysell SK. New epidemic cluster of pre-extensively drug resistant isolates of Mycobacterium tuberculosis Ural family emerging in Eastern Europe. BMC Genomics. 2018;19(1):762. Epub 2018/10/24. doi: 10.1186/s12864-018-5162-3
  • 27.Crudu V, Merker M, Lange C, Noroc E, Romancenco E, Chesov D, et al. Nosocomial transmission of multidrug-resistant tuberculosis. Int J Tuberc Lung Dis. 2015;19(12):1520–3. Epub 2015/11/29. doi: 10.5588/ijtld.15.0327
  • 28.Mokrousov I. Mycobacterium tuberculosis phylogeography in the context of human migration and pathogen’s pathobiology: Insights from Beijing and Ural families. Tuberculosis (Edinb). 2015;95 Suppl 1:S167–76. Epub 2015/03/11. doi: 10.1016/j.tube.2015.02.031
  • 29.Borgdorff MW, van den Hof S, Kalisvaart N, Kremer K, van Soolingen D. Influence of sampling on clustering and associations with risk factors in the molecular epidemiology of tuberculosis. Am J Epidemiol. 2011;174(2):243–51. Epub 2011/05/25. doi: 10.1093/aje/kwr061

Decision Letter 0

Beryne Odeny

21 Jul 2021

Dear Dr Yang,

Thank you for submitting your manuscript entitled "Phylogeography and transmission of M. tuberculosis in Moldova" for consideration by PLOS Medicine.

Your manuscript has now been evaluated by the PLOS Medicine editorial staff and I am writing to let you know that we would like to send your submission out for external peer review.

However, before we can send your manuscript to reviewers, we need you to complete your submission by providing the metadata that is required for full assessment. To this end, please login to Editorial Manager where you will find the paper in the 'Submissions Needing Revisions' folder on your homepage. Please click 'Revise Submission' from the Action Links and complete all additional questions in the submission questionnaire.

Please re-submit your manuscript within two working days, i.e. by Jul 23 2021 11:59PM.

Login to Editorial Manager here: https://www.editorialmanager.com/pmedicine

Once your full submission is complete, your paper will undergo a series of checks in preparation for peer review. Once your manuscript has passed all checks it will be sent out for review.

Feel free to email us at plosmedicine@plos.org if you have any queries relating to your submission.

Kind regards,

Beryne Odeny

Associate Editor

PLOS Medicine

Decision Letter 1

Beryne Odeny

7 Oct 2021

Dear Dr. Yang,

Thank you very much for submitting your manuscript "Phylogeography and transmission of M. tuberculosis in Moldova" (PMEDICINE-D-21-03120R1) for consideration at PLOS Medicine.

Your paper was evaluated by a senior editor and discussed among all the editors here. It was also discussed with an academic editor with relevant expertise, and sent to independent reviewers, including a statistical reviewer. The reviews are appended at the bottom of this email and any accompanying reviewer attachments can be seen via the link below:

[LINK]

In light of these reviews, I am afraid that we will not be able to accept the manuscript for publication in the journal in its current form, but we would like to consider a revised version that addresses the reviewers' and editors' comments. Obviously we cannot make any decision about publication until we have seen the revised manuscript and your response, and we plan to seek re-review by one or more of the reviewers.

In revising the manuscript for further consideration, your revisions should address the specific points made by each reviewer and the editors. Please also check the guidelines for revised papers at http://journals.plos.org/plosmedicine/s/revising-your-manuscript for any that apply to your paper. In your rebuttal letter you should indicate your response to the reviewers' and editors' comments, the changes you have made in the manuscript, and include either an excerpt of the revised text or the location (eg: page and line number) where each change can be found. Please submit a clean version of the paper as the main article file; a version with changes marked should be uploaded as a marked up manuscript.

In addition, we request that you upload any figures associated with your paper as individual TIF or EPS files with 300dpi resolution at resubmission; please read our figure guidelines for more information on our requirements: http://journals.plos.org/plosmedicine/s/figures. While revising your submission, please upload your figure files to the PACE digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at PLOSMedicine@plos.org.

We expect to receive your revised manuscript by Oct 28 2021 11:59PM. Please email us (plosmedicine@plos.org) if you have any questions or concerns.

***Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.***

We ask every co-author listed on the manuscript to fill in a contributing author statement, making sure to declare all competing interests. If any of the co-authors have not filled in the statement, we will remind them to do so when the paper is revised. If all statements are not completed in a timely fashion this could hold up the re-review process. If new competing interests are declared later in the revision process, this may also hold up the submission. Should there be a problem getting one of your co-authors to fill in a statement we will be in contact. YOU MUST NOT ADD OR REMOVE AUTHORS UNLESS YOU HAVE ALERTED THE EDITOR HANDLING THE MANUSCRIPT TO THE CHANGE AND THEY SPECIFICALLY HAVE AGREED TO IT. You can see our competing interests policy here: http://journals.plos.org/plosmedicine/s/competing-interests.

Please use the following link to submit the revised manuscript:

https://www.editorialmanager.com/pmedicine/

Your article can be found in the "Submissions Needing Revision" folder.

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Please ensure that the paper adheres to the PLOS Data Availability Policy (see http://journals.plos.org/plosmedicine/s/data-availability), which requires that all data underlying the study's findings be provided in a repository or as Supporting Information. For data residing with a third party, authors are required to provide instructions with contact information for obtaining the data. PLOS journals do not allow statements supported by "data not shown" or "unpublished results." For such statements, authors must provide supporting data or cite public sources that include it.

We look forward to receiving your revised manuscript.

Sincerely,

Beryne Odeny,

PLOS Medicine

plosmedicine.org

-----------------------------------------------------------

Requests from the editors:

1) Please revise your title according to PLOS Medicine's style. Your title must be nondeclarative and not a question. It should begin with main concept if possible. Please place the study design in the subtitle (i.e., after a colon). For example, “Phylogeography and transmission of M. tuberculosis in Moldova: A prospective genomic analysis”

2) The Data Availability Statement (DAS) requires revision. If part of the data is not freely available, please include an appropriate contact (web or email address) for inquiries (this cannot be a study author/ co-author).

3) At this stage, we ask that you write a non-technical Author Summary. The Author Summary should immediately follow the Abstract in your revised manuscript. This text is subject to editorial change and should be distinct from the scientific abstract. The summary should be accessible to a wide audience that includes both scientists and non-scientists. Please see our author guidelines for more information: https://journals.plos.org/plosmedicine/s/revising-your-manuscript#loc-author-summary.

4) Abstract:

a) Please ensure that all numbers presented in the abstract are present and identical to numbers presented in the main manuscript text.

b) In the last sentence of the Abstract Methods and Findings section, please describe the main limitation(s) of the study's methodology.

5) Did your study have a prospective protocol or analysis plan? Please state this (either way) early in the Methods section:

a) If a prospective analysis plan (from your funding proposal, IRB or other ethics committee submission, study protocol, or other planning document written before analyzing the data) was used in designing the study, please include the relevant prospectively written document with your revised manuscript as a Supporting Information file to be published alongside your study and cite it in the Methods section. A legend for this file should be included at the end of your manuscript.

b) If no such document exists, please make sure that the Methods section transparently describes when analyses were planned, and when/why any data-driven changes to analyses took place.

c) In either case, changes in the analysis-- including those made in response to peer review comments-- should be identified as such in the Methods section of the paper, with rationale.

6) Please specify whether informed consent was written or oral.

7) Please ensure that the study is reported according to the STROBE guideline for observational studies, and include the completed STROBE checklist as Supporting Information. Please add the following statement, or similar, to the Methods: "This study is reported as per the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guideline (S1 Checklist)." The STROBE guideline can be found here: http://www.equator-network.org/reporting-guidelines/strobe/

8) Your study is observational and therefore causality cannot be inferred. Please remove language that implies causality throughout the manuscript, such as “... the result of...” For example, you state in line 396-397 that “Our analysis reveals that this heterogeneity is the result of…” Please refer to associations instead.

9) In the Methods and Results section, please consistently provide 95% CIs and p values for estimates in the main text and tables.

10) Figures 1-3, S4-S5:

a) Please ensure that Figure 1 complies with our figure requirement: http://journals.plos.org/plosmedicine/s/figures.

b) Please confirm that the appropriate usage rights apply to the use of this Figure 1. Please see our guidelines for map images: https://journals.plos.org/plosmedicine/s/figures#loc-maps

11) Please do not report P<0.01, instead report as P < 0.001.

12) Please remove the “Role of the funding source”, “Declaration of interests”, and “Data sharing” from the main text. This information will be published as metadata based on your responses to the submission form.

13) Please use PLOS Medicine style in-text reference call outs noting the square brackets. For example "... countries [1,2]."

Comments from the reviewers:

Reviewer #1: General Comments

Yang and colleagues present the results of a prospective genomic analysis of all TB cases in the Republic of Moldova diagnosed in 2018 and 2019. The authors used next generation sequencing and phylogenetic methods to determine the proportion of cases attributable to multi-drug resistant M. tuberculosis (MDR-TB) variants and characterize their genetic clustering and spatial distribution. Finally, they identified independent individual and geographic predictors of recent transmission of MDR-TB. The authors conclude that the MDR-TB epidemic in Moldova is driven by local transmission and that there is an urgent need to apply comprehensive genomic surveillance to understand transmission and potentially to inform the design of public health interventions to contain the DR-TB epidemic.

Major Comments

This is a rigorously performed study that seeks to answer a provocative, if open-ended question, about what can be learned when demographic, spatiotemporal, and pathogen genomic data are integrated together at population level to provide a broad picture of an MDR-TB epidemic in a high-incidence country.

I have a few brief questions and comments about the methods, presentation, and interpretation, as well as a larger question about the place of this work in the literature and its implications.

1) The authors mention the comprehensive sampling approach as a strength and note that some prior studies did not do this, including a recent study from Moldova. Is it possible to comment on how the inferences changed empirically with comprehensive sampling, and if so how? Are there any alternatives (e.g. stratified random samples) to comprehensive sampling that could improve feasibility without sacrificing rigor?

2) Reduced reagent volumes were one of the techniques the authors used to improve the affordability of comprehensive WGS. Should this not be highlighted as a strength or innovation of the study? Are there other learnings for improving feasibility that might inform replication of this study for research or practice?

3) Can the authors state somewhere why Moldova was chosen as a location for this study? Can the authors comment on the possible generalizability of these findings to other countries with DR-TB epidemics, in the region and possibly beyond? Presumably every country has its unique features, but are there particular characteristics of Moldova (beyond its small size) or of its MDR-TB epidemic that made it more suitable for this project than others?

4) The authors present a detailed analysis of pathogen and demographic factors that may drive transmission, but there is very limited clinical information about host factors. Could the authors speculate on the extent to which HIV, or other host factors that also cluster in individuals, might act as confounders of the observed associations? Should this type of clinical information be captured in a comprehensive surveillance system?

5) Can the authors comment on the potential effects of excluding polyclonal infections? I understand that this is technically necessary, but it would help to hear the implications for the overall analysis. For example, to the extent that mixed infections reflect greater transmission, could the overall estimates of recent and local transmission be underestimates?

6) Some additional explanation to help contextualize these findings would help highlight its importance to general audiences. Specifically, the application of TB molecular epidemiology methods to improve global TB control has been somewhat disappointing. To what extent are the two innovations described here, whole genome sequencing and comprehensive surveillance, well-positioned to overcome some of the limitations of prior approaches, and, if adopted, better inform design of interventions to improve TB control and elimination? What are the next steps, scientifically, to move from the detailed understanding provided in this paper to such interventions?

Minor Comments:

Results

Could the authors report the prevalence of pre-XDR and XDR isolates? Figure 2 suggests that the number of rrs and rlsb mutations may be large but the number of gyrA mutations appears small.

Figures and Tables

Figure 1. Please state the number of participants shown in the figure, as well as the total number of districts and regions. Could you elaborate on "The scheme of color represents the distribution of notified incidence and localities with missing population data are in gray." Most of the figure(s) are gray - did you really mean missing denominators, or missing cases (as in no cases reported?), which is what the legend implies?

A flow diagram, as specified by STROBE guidelines, is needed to aggregate in one figure all data on patient inclusion in and exclusion from analyses. This could appear in the Supplement but should be referenced in the main text. The inclusion of a high proportion of the eligible population is actually a strength of the study.

Supplement

There appears to be a discrepancy in the number of polyclonal infections reported in the text (n=386) and the number of mixed infections in the supplement (n=403) - I assume that these should be the same.

The legend and other aspects of Table S4 on Page S18 are printed partly outside the margins.

Reviewer #2: General comments

The manuscript is clearly written and provides additional insight into how the heterogeneity of transmitted MDR-TB is the result of multiple overlapping epidemics in Moldova.

The subject of the manuscript is important, as the 3 main clades of M. tuberculosis transmission have been well addressed in the study, and the findings could be useful for the public health response to this issue.

Major comments

1. The SNP threshold is crucial for defining clusters and exploring transmission. In a high TB epidemic setting, genetic evolution might vary under different selection pressures, so, regarding genomic similarity, it would be reasonable for the authors to describe why SNP thresholds of 40 and 20 were chosen rather than 12 or 18. As we know, cluster rates tend to decrease under these lower thresholds. The authors should carry out this analysis and add the relevant content to the paper.
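The dependence of the cluster rate on the threshold can be illustrated with a minimal sketch. It assumes single-linkage clustering on a pairwise SNP distance matrix; the toy distances and the function name `cluster_rate` are hypothetical and are not the authors' method or data.

```python
from itertools import combinations

def cluster_rate(dist, threshold):
    """Fraction of isolates in a single-linkage cluster of size >= 2."""
    n = len(dist)
    parent = list(range(n))  # union-find forest, one node per isolate

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of isolates whose distance is within the threshold.
    for i, j in combinations(range(n), 2):
        if dist[i][j] <= threshold:
            parent[find(i)] = find(j)

    # Count members per cluster root, then keep clusters with >= 2 isolates.
    sizes = {}
    for i in range(n):
        root = find(i)
        sizes[root] = sizes.get(root, 0) + 1
    clustered = sum(size for size in sizes.values() if size >= 2)
    return clustered / n

# Hypothetical symmetric SNP distance matrix for 5 isolates (not study data).
TOY = [
    [0, 5, 15, 30, 90],
    [5, 0, 16, 35, 95],
    [15, 16, 0, 38, 92],
    [30, 35, 38, 0, 99],
    [90, 95, 92, 99, 0],
]

for t in (40, 20, 12):
    print(t, cluster_rate(TOY, t))
```

On this toy matrix the clustered fraction shrinks as the threshold is lowered from 40 to 20 to 12, which is the pattern the reviewer expects a sensitivity analysis to quantify.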

Minor comments

1. Within this manuscript, the authors have looked at the whole picture of Moldova, having collected all culture-positive isolates in the country. However, given the course of infection and development of TB, the bias from unrecovered strains and missing cases should be pointed out as a limitation.

2. The inference of direct transmission seems to be a solid analysis. However, as TB is a chronic infectious disease with a relatively unclear exposure history, how is indirect transmission addressed in this study?

Reviewer #3: The study is relevant to understanding the distribution and dynamics of TB, with important approaches in phylogenetic analysis and transmission cluster identification as well as in genomic analysis and phylogeny reconstruction. The methodology used is consistent with the objective of the study, which was approved by the Ethics Committee and has a consenting participant enrolment.

The results showed in detail the map of culture-confirmed TB patients in Moldova, the phylogenetic distribution of resistance-related genotypes, the prevalence of drug resistance genotypes, the transmission of drug-resistant M. tuberculosis, and the estimate of M. tuberculosis effective population size for the three major clades over time, which together can inform interventions in the dynamics of disease transmission. The tables present relevant data (demographic and clinical characteristics of study participants/pooled Bayesian meta-analysis inference for each effect on the relative risk scale) and the figures clearly present the results of the study.

The discussion was carried out on a scientific basis and describes the strengths and limitations of the study, which supports carrying out further studies and provides an important example of how such information may be used to understand the complex epidemiology of MDR-TB in a high incidence country.

I congratulate the authors for their initiative in this study and their scientific collaboration in TB control.

Reviewer #4: The authors of "Phylogeography and transmission of M. tuberculosis in Moldova" have done a great job at elucidating the phylogenetic intricacies of TB spread in Moldova, using robust methodologies for phylogeny inference. The biggest strength of this study is that it includes all TB-positive samples from the whole country of Moldova, and hence provides a good overview of the situation concerning tuberculosis in the country. In general, this study is well designed, implemented, and written. However, the huge amount of whole genome sequence data generated by this study could yield more informative results if analyzed a bit further. Kindly find below my suggestions.

A- On line 132 the authors state that in silico drug resistance prediction was carried out using TB-Profiler. Other studies exploring the in silico drug resistance genotypes of TB have noted that the use of more than one software to predict these genotypes would yield complementary results. Although I share the methodological viewpoint of the authors that currently TB-Profiler holds the throne for in silico drug resistance genotyping of TB with its huge database and robust methodology, I am curious to know whether the authors tested a few other bioinformatic tools such as Mykrobe or KvarQ.

Additionally, a study of this scale would benefit from the implementation of a newly developed methodology that reports to have a slightly higher sensitivity than TB-Profiler. Although still in draft form, the tool GenTB reportedly has 3.2% higher sensitivity compared to TB-Profiler in detecting resistance conferring mutations for the nine second-line anti-TB drugs. Seeing as how this could mean that up to 71 samples from this study could have their resistance profiles altered, I suggest that the authors implement the approach outlined in [https://github.com/farhat-lab/gentb-site] and report any changes to their in silico drug resistance prediction results.
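The figure of 71 samples appears consistent with applying the 3.2% difference to all 2,236 sequenced isolates. That reading is an assumption about how the number was derived, and the short check below only reproduces the arithmetic; the function name is hypothetical.

```python
# Figures quoted in the review and abstract; the link between them is an assumption.
TOTAL_ISOLATES = 2236      # isolates with whole genome sequence data
SENSITIVITY_GAIN = 0.032   # reported GenTB sensitivity advantage over TB-Profiler

def max_affected(total, gain):
    """Upper bound on isolates whose predicted resistance profile could change."""
    return int(total * gain)  # truncate to whole isolates

print(max_affected(TOTAL_ISOLATES, SENSITIVITY_GAIN))
```

This is an upper bound only: it would be reached if every isolate carried second-line resistance that one tool detects and the other misses, so the real number of altered profiles is likely to be smaller.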

B- On line 127, and in more detail in the supplementary data, the authors state that samples exhibiting heterogeneity were discarded from the analysis. Although their aim to minimise the discordance between genomic variation and phylogenetic relatedness clarifies why 18% of their samples were removed from the phylogenetic analyses, I believe that there is potential in analysing these 403 heterogeneous samples.

Other publications have shown that polyclonal TB infections are of clinical interest, as highlighted in (https://dx.doi.org/10.3201%2Feid2311.170077), a paper also senior-authored by Dr. Cohen. Therefore, I suggest that these 403 samples be checked for the presence of mutations encoding drug resistance and reported in a table format without a phylogeny, simply for the added value of reporting the MDR status of polyclonal infections in Moldova. Consequently, there should be a brief discussion on the importance of polyclonal TB infections and their clinical implications.

https://www.nature.com/articles/srep41410

https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0237345

https://www.sciencedirect.com/science/article/pii/S1472979217302251

https://www.sciencedirect.com/science/article/pii/S1472979221000706

I hope that these suggestions aid in improving your already well-drafted manuscript.

Balig Panossian

Reviewer #5: Alex McConnachie, Statistical Review

The paper by Yang and colleagues looks at the genetics of tuberculosis cases in Moldova in 2018/19. Much of the paper is beyond my area of expertise, but I can comment on the spatial/genetic distance analysis described from line 162 onwards.

This analysis takes each pair of cases within a cluster as the unit of analysis, and treats the patristic distance between them as the outcome. Again, this is not my field, but I believe this is a measure of the "genetic distance" between two cases. The predictors are measures of within-pair differences, in space, time, or demographics, and some individual demographic measures.

The model assumes that the log-transformed patristic distance is a linear function of the predictor variables, with Normally-distributed, independent errors. The authors do not give any indication of whether these assumptions about the error part of the model are reasonable. I would be particularly concerned about the independence assumption, since the outcomes are derived from pairs of cases. I would expect this to lead to some in-built correlation between residuals. In general, failure to account for correlations between residuals will tend to result in overly-precise estimates of associations.

The results of these analyses are combined across clusters using a Bayesian meta-analysis, which is good. However, the associations are reported as relative risks, which seems wrong, since the outcome is continuous. I am never quite sure of the best term to use, but the exponentiated regression coefficients could perhaps be described as ratios of geometric means associated with a given change in a predictor, but if the authors can think of a better descriptor, that would be fine.
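The suggested wording can be made concrete with a short derivation. The notation is assumed for illustration and is not taken from the manuscript: $d$ is the patristic distance for a pair, $x$ is a single predictor, and $\beta_0$, $\beta_1$, $\varepsilon$ are the regression terms.

```latex
\begin{align*}
\log d &= \beta_0 + \beta_1 x + \varepsilon, \qquad \mathbb{E}[\varepsilon] = 0 \\
\mathrm{GM}(d \mid x) &= \exp\bigl(\mathbb{E}[\log d \mid x]\bigr) = \exp(\beta_0 + \beta_1 x) \\
\frac{\mathrm{GM}(d \mid x + 1)}{\mathrm{GM}(d \mid x)}
  &= \frac{\exp\bigl(\beta_0 + \beta_1 (x + 1)\bigr)}{\exp(\beta_0 + \beta_1 x)} = \exp(\beta_1)
\end{align*}
```

Under these assumptions, $\exp(\beta_1)$ is the ratio of geometric mean distances per unit change in $x$, which is why "relative risk" is not a natural label for it.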

Similar comments could be made about the Poisson version of this analysis reported in the appendix. Are the errors really independent? Is the Poisson distribution assumption (mean = variance) reasonable? Is "relative risk" a sensible term to describe the estimated associations?
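A quick first look at the mean = variance assumption is the variance-to-mean ratio of the counts, where values well above 1 suggest overdispersion. The sketch below uses hypothetical counts and ignores covariates; a model-based check would instead use a dispersion statistic from the fitted regression.

```python
from statistics import mean, variance

def dispersion_ratio(counts):
    """Sample variance divided by sample mean; close to 1 for Poisson data."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean is zero; ratio undefined")
    return variance(counts) / m

# Hypothetical pairwise SNP distances within one cluster (not study data).
toy_counts = [0, 1, 1, 2, 3, 5, 8, 12, 20, 40]

ratio = dispersion_ratio(toy_counts)
print(f"variance/mean = {ratio:.2f}")
```

A ratio far above 1, as with these toy counts, would point toward an overdispersed alternative such as a negative binomial model.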

One final comment - logistic regression is used to look at factors associated with a case being in a "large cluster", but this is not described in the methods section.

Any attachments provided with reviews can be seen via the following link:

[LINK]

Decision Letter 2

Beryne Odeny

12 Jan 2022

Dear Dr. Yang,

Thank you very much for re-submitting your manuscript "Phylogeography and transmission of M. tuberculosis in Moldova: A prospective genomic analysis" (PMEDICINE-D-21-03120R2) for review by PLOS Medicine.

I have discussed the paper with my colleagues and the academic editor and it was also seen again by four reviewers. I am pleased to say that provided the remaining editorial and production issues are dealt with we are planning to accept the paper for publication in the journal.

The remaining issues that need to be addressed are listed at the end of this email. Any accompanying reviewer attachments can be seen via the link below. Please take these into account before resubmitting your manuscript:

[LINK]

***Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.***

In revising the manuscript for further consideration here, please ensure you address the specific points made by each reviewer and the editors. In your rebuttal letter you should indicate your response to the reviewers' and editors' comments and the changes you have made in the manuscript. Please submit a clean version of the paper as the main article file. A version with changes marked must also be uploaded as a marked up manuscript file.

Please also check the guidelines for revised papers at http://journals.plos.org/plosmedicine/s/revising-your-manuscript for any that apply to your paper. If you haven't already, we ask that you provide a short, non-technical Author Summary of your research to make findings accessible to a wide audience that includes both scientists and non-scientists. The Author Summary should immediately follow the Abstract in your revised manuscript. This text is subject to editorial change and should be distinct from the scientific abstract.

We expect to receive your revised manuscript within 1 week. Please email us (plosmedicine@plos.org) if you have any questions or concerns.

We ask every co-author listed on the manuscript to fill in a contributing author statement. If any of the co-authors have not filled in the statement, we will remind them to do so when the paper is revised. If all statements are not completed in a timely fashion this could hold up the re-review process. Should there be a problem getting one of your co-authors to fill in a statement we will be in contact. YOU MUST NOT ADD OR REMOVE AUTHORS UNLESS YOU HAVE ALERTED THE EDITOR HANDLING THE MANUSCRIPT TO THE CHANGE AND THEY SPECIFICALLY HAVE AGREED TO IT.

Please ensure that the paper adheres to the PLOS Data Availability Policy (see http://journals.plos.org/plosmedicine/s/data-availability), which requires that all data underlying the study's findings be provided in a repository or as Supporting Information. For data residing with a third party, authors are required to provide instructions with contact information for obtaining the data. PLOS journals do not allow statements supported by "data not shown" or "unpublished results." For such statements, authors must provide supporting data or cite public sources that include it.

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript.

Please note, when your manuscript is accepted, an uncorrected proof of your manuscript will be published online ahead of the final version, unless you've already opted out via the online submission form. If, for any reason, you do not want an earlier version of your manuscript published online or are unsure if you have already indicated as such, please let the journal staff know immediately at plosmedicine@plos.org.

If you have any questions in the meantime, please contact me or the journal staff on plosmedicine@plos.org.  

We look forward to receiving the revised manuscript by Jan 12 2022 11:59PM.   

Sincerely,

Beryne Odeny,

PLOS Medicine

plosmedicine.org

------------------------------------------------------------

Requests from Editors:

1. Please provide a web link to access GenBank data (PRJNA736718)

2. Author summary – Regarding the content under “What did the researchers do and find?” please trim it down to 3 or 4 bullet points.

3. Your study is observational and therefore causality cannot be inferred. Please remove language that implies causality throughout the manuscript abstract, summary, main text and conclusions, such as, “is fueled by …” Please refer to associations, e.g., “…is associated with …” or similar

4. The terms gender and sex are not interchangeable (as discussed in http://www.who.int/gender/whatisgender/en/ ); please use the appropriate term

5. Please temper claims of primacy of results (e.g., the first study to…) by stating, "to our knowledge" or something similar.

6. Thank you for providing your STROBE checklist. Please replace the page numbers with paragraph numbers per section (e.g. "Methods, paragraph 1"), since the page numbers of the final published paper may be different from the page numbers in the current manuscript.

7. Please provide the meaning of abbreviations used in tables and figures e.g. yr, IQR, SD, MDR. Please provide this in the footnotes.

Comments from Reviewers:

Reviewer #1: The Authors have addressed my previous questions and comments in their responses. I offer my congratulations on their work.

Reviewer #2: The authors have addressed the comments and modified the manuscript accordingly; I feel comfortable with this version.

Reviewer #4: The authors have worked diligently to address the suggestions put forth by the other reviewers and myself. I have no further comments on this polished version of the manuscript.

Reviewer #5: Alex McConnachie, Statistical Review

I thank the authors for their responses to my original points, which were more than satisfactory. I am sorry that they had to do all that extra work to fit a model that made little difference to the final results!

I noticed a possible typo. On line 219, just before the regression equation, the sentence ends "...and other covariates such as" which sounds as if something is missing.

I have no further comments.

Any attachments provided with reviews can be seen via the following link:

[LINK]

Decision Letter 3

Beryne Odeny

25 Jan 2022

Dear Dr. Yang,

Thank you very much for re-submitting your manuscript "Phylogeography and transmission of M. tuberculosis in Moldova: A prospective genomic analysis" (PMEDICINE-D-21-03120R3) for review by PLOS Medicine.

I am pleased to say that provided the remaining editorial and production issues are dealt with we are planning to accept the paper for publication in the journal.

The remaining issues that need to be addressed are listed at the end of this email. Any accompanying reviewer attachments can be seen via the link below. Please take these into account before resubmitting your manuscript:

[LINK]

We look forward to receiving the revised manuscript by Feb 01 2022 11:59PM.   

Sincerely,

Beryne Odeny,

PLOS Medicine

plosmedicine.org

------------------------------------------------------------

Requests from Editors:

1. Please replace the term "gender" with "sex." PLOS Medicine style is to use biological "sex" rather than "gender", where appropriate. Apologies for the confusion caused with the terms used.

Any attachments provided with reviews can be seen via the following link:

[LINK]

Decision Letter 4

Beryne Odeny

31 Jan 2022

Dear Dr Yang, 

On behalf of my colleagues and the Academic Editor, Dr. Claudia M. Denkinger, I am pleased to inform you that we have agreed to publish your manuscript "Phylogeography and transmission of M. tuberculosis in Moldova: A prospective genomic analysis" (PMEDICINE-D-21-03120R4) in PLOS Medicine.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. Please be aware that it may take several days for you to receive this email; during this time no action is required by you. Once you have received these formatting requests, please note that your manuscript will not be scheduled for publication until you have made the required changes.

In the meantime, please log into Editorial Manager at http://www.editorialmanager.com/pmedicine/, click the "Update My Information" link at the top of the page, and update your user information to ensure an efficient production process. 

PRESS

We frequently collaborate with press offices. If your institution or institutions have a press office, please notify them about your upcoming paper at this point, to enable them to help maximise its impact. If the press office is planning to promote your findings, we would be grateful if they could coordinate with medicinepress@plos.org. If you have not yet opted out of the early version process, we ask that you notify us immediately of any press plans so that we may do so on your behalf.

We also ask that you take this opportunity to read our Embargo Policy regarding the discussion, promotion and media coverage of work that is yet to be published by PLOS. As your manuscript is not yet published, it is bound by the conditions of our Embargo Policy. Please be aware that this policy is in place both to ensure that any press coverage of your article is fully substantiated and to provide a direct link between such coverage and the published work. For full details of our Embargo Policy, please visit http://www.plos.org/about/media-inquiries/embargo-policy/.

To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Thank you again for submitting to PLOS Medicine. We look forward to publishing your paper. 

Sincerely, 

Beryne Odeny 

PLOS Medicine

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 STROBE Checklist. STROBE Statement—Checklist of items that should be included in reports of observational studies.

    STROBE, STrengthening the Reporting of OBservational studies in Epidemiology.

    (DOCX)

    S1 Data. Additional demographic and epidemiological data used in the analysis.

    (CSV)

    S1 Appendix

    Table A: A summary of the lineages found in mixed M. tuberculosis samples from Moldova, as designated by TB-Profiler.

    Table B: A summary of the homogeny in drug resistance mutations present in mixed M. tuberculosis samples from Moldova.

    Table C: In silico drug resistance prediction using TB-Profiler and GenTB tools.

    Table D: Allele counts for 9 SNP variants identified in the esxW gene within the study population, showing counts within samples classified as either Beijing strains (all lineage 2.2.1) or as any other lineage.

    Table E: Demographic associations in cases belonging to large transmission clusters (≥10 cases), identified with patristic distance thresholds of 0.001 and 0.0005. Cases in small clusters (2 to 9 cases) are not included. ORs are calculated using logistic regression and P values by Wald chi-squared test, adjusted for age and sex.

    Table F: Results of the Coalescent Bayesian Skyline analyses of the 3 large clades with specific resistant mutations using an uncorrelated log normal relaxed clock model.

    Table G: Complete Coalescent Bayesian Skyline results of the sensitivity analysis using 3 different clock model settings (strict, log normal relaxed, and exponential relaxed) and 3 clock rate estimates of the 3 large clades with specific resistant mutations. The clock rate used log normal distribution.

    Table H: Pooled Bayesian meta-analysis inference for each exponentiated effect (i.e., ratio of expected SNP distances per specified change in covariate value). Posterior means and 95% quantile-based credible intervals are presented.

    Fig A: The study flow diagram.

    Fig B: Distribution of the proportion of MDR-TB by the regions where they were diagnosed. (a) Regions sorted by the proportion of MDR-TB and (b) the total numbers of MDR-TB isolates from high to low.

    Fig C: (a) A scatterplot showing the pairwise SNP distance (max. 50 SNP differences) plotted against the patristic distance on an M–L phylogeny produced with RAxML between all 2,236 Moldovan isolates with whole genome sequence data. (b) A scatterplot showing the pairwise SNP distance (max. 50 SNP differences) plotted against the patristic distance on an M–L phylogeny produced with RAxML between 1,834 nonmixed Moldovan isolates with whole genome sequence data.

    Fig D: (a) The pairwise SNP distance in 35 large transmission clusters with at least 10 participants involved with the threshold of 0.001. The box plot shows the IQR and median SNP distance of each cluster. (b) The pairwise SNP distance in 26 large transmission clusters with at least 10 participants involved with the threshold of 0.0005. The box plot shows the IQR and median SNP distance of each cluster.

    Fig E: Tree visualizations for remaining 32 transmission clusters (N ≥ 10 isolates), each showing the location of cases in either the Moldova or Transnistria regions along with resistance/susceptibility to anti-TB drugs, as identified by in silico prediction.

    Fig F: Tree visualizations for the 35 transmission clusters (N ≥ 10 isolates), each showing the location of cases in either the Moldova or Transnistria regions along with selected covariates, namely, urban residence, homeless, unsatisfactory living conditions, and former prisoner.

    Fig G: Coalescent Bayesian Skyline plots of the sensitivity analysis using 3 different clock model settings (strict, log normal relaxed, and exponential relaxed) and 3 clock rate estimates of the 3 large clades with specific resistant mutations.

    IQR, interquartile range; MDR-TB, multidrug-resistant tuberculosis; M–L, maximum–likelihood; OR, odds ratio; SNP, single nucleotide polymorphism.

    (PDF)

    Attachment

    Submitted filename: Response letter_final.docx

    Attachment

    Submitted filename: response_tc.docx

    Attachment

    Submitted filename: response letter.docx

    Data Availability Statement

    The genomic data have been made available through GenBank (PRJNA736718, https://www.ncbi.nlm.nih.gov/bioproject/PRJNA736718). Additional data used in the analysis are provided as a CSV file in the Supporting information, with the exception of location data, which cannot be provided because the small number of participants at some locations would allow linkage to individual participants.


    Articles from PLoS Medicine are provided here courtesy of PLOS
