We find that genome-wide DNA transfer by conjugation in mycobacteria affords bacteria that reproduce by binary fission the same advantages of sexual reproduction, and may explain the genomic evolution of Mycobacterium tuberculosis.
Abstract
Horizontal gene transfer (HGT) in bacteria generates variation and drives evolution, and conjugation is considered a major contributor as it can mediate transfer of large segments of DNA between strains and species. We previously described a novel form of chromosomal conjugation in mycobacteria that does not conform to classic oriT-based conjugation models, and whose potential evolutionary significance has not been evaluated. Here, we determined the genome sequences of 22 F1-generation transconjugants, providing the first genome-wide view of conjugal HGT in bacteria at the nucleotide level. Remarkably, mycobacterial recipients acquired multiple, large, unlinked segments of donor DNA, far exceeding expectations for any bacterial HGT event. Consequently, conjugal DNA transfer created extensive genome-wide mosaicism within individual transconjugants, which generated large-scale sibling diversity approaching that seen in meiotic recombination. We exploited these attributes to perform genome-wide mapping and introgression analyses to map a locus that determines conjugal mating identity in M. smegmatis. Distributive conjugal transfer offers a plausible mechanism for the predicted HGT events that created the genome mosaicism observed among extant Mycobacterium tuberculosis and Mycobacterium canettii species. Mycobacterial distributive conjugal transfer permits innovative genetic approaches to map phenotypic traits and confers the evolutionary benefits of sexual reproduction in an asexual organism.
Author Summary
Bacteria reproduce by binary fission, generating two clones of the original; this restricts the genomic diversity of the population, which brings with it inherent evolutionary drawbacks. This problem can be eased by conjugation, which transfers DNA from a donor to a recipient bacterium. Understanding the potential of conjugal DNA transfer for generating genetic diversity is necessary for estimating gene flow through populations and for predicting rates of bacterial evolution. The influence of chromosomal conjugal DNA transfer on mycobacterial diversity has not been previously addressed. Here, we determine and compare the complete genome sequences of independent progeny from bacterial matings between defined donor and recipient strains of Mycobacterium smegmatis. We find the resulting hybrid bacteria to be extremely diverse blends of the parental strains, reminiscent of the genetic mixing that occurs through meiotic recombination in sexual organisms. This novel mechanism of conjugation can create genome-wide mosaicism in a single event, generating segments of donor DNA that range from small (∼0.05 kb) to large (∼250 kb), widely distributed around the recipient chromosome. We exploit this mixing by using genetic tools originally developed for finding mammalian disease genes to locate the genes that confer a donor phenotype in M. smegmatis. We speculate that similar genomic mosaicism observed in pathogenic mycobacteria arose from conjugation between ancestral progenitor strains.
Introduction
Sexual reproduction in eukaryotes promotes genetic diversity by increasing gene flow through a population, permitting both the loss of mutant genes and the acquisition of functionally distinct gene alleles. The diversifying potential is further enhanced by crossover events that create new mosaic recombinant meiotic products, which in turn may impart new functionalities not present in either parent. In contrast, bacterial fission provides rapid clonal expansion to fill an environmental niche, but lacks the evolutionary advantages of sexual reproduction. Horizontal gene transfer (HGT) mitigates the diversification constraints of asexual reproduction by mediating limited gene flow through the population. The fundamental forms of HGT include transformation, transduction, and conjugation. Conjugation is considered a major contributor to HGT, as it can transfer more extensive segments of DNA between different species and even kingdoms [1]–[4].
Conjugation describes the unidirectional transfer of DNA from a donor to a recipient, and requires cell–cell contact. Conjugal processes are traditionally plasmid encoded, or encoded by a discrete genetic element integrated into the chromosome. Transfer proteins are generally classified into those that establish and maintain mating-pair formation or those responsible for DNA transfer [5],[6]. These latter proteins recognize and nick the unique origin of transfer (oriT) on the plasmid and guide the DNA into the recipient cell. oriT is cis-acting, and thus, when recombined into the chromosome, it can mediate transfer of chromosomal DNA, as first described for E. coli Hfr strains [7]. DNA transfer in M. smegmatis displays all of the hallmarks of conjugation: it requires stable and extended contact between a donor and a recipient strain, it is DNase resistant, and the transferred DNA segments are incorporated into the recipient chromosome by homologous recombination [8]. While the process clearly meets the traditional definition of conjugation, the similarities with the classical E. coli Hfr system end there [9]–[13]. Mycobacterial conjugation is chromosome—not plasmid—based, and bioinformatic and genetic studies have yet to identify a genetic element that might mediate transfer [14],[15]. In E. coli, Hfr transfer always initiates at the sole plasmid-encoded oriT site, and the DNA is transferred in a 5′ to 3′ direction, such that only genes proximal and 3′ to oriT are inherited at high frequencies [10],[16]. By contrast, in M. smegmatis, all regions of the chromosome are transferred with comparable efficiencies as demonstrated by equivalent transfer of a kanamycin-resistance marker regardless of its chromosomal location [11]. This position independence is consistent with the presence of multiple, but ill-defined, initiation sites [17].
Transposon mutagenesis screens provided initial insights into the genetic requirements of transfer [14],[15]. These studies established a prominent role for the Type VII secretion apparatus, ESX-1, in both donor and recipient activity. ESX-1 clearly plays different roles in each cell type. ESX-1 donor mutants are hyperconjugative, suggesting secretion plays a role in negatively regulating transfer activity [15]. By contrast, recipient strain ESX-1 mutants do not receive donor DNA [14]. Although these studies provided novel insights into the functional roles of ESX-1, they did not provide insights on the transfer mechanism, or define what determines the mating type of a cell (either donor or recipient).
Here, as an alternative approach, we examined the products of DNA transfer to better understand this process and its contributions to mycobacterial evolution. We used next-generation sequencing to determine the parental inheritance profiles in transconjugant M. smegmatis progeny. The genomic sequence of each of the M. smegmatis parental strains has been determined, and the abundant single nucleotide polymorphisms between the two strains indicated that the transferred segments comprising the transconjugant genomes could be mapped with precision. We found that the parental contributions to the transconjugants were much more complex than expected, indicating a surprisingly major role for conjugal DNA transfer in generating genomic diversity. The blending of the parental genomes is reminiscent of that seen in the meiotic products of sexual reproduction. This comparison is validated by our use here of genomic approaches previously developed and applied in sexual reproduction systems to define candidate genes for conjugal mating identity.
Results
Transconjugant Genomes Are Highly Mosaic
To provide a selectable marker for chromosomal DNA transfer, a kanamycin resistance gene (Kmr) was integrated in the chromosome of mc2155, the standard laboratory and conjugal donor strain of M. smegmatis. Donor mc2155 derivatives that differed in their Kmr insertion site were mated to an apramycin-resistant (Apr) recipient strain, mc2874 (Figure 1A). mc2874 is an independent isolate of M. smegmatis that we have used as a standard recipient strain [8],[18]. Apramycin resistance was episomally encoded to avoid inheritance biases caused by selecting for this gene on the recipient chromosome. From matings between these strains, 12 independent KmrApr F1 progeny were isolated, and the DNA sequences of their genomes were determined (sequence data deposited in the EBI/ENA database at http://www.ebi.ac.uk/ena/data/view/ERP002619). Our comparative sequence analyses of the parental strains had shown that the circular mc2155 and mc2874 genomes are collinear, and that they contained abundant single nucleotide polymorphisms (SNPs; averaging one per 56 bp) providing a clear distinction between parental DNA origins (Figures 1A and S1). Individual sequence reads from each transconjugant were aligned with the donor strain genome to identify all transferred donor segments. When evaluating transconjugant sequences, we conservatively required the presence or absence of two consecutive recipient SNPs to define a boundary between recipient and donor sequence tracts, respectively (Figure S2). Donor segments replaced the corresponding recipient sequences, as evidenced by a concomitant localized loss of recipient-specific SNPs in transconjugants. Unique segments of transferred donor DNA, predicted by alignment analyses in transconjugants, were confirmed by conventional PCR and Sanger sequencing (Table S1). Two transconjugants had 11 regions that were merodiploid (approximately equal contributions of donor and recipient SNPs). 
As this was a resequencing and not a de novo sequencing strategy, we cannot determine the precise architecture and location of these regions. These regions did not contain repetitive elements, though it is possible that integration occurred at nonhomologous sites via microhomology or through mechanisms not requiring homology.
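The boundary rule described above can be illustrated with a minimal sketch. It assumes that every parental SNP position in a transconjugant has already been scored as donor-like ("D", recipient SNP absent) or recipient-like ("R", recipient SNP present). The function name, the input encoding, and the toy coordinates are hypothetical; this is not the authors' pipeline, only one way to apply a two-consecutive-SNP rule.

```python
from itertools import groupby

def call_donor_tracts(positions, calls, min_run=2):
    """Return (first_snp, last_snp) pairs for donor tracts.

    A tract opens only at a run of >= min_run donor-like SNPs and closes
    only at a run of >= min_run recipient-like SNPs, so isolated single
    calls never create a boundary.
    """
    tracts = []
    start = end = None  # currently open donor tract, if any
    for origin, group in groupby(zip(positions, calls), key=lambda pc: pc[1]):
        run = [pos for pos, _ in group]
        if origin == "D":
            if start is None and len(run) >= min_run:
                start = run[0]          # open a new donor tract
            if start is not None:
                end = run[-1]           # extend the open tract
        elif start is not None and len(run) >= min_run:
            tracts.append((start, end))  # two recipient SNPs close the tract
            start = end = None
    if start is not None:
        tracts.append((start, end))
    return tracts

# Toy example (hypothetical SNP coordinates and calls)
snp_positions = [100, 150, 210, 260, 300, 350, 420, 480, 530, 600]
snp_calls = "RRDDRDDRRD"
print(call_donor_tracts(snp_positions, snp_calls))
```

In this toy input the single recipient-like call at position 300 does not split the tract, and the single donor-like call at position 600 does not open one, which mirrors the conservative intent of the rule.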
The most striking observation from an alignment of our initial set of 12 transconjugant genomes with the parental genomes was that the transconjugant genomes were broadly mosaic, containing at least two, and as many as 21, separate tracts of cotransferred mc2155 DNA embedded in an mc2874 background (Figure 1B and Table S2). These separate segments of DNA were acquired in a single cell–cell transfer event, as determined in earlier studies [11]. To our knowledge, this degree of genome-wide diversity is unprecedented in genetic transfer events between bacteria. This contrasts directly with the iconic plasmid-transfer systems in which a single segment of donor DNA linked to oriT is inherited [10],[19]. Therefore, we refer to mycobacterial conjugation as distributive conjugal transfer to distinguish it from oriT-mediated transfer.
As expected, all transconjugant progeny acquired the selected Kmr gene, along with variable amounts of flanking mc2155 DNA (Figure 1B, Kmr, green segments embedded in yellow recipient DNA). Surprisingly, 4-fold more mc2155 DNA was co-inherited in segments that were not selected, and these segments were distributed around the genome with no obvious regional biases (Figure 1B, alternating blue and magenta improve visual discrimination between adjacent tracts; Table S2). The 12 transconjugant genomes analyzed contained from 57 kb to 679 kb (of 6.9 Mb) of mc2155-derived sequence. The sizes of the donor segments varied >1,000-fold, ranging from 59 bp to 226 kb (Figure S3 and Table S2), with an average size of 33.8 kb, and a mean of 10 tracts per genome (Table 1).
Table 1. Total contributions of donor-derived DNA in transconjugants.
Unit Measured | F1 Recipient (N = 12): N | F1 Recipient: % | F1 Donor (N = 10): N | F1 Donor: % | Backcross (N = 6): N | Backcross: % |
Donor | 4,297,500 | 5.1 | 8,374,846 | 12.0 | 872,006 | 2.1 |
Kmr | 870,085 | 20.2 | 1,093,908 | 13.1 | 450,405 | 51.7 |
esx1 | 4,399 | 0.0 | 1,413,460 | 16.8 | 311,489 | 68.5a |
Unselected b | 3,427,415 | 79.8 | 6,168,951 | 73.6 | 329,884 | 37.8 |
Total segs | 124 | | 166 | | 28c | |
The total number of base pairs in donor-derived segments was calculated for three transconjugant cohorts (itemized lists appear in Tables S2 and S4). The total DNA can be subdivided into DNA associated with the selected Kmr gene, esx1, and unselected DNA. Transconjugant cohorts are recipient-proficient F1 transconjugants (Figure 1); donor-proficient F1 transconjugants (Figure 2), for which esx1 was enriched by screening for donor function; and backcross transconjugants that are either donor or recipient-proficient (Figures 4 and 5). Donor percentages assume 7 Mb of DNA per transconjugant genome, whereas percentages for segments that spanned Kmr, esx1, or were transferred but not selected were calculated per the total amount of donor DNA transferred in that cohort.
a. The percentage for esx1 segments in the backcrosses was calculated for the three donor-proficient derivatives.
b. Unselected segments are not contiguous with the donor-derived Kmr gene or esx1 locus.
c. Only three of the transferred segments were unchanged from their ancestral F1 parental boundaries, with the remainder representing subdivided fragments of previously uninterrupted donor tracts.
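The percentage conventions in the Table 1 legend can be reproduced with a short calculation. The sketch below uses the base-pair totals from Table 1 and the 7 Mb per-genome figure stated in the legend; the dictionary layout and function names are illustrative assumptions, not part of the original analysis.

```python
GENOME_BP = 7_000_000  # per-genome size assumed in the Table 1 legend

# Base-pair totals copied from Table 1 (n = number of genomes in the cohort)
cohorts = {
    "F1 recipient": {"n": 12, "donor": 4_297_500, "kmr": 870_085, "unselected": 3_427_415},
    "F1 donor": {"n": 10, "donor": 8_374_846, "kmr": 1_093_908, "unselected": 6_168_951},
    "Backcross": {"n": 6, "donor": 872_006, "kmr": 450_405, "unselected": 329_884},
}

def donor_percent(cohort):
    """Donor DNA as a percentage of all genome sequence in the cohort."""
    return 100 * cohort["donor"] / (cohort["n"] * GENOME_BP)

def share_of_donor(cohort, key):
    """A segment class as a percentage of the donor DNA in that cohort."""
    return 100 * cohort[key] / cohort["donor"]

for name, c in cohorts.items():
    print(f"{name}: donor {donor_percent(c):.1f}% of genome; "
          f"Kmr-linked {share_of_donor(c, 'kmr'):.1f}% and "
          f"unselected {share_of_donor(c, 'unselected'):.1f}% of donor DNA")
```

Some recomputed values may differ from the published table in the last decimal place, depending on how the original figures were rounded.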
Some regions showed intricate microcomplexity of multiple inherited segments separated by short intervals of recipient DNA (Figure 1C and highlighted in Table S2). Note that the single-nucleotide discrepancies (colored SNPs) derive from parental inheritance, not de novo mutation (see reciprocal parental reference sequence alignments in Figure 1C). These likely resulted from a combination of repair and recombination events occurring between the recipient chromosome and a single molecule of introduced donor DNA, as some segments are separated by only a few base pairs. Regardless of the mechanism, the net effect was to create a localized composite blend of parental contributions at the nucleotide level.
DCT Facilitates a Genome-Wide Mapping Approach That Identifies a Mating Identity (Mid) Locus
The image in Figure 1B shows the extent of mc2155 DNA transferred to recipients when selecting for a single event: acquisition of the gene encoding Kmr. Based on the distributive nature of transfer, we reasoned that we could employ secondary screens of the transconjugants to map any additional genetic trait regardless of its linkage to the Kmr gene. Tracking parental SNPs within a group of individual transconjugants exhibiting a given phenotype should identify those shared SNPs (and parental genes) associated with that phenotype. We have previously observed that a subset of transconjugants become donors, suggesting that these progeny acquired a donor-conferring locus [11]. We hypothesized that an unbiased genome-wide mapping approach would identify a shared segment of mc2155 DNA among those progeny encoding this trait. Transconjugants derived from crosses of the differentially marked donor strains were screened for donor ability, and 10 independent donor-proficient transconjugants were identified. We note that mating identity is a mutually exclusive phenotype, and transconjugants exhibit transfer efficiencies comparable to parental strains ([11] and Table S3). Genomic DNA from each donor-proficient transconjugant was prepared and its sequence determined. Comparative sequence analysis showed that all donor-proficient transconjugants, regardless of the location of the Kmr gene in the parent, shared only one segment of mc2155 DNA (Figure 2A and Table S4), with the smallest region of overlap encompassing coordinates 74,522 to 119,788 bp (Figure 2B). This result is consistent with transfer of a single 45 kb locus (mid) that is sufficient to switch mating identity from recipient to donor in these transconjugants.
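The mapping logic in this paragraph amounts to intersecting the donor tracts of all transconjugants that share a phenotype. The following minimal sketch illustrates that idea. The strain names and segment coordinates are hypothetical and were chosen only so that the shared interval matches the reported 74,522 to 119,788 bp region; the real segments are listed in Table S4. Tracts that wrap around the origin of the circular chromosome are not handled here.

```python
from functools import reduce

def intersect(a, b):
    """Pairwise intersection of two lists of (start, end) intervals."""
    out = []
    for s1, e1 in a:
        for s2, e2 in b:
            s, e = max(s1, s2), min(e1, e2)
            if s <= e:
                out.append((s, e))
    return sorted(out)

def shared_regions(segments_by_strain):
    """Donor regions present in every strain of the cohort."""
    return reduce(intersect, segments_by_strain.values())

# Hypothetical donor tracts for three donor-proficient transconjugants
donor_tracts = {
    "T1": [(60_000, 119_788), (2_500_000, 2_540_000)],
    "T2": [(74_522, 180_000), (4_100_000, 4_130_000)],
    "T3": [(70_000, 150_000), (2_510_000, 2_520_000)],
}

for start, end in shared_regions(donor_tracts):
    print(f"shared donor region: {start:,}-{end:,} ({end - start:,} bp)")
```

Unselected tracts that happen to be shared by some strains but not all drop out of the intersection, which is why increasing the number of independent transconjugants sharpens the mapped interval.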
This region is not simply a hot spot for integration of acquired DNA, since the 12 recipient-proficient (i.e., did not become donors) transconjugants in Figure 1B were not similarly enriched for this segment of mc2155 DNA (compare Figures 1B and 2A, and see below). Closer examination of the region acquired by donor-proficient transconjugants established that they all had inherited a minimal segment of DNA encompassing the mc2155 esx1 locus (Figure 2B, 74,600–107,334 bp, esx1 D, where the subscript differentiates donor or recipient origin). The esx1 locus encodes a Type VII secretion system [20],[21]. The encoded ESX-1 apparatus assembles in the cell membrane and secretes a specific set of proteins, which, in M. tuberculosis, are essential for pathogenicity [22]–[24]. Proteins secreted by ESX-1 lack a signal peptide that would aid in their identification, and the most notable substrate is a heterodimer of two small proteins, EsxB and EsxA. Other proteins encoded within the esx1 locus and elsewhere in the genome are also secreted through ESX-1, some of which are co-dependent on EsxBA secretion. The functions of most of the proteins encoded by esx1 genes are unknown, but the overall composition of the esx1 loci in the parental mc2155 and mc2874 strains is similar (see below). Although our previous transposon mutagenesis studies have shown that ESX-1 plays an important role in the process of DNA transfer in both donor and recipient strains, mating-type identity is not reversed in ESX-1 mutants [14],[15]. Therefore, the role of ESX-1 in determining mating identity was quite unexpected, and underscores the utility of a “change-of-function” mapping approach.
While all of the donor-proficient transconjugants inherited an intact esx1 D locus, none of the recipient-proficient F1 strains did. Notably, four of the F1 recipient-proficient strains were derived from the Km0.1 parent, in which only 15 kb separate esx1 D and the selected Kmr gene. Despite this tight linkage, distributive conjugal transfer readily segregated the Kmr gene and intact esx1 D locus when appropriately screened, thereby augmenting the mapping resolution (Figure 1B, Table S2, and below). Helpfully, one of these recipient-proficient transconjugants (Km0.1c) inherited parts of esx1 D, excluding these esx1 genes from mid candidacy (0064–0068 and 0077–0083, Table S2). These negative correlations affirm the functional dependence of the donor trait on the mid genes of esx1 D and demonstrate the robust nature of distributive conjugal transfer in generating the level of genetic diversity necessary for our mapping analyses.
Fine Mapping of the Mid Locus by a Backcrossing Analysis
In classical genetic studies, fine mapping of a genetic determinant can be achieved by performing successive backcross introgression analyses to genetically purify a locus in a recipient background. We reasoned a similar strategy would achieve two goals: (1) discard mc2155 parental genes not required for the donor transfer trait and (2) further narrow the key conjugal mid gene region. Six F1 donor recombinants were backcrossed with mc2874 recipient derivatives that were marked with a different episomally encoded antibiotic resistance gene (Hygr or Apr) in successive generations. Introgression entailed co-selection for Kmr transfer and the recipient marker to identify transconjugants at each generation (Nx), and then screening progeny for donor proficiency (Figure 3). Comparative analyses of genomes of three donor-proficient strains showed a purifying selection of the donor-conferring locus and Kmr genes in an otherwise recipient genome (Figure 4, Table S4). In each case, the majority of the F1 mc2155 DNA was lost. For example, the F1 parent of Km0.1BCb contained 19 mc2155 segments totaling over 869 kb, yet following six backcross generations this DNA was trimmed to three segments totaling 110 kb, most of which encompassed the selected mid and Kmr genes (79 kb, Table S4).
As expected, backcross matings also resulted in recipient-proficient progeny, several of which were also sequenced (Figure 3). Coincident with a reversal of mating identity, the esx1 D locus failed to transfer. One recipient strain, Km0.8BC, retained only 75 kb of mc2155 DNA of the 920 kb originally present in the F1 parent (Figure 5, Table S4). Analyses of two recipient-proficient strains derived from independent F1 Km6.9 parents further refined the region of interest. Km6.9BCa included donor genes 0055D–0067D and 0079D–0083D and Km6.9BCb contained genes 0072D–0075D (Figures 5 and 6, Table S4). Thus, these esx1 D genes are insufficient to confer a donor phenotype. Taken together, the mapping data identify esx1 genes in 0069D–0071D and/or 0076D–0078D as being critical for determining mating identity. Ongoing studies requiring multiple, precise, targeted gene swaps will identify the key gene(s).
While most esx1 gene products are highly conserved among mycobacterial species, M. smegmatis proteins 0069, 0070, and the N-terminal two-thirds of 0071 have notably low amino acid identity between donor and recipient orthologs (Figure 6 and Figure S4) [14] and are therefore good candidates for switching mating identity. The proteins encoded from this region are not predicted to contain an obvious motif or domain that would provide mechanistic insight into their role in conjugation. However, the location of the mid genes within esx1 suggests that the encoded proteins modify ESX-1 structure or function, to perhaps affect cell–cell communication or physically mediate DNA transfer.
Discussion
We used next-generation sequencing to examine transconjugant genomes and found that mycobacterial conjugation generates highly mosaic genomes created by a robust distributive conjugal transfer process. Transconjugants acquired large amounts of donor DNA (some exceeding one-fourth of the transconjugant genome; Table S4, Km4.5a), in varied segment sizes (spanning four orders of magnitude) that were distributed around the genome. We exploited these characteristics of distributive conjugal transfer (DCT) to map mating identity genes of M. smegmatis.
Hfr transfer in E. coli is initiated from the unique oriT and results in transfer of a single segment of the donor chromosome [9],[19],[25]. Thus, while the recipient acquires new genetic information, that new information is limited to DNA immediately adjacent and 3′ to oriT (Figure 7, left). Genetic analyses and an understanding of the RecBCD recombination machinery suggest that a single segment is integrated into the recipient chromosome via a recombination event occurring at each end of the transferred DNA molecule [16]. To our knowledge, whole genome sequencing has not been reported for Hfr transconjugants, preventing a detailed comparison of the two conjugation systems. Thus, our study provides the first genome-wide analysis of bacterial conjugal transfer. In contrast to oriT-mediated transfer, the complex inheritance profiles exhibited by mycobacterial transconjugants suggest stochastic co-transfer from multiple origins, as previously predicted [17]. Based on our genome sequence data, we speculate that random chromosomal DNA fragments are generated in the donor, some of which are co-transferred into the recipient strain where they replace recipient sequences through homologous recombination. An alternative scenario is that a single large DNA molecule is transferred and then processed into smaller segments before their integration into the recipient chromosome by homologous recombination. This scenario seems less likely, as we would have expected to identify some transconjugant progeny containing exceedingly large chunks of donor DNA (3–4 Mb) integrated into the chromosome. These would have resulted from recombination close to the ends of the transferred molecule, before creation of small segments. This latter scenario is also less consistent with our previous observations, which indicated that the donor chromosome contained multiple initiation sites and that the efficiency of gene transfer was location-independent.
We have considered examining boundary sequences to determine whether they provide insight on the mechanism of conjugation. However, there are multiple factors influencing boundary regions, which together prevent a unifying mechanistic insight. For example, the actual breakpoints generated by conjugation are almost certainly lost as the boundaries are driven by the requirement for homology and by different recombination mechanisms mediating integration, as evidenced by inheritance of both regions of microheterogeneity and single large integration events.
Mycobacteria encode multiple nonredundant recombination pathways (RecBCD, AdnAB, and nonhomologous end-joining), but are not known to encode a mismatch repair system [26],[27]. We postulate that homologous recombination mediated by AdnAB is likely responsible for the simple crossover events, which is consistent with the absolute requirement for RecA in DCT [17]. However, this form of homologous recombination alone seems insufficient to explain regions of microcomplexity. The clustered proximity of recombinant tracts indicates that an imported donor segment initially encompassed the entire region, but the mechanism underlying the internal mosaicism is unclear. Characterization of the mechanism and the enzymes behind this process will require careful directed approaches using defined recombination mutants.
Every facet of the transfer process contributes to the genetic complexity of the transconjugants (Figure 7). The large number and distributive character of the transferred segments, combined with the microcomplexity in some tracts, make each transconjugant uniquely different from the others, as well as from the parental strains. The widely varied sizes of the transferred segments allow transconjugants to acquire both major changes, potentially bringing in entire operons encoding biological pathways, and minor nucleotide substitutions that provide subtle diversity, which could, for example, modify the activity or interaction specificity of an enzyme. Multiple pan-genomic changes that typically accompany evolution of bacteria are assumed to be a serial accrual of HGT and spontaneous mutation events (Figure 7). By contrast, a single step DCT event between two single cells generates a transconjugant strain that is a mosaic blend of the parental genomes, and not merely an incrementally altered derivative. Thus, distributive conjugal transfer provides an unparalleled mechanism for quickly generating tremendous genetic diversity, which rivals that seen in sexual reproduction [28].
Recent genome-wide studies of naturally competent strains provide an interesting contrast between the progeny of transformation and conjugation [29]–[32]. In these studies, nonselected segments of DNA were also observed around the recipient chromosome and thus contribute to variation. Microcomplexity in these segments suggested that, as for DCT, integration of transformed DNA was mediated by recombination and/or repair machinery. However, the nonselected segments were significantly smaller (1–4 kb, depending on the species) than those described here, which average 49 kb and can be as large as 249 kb (Table S4, Km4.5b: 6,942,375–202,798). The limitation on recombination sizes in pneumococci correlated with an underrepresentation of large insertions, which together argued that transformation led to genome reduction and was unlikely to act as a mechanism for uptake of accessory loci [29]. The large DNA segments acquired via DCT, in contrast, facilitate inheritance of novel operons and genes. For example, one large recombination tract introduced a contiguous stretch of ∼55 kb of nonhomologous donor-derived DNA into the transconjugant chromosome (Km6.9b). Perhaps an example more functionally pertinent to our work was an insertion–deletion exchange observed in the divergent mid candidate region of esx1 in transconjugants switched to donors (Figure S5).
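The Km4.5b coordinates quoted above (6,942,375–202,798) run from a high to a low position because the tract spans the origin of the circular chromosome. A minimal sketch of the length calculation follows. The chromosome length of 6,988,209 bp is an assumption taken from the public mc2155 reference sequence; the text itself gives only approximate genome sizes (6.9 to 7 Mb). The function name is illustrative.

```python
MC2155_GENOME_BP = 6_988_209  # assumed chromosome length; not stated in the text

def tract_length(start, end, genome_bp=MC2155_GENOME_BP):
    """Length of a tract on a circular chromosome.

    If end < start, the tract is taken to wrap around the origin.
    """
    if end >= start:
        return end - start
    return (genome_bp - start) + end

wrapped = tract_length(6_942_375, 202_798)
print(f"Km4.5b tract: about {wrapped / 1000:.0f} kb")
```

Under this assumed genome length the tract comes to roughly 249 kb, in line with the size quoted in the text. A slightly different reference length would shift the result by the same number of base pairs.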
We have demonstrated conjugal DNA transfer in additional naturally derived M. smegmatis strains [8], indicating a broader presence for mycobacterial distributive conjugal transfer. The rough-colony morphology members of the Mycobacterium tuberculosis complex (MTBC) exhibit extremely low genetic variation, suggesting that they do not undergo HGT, are evolutionarily young, and resulted from a recent clonal expansion [33]. However, there is now convincing evidence for HGT among M. canettii, and other smooth-colony MTBC strains, which display genome-wide mosaicism, although the precise mechanism(s) of HGT are unknown [34],[35]. Based on sequence comparisons, it was proposed that M. canettii strains are extant members of a genetically diverse MTBC progenitor species, M. prototuberculosis, whose members underwent frequent HGT [34],[36],[37]. The unspecified HGT process underlying that mosaicism is presumed to result from a series of sequential transfer events. However, based on our studies, distributive conjugal transfer involving the ancestral M. prototuberculosis offers a plausible and parsimonious explanation for the remarkably similar mosaicism observed among the extant M. canettii. We could envision that distributive conjugal transfer in M. prototuberculosis rapidly incorporated the necessary blend of parental genotypes that drove the emergence of the pathogenic, rough-colony morphology species, like M. tuberculosis, allowing their subsequent clonal expansion. Moreover, if DCT drove these postulated HGT events, the evolutionary clock for M. tuberculosis is likely much shorter because of the capacity of DCT to generate genome-wide mosaicism in a single step. Given the widespread nature of conjugation, we speculate that distributive conjugal transfer also occurs in other bacteria, conferring similar evolutionary benefits.
The characteristics of mycobacterial distributive conjugation suggested to us that tools developed for mammalian genetics could be applied here. Using a eukaryotic-style genome-wide association mapping approach, we mapped the mating identity locus (mid) for mycobacterial conjugation (Figure 7). Similarly, we applied a backcross introgression strategy to refine the mapping and to purge extraneous mc2155 sequence (Figure 7). The purifying selection of successive backcross generations effectively introgressed the mc2155 mid locus into the mc2874 background; this created a strain that was nearly isogenic to the mc2874 parent strain, but which now functioned as a conjugal donor. We note that the hybrid esx1 loci produced by distributive conjugal transfer have not been disabled (as in transposon mutagenesis screens), and still encode functional ESX-1 secretory apparatuses that secrete the major ESX-1 substrates (Figure S6). The un-annotated theoretical proteins encoded by the mid candidate genes bear no overt resemblance to those known to be involved in conjugation in other bacteria. Their association with the esx1 locus suggests that Mid proteins modify the ESX-1 secretion system, are secreted by ESX-1, or interact with other ESX-1–secreted substrates. The next step in their functional assessment will likely result from an extension of this work to identify which protein(s) or protein motifs are necessary and sufficient to impart conjugal sex identity. Interestingly, orthologs for the mid candidate genes are found in the sequenced genomes of other environmental mycobacteria, suggesting a possible ongoing role for distributive conjugal transfer in gene flow between mycobacteria. Orthologs of these mid candidates are not apparent in the esx1 locus of M. tuberculosis, consistent with our speculative model that the MTBC represents a clonally expanded product of distributive conjugal transfer, not necessarily an active participant in this process. 
Nevertheless, recent evidence from genome sequencing comparisons indicates that some form of genetic exchange has occurred between M. tuberculosis and M. canettii [35].
While we applied DCT to map mid genes, in principle any genetic trait that differs between the parental strains can be mapped using this genome-wide mapping strategy. For example, mc2155 and mc2874 grossly differ in colony morphology, biofilm formation, and phage susceptibility, any of which could have been scored as a change of function in the recipient and mapped by DCT. Similarly, biochemical differences between these strains could be discerned through simple, high-throughput assays. We recognize that more traditional approaches for mutagenic loss-of-function mapping [38],[39] will remain important in mycobacterial studies, but this new application of conjugation now allows any phenotype that differs between a mating pair to be unambiguously mapped.
Our analysis of distributive conjugal transfer (DCT) in M. smegmatis has practical and conceptual ramifications. It brings new tools to mycobacteriology, including those traditionally used exclusively in eukaryotic genetics. It also shows how bacterial evolutionary time scales can be compressed by generating extensive genetic diversity in a single step. Identifying the necessary components, such as esx1 and mid, will help to elucidate the mechanism, to allow modification of the system, and to computationally identify bacteria that actively participate in DCT, or to engineer them to do so. Our previous finding of DCT in a mixed biofilm [40] underscores the importance of predicting how prevalent DCT may be in nature, both to interpret metagenomic datasets more accurately and to model gene flow through bacterial populations. Regardless of these secondary ramifications, our primary finding, the tremendous genomic variation generated by DCT, is a significant step toward bringing the evolutionary benefits of sexual reproduction to bacteria.
Materials and Methods
Mycobacterial Strains and Conjugation
M. smegmatis donor strains were derivatives of the laboratory strain mc2155 [41]. Each derivative has a KmR gene inserted at a unique location in the chromosome [11], which was mapped by sequencing the flanking DNA and aligning it to the mc2155 genome sequence (http://cmr.jcvi.org/tigr-scripts/CMR/GenomePage.cgi?org=gms) or to the draft genome of the recipient (GenBank CM001762). The recipient strain mc2874 [18],[42] was transformed with a plasmid encoding either apramycin or hygromycin resistance to allow counterselection against the donor. M. smegmatis strains were cultured at 37°C in Trypticase Soy Broth with 0.05% Tween80, or on Trypticase Soy Agar (TSA) plates. Antibiotics were added at 100 µg/ml (apramycin), 100 µg/ml (hygromycin), and 10 µg/ml (kanamycin). DNA transfer experiments were carried out as described previously, selecting for dual-resistant transconjugants [8]. To allow selection in the reiterative backcrosses, the recipient strain was alternated between one encoding apramycin resistance and one encoding hygromycin resistance. Each independent transconjugant was assayed in subsequent mating experiments, in parallel with positive controls, to determine whether it was a donor or a recipient. As we have observed previously [8], this phenotype was mutually exclusive. Donor transfer frequencies were determined from the average of three independent mating experiments as described previously [8]. Zero transconjugants were obtained with recipient strains, below the sensitivity threshold of one event per 10^8 cells [8].
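As a worked illustration of the arithmetic, the following sketch averages transfer frequencies over replicate matings and compares the result with a detection limit. It assumes that frequency is expressed as transconjugants per donor cell, which is a common convention but is not specified in the text above; the function names and the counts in the example are hypothetical.

```python
from statistics import mean


def transfer_frequency(transconjugants, donor_cells):
    """Transconjugants per donor cell for one mating (assumed convention)."""
    if donor_cells <= 0:
        raise ValueError("donor_cells must be positive")
    return transconjugants / donor_cells


def summarize_matings(experiments, detection_limit=1e-8):
    """Average the frequency over replicate matings.

    experiments: list of (transconjugant_count, donor_cell_count) pairs.
    detection_limit: lowest measurable frequency; the default of 1e-8
    corresponds to one event per 10^8 cells.
    Returns (mean_frequency, is_detectable).
    """
    freq = mean(transfer_frequency(t, d) for t, d in experiments)
    return freq, freq >= detection_limit


# Hypothetical counts from three independent matings.
example_freq, example_detectable = summarize_matings(
    [(120, 1e8), (80, 1e8), (100, 1e8)]
)
```

A strain that yields no transconjugants in any replicate returns a frequency of zero, which is reported here as below the detection limit rather than as a measured value.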
Genomic Sequencing and Analysis
Transconjugants were colony purified, and genomic DNA was prepared and then subjected to whole-genome DNA sequence analysis at the Institute for Genome Sciences (IGS), U. Maryland, using paired-end Illumina technology. The sequence coverage for each genome ranged from 50-fold for F1 progeny to ∼1,000-fold for backcross strains. Sequence reads were mapped to the mc2155 reference sequence by IGS. Single nucleotide polymorphisms (SNPs) or sequence gaps were identified using the Integrative Genomics Viewer (IGV) [43] to define genomic regions of different parental origins. Boundaries of recipient- and donor-derived segments were recorded as the last recipient SNP observed, with a minimum of two consecutive SNPs defining parental identity (Figure S2). A donor segment unique to each transconjugant was identified to confirm accuracy of the aligned sequence reads. Primers were designed to specifically amplify these segments, and the amplified products were cloned and sequenced (Table S1) to confirm that donor SNPs had been inherited by the recipient. A compilation of the donor and recipient segments from each transconjugant was projected onto the circular mycobacterial donor chromosome reference sequence, arranged as concentric circles of a Circos plot [44], with color optimization guided by ColorBrewer (Cynthia Brewer, The Pennsylvania State University). Collinearity of the donor and recipient genomes was determined using Mauve, a program that was also used to identify SNPs and indels [45],[46]. All sequence data have been deposited at the European Nucleotide Archive at http://www.ebi.ac.uk/ena/data/view/ERP002619.
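The boundary rule can be sketched in code. This is one possible reading of the rule, not the procedure used here, in which segments were defined by inspection in IGV. The sketch assumes that SNP calls are available as position-sorted (position, origin) pairs with origin "D" for donor or "R" for recipient, calls a donor segment only when at least two consecutive SNPs are donor-type, and records each boundary at the nearest flanking recipient SNP. The function name and input format are hypothetical, and coordinates are treated as linear rather than circular.

```python
from itertools import groupby


def call_donor_segments(snps, min_run=2):
    """Call donor segments from position-sorted SNP origin calls.

    snps: list of (position, origin) pairs sorted by position, where
    origin is "D" (donor allele) or "R" (recipient allele).
    min_run: minimum number of consecutive donor SNPs needed to call
    a segment.
    Returns a list of (left_boundary, right_boundary) pairs, where each
    boundary is the position of the flanking recipient SNP, or None if
    the donor run reaches the end of the data.
    """
    # Collapse the SNP list into runs of identical parental origin.
    runs = [
        (origin, [pos for pos, _ in group])
        for origin, group in groupby(snps, key=lambda s: s[1])
    ]

    segments = []
    for i, (origin, positions) in enumerate(runs):
        # Isolated donor SNPs (shorter than min_run) are not called.
        if origin != "D" or len(positions) < min_run:
            continue
        left = runs[i - 1][1][-1] if i > 0 else None
        right = runs[i + 1][1][0] if i + 1 < len(runs) else None
        segments.append((left, right))
    return segments


# Hypothetical SNP calls: one supported donor run and one isolated donor SNP.
example_snps = [
    (100, "R"), (200, "R"), (300, "D"), (400, "D"),
    (500, "R"), (600, "D"), (700, "R"),
]
example_segments = call_donor_segments(example_snps)
```

Because boundaries are placed at recipient SNPs, the reported interval is the maximum possible extent of the donor segment; the true crossover lies somewhere between the boundary SNP and the first donor SNP.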
Supporting Information
Acknowledgments
We are grateful to Nigel Grindley, Joe Wade, and Paul Masters for critical comments; to the Wadsworth Genetics and Computational cores for their services; and to Luke Tallon and Ivette Santana-Cruz at the Institute for Genome Sciences, University of Maryland, for input on sequence determination and analysis.
Abbreviations
- DCT: distributive conjugal transfer
- HGT: horizontal gene transfer
- oriT: origin of transfer
Funding Statement
This work was supported by funds from the Wadsworth Center and grants AI042308 and R56AI080694 to K.M.D. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1. Buchanan-Wollaston V, Passiatore JE, Cannon F (1987) The mob and oriT mobilization functions of a bacterial plasmid promote its transfer to plants. Nature 328: 172–175.
- 2. Frost LS, Leplae R, Summers AO, Toussaint A (2005) Mobile genetic elements: the agents of open source evolution. Nature Reviews Microbiology 3: 722–732.
- 3. Heinemann JA, Sprague GF Jr (1989) Bacterial conjugative plasmids mobilize DNA transfer between bacteria and yeast. Nature 340: 205–209.
- 4. Thomas CM, Nielsen KM (2005) Mechanisms of, and barriers to, horizontal gene transfer between bacteria. Nature Reviews Microbiology 3: 711–721.
- 5. Alvarez-Martinez CE, Christie PJ (2009) Biological diversity of prokaryotic type IV secretion systems. Microbiol Mol Biol Rev 73: 775–808.
- 6. de la Cruz F, Frost LS, Meyer RJ, Zechner EL (2010) Conjugative DNA metabolism in Gram-negative bacteria. FEMS Microbiol Rev 34: 18–40.
- 7. Wollman EL, Jacob F, Hayes W (1956) Conjugation and genetic recombination in Escherichia coli K-12. Cold Spring Harb Symp Quant Biol 21: 141–162.
- 8. Parsons LM, Jankowski CS, Derbyshire KM (1998) Conjugal transfer of chromosomal DNA in Mycobacterium smegmatis. Mol Microbiol 28: 571–582.
- 9. Firth N, Ippen-Ihler K, Skurray RA (1996) Structure and function of the F factor and mechanism of conjugation. In: Escherichia coli and Salmonella Cellular and Molecular Biology, 2nd ed., Neidhardt FC, Curtiss III R, Ingraham JL, Lin ECC, Low KB, et al., editors. Washington, DC: ASM Press.
- 10. Lloyd RG, Low KB (1996) Homologous Recombination. In: Escherichia coli and Salmonella Cellular and Molecular Biology, 2nd ed., Neidhardt FC, Curtiss III R, Ingraham JL, Lin ECC, Low KB, et al., editors. Washington, DC: ASM Press.
- 11. Wang J, Karnati PK, Takacs CM, Kowalski JC, Derbyshire KM (2005) Chromosomal DNA transfer in Mycobacterium smegmatis is mechanistically different from classical Hfr chromosomal DNA transfer. Mol Microbiol 58: 280–288.
- 12. Wollman EL, Jacob F (1955) [Mechanism of the transfer of genetic material during recombination in Escherichia coli K12]. Comptes rendus hebdomadaires des seances de l'Academie des sciences 240: 2449–2451.
- 13. Wollman EL, Jacob F, Hayes W (1956) Conjugation and genetic recombination in Escherichia coli K-12. Cold Spring Harbor symposia on quantitative biology 21: 141–162.
- 14. Coros A, Callahan B, Battaglioli E, Derbyshire KM (2008) The specialized secretory apparatus ESX-1 is essential for DNA transfer in Mycobacterium smegmatis. Mol Microbiol 69: 794–808.
- 15. Flint JL, Kowalski JC, Karnati PK, Derbyshire KM (2004) The RD1 virulence locus of Mycobacterium tuberculosis regulates DNA transfer in Mycobacterium smegmatis. Proc Natl Acad Sci U S A 101: 12598–12603.
- 16. Smith GR (1991) Conjugational recombination in E. coli: myths and mechanisms. Cell 64: 19–27.
- 17. Wang J, Parsons LM, Derbyshire KM (2003) Unconventional conjugal DNA transfer in mycobacteria. Nat Genet 34: 80–84.
- 18. Mizuguchi Y, Suga K, Tokunaga T (1976) Multiple mating types of Mycobacterium smegmatis. Japanese Journal of Microbiology 20: 435–443.
- 19. de la Cruz F, Frost LS, Meyer RJ, Zechner EL (2010) Conjugative DNA metabolism in Gram-negative bacteria. FEMS Microbiology Reviews 34: 18–40.
- 20. Abdallah AM, Gey van Pittius NC, Champion PA, Cox J, Luirink J, et al. (2007) Type VII secretion–mycobacteria show the way. Nature Reviews Microbiology 5: 883–891.
- 21. DiGiuseppe Champion PA, Cox JS (2007) Protein secretion systems in Mycobacteria. Cellular Microbiology 9: 1376–1384.
- 22. Guinn KM, Hickey MJ, Mathur SK, Zakel KL, Grotzke JE, et al. (2004) Individual RD1-region genes are required for export of ESAT-6/CFP-10 and for virulence of Mycobacterium tuberculosis. Molecular Microbiology 51: 359–370.
- 23. Hsu T, Hingley-Wilson SM, Chen B, Chen M, Dai AZ, et al. (2003) The primary mechanism of attenuation of bacillus Calmette-Guerin is a loss of secreted lytic function required for invasion of lung interstitial tissue. Proceedings of the National Academy of Sciences of the United States of America 100: 12420–12425.
- 24. Lewis KN, Liao R, Guinn KM, Hickey MJ, Smith S, et al. (2003) Deletion of RD1 from Mycobacterium tuberculosis mimics bacille Calmette-Guerin attenuation. The Journal of Infectious Diseases 187: 117–123.
- 25. Llosa M, Gomis-Ruth FX, Coll M, de la Cruz F (2002) Bacterial conjugation: a two-step mechanism for DNA transport. Molecular Microbiology 45: 1–8.
- 26. Gupta R, Barkan D, Redelman-Sidi G, Shuman S, Glickman MS (2011) Mycobacteria exploit three genetically distinct DNA double-strand break repair pathways. Mol Microbiol 79: 316–330.
- 27. Warner DF, Mizrahi V (2011) Making ends meet in mycobacteria. Mol Microbiol 79: 283–287.
- 28. Narra HP, Ochman H (2006) Of what use is sex to bacteria? Current Biology: CB 16: R705–R710.
- 29. Croucher NJ, Harris SR, Barquist L, Parkhill J, Bentley SD (2012) A high-resolution view of genome-wide pneumococcal transformation. PLoS Pathog 8: e1002745 doi:10.1371/journal.ppat.1002745.
- 30. Golubchik T, Brueggemann AB, Street T, Gertz RE, Spencer CCA, et al. (2012) Pneumococcal genome sequencing tracks a vaccine escape variant formed through a multi-fragment recombination event. Nature Genetics 44: 352–355.
- 31. Kulick S, Moccia C, Didelot X, Falush D, Kraft C, et al. (2008) Mosaic DNA imports with interspersions of recipient sequence after natural transformation of Helicobacter pylori. PLoS One 3: e3797 doi:10.1371/journal.pone.0003797.
- 32. Mell JC, Shumilina S, Hall IM, Redfield RJ (2011) Transformation of natural genetic variation into Haemophilus influenzae genomes. PLoS Pathog 7: e1002151 doi:10.1371/journal.ppat.1002151.
- 33. Sreevatsan S, Pan X, Stockbauer KE, Connell ND, Kreiswirth BN, et al. (1997) Restricted structural gene polymorphism in the Mycobacterium tuberculosis complex indicates evolutionarily recent global dissemination. Proceedings of the National Academy of Sciences of the United States of America 94: 9869–9874.
- 34. Gutierrez MC, Brisse S, Brosch R, Fabre M, Omais B, et al. (2005) Ancient origin and gene mosaicism of the progenitor of Mycobacterium tuberculosis. PLoS Pathog 1: e5 doi:10.1371/journal.ppat.0010005.
- 35. Supply P, Marceau M, Mangenot S, Roche D, Rouanet C, et al. (2013) Genomic analysis of smooth tubercle bacilli provides insights into ancestry and pathoadaptation of Mycobacterium tuberculosis. Nat Genet 45: 172–179.
- 36. Gordon SV, Bottai D, Simeone R, Stinear TP, Brosch R (2009) Pathogenicity in the tubercle bacillus: molecular and evolutionary determinants. BioEssays: News and Reviews in Molecular, Cellular and Developmental Biology 31: 378–388.
- 37. Smith NH, Hewinson RG, Kremer K, Brosch R, Gordon SV (2009) Myths and misconceptions: the origin and evolution of Mycobacterium tuberculosis. Nature Reviews Microbiology 7: 537–544.
- 38. Sassetti CM, Boyd DH, Rubin EJ (2001) Comprehensive identification of conditionally essential genes in mycobacteria. Proceedings of the National Academy of Sciences of the United States of America 98: 12712–12717.
- 39. Zhang YJ, Ioerger TR, Huttenhower C, Long JE, Sassetti CM, et al. (2012) Global assessment of genomic regions required for growth in Mycobacterium tuberculosis. PLoS Pathog 8: e1002946 doi:10.1371/journal.ppat.1002946.
- 40. Nguyen KT, Piastro K, Gray TA, Derbyshire KM (2010) Mycobacterial biofilms facilitate horizontal DNA transfer between strains of Mycobacterium smegmatis. J Bacteriol 192: 5134–5142.
- 41. Snapper SB, Melton RE, Mustafa S, Kieser T, Jacobs WR Jr (1990) Isolation and characterization of efficient plasmid transformation mutants of Mycobacterium smegmatis. Molecular Microbiology 4: 1911–1919.
- 42. Pavelka MS Jr, Jacobs WR Jr (1996) Biosynthesis of diaminopimelate, the precursor of lysine and a component of peptidoglycan, is an essential function of Mycobacterium smegmatis. Journal of Bacteriology 178: 6496–6507.
- 43. Robinson JT, Thorvaldsdottir H, Winckler W, Guttman M, Lander ES, et al. (2011) Integrative genomics viewer. Nature Biotechnology 29: 24–26.
- 44. Krzywinski M, Schein J, Birol I, Connors J, Gascoyne R, et al. (2009) Circos: an information aesthetic for comparative genomics. Genome Research 19: 1639–1645.
- 45. Darling AE, Mau B, Blattner FR, Perna NT (2004) GRIL: genome rearrangement and inversion locator. Bioinformatics 20: 122–124.
- 46. Darling AE, Mau B, Perna NT (2010) progressiveMauve: multiple genome alignment with gene gain, loss and rearrangement. PLoS One 5: e11147 doi:10.1371/journal.pone.0011147.
- 47. Wirth SE, Krywy JA, Aldridge BB, Fortune SM, Fernandez-Suarez M, et al. (2012) Polar assembly and scaffolding proteins of the virulence-associated ESX-1 secretory apparatus in mycobacteria. Mol Microbiol 83: 654–664.