Proceedings of the National Academy of Sciences of the United States of America. 2021 Mar 1;118(10):e2007873118. doi: 10.1073/pnas.2007873118

Adaptive evolution of hybrid bacteria by horizontal gene transfer

Jeffrey J Power a,1, Fernanda Pinheiro a,1, Simone Pompei a,1, Viera Kovacova a, Melih Yüksel a, Isabel Rathmann a, Mona Förster a, Michael Lässig a,2, Berenike Maier a,b,2
PMCID: PMC7958396  PMID: 33649202

Significance

In a parallel evolution experiment, we probe lateral gene transfer between two Bacillus subtilis lineages close to the species boundary. We show that laboratory evolution by horizontal gene transfer can rapidly generate hybrid organisms with broad genomic and functional alterations. By combining genomics, transcriptomics, fitness assays, and statistical modeling, we map the selective effects underlying gene transfer. We show that transfer takes place under genome-wide positive and negative selection, generating a net fitness increase in hybrids. The evolutionary dynamics efficiently navigates this fitness landscape, finding viable paths with increasing fraction of transferred genes.

Keywords: horizontal gene transfer, experimental evolution, fitness landscape

Abstract

Horizontal gene transfer (HGT) is an important factor in bacterial evolution that can act across species boundaries. Yet, we know little about rate and genomic targets of cross-lineage gene transfer and about its effects on the recipient organism's physiology and fitness. Here, we address these questions in a parallel evolution experiment with two Bacillus subtilis lineages of 7% sequence divergence. We observe rapid evolution of hybrid organisms: gene transfer swaps ∼12% of the core genome in just 200 generations, and 60% of core genes are replaced in at least one population. By genomics, transcriptomics, fitness assays, and statistical modeling, we show that transfer generates adaptive evolution and functional alterations in hybrids. Specifically, our experiments reveal a strong, repeatable fitness increase of evolved populations in the stationary growth phase. By genomic analysis of the transfer statistics across replicate populations, we infer that selection on HGT has a broad genetic basis: 40% of the observed transfers are adaptive. At the level of functional gene networks, we find signatures of negative, positive, and epistatic selection, consistent with hybrid incompatibilities and adaptive evolution of network functions. Our results suggest that gene transfer navigates a complex cross-lineage fitness landscape, bridging epistatic barriers along multiple high-fitness paths.


Horizontal gene transfer (HGT) plays an important role in bacterial evolution (1, 2), which includes speeding up the adaptation to new ecological niches (3, 4) and mitigating the genetic load of clonal reproduction (5, 6). On macroevolutionary time scales, HGT occurs between bacteria of different species and also between bacteria and eukaryotes (1, 2, 7); these dynamics have very heterogeneous rates (8), and the permanent integration of transferred genes into regulatory networks is slow (9). An important mechanism of HGT is transformation, the active import and inheritable integration of DNA from the environment (10, 11). In this process, extracellular DNA binds to the surface of a recipient cell, is transported into the cytoplasm by the recipient’s uptake machinery, and then is integrated into its genome by homologous recombination.

The rate of transformation depends on multiple physiological and selective factors (12, 13). The efficiency of the DNA uptake machinery is a major determinant of the probability of transformation (11). At the level of recombination, this probability decreases exponentially as a function of the local sequence divergence, likely because nucleotide mismatches suppress sequence pairing at the initiation of the recombination step (14–16). As shown in a recent study, laboratory experiments can induce genome-wide transformation between close lineages (17). This study also describes the inhibition of gene uptake by restriction-modification systems, but fitness effects of gene transfer are not addressed. Other evolution experiments give evidence of complex selective effects of HGT (18–20). On the one hand, HGT can impose fitness costs that increase with genetic distance between donor and recipient (21). Observed costs include codon usage mismatch, reduction in RNA and protein stability, or mismatch of regulation or enzymatic activity in the recipient organism (22, 23) and can be mitigated by subsequent compensatory mutations (22, 24) or gene duplications (25). On the other hand, transferred genes can confer adaptive value, for example, resistance against antibiotics or shift of carbon sources (26, 27). These experiments elicit HGT of a few genes with a specific function, which are a confined genomic target of selection. However, we know little about positive and negative fitness effects of HGT on a genome-wide scale. In particular, it remains to be shown to what extent selection can foster genome-wide HGT across lineages and where selective barriers significantly constrain recombination.

These questions are the subject of the present paper. We map genome-wide horizontal transfer between two model lineages (28) of Bacillus subtilis, a species with high genetic and phenotypic diversity (29). The two lineages are subspecies with an average sequence divergence of 6.8% in their core genomes, which is close to the species boundary: it is larger than typical diversities within populations and smaller than cross-species distances. We find pervasive HGT on a genome-wide scale, which covers hundreds of genes within about 200 generations and confers a repeatable net fitness increase to evolved hybrid organisms. Using Bayesian analysis, we jointly infer the effects of sequence diversity and selection on the observed transfer pattern. We find evidence for genome-wide positive and negative selection with significant hot and cold spots in functional gene networks. We discuss the emerging picture of HGT navigating a complex cross-lineage fitness landscape.

Results

Parallel Evolution Experiments.

Our experiments record HGT from the donor species Bacillus subtilis subsp. spizizenii W23 to the recipient Bacillus subtilis subsp. subtilis 168. The two lineages (subspecies) share a 3.6 Mbp core genome with 3,746 genes. The average divergence of 6.8% in the core genome is small enough so that high-rate transfer is physiologically possible (30) and large enough so that transfer segments can reliably be detected in sequence alignments (SI Appendix, Fig. S1). In addition, there are unique accessory genomes of 0.4 Mbp in the donor and 0.6 Mbp in the recipient, which allow for nonorthologous sequence changes. Our evolution experiment consists of 2-d cycles with a six-step protocol shown in Fig. 1A. On day 1, recipient cells are diluted and cultured in liquid for 4.5 h, subject to ultraviolet (UV) radiation, and again diluted and inoculated onto an agar plate. On day 2, a single (clonal) colony is grown for 2.5 h in liquid culture, competence is induced and transformation in the presence of donor DNA takes place for 2 h, and then the culture is washed and grown overnight. The last step involves population dynamics with an exponential growth phase of about 4 h and a stationary growth phase of about 14 h. In this step, selection acts on HGT. In each run of the experiment, a recipient population evolves over 21 consecutive cycles, corresponding to about 200 generations. The full experiment consists of seven replicate runs. We also perform three control runs without donor DNA under the same population dynamic conditions as the primary runs. Our laboratory protocol is designed to study HGT at controlled levels of donor DNA and of induced competence. Unlike in previous assays (31), HGT cannot be elicited as a mechanism for DNA repair after UV radiation because the cells are not competent at this stage. 
Our protocol also differs from natural environments, where both DNA supply and competence are subject to environmental fluctuations, regulatory tuning, and ecological interactions between lineages (11, 32).

Fig. 1.

Evolution of B. subtilis hybrids. (A) Experimental design. A single 2-d (10 generation) cycle contains six steps, including two growth phases with irradiation and HGT, respectively. (B) Transferred segments (green) are broadly distributed over the recipient core genome (light shading) with two hot spots (red) and two cold spots (blue). De novo mutations occur throughout the recipient genome, and there are a few deletions from the recipient genome (orange). Data are shown for run four; see SI Appendix, Fig. S3 and Datasets S1 and S2 for data from all runs and from intermediate time points. (C) Time dependence of HGT. Fraction of core genes (light green) and of core genome (dark green) affected by HGT after 9, 15, and 21 cycles (mean and SD across the seven parallel runs). (D) Length histogram of transferred genome segments (bars: count histogram, black line: exponential fit for segments > 1,000 bp). Inset: number of segments versus time (mean and SD across runs). (E) Inferred distribution of donor–recipient sequence divergence, d, in 100 bp windows around the recombination start site of transferred segments (green) and in scrambled 100 bp windows (gray). (F) Distribution of the transfer frequency per gene, Q̂(θ), in the seven parallel runs, counted across the recipient core genome (green); corresponding distribution P0(θ) from simulations of the null model (gray).

Gene Transfer Is the Dominant Mode of Evolution.

The experiments reveal fast and repeatable genome evolution that is driven predominantly by HGT. To infer these dynamics, we perform whole-genome sequencing after cycles 9, 15, and 21 and align the evolved genomes to the donor and the ancestral recipient genome (Materials and Methods and SI Appendix, Fig. S1). The genome-wide pattern of sequence evolution in hybrids is displayed in Fig. 1B and SI Appendix, Fig. S2. Transferred genome segments (green bars) are seen to be distributed over the entire core genome of the recipient. We find HGT by orthologous recombination of genome segments at an approximately constant rate of about 11 genes/hr (counted over the transfer period of 2 h/cycle) in all replicate runs. These changes accumulate to an average of about 100 transferred segments per replicate, which cover 12% (±3%) of the core genome and affect 500 (±133) core genes (Fig. 1C). Thus, HGT by orthologous recombination causes sequence evolution at an average rate of 100 bp/generation (counted over all cycles). This drastically exceeds the mutation rate of 2 × 10−3 bp/generation determined by a fluctuation assay in the absence of radiation (SI Appendix) as well as the corresponding rate recorded in the primary experiments, 3 × 10−1 bp/generation (corresponding to an average of 55 de novo mutations per replicate, Dataset S1). Our rate of HGT strongly exceeds the rate measured in a recent evolution experiment of B. subtilis in the presence of DNA from other Bacillus lineages and species (33), which is likely caused by differences in the level of competence induced in the experiments.
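The rates quoted in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check, not part of the original analysis: it uses the rounded numbers from the text (21 cycles, 2 h of transfer per cycle, about 200 generations, about 500 transferred genes, 55 de novo mutations per replicate), and the variable names are illustrative.

```python
# Back-of-the-envelope check of the rates quoted in the text.
# All inputs are the rounded values given above; names are illustrative.
CYCLES = 21
TRANSFER_HOURS_PER_CYCLE = 2.0
GENERATIONS = 200
TRANSFERRED_GENES = 500            # mean number of core genes affected per replicate
DE_NOVO_MUTATIONS = 55             # mean number of de novo mutations per replicate
HGT_BP_PER_GENERATION = 100.0      # quoted rate of sequence change by HGT
FLUCTUATION_BP_PER_GENERATION = 2e-3   # quoted mutation rate without radiation

# Gene transfer rate, counted over the transfer windows only.
genes_per_hour = TRANSFERRED_GENES / (CYCLES * TRANSFER_HOURS_PER_CYCLE)

# Point mutation rate in the primary experiments, counted over all generations.
mutations_per_generation = DE_NOVO_MUTATIONS / GENERATIONS

# How strongly HGT outpaces point mutation under the two mutation-rate estimates.
excess_over_primary = HGT_BP_PER_GENERATION / mutations_per_generation
excess_over_fluctuation = HGT_BP_PER_GENERATION / FLUCTUATION_BP_PER_GENERATION

print(f"transfer rate: {genes_per_hour:.1f} genes/h")
print(f"mutation rate: {mutations_per_generation:.2f} bp/generation")
print(f"HGT excess: {excess_over_primary:.0f}x (primary runs), "
      f"{excess_over_fluctuation:.0f}x (fluctuation assay)")
```

With these inputs, the gene transfer rate and the mutation rate of the primary runs come out consistent with the rounded values of about 11 genes/h and 3 × 10−1 bp/generation given above.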

Besides orthologous recombination, we find insertions from the donor accessory genome (average five genes per replicate); deletions (average 31 genes per replicate), including the deletion of the mobile element ICEBs1 (in all replicates but not in the control runs, indicating clearance by HGT); and multigene duplications (192 genes in one replicate). These data are reported in Dataset S2. Thus, the by far dominant mode of genome evolution is HGT by orthologous recombination (simply called HGT in the following).

Transfer Depends on Segment Length and Local Sequence Similarity.

Transferred segments have an average length of 4,200 bp, contain an average of 5.1 genes, and have an approximately exponential length distribution (Fig. 1D), in agreement with previous results in Streptococcus pneumoniae (34) and Haemophilus influenzae (35). These statistics show ubiquitous multigene transfers with no sharp cutoff on segment length. The number of transferred segments increases linearly in time (Inset of Fig. 1D and SI Appendix, Fig. S2), consistent with the observation that most transferred segments are planted into recipient flanking sequence (Dataset S2). At higher transfer coverage, segment extensions are expected to become more prevalent.

Fig. 1E displays the distribution of donor–recipient sequence divergence, d, in transferred segments (recorded in 100 bp windows around the inferred recombination start site). By comparing this distribution to the divergence distribution recorded in randomly positioned 100 bp windows, we infer a transfer rate, u(d), that decreases exponentially with increasing d; see SI Appendix and SI Appendix, Fig. S3 for details. This result is in agreement with previous work (14, 15). The function u(d) will serve as a neutral rate model for HGT. These are the rates of recombination in individual cells, independently of selection on specific genes. In contrast, the HGT events reported in Fig. 1 are substitutions, which occur in all cells of a given replicate population. The likelihood of these events is modulated by selection on gene function that is to be inferred below. In our system, the dependence on sequence similarity already generates a significant variation of neutral transfer rates. About 16% of potential recombination start sites in the core genome have a local divergence <5%, leading to a transfer rate higher than threefold above the genome average; another 3% have a divergence >15% and a transfer rate lower than ninefold below the average (SI Appendix, Fig. S3). We conclude that the transfer rate is significantly modulated by the local efficiency of the recombination machinery, but transfer is physiologically possible at the vast majority of genomic sites. At higher transfer coverage, beyond what is observed in the present experiments, our model predicts a significant enhancement of genome-wide neutral HGT rate because existing transfer segments provide preferential landing strips for additional segments.
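The comparison of the two divergence histograms can be illustrated with a small sketch. It assumes the simplest reading of the procedure, namely that the ratio of counts in transferred versus randomly positioned windows is proportional to u(d) = u0 exp(−λd), so that λ follows from a straight-line fit to the log ratio. The function name, the binning, and the toy counts (generated with an arbitrary λ = 30) are assumptions for illustration; the actual inference is described in the SI Appendix.

```python
import math

def fit_exponential_rate(centers, transferred, background):
    """Least-squares fit of log(transferred / background) = log(u0) - lam * d.

    centers     : divergence d at the center of each histogram bin
    transferred : counts of windows at recombination start sites
    background  : counts of randomly positioned windows
    Returns (lam, u0).
    """
    xs, ys = [], []
    for d, t, b in zip(centers, transferred, background):
        if t > 0 and b > 0:            # the log ratio is undefined for empty bins
            xs.append(d)
            ys.append(math.log(t / b))
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return -slope, math.exp(mean_y - slope * mean_x)

# Toy input (not data from the experiment): flat background, exponential decay.
TRUE_LAM, TRUE_U0 = 30.0, 0.5
centers = [0.01 * k + 0.005 for k in range(20)]      # 1% wide bins, d = 0 to 20%
background = [1000.0] * len(centers)
transferred = [b * TRUE_U0 * math.exp(-TRUE_LAM * d)
               for d, b in zip(centers, background)]

lam, u0 = fit_exponential_rate(centers, transferred, background)
print(f"fitted decay constant lam = {lam:.2f}, prefactor u0 = {u0:.3f}")
```

On real counts, the low-count bins at high divergence are noisy, so a weighted fit or a likelihood-based estimate would be preferable to this unweighted regression.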

Gene Transfer Is Broadly Distributed.

Next, we ask how HGT is distributed over replicate runs and genome loci. For a given gene, the transfer frequency, θ, is defined as the fraction of runs in which that gene is hit by HGT. Fig. 1F shows the histogram of transfer frequencies evaluated over the entire core genome. We find that 60% of core genes are transferred in at least one of seven runs, and 1.7% of core genes are replaced in four or more runs. This is consistent with the broad genomic distribution of HGT shown in Fig. 1B. We find only a few exceptions to this pattern. First, two hot spots of about 8 kbp and 16 kbp show strong enhancement of HGT, with more than 50% of their genes repeatably swapped in all replicates (Fig. 1F and SI Appendix). Both hot spots encode functional differences between donor and recipient, which are discussed below. To assess their statistical significance, we compare the observed multiple-hit statistics with a null model that is obtained by simulations of a positionally scrambled HGT dynamics with local rates u(d) (SI Appendix, Fig. S3). Both hot spots turn out to be highly significant (P<2×10−3); that is, their multiple-hit statistics cannot be explained by local sequence divergence alone. Second, there are two genomic cold spots, which are extended segments where HGT is suppressed in all replicates. Both cold spots are about 50 kbp long and are significant deviations from the null model; that is, the absence of transfer is unlikely to be caused by local sequence divergence alone (P<0.04, SI Appendix, Fig. S4 and SI Appendix).

Analyzing HGT by gene ontology (GO) leads to a similarly broad distribution (Dataset S3). In most GO categories, the observed average transfer frequencies are consistent with the null model; only essential genes show significantly enhanced transfer (>30%, P<0.03). Taken together, most genomic loci and most functional classes are accessible to evolution by cross-lineage HGT, allowing the evolution of complex hybridization patterns.

Gene Transfer Generates a Net Fitness Gain.

We measure the selection coefficients of evolved hybrids and of control populations in competition with ancestral recipient cells. Selection is evaluated separately for the exponential and stationary growth phases, which are two consecutive selection windows in our population dynamics following HGT (Fig. 2A). The stationary phase covers most of the population dynamics interval (14/18 h). We find a fitness increase of about 5% in this phase, which is statistically significant (P < 0.03, Mann–Whitney U-test) and repeatable across replicate populations. The control populations without transfer show no comparable fitness increase (Fig. 2A). A part of the fitness effects of HGT may be attributable to compensation of UV radiation damage; however, the net fitness increase signals that uptake of donor genes does more than just repair deleterious mutations: there is adaptive evolution by HGT.
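For samples as small as seven evolved and three control populations, a Mann–Whitney U-test can be evaluated exactly by enumerating all relabelings of the pooled sample. The sketch below shows this computation under assumptions: the text does not specify which groups were compared or how ties were handled, and the selection coefficients used here are invented placeholder values, not measurements from the experiment.

```python
from itertools import combinations

def u_statistic(x, y):
    """Number of pairs (x_i, y_j) with x_i > y_j; ties count as 1/2."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)

def exact_mann_whitney(x, y):
    """Two-sided exact P value from all relabelings of the pooled sample."""
    pooled = list(x) + list(y)
    n, m = len(x), len(y)
    mean_u = n * m / 2.0
    observed = abs(u_statistic(x, y) - mean_u)
    hits = total = 0
    for idx in combinations(range(n + m), n):
        chosen = set(idx)
        xs = [pooled[i] for i in idx]
        ys = [pooled[i] for i in range(n + m) if i not in chosen]
        total += 1
        # Count relabelings at least as extreme as the observed split.
        if abs(u_statistic(xs, ys) - mean_u) >= observed - 1e-12:
            hits += 1
    return hits / total

# Placeholder values only (hypothetical, NOT measured selection coefficients).
hybrids = [0.031, 0.042, 0.048, 0.050, 0.055, 0.061, 0.070]
controls = [-0.020, -0.005, 0.010]

p_value = exact_mann_whitney(hybrids, controls)
print(f"exact two-sided P = {p_value:.4f}")
```

With 7 versus 3 values there are only 120 relabelings, so the smallest attainable two-sided P value is 2/120; a complete separation of the two groups, as in the placeholder input, reaches this limit.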

Fig. 2.

Selective effects of HGT. (A) Transfer affects hybrid fitness. Selection coefficient of hybrids compared to the ancestral recipient strain in the exponential phase and in the stationary phase (colored dots: data from individual runs, bars and boxes: mean and SD over all seven runs); control data of runs without donor DNA (gray). (B) Genome-wide selection on transfer. The relative likelihood of transfer, Q̂(θ|p0)/P0(θ|p0), is shown for genes binned by transfer frequency (eight bins, θ=0/7, 1/7, …, 7/7) and by local sequence similarity, p0 (four bins, p0 values are averages in each bin). The underlying count histograms are shown in SI Appendix, Fig. S5. (C) The relative likelihood of transfer aggregated over p0 bins, Q̂(θ)/P0(θ), is shown together with the corresponding likelihood ratio Q(θ)/P0(θ) of the maximum likelihood selection model (green circles); see SI Appendix. Relative to the neutral null model (gray baseline), the selection model has an enhanced transfer probability (p+/p0=1.9) in a fraction c=0.2 of the recipient core genes and a reduced probability (p−/p0=0.75) in the remainder of core genes.

In the exponential growth phase, the replicate-average fitness remains constant so that the total balance in average fitness is dominated by the gain in the stationary phase. The control populations decline in average fitness in both phases, as expected under UV treatment. Evolved hybrids acquire a large exponential-phase fitness variation across replicate runs (Fig. 2A). This pattern indicates that complex fitness landscapes with positive and negative components govern the evolution of hybrid organisms; the shape of such landscapes will be further explored below.

Gene Transfer Has Genome-Wide Selective Effects.

How are the strong fitness effects of HGT compatible with the ubiquity of transfer across the core genome? In particular, is adaptive evolution by HGT limited to few genomic loci with large effects, or can we identify a genetic basis of multiple genes with potentially smaller individual effects? To establish a link between genotype and fitness, we first analyze the multiple-hit statistics of transfers in comparison to the null model of scrambled transfers with rates u(d) (SI Appendix, Fig. S3). By simulations of the null model, we obtain a transfer probability p0 for each gene that depends only on the donor–recipient mutation pattern in its vicinity. This probability accounts for the fact that local sequence divergence affects homologous recombination. Because the null model is neutral with respect to gene function, all genes in loci with the same p0 have the same multiple-hit probability distribution, which is a binomial distribution with expectation value p0 (Materials and Methods). To quantify target and strength of selection, we use a minimal model with selectively enhanced (i.e., adaptive) transfer substitution probabilities p+=φ+p0 in a fraction c of the core genes and reduced probabilities p−=φ−p0 in the remainder of the core genome. This mixture model generates a multiple-hit distribution Q(θ|p0) that is a superposition of two binomial distributions (Materials and Methods). Importantly, the model jointly captures neutral and selective variation of HGT substitution rates, properly discounting physiological effects from the inference of selection. Comparing the observed distribution of multiple hits, Q̂(θ|p0), and the null distribution shows an excess count of no-hit (θ=0) genes and of multiple hits θ≥3/7 in the data. Importantly, this pattern is observed in all p0 classes; that is, independently of the local sequence similarity. The deviations from the null model are strongly significant (P<10−20, SI Appendix), and we ascribe them to selection by HGT. The selection mixture model, with parameters c=0.2[0.1,0.4], φ+=1.9[1.6,2.4], and φ−=0.75[0.6,0.84] (maximum likelihood value, CI in brackets) estimated from a Bayesian posterior distribution, explains the observed multiple hits (Fig. 2C) and infers 40[20,60]% of the observed transfers to be adaptive (SI Appendix, Fig. S6). These results suggest that HGT is a broad target of positive and negative selection in hybrid organisms, in tune with the experimental results (Fig. 2A).
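The structure of the mixture model can be made concrete with a short sketch. It evaluates the neutral binomial and the two-component mixture for seven runs, using the maximum likelihood parameters quoted above. Two simplifications are assumptions of this sketch and not part of the original inference: a single, genome-wide neutral probability p0 (set to the average fraction of transferred genes, about 500 of 3,746) replaces the gene-specific p0 values, and the adaptive share is read as the fraction of expected transfers that fall into the enhanced class.

```python
from math import comb

RUNS = 7
C, PHI_PLUS, PHI_MINUS = 0.2, 1.9, 0.75   # maximum likelihood values from the text
P0 = 500 / 3746                           # assumption: one uniform neutral probability

def binom_pmf(k, n, p):
    """Probability of k hits in n independent runs with hit probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def null_pmf(k, p0=P0):
    """Neutral multiple-hit distribution P0(theta = k/7 | p0)."""
    return binom_pmf(k, RUNS, p0)

def mixture_pmf(k, p0=P0):
    """Selection model Q(theta = k/7 | p0): superposition of two binomials."""
    return (C * binom_pmf(k, RUNS, PHI_PLUS * p0)
            + (1 - C) * binom_pmf(k, RUNS, PHI_MINUS * p0))

def adaptive_share():
    """Fraction of expected transfers in the enhanced class (independent of p0)."""
    plus = C * PHI_PLUS
    return plus / (plus + (1 - C) * PHI_MINUS)

for k in range(RUNS + 1):
    ratio = mixture_pmf(k) / null_pmf(k)
    print(f"theta = {k}/7: null {null_pmf(k):.4f}, mixture {mixture_pmf(k):.4f}, "
          f"ratio {ratio:.2f}")
print(f"share of transfers in the enhanced class: {adaptive_share():.2f}")
```

Even in this simplified form, the mixture produces more no-hit genes and more genes hit in many runs than the neutral binomial, which is the qualitative signature described above. The share of transfers in the enhanced class comes out near the 40% quoted in the text, although the published estimate is derived from the full posterior distribution.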

De novo point substitutions occur preferentially in transferred sequence segments. The strongest bias is observed in intergenic sequence: across seven replicate runs, we find a total of 91 intergenic substitutions that introduce a derived allele different from both the donor and recipient allele (SI Appendix). This amounts to a 30× enhanced rate compared to the background of ancestral sequence; moreover, these mutations cluster in the upstream regions of 50 genes. We conclude that HGT opens windows of positive selection for the subsequent evolution by point mutations; this effect is most pronounced in intergenic regions. A similar interplay of recombination and point mutations has been observed in Escherichia coli populations in the mouse gut (36).

Frequently Transferred Genes Tend to Be Up-regulated and Positively Selected.

Next, we characterize the effect of HGT on gene expression. Using whole-genome transcriptomics data, we compare gene expression in evolved strains and their ancestor (Fig. 3 and SI Appendix, Fig. S4). The overall distribution of log2 fold changes of RNA levels ΔR in the entire genome is balanced and similar to that observed in the control runs, as expected for viable cells (raw data are reported in Dataset S4). To map correlations between HGT and expression, we partition genes into a class with low transfer frequency, 0≤θ≤3/7, and a class with high transfer frequency, 4/7≤θ≤1. Within each class, we further partition genes by their ancestral lineage (recipient: R and donor: D). We observe that genes with low transfer frequency are not substantially affected in their average expression level. In contrast, genes with high transfer frequency show up-regulation in hybrids, which is strongest for genes hit by HGT. A priori, up-regulation can compensate for reduced translational or functional efficiency, or it can signal an enhanced functional role of the affected genes. In the first case, we would expect a signal for transferred (D) genes independently of their θ class. The fact that we find up-regulation specifically for genes with high transfer frequency but not for others may point to an enhanced functional role. This is consistent with our inference of selection: in the high transfer class, 88% of the genes are inferred to be under positive selection (Materials and Methods).
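The link between transfer frequency and inferred positive selection follows from Bayes' rule applied to the two-component model with enhanced probability φ+p0 in a fraction c of genes and reduced probability φ−p0 in the rest. The sketch below computes the posterior probability that a gene belongs to the enhanced class, given that it was hit in at least four of seven runs. It uses the maximum likelihood parameters from the text (c=0.2, φ+=1.9, φ−=0.75) and, as a simplifying assumption of this sketch, one uniform neutral probability p0 (about 500 of 3,746 genes) instead of gene-specific values.

```python
from math import comb

RUNS = 7
C, PHI_PLUS, PHI_MINUS = 0.2, 1.9, 0.75   # maximum likelihood values from the text
P0 = 500 / 3746                           # assumption: one uniform neutral probability

def tail_probability(k_min, n, p):
    """Probability of at least k_min hits in n runs with hit probability p."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))

def posterior_enhanced(k_min, p0=P0):
    """P(gene is in the enhanced class | hit in at least k_min of RUNS runs)."""
    enhanced = C * tail_probability(k_min, RUNS, PHI_PLUS * p0)
    reduced = (1 - C) * tail_probability(k_min, RUNS, PHI_MINUS * p0)
    return enhanced / (enhanced + reduced)

high_class = posterior_enhanced(4)        # high transfer class: theta >= 4/7
print(f"posterior for theta >= 4/7: {high_class:.2f}")
print(f"posterior for theta >= 1/7: {posterior_enhanced(1):.2f}")
```

Under these simplifications, the posterior for the high transfer class lands close to the 88% stated in the text, whereas a single hit carries little information about the selection class.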

Fig. 3.

Gene expression in hybrids. Log2 RNA fold changes, ΔR, with respect to the ancestral recipient (whisker plots, blue line: mean, box: first and third quartiles, bars: 99% percentiles, dots: outliers) are shown for different gene classes: all genome, genes with low transfer frequency (0≤θ≤3/7), and genes with high transfer frequency (4/7≤θ≤1); cf. Fig. 1F. In each gene class, we show ΔR whisker plots separately for nontransferred recipient genes (R; orange) and for transferred orthologous donor genes (D; green), together with the corresponding changes in the control experiments without donor DNA (0; gray). For example, a gene subject to HGT in replicates one, two, and five is in the low transfer class; its ΔR values from replicates 1, 2, 5 (3, 4, 6, 7) contribute to the low/D (low/R) statistics. Asterisks mark highly significant changes of the average ΔR for up-regulation of high/D and high/R genes (P<10−3, t test) compared to the ancestor.

Selection on Gene Transfer in Functional Networks.

To link selection to cellular functions, we map the transfer patterns onto gene networks of the recipient organism. Here, we use complex, multigene operons as units of genomic and functional organization above the level of individual genes. We model intranetwork HGT by a minimal cross-lineage fitness landscape of the form

F(q) = aq + bq(1 − q), [1]

which quantifies the selective effect of an operon as a function of the transfer fraction q, relative to the ancestral q=0 state (Fig. 4A). The linear term describes directional selection on HGT, the quadratic term intranetwork epistasis. Specifically, for b<0, this term captures hybrid incompatibilities within operons, leading to a fitness trough at intermediate q and a rebound at larger values of q.
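The shape of the landscape in Eq. 1 can be explored with a few lines of code. The sketch below evaluates F(q) and locates the interior stationary point, which follows from dF/dq = a + b − 2bq = 0. The function names are illustrative, and the parameter pairs are arbitrary examples rather than inferred values.

```python
def fitness(q, a, b):
    """Cross-lineage fitness F(q) = a*q + b*q*(1 - q), relative to q = 0."""
    return a * q + b * q * (1.0 - q)

def interior_stationary_point(a, b):
    """Transfer fraction in (0, 1) where dF/dq = 0, or None if F is monotonic.

    For b < 0 the stationary point is a fitness trough (hybrid incompatibility);
    for b > 0 it is a peak at an intermediate transfer fraction.
    """
    if b == 0:
        return None
    q_star = (a + b) / (2.0 * b)
    return q_star if 0.0 < q_star < 1.0 else None

# Arbitrary example parameters (a, b), chosen only to show the qualitative shapes.
examples = {
    "negative directional": (-4.0, 0.0),
    "positive directional": (4.0, 0.0),
    "disruptive epistasis": (0.0, -4.0),
    "stabilizing epistasis": (0.0, 4.0),
}
for label, (a, b) in examples.items():
    q_star = interior_stationary_point(a, b)
    values = ", ".join(f"{fitness(q / 4, a, b):+.2f}" for q in range(5))
    print(f"{label:22s} F(0, 1/4, 1/2, 3/4, 1) = {values}; stationary q = {q_star}")
```

In this parametrization, F(1) = a is the fitness effect of complete transfer, and an interior stationary point exists only when |a| < |b|, that is, when epistasis outweighs directional selection.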

Fig. 4.

Cross-lineage selection in gene networks. (A) Cross-lineage fitness landscapes of the form of Eq. 1 with different parameters (a,b) contain predominantly negative directional selection (a<0, orange), predominantly positive directional selection (a>0, magenta), disruptive epistasis due to hybrid incompatibilities (b<0, blue), stabilizing epistasis due to advantageous cross-lineage combinations (b>0, dark blue), and approximately neutral evolution (gray). (B) Posterior fitness parameters (a,b) inferred for complex operons, colored by predominant selection component. (C–G) Examples of operons with different types of maximum likelihood cross-lineage selection. (Left) Protein–protein interaction link diagram, darker lines indicating links with stronger support (35). (Center) HGT trajectories recorded at cycles 9, 15, and 21 for seven experimental runs (solid lines) and 40 simulation runs (dashed). Dots and error bars mark the transfer frequency after cycle 21, θ, and the SD of experimental and simulation runs, respectively. (Right) Measured values (θ,h) (green dots) and the corresponding distribution from simulations of the selection model, Q(θ,h|a,b). (C) Negative directional selection (iol). (D and E) Positive directional selection (leu, eps). (F and G) Disruptive epistasis, hybrid incompatibilities (rps, ylo). See also Dataset S5.

To infer fitness parameters for individual operons, we use two summary statistics of the HGT dynamics. The network transfer frequency, θ, is defined, in generalization of the definition for individual genes, as the replicate average of the transfer fraction q at the end of the experiment. The heterogeneity, h, is the cross-replicate variance of transfer increments computed over three partial periods (cycles 1 to 9, 10 to 15, and 16 to 21) (Materials and Methods). We evaluate these statistics for 58 operons that contain at least seven genes; we refer to this set as complex operons. For each operon, we compare the observed values (θ,h) with simulations of the HGT dynamics in a fitness landscape of the form of Eq. 1 and obtain operon-specific posterior fitness parameters (a,b) (Fig. 4B and SI Appendix). In the set of complex operons, we infer 20 operons most likely under negative selection (a<−1, b≈0), nine operons most likely under positive selection (a>1, b≈0), and seven operons with disruptive epistasis (b<−1) signaling hybrid incompatibilities. Gene content and functional annotations of the networks in all three selection classes are listed in Dataset S5.
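The two summary statistics can be sketched in code. The exact definition of h is given in Materials and Methods and is not reproduced here; the version below is one plausible reading, in which the variance of the transfer increments is taken across replicates within each of the three periods and then averaged over periods. The trajectories are invented toy inputs (transfer fraction q after cycles 9, 15, and 21, starting from q=0), not data from the experiment.

```python
from statistics import mean, pvariance

def transfer_frequency(trajectories):
    """theta: replicate average of the final transfer fraction q."""
    return mean(traj[-1] for traj in trajectories)

def heterogeneity(trajectories):
    """h under one plausible reading (an assumption of this sketch):
    cross-replicate variance of the increments of q in each period,
    averaged over the periods. Trajectories start from q = 0."""
    n_periods = len(trajectories[0])
    variances = []
    for i in range(n_periods):
        increments = [traj[i] - (traj[i - 1] if i > 0 else 0.0)
                      for traj in trajectories]
        variances.append(pvariance(increments))
    return mean(variances)

# Toy trajectories for seven replicates (hypothetical values).
# "smooth": every replicate gains the same fraction in each period.
smooth = [[0.1, 0.2, 0.3] for _ in range(7)]
# "jumpy": five replicates stay at q = 0, two switch completely in one step.
jumpy = [[0.0, 0.0, 0.0]] * 5 + [[0.0, 1.0, 1.0]] * 2

for label, trajectories in (("smooth", smooth), ("jumpy", jumpy)):
    print(f"{label}: theta = {transfer_frequency(trajectories):.2f}, "
          f"h = {heterogeneity(trajectories):.3f}")
```

The two toy sets have similar θ but very different h, which illustrates why the heterogeneity carries information beyond the transfer frequency: few, large transfers in some replicates are the pattern expected when intermediate transfer fractions are disfavored.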

To test the statistical significance of the fitness inference, we compare the (θ,h) data with appropriate null models. First, we rank the observed θ values against simulations of HGT under neutral evolution (a=b=0); here, we omit the two hotspots of HGT identified above, which contain the operons discussed below. In the remaining set of complex operons, low values of θ signal that HGT takes place under selective constraint, consistent with fitness parameters a<0 or b<0. The deviation from neutrality is statistically significant (P<0.04, SI Appendix and SI Appendix, Fig. S7A). We note that the neutral null model includes all HGT correlations between genes because of their spatial clustering in a common operon, which are to be discounted from the inference of selection. To disentangle directional and epistatic selection, we compare the observed h values with simulations under the optimal directional selection model with operon-specific parameters (a_lin, b=0) (SI Appendix, Fig. S7B). This reveals an excess heterogeneity of HGT in the data that cannot be explained by directional selection (P<10−5, SI Appendix), establishing statistically significant evidence for disruptive epistasis in complex operons.

Specific operons inferred to be under different types of selection are shown in Fig. 4 C–G. In each case, we display the protein–protein interaction links within the operon, which may be indicative of functional links between genes (37), evolutionary trajectories from the experimental replicate runs and from simulations of the selection model, and the observed (θ,h) data together with the corresponding distributions from simulations. First, the iol operon, which is involved in antibiotic resistance (37, 38), is an example of suppressed HGT (θ=0); the ancestral genes are retained in all replicates (Fig. 4C). This pattern is consistent with simulations under substantial negative directional selection (a=−4.25) but deviates from neutral evolution.

A second group of networks shows the opposite pattern of strongly enhanced HGT. The leu operon is the most prominent genomic hotspot of HGT (θ=0.84); all of its genes are coherently transferred in three replicate runs, and high transfer fractions are reached already at cycle 9 (Fig. 4D). The leu operon confers the ability to grow without external leucine supply. Evolved hybrids gain this function, which is present in the donor but absent in the ancestral recipient (SI Appendix). Uptake of leu has been observed previously in transformation assays under leucine starvation (39). To quantify the contribution of leu to adaptive evolution, we used a similar protocol to transform the recipient with donor DNA containing either the full leu operon or one specific gene, leuC. These leu-transformed strains show an increased fitness compared to the ancestral strain; however, the mean stationary state selection coefficient is significantly lower than in the evolved strains (SI Appendix, Fig. S8). This is consistent with the inferred polygenic basis of adaptation in the evolution experiments. The second HGT hotspot is the eps operon, which contains 13 genes, has θ=0.64 and up to 100% transferred genes (Fig. 4E). This operon is important for biofilm formation, a function that is strongly impaired in the recipient (40) but potent in the donor. Both leu and eps show an enhancement of θ that is consistent with positive directional selection (a>4) but deviates significantly from neutral evolution (P<2×10−3, SI Appendix). This signals adaptive evolution by HGT at the level of gene networks, generating coherent transfer of network genes.

A third pattern of HGT is observed in the networks with b<−1. Here, we show the rps operon, which encodes ribosomal proteins and affects translation (41), and the ylo operon as examples (Fig. 4 F and G). These operons reach moderate transfer frequencies (θ≈0.1–0.3), but the evolutionary trajectories are highly heterogeneous: many replicates retain q=0 and other replicates reach high q by few, large transfers. This pattern signals disruptive epistasis at the network level, consistent with hybrid incompatibilities suppressing trajectories through intermediate transfer fractions.

Discussion

In this work, we integrate experimental evolution and evolutionary modeling to map effects of HGT on genome dynamics, gene expression, and fitness. We show that genome-wide cross-lineage HGT is a fast evolutionary mode generating hybrid bacterial organisms. This mode is repeatably observed in all replicate runs of our experiment. After about 200 generations, Bacillus hybrids have acquired about 12% donor genes across the entire core genome, but coherently transferred functional gene networks reach up to 100% transferred genes in individual runs.

Despite its broad genomic pattern, the HGT dynamics is far from evolutionary neutrality. Evolved hybrids show a substantial fitness increase compared to the ancestral strain in stationary growth, which occurs repeatably across all replicates. Hence, HGT does more than just repair deleterious mutations, which are caused by UV radiation in our protocol: it carries a net adaptive benefit. The adaptive dynamics has a broad genomic target; we infer some 40% of the observed transfers to be under positive selection. From the point of view of methods, it is noteworthy that broad selection on genomic targets can be inferred from the multiple-hit statistics in parallel-evolving lines even without a neutral gauge comparable to synonymous point mutations. In addition, we find positive, negative, and epistatic selection on HGT in a range of functional gene networks. We conclude that in our system, unlike for uptake of resistance genes under antibiotic stress (26, 27), HGT does not have just a single dominant target of selection. Rather, evolution by orthologous recombination appears to tinker with multiple new combinations of donor and recipient genes, using the combinatorial complexity of hundreds of transferred genes in each replicate run. This picture is consistent with the substantial fitness variation across replicates, which is most pronounced in the exponential growth phase.

To capture HGT under genome-wide selection, we introduce the concept of a cross-lineage fitness landscape. Such landscapes describe the fitness effects of orthologous recombination between two (sub)species, starting from the unmixed genomes as focal points of a priori equal rank. This feature distinguishes cross-lineage landscapes from empirically known landscapes for mutation accumulation within lineages, for example, for antibiotic resistance evolution, most of which have a single global fitness peak. Here, we begin to map building blocks of cross-lineage landscapes (Fig. 4). These include selection against partial transfer of operons, which suggests that hybrid incompatibilities between donor and recipient, causing disruptive epistasis within gene networks, play a role in the observed HGT dynamics. Conversely, stabilizing epistasis reflecting new favorable hybrid networks generates fitness peaks at intermediate transfer fractions. This type of cross-species landscape, which has not yet been mapped in our experiments, may become observable at higher (cross-operon) levels of gene networks. Future massively parallel experiments will allow a more complete mapping of cross-species fitness landscapes, which combines these building blocks into a systems picture. Such experiments will also show how different compositions of donor DNA affect the HGT dynamics on cross-lineage landscapes. Moreover, modulating selection pressures by variation of bottlenecks in the evolution protocol can display epistatic effects on the speed of evolution by HGT (42).

Our observation of fast HGT under broad selection suggests that evolution navigates the cross-lineage landscape in an efficient way. That is, deep cross-lineage fitness valleys must be sparse enough for evolution by HGT to find viable paths of hybrid evolution with increasing transfer fraction. Stronger and more ubiquitous fitness barriers at larger donor–recipient distances or larger transfer fractions may eventually halt gene uptake, but we have yet to see where the limits of genome-wide HGT are.

Two key features of our experiments enable these dynamics. First, choosing a donor–recipient pair with a sequence divergence close to the species boundary generates more, often subtle functional differences between orthologous genes—that is, more potential targets of selection on HGT—than closer pairs or single populations. Second, permissive population dynamics with recurrent bottlenecks allow the (transient) fixation of deleterious recombinant genomes, which can bridge fitness valleys of hybrid incompatibilities and act as stepping stones for subsequent adaptation. Together, our results suggest that laboratory evolution by HGT can become a factory for evolutionary innovation. An exciting perspective is to use cross-lineage HGT together with artificial evolution in order to engineer functional novelties.

Materials and Methods

Experimental Procedures.

The evolution experiment consisted of rounds of a 2-d cycle, including dilution, radiation, plating, colony selection and regrowth, competence induction and addition of extracellular donor DNA, washing, and overnight growth (Fig. 1A). A derivative of B. subtilis subsp. subtilis 168 served as recipient strain and B. subtilis subsp. spizizenii W23 served as donor strain. We collected genomic and transcriptomic data and performed fitness assays between evolved and ancestral strains. The experimental procedures are described in detail in SI Appendix, Supplementary Materials and Methods.

Sequence Analysis.

DNA sequencing and RNA sequencing reads from each library were paired, trimmed, filtered, and aligned against the reference genomes of B. subtilis 168 and B. subtilis W23. RNA data counts were normalized, and batch effects were removed. Genomic sequences were used to identify HGT, de novo mutations, insertions, and deletions in the genome of evolved sequences. Specifically, we infer transfer segments from the alleles of an evolved sequence at all sites with donor–recipient (D–R) divergence (SI Appendix, Fig. S1), using the following algorithm: The start of each transfer segment is marked by a sequence of two consecutive D alleles (5′ marker). The starting coordinate of the segment is then assigned to the midpoint between the 5′ marker and its left flanking site, which is either the previous R site or the start of the core genome segment. The end of each segment is marked by a sequence of k consecutive R alleles (3′ marker; we choose an optimized value k=5). The end coordinate of the segment is then assigned to the midpoint between the 3′ marker and its left flanking site, which is the previous D site. Optimization and tests of the algorithm are described in SI Appendix.
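The segment-calling rule described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `infer_transfer_segments`, the input layout (a sorted list of `(position, allele)` pairs at D–R divergent sites), and the handling of a segment that runs to the end of the data without a 3′ marker are assumptions made for illustration only.

```python
def infer_transfer_segments(sites, core_start, k=5):
    """Call transfer segments from alleles at donor-recipient divergent sites.

    sites: list of (position, allele) sorted by position, allele is "D" or "R".
    core_start: coordinate where the core genome segment begins.
    k: number of consecutive R alleles that mark the end of a segment.
    Returns a list of (start, end) coordinates.
    """
    segments = []
    n = len(sites)
    i = 0
    while i < n - 1:
        # 5' marker: two consecutive D alleles
        if sites[i][1] == "D" and sites[i + 1][1] == "D":
            # left flank is the previous R site or the core segment start
            left_flank = sites[i - 1][0] if i > 0 else core_start
            start = (left_flank + sites[i][0]) / 2
            last_d = i + 1  # index of the most recent D site
            r_run = 0       # length of the current run of R alleles
            end = None
            j = i + 2
            while j < n:
                if sites[j][1] == "R":
                    r_run += 1
                    if r_run == k:
                        # 3' marker found: it starts at the first R of the run
                        first_r = j - k + 1
                        end = (sites[last_d][0] + sites[first_r][0]) / 2
                        break
                else:
                    r_run = 0
                    last_d = j
                j += 1
            if end is None:
                # Assumption (not specified in the text): with no 3' marker,
                # close the segment at the last observed D site.
                end = float(sites[last_d][0])
            segments.append((start, end))
            i = j + 1
        else:
            i += 1
    return segments


# Hypothetical example: divergent sites every 100 bp
alleles = "RDDRDRRRRRD"
example_sites = [(100 * (idx + 1), a) for idx, a in enumerate(alleles)]
print(infer_transfer_segments(example_sites, core_start=0))
```

In this toy input, the single R allele at position 400 is shorter than the 3′ marker and therefore stays inside the segment, while the isolated D allele at position 1100 does not open a new segment because it is not followed by a second D.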

Evolutionary Analysis of HGT.

Here, we establish a population genetics of HGT by separately inferring the underlying neutral and selective forces from genomic data. From the dependence of the genome-wide frequency of observed transfers on the local sequence divergence at the inferred recombination start site, d, we estimate the neutral transfer probability per segment,

\[ \pi_0(d) = \tau_0\, u(d), \tag{2} \]

with a normalization factor τ0. Simulations of these dynamics then determine the neutral transfer probability of a given gene, p0(g). For the global inference of selection at the gene level, we use a two-component model with fixation probabilities p±=φ±p0 under selection. With a mixture parameter c, this model generates a multiple-hit distribution

\[ Q(\theta \mid p_0) = c\, B(7, 7\theta, \varphi_+ p_0) + (1 - c)\, B(7, 7\theta, \varphi_- p_0), \tag{3} \]

which differs from the corresponding neutral distribution P0(θ|p0)=B(7,7θ,p0). For the local inference of selection on specific operons, we use simulations of the transfer dynamics with fixation rates
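As a rough illustration of the two-component model, the sketch below evaluates the mixture for a single gene. It assumes that B(n, k, p) denotes the binomial probability of k transfers in n=7 replicates, each with probability p; this reading, the function names, and the parameter values in the example are assumptions, not taken from the text.

```python
from math import comb


def binom_pmf(n, k, p):
    """Binomial probability of k successes in n trials with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def multiple_hit_prob(hits, p0, c, phi_plus, phi_minus, n=7):
    """Mixture probability of observing `hits` transfers of a gene in n replicates.

    hits corresponds to n * theta; p0 is the neutral transfer probability.
    phi_plus and phi_minus rescale p0 in the two selection classes.
    """
    # Clip to 1 so that the rescaled values remain valid probabilities
    p_plus = min(1.0, phi_plus * p0)
    p_minus = min(1.0, phi_minus * p0)
    return c * binom_pmf(n, hits, p_plus) + (1 - c) * binom_pmf(n, hits, p_minus)


# Hypothetical parameter values, chosen only to show the call signature
for hits in range(8):
    q = multiple_hit_prob(hits, p0=0.1, c=0.4, phi_plus=2.0, phi_minus=0.5)
    neutral = binom_pmf(7, hits, 0.1)
    print(hits, round(q, 4), round(neutral, 4))
```

Setting `phi_plus = phi_minus = 1` recovers the neutral distribution, which is a convenient sanity check on any fitting code built around this form.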

\[ \pi_{dr}(s, q, q' \mid a, b) = \pi_0(d)\, \varphi\big(s(q, q' \mid a, b)\big), \tag{4} \]

where q and q′ are the network transfer fractions before and after the transfer event. The fixation probability depends on the selection coefficient s(q,q′|a,b)=F(q′|a,b)−F(q|a,b), which follows from the fitness landscape of Eq. 1 (here, landscape and selection coefficients are scaled by the effective population size Ne). We infer Bayesian posterior fitness parameters (a,b) by comparison of data and model, using two summary statistics of HGT trajectories: 1) the network transfer frequency

\[ \theta = \frac{1}{7} \sum_{\alpha=1}^{7} q_\alpha, \tag{5} \]

with qα denoting the fraction of genes affected by HGT in replicate α at the end of the experiment (cycle 21) and 2) the transfer heterogeneity, h, defined as the cross-replicate variance of transfer increments,

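\[ h = \frac{1}{7} \sum_{\alpha=1}^{7} \left( (\Delta q_{\alpha,9} - \overline{\Delta q}_{9})^2 + (\Delta q_{\alpha,15} - \overline{\Delta q}_{15})^2 + (\Delta q_{\alpha,21} - \overline{\Delta q}_{21})^2 \right), \tag{6} \]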

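where $\Delta q_{\alpha,9} = q_{\alpha,9} - q_0$, $\Delta q_{\alpha,15} = q_{\alpha,15} - q_{\alpha,9}$, and $\Delta q_{\alpha,21} = q_{\alpha,21} - q_{\alpha,15}$ (for replicates $\alpha = 1, \dots, 7$ and cycles 9, 15, and 21), and overbars denote replicate averages. Details are given in SI Appendix.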
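The two summary statistics can be computed directly from per-replicate transfer fractions. The following Python sketch follows Eqs. 5 and 6 as written; the function names, the list-based input layout, and the default q0=0 (no transferred genes at the start) are assumptions for illustration.

```python
def transfer_frequency(q_final):
    """Eq. 5: mean transfer fraction across replicates at the final cycle."""
    return sum(q_final) / len(q_final)


def transfer_heterogeneity(q9, q15, q21, q0=0.0):
    """Eq. 6: cross-replicate variance of transfer increments, summed over
    the three intervals ending at cycles 9, 15, and 21.

    q9, q15, q21: lists of transfer fractions, one entry per replicate.
    q0: initial transfer fraction (assumed to be 0 here).
    """
    n = len(q9)
    increments = [
        [a - q0 for a in q9],               # cycles 0 to 9
        [b - a for a, b in zip(q9, q15)],   # cycles 9 to 15
        [c - b for b, c in zip(q15, q21)],  # cycles 15 to 21
    ]
    h = 0.0
    for inc in increments:
        mean_inc = sum(inc) / n
        h += sum((x - mean_inc) ** 2 for x in inc) / n
    return h


# Hypothetical trajectories for 7 replicates: one replicate jumps to q=1
# in the last interval, all others stay at q=0.
q9 = [0.0] * 7
q15 = [0.0] * 7
q21 = [0.0] * 6 + [1.0]
print(transfer_frequency(q21), transfer_heterogeneity(q9, q15, q21))
```

A single large jump in one replicate gives a large h at a modest θ, which is the signature described for the rps and ylo operons; identical trajectories in all replicates give h=0.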

Supplementary Material

Supplementary File
Supplementary File: pnas.2007873118.sd01.xlsx (63.8KB, xlsx)
Supplementary File: pnas.2007873118.sd02.xlsx (270.2KB, xlsx)
Supplementary File: pnas.2007873118.sd03.xlsx (12.7KB, xlsx)
Supplementary File
Supplementary File: pnas.2007873118.sd05.xlsx (68.3KB, xlsx)

Acknowledgments

We acknowledge discussions with M. Cosentino Lagomarsino and A. de Visser. This work has been partially funded by Deutsche Forschungsgemeinschaft Grant CRC 1310. Sequence analysis and simulations were performed at the Regional Computing Center, University of Cologne.

Footnotes

The authors declare no competing interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2007873118/-/DCSupplemental.

Data Availability

All study data are included in the article and/or supporting information.

References

1. Gogarten J. P., Townsend J. P., Horizontal gene transfer, genome innovation and evolution. Nat. Rev. Microbiol. 3, 679–687 (2005).
2. Schönknecht G., et al., Gene transfer from bacteria and archaea facilitated evolution of an extremophilic eukaryote. Science 339, 1207–1210 (2013).
3. Niehus R., Mitri S., Fletcher A. G., Foster K. R., Migration and horizontal gene transfer divide microbial genomes into multiple niches. Nat. Commun. 6, 8924 (2015).
4. Soucy S. M., Huang J., Gogarten J. P., Horizontal gene transfer: Building the web of life. Nat. Rev. Genet. 16, 472–482 (2015).
5. Kondrashov A. S., Classification of hypotheses on the advantage of amphimixis. J. Hered. 84, 372–387 (1993).
6. Held T., Klemmer D., Lässig M., Survival of the simplest in microbial evolution. Nat. Commun. 10, 2472 (2019).
7. Lin M., Kussell E., Inferring bacterial recombination rates from large-scale sequencing datasets. Nat. Methods 16, 199–204 (2019).
8. Sakoparnig T., Field C., van Nimwegen E., Whole genome phylogenies reflect the distributions of recombination rates for many bacterial species. eLife 10, 10.7554/eLife.65366 (2021).
9. Lercher M. J., Pál C., Integration of horizontally transferred genes into regulatory interaction networks takes many million years. Mol. Biol. Evol. 25, 559–567 (2008).
10. Chen I., Dubnau D., DNA uptake during bacterial transformation. Nat. Rev. Microbiol. 2, 241–249 (2004).
11. Maier B., “Competence and transformation” in Bacillus: Molecular and Cellular Biology, Graumann P., Ed. (Caister Academic Press, 2017), pp. 395–414.
12. Popa O., Dagan T., Trends and barriers to lateral gene transfer in prokaryotes. Curr. Opin. Microbiol. 14, 615–623 (2011).
13. Thomas C. M., Nielsen K. M., Mechanisms of, and barriers to, horizontal gene transfer between bacteria. Nat. Rev. Microbiol. 3, 711–721 (2005).
14. Carrasco B., Serrano E., Sánchez H., Wyman C., Alonso J. C., Chromosomal transformation in Bacillus subtilis is a non-polar recombination reaction. Nucleic Acids Res. 44, 2754–2768 (2016).
15. Vulić M., Dionisio F., Taddei F., Radman M., Molecular keys to speciation: DNA polymorphism and the control of genetic exchange in enterobacteria. Proc. Natl. Acad. Sci. U.S.A. 94, 9763–9767 (1997).
16. Kidane D., Ayora S., Sweasy J. B., Graumann P. L., Alonso J. C., The cell pole: The site of cross talk between the DNA uptake and genetic recombination machinery. Crit. Rev. Biochem. Mol. Biol. 47, 531–555 (2012).
17. Bubendorfer S., et al., Genome-wide analysis of chromosomal import patterns after natural transformation of Helicobacter pylori. Nat. Commun. 7, 11995 (2016).
18. Baltrus D. A., Guillemin K., Phillips P. C., Natural transformation increases the rate of adaptation in the human pathogen Helicobacter pylori. Evolution 62, 39–49 (2008).
19. Utnes A. L., et al., Growth phase-specific evolutionary benefits of natural transformation in Acinetobacter baylyi. ISME J. 9, 2221–2231 (2015).
20. Engelmoer D. J. P., Donaldson I., Rozen D. E., Conservative sex and the benefits of transformation in Streptococcus pneumoniae. PLoS Pathog. 9, e1003758 (2013).
21. Ambur O. H., Engelstadter J., Johnsen P. J., Miller E. L., Rozen D. E., Steady at the wheel: Conservative sex and the benefits of bacterial transformation. Philos. Trans. R Soc. Lond. B Biol. Sci. 371, 20150528 (2016).
22. Bershtein S., et al., Protein homeostasis imposes a barrier on functional integration of horizontally transferred genes in bacteria. PLoS Genet. 11, e1005612 (2015).
23. Bloom J. D., Labthavikul S. T., Otey C. R., Arnold F. H., Protein stability promotes evolvability. Proc. Natl. Acad. Sci. U.S.A. 103, 5869–5874 (2006).
24. Bedhomme S., et al., Evolutionary changes after translational challenges imposed by horizontal gene transfer. Genome Biol. Evol. 11, 814–831 (2019).
25. Lind P. A., Tobin C., Berg O. G., Kurland C. G., Andersson D. I., Compensatory gene amplification restores fitness after inter-species gene replacements. Mol. Microbiol. 75, 1078–1089 (2010).
26. Wadsworth C. B., Arnold B. J., Sater M. R. A., Grad Y. H., Azithromycin resistance through interspecific acquisition of an epistasis-dependent efflux pump component and transcriptional regulator in Neisseria gonorrhoeae. MBio 9, e01419-18 (2018).
27. Chu H. Y., Sprouffske K., Wagner A., Assessing the benefits of horizontal gene transfer by laboratory evolution and genome sequencing. BMC Evol. Biol. 18, 54 (2018).
28. Zeigler D. R., The genome sequence of Bacillus subtilis subsp. spizizenii W23: Insights into speciation within the B. subtilis complex and into the history of B. subtilis genetics. Microbiology (Reading) 157, 2033–2041 (2011).
29. Brito P. H., et al., Genetic competence drives genome diversity in Bacillus subtilis. Genome Biol. Evol. 10, 108–124 (2018).
30. Carrasco B., Serrano E., Martín-González A., Moreno-Herrero F., Alonso J. C., Bacillus subtilis MutS modulates RecA-mediated DNA strand exchange between divergent DNA sequences. Front. Microbiol. 10, 237 (2019).
31. Michod R. E., Wojciechowski M. F., Hoelzer M. A., DNA repair and the evolution of transformation in the bacterium Bacillus subtilis. Genetics 118, 31–39 (1988).
32. Stefanic P., Kraigher B., Lyons N. A., Kolter R., Mandic-Mulec I., Kin discrimination between sympatric Bacillus subtilis isolates. Proc. Natl. Acad. Sci. U.S.A. 112, 14042–14047 (2015).
33. Slomka S., et al., Experimental evolution of Bacillus subtilis reveals the evolutionary dynamics of horizontal gene transfer and suggests adaptive and neutral effects. Genetics 216, 543–558 (2020).
34. Croucher N. J., Harris S. R., Barquist L., Parkhill J., Bentley S. D., A high-resolution view of genome-wide pneumococcal transformation. PLoS Pathog. 8, e1002745 (2012).
35. Mell J. C., Lee J. Y., Firme M., Sinha S., Redfield R. J., Extensive cotransformation of natural variation into chromosomes of naturally competent Haemophilus influenzae. G3 (Bethesda) 4, 717–731 (2014).
36. Frazão N., Sousa A., Lässig M., Gordo I., Horizontal gene transfer overrides mutation in Escherichia coli colonizing the mammalian gut. Proc. Natl. Acad. Sci. U.S.A. 116, 17906–17915 (2019).
37. Szklarczyk D., et al., STRING v11: Protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 47, D607–D613 (2019).
38. Miller W. R., et al., LiaR-independent pathways to daptomycin resistance in Enterococcus faecalis reveal a multilayer defense against cell envelope antibiotics. Mol. Microbiol. 111, 811–824 (2019).
39. Albano M., Hahn J., Dubnau D., Expression of competence genes in Bacillus subtilis. J. Bacteriol. 169, 3110–3117 (1987).
40. Gallegos-Monterrosa R., Mhatre E., Kovács A. T., Specific Bacillus subtilis 168 variants form biofilms on nutrient-rich medium. Microbiology (Reading) 162, 1922–1932 (2016).
41. Byrgazov K., Vesper O., Moll I., Ribosome heterogeneity: Another level of complexity in bacterial translation regulation. Curr. Opin. Microbiol. 16, 133–139 (2013).
42. Weissman D. B., Feldman M. W., Fisher D. S., The rate of fitness-valley crossing in sexual populations. Genetics 186, 1389–1410 (2010).


Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
