Skip to main content
eLife logoLink to eLife
. 2022 Jul 6;11:e76162. doi: 10.7554/eLife.76162

Host-parasite coevolution promotes innovation through deformations in fitness landscapes

Animesh Gupta 1, Luis Zaman 2, Hannah M Strobel 3, Jenna Gallie 4, Alita R Burmeister 5, Benjamin Kerr 6, Einat S Tamar 7, Roy Kishony 7,, Justin R Meyer 3,
Editors: C Brandon Ogbunugafor8, Aleksandra M Walczak9
PMCID: PMC9259030  PMID: 35793223

Abstract

During the struggle for survival, populations occasionally evolve new functions that give them access to untapped ecological opportunities. Theory suggests that coevolution between species can promote the evolution of such innovations by deforming fitness landscapes in ways that open new adaptive pathways. We directly tested this idea by using high-throughput gene editing-phenotyping technology (MAGE-Seq) to measure the fitness landscape of a virus, bacteriophage λ, as it coevolved with its host, the bacterium Escherichia coli. An analysis of the empirical fitness landscape revealed mutation-by-mutation-by-host-genotype interactions that demonstrate coevolution modified the contours of λ’s landscape. Computer simulations of λ’s evolution on a static versus shifting fitness landscape showed that the changes in contours increased λ’s chances of evolving the ability to use a new host receptor. By coupling sequencing and pairwise competition experiments, we demonstrated that the first mutation λ evolved en route to the innovation would only evolve in the presence of the ancestral host, whereas later steps in λ’s evolution required the shift to a resistant host. When time-shift replays of the coevolution experiment were run where host evolution was artificially accelerated, λ did not innovate to use the new receptor. This study provides direct evidence for the role of coevolution in driving evolutionary novelty and provides a quantitative framework for predicting evolution in coevolving ecological communities.

Research organism: E. coli, Viruses

Introduction

A starting point for understanding how populations evolve is to assume that they exist in an unchanging world where they can adapt toward optimality (Orr, 2005; Pigliucci and Müller, 2010). However, even in static environments, populations never reach optimality because their circumstances continuously change as neighboring species coevolve with them (Valen, 1973). This more dynamic view of the evolutionary process opens the potential for unbounded evolution and creates new opportunities for evolutionary innovation (Doebeli, 2011; Thompson, 2005; Nahum et al., 2017; Thompson and Cunningham, 2002; Zaman et al., 2014). Darwin recognized this potential in the final pages of On the Origin of Species, where he wrote that, ‘It is interesting to contemplate an entangled bank’ of organisms evolving with one another to produce such a variety of forms and functions (Darwin, 1859). But he also realized the empirical challenges created by the richness of species interactions within ecological communities in his further description of ‘these elaborately constructed forms, … dependent on each other in so complex a manner…’ (Darwin, 1859). The complexity arises because an organism’s fitness is a function of its interactions with other species, and the strength and form of these interactions can continuously change as they coevolve. Furthermore, the coevolving traits of organisms are encoded within genomes by mutations that might interact with one another, a pervasive phenomenon called epistasis (Weinreich et al., 2006). This means that interactions at all levels must be considered; from mutation-by-mutation within a species (classical epistasis), to mutation-by-mutation between species (interspecific epistasis), and higher order phenomena such as the combination of classic and interspecific epistasis, where the within genome mutation-by-mutation interactions depend on the genotype of an interacting species.

Many advances have been made over recent decades that enable us to tackle this combinatorial problem. Efficient genetic engineering methods permit the construction of genetic libraries with combinatorial sets of mutations that can be used to measure epistasis (Kosuri and Church, 2014; Fowler and Fields, 2014). Also available are convenient approaches to measure Darwinian fitness of the mutant libraries (Weinreich et al., 2006; Palmer et al., 2015; Khan et al., 2011; Chou et al., 2011). Coupling these two technologies allows the creation of extensive genotype-to-fitness maps, or fitness landscapes (Wright, 1932), that provide information important for predicting adaption (de Visser and Krug, 2014; de Visser et al., 2018; Lee et al., 2018). However, these maps alone are often not sufficient to predict evolution because their topographies can depend on abiotic environmental conditions (Ogbunugafor et al., 2016; Lindsey et al., 2013; Steinberg and Ostermeier, 2016; Flynn et al., 2013) and biotic interactions (Cervera et al., 2016; Fragata et al., 2019). Here, we take two significant steps forward in fitness landscape research. First, we build on the observation that landscape structures depend on species interactions by studying the interdependence of two species’ landscapes and how they shift during coevolution. Second, we test whether these shifts facilitate the evolution of a key innovation, where the species evolves a new function that unlocks new ecological opportunities.

As a model system of coevolution, we studied the host-parasite interaction between bacteriophage λ and its host, Escherichia coli, because of the extensive background research completed on their coevolution and the availability of well-developed molecular tools (Meyer et al., 2012; Maddamsetti et al., 2018). When λ and E. coli are cocultured in the laboratory, one quarter of the λ populations evolve to use a new receptor (Meyer et al., 2012). λ’s native receptor is E. coli’s outer-membrane protein LamB, but through mutations in its host-recognition gene J, λ evolves to use a second receptor protein, OmpF. While only four mutations are necessary for OmpF+ function (Maddamsetti et al., 2018), more J mutations typically evolve along the way (Meyer et al., 2012). λ gains this new function after E. coli evolves resistance through malT mutations (Meyer et al., 2012) that cause reduced LamB expression (Boos and Böhm, 2000). Thus, it was hypothesized that the evolution of resistance in E. coli deformed λ’s fitness landscape in ways that promoted λ’s innovation (Thompson, 2012). In line with this, it was previously shown that four out of six λ genotypes randomly chosen on the path to evolve OmpF+ had higher relative fitness when cultured with resistant malT cells rather than ancestral cells (Burmeister et al., 2016), suggesting that the host’s coevolution would promote key steps in λ’s evolution. However, one out of six genotypes had higher fitness in the presence of the ancestral host, and the last λ’s fitness was neutral to the host’s genotype. Given the conflicting pattern and small sample size, it could not be concluded whether coevolution was responsible for the innovation. Here, we build on this study with high-throughput technologies capable of measuring the fitness of hundreds of λ genotypes. The technology also produces combinatorial-mutation libraries of λ genotypes that can be used to quantify epistasis. This allows us to establish the contours of λ’s adaptive landscape rather than simply studying isolated genotypes within the space. Given the efficiency of this method, we can now measure λ’s landscape repeatedly in different host contexts in order to test whether host-induced deformations that naturally arise during coevolution promote OmpF+ evolution. Although it has been shown that antagonistic coevolution can hasten molecular evolution of phages (Paterson et al., 2010) and lead them to broader host ranges (Hall et al., 2011), coevolution’s role in unlocking unexplored regions of the fitness landscape has not been directly tested.

Results

λ’s fitness landscape at different stages of coevolution

To construct λ’s fitness landscape, we focused on 10 J mutations that were a subset of mutations λ repeatedly evolved on its path to use OmpF (Meyer et al., 2012; Supplementary file 1a). Together, they form a 10-dimensional genotype space with a total of 1024 (210) unique variants of different combinations of the mutations including, the wild type (WT) allele configuration. Using Multiplexed Automated Genome Engineering (MAGE) (Wang et al., 2009), a technique that uses repeated cycles of homologous recombination in the λ-red system to produce combinatorial genomic diversity, we were successful at engineering a library of 671 genotypes out of the possible 1024 (see Materials and methods). To measure the fitness of each genotype in this library, we competed the full library en masse and monitored their frequency changes using next-generation sequencing (Figure 1—figure supplement 1; Kelsic et al., 2016; Russ et al., 2020). The fitness of each genotype was then calculated by comparing its change in frequency relative to the non-engineered ancestor. Fitness was measured in four replicate competitions for both the ancestral host and malT host (see Materials and methods). To reduce the effect of sequencing errors and to overcome other methodological pitfalls, we modified the MAGE protocol by introducing neutral watermark mutations in the library construction and developed a high-throughput competition assay that yielded reproducible results (see Materials and methods, Figure 1—figure supplement 2, Appendix 1). Overall, we were able to measure the fitness of 580 λ genotypes cocultured with ancestral E. coli and 131 genotypes with malT (Figure 1—figure supplement 3). The reduced number of genotypes compared to the initial library was due to a combination of factors. The randomness of the MAGE editing process caused some genotypes to be rarely constructed and to have low frequencies in the initial library. If these genotypes did not possess disproportionately high fitness, then their frequency would fall below the limit of detection during the competition, removing them from the analysis. This effect was more pronounced in the malT landscape, where fitness differences were even more extreme.

Visual inspection of the two fitness landscapes reveals host-dependent structures; the landscape with the ancestral host has a standard diminishing-returns pattern (Khan et al., 2011; Chou et al., 2011; Kryazhimskiy et al., 2014; MacLean et al., 2010; Guerrero et al., 2019), while the landscape with the malT host has an atypical sigmoidal shape that plateaus at a higher fitness than the first (Figure 1a and b). The non-linear relationship between mutation number and fitness suggests the presence of epistasis (mutation-by-mutation interactions), the differences in the magnitude of fitness effects between landscapes suggests mutation-by-host interactions, and different shapes suggest host-dependent epistasis (mutation-by-mutation-by-host interactions). To determine how much variation in fitness is explained by these interactions, we performed multiple linear regression analyses (see Materials and methods). We found pervasive epistasis in both landscapes (Figure 1c). For the ancestral landscape, 58.66% of the variation was explained by the direct effects of mutations and 24.69% by pairwise interactions (Radj2=0.8172, F55,439 = 39.97, p<0.0001). Similarly, 48.35% of the variance in the malT landscape was explained by the direct mutation effects and 27.61% by the interaction terms (Radj2=0.7072, F55,252 = 14.48, p<0.0001). To test for mutation-by-mutation-by-host interactions, we regressed another linear model that includes host as a predictor variable. In this model, we found significant mutation-by-host interactions (Figure 1d), and sizeable effects of the host-dependent epistasis (21 mutation-by-mutation-by-host interaction terms were significant out of 45 and 12.62% of the total variance in the data were attributable to these terms, Figure 1d). This three-way interaction term measures the extent to which the landscape structure is transformed by host evolution and suggests that λ’s evolutionary trajectory could depend on its host’s genotype.

Figure 1. Empirical fitness landscapes of λ when infecting the (a) ancestral host and (b) malT host, and their statistical analyses in (c) and (d).

Each node in (a) and (b) represents a unique genotype and two nodes are connected by edges if the corresponding genotypes are separated by one mutation. The node at zero mutations is ancestral λ. Selection rate (per 4 hr competition experiment) is the difference of Malthusian growth rates of a given genotype i to ancestral λ over 4 hr, calculated as ln(λi,4λi,0)-ln(λanc,4λanc,0), where λi,t denotes the density of the given genotype at time t. (c) Statistical analysis of direct and interactive effects of mutations in both the landscapes. Colored cells represent statistically significant terms determined by multiple regression analysis after correction for multiple hypothesis testing (see Materials and methods). The diagonal elements of the matrix represent single mutation effects and all the off-diagonal terms represent pairwise epistatic interactions. See Supplementary file 1i for identity of mutations corresponding to different Gi . (d) Statistical test of whether the two landscapes varied in topology. The additional variable, E, represents environment (host) to indicate mutation-by-host effects in the lower-left matrix and mutation-by-mutation-by-host (G×G×E) in the upper-right matrix. Light colored cells indicate terms present in the final AIC-optimized model out of the full-factorial model (F76,726 = 37.45, p<0.0001), and dark colored cells indicate statistically significant terms after controlling for rate of false positives (see Materials and methods).

Figure 1.

Figure 1—figure supplement 1. Schematic illustration of MAGE-Seq used to construct λ’s fitness landscapes.

Figure 1—figure supplement 1.

(a) 24 independently evolved J alleles reported in Meyer et al., 2012. The white blocks indicate a location of a mutation not studied here, whereas colored blocks indicate mutations used to construct the genetic library. (b) Oligos were synthesized with focal mutations positioned in the center of the sequence. (c) MAGE was used to construct a combinatorial library of lambda lysogens integrated into E. coli’s genome. (d) Lysogens were induced in order to generate a library of λ particles with different combinations of mutations. (e) The λ library was first mixed with the ancestor in 1:9 ratio, incubated with different types of hosts for 4 hr at 37°C, and then sequenced at both initial and final timepoints to calculate the selection rates of individual genotypes.
Figure 1—figure supplement 2. Test to determine whether amplification distorts measurements of fitness.

Figure 1—figure supplement 2.

For every replicate population in the competition (total eight, four for each host) we performed two separate amplifications and used a unique barcode for each. This allowed us to compare fitness calculations based on independent amplifications and to determine if the amplification step is repeatable and unbiased. Each point represents a unique λ genotype presented in the fitness landscapes reported in Figure 1. The red line is a linear least-square fit. The regression fitting parameter and coefficient of determination are provided in the inset. The black line represents the vector, where fitness estimates from each barcode are equal.
Figure 1—figure supplement 3. Distribution of λ genotypes present with respect to the number of mutations they possess.

Figure 1—figure supplement 3.

(a) An ideal fitness landscape where all combinations of 10 mutations are present. (b and c) Empirical fitness landscapes with ancestral host and malT host, respectively.
Figure 1—figure supplement 4. Test of whether the error in our estimates of selection rates (per 4 hr) is influenced by the magnitude of the estimate.

Figure 1—figure supplement 4.

SD does not correlate with mean fitness of a genotypes for both (a) ancestor host, and (b) malT host; p-values in the insets indicate that the slope is not significantly different than zero. Uniform variance in fitness values permit the performance of analysis of variance to detect genotype by genotype by host interactions. Points with zero SD indicate presence of only one replicate population with a fitness value for that genotype.

The role of shifting landscapes in promoting λ’s innovation

To test whether changes in the structure of λ’s landscape opened trajectories to OmpF exploitation, we simulated λ’s evolution on the landscapes using a modified Wright-Fisher model (see Materials and methods, Figure 2—figure supplement 1). Before running the simulations, we imputed the missing λ genotypes’ fitness values to complete the landscapes. We did this by successively choosing missing genotypes at random and assigning them the average fitness of their nearest neighbors. The simulations were run based on conservative estimates of the number of generations and population sizes from previously published (Meyer et al., 2012) and this study’s coevolution experiments (960 generations; ~6.3×109 λ particles, Figure 4—figure supplement 1; Meyer et al., 2012) and λ’s intrinsic mutation rate (7.7×10–8 base–1 replication–1) (Drake, 1991). We predicted that λ would be more likely to evolve OmpF function (three specific mutations plus one additional Maddamsetti et al., 2018) in simulations that accounted for coevolution by shifting the population from one landscape to the next. We ran trials where λ evolved on only one landscape at a time to establish a baseline for the frequency of OmpF+ evolution without coevolution. Next, we ran nine shifting landscape scenarios where we varied how many generations λ evolved on the ancestral host landscape before switching to the malT landscape. As anticipated, the switching protocol increased the frequency of OmpF+ evolution in all nine treatments above the single host simulations, but only seven out of nine treatments were found to be significantly higher (Figure 2; ANOVA: F-ratio=6.14, d.f.=99, p<0.0001, Supplementary file 1b). This result was robust to changes in population size and total number of generations, and when controlling for both, different number of genotypes measured in the two landscapes, and noise created by imputing missing data points (Figure 2—figure supplement 2, Appendix 2).

Figure 2. Simulation results of the frequency of OmpF-use evolution observed when fitness landscapes were shifted at different frequencies.

Each bar represents an average of 300 simulation runs. Error bars indicate 95% confidence intervals. OmpF evolution is favored when λ evolves on shifting landscapes. The only two shifting landscape treatments that are not significantly higher than simulations on the constant malT landscape are the 0.2 and 0.4 treatments (Supplementary file 1b).

Figure 2.

Figure 2—figure supplement 1. Schematic outline of the simulations used to evolve λ on the fitness landscapes.

Figure 2—figure supplement 1.

Figure 2—figure supplement 2. Additional simulations to verify that shifting landscapes promote OmpF+ evolution.

Figure 2—figure supplement 2.

Frequency of OmpF+ evolution when λ population is evolved on different landscapes with different controls as compared with Figure 2 — (a) half the total number of generations, (b) twice the total number of generations, (c) with one-tenth of the population size, and (d) starting with only the genotypes whose fitness values were measured for both the landscapes. For all switching landscape treatments except two (switching fraction-time of 0.9 in (a), and 0.1 in (d)), the frequency of OmpF+ evolution was significantly higher (adjusted p-value<0.05) than the constant landscape treatments (ANOVA: F-ratio=6.62, d.f.=99, p<0.0001 for (a), F-ratio=8.22, d.f.=99, p<0.0001 for (b), and F-ratio=8.12, d.f.=99, p<0.0001 for (c), and F-ratio=36.04, d.f.=109, p<0.0001 for (d)). Each bar represents an average of 300 simulation runs and error bars indicate 95% confidence interval. The frequency of OmpF+ evolution in our simulations depends mainly on whether λ spends any time on the ancestral landscape before shifting to malT and there is little evidence of a relationship between the amount of time spend on each landscape and the probability of evolving OmpF+. However, we can reason that as the time spent on the ancestral landscape approaches zero, the frequency of OmpF+ will drop and approach the frequency caused by the constant malT landscape. This transition must happen sometime between 0 and 10%. (e and g) show the frequency of OmpF+ evolution when shifting is done between the same hosts, and (f) and (h) show difference between this switching treatment, and the varying host treatment. The bars are the averages taken across all switching times. Statistics for two-sample t-tests: (f) t=9.10, d.f. =169, p<0.0001; (h) t=8.54, d.f. =161, p<0.0001.

Reconstructing coevolution in an experimental population

The simulation results suggest that the shifting landscape encourages λ’s evolution to gain the required mutations for OmpF function. In particular, the simulations show that the first steps along the path to the innovation are more likely if λ first adapts to the ancestral bacterium, meanwhile the final steps are more likely to occur if the host coevolves resistance. To verify this result with laboratory experiments, we analyzed the path λ took to OmpF+ in a single population cryopreserved from the previous coevolution study (population ‘D7’ in Meyer et al., 2012; Table S1). Population ‘D7’ was chosen because λ evolved relatively few mutations in this population. We believed this choice was conservative and constitutes a strong test of our hypothesis since fewer λ mutations would provide fewer opportunities to detect host-induced contingency. We sampled λ strains from different timepoints of population ‘D7’ and sequenced their J gene (Figure 3a, Supplementary file 1c and Supplementary file 1d). Next, we ran pairwise competition experiments between λ genotypes at different stages of evolution on the two hosts. We found that the first mutation on the line of descent to OmpF+ required ancestral E. coli to evolve, while the second mutation required malT E. coli (Figure 3b and c). In addition, the eventual OmpF+ genotype with five J mutations only outcompeted the genotype with two mutations when provided with malT hosts (Figure 3—figure supplement 1). These findings show that the path λ took in population D7 required it to sequentially adapt to both host types and that λ’s fitness landscape changed during coevolution in a way that ultimately facilitated evolutionary innovation.

Figure 3. J evolution (a) and evidence of the interdependency between λ and E. coli fitness during their coevolution (b-d).

(a) Phylogenetic reconstruction and relative abundance of λ genotypes isolated through time from a previously coevolved community (Meyer et al., 2012). Each letter and star indicate a non-synonymous mutation in J (see Supplementary file 1c for labels’ corresponding mutations). A genotype’s relative abundance on a given day is denoted by the fraction of the total height of the y-axis that it occupies (e.g. on day 9, frequency of ABC is 0.2 and A**C is 0.8; see Supplementary file 1d). The lineage WT-A-ABC-ABCDE eventually evolves OmpF function and fixes in the population; resistance in E. coli through malT rises to high frequencies between days 5 and 8 (Meyer et al., 2012). (b & c) Selection rates (per 24 hr) of phage genotypes on the two hosts. Each bar represents the mean of three experimental replicates. While mutation A is favored over wildtype (WT) λ in the presence of the ancestral host and not malT, AB only outcompetes A in the presence of malT and not the ancestral host. One tailed t-tests to test if the mean selection rate is significantly greater (or less) than zero: A vs WT with ancestor host- t=98.76, d.f. =2,p<0.0001; A vs WT with malT - t=4.99, d.f. =2,p=0.0190; AB vs A with ancestor- t=3.4, d.f. =2,p=0.0383; AB vs A with malT - t=8.88, d.f. =2,p=0.0062. (d) Selection rate (per 4 hr) of malT E. coli relative to its ancestor in the presence of λ from different stages of coevolution. Each competition was replicated three times. Lowercase letters denote significance via Tukey’s honest significance test, see Supplementary file 1g for pairwise p-values (ANOVA: F-ratio=111.22, d.f.=11, p<0.0001). One tailed t-tests were also used to test if the selection rate of malT was greater than zero WT: t=2.44, d.f. =2,p=0.676; A: t=5.12, d.f. =2,p=0.0181; ABC: t=26.59, d.f. =2,p=0.0007; A**C: t=71.67, d.f. =2,p<0.0001. This shows that malT is unlikely to evolve in the presence of WT λ but it becomes progressively more likely as λ gains mutations. Asterisks over all the competitions indicate significance level corresponding to the p-values. Error bars in all bar graphs represent one sample SD.

Figure 3.

Figure 3—figure supplement 1. Competition assay of λ isolates from population D7.

Figure 3—figure supplement 1.

Relative fitness of λ with two mutations (AB) to an OmpF+ λ with five mutations (ABCDE) on two host genotypes. This shows that AB outcompetes ABCDE in the presence of ancestral E. coli while ABCDE is favored over AB with malT E. coli. Selection rate (per 4 hr) is the difference in Malthusian growth rates of the competitors over 1 day with a value of 0 indicating no difference in fitness. Each bar represents mean of three replicate trials and error bars indicate one sample SD. (One-tailed t-test to test if the selection rate is greater (or less) than zero— ABCDE vs AB with ancestor host: t=58.53, d.f. =2,p<0.0001, ABCDE vs AB with malT: t=13.09, d.f. =2,p=0.0029. Asterisks over the bar graphs indicate significance level corresponding to the p-values).

At this point, we shifted the focus of our study to testing whether E. coli’s resistance evolution was also impacted by λ’s evolution. Before reconstructing J evolution, it was believed that E. coli evolved resistance first and then J mutations evolved in response (Meyer et al., 2012, Figure 5). However, J evolved within a day, while malT mutations were previously shown to fix between 5 and 8 days in the population (Meyer et al., 2012). The timing suggests that λ improved infectivity and then applied pressure on E. coli to evolve resistance. To test whether λ evolution promoted host resistance evolution, we ran competition experiments between ancestral and malT hosts in the presence of phages isolated from four different time points. We found that malT was not significantly more fit than the ancestral E. coli in the presence of the ancestral λ, but it was more fit in the presence of the evolved λs (Figure 3d). This provides another example of interspecific epistasis, but this time the parasite’s genotype alters the host’s landscape. This result combined with the others suggests that there is an intricate coevolutionary feedback at play between λ and E. coli: λ evolves J mutations that better exploit E. coli, which in turn applies pressure on E. coli to evolve resistance. Once resistance evolves, new adaptive pathways become available to λ that encourage the innovation. For the computer simulations, we arbitrarily chose timepoints to switch from one host to the other; however, in reality, the dynamics of the switch are dictated by the host-parasite coevolution.

Evolutionary replay experiments

To further test the role of coevolutionary processes at driving λ’s innovation, we ran replays of the coevolution experiment (Figure 4—figure supplement 2). We initiated 12 populations with malT host that already possessed resistance, and 12 populations with ancestral host where λ and E. coli would coevolve normally. The former treatment should hinder the evolution of OmpF function because it denies λ the opportunity to evolve first with ancestral E. coli. In line with our expectations, 0 of 12 replicates evolved OmpF use in the malT treatment, meanwhile 3 out of 12 evolved the innovation in the ancestral treatment (Figure 4a and b). By sequencing J alleles of the resulting λ genotypes, we found that fewer mutations evolved with malT despite evolving for the same length of time. This suggests that λ’s evolution was stymied by starting with the resistant host (Supplementary file 1e), and by disrupting the coevolutionary process we interfered with λ’s ability to innovate.

Figure 4. Evolutionary replay experiments reveal that λ’s evolution to use OmpF depends on host coevolution.

(a) Wildtype (WT) λ with ancestral host, (b) WT λ with malT, (c) λ one mutation removed from evolving OmpF function with malT, and (d) identical setup as (c), but with λ two mutations removed (see Supplementary file 1f for identity of mutations). (e) The bar graph provides the frequency of OmpF+ evolution compared to the frequency observed by Meyer et al., 2012. Given λ’s established one in four rates of OmpF+ with ancestral host, the probability of observing no OmpF+ evolution in 12 replicate populations is ~0.03. Thus, no positives for OmpF evolution in (b) shows that λ’s evolution to OmpF function is significantly hindered, when coevolution is initiated with malT host. However, malT does not impede OmpF+ evolution, when the coevolution is initiated with already evolved λs (p-values for Fisher’s exact test: between (b) and (c) p=0.0261; between (b) and (d) p=0.0122). Notably, some λ populations went extinct which is common for these experiments and was previously shown to be caused by the evolution of resistance mutations in E. coli’s ManXYZ protein complex (Meyer et al., 2012).

Figure 4.

Figure 4—figure supplement 1. A schematic overview of the coevolution experiments performed in this study and by Meyer et al., 2012.

Figure 4—figure supplement 1.

Coevolution is initiated by coculturing λ and E. coli in glucose media for 24 hr at 37°C. At the end of each day, 1% of the total population is transferred to fresh media to continue the coevolution and the spent media is discarded. A fraction of the community is also sampled for long-term storage at −80°C, to determine population densities and to check the status of λ’s OmpF use.
Figure 4—figure supplement 2. Phage densities during the coevolutionary replay experiment performed in Figure 4.

Figure 4—figure supplement 2.

Timeseries of phage density of 12 replicates measured at the end of each day for cultures initiated with WT λ and ancestor host or malT host (corresponding to Figure 4a, b). The three λ populations that evolved OmpF function in the ancestor host treatment are marked by solid lines. For the simulations, we used a populations size of 6.3 × 109 which is lower than the peak population density observed (~1010), but above the average population density (~6.3 × 108).

Lastly, we tested whether malT would still promote the evolution of OmpF use if the replay experiments were initiated with λ genotypes positioned further along the path to gaining OmpF function. We initiated two more replays: one with a λ strain that was just a single mutation away from becoming OmpF+, and another that was two mutations away (see Supplementary file 1f for J alleles). 16 of 24 λ populations evolved OmpF+ showing that whether malT promotes OmpF+ evolution depends on where the λ genotypes are located in the fitness landscape (Figure 4c and d).

While the goal of these experiments was to test a specific hypothesis about λ’s innovation, the results have broader implications for coevolutionary dynamics and the repeatability of evolution. We showed that if the host outpaces the parasite (Figure 4b; λWT and malT E. coli), then the parasite is unable to innovate. This is in line with a previous study showing that if the host evolves even higher levels of resistance than provided by the malT mutation, λ loses the ability to innovate (Meyer et al., 2012). When specific pairs of host and parasite genotypes are combined that are at complementary steps in the coevolution, then the innovation becomes possible and even highly likely (Figure 4a, c and d; λWT and ancestral E. coli, λ–1 and malT E. coli, and λ–2 and malT E. coli). This suggests that the genotype of host and parasite that first encounter each other in nature, as well as the timing of their coevolution, play an important role in determining the dynamics and endpoints of their evolution.

Discussion

This study provides multiple direct tests of coevolution’s role in driving innovation, as well as revealing a more complicated model of λ and E. coli coevolution than previously published (Figure 5). λ was thought to evolve OmpF use as a direct response to malT resistance (Meyer et al., 2012); however, here we learned that key steps were missing from that model. λ is initially poor at infecting its host, so it evolves mutations in J that enhance its infectivity. The new λ genotypes apply pressure on E. coli to evolve resistance. When host-resistance increases in the community, λ’s fitness landscape is deformed in a way that promotes J evolution toward OmpF use, but only if it had already acquired some J mutations. Remarkably, the timing and coordination of each of these interdependent steps is facilitated by the reciprocity ingrained in host-parasite coevolution. Altogether, we were able to provide direct experimental evidence that fluctuating landscapes, also known as fitness seascapes (Merrell, 1994), can promote evolutionary innovations.

Figure 5. Previous and new version of λ-E. coli coevolutionary model.

Figure 5.

In the original model, E. coli evolves resistance by repressing λ’s receptor and then λ evolves mutations that allow it to use a new receptor. The updated model describes a more involved dynamic where λ evolves mutations that improve its ability to infect E. coli; with increased pressure to avoid infection, E. coli responds by evolving resistance, and then λ evolves the remaining mutations that leads to the ability to infect using the new receptor.

Our studies show that the fitness of a parasite depends on complex genetic interactions within its own genome and with the genomes of interacting hosts. These interdependencies result in highly contingent evolution, where λ is unlikely to evolve an innovation unless it participates in a particular sequence of coevolutionary steps with its host. Despite the stochasticity that is expected to arise in systems with substantial historical contingency (Gould, 1989; Blount et al., 2018), λ’s evolution to use a new receptor is repeatable because the sequence is coordinated by coevolutionary feedback. While coevolution may yield tangled banks of interactions, we demonstrate how high-throughput technologies can be used to untangle them and to predict evolution. The ability to successfully predict evolution in any system represents a significant step forward, but it is particularly notable in conditions that incorporate species interactions. Our approach was data-intensive and relied on technologies that are not currently available for more natural and complex ecological systems; however, many efforts are underway to develop these technologies (Bergelson et al., 2021). This study shows that these efforts are worthwhile because even though the resulting data may appear to be an uninterpretable morass, computational analyses can be leveraged to penetrate the information to aid learning and prediction.

Ideas and speculation

This work was completed during the 2020–22 SARS-CoV-2 pandemic, raising the question of whether this research provides insight into strategies to prevent future pandemics. While it is difficult to extrapolate, there are general lessons that can be applied to understanding pandemics. First, natural coevolutionary processes can repeatedly drive the evolution of viruses to achieve difficult adaptations that underly host-range expansions. On one hand, this knowledge is alarming because it suggests that pandemics may be more likely than previously anticipated, but on the other hand, the repeatability provides an opportunity to study the process and develop intervention strategies. We also learned that shifting conditions can enhance the evolutionary potential of viruses, suggesting that there could be additional unexpected consequences of human-caused environmental changes.

Materials and methods

Bacterial and phage strains

The ancestral phage λ strain in our study is cI26; it is a strictly lytic strain that was used in the previous coevolution study on which this paper builds (Meyer et al., 2012). λ has two life cycles: lytic and lysogenic. In the lytic life cycle, λ infects its host, creates multiple copies of its genome, and lyses the cell to produce new viral particles. This is unlike the lysogenic life cycle of λ, where it stably integrates its genome into the host genome and is replicated along with the host (Hendrix, 1983). The λ strain cI26 has a frameshift mutation in the cI gene that disrupts the regulatory protein cI required by λ to switch to a lysogenic life cycle (Meyer et al., 2012). This renders cI26 obligatory lytic. The other λ strain we used was cI857 (provided by Ing-Nang Wang, State University of New York, Albany) to construct our λ genomic library using MAGE (discussed in ‘Fitness landscapes’ section). The advantage of using cI857 over cI26 is that it is able to form a lysogen and as a genomic complex with E. coli’s genome, its genome can be easily edited with E. coli engineering methods. cI857 has an advantage over typical lysogenic λ because it has a mutation in the cI gene that makes the repressor protein cI unfold at high temperatures (Meyer et al., 2016). This enables us to induce the lytic life cycle of cI857 from lysogens using heat shock treatments, which has fewer side effects compared to using mutagens that the typical strain requires for induction.

We used E. coli B strain REL606 for the ancestral-sensitive host and its derivative EcC4 for the evolved-resistant (malT) host to construct fitness landscapes (in Figure 1) and run competition assays (in Figure 3). REL606 was the ancestral host in the coevolution experiments performed by Meyer et al., 2012, and EcC4 was isolated from one of the coevolving populations that has a single mutation, a nonsense mutation (C→T) at genome location 3,482,567 in malT gene (Table S5 in Meyer et al., 2012). MalT is a positive regulator of lamB, so a disruption in transcription of malT inhibits LamB expression (Boos and Böhm, 2000). Resistance to λ in EcC4 due to this nonsense malT mutation has been shown to yield high levels of resistance, the equivalent of a nonsense mutation in lamB (Figure S5 in Chaudhry et al., 2018). We used another malT mutant of REL606, named LR01, for our coevolutionary replay experiments (Figure 4). This strain has a 25 bp duplication at genome location 3,482,677 causing a frameshift in malT. LR01 was also isolated from a coevolving population of λ and E. coli (Figure S3 in Meyer et al., 2012); however, unlike EcC4, LR01 has a high reversion rate to malT+ that results in ‘leaky resistance’ to λ (Chaudhry et al., 2018). This allows λ populations to sustain serial dilution transfers and thus was critical for the success of the coevolutionary replay experiments (Figure 4). We note that while we use the term ‘resistant’ to refer to malT genotypes, this ‘resistance’ is specific to the OmpF λ in this study; these bacterial strains are susceptible to OmpF+ λ phage.

Several E. coli strains were used for culturing λ. The most often used was DH5α, a lacZα derivative of E. coli K-12, because it is permissive to all λ genotypes and it lacks lacZ which is used as a genetic marker to distinguish phage genotypes in competition assays (additional information in the ‘Phage Competition Experiments’ section). Two lamB mutants of E. coli with non-functional LamB were used interchangeably to culture OmpF+ λ on LB agar. One strain was a derivative of REL606 that has a 1 bp insertion of nucleotide T between base positions 610 and 611 in lamB (Meyer et al., 2012; Meyer et al., 2015), and the other strain was an E. coli K-12 derivative from the Keio collection (lamB JW3996) (Baba et al., 2006). We found no difference in the efficiency of plaquing on the two strains.

MAGE was performed in an E. coli K12 strain, HWEC106, provided by Harris Wang, Columbia University. The strain’s mutS gene is deleted and it possesses the pKD46 plasmid with an inducible λ-red recombineering system (Datsenko and Wanner, 2000). cI857 successfully integrated into this strain’s canonical ATTB site located near the galK gene (Meyer et al., 2012; Maddamsetti et al., 2018; Petrie et al., 2018).

Media

We performed most experiments in conditions identical to the initial coevolution experiment (Meyer et al., 2012). For competition assays (low and high-throughput) and two of the four coevolutionary replay experiments (Figure 4a and b), bacteria and phage were cocultured in modified M9 Glucose (47.7 mM disodium phosphate, 22.0 mM potassium phosphate monobasic, 18.7 mM ammonium chloride, 8.6 mM sodium chloride, 0.1 mM calcium chloride, 10 mM magnesium sulfate, 5.55 mM glucose, and 0.2 mM of LB) as used in Meyer et al., 2012. A new medium was used for the remaining two coevolution replays (Figure 4c and d) that we call Tris DM Glucose: 1.6 mM potassium phosphate monobasic, 0.59 mM potassium phosphate dibasic, 0.2 mM of calcium chloride, 50 mM Tris base (pH 7.4), 10 mM of magnesium sulphate, 7.5 μM thiamine, 3.2 mM ammonium sulphate, and 5.55 mM glucose. This medium is improved over M9 Glucose because it is less prone to magnesium precipitation, and fortunately has no noticeable effect on the coevolution.

For isolation, estimating densities and initial culturing of bacteria and phage, four more media were used according to the following specifications—(a) LB (Lennox Broth): 10 g tryptone, 5 g yeast extract, and 5 g sodium chloride per liter of water, (b) LBM9: 20 g tryptone, 10 g yeast extract, 12.8 g sodium phosphate heptahydrate, 3 g potassium phosphate monobasic, 0.5 g sodium chloride, 1 g ammonium chloride, 1.2 g magnesium sulfate, 22 mg calcium chloride per liter of water, (c) LB agar: 10 g tryptone, 5 g yeast extract, 5 g sodium chloride, and 16 g agar per liter of water, (d) soft agar: 10 g tryptone, 1 g yeast extract, 8 g sodium chloride, 7 g agar, 0.1 g glucose per liter of water, supplemented with a final concentration of 2 mM calcium chloride. The soft agar was also supplemented with 10 mM magnesium sulphate to improve plaquing of phage particles.

Isolation and culturing techniques

Most strains were grown at 37°C, except for HWEC106, which was grown at 30°C because pKD46 has a temperature-sensitive origin of replication. To ensure uniform aeration and nutrient availability, 4 ml cultures were shaken at 220 rpm and 10 ml cultures at 120 rpm; different rpm was chosen to ensure comparable aeration in cultures (10 ml cultures were grown in 50 ml Erlenmeyer flasks and 4 ml in glass tubes). We used LBM9 to grow phage cultures and LB to grow bacteria cultures, unless otherwise indicated. All bacteria and phage stocks were stored by freezing 1 ml of culture with 15% glycerol at −80°C. We revived phage from freezer stocks by growing ~2 μl of frozen stocks on 100 μl of DH5α overnight culture in 4 ml of LBM9. To harvest phage, cells were killed and separated by adding 100 μl of chloroform and centrifuging the solution at 3900 rpm for 10 min. The supernatant containing phage lysate was then stored with ~2% chloroform at 4°C. Bacteria were revived from −80°C by growing ~2 μl of the frozen stocks over night in 4 ml of LB.

Phage strains were isolated from a population by infusing phage particles into bacterial lawns of DH5α cells. We made these infused plates by mixing a small volume (between 10μl and 100 μl) of diluted phage and ~5 × 108 host cells to a 4 ml of molten soft agar at 55°C, and pouring the mixture over an LB agar plate (Sambrook and Russell, 2001). The soft agar was allowed to solidify and then incubated overnight at 37°C for phage particles to form plaques. A plaque is a near-circular clearing that forms on the bacterial lawn when nearby cells are killed by an infection that is initiated by a single phage particle. Phage dilutions were made in saline solution (8.5 g/L NaCl) and implemented to yield between 30 and 300 plaques. A single plaque was picked from the infused plates and clonal phage stocks were made by culturing the isolate overnight with DH5α and extracting phage lysate using techniques similar to frozen λ stocks.

Estimation of phage densities

Infused plates with appropriate dilutions of phage in saline solution (8.5 g/L NaCl) were used to estimate phage densities for phage competition assays. We controlled the number of plaques on an infused plate so that individual plaques could be identified and counted. These counts were used to back-calculate the phage growth rate.

To estimate the phage densities in the coevolutionary replay experiments, serial dilutions of phage were spotted on a bacterial lawn of DH5α. We made the bacterial lawns by adding ~5 × 109 cells of DH5α to 10 ml of molten soft agar at 55°C, and pouring it over a 150 mm diameter petri dish of LB agar base (Sambrook and Russell, 2001). After the soft agar solidified, 2 μl of eight different phage dilutions were added on top of the surface, let to dry, and incubated overnight at 37°C. The plaques were counted from the dilution where we could identify individual plaques. We then used these counts to calculate phage density in pfu per ml, where pfu is plaque forming units.

Sequencing for analyses of mutations in J

We routinely sequenced the J gene of λ to verify the identity of stocks and to determine whether J evolved in the replay experiments. DNA samples were PCR amplified using Q5 high-fidelity 2 × master mix (New England Biolabs) and primers described in Supplementary file 1h. Unpurified PCRs were submitted to Genewiz (La Jolla, CA) for Sanger sequencing.

Calculation of selection rate using Malthusian parameter

Selection rate (s) was used to quantify the difference in fitness between pairs of strains (say X and Y) in a given environment. It is calculated as the difference of their Malthusian parameters:

s=lnXTX0-lnYTY0T (1)

where Xt and Yt are the densities of strains X and Y, respectively, at time t, and T is the period of time over which strains are assessed (and the assay starts at t=0). Selection rate has units of inverse time; however, in all our figures, we report selection rate as per unit time period (T) of the assays. We reported the values this way because λ does not strictly grow exponentially and it can be misleading to report normalized rates after dividing by T because the rates cannot be extrapolated for different times in a straightforward way like for exponential growths. That being said, for readers who want the per hour rates, we have provided the value of T in hours within each figure caption.

Construction of λ genomic library using MAGE

Our goal was to measure the fitness of a significant number of λ genotypes to establish the basic structure of the landscape. To accomplish this, we generated a combinatorial library of genotypes made with 10 J mutations previously observed to evolve en route to OmpF+ (Supplementary file 1a). These mutations were chosen because they fell within two <100 base frames where adaptive mutations tended to evolve. We used a genetic engineering technique called MAGE to construct the library (Wang et al., 2009). This technique employs the λ-red recombineering system that can efficiently recombine single-stranded DNA into E. coli’s genomic DNA. To engineer λ, λ’s genome is integrated into E. coli’s genome, creating a lysogen. The λ genome in the lysogen (a prophage) becomes dormant and can be treated as any other E. coli gene. Gene edits in this λ lysogen are made by expressing λ-red from a plasmid (pKD46) within E. coli cells (Datsenko and Wanner, 2000) and electroporating synthetic oligos with specific mutations written into the sequences. Multiple oligos can be combined into a single experiment in order to create a diverse library of different combinations of the mutations. The output of this procedure is a single sample with a library of genotypes mixed together.

We previously reported our MAGE protocol in Maddamsetti et al., 2018. The oligos used are in Table S1 of the manuscript listed under the subheading ‘10-mutation library’. There are a few important aspects of the oligo and protocol design that we discuss here, but for more detailed information see the original paper.

Our goal was to construct a genetic library with each combination of mutations represented equally, this means a 50/50 split of mutant and WT states at each site. To achieve this, we modified the typical MAGE protocol. First, when mutations were clustered near each other such that a single 90-mer used for MAGE editing encompassed multiple mutations, we designed multiple oligos with different combinations of the mutations at these nearby sites. Without doing this, edits would be correlated, creating an imbalance in the network representation, and limiting our ability to determine the effects of individual mutations. Second, we designed oligos with the WT state at the 10 positions. Including these oligos slowed the efficiency of MAGE to introduce mutations because sometimes they would overwrite an edit; however, it safeguarded the procedure from saturating the library with the 10 mutations. A third strategy was employed to even out the variation at these 10 sites. This was to perform MAGE starting from two orthogonal points in the genetic network; λ that had none of the 10 mutations and one that we engineered in all 10 (Maddamsetti et al., 2018). Additionally, we ran MAGE on three separate replicates from each starting point in order to enhance the potential to explore more genetic space. Lastly, we ran the MAGE protocol for 50 cycles, which is the number of cycles a computer simulation of MAGE with an efficiency of 10% recombination efficiency per cycle predicted we would need to construct all variants. In the end, we mixed all six MAGE libraries together in equal frequencies to maximize diversity in the library.

Our design also included measures to improve our ability to detect rare variants in the library. The next step of the fitness landscape protocol is amplicon sequencing with an Illumina MiSeq. This technology has relatively low rates of error (~10–3 per base sequenced); however, we anticipated that some of the genotypes in our library would fall below this frequency, making it impossible to distinguish between false positives and true reads. To improve our resolution, we edited a synonymous mutation adjacent to each focal mutation. We called this a watermark mutation because it helped us distinguish between a true genome edit and an error in sequencing. By only recording the presence of a focal mutation if it occurred alongside the watermark, we decreased our detection limit from ~10–3 to ~10–6.

The watermark strategy we employed was slightly more complicated than this. Most synonymous mutations at the C-terminal end of proteins, like our watermark mutations, have no fitness effects (Kelsic et al., 2016). Despite this, we designed a way to test for neutrality. For each focal mutation, we edited one of two different watermark mutations (see Supplementary file 1i for list of mutations). This allowed us to compare equivalent genotypes with distinct watermarks in order to test whether one of the synonymous mutations impacted λ’s fitness when compared to the other. We found that there was no significant fitness effect of the neutral mutations on the phage genotype (see Appendix 1).

Empirical fitness landscape resolved through one-pot competition experiment

After creating the λ genomic library, we measured the relative fitness of genotypes with respect to the WT λ. To do this, we cocultured the λ genomic library and ancestral λ in a 1:9 ratio. This ratio was used so that the most abundant competitor remains the ancestor throughout the competition experiment. This means that the competitive fitness we measured is with respect to a single genotype. If we had not done this, then as the community of engineered λ shifts during the competition, and more fit genotypes become enriched, they will change the mean fitness of the population and cause mildly fit genotypes to begin to decline, making them appear unfit. Flooding the flask with ancestral λ and running competitions for a short time period (4 hr) solves this problem.

Having a disproportionate number of ancestral λ has a pitfall. We measured λ fitness by comparing the frequency changes of each genotype using amplicon sequencing (more information in the next section). The problem is that most of our sequencing effort would be spent on sequencing a single genotype, the ancestor. To avoid this, we engineered an ancestral genome with 6 synonymous mutations, where the reverse primer binds. These edits interfered with primer binding and caused selective amplification of the library λs. We edited 6 ‘wobble’ positions: 3381 (g→t), 3384 (c→a), 3387 (c→a), 3390 (c→t) 3393 (g→a), and 3396 (c→a). The engineering was done using MAGE with the oligo provided in Supplementary file 1j. These edits were made at the end of the protein (3399 nucleotide) and so they likely did not affect λ fitness. We did not test this because this engineered strain acted as a standard competitor for all competition experiments and so it should have no effect on the relative fitness comparisons.

We ran eight competitions, four in the presence of ancestral E. coli and four with malT E. coli. Competitions were inoculated with ~107 total λ particles and ~2 × 108 cells into 10 ml of M9 glucose in 50 ml flasks. The cells were preconditioned in M9 glucose for 24 hr before the competition. Flasks were cultured for 4 hr at 37°C and shaking at 120 rpm. 1 ml samples were removed from the library before and after the competition for processing. Phage particles were concentrated using PEG precipitation (Sambrook and Russell, 2001). The pellet was resuspended in 25 μl of molecular grade water and then a two-step PCR reaction was performed in order to amplify just the region of interest, to attach barcodes to the amplicons so that we could multiplex samples into a single Illumina run, and attach adapters required for Illumina sequencing. The protocol was published by Kelsic et al., 2016. The primers we used are provided in Supplementary file 1k. As described in Kelsic et al., we ran two separate PCR reactions for each sample and used a unique barcode on each. This allowed us to test for amplification bias, which we did not detect (Figure 1—figure supplement 2).

Sequencing

Amplicon samples were pooled together and sequenced using an Illumina Mi-Seq maintained in the Systems Biology Department at Harvard Medical School. 100 base paired-end reads were run. We were able to extract on average 107,839 high quality reads per sample with standard deviation of 26,889. This provided considerable coverage to reliably estimate fitness for even rare genotypes.

Post-sequencing analysis and construction of fitness landscape

Amplicon sequences were analyzed to calculate selection rates for each engineered genotype. These values were later used to construct the fitness landscapes. We used a combination of custom Python and MATLAB (version R2019b) scripts to read and concatenate the raw paired-end reads, identify the genotypes based on the focal mutations they possessed, and to count their abundances. All focal mutations were called only if either of the two corresponding watermark mutations were also present. We removed any reads that contained more than one mutation other than the focal and watermark mutations. This essentially acted as a quality filter and we did not have to filter based on the Illumina provided Q-score. For each competition, a mean of counts corresponding to the two barcodes used in PCR amplifications was taken for all the genotypes. If the counts for both, initial and final timepoints were available for a genotype, they were used to estimate the genotype’s Malthusian growth rate, given by the calculation lnλi,Tλi,0T where λi,t is the density of the given genotype at time t and T is the total time of the competition. Note that most genotypes performed poorly on malT host and fell below detection limit at the final timepoint. We quantified the differences in fitness of all genotypes by calculating their selection rate with respect to the WT λ (see ‘Calculation of selection rate using Malthusian parameter’ section). The final fitness landscapes (Figure 1a and b) were constructed by taking the mean of the selection rates obtained from the four replicate competition trials performed on the same host.

Statistical analysis of fitness landscapes

We used multiple linear regression models to quantify the genetic interactions in the fitness landscapes. We regressed fitness values on each landscape against individual effects of the mutations and the pairwise epistatic interactions between them (genotype-by-genotype or G×G interactions). This allowed us to estimate the contribution of epistasis toward explaining the observed fitness data. The fitness of each genotype in the regression model was thus described by 55 predictor variables (10 main effect terms for the 10 mutations +45 interaction terms):

y=β+iβi Gi+i<jβijGiGj (2)

where β** is the intercept, all other βs are regression coefficients, contribution of an individual mutation i is described by the term Gi , and the effect of pair of mutations i and j are captured by terms GiGj , where Gi is an indicator variable that is equal to zero when the mutation i is absent and one when present. After fitting the model, we used Benjamini-Hochberg procedure (Benjamini and Hochberg, 1995) to control for false discovery rates and identify statistically significant terms in the model (see Figure 1c). All analyses were performed in R version 3.6.1 (R Development Core Team, 2019). The error in fitness estimates of the genotypes did not depend on the mean fitness value of the genotypes (Figure 1—figure supplement 4).

To understand how λ’s landscape differed between the two hosts, we incorporated the host genotype as a predictor variable in the linear model and regressed the combined fitness data of both landscapes together. This allowed us to test if a mutation significantly interacted with the host (genotype-by-environment or G×E interaction) and whether interaction between pairs of mutations changed with the host genotype (genotype-by-genotype-by-environment or G×G×E interaction). The combined full-factorial model consisted of a total of 111 terms (10 G for individual mutation +1 E for host +45 G×G+10 G×E+45 G×G×E terms),

y= β+iβiGi+β+E+iβi+GiE+i<jβijGiGj+i<jβij+GiGjE (3)

where additional E is an indicator variable for host type and other notations follow same scheme as in Equation 2. Since the fit of a model generally improves as the number of predictor variables increases, we tested for overfitting using Akaike information criterion (AIC) (Akaike, 1974). AIC is a penalized-likelihood criterion which penalizes a model for increasing number of parameters in it. The model with the greatest relative likelihood is considered to be the one with minimum AIC value. We minimized AIC value using the step function in R Development Core Team, 2019 to uncover subsets of predictor variables that have high predictive power. This resulted in a more parsimonious model (77 terms out of a total 111, see Figure 1d). A complimentary adjusted R-squared analysis came to the same conclusion, Radj2=0.775 for the reduced model and Radj2=0.769 for the full-factorial model (higher values indicate more predictive and parsimonious models). After model selection, we used Benjamini-Hochberg procedure to determine which variables were significantly predictive (Figure 1d). This procedure only controls for false positives and is susceptible to false negatives. In this way, it is conservative for our purposes.

Simulation of λ’s evolution on fitness landscapes

To test if the changes in λ’s fitness landscape facilitated its evolution to infect via OmpF, we simulated λ’s evolution on the two landscapes we measured and recorded whether OmpF+ genotypes arose to a high enough frequency that we would have detected them in the original laboratory evolution. We initiated λ populations with no mutations and allowed them to evolve and mutate at any of the 10 focal sites used to construct the landscapes (Supplementary file 1a). For each treatment (Figure 2), we simulated 300 separate trials using a modified Wright-Fisher model with discrete generations and a fixed population size.

For each simulation, a new fitness landscape was constructed by assigning fitness values to all the genotypes (210=1024 genotypes). This was done in a way to account for error in estimating fitness and the error associated with imputing the values of missing data points. For the genotypes that had empirical fitness data available (Figure 1a and b), we did not simply average the values of the replicate fitness measurements but instead we performed a bootstrapping protocol in order to account for the error in the fitness estimate. We randomly resampled from the four replicate measurements (with replacement) and then computed the mean of the four. Some genotypes were not present in all four replicates, in these cases we resampled as many times as there were replicates. Genotypes with only one replicate data point were, thus, assigned the same corresponding fitness values in all the runs of the simulations. Next, we imputed fitness of the genotypes that were missing from the empirical landscape. A genotype with a missing fitness value was randomly chosen and assigned the mean of the fitness values of its nearest neighbors (one-mutation away genotypes) present in the landscape. This was iterated until the full genotypic space was complete. Note that the order in which genotypes are chosen can affect the value that is estimated for a given genotype. This is because as the landscape is filled in, each genotype will have more neighbors to draw values from. This means that if a genotype is randomly chosen early, its fitness will be based on fewer neighbors than if it were chosen later, and its value will be slightly different. Since missing genotypes are randomly chosen in each iteration, the order will vary from one simulation to the next. This method introduces an extra source of variation in the simulation runs and captures uncertainty associated with the imputation of fitness values.

After constructing a complete fitness landscape, we evolved a λ population through repeated cycles of reproduction, selection, and mutation. For each generation, reproduction in the population was simulated by a multinomial sampling where the number of trials was equal to the population size N (set to ~6.3 × 109 based on Figure 4—figure supplement 2) and the success probability associated with a genotype i was given by pi=niwi/iniwi , where ni is the abundance of the genotype i, and wi is defined as the exponential of selection rate used in fitness landscape. Thus, the probability of (k1,k2,k3,km) offsprings for genotypes 1,2,3,...,m would be:

Nk1k2k3kmi=1mniwijnjwjki

To incorporate mutations, all genotypes whose frequencies increased were mutated as per λ’s mutation rate (7.7 × 10–8 per base per replication Drake, 1991). Consider a genotype (say i) that increased in abundance and let the number of additional individuals produced by this genotype be denoted by zi (with zi=ki-ni). Each of these zi individuals retained its parent’s genotype with a probability e-μ (assuming a Poisson distribution for mutations). Otherwise, the individual was assigned to a random neighboring (i.e. mutant) genotype. Given λ’s mutation rate, the probability of multiple mutations is very small and was ignored here. We simulated this modified Wright-Fisher cycle of reproduction and mutation for 960 generations. It is difficult to know how many generations phages undergo because the evolved phages have a high spontaneous death rate (Petrie et al., 2018) and can also experience other sources of mortality. Given this, we decided to run the simulation for a somewhat arbitrary amount of time, 960 generations which corresponds to two doublings per hour for the 20-day experiment we are trying to replicate Meyer et al., 2012. This is likely an overestimate, which is unintuitively conservative for our purposes because more cycles will cause additional evolution and exploration of the fitness landscape, enhancing the possibility of OmpF+ evolution in the negative controls, and reducing our ability to detect treatment differences.

We evaluated whether λ evolved OmpF+ by examining the population for genotypes that had the necessary mutations to be OmpF+. Previous studies revealed that OmpF genotypes must have four mutations (Maddamsetti et al., 2018). They all possess two specific changes (A3034G) and (G3319A) and a third change that can occur at positions 3320 or 3321. For the 10 mutations we studied, T3321A is the only third mutation that satisfies this requirement, meaning all OmpF genotypes must have three specific changes (A3034G, G3319A, and T3321A). Many J mutations satisfy the last requirement (Maddamsetti et al., 2018), so we implemented what we call the ‘3+1’ rule where genotypes are designated as OmpF+ if they have the three necessary J mutations plus any additional mutation (see Supplementary file 1i). If any such genotype crossed the threshold of 5000 λ particles during the course of a simulation, the λ population in that run was marked to have evolved OmpF-function. We based the threshold value on the detection limit of OmpF+ genotypes in the original laboratory coevolution experiments (~500 pfu/ml, see ‘Coevolutionary Replay Experiments’ section).

We implemented host switching by first evolving λ on the landscape measured with ancestral E. coli, stopping the simulation early, and quantifying the frequency of each λ genotype. Next, we would initiate evolution on the malT for the remainder of the time but starting with genotypes at the frequencies recorded on the ancestral landscape. A total of 11 different coevolution treatments were run by varying how long λ evolved on the ancestral host landscape and malT landscape (Figure 2). 300 simulation trials were run for each treatment. We calculated error in the simulations in order to detect significant treatment differences by batching the runs into 30 and estimating the frequency of OmpF+ for each subset. This allowed us to calculate a 95% confidence interval using a Student’s t distribution for the frequency of OmpF+ evolution in each treatment, and then to use an ANOVA coupled with Tukey’s multiple comparison test to test for treatment differences.

Phage competition assays

To test whether λ’s path to innovation requires sequential adaptation to host genotypes isolated from different stages of their coevolution, we studied the dynamics of λ evolution in much more depth than previously reported. We focused on a single experimental replicate reported on in Meyer et al., 2012; ‘D7’ that evolved to be OmpF+ on the 12th day. Daily samples of the population were preserved, so we were able to revive phages from the full time series. We isolated 66 phages in total (Supplementary file 1c and d) and sequenced the full-length J gene from each in order to reconstruct the evolutionary dynamics (Figure 3a).

Next, we ran head-to-head competition experiments from key genotypes along the path to OmpF+, the WT and λ with a single J mutation (A), double (AB), and quintuple (ABCDE) (Figure 3b and c, Figure 3—figure supplement 1). Our first step was to mark key strains with a gene that caused the plaques to be visually distinguishable from unmarked phage. The gene is lacZα, which in the presence of a 5-bromo-4-chloro-3-indolyl-b-D-galactopyranoside (X-gal), Isopropyl-β-D-thiogalactoside (IPTG), and a host cell like DH5α that lacks lacZα, produces blue plaques. To incorporate lacZα into λ’s genome, we used a gene fusion of lacZα with λ’s R gene (Burmeister et al., 2016; Shao and Wang, 2008). The fusion readily recombines into λ’s genome by the phage’s endogenous recombination system, λ-red (Ellis et al., 2001). The procedure is straightforward, infect an E. coli strain that has a plasmid with the lacZ fusion (strain: SYP042; plasmid: pSwtRlaczalphaZalpha +RZ xl1 Blue Amp Blue provided by Ing-Nang Wang, Albany, NY) and some fraction of the λ produced will have recombined with the plasmid. The recombinants are isolated by picking blue plaques. WT and AB were each marked. The marker has been shown to have a slight fitness effect; however, this would not significantly influence our measurements of fitness since we observed such large differences between strains (Burmeister et al., 2016; Shao and Wang, 2008).

We competed three pairs of phages, WTlacZ vs A, A vs ABlacZ, and ABlacZ vs ABCDE, under two conditions: with ancestral E. coli or malT. Competitions were run for a single 24 hr period under identical conditions in which the phage evolved (Meyer et al., 2012). Initial λ density was between 104 and 105 particles per ml and the relative frequency of the strains was sometimes skewed in order to start with more of the unfit genotype. Fitness differences were so large that the less fit genotype would be overwhelmed and its frequency undetectable if they did not start with a numerical advantage. ~5 × 106 exponentially growing cells were inoculated to each flask (5 × 105 per ml). The cells were preconditioned by growing them overnight in the competition medium. Three replicate competitions were run for each treatment. Phage densities of the unmarked and marked λ were determined by plating in typical soft agar plates, where we added 0.5 mg/ml of X-gal and 0.25 mg/ml IPTG to the molten soft agar. Densities were measured at t=0 and t=24 hr.

Host competition assay

This competition experiment tested the role of λ’s evolution plays in promoting the evolution of host resistance. We competed the ancestral host (REL606) against the evolved-resistant host (EcC4) in the presence of different λ genotypes that had increasing numbers of J mutations (Figure 3; WT (cI26), A (1-mutation), ABC (3-mutation), and A**C (4-mutation)). We initiated three replicate populations for each phage treatment. The competition was performed with identical conditions as the original evolution experiment and run for 4 hr to prevent λ from evolving during the competition assay. Cells were preconditioned in modified M9 glucose for 24 hr before the competition. As with λ, fitness was measured by observing the change in frequency of the competitors over time. Frequencies were determined by plating a subsample of the populations onto tetrazolium maltose (TMal) agar plates (Shuman and Silhavy, 2003). REL606 produces white colonies on the plates, while EcC4 produces smaller red colonies. Before plating, the phages were removed by centrifugation of the cells and then removing the supernatant that contained the phage. Cells were resuspended in saline solution and the centrifugation and resuspension were repeated once more.

Coevolutionary replay experiments

Would λ’s evolution to OmpF innovation proceed without initially evolving on an ancestral host? We tested this by replaying λ-E. coli coevolution but starting out with a host that already evolved resistance via a malT mutation (strain LR01). We initiated 12 replicate populations with malT, and another 12 with REL606 as a positive control. The replay experiment was run nearly identically to the coevolution experiment performed by Meyer et al., 2012 with cI26 as the ancestor phage, and daily sampling was done to detect the presence of OmpF+ phage by spotting the phage on the bacterial lawn of lamB cells (detection limit ~500 pfu per ml) (Figure 4—figure supplement 1). The two differences were that the study was run longer (31 days) and daily estimates of λ populations were made (Figure 4—figure supplement 2). Twelve replicates were chosen in order to have enough statistical power to determine whether or not a treatment reduces the chances of OmpF+ evolution. In the previously reported study, only a quarter of populations (24 out of 96) evolved to use OmpF (Meyer et al., 2012); this result has also been replicated in our lab on two separate instances where 3 out of 12 populations gained OmpF-function. Assuming a binomial distribution and a true success rate of 0.25 for λ’s OmpF+ evolution with the ancestral host, the probability of observing no replicate evolving to be OmpF+ in 12 replicate populations is 0.03167. Thus, we can conclude that malT treatment reduced λ’s ability to evolve the innovation as compared to when coevolution is initiated with ancestral host. Two isolates from each population were sampled on day 26, and the reactive region of their J genes were sequenced using the Sanger sequencing method.

We ran two additional replay experiments as a positive control for any unintended side effects of initiating an experiment with malT host (LR01). A prediction from the fitness landscape simulation results was that if we started a replay with malT and a phage isolated from a later-stage of the coevolution, then the phage should evolve OmpF use. To test this, we genetically modified an OmpF+ λ by removing one mutation required for OmpF use (λ–1), and then a second (λ–2). The OmpF+ genotype was a lysogen (cI857) that we had previously edited in seven J mutations (Supplementary file 1f), which is reported about in Petrie et al., 2018. The two new edits were made using MAGE and the oligos are reported in Supplementary file 1j.

The replay experiments were run identically, except because of an oversight, these replays were run in a slightly different minimal glucose medium called Tris DM (compared to the original coevolution experiment run by Meyer et al., 2012). The key difference in the medium is the buffer, it is a Tris buffer, not a phosphate buffer. Both mediums impose carbon limitation and have the same concentration of the single carbon source, glucose, and so population and evolutionary dynamics should not have been affected. We confirmed in an additional replay experiment that the medium does not affect the timing or repeatability of OmpF+ evolution.

Data and code availability

Amplicon sequencing data used to generate fitness landscapes in this study have been deposited in the NCBI Sequence Read Archive under BioProject PRJNA646809. The code and data that support the findings of this study are available at https://github.com/anigupta12/fitness-landscape-paper, copy archived at swh:1:rev:19292491a2768444bd009da8a3de6d25ce8ab18c (Gupta, 2021).

Sharing of biological material requires a material transfer agreement, described at http://blink.ucsd.edu/research/conducting-research/mta/index.html.

Acknowledgements

We thank the James McDonnel Foundation and the NSF-DEB (1934515) for financial support and The Max Planck Society for supporting JG. We thank A Agarwal for help with the statistical analyses of fitness landscapes.

Appendix 1

Fitness effect of synonymous mutations used as watermarks in MAGE

We designed two watermark mutations for each focal mutation. Watermark mutations are synonymous mutations that fall within a few nucleotides of the focal mutation. These edits improved our ability to detect the presence of focal mutations within the MAGE library (Supplementary file 1i). In principle, these watermark mutations should not affect the fitness of λ because the introduced mutations do not change the amino acids. However, we designed two different watermarks in order to test whether or not they influence λ fitness. To test for fitness effects of the synonymous mutations, we evaluated the fitness of genotypes with just a single focal mutation, and then we split the count data from the competition experiments into two, one for watermark ‘1’ and a second pool for ‘2’. We calculated the fitness of each group and compared their means using a t-test. The analysis was limited to just fitness measurements made for the treatment with the ancestral host because the single mutants did poorly on malT host and many dropped below our limit of detection during the competitions. We were also only able to run the analysis on 7 out of 10 mutations because in three cases only one watermark was represented in the final counts data. We found that the fitness did not depend on the neutral marker for all seven using a Bonferroni corrected alpha value of 0.0071 (Supplementary file 1l). However, one of the seven was significant based on the uncorrected alpha value of 0.05. The effect on the selection rate was estimated to be 0.9947 with 95% confidence interval of 0.4758–1.5136. This value falls well within the error associated with fitness estimates (Figure 1—figure supplement 4). Because of our uncertainty of whether there was an effect of this synonymous mutation, and because the effect falls within normal levels of error, we did not take steps to correct for this possible effect.

Appendix 2

Simulation results when only using genotypes present in both the ancestral and malT fitness landscapes

One possible complication with the fitness landscape analyses is that the malT landscape is based on a subset of the ancestral landscape (131 vs 580 out of 1,024 possible genotypes). This could lead to potentially spurious comparisons of evolution on the two hosts. To control for this, we re-ran the simulations starting with only the genotypes that were present in both the landscapes. While the frequencies of OmpF+ populations did shift with this new analysis, the main result that switching landscapes enhances the frequency of OmpF+ evolution remained statistically significant (Figure 2—figure supplement 2d). The increased frequency in ancestor-only treatment stems from the increased stochasticity in estimating the landscape since more genotypes are imputed in panel d than a-c. This stochasticity leads to more OmpF+ evolution because the increased randomness in landscape formation increases the chances of producing viable pathways to OmpF use. Interestingly, the ancestral landscape OmpF+ frequency is equivalent to malT in panel d. This similarity suggests that the frequency of OmpF evolution for malT may be artificially high, which would mean that in reality there may be greater differences between this treatment and the fluctuating treatments.

Simulation results when shifting is done between landscapes of the same host

Each time we impute fitness values of missing genotypes in a simulation run, a slightly different landscape is produced. This procedure raises the question of whether the increased frequency of OmpF+ in the shifting landscape simulations is due to structural differences between the two landscapes, or the random differences created by the imputation technique. To test this, we repeated simulations as previously discussed; however, this time we shifted to a newly generated landscape created from the same host-landscape data. OmpF+ did not evolve when the shift was made between two ancestral landscapes. λ did evolve OmpF function in some replicates when the switching was made between two malT landscapes; however, the frequency was significantly less than when the shift was made from ancestor to malT (Figure 2—figure supplement 2e and f). Lastly, to control for the sparser landscape of malT compared with the ancestor, we shifted between two ancestral host landscapes that were generated using only fitness values of genotypes present in both landscapes. We found the same qualitative result as for the malT to malT landscape shift (Figure 2—figure supplement 2g and h). These results show that the structural differences between the two host landscapes encourage OmpF+ evolution above and beyond what results from the noise associated with the imputation procedure.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Contributor Information

Roy Kishony, Email: rkishony@tx.technion.ac.il.

Justin R Meyer, Email: jrmeyer@ucsd.edu.

C Brandon Ogbunugafor, Yale University, United States.

Aleksandra M Walczak, CNRS LPENS, France.

Funding Information

This paper was supported by the following grants:

  • National Science Foundation 1934515 to Justin R Meyer, Animesh Gupta.

  • McDonnell Foundation to Justin R Meyer.

  • Max Planck Foundation to Jenna Gallie.

Additional information

Competing interests

No competing interests declared.

Author contributions

Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Resources, Software, Validation, Visualization, Writing - original draft, Writing - review and editing.

Conceptualization, Formal analysis, Methodology, Resources, Software, Validation, Writing - review and editing.

Data curation, Investigation, Methodology, Resources, Writing - review and editing.

Conceptualization, Data curation, Investigation, Methodology, Resources, Writing - review and editing.

Conceptualization, Data curation, Investigation, Methodology, Resources, Writing - review and editing.

Conceptualization, Formal analysis, Funding acquisition, Methodology, Resources, Supervision, Writing - review and editing.

Methodology, Resources, Validation, Writing - review and editing.

Conceptualization, Funding acquisition, Methodology, Resources, Supervision, Writing - review and editing.

Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Supervision, Validation, Writing - original draft, Writing - review and editing.

Additional files

Supplementary file 1. All the supplementary tables.
elife-76162-supp1.docx (102.3KB, docx)
Transparent reporting form

Data availability

Amplicon sequencing data used to generate fitness landscapes in this study have been deposited in the NCBI Sequence Read Archive under BioProject PRJNA646809. Other data that support the findings of this study is deposited at Dryad at https://doi.org/10.5061/dryad.ht76hdrhq and all code is made available at GitHub at https://github.com/anigupta12/fitness-landscape-paper, (copy archived at swh:1:rev:19292491a2768444bd009da8a3de6d25ce8ab18c).

The following datasets were generated:

Gupta A, Zaman L, Strobel HM, Gallie J, Burmeister AR, Kerr B, Tamar ES, Kishony R, Meyer JR. 2022. Host-parasite coevolution promotes innovation through deformations in fitness landscapes. Dryad Digital Repository.

Gupta A. 2020. Fitness landscape of bacteriophage lambda with ancestral host and resitant host. NCBI BioProject. PRJNA646809

References

  1. Akaike H. A new look at the statistical model identification. IEEE Transactions on Automatic Control. 1974;19:716–723. doi: 10.1109/TAC.1974.1100705. [DOI] [Google Scholar]
  2. Baba T, Ara T, Hasegawa M, Takai Y, Okumura Y, Baba M, Datsenko KA, Tomita M, Wanner BL, Mori H. Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Molecular Systems Biology. 2006;2:2006.0008. doi: 10.1038/msb4100050. [DOI] [PMC free article] [PubMed] [Google Scholar]
  3. Benjamini Y, Hochberg Y. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society. 1995;57:289–300. doi: 10.1111/j.2517-6161.1995.tb02031.x. [DOI] [Google Scholar]
  4. Bergelson J, Kreitman M, Petrov DA, Sanchez A, Tikhonov M. Functional biology in its natural context: A search for emergent simplicity. eLife. 2021;10:e67646. doi: 10.7554/eLife.67646. [DOI] [PMC free article] [PubMed] [Google Scholar]
  5. Blount ZD, Lenski RE, Losos JB. Contingency and determinism in evolution: Replaying life’s tape. Science. 2018;362:eaam5979. doi: 10.1126/science.aam5979. [DOI] [PubMed] [Google Scholar]
  6. Boos W, Böhm A. Learning new tricks from an old dog: MalT of the Escherichia coli maltose system is part of a complex regulatory network. Trends in Genetics. 2000;16:404–409. doi: 10.1016/s0168-9525(00)02086-2. [DOI] [PubMed] [Google Scholar]
  7. Burmeister AR, Lenski RE, Meyer JR. Host coevolution alters the adaptive landscape of a virus. Proceedings of the Royal Society B. 2016;283:20161528. doi: 10.1098/rspb.2016.1528. [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Cervera H, Lalić J, Elena SF. Effect of Host Species on Topography of the Fitness Landscape for a Plant RNA Virus. Journal of Virology. 2016;90:10160–10169. doi: 10.1128/JVI.01243-16. [DOI] [PMC free article] [PubMed] [Google Scholar]
  9. Chaudhry WN, Pleška M, Shah NN, Weiss H, McCall IC, Meyer JR, Gupta A, Guet CC, Levin BR. Leaky resistance and the conditions for the existence of lytic bacteriophage. PLOS Biology. 2018;16:e2005971. doi: 10.1371/journal.pbio.2005971. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Chou HH, Chiu HC, Delaney NF, Segrè D, Marx CJ. Diminishing returns epistasis among beneficial mutations decelerates adaptation. Science. 2011;332:1190–1192. doi: 10.1126/science.1203799. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Darwin C. The Origin of Species. London, UK: John Murray; 1859. [Google Scholar]
  12. Datsenko KA, Wanner BL. One-step inactivation of chromosomal genes in Escherichia coli K-12 using PCR products. PNAS. 2000;97:6640–6645. doi: 10.1073/pnas.120163297. [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. de Visser JAGM, Krug J. Empirical fitness landscapes and the predictability of evolution. Nature Reviews. Genetics. 2014;15:480–490. doi: 10.1038/nrg3744. [DOI] [PubMed] [Google Scholar]
  14. de Visser JAGM, Elena SF, Fragata I, Matuszewski S. The utility of fitness landscapes and big data for predicting evolution. Heredity. 2018;121:401–405. doi: 10.1038/s41437-018-0128-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. Doebeli M. Adaptive Diversification, Monographs in Population Biology. Princeton, New Jersey: Princeton University Press; 2011. [Google Scholar]
  16. Drake JW. A constant rate of spontaneous mutation in DNA-based microbes. PNAS. 1991;88:7160–7164. doi: 10.1073/pnas.88.16.7160. [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Ellis HM, Yu D, DiTizio T, Court DL. High efficiency mutagenesis, repair, and engineering of chromosomal DNA using single-stranded oligonucleotides. PNAS. 2001;98:6742–6746. doi: 10.1073/pnas.121164898. [DOI] [PMC free article] [PubMed] [Google Scholar]
  18. Flynn KM, Cooper TF, Moore FBG, Cooper VS. The environment affects epistatic interactions to alter the topology of an empirical fitness landscape. PLOS Genetics. 2013;9:e1003426. doi: 10.1371/journal.pgen.1003426. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Fowler DM, Fields S. Deep mutational scanning: a new style of protein science. Nature Methods. 2014;11:801–807. doi: 10.1038/nmeth.3027. [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. Fragata I, Blanckaert A, Dias Louro MA, Liberles DA, Bank C. Evolution in the light of fitness landscape theory. Trends in Ecology & Evolution. 2019;34:69–82. doi: 10.1016/j.tree.2018.10.009. [DOI] [PubMed] [Google Scholar]
  21. Gould SJ. Wonderful Life: The Burgess Shale and the Nature of History. ‎ W. W. Norton & Company; 1989. [Google Scholar]
  22. Guerrero RF, Scarpino SV, Rodrigues JV, Hartl DL, Ogbunugafor CB. Proteostasis Environment Shapes Higher-Order Epistasis Operating on Antibiotic Resistance. Genetics. 2019;212:565–575. doi: 10.1534/genetics.119.302138. [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Gupta A. fitness-landscape-paper. swh:1:rev:19292491a2768444bd009da8a3de6d25ce8ab18cGitHub. 2021 https://github.com/anigupta12/fitness-landscape-paper
  24. Hall AR, Scanlan PD, Buckling A. Bacteria‐Phage Coevolution and the Emergence of Generalist Pathogens. The American Naturalist. 2011;177:44–53. doi: 10.1086/657441. [DOI] [PubMed] [Google Scholar]
  25. Hendrix RW. Cold Spring Harbor Monograph Series. Cold Spring Harbor Laboratory Press; 1983. [Google Scholar]
  26. Kelsic ED, Chung H, Cohen N, Park J, Wang HH, Kishony R. RNA Structural Determinants of Optimal Codons Revealed by MAGE-Seq. Cell Systems. 2016;3:563–571. doi: 10.1016/j.cels.2016.11.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
  27. Khan AI, Dinh DM, Schneider D, Lenski RE, Cooper TF. Negative epistasis between beneficial mutations in an evolving bacterial population. Science. 2011;332:1193–1196. doi: 10.1126/science.1203801. [DOI] [PubMed] [Google Scholar]
  28. Kosuri S, Church GM. Large-scale de novo DNA synthesis: technologies and applications. Nature Methods. 2014;11:499–507. doi: 10.1038/nmeth.2918. [DOI] [PMC free article] [PubMed] [Google Scholar]
  29. Kryazhimskiy S, Rice DP, Jerison ER, Desai MM, evolution M. Microbial evolution. Global epistasis makes adaptation predictable despite sequence-level stochasticity. Science. 2014;344:1519–1522. doi: 10.1126/science.1250939. [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. Lee JM, Huddleston J, Doud MB, Hooper KA, Wu NC, Bedford T, Bloom JD. Deep mutational scanning of hemagglutinin helps predict evolutionary fates of human H3N2 influenza variants. PNAS. 2018;115:E8276–E8285. doi: 10.1073/pnas.1806133115. [DOI] [PMC free article] [PubMed] [Google Scholar]
  31. Lindsey HA, Gallie J, Taylor S, Kerr B. Evolutionary rescue from extinction is contingent on a lower rate of environmental change. Nature. 2013;494:463–467. doi: 10.1038/nature11879. [DOI] [PubMed] [Google Scholar]
  32. MacLean RC, Perron GG, Gardner A. Diminishing returns from beneficial mutations and pervasive epistasis shape the fitness landscape for rifampicin resistance in Pseudomonas aeruginosa. Genetics. 2010;186:1345–1354. doi: 10.1534/genetics.110.123083. [DOI] [PMC free article] [PubMed] [Google Scholar]
  33. Maddamsetti R, Johnson DT, Spielman SJ, Petrie KL, Marks DS, Meyer JR. Gain‐of‐function experiments with bacteriophage lambda uncover residues under diversifying selection in nature. Evolution. 2018;72:2234–2243. doi: 10.1111/evo.13586. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Merrell DJ. The Adaptive Seascape: The Mechanism of Evolution. University of Minnesota Press; 1994. [Google Scholar]
  35. Meyer JR, Dobias DT, Weitz JS, Barrick JE, Quick RT, Lenski RE. Repeatability and Contingency in the Evolution of a Key Innovation in Phage Lambda. Science. 2012;335:428–432. doi: 10.1126/science.1214449. [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Meyer JR, Gudelj I, Beardmore R. Biophysical mechanisms that maintain biodiversity through trade-offs. Nature Communications. 2015;6:6278. doi: 10.1038/ncomms7278. [DOI] [PubMed] [Google Scholar]
  37. Meyer JR, Dobias DT, Medina SJ, Servilio L, Gupta A, Lenski RE. Ecological speciation of bacteriophage lambda in allopatry and sympatry. Science. 2016;354:1301–1304. doi: 10.1126/science.aai8446. [DOI] [PubMed] [Google Scholar]
  38. Nahum JR, West J, Althouse BM, Zaman L, Ofria C, Kerr B. Improved adaptation in exogenously and endogenously changing environments. Proceedings of the 14th European Conference on Artificial Life ECAL 2017; 2017. pp. 306–313. [DOI] [Google Scholar]
  39. Ogbunugafor CB, Wylie CS, Diakite I, Weinreich DM, Hartl DL. Adaptive Landscape by Environment Interactions Dictate Evolutionary Dynamics in Models of Drug Resistance. PLOS Computational Biology. 2016;12:e1004710. doi: 10.1371/journal.pcbi.1004710. [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Orr HA. The genetic theory of adaptation: a brief history. Nature Reviews. Genetics. 2005;6:119–127. doi: 10.1038/nrg1523. [DOI] [PubMed] [Google Scholar]
  41. Palmer AC, Toprak E, Baym M, Kim S, Veres A, Bershtein S, Kishony R. Delayed commitment to evolutionary fate in antibiotic resistance fitness landscapes. Nature Communications. 2015;6:7385. doi: 10.1038/ncomms8385. [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Paterson S, Vogwill T, Buckling A, Benmayor R, Spiers AJ, Thomson NR, Quail M, Smith F, Walker D, Libberton B, Fenton A, Hall N, Brockhurst MA. Antagonistic coevolution accelerates molecular evolution. Nature. 2010;464:275–278. doi: 10.1038/nature08798. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Petrie KL, Palmer ND, Johnson DT, Medina SJ, Yan SJ, Li V, Burmeister AR, Meyer JR. Destabilizing mutations encode nongenetic variation that drives evolutionary innovation. Science. 2018;359:1542–1545. doi: 10.1126/science.aar1954. [DOI] [PubMed] [Google Scholar]
  44. Pigliucci M, Müller GB. Evolution—the Extended Synthesis High-Dimensional Fitness Landscapes and Speciation. MIT Press; 2010. [DOI] [Google Scholar]
  45. R Development Core Team . Vienna, Austria: R Foundation for Statistical Computing; 2019. http://www.r-project.org [Google Scholar]
  46. Russ D, Glaser F, Shaer Tamar E, Yelin I, Baym M, Kelsic ED, Zampaloni C, Haldimann A, Kishony R. Escape mutations circumvent a tradeoff between resistance to a beta-lactam and resistance to a beta-lactamase inhibitor. Nature Communications. 2020;11:15666. doi: 10.1038/s41467-020-15666-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Sambrook J, Russell DW. Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory Press; 2001. [Google Scholar]
  48. Shao Y, Wang IN. Bacteriophage adsorption rate and optimal lysis time. Genetics. 2008;180:471–482. doi: 10.1534/genetics.108.090100. [DOI] [PMC free article] [PubMed] [Google Scholar]
  49. Shuman HA, Silhavy TJ. The art and design of genetic screens: Escherichia coli. Nature Reviews. Genetics. 2003;4:419–431. doi: 10.1038/nrg1087. [DOI] [PubMed] [Google Scholar]
  50. Steinberg B, Ostermeier M. Environmental changes bridge evolutionary valleys. Science Advances. 2016;2:e1500921. doi: 10.1126/sciadv.1500921. [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Thompson JN, Cunningham BM. Geographic structure and dynamics of coevolutionary selection. Nature. 2002;417:735–738. doi: 10.1038/nature00810. [DOI] [PubMed] [Google Scholar]
  52. Thompson JN. The Geographic Mosaic of Coevolution, Interspecific Interactions. Chicago, Illinois: University of Chicago Press; 2005. [Google Scholar]
  53. Thompson JN. The Role of Coevolution. Science. 2012;335:410–411. doi: 10.1126/science.1217807. [DOI] [PubMed] [Google Scholar]
  54. Valen LV. A new evolutionary law. Evolutionary Theory. 1973;1:1–30. [Google Scholar]
  55. Wang HH, Isaacs FJ, Carr PA, Sun ZZ, Xu G, Forest CR, Church GM. Programming cells by multiplex genome engineering and accelerated evolution. Nature. 2009;460:894–898. doi: 10.1038/nature08187. [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Weinreich DM, Delaney NF, Depristo MA, Hartl DL. Darwinian evolution can follow only very few mutational paths to fitter proteins. Science. 2006;312:111–114. doi: 10.1126/science.1123539. [DOI] [PubMed] [Google Scholar]
  57. Wright S. The Roles of Mutation, Inbreeding, Crossbreeding, and Selection in Evolution. University of chicago; 1932. [Google Scholar]
  58. Zaman L, Meyer JR, Devangam S, Bryson DM, Lenski RE, Ofria C. Coevolution drives the emergence of complex traits and promotes evolvability. PLOS Biology. 2014;12:e1002023. doi: 10.1371/journal.pbio.1002023. [DOI] [PMC free article] [PubMed] [Google Scholar]

Editor's evaluation

C Brandon Ogbunugafor 1

This study uses the parlance and framing of the fitness landscape to articulate a co-evolution story between host and parasite. It utilizes a tractable system, bacteriophage λ and E. coli, to ask questions that unite different pillars of evolutionary theory – evolutionary genetics (via the fitness landscape analogy), co-evolution, and host-parasite interactions. The findings will be relevant to a number of audiences, and will likely spawn downstream studies that further interrogate the molecular specifics that underlie host-parasite co-evolution.

Decision letter

Editor: C Brandon Ogbunugafor1
Reviewed by: Jeremy Barr2

Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.

Decision letter after peer review:

Thank you for submitting your article "Host-parasite coevolution promotes innovation through deformations in fitness landscapes" for consideration by eLife. Your article has been reviewed by 2 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Aleksandra Walczak as the Senior Editor. The following individuals involved in the review of your submission have agreed to reveal their identity: Nkrumah Grant (Reviewer #1), Jeremy Barr (Reviewer #2).

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Essential revisions:

1) Please pay close attention to comments from reviewers with respect to the choice of study sub-population (why did the authors choose population D7 from Meyer et al. 2012 )?

2) Other stylistic and communication-based considerations as described by the reviewers. Some involve figure clarity. Also, please consider comments made with respect to items that are in the supplemental material.

Reviewer #1 (Recommendations for the authors):

1. Overall the manuscript is written well, but it could use some soft editing.

2. Consider making the color scheme of the simulations and biological data different. For example, figure 2 shows simulation data in blue and yellow, whereas biological data is shown in figures, 3 (panels B and C) and 4.

3. Please clarify why you chose population D7 from Meyer et al. 2012 to analyze further. Why only one population? How generalizable are your results?

4. It would be extremely beneficial to readers if the figures describing your methods were expanded upon (add figure showing MAGE-seq and input/outputs of simulations).

Reviewer #2 (Recommendations for the authors):

• I recommend including and potentially reworking supplemental figure 1 into the main text to provide a visual overview of the experimental system and also encourage the authors to visually recreate the λ co-evolutionary mechanisms explained in text in lines 41-47 on pg2. This would help with the readability of the manuscript, especially for readers not familiar with this model system.

• Similarly, the abstract appears intentionally vague and perhaps too broad at times. The authors could attempt to rework this to more clearly explain the co-evolutionary mechanisms that were seen in the study.

• I don't understand the rationale for numbering the supplemental figures in the way the authors have done so. Having sub-supplemental figures for each main figure is not logical and I recommend simply labeling supplemental figures in order of appearance in the main text.

• I really enjoyed the experimental design and approach to disentangling these co-evolutionary dynamics. This is more a comment rather than something to be addressed, but the phylogenetic reconstruction of λ genotypes in Figure 3A shows the emergence of A**C and ABC genotypes from a similar genetic background. How likely do the authors think that the C genotype emergence twice simultaneously and have they instead considered recombination playing a role?

eLife. 2022 Jul 6;11:e76162. doi: 10.7554/eLife.76162.sa2

Author response


Essential revisions:

1) Please pay close attention to comments from reviewers with respect to the choice of study sub-population (why did the authors choose population D7 from Meyer et al. 2012 )?

We have added the sentences: ‘Population ‘D7’ was chosen because λ evolved relatively few mutations in this population. We believed this choice was conservative and constitutes a strong test of our hypothesis since fewer λ mutations would provide fewer opportunities to detect host-induced contingency.’

2) Other stylistic and communication-based considerations as described by the reviewers. Some involve figure clarity. Also, please consider comments made with respect to items that are in the supplemental material.

We have followed the advice of the reviewers and made many stylistic changes. These are described in our response to the reviewers.

Reviewer #1 (Recommendations for the authors):

1. Overall the manuscript is written well, but it could use some soft editing.

We thank the reviewer for their comment and suggestion. We have re-read the entire manuscript and made minor edits at 17 different places. None of these changes altered the content of the manuscript.

2. Consider making the color scheme of the simulations and biological data different. For example, figure 2 shows simulation data in blue and yellow, whereas biological data is shown in figures, 3 (panels B and C) and 4.

We carefully considered this suggestion; however, we chose to leave the color scheme as-is. The color scheme contrasts the two hosts and we wanted to keep it consistent throughout the manuscript. We have modified the y-axis label in figure 2 to make it clearer that the results are from computer simulations by adding ‘(in silico)’. We also change the figure caption’s first sentence to emphasize that the results stem from simulations. ‘Simulation results of the frequency of OmpF-use evolution observed when fitness landscapes were shifted at different frequencies.’

3. Please clarify why you chose population D7 from Meyer et al. 2012 to analyze further. Why only one population? How generalizable are your results?

We have added the sentences: ‘D7 was chosen because λ evolved relatively few mutations in this population. We believed this choice was conservative and constitutes a strong test of our hypothesis since fewer λ mutations would provide fewer opportunities to detect host-induced contingency.’

These results are generalizable given that they align with the fitness landscape simulations where we used J mutations evolved in tens of independent λ populations. They were also in line with results from the replay experiments where replicate λ populations were free to evolve any mutations – not just the ones evolved in D7 – and they showed no capacity to innovate when cocultured with malT hosts.

4. It would be extremely beneficial to readers if the figures describing your methods were expanded upon (add figure showing MAGE-seq and input/outputs of simulations).

Thank you for this recommendation. We have added a new figure (Figure 2—figure supplement 1) to describe the simulations and we have edited the figure supplement 1 of Figure 1 to provide a more detailed schematic illustration of MAGE-Seq to the reader.

Reviewer #2 (Recommendations for the authors):

• I recommend including and potentially reworking supplemental figure 1 into the main text to provide a visual overview of the experimental system and also encourage the authors to visually recreate the λ co-evolutionary mechanisms explained in text in lines 41-47 on pg2. This would help with the readability of the manuscript, especially for readers not familiar with this model system.

We have incorporated the suggestion by adding a new figure in the Discussion section (new Figure 5) that shows both the previous model of coevolution and our new understanding of it after this study. We believe this will greatly help the readers who are not familiar with this coevolutionary model system. We also considered moving supplemental figure 1 to the main text but decided not to out of our own stylistic preference. We would like to note that eLife online publication makes it very easy to see figures, as they and all the supplements are well-linked with pop-open views embedded within the main text.

• Similarly, the abstract appears intentionally vague and perhaps too broad at times. The authors could attempt to rework this to more clearly explain the co-evolutionary mechanisms that were seen in the study.

We rewrote the second half of the abstract to be more specific.

“During the struggle for survival, populations occasionally evolve new functions that give them access to untapped ecological opportunities. Theory suggests that coevolution between species can promote the evolution of such innovations by deforming fitness landscapes in ways that open new adaptive pathways. We directly tested this idea by using high throughput gene editing-phenotyping technology (MAGE-Seq) to measure the fitness landscape of a virus, bacteriophage λ, as it coevolved with its host, the bacterium Escherichia coli. Analyses of the empirical fitness landscape revealed mutation-by-mutation-by-host-genotype interactions that prove coevolution modified the contours of λ’s landscape. Computer simulations of λ’s evolution on a static versus shifting fitness landscape showed that the changes in contours increased λ’s chances of evolving the ability to use a new host receptor. By coupling sequencing and pairwise competition experiments, we demonstrated that the first mutation λ evolved enroute to the innovation would only evolve in the presence of the ancestral host, whereas later steps in λ’s evolution required the shift to a resistant host. When time-shift replays of the coevolution experiment were run where host evolution was artificially accelerated, λ did not innovate to use the new receptor. This study provides direct evidence for the role of coevolution in driving evolutionary novelty and provides a quantitative framework for predicting evolution in coevolving ecological communities.”

• I don't understand the rationale for numbering the supplemental figures in the way the authors have done so. Having sub-supplemental figures for each main figure is not logical and I recommend simply labeling supplemental figures in order of appearance in the main text.

This is an eLife formatting specification. The first version of the manuscript we submitted had supplemental figures in the more traditional format, but we were asked by the editorial office to re-submit the manuscript in the current form.

• I really enjoyed the experimental design and approach to disentangling these co-evolutionary dynamics. This is more a comment rather than something to be addressed, but the phylogenetic reconstruction of λ genotypes in Figure 3A shows the emergence of A**C and ABC genotypes from a similar genetic background. How likely do the authors think that the C genotype emergence twice simultaneously and have they instead considered recombination playing a role?

We have speculated about the source of C too and we do not have an answer. Some thoughts: We have observed a striking level of parallel evolution between populations separated in different flasks, so it’s possible that C arose multiple times within the same population. However, λ has an active homologous recombination system and we have observed recombination in these studies between the phage and host-encoded phage genes (Borin et al. PNAS 2021), so it could happen. The MOI of these studies is low – fewer phage than bacteria – so that would suggest there’s little opportunity for coinfection and recombination.

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Data Citations

    1. Gupta A, Zaman L, Strobel HM, Gallie J, Burmeister AR, Kerr B, Tamar ES, Kishony R, Meyer JR. 2022. Host-parasite coevolution promotes innovation through deformations in fitness landscapes. Dryad Digital Repository. [DOI] [PMC free article] [PubMed]
    2. Gupta A. 2020. Fitness landscape of bacteriophage lambda with ancestral host and resitant host. NCBI BioProject. PRJNA646809

    Supplementary Materials

    Supplementary file 1. All the supplementary tables.
    elife-76162-supp1.docx (102.3KB, docx)
    Transparent reporting form

    Data Availability Statement

    Amplicon sequencing data used to generate fitness landscapes in this study have been deposited in the NCBI Sequence Read Archive under BioProject PRJNA646809. The code and data that support the findings of this study are available at https://github.com/anigupta12/fitness-landscape-paper, copy archived at swh:1:rev:19292491a2768444bd009da8a3de6d25ce8ab18c (Gupta, 2021).

    Sharing of biological material requires a material transfer agreement, described at http://blink.ucsd.edu/research/conducting-research/mta/index.html.

    Amplicon sequencing data used to generate fitness landscapes in this study have been deposited in the NCBI Sequence Read Archive under BioProject PRJNA646809. Other data that support the findings of this study is deposited at Dryad at https://doi.org/10.5061/dryad.ht76hdrhq and all code is made available at GitHub at https://github.com/anigupta12/fitness-landscape-paper, (copy archived at swh:1:rev:19292491a2768444bd009da8a3de6d25ce8ab18c).

    The following datasets were generated:

    Gupta A, Zaman L, Strobel HM, Gallie J, Burmeister AR, Kerr B, Tamar ES, Kishony R, Meyer JR. 2022. Host-parasite coevolution promotes innovation through deformations in fitness landscapes. Dryad Digital Repository.

    Gupta A. 2020. Fitness landscape of bacteriophage lambda with ancestral host and resitant host. NCBI BioProject. PRJNA646809


    Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd

    RESOURCES