Abstract
Adaptation dynamics on fitness landscapes is often studied theoretically in the strong-selection, weak-mutation regime. However, in a large population, multiple beneficial mutants can emerge before any of them fixes in the population. Competition between mutants is known as clonal interference, and while it is known to slow down the rate of adaptation (when compared to the strong-selection, weak-mutation model with the same parameters), how it affects the shape of long-term fitness trajectories in the presence of epistasis is an open question. Here, by considering how changes in fixation probabilities arising from weak clonal interference affect the dynamics of adaptation on fitness-parameterized landscapes, we find that the change in the shape of the fitness trajectory arises only through changes in the supply of beneficial mutations (or equivalently, the beneficial mutation rate). Furthermore, a depletion of beneficial mutations as a population climbs up the fitness landscape can speed up the rescaled fitness trajectory (where adaptation speed is measured relative to its value at the start of the experiment), while an enhancement of the beneficial mutation rate has the opposite effect and slows it down. Our findings suggest that by carrying out evolution experiments in both regimes (with and without clonal interference), one could potentially distinguish the different sources of macroscopic epistasis (change in the fitness effects of mutations vs change in the fraction of beneficial mutations).
Keywords: fitness trajectories, clonal interference, macroscopic epistasis, fitness landscapes
Introduction
For unicellular organisms, as individuals in an asexual population gain mutations and increase in fitness, how their average fitness increases over evolutionary timescales [which we refer to as the average fitness trajectory F(t)] depends on the distribution of fitness effects s of potential new mutations (i.e. mutations that can potentially occur). This distribution may change as a population evolves, an effect known as macroscopic epistasis (Fig. 1a) (Good and Desai 2015). This could come about, for example, when the fitness effects of mutations depend on the state of other genes or the presence of other mutations (Chou et al. 2011; Khan et al. 2011; Wang et al. 2013). The effect of such changes in this distribution on fitness trajectories has been studied in the context of fitness-parameterized landscapes, where the distribution is assumed to depend only on the fitness x of the cell (Kryazhimskiy et al. 2009; Good and Desai 2015).
Within this framework, adaptation dynamics observed experimentally can be used to infer features of the underlying fitness landscape or equivalently, the type of epistasis present in the system (Wiser et al. 2013; Kryazhimskiy et al. 2014; Good and Desai 2015). The shape of fitness trajectories (which we characterize using its functional form) is typically studied theoretically in the strong-selection, weak-mutation (SSWM) regime, in which the time for beneficial mutations to fix is much shorter than the time for successful beneficial mutations to emerge (Desai and Fisher 2007). However, in large populations, multiple beneficial mutants can emerge before any of them fixes. Competition between beneficial mutants in different lineages is known as clonal interference (Fig. 1b) and has been shown to reduce fixation probabilities (Gerrish and Lenski 1998; Lin et al. 2020), and hence to slow down the rate of adaptation when compared to the SSWM model with the same parameters. There have also been suggestions that clonal interference slows down the form (i.e. shape) of fitness trajectories (Wiser et al. 2013). Here, we are interested in the functional forms of the fitness trajectories, and therefore the trajectories are compared after scaling time such that they have identical initial slopes; we consider a trajectory to be slower than another if it has a slower fitness increase after this re-scaling (Fig. 1c). In particular, the slow fitness trajectory observed in Lenski’s long-term evolution experiment (where the fitness appears to continue increasing over a long period of time without reaching any plateau) was previously attributed to both diminishing returns epistasis and clonal interference (Wiser et al. 2013). However, the relative contributions from these two factors have not been explored, and it is not clear if it is generally the case that clonal interference slows down the form of fitness trajectories in the presence of epistasis.
Here, by considering how changes in fixation probabilities arising from weak clonal interference affect the dynamics of adaptation on fitness-parameterized landscapes, we find that the change in the form of the fitness trajectory due to clonal interference arises only through changes in the supply of beneficial mutations (or equivalently, the beneficial mutation rate), independent of any changes in the average fitness effect of beneficial mutations. This implies that as long as the fraction of beneficial mutations stays the same, the functional form of the fitness trajectory will be the same with or without clonal interference, even if the mean fitness effect of beneficial mutations decreases over time (i.e. diminishing returns). Furthermore, a depletion of beneficial mutations as a population climbs up the fitness landscape can speed up the functional form of the fitness trajectory (while an enhancement of the beneficial mutation rate slows it down). These findings suggest that by carrying out evolution experiments in both regimes (with and without clonal interference), one could potentially distinguish the different sources of macroscopic epistasis (change in fitness effects of mutations vs change in the fraction of beneficial mutations).
Results
We consider the Moran process, where a population has a constant size N, and each cell divides at some rate which we call the fitness of the cell. Whenever a cell divides, a random cell is simultaneously removed from the population. Experimentally, this corresponds, approximately, to growing cells in a continuous culture (e.g. in a turbidostat) (Bryson and Szybalski 1952; Moran et al. 1964). During each division event, there is some probability that the daughter cell gains a new mutation (Fig. 1a). We assume that the fitness effects of deleterious mutations are typically much larger than those of beneficial ones, such that beneficial mutations rarely compensate for deleterious mutations, and hence deleterious mutations cannot fix. The adaptation of the population is therefore driven by the probability μb of gaining a beneficial mutation during division, which is a dimensionless quantity that we will refer to as the beneficial mutation rate.
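As a minimal sketch of the process just described, the Python snippet below follows a single mutant lineage in an otherwise clonal Moran population and estimates its fixation probability by Monte Carlo. It assumes that no further mutations occur and that the mutant divides r times faster than the resident. The function names, the seed, and the default number of trials are illustrative choices, not taken from the authors' MATLAB code. The estimate can be compared with the standard closed-form Moran result (1 − 1/r)/(1 − r^(−N)).

```python
import random

def moran_fixation_exact(r, N):
    """Closed-form fixation probability of one mutant with relative fitness r."""
    if r == 1.0:
        return 1.0 / N  # neutral mutant
    return (1.0 - 1.0 / r) / (1.0 - r ** (-N))

def moran_fixation_mc(r, N, trials=20000, seed=1):
    """Monte Carlo estimate that tracks only events changing the mutant count.

    With k mutants, the count goes up at a rate proportional to r*k*(N-k)/N
    and down at a rate proportional to (N-k)*k/N, so conditional on a change
    the count increases with probability r / (1 + r).
    """
    rng = random.Random(seed)
    p_up = r / (1.0 + r)
    fixed = 0
    for _ in range(trials):
        k = 1  # start from a single mutant cell
        while 0 < k < N:
            k += 1 if rng.random() < p_up else -1
        fixed += (k == N)
    return fixed / trials

if __name__ == "__main__":
    print(moran_fixation_exact(1.5, 6), moran_fixation_mc(1.5, 6))
```

For large N and r > 1 the closed form approaches 1 − 1/r, which is the clonal-population limit referred to later in the text.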
Inspired by previous theoretical studies (Gillespie 1984; Orr 2003), we assume an exponential distribution for the fitness effects of beneficial mutations. Such a distribution also seems to have some experimental support (Imhof and Schlötterer 2001; Rokyta et al. 2005; Kassen and Bataillon 2006). Nevertheless, there are other data that suggest otherwise (Rokyta et al. 2008; Levy et al. 2015), and we show in Appendix B and Appendix C that our main conclusions hold for a more general class of distributions. Generalizing the approach of Kryazhimskiy et al. (2009), given the current fitness of a cell x, the distribution of its beneficial mutant fitness values y > x is assumed to be given by:
(1) |
where f(x) represents how the mean fitness effects of mutations vary with current fitness, and h(x) governs how the fraction of beneficial mutations changes with the current fitness of the cell. The beneficial mutation rate μb at fitness x is then set by h(x) together with the initial beneficial mutation rate.
For any given fitness x, the corresponding distribution of selection coefficients is then given by
(2) |
where the rate parameter of this exponential distribution is the inverse of the mean selection coefficient of beneficial mutations. Within this model, macroscopic epistasis can act through changing the mean of beneficial mutation effects as the population increases in fitness, through changing the availability of new beneficial mutations, or a combination of both.
This framework reduces back to other previously known models for specific forms of f(x) (or equivalently, ) and h(x). For example, the House of cards (HoC)/Uncorrelated fitness landscape is the case where is independent of x and corresponds to and (Kryazhimskiy et al. 2009). This implies that while the absolute mean fitness increase conferred by mutations stays the same, the availability of beneficial mutations decreases exponentially as fitness increases. Similarly, the nonepistatic (NEPI) fitness landscape is the case where the distribution of fitness effects of mutations is independent of genotype, i.e. is only a function of (Kryazhimskiy et al. 2009), which corresponds to having and h(x) = 0. The Stairway to heaven (STH) fitness landscape (Kryazhimskiy et al. 2009) is the case where is independent of fitness, and corresponds to having while h(x) = 0. The diminishing returns epistasis model adopted by Wiser et al. (2013) can also be mapped to this fitness-parameterized framework with for and h(x) = 0.
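The two properties stated for the House of cards landscape can be checked with a short worked example. As an illustrative assumption, suppose the mutant fitness y is drawn from a fixed exponential density with rate λ, whatever the current fitness x (λ is a hypothetical symbol introduced here and need not match the paper's notation). The fraction of draws that are beneficial then decays exponentially with x, while the mean fitness gain of a beneficial draw stays constant:

```latex
% Assumption: y has the fixed density \lambda e^{-\lambda y} on y \ge 0,
% independent of the current fitness x (\lambda is a hypothetical symbol).
P(y > x) = \int_x^{\infty} \lambda e^{-\lambda y}\,\mathrm{d}y = e^{-\lambda x},
\qquad
\mathbb{E}\left[\,y - x \mid y > x\,\right] = \frac{1}{\lambda}.
```

The second identity is the memoryless property of the exponential distribution.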
We assume that the beneficial mutation rate is small enough that the population is typically in a monomorphic state. When a new beneficial mutation with fitness effect s emerges, it fixes with some probability that depends on s. The average fitness trajectory F(t) on such fitness-parameterized landscapes is then approximately given by (Kryazhimskiy et al. 2009):
(3) |
where time t here refers to real time (i.e. the actual amount of time that has passed in a continuous culture), in contrast to the number of generations often considered in other studies (Kryazhimskiy et al. 2009; Wiser et al. 2013). This definition of time is why, compared to those previous studies, we have an additional factor of F in the equation (since we have defined fitness to be the growth rate, such that the number of generations in a small time interval δt is F δt).
In the SSWM regime [ (Desai and Fisher 2007)], . Substituting from Equation (2) and defining (the selection coefficient of a mutation relative to the average selection coefficient), the dynamics of fitness in this regime are given by:
(4) |
where the functional form of the fitness trajectory depends on the type of epistasis specified through and h(F). For example, the HOC landscape () gives , the NEPI landscape (h(F) = 0, ) gives , and the STH landscape (, h(F) = 0) gives a function F(t) that increases faster than linearly. The diminishing returns model (, h(F) = 0) by Wiser et al. gives a power law trajectory with exponent . (Note that this exponent is slightly different from that in Wiser et al. because of the difference in how time is defined. Also, in Wiser et al. clonal interference is taken into account in addition to the assumption of the diminishing returns model. We will show later that taking into account clonal interference has no effect on this exponent.)
When a population is sufficiently large, after a mutation has escaped loss via genetic drift and the number of mutants starts to increase deterministically, there is some probability that another new successful mutation emerges in the resident population while the first mutant is still in the process of taking over the population. The second mutant could then potentially out-compete the first mutant, causing the first mutation to go extinct. (Note that if a second mutation were to occur in a cell that already contains the first mutation, it is not considered an interfering mutation because if the second mutation were to fix, the first mutation would fix along with it. In this case, we also assume that this second mutation does not affect the fixation probability of the first mutation since the first mutant was already on its way to fixation even without the second mutation.) Following the approach in Gerrish and Lenski (1998), for a mutant that emerges in a clonal population containing individuals with fitness F, the probability of it fixing will then be the probability that it escapes loss via genetic drift (and would go on to fix in the absence of any interference) and is not out-competed by any other mutation that could potentially emerge in the background population while the first mutant population is growing in size:
(5) |
where is the average number of generations it takes for a mutant to fix without any interference (assuming deterministic, logistic growth of the mutant strain). is the rate at which a successful interfering mutation emerges at generation after the first mutant sub-population starts growing in size, and is given by the rate at which the resident population gains another more beneficial mutation which subsequently goes on to fix:
(6) |
where is the fraction of resident cells (again from logistic growth of the mutant strain), is the distribution of fitness effects of mutations that can arise in a cell with fitness F (Equation 2), and is the fixation probability of the interfering mutant with fitness effect sI. Intuitively, we expect to increase with n since the closer the first mutant is to fixing (i.e. lower n), the harder it is for the new mutant to outcompete the original mutant. In particular, one might also expect the probability of the interfering mutant fixing in this mixed population to be the same as that in a clonal population with the same average population growth rate (i.e. all cells grow at this rate). We will show that this is indeed true for all n when the growth rate of the mutant is much larger than that of the population-averaged growth rate (but not in general).
Assuming that clonal interference is weak such that at most one successful competing mutation occurs during the time a mutant is trying to fix (i.e. no further mutations occur after the emergence of the second mutant), the extinction probability of the new (interfering) mutant (in the large N limit) can be found by considering the different events that can happen (different cell types dividing and leaving the population) in the next time step and requiring all remaining mutants to go extinct (Appendix A). The fixation probability is then found to be given by the solution to the following equation (Appendix A):
(7) |
where and , and the boundary conditions are given by and , which are also the fixation probabilities of a mutant in a clonal population within the Moran process (Moran et al. 1964; Nowak 2006).
When the first two terms in Equation (7) dominate (either when n is close to 0 or 1, or when ), we recover the intuitive result that the fixation probability is the selective advantage of the interfering mutant over that of the population average:
(8) |
where we have also taken the limit where fitness effects are generally small . For simplicity and concreteness, we will assume this specific form for in the rest of the paper, but we argue in Appendix B that adopting the more general solution of Equation (7) does not affect our main conclusions.
Within this approach, we have considered the effect of potential interfering mutants on the fixation probability of a mutant that initially arose in a clonal population. This implicitly assumes that clonal interference is sufficiently weak that most of the time a mutation emerges when the population is dominated by a single subgroup. We elaborate on this assumption more extensively in the Discussion section.
The dynamics of fitness in the presence of weak clonal interference is therefore given by (using Equations [3], [5], [6], [8], see Appendix B):
(9) |
where, in the limit ,
(10) |
If the beneficial mutation rate does not depend on fitness [h(F) = 0], we recover the same expression for derived by Gerrish and Lenski (1998).
In the following section, we will explore the effect of clonal interference on the functional form (i.e. shape) of the average fitness trajectory by comparing the dynamics with (Equation 9) and without (Equation 4) clonal interference. When comparing the numerical solutions of Equations (4) and (9), we scale time in the case with clonal interference by a factor such that (since we are interested in comparing functional forms rather than the absolute time dynamics).
To verify our findings, we also carried out full Gillespie simulations of the Moran process, where we keep track of the fitness values of all cells, with the probability of a cell dividing proportional to its fitness (see Appendix D for simulation details). When a mutation event occurs during division, the fitness effect of the mutation is drawn from a distribution based on the fitness of the dividing cell. Within each simulation, the mean population fitness at any time is obtained from an average over all cells in the population. We show in Fig. A2 that the fitness trajectory given by Equation (9) agrees well with our simulation data over a range of mutation rates.
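A minimal sketch of such a simulation, written in Python for illustration (the authors' code is in MATLAB, see Data availability), is given below. It makes several simplifying assumptions that are not taken from the paper: the beneficial mutation rate `mu_b` and the inverse mean selection coefficient `alpha` are constant (no epistasis), a mutant with selection coefficient s has fitness x(1 + s), and the removed cell is chosen uniformly from the whole population. All names and parameter values are hypothetical.

```python
import random

def simulate_moran(N, mu_b, alpha, t_max, seed=0):
    """Gillespie simulation of a Moran process with beneficial mutations.

    Returns the event times and the mean population fitness after each event.
    """
    rng = random.Random(seed)
    fitness = [1.0] * N            # every cell starts with growth rate 1
    t = 0.0
    times, mean_fitness = [0.0], [1.0]
    while True:
        total = sum(fitness)
        t += rng.expovariate(total)          # waiting time to the next division
        if t > t_max:
            break
        # The dividing cell is chosen with probability proportional to fitness.
        parent = rng.choices(range(N), weights=fitness)[0]
        daughter = fitness[parent]
        if rng.random() < mu_b:
            # Assumed effect distribution: s ~ Exponential with mean 1/alpha.
            daughter *= 1.0 + rng.expovariate(alpha)
        # The daughter replaces a uniformly chosen cell (constant population size).
        fitness[rng.randrange(N)] = daughter
        times.append(t)
        mean_fitness.append(sum(fitness) / N)
    return times, mean_fitness

if __name__ == "__main__":
    ts, fs = simulate_moran(N=100, mu_b=0.01, alpha=20.0, t_max=50.0)
    print(len(ts), fs[-1])
```

An average fitness trajectory would be obtained by averaging `mean_fitness` over many seeds on a common time grid. Adding epistasis would amount to making `mu_b` and `alpha` functions of the dividing cell's fitness.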
Change in functional form of fitness trajectory only depends on how beneficial mutation rate changes
Since any additional dependence on F due to the clonal interference term comes only from h(F) (Equations 9 and 10), we expect weak clonal interference to only affect the functional form of the fitness trajectory if the beneficial mutation rate changes with fitness (Figs. 2 and A2). In other words, if h(F) does not depend on F (such as in the case of the diminishing returns model and the STH landscape), even if there is epistasis that acts through the mean selection coefficient of beneficial mutations, clonal interference does not change the functional form of the trajectory (Fig. 2a, left panel in Fig. A2). This is the case because as long as the availability of beneficial mutations does not change, the presence of clonal interference reduces the rate of fitness increase by the same factor at all times.
If h(F) depends on F, whether the form of the trajectory is sped up or slowed down by clonal interference depends on whether h(F) increases or decreases with F. In many commonly used and widely studied models of fitness landscapes, h(F) increases with F, i.e. the beneficial mutation rate decreases with fitness. For example, in the HOC landscape, a new fitness value is drawn from the same distribution whenever a mutation occurs (Park and Krug 2008; Kryazhimskiy et al. 2009), implying that as fitness increases, the probability of drawing a higher value of fitness (for the mutant) decreases. In Fisher’s geometric model (Fisher 1930), the state of the cell lies in a multidimensional phenotypic space, with a certain point being the optimal phenotype, such that the fitness is specified by how close the state of the cell is to the optimal state. Mutations then correspond to drawing random vectors in the space. Within this model, the probability of a mutation providing an improvement (i.e. bringing the population closer to the optimal state) also decreases with fitness (Fisher 1930; Hartl and Taubes 1996; Ram and Hadany 2015). In these cases where h(F) increases with F, clonal interference speeds up the functional form of the trajectory (Fig. 2b, middle panel in Fig. A2). Intuitively, this occurs because as fitness increases, the effect of clonal interference is reduced since there are fewer potential beneficial interfering mutations. This phenomenon was previously described in Park and Krug (2008).
Similarly, in the opposite case where h(F) decreases with F, there are more potential beneficial mutations as the population climbs up the fitness landscape (one could imagine this happening if gaining certain mutations opens up more beneficial mutational paths). In this case, interfering mutations become more common as the population evolves and the form of the trajectory is slowed down by such an effect (Fig. 2c, right panel in Fig. A2).
Besides the average fitness trajectory, we find that these general arguments and results also hold for the average substitution trajectory, i.e. how the average number of fixed mutations increases over time, when is sufficiently small (Appendix E). However, there is a regime where this theory underestimates the number of accumulated mutations while still providing good predictions for the average fitness, suggesting that the excess mutations arise from nearly neutral mutations (Appendix E). We return to this point in the Discussion section.
Proposed experimental protocol for determining how beneficial mutation rate changes
To determine whether and how the availability of beneficial mutations changes over time (with fitness), one can carry out two or more sets of long-term evolution experiments at different population sizes or mutation rates (Sprouffske et al. 2018) spanning both the SSWM regime and the regime with weak clonal interference (Fig. 3a). N can be varied, for example, by carrying out experiments in a turbidostat at different optical densities, while potential ways of varying the overall mutation rate μ (i.e. the probability of gaining a mutation per division) include changing the expression levels of DNA repair enzymes, such as mutH (Sherer and Kuhlman 2020) and Ada (Uphoff et al. 2016) in Escherichia coli, and inducing mutagenesis using UV radiation (Shibai et al. 2017). If N is the quantity being varied, it would be useful to first measure the mutation rate of the initial strain (e.g. by carrying out the Luria–Delbrück fluctuation test) if it is not already known, as this would provide an upper bound for the beneficial mutation rate μb. This can inform the possible choices of N for the experiment. For example, if [and assuming that N is large enough so that fixation of deleterious mutations via genetic drift can be neglected, although in general it may not be true (Silander et al. 2007; Kryazhimskiy et al. 2012)] it would almost certainly be in the SSWM regime, and one can then adopt increasingly large values of N. [One could also sequence samples of the population at regular intervals during the initial stage of the experiment to extract the allele frequency trajectories of all detected mutations (Good et al. 2017); predominantly nonoverlapping selective sweeps would indicate that the population is initially either in the SSWM or weak clonal interference regime.] The average fitness trajectory for a given N and μ can then be obtained by averaging over the replicates in the corresponding set of experiments (Fig. 3b).
In general, the initial rate of fitness increase will differ between sets of experiments (Fig. 3b). For an experimentally obtained average trajectory (where fitness values are obtained at discrete time points), the initial gradient can be estimated by extracting the time point tc at which the measured fitness first exceeds some value Fc (Fig. 3b). The initial gradient can then be approximated by , where we have implicitly assumed that the trajectory is sufficiently well sampled that is not much larger than Fc, i.e. with ϵF controlling the desired time resolution . Since the upper bound to the initial gradient is (Equations 4 and 9), a conservative choice would be for ; often α0 and are a priori unknown, in which case one could start with conservative estimates, take fitness measurements more frequently initially and adapt the measurement frequency as the experiment progresses. Scaling time for each trajectory by a factor proportional to such that all scaled fitness trajectories have approximately the same initial derivative (Fig. 3, c–e), one can then infer how μb changes. In particular, if the different trajectories overlap (or are statistically indistinguishable), it would suggest that μb remains approximately constant (Fig. 3c). If instead the scaled trajectories speed up (slow down) as N or the overall mutation rate is increased, it would suggest that μb decreases (increases) as fitness increases over time (Fig. 3, d and e). Note that since we are comparing the scaled trajectories, the different sets of experiments (with different N or μ) should be run to approximately the same final F rather than for the same period of time (i.e. for smaller N or μ, the experiment should run longer for fitness to increase to the same level). The trajectories should also be sufficiently long for any effect of epistasis (occurring through ) to be captured (since the strength of epistasis that can be detected depends on the range of F spanned by the trajectory).
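The rescaling step of this protocol can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' analysis code: the initial gradient is estimated here as the fitness gain between the first sample and the first sample exceeding Fc, divided by the elapsed time, and time is then multiplied by that gradient so that every rescaled trajectory starts with a slope of about 1. The function names are hypothetical.

```python
import math

def initial_gradient(times, fitness, F_c):
    """Slope between the first sample and the first sample with fitness > F_c."""
    t0, F0 = times[0], fitness[0]
    for t, F in zip(times, fitness):
        if F > F_c and t > t0:
            return (F - F0) / (t - t0)
    raise ValueError("trajectory never exceeds F_c")

def rescale_time(times, fitness, F_c):
    """Return times multiplied by the estimated initial gradient."""
    g = initial_gradient(times, fitness, F_c)
    return [g * (t - times[0]) for t in times]

if __name__ == "__main__":
    # Two synthetic trajectories with the same shape but a tenfold speed difference.
    fast_t = [float(i) for i in range(11)]
    slow_t = [10.0 * t for t in fast_t]
    shape = [1.0 + math.log(1.0 + t) for t in fast_t]
    print(rescale_time(fast_t, shape, F_c=1.5)[:3])
    print(rescale_time(slow_t, shape, F_c=1.5)[:3])
```

In this synthetic example the two rescaled time axes coincide, which is the overlap expected when the beneficial mutation rate stays constant. Repeating the comparison for several values of `F_c` is a simple way to check that a conclusion does not hinge on how the initial gradient was estimated.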
To test this protocol, we applied it to synthetic data obtained from our numerical simulations (Fig. 4, three leftmost columns). We coarsely sample the fitness trajectories, such that the time between successive measurements is at least a hundred time units (with the initial growth rate set to be 1), and focus on analyzing data from the diminishing-returns (Fig. 4a) and the House-of-cards landscapes (Fig. 4b). We find that the qualitative observation of how the shape of the trajectories changes with N remains approximately the same across different values of Fc (Fig. 4, three leftmost columns), suggesting that the inference is robust to how the initial fitness gradients are estimated. This is also consistent with the effect of varying Fc when rescaling trajectories obtained from solving Equation (9) (Appendix F, Fig. A5). In practice, experimental data could be noisier, and it would be useful to compare the trajectories using multiple values of Fc to ensure that any trends remain consistent.
An alternative way of extracting the initial gradient would be to fit the trajectories directly to Equation (9). We assume that and such that , and treat α0, and γ as parameters of the cell to be inferred from the fitness trajectories. We find that minimizing the least-squares error between the synthetic data and the trajectories obtained from solving Equation (9) allows a good recovery of the underlying landscape parameters (Fig. 4, right column). While these parameters already directly provide information about the type of epistasis present, they also allow a direct calculation of the initial fitness gradient (using Equation 9) which can again be used for visualizing the scaled trajectories (Fig. 4, right column). It is important to note that this approach requires the assumption of some general form for and h(F) which we typically have no knowledge of. Nevertheless, this fitting procedure can still be potentially useful for testing specific classes of landscapes and for distinguishing the relative types of epistasis (e.g. by comparing the magnitudes of and γ).
Discussion
In the SSWM regime, different types of epistasis have been known to give very similar fitness trajectories. For example, both the HoC landscape (where the mean fitness effect stays the same but beneficial mutations deplete exponentially with fitness) and diminishing returns epistasis with constant beneficial mutation rates can give rise to a slow, approximately logarithmic fitness trajectory that does not seem to approach any plateau. Our results suggest that one way of distinguishing whether it is the average fitness effects of potential mutations or the average fraction of available mutations (or both) that is changing over time is to carry out the same set of evolution experiments with two or more different population sizes (Kryazhimskiy et al. 2012) or mutation rates (Sprouffske et al. 2018) (in both the SSWM regime and the regime with weak clonal interference), and compare their fitness trajectories.
Although we have only considered adaptation dynamics on fitness-parameterized landscapes, it is possible for such macroscopic epistasis to arise from a microscopic model of fitness landscape that explicitly takes into account interactions between genes (i.e. microscopic epistasis) (Guo et al. 2019; Reddy and Desai 2021). Furthermore, in many of these microscopic models, the rate of beneficial mutations decreases as the population climbs up the fitness landscape. In fact, such depletion of beneficial mutations also occurs when one models the genome as a finite sequence of sites, each having its own independent contribution to fitness [i.e. no microscopic epistasis, in which case F(t) approaches a maximum Fmax according to a power law in the SSWM regime if the independent contributions to fitness follow an exponential distribution] (Good and Desai 2015). Our results therefore suggest that clonal interference can speed up the functional form of fitness trajectories even in these microscopic models.
Even though we have neglected the accumulation of deleterious mutations, if the selection coefficients of deleterious mutations are typically much larger than those of beneficial ones, one might consider the effect of deleterious mutations as decreasing the fraction of cells that can gain successful beneficial mutations (Fisher 1930; Orr 2000). When the deleterious mutation rate is much higher than that of beneficial mutations, the system can be thought of as being in a quasi-equilibrium state (mutation-selection balance), with the fraction of cells having the highest fitness (i.e. free of deleterious mutations) being , where μd is the deleterious mutation rate, and sd is the harmonic mean of the distribution of fitness effects of deleterious mutations (Orr 2000). In the presence of macroscopic epistasis, μd, sd, and hence P0 can potentially depend on the fitness of the cell. Since successful interfering mutations can only arise in the fraction of resident population that is free of deleterious mutations, there would be an additional factor of in the expression for (Equation 10), which could further affect the shape of F(t) (Equation 9) in the presence of clonal interference (relative to that in the SSWM regime). Nevertheless, if sd is independent of F, since (with μ being the constant probability of gaining a mutation per division), changes (i.e. increases or decreases) with F in the same way as . This implies that the effect of in terms of slowing or speeding up the shape of the trajectory is the same as that of . In general, sd could vary with F and this could potentially be inferred by comparing fitness trajectories obtained with different μ (since this would change the contribution from P0).
If instead the effects of deleterious mutations are typically weak, one needs to take into account the possibility of beneficial mutants emerging in backgrounds other than the fittest subgroup, and that a beneficial mutant that further accumulates deleterious mutations might still be able to fix (Johnson and Barton 2002; Pénisson et al. 2017; Jain 2019). In the limit where the selection coefficients of deleterious mutations are very weak (), the probability of a beneficial mutation with selection coefficient s surviving drift can be approximated by a step function at (Johnson and Barton 2002; Pénisson et al. 2017; Jain 2019), such that for while remaining unchanged () for larger values of s. In this case, the lower bound of s in the integral in Equation (3) would be , and the corresponding lower bounds for both the integrals in Equations (4) and (9) would be . The factor of in the lower bound suggests that even if μd is independent of F, epistasis acting through can potentially change the relative shape of the fitness trajectories with and without clonal interference. In particular, as increases with F (representing a decrease in the mean selection coefficient of beneficial mutations), the increase in the lower bound implies that the effect of clonal interference is reduced over time since stronger mutations are less likely to be interfered with by other mutations (Equation 10). This would lead to a speed-up of the form of the trajectory if the population started off in the clonal interference regime. Nevertheless, if (i.e. the deleterious mutation rate is smaller than the mean selection coefficient of beneficial mutations) throughout the measured trajectory, the influence of deleterious mutations is small and our main conclusions would still hold. Experiments with different values of mutation rates would therefore allow one to disentangle the effects of weak deleterious mutations from those of clonal interference.
In the regime where the time taken for a mutant to fix is at least comparable to the time for new successful mutations to emerge, besides interference from other beneficial mutations that emerge in the resident population, the original mutant population can also gain additional mutations. The fixation probabilities of these new mutations are enhanced since they are accumulated in cells that already have a fitness advantage in the population. This effect could be important when there are a large number of different subgroups (i.e. genotypes) in the population, such that the average fixation probability of a mutation (of a certain fitness effect) should also be an average over the different background fitness values the mutation can occur in (Good et al. 2012). In general, not accounting for multiple mutations would underestimate the fixation probabilities of small effect mutations since these benefit most from hitch-hiking (Fogle et al. 2008). However, these small effect mutations also contribute the least amount to fitness, which may explain why taking into account clonal interference alone seems to provide a good agreement with simulated fitness trajectories (Figs. 2 and A2), even when there are more accumulated mutations than predicted by the current framework (Fig. A4). Nevertheless, how including the effect of multiple mutations would affect the functional form of long-term fitness trajectories in the presence of epistasis, and how our results extend to the regime of strong clonal interference, are interesting questions that we leave for future work. Another potential direction for future studies is to take into account the interactions between clonal interference and horizontal gene transfer (Slomka et al. 2020) (and how these would affect adaptation dynamics), an aspect we have not studied in this paper.
Data availability
The MATLAB code for simulation and analysis is available in the GitHub repository https://github.com/yipeiguo/Effect-of-clonal-interference-on-fitness-trajectories.
Acknowledgments
The authors thank Jie Lin, Michael Manhart, Tzachi Pilpel, and Yoav Ram for useful discussions and feedback.
Funding
This research was supported by the National Science Foundation through NSF CAREER award 1752024 and by the Harvard QBio fellowship.
Conflicts of interest
The authors declare that there is no conflict of interest.
Appendix A: Fixation probability of a new mutant in a population with 2 subgroups
We consider the scenario where there are two genotypes 1 and 2 with growth rates g1 and g2 when a new mutant with growth rate gm emerges. Let be the extinction probability of the mutant lineage when there are N1 type 1 cells and N2 type 2 cells in the population of constant size N. Assuming the dynamics follow a Moran process, and that no further mutations occur, these extinction probabilities satisfy the following set of equations:
(A1) |
where the first term on the right takes into account the probability that the next event involves the division of a type 1 cell and the removal of a type 2 and vice versa, the second term involves the removal of the mutant, the third term is associated with the division of the mutant and removal of one of the other cell types, and the last term is for cases where there is no change in the composition of the population. The boundary conditions are and when , and is the total growth rate of all cells:
(A2) |
The corresponding fixation probabilities can then be obtained exactly by solving this large system of linear equations (Equation A1).
The extinction probability of a newly emerged mutant cell is then given by:
(A3) |
where n is the fraction of type 1 cells among the type 1 and 2 cells.
Since and , for ,
(A4) |
(A5) |
(A6) |
If there are two mutant cells in the population, the probability of the mutant strain eventually going extinct is the probability that the lineages of both mutant cells separately go extinct. In the large N limit, the fates of the two mutant cells can be assumed to be approximately independent of one another, such that:
(A7) |
(A8) |
Substituting these expressions (Equations A4–A8) into Equation (A1) and keeping terms of O(1) and above in N (which is why we neglect the change in n when the mutant replaces a type 1/2 cell: this requires the mutant to divide, and the probability of the mutant being the next to divide is a factor of N lower than that of a type 1/2 cell being the next to divide), satisfies the following equation:
(A9) |
with (from Equation A2).
For ,
(A10) |
(A11) |
such that
(A12) |
Substituting this into Equation (A9), letting and , and taking the large N limit, the solution for is given by:
(A13) |
where , and the boundary conditions are given by and . In the large N limit, the third term will be much smaller than the other terms.
If the first two terms in Equation (A13) dominate (either when n is close to 0 or 1, or when ),
(A14) |
where in the second line we have taken the limit . For and s > 0, the fourth term in Equation (A13) is positive and hence this expression is an overestimate of the true fixation probability.
Denoting the actual fixation probability as:
(A15) |
and substituting this back into Equation (A13), we find that to first order in Δ and in the limit satisfies the following equation:
(A16) |
Defining the integrating factor:
(A17) |
with , the solution to Equation (A16) is given by:
(A18) |
which satisfies the boundary conditions .
We find that these expressions for the fixation probability provide a good estimate of the exact values for a finite (but large) population (Fig. A1).
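The exact values referred to here come from solving the linear system for the extinction probabilities (Equation A1). The following Python sketch illustrates such a calculation for a small population; it is not the authors' MATLAB code. It assumes one common Moran convention, in which the dividing cell is chosen with probability proportional to growth rate times abundance and the removed cell is chosen uniformly among all N cells, so that events leaving the composition unchanged appear as self-loops. The function name `mutant_fixation_probability` and its arguments are hypothetical.

```python
import numpy as np


def mutant_fixation_probability(N, g1, g2, gm, n1_start):
    """Fixation probability of a single mutant (growth rate gm) that appears
    among n1_start type 1 cells and N - 1 - n1_start type 2 cells.

    Assumed Moran step: divider ~ growth rate * count, removed cell uniform.
    """
    # A state is (n1, n2); the mutant count is m = N - n1 - n2.
    states = [(n1, n2) for n1 in range(N + 1) for n2 in range(N + 1 - n1)]
    index = {state: k for k, state in enumerate(states)}
    A = np.zeros((len(states), len(states)))
    b = np.zeros(len(states))
    for (n1, n2), k in index.items():
        m = N - n1 - n2
        A[k, k] = 1.0
        if m == 0:      # boundary condition: mutant lineage is extinct
            b[k] = 1.0
            continue
        if m == N:      # boundary condition: mutant has fixed
            continue
        counts = (n1, n2, m)
        rates = (g1 * n1, g2 * n2, gm * m)
        total = sum(rates)
        for a in range(3):          # type of the dividing cell
            for d in range(3):      # type of the removed cell
                p = rates[a] / total * counts[d] / N
                if p == 0.0:
                    continue
                new = list(counts)
                new[a] += 1
                new[d] -= 1
                # E(state) - sum_p p * E(next state) = 0 for transient states
                A[k, index[(new[0], new[1])]] -= p
    extinction = np.linalg.solve(A, b)
    return 1.0 - extinction[index[(n1_start, N - 1 - n1_start)]]
```

The number of states grows as N²/2, so a dense solve of this kind is only practical for modest N; for larger populations a sparse solver would be the natural substitute.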
Appendix B: Effect of clonal interference on fitness trajectories
The rate at which a successful interfering mutation emerges at generation t, after a mutation of fitness effect s has escaped loss via genetic drift and while it is still on its way to taking over the population, is given by:
(A19) |
where is the fraction of resident cells at generation t (assuming deterministic, logistic growth of the mutant strain), is the fixation probability of the interfering mutant with fitness effect sI (Appendix A), and is the distribution of fitness effects of beneficial mutations that can arise in a cell with fitness F.
Here, we consider a general class of exponential-like fitness effect distributions that has been adopted in previous theoretical studies (Fogle et al. 2008; Schiffels et al. 2011; Good et al. 2012):
(A20) |
where Γ is the Gamma function, is the fraction of mutations that are beneficial, and α and β characterize the shape of the distribution. In particular, if β = 1, we recover the exponential distribution used in the main text (Equation 2). When falls faster than exponentially; when falls more slowly than exponentially. The effect of the shape of on the speed of adaptation in the clonal interference regime has been explored in Park et al. (2010) and Fogle et al. (2008) assuming that does not change over time (i.e. no epistasis); here we are interested in how different types of macroscopic epistasis affect the shape of the average fitness trajectory.
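As an illustration of how selection coefficients could be drawn from an exponential-like distribution of this kind, the sketch below assumes the stretched-exponential density p(s) = β / (α Γ(1/β)) · exp(−(s/α)^β), normalized to 1. This form is an assumption chosen to be consistent with the description above (it reduces to an exponential with mean α when β = 1 and falls faster than exponentially when β > 1); the parameterization of Equation (A20) may differ, and the function names are hypothetical.

```python
import numpy as np
from math import gamma


def sample_fitness_effects(alpha, beta, size, rng):
    """Draw selection coefficients from the assumed density
    p(s) = beta / (alpha * Gamma(1/beta)) * exp(-(s / alpha) ** beta).

    If G ~ Gamma(shape=1/beta, scale=1), then alpha * G ** (1/beta)
    has exactly this density (change of variables g = (s/alpha)**beta).
    """
    g = rng.gamma(shape=1.0 / beta, scale=1.0, size=size)
    return alpha * g ** (1.0 / beta)


def mean_fitness_effect(alpha, beta):
    """Mean of the assumed density: alpha * Gamma(2/beta) / Gamma(1/beta)."""
    return alpha * gamma(2.0 / beta) / gamma(1.0 / beta)
```

In a simulation, α would be set from the desired mean selection coefficient by inverting `mean_fitness_effect` at the chosen β.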
Substituting this distribution into Equation (A19) gives:
(A21) |
where , and the function depends on the value of β and whether we adopt the approximate (linear in n, Equation A14) or more accurate (Equations A15 and A18) expression for . For β = 1, if we assume that (Equation A14), then .
Integrating this over all possible times the interfering mutant can emerge gives:
(A22) |
where is a function of . In the main text, we assumed the simple form of and β = 1, in which case and hence:
(A23) |
where we have taken the large N limit () in the approximation. Substituting this into Equation (A22), we recover Equation (10) in the main text.
Since the fixation probability Pf is reduced by a factor of in the presence of clonal interference (Equation 5), the dynamics of average fitness is given by:
(A24) |
We find that this expression agrees well with the simulation data over a reasonably wide range of mutation rates (with not too large, Fig. A2).
The fact that explicitly depends on h(F) but not implies that only h(F) affects the functional form of the fitness trajectory.
Appendix C: Numerical results with other distributions of fitness effects
We find that our expression for the dynamics of fitness in the clonal interference regime (Equation A24) agrees well with simulation data for other distributions of fitness effects specified through different values of β (Equation A20), and that the conclusion that depletion in beneficial mutations speeds up the functional form of the trajectory holds for these different distributions (Fig. A3).
Appendix D: Simulation Details
We carry out full simulations of the evolutionary process within the Moran model using the Gillespie algorithm. For each simulation, the population is initialized with N cells, each with initial fitness (i.e. division rate) of 1. Throughout the simulation, we keep track of the size Ni (i.e. number of cells) and fitness Fi of cells in each subgroup , where ns is the number of subgroups currently present in the population. For example, the initial population consists of only ns = 1 subgroup of size N1 = N and fitness value F1 = 1. At each point during the simulation, we also calculate the corresponding beneficial mutation rates (i.e. probability of gaining a beneficial mutation per division) (Equation 2, Equation A20) of each subgroup. The possible events that change the composition of the population are: (1) a cell from a subgroup i divides (without mutation) and replaces a cell in another subgroup , which occurs with rate , and (2) a cell from a subgroup i divides and mutates while a cell from subgroup j is removed, which occurs with rate . The total rate of an event occurring is then given by , and we draw the time to the next event from an exponential distribution with mean given by . The next event is drawn with probability proportional to the rate at which the event occurs, and the composition of the population is updated accordingly. If a beneficial mutation occurs while a cell in subgroup i divides (i.e. event type 2), we draw its selection coefficient from the specified distribution with mean given by (Equation A20, or Equation 2 if β = 1), and this new mutant is stored as a new subgroup.
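The procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a port of the authors' MATLAB code: the callables `mu_b` and `mean_s` (beneficial mutation probability per division and mean selection coefficient as functions of fitness) are hypothetical, the distribution of fitness effects is taken to be exponential (β = 1), a mutant's fitness is assumed to be F(1 + s), and events in which the daughter replaces a cell of its own subgroup are kept as null events rather than excluded, which leaves the statistics of the process unchanged.

```python
import numpy as np


def simulate_moran(N, t_max, mu_b, mean_s, rng):
    """Gillespie simulation of a Moran process with beneficial mutations.

    Returns event times, mean fitness after each event, and the final
    subgroup sizes and fitness values.
    """
    sizes = [N]          # N_i: number of cells in each subgroup
    fitness = [1.0]      # F_i: division rate of each subgroup
    t = 0.0
    times, mean_fitness = [0.0], [1.0]
    while True:
        n = np.array(sizes, dtype=float)
        div_rates = np.array(fitness) * n      # total division rate per subgroup
        total = div_rates.sum()
        t += rng.exponential(1.0 / total)      # waiting time to the next division
        if t > t_max:
            break
        i = rng.choice(len(sizes), p=div_rates / total)   # dividing subgroup
        j = rng.choice(len(sizes), p=n / N)               # subgroup losing a cell
        if rng.random() < mu_b(fitness[i]):
            # Daughter carries a new beneficial mutation: store a new subgroup.
            s = rng.exponential(mean_s(fitness[i]))
            sizes.append(1)
            fitness.append(fitness[i] * (1.0 + s))   # assumed fitness update
        else:
            sizes[i] += 1
        sizes[j] -= 1
        # Drop subgroups that have gone extinct.
        keep = [k for k, size in enumerate(sizes) if size > 0]
        sizes = [sizes[k] for k in keep]
        fitness = [fitness[k] for k in keep]
        times.append(t)
        mean_fitness.append(float(np.dot(sizes, fitness)) / N)
    return np.array(times), np.array(mean_fitness), sizes, fitness
```

A call such as `simulate_moran(1000, 50.0, lambda F: 1e-3, lambda F: 0.02, np.random.default_rng(0))` would correspond to a landscape without epistasis; fitness-dependent callables would encode the different types of macroscopic epistasis. Time here is measured in units of the inverse initial division rate, and an average trajectory requires averaging over many independent runs.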
Appendix E: Effect of clonal interference on substitution trajectories
Analogous to the fitness trajectory, the average substitution trajectory S(t) (i.e. the number of fixed mutations as a function of time) is given by:
(A25) |
Assuming an exponential distribution of fitness effects (Equation 2), in the SSWM regime, the average substitution trajectory is therefore given by:
(A26) |
In the presence of weak clonal interference, assuming the simplified linear expression for the fixation probability (Equation A14) gives:
(A27) |
where is the same as in the expression for fitness dynamics (Equation 10).
As for the fitness trajectory, Equations (A26) and (A27) suggest that clonal interference changes the form of the substitution trajectory only through h(F). If the beneficial mutation rate μb stays the same, the form of the substitution trajectory stays the same regardless of whether there is clonal interference. If μb decreases with fitness, the form of the substitution trajectory speeds up; if instead μb increases, the trajectory slows down.
When comparing the numerical solutions of Equation (A27) across different values of , we first scale time as we would for the fitness trajectories (such that the initial fitness gradient is the same) and then scale S accordingly such that is the same. For small values of (see lines for and in Fig. A4), Equation (A27) agrees well with the simulation data and we see indeed that as is increased from to , the shape of the trajectory remains unchanged in the diminishing-returns model, speeds up in the HOC landscape, and slows down in a mutation-releasing landscape.
However, as the value of is increased further, Equation (A27) increasingly underestimates the mutation trajectory. This is the case even though the fitness trajectories are in good agreement with the theory (Fig. A2), suggesting that the excess mutations observed in the simulations likely arise from nearly neutral mutations that do not contribute much to fitness. The fixation of nearly neutral mutations that typically do not fix on their own could occur, for example, when a neutral mutant gains an additional beneficial mutation which then drives fixation. Therefore, to distinguish the different types of macroscopic epistasis (and in particular whether μb changes over time), comparing the shape of fitness trajectories would be more reliable (i.e. would work over a larger range of ) than comparing substitution trajectories.
Appendix F: Effect of varying Fc when estimating initial gradient
In practice, experimental data consist of fitness values measured at discrete time points. One way of estimating the initial gradient of the fitness trajectory is to find the time point tc when the average fitness first exceeds some value Fc, and let the estimate of the initial slope be . When comparing the shape of fitness trajectories across different values of N, the time for each trajectory is rescaled (), with the scaling factor k being proportional to the corresponding value of . By solving Equation (9) for different landscapes and then rescaling the trajectories with gradient estimates obtained from different values of Fc, we find that our qualitative results are insensitive to the choice of Fc (Fig. A5).
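A minimal Python sketch of this rescaling is given below. It assumes that the slope estimate is the secant slope between the first measurement and the first time point at which fitness exceeds Fc, and that the scaling factor k equals this estimate (proportionality constant 1); both choices, and the function names, are assumptions made for illustration.

```python
import numpy as np


def initial_slope(t, F, Fc):
    """Secant estimate of the initial gradient of a fitness trajectory,
    using the first sampled time point at which F exceeds Fc."""
    t = np.asarray(t, dtype=float)
    F = np.asarray(F, dtype=float)
    above = np.nonzero(F > Fc)[0]
    if above.size == 0:
        raise ValueError("fitness never exceeds Fc")
    c = above[0]
    if c == 0:
        raise ValueError("Fc must exceed the initial fitness")
    return (F[c] - F[0]) / (t[c] - t[0])


def rescale_time(t, F, Fc):
    """Rescale time so that the estimated initial gradient becomes 1."""
    return np.asarray(t, dtype=float) * initial_slope(t, F, Fc)
```

After rescaling, trajectories obtained at different population sizes share the same estimated initial gradient, so any remaining difference between them reflects a change in functional form.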
Literature cited
- Bryson V, Szybalski W. Microbial selection. Science. 1952;116(3003):45–51.
- Chou HH, Chiu HC, Delaney NF, Segrè D, Marx CJ. Diminishing returns epistasis among beneficial mutations decelerates adaptation. Science. 2011;332(6034):1190–1192.
- Desai MM, Fisher DS. Beneficial mutation–selection balance and the effect of linkage on positive selection. Genetics. 2007;176(3):1759–1798.
- Fisher RA. The Genetical Theory of Natural Selection. Oxford, UK: Clarendon Press; 1930.
- Fogle CA, Nagle JL, Desai MM. Clonal interference, multiple mutations and adaptation in large asexual populations. Genetics. 2008;180(4):2163–2173.
- Gerrish PJ, Lenski RE. The fate of competing beneficial mutations in an asexual population. Genetica. 1998;102:127.
- Gillespie JH. Molecular evolution over the mutational landscape. Evolution. 1984;38(5):1116–1129.
- Good BH, Desai MM. The impact of macroscopic epistasis on long-term evolutionary dynamics. Genetics. 2015;199(1):177–190.
- Good BH, McDonald MJ, Barrick JE, Lenski RE, Desai MM. The dynamics of molecular evolution over 60,000 generations. Nature. 2017;551(7678):45–50.
- Good BH, Rouzine IM, Balick DJ, Hallatschek O, Desai MM. Distribution of fixed beneficial mutations and the rate of adaptation in asexual populations. Proc Natl Acad Sci USA. 2012;109(13):4950–4955.
- Guo Y, Vucelja M, Amir A. Stochastic tunneling across fitness valleys can give rise to a logarithmic long-term fitness trajectory. Sci Adv. 2019;5(7):eaav3842.
- Hartl DL, Taubes CH. Compensatory nearly neutral mutations: selection without adaptation. J Theor Biol. 1996;182(3):303–309.
- Imhof M, Schlötterer C. Fitness effects of advantageous mutations in evolving Escherichia coli populations. Proc Natl Acad Sci USA. 2001;98(3):1113–1117.
- Jain K. Interference effects of deleterious and beneficial mutations in large asexual populations. Genetics. 2019;211(4):1357–1369.
- Johnson T, Barton NH. The effect of deleterious alleles on adaptation in asexual populations. Genetics. 2002;162(1):395–411.
- Kassen R, Bataillon T. Distribution of fitness effects among beneficial mutations before selection in experimental populations of bacteria. Nat Genet. 2006;38(4):484–488.
- Khan AI, Dinh DM, Schneider D, Lenski RE, Cooper TF. Negative epistasis between beneficial mutations in an evolving bacterial population. Science. 2011;332(6034):1193–1196.
- Kryazhimskiy S, Rice DP, Desai MM. Population subdivision and adaptation in asexual populations of Saccharomyces cerevisiae. Evolution. 2012;66(6):1931–1941.
- Kryazhimskiy S, Rice DP, Jerison ER, Desai MM. Global epistasis makes adaptation predictable despite sequence-level stochasticity. Science. 2014;344(6191):1519–1522.
- Kryazhimskiy S, Tkačik G, Plotkin JB. The dynamics of adaptation on correlated fitness landscapes. Proc Natl Acad Sci USA. 2009;106(44):18638–18643.
- Levy SF, Blundell JR, Venkataram S, Petrov DA, Fisher DS, Sherlock G. Quantitative evolutionary dynamics using high-resolution lineage tracking. Nature. 2015;519(7542):181–186.
- Lin J, Manhart M, Amir A. Evolution of microbial growth traits under serial dilution. Genetics. 2020;215(3):767–777.
- Moran PAP. The Statistical Processes of Evolutionary Theory. Oxford, UK: Clarendon Press; 1964.
- Nowak MA. Evolutionary Dynamics: Exploring the Equations of Life. Cambridge, MA, USA: Harvard University Press; 2006.
- Orr HA. The rate of adaptation in asexuals. Genetics. 2000;155(2):961–968.
- Orr HA. The distribution of fitness effects among beneficial mutations. Genetics. 2003;163(4):1519–1526.
- Park SC, Krug J. Evolution in random fitness landscapes: the infinite sites model. J Stat Mech. 2008;2008(04):P04014.
- Park SC, Simon D, Krug J. The speed of evolution in large asexual populations. J Stat Phys. 2010;138(1–3):381–410.
- Pénisson S, Singh T, Sniegowski P, Gerrish P. Dynamics and fate of beneficial mutations under lineage contamination by linked deleterious mutations. Genetics. 2017;205(3):1305–1318.
- Ram Y, Hadany L. The probability of improvement in Fisher’s geometric model: a probabilistic approach. Theor Popul Biol. 2015;99:1–6.
- Reddy G, Desai MM. Global epistasis emerges from a generic model of a complex trait. Elife. 2021;10:e64740.
- Rokyta DR, Beisel CJ, Joyce P, Ferris MT, Burch CL, Wichman HA. Beneficial fitness effects are not exponential for two viruses. J Mol Evol. 2008;67(4):368–376.
- Rokyta DR, Joyce P, Caudle SB, Wichman HA. An empirical test of the mutational landscape model of adaptation using a single-stranded DNA virus. Nat Genet. 2005;37(4):441–444.
- Schiffels S, Szöllosi GJ, Mustonen V, Lässig M. Emergent neutrality in adaptive asexual evolution. Genetics. 2011;189(4):1361–1375.
- Sherer NA, Kuhlman TE. Escherichia coli with a tunable point mutation rate for evolution experiments. G3 (Bethesda). 2020;10(8):2671–2681.
- Shibai A, Takahashi Y, Ishizawa Y, Motooka D, Nakamura S, Ying BW, Tsuru S. Mutation accumulation under UV radiation in Escherichia coli. Sci Rep. 2017;7(1):12.
- Silander OK, Tenaillon O, Chao L. Understanding the evolutionary fate of finite populations: the dynamics of mutational effects. PLoS Biol. 2007;5(4):e94.
- Slomka S, Françoise I, Hornung G, Asraf O, Biniashvili T, Pilpel Y, Dahan O. Experimental evolution of Bacillus subtilis reveals the evolutionary dynamics of horizontal gene transfer and suggests adaptive and neutral effects. Genetics. 2020;216(2):543–558.
- Sprouffske K, Aguilar-Rodríguez J, Sniegowski P, Wagner A. High mutation rates limit evolutionary adaptation in Escherichia coli. PLoS Genet. 2018;14(4):e1007324.
- Uphoff S, Lord ND, Okumus B, Potvin-Trottier L, Sherratt DJ, Paulsson J. Stochastic activation of a DNA damage response causes cell-to-cell mutation rate variation. Science. 2016;351(6277):1094–1097.
- Wang Y, Diaz Arenas C, Stoebel DM, Cooper TF. Genetic background affects epistatic interactions between two beneficial mutations. Biol Lett. 2013;9(1):20120328.
- Wiser MJ, Ribeck N, Lenski RE. Long-term dynamics of adaptation in asexual populations. Science. 2013;342(6164):1364–1367.