Proceedings of the National Academy of Sciences of the United States of America. 2009 Oct 26;106(44):18638–18643. doi: 10.1073/pnas.0905497106

The dynamics of adaptation on correlated fitness landscapes

Sergey Kryazhimskiy a,1, Gašper Tkačik a,b,1, Joshua B Plotkin a,2
PMCID: PMC2767361  PMID: 19858497

Abstract

Evolutionary theory predicts that a population in a new environment will accumulate adaptive substitutions, but precisely how they accumulate is poorly understood. The dynamics of adaptation depend on the underlying fitness landscape. Virtually nothing is known about fitness landscapes in nature, and few methods allow us to infer the landscape from empirical data. With a view toward this inference problem, we have developed a theory that, in the weak-mutation limit, predicts how a population's mean fitness and the number of accumulated substitutions are expected to increase over time, depending on the underlying fitness landscape. We find that fitness and substitution trajectories depend not on the full distribution of fitness effects of available mutations but rather on the expected fixation probability and the expected fitness increment of mutations. We introduce a scheme that classifies landscapes in terms of the qualitative evolutionary dynamics they produce. We show that linear substitution trajectories, long considered the hallmark of neutral evolution, can arise even when mutations are strongly selected. Our results provide a basis for understanding the dynamics of adaptation and for inferring properties of an organism's fitness landscape from temporal data. Applying these methods to data from a long-term experiment, we infer the sign and strength of epistasis among beneficial mutations in the Escherichia coli genome.

Keywords: epistasis, fitness trajectory, substitution trajectory, weak mutation, evolution


Evolutionary theory predicts that mean fitness will increase over time when a population encounters a new environment. This behavior is observed in natural and laboratory populations. Yet evolutionary theory offers few quantitative predictions for the dynamics of adaptation (1). The primary difficulty is that adaptation depends on the shape of the underlying fitness landscape. Unfortunately, mapping out an organism's fitness landscape is virtually impossible because of its vast dimensionality and the coarse resolution of fitness measurements. Moreover, because of the scarcity of such measurements, most theoretical work has been pursued in isolation from data.

Much of the theory of adaptation is concerned with understanding the dynamics on uncorrelated, or “rugged”, fitness landscapes. This approach, pioneered by Kingman (2) and Kauffman and Levin (3), has generated many important results (e.g. refs. 4–7). But many of these results do not extend to landscapes that are correlated. One striking example is the expected length of an adaptive walk: It is extremely short on rugged landscapes (3, 8), but it can be very long on correlated landscapes (9). Although data are scarce, a long-term evolution experiment in Escherichia coli has found that adaptation continues to proceed even after 20,000 generations in a constant environment (10). This observation suggests that fitness landscapes in nature are correlated.

A second body of work examines relatively realistic, complex genotype-to-fitness maps (e.g. an RNA folding algorithm) and studies adaptation on the resulting correlated landscapes by computer simulation (e.g. refs. 3, 11–15). This approach provides important insights into the process of adaptation, and it produces quantitative predictions about the specific systems being simulated. But such results are difficult to generalize.

A third approach, orthogonal to the first two, was introduced by Gillespie (16, 17) and revived more recently by Orr (8, 18, 19). It utilizes extreme-value theory to identify features of the adaptation process that are independent of the underlying fitness landscape. Although helpful for understanding some fundamental properties of evolution, this approach suffers from a few serious drawbacks. Most importantly, by focusing on features of adaptation that are independent of the fitness landscape, the Orr–Gillespie theory does not elucidate how the structure of the landscape influences adaptation, nor does it allow us to infer the landscape from empirical data. Yet this is a question of central interest in evolutionary biology. In addition, most of the predictions of this theory concern a single adaptive step (8, 18, 19), and those predictions that extend to multiple steps hold again only for uncorrelated landscapes (20).

In order to address these shortcomings, we present here an elementary theory of adaptation on a correlated fitness landscape. Our theory makes an explicit connection between the shape of the fitness landscape and observable features of adaptation, and it therefore allows us to infer important properties of the fitness landscapes from data. Experimental studies of microbial evolution typically report the mean fitness of the population (21, 22) and the mean number of accumulated substitutions (23, 24) over time; therefore we develop a theory that predicts these dynamic quantities, which we call the fitness and substitution trajectories, in terms of the underlying fitness landscape.

To develop this theory, we need a sufficiently general but tractable description of a correlated fitness landscape. As in Gillespie's model (17), we will describe the fitness landscape by specifying the distribution of fitnesses of single-mutant neighbors for each genotype, which we call the “neighbor fitness distribution” (NFD). On an uncorrelated landscape, all genotypes share the same NFD. We introduce correlations by assuming that the same NFD is shared among genotypes that have the same fitness, but genotypes of different fitnesses may have different NFDs. We say that such landscapes are fitness-parametrized because the possible consequences of a mutation are determined only by the fitness of the parental genotype (52). This framework accommodates arbitrary correlations introduced by nonneutral mutations. But neutral networks (14, 25, 26) or mutations with equal effect but different evolutionary potential fall outside of the scope of fitness-parametrized landscapes. Nevertheless, the space of fitness-parametrized landscapes is very large and contains most of the landscapes studied in previous literature.

To understand this space better, we will first explore three classical fitness landscapes: the uncorrelated landscape (2, 5, 6, 20, 27), the (additive) nonepistatic landscape (28, 29), and the landscape with a constant distribution of selection coefficients (30, 31). We will demonstrate how the choice of landscape influences the dynamics of adaptation. Having gained some insight from these examples, we will classify fitness-parametrized landscapes in terms of the qualitative evolutionary dynamics they produce. Remarkably, the qualitative dynamics fall into 14 possible classes, which include, among others, the well-known classical examples. By comparing these classes against observations from microbial evolution experiments (21), we will infer the space of landscapes that, given our simplifying assumptions, are compatible with existing data.

We will study the dynamics of adaptation in the limit of weak mutation (8, 16, 17, 32), which allows us to ignore the effects of multiple, competing beneficial mutations (30, 31, 33, 34). This approach is mathematically convenient, and, more importantly, it allows us to study the dynamics induced by the fitness landscape itself in isolation from those that result from clonal interference (30, 31, 35, 36). Our analysis will therefore provide a null expectation against which to compare more complex models or data.

Results

Three Classical Fitness Landscapes.

We describe a fitness landscape by a family of probability distributions, Φx. Φx(y)dy denotes the probability that a mutation arising in an individual of fitness x will have a fitness in [y, y + dy]. The space of fitness-parametrized landscapes includes, among others, such well-known (2, 5, 6, 20, 27, 29–31) landscapes as (i) the “house of cards” (HOC) or uncorrelated landscapes, for which all genotypes have the same NFD, Φx(y) = Ψ(y); (ii) the non-epistatic (NEPI) landscapes, for which the distribution of fitness effects of mutations is the same for all genotypes, so that the NFD is given by Φx(y) = Ψ(y − x); and (iii) the “stairway to heaven” (STH) landscapes, for which the distribution of selection coefficients is the same for all genotypes, so that the NFD is given by Φx(y) = x⁻¹Ψ(x⁻¹(y − x)).

The definitions of these three well-known landscapes are summarized in Table 1, where we have assumed that the NFD follows an exponential form. We will derive expressions for the expected fitness and substitution trajectories on each of these landscapes. Our results also hold qualitatively if we replace the exponential distribution by any other distribution from the Gumbel domain of attraction as predicted by the Orr–Gillespie theory (18). Note that there are no deleterious or neutral mutations in the NEPI and STH landscapes (Table 1), but our conclusions would not change if we added such mutations (see SI Appendix).
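As an illustration of the three definitions, the following is a minimal sketch of how one might draw a mutant fitness from each NFD, assuming the exponential form for Ψ with scale parameter a. The function name `sample_neighbor_fitness` and its arguments are hypothetical and are not taken from the authors' code.

```python
import random

def sample_neighbor_fitness(landscape, x, a, rng=random):
    """Draw one mutant fitness y from an exponential-form NFD.

    landscape: "HOC", "NEPI" or "STH"; x: parental fitness; a: scale parameter.
    (Hypothetical helper, written only to illustrate the three definitions.)
    """
    draw = rng.expovariate(1.0 / a)  # exponential variate with mean a
    if landscape == "HOC":
        # Uncorrelated: y is independent of the parental fitness x.
        return draw
    if landscape == "NEPI":
        # Non-epistatic: the fitness increment y - x has the same law for all x.
        return x + draw
    if landscape == "STH":
        # Stairway to heaven: the selection coefficient (y - x)/x has the same law.
        return x * (1.0 + draw)
    raise ValueError(f"unknown landscape: {landscape!r}")
```

For example, `sample_neighbor_fitness("STH", 2.0, 0.42, random.Random(1))` returns a fitness no smaller than the parental value 2.0, because this exponential form contains no deleterious mutations on the NEPI and STH landscapes.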

Table 1.

Classical fitness landscapes with the exponential form and the corresponding fitness and substitution trajectories obtained from Eqs. 1 and 2

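| Landscape | NFD $\Phi_x(y)$ | Expected fitness increment* $r(x)$ | Fitness trajectory $F(t)$ | Expected fixation probability* $q(x)$ | Substitution trajectory $S(t)$ |
|---|---|---|---|---|---|
| HOC | $\frac{1}{a}e^{-y/a},\ y \ge 0$ | $4a^2 e^{-x/a}$ | $a\ln\left(e^{x_0/a}+4at\right)$ | $2a\,e^{-x/a}$ | $\frac{1}{2a}\ln\left(e^{x_0/a}+4at\right)-\frac{x_0}{2a^2}$ |
| NEPI | $\frac{1}{a}e^{-(y-x)/a},\ y \ge x$ | $\frac{4a^2}{x}$ | $\sqrt{x_0^2+8a^2t}$ | $\frac{2a}{x}$ | $\frac{1}{2a}\left(\sqrt{x_0^2+8a^2t}-x_0\right)$ |
| STH | $\frac{1}{ax}e^{-(y-x)/(ax)},\ y \ge x$ | $\frac{4a^2(1+a)}{(1+2a)^2}\,x$ | $x_0\exp\left(\frac{4a^2(a+1)}{(2a+1)^2}\,t\right)$ | $\frac{2a}{1+2a}$ | $\frac{2a}{2a+1}\,t$ |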

*Expressions for the r- and q-functions are derived in the limit x ≫ 1 (HOC, NEPI) and under the approximation N ≫ 1 (HOC, NEPI, STH). These approximations are highly accurate, especially for large x (see Fig. 1). See SI Appendix for details.

Before we derive analytic expressions for the dynamics of adaptation on the three classical landscapes, we first develop some intuitive expectations. On all landscapes, we expect substitutions to accrue and the mean fitness to increase over time. For the HOC landscapes, we expect that the rate of fitness increase should slow down as the population becomes more adapted. To see this slowdown, imagine a population initially at fitness x₀, where $\int_{x_0}^{\infty}\Psi(y)\,dy = 0.5$, i.e. 50% of mutations are beneficial. If a beneficial mutation arises and fixes, providing fitness x₁ > x₀, then this event can only reduce the pool of remaining beneficial mutations, i.e. $\int_{x_1}^{\infty}\Psi(y)\,dy < 0.5$. Thus, the rate of fitness increase should be reduced as adaptation proceeds on the HOC landscape. By contrast, on a STH landscape, we expect that the rate of fitness increase will grow as the population adapts. Indeed, the fraction of mutations that are adaptive does not change as fitness increases, but the fitness increment of such mutations grows linearly with the fitness of the parent (because the selection coefficient stays the same). These simple considerations indicate that HOC landscapes are antagonistically epistatic, whereas STH landscapes are synergistically epistatic. We call the landscape Φx(y) = Ψ(y − x) nonepistatic because on this landscape the distribution of fitness increments of mutations does not depend upon the fitness of the parental genotype. If fitness effects were viewed multiplicatively, however, then the STH landscape would be considered nonepistatic, although we do not adopt this convention here (see ref. 28 for an extensive discussion on this topic). Moreover, as we show below, the STH landscape produces unrealistic evolutionary dynamics.
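To make the HOC argument concrete, the following worked example assumes the exponential form Ψ(y) = a⁻¹e^{−y/a} used for the classical landscapes. It is an illustration under that assumption only, not a general statement about HOC landscapes.

```latex
% Fraction of mutations that are beneficial at parental fitness x (exponential HOC):
\int_{x}^{\infty} \frac{1}{a}\, e^{-y/a}\, dy \;=\; e^{-x/a}.
% Half of all mutations are beneficial when
e^{-x_0/a} = \tfrac{1}{2} \quad\Longleftrightarrow\quad x_0 = a \ln 2 .
% After a beneficial substitution to x_1 > x_0 the beneficial fraction shrinks:
e^{-x_1/a} \;<\; e^{-x_0/a} \;=\; \tfrac{1}{2}.
```

Under this form the supply of beneficial mutations decays exponentially with fitness, which is the slowdown described above.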

Fitness and Substitution Trajectories.

In order to analyze the dynamics of adaptation, we consider an asexual population of fixed size N that evolves according to the infinite-sites Wright–Fisher (WF) model (see Materials and Methods for details). We assume that the mutation rate is sufficiently small that, at most, one mutant segregates in the population at any time (8, 17). Thus, the population is essentially always monomorphic, and it can be characterized at each time by its fitness x. When a mutation with fitness y arises, it either fixes instantaneously with Kimura's fixation probability $\pi_x(y) = (1 - e^{-2s_x(y)})/(1 - e^{-2Ns_x(y)})$ or is instantaneously lost with probability 1 − πx(y), where $s_x(y)$ is the selection coefficient (see Materials and Methods). In this limit, the adaptive walk of the population is described by a continuous-time, continuous-space Markov chain. We emphasize that, in contrast to the “greedy” adaptive walks typically studied in the literature on rugged fitness landscapes (3, 4), the adaptive walks studied here never stop. Even if a population reaches a local fitness maximum, a deleterious mutation will eventually fix, and the walk will continue.
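The fixation probability above can be evaluated numerically as in the following sketch. It uses the selection coefficient s = y/x − 1 given in Materials and Methods; the function name, the neutral-limit branch, and the overflow guard are implementation choices made here for illustration.

```python
import math

def fixation_probability(x, y, N):
    """Kimura's fixation probability for a mutant of fitness y in a
    monomorphic population of fitness x and size N, with s = y/x - 1."""
    s = y / x - 1.0
    if s == 0.0:
        # Neutral limit of (1 - exp(-2s)) / (1 - exp(-2Ns)) as s -> 0.
        return 1.0 / N
    exponent = -2.0 * N * s
    if exponent > 700.0:
        # Strongly deleterious: the denominator overflows and the ratio is ~0.
        return 0.0
    # expm1 keeps precision for small |s|; the two minus signs cancel.
    return math.expm1(-2.0 * s) / math.expm1(exponent)
```

For a beneficial mutation with N·s ≫ 1 the denominator is close to 1, so the value approaches 1 − e^{−2s}, which is roughly 2s when s is small.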

We have developed a method for efficiently computing the full ensemble distribution of fitnesses and substitutions of the population at time t, given that its initial fitness was x₀ at time zero (see SI Appendix). Here we focus on two important statistics of these distributions: the expected fitness of the population F(t) at time t, and the expected number of substitutions S(t) accumulated in the population by time t. We call these quantities the fitness trajectory and the substitution trajectory, respectively. If we measure time in the expected number of mutations, these functions approximately satisfy the following equations (see Materials and Methods):

$$\dot{F} = r(F), \tag{1}$$
$$\dot{S} = q(F), \tag{2}$$

where the dot denotes a derivative with respect to time;

$$q(x) = \int_0^{\infty} \Phi_x(y)\,\pi_x(y)\,dy \tag{3}$$

is the expected fixation probability of a mutation arising in a population with fitness x; and

$$r(x) = \int_0^{\infty} (y - x)\,\Phi_x(y)\,\pi_x(y)\,dy \tag{4}$$

is the expected fitness increment of such a mutation, weighted by its fixation probability. Eqs. 1 and 2 were derived under the infinite-sites assumption, i.e. each genotype was assumed to have an infinite number of neighbors, so that even very fit genotypes have a nonzero chance of discovering a beneficial mutation. Consistent with previous work (37), the infinite-sites approximation is highly accurate, as we demonstrate by comparing (Fig. 1) the solutions of these equations (Table 1) to simulations of a finite-site model (see Materials and Methods).
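The sketch below shows one way to evaluate these quantities numerically. It assumes that q(x) is the integral of Φx(y)πx(y) over y, that r(x) is the same integral weighted by (y − x), and that Eqs. 1 and 2 take the form dF/dt = r(F) and dS/dt = q(F), which is the form consistent with the closed-form trajectories listed in Table 1. All function names, the midpoint rule, and the forward-Euler scheme are choices made here, not the authors' method.

```python
import math

def kimura(s, N):
    """Kimura fixation probability for selection coefficient s."""
    if s == 0.0:
        return 1.0 / N
    if -2.0 * N * s > 700.0:  # avoid overflow for strongly deleterious mutants
        return 0.0
    return math.expm1(-2.0 * s) / math.expm1(-2.0 * N * s)

def expected_q_r(nfd_pdf, x, N, y_lo, y_hi, steps=20000):
    """Midpoint-rule estimates of q(x) and r(x) on [y_lo, y_hi].

    nfd_pdf(x, y) is the NFD density; the caller picks bounds that
    contain essentially all of its probability mass.
    """
    h = (y_hi - y_lo) / steps
    q = r = 0.0
    for k in range(steps):
        y = y_lo + (k + 0.5) * h
        w = nfd_pdf(x, y) * kimura(y / x - 1.0, N) * h
        q += w                # expected fixation probability
        r += (y - x) * w      # fixation-weighted fitness increment
    return q, r

def integrate_trajectories(r, q, x0, t_end, dt=1e-3):
    """Forward-Euler integration of dF/dt = r(F), dS/dt = q(F)."""
    F, S = x0, 0.0
    for _ in range(int(round(t_end / dt))):
        F, S = F + r(F) * dt, S + q(F) * dt  # both updates use the old F
    return F, S
```

As a usage example, with `a = 0.42` and the STH density `lambda x, y: math.exp(-(y - x) / (a * x)) / (a * x)`, calling `expected_q_r(..., x=2.0, N=1000, y_lo=2.0, y_hi=2.0 + 60 * a)` can be compared against the STH entries of Table 1. Likewise, `integrate_trajectories(lambda x: 4 / x, lambda x: 2 / x, 2.0, 10.0)` can be compared against the NEPI closed forms with a = 1.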

Fig. 1.

Dynamics of adaptation on three classical fitness landscapes. Rows correspond to fitness landscapes. The first column graphs the NFD, Φx(y), for two representative values of the parental fitness, x₀ = 1 and x₀ = 4. The second and third columns show the fitness and substitution trajectories for a population starting with fitness x₀ = 2. Black lines correspond to the theoretical predictions of Eqs. 1 and 2; gray lines show the results of stochastic simulations; dashed lines show a linear function, for reference. Note that axes are logarithmic. The fourth column shows the empirical distribution of selection coefficients of fixed mutations; dashed lines show the best-fit regression on the semi-log scale, with slope k (only selection coefficients > 0.5 were used for fitting). Parameter values: N = 1000; μ = 10⁻⁵; L = 1000; number of replicate simulations = 10⁴; a = 1 for the HOC and the NEPI landscapes, and a = 0.42 for the STH landscape.

Fig. 1 shows the dynamics of adaptation on the three classical fitness landscapes. On the HOC landscape, both the expected fitness of the population and the expected number of substitutions grow logarithmically with time, consistent with previous work (4). As we expected, the rate of adaptation on such landscapes rapidly declines as the fitness of the population grows. As the population adapts, there are two forces on the HOC landscape that act against further adaptation. First, the fraction of mutations that are beneficial decreases. Second, the probability of fixation of an adaptive mutation decreases as well. This decrease occurs because the fixation probability monotonically depends on its selection coefficient, and the selection coefficients of available adaptive mutations decline as the fitness of the parent increases. In addition, adaptation slows down further because the time to fixation of beneficial mutations grows with declining selection coefficients. However, this effect turns out to be negligible (see the comparison with the full WF model below). The rate of adaptation on the NEPI landscape also slows down as the fitness increases, but it does so less dramatically than on the HOC landscape. This behavior is expected because the fraction of beneficial mutations and their effects do not change as the fitness of the parental genotypes increases. However, the selection coefficients of beneficial mutations decrease, thereby reducing the rate of fitness growth. Finally, on the STH landscape, the rate of mean-fitness increase grows without bound over time, as expected. In contrast to HOC and NEPI landscapes, there are no forces on such landscapes that impede further adaptation as the population becomes more adapted (hence the name “stairway to heaven”).

In order to investigate the robustness of the results in Fig. 1 with respect to the assumption of weak mutation, we have simulated the full stochastic WF model over a wide range of mutation rates. These simulations incorporate the effects of competing mutations, and they also account for the (nonzero) time to fixation. Our theoretical prediction matches the dynamics of the full WF model very well when θ ≲ 0.1. Moreover, even when θ > 1, the concavities of fitness and substitution trajectories are correctly predicted by our theory (see SI Appendix).

Distribution of Selection Coefficients of Fixed Mutations.

In addition to fitness and substitution trajectories, we have investigated the distribution of selection coefficients for mutations that fix during adaptation (Fig. 1, fourth column). By using computer simulations, Orr previously showed that this distribution is approximately exponential (excluding small selection coefficients) for uncorrelated landscapes whose NFD belongs to the Gumbel type (8). Fig. 1 shows that Orr's observation holds more generally—i.e. even for correlated landscapes, such as the NEPI and STH landscapes. In fact, the distribution of fixed selection coefficients is so robust to changes in the landscape structure that virtually no inference can be made on its basis. To demonstrate this problem, we have chosen the parameter a (see Table 1) so that the resulting distributions of fixed selection coefficients are virtually the same for all three classical fitness landscapes, even though their qualitative trajectories are completely different (Fig. 1). In other words, the selection coefficients associated with mutations that are fixed during evolution tell us very little about the long-term behavior of an adapting population or the fitness landscape on which it is evolving.

Toward a Classification of Landscapes.

The space of all possible fitness landscapes is vast. We therefore wish to classify landscapes in terms of the qualitative evolutionary dynamics they produce—i.e. in terms of their fitness and substitution trajectories, which can be directly observed in an experiment. Our analytic approximation in Eqs. 1 and 2 captures the behavior of the trajectories quite well, especially as the population reaches high fitnesses (Fig. 1). Remarkably, these equations depend on only two simple functions of the landscape: the expected fixation probability of a mutation arising in a population of fitness x, q(x), and the expected fitness increment of such a mutation weighted by its fixation probability, r(x). By varying just these two quantities, we can explore all possible qualitative behaviors of the fitness and substitution trajectories.

For the purpose of classification, we consider only landscapes that are defined on the whole positive real axis, and whose r- and q-functions are monotonic and smooth. The five different shapes of the r-function and three different shapes of the q-function determine, respectively, five qualitatively different fitness trajectories and three qualitatively different substitution trajectories (Fig. 2). Landscapes with an increasing or decreasing r-function produce convex (types I and II) or concave (types III, IV, and V) fitness trajectories, respectively. More specifically, fitness trajectories grow superlinearly with time (type I), are asymptotically linear (types II and III), grow sublinearly (type IV), or asymptote to a constant (type V). Similarly, landscapes with an increasing or decreasing q-function produce convex (type A) or concave (types B and C) substitution trajectories, respectively. Substitution trajectories grow asymptotically linearly (types A and B), or sublinearly (type C). Considering all possible combinations of the r- and q-functions produces a total of 14 classes of qualitatively different evolutionary dynamics (Fig. 2).

Fig. 2.

Classification of fitness landscapes. Column 1 shows five possible shapes for the r-function, and three possible shapes for the q-function. In some cases, these functions have asymptotes, shown as dashed horizontal lines. Columns 2–6 show the fitness (Upper) and substitution (Lower) trajectories for the 15 landscapes that arise through combinations of r- and q-functions. Substitution trajectories for landscapes with q-function of type A, B, and C are shown in green, dark orange, and purple, respectively. In some cases, the fitness or substitution trajectories possess asymptotic slopes, shown as dashed lines in the corresponding color. In these cases, the asymptotic slope equals the asymptotic value of the corresponding r- or q-function (except for the substitution trajectories in case V). Landscapes V-B and V-C both have asymptotically linear substitution trajectories, and therefore fall into the same class.

This classification scheme accommodates the three classical landscapes considered above. The STH landscapes belong to class I-A or I-B, because q(x) is constant and r(x) grows without bound. The NEPI landscapes belong to class IV-C, because both r(x) and q(x) decay as x⁻¹. The HOC landscapes belong to class V-C because r(x) is negative for large x and q(x) decays to zero. Recall that the STH landscapes are synergistically epistatic and the HOC landscapes are antagonistically epistatic. This observation suggests the following natural definition: landscapes for which the r-function either grows or decays slower than x⁻¹ are synergistically epistatic (types I, II, III, and IV), whereas landscapes for which the r-function decays faster than x⁻¹ are antagonistically epistatic (types IV and V).

Remarkably, the substitution trajectories for landscapes of type IV or V are almost linear—a pattern long considered the hallmark of neutral or nearly neutral evolution (38). As these correlated landscapes demonstrate, this pattern can also arise when substitutions confer significant fitness gains. In fact, the linear accrual of adaptive mutations has recently been observed in experimental populations (53).

Inferring Landscape Structure From Data.

Which fitness landscapes are compatible with empirical data, and which are not? To address this question, we have compared predicted evolutionary dynamics with data from long-term evolution experiments. Empirical fitness trajectories in a fixed environment typically have negative curvature: Fitness increases quickly at the early stages of adaptation, and more slowly at later stages (10, 21, 22, 39–42). This negative curvature implies that the r-functions for landscapes in nature belong to type III, IV or V. In other words, a large class of strongly synergistic landscapes (those with an increasing r-function) are incompatible with basic, empirical observations. The space of unrealistic fitness landscapes includes the widely used STH landscapes (30, 31, 33–35, 43–45), for which r(x) ∼ x.

Landscapes with either antagonistic epistasis (r(x) < Cx⁻¹) or weak synergistic epistasis (Cx⁻¹ < r(x) ≤ C) produce fitness trajectories that are concave, and so they are qualitatively consistent with data from microbial evolution experiments. We can use such data to estimate the sign and strength of epistasis. In order to do so, we assume that the r-function has the form $r(x) = Bx^{\beta}$ with B > 0 and β ≤ 0. This form is convenient because it includes nonepistatic landscapes when β = −1, weakly synergistic landscapes when −1 < β ≤ 0, and antagonistic landscapes when β < −1. Eq. 1 can then be solved analytically, and the fitness trajectory is given by

$$F(t) = \left(x_0^{1-\beta} + B(1-\beta)\,t\right)^{\frac{1}{1-\beta}}. \tag{5}$$

It follows from this expression that the slope of the line fitted on the log–log scale to the fitness trajectory observed in a long-term evolution experiment provides an estimate of (1 − β)⁻¹. We applied this procedure to data from the evolutionary experiment by Lenski et al. (21) and found that β̂ = −9.58 with the 95% confidence interval [−13.36, −7.38], suggesting that the fitness landscape of E. coli is, on average, strongly antagonistically epistatic. This qualitative conclusion is robust with respect to the violation of the weak mutation assumption (see SI Appendix), although the precise estimate of β may change with the development of more refined models of E. coli evolution.
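The estimation step can be sketched as follows. The code fits an ordinary least-squares line to log fitness against log time and converts the slope k into β = 1 − 1/k, using the relation k = (1 − β)⁻¹ stated above. The function names and the plain least-squares fit are assumptions for illustration; the authors' exact fitting procedure and confidence-interval method are not reproduced here.

```python
import math

def estimate_beta(times, fitnesses):
    """Estimate beta from the least-squares slope of log F against log t."""
    xs = [math.log(t) for t in times]
    ys = [math.log(f) for f in fitnesses]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((u - mx) * (v - my) for u, v in zip(xs, ys))
    sxx = sum((u - mx) ** 2 for u in xs)
    slope = sxy / sxx             # estimates 1 / (1 - beta)
    return 1.0 - 1.0 / slope

def epistasis_type(beta, tol=1e-6):
    """Label the landscape by the sign of epistasis implied by r(x) = B x^beta."""
    if beta > 0.0:
        raise ValueError("the assumed form requires beta <= 0")
    if abs(beta + 1.0) <= tol:
        return "nonepistatic"
    return "antagonistic" if beta < -1.0 else "weakly synergistic"
```

The slope is only informative at times late enough that the initial fitness no longer dominates the trajectory, so early time points should be excluded before fitting.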

Discussion

The framework developed here addresses two key problems in the theory of adaptation: how to characterize evolution on a correlated fitness landscape and how to infer properties of a fitness landscape from empirical data. Our analysis has relied on two assumptions: weak mutation and the fitness parametrization of the landscape. The assumption of weak mutation, although restrictive, has been used in previous literature and provides a reasonable starting point for future research. Relaxing this assumption presents substantial mathematical complications and introduces entirely new phenomena, such as clonal interference (30, 35) and “piggybacking” (31, 36). Therefore, we must first have a solid understanding of adaptation dynamics under weak mutation before proceeding to incorporate these additional effects. Without a theory of weak mutation, we would be unable to disentangle the effects of the fitness landscape itself from the effects of clonal interference. In the future, experiments whose primary goal is to probe the fitness landscape should be designed to minimize the effects of clonal interference, e.g. by choosing small population sizes.

The fitness parametrization is a less-restrictive assumption, especially when weak mutation is already assumed. Indeed, neutral networks are important for adaptation only when a population can use them to quickly access previously inaccessible beneficial mutations. This regime only occurs when the population is polymorphic, i.e. when θ > 1. In contrast, a monomorphic population can explore the neutral network only very slowly, by substituting neutral mutations (26). Such a population is far more likely to substitute a beneficial mutation and jump to a new neutral network.

We have studied several quantities that characterize evolutionary dynamics. We found that the distribution of selection coefficients of fixed mutations is insensitive to the underlying NFD, consistent with previous findings (8, 46, 47). In contrast, the fitness and substitution trajectories are very informative about the underlying fitness landscape. In particular, the substitution trajectory is convex or concave on landscapes for which the fixation probability of a mutation increases or decreases with increasing fitness, respectively. Similarly, the fitness trajectory is convex or concave on landscapes for which the expected fitness increment of a mutation increases or decreases with increasing fitness. Moreover, the curvature of the fitness trajectory is informative about the sign and strength of epistasis in the fitness landscape.

These results provide a groundwork for inferring fitness landscapes from dynamic data. In particular, we have shown that data from bacterial evolution experiments are incompatible with landscapes that feature a constant distribution of selection coefficients—even though such landscapes are often used in the theoretical literature. We have also proposed a simple method for inferring the sign and strength of epistasis from such data. In contrast to most other estimates of epistasis that are based on measurements of interactions among deleterious mutations (see e.g. ref. 48 and references therein), we provide an estimate of epistasis based on the interaction among beneficial mutations—which is more informative for the long-term dynamics of adaptation. Our estimates suggest that the E. coli fitness landscape is characterized by strong antagonistic epistasis, at least in a fixed laboratory environment, which is consistent with one previous study (49). However, the precise type of landscape (e.g. type IV versus type V) for E. coli or other microorganisms may be difficult to determine on the basis of fitness and substitution trajectories alone. The ensemble variance in trajectories across experimental replicates may provide additional power (see SI Appendix).

Here we have focused on static fitness landscapes, which probably arise only in laboratory environments. Fitness landscapes in the field are likely dynamic because of fluctuations in the environment or frequency-dependent selection. We can hope to understand the evolutionary dynamics on such landscapes only after we acquire a firm understanding of static landscapes. Our elementary theory provides an explicit link between the form of static fitness landscapes and their resulting evolutionary dynamics, in terms of simple observable quantities. Hopefully, this link will help bring together theoretical and experimental studies of adaptation.

Materials and Methods

We consider an asexual population of fixed size N that evolves according to the infinite-sites WF model (50) with a small mutation rate, so that θ ≪ (4 log N)⁻¹, where θ = Nμ and μ is the per-locus, per-generation mutation rate. This condition ensures that the absorption time of all mutations, including neutral ones, is much shorter than the waiting time until the arrival of the next mutation. Therefore, the population is monomorphic at virtually all times, and occasionally it transitions almost instantaneously to a new type (17). Individuals and the population as a whole are characterized by their fitness, x. Φx(y)dy denotes the fitness-parametrized landscape, i.e. the probability that a mutation arising in an individual with fitness x has fitness in [y, y + dy]. We assume that genome length is sufficiently large so that each mutation occurs at a new site. A mutation fixes in the population with Kimura's fixation probability $\pi_x(y) = (1 - e^{-2s_x(y)})/(1 - e^{-2Ns_x(y)})$, where $s_x(y) = y/x - 1$ is the selection coefficient (50). If a mutation arises and fixes, then the population instantaneously transitions from fitness x to fitness y; we ignore the time it takes for a mutation to fix. We can thus describe the sequence of such transitions by a stationary continuous-time Markov chain, whose state space is the semiaxis [0, +∞). The population waits θ⁻¹ generations for the next mutation on average. If we measure time by the expected number of mutations, the probability that the population has fitness in [y, y + dy] at time t + δt, given it had fitness x at time t, is Φx(y)πx(y)dy δt.

We define the fitness and substitution trajectories as $F(t,x) = \int_0^{\infty} y\,P(y,t|x)\,dy$ and $S(t,x) = \sum_{i=0}^{\infty} i\,P_i(t|x)$, respectively, where P(y,t|x)dy is the probability that the population has fitness in [y, y + dy] at time t, given initial fitness x, and $P_i(t|x)$ is the probability that the population has accumulated i substitutions by time t, given initial fitness x [for simplicity we also write F(t) and S(t)]. It follows from the classical Markov chain theory that F and S satisfy the equations (see SI Appendix)

[Eq. 6]
[Eq. 7]

where K^b is defined by

[Eq. 8]

which is the backward Kolmogorov operator. In the SI Appendix, we present an efficient numerical method for finding the whole distributions P(y,t|x) and P i(t|x).
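The displayed forms of Eqs. 6–8 are not reproduced in this text. The block below is a reconstruction from the definitions above, not a copy of the published equations. It assumes that substitutions occur at rate $\Phi_x(\xi)\,\pi_x(\xi)\,d\xi$ per unit time, and it uses the symbol $q(x)$, assumed here, for the total substitution rate out of state x. The initial conditions follow from the definitions of F and S. The published notation may differ.

```latex
\begin{align}
\frac{\partial F}{\partial t}(t,x) &= \bigl(K^b F(t,\cdot)\bigr)(x), & F(0,x) &= x, \\
\frac{\partial S}{\partial t}(t,x) &= \bigl(K^b S(t,\cdot)\bigr)(x) + q(x), & S(0,x) &= 0, \\
\bigl(K^b f(\cdot)\bigr)(x) &= \int_0^\infty \Phi_x(\xi)\,\pi_x(\xi)\,\bigl[f(\xi) - f(x)\bigr]\,d\xi, \\
q(x) &= \int_0^\infty \Phi_x(\xi)\,\pi_x(\xi)\,d\xi .
\end{align}
```

The extra term $q(x)$ in the equation for S arises because each substitution increments the substitution count by one, whereas F is simply the expectation of the current state.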

On landscapes for which mutations of large effect become increasingly unlikely as the fitness of the population increases, most of the contribution to the integral in Eq. 8 comes from values ξ ≈ x, and we can write f(ξ) − f(x) ≈ f′(x)(ξ−x). Consequently, (K^bf(·))(x) ≈ r(x)f′(x), where r(x) is given by Eq. 4. Therefore, Eqs. 6 and 7 can be approximated by so-called advection equations that turn out to be equivalent to Eqs. 1 and 2 (see SI Appendix for details). Eqs. 1 and 2 are closely related to those derived by Tachida (51) and Welch and Waxman (37) for the uncorrelated landscape.
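The approximation above can be sketched numerically. Under $K^b f \approx r(x) f'(x)$, the method of characteristics turns the advection equations into ordinary differential equations of the form dF/dt = r(F) and dS/dt = q(F). The sketch assumes that r(x) is the expected fitness increment, $\int (y - x)\,\Phi_x(y)\,\pi_x(y)\,dy$, and that q(x) is the expected fixation probability, $\int \Phi_x(y)\,\pi_x(y)\,dy$. The Gaussian landscape, the grid parameters, and all function names are hypothetical choices for illustration only.

```python
import math

def pi_fix(x, y, N):
    """Kimura's fixation probability with s = y/x - 1 (neutral limit 1/N)."""
    s = y / x - 1.0
    if abs(s) < 1e-12:
        return 1.0 / N
    exponent = -2.0 * N * s
    if exponent > 700.0:  # strongly deleterious: probability is effectively 0
        return 0.0
    return math.expm1(-2.0 * s) / math.expm1(exponent)

def gaussian_phi(x, y, sigma=0.01):
    """Hypothetical landscape: mutant fitness is Normal(x, sigma^2)."""
    z = (y - x) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def q_and_r(x, N, phi=gaussian_phi, half_width=0.1, n=2001):
    """Trapezoid-rule estimates of q(x) and r(x) on [x - half_width, x + half_width]."""
    lo = max(x - half_width, 1e-9)
    hi = x + half_width
    h = (hi - lo) / (n - 1)
    q = 0.0
    r = 0.0
    for k in range(n):
        y = lo + k * h
        w = 0.5 * h if k in (0, n - 1) else h
        rate = phi(x, y) * pi_fix(x, y, N)
        q += w * rate              # expected fixation probability
        r += w * rate * (y - x)    # expected fitness increment
    return q, r

def trajectories(x0, N, t_max, dt=1.0):
    """Forward-Euler integration of dF/dt = r(F), dS/dt = q(F).

    Time is measured in expected numbers of mutations.
    """
    F, S, t = x0, 0.0, 0.0
    out = [(t, F, S)]
    while t < t_max - 1e-12:
        q, r = q_and_r(F, N)
        F += r * dt
        S += q * dt
        t += dt
        out.append((t, F, S))
    return out
```

Calling `trajectories(1.0, 1000, 50.0)` returns a list of `(t, F, S)` tuples. Forward Euler is adequate here only because r and q are small per unit time for this illustrative landscape; a stiffer landscape would call for a smaller step or a higher-order integrator.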

In stochastic simulations, we implement a finite-site version of the model described above. In these simulations, after a substitution has occurred, a sample of size L = 1,000 is drawn from the distribution $\Phi_x$; this sample represents the (finite) mutational neighborhood of the current genotype. Each of these L neighboring genotypes is equally likely to be drawn at a subsequent mutation event. Our results do not depend on the value of L on the time scales examined, as long as L is large (e.g., L ≥ 10^3). Code written in the Objective Caml language is available upon request.
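The simulation procedure described above can be sketched as follows. This is not the authors' Objective Caml code: it is a minimal Python illustration in which the Gaussian landscape, the floor on fitness, and all function names are assumptions. Time is counted in mutation events, and fixation is treated as instantaneous.

```python
import math
import random

def pi_fix(x, y, N):
    """Kimura's fixation probability with s = y/x - 1 (neutral limit 1/N)."""
    s = y / x - 1.0
    if abs(s) < 1e-12:
        return 1.0 / N
    exponent = -2.0 * N * s
    if exponent > 700.0:  # strongly deleterious: probability is effectively 0
        return 0.0
    return math.expm1(-2.0 * s) / math.expm1(exponent)

def draw_neighborhood(x, L, rng, sigma=0.01):
    """Sample L neighbor fitnesses from a hypothetical Gaussian landscape.

    Fitness is floored at a small positive value so that y/x stays defined.
    """
    return [max(rng.gauss(x, sigma), 1e-9) for _ in range(L)]

def simulate(x0, N, n_mutations, L=1000, seed=0):
    """Return a list of (mutation index, fitness, substitutions so far)."""
    rng = random.Random(seed)
    x = x0
    subs = 0
    neighbors = draw_neighborhood(x, L, rng)
    history = [(0, x, subs)]
    for m in range(1, n_mutations + 1):
        # Each of the L neighbors is equally likely to arise by mutation.
        y = rng.choice(neighbors)
        if rng.random() < pi_fix(x, y, N):
            # Substitution: move to the mutant and redraw its neighborhood.
            x = y
            subs += 1
            neighbors = draw_neighborhood(x, L, rng)
        history.append((m, x, subs))
    return history
```

Averaging the fitness and substitution columns over many seeds gives empirical counterparts of F(t) and S(t) for the chosen landscape.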

Acknowledgments.

The authors thank Richard Lenski, Michael Desai, Todd Parsons, and Jeremy Draghi for many fruitful discussions. J.B.P. acknowledges support from the Burroughs Wellcome Fund, the David and Lucile Packard Foundation, the James S. McDonnell Foundation, the Alfred P. Sloan Foundation, and Defense Advanced Research Projects Agency Grant HR0011-05-1-0057. G.T. acknowledges support from National Science Foundation Grants IBN-0344678 and DMR04-25780.

Footnotes

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at www.pnas.org/cgi/content/full/0905497106/DCSupplemental.

References

  • 1. Aita T, et al. Extracting characteristic properties of fitness landscape from in vitro molecular evolution: A case study on infectivity of fd phage to E. coli. J Theor Biol. 2007;246:538–550. doi: 10.1016/j.jtbi.2006.12.037.
  • 2. Kingman JFC. A simple model for the balance between selection and mutation. J Appl Prob. 1978;15:1–12.
  • 3. Kauffman S, Levin S. Towards a general theory of adaptive walks on rugged landscapes. J Theor Biol. 1987;128:11–45. doi: 10.1016/s0022-5193(87)80029-2.
  • 4. Flyvbjerg H, Lautrup B. Evolution in a rugged fitness landscape. Phys Rev A. 1992;46:6714–6723. doi: 10.1103/physreva.46.6714.
  • 5. Park SC, Krug J. Evolution in random fitness landscapes: The infinite sites model. J Stat Mech. 2008:P04014.
  • 6. Macken CA, Perelson AS. Protein evolution on rugged landscapes. Proc Natl Acad Sci USA. 1989;86:6191–6195. doi: 10.1073/pnas.86.16.6191.
  • 7. Kauffman S, Weinberger ED. The NK model of rugged fitness landscape and its application to maturation of the immune response. J Theor Biol. 1989;141:211–245. doi: 10.1016/s0022-5193(89)80019-0.
  • 8. Orr HA. The population genetics of adaptation: The adaptation of DNA sequences. Evolution. 2002;56:1317–1330. doi: 10.1111/j.0014-3820.2002.tb01446.x.
  • 9. Orr HA. The population genetics of adaptation on correlated fitness landscapes: The block model. Evolution. 2006;60:1113–1124.
  • 10. Cooper VS, Lenski RE. The population genetics of ecological specialization in evolving Escherichia coli populations. Nature. 2000;407:736–739. doi: 10.1038/35037572.
  • 11. Perelson AS, Macken CA. Protein evolution on partially correlated landscapes. Proc Natl Acad Sci USA. 1995;92:9657–9661. doi: 10.1073/pnas.92.21.9657.
  • 12. Newman MEJ, Engelhardt R. Effects of selective neutrality on the evolution of molecular species. Proc R Soc London Ser B. 1998;265:1333–1338.
  • 13. Adami C. Digital genetics: Unravelling the genetic basis of evolution. Nat Rev Genet. 2006;7:109–118. doi: 10.1038/nrg1771.
  • 14. Cowperthwaite MC, Meyers LA. How mutational networks shape evolution: Lessons from RNA models. Annu Rev Ecol Evol Syst. 2007;38:203–230.
  • 15. Ndifon W, Plotkin JB, Dushoff J. On the accessibility of adaptive phenotypes of a bacterial metabolic network. PLoS Comput Biol. 2009;5:e1000472. doi: 10.1371/journal.pcbi.1000472.
  • 16. Gillespie JH. A simple stochastic gene substitution model. Theor Pop Biol. 1983;23:202–215. doi: 10.1016/0040-5809(83)90014-x.
  • 17. Gillespie JH. The Causes of Molecular Evolution. Oxford: Oxford Univ Press; 1994.
  • 18. Orr HA. The distribution of fitness effects among beneficial mutations. Genetics. 2003;163:1519–1526. doi: 10.1093/genetics/163.4.1519.
  • 19. Joyce P, Rokyta DR, Beisel CJ, Orr HA. A general extreme value theory model for the adaptation of DNA sequences under strong selection and weak mutation. Genetics. 2008;180:1627–1643. doi: 10.1534/genetics.108.088716.
  • 20. Rokyta DR, Beisel CJ, Joyce P. Properties of adaptive walks on uncorrelated landscapes under strong selection and weak mutation. J Theor Biol. 2006;243:114–120. doi: 10.1016/j.jtbi.2006.06.008.
  • 21. Lenski RE, Travisano M. Dynamics of adaptation and diversification: A 10,000-generation experiment with bacterial populations. Proc Natl Acad Sci USA. 1994;91:6808–6814. doi: 10.1073/pnas.91.15.6808.
  • 22. Silander OK, Tenaillon O, Chao L. Understanding the evolutionary fate of finite populations: The dynamics of mutational effects. PLoS Biol. 2007;5:e94. doi: 10.1371/journal.pbio.0050094.
  • 23. Paquin C, Adams J. Frequency of fixation of adaptive mutations is higher in evolving diploid than haploid yeast populations. Nature. 1983;302:495–500. doi: 10.1038/302495a0.
  • 24. Wichman HA, Millstein J, Bull JJ. Adaptive molecular evolution for 13,000 phage generations: A possible arms race. Genetics. 2005;170:19–31. doi: 10.1534/genetics.104.034488.
  • 25. Fontana W, Schuster P. Continuity in evolution: On the nature of transitions. Science. 1998;280:1451–1455. doi: 10.1126/science.280.5368.1451.
  • 26. van Nimwegen E, Crutchfield JP, Huynen M. Neutral evolution of mutational robustness. Proc Natl Acad Sci USA. 1999;96:9716–9720. doi: 10.1073/pnas.96.17.9716.
  • 27. Orr HA. A minimum on the mean number of steps taken in adaptive walks. J Theor Biol. 2003;220:241–247. doi: 10.1006/jtbi.2003.3161.
  • 28. Mani R, St. Onge RP, Hartman JL IV, Giaever G, Roth FP. Defining genetic interaction. Proc Natl Acad Sci USA. 2008;105:3461–3466. doi: 10.1073/pnas.0712255105.
  • 29. Eshel I. On evolution in a population with an infinite number of types. Theor Pop Biol. 1971;2:209–236. doi: 10.1016/0040-5809(71)90015-3.
  • 30. Gerrish PJ, Lenski RE. The fate of competing beneficial mutations in an asexual population. Genetica. 1998;102–103:127–144.
  • 31. Desai MM, Fisher DS. Beneficial mutation-selection balance and the effect of linkage on positive selection. Genetics. 2007;176:1759–1798. doi: 10.1534/genetics.106.067678.
  • 32. Dieckmann U, Law R. The dynamical theory of coevolution: A derivation from stochastic ecological processes. J Math Biol. 1996;34:579–612. doi: 10.1007/BF02409751.
  • 33. Rouzine IM, Wakeley J, Coffin JM. The solitary wave of asexual evolution. Proc Natl Acad Sci USA. 2003;100:587–592. doi: 10.1073/pnas.242719299.
  • 34. Park SC, Krug J. Clonal interference in large populations. Proc Natl Acad Sci USA. 2007;104:18135–18140. doi: 10.1073/pnas.0705778104.
  • 35. Wilke CO. The speed of adaptation in large asexual populations. Genetics. 2004;167:2045–2053. doi: 10.1534/genetics.104.027136.
  • 36. Zeyl C. Evolutionary genetics: A piggyback ride to adaptation and diversity. Curr Biol. 2007;17:R333. doi: 10.1016/j.cub.2007.02.042.
  • 37. Welch JJ, Waxman D. The NK model and population genetics. J Theor Biol. 2005;234:329–340. doi: 10.1016/j.jtbi.2004.11.027.
  • 38. Kimura M, Ohta T. Protein polymorphism as a phase of molecular evolution. Nature. 1971;229:467–469. doi: 10.1038/229467a0.
  • 39. Bull JJ, et al. Exceptional convergent evolution in a virus. Genetics. 1997;147:1497–1507. doi: 10.1093/genetics/147.4.1497.
  • 40. Elena SF, Davila M, Novella IS, Holland JJ, Esteban Evolutionary dynamics of fitness recovery from the debilitating effects of Muller's ratchet. Evolution. 1998;52:309–314. doi: 10.1111/j.1558-5646.1998.tb01633.x.
  • 41. de Visser JAGM, Lenski RE. Long-term experimental evolution in Escherichia coli. XI. Rejection of non-transitive interactions as cause of declining rate of adaptation. BMC Evol Biol. 2002;2:19. doi: 10.1186/1471-2148-2-19.
  • 42. Hayashi Y, et al. Experimental rugged fitness landscape in protein sequence space. PLoS ONE. 2006;1:e96. doi: 10.1371/journal.pone.0000096.
  • 43. Orr HA. The rate of adaptation in asexuals. Genetics. 2000;155:961–968. doi: 10.1093/genetics/155.2.961.
  • 44. Johnson T, Barton NH. The effect of deleterious alleles on adaptation in asexual populations. Genetics. 2002;162:395–411. doi: 10.1093/genetics/162.1.395.
  • 45. Bachtrog D, Gordo I. Adaptive evolution of asexual populations under Muller's ratchet. Evolution. 2004;58:1403–1413. doi: 10.1111/j.0014-3820.2004.tb01722.x.
  • 46. Rozen DE, de Visser JAG, Gerrish PJ. Fitness effects of fixed beneficial mutations in microbial populations. Curr Biol. 2002;12:1040–1045. doi: 10.1016/s0960-9822(02)00896-5.
  • 47. Hegreness M, Shoresh N, Hartl D, Kishony R. An equivalence principle for the incorporation of favorable mutations in asexual populations. Science. 2006;311:1615–1617. doi: 10.1126/science.1122469.
  • 48. Kouyos RD, Silander OK, Bonhoeffer S. Epistasis between deleterious mutations and the evolution of recombination. Trends Ecol Evol. 2007;22:308–315. doi: 10.1016/j.tree.2007.02.014.
  • 49. Sanjuán R, Moya A, Elena SF. The contribution of epistasis to the architecture of fitness in an RNA virus. Proc Natl Acad Sci USA. 2004;101:15376–15379. doi: 10.1073/pnas.0404125101.
  • 50. Crow JF, Kimura M. An Introduction to Population Genetics Theory. New York: Harper & Row; 1972.
  • 51. Tachida H. A study on a nearly neutral mutation model in finite populations. Genetics. 1991;128:183–192. doi: 10.1093/genetics/128.1.183.
  • 52. Brandt H. Correlation Analysis of Fitness Landscapes. 2001 (International Institute for Applied Systems Analysis, Laxenburg, Austria), Interim Report IR-01-058.
  • 53. Barrick JE, et al. Genome evolution and adaptation in a long-term experiment with E. coli. Nature. 2009. doi: 10.1038/nature08480.

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences