PLOS Computational Biology. 2014 Sep 11;10(9):e1003818. doi: 10.1371/journal.pcbi.1003818

The Time Scale of Evolutionary Innovation

Krishnendu Chatterjee 1,*, Andreas Pavlogiannis 1, Ben Adlam 2, Martin A Nowak 2
Editor: Niko Beerenwinkel 3
PMCID: PMC4161296  PMID: 25211329

Abstract

A fundamental question in biology is the following: what is the time scale that is needed for evolutionary innovations? There are many results that characterize single steps in terms of the fixation time of new mutants arising in populations of certain size and structure. But here we ask a different question, which is concerned with the much longer time scale of evolutionary trajectories: how long does it take for a population exploring a fitness landscape to find target sequences that encode new biological functions? Our key variable is the length, L, of the genetic sequence that undergoes adaptation. In computer science there is a crucial distinction between problems that require algorithms which take polynomial or exponential time. The latter are considered to be intractable. Here we develop a theoretical approach that allows us to estimate the time of evolution as a function of L. We show that adaptation on many fitness landscapes takes time that is exponential in L, even if there are broad selection gradients and many targets uniformly distributed in sequence space. These negative results lead us to search for specific mechanisms that allow evolution to work on polynomial time scales. We study a regeneration process and show that it enables evolution to work in polynomial time.

Author Summary

Evolutionary adaptation can be described as a biased, stochastic walk of a population of sequences in a high dimensional sequence space. The population explores a fitness landscape. The mutation-selection process biases the population towards regions of higher fitness. In this paper we estimate the time scale that is needed for evolutionary innovation. Our key parameter is the length of the genetic sequence that needs to be adapted. We show that a variety of evolutionary processes take exponential time in sequence length. We propose a specific process, which we call a ‘regeneration process’, and show that it allows evolution to work on polynomial time scales. In this view, evolution can solve a problem efficiently if it has solved a similar problem already.

Introduction

Our planet came into existence 4.6 billion years ago. There is clear chemical evidence for life on earth 3.5 billion years ago [1], [2]. The evolutionary process generated prokaryotes, eukaryotes and complex multi-cellular organisms. Throughout the history of life, evolution had to discover sequences of biological polymers that perform specific, complicated functions. The average length of bacterial genes is about 1000 nucleotides, that of human genes about 3000 nucleotides. The longest known bacterial gene contains more than Inline graphic nucleotides, the longest human gene more than Inline graphic. A basic question is what time scale evolution requires to discover the sequences that perform desired functions. While many results exist for the fixation time of individual mutants [3]–[15], here we ask how the time scale of evolution depends on the length L of the sequence that needs to be adapted. We consider the crucial distinction of polynomial versus exponential time [16]–[18]. A time scale that grows exponentially in L is infeasible for long sequences.

Evolutionary dynamics operates in sequence space, which can be imagined as a discrete multi-dimensional lattice that arises when all sequences of a given length are arranged such that nearest neighbors differ by one point mutation [19]. For constant selection, each point in sequence space is associated with a non-negative fitness value (reproductive rate). The resulting fitness landscape is a high dimensional mountain range. Populations explore fitness landscapes searching for elevated regions, ridges, and peaks [20]–[27].

A question that has been extensively studied is how long it takes for existing biological functions to improve under natural selection. This problem leads to the study of adaptive walks on fitness landscapes [15], [20], [21], [28], [29]. In this paper we ask a different question: how long does it take for evolution to discover a new function? More specifically, our aim is to estimate the expected discovery time of new biological functions, that is, how long it takes for a population of reproducing organisms to discover a biological function that is not present at the beginning of the search. We will discuss two approximations for rugged fitness landscapes. We also discuss the significance of clustered peaks.

We consider an alphabet of size four, as is the case for DNA and RNA, and a nucleotide sequence of length Inline graphic. We consider a population of size Inline graphic, which reproduces asexually. The mutation rate, Inline graphic, is small: individual mutations are introduced and evaluated by natural selection and random drift one at a time. The probability that the evolutionary process moves from a sequence Inline graphic to a sequence Inline graphic, which is at Hamming distance one from Inline graphic, is given by Inline graphic, where Inline graphic is the fixation probability of sequence Inline graphic in a population consisting of sequence Inline graphic. In the special case of a flat fitness landscape, we have Inline graphic, and Inline graphic. Thus we have an evolutionary random walk, where each step is a jump to a neighboring sequence of Hamming distance one.
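The evolutionary random walk described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the landscape is flat, time is counted in jumps (fixed substitutions) rather than in generations, and the helper names `hamming`, `neutral_step` and `neutral_walk` are ours, not the paper's.

```python
import random

ALPHABET = "ACGT"  # four-letter alphabet, as for DNA


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def neutral_step(seq, rng):
    """Jump to a uniformly chosen neighbor at Hamming distance one.

    One position is picked uniformly and its letter is replaced by one of
    the three other letters (flat landscape: no selection is applied).
    """
    pos = rng.randrange(len(seq))
    new_letter = rng.choice([ch for ch in ALPHABET if ch != seq[pos]])
    return seq[:pos] + new_letter + seq[pos + 1:]


def neutral_walk(start, steps, seed=0):
    """Return the list of sequences visited by a neutral walk of `steps` jumps."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    seq = start
    path = [seq]
    for _ in range(steps):
        seq = neutral_step(seq, rng)
        path.append(seq)
    return path


if __name__ == "__main__":
    target_center = "A" * 40          # hypothetical target center
    walk = neutral_walk("C" * 40, 2000)
    print("final Hamming distance to target center:",
          hamming(walk[-1], target_center))
```

Every consecutive pair of sequences in the returned path differs in exactly one position, which is the defining property of the walk in the text.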

Results

Consider a high-dimensional sequence space. A particular biological function can be instantiated by some of the sequences. Each sequence Inline graphic has a fitness value Inline graphic, which measures the ability of the sequence Inline graphic to encode the desired function. Biological fitness landscapes are typically expected to have many peaks [29]–[31]. They can be highly rugged due to epistatic effects of mutations [32]–[34]. They can also contain large regions or networks of neutrality [20], [21]. Empirical studies of short RNA sequences have revealed that the underlying fitness landscape has low peak density [35]: around Inline graphic peaks in Inline graphic sequences.

For the purpose of estimating the expected discovery time we can approximate the fitness landscape with a binary step function over the sequence space. We discuss two different approximations (Figure 1). For the first approximation, we consider the scenario where fitness values below some threshold, Inline graphic, have negligible contribution; those sequences do not instantiate the desired function (either not at all or only below the minimum level that could be detected by natural selection). We approximate the rugged fitness landscape as follows: if Inline graphic then Inline graphic; if Inline graphic then Inline graphic. The set of sequences with Inline graphic constitutes the target set, and the remaining fitness landscape is neutral.

Figure 1. Approximations of a highly rugged fitness landscape by broad peaks and neutral regions.


The figures depict examples of highly rugged fitness landscapes where the sequence space has been projected onto one dimension. (A) Sequences with fitness below some level Inline graphic are functionally very different from the desired function, and selection cannot act upon them. All other sequences are considered as targets. The fitness landscape is approximated by a step function: if Inline graphic, then Inline graphic, otherwise Inline graphic. (B) Local maxima below the desired fitness threshold Inline graphic are known to slow down the evolutionary random walk towards sequences that attain fitness at least Inline graphic. We approximate the fitness landscape by broad peaks and neutral regions by increasing the fitness of every sequence that belongs to a mountain range with fitness below Inline graphic to the highest local maximum Inline graphic below Inline graphic. Note that the target set starts from the upslope of a mountain range whose peak exceeds Inline graphic.

The second approximation works as follows. Consider the evolutionary process exploring a rugged fitness landscape where the goal is to attain a fitness level Inline graphic. Local maxima below Inline graphic slow down the evolutionary process in attaining Inline graphic, because the evolutionary walk might get stuck in those local maxima. In order to derive lower bounds for the expected discovery time, the rugged fitness landscape can be approximated as follows. Let Inline graphic be the fitness value of the highest local maximum below Inline graphic. Then for every sequence in a mountain range with a local maximum below Inline graphic we assign the fitness value Inline graphic. The mountain ranges with local maxima above Inline graphic are the target sequences. Note that the target set includes sequences that start at the upslope of mountain ranges with peaks above Inline graphic. Thus, again we obtain a fitness landscape with clustered targets and a neutral region, where the neutral region consists of all sequences whose fitness values have been assigned to Inline graphic. The two approximations are illustrated in Figure 1. For Inline graphic the second approximation generates larger target areas than the first approximation and is therefore more lenient.

Our key results for estimating the discovery time can now be formulated for binary fitness landscapes, but they apply to any type of rugged landscape using one of the two approximations. We note that our methods can also be applied for certain non-binary fitness landscapes, and an example of a fitness landscape with a large gradient arising from multiplicative fitness effects is discussed in Sections 6 and 7 of Text S1.

We now present our main results in the following order. We first estimate the discovery time of a single search aiming to find a single broad peak. Then we study multiple simultaneous searches for a single broad peak. Finally, we consider multiple broad peaks that are uniformly randomly distributed in sequence space.

We first study a broad peak of target sequences described as follows: consider a specific sequence; any sequence within a certain Hamming distance of that sequence belongs to the target set. Specifically, we consider that the evolutionary process has succeeded if the population discovers a sequence that differs from the specific sequence in no more than a fraction Inline graphic of positions. We refer to the specific sequence as the target center and Inline graphic as the width (or radius) of the peak. For example, if Inline graphic and Inline graphic, then the target center is surrounded by a cloud of approximately Inline graphic sequences. For a single broad peak with width Inline graphic, the target set contains at least Inline graphic sequences, which is an exponential function of Inline graphic. The fitness landscape outside the broad peak is flat. We refer to this binary fitness landscape as a broad peak landscape. The population needs to discover any one of the target sequences in the broad peak, starting from some sequence that is not in the broad peak. We establish the following result.

Theorem 1

Consider a single search exploring a broad peak landscape with width Inline graphic and mutation rate Inline graphic . The following assertions hold:

  • if Inline graphic , then there exists Inline graphic such that for all sequence spaces of sequence length Inline graphic , the expected discovery time is at least Inline graphic ;

  • if Inline graphic , then for all sequence spaces of sequence length Inline graphic , the expected discovery time is at most Inline graphic .

Our result can be interpreted as follows (see Theorem S2 and Corollary S2 in Text S1): (i) If Inline graphic, then the expected discovery time is exponential in Inline graphic; and (ii) if Inline graphic, then the expected discovery time is polynomial in Inline graphic. Thus, we have derived a strong dichotomy result which shows a sharp transition from polynomial to exponential time depending on whether a specific condition on Inline graphic does or does not hold.

For the four letter alphabet most random sequences have Hamming distance 3L/4 from the target center, where L is the sequence length. If the population is further away than this Hamming distance, then random drift will bring it closer. If the population is closer than this Hamming distance, then random drift will push it further away. This argument constitutes the intuitive reason that a peak width of 3/4 is the critical threshold. If the peak has a width of less than 3/4, then we prove that the expected discovery time by random drift is exponential in the sequence length L (see Figure 2). This result holds for any population size, Inline graphic, as long as Inline graphic, which is certainly the case for realistic values of Inline graphic and Inline graphic. In Text S1 we also present a more general result: instead of a flat landscape outside the single broad peak, we consider a multiplicative fitness landscape and establish a sharp dichotomy result that generalizes Theorem 1 (see Corollary S2 in Text S1).
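The drift argument can be made concrete with a short calculation. The following is a sketch under assumptions that are ours: L denotes the sequence length, i the current Hamming distance to the target center, the landscape is flat, and each point mutation hits a uniformly chosen position and replaces its letter by one of the three other letters with equal probability.

```latex
% Assumed notation: L = sequence length, i = Hamming distance to the target center.
% One neutral point mutation changes the distance as follows.
\begin{align*}
\Pr[i \to i-1] &= \frac{i}{L}\cdot\frac{1}{3} = \frac{i}{3L}
  && \text{(a mismatched position is hit and corrected)} \\
\Pr[i \to i+1] &= \frac{L-i}{L}
  && \text{(a matching position is hit)} \\
\Pr[i \to i]   &= \frac{i}{L}\cdot\frac{2}{3} = \frac{2i}{3L}
  && \text{(a mismatched position stays mismatched)} \\
\mathbb{E}[\Delta i] &= \frac{L-i}{L} - \frac{i}{3L} = \frac{3L-4i}{3L}
\end{align*}
```

The expected change in distance is positive for i < 3L/4, negative for i > 3L/4, and zero at i = 3L/4, which is the balance point described in the text.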

Figure 2. Broad peak with different fitness landscapes.


For the broad peak there is a specific sequence, and all sequences that are within Hamming distance Inline graphic are part of the target set. The fitness landscape is flat outside the broad peak. (A) If the width of the broad peak is Inline graphic, then the expected discovery time is exponential in sequence length, Inline graphic. (B) If the width of the broad peak is Inline graphic, then the expected discovery time is polynomial in sequence length, Inline graphic. (C) Numerical calculations for broad peak fitness landscapes. We observe exponential expected discovery time for Inline graphic and Inline graphic, whereas polynomial expected discovery time for Inline graphic.

Remark 1

We highlight two important aspects of our results.

  1. First, when we establish exponential lower bounds for the expected discovery time, then these lower bounds hold even if the starting sequence is only a few steps away from the target set.

  2. Second, we present strong dichotomy results, and derive mathematically the most precise and strongest form of the boundary condition.

Let us now give a numerical example to demonstrate that exponential time is intractable. Bacterial life on earth has been around for at least 3.5 billion years, which corresponds to Inline graphic hours. Assuming fast bacterial cell division of 20–30 minutes on average we have at most Inline graphic generations. The expected discovery time for a sequence of length Inline graphic with a very large broad peak of Inline graphic is approximately Inline graphic generations; see Table 1.
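The order of magnitude of this generation budget can be checked with a few lines of arithmetic. The sketch below assumes a year of 365.25 days and continuous division at the stated rate; the function name `generation_budget` is ours.

```python
HOURS_PER_YEAR = 365.25 * 24  # assumption: Julian year


def generation_budget(years, minutes_per_division):
    """Return (hours, generations) available in `years` of continuous division."""
    hours = years * HOURS_PER_YEAR
    generations = hours * 60.0 / minutes_per_division
    return hours, generations


# 3.5 billion years with one division every 20 or every 30 minutes
hours, fast = generation_budget(3.5e9, 20)
_, slow = generation_budget(3.5e9, 30)
print(f"hours available:       {hours:.2e}")
print(f"generations available: {slow:.2e} to {fast:.2e}")
```

On these assumptions the budget is about 3 × 10^13 hours and between roughly 6 × 10^13 and 9 × 10^13 generations, a number that any discovery time growing exponentially in sequence length quickly exceeds.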

Table 1. Numerical data for discovery time in flat fitness landscapes.
Inline graphic Inline graphic Inline graphic Inline graphic
Inline graphic Inline graphic Inline graphic Inline graphic
Inline graphic Inline graphic Inline graphic Inline graphic

Numerical data for the discovery time of broad peaks with width Inline graphic and Inline graphic embedded in flat fitness landscapes. First the discovery time is computed for small values of Inline graphic as shown in Figure 2(C). Then the exponential growth is extrapolated to Inline graphic and Inline graphic, respectively. We show the discovery times for Inline graphic, and Inline graphic. For Inline graphic the values are polynomial in Inline graphic.

If individual evolutionary processes cannot find targets in polynomial time, then perhaps the success of evolution is based on the fact that many populations are searching independently and in parallel for a particular adaptation. We prove that multiple, independent parallel searches are not a solution to the problem if the starting sequence is far away from the target center. Formally we show the following result.

Theorem 2

In all cases where the lower bound on the expected discovery time is exponential, for all polynomials Inline graphic , Inline graphic and Inline graphic , for any starting sequence with Hamming distance at least Inline graphic from the target center, the probability for any one out of Inline graphic independent multiple searches to reach the target set within Inline graphic steps is at most Inline graphic .

If an evolutionary process takes exponential time, then polynomially many independent searches do not find the target in polynomial time with reasonable probability (for details see Theorem S5 in Text S1). We also show an informal and approximate calculation of the success probability for Inline graphic independent searches, as follows: if the expected discovery time is exponential (say, Inline graphic), then the probability that all Inline graphic independent searches fail up to Inline graphic steps is at least Inline graphic (i.e., the success probability within Inline graphic steps of any of the searches is at most Inline graphic), when the starting sequence is far away from the target center. In such a case, one could quickly exhaust the physical resources of an entire planet. The estimated number of bacterial cells [36] on earth is about Inline graphic. To give a specific example let us assume that there are Inline graphic independent searches, each with population size Inline graphic. The probability that at least one of those independent searches succeeds within Inline graphic generations for sequence length Inline graphic and broad peak of Inline graphic is less than Inline graphic.
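The effect of running many searches in parallel can be illustrated numerically. The sketch below is not the paper's calculation: it only uses independence, so that the probability that all searches fail is the product of the single-search failure probabilities, and the numbers in the example are hypothetical. The function name `prob_any_success` is ours.

```python
from math import expm1, log1p


def prob_any_success(p_single, num_searches):
    """Probability that at least one of `num_searches` independent searches
    succeeds, when each succeeds with probability `p_single`.

    Computes 1 - (1 - p)^m in a numerically stable way for very small p.
    """
    if p_single >= 1.0:
        return 1.0
    return -expm1(num_searches * log1p(-p_single))


# Hypothetical numbers: a single search succeeds within the time budget with
# probability 1e-30, and 1e20 searches run independently.
p = prob_any_success(1e-30, 10**20)
print(f"success probability of the whole ensemble: {p:.3e}")
```

The result never exceeds the union bound `num_searches * p_single`, so multiplying the number of searches by a polynomial factor cannot compensate for a single-search success probability that is exponentially small.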

In our basic model, individual mutants are evaluated one at a time. The situation of many mutant lineages evolving in parallel is similar to the multiple searches described above. As we show that whenever a single search takes exponential time, multiple independent searches do not lead to polynomial time solutions, our results imply intractability for this case as well.

We now explore the case of multiple broad peaks that are uniformly and randomly distributed. Consider that there are Inline graphic target centers. Around each target center there is a selection gradient extending up to a distance Inline graphic. Formally we can consider any fitness function Inline graphic that assigns zero fitness to a sequence whose Hamming distance exceeds Inline graphic from all the target centers, which in particular is subsumed by considering the multiple broad peaks where around each center we consider a broad peak of target set with peak width Inline graphic. We establish the following result:

Theorem 3

Consider a single search under the multiple broad peak fitness landscape of Inline graphic target centers chosen uniformly at random, with peak width at most Inline graphic for each center and Inline graphic . Then with high probability, the expected discovery time of the target set is at least Inline graphic .

Whether or not the function Inline graphic is exponential in Inline graphic depends on how Inline graphic changes with Inline graphic. But even if we assume exponentially many broad peak centers, Inline graphic, with peak width Inline graphic where Inline graphic, we need not obtain polynomial time (Figure 3 and Theorem S6 in Text S1).

Figure 3. The search for randomly, uniformly distributed targets in sequence space.


(A) The target set consists of Inline graphic random sequences; each one of them is surrounded by a broad peak of width up to Inline graphic. The figure shows a pictorial illustration where the Inline graphic-dimensional sequence space is projected onto two dimensions. From a randomly chosen starting sequence outside the target set, the expected discovery time is at least Inline graphic, which can be exponential in Inline graphic. (B) Computer simulations showing the average discovery time of Inline graphic, Inline graphic, and Inline graphic targets, with Inline graphic. We observe exponential dependency on Inline graphic. The discovery time is averaged over 200 runs. (C) Success probability estimated as the fraction of the 200 searches that succeed in finding one of the target sequences within Inline graphic generations. The success probability drops exponentially with Inline graphic. (D) Success probability as a function of time for Inline graphic and Inline graphic. (E) Discovery time for a large number of randomly generated target sequences. Either Inline graphic or Inline graphic sequences were generated. For Inline graphic and Inline graphic the target set consists of balls of Hamming distance Inline graphic and Inline graphic (respectively) around each sequence. The figure shows the average discovery time of 100 runs. As expected we observe that the discovery time grows exponentially with sequence length, Inline graphic.

It is known that recombination may accelerate evolution on certain fitness landscapes [28], [37]–[39], and recombination may also slow down evolution on other fitness landscapes [40]. Recombination, however, reduces the discovery time only by at most a linear factor in sequence length [28], [37], [38], [41], [42]. A linear or even polynomial factor improvement over an exponential function does not convert the exponential function into a polynomial one. Hence, recombination can make a significant difference only if the underlying evolutionary process without recombination already operates in polynomial time.

What, then, are the adaptive problems that can be solved by evolution in polynomial time? We propose a “regeneration process”. The basic idea is that evolution can solve a new problem efficiently if it has solved a similar problem already. Suppose gene duplication or genome rearrangement can give rise to starting sequences that are at most Inline graphic point mutations away from the target set, where Inline graphic is a number that is independent of Inline graphic. It is important that starting sequences can be regenerated again and again. We prove that Inline graphic many searches are sufficient in order to find the target in polynomial time with high probability (see Figure 4 and Section 10 in Text S1). The upper bound, Inline graphic, holds even for neutral drift (without selection). Note that in this case, the expected discovery time for any single search is still exponential. Therefore, most of the Inline graphic searches do not succeed in polynomial time; however, with high probability one of the searches succeeds in polynomial time. There are two key aspects to the “regeneration process”: (a) the starting sequence is only a small number of steps away from the target; and (b) the starting sequence can be generated repeatedly. This process enables evolution to overcome the exponential barrier. The upper bound, Inline graphic, may possibly be further reduced, if selection and/or recombination are included.
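A rough calculation shows why regeneration helps. The sketch below is an illustration, not the bound proved in Text S1: it assumes neutral drift with uniform point mutations, counts only the searches that walk straight to the target in exactly k mutations, and uses hypothetical names (`direct_path_probability`, `restarts_needed`) and symbols (L for sequence length, k for the number of mutations separating the start from the target set).

```python
from math import ceil, log, log1p


def direct_path_probability(L, start_distance, k):
    """Probability that k consecutive neutral mutations each reduce the
    Hamming distance to the target center by one.

    From distance d, a mutation moves closer with probability d / (3 L)
    (assumption: uniform position, uniform replacement letter).
    """
    p = 1.0
    for j in range(k):
        p *= (start_distance - j) / (3 * L)
    return p


def restarts_needed(p_single, failure_prob=0.01):
    """Number of independent restarts so that all of them fail with
    probability at most `failure_prob`."""
    if p_single >= 1.0:
        return 1
    return ceil(log(failure_prob) / log1p(-p_single))


# Hypothetical example: a single target sequence, start k = 2 mutations away.
for L in (100, 200, 400):
    p = direct_path_probability(L, start_distance=2, k=2)
    print(L, p, restarts_needed(p))
```

For fixed k the direct-path probability scales like 1/L^k, so the number of restarts needed grows only polynomially in L. Doubling L with k = 2 multiplies the required restarts by about four, whereas a single search that is allowed to drift away faces the exponential barrier.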

Figure 4. Regeneration process.


Gene duplication (or possibly some other process) generates a steady stream of starting sequences that are a constant number Inline graphic of mutations away from the target. Many searches drift away from the target, but some will succeed in polynomially many steps. We prove that Inline graphic searches ensure that with high probability some search succeeds in polynomially many steps.

Discussion

The regeneration process formalizes the role of several existing ideas. First, it ties in with the proposal that gene duplications and genome rearrangements are major events leading to the emergence of new genes [43]. Second, evolution can be seen as a tinkerer playing around with small modifications of existing sequences rather than creating entirely new ones [44]. Third, the process is related to Gillespie's suggestion [29] that the starting sequence for an evolutionary search must have high fitness. In our theory, proximity in fitness value is replaced by proximity in sequence space. However, our results show that proximity alone is insufficient to break the exponential barrier; only when combined with the process of regeneration does it yield polynomial discovery time with high probability. Our process can also explain the emergence of orphan genes arising from non-coding regions [45]. Section 12 of Text S1 discusses the connection of our approach to existing results.

There is one other scenario that must be mentioned. It is possible that certain biological functions are hyper-abundant in sequence space [21] and that a process generating a large number of random sequences will find the function with high probability. For example, Bartel & Szostak [46] isolated a new ribozyme from a pool of about Inline graphic random sequences of length Inline graphic. While such a process is conceivable for small effective sequence length, it cannot represent a general solution for large Inline graphic.

Our theory has clear empirical implications. The regeneration process can be tested in systems of in vitro evolution [47]. A starting sequence can be generated by introducing Inline graphic point mutations in a known protein encoding sequence of length Inline graphic. If these point mutations destroy the function of the protein, then the expected discovery time of any one attempt to find the original sequence should be exponential in Inline graphic. But only polynomially many searches in Inline graphic are required to find the target with high probability in polynomially many steps. The same setup can be used to explore whether the biological function can be found elsewhere in sequence space: the evolutionary trajectory beginning with the starting sequence could discover new solutions. Our theory also highlights how important it is to explore the distribution of biological functions in sequence space both for RNA [20], [21], [35], [46] and in the protein universe [48].

In summary, we have developed a theory that allows us to estimate time scales of evolutionary trajectories. We have shown that various natural processes of evolution take exponential time as a function of the sequence length, Inline graphic. In some cases we have established strong dichotomy results for precise boundary conditions. We have proposed a mechanism that allows evolution to work on polynomial time scales. Some interesting directions of future work are as follows: (1) Consider various forms of rugged fitness landscapes and study more refined approximations than the ones we consider; and then estimate the expected discovery time for the refined approximations. (2) While in this paper we characterize the difference between exponential and polynomial expected discovery time, a more refined analysis (such as efficiency within polynomial time, like cubic vs quadratic time) for specific fitness landscapes using mechanisms like recombination is another interesting problem.

Materials and Methods

Our results are based on a mathematical analysis of the underlying stochastic processes. For Markov chains on the one-dimensional grid, we describe recurrence relations for the expected hitting time and present lower and upper bounds on the expected hitting time using combinatorial analysis (see Text S1 for details). We now present the basic intuitive arguments of the main results.

Markov chain on the one-dimensional grid

For a single broad peak, due to symmetry we can interpret the evolutionary random walk as a Markov chain on the one-dimensional grid. A sequence of type Inline graphic is Inline graphic steps away from the target, where Inline graphic is the Hamming distance between this sequence and the target. The probability that a type Inline graphic sequence mutates to a type Inline graphic sequence is given by Inline graphic. The stochastic process of the evolutionary random walk is a Markov chain on the one-dimensional grid Inline graphic.

The basic recurrence relation

Consider a Markov chain on the one-dimensional grid, and let Inline graphic denote the expected hitting time from Inline graphic to Inline graphic. The general recurrence relation for the expected hitting time is as follows:

graphic file with name pcbi.1003818.e201.jpg (1)

for Inline graphic, with boundary condition Inline graphic. The interpretation is as follows. Given the current state Inline graphic, if Inline graphic, at least one transition will be made to a neighboring state Inline graphic, with probability Inline graphic, from which the hitting time is Inline graphic.

Intuition behind Theorem 1

Theorem 1 is derived by obtaining precise bounds for the recurrence relation of the hitting time (Equation 1). Consider that Inline graphic for all Inline graphic (i.e., progress towards state Inline graphic is always possible), as otherwise Inline graphic is never reached from Inline graphic. We show (see Lemma 2 in the Text S1) that we can write Inline graphic as a sum, Inline graphic, where Inline graphic is the sequence defined as:

graphic file with name pcbi.1003818.e217.jpg (2)

The basic intuition obtained from Equation 2 is as follows: (i) If Inline graphic, for some constant Inline graphic, then the sequence Inline graphic grows at least as fast as a geometric series with factor Inline graphic. (ii) On the other hand, if Inline graphic and Inline graphic for some constant Inline graphic, then the sequence Inline graphic grows at most as fast as an arithmetic series with difference Inline graphic. From the above case analysis the result for Theorem 1 is obtained as follows: If Inline graphic, then for all Inline graphic, we have Inline graphic for some Inline graphic, and hence the sequence Inline graphic grows geometrically for a linear length in Inline graphic. Then, Inline graphic for all states Inline graphic (i.e., for all sequences outside of the target set). This corresponds to case 1 of Theorem 1. On the other hand, if Inline graphic, then it is Inline graphic, and case 2 of Theorem 1 is derived (for details see Corollary 2 in Text S1).

Intuition behind Theorem 2

The basic intuition for the result is as follows: consider a single search for which the expected hitting time is exponential. Then for the single search the probability to succeed in polynomially many steps is negligible (as otherwise the expectation would not have been exponential). In case of independent searches, the independence ensures that the probability that all searches fail is the product of the probabilities that every single search fails. Using the above arguments we establish Theorem 2 (for details see Section 8 in Text S1).

Intuition behind Theorem 3

For this result, it is first convenient to view the evolutionary walk taking place in the sequence space of all sequences of length Inline graphic, under no selection. Each sequence has Inline graphic neighbors, and considering that a point mutation happens, the transition probability to each of them is Inline graphic. Due to symmetry, the underlying Markov chain has a fast mixing time, i.e., the number of steps to converge to the stationary distribution (the mixing time) is Inline graphic. Again by symmetry the stationary distribution is the uniform distribution. If Inline graphic, then from Theorem 1 we obtain that the expected time to reach a single broad peak is exponential. By the union bound, if Inline graphic, the probability to reach any of the Inline graphic broad peaks within Inline graphic steps is negligible. Since after the first Inline graphic steps the Markov chain converges to the stationary distribution, each subsequent step of the process can be interpreted as selection of sequences uniformly at random among all sequences. Using Hoeffding's inequality, we show that with high probability, in expectation Inline graphic such steps are required before a sequence is found that belongs to the target set. Thus we obtain the result of Theorem 3 (for details see Section 9 in Text S1).
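The last step of this argument, in which each step is treated as a uniform draw from sequence space, can be illustrated with exact integer arithmetic. The sketch below does not reproduce the bound of Theorem 3; it only computes how many uniform draws are needed in expectation when the target set is a union of Hamming balls. The names `hamming_ball_size` and `min_expected_uniform_draws` and the parameter names are ours.

```python
from fractions import Fraction
from math import comb


def hamming_ball_size(L, radius):
    """Number of length-L sequences over a 4-letter alphabet within Hamming
    distance `radius` of a fixed center."""
    return sum(comb(L, i) * 3 ** i for i in range(radius + 1))


def min_expected_uniform_draws(L, radius, num_centers):
    """Lower bound on the expected number of uniform draws from the 4^L
    sequences before one lands in the target set.

    The target set is a union of `num_centers` balls, so its size is at most
    num_centers * ball size (balls may overlap). Draws are geometric, hence
    the expectation is 1 / (fraction of space covered).
    """
    covered = min(num_centers * hamming_ball_size(L, radius), 4 ** L)
    return Fraction(4 ** L, covered)


if __name__ == "__main__":
    # Hypothetical example: 1000 centers, peak radius L / 3.
    for L in (30, 60, 90):
        draws = min_expected_uniform_draws(L, L // 3, 1000)
        print(L, f"{float(draws):.3e}")
```

Because the covered fraction shrinks exponentially in L whenever the radius stays a fixed fraction of L below 3/4, the expected number of draws grows exponentially unless the number of centers grows exponentially as well.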

Remark about techniques

An important aspect of our work is that we establish our results using elementary techniques for the analysis of Markov chains. More advanced mathematical machinery, such as martingales [49] or drift analysis [50], [51], could possibly be used to derive more refined results. In this work our goal is to distinguish between exponential and polynomial time; whether the techniques from [49]–[51] can lead to a more refined characterization within polynomial time is an interesting direction for future work.

Supporting Information

Text S1

Detailed proofs for “The Time Scale of Evolutionary Innovation.”

(PDF)

Acknowledgments

We thank Nick Barton and Daniel Weissman for helpful discussions and pointing us to relevant literature.

Funding Statement

Austrian Science Fund (FWF) Grant No P23499-N23, FWF NFN Grant No S11407-N23 (RiSE), ERC Start grant (279307: Graph Games), and Microsoft Faculty Fellows award. Support from the John Templeton Foundation is gratefully acknowledged. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

References

  • 1. Allwood AC, Grotzinger JP, Knoll AH, Burch IW, Anderson MS, et al. (2009) Controls on development and diversity of early archean stromatolites. Proc Natl Acad Sci USA 106: 9548–9555.
  • 2. Schopf JW (August 2006) The first billion years: When did life emerge? Elements 2: 229–233.
  • 3. Kimura M (1968) Evolutionary rate at the molecular level. Nature 217: 624–626.
  • 4. Ewens WJ (1967) The probability of survival of a new mutant in a fluctuating environment. Heredity 22: 438–443.
  • 5. Barton NH (1995) Linkage and the limits to natural selection. Genetics 140: 821–41.
  • 6. Campos PR (2004) Fixation of beneficial mutations in the presence of epistatic interactions. Bull Math Biol 66: 473–486.
  • 7. Antal T, Scheuring I (2006) Fixation of strategies for an evolutionary game in finite populations. Bull Math Biol 68: 1923–1944.
  • 8. Whitlock MC (2003) Fixation probability and time in subdivided populations. Genetics 164: 767–779.
  • 9. Altrock PM, Traulsen A (2009) Fixation times in evolutionary games under weak selection. New J Phys 11. doi:10.1088/1367-2630/11/1/013012
  • 10. Kimura M, Ohta T (1969) Average number of generations until fixation of a mutant gene in a finite population. Genetics 61: 763–771.
  • 11. Johnson T, Gerrish P (2002) The fixation probability of a beneficial allele in a population dividing by binary fission. Genetica 115: 283–287.
  • 12. Orr HA (2000) The rate of adaptation in asexuals. Genetics 155: 961–968.
  • 13. Wilke CO (2004) The speed of adaptation in large asexual populations. Genetics 167: 2045–2053.
  • 14. Desai MM, Fisher DS, Murray AW (2007) The speed of evolution and maintenance of variation in asexual populations. Curr Biol 17: 385–394.
  • 15. Ohta T (1972) Population size and rate of evolution. J Mol Evol 1: 305–314.
  • 16. Papadimitriou C (1994) Computational complexity. Addison-Wesley.
  • 17. Cormen T, Leiserson C, Rivest R, Stein C (2009) Introduction to Algorithms. MIT Press.
  • 18. Valiant LG (2009) Evolvability. J ACM 56: 3:1–3:21.
  • 19. Maynard Smith J (1970) Natural selection and the concept of a protein space. Nature 225: 563–564.
  • 20. Fontana W, Schuster P (1987) A computer model of evolutionary optimization. Biophys Chem 26: 123–147.
  • 21. Fontana W, Schuster P (1998) Continuity in evolution: On the nature of transitions. Science 280: 1451–1455.
  • 22. Eigen M, McCaskill J, Schuster P (1988) Molecular quasi-species. J Phys Chem 92: 6881–6891.
  • 23. Eigen M, Schuster P (1978) The hypercycle. Naturwissenschaften 65: 7–41.
  • 24. Park SC, Simon D, Krug J (2010) The speed of evolution in large asexual populations. J Stat Phys 138: 381–410.
  • 25. Derrida B, Peliti L (1991) Evolution in a flat fitness landscape. Bull Math Biol 53: 355–382.
  • 26. Stadler PF (2002) Fitness landscapes. Appl Math & Comput 117: 187–207.
  • 27. Worden RP (1995) A speed limit for evolution. J Theor Biol 176: 137–152.
  • 28. Crow JF, Kimura M (1965) Evolution in sexual and asexual populations. Am Nat 99: 439–450.
  • 29. Gillespie JH (1984) Molecular evolution over the mutational landscape. Evolution 38: 1116–1129.
  • 30. Kauffman S, Levin S (1987) Towards a general theory of adaptive walks on rugged landscapes. Journal of Theoretical Biology 128: 11–45.
  • 31. Orr HA (2000) A minimum on the mean number of steps taken in adaptive walks. Journal of Theoretical Biology 220: 241–247.
  • 32. Weinreich DM, Watson RA, Chao L (2005) Perspective: Sign epistasis and genetic constraint on evolutionary trajectories. Evolution 59: 1165–1174.
  • 33. Poelwijk FJ, Kiviet DJ, Weinreich DM, Tans SJ (2007) Empirical fitness landscapes reveal accessible evolutionary paths. Nature 445: 383–386.
  • 34. Woods RJ, Barrick JE, Cooper TF, Shrestha U, Kauth MR, et al. (2011) Second-order selection for evolvability in a large Escherichia coli population. Science 331: 1433–1436.
  • 35. Jimenez JI, Xulvi-Brunet R, Campbell GW, Turk-MacLeod R, Chen IA (2013) Comprehensive experimental fitness landscape and evolutionary network for small RNA. Proc Natl Acad Sci USA 110(37): 14984–9.
  • 36. Whitman WB, Coleman DC, Wiebe WJ (1998) Prokaryotes: The unseen majority. Proc Natl Acad Sci USA 95: 6578–6583.
  • 37. Smith JM (1974) Recombination and the rate of evolution. Genetics 78: 299–304.
  • 38. Crow JF, Kimura M (1970) An introduction to population genetics theory. Burgess Publishing Company.
  • 39. Park SC, Krug J (2013) Rate of adaptation in sexuals and asexuals: A solvable model of the Fisher–Muller effect. Genetics 195: 941–955.
  • 40. de Visser JAGM, Park S, Krug J (2009) Exploring the effect of sex on empirical fitness landscapes. The American Naturalist 174: S15–S30.
  • 41. Neher RA, Shraiman BI, Fisher DS (2010) Rate of adaptation in large sexual populations. Genetics 184: 467–481.
  • 42. Weissman DB, Hallatschek O (2014) The rate of adaptation in large sexual populations with linear chromosomes. Genetics 196: 1167–1183.
  • 43. Ohno S (1970) Evolution by gene duplication. Springer-Verlag.
  • 44. Jacob F (1977) Evolution and tinkering. Science 196: 1161–1166.
  • 45. Tautz D, Domazet-Lošo T (2011) The evolutionary origin of orphan genes. Nat Rev Genet 12: 692–702.
  • 46. Bartel D, Szostak J (1993) Isolation of new ribozymes from a large pool of random sequences. Science 261: 1411–1418.
  • 47. Leconte AM, Dickinson BC, Yang DD, Chen IA, Allen B, et al. (2013) A population-based experimental model for protein evolution: Effects of mutation rate and selection stringency on evolutionary outcomes. Biochemistry 52: 1490–1499.
  • 48. Povolotskaya IS, Kondrashov FA (2010) Sequence space and the ongoing expansion of the protein universe. Nature 465: 922–926.
  • 49. Williams D (1991) Probability with Martingales. Cambridge mathematical textbooks. Cambridge University Press.
  • 50. Hajek B (1982) Hitting-time and occupation-time bounds implied by drift analysis with applications. Advances in Applied Probability 14: 502–525.
  • 51. Lehre PK, Witt C (2013) General drift analysis with tail bounds. CoRR abs/1307.2559.

Articles from PLoS Computational Biology are provided here courtesy of PLOS
