Abstract
The existence of complex (multiple-step) genetic adaptations that are ‘irreducible’ (i.e., all partial combinations are less fit than the original genotype) is one of the longest standing problems in evolutionary biology. In standard genetics parlance, these adaptations require the crossing of a wide adaptive valley of deleterious intermediate stages. Here we demonstrate, using a simple model, that evolution can cross wide valleys to produce ‘irreducibly complex’ adaptations by making use of previously cryptic mutations. When revealed by an evolutionary capacitor, previously cryptic mutants have higher initial frequencies than do new mutations, bringing them closer to a valley-crossing saddle in allele frequency space. Moreover, simple combinatorics imply an enormous number of candidate combinations exist within available cryptic genetic variation. We model the dynamics of crossing of a wide adaptive valley after a capacitance event using both numerical simulations and analytical approximations. Although individual valley crossing events become less likely as valleys widen, by taking the combinatorics of genotype space into account, we see that revealing cryptic variation can cause the frequent evolution of complex adaptations.
Keywords: Complex adaptation, adaptive valley, evolutionary capacitance, theoretical population genetics, Moran model
Introduction
When a population is well adapted to its environment, the vast majority of new mutations will be neutral or negative. If a higher fitness genotype exists that requires multiple mutations, but each intermediate mutation combination is deleterious, the population must traverse a metaphorical “adaptive valley” of low fitness to access the superior adaptation (Wright 1932). Such adaptations are called “irreducibly complex” by the intelligent design lobby, which uses the term to assert that evolution cannot cross multi-step adaptive valleys. Detailed investigations into the evolution of specific complex adaptations (Bridgham et al. 2006; Weinreich et al. 2006; Poelwijk et al. 2007; Egelman 2010) have shown that in these particular cases, evolved complexity is not “irreducible”. Many biologists assume, in agreement with the intelligent design lobby, that irreducible complexity rarely, if ever, evolves (e.g. Weinreich et al. (2006)).
In fact, valley crossing in asexual populations is both possible and relatively well understood (Van Nimwegen and Crutchfield 2000; Weissman et al. 2009). In small populations, individually deleterious mutations may fix sequentially by drift (Wright 1932). In large populations, fit multiple-mutants occasionally appear even when the component mutations are rare. This process is called ‘stochastic tunneling’ (Carter and Wagner 2002; Komarova et al. 2003; Iwasa et al. 2004; Weinreich and Chao 2005; Burton and Travis 2008). Weakly deleterious mutations may also act as stepping stones across deeper adaptive valleys (Covert et al. 2013).
The evolution of complex adaptations is more problematic in sexual populations because of genetic recombination. While recombination can facilitate complex adaptation by bringing together mutations from different lineages into a single individual (Fisher 1930; Muller 1932), it also breaks up beneficial combinations, rendering the crossing of even narrow valleys impossible (Crow and Kimura 1965; Eshel and Feldman 1970; Karlin and McGregor 1971). At low frequencies, mutations required for a given complex adaptation are almost always present separately, where selection acts against them. Rare individuals carrying a complex adaptation are unlikely to mate with other such (rare) individuals, and so produce maladapted offspring. In large populations the situation is particularly dire, as mutations are kept even rarer by more efficient selection. Thus, barring tiny effective population sizes or large mutation rates, high rates of recombination prevent valley crossing (Weissman et al. 2010). This raises the question: is it possible for sexual populations to produce irreducibly complex adaptations at all?
One mechanism that may allow the evolution of complex adaptations is the revelation of cryptic variation (Phillips 1996; Hansen et al. 2000; Masel 2006; Kim 2007), via a phenomenon known as evolutionary capacitance (Bergman and Siegal 2003; Masel 2005, 2013; Schlichting 2008; Masel and Trotter 2010). When the environment changes and organisms are stressed, evolutionary capacitors switch the status of genetic variation from “off” (phenotypically cryptic) to “on”. After revelation by a capacitor, this previously phenotypically silent genetic variation can acquire fitness consequences, producing a burst of “new” genotypic effects that are potentially adaptive in the new environment. A growing body of both theoretical and laboratory work suggests that such revelation events are a common feature of biological systems (Bergman and Siegal 2003; Jarosz and Lindquist 2010; Tirosh et al. 2010; Hayden et al. 2011; Freddolino et al. 2012; Janssens et al. 2013; Rohner et al. 2013; Siegal 2013; Takahashi 2013).
The rate of complex adaptation is ultimately determined by a combination of two factors: the number of adaptive allelic combinations available to the population, and the probability that any given adaptive combination will go to fixation:
The argument that crypticity can facilitate complex adaptation is twofold. First, cryptic mutations attain higher allele frequencies than they would if selection against them were operating at full strength. Since allele frequency must exceed a threshold before recombination produces peak genotypes more often than it breaks them up (Weinreich and Chao 2005), high initial frequencies of cryptic mutants can give adaptation a head start across a valley (Kim 2007). This head start will result in an increase in the value of the third factor in the expression above.
Second, revelation of cryptic variants allows the population to sample genotypes from many new and different parts of genotype space simultaneously. This will result in a drastic increase in the value of the first factor in the expression above. The new genotypes will mostly fall into low fitness valleys, but may, on rare occasions, hit upon new adaptive peaks. As the number of newly exposed mutant loci increases, the number of ways to combine those loci to form potential complex adaptations can become enormous. In other words, while any given complex adaptation is unlikely to fix, simple combinatorics imply that an enormous number of candidate combinations exist within available cryptic genetic variation, any one of which might fix. As well as increasing the (still low) likelihood of any given valley crossing event, crypticity multiplies the number of possible valley crossing events dramatically. In short, many more complex than simple possibilities exist in genotype space. This combinatoric argument has occasionally been made verbally (Fisher 2007, p439; Weissman et al. 2009) but has never been formalized.
Here we show, using a simple population genetic model, that irreducibly complex adaptations can arise and fix under biologically reasonable conditions. We model crossing a fitness valley (Figure 1) in two stages, corresponding to ‘before’ and ‘after’ an event in which environmental change prompts revelation via evolutionary capacitance. In the ‘before’ stage, we use an infinite sites model, assuming independence between sites, so that at the moment of environmental change and revelation, there are on average 2Nμτ polymorphic loci, where 2Nμ is the rate of introduction of mutant sites, and τ is the expected sojourn time of a new mutation given environmental stasis. The ~2Nμτ derived alleles are slightly deleterious, and typically found at low frequency relative to the reference ancestral allele at each site. We consider valley crossing from the genotype containing all ancestral alleles to one containing some combination of j out of the ~2Nμτ derived alleles.
Figure 1.
A schematic example of a 4-step fitness valley before and after revelation of cryptic variation.
To do this, we first sample initial allele frequencies at j cryptic mutant sites, using distributions calculated using the Moran model (Masel 2006). Second, we expose those j loci to stronger selection, as expected after revelation by a capacitor. Assuming that the mutations are individually deleterious, but in full combination confer an adaptive advantage in the new environment, we evaluate whether or not the population fixes all j mutant alleles. Simulating many such valley crossing attempts, and counting the successes, yields the probability of crossing a given j step valley after a capacitance event. Each valley is defined by its “width” (number of mutant sites, j), its “depth” (the strength of selection against individual revealed mutations, svalley), and its “height” (the strength of selection in favor of the new peak genotype, speak).
For simplicity, we neglect any fitness effects at other loci. Each additional mutant site incurs a fitness decrement regardless of crypticity, but this decrement is smaller when a mutation is in a cryptic state. Crypticity effectively flattens the valley, allowing the population to spread further across genotype space than would be possible under full-strength selection.
The probability of valley crossing depends on the parameters (j, svalley, speak) associated with the valley itself, as well as population size (N), mutation rate (μ), and selection against new mutations while they are cryptic (scryptic_valley). Finally, we multiply the probability of crossing a given j-step valley by the expected number of available j-step genotypes (as derived in Masel (2006)), each of which might be a peak with low probability ε. A schematic illustrating the steps in our simulation model can be seen in Figure 2.
Figure 2.
A schematic illustration of the steps in our models.
Although individual valley crossing events become less probable as valleys widen, by taking the combinatorics of genotype space into account, we find that revealing cryptic variation can cause the frequent evolution of complex adaptations. We also present analytical approximations that agree with our simulation results, providing additional support for and insight into our findings.
Methods
Our model follows a population of N diploid, randomly mating individuals with discrete generations. We assume infinite sites with total mutation rate μ, and free recombination. Fitness effects are multiplicative between sites, and the dominance coefficient is h = 0.5. Let selection against mutant alleles while in the cryptic state be scryptic_valley, and while revealed, svalley. Selection for adaptive peak genotypes is speak. Where required for brevity, we will write these as scv, sv, and sp. At cryptic sites, we assume weakened selection such that svalley > scryptic_valley > 1/2N. We use both numerical simulations and analytical approximations to estimate the expected number of complex fixations produced after the revelation of variation by a capacitor, under a wide range of valley and population parameters.
Simulations
Initial “Revealed” Allele Frequencies
For a given population size, allele frequencies at each of the j sites at the time of revelation were sampled from the distribution of expected sojourn times, with . The expected sojourn times τi during which the allele frequency is i and the total time τ during which it has not yet become fixed or gone extinct, given that it entered the population as a single new mutation, are calculated from the Moran model, as follows. Strictly speaking, the Moran model applies to haploids only. However, the diploid case can be closely approximated by a model of 2N haploids. Before revelation, effects at different sites are independent, with dominance coefficient h = 0.5: these assumptions are necessary in order to approximate a diploid population using the Moran model. Given that homozygosity of mutant cryptic alleles is extremely rare, relaxing the codominance assumption is not expected to be important.
At each time step before revelation, one haploid individual is randomly chosen to die and one to reproduce. 2N such time steps comprise one “generation”. For a given polymorphic site at time t, i copies of the mutant allele will be present in the population, each contributing a factor 1-scv to the fitness of the individual that carries it. The evolutionary dynamics are dominated by the selection coefficient on one copy of a mutant allele, set here to be scv. We model fitness as differential reproduction. Under the infinite sites model, we can neglect recurrent and back mutations, so the next reproducing individual has the mutant allele with probability:
$$\frac{i(1-s_{cv})}{i(1-s_{cv}) + (2N-i)}$$
The probability that the mutant allele appears in the next haploid individual chosen to die is i/2N. The probability that the number of copies of the mutant allele increases from i by 1 is then given by the probability that a mutant haploid individual reproduces while a wild-type haploid individual dies:
$$P(i \to i+1) = \frac{i(1-s_{cv})}{i(1-s_{cv}) + (2N-i)} \cdot \frac{2N-i}{2N}$$
The probability that the number of mutants decreases from i by 1 is the probability that a mutant individual dies while a wild-type individual reproduces:
$$P(i \to i-1) = \frac{2N-i}{i(1-s_{cv}) + (2N-i)} \cdot \frac{i}{2N}$$
Following Ewens (2004, equation 2.158), let
The probability of fixation by drift starting from i individuals is then and the probability of fixation by drift starting from a single mutant individual is (Ewens 2004, equation 2.159). The sojourn time τi during which there are i descendants of a single original mutant is given by
and , (Ewens 2004, eq 2.144), where the unit of time is one generation or 2N rounds in the Moran model.
For each of j mutant sites, allele frequencies were sampled from the sojourn time distributions as above, to be used as the initial frequencies of “revealed” cryptic variants in the second part of our simulations. Alleles were assigned randomly to individuals, such that there was no initial linkage disequilibrium, and genotype frequencies at each site followed approximately Hardy-Weinberg proportions. Mean linkage disequilibrium is generally slightly negative in asexual populations, but of negligible magnitude in sexual populations (Kouyos et al. 2007).
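The sojourn-time calculation and the sampling of initial frequencies can be sketched numerically as follows. This is a minimal sketch under stated assumptions, not the authors' code: instead of the closed-form expressions from Ewens (2004), it computes expected visit counts with the standard fundamental-matrix method for absorbing Markov chains, which should give the same sojourn times. The function names (`moran_step_probs`, `sojourn_times`, `sample_initial_frequency`) are hypothetical, and sampling copy number i with probability proportional to τi is our reading of the sampling step described above.

```python
import numpy as np

def moran_step_probs(i, two_n, s):
    """Return (p_up, p_down) for a site with i mutant copies among two_n haploids."""
    # Probability that the next reproducing individual carries the mutant allele.
    p_mut_repro = i * (1.0 - s) / (i * (1.0 - s) + (two_n - i))
    p_up = p_mut_repro * (two_n - i) / two_n      # mutant reproduces, wild type dies
    p_down = (1.0 - p_mut_repro) * i / two_n      # wild type reproduces, mutant dies
    return p_up, p_down

def sojourn_times(two_n, s):
    """Expected time (in generations) spent at i = 1..two_n-1 copies,
    starting from a single new mutant and stopping at loss or fixation."""
    m = two_n - 1                                  # number of transient states
    q = np.zeros((m, m))                           # transitions among transient states
    for idx in range(m):
        up, down = moran_step_probs(idx + 1, two_n, s)
        q[idx, idx] = 1.0 - up - down
        if idx + 1 < m:
            q[idx, idx + 1] = up
        if idx > 0:
            q[idx, idx - 1] = down
    # First row of the fundamental matrix (I - Q)^-1: expected visits from state 1.
    start = np.zeros(m)
    start[0] = 1.0
    visits = np.linalg.solve((np.eye(m) - q).T, start)
    return visits / two_n                          # two_n Moran steps = one generation

def sample_initial_frequency(two_n, s, rng):
    """Assumption: copy number i is drawn with probability tau_i / tau."""
    tau_i = sojourn_times(two_n, s)
    i = rng.choice(np.arange(1, two_n), p=tau_i / tau_i.sum())
    return i / two_n
```

In the neutral limit (s = 0) this sketch returns a sojourn time of 1/i generations at i copies, which provides a quick sanity check. For large 2N, a sparse or closed-form solution would be preferable to the dense linear solve used here.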
Selection after revelation of cryptic variants
We model two different dominance scenarios. First, we assume the mutant alleles to be recessive for their adaptive function, such that only individuals with two mutant alleles at each site acquire the new peak fitness. This is the most conservative case of valley crossing, which we call a recessive peak. In the case of a dominant peak, individuals with at least one mutant allele at each site acquire the fitness peak. Thus, dominant adaptive peaks are more likely to fix.
After revelation, the j-site genotypes were assigned new fitness values based on the number of copies of mutant alleles. In the case of a recessive peak:
In the case of a dominant peak:
Note that individuals with no mutant alleles maintain a fitness of 1 both before and after revelation.
Each generation of our model has three steps: recombination/reproduction, viability selection and drift.
In the reproduction step, each genotype contributes gametes to an infinite gamete pool according to its frequency in the population, with free recombination. For a system with j mutant loci, we have k = 3^j multilocus genotypes capable of producing 2^j possible gametes. After recombination, new genotype frequencies were generated by random union of gametes. Viability selection was then applied to the new genotypes based on genotype fitnesses, such that for multisite genotype frequencies g1, g2, ..., gk with corresponding fitnesses w1, w2, ..., wk, the frequency after viability selection is given by:
$$g_i' = \frac{g_i w_i}{\sum_{m=1}^{k} g_m w_m}$$
Finally, genotype frequencies were resampled from a multinomial distribution to simulate sampling effects due to finite population size.
Once any one of the j mutant alleles went extinct, the j-site peak became inaccessible and so the simulation was stopped and the population recorded as fixed for the original wild type peak. The population was considered fixed for the new peak if all individuals in the population carried two copies of mutant alleles at all j sites.
For each combination of scv, sv, and sp, the population was iterated to fixation/extinction from at least 10^5 sets of initial allele frequencies and, in the case of deeper valleys, from up to 10^7 sets (so as to detect even very rare fixations). This probability is then multiplied by the expected number of newly available j-step genotypes following a capacitance event, to arrive at our final estimate of the expected number of complex fixations (E[fixations]).
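A single valley-crossing attempt after revelation can be sketched as below. This is an illustrative sketch, not the authors' implementation. Two assumptions are ours: the fitness scheme (peak genotypes receive 1 + speak, every other genotype pays a multiplicative cost of 1 - svalley per mutant allele copy), and the initial genotype frequencies, which are set to exact Hardy-Weinberg and linkage-equilibrium proportions rather than by assigning alleles randomly to individuals. The names `genotype_fitness` and `crossing_attempt` are hypothetical.

```python
import itertools
import numpy as np

def genotype_fitness(geno, s_valley, s_peak, dominant):
    """Assumed fitness scheme: peak genotypes get 1 + s_peak; all others
    pay a multiplicative cost (1 - s_valley) per mutant allele copy."""
    needed = 1 if dominant else 2          # copies per site required for the peak
    if all(c >= needed for c in geno):
        return 1.0 + s_peak
    return (1.0 - s_valley) ** sum(geno)

def crossing_attempt(init_freqs, pop_size, s_valley, s_peak, dominant, rng,
                     max_generations=100_000):
    """Return True if all j mutant alleles fix, False if any one is lost."""
    j = len(init_freqs)
    genotypes = list(itertools.product((0, 1, 2), repeat=j))   # 3^j genotypes
    gametes = list(itertools.product((0, 1), repeat=j))        # 2^j gametes
    index = {g: n for n, g in enumerate(genotypes)}
    copies = np.array(genotypes)                               # mutant copies per site
    fitness = np.array([genotype_fitness(g, s_valley, s_peak, dominant)
                        for g in genotypes])
    # P(gamete | genotype) under free recombination: sites segregate independently.
    transmit = np.array([[np.prod([c / 2 if a else 1 - c / 2
                                   for c, a in zip(geno, gam)])
                          for gam in gametes] for geno in genotypes])
    # Start in Hardy-Weinberg proportions with no linkage disequilibrium.
    hw = [np.array([(1 - x) ** 2, 2 * x * (1 - x), x ** 2]) for x in init_freqs]
    freqs = np.array([np.prod([hw[site][c] for site, c in enumerate(geno)])
                      for geno in genotypes])
    for _ in range(max_generations):
        gamete_freqs = freqs @ transmit                        # infinite gamete pool
        offspring = np.zeros(len(genotypes))
        for ga, pa in zip(gametes, gamete_freqs):              # random union of gametes
            for gb, pb in zip(gametes, gamete_freqs):
                offspring[index[tuple(a + b for a, b in zip(ga, gb))]] += pa * pb
        offspring *= fitness                                   # viability selection
        offspring /= offspring.sum()
        counts = rng.multinomial(pop_size, offspring)          # drift in N diploids
        mutant_copies = counts @ copies                        # allele counts per site
        if np.any(mutant_copies == 0):
            return False                                       # peak no longer reachable
        if np.all(mutant_copies == 2 * pop_size):
            return True                                        # new peak fixed
        freqs = counts / pop_size
    return False
```

Repeating `crossing_attempt` over many independently sampled sets of initial frequencies and averaging the outcomes gives a Monte Carlo estimate of the crossing probability for a given j-step valley. The explicit enumeration of 3^j genotypes keeps the sketch readable but limits it to small j.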
Expected number of potential peaks
The expected number of available polymorphic sites in the population is Poisson distributed around a mean of 2Nμτ (N,scryptic_valley) where 2Nμ is the rate of introduction of mutant sites, and τ is the expected sojourn time of a new mutation under the Moran model. The expected number of potential peaks (combinations of j alleles) available to the population is then (Masel 2006):
$$E[\text{potential peaks}] = \frac{(2N\mu\tau)^j}{j!} \qquad (1)$$
We assume that any one of the many possible combinations of j sites could be adaptive, each with low probability ε. All adaptation rates are proportional to the infinitesimal ε, which can be approximated as constant and hence factors out of their ratios unless ε scales extremely strongly with j.
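The counting argument can be illustrated with a short sketch. It relies on a standard property of the Poisson distribution: if the number of polymorphic sites is Poisson with mean λ = 2Nμτ, the expected number of distinct j-site combinations is λ^j / j!. The function names are hypothetical, and the result is expressed per unit ε, since ε is treated as a constant that factors out.

```python
from math import factorial

def expected_potential_peaks(two_n_mu, tau, j):
    """Expected number of j-site combinations among Poisson(2N*mu*tau) polymorphic sites."""
    lam = two_n_mu * tau                 # mean number of segregating cryptic sites
    return lam ** j / factorial(j)

def expected_fixations(p_cross, two_n_mu, tau, j):
    """Crossing probability for one j-step valley times the number of candidate valleys.
    Multiply by epsilon to obtain an absolute expectation."""
    return p_cross * expected_potential_peaks(two_n_mu, tau, j)
```

Because the number of candidates grows as the jth power of λ, a modest increase in the number of segregating sites can outweigh a large drop in the per-valley crossing probability.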
Analytical Approximations
We would like to find a mathematical approximation for the probability of irreducibly complex adaptation from formerly cryptic variation. In our model, this probability is the product of two factors: ε (the probability that a given combination of mutations is adaptive), and E[fixations], the expected number of combinations of j mutations that would fix if they were adaptive. We want to find this second factor. We do this by writing it as the probability that at the time that cryptic genetic variation is revealed, there is a set of j mutations present at frequencies x1≥ x2≥...xj, multiplied by the probability that they successfully fix (assuming that they are an adaptive combination), summed over all possible combinations of xi.
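$$E[\text{fixations}] = \sum_{x_1 \ge x_2 \ge \dots \ge x_j} P(x_1, \dots, x_j)\, P(\text{fixation} \mid x_1, \dots, x_j) \qquad (2)$$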
Finding the first factor in (2) is straightforward. Let ϕ(x) be the equilibrium site-frequency spectrum of the cryptic variation, meaning that the probability that there is a mutation with frequency in the infinitesimal range [x, x + dx] is ϕ(x)dx, or equivalently, that the expected number of mutations with frequencies between y1 and y2 is $\int_{y_1}^{y_2} \phi(x)\,dx$. Using the diffusion approximation, ϕ is given by (Ewens 2004, equation 9.23, adjusted for a Moran model):
| (3) |
Since all the mutations are independent, the joint spectrum is the product $\prod_{i=1}^{j} \phi(x_i)$. Note that since the diffusion approximation requires continuous allele frequencies, we must also change the sum in (2) to an integral when we substitute in (3).
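$$E[\text{fixations}] = \int_{x_1 \ge \dots \ge x_j} \prod_{i=1}^{j} \phi(x_i)\, P(\text{fixation} \mid x_1, \dots, x_j)\, dx_1 \cdots dx_j \qquad (4)$$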
Calculating the second factor in (2), the probability of fixation given the vector of initial frequencies, is harder. To make progress we first assume that the trajectory of the mutations will be entirely determined by selection once the variation is revealed, so that we can neglect stochastic effects. This will be accurate as long as mutations that start out at very low frequencies do not contribute much to the probability of complex adaptation. With this assumption, the probability of fixation is always 0 or 1, so instead of having to find and multiply it, we can take an integral over the basin of attraction of fixation (which we here call V) under deterministic expectations.
Thus, our approximation takes the form:
$$E[\text{fixations}] \approx \int_{V} \prod_{i=1}^{j} \phi(x_i)\, dx_1 \cdots dx_j \qquad (5)$$
However, finding the boundary of this basin of attraction is difficult, and in general must be done approximately. Let s̄i be the mean advantage of mutation i over the wild-type allele. Since x1 ≥ x2 ≥ ... ≥ xj, it follows that s̄1 ≤ s̄2 ≤ ··· ≤ s̄j (because the advantage of each mutation increases as the frequencies of the other mutations increase). Thus, a necessary condition for being in the basin of attraction of fixation is to have s̄j > 0 (i.e., for the rarest mutant to be initially favored), and a sufficient condition is to have s̄1 > 0 (i.e., for the most common mutant to be favored). We find the expected number of sets of mutations satisfying the latter condition for both recessive and dominant adaptive peaks, keeping in mind that this gives an underestimate of E[fixations].
Recessive adaptations
First consider the case in which the complex adaptation is recessive. In this case, the mean selective advantage of mutation i is:
If we assume sp ≫ sv (the peak is more advantageous than the valley is disadvantageous), this is approximately:
The condition for all s̄i to be positive is therefore
| (6) |
Using the region defined by (6) as an approximation for V, (2) becomes
| (7) |
where the lower bounds for the integrals are given by
and
Dominant adaptations
If the complex adaptation is dominant, the mean selective advantage of mutation i is
and if we assume sp ≫ sv, and that all mutations are initially at low frequencies
With these approximations, the condition for all s̄i to be positive is
The approximate expected number of potential combinations is still given by (7), but now the lower bounds on the integrals are given by
and
However, using these bounds can substantially underestimate V. This is because there is a large contribution from very small values of xj, where our approximations break down. (In contrast, for recessive adaptations, xj cannot drop too low, because then there would be no individuals homozygous for the jth mutation.) To account for finite population size, we must adjust the integral bounds to also include the requirement that xj > 1/2N, to avoid impossibly low allele frequencies.
We also need to have a better approximation for the volume of V that includes frequencies such that the most common mutant alleles are initially disfavored, but then switch to being favored as the rarer mutant alleles increase in frequency. Rather than requiring that the lowest frequency xj be high enough for all alleles to have an initial advantage, we can just require that it be within striking distance of this threshold. Specifically, we can require that once variation is revealed, it will increase to the value given by before the other xi decrease much. These other alleles will have a mean fitness disadvantage of at most sv, so we can make the approximation that xj has about 1/sv generations to reach the threshold before they can decrease significantly. During this time, we can approximate the jth allele’s advantage by , with the xk taken to be their initial values. The adjusted initial minimum value for xj is now
Using these approximations, numerical integration of (7) gives the dotted lines in Figure S1. For j ≥ 4, however, even more accurate approximations are needed, and this approach becomes impractical.
For the simplest case, j = 2, we can do better, and actually find an “exact” expression for the region in which the mutations will be driven to fixation:
Integrating ϕ(x1)ϕ(x2) over this region gives the solid-line value seen at j = 2 in our results figures.
Accounting for common mutations
In (7) above, we have let the frequency x1 of the most common mutant allele range all the way up to one. However, this is probably too generous: intuitively, if at the time when the environment begins to select for an adaptation one of the necessary mutations is already nearly fixed, that mutation should not count towards the complexity of the adaptation. There are several possible ways to make our results more conservative by removing the contribution of alleles that have nearly fixed prior to environmental change, just as we remove alleles whose fixation is complete. One stringent method is to include an upper limit to initial frequency, xmax ≪ 1. This will have little effect on the results as long as in the equations above. Equivalently, if we set j to be the number of mutations involved in an adaptation that are initially rare, j is likely to be limited by:
Note that this limit to complexity depends only weakly on the exact value that we choose for xmax. The above equation indicates that there is an intermediate level of complexity that can arise purely from initially rare mutations.
Another approach is to simply subtract off mutations’ individual probabilities of fixation in the absence of positive selection from j to determine an effective complexity. Thus, fixed mutations would contribute nothing to the complexity, and mutations frequent enough to be likely to fix by chance would contribute only a little, while an initially rare mutation would count as roughly one unit of complexity. Calculating fixation probabilities shows that this is approximately equivalent to introducing a maximum frequency of
For Nsv ≫ 1, xmax is quite close to one, so this is a small correction for all but extremely complex adaptations. Intuitively, this means that crypticity promotes complex adaptation by allowing strongly deleterious mutations to drift to high frequency, far higher than they could have if they were always exposed to selection, but generally not all the way up to near-fixation.
Results
We find that the probability of fixation for any given j sized adaptation is largest when crypticity is strong and j is small (Figure 3). For all sizes of adaptation, the probability of fixation increases as the level of crypticity increases. Thus, while fixing a given complex (large j) adaptation remains unlikely, it is demonstrably less unlikely under crypticity (Kim 2007).
Figure 3.
Probability of fixation for a given j-sized adaptation as a function of valley width, j. Line colors denote varying levels of crypticity, which we define to be the strength of selection on cryptic alleles relative to the full strength of selection on revealed variation. We define “strong” crypticity to be svalley = 10scryptic_valley and “medium” crypticity to be svalley = 5scryptic_valley. If the two selection coefficients are equal, crypticity is absent. For each value of j and scryptic_valley, we performed at least 10^5 simulations to calculate a Monte Carlo estimate of the probability that a given j-sized adaptation would fix. In the dominant case, we observed no fixations of size j=4 when crypticity was absent. In the recessive case, we only observed fixations of size j=4 when crypticity was strong.
Further, when one also takes into account the expected number of available j-step genotypes, we find that for a large subset of parameter value combinations, the expected number of fixation events (Figure 4) is much greater for complex adaptations than for simple one step adaptations, even given a recessive peak. In both the recessive and dominant cases illustrated in Figure 4, when crypticity becomes stronger (small Nscryptic_valley), a higher proportion of adaptations are complex. Higher values of j represent valleys that are more difficult to cross, but they also present the population with a larger number of possible peaks to sample. For strong crypticity, the latter effect dominates the results, leading to many complex fixation events.
Figure 4.
Expected numbers of fixed j-sized adaptations as a function of valley width. Line colors denote varying levels of crypticity, which we define to be the strength of selection on cryptic alleles relative to the full strength of selection on revealed variation. We define “strong” crypticity to be svalley = 10scryptic_valley and “medium” crypticity to be svalley = 5scryptic_valley. If the two selection coefficients are equal, crypticity is absent. For each value of j and scryptic_valley, we performed at least 10^5 simulations to calculate a Monte Carlo estimate of the probability that a given j-sized adaptation would fix, and multiplied this by the expected number of j-sized adaptations that could be created by recombination of existing polymorphic alleles.
The number of potential peaks depends on the jth power of the expected number of segregating mutant sites (Eq. 1), which in turn depends on the mutation rate, on the population size, and on the sojourn time. When crypticity is strong (Nscryptic_valley is small), mutations are nearly neutral and sojourn times are dominated by N. When Nscryptic_valley is large and crypticity is weak, selection shortens sojourn times. Exponential dependences can lead to an abrupt transition to the complex-adaptation regime, e.g. with a threshold value of Nscryptic_valley ~1 in the recessive case (see Figure S1). If selection on cryptic variation exceeds this threshold, sojourn times are so short that very few segregating sites exist at any given moment, and thus few potential peaks are available.
Our results are best summarized by looking at the proportion of all expected fixations that have j > 1. We estimate this proportion for a range of mutation rates and levels of crypticity, also taking into consideration the increase in the number of potential peaks with increasing j (Figure 5).
Figure 5.
Proportion of expected fixations that are complex (j > 1). Nspeak= 1000, Nsvalley = 10, j = 1...4. N was held constant at 10,000 while scryptic_valley and μ varied. Panel A: Adaptive peak is recessive. Panel B: Adaptive peak is dominant.
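The summary statistic plotted in Figure 5 can be expressed as a one-line calculation. The sketch below assumes that the expected number of fixations has already been estimated for each valley width j; the function name `proportion_complex` and its dictionary input are hypothetical conveniences, not part of the original analysis.

```python
def proportion_complex(expected_fixations_by_j):
    """Fraction of expected fixations with j > 1.

    expected_fixations_by_j maps valley width j to E[fixations] for that j.
    """
    total = sum(expected_fixations_by_j.values())
    if total == 0:
        return 0.0                        # no fixations expected at any width
    complex_part = sum(v for j, v in expected_fixations_by_j.items() if j > 1)
    return complex_part / total
```

Since every term is proportional to ε, this ratio does not depend on ε as long as ε is roughly constant across j.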
We see that when mutations are significantly cryptic (a definitional criterion in our scheme), complex adaptations dominate (i.e., we find >50% of observed fixation events have j > 1) when the product μN is high enough. The proportion of complex fixations is surprisingly insensitive to changes in valley depth (svalley) (Figures S2, S3), peak height (speak) (Figures S4, S5), and population size N (Figure S6). It is the availability of many potential adaptations, not the difficulty of valley-crossing, that drives our result.
The relevance of these results therefore depends on whether it is reasonable to assume high enough values of μN in natural populations. Consider, for example, Drosophila melanogaster. Estimates based on neutral diversity assign D. melanogaster an Ne ≈ 10^6 (Li and Stephan 2006). Our μ is a per-trait genomic mutation rate. If we take D. melanogaster’s per locus mutation rate (Haag-Liautard et al. 2007) to be on the order of 10^−6, and assume an average of 10 genes per trait, our estimated per-trait rate is about 10^−5. We then have μNe = 10, far inside the parameter range required for revelation of cryptic genetic variation to make valley-crossing a frequent source of adaptation. While mutation rates can vary substantially across species (Drake et al. 1998), data support some kind of scaling law that holds the product μNe relatively constant across sexual species (Lynch 2010; Sung et al. 2012), suggesting the broad applicability of our model. In other words, the relatively high effective population size of Drosophila is balanced by a relatively low mutation rate per germline reproduction event (Lynch 2010).
Note that a capacitance event may uncover only a subset of all possible cryptic variants. The number 10 of loci per trait should be interpreted as the number of loci uncovered by a capacitor, not as the total number of possible cryptic loci affecting a trait.
With the values N = 10^6, μ = 10^−5, and scryptic_valley = 10^−4, the expected number of available 4-step potential adaptations in our model is on the order of 10^8. While each complex adaptation remains individually unlikely, the sheer number of potential adaptations available after a capacitance event can override this individual rarity to make complexity commonplace.
To interpret the most extreme results above, involving the highest values of j, note that we have defined the complexity of an adaptation by how many mutation steps separate it from the ancestral genotype that includes all the mutations that fixed during the time that variation was cryptic. This means that an adaptation involving one mutation starting at 99% frequency and another at very low frequency is counted as having j=2, even though to an observer tracking the population it would appear that there was simply a selective sweep involving the rare mutation. In the Methods above, we describe alternative measures of the complexity of adaptations that may match better with what can be determined empirically. If all that can be measured are allele trajectories, then the most natural reference genotype has the majority allele at the time variation is revealed at every locus. With this more conservative definition, the explosion of complexity seen in Figure 4 for small values of Nscryptic_valley is cut off, and while adaptively irreducible complexity remains the norm, the expected number of fixations peaks at a moderate value of j. If, on the other hand, individual mutations’ fitness effects in the absence of the adaptation can also be measured, then the most natural reference genotype is essentially the one used above, and the extreme complexity reappears.
Discussion
We consider valley crossing occurring as an adaptive response to environmental change. At the moment of environmental change, the individually deleterious alleles involved in peak crossing are at intermediate frequencies given by a quasi-stationary distribution based on the Moran model. Our work is a proof of principle that valley crossing may be common, based on one particular scenario.
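As a rough illustration of how such intermediate frequencies arise, the following sketch simulates a two-allele haploid Moran process for a single deleterious allele. It is not the analytical quasi-stationary distribution used in the model; it assumes one-way recurrent mutation from wild type to mutant, a selection coefficient `s` below 1, and small hypothetical parameter values chosen so that it runs quickly. Frequencies are recorded once per generation (`n` birth-death events) and only while the mutant has not fixed, as a crude proxy for conditioning on non-fixation.

```python
import random


def moran_step(k, n, s, mu, rng):
    """One Moran birth-death event; k is the current number of mutant copies."""
    # Parent is drawn in proportion to fitness (mutant 1 - s, wild type 1).
    w_mut = k * (1.0 - s)
    parent_is_mutant = rng.random() < w_mut / (w_mut + (n - k))
    # Assumption: only wild-type offspring mutate, with probability mu.
    offspring_is_mutant = parent_is_mutant or rng.random() < mu
    # The individual that dies is drawn uniformly from the population.
    dying_is_mutant = rng.random() < k / n
    return k + int(offspring_is_mutant) - int(dying_is_mutant)


def sample_frequencies(n, s, mu, generations, seed=0):
    """Mutant frequencies sampled once per generation, skipping fixed states."""
    rng = random.Random(seed)
    k = 0
    freqs = []
    for _ in range(generations):
        for _ in range(n):
            k = moran_step(k, n, s, mu, rng)
        if k < n:  # record only while the mutant has not fixed
            freqs.append(k / n)
    return freqs


# Hypothetical parameters, far smaller than those discussed in the text.
freqs = sample_frequencies(n=200, s=0.02, mu=0.001, generations=500)
print(sum(freqs) / len(freqs))
```

A histogram of `freqs` gives an empirical approximation to the distribution of starting frequencies; realistic population sizes would call for the analytical treatment rather than direct simulation.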
Note that the “irreducibly complex” adaptations that we consider are defined as such at the molecular level, as combinations of mutations that are collectively advantageous, but deleterious unless all are present simultaneously. This definition cannot be mapped in a straightforward fashion onto phenotypes where quantitative rather than discrete traits are involved, e.g. the evolution of the vertebrate eye. Obviously, given the large uncertainty in the relevant parameters, we do not claim to have calculated a precise expected frequency of complex adaptations at the molecular level. But our assumptions would need to be wrong in a very substantial way in order to overturn our result. For example, if incrementing j reduces ε by many orders of magnitude, or if genomic rates of cryptic mutation are orders of magnitude below our assumed values, this might contradict our conclusions. Our theoretical contribution is a deductive one; given certain premises, we conclude that the crossing of wide adaptive valleys should be common. Indeed, Figure 4 shows that for many parameter values we expect complex adaptations to be much more common than simple ones. The primary outstanding empirical questions stemming from this work therefore concern whether our assumptions are supported by data.
Some of our assumptions are quite conservative, e.g. when we consider a recessive peak. As another example of conservatism, we use the Moran model, which assumes that accidents of sampling are the primary stochastic force in molecular evolution; if linkage to other sites under selection is the primary stochastic force, complex adaptations are much more common than calculated here (Neher and Shraiman 2011).
Real adaptations are likely to be clumped in genotype space, rather than being uniformly distributed as we have assumed (Ostman et al. 2011). The form of this clumping can have large effects on the probability of complex adaptation. Most obviously, if the fitness landscape is structured so that all adaptive j-mutant genotypes are next to adaptive (j-1)-mutant genotypes, etc., then there will be no irreducibly complex adaptations for evolution to find. While this extreme case seems unlikely (especially as fitness valleys have been observed empirically (da Silva et al. 2010)), it seems plausible that this kind of clumping could mean that the fraction of j-mutant genotypes that are irreducibly complex adaptations could be far smaller than the total fraction ε that are adaptive.
Clumping among adaptive j-mutant genotypes can also be important. While a population may explore, for example, thousands of quadruple-mutant genotypes (Fig. 2), many of these genotypes will share the one or two most frequent mutations in the population, i.e., the explored genotypes also clump in genotype space. This clumping in explored genotype space means that the potential clumping of adaptive genotypes would reduce the number of independent genotypes that the population “tests.” As an extreme case, if adaptive j-mutant genotypes tend to occur in clumps that are large compared to the region of genotype space explored by the population, then the population would effectively only try one potential irreducibly complex adaptation. On the other hand, this kind of clumping would effectively increase the selective coefficients of any complex adaptations that were found, meaning that they could succeed even if initially present at very low frequencies. These considerations show that ultimately, the rate of valley-crossing must depend on the shape of actual genotype-fitness maps. In cases where the rate can be estimated empirically, it could even serve as a probe of the shape of fitness landscapes, potentially far more sensitive and wide-ranging than any direct experimental fitness measurement.
Our model assumes free recombination, raising questions as to the effects of linkage and linkage disequilibrium. Our valleys are shallow while cryptic, making valley-crossing rates with high recombination approximately equal to rates under the no-recombination limit (Weissman et al. 2010, eq 2). Empirically, populations of RNA enzymes that had accumulated cryptic variation also adapted more rapidly to a new substrate, via a genotype with multiple changes (Hayden et al. 2011); these populations do recombine during PCR, but have lower recombination rates than obligately sexual populations. Thus, our findings of a strong positive effect of cryptic variation on valley crossing might not be restricted to freely recombining populations.
Thinking about evolution as a process often requires us to ignore our usual intuition about the threshold at which we consider improbable to become impossible. We have already become accustomed to considering the immensity of evolutionary time when we talk about adaptation. Perhaps now is the time to begin seriously considering the implications of the immensity of high-dimensional genotype space as well.
ACKNOWLEDGMENTS
J.M., M.V.T., G.I.P. and K.M.P. were supported by the National Institutes of Health (R01GM076041). J.M. is a Pew Scholar in the Biomedical Sciences, a Fellow at the Wissenschaftskolleg zu Berlin and is supported by NIH grant R01GM104040 and the John Templeton Foundation. D.B.W. was supported by ERC grant 250152 and the Simons Foundation. K.M.P. received support from the Undergraduate Biology Research Program at the University of Arizona. G.I.P. thanks Alex Lancaster for discussion.
Footnotes
Further supplementary figures are available in the Supplementary Materials, which include Figures S1–S6.
Literature Cited
- Bergman A, Siegal ML. Evolutionary capacitance as a general feature of complex gene networks. Nature. 2003;424:549–552. doi: 10.1038/nature01765.
- Bridgham JT, Carroll SM, Thornton JW. Evolution of Hormone-Receptor Complexity by Molecular Exploitation. Science. 2006;312:97–101. doi: 10.1126/science.1123348.
- Burton OJ, Travis JMJ. The frequency of fitness peak shifts is increased at expanding range margins due to mutation surfing. Genetics. 2008;179:941–950. doi: 10.1534/genetics.108.087890.
- Carter AJR, Wagner GP. Evolution of functionally conserved enhancers can be accelerated in large populations: a population–genetic model. Proceedings of the Royal Society of London. Series B: Biological Sciences. 2002;269:953–960. doi: 10.1098/rspb.2002.1968.
- Covert AW, Lenski RE, Wilke CO, Ofria C. Experiments on the role of deleterious mutations as stepping stones in adaptive evolution. PNAS. 2013;110:E3171–E3178. doi: 10.1073/pnas.1313424110.
- Crow JF, Kimura M. Evolution in Sexual and Asexual Populations. The American Naturalist. 1965;99:439–450.
- Da Silva J, Coetzer M, Nedellec R, Pastore C, Mosier DE. Fitness Epistasis and Constraints on Adaptation in a Human Immunodeficiency Virus Type 1 Protein Region. Genetics. 2010;185:293–303. doi: 10.1534/genetics.109.112458.
- Drake JW, Charlesworth B, Charlesworth D, Crow JF. Rates of Spontaneous Mutation. Genetics. 1998;148:1667–1686. doi: 10.1093/genetics/148.4.1667.
- Egelman EH. Reducing irreducible complexity: divergence of quaternary structure and function in macromolecular assemblies. Current Opinion in Cell Biology. 2010;22:68–74. doi: 10.1016/j.ceb.2009.11.007.
- Eshel I, Feldman MW. On the evolutionary effect of recombination. Theor Popul Biol. 1970;1:88–100. doi: 10.1016/0040-5809(70)90043-2.
- Ewens WJ. Mathematical Population Genetics: Theoretical introduction. Springer; 2004.
- Fisher DS. Course 11: Evolutionary dynamics. In: M. M., Jean-Philippe Bouchaud JD, editors. Proceedings of Les Houches Summer School. Elsevier; 2007. pp. 395–446.
- Fisher RA. The genetical theory of natural selection. Dover; New York: 1930.
- Freddolino PL, Goodarzi H, Tavazoie S. Fitness Landscape Transformation through a Single Amino Acid Change in the Rho Terminator. PLoS Genet. 2012;8:e1002744. doi: 10.1371/journal.pgen.1002744.
- Haag-Liautard C, Dorris M, Maside X, Macaskill S, Halligan DL, Houle D, Charlesworth B, Keightley PD. Direct estimation of per nucleotide and genomic deleterious mutation rates in Drosophila. Nature. 2007;445:82–85. doi: 10.1038/nature05388.
- Hansen TF, Carter AJR, Chiu C-H. Gene Conversion may aid Adaptive Peak Shifts. Journal of Theoretical Biology. 2000;207:495–511. doi: 10.1006/jtbi.2000.2189.
- Hayden EJ, Ferrada E, Wagner A. Cryptic genetic variation promotes rapid evolutionary adaptation in an RNA enzyme. Nature. 2011;474:92–95. doi: 10.1038/nature10083.
- Iwasa Y, Michor F, Nowak MA. Stochastic Tunnels in Evolutionary Dynamics. Genetics. 2004;166:1571–1579. doi: 10.1534/genetics.166.3.1571.
- Janssens H, Crombach A, Richard Wotton K, Cicin-Sain D, Surkova S, Lu Lim C, Samsonova M, Akam M, Jaeger J. Lack of tailless leads to an increase in expression variability in Drosophila embryos. Developmental Biology. 2013;377:305–317. doi: 10.1016/j.ydbio.2013.01.010.
- Jarosz DF, Lindquist S. Hsp90 and Environmental Stress Transform the Adaptive Value of Natural Genetic Variation. Science. 2010;330:1820–1824. doi: 10.1126/science.1195487.
- Karlin S, McGregor J. On mutation selection balance for two-locus haploid and diploid populations. Theor Popul Biol. 1971;2:60–70. doi: 10.1016/0040-5809(71)90005-0.
- Kim Y. Rate of adaptive peak shifts with partial genetic robustness. Evolution. 2007;61:1847–1856. doi: 10.1111/j.1558-5646.2007.00166.x.
- Komarova NL, Sengupta A, Nowak MA. Mutation-selection networks of cancer initiation: tumor suppressor genes and chromosomal instability. Journal of Theoretical Biology. 2003;223:433–450. doi: 10.1016/s0022-5193(03)00120-6.
- Kouyos RD, Silander OK, Bonhoeffer S. Epistasis between deleterious mutations and the evolution of recombination. Trends in Ecology & Evolution. 2007;22:308–315. doi: 10.1016/j.tree.2007.02.014.
- Li H, Stephan W. Inferring the Demographic History and Rate of Adaptive Substitution in Drosophila. PLoS Genetics. 2006;2:e166. doi: 10.1371/journal.pgen.0020166.
- Lynch M. Evolution of the mutation rate. Trends Genet. 2010;26:345–352. doi: 10.1016/j.tig.2010.05.003.
- Masel J. Cryptic Genetic Variation Is Enriched for Potential Adaptations. Genetics. 2006;172:1985–1991. doi: 10.1534/genetics.105.051649.
- Masel J. Evolutionary capacitance may be favored by natural selection. Genetics. 2005;170:1359–1371. doi: 10.1534/genetics.105.040493.
- Masel J. Q&A: Evolutionary capacitance. BMC Biology. 2013;11:103. doi: 10.1186/1741-7007-11-103.
- Masel J, Trotter MV. Robustness and evolvability. Trends Genet. 2010;26:406–414. doi: 10.1016/j.tig.2010.06.002.
- Muller HJ. Some Genetic Aspects of Sex. The American Naturalist. 1932;66:118–138.
- Neher RA, Shraiman BI. Genetic Draft and Quasi-Neutrality in Large Facultatively Sexual Populations. Genetics. 2011;188:975–996. doi: 10.1534/genetics.111.128876.
- Ostman B, Hintze A, Adami C. Critical properties of complex fitness landscapes. Proceedings of the ALife XII Conference; MIT Press; 2011. pp. 126–132.
- Phillips PC. Waiting for a Compensatory Mutation: Phase Zero of the Shifting-Balance Process. Genetics Research. 1996;67:271–283. doi: 10.1017/s0016672300033759.
- Poelwijk FJ, Kiviet DJ, Weinreich DM, Tans SJ. Empirical fitness landscapes reveal accessible evolutionary paths. Nature. 2007;445:383–386. doi: 10.1038/nature05451.
- Rohner N, Jarosz DF, Kowalko JE, Yoshizawa M, Jeffery WR, Borowsky RL, Lindquist S, Tabin CJ. Cryptic Variation in Morphological Evolution: HSP90 as a Capacitor for Loss of Eyes in Cavefish. Science. 2013;342:1372–1375. doi: 10.1126/science.1240276.
- Schlichting CD. Hidden Reaction Norms, Cryptic Genetic Variation, and Evolvability. Annals of the New York Academy of Sciences. 2008;1133:187–203. doi: 10.1196/annals.1438.010.
- Siegal ML. Crouching variation revealed. Mol Ecol. 2013;22:1187–1189. doi: 10.1111/mec.12195.
- Sung W, Ackerman MS, Miller SF, Doak TG, Lynch M. Drift-barrier hypothesis and mutation-rate evolution. PNAS. 2012;109:18488–18492. doi: 10.1073/pnas.1216223109.
- Takahashi KH. Multiple capacitors for natural genetic variation in Drosophila melanogaster. Molecular Ecology. 2013;22:1356–1365. doi: 10.1111/mec.12091.
- Tirosh I, Reikhav S, Sigal N, Assia Y, Barkai N. Chromatin regulators as capacitors of interspecies variations in gene expression. Mol Syst Biol. 2010;6 doi: 10.1038/msb.2010.84.
- Van Nimwegen E, Crutchfield J. Metastable Evolutionary Dynamics: Crossing Fitness Barriers or Escaping via Neutral Paths? Bulletin of Mathematical Biology. 2000;62:799–848. doi: 10.1006/bulm.2000.0180.
- Weinreich DM, Chao L. Rapid evolutionary escape by large populations from local fitness peaks is likely in nature. Evolution. 2005;59:1175–1182.
- Weinreich DM, Delaney NF, DePristo MA, Hartl DL. Darwinian Evolution Can Follow Only Very Few Mutational Paths to Fitter Proteins. Science. 2006;312:111–114. doi: 10.1126/science.1123539.
- Weissman DB, Desai MM, Fisher DS, Feldman MW. The rate at which asexual populations cross fitness valleys. Theor Popul Biol. 2009;75:286–300. doi: 10.1016/j.tpb.2009.02.006.
- Weissman DB, Feldman MW, Fisher DS. The rate of fitness-valley crossing in sexual populations. Genetics. 2010;186:1389–1410. doi: 10.1534/genetics.110.123240.
- Wright S. The roles of mutation, inbreeding, cross-breeding and selection in evolution. Proceedings of the Sixth International Congress on Genetics. 1932:356–366.