Significance
The role of gene interactions in the response to selection has long been a controversial subject; whereas some have claimed they are not relevant for adaptation, others have argued that their long-term effects are of high significance. In this manuscript, we derive simple and general predictions for the effect of gene interactions on the long-term response to selection in two extreme regimes. We show that, when the dynamics of allele frequencies are dominated by genetic drift, the long-term response is surprisingly simple, depending only on the initial components of the trait variance, regardless of the detailed genetic architecture. In the opposite regime, when selection dominates the dynamics of allele frequencies, the long-term response depends only on the genotype−phenotype map.
Keywords: epistasis, genetic variance, adaptation, genetic drift
Abstract
The role of gene interactions in the evolutionary process has long been controversial. Although some argue that they are not of importance, because most variation is additive, others claim that their effect in the long term can be substantial. Here, we focus on the long-term effects of genetic interactions under directional selection assuming no mutation or dominance, and that epistasis is symmetrical overall. We ask by how much the mean of a complex trait can be increased by selection and analyze two extreme regimes, in which either drift or selection dominates the dynamics of allele frequencies. In both scenarios, epistatic interactions affect the long-term response to selection by modulating the additive genetic variance. When drift dominates, we extend Robertson’s [Robertson A (1960) Proc R Soc Lond B Biol Sci 153(951):234−249] argument to show that, for any form of epistasis, the total response of a haploid population is proportional to the initial total genotypic variance. In contrast, the total response of a diploid population is increased by epistasis, for a given initial genotypic variance. When selection dominates, we show that the total selection response can only be increased by epistasis when some initially deleterious alleles become favored as the genetic background changes. We find a simple approximation for this effect and show that, in this regime, it is the structure of the genotype−phenotype map that matters and not the variance components of the population.
The relation between an organism’s genotype and its phenotype is immensely complicated, yet quantitative genetics predicts the correlations between relatives, and the response to selection over tens of generations, based primarily on an additive model. How do interactions between genes affect the response to selection? This question is not easy to answer: Although the additive model can be represented by a few parameters, there are an enormous number of possible relationships between genotype and phenotype. Some insight may come from studying specific well-understood systems, for example, regulation of gene expression by binding of transcription factors, or folding of RNA molecules. Here, we take the alternative approach, seeking statistical regularities, while making minimal assumptions about the nature of epistasis.
There has been a long-standing debate about the effect of genetic interactions on adaptation (1–7). Although some claim that they are unimportant because they contribute very little to the total genetic variance of a population, and consequently to its short-term response, others claim that their long-term effects can be substantial. At the heart of the debate is the distinction between what has been called physiological and statistical epistasis. The former is independent of the state of the population, whereas the latter is the statistical contribution of gene interactions to the trait variance, which depends on the allele frequencies (8, 9).
In the absence of new mutations, the total response to selection is limited by the initial standing variation. Selection acts on the variation present in the population, and will typically act to reduce it; the extent to which it does so depends on the allele frequencies, and on the relation between genotype and phenotype (i.e., the “genetic architecture”). Interactions between genes (epistasis) are a key component of this genetic architecture, and their effect on the response to selection has been controversial. On the one hand, artificially selected populations are usually well approximated by the infinitesimal model, which, in its simplest form, assumes infinitely many loci of small and additive effect (10, 11). This suggests that gene interactions do not play an important role. On the other hand, biology can hardly be additive, and genomes are finite. If the trait depended on a small number of additive alleles, these would quickly fix—contradicting the robust observation of sustained response to selection. One possibility is that epistasis sustains additive genetic variance for longer: Alleles that were initially deleterious or near-neutral may acquire favorable effects as the genetic background changes, “converting” epistatic variance into additive, and so prolonging the response to selection.
Note that, if the map between genotype and phenotype were arbitrary, no predictions could be made: The fittest genotype could have any value, and genotypes can be organized in any way over the fitness landscape. We will make a statistical argument, which assumes that epistasis primarily involves low-order interactions, and that these can be treated as, to some degree, independent of each other. In particular, we assume that there is no systematic tendency for alleles with a positive marginal effect to interact positively (or negatively), on average; if there were, the rate of evolution would clearly accelerate (or decelerate).
Although much effort has been spent measuring the sign of epistasis among new mutations (12, 13), or among segregating variants (14–16), it is not clear that this would tell us about the natural pattern over a wider range. A population under selection can reach trait values far beyond the initial trait variance it displays, so the epistasis among variants in the existing population may not predict longer-term evolution. Here, we analyze two extreme limits: the limit of very weak selection, in which drift dominates the dynamics of the genetic variances, and the deterministic limit, where selection is the only force. We investigate the effect of epistasis on the long-term response of a population, and the mechanistic basis for these effects.
Results
The Interaction Between Epistasis and Random Drift.
Robertson (17) showed that, in the weak selection limit $N_e s \ll 1$, where $N_e$ is the effective population size and $s$ is the selection coefficient at a single locus, the total response of an additive trait in a finite diploid population is simply proportional to the additive variance initially present, provided there is no dominance. He used the fact that the probability of fixation of an allele at initial frequency $p_0$ can be written as $P(p_0) = (1 - e^{-4 N_e s p_0})/(1 - e^{-4 N_e s})$. The total expected response, summed over all generations until every allele is lost or fixed, is $R_\infty = \sum_i 2\alpha_i [P(p_{0,i}) - p_{0,i}]$, where $\alpha_i$ is the additive effect of each allele on the trait. When selection is weak relative to random drift, this can be approximated as $R_\infty \approx \sum_i 4 N_e s_i \alpha_i p_{0,i} q_{0,i}$, with $q_{0,i} = 1 - p_{0,i}$. If the selection gradient on the trait is β, the selection favoring an allele with effect $\alpha_i$ is $s_i = \beta\alpha_i$, and so $R_\infty = 2 N_e \beta V_A(0) = 2 N_e R_1$, where $V_A(0) = \sum_i 2\alpha_i^2 p_{0,i} q_{0,i}$ is the initial additive genetic variance, and $R_1 = \beta V_A(0)$ is the response in the first generation.
This calculation is valid in the limit of weak selection at every locus, reflected in the fact that the probability of fixation is approximated by the first-order perturbation to neutrality. It is also only strictly valid when the trait is additive, excluding any form of interaction between genes. Indeed, it seems very hard to generalize to allow for epistasis, because, then, the calculation of the fixation probability of an allele depends on the trajectories of the alleles at all other loci it interacts with.
An alternative derivation is possible, which focuses on the change in variance components due to drift. Consider an additive trait that is determined by very many loci, each under negligibly weak selection. Drift will disperse allele frequencies, decreasing the additive variance by a factor $1 - 1/(2N_e)$ per generation. Therefore, the total selection response is just $\beta V_A(0) \sum_{t=0}^{\infty} [1 - 1/(2N_e)]^t = 2 N_e \beta V_A(0)$, as before. Both arguments show that the total response is proportional to the initial additive variance in the population (which, for an additive genetic architecture with no dominance, where heterozygotes display the mean phenotype of the homozygotes, corresponds to the total initial genetic variance). This is to be expected, because both assume weak selection at individual loci, so that the variance declines primarily through random drift.
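The two arguments above can be compared numerically. The following Python sketch assumes the standard diffusion approximation for the fixation probability of a diploid allele without dominance, $(1 - e^{-4N_e s p_0})/(1 - e^{-4N_e s})$, an additive variance of $\sum_i 2\alpha_i^2 p_i q_i$, and a per-generation variance decay of $1 - 1/(2N_e)$. The function names and parameter values are illustrative choices, not taken from the original analysis.

```python
import math

def fixation_probability(p0, n_e, s):
    """Diffusion approximation for a diploid allele with no dominance (assumed form)."""
    if s == 0.0:
        return p0  # neutral limit: fixation probability equals initial frequency
    return math.expm1(-4.0 * n_e * s * p0) / math.expm1(-4.0 * n_e * s)

def total_response_fixation(effects, freqs, n_e, beta):
    """Robertson-style sum over loci of 2 * alpha * (P_fix - p0), with s = beta * alpha."""
    return sum(2.0 * a * (fixation_probability(p, n_e, beta * a) - p)
               for a, p in zip(effects, freqs))

def additive_variance(effects, freqs):
    """Diploid additive variance without dominance: sum of 2 * alpha^2 * p * q."""
    return sum(2.0 * a * a * p * (1.0 - p) for a, p in zip(effects, freqs))

def total_response_variance_decay(v_a, n_e, beta, tol=1e-15):
    """Sum the per-generation response beta * V_A(t) while drift erodes V_A."""
    total, remaining = 0.0, 1.0
    while remaining > tol:
        total += beta * v_a * remaining
        remaining *= 1.0 - 1.0 / (2.0 * n_e)
    return total

# Illustrative values chosen so that N_e * s is small at every locus.
effects = [0.2, 0.5, 1.0, 0.8]
freqs = [0.1, 0.3, 0.5, 0.7]
n_e, beta = 50, 0.001
v_a = additive_variance(effects, freqs)
predicted = 2.0 * n_e * beta * v_a  # 2 * N_e times the first-generation response
by_fixation = total_response_fixation(effects, freqs, n_e, beta)
by_decay = total_response_variance_decay(v_a, n_e, beta)
print(predicted, by_fixation, by_decay)
```

With these values the fixation-probability sum agrees with $2N_e\beta V_A$ to within a few percent, and the gap shrinks as $N_e s$ decreases, as expected for a first-order perturbation to neutrality.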
What is the effect of epistasis on the total response of the population? Classical quantitative genetics (e.g., ref. 18) uses statistical arguments to derive expressions for the expected components of genetic variance in an inbred population. Barton and Turelli (19) derive these using an explicit population genetic model, assuming two alleles per locus; Hill et al. (20) show how these are related to the classical expressions, and that they apply without any constraint on the distribution of allelic effects at each locus. In the following, we make use of these results to derive expressions for the expected long-term response of populations with arbitrary genetic architectures. We first focus on haploid populations and then extend our results to diploid populations.
The Long-Term Response for Haploids.
Suppose that a haploid population initially has additive genetic variance $V_A$, pairwise epistatic variance $V_{AA}$, and so on for higher-order epistatic components; it is convenient to denote the kth order variance component as $V_{A(k)}$. Throughout, we will assume that the population is always at linkage equilibrium (such that , where the expectation is taken over multiple realizations of the process). If the expected change in allele frequency due to selection at any locus ($\beta\alpha_i p_i q_i$, where β is the selection gradient and $\alpha_i$ is the effect of locus i on the trait) is small compared with its variance ($p_i q_i/N$; according to the Wright−Fisher model), we can assume that the dynamics of the variance components are dominated by drift. This does not necessarily mean that selection is weak on the trait; if the trait is affected by many loci, these conditions can be met at any individual locus even with strong selection on the trait (the infinitesimal model). Under these conditions, after an arbitrary period of inbreeding, the expected additive variance of a haploid population is
$$E[V_A(t)] = (1 - F_t) \sum_{k \ge 1} k\, F_t^{\,k-1}\, V_{A(k)}(0),$$
where $F_t$ is the probability of identity between two genes at a locus (equation 29 of ref. 19, but see also Dynamics of Variance Components Under Genetic Drift). Assuming a uniform rate of inbreeding, so that $F_t = 1 - (1 - 1/N)^t$, the total response to directional selection β is
$$R_\infty = \beta \sum_{t=0}^{\infty} E[V_A(t)] \approx \beta N \sum_{k \ge 1} V_{A(k)}(0) = \beta N V_G. \qquad [1]$$
Remarkably, the total response to directional selection is proportional to the initial total genotypic variance, $V_G = \sum_k V_{A(k)}$, a simple extension of Robertson’s (17) result. The change in mean can thus be very much greater than the change in the first generation, if $N V_G/V_A$ is large (Fig. 1). For a given initial total genetic variance, $V_G$, the initial response is slower with epistasis, because it is proportional only to the additive component, $V_A$. However, as allele frequencies fluctuate randomly, epistasis generates additional additive variance, such that the total selection response is the same. For a given initial genotypic variance, epistasis slows the initial response, but does not affect its final value, whereas, for a given initial additive variance, epistasis increases the final response by a factor of $V_G/V_A$, compared with an additive genetic architecture.
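A short numerical check may help make this result concrete. The sketch assumes that the expected additive variance of a haploid population at inbreeding level $F$ is $(1 - F)\sum_k k F^{k-1} V_{A(k)}$ and that $F_t = 1 - (1 - 1/N)^t$; the function names and the variance components used are hypothetical.

```python
def expected_additive_variance(components, f):
    """Haploid expectation (assumed form): (1 - F) * sum_k k * F^(k-1) * V_A(k).

    components[k-1] holds the initial k-th order variance component.
    """
    return (1.0 - f) * sum(k * f ** (k - 1) * v
                           for k, v in enumerate(components, start=1))

def total_response_haploid(components, n, beta, tol=1e-12):
    """Sum beta * E[V_A(t)] over generations, with 1 - F_t = (1 - 1/N)^t."""
    total = 0.0
    one_minus_f = 1.0  # equals (1 - 1/N)^t, starting at t = 0
    while one_minus_f > tol:
        total += beta * expected_additive_variance(components, 1.0 - one_minus_f)
        one_minus_f *= 1.0 - 1.0 / n
    return total

# Hypothetical initial components: additive, pairwise, third order.
components = [1.0, 0.5, 0.25]
n, beta = 1000, 0.01
simulated_sum = total_response_haploid(components, n, beta)
predicted = beta * n * sum(components)  # beta * N * V_G
print(simulated_sum, predicted)
```

The discrete sum and $\beta N V_G$ differ only by terms of relative order $1/N$, which arise from replacing the sum over generations by an integral over $F$.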
Fig. 1.
Long-term trait value reached by finite populations with a pairwise genetic architecture as a function of the initial fraction of epistatic variance , for different population sizes as indicated. For each population size, we increase the variance in interaction strength, keeping the variance of main effects constant at , from to , resulting in an initial genetic variance ranging from to . Initial epistatic variance saturates with the strength of interactions (see Calculating Expectations for the Initial Components of Variance). Lines represent the theoretical prediction from Eq. 1. Selection strength was set to . Circles represent means of simulations, and error bars (too small to be seen) represent the SE.
Our result is based on the dynamics of genetic variances, but Robertson’s (17) result (reproduced at the beginning of the section) suggests an alternative view: As alleles fluctuate in frequency, they change the average effects of alleles at other loci, thereby changing their probability of fixation. In particular, as we show in The Interaction Between Strong Directional Selection and Epistasis, alleles that are initially deleterious but become beneficial as the response unfolds should contribute to this increase in long-term response. This view predicts that an increase in long-term response should be correlated with an increase in the number of alleles that are beneficial at the end of the response, which we confirmed through simulations (Fig. S1).
Fig. S1.
Average increase in the number of alleles that are beneficial: mean number of alleles that switch sign from negative to positive, calculated from the same simulations as in Fig. 1.
The Long-Term Response for Diploids.
For a diploid population, with no dominance (meaning that the phenotype of the heterozygote is simply the mean of the homozygous phenotypes), epistasis has a stronger effect. From equation 45 of Barton and Turelli (19),
$$E[V_A(t)] = (1 - F_t) \sum_{k \ge 1} k\, (2F_t)^{k-1}\, V_{A(k)}(0).$$
Assuming that $F_t = 1 - [1 - 1/(2N)]^t$, the total response to directional selection β is
$$R_\infty = \beta \sum_{t=0}^{\infty} E[V_A(t)] \approx 2N\beta \sum_{k \ge 1} 2^{k-1}\, V_{A(k)}(0).$$
Now, epistasis can increase the total response disproportionately. This is because we are calculating the contribution of epistatic components to the variance of a population that will ultimately be completely inbred, which amplifies the genetic variance contributed by each locus by a factor of 2 per order of interaction. (To see this, compare the variance of the trait in a diploid population at Hardy−Weinberg proportions, $2\alpha^2 pq$, with that after complete inbreeding, $4\alpha^2 pq$.) However, it is unlikely that higher-order components will be large enough for this effect to be substantial (6). With two alleles per locus, the kth order variance is proportional to $\prod_{i=1}^{k} p_i q_i$, the product of allele frequencies across the interacting loci. Because $p_i q_i \le 1/4$, this decreases at least as $4^{-k}$.
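The factor of 2 per order of interaction can be traced with a short calculation. The sketch below assumes the classical no-dominance expression for the expected additive variance of a diploid population, $(1-F_t)\sum_k k\,(2F_t)^{k-1} V_{A(k)}$, and a uniform rate of inbreeding, $F_t = 1 - [1 - 1/(2N)]^t$. Both are assumptions made here for illustration, and the sum over generations is replaced by an integral over $F$.

```latex
\begin{align}
\sum_{t=0}^{\infty} (1-F_t)\,k\,(2F_t)^{k-1}
  &\approx 2N \int_0^1 k\,(2F)^{k-1}\,dF
  && \text{since } \frac{dF}{dt} \approx \frac{1-F}{2N} \\
  &= 2N\,2^{k-1}, \\
R_\infty = \beta \sum_{t=0}^{\infty} E[V_A(t)]
  &\approx 2N\beta \sum_{k \ge 1} 2^{k-1}\, V_{A(k)} .
\end{align}
```

For $k = 1$ this recovers Robertson's $2N\beta V_A$; each additional order of interaction doubles the weight of the corresponding variance component.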
We have assumed constant directional selection, β, on the trait. However, the result readily extends to more complex forms of selection. Assuming that fitness increases monotonically with the trait value, we can transform to a scale on which log fitness depends linearly on the trait. On this new scale, the variance components will be different, but our general results show how the total response depends on these components, when measured on the appropriate scale.
The Interaction Between Strong Directional Selection and Epistasis.
We now turn to the deterministic case, where selection is the predominant process. Imagine that, in the initial haploid population, alleles at n loci have marginal effects $\alpha_i$, so that, with a selection gradient β, they have selective advantage $s_i = \beta\alpha_i$. We label the alleles by their initial effect, so that $\alpha_i > 0$. If the trait is additive, and the population infinitely large, then, if selection remains constant, the fittest genotype will eventually fix, and the mean will increase by $\sum_i \alpha_i q_i$, where $q_i = 1 - p_i$ and $p_i$ is the initial frequency of the favored allele at locus i. However, with epistasis, the marginal effect of alleles will change through time. If these effects retain the same sign, the same final state will be reached, but it will have a different trait value. If some effects change sign, then a different genotype may be reached. There may still be a single global fitness peak, or there may be multiple peaks, so that the outcome is sensitive to the initial allele frequencies. However, we avoid considering this complication by simply comparing the final state of populations with and without epistasis. More precisely, we compare the total change in mean of an epistatic trait with the total change in mean of an additive trait, starting with the same allele frequencies and marginal effects.
Typically, the initial distribution of allele frequencies will be U-shaped, as expected for a population in a stationary distribution with low mutation rates per site (Neμ < 1 for haploids). Thus, an enormous range of variation can be released as initially rare alleles increase, and are brought together by recombination. We will assume that the population is so large that alleles that are initially deleterious, but will ultimately be favored, will not be lost from the population. This is not realistic, but it is equivalent to assuming a low rate of recurrent mutation, such that alleles are continually reintroduced, so that, if they are favored, they will become common enough to increase deterministically. We assume that epistasis is not so strong that the marginal effects of alleles fluctuate more rapidly than the timescale needed for them to establish.
If most alleles are very rare or very common, they will spend a relatively short time passing from low to high frequency. Thus, we can caricature the process as a series of separate substitutions; at each substitution, the marginal effects of all of the other alleles change. This will change the time at which favorable alleles substitute, and may cause them to lose their advantage entirely. This caricature is not quite the same as the Strong Selection Weak Mutation approximation (21), in which populations are assumed to be fixed for a single genotype, and to evolve by fixation of new mutations, because rare alleles still follow a complex dynamic. In reality, substitutions may overlap, and their effects change continuously. This limiting scenario is helpful in understanding the following argument, but, because we focus on the final outcome, we do not need to assume that alleles substitute one by one. All we need to know is what is the difference between the final and initial effects of an allele.
Assuming only pairwise interactions, two alleles per locus, and linkage equilibrium throughout (see Materials and Methods), the ultimate change in mean is
At any time, allele i has selective advantage $s_i = \beta\alpha_i$, where $\alpha_i$ is the average effect of locus i on the trait mean, which changes linearly with the frequencies of all of the other alleles (see Materials and Methods). Recall that we labeled alleles such that, initially, $\alpha_i > 0$ for all alleles. For simplicity, assume that one set of alleles starts at low frequency, and another set starts at high frequency. Therefore, the alleles in the first set are expected to sweep from low to high frequency, thereby perturbing the selection on all other alleles. Provided that epistasis is sufficiently weak that no selection coefficients change sign, this set will sweep to fixation, and the difference between this response and that of an additive trait with the same initial effects (and therefore necessarily with the same initial additive variance) will be the sum of the pairwise epistatic coefficients over all pairs of sweeping alleles. Therefore, as long as epistasis is not directional, epistasis will, on average, make no difference to the total response. More precisely, provided that the total pairwise epistasis, summed over all pairs of alleles that sweep from low to high frequency, is zero, epistasis will have no net effect on the selection response.
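The decomposition of the response into an additive part and a pairwise part can be checked on a small example. The sketch below assumes a hypothetical parameterization that may differ from the one in Materials and Methods: alleles are coded 0/1, the trait is $\sum_i a_i X_i + \sum_{i<j} \epsilon_{ij} X_i X_j$, and the population is at linkage equilibrium. Under these assumptions the marginal effect is $\alpha_i = a_i + \sum_{j \ne i} \epsilon_{ij} p_j$, and fixing every "1" allele changes the mean by $\sum_i \alpha_i q_i + \sum_{i<j} \epsilon_{ij} q_i q_j$. The names `a`, `eps`, and the helper functions are introduced here for illustration only.

```python
import itertools
import random

def mean_trait(a, eps, p):
    """Mean trait at linkage equilibrium: sum a_i p_i + sum_{i<j} eps_ij p_i p_j."""
    n = len(a)
    main = sum(a[i] * p[i] for i in range(n))
    pairs = sum(eps[i][j] * p[i] * p[j]
                for i, j in itertools.combinations(range(n), 2))
    return main + pairs

def marginal_effects(a, eps, p):
    """alpha_i = d(mean)/d(p_i); linear in the frequencies at the other loci."""
    n = len(a)
    return [a[i] + sum(eps[i][j] * p[j] for j in range(n) if j != i)
            for i in range(n)]

def response_decomposition(a, eps, p):
    """Additive and pairwise parts of the change in mean when every '1' allele fixes."""
    n = len(a)
    alpha = marginal_effects(a, eps, p)
    q = [1.0 - x for x in p]
    additive = sum(alpha[i] * q[i] for i in range(n))
    epistatic = sum(eps[i][j] * q[i] * q[j]
                    for i, j in itertools.combinations(range(n), 2))
    return additive, epistatic

# Random symmetric pairwise architecture with rare focal alleles (illustrative values).
rng = random.Random(1)
n = 6
a = [rng.gauss(0.0, 1.0) for _ in range(n)]
eps = [[0.0] * n for _ in range(n)]
for i, j in itertools.combinations(range(n), 2):
    eps[i][j] = eps[j][i] = rng.gauss(0.0, 0.3)
p = [rng.uniform(0.01, 0.2) for _ in range(n)]

additive, epistatic = response_decomposition(a, eps, p)
direct = mean_trait(a, eps, [1.0] * n) - mean_trait(a, eps, p)
print(direct, additive + epistatic)
```

When the sweeping alleles start rare ($q_i \approx 1$), the pairwise part reduces to the sum of the epistatic coefficients over sweeping pairs, which averages to zero if epistasis has no preferred direction.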
This calculation is based on the assumption that additive effects cannot change sign; if interactions are strong enough to turn an initially beneficial allele into a deleterious one as frequencies at other loci change, the long-term response can be different, because the difference between the two architectures is now . By assuming that the epistatic coefficients are independent random variables, we can approximate the probability that epistasis will change the ultimate genotype, and the expected change in trait mean that results.
The probability that an allele switches sign as the remaining favorable alleles increase is simply the probability that the sum of all interactions is less than the negative of its main effect: $P(\epsilon < -\alpha_i) = \Phi(-\alpha_i) = \int_{-\infty}^{-\alpha_i} \phi(\epsilon)\, d\epsilon$, where $\Phi$ is the cumulative probability distribution and $\phi$ is the probability density of the sum of all epistatic interactions between the ith locus and the loci that are sweeping from low to high frequency. We assume that the number of interactions is large enough, and that they are sufficiently independent, that this is approximately Gaussian: ε = $\sum_j \epsilon_{ij}$, with zero mean and an SD equal to $\sigma_\epsilon$ times the square root of the number of sweeping loci, where $\sigma_\epsilon$ is the standard deviation (SD) of the $\epsilon_{ij}$. Note that we only need to assume that the $\epsilon_{ij}$ are drawn independently from a distribution with finite variance, so that their sum approaches a Gaussian. Note also that one of the alleles that is at high frequency will interact with all of the alleles that sweep from low to high frequency, whereas one of the sweeping alleles will only interact with the other alleles that are also sweeping. However, we assume that the number of sweeping alleles is large, and so neglect this difference of one throughout.
We can now calculate the average probability of an allelic reversal by integrating over the distribution of the main effects, $\psi(\alpha)$,
$$\bar{P} = \int_0^{\infty} \psi(\alpha)\, \Phi(-\alpha)\, d\alpha. \qquad [2]$$
This can be calculated explicitly for specific choices for $\psi$,
where is the ratio of the SDs of the strength of genetic interactions versus the main effects, the relative “strength” of genetic interactions. These two functions necessarily increase linearly from zero for small , and saturate at for large ; they differ by, at most, . Comparing with numerical simulations, we see that this calculation captures the expected number of alleles that would be deleterious, compared with the original ultimate genotype (in which alleles remain positive throughout the evolutionary response) (Fig. 2). However, it assumes that all loci respond independently to the changing genetic background. In reality, as alleles switch sign and go extinct, other loci will be affected as well. When comparing to the actual global peak of the fitness landscape, we see that the actual number of alleles that change sign is less than predicted by our approximation of independence (Eq. 2). This is because, for each allele that reverses its sign, an interaction is effectively removed, thereby reducing for the remaining loci. Numerically, we found that a good approximation to the expected number of allelic reversals is of that assuming independence, independently of number of loci (Fig. 2A). This expression should converge to the one assuming independence at low values of , which we confirmed through simulations (Fig. 2A). Each of these allelic reversals (or “flips”) will translate into some change in trait value compared with an additive population. For each allele, this increase will be , and so the average contribution of each allele that switches sign will be
| [3] |
For specific distributions of main effects,
| [4] |
| [5] |
These functions are again very similar, tending to zero for weak epistasis, and to for large , where for the half-Gaussian, and 0.444 for the exponential. Finally, the total trait increase due to allelic reversals is simply the expected number of alleles that reverse times the increase due to each of those,
| [6] |
Substituting from Eqs. 2 and 4, we see that increases close to linearly with , as (Fig. 2B). Note that, whereas only the loci that sweep from low to high frequency contribute (via epistasis) to changes in marginal effects, all loci may be affected by these changes. An initially favorable allele that is still at low frequency when it loses its advantage will never appear, whereas one that is already at high frequency will sweep back down to low frequency (recall that we assume either an extremely large population or recurrent mutation, so that deleterious alleles never get fixed). Either way, any of the n initially favored alleles that lose their advantage will be eliminated. The previous calculations are upper bounds for the effect of epistasis on the long-term response of a population, because we considered that loci have low initial allele frequencies. If initial allele frequencies are appreciable, part of the interaction effect is absorbed by the initial average effect of the loci, and alleles may not reverse their sign, but this can be seen as a reduction in .
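The probability of an allelic reversal under the independence assumption can be explored numerically. The sketch below measures main effects in units of their scale parameter and treats the epistatic perturbation as Gaussian with zero mean and SD `gamma` in the same units; the name `gamma` is introduced here for illustration and stands for the relative strength of genetic interactions. Under these assumptions, a direct calculation gives $\arctan(\gamma)/\pi$ for half-Gaussian main effects (taking the scale to be the SD of the underlying Gaussian) and $1/2 - e^{\gamma^2/2}\,\Phi(-\gamma)$ for exponential main effects, with $\Phi$ the standard normal distribution function. These closed forms are derived for this sketch and are not quoted from the original equations.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def half_gaussian_density(a):
    """Density of |X| with X ~ N(0, 1)."""
    return math.sqrt(2.0 / math.pi) * math.exp(-0.5 * a * a)

def exponential_density(a):
    """Exponential density with unit mean (and hence unit SD)."""
    return math.exp(-a)

def reversal_prob_numeric(density, gamma, upper=40.0, steps=20000):
    """Trapezoid rule for the integral of density(a) * P(eps < -a), eps ~ N(0, gamma^2)."""
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        a = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * density(a) * normal_cdf(-a / gamma)
    return total * h

def reversal_prob_half_gaussian(gamma):
    """Closed form under the stated assumptions: arctan(gamma) / pi."""
    return math.atan(gamma) / math.pi

def reversal_prob_exponential(gamma):
    """Closed form under the stated assumptions: 1/2 - exp(gamma^2 / 2) * Phi(-gamma)."""
    return 0.5 - math.exp(0.5 * gamma * gamma) * normal_cdf(-gamma)

for gamma in (0.3, 1.0, 3.0):
    print(gamma,
          reversal_prob_numeric(half_gaussian_density, gamma),
          reversal_prob_half_gaussian(gamma),
          reversal_prob_numeric(exponential_density, gamma),
          reversal_prob_exponential(gamma))
```

Both closed forms rise linearly from zero for weak interactions and approach 1/2 when interactions dominate the main effects, since the sign of the marginal effect is then set almost entirely by the symmetric epistatic perturbation.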
Fig. 2.
(A) Mean fraction of alleles that reverse their sign when all initially beneficial alleles go to fixation (black circles) or in the genotype of highest trait value (gray circles), as a function of the relative strength of genetic interactions . The black line denotes the prediction assuming independence of alleles, and the gray dashed line is the same prediction multiplied by . (B) Expected increase in trait value if all of the initially beneficial alleles fix (black circles) or at the actual global peak (gray circles). Dashed line corresponds to the prediction obtained based on the gray line in A. All points correspond to the mean of 100 trials, randomizing the coefficients for the indicated variances for and for loci.
When drift dominates, the ultimate response depends only on initial variance components. When selection dominates, the effect of epistasis on the ultimate response depends on reversals in selection on individual alleles, so that a different genotype is reached; this depends on the ratio of SDs of epistatic versus main effects, multiplied by the square root of the number of interacting alleles that sweep from low to high frequency (), as we showed above. The transition between the two regimes depends only on the selection strength each locus experiences (Fig. 3); the trait can be under high selection strength, but, as long as it is highly polygenic, the prediction derived for the drift-dominated regime will hold (Fig. 3) because, for the same initial genetic variance, the selection coefficients scale as with the number of loci. In the limit of infinite numbers of genes, this prediction would hold regardless of the selection gradient on the trait. In the selection-dominated regime, the distribution of allele frequencies is mostly irrelevant because the ultimate response depends on alleles that are initially vanishingly rare (or absent if we allow mutation). The key parameter is , which is not constrained by the initial variance components: If alleles are at extreme frequency, the epistatic variance must be small, even if is large. However, we can relate this parameter to variance components in the F2 population formed by crossing the ancestral with the derived population (see Measuring Strength of Interactions from F2 Crosses and Fig. S2). This implies that the initial epistatic variance does not predict the long-term response of the population, because the population may contain strong interactions (large ) that are not manifested as epistatic variance (which depends on allele frequencies) but can contribute to the long-term response as allele frequencies reach appreciable levels.
Fig. 3.
The transition from the drift-dominated to the selection-dominated regimes. Stochastic simulations with increasing scaled selection () showing the total trait increase in units of initial SD of trait value for (squares) and (circles) loci. Dotted line corresponds to the drift-dominated prediction (), and dashed lines correspond to the deterministic limit taken from numerical analysis, for an epistatic architecture (black) and an additive architecture (gray) of the same initial genetic variance. Initial allele frequencies were sampled from a beta distribution with mean and variance , , and . Simulations were performed either with or and varying to yield the reported product .
Fig. S2.
The ratio of epistatic () versus additive () variances of a population obtained by crossing an ancestral population with a derived population that differ at loci, as a function of the strength of genetic interactions . Line corresponds to the prediction . Circles represent means of 1,000 trials of random pairwise genetic architectures with sampled from a Gaussian distribution of zero mean and and sampled from a Gaussian distribution of zero mean and varied from to .
Discussion
The role of epistasis in evolution has long been controversial. Wright (1) argued that epistasis would cause populations to become trapped at local “adaptive peaks” and proposed that a “shifting balance” between selection and random drift could allow them to explore alternative peaks, so as to move toward the global optimum. This theory motivated much work on the structure of natural populations, yet it remains unclear whether adaptation is significantly slowed by trapping on local peaks (1, 22). Mayr (23) criticized the supposed neglect of epistasis by “bean-bag genetics,” provoking a robust defense by Haldane (24). More recently, it has been proposed that epistatic variance can be “converted” into additive variance following a bottleneck, aiding adaptation (25). The failure of large genome-wide association studies to assign much heritable variance to specific loci (the so-called “missing heritability”) has been attributed to epistasis (26, 27), although this explanation is unnecessary (28). Overall, the practical success of the additive model in quantitative genetics appears hard to reconcile with the strong molecular interactions between genes.
We investigate how epistasis affects the response to selection, by asking a simple and clearly defined question: By how much does epistasis influence the ultimate change in the mean of a selected trait? We compare the effects of directional selection on two populations that initially have the same genetic variance for a trait; in one, inheritance is strictly additive, whereas, in the other, there can be strong gene interactions. We find simple results in the two extreme cases, where either drift or selection dominates.
In the first case, where drift is stronger than selection on individual alleles, the outcome can be predicted from the initial variance components. This seems remarkable but can be understood as a perturbation to neutrality: When selection is spread over very many loci, its effect on any one locus is weak relative to drift, and so the variance components are hardly perturbed by selection. This is an extension of the infinitesimal model to nonadditive inheritance (29). For haploids, the total selection response is proportional to the initial genotypic variance (including both additive and nonadditive components). For diploids, kth-order components of variance have effect multiplied by $2^{k-1}$. Nevertheless, it is extremely difficult to find plausible models in which such higher-order variance components are significant (10). One way to see this is to imagine a population in which all additive effects are zero, as would be the case at an equilibrium under balancing selection. However, any change in allele frequencies will necessarily generate nonzero additive effects, and, consequently, substantial additive variance. Higher-order epistatic variance is also likely to be small if alleles are at extreme frequencies: the kth-order epistasis is proportional to the product of $pq$ across k loci, and so cannot be large (10).
In contrast, when selection is strong relative to drift, the population will fix at an adaptive peak. This requires strong selection on each allele ($N_e s \gg 1$), but also recurrent mutation, so any allele that is favored will eventually succeed. Thus, the issue is primarily about the genotype−phenotype map rather than any population genetic process. We show that, unless epistasis is systematically biased, the ultimate response is increased, on average, only if the fittest peak is changed, because epistasis changes some alleles from being favored to being deleterious. The expected response can be predicted by a simple argument, which assumes that the net effect of the changing background is a normally distributed perturbation, and that the chance that selection on an allele changes sign is simply the chance that this random epistatic perturbation exceeds the main effect of the allele.
When drift dominates, our conclusions follow simply from the variance components, without further assumptions. When selection dominates, we assume that epistasis is not systematically biased toward (or against) interactions between favorable alleles. If a systematic bias is allowed, then epistasis can have an arbitrarily large effect. To see this, imagine a trait that is some function, $f(z)$, of an additive trait, z. If $f$ curves upward, then the selection response will accelerate, and can become much larger than that of the corresponding additive model with the same initial additive variance. Indeed, if we are allowed to assign trait values to genotypes arbitrarily, we can construct paths that follow any pattern of change through time. Our conclusions about the ultimate effects of selection depend on the assumption that epistasis adds a random perturbation, without any definite bias. Empirically, this seems to be the consensus: Epistasis is pervasive between genetic loci but is generally nondirectional in the sense that there is no prevalent pattern of either positive or negative interactions (30).
Throughout this paper, we assumed that the populations remained at linkage equilibrium. However, linkage disequilibrium (LD) will be generated in a number of ways. Drift alone will produce some LD, although this should be symmetric around no disequilibrium and so should not, on average, have strong effects. In the selection-dominated regime, however, LD should be generated consistently in a directional fashion. Nagylaki’s theorem (31) states that, under weak selection, as we assume here, the population is guaranteed to approach and remain close to linkage equilibrium. Furthermore, the fact that we assume that interactions have no preferred direction should further hinder the buildup of LD. Nevertheless, LD will affect the response: A truly infinite population with no recombination (full linkage) is guaranteed to find the global peak, and linkage can only affect the rate at which this is approached. In this sense, linkage can help because recombining populations can get “trapped” at local peaks.
Do gene interactions and epistasis affect the long-term response? In the drift-dominated regime, which can apply even when selection on the trait is strong for polygenic traits, the long-term response is merely proportional to the initial genetic variance. Epistatic architectures can reach higher trait values compared with additive populations of the same additive variance, because the former necessarily harbor more genetic variance. However, this effect will be small and the response slower because epistatic variance typically represents a small fraction of the total genetic variance. In the selection-dominated regime, it is the specifics of the interaction structure that matter. Substantial increases of the long-term response can be reached when interactions are strong and induce allelic reversals, but the initial epistatic variance of the population is not predictive of this (see Deterministic Limits as a Function of Allele Frequency Distribution and Figs. S3 and S4). These results set expectations for the effect of epistasis on the long-term response under directional selection and help reconcile the success of the infinitesimal additive model (11) with the biological fact that genetic interactions are pervasive in nature.
Fig. S3.
(A) Long-term response for a population with initial allele frequency distribution as denoted in the figure (m and v denote the mean and variance of a beta distribution). Number of loci is set to , , , and . (B) The initial fraction of epistatic variance present in the population. Dashed line is , which crosses the isocontours of fraction of epistatic variance at their highest value.
Fig. S4.
Difference in long-term response of an epistatic population and an additive one of the same initial additive variance as a function of the number of loci for and (from bottom to top) . Initial allele frequencies were drawn from a Beta distribution with mean m = 1/4 and variance v = 1/20.
Materials and Methods
Simulations.
Except where noted, we assume a haploid population of effective size N; similar results can be obtained for a diploid population with copies of each gene, provided there is no dominance. We also assume that the trait is under weak directional selection, with selection gradient β, and that the population is close to linkage equilibrium. We ignore the environmental component of variance, because this does not affect the response to directional selection. Our results for finite populations under weak selection are independent of the trait architecture. However, for simulations, we assume that the trait of a diallelic haploid genotype of n loci is given by . Assuming that the population is at linkage equilibrium, one can write, for the mean trait value, . The effect of a particular locus on the trait mean is then . We typically assume that main effects and epistatic coefficients are randomly sampled from Gaussian distributions and , respectively.
We assume directional selection for increased trait values such that the fitness of a particular genotype is . At linkage equilibrium, the mean fitness of the population can then be written as . The change in allele frequency at any locus is then . The dynamics of macroscopic variables, such as trait mean, variance, etc., can be derived from this expression.
Finite Population Simulations.
The simulations were performed by keeping track only of allele frequencies, thereby abolishing LD. Every generation, allele frequencies are updated according to the deterministic expectation, , where . A binomial sample of N copies is then sampled with this probability of success for each locus, and the allele frequencies are updated according to this sample. We repeat this procedure until no genetic variation exists at any locus, i.e., the population is fixed for one genotype.
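The procedure can be sketched as follows. The trait model here is an assumption made for illustration, not the paper's stated form: alleles are coded 0/1, the trait has main effects `a[i]` and symmetric pairwise coefficients `e[i][j]` with a zero diagonal, and the deterministic expectation is taken to be the usual weak-selection form p + β p (1 − p) A, where A is the derivative of the mean trait with respect to that allele frequency. All function and variable names are hypothetical.

```python
import random

def mean_trait(p, a, e):
    """Mean trait at linkage equilibrium for the assumed pairwise model."""
    n = len(p)
    return (sum(a[i] * p[i] for i in range(n))
            + sum(e[i][j] * p[i] * p[j] for i in range(n) for j in range(i + 1, n)))

def locus_effects(p, a, e):
    """A_i = d(mean trait)/d(p_i); requires e symmetric with zero diagonal."""
    n = len(p)
    return [a[i] + sum(e[i][j] * p[j] for j in range(n) if j != i)
            for i in range(n)]

def wright_fisher_step(p, a, e, beta, N, rng):
    """One generation: deterministic selection step, then binomial sampling."""
    new_p = []
    for pi, Ai in zip(p, locus_effects(p, a, e)):
        # Assumed weak-selection expectation, clipped to a valid probability
        p_star = min(1.0, max(0.0, pi + beta * pi * (1.0 - pi) * Ai))
        # Binomial sample of N copies, drawn as N Bernoulli trials
        copies = sum(rng.random() < p_star for _ in range(N))
        new_p.append(copies / N)
    return new_p

def simulate(p0, a, e, beta, N, seed=0, max_gens=100_000):
    """Iterate until every locus is fixed (frequency exactly 0 or 1)."""
    rng = random.Random(seed)
    p = list(p0)
    for _ in range(max_gens):
        if all(x in (0.0, 1.0) for x in p):
            break
        p = wright_fisher_step(p, a, e, beta, N, rng)
    return p
```

Because only allele frequencies are tracked, each locus is sampled independently, which is what removes LD from the simulation. Calling `mean_trait` on the returned frequencies gives the trait value of the fixed genotype.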
Deterministic Numerical Simulations.
Every iteration, we iterate the deterministic recursion , where . The recursion is iterated for a set number of generations chosen so that the remaining variance in the population is vanishingly small.
Fitness Landscape Analysis.
To find the absolute maximum trait increase, we associate a trait value to every genotype and exhaustively search for the maximum of the landscape.
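A minimal sketch of such an exhaustive search is given below. It assumes, for illustration only, a pairwise trait model with alleles coded 0/1, main effects `a` and coefficients `e[i][j]` for i < j; these names and the model form are ours. The search visits all 2^n genotypes, so it is only practical for small n.

```python
from itertools import product

def trait_value(x, a, e):
    """Trait of genotype x (tuple of 0/1) under the assumed pairwise model."""
    n = len(x)
    return (sum(a[i] * x[i] for i in range(n))
            + sum(e[i][j] * x[i] * x[j] for i in range(n) for j in range(i + 1, n)))

def global_maximum(a, e):
    """Return (best trait value, best genotype) over all 2^n genotypes."""
    n = len(a)
    return max((trait_value(x, a, e), x) for x in product((0, 1), repeat=n))

# Two-locus example: the second allele is deleterious on its own (-0.5)
# but favored on the background of the first (interaction +2.0).
a = [1.0, -0.5]
e = [[0.0, 2.0], [2.0, 0.0]]
print(global_maximum(a, e))  # (2.5, (1, 1))
```

In this toy landscape the best genotype under the main effects alone would be (1, 0) with trait value 1.0; the interaction reverses selection on the second allele and raises the maximum to 2.5.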
Dynamics of Variance Components Under Genetic Drift
For a haploid population with n biallelic loci, any trait can be defined as
Under linkage equilibrium, the mean and variance of the trait distribution over the population can be written as
where
If binomial sampling (genetic drift) is the main force determining dynamics at every locus, and the population is random mating and at linkage equilibrium, we can write, for the expectation at the next generation (for multiple realizations of the same process) under the Wright-Fisher model,
Using these expressions, we can calculate the expectation of the genetic variance at each locus in the next generation,
[S1]
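As a numerical check, assume that Eq. S1 takes the standard Wright-Fisher form for N haploid copies, in which the expected value of pq in the next generation is (1 − 1/N) times its current value. The snippet below computes that expectation exactly by summing over the binomial distribution; the function name is hypothetical.

```python
from math import comb

def expected_pq_next(p: float, N: int) -> float:
    """Exact E[p'(1 - p')] when N copies are drawn binomially with probability p."""
    q = 1.0 - p
    return sum(comb(N, k) * p**k * q**(N - k) * (k / N) * (1 - k / N)
               for k in range(N + 1))

p, N = 0.3, 20
print(expected_pq_next(p, N))      # exact binomial expectation
print((1 - 1 / N) * p * (1 - p))   # closed form it should match
```

The two printed values agree to floating-point precision, whereas the expected allele frequency itself is unchanged by sampling, which is why terms that are linear in the allele frequencies are preserved in expectation.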
Because does not contain terms in higher powers of each of the allele frequencies (that is, it does not contain terms in ), every does not depend on the allele frequencies of any of the loci. This means that the expectation .
Each of the is of the form
The cross-terms are all linear on the allele frequencies and so, in expectation, remain the same in the next generation because . The terms of that are quadratic in are changed in the next generation, in expectation, as in Eq. S1. With a bit of algebra (cumbersome using this notation, but see ref. 19), we can see that the expectation for is inflated by a fraction of how much the locus i contributes to all of the higher-order components of variance (, etc.).
For example, for a genetic architecture involving only pairwise interactions, the contribution of a locus to additive variance is
This contribution will be in expectation in the next generation,
In general, for arbitrary orders of interactions, one obtains
and, in particular, for the additive variance,
Calculating Expectations for the Initial Components of Variance over the Allele Frequency Distribution
The initial epistatic variance present in a population can be written as
If we assume that the initial allele frequencies are independently distributed, we can average over the allele frequencies and write
where and are the mean and variance of the distribution of allele frequencies. In the same manner, if we assume that epistatic interactions are much stronger than the background independent effects , the initial additive variance can be written as
where , and .
Using , , and , the average total genetic variance can then be written as
and, realizing that, because all of the are independent and identically distributed, all terms are identical and all terms and , we can write, for the initial fraction of epistatic variance,
This allows us to quantify the amount of epistatic variance initially present in the population, as a function only of the mean and variance of the initial allele distribution. In the limit of strong epistasis or many loci, this expression reduces to a limit that depends only on the initial allele frequency distribution,
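The averaging over allele frequencies rests on moments such as the expected value of pq. For any distribution of p with mean m and variance v, this expectation is m(1 − m) − v. The sketch below checks that identity by Monte Carlo for a beta distribution; the function names are hypothetical, and the beta form and the example values m = 1/4, v = 1/20 are taken from the caption of Fig. S4.

```python
import random

def beta_shape(m: float, v: float):
    """Shape parameters of the beta distribution with mean m and variance v."""
    if not (0.0 < m < 1.0 and 0.0 < v < m * (1.0 - m)):
        raise ValueError("need 0 < m < 1 and 0 < v < m(1 - m)")
    c = m * (1.0 - m) / v - 1.0
    return m * c, (1.0 - m) * c

def mean_pq_exact(m: float, v: float) -> float:
    """E[p(1 - p)] = m - (v + m^2) = m(1 - m) - v."""
    return m * (1.0 - m) - v

def mean_pq_monte_carlo(m: float, v: float, draws: int = 100_000, seed: int = 0) -> float:
    """Estimate E[p(1 - p)] by sampling p from the matching beta distribution."""
    rng = random.Random(seed)
    alpha, beta = beta_shape(m, v)
    total = 0.0
    for _ in range(draws):
        p = rng.betavariate(alpha, beta)
        total += p * (1.0 - p)
    return total / draws

print(mean_pq_exact(0.25, 0.05))        # 0.1375 up to floating-point rounding
print(mean_pq_monte_carlo(0.25, 0.05))  # sampling estimate of the same quantity
```

If allele frequencies are independent across loci, the expectation of a product of pq terms over several loci factorizes into a product of such single-locus averages, which is why the averaged variance components depend on the allele frequency distribution only through m and v.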
Deterministic Limits as a Function of Allele Frequency Distribution
The long-term response of a population in the deterministic regime is mostly determined by the fitness landscape the population evolves on (see The Interaction Between Strong Directional Selection and Epistasis). When initial allele frequencies are not vanishingly rare, the long-term response will be below the limits we show in The Interaction Between Strong Directional Selection and Epistasis, because the population starts at a higher trait value. Fig. S3 shows the long-term response as a function of the allele frequency distribution for strong .
Measuring Strength of Interactions from F2 Crosses
One can measure the critical parameter by crossing ancestral populations with derived populations. These differ at loci, which will be at in the hybrid population. Therefore, . Therefore, . We also have that , so that —which is just a haploid version of the Wright−Castle−Lande estimator. We see that, if the ratio of epistatic to additive variance in the hybrid population is 1, then , which is small.
Acknowledgments
The authors thank Jitka Polechová and Michael Turelli for helpful comments. This work was supported by European Research Council Advanced Grant ERC-2009-AdG-250152. This project has received funding from the European Union’s Seventh Framework Programme for research, technological development and demonstration under Grant Agreement 618091 Speed of Adaptation in Population Genetics and Evolutionary Computation (SAGE).
Footnotes
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1518830113/-/DCSupplemental.
References
- 1. Wright S. The roles of mutation, inbreeding, crossbreeding and selection in evolution. Proceedings of the VI International Congress of Genetics. Int Congr Genetics; Ithaca, NY: 1932. pp. 356–366.
- 2. Hayman BI, Mather K. The description of genic interactions in continuous variation. Biometrics. 1955;11(1):69–82.
- 3. Whitlock MC, Phillips PC, Moore FB, Tonsor SJ. Multiple fitness peaks and epistasis. Annu Rev Ecol Syst. 1995;26:601–629.
- 4. Phillips PC. Epistasis—The essential role of gene interactions in the structure and evolution of genetic systems. Nat Rev Genet. 2008;9(11):855–867. doi: 10.1038/nrg2452.
- 5. Hansen TF. Why epistasis is important for selection and adaptation. Evolution. 2013;67(12):3501–3511. doi: 10.1111/evo.12214.
- 6. Mäki-Tanila A, Hill WG. Influence of gene interaction on complex trait variation with multilocus models. Genetics. 2014;198(1):355–367. doi: 10.1534/genetics.114.165282.
- 7. Ávila V, et al. The action of stabilizing selection, mutation, and drift on epistatic quantitative traits. Evolution. 2014;68(7):1974–1987. doi: 10.1111/evo.12413.
- 8. Lynch M, Walsh B. Genetics and Analysis of Quantitative Traits. Sinauer; Sunderland, MA: 1998.
- 9. Cheverud JM, Routman EJ. Epistasis and its contribution to genetic variance components. Genetics. 1995;139(3):1455–1461. doi: 10.1093/genetics/139.3.1455.
- 10. Hill WG, Goddard ME, Visscher PM. Data and theory point to mainly additive genetic variance for complex traits. PLoS Genet. 2008;4(2):e1000008. doi: 10.1371/journal.pgen.1000008.
- 11. Weber KE, Diggins LT. Increased selection response in larger populations. II. Selection for ethanol vapor resistance in Drosophila melanogaster at two population sizes. Genetics. 1990;125(3):585–597. doi: 10.1093/genetics/125.3.585.
- 12. Costanzo M, et al. The genetic landscape of a cell. Science. 2010;327(5964):425–431. doi: 10.1126/science.1180823.
- 13. Jasnos L, Korona R. Epistatic buffering of fitness loss in yeast double deletion strains. Nat Genet. 2007;39(4):550–554. doi: 10.1038/ng1986.
- 14. Kelly JK. Epistasis in monkeyflowers. Genetics. 2005;171(4):1917–1931. doi: 10.1534/genetics.105.041525.
- 15. Visser JAGMD, Hoekstra RF, Ende HVD. The effect of sex and deleterious mutations on fitness in Chlamydomonas. Proc R Soc Lond B Biol Sci. 1996;263(1367):193–200.
- 16. Whitlock MC, Bourguet D. Factors affecting the genetic load in Drosophila: Synergistic epistasis and correlations among fitness components. Evolution. 2000;54(5):1654–1660. doi: 10.1111/j.0014-3820.2000.tb00709.x.
- 17. Robertson A. A theory of limits in artificial selection. Proc R Soc Lond B Biol Sci. 1960;153(951):234–249.
- 18. Kempthorne O. The correlation between relatives in a random mating population. Proc R Soc Lond B Biol Sci. 1954;143(910):102–113.
- 19. Barton NH, Turelli M. Effects of genetic drift on variance components under a general model of epistasis. Evolution. 2004;58(10):2111–2132. doi: 10.1111/j.0014-3820.2004.tb01591.x.
- 20. Hill WG, Barton NH, Turelli M. Prediction of effects of genetic drift on variance components under a general model of epistasis. Theor Popul Biol. 2006;70(1):56–62. doi: 10.1016/j.tpb.2005.10.001.
- 21. Gillespie JH. A simple stochastic gene substitution model. Theor Popul Biol. 1983;23(2):202–215. doi: 10.1016/0040-5809(83)90014-x.
- 22. Coyne JA, Barton NH, Turelli M. Perspective: A critique of Sewall Wright’s shifting balance theory of evolution. Evolution. 1997;51(3):643–671. doi: 10.1111/j.1558-5646.1997.tb03650.x.
- 23. Mayr E. Animal Species and Evolution. Belknap; Cambridge, MA: 1963.
- 24. Haldane JBS. A defense of beanbag genetics. Int J Epidemiol. 2008;37(3):435–442. doi: 10.1093/ije/dyn056.
- 25. Cheverud JM, Routman EJ. Epistasis as a source of increased additive genetic variance at population bottlenecks. Evolution. 1996;50(3):1042–1051. doi: 10.1111/j.1558-5646.1996.tb02345.x.
- 26. Zuk O, Hechter E, Sunyaev SR, Lander ES. The mystery of missing heritability: Genetic interactions create phantom heritability. Proc Natl Acad Sci USA. 2012;109(4):1193–1198. doi: 10.1073/pnas.1119675109.
- 27. Manolio TA, et al. Finding the missing heritability of complex diseases. Nature. 2009;461(7265):747–753. doi: 10.1038/nature08494.
- 28. Eichler EE, et al. Missing heritability and strategies for finding the underlying causes of complex disease. Nat Rev Genet. 2010;11(6):446–450. doi: 10.1038/nrg2809.
- 29. Fisher RA. The correlation between relatives on the supposition of Mendelian inheritance. Philos Trans R Soc Edinburgh. 1918;52(2):399–433.
- 30. Mackay TFC. Epistasis and quantitative traits: Using model organisms to study gene-gene interactions. Nat Rev Genet. 2014;15(1):22–33. doi: 10.1038/nrg3627.
- 31. Nagylaki T. The evolution of multilocus systems under weak selection. Genetics. 1993;134(2):627–647. doi: 10.1093/genetics/134.2.627.







