Abstract
Empirical evidence for diminishing fitness returns of beneficial mutations supports Fisher’s geometric model. We show that a similar pattern emerges through the phenomenon of regression to the mean and that few studies correct for it. Although biases are often small, regression to the mean has overemphasized diminishing returns and will hamper cross-study comparisons unless corrected for.
Keywords: epistasis, fitness landscape, adaptation, Fisher’s geometric model, regression to the mean
The Problem
Epistasis, i.e., the interactive (nonadditive) effect of coexpressed mutations, is widespread (e.g., Weinreich et al. 2006; Flint and Mackay 2009; Huang et al. 2012; Corbett-Detig et al. 2013) and plays a fundamental role in genetic theories of sex and recombination, mutation load, genetic robustness, response to selection, and speciation (Whitlock et al. 1995; Phillips et al. 2000; De Visser et al. 2011; Olson-Manning et al. 2012; Hansen 2013).
A number of recent studies have attempted to demonstrate that as mutations become increasingly beneficial, they are more likely to show negative epistasis for fitness when combined (supporting information, Table S1). The demonstration of such diminishing fitness returns has bearing on evolutionary theory by providing a mechanistic basis for decelerating rates of adaptation, as predicted from Fisher’s geometric model (FGM) (Martin et al. 2007; Chou et al. 2011; Khan et al. 2011; Draghi and Plotkin 2013; Szendro et al. 2013).
Diminishing-returns epistasis has commonly been inferred from a negative correlation between the additive fitness effects of a pair (or set) of mutations and their epistatic effect in interaction with each other (Table S1). To this end, the relative fitness of mutation i (wi) is typically measured from wabs,i, the absolute fitness of a genotype carrying mutation i, expressed relative to wabs,ref, the fitness of the wild type used as a reference. The relative fitness of a second mutation j (wj) and that of mutations i and j combined (wij) are obtained in the same manner. Subsequently, the epistatic interaction between mutations i and j (Eij) is obtained as
$$E_{ij} = w_{ij} - [w_i + w_j] \tag{1}$$
or, in words, by the difference between the observed fitness of a genotype carrying both mutations i and j (wij) and the expected fitness of this double mutant if gene action is completely additive ([wi + wj]) (e.g., da Silva et al. 2010). Repeating this for a large number of combinations of mutations, the correlation between Eij and [wi + wj] equals
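$$r_{E_{ij},[w_i+w_j]} = \frac{\operatorname{cov}(E_{ij},\,[w_i+w_j])}{\sqrt{\sigma^2(E_{ij})\,\sigma^2(w_i+w_j)}} \tag{2}$$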
In the presence of diminishing-returns epistasis, mutations with large beneficial effects on fitness show more negative epistasis, resulting in this correlation being negative.
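As a minimal sketch of how Equations 1 and 2 are applied in practice, the following computes epistasis and its correlation with expected fitness for a handful of mutation pairs. The fitness values are hypothetical and are assumed to be expressed as effects relative to the reference genotype, so that the additive expectation is simply wi + wj; the function names are illustrative.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def epistasis(w_i, w_j, w_ij):
    """Equation 1: observed double-mutant fitness minus the additive expectation."""
    return w_ij - (w_i + w_j)

# Hypothetical fitness effects (w_i, w_j, w_ij) for four mutation pairs.
pairs = [(0.02, 0.03, 0.05), (0.05, 0.04, 0.08),
         (0.08, 0.07, 0.12), (0.10, 0.12, 0.16)]

expected = [w_i + w_j for w_i, w_j, _ in pairs]   # additive expectation
eps = [epistasis(*p) for p in pairs]              # E_ij per pair
r = pearson(eps, expected)                        # Equation 2

print("E_ij:", eps)
print("r(E_ij, w_i + w_j):", r)
```

In this toy data set the pairs with larger expected fitness have more negative epistasis, so r is negative, which is the pattern usually read as diminishing returns.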
However, by calculating epistasis from expected fitness (Equation 1), the two terms to be correlated will share measurement errors, and a statistical dependence is created artificially. In any empirical study wi, wj, and wij are measured with error. So if wi = ai + ei and wj = aj + ej, where a and e are the additive genetic and residual components of wi and wj, respectively, and wij = ai + aj + iij + eij, where iij is the epistatic effect of mutations i and j, then
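$$r_{E_{ij},[w_i+w_j]} = \frac{\operatorname{cov}(i_{ij} + e_{ij} - e_i - e_j,\; a_i + a_j + e_i + e_j)}{\sqrt{\sigma^2(i_{ij} + e_{ij} - e_i - e_j)\,\sigma^2(a_i + a_j + e_i + e_j)}} \tag{3}$$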
Assuming uncorrelated measurement errors,
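$$r_{E_{ij},[w_i+w_j]} = \frac{\operatorname{cov}(i_{ij},\,[a_i + a_j]) - [\sigma^2(e_i) + \sigma^2(e_j)]}{\sqrt{[\sigma^2(i_{ij}) + \sigma^2(e_{ij}) + \sigma^2(e_i) + \sigma^2(e_j)]\,[\sigma^2(a_i + a_j) + \sigma^2(e_i) + \sigma^2(e_j)]}} \tag{4}$$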
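The step from Equation 3 to Equation 4 can be spelled out term by term. The sketch below additionally assumes that measurement errors are uncorrelated with the genetic effects a and i (an assumption made here for the expansion, on top of errors being uncorrelated with each other), so that all cross terms vanish:

$$
\begin{aligned}
\operatorname{cov}(i_{ij} + e_{ij} - e_i - e_j,\; a_i + a_j + e_i + e_j) &= \operatorname{cov}(i_{ij},\,[a_i + a_j]) - \sigma^2(e_i) - \sigma^2(e_j),\\
\sigma^2(i_{ij} + e_{ij} - e_i - e_j) &= \sigma^2(i_{ij}) + \sigma^2(e_{ij}) + \sigma^2(e_i) + \sigma^2(e_j),\\
\sigma^2(a_i + a_j + e_i + e_j) &= \sigma^2(a_i + a_j) + \sigma^2(e_i) + \sigma^2(e_j).
\end{aligned}
$$

The negative error terms in the covariance arise because ei and ej enter Eij with a minus sign and [wi + wj] with a plus sign.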
It follows that measurement error in wi, wj, and wij [i.e., σ²(ei) = σ²(ej) = σ²(eij) > 0], appearing in the denominator of Equation 4, weakens the correlation. However, [σ²(ei) + σ²(ej)] also appears in the numerator, making negative correlations more negative and positive correlations less positive. The latter is the result of correlating [wi + wj] with wij – [wi + wj] and thereby the measurement error in [wi + wj] (i.e., ei + ej) with itself. On the whole, measurement error can thus result in a negative correlation between expected fitness and epistasis, which could erroneously be interpreted as evidence for diminishing-returns epistasis (Figure 1A).
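The effect can be illustrated with a small simulation. This is a sketch under stated assumptions, not an analysis from the surveyed studies: effects and errors are drawn from normal distributions, error variances are equal for single and double mutants, and true epistasis is independent of the additive effects, so any correlation that appears is due to shared measurement error. All names and parameter values are illustrative.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def simulate_r(var_a, var_i, var_e, n_pairs=20000, seed=1):
    """Simulate measured fitness and return r(E_ij, w_i + w_j)."""
    rng = random.Random(seed)
    sd_a, sd_i, sd_e = math.sqrt(var_a), math.sqrt(var_i), math.sqrt(var_e)
    expected, eps = [], []
    for _ in range(n_pairs):
        a_i, a_j = rng.gauss(0, sd_a), rng.gauss(0, sd_a)
        i_ij = rng.gauss(0, sd_i)              # true epistasis, independent of a
        w_i = a_i + rng.gauss(0, sd_e)         # measured single mutants
        w_j = a_j + rng.gauss(0, sd_e)
        w_ij = a_i + a_j + i_ij + rng.gauss(0, sd_e)  # measured double mutant
        expected.append(w_i + w_j)
        eps.append(w_ij - (w_i + w_j))
    return pearson(eps, expected)

def predicted_r(var_a, var_i, var_e):
    """Equation 4 with cov(i_ij, a_i + a_j) = 0 and equal error variances.

    sigma^2(a_i + a_j) = 2 * var_a because a_i and a_j are drawn independently.
    """
    num = -2 * var_e
    den = math.sqrt((var_i + 3 * var_e) * (2 * var_a + 2 * var_e))
    return num / den

if __name__ == "__main__":
    print("simulated r:", simulate_r(1.0, 0.5, 0.25))
    print("Equation 4 :", predicted_r(1.0, 0.5, 0.25))
```

Increasing `var_e` relative to `var_a` and `var_i` drives both values further below zero even though no diminishing-returns epistasis was simulated.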
Having knowledge of σ²(ei), σ²(ej), and σ²(eij), we are able to obtain the corrected correlation between iij and [ai + aj] that is not biased by measurement error variance, using
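$$r_{i_{ij},[a_i+a_j]} = \frac{\operatorname{cov}(E_{ij},\,[w_i + w_j]) + [\sigma^2(e_i) + \sigma^2(e_j)]}{\sqrt{[\sigma^2(E_{ij}) - \sigma^2(e_{ij}) - \sigma^2(e_i) - \sigma^2(e_j)]\,[\sigma^2(w_i + w_j) - \sigma^2(e_i) - \sigma^2(e_j)]}} \tag{5}$$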
From this it becomes apparent that correcting the variance components in the denominator of Equation 5 for measurement error can lead to both approaching zero whenever error is high relative to additive genetic variance. The latter will be the case whenever correlations are based on statistically nonsignificant variance for epistasis and expected fitness. In such cases, observed correlations run the greatest risk of being inflated. For example, in the extreme scenario when genetic variance = 0 (no additive or epistatic variance), it follows from Equation 4 that the uncorrected correlation between epistasis and expected fitness, assuming equal error variances in single and double mutants, equals −2/√6 ≈ −0.82 purely due to measurement error. Generally, statistical significance of the correlation therefore needs to be evaluated using data-resampling techniques.
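A minimal sketch of the correction in Equation 5 is given below. It assumes the observed (co)variances and the three error variances are already available as numbers; the function names, argument names, and example values are hypothetical. Returning `None` when a corrected variance is not positive is a design choice made here to flag cases where the correction is not meaningful, rather than reporting a value outside the range −1 to 1.

```python
import math

def corrected_r(cov_obs, var_eps_obs, var_exp_obs, var_e_i, var_e_j, var_e_ij):
    """Equation 5: strip error (co)variance from the observed components.

    cov_obs     : observed cov(E_ij, w_i + w_j)
    var_eps_obs : observed variance of E_ij
    var_exp_obs : observed variance of w_i + w_j
    Returns None if either corrected variance is <= 0.
    """
    num = cov_obs + var_e_i + var_e_j
    var_i = var_eps_obs - var_e_ij - var_e_i - var_e_j   # sigma^2(i_ij)
    var_a = var_exp_obs - var_e_i - var_e_j              # sigma^2(a_i + a_j)
    if var_i <= 0 or var_a <= 0:
        return None
    return num / math.sqrt(var_i * var_a)

def pure_error_r():
    """Equation 4 with no genetic variance and equal error variances."""
    return -2 / math.sqrt(3 * 2)

# Hypothetical example: true cov = -0.2, sigma^2(i) = 0.5, sigma^2(a_i + a_j) = 2.0,
# and each error variance = 0.25, giving the observed components below.
observed = -0.7 / math.sqrt(1.25 * 2.5)
corrected = corrected_r(-0.7, 1.25, 2.5, 0.25, 0.25, 0.25)

print("observed r :", observed)
print("corrected r:", corrected)
print("pure-error r:", pure_error_r())
```

In this example the observed correlation is roughly twice as negative as the corrected one, and a data set consisting only of measurement error yields `None` rather than a corrected estimate.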
Although we here focus on negative epistasis of beneficial mutations, the described effect would also generate a pattern where combinations of mutations with increasingly deleterious effects show more positive epistasis. Thereby it may partially explain the lack of empirical support for stronger negative epistasis of increasingly deleterious mutations (i.e., the opposite pattern) as a selective agent maintaining sexual reproduction and recombination (Elena and Lenski 1997; Bonhoeffer et al. 2004). Furthermore, because the correlation between Eij and [wi + wj] is a direct function of the amount of measurement error, variation in the latter can introduce differences in the strength of the correlation among experiments. For example, because the ratio of environmental to genetic variance for fitness often differs between benign and stressful environments (Hoffmann and Merilä 1999; Agrawal and Whitlock 2010), it might erroneously be concluded that epistatic effects are shaped by environmental quality. Similarly, diminishing-returns epistasis might be found to be more pronounced in complex organisms, in which fitness is often estimated with less precision.
The effect outlined here is referred to as “regression to the mean” (Galton 1886) and is a common cause of misinterpretation in biology (Kelly and Price 2005; Postma 2006, 2011; Roff 2011; Verhulst et al. 2013) and other sciences (e.g., Hotelling 1933; Kahneman and Tversky 1973). Although we have focused on one way the phenomenon can introduce biases, it may raise its head in other ways. First, we note that although Equations 4 and 5 need to be modified if epistasis instead is defined in relative terms or by a multiplicative model and/or if epistasis is regressed on the fitness of the genetic background into which new mutations were introduced, bias remains (Table S1). Second, whenever only mutations with relatively strong effects are selected from a larger sample (as is often the case; Table S1) and if fitness is measured with error, these mutations will on average have lower fitness when measured again and therefore show apparent negative epistasis. Third, even when mutant fitness is estimated without error, estimates may be biased when combinations of beneficial mutations are selected for further investigation from experimental evolution studies (which is common too; Table S1). This is because mutations with large positive epistasis for fitness on the particular genetic background of the experimental population are more likely to fix during experimental evolution and thereby to be selected for introduction into other genetic backgrounds, where they will on average have lower fitness (Draghi and Plotkin 2013; Chou et al. 2014; Greene and Crona 2014).
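The second mechanism described above, selecting strong-effect mutations from a larger sample measured with error, can be sketched as follows. The simulation assumes normally distributed true effects and errors and an arbitrary 10% selection threshold; all names and parameter values are hypothetical.

```python
import random

def selection_regression(n_mut=20000, sd_a=0.02, sd_e=0.02, top_frac=0.1, seed=2):
    """Select the top fraction of mutations on a first noisy fitness assay
    and compare their mean fitness in that assay, in an independent second
    assay, and in truth."""
    rng = random.Random(seed)
    true = [rng.gauss(0.0, sd_a) for _ in range(n_mut)]
    first = [a + rng.gauss(0.0, sd_e) for a in true]    # assay used for selection
    second = [a + rng.gauss(0.0, sd_e) for a in true]   # independent re-measurement
    order = sorted(range(n_mut), key=lambda k: first[k], reverse=True)
    chosen = order[: int(top_frac * n_mut)]

    def mean_of(values):
        return sum(values[k] for k in chosen) / len(chosen)

    return mean_of(first), mean_of(second), mean_of(true)

m_first, m_second, m_true = selection_regression()
print("mean in selection assay:", m_first)
print("mean when re-measured  :", m_second)
print("mean true effect       :", m_true)
```

The selected mutations look less beneficial on re-measurement because part of their apparent advantage in the first assay was error. If the first-assay values are then used to form the additive expectation, the shortfall appears as negative epistasis.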
A Brief Literature Survey
Although some authors seem aware of the issue, few have attempted to account for it (Table S1), and the severity of the bias hence remains unknown. We reviewed 30 recent articles that reported results on diminishing-returns epistasis for fitness in microorganisms. In 22 studies, epistasis was directly related to expected fitness, and 18 of these did so without correcting for regression to the mean (Table S1). We note that only one study (Szafraniec et al. 2003) looked for diminishing-returns epistasis by regressing observed on predicted fitness, using reduced major axis regression and testing for a slope significantly <1. This method, although not free of problematic assumptions regarding the nature of error variances (see Warton et al. 2006; Smith 2009), is more robust to the issue raised here. From 15 studies estimating the correlation between Eij and [wi + wj], we were able to extract estimates of variance components (for details see Table S1, and for a numerical example see Table S2), which allowed us to obtain unbiased estimates of 25 published correlations, using Equation 5. In four additional cases, correlations could not be corrected because of nonsignificant epistatic variance. In these cases, (almost) all variation in Eij is the result of measurement error variance, resulting in corrected correlations taking on values outside the theoretical boundary (see Table S1). The fact that these correlations were strongly negative before correction, together with the fact that most corrected correlations are less negative than the published estimates, shows that regression to the mean introduces directional bias into empirical estimates of diminishing-returns epistasis (Figure 1B, Table S1). In most cases, however, corrections did not affect results qualitatively, which can be attributed to mutant fitness typically being estimated with small error.
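For readers unfamiliar with reduced major axis (RMA) regression, the sketch below contrasts its slope with the ordinary least-squares (OLS) slope when observed double-mutant fitness is regressed on predicted fitness. It uses the standard RMA slope, sign(r) times the ratio of standard deviations; it is not necessarily the exact procedure of Szafraniec et al. (2003), it omits the significance test for a slope <1, and the data are hypothetical.

```python
import math

def ols_and_rma_slopes(x, y):
    """Return (OLS slope, RMA slope) for y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ols = sxy / sxx                                   # attenuated by error in x
    rma = math.copysign(math.sqrt(syy / sxx), sxy)    # sign(r) * sd(y) / sd(x)
    return ols, rma

# Hypothetical predicted (w_i + w_j) and observed (w_ij) fitness values.
predicted = [1.0, 2.0, 3.0, 4.0]
observed_w = [1.1, 1.9, 3.2, 3.8]

ols, rma = ols_and_rma_slopes(predicted, observed_w)
print("OLS slope:", ols)
print("RMA slope:", rma)
```

Because the RMA slope equals the OLS slope divided by |r|, it is never shallower than the OLS slope, which is why it is less prone to signalling a slope below 1 merely because predicted fitness is measured with error. It still depends on the relative error variances of the two axes, as the text notes.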
Conclusion
Here we have shown how biases due to regression to the mean inflate estimates of diminishing-returns epistasis. Although the majority of studies have not corrected for this, biases are in most cases small. Nevertheless, we do observe bias, most notably with four cases of published negative correlations based on nonsignificant epistatic variances, underlining the importance of performing corrections to allow accurate comparative analyses and prevent publication bias. We also note that we may have underestimated the amount of bias by assuming uncorrelated measurement errors, an assumption that is often violated in experiments by uncontrolled temporal or spatial block effects. Crucially, such effects would lead to undetected measurement errors that would overestimate diminishing-returns epistasis further. Application of appropriate statistical corrections in future studies will further increase our understanding of the manifestation and role of diminishing-returns epistasis in evolution.
Acknowledgments
We thank B. Rogell and C. Rueffler for helpful discussions.
Footnotes
Supporting information is available online at http://www.genetics.org/lookup/suppl/doi:10.1534/genetics.114.169870/-/DC1.
Communicating editor: E. A. Stone
Literature Cited
- Agrawal A. F., Whitlock M. C., 2010. Environmental duress and epistasis: How does stress affect the strength of selection on new mutations? Trends Ecol. Evol. 25: 450–458.
- Bonhoeffer S., Chappey C., Parkin N. T., Whitcomb J. M., Petropoulos C. J., 2004. Evidence for positive epistasis in HIV-1. Science 306: 1547–1550.
- Chou H., Chiu H., Delaney N. F., Segre D., Marx C. J., 2011. Diminishing returns epistasis among beneficial mutations decelerates adaptation. Science 332: 1190–1192.
- Chou H., Delaney N. F., Draghi J. A., Marx C. J., 2014. Mapping the fitness landscape of gene expression uncovers the cause of antagonism and epistasis between adaptive mutations. PLoS Genet. 10: e1004149.
- Corbett-Detig R. B., Zhou J., Clark A. G., Hartl D. L., Ayroles J. F., 2013. Genetic incompatibilities are widespread within species. Nature 504: 135–137.
- da Silva J., Coetzer M., Nedellec R., Pastore C., Mosier D. E., 2010. Fitness epistasis and constraints on adaptation in a human immunodeficiency virus type 1 protein region. Genetics 185: 292–303.
- De Visser J. A. G. M., Cooper T. F., Elena S. F., 2011. The causes of epistasis. Proc. Biol. Sci. 278: 3617–3624.
- Draghi J. A., Plotkin J. B., 2013. Selection biases the prevalence and type of epistasis along adaptive trajectories. Evolution 67: 3120–3131.
- Elena S. F., Lenski R. E., 1997. Test of synergistic interactions among deleterious mutations in bacteria. Nature 390: 395–398.
- Flint C., Mackay T. F. C., 2009. Genetic architecture of quantitative traits in mice, flies, and humans. Genome Res. 19: 723–733.
- Galton F., 1886. Regression to mediocrity in hereditary stature. J. Anthropol. Inst. 15: 246–263.
- Greene D., Crona K., 2014. The changing geometry of a fitness landscape along an adaptive walk. PLoS Comput. Biol. 10: e1003520.
- Hansen T. F., 2013. Why epistasis is important for selection and adaptation. Evolution 67: 3501–3511.
- Hoffmann A. A., Merilä J., 1999. Heritable variation and evolution under favourable and unfavourable conditions. Trends Ecol. Evol. 14: 96–101.
- Hotelling H., 1933. Review of The Triumph of Mediocrity in Business. J. Am. Stat. Assoc. 28: 463–465.
- Huang W., Richards S., Carbone M. A., Zhu D., Anholt R. R. H., et al., 2012. Epistasis dominates the genetic architecture of Drosophila quantitative traits. Proc. Natl. Acad. Sci. USA 109: 15553–15559.
- Kahneman D., Tversky A., 1973. On the psychology of prediction. Psychol. Rev. 80: 237–251.
- Kelly C., Price T. D., 2005. Correcting for regression to the mean in behavior and ecology. Am. Nat. 166: 700–707.
- Khan A. I., Dinh D. M., Schneider D., Lenski R. E., Cooper T. F., 2011. Negative epistasis between beneficial mutations in an evolving bacterial population. Science 332: 1193–1196.
- Martin G., Elena S. F., Lenormand T., 2007. Distributions of epistasis in microbes fit predictions from a fitness landscape model. Nat. Genet. 39: 555–560.
- Olson-Manning C. F., Wagner M. R., Mitchell-Olds T., 2012. Adaptive evolution: evaluating empirical support for theoretical predictions. Nat. Rev. Genet. 13: 867–877.
- Phillips P. C., Otto S. P., Whitlock M. C., 2000. Beyond the average: the evolutionary importance of gene interactions and epistatic effects, pp. 20–38 in Epistasis and the Evolutionary Process, edited by Wolf J. B., Brodie E. D., Wade M. J. Oxford University Press, Oxford.
- Postma E., 2006. Implications of the difference between true and predicted breeding values for the study of natural selection and micro-evolution. J. Evol. Biol. 19: 309–320.
- Postma E., 2011. Comment on “Additive genetic breeding values correlate with the load of partially deleterious mutations”. Science 333: 1221-a.
- Roff D. A., 2011. Measuring the cost of plasticity: a problem of statistical non-independence. Proc. R. Soc. Lond. Ser. B 278: 2724–2725; discussion 2726–2727.
- Smith R. J., 2009. Use and misuse of the reduced major axis for line-fitting. Am. J. Phys. Anthropol. 140: 476–486.
- Szafraniec K., Wloch D. M., Sliwa P., Borts R. H., Korona R., 2003. Small fitness effects and weak genetic interactions between deleterious mutations in heterozygous loci of the yeast Saccharomyces cerevisiae. Genet. Res. 82: 19–31.
- Szendro I. G., Schenk M. F., Franke J., Krug J., de Visser J. A. G. M., 2013. Quantitative analyses of empirical fitness landscapes. J. Stat. Mech. 2013: P01005.
- Warton D. I., Wright I. J., Falster D. S., Westoby M., 2006. Bivariate line-fitting methods for allometry. Biol. Rev. Camb. Philos. Soc. 81: 259–291.
- Weinreich D. M., Delaney N. F., DePristo M. A., Hartl D. L., 2006. Darwinian evolution can follow only very few mutational paths to fitter proteins. Science 312: 111–114.
- Whitlock M. C., Phillips P. C., Moore F. B. G., Tonsor S., 1995. Multiple fitness peaks and epistasis. Annu. Rev. Ecol. Syst. 26: 601–629.
- Verhulst S., Aviv A., Benetos A., Berenson G. S., Kark J. D., 2013. Do leukocyte telomere length dynamics depend on baseline telomere length? An analysis that corrects for regression to the mean. Eur. J. Epidemiol. 11: 859–866.