Nature Communications. 2020 Nov 5;11:5592. doi: 10.1038/s41467-020-19437-x

Increasing our ability to predict contemporary evolution

Patrik Nosil 1,2, Samuel M Flaxman 3, Jeffrey L Feder 4, Zachariah Gompert 2
PMCID: PMC7645684  PMID: 33154385

Abstract

Classic debates concerning the extent to which scientists can predict evolution have gained new urgency as environmental changes force species to adapt or risk extinction. We highlight how our ability to predict evolution can be constrained by data limitations that cause poor understanding of deterministic natural selection. We then emphasize how such data limits can be reduced with feasible empirical effort involving a combination of approaches.

Subject terms: Evolutionary ecology, Evolutionary theory

What is predictability and why does it matter?

Prediction is a core component of the sciences. However, evolutionary biology is often portrayed as a descriptive or historical science, rather than a predictive one1–3. Nonetheless, the predictability of evolution can be quantified, for example by testing how well existing time series predict future evolutionary changes (Fig. 1)1,4. Besides its scientific importance, our ability to predict evolution has applied implications, for example for the development of vaccines and antibiotics (i.e., viruses and bacteria evolve to be resistant), animal breeding programs aimed at conservation and reintroduction, and biocontrol of insect pests that attack crops and lumber.

Fig. 1. Quantifying the predictability of short-term evolution using time-series data.

Autoregressive moving average (ARMA) models can be applied to existing data to generate predictions for future trait values or allele frequencies. In turn, the fit (e.g., r² value) of these predicted values to those actually observed provides a metric of the predictability of evolution.

Here we focus on predictability defined as the ability to forecast future trait values or allele frequencies using existing data (Fig. 1). Such predictive ability can be studied using temporal data alone, or by adding information on the mechanisms and genomic basis of evolution5. We focus on contemporary evolution using time-series data spanning several to dozens of generations (i.e., in many organisms this will equate to decades), where evolution may proceed via standing genetic variation or new mutations. This focus on medium-term evolution complements quantitative genetics work on the predictability of immediate, single-generation responses to selection and studies that consider parallel and repeated evolution over longer (e.g., phylogenetic) time scales3,6.
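The metric described in Fig. 1 can be illustrated with a minimal sketch. It is not the analysis used in the cited studies: it assumes a first-order autoregressive model, AR(1), which is the simplest special case of ARMA, fits it by ordinary least squares, and scores one-step-ahead forecasts on held-out years using the squared correlation between observed and predicted values. The simulated series, the parameter values, and the function names `fit_ar1`, `r_squared`, and `simulate_ar1` are all hypothetical.

```python
import random


def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    prev, nxt = series[:-1], series[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_next = sum(nxt) / n
    cov = sum((a - mean_prev) * (b - mean_next) for a, b in zip(prev, nxt))
    var = sum((a - mean_prev) ** 2 for a in prev)
    phi = cov / var
    c = mean_next - phi * mean_prev
    return c, phi


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)


def simulate_ar1(c, phi, noise_sd, n_steps, x0, rng):
    """Simulated stand-in for a yearly trait or allele-frequency series."""
    series = [x0]
    for _ in range(n_steps - 1):
        series.append(c + phi * series[-1] + rng.gauss(0.0, noise_sd))
    return series


rng = random.Random(1)
series = simulate_ar1(c=0.1, phi=0.8, noise_sd=0.05, n_steps=60, x0=0.5, rng=rng)

# Fit on the first 40 time points, then forecast the remaining 20
train, test = series[:40], series[40:]
c_hat, phi_hat = fit_ar1(train)

# One-step-ahead forecasts: each held-out value is predicted from the one before it
previous_values = series[39:-1]
predicted = [c_hat + phi_hat * x for x in previous_values]

predictability = r_squared(test, predicted)
print(f"phi = {phi_hat:.2f}, r^2 on held-out data = {predictability:.2f}")
```

Scoring on held-out time points, rather than on the data used for fitting, keeps the metric a measure of forecasting ability rather than of model fit.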

Two hypotheses for limits in our ability to predict evolution

The degree to which evolution is predictable forms a long-standing debate in biology3,7. At the core of this debate is the question of the extent to which evolution is driven by random versus deterministic processes3 (Fig. 2). In this context, there are two main classes of explanation for difficulties in predicting evolution. First, predictability can be limited by random processes (the “random limits” hypothesis)8. The key mechanisms underlying this hypothesis are stochastic changes in allele frequency due to genetic drift and the random nature of mutation. Second, even evolution driven by deterministic natural selection can be difficult to predict, due to limited data that in turn leads to poor understanding of selection and its environmental causes, trait variation, and inheritance4,9,10 (the “data limits” hypothesis). Indeed, a starting point for improving our ability to predict evolution is to increase understanding of when selection is expected to be directional, fluctuating, or stabilizing.

Fig. 2. Schematic illustration of two hypotheses for limitations on predicting evolution.

This includes depiction of the evolutionary processes involved, and data which might be used to improve prediction. QTL, quantitative trait locus; GWA, genome-wide association.

These explanations are not mutually exclusive and are likely to operate simultaneously. However, they are conceptually distinct due to core differences in the factors that they propose to limit our ability to make accurate predictions; inherent unpredictability caused by stochastic processes underlies the random limits hypothesis, whereas insufficient knowledge on the part of those trying to form predictions underlies the data limits hypothesis.

In terms of the data limits hypothesis, the underlying assumption is that with sufficient data and proper analysis, deterministic processes can be predicted. Thus, shortcomings in predictive ability stem largely from insufficient data and inadequate analytical tools, not from inherent randomness per se. Limits to data and our understanding of evolutionary process can arise at several levels. First, the environmental sources of selection, such as climatic conditions or predator abundance, might fluctuate in ways that are themselves difficult to predict, even if they are deterministic1. We stress that even deterministic environmental fluctuations might appear random, due to sensitivity to initial conditions that generates chaotic dynamics11. Such chaotic fluctuations are not truly random and our ability to predict them is still, in principle, tied to data limits. Second, even if environmental changes can be predicted, poor understanding of how environmental factors affect resource distributions and impose selection on phenotypes can reduce predictive ability for trait evolution. Third, poor understanding of the genetic architecture of traits can produce difficulties predicting genetic change from patterns of phenotypic selection5,12. For example, prediction can be complicated by phenotypic plasticity, which may be a common way that organisms respond to environmental change13.

At all these levels, limits can arise in the quality or quantity of data, and in analysis. Such data limits are exacerbated by the potential for different factors to act at varying temporal and spatial scales, and by the fact that rare and difficult to predict environmental changes can have large effects on evolution. These general concepts apply across environmental factors, traits, and taxa, as outlined in Box 1 using examples in birds, insects, and other organisms.

Box 1. Examples of our ability to predict evolution in natural populations.

We here discuss progress and challenges in predicting evolution using empirical examples. A first example involves fluctuating selection caused by climatic variability, which has been documented in numerous species26–28 (Fig. 3a). Perhaps the best example stems from long-term studies of beak size evolution in Darwin’s finches1. Here, variation in rainfall on Daphne Major has been shown to affect the relative abundances of small versus large seeds, which in turn can exert selection on beak size in Geospiza fortis during drought conditions. Thus, rare and difficult-to-predict droughts can have large effects on evolution. Indeed, in the case of G. fortis it could be argued that evolution is unpredictable not because we don’t understand selection (i.e., selection is known to be exerted by seed size distributions), but rather because available data and models cannot predict climatic fluctuations, or how these affect seed size distributions. Thus, prediction in this case was limited (r² ~ 0.14; this value is a point estimate from autocorrelation analysis of how well trait values for beaks in the past predict those in the future, see also Fig. 1)4, and might be improved via better climate models and data on how climate affects resource distributions.

A second example involves predation, which is a common source of natural selection that can fluctuate according to prey characteristics (Fig. 3b). In particular, predation can cause negative frequency-dependent selection (NFDS) when predators focus on more common prey types. In such cases, the fitness of a phenotype fluctuates because it depends on the phenotype’s frequency in the population, and is higher when the phenotype is rare. This has been documented, for example, in cichlids, guppies, stickleback, and stick insects4,29–31. Such systems represent cases where evolution is expected to be easier to predict. Even with NFDS, however, data limits can apply, as illustrated by long-term studies of the evolution of striped and unstriped cryptic color morphs in Timema cristinae stick insects4. In T. cristinae, morph frequencies fluctuate predictably among years (r² ~ 0.90) and there is experimental support for NFDS. Specifically, an experiment showed that the striped morph is strongly favored when initially rare (i.e., 20% initial frequency), but shows idiosyncratic changes when initially common (80% initial frequency). Whether selection would differ if ratios were manipulated more extremely is unclear. Moreover, why fluctuations occur at yearly, rather than monthly, scales is unknown. Thus, prediction might be improved by estimating the quantitative form of the NFDS fitness function, and via understanding factors that affect the foraging behavior and search images of bird predators. Nonetheless, evolution was highly predictable in this example, and the mechanisms of evolution are reasonably understood due to insights from combining experiments and genomics. Specifically, experiments support NFDS and genomic data rule out a predominant role for random genetic drift, and have clarified the role of epistasis32 and suppressed recombination in the evolution of color genes.
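To make concrete why the quantitative form of the NFDS fitness function matters for prediction, the following sketch iterates morph frequencies under one assumed form. It is purely illustrative and not estimated from T. cristinae data: each morph's fitness is assumed to decline linearly with its own frequency, with a hypothetical strength parameter `s`, and morphs are treated as clonally inherited types in an effectively infinite population. The starting frequencies of 0.2 and 0.8 simply echo the experimental design described above.

```python
def next_frequency(p, s):
    """One generation of negative frequency-dependent selection on two morphs.

    p is the frequency of the striped morph. Each morph's fitness is assumed
    to decline linearly with its own frequency (a hypothetical fitness function).
    """
    w_striped = 1.0 - s * p
    w_unstriped = 1.0 - s * (1.0 - p)
    mean_fitness = p * w_striped + (1.0 - p) * w_unstriped
    return p * w_striped / mean_fitness


def trajectory(p0, s, generations):
    """Striped-morph frequency over time, starting from p0."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], s))
    return freqs


# Different assumed strengths of NFDS give different predicted trajectories
for s in (0.1, 0.5, 0.9):
    rare = trajectory(0.2, s, 10)
    common = trajectory(0.8, s, 10)
    print(f"s={s}: rare start -> {rare[-1]:.3f}, common start -> {common[-1]:.3f}")
```

Under this assumed function the rare morph always increases and the common morph always decreases, but the speed of change depends on `s`, so uncertainty about the fitness function translates directly into uncertainty about the forecast.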

Challenges and ways forward

The examples in Box 1 illustrate how data limits in even well-studied systems can mediate the extent to which scientists can predict evolution. However, rather than dampening hope for prediction, the results suggest that progress can be made with empirical effort, for example via coupling long-term monitoring of populations with large, replicated experiments that reveal evolutionary process, and powerful genomic tools that allow dissection of the genetic basis of traits. Nonetheless, gathering such data will rarely be a trivial task. At a minimum, obtaining time-series data necessarily takes time, and this cannot be sped up with more effort. Identifying and measuring additional factors affecting evolutionary dynamics, such as relevant environmental parameters and selection estimates, increases the effort required. Simulation models calibrated based on empirical understanding of a system may aid in parsing the effects of different factors on predictability (e.g., variation in selection, genetic architecture, random drift), thus guiding researchers as to where further effort is best placed, the sample sizes required to increase precision, etc. Box 1 provides specific examples of how knowledge of a study system can inform where additional empirical effort is best placed, and Table 1 lists analytical tools that enable prediction. Thus, we propose that focused data collection and analysis can improve prediction of evolution. However, we temper this claim with the caveat that this will not necessarily be an easy task, particularly because the required measurements potentially span different scales of time, space, and biological organization.
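As a rough illustration of how simulation can help parse the contributions of selection and drift, the sketch below runs a haploid Wright-Fisher model with a constant selection coefficient in replicate populations of two sizes. It is a toy example written for this purpose, not SLiM and not calibrated to any empirical system; the function names, population sizes, selection coefficient, and number of generations are all assumed values.

```python
import random
import statistics


def wright_fisher(p0, s, pop_size, generations, rng):
    """Haploid Wright-Fisher model: selection shifts the expected frequency,
    then binomial sampling of pop_size individuals adds genetic drift."""
    p = p0
    for _ in range(generations):
        expected = p * (1.0 + s) / (1.0 + s * p)  # deterministic selection
        copies = sum(rng.random() < expected for _ in range(pop_size))  # drift
        p = copies / pop_size
    return p


def final_frequencies(pop_size, replicates=20, seed=42):
    """Final allele frequencies across independent replicate populations."""
    rng = random.Random(seed)
    return [wright_fisher(0.5, 0.05, pop_size, 20, rng) for _ in range(replicates)]


small = final_frequencies(pop_size=50)
large = final_frequencies(pop_size=2000)

# Same selection in both cases; only the strength of drift differs
print("variance among replicates, N=50:  ", statistics.pvariance(small))
print("variance among replicates, N=2000:", statistics.pvariance(large))
```

The spread among replicates gives a sense of how much unpredictability is attributable to drift at a given population size, which is the kind of comparison that could guide where further empirical effort is best placed.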

Table 1.

Examples of data types and models that can aid the quantification of uncertainty related to predicting evolution over moderate time scales.

| Data type | Model | Key features | Software (citation) |
| --- | --- | --- | --- |
| Trait genetics | Bayesian sparse linear mixed model (BSLMM) | Estimates heritabilities, genetic covariances and number of causal genetic variants while accounting for (and quantifying) uncertainty in genotype-phenotype associations | GEMMA (ref. 12) |
| Climatic variation | Bayesian modeling of uncertainty in ensembles of climate models | Generates future, predictive distributions of climatic variation with uncertainty over different climate models | JAGS/STAN (ref. 21) |
| Ecological interactions | N-level structural equation modeling (e.g., generalized linear latent and mixed models (GLLAMM)) | Multilevel extension of structural equation modeling that allows for interactions across hierarchical levels in a Bayesian context; can consider joint uncertainty of model parameters and latent variables | xxM (ref. 22) |
| Evolution | Forward genetic simulation models (e.g., Wright-Fisher and extensions with age structured populations, etc.) | Flexible models that allow for drift, selection, gene flow, and other evolutionary processes; can be fit in various ways, and can incorporate ecological data | SLiM 3 (ref. 23) |
| Time series | Autoregressive moving average models (ARMA) | Models that account for spatial or temporal autocorrelation; of broad and general use for time-series analysis | JAGS/STAN (ref. 24) |
| Combination of data types | Hierarchical (multilevel) Bayesian models | General class of flexible Bayesian models that can combine disparate types of data to make joint inference of evolutionary processes, considering uncertainty from each source and integrated over sources | JAGS/STAN (ref. 25) |

We focus mostly on hierarchical (i.e., multilevel) models that can be fit in a Bayesian context. Each model accounts for uncertainty (due to data limits or randomness) in a factor relevant for predicting evolution, but an ideal analysis would combine these components to propagate information and uncertainty across these disparate components. We stress that the examples below are representative, but by no means exhaustive.

Moreover, many complexities make it difficult to obtain data sufficient for accurate prediction (Fig. 3). An example of such a complexity is where mutations interact with one another (i.e., epistasis), rather than having additive effects. Epistasis can cause some genotypic combinations to have much higher fitness than others. Thus, epistasis can cause even adaptive (i.e., non-neutral) evolution to be mediated by historical contingencies in the type and order of mutations that arise14,15. Specifically, mutations that arise early in evolution can strongly affect which mutations are subsequently viable, making evolution dependent on mutation-order and difficult to predict. For example, mutations that arise early in the evolution of antibiotic resistance affect which subsequent mutations are favored by natural selection15. Other interactions, such as those between genes and the environment, are likely to have similar effects for complicating prediction.
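The dependence on mutation order can be shown with a small worked example. The sketch below uses a hypothetical three-locus landscape with made-up fitness values (not drawn from the antibiotic resistance data) and a simple adaptive walk in which mutations are tried in a fixed order and retained only if they increase fitness. On the additive landscape the order is irrelevant; on the epistatic landscape the first mutation to arise determines which peak is reached.

```python
from itertools import product


def adaptive_walk(fitness, start, mutation_order):
    """Try single-locus mutations in the given order and keep any that raise
    fitness; stop when no mutation in the list is beneficial."""
    current = start
    improved = True
    while improved:
        improved = False
        for locus in mutation_order:
            mutant = list(current)
            mutant[locus] = 1 - mutant[locus]  # flip the allele at this locus
            mutant = tuple(mutant)
            if fitness[mutant] > fitness[current]:
                current = mutant
                improved = True
    return current


# Additive landscape: each derived allele adds 0.1 regardless of background
additive = {g: 1.0 + 0.1 * sum(g) for g in product((0, 1), repeat=3)}

# Epistatic landscape (hypothetical values): mutations at loci 0 and 1 are each
# beneficial alone but incompatible together, and the mutation at locus 2 is
# beneficial only on the background carrying the locus-0 mutation
epistatic = {
    (0, 0, 0): 1.0, (1, 0, 0): 1.1, (0, 1, 0): 1.1, (0, 0, 1): 0.9,
    (1, 1, 0): 0.9, (1, 0, 1): 1.3, (0, 1, 1): 1.0, (1, 1, 1): 0.8,
}

ancestor = (0, 0, 0)
for name, landscape in (("additive", additive), ("epistatic", epistatic)):
    first_0 = adaptive_walk(landscape, ancestor, [0, 1, 2])
    first_1 = adaptive_walk(landscape, ancestor, [1, 0, 2])
    print(f"{name}: locus 0 first -> {first_0}, locus 1 first -> {first_1}")
```

In the epistatic case the population that fixes the locus-1 mutation first becomes stranded on a lower local peak, which is the kind of historical contingency that makes the outcome hard to forecast without knowing which mutations arise first.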

Fig. 3. Hypothetical examples of how variation in different factors can limit the predictability of evolution driven by deterministic natural selection.

This figure is motivated by empirical systems, but does not depict real data. a Uncertainty in climatic variability can limit the predictability of evolution for traits affected by environment-dependent fluctuating selection, such as beak size in G. fortis. Here black lines denote observed (left half) or predicted (right half) climatic values, and red lines denote observed (left half) or predicted (right half) trait values. Multiple possible predictions are shown. b Uncertainty in the form of the selection function can limit the predictability of evolution by negative frequency-dependent selection, as is observed for color pattern in T. cristinae stick insects. Possible evolutionary trajectories given three different selection functions (different colored lines) are shown here. c Predictability can also be limited by sensitivity to initial conditions, as occurs on rugged fitness landscapes with considerable epistasis. Two hypothetical fitness landscapes with low (top) and high (bottom) epistasis, and thus sensitivity to initial conditions, are shown (left side; the axes represent genotypes for different loci). Hypothetical evolutionary trajectories from different starting conditions are shown on the right (colored lines). High epistasis promotes different outcomes dependent on initial conditions. Finch and stick insect drawings courtesy of R. Ribas.

A related issue is sensitivity to initial conditions11, which can lead to chaotic dynamics that are deterministic but impossible to predict unless initial conditions are known with extreme precision. An example where this might occur is evolution on highly rugged fitness landscapes, where ruggedness arises due to epistasis. Here, the starting place on a rugged landscape might strongly affect which local fitness peaks are climbed and which valleys are difficult to cross. Although biology may not have a strict counterpart to the Heisenberg uncertainty principle, it is possible that data collection itself alters starting conditions for evolution (e.g., if a human observer scares away predators, this could affect predator-prey dynamics for subsequent evolution). Chaos has received much attention outside of the biological sciences and in the field of ecology, but is not often considered in evolution.
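Sensitivity to initial conditions can be demonstrated with a standard textbook system. The sketch below uses the logistic map, a generic model from population ecology rather than a model of evolution, and follows two trajectories whose starting values differ by one part in a billion. The growth parameters, starting value, and function names are assumptions chosen for illustration.

```python
def logistic_map(x0, r, steps):
    """Iterate the deterministic map x -> r * x * (1 - x)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def gaps(r, x0=0.2, delta=1e-9, steps=50):
    """Absolute difference over time between two nearly identical starts."""
    a = logistic_map(x0, r, steps)
    b = logistic_map(x0 + delta, r, steps)
    return [abs(x - y) for x, y in zip(a, b)]


stable = gaps(r=2.5)   # settles to a fixed point; the initial error shrinks
chaotic = gaps(r=4.0)  # chaotic regime; the initial error is amplified

print(f"r=2.5: final gap = {stable[-1]:.2e}")
print(f"r=4.0: largest gap = {max(chaotic):.2e}")
```

Both runs are fully deterministic, yet in the chaotic regime a measurement error far smaller than any realistic field estimate is enough to make the long-run forecast uninformative, whereas short-run forecasts remain accurate.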

All this said, there are also reasons for hope. For example, conceptual and analytical frameworks from the scientific study of complex systems exist to aid prediction of complex phenomena (Table 1). Specifically, systems thinking focuses on understanding and predicting how complex networks exhibit emergent properties not shown by individual nodes in the network16. In terms of evolution, this involves considering the dynamics of collective networks of genes, populations, and interacting species, rather than trying to use reductionist approaches to understand components in isolation. Because systems approaches apply across scientific disciplines, a qualitative analogy can be drawn between the current state of a biological population and the ability to predict its future state based on knowledge of the evolutionary forces operating on it, and the current state of a physical system and the ability to predict its future state based on knowledge of the physical forces acting upon it. In both physics and biology there is the distinction between predictions for individual particles or genes versus the aggregate behavior of many particles (as in statistical thermodynamics) or genes (leading to quantitative genetic breeding values)17.

Conclusions

In conclusion, although collecting sufficient data for prediction may often represent a formidable challenge, we argue that it is not an insurmountable one. With creative application of emerging technologies and analytical approaches we may improve our ability to predict evolutionary patterns and processes. For example, genomic tools will allow the inference of genetic details such as non-linearities in the genotype-phenotype-fitness map18, which can then be incorporated into models to improve prediction. Box 1 provides an example where genomic tools, experiments, and knowledge of genetic and ecological interactions were used to aid prediction of evolution in stick insects. In turn, improved ability to predict evolution may affect our understanding of ecological processes, because to the extent that evolution can be predicted, perhaps so can its ecological consequences for communities and ecosystems19.

A major avenue for future work is to expand the concepts presented here across broader time scales, where the probability of rare yet consequential events increases. Such longer-term prediction will likely require combining contemporary time series data with deeper phylogenetic patterns, and experimental tests of evolutionary processes. Indeed, progress on this front is exemplified by long-term experimental evolution studies in microbes that demonstrate the effects of rare yet consequential random mutations20. Although only further work can reveal the extent to which prediction can be realistically improved, we propose that appreciable progress should be possible in at least some species.

Acknowledgements

The authors thank T. Reimchen for years of discussion that influenced the development of the ideas presented here. The work was funded by a grant from the European Research Council (EE-Dynamics 770826, https://erc.europa.eu/) to P.N., from NSF (DEB-1638997) and the USDA-NIFA program (2015-67013-23289) to J.L.F., and from the NSF (DEB 1844941) to Z.G. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Author contributions

P.N., S.M.F., J.L.F., and Z.G. conceived the project. P.N., S.M.F., J.L.F., and Z.G. contributed to writing.

Competing interests

The authors declare no competing interests.

Footnotes

Peer review information Nature Communications thanks Rosemary Grant, Michael Kinnison and the other, anonymous, reviewer for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1. Grant PR, Grant BR. Unpredictable evolution in a 30-year study of Darwin’s Finches. Science. 2002;296:707–711. doi: 10.1126/science.1070315.
  • 2. Lässig, M., Mustonen, V. & Walczak, A. M. Predicting evolution. Nat. Ecol. Evol. 1, 77 (2017).
  • 3. Blount, Z. D., Lenski, R. E. & Losos, J. B. Contingency and determinism in evolution: replaying life’s tape. Science 362, eaam5979 (2018).
  • 4. Nosil P, et al. Natural selection and the predictability of evolution in Timema stick insects. Science. 2018;359:765–770. doi: 10.1126/science.aap9125.
  • 5. Exposito-Alonso M, Burbano HA, Bossdorf O, Nielsen R, Weigel D. Natural selection on the Arabidopsis thaliana genome in present and future climates. Nature. 2019;573:126–129. doi: 10.1038/s41586-019-1520-9.
  • 6. Stern, D. L. Evolution, Development, & the Predictable Genome (Roberts & Co. Publishers, USA, 2011).
  • 7. Reznick DN, Travis J. Is evolution predictable? Science. 2018;359:738–739. doi: 10.1126/science.aas9043.
  • 8. Gould, S. J. The Structure of Evolutionary Theory (Harvard University Press, USA, 2002).
  • 9. Reimchen TE. Predator-induced cyclical changes in lateral plate frequencies of Gasterosteus. Behaviour. 1995;132:1079–1094. doi: 10.1163/156853995X00469.
  • 10. Marques DA, et al. Experimental evidence for rapid genomic adaptation to a new niche in an adaptive radiation. Nat. Ecol. Evol. 2018;2:1128–1138. doi: 10.1038/s41559-018-0581-8.
  • 11. Rego-Costa A, Débarre F, Chevin L-M. Chaos and the (un)predictability of evolution in a changing environment. Evolution. 2018;72:375–385. doi: 10.1111/evo.13407.
  • 12. Zhou, X., Carbonetto, P. & Stephens, M. Polygenic modeling with bayesian sparse linear mixed models. PLoS Genet. 9, e1003264 (2013).
  • 13. Pfennig DW, et al. Phenotypic plasticity’s impacts on diversification and speciation. Trends Ecol. Evol. 2010;25:459–467. doi: 10.1016/j.tree.2010.05.006.
  • 14. Storz JF. Causes of molecular convergence and parallelism in protein evolution. Nat. Rev. Genet. 2016;17:239–250. doi: 10.1038/nrg.2016.11.
  • 15. Weinreich DM, Delaney NF, DePristo MA, Hartl DL. Darwinian evolution can follow only very few mutational paths to fitter proteins. Science. 2006;312:111–114. doi: 10.1126/science.1123539.
  • 16. Kitano H. Systems biology: a brief overview. Science. 2002;295:1662–1664. doi: 10.1126/science.1069492.
  • 17. de Vladar HP, Barton NH. The contribution of statistical physics to evolutionary biology. Trends Ecol. Evol. 2011;26:424–432. doi: 10.1016/j.tree.2011.04.002.
  • 18. Milocco L, Salazar-Ciudad I. Is evolution predictable? Quantitative genetics under complex genotype-phenotype maps. Evolution. 2020;74:230–244. doi: 10.1111/evo.13907.
  • 19. Hendry, A. P. Eco-evolutionary Dynamics (Princeton University Press, USA, 2017).
  • 20. Blount, Z. D., Borland, C. Z. & Lenski, R. E. Historical contingency and the evolution of a key innovation in an experimental population of Escherichia coli. Proc. Natl Acad. Sci. 105, 7899–7906 (2008).
  • 21. Tebaldi C, Knutti R. The use of the multi-model ensemble in probabilistic climate projections. Philos. Trans. R. Soc. A Math. Phys. Eng. Sci. 2007;365:2053–2075. doi: 10.1098/rsta.2007.2076.
  • 22. Rabe-Hesketh S, Skrondal A, Pickles A. Generalized multilevel structural equation modeling. Psychometrika. 2004;69:167–190. doi: 10.1007/BF02295939.
  • 23. Haller BC, Messer PW. SLiM 3: forward genetic simulations beyond the Wright-Fisher model. Mol. Biol. Evol. 2019;36:632–637. doi: 10.1093/molbev/msy228.
  • 24. Epperson, B. K. Geographical Genetics (MPB-38) (Princeton University Press, USA, 2003).
  • 25. McElreath, R. Statistical Rethinking: A Bayesian Course with Examples in R and Stan (CRC Press, USA, 2020).
  • 26. Siepielski AM, DiBattista JD, Carlson SM. It’s about time: the temporal dynamics of phenotypic selection in the wild. Ecol. Lett. 2009;12:1261–1276. doi: 10.1111/j.1461-0248.2009.01381.x.
  • 27. Siepielski AM, et al. Precipitation drives global variation in natural selection. Science. 2017;355:959–962. doi: 10.1126/science.aag2773.
  • 28. Bergland AO, Behrman EL, O’Brien KR, Schmidt PS, Petrov DA. Genomic evidence of rapid and stable adaptive oscillations over seasonal time scales in Drosophila. PLoS Genet. 2014;10:e1004775. doi: 10.1371/journal.pgen.1004775.
  • 29. Olendorf R, et al. Frequency-dependent survival in natural guppy populations. Nature. 2006;441:633–636. doi: 10.1038/nature04646.
  • 30. Bolnick DI, Stutz WE. Frequency dependence limits divergent evolution by favouring rare immigrants over residents. Nature. 2017;546:285–288. doi: 10.1038/nature22351.
  • 31. Hori M. Frequency-dependent natural selection in the handedness of scale-eating Cichlid fish. Science. 1993;260:216–219. doi: 10.1126/science.260.5105.216.
  • 32. Nosil, P. et al. Ecology shapes epistasis in a genotype-phenotype-fitness map for stick insect colour. Nat. Ecol. Evol. doi: 10.1038/s41559-020-01305-y (2020).
