Abstract
The nearly neutral theory is a common framework for describing natural selection at the molecular level. This theory emphasizes the importance of slightly deleterious mutations by recognizing their ability to segregate and eventually become fixed through genetic drift despite purifying selection. As genetic drift is stronger in smaller than in larger populations, the nearly neutral theory predicts a correlation between population size and molecular measures of natural selection. However, this hypothesis was originally formulated under equilibrium conditions. As most natural populations are not in equilibrium, testing the relationship empirically may lead to confounded outcomes. Demographic nonequilibria, for instance following a change in population size, are common scenarios that are expected to push the selection–drift relationship off equilibrium. By explicitly modeling the effects of a change in population size on allele frequency trajectories in the Poisson random field framework, we obtain analytical solutions for the nonstationary allele frequency spectrum. This enables us to derive exact results for measures of natural selection and effective population size in a demographic nonequilibrium. The study of their time-dependent relationship reveals a substantial deviation from the equilibrium selection–drift balance after a change in population size. Moreover, we show that the deviation is sensitive to the combination of different measures. These results therefore provide relevant tools for empirical studies to choose suitable measures for investigating the selection–drift relationship in natural populations. Additionally, our new modeling approach extends existing population genetics theory and can serve as a foundation for methodological developments.
Keywords: nonequilibrium theory, nearly neutral theory, demographic nonequilibrium, theoretical population genetics, selection–drift balance
Significance
A central question in evolutionary genetics concerns the relative contribution of natural selection versus chance to evolution, but theoretical predictions and empirical observations do not always give a congruent answer. Our hypothesis for this discrepancy is that theoretical predictions usually rely on equilibrium assumptions, whereas most natural populations are not in equilibrium. To investigate this hypothesis, we formulate a mathematical framework for a demographic nonequilibrium scenario, which enables us to reconcile theory and data and can also serve as a practical guide to study design and the interpretation of empirical observations.
Introduction
Among the key driving factors of evolution are mutations, natural selection, and genetic drift. Analyzing their interplay provides valuable insight into the genetic variation within and among populations and into their ability to evolve and adapt. Population genetics theory provides a mathematical approach to describe and analyze the interaction of these population-level processes. In particular, such theory predicts that genetic drift is weaker in larger populations than in smaller populations, due to the stochastic nature of reproduction (Wright 1931; Kimura 1964). This results in a positive correlation between the efficacy of selection and population size, the selection–drift balance. As a consequence, the rate of molecular evolution is influenced by the population size, in particular in the presence of weakly selected mutations (Kimura 1964; Ohta 1973, 1976).
The nearly neutral theory of molecular evolution emphasizes the importance of weakly selected mutations on a genome-wide scale (Ohta 1973, 1976, 1992). Within this framework, the distribution of fitness effects (DFE) of new mutations is typically weighted towards purifying selection: most mutations are deleterious, a nonnegligible fraction of which are slightly deleterious, and only a small proportion of mutations are advantageous. The smaller the population size, the more (deleterious) mutations fall into the weak selection regime, potentially contributing to segregating polymorphisms and fixations due to genetic drift. These molecular signatures make it possible to investigate the predictions of the nearly neutral theory in empirical studies with the help of genomic data.
To detect evidence of selection in genome data, different approaches and methods have been developed (reviewed in Nielsen 2005; Vitti et al. 2013; Booker et al. 2017). A common feature of quantitative methods is to contrast neutral reference data with test data, such as synonymous versus nonsynonymous mutations in protein-coding sequences. Here, we can distinguish between measures of natural selection at the micro- and macroevolutionary timescale (Vitti et al. 2013). Measures at the microevolutionary timescale, which are designed to identify selective events within a species, are typically based on segregating polymorphisms and give a snapshot of the current state. A popular representative is the ratio of nonsynonymous and synonymous diversity, $\pi_N/\pi_S$ (Nei and Li 1979). Macroevolutionary measures assess lineage-specific selection over larger evolutionary timescales in a phylogenetic setting. These measures are accumulative and typically based on interspecific differences that result from fixations in one lineage after divergence from a common ancestor. A measure that belongs to this group is the ratio of the nonsynonymous and synonymous sequence divergence, $d_N/d_S$ (Goldman and Yang 1994; Muse and Gaut 1994), which represents an estimate of the ratio of nonsynonymous and synonymous fixations in the time period after species divergence (Mugal et al. 2020). While the instantaneous fixation rate ratio is frequently denoted as $\omega$, we introduce separate notation for the ratio of nonsynonymous and synonymous fixations after species divergence in order to emphasize its accumulative character (fig. 1A).
Fig. 1.
Study design and research question. Panel A: Illustration of the workflow. (i) A single change in population size at time is modeled by a step function. We visualize the macroevolutionary timescale that spans the time interval (indicated in purple), and the microevolutionary timescale that provides a snapshot at time t (indicated in orange). (ii) We model the impact of demographic history on the allele frequency dynamics in as the solution to the stochastic differential equation stated in equation (1). Different categories of allele frequency trajectories can be distinguished. Representative trajectories are highlighted in the respective color coding: mutations that segregate before (blue), mutations that arise before but continue to segregate after (blue-red), mutations that arise and segregate after but no longer at time t (red), mutations resulting in polymorphisms segregating at time t (orange). Representative trajectories that result in fixations and accumulate over are highlighted by stars. (iii) A snapshot of the population dynamics at any point in time is described by the distribution of allele frequencies at that specific point in time and is summarized in the AFS. (iv) Based on the AFS we are able to derive different measures of natural selection and genetic drift at any point in time. From this the main question arises as how these measures relate to each other in a demographic nonequilibrium? Panels B and C: Examples of an ancient (B) and more recent (C) change in population size and their impact on measures of and the fixation rate ratio . (For the color representation of this figure the reader is referred to the online version of this paper.)
The traditional approach to assessing genetic drift is to apply some version of an effective population size, $N_e$. Conceptually, $N_e$ relates a given (nonideal) population to a simpler idealized reference model, such as the ideal Wright–Fisher model, with respect to a particular property. This leads to different definitions of $N_e$, e.g. inbreeding and variance (Wright 1931; Crow and Kimura 1970) or eigenvalue effective size (Ewens 1979). In addition, life-history traits are frequently used as proxies for $N_e$ (Nikolaev et al. 2007; Lee et al. 2011; Waples et al. 2013; Figuet et al. 2016; Bolívar et al. 2019). All approaches predict $N_e$ under different circumstances, for example at certain spatial and temporal scales and under certain demographic scenarios. Often it is not evident whether the underlying assumptions of the various models are met in natural populations, or how accurate the resulting estimates of $N_e$ are when assumptions are violated. For this reason, the spatial and temporal scales of different estimates of $N_e$ have to be interpreted carefully to draw firm conclusions (Otto and Whitlock 1997; Wang et al. 2016; Nadachowska-Brzyska et al. 2021).
The nearly neutral theory predicts a negative correlation between $N_e$ and the two measures of selection, the diversity ratio $\pi_N/\pi_S$ and the ratio of nonsynonymous and synonymous fixations. However, this prediction is based on the equilibrium assumption, where the effect of genetic drift on segregating polymorphisms balances the efficacy of selection, implying a constant evolutionary rate. Yet, changes in population size, amongst other factors, generally cause a nonequilibrium that persists for a considerable amount of time and disturbs the selection–drift balance (Brandvain and Wright 2016). In a meta-analysis, Brandvain and Wright (2016) compare predictions of classical (equilibrium) theory with results from a large number of empirical studies. This analysis stresses the need to account for nonequilibrium conditions when evaluating differences in selection efficacy among species. To this end, simulation studies and mathematical models are critical tools to investigate the effects of demographic nonequilibria on different evolutionary processes. Simulation-based studies are able to generate observational insight into complex scenarios, such as fluctuating population sizes and the effect of linked selection in demographic nonequilibria (Rousselle et al. 2018; Torres et al. 2020). A strength of mathematical approaches is the ability to clearly decompose the effects of nonequilibrium conditions on the processes driving evolution. This constitutes a valuable complement to simulation studies and in turn provides the possibility to develop refined methodology; see, e.g., Evans et al. (2007), Živković and Stephan (2011), Živković et al. (2015), and Kaj and Mugal (2016).
In this study, we investigate the effect of a single change in population size on micro- and macroevolutionary measures of selection in an otherwise ideal population (fig. 1). Concentrating on this isolated aspect enables us to derive exact analytical results that are straightforward to interpret. In a pioneering work, Eyre-Walker (2002) addressed the isolated scenario of a change in population size with the help of the stationary Poisson random field framework. In this setting, the nonequilibrium is modeled only indirectly as the weighted sum of the ancestral and the new equilibrium value. This original model forms the basis for many methodological developments (Keightley and Eyre-Walker 2007; Charlesworth and Eyre-Walker 2008; Eyre-Walker and Keightley 2009; Schneider et al. 2011; Kousathanas and Keightley 2013), which have found wide application in evolutionary genetic studies. Nevertheless, as a consequence of the stationarity assumption, the effects of a demographic nonequilibrium on allele frequency trajectories are ignored (Williamson et al. 2005; Boyko et al. 2008).
Here, we explicitly model the impact of a change in population size on allele frequency trajectories. Specifically, we build on the Poisson random field approach of Kaj and Mugal (2016) and derive the nonstationary allele frequency spectrum (AFS) after a change in population size. This enables us to obtain time-dependent formulations of the two measures introduced above, the diversity ratio $\pi_N/\pi_S$ and the ratio of nonsynonymous and synonymous fixations. The study setup, connected to the mathematical framework, is illustrated in figure 1. The time-dependent formulations allow us to address the following questions. First, how does a change in population size affect these micro- and macroevolutionary measures of natural selection? Second, how does a change in population size affect the relationship between measures of natural selection and genetic drift during the nonequilibrium period? To this end, we investigate different choices of $N_e$ as measures of genetic drift. Finally, we discuss the relevance of micro- and macroevolutionary measures for empirical studies of the selection–drift relationship and outline possible applications and extensions of the model.
Results
Basic Model
Our goal is to formulate a mathematical model that describes the allele frequency evolution in a population during a time interval in which the population experiences a change in population size. Within this framework, we shall then derive an analytical description of the nonequilibrium AFS, which will enable us to study the behavior of micro- and macroevolutionary measures of natural selection in a nonequilibrium population (fig. 1A). Specifically, we consider the allele frequency evolution in a population that undergoes an instantaneous change in population size at a single point in time from constant size N to constant size , where is a positive parameter. In other words, the population size over time is a step function such that , , and , , compare figure 1 where and .
Throughout this work, we use N as reference size and apply an evolutionary timescale where one unit of time corresponds to generations. We will consider a time interval , with t corresponding to the present time, and corresponding to a point in time generations in the past. We then examine a population that undergoes an ancient change in population size at close to (fig. 1B), and a population that undergoes a more recent change at close to t (fig. 1C). For generality, we let N represent the population size of a haploid population. Under the assumption of additive fitness effects in a diploid organism, an assumption common to many methodological developments in the field (Keightley and Eyre-Walker 2007; Charlesworth and Eyre-Walker 2008; Eyre-Walker and Keightley 2009; Schneider et al. 2011; Kousathanas and Keightley 2013; Johri et al. 2020), this is equivalent to a diploid population of size . Each haploid individual is characterized by a genome sequence of L independent sites, which corresponds to the assumption of free recombination across sites. Random mutations arrive independently and uniformly over individuals on monomorphic sites with population mutation intensity per generation in the reference population. Hence, as long as , the mutation intensity per time unit is . Consequently, for , the mutation intensity is per generation and per time unit. Since a mutation arises in a single individual, its initial frequency is , i.e. or dependent on if it arises before or after the change in population size. Each mutation is assigned a population selection intensity .
We use the Wright–Fisher model with selection (Fisher 1930; Wright 1931) for two alleles segregating at one site to model reproduction and then study the population dynamics of the collection of all L independent sites. In the limit as L tends to infinity and N is large but fixed, the number of new mutations over all mono-allelic sites is approximately Poisson distributed with mean per time unit (Kaj and Mugal 2016). When taking the initial frequency of new mutations, , balances the mutational input and ensures that it does not become infinite. Under these limits, the Poisson random field approximation applies (Sawyer and Hartl 1992; Kaj and Mugal 2016). The derived allele frequencies are independent over polymorphic sites. The allele frequency in a single site starting at time s evolves as a Wright–Fisher diffusion process with selection, that is, a solution of the stochastic differential equation
| (1) |
with initial value —typically . Here, is a standard Brownian motion and whenever and for . The Brownian motion part of the equation encodes genetic drift that varies depending on the population size. Basically, equation (1) describes that the frequency of a mutant allele changes randomly but is pushed towards (fixation) or (extinction) depending on the selection coefficient. We denote such a Markov process by or simply when the initial time is . Furthermore, let and be the law and expectation of processes that start in y, have selective pressure , and evolve in a population of size . Let be the time to fixation of the derived allele. Hence, the fixation probability for a derived allele with frequency y and selective pressure is given by (Kimura 1962)
| (2) |
for a fixed . As , the scaled fixation rate emerges as
| (3) |
This means, in an equilibrium population of size , the instantaneous fixation rate ratio of a class of selected (with selective pressure ) and neutral mutations in the limit equals .
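The fixation probability and the scaled fixation rate discussed above can be sketched numerically. The following is a minimal illustration, not the paper's own parameterization: it assumes the textbook Kimura form with a single scaled selection coefficient `S` (hypothetical name; here taken as S = 2Ns for a haploid population of size N, with S < 0 for deleterious mutations). The function names are likewise chosen only for this example.

```python
import math


def fixation_probability(y, S):
    """Kimura-type fixation probability of a derived allele at frequency y.

    S is a scaled selection coefficient (assumed convention: S = 2*N*s for a
    haploid Wright-Fisher population of size N); S < 0 is deleterious.
    """
    if abs(S) < 1e-12:  # neutral limit: the fixation probability equals y
        return y
    # (1 - exp(-S*y)) / (1 - exp(-S)), written with expm1 for accuracy
    return math.expm1(-S * y) / math.expm1(-S)


def fixation_rate_ratio(S):
    """Selected-to-neutral fixation rate ratio in the limit y = 1/N -> 0."""
    if abs(S) < 1e-12:
        return 1.0
    return -S / math.expm1(-S)  # equals S / (1 - exp(-S))
```

Under this assumed scaling, multiplying the population size by a factor changes `S` by the same factor, so a deleterious class of mutations has a lower fixation rate ratio in the larger population.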
Returning to the Poisson random field setting, the allele frequencies are represented by Poisson points on the collection of sites according to the Poisson distribution with intensity : once such a mutation event takes place at a certain time s, a path is initialized at frequency . We fix and represent the state of the Poisson random field, i.e. the collection of allele frequencies, at time t as a random measure on . We further focus on the allele frequency evolution for , i.e. we will ignore fixations for but start from polymorphic frequencies on at . A visualization of the setup is presented in figure 1A (ii). It is known that the aggregate of all mutations from the infinite past in the ancestral population builds up a Poisson measure in steady state (Kaj and Mugal 2016). More precisely, the relevant initial distribution of allele frequencies at for our model, that is , is a Poisson measure with intensity measure on , where
| (4) |
The initial distribution of trajectories at plus the arrival of new mutations during together preserve the Poisson distribution which is invariant as long as the population size does not change, i.e. for . To account for fixations during we also include the singular contribution at , , , which is a Poisson counting process with time-inhomogeneous intensity. Using suitable functions f, the evaluation is the sum over the random number of segregating sites present in the population at time t and keeps track of the corresponding allele frequencies . The expected value is a deterministic measure on the frequency interval , which in the limit of allows for the interpretation of allele frequency spectrum.
We discuss the formal construction of the random measure model in “Materials and Methods”. Details of the presentation and most of the technical aspects are deferred to the Supplementary Sections 1.1 and 1.2, Supplementary Material online.
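The single-site dynamics described in this section can also be explored by simulation. The sketch below is a discrete-generation, haploid Wright–Fisher simulation with a step change in population size. It is an illustration under simplifying assumptions, not the diffusion model of equation (1): the function name and the parameters `s`, `change_gen`, and `max_gen` are hypothetical, and binomial sampling is done by summing Bernoulli draws, which is only practical for small populations.

```python
import random


def simulate_trajectory(N, kappa, s, change_gen, max_gen, rng):
    """Simulate one derived-allele trajectory in a haploid Wright-Fisher model.

    The population has size N up to generation `change_gen` and
    round(kappa * N) afterwards. The mutation arises at generation 0 in a
    single individual, so it starts at frequency 1/N. `s` is the
    per-generation selection coefficient of the derived allele.
    """
    freq = 1.0 / N
    path = [freq]
    for gen in range(1, max_gen + 1):
        size = N if gen <= change_gen else max(1, round(kappa * N))
        # Selection shifts the expected frequency; drift is binomial sampling.
        p = freq * (1.0 + s) / (1.0 + s * freq)
        copies = sum(rng.random() < p for _ in range(size))
        freq = copies / size
        path.append(freq)
        if freq == 0.0 or freq == 1.0:  # loss or fixation are absorbing
            break
    return path
```

A call such as `simulate_trajectory(100, 2.0, -0.01, 50, 2000, random.Random(1))` returns the list of frequencies of one slightly deleterious mutation until it is lost, fixed, or the generation limit is reached. Repeating the call many times gives an empirical picture of the trajectory categories sketched in figure 1A (ii).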
Nonequilibrium Allele Frequency Spectrum
The AFS accounts for the collection of all derived allele frequencies across sites at a fixed point in time. More formally, the spectrum of allele frequencies y, , represents the average intensity of attained frequency values at t, for some , compare figure 1A. In our approach, we also include alleles which have reached fixation during . As a reference case we begin with the equilibrium AFS, which arises as the scaled limit of expected values for the case of a fixed size population, say ,
| (5) |
for suitable functions f satisfying sufficient conditions for these integrals to be well defined. The linear term in t represents the effect of constant rate fixations and the integral term independent of time represents the steady-state spectrum of polymorphic frequencies.
Now, considering a population undergoing a change in size at time , equation (5) applies with as long as . It is only when we attempt to extend relation (5) beyond that the change in population size begins to alter the composition of weights of allele frequencies. The collection of paths at a time point contains both ancestral trajectories of alleles which were present already at and new paths emerging from mutations taking place subsequent to the change in population size. The additional contributing terms together with those in relation (5) yield
| (6) |
in detail derived in the Supplementary Section 1, Lemma 3(i) and Theorem 1, Supplementary Material online. While in this representation we do not see directly a spectrum of frequencies y with explicit weights affecting , we do see indirectly the time-dependence effect due to the nonequilibrium framework.
The mathematical framework presented in this work permits retrieving time-dependent expressions for relevant summary statistics by application of selected functions f to the nonequilibrium AFS in equation (6). In this sense we consider nucleotide diversity associated with the function and fixation rate associated with , as well as their respective ratios for nonsynonymous and synonymous mutations (see “Materials and Methods” for details).
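To make the idea of applying a function f to the spectrum concrete, the sketch below evaluates nucleotide diversity for the equilibrium case only. It rests on several assumptions that are not taken from this paper: the stationary density is the standard Poisson random field form of Sawyer and Hartl (1992) with a scaled selection coefficient `S`, the diversity function is f(y) = 2y(1 − y), and `theta` is a generic mutation intensity. The nonequilibrium spectrum of equation (6) is not reproduced here.

```python
import math


def stationary_density(y, S, theta=1.0):
    """Assumed stationary density of derived allele frequencies at y in (0, 1)."""
    if abs(S) < 1e-12:
        return theta / y  # neutral spectrum
    return theta * math.expm1(-S * (1.0 - y)) / (math.expm1(-S) * y * (1.0 - y))


def diversity(S, theta=1.0, steps=20000):
    """Integrate f(y) = 2 y (1 - y) against the spectrum (midpoint rule)."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        y = (i + 0.5) * h
        total += 2.0 * y * (1.0 - y) * stationary_density(y, S, theta)
    return total * h


def diversity_closed_form(S, theta=1.0):
    """Closed form of the same integral, valid for S != 0."""
    return 2.0 * theta * (1.0 / (-math.expm1(-S)) - 1.0 / S)
```

Under these assumptions, the equilibrium diversity ratio for a single class of selected mutations is `diversity(S) / diversity(0.0)`; averaging the numerator over a DFE would give the corresponding ratio for a distribution of effects.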
Measures of Natural Selection in a Nonequilibrium Population
We study the behavior of two molecular measures of natural selection as functions of time after a change in population size, that is the ratio of nonsynonymous and synonymous genetic diversity, , and the ratio of nonsynonymous and synonymous fixations, , over a time interval . In this setting, corresponds to the present time and to a time generations in the past. A change in population size occurs at , i.e. generations in the past, which we refer to as ancient change (fig. 1B). Since we are particularly interested in the prediction of the nearly neutral theory in nonequilibrium, we consider a DFE restricted to deleterious mutations ranging from strongly to slightly deleterious fitness effects approximated by a -distribution. Figure 2A shows the behavior of for different extents and directions of change in population size, . The time it takes to reach the new equilibrium depends on both, the direction and extent of change in population size: the new equilibrium is reached more quickly in case of a population decline (, the larger the reduction the faster). For an increase in population size (), it takes longer to attain the new equilibrium. Also, given a DFE restricted to deleterious mutations, is negatively correlated with population size as predicted by the nearly neutral theory of molecular evolution. The behavior of after a change in population size is depicted in figure 2B (and fig. 1B) and resembles the behavior of . The ratio decreases for , which means that fewer deleterious nonsynonymous mutations reach fixation—in accordance with observations about selection acting more efficiently in larger populations. However, is an accumulative measure over the time interval , while reflects a snapshot of the strength of selection at time t. As a consequence, it takes longer for to reach its new equilibrium than it does for .
Fig. 2.
Measures of selection for different values of as functions of time. Panel A: the ratio of nonsynonymous and synonymous diversity . Panel B: the fixation rate ratio . For comparison, colored, dashed curves represent the weighted fixation rate ratio . Vertical, dotted lines indicate time . Parameters: , and and for the DFE.
Another means to capture the impact of a change in population size on is to consider the weighted sum of the ancestral and the new equilibrium value (Eyre-Walker 2002). In our notation, this reads for equilibrium instantaneous fixation rates and , respectively. To visualize the difference between and , the weighted fixation rate ratios are included as dashed lines in figure 2B. The nonequilibrium model derived in this study shows that the function reacts more slowly to the change in population size and takes longer to reach the new equilibrium value in comparison to the approach of weighting the equilibrium values. This illustrates that ignoring the period where allele frequencies are in nonequilibrium, as for example implemented in methods to estimate the DFE (Keightley and Eyre-Walker 2007; Eyre-Walker and Keightley 2009; Schneider et al. 2011; Kousathanas and Keightley 2013), leads to an underestimation of the time until reaches its equilibrium.
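The weighting approach described above can be written down in a few lines. This is a hedged sketch rather than the exact expression used in the paper: it assumes a single class of mutations with scaled selection coefficient `S` in the ancestral population, a new population size that is `kappa` times the ancestral one (so the scaled coefficient becomes `kappa * S`), a neutral fixation rate per unit time that is the same in both epochs, and weights proportional to the time spent before and after the change at `tau`. All names are chosen for illustration.

```python
import math


def fixation_rate_ratio(S):
    """Equilibrium selected-to-neutral fixation rate ratio, S / (1 - exp(-S))."""
    if abs(S) < 1e-12:
        return 1.0
    return -S / math.expm1(-S)


def weighted_fixation_rate_ratio(t, tau, S, kappa):
    """Time-weighted average of ancestral and new equilibrium ratios over [0, t]."""
    if t <= tau:
        return fixation_rate_ratio(S)  # no time spent at the new size yet
    omega_old = fixation_rate_ratio(S)          # ancestral population size
    omega_new = fixation_rate_ratio(kappa * S)  # population size after the change
    return (tau * omega_old + (t - tau) * omega_new) / t
```

Because this average switches instantly to the new equilibrium rate at `tau`, it contains no transient phase; that missing transient is what the comparison with the nonequilibrium model in figure 2B illustrates.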
Note that figure 2B shows the fixation rate ratio for a change in population size at time . Changes at other time points can lead to severely different behaviors. A change at , for example, would lead to without reflecting any influence of the ancestral population. On the other hand, if the change in population size happens more recently in time (fig. 1C), the contribution of the ancient population size becomes more pronounced (see Supplementary fig. S1, Supplementary Material online). In addition, we note that represents a population functional, that assesses fixations in the whole population or lineage. The common estimate of is , which represents a sample functional and introduces further bias for small t, but converges for (Mugal et al. 2014, 2020).
Proxies of Effective Population Size as Measures of Genetic Drift
In order to evaluate the prediction of the nearly neutral theory in nonequilibrium, we need to relate the above-introduced measures of selection to estimates of the effective population size. Since there are various ways to define $N_e$, it is essential to first discuss the differences and to assess which of the definitions are relevant to the two measures of selection in our modeling approach. The most commonly considered concepts of $N_e$ include variance and inbreeding (Wright 1931, 1940; Crow 1954; Crow and Kimura 1970), coalescent (Lynch and Conery 2003), and eigenvalue effective population size (Ewens 1969, 1979, 1982). The properties that these concepts aim to capture are, respectively, the variance in allele frequencies over time due to random genetic drift, the average inbreeding coefficient, the rate of coalescence of neutral alleles, and the leading nonunit eigenvalue of the allele frequency transition matrix.
We here focus on the pairwise synonymous nucleotide diversity (Lynch and Conery 2003; Wakeley and Sargsyan 2008; Ellegren and Galtier 2016) and the harmonic mean effective population size over (Wright 1940; Karlin 1968; Nei and Tajima 1981). The scaled pairwise synonymous diversity, , where is the mutation rate per generation and individual, is an estimate of effective population size based on genetic variation and accordingly represents a microevolutionary measure of effective population size. We note that scaled pairwise synonymous diversity is often also perceived as coalescent effective population size (Lynch and Conery 2003; Wakeley and Sargsyan 2008).
The harmonic mean effective size over is a representative of variance effective population size and defined as the average of genetic drift over the time interval with ,
with , and for t large. This means the ancestral population size N loses its influence on the further in the past the change took place. If is constant over , then . Also, in view of the genetic drift term in equation (1)—the variance term of the SDE—the parameter at time t multiplied by N can be interpreted as a snapshot of the variance effective population size at time t. The harmonic mean effective population size over the time interval for t large, on the other hand, can be considered a representative of long-term effective population size.
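The behavior of the harmonic mean described here can be checked with a short calculation. The sketch assumes the usual definition of a harmonic mean over time, that is, the inverse of the time-averaged inverse population size, applied to a step function that equals `N` before `tau` and `kappa * N` afterwards. The function and parameter names are illustrative, and time may be measured in any unit as long as `t` and `tau` use the same one.

```python
def harmonic_mean_size(t, tau, N, kappa):
    """Harmonic mean population size over [0, t] for a step change at tau."""
    if t <= 0:
        raise ValueError("t must be positive")
    if t <= tau:
        return float(N)  # the change has not happened yet
    # Time-average of the inverse population size, then invert.
    mean_inverse = (tau / N + (t - tau) / (kappa * N)) / t
    return 1.0 / mean_inverse
```

For example, `harmonic_mean_size(2.0, 1.0, 1000, 0.5)` averages the inverse sizes 1/1000 and 1/500 with equal weight and returns about 667, while for `t` much larger than `tau` the result approaches `kappa * N`.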
With the two measures of effective population size at hand, the microevolutionary measure and the macroevolutionary measure , we investigate and compare how a change in population size is reflected in each of them. For this purpose, we consider two scenarios of change in population size: an ancient change at , i.e. generations in the past from present time , (fig. 3A) and a more recent change at , i.e. generations in the past, (fig. 3B). For each scenario (solid lines) and (dashed lines) are plotted for different values of as functions of time.
Fig. 3.
The effective population size based on nucleotide variation, (solid lines), and the harmonic mean effective population size, (dashed lines), for different values of . Black, dotted lines mark the time of change in population size, . Panel A shows an ancient change in population size at , panel B a more recent change at . Parameters and . (For the color representation of this figure the reader is referred to the online version of this paper.)
For an ancient change, it seems that both proxies mirror the change in size to a large degree as they are close to the new equilibrium value. However, reaches the new equilibrium value more slowly compared to . This holds in particular for , leading to the presumption that the more a population increases, the slower the new equilibrium is reached and vice versa. For a more recent change in population size, the difference between the two estimates is much more evident (fig. 3B). The proxy responds quickly to a change in population size, as expected for a measure relevant at the microevolutionary timescale, while is rather unaffected.
The Selection–Drift Relationship After a Change in Population Size
We investigate the selection–drift relationship after a change in population size and compare it to the equilibrium behavior. For this purpose, we relate the ratio of nucleotide diversity, , and the fixation rate ratio, , to the two measures and , after an ancient (, fig. 4) and a more recent (, fig. 5) change in population size. To evaluate the nonequilibrium behavior, we indicate the expected relation of genetic drift and natural selection in equilibrium populations. For a fixed DFE following a -distribution, the log–log relationship of the measures of selection at hand and proxies of is approximately linear at equilibrium (Kimura 1979; Welch et al. 2008),
where the slope a is given by the shape parameter of the -distribution and the intercept by some constants and , respectively.
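The approximate log–log linearity at equilibrium can be verified numerically for the fixation rate ratio. The sketch below makes assumptions beyond the text: deleterious effects have scaled magnitude `S` drawn from a gamma distribution with the given `shape`, the gamma `scale` is taken to be proportional to the effective population size, and the per-class fixation rate ratio is the standard S / (exp(S) − 1). Under these assumptions the estimated slope should be close to minus the shape parameter.

```python
import math


def mean_fixation_ratio(scale, shape, s_max=60.0, steps=20000):
    """Average of S / (exp(S) - 1) over S ~ Gamma(shape, scale), truncated at s_max.

    S is the magnitude of the scaled selection coefficient of a deleterious
    mutation. The substitution u = S**shape removes the density's
    singularity at S = 0 when shape < 1.
    """
    u_max = s_max ** shape
    h = u_max / steps
    total = 0.0
    for i in range(steps):
        S = ((i + 0.5) * h) ** (1.0 / shape)
        total += math.exp(-S / scale) * S / math.expm1(S)
    norm = shape * math.gamma(shape) * scale ** shape
    return total * h / norm


def loglog_slope(scale_small, scale_large, shape):
    """Slope of log(mean fixation ratio) against log(scale) between two scales."""
    lo = mean_fixation_ratio(scale_small, shape)
    hi = mean_fixation_ratio(scale_large, shape)
    return (math.log(hi) - math.log(lo)) / (math.log(scale_large) - math.log(scale_small))
```

With `shape = 0.3`, calling `loglog_slope(1e3, 1e4, 0.3)` gives a value close to −0.3. This check covers the equilibrium case only and says nothing about the nonequilibrium deviations discussed below.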
Fig. 4.
Proxies of versus measures of selection at time after an ancient change in population size at for . Black, dashed lines show the expected relation in equilibrium populations. Panels A and C: genetic drift estimated by effective population size based on nucleotide variation, . Panels B and D: genetic drift estimated by the harmonic mean effective population size, . Parameters and in the DFE, , and .
Fig. 5.
Proxies of versus measures of selection at time after a more recent change in population size at for . Black, dashed lines show the expected relation in equilibrium populations. Panels A and C: genetic drift estimated by effective population size based on nucleotide variation, . Panels B and D: genetic drift estimated by the harmonic mean effective population size, . Parameters and in the DFE, , and .
Figure 4 visualizes the selection–drift relationship at for an ancient change in population size at , i.e. generations in the past. In addition, the selection–drift relationship for and is shown to depict how the selection–drift relationship changes over time. When using as measure of natural selection (fig. 4A and B) a slight discrepancy between the prediction of the nearly neutral theory in equilibrium and the nonequilibrium behavior exists shortly after the change in population size, i.e. at , but is not very pronounced. As the change in population size becomes more ancient, i.e. for and , the discrepancy vanishes, regardless of the choice of measure for . Using as measure of natural selection (fig. 4C and D) also leads to a difference between equilibrium and nonequilibrium relationship for . For an increase in population size, is larger than expected in equilibrium, while the reverse is true for a population decline. This results in a flatter slope of the selection–drift relationship shortly after the change in population size. As time passes, i.e. for and , the slope again approaches the equilibrium slope. Even though the linear approximation is less good during nonequilibrium, the deviations from linearity appear modest. The main difference between equilibrium and nonequilibrium is the slope of the selection–drift relationship, which in nonequilibrium, i.e. shortly after the change in population size, no longer is representative of the shape parameter of the DFE.
Since we see a clear deviation from the equilibrium prediction of the nearly neutral theory shortly after a change in population size, we also consider a more recent change at in figure 5, i.e. generations in the past. Figure 5A shows that and react quickly to the change in population size and their relationship closely follows the equilibrium behavior as both measures are affected similarly by the nonequilibrium. In contrast, a strong deviation from the equilibrium relationship is observed when relating to as measure of genetic drift in figure 5B. A similarly strong deviation from the equilibrium relation but notably in the opposite direction is obtained when using the fixation rate ratio as measure of selection and correlating it to (fig. 5C). The deviations from the equilibrium selection–drift relationship in figure 5B and C clearly illustrate that the combination of microevolutionary and macroevolutionary measures is problematic, since microevolutionary measures react faster to a change in population size than macroevolutionary measures. If two macroevolutionary measures are related to each other, and , the deviation from the equilibrium selection–drift relationship is less apparent, with both measures rather insensitive to more recent changes in population size (fig. 5D).
Overall, our analytical results clearly demonstrate that microevolutionary and macroevolutionary measures show different sensitivity to demographic events. As a consequence, the comparison of micro- and macroevolutionary measures of natural selection and genetic drift under ongoing demographic nonequilibria can lead to a biased picture of the selection–drift relationship (fig. 5B and C). Also, depending on whether or is considered, there is not only a difference in the degree of deviation; the slope of the log–log relationship also changes in different directions. When comparing to a macroevolutionary measure of N_e, the slope is larger than in equilibrium (fig. 5B), while in case of the slope is smaller than in equilibrium irrespective of what measure of N_e is chosen (fig. 5C and D).
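The different reaction speeds of a diversity-based and a long-term measure of genetic drift can be illustrated with a minimal Python sketch. It is not the analytical model of this study: it assumes a diploid Wright–Fisher population under the infinite-alleles model for the diversity-based measure, and a plain harmonic mean of per-generation sizes over a hypothetical time window for the long-term measure. The function names, the mutation rate `mu`, and the `window` length are illustrative assumptions.

```python
def diversity_based_ne(n_before, n_after, generations_since_change, mu=1e-8):
    """N_e inferred from neutral heterozygosity after a step change in size.

    Iterates the standard identity-by-descent recursion for a diploid
    Wright-Fisher population under the infinite-alleles model (assumption).
    """
    keep = (1.0 - mu) ** 2  # probability that neither sampled copy mutated
    # Start from the equilibrium homozygosity of the ancestral population.
    f = keep / (2 * n_before - (2 * n_before - 1) * keep)
    for _ in range(generations_since_change):
        f = keep * (1.0 / (2 * n_after) + (1.0 - 1.0 / (2 * n_after)) * f)
    h = 1.0 - f
    # Invert the equilibrium relation H = 4*N*mu / (1 + 4*N*mu).
    return h / (4.0 * mu * (1.0 - h))


def harmonic_mean_ne(n_before, n_after, generations_since_change, window):
    """Harmonic mean of per-generation sizes over the last `window` generations."""
    recent = min(generations_since_change, window)
    return window / ((window - recent) / n_before + recent / n_after)


# Hypothetical example: tenfold decline, observed at increasing times since the change.
n_before, n_after, window = 10_000, 1_000, 100_000
for t in (100, 1_000, 10_000):
    print(t,
          round(diversity_based_ne(n_before, n_after, t)),
          round(harmonic_mean_ne(n_before, n_after, t, window)))
```

Under these assumed parameters, the diversity-based value moves toward the new population size faster than the harmonic mean taken over the long window, which is the qualitative pattern described above. The outcome depends on the chosen window and on the direction of the size change.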
Discussion
The key question of this study is how a change in population size affects the selection–drift balance. Our analytical results illustrate that, in the absence of advantageous mutations, the negative correlation between molecular measures of selection and genetic drift holds even during nonequilibrium periods. However, the strength of the relationship is clearly altered during nonequilibrium periods and depends on which measures of selection and N_e are chosen for comparison. As a consequence, the slope of the log–log selection–drift relationship is no longer given by the shape parameter of the DFE.
Implications for Empirical Evolutionary Genetics Studies
Our mathematical framework provides a guide to investigate the selection–drift relationship in a demographic nonequilibrium. Figures 4 and 5 suggest that it is advisable to correlate microevolutionary measures of N_e with microevolutionary measures of selection, and macroevolutionary measures of N_e with macroevolutionary measures of selection. These combinations ensure that nonequilibrium periods influence both measures, of selection and of N_e, to a similar extent, such that the slope of the log–log selection–drift relationship approximately reflects the shape parameter of the underlying DFE. Alternatively, our mathematical framework could also form the basis for methodological developments that directly account for the demographic nonequilibrium and thereby enable the combination of micro- and macroevolutionary measures. In addition, we can conclude that the observed selection–drift relationship based on common measures of selection and N_e is particularly sensitive to the choice of measures for a more recent change in population size, but much less so for an ancient change, since for an ancient change both micro- and macroevolutionary measures have had sufficient time to equilibrate (fig. 4).
In empirical studies, the harmonic mean effective population size, , is rarely used as a proxy of long-term N_e. Instead, life-history traits, such as body mass, propagule size, or longevity, find wide application for investigating the selection–drift relationship (Nikolaev et al. 2007; Popadin et al. 2007; Lartillot and Poujol 2010; Nabholz et al. 2013; Romiguier et al. 2014; Chen et al. 2017; Bolívar et al. 2019; Kutschera et al. 2020), since they are accessible for a wide range of species. The observed relationship between life-history traits and macroevolutionary measures of selection is frequently in line with the nearly neutral prediction of a negative correlation between measures of selection and N_e (Nikolaev et al. 2007; Popadin et al. 2007; Lartillot and Poujol 2010; Nabholz et al. 2013; Romiguier et al. 2014; Bolívar et al. 2019). Studies that correlate life-history traits with microevolutionary measures of selection also obtain results consistent with this prediction (Brandvain et al. 2013; Slotte et al. 2013; Burgarella et al. 2015; Chen et al. 2017; Kutschera et al. 2020), in particular for the case where population size has been relatively stable over time. However, evaluation of the slope is complicated by the abstract nature of . Moreover, as predicted by our study, if a population has undergone a more recent change in size, the observed relationship between life-history traits and microevolutionary measures of selection can be skewed in empirical studies (Deinum et al. 2015; James and Eyre-Walker 2020), as the two measures show different sensitivity to a change in population size.
To capture changes in selection pressure following more recent population size fluctuations, our analytical results suggest instead applying a combination of microevolutionary measures of selection and genetic drift. An example of such an application is the comparison of island versus mainland populations where island colonization happened in the more recent past (James et al. 2016; Leroy et al. 2021). As microevolutionary measures of selection, not only but also polymorphism-based estimates of the DFE might be considered (Welch et al. 2008; Arunkumar et al. 2014; Chen et al. 2017). The choice of for short branches, i.e. small t, as an alternative microevolutionary measure should, on the other hand, be avoided for two reasons. First, estimation of has been shown to be significantly biased by polymorphisms if applied to short branches (Mugal et al. 2014, 2020), and therefore reflects the ongoing selection pressure in a population only poorly. Second, the number of nonsynonymous and synonymous fixations is strongly influenced by the ancestral population size during a representative period of time after the change in population size (fig. 4C and D). These two reasons could, for example, explain the rather weak selection–drift relationship in the comparison of island and mainland species in some earlier studies that predate the re-sequencing era (Woolfit and Bromham 2005; Wright et al. 2009), and it could be interesting to reassess signatures of selection with the help of polymorphism data.
Moreover, our analytical results entail important implications for the estimation of the rate of adaptive evolution, . Many methods that estimate , such as the DFE-alpha method and its derivatives (Keightley and Eyre-Walker 2007; Charlesworth and Eyre-Walker 2008; Eyre-Walker and Keightley 2009; Schneider et al. 2011; Kousathanas and Keightley 2013), are designed to contrast polymorphism-based and divergence-based data, i.e. they combine micro- and macroevolutionary measures. However, the different sensitivity in and after a rather recent change in population size causes a smaller value of than of for a nonnegligible time period in case of a decline in population size (figs. 2 and 5). In the case of an increase in population size, exaggerates for a substantial amount of time, which could wrongly be attributed to the presence of positive selection. Applying the DFE-alpha method for the estimation of to populations in nonequilibrium conditions can consequently lead to confounded estimates: negative estimates of can be obtained in case of a more recent decline (Gossmann et al. 2010; Good et al. 2013; Deinum et al. 2015) or an inflated for a rather recent growth (Tsagkogeorga et al. 2012; Lin et al. 2018; Rousselle et al. 2018). Even though the more recent versions of these methods account for a change in population size, they neither directly account for the nonequilibrium period nor address the discrepancies that arise as a result of the different timescales evaluated. Similarly, a recent method by Brevet and Lartillot (2021) also combines micro- and macroevolutionary measures of selection to estimate without accounting for the possibility that different could act on different timescales. Again, this can result in biased estimates.
Nevertheless, it could still be of interest to investigate the selection–drift relationship of both micro- and macroevolutionary measures of N_e and natural selection. In fact, valuable information can be gained by comparison. If the observed relationships behave differently, this could be indicative of an ongoing nonequilibrium condition (fig. 5). Obviously, data availability is an important prerequisite for such an analysis. Microevolutionary measures of genetic drift and natural selection can be directly computed based on population re-sequencing data as these measures rely on intra-species genomic variation. Also, a macroevolutionary measure of N_e can be assessed based on intra-species genomic variation (Leroy et al. 2021) with the help of methods based on the sequentially Markovian coalescent (SMC) (McKenna et al. 2010; Li and Durbin 2011; Schiffels and Durbin 2014). However, assessing macroevolutionary measures of selection can be complicated by the lack of a distantly related reference species (or lack of available genomic data thereof) (e.g. Muyle et al. 2020), which is often unavoidable in empirical studies.
Limitations and Possible Extensions of the Model
We built our model of a nonequilibrium scenario on several simplifying assumptions, one of which is the focus on a single change in population size. The advantage of such a narrow focus is that interpretations are more straightforward. On the other hand, the framework we present here is only directly comparable to a limited number of empirical scenarios, such as (but not limited to) reductions in effective population size due to isolation of island and mainland populations (Woolfit and Bromham 2005; Wang et al. 2014; Kutschera et al. 2020; Leroy et al. 2021). In addition, it might also be of interest to study periodic changes in population size, as for example done by Rousselle et al. (2018) using simulations, or stochastically varying population size, as for example in Sjödin et al. (2004). Our analytical modeling approach, especially the derivation of a nonequilibrium AFS in the Poisson random field framework, may be extended to such situations. Periodic changes in population size would entail a sequence , for , of time points of period T and a sequence of size parameters making up a new step function . For , the more general nonequilibrium AFS extending equation (6) is
In fact, this AFS is not restricted to periodically changing environments, and hence may be used to develop the further case of allowing a prescribed, continuously varying, deterministic, scaled population size, such that . To include the perspective of effective population size itself evolving as a stochastic process, consider the case where the population size is switching between the two states N and according to a continuous time Markov chain with given transition rates. A simulation study of the discrete time version of this model in the context of coalescent effective population size is carried out in Sjödin et al. (2004), and potential implications are discussed in terms of fast, intermediate, or slow fluctuations. Quite similar considerations might be relevant in the situation at hand.
Apart from a change in population size, there are other mechanisms that can cause a demographic nonequilibrium and affect the selection–drift balance. Examples of such mechanisms are population structure or migration. Estimates of gene flow under nonequilibrium conditions exemplify that migration can impact inference from genomic data (e.g. Austin et al. 2004; Pinho et al. 2008). It could therefore be interesting to extend the description of the nonequilibrium AFS and incorporate migration to investigate its effect on the selection–drift relationship.
Besides demography, linkage among sites also influences allele frequency trajectories. Interference of allele frequency trajectories among two selected sites results in a reduced efficacy of selection, a phenomenon known as the Hill–Robertson effect (Hill and Robertson 1966). In addition, the effect of selection at linked sites affects allele frequency trajectories at neutral sites, which implies that neutral diversity is affected indirectly by selection (Maynard Smith and Haigh 1974; Charlesworth et al. 1993; Campos et al. 2014; Hollister et al. 2014). The phenomenon of selection at linked sites has recently received much attention. Specifically, a debate on the validity of the (nearly) neutral theory in the light of linked selection effects has arisen (Jensen et al. 2018; Kern and Hahn 2018; Chen et al. 2020). Kern and Hahn (2018) triggered the debate by arguing that, given today’s data and knowledge, the theory lacks support, as genomic variation is widely shaped by “the direct and indirect consequences of natural selection”. This prompted efforts to reconcile the original theory with new insights, which suggest that the nearly neutral theory does not lose its validity per se but rather that its initial formulation needs to be extended to account for selection at linked sites (Jensen et al. 2018; Chen et al. 2020).
Recently, Torres et al. (2018, 2020) and Johri et al. (2020) also discussed the interaction of a demographic nonequilibrium and selection at linked sites on allele frequency trajectories of neutral sites. Their simulation results provide evidence that the AFS at neutral sites provides a biased picture of the demographic history, since selection at linked sites shows a significant impact on the shape of the AFS. In addition, the authors highlight that conventional methods used to infer the DFE that do not account for linked selection, such as the widely used DFE-alpha method (Keightley and Eyre-Walker 2007; Eyre-Walker and Keightley 2009), provide biased estimates. To account for any effects of linkage among sites, Johri et al. (2020) propose an ABC approach to estimate the DFE. Essentially, the comparison between their ABC approach and conventional approaches stresses the importance of a refined null model that accounts for the interaction between demography and selection at linked sites, i.e. indirect selection.
Complementary to Johri et al. (2020), the analytical results gained in the present study stress that, besides the interaction of demography and indirect selection, the interaction between a demographic nonequilibrium and direct selection is also important. This suggests that observed differences between the DFE-alpha method (Keightley and Eyre-Walker 2007; Eyre-Walker and Keightley 2009) and the ABC method (Johri et al. 2020) should be attributed to both direct and indirect effects of a demographic nonequilibrium on allele frequency trajectories. In order to decompose the two effects within our mathematical framework, we would need to incorporate linked selection in our model. As an approximation, selection at linked sites can be modeled as variation in effective population size across the genome (Robertson 1961; Charlesworth et al. 2009), which could also be implemented in our framework. For complementary methodological developments, existing methodology that accounts for the direct effects of a demographic nonequilibrium (Williamson et al. 2005; Boyko et al. 2008) could be extended in a similar fashion.
Conclusion
The flexible framework we present in this study allows for various modifications and extensions. At the same time, restricting the model by specific simplifying assumptions enables us to derive exact analytical solutions, which form the basis for valuable conceptual understanding. We demonstrate that the selection–drift balance is substantially affected by a change in population size. Moreover, we illustrate that micro- and macroevolutionary measures of natural selection and genetic drift show a considerably different sensitivity to recent fluctuations in size. These analytical results, therefore, serve as a helpful tool for empirical studies to choose suitable measures for investigating the selection–drift relationship and to correctly interpret and compare resulting observations. Finally, the explicit modeling of a nonequilibrium condition and its effects on allele frequency trajectories extends the existing body of population genetics theory and constitutes a valuable foundation to refine methodology.
Materials and Methods
The Poisson Random Field Model During Nonequilibrium
To construct the random measure , briefly introduced in the “Basic Model”, we apply stochastic Poisson integrals. First, for ,
| (7) |
for , specified in the Supplementary Section 1.1, Supplementary Material online, satisfying sufficient conditions for these integrals to be well defined. The class is the path space for the diffusion processes , consisting of functions which are right continuous and have left limits. Moreover, is a Poisson random measure on with intensity and is a Poisson random measure on with intensity measure . The first term in equation (7) represents the family of ancestral allele frequencies with initial values at given by the Poisson measure , . The second term contains additional allele frequencies due to mutations during . Similarly, at a time we have
| (8) |
Conditional on , is a Poisson random measure on with intensity . This term represents the fate of the allele frequencies extending beyond of all alleles, polymorphic or fixed, which were present at . The second term covers mutations occurring in .
To analyze the nonequilibrium AFS caused by applying population size , we derive the limiting expected value of , see Theorem 1 in Supplementary Section 1.2, Supplementary Material online. The ancestral component, mutations which occurred prior to , yields
| (9) |
compare equation (IV) in Theorem 1. Similarly, the allele frequencies originating from mutations starting at generate the nonstationary build-up AFS (Kaj and Mugal 2016, Theorem 1), that arises from a completely mono-allelic population,
| (10) |
The Ratio of Nucleotide Diversity During Nonequilibrium
We derive and investigate the ratio of nucleotide diversity, (Nei and Li 1979), in a population undergoing a change in population size according to . Nucleotide diversity measures the number of pairwise differences, which entails integrating the specific function , the probability of sampling pairwise differences at frequency y, with respect to the AFS. As a reference case we observe that during equilibrium in a population controlled by a size parameter and a fixed selection coefficient , we have by equation (5) with ,
| (11) |
More generally, by applying equation (6), we obtain the time-dependent nonsynonymous nucleotide diversity measure in nonequilibrium.
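As a worked illustration of the equilibrium reference case, the following derivation integrates the pairwise-difference probability 2y(1 − y) against the classical stationary Poisson random field density. The symbols θ (scaled mutation rate) and γ (scaled selection coefficient), and the density itself, are assumptions made here for illustration; the scaling used in equation (5) may differ by constant factors, so this is a sketch rather than a restatement of equation (11).

```latex
\begin{align*}
\pi(\gamma)
  &= \int_0^1 2y(1-y)\,
     \frac{\theta\,\bigl(1-e^{-2\gamma(1-y)}\bigr)}
          {\bigl(1-e^{-2\gamma}\bigr)\,y(1-y)}\,dy \\
  &= \frac{2\theta}{1-e^{-2\gamma}}
     \int_0^1 \bigl(1-e^{-2\gamma(1-y)}\bigr)\,dy \\
  &= 2\theta\left[\frac{1}{1-e^{-2\gamma}} - \frac{1}{2\gamma}\right],
  \qquad \lim_{\gamma\to 0}\pi(\gamma) = \theta .
\end{align*}
```

Under this assumed form, deleterious mutations (γ < 0) contribute less than θ to diversity, and for strongly deleterious mutations the contribution decays approximately like θ/|γ|.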
In order to allow for variation in selection across sites for the nonsynonymous diversity, we integrate the previous expressions over a DFE. We denote the random variable generating the values for by and assume it has a continuous density function . Because of the presumed rarity or negligibility of advantageous mutations within the nearly neutral theory, we focus on weak and strong purifying selection following Eyre-Walker et al. (2006), Loewe and Charlesworth (2006) and Galtier and Rousselle (2020). A common choice of DFE in this scenario is the negative gamma distribution. The density function is
| (12) |
with shape parameter , scale parameter , and mean . Integration of the expression in equation (11) and over this density yields an averaged diversity measure . Taken together it holds
| (13) |
see Supplementary Section 1.3, Supplementary Material online for details. The expectations in the above expression are used to indicate integration over the DFE. We observe that approaches a new equilibrium, as , since for . For the case of neutral evolution, , the result simplifies considerably and we obtain the synonymous diversity as
| (14) |
with as . The ratio of nonsynonymous and synonymous diversity, which we denote , is determined by equations (13) and (14).
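The equilibrium link between the diversity ratio and the shape parameter of the DFE can be checked numerically with a minimal Python sketch. It assumes the classical stationary Poisson random field density, for which a deleterious mutation with scaled coefficient γ = −S contributes π(S)/θ = 1/S − 2e^{−2S}/(1 − e^{−2S}), and a gamma-distributed S with shape `beta`. The function names, the quadrature scheme, and the parameter values are illustrative assumptions and do not reproduce equations (13) and (14), which describe the nonequilibrium case.

```python
import math


def diversity_factor(S):
    """pi(S)/theta for a deleterious mutation with scaled coefficient gamma = -S.

    Closed form under the assumed stationary density; equals 1 at S = 0
    and decays like 1/S for large S.
    """
    if S < 1e-4:
        return 1.0 - S / 3.0  # series expansion avoids cancellation near 0
    return 1.0 / S + 2.0 * math.exp(-2.0 * S) / math.expm1(-2.0 * S)


def mean_diversity_ratio(beta, mean_S, steps=20_000):
    """E[pi(S)]/theta for S ~ Gamma(shape=beta, mean=mean_S), midpoint rule.

    The substitution u = S**beta removes the S**(beta - 1) singularity at 0.
    """
    scale = mean_S / beta
    u_max = (50.0 * scale) ** beta  # truncate where exp(-S/scale) = exp(-50)
    du = u_max / steps
    total = 0.0
    for i in range(steps):
        S = ((i + 0.5) * du) ** (1.0 / beta)
        total += diversity_factor(S) * math.exp(-S / scale)
    return total * du / (beta * math.gamma(beta) * scale ** beta)


# A tenfold increase in population size scales the mean of S tenfold.
beta = 0.3
ratios = {m: mean_diversity_ratio(beta, m) for m in (1e3, 1e4)}
slope = (math.log(ratios[1e4]) - math.log(ratios[1e3])) / math.log(10.0)
print(round(slope, 3))
```

For strong mean selection, the printed log–log slope should lie close to −beta, which is the equilibrium expectation against which the nonequilibrium deviations are measured.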
The Ratio of Nonsynonymous and Synonymous Fixations
We consider the number of nonsynonymous and synonymous fixations (Goldman and Yang 1994; Muse and Gaut 1994) in the population during the finite time interval with a change in population size as before given by . To account for fixations in the random field setting, we wish to count all Poisson points such that . In other words, we evaluate the indicator function at the nonequilibrium AFS, equation (6). Hence, the number of fixations in the population during with a change in population size given by is . The decomposition of into the ancestral contribution in equation (9) and the build-up in equation (10) allows for matching the different categories of fixations in figure 1A with the corresponding analytic representation: the first term in equation (9) reflects fixations (blue paths) appearing during , whereas the second part corresponds to fixations (blue-red paths) during for which the mutation happened before . The build-up component in equation (10) accounts for fixations (red paths) during for which the mutation occurred after .
To obtain an explicit representation of , we note that and that the expectation operator applied to can be rewritten in terms of the fixation time distribution,
| (15) |
Thus,
compare Supplementary Section 1.4, Supplementary Material online for technical details. Fixations that originate from nonsynonymous mutations are averaged over the DFE in equation (12); for synonymous fixations is set to zero. Finally, the ratio of nonsynonymous and synonymous fixations after a change in population size is defined as
The nonequilibrium quantity is consistent with the equilibrium, instantaneous fixation rate ratio stated in equation (3), since for and for .
Stochastic Simulations
To perform stochastic simulations of paths in the Julia programming language (Bezanson et al. 2017), we apply the discrete Wright–Fisher model with selection to a population of size N. It suffices to simulate paths for the reference population, since a polymorphism in a population of size evolves as in the reference population but with time scaled by . Paths are simulated over a maximum of n generations using binomial Wright–Fisher sampling with selection. The selection coefficient in the discrete setting is obtained from the relation .
For the distribution of the time to fixation, , we generate paths for each tuple . If the derived allele does not become fixed, the time to fixation is set to infinity. Otherwise, the fixation time is set to the generation in which fixation occurred. Finally, the distribution function of the time to fixation on the evolutionary timescale is obtained by scaling generations with N. For the expected value in the nonequilibrium AFS we simulate and average over paths for each triplet .
Parameters used are , , or equivalently , (if ) and (if ), respectively, .
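The simulation procedure described above can be sketched as follows. This is an illustrative Python version, not the Julia code used in the study. It assumes a haploid population of `pop_size` individuals and genic selection in which the derived allele has relative fitness 1 + s; the exact mapping between the scaled and the discrete selection coefficient used by the authors is not reproduced here. All function names and parameter values are hypothetical.

```python
import math
import random


def binomial(rng, n, p):
    """Draw from Binomial(n, p) by summing n Bernoulli trials (fine for small n)."""
    return sum(1 for _ in range(n) if rng.random() < p)


def time_to_fixation(rng, pop_size, s, start_count=1, max_generations=10_000):
    """Simulate one Wright-Fisher path with genic selection.

    Returns the generation at which the derived allele fixes, or math.inf
    if it is lost or still segregating after max_generations.
    """
    count = start_count
    if count == pop_size:
        return 0
    if count == 0:
        return math.inf
    for generation in range(1, max_generations + 1):
        freq = count / pop_size
        # Assumed selection step: derived allele has relative fitness 1 + s.
        freq_sel = freq * (1.0 + s) / (1.0 + s * freq)
        count = binomial(rng, pop_size, freq_sel)
        if count == pop_size:
            return generation
        if count == 0:
            return math.inf
    return math.inf


def fixation_time_distribution(times, pop_size, t_scaled):
    """Empirical P(T_fix <= t_scaled), time in units of pop_size generations.

    t_scaled must be finite, since unfixed paths are coded as math.inf.
    """
    return sum(1 for t in times if t / pop_size <= t_scaled) / len(times)


rng = random.Random(1)  # fixed seed for reproducibility
N = 50
times = [time_to_fixation(rng, N, s=-0.01) for _ in range(2000)]
fixed = [t for t in times if math.isfinite(t)]
print(len(fixed) / len(times))                    # fraction of paths that fixed
print(fixation_time_distribution(times, N, 4.0))  # P(T_fix <= 4N generations)
```

Coding lost or unresolved paths as infinity keeps them in the denominator, so the empirical distribution function saturates at the fixation probability rather than at one.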
Supplementary Material
Supplementary data are available at Genome Biology and Evolution online.
Acknowledgments
The authors thank Martin Lascoux for valuable discussions, David Widmann for generous advice on implementing simulations in the Julia programming language and Madeline Chase for feedback on an earlier version of the manuscript. The authors also thank two anonymous reviewers for their constructive comments that helped to improve the manuscript. C.F.M. is funded by grants to Hans Ellegren from the Swedish Research Council (2013/08271) and Knut and Alice Wallenberg Foundation.
Data Availability
Code used to implement the stochastic simulations can be found on GitLab.
Literature Cited
- Arunkumar R, Ness RW, Wright SI, Barrett SCH. 2014. The evolution of selfing is accompanied by reduced efficacy of selection and purging of deleterious mutations. Genetics 199(3):817–829.
- Austin JD, Lougheed SC, Boag PT. 2004. Controlling for the effects of history and nonequilibrium conditions in gene flow estimates in northern bullfrog (Rana catesbeiana) populations. Genetics 168(3):1491–1506.
- Bezanson J, Edelman A, Karpinski S, Shah VB. 2017. Julia: a fresh approach to numerical computing. SIAM Rev. 59(1):65–98.
- Bolívar P, Guéguen L, Duret L, Ellegren H, Mugal CF. 2019. GC-biased gene conversion conceals the prediction of the nearly neutral theory in avian genomes. Genome Biol. 20(1):1–13.
- Booker TR, Jackson BC, Keightley PD. 2017. Detecting positive selection in the genome. BMC Biol. 15(1):98.
- Boyko AR, et al. 2008. Assessing the evolutionary impact of amino acid mutations in the human genome. PLoS Genet. 4(5):e1000083.
- Brandvain Y, Slotte T, Hazzouri KM, Wright SI, Coop G. 2013. Genomic identification of founding haplotypes reveals the history of the selfing species Capsella rubella. PLoS Genet. 9(9):e1003754.
- Brandvain Y, Wright SI. 2016. The limits of natural selection in a nonequilibrium world. Trends Genet. 32(4):201–210.
- Brevet M, Lartillot N. 2021. Reconstructing the history of variation in effective population size along phylogenies. Genome Biol Evol. 13(8):evab150.
- Burgarella C, et al. 2015. Molecular evolution of freshwater snails with contrasting mating systems. Mol Biol Evol. 32(9):2403–2416.
- Campos JL, Halligan DL, Haddrill PR, Charlesworth B. 2014. The relation between recombination rate and patterns of molecular evolution and variation in Drosophila melanogaster. Mol Biol Evol. 31(4):1010–1028.
- Charlesworth B, Betancourt AJ, Kaiser VB, Gordo I. 2009. Genetic recombination and molecular evolution. Cold Spring Harb Symp Quant Biol. 74:177–186.
- Charlesworth J, Eyre-Walker A. 2008. The McDonald-Kreitman test and slightly deleterious mutations. Mol Biol Evol. 25(6):1007–1015.
- Charlesworth B, Morgan MT, Charlesworth D. 1993. The effect of deleterious mutations on neutral molecular variation. Genetics 134(4):1289–1303.
- Chen J, Glémin S, Lascoux M. 2017. Genetic diversity and the efficacy of purifying selection across plant and animal species. Mol Biol Evol. 34(6):1417–1428.
- Chen J, Glémin S, Lascoux M. 2020. From drift to draft: how much do beneficial mutations actually contribute to predictions of Ohta’s slightly deleterious model of molecular evolution? Genetics 214(4):1005–1018.
- Crow JF. 1954. Breeding structure of populations. II. Effective population number. In: Statistics and mathematics in biology. Ames, Iowa: Iowa State College Press. p. 543–556.
- Crow JF, Kimura M. 1970. An introduction to population genetics theory. New York: Harper and Row.
- Deinum EE, et al. 2015. Recent evolution in Rattus norvegicus is shaped by declining effective population size. Mol Biol Evol. 32(10):2547–2558.
- Ellegren H, Galtier N. 2016. Determinants of genetic diversity. Nat Rev Genet. 17(7):422–433.
- Evans SN, Shvets Y, Slatkin M. 2007. Non-equilibrium theory of the allele frequency spectrum. Theor Popul Biol. 71(1):109–119.
- Ewens WJ. 1969. Population genetics. London: Methuen.
- Ewens WJ. 1979. Mathematical population genetics. Berlin: Springer.
- Ewens W. 1982. On the concept of the effective population size. Theor Popul Biol. 21(3):373–378.
- Eyre-Walker A. 2002. Changing effective population size and the McDonald-Kreitman test. Genetics 162(4):2017–2024.
- Eyre-Walker A, Keightley PD. 2009. Estimating the rate of adaptive molecular evolution in the presence of slightly deleterious mutations and population size change. Mol Biol Evol. 26(9):2097–2108.
- Eyre-Walker A, Woolfit M, Phelps T. 2006. The distribution of fitness effects of new deleterious amino acid mutations in humans. Genetics 173(2):891–900.
- Figuet E, et al. 2016. Life history traits, protein evolution, and the nearly neutral theory in amniotes. Mol Biol Evol. 33(6):1517–1527.
- Fisher RA. 1930. The genetical theory of natural selection. Oxford: Clarendon Press.
- Galtier N, Rousselle M. 2020. How much does Ne vary among species? Genetics 216(2):559–572.
- Goldman N, Yang Z. 1994. A codon-based model of nucleotide substitution for protein-coding DNA sequences. Mol Biol Evol. 11(5):725–736.
- Good JM, et al. 2013. Comparative population genomics of the ejaculate in humans and the great apes. Mol Biol Evol. 30(4):964–976.
- Gossmann TI, et al. 2010. Genome wide analyses reveal little evidence for adaptive evolution in many plant species. Mol Biol Evol. 27(8):1822–1832.
- Hill WG, Robertson A. 1966. The effect of linkage on limits to artificial selection. Genet Res. 8(3):269–294.
- Hollister JD, et al. 2014. Recurrent loss of sex is associated with accumulation of deleterious mutations in Oenothera. Mol Biol Evol. 32(4):896–905.
- James J, Eyre-Walker A. 2020. Mitochondrial DNA sequence diversity in mammals: a correlation between the effective and census population sizes. Genome Biol Evol. 12(12):2441–2449.
- James JE, Lanfear R, Eyre-Walker A. 2016. Molecular evolutionary consequences of island colonization. Genome Biol Evol. 8(6):1876–1888.
- Jensen JD, et al. 2018. The importance of the Neutral Theory in 1968 and 50 years on: A response to Kern and Hahn 2018. Evolution 73(1):111–114.
- Johri P, Charlesworth B, Jensen JD. 2020. Toward an evolutionarily appropriate null model: jointly inferring demography and purifying selection. Genetics 215(1):173–192.
- Kaj I, Mugal CF. 2016. The non-equilibrium allele frequency spectrum in a Poisson random field framework. Theor Popul Biol. 111:51–64.
- Karlin S. 1968. Rates of approach to homozygosity for finite stochastic models with variable population size. Am Nat. 102(927):443–455.
- Keightley PD, Eyre-Walker A. 2007. Joint inference of the distribution of fitness effects of deleterious mutations and population demography based on nucleotide polymorphism frequencies. Genetics 177(4):2251–2261.
- Kern AD, Hahn MW. 2018. The neutral theory in light of natural selection. Mol Biol Evol. 35(6):1366–1371. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kimura M. 1962. On the probability of fixation of mutant genes in a population. Genetics 47(6):713–719. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kimura M. 1964. Diffusion models in population genetics. J Appl Probab. 1(2):177–232. [Google Scholar]
- Kimura M. 1979. Model of effectively neutral mutations in which selective constraint is incorporated. Proc Natl Acad Sci U S A. 76(7):3440–3444. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kousathanas A, Keightley PD. 2013. A comparison of models to infer the distribution of fitness effects of new mutations. Genetics 193(4):1197–1208. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kutschera VE, et al. 2020. Purifying selection in corvids is less efficient on islands. Mol Biol Evol. 37(2):469–474. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lartillot N, Poujol R. 2010. A phylogenetic model for investigating correlated evolution of substitution rates and continuous phenotypic characters. Mol Biol Evol. 28(1):729–744. [DOI] [PubMed] [Google Scholar]
- Lee AM, Engen S, Sæther B-E. 2011. The influence of persistent individual differences and age at maturity on effective population size. Proc Biol Sci. 278(1722):3303–3312. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Leroy T, et al. 2021. Island songbirds as windows into evolution in small populations. Curr Biol. 31(6):1303–1310.e4. [DOI] [PubMed] [Google Scholar]
- Li H, Durbin R. 2011. Inference of human population history from individual whole-genome sequences. Nature 475(7357):493–496. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lin Y-C, et al. 2018. Functional and evolutionary genomic inferences in Populus through genome and population sequencing of American and European aspen. Proc Natl Acad Sci U S A. 115(46):E10970–E10978. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Loewe L, Charlesworth B. 2006. Inferring the distribution of mutational effects on fitness in Drosophila. Biol Lett. 2(3):426–430. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lynch M, Conery JS. 2003. The origins of genome complexity. Science 302(5649):1401–1404. [DOI] [PubMed] [Google Scholar]
- Maynard Smith J, Haigh J. 1974. The hitch-hiking effect of a favourable gene. Genet Res. 23(1):23–35.
- McKenna A, et al. 2010. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20(9):1297–1303.
- Mugal CF, Kutschera VE, Botero-Castro F, Wolf JB, Kaj I. 2020. Polymorphism data assist estimation of the nonsynonymous over synonymous fixation rate ratio for closely related species. Mol Biol Evol. 37(1):260–279.
- Mugal CF, Wolf JB, Kaj I. 2014. Why time matters: codon evolution and the temporal dynamics of dN/dS. Mol Biol Evol. 31(1):212–231.
- Muse SV, Gaut BS. 1994. A likelihood approach for comparing synonymous and nonsynonymous nucleotide substitution rates, with application to the chloroplast genome. Mol Biol Evol. 11(5):715–724.
- Muyle A, et al. 2020. Dioecy is associated with high genetic diversity and adaptation rates in the plant genus Silene. Mol Biol Evol. 38(3):805–818.
- Nabholz B, Uwimana N, Lartillot N. 2013. Reconstructing the phylogenetic history of long-term effective population size and life-history traits using patterns of amino acid replacement in mitochondrial genomes of mammals and birds. Genome Biol Evol. 5(7):1273–1290.
- Nadachowska-Brzyska K, Konczal M, Babik W. 2021. Navigating the temporal continuum of effective population size. Methods Ecol Evol. 13(1):22–41.
- Nei M, Li W-H. 1979. Mathematical model for studying genetic variation in terms of restriction endonucleases. Proc Natl Acad Sci U S A. 76(10):5269–5273.
- Nei M, Tajima F. 1981. Genetic drift and estimation of effective population size. Genetics 98(3):625–640.
- Nielsen R. 2005. Molecular signatures of natural selection. Annu Rev Genet. 39(1):197–218.
- Nikolaev SI, et al. 2007. Life-history traits drive the evolutionary rates of mammalian coding and noncoding genomic elements. Proc Natl Acad Sci U S A. 104(51):20443–20448.
- Ohta T. 1973. Slightly deleterious mutant substitutions in evolution. Nature 246(5428):96–98.
- Ohta T. 1976. Role of very slightly deleterious mutations in molecular evolution and polymorphism. Theor Popul Biol. 10(3):254–275.
- Ohta T. 1992. The Nearly Neutral Theory of Molecular Evolution. Annu Rev Ecol Syst. 23(1):263–286.
- Otto SP, Whitlock MC. 1997. The probability of fixation in populations of changing size. Genetics 146(2):723–733.
- Pinho C, Harris DJ, Ferrand N. 2008. Non-equilibrium estimates of gene flow inferred from nuclear genealogies suggest that Iberian and North African wall lizards (Podarcis spp.) are an assemblage of incipient species. BMC Evol Biol. 8:63.
- Popadin K, Polishchuk LV, Mamirova L, Knorre D, Gunbin K. 2007. Accumulation of slightly deleterious mutations in mitochondrial protein-coding genes of large versus small mammals. Proc Natl Acad Sci U S A. 104(33):13390–13395.
- Robertson A. 1961. Inbreeding in artificial selection programmes. Genet Res. 2(2):189–194.
- Romiguier J, et al. 2014. Comparative population genomics in animals uncovers the determinants of genetic diversity. Nature 515(7526):261–263.
- Rousselle M, Mollion M, Nabholz B, Bataillon T, Galtier N. 2018. Overestimation of the adaptive substitution rate in fluctuating populations. Biol Lett. 14(5):20180055.
- Sawyer SA, Hartl DL. 1992. Population genetics of polymorphism and divergence. Genetics 132(4):1161–1176.
- Schiffels S, Durbin R. 2014. Inferring human population size and separation history from multiple genome sequences. Nat Genet. 46(8):919–925.
- Schneider A, Charlesworth B, Eyre-Walker A, Keightley PD. 2011. A method for inferring the rate of occurrence and fitness effects of advantageous mutations. Genetics 189(4):1427–1437.
- Sjödin P, Kaj I, Krone S, Lascoux M, Nordborg M. 2004. On the meaning and existence of an effective population size. Genetics 169(2):1061–1070.
- Slotte T, et al. 2013. The Capsella rubella genome and the genomic consequences of rapid mating system evolution. Nat Genet. 45(7):831–835.
- Torres R, Stetter MG, Hernandez RD, Ross-Ibarra J. 2020. The temporal dynamics of background selection in nonequilibrium populations. Genetics 214(4):1019–1030.
- Torres R, Szpiech ZA, Hernandez RD. 2018. Human demographic history has amplified the effects of background selection across the genome. PLoS Genet. 14(6):e1007387.
- Tsagkogeorga G, Cahais V, Galtier N. 2012. The population genomics of a fast evolver: High levels of diversity, functional constraint, and molecular adaptation in the tunicate Ciona intestinalis. Genome Biol Evol. 4(8):852–861.
- Vitti JJ, Grossman SR, Sabeti PC. 2013. Detecting natural selection in genomic data. Annu Rev Genet. 47:97–120.
- Wakeley J, Sargsyan O. 2008. Extensions of the coalescent effective population size. Genetics 181(1):341–345.
- Wang S, et al. 2014. Population size and time since island isolation determine genetic diversity loss in insular frog populations. Mol Ecol. 23(3):637–648.
- Wang J, Santiago E, Caballero A. 2016. Prediction and estimation of effective population size. Heredity 117:193–206.
- Waples RS, Luikart G, Faulkner JR, Tallmon DA. 2013. Simple life-history traits explain key effective population size ratios across diverse taxa. Proc Biol Sci. 280(1768):20131339.
- Welch JJ, Eyre-Walker A, Waxman D. 2008. Divergence and polymorphism under the nearly neutral theory of molecular evolution. J Mol Evol. 67(4):418–426.
- Williamson SH, et al. 2005. Simultaneous inference of selection and population growth from patterns of variation in the human genome. Proc Natl Acad Sci U S A. 102(22):7882–7887.
- Woolfit M, Bromham L. 2005. Population size and molecular evolution on islands. Proc Biol Sci. 272(1578):2277–2282.
- Wright S. 1931. Evolution in Mendelian populations. Genetics 16(2):97–159.
- Wright S. 1940. Breeding structure of populations in relation to speciation. Am Nat. 74(752):232–248.
- Wright SD, Gillman LN, Ross HA, Keeling DJ. 2009. Slower tempo of microevolution in island birds: implications for conservation biology. Evolution 63(9):2275–2287.
- Živković D, Steinrücken M, Song YS, Stephan W. 2015. Transition densities and sample frequency spectra of diffusion processes with selection and variable population size. Genetics 200(2):601–617.
- Živković D, Stephan W. 2011. Analytical results on the neutral non-equilibrium allele frequency spectrum based on diffusion theory. Theor Popul Biol. 79(4):184–191.
Data Availability Statement
Code used to implement the stochastic simulations can be found on GitLab.





