Genetics. 2023 May 18;224(3):iyad091. doi: 10.1093/genetics/iyad091

The divergence of mean phenotypes under persistent directional selection

Archana Devi, Gil Speyer, Michael Lynch
Editor: D Roze
PMCID: PMC10552002  PMID: 37200616

Abstract

Numerous organismal traits, particularly at the cellular level, are likely to be under persistent directional selection across phylogenetic lineages. Unless all mutations affecting such traits have large enough effects to be efficiently selected in all species, gradients in mean phenotypes are expected to arise as a consequence of differences in the power of random genetic drift, which varies by approximately five orders of magnitude across the Tree of Life. Prior theoretical work examining the conditions under which such gradients can arise focused on the simple situation in which all genomic sites affecting the trait have identical and constant mutational effects. Here, we extend this theory to incorporate the more biologically realistic situation in which mutational effects on a trait differ among nucleotide sites. Pursuit of such modifications leads to the development of semi-analytic expressions for the ways in which selective interference arises via linkage effects in single-effects models, which then extend to more complex scenarios. The theory developed clarifies the conditions under which mutations of different selective effects mutually interfere with each other's fixation and shows how variance in effects among sites can substantially modify and extend the expected scaling relationships between mean phenotypes and effective population sizes.

Keywords: evolutionary divergence, mutation bias, phenotypic divergence, phenotypic scaling, random genetic drift, selective interference

Introduction

Much of evolutionary biology relies on comparisons of mean phenotypes from distantly related species, followed by downstream attempts to develop plausible hypotheses for the observed patterns, almost always in the context of adaptive explanations. As phylogenetic lineages become isolated, their mean phenotypes are expected to diverge as a consequence of varying selection pressures. However, under many circumstances, substantial divergence can be expected even in the face of identical selection pressures, owing to the vagaries of mutation and random genetic drift. In particular, by altering the accessibility of mutations to selection, a change in effective population size (Ne) modifies the fixation probabilities of alternative alleles, with small Ne reducing the accumulation of beneficial alleles and increasing that of detrimental alleles.

This basic principle leads to the expectation that there can be gradients in the performance of traits across species having different population sizes but otherwise experiencing identical selection pressures, owing simply to the differential accumulation of deleterious mutations. This should be universally true, provided that the effects of all mutations are not so large as to be equally visible to natural selection at all population sizes (Lynch 2018, 2020), although the precise scaling parameters will depend on the underlying genetic details. Interest in this matter is motivated by observations in which the performances of key organismal features scale negatively with Ne across the Tree of Life; e.g. mutation rates (Lynch et al. 2016), maximum growth rates (Lynch et al. 2022), swimming efficiency (Schavemaker and Lynch 2022), protein folding rates and stability (Galzitskaya et al. 2011), enzyme catalytic rates (Bar-Even et al. 2011), and degree of removal of exogenous genomic DNA (Lynch 2007).

Here, we explore a key determinant of the drift barrier to the mean performance of traits that has been ignored in prior theory development: the consequences of a distribution of sites with varying effects on the phenotype. We show that such variation is a central determinant of the ways in which traits under directional selection are expected to scale with Ne. A substantial fraction of earlier work on the evolution of mean phenotypes assumes an infinite-site model, whereby each newly arising mutation arrives at a site previously fixed in the population, while also assuming an absence of limits to the potential range of phenotypic variation (Kimura and Crow 1964; Kimura 1969; Latter 1970; Lande 1975; Bulmer 1980; Lynch and Hill 1986). Owing to its relative mathematical tractability, this model has played a central role in many areas of population genetics, including the development of theory on the maintenance of variation, the long-term response to selection, and the accumulation of deleterious mutations in various contexts (reviewed in Walsh and Lynch 2018). However, for a wide variety of problems in phenotypic evolution, the infinite-site model is unrealistic biologically, and its utility as an approximation remains unclear.

First, the mutational target sizes of the molecular/cellular constituents of phenotypic traits can be quite constrained. For example, an average protein is of order 1 kb in length, and specific functional domains generally encompass <20 amino acids. Many elements at the level of DNA (e.g. transcription-factor binding sites) and RNA (e.g. microRNAs, stems, and loops of larger RNAs) are substantially smaller. The sizes of effectively nonrecombining linkage groups are often in the range of a few bp to hundreds of kb depending on the recombination rate. Second, the mutation rate is sufficiently high that in large populations, multiple independent mutations will often cosegregate at individual nucleotide sites, which can confer no more than four allelic types. Finally, infinite-site models often have the undesirable property that the mutation spectrum is independent of the genetic background, resulting in a situation in which mean phenotypes can diverge without limits by either drift or directional selection. In reality, as more nucleotide sites in a stretch of DNA are occupied by deleterious mutations, the segment-wide deleterious and beneficial mutation rates must, respectively, decline and increase.

The approach taken here assumes a finite number of genomic sites contributing to the expression of a trait, with mutations at different sites potentially having different magnitudes of phenotypic/fitness effects, e.g. amino acid replacement sites with different functional consequences for the encoded protein, silent sites under varying levels of selection owing to effects on mRNA folding and/or translational speed or accuracy, and noncoding sites with varying effects on gene expression. There has been growing interest in this type of model (Cockerham 1984; Charlesworth and Jain 2014; John and Jain 2015; Lynch 2018, 2020), but many problems remain to be solved.

It has long been known that linkage between jointly selected sites can diminish the efficiency of selection via the effects of linkage disequilibrium (Hill and Robertson 1966; Comeron et al. 2008), particularly when the numbers of sites are large and selection is relatively weak. Models of such effects have played a significant role in the interpretation of patterns of codon bias within and among regions of protein-coding genes experiencing different levels of recombination (McVean and Charlesworth 2000; Kim 2004; Loewe and Charlesworth 2007; Charlesworth et al. 2009; Charlesworth and Campos 2014). Linked sites with differing mutational effects can be expected to play a significant role in phenotypic divergence owing to the multiple ways in which they interfere with each other in the selective process. For example, beneficial mutations at sites with small effects will be unavailable to selection if they arise in tight linkage with a segregating deleterious mutation at a site with large effects (Nguyen Ba et al. 2019). On the other hand, if sites with small effects greatly exceed the number of major-effect loci, beneficial mutations at the latter positions will have reduced visibility to selection if they happen to arise on a relatively poor linked background associated with segregating minor-effect sites. More generally, one can expect moderate-effect sites to experience both types of problems, particularly if there is an inverse relationship between the number of sites and their contributing effects. The overall process is further complicated by the fact that recurrent purging of deleterious mutations at multiple linked sites has general effects on effective population sizes, thereby influencing all other aspects of the efficiency of selection.

There has been much research on these matters as well (Gerrish and Lenski 1998; Johnson and Barton 2002; Desai and Fisher 2007; Loewe and Charlesworth 2007; Campos and Wahl 2010; Kaiser and Charlesworth 2009; Charlesworth 2013a; Good et al. 2014; Pénisson et al. 2017; Campos and Charlesworth 2019; Jain 2019). Although these analyses have yielded numerous useful insights, in the few cases that consider sites with different fitness effects, general theoretical results defining the effects of linkage on the effective population size governing long-term mean allele frequencies remain to be developed. Here, we present computational results and mathematical approximations bearing on this difficult but biologically general problem.

The model

We start with a simple model with $L$ linked sites (or genetic loci), each with two alternative allelic states, + and −, contributing positively and negatively to the trait, but with the magnitude of +/− effects allowed to vary among sites (Fig. 1). Such a model would apply, for example, to a situation in which there is one optimal nucleotide at a site, with the remaining three having equivalent fitness effects, as in Li (1987) and Bulmer (1991). Because the stretch of nucleotide sites under consideration is assumed to be completely linked, the positions of the sites are irrelevant, and there can be a multiplicity of functionally equivalent haplotypes (i.e. with identical numbers of + alleles) in each effect class, which alters their ease of mutational accessibility (Lynch 2018, 2020). The site-specific per-generation mutation rates from the − to the + state, and vice versa, denoted as $u_{01}$ and $u_{10}$, respectively, will be assumed to be identical at all sites.

Fig. 1.

Schematic for the general approach. Here, there is a linkage block (experiencing no recombination) containing 22 sites, with an approximately exponential distribution of numbers of sites with three different magnitudes of effects (one site with major effects, surrounded by five of medium effects, and 16 of small effects). Three of the many possible haplotypes are shown, with solid and open balls denoting + and − alleles. Given the assumption of complete linkage, the ordering of site-specific haplotypes is irrelevant, and in this case, haplotypes 2 and 3 are functionally equivalent, as they contain identical numbers of sites with + alleles for the three types of effects. The pattern of mutation is haplotype dependent, being a function of the numbers of + and − alleles at each type of site; the rates of the total set of possible mutations for haplotype 1 are given in the bottom panel, with $u_{01}$ and $u_{10}$ being the per-site mutation rates from − to + allelic states, and vice versa.
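The haplotype dependence of the mutation pattern can be sketched in a few lines of Python. This is only an illustration under the stated model: within each effect class, the rate of mutation to − alleles is the number of sites currently holding + times $u_{10}$, and the rate to + alleles is the number of sites holding − times $u_{01}$. The function name, the example haplotype, and the numerical rates are hypothetical and are not taken from Fig. 1.

```python
def haplotype_mutation_rates(n_plus, n_sites, u01, u10):
    """Per-class mutation rates for one haplotype.

    n_plus[i]  : number of sites of effect class i carrying the + allele
    n_sites[i] : total number of sites in effect class i
    Returns a list of (deleterious_rate, beneficial_rate) pairs, one per class.
    """
    rates = []
    for plus, total in zip(n_plus, n_sites):
        if not 0 <= plus <= total:
            raise ValueError("n_plus must lie between 0 and n_sites")
        deleterious = plus * u10            # + -> - at sites now holding +
        beneficial = (total - plus) * u01   # - -> + at sites now holding -
        rates.append((deleterious, beneficial))
    return rates


# Site counts as in Fig. 1: 1 major, 5 medium, and 16 small-effect sites.
N_SITES = [1, 5, 16]
# Hypothetical haplotype and rates (assumptions for illustration only).
example = haplotype_mutation_rates([1, 3, 8], N_SITES, u01=3.3e-8, u10=1.0e-7)
```

Because only the counts of + alleles per class enter the calculation, any two haplotypes with the same counts receive the same rates, which is the sense in which they are functionally equivalent.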

A central goal is to determine the conditions under which gradients in mean phenotypes with respect to $N_e$ can be expected for populations under identical patterns of persistent directional selection. Thus, it is desirable to perform analyses with biologically realistic combinations of parameter values. Across the Tree of Life, $N_e$ generally falls in the range of $10^4$ to $10^9$, and the mutation rate per nucleotide site scales negatively with the ∼0.76 power of $N_e$ (Lynch et al. 2016; Long et al. 2017; Walsh and Lynch 2018). Thus, where computational work was involved, the following analyses were performed under the assumption of a deleterious mutation rate per site (which might be a cluster of adjacent nucleotides) of $10^{-7}$ at an adult population size of $N = 10^4$, such that $u_{10} = 0.0011N^{-0.76}$, which is approximately 10× the known rate per nucleotide site. With this scaling, for the full range of population sizes employed here ($N = 10^4$ to $2 \times 10^9$), the product $Nu_{10}$ then ranges from 0.01 mutations/population/site/generation at the lowest to 0.10 at the highest population sizes. It should be noted that the negative scaling of the mutation rate with absolute population size ($N$) is likely shallower than that assumed here, as $N_e/N$ for large multicellular species (small $N_e$) is likely on the order of 0.1, whereas that for microbial species can be orders of magnitude smaller. In the end, we provide analytical approximations that make no assumptions about the relationship between mutation rates and population sizes.

We evaluate the consequences of a wide range of linkage-block lengths, from 1 (free recombination) to $10^6$, following the behavior of single blocks (with no other simultaneously segregating unlinked blocks). Selection coefficients ranged from $s = 10^{-8}$ to $10^{-4}$, and mutational biases towards beneficial alleles ranged from $\beta = u_{01}/u_{10} = 0.10$ to 1.00. Under this finite-site model, the deleterious mutation rate per haplotype increases linearly with the number of sites harboring advantageous alleles, whereas the beneficial rate scales in the opposite direction. In all simulations, we embedded a single neutral site within the linkage block and then monitored the long-term average heterozygosity; this then enabled a retrospective estimate of the variance $N_e$ by use of the expression for the expected variance at a neutral site under drift-mutation equilibrium (Equation (2b) below).

The population consists of $N$ haploid individuals, so that de novo mutations have initial frequencies of $1/N$. Assuming independent fitness effects within and between loci (i.e. no dominance or epistasis), as done below, all results should extend to diploids by substituting $2N$ for $N$ and $2N_e$ for $N_e$. As noted below, the effective population size ($N_e$), which is $\le N$, governs the magnitude of random genetic drift and is a natural outcome of the structure of the linkage group, the strength of selection, and $N$ itself.

The following work is performed under the assumptions of a classical Wright–Fisher discrete-generation model with sequential episodes of mutation, selection, and random genetic drift. Under this model, allele frequencies fluctuate in time, but because mutations are reversible, the system always eventually evolves to a quasi-steady-state distribution, provided the fitness function remains constant. Our particular focus is on how long-term average frequencies of beneficial alleles at various site types depend on the number and distribution of site types within linkage groups, on the joint forces of selection and mutational bias, and in particular on the population size. Related analyses have been performed by John and Jain (2015), Jain and John (2016), and Jain (2019), but mostly under the assumptions of either an effectively infinite population and/or an infinite-site framework, and even in these cases, achieving reasonably simple expressions has been difficult.
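To make the sequence of events concrete, the following is a minimal Python sketch of a haploid Wright–Fisher cycle for a single biallelic site ($L = 1$), with mutation, selection, and binomial drift applied in that order. It is not the authors' C++ program and is practical only for small $N$; the function name, the starting frequency of 0.5, the seed, and the parameter values in the usage line are all assumptions chosen for illustration.

```python
import random


def simulate_single_site(N, s, u01, u10, generations, burn_in, seed=1):
    """Long-term mean frequency of the + allele at one site (haploid, L = 1)."""
    rng = random.Random(seed)
    p = 0.5                      # arbitrary starting frequency of the + allele
    total = 0.0
    for gen in range(burn_in + generations):
        # 1. Reversible mutation between the + and - states.
        p = p * (1.0 - u10) + (1.0 - p) * u01
        # 2. Selection: the - allele has relative fitness 1 - s.
        p = p / (p + (1.0 - p) * (1.0 - s))
        # 3. Drift: binomial sampling of N offspring.
        p = sum(rng.random() < p for _ in range(N)) / N
        if gen >= burn_in:
            total += p
    return total / generations


# Hypothetical small-population run with N*s = 10 and beta = 0.33.
mean_p = simulate_single_site(N=200, s=0.05, u01=3.3e-4, u10=1.0e-3,
                              generations=2000, burn_in=500)
```

With selection this strong relative to drift, the long-term mean should sit well above the neutral expectation $\beta/(1+\beta)$; extending the sketch to linked sites would require tracking haplotype classes rather than a single frequency.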

Owing to the stochastic nature of the underlying processes, computer simulations of these processes must proceed for very large numbers of generations to achieve stable estimates of means and variances. To obtain greater computational speed, for large population sizes, we scaled the input parameters so as to keep $Nu_{10}$, $Nu_{01}$, and $Ns$ constant, by reducing $N$ and increasing the mutation and selection parameters by the same factor, with constraints such that $N$ was always $\ge 10^3$, and $s$ and $Lu_{10}$ always $\le 0.1$. Burn-in periods before compiling statistics were typically at least $10^5N$ generations, with the populations then being assayed every $N/10$ generations for $10^6$ to $10^8$ intervals. Simulations, which often extended for several days, were carried out with a program written in C++ (freely available from the authors), in a form that allows parallel analysis of multiple population sizes. Although we have evaluated a broad range of population-genetic environments extensively by computer simulation, throughout we attempt to provide heuristic semi-analytical expressions to address more general issues.
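The parameter rescaling described above can be sketched as follows. This is one possible reading, not the authors' implementation: a single factor is chosen as large as the three stated constraints allow ($N \ge 10^3$, $s \le 0.1$, $Lu_{10} \le 0.1$), and $N$ is divided by it while $s$, $u_{01}$, and $u_{10}$ are multiplied by it. The function name, argument names, and returned dictionary are hypothetical.

```python
def rescale_parameters(N, s, u01, u10, L, N_min=1000, cap=0.1):
    """Shrink N while holding N*s, N*u01, and N*u10 fixed."""
    factor = N / N_min                                # keeps N >= N_min
    factor = min(factor, cap / s, cap / (L * u10))    # keeps s, L*u10 <= cap
    factor = max(factor, 1.0)                         # never enlarge N
    return {
        "N": N / factor,
        "s": s * factor,
        "u01": u01 * factor,
        "u10": u10 * factor,
        "factor": factor,
    }


# Hypothetical inputs: here the cap on s is the binding constraint.
scaled = rescale_parameters(N=1e6, s=1e-3, u01=1e-9, u10=3e-9, L=10)
```

The scaled $N$ is returned as a float; rounding it to an integer number of individuals is left to the caller.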

Results

Sites with single effects

For baseline comparisons, we start with the simplest situation of sites with single effects, such that all mutations compete maximally with each other. Expanding on prior work (Lynch 2020), new expressions are presented to explain the general consequences of this extreme setting. The fitness function is assumed to be of the form $W(L_d) = (1-s)^{L_d}$, where $L_d$ is the number of deleterious mutations carried in a haplotype, such that a maximum fitness of 1.0 occurs in individuals free of deleterious alleles, whereas with $L$ equivalent sites, $(1-s)^L$ is the minimum fitness (for a haplotype containing only deleterious alleles). Under this multiplicative fitness model, selection operates on each site independently, and there is no epistasis. Although this leads to the expectation of no linkage disequilibrium in populations that are infinite in size (Eshel and Feldman 1970), this is not the case in finite populations.
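A brief Python sketch of the multiplicative fitness function $W(L_d) = (1-s)^{L_d}$ shows what the absence of epistasis means in practice: adding one more deleterious allele multiplies fitness by $1-s$ regardless of how many such alleles the haplotype already carries. The function names are hypothetical.

```python
def fitness(n_deleterious, s):
    """Multiplicative fitness W(L_d) = (1 - s)**L_d."""
    return (1.0 - s) ** n_deleterious


def cost_of_one_more(n_deleterious, s):
    """W(L_d + 1) / W(L_d), which equals 1 - s on any background."""
    return fitness(n_deleterious + 1, s) / fitness(n_deleterious, s)
```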

The case of linkage blocks of length L=1 is of special interest, as it represents the limiting situation of free recombination, where selection is most efficient. For this situation, an analytical expression for the long-term mean frequency of the + allele, here denoted p~, has already been developed by Kimura et al. (1963), and will not be repeated here, except to say that the fit to simulated data is excellent across the full range of population sizes, selection coefficients, and mutation rates. Although highly accurate, two undesirable features of the Kimura et al. (1963) solution are the need to solve a confluent hypergeometric function by a series expansion and the rather nontransparent interpretation of the formulations. For this reason, various approximations for particular domains of Nu10 and Ns have been given by Charlesworth and Jain (2014).

An alternative expression, which is quite accurate over the full range of parameter space explored herein and extends to larger linkage blocks, can be obtained in the following way. In Lynch (2020, Equation S10), it was noticed that, if the within-population variance in numbers of mutant alleles per individual is known from simulations, the long-term average frequency of + alleles is accurately described by

$$\tilde{p} = \frac{\beta + (s\lambda\sigma_w^2/u_{10})}{1+\beta-s}. \tag{1}$$

Following Lynch et al. (1993), $\beta$ is the ratio of mutation rates (defined above), $\sigma_w^2$ is the mean within-population variance per locus (i.e. the total variance in number of + alleles per individual divided by $L$), and $\lambda = 1-(1/N_e)$ is a measure of the resistance of the population to random genetic drift, with $N_e$ being the effective population size. Derived from a quantitative-genetic perspective, this expression evaluates $\tilde{p}$ as the mean allele frequency at which the selection advance per generation (a function of the genetic variance) is matched by the decline associated with mutation. Others have used such a matching approach to estimate the position of the leading edge of the full distribution of fitness (Goyal et al. 2012; John and Jain 2015).

Letting $S = 2N_es$ and assuming $2Nu_{01} \ll 1$, by extension from McVean and Charlesworth (1999) and Long et al. (2018),

$$\sigma_w^2 \simeq \sigma_n^2\,\frac{(1+\beta)(1-e^{-S})}{S(\beta+e^{-S})}, \tag{2a}$$

where

$$\sigma_n^2 = \frac{2N_eu_{01}\beta}{(1+\beta)[\beta+2N_eu_{01}(1+\beta)]} \tag{2b}$$

(Gale 1990; Charlesworth and Jain 2014) is the expected variance under neutrality (equivalent to half the expected neutral heterozygosity per site). For $N_eu_{01} \ll 1$, Equations (2a,b) reduce to Equation 15 of McVean and Charlesworth (1999) after modifying for the scaling factor of 2 and haploidy, and Equation (1) yields the Li–Bulmer (Li 1987; Bulmer 1991) equation (Equation A1) by setting $\lambda = 1$ and neglecting the $s$ in the denominator. A key remaining issue is that unless $N_eu_{10} \ll 1$, the effective population size ($N_e$) will be depressed below the absolute population size ($N$) by selective interference among simultaneously segregating mutations. As a consequence, Equation (1) cannot be solved by substituting $N_e = N$, and a separate expression is needed for $N_e$.
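The relationships among Equations (1), (2a), and (2b) can be checked numerically. The Python sketch below assumes the readings $\tilde{p} = [\beta + s\lambda\sigma_w^2/u_{10}]/(1+\beta-s)$, $\sigma_w^2 \simeq \sigma_n^2(1+\beta)(1-e^{-S})/[S(\beta+e^{-S})]$, and $\sigma_n^2 = 2N_eu_{01}\beta/\{(1+\beta)[\beta+2N_eu_{01}(1+\beta)]\}$, and compares the result with the Li–Bulmer form $\beta e^{S}/(1+\beta e^{S})$. The function names and parameter values are hypothetical; the inverse of Equation (2b) is our own rearrangement, included to illustrate how a neutral-site variance could be converted into a retrospective $N_e$.

```python
import math


def neutral_variance(Ne, u01, beta):
    """Equation (2b): expected neutral variance per site."""
    x = 2.0 * Ne * u01
    return x * beta / ((1.0 + beta) * (beta + x * (1.0 + beta)))


def ne_from_neutral_variance(var_n, u01, beta):
    """Equation (2b) solved for Ne (a rearrangement made for illustration)."""
    ceiling = beta / (1.0 + beta) ** 2      # value of (2b) as Ne -> infinity
    if not 0.0 < var_n < ceiling:
        raise ValueError("variance outside the range attainable under (2b)")
    x = var_n * beta * (1.0 + beta) / (beta - var_n * (1.0 + beta) ** 2)
    return x / (2.0 * u01)


def within_variance(Ne, s, u01, beta):
    """Equation (2a): variance per selected site, with S = 2*Ne*s."""
    S = 2.0 * Ne * s
    if S == 0.0:
        return neutral_variance(Ne, u01, beta)
    factor = (1.0 + beta) * -math.expm1(-S) / (S * (beta + math.exp(-S)))
    return neutral_variance(Ne, u01, beta) * factor


def mean_frequency(Ne, s, u01, beta):
    """Equation (1), with sigma_w^2 supplied by Equation (2a)."""
    u10 = u01 / beta
    lam = 1.0 - 1.0 / Ne
    var_w = within_variance(Ne, s, u01, beta)
    return (beta + s * lam * var_w / u10) / (1.0 + beta - s)


def li_bulmer(Ne, s, beta):
    """Li-Bulmer frequency beta*e^S/(1 + beta*e^S), written to avoid overflow."""
    return beta / (beta + math.exp(-2.0 * Ne * s))


# Hypothetical parameter values satisfying Ne*u01 << 1.
p_eq1 = mean_frequency(Ne=1e5, s=1e-5, u01=3.3e-10, beta=0.33)
p_lb = li_bulmer(Ne=1e5, s=1e-5, beta=0.33)
```

For these inputs the two frequencies should agree closely, as expected when $N_eu_{01} \ll 1$ and $\lambda \approx 1$. Note that $N_e$ is an input here; the sketch does not address how $N_e$ relates to $N$.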

There are many ways to define an effective population size, depending on the allelic behavior of interest. One common consideration is the coalescence effective population size, i.e. the degree to which nucleotide diversity is depressed at neutral sites linked to other sites under selection (e.g. Charlesworth et al. 1995; Kim and Stephan 2000; Good et al. 2014; Campos and Charlesworth 2019). However, applying estimates of Ne obtained from simulations of standing levels of variation at linked neutral sites to the preceding formulae yields a less than satisfactory fit to observed levels of variation and mean allele frequencies at selected sites.

An alternative approach starts with a consideration of the expected mean frequency of beneficial alleles over sites under the assumption of no interference (Li 1987; Bulmer 1991). Given the selection and mutation pressure ($s$ and $\beta$), the ratio $N_e/N$ necessary to account for an observed equilibrium beneficial-allele frequency, $\tilde{p}$, is then

$$\phi = \left(\frac{1}{2Ns}\right)\ln\!\left(\frac{\tilde{p}}{\beta(1-\tilde{p})}\right) \tag{3}$$

(Lynch 2020). This expression is simply a rearrangement of the Li–Bulmer equation, with $N_e = \phi N$. The assumption here is that the Li–Bulmer equation will yield the proper scaling of $\tilde{p}$ with population size when $N$ is replaced by $N_e$.
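As a small worked check of this rearrangement, the Python sketch below computes $\phi$ from a mean frequency via $\phi = \ln[\tilde{p}/(\beta(1-\tilde{p}))]/(2Ns)$ and confirms that it inverts the Li–Bulmer form with $N_e = \phi N$. The function names and numerical values are hypothetical.

```python
import math


def phi_from_frequency(p, N, s, beta):
    """Equation (3): the ratio Ne/N implied by a mean + allele frequency p."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (beta * (1.0 - p))) / (2.0 * N * s)


def frequency_from_phi(phi, N, s, beta):
    """Li-Bulmer frequency with Ne = phi*N, i.e. Equation (3) inverted."""
    return beta / (beta + math.exp(-2.0 * phi * N * s))


# Hypothetical values: N = 1e6, s = 1e-5, beta = 0.33, and a true phi of 0.2.
p_obs = frequency_from_phi(0.2, N=1e6, s=1e-5, beta=0.33)
phi_est = phi_from_frequency(p_obs, N=1e6, s=1e-5, beta=0.33)
```

A frequency equal to the neutral expectation $\beta/(1+\beta)$ returns $\phi = 0$, since the logarithm's argument is then 1.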

Using estimates of $\tilde{p}$ from simulated data to solve for $\phi$, and substituting $N_e = \phi N$ in the preceding expressions, Equations (2a,b) provide excellent fits to observed within-population variances for the full range of parameters explored here, usually well within 10% of observed values (Supplementary Fig. 1), showing that the fixation $N_e$ is the relevant determinant of the standing variation. Equation (1) yields estimates of $\tilde{p}$ that are always within 3% of observed values (Fig. 2). Notably, this approach was found to be valid for all linkage-block lengths explored, from $L = 1$ to $10^6$. Note also that as $Ns \to \infty$,

Fig. 2.

Relationship between the average frequency of advantageous alleles (p~) and the population size N (x axis), selection coefficient s (different panels), and linkage-block length L (colored lines within panels) under the assumption of equal fitness effects across loci. The lines associated with each set of points are the theoretical predictions obtained with Equations (1), (2a), and (2b), using the fixation Ne derived from computer simulations. The mutational bias in all panels is β=0.33, so the neutral expectation at small N is p~=0.25.

$$\sigma_w^2 \simeq \frac{-\alpha+\sqrt{\alpha^2+4su_{10}}}{2s}, \tag{4}$$

where $\alpha = s+u_{01}+u_{10}$. In this case, provided $s$ exceeds the site-specific mutation rates ($u_{01}$ and $u_{10}$), $\sigma_w^2$ also closely approximates the equilibrium frequency of a deleterious allele in an infinite haploid (fully recombining) population, and more generally $\tilde{p}(1-\tilde{p})$.

Although these observations justify the use of the correction factor ϕ (defined by Equation 3) to transform N into the fixation effective population size relevant to equilibrium allele frequencies, validation of this approach required the use of estimates of ϕ derived by computer simulations. For more practical applications, we require an expression for ϕ from first principles. An excellent approximation to ϕ, as a function of the mutation rates, selection coefficient, number of loci, and absolute population size, was obtained by inspection in Lynch (2020), albeit with a particular scaling between the mutation rate and population size. In the Appendix, we derive more general expressions, taking into consideration the amount of selective interference imposed on the fixation probability for beneficial mutations by linked sites; these expressions make no assumptions about mutation-rate scaling.

Despite the complexity of the underlying issues, the derived expressions for $\phi$ generally yield estimates that are within 30% (but often considerably closer) of simulation results (Supplementary Fig. 2), and even closer fits for $\tilde{p}$ (Fig. 3). Given that $\phi$ varies 10,000-fold over the full range of parameter space, the essence of the system is captured. This provides an upgrade to the visual-fit interpretation of Lynch (2020), yielding insight into the scaling relationships between $\phi$ and the underlying population-genetic parameters. For example, for $L > 10^4$,

Fig. 3.

Comparison of the theoretical predictions to observations from computer simulations for a wide range of selection coefficients s, population sizes N, and linkage group sizes L, for β=0.33. In all cases, the mutation rate is set to scale with N as described in the text. Solid lines: for L>100, based on Equation (A11b), using k=0.25; for L=10 and 100, also based on Equation (A11b) in the same way, but with k=1, and ϕmax=1. Dashed lines are based on Equation (A13), an approximation that assumes s/(Lu01)<1, which can be seen to break down outside of this domain.

$$\phi \simeq \left(\frac{4s(1+\beta)\phi_{\max}}{(Ns)^2Lu_{01}}\right)^{1/3}, \tag{5a}$$

with

$$\phi_{\max} = \frac{1}{1+[2Ns/\ln\{1+(s/u_{01})\}]}. \tag{5b}$$

This expression shows that the magnitude of the reduction in $N_e$ caused by linkage increases with the cube root of the number of sites. It also shows that $\phi$ is a function of two other key composite parameters: the ratio of the selection strength to the mutation rate to beneficial alleles, and the ratio of the selection strength to the power of drift in the absence of interference, i.e. for large $L$, $\phi$ scales with the ∼1/3 power of $s/u_{01}$, and with the $-2/3$ to $-1$ power of $Ns$ with increasing $Ns$. We note that $\phi_{\max}$ is simply employed as a scaling function that retains the basic small-population size behavior of the Li–Bulmer formulation when alleles are typically fixed but tends towards deterministic expectations of allele frequencies as $N \to \infty$ (see Appendix).
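The scaling behavior just described can be explored with a short Python sketch. It assumes the readings $\phi \simeq [4s(1+\beta)\phi_{\max}/((Ns)^2Lu_{01})]^{1/3}$ and $\phi_{\max} = 1/(1+2Ns/\ln(1+s/u_{01}))$ for Equations (5a) and (5b); the function names and parameter values are hypothetical, and the large-$L$ expression is applied without any check that $L > 10^4$.

```python
import math


def phi_max(N, s, u01):
    """Equation (5b): upper scaling function for phi."""
    return 1.0 / (1.0 + 2.0 * N * s / math.log1p(s / u01))


def phi_large_L(N, s, u01, beta, L):
    """Equation (5a): approximate Ne/N for large linkage blocks."""
    numerator = 4.0 * s * (1.0 + beta) * phi_max(N, s, u01)
    denominator = (N * s) ** 2 * L * u01
    return (numerator / denominator) ** (1.0 / 3.0)


# Hypothetical inputs: an eightfold increase in L should halve phi.
phi_a = phi_large_L(N=1e6, s=1e-5, u01=1e-8, beta=0.33, L=1e5)
phi_b = phi_large_L(N=1e6, s=1e-5, u01=1e-8, beta=0.33, L=8e5)
```

Under this reading, $2Ns\,\phi_{\max}$ approaches $\ln(1+s/u_{01})$ as $N$ grows, so that the Li–Bulmer odds $\beta e^{2N\phi s}$ tend to $\beta + s/u_{10}$, in line with the deterministic expectation mentioned in the text.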

Summing up for the simplest situation in which all sites within a linkage block have equivalent effects on fitness, contrary to the single-site expectations (Kimura et al. 1963), where there is a quantum shift in the frequency of beneficial alleles with increasing population size around a pivot point of $Ns = 1$, linkage reduces the gradient of response of $\tilde{p}$ to $N$ (Fig. 2). Instead of a shift from the neutral expectation of $\tilde{p}$ to that expected under deterministic selection–mutation balance over a window of just an order of magnitude of $N$, linkage can extend the gradient to several orders of magnitude of $N$, with the effect becoming increasingly pronounced with larger $L$. On the other hand, when viewed as a function of the variance $N_e$, where the latter is derived from the heterozygosity segregating at linked neutral sites, $\tilde{p}$ is largely (but not entirely) a stepwise function of $N_es$, as $N_e$ subsumes the influence of linkage interference. There is, however, some additional influence of $L$ in the region of $N_es \simeq 1$ (Fig. 4).

Fig. 4.

Response of the mean frequency of the beneficial allele as a function of the coalescence effective population size (Ne as determined from the long-term average variation at linked neutral sites) and of the product Nes, over a five order-of-magnitude range for L (the number of linked sites) and a four order-of-magnitude range of s. The lines simply connect the points.

Finally, note that the preceding expressions also yield descriptions of the expected standing levels of within-population variation for quantitative traits under persistent directional selection (in this case a multiplicative fitness function) with reversible mutation, a problem of long-standing interest in quantitative genetics (Walsh and Lynch 2018). For example, simplifying from Equations (2a,b), assuming unbiased mutation ($\beta = 1$), the average genetic variance for a trait with $L$ equivalent loci with average squared allelic effect $E(a^2)$ is

$$\sigma_A^2 = L\,E(a^2)\left(\frac{2N_eu}{1+4N_eu}\right)\left(\frac{1-e^{-S}}{S(1+e^{-S})}\right), \tag{6a}$$

$$\sigma_A^2 = L\,E(a^2)\left(\frac{u}{2uS+s}\right), \tag{6b}$$

for $u = u_{10} = u_{01}$, and $S < 4$ and $S > 4$, respectively. These expressions show that under selection, the genetic variance reaches a maximum at the point where $N_es \simeq 1$, where the power of drift and selection are essentially equivalent (Supplementary Fig. 1). The genetic variance initially grows with $N$ owing to the increase in number of mutating individuals in the population, but beyond the peak, the deterministic force of selection overwhelms drift.
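The two regimes can be combined into a single function, sketched here in Python under the readings $\sigma_A^2 = L\,E(a^2)\,[2N_eu/(1+4N_eu)]\,[(1-e^{-S})/(S(1+e^{-S}))]$ for $S < 4$ and $\sigma_A^2 = L\,E(a^2)\,u/(2uS+s)$ for $S > 4$, with $S = 2N_es$. The function and argument names are hypothetical, and assigning $S = 4$ itself to the second branch is an arbitrary choice made here.

```python
import math


def genetic_variance(L, mean_sq_effect, Ne, u, s):
    """Equations (6a)/(6b) for unbiased mutation (beta = 1), S = 2*Ne*s."""
    S = 2.0 * Ne * s
    if S < 4.0:
        drift_mutation = 2.0 * Ne * u / (1.0 + 4.0 * Ne * u)
        if S == 0.0:
            selection = 0.5     # limit of (1 - e^-S) / (S (1 + e^-S)) as S -> 0
        else:
            selection = -math.expm1(-S) / (S * (1.0 + math.exp(-S)))
        return L * mean_sq_effect * drift_mutation * selection
    return L * mean_sq_effect * u / (2.0 * u * S + s)


# Hypothetical values either side of the S = 4 boundary.
below = genetic_variance(L=100, mean_sq_effect=1.0, Ne=199950, u=1e-9, s=1e-5)
above = genetic_variance(L=100, mean_sq_effect=1.0, Ne=200050, u=1e-9, s=1e-5)
```

Algebraically, the ratio of the first expression to the second is $\tanh(S/2)$, which is about 0.96 at $S = 4$, so the two branches differ by only a few percent where they meet.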

Even in the case of neutrality, there is a natural upper bound on the genetic variance, owing to the finite number of effects per nucleotide site (here assumed to be two), with the neutral variance in the case of $\beta = 1$ being simply proportional to $2Nu/(1+4Nu)$. Although it might be assumed that increased efficiency of selection (higher $Ns$) will always reduce standing levels of variation, in fact when mutation is biased in the opposite direction of selection, the genetic variance increases at an accelerating rate with $S$ up to 4. This is because the conflict between mutation towards − alleles and selection towards + alleles pulls the latter towards more intermediate frequencies.

Two site types

Having arrived at a reasonable understanding of the factors determining the mean and variance of traits in the simplest case of L equivalent sites, we now explore the consequences of sites with variable fitness effects, starting with the case of just two site types to help illuminate the general complexities that arise. Some prior work has been done in this area (e.g. Johnson and Barton 2002; Desai and Fisher 2007; Pénisson et al. 2017; Campos and Charlesworth 2019; Jain 2019). In their seminal work on two-locus systems, Hill and Robertson (1966) suggested that a site with selection coefficients smaller than half that of a major-effect site will have a negligible interference effect on the latter, but this conjecture will be shown to break down in systems with more than two linked loci. Here, we seek to obtain more general results, allowing for finite population size and finite numbers of sites within linkage blocks of arbitrary length. We assume that the two site types have identical mutational features, while allowing for different site numbers.

Results described in the preceding section show that when linked sites have single effects, there is a smooth gradient in the expected frequency of favorable alleles with increasing population size. For any particular s, the gradient with N becomes increasingly shallow with increasing numbers of linked loci, owing to enhanced levels of selective interference and associated fractional reductions in the effective population size. This gradient becomes steeper and is almost independent of L when reformulated as a function of Ne rather than N.

When sites with two effects contribute to the expression of a trait, a qualitative shift in the response of the mean phenotype to Ne is expected, as the sites with larger fitness effects will make a transition to high frequencies at a lower Ne than those with small effects. Moreover, a shift in the scaling of the fixation Ne with respect to N can be anticipated owing to the fact that once N is high enough to enable all major-effect sites to approach fixation for + alleles, these sites no longer contribute much to selective interference.

Consider, for example, a trait having an underlying additive-genetic basis with two types of sites: $L_M$ sites with major phenotypic and fitness effects $a_M$ and $s_M$, and $L_m$ sites with minor effects $a_m$ and $s_m$. The mean genotypic value is then

$$\bar{z} = c + L_M\tilde{p}_Ma_M + L_m\tilde{p}_ma_m, \tag{7}$$

where $\tilde{p}_M$ and $\tilde{p}_m$ denote the mean frequencies of the + alleles at the major and minor loci, and $c$ is an arbitrary baseline constant. If $s_M$ is substantially larger than $s_m$, beneficial alleles at the major-effect sites will achieve near fixation at relatively low $N_e$, before the minor-effect sites begin to respond to selection.

An example is shown in Fig. 5, where the response of mean performance to the effective population size in the single-effects case (equivalent to p~M) is compared to that for cases in which there are 10-fold additional minor-effect sites for each major-effect site, each with 10-fold lower selection coefficients, i.e. Lm=10LM, and sM=10sm. In this figure, Ne is the coalescence effective size inferred by the average level of nucleotide diversity at linked neutral markers, as this would typically be the measure used in a population-genetics analysis (not to be confused with the fixation effective size employed in our analytical work). Assuming that the phenotypic effects are proportional to the selection coefficients, mean performance in the two-effect case is defined by Equation (7), normalized by LMsM+Lmsm to give a maximum performance of 1.0 when all sites are fixed for + alleles. In this example, with LMsM=Lmsm, half of the maximum performance is determined by each class of sites. With a 10-fold difference in s between classes, as Ne increases to the point at which NesM exceeds 1.0, a shoulder appears in the response profile because most major-effect loci are near fixation for + alleles, whereas the minor-effect loci do not start to significantly respond to selection until Nesm approaches 1. As a consequence, whereas the dynamic range of mean performance extends over just one order of magnitude of Ne under the single-effects case, the gradient extends for two orders of magnitude of Ne when substantial numbers of minor-effect sites are present.
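The normalization of mean performance can be written out as a short Python sketch. Phenotypic effects are taken to be proportional to selection coefficients, as stated above, so performance is $(L_M\tilde{p}_Ms_M + L_m\tilde{p}_ms_m)/(L_Ms_M + L_ms_m)$. To produce frequencies without running simulations, the sketch plugs in the Li–Bulmer form for each site class; this ignores interference and is an assumption made purely for illustration, not the procedure behind Fig. 5. All names and numerical values are hypothetical.

```python
import math


def li_bulmer(Ne, s, beta):
    """Interference-free mean + allele frequency (illustrative stand-in)."""
    return beta / (beta + math.exp(-2.0 * Ne * s))


def mean_performance(p_major, p_minor, L_major, L_minor, s_major, s_minor):
    """Equation (7) with effects proportional to s, scaled to a maximum of 1."""
    achieved = L_major * p_major * s_major + L_minor * p_minor * s_minor
    return achieved / (L_major * s_major + L_minor * s_minor)


def performance_at(Ne, beta, L_major, L_minor, s_major, s_minor):
    """Mean performance when both classes follow the Li-Bulmer frequency."""
    return mean_performance(li_bulmer(Ne, s_major, beta),
                            li_bulmer(Ne, s_minor, beta),
                            L_major, L_minor, s_major, s_minor)


# Hypothetical two-effect case with L_m = 10 L_M and s_M = 10 s_m.
curve = [performance_at(Ne, 0.33, 10, 100, 1e-4, 1e-5)
         for Ne in (1e3, 1e4, 1e5, 1e6)]
```

With $L_Ms_M = L_ms_m$, fixing all major-effect sites while minor-effect sites remain at zero frequency gives a performance of exactly one half, which is where the shoulder in the response profile arises.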

Fig. 5.


Response of mean performance to coalescence Ne (as determined from the nucleotide diversity at a linked neutral site) for the case of one- and two-effect models. In the two-effect case, the minor-effect loci have one-tenth the selection coefficient as that for the major-effect sites (sM=10sm) but are 10× more abundant (Lm=10LM), such that the total potential selective load is the same for both types of sites, i.e. LMsM=Lmsm. The lines simply connect the points.

All of the issues raised in the preceding section on selective interference between linked loci apply here, except that there is an asymmetry in the degree of interference that depends on the relative abundance of the two site types. Most notably, from the standpoint of major-effect sites, there is a remarkable simplicity with respect to the interference caused by minor-effect loci (Appendix). The interference that a single minor-effect locus imposes on major-effect loci is equivalent to the influence of (sm/sM)² of the latter (Fig. 6). This yields an overall level of interference operating on major-effect loci equivalent to that resulting from

Fig. 6.


Demonstration that the effect of selective interference from a minor-effect site on the equilibrium frequencies of beneficial alleles at major-effect sites is equivalent to adding (sm/sM)² major-effect sites to the genome. The values on the x axis refer to results under the single-effects model with various numbers of loci with major effects (LM in the insets) and no background minor-effect sites. The values on the y axis refer to results when a single major-effect site is surrounded by Lm minor-effect sites. Each point denotes the equilibrium mean frequency of + alleles under both conditions. In all cases here, sm/sM=0.1, and the prediction is that (sM/sm)² = 100 minor sites have the same influence on the equilibrium + major allele frequency as the addition of one more major-effect site. Computer simulation results are given for 21 different population sizes for each set of conditions. For every set of points on the x axis with LM major-effect sites alone, there is a parallel set of points on the y axis with 100LM minor-effect sites but just a single actual major-effect site.

$$L_M^* \simeq L_M + (s_m/s_M)^2 L_m \tag{8}$$

sites in the single-effects model. In other words, from the standpoint of a major-effect site, if sm/sM=1/10, the addition of 100 linked minor-effect sites is required to shift the effective amount of interference from that associated with LM to LM+1 major-effect sites. All of the machinery just introduced for estimating the behavior of linked major-effect loci can then still be relied upon by substituting LM* for L. This scaling with the square of the selection coefficient can be roughly understood by noting that the probability of establishment of a beneficial mutation of effect s is also proportional to s. It is also related to the fact that the variance in fitness, which depresses Ne, is a function of the squared selection coefficient at a site (Hill and Robertson 1966; Santiago and Caballero 1998; Good et al. 2014).

More generally, the overall fixation effective size of the population (which must be the same for both site types within linkage blocks) is dominated by the sites for which the product Lxsx is highest, provided the population size is not so great that the categories in question are in near selection–mutation balance (i.e. no longer influenced by the vagaries of random genetic drift or selective interference). This will now be illustrated by considering three alternative domains of relative values of LMsM and Lmsm (Fig. 7).

Fig. 7.


Relationship of fixation (left) and variance (right) effective population sizes to absolute population sizes, for three relative conditions involving LMsM and Lmsm. (Left) Open and closed points denote results for major- and minor-effect sites, obtained by substituting the mean frequencies of allelic types from computer simulations into Equation (3) and multiplying by N. Dashed lines denote results obtained with Equation (3) using the single-effects allele frequencies from computer simulations, using: L=LM and s=sM in the top panel; L=LM and s=sM (dashed lines) and L=Lm and s=sm (dash-dotted lines) in the middle panel; and L=Lm and s=sm in the bottom panel. Solid lines simply connect the points. Note that regions of the plots where the fixation Ne levels off are within the pseudo-Ne domain, where sites have nonzero equilibrium frequencies of deleterious alleles defined by selection–mutation balance. (Right) Solid points give the coalescence effective population sizes obtained from simulations of linked neutral sites and factoring out the mutation rate from the mean observed neutral heterozygosity to obtain Ne. Dashed lines are taken from the left panels to compare the variance and fixation effective sizes. In all cases, the black dashed line denotes Ne=N.

First, for the extreme situation in which LMsM ≫ Lmsm, mutations at the minor-effect sites are subject to strong hitchhiking effects associated with the major-effect backgrounds upon which they arise. However, the major-effect sites behave in accordance with the predictions from the single-effects model, as they experience essentially no interference from minor-effect sites. With complete linkage, the behavior of the major-effect sites then dictates the Ne for the entire linkage block. As noted above for the single-effects model, the fixation Ne for major-effect sites steadily increases with N, to a degree that depends on LM, but upon reaching a critical N levels off as all such sites are close to pure mutation–selection balance, and then enters into a pseudo-Ne domain.

This pseudo-Ne domain is purely a mathematical feature of the use of Equation (3) to define the fixation Ne. As N reaches high enough levels that p~ approximates the level expected under pure selection–mutation balance, absolute fixation (identity in state) of + alleles never occurs, although genealogical fixation (identity by descent) does. As a consequence, the fixation Ne levels off, as ϕmax (Equation 5a) declines. Although this pseudo-Ne is not a reflection of the actual Ne in the large-N domain, its deployment in the preceding mathematical expressions is required to obtain an acceptable overall expression for p~M. The behavior of the variance Ne, shown in the right panels of Fig. 7, yields some insight into the stochastic features of the population, as it qualitatively tracks the behavior of the fixation Ne (outside of the pseudo-Ne domain), although overestimating the latter. The variance Ne always starts out as Ne=N at NsM ≪ 1, is reduced relative to N at intermediate N by selective interference, and then asymptotically returns to Ne=N for Nsm ≫ 1.

In the example shown, because sm=sM/10, the minor-effect sites do not begin to respond to selection until N is an order of magnitude beyond the point at which the major-effect sites are subject to selection. At this point, the governing fixation Ne is reflected in the behavior of the minor-effect sites, until they too enter their pseudo-Ne domain at very large N. We have not been able to achieve a fully mathematical description of this transitional behavior in the minor-effect sites in this limiting case of LMsM ≫ Lmsm, but an intuitive understanding of the processes involved can be gained as follows.

One might expect the fixation Ne for minor-effect alleles to increase to N once the deterministic regime for major effects has been reached, as in this case there is just a single minor-effect site. However, from the behavior of the variance Ne, it can be seen that the system does not return to Ne=N until N is well beyond the point of entry of the major-effect sites into the deterministic regime. Selective interference from the major-effect sites still occurs (to a degree that increases with LM), owing to the background variation among individuals with respect to major-effect alleles. For example, at N ≈ 10⁶, in this particular set of simulations, the mutation rate to deleterious alleles is ≈ 10⁻⁸, so with sM = 10⁻⁵, the equilibrium frequency of deleterious alleles is ≈ 10⁻³. With LM = 10³, there is then an average of 1.0 deleterious major-effect mutations per individual, and as the distribution among individuals is expected to be Poisson, ∼37% of individuals will be free of deleterious major-effect mutations. Only in this subset of individuals are beneficial minor-effect mutations able to progress towards fixation, as all lineages containing major-effect deleterious alleles will be subject to purging from the population (unless a reversion mutation is acquired), and even then on a large LM background, some of these can become victims of subsequently arriving linked major-effect deleterious mutations. Thus, if LM is sufficiently large, trapping of beneficial minor-effect mutations can impose selective interference effects even in the deterministic regime for major-effect sites.
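The arithmetic behind the ∼37% figure can be reproduced in a few lines. This is only a restatement of the numbers quoted in the paragraph above, with the per-site equilibrium frequency approximated as the ratio of the deleterious mutation rate to the selection coefficient; the variable names are illustrative.

```python
import math

u_del = 1e-8       # mutation rate to deleterious alleles at N of about 1e6
s_major = 1e-5     # selection coefficient at major-effect sites
l_major = 1000     # number of linked major-effect sites

q_eq = u_del / s_major             # approximate equilibrium frequency per site
mean_load = l_major * q_eq         # mean deleterious major-effect mutations per individual
frac_free = math.exp(-mean_load)   # Poisson probability of carrying none

print(f"q = {q_eq:.1e}, mean = {mean_load:.2f}, mutation-free fraction = {frac_free:.3f}")
```

Because the mutation-free class shrinks exponentially with LM, a tenfold larger linkage block under the same assumptions would leave only a tiny fraction of backgrounds on which a minor-effect beneficial mutation could fix unimpeded.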

We next consider the situation in which Lmsm=LMsM, such that there is an inverse relationship between the number and selective effects associated with the two site types. In this case, at sufficiently small N such that the minor-effect sites are effectively neutral, the fixation effective population size is a function of LMsM. A departure between the estimates of ϕ using major- vs. minor-effect sites only arises at larger N as the major-effect sites enter the pseudo-Ne domain (ϕmax<1), while the minor-effect sites remain under the stochastic effects of drift. At this point, the fixation Ne of the minor-effect sites is primarily a function of Lmsm, until they themselves enter their pseudo-Ne domain. This kind of domain shift will become more blurred as sm → sM, in the limit becoming equivalent to the single-effects model with L=LM+Lm.

Finally, for the case in which the minor-effect sites greatly outnumber those for major effects, such that Lmsm ≫ LMsM, the former are largely unaffected by interference from the major-effect sites, rendering the behavior of the minor-effect sites very close to that observed in the single-effects situation as defined by Lm and sm. In this case, the behavior of the major-effect sites is also essentially defined by the minor-effect loci, as LM* = LM + (sm/sM)²Lm ≃ (sm/sM)²Lm. In both this and the prior case, background trapping of minor-effect alleles plays a negligible role in the pseudo-Ne domain for the major-effect sites because much to most of the background variation is associated with the minor-effect sites.

Generalization to multiple site types

With sites with additional effects, one can anticipate an extension of the features noted above. From the standpoint of the major-effect loci, the preceding logic can be extended to an arbitrary number of effects, yielding an interference effective number of major sites equivalent to

$$L_M^* = L_M + \sum_{i=1}^{n} (s_i/s_M)^2 L_i, \tag{9}$$

where Li is the number of loci with effect size si<sM, and n is the number of effect classes. For many polygenic traits, the distribution of site types may be nearly continuous in form, in which case this expression could be replaced by an integration over the full spectrum of site types.
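As a small worked example, the interference-equivalent number of major-effect sites can be computed directly from the list of lower-effect classes. The function name and the example inputs below are illustrative assumptions; the formula is the sum of squared selection-coefficient ratios weighted by site numbers, as in Equations (8) and (9).

```python
def interference_equivalent_sites(l_major, s_major, minor_classes):
    """Equation (9): L_M* = L_M + sum_i (s_i / s_M)^2 * L_i.
    minor_classes: iterable of (L_i, s_i) pairs with s_i < s_M."""
    return l_major + sum((s_i / s_major) ** 2 * l_i for l_i, s_i in minor_classes)

# Two-effect case of Equation (8): 100 minor sites with s_m/s_M = 0.1
# add the equivalent of one major-effect site.
two_effect = interference_equivalent_sites(1, 1e-5, [(100, 1e-6)])

# Three classes with L_x * s_x constant (site numbers 1:10:100).
three_effect = interference_equivalent_sites(1, 1e-5, [(10, 1e-6), (100, 1e-7)])

print(f"{two_effect:.3f} {three_effect:.3f}")
```

The second case shows that when Lxsx is held constant, each successively weaker class contributes tenfold less to the interference felt by the major-effect sites.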

Examples are given in Fig. 8 for the case of three effects with an inverse relationship between site numbers and effects, such that Lxsx is constant, and with 10-fold differences in sx between site types. In this situation, the summed effects of sites within each of the three classes contribute equally to the maximum performance of the trait (assuming additive phenotypic effects across sites). As the population size declines, the site types with larger s progressively accumulate deleterious mutations, yielding a gradient for mean total performance that is nearly continuous over several orders of magnitude of N. Increasing the number of sites within linkage blocks (while keeping the ratios of numbers of site types constant) has a greater effect on the sites with small effects, reducing the steepness of the performance gradient. The precise form of the scaling would be altered with different distributions of effects, which would shift the relative contributions of the three types to mean performance.

Fig. 8.


Response of mean + allele frequencies to population size (N) for the case of the three site types, with an inverse relationship between the number of sites and the selective effects within a class, such that Lxsx=constant, with a ratio of 1:10:100 for major:medium:minor-effect sites. The mutational bias towards + alleles is set to β=0.33. Two situations are shown, with the major site type being present in one (solid points) or 10 (open points) copies. Mean performance is obtained by extension of Equation (7) to three site types, normalized by the value expected when all sites are fixed for + alleles.

Discussion

The primary motivation for this work is the idea that quantitative traits under persistent directional selection of the same form in different phylogenetic lineages should exhibit gradients in mean phenotypes associated with differences in effective population sizes. Such an expectation, which arises for the simple reason that Ne dictates the efficiency of natural selection, should hold generally provided that a significant fraction of mutations with phenotypic effects have selection coefficients within the lower and upper bounds of 1/Ne across phylogenetic lineages. The fact that Ne ranges from 10⁴ to 10⁹ among phylogenetic lineages, scaling negatively with the ∼0.2 power of body mass (Lynch and Trickovic 2020), indicates that the complete absence of the effects of drift on mean phenotypes requires an absence of mutations with fitness effects in the range of 10⁻⁹ to 10⁻⁴, which seems highly implausible. One point of concern, illustrated in the above work, is that the traditional measure of Ne determined from measures of standing variation at silent sites (the variance Ne) is generally larger (in some cases by up to an order of magnitude) than the measure of population size that governs the evolution of mean phenotypes (the fixation Ne) (Supplementary Fig. 4).

Prior theoretical work provided a framework for considering the fundamental population-genetic processes influencing the generation of such gradients, including the effects of linkage block size (an analog of the level of recombination), but under the restriction that all mutations have comparable effects on phenotypes and fitness (Lynch 2020). Here, we have developed more general mathematical approximations for the single-effects model, and used these to further understand the more biologically realistic situation in which genomic sites have different effects. Ultimately, we would like to make statements on the quantitative scaling of mean phenotypes with Ne based on first principles, but this will require detailed information on the distribution of mutational effects summarized over different site types. Unfortunately, the fraction of this distribution that is of most relevance resides within the 1/Ne bounds noted above. Although likely quite abundant, mutations with such small effects are highly impenetrable to direct enumeration (Walsh and Lynch 2018; Lynch and Ho 2020). For now, we at least have a framework within which to derive predictions under specified genomic and population-genetic conditions.

For example, although a fully general description of the steady-state distribution of mean phenotypes under a variable-effect model remains to be developed, the preceding results provide the basis for a heuristic argument as to how mean phenotypes under persistent directional selection should scale with Ne. To clarify the main points, the following qualitative discussion relies on order-of-magnitude arguments and starts with a simple additive genotype-to-phenotype mapping,

$$z = c + k \sum_{i=1}^{n} L_{+,i}\, s_i, \tag{10}$$

such that the expected trait value is a linear function of the number of plus alleles at each of n types of sites, each weighted by the selective advantage, with c and k being arbitrary constants. We wish to determine how the mean genotypic value z¯ scales with Ne.

As noted above, Ne will be largely governed by the sites with the largest fitness effects, unless there is a much stronger than exponential increase in the numbers of sites with diminishing effects. Supposing the sites with the strongest effects have s = 10⁻⁵, then below Ne = 10⁴, all such sites will have expected + allele frequencies at the neutral expectations defined by the level of mutational bias. Above Ne = 10⁶, all such sites will be essentially fixed for + alleles, with most of the gradient residing in the vicinity of Ne ≈ 10⁵. Likewise, sites with s on the order of 10⁻⁶ will exhibit a gradient in the vicinity of Ne = 10⁶, with + alleles just starting to accumulate at Ne ≈ 10⁵ and becoming essentially fixed at Ne ≈ 10⁷. The same argument applies to sites with all lower-order effects, with each order-of-magnitude effect exhibiting a gradient roughly corresponding to where the prior and subsequent ones exhibit maximum responses to Ne.
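The order-of-magnitude argument can be checked by inverting the Li–Bulmer expectation (Equation A1 of the Appendix) to find the Ne at which a site class sits halfway between its neutral expectation and fixation. This is a sketch that ignores selective interference; the helper name and the choice β = 0.33 are assumptions for illustration.

```python
import math

def ne_at_frequency(p, s, beta):
    """Invert Equation (A1): the Ne at which the expected + allele frequency equals p."""
    return math.log(p / (beta * (1.0 - p))) / (2.0 * s)

beta = 0.33                            # assumed mutation bias
p_neutral = beta / (1.0 + beta)        # expectation when Ne*s is far below 1
p_mid = 0.5 * (p_neutral + 1.0)        # halfway between neutral and fixed

for s in (1e-5, 1e-6, 1e-7):
    ne_mid = ne_at_frequency(p_mid, s, beta)
    print(f"s={s:g}  midpoint Ne={ne_mid:.3g}  Ne*s={ne_mid * s:.3f}")
```

Under these assumptions the midpoint falls at a fixed value of Nes somewhat below 1, so each tenfold reduction in s shifts the gradient to a tenfold larger Ne.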

The precise form of the gradient of z¯ will depend on the relative incidences of site types and on the form of the genotype-to-phenotype map. For example, for the linear mapping in Equation (10), if the number of sites of type i is inversely proportional to si (i.e. an essentially exponential distribution of site types), then the contribution of each site type to total performance will be equal over all site types, as in Fig. 8, and there will be a continuous gradient over Ne in the range of 1/smax to 1/smin. (The small wobbles in the gradient shown in Fig. 8 would become essentially invisible with the inclusion of more fine-grained effects). Deviations from an exponential distribution of site types would alter the gradient accordingly, as would a different weighting scheme for the genotype–phenotype map. For example, if there was a paucity of intermediate-effect sites, there would be a shoulder in the response to Ne, as nearly all large-effect sites would become fixed before Ne reaches a high enough level for small-effect sites to respond to selection.

While not a formal mathematical statement, this heuristic argument provides a roadmap for thinking about how gradients of mean phenotypes should scale with effective population sizes for traits under similar forms of directional selection across species. If, for example, there was an exponential distribution of sites with different fitness effects over five orders of magnitude of s, a gradient in performance would be expected over five orders of magnitude of Ne. The actual “power-law” scaling would depend on the scale upon which performance is measured and on the genotype–phenotype map. For a multiplicative mapping function, on a logarithmic scale (the usual procedure in studies of allometry), the expected slope might approach +1, but could be shallower in the case of large linkage blocks, which would enhance the level of selective interference. In contrast, the linear mapping function used above might lead to nonlinear allometric scaling, depending on the distribution of site types.

There is considerable room for more theoretical work in this area. For example, as a surrogate for the level of recombination, we have relied on the concept of a linkage block (Good et al. 2014), which greatly facilitates computational study in the domain of large L and also eases a number of aspects of the mathematical analysis. Although we do not expect qualitative changes in the conclusions to result from a more fully implemented recombinational model, work of this nature is desirable. Most notably, we have focused on a single fitness function, the exponential (or multiplicative) model commonly used in studies of deleterious mutation accumulation. Under this model, there are no epistatic effects of mutations, as each additional deleterious mutation reduces fitness by a fractional amount s regardless of the genetic background. Variants of this model have been invoked to explain how drift barriers may influence the phylogenetic distribution of mutation rates (Lynch 2011; Lynch et al. 2016) and maximum growth rates (Lynch et al. 2022), both of which are plausibly under persistent directional selection in most lineages. Other types of traits that might be explored in this regard are cell biochemical and/or physiological processes shared across the Tree of Life.

Future exploration will need to consider Gaussian and mesa fitness functions, which do introduce epistatic fitness effects. The mesa fitness function (with a plateau) imposes pure directional selection with diminishing fitness increments as the trait approaches the asymptotic optimum, whereas the Gaussian fitness function provides a setting in which traits can be under stabilizing selection for an intermediate optimum. In the limit, as the optimum falls far out of the range of obtainable phenotypes, the Gaussian fitness function converges on the multiplicative model used herein.

Notably, although a substantial body of work in evolutionary quantitative genetics has been developed under the assumption of a Gaussian fitness function, and many evolutionary biologists operate under the assumption that such stabilizing selection is pervasive, a broad survey of estimated fitness functions raises questions about the generality of this model, leaving open the possibility that persistent directional selection is a common force (Kingsolver and Diamond 2011). In the field of evolutionary ecology, arguments for suboptimal performance traditionally invoke limitations owing to constraints/tradeoffs between traits, which are typically assumed but seldom verified empirically. Independent of such issues, we have shown that persistent under-performance can be expected whenever a significant fraction of genomic sites contributing to a trait harbor preferred alleles with small selective advantages, and that this effect will become more pronounced when mutation is biased in the direction of deleterious alleles. Some progress has been made on the study of the expected evolution of mean phenotypes under these alternative models (Charlesworth 2013b; Lynch 2018, 2020), but again under the assumption of mutations with fixed effects. The change in curvilinearity of the fitness function with increasing distance from the optimum will alter the response of mean phenotypes to Ne, but the preceding results suggest that the extension of prior work to scenarios in which genomic sites vary in their effects will be necessary to evaluate the generality of prior conclusions.

Supplementary Material

iyad091_Supplementary_Data

Acknowledgements

We thank the anonymous reviewers for their helpful comments, which helped us to improve the article.

Appendix

Reduction in Ne by selective interference: single-effects case. Observations reported in the text justify the use of a correction factor ϕ=Ne/N to transform N into a fixation effective population size relevant to the evolution of mean allele frequencies. Validation of this approach required the use of estimates of ϕ derived by computer simulations. For more practical applications, we require a mathematical expression for ϕ derived from first principles. A heuristic approximation can be obtained by considering the number (I) of competing mutations that a mutation destined to fixation must contend with during its sojourn through the population. Gerrish and Lenski (1998) and Campos and Wahl (2009, 2010) used such an approach to evaluate the number of newly arising mutations with advantages exceeding that of a target mutation, under the assumption of an exponential distribution of mutational fitness effects, but here we consider the case in which all newly arising beneficial mutations have identical effects, albeit residing in different genetic backgrounds.

We start with the Li–Bulmer equation for the expected frequency of a beneficial allele under sequential fixations

$$\tilde{p} = \frac{\beta e^{2N_e s}}{1 + \beta e^{2N_e s}}, \tag{A1}$$

where β=u01/u10 is the ratio of mutational pressure towards the beneficial relative to the deleterious allele, s is the selective advantage of the beneficial allele, and Ne is the effective population size. Rearrangement of Equation A1 leads to

$$\phi = \frac{N_e}{N} = \left(\frac{1}{2Ns}\right) \ln\left(\frac{\tilde{p}}{\beta(1-\tilde{p})}\right). \tag{A2}$$

Although Equation (A1) implies that p~ asymptotically approaches 1.0 as the population size approaches infinity, in this extreme, deleterious alleles will actually be maintained at a low frequency by mutation-selection balance, such that

$$\tilde{p} = 1 - \frac{(1+\beta)u_{10} + s(1-u_{01}) - \sqrt{[(1+\beta)u_{10} + s(1-u_{01})]^2 - 4u_{10}s}}{2s}. \tag{A3a}$$

Provided the strength of selection exceeds u10,

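$$\tilde{p} \simeq \frac{u_{01} + s}{u_{01} + u_{10} + s}. \tag{A3b}$$

Substituting Equation (A3b) into Equation (A2) then yields an upper bound to ϕ,

$$\phi_{\max} \simeq \frac{1}{1 + [2Ns / \ln\{1 + [s/(\beta u_{10})]\}]}, \tag{A4}$$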

where a slight modification has been made by adding 1 to the denominator to allow for the fact that ϕmax must asymptotically approach 1 as N → 1 (as this eliminates interference). Consistent with this expression, the simulation results show that once Ns exceeds 10, ϕ becomes inversely proportional to Ns, and only weakly dependent on the composite selection–mutation parameters subsumed into s/(βu10) (Supplementary Fig. 2). To allow for the further depressive effects of selective interference from linked mutations, we use
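The chain from Equation (A3a) to Equation (A4) can be checked numerically. The sketch below evaluates the exact and approximate selection–mutation balance frequencies, substitutes the approximation into Equation (A2), and confirms that adding 1 to the denominator reproduces Equation (A4). Function names and parameter values are assumptions for illustration only.

```python
import math

def p_exact(s, u10, beta):
    """Equation (A3a): + allele frequency at selection-mutation balance."""
    u01 = beta * u10
    b = (1.0 + beta) * u10 + s * (1.0 - u01)
    return 1.0 - (b - math.sqrt(b * b - 4.0 * u10 * s)) / (2.0 * s)

def p_approx(s, u10, beta):
    """Equation (A3b): approximation valid when s exceeds u10."""
    u01 = beta * u10
    return (u01 + s) / (u01 + u10 + s)

def phi_max(n, s, u10, beta):
    """Equation (A4): upper bound on phi = Ne/N."""
    return 1.0 / (1.0 + 2.0 * n * s / math.log(1.0 + s / (beta * u10)))

n, s, u10, beta = 1e6, 1e-5, 1e-8, 0.33      # illustrative values
p = p_approx(s, u10, beta)
phi_raw = math.log(p / (beta * (1.0 - p))) / (2.0 * n * s)   # Equation (A2)

print(f"exact p = {p_exact(s, u10, beta):.6f}, approx p = {p:.6f}")
print(f"phi from (A2) = {phi_raw:.4f}, phi_max from (A4) = {phi_max(n, s, u10, beta):.4f}")
```

With Ns = 10 in this example, the bound from Equation (A4) is somewhat smaller than the raw value from Equation (A2), reflecting the added 1 in the denominator; the two converge as Ns grows.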

$$\phi = \frac{\phi_{\max}}{1 + I}, \tag{A5}$$

where it is assumed that when combined with I interfering mutations, a target mutation destined to fixation in the absence of interference has its probability of fixation reduced by factor 1/(1+I). This approach ignores the possibility that the pool of mutations interfering with the fixation of a focal beneficial mutation can also interfere with themselves.

We now proceed towards the development of an estimator for I, progressively incorporating the number of potentially interfering mutations arising during a focal mutation’s sojourn through the population, along with the magnitude of the effect per interfering mutation. Let τ be the mean time to fixation of a beneficial allele in the absence of competition from other segregating mutations. During this period, additional beneficial mutations will arise in individuals outside of the focal lineage at an average rate Lβu10(1 − p*), where p* denotes the expected mean frequency of + alleles. The average per-generation number of individuals in the target lineage is ≃ N/2 because the frequency of the lineage under consideration (assuming it does indeed fix) increases from essentially zero to one. Only a fraction pf(s) of all newly arisen beneficial mutations are destined to fixation, and it is this subset that presents the most potential interference to the focal mutation. Moreover, the strength of selection operating on a mutation must be on the order of the magnitude of genetic drift or greater if it is to compete for fixation; and to allow for this, we use the weighting term Nes/(1+Nes), which asymptotically approaches zero as Nes → 0 and 1.0 as Nes → ∞.

Taking all of these factors into consideration, the effective number of competing mutations is then proportional to the product of terms,

$$I \simeq \tau \cdot L\beta u_{10}(1 - p^*) \cdot \frac{N}{2} \cdot p_f(s) \cdot \frac{N_e s}{1 + N_e s} \cdot k, \tag{A6}$$

where

$$p_f(s) = \frac{1 - e^{-2N_e s/N}}{1 - e^{-2N_e s}} \tag{A7}$$

is the probability of fixation of a newly arisen mutation with fitness benefit s (Kimura 1983), and k is a term to correct for the fact that only a fraction of newly arising mutations emerge in backgrounds with high enough fitness to compete with the target mutation.
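A minimal implementation of Equation (A7) makes its two limits easy to see: the neutral value 1/N as s approaches zero, and roughly 2s(Ne/N) under strong selection. The function name is an assumption; `math.expm1` is used only to keep the calculation accurate when the exponents are very small.

```python
import math

def p_fix(n, ne, s):
    """Equation (A7): fixation probability of a new mutation with benefit s
    in a haploid population of size n with effective size ne."""
    if s == 0.0:
        return 1.0 / n                     # neutral limit
    # expm1(-a) / expm1(-b) equals (1 - e^-a) / (1 - e^-b)
    return math.expm1(-2.0 * ne * s / n) / math.expm1(-2.0 * ne * s)

print(p_fix(1000, 1000, 1e-9))      # effectively neutral: close to 1/N
print(p_fix(1e6, 1e6, 0.01))        # strong selection, Ne = N: close to 2s
print(p_fix(1e6, 1e5, 0.01))        # strong selection, Ne = N/10: close to 2s/10
```

The third call illustrates the point made in the text that approximating the fixation probability as 2s ignores the dependence on Ne/N.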

Although k must depend on the haplotype distribution of fitness in the population, a rough starting point for large L is k=0.25, as any mutant haplotype destined to fixation by positive selection must almost certainly be in the upper half of the distribution, and to be a successful competitor, any outside mutant clone must then be in the upper half of the upper half. Previous workers (Gerrish and Lenski 1998; Campos and Wahl 2009, 2010) have let k=1, which applies if prior to the emergence of competing mutations, the population consists of just two haplotypes, one with and the other without the target mutation. In principle, a more rigorous approach might be possible if the form of the equilibrium distribution were known, but although this is often assumed to be Poisson, there are subtle and significant deviations from such behavior in numerous contexts (Gessler 1995; Goyal et al. 2012; Jain and John 2016), and k ≃ 0.25 will be shown below to be a reasonable approximation for an equilibrium population for L>100. Note also that the matter of lineage contamination, i.e. the addition of secondary mutations to lineages en route to fixation (Pénisson et al. 2017) has been ignored here, as the population is in equilibrium, and all competing mutant lineages are presumably confronted with the same secondary-mutation issues; in effect, as N → ∞, all mutant lineages approach selection–mutation balance, rendering a near neutral situation with respect to lineage competition.

Aside from the inclusion of the factor k, our computation of I differs in several significant ways from prior applications. First, rather than approximating the fixation probability as 2s, we implement the full formulation for pf(s), as the former yields inappropriate estimates when Nes<1 and ignores the dependency of Ne on N, the very issue that we are exploring. Second, prior applications have not included the weighting term noted above, which also seems essential for Nes<1, for which the magnitude of individual interference must be small. Third, the estimation of ϕ is quantitatively quite sensitive to the definition of τ, and whereas previous authors have used the deterministic approximation τ=2ln(N)/s, this can give wildly unrealistic values in certain domains of parameter space, including times in excess of the neutral expectation of Ne generations. Charlesworth (2020) provides a broad overview of estimators of τ and offers a measure that explicitly corrects for the stochastic and deterministic phases of the process, but the implementation of his method in Equation (A6) consistently led to significant underestimates of I, whereas a derivation of Gale (1990), modified for haploids, was more suitable,

τ3.927+2ln(Nes/2)s2ϕ2Ns. (A8)

Note that the expressions of Gale (1990) and Charlesworth (2020) can yield negative estimates of τ when Nes<1, in which case τ = 2Ne[1 − (Nes)²/18], derived by Charlesworth (2022) for the weak selection regime, is useful. Below, we find mathematical expressions for ϕ in the small Nes limit using the τ obtained in Charlesworth (2022); however, Equation (A8) is useful in the Nes>1 limit. Finally, Equation (A6) requires an expression for p* (a concern given that p* is the allele frequency that we are ultimately trying to determine), but here we utilize Equation (A1) as a first-order approximation.

Equation (A5) is a transcendental function, as several of the terms entering I are complex functions of Ne=ϕN: the fixation probability and time, the weighting factor, and p*, although the equation can be solved by iteration. Despite the complexity of the underlying issues and the approximate nature of the derivation, Equation (A5) generally yields estimates of ϕ that are within 30% (often considerably closer) of simulation results (Supplementary Fig. 2). Although this is not a fully satisfactory outcome, given that ϕ varies 10,000-fold over the full range of parameter space, this heuristic solution appears to capture the essence of the system, is an upgrade to the visual-fit interpretation of Lynch (2020), and provides insight into the regions of parameter space that merit further consideration. Major discrepancies appear to be restricted to very large linkage blocks (of order L = 10⁶) with Ns1, where ϕ is underestimated up to 2-fold. As can be seen in Supplementary Fig. 2, the primary determinant of ϕ is Ns, with ϕ only responding in a significant way after Ns exceeds a threshold value near 1 for small L and 0.01 for very large L. Not surprisingly, larger L leads to stronger interference, but in all cases, mutational bias (β) has a secondary effect.

A simpler approximate solution is obtainable under the assumption of Nes ≪ 1, which allows the approximations τ ≃ 2Ne[1 − (Nes)²/18] and pf(s) ≃ (2Nes/N)/[1 − e^(−2Nes)], reducing Equation (A6) to

I2kβ(Nes)3(1118(Nes)2)g(1+Nes)(1e2Nes)(1+βe2Nes), (A9)

where g=s/(Lu10). Letting x=Nes and referring back to Equation (A5), this rearranges to another transcendental equation

x+2kβx4(1118x2)g(1+x)(1e2x)(1+βe2x)=Nsϕmax. (A10)
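Equation (A10) has no closed-form solution, but it is straightforward to solve numerically. The following is a minimal sketch using bisection rather than the iteration mentioned above; the function names and the parameter values in the usage line are illustrative assumptions, not values taken from the simulations. It relies on the fact that the second term on the left side is non-negative for $x^2 < 18$, so the root lies between 0 and $Ns\,\phi_{max}$.

```python
import math

def lhs_a10(x, k, beta, g):
    """Left side of Equation (A10), with x = Ne*s."""
    if x == 0.0:
        return 0.0
    one_minus_exp = -math.expm1(-2.0 * x)  # 1 - exp(-2x), accurate for small x
    interference = (2.0 * k * beta * x**4 * (1.0 - x * x / 18.0)
                    / (g * (1.0 + x) * one_minus_exp
                       * (1.0 + beta * math.exp(2.0 * x))))
    return x + interference

def solve_phi(Ns, phi_max, k, beta, g, iterations=200):
    """Bisection for x on [0, Ns*phi_max]; returns phi = x / Ns."""
    target = Ns * phi_max
    if target * target >= 18.0:
        raise ValueError("weak-selection form assumed: need (Ns*phi_max)^2 < 18")
    lo, hi = 0.0, target  # lhs(0) = 0 < target <= lhs(target)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if lhs_a10(mid, k, beta, g) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) / Ns

# Illustrative (hypothetical) parameter values
phi = solve_phi(Ns=0.5, phi_max=1.0, k=0.25, beta=1.0, g=0.1)
print(f"phi = {phi:.4f}")
```

Because Equation (A10) rests on the weak-selection approximations for $\tau$ and $p_f(s)$, the result is only meaningful when the returned $x = \phi Ns$ is small.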

For small x, Taylor-series expansion of the left side of Equation (A10) leads to

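$$x + \frac{k\beta x^3}{g(1+\beta)}\left(1 - \frac{2\beta x}{1+\beta} + \frac{(5 - 26\beta + 41\beta^2)\,x^2}{18(1+\beta)^2} + \cdots\right) = Ns\,\phi_{max}. \tag{A11a}$$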
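As a numerical sanity check on this expansion, the interference term of Equation (A10) can be compared with the truncated series of Equation (A11a). The sketch below uses hypothetical function names and arbitrary illustrative parameter values; if the series is correct through order $x^2$, the relative error should shrink roughly as $x^3$.

```python
import math

def interference_exact(x, k, beta, g):
    """Second term on the left side of Equation (A10)."""
    return (2.0 * k * beta * x**4 * (1.0 - x * x / 18.0)
            / (g * (1.0 + x) * (-math.expm1(-2.0 * x))
               * (1.0 + beta * math.exp(2.0 * x))))

def interference_series(x, k, beta, g):
    """Same term from Equation (A11a), truncated after the x^2 correction."""
    b = 1.0 + beta
    bracket = (1.0 - 2.0 * beta * x / b
               + (5.0 - 26.0 * beta + 41.0 * beta**2) * x * x / (18.0 * b * b))
    return k * beta * x**3 / (g * b) * bracket

def relative_error(x, k=0.25, beta=2.0, g=0.1):
    """Relative difference between the series and the exact term."""
    exact = interference_exact(x, k, beta, g)
    return abs(interference_series(x, k, beta, g) / exact - 1.0)

for x in (0.2, 0.1, 0.05):
    print(x, relative_error(x))
```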

For small $x$, a cubic equation in terms of $x$ is obtained by neglecting the $x$-dependent terms within the large parentheses,

$$x + \frac{k\beta x^3}{g(1+\beta)} = Ns\,\phi_{max}. \tag{A11b}$$

For small $L$ and $u_{10}/s \ll 1$, the parameter $g \gg 1$, so the above equation simplifies to $x \approx Ns\,\phi_{max}$, and hence

$$\phi = \frac{x}{Ns} = \phi_{max}. \tag{A12}$$

In the other limit, as $L \to \infty$, the parameter $g \to 0$, so the second term on the left side of Equation (A11a) dominates; ignoring the algebraic terms within the large parentheses in Equation (A11a) then leads to the approximation

$$\phi \approx \left(\frac{g(1+\beta)\,\phi_{max}}{k\beta (Ns)^2}\right)^{1/3}. \tag{A13}$$

This expression yields scaling relationships that are fairly similar to those generated by less formal methods in Lynch (2020, Equation 11), which suggested $\phi$ to be an approximately inverse function of $N s^{3/4} L^{1/3} \beta^{1/4} u_{10}^{1/3}$ over the full domain of $Ns$, noting that here there are additional terms in $\phi_{max}$. Although Equation (A13) can yield $\phi > 1$ at small $Ns$, this can be accommodated by simply setting $\phi = 1$ at this point.
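The two limits, Equations (A12) and (A13), can be compared against a direct numerical solution of the cubic Equation (A11b). The following sketch assumes illustrative parameter values and hypothetical function names; the cap at $\phi = 1$ follows the accommodation described above.

```python
def solve_cubic_a11b(Ns, phi_max, k, beta, g, iterations=200):
    """Solve x + c*x^3 = Ns*phi_max, c = k*beta/(g*(1+beta)); return phi = x/Ns."""
    c = k * beta / (g * (1.0 + beta))
    target = Ns * phi_max
    lo, hi = 0.0, target  # left side is increasing and >= x, so the root is in [0, target]
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid + c * mid**3 < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) / Ns

def phi_a13(Ns, phi_max, k, beta, g):
    """Small-g (large-L) limit, Equation (A13), capped at 1."""
    return min(1.0, (g * (1.0 + beta) * phi_max / (k * beta * Ns**2)) ** (1.0 / 3.0))

# Hypothetical values: g >> 1 should approach phi_max, g << 1 should approach (A13)
for g in (1e6, 1e-6):
    print(g, solve_cubic_a11b(1.0, 1.0, 0.25, 1.0, g), phi_a13(1.0, 1.0, 0.25, 1.0, g))
```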

The preceding approach yields estimates of $\tilde{p}$ that are in close accord with observations from computer simulations. In Fig. 3, results are given for $L = 10^3$ to $10^6$, letting $k = 0.25$ and using Equation (A11b) (solid lines) vs. the simplified Equation (A13). Although the fits are not perfect, they provide quite good approximations to $\tilde{p}$, with the utility of Equation (A13) breaking down with small $L$, as expected. For $L = 10$ and 100, modifications are required for the expression for $I$, but we find a nearly perfect fit to the data by setting $k = 1$ and $\phi_{max} = 1$. Finally, although most of the computer simulations used herein relied upon a particular scaling between $N$ and the mutation rate, results in Supplementary Fig. 3 show that the algebraic results provided above extend to a full spectrum of mutation rates, as expected from the general nature of the derivations.

The effect of background loci was considered by Good and Desai (2013) in an effort to find the variance and the fixation effective population size in the limiting case of $Ns \ll NLu_{10} \ll 1$ for $\beta = 1$. As inferred from the agreement with computer simulations, our calculations seem to cover a broader range of parameter space, but they do not exactly match the expression obtained by Good and Desai (2013) in the limit $Ns \ll NLu_{10} \ll 1$. Although $N_e \approx N$ in this range, the correction is of order $(Ns)^2$ in Good and Desai (2013), whereas our calculation gives a correction of order $Ns$. We also find that the correction is of order $Ns$ for $\beta \ne 1$ using the same approach as in their paper.

Reduction in $N_e$ by selective interference from minor-effect loci: two-effect case. A large body of literature in the concurrent mutation regime is based on a semi-deterministic approach where the bulk of the distribution of frequency classes follows a deterministic equation and the stochastic noise enters only at the nose of the distribution (Rouzine et al. 2008; Goyal et al. 2012; John and Jain 2015). However, these studies consider uniform sites with single mutational effects. Here, we try to understand the connection between the single-effects model and the two-effect case using the deterministic equation for the frequency $P_{i,j}$ of the class containing $i$ deleterious major-effect and $j$ deleterious minor-effect loci, out of totals of $L_M$ major-effect and $L_m$ minor-effect sites, respectively,

$$\frac{\partial P_{i,j}}{\partial t} = u_{10}[L_M - (i-1)]P_{i-1,j} + \beta u_{10}(i+1)P_{i+1,j} + u_{10}[L_m - (j-1)]P_{i,j-1} + \beta u_{10}(j+1)P_{i,j+1} - \left[Lu_{10} + (i+j)(\beta-1)u_{10} + (i s_M + j s_m - \kappa)\right]P_{i,j}, \tag{A14}$$

where $L = L_M + L_m$ and $\kappa = \sum_{i=0}^{L_M}\sum_{j=0}^{L_m}(i s_M + j s_m)P_{i,j}$.
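One way to check the bookkeeping in Equation (A14) is to evaluate its right side on a small grid of classes and confirm that the derivatives sum to zero, i.e. that total frequency is conserved when $\sum P_{i,j} = 1$. The sketch below is only an illustration under assumed values (a uniform starting distribution and arbitrary $u_{10}$, $\beta$, $s_M$, $s_m$); classes outside the grid are taken to have zero frequency.

```python
def rhs_a14(P, u10, beta, sM, sm):
    """Right side of Equation (A14).

    P[i][j] is the frequency of the class with i deleterious major-effect
    and j deleterious minor-effect sites (0 <= i <= LM, 0 <= j <= Lm).
    """
    LM, Lm = len(P) - 1, len(P[0]) - 1
    L = LM + Lm
    # kappa: mean selective disadvantage over all classes
    kappa = sum((i * sM + j * sm) * P[i][j]
                for i in range(LM + 1) for j in range(Lm + 1))

    def get(i, j):
        # Classes outside the grid have zero frequency
        return P[i][j] if 0 <= i <= LM and 0 <= j <= Lm else 0.0

    dP = [[0.0] * (Lm + 1) for _ in range(LM + 1)]
    for i in range(LM + 1):
        for j in range(Lm + 1):
            gain = (u10 * (LM - (i - 1)) * get(i - 1, j)
                    + beta * u10 * (i + 1) * get(i + 1, j)
                    + u10 * (Lm - (j - 1)) * get(i, j - 1)
                    + beta * u10 * (j + 1) * get(i, j + 1))
            loss = (L * u10 + (i + j) * (beta - 1.0) * u10
                    + (i * sM + j * sm - kappa))
            dP[i][j] = gain - loss * P[i][j]
    return dP

# Hypothetical example: uniform distribution over a 4 x 6 grid of classes
LM, Lm = 3, 5
n = (LM + 1) * (Lm + 1)
P = [[1.0 / n] * (Lm + 1) for _ in range(LM + 1)]
dP = rhs_a14(P, u10=1e-3, beta=2.0, sM=0.01, sm=0.001)
total = sum(sum(row) for row in dP)
print(total)
```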

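Letting $i s_M + j s_m = (i + \gamma j)s_M = k s_M$, where $\gamma = s_m/s_M \le 1$ and $k = i + \gamma j$, the two-effect equation can be reduced to an effective one-effect equation with the frequency class $P_{i,j}$ becoming $P_k$ with associated selection $k s_M$,

$$\frac{\partial P_k}{\partial t} = u_{10}(L_M - i + 1)P_{k-1} + \beta u_{10}(i+1)P_{k+1} + u_{10}(L_m - j + 1)P_{k-\gamma} + \beta u_{10}(j+1)P_{k+\gamma} - \left[Lu_{10} + (i+j)(\beta-1)u_{10} + k s_M - \kappa\right]P_k. \tag{A15}$$

Dividing by $Lu_{10}$ everywhere,

$$\frac{\partial P_k}{\partial T} = \eta_M(1 - a_{i-1})P_{k-1} + \beta\eta_M a_{i+1}P_{k+1} + \eta_m(1 - b_{j-1})P_{k-\gamma} + \beta\eta_m b_{j+1}P_{k+\gamma} - \left[1 + (\eta_M a_i + \eta_m b_j)(\beta-1) + \eta_M g_M (k - \bar{k})\right]P_k, \tag{A16}$$

where $T = Lu_{10}\,t$, $\eta_M = L_M/L$, $\eta_m = L_m/L$, $a_i = i/L_M$, $b_j = j/L_m$, $g_M = s_M/(L_M u_{10})$, and $\bar{k} = \kappa/s_M$.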

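When the joint distribution $P_{i,j}$ is far from the edges, the approximations $a_i \approx q_M = 1 - p_M = \frac{1}{L_M}\sum_{i=0}^{L_M}\sum_{j=0}^{L_m} i\,P_{i,j}$ and $b_j \approx q_m = 1 - p_m = \frac{1}{L_m}\sum_{i=0}^{L_M}\sum_{j=0}^{L_m} j\,P_{i,j}$ hold. Here, we follow an approach similar to Rouzine et al. (2008) and assume that the logarithm of the frequency is a smooth function, and use the approximation $\ln(P_{k \pm h}) \approx \ln(P_k) \pm h\,\frac{\partial \ln(P_k)}{\partial k}$. In the steady state, the above equation then becomes

$$0 = \eta_M(1-\tilde{q}_M)\,e^{-\frac{d\ln\tilde{P}_k}{dk}} + \beta\eta_M\tilde{q}_M\,e^{\frac{d\ln\tilde{P}_k}{dk}} + \eta_m(1-\tilde{q}_m)\,e^{-\gamma\frac{d\ln\tilde{P}_k}{dk}} + \beta\eta_m\tilde{q}_m\,e^{\gamma\frac{d\ln\tilde{P}_k}{dk}} - \eta_M g_M(k - \tilde{k}) + (1-\beta)(\eta_M\tilde{q}_M + \eta_m\tilde{q}_m) - 1, \tag{A17a}$$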

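which, after letting $x = k - \tilde{k}$, where $\tilde{k} = L_M\tilde{q}_M + \gamma L_m\tilde{q}_m$, and $\psi'(x) = d\ln(\tilde{P}_k)/dk$ (so that $\psi(x) = \ln\tilde{P}_k$), can be rewritten as

$$\eta_M g_M x - (1-\beta)(\eta_M\tilde{q}_M + \eta_m\tilde{q}_m) + 1 = \eta_M(1-\tilde{q}_M)\,e^{-\psi'(x)} + \beta\eta_M\tilde{q}_M\,e^{\psi'(x)} + \eta_m(1-\tilde{q}_m)\,e^{-\gamma\psi'(x)} + \beta\eta_m\tilde{q}_m\,e^{\gamma\psi'(x)}. \tag{A17b}$$

Although the above equation is not exactly solvable, by expanding the exponentials to first order for small $\psi'(x)$ and integrating over $x$, we get

$$\psi(x) = \ln\sqrt{\frac{\eta_M g_M}{2\pi\left[(\eta_M + \gamma\eta_m) - (1+\beta)(\eta_M\tilde{q}_M + \gamma\eta_m\tilde{q}_m)\right]}} - \frac{\eta_M g_M x^2}{2\left[(\eta_M + \gamma\eta_m) - (1+\beta)(\eta_M\tilde{q}_M + \gamma\eta_m\tilde{q}_m)\right]}. \tag{A18a}$$

The first term on the right side is obtained from the normalization condition, $\int_{-\infty}^{\infty} e^{\psi(x)}\,dx = 1$, assuming $x$ is continuous. Equation (A18a) can be rewritten as

$$\psi(x) = \ln\sqrt{\frac{\rho}{2\pi\left[1 - (1+\beta)\tilde{\varphi}\right]}} - \frac{\rho x^2}{2\left[1 - (1+\beta)\tilde{\varphi}\right]}, \tag{A18b}$$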

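with $\rho = g_M/(1+\zeta)$, $\zeta = L_m s_m/(L_M s_M)$, and $\tilde{\varphi} = (\tilde{q}_M + \zeta\tilde{q}_m)/(1+\zeta)$. Because we have assumed that the width of the distribution is large, the above calculation is valid for $\rho \ll 1$. The above expression yields a Gaussian distribution for $k = i + \gamma j$,

$$P_k = \sqrt{\frac{\rho}{2\pi\left[1 - (1+\beta)\frac{\tilde{k}}{L_M(1+\zeta)}\right]}}\;\exp\left(-\frac{\rho\,(k - \tilde{k})^2}{2\left[1 - (1+\beta)\frac{\tilde{k}}{L_M(1+\zeta)}\right]}\right), \tag{A18c}$$

with mean $\tilde{k} = L_M\tilde{q}_M + \gamma L_m\tilde{q}_m$. The variance of the distribution can be expressed in terms of the mean of the distribution,

$$\tilde{c}^2 = \left(\frac{L_M u_{10}}{s_M} + \gamma^2\frac{L_m u_{10}}{s_m}\right)\left[1 - (1+\beta)\frac{\tilde{k}}{k_{max}}\right], \tag{A19}$$

where $k_{max} = L_M + \gamma L_m$. Note that previous studies have also found this Gaussian distribution in the single-effects case both for infinite and finite loci models (Rouzine et al. 2008; Goyal et al. 2012; Jain and John 2016).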
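The equivalence between the variance implied by Equation (A18c), $[1 - (1+\beta)\tilde{k}/k_{max}]/\rho$, and the form given in Equation (A19) can be confirmed numerically. The sketch below uses hypothetical parameter values chosen so that $\rho \ll 1$, treats $\tilde{k}$ as a given input (it is not derived here), and also checks that the Gaussian sums to approximately 1 over a unit-spaced grid of $k$ values.

```python
import math

def variance_a19(LM, Lm, u10, sM, sm, beta, k_mean):
    """Variance from Equation (A19)."""
    gamma = sm / sM
    k_max = LM + gamma * Lm
    return ((LM * u10 / sM + gamma**2 * Lm * u10 / sm)
            * (1.0 - (1.0 + beta) * k_mean / k_max))

def variance_a18c(LM, Lm, u10, sM, sm, beta, k_mean):
    """Variance read off the exponent of Equation (A18c)."""
    zeta = Lm * sm / (LM * sM)
    rho = (sM / (LM * u10)) / (1.0 + zeta)  # rho = g_M / (1 + zeta)
    return (1.0 - (1.0 + beta) * k_mean / (LM * (1.0 + zeta))) / rho

def gaussian_pk(k, k_mean, var):
    """Gaussian density of Equation (A18c) with the given mean and variance."""
    return math.exp(-(k - k_mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

# Hypothetical values; k_mean is an assumed input, not a derived quantity
params = dict(LM=1000, Lm=10000, u10=1e-3, sM=0.01, sm=0.001, beta=1.0, k_mean=400.0)
v19 = variance_a19(**params)
v18 = variance_a18c(**params)
norm = sum(gaussian_pk(k, params["k_mean"], v19) for k in range(0, 2001))
print(v19, v18, norm)
```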

The single-effects distribution can be reproduced by setting $s_M = s_m$ or taking the number of major or minor loci to zero in Equation (A18c). Here, the variance (Equation (A19)) produced by the minor-effect loci in the distribution of the effective number of deleterious major loci is smaller by a factor of $\gamma^2$ than that associated with the major-effect sites, suggesting that the influence of a minor-effect locus on a major-effect locus is $\gamma^2$ times smaller than its influence on itself. The calculations in this section are based on the deterministic evolution equation for the fitness class of effective deleterious major loci, and the distribution (Equation (A18c)) is a good approximation near $k \approx \tilde{k}$. Stochastic noise plays a major role near the edges of the distribution, and therefore $\tilde{k}$ will depend upon the population size through the stochastic conditions near the edges. However, the relative contributions of major- and minor-effect loci to the variance are unaffected by the stochasticity because the relative contribution is independent of $\tilde{k}$.

This variance in the distribution increases the interference effect, which decreases the effective population size in a finite population due to linkage. Santiago and Caballero (1998) found an expression for the effective population size as a function of the genetic variance. In our simulation results, we observe that the interference effect caused by the minor-effect loci on the major-effect loci follows the $\gamma^2$ reduction effect. The variance can also be written as $\tilde{c}^2 = \frac{L_M u_{10}}{s_M}(1+\zeta)\left[1 - (1+\beta)\frac{\tilde{k}}{k_{max}}\right]$, which indicates that the effective population size will be dominated by the major-effect loci when $\zeta \ll 1$ or $L_M s_M \gg L_m s_m$ and by minor-effect loci in the opposite regime.
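To make the role of $\zeta$ concrete, the two terms in the first factor of Equation (A19) can be computed separately; their ratio (minor over major) should equal $\zeta = L_m s_m/(L_M s_M)$. The following short sketch uses hypothetical function names and illustrative parameter values only.

```python
import math

def variance_terms(LM, Lm, u10, sM, sm):
    """Major- and minor-effect terms of the first factor in Equation (A19)."""
    gamma = sm / sM
    major = LM * u10 / sM
    minor = gamma**2 * Lm * u10 / sm
    return major, minor

def zeta(LM, Lm, sM, sm):
    """zeta = Lm*sm / (LM*sM), as defined alongside Equation (A18b)."""
    return (Lm * sm) / (LM * sM)

# Hypothetical example: ten times more minor-effect sites, each with one-tenth the effect
major, minor = variance_terms(LM=1000, Lm=10000, u10=1e-3, sM=0.01, sm=0.001)
z = zeta(LM=1000, Lm=10000, sM=0.01, sm=0.001)
print(minor / major, z, major / (major + minor))
```

The last printed value is the share of the variance attributable to major-effect loci, $1/(1+\zeta)$, which approaches 1 when $\zeta \ll 1$.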

Contributor Information

Archana Devi, Biodesign Center for Mechanisms of Evolution, Arizona State University, Tempe, AZ 85287, USA.

Gil Speyer, Knowledge Enterprise, Arizona State University, Tempe, AZ 85287, USA.

Michael Lynch, Biodesign Center for Mechanisms of Evolution, Arizona State University, Tempe, AZ 85287, USA.

Data availability

The authors affirm that all data necessary for confirming the conclusions presented in the article are represented fully within the article and figures. The C++ code for the simulation data can be found on the GitHub website (https://github.com/ArchanaDevi8474/ThreeEffectsSimulationCode). Supplemental material available at GENETICS online.

Funding

This research was supported by the Multidisciplinary University Research Initiative awards W911NF-09-1-0444 and W911NF-09-1-0444 from the US Army Research Office, National Institutes of Health award R35-GM122566-01, National Science Foundation awards DBI-2119963, DEB-1927159, and MCB-1518060, and Moore and Simons Foundations Grant 735927.

Author contributions

AD and ML conceptualized the theory, derived the mathematical approximations, and wrote the article. ML developed the computer programs associated with the work, with assistance from GS in parallelizing the code.

Literature cited

  1. Bar-Even A, Noor E, Savir Y, Liebermeister W, Davidi D, Tawfik DS, Milo R. The moderately efficient enzyme: evolutionary and physicochemical trends shaping enzyme parameters. Biochemistry. 2011;50:4402–4410. doi: 10.1021/bi2002289
  2. Bulmer MG. The Mathematical Theory of Quantitative Genetics. Oxford (UK): Oxford University Press; 1980.
  3. Bulmer M. The selection-mutation-drift theory of synonymous codon usage. Genetics. 1991;129:897–907. doi: 10.1093/genetics/129.3.897
  4. Campos JL, Charlesworth B. The effects on neutral variability of recurrent selective sweeps and background selection. Genetics. 2019;212:287–303. doi: 10.1534/genetics.119.301951
  5. Campos PRA, Wahl LM. The effects of population bottlenecks on clonal interference, and the adaptation effective population size. Evolution. 2009;63:950–958. doi: 10.1111/j.1558-5646.2008.00595.x
  6. Campos PRA, Wahl LM. The adaptation rate of asexuals: deleterious mutations, clonal interference and population bottlenecks. Evolution. 2010;64:1973–1983.
  7. Charlesworth B. Background selection 20 years on. J Hered. 2013a;104:161–171. doi: 10.1093/jhered/ess136
  8. Charlesworth B. Stabilizing selection, purifying selection, and mutational bias in finite populations. Genetics. 2013b;194:955–971. doi: 10.1534/genetics.113.151555
  9. Charlesworth B. How long does it take to fix a favorable mutation, and why should we care? Am Nat. 2020;195:753–771. doi: 10.1086/708187
  10. Charlesworth B. The effects of weak selection on neutral diversity at linked sites. Genetics. 2022;221:iyac027. doi: 10.1093/genetics/iyac027
  11. Charlesworth B, Betancourt AJ, Kaiser VB, Gordo I. Genetic recombination and molecular evolution. Cold Spring Harb Symp Quant Biol. 2009;74:177–186. doi: 10.1101/sqb.2009.74.015
  12. Charlesworth B, Campos JL. The relations between recombination rate and patterns of molecular variation and evolution in Drosophila. Annu Rev Genet. 2014;48:383–403. doi: 10.1146/annurev-genet-120213-092525
  13. Charlesworth D, Charlesworth B, Morgan MT. The pattern of neutral molecular variation under the background selection model. Genetics. 1995;141:1619–1632. doi: 10.1093/genetics/141.4.1619
  14. Charlesworth B, Jain K. Purifying selection, drift, and reversible mutation with arbitrarily high mutation rates. Genetics. 2014;198:1587–1602. doi: 10.1534/genetics.114.167973
  15. Cockerham CC. Drift and mutation with a finite number of allelic states. Proc Natl Acad Sci USA. 1984;81:530–534. doi: 10.1073/pnas.81.2.530
  16. Comeron JM, Williford A, Kliman RM. The Hill–Robertson effect: evolutionary consequences of weak selection and linkage in finite populations. Heredity. 2008;100:19–31. doi: 10.1038/sj.hdy.6801059
  17. Desai MM, Fisher DS. Beneficial mutation selection balance and the effect of linkage on positive selection. Genetics. 2007;176:1759–1798. doi: 10.1534/genetics.106.067678
  18. Eshel I, Feldman MW. On the evolutionary effect of recombination. Theor Popul Biol. 1970;1:88–100. doi: 10.1016/0040-5809(70)90043-2
  19. Gale JS. Theoretical Population Genetics. London (UK): Unwin Hyman; 1990.
  20. Galzitskaya OV, Bogatyreva NS, Glyakina AV. Bacterial proteins fold faster than eukaryotic proteins with simple folding kinetics. Biochemistry (Mosc.). 2011;76:225–235. doi: 10.1134/S000629791102009X
  21. Gerrish PJ, Lenski RE. The fate of competing beneficial mutations in an asexual population. Genetica. 1998;102/103:127–144. doi: 10.1023/A:1017067816551
  22. Gessler DD. The constraints of finite size in asexual populations and the rate of the ratchet. Genet Res. 1995;66:241–253. doi: 10.1017/S0016672300034686
  23. Good BH, Desai MM. Fluctuations in fitness distributions and the effects of weak linked selection on sequence evolution. Theor Popul Biol. 2013;85:86–102.
  24. Good BH, Walczak AM, Neher RA, Desai MM. Genetic diversity in the interference selection limit. PLoS Genet. 2014;10:e1004222. doi: 10.1371/journal.pgen.1004222
  25. Goyal S, Balick DJ, Jerison ER, Neher RA, Shraiman BI, Desai MM. Dynamic mutation-selection balance as an evolutionary attractor. Genetics. 2012;191:1309–1319. doi: 10.1534/genetics.112.141291
  26. Hill WG, Robertson A. The effect of linkage on the limits to artificial selection. Genet Res. 1966;8:269–294. doi: 10.1017/S0016672300010156
  27. Jain K. Interference effects of deleterious and beneficial mutations in large asexual populations. Genetics. 2019;211:1357–1369. doi: 10.1534/genetics.119.301960
  28. Jain K, John S. Deterministic evolution of an asexual population under the action of beneficial and deleterious mutations on additive fitness landscapes. Theor Popul Biol. 2016;112:117–125. doi: 10.1016/j.tpb.2016.08.009
  29. John S, Jain K. Effect of drift, selection and recombination on the equilibrium frequency of deleterious mutations. J Theor Biol. 2015;365:238–246. doi: 10.1016/j.jtbi.2014.10.023
  30. Johnson T, Barton NH. The effect of deleterious alleles on adaptation in asexual populations. Genetics. 2002;162:395–411. doi: 10.1093/genetics/162.1.395
  31. Kaiser VB, Charlesworth B. The effects of deleterious mutations on evolution in non-recombining genomes. Trends Genet. 2009;25:9–12. doi: 10.1016/j.tig.2008.10.009
  32. Kim Y. Effect of strong directional selection on weakly selected mutations at linked sites: implication for synonymous codon usage. Mol Biol Evol. 2004;21:286–294. doi: 10.1093/molbev/msh020
  33. Kim Y, Stephan W. Joint effects of genetic hitchhiking and background selection on neutral variation. Genetics. 2000;155:1415–1427. doi: 10.1093/genetics/155.3.1415
  34. Kimura M. The number of heterozygous nucleotide sites maintained in a finite population due to steady flux of mutations. Genetics. 1969;61:893–903. doi: 10.1093/genetics/61.4.893
  35. Kimura M. The Neutral Theory of Molecular Evolution. Cambridge (UK): Cambridge University Press; 1983.
  36. Kimura M, Crow JF. The number of alleles that can be maintained in a finite population. Genetics. 1964;49:725–738. doi: 10.1093/genetics/49.4.725
  37. Kimura M, Maruyama T, Crow JF. The mutation load in small populations. Genetics. 1963;48:1303–1312. doi: 10.1093/genetics/48.10.1303
  38. Kingsolver JG, Diamond SE. Phenotypic selection in natural populations: what limits directional selection? Am Nat. 2011;177:346–357. doi: 10.1086/658341
  39. Lande R. The maintenance of genetic variability by mutation in a polygenic character with linked loci. Genet Res. 1975;26:221–235. doi: 10.1017/S0016672300016037
  40. Latter BD. Selection in finite populations with multiple alleles. II. Centripetal selection, mutation, and isoallelic variation. Genetics. 1970;66:165–186. doi: 10.1093/genetics/66.1.165
  41. Li WH. Models of nearly neutral mutations with particular implications for non-random usage of synonymous codons. J Mol Evol. 1987;24:337–345. doi: 10.1007/BF02134132
  42. Loewe L, Charlesworth B. Background selection in single genes may explain patterns of codon bias. Genetics. 2007;175:1381–1393. doi: 10.1534/genetics.106.065557
  43. Long H, Sung W, Kucukyildirim S, Williams E, Miller SF, Guo W, Patterson C, Gregory C, Strauss C, Stone C, et al. Evolutionary determinants of genome-wide nucleotide composition. Nat Ecol Evol. 2018;2:237–240. doi: 10.1038/s41559-017-0425-y
  44. Lynch M. The Origins of Genome Architecture. Sunderland (MA): Sinauer Assocs., Inc.; 2007.
  45. Lynch M. The lower bound to the evolution of mutation rates. Genome Biol Evol. 2011;3:1107–1118. doi: 10.1093/gbe/evr066
  46. Lynch M. Phylogenetic diversification of cell biological features. eLife. 2018;7:e34820.
  47. Lynch M. The evolutionary scaling of cellular traits imposed by the drift barrier. Proc Natl Acad Sci USA. 2020;117:10435–10444. doi: 10.1073/pnas.2000446117
  48. Lynch M, Ackerman M, Gout JF, Long H, Sung W, Thomas WK, Foster PL. Genetic drift, selection, and evolution of the mutation rate. Nat Rev Genet. 2016;17:704–714. doi: 10.1038/nrg.2016.104
  49. Lynch M, Bürger R, Butcher D, Gabriel W. Mutational meltdowns in asexual populations. J Hered. 1993;84:339–344. doi: 10.1093/oxfordjournals.jhered.a111354
  50. Lynch M, Hill WG. Phenotypic evolution by neutral mutation. Evolution. 1986;40:915–935. doi: 10.2307/2408753
  51. Lynch M, Ho WC. The limits to estimating population-genetic parameters with temporal data. Genome Biol Evol. 2020;12:443–455. doi: 10.1093/gbe/evaa056
  52. Lynch M, Trickovic B. A theoretical framework for evolutionary cell biology. J Mol Biol. 2020;432:1861–1879. doi: 10.1016/j.jmb.2020.02.006
  53. Lynch M, Trickovic B, Kempes CP. Evolutionary scaling of maximum growth rates with organism size. Sci Rep. 2022;12:22586.
  54. McVean G, Charlesworth B. A population genetic model for the evolution of synonymous codon usage: patterns and predictions. Genet Res. 1999;74:145–158. doi: 10.1017/S0016672399003912
  55. McVean GA, Charlesworth B. The effects of Hill–Robertson interference between weakly selected mutations on patterns of molecular evolution and variation. Genetics. 2000;155:929–944. doi: 10.1093/genetics/155.2.929
  56. Nguyen Ba AN, Cvijović I, Rojas Echenique JI, Lawrence KR, Rego-Costa A, Liu X, Levy SF, Desai MM. High-resolution lineage tracking reveals travelling wave of adaptation in laboratory yeast. Nature. 2019;575:494–499. doi: 10.1038/s41586-019-1749-3
  57. Pénisson S, Singh T, Sniegowski P, Gerrish P. Dynamics and fate of beneficial mutations under lineage contamination by linked deleterious mutations. Genetics. 2017;205:1305–1318.
  58. Rouzine IM, Brunet E, Wilke CO. The traveling wave approach to asexual evolution: Muller’s ratchet and speed of adaptation. Theor Popul Biol. 2008;73:24–46. doi: 10.1016/j.tpb.2007.10.004
  59. Santiago E, Caballero A. Effective size and polymorphism of linked neutral loci in populations under directional selection. Genetics. 1998;149:2105–2117. doi: 10.1093/genetics/149.4.2105
  60. Schavemaker PE, Lynch M. Flagellar energy costs across the tree of life. eLife. 2022;11:e77266. doi: 10.7554/eLife.77266
  61. Walsh JB, Lynch M. Evolution and Selection of Quantitative Traits. Oxford (UK): Oxford University Press; 2018.

Articles from Genetics are provided here courtesy of Oxford University Press
