Abstract
Natural populations often show enhanced genetic drift consistent with a strong skew in their offspring number distribution. The skew arises because the variability of family sizes is either inherently strong or amplified by population expansions. The resulting allele-frequency fluctuations are large and, therefore, challenge standard models of population genetics, which assume sufficiently narrow offspring distributions. While the neutral dynamics backward in time can be readily analyzed using coalescent approaches, we still know little about the effect of broad offspring distributions on the forward-in-time dynamics, especially with selection. Here, we employ an asymptotic analysis combined with a scaling hypothesis to demonstrate that over-dispersed frequency trajectories emerge from the competition of conventional forces, such as selection or mutations, with an emerging time-dependent sampling bias against the minor allele. The sampling bias arises from the characteristic time-dependence of the largest sampled family size within each allelic type. Using this insight, we establish simple scaling relations for allele-frequency fluctuations, fixation probabilities, extinction times, and the site frequency spectra that arise when offspring numbers are distributed according to a power law.
Keywords: natural selection, site-frequency spectrum, fixation probability, stationary distribution, traveling waves, Λ-coalescent, jackpot events, multiple mergers
Interpreting the genetic differences between and within populations we observe today requires a robust understanding of how allele frequencies change over time. Most theoretical and statistical advancements have been based on the Wright–Fisher model (Fisher 1930; Wright 1931), which has shaped the intuition of generations of population geneticists for how evolutionary dynamics works (Crow and Kimura 1970). The Wright–Fisher model assumes that the genetic makeup of a generation results from resampling the gene pool of the previous generation, whereby biases are introduced to account for most relevant evolutionary forces, such as selection, migration, or variable population sizes. For large populations, the resulting dynamics can be approximated by a biased diffusion process, which simplifies the statistical modeling of the genetic diversity. More importantly, the Wright–Fisher diffusion is the limiting allele frequency process of a wide variety of microscopic models, as long as they satisfy seemingly mild assumptions (see below). This flexibility has made the Wright–Fisher diffusion the standard model of choice to infer the demographic history of a species, loci of selection, or the strength of polygenic selection (Bollback et al. 2008; Berg and Coop 2014; Feder et al. 2014; Foll et al. 2015; Schraiber et al. 2016; Tataru et al. 2017).
Despite its versatility, the Wright–Fisher diffusion can be a poor approximation when the population dynamics is driven by rare but strong number fluctuations. It is increasingly recognized that number fluctuations can be inflated for very different reasons. First, the considered species may have a broad offspring distribution, which occurs for marine species and plants with a Type III survivorship curve (Hedgecock 1994; Eldon and Wakeley 2006) as well as viruses and fungi [reviewed in Tellier and Lemaire (2014)]. Broad offspring distributions also arise in infectious disease, when relatively few super-spreaders are responsible for the majority of the disease transmissions (Lloyd-Smith et al. 2005). In the recent SARS-CoV-2 pandemic, for example, strongly skewed offspring distributions were consistently inferred from both contact tracing data and infection cluster size distributions (Adam et al. 2020; Laxminarayan et al. 2020). Understanding allele frequency trajectories in these systems is extremely challenging, as statistical inference based on the Wright–Fisher model is often misleading (see, e.g., Sackman et al. 2019).
A second mechanism for strong number fluctuations is the occurrence of so-called jackpot events, which can arise in any species no matter the actual offspring distribution. Jackpot events are population bottlenecks that arise when the earliest, the most fit, or the most advanced individuals have an unusually large number of descendants. Temporal jackpot events (“earliest”) were first discovered by Luria and Delbrück (1943) and studied as a signal of spontaneous mutations in an expanding population. They observed that a phage-resistant mutant clone can grow exceptionally large if the resistance mutation by chance occurs early in an expansion event. Despite being rare, these jackpot events are easily detectable in large populations because they strongly inflate the variance of the expected number of mutants and lead to power-law descendant distributions.
The very same descendant distribution arises in models of rampant adaptation and background selection. In these models, mutations generate jackpot events when they arise within the few fittest individuals (Neher and Hallatschek 2013). Jackpot events also arise in range expansions, where the most advanced individuals in the front of the population have a good chance to leave many descendants over the next few generations. This phenomenon of gene surfing can produce a wide range of scale-free descendant distributions (Hallatschek and Nelson 2008; Fusco et al. 2016; Birzu et al. 2018, 2021).
To account for skewed offspring distributions, a number of theoretical studies have been conducted in the context of the coalescent framework. In this backward-in-time description, a striking feature of broad offspring distributions is the simultaneous merging of multiple lineages. One of the most widely studied models is the beta-coalescent (Schweinsberg 2003a), which is a subclass of the Λ-coalescent and corresponds to the population dynamics with a power-law offspring number distribution, $\propto 1/u^{1+\alpha}$ for offspring number u. The case α = 1, called the Bolthausen–Sznitman coalescent (Bolthausen and Sznitman 1998), has been shown to be the limiting coalescent in models of so-called “pulled” traveling waves, which describe the most basic scenarios of range expansions (Brunet et al. 2007) and of rampant adaptation (Desai et al. 2013; Kosheleva and Desai 2013; Neher and Hallatschek 2013; Schweinsberg 2017). Moreover, so-called “semi-pushed” traveling waves that contain some level of co-operativity, induced e.g. by an Allee effect, generate power-law offspring distributions with 1 < α < 2 (Birzu et al. 2018), indicating that their coalescent is intermediate between the Bolthausen–Sznitman and Kingman coalescents.
The tractability of coalescent approaches makes them particularly useful for inferring demographic histories and detecting outlier behaviors (Basdevant and Goldschmidt 2008; Eldon 2009, 2011). However, as it is notoriously difficult to integrate selection in coalescent frameworks, there is also a strong need for forward-in-time approaches that capture the competition between genetic drift and selection. While for α ≥ 2, the limiting allele frequency dynamics is given by the well-understood Wright–Fisher process, much less is known for the case α < 2. This is unfortunate because, as mentioned above, any exponent 1 ≤ α < 2 can arise dynamically.
Recently, the forward-dynamics of the special case α = 1 was studied by one of the authors (Hallatschek 2018), finding that an emergent sampling bias generates strong deviations from the Wright–Fisher dynamics. The sampling bias arises because, in each generation, an allele with high frequency can sample more often and, hence, deeper into the tail of the offspring distribution than an allele with small frequency. The major allele of a biallelic site, therefore, has with high probability a greater number of offspring per individual than the minority type. This sampling bias acts like a selective advantage of the major allele, but its average effect is compensated by rare frequency hikes of the minor allele so that the expected change in frequency only changes in the presence of genuine selection.
Here, we focus on the understudied case 1 < α < 2, intermediate between the known cases of α ≥ 2, corresponding to the Wright–Fisher diffusion, and α = 1, described by jumps and sampling bias but vanishing diffusion. Similarly to the α = 1 borderline case, we find that a minor-allele-suppressing sampling bias arises but that it fades over time as the offspring distributions are sampled more and more thoroughly. This time-dependent sampling bias determines the scaling of the fixation probability, extinction time, stationary distribution, and site frequency spectrum (SFS). The combination of jumps and bias generates a so-called Lévy flight, which controls the variability of allele frequency trajectories, for instance between unlinked genes or between populations. The flexibility of our model should make it possible to fit a wide range of cases that deviate from the Wright–Fisher diffusion.
Model and methods
Model
To study the impact of broad offspring numbers, we consider an idealized, panmictic, haploid population of constant size N that produces non-overlapping generations in the following way. First, we associate with each individual i a “reproductive value” (Fisher 1930; Barton and Etheridge 2011) $U_i$, which represents its expected contribution to the population of the next generation. The random numbers $U_i$ are drawn from a specified distribution $P_U$. In a second step, we sample each individual in proportion to its reproductive value until we have obtained N new individuals representing the next generation.
Our model belongs to the general class of Cannings models (Cannings 1974). The Wright–Fisher model is obtained if we choose $P_U$ to be a Dirac delta function, such that all individuals have the same reproductive value.
We focus most of our analysis on the dynamics of two mutually exclusive alleles, a wild type and a mutant allele. The dynamics of the two alleles is captured by the time-dependent frequency X(t) (0 ≤ X(t) ≤ 1) of mutants. The wild-type frequency is given by 1 − X(t). The total reproductive values M and W of the mutant population and the wild-type population, respectively, are given by
$$M = \sum_{i=1}^{N X(t)} U_i^{(m)}, \qquad W = \sum_{i=1}^{N[1 - X(t)]} U_i^{(w)}. \tag{1}$$
Here, $U_i^{(m)}$ and $U_i^{(w)}$ are the individual reproductive values of mutants and wild types, respectively, each sampled from the distribution $P_U$. The population at the next generation is generated by binomially sampling N individuals with success probability (namely, the probability that the parent of a randomly chosen individual is a mutant) $M/(M+W)$. Mutations and selection are included as in the Wright–Fisher model. If the fitness of the mutant relative to the wild type is 1 + s, where s is the selection coefficient, and the forward and back mutation rates are μ1 and μ2, respectively, then the success probability is given by $[(1+s)(1-\mu_2)M + \mu_1 W]/[(1+s)M + W]$.
For the offspring distribution $P_U$, we consider a family of fat-tailed distributions, which asymptotically behave as $P_U(u) \sim u^{-(1+\alpha)}$, with α being a positive constant. To make our presentation concrete, we choose $P_U(u) = \alpha/u^{1+\alpha}$ (u ≥ 1), which is known as the Pareto distribution. In the large population size limit, the neutral allele-frequency dynamics is known to depend only on the asymptotic power-law exponent α, provided we measure time in units of the coalescence time (Schweinsberg 2003b). For a different but closely related approach to modeling broad offspring distributions, see Bah and Pardoux (2015).
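The two-step update described above can be sketched in a few lines. This is a minimal illustration, not the simulation code used for the figures: the function name `next_generation` is hypothetical, mutations are left out, selection is applied by weighting the mutant pool by 1 + s (one common Wright–Fisher convention, assumed here), and Pareto variates on [1, ∞) are obtained by shifting NumPy's Lomax sampler by one.

```python
import numpy as np

def next_generation(n_mut, N, alpha, s=0.0, rng=None):
    """Return the mutant count after one generation (hypothetical helper).

    n_mut : current number of mutants (0..N)
    N     : constant population size
    alpha : tail exponent of the Pareto reproductive-value distribution
    s     : selection coefficient of the mutant (assumed to weight M by 1 + s)
    """
    rng = np.random.default_rng() if rng is None else rng
    # Classical Pareto variates with minimum 1: shift NumPy's Lomax samples by 1.
    u_mut = 1.0 + rng.pareto(alpha, size=n_mut)
    u_wt = 1.0 + rng.pareto(alpha, size=N - n_mut)
    M = (1.0 + s) * u_mut.sum()  # total (fitness-weighted) mutant reproductive value
    W = u_wt.sum()               # total wild-type reproductive value
    rho = M / (M + W)            # probability that a sampled parent is a mutant
    return int(rng.binomial(N, rho))

# Example: one neutral trajectory of the mutant frequency.
rng = np.random.default_rng(1)
N, n = 1000, 500
freqs = [n / N]
for _ in range(50):
    n = next_generation(n, N, alpha=1.5, rng=rng)
    freqs.append(n / N)
```

Because each reproductive value is at least 1, the denominator M + W never vanishes, and the absorbing states n = 0 and n = N are preserved exactly.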
Methods
Our goal is to understand the asymptotic dynamics of our model for large N, where the frequency becomes continuous over time (Kimura 1955; Gardiner 2009) provided that α ≥ 1 (Schweinsberg 2003b). We first present simulation results regarding relevant measures in population genetics. Later, we provide a heuristic argument to explain them. Many separate observations (the fixation probability, extinction time, allele frequency fluctuations, stationary distribution, and SFS) can be matched up with a unifying scaling picture.
Below, t and $\tau = t/T_c$ denote time in units of generations and time normalized by the characteristic (coalescent) timescale $T_c$, respectively. $T_c$ depends on the population size and the exponent α as follows: $T_c = N$ when α > 2, $T_c \propto N/\log N$ when α = 2, $T_c \propto N^{\alpha-1}$ when 1 < α < 2, and $T_c \propto \log N$ when α = 1. These timescales were originally derived in the coalescent framework (Schweinsberg 2003b). Later, we explain how they can be rationalized within the forward-in-time approach.
To understand the frequency dynamics when α < 2, it is essential to distinguish between average and typical trajectories. As a proxy for typical trajectories, we use the median of the frequencies, denoted by $X_{\mathrm{med}}$, throughout this paper.
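For converting between generations and rescaled time in simulations, the scaling of the coalescent timescale with N can be collected in a small helper. This is only a sketch: it assumes the scalings $T_c \sim N$ (α > 2), $N/\log N$ (α = 2), $N^{\alpha-1}$ (1 < α < 2), and $\log N$ (α = 1) known from the coalescent literature, drops all α-dependent prefactors, and uses the hypothetical name `coalescent_timescale`.

```python
import math

def coalescent_timescale(N, alpha):
    """Leading-order scaling of T_c with N (prefactors omitted by assumption)."""
    if alpha > 2:
        return float(N)
    if alpha == 2:
        return N / math.log(N)
    if 1 < alpha < 2:
        return float(N) ** (alpha - 1)
    if alpha == 1:
        return math.log(N)
    raise ValueError("the continuous-time limit requires alpha >= 1")

# Rescaled time for t = 200 generations at N = 10**4 and alpha = 1.5.
tau = 200 / coalescent_timescale(10**4, 1.5)
```

The helper makes explicit that, for a fixed number of generations, the rescaled time elapsed is much larger for broad offspring distributions than in the Wright–Fisher case.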
Results
Neutral dynamics: typical trajectories and extinction time
First, we characterize the allele frequency dynamics in the absence of selection, s = 0. In this neutral limit, the expected value of the allele frequency does not change over time, i.e., ⟨X(t)⟩ = X(0). Yet, despite the overall neutrality, a typical trajectory experiences a bias against the minority allele. This can be seen in Figure 1, where the mean and median are plotted across many realizations that start from the same initial frequency. While the mean does not change over time, as required from neutrality, the median decays to zero in an α-dependent manner. By symmetry, the median increases toward fixation if the starting frequency is larger than 50%. Thus, the median experiences a bias against the minor allele. Note also that, for the case shown by the red curve in Figure 1, the velocity of the median decreases as it approaches the extinction boundary. As we will show later, an uptick of the SFS at the boundaries originates from this slowing.
Numerical simulations of the early part of trajectories show that the time-dependent median displacement follows a simple power law,
$$\left|X_{\mathrm{med}}(\tau) - X(0)\right| \propto \tau^{1/\alpha}, \tag{2}$$
up to an X(0)-dependent prefactor. Figure 2 shows the numerical result. The red curve represents the median of trajectories, which agrees well with the $\tau^{1/\alpha}$ scaling.
Next, we quantify the time to extinction, which turns out to be driven by the above minor-allele-suppressing bias. Numerical results of the mean extinction time $\langle\tau_{\mathrm{ext}}\rangle$ of an allele with initial frequency $x_0 = X(0)$ are consistent with
$$\langle\tau_{\mathrm{ext}}\rangle \propto x_0^{\alpha-1}, \tag{3}$$
as shown in Figure 3. Hence, in units of the coalescence time, the mean extinction time becomes larger as α decreases (namely, for a broader offspring distribution). Note, however, that if one measures time in units of generations, Equation 3 can be rewritten as $\langle t_{\mathrm{ext}}\rangle \propto (N x_0)^{\alpha-1}$, which becomes smaller as α decreases since $N x_0 \geq 1$. As we show later, Equation 2 can be analytically derived from a short-time approximation of the dynamics (Equation 27). Equation 3 can be explained from an effective sampling bias (Equation 37).
Allele frequency fluctuations as a signature of broad offspring distributions
Next, we explore to what extent the spectrum of allele frequency fluctuations can provide a clue for identifying the exponent α of the offspring distribution. A deviation from the Wright–Fisher diffusion is most clearly revealed by measuring the median square displacement (median SD),
$$\mathrm{Median}\left[\left(X(\tau) - X(0)\right)^2\right], \tag{4}$$
where Median[·] denotes taking the median (e.g., $X_{\mathrm{med}}(\tau) = \mathrm{Median}[X(\tau)]$). To measure the median SD, we simulate 1000 neutral allele frequency trajectories from a common initial condition, for broad offspring distributions and for the Wright–Fisher model (Figure 4A). As shown in Figure 4B, the median SD computed from this data set is consistent with the scaling,
$$\mathrm{Median}\left[\left(X(\tau) - X(0)\right)^2\right] \propto \tau^{2/\alpha}, \tag{5}$$
when 1 < α < 2. Noting 2/α > 1, this scaling means that typical fluctuations characterized by the median SD exhibit super-diffusion. Later, we derive the superdiffusive exponent in Equation 5 analytically (Equation 34).
Usually, allele frequency fluctuations are quantified by using the mean SD, $\langle (X(\tau) - X(0))^2 \rangle$, rather than the median SD. For the Wright–Fisher diffusion, the distinction between these two measures is irrelevant since both of them increase linearly with time, albeit with differing prefactors. However, for α < 2, the α-dependence in Equation 5 can only be detected by measuring the median SD. As shown in Figure 4C, the mean SD (computed from a large data set) grows linearly in time even when α is less than 2, as if the underlying process was diffusive.
That the dynamics is not diffusive also impacts the mean SD, but somewhat subtly in that its value depends on the size of the data set (i.e., the number of frequency trajectories) used to measure it. This is because, while rare large jumps contribute to the mean SD in a large data set, these jumps are (with high probability) not observed in a small data set. To demonstrate this data-size dependence, we prepare an ensemble of data sets, where each data set consists of a given number of allele-frequency trajectories. Then, for each data set, we measure the diffusion exponent κ, which is defined by
$$\left\langle \left(X(\tau) - X(0)\right)^2 \right\rangle \propto \tau^{\kappa}. \tag{6}$$
In Figure 4D, the ensemble-averaged exponent is shown by the blue circles. We can see that, as the data size increases, fluctuations characterized by the mean SD exhibit a crossover from super-diffusion (κ > 1) to normal diffusion (κ = 1). For the median SD, by contrast, we find that its diffusion exponent κ can be computed reliably without any significant dependence on the size of the data set (orange triangles in Figure 4D). For example, under the parameter setting in Figure 4D, given a data set of 320 trajectories, the diffusion exponent of the median SD falls within a narrow interval with high probability. Through κ = 2/α (Equation 5), this in turn yields an estimate of α that is close to the actual value.
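The measurement just described can be sketched as follows. This is an illustrative recipe rather than the analysis pipeline behind Figure 4: the function names are hypothetical, the exponent is obtained from an ordinary least-squares fit in log-log space (an assumption about the fitting method), and the estimate of α uses the relation κ = 2/α suggested by the superdiffusive scaling of the median SD.

```python
import numpy as np

def median_sd(trajectories):
    """Median square displacement at each time point.

    trajectories : array of shape (n_trajectories, n_times);
                   column 0 holds the initial frequencies.
    """
    disp2 = (trajectories - trajectories[:, :1]) ** 2
    return np.median(disp2, axis=0)

def mean_sd(trajectories):
    """Mean square displacement at each time point."""
    disp2 = (trajectories - trajectories[:, :1]) ** 2
    return np.mean(disp2, axis=0)

def fit_exponent(times, sd):
    """Slope kappa of log(sd) versus log(times), ignoring non-positive entries."""
    times = np.asarray(times, dtype=float)
    sd = np.asarray(sd, dtype=float)
    mask = (times > 0) & (sd > 0)
    slope, _intercept = np.polyfit(np.log(times[mask]), np.log(sd[mask]), 1)
    return float(slope)

def alpha_from_kappa(kappa):
    """Invert kappa = 2 / alpha (assumed scaling of the median SD)."""
    return 2.0 / kappa
```

Applied to the same set of trajectories, `fit_exponent` on `median_sd` and on `mean_sd` gives the two exponents compared in the text; only the former is expected to be insensitive to the number of trajectories.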
Fixation probability
Next, we examine the effect of natural selection on the fixation probability of beneficial mutations. We consider a mutant with positive selective advantage s > 0 arising in a monoclonal population. The fixation probability of a single mutant depends on the parameter α of the offspring distribution. In the Wright–Fisher model (or equivalently, α ≥ 2), the fixation probability can be obtained using a diffusion approximation and is given by $P_{\mathrm{fix}} = (1 - e^{-2s})/(1 - e^{-2Ns})$, which becomes $P_{\mathrm{fix}} \approx 2s$ when $Ns \gg 1$ and s is small. When α = 1, an analytic result has recently been obtained (Hallatschek 2018). For the intermediate case, 1 < α < 2, we find that the fixation probability is given by
$$P_{\mathrm{fix}} \propto s^{1/(\alpha-1)}. \tag{7}$$
See Figure 5 for the numerical results. Note that since $P_{\mathrm{fix}} = 1/N$ in the neutral limit independently of α, these results hold for sufficiently strong selection, $s \gg N^{-(\alpha-1)}$. Equation 7 can be deduced from the balance between the selection force and an emergent sampling bias (Equation 38 and Figure 13A).
As Equation 7 shows, for a fixed population size and selective advantage, the fixation probability becomes smaller as α decreases. Intuitively, this is because, for smaller α, the success of fixation in catching a ride on a jackpot event depends more strongly on luck than on fitness differences.
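To make the comparison concrete, the two scalings can be evaluated side by side. The sketch below uses the standard haploid diffusion result for the Wright–Fisher model and, for 1 < α < 2, the power law $s^{1/(\alpha-1)}$ with its unknown prefactor set to one; both function names are hypothetical, and the second function is only meaningful for comparing trends, not absolute values.

```python
import math

def p_fix_wright_fisher(s, N):
    """Diffusion-approximation fixation probability of a single mutant."""
    if s == 0:
        return 1.0 / N
    # (1 - exp(-2s)) / (1 - exp(-2Ns)), written with expm1 for accuracy at small s.
    return math.expm1(-2.0 * s) / math.expm1(-2.0 * N * s)

def p_fix_power_law(s, alpha):
    """Scaling form s**(1/(alpha-1)) for 1 < alpha < 2 (prefactor assumed to be 1)."""
    if not 1 < alpha < 2:
        raise ValueError("scaling form applies to 1 < alpha < 2")
    return s ** (1.0 / (alpha - 1.0))

# The same selective advantage becomes less effective as alpha decreases.
for a in (1.9, 1.5, 1.25):
    print(a, p_fix_power_law(0.01, a))
```

For s = 0.01, the Wright–Fisher value is close to 2s, whereas the power-law form drops steeply as α approaches 1, in line with the intuition given above.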
Site frequency spectrum
We return to the neutral case and present the scaling behaviors of the neutral SFS. The SFS is often used as a convenient summary of the genetic diversity within a population. Theoretically, the SFS is defined in the infinite alleles model (Kimura 1969) as the density $f_{\mathrm{SFS}}(x)$ of neutral derived alleles in the population (namely, $f_{\mathrm{SFS}}(x)\,dx$ is the number of derived alleles in the frequency window $[x, x + dx]$).
Figure 6 shows numerical plots of the neutral SFS for broad offspring distributions and for the Wright–Fisher model. In the standard Wright–Fisher model, the SFS is proportional to $1/x$, which decreases monotonically as x increases. By contrast, when offspring numbers are broadly distributed (when α < 2), the SFS is non-monotonic with a somewhat surprising uptick toward the fixation boundary. When α = 1, the asymptotic behaviors near both boundaries are well established analytically: $f_{\mathrm{SFS}}(x)$ is proportional to $x^{-2}$ near x = 0 and to $[(1-x)\,|\log(1-x)|]^{-1}$ near x = 1 (Kosheleva and Desai 2013; Neher and Hallatschek 2013) (see also Appendix A).
For the intermediate case 1 < α < 2, the rare-end behavior of the SFS has been analytically studied. Using a backward approach (the Λ-coalescent), Berestycki et al. (2014) showed
$$\lim_{n\to\infty} \frac{\xi_i^{(n)}}{n^{2-\alpha}} \propto \frac{\Gamma(i+\alpha-2)}{i!}. \tag{8}$$
Here, n is the sample size and $\xi_i^{(n)}$ is the number of sites at which variants appear i times in the sample (see Berestycki et al. 2014 for the proportionality constant of the right-hand side of Equation 8). By using Stirling’s approximation in Equation 8, we have
$$f_{\mathrm{SFS}}(x) \propto x^{\alpha-3} \qquad (x \ll 1). \tag{9}$$
Equation 8 cannot be used for high-frequency variants, because the number of times the variants appear (i in Equation 8) is kept finite in taking the limit of the sample size n. To the best of our knowledge, the precise behavior at the high-frequency end for 1 < α < 2 has not been reported. As shown in Figure 7, we find that the asymptotic form of the uptick of $f_{\mathrm{SFS}}(x)$ is given by
$$f_{\mathrm{SFS}}(x) \propto (1-x)^{\alpha-2} \qquad (1 - x \ll 1). \tag{10}$$
We will show that the uptick arises due to the fact that an effective sampling bias decreases as an allele-frequency trajectory approaches the fixation boundary (Equation 42).
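The step from the coalescent result to the low-frequency power law rests on the large-i behavior of a ratio of Gamma functions. The short check below is a numerical sketch under the assumption that the i-dependence of the rare-variant counts is $\Gamma(i+\alpha-2)/i!$; the function names are hypothetical. It estimates the local log-log slope of this weight and compares it with α − 3.

```python
import math

def log_weight(i, alpha):
    """log of Gamma(i + alpha - 2) / i!, evaluated stably via lgamma."""
    return math.lgamma(i + alpha - 2.0) - math.lgamma(i + 1.0)

def local_exponent(i, alpha):
    """Log-log slope of the weight between i and 2i."""
    return (log_weight(2 * i, alpha) - log_weight(i, alpha)) / math.log(2.0)

# The slope should approach alpha - 3 as i grows.
for alpha in (1.25, 1.5, 1.75):
    print(alpha, local_exponent(10**4, alpha), alpha - 3.0)
```

The slope interpolates between −2 for α → 1 and −1 for α → 2, matching the known low-frequency behavior of the Bolthausen–Sznitman and Wright–Fisher spectra.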
Mutation-drift balance
A broad offspring distribution also affects the stationary distribution of allele frequency when mutations and genetic drift are balancing one another. For simplicity, we consider symmetric reversible mutations between two neutral allele types. We denote the scaled mutation rate (per unit time in the continuous description) as $\theta = T_c\,\mu$, where μ denotes the mutation rate per generation. In the Wright–Fisher model, it is known that the stationary distribution is given by (Kimura 1955)
$$P_{\mathrm{st}}(x) \propto \left[x(1-x)\right]^{2\theta-1}. \tag{11}$$
There is a critical value θc = 1/2: When θ > θc, the distribution in Equation 11 has a single peak at the center x = 1/2; when θ < θc, it has a U-shaped distribution, where the density increases monotonically from the center to the boundaries.
Figure 8, A and B show the numerical results of the stationary distributions for the Wright–Fisher model and for a broad offspring distribution with α < 2, respectively. When α < 2, while a critical value of the mutation rate θc exists as in the Wright–Fisher model, there is a qualitatively different feature: For a small mutation rate θ < θc, the stationary distribution is not U-shaped but M-shaped, with two peaks near the boundaries. Note that the M-shaped distribution indicates a stochastic switching behavior, as illustrated in Figure 8D (the blue curve). As shown in Figure 8D, the peak positions are approximately given up to prefactors by
(12) |
In Appendix B, we show that the M-shaped stationary distribution persists even in the presence of natural selection, provided that selection is weaker than the sampling bias at the peaks of the distribution.
A similar M-shaped distribution was observed for the Eldon–Wakeley (EW) process by Der and Plotkin (2014), wherein moments of the stationary distribution were extensively studied. However, the origin of the M-shaped distribution remained unclear. Below, using scaling arguments, we explain why the bimodal distribution arises in our case (see the argument above Equation 44).
Analytical arguments
Limiting process, transition density, and time-dependent effective bias
We now provide analytical arguments for the observations made in the simulations described in the first part of this paper. Our discussion starts with an exact but somewhat unwieldy description of the allele frequency dynamics. We then show how exact short-time and intermediate time asymptotics can be derived and used to rationalize the sampling bias and the scaling laws discovered above.
The allele frequency dynamics can be fully characterized by the transition probability density $w_N(y|x)$ that the mutant frequency changes from x to y in one generation. Since one generation consists of random offspring contributions to the seed pool and binomial sampling from the seed pool, we have
$$w_N(y|x) = \int_0^\infty dM \int_0^\infty dW\; P_m(M|x)\, P_w(W|x)\, B\!\left(Ny \,\Big|\, N, \frac{M}{M+W}\right). \tag{13}$$
Here, $P_m(M|x)$ is the probability density that the sum of the Nx random mutant offspring numbers takes the value M, $P_w(W|x)$ is that for the wild type, and $B(Ny\,|\,N, \rho)$ is the probability of getting Ny successes in N trials with success probability $\rho = M/(M+W)$. First, we will focus on the neutral case, for which $P_m$ and $P_w$ are the same function, i.e., $P_w(W|x) = P_m(W|1-x)$.
While the resampling distribution $w_N$ may in general behave in complex ways, it has few options in the large N limit. These constraints emerge from two asymptotic simplifications. First, since M and W are the sums of many random variables, their distributions tend to stable distributions as described by the generalized central limit theorem (Gnedenko and Kolmogorov 1968; Uchaikin and Zolotarev 2011) (see also Appendix C for a brief description of the theorem). Second, the fluctuations associated with binomial sampling become negligible compared with those induced by offspring number contributions to the seed pool, provided that the offspring distribution is sufficiently broad, i.e., α < 2. Thus, we can replace the binomial distribution with a Dirac delta function, $\delta\!\left(y - M/(M+W)\right)$. By using these facts and evaluating the integral in Equation 13 (see Appendix D for details), we obtain a simple analytical expression of $w_N(y|x)$, which is valid in the large N limit: When α = 1 (Hallatschek 2018),
(14) |
When ,
(15) |
where .
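To obtain the continuum description, we must appropriately scale the time t with the population size N (Gardiner 2009). The characteristic timescale (coalescent timescale) $T_c$ can be read from the dependence of the transition density on N. Hallatschek (2018) showed that, when α = 1, the resulting limiting process is described by
$$\frac{\partial P(x,\tau)}{\partial \tau} = -\frac{\partial}{\partial x}\left[V(x)\,P(x,\tau)\right] + \mathrm{P.V.}\!\int_0^1 dy\,\left[w(x|y)\,P(y,\tau) - w(y|x)\,P(x,\tau)\right], \tag{16}$$
where the jump kernel is given by
$$w(y|x) = \frac{x(1-x)}{(y-x)^2}, \tag{17}$$
and the advection (bias) term V(x) is given by
$$V(x) = -\,\mathrm{P.V.}\!\int_0^1 dy\,(y-x)\,w(y|x) = x(1-x)\log\frac{x}{1-x}, \tag{18}$$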
where P.V. denotes the Cauchy principal value. It is easy to check that Equation 16 satisfies the neutrality condition (see Hallatschek 2018 for the calculation). Equations of the form of Equation 16 are sometimes called differential Chapman–Kolmogorov equations (Gardiner 2009).
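For readers who want to see where the logarithmic form of the bias comes from, the principal-value integral can be evaluated in two lines. The sketch assumes the α = 1 jump kernel $w(y|x) = x(1-x)/(y-x)^2$ and the definition of V(x) as minus the first jump moment; both are taken here as assumptions about the form of the limiting process reported by Hallatschek (2018).

```latex
\begin{align}
\mathrm{P.V.}\!\int_0^1 \frac{dy}{y-x}
  &= \lim_{\epsilon\to 0}\left[\int_0^{x-\epsilon}\frac{dy}{y-x}
     + \int_{x+\epsilon}^{1}\frac{dy}{y-x}\right]
   = \lim_{\epsilon\to 0}\left[\log\frac{\epsilon}{x} + \log\frac{1-x}{\epsilon}\right]
   = \log\frac{1-x}{x}, \\
V(x) &= -\,x(1-x)\;\mathrm{P.V.}\!\int_0^1 \frac{dy}{y-x}
      = x(1-x)\,\log\frac{x}{1-x}.
\end{align}
```

The result is positive for x > 1/2 and negative for x < 1/2, so the bias always pushes toward the nearer boundary.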
To develop intuition, it is useful to interpret the different terms in Equation 16. First, V(x) has the form of frequency-dependent selection that enhances the major allele (with frequency x > 1/2) and suppresses the minor allele. The apparent fitness difference between the mutant and wild type is given by the log-ratio of their frequencies. Such a selection-like effect arises because the major allele can sample the offspring number distribution more deeply than the minor allele (see Hallatschek 2018). Second, in spite of this apparent bias, the neutrality of the whole process is maintained due to rare large jumps, characterized by $w(y|x)$. This also means that the neutrality does not hold if we focus on “typical” trajectories (see Figure 1). In fact, as we show in Appendix A, the median of the mutant frequency, which is a proxy of “typical” trajectories, evolves according to
$$\frac{dX_{\mathrm{med}}}{d\tau} = V(X_{\mathrm{med}}) = X_{\mathrm{med}}\,(1-X_{\mathrm{med}})\,\log\frac{X_{\mathrm{med}}}{1-X_{\mathrm{med}}}. \tag{19}$$
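When 1 < α < 2, using the same reasoning as the derivation of Equation 14 and choosing $T_c \propto N^{\alpha-1}$, we can obtain the following differential Chapman–Kolmogorov equation,
$$\frac{\partial P(x,\tau)}{\partial \tau} = -\frac{\partial}{\partial x}\left[V(x)\,P(x,\tau)\right] + \int_0^1 dy\,\left[w(x|y)\,P(y,\tau) - w(y|x)\,P(x,\tau)\right], \tag{20}$$
where
$$w(y|x) = \begin{cases} \dfrac{x(1-x)\,(1-y)^{\alpha-1}}{(y-x)^{1+\alpha}}, & y > x, \\[2ex] \dfrac{x(1-x)\,y^{\alpha-1}}{(x-y)^{1+\alpha}}, & y < x, \end{cases} \tag{21}$$
and
$$V(x) = -\int_0^1 dy\,(y-x)\,w(y|x). \tag{22}$$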
As in Equation 16, the advection term guarantees the neutrality. Equation 21 means that, when x < 1/2, rightward jumps occur more frequently than leftward ones, and this tendency reverses when x > 1/2. Noting the overall minus sign in Equation 22, this in turn means that V(x) is a bias against the minor allele (see Figure 1), as in the case of α = 1. We will later show that when 1 < α < 2, the median trajectory is initially decaying like $\tau^{1/\alpha}$ (Equation 2).
We note that the short-time superdiffusive behavior in Equation 5 implies that Equation 20 cannot be simplified to a Fokker–Planck equation. We also note that, in the limit , two divergencies arise in Equation 20, one in the integral for the advection velocity in Equation 22 and one in the jump integral in Equation 16. However, since both divergencies exactly cancel, the entire right-hand side of Equation 20 is well-defined. As shown in Appendix E, Equations 16 and 20 can also be derived as a dual of the Λ-Fleming-Viot process, namely as the adjoint operator of the backward generator (e.g., Etheridge et al. 2010; Griffiths 2014).
Although it is difficult to study Equation 20 analytically, it is possible to derive exact short-time and long-time asymptotics that, combined with scaling arguments, paint a fairly comprehensive picture of the ensuing statistical genetics.
Short-time dynamics and fluctuations
First, we describe the transition density of Equation 20 for small times. When τ ≪ 1, the allele frequency changes due to the deterministic bias V(x) and the random occurrence of jumps, sampled from the broad distribution in Equation 21. Since the number of jump events is enormous even for small τ, the generalized central limit theorem applies, and the displacement X(τ) − x0 from the initial frequency x0 is asymptotically distributed according to a stable distribution (Gnedenko and Kolmogorov 1968). For a general stable distribution, no closed-form expression of the density is available, and only its characteristic function can be expressed analytically. As we show in Appendix F, the random displacement can be expressed as
$$X(\tau) - x_0 = c(\tau)\,Z. \tag{23}$$
Here Z is sampled from the stable distribution p(z) whose characteristic function is given by
$$\left\langle e^{ikZ} \right\rangle = \exp\left\{-|k|^{\alpha}\left[1 - i\beta\,\mathrm{sgn}(k)\tan\frac{\pi\alpha}{2}\right]\right\}, \tag{24}$$
and the scale parameter c(τ) and the skewness parameter β are respectively given by
$$c(\tau) \propto \left\{x_0(1-x_0)\left[(1-x_0)^{\alpha-1} + x_0^{\alpha-1}\right]\tau\right\}^{1/\alpha}, \tag{25}$$
$$\beta = \frac{(1-x_0)^{\alpha-1} - x_0^{\alpha-1}}{(1-x_0)^{\alpha-1} + x_0^{\alpha-1}}. \tag{26}$$
Note that the statistical properties of Z are independent of τ, and X(τ) − x0 depends on τ only via the scale parameter c(τ). As shown in Figure 9A, for small times, the transition density computed from the stable distribution agrees precisely with numerical simulation results in the discrete-time model. Our result can be regarded as a counterpart of the Gaussian approximation often employed for the Wright–Fisher diffusion (see Tataru et al. 2017 and the references therein).
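Now, we study the mean and median of the allele frequency using the short-time expression. The mean does not change in time since ⟨Z⟩ = 0, which is consistent with the neutrality. On the other hand, the median changes as
$$X_{\mathrm{med}}(\tau) = x_0 + c(\tau)\,Z_{\mathrm{med}}, \tag{27}$$
where $Z_{\mathrm{med}}$ denotes the median of Z. $Z_{\mathrm{med}}$ depends on x0 via β (see Equation 24), and $Z_{\mathrm{med}} < 0$ for x0 < 1/2. Equation 27 agrees with numerical simulations in the discrete-time model, while $X_{\mathrm{med}}(\tau)$ is close to the initial frequency x0 (see the red and black curves in Figure 9B).
The scaling property in Equation 2 immediately follows from Equation 27, since $c(\tau) \propto \tau^{1/\alpha}$. This scaling implies that there is a time-dependent bias driving the median of the allele frequency. Differentiating Equation 27 with respect to time gives
$$\frac{dX_{\mathrm{med}}}{d\tau} = V_{\mathrm{eff}}(x_0, \tau), \tag{28}$$
where the effective time-dependent bias is given by
$$V_{\mathrm{eff}}(x_0, \tau) = \frac{dc(\tau)}{d\tau}\,Z_{\mathrm{med}} = \frac{c(\tau)\,Z_{\mathrm{med}}}{\alpha\,\tau}. \tag{29}$$
Near the boundaries x = 0 and x = 1, $V_{\mathrm{eff}}$ is approximately given by
$$V_{\mathrm{eff}}(x, \tau) \approx \begin{cases} -C_\alpha\, x^{1/\alpha}\,\tau^{1/\alpha-1}, & x \ll 1, \\[1ex] \phantom{-}C_\alpha\,(1-x)^{1/\alpha}\,\tau^{1/\alpha-1}, & 1-x \ll 1, \end{cases} \tag{30}$$
where $C_\alpha$ is a positive constant.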
The advection term arises from a sampling bias
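Intuitively, the time-dependent bias arises from a time-dependence of the largest sampled offspring number (Figure 10). To see this, consider a typical trajectory of the allele frequency starting from x. Up to a short time τ, only jumps from x to $y \in (x - \Delta_-,\, x + \Delta_+)$ are likely to occur, where $\Delta_+$ and $\Delta_-$ can be estimated by the extremal criterion (Krapivsky et al. 2010),
$$\tau \int_{x+\Delta_+}^{1} dy\, w(y|x) \sim 1, \qquad \tau \int_{0}^{x-\Delta_-} dy\, w(y|x) \sim 1. \tag{31}$$
These conditions give
$$\Delta_+ \sim \left[x(1-x)^{\alpha}\,\tau\right]^{1/\alpha}, \qquad \Delta_- \sim \left[x^{\alpha}(1-x)\,\tau\right]^{1/\alpha}. \tag{32}$$
Because these small jumps cancel a part of the bias V(x) in Equation 22, the typical trajectory is then driven by the uncanceled residual part of the bias V(x),
$$V_{\mathrm{eff}}(x,\tau) \approx -\left[\int_0^{x-\Delta_-} + \int_{x+\Delta_+}^{1}\right] dy\,(y-x)\,w(y|x). \tag{33}$$
When x ≪ 1, the dominant contribution to this integral is from $y > x + \Delta_+$. Using $w(y|x) \approx x\,(y-x)^{-1-\alpha}$ from the first line of Equation 21 and $\Delta_+ \sim (x\tau)^{1/\alpha}$ from Equation 32, the above integral can be evaluated as $V_{\mathrm{eff}} \sim -x^{1/\alpha}\tau^{1/\alpha-1}$, which agrees with $V_{\mathrm{eff}}$ in Equation 30 for x ≪ 1 (up to the factor κ). When 1 − x ≪ 1, the dominant contribution to $V_{\mathrm{eff}}$ is from $y < x - \Delta_-$ and can be evaluated in a similar way, reproducing $V_{\mathrm{eff}}$ in Equation 30 for 1 − x ≪ 1.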
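The estimate for a rare allele can be written out step by step. The sketch below assumes that, for x ≪ 1, rightward jumps dominate and that the jump kernel behaves as $w(y|x) \approx x\,(y-x)^{-1-\alpha}$ with any constant prefactor absorbed into the unit of time; this normalization is an assumption of the sketch. It also uses $\Delta_+ \ll 1$ and α > 1, so that the upper integration limits contribute only subleading terms.

```latex
\begin{align}
1 &\sim \tau \int_{x+\Delta_+}^{1} \frac{x\,dy}{(y-x)^{1+\alpha}}
   \approx \frac{\tau\,x}{\alpha}\,\Delta_+^{-\alpha}
   \quad\Longrightarrow\quad
   \Delta_+ \sim (x\,\tau)^{1/\alpha}, \\
V_{\mathrm{eff}}(x,\tau)
  &\approx -\int_{x+\Delta_+}^{1} (y-x)\,\frac{x\,dy}{(y-x)^{1+\alpha}}
   \approx -\frac{x}{\alpha-1}\,\Delta_+^{\,1-\alpha}
   \sim -\,x^{1/\alpha}\,\tau^{1/\alpha-1}.
\end{align}
```

The bias therefore weakens over time as $\tau^{1/\alpha-1}$, because ever larger jumps have had a chance to occur and are no longer excluded from the typical trajectory.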
One interpretation of Equation 33 is that the bias V(x) in Equation 22 is mitigated by small jumps in a short time, and therefore, the integration over small jumps is excluded in Equation 33. Another interpretation is that, for typical short-time dynamics, small jumps and the bias V(x) are relevant, and, from the overall neutrality, the change caused by these two is equal to the negative of that caused by large jumps, thus resulting in Equation 33.
Allele frequency fluctuations are inconsistent with the Wright–Fisher diffusion
In the simulations, we found that, for 1 < α < 2, allele frequency fluctuations are inconsistent with the Wright–Fisher diffusion and characterized by super-diffusion with diffusion exponent 2/α (see Equation 5). This finding is readily explained by the short-time asymptotic in Equation 23. Recalling $c(\tau) \propto \tau^{1/\alpha}$ (Equation 25) and that the statistical properties of Z are independent of τ, we obtain
$$\mathrm{Median}\left[\left(X(\tau) - x_0\right)^2\right] = c(\tau)^2\,\mathrm{Median}\left[Z^2\right] \propto \tau^{2/\alpha}. \tag{34}$$
This scaling can also be justified heuristically by noting that, for α < 2, the square displacement is dominated by large jumps. During time τ, an allele frequency around x typically jumps to $x \pm \Delta_\pm$ given in Equation 32. When τ ≪ 1, it is easy to see that $\Delta_\pm \propto \tau^{1/\alpha}$ with x-dependent prefactors. Because the median SD is dominated by the largest displacements, it can be evaluated as
$$\mathrm{Median}\left[\left(X(\tau) - x\right)^2\right] \sim \Delta_\pm^2 \propto \tau^{2/\alpha}, \tag{35}$$
where τ ≪ 1 is assumed.
Long-time dynamics and extinction time
Above, we saw that at short times, allele frequencies carry out an unconstrained Lévy flight. This random search process, however, gets distorted as soon as the allele frequency starts to get in reach of one of the absorbing boundaries. Interestingly, the dynamics then enters a universal intermediate asymptotic regime that controls both the characteristic extinction time as well as establishment times and fixation probabilities.
To see this, let us consider the extinction dynamics of a trajectory starting from a small frequency (Figure 4). At short times, we can apply the short-time asymptotics in Equations 28 and 30. We expect Equations 28 and 30 to break down when the displacement computed from Equation 28 becomes comparable to x0, which occurs at . By taking a coarse-grained view, the rate of the frequency change in is roughly given by
(36) |
This suggests that, in a long timescale (), the median frequency decreases as
(37) |
where, up to a prefactor, the frequency-dependent bias is given by
(38) |
In Figure 9C, it is numerically shown that the long-time trajectory is consistent with Equation 37. By solving Equation 37, the median trajectory goes to extinction at (Equation 3), in agreement with our simulations (Figure 4). Note that, for , the bias in Equation 38 is replaced by .
Importantly, Equations 37 and 38 can also be rigorously justified from a scaling ansatz for the transition density. After some time, spreads broadly over the region with a peak at x = 0 (Figure 11A). As shown in Figure 11B, is consistent with the following scaling ansatz;
(39) |
where and is a function of . Up to an overall constant, can be determined analytically and expressed as an infinite series (see Appendix F). Note that the τ-dependent factor in Equation 39 is motivated from the fact that the extent over which the distribution spreads increases like . Equation 39 implies that, conditional on establishment at τ, the median frequency increases as . Then, Equation 38 follows by evaluating the bias in Equation 28 at and at , instead of at x0.
As a consistency check of the exponent in Equation 3, we consider two solvable, extreme cases. First, in the limit , the dependence on x0 in Equation 3 becomes linear. In the Wright–Fisher model, the mean extinction time can be obtained analytically by solving the backward equation (see, e.g., Karlin and Taylor 1981). The solution is proportional to x0 with a logarithmic correction, . Second, when , the mean extinction time no longer depends on x0. We can obtain this explicitly, by solving Equation 17: Using when , the solution is given by . Therefore, if we approximately define the mean extinction time as , we obtain , which is to leading order independent of x0 if x0 is taken to be of order one.
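The two limits above can be illustrated with a small numerical sketch. It rests on an assumption that is not spelled out in the surrounding text: the long-time bias is taken to be a pure power law, dx/dτ = −c x^(2−α), with an arbitrary constant c. This form is sublinear for 1 < α < 2 and yields an extinction time proportional to x0^(α−1), which becomes linear in x0 as α → 2 and independent of x0 as α → 1. The function names and the integration scheme are hypothetical.

```python
import numpy as np

def extinction_time(x0, alpha, c=1.0, shrink=0.99, x_min=1e-12):
    """Integrate the ASSUMED median dynamics dx/dtau = -c * x**(2 - alpha) down to x_min."""
    x, tau = x0, 0.0
    while x > x_min:
        dx = (1.0 - shrink) * x               # move x down by a fixed fraction per step
        tau += dx / (c * x ** (2.0 - alpha))  # time needed to cover dx at the local speed
        x -= dx
    return tau

def fitted_exponent(alpha):
    """Log-log slope of the extinction time against the initial frequency x0."""
    x0_values = np.array([1e-4, 1e-3, 1e-2, 1e-1])
    times = np.array([extinction_time(x0, alpha) for x0 in x0_values])
    slope, _ = np.polyfit(np.log(x0_values), np.log(times), 1)
    return slope

for alpha in (1.5, 1.8):
    print(alpha, fitted_exponent(alpha))
```

The fitted slope should be compared with α − 1. The cutoff x_min makes the scheme unreliable very close to α = 1, where the remaining time below the cutoff is no longer negligible.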
Natural selection and fixation probability
One important advantage of the forward-time perspective is that we can account for natural selection by introducing an appropriate bias favoring the beneficial variant. Suppose that the mutant type has a selective advantage s > 0, such that the average offspring number of mutants is increased by a factor of relative to the wild type. In the time-rescaled Chapman–Kolmogorov equation, this adds the term , where , into the advection V(x) of Equation 20.
The key observation underlying the argument below is that when X is sufficiently small, the selection force is negligible compared to the bias in Equation 38 because, while the former is approximately linear in X, the latter is sublinear. If the frequency happens to grow and reach a certain value Xc, the genuine selection begins to dominate over the bias, and the trajectory fixes with high probability (see Figure 12 for example trajectories and Figure 13A). By using Equation 38, the crossover point Xc can be estimated from the balance between the selection force and the sampling bias ,
(40) |
which gives
(41) |
For , the dynamics are essentially neutral (described by Equation 20), while, for , the trajectory grows almost deterministically. Therefore, the fixation probability of a beneficial mutation can be estimated by using the neutral fixation probability in a population of size . Although the full dynamics in Equation 20 is difficult to analyze, it is obvious that the neutral fixation probability is equal to the inverse of the population size. Therefore, we have
(42) |
which is valid for . Equation 42 reproduces our simulation results in Figure 5 for and, as , also reproduces the known result of the Wright–Fisher model, (up to a prefactor).
Site frequency spectrum
By using the time-dependent effective bias, we can also estimate the behavior of the SFS for frequent and rare variants. While the SFS is theoretically defined in the infinite alleles model, it can be computed from our biallelic framework (Ewens 1963): is defined as the expected number of neutral derived alleles in the frequency interval in a sampled population (here, the whole population). Because new mutations are assumed to arise uniformly in time, the SFS for unlinked neutral loci is given by the product of the total mutation rate μN and the mean sojourn time, namely, the average time an allele spends in the frequency interval until fixation or extinction.
First, we consider the low-frequency end, , of the SFS (see Cvijović et al. 2018 for a similar argument). Since the SFS is proportional to the sojourn time, trajectories whose maximum frequencies are x or slightly larger than x contribute dominantly to the SFS at x. Since these trajectories typically go extinct due to the bias, we can roughly estimate their sojourn times at x as the inverse of the “velocity”, in Equation 38. Since the probability that a trajectory grows above a frequency x is roughly given by , the SFS is proportional to
(43) |
Similarly, for the high-frequency end of the SFS, only the trajectories that grow above x can contribute to . Typically, these trajectories go to fixation due to the bias . Therefore, the SFS is proportional to
(44) |
The effect of the genuine selection on the SFS can also be studied by using the effective bias. See Appendix G.
Bimodality of stationary distribution
Now, we turn to explaining the bimodality observed at mutation-drift balance. We found that, when the mutation rates are small, the stationary allele frequency distribution is not U-shaped, as expected from the Wright–Fisher dynamics, but M-shaped, as shown in Figure 8. The M-shaped distribution arises from the balance between the mutational force and the effective bias (see Figure 13B). In the Chapman–Kolmogorov equation, the mutational force is given by
(45) |
which pushes the frequency toward the center . On the other hand, the effective bias, for and for , pushes a trajectory toward the closer boundary. Therefore, the positions where these two forces balance are approximately given by
(46) |
where c is a positive constant. If θ is sufficiently small, we can always find the balancing points. The presence of these two balancing points means that we can think of the allele frequency dynamics as a two-state system, essentially analogous to a super-diffusing particle in a double-well potential (see Figure 8C for a realization of trajectories). This explains the bimodal shape of the stationary distribution.
Finally, we remark that, even in the presence of natural selection, the balancing positions are still determined from the mutation-effective bias balance provided that : while the effective bias and the mutational term are sub-linear and constant respectively, the selection term is linear in x when . Thus, when θ is sufficiently small, the magnitude of the selection term around is negligible, and the peak positions are given by Equation 46.
Discussion
In this study, we analyzed the effect of power law offspring distributions on the competition of two mutually exclusive alleles. Our main reason to consider such broad offspring distributions is that they often emerge in evolutionary scenarios that inflate the reproductive value (Barton and Etheridge 2011) of a small set of founders. For example, range expansions blow up the descendant numbers of the most advanced individuals in the front of the population, an effect that has been called gene surfing (Hallatschek and Nelson 2008). Likewise, continual rampant adaptation boosts the descendant numbers of the most fit individuals. The resulting allele frequency dynamics becomes asymptotically similar to that of a population with scale-free offspring distributions.
In the case of narrow offspring distributions, which is the predominant assumption in population genetics, it is usually an excellent approximation to describe the allele frequency dynamics by a biased diffusion process, which forms the basis of powerful inference frameworks (Tataru et al. 2017). If the offspring distribution is broad, however, allele frequency trajectories are disrupted by discontinuous jumps resulting from so-called jackpot events, that is, exceptionally large family sizes drawn by chance from the offspring distribution. Our goal was to find an analytical and intuitive framework within which we can understand the main features of these unusual dynamics.
We found that the main counter-intuitive features can be understood and well-approximated from a competition of selection and mutations with a time-dependent emergent sampling bias, . The sampling bias favors the major allele and arises, because the sub-population carrying the major allele typically samples deeper into the tail of the offspring distribution than the minor allele fraction.
In the remainder, we first summarize the unusual population genetic patterns that can be explained by the action of these effective forces. We then discuss how broad offspring dynamics could be detected in natural populations and what its implications are for the dynamics of adaptation. Finally, we demonstrate that these dynamics are also ubiquitous in populations with narrow offspring distributions, when mutational jackpots are possible. Therefore, we believe our theoretical framework may be taken as a general null model for populations far from equilibrium.
Unusual dynamics
We found that the sampling bias effectively acts like time- and frequency-dependent selection. In the absence of true selection, drives the major allele to fixation, first rapidly and then gradually slowing down with time and proximity to fixation. The slowing down of the sampling bias near fixation also leads to an excess of high-frequency alleles, given continual influx of neutral mutations. This generates a high-frequency uptick in the SFS, which is characteristic of the tail of the offspring distribution. In mutation-drift balance, the allele frequency distribution is M-shaped, in contrast to the U-shape expected from the Wright–Fisher dynamics. The peaks reflect the balance between the mutational force and the sampling bias.
Non-neutral dynamics depend on whether the genuine selection force dominates over the sampling bias. The sampling bias tends to dominate near extinction or fixation, and wanes near 50% frequency. A de novo beneficial allele will not be able to fix unless it overcomes, by chance, the switch-point frequency at which genuine selection becomes stronger than the sampling bias. Finally, fluctuations in typical trajectories grow stronger over time. As a consequence, allele frequencies super-diffuse: fluctuations grow with time more rapidly than under the Wright–Fisher diffusion.
Detecting dynamics driven by broad offspring distributions
The time-dependent over-dispersion is most readily detected by plotting the median square displacement as a function of time (see Figure 4B). Testing deviations in this statistic is an attractive avenue for detecting deviations from the Wright–Fisher diffusion because the signal is strong for intermediate allele frequencies, which can be accurately measured by population sequencing. By contrast, the time-dependent bias vanishes when an allele has 50% frequency. So, the detection of the sampling bias requires accurate time series data of low frequency variants, which is difficult to obtain given sequencing errors.
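As an illustration of how such a test statistic might be computed from data, the sketch below estimates the median square displacement at several time lags and fits its log-log slope. It assumes a hypothetical array of allele frequencies with one row per locus and one column per sampled time point, with all loci starting near the same frequency. The function name and the Brownian control are illustrative and are not part of the analysis in this article.

```python
import numpy as np

def median_sd_exponent(freqs, lags):
    """Median squared displacement from the first time point, and its log-log slope.

    freqs: array of shape (n_loci, n_times); lags: increasing positive integers.
    """
    freqs = np.asarray(freqs, dtype=float)
    lags = np.asarray(lags)
    med = np.array([np.median((freqs[:, lag] - freqs[:, 0]) ** 2) for lag in lags])
    slope, _ = np.polyfit(np.log(lags), np.log(med), 1)
    return med, slope

# Control: ordinary diffusion (small Gaussian steps), for which the slope should be near 1.
rng = np.random.default_rng(3)
steps = 0.001 * rng.standard_normal((3000, 64))
control = 0.5 + np.concatenate([np.zeros((3000, 1)), np.cumsum(steps, axis=1)], axis=1)
med_control, slope_control = median_sd_exponent(control, [1, 2, 4, 8, 16, 32, 64])
print(slope_control)
```

A fitted slope clearly above the diffusive control would point toward over-dispersion. With real data, sequencing noise adds a lag-independent offset to the squared displacement, which would need to be handled before interpreting the slope.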
It is clear that a single super-diffusing but neutral allele would not abide by the diffusive Wright–Fisher null model and thus might be falsely considered as an allele under selection. But importantly, allele super-diffusion has an impact even on statistics that sum over many unlinked loci. This is significant for inference methods, for instance to detect polygenic selection, which argue that trait values follow a diffusion process, if not for an underlying Wright–Fisher dynamics of the allele frequencies then because they sum over many independent allele frequencies (Berg and Coop 2014). However, dynamics breaks both of these arguments. In particular, sums of many unlinked loci tend to non-Gaussian distributions (so-called alpha-stable distributions). Hence, for traditional inference methods based on the Wright–Fisher diffusion or standard central limit theorem (Tataru et al. 2017), an underlying super-diffusion process should be ruled out.
If time series are not available, broad offspring numbers can also be detected from the SFS (Neher and Hallatschek 2013). A telltale sign of the sampling bias is a characteristic uptick at the high-frequency tail of the SFS, which is difficult to generate by demographic variation (Neher and Hallatschek 2013). As we have shown, the shape of the uptick is characteristic of the tail of the offspring distribution (the parameter α).
Implications for the dynamics of adaptation
We found that the fixation probability depends quite sensitively on the broadness α of the offspring distribution (Equation 42). Accordingly, the dynamics of adaptation, which ultimately depend on the fixation of beneficial variants, should change quantitatively. To estimate these modifications, we consider an asexual population of constant size N with a broad offspring distribution with , wherein beneficial mutations occur at the rate . For low mutation rates, mutations sweep one after the other, but when mutation rates are sufficiently high, multiple mutations occur and most mutations are outcompeted by fitter mutations. Such a situation is known as clonal interference.
We can study the effect of the exponent α on the adaptation dynamics quantitatively by repeating the argument in Desai and Fisher (2007), wherein the variance of offspring numbers is assumed to be narrow. As discussed in Appendix H, clonal interference should occur if
(47) |
where s > 0 is the fitness effect of a mutation, which we assume to be constant. The rate R of adaptation is given by
(48) |
Note that the second line in Equation 48 reproduces Equation 5 of Desai and Fisher (2007) in the limit . Thus, the rate of adaptation depends only weakly (logarithmically) on α in the clonal interference regime, even though the condition for clonal interference in Equation 47 depends on α quite sensitively.
Emergence of skewed offspring distributions in models of range expansions
Our study can be regarded as an analysis of the population genetics induced by power-law offspring distributions. The main reason to consider these scale-free offspring distributions is that they quite generally emerge in models of stochastic traveling waves (Birzu et al. 2018). Such models are ubiquitous in population genetics because they describe a wide range of evolutionary scenarios, including range expansions, rampant asexual and sexual adaptation as well as Muller’s ratchet (Brunet et al. 2007; Desai et al. 2013; Kosheleva and Desai 2013; Neher and Hallatschek 2013; Schweinsberg 2017; Birzu et al. 2018). Our analysis should therefore apply most directly to these evolutionary scenarios, which we now demonstrate using a simple model of a range expansion. We end by discussing the question of whether some of our results may also arise in scale-rich offspring distributions.
Birzu et al. (2018) argued that any exponent can emerge in a simple model of range expansions that incorporates a tunable level of cooperativity between individuals (Figure 14A). The model can be described by a generalized stochastic Fisher–Kolmogorov equation
(49) |
for the time-dependent population density n(x, t) at position x in a linear habitat and time t. The growth rate r(n) is assumed to be density-dependent, with
(50) |
where the parameter accounts for co-operativity among individuals, which is also called an Allee effect. As discussed in Hallatschek (2018), lineages in the region of the wave tip are diffusively mixed within the timescale . This implies that, in this microscopic model, resampling from an offspring distribution roughly occurs every τmix generations. In Birzu et al. (2018, 2021), it was argued that depending on the strength of the Allee effect, the offspring distributions corresponding to any of the three distinct classes of the beta coalescent process can arise; namely, the Bolthausen–Sznitman coalescent when B < 2, the beta coalescent with when , and the Kingman coalescent when B > 4.
To demonstrate clearly that our present study can serve as a macroscopic analysis of the traveling wave model, we introduced reversible mutations in the traveling wave model and measured the mutant frequency of the first individuals from the edge of the front. Here, k is the spatial decay rate, i.e., where is the coordinate comoving with the expansion. This definition of the mutant frequency is reasonable because only the wave front has a skewed offspring distribution due to the founder effect. In Figure 14B, for B = 1 (left), 3 (middle), and 8 (right), the frequency distributions in the traveling wave model are shown when the mutation rate is small (orange jagged line) and when it is large (blue jagged line). The corresponding distributions in the macroscopic model are shown by black dotted lines. The stationary distributions in the traveling wave model agree well with those in the macroscopic model. In particular, the transition from the M-shaped or U-shaped distribution to the monomodal distribution is consistently reproduced in the traveling wave model. These results underscore the correspondence between the traveling wave with the Allee effect and the beta coalescent process.
The above-described correspondence suggests that the spatial area occupied by one allele type in a range expansion should behave statistically like the time-integral over the allele frequency in the Cannings model. In the context of adapting (non-spatial) populations, this quantity describes the total number of mutational opportunities of a mutant lineage (Desai and Fisher 2007; Weissman et al. 2009; Neher and Shraiman 2011). As presented in Appendix J, the distribution of the time-integrated frequency exhibits a scaling behavior that depends on the offspring distribution sensitively. While a full discussion is beyond the scope of this paper, we expect that the distribution of areas serves as a useful observable to distinguish different prototypes of traveling waves (Birzu et al. 2018).
Broad offspring distributions with a scale: While scale-free offspring distributions often emerge over an intermediate time scale (τmix in the above traveling wave model), there are also species that over single generations show broad offspring numbers and violate the Wright–Fisher diffusion. For such species, it may be more natural to consider offspring distribution with a characteristic scale. In ‘sweepstake’ reproduction (Eldon and Wakeley 2006), a fixed and finite fraction of the population is replaced at every sweepstake event (specified by the parameter in Eldon and Wakeley (2006)). Because sets a characteristic scale in offspring numbers, power law relationships for the median of allele frequencies as well as frequency fluctuations cannot be expected, which we confirm in Appendix K. Nevertheless, the qualitative features of a sampling bias can be recognized quite clearly for sweepstake reproduction as well.
Either type of model is ultimately an approximation to true offspring distributions, and which one to use depends on the situation. As we argued, the beta-coalescent, along with the forward-in-time model described in this article, is the natural choice for range expansions, rapid adaptive processes, or other scenarios where the reproductive value of a chosen few is highly inflated.
Data availability
The authors state that all data necessary for confirming the conclusions presented in the article are represented fully within the article.
Funding
Research reported in this publication was supported by the National Institute of General Medical Sciences of the National Institutes of Health under award R01GM115851, a National Science Foundation CAREER Award (#1555330), a Simons Investigator award from the Simons Foundation (#327934), RIKEN iTHEMS Program, and JSPS KAKENHI (JP19K03663).
Conflict of interest
The authors declare that there is no conflict of interest.
Acknowledgments
The authors thank Benjamin H. Good, Daniel B. Weissman, Jiseon Min, Joao Ascensao, Michael M. Desai, and Stephen Martis for their helpful discussions and comments.
Appendix A: Analytic results in the marginal case α = 1
Although the main target of our present study is the case of , we here provide analytical results for α = 1, which have not been derived before.
Site frequency spectrum in the presence of genuine selection
The transition density for α = 1 in the presence of natural selection is derived in Hallatschek (2018) (see Kosheleva and Desai 2013 for neutral case). In x space, it is given by
(A.1) |
where and σ is the selective advantage [there is an erratum in Equation 38 in Hallatschek (2018)].
For the purpose of computing the SFS (or, equivalently, the mean sojourn time), we set . Since we are considering the large N limit, the denominator of Equation A.1 can be rewritten as
(A.2) |
Thus, the transition density for can be written as
(A.3) |
Near the boundaries, this can be approximated as
(A.4) |
The SFS is given by , where μ is the mutation rate per generation, and t(x) is the mean sojourn time density, which is given by
(A.5) |
Next, we compute the integrals in Equation A.5, asymptotically close to the absorbing boundaries (see Equation A.14 for the final results). To evaluate Equation A.5 for , we first consider the integral,
(A.6) |
When has a sharp peak at , we approximate this integral as
(A.7) |
In our case,
(A.8) |
where . takes the maximum value at .1 At ,2 and . The saddle-point evaluation in Equation A.7 is precise when . By using these expressions, can be evaluated as
(A.9) |
By setting , we find
(A.10) |
Next, to evaluate Equation A.5 for the high-frequency end, we consider the following integral
(A.11) |
When , the integrand takes the maximum value at the boundary η = 0. Thus,
(A.12) |
By setting , we find
(A.13) |
In summary, the SFS in Equation A.5 is given by
(A.14) |
Note that the dependence on σ disappears when . Figure A1 shows the plots of the SFS.
For comparison, we write the SFS for the Wright–Fisher model () (see, e.g., Crow and Kimura 1970; Evans et al. 2007);
(A.15) |
The asymptotic forms near the boundaries are given by
where we have expanded the SFS around x = 1 up to the sub-leading order. For a sufficiently strong selection (), the SFS increases with x at the high-frequency end. However, unlike the case of , the increase is not strong and the SFS approaches the constant as .
Dynamics of the median of allele frequencies
When α = 1, we can derive a simple differential equation that describes the median of trajectories. In the logit space, the transition density is given by
(A.16) |
where . The median (at a given time point ρ) is characterized by
(A.17) |
From the symmetry of cosh, the median is given by the peak of the transition density;
(A.18) |
By differentiating Equation A.18 with respect to ρ and eliminating ψ0, we obtain
(A.19) |
Noting that , we find
(A.20) |
Since the median is invariant under a coordinate transformation, the median in the x space is simply related to via the logit transformation, . By differentiating this with respect to time and using Equation A.20, we obtain
(A.21) |
Allele frequency dynamics conditioned on fixation
By using Bayes’ theorem, the probability distribution of the allele frequency conditioned on fixation can be written as
(A.22) |
(A.23) |
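For orientation, the Bayes step can be written in generic notation as follows. The symbols used here are placeholders chosen for this sketch and need not match the notation of Equations A.22 and A.23: P(x, t | x0) denotes the unconditioned transition density and p_fix(x) the fixation probability from frequency x. The second equality uses the Markov property, so that the probability of eventual fixation depends only on the current frequency.

```latex
P(x, t \mid x_0, \mathrm{fix})
  = \frac{P(\mathrm{fix} \mid x)\, P(x, t \mid x_0)}{P(\mathrm{fix} \mid x_0)}
  = \frac{p_{\mathrm{fix}}(x)}{p_{\mathrm{fix}}(x_0)}\, P(x, t \mid x_0).
```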
The fixation probability for the initial frequency x0 is given by (see Hallatschek 2018)
(A.24) |
In particular, the fixation probability of a single mutant is given by
(A.25) |
By using Equation A.24, the conditioned probability in Equation A.23 is computed as
(A.26) |
Appendix B: Stationary distributions of traveling wave model in the presence of natural selection
In Figure 14 of the main text, the mutant allele is assumed to be neutral. Here, we provide the results in the case where mutants have a fitness advantage σ (Figure B1). As in the main text, symmetrically reversible mutations are assumed.
Appendix C: Generalized central limit theorem
Here, we briefly summarize the generalized central limit theorem (Gnedenko and Kolmogorov 1968; Uchaikin and Zolotarev 1999). Suppose that each random number ui is sampled from the Pareto distribution and consider the shifted and rescaled random variable ζ;
(C.1) |
where an and bn are
(C.2) |
It is well-known that the distribution of ζ is well-approximated by the α-stable distribution, which we denote as . While an explicit expression of is not available in general, the characteristic function is given by
(C.3) |
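The scaling content of the generalized central limit theorem can be checked with a short numerical sketch. It assumes a Pareto law with unit minimum, P(U > u) = u^(−α) for u ≥ 1, and 1 < α < 2, so that the mean α/(α − 1) is finite while the variance diverges. In that case the width of the centered sum is expected to grow roughly like n^(1/α) rather than n^(1/2). The normalization used here is an assumption and may differ by constants from the an and bn of Equation C.2. The function names are illustrative.

```python
import numpy as np

def centered_pareto_sums(n, alpha, n_rep, rng):
    """Sums of n Pareto(alpha) variables (minimum 1), centered by n times the mean."""
    u = (1.0 - rng.random((n_rep, n))) ** (-1.0 / alpha)  # inverse-CDF sampling
    mean = alpha / (alpha - 1.0)                          # finite only for alpha > 1
    return u.sum(axis=1) - n * mean

def width_exponent(alpha, sizes=(100, 400, 1600), n_rep=2000, seed=1):
    """Log-log slope of the interquartile range of the centered sum against n."""
    rng = np.random.default_rng(seed)
    widths = []
    for n in sizes:
        s = centered_pareto_sums(n, alpha, n_rep, rng)
        q25, q75 = np.percentile(s, [25, 75])
        widths.append(q75 - q25)
    slope, _ = np.polyfit(np.log(sizes), np.log(widths), 1)
    return slope

slope = width_exponent(1.5)
print(slope)
```

The interquartile range is used instead of the standard deviation because the latter is dominated by rare, very large draws when the variance is infinite. At these moderate values of n, the fitted slope is only approximately 1/α, since convergence to the stable law is slow.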
Appendix D: The transition density of an allele frequency and the asymptotic dynamics for large N
Allele-frequency change in a generation is characterized by the transition density , which is the probability distribution of the allele frequency y at the next generation given the current allele frequency x. When N is large, the asymptotic dynamics can be described by a time-continuous differential Chapman–Kolmogorov equation, which is defined by an advection velocity V(x), diffusion coefficient D(x), and jump kernel (Gardiner 2009). The triplet is obtained from the transition density as follows:
(D.1) |
where is an N-dependent timescale, corresponding to one generation measured in units of the coalescent timescale. In the following, we derive the transition density and the asymptotic dynamics for general α by using a computational technique similar to that used in Hallatschek (2018), wherein the case of α = 1 is studied extensively.
As mentioned in the main text, when , the binomial sampling error is negligible for large N compared to the stochasticity coming from broad offspring number fluctuations, and we can replace the binomial distribution in Equation 13 of the main text with the Dirac delta function;
(D.2) |
Here means the average over and . Using the variable , we can rewrite wN as
(D.3) |
Here,
(D.4) |
with
(D.5) |
To use the properties of the α-stable distributions in Appendix C, we further rewrite as follows:
(D.6) |
When N is large, the quantities in the two brackets in the last line can be approximated by the characteristic functions of α-stable distribution, Equation C.3, with and , respectively. Thus, when , Equation D.6 can be computed as
(D.7) |
In the following, we evaluate the integral expression of and compute the transition density from Equation D.3.
When
By using Equation C.2,
(D.8) |
we have
(D.9) |
By setting becomes
(D.10) |
By differentiating it with respect to y, we obtain
(D.11) |
Note that this does not depend on N, which is consistent with the fact that the coalescent time is when .
When
By using Equation C.2,
(D.12) |
Equation D.7 becomes
(D.13) |
By changing the variable of integration as , we have
(D.14) |
By changing the variable of integration as and redefining as σ, we have
(D.15) |
where
(D.16) |
The transition probability is given by
(D.17) |
Consider the integral
(D.18) |
where . Then, the transition probability can be written as
(D.19) |
From Watson’s lemma, the integral can be expressed as a series expansion;
(D.20) |
By substituting Equation D.20 into Equation D.19 and writing , we obtain
(D.21) |
The leading order (m = 1) is given by
(D.22) |
Equation 21 in the main text can be obtained by introducing the continuous time where . Equation 22 follows from the neutrality . Note that the expansion of Equation D.20 is possible only when is finite, i.e., when where ϵ is an N-independent positive constant. Although in D.22 diverges as , this divergence is not a problem, because the jump term of the asymptotic dynamics in Equation 20 can be obtained from for (see Gardiner 2009).
When α = 2
an and bn are given by
(D.23) |
Equation D.7 then becomes
By changing the variable of integration as ,
(D.24) |
where is the Gauss error function
\mathrm{erf}(z) = \frac{2}{\sqrt{\pi}} \int_0^{z} e^{-t^{2}}\, dt \qquad (D.25)
By differentiating with respect to y, we have
(D.26) |
Suppose that ϵ is a sufficiently small but finite constant. For , wN can be approximated as
(D.27) |
where . From the symmetry of , the advection term is zero. The diffusivity D is given by
(D.28) |
where we have introduced the natural timescale as and used the integral approximation
(D.29) |
Finally, the jump kernel asymptotically vanishes on the time scale ,
(D.30) |
because for fixed x, y with becomes exponentially small as N becomes large.
Thus, in the large-N limit, α = 2 corresponds to the Wright–Fisher diffusion for a population of effective size
(D.31) |
When
In this case, since the Pareto distribution has finite mean and finite variance , and the large N limit of the allele frequency dynamics should be described by the Wright–Fisher diffusion process. To confirm this more generally, we consider a general distribution with finite mean and variance, namely, consider that each individual’s offspring number ui is sampled from a distribution with mean a and variance b2. Then, from the central limit theorem, the shifted and rescaled variable
(D.32) |
obeys the normal distribution . Its characteristic function is given by . Thus, we have
(D.33) |
By setting ,
(D.34) |
Thus, we obtain
(D.35) |
where . For the Pareto distribution, .
For becomes exponentially small as N becomes large, and so the jump term does not exist in the asymptotic dynamics; . For , we can approximate as
(D.36) |
where . From the symmetry of , the advection is zero. Finally, the diffusion is evaluated as
(D.37) |
Thus, by re-scaling time as , we obtain
(D.38) |
which corresponds to the Wright–Fisher diffusion of a population of effective size . Notice as , indicating that the concept of the effective population size breaks down when the variance of the offspring distribution diverges.
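As a worked example, the moments can be written out for a Pareto law with unit minimum, P(U > u) = u^(−α) with α > 2. The sketch assumes that the effective size takes the form N_e = N a²/b², that is, the drift variance per generation is x(1 − x) b²/(a² N) with no additional binomial sampling noise. This normalization is an assumption and may differ by constant factors from Equation D.38.

```latex
a = \frac{\alpha}{\alpha - 1}, \qquad
b^{2} = \frac{\alpha}{(\alpha - 1)^{2} (\alpha - 2)}, \qquad
\frac{N_e}{N} = \frac{a^{2}}{b^{2}} = \alpha (\alpha - 2)
\;\longrightarrow\; 0 \quad (\alpha \to 2^{+}).
```

For instance, α = 3 gives a = 3/2 and b² = 3/4, so that a²/b² = 3 under this convention.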
Appendix E: From Lambda-Fleming-Viot Generator to differential Chapman–Kolmogorov equation
In Appendix D, the jump density is derived from the generalized Wright–Fisher sampling, Equation 13 in the main text. Here, we present another more formal derivation of the jump density for . See Hallatschek (2018) for the case α = 1.
Jump density for general Λ measure
The backward generator of the Λ coalescent process for the biallelic model (see, e.g., Etheridge et al. 2010; Griffiths 2014) is given by
(E.1) |
This can be rewritten as a sum of two terms:
(E.2) |
where
(E.3) |
(E.4) |
We introduce the integration variable for A and for B, respectively. By writing
(E.5) |
A and B become
(E.6) |
(E.7) |
Defining the jump kernel as
(E.8) |
we can formally rewrite the generator as
(E.9) |
where
(E.10) |
When the measure is the Beta distribution Beta
We take the Beta distribution as the Λ measure, which corresponds to the descendant distribution considered in this study, :
(E.11) |
With this measure, A and B become
(E.12) |
(E.13) |
Note that the integrals A and B are convergent for , because, near , the terms inside are and so the integrands are . The jump kernel is given by
(E.14) |
When , this density agrees with Equation 21 of the main text (up to a proportionality constant). The advection is given by
(E.15) |
Note that, when , the limit in Equation E.15 does not exist, although this divergence is rather formal since there exists a natural cutoff for a finite-size population.
Appendix F: The transition density for the differential Chapman–Kolmogorov equation for
Here we derive the short-time transition density given in Equations 23 and 24 and determine in the scaling ansatz given in Equation 39.
The short-time transition density
Before discussing the CK equation in Equation 18, it is instructive to start from the simple diffusion equation,
(F.1) |
with the initial condition . The solution of this initial value problem is given by
(F.2) |
which is usually derived from the Laplace-Fourier transformation. However, this solution can also be obtained by using the central limit theorem: Equation F.1 is equivalent to a Brownian motion where jumps occur with rate , where a and m are related with D via . Since jumps occur in time τ, the displacement is approximately given by where . Then, from the central limit theorem, is distributed according to the normal distribution with mean and variance , namely, Equation F.2. Note that, even if the diffusion constant depends on x, the solution in Equation F.2 (with ) is valid in short times.
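The central-limit argument above can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the convention ∂ρ/∂τ = D ∂²ρ/∂x² (so that the Gaussian has variance 2Dτ) and a symmetric walk with steps ±a occurring at a total rate 2D/a². Both the rate convention and the function name `displacement_samples` are illustrative assumptions.

```python
import random
import statistics

def displacement_samples(D, tau, a, n_rep, seed=0):
    """Sample displacements of a symmetric random walk with step size a.

    Assumption (not from the source): steps of +a or -a occur at total rate
    2 * D / a**2, so n = 2 * D * tau / a**2 steps happen in time tau.
    """
    rng = random.Random(seed)
    n = round(2.0 * D * tau / a**2)  # number of jumps in time tau
    samples = []
    for _ in range(n_rep):
        # number of +a steps among n fair coin flips, read off a random n-bit integer
        k_plus = bin(rng.getrandbits(n)).count("1")
        samples.append(a * (2 * k_plus - n))
    return samples

D, tau, a = 0.5, 1.0, 0.05
xs = displacement_samples(D, tau, a, n_rep=4000)
empirical_var = statistics.pvariance(xs)
predicted_var = 2.0 * D * tau  # variance of the Gaussian under the assumed convention
print(empirical_var, predicted_var)
```

With these parameters each replicate sums 400 steps, and the sample variance should lie close to the predicted value 2Dτ.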
Essentially the same argument can be applied to the CK dynamics, except that the generalized central limit theorem should be employed since the variance of jump sizes is divergent in the case of the CK dynamics. Suppose that the initial density is given by (for notational simplicity, the subscript 0 on x is dropped). In the CK dynamics, the frequency change is caused by the bias V(x) in Equation 20 and by stochastic jumps. The rate of a frequency-increasing jump and that of a frequency-decreasing jump are given by
(F.3) |
(F.4) |
respectively. Therefore, the expected number n of jump events in time τ is given by
(F.5) |
Because randomness in the number of jump events is negligible compared to that in jump sizes, it can be assumed that exactly n jumps occur in time τ. Then, the displacement can be written as
(F.6) |
where denotes the displacement due to the i-th jump. For small τ, for , which means that are independent and identically distributed. From Equation 19, each li is approximately sampled from the following power-law distribution,
(F.7) |
where the factor (resp. ) represents the probability that a given jump is frequency-increasing (resp. frequency-decreasing). P(l) is normalized as . Note that, in Equation F.7, the original range of l has been extended to . Under this modification, the variance is no longer well-defined. However, this modification does not alter short-time properties of typical events, because the presence of the boundaries at x = 0, 1 is not important for them.
Because P(l) has a divergent variance and the number of jumps n is large even for small τ (as ), the generalized central limit theorem states that the sum in Equation F.6 obeys an α-stable distribution. The stable distribution is characterized by the parameters given below (see, e.g., Uchaikin and Zolotarev 2011). The mean is
(F.8) |
Asymptotically, P(l) satisfies
(F.9) |
Note . The parameters and β are determined from ;
(F.10) |
(F.11) |
Then, from the generalized central limit theorem, the random variable,
(F.12) |
has the following characteristic function,
(F.13) |
We can determine the characteristic function for , using Equation F.13 and the relation
(F.14) |
which follows from Equations F.6 and F.12. While V(x) and are divergent in the limit , we can show, by using Equation F.8 and , that these divergent terms exactly cancel out each other. Therefore, the displacement is simplified as
(F.15) |
Equations 24 and 23 in the main text are the same as Equations F.15 and F.13 (with the replacement of ). By substituting this into Equation F.13, we obtain the characteristic function of the allele frequency ;
(F.16) |
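The generalized central limit theorem used in this subsection implies that the spread of a sum of n heavy-tailed jumps grows like n^(1/α) rather than n^(1/2). The sketch below illustrates this with one-sided Pareto variables (survival function u^(−α) for u ≥ 1), which is a simplifying assumption; the two-sided jump distribution of Equation F.7 is not reproduced here. The helper names are hypothetical.

```python
import random

def pareto(rng, alpha):
    # inverse-transform sample with Pr(U > u) = u**(-alpha) for u >= 1
    return (1.0 - rng.random()) ** (-1.0 / alpha)

def iqr_of_sums(n, alpha, n_rep, rng):
    """Interquartile range of sums of n Pareto variables over n_rep replicates."""
    sums = sorted(sum(pareto(rng, alpha) for _ in range(n)) for _ in range(n_rep))
    return sums[(3 * n_rep) // 4] - sums[n_rep // 4]

rng = random.Random(1)
alpha = 1.5
# Increase n by a factor of 16 and compare the spread of the sums.
ratio = iqr_of_sums(1600, alpha, 1000, rng) / iqr_of_sums(100, alpha, 1000, rng)
# Stable-law scaling predicts 16**(1/alpha); ordinary CLT scaling would give 16**0.5.
print(ratio, 16 ** (1 / alpha), 16 ** 0.5)
```

The interquartile range is used instead of the standard deviation because the variance of the sum is not defined for α < 2.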
The scaling ansatz for the long-time transition density in Equation 39
Consider the initial distribution with . After some time, the distribution spreads over the region with a peak at the extinction boundary x = 0. As presented in Equation 37 of the main text, up to a constant prefactor, takes the following form
where and . Here, we present an analytic argument to determine .
Equation 20 can be rewritten as
(F.17) |
where given by Equation 21. For is approximately given by
(F.18) |
We substitute the ansatz into the above CK equation. The left-hand side of the CK equation becomes
(F.19) |
which is proportional to . The right-hand side is decomposed into the integrals over and those over . We can show that the former is proportional to , while the latter is proportional to (see footnote 3). Since the extinction time for the initial frequency is much shorter than the coalescent timescale, we can assume , which implies that the integrals over are negligible compared to those over . By evaluating the integrals over using the scaling form of and comparing them with Equation F.19, we have
(F.20) |
where is the Heaviside step function. Note that the variable of integration has been changed from Δ to , and the upper bound in the integral has been extended into , to make the equation analytically tractable. It is convenient to express Equation F.20 in terms of ;
(F.21) |
The solution of the integro-differential equation in Equation F.21 can be obtained as a series expansion. Assume, for small ξ,
(F.22) |
where c1 is a normalization and the exponent of the leading term is denoted by . Here, is required since we are considering the situation where is monotonically decreasing in x, while is required to make normalizable. By substituting Equation F.22 into Equation F.21, we have
(F.23) |
Since for , in order for the two sides to be balanced, the coefficient needs to be zero, which is possible only when diverges. Since and , we can conclude . Therefore, the leading term of is given by
(F.24) |
More generally, by starting from the ansatz,
(F.25) |
the coefficients can be determined iteratively:
(F.26) |
By using this iteratively, we can express as
(F.27) |
where is the Pochhammer symbol, . The analytic expression of can be obtained from this using .
On the other hand, for , we expect that decreases in the same way as the offspring distribution does;
(F.28) |
Therefore, we expect there is a crossover point such that for and for . The scaling form for can indeed be confirmed by considering the following ansatz for ,
(F.29) |
where is a normalization and is an exponent to be determined. Substituting this ansatz into Equation F.21, we can show , leading to for .
Finally, we remark that, while Equation F.27 is derived assuming , the series converges for any . This indicates that the scaling form for large ξ should directly follow from a resummation of the infinite series in Equation F.27. In fact, numerical evaluation of a finite truncation of the series indicates the crossover behavior Equation F.29 (see Figure F1).
Appendix G: Site frequency spectra in presence of selection
Here, we discuss the effect of genuine selection on the SFS by using the effective bias when . As discussed in the main text, there is a crossover point xc, shown in Equation 39, below which selection is negligible compared to the effective bias (see Figure 13). Thus, we can expect that the SFS becomes independent of the selective advantage σ for a sufficiently small frequency x. Similarly, for the high-frequency end , selection is negligible compared with the effective bias. Therefore, we expect that even in the presence of natural selection. In particular, the exponent is independent of σ. Figure G1 shows the numerical results when . As x approaches 0, the SFS becomes independent of the selective advantage σ. For frequent variants , the SFS can be fitted well by , while the magnitude of the SFS increases with σ. A similar result can be obtained analytically when α = 1 (see Appendix A).
Appendix H: Derivation of the rate of adaptation in Equation 46 of the main text
Here, we conjecture the rate of adaptation for an asexual population with a broad offspring distribution () in the clonal-interference regime, using a self-consistency condition argument described in Desai and Fisher (2007).
We assume that mutations have a fixed effect s much larger than the mutation rate at which they arise. First, we consider the dynamics of the fittest sub-population that becomes established at the nose of the fitness wave. We can estimate the size of the sub-population when established from the establishment probability of a single fittest mutant;
(H.1) |
where () is the fitness lead of the sub-population compared with the mean of the whole population, and the fixation probability is given by Equation 42, . In the time this sub-population is seeded and becomes established, the mean fitness should increase by s. This implies that, after its establishment, this sub-population will initially grow exponentially at rate . The growth rate will slow down to 0 when it fixes. Therefore, the time from establishment to fixation can be estimated as
(H.2) |
where is its average growth rate between the establishment and fixation. Thus, the rate of adaptation is given by
(H.3) |
Second, we focus on successive establishment events at the edge of the fitness wave. We define t_est as the mean time interval between two successive establishments. An established sub-population grows like , from which the next establishment event is produced with rate . Therefore, t_est can be estimated from
(H.4) |
which leads to . Since the nose of the fitness wave advances at a speed , we have
(H.5) |
By comparing Equations H.3 and H.5, we obtain
(H.6) |
By substituting into Equation H.6, we obtain
(H.7) |
where we used . In the limit , the above results reproduce those in Desai and Fisher (2007).
The case of α = 1 can be discussed in a similar way. Suppose that the population is monoclonal. The fixation probability of a mutant is given by (see Equation A.25), which implies that the establishment size is roughly given by . While the timescale of establishment of a mutant is given by , the timescale of fixation is given by . Thus, the successive selection sweeps occur if , or equivalently,
(H.8) |
By substituting into Equation H.6, the rate of adaptation in the clonal-interference regime is given by
(H.9) |
In the successive-sweeps regime, the adaptation rate is given by
(H.10) |
Note that clonal interference becomes unlikely to occur as the offspring distribution becomes broader. For example, when α = 1, the population size needs to be for to satisfy .
Figure H1 shows the numerical results of the adaptation rate R versus the selection coefficient s. The parameters used in the simulation are in the regime of clonal interference. When , R is approximately proportional to s², while, when α = 1, R is approximately proportional to s³, which are consistent with Equations H.6 and H.9. However, when α = 1, the quantitative agreement between the numerical result and the theoretical prediction is poor, and further investigation is needed to validate Equation H.9.
Appendix I: Numerical simulations
Simulations are implemented in C++ using the random number generators of the GNU Scientific Library. Results obtained from the simulations are analyzed with Mathematica. The code is freely available upon request.
Numerical synthesis of Pareto random variables and α-stable distribution
In order to generate the mutant frequency of the gamete pool, we need to compute the sums of random Pareto variables,
(I.1) |
where ui, vi are drawn from the Pareto distribution . One simple way to synthesize ui, vi is to sample a number r from the uniform distribution on (0, 1) and compute .
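The inverse-transform recipe described above can be sketched as follows. This is an illustration in Python rather than the authors' C++ code, and it assumes the Pareto variable has survival function Pr(U > u) = u^(−α) for u ≥ 1, in which case a uniform number r maps to r^(−1/α). The function name `sample_pareto` is hypothetical.

```python
import random

def sample_pareto(alpha, rng):
    """Inverse-transform sample with Pr(U > u) = u**(-alpha) for u >= 1 (assumed form)."""
    r = 1.0 - rng.random()  # uniform on (0, 1]; avoids r == 0
    return r ** (-1.0 / alpha)

rng = random.Random(42)
alpha = 1.5
draws = [sample_pareto(alpha, rng) for _ in range(100_000)]

# Sanity check: the fraction of draws above 2 should approach 2**(-alpha).
frac_above_2 = sum(u > 2.0 for u in draws) / len(draws)
print(frac_above_2, 2.0 ** (-alpha))
```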
To generate the sums M and W efficiently for large N (e.g., ), we can use the generalized central limit theorem when xN and are large. In simulations, when xN < 100, M is generated directly by synthesizing random variables , while, when , M is generated by sampling a random number ζ from the α-stable distribution and then determining from Equation C.1. W is generated in a similar way.
After generating M and W, the population is updated by binomial sampling with the success probability (although this sampling step can be omitted when since the fluctuations associated with the binomial sampling are negligible compared to the fluctuations associated with M and W). Natural selection and mutations are implemented by modifying the success probability as
(I.2) |
where is the mutation rate from the wild-type to the mutant allele, and is the mutation rate in the reverse direction.
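A single generation of the sampling scheme described in this subsection can be sketched as below for the neutral case only. The sketch sums Pareto variables directly (no α-stable shortcut), takes M/(M + W) as the success probability, and draws the binomial sample as N Bernoulli trials. The modification of the success probability by selection and mutation is deliberately left out, and all function names are hypothetical.

```python
import random

def pareto(alpha, rng):
    # assumed form: Pr(U > u) = u**(-alpha) for u >= 1
    return (1.0 - rng.random()) ** (-1.0 / alpha)

def next_frequency(x, N, alpha, rng, resample=True):
    """One neutral generation: Pareto offspring sums, then optional binomial sampling."""
    n_mut = round(x * N)
    if n_mut == 0 or n_mut == N:
        return n_mut / N  # extinction and fixation are absorbing
    M = sum(pareto(alpha, rng) for _ in range(n_mut))      # mutant offspring pool
    W = sum(pareto(alpha, rng) for _ in range(N - n_mut))  # wild-type offspring pool
    p = M / (M + W)
    if not resample:
        return p  # skip the binomial step (reasonable only for large N)
    k = sum(rng.random() < p for _ in range(N))  # binomial(N, p) as N Bernoulli trials
    return k / N

rng = random.Random(0)
x = 0.3
for _ in range(5):
    x = next_frequency(x, N=200, alpha=1.5, rng=rng)
    print(x)
```

Drawing the binomial sample as explicit Bernoulli trials keeps the sketch dependency-free; it is only practical for modest N.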
Site frequency spectrum
Since the SFS is proportional to the mean sojourn time, the SFS can be computed numerically by generating trajectories starting with until fixation or extinction and measuring how many times a trajectory visits a given frequency interval on average.
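The sojourn-time estimate described above can be sketched as follows. The sketch assumes neutral dynamics, a single initial mutant (frequency 1/N), direct Pareto sums and equal-width frequency bins; the small values of N and of the number of trajectories are chosen only to keep the example fast. Names such as `mean_sojourn` are hypothetical.

```python
import random

def pareto(alpha, rng):
    # assumed form: Pr(U > u) = u**(-alpha) for u >= 1
    return (1.0 - rng.random()) ** (-1.0 / alpha)

def step(k, N, alpha, rng):
    """Advance the mutant count k by one neutral generation."""
    M = sum(pareto(alpha, rng) for _ in range(k))
    W = sum(pareto(alpha, rng) for _ in range(N - k))
    p = M / (M + W)
    return sum(rng.random() < p for _ in range(N))

def mean_sojourn(N, alpha, n_traj, n_bins, rng):
    """Mean number of generations a trajectory spends in each frequency bin."""
    visits = [0] * n_bins
    for _ in range(n_traj):
        k = 1  # start from a single mutant
        while 0 < k < N:
            visits[(k * n_bins) // N] += 1
            k = step(k, N, alpha, rng)
    return [v / n_traj for v in visits]

rng = random.Random(3)
sfs = mean_sojourn(N=50, alpha=1.5, n_traj=1000, n_bins=10, rng=rng)
print(sfs)
```

The returned list is proportional to the SFS up to the bin width and the mutation supply rate, neither of which is modeled here.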
Numerical simulation of the model of range expansion in the main text
We first review the numerical implementation of the range expansion model with two neutral alleles without mutations (Birzu et al. 2018). The per capita growth rate r(n) with an Allee effect is given by
(I.3) |
where is the sum of the two population densities, and B is the strength of cooperativity. In each deme, there are three types: allele 1, allele 2, and “empty.” At each time step, the configuration of deme x is updated by the trinomial sampling process with
(I.4) |
where is the population density after migration,
(I.5) |
and in the denominator of Equation I.4 is the sum of these densities, , and a denotes the width of a deme. The expectation value of the total density n after one time step is given by
(I.6) |
which explains the denominator of Equation I.4. In the simulation, a = 1 and τ = 1 are used.
As in the standard Wright–Fisher model, a mutation process can be introduced by using the success probabilities given by
(I.7) |
where and U is a matrix representing mutational transitions. In the case of symmetrical mutations in the main text, U is given by
(I.8) |
This model serves as a microscopic description of our (non-spatial) macroscopic model of the population with a broad offspring distribution . We can argue the relation between the parameters in the two models by comparing the coalescent timescales. As established in Birzu et al. (2018), for a semi-pushed wave (), the coalescent timescale is given by
(I.9) |
where is the ratio of the Fisher velocity to the wave velocity . On the other hand, the coalescent timescale in the macroscopic description for is proportional to (see Equation 15). By comparing the exponents, a semi-pushed wave with B corresponds to the macroscopic model with (see footnote 4)
(I.10) |
For example, B = 3 corresponds to . In addition, the mutation rate per generation in the microscopic model and the mutation rate per generation in the macroscopic model should be related by .
The following parameters are used in the three panels (Left, Center, Right) of Figure 14B of the main text.
Left: for the microscopic model, and for the macroscopic model.
Center: for the microscopic model, and for the macroscopic model.
Right: for the microscopic model, and the Wright–Fisher model, for the macroscopic model.
In all three cases, the growth rate and the migration probability are used in the microscopic model, and the population size is used in the macroscopic model. Note that, to compare the microscopic model with the macroscopic model, the value of the carrying capacity K for each case is chosen such that the size of the front population , where k is the spatial decay rate of the population density (see footnote 5), approximately agrees with the population size in the macroscopic model.
Appendix J: Areas swept by trajectories
J-1: A scaling argument on area distributions
Consider frequency trajectories that start from a single mutant and are eventually absorbed either at x = 0 or at x = 1. For each such trajectory, we can define the area in -space swept by the trajectory (see Figure J1),
(J.1) |
where is the absorption time of the trajectory. While this quantity is defined for a population without spatial structure, we expect that it has a natural interpretation in a model of range expansion as a spatial integration over the mutant frequency (i.e., the abundance of the mutant type), since τ in Equation J.1 is related with the spatial position of the traveling wave in the comoving frame.
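The area in Equation J.1 can be estimated from a simulated trajectory as a Riemann sum. The sketch below uses the neutral Wright–Fisher model and assumes that one generation corresponds to a time increment of 1/N; that time unit, like the function names, is an assumption made for illustration.

```python
import random

def swept_area(frequencies, dtau):
    """Left Riemann sum of the area under a frequency trajectory sampled every dtau."""
    return dtau * sum(frequencies)

def wright_fisher_trajectory(N, rng):
    """Neutral Wright-Fisher trajectory from a single mutant until absorption."""
    k = 1
    traj = [k / N]
    while 0 < k < N:
        p = k / N
        k = sum(rng.random() < p for _ in range(N))  # binomial(N, p) resampling
        traj.append(k / N)
    return traj

rng = random.Random(7)
N = 100
areas = [swept_area(wright_fisher_trajectory(N, rng), 1.0 / N) for _ in range(500)]
print(min(areas), max(areas))
```

A histogram of `areas` on logarithmic axes is the natural way to compare such a simulation with the power-law behavior discussed in this appendix.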
Here, we examine how the area A defined in Equation J.1 depends on the exponent α of the offspring distribution. The left panel of Figure J2 shows the numerical results of the area distribution p(A) for , and the Wright–Fisher model (corresponding to ). In a wide range of A, areas are distributed according to .
Focusing on small areas, which correspond to extinct trajectories, this power-law behavior can be rationalized again from a scaling argument: First, by using Equation 3, a trajectory whose maximum frequency is sweeps an area roughly given by (see Figure J1), i.e., . Second, from the neutrality, the cumulative probability that a single mutant achieves a frequency larger than before absorption is estimated as . Hence, the density is given by . Combining these two results, we can estimate the area distribution p(A) as
(J.2) |
When (the Wright–Fisher limit), the distribution becomes , which can be analytically confirmed by solving a backward diffusion equation of the Wright–Fisher diffusion (see Appendix J-2).
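For the Wright–Fisher limit, the two steps of the scaling argument can be written out explicitly. This is a sketch under stated assumptions: time is measured in units of N generations, frequencies are small, and the neutral diffusion near x = 0 has Var(Δx) ∼ xτ. The resulting exponent 3/2 is the familiar one for the total progeny of a critical branching process.

```latex
% Assumptions: small frequencies, time in units of N generations,
% neutral diffusion with Var(\Delta x) \sim x\,\tau near x = 0.
\begin{align*}
  x_{\max}^{2} \sim x_{\max}\,\tau
    &\;\Rightarrow\; \tau \sim x_{\max}, \\
  A \sim x_{\max}\,\tau
    &\;\Rightarrow\; A \sim x_{\max}^{2}, \\
  \Pr\!\left[\max_{\tau} x \ge x_{\max}\right] \sim \frac{x_0}{x_{\max}}
    &\;\Rightarrow\; p(x_{\max}) \sim x_{\max}^{-2}, \\
  p(A) = p(x_{\max})\left|\frac{dx_{\max}}{dA}\right|
    &\sim A^{-1}\cdot A^{-1/2} = A^{-3/2}.
\end{align*}
```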
The numerical results indicate that, when , there is an uptick in the area distribution p(A), which comes from fixed trajectories (see the case of α = 1 in the right panel of Figure J2). The uptick becomes less pronounced as α increases. For the Wright–Fisher model, we can analytically prove that p(A) monotonically decreases with A.
J-2: Area distribution in the Wright–Fisher model
Here, we derive an analytical result of Equation J.1 for the Wright–Fisher diffusion process.
Consider a Langevin equation
(J.3) |
with . Assume the initial value and the absorbing boundaries at X = 0, 1. For a given trajectory departing from x0 and ending at either one of the boundaries, we consider the “area” defined by
(J.4) |
where is the absorption time.
The area distribution for a given initial condition obeys a backward equation. To show this, we discretize the dynamics;
(J.5) |
where h denotes a short time interval and . The transition density is given by
(J.6) |
Note that
(J.7) |
By separating a trajectory into the initial step and the remaining part, we have
(J.8) |
By Taylor-expanding , we have
(J.9) |
Therefore, Equation J.8 becomes
(J.10) |
By using Equation J.7, we obtain
(J.11) |
More generally, it can be shown that, for the following integral,
(J.12) |
the distribution satisfies
(J.13) |
In the neutral Wright–Fisher model, and . The backward equation in Equation J.11 is given by
(J.14) |
From this equation, it follows that monotonically decreases with A0 because the spectrum of the operator is non-positive.
We can determine the area distribution p(A) analytically at least for small A. We are interested in the invasion by a single mutant, . Furthermore, for the purpose of determining the behavior for small areas, we expect that we can ignore the presence of the high-frequency boundary x = 1 and solve the problem on the semi-infinite line . Therefore, we consider the following problem:
(J.15) |
In our case, , because the trajectory starting from has A = 0.
For a function f(A) of A, we write the Laplace transformation as
(J.16) |
By taking the Laplace transform with respect to A, we have
(J.17) |
The solution is
(J.18) |
We take the inverse of the Laplace transformation,
(J.19) |
From the convolution theorem, this is given by the convolution of and g(A);
(J.20) |
When , we have
(J.21) |
In particular, when , we have
(J.22) |
where we have used since only areas larger than are meaningful for a finite-size population.
Appendix K: Forward-in-time behaviors of the Eldon–Wakeley model
Here, we present simulation results of the median allele frequency and the median and mean square displacements in the Eldon–Wakeley model (Eldon and Wakeley 2006) (see also Der et al. 2012). As shown below, unlike our model, these quantities do not exhibit sustained power-law behaviors, because of the existence of a characteristic size ψ in the offspring distribution.
We consider the neutral Eldon–Wakeley model, whose offspring distribution is given by [see Equation (7) in Eldon and Wakeley (2006)]:
(K.1) |
where δ denotes the Kronecker delta, and ψ and γ are the parameters characterizing how large and how frequent ‘sweepstakes’ events are.
The limiting process as depends on γ [see Equation (9) in Der et al. (2012)]. For , the process is the same as the Wright–Fisher diffusion, while, for , it is described by a jump process whose backward-time generator is given by
(K.2) |
where the continuous time τ is related with generations t by . The first term of the generator represents a frequency-increasing jump with rate x, while the last one represents a frequency-decreasing jump with rate .
Figure K1 shows numerical simulation results for the median of allele frequencies and the median/mean square displacements. The median frequency for a small initial frequency is well described by (Figure K1A). This exponential decay can be expected from the generator in Equation K.2; for , frequency-increasing jumps (with rate x) are unlikely to occur, and an allele frequency typically decreases by with rate . Thus, the median frequency in the Eldon–Wakeley model does not exhibit a power-law behavior.
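The jump process described above can be simulated directly. The sketch below assumes the standard form of the jumps for a point-mass reproduction event of size ψ: a frequency-increasing jump maps x to (1 − ψ)x + ψ and occurs with probability x, while a frequency-decreasing jump maps x to (1 − ψ)x. It also assumes a total jump rate of 1 in units of τ. These choices are illustrative and are not taken from Equation K.2.

```python
import random

def ew_jump(x, psi, increase):
    """Apply one sweepstakes jump of size psi (assumed form)."""
    return (1.0 - psi) * x + (psi if increase else 0.0)

def median_frequency(x0, psi, tau, n_rep, rng):
    """Median allele frequency at time tau over n_rep independent trajectories."""
    finals = []
    for _ in range(n_rep):
        x = x0
        t = rng.expovariate(1.0)  # waiting time to the first jump (assumed rate 1)
        while t < tau:
            x = ew_jump(x, psi, rng.random() < x)  # increasing jump with probability x
            t += rng.expovariate(1.0)
        finals.append(x)
    finals.sort()
    return finals[n_rep // 2]

rng = random.Random(5)
x0, psi = 0.01, 0.1
for tau in (2.0, 5.0, 10.0):
    print(tau, median_frequency(x0, psi, tau, 2000, rng))
```

For a small initial frequency, most trajectories experience only decreasing jumps, so the median shrinks roughly geometrically with the number of jumps, in line with the exponential decay noted in the text.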
As for frequency fluctuations, while the mean SD exhibits normal diffusion as in the Moran (or the Wright–Fisher) model, i.e., , the median SD does not exhibit a sustained power-law behavior (Figure K1B); on short and long timescales, the median SD exhibits normal diffusion (), but, on an intermediate timescale ( generations in the figure), it increases more rapidly than expected from normal diffusion.
Footnotes
1. is obtained from
2. Although the magnitudes of –1 and are small compared to , we need to retain these two terms because contributes to through .
3. For example, one of the integrals over is
   while one of the integrals over is
   where we have changed the integration variable from Δ to .
4. Note that the definition of the parameter αH in Birzu et al. (2018) is different from our definition of α. For , which corresponds to the semi-pushed wave region , the two definitions are related by .
5. Specifically, the rate k is given by for and by for (Birzu et al. 2018).
Literature cited
- Adam DC, Wu P, Wong JY, Lau EH, Tsang TK, et al. 2020. Clustering and superspreading potential of SARS-CoV-2 infections in Hong Kong. Nat Med. 26:1714–1719.
- Bah B, Pardoux E. 2015. The Λ-lookdown model with selection. Stoch Process Appl. 125:1089–1126.
- Barton NH, Etheridge AM. 2011. The relation between reproductive value and genetic contribution. Genetics. 188:953–973.
- Basdevant A, Goldschmidt C. 2008. Asymptotics of the allele frequency spectrum associated with the Bolthausen–Sznitman coalescent. Electron J Probab. 13:486–512.
- Berestycki J, Berestycki N, Limic V. 2014. Asymptotic sampling formulae for Λ-coalescents. Ann IHP Prob Stat. 50:715–731.
- Berg JJ, Coop G. 2014. A population genetic signal of polygenic adaptation. PLoS Genet. 10:e1004412.
- Birzu G, Hallatschek O, Korolev KS. 2018. Fluctuations uncover a distinct class of traveling waves. Proc Natl Acad Sci USA. 115:E3645–E3654.
- Birzu G, Hallatschek O, Korolev KS. 2021. Genealogical structure changes as range expansions transition from pushed to pulled. Proc Natl Acad Sci. 118:34.
- Bollback JP, York TL, Nielsen R. 2008. Estimation of 2Nes from temporal allele frequency data. Genetics. 179:497–502.
- Bolthausen E, Sznitman A.-S. 1998. On Ruelle’s probability cascades and an abstract cavity method. Commun Math Phys. 197:247–276.
- Brunet É, Derrida B, Mueller AH, Munier S. 2007. Effect of selection on ancestry: an exactly soluble case and its phenomenological generalization. Phys Rev E Stat Nonlin Soft Matter Phys. 76:041104.
- Cannings C. 1974. The latent roots of certain Markov chains arising in genetics: a new approach, I. Haploid models. Adv Appl Prob. 6:260–290.
- Crow JF, Kimura M. 1970. An Introduction to Population Genetics Theory. New York, Evanston and London: Harper & Row, Publishers.
- Cvijović I, Good BH, Desai MM. 2018. The effect of strong purifying selection on genetic diversity. Genetics. 209:1235–1278.
- Der R, Epstein C, Plotkin JB. 2012. Dynamics of neutral and selected alleles when the offspring distribution is skewed. Genetics. 191:1331–1344.
- Der R, Plotkin JB. 2014. The equilibrium allele frequency distribution for a population with reproductive skew. Genetics. 196:1199–1216.
- Desai MM, Fisher DS. 2007. Beneficial mutation–selection balance and the effect of linkage on positive selection. Genetics. 176:1759–1798.
- Desai MM, Walczak AM, Fisher DS. 2013. Genetic diversity and the structure of genealogies in rapidly adapting populations. Genetics. 193:565–585.
- Eldon B. 2009. Structured coalescent processes from a modified Moran model with large offspring numbers. Theor Popul Biol. 76:92–104.
- Eldon B. 2011. Estimation of parameters in large offspring number models and ratios of coalescence times. Theor Popul Biol. 80:16–28.
- Eldon B, Wakeley J. 2006. Coalescent processes when the distribution of offspring number among individuals is highly skewed. Genetics. 172:2621–2633.
- Etheridge AM, Griffiths RC, Taylor JE. 2010. A coalescent dual process in a Moran model with genic selection, and the lambda coalescent limit. Theor Popul Biol. 78:77–92.
- Evans SN, Shvets Y, Slatkin M. 2007. Non-equilibrium theory of the allele frequency spectrum. Theor Popul Biol. 71:109–119.
- Ewens WJ. 1963. The diffusion equation and a pseudo-distribution in genetics. J R Stat Soc Series B Methodol. 25:405–412.
- Feder AF, Kryazhimskiy S, Plotkin JB. 2014. Identifying signatures of selection in genetic time series. Genetics. 196:509–522.
- Fisher R. 1930. The Genetical Theory of Natural Selection. London, UK: Oxford University Press.
- Foll M, Shim H, Jensen JD. 2015. WFABC: a Wright–Fisher ABC-based approach for inferring effective population sizes and selection coefficients from time-sampled data. Mol Ecol Resour. 15:87–98.
- Fusco D, Gralka M, Kayser J, Anderson A, Hallatschek O. 2016. Excess of mutational jackpot events in expanding populations revealed by spatial Luria–Delbrück experiments. Nat Commun. 7:12760.
- Gardiner C. 2009. Stochastic Methods, Vol. 4. Berlin: Springer.
- Gnedenko BV, Kolmogorov A. 1968. Limit Distributions for Sums of Independent Random Variables, Vol. 233. Reading, MA: Addison-Wesley.
- Griffiths RC. 2014. The Λ-Fleming-Viot process and a connection with Wright-Fisher diffusion. Adv Appl Prob. 46:1009–1035.
- Hallatschek O. 2018. Selection-like biases emerge in population models with recurrent jackpot events. Genetics. 210:1053–1073.
- Hallatschek O, Nelson DR. 2008. Gene surfing in expanding populations. Theor Popul Biol. 73:158–170.
- Hedgecock D. 1994. Does variance in reproductive success limit effective population sizes of marine organisms. Genet Evol Aquat Organ. 122:122–134.
- Karlin S, Taylor HE. 1981. A Second Course in Stochastic Processes. New York: Academic Press.
- Kimura M. 1955. Stochastic Processes and Distribution of Gene Frequencies under Natural Selection. Cold Spring Harbor Symp Quant Biol. 20:57–66.
- Kimura M. 1969. The number of heterozygous nucleotide sites maintained in a finite population due to steady flux of mutations. Genetics. 61:893–903.
- Kosheleva K, Desai MM. 2013. The dynamics of genetic draft in rapidly adapting populations. Genetics. 195:1007–1025.
- Krapivsky PL, Redner S, Ben-Naim E. 2010. A Kinetic View of Statistical Physics. New York: Cambridge University Press.
- Laxminarayan R, Wahl B, Dudala SR, Gopal K, Neelima S, Reddy KJ, et al. 2020. Epidemiology and transmission dynamics of COVID-19 in two Indian states. Science. 370:691–697.
- Lloyd-Smith JO, Schreiber SJ, Kopp PE, Getz WM. 2005. Superspreading and the effect of individual variation on disease emergence. Nature. 438:355–359.
- Luria SE, Delbrück M. 1943. Mutations of bacteria from virus sensitivity to virus resistance. Genetics. 28:491–511.
- Neher RA, Hallatschek O. 2013. Genealogies of rapidly adapting populations. Proc Natl Acad Sci USA. 110:437–442.
- Neher RA, Shraiman BI. 2011. Genetic draft and quasi-neutrality in large facultatively sexual populations. Genetics. 188:975–996.
- Sackman AM, Harris RB, Jensen JD. 2019. Inferring demography and selection in organisms characterized by skewed offspring distributions. Genetics. 211:1019–301684.
- Schraiber JG, Evans SN, Slatkin M. 2016. Bayesian inference of natural selection from allele frequency time series. Genetics. 203:493–511.
- Schweinsberg J. 2003a. Coalescent processes obtained from supercritical Galton–Watson processes. Stoch Process Appl. 106:107–139.
- Schweinsberg J. 2003b. Coalescent processes obtained from supercritical Galton–Watson processes. Stoch Process Appl. 106:107–139.
- Schweinsberg J. 2017. Rigorous results for a population model with selection II: genealogy of the population. Electron J Probab. 22:1–54.
- Tataru P, Simonsen M, Bataillon T, Hobolth A. 2017. Statistical inference in the Wright–Fisher model using allele frequency data. Syst Biol. 66:e30–e46.
- Tellier A, Lemaire C. 2014. Coalescence 2.0: a multiple branching of recent theoretical developments and their applications. Mol Ecol. 23:2637–2652.
- Uchaikin VV, Zolotarev VM. 1999. Chance and Stability: Stable Distributions and Their Applications. Utrecht: VSP.
- Weissman DB, Desai MM, Fisher DS, Feldman MW. 2009. The rate at which asexual populations cross fitness valleys. Theor Popul Biol. 75:286–300.
- Wright S. 1931. Evolution in Mendelian populations. Genetics. 16:97–159.
Data Availability Statement
The authors state that all data necessary for confirming the conclusions presented in the article are represented fully within the article.