Systematic Biology. 2016 Aug 2;66(1):e30–e46. doi: 10.1093/sysbio/syw056

Statistical Inference in the Wright–Fisher Model Using Allele Frequency Data

Paula Tataru 1, Maria Simonsen 1, Thomas Bataillon 1, Asger Hobolth 1,*
PMCID: PMC5837693  PMID: 28173553

Abstract

The Wright–Fisher model provides an elegant mathematical framework for understanding allele frequency data. In particular, the model can be used to infer the demographic history of species and identify loci under selection. A crucial quantity for inference under the Wright–Fisher model is the distribution of allele frequencies (DAF). Despite the apparent simplicity of the model, the calculation of the DAF is challenging. We review and discuss strategies for approximating the DAF, and how these are used in methods that perform inference from allele frequency data. Various evolutionary forces can be incorporated in the Wright–Fisher model, and we consider these in turn. We begin our review with the basic bi-allelic Wright–Fisher model where random genetic drift is the only evolutionary force. We then consider mutation, migration, and selection. In particular, we compare diffusion-based and moment-based methods in terms of accuracy, computational efficiency, and analytical tractability. We conclude with a brief overview of the multi-allelic process with a general mutation model. [Allele frequency, diffusion, inference, moments, selection, Wright–Fisher.]


A central goal of population genetics is to infer the past history of populations and describe the evolutionary forces that have shaped their genetic variation. The Wright–Fisher model (Fisher 1930; Wright 1931) explicitly accounts for the effects of various evolutionary forces—random genetic drift, mutation, selection—on allele frequencies over time. This model can also accommodate the effect of demographic forces such as variation in population size through time and/or migration connecting populations. Information about these evolutionary and demographic forces can, in principle, be retrieved from allele frequency data. The questions that researchers can answer and the types of inference they can make depend on the type of genetic data available, which can be broadly divided into two categories.

One type of data is a time series of allele frequencies from a single population (Fig. 1a). Here, the task is often to quantify the amount of drift that has influenced the changes in allele frequencies over time. This is done by estimating the size of the ideal Wright–Fisher population that best accounts for the patterns of genetic drift observed in the data, that is, by estimating the effective population size. A further goal is often to identify the loci that have been under positive selection over the time interval considered.

Figure 1.

Data types. The gray boxes represent the unobserved history of the populations, together with the corresponding population allele frequency Inline graphic, whereas the white boxes indicate the observed data: the generation Inline graphic when the data are sampled, the size Inline graphic of the sample, and the allele count Inline graphic, that is, how many alleles of a given type have been observed among the genotyped individuals. Given the population frequency Inline graphic, Inline graphic follows a binomial distribution with size Inline graphic and probability Inline graphic. In order to calculate the likelihood of the data, the DAF of Inline graphic is needed. a) Time series data where, typically, one population is sampled at different (known) generations. b) Single time-point data, where multiple populations are sampled just once, typically in the present. The history of the populations is given as a tree. The leaves and internal nodes represent the sampled and ancestral populations, respectively. The branch lengths reflect the amount of time populations have diverged since the split from the ancestral population.

The second type of data consists of allele frequencies from multiple populations, typically collected in the present (Fig. 1b). In this situation, the task is often to infer divergence times, population sizes, mutation rates, and, if applicable, migration rates between populations. Additionally, there is also considerable interest in evaluating the role of selection in shaping the observed data. Typical questions are: Do allele frequencies in regions of interest harbor footprints of selection? What is the overall importance of purifying selection on a specific set of sites (e.g., non-coding regions of functional interest or non-synonymous positions in gene coding regions)? We emphasize that this second type of data is very similar to the type of data analyzed in phylogenetics. In both instances, information is gained as new mutations arise at the nucleotide level and the fate of these mutations is influenced by the different evolutionary and demographic forces of interest. The difference between phylogenetics and population genetics essentially resides in the time scales that are modeled. Phylogenetics is often concerned with long time scales, and the data contain one sample per species. Differences among the sequences are most often substitutions. Population genetics typically considers data where several samples are available within a species, and many differences are detected due to mutations that are still segregating (polymorphic). Interestingly, these two time scales tend to merge when considering data sets containing sequences of individuals that comprise recently diverged species, as both types of differences—mutations that are still polymorphic and mutations that have been fixed as substitutions—have to be modeled jointly.

To infer the evolutionary history of a population, model-based approaches in population genetics have to rely on an explicit model for the evolution of populations. The Wright–Fisher model (Fisher 1930; Wright 1931) occupies a central position in this endeavour. It provides an elegant mathematical framework for modeling allele frequency data. The dynamics of the model are well understood (Kimura 1955a, 1955b, 1964; Crow and Kimura 1956; Crow and Kimura 1970; Ewens 1972; Crow 1987; Ewens 2004) but inference under the Wright–Fisher model is complicated due to the lack of a simple closed-form analytical expression for the distribution of allele frequencies (DAF). Common to all inference methods is the need to determine the DAF, either at equilibrium or over specified time intervals.

Here, we focus on how the DAF is influenced by demographic and evolutionary forces and concentrate on both classical and more recent attempts to calculate the DAF that enable accurate yet tractable population genetics inference. We begin our review with the basic bi-allelic Wright–Fisher model by considering, in turn, the forces of pure genetic drift, mutation, migration, and selection. For each of these forces, we provide expressions for the mean and variance of the DAF, and discuss and compare the approaches used to obtain the DAF. We also review implementations of the inference methods (Table 1).

Table 1.

Overview of recent inference methods for the bi-allelic Wright–Fisher model

Reference                     | Data | Mut | Mig | Sel | Approach                | Availability
Markov chain theory
Mathieson and McVean (2013)a  | T    |     | ✓   | ✓   | Normal                  | -
Gompert (2015)a               | T    |     |     | ✓   | Beta                    | spatpg
Diffusion approximation
Bollback et al. (2008)        | T    |     |     | ✓   | Finite-difference       | -
Gutenkunst et al. (2009)      | S    | ✓   | ✓   |     | Finite-difference       | ∂a∂i
Lukić and Hey (2012)          | S    | ✓   | ✓   |     | Spectral decomposition  | MultiPop
Malaspinas et al. (2012)      | T    |     |     | ✓   | Numerical approximation | upon request
Gautier and Vitalis (2013)    | S    |     |     |     | Spectral decomposition  | KimTree
Steinrücken et al. (2014)     | T    | ✓   |     | ✓   | Spectral decomposition  | spectralHMM
Vitalis et al. (2014)         | S    |     | ✓   | ✓   | Stationary DAF          | SelEstim
Živković et al. (2015)        | S    | ✓   |     | ✓   | Spectral decomposition  | upon request
Ferrer-Admetlla et al. (2016) | T    | ✓   |     | ✓   | Numerical approximation | ApproxWF
Moment-based approximations
Sirén et al. (2011)           | S    |     |     |     | Beta                    | -
Pickrell and Pritchard (2012) | S    |     | ✓   |     | Normal                  | TreeMix
Lacerda and Seoighe (2014)    | T    |     |     | ✓   | Normal                  | upon request
Hui and Burt (2015)           | T    |     |     |     | Beta                    | NB
Tataru et al. (2015)          | S    |     |     |     | Beta with spikes        | SpikeyTree
Terhorst et al. (2015)        | T    |     |     | ✓   | Normal                  | EandR-timeseries

The table indicates what type of data the method uses (Data): time series data from one population (T) or single time-point data from multiple populations (S); if the method models new mutations (Mut), migration (Mig) or selection (Sel); which type of approach is used for calculating the DAF (Approach); and whether the method is publicly available (Availability). All methods model genetic drift.

a These methods jointly analyze time series data from multiple populations. The table covers only the more recent inference methods.

Although the bi-allelic Wright–Fisher model captures a major part of data types, in particular single-nucleotide polymorphisms (SNPs), some loci are intrinsically multi-allelic. We therefore also briefly discuss recent progress to calculate the DAF under the general multi-allelic Wright–Fisher model. We investigate if one of the widely used approximations for the multi-allelic DAF can capture adequately the first two moments of the DAF, and point to limitations of the approximation.

A variety of methods that are grounded in the Wright–Fisher model use a range of tests and/or summary statistics to detect population differentiation (Balding and Nichols 1995; 1997; Nicholson et al. 2002; Gaggiotti and Foll 2010), or carry out genome-wide scans for selection (Foll and Gaggiotti 2008; Coop et al. 2010; Gautier et al. 2010; Gautier 2015). Several of these methods use some of the approaches for calculating the DAF discussed here. However, they do not directly use or estimate the effect of the different evolutionary forces on the DAF. Therefore, we do not review such methods and refer the reader instead to Haasl and Payseur (2015) for details.

Next to the Wright–Fisher model, the coalescent (Kingman 2000, 1982a, 1982b, 1982c) and Moran (Moran 1958) models occupy an important role in the field. The coalescent process is dual to the Wright–Fisher model: although the Wright–Fisher model describes the evolution of a population forward in time in discrete non-overlapping generations, the coalescent process is built backwards in time, and arises as an approximation to the Wright–Fisher model when the population size is large. Unlike the coalescent, the Moran model is a forward-in-time process, and it is often regarded as an equivalent to the Wright–Fisher model (but see Bhaskar and Song 2009). Both the coalescent and Moran models have been analyzed extensively and their dynamics are in several cases more amenable to mathematical analysis (Donnelly 1984; Ewens 2004; Hobolth et al. 2007; Muirhead and Wakeley 2009; Li and Durbin 2011; Paul et al. 2011; Vogl and Clemente 2012). However, the Moran model is hardly ever used for inference (but see, e.g., De Maio et al. 2013; 2015), whereas the coalescent is typically restricted to a handful of individuals (Hobolth et al. 2007; Li and Durbin 2011; Paul et al. 2011; Mailund et al. 2012; Sheehan et al. 2013; Schiffels and Durbin 2014; Rasmussen et al. 2014) and does not use allele frequency data (but see, e.g., Liu and Fu 2015). Therefore, we do not include the coalescent and Moran models in this review, and refer the reader instead to Fu and Li (1999); Durrett (2008); Kuhner (2009); Liu et al. (2009); Wakeley (2009); Nielsen and Slatkin (2013); Edwards et al. (2016).

BI-ALLELIC WRIGHT–FISHER MODEL

The Wright–Fisher model assumes a randomly mating population of finite size reproducing in discrete non-overlapping generations, by allowing the individuals in generation $r+1$ to choose parents at random from the previous generation $r$. The model describes the stochastic behavior through time of the frequency of an allele at a locus. This frequency is influenced by a series of evolutionary forces that, as discussed below, change the probability of choosing a parent. Here, we consider a diploid population of size $N$ (that is, $2N$ gene copies) which contains only two alleles, denoted $A$ and $a$. Below we review methods used to obtain the DAF of allele $A$ after a given number of generations.

Pure Drift

The Wright–Fisher model, in its simplest form, only considers random genetic drift (Fig. 2), where the stochastic fluctuations in the allele frequency are purely determined by the random mating of the population. This assumption is appropriate for the analysis of loci that have small mutation rates and the analysis of recently diverged populations, leaving little time for mutation to create new alleles, and where we expect an overall negligible effect of selection.

Figure 2.

Dynamics in the pure drift bi-allelic Wright–Fisher model. The child inherits the parental allele.

Dynamics and moments.—

Let $z(r)$ be the number of $A$ alleles in generation $r$ and $x(r) = z(r)/(2N)$ be the corresponding allele frequency. The random mating of the population leads to a count of $A$ alleles in generation $r+1$ that is binomially distributed (Fisher 1930; Wright 1931; Crow and Kimura 1970; Ewens 2004)

$$z(r+1) \mid z(r) \sim \mathrm{Bin}\bigl(2N, x(r)\bigr). \tag{1}$$

Here, $\mathrm{Bin}(n, p)$ is the binomial distribution with sample size $n$ and probability $p$. The genetic variation present in the population is due to ancestral polymorphism, and because no new variation is added, the $A$ allele is eventually fixed or lost (Fig. 3a).
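As an illustration of the sampling scheme in equation (1), the following Python sketch simulates one pure drift trajectory. It is a minimal example under our own assumptions: the function names, the population size, the starting frequency, and the seed are illustrative choices and are not taken from any of the reviewed methods. The binomial draw is implemented as a sum of Bernoulli trials, which is adequate only for small populations.

```python
import random


def binomial_draw(n, p, rng):
    """Draw from Bin(n, p) by summing n Bernoulli trials (adequate for small n)."""
    return sum(1 for _ in range(n) if rng.random() < p)


def simulate_pure_drift(two_n, x0, generations, seed=0):
    """Return the allele frequencies x(0), ..., x(generations) under equation (1).

    two_n is the number of gene copies (2N); x0 is the initial frequency.
    """
    rng = random.Random(seed)
    z = round(two_n * x0)          # initial allele count z(0)
    freqs = [z / two_n]
    for _ in range(generations):
        # Each of the 2N gene copies picks a parent at random.
        z = binomial_draw(two_n, z / two_n, rng)
        freqs.append(z / two_n)
    return freqs


# Illustrative values only: 2N = 100 gene copies, x(0) = 0.5.
trajectory = simulate_pure_drift(two_n=100, x0=0.5, generations=2000)
```

Once the frequency reaches 0 or 1 it stays there, because the sampling probability is then exactly 0 or 1 and no new variation is introduced.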

Figure 3.

a) Simulation under the pure drift model (equation (1)) with Inline graphic and Inline graphic. The vertical bars indicate three sampled time-points. The Inline graphic-axis denotes the time measured in scaled number of generations. b) DAF at the three sampled time-points. The vertical bars indicate the simulated allele frequencies.

The goal is to determine the DAF: the distribution $f(x; r)$ of $x(r)$, after evolving for $r$ generations from an initial frequency $x(0)$ (Fig. 3b). We first calculate the first two moments of the DAF. From the binomial sampling, the mean and variance over one generation are given by

$$\mathrm{E}[x(r+1) \mid x(r)] = x(r), \qquad \mathrm{Var}(x(r+1) \mid x(r)) = \frac{1}{2N}\, x(r)\bigl(1 - x(r)\bigr).$$

The mean and variance after $r$ generations can be obtained by iterating the two expressions above or from alternative derivations (Wright 1942; Crow 1954; Crow and Kimura 1970). The result is

$$\mathrm{E}[x(r) \mid x(0)] = x(0), \tag{2}$$
$$\mathrm{Var}(x(r) \mid x(0)) = x(0)\bigl(1 - x(0)\bigr)\left(1 - \left(1 - \frac{1}{2N}\right)^{r}\right). \tag{3}$$

For large $N$, we can approximate the variance by

$$\mathrm{Var}(x(t) \mid x(0)) \approx x(0)\bigl(1 - x(0)\bigr)\bigl(1 - e^{-t}\bigr), \tag{4}$$

where $t = r/(2N)$. Note that this implies that $N$ can be estimated from equation (4) only if $r$ is known; otherwise only the ratio $r/(2N)$ can be estimated.
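The exact variance (equation (3)) and its large-population approximation (equation (4)) can be compared numerically. The sketch below is only an illustration; the function names and the numerical values are our own choices.

```python
import math


def drift_variance_exact(x0, two_n, r):
    """Variance of x(r) given x(0) under pure drift, equation (3)."""
    return x0 * (1 - x0) * (1 - (1 - 1 / two_n) ** r)


def drift_variance_approx(x0, two_n, r):
    """Large-population approximation, equation (4), with t = r / (2N)."""
    t = r / two_n
    return x0 * (1 - x0) * (1 - math.exp(-t))


# Illustrative values: 2N = 1000 gene copies, r = 500 generations, x(0) = 0.3.
exact = drift_variance_exact(0.3, 1000, 500)
approx = drift_variance_approx(0.3, 1000, 500)
```

Because the approximation depends on $r$ and $N$ only through the ratio $r/(2N)$, calling `drift_variance_approx(0.3, 100, 50)` and `drift_variance_approx(0.3, 200, 100)` returns the same value, which is the identifiability issue noted in the text.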

Markov chain theory.—

Because the allele frequency at generation $r+1$ only depends on generation $r$, the Wright–Fisher model is a discrete-time finite-space Markov chain. Using this property, the DAF can be obtained from classical Markov chain theory (Karlin and Taylor 1975), where the transition probabilities are given by equation (1) (Williamson and Slatkin 1999). However, this procedure quickly becomes computationally infeasible, as the transition probability matrix has a size of $(2N+1) \times (2N+1)$. By recognizing that most of the probability mass from equation (1) is centered around $x(r)$, the computational demand can be reduced by evaluating, storing and using only the transition probabilities that are large enough to contribute significantly to the DAF (Wang 2001; Freeman et al. 2003).

Under the assumption of large $N$, diffusion theory (see below) shows that the population size acts as a scaling factor (Feller et al. 1951; Wakeley 2005) and therefore one could calculate the DAF using a smaller $N$. This approach was used by De Maio et al. (2013; 2015), though they relied on the Moran model rather than the Wright–Fisher. Alternatively, if $N$ is large enough such that the allele frequencies can be treated as continuous, the Markov chain can be built over discretized allele frequencies, and thus the computational burden is controlled by the number of bins. The original discrete binomial sampling probability from equation (1) is then replaced by the continuous normal or beta distributions (Mathieson and McVean 2013; Gompert 2015).
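For a small population, the exact DAF can be computed directly from the transition probabilities of equation (1). The sketch below is a naive illustration with names of our own choosing; it builds the full transition matrix and is therefore only practical for small $2N$, which is precisely the limitation discussed above.

```python
from math import comb


def transition_matrix(two_n):
    """P[i][j] = P(z(r+1) = j | z(r) = i) under pure drift, equation (1)."""
    matrix = []
    for i in range(two_n + 1):
        p = i / two_n
        row = [comb(two_n, j) * p**j * (1 - p) ** (two_n - j)
               for j in range(two_n + 1)]
        matrix.append(row)
    return matrix


def daf_after(two_n, z0, generations):
    """Distribution of the allele count after the given number of generations."""
    P = transition_matrix(two_n)
    dist = [0.0] * (two_n + 1)
    dist[z0] = 1.0                      # start from a known count z(0)
    for _ in range(generations):
        dist = [sum(dist[i] * P[i][j] for i in range(two_n + 1))
                for j in range(two_n + 1)]
    return dist


def frequency_moments(dist):
    """Mean and variance of the allele frequency for a count distribution."""
    two_n = len(dist) - 1
    mean = sum(p * j / two_n for j, p in enumerate(dist))
    second = sum(p * (j / two_n) ** 2 for j, p in enumerate(dist))
    return mean, second - mean**2


# Illustrative values: 2N = 20 gene copies, z(0) = 6, r = 15 generations.
dist = daf_after(two_n=20, z0=6, generations=15)
mean, var = frequency_moments(dist)
```

The mean and variance of the resulting distribution agree with equations (2) and (3), and the first and last entries of `dist` are the loss and fixation probabilities.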

Diffusion approximation.—

One way to calculate the DAF is to take advantage of the diffusion approximation to the Wright–Fisher model, which is appropriate when the population size $N$ is large, such that both allele frequencies and time can be treated as continuous. Diffusion theory uses two fundamental equations, the Kolmogorov forward and backward equations (Kolmogorov 1931). The forward equation was first used by Wright (1945) to calculate the rate of decay and stationary DAF, whereas Kimura (1957) used the backward equation first to study the problem of fixation. Let us define a new time scale by $t = r/(2N)$ such that one time unit corresponds to $2N$ generations. Then, we have

$$2N\, x(t + \delta t) \mid x(t) \sim \mathrm{Bin}\bigl(2N, x(t)\bigr),$$

from which we can approximate

$$x(t + \delta t) \mid x(t) \sim \mathcal{N}\bigl(x(t),\, x(t)(1 - x(t))\,\delta t\bigr). \tag{5}$$

Here, $\mathcal{N}(m, v)$ is the normal distribution with mean $m$ and variance $v$. Equation (5) corresponds to the time-homogeneous stochastic differential equation

$$dx(t) = a(x)\,dt + \sqrt{b(x)}\,dw(t), \tag{6}$$

where $w(t)$ is a standard Brownian motion, and $a(x)$ and $b(x)$ are the infinitesimal mean and variance, respectively. For the Wright–Fisher model, $b(x) = x(1 - x)$, whereas $a(x)$ has different forms depending on the evolutionary forces. Under pure drift, $a(x) = 0$, as is evident from equation (5).

The DAF $f(x; t)$ at time $t$ is now determined by the forward Kolmogorov (or Fokker–Planck or diffusion) equation (Kolmogorov 1931; Crow and Kimura 1970; Ewens 2004)

$$\frac{\partial f(x; t)}{\partial t} = -\frac{\partial}{\partial x}\bigl\{a(x) f(x; t)\bigr\} + \frac{1}{2}\frac{\partial^2}{\partial x^2}\bigl\{x(1 - x) f(x; t)\bigr\}, \tag{7}$$

with boundary condition Inline graphic for Inline graphic. This equation can be solved using different approaches (Table 1). Kimura first described how the DAF can be calculated under pure drift (Kimura 1955a) using the spectral decomposition of equation (7), which results in an infinite sum of scaled Gegenbauer polynomials. In practice, the infinite sum needs to be truncated and the optimal truncation level depends on the convergence properties. This controls the accuracy, but also the computational performance. The diffusion equation can also be solved using purely numerical methods. Chang and Cooper (1970) developed a finite-difference scheme to numerically solve any diffusion equation, whereas Zhao et al. (2013) proposed a finite-volume scheme to solve the Wright–Fisher diffusion equation.

Gautier and Vitalis (2013) relied on the solution proposed by Kimura (1955a) to estimate divergence times between populations that have been evolving under pure drift, from single time-point data.

Moment-based approximations.—

The use of the diffusion approximation is limited in practice due to the high computational burden. Cavalli-Sforza and Edwards (1967) approximated pure drift as a Brownian motion process, and current moment-based approximations are reminiscent of that approach, in that they are based on mathematically convenient instrumental distributions. By relying on the equations for the mean (2) and variance (3, 4), we can fit to the true DAF any distribution that is parameterized solely through its first two moments, such as the normal and beta distributions. These two distributions arise as special cases of the DAF approximated from the diffusion theory: the normal distribution is a transient distribution (equation (5)) which is appropriate for very short evolutionary times, whereas the stationary DAF under linear evolutionary pressure is given by a beta distribution (see Box 1, equation (B.9)).

Box 1.

Evolutionary models for the bi-allelic Wright–Fisher

Consider the general bi-allelic Wright–Fisher process, where $g(x)$ captures the evolutionary pressures acting on the allele,

$$z(r+1) \mid z(r) \sim \mathrm{Bin}\bigl(2N, g(x(r))\bigr). \tag{B.1}$$

The function $g$ can take different forms.

General linear evolutionary pressure:

$$g(x) = (1 - a)x + b, \quad \text{for } 0 \le b \le a < 1, \tag{B.2}$$

where $a$ and $b$ are given by

Box 1 (B.3)

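Let $A = 2Na$, $B = 2Nb$ and $t = r/(2N)$. For large $N$, the mean and variance for the DAF are given by (Tataru et al. 2015)

$$\mathrm{E}[x(t) \mid x(0)] = \frac{B}{A} + e^{-At}\left(x(0) - \frac{B}{A}\right), \tag{B.4}$$
$$\mathrm{Var}(x(t) \mid x(0)) = \frac{B}{A}\left(1 - \frac{B}{A}\right)\frac{1 - e^{-(2A+1)t}}{2A + 1} - \left(x(0) - \frac{B}{A}\right)^{2} e^{-2At}\bigl(1 - e^{-t}\bigr) + \left(1 - \frac{2B}{A}\right)\left(x(0) - \frac{B}{A}\right) e^{-At}\,\frac{1 - e^{-(A+1)t}}{A + 1}. \tag{B.5}$$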

For pure drift, Inline graphic and we set Inline graphic. Note that equations (2), (4), (11), and (12) can be obtained as special cases of the above.

Selection (non-linear evolutionary pressure):

$$g(x) = \frac{(1 + s)x^{2} + (1 + sh)x(1 - x)}{(1 + s)x^{2} + 2(1 + sh)x(1 - x) + (1 - x)^{2}} \tag{B.6}$$
$${}\approx x + s\,x(1 - x)\bigl(h + (1 - 2h)x\bigr), \tag{B.7}$$

where the approximation relies on the selection coefficients Inline graphic and Inline graphic being small (Crow and Kimura 1970).
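The quality of the first-order approximation (B.7) to the exact selection function (B.6) can be checked numerically. The sketch below uses function names and parameter values of our own choosing; the error is of second order in the selection coefficient, so it is small when selection is weak.

```python
def g_selection_exact(x, s, h):
    """Sampling probability under selection, equation (B.6)."""
    numerator = (1 + s) * x * x + (1 + s * h) * x * (1 - x)
    denominator = ((1 + s) * x * x
                   + 2 * (1 + s * h) * x * (1 - x)
                   + (1 - x) ** 2)
    return numerator / denominator


def g_selection_approx(x, s, h):
    """First-order approximation for weak selection, equation (B.7)."""
    return x + s * x * (1 - x) * (h + (1 - 2 * h) * x)


# Illustrative values: s = 0.01, h = 0.5 (additive), x = 0.3.
exact = g_selection_exact(0.3, 0.01, 0.5)
approx = g_selection_approx(0.3, 0.01, 0.5)
```

With $s = 0$ both functions reduce to $g(x) = x$, recovering pure drift.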

Selection with linear evolutionary pressure: Alleles can undergo linear evolutionary pressure and selection jointly. Then,

$$g(x) = (1 - a)\bigl\{x + s\,x(1 - x)\bigl(h + (1 - 2h)x\bigr)\bigr\} + b. \tag{B.8}$$

Stationary distribution: When Inline graphic, variation is constantly introduced in the population and the DAF has a stationary distribution given by (up to a normalization constant),

$$f(x) \propto x^{2B - 1}(1 - x)^{2(A - B) - 1}\, e^{Sx(2h + (1 - 2h)x)}, \tag{B.9}$$

where $S = 2Ns$ is the scaled selection coefficient. When $S = 0$, we obtain a beta distribution with shape parameters $2B$ and $2(A - B)$, which is in agreement with the expressions for mean and variance in the limit $t \to \infty$.

Several authors used the normal distribution (Nicholson et al. 2002; Coop et al. 2010; Gautier et al. 2010; Pickrell and Pritchard 2012; Lacerda and Seoighe 2014; Terhorst et al. 2015), which takes the form

$$x(r) \mid x(0) \sim \mathcal{N}\bigl(\mathrm{E}[x(r) \mid x(0)],\, \mathrm{Var}(x(r) \mid x(0))\bigr). \tag{8}$$

Equations (5) and (8) are equivalent under pure drift when the number of generations $r$ is small relative to the population size $N$. Then, by using the approximation

$$1 - \left(1 - \frac{1}{2N}\right)^{r} \approx 1 - e^{-r/(2N)} \approx \frac{r}{2N},$$

in the variance equation (3), we recover equation (5) from equation (8) with $\delta t = r/(2N)$.

Balding and Nichols (1995; 1997) first proposed the use of the Dirichlet distribution, the multivariate generalization of the beta distribution, for the multi-allelic Wright–Fisher (see the multi-allelic section below). For the bi-allelic Wright–Fisher model, the DAF can be approximated with a beta distribution as follows,

$$x(r) \mid x(0) \sim \mathrm{Beta}\bigl(\mathrm{E}[x(r) \mid x(0)],\, \mathrm{Var}(x(r) \mid x(0))\bigr),$$

where $\mathrm{Beta}(m, v)$ is the beta distribution parameterized by mean $m$ and variance $v$. We note here that a beta distribution always satisfies the condition $v < m(1 - m)$. For the alternative parameterization with shapes $\alpha$ and $\beta$, we have the relation

$$\alpha = \left(\frac{m(1 - m)}{v} - 1\right) m, \qquad \beta = \left(\frac{m(1 - m)}{v} - 1\right)(1 - m). \tag{9}$$
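The conversion from mean and variance to shape parameters in equation (9) is straightforward to implement. The following sketch is illustrative only (the function name is ours); it also makes explicit that the conversion is defined only when the variance is smaller than $m(1 - m)$.

```python
def beta_shapes(m, v):
    """Shape parameters (alpha, beta) of a beta distribution with mean m
    and variance v, following equation (9)."""
    if not 0 < m < 1:
        raise ValueError("mean must lie strictly between 0 and 1")
    if not 0 < v < m * (1 - m):
        raise ValueError("variance must satisfy 0 < v < m(1 - m)")
    common = m * (1 - m) / v - 1
    return common * m, common * (1 - m)


# A beta distribution with shapes (2, 3) has mean 0.4 and variance 0.04,
# so the conversion should recover these shapes.
alpha, beta = beta_shapes(0.4, 0.04)
```

Under pure drift, the variance in equation (3) is always smaller than $x(0)(1 - x(0))$, so the moments of the DAF yield valid shape parameters as long as $x(0)$ lies strictly between 0 and 1 and at least one generation has passed.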

Although both the normal and beta distributions have been used for inference, they differ in accuracy. One major difference comes from the support of the distributions. The allele frequency $x$ always lies between $0$ and $1$, and, under the Wright–Fisher model, there can be a positive probability for $x$ being either $0$ or $1$ (the allele is lost or fixed, respectively). The normal distribution is defined over the whole real line, and a positive probability can exist outside $[0, 1]$. If $x(0)$ is intermediate and $r$ is small, the probability that $x(r)$ falls outside of $[0, 1]$ is small and therefore can be ignored (Pickrell and Pritchard 2012; Lacerda and Seoighe 2014; Terhorst et al. 2015). If $x(0)$ is close to the boundaries, the normal distribution from equation (8) can be truncated to $[0, 1]$. The probabilities in the intervals $(-\infty, 0]$ and $[1, \infty)$ are added as two atoms at $0$ and $1$ and serve as the loss and fixation probabilities, respectively (Nicholson et al. 2002; Coop et al. 2010; Gautier et al. 2010). Gautier and Vitalis (2013) noted that the truncated normal distribution no longer has the true variance of the DAF.

Unlike the normal distribution, the beta distribution has support in $[0, 1]$. However, due to its continuous nature, the beta distribution cannot account for the discrete events that $x$ can be $0$ or $1$. Tataru et al. (2015) addressed this issue and introduced a new approximation, the beta with spikes, a beta distribution for the polymorphic frequencies ($0 < x < 1$), supplemented by two spikes at 0 and 1 accounting for the loss and fixation probabilities. Then the distribution of $x(r)$ is

$$x(r) \mid x(0) \sim \mathrm{Beta}^{*}\bigl(\mathrm{E}[x(r) \mid x(0)],\, \mathrm{Var}(x(r) \mid x(0)),\, p_0(r),\, p_1(r)\bigr),$$

where $\mathrm{Beta}^{*}(m, v, p_0, p_1)$ is the beta with spikes distribution parameterized by mean $m$, variance $v$, and probabilities $p_0$ and $p_1$ found at $0$ and $1$, respectively. This is given by

$$\mathrm{Beta}^{*}(x; m, v, p_0, p_1) = p_0\,\delta(x) + p_1\,\delta(1 - x) + \frac{1 - p_0 - p_1}{B(\alpha^{*}, \beta^{*})}\,\mathrm{Beta}(x; m^{*}, v^{*}).$$

Here, $\delta(x)$ is the Dirac delta function, introduced to account for the non-zero probabilities at the boundaries, and $m^{*}$ and $v^{*}$ are the mean and variance of the beta distribution for the polymorphic frequencies, given by (Tataru et al. 2015)

$$m^{*} = \frac{m - p_1}{1 - p_0 - p_1}, \qquad v^{*} = \frac{v + m^{2} - p_1}{1 - p_0 - p_1} - (m^{*})^{2}.$$

The beta function $B(\alpha^{*}, \beta^{*})$ acts as a normalization factor, where $\alpha^{*}$ and $\beta^{*}$ are the shape parameters of $\mathrm{Beta}(m^{*}, v^{*})$ (equation (9)).
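The mean and variance of the polymorphic part follow from conditioning the overall moments on the allele being neither lost nor fixed. The sketch below illustrates this step only; the function name is ours, and the loss and fixation probabilities are taken as given inputs rather than computed with the recursion of Tataru et al. (2015).

```python
def polymorphic_moments(m, v, p0, p1):
    """Mean and variance of the frequency conditional on 0 < x < 1,
    given the overall mean m, overall variance v, loss probability p0
    and fixation probability p1."""
    polymorphic = 1 - p0 - p1
    if polymorphic <= 0:
        raise ValueError("no probability mass left for polymorphic frequencies")
    m_star = (m - p1) / polymorphic
    v_star = (v + m * m - p1) / polymorphic - m_star**2
    return m_star, v_star


# Check on a hand-built mixture: spikes p0 = 0.1 and p1 = 0.2, and a
# polymorphic part with mean 0.5 and variance 0.05. The overall moments
# are then m = 0.55 and v = 0.1075.
m_star, v_star = polymorphic_moments(0.55, 0.1075, 0.1, 0.2)
```

When both spike probabilities are zero, the function returns the overall mean and variance unchanged, and the beta with spikes reduces to the plain beta approximation.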

Using the equations (2) and (3) for the mean and variance, the normal and beta approximations of the DAF can be written in closed form. However, the loss and fixation probabilities are not known in closed form, and therefore, the beta with spikes relies on a recursive approach to calculate these probabilities (see Tataru et al. (2015) for details).

The moment-based approximations have been used in a series of inference methods (Table 1). Hui and Burt (2015) used the beta distribution to infer the effective size of one population undergoing pure drift from time series data. Sirén et al. (2011) and Tataru et al. (2015) used single time-point data to infer divergence times between populations evolving under pure drift. Sirén et al. (2011) used the beta distribution, and therefore could not accurately model the alleles that are close to being lost or fixed. Tataru et al. (2015) used the beta with spikes approximation and demonstrated that the addition of spikes leads to a more accurate inference compared with merely using the beta distribution.

Quality of approximations.—

We evaluated the accuracy of the approximations to the true DAF obtained from the Markov chain property, using the Hellinger distance (Le Cam and Yang 2000), which lies between $0$ and $1$, with $0$ indicating a perfect match of the two distributions. The diffusion approximation is the most accurate, whereas the truncated normal and beta distributions are the least accurate (Fig. 4). They approximate the true DAF well when the probability mass is away from the boundaries: $x(0)$ is close to $0.5$ and the generation $r$ is not too large. As $r$ increases, the frequency drifts away from $x(0)$ and more and more probability accumulates at the boundaries. The beta distribution fails to capture this, whereas the atoms and spikes in the truncated normal and beta with spikes distributions, respectively, approximate these probabilities with various degrees of accuracy. Overall, the beta with spikes distribution is more accurate than both the truncated normal and beta distributions.
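For two distributions defined on the same discrete set of allele counts, the Hellinger distance can be computed as in the sketch below. This is a generic illustration with a function name of our own; it assumes both inputs are probability vectors over the same states, and it does not reproduce the discretization of the continuous approximations used for the figures.

```python
import math


def hellinger_distance(p, q):
    """Hellinger distance between two discrete distributions on the same states.

    Returns a value between 0 (identical) and 1 (disjoint supports).
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of states")
    total = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(total / 2)


same = hellinger_distance([0.2, 0.5, 0.3], [0.2, 0.5, 0.3])
disjoint = hellinger_distance([1.0, 0.0], [0.0, 1.0])
```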

Figure 4.

Fit of various approximations to the pure drift true DAF, calculated using the Markov chain property for Inline graphic and a range of Inline graphic and Inline graphic. Each column shows a different type of approximation, indicated at the top of the figure. a) Hellinger distance on log scale between the approximated and true DAF. The three “Inline graphic”s in each of the heatmaps indicate the combinations of Inline graphic and Inline graphic used in b). b) True (dashed lines) and approximated (solid lines) DAF for Inline graphic and different values of Inline graphic. The truncated normal, beta and beta with spikes are discretized as in Tataru et al. (2015). The diffusion DAF is calculated as in Zhao et al. (2013), with Inline graphic and Inline graphic. We used Inline graphic for computational reasons, but we see similar patterns for larger Inline graphic.

Neutral Mutations

The most common way to introduce variation in a population is by allowing the alleles to mutate (Fig. 5).

Figure 5.

Dynamics in the bi-allelic Wright–Fisher model with mutations. If the parental allele is $A$, the child has the same allele with probability $1 - u$, and a mutation occurs with probability $u$. If the parental allele is $a$, the child allele is $a$ with probability $1 - v$, and becomes $A$ with probability $v$.

Dynamics and moments.—

If $u$ is the probability of a mutation from $A$ to $a$, and $v$ is the probability for the reverse event, the sampling probability from equation (1) is changed by allowing each individual to undergo a mutation after choosing its parent. Therefore, the individual is carrying an $A$ allele if the parent had an $A$ allele (probability $x(r)$) and there was no mutation (probability $1 - u$), or the parent had an $a$ allele (probability $1 - x(r)$) and it mutated (probability $v$), leading to a sampling probability

$$x(r)(1 - u) + (1 - x(r))v = (1 - u - v)x(r) + v.$$

Then, the binomial distribution of $z(r+1)$ becomes

$$z(r+1) \mid z(r) \sim \mathrm{Bin}\bigl(2N, (1 - u - v)x(r) + v\bigr). \tag{10}$$

For large $N$, Crow and Kimura (1956) derived general formulas for all moments of $x(t)$. The mean and variance after $r$ generations of evolution can also be obtained by repeated use of the laws of total expectation and variance (Sirén 2012). Tataru et al. (2015) provided the formulas:

$$\mathrm{E}[x(t) \mid x(0)] = \frac{\nu}{\mu + \nu} + \left(x(0) - \frac{\nu}{\mu + \nu}\right) e^{-(\mu + \nu)t}, \tag{11}$$
$$\mathrm{Var}(x(t) \mid x(0)) = \frac{\mu\nu}{(\mu + \nu)^{2}\bigl(2(\mu + \nu) + 1\bigr)}\left(1 - e^{-(2(\mu + \nu) + 1)t}\right) - \left(x(0) - \frac{\nu}{\mu + \nu}\right)^{2} e^{-2(\mu + \nu)t}\bigl(1 - e^{-t}\bigr) + \frac{\mu - \nu}{(\mu + \nu)(\mu + \nu + 1)}\left(x(0) - \frac{\nu}{\mu + \nu}\right) e^{-(\mu + \nu)t}\left(1 - e^{-(\mu + \nu + 1)t}\right), \tag{12}$$

where $\mu = 2Nu$, $\nu = 2Nv$, and $t = r/(2N)$.
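Because the sampling probability in equation (10) is linear in the allele frequency, the exact mean can be obtained by iterating the one-generation expectation, and then compared with equation (11). The sketch below is an illustration under our own choice of function names and parameter values.

```python
import math


def mean_by_iteration(x0, u, v, r):
    """Exact mean of x(r): iterate E[x(r+1) | x(r)] = (1 - u - v) x(r) + v."""
    x = x0
    for _ in range(r):
        x = (1 - u - v) * x + v
    return x


def mean_scaled(x0, u, v, two_n, r):
    """Mean from equation (11) with mu = 2N u, nu = 2N v and t = r / (2N)."""
    mu, nu, t = two_n * u, two_n * v, r / two_n
    equilibrium = nu / (mu + nu)
    return equilibrium + (x0 - equilibrium) * math.exp(-(mu + nu) * t)


# Illustrative values: u = 0.001, v = 0.002, 2N = 1000, r = 500, x(0) = 0.1.
exact = mean_by_iteration(0.1, 0.001, 0.002, 500)
scaled = mean_scaled(0.1, 0.001, 0.002, 1000, 500)
```

Both expressions converge to the equilibrium value $\nu/(\mu + \nu)$ as the number of generations grows, regardless of the starting frequency.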

Diffusion approximation.—

The diffusion approximation of the Wright–Fisher with neutral mutations is obtained in a similar way as for pure drift. Let $\mu = 2Nu$ and $\nu = 2Nv$ be the scaled mutation rates, and we again scale the time in units of $2N$ generations. Recall that the infinitesimal variance is independent of the evolutionary forces. For neutral mutations, the infinitesimal mean is given by

$$a(x) = -\mu x + \nu(1 - x). \tag{13}$$

When new variation is constantly introduced in the population, after enough time, the allele frequency will reach a stationary distribution. This was first obtained by Wright (1931) by noting that at stationarity, the mean and variance are unchanged between successive generations. Later on, the stationary DAF was re-derived using alternative methods, including diffusion (Wright 1938; 1945). The stationary DAF for neutral mutations is given by a beta distribution with shape parameters $2\nu$ and $2\mu$ (Crow and Kimura 1970; Ewens 2004). Note that this result is in agreement with the mean (equation (11)) and variance (equation (12)) in the limit $t \to \infty$.
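The agreement between the stationary beta distribution and the long-time limit of the moments can be verified numerically. In the sketch below, which uses function names and values of our own choosing, the first function computes the mean and variance of a beta distribution with shapes $2\nu$ and $2\mu$, and the second evaluates the limits of equations (11) and (12) as time goes to infinity.

```python
def stationary_beta_moments(mu, nu):
    """Mean and variance of a beta distribution with shapes 2*nu and 2*mu."""
    alpha, beta = 2 * nu, 2 * mu
    total = alpha + beta
    mean = alpha / total
    variance = alpha * beta / (total**2 * (total + 1))
    return mean, variance


def limiting_moments(mu, nu):
    """Limits of equations (11) and (12) as t goes to infinity."""
    mean = nu / (mu + nu)
    variance = mu * nu / ((mu + nu) ** 2 * (2 * (mu + nu) + 1))
    return mean, variance


# Illustrative scaled mutation rates.
beta_mean, beta_var = stationary_beta_moments(mu=0.4, nu=1.2)
limit_mean, limit_var = limiting_moments(mu=0.4, nu=1.2)
```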

The spectral decomposition method developed by Kimura (1955a) to calculate the DAF under pure drift was extended to calculate the DAF with recurrent mutation (Crow and Kimura 1956; 1970; Song and Steinrücken 2012), and to incorporate mutation rates and population sizes that vary in time in a piecewise constant manner (Steinrücken et al. 2016).

Moment-based approximations.—

Using the moments of the DAF for the bi-allelic Wright–Fisher model with neutral mutations (equations (11) and (12)), the moment-based approximations are obtained just as for pure drift.

Quality of approximations.—

The non-zero mutation probabilities introduce variation in the population, and reduce the loss and fixation probabilities relative to pure drift (Figs. 4 and 6). For example, under pure drift, the probability that the mutation is lost (fixed) at Inline graphic is Inline graphic, while when alleles mutate with Inline graphic, the probability is reduced to Inline graphic. As more of the probability mass is now found away from the Inline graphic and Inline graphic boundaries, all approximations have an overall improved fit to the true DAF (Fig. 6).

Figure 6.

Fit of various approximations to the true DAF with neutral mutations, calculated using the Markov chain property for Inline graphic, Inline graphic and a range of Inline graphic and Inline graphic. Each column shows a different type of approximation, indicated at the top of the figure. a) Hellinger distance on log scale between the approximated and true DAF. The three "Inline graphic"s in each of the heatmaps indicate the combinations of Inline graphic and Inline graphic used in b). b) True (dashed lines) and approximated (solid lines) DAF for Inline graphic and different values of Inline graphic. Calculations are performed as for Figure 4. For comparison purposes, the a) heatmap and b) Inline graphic-axis scales are the same as in Figure 4.

Migration

In its simplest form, the migration model describes the evolution of the allele frequency in one population that sends migrants, with probability Inline graphic, to an infinitely large population with constant allele frequency Inline graphic, and receives immigrants such that the population size stays constant over time. Then the allele count at generation Inline graphic is given by (Crow and Kimura 1970)

$$z(r+1) \mid z(r) \sim \mathrm{Bin}\bigl(2N,\ (1-m)\,x(r) + m\,x_c\bigr). \tag{14}$$

Under pure drift, the sampling among the alleles in generation Inline graphic is done uniformly (equation (1)). However, as different evolutionary pressures act on the allele, the sampling probability is changed, as observed for neutral mutations and migration in (14). We can capture all the evolutionary pressures acting on the allele in a function Inline graphic which alters the sampling probability of the binomial distribution from equation (1). We then obtain the more general process

$$z(r+1) \mid z(r) \sim \mathrm{Bin}\bigl(2N,\ g(x(r))\bigr). \tag{15}$$
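For small populations, the DAF implied by equation (15) can be computed exactly by building the binomial transition matrix and propagating the initial state forward, which is the "Markov chain property" calculation referred to in the figure captions. The sketch below is one way to do this; the function names are hypothetical, the pressure function `g` is supplied by the caller, and the dense matrix limits the approach to modest values of 2N.

```python
from math import comb

import numpy as np


def wf_transition_matrix(N, g):
    """Transition matrix P[i, j] = Pr(z(r+1) = j | z(r) = i) for equation (15)."""
    n = 2 * N
    counts = np.arange(n + 1)
    coef = np.array([comb(n, j) for j in counts], dtype=float)
    P = np.empty((n + 1, n + 1))
    for i in counts:
        p = min(max(g(i / n), 0.0), 1.0)   # keep the sampling probability in [0, 1]
        P[i] = coef * p**counts * (1.0 - p) ** (n - counts)
    return P


def daf(N, g, z0, generations):
    """Distribution of the allele count after the given number of generations."""
    P = wf_transition_matrix(N, g)
    dist = np.zeros(2 * N + 1)
    dist[z0] = 1.0                          # start from a known allele count
    for _ in range(generations):
        dist = dist @ P
    return dist
```

Pure drift corresponds to `g = lambda x: x`, neutral mutation to `g = lambda x: (1 - u - v) * x + v`, and migration to `g = lambda x: (1 - m) * x + m * xc`, matching equations (10) and (14).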

The evolutionary pressures for pure drift, mutation, and migration are linear in Inline graphic (see Box 1) and are therefore collectively called linear pressure (Crow and Kimura 1970). It is this linearity that allows the calculation of the first two moments of the DAF in closed form. One can formulate a general linear evolutionary pressure model, where pure drift, mutation and migration are special cases (see Box 1).

The migration model from equation (14) is a good approximation if the immigrants represent a random sample of the entire species (Crow and Kimura 1970). This is often not the case, and migrants are typically exchanged by at least two populations that have non-constant allele frequencies. This leads to an evolutionary pressure Inline graphic that is dependent on the generation, and the DAFs of both populations need to be modeled jointly.

Markov chain theory.—

Mathieson and McVean (2013) inferred effective population sizes and migration rates from time series data (Table 1) while modeling multiple populations distributed on a lattice, where neighboring populations exchange migrants every generation.

Diffusion approximation.—

Gutenkunst et al. (2009) built a diffusion equation to jointly model the allele frequencies in multiple populations. They solved this equation using a finite-difference scheme to infer divergence times between populations, mutation rates, and migration rates. From the joint DAF, Gutenkunst et al. (2009) calculated the expected multi-population allele frequency spectrum (AFS), which summarizes allele frequency data. Because the dimension of the AFS depends on the number of populations, the time needed to compute the AFS grows exponentially with the number of populations. This limited their analysis to three populations. Lukić and Hey (2012) also calculated the expected AFS, but they extended the spectral decomposition method to calculate the joint DAF of multiple populations that exchange migrants, while accounting for de novo mutations. The implementation of Lukić and Hey (2012) was optimized to use little memory, and can therefore tackle more than three populations. However, it is slower than the method of Gutenkunst et al. (2009) on two and three populations.

Moment-based approximations.—

Pickrell and Pritchard (2012) used the normal distribution to infer divergence times between populations that have been evolving under pure drift and have exchanged migrants. Due to their use of the normal distribution, the method is not accurate for alleles with frequencies close to Inline graphic or Inline graphic.

Quality of approximations.—

As both the neutral mutation (equation (10)) and migration (equation (14)) models are special cases of the general linear evolutionary pressure model (Box 1), the quality of the approximations is similar. The approximation quality shown in Figure 6, where Inline graphic, also applies for Inline graphic and Inline graphic.

Selection

When selection is present, the different genotypes are transmitted to the next generation with different probabilities, determined by their fitness. If the Inline graphic allele has frequency Inline graphic and selection is parameterized by coefficient Inline graphic and dominance parameter Inline graphic, the three possible genotypes have the following frequencies (assuming Hardy–Weinberg equilibrium) and fitness (Crow and Kimura 1970)

Genotype Inline graphic Inline graphic Inline graphic
Frequency Inline graphic Inline graphic Inline graphic
Fitness Inline graphic Inline graphic Inline graphic

The allele count Inline graphic still follows the general process given in equation (15), with the evolutionary pressure function from equation (B.7).

Dynamics and moments.—

The first two moments of the DAF for the general linear evolutionary pressure (equations (B.4) and (B.5)) can be obtained using the law of total expectation and variance, respectively. These take the form

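$$E[x(r+1)\mid x(0)] = E[g(x(r))\mid x(0)], \tag{16}$$
$$\mathrm{Var}(x(r+1)\mid x(0)) = \frac{1}{2N}E[x(r+1)\mid x(0)] - E[x(r+1)\mid x(0)]^{2} + \left(1-\frac{1}{2N}\right)E[g(x(r))^{2}\mid x(0)]. \tag{17}$$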
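The step from the binomial sampling in equation (15) to equation (17) can be spelled out as follows. This is a sketch of the standard argument via the law of total variance, written out here for convenience rather than quoted from the cited works:

$$\begin{aligned}
\mathrm{Var}(x(r+1)\mid x(0)) &= E\bigl[\mathrm{Var}(x(r+1)\mid x(r))\mid x(0)\bigr] + \mathrm{Var}\bigl(E[x(r+1)\mid x(r)]\mid x(0)\bigr) \\
&= \frac{1}{2N}E\bigl[g(x(r))\,(1-g(x(r)))\mid x(0)\bigr] + E[g(x(r))^{2}\mid x(0)] - E[g(x(r))\mid x(0)]^{2} \\
&= \frac{1}{2N}E[g(x(r))\mid x(0)] - E[g(x(r))\mid x(0)]^{2} + \left(1-\frac{1}{2N}\right)E[g(x(r))^{2}\mid x(0)].
\end{aligned}$$

Replacing $E[g(x(r))\mid x(0)]$ by $E[x(r+1)\mid x(0)]$, which is equation (16), gives equation (17).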

The evaluation of Inline graphic and Inline graphic typically requires all moments of Inline graphic. However, these can be written as functions of only the first two moments when Inline graphic is a linear function in Inline graphic, allowing the above recursions to be solved in closed form (Tataru et al. 2015). When the allele is under selection and Inline graphic is no longer linear, we can approximate Inline graphic by only using the first two moments by relying on a Taylor series. This will yield a recursion for calculating the mean and variance of the DAF. The Taylor series can be evaluated around the deterministic trajectory of Inline graphic (Barton and Otto 2005; Terhorst et al. 2015), or around the pre-calculated mean of Inline graphic (Lacerda and Seoighe 2014).

To obtain the Taylor series about the deterministic trajectory, we decompose Inline graphic as Inline graphic, where Inline graphic represents the deterministic trajectory followed by the allele frequency in the infinite-population limit, and Inline graphic is the random disturbance away from Inline graphic. Then,

$$E[x(r)\mid x(0)] = \bar{x}(r) + E[\tilde{x}(r)\mid x(0)], \tag{18}$$
$$\mathrm{Var}(x(r)\mid x(0)) = \mathrm{Var}(\tilde{x}(r)\mid x(0)). \tag{19}$$

From equations (16) and (18) we obtain, using the Taylor series for Inline graphic about Inline graphic,

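$$E[\tilde{x}(r+1)\mid x(0)] \approx E[\tilde{x}(r)\mid x(0)]\,\left.\frac{dg}{dx}\right|_{\bar{x}(r)} + \frac{1}{2}E[\tilde{x}(r)^{2}\mid x(0)]\,\left.\frac{d^{2}g}{dx^{2}}\right|_{\bar{x}(r)}.$$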

Similarly, from the Taylor series of Inline graphic about Inline graphic, and using equations (17), (18), and (19) we obtain the recursion for Inline graphic,

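$$E[\tilde{x}(r+1)^{2}\mid x(0)] \approx \frac{1}{2N}\,\bar{x}(r+1)\bigl(1-\bar{x}(r+1)\bigr) + \frac{1}{2N}\bigl(1-2\bar{x}(r+1)\bigr)E[\tilde{x}(r+1)\mid x(0)] + \left(1-\frac{1}{2N}\right)E[\tilde{x}(r)^{2}\mid x(0)]\left(\left.\frac{dg}{dx}\right|_{\bar{x}(r)}\right)^{2}.$$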

By iterating the recursions above and calculating numerically the first two moments of Inline graphic, we can recover the mean and variance of the DAF after Inline graphic generations.
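The two recursions can be iterated in a few lines of code. The sketch below assumes that the pressure function and its first two derivatives are supplied as Python callables. Because equation (B.7) is not reproduced in this section, the helper `selection_pressure` uses a stand-in of the form g(x) = x + s x(1-x)(h + (1-2h)x), chosen only to illustrate the call pattern; both function names are hypothetical.

```python
def selection_pressure(s, h):
    """Stand-in pressure g(x) = x + s*x*(1-x)*(h + (1-2h)*x) with derivatives.

    This form is an assumption made for illustration only.
    """
    c = 1.0 - 2.0 * h
    g = lambda x: x + s * x * (1 - x) * (h + c * x)
    dg = lambda x: 1 + s * (h + 2 * (c - h) * x - 3 * c * x**2)
    d2g = lambda x: s * (2 * (c - h) - 6 * c * x)
    return g, dg, d2g


def taylor_moment_recursion(N, x0, generations, g, dg, d2g):
    """Approximate mean and variance of x(r) given x(0) = x0.

    Uses the Taylor expansion about the deterministic trajectory,
    combined with equations (18) and (19).
    """
    xbar = x0   # deterministic trajectory
    e1 = 0.0    # E[x~(r) | x(0)]
    e2 = 0.0    # E[x~(r)^2 | x(0)]
    for _ in range(generations):
        slope, curvature = dg(xbar), d2g(xbar)
        xbar_next = g(xbar)
        e1_next = e1 * slope + 0.5 * e2 * curvature
        e2_next = (
            xbar_next * (1 - xbar_next) / (2 * N)
            + (1 - 2 * xbar_next) * e1_next / (2 * N)
            + (1 - 1 / (2 * N)) * e2 * slope**2
        )
        xbar, e1, e2 = xbar_next, e1_next, e2_next
    mean = xbar + e1        # equation (18)
    var = e2 - e1**2        # equation (19)
    return mean, var
```

For a linear pressure the second derivative vanishes, and with g(x) = x the recursion returns the pure drift mean x(0) and variance x(0)(1 - x(0))(1 - (1 - 1/(2N))^r).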

Markov chain theory.—

Mathieson and McVean (2013) and Gompert (2015) inferred selection from time series data by discretizing continuous allele frequencies and building a Markov chain with normal and beta transition probabilities, respectively (Table 1). Gompert (2015) additionally allowed for variability in time of selection coefficients and population sizes.

Diffusion approximation.—

For a Wright–Fisher model with drift, mutation and selection, specified by equations (B.1), (B.2), (B.3), and (B.8), and letting Inline graphic, we obtain the following infinitesimal mean

$$a(x) = -\mu x + \nu(1-x) + S\,x(1-x)\bigl(h + (1-2h)\,x\bigr).$$

The diffusion equation when selection is present is the most difficult to solve. However, the stationary distribution is known in closed form (Wright 1937; Crow and Kimura 1970; Ewens 2004) and is, up to a normalization constant, given by a tilted beta distribution

$$f(x) \propto x^{2\nu-1}(1-x)^{2\mu-1}\,e^{S x\,(2h + (1-2h)x)}. \tag{20}$$
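Because equation (20) is known only up to a constant, a simple way to use it is to normalize it numerically on a grid. The sketch below does this with the trapezoidal rule on interior grid points. The function name and the grid size are arbitrary choices, and the quadrature is only reliable when both exponents 2ν-1 and 2μ-1 are non-negative, so that the density has no singularity at the boundaries.

```python
import numpy as np


def stationary_daf(mu, nu, S, h, grid_size=20001):
    """Numerically normalized stationary density from equation (20).

    Returns interior grid points x in (0, 1) and the density values f(x).
    """
    x = np.linspace(0.0, 1.0, grid_size + 2)[1:-1]   # drop the boundary points
    log_f = (
        (2 * nu - 1) * np.log(x)
        + (2 * mu - 1) * np.log1p(-x)
        + S * x * (2 * h + (1 - 2 * h) * x)
    )
    f = np.exp(log_f - log_f.max())                  # rescale to avoid overflow
    area = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(x))
    return x, f / area
```

With S = 0 the result reduces to the beta density of the neutral mutation model, which provides a convenient check.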

We note here that the diffusion limit to the Wright–Fisher model requires that the parameters involved in the evolutionary pressure, Inline graphic, Inline graphic, Inline graphic, Inline graphic, and Inline graphic, are all in the order of Inline graphic, such that the resulting scaled parameters, Inline graphic, Inline graphic, Inline graphic, Inline graphic, and Inline graphic, are in the order of Inline graphic. This is the source of the approximation of equation (B.6) with equation (B.7), and of the common practice of simplifying expressions by removing “small” terms (Feller et al. 1951; Wakeley 2005). It also indicates that in the diffusion limit, the population size Inline graphic acts as a scaling factor, and a rescaling of the parameters and time by a constant factor will not affect the DAF. This result is responsible for the notion that it is impossible to estimate, for example, the mutation rate and effective population size separately. However, although it may be true that there is low power in doing so, this is simply a consequence of the assumptions of the diffusion approximation. These might be expected to break down in cases in which the diffusion is not appropriate (Wakeley 2005). In this respect, the moment-based approximations are free of the small parameters assumption, especially because the mean and variance of the general linear evolutionary pressure can be calculated without making the approximation of large Inline graphic (Tataru et al. 2015). Therefore, moment-based approximations might be more appropriate when the evolutionary pressure is strong (Lacerda and Seoighe 2014).

Using the spectral decomposition of the diffusion equation, Kimura (1955b; 1957) found the DAF when selection is present. This approach was extended by Song and Steinrücken (2012) to improve the convergence properties for stronger selection, whereas Steinrücken et al. (2016) developed it further to model selection coefficients that vary over time in a piecewise constant manner. The DAF was also calculated using a finite-difference scheme (Bollback et al. 2008), finite-volume scheme (Zhao et al. 2013), a path integral formalism (Schraiber 2014) and other numerical approaches (Malaspinas et al. 2012; Ferrer-Admetlla et al. 2016).

Bollback et al. (2008), Malaspinas et al. (2012), and Steinrücken et al. (2014) jointly estimated selection coefficients and effective population sizes from time series data from one population. Ferrer-Admetlla et al. (2016) could additionally infer mutation rates. Živković et al. (2015) used the spectral decomposition of Song and Steinrücken (2012) to infer mutation, selection and variable population size from present-day data from one population. Vitalis et al. (2014) used the stationary distribution of the DAF when multiple populations exchange migrants and experience selection. As they used the stationary DAF, they could not recover any information about the divergence of the populations. Note that although the method of Gutenkunst et al. (2009) can in principle incorporate selection, the inference software does not estimate selection coefficients.

Moment-based approximations.—

Using the numerically approximated moments of the DAF, the truncated normal and beta distributions are obtained as previously. The beta with spikes approximation has not been extended to include selection. However, the approximation developed by Tataru et al. (2015) for the loss and fixation probabilities should still be reasonable if the selection pressure is small and the loss and fixation probabilities are mainly dominated by genetic drift.

Moment-based approximations have had limited use for inference of selection due to the difficulties in calculating the first two moments of the DAF. Both Lacerda and Seoighe (2014) and Terhorst et al. (2015) estimated effective population sizes and selection coefficients from time series data, using the normal distribution and the Taylor expansion approach. One critical difference between the two is that Lacerda and Seoighe (2014) assumed additive selection (Inline graphic) and used a Taylor series about the mean of Inline graphic, whereas Terhorst et al. (2015) made no assumptions about dominance and used a Taylor series about the deterministic trajectory. Additionally, Terhorst et al. (2015) were the first to incorporate linkage, but in practice their model is limited to jointly analyze only a small number of loci (typically 3).

Quality of approximations.—

Relative to pure drift, positive selection acts by increasing the expected frequency and probability of fixation of the Inline graphic allele, and decreasing the probability of loss (Figs. 4 and 7). For example, under pure drift and with a beginning frequency of Inline graphic, the probability that the mutation is lost (fixed) at Inline graphic is Inline graphic (Inline graphic), while when selection is present with Inline graphic, the probability is reduced (increased) to Inline graphic (Inline graphic). Overall, for Inline graphic, all approximations have a fit to the true DAF (Fig. 7) that is very similar to that for pure drift (Fig. 4). We note here that Inline graphic is a very small selection coefficient. For larger values of Inline graphic, the Taylor series approach leads to estimated values for the mean Inline graphic and variance Inline graphic for which Inline graphic, and these cannot be fitted by a beta distribution.

Figure 7.

Fit of various approximations to the true DAF with selection, calculated using the Markov chain property for Inline graphic, Inline graphic, Inline graphic and a range of Inline graphic and Inline graphic. Each column shows a different type of approximation, indicated at the top of the figure. a) Hellinger distance on log scale between the approximated and true DAF. The three “Inline graphic”s in each of the heatmaps indicate the combinations of Inline graphic and Inline graphic used in b). b) True (dashed lines) and approximated (solid lines) DAF for Inline graphic and different values of Inline graphic. Calculations are performed as for Figure 4. For comparison purposes, the a) heatmap and b) Inline graphic-axis scales are the same as in Figure 4.

MULTI-ALLELIC WRIGHT–FISHER MODEL

The bi-allelic Wright–Fisher model is typically a very good approximation for SNP data (because the per-nucleotide mutation rate is typically small), but due to highly mutable sites, ancestral polymorphism, very large sample size or large evolutionary distance, a number of SNPs may contain 3 or 4 alleles. Furthermore, highly variable loci (e.g., short tandem repeats) are still widely used, especially in forensics (Balding and Nichols 1997; Balding and Steele 2015), and are typically multi-allelic. In these cases, the data can be analyzed using the multi-allelic Wright–Fisher model, an extension of the bi-allelic model. Instead of following the frequency of one allele, which is sampled from a binomial distribution from one generation to the next, the multi-allelic model describes the joint distribution of the Inline graphic alleles present in the population, which are now sampled from one generation to the next from a multinomial distribution.

Pure Drift

Similar to the bi-allelic model, the simplest form is the pure random genetic drift model, where the stochastic fluctuations in the allele frequencies are purely determined by the random mating of the finite population (Fig. 8).

Figure 8.

Dynamics in the pure drift Inline graphic multi-allelic Wright–Fisher model for Inline graphic. The child inherits the parental allele.

Dynamics and moments.—

Let Inline graphic be the number of Inline graphic alleles in generation Inline graphic, Inline graphic and Inline graphic be the corresponding allele frequency. The distribution of Inline graphic is

$$\mathbf{z}(r+1) \mid \mathbf{z}(r) \sim \mathrm{Mult}\bigl(2N,\ \mathbf{x}(r)\bigr). \tag{21}$$

Here, Inline graphic is the multinomial distribution with sample size Inline graphic and probability vector Inline graphic.

To determine the mean and covariance of the DAF, we move from discrete generations to continuous time where one time unit corresponds to Inline graphic generations, and set Inline graphic. Then,

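$$E[\mathbf{x}(t)\mid\mathbf{x}(0)] = \mathbf{x}(0), \tag{22}$$
$$\mathrm{Var}(\mathbf{x}(t)\mid\mathbf{x}(0)) = \left(1-e^{-t}\right)\left(\mathrm{diag}\{\mathbf{x}(0)\} - \mathbf{x}(0)^{\top}\mathbf{x}(0)\right), \tag{23}$$

where $\top$ denotes vector transpose and $\mathbf{x}(0)$ is treated as a row vector. These formulas are natural extensions of equations (2) and (4).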
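A short sketch of the multinomial sampling step in equation (21), together with the mean and covariance in equations (22) and (23), is given below. The function names are hypothetical and NumPy is assumed.

```python
import numpy as np


def multinomial_generation(z, rng):
    """Draw the allele counts of the next generation, equation (21)."""
    z = np.asarray(z)
    two_n = int(z.sum())                  # total number of gene copies, 2N
    return rng.multinomial(two_n, z / two_n)


def pure_drift_moments(x0, t):
    """Mean and covariance under pure drift, equations (22) and (23)."""
    x0 = np.asarray(x0, dtype=float)
    mean = x0.copy()
    cov = (1.0 - np.exp(-t)) * (np.diag(x0) - np.outer(x0, x0))
    return mean, cov
```

Each row of the covariance matrix sums to zero because the allele frequencies always sum to one, and for two alleles the top-left entry reduces to the bi-allelic variance x(0)(1 - x(0))(1 - e^{-t}).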

Diffusion approximation.—

Diffusion theory can be extended from the bi-allelic to the multi-allelic case. We will not cover this here, but refer to Ewens (2004; section 4.8, p. 151) for a general discussion of multi-dimensional diffusion processes, and Ewens (2004; section 5.10, p. 192) for the Inline graphic-allele pure drift Wright–Fisher model. In particular, Ewens (2004) mentions that a generalization of equation (7) can be formulated and that a generalization of Kimura’s solution in terms of orthogonal polynomials exists.

Moment-based approximations.—

The beta distribution is a natural choice for approximating the DAF for the bi-allelic Wright–Fisher model, and it provides a good approximation when the allele is not close to being lost or fixed (Figs. 4, 6, and 7). It is therefore natural to approximate the DAF for the multi-allelic Wright–Fisher using the generalization of the beta distribution, the Dirichlet distribution (Balding and Nichols 1995; 1997). Just like for the bi-allelic case, where the beta distribution arises as the stationary DAF under linear evolutionary pressure, the Dirichlet distribution is the stationary DAF for a specific mutation model (Ewens 2004) (see below).

Under the Dirichlet model, also called the Balding–Nichols model (Balding and Steele 2015), the allele frequency vector Inline graphic follows a Dirichlet distribution

$$\mathbf{x}(t) \mid \mathbf{x}(0) \sim \mathrm{Dirichlet}(\boldsymbol{\alpha}),$$

where Inline graphic. This implies that allele Inline graphic has marginal distribution

$$x_i(t) \sim \mathrm{Beta}(\alpha_i,\ \alpha_0-\alpha_i), \quad \text{with } \alpha_0 = \sum_{i=1}^{K}\alpha_i.$$

Under the Dirichlet distribution, the mean and covariance of the DAF are

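$$E[\mathbf{x}(t)\mid\mathbf{x}(0)] = \frac{\boldsymbol{\alpha}}{\alpha_0}, \tag{24}$$
$$\mathrm{Var}(\mathbf{x}(t)\mid\mathbf{x}(0)) = \frac{1}{\alpha_0+1}\left(\mathrm{diag}\left\{\frac{\boldsymbol{\alpha}}{\alpha_0}\right\} - \left(\frac{\boldsymbol{\alpha}}{\alpha_0}\right)^{\top}\frac{\boldsymbol{\alpha}}{\alpha_0}\right). \tag{25}$$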

The mean and covariance of the DAF (equations (22) and (23)) are equivalent to those under the Dirichlet distribution (equations (24) and (25)) when

$$\mathbf{x}(0) = \frac{\boldsymbol{\alpha}}{\alpha_0}, \quad \text{and} \quad 1-e^{-t} = \frac{1}{\alpha_0+1}.$$

Therefore, the Dirichlet distribution can accurately capture the true mean and covariance of the multi-allelic pure drift Wright–Fisher model.
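The matching conditions can be solved for the Dirichlet parameters: the second condition gives α0 = 1/(e^t - 1), and the first gives α = α0 x(0). The sketch below implements this and the Dirichlet moments of equations (24) and (25); the function names are hypothetical, and t must be strictly positive.

```python
import numpy as np


def dirichlet_from_pure_drift(x0, t):
    """Dirichlet parameters matching the pure drift mean and covariance.

    Solves 1 - exp(-t) = 1 / (alpha0 + 1) for alpha0 and sets alpha = alpha0 * x0.
    """
    x0 = np.asarray(x0, dtype=float)
    alpha0 = 1.0 / np.expm1(t)
    return alpha0 * x0


def dirichlet_moments(alpha):
    """Mean and covariance of a Dirichlet distribution, equations (24) and (25)."""
    alpha = np.asarray(alpha, dtype=float)
    alpha0 = alpha.sum()
    p = alpha / alpha0
    cov = (np.diag(p) - np.outer(p, p)) / (alpha0 + 1.0)
    return p, cov
```

Note that α0 grows without bound as t approaches zero, which reflects the distribution concentrating on x(0), and shrinks towards zero for large t, where the probability mass moves to the monomorphic states.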

Neutral Mutations

Just as is the case for the bi-allelic model (Fig. 3), when the alleles evolve under pure drift, eventually the process will reach a monomorphic state, where only one of the alleles will be present in the population. The variation can be maintained in the population by allowing mutations (Fig. 9).

Figure 9.

Dynamics in the Inline graphic multi-allelic Wright–Fisher model with mutations for Inline graphic. If the parental allele is Inline graphic, the child receives the same allele with probability Inline graphic and another allele Inline graphic with probability Inline graphic, for Inline graphic.

Dynamics and moments.—

If Inline graphic is the probability of an Inline graphic allele to mutate to a Inline graphic allele, the multinomial distribution of Inline graphic becomes

$$\mathbf{z}(r+1) \mid \mathbf{z}(r) \sim \mathrm{Mult}\bigl(2N,\ \mathbf{x}(r)\,U\bigr),$$

where the mutation probabilities are stored in a Inline graphic matrix Inline graphic. By specifying the structure of Inline graphic, different evolutionary mutation models can be formulated, such as the Jukes–Cantor (JC) model, parent independent mutation model, infinite alleles model, Kimura model, and single-step mutation model (Felsenstein 2004).

The mean and covariance of the DAF in continuous time (Inline graphic) are obtained using the rate matrix Inline graphic, where Inline graphic is the identity matrix, from the diffusion approximation (Hobolth and Sirén 2016),

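$$E[\mathbf{x}(t)\mid\mathbf{x}(0)] = \mathbf{x}(0)\,e^{Qt}, \tag{26}$$
$$\mathrm{Var}(\mathbf{x}(t)\mid\mathbf{x}(0)) = \int_{0}^{t} e^{-s}\left(e^{Qs}\right)^{\top}\mathrm{diag}\left\{\mathbf{x}(0)\,e^{Q(t-s)}\right\}e^{Qs}\,ds - \left(e^{Qt}\right)^{\top}\mathbf{x}(0)^{\top}\mathbf{x}(0)\,e^{Qt}\left(1-e^{-t}\right). \tag{27}$$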

These general formulas make it possible to numerically calculate the mean and covariance for any mutation model. In practice, the mean can be calculated using one of the many available numerical procedures for matrix exponentials (Moler and Van Loan 2003). Calculating the covariance, which involves integrals of matrix exponentials, is more tedious, but this can be done numerically using the eigenvalue decomposition of the rate matrix (Hobolth and Sirén 2016).
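A minimal numerical sketch of equations (26) and (27) is shown below. It computes the matrix exponential from an eigendecomposition of the rate matrix, which assumes that Q is diagonalizable, and it evaluates the integral with a plain trapezoidal rule rather than the analytical eigenvalue-based integration of Hobolth and Sirén (2016). The function name and the number of quadrature steps are arbitrary choices.

```python
import numpy as np


def mutation_moments(x0, Q, t, steps=2000):
    """Numerical mean and covariance for a general rate matrix Q.

    x0 is the initial frequency vector (treated as a row vector) and the
    rows of Q sum to zero. Q is assumed to be diagonalizable.
    """
    x0 = np.asarray(x0, dtype=float)
    Q = np.asarray(Q, dtype=float)
    K = x0.size
    w, V = np.linalg.eig(Q)
    V_inv = np.linalg.inv(V)

    def expm(s):
        # e^{Qs} = V diag(exp(w s)) V^{-1}
        return ((V * np.exp(w * s)) @ V_inv).real

    mean = x0 @ expm(t)                                  # equation (26)
    s_grid = np.linspace(0.0, t, steps + 1)
    integrand = np.empty((steps + 1, K, K))
    for k, s in enumerate(s_grid):
        Es = expm(s)
        middle = np.diag(x0 @ expm(t - s))
        integrand[k] = np.exp(-s) * (Es.T @ middle @ Es)
    # Trapezoidal rule on the uniform grid
    integral = 0.5 * (integrand[1:] + integrand[:-1]).sum(axis=0) * (t / steps)
    cov = integral - np.outer(mean, mean) * (1.0 - np.exp(-t))   # equation (27)
    return mean, cov
```

For two alleles with Q = [[-μ, μ], [ν, -ν]], the first component of the mean and the top-left entry of the covariance agree with equations (11) and (12), up to quadrature error.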

The JC model is the simplest mutation model: all mutation probabilities are equal, Inline graphic, for all Inline graphic. The entries in the rate matrix for the JC model are given by

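$$Q_{ij} = 2N\,(U_{ij} - I_{ij}) = \begin{cases} \dfrac{q}{K-1} & \text{if } i \neq j, \\[4pt] -q & \text{if } i = j, \end{cases}$$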

where Inline graphic. The rate matrix can be written in matrix form as

$$Q = \frac{q}{K-1}\,(E - K I),$$

where Inline graphic is the Inline graphic matrix with Inline graphic in every entry. We can now obtain a closed-form solution for the matrix exponential Inline graphic, namely

$$e^{Qt} = e^{-\epsilon t/2}\left(I - \frac{E}{K}\right) + \frac{E}{K},$$

where Inline graphic. The mean and covariance in the JC model are found from equations (26) and (27) and given by

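$$E[\mathbf{x}(t)\mid\mathbf{x}(0)] = e^{-\epsilon t/2}\left(\mathbf{x}(0) - \frac{\mathbf{e}}{K}\right) + \frac{\mathbf{e}}{K}, \tag{28}$$
$$\begin{aligned}\mathrm{Var}(\mathbf{x}(t)\mid\mathbf{x}(0)) ={}& \frac{1}{K}\left(I - \frac{E}{K}\right)\frac{1}{1+\epsilon}\left(1-e^{-(1+\epsilon)t}\right) - \left(\mathbf{x}(0)-\frac{\mathbf{e}}{K}\right)^{\top}\left(\mathbf{x}(0)-\frac{\mathbf{e}}{K}\right)e^{-\epsilon t}\left(1-e^{-t}\right) \\ &+ \left(\mathrm{diag}\left\{\mathbf{x}(0)-\frac{\mathbf{e}}{K}\right\} - \left(\mathbf{x}(0)-\frac{\mathbf{e}}{K}\right)^{\top}\frac{\mathbf{e}}{K} - \frac{\mathbf{e}^{\top}}{K}\left(\mathbf{x}(0)-\frac{\mathbf{e}}{K}\right)\right) \times e^{-\epsilon t/2}\,\frac{1}{1+\epsilon/2}\left(1-e^{-(1+\epsilon/2)t}\right),\end{aligned} \tag{29}$$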

where Inline graphic is the Inline graphic vector with 1 in every entry.
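Equations (28) and (29) translate directly into code. In the sketch below the parameter ε is passed in as the argument `eps`, so no assumption is made about how it relates to the mutation probabilities; the function name is hypothetical.

```python
import numpy as np


def jc_moments(x0, eps, t):
    """Mean and covariance under the JC model, equations (28) and (29)."""
    x0 = np.asarray(x0, dtype=float)
    K = x0.size
    e_over_k = np.full(K, 1.0 / K)                           # the vector e / K
    d = x0 - e_over_k                                        # x(0) - e / K
    centering = np.eye(K) - np.full((K, K), 1.0 / K)         # I - E / K
    mean = np.exp(-eps * t / 2) * d + e_over_k
    cov = (
        centering / K * (1 - np.exp(-(1 + eps) * t)) / (1 + eps)
        - np.outer(d, d) * np.exp(-eps * t) * (1 - np.exp(-t))
        + (np.diag(d) - np.outer(d, e_over_k) - np.outer(e_over_k, d))
        * np.exp(-eps * t / 2) * (1 - np.exp(-(1 + eps / 2) * t)) / (1 + eps / 2)
    )
    return mean, cov
```

Setting `eps = 0` recovers the pure drift covariance of equation (23), and for large t the output approaches the uniform mean e/K with covariance (1/K)(I - E/K)/(1 + ε).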

For Inline graphic, these reduce to

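$$E[\mathbf{x}(t)\mid\mathbf{x}(0)] = \frac{\mathbf{e}}{K}, \qquad \mathrm{Var}(\mathbf{x}(t)\mid\mathbf{x}(0)) = \frac{1}{K}\left(I - \frac{E}{K}\right)\frac{1}{1+\epsilon}.$$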

We note that these moments are the same as for a Dirichlet distribution with Inline graphic, and indeed the Dirichlet distribution is the stationary DAF of the multi-allelic JC Wright–Fisher model (Ewens 2004).

Moment-based approximations.—

The mean and covariance of the Dirichlet distribution (equations (24) and (25)) are equivalent to those under the JC model if the covariance approximately fulfills the proportionality condition

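$$\begin{aligned}\mathrm{Var}(\mathbf{x}(t)\mid\mathbf{x}(0)) \propto{}& \mathrm{diag}\{E[\mathbf{x}(t)]\} - E[\mathbf{x}(t)]^{\top}E[\mathbf{x}(t)] \\ ={}& \frac{1}{K}\left(I - \frac{E}{K}\right) - \left(\mathbf{x}(0)-\frac{\mathbf{e}}{K}\right)^{\top}\left(\mathbf{x}(0)-\frac{\mathbf{e}}{K}\right)e^{-\epsilon t} \\ &+ \left(\mathrm{diag}\left\{\mathbf{x}(0)-\frac{\mathbf{e}}{K}\right\} - \left(\mathbf{x}(0)-\frac{\mathbf{e}}{K}\right)^{\top}\frac{\mathbf{e}}{K} - \frac{\mathbf{e}^{\top}}{K}\left(\mathbf{x}(0)-\frac{\mathbf{e}}{K}\right)\right)e^{-\epsilon t/2},\end{aligned} \tag{30}$$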

where we used the expression for the mean in equation (28). By comparing equations (29) and (30), we observe that the expressions are approximately proportional with proportionality constant Inline graphic when Inline graphic is small, which corresponds to the pure drift case. Regardless of the parameter Inline graphic, the expressions are also approximately proportional, with proportionality constant Inline graphic, when the evolutionary distance Inline graphic is small. Finally, for large Inline graphic, the proportionality constant is Inline graphic, because the Dirichlet distribution is the stationary distribution for the JC model. These analytical considerations are confirmed by Figure 10. The Dirichlet distribution cannot accurately capture the mean and covariance of the JC model for intermediate values of Inline graphic, and the deviation is very clear for large values of Inline graphic (Fig. 10b). Therefore, care should be taken when using the Dirichlet distribution in practice. Because the JC model is the simplest mutation model, with just one parameter, the fit of the Dirichlet distribution can be expected to be even more problematic for more complex mutation models. An important step in developing more appropriate distributions for the DAF under the multi-allelic Wright–Fisher model was made by Sirén et al. (2013) and Hobolth and Sirén (2016), but in general more research is needed in this direction.

Figure 10.

Fit of the Dirichlet distribution (dotted lines) to the true mean and covariance of the multi-allelic JC Wright–Fisher model (solid lines) with a) Inline graphic (Inline graphic small), and b) Inline graphic (Inline graphic large). All six plots are calculated for Inline graphic, Inline graphic, Inline graphic and different values of Inline graphic.

CONCLUSION AND PERSPECTIVES

We have provided a broad overview of methods to calculate the DAF under the Wright–Fisher model. These methods have a number of working assumptions in common. Here, we discuss in turn each of these and how current methods tackle these issues or potentially could be improved to do so.

Virtually all methods presented here assume unlinked loci; a notable exception is the moment-based approach of Terhorst et al. (2015). Several inference methods built on the coalescent process analyze pairs of linked neutral loci (Li and Durbin 2011; Paul et al. 2011; Sheehan et al. 2013; Schiffels and Durbin 2014; Rasmussen et al. 2014). Some theoretical results do exist for linked bi-allelic selected loci in the limit of a large population size (Lessard and Kermany 2012), but these have limited use for inference methods. To our knowledge, equivalent results for the joint DAF of two loci are not available, but see Jenkins et al. (2014) for an approximation for loosely linked loci. These results could be used as the basis for including recombination in inference methods, but so far these types of approximations have not been widely adopted for inference under the Wright–Fisher model.

The methods we have presented here also implicitly assume very simplified demographic scenarios. In some instances, especially if the data contain individuals sampled from populations with complex demography, it might be easier, although more computationally intensive, to rely on simulations under the Wright–Fisher model to perform inference (Excoffier et al. 2013). In particular, much progress has been made in using Approximate Bayesian Computation (ABC) that relies on a series of summary statistics from simulations and either rejection sampling or more sophisticated methods to obtain approximated posterior distributions for the parameters of interest (Beaumont et al. 2002; Blum 2010; Marin et al. 2012). This is becoming very much a field of its own with active method development in the choice of relevant summary statistics, quasi-sufficiency (Joyce and Marjoram 2008), and various algorithms to obtain computationally efficient approximations of the likelihood (Excoffier et al. 2013). Simulation-based methods can be a source of inspiration for developing methods seeking to specify the DAF under a wider range of scenarios and less restrictive assumptions. Such an example is the inference of selection from time series data from experimental evolution, where the population size undergoes periodic bottlenecks, followed by population expansion. This is typical of most experimental setups, where the population size is experimentally controlled (Foll et al. 2014).

Ultimately, the directions of future method development are likely to be conditioned by the type of data modeled and the evolutionary or demographic questions of interest that motivate the need for inference. As genome-wide re-sequencing becomes increasingly common and replaces most SNP genotyping and exome sequencing, we can expect that the data will increasingly consist of polymorphism counts among tightly linked sites.

One direction worth exploring is using the Wright–Fisher model to learn about how selection varies along the genome, and thereby shapes genome-wide diversity. Some progress has been made in inferring mutation rates and selection coefficients by expressing expected local levels of nucleotide diversity as a function of the amount of selection affecting neutral sites due to linkage (Elyashiv et al. 2014).

Finally, at present, most software programs that implement inference methods have been developed for bi-allelic data, whereas inference for multi-allelic data is clearly lagging behind. We have discussed recent attempts to understand and formulate approximations for the DAF under the multi-allelic Wright–Fisher model with mutation. These developments are expected to improve modeling of short tandem repeat data that are still widely used in forensics (Balding and Steele 2015). They might also allow the analysis of a broader range of biological situations where the bi-allelic assumption is not always appropriate, for example, when there is extensive heterogeneity in the mutation rate or the product of effective population size and mutation rate is high, as is the case for microbial and viral genomes.

ACKNOWLEDGMENTS

We would like to thank Jukka Sirén, Dominik Schrempf, the associate editor, editor, and an anonymous reviewer for their constructive suggestions and comments that helped improve the manuscript.

FUNDING

This work has been supported, in part, by the European Research Council under the European Union's Seventh Framework Program (FP7/2007-2013, ERC grant number 311341) and the Danish Research Council (grant number DFF4002-00382).

References

  1. Balding D.J., Nichols R.A. 1995. A method for quantifying differentiation between populations at multi-allelic loci and its implications for investigating identity and paternity. Genetica 96:3–12.
  2. Balding D.J., Nichols R.A. 1997. Significant genetic correlations among Caucasians at forensic DNA loci. Heredity 78(6):583–589.
  3. Balding D.J., Steele C.D. 2015. Weight-of-evidence for forensic DNA profiles. Chichester: John Wiley and Sons.
  4. Barton N.H., Otto S.P. 2005. Evolution of recombination due to random drift. Genetics 169(4):2353–2370.
  5. Beaumont M.A., Zhang W., Balding D.J. 2002. Approximate Bayesian computation in population genetics. Genetics 162(4):2025–2035.
  6. Bhaskar A., Song Y.S. 2009. Multi-locus match probability in a finite population: a fundamental difference between the Moran and Wright–Fisher models. Bioinformatics 25(12):i187–i195.
  7. Blum M.G.B. 2010. Approximate Bayesian computation: a nonparametric perspective. J. Amer. Stat. Ass. 105(491):1178–1187.
  8. Bollback J.P., York T.L., Nielsen R. 2008. Estimation of 2Nes from temporal allele frequency data. Genetics 179(1):497–502.
  9. Cavalli-Sforza L.L., Edwards A.W.F. 1967. Phylogenetic analysis: models and estimation procedures. Amer. J. Hum. Gen. 19(3 Pt 1):233.
  10. Chang J.S., Cooper G. 1970. A practical difference scheme for Fokker-Planck equations. J. Comp. Phy. 6(1):1–16.
  11. Coop G., Witonsky D., Rienzo A.D., Pritchard J.K. 2010. Using environmental correlations to identify loci underlying local adaptation. Genetics 185(4):1411–1423.
  12. Crow J.F., Kimura M. 1970. An introduction to population genetics theory. New York, Evanston and London: Harper & Row.
  13. Crow J.F. 1954. Random mating with linkage in polysomics. Amer. Nat. 88(843):431–434.
  14. Crow J.F. 1987. Population genetics history: a personal view. Ann. Rev. Gen. 21(1):1–22.
  15. Crow J.F., Kimura M. 1956. Some genetic problems in natural populations. In: Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, volume 4. Berkeley and Los Angeles: University of California Press, p. 1–22.
  16. De Maio N., Schlötterer C., Kosiol C. 2013. Linking great apes genome evolution across time scales using polymorphism-aware phylogenetic models. Mol. Biol. Evol. 30(10):2249–2262.
  17. De Maio N., Schrempf D., Kosiol C. 2015. PoMo: An allele frequency-based approach for species tree estimation. Syst. Biol. 64(6):1018–1031.
  18. Donnelly P. 1984. The transient behaviour of the Moran model in population genetics. In: Mathematical Proceedings of the Cambridge Philosophical Society, volume 95. Cambridge: Cambridge University Press, p. 349–358.
  19. Durrett R. 2008. Probability models for DNA sequence evolution. New York: Springer.
  20. Edwards S.V., Xi Z., Janke A., Faircloth B.C., McCormack J.E., Glenn T.C., Zhong B., Wu S., Lemmon E.M., Lemmon A.R., Leaché A.D., Liu L., Davis C.C. 2016. Implementing and testing the multispecies coalescent model: A valuable paradigm for phylogenomics. Mol. Phyl. Evol. 94:447–462.
  21. Elyashiv E., Sattath S., Hu T.T., Strustovsky A., McVicker G., Andolfatto P., Coop G., Sella G. 2014. A genomic map of the effects of linked selection in Drosophila. arXiv preprint arXiv:1408.5461 [q-bio.PE]. doi: 10.1371/journal.pgen.1006130.
  22. Ewens W.J. 1972. The sampling theory of selectively neutral alleles. Theor. Pop. Biol. 3(1):87–112.
  23. Ewens W.J. 2004. Mathematical Population Genetics 1: I. Theoretical Introduction, Vol. 27. Springer Science & Business Media.
  24. Excoffier L., Dupanloup I., Huerta-Sánchez E., Sousa V.C., Foll M. 2013. Robust demographic inference from genomic and SNP data. PLoS Gen. 9(10):e1003905.
  25. Feller W. 1951. Diffusion processes in genetics. In: Proceedings of Second Berkeley Symposium on Mathematical Statistics and Probability. Berkeley and Los Angeles: University of California Press, p. 227–246.
  26. Felsenstein J. 2004. Inferring phylogenies. Sunderland, Massachusetts: Sinauer Associates, Inc.
  27. Ferrer-Admetlla A., Leuenberger C., Jensen J.D., Wegmann D. 2016. An approximate Markov model for the Wright–Fisher diffusion and its application to time series data. Genetics 203(2):831–846.
  28. Fisher R.A. 1930. The genetical theory of natural selection. Oxford: Clarendon.
  29. Foll M., Gaggiotti O. 2008. A genome-scan method to identify selected loci appropriate for both dominant and codominant markers: a Bayesian perspective. Genetics 180(2):977–993.
  30. Foll M., Poh Y.-P., Renzette N., Ferrer-Admetlla A., Bank C., Shim H., Malaspinas A.-S., Ewing G., Liu P., Wegmann D., Caffrey D.R., Zeldovich K.B., Bolon D.N., Wang J.P., Kowalik T.F., Schiffer C.A., Finberg R.W., Jensen J.D. 2014. Influenza virus drug resistance: a time-sampled population genetics perspective. PLoS Gen. 10(2):e1004185.
  31. Freeman W.H., Wang J., Whitlock M.C. 2003. Estimating effective population size and migration rates from genetic samples over space and time. Genetics 163(1):429–446.
  32. Fu Y.-X., Li W.-H. 1999. Coalescing into the 21st century: an overview and prospects of coalescent theory. Theor. Pop. Biol. 56(1):1–10.
  33. Gaggiotti O.E., Foll M. 2010. Quantifying population structure using the F-model. Mol. Ecol. Res. 10(5):821–830.
  34. Gautier M., Vitalis R. 2013. Inferring population histories using genome-wide allele frequency data. Mol. Biol. Evol. 30(3):654–668.
  35. Gautier M., Hocking T.D., Foulley J.-L. 2010. A Bayesian outlier criterion to detect SNPs under selection in large data sets. PLoS ONE 5(8):e11913.
  36. Gautier M. 2015. Genome-wide scan for adaptive divergence and association with population-specific covariates. Genetics 201(4):1555–1579.
  37. Gompert Z. 2015. Bayesian inference of selection in a heterogeneous environment from genetic timeseries data. Mol. Ecol. 25(1):121–134.
  38. Gutenkunst R.N., Hernandez R.D., Williamson S.H., Bustamante C.D. 2009. Inferring the joint demographic history of multiple populations from multidimensional SNP frequency data. PLoS Gen. 5(10):e1000695.
  39. Haasl R.J., Payseur B.A. 2015. Fifteen years of genomewide scans for selection: trends, lessons and unaddressed genetic sources of complication. Mol. Ecol. 25(1):5–23.
  40. Hobolth A., Sirén J. 2016. The multivariate Wright–Fisher process with mutation: Moment-based analysis and inference using a hierarchical beta model. Theor. Pop. Biol. 108:36–50.
  41. Hobolth A., Christensen O.F., Mailund T., Schierup M.H. 2007. Genomic relationships and speciation times of human, chimpanzee, and gorilla inferred from a coalescent hidden Markov model. PLoS Gen. 3(2):e7.
  42. Hui T.-Y.J., Burt A. 2015. Estimating effective population size from temporally spaced samples with a novel, efficient maximum-likelihood algorithm. Genetics 200(1):285–293.
  43. Jenkins P.A., Fearnhead P., Song Y.S. 2014. Tractable diffusion and coalescent processes for weakly correlated loci. arXiv preprint arXiv:1405.6863. doi: 10.1214/ejp.v20-3564.
  44. Joyce P., Marjoram P. 2008. Approximately sufficient statistics and Bayesian computation. Stat. Appl. Gen. Mol. Biol. 7(1):1544–6115.
  45. Karlin S., Taylor H.M. 1975. A first course in stochastic processes. 2nd ed. San Diego: Academic Press.
  46. Kimura M. 1955a. Solution of a process of random genetic drift with a continuous model. Proc. Nat. Acad. Sci. USA 41(3):144.
  47. Kimura M. 1955b. Stochastic processes and distribution of gene frequencies under natural selection. In: Cold Spring Harbor Symposia on Quantitative Biology, volume 20. Cold Spring Harbor: Cold Spring Harbor Laboratory Press, p. 33–53.
  48. Kimura M. 1957. Some problems of stochastic processes in genetics. Ann. Math. Stat. 28(4):882–901.
  49. Kimura M. 1964. Diffusion models in population genetics. J. App. Prob. 1(2):177–232.
  50. Kingman J.F.C. 1982a. Exchangeability and the evolution of large populations. In: Koch G., Spizzichino F., editors. Exchangeability in probability and statistics. Amsterdam: North-Holland, p. 97–112.
  51. Kingman J.F.C. 1982b. On the genealogy of large populations. J. App. Prob. 19A:27–43.
  52. Kingman J.F.C. 1982c. The coalescent. Sto. Proc. Appl. 13(3):235–248.
  53. Kingman J.F.C. 2000. Origins of the coalescent: 1974-1982. Genetics 156(4):1461–1463.
  54. Kolmogorov A. 1931. Über die analytischen Methoden in der Wahrscheinlichkeitsrechnung. Mathematische Annalen 104:415–458.
  55. Kuhner M.K. 2009. Coalescent genealogy samplers: windows into population history. Tren. Ecol. Evol. 24(2):86–93.
  56. Lacerda M., Seoighe C. 2014. Population genetics inference for longitudinally-sampled mutants under strong selection. Genetics 198(3):1237–1250.
  57. Lessard S., Kermany A.R. 2012. Fixation probability in a two-locus model by the ancestral recombination–selection graph. Genetics 190(2):691–707.
  58. Le Cam L.M., Lo Yang G. 2000. Asymptotics in statistics: some basic concepts. 2nd ed. New York: Springer.
  59. Liu L., Yu L., Kubatko L., Pearl D.K., Edwards S.V. 2009. Coalescent methods for estimating phylogenetic trees. Mol. Phyl. Evol. 53(1):320–328.
  60. Liu X., Fu Y.-X. 2015. Exploring population size changes using SNP frequency spectra. Nat. Gen. 47:555–559.
  61. Li H., Durbin R. 2011. Inference of human population history from individual whole-genome sequences. Nature 475(7357):493–496.
  62. Lukić S., Hey J. 2012. Demographic inference using spectral methods on SNP data, with an analysis of the human out-of-Africa expansion. Genetics 192(2):619–639.
  63. Mailund T., Halager A.E., Westergaard M., Dutheil J.Y., Munch K., Andersen L.N., Lunter G., Prüfer K., Scally A., Hobolth A., Schierup M.H. 2012. A new isolation with migration model along complete genomes infers very different divergence processes among closely related great ape species. PLoS Gen. 8(12):e1003125.
  64. Malaspinas A.-S., Malaspinas O., Evans S.N., Slatkin M. 2012. Estimating allele age and selection coefficient from time-serial data. Genetics 192(2):599–607.
  65. Marin J.-M., Pudlo P., Robert C.P., Ryder R.J. 2012. Approximate Bayesian computational methods. Statistics and Computing 22(6):1167–1180.
  66. Mathieson I., McVean G. 2013. Estimating selection coefficients in spatially structured populations from time series data of allele frequencies. Genetics 193(3):973–984.
  67. Moler C., Loan C.V. 2003. Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later. SIAM Rev. 45(1):3–49.
  68. Moran P.A.P. 1958. Random processes in genetics. In: Mathematical Proceedings of the Cambridge Philosophical Society, volume 54. Cambridge: Cambridge University Press, p. 60–71.
  69. Muirhead C.A., Wakeley J. 2009. Modeling multiallelic selection using a Moran model. Genetics 182(4):1141–1157.
  70. Nicholson G., Smith A.V., Jónsson F., Gústafsson Ó., Stefánsson K., Donnelly P. 2002. Assessing population differentiation and isolation from single-nucleotide polymorphism data. J. Roy. Stat. Soc. Ser. B (Stat. Meth.) 64(4):695–715.
  71. Nielsen R., Slatkin M. 2013. An introduction to population genetics: theory and applications. Sunderland, Massachusetts: Sinauer Associates, Inc.
  72. Paul J.S., Steinrücken M., Song Y.S. 2011. An accurate sequentially Markov conditional sampling distribution for the coalescent with recombination. Genetics 187(4):1115–1128.
  73. Pickrell J.K., Pritchard J.K. 2012. Inference of population splits and mixtures from genomewide allele frequency data. PLoS Gen. 8(11):e1002967.
  74. Rasmussen M.D., Hubisz M.J., Gronau I., Siepel A. 2014. Genome-wide inference of ancestral recombination graphs. PLoS Gen. 10(5):e1004342.
  75. Schiffels S., Durbin R. 2014. Inferring human population size and separation history from multiple genome sequences. Nat. Gen. 46:919–925.
  76. Schraiber J.G. 2014. A path integral formulation of the Wright–Fisher process with genic selection. Theor. Pop. Biol. 92:30–35.
  77. Sheehan S., Harris K., Song Y.S. 2013. Estimating variable effective population sizes from multiple genomes: a sequentially Markov conditional sampling distribution approach. Genetics 194(3):647–662.
  78. Sirén J., Marttinen P., Corander J. 2011. Reconstructing population histories from single nucleotide polymorphism data. Mol. Biol. Evol. 28(1):673–683.
  79. Sirén J., Hanage W.P., Corander J. 2013. Inference on population histories by approximating infinite alleles diffusion. Mol. Biol. Evol. 30(2):457–468.
  80. Sirén J. 2012. Statistical models for inferring the structure and history of populations from genetic data [PhD thesis]. Helsinki: University of Helsinki, Faculty of Science, Department of Mathematics and Statistics.
  81. Song Y.S., Steinrücken M. 2012. A simple method for finding explicit analytic transition densities of diffusion processes with general diploid selection. Genetics 190(3):1117–1129.
  82. Steinrücken M., Bhaskar A., Song Y.S. 2014. A novel spectral method for inferring general diploid selection from time series genetic data. Ann. App. Stat. 8(4):2203.
  83. Steinrücken M., Jewett E.M., Song Y.S. 2016. SpectralTDF: transition densities of diffusion processes with time-varying selection parameters, mutation rates, and effective population sizes. Bioinformatics 32(5):795–797.
  84. Tataru P., Bataillon T., Hobolth A. 2015. Inference under a Wright–Fisher model using an accurate beta approximation. Genetics 201:1133–1141.
  85. Terhorst J., Schlötterer C., Song Y.S. 2015. Multi-locus analysis of genomic time series data from experimental evolution. PLoS Gen. 11(4):e1005069.
  86. Vitalis R., Gautier M., Dawson K.J., Beaumont M.A. 2014. Detecting and measuring selection from gene frequency data. Genetics 196:799–817.
  87. Vogl C., Clemente F. 2012. The allele-frequency spectrum in a decoupled Moran model with mutation, drift, and directional selection, assuming small mutation rates. Theor. Pop. Biol. 81(3):197–209.
  88. Wakeley J. 2005. The limits of theoretical population genetics. Genetics 169(1):1–7.
  89. Wakeley J. 2009. Coalescent theory: an introduction. Greenwood Village: Roberts & Company Publishers.
  90. Wang J. 2001. A pseudo-likelihood method for estimating effective population size from temporally spaced samples. Gen. Res. 78(03):243–257.
  91. Williamson E.G., Slatkin M. 1999. Using maximum likelihood to estimate population size from temporal changes in allele frequencies. Genetics 152(2):755–761.
  92. Wright S. 1931. Evolution in Mendelian populations. Genetics 16:97–159.
  93. Wright S. 1937. The distribution of gene frequencies in populations. Proc. Nat. Acad. Sci. 23(6):307–320.
  94. Wright S. 1938. The distribution of gene frequencies under irreversible mutation. Proc. Nat. Acad. Sci. 24(7):253–259.
  95. Wright S. 1942. Statistical genetics and evolution. Bull. Amer. Math. Soc. 48(4):223–246.
  96. Wright S. 1945. The differential equation of the distribution of gene frequencies. Proc. Nat. Acad. Sci. USA 31(12):382.
  97. Zhao L., Yue X., Waxman D. 2013. Complete numerical solution of the diffusion equation of random genetic drift. Genetics 194(4):973–985.
  98. Živković D., Steinrücken M., Song Y.S., Stephan W. 2015. Transition densities and sample frequency spectra of diffusion processes with selection and variable population size. Genetics 200:601–617.

Articles from Systematic Biology are provided here courtesy of Oxford University Press