Author manuscript; available in PMC: 2014 Feb 1.
Published in final edited form as: Theor Popul Biol. 2012 Nov 2;83:1–14. doi: 10.1016/j.tpb.2012.10.006

An explicit transition density expansion for a multi-allelic Wright-Fisher diffusion with general diploid selection

Matthias Steinrücken a, Y X Rachel Wang a, Yun S Song a,b,
PMCID: PMC3568258  NIHMSID: NIHMS419697  PMID: 23127866

Abstract

Characterizing time-evolution of allele frequencies in a population is a fundamental problem in population genetics. In the Wright-Fisher diffusion, such dynamics is captured by the transition density function, which satisfies well-known partial differential equations. For a multi-allelic model with general diploid selection, various theoretical results exist on representations of the transition density, but finding an explicit formula has remained a difficult problem. In this paper, a technique recently developed for a diallelic model is extended to find an explicit transition density for an arbitrary number of alleles, under a general diploid selection model with recurrent parent-independent mutation. Specifically, the method finds the eigenvalues and eigenfunctions of the generator associated with the multi-allelic diffusion, thus yielding an accurate spectral representation of the transition density. Furthermore, this approach allows for efficient, accurate computation of various other quantities of interest, including the normalizing constant of the stationary distribution and the rate of convergence to this distribution.

1. Introduction

Diffusion processes can be used to describe the evolution of population-wide allele frequencies in large populations, and they have been successfully applied in various population genetic analyses in the past. Karlin and Taylor (1981), Ewens (2004), and Durrett (2008) provide excellent introductions to the subject. The diffusion approximation captures the key features of the underlying evolutionary model and provides a concise framework for describing the dynamics of allele frequencies, even in complex evolutionary scenarios. However, finding explicit expressions for the transition density function (TDF) is a challenging problem for most models of interest. Although a partial differential equation (PDE) satisfied by the TDF can be readily obtained from standard diffusion theory, few models admit analytic solutions.

Since closed-form transition density functions are unknown for general diffusions, approaches such as finite difference methods (Bollback et al., 2008; Gutenkunst et al., 2009) and series expansions (Lukić et al., 2011) have been adopted recently to obtain approximate solutions. In a finite difference scheme, one needs to discretize the state space, but since the TDF depends on the parameters of the model (e.g. the selection coefficients), the suitability of a given discretization might depend strongly on the parameter values, and whether a particular discretization would produce accurate solutions is difficult to predict a priori. Series expansions allow one to circumvent the problem of choosing an appropriate discretization for the state space. However, if the chosen basis functions in the representation are not the eigenfunctions of the diffusion generator, as in Lukić et al. (2011), then one has to solve a system of coupled ordinary differential equations (ODEs) to obtain the transition density. Lukić et al. (2011) solve this system of ODEs numerically, which may introduce numerical approximation errors.

If the eigenvalues and eigenfunctions of the diffusion generator can be found, the spectral representation (which in some sense provides the optimal series expansion) of the TDF can be obtained. For the one-locus Wright-Fisher diffusion with an arbitrary number K of alleles evolving under a neutral parent-independent mutation (PIM) model, Shimakura (1977) and Griffiths (1979) derived an explicit spectral representation of the TDF using orthogonal polynomials. More recently, Baxter et al. (2007) derived the same solution by diagonalizing the associated PDE using a suitable coordinate transformation, followed by solving for each dimension independently. In a related line of research, Griffiths and Li (1983) and Tavaré (1984) expressed the time-evolution of the allele frequencies in terms of a stochastic process dual to the diffusion, and showed that the resulting expression is closely related to the spectral representation.

The duality approach was later extended by Barbour et al. (2000) to incorporate a general selection model. Although theoretically very interesting, this approach does not readily lead to efficient computation of the TDF for the following reason: computation under the dual process requires evaluating the moments of the stationary distribution. Although the functional form of this distribution is known (Ethier and Kurtz, 1994; Barbour et al., 2000), the normalization constant and moments can only be computed analytically in special cases (Genz and Joyce, 2003), and numerical computation under a general model of diploid selection is difficult. Incidentally, this issue arises in various applications (e.g., Buzbas et al. 2009, 2011), and it has therefore received significant attention in the past; see, for example, Donnelly et al. (2001) and Buzbas and Joyce (2009).

Many decades ago, Kimura (1955, 1957) addressed the problem of finding an explicit spectral representation of the TDF for models with selection. Specifically, in the case of a diallelic model with special selection schemes, he employed a perturbation method to find the required eigenvalues and eigenfunctions of the diffusion generator. Being perturbative in the selection coefficient, this approach is accurate only for small selection parameters. Recently, Song and Steinrücken (2012) revisited this problem and developed an alternative method of deriving an explicit spectral representation of the TDF for the diallelic Wright-Fisher diffusion under a general diploid selection model with recurrent mutation. In contrast to Kimura’s approach, this new approach is non-perturbative and is applicable to a broad range of parameter values. The goal of the present paper is to extend the work of Song and Steinrücken (2012) to an arbitrary number K of alleles, assuming a PIM model with general diploid selection.

The rest of this paper is organized as follows. In Section 2, we lay out the necessary mathematical background and review the work of Song and Steinrücken (2012) in the case of a diallelic (K = 2) model with general diploid selection. In Section 3, we describe the spectral representation for the neutral PIM model with an arbitrary number K of alleles. Then, in Section 4, we generalize the method of Song and Steinrücken (2012) to an arbitrary K-allelic PIM model with general diploid selection. We demonstrate in Section 5 that the quantities involved in the spectral representation converge rapidly. Further, we discuss the computation of the normalization constant of the stationary distribution under mutation-selection balance and the rate of convergence to this distribution. We conclude in Section 6 with potential applications and extensions of our work.

2. Background

2.1. The Wright-Fisher diffusion

In this paper, we consider a single locus with $K$ distinct possible alleles. The dynamics of the allele frequencies in a large population is commonly approximated by the Wright-Fisher diffusion on the $(K-1)$-simplex

$$\Delta_{K-1} := \left\{ x \in \mathbb{R}_{\geq 0}^{K-1} : 1 - |x| \geq 0 \right\},$$

where $|x| = \sum_{i=1}^{K-1} x_i$. For a given $x = (x_1, \ldots, x_{K-1}) \in \Delta_{K-1}$, the component $x_i$ denotes the population frequency of allele $i \in \{1, \ldots, K-1\}$. The frequency of allele $K$ is given by $x_K = 1 - |x|$.

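The associated diffusion generator $\mathcal{L}$ is a second-order differential operator of the form

$$\mathcal{L} f(x) = \frac{1}{2} \sum_{i,j=1}^{K-1} b_{i,j}(x) \frac{\partial^2}{\partial x_i \partial x_j} f(x) + \sum_{i=1}^{K-1} a_i(x) \frac{\partial}{\partial x_i} f(x), \tag{1}$$

which acts on twice continuously differentiable functions $f : \Delta_{K-1} \to \mathbb{R}$. The diffusion coefficient $b_{i,j}(x)$ is given by

$$b_{i,j}(x) = x_i (\delta_{i,j} - x_j),$$

where the Kronecker delta $\delta_{i,j}$ is equal to 1 if $i = j$ and 0 otherwise. For a neutral PIM model, we use $\theta_i = 4N u_i$ to denote the population-scaled mutation rate associated with allele $i$, where $u_i$ is the probability of mutation producing allele $i$ per individual per generation and $N$ is the effective population size. Under this model, the drift coefficient $a_i(x)$ is given by

$$a_i(x) = \frac{1}{2} \left( \theta_i - |\theta| x_i \right),$$

where $\theta = (\theta_1, \ldots, \theta_K) \in \mathbb{R}_{>0}^{K}$ and $|\theta| = \sum_{i=1}^{K} \theta_i$.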

Consider a general diploid selection model in which the relative fitness of a diploid individual with one copy of allele $i$ and one copy of allele $j$ is given by $1 + 2s_{i,j}$. We measure fitness relative to that of an individual with two copies of allele $K$, thus $s_{K,K} = 0$. The diffusion generator in this case is given by $\mathcal{L} = \mathcal{L}_0 + \mathcal{L}_\sigma$, where $\mathcal{L}_0$ denotes the diffusion generator under neutrality and the additional term $\mathcal{L}_\sigma$, which captures the contribution from selection to the drift coefficient $a_i(x)$, is given by

$$\mathcal{L}_\sigma = \sum_{i=1}^{K-1} x_i \left[ \sigma_i(x) - \bar{\sigma}(x) \right] \frac{\partial}{\partial x_i},$$

where $\sigma_i(x)$ denotes the marginal fitness of type $i$ and $\bar{\sigma}(x)$ denotes the mean fitness of the population with allele frequencies $x$. More precisely,

$$\sigma_i(x) = \sum_{j=1}^{K} \sigma_{i,j} x_j \tag{2}$$

and

$$\bar{\sigma}(x) := \sum_{i,j=1}^{K} \sigma_{i,j} x_i x_j, \tag{3}$$

where $\sigma_{i,j} = 2N s_{i,j}$. Intuitively, the population frequency of a given allele tends to increase if its marginal fitness is higher than the mean fitness of the population. The selection scheme is specified by a symmetric matrix $\sigma = (\sigma_{i,j})_{1 \leq i,j \leq K} \in \mathbb{R}^{K \times K}$ of population-scaled selection coefficients, with $\sigma_{K,K} = 0$.
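As a minimal sketch of equations (2) and (3), the following Python snippet evaluates the marginal fitnesses, the mean fitness, and the selection contribution to the drift for a full frequency vector (x_1, ..., x_K). The function names are illustrative assumptions and do not come from the original work; the matrix is σ1 from Section 5.1 and the frequencies are the initial condition used for Figure 4.

```python
def marginal_fitness(sigma, x):
    """sigma_i(x) = sum_j sigma[i][j] * x[j] for every allele i, equation (2)."""
    return [sum(s_ij * x_j for s_ij, x_j in zip(row, x)) for row in sigma]


def mean_fitness(sigma, x):
    """sigma_bar(x) = sum_{i,j} sigma[i][j] * x[i] * x[j], equation (3)."""
    return sum(x_i * s_i for x_i, s_i in zip(x, marginal_fitness(sigma, x)))


def selection_drift(sigma, x):
    """Selection term x_i * (sigma_i(x) - sigma_bar(x)) for all K alleles."""
    marginal = marginal_fitness(sigma, x)
    mean = mean_fitness(sigma, x)
    return [x_i * (s_i - mean) for x_i, s_i in zip(x, marginal)]


# Symmetric selection matrix with sigma[K-1][K-1] = 0 (sigma_1 of Section 5.1).
sigma = [[12.0, 14.0, 15.0],
         [14.0, 11.0, 13.0],
         [15.0, 13.0, 0.0]]
x = [0.02, 0.02, 0.96]  # full frequency vector, including x_K = 1 - |x|
drift = selection_drift(sigma, x)
```

The drift components sum to zero, as they must for frequencies constrained to the simplex, and the first two alleles receive a positive push because their marginal fitness exceeds the mean fitness.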

The operators ℒ0 and ℒ are elliptic inside the simplex ΔK−1, but not on the boundaries. Thus, the precise domain of the generator is not straightforward to describe, but Epstein and Mazzeo (2011) give a suitable characterization.

2.2. Spectral representation of the transition density function

For $t \geq 0$, the time evolution of a diffusion $X_t$ on the simplex $\Delta_{K-1}$ is described by the transition density function $p(t; x, y)\,dy = \mathbb{P}[X_t \in dy \mid X_0 = x]$, where $x, y \in \Delta_{K-1}$. The transition density function satisfies the Kolmogorov backward equation

$$\frac{\partial}{\partial t} p(t; x, y) = \mathcal{L}\, p(t; x, y), \tag{4}$$

where $\mathcal{L}$, the generator associated with the diffusion, is a differential operator in $x$.

We briefly review the framework underlying the spectral representation of the transition density function. The operator $\mathcal{L}$ is said to be symmetric with respect to a density $\pi : \Delta_{K-1} \to \mathbb{R}_{\geq 0}$ if, for all twice continuously differentiable functions $f : \Delta_{K-1} \to \mathbb{R}$ and $g : \Delta_{K-1} \to \mathbb{R}$ that belong to the domain of the operator, the following equality holds:

$$\int_{\Delta_{K-1}} \left[ \mathcal{L} f(x) \right] g(x) \pi(x)\,dx = \int_{\Delta_{K-1}} f(x) \left[ \mathcal{L} g(x) \right] \pi(x)\,dx.$$

A straightforward calculation using integration by parts shows that the diffusion generators described in Section 2.1 are symmetric with respect to their associated stationary densities.

Theorem 1.4.4 of Epstein and Mazzeo (2011) guarantees that an unbounded symmetric operator $\mathcal{L}$ of the kind defined in Section 2.1 of this paper has countably many eigenvalues $\{-\Lambda_0, -\Lambda_1, -\Lambda_2, \ldots\}$, which are real and non-positive, satisfying

$$0 \leq \Lambda_0 \leq \Lambda_1 \leq \Lambda_2 \leq \cdots,$$

with $\Lambda_n \to \infty$ as $n \to \infty$. An eigenfunction $B_n : \Delta_{K-1} \to \mathbb{R}$ with eigenvalue $-\Lambda_n$ satisfies

$$\mathcal{L} B_n(x) = -\Lambda_n B_n(x), \tag{5}$$

and, furthermore, $B_n(x)$ is an element of the Hilbert space $L^2(\Delta_{K-1}, \pi(x))$ of functions square integrable with respect to the density $\pi(x)$, equipped with the canonical inner product $\langle \cdot, \cdot \rangle_\pi$. If $\mathcal{L}$ is symmetric with respect to $\pi(x)$, then its eigenfunctions are orthogonal with respect to $\pi(x)$:

$$\langle B_n, B_m \rangle_\pi := \int_{\Delta_{K-1}} B_n(x) B_m(x) \pi(x)\,dx = \delta_{n,m} d_n,$$

where $\delta_{n,m}$ is the Kronecker delta and $d_n$ are some constants. In the cases considered in this paper, the eigenfunctions form a basis of the Hilbert space $L^2(\Delta_{K-1}, \pi(x))$.

It follows from equation (5) that $\exp(-\Lambda_n t) B_n(x)$ is a solution to the Kolmogorov backward equation (4). By linearity of (4), the sum of two solutions is again a solution. Combining the initial condition $p(0; x, y) = \delta(x - y)$ with the fact that $\{B_n(x)\}$ form a basis yields the following spectral representation of the transition density:

$$p(t; x, y) = \sum_{n=0}^{\infty} \frac{1}{d_n} e^{-\Lambda_n t} B_n(x) B_n(y) \pi(y). \tag{6}$$

The initial condition being the Dirac delta $\delta(x - y)$ corresponds to the frequency at time zero being $x$.

2.3. Univariate Jacobi polynomials

The univariate Jacobi polynomials play an important role throughout this paper. Here we review some key facts about this particular type of classical orthogonal polynomials. An excellent treatise on univariate orthogonal polynomials can be found in Szegö (1939) and a comprehensive collection of useful formulas can be found in Abramowitz and Stegun (1965, Chapter 22).

The Jacobi polynomials $p_n^{(a,b)}(z)$, for $z \in [-1, 1]$, satisfy the differential equation

$$(1 - z^2) \frac{d^2 f(z)}{dz^2} + \left[ b - a - (a + b + 2) z \right] \frac{d f(z)}{dz} + n(n + a + b + 1) f(z) = 0. \tag{7}$$

For given $a, b > -1$, the set $\{p_n^{(a,b)}(z)\}_{n=0}^{\infty}$ forms an orthogonal system on the interval $[-1, 1]$ with respect to the weight function $(1 - z)^a (1 + z)^b$. For a more convenient correspondence with the diffusion parameters, we define the following modified Jacobi polynomials, for $x \in [0, 1]$ and $a, b > 0$:

$$R_n^{(a,b)}(x) = p_n^{(b-1,a-1)}(2x - 1).$$

This definition is slightly different from that adopted by Griffiths and Spanò (2010).

Equation (7) implies that the modified Jacobi polynomials $R_n^{(a,b)}(x)$, for $x \in [0, 1]$, satisfy the differential equation

$$x(1 - x) \frac{d^2 f(x)}{dx^2} + \left[ a - (a + b) x \right] \frac{d f(x)}{dx} + n(n + a + b - 1) f(x) = 0. \tag{8}$$

For fixed $a, b > 0$, the set $\{R_n^{(a,b)}(x)\}_{n=0}^{\infty}$ forms an orthogonal system on $[0, 1]$ with respect to the weight function $x^{a-1}(1 - x)^{b-1}$. More precisely,

$$\int_0^1 R_n^{(a,b)}(x) R_m^{(a,b)}(x)\, x^{a-1} (1 - x)^{b-1}\,dx = \delta_{n,m}\, c_n^{(a,b)},$$

where $\delta_{n,m}$ denotes the Kronecker delta and

$$c_n^{(a,b)} = \frac{\Gamma(n + a)\,\Gamma(n + b)}{(2n + a + b - 1)\,\Gamma(n + a + b - 1)\,\Gamma(n + 1)}. \tag{9}$$

Note that $\{R_n^{(a,b)}(x)\}_{n=0}^{\infty}$ form a complete basis of the Hilbert space $L^2([0, 1], x^{a-1}(1 - x)^{b-1})$.
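The orthogonality relation and the norming constant (9) can be checked numerically. The sketch below is illustrative only: it evaluates the modified Jacobi polynomial through the classical finite-sum formula for Jacobi polynomials (a standard identity, not taken from this paper) and approximates the weighted inner product with a composite Simpson rule. The function names are assumptions, and the quadrature presumes a, b ≥ 1 so that the weight stays bounded at the endpoints.

```python
from math import gamma


def gen_binom(u, k):
    """Generalized binomial coefficient C(u, k) for real u and integer k >= 0."""
    return gamma(u + 1.0) / (gamma(k + 1.0) * gamma(u - k + 1.0))


def modified_jacobi(n, a, b, x):
    """R_n^{(a,b)}(x) = p_n^{(b-1,a-1)}(2x - 1), via the finite-sum formula."""
    return sum(gen_binom(n + b - 1.0, n - s) * gen_binom(n + a - 1.0, s)
               * (x - 1.0) ** s * x ** (n - s) for s in range(n + 1))


def norm_const(n, a, b):
    """c_n^{(a,b)} from equation (9); assumes n + a + b - 1 > 0."""
    return (gamma(n + a) * gamma(n + b)
            / ((2 * n + a + b - 1) * gamma(n + a + b - 1) * gamma(n + 1)))


def inner_product(n, m, a, b, steps=2000):
    """Composite Simpson approximation of the weighted inner product on [0, 1]."""
    total = 0.0
    for k in range(steps + 1):
        x = k / steps
        w = 1.0 if k in (0, steps) else (4.0 if k % 2 else 2.0)
        total += (w * modified_jacobi(n, a, b, x) * modified_jacobi(m, a, b, x)
                  * x ** (a - 1.0) * (1.0 - x) ** (b - 1.0))
    return total / (3.0 * steps)


off_diagonal = inner_product(2, 3, 2.0, 3.0)  # should be close to zero
diagonal = inner_product(3, 3, 2.0, 3.0)      # should be close to c_3^{(2,3)}
```

For a = 2 and b = 3 the closed form (9) gives c_3 = 1/15, which the quadrature should reproduce up to discretization error.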

For $n \geq 1$, the modified Jacobi polynomial $R_n^{(a,b)}(x)$ satisfies the recurrence relation

$$x R_n^{(a,b)}(x) = \frac{(n + a - 1)(n + b - 1)}{(2n + a + b - 1)(2n + a + b - 2)} R_{n-1}^{(a,b)}(x) + \left[ \frac{1}{2} - \frac{b^2 - a^2 - 2(b - a)}{2(2n + a + b)(2n + a + b - 2)} \right] R_n^{(a,b)}(x) + \frac{(n + 1)(n + a + b - 1)}{(2n + a + b)(2n + a + b - 1)} R_{n+1}^{(a,b)}(x), \tag{10}$$

while, for $n = 0$,

$$x R_0^{(a,b)}(x) = \frac{a}{a + b} R_0^{(a,b)}(x) + \frac{1}{a + b} R_1^{(a,b)}(x). \tag{11}$$

Also, note that $R_0^{(a,b)}(x) \equiv 1$. These recurrence relations play an important role in the work of Song and Steinrücken (2012), and the multivariate analogues, discussed later in Section 3.2, are similarly important for the present work.
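Solving (10) for the highest-order term turns the recurrence into an evaluation scheme. The following sketch, with hypothetical function names, generates the values of the first few modified Jacobi polynomials at a point; the endpoint value at x = 0 used for comparison is the one quoted in Section 5.2.

```python
from math import gamma


def modified_jacobi_upto(n_max, a, b, x):
    """Return [R_0(x), ..., R_{n_max}(x)] using recurrences (11) and (10)."""
    values = [1.0]
    if n_max >= 1:
        # Equation (11) rearranged: R_1(x) = (a + b) x - a.
        values.append((a + b) * x - a)
    for n in range(1, n_max):
        s = 2 * n + a + b
        lower = (n + a - 1) * (n + b - 1) / ((s - 1) * (s - 2))
        diag = 0.5 - (b * b - a * a - 2 * (b - a)) / (2 * s * (s - 2))
        upper = (n + 1) * (n + a + b - 1) / (s * (s - 1))
        # Equation (10) rearranged for R_{n+1}(x).
        values.append(((x - diag) * values[n] - lower * values[n - 1]) / upper)
    return values


def value_at_zero(n, a):
    """Closed form R_n^{(a,b)}(0) = (-1)^n Gamma(n+a) / (Gamma(n+1) Gamma(a))."""
    return (-1) ** n * gamma(n + a) / (gamma(n + 1) * gamma(a))


at_zero = modified_jacobi_upto(6, 0.5, 1.5, 0.0)
legendre_case = modified_jacobi_upto(2, 1.0, 1.0, 0.3)[2]  # a = b = 1
```

For a = b = 1 the modified polynomials reduce to Legendre polynomials in 2x − 1, which gives a second, independent check.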

The modified Jacobi polynomials satisfy other interesting relations, one of them being the following:

$$R_n^{(a,b)}(x) = \frac{n + a + b - 1}{2n + a + b - 1} R_n^{(a,b+1)}(x) - \mathbb{1}_{\{n > 0\}} \frac{n + a - 1}{2n + a + b - 1} R_{n-1}^{(a,b+1)}(x). \tag{12}$$

Using this identity, polynomials with parameter $b$ can be related to polynomials with parameter $b + 1$. We utilize this relation later.

2.4. A review of the K = 2 case

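To motivate the approach to be employed in the general case, we briefly review the work of Song and Steinrücken (2012) for deriving the transition density function in the diallelic ($K = 2$) case. The vector of mutation rates is given by $\theta = (\alpha, \beta)$, while the symmetric matrix describing the general diploid selection scheme can be parametrized as

$$\boldsymbol{\sigma} = \begin{pmatrix} 2\sigma & 2\sigma h \\ 2\sigma h & 0 \end{pmatrix},$$

where $\sigma$ is the selection strength and $h$ the dominance parameter. For $K = 2$, the diffusion is one-dimensional and the simplex $\Delta_1$ is equal to the unit interval $[0, 1]$. With $x$ denoting $x_1$, the generator (1) reduces to

$$\mathcal{L} f(x) = \frac{1}{2} x(1 - x) \frac{\partial^2}{\partial x^2} f(x) + \left\{ \frac{1}{2} \left[ \alpha - (\alpha + \beta) x \right] + 2\sigma x(1 - x) \left[ x + h(1 - 2x) \right] \right\} \frac{\partial}{\partial x} f(x).$$

In the neutral case (i.e., $\sigma = 0$), the modified Jacobi polynomials $R_n^{(\alpha,\beta)}(x)$ are eigenfunctions of the diffusion generator with eigenvalues $-\lambda_n^{(\alpha,\beta)}$, where $\lambda_n^{(\alpha,\beta)} = \frac{1}{2} n(n - 1 + \alpha + \beta)$; this follows directly from the differential equation (8). Hence, a spectral representation of the transition density function can be readily obtained via (6).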
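For the neutral diallelic case, the representation (6) can be evaluated with a few lines of code. The sketch below is an illustration under stated assumptions rather than the authors' implementation: the series is truncated after a fixed number of terms (`n_terms` is a hypothetical parameter), the polynomials are generated with recurrences (10) and (11), and α, β ≥ 1 is assumed so that a simple Simpson rule can be used to check the result.

```python
from math import exp, gamma


def jacobi_values(n_max, a, b, x):
    """[R_0(x), ..., R_{n_max}(x)] from recurrences (11) and (10)."""
    vals = [1.0, (a + b) * x - a]
    for n in range(1, n_max):
        s = 2 * n + a + b
        lower = (n + a - 1) * (n + b - 1) / ((s - 1) * (s - 2))
        diag = 0.5 - (b * b - a * a - 2 * (b - a)) / (2 * s * (s - 2))
        upper = (n + 1) * (n + a + b - 1) / (s * (s - 1))
        vals.append(((x - diag) * vals[n] - lower * vals[n - 1]) / upper)
    return vals[: n_max + 1]


def norm_const(n, a, b):
    """c_n^{(a,b)} from equation (9); assumes a + b > 1."""
    return (gamma(n + a) * gamma(n + b)
            / ((2 * n + a + b - 1) * gamma(n + a + b - 1) * gamma(n + 1)))


def neutral_density(t, x, y, a, b, n_terms=30):
    """Truncated spectral sum (6) for K = 2 and sigma = 0."""
    rx = jacobi_values(n_terms - 1, a, b, x)
    ry = jacobi_values(n_terms - 1, a, b, y)
    total = 0.0
    for n in range(n_terms):
        lam = 0.5 * n * (n - 1 + a + b)  # lambda_n^{(a,b)}
        total += exp(-lam * t) * rx[n] * ry[n] / norm_const(n, a, b)
    return total * y ** (a - 1.0) * (1.0 - y) ** (b - 1.0)


def simpson(f, steps=1000):
    """Composite Simpson rule on [0, 1]."""
    total = f(0.0) + f(1.0)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * f(k / steps)
    return total / (3.0 * steps)


a, b, t, x0 = 2.0, 3.0, 0.5, 0.2
mass = simpson(lambda y: neutral_density(t, x0, y, a, b))
mean = simpson(lambda y: y * neutral_density(t, x0, y, a, b))
expected_mean = a / (a + b) + (x0 - a / (a + b)) * exp(-0.5 * (a + b) * t)
```

By orthogonality only the n = 0 term contributes to the total mass and only the n = 0 and n = 1 terms contribute to the mean, so the density should integrate to one and the mean should relax exponentially towards α/(α + β) at rate λ_1 = (α + β)/2.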

In the non-neutral case (i.e., $\sigma \neq 0$), consider the functions $S_n^{\theta}(x) = e^{-\bar{\sigma}(x)/2} R_n^{(\alpha,\beta)}(x)$, which form an orthogonal basis of the Hilbert space $L^2([0, 1], e^{\bar{\sigma}(x)} x^{\alpha-1}(1 - x)^{\beta-1})$, where $e^{\bar{\sigma}(x)} x^{\alpha-1}(1 - x)^{\beta-1}$ corresponds to the stationary distribution of the non-neutral diffusion, up to a multiplicative constant. Since the eigenfunctions $B_n(x)$ of the diffusion generator are elements of this Hilbert space, we can pose an expansion $B_n(x) = \sum_{m=0}^{\infty} w_{n,m} S_m^{\theta}(x)$ in terms of the basis functions $S_m^{\theta}(x)$, where $w_{n,m}$ are to be determined. Then, the eigenvalue equation $\mathcal{L} B_n(x) = -\Lambda_n B_n(x)$ implies the algebraic equation

$$\sum_{m=0}^{\infty} w_{n,m} \left[ \lambda_m^{(\alpha,\beta)} + Q(x; \alpha, \beta, \sigma, h) \right] R_m^{(\alpha,\beta)}(x) = \Lambda_n \sum_{m=0}^{\infty} w_{n,m} R_m^{(\alpha,\beta)}(x),$$

where $Q(x; \alpha, \beta, \sigma, h)$ is a polynomial in $x$ of degree four. Utilizing the recurrence relations in (10) and (11), one can then arrive at a linear system $M w_n = \Lambda_n w_n$, where $w_n = (w_{n,0}, w_{n,1}, w_{n,2}, \ldots)$ is an infinite-dimensional vector of variables and $M$ is a sparse infinite-dimensional matrix with entries that depend on the index $n$ and the parameters $\alpha, \beta, \sigma, h$ of the model. The infinite linear system $M w_n = \Lambda_n w_n$ is approximated by a finite-dimensional truncated linear system

$$M^{[D]} w_n^{[D]} = \Lambda_n^{[D]} w_n^{[D]},$$

where $w_n^{[D]} = (w_{n,0}^{[D]}, w_{n,1}^{[D]}, \ldots, w_{n,D-1}^{[D]})$ and $M^{[D]}$ is the submatrix of $M$ consisting of its first $D$ rows and $D$ columns. This finite-dimensional linear system can be easily solved using standard linear algebra to obtain the eigenvalues $\Lambda_n^{[D]}$ and the eigenvectors $w_n^{[D]}$ of $M^{[D]}$. Song and Steinrücken observed that $\Lambda_n^{[D]}$ and $w_{n,m}^{[D]}$ converge very rapidly as the truncation level $D$ increases. Finally, the coefficients $w_{n,m}^{[D]}$ can be used to approximate the eigenfunctions $B_n(x)$, and, together with the eigenvalues $\Lambda_n^{[D]}$, an efficient approximation of the transition density function can be obtained via (6).
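The truncation procedure can be sketched for K = 2 with NumPy. This is an illustrative reconstruction and not the authors' Mathematica code: the polynomial Q is assembled from the general expression given in the proof of Theorem 6, specialized to two alleles, multiplication by x is represented by a tridiagonal matrix built from (10) and (11), and the row-vector convention of Section 4 is used. All function names, and the padding by four extra rows and columns before truncating, are assumptions of this sketch.

```python
import numpy as np
from numpy.polynomial import Polynomial


def multiplication_matrix(size, a, b):
    """G[n, m] = coefficient of R_m in x * R_n, from recurrences (10) and (11)."""
    G = np.zeros((size, size))
    G[0, 0] = a / (a + b)
    if size > 1:
        G[0, 1] = 1.0 / (a + b)
    for n in range(1, size):
        s = 2 * n + a + b
        G[n, n - 1] = (n + a - 1) * (n + b - 1) / ((s - 1) * (s - 2))
        G[n, n] = 0.5 - (b * b - a * a - 2 * (b - a)) / (2 * s * (s - 2))
        if n + 1 < size:
            G[n, n + 1] = (n + 1) * (n + a + b - 1) / (s * (s - 1))
    return G


def q_polynomial(alpha, beta, sigma, h):
    """Degree-four polynomial Q for K = 2, with frequencies (x, 1 - x)."""
    x = Polynomial([0.0, 1.0])
    sig = [[2.0 * sigma, 2.0 * sigma * h], [2.0 * sigma * h, 0.0]]
    freq, theta = [x, 1 - x], [alpha, beta]
    marg = [sig[i][0] * x + sig[i][1] * (1 - x) for i in range(2)]
    mean = freq[0] * marg[0] + freq[1] * marg[1]
    terms = sum(freq[i] * marg[i] ** 2 + theta[i] * marg[i] + sig[i][i] * freq[i]
                for i in range(2))
    return 0.5 * (terms - (1 + alpha + beta) * mean - mean ** 2)


def truncated_matrix(D, alpha, beta, sigma, h):
    """Leading D x D block of M = diag(lambda_m) + Q(G)."""
    size = D + 4  # padding keeps powers of G exact on the leading block
    G = multiplication_matrix(size, alpha, beta)
    M = np.diag([0.5 * m * (m - 1 + alpha + beta) for m in range(size)])
    power = np.eye(size)
    for coeff in q_polynomial(alpha, beta, sigma, h).coef:
        M = M + coeff * power
        power = power @ G
    return M[:D, :D]


def truncated_spectrum(D, alpha, beta, sigma, h):
    """Sorted approximations Lambda_n^{[D]} (left and right spectra coincide)."""
    return np.sort(np.linalg.eigvals(truncated_matrix(D, alpha, beta, sigma, h)).real)


neutral = truncated_spectrum(10, 1.0, 1.0, 0.0, 0.5)
coarse = truncated_spectrum(25, 1.0, 1.0, 1.0, 0.5)
fine = truncated_spectrum(30, 1.0, 1.0, 1.0, 0.5)
```

Without selection the matrix is diagonal and the neutral eigenvalues are recovered exactly; with selection, the smallest eigenvalue should stay at zero and the leading eigenvalues should agree between the two truncation levels.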

3. The Neutral Case with an Arbitrary Number of Alleles

In this section, we describe the spectral representation of the transition density of a neutral PIM model with an arbitrary number K of alleles. As in the case of K = 2, reviewed in Section 2.4, for an arbitrary K the eigenfunctions in the neutral case can be used to construct the eigenfunctions in the case with selection. The latter case is considered in Section 4.

3.1. Multivariate Jacobi polynomials

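In what follows, let $\mathbb{N}_0 = \{0, 1, 2, \ldots\}$ denote the set of non-negative integers. As in Griffiths and Spanò (2011), we define the following system of multivariate orthogonal polynomials in $K - 1$ variables:

Definition 1. For each vector $n = (n_1, \ldots, n_{K-1}) \in \mathbb{N}_0^{K-1}$ and $\theta = (\theta_1, \ldots, \theta_K) \in \mathbb{R}_{\geq 0}^{K}$, the orthogonal polynomial $P_n^{\theta}(x)$ is defined as

$$P_n^{\theta}(x) = \prod_{j=1}^{K-1} \left[ \left( 1 - \frac{x_j}{1 - \sum_{i=1}^{j-1} x_i} \right)^{N_j} R_{n_j}^{(\theta_j, \Theta_j + 2N_j)} \left( \frac{x_j}{1 - \sum_{i=1}^{j-1} x_i} \right) \right],$$

where $N_j = \sum_{i=j+1}^{K-1} n_i$ and $\Theta_j = \sum_{i=j+1}^{K} \theta_i$.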
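Definition 1 translates almost literally into code. The sketch below uses hypothetical function names and evaluates the univariate factor with the classical finite-sum formula for Jacobi polynomials (a standard identity, not taken from this paper); it assumes strictly positive mutation parameters and a point in the interior of the simplex, so that no denominator vanishes.

```python
from math import gamma


def gen_binom(u, k):
    """Generalized binomial coefficient C(u, k) for real u and integer k >= 0."""
    return gamma(u + 1.0) / (gamma(k + 1.0) * gamma(u - k + 1.0))


def modified_jacobi(n, a, b, x):
    """R_n^{(a,b)}(x) = p_n^{(b-1,a-1)}(2x - 1), via the finite-sum formula."""
    return sum(gen_binom(n + b - 1.0, n - s) * gen_binom(n + a - 1.0, s)
               * (x - 1.0) ** s * x ** (n - s) for s in range(n + 1))


def multivariate_jacobi(n, theta, x):
    """P_n^theta(x) for n and x of length K - 1 and theta of length K."""
    K = len(theta)
    value = 1.0
    used = 0.0  # running sum x_1 + ... + x_{j-1}
    for j in range(K - 1):
        N_j = sum(n[j + 1:])          # N_j = n_{j+1} + ... + n_{K-1}
        Theta_j = sum(theta[j + 1:])  # Theta_j = theta_{j+1} + ... + theta_K
        xi = x[j] / (1.0 - used)
        value *= (1.0 - xi) ** N_j * modified_jacobi(n[j], theta[j], Theta_j + 2 * N_j, xi)
        used += x[j]
    return value


def value_at_origin(n, theta):
    """Product of the univariate endpoint values R_{n_j}(0), as used in Section 5.2."""
    out = 1.0
    for n_j, th_j in zip(n, theta):
        out *= (-1) ** n_j * gamma(n_j + th_j) / (gamma(n_j + 1) * gamma(th_j))
    return out


theta = (0.5, 1.0, 1.5)
p_const = multivariate_jacobi((0, 0), theta, (0.2, 0.3))
p_origin = multivariate_jacobi((2, 1), theta, (0.0, 0.0))
p_k2 = multivariate_jacobi((3,), (0.5, 1.5), (0.3,))
```

For K = 2 the definition collapses to the univariate modified Jacobi polynomial, and the zero index vector gives the constant polynomial 1.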

For $x = (x_1, \ldots, x_{K-1}) \in \Delta_{K-1}$, let $\Pi_0(x)$ denote an unnormalized density of the Dirichlet distribution with parameter $\theta = (\theta_1, \ldots, \theta_K)$:

$$\Pi_0(x) = \prod_{i=1}^{K} x_i^{\theta_i - 1}, \tag{13}$$

where $x_K = 1 - |x|$. The following lemma, a proof of which is provided in Appendix A, states that the above multivariate Jacobi polynomials $P_n^{\theta}(x)$ are orthogonal with respect to $\Pi_0(x)$:

Lemma 2. For $n, m \in \mathbb{N}_0^{K-1}$,

$$\int_{\Delta_{K-1}} P_n^{\theta}(x) P_m^{\theta}(x) \Pi_0(x)\,dx = \delta_{n,m} C_n^{\theta},$$

where $\delta_{n,m} = \prod_{i=1}^{K-1} \delta_{n_i, m_i}$ and

$$C_n^{\theta} := \prod_{i=1}^{K-1} c_{n_i}^{(\theta_i, \Theta_i + 2N_i)}, \tag{14}$$

with $c_n^{(a,b)}$ defined in (9).

Remark: The multivariate Jacobi polynomials form a complete basis of $L^2(\Delta_{K-1}, \Pi_0(x))$, the Hilbert space of functions on $\Delta_{K-1}$ square integrable with respect to the unnormalized Dirichlet density $\Pi_0(x)$.

3.2. Recurrence relation for multivariate Jacobi polynomials

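Recall that the univariate Jacobi polynomials satisfy the recurrence relation (10). Theorem 3.2.1 of Dunkl and Xu (2001) guarantees that the multivariate Jacobi polynomials satisfy a similar recurrence relation. More precisely, we have the following lemma, the proof of which is provided in Appendix A:

Lemma 3. Given $n = (n_1, \ldots, n_{K-1}) \in \mathbb{N}_0^{K-1}$ and $m = (m_1, \ldots, m_{K-1}) \in \mathbb{N}_0^{K-1}$, define $N_j = \sum_{i=j+1}^{K-1} n_i$ and $M_j = \sum_{i=j+1}^{K-1} m_i$. For given $i \in \{1, \ldots, K-1\}$ and $n$, $P_n^{\theta}(x)$ satisfies the recurrence relation

$$x_i P_n^{\theta}(x) = \sum_{m \in \mathcal{M}_i(n)} r_{n,m}^{(\theta,i)} P_m^{\theta}(x), \tag{15}$$

where $r_{n,m}^{(\theta,i)}$ are known constants (provided in Appendix A) and

$$\mathcal{M}_i(n) := \left\{ m \in \mathbb{N}_0^{K-1} : M_j = N_j \text{ for all } j > i \text{ and } |M_j - N_j| \leq 1 \text{ for all } j \leq i \right\}. \tag{16}$$

Impose an ordering on the $(K-1)$-dimensional index vectors $n \in \mathbb{N}_0^{K-1}$. Then, the recurrence (15) can be represented as

$$x_i P_n^{\theta}(x) = \sum_{m \in \mathcal{M}_i(n)} \left[ \mathcal{G}_i^{\theta} \right]_{n,m} P_m^{\theta}(x),$$

where $\mathcal{G}_i^{\theta}$ corresponds to an infinite-dimensional matrix in which columns and rows are indexed by the ordered $(K-1)$-tuples, and the $(n, m)$-th entry is defined as

$$\left[ \mathcal{G}_i^{\theta} \right]_{n,m} = \begin{cases} r_{n,m}^{(\theta,i)}, & \text{if } m \in \mathcal{M}_i(n), \\ 0, & \text{otherwise.} \end{cases} \tag{17}$$

Note that every row of $\mathcal{G}_i^{\theta}$ has only finitely many non-zero entries. One can deduce the following corollary from the new representation: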

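Corollary 4. Let $a = (a_n)_{n \in \mathbb{N}_0^{K-1}}$ be such that $\sum_{n \in \mathbb{N}_0^{K-1}} a_n^2 C_n^{\theta} < \infty$. Then

$$x_i \cdot \sum_{n \in \mathbb{N}_0^{K-1}} a_n P_n^{\theta}(x) = \sum_{n \in \mathbb{N}_0^{K-1}} b_n P_n^{\theta}(x),$$

where $(b_n)_{n \in \mathbb{N}_0^{K-1}} = a \cdot \mathcal{G}_i^{\theta}$.

Remark: Since under multiplication $x_i$ commutes with $x_j$ for $1 \leq i, j \leq K - 1$, the corresponding matrices $\mathcal{G}_i^{\theta}$ and $\mathcal{G}_j^{\theta}$ also commute.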

3.3. Eigenfunctions of the neutral generator ℒ0

It is well known that the stationary distribution of the Wright-Fisher diffusion under a neutral PIM model is the Dirichlet distribution (Wright, 1949). The density of the Dirichlet distribution is a weight function with respect to which the associated diffusion generator ℒ0 is symmetric. As discussed in Section 3.1, the multivariate Jacobi polynomials Pnθ(x) are orthogonal with respect to the weight function Π0(x), which is equal to the density of the Dirichlet distribution up to a multiplicative constant. Given the discussion in Section 2.2, one might then suspect that Pnθ(x) are potential eigenfunctions of ℒ0. The following lemma establishes that this is indeed the case:

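Lemma 5. For all $n \in \mathbb{N}_0^{K-1}$, the multivariate Jacobi polynomials $P_n^{\theta}(x)$ satisfy

$$\mathcal{L}_0 P_n^{\theta}(x) = -\lambda_{|n|}^{\theta} P_n^{\theta}(x),$$

where

$$\lambda_{|n|}^{\theta} = \frac{1}{2} |n| \left( |n| - 1 + |\theta| \right). \tag{18}$$

That is, $P_n^{\theta}(x)$ are eigenfunctions of $\mathcal{L}_0$ with eigenvalues $-\lambda_{|n|}^{\theta}$.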

A proof of this lemma is deferred to Appendix A. We conclude this section with a few comments.

Remarks:

  1. Substituting the eigenvalues and eigenfunctions into the spectral representation (6), we obtain
    $$p(t; x, y) = \sum_{n \in \mathbb{N}_0^{K-1}} \frac{1}{C_n^{\theta}} e^{-\lambda_{|n|}^{\theta} t} P_n^{\theta}(x) P_n^{\theta}(y) \Pi_0(y). \tag{19}$$
  2. For every $n \in \mathbb{N}_0^{K-1}$, note that $\lambda_{|n|}^{\theta}$ depends only on the norm $|n|$, which implies degeneracy in the spectrum of $\mathcal{L}_0$. Griffiths (1979) constructed orthogonal kernel polynomials indexed by $|n|$, that is, the sum over all orthogonal polynomials with index summing to $|n|$, and obtained the transition density expansion (19).
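The degeneracy noted in the second remark can be made concrete. The following sketch lists the neutral values λ with their multiplicities, using formula (18) and the count of index vectors with a given norm stated in Section 5.1; the function names are illustrative, and K ≥ 2 with |θ| > 0 is assumed so that the levels are strictly increasing.

```python
from math import comb


def neutral_eigenvalue(level, theta_total):
    """lambda for index vectors of norm `level`, equation (18)."""
    return 0.5 * level * (level - 1 + theta_total)


def multiplicity(level, K):
    """Number of index vectors in N_0^{K-1} whose entries sum to `level`."""
    return comb(level + K - 2, K - 2)


def neutral_spectrum(count, K, theta_total):
    """First `count` values of lambda, repeated according to multiplicity."""
    out = []
    level = 0
    while len(out) < count:
        out.extend([neutral_eigenvalue(level, theta_total)] * multiplicity(level, K))
        level += 1
    return out[:count]


# K = 3 and |theta| = 0.01 + 0.02 + 0.03, the low-mutation setting of Section 5.
spectrum = neutral_spectrum(36, 3, 0.06)
```

For K = 3 the multiplicities are 1, 2, 3, ..., so the first 36 values fall into exactly eight groups, which is the clustering visible under neutrality in Figure 3.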

4. A General Diploid Selection Case with an Arbitrary Number of Alleles

In this section, we derive the spectral representation of the transition density function of the Wright-Fisher diffusion under a K-allelic PIM model with general diploid selection. This work extends the work of Song and Steinrücken (2012), the special case of K = 2 briefly summarized in Section 2.4, to an arbitrary number K of alleles. The recurrence relation presented in Lemma 3 plays a crucial role in the following derivation.

Recall that the backward generator for the full model is $\mathcal{L} = \mathcal{L}_0 + \mathcal{L}_\sigma$, where $\mathcal{L}_0$ corresponds to the generator under neutrality and $\mathcal{L}_\sigma$ corresponds to the contribution from selection. The diffusion has a unique stationary density [see Ethier and Kurtz (1994) or Barbour et al. (2000)] proportional to

$$\Pi(x) := e^{\bar{\sigma}(x)} \Pi_0(x), \tag{20}$$

where $\Pi_0(x)$ is defined in (13) and $\bar{\sigma}(x)$ is the mean fitness defined in (3). As mentioned in Section 2.2, $\mathcal{L}$ is symmetric with respect to $\Pi(x)$. For $n \in \mathbb{N}_0$, we aim to find the eigenvalues $-\Lambda_n$ and the eigenfunctions $B_n$ of $\mathcal{L}$ such that

$$\mathcal{L} B_n(x) = -\Lambda_n B_n(x). \tag{21}$$

By convention, we place $\Lambda_n$ in non-decreasing order. The symmetry of $\mathcal{L}$ implies that $\{B_n(x)\}$ form an orthogonal system with respect to $\Pi(x)$, that is

$$\int_{\Delta_{K-1}} B_n(x) B_m(x) \Pi(x)\,dx \propto \delta_{n,m}.$$

Such a system of orthogonal functions, however, is not unique. The orthogonality of $\{P_n^{\theta}\}$ with respect to $\Pi_0$, established in Lemma 2, can be used to show that the functions

$$S_n^{\theta}(x) := P_n^{\theta}(x)\, e^{-\bar{\sigma}(x)/2} \tag{22}$$

are orthogonal with respect to $\Pi$, as are $B_n(x)$. Furthermore, the fact that $\{P_n^{\theta}(x)\}$ form a complete basis of $L^2(\Delta_{K-1}, \Pi_0(x))$ means that $\{S_n^{\theta}(x)\}$ is a complete basis of $L^2(\Delta_{K-1}, \Pi(x))$. Since $B_n \in L^2(\Delta_{K-1}, \Pi(x))$, we thus seek to represent $B_n(x)$ as a linear combination of the basis functions $S_m^{\theta}(x)$:

$$B_n(x) = \sum_{m \in \mathbb{N}_0^{K-1}} u_{n,m} S_m^{\theta}(x), \tag{23}$$

where $u_{n,m}$ are some constants to be determined.

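Define an index set $\mathcal{I} = \bigcup_{L=0}^{4} \{1, \ldots, K-1\}^{L}$ and for $\mathbf{i} = (i_1, \ldots, i_L) \in \mathcal{I}$, define $x_{\mathbf{i}} = x_{i_1} \cdots x_{i_L}$. We have the following theorem for solving the eigensystem associated with the full generator $\mathcal{L}$:

Theorem 6. For all $n \in \mathbb{N}_0$, the eigenfunction $B_n(x)$ of $\mathcal{L}$ can be represented by (23). The corresponding eigenvalues $-\Lambda_n$ and the coefficients $u_{n,m}$ can be found by solving the infinite-dimensional eigensystem

$$u_n M = u_n \Lambda_n, \tag{24}$$

where $u_n = (u_{n,m})_{m \in \mathbb{N}_0^{K-1}}$ and

$$M = \operatorname{diag}\left( \{\lambda_{|m|}^{\theta}\}_{m \in \mathbb{N}_0^{K-1}} \right) + \sum_{\mathbf{i} \in \mathcal{I}} q(\mathbf{i})\, \mathcal{G}_{\mathbf{i}}^{\theta}.$$

Here, $\lambda_{|m|}^{\theta}$ is defined as in (18) and, for $\mathbf{i} = (i_1, \ldots, i_L) \in \mathcal{I}$, we define $\mathcal{G}_{\mathbf{i}}^{\theta} = \mathcal{G}_{i_1}^{\theta} \cdots \mathcal{G}_{i_L}^{\theta}$, where $\mathcal{G}_i^{\theta}$ is given in (17). When $L = 0$, $\mathcal{G}_{\mathbf{i}}^{\theta}$ is defined to be the identity matrix. Explicit expressions of the constants $q(\mathbf{i})$ are provided in Appendix C.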

Remarks:

  1. Although M is infinite dimensional, it is in fact sparse with only finitely many non-zero entries in every row and column.

  2. Equation (24) implies that Λn and un are in fact the left eigenvalues and eigenvectors of M. To solve the eigensystem in practice requires some truncation of the matrix to finite dimensions, as in the K = 2 case described in Section 2.4. For a given n ∈ ℕ0, we would like both Λn and un,m to converge as the truncation level increases. In Section 5, we demonstrate that this is indeed the case using empirical examples.

We now provide a proof of the theorem.

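Proof of Theorem 6. Substituting (23) into (21) we obtain

$$\sum_{k \in \mathbb{N}_0^{K-1}} u_{n,k}\, \mathcal{L} S_k^{\theta}(x) = -\sum_{k \in \mathbb{N}_0^{K-1}} \Lambda_n u_{n,k} S_k^{\theta}(x).$$

It is shown in Appendix D that

$$\mathcal{L} S_k^{\theta}(x) = -e^{-\bar{\sigma}(x)/2} \left[ \lambda_{|k|}^{\theta} P_k^{\theta}(x) + Q(x; \sigma, \theta) P_k^{\theta}(x) \right], \tag{25}$$

where

$$Q(x; \sigma, \theta) = \frac{1}{2} \left[ \sum_{i=1}^{K} x_i \sigma_i^2(x) + \sum_{i=1}^{K} \theta_i \sigma_i(x) + \sum_{i=1}^{K} x_i \sigma_{i,i} - (1 + |\theta|)\, \bar{\sigma}(x) - \bar{\sigma}(x)^2 \right],$$

with $\sigma_i(x)$ and $\bar{\sigma}(x)$ defined as in (2) and (3), respectively. Thus, one arrives at the following equation:

$$\sum_{k \in \mathbb{N}_0^{K-1}} \Lambda_n u_{n,k} P_k^{\theta}(x) = \sum_{k \in \mathbb{N}_0^{K-1}} u_{n,k} \left[ \lambda_{|k|}^{\theta} P_k^{\theta}(x) + Q(x; \sigma, \theta) P_k^{\theta}(x) \right]. \tag{26}$$

We solve the equation by first representing $Q(x; \sigma, \theta) P_k^{\theta}(x)$ as a finite linear combination of $\{P_k^{\theta}(x)\}_{k \in \mathbb{N}_0^{K-1}}$. Observe that $Q$ is in fact a fourth-order polynomial in $x$. Collecting terms, $Q$ can be written in the form

$$Q(x; \sigma, \theta) = \sum_{\mathbf{i} \in \mathcal{I}} q(\mathbf{i})\, x_{\mathbf{i}}, \tag{27}$$

for the constants $q(\mathbf{i})$ given in Appendix C. Applying Corollary 4 recursively, we obtain

$$Q(x; \sigma, \theta) P_k^{\theta}(x) = \sum_{\mathbf{i} \in \mathcal{I}} q(\mathbf{i}) \sum_{l \in \mathbb{N}_0^{K-1}} \left[ \mathcal{G}_{\mathbf{i}}^{\theta} \right]_{k,l} P_l^{\theta}(x).$$

Finally, substituting this equation into (26), multiplying both sides of (26) by $P_m^{\theta}(x)$, and integrating with respect to $\Pi_0(x)$ over the simplex $\Delta_{K-1}$ yields the matrix equation (24).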

5. Empirical Results and Applications

In this section, we study the convergence behavior of the eigenvalues and eigenvectors as we approximate the solutions of (24). Further, we show how the spectral representation can be employed to obtain the transient and stationary density explicitly (especially the normalizing constant), and to characterize the convergence rate of the diffusion to stationarity. A Mathematica implementation of the relevant formulas for computing the spectral representation is available from the authors upon request.

5.1. Convergence of the eigenvalues and eigenvectors

In what follows we order the Jacobi polynomials according to the graded lexicographic ordering of their corresponding indices. Thus $P_{n_1}^{\theta} < P_{n_2}^{\theta}$ if

  • |n1| < |n2|, or

  • |n1| = |n2| and n1 is lexicographically smaller than n2.

Fix $K$ and note that, for a given truncation level $D \in \mathbb{N}_0$ and $l \in \mathbb{N}_0$, there are $\binom{l + K - 2}{K - 2}$ polynomials $P_n^{\theta}$ with $|n| = l$, and $\mathcal{U}(D) := \binom{D + K - 1}{K - 1}$ polynomials with index $|n| \leq D$. For the computations in the rest of this section we chose $K = 3$, unless otherwise stated.
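The graded lexicographic ordering and the two counts above are easy to reproduce. The sketch below, with illustrative function names, enumerates the index vectors level by level; it assumes K ≥ 2.

```python
from math import comb


def indices_with_sum(total, length):
    """All tuples in N_0^length with entries summing to `total`, in lexicographic order."""
    if length == 1:
        return [(total,)]
    out = []
    for first in range(total + 1):
        out.extend((first,) + rest
                   for rest in indices_with_sum(total - first, length - 1))
    return out


def graded_lex_indices(D, K):
    """Index vectors n in N_0^{K-1} with |n| <= D in graded lexicographic order."""
    return [n for level in range(D + 1) for n in indices_with_sum(level, K - 1)]


small = graded_lex_indices(2, 3)
basis_size = len(graded_lex_indices(40, 3))  # U(40) for K = 3
```

The position of an index vector in this list can serve as its row and column number in the truncated matrix.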

Now, one can obtain a finite-dimensional linear system approximating (24) by truncation, that is, taking only those entries in $M$ and $u_n$ whose associated index vectors satisfy $|n| \leq D$. More explicitly, with $M^{[D]} = ([M]_{k,l}) \in \mathbb{R}^{\mathcal{U}(D) \times \mathcal{U}(D)}$ and $u_n^{[D]} = (u_{n,k}) \in \mathbb{R}^{\mathcal{U}(D)}$, where $k, l \in \mathbb{N}_0^{K-1}$ such that $|k|, |l| \leq D$, the solutions of

$$u_n^{[D]} M^{[D]} = u_n^{[D]} \Lambda_n^{[D]}$$

should approximate the solutions $\Lambda_n$ and $u_n$ of the infinite system. The convergence patterns of $\Lambda_n^{[D]}$ and $u_{n,k}^{[D]}$ as $D$ increases are exemplified in Figure 1 for the parameters

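  • i) $K = 3$, $\theta = (0.01, 0.02, 0.03)$, $\sigma = \sigma_1 := \begin{pmatrix} 12 & 14 & 15 \\ 14 & 11 & 13 \\ 15 & 13 & 0 \end{pmatrix}$; and

  • ii) $K = 3$, $\theta = (0.01, 0.02, 0.03)$, $\sigma = \sigma_2 := \begin{pmatrix} 120 & 140 & 150 \\ 140 & 110 & 130 \\ 150 & 130 & 0 \end{pmatrix}$.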

Figure 2 displays the convergence behavior for the parameters

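  • iii) $K = 3$, $\theta = (10, 20, 30)$, $\sigma = \sigma_1$; and

  • iv) $K = 4$, $\theta = (0.01, 0.02, 0.03, 0.04)$, $\sigma_3 = \begin{pmatrix} 12 & 14 & 15 & 16 \\ 14 & 11 & 10 & 13 \\ 15 & 10 & 9 & 14 \\ 16 & 13 & 14 & 0 \end{pmatrix}$.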

Figure 1.

Convergence of the truncated eigenvalues $\Lambda_n$ and coefficients of the eigenvectors $u_n$ as the truncation level $D$ increases, for $K = 3$ with low mutation rates. Subfigures (a) and (b) show $\Lambda_0^{[D]}$, $\Lambda_{75}^{[D]}$, and $\Lambda_{150}^{[D]}$ for $\sigma = \sigma_1$ and $\sigma = \sigma_2$, respectively. Subfigures (c) and (d) show $u_{75,(8,2)}^{[D]}$, $u_{75,(7,3)}^{[D]}$, and $u_{75,(6,4)}^{[D]}$ for $\sigma = \sigma_1$ and $\sigma = \sigma_2$, respectively. The mutation rates were set to $\theta = (0.01, 0.02, 0.03)$ for all computations.

Figure 2.

Convergence of the truncated eigenvalues $\Lambda_n$ and coefficients of the eigenvectors $u_n$ as the truncation level $D$ increases, for $K = 3$ with high mutation rates and for $K = 4$ with low mutation rates. Subfigures (a) and (c) show $\Lambda_n^{[D]}$ for $n = 0, 75, 150$, and $u_{75,m}^{[D]}$ for $m = (8, 2), (7, 3), (6, 4)$, respectively, for mutation rates $\theta = (10, 20, 30)$ and selection coefficients $\sigma = \sigma_1$. The convergence behavior for $K = 4$ is shown in subfigures (b) and (d) for $\Lambda_n^{[D]}$ with $n = 0, 75, 150$, and $u_{75,m}^{[D]}$ with $m = (3, 2, 4), (5, 3, 4), (3, 5, 2)$, respectively; the mutation rates were set to $\theta = (0.01, 0.02, 0.03, 0.04)$ and the selection coefficients to $\sigma = \sigma_3$.

In all cases, $\Lambda_n^{[D]}$ and $u_{n,k}^{[D]}$ converge with increasing truncation level to empirical limits. The eigenvalues $\Lambda_n^{[D]}$ decrease towards the empirical limit, whereas the coefficients $u_{n,k}^{[D]}$ show oscillatory behavior before ultimately stabilizing. The rate of convergence is faster for smaller selection intensity. Varying the mutation parameters does not influence convergence behavior significantly. As expected, $\Lambda_0^{[D]}$ converges rapidly to zero in all cases, consistent with the fact that the diffusion has a stationary distribution. For a fixed $n$, $\Lambda_n^{[D]}$ and its associated coefficients $u_{n,k}^{[D]}$ converge at roughly similar truncation levels.

Figure 3 shows $\Lambda_n^{[D]}$ for $D = 24$ and $0 \leq n \leq 35$ under neutrality ($\sigma = 0$) and selection ($\sigma = \sigma_1$ and $\sigma = \frac{1}{4}\sigma_2$). Upon inspection, all of the eigenvalues displayed have converged properly. Under neutrality, the eigenvalues are functions of $|n|$, thus they are degenerate and cluster into groups. In the presence of selection, however, we empirically observe that all of the eigenvalues are distinct. For moderate selection intensity, the group structure is less prominent. In general, increasing the selection parameters evens out the group structure and shifts the entire spectrum upward.

Figure 3.

The first 36 eigenvalues of the different spectra for the selection parameters $\sigma = 0$, $\sigma_1$ and $\frac{1}{4}\sigma_2$, respectively. The latter was chosen so that the ranges of the eigenvalues are comparable. The truncation level $D$ was set to 24 and mutation rates $\theta = (0.01, 0.02, 0.03)$ were used.

Computing the transition density function for large selection coefficients requires combining terms of substantially different orders of magnitude, because of the exponential weighting factors in the density (20) and in the expansion (22). Therefore, the coefficients un,k[D] have to be calculated with high precision to obtain accurate numerical results under strong selection.

5.2. Transient and stationary densities

The approximations to the eigenvalues $\Lambda_n^{[D]}$ and the eigenfunctions $B_n$ (via the eigenvectors $u_n^{[D]}$ and equation (23)) can be used in the spectral representation (6) to approximate the transition density function at arbitrary times $t$. Examples with $\sigma = \sigma_1$ for different times are given in Figure 4. At first, the density is concentrated around the initial frequencies $x = (0.02, 0.02, 0.96)$, but as time increases, the frequencies of the first and second alleles increase, since these have a higher relative fitness. Eventually, the transition density converges to the stationary distribution (similar to the distribution at $t = 2$), where the bulk of the mass is concentrated at high frequencies for the first and second alleles.

Figure 4.

Approximation of the transition density function (6) for different times $t \in \{0.04, 0.2, 1.0, 2.0\}$. Selection was governed by the matrix of coefficients $\sigma = \sigma_1$ and $x = (0.02, 0.02, 0.96)$ was used as initial condition. The truncation level was set to $D = 40$, whereas the summation in equation (23) ranged over all $m$ such that $0 \leq |m| \leq 36$, and all eigenfunctions and eigenvalues with $0 \leq n \leq 561$ were included in equation (6). The mutation rates were set to $\theta = (0.01, 0.02, 0.03)$. The plots only vary in $y_1$ and $y_2$, since $y_3 = 1 - y_1 - y_2$. (a) $t = 0.04$. (b) $t = 0.2$. (c) $t = 1.0$. (d) $t = 2.0$.

The eigenvalues Λn[D] and coefficients un,k[D] can also be employed to approximate the constant that normalizes the stationary distribution Π(x) to a proper probability distribution. Following the same line of argument as Song and Steinrücken (2012), the orthogonal relations enable us to circumvent the difficulty involved in directly evaluating a multivariate integral over the simplex ΔK−1. First, note that since ℒ maps constant functions to zero, any constant function is an eigenfunction with associated eigenvalue Λ0 = 0, thus B0(x) = B0(y) = const. In (6), taking t → ∞, we get

\[
\lim_{t\to\infty} p(t;x,y) = \Pi(y)\,\frac{B_0(x)B_0(y)}{\langle B_0,B_0\rangle_{\Pi}} = \frac{1}{C_\Pi}\,\Pi(y).
\]

Then for x = y = 0, by (23) we have

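\[
C_\Pi = \int_{\Delta_{K-1}} \Pi(z)\,dz
= \frac{\langle B_0,B_0\rangle_{\Pi}}{B_0(0)^2}
= \frac{\sum_{m\in\mathbb{N}_0^{K-1}} u_{0,m}^2\,\langle P_m^{\theta},P_m^{\theta}\rangle_{\Pi_0}}{e^{-\bar\sigma(0)}\left(\sum_{m\in\mathbb{N}_0^{K-1}} u_{0,m}\,P_m^{\theta}(0)\right)^2}
= \frac{\sum_{m\in\mathbb{N}_0^{K-1}} u_{0,m}^2\,C_m^{\theta}}{\left(\sum_{m\in\mathbb{N}_0^{K-1}} u_{0,m}\prod_{j=1}^{K-1}(-1)^{m_j}\frac{\Gamma(m_j+\theta_j)}{\Gamma(m_j+1)\,\Gamma(\theta_j)}\right)^2}, \tag{28}
\]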

since \(\bar\sigma(0) = \sigma_{K,K} = 0\) and \(R_{n_j}^{(\theta_j,\Theta_j+2N_j)}(0) = (-1)^{n_j}\frac{\Gamma(n_j+\theta_j)}{\Gamma(n_j+1)\,\Gamma(\theta_j)}\). Here \(C_m^{\theta}\) is the constant defined in (14). The purely algebraic form of the right-hand side of equation (28) allows us to compute an accurate approximation of the normalizing constant \(C_\Pi\) by replacing the infinite sums with sums over all indices m such that |m| is less than or equal to a given truncation level. This offers an attractive alternative to other computationally intensive methods (Donnelly et al., 2001; Genz and Joyce, 2003; Buzbas and Joyce, 2009).
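As a sanity check of (28) in the simplest setting, assume σ = 0. The generator then reduces to the neutral one, the eigenfunction B0 is the constant polynomial, the coefficient u0,m vanishes unless m = 0, and (28) collapses to the product of the univariate constants c0 over j = 1, …, K − 1. Assuming further that each c0 with parameters (a, b) is the Beta integral B(a, b), which follows from R0 = 1 and the weight used in the proof of Lemma 2, the product should equal the usual Dirichlet normalizing constant. The sketch below checks this numerically; the function names are hypothetical.

```python
import math


def beta(a, b):
    """Beta function B(a, b) written with Gamma functions."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)


def neutral_norm_const(theta):
    """Product over j = 1..K-1 of B(theta_j, Theta_j), with Theta_j = sum_{i>j} theta_i."""
    const = 1.0
    for j in range(len(theta) - 1):
        tail = sum(theta[j + 1:])
        const *= beta(theta[j], tail)
    return const


def dirichlet_const(theta):
    """Normalizing constant of the Dirichlet(theta) density."""
    return math.prod(math.gamma(t) for t in theta) / math.gamma(sum(theta))


theta = (0.01, 0.02, 0.03)
print(neutral_norm_const(theta), dirichlet_const(theta))
```

With selection the coefficients u0,m are no longer concentrated on m = 0, and both sums in (28) have to be truncated as described above.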

Figure 5 shows two examples of stationary distributions for different selection coefficients. In Figure 5(a) the stationary density is concentrated in the interior of the simplex, since all homozygotes are less fit than the heterozygotes. This situation is referred to as heterozygote advantage, resulting in a balancing selection pattern, and the different alleles co-exist at stationarity. In Figure 5(b), allele number 1 is strongly favored by the given selection coefficients, and thus the stationary density is concentrated at high frequencies for this allele.

Figure 5.

Two examples of the stationary distribution for different selection parameters. The mutation rates were set to θ = (0.01, 0.02, 0.03) in both cases. Again, a truncation level of D = 40 was used, and the summation in equation (28) ranged over all m such that 0 ≤ |m| ≤ 36. The plots only vary in y1 and y2, since y3 = 1 − y1 − y2.

We can also use (6) to investigate the rate of convergence of the diffusion process to the stationary distribution. Denote the difference between the transition density and the stationary density by

\[
d(t;x,y) := p(t;x,y) - \frac{1}{C_\Pi}\,\Pi(y) = \sum_{n=1}^{\infty} e^{-\Lambda_n t}\,\Pi(y)\,\frac{B_n(x)B_n(y)}{\langle B_n,B_n\rangle_{\Pi}}.
\]

We measure the magnitude of d(t; x, y) by the square of its L2 norm with respect to the weight function 1/Π(y), that is,

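\[
\|d(t;x,\cdot)\|^2_{1/\Pi} := \langle d,d\rangle_{1/\Pi}
= \sum_{n=1}^{\infty} e^{-2\Lambda_n t}\,\frac{B_n(x)^2}{\langle B_n,B_n\rangle_{\Pi}}
= \sum_{n=1}^{\infty} e^{-2\Lambda_n t}\,\frac{e^{-\bar\sigma(x)}\left(\sum_{k\in\mathbb{N}_0^{K-1}} u_{n,k}\,P_k^{\theta}(x)\right)^2}{\sum_{m\in\mathbb{N}_0^{K-1}} u_{n,m}^2\,C_m^{\theta}}. \tag{29}
\]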

Again, the sums in this expression can be approximated by truncating at a given level. Figure 6 shows \(\|d(t;x,\cdot)\|^2_{1/\Pi}\) as a function of time t, for σ = σ1, σ = 0.5σ1, and σ = 0.1σ1. The initial frequencies were x = (0.02, 0.02, 0.96). As expected, the distance to the stationary distribution decreases over time. Further, the rate of convergence increases as the values in σ get larger, which was also observed by Song and Steinrücken (2012). We note that the spectral representation can also be readily employed to study convergence rates measured by other metrics such as the total variation distance or relative entropy.
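The long-time behaviour of (29) can be read off directly from the spectrum. As a sketch, assume the nonzero eigenvalues are ordered as 0 < Λ1 < Λ2 ≤ Λ3 ≤ … and that B1(x) ≠ 0 (both are assumptions of this illustration; a repeated smallest eigenvalue would simply contribute several leading terms). Factoring out the slowest-decaying exponential gives

```latex
\[
\|d(t;x,\cdot)\|^2_{1/\Pi}
  = e^{-2\Lambda_1 t}\left(
      \frac{B_1(x)^2}{\langle B_1,B_1\rangle_{\Pi}}
      + \sum_{n\ge 2} e^{-2(\Lambda_n-\Lambda_1)t}\,
        \frac{B_n(x)^2}{\langle B_n,B_n\rangle_{\Pi}}
    \right),
\qquad
\log \|d(t;x,\cdot)\|^2_{1/\Pi} \approx -2\Lambda_1 t
  + \log\frac{B_1(x)^2}{\langle B_1,B_1\rangle_{\Pi}}
  \quad (t\to\infty).
\]
```

On a logarithmic scale the distance therefore eventually decays linearly in t with slope −2Λ1, so the smallest nonzero eigenvalue governs the asymptotic rate of convergence.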

Figure 6.

Convergence of the transition density to stationarity as time evolves, for initial frequencies x = (0.02, 0.02, 0.96)ᵀ. Deviation from the stationary density is measured by \(\|d(t;x,\cdot)\|^2_{1/\Pi}\), defined in (29). The mutation rates were chosen to be θ = (0.01, 0.02, 0.03) and the selection parameters were σ = σ1, σ = 0.5σ1 and σ = 0.1σ1, respectively. The truncation level was set to D = 40, and (29) was approximated by summing over 0 ≤ n ≤ 561 and m, k such that 0 ≤ |m|, |k| ≤ 36.

6. Discussion

In this paper, we have extended the method of Song and Steinrücken (2012) to obtain an explicit spectral representation of the transition density function for the multi-dimensional Wright-Fisher diffusion under a PIM model with general diploid selection and an arbitrary number of alleles. We have demonstrated the fidelity and fast convergence of the approximations. Further, as an example application of our work, we have computed the normalization constant of the stationary distribution and quantified the rate at which the transition density approaches this distribution.

Efficient approximations of the eigensystem and the transition density function lead to a number of important applications. Combining the stationary distribution discussed in Section 5.2 with the recurrence relation shown in Lemma 3, one can calculate algebraically the probability of observing a given genetic configuration of individuals sampled from the stationary distribution of the non-neutral diffusion. This kind of algebraic approach would complement previous works (Evans et al., 2007; Živković and Stephan, 2011) on sample allele frequency spectra that involve solving ODEs satisfied by the moments of the diffusion. Further, the algebraic approach is potentially more efficient than computationally expensive Monte Carlo methods (Donnelly et al., 2001) and more generally applicable than methods relying on the selection coefficients being of a certain form (Genz and Joyce, 2003). Note that, by discretizing time and space, our representation of the transition density function can be used for approximate simulation of frequencies from stationarity as well as frequency trajectories, which can in turn be employed in the aforementioned Monte Carlo frameworks.

The sampling probability can be applied, for example, to estimate evolutionary parameters via maximum likelihood or Bayesian inference frameworks. Furthermore, the notion of sampling probability can be combined with the spectral representation of the transition density function in a hidden Markov model framework as in Bollback et al. (2008), to calculate the probability of observing a series of configurations sampled at different times. The method developed in this paper would allow for such an analysis in a model with multiple alleles subject to recurrent parent-independent mutation and general diploid selection.

An important, albeit very challenging, future direction is to extend our current approach to analyze the dynamics of multi-locus diffusions with recombination and selection. Such an extension would allow for the incorporation of additional data at closely linked loci, which has the potential to significantly improve the inference of evolutionary parameters, especially the strength and localization of selection. We have only considered Wright-Fisher diffusions in a single panmictic population of constant size. As achieved in the alternative approaches of Gutenkunst et al. (2009) and Lukić et al. (2011), mentioned in the Introduction, it would be desirable to generalize our approach to incorporate subdivided populations exchanging migrants, with possibly fluctuating population sizes. Another possible extension is to relax the PIM assumption and consider a more general mutation model.

We note that our present technique relies on the diffusion generator being symmetric. This symmetry does not hold in some of the scenarios mentioned above, making a direct application of the ideas developed here difficult. However, we believe that it is worthwhile investigating whether one could apply our approach to devise approximations to the transition density function that are sufficiently accurate for practical applications.

Acknowledgement

We thank Anand Bhaskar for many helpful discussions. This research is supported in part by a DFG Research Fellowship STE 2011/1-1 to M.S.; and by an NIH grant R01-GM094402, an Alfred P. Sloan Research Fellowship, and a Packard Fellowship for Science and Engineering to Y.S.S.

Appendix A

Proofs of lemmas

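Proof of Lemma 2. For two indices \(n, m \in \mathbb{N}_0^{K-1}\), consider the integral

\[
\int_{\Delta_{K-1}} P_n^{\theta}(x)\,P_m^{\theta}(x)\,\Pi_0(x)\,dx
= \int_{[0,1]^{K-1}} \tilde P_n^{\theta}(\xi)\,\tilde P_m^{\theta}(\xi)\,\Pi_0(x(\xi))\,\bigl|\det(Dx)(\xi)\bigr|\,d\xi, \tag{A.1}
\]

where the right-hand side can be obtained by the coordinate transformation introduced in Appendix B and the rule for multivariate integration by substitution. Using

\[
\Pi_0(x(\xi)) = \prod_{i=1}^{K-1} \xi_i^{\theta_i-1}\,(1-\xi_i)^{\Theta_i-(K-i)},
\]

the determinant of the Jacobian

\[
\bigl|\det(Dx)(\xi)\bigr| = \prod_{i=1}^{K-2} (1-\xi_i)^{K-(i+1)},
\]

see Baxter et al. (2007, Equation B.1), and the transformed Jacobi polynomials (B.1), it can be shown that

\[
\prod_{j=1}^{K-1} \int_0^1 R_{n_j}^{(\theta_j,\Theta_j+2N_j)}(\xi_j)\,R_{m_j}^{(\theta_j,\Theta_j+2M_j)}(\xi_j)\,\xi_j^{\theta_j-1}\,(1-\xi_j)^{\Theta_j+N_j+M_j-1}\,d\xi_j = C_n^{\theta}\,\delta_{n,m}
\]

holds, with

\[
C_n^{\theta} = \prod_{j=1}^{K-1} c_{n_j}^{(\theta_j,\Theta_j+2N_j)}.
\]

In the case n = m this can be seen immediately. If \(n \neq m\), without loss of generality let \(1 \le l \le K-1\) be the largest l such that \(n_l < m_l\) and \(n_k = m_k\) for all \(k = l+1, \dots, K-1\). Then \(N_l = M_l\) (recall \(N_{K-1} = M_{K-1} = 0\)) and \(R_{m_l}^{(\theta_l,\Theta_l+2M_l)}(\xi_l)\) is orthogonal to all polynomials of lesser degree with respect to the weight function \(\xi_l^{\theta_l-1}(1-\xi_l)^{\Theta_l+2M_l-1}\), and thus the l-th factor and the whole product are zero.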

Proof of Lemma 3. We found it most convenient to derive a recurrence relation for

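\[
x_i\,P_n^{\theta}(x) \tag{A.2}
\]

by projecting expression (A.2) onto the orthogonal basis \(\{P_m^{\theta}(x)\}\) and investigating the respective coefficients. First, note that the coordinate transformation introduced in Appendix B yields \(x_i = \xi_i \prod_{j<i}(1-\xi_j)\), so

\[
x_i\,P_n^{\theta}(x) = \xi_i \prod_{j<i}(1-\xi_j)\,\tilde P_n^{\theta}(\xi). \tag{A.3}
\]

Further, integrate expression (A.3) against the base function \(\tilde P_m^{\theta}(\xi)\) times the weight function \(\Pi_0\) to get the respective coefficient in the basis representation. Using the integration by substitution rule again, as in equation (A.1), this yields

\[
\begin{aligned}
&\frac{1}{C_m^{\theta}} \int_{[0,1]^{K-1}} \xi_i \prod_{j<i}(1-\xi_j)\,\tilde P_n^{\theta}(\xi)\,\tilde P_m^{\theta}(\xi) \prod_{k=1}^{K-1} \xi_k^{\theta_k-1}(1-\xi_k)^{\Theta_k-1}\,d\xi \\
&\quad= \prod_{j=i+1}^{K-1} \frac{1}{c_{m_j}^{(\theta_j,\Theta_j+2M_j)}} \int_0^1 R_{n_j}^{(\theta_j,\Theta_j+2N_j)}(\xi_j)\,R_{m_j}^{(\theta_j,\Theta_j+2M_j)}(\xi_j)\,\xi_j^{\theta_j-1}(1-\xi_j)^{\Theta_j+N_j+M_j-1}\,d\xi_j \\
&\qquad\times \frac{1}{c_{m_i}^{(\theta_i,\Theta_i+2M_i)}} \int_0^1 R_{n_i}^{(\theta_i,\Theta_i+2N_i)}(\xi_i)\,R_{m_i}^{(\theta_i,\Theta_i+2M_i)}(\xi_i)\,\xi_i^{\theta_i}(1-\xi_i)^{\Theta_i+N_i+M_i-1}\,d\xi_i \\
&\qquad\times \prod_{j=1}^{i-1} \frac{1}{c_{m_j}^{(\theta_j,\Theta_j+2M_j)}} \int_0^1 R_{n_j}^{(\theta_j,\Theta_j+2N_j)}(\xi_j)\,R_{m_j}^{(\theta_j,\Theta_j+2M_j)}(\xi_j)\,\xi_j^{\theta_j-1}(1-\xi_j)^{\Theta_j+N_j+M_j}\,d\xi_j.
\end{aligned}
\]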

The first term on the right hand side yields zero, unless mj = nj for all j > i, thus Mi = Ni. In this case the term is equal to 1. Since mj = nj for j > i, note that the second term on the right hand side is of the form

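\[
\frac{1}{c_{m_i}^{(\alpha,\beta)}} \int_0^1 R_{n_i}^{(\alpha,\beta)}(\xi)\,R_{m_i}^{(\alpha,\beta)}(\xi)\,\xi\,w_{\alpha,\beta}(\xi)\,d\xi
= G_{n_i,m_i}^{(\alpha,\beta)}\,\delta_{n_i+1,m_i} + G_{n_i,m_i}^{(\alpha,\beta)}\,\delta_{n_i,m_i} + G_{n_i,m_i}^{(\alpha,\beta)}\,\delta_{n_i-1,m_i},
\]

with \(w_{\alpha,\beta}(\xi) = \xi^{\alpha-1}(1-\xi)^{\beta-1}\), \(\alpha = \theta_i\), and \(\beta = \Theta_i + 2N_i\). Here we applied the recurrence relation (10) to \(\xi R_{n_i}^{(\alpha,\beta)}(\xi)\) and used the orthogonality of the Jacobi polynomials. The constants \(G_{n,m}^{(a,b)}\) are given by

\[
G_{n,m}^{(a,b)} =
\begin{cases}
\dfrac{(n+a-1)(n+b-1)}{(2n+a+b-1)(2n+a+b-2)}, & \text{if } n-m=1 \text{ and } n>0,\\[2ex]
\dfrac{1}{2} - \dfrac{b^2-a^2-2(b-a)}{2(2n+a+b)(2n+a+b-2)}, & \text{if } n-m=0 \text{ and } n\ge 0,\\[2ex]
\dfrac{(n+1)(n+a+b-1)}{(2n+a+b)(2n+a+b-1)}, & \text{if } n-m=-1 \text{ and } n\ge 0.
\end{cases}
\]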
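The constants G can be checked numerically against the three-term recurrence they encode, namely that ξ times R_n equals the sum over m ∈ {n − 1, n, n + 1} of G_{n,m} R_m. The sketch below does this under the assumption that the modified Jacobi polynomials are normalized as R_n^{(a,b)}(x) = P_n^{(b−1,a−1)}(2x − 1), which is consistent with the value at zero quoted after equation (28); the helper names and the test parameters a = 0.5, b = 1.7 are hypothetical.

```python
import math


def jacobi_R(n, a, b, x):
    """Modified Jacobi polynomial on [0, 1], assumed to be
    R_n^{(a,b)}(x) = P_n^{(b-1,a-1)}(2x - 1), via its explicit sum."""
    total = 0.0
    for s in range(n + 1):
        c1 = math.gamma(n + b) / (math.gamma(n - s + 1) * math.gamma(b + s))
        c2 = math.gamma(n + a) / (math.gamma(s + 1) * math.gamma(n + a - s))
        total += c1 * c2 * (x - 1.0) ** s * x ** (n - s)
    return total


def G(n, m, a, b):
    """Coefficient of R_m in the expansion of x * R_n (zero unless |n - m| <= 1)."""
    s = 2 * n + a + b
    if n - m == 1 and n > 0:
        return (n + a - 1) * (n + b - 1) / ((s - 1) * (s - 2))
    if n == m:
        return 0.5 - (b * b - a * a - 2 * (b - a)) / (2 * s * (s - 2))
    if n - m == -1:
        return (n + 1) * (n + a + b - 1) / (s * (s - 1))
    return 0.0


def recurrence_residual(n, a, b, x):
    """x * R_n(x) minus its three-term expansion; should vanish up to rounding."""
    lhs = x * jacobi_R(n, a, b, x)
    rhs = sum(G(n, m, a, b) * jacobi_R(m, a, b, x)
              for m in (n - 1, n, n + 1) if m >= 0)
    return lhs - rhs


worst = max(abs(recurrence_residual(n, 0.5, 1.7, x))
            for n in range(6) for x in (0.1, 0.35, 0.8))
print(worst)
```

The middle case divides by 2n + a + b − 2, so parameter choices with a + b = 2 and n = 0 would need the limiting value a/(a + b) instead.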

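This expression is non-zero for \(-1 \le n_i - m_i \le 1\). Furthermore, the form of the integral for j = i − 1 depends on this difference, or rather the difference between \(N_j\) and \(M_j\). Depending on the difference \(N_j - M_j\) we have to consider the integrals

\[
\begin{aligned}
-1:&\quad \frac{1}{c_{m_j}^{(\alpha,\beta+2)}} \int_0^1 R_{n_j}^{(\alpha,\beta)}(\xi)\,R_{m_j}^{(\alpha,\beta+2)}(\xi)\,w_{\alpha,\beta+2}(\xi)\,d\xi, && \text{(A.4)}\\
0:&\quad \frac{1}{c_{m_j}^{(\alpha,\beta)}} \int_0^1 R_{n_j}^{(\alpha,\beta)}(\xi)\,R_{m_j}^{(\alpha,\beta)}(\xi)\,(1-\xi)\,w_{\alpha,\beta}(\xi)\,d\xi, && \text{(A.5)}\\
+1:&\quad \frac{1}{c_{m_j}^{(\alpha,\beta-2)}} \int_0^1 R_{n_j}^{(\alpha,\beta)}(\xi)\,R_{m_j}^{(\alpha,\beta-2)}(\xi)\,w_{\alpha,\beta}(\xi)\,d\xi, && \text{(A.6)}
\end{aligned}
\]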

with α = θj and β = Θj + 2Nj. In expression (A.6) we have to assume β > 2, which is equivalent to Nj ≥ 1. This holds true, because if Nj = 0, this case would not have to be considered.

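Applying relation (12) twice to the polynomial \(R_{n_j}^{(\alpha,\beta)}(\xi)\) in equation (A.4) and using orthogonality yields

\[
H_{n_j,m_j}^{(\alpha,\beta)}\,\delta_{n_j,m_j} + H_{n_j,m_j}^{(\alpha,\beta)}\,\delta_{n_j-1,m_j} + H_{n_j,m_j}^{(\alpha,\beta)}\,\delta_{n_j-2,m_j},
\]

for some constants \(H_{n,m}^{(\alpha,\beta)}\). Here \(H_{0,-1}^{(\alpha,\beta)} = H_{0,-2}^{(\alpha,\beta)} = H_{1,-1}^{(\alpha,\beta)} = 0\). Thus, in the case \(M_j = N_j + 1\), the expression for j is non-zero for \(m_j = n_j\), \(n_j - 1\), and \(n_j - 2\). Furthermore, relation (10) can be applied to the term \(R_{n_j}^{(\alpha,\beta)}(\xi)(1-\xi)\), together with orthogonality, to get

\[
I_{n_j,m_j}^{(\alpha,\beta)}\,\delta_{n_j+1,m_j} + I_{n_j,m_j}^{(\alpha,\beta)}\,\delta_{n_j,m_j} + I_{n_j,m_j}^{(\alpha,\beta)}\,\delta_{n_j-1,m_j},
\]

for given constants \(I_{n,m}^{(\alpha,\beta)}\), with \(I_{-1,0}^{(\alpha,\beta)} = I_{0,-1}^{(\alpha,\beta)} = 0\). In the case \(M_j = N_j\), the expression is non-zero for \(m_j = n_j - 1\), \(n_j\), and \(n_j + 1\). Finally, applying relation (12) to the term \(R_{m_j}^{(\alpha,\beta-2)}(\xi)\) in expression (A.6) combined with orthogonality yields

\[
J_{n_j,m_j}^{(\alpha,\beta)}\,\delta_{n_j,m_j} + J_{n_j,m_j}^{(\alpha,\beta)}\,\delta_{n_j+1,m_j} + J_{n_j,m_j}^{(\alpha,\beta)}\,\delta_{n_j+2,m_j},
\]

for given constants \(J_{n,m}^{(\alpha,\beta)}\). Again \(J_{-1,0}^{(\alpha,\beta)} = J_{-2,0}^{(\alpha,\beta)} = J_{-1,1}^{(\alpha,\beta)} = 0\). Thus this expression is non-zero for \(m_j = n_j\), \(n_j + 1\), and \(n_j + 2\). The constants \(H_{n,m}^{(a,b)}\), \(I_{n,m}^{(a,b)}\), \(J_{n,m}^{(a,b)}\) are given by

\[
H_{n,m}^{(a,b)} =
\begin{cases}
\dfrac{(n+a+b-1)(n+a+b)}{(2n+a+b-1)(2n+a+b)}, & \text{if } m-n=0 \text{ and } n\ge 0,\\[2ex]
-\dfrac{2a}{a+b+2}, & \text{if } m-n=-1 \text{ and } n=1,\\[2ex]
-\dfrac{2(n+a-1)(n+a+b-1)}{(2n+a+b-2)(2n+a+b)}, & \text{if } m-n=-1 \text{ and } n>1,\\[2ex]
\dfrac{(n+a-2)(n+a-1)}{(2n+a+b-2)(2n+a+b-1)}, & \text{if } m-n=-2 \text{ and } n>1,
\end{cases}
\]

\[
I_{n,m}^{(a,b)} =
\begin{cases}
-\dfrac{1}{a+b}, & \text{if } m-n=1 \text{ and } n=0,\\[2ex]
-\dfrac{(n+1)(n+a+b-1)}{(2n+a+b-1)(2n+a+b)}, & \text{if } m-n=1 \text{ and } n>0,\\[2ex]
\dfrac{b}{a+b}, & \text{if } m-n=0 \text{ and } n=0,\\[2ex]
\dfrac{b^2+a(b+2)}{(a+b)(a+b+2)}, & \text{if } m-n=0 \text{ and } n=1,\\[2ex]
\dfrac{b^2+2n(n+a-1)+b(2n+a-2)}{(2n+a+b-2)(2n+a+b)}, & \text{if } m-n=0 \text{ and } n>1,\\[2ex]
-\dfrac{ab}{(a+b)(a+b+1)}, & \text{if } m-n=-1 \text{ and } n=1,\\[2ex]
-\dfrac{(n+a-1)(n+b-1)}{(2n+a+b-2)(2n+a+b-1)}, & \text{if } m-n=-1 \text{ and } n>1,
\end{cases}
\]

\[
J_{n,m}^{(a,b)} =
\begin{cases}
\dfrac{(b-1)(b-2)}{(a+b-1)(a+b-2)}, & \text{if } m-n=0 \text{ and } n=0,\\[2ex]
\dfrac{(n+b-2)(n+b-1)}{(2n+a+b-2)(2n+a+b-1)}, & \text{if } m-n=0 \text{ and } n>0,\\[2ex]
-\dfrac{2(n+1)(n+b-1)}{(2n+a+b-2)(2n+a+b)}, & \text{if } m-n=1 \text{ and } n\ge 0,\\[2ex]
\dfrac{(n+1)(n+2)}{(2n+a+b-1)(2n+a+b)}, & \text{if } m-n=2 \text{ and } n\ge 0.
\end{cases}
\]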

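Now considering all three possible values for \(N_j - M_j\), and all possible implications for the difference \(n_j - m_j\), it can be shown that \(-1 \le N_{j-1} - M_{j-1} \le 1\) has to hold as well. Using induction shows that \(-1 \le N_j - M_j \le 1\) holds for all j < i. Thus for all j < i the same integrals (A.4), (A.5), and (A.6), with adjusted parameters \(\alpha = \theta_j\) and \(\beta = \Theta_j + 2N_j\), have to be considered.

Combining these results shows that for fixed i and n the polynomials with a non-zero contribution to the recurrence relation for \(x_i P_n^{\theta}(x)\) are exactly those with indices from the set

\[
\mathcal{M}_i(n) := \left\{ m \in \mathbb{N}_0^{K-1} \;\middle|\; m_j \ge 0 \ \forall j,\quad M_j = N_j \ \forall j > i,\quad |M_j - N_j| \le 1 \ \forall j \le i \right\},
\]

defined in (16). Thus,

\[
x_i\,P_n^{\theta}(x) = \sum_{m \in \mathcal{M}_i(n)} r_{n,m}^{(\theta,i)}\,P_m^{\theta}(x),
\]

where the coefficients \(r_{n,m}^{(\theta,i)}\) are given by

\[
r_{n,m}^{(\theta,i)} = G_{n_i,m_i}^{(\theta_i,\Theta_i+2N_i)} \prod_{j<i}
\begin{cases}
H_{n_j,m_j}^{(\theta_j,\Theta_j+2N_j)}, & \text{if } N_j - M_j = -1,\\[1ex]
I_{n_j,m_j}^{(\theta_j,\Theta_j+2N_j)}, & \text{if } N_j - M_j = 0,\\[1ex]
J_{n_j,m_j}^{(\theta_j,\Theta_j+2N_j)}, & \text{if } N_j - M_j = +1.
\end{cases}
\]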

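Proof of Lemma 5. Using the coordinate transformation introduced in Appendix B, and applying ℒ0, given in equation (B.2), to \(\tilde P_n^{\theta}(\xi)\) from equation (B.1) yields

\[
\begin{aligned}
\mathcal{L}_0 \tilde P_n^{\theta}(\xi) = \frac{1}{2} \sum_{i=1}^{K-1} \frac{1}{\prod_{k<i}(1-\xi_k)} &\prod_{j=1,\,j\neq i}^{K-1} R_{n_j}^{(\theta_j,\Theta_j+2N_j)}(\xi_j)\,(1-\xi_j)^{N_j} \\
\times \Bigl( &\xi_i(1-\xi_i)\,\frac{\partial^2}{\partial \xi_i^2}\Bigl\{ R_{n_i}^{(\theta_i,\Theta_i+2N_i)}(\xi_i)\,(1-\xi_i)^{N_i} \Bigr\} \\
&+ (\theta_i - \Theta_{i-1}\xi_i)\,\frac{\partial}{\partial \xi_i}\Bigl\{ R_{n_i}^{(\theta_i,\Theta_i+2N_i)}(\xi_i)\,(1-\xi_i)^{N_i} \Bigr\} \Bigr).
\end{aligned} \tag{A.7}
\]

Employing equation (8), one can show that the terms in the brackets on the right-hand side of equation (A.7) reduce to

\[
(1-\xi_i)^{N_i}\,R_{n_i}^{(\theta_i,\Theta_i+2N_i)}(\xi_i) \left( -N_{i-1}\bigl(N_{i-1}-1+\Theta_{i-1}\bigr) + \frac{1}{1-\xi_i}\,N_i\bigl(N_i-1+\Theta_i\bigr) \right),
\]

and substitution yields

\[
\mathcal{L}_0 \tilde P_n^{\theta}(\xi) = \frac{1}{2}\,\tilde P_n^{\theta}(\xi) \left( -\sum_{i=1}^{K-1} \frac{N_{i-1}\bigl(N_{i-1}-1+\Theta_{i-1}\bigr)}{\prod_{k<i}(1-\xi_k)} + \sum_{i=2}^{K} \frac{N_{i-1}\bigl(N_{i-1}-1+\Theta_{i-1}\bigr)}{\prod_{k<i}(1-\xi_k)} \right) = -\lambda_{|n|}^{\theta}\,\tilde P_n^{\theta}(\xi)
\]

with \(\lambda_{|n|}^{\theta} = \frac{1}{2}|n|\bigl(|n|-1+|\theta|\bigr)\), since \(\Theta_0 = |\theta|\), \(N_0 = |n|\), and \(N_{K-1} = 0\).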
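Because the neutral eigenvalue depends on the multi-index n only through its total degree |n|, all multi-indices of the same degree share one eigenvalue. The short sketch below lists the neutral eigenvalues ½|n|(|n| − 1 + |θ|) for K = 3 together with their multiplicities, obtained by brute-force enumeration and compared with the standard stars-and-bars count C(d + K − 2, K − 2). The function names are hypothetical, and θ is taken from the figure captions.

```python
import math
from itertools import product


def neutral_eigenvalue(d, theta_sum):
    """lambda_d = d (d - 1 + |theta|) / 2 for total degree d = |n|."""
    return 0.5 * d * (d - 1 + theta_sum)


def multi_indices(d, dim):
    """All n in N_0^dim with |n| = d (brute-force enumeration)."""
    return [n for n in product(range(d + 1), repeat=dim) if sum(n) == d]


K = 3
theta = (0.01, 0.02, 0.03)
for d in range(4):
    count = len(multi_indices(d, K - 1))
    print(d, neutral_eigenvalue(d, sum(theta)), count)
```

The same count, summed over degrees 0 to D, gives the number of basis polynomials retained at truncation level D, which is what determines the size of the truncated linear-algebra problem.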

Appendix B

Change of coordinates

Working with the multivariate Jacobi polynomials and the neutral diffusion, it is convenient, for some derivations, to transform the equations to a different coordinate system. This transformation maps the simplex \(\Delta_{K-1}\) to the (K − 1)-dimensional unit cube \([0,1]^{K-1}\). It is implicitly used in Griffiths and Spanò (2011, Section 3), but more explicitly introduced and used as a transformation in Baxter et al. (2007). The vector \(\xi(x) = (\xi_1(x), \dots, \xi_{K-1}(x))\) is obtained from the vector of population frequencies x via the transformation

\[
\xi_i = \frac{x_i}{1 - \sum_{j<i} x_j}
\]

for \(1 \le i \le K-1\). The inverse of this transformation is given by

\[
x_i = \xi_i \prod_{j<i} (1-\xi_j)
\]

for \(1 \le i \le K-1\). The inverse relation can be derived by noting that \(1 - \sum_{j<i} x_j = \prod_{j<i}(1-\xi_j)\) holds.
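The two maps are straightforward to implement with a running remainder. The following minimal sketch (function names are hypothetical) converts the first K − 1 frequencies to the cube coordinates and back, assuming an interior point of the simplex so that no denominator vanishes.

```python
def to_xi(x):
    """Map frequencies (x_1, ..., x_{K-1}) to xi_i = x_i / (1 - sum_{j<i} x_j)."""
    xi, remaining = [], 1.0
    for x_i in x:
        xi.append(x_i / remaining)
        remaining -= x_i            # remaining = 1 - sum of frequencies so far
    return xi


def to_x(xi):
    """Inverse map x_i = xi_i * prod_{j<i} (1 - xi_j)."""
    x, remaining = [], 1.0
    for xi_i in xi:
        x.append(xi_i * remaining)
        remaining *= 1.0 - xi_i     # remaining = prod_{j<=i} (1 - xi_j)
    return x


x = [0.02, 0.02]                    # first K - 1 = 2 entries of (0.02, 0.02, 0.96)
xi = to_xi(x)
print(xi, to_x(xi))
```

Both loops maintain the same quantity, once as one minus a partial sum and once as a partial product, which is exactly the identity used to derive the inverse.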

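Definition 1 yields immediately that the multivariate Jacobi polynomials \(P_n^{\theta}(x)\) take the form

\[
\tilde P_n^{\theta}(\xi) = P_n^{\theta}(x(\xi)) = \prod_{j=1}^{K-1} R_{n_j}^{(\theta_j,\Theta_j+2N_j)}(\xi_j)\,(1-\xi_j)^{N_j} \tag{B.1}
\]

in the transformed coordinates. The neutral diffusion generator ℒ0 in the transformed coordinate system is given by the following lemma.

Lemma 7. Using variables in the new coordinate system, the backward generator of the diffusion under neutrality, ℒ0, can be written as

\[
\mathcal{L}_0 f(\xi) = \frac{1}{2} \sum_{i=1}^{K-1} \tilde b_{i,i}(\xi)\,\frac{\partial^2}{\partial \xi_i^2} f(\xi) + \sum_{i=1}^{K-1} \tilde a_i(\xi)\,\frac{\partial}{\partial \xi_i} f(\xi), \tag{B.2}
\]

with

\[
\tilde b_{i,j}(\xi) = \delta_{i,j}\,\frac{\xi_i(1-\xi_i)}{\prod_{k<i}(1-\xi_k)}
\]

and

\[
\tilde a_i(\xi) = \frac{1}{2}\,\frac{\theta_i - \Theta_{i-1}\xi_i}{\prod_{k<i}(1-\xi_k)}.
\]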

The proof of this lemma is paraphrased in Appendix B of Baxter et al. (2007). The transformation diagonalizes the operator by removing all the mixed second order partial derivatives.

Appendix C

Coefficients of the polynomial Q(x; σ, θ)

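\[
\begin{aligned}
q(\mathbf{i}) &= \frac{1}{2}\Bigl( \sum_{j=1}^{K} \theta_j \sigma_{K,j} - |\theta|\,\sigma_{K,K} \Bigr) \quad \text{when } \mathbf{i} = \emptyset,\\
q(i_1) &= \frac{1}{2}\Bigl( \sum_{j=1}^{K} \theta_j(\sigma_{i_1,j} - \sigma_{j,K}) + \sigma_{i_1,K}^2 + \sigma_{K,K}^2 - 2\sigma_{K,K}\sigma_{i_1,K} - 2(1+|\theta|)\sigma_{i_1,K} + (1+2|\theta|)\sigma_{K,K} + \sigma_{i_1,i_1} \Bigr),\\
q(i_1,i_2) &= \frac{1}{2}\Bigl( 2\sigma_{i_1,K}\sigma_{i_1,i_2} - 3\sigma_{i_1,K}\sigma_{i_2,K} + 8\sigma_{i_2,K}\sigma_{K,K} - 2\sigma_{K,K}\sigma_{i_1,i_2} - 2\sigma_{i_1,K}^2 - 3\sigma_{K,K}^2 - (1+|\theta|)\bigl(\sigma_{i_1,i_2} + \sigma_{K,K} - 2\sigma_{i_2,K}\bigr) \Bigr),\\
q(i_1,i_2,i_3) &= \frac{1}{2}\Bigl( (\sigma_{i_1,i_3} - \sigma_{i_1,K})(\sigma_{i_1,i_2} - \sigma_{i_1,K}) - (\sigma_{i_3,K} - \sigma_{K,K})(\sigma_{i_2,K} - \sigma_{K,K}) - 4\bigl(\sigma_{i_2,i_3} + \sigma_{K,K} - 2\sigma_{i_3,K}\bigr)(\sigma_{i_1,K} - \sigma_{K,K}) \Bigr),\\
q(i_1,i_2,i_3,i_4) &= -\frac{1}{2}\Bigl( \bigl(\sigma_{i_1,i_2} + \sigma_{K,K} - 2\sigma_{i_2,K}\bigr)\bigl(\sigma_{i_3,i_4} + \sigma_{K,K} - 2\sigma_{i_4,K}\bigr) \Bigr).
\end{aligned}
\]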

Appendix D

Derivation of equation (25)

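Applying ℒ to \(S_n^{\theta}(x)\),

\[
\begin{aligned}
\mathcal{L} S_n^{\theta}(x) &= (\mathcal{L}_0 + \mathcal{L}_\sigma)\Bigl( P_n^{\theta}(x)\,e^{-\bar\sigma(x)/2} \Bigr) \\
&= e^{-\frac{\bar\sigma(x)}{2}}\,\mathcal{L}_0 P_n^{\theta}(x) + P_n^{\theta}(x)\,\mathcal{L}_0 e^{-\frac{\bar\sigma(x)}{2}} + \sum_{i,j=1}^{K-1} x_i(\delta_{i,j} - x_j)\,\frac{\partial}{\partial x_i}\Bigl\{ e^{-\frac{\bar\sigma(x)}{2}} \Bigr\}\,\frac{\partial}{\partial x_j}\bigl\{ P_n^{\theta}(x) \bigr\} \\
&\quad + P_n^{\theta}(x)\,\mathcal{L}_\sigma e^{-\frac{\bar\sigma(x)}{2}} + e^{-\frac{\bar\sigma(x)}{2}}\,\mathcal{L}_\sigma P_n^{\theta}(x) \\
&= -\lambda_n^{\theta}\,e^{-\frac{\bar\sigma(x)}{2}} P_n^{\theta}(x) + P_n^{\theta}(x)\,\mathcal{L} e^{-\frac{\bar\sigma(x)}{2}} + \sum_{i,j=1}^{K-1} x_i(\delta_{i,j} - x_j)\,\frac{\partial}{\partial x_i}\Bigl\{ e^{-\frac{\bar\sigma(x)}{2}} \Bigr\}\,\frac{\partial}{\partial x_j}\bigl\{ P_n^{\theta}(x) \bigr\} + e^{-\frac{\bar\sigma(x)}{2}}\,\mathcal{L}_\sigma P_n^{\theta}(x).
\end{aligned}
\]

It can be shown that the last two terms in the above expression sum up to 0. Note that for \(1 \le i, j \le K-1\),

\[
\frac{\partial}{\partial x_i}\,\bar\sigma(x) = 2\sum_{k=1}^{K} \sigma_{k,i}\,x_k - 2\sum_{l=1}^{K} \sigma_{l,K}\,x_l, \tag{D.1}
\]

\[
\frac{\partial^2}{\partial x_j\,\partial x_i}\,\bar\sigma(x) = 2\bigl(\sigma_{i,j} - \sigma_{j,K} - \sigma_{i,K} + \sigma_{K,K}\bigr). \tag{D.2}
\]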
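Identities (D.1) and (D.2) can be checked with finite differences. The sketch below assumes that the mean fitness is the quadratic form obtained by summing σ_{i,j} x_i x_j over all i, j = 1, …, K with a symmetric matrix σ and with x_K = 1 − x_1 − … − x_{K−1} substituted, so that derivatives are taken in the K − 1 free coordinates. The function names and the random test matrix are hypothetical.

```python
import random


def sigma_bar(x, sigma):
    """Mean fitness sum_{i,j} sigma[i][j] x_i x_j, with x_K = 1 - sum of the others."""
    full = list(x) + [1.0 - sum(x)]
    K = len(full)
    return sum(sigma[i][j] * full[i] * full[j] for i in range(K) for j in range(K))


def grad_D1(x, sigma, i):
    """Right-hand side of (D.1) for coordinate i (0-based)."""
    full = list(x) + [1.0 - sum(x)]
    K = len(full)
    return (2 * sum(sigma[k][i] * full[k] for k in range(K))
            - 2 * sum(sigma[l][K - 1] * full[l] for l in range(K)))


def hess_D2(sigma, i, j):
    """Right-hand side of (D.2) for coordinates i, j (0-based)."""
    K = len(sigma)
    return 2 * (sigma[i][j] - sigma[j][K - 1] - sigma[i][K - 1] + sigma[K - 1][K - 1])


def central_diff(f, x, i, h=1e-5):
    """Central finite difference of f in coordinate i."""
    up, dn = list(x), list(x)
    up[i] += h
    dn[i] -= h
    return (f(up) - f(dn)) / (2 * h)


random.seed(0)
K = 3
sigma = [[0.0] * K for _ in range(K)]
for i in range(K):
    for j in range(i, K):
        sigma[i][j] = sigma[j][i] = random.uniform(-5, 5)   # symmetric test matrix
x = [0.2, 0.3]
grad_err = max(abs(central_diff(lambda z: sigma_bar(z, sigma), x, i) - grad_D1(x, sigma, i))
               for i in range(K - 1))
print(grad_err)
```

Since the mean fitness is quadratic in x, central differences are exact up to rounding, which makes this a sharp test of the signs in both identities.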

It follows that

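\[
\begin{aligned}
&\sum_{i,j=1}^{K-1} x_i(\delta_{i,j} - x_j)\,\frac{\partial}{\partial x_i}\Bigl\{ e^{-\frac{\bar\sigma(x)}{2}} \Bigr\}\,\frac{\partial}{\partial x_j}\bigl\{ P_n^{\theta}(x) \bigr\} + e^{-\frac{\bar\sigma(x)}{2}}\,\mathcal{L}_\sigma P_n^{\theta}(x) \\
&\quad= e^{-\frac{\bar\sigma(x)}{2}} \Biggl[ \sum_{i=1}^{K-1} x_i\,\frac{\partial}{\partial x_i}\Bigl\{ -\frac{\bar\sigma(x)}{2} \Bigr\}\,\frac{\partial}{\partial x_i}\bigl\{ P_n^{\theta}(x) \bigr\} - \sum_{i,j=1}^{K-1} x_i x_j\,\frac{\partial}{\partial x_i}\Bigl\{ -\frac{\bar\sigma(x)}{2} \Bigr\}\,\frac{\partial}{\partial x_j}\bigl\{ P_n^{\theta}(x) \bigr\} \\
&\qquad\qquad + \sum_{i=1}^{K-1} x_i\,\frac{\partial}{\partial x_i}\bigl\{ P_n^{\theta}(x) \bigr\} \sum_{j=1}^{K} \sigma_{i,j}\,x_j - \bar\sigma(x) \sum_{i=1}^{K-1} x_i\,\frac{\partial}{\partial x_i}\bigl\{ P_n^{\theta}(x) \bigr\} \Biggr] \\
&\quad= e^{-\frac{\bar\sigma(x)}{2}} \Biggl[ -\sum_{i=1}^{K-1} x_i \Bigl( \sum_{k=1}^{K} \sigma_{k,i}\,x_k - \sum_{l=1}^{K} \sigma_{l,K}\,x_l \Bigr) \frac{\partial}{\partial x_i}\bigl\{ P_n^{\theta}(x) \bigr\} + \sum_{i,j=1}^{K-1} x_i x_j \Bigl( \sum_{k=1}^{K} \sigma_{k,i}\,x_k - \sum_{l=1}^{K} \sigma_{l,K}\,x_l \Bigr) \frac{\partial}{\partial x_j}\bigl\{ P_n^{\theta}(x) \bigr\} \\
&\qquad\qquad + \sum_{i=1}^{K-1} x_i\,\frac{\partial}{\partial x_i}\bigl\{ P_n^{\theta}(x) \bigr\} \sum_{j=1}^{K} \sigma_{i,j}\,x_j - \bar\sigma(x) \sum_{i=1}^{K-1} x_i\,\frac{\partial}{\partial x_i}\bigl\{ P_n^{\theta}(x) \bigr\} \Biggr] \\
&\quad= e^{-\frac{\bar\sigma(x)}{2}} \Biggl[ \sum_{i=1}^{K-1} x_i\,\frac{\partial}{\partial x_i}\bigl\{ P_n^{\theta}(x) \bigr\} \sum_{l=1}^{K} \sigma_{l,K}\,x_l - \sum_{j=1}^{K-1} x_j\,\frac{\partial}{\partial x_j}\bigl\{ P_n^{\theta}(x) \bigr\} \Bigl( \sum_{l=1}^{K} \sigma_{l,K}\,x_K x_l + \sum_{i=1}^{K-1} x_i \sum_{l=1}^{K} \sigma_{l,K}\,x_l \Bigr) \Biggr] \\
&\quad= 0,
\end{aligned}
\]

where we used equation (D.1) for the second equality and \(\sum_{i=1}^{K} x_i = 1\) for the last equality.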

Further, using equation (D.1) and (D.2) one can show that

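\[
\mathcal{L}\,e^{-\frac{\bar\sigma(x)}{2}} = \frac{1}{2}\,e^{-\frac{\bar\sigma(x)}{2}} \Bigl( -\sum_{i=1}^{K} x_i\,\sigma_i^2(x) - \sum_{i=1}^{K} x_i\,\sigma_{ii} + (1+|\theta|)\,\bar\sigma(x) + \bar\sigma(x)^2 - \sum_{i=1}^{K} \theta_i\,\sigma_i(x) \Bigr) = -e^{-\frac{\bar\sigma(x)}{2}}\,Q(x;\sigma,\theta),
\]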

where Q takes the form (27), that is Q(x; σ, θ) = ∑i∈ℐ q(i)xi, with the constants q(i) given in Appendix C.

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Contributor Information

Matthias Steinrücken, Email: steinrue@stat.berkeley.edu.

Y. X. Rachel Wang, Email: rachelwang@stat.berkeley.edu.

Yun S. Song, Email: yss@cs.berkeley.edu.

References

  1. Abramowitz M, Stegun IA. Handbook of Mathematical Functions. Dover Publications; 1965.
  2. Barbour AD, Ethier SN, Griffiths RC. A transition function expansion for a diffusion model with selection. Ann. Appl. Probab. 2000;10:123–162.
  3. Baxter G, Blythe R, McKane A. Exact solution of the multi-allelic diffusion model. Math. Biosci. 2007;209:124–170. doi: 10.1016/j.mbs.2007.01.001.
  4. Bollback JP, York TL, Nielsen R. Estimation of 2Nes from temporal allele frequency data. Genetics. 2008;179(1):497–502. doi: 10.1534/genetics.107.085019.
  5. Buzbas EO, Joyce P. Maximum likelihood estimates under the k-allele model with selection can be numerically unstable. Ann. Appl. Stat. 2009;3(3):1147–1162.
  6. Buzbas EO, Joyce P, Abdo Z. Estimation of selection intensity under overdominance by Bayesian methods. Stat. Appl. Genet. Mol. Biol. 2009;8(1):Article 32. doi: 10.2202/1544-6115.1466.
  7. Buzbas EO, Joyce P, Rosenberg NA. Inference on the strength of balancing selection for epistatically interacting loci. Theor. Popul. Biol. 2011;79(3):102–113. doi: 10.1016/j.tpb.2011.01.002.
  8. Donnelly P, Nordborg M, Joyce P. Likelihoods and simulation methods for a class of nonneutral population genetics models. Genetics. 2001;159(2):853–867. doi: 10.1093/genetics/159.2.853.
  9. Dunkl C, Xu Y. Orthogonal Polynomials of Several Variables. Cambridge University Press; 2001.
  10. Durrett R. Probability Models for DNA Sequence Evolution. Springer; 2008.
  11. Epstein CL, Mazzeo R. Degenerate diffusion operators arising in population biology. 2011. arXiv preprint: http://arxiv.org/abs/1110.0032.
  12. Ethier SN, Kurtz TG. Convergence to Fleming-Viot processes in the weak atomic topology. Stoch. Proc. Appl. 1994;54(1):1–27.
  13. Evans SN, Shvets Y, Slatkin M. Non-equilibrium theory of the allele frequency spectrum. Theor. Popul. Biol. 2007;71(1):109–119. doi: 10.1016/j.tpb.2006.06.005.
  14. Ewens W. Mathematical Population Genetics, volume I: Theoretical Introduction. 2nd edition. Springer; 2004.
  15. Genz A, Joyce P. Computation of the normalizing constant for exponentially weighted Dirichlet distribution integrals. Comp. Sci. Stat. 2003;35:181–212.
  16. Griffiths R. A transition density expansion for a multi-allele diffusion model. Adv. Appl. Prob. 1979;11:310–325.
  17. Griffiths RC, Li W-H. Simulating allele frequencies in a population and the genetic differentiation of populations under mutation pressure. Theor. Popul. Biol. 1983;23(1):19–33. doi: 10.1016/0040-5809(83)90003-5.
  18. Griffiths RC, Spanò D. Diffusion processes and coalescent trees. In: Probability and Mathematical Genetics, Papers in Honour of Sir John Kingman. LMS Lecture Note Series 378, chapter 15. Cambridge University Press; 2010. pp. 358–375.
  19. Griffiths RC, Spanò D. Multivariate Jacobi and Laguerre polynomials, infinite-dimensional extensions, and their probabilistic connections with multivariate Hahn and Meixner polynomials. Bernoulli. 2011;17(3):1095–1125.
  20. Gutenkunst RN, Hernandez RD, Williamson SH, Bustamante CD. Inferring the joint demographic history of multiple populations from multidimensional SNP frequency data. PLoS Genet. 2009;5:e1000695. doi: 10.1371/journal.pgen.1000695.
  21. Karlin S, Taylor H. A Second Course in Stochastic Processes. Academic Press; 1981.
  22. Kimura M. Stochastic processes and distribution of gene frequencies under natural selection. Cold Spring Harb. Symp. Quant. Biol. 1955;20:33–53. doi: 10.1101/sqb.1955.020.01.006.
  23. Kimura M. Some problems of stochastic processes in genetics. Ann. Math. Stat. 1957;28:882–901.
  24. Lukić S, Hey J, Chen K. Non-equilibrium allele frequency spectra via spectral methods. Theor. Popul. Biol. 2011;79(4):203–219. doi: 10.1016/j.tpb.2011.02.003.
  25. Shimakura N. Equations différentielles provenant de la génétique des populations. Tohoku Math. J. 1977;29:287–318.
  26. Song YS, Steinrücken M. A simple method for finding explicit analytic transition densities of diffusion processes with general diploid selection. Genetics. 2012;190(3):1117–1129. doi: 10.1534/genetics.111.136929.
  27. Szegö G. Orthogonal Polynomials. American Mathematical Society; 1939.
  28. Tavaré S. Line-of-descent and genealogical processes, and their applications in population genetics models. Theor. Popul. Biol. 1984;26:119–164. doi: 10.1016/0040-5809(84)90027-3.
  29. Wright S. Adaptation and selection. In: Jepson GL, Mayr E, Simpson GG, editors. Genetics, Paleontology and Evolution. Princeton, New Jersey: Princeton Univ. Press; 1949. pp. 365–389.
  30. Živković D, Stephan W. Analytical results on the neutral non-equilibrium allele frequency spectrum based on diffusion theory. Theor. Popul. Biol. 2011;79(4):184–191. doi: 10.1016/j.tpb.2011.03.003.
