Abstract
We investigate the properties of a Wright-Fisher diffusion process started from frequency x at time 0 and conditioned to be at frequency y at time T. Such a process is called a bridge. Bridges arise naturally in the analysis of selection acting on standing variation and in the inference of selection from allele frequency time series. We establish a number of results about the distribution of neutral Wright-Fisher bridges and develop a novel rejection sampling scheme for bridges under selection that we use to study their behavior.
1. Introduction
The Wright-Fisher Markov chain is of central importance in population genetics and has contributed greatly to the understanding of the patterns of genetic variation seen in natural populations. Much recent work has focused on developing sampling theory for neutral sites linked to sites under selection (Smith and Haigh, 1974; Kaplan et al., 1989; Nielsen et al., 2005; Etheridge et al., 2006). Typically, the site under selection is assumed to have dynamics governed by the diffusion process limit of the Wright-Fisher chain, in which case the genealogy of linked neutral sites can be constructed using the framework of Hudson and Kaplan (1988). However, due to the complicated nature of this model, analytical theory is necessarily approximate and the main focus is on simulation methods. In particular, a number of simulation programs, including mbs (Teshima and Innan, 2009) and msms (Ewing and Hermisson, 2010) have recently appeared to help facilitate the simulation of neutral genealogies linked to sites undergoing a Wright-Fisher diffusion with selection.
Simulations of Wright-Fisher paths under selection can be easily carried out using standard techniques for simulating diffusions. Frequently, however, it is necessary to simulate a Wright-Fisher path conditioned on some particular outcome. For example, to simulate the path of an allele under selection that is currently at frequency x, a time-reversal argument shows that it is possible to simulate a path starting at x conditioned to hit 0 eventually (Maruyama, 1974). However, more complicated scenarios, including the action of natural selection on standing genetic variation, require more elaborate simulation methods (Peter et al., 2012).
The stochastic process describing an allele that starts at frequency x at time 0 and is conditioned to end at frequency y at time T is called a bridge between x and y in time T or a bridge between x and y over the time interval [0, T]. Wright-Fisher diffusion bridges appear naturally in the study of selection acting on standing variation because it is necessary to know the path taken by an allele at current frequency y that fell under the influence of natural selection at a time T generations in the past when it was segregating neutrally at frequency x. Wright-Fisher diffusion bridges are also of interest for their application to inference of selection from allele frequency time series (Bollback et al., 2008; Malaspinas et al., 2012; Mathieson and McVean, 2013; Feder et al., 2013). In particular, analysis of bridges can help determine the extent to which more signal is gained by adding further intermediate time points.
In addition to their applied interest, there are interesting theoretical questions surrounding Wright-Fisher diffusion bridges. For alleles conditioned to eventually fix, Maruyama (1974) showed that the distribution of the trajectory does not depend on the sign of the selection coefficient; that is, both positively and negatively selected alleles with the same absolute value of the selection coefficient exhibit the same dynamics conditioned on eventual fixation. It is natural to inquire whether the analogous result holds for a bridge between any two interior points. Moreover, the degree to which a Wright-Fisher bridge with selection will differ from a Wright-Fisher bridge under neutrality is not known (in connection with this question, we recall the well-known fact that the distribution of a bridge for a Brownian motion with drift does not depend on the drift parameter, and so it is conceivable that the presence of selection has little or no effect on the behavior of Wright-Fisher bridges). Lastly, the characteristics of the sample paths of the frequency of alleles destined to be lost in a fixed amount of time are not only interesting theoretically but may also have applications to geographically structured populations (Slatkin and Excoffier, 2012).
Here we investigate various features of Wright-Fisher diffusion bridges. The paper is structured as follows. First, we establish analytical results for neutral Wright-Fisher bridges. Then, we derive a novel rejection sampler for Wright-Fisher bridges with selection and use it to study the properties of such processes. For example, we estimate the distribution of the maximum of a bridge from 0 to 0 under selection and investigate how this distribution depends on the strength of selection.
2. Background
A Wright-Fisher diffusion with genic selection is a diffusion process {Xt, t ≥ 0} with state space [0, 1] and infinitesimal generator
| (2.1) |
When γ = 0, the diffusion is said to be neutral; otherwise, the drift term captures the strength and direction of natural selection.
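The unconditioned diffusion can be simulated with a standard discretization. The following is a minimal sketch, assuming the generator has the form (1/2) x(1 − x) ∂²/∂x² + γ x(1 − x) ∂/∂x; that scaling, the function name and the clipping rule at the boundaries are assumptions made for illustration, not taken from the text.

```python
import math
import random


def simulate_wf_path(x0, gamma, T, n_steps, rng):
    """Euler-Maruyama path of dX = gamma*X(1-X) dt + sqrt(X(1-X)) dW on [0, T].

    ASSUMPTION: the generator is (1/2) x(1-x) d^2/dx^2 + gamma x(1-x) d/dx.
    Returns a list of n_steps + 1 frequencies, starting at x0.
    """
    dt = T / n_steps
    x = x0
    path = [x]
    for _ in range(n_steps):
        if 0.0 < x < 1.0:  # 0 and 1 are absorbing, so stop moving there
            drift = gamma * x * (1.0 - x)
            noise = math.sqrt(x * (1.0 - x) * dt) * rng.gauss(0.0, 1.0)
            # Clip the Euler step back into the state space [0, 1].
            x = min(1.0, max(0.0, x + drift * dt + noise))
        path.append(x)
    return path


if __name__ == "__main__":
    example = simulate_wf_path(0.3, 5.0, 0.5, 500, random.Random(1))
    print(example[0], example[-1])
```

A scheme of this kind produces unconditioned paths only; it does not by itself yield paths that end at a prescribed frequency at time T.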
The corresponding Wright-Fisher diffusion bridge is the stochastic process that results from conditioning the Wright-Fisher diffusion to start with value x at time 0 and end with value z at time T. Denote by f(x, y; t) the transition density of the diffusion corresponding to (2.1). By the Markov property of the Wright-Fisher diffusion, the bridge is a time-inhomogeneous diffusion and the transition density for the bridge going from state u at time s to state v at time t is
| (2.2) |
The time-inhomogeneous infinitesimal generator of the bridge acting on a test function g at time s is
| (2.3) |
An obvious method for simulating a Wright-Fisher bridge would be to simulate the stochastic differential equation (SDE) corresponding to this infinitesimal generator. There are two obstacles to this approach. Firstly, analytic expressions for the transition density f are only known for the neutral case, and even there they are in the form of infinite series. Secondly, note that the first order coefficient in the infinitesimal generator becomes increasingly singular as s ↑ T; consequently, an attempt to simulate the bridge by simulating the SDE would be quite unstable because the drift term in the SDE would explode at times close to the terminal time T. It is because this naive approach is infeasible that we need to consider the more sophisticated simulation methods explored in this paper.
In addition to conditioning the process to attain a particular value at a particular time, it is possible to condition on a process’s long-term behavior. The transition densities of the conditioned process, fh(x, y; t), are related to the transition densities of the unconditioned process by the usual Doob h-transform formula,
The h-transformed process has infinitesimal generator
| (2.4) |
Note that the finite dimensional marginal distribution at times 0 ≤ t1 ≤ … ≤ tn ≤ T of the Wright-Fisher diffusion bridge starting at x at time 0 and ending at y at time T has density
whereas the analogous density for the corresponding bridge of the h-transformed process is
Thus, the bridges for the two processes have the same distribution.
Typical h-transforms include conditioning a process to eventually hit a particular value, and for the sake of future reference we recall from standard diffusion theory (Rogers and Williams, 2000) that the probability that the Wright-Fisher diffusion started from x eventually hits y is
| (2.5) |
where S is the scale function given by
Thus,
| (2.6) |
when γ ≠ 0
| (2.7) |
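The hitting probabilities can be evaluated numerically from the scale function. The sketch below assumes the generator (1/2) x(1 − x) ∂²/∂x² + γ x(1 − x) ∂/∂x, for which the scale density is proportional to exp(−2γx); the function name and this scaling are assumptions of the example.

```python
import math


def hit_probability(x, y, gamma):
    """Probability that the diffusion started at x ever reaches y.

    ASSUMPTION: scale density exp(-2*gamma*u), i.e. generator
    (1/2) x(1-x) d^2/dx^2 + gamma x(1-x) d/dx.
    For x <= y (with y > 0) the path must reach y before absorption at 0;
    for x > y (with y < 1) it must reach y before absorption at 1.
    """
    if x <= y:
        if gamma == 0.0:
            return x / y
        # (1 - exp(-2*gamma*x)) / (1 - exp(-2*gamma*y)), computed stably
        return math.expm1(-2.0 * gamma * x) / math.expm1(-2.0 * gamma * y)
    if gamma == 0.0:
        return (1.0 - x) / (1.0 - y)
    # (S(1) - S(x)) / (S(1) - S(y)) after cancelling the common factor
    return math.expm1(2.0 * gamma * (1.0 - x)) / math.expm1(2.0 * gamma * (1.0 - y))


if __name__ == "__main__":
    for g in (-5.0, 0.0, 5.0):
        print(g, hit_probability(0.2, 0.5, g))
```

Under this assumption the neutral value x/y is recovered as γ → 0, and positive γ raises the chance of reaching a higher frequency.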
3. Analytic theory for neutral bridges
3.1. Transition densities for the neutral Wright-Fisher diffusion
When there is no natural selection (i.e., γ = 0), the transition densities of the Wright-Fisher diffusion can be expressed
| (3.1) |
where the ql(t) are the transition functions of a death process starting at infinity with death rate n(n − 1)/2 when n individuals are left alive and β(·; α, β) is the density of the Beta distribution with parameters α and β (Ethier and Griffiths, 1993). That is, ql(t) is the probability that a Kingman coalescent tree with infinitely many leaves at time 0 has l lineages present t units of time in the past. In the Appendix we present a related pair of eigenfunction expansions of the transition density.
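The mixture structure behind the neutral transition density suggests a direct way to draw from the neutral transition distribution. The sketch below assumes the usual coalescent representation: draw the number of lineages l from the death process, draw the number k of those lineages descending from the focal allele as Binomial(l, x), and then draw the frequency from a Beta(k, l − k) distribution, with k = 0 and k = l corresponding to loss and fixation. Starting the death process from a finite `n_start` instead of infinity, and both function names, are assumptions of this example.

```python
import random


def lineages_at_time(t, rng, n_start=200):
    """Number of lineages left at time t in a coalescent started from n_start
    lineages (ASSUMPTION: a finite stand-in for 'infinitely many' leaves).
    The death rate with n lineages is n(n-1)/2."""
    n, elapsed = n_start, 0.0
    while n > 1:
        elapsed += rng.expovariate(n * (n - 1) / 2.0)
        if elapsed > t:
            break
        n -= 1
    return n


def sample_neutral_transition(x, t, rng, n_start=200):
    """Draw X_t given X_0 = x for the neutral diffusion via the mixture form."""
    l = lineages_at_time(t, rng, n_start)
    # Number of the l lineages that carry the focal allele: Binomial(l, x).
    k = sum(1 for _ in range(l) if rng.random() < x)
    if k == 0:
        return 0.0  # allele lost by time t
    if k == l:
        return 1.0  # allele fixed by time t
    return rng.betavariate(k, l - k)


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [sample_neutral_transition(0.3, 0.5, rng) for _ in range(2000)]
    print(sum(draws) / len(draws))  # should be close to 0.3 under neutrality
```

The truncation at `n_start` matters only for very small t, when many lineages are still present; it should be increased in that regime.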
Let T2, T3, … be a sequence of independent exponential random variables with rates j(j − 1)/2, j = 2, 3, …. We think of Tj as the length of time in a Kingman coalescent tree when j lineages are present. Thus, the sum of Tj over j ≥ l is the time to l – 1 lineages being present. Write hl(t) for the density of this sum. The Laplace transform of hl is
| (3.2) |
Because
we see that
| (3.3) |
Thus, the Laplace transform of f(x, y; ·) is
| (3.4) |
To construct bridges with 0 as their initial or final points, we need to consider the behavior of the transition density f(x, y; t) as x ↓ 0. Discarding terms that are O(x2), (3.4) is asymptotic to
| (3.5) |
Note that
| (3.6) |
is the Laplace transform of the density of
| (3.7) |
where N - 2 is distributed as the number of failures before the first success in a sequence of i.i.d. Bernoulli trials with success probability y.
3.2. Bridge from 0 to 0 over [0, T]
For x, y ∉ {0, 1}, it follows from (2.2) that the density of Xt given that X0 = x and XT = z is
| (3.8) |
In the second line of (3.8) we used reversibility (before hitting 0 or 1) with respect to the speed measure z−1(1 - z)−1. From (3.4) we know the asymptotic form of (3.8). The limit of
as x ↓ 0 is
| (3.9) |
If z ↓ 0 as well, then the limit is
| (3.10) |
Therefore,
| (3.11) |
The density hl is given by
| (3.12) |
where a(b) := a(a + 1) ⋯ (a + b − 1). In addition, an eigenfunction expansion of the transition density in the Appendix shows that
| (3.13) |
It is clear from the above that the value of the bridge X0,0,[0,T] at time t has the same distribution as its value at time T − t for 0 ≤ t ≤ T, and an elaboration of this argument using (2.2) to compute the finite dimensional distributions of the process X0,0,[0,T] shows the following invariance under time-reversal
where 𝒟̳ denotes equality in distribution.
As T → ∞, the density of the bridge X0,0,[0,T] at a fixed time t > 0 converges to
| (3.14) |
By a similar calculation, we find that, centering around T/2, the limiting density of XT/2+t for −T/2 < t < T/2 fixed is just 6y(1 - y), independent of t.
Moreover, from (2.2) we see that the transition densities of X0,0,[0,T] satisfy
| (3.15) |
For fixed 0 < s < t, this transition density converges to
| (3.16) |
the transition density of the neutral Wright-Fisher diffusion conditioned on non-absorption, a process with infinitesimal generator
| (3.17) |
For fixed −∞ < s < t < ∞, the transition density f0,0,[0,T](u, v; T/2 + s, T/2 + t) converges as T → ∞ to the same limit, and so the finite-dimensional distributions of the process converge to those of the stationary Markov process indexed by the whole real line that is obtained by taking the neutral Wright-Fisher diffusion conditioned on non-absorption in equilibrium.
3.3. Bridge from x to 0 over [0, T]
The density of Xt given that X0 = x and XT = 0 is
| (3.18) |
The derivation of (3.18) is similar to that of (3.11). Note from (2.3) that Xx,0,[0,T] is a time inhomogeneous diffusion with time inhomogeneous infinitesimal generator
| (3.19) |
The transition densities of Xx,0,[0,T] are the same as those of X0,0,[0,T], and so they converge as T → ∞ to those of the neutral Wright-Fisher diffusion conditioned on non-absorption. As one would expect, the first order coefficient in (3.19) converges as T → ∞ to (1 – 2y), the first order coefficient in the infinitesimal generator of the neutral Wright-Fisher diffusion conditioned on non-absorption.
3.4. First passage time distribution
To determine the density of the maximum in a Wright-Fisher diffusion bridge, we will require the first passage time densities of the Wright-Fisher diffusion. Let g(·; x, y) be the first passage time density from x to y. Note that because the Wright-Fisher diffusion starting at x may be absorbed before hitting y, the density g(·; x, y) is improper; that is,
Taking the Laplace transform of the identity
we see that the Laplace transform of g(·; x, y) is
| (3.20) |
Although the Laplace transform (3.20) is easy to evaluate, it appears to be difficult to invert it explicitly because of the denominator.
To gain more insight into first passage times, we consider moments of the first passage time from x to y conditioned on hitting y. By (2.7), the first passage time distribution, conditioned on hitting y, has Laplace transform
Combined with (3.20), the limit of this Laplace transform as x ↓ 0 is
| (3.21) |
It follows that
| (3.22) |
exists and gives the density of the limit as x ↓ 0 of the first passage time from x to y conditional on y being hit. For later use, we record the definition
| (3.23) |
We can now use (3.21) to calculate the mean first passage time from 0 to y conditioned on hitting y. The transition density satisfies the backward equation
Take y > x, multiply by t, integrate from 0 to ∞, and use integration-by-parts to get
| (3.24) |
Set
Use the fact that to rewrite (3.24) as
This ordinary differential equation has the general solution
| (3.25) |
Differentiating (3.5) and sending λ ↓ 0, we find that asymptotically as x ↓ 0,
Thus,
for small x, and hence
| (3.26) |
To find the mean first passage time from 0 to y conditional on y being hit (or, more correctly, the mean of the limit as x ↓ 0 of the first passage time from x to y conditional on y being hit), differentiate (3.21), set λ = 0, and recall that f*(y, y, 0) = 2 to get
| (3.27) |
Note that this mean increases monotonically from 0 to 2 as y goes from 0 to 1.
3.5. Joint density of a maximum and time to hitting in a bridge
For the class of diffusions with inaccessible boundaries, Csáki et al. (1987) studied the joint density of a maximum and its hitting time. This theory is not directly applicable to the Wright-Fisher diffusion because of the absorbing boundaries. However, we may condition the Wright-Fisher process to not be absorbed, thereby making the boundaries inaccessible. By an argument similar to that made in Section 2 for h-transforms, the bridges of this process are the same as the bridges of the unconditioned process. The transition density f̃(x, y; t) and infinitesimal generator ℒ̃ of the conditioned process are given in (3.16) and (3.17), respectively. We will also need the first passage time density for the conditioned process,
along with its scale density,
and speed density
Applying the formula in Theorem A of Csáki et al. (1987), we find that the joint density of the maximum and time of hitting for an arbitrary bridge from x to z in time T is
Taking limits as x, z ↓ 0, we see that the joint density for a bridge from 0 to 0 is
3.6. Maximum in a bridge
Let Mx,z,[0,T] be the maximum of the bridge Xx,z,[0,T], where 0 ≤ x, z ≤ 1.
The occurrence of the event {Mx,z,[0,T] ≥ y} is equivalent to the Wright-Fisher diffusion making a first passage from x to y at some time t ∈ [0, T] and then going on to hit z at time T. Recalling that g(·; x, y) is the density of the first passage from x to y, for 0 < x, z < 1 we have
| (3.28) |
We wish to obtain an expression for ℙ{M0,0,[0,T] ≥ y}. Multiply the numerator and denominator of the right-hand side of (3.28) by x−1, re-write the numerator using the relationship
that follows from the reversibility of the neutral Wright-Fisher process with respect to the speed measure y−1(1 – y)−1 dy, and let x, z ↓ 0 to get
where g◇ was defined in (3.23) and the sequence of polynomials are defined in the Appendix.
The Laplace transform of t ↦ g#(t; y) = yg◇(t; y) is given by (3.21). Although the numerator and denominator of (3.21) can be computed accurately using the orthogonal function expansion, there is no simple way to invert the Laplace transform of the first passage time.
If we write the Laplace transform of g # (t; y)
| (3.29) |
we see that the numerator and denominator are both Laplace transforms of probability distributions because the Green function of the neutral Wright-Fisher diffusion is given by
Equation (3.29) can be rewritten as
which implies the convolution equation
| (3.30) |
The easiest way to solve this equation numerically is by discretization. Take ε > 0 and positive integer K. Let Pε, K and Qε, K be the discrete probability distributions on the set {0, ε, 2ε, …} given by
and
Note that the quantities can be computed accurately using orthogonal function expansions.
Equation (3.30) implies that if Rε,K is the probability distribution on the set {0, ε, 2ε, …} given by
then Pε,K should be approximately the convolution Qε,K * Rε,K. That is, Rε,K ({kε}) should be approximately ck for 0 ≤ k ≤ K, where c0, …, cK is the solution of the system of equations
Therefore, c0 = a0/b0 and we obtain c1, …, cK recursively by
| (3.31) |
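The recursion is a forward substitution for a discrete deconvolution. A minimal sketch follows, assuming a0, …, aK and b0, …, bK are the grid masses of the two known distributions and that b0 is not zero; the function names are hypothetical.

```python
def convolve_truncated(b, c):
    """First len(c) terms of the discrete convolution of b and c."""
    return [
        sum(b[j] * c[k - j] for j in range(min(k, len(b) - 1) + 1))
        for k in range(len(c))
    ]


def deconvolve(a, b):
    """Solve a_k = sum_{j=0}^{k} b_j c_{k-j}, k = 0..K, for c_0..c_K.

    c_0 = a_0 / b_0, and each later c_k follows from the earlier ones.
    ASSUMPTION: b[0] is nonzero; a tiny b[0] makes the recursion ill-conditioned.
    """
    if b[0] == 0:
        raise ValueError("b[0] must be nonzero")
    c = []
    for k in range(len(a)):
        known = sum(b[j] * c[k - j] for j in range(1, min(k, len(b) - 1) + 1))
        c.append((a[k] - known) / b[0])
    return c


if __name__ == "__main__":
    b = [0.5, 0.3, 0.2]
    c_true = [0.1, 0.4, 0.5]
    a = convolve_truncated(b, c_true)
    print(deconvolve(a, b))  # recovers c_true up to rounding
```

Because each step divides by b0, errors in the early terms propagate to later ones, so the input masses need to be computed accurately.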
Thus,
| (3.32) |
where
3.7. Numerical calculations
The infinite series in (3.32) was approximated using the first 3000 terms. The step size in the discrete first passage time approximation was taken to be ε = 0.001 and the number of points was taken to be K = 5000.
Distribution function of the maximum in a bridge M.
| T | 0.05 | 0.10 | 0.15 | 0.20 | 0.25 | 0.30 | 0.35 | 0.40 | 0.45 | 0.50 |
| 0.5 | 0.0 | 0.02 | 0.17 | 0.43 | 0.66 | 0.83 | 0.92 | 0.96 | 0.99 | 0.99 |
| 1.0 |  |  | 0.0 | 0.02 | 0.09 | 0.21 | 0.36 | 0.52 | 0.66 | 0.77 |
| 1.5 |  |  |  | 0.0 | 0.01 | 0.03 | 0.08 | 0.17 | 0.28 | 0.40 |
| 2.0 |  |  |  |  |  | 0.0 | 0.02 | 0.04 | 0.09 | 0.17 |
| T | 0.55 | 0.60 | 0.65 | 0.70 | 0.75 | 0.80 | 0.85 | 0.90 | 0.95 | 1.0 |
| 0.5 | 1.0 | |||||||||
| 1.0 | 0.85 | 0.91 | 0.95 | 0.97 | 0.99 | 0.99 | 1.0 | |||
| 1.5 | 0.52 | 0.63 | 0.73 | 0.82 | 0.88 | 0.93 | 0.96 | 0.99 | 1.0 | |
| 2.0 | 0.26 | 0.37 | 0.48 | 0.59 | 0.70 | 0.97 | 0.87 | 0.93 | 0.97 | 1.0 |
| T | 0.01 | 0.02 | 0.03 | 0.04 | 0.05 | 0.06 |
| 0.1 | 0.00 | 0.01 | 0.14 | 0.37 | 0.59 | 0.76 |
| T | 0.07 | 0.08 | 0.09 | 0.10 | 0.11 | 0.12 |
| 0.1 | 0.86 | 0.93 | 0.96 | 0.98 | 0.99 | 1.0 |
The distribution function behaves as expected. If T = 0.1 the maximum is very small, and its distribution function is shown in a separate table on a finer scale: M is less than 0.06 with probability 0.76 and less than 0.12 with probability 1.0. If T = 0.5 the maximum is still small, but larger than when T = 0.1, with a probability of 0.17 of being greater than 0.3 and a probability of 1.0 of being less than 0.55. If T = 1.0, 1.5, 2.0 the maximum is increasingly larger, with respective probabilities of exceeding 0.5 of 0.23, 0.60, 0.83, and when T = 2 the probability of exceeding 0.75 is 0.30. Recall that the mean time to coalescence of a population to a single ancestor is 2 time units.
4. Rejection sampling Wright-Fisher bridge paths
4.1. General framework
When selection is incorporated into the Wright-Fisher model, there is no known series formula for the transition density akin to (3.1) (but see Kimura (1955) and Kimura (1957) for attempts using perturbation theory, as well as Song and Steinrücken (2012) and Steinrücken et al. (2012) for methods of approximating an eigenfunction expansion computationally). Therefore, analytical results for distributions associated with the corresponding bridge like those we obtained in the neutral case are not available. Instead, we develop a rejection sampling method that can sample paths of Wright-Fisher diffusion bridges with genic selection efficiently for the purpose of investigating their properties. In this work, we focus on a diffusion with genic selection, instead of general diploid selection, for analytical convenience. The following approach would apply even in the more general case.
Before we explain how rejection sampling can be used to sample paths of a Wright-Fisher bridge, we first describe the analogous, but simpler, method for sampling paths of diffusion bridges that have distributions which are absolutely continuous with respect to that of a Brownian bridge. Fix x, z ∈ ℝ and T > 0. Let 𝕎 be the distribution of the Brownian bridge from x to z over the time interval [0, T], and let ℙ be the distribution of the path of a bridge from x to z over the time interval [0, T] for a diffusion with infinitesimal generator
| (4.1) |
It follows from Girsanov’s theorem (see, for example, Rogers and Williams (2000)) that the probability measure ℙ is absolutely continuous with respect to 𝕎 with Radon-Nikodym derivative (that is, density)
| (4.2) |
for the path ω, where the first integral in (4.2) is an Itô integral. See Beskos and Roberts (2005) for the details of the disintegration argument that deduces this fact about Radon-Nikodym derivatives with respect to the Brownian bridge distribution from the usual statement of Girsanov’s theorem, which concerns Radon-Nikodym derivatives with respect to the distribution of Brownian motion. Because a Brownian bridge can be constructed using a simple transformation of a Brownian motion (namely, if B is a standard Brownian motion, then the process x + Bt − (t/T)(x + BT − z), 0 ≤ t ≤ T, has the distribution 𝕎), it is computationally feasible to obtain fine-grained samples of the Brownian bridge. Once we have a sequence of Brownian bridge paths, (4.2) can be used to compute a likelihood ratio, and a standard rejection sampling scheme can then be utilized to obtain realizations of diffusion bridge paths; see Beskos and Roberts (2005) for examples of extensions to this approach.
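The Brownian bridge proposals can be generated on a time grid from a single Brownian path. The sketch below uses the standard construction x + Bt − (t/T)(x + BT − z); the function name and the uniform grid are choices made for this example.

```python
import math
import random


def brownian_bridge(x, z, T, n_steps, rng):
    """Brownian bridge from x to z over [0, T] on a uniform grid.

    Builds a standard Brownian path B with B_0 = 0 and returns
    x + B_t - (t/T) * (x + B_T - z) at the n_steps + 1 grid times.
    """
    dt = T / n_steps
    b = [0.0]
    for _ in range(n_steps):
        b.append(b[-1] + math.sqrt(dt) * rng.gauss(0.0, 1.0))
    b_end = b[-1]
    return [
        x + b[i] - (i / n_steps) * (x + b_end - z)
        for i in range(n_steps + 1)
    ]


if __name__ == "__main__":
    bridge = brownian_bridge(0.5, 2.0, 0.1, 100, random.Random(3))
    print(bridge[0], bridge[-1])  # endpoints are pinned at x and z
```

A rejection sampler would then evaluate the likelihood ratio on each proposed path and accept with probability proportional to it, which requires an upper bound on that ratio.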
This method is not immediately applicable to the Wright-Fisher bridge because its infinitesimal generator is not of the form (4.1). However, it was shown on pp 119–120 of Wright (1931) that if X is the Wright-Fisher process with infinitesimal generator (2.1), then the transformation
| (4.3) |
suggested in Fisher (1922) produces a diffusion process Y on the state space [0, π] with infinitesimal generator
Because Y has absorbing boundaries at 0 and π, sampling paths of bridges for Y by sampling Brownian bridges can involve extremely high rejection rates. More specifically,
and so the likelihood ratio (4.2) becomes extremely small for paths that spend a significant amount of time near 0. A similar phenomenon occurs near π.
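For concreteness, the change of variable can be coded as a pair of inverse maps. This sketch assumes that (4.3) is Fisher's angular transformation Y = arccos(1 − 2X), which sends the frequency interval [0, 1] onto [0, π]; the function names are hypothetical.

```python
import math


def to_angle(x):
    """Map a frequency x in [0, 1] to y = arccos(1 - 2x) in [0, pi].

    ASSUMPTION: this is the transformation referred to as (4.3).
    """
    return math.acos(1.0 - 2.0 * x)


def to_frequency(y):
    """Inverse map: y in [0, pi] back to the frequency (1 - cos y) / 2."""
    return (1.0 - math.cos(y)) / 2.0


if __name__ == "__main__":
    for x in (0.0, 0.01, 0.5, 0.8, 1.0):
        y = to_angle(x)
        print(x, y, to_frequency(y))
```

Under this map the diffusion coefficient becomes constant, which is what allows Brownian or Bessel bridges to serve as proposals on the transformed scale.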
To overcome the difficulty near 0, we develop a rejection sampling scheme where the proposals are realizations of a process other than the Brownian bridge.
As a first step, consider the Wright-Fisher diffusion conditioned to be eventually absorbed at 1. By the argument given in Section 2, this process has the same bridges as the unconditional process. It follows from (2.6) and (2.7) with y = 1 that the probability the process starting from x is absorbed at 1 is
The transition densities of the conditioned process, fh(x, y; t), are related to the unconditional transition densities by the usual Doob h-transform formula
The corresponding infinitesimal generator is
| (4.4) |
Applying the transformation (4.3) to the process with infinitesimal generator (4.4) results in a process with infinitesimal generator
| (4.5) |
Note that
| (4.6) |
and
| (4.7) |
Moreover, if ℚ is the distribution of a bridge from x to z over the time interval [0, T] for some diffusion with infinitesimal generator
and ℙ is the distribution of a bridge from x to z over the time interval [0, T] for the diffusion with infinitesimal generator (4.1), then
This suggests that a better rejection sampling scheme for bridges of the process Y with end points close to zero will result when the proposals come from a diffusion with an infinitesimal generator having a first order coefficient with a singularity at zero matching the one appearing in both (4.6) and (4.7).
For such a modified scheme to be feasible, it is necessary to work with a proposal diffusion for which it is easy to simulate the associated bridges. We now introduce such a process. The 4-dimensional Bessel process is the radial part of a 4-dimensional Brownian motion. That is, if is a vector of 4 independent one-dimensional Brownian motions, then
is a 4-dimensional Bessel process (see Revuz and Yor (1999, Section XI.1) for a thorough discussion of Bessel processes). The 4-dimensional Bessel process is a diffusion with infinitesimal generator
Letting ℙ (resp. 𝔹) be the distribution of the bridge for the process with infinitesimal generator (4.5), and hence the distribution of the transformed Wright-Fisher diffusion Y, (resp. the 4-dimensional Bessel bridge) from x to z over the time interval [0, T], we have
| (4.8) |
We next explain how to simulate a 4-dimensional Bessel bridge. We can construct the bridge from u ∈ ℝ4 to v ∈ ℝ4 over the time interval [0, T] for the 4-dimensional Brownian motion as
where B0 = 0. The distribution of u + BT conditional on |u + BT| = z has density proportional to w ↦ exp(w · u/T) with respect to the normalized surface measure on the sphere centered at the origin with radius z, where w · u is the usual scalar product of the two vectors w, u ∈ ℝ4. Hence, a 4-dimensional Bessel bridge from x to z over the time interval [0, T] is given by
where B0 = 0, u ∈ ℝ4 is any vector with |u| = x, and V is a random vector taking values on the sphere centered at the origin with radius z that is independent of B and has a density with respect to the normalized surface measure that is proportional to w ↦ exp(w · u/T). Note that the random vector V/z, which takes values on the unit sphere centered at the origin, has a Fisher – von Mises distribution with mean vector u/x and concentration parameter xz/T (see, for example, Mardia et al. (1979, Ch. 15)).
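The construction can be sketched on a time grid as follows. The example takes u = (x, 0, 0, 0), so the mean direction is the first coordinate axis, and draws the endpoint direction by a simple accept/reject step on the cosine of the angle to that axis. This naive sampler is an assumption of the sketch: it is correct but becomes slow when the concentration xz/T is large, where a dedicated Fisher-von Mises sampler would be preferable. All function names are hypothetical.

```python
import math
import random


def sample_endpoint_direction(kappa, rng):
    """Unit vector v in R^4 with density proportional to exp(kappa * v[0])
    with respect to the uniform measure on the unit sphere."""
    while True:
        # Under the uniform measure on the unit sphere in R^4 the first
        # coordinate has density proportional to sqrt(1 - w^2),
        # which is the law of 1 - 2 * Beta(3/2, 3/2).
        w = 1.0 - 2.0 * rng.betavariate(1.5, 1.5)
        # Accept with probability exp(kappa * (w - 1)) <= 1.
        if math.log(1.0 - rng.random()) <= kappa * (w - 1.0):
            break
    g = [rng.gauss(0.0, 1.0) for _ in range(3)]
    norm = math.sqrt(sum(c * c for c in g))
    s = math.sqrt(max(0.0, 1.0 - w * w))
    return [w] + [s * c / norm for c in g]


def bessel4_bridge(x, z, T, n_steps, rng):
    """4-dimensional Bessel bridge from x to z over [0, T] on a uniform grid."""
    u = [x, 0.0, 0.0, 0.0]
    v = [z * c for c in sample_endpoint_direction(x * z / T, rng)]
    dt = T / n_steps
    b = [[0.0] * 4]
    for _ in range(n_steps):
        b.append([c + math.sqrt(dt) * rng.gauss(0.0, 1.0) for c in b[-1]])
    b_end = b[-1]
    path = []
    for i in range(n_steps + 1):
        s = i / n_steps
        # Brownian bridge in R^4 from u to v, then take the radial part.
        point = [(1.0 - s) * u[d] + s * v[d] + b[i][d] - s * b_end[d]
                 for d in range(4)]
        path.append(math.sqrt(sum(c * c for c in point)))
    return path


if __name__ == "__main__":
    radial = bessel4_bridge(0.3, 1.2, 0.1, 200, random.Random(7))
    print(radial[0], radial[-1])
```

Fixing u along the first axis loses no generality, since the radial part does not depend on which vector of length x is chosen.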
Increasing the strength of natural selection causes the Wright-Fisher bridge to move faster for intermediate frequencies, but the method proposed above uses the same 4-dimensional Bessel bridge regardless of the value of the selection parameter γ, and so the rejection rate can become very high for large values of γ. To deal with this phenomenon, we introduce the following further refinement to the proposal process. With ℙ the distribution of the transformed Wright-Fisher bridge from x to z over the time interval [0, T] as above, let ωε : [0, T] → [0, π], ε > 0, be the path that maximizes
Then, ωε converges as ε ↓ 0 to a path ω*. Heuristically, we can think of ω* as the path that has “maximum probability” or is “modal” for ℙ. This path is sometimes called an Onsager-Machlup function and it can be found by solving a certain variational problem – see, for example, Ikeda and Watanabe (1989). For the transformed Wright-Fisher bridge, an analysis of the variational problem shows that the maximum probability path satisfies the second order ordinary differential equation
| (4.9) |
with boundary conditions
With a solution to (4.9) in hand, it is possible to construct a better proposal distribution by linking together bridges that are “close” to the maximum probability path. First, choose a number of discretization points N and take times 0 < t1 < … < tN < T. Then, sample independent random variables U1, U2, …, UN with densities g1, g2, …, gN to be specified later. Put t0 = 0, tN+1 = T, U0 = x and UN+1 = z. Build conditionally independent 4-dimensional Bessel bridges from Ui to Ui+1 over the time intervals [ti, ti+1]. The distribution of Ui should be chosen so that Ui is close to the maximum probability path at time ti; we choose re-scaled Beta distributions with mode at the solution of (4.9) at time ti. More specifically, we set Ui = πXi, where Xi has the Beta distribution with parameters
for some free parameter θ. We used the particular value θ = 50 for the examples in this paper, but other values of θ could be used in a given situation in an attempt to optimize the frequency of rejection.
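A re-scaled Beta draw with a prescribed mode can be sketched as follows. The parametrization used here, α = θm + 1 and β = θ(1 − m) + 1 with m the target mode divided by π, is an assumption chosen because it places the mode of the Beta distribution exactly at m; it may differ from the parameters used in the text. The function names are hypothetical.

```python
import math
import random


def beta_waypoint_parameters(mode_angle, theta):
    """Beta parameters whose mode is mode_angle / pi.

    ASSUMPTION: alpha = theta*m + 1, beta = theta*(1 - m) + 1, so that
    (alpha - 1) / (alpha + beta - 2) = m. Larger theta concentrates the
    draw more tightly around the mode.
    """
    m = mode_angle / math.pi
    return theta * m + 1.0, theta * (1.0 - m) + 1.0


def sample_waypoint(mode_angle, theta, rng):
    """Draw U = pi * X with X Beta-distributed around the target mode."""
    alpha, beta = beta_waypoint_parameters(mode_angle, theta)
    return math.pi * rng.betavariate(alpha, beta)


if __name__ == "__main__":
    rng = random.Random(11)
    print([round(sample_waypoint(0.3 * math.pi, 50.0, rng), 3) for _ in range(5)])
```

The density of each draw is needed later for the acceptance probability, so in a full sampler the Beta density at the drawn value would be stored alongside the waypoint.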
By stringing these bridges together, we get a path going from x to z over the time interval [0, T]. However, the distribution of this path is certainly not that of the 4-dimensional Bessel bridge because of the manner in which we have chosen the endpoints of the component bridges.
Therefore, we cannot simply use the Radon-Nikodym derivative (4.8) as it stands to construct a rejection sampling procedure. Rather, if we let ℚ be the distribution of the path built by stringing the bridges together, then we must accept a path ω with probability proportional to
| (4.10) |
Note that
| (4.11) |
where
| (4.12) |
is the transition density of the 4-dimensional Bessel process with Iv the modified Bessel function of the first kind.
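The transition density can be evaluated with the standard library alone. The sketch below uses the textbook form for the Bessel process of dimension 4 (index 1), p_t(x, y) = (y²/(xt)) exp(−(x² + y²)/(2t)) I1(xy/t), with the limit y³/(2t²) exp(−y²/(2t)) at x = 0, and a plain power series for I1. The series is an assumption of convenience: it is adequate for moderate arguments but overflows when xy/t is in the high hundreds, where an exponentially scaled Bessel routine would be needed.

```python
import math


def bessel_i1(z):
    """Modified Bessel function I_1(z) by its power series
    sum_k (z/2)^(2k+1) / (k! (k+1)!). Suitable for moderate z only."""
    term = z / 2.0
    total = term
    k = 0
    while abs(term) > 1e-17 * abs(total) and k < 1000:
        k += 1
        term *= (z / 2.0) ** 2 / (k * (k + 1))
        total += term
    return total


def bessel4_transition_density(x, y, t):
    """Transition density from x to y in time t for the 4-dimensional
    Bessel process (textbook form for index 1)."""
    if x == 0.0:
        return y ** 3 / (2.0 * t * t) * math.exp(-y * y / (2.0 * t))
    return (y * y / (x * t)) * math.exp(-(x * x + y * y) / (2.0 * t)) * bessel_i1(x * y / t)


if __name__ == "__main__":
    # Riemann-sum check that the density integrates to about 1 in y.
    step = 0.01
    total = sum(bessel4_transition_density(1.0, k * step, 0.5) * step
                for k in range(801))
    print(total)
```

Working with the logarithm of this density is advisable when many factors are multiplied together in an acceptance probability.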
To demonstrate the effectiveness of the rejection sampling scheme, Figure 7.1 shows Q-Q plots of the one-dimensional marginal at time t of a Wright-Fisher bridge with genic selection as estimated using the rejection sampler compared to an approximation that uses the method of Song and Steinrücken (2012) to compute the cumulative distribution function of the marginal. For both rows, the bridge goes from x = 0.2 to z = 0.7 over the time interval [0, T] = [0, 0.1]. The left panels correspond to t = 0.03 and the right panels correspond to t = 0.07. The top row corresponds to γ = 10 and the bottom row to γ = 50, demonstrating the effectiveness of the rejection sampling scheme over a wide range of selection coefficients.
Figure 7.1.
Q-Q plot showing the accuracy of the rejection sampling scheme. Theoretical quantiles were calculated using the method of Song and Steinrücken (2012) and sample quantiles are determined from 1000 bridges simulated using the method described in the text. The bridge goes from x = 0.2 to z = 0.7 over the time interval [0, T] = [0, 0.1]. The left panels correspond to t = 0.03 and the right panels correspond to t = 0.07. The top row corresponds to γ = 10 and the bottom row to γ = 50.
Figure 7.2 demonstrates the behavior of a Wright-Fisher diffusion bridge as the selection coefficient increases. A bridge from x = 0.01 to z = 0.8 over the time interval [0, T] = [0, 0.1] is shown for γ = 0, γ = 50 and γ = 100. As the selection coefficient increases, the proportion of time the bridge spends near the boundary also increases, because the Wright-Fisher diffusion moves faster when it is away from the boundaries. In addition, the paths that the bridge takes become more tightly centered around the most probable path as the selection coefficient increases.
Being able to sample Wright-Fisher bridge paths makes it very easy to numerically approximate the distribution and expectation of various functionals of the path. As an example, Figure 7.3 shows the density of the maximum in a bridge from x = 0 to z = 0 over the time interval [0, T] = [0, 0.1] for γ = 0, γ = 50 and γ = 100. Note that the maximum in the bridge decreases as the strength of selection increases, and also becomes more tightly concentrated around its expectation.
To gain a more quantitative understanding of the extent to which a bridge for an allele experiencing natural selection looks different from the bridge for a neutral allele, it is possible to compute the Radon-Nikodym derivative (i.e. the likelihood ratio) of the distribution under selection against the distribution under neutrality. Using an argument similar to that which led to (4.8), the likelihood ratio is
| (4.13) |
where the constant of proportionality only depends on the endpoints. A few things are immediately evident from (4.13). First of all, the likelihood ratio does not depend on the sign of the selection coefficient, only the magnitude. This is analogous to the result of Maruyama (1974) that, conditioned on eventual fixation, the sign of the selection coefficient is irrelevant to the distribution of the Wright-Fisher diffusion path. Also apparent is that bridges with strong natural selection will be more likely to be found near the boundary than bridges under neutrality. Finally, because 0 ≤ sin²(x) ≤ 1, we see that, very loosely, a bridge will look approximately neutral if
| (4.14) |
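The qualitative claims about the likelihood ratio can be checked numerically on a discretized path. The sketch below assumes the generator (1/2) x(1 − x) ∂²/∂x² + γ x(1 − x) ∂/∂x, for which a Girsanov computation on the transformed scale gives a log likelihood ratio of −(γ²/8) ∫ sin²(ωs) ds up to an additive constant depending only on the endpoints. The factor 1/8 is tied to that assumed scaling and may differ from the constant in (4.13); the function name is hypothetical.

```python
import math


def log_likelihood_ratio(path, dt, gamma):
    """Log likelihood ratio (selection versus neutral) of a transformed-scale
    path sampled every dt time units, up to an endpoint-dependent constant.

    ASSUMPTION: the ratio is exp(-(gamma^2 / 8) * integral of sin^2(path)).
    The integral is approximated by the trapezoidal rule.
    """
    integral = sum(
        0.5 * (math.sin(a) ** 2 + math.sin(b) ** 2) * dt
        for a, b in zip(path, path[1:])
    )
    return -(gamma ** 2 / 8.0) * integral


if __name__ == "__main__":
    n, T = 100, 0.1
    interior = [math.pi / 2] * (n + 1)   # path sitting at frequency 1/2
    boundary = [0.0] * (n + 1)           # path sitting at the boundary
    for g in (10.0, -10.0, 50.0):
        print(g, log_likelihood_ratio(interior, T / n, g),
              log_likelihood_ratio(boundary, T / n, g))
```

Under this assumption the value is the same for γ and −γ, is largest for paths that stay near the boundaries, and is never smaller than −γ²T/8.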
Figure 7.2.
Plot showing the properties of bridge paths as the strength of selection increases. Each bridge is from x = 0.01 to z = 0.8 over the time interval [0, T] = [0, 0.1]. The successive selection coefficients are γ = 0, γ = 50 and γ = 100. For each selection coefficient, pointwise 0%, 25%, 50%, 75% and 100% quantiles are calculated. Solid line is the 50% quantile, dashed line indicates 25% and 75% quantiles, and the dotted line indicates 0% and 100% quantiles.
Figure 7.3.
Densities of the maximum in a 0 to 0 bridge over the time interval [0, T] = [0, 0.1] for the selection strengths γ = 0, γ = 50 and γ = 100.
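Given a collection of sampled bridge paths, summaries like those in Figures 7.2 and 7.3 reduce to simple array operations. The following is a minimal sketch, assuming the sampled paths are stored as a NumPy array with one row per path and one column per time point. The function names and this storage layout are illustrative assumptions rather than a description of the authors' code.

```python
import numpy as np

def pointwise_quantiles(paths, probs=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Quantiles of the allele frequency at each time point, across paths.

    paths: array of shape (n_paths, n_times), one sampled bridge per row.
    Returns an array of shape (len(probs), n_times).
    """
    paths = np.asarray(paths, dtype=float)
    return np.quantile(paths, probs, axis=0)

def path_maxima(paths):
    """Maximum frequency attained along each sampled path (one value per path)."""
    return np.asarray(paths, dtype=float).max(axis=1)
```

A histogram or kernel density estimate of `path_maxima(paths)` then approximates the density of the maximum, and its sample mean approximates the expected maximum.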
5. Discussion
We have examined the behavior of Wright-Fisher diffusion bridges under both neutral models and models with genic selection. Although various conditioned Wright-Fisher diffusions have been studied in the past, Wright-Fisher diffusions conditioned to attain a specific value at a predetermined time have not been studied extensively. We have elucidated some of the properties of Wright-Fisher bridges using a combination of analytical theory and simulations.
In contrast to Brownian motion with drift, for which the distribution of a bridge does not depend on the magnitude of the drift coefficient, the distribution of a Wright-Fisher bridge does depend on the magnitude of the selection coefficient. As one might expect, bridges under strong selection are more constrained than neutral bridges. This can clearly be seen in Figure 7.2, in which the bridge with γ = 0 has a broad range, but when γ = 100 the paths of the bridge are highly likely to be confined near the boundary at 0 until quite late in the bridge. A similar conclusion can be drawn from Figure 7.3, which shows the density of the maximum in a bridge from 0 to 0 over the time interval [0, T] = [0, 0.1]. The expected maximum of a neutral bridge is much higher than that of a bridge under strong selection, and there is significantly more variance about that maximum under neutrality.
Much of the behavior of Wright-Fisher bridges under selection can be understood in terms of the likelihood ratio (4.13). Because sin(x) takes its smallest values for x ≈ 0 and x ≈ π, very strong selection will confine a bridge of the transformed process Y to near these boundaries. Intuitively, this is because the Wright-Fisher diffusion has the largest magnitude of drift and diffusion coefficients at x = 0.5, and thus the diffusion moves “faster” when it is away from the boundaries 0 and 1. In order for a diffusion with a large selection coefficient to reach an interior point after a large amount of time, it must spend most of that time near the boundary.
However, these differences between selection and neutrality are mostly apparent in cases of extreme selection coefficients or very long times. This has important implications for maximum likelihood inference of selection coefficients from allele frequency time series. Because the realizations are likely to be quite similar for a selected allele and a neutral allele when the selection coefficient is moderate, most of the information about the selection coefficient comes from the end-points. This is consistent with the work of Watterson (1979), who showed that even with the whole sample path, it is difficult to reject neutrality when selection is weak. Therefore, in many cases increasing the time-density of samples may not provide much additional information about the selection coefficient. Because many allelic time-series are obtained via costly ancient DNA techniques, this is an important consideration for the many researchers who are interested in the history of selection acting on a particular allele.
In addition to results directly concerning bridges, we have made several technical advances in the analysis of the Wright-Fisher diffusion. We have developed the theory of first passage times of a neutral Wright-Fisher diffusion starting from low frequency, and we were able to provide a closed-form expression for the density of the maximum in a neutral bridge that goes from 0 to 0.
While our rejection sampling scheme is similar to that of Beskos and Roberts (2005) in some regards, there are several differences. Primarily, we do not provide exact samples, in the sense that Beskos and Roberts (2005) do. Because we store a discrete representation of our proposal bridges in computer memory, the calculation of (4.8) is necessarily an approximation, and hence the samples are only approximate. However, Figure 7.1 shows that they are extremely accurate. Also, because we are concerned with a specific model, we used 4-dimensional Bessel bridges, instead of Brownian bridges, in our proposal mechanism. This choice is superior for the Wright-Fisher diffusion because both the Bessel bridge and the Wright-Fisher bridge have boundaries at 0 with asymptotically equivalent singularities in the drift coefficient, while the Brownian bridge can assume negative values and hence results in an unacceptably high rejection rate when it is used as a proposal distribution. Ideally, we would sample from a proposal distribution that describes a diffusion that was also bounded above and had a suitable singularity in its drift coefficient at the upper boundary; however, we have not yet discovered an appropriate diffusion for which it is easy to sample the corresponding bridges. Finally, we make use of the “most likely” bridge path as a means of guiding samples of bridges that are likely to be extremely different from those generated by the 4-dimensional Bessel bridge proposal distribution. This modification is akin to shifting the mean of a proposal distribution when doing rejection sampling of a 1-dimensional random variable, and it greatly increases the efficiency of sampling.
Acknowledgments
The authors thank M. Slatkin and B. Peter for initial discussions that led to our interest in this topic.
JGS supported in part by NIH NRSA trainee appointment grant T32 HG 00047 and by NIH grant R01-GM40282.
RCG supported by Miller Institute for Basic Research in Science, University of California at Berkeley.
SNE supported in part by NSF grant DMS-0907630.
7. Appendix
7.1. Eigenfunction expansions of the transition density
Eigenfunction expansions of the Wright-Fisher transition densities in the case of no mutation were first explored in Kimura (1957).
The form given in Crow and Kimura (1970) is
where is the Gegenbauer polynomial with λ = 3/2.
An explicit formula for the Gegenbauer polynomial is
The generating function for the sequence is
Note that
and the right-hand side is (n + 1)(n + 2)/2 when λ = 3/2.
The sequence of polynomials satisfies the three-term recurrence
with initial conditions . It is convenient in computations to use the scaled polynomials which are bounded in modulus by unity on the interval [−1, +1]. The corresponding three-term recurrence for the sequence is
with initial conditions P0(x) = 1 and P1(x) = x.
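The scaled recurrence is convenient to implement directly. The sketch below assumes the standard three-term recurrence for Gegenbauer polynomials, n C_n(x) = 2(n + λ − 1) x C_{n−1}(x) − (n + 2λ − 2) C_{n−2}(x), specialized to λ = 3/2, together with the scaling P_n(x) = C_n(x)/C_n(1), where C_n(1) = (n + 1)(n + 2)/2. Under these assumptions the scaled recurrence is (n + 2) P_n(x) = (2n + 1) x P_{n−1}(x) − (n − 1) P_{n−2}(x). The function name is hypothetical.

```python
import numpy as np

def scaled_gegenbauer(n_max, x):
    """Return P_0(x), ..., P_{n_max}(x) for lambda = 3/2, where
    P_n(x) = C_n(x) / C_n(1) and C_n(1) = (n + 1)(n + 2) / 2.

    Uses the assumed scaled recurrence
        (n + 2) P_n = (2n + 1) x P_{n-1} - (n - 1) P_{n-2},
    with P_0 = 1 and P_1 = x.
    """
    x = np.asarray(x, dtype=float)
    P = np.empty((n_max + 1,) + x.shape)
    P[0] = 1.0
    if n_max >= 1:
        P[1] = x
    for n in range(2, n_max + 1):
        P[n] = ((2 * n + 1) * x * P[n - 1] - (n - 1) * P[n - 2]) / (n + 2)
    return P
```

Working with the scaled polynomials keeps every term of a truncated expansion bounded by one in modulus on [−1, +1], so the growth of C_n(1) with n can be absorbed into the series coefficients rather than into the polynomial values.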
The transition density written with the scaled polynomials is
The asymptotic form of the transition density as x ↓ 0 is
| (7.1) |
Also,
We also use a form of the expansion that is formally equivalent to the one above – see Griffiths and Spanó (2010). The expansion is
| (7.2) |
where
| (7.3) |
and
| (7.4) |
Note that
as x ↓ 0. Therefore,
| (7.5) |
which is equal to (3.9). To calculate
we observe that
| (7.6) |
Therefore,
| (7.7) |
Contributor Information
Joshua G. Schraiber, Department of Integrative Biology, University of California, 3060 Valley Life Sciences Bldg #3140, Berkeley, CA 94720-3140, U.S.A. jgschraiber@berkeley.edu
Robert C. Griffiths, Department of Statistics, 1 South Parks Road, Oxford OX1 3TG, U.K. griff@stats.ox.ac.uk
Steven N. Evans, Department of Statistics, University of California, 367 Evans Hall #3860, Berkeley, CA 94720-3860, U.S.A. evans@stat.berkeley.edu
References
- Smith JM, Haigh J. The hitch-hiking effect of a favourable gene. Genetical Research. 1974;23:23–35.
- Kaplan NL, Hudson RR, Langley CH. The “hitchhiking effect” revisited. Genetics. 1989;123:887–899. doi: 10.1093/genetics/123.4.887.
- Nielsen R, Williamson S, Kim Y, Hubisz MJ, Clark AG, Bustamante C. Genomic scans for selective sweeps using SNP data. Genome Research. 2005;15:1566–1575. doi: 10.1101/gr.4252305.
- Etheridge A, Pfaffelhuber P, Wakolbinger A. An approximate sampling formula under genetic hitchhiking. Ann. Appl. Probab. 2006;16:685–729.
- Hudson RR, Kaplan NL. The coalescent process in models with selection and recombination. Genetics. 1988;120:831–840. doi: 10.1093/genetics/120.3.831.
- Teshima KM, Innan H. mbs: modifying Hudson’s ms software to generate samples of DNA sequences with a biallelic site under selection. BMC Bioinformatics. 2009;10:166. doi: 10.1186/1471-2105-10-166.
- Ewing G, Hermisson J. MSMS: a coalescent simulation program including recombination, demographic structure and selection at a single locus. Bioinformatics (Oxford, England). 2010;26:2064–2065. doi: 10.1093/bioinformatics/btq322.
- Maruyama T. The age of an allele in a finite population. Genetical Research. 1974;23:137–143. doi: 10.1017/s0016672300014750.
- Peter BM, Huerta-Sanchez E, Nielsen R. Distinguishing between selective sweeps from standing variation and from a de novo mutation. PLoS Genetics. 2012;8:e1003011. doi: 10.1371/journal.pgen.1003011.
- Bollback JP, York TL, Nielsen R. Estimation of 2Nes from temporal allele frequency data. Genetics. 2008;179:497–502. doi: 10.1534/genetics.107.085019.
- Malaspinas AS, Malaspinas O, Evans SN, Slatkin M. Estimating allele age and selection coefficient from time-serial data. Genetics. 2012;192:599–607. doi: 10.1534/genetics.112.140939.
- Mathieson I, McVean G. Estimating selection coefficients in spatially structured populations from time series data of allele frequencies. Genetics. 2013;193:973–984. doi: 10.1534/genetics.112.147611.
- Feder A, Kryazhimskiy S, Plotkin JB. Identifying signatures of selection in genetic time series. arXiv preprint arXiv:1302.0452. 2013. doi: 10.1534/genetics.113.158220.
- Slatkin M, Excoffier L. Serial founder effects during range expansion: a spatial analog of genetic drift. Genetics. 2012;191:171–181. doi: 10.1534/genetics.112.139022.
- Rogers LCG, Williams D. Diffusions, Markov processes, and martingales. Vol. 2: Itô calculus. Reprint of the second (1994) edition. Cambridge Mathematical Library. Cambridge: Cambridge University Press; 2000.
- Ethier SN, Griffiths RC. The transition function of a Fleming-Viot process. Ann. Probab. 1993;21:1571–1590.
- Csáki E, Földes A, Salminen P. On the joint distribution of the maximum and its location for a linear diffusion. Ann. Inst. H. Poincaré Probab. Statist. 1987;23:179–194.
- Kimura M. Stochastic processes and distribution of gene frequencies under natural selection. Cold Spring Harbor Symposia on Quantitative Biology. Vol. 20. Cold Spring Harbor Laboratory Press; 1955. pp. 33–53.
- Kimura M. Some problems of stochastic processes in genetics. Ann. Math. Statist. 1957;28:882–901.
- Song YS, Steinrücken M. A simple method for finding explicit analytic transition densities of diffusion processes with general diploid selection. Genetics. 2012;190:1117–1129. doi: 10.1534/genetics.111.136929.
- Steinrücken M, Wang Y, Song YS. An explicit transition density expansion for a multi-allelic Wright–Fisher diffusion with general diploid selection. Theoretical Population Biology. 2012. doi: 10.1016/j.tpb.2012.10.006.
- Beskos A, Roberts GO. Exact simulation of diffusions. Ann. Appl. Probab. 2005;15:2422–2444.
- Wright S. Evolution in Mendelian populations. Genetics. 1931;16:97–159. doi: 10.1093/genetics/16.2.97.
- Fisher R. On the dominance ratio. Proceedings of the Royal Society of Edinburgh. 1922;42:321–341.
- Revuz D, Yor M. Continuous martingales and Brownian motion. Third edition. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], Vol. 293. Berlin: Springer-Verlag; 1999.
- Mardia KV, Kent JT, Bibby JM. Multivariate analysis. Probability and Mathematical Statistics: A Series of Monographs and Textbooks. London: Academic Press [Harcourt Brace Jovanovich Publishers]; 1979.
- Ikeda N, Watanabe S. Stochastic differential equations and diffusion processes. Second edition. North-Holland Mathematical Library, Vol. 24. Amsterdam: North-Holland Publishing Co.; 1989.
- Watterson G. Estimating and testing selection: the two-alleles, genic selection diffusion model. Advances in Applied Probability. 1979:14–30.
- Crow JF, Kimura M. An introduction to population genetics theory. New York: Harper & Row Publishers; 1970.
- Griffiths RC, Spanó D. Diffusion processes and coalescent trees. In: Probability and mathematical genetics. London Math. Soc. Lecture Note Ser., Vol. 378. Cambridge: Cambridge Univ. Press; 2010. pp. 358–379.



