A MATHEMATICAL FRAMEWORK FOR EXACT MILESTONING

DAVID ARISTOFF; JUAN M BELLO-RIVAS; RON ELBER

doi:10.1137/15M102157X

. Author manuscript; available in PMC: 2017 Mar 3.

Published in final edited form as: Multiscale Model Simul. 2016 Mar 3;14(1):301–322. doi: 10.1137/15M102157X

A MATHEMATICAL FRAMEWORK FOR EXACT MILESTONING

DAVID ARISTOFF ^*, JUAN M BELLO-RIVAS ^†, RON ELBER ^‡

PMCID: PMC4879838 NIHMSID: NIHMS785485 PMID: 27239166

Abstract

We give a mathematical framework for Exact Milestoning, a recently introduced algorithm for mapping a continuous time stochastic process into a Markov chain or semi-Markov process that can be efficiently simulated and analyzed. We generalize the setting of Exact Milestoning and give explicit error bounds for the error in the Milestoning equation for mean first passage times.

Keywords: accelerated molecular dynamics, long-time dynamics, stationary distribution, semi-Markov processes

1. Introduction

Molecular Dynamics (MD) simulations, in which classical equations of motion are solved for molecular systems of significant complexity, have proven useful for interpreting and understanding many chemical and biological phenomena (for textbooks see [23, 12, 1]). However, a significant limitation of MD is of time scales. Many molecular processes of interest occur on time scales significantly longer than the temporal scales accessible to straightforward simulations. For example, permeation of molecules through membranes can take hours [6], while MD is usually restricted to the microsecond time scale. One approach to extend simulation times is to use faster hardware [24, 25, 21]. Other approaches focus on developing theories and algorithms for long time phenomena. The emphasis in developing algorithms has been on methodologies for activated processes with a single dominant barrier such as Transition Path Sampling [8, 19]. Approaches for dynamics on rough energy landscapes and for more general and/or diffusive behavior have also been developed [22, 7, 26, 20]. The techniques of Exact Milestoning [3] and Milestoning [10] belong to the latter category.

The acceleration in Milestoning is based on the use of a large number of short trajectories between certain hypersurfaces in phase space that we call milestones, instead of complete trajectories between reactants and products (see Figure 1). The use of short trajectories is trivial to parallelize, making the formulation efficient to use on modern computing resources. Moreover, using short trajectories initiated near bottlenecks of reactions allows for enhanced sampling of improbable but important events. The last enhancement can lead to exponential speed up of the computations. A challenge is how to start the short trajectories and how to analyze the result to obtain the correct long time behavior.

Fig. 1 — Representation of the state space Ω and the milestones. Each milestone is an edge of one of the squares traced by dashed grey lines. The reactant state R is highlighted as a square dot in the left-bottom corner while the product state P is comprised of the two line segments shown in blue at the upper-right corner. A particular realization of a long trajectory appears as a continuous black line and the corresponding values of the process (*J_n*) are marked with circles. The individual trajectories shown in different colors near the bottom-right corner represent trajectory fragments as used in the Milestoning algorithm.

Milestoning uses an approximate distribution for the initial conditions of the trajectories at the hypersurfaces. The results are then analyzed within the Milestoning theory. This approximation is typically the canonical distribution conditioned to be at the milestones. Exact Milestoning improves over the starting guess of the initial conditions by an iterative process that converges to the correct stationary distribution.

There are related approaches that use short trajectory information to obtain information on long time kinetics such as non-equilibrium umbrella sampling [28] and trajectory tilting [27]. However the overall scope of these methods is more limited than that of Exact Milestoning, which provides a complete statistical mechanical picture of the system.

While Milestoning is an approximate procedure, it shares the same philosophy and core algorithm as the Exact Milestoning approach. In both algorithms the state space Ω is coarse-grained by closed subsets called milestones. The short trajectories are run between nearby milestones and are terminated the first time they reach a milestone different from the one they were initiated on (see Figure 1). The short trajectories in time and space should be contrasted with long continuous trajectories from the reactant R to the products P in straight-forward MD.

In the theoretical analysis of the short trajectories, in Milestoning and Exact Milestoning emphasis is made on the absolute flux. The absolute flux is always positive and is the number of trajectories that pass through a milestone per unit time. While we obtained time-dependent solutions for the kinetics in Milestoning in the past [10], more recently and frequently we use stationary solutions [3, 17]. In stationary systems the flux is time independent and can be used to define a probability measure after a proper normalization. Hence, it is possible to reformulate the algorithms of Milestoning and Exact Milestoning in terms of widely used concepts and tools of probability theory. This formulation and a rigorous mathematical framework are provided in the present manuscript.

This article is organized as follows. In Section 2, we describe the setting for Exact Milestoning and introduce notation used throughout. In Section 3, we show existence of and convergence to a stationary flux under very general conditions. In Section 4 we state precisely the Exact Milestoning algorithm [3]. In Section 5, we establish conditions under which convergence to the stationary flux is consistent in the presence of numerical error (Lemma 5.2 and Theorem 5.3), and we give a natural upper bound for the numerical error arising in Exact Milestoning (Theorem 5.4). Finally, in Section 6 we consider some instructive examples.

2. Setup and notation

2.1. The dynamics and MFPT

In Milestoning we spatially coarse-grain a stochastic dynamics (X_t). The basic idea is to stop and start trajectories on certain interfaces, called milestones, and then reconstruct functions of (X_t) using these short trajectories, which can be efficiently simulated in parallel. We focus here on using Milestoning for the efficient computation of mean first passage times (MFPTs) of (X_t), although similar ideas can be used to compute other non-equilibrium quantities.

To make our arguments we need some assumptions on the process (X_t). We assume (X_t) is a time homogeneous strong Markov process with càdlàg paths taking values in a standard Borel space Ω. These assumptions allow us to stop and restart (X_t) on the milestones without knowing its history. In applications, usually (X_t) is Langevin or overdamped Langevin dynamics, and Ω is a subset of Euclidean space. More generally, (X_t) could be a diffusion and Ω = ℝ^d/ℤ^d could be a torus, or (X_t) could be a continuous time Markov chain on a discrete set Ω = ℕ.

We write ℙ, 𝔼 for all probability measures and expectations. ℙ^ξ and ℙ^x will represent probabilities for a process starting at a distribution ξ or point x, respectively. We write ||·||_TV for total variation norm of signed measures. Our analysis below will mostly take place in an idealized setting where we assume infinite sampling on the milestones. In this setting, distributions are smooth (if state space is continuous) and the total variation norm is the appropriate one. We omit explicit mention of sigma algebras and filtrations, which will always be the natural ones, and we will use the words distribution and probability measure interchangeably.

Recall we are interested in computing the MFPT of (X_t) from a reactant set R to a product set P. Throughout we consider fixed disjoint product and reactant sets P, R ⊂ Ω. When R is not a single point, we will start (X_t) from a fixed probability measure ρ on R. If R is a single point, ρ = δ_R, the delta distribution at R. As discussed above, Milestoning allows for an efficient computation of the MFPT of (X_t) to P, starting at ρ. It is useful to think of P as a sink, and R as a source for (X_t). More precisely, we assume that when (X_t) reaches P, it immediately restarts on R according to ρ. Obviously, this assumption has no effect on the MFPT to P. It will be useful, however, for computational and theoretical considerations.

2.2. The milestones and semi-Markov viewpoint

Milestoning uses a set M ⊂ Ω to spatially parallelize the computation of the MFPT. Each point x ∈ M will belong to a milestone M_x ⊂ M. Thus, M is the union of all the milestones. We assume there are finitely many milestones, each of which is a closed set. Moreover, we assume that (X_t) passes through the intersection of two milestones with probability 0 – thus, (X_t) can only cross one milestone at a time. This can be accomplished for Langevin or overdamped Langevin dynamics by, for example, taking the milestones to be codimension 1 manifolds whose pairwise intersections are codimension 2 or larger. The sets P and R will be two of the milestones. Typically, the milestones are hyperplanes which approximate the level sets of a reaction coordinate, or they are faces of cell boundaries of a Voronoi tesselation obtained by taking points along a high temperature trajectory for (X_t).

Let X₀ ~ ξ, where ξ denotes a generic probability measure on M. We now describe in detail the coarse-graining process. By following the sequence of milestones crossed by (X_t), we obtain a stochastic process defined on M as follows. Let θ₀, θ₁, θ₂, . . . be stopping times for (X_t) defined by

if X_{θ_{n}} = x, then θ_{n + 1} : = inf {t > θ_{n} : X_{t} \in M_{y} for some M_{y} \neq M_{x}}, θ_{0} = 0.

Informally, θ_n corresponds to the time at which (X_t) crosses a milestone for the nth time. It is very important to note that θ_n corresponds to the first time for (X_t) to hit the “next” milestone, that is, a milestone different from the one (X_t) hit at time θ_n₋₁. Thus, when we refer to milestone crossings, we mean crossings of a milestone different from the last one crossed.

Define also τ₀ = 0 and for n ≥ 1, τ_n = θ_n − θ_n₋₁. Informally, τ_n is the time between the (n − 1)th and nth milestone crossings. For n ≥ 0, define J_n = X_{θ_n}to be the sequence of milestone points visited. That is, J_n is the point on the nth milestone crossed by (X_t). Note that each J_n is an element on some milestone – not the label of a milestone. Thus, (J_n) lives on the possibly uncountable set M.

As discussed above, due to our assumptions on M and (X_t), the law of X_n and τ_n depends only on X_n₋₁, that is,

ℙ (J_{n} \in C, τ_{n} \leq t ∣ (J_{0}, τ_{0}), \dots, (J_{n - 1} = x, τ_{n - 1})) = ℙ (J_{n} \in C, τ_{n} \leq t ∣ J_{n - 1} = x) .

(2.1)

Let (Y_t) be the jump process on M defined by Y_t = J_n if θ_n ≤ t < θ_n₊₁. Since J_n is the nth milestone point crossed by (X_t), and θ_n is the time at which (X_t) crossed this point, we see that (Y_t) equals (X_t) at each milestone crossing. Thus, (Y_t) can be considered a coarse-grained version of (X_t) which throws away the path of (X_t) between milestones, keeping only the endpoints. When (Y_t) has a probability density, it corresponds to the density p(x, t) from [3] for the last milestone point passed. Due to (2.1), (Y_t) is called semi-Markov. It has transition kernel K(x, dy, dt) given by

ℙ (J_{n} \in C, τ_{n} \leq t ∣ J_{n - 1} = x) = \int_{[0, t]} \int_{C} K (x, d y, d t) .

We adopt the convention that the innermost integral corresponds to the first variable of integration appearing in the integrand. Note that if (Y_t) and (X_t) start at the same distribution ξ on M, then they have the same MFPT to P. Thus, for the purposes of studying the MFPT, it suffices to consider the coarse-grained process (Y_t). Since (X_t) restarts at ρ upon reaching P, so does (Y_t), and so

for all x \in P, K (x, d y, d t) = δ_{t = 0} ρ (d y) .

(2.2)

Following standard terminology in probability theory, we call (J_n) the jump chain of (Y_t). It is a Markov chain with transition kernel

K (x, d y) : = \int_{[0, \infty)} K (x, d y, d t) .

2.3. Stopping times

Note that the sojourn times τ_n between milestones depend on the position of (X_t) on the nth milestone crossed at time t = θ_n₋₁. The following notation will thus be useful. For x ∈ M, we write

τ_{M}^{x} = inf {t > 0 : Y_{t} \in M_{y} for some M_{y} \neq M_{x}} .

This is the first time for (Y_t) to hit some milestone other than M_x, starting at x ∈ M. In particular, if X_θn₋₁ = x was the (n − 1)th crossing point, then $θ_{n} = θ_{n - 1} + τ_{M}^{x}$ is the next crossing time. As we are concerned with first passage times to P, we will also use the notation

τ_{P} = inf {t > 0 : Y_{t} \in P} .

In particular, the MFPT we are interested in is 𝔼^ρ[τ_P]. As discussed above, the MFPT will be estimated via short trajectories between milestones. An important ingredient is the correct starting distribution for these trajectories. Exact Milestoning makes use of a stationary flux of (X_t) on the milestones, which corresponds to the stationary distribution (shown to exist below) of the jump chain (J_n). To study this distribution we need to define also the first hitting time of P for (J_n):

σ_{P} = min {n \geq 0 : J_{n} \in P} .

Some assumption is required to guarantee the existence of a stationary flux. We adopt the following very general sufficient condition, which we assume holds through the remainder of the paper: 𝔼^ξ[τ_P] and 𝔼^ξ[σ_P] are finite for all probability measures ξ on M. This ensures that (Y_t) reaches P in finite expected time and does not cross infinitely many milestones in finite time. In probabilistic language, the latter condition means that (Y_t) is non-explosive. This sufficient condition can be readily verified in the standard settings for milestoning discussed above. Under only this assumption, we give in Theorem 3.3 an exact milestoning equation for 𝔼^ρ[τ_P]. Moreover we show in Section (4) how the equation leads to an efficient algorithm for computing &Eopf^ρ[τ_P]. The exact equation we give is the same as in [3], except phrased in the semi-Markov framework defined above.

2.4. Dual action of kernels

We use the following notation for the expected value of a bounded function f : M → ℝ with respect to a probability measure ξ on M:

ξ f : = \int_{M} f d ξ .

It will be useful to consider K(x, dy) as a stochastic kernel which operates on probability measures from the left and bounded functions on the right. More precisely, by a stochastic kernel we mean an operator P such that

ξ P : = \int_{M} ξ (d x) P (x, \cdot), P f : = \int_{M} P (\cdot, d y) f (y),

are maps preserving, respectively, the complete metric space ℘ of probability measures on M with the total variation norm, and the Banach space ℬ of bounded functions on M with the sup norm. Informally, ξK is the distribution of J_n, and ξKf is the expected value of f(J_n), given that J_n₋₁ was distributed according to ξ. In particular, Kⁿ is a stochastic kernel for each n ≥ 0, with ℙ^ξ(J_n ∈ ·) = ξKⁿ and 𝔼^ξ[f(J_n)] = ξKⁿf, where K⁰ = 1 is the identity operator.

3. Invariant measure and MFPT

3.1. Stationary distribution on the milestones

The formula for the MFPT relies on a stationary distribution, which we denote by μ, for (J_n). Our μ is the same as the appropriately normalized stationary flux q in other Milestoning papers. We use the notation μ instead of q to emphasize that μ is a probability measure on M that may not have a density. We show in Theorem 3.1 below that μ exists. The proof of existence requires only the above assumption of finite MFPTs to P and non-explositivity of (Y_t), along with the special source and sink structure of the dynamics implied by (2.2).

Theorem 3.1

(J_n) has an invariant probability measure μ defined by

μ (\cdot) : = E^{ρ} [\sum_{n = 0}^{σ_{P}} 𝟙_{{J_{n} \in \cdot}}] E^{ρ} {[σ_{P} + 1]}^{- 1} .

Proof

Define $ν (\cdot) = E^{ρ} [\sum_{n = 0}^{σ_{P}} 𝟙_{{J_{n} \in \cdot}}]$ and observe that

ν (\cdot) = \sum_{n = 0}^{\infty} \sum_{m = 0}^{n} ℙ^{ρ} (J_{m} \in \cdot ∣ σ_{P} = n) ℙ^{ρ} (σ_{P} = n) = \sum_{n = 0}^{\infty} ℙ^{ρ} (J_{n} \in \cdot, σ_{P} \geq n) .

If C ∧ R = ∅, by bounded convergence,

\int_{M} ν (d x) K (x, C) = \sum_{n = 0}^{\infty} ℙ^{ρ} (J_{n + 1} \in C, σ_{P} \geq n) = \sum_{n = 0}^{\infty} ℙ^{ρ} (J_{n} \in C, σ_{P} \geq n) = ν (C),

where the second equality uses ℙ^ρ(J₀ ∈ C) = 0 and J_n₊₁ ∉ R ⇒ σ_P ≠ n. If C ⊂ R,

\begin{array}{l} \int_{M} ν (d x) K (x, C) = \sum_{n = 0}^{\infty} ℙ^{ρ} (J_{n + 1} \in C, σ_{P} = n) + \sum_{n = 0}^{\infty} ℙ^{ρ} (J_{n + 1} \in C, σ_{P} \geq n + 1) \\ = ρ (C) - ℙ^{ρ} (J_{0} \in C, σ_{P} \geq 0) + \sum_{n = 0}^{\infty} ℙ^{ρ} (J_{n} \in C, σ_{P} \geq n) = ν (C) . \end{array}

We will show below that (J_n) converges to μ under appropriate conditions. In that case μ is unique and we will call μ the stationary measure. We emphasize that the formula on which Exact Milestoning is based – the equation in Theorem 3.3 below – holds true whether or not μ is unique. However, a successful application of the Exact Milestoning algorithm will require some technique for sampling μ. The algorithm we present (Algorithm 1 below) is based on convergence of (J_n) to μ. We consider two types of convergence in Theorem 3.4 and 3.5 below. We also discuss a simple case where (J_n) may not converge following Theorem 3.4 below.

Algorithm 1.

Exact Milestoning algorithm.

Input: Milestones

M = \cup_{j = 1}^{m} M_{j}

, initial guess ξ, and tolerance ε > 0 for the absolute error in the MFPT.

Output: Estimates for μ, local MFPTs 𝔼^μ[τ_M], and overall MFPT 𝔼^ρ[τ_P].

μ⁽⁰⁾ ← ξ

T⁽⁰⁾ ← +∞

for all n = 1, 2, . . . do

for i = 1 to m do

Estimate

a_{i j}^{(n)}

and

E^{μ_{i}^{(n - 1)}} [τ_{M}]

A_{i j}^{(n)} \leftarrow a_{i j}^{(n)} (M_{j})

end for

Solve w^TA = w ^T (with

A = (A_{i j}^{(n)}) \in ℝ_{\geq 0}^{m \times m}

and

w = (w_{1}, \dots, w_{m}) \in ℝ_{\geq 0}^{m}

)

for j = 1 to m do

μ_{j}^{(n)} \leftarrow \sum_{i = 1}^{m} w_{i} a_{i j}^{(n)}

end for

Normalize μ⁽ⁿ⁾

T⁽ⁿ⁾ ← μ(P)⁻¹ 𝔼^{μ⁽ⁿ⁻¹⁾}[τ_M]

if |T⁽ⁿ⁾ − T⁽ⁿ⁻¹⁾| < ε then

break

end if

end for

return (μ⁽ⁿ⁾, 𝔼^{μ⁽ⁿ⁻¹⁾}[τ_M], T⁽ⁿ⁾)

Open in a new tab

It is worth noting that the proof of Theorem 3.1 leads to the following representation of μ as a Neumann series. The representation is given in Corollary 3.2 below. In principle, this representation can be used to sample μ even when (J_n) does not converge to μ. The Neumann series is written in terms of the sub-stochastic transient kernel K̄ defined by

\bar{K} (x, d y) = {\begin{cases} K (x, d y), & x \notin P \\ 0, & x \in P \end{cases} .

(3.1)

The transient kernel K̄ has the same right action as the stochastic kernel K, but its left action is on the larger Banach space of signed bounded measures with total variation norm. Informally, it corresponds to a version of (J_n) that is “killed” when upon reaching P.

Corollary 3.2

We have

lim_{n \to \infty} {‖ ν {(M)}^{- 1} \sum_{i = 0}^{n - 1} ρ {\bar{K}}^{i} - μ ‖}_{T V} = 0.

(3.2)

Proof

Recall that μ = ν/ν (M) where

ν (\cdot) = \sum_{n = 0}^{\infty} ℙ^{ρ} (J_{n} \in \cdot, σ_{P} \geq n) = \sum_{n = 0}^{\infty} ρ {\bar{K}}^{n} .

Moreover,

sup_{∣ f ∣ \leq 1} | ν {(M)}^{- 1} \sum_{i = 0}^{n - 1} ρ {\bar{K}}^{i} f - μ f | \leq ν {(M)}^{- 1} \sum_{i = n}^{\infty} ℙ^{ρ} (σ_{P} \geq i),

(3.3)

and the right hand side of (3.3) is summable since by assumption 𝔼^ρ[σ_P] < ∞.

3.2. Milestoning equation for the MFPT

Equipped with an invariant measure μ, we are now able to state the Milestoning equation (3.4) for the MFPT. In Exact Milestoning, this equation is used to efficiently compute the MFPT. The algorithm is based on two principles: first, many trajectories can be simulated in parallel to estimate $τ_{M}^{x}$ for various x; and second, the stationary distribution μ can be efficiently estimated through a technique based on power iteration. (See the right hand side of equation (3.4).) The gain in efficiency comes from the fact that the trajectories used to estimate $τ_{M}^{x}$ are much shorter than trajectories from R to P. The second principle above depends somewhat on whether we have a good “initial guess” of μ. When (X_t) is Langevin dynamics, we have found in some cases the canonical Gibbs distribution is a sufficiently good initial guess. See [3] and [4] for details and discussion.

Theorem 3.3

Let μ be defined as above. Then μ(P) > 0 and

μ (P) E^{ρ} [τ_{P}] = \int_{M} μ (d x) E^{x} [τ_{M}^{x}] : = E^{μ} [τ_{M}] .

(3.4)

Proof

The assumption 𝔼^ρ[σ_P] < ∞ shows that μ(P) > 0. For any x ∈ M,

\begin{array}{l} E^{x} [τ_{P}] = \int_{M} E^{x} [τ_{P} ∣ Y_{τ_{M}^{x}} = y] K (x, d y) \\ = \int_{M} E^{x} [τ_{M}^{x} ∣ Y_{τ_{M}^{x}} = y] K (x, d y) + \int_{M \ P} E^{x} [τ_{P} - τ_{M}^{x} ∣ Y_{τ_{M}^{x}} = y] K (x, d y) \\ = E^{x} [τ_{M}^{x}] + \int_{M \ P} E^{y} [τ_{P}] K (x, d y) . \end{array}

Thus,

\begin{array}{l} E^{μ} [τ_{P}] = E^{μ} [τ_{M}] + \int_{M \ P} \int_{M} μ (d x) E^{y} [τ_{P}] K (x, d y) \\ = E^{μ} [τ_{M}] + \int_{M \ P} μ (d y) E^{y} [τ_{P}], \end{array}

and so

E^{μ} [τ_{M}] = \int_{P} μ (d y) E^{y} [τ_{P}] = μ (P) E^{ρ} [τ_{P}] .

In Section 4 below we present the Exact Milestoning algorithm (Algorithm 1) recently used in [3] and [4]. The algorithm uses a technique which combines coarse-graining and power iteration to sample μ. Consistency of power iteration algorithms are justified via Theorem 3.4 below, where we show ξKⁿ → μ as n → ∞. Though we emphasize that there are a range of possibilities for sampling μ (for example, algorithms based on (3.2) or (3.7) below) we note that Algorithm 1 was shown to be efficient for computing the MFPT in the entropic barrier example of [3] and the random energy landscapes example of [4].

3.3. Convergence to stationarity

In this section we justify the consistency of power iteration-based methods for sampling μ by showing that ξKⁿ converges to μ in total variation norm as n → ∞. The theorem requires an extra assumption – aperiodicity of the jump chain (J_n).

Theorem 3.4

Suppose that (J_n) is aperiodic in the following sense:

g . c . d . {n \geq 1 : ℙ^{ρ} (σ_{P} = n - 1) > 0} = 1.

(3.5)

Then for all probability measures ξ on M,

lim_{n \to \infty} {‖ ℙ^{ξ} (J_{n} \in \cdot) - μ ‖}_{T V} = 0.

(3.6)

In particular, μ is unique.

Proof

We use a simple coupling argument. Let (H_n) be an independent copy of (J_n) and let J₀ ~ ξ and H₀ ~ μ. For n ≥ 0, let S_n (resp. T_n) be the times at which (J_n) (resp. (H_n)) hit P for the (n + 1)st time. Then S_n₊₁ − S_n, n ≥ 0, are iid random variables with finite expected value and nonlattice distribution, and (S_n₊₁ −S_n)_n_≥0 ~ (T_n₊₁ −T_n)_n_≥0. It follows that (S_n −T_n)_n_≥0 is a mean zero random walk with nonlattice step distribution. Thus, its first time to hit 0 is finite almost surely. So

ζ : = inf {n \geq 0 : J_{n} \in P, H_{n} \in P}

obeys ℙ (ζ ≥ n) → 0 as n → ∞. Note that J_n ~ H_n whenever ζ < n. Thus

∣ ℙ^{ξ} (J_{n} \in C) - ℙ^{μ} (H_{n} \in C) ∣ \leq 2 ℙ (ζ \geq n) .

Since μ is stationary for (H_n) we have ℙ^μ(H_n ∈ C) = μ(C). Now

{‖ ℙ^{ξ} (J_{n} \in \cdot) - μ ‖}_{T V} = sup_{C \subset M} ∣ ℙ^{ξ} (J_{n} \in C) - μ (C) ∣ \leq 2 ℙ (ζ \geq n),

which establishes the convergence result. To see uniqueness, suppose ξ is another invariant probability measure for (J_n); then the last display becomes ||ξ − μ||_TV ≤ 2ℙ(ζ ≥ n). Letting n → ∞ shows that ξ ~ μ.

We now consider a class of problems where there is a smooth one-dimensional reaction coordinate ψ : Ω → [0, 1] tracking progress of (X_t) from R to P. In this case ψ|_R ≡ 0, ψ|_P ≡ 1, the milestones M₁, ..., M_m are disjoint level sets of ψ, and R = M₁, P = M_m. The jump chain (J_n) can only hop between neighboring milestones, unless it is at P. That is, if J_n ∈ M_i for i ∉ {1, m}, then J_n₊₁ ∈ M_i₋₁ or J_n ∈ M_i₊₁; if J_n ∈ M₁ then J_n₊₁ ∈ M₂; and if J_n ∈ M_m, then J_n₊₁ ∈ M₁. Suppose that if J_n ∈ M_i for i ∉ {1, m}, then J_n₊₁ ∈ M_i₋₁ with probability in (0, 1). Such systems satisfy the aperiodicity assumption (3.5) if and only if m is odd. This is due to the fact that, starting at M₁, (J_n) can reach M_m in either m − 1 or m + 1 steps with positive probability, and m and m+2 are coprime when m is odd. On the other hand, if m is even then the conclusion of Theorem 3.4 cannot hold. To see this, let m be even and start (J_n) at an initial distribution supported on an odd-indexed milestone. Then after any even number of steps (J_n) is supported on an odd-indexed milestone, while after any odd number of steps (J_n) is supported on an even-indexed milestone.

Theorem 3.4 estabishes convergence of the time marginals of (J_n), that is, the distributions of J_n at fixed times n. (J_n) also converges in a time-averaged sense, even if (3.5) does not hold. More precisely, we have the following version of the Birkhoff ergodic theorem:

Theorem 3.5

Let J₀ ~ ξ, with ξ a probability measure on M. For bounded f : M → ℝ,

lim_{n \to \infty} \frac{1}{n} \sum_{i = 0}^{n - 1} f (J_{i}) \overset{a . s .}{=} \int_{M} f d μ \equiv μ f .

(3.7)

Proof

Let S_n be the times at which (J_n) hits P for the (n + 1)st time, and define

f_{n} = \sum_{i = S_{n} + 1}^{S_{n + 1}} f (J_{i}) .

Note that f_n, n ≥ 0, are iid. Let k(n) = max{k : S_k ≤ n} and write

\frac{1}{n} \sum_{i = 1}^{n - 1} f (J_{i}) = \frac{1}{n} \sum_{i = 0}^{S_{0}} f (J_{i}) + \frac{1}{n} \sum_{i = 0}^{k (n) - 1} f_{i} + \frac{1}{n} \sum_{i = S_{k (n)} + 1}^{n} f (J_{i})

Since (J_n) hits P in finite time a.s., n − S_k₍_n₎ and S₀ are finite a.s. Thus,

lim_{n \to \infty} \frac{1}{n} \sum_{i = 0}^{n - 1} f (J_{i}) = lim_{n \to \infty} \frac{k (n) - 1}{n} \frac{1}{k (n) - 1} \sum_{i = 0}^{k (n) - 1} f_{i} .

Notice R_n := S_n₊₁ − S_n, n ≥ 0 are iid with finite expectation and

\frac{R_{0} + \dots + R_{k (n) - 1}}{k (n)} \leq \frac{n - S_{0}}{k (n)} \leq \frac{R_{0} + \dots + R_{k (n)}}{k (n)} .

By the previous two displays and the law of large numbers,

lim_{n \to \infty} \sum_{i = 0}^{n - 1} f (J_{i}) = \frac{E [f_{0}]}{E [R_{0}]} = \frac{1}{E [σ_{P}]} \sum_{i = 0}^{σ_{P}} E^{ρ} [f (J_{i})] = \frac{ν f}{ν (M)} = μ f,

with ν defined as in Theorem 3.4.

Theorem 3.5 shows that problems in sampling μ arising from aperiodicity can be managed by averaging over time.

Markov chains for which the conclusion of Theorem 3.4 hold are called Harris ergodic. It is worth noting that Harris ergodicity, together with aperiodicity of the sojourn times, also leads to a limit for the time marginals of of (Y_t). More precisely, suppose (3.5) holds and for each x ∈ M \ P and y ∈ M, ℙ^x(τ₁ ∈ · | J₁ = y) is nonlattice. Then for any C ⊂ M and μ-a.e. x,

lim_{t \to \infty} ℙ^{x} (Y_{t} \in C) = \frac{\int_{C} μ (d y) E^{y} [τ_{M}^{y}]}{\int_{M} μ (d y) E^{y} [τ_{M}^{y}]} .

(3.8)

See [2] for details and proof. When the right hand side of (3.8) has a density, it is the same as the stationary probability density p(x) in [3] for the last milestone point passed.

4. Exact Milestoning algorithm

We now describe in detail an algorithm for sampling the stationary measure μ and the MFPT 𝔼^ρ[τ_P], used successfully for instance in [3] and [4]. Let ξ be an initial guess for μ. For example, if (X_t) is Brownian or Langevin dynamics, we usually take ξ to the canonical Gibbs distribution. We assume here that the conclusion of Theorem 3.4 holds. We write M_i for the distinct milestones, so that M = ∪_iM_i and M_i, M_j have disjoint interiors when i ≠ j. The algorithm will produce approximations

ξ \equiv μ^{(0)}, μ^{(1)}, μ^{(2)}, \dots

of μ. Let $μ_{i}^{(n)}$ be the non-normalized restriction of μ⁽ⁿ⁾ to M_i, and define

E^{μ_{i}^{(n)}} [τ_{M}] : = μ^{(n)} {(M_{i})}^{- 1} \int_{M_{i}} μ_{i}^{(n)} (d x) E^{x} [τ_{M}^{x}],

(This is a slight abuse of notation, since $μ_{i}^{(n)}$ is not a probability distribution: it is not normalized.) For C ⊂ M_j we will also use the notation

a_{i j}^{(n)} (C) = μ^{(n - 1)} {(M_{i})}^{- 1} \int_{M_{i}} μ_{i}^{(n - 1)} (d x) K (x, C) .

Below we think of $a_{i j}^{(n)}$ and $μ_{i}^{(n)}$ as either distributions or densities. The $a_{i j}^{(n)}$ are obtained based on observations of trajectory fragments. We can conceive a simple Monte Carlo scheme for the purpose of approximating these distributions as follows. Let x₁,..., x_L be iid samples from the distribution $μ_{i}^{(n - 1)} / μ^{(n - 1)} (M_{i})$ . From each _i sample point x_ℓ ∈ M_i, we simulate (X_t) until it hits a neighboring milestone, say at the point y_ℓ ∈ M_j. If we idealize by assuming the simulation of (X_t) is done exactly, then by Chebyshev’s inequality,

ℙ (| a_{i j}^{(n)} (C) - \int_{C} \frac{1}{L} \sum_{ℓ = 1}^{L} δ_{y_{ℓ}} (d y) | > ε) \leq \frac{a_{i j}^{(n)} (C) - a_{i j}^{(n)} {(C)}^{2}}{L ε^{2}},

where δ_y is the Dirac delta distribution at y. We therefore write, for y ∈ M,

a_{i j}^{(n)} (y) \approx \frac{1}{L} \sum_{ℓ = 1}^{L} {\tilde{δ}}_{y_{ℓ}} (y),

(4.1)

where δ̃_z is either some suitable approximation to the identity at z, or simply a delta function at z. Thus, in Algorithm 1 we think of $a_{i j}^{(n)}$ and $μ_{i}^{(n - 1)}$ as either densities in the former case, or as distributions in the latter. The local mean first passage times, that is, the times between milestone crossings, are approximated by the sample means

E^{μ_{i}^{(n - 1)}} [τ_{M}] \approx \frac{1}{L} \sum_{ℓ = 1}^{L} τ_{M}^{x_{ℓ}} .

It is important to realize that we do not need to store the full coordinates of each y_ℓ in memory. Instead, it suffices to use a data-structure that keeps track of the pairs (y_ℓ, M_j). The actual coordinates of each point can be written to disk and read from it as needed.

The eigenvalue problem in Algorithm 1 involves a stochastic matrix $A \in ℝ_{\geq 0}^{m \times m}$ that is sparse. Indeed, the i-th row corresponds to milestone M_i and may have only as many non-zero entries as the number of neighboring milestones M_j. In practice, to solve the eigenvalue problem we can use efficient and accurate Krylov subspace solvers [15] such as Arnoldi iteration [18] to obtain w without computing all the other eigenvectors.

In Algorithm 1, if $w_{i} : = μ_{i}^{(n - 1)} (M_{i})$ is used instead of the solution w to w^TA = w^T, then the algorithm approximates μ by simple power iteration. The reason for defining the weights as the solution to w^TA = w^T is practical: we have found that it gives faster convergence of the iterations, at no apparent cost to accuracy. It can be seen as a version of power iteration that uses coarse-graining. See [3, 4] for applications of the algorithm in Exact Milestoning and [15, 18] for related discussions.

5. Error analysis

5.1. Stationary distribution error

In practice, due to time discretization error, we cannot generate trajectories exactly according to the transition kernel K. Instead, we can generate trajectories according to a numerical approximation K_ε. We investigate here whether such schemes are consistent, that is, whether powers of K_ε of K converge to a distribution μ_ε ≈ μ. We emphasize that, even though we account for time discretization here, we still assume infinite sampling, and thus for a given x ∈ M, K_ε (x, dy) may be a continuous distribution. See Section 5.2 below for related remarks and a discussion of how time discretization errors affect the Exact Milestoning estimate of the MFPT.

The following theorem, restated from [11], establishes consistency of iteration schemes based on Theorem 3.4 when K_ε is sufficiently close to K and (J_n) is geometrically ergodic. After the theorem, in Lemma 5.2 and Theorem 5.3 we give natural conditions for geometric ergodicity of (J_n).

Theorem 5.1

Suppose (J_n) is geometrically ergodic: there exists κ ∈ (0, 1) such that

sup_{x \in M} {‖ δ_{x} K^{n} - μ ‖}_{T V} = O (κ^{n}) .

Let {K_ε} be a family of stochastic kernels with K₀ = K, assumed to act continuously on ℬ, such that

lim_{ε \to 0} sup_{∣ f ∣ \leq 1} {‖ K_{ε} f - K f ‖}_{\infty} = 0.

(5.1)

Then for each κ̂ ∈ (κ, 1), there is δ > 0 such that for each ε ∈ [0, δ), K_ε has a unique invariant probability measure μ_ε, and

\begin{array}{l} sup_{ε < δ} sup_{x \in M} {‖ δ_{x} K_{ε}^{n} - μ_{ε} ‖}_{T V} = O ({\hat{κ}}^{n}), \\ lim_{ε \to 0} {‖ μ_{ε} - μ ‖}_{T V} = 0. \end{array}

Geometric ergodicity is inconvenient to check directly. We give two sufficient conditions for geometric ergodicity of (J_n). The first condition is a uniform lower bound on the probability to reach P in N steps; see Lemma 5.2. We use this to obtain a strong Feller condition in Theorem 5.3. The latter is a very natural condition and is easy to verify in some cases, for instance when (X_t) is a nondegenerate diffusion and the milestones are sufficiently regular.

Lemma 5.2

Suppose that there exists λ ∈ (0, 1) and N ∈ ℕ such that for all x ∈ M, ℙ^x(J_N₋₁ ∈ P) ≥ λ > 0. Then (J_n) is geometrically ergodic:

sup_{x \in M} {‖ δ_{x} K^{n} - μ ‖}_{T V} \leq λ^{- 1} {(1 - λ)}^{⌊ n / N ⌋} .

Proof

Let ξ₁, ξ₂ ∈ ℘, consider the signed measure ξ = ξ₁ − ξ₂ and compute

\begin{array}{l} {‖ ξ_{1} K^{N} - ξ_{2} K^{N} ‖}_{T V} = sup_{∣ f ∣ \leq 1} | \int_{M} \int_{M} ξ (d y) K^{N} (y, d z) f (z) | \\ = sup_{∣ f ∣ \leq 1} | \int_{M} \int_{M} ξ (d y) (K^{N} (y, d z) - λ ρ (d z)) f (z) | \\ = sup_{∣ f ∣ \leq 1} | \int_{M} ξ (d y) \int_{M} (K^{N} (y, d z) - λ ρ (d z)) f (z) | \\ \leq (1 - λ) sup_{∣ f ∣ \leq 1} | \int_{M} ξ (d y) f (y) | = (1 - λ) {‖ ξ_{1} - ξ_{2} ‖}_{T V} . \end{array}

The last line uses the fact that K^N(y, dz) − λρ(dz) is a positive measure. This shows that K^N is a contraction mapping on ℘ with contraction constant (1 − λ). Observe also that ||ξ₁K − ξ₂K||_TV ≤ ||ξ₁ − ξ₂||_TV. The result now follows from the contraction mapping theorem. See for instance Theorem 6.40 of [9].

Note that the λ in Lemma (5.2) is a quantity that can be estimated, at least in principle, by running trajectories of (X_t) starting at x which cross N − 1 milestones before reaching P. However, this is likely impractical for the same reason direct estimation of the MFPT is impractical – the trajectories would be too long. One alternative would be to compute the probability $P^{i} (J_{N - 1}^{A} \in P)$ for the Markov chain ( $J_{n}^{A}$ ) on {1, …, m} with transition matrix A, and use the minimum over i ∈ {1, …, m} as a proxy for λ. Even without a practical way to estimate λ, we believe the characterization of Lemma 5.2 is useful for understanding the convergence rate.

Lemma 5.2 leads to the following condition for geometric ergodicity of (J_n).

Theorem 5.3

Suppose that M is compact and (J_n) is a strong Feller chain which is aperiodic in the sense of (3.5). Then (J_n) is geometrically ergodic.

Proof

Let ε ∈ (0, μ(P)). By Theorem 3.4, for each x ∈ M there is N_x ∈ ℕ such that ℙ^x(J_n ∈ P ) ≥ ε for all n ≥ N_x. Because (J_n) is strong Feller, the map x → ℙ^x(J_n ∈ P) is continuous. By compactness of M, it follows that for any λ ∈ (0, ε) there is N ∈ ℕ such that ℙ^y(J_n ∈ P) ≥ λ for all y ∈ M and n ≥ N − 1. Theorem 5.2 now yields the result.

5.2. MFPT error

As discussed above, Equation 3.4 can be used to estimate the MFPT 𝔼^ρ[τ_P ] by sampling μ and local MFPTs $τ_{M}^{x}$ . The error in this estimate has two sources. First, in general we only have an approximation μ̃ ≡ μ_ε of μ. The second source of error is in the sampling of $τ_{M}^{x}$ , due to the fact that we can only simulate a time discrete version (X̃_nδt) of (X_t). In Theorem 5.4 below we give an explicit formula for the numerical error of the MFPT in terms of these two sources. We first need the following notation. Let ${\tilde{τ}}_{M}^{x}$ be the minimum of all n δt > 0 such that the line segment between X̃_nδt and X̃ ₍_n₊₁₎_δt intersects M \ M_x, and define

E^{\tilde{μ}} [{\tilde{τ}}_{M}] : = \int_{M} \tilde{μ} (d x) E^{x} [{\tilde{τ}}_{M}^{x}] .

We emphasize that Theorem 5.4 below gives an expression for the error in the original Milestoning as well as in Exact Milestoning.

Theorem 5.4

There exists a nonnegative function ϕ such that

∣ E^{ρ} [τ_{P}] - \tilde{μ} {(P)}^{- 1} E^{\tilde{μ}} [{\tilde{τ}}_{M}] ∣ \leq c_{1} ∣ μ {(P)}^{- 1} - \tilde{μ} {(P)}^{- 1} ∣ + \tilde{μ} {(P)}^{- 1} (c_{2} {‖ μ - \tilde{μ} ‖}_{T V} + ϕ (δ t)),

(5.2)

where

c_{1} : = E^{μ} [τ_{M}], c_{2} : = sup_{x \in M} E^{x} [τ_{M}^{x}] .

Proof

Note that

\begin{array}{l} ∣ E^{ρ} [τ_{P}] - \tilde{μ} {(P)}^{- 1} E^{\tilde{μ}} [{\tilde{τ}}_{M}] ∣ = ∣ μ {(P)}^{- 1} E^{μ} [τ_{M}] - \tilde{μ} {(P)}^{- 1} E^{\tilde{μ}} [{\tilde{τ}}_{M}] ∣ \\ \leq ∣ μ {(P)}^{- 1} E^{μ} [τ_{M}] - \tilde{μ} {(P)}^{- 1} E^{μ} [τ_{M}] ∣ + ∣ \tilde{μ} {(P)}^{- 1} E^{μ} [τ_{M}] - \tilde{μ} {(P)}^{- 1} E^{\tilde{μ}} [τ_{M}] ∣ + ∣ \tilde{μ} {(P)}^{- 1} E^{\tilde{μ}} [τ_{M}] - \tilde{μ} {(P)}^{- 1} E^{\tilde{μ}} [{\tilde{τ}}_{M}] ∣ \end{array}

where we have written $E^{\tilde{μ}} [τ_{M}] : = \int_{M} \tilde{μ} (d x) E^{x} [τ_{M}^{x}]$ . We may write

ϕ (δ t) = ∣ E^{\tilde{μ}} [τ_{M}] - E^{\tilde{μ}} [{\tilde{τ}}_{M}] ∣

for the term depending only on time stepping error. Note that

\begin{array}{l} ∣ E^{μ} [τ_{M}] - E^{\tilde{μ}} [τ_{M}] ∣ = | \int_{M} μ (d x) E^{x} [τ_{M}^{x}] - \int_{M} \tilde{μ} (d x) E^{x} [τ_{M}^{x}] | \\ \leq (sup_{x \in M} E^{x} [τ_{M}^{x}]) {‖ μ - \tilde{μ} ‖}_{T V} . \end{array}

Combining the last three expressions yields the result.

Recall that in the above we have ignored errors from finite sampling. We now discuss the implications of those errors. In the original Milestoning, μ̃ is the canonical Gibbs distribution on the milestones. In that setting, we can typically sample independently from μ̃ on the milestones. Thus, the central limit theorem implies that the true error in the Milestoning approximation of 𝔼^ρ [τ_P] is bounded above with high probability by the right hand side of (5.2) plus a constant times $1 / \sqrt{N}$ , where N is the number of samples. An analogous argument applies to Exact Milestoning if μ̃ is sampled by simple power iteration. For our coarse-grained version of power iteration, however, we obtain samples of μ̃ which are not independent, and thus a more detailed analysis would be required to determine the additional error from finite sampling.

We do not analyze the time discretization error ϕ(δt) and instead refer the reader to [14] and references therein. Here we simply remark that, if (X_t) is a diffusion process, then under certain smoothness assumptions on the drift and diffusion coefficients of (X_t) and on M, we have $ϕ (δ t) = θ (\sqrt{δ t})$ when (X_nδt) is the standard Euler time discretization with time step δt. See [13] for details and proof and see [5, 16] for numerical schemes that mitigate time discretization error in the MFPTs.

6. Illustrative examples

In this section we present two examples of Milestoning on the torus Ω = ℝ²/ℤ². The process (X_t) will be governed by the Brownian dynamics equation,

{\begin{cases} d X_{t} = - \nabla U (X_{t}) d t + \sqrt{2 β^{- 1}} d B_{t}, \\ X_{0} ~ ρ \end{cases}

where U : Ω → ℝ is a smooth potential energy function, β > 0 is the inverse temperature, and (B_t) is a standard Brownian motion. For our computations, we consider a uniformly spaced mesh of milestones with fixed product and reactant sets P, R; see Figure 2.

Fig. 2 — Diagram showing the reactant state (orange square) and the product state (blue square) within the set of all milestones for the example in Section 6. Each milestone is an edge of one of the small squares in the diagram. (The total number of milestones has been decreased to enhance visibility.)

6.1. Constant potential energy

In the case where U is a constant function, the corresponding canonical Gibbs distribution is also constant. However, the stationary distribution μ obtained in Milestoning is not constant. This is because every trajectory reaching the product state P immediately restarts at R according to ρ, as determined by (2.2). The resulting (density of) μ is shown in Figure 3.

Fig. 3 — Stationary distribution μ corresponding to a system with constant potential energy for the example in Section 6. There are 2 × 40 × 40 total milestones.

6.2. Rough energy landscape

In this case, we consider U of the form

U (x_{1}, x_{2}) = \sum_{k_{1} = - N}^{N - 1} \sum_{k_{2} = - N}^{N - 1} C_{k_{1}, k_{2}} f_{k_{1}, k_{2}} (x_{1}, x_{2}),

(6.1)

where N is a parameter determining the maximum allowed frequency. The constants C_k₁,k₂ are given by

C_{k_{1}, k_{2}} = {\begin{cases} c_{k_{1}, k_{2}}, & with probability \frac{1}{2}, \\ 0, & with probability \frac{1}{2}, \end{cases}

with c_k₁,k₂ being uniformly distributed in the interval ( $- \frac{1}{2 π}, \frac{1}{2 π}$ ). The random functions f_k₁,k₂ are defined as

f_{k_{1}, k_{2}} (x_{1}, x_{2}) = {\begin{cases} cos (2 π k_{1} x_{1}) cos (2 π k_{2} x_{2}), & with probabilit y 1 / 3, \\ cos (2 π k_{1} x_{1}) sin (2 π k_{2} x_{2}), & with probability 1 / 3, \\ sin (2 π k_{1} x_{1}) sin (2 π k_{2} x_{2}), & with probability 1 / 3. \end{cases}

Therefore, the potential energy function U is a sparse, periodic function on the two-dimensional torus. An example of a stationary distribution on a particular realization of the above potential energy functions is displayed in Figure 4, where we see that (the density of) μ is concentrated in a few basins of attraction.

Fig. 4 — Stationary density μ on a rough energy landscape for the example in Section 6. The contour lines are the level sets of U.

Acknowledgments

This research was supported in part by a grant from the NIH GM59796 and from the Welch Foundation Grant No. F-1783.

References

1.Allen P, Tildesley DJ. Computer Simulation of Liquids. Oxford Science Publications, Clarendon Press; 1989. [Google Scholar]
2.Alsmeyer G. On the Markov renewal theorem. Stochastic Processes and their Applications. 1994;50:37–56. [Google Scholar]
3.Bello-Rivas JM, Elber R. Exact milestoning. The Journal of Chemical Physics. 2015;142:094102. doi: 10.1063/1.4913399. [DOI] [PMC free article] [PubMed] [Google Scholar]
4.Bello-Rivas JM, Elber R. Simulations of thermodynamics and kinetics on rough energy landscapes with milestoning. 2015 doi: 10.1002/jcc.24039. [DOI] [PMC free article] [PubMed] [Google Scholar]
5.Bouchard B, Geiss S, Gobet E. First time to exit of a continuous Itô process: general moment estimates and $L 1$-convergence rate for discrete time approximations. 2013:1–31. [Google Scholar]
6.Cardenas AE, DeLeon KY, Hegefeld WA, Kuczera K, Jas GS, Elber R. Unassisted Transport of Block Tryptophan through DOPC Membrane: Experiment and Simulation. 2012 doi: 10.1021/jp2102447. [DOI] [PMC free article] [PubMed] [Google Scholar]
7.Chodera JD, Singhal N, Pande VS, Dill KA, Swope WC. Automatic discovery of metastable states for the construction of Markov models of macromolecular conformational dynamics. Journal of Chemical Physics. 2007;126 doi: 10.1063/1.2714538. [DOI] [PubMed] [Google Scholar]
8.Dellago C, Bolhuis PG, Geissler PL. Transition Path Sampling. 2002;123 doi: 10.1146/annurev.physchem.53.082301.113146. [DOI] [PubMed] [Google Scholar]
9.Douc R, Moulines E, Stoffer D. Chapman & Hall/CRC Texts in Statistical Science. Taylor & Francis; 2014. Nonlinear Time Series: Theory, Methods and Applications with R Examples. [Google Scholar]
10.Faradjian AK, Elber R. Computing time scales from reaction coordinates by milestoning. Journal of Chemical Physics. 2004;120:10880–10889. doi: 10.1063/1.1738640. [DOI] [PubMed] [Google Scholar]
11.Ferré D, Hervé L, Ledoux J. Regular perturbation of v -geometrically ergodic markov chains. Journal of Applied Probability. 2013;50:184–194. [Google Scholar]
12.Frenkel D, Smit B. Understanding Molecular Simulation: From Algorithms to Applications. 2002. [Google Scholar]
13.Gobet E, Menozzi S. Exact approximation rate of killed hypoelliptic diffusions using the discrete Euler scheme. Stochastic Processes and their Applications. 2004;112:201–223. [Google Scholar]
14.Gobet E, Menozzi S. Stopped diffusion processes: Boundary corrections and overshoot. Stochastic Processes and their Applications. 2010;120:130–162. [Google Scholar]
15.Golub GH, Van Loan CF. Matrix computations. 4. Johns Hopkins Studies in the Mathematical Sciences, Johns Hopkins University Press; Baltimore, MD: 2013. [Google Scholar]
16.Higham DJ, Mao X, Roj M, Song Q, Yin G. Mean Exit Times and the Multilevel Monte Carlo Method. SIAM Journal on Uncertainty Quantification. 1(1):2–18. [Google Scholar]
17.Kirmizialtin S, Elber R. Revisiting and computing reaction coordinates with Directional Milestoning. The journal of physical chemistry A. 2011;115:6137–48. doi: 10.1021/jp111093c. [DOI] [PMC free article] [PubMed] [Google Scholar]
18.Lehoucq RB, Sorensen DC, Yang C. ARPACK users’ guide, vol. 6 of Software, Environments, and Tools. Society for Industrial and Applied Mathematics (SIAM); Philadelphia, PA: 1998. [Google Scholar]
19.Metzner P, Schütte C, Vanden-Eijnden E. Transition Path Theory for Markov Jump Processes. 2009. [Google Scholar]
20.Moroni D, Bolhuis PG, Van Erp TS. Rate constants for diffusive processes by partial path sampling. Journal of Chemical Physics. 2004;120:4055–4065. doi: 10.1063/1.1644537. [DOI] [PubMed] [Google Scholar]
21.Ruymgaart AP, Cardenas AE, Elber R. MOIL-opt: Energy-Conserving Molecular Dynamics on a GPU/CPU system. Journal of chemical theory and computation. 2011;7:3072–3082. doi: 10.1021/ct200360f. [DOI] [PMC free article] [PubMed] [Google Scholar]
22.Sarich M, Noé F, Schütte C. On the Approximation Quality of Markov State Models. 2010. [Google Scholar]
23.Schlick T. Molecular Modeling and Simulation: An Interdisciplinary Guide. 2010. [Google Scholar]
24.Shaw DE, Chao JC, Eastwood MP, Gagliardo J, Grossman JP, Ho CR, Lerardi DJ, Kolossváry I, Klepeis JL, Layman T, McLeavey C, Deneroff MM, Moraes MA, Mueller R, Priest EC, Shan Y, Spengler J, Theobald M, Towles B, Wang SC, Dror RO, Kuskin JS, Larson RH, Salmon JK, Young C, Batson B, Bowers KJ. Anton, a special-purpose machine for molecular dynamics simulation. 2008. [Google Scholar]
25.Stone JE, Phillips JC, Freddolino PL, Hardy DJ, Trabuco LG, Schulten K. Accelerating molecular modeling applications with graphics processors. Journal of Computational Chemistry. 2007;28:2618–2640. doi: 10.1002/jcc.20829. [DOI] [PubMed] [Google Scholar]
26.Swenson DWH, Bolhuis PG. A replica exchange transition interface sampling method with multiple interface sets for investigating networks of rare events. Journal of Chemical Physics. 2014;141 doi: 10.1063/1.4890037. [DOI] [PubMed] [Google Scholar]
27.Vanden-Eijnden E, Venturoli M. Exact rate calculations by trajectory parallelization and tilting. The Journal of chemical physics. 2009;131:1–7. doi: 10.1063/1.3180821. [DOI] [PubMed] [Google Scholar]
28.Warmflash A, Bhimalapuram P, Dinner AR. Umbrella sampling for nonequilibrium processes. The Journal of chemical physics. 2007;127:154112. doi: 10.1063/1.2784118. [DOI] [PubMed] [Google Scholar]

[R1] 1.Allen P, Tildesley DJ. Computer Simulation of Liquids. Oxford Science Publications, Clarendon Press; 1989. [Google Scholar]

[R2] 2.Alsmeyer G. On the Markov renewal theorem. Stochastic Processes and their Applications. 1994;50:37–56. [Google Scholar]

[R3] 3.Bello-Rivas JM, Elber R. Exact milestoning. The Journal of Chemical Physics. 2015;142:094102. doi: 10.1063/1.4913399. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R4] 4.Bello-Rivas JM, Elber R. Simulations of thermodynamics and kinetics on rough energy landscapes with milestoning. 2015 doi: 10.1002/jcc.24039. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R5] 5.Bouchard B, Geiss S, Gobet E. First time to exit of a continuous Itô process: general moment estimates and $L 1$-convergence rate for discrete time approximations. 2013:1–31. [Google Scholar]

[R6] 6.Cardenas AE, DeLeon KY, Hegefeld WA, Kuczera K, Jas GS, Elber R. Unassisted Transport of Block Tryptophan through DOPC Membrane: Experiment and Simulation. 2012 doi: 10.1021/jp2102447. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R7] 7.Chodera JD, Singhal N, Pande VS, Dill KA, Swope WC. Automatic discovery of metastable states for the construction of Markov models of macromolecular conformational dynamics. Journal of Chemical Physics. 2007;126 doi: 10.1063/1.2714538. [DOI] [PubMed] [Google Scholar]

[R8] 8.Dellago C, Bolhuis PG, Geissler PL. Transition Path Sampling. 2002;123 doi: 10.1146/annurev.physchem.53.082301.113146. [DOI] [PubMed] [Google Scholar]

[R9] 9.Douc R, Moulines E, Stoffer D. Chapman & Hall/CRC Texts in Statistical Science. Taylor & Francis; 2014. Nonlinear Time Series: Theory, Methods and Applications with R Examples. [Google Scholar]

[R10] 10.Faradjian AK, Elber R. Computing time scales from reaction coordinates by milestoning. Journal of Chemical Physics. 2004;120:10880–10889. doi: 10.1063/1.1738640. [DOI] [PubMed] [Google Scholar]

[R11] 11.Ferré D, Hervé L, Ledoux J. Regular perturbation of v -geometrically ergodic markov chains. Journal of Applied Probability. 2013;50:184–194. [Google Scholar]

[R12] 12.Frenkel D, Smit B. Understanding Molecular Simulation: From Algorithms to Applications. 2002. [Google Scholar]

[R13] 13.Gobet E, Menozzi S. Exact approximation rate of killed hypoelliptic diffusions using the discrete Euler scheme. Stochastic Processes and their Applications. 2004;112:201–223. [Google Scholar]

[R14] 14.Gobet E, Menozzi S. Stopped diffusion processes: Boundary corrections and overshoot. Stochastic Processes and their Applications. 2010;120:130–162. [Google Scholar]

[R15] 15.Golub GH, Van Loan CF. Matrix computations. 4. Johns Hopkins Studies in the Mathematical Sciences, Johns Hopkins University Press; Baltimore, MD: 2013. [Google Scholar]

[R16] 16.Higham DJ, Mao X, Roj M, Song Q, Yin G. Mean Exit Times and the Multilevel Monte Carlo Method. SIAM Journal on Uncertainty Quantification. 1(1):2–18. [Google Scholar]

[R17] 17.Kirmizialtin S, Elber R. Revisiting and computing reaction coordinates with Directional Milestoning. The journal of physical chemistry A. 2011;115:6137–48. doi: 10.1021/jp111093c. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R18] 18.Lehoucq RB, Sorensen DC, Yang C. ARPACK users’ guide, vol. 6 of Software, Environments, and Tools. Society for Industrial and Applied Mathematics (SIAM); Philadelphia, PA: 1998. [Google Scholar]

[R19] 19.Metzner P, Schütte C, Vanden-Eijnden E. Transition Path Theory for Markov Jump Processes. 2009. [Google Scholar]

[R20] 20.Moroni D, Bolhuis PG, Van Erp TS. Rate constants for diffusive processes by partial path sampling. Journal of Chemical Physics. 2004;120:4055–4065. doi: 10.1063/1.1644537. [DOI] [PubMed] [Google Scholar]

[R21] 21.Ruymgaart AP, Cardenas AE, Elber R. MOIL-opt: Energy-Conserving Molecular Dynamics on a GPU/CPU system. Journal of chemical theory and computation. 2011;7:3072–3082. doi: 10.1021/ct200360f. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R22] 22.Sarich M, Noé F, Schütte C. On the Approximation Quality of Markov State Models. 2010. [Google Scholar]

[R23] 23.Schlick T. Molecular Modeling and Simulation: An Interdisciplinary Guide. 2010. [Google Scholar]

[R24] 24.Shaw DE, Chao JC, Eastwood MP, Gagliardo J, Grossman JP, Ho CR, Lerardi DJ, Kolossváry I, Klepeis JL, Layman T, McLeavey C, Deneroff MM, Moraes MA, Mueller R, Priest EC, Shan Y, Spengler J, Theobald M, Towles B, Wang SC, Dror RO, Kuskin JS, Larson RH, Salmon JK, Young C, Batson B, Bowers KJ. Anton, a special-purpose machine for molecular dynamics simulation. 2008. [Google Scholar]

[R25] 25.Stone JE, Phillips JC, Freddolino PL, Hardy DJ, Trabuco LG, Schulten K. Accelerating molecular modeling applications with graphics processors. Journal of Computational Chemistry. 2007;28:2618–2640. doi: 10.1002/jcc.20829. [DOI] [PubMed] [Google Scholar]

[R26] 26.Swenson DWH, Bolhuis PG. A replica exchange transition interface sampling method with multiple interface sets for investigating networks of rare events. Journal of Chemical Physics. 2014;141 doi: 10.1063/1.4890037. [DOI] [PubMed] [Google Scholar]

[R27] 27.Vanden-Eijnden E, Venturoli M. Exact rate calculations by trajectory parallelization and tilting. The Journal of chemical physics. 2009;131:1–7. doi: 10.1063/1.3180821. [DOI] [PubMed] [Google Scholar]

[R28] 28.Warmflash A, Bhimalapuram P, Dinner AR. Umbrella sampling for nonequilibrium processes. The Journal of chemical physics. 2007;127:154112. doi: 10.1063/1.2784118. [DOI] [PubMed] [Google Scholar]

PERMALINK

A MATHEMATICAL FRAMEWORK FOR EXACT MILESTONING

DAVID ARISTOFF

JUAN M BELLO-RIVAS

RON ELBER

Abstract

1. Introduction

Fig. 1.

2. Setup and notation

2.1. The dynamics and MFPT

2.2. The milestones and semi-Markov viewpoint

2.3. Stopping times

2.4. Dual action of kernels

3. Invariant measure and MFPT

3.1. Stationary distribution on the milestones

Theorem 3.1

Proof

Algorithm 1.

Corollary 3.2

Proof

3.2. Milestoning equation for the MFPT

Theorem 3.3

Proof

3.3. Convergence to stationarity

Theorem 3.4

Proof

Theorem 3.5

Proof

4. Exact Milestoning algorithm

5. Error analysis

5.1. Stationary distribution error

Theorem 5.1

Lemma 5.2

Proof

Theorem 5.3

Proof

5.2. MFPT error

Theorem 5.4

Proof

6. Illustrative examples

Fig. 2.

6.1. Constant potential energy

Fig. 3.

6.2. Rough energy landscape

Fig. 4.

Acknowledgments

References

ACTIONS

PERMALINK

RESOURCES

Similar articles

Cited by other articles

Links to NCBI Databases