The Journal of Chemical Physics. 2008 Nov 5;129(17):174102. doi: 10.1063/1.2996509

On the assumptions underlying milestoning

Eric Vanden-Eijnden 1,a), Maddalena Venturoli 1,b), Giovanni Ciccotti 2,c), Ron Elber 3,d)
PMCID: PMC2646510  NIHMSID: NIHMS70556  PMID: 19045328

Abstract

Milestoning is a procedure to compute the time evolution of complicated processes such as barrier crossing events or long diffusive transitions between predefined states. Milestoning reduces the dynamics to transition events between intermediates (the milestones) and computes the local kinetic information to describe these transitions via short molecular dynamics (MD) runs between the milestones. The procedure relies on the ability to reinitialize MD trajectories on the milestones to get the right kinetic information about the transitions. It also rests on the assumptions that the transition events between successive milestones and the time lags between these transitions are statistically independent. In this paper, we analyze the validity of these assumptions. We show that sets of optimal milestones exist, i.e., sets such that successive transitions are indeed statistically independent. The proof of this claim relies on the results of transition path theory and uses the isocommittor surfaces of the reaction as milestones. For systems in the overdamped limit, we also obtain the probability distribution to reinitialize the MD trajectories on the milestones, and we discuss why this distribution is not available in closed form for systems with inertia. We explain why the time lags between transitions are not statistically independent even for optimal milestones, but we show that working with such milestones allows one to compute mean first passage times between milestones exactly. Finally, we discuss some practical implications of our results and we compare milestoning with Markov state models in view of our findings.

INTRODUCTION

Milestoning, introduced in Ref. 1 and further developed in Refs. 2, 3, 4, 5, is a procedure to analyze the progress in time of complicated processes such as barrier crossing events between predefined states or slow diffusive motion from an initial state toward a target state. Milestoning coarse grains the temporal and spatial description of the system by reducing the dynamics into a succession of independent transition events between intermediates (the milestones), which are hypersurfaces in the configuration space of the system. The necessary information about the probability of occurrence of these transitions and the time lags between them is obtained by collecting local kinetic information from straightforward and short trajectories between the milestones. By reducing the dynamics this way, milestoning allows one to simulate the overall kinetics of the system for time scales which are much beyond what is accessible by standard molecular dynamics (MD) simulations. Milestoning, however, rests on the assumptions that such a reduction of the dynamics to transitions between milestones is legitimate and that it is possible to retrieve the right kinetic information using short simulations between these milestones. The aim of this paper is to analyze under which conditions these assumptions are valid.

To introduce the issues that this question brings up and recall the details of the milestoning procedure, it is useful to consider a thought experiment in which one is given an infinitely long unbiased trajectory of a MD system which, assuming ergodicity, visits a given set of milestones infinitely often. Denoting these milestones by S1,S2,…,SN and by x(t) with t⩾0 the instantaneous position of the system along the trajectory, let Si0,Si1,Si2,Si3,… with ik∊{1,2,…,N}, k=0,1,2,… be the infinite sequence of successive milestones that x(t) crosses [thus we do not count recrossings of the same milestone until another milestone has been crossed in the meantime, i.e., ik≠ik−1 for all k=1,2,… (see Fig. 1 for an illustration)]. Let also 0⩽t0<t1<t2<⋯ be the times at which these crossings occur, i.e., t0 is the first time ⩾0 such that x(t0)∊Si0 and for k=1,2,…, tk is the first time such that x(tk)∊Sik after the trajectory crossed Sik−1.

Figure 1.

Schematic of a piece of a long ergodic trajectory crossing a set of three milestones: S1, S2, and S3. In this example, Sik−1=S1, Sik=S2, Sik+1=S3, and Sik+2=S2. The part of the trajectory highlighted in bold contributes to one event counted in p23 and the time tk+1−tk contributes to the statistics of f23(s). The figure also shows the previous transition event from S1 to S2, which contributes to p12 with the time tk−tk−1 contributing to f12(s), and the next one from S3 to S2, which contributes to p32 with the time tk+2−tk+1 contributing to f32(s).

The sequence {(Si0,t0),(Si1,t1),…}≡{(Sik,tk)}k=0,1,… gives a coarse-grained picture of the MD trajectory, but one which, with a suitable choice of milestones, will still contain useful information about the kinetics of this trajectory. The main objective of the milestoning procedure is to generate this sequence and to analyze its statistical properties in a way which is computationally cheaper than running a long MD trajectory and waiting for it to visit all the milestones, as in the thought experiment above, to gather its statistics.
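The bookkeeping of the thought experiment can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not part of the text: the trajectory is a sampled scalar coordinate, the milestones are points along that coordinate, and the function name `milestone_sequence` is hypothetical. The sketch records each first crossing of a milestone and skips recrossings of the milestone recorded last.

```python
import numpy as np

def milestone_sequence(x, t, milestones):
    """Return [(i_k, t_k)]: successive distinct milestones crossed by x(t).

    x, t       : 1D arrays holding a sampled scalar trajectory and its times
    milestones : milestone positions along the (assumed one-dimensional) coordinate
    """
    milestones = np.asarray(milestones, dtype=float)
    order = np.argsort(milestones)
    seq = []
    for n in range(1, len(x)):
        lo, hi = min(x[n - 1], x[n]), max(x[n - 1], x[n])
        # Visit milestones in the direction of travel so that several
        # crossings within one step are recorded in the right order.
        idx = order if x[n] >= x[n - 1] else order[::-1]
        for i in idx:
            if lo < milestones[i] <= hi:
                # Recrossings of the milestone recorded last are not counted.
                if not seq or seq[-1][0] != i:
                    seq.append((int(i), float(t[n])))
    return seq
```

The crossing time is taken as the end of the sampling step in which the crossing is detected, so its accuracy is limited by the sampling interval.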

To see how this objective is realized and the conditions this entails, let us introduce the following two statistical quantities which play a central role in milestoning. First, by going back to the sequence {(Sik,tk)}k=0,1,… obtained from the long ergodic MD trajectory and recording the proportion of times along this sequence that Sj is the milestone crossed next after Si, we can define the probability pij that this event occurs,6

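$$p_{ij}=\lim_{n\to\infty}\frac{\sum_{k=1}^{n}\delta_{i,i_{k-1}}\delta_{j,i_k}}{\sum_{k=1}^{n}\delta_{i,i_{k-1}}}=(\text{probability that } S_j \text{ is the milestone crossed next after } S_i), \tag{1}$$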

where δi,j=1 if i=j and δi,j=0 otherwise. (Thus, by definition pii=0 and $\sum_{j=1}^{N}p_{ij}=1$ for all i=1,…,N.) Second, by recording the time it takes for the trajectory to go to Sj next after Si each time this event happens, we can define the probability density fij(s) of the lag time between these crossings conditional on them occurring: for any pair i,j such that pij≠0, fij(s) is such that

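$$\int_0^{\Delta t}f_{ij}(s)\,ds=\lim_{n\to\infty}\frac{\sum_{k=1}^{n}\delta_{i,i_{k-1}}\delta_{j,i_k}\mathbf{1}_{[0,\Delta t]}(t_k-t_{k-1})}{\sum_{k=1}^{n}\delta_{i,i_{k-1}}\delta_{j,i_k}}=(\text{probability that the trajectory takes less than } \Delta t \text{ to reach } S_j \text{ after crossing } S_i \text{ given that } S_j \text{ is the milestone crossed next after } S_i), \tag{2}$$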

where 1[0,Δt](s)=1 if s∊[0,Δt] and 1[0,Δt](s)=0 otherwise. [Thus, fij(s) is only defined if pij≠0 and in this case $\int_0^{\infty}f_{ij}(s)\,ds=1$.]
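As a minimal sketch of how Eqs. 1 and 2 translate into counting, the function below takes a finite crossing sequence [(i_k, t_k)] with zero-based milestone indices and returns the empirical transition matrix together with the lag-time samples from which fij(s) can be histogrammed. The function name and the data layout are assumptions made for illustration; a finite sequence only approximates the n→∞ limits.

```python
import numpy as np

def estimate_milestoning_statistics(seq, n_milestones):
    """Estimate p_ij (Eq. 1) and collect lag-time samples for f_ij (Eq. 2)."""
    counts = np.zeros((n_milestones, n_milestones))
    lags = {}
    for (i, t_prev), (j, t_next) in zip(seq[:-1], seq[1:]):
        counts[i, j] += 1                      # one more i -> j event
        lags.setdefault((i, j), []).append(t_next - t_prev)
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows with no observed departure are left as zeros.
    p = np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)
    return p, lags
```

The mean of `lags[(i, j)]` is then an estimate of the first moment of fij(s).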

As discussed below, the quantities in Eqs. 1, 2 can, in principle, be computed more efficiently than via a long MD trajectory. Before explaining how to do so, however, one should ask how useful these quantities are. Indeed, in general (i.e., for an arbitrary set of milestones), Eqs. 1, 2 contain only partial information about the statistical properties of the sequence {(Sik,tk)}k=0,1,… [for instance, we could be interested in the probability that Sj is crossed after Si and Sk after Sj, which is not given by pijpjk in general, or the average time it takes to go from Si to Sj, which is not expressible in terms of pij and fij(s) alone in general]. However, all the statistical properties of the sequence {(Sik,tk)}k=0,1,… could be deduced from Eqs. 1, 2 alone if the following two properties were true. The first makes an assumption about the way successive transitions between milestones occur.

(P1) The probability that after crossing Sj0, the system successively visits n milestones in the sequence Sj1,Sj2,Sj3,…,Sjn is for any n∊N given by

$$p_{j_0j_1}p_{j_1j_2}p_{j_2j_3}\cdots p_{j_{n-1}j_n}, \tag{3}$$

where the probability is again defined by recording the proportion of times this sequence of transitions is observed along the sequence {Sik}k=0,1,2,….

This property guarantees that the statistical properties of the sequence {Sik}k=0,1,… are identical to those of a discrete-time Markov process (here the “time” is the index k) with transition probability pij. The second property makes an assumption about the statistics of the lag times between crossings of successive milestones.

(P2) Denoting by tj0j1,tj1j2,…,tjn−1jn the lag times between successive crossings of the milestones along any sequence Sj0,Sj1,Sj2,…,Sjn, these lag times are statistically independent, i.e., if we compute the joint probability density function that tj0j1=s1,tj1j2=s2,…,tjn−1jn=sn, for any n∊N it reduces to the product

$$f_{j_0j_1}(s_1)f_{j_1j_2}(s_2)\cdots f_{j_{n-1}j_n}(s_n). \tag{4}$$

Together with (P1), this property guarantees that the statistical properties of the sequence {(Sik,tk)}k=0,1,… are those of a discrete-time Markov process (where the time stands again for the index k, whereas the physical time tk is part of the state variables of the chain7) with transition probability

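$$(\text{probability that } S_{i_{k+1}}=S_j \text{ and } t_{k+1}\le t+\Delta t \text{ given that } S_{i_k}=S_i \text{ and } t_k=t)=p_{ij}\int_0^{\Delta t}f_{ij}(s)\,ds. \tag{5}$$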

This says that given that milestone Si is crossed at time tk, with probability pij, the next milestone to be crossed is Sj and given that this event occurs, it happens at time tk+1=tk+tij, where tij is a random time distributed according to Eq. 2.
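The chain defined by Eq. 5 is simple to sample once pij and a way to draw lag times are available. The following sketch assumes a row-stochastic NumPy array `p` and a user-supplied callable `sample_lag(i, j, rng)` that returns one draw from fij; both names are hypothetical and chosen only for this illustration.

```python
import numpy as np

def sample_milestoning_chain(p, sample_lag, i0, n_steps, rng):
    """Generate [(i_k, t_k)] from the Markov chain with transition rule Eq. 5."""
    i, t = i0, 0.0
    seq = [(i, t)]
    for _ in range(n_steps):
        j = int(rng.choice(len(p), p=p[i]))  # next milestone, drawn with probability p_ij
        t += sample_lag(i, j, rng)           # lag time, drawn from f_ij
        i = j
        seq.append((i, t))
    return seq
```

For example, `sample_milestoning_chain(p, lambda i, j, rng: 1.0, 0, 10, np.random.default_rng(0))` produces a chain with unit lag times, which is a convenient check of the transition statistics alone.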

The key assumption made in the milestoning procedure is that properties (P1) and (P2) do hold true, thereby guaranteeing that the statistical properties of the sequence {(Sik,tk)}k=0,1,… obtained from the MD trajectory are identical to those of the Markov chain with transition probability Eq. 5. This chain is fully specified if one knows the quantities in Eqs. 1, 2 and it can be analyzed using the tools developed in Refs. 1, 2, 3, 4, 5. Of course, this still leaves us with the important question of how to compute Eqs. 1, 2 in practice. A key component of milestoning is to do so by reinitializing the MD simulation on each milestone Si. In principle, this reinitialization must be done using the probability density of the position at which a long ergodic trajectory first hits milestone Si after crossing another one [i.e., the density of the points x(tk) with k such that ik=i], but since this density is not known a priori, we are left with the question: What is the expression of the first hitting point probability density on the milestones that should be used to reinitialize the short MD runs in order to compute Eqs. 1, 2?

If we can answer this question, then Eqs. 1, 2 can be computed by performing relatively short MD runs until the trajectories initiated on a given milestone reach another milestone. This allows one to bypass the need to wait for the system to actually reach this milestone on its own (i.e., the need to use a long unbiased trajectory) and, in effect, it is the reason why milestoning offers a computational gain over direct MD. Indeed, with an appropriate choice of the milestones, the individual transition times between them can be made relatively short even though overall transition times between milestones which are far apart can be very long.

This finally brings us back to the aims of this paper. Our main objectives are to analyze the validity of properties (P1) and (P2) and to derive an expression for the first hitting point probability density that must be used to reinitialize the trajectories on the milestones. In Sec. 2 we will first analyze systems in the overdamped limit and show that there exist sets of milestones for which property (P1) is satisfied exactly. Property (P2), in general, will not hold exactly for this set of milestones, but we will show that this property is, in fact, not necessary to compute exactly certain quantities such as mean first passage times [in order to do such calculations exactly, only property (P1) is required]. For this reason, we shall refer to the milestones in sets such that property (P1) holds as “optimal milestones.” Using the results from transition path theory (TPT),8, 9, 10 we will show that these sets of optimal milestones involve the isosurfaces of the so-called committor function. We also derive the probability density of first hitting of these surfaces, from which one needs to initiate trajectories to compute Eq. 1 in the milestoning procedure. These results are illustrated in Sec. 3 via simple examples. In Sec. 4 we then consider systems with inertia, for which the situation is more complex. Optimal milestoning and the exact calculation of mean first passage time are, in principle, possible using again the isosurfaces of the committor function, but this function now depends on both the system position and momentum. In addition, for reasons that we will explain, an explicit expression for the probability density of first hitting of these surfaces is not available for systems with inertia, which complicates matters even more. How to deal with these issues, in practice, is explained in Sec. 5 where, among other things, we discuss how to identify optimal milestones in practice by combining milestoning with the string method.11, 12, 13, 14, 15, 16 Finally, a few concluding remarks are given in Sec. 6 where we compare milestoning with other strategies that use short runs between intermediates to deduce the overall kinetics. These include, in particular, Markov state models (MSMs) (e.g., Refs. 17, 18, 19, 20, 21, 22, 23, 24, 25), transition interface sampling26, 27 (TIS) and forward flux sampling (FFS).28 For the reader’s convenience, some background material on TPT (in particular on the committor function) and some technical derivations are deferred to two appendixes.

OPTIMAL MILESTONING IN THE OVERDAMPED LIMIT

In this section, we focus on systems governed by the overdamped (Smoluchowski) equation

$$\gamma\dot{x}(t)=-\nabla V(x(t))+\sqrt{2\beta^{-1}\gamma}\,\eta(t). \tag{6}$$

Here x(t)∊Rd denotes the position of the system, γ is the friction coefficient, V(x) is the potential, β is the inverse temperature, and η(t) is a white noise, i.e., a Gaussian process with mean zero and covariance ⟨ηi(t)ηj(t′)⟩=δijδ(t−t′). The case of systems with inertia governed by the Langevin equation will be considered later in Sec. 4.
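For readers who want to experiment numerically, a minimal sketch of one way to integrate the overdamped equation is the Euler-Maruyama scheme, x ← x − (Δt/γ)∇V(x) + √(2Δt/(βγ)) ξ with ξ a standard Gaussian vector. The scheme is a standard discretization chosen here for illustration and is not taken from the text; the function name and its arguments are assumptions.

```python
import numpy as np

def overdamped_step(x, grad_V, beta, gamma, dt, rng):
    """One Euler-Maruyama step for gamma * dx/dt = -grad V(x) + sqrt(2 gamma / beta) * eta."""
    noise = rng.standard_normal(np.shape(x))
    drift = -dt / gamma * grad_V(x)
    return x + drift + np.sqrt(2.0 * dt / (beta * gamma)) * noise
```

The time step must be small compared with the fastest relaxation time of V for the sampled distribution to stay close to the Boltzmann density.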

In Sec. 2A we first establish that optimal sets of milestones which satisfy property (P1) exactly exist and involve the isosurfaces of the committor function. Then in Sec. 2B we derive the probability density of first hitting points on these optimal milestones. Finally, in Sec. 2C we discuss why, in general, property (P2) does not hold exactly even for optimal milestones, but we show that mean first passage times can nonetheless be computed exactly with these milestones.

Isocommittor surfaces as optimal milestones

Successive transitions between milestones are not independent, in general, and property (P1) does not always hold because typically, the position where the trajectory crossed milestone Si influences whether it will cross next Sj rather than another milestone. In other words, the probability that x(tk)∊Sj given that x(tk−1)∊Si depends on where x(tk−1) is located on Si [if we knew x(tk−1), we could deduce the probability that x(tk)∊Sj since x(t) is a Markov process, but the exact location of x(tk−1) is precisely what milestoning aims at coarse graining]. This issue, however, obviously disappears and property (P1) will hold if one can find sets of milestones such that the following property holds:

(P1′) The probability to reach first Sj from Si is the same from any starting point on Si. The main result of this section is to show that such sets of optimal milestones satisfying property (P1′) and, hence, (P1), exist.

To see why this is the case, suppose that we start with a first and a last milestone in a set of N milestones, i.e., two nonintersecting but otherwise arbitrary smooth dividing surfaces in configuration space, respectively, denoted by S1 and SN. We wish to complete this set with S2,S3,…,SN−1 in such a way that property (P1′) holds exactly for this set. To do so, let us introduce the following function:

$$q(x_0)=(\text{probability that } x(t) \text{ reaches first } S_N \text{ rather than } S_1 \text{ given that } x(0)=x_0). \tag{7}$$

Here and below, the probability is defined with respect to the realizations of the noise in Eq. 6. The function q(x) is the so-called committor function introduced in TPT to describe the reaction between a reactant state A⊂Rd and a product state B⊂Rd, which are here the two sets whose respective boundaries are S1 and SN and leave out the region in between S1 and SN (note that since S1 and SN can be chosen arbitrarily, in the present context the labels “reactant” and “product” are somewhat void of content). As we recall in Appendix A, the committor function q(x) is the solution of the partial differential equation given in Eq. A7 and the isosurfaces of q(x) foliate the region in between S1 and SN (i.e., they are nonintersecting smooth dividing surfaces in this region). The key idea behind optimal milestoning is to use some of these surfaces as milestones. Specifically, letting z1=0<z2<z3<⋯<zN=1, we define

$$S_i=\{x:q(x)=z_i\},\quad i=1,\ldots,N. \tag{8}$$

This definition is consistent with S1 and SN since q(x)=0 on S1 and q(x)=1 on SN by construction, and we claim that these surfaces are optimal milestones. To see this, look at the reaction from the extended reactant state at the left of Si−1, i.e., the set {x:q(x)⩽zi−1}, to the extended product state at the right of Si+1, i.e., the set {x:q(x)⩾zi+1}, and note that the surface Si={x:q(x)=zi} also is an isocommittor surface for this reaction. Indeed, the committor function for this reaction is related to the original q(x) as $(q(x)-z_{i-1})/(z_{i+1}-z_{i-1})$ since, by construction, this function satisfies Eq. A7 with the correct boundary conditions that it be 0 on Si−1 and 1 on Si+1. Since q(x)=zi on Si by definition, this shows that the probability to reach first Si+1 rather than Si−1 starting from any point on Si is equal to $(z_i-z_{i-1})/(z_{i+1}-z_{i-1})$ for any i=2,…,N−1 (obviously the probability to reach first S2 from S1 and the one to reach first SN−1 from SN are both 1). This implies that properties (P1′) and (P1) are indeed satisfied and gives the following expression for Eq. 1: p12=pNN−1=1 and, for i=2,…,N−1,

$$p_{ij}=\begin{cases}\dfrac{z_{i+1}-z_i}{z_{i+1}-z_{i-1}} & \text{if } j=i-1\\[4pt] \dfrac{z_i-z_{i-1}}{z_{i+1}-z_{i-1}} & \text{if } j=i+1\\[4pt] 0 & \text{otherwise.}\end{cases} \tag{9}$$
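A short sketch of Eq. 9 in code may help to fix the indices. Given the committor values z1=0<z2<⋯<zN=1 (stored with zero-based indexing), the function below fills the tridiagonal transition matrix; the function name is hypothetical.

```python
import numpy as np

def optimal_transition_matrix(z):
    """Build p_ij of Eq. 9 from the committor values z_1 = 0 < ... < z_N = 1."""
    z = np.asarray(z, dtype=float)
    N = len(z)
    p = np.zeros((N, N))
    p[0, 1] = 1.0            # from S_1 the next milestone is always S_2
    p[N - 1, N - 2] = 1.0    # from S_N the next milestone is always S_{N-1}
    for i in range(1, N - 1):
        width = z[i + 1] - z[i - 1]
        p[i, i - 1] = (z[i + 1] - z[i]) / width
        p[i, i + 1] = (z[i] - z[i - 1]) / width
    return p
```

A useful consistency check is that the interior rows satisfy zi = pi,i−1 zi−1 + pi,i+1 zi+1, i.e., the values zi are harmonic for the resulting chain, as a committor should be.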

The conclusions above rely on the properties of the exact q(x), which we can only hope to compute from Eq. A7 in systems of low dimensions. We note however that there are algorithms (see, e.g., Refs. 12, 15, 29, 30) to effectively approximate the committor surfaces in systems with much higher dimensions under suitable assumptions (more on this in Sec. 6 below).

First hitting point density and the issue of reinitialization

As explained earlier, using optimal milestones such that property (P1) holds exactly is not enough to make milestoning practical. Indeed, since we typically do not have at our disposal an infinitely long ergodic trajectory, we also need to know how to reinitiate MD trajectories from the milestones to compute the lag-time density in Eq. 2 [which is the only thing left to compute with optimal milestones since Eq. 1 is given by Eq. 9 for these milestones]. This is the issue that we discuss next.

Denote by ρij(x) the probability density of the first hitting point on Sj given that the infinitely long ergodic trajectory generated by Eq. 6 crossed Si last before Sj. As demonstrated in Appendix B, a key property of the optimal milestones defined in Eq. 8 is that ρij(x) depends only on j and not on i, i.e., we have ρij(x)≡ρj(x) with ρj(x) given by31

$$\rho_j(x)=Z_j^{-1}e^{-\beta V(x)}|\nabla q(x)|. \tag{10}$$

Here Zj is a normalization factor given by the surface integral

$$Z_j=\int_{S_j}e^{-\beta V(x)}|\nabla q(x)|\,d\sigma_{S_j}(x), \tag{11}$$

where dσSj(x) denotes the surface element on Sj (i.e., the Lebesgue measure on this surface). Because property (P1) holds with optimal milestones, one can compute the lag-time density in Eq. 2 by reinitializing MD trajectories on these milestones, and the correct probability density to perform this reinitialization on Sj is ρj(x).

The proof that Eq. 10 is the correct density of first hitting point on the optimal milestones is somewhat technical, and we defer it to Appendix B. Here let us simply stress that Eq. 10 is not the equilibrium density on Sj, which would be proportional to $e^{-\beta V(x)}$. The extra factor ∣∇q(x)∣, which appears in Eq. 10, can be understood within the framework of TPT upon noting that the probability current of reactive trajectories is proportional to $e^{-\beta V(x)}\nabla q(x)$, and so the probability flux induced by this current through the surface Sj is precisely given by Eq. 11. The first hitting point density is related to this probability flux, and this is consistent with Eq. 10 rather than with the equilibrium density. How much the density in Eq. 10 differs from the equilibrium one will be illustrated via examples in Sec. 3. In Sec. 4 we also discuss how to deal with Eq. 10 in practice.

Exact calculation of mean first passage times

As we have mentioned earlier, property (P2) is independent of (P1). In particular, even the set of optimal milestones made of isocommittor surfaces introduced in Sec. 2A will not, in general, be such that the lag times between successive transitions between milestones are statistically independent (why this is the case will be illustrated via an example in Sec. 3). This means that the reduction of the dynamics used in milestoning will not be exact in general. However, optimal milestones allow one to compute exactly certain important quantities, such as the mean first passage time from any milestone to any other one. Next we discuss why this is the case and give an explicit expression for these mean first passage times.

Consider the various instances when a long ergodic trajectory visits the milestones Sj0, Sj1,…,Sjn successively in this sequence. The duration of any one of such sequences is given by a sum of times, tj0j1+tj1j2+⋯+tjn−1jn, each of which is the lag between the crossing times of two successive milestones, tjkjk+1=tk+1−tk with x(tk)∊Sjk. If one builds the statistics of these times, one will observe that, in general, they are correlated, i.e., rather than Eq. 2 what one needs to describe their statistical properties in full is their joint probability density

$$f_{j_0j_1\cdots j_n}(s_1,s_2,\ldots,s_n)=(\text{probability density that } t_1-t_0=s_1,\ t_2-t_1=s_2,\ \ldots, \text{ and } t_n-t_{n-1}=s_n, \text{ given that } x(t_k)\in S_{j_k} \text{ for } k=0,\ldots,n)\neq f_{j_0j_1}(s_1)f_{j_1j_2}(s_2)\cdots f_{j_{n-1}j_n}(s_n). \tag{12}$$

However, regardless of whether the times tj0j1,…,tjn−1jn are statistically independent or not, Eq. 2 remains the marginal of the density 12 for any tjkjk+1, i.e., for any k=1,…,n we have

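$$f_{j_{k-1}j_k}(s_k)=\int f_{j_0j_1\cdots j_n}(s_1,s_2,\ldots,s_n)\,ds_1\cdots ds_{k-1}\,ds_{k+1}\cdots ds_n. \tag{13}$$

As a result, the average duration of such a sequence is given exactly by

$$\tau_{j_0j_1}+\tau_{j_1j_2}+\cdots+\tau_{j_{n-1}j_n}, \tag{14}$$

where τij is the first moment of Eq. 2,

$$\tau_{ij}=\int_0^{\infty}s\,f_{ij}(s)\,ds. \tag{15}$$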

In essence, evaluating the mean first passage time amounts to summing up the mean duration of sequences of transitions between milestones weighted by their probability of occurrence [an explicit expression for the mean passage time is given as the solution of Eq. 21 below]. As a result, what is required to compute these times is Eq. 15 and, as long as one uses optimal milestones [in order that property (P1) holds and to get the correct Eq. 9] and one reinitializes the trajectories according to Eq. 10 [to compute the first moment 15 correctly], it does not matter whether property (P2) holds or not. This is good news also because measuring the first moment τij rather than the full density fij(s) is easier.
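The point that only first moments matter for mean durations can be checked on a toy model. In the sketch below, which is an invented example and not one from the paper, two successive lag times share a hidden "channel" label that makes them both long or both short, so they are correlated and (P2) fails. The mean of their sum still equals the sum of the marginal means, while the variance of the sum does not equal the sum of the variances.

```python
import numpy as np

def correlated_lag_demo(n=200_000, seed=0):
    """Two correlated lag times: exact mean of the sum, non-additive variance."""
    rng = np.random.default_rng(seed)
    upper = rng.random(n) < 0.5            # hypothetical channel label shared by both lags
    scale = np.where(upper, 5.0, 1.0)      # long lags in one channel, short in the other
    s1 = rng.exponential(scale)
    s2 = rng.exponential(scale)
    return {
        "corr": float(np.corrcoef(s1, s2)[0, 1]),
        "mean_total": float(np.mean(s1 + s2)),
        "tau_sum": float(s1.mean() + s2.mean()),
        "var_total": float(np.var(s1 + s2)),
        "var_if_independent": float(np.var(s1) + np.var(s2)),
    }
```

In this toy model `mean_total` and `tau_sum` agree to rounding error, whereas `var_total` exceeds `var_if_independent` because of the positive correlation between the lags.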

Let us now give an explicit expression for the mean first passage time to a given milestone. Without loss of generality, we can assume that this milestone is the last one SN since we can always make it so by relabeling the milestone indices. Let us define

$$T_i=(\text{mean first passage time from milestone } S_i \text{ to milestone } S_N). \tag{16}$$

Thus Ti>0 for i≠N and TN=0. To compute this mean first passage time, we need to sum up the average lengths in time of all the sequences leading to SN weighted by their probability of occurrence. This is easy to do with the help of a little trick which consists in introducing a modified pij, denoted as $\hat{p}_{ij}$ and defined as

$$\hat{p}_{ij}=\begin{cases}p_{ij} & \text{if } i\neq N\\ \delta_{jN} & \text{if } i=N.\end{cases} \tag{17}$$

The modified $\hat{p}_{ij}$ turns the milestone SN into a so-called cemetery state: unlike with the original pij, with the new $\hat{p}_{ij}$ the process stays at milestone SN forever if it ever reaches this milestone. Therefore the average length in time of a path starting from Si and moving from milestone to milestone for up to n transitions but staying in SN if ever it reaches this state can be expressed as

$$T_i^{(n)}=\sum_{j_1,\ldots,j_n}(\tau_{ij_1}+\tau_{j_1j_2}+\cdots+\tau_{j_{n-1}j_n})\,\hat{p}_{ij_1}\hat{p}_{j_1j_2}\cdots\hat{p}_{j_{n-1}j_n}, \tag{18}$$

where τij is the first moment of fij(s) defined in Eq. 15 and we have set τii=0: since the only nonzero $\hat{p}_{ii}$ is $\hat{p}_{NN}=1$, only τNN enters the sum in Eq. 18, and the choice of τNN=0 guarantees that the length in time of a path stops growing as soon as it reaches SN. The mean first passage time Ti is the limit of Eq. 18 as n→∞, $T_i=\lim_{n\to\infty}T_i^{(n)}$, since this then takes into account paths of all possible lengths to reach SN. Thus expression 18 confirms that the mean first passage time to a given milestone will be exact if property (P1) is satisfied (which ensures that pij and, hence, $\hat{p}_{ij}$ contain the exact statistical information about the probability of transitions between milestones), but regardless of whether (P2) holds or not.

To compute the limit of Eq. 18 as n→∞, let us introduce some notations and denote by $T^{(n)}=(T_1^{(n)},T_2^{(n)},\ldots,T_N^{(n)})^T$ the vector with entries $T_i^{(n)}$, by 1=(1,1,…,1)T the vector of ones in RN, by $\hat{P}$ the matrix with entries $\hat{p}_{ij}$, and by Q the matrix with entries $\tau_{ij}\hat{p}_{ij}$. Then, Eq. 18 can be expressed as

$$T^{(n)}=\sum_{m=1}^{n}\hat{P}^{\,n-m}Q\hat{P}^{\,m-1}\mathbf{1}=\sum_{m=1}^{n}\hat{P}^{\,n-m}Q\mathbf{1}, \tag{19}$$

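where we used $\hat{P}\mathbf{1}=\mathbf{1}$, which follows from $\hat{P}$ being a transition probability matrix, to get the second equality. Left multiplying by $\hat{P}$, we obtain the following recurrence relation for $T^{(n)}$:

$$T^{(n+1)}=\hat{P}\,T^{(n)}+\hat{\tau}, \tag{20}$$

where $\hat{\tau}=Q\mathbf{1}$: explicitly $\hat{\tau}_i=\tau_i$ for i≠N and $\hat{\tau}_N=0$, where $\tau_i=\sum_{j=1}^{N}p_{ij}\tau_{ij}$. We can now take the limit as n→∞ to obtain

$$(\mathrm{Id}-\hat{P})\,T=\hat{\tau}, \tag{21}$$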

where T=(T1,T2,…,TN)T is the vector with entries Ti and Id is the identity matrix. This equation may seem problematic because $\det(\mathrm{Id}-\hat{P})=0$ since $\mathrm{Id}-\hat{P}$ has a zero eigenvalue associated with the eigenvector 1. However, since by definition the Nth components of the vectors $(\mathrm{Id}-\hat{P})T$ and $\hat{\tau}$ are both zero, we can remove the last equation in the set [Eq. 21], which leads to a regular set of N−1 equations for the N−1 unknowns T1,…,TN−1 which can easily be solved (and we already know that TN=0 by definition). Equation 21 agrees with the one for T derived in Refs. 2, 3 using different methods.
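The reduced linear system described above is straightforward to solve numerically. The sketch below assumes that the target is the last milestone (index N−1 in zero-based indexing), that `p` holds pij and `tau` holds τij with zeros where pij=0; the function name is hypothetical. Dropping the last equation and using TN=0 amounts to deleting the last row and column of Id−P̂.

```python
import numpy as np

def mean_first_passage_times(p, tau):
    """Solve (Id - P_hat) T = tau_hat (Eq. 21) with the last milestone as target."""
    p = np.asarray(p, dtype=float)
    tau = np.asarray(tau, dtype=float)
    N = p.shape[0]
    tau_i = (p * tau).sum(axis=1)        # tau_i = sum_j p_ij tau_ij
    A = np.eye(N - 1) - p[:-1, :-1]      # last row and column removed since T_N = 0
    T = np.zeros(N)
    T[:-1] = np.linalg.solve(A, tau_i[:-1])
    return T
```

For three milestones with p12=p32=1, p21=p23=1/2, and all nonzero τij equal to 1, the equations read T1=1+T2 and T2=1+T1/2, so T=(4,3,0).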

We conclude this section with a comment. From Eq. 21, the only quantities required to compute mean first passage times are pij and the mean time τi. As a result it does not matter what the actual shape of fij(s) is, and we could approximate this function by

$$\tilde{f}_{ij}(s)=\tau_i^{-1}\exp(-s/\tau_i) \tag{22}$$

without changing Eq. 21. Since this approximation amounts to assuming that the dynamics of the transitions between successive milestones is a continuous-time Markov process,7 it was referred to as Markovian milestoning in Ref. 3. In general, the approximation of fij(s) by Eq. 22 is invalid, but we stress again that, when used with optimal milestones, it gives the correct mean first passage times. We will come back to this observation in Sec. 6 when we compare milestoning to MSMs.
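The insensitivity of the mean first passage time to the shape of fij(s) can be checked by direct sampling. The sketch below uses an invented three-milestone chain (the numbers are illustrative, not from the paper) and compares two lag-time models with the same means: deterministic lags equal to τij, and exponential lags with mean τi as in Eq. 22. All names are hypothetical.

```python
import numpy as np

def simulate_mfpt(p, sample_lag, start, target, n_paths, rng):
    """Monte Carlo estimate of the mean first passage time from start to target."""
    total = 0.0
    for _ in range(n_paths):
        i, t = start, 0.0
        while i != target:
            j = int(rng.choice(len(p), p=p[i]))  # next milestone with probability p_ij
            t += sample_lag(i, j, rng)           # lag time for the transition i -> j
            i = j
        total += t
    return total / n_paths

def compare_lag_models(n_paths=5000, seed=1):
    """Return MFPT estimates for fixed lags and for exponential lags of mean tau_i."""
    p = np.array([[0.0, 1.0, 0.0], [0.5, 0.0, 0.5], [0.0, 1.0, 0.0]])
    tau = np.array([[0.0, 1.0, 0.0], [2.0, 0.0, 0.5], [0.0, 1.0, 0.0]])
    tau_i = (p * tau).sum(axis=1)
    rng = np.random.default_rng(seed)
    fixed = simulate_mfpt(p, lambda i, j, r: tau[i, j], 0, 2, n_paths, rng)
    markov = simulate_mfpt(p, lambda i, j, r: r.exponential(tau_i[i]), 0, 2, n_paths, rng)
    return fixed, markov
```

For this chain Eq. 21 gives T1=4.5, and both estimates should agree with that value within sampling error, even though the two lag-time distributions are very different.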

ILLUSTRATIVE EXAMPLES

Let us now illustrate the results in Sec. 2 on a few simple examples. In Fig. 2, we analyze a two dimensional example with potential

$$V(x_1,x_2)=\tfrac14(1-x_1^2)^2+\tfrac12\left(x_2+\tfrac12x_1^2\right)^2 \tag{23}$$

in a situation where β=40 and γ=1 in Eq. 6. First we consider an extreme case in which we only take three milestones, S1={x1=−0.9}, S2={x1=0}, and S3={x1=0.9}. Thus S1 and S3 mark the boundary of the reactant set A={x1<−0.9} and the product set B={x1>0.9} [i.e., S1 is also the isocommittor surface where q(x)=0, while S3 is the one where q(x)=1], and by symmetry, S2 is the isocommittor surface where q(x)=0.5 for this reaction. These three surfaces are shown in Fig. 2 atop the contour plot of the potential in Eq. 23. Also shown as gray dots are snapshots every δt=0.05 of $10^3$ trajectories generated from S2 according to Eq. 10 in which we used a numerical approximation of ∇q(x) obtained by solving Eq. A7 by finite elements (see Ref. 10 for details). The predicted density in Eq. 10 of hitting points on S1 is also shown (thick black line) and compared to the equilibrium density on S1, i.e., the density proportional to $e^{-\beta V(-0.9,x_2)}$ (dashed line). Clearly, these two densities are different, and Eq. 10 rather than the equilibrium density is the correct first hitting point density. This was also confirmed by computing this density directly from a long MD trajectory (not shown). The difference between Eq. 10 and the equilibrium density is also apparent from Fig. 3 where we compare the density of hitting points obtained by binning the location where $10^5$ trajectories hit S1 (black solid curve), with Eq. 10 (black dashed curve) and the equilibrium density (gray dot-dashed curve).
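The equilibrium density on S1 that serves as the reference curve can be tabulated directly. The sketch below assumes that the potential of Eq. 23 has the form V(x1,x2)=¼(1−x1²)²+½(x2+½x1²)², which is our reading of the equation and should be checked against the published version; the grid limits and variable names are arbitrary choices for this illustration. The density of Eq. 10 is not computed here because it requires ∣∇q∣ from a numerical solution of the committor equation.

```python
import numpy as np

def V(x1, x2):
    """Two-dimensional potential, in the form assumed for Eq. 23."""
    return 0.25 * (1.0 - x1**2) ** 2 + 0.5 * (x2 + 0.5 * x1**2) ** 2

def equilibrium_density_on_line(x1_fixed, x2_grid, beta):
    """Normalized Boltzmann density along the line x1 = x1_fixed."""
    w = np.exp(-beta * V(x1_fixed, x2_grid))
    norm = np.sum(0.5 * (w[1:] + w[:-1]) * np.diff(x2_grid))  # trapezoidal rule
    return w / norm

beta = 40.0
x2_grid = np.linspace(-1.5, 0.7, 2001)
rho_eq_S1 = equilibrium_density_on_line(-0.9, x2_grid, beta)
```

With this form of V, the equilibrium density on S1 is a Gaussian in x2 centered at x2=−½(0.9)²=−0.405 with variance 1/β.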

Figure 2.

Contour plot of the potential 23 with the three milestones S1={x1=−0.9}, S2={x1=0}, and S3={x1=0.9} (S1 and S3 are the boundaries of the reactant set {x1<−0.9} and product set {x1>0.9} where q(x)=0 and q(x)=1, respectively, and S2 is the isocommittor surface q(x)=0.5 for this reaction). The minimum energy path is also shown (dot-dashed line). The gray dots are snapshots every δt=0.05 along $10^3$ trajectories starting from points distributed on S2 according to Eq. 10, and the predicted density 10 of first hitting points on S1 (thick black line) is compared to the equilibrium density on S1 (dashed line).

Figure 3.

Comparison of the probability density of first hitting points obtained by binning the location where $10^5$ trajectories started from points in S2 distributed according to Eq. 10 hit S1 (black solid curve) with the density [Eq. 10] (black dashed curve) and with the equilibrium density (gray dot-dashed curve). We also computed the first hitting point density using a long unbiased trajectory and, up to statistical errors, it coincides with the solid curve shown in the figure.

It should be stressed that working with three milestones only is an extreme case, and it is the reason why the density 10 on S1 and S3 is so different from the equilibrium one. Indeed, S1 and S3 are the boundaries of sets A and B, which are arbitrary. As a result, we should not expect Eq. 10 to agree with the equilibrium density on S1 and S3. If, however, we include more milestones, then one can expect that the ones away from the boundaries of A and B will bend in the right direction in such a way that Eq. 10 and the equilibrium density may become more alike if ∣∇q(x)∣≈cst in the region of q(x)=zj where $e^{-\beta V(x)}$ is peaked. This approximation will be valid if one can locally approximate the isocommittor surfaces by planes and neglect curvature effects along the reaction channel. A situation where this approximation is valid is illustrated in Figs. 4 and 5. In general, however, the extra factor ∣∇q(x)∣ in Eq. 10 compared to the equilibrium density restricted to Sj will matter (as shown, e.g., in our second example below). How to account for it will be discussed some more in Sec. 6.

Figure 4.

Contour plot of the potential 23 with superimposed isocommittor surfaces used as milestones: from left to right, these surfaces are q(x)=0, q(x)=0.001, q(x)=0.01, q(x)=0.1, q(x)=0.5, q(x)=0.9, q(x)=0.99, q(x)=0.999, and q(x)=1.

Figure 5.

Comparison of the probability density 10 (solid line) on the four milestones shown as thick lines in Fig. 4 with the equilibrium probability density (dashed line). The probability densities are plotted as functions of the arc length along the milestones.

As a second illustration, let us consider an example originally proposed in Ref. 32 in which there are two reaction channels. The potential is shown in Fig. 6 and has been analyzed in detail in Ref. 10. This example shows why the time lags between transitions can be correlated even though these transitions are not. Indeed, because there are two channels (the upper and the lower ones), and the upper one contains a dynamical trap around the shallow minimum centered at (0, 1.5), the time lags between transitions in the upper channel tend to be longer than those in the lower one. Hence, a long lag between transitions is more likely to be followed by a long one (indicating that the process goes through the upper channel), and a short lag by a short one (indicating that the process goes through the lower channel). Therefore, this is a situation where property (P1) holds but (P2) does not. It should also be stressed that this is an example where the factor ∣∇q(x)∣ matters again. This is because there are two channels for the reaction, i.e., the Boltzmann factor $e^{-\beta V(x)}$ is peaked in two different regions of q(x)=zj, and even though ∣∇q(x)∣ is approximately constant in these two regions, these constants are different. This is illustrated in Fig. 6. Using the equilibrium density on q(x)=0.5 would suggest that the preferred channel is the upper one, whereas using Eq. 10 clearly shows that the lower one is preferred. This effect should be accounted for in the milestoning procedure. We come back to this point in Sec. 6.

Figure 6.

Contour plot of the three-hole potential. We use three milestones, shown as vertical lines and corresponding to (from left to right) q(x)=0, q(x)=0.5, and q(x)=1. The density in Eq. 10 (thick solid line) on the surface q(x)=0.5 shows that the lower channel is the preferred one, even though the equilibrium density (thick dashed line) is peaked in the upper channel.

OPTIMAL MILESTONING IN SYSTEMS WITH INERTIA

In this section, we discuss what happens in systems with inertia, e.g., when we replace the overdamped Eq. 6 by the Langevin equation

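$$
\begin{aligned}
\dot{x}(t) &= v(t),\\
m\dot{v}(t) &= -\nabla V(x(t)) - \gamma v(t) + \sqrt{2\beta^{-1}\gamma}\,\eta(t),
\end{aligned}
\tag{24}
$$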

where v(t) denotes the velocity, m the mass matrix, and γ the friction coefficient. Let us revisit the results of Sec. 2 in the context of Eq. 24. Much of the discussion below can be generalized, at least formally, if the Langevin terms $-\gamma v(t)+\sqrt{2\beta^{-1}\gamma}\,\eta(t)$ in Eq. 24 are replaced by another thermal bath, possibly consistent with an ensemble other than NVT.
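For readers who want to experiment with Eq. 24, the following is a minimal sketch of a time discretization in one dimension with a scalar mass. The semi-implicit Euler-Maruyama scheme, the double-well force, and all parameter values are assumptions made for this illustration; they are not the integrator or the systems used in the paper, and production MD codes use more accurate schemes.

```python
import math
import random

def langevin_step(x, v, grad_V, m, gamma, beta, dt, rng):
    """One semi-implicit Euler-Maruyama step of Eq. 24 (1D, scalar mass)."""
    # White noise integrated over dt: sqrt(2*gamma/beta) * sqrt(dt) * N(0, 1)
    kick = math.sqrt(2.0 * gamma * dt / beta) * rng.gauss(0.0, 1.0)
    v_new = v + ((-grad_V(x) - gamma * v) * dt + kick) / m
    x_new = x + v_new * dt  # the position update uses the new velocity
    return x_new, v_new

def run_langevin(x0, v0, grad_V, m, gamma, beta, dt, n_steps, seed=0):
    """Generate a trajectory [(x, v), ...] of n_steps steps."""
    rng = random.Random(seed)
    x, v = x0, v0
    traj = [(x, v)]
    for _ in range(n_steps):
        x, v = langevin_step(x, v, grad_V, m, gamma, beta, dt, rng)
        traj.append((x, v))
    return traj

def grad_double_well(x):
    """Gradient of the illustrative potential V(x) = (x^2 - 1)^2."""
    return 4.0 * x * (x * x - 1.0)

traj = run_langevin(-1.0, 0.0, grad_double_well,
                    m=1.0, gamma=1.0, beta=3.0, dt=0.01, n_steps=1000)
print("final (x, v):", traj[-1])
```

Setting `gamma=0.0` switches off both friction and noise, which gives a simple deterministic check of the update rule.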

Isocommittor surfaces as optimal milestones

Given any two nonintersecting smooth dividing surfaces in phase-space (x,v), henceforth denoted as S1 and SN, the committor function associated with these surfaces can be defined as [cf. Eq. 7]

$$q(x_0,v_0) = \big(\text{probability that } (x(t),v(t)) \text{ reaches first } S_N \text{ rather than } S_1 \text{ given that } (x(0),v(0))=(x_0,v_0)\big). \tag{25}$$

As recalled in Appendix A, q(x,v) satisfies the partial differential Eq. A9, and the isosurfaces of this function are nonintersecting smooth dividing surfaces foliating the region between S1 and SN. As a result, we can take as milestones the generalization of Eq. 8, i.e.,

$$S_i = \{(x,v) : q(x,v) = z_i\}, \qquad i=1,\ldots,N, \tag{26}$$

where z1=0<z2<z3<⋯<zN=1, and these milestones will satisfy properties (P1) and (P1′) exactly with the transition matrix pij given in Eq. 9. This shows that optimal milestoning is possible, at least in principle, even for systems governed by Eq. 24.

There are, however, two caveats to this result. The first is that the optimal milestones are now nontrivial dividing surfaces in phase space rather than the simple lift to phase space of surfaces defined primarily in configuration space. The second, discussed next in Sec. 4B, is that, unlike in the case of systems in the overdamped limit, the first hitting point density associated with these optimal milestones is not available in closed form. (Practical considerations regarding both of these issues are given in Sec. 5.)

First hitting point density and the issue of reinitialization

To understand the issue which arises with the Langevin Eq. 24 when one tries to derive the first hitting point density for this equation, it is useful to recall some results of TPT. TPT can be applied to Eq. 24, but in this case the statistical properties of the reactive trajectories depend on two committor functions: the forward one, which was defined in Eq. 25, and the backward one. The forward committor function gives the probability to reach first a product rather than a reactant state starting from point (x,v); the backward committor function gives the probability to arrive at (x,v) coming last from a reactant state rather than a product state. Unlike in the overdamped case, the backward committor function is not simply one minus the forward one: rather it is given by 1−q(x,−v), where q(x,v) is the forward committor function defined in Eq. 25.

As discussed in Sec. 4A, optimal milestones having property (P1′) and, hence, (P1) can be defined using the isosurfaces of the forward committor function. This is consistent with properties (P1) and (P1′) dealing with what happens after the process leaves a milestone, i.e., forward in time. However, when one tries to derive an explicit expression for the first hitting density of the milestones, one deals with the way the process reaches these surfaces, i.e., one must look backward in time. As a result, while it is possible to give an explicit expression for the first hitting point density of milestones defined as isosurfaces of the backward committor function (see Appendix B), such an explicit expression is not available for the optimal milestones defined in Eq. 26. To avoid confusion, let us stress that this negative conclusion does not mean that milestoning (or even optimal milestoning) is not viable in systems with inertia such as the ones governed by Eq. 24. It simply means that the exact probability density needed to reinitialize the MD trajectories on the milestones is not available in closed form, and one needs either to resort to approximations or to find ways to bypass the need for this density altogether when performing milestoning. These practical issues are discussed further in Sec. 5.

Exact calculation of mean first passage times

Provided that one uses the optimal milestones defined in Eq. 26 and the proper density to reinitialize the MD trajectories on these milestones when computing the first moment in Eq. 15, the results given in Sec. 2C apply to the present situation, i.e., Eq. 21 remains the exact equation for the mean first passage time from any milestone to the last.
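To make the role of the transition matrix and of the first moments concrete, here is a small sketch of a mean first passage time calculation. Eq. 21 is not restated in this section, so the sketch assumes the standard first-step relation T_i = τ_i + Σ_j p_ij T_j with T_N = 0, where τ_i stands for the mean time lag after hitting milestone S_i (as would be obtained from the first moment in Eq. 15). The function names and the test matrix are hypothetical and chosen only for illustration.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def mean_first_passage_times(p, tau):
    """Mean first passage times to the last milestone (treated as absorbing).

    p[i][j] : probability that milestone j is hit next after milestone i
    tau[i]  : mean time lag after hitting milestone i
    """
    n = len(tau) - 1  # unknowns T_0 .. T_{n-1}; T_n = 0 at the last milestone
    A = [[(1.0 if i == j else 0.0) - p[i][j] for j in range(n)] for i in range(n)]
    return solve_linear(A, tau[:n]) + [0.0]

# Illustrative test case: symmetric nearest-neighbor hops between four
# milestones, with the first milestone reflecting and unit mean lags.
p = [[0.0, 1.0, 0.0, 0.0],
     [0.5, 0.0, 0.5, 0.0],
     [0.0, 0.5, 0.0, 0.5],
     [0.0, 0.0, 0.0, 1.0]]
tau = [1.0, 1.0, 1.0, 1.0]
T = mean_first_passage_times(p, tau)
print(T)
```

Only the mean lags enter this calculation, not the full lag-time distributions, which is in line with the statement that the mean first passage times do not require property (P2).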

SOME PRACTICAL CONSIDERATIONS

To make the results presented in this paper practical, two main issues must be addressed. The first is how to identify the isosurfaces of the committor function to be used as optimal milestones. The second is how to reinitialize the MD trajectories on these surfaces to compute the density in Eq. 2 or its first moment in Eq. 15.

The first issue suggests combining milestoning with the string method (Refs. 11-16), which is a technique to calculate the committor function and its isosurfaces from scratch, at least in situations where the reaction channels are localized either in the original Cartesian space or when viewed in suitable collective variables. The string method can be applied either to the overdamped Eq. 6 or to the Langevin Eq. 24. In the latter case, however, it makes the approximation that the committor function q(x,v) can be approximated by a function of x only, q(x,v)≈q(x) or, more generally, by a function of some suitable collective variables $\theta(x)=(\theta_1(x),\theta_2(x),\ldots,\theta_N(x))^T$ such as dihedral angles and bond distances: q(x,v)≈Q(θ(x)) [this second approximation can also be used for the overdamped Eq. 6]. The output of the method is one or more curves, the hyperplanes perpendicular to which are local approximations near the curve of the isosurfaces of the committor function (see Refs. 5 and 16 for details). The accuracy of these approximations depends on the validity of both assumptions above: that the reaction channels are localized and that the committor function can be approximated by a function of x only. The latter assumption is difficult to justify on general grounds, but we note that a sufficient condition for it to be valid is that the overdamped Eq. 6 be a good approximation of Eq. 24 on longer time scales where the velocities have had time to decorrelate. This suggests using milestones that are separated enough that the typical travel time between milestones is longer than the velocity decorrelation time. For instance, it was observed empirically in Ref. 3 on the example of solvated alanine dipeptide that the milestones must be sufficiently separated to allow for accurate rate determination. This somewhat limits the efficiency gain achievable by milestoning, but it may not be a serious issue in practice since the velocity decorrelates in MD simulations on the subpicosecond time scale while we are interested in time scales that are at least a million times longer. For instance, termination times of picoseconds between milestones yield a computational speedup of about 1000 for Scapharca hemoglobin (Ref. 4).

Considering now the issue of how to reinitialize the MD trajectories on the milestones, for overdamped systems at least, the results of the string method can also be used to estimate the extra factor ∣∇q(x)∣ to be included in Eq. 10. The problem, however, is that such an approximation can be rather poor (approximating the gradient of a function is typically harder than approximating the function itself) and, in addition, this procedure is not an option for systems with inertia, for which the equivalent of Eq. 10 is not available in closed form. One way to get around these difficulties in overdamped systems is to assume that ∣∇q(x)∣ is approximately constant on the isosurfaces of q(x). As was illustrated in Sec. 3 via examples, this approximation is sometimes valid but not always. Another way, which is also applicable to systems with inertia, is to use milestones that are sufficiently separated so that decorrelation along the trajectory weakens the influence of the density used to initiate the trajectories on the milestones. This is the procedure that was used in the original milestoning papers (Refs. 1-4), where MD trajectories are generated on the milestones using the equilibrium density restricted to these milestones. Yet another way, applicable again both to overdamped systems and to those with inertia, is to come up with a sampling procedure that bypasses the need for the first hitting point density altogether. Such a sampling procedure will be presented in Ref. 5 and we refer the reader to this reference for details.

CONCLUDING REMARKS

In this paper, we have analyzed the assumptions underlying milestoning, namely, properties (P1) and (P2), whose specific roles and importance are different. Property (P1) states that transitions between successive milestones are statistically independent, and one of the main results of this paper is that property (P1) is satisfied exactly provided that one uses isocommittor surfaces as milestones. Property (P2), on the other hand, does not hold exactly in general, i.e., it is not true that the time lags between successive transitions between milestones are statistically independent. However, we showed that this property is less important than (P1). In particular, it was established that mean first passage times from any milestone to any other one can be calculated exactly if one uses milestones satisfying property (P1) and computes the first moment in Eq. 15 exactly, even if (P2) does not hold. This leads to what we called optimal milestoning. Optimal milestoning requires one to use isocommittor surfaces as milestones, which in practice suggests combining milestoning with the string method, as mentioned in Sec. 5. Optimal milestoning also requires one to know the probability density of hitting points on the milestones. This density is the one to use to reinitialize the trajectories on the milestones in order to compute the key quantities in Eqs. 1 and 2. For systems in the overdamped limit, we derived an explicit expression for this density. We also explained why such a density is not available for the optimal milestones associated with systems with inertia such as the ones governed by the Langevin Eq. 24 but, as discussed in Sec. 5, there are various ways to address this issue in practice.

The fact that, in principle at least, optimal milestoning allows one to compute exactly quantities such as mean first passage times is, we believe, a great advantage of the method. It should be stressed, however, that milestoning can remain useful even in situations where property (P1) is not satisfied exactly. In fact, the way milestoning was originally implemented in Refs. 1, 2, 3, 4, 5 is not optimal in the sense above, and yet it was demonstrated in these papers that the method does give accurate predictions of the long-time kinetics of various complicated MD systems. While it is difficult to give a precise assessment of why this is the case, it seems to be related to the use of well-separated milestones. The separation must be such that the trajectory decorrelates as it travels between successive milestones, so that properties (P1) and (P2) are approximately satisfied even though the milestones are not optimal. Clearly, using optimal milestones should remain the preferred solution, but this shows that things do not necessarily go wrong if this option is not available.

We now conclude this paper with quick comparisons of (i) milestoning versus MSMs, and (ii) milestoning versus TIS and FFS.

Milestoning versus MSM

It is useful to compare milestoning with MSMs in light of our results. MSMs have recently become popular to analyze the kinetics of complicated processes by gathering information about these processes either via short MD runs, or sometimes even in situations where the thought experiment described in the Introduction is realized, i.e., one actually has a very long MD trajectory at one's disposal. The basic idea in MSM is to reduce the dynamics to a Markov jump process between appropriately chosen states in such a way that this reduced dynamics is consistent with the actual one. Thus, at first sight, MSM seems quite similar in spirit to the Markovian milestoning procedure described at the end of Sec. 2C. There is, however, an important difference, namely, that the states used in MSM typically form a partition of phase space, whereas those used in Markovian milestoning are hypersurfaces in phase space (e.g., the states in MSM could be the regions between the milestones rather than the milestones themselves). A consequence of this is that it is hard to justify when MSMs of this type are valid and, in particular, it is not obvious how to choose the time lag at which to compute the transition probability matrix from the MD data in these models. There have been recent studies to address this issue (see, e.g., Refs. 23 and 24), but to the best of our knowledge, there are no rigorous results as to when MSMs are accurate. This situation is to be contrasted with milestoning which, provided that optimal milestones are being used, permits the exact computation of mean first passage times. We believe that this is an important advantage of milestoning (Markovian or not) over standard MSM that makes the former preferable to the latter for the analysis of the kinetics of complicated MD processes.

Milestoning versus TIS and FFS

Finally, our results show that milestoning is an interesting alternative to techniques such as TIS (Refs. 26 and 27) or FFS (Ref. 28). TIS and FFS allow one to compute exactly the flux of reactive trajectories through a given set of interfaces and thereby the mean first passage times between these interfaces. The interfaces used in TIS and FFS are, in principle, arbitrary (i.e., they do not need to be the isocommittor surfaces as in optimal milestoning). In practice, however, the efficiency of these methods will drop if one does not use appropriate surfaces. Ideally, one would like to choose these surfaces so as to optimize the computational gain offered by TIS or FFS, but how to do so is not clear. The isocommittor surfaces seem again a natural choice but, to the best of our knowledge, there is no rigorous proof that this choice is the best one. In addition, TIS and FFS are not as straightforward to parallelize as milestoning, and as a result they will typically offer smaller computational gains. It would be interesting to investigate these questions further, but we leave this for future work.

ACKNOWLEDGMENTS

We are grateful to Philipp Metzner for providing us with the committor function data in the examples of Sec. 3. We also thank the referees for their thoughtful comments that led us to rethink the results of this paper and enlarge its scope. Part of this work was performed while the authors were visiting the Erwin Schrödinger Institute (ESI) in Vienna whose support is gratefully acknowledged. This work was also partially supported by NIH under Grant No. GM59796, NSF under Grant Nos. DMS02-09959, DMS02-39625, and DMS07-08140, and ONR under Grant No. N00014-04-1-0565.

APPENDIX A: THE COMMITTOR EQUATION

Here, we give the equations for the committor function associated with the overdamped Eq. 6 and the Langevin Eq. 24. This is standard material which can be found in any textbook on stochastic processes, e.g., Sec. 5.7.B in Ref. 33. We start by deriving the equation for the committor function associated with Eq. 6. If x(t) satisfies Eq. 6, given a function u(x), Itô's change of variables formula (e.g., Sec. 3.3 in Ref. 33) asserts that

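$$\dot{u}(x(t)) = (Lu)(x(t)) + \sqrt{2(\beta\gamma)^{-1}}\,\nabla u(x(t))\cdot\eta(t). \tag{A1}$$

Here L denotes the infinitesimal generator of the diffusion associated with Eq. 6, i.e.,

$$(Lu)(x) = -\gamma^{-1}\nabla V(x)\cdot\nabla u(x) + (\beta\gamma)^{-1}\Delta u(x), \tag{A2}$$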

where Δ denotes the Laplacian. Suppose that we consider Eq. 6 in a domain Ω⊂Rd, and denote by τ the first exit time of Ω, i.e.,

$$\tau = \min\{t : x(t)\in\partial\Omega\}, \tag{A3}$$

where ∂Ω denotes the boundary of the set Ω. Integrating Eq. A1 up to time τ and taking the expectation using the property that the noise term has zero expectation, we obtain

$$\mathbb{E}\,u(x(\tau)) = u(x_0) + \mathbb{E}\int_0^{\tau}(Lu)(x(t))\,dt, \tag{A4}$$

where x0=x(0) is the initial condition of Eq. 6. So far, u was an arbitrary function. Now, let us assume that it is the solution of

$$
\begin{aligned}
(Lu)(x) &= 0 && \text{if } x\in\Omega,\\
u(x) &= f(x) && \text{if } x\in\partial\Omega,
\end{aligned}
\tag{A5}
$$

where f(x) is some given function defined on the boundary of Ω. Then Eq. A4 reduces to

$$\mathbb{E}\,f(x(\tau)) = u(x_0), \tag{A6}$$

where we used that x(τ)∊∂Ω, and so u(x(τ))=f(x(τ)) from the boundary condition in Eq. A5. Equation A6 is known as (an instance of) the Feynman–Kac formula, which gives an expression for the solution of the elliptic partial differential Eq. A5 in terms of an expectation involving the solution of the stochastic differential Eq. 6. An interesting special case is when Ω is the whole configuration space minus the union of sets A and B, $\Omega=\mathbb{R}^d\setminus(A\cup B)$, and one sets f(x)=0 if x∊∂A and f(x)=1 if x∊∂B. Denoting by q(x) instead of u(x) the solution of Eq. A5 in this special case, one sees that Eq. A5 reduces to

$$
\begin{aligned}
(Lq)(x) &= 0 && \text{if } x\in\mathbb{R}^d\setminus(A\cup B),\\
q(x) &= 1 && \text{if } x\in\partial B,\\
q(x) &= 0 && \text{if } x\in\partial A,
\end{aligned}
\tag{A7}
$$

and, from Eq. A6, the solution of this equation gives the probability that x(τ) belongs to ∂B rather than ∂A, i.e., it is the committor function defined in Eq. 7 provided that we set ∂A=S1 and ∂B=SN. Note that, using the time-reversal symmetry of the overdamped equation, q(x) also is the probability to arrive at x coming last from B rather than A, i.e., 1−q(x) is the backward committor function.
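In one dimension, with the reactant boundary at x=a and the product boundary at x=b, Eq. A7 can be integrated directly: q'(x) is proportional to $e^{\beta V(x)}$, so that q(x) is the ratio of the integrals of $e^{\beta V}$ over [a,x] and over [a,b]. The sketch below evaluates this formula by trapezoidal quadrature. The double-well potential, the parameter values, and the function names are assumptions made for this illustration and do not correspond to the examples of Sec. 3.

```python
import math

def committor_1d(V, a, b, beta, n=2001):
    """Committor of the overdamped dynamics on [a, b] with q(a)=0 and q(b)=1.

    Uses q(x) = int_a^x exp(beta V) dy / int_a^b exp(beta V) dy,
    evaluated with the trapezoidal rule on a uniform grid of n points.
    """
    h = (b - a) / (n - 1)
    xs = [a + i * h for i in range(n)]
    w = [math.exp(beta * V(x)) for x in xs]
    cum = [0.0]
    for i in range(1, n):
        cum.append(cum[-1] + 0.5 * h * (w[i - 1] + w[i]))
    return xs, [c / cum[-1] for c in cum]

def double_well(x):
    """Illustrative potential with minima at x = -1 and x = +1."""
    return (x * x - 1.0) ** 2

xs, q = committor_1d(double_well, -1.0, 1.0, beta=5.0)
for i in (0, 500, 1000, 1500, 2000):
    print(f"x = {xs[i]: .2f}   q = {q[i]:.4f}")
```

The isocommittor "surfaces" are single points in this one-dimensional setting; milestones at prescribed values zi can be located by inverting the monotone function q(x) on the grid.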

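The case of the Langevin Eq. 24 can be treated similarly; the main difference is that all calculations must now be done in phase space rather than configuration space. The generator of the diffusion associated with Eq. 24 is [cf. Eq. A2]

$$(Lu)(x,v) = v\cdot\nabla_x u(x,v) - \nabla V(x)\cdot\nabla_v u(x,v) - \gamma v\cdot\nabla_v u(x,v) + \beta^{-1}\gamma\,\Delta_v u(x,v) \tag{A8}$$

and the equation for the committor function is [cf. Eq. A7]

$$
\begin{aligned}
(Lq)(x,v) &= 0 && \text{if } (x,v)\in\mathbb{R}^{2d}\setminus(A\cup B),\\
q(x,v) &= 1 && \text{if } (x,v)\in\partial B,\\
q(x,v) &= 0 && \text{if } (x,v)\in\partial A,
\end{aligned}
\tag{A9}
$$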

where A and B are sets in phase space defining the reactant and the product states, respectively, and ∂A and ∂B are the boundaries of these sets. The only additional caveat with Eq. A9 compared to Eq. A7 is that the local vector normal to the boundaries ∂A and ∂B must have a nonzero velocity component everywhere on ∂A and ∂B except possibly on a set of zero Lebesgue measure on these sets. This condition is required in order to be able to impose the Dirichlet boundary conditions in Eq. A9 because the Laplacian term in Eq. A9 involves only v.

Finally, we note that the function q(x,v) is the forward committor function, i.e., it gives the probability to first reach B rather than A forward in time starting from point (x,v). The backward committor function, giving the probability to arrive at point (x,v) coming last from A rather than B is given by 1−q(x,−v) [and not simply 1−q(x,v)] as a result of the symmetry by time reversal of the Langevin Eq. 24.

APPENDIX B: FIRST HITTING POINT DENSITY

Here we show that the first hitting point density on milestone Sj is given by Eq. 10. Specifically, we show that if a solution of Eq. 6 is initially distributed on Sj according to ρj(x) (consistent with this density being assumed to be the one of the first hitting point on this surface), then the probability density of the first hitting point on Sj+1 (Sj−1) conditional on the process hitting this surface first is given by ρj+1(x) [ρj−1(x)]. Since this argument can be repeated to all the milestones, this is consistent with Eq. 10 being the first hitting point density on milestone Sj.

To get the probability density of the first hitting point on Sj+1 and Sj−1, we consider the probability density at time t of the solution of Eq. 6 in the region Ωj between Sj−1 and Sj+1, i.e., Ωj={x:zj−1<q(x)<zj+1} (if j=1,N, we simply set Ω1={x:q(x)<z2} and ΩN={x:zN−1<q(x)}). Denoting by pt(x) the probability density that the particle be at x at time t, pt(x) satisfies

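$$\partial_t p_t(x) = \gamma^{-1}\nabla\cdot\big(\nabla V(x)\,p_t(x) + \beta^{-1}\nabla p_t(x)\big) \tag{B1}$$

for x∊Ωj, and pt(x)=0 if x∊Sj−1 or x∊Sj+1. As initial condition for Eq. B1, we take the distribution in Rd whose density on Sj is ρj(x) (see Ref. 31),

$$p_{t=0}(x) = Z_j^{-1}e^{-\beta V(x)}|\nabla q(x)|^2\,\delta(q(x)-z_j). \tag{B2}$$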

We also set absorbing boundary conditions at Sj−1 and Sj+1, i.e., pt(x)=0 if x∊Sj−1 or x∊Sj+1. Due to the presence of these absorbing boundaries, we have that pt(x)→0 for x∊Ωj as t→∞, and this decay is associated with escape events through the boundaries Sj−1 and Sj+1. Writing Eq. B1 as

$$\partial_t p_t(x) = -\nabla\cdot J_t(x), \tag{B3}$$

where Jt(x) is the probability current

$$J_t(x) = -\gamma^{-1}\nabla V(x)\,p_t(x) - (\gamma\beta)^{-1}\nabla p_t(x), \tag{B4}$$

we see that the probability flux density at point x∊Sj−1∪Sj+1 and time t is given by

$$-\gamma^{-1}\hat{n}(x)\cdot\big(\nabla V(x)\,p_t(x) + \beta^{-1}\nabla p_t(x)\big), \tag{B5}$$

where n̂(x) denotes the unit vector pointing outward from Ωj at x∊Sj−1∪Sj+1: explicitly, n̂(x)=∇q(x)∕∣∇q(x)∣ for x∊Sj+1 and n̂(x)=−∇q(x)∕∣∇q(x)∣ for x∊Sj−1. We are interested in the time-integrated probability flux density at the point x∊Sj−1∪Sj+1 since this time-integrated flux density gives the overall probability density that the particle escapes by this point (the hitting density). To calculate this flux density, let

$$P(x) = \int_0^{\infty} p_t(x)\,dt. \tag{B6}$$

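Using the initial condition in Eq. B2 as well as $\lim_{t\to\infty}p_t(x)=0$, by integrating Eq. B1 in time from 0 to ∞, we arrive at the following equation for P(x):

$$-\rho_j(x) = \gamma^{-1}\nabla\cdot\big(\nabla V(x)\,P(x) + \beta^{-1}\nabla P(x)\big) \tag{B7}$$

for x∊Ωj, and P(x)=0 if x∊Sj−1 or x∊Sj+1. We claim that the solution of Eq. B7 is

$$
P(x) =
\begin{cases}
C_L\,(q(x)-z_{j-1})\,e^{-\beta V(x)} & \text{if } z_{j-1}\le q(x) < z_j,\\
C_R\,(z_{j+1}-q(x))\,e^{-\beta V(x)} & \text{if } z_j\le q(x)\le z_{j+1},
\end{cases}
\tag{B8}
$$

where

$$
\begin{aligned}
C_L &= \gamma Z_j^{-1}\,\frac{z_{j+1}-z_j}{z_{j+1}-z_{j-1}},\\
C_R &= \gamma Z_j^{-1}\,\frac{z_j-z_{j-1}}{z_{j+1}-z_{j-1}}.
\end{aligned}
\tag{B9}
$$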

Using Eq. A7, it is easy to see that Eq. B8 solves Eq. B7 in the region Ωj\Sj in which ρj(x)=0, that it satisfies the right boundary condition on Sj+1 and Sj−1, and that it is continuous (though not differentiable) on Sj. Thus it simply remains to establish that Eq. B8 also satisfies Eq. B7 on Sj in some appropriate sense. Specifically, we need to show that if we take δ>0 and an arbitrary test function f(x), multiply Eq. B7 by f(x), integrate the resulting equation on the region zj−δ⩽q(x)⩽zj+δ, and finally let δ→0, we get an equation which is satisfied with P(x) given by Eq. B8. Performing this calculation and using integration by parts at the right-hand side, the result is

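$$Z_j^{-1}\int_{S_j} f(x)\,e^{-\beta V(x)}|\nabla q(x)|\,d\sigma_{S_j}(x) = \gamma^{-1}(C_L+C_R)\int_{S_j} f(x)\,e^{-\beta V(x)}|\nabla q(x)|\,d\sigma_{S_j}(x), \tag{B10}$$

which is indeed satisfied for every f(x) in view of Eq. B9. Having established that Eq. B8 is the solution of Eq. B7, we can calculate the time-integrated flux at x∊Sj−1∪Sj+1 which gives the probability density that the particle exits at this point. Time-integrating Eq. B5 and using Eq. B6, this hitting density can be expressed in terms of P(x) as

$$-\gamma^{-1}\hat{n}(x)\cdot\big(\nabla V(x)\,P(x) + \beta^{-1}\nabla P(x)\big), \tag{B11}$$

which, using Eq. B8, can be expressed, on Sj−1 and Sj+1, as

$$
\begin{aligned}
\rho_L(x) &= \gamma^{-1}C_L\,e^{-\beta V(x)}|\nabla q(x)| && \text{if } x\in S_{j-1},\\
\rho_R(x) &= \gamma^{-1}C_R\,e^{-\beta V(x)}|\nabla q(x)| && \text{if } x\in S_{j+1},
\end{aligned}
\tag{B12}
$$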

which, according to Eq. 10, are indeed proportional to ρj−1(x) and ρj+1(x), respectively. In fact, the integrals of ρL(x) on Sj−1 and of ρR(x) on Sj+1 give the probabilities that the particle exits by Sj−1 or Sj+1, respectively,

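$$
\begin{aligned}
\int_{\mathbb{R}^d}\rho_L(x)\,dx &= \frac{z_{j+1}-z_j}{z_{j+1}-z_{j-1}} \equiv p_{j,j-1},\\
\int_{\mathbb{R}^d}\rho_R(x)\,dx &= \frac{z_j-z_{j-1}}{z_{j+1}-z_{j-1}} \equiv p_{j,j+1},
\end{aligned}
\tag{B13}
$$

where we used Eqs. 11 and B9 as well as the property (shown below) that for any j,k=1,…,N, we have

$$\int_{S_j}|\nabla q(x)|\,e^{-\beta V(x)}\,d\sigma_{S_j}(x) = \int_{S_k}|\nabla q(x)|\,e^{-\beta V(x)}\,d\sigma_{S_k}(x). \tag{B14}$$

Equation B13 is consistent with Eq. 9, and we see that $\rho_{j-1}(x)=\rho_L(x)/p_{j,j-1}$ and $\rho_{j+1}(x)=\rho_R(x)/p_{j,j+1}$. This shows that the hitting probability on Sj−1 conditional on exiting through Sj−1 is indeed ρj−1(x), and similarly for Sj+1.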
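The transition probabilities in Eq. B13 depend only on the committor values labeling the milestones, so they can be tabulated without any sampling. The following sketch does this for the committor values used for the milestones of Fig. 4; the function name is hypothetical and the list indices are zero-based, unlike the milestone indices in the text.

```python
def transition_probabilities(z):
    """Return {j: (p_down, p_up)} for the interior milestones, following Eq. B13.

    z : increasing committor values labeling the milestones (zero-based list)
    """
    probs = {}
    for j in range(1, len(z) - 1):
        width = z[j + 1] - z[j - 1]
        p_down = (z[j + 1] - z[j]) / width  # probability to hit milestone j-1 next
        p_up = (z[j] - z[j - 1]) / width    # probability to hit milestone j+1 next
        probs[j] = (p_down, p_up)
    return probs

# Committor values of the milestones listed in the caption of Fig. 4.
z = [0.0, 0.001, 0.01, 0.1, 0.5, 0.9, 0.99, 0.999, 1.0]
probs = transition_probabilities(z)
for j, (p_down, p_up) in probs.items():
    print(f"milestone {j}: p_down = {p_down:.4f}, p_up = {p_up:.4f}")
```

Milestones placed at logarithmically spaced committor values near 0 and 1, as in this list, lead to strongly asymmetric transition probabilities there, whereas the milestone at q=0.5 with neighbors at 0.1 and 0.9 is left in either direction with probability 1/2.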

Finally to show that Eq. B14 holds, consider

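$$I(z) = \int_{q(x)=z} e^{-\beta V(x)}|\nabla q(x)|\,d\sigma_{\{q(x)=z\}} = \int_{\mathbb{R}^d} e^{-\beta V(x)}|\nabla q(x)|^2\,\delta(q(x)-z)\,dx. \tag{B15}$$

We claim that I(z) is independent of z, which implies Eq. B14. To see this, use

$$\frac{dI(z)}{dz} = -\int_{\mathbb{R}^d} e^{-\beta V(x)}|\nabla q(x)|^2\,\delta'(q(x)-z)\,dx = -\int_{\mathbb{R}^d} e^{-\beta V(x)}\nabla q(x)\cdot\nabla\delta(q(x)-z)\,dx = \beta\gamma\int_{\mathbb{R}^d} e^{-\beta V(x)}(Lq)(x)\,\delta(q(x)-z)\,dx = 0, \tag{B16}$$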

where we integrated by parts and used Eq. A2 to get the third equality, and we used Eq. A7 to get the fourth.
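As a quick sanity check of this claim (an added illustration, not part of the original argument), consider the one-dimensional case with q(a)=0 and q(b)=1, where each surface q(x)=z reduces to a single point $x_z$ and the surface integral in Eq. B15 reduces to the integrand evaluated at that point:

```latex
\begin{aligned}
q'(x) &= \frac{e^{\beta V(x)}}{\int_a^b e^{\beta V(y)}\,dy}
  \qquad \text{(from } (Lq)(x)=0,\ q(a)=0,\ q(b)=1\text{)},\\[4pt]
I(z) &= e^{-\beta V(x_z)}\,|q'(x_z)|
      = \frac{e^{-\beta V(x_z)}\,e^{\beta V(x_z)}}{\int_a^b e^{\beta V(y)}\,dy}
      = \left(\int_a^b e^{\beta V(y)}\,dy\right)^{-1}.
\end{aligned}
```

The result does not depend on z, in agreement with Eq. B16.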

Similar calculations can be done for the Langevin Eq. 24 but, as explained in Sec. 4B, we can only get a closed form expression for the first hitting point density if we use as milestones the isosurfaces of the backward committor function rather than the forward one. For economy, let us omit these calculations since they are similar to the ones presented above and simply state the result: the first hitting point density of the surface $S_j^b=\{(x,v):q(x,v)=z_j\}$ [not to be confused with Eq. 26] is given by

$$\varrho_j(x,v) = Z_j^{-1}\,|\nabla_v q(x,v)|\,e^{-\beta H(x,v)}. \tag{B17}$$

Here Zj is the following normalization factor:

$$Z_j = \int_{S_j^b}|\nabla_v q(x,v)|\,e^{-\beta H(x,v)}\,d\sigma_{S_j^b}(x,v), \tag{B18}$$

where $d\sigma_{S_j^b}(x,v)$ denotes the surface element on $S_j^b$, and H(x,v) is the Hamiltonian of the system.

References

  1. Faradjian A. K. and Elber R., J. Chem. Phys. 120, 10880 (2004). doi:10.1063/1.1738640
  2. Shalloway D. and Faradjian A. K., J. Chem. Phys. 124, 054112 (2006). doi:10.1063/1.2161211
  3. West A. M. A., Elber R., and Shalloway D., J. Chem. Phys. 126, 145104 (2007). doi:10.1063/1.2716389
  4. Elber R., Biophys. J. 92, L85 (2007).
  5. Vanden-Eijnden E., Venturoli M., and Elber R., “Markovian milestoning with Voronoi tessellations,” J. Chem. Phys. (2008) (in preparation).
  6. Instead of taking a single long ergodic trajectory, we can also consider an ensemble of these, e.g., by considering the ensemble of solutions of Eq. 6 or 24 generated using different realizations of the noise and initial conditions drawn from the canonical distribution. The definitions in Eqs. 1 and 2 for pij and fij(s), which involve time averages, remain valid in this case, but, alternatively, we can also define these quantities using appropriate averages with respect to the ensemble of trajectories.
  7. To avoid confusion, note that assuming that the sequence (Sik,tk) is Markov with a discrete time index k=0,1,2,… does not imply that the successive crossings of the milestones can be described as a continuous-time Markov process whose state-space is the index of the last milestone crossed, say i(t). Indeed this would require that Eq. 2 is of the form $f_{ij}(s)=\tau_i^{-1}\exp(-s/\tau_i)$ for some τi>0. This is assumed in Markovian milestoning (see the remark at the end of Sec. 2C), but not in milestoning.
  8. E W. and Vanden-Eijnden E., J. Stat. Phys. 123, 503 (2006). doi:10.1007/s10955-005-9003-9
  9. Vanden-Eijnden E., in Computer Simulations in Condensed Matter: From Materials to Chemical Biology, edited by Ferrario M., Ciccotti G., and Binder K. (Springer, Berlin, 2006), Vol. 1, pp. 439–478.
  10. Metzner P., Schütte C., and Vanden-Eijnden E., J. Chem. Phys. 125, 084110 (2006). doi:10.1063/1.2335447
  11. E W., Ren W., and Vanden-Eijnden E., Phys. Rev. B 66, 052301 (2002). doi:10.1103/PhysRevB.66.052301
  12. E W., Ren W., and Vanden-Eijnden E., Chem. Phys. Lett. 413, 242 (2005). doi:10.1016/j.cplett.2005.07.084
  13. E W., Ren W., and Vanden-Eijnden E., J. Phys. Chem. B 109, 6688 (2005). doi:10.1021/jp0455430
  14. E W., Vanden-Eijnden E., and Maragakis P., J. Chem. Phys. 123, 134109 (2005). doi:10.1063/1.2013256
  15. Maragliano L., Fischer A., Vanden-Eijnden E., and Ciccotti G., J. Chem. Phys. 125, 024106 (2006). doi:10.1063/1.2212942
  16. Vanden-Eijnden E. and Venturoli M., “Revisiting the Finite Temperature String Method for the Calculation of Reaction Tubes and Free Energies,” J. Chem. Phys. (to be published).
  17. Shalloway D., J. Chem. Phys. 105, 9986 (1996). doi:10.1063/1.472830
  18. Ulitsky A. and Shalloway D., J. Chem. Phys. 109, 1670 (1998). doi:10.1063/1.476882
  19. Hummer G. and Kevrekidis I., J. Chem. Phys. 118, 10762 (2003). doi:10.1063/1.1574777
  20. Swope W., Pitera J., and Suits F., J. Phys. Chem. B 108, 6571 (2004). doi:10.1021/jp037421y
  21. Swope W. C., Pitera J. W., Suits F., Pitman M. C., Eleftheriou M., Fitch B. G., Germain R. S., Rayshubski A., Ward T. J. C., Zhestkov Y., and Zhou R. J., J. Phys. Chem. B 108, 6582 (2004). doi:10.1021/jp037422q
  22. Krivov S. and Karplus M., J. Chem. Phys. 117, 10894 (2004). doi:10.1063/1.1517606
  23. Noe F., Horenko I., Schütte C., and Smith J., J. Chem. Phys. 126, 155102 (2007). doi:10.1063/1.2714539
  24. Buchete N. and Hummer G., J. Phys. Chem. B 112, 6057 (2008).
  25. Chodera J., Singhal N., Swope W. C., Pande V. S., and Dill K. A., J. Chem. Phys. 126, 155101 (2007). doi:10.1063/1.2714538
  26. Moroni D., van Erp T. S., and Bolhuis P. G., Physica A 340, 395 (2004). doi:10.1016/j.physa.2004.04.033
  27. Moroni D., Bolhuis P. G., and van Erp T. S., J. Chem. Phys. 120, 4055 (2004). doi:10.1063/1.1644537
  28. Allen R. J., Frenkel D., and ten Wolde P. R., J. Chem. Phys. 124, 024102 (2006). doi:10.1063/1.2140273
  29. Ma A. and Dinner A. R., J. Phys. Chem. B 109, 6769 (2005). doi:10.1021/jp045546c
  30. Peters B. and Trout B. L., J. Chem. Phys. 125, 054108 (2006). doi:10.1063/1.2234477
  31. Note that Eq. 10 is a probability density function with respect to the Lebesgue measure (i.e., the surface element) dσSj(x) on Sj. Thus the support of the associated probability distribution is Sj. Formally, this distribution can be written in Rd as $Z_j^{-1}e^{-\beta V(x)}|\nabla q(x)|^2\,\delta(q(x)-z_j)\,dx$, where δ(q(x)−zj) denotes the Dirac delta function. This representation can be justified from the identity $\int_{\mathbb{R}^d}e^{-\beta V(x)}|\nabla q(x)|^2\,\delta(q(x)-z_j)\,f(x)\,dx=\int_{S_j}e^{-\beta V(x)}|\nabla q(x)|\,f(x)\,d\sigma_{S_j}(x)$, valid for any function f(x) such that the surface integral exists.
  32. Huo S. and Straub J. E., J. Chem. Phys. 107, 5000 (1997). doi:10.1063/1.474863
  33. Karatzas I. and Shreve S. E., Brownian Motion and Stochastic Calculus, Graduate Texts in Mathematics Vol. 113, 2nd ed. (Springer-Verlag, New York, 1991).

Articles from The Journal of Chemical Physics are provided here courtesy of American Institute of Physics
