Abstract
Transition path theory provides a statistical description of the dynamics of a reaction in terms of local spatial quantities. In its original formulation, it is limited to reactions that consist of trajectories flowing from a reactant set A to a product set B. We extend the basic concepts and principles of transition path theory to reactions in which trajectories exhibit a specified sequence of events and illustrate the utility of this generalization on examples.
I. INTRODUCTION
Many reactions studied today proceed through competing pathways. Understanding such reactions relies on being able to assess the relative importance of the competing pathways and how they contribute to overall rates. When the pathways are well separated, they can be treated independently, often by traditional theories that assume a well-defined activated complex (transition state) and a simple form for the underlying (free) energy landscape governing the dynamics.1,2 However, when the (observed) dynamics are stochastic, the pathways of reactions often overlap in configuration space. Approaches that treat competing pathways in a unified fashion are thus needed.
To this end, here, we build on transition path theory (TPT).3–7 The core idea of transition path theory is that the statistics of the ensemble of reactive trajectories can be related to quantities that are local in space: probability currents of reactive trajectories (henceforth, reactive currents) and committors. These quantities enable TPT to go beyond traditional theories by providing information about mechanisms. Reactive currents quantify flows in phase space. Committors, which are probabilities of reaching one metastable state before another, by definition characterize the progress of stochastic reactions.8
In its traditional formulation, TPT focuses on transitions between two metastable states. In the present paper, we extend TPT to compute statistics for sequences of events, and we show how this significantly expands its applicability. Our work builds on but goes beyond previous studies. It is closely related to history-augmented Markov state models, in which states are labeled based on the last metastable state visited.9 Separating the ensemble of reactive trajectories using these labels enables rates to be computed from the flux into a metastable state10,11 as well as reactive currents and committors from the underlying trajectories.12 Our approach generalizes the labeling strategy to sequences of arbitrary numbers of states and allows specification of not just past events but also future ones.
Our work also has connections to that of Koltai and co-workers, who extended TPT to allow trajectories to leave and enter the region connecting the metastable states to analyze trajectory segments from satellite data for drifters in the ocean.13 Specifically, they redefined committors to exclude trajectories that were not wholly within this region and noted that this approach could be used to exclude trajectories that pass through selected states. They also considered computing statistics for trajectories beginning and ending in specific portions of metastable states. Both of these developments allow statistics to be computed for subsets of reactive trajectories based on the states that they visit.
We present our work as follows. First, in Secs. II A and II B, we review how TPT expresses path statistics in terms of spatially and temporally local quantities using committors. In Sec. II C, we present a motivating example in which this is not possible within the existing framework, but it can be made possible by augmenting the stochastic process with labels that account for sequences of events. This is the key idea of the paper. While this idea is straightforward, formulating the theory in full generality requires some technical development, and readers may wish to skim Secs. II D and IV initially, focusing on the brief summaries in the first paragraphs of Secs. II D and IV. In Sec. II D, we discuss conditions of consistency and Markovianity that must be satisfied for TPT to apply to the augmented process. We also generalize committors and integrals based on them. In Sec. III, we review the most commonly computed TPT statistics and show how they can be computed in our augmented TPT framework. In Sec. IV, we introduce a procedure for constructing the augmented process from pairs of successive time points rather than full trajectory segments, and we show how processes can be composed to construct more complex ones. We summarize the operational procedure in Sec. V. Then, in Sec. VI, we illustrate our approach on two systems with multiple pathways and intermediates. In Sec. VII, possible extensions and numerical strategies for treating more complicated systems are discussed. In the Appendix, we provide a method for calculating augmented TPT statistics using a finite difference scheme. Code implementing this method is available at github.com/dinner-group/atpt.
II. FRAMEWORK
In this section, we review TPT to show how it casts statistics for reactive trajectories in terms of local quantities. Then, we present an example that cannot be treated within the traditional TPT framework and show how it can be treated by introducing an augmented process. The essential idea is that the augmented process accounts for the order of events. The challenge in implementing this idea is that, for a finite-length trajectory segment, we generally do not know the events that occur before and after it.
For clarity, we present our results in terms of a discrete-time Markov process Xt with time step Δ, but our results generalize to continuous-time processes in the limit Δ → 0. We denote the time interval r, r + Δ, …, s by r:s and a trajectory segment on this time interval by Xr:s = (Xr, Xr+Δ, …, Xs). For conciseness, we denote an infinite trajectory X−∞:∞ by X.
A. Ensemble of reactive trajectories
In both traditional and augmented TPT, statistics are computed over the ensemble of reactive trajectories. In this section, we define this ensemble and integrals over it. Here, we focus on traditional TPT, but the framework generalizes to augmented TPT immediately once we define the augmented process in Sec. II D.
Traditional TPT considers a reaction from a set A to a set B via trajectories that cross a region D. In anticipation of our augmented framework, we allow D ⊆ (A ∪ B)c as in Ref. 13. We consider a trajectory Xr:s to be reactive if its first time point Xr is in the reactant set A, its last time point Xs is in the product set B, and all intervening time points Xr+Δ:s−Δ are in the region D. Mathematically, we implement this definition through the indicator function
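$$\omega(X_{r:s}) = \mathbf{1}_{A \times D^{(s-r)/\Delta - 1} \times B}(X_{r:s}), \tag{1}$$
where
$$\mathbf{1}_{S}(x) = \begin{cases} 1 & \text{if } x \in S, \\ 0 & \text{otherwise,} \end{cases} \tag{2}$$
and $D^{n} = D \times \cdots \times D$ is the n-fold Cartesian product.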
Given (1), we define the integral over the ensemble of reactive trajectories to be
(3) |
where IX[f(X)] is the integral of f(X) over the distribution of infinite trajectories X, which we denote using the superscript X. When Xt is a stationary ergodic process and IX is the expectation EX over the distribution of infinite trajectories, as in traditional TPT, we can compute the integral in (3) from a single infinite trajectory and so IX can be omitted; however, this is not necessarily true for time-dependent processes, as in Ref. 14, or for augmented processes, as in this work. As Xt is a Markov process, we can compute this integral by sampling configurations X−T from the distribution of states at time −T and propagating until time T. The prefactor 1/(2T) ensures that (3) gives consistent results across different trajectory lengths 2T. We can then calculate expectations over the ensemble of reactive trajectories as
(4) |
where the normalization factor is the expected number of reactive trajectories that start (or end) per unit time. The integral in (3), thus, yields statistics that can be used to characterize and compare reaction pathways.
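As a minimal illustration of the definition of a reactive trajectory above, the following Python sketch scans a discretized trajectory and returns the index pairs (r, s) of its reactive segments. The function name `reactive_segments`, the single-character state labels, and the example label string are assumptions made for this sketch and do not come from the published code.

```python
def reactive_segments(labels):
    """Return index pairs (r, s) of reactive segments.

    labels: sequence of single-character state labels, with 'A' for the
    reactant set, 'B' for the product set, 'D' for the region D, and any
    other character for states outside A, B, and D (hypothetical encoding).
    A segment r:s is reactive if labels[r] == 'A', labels[s] == 'B', and
    every label strictly between r and s is 'D'.
    """
    segments = []
    start = None  # most recent time in A that can still begin a reactive segment
    for t, lab in enumerate(labels):
        if lab == 'A':
            start = t  # a later visit to A supersedes an earlier one
        elif lab == 'B':
            if start is not None:
                segments.append((start, t))
            start = None
        elif lab != 'D':
            start = None  # leaving D without reaching B discards the candidate
    return segments


dt = 1.0  # time step (arbitrary units)
labels = "ADDBDDADXDABDB"
segments = reactive_segments(labels)
# Empirical number of reactive segments per unit time along this trajectory.
rate_estimate = len(segments) / ((len(labels) - 1) * dt)
```

The estimate in the last line only counts complete segments in a finite trajectory, so it is a crude stand-in for the normalization factor discussed above.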
B. Transition path theory
In general, the ensemble of reactive trajectories can only be meaningfully interpreted through its statistics. Although these statistics can be computed directly from the ensemble of reactive trajectories, TPT enables them to be computed from other data as well by expressing them in terms of spatially and temporally local quantities.
TPT specifically considers functions that can be written as
$$\eta(X_{r:s}) = \sum_{t \in r:s-\Delta} \gamma(X_{t:t+\Delta})\,\Delta, \tag{5}$$
where γ(Xt:t+Δ) is a function of successive time points Xt and Xt+Δ. In this case, substituting (5) into (3) and exchanging the order of the sums yields
(6) |
(7) |
where from (6) to (7) we have taken the limit T → ∞ and performed a time average over t, which we denote by the superscript t. That is,
(8) |
We can, then, factor
(9) |
where
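$$\tau^{-}(t) = \max\{t' \le t : X_{t'} \notin D\}, \tag{10}$$
$$\tau^{+}(t) = \min\{t' \ge t : X_{t'} \notin D\} \tag{11}$$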
are the last exit time from Dc and the first entrance time to Dc, respectively. Equation (9) results from the identities
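$$\sum_{r \le t} \mathbf{1}_{A \times D \times \cdots \times D}(X_{r:t}) = \mathbf{1}_{A}(X_{\tau^{-}(t)}), \tag{12}$$
$$\sum_{s \ge t} \mathbf{1}_{D \times \cdots \times D \times B}(X_{t:s}) = \mathbf{1}_{B}(X_{\tau^{+}(t)}). \tag{13}$$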
We arrive at (13) by observing that only one term in the sum can be nonzero: Because D and B are disjoint, 1D×⋯×D×B(Xt:s) can be nonzero only when s = τ+ (t) is the first time t′ ≥ t that Xt′ ∉ D. Similar logic applies for (12).
Consequently, for a Markov process, (7) can be expressed in terms of only local quantities as follows:
(14) |
(15) |
where we have defined the backward and forward committors, respectively, as
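$$q_{-}(X_t) = \mathbb{E}^{X}\!\left[\mathbf{1}_{A}(X_{\tau^{-}(t)}) \mid X_t\right], \tag{16}$$
$$q_{+}(X_t) = \mathbb{E}^{X}\!\left[\mathbf{1}_{B}(X_{\tau^{+}(t)}) \mid X_t\right]. \tag{17}$$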
The backward committor q−(Xt) is the probability that Xt last came from A rather than (A ∪ D)c, and the forward committor q+(Xt) is the probability that Xt will go to B before (B ∪ D)c.
The main result of this section is (15). The advantage of (15) over (3) is that the former involves only statistics that are local in space and time. This aids in the interpretation of these statistics, and it enables their estimation from short trajectories Xt:t+Δ, thus eliminating the need for trajectories that actually cross from A to B.
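To make the local form of the statistics concrete, the following Python sketch evaluates committors and a committor-weighted average over one-step pairs for a small stationary Markov chain. The five-state lazy random walk, the function names, and the choice γ = 1 are assumptions made for this illustration; for a stationary ergodic chain, we take the time average to be the average over pairs of successive states drawn from the stationary distribution.

```python
import numpy as np


def lazy_walk(n):
    """Transition matrix of a lazy symmetric walk on states 0..n-1 (hypothetical example)."""
    P = np.zeros((n, n))
    for i in range(n):
        P[i, max(i - 1, 0)] += 0.5
        P[i, min(i + 1, n - 1)] += 0.5
    return P


def solve_committor(P, in_D, in_target):
    """Solve q = P q on D, with q = 1 on the target set and q = 0 elsewhere outside D."""
    q = in_target.astype(float)
    lhs = np.eye(in_D.sum()) - P[np.ix_(in_D, in_D)]
    rhs = P[np.ix_(in_D, in_target)].sum(axis=1)
    q[in_D] = np.linalg.solve(lhs, rhs)
    return q


def tpt_statistic(P, pi, q_minus, q_plus, gamma):
    """Stationary average of q-(x) gamma(x, y) q+(y) over one-step pairs (x, y)."""
    return np.sum(pi[:, None] * q_minus[:, None] * P * gamma * q_plus[None, :])


n = 5
P = lazy_walk(n)
pi = np.full(n, 1.0 / n)  # uniform stationary distribution, since P is symmetric
in_A = np.arange(n) == 0
in_B = np.arange(n) == n - 1
in_D = ~(in_A | in_B)

# The backward committor is the forward committor of the time-reversed chain.
P_rev = P.T * pi[None, :] / pi[:, None]
q_plus = solve_committor(P, in_D, in_B)
q_minus = solve_committor(P_rev, in_D, in_A)

# With gamma = 1, the statistic is the fraction of steps spent on reactive trajectories.
reactive_fraction = tpt_statistic(P, pi, q_minus, q_plus, np.ones((n, n)))
```

For this walk, the forward committor is linear in the state index, which provides a simple check on the linear solve. No trajectory that crosses from A to B is needed; only the one-step transition probabilities enter.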
C. A motivating reaction
To motivate our augmented framework, we consider a reaction with an intermediate state C and compute statistics only for reactive trajectories that proceed through the intermediate. The function that selects trajectories of interest is
(18) |
where D = (A ∪ B ∪ C)c, and the sum allows for any t1 and t2 satisfying r < t1 ≤ t2 < s. The sum searches for times t1 and t2, which are the first and last times that the trajectory is in C, respectively. Because determining t1 and t2 requires a search over the entire trajectory Xr:s, we cannot factor ω(Xr:s) as in (9).
Now suppose that, for each reactive trajectory segment Xr:s in the infinite trajectory X, we have a process Yt with
(19) |
An example reactive trajectory labeled with Yt is shown in Fig. 1. We can apply TPT on the augmented process Zt = (Xt, Yt) because we can write (18) in the form of (1) as
(20) |
where .
This approach suggests a general strategy. We identify events—in this case, the first time t1 and last time t2 that Xt is in C—and define a process Yt that labels these events. Then, we define reactive trajectories on the augmented state space using (1). So long as Zt satisfies the assumptions behind TPT, we can express statistics using local quantities in the same manner as in (15). In Sec. II D, we discuss conditions for this to be the case, allowing for the possibility that an infinite trajectory has multiple labelings (e.g., to account for multiple finite reactive segments).
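The labeling strategy can be sketched in Python for a single trajectory segment. The integer labels below are an assumed encoding chosen for illustration (0 at the initial point in A, 1 before the first visit to C, 2 from the first through the last visit to C, 3 after the last visit to C, and 4 at the final point in B); the function name and the single-character state labels are likewise hypothetical.

```python
def label_through_intermediate(x_labels):
    """Label a segment that reacts from A to B through the intermediate C.

    x_labels: sequence of single-character state labels 'A', 'B', 'C', 'D'
    for X_{r:s} (hypothetical encoding). Returns a list of integer labels
    with one entry per time point, or None if the segment is not a reactive
    trajectory that visits C.
    """
    n = len(x_labels)
    if n < 3 or x_labels[0] != 'A' or x_labels[-1] != 'B':
        return None
    # All intervening points must lie in C or D.
    if any(lab not in ('C', 'D') for lab in x_labels[1:-1]):
        return None
    visits = [t for t, lab in enumerate(x_labels) if lab == 'C']
    if not visits:
        return None  # reactive, but not through the intermediate
    t1, t2 = visits[0], visits[-1]  # first and last times in C

    y = []
    for t in range(n):
        if t == 0:
            y.append(0)   # initial point, in A
        elif t == n - 1:
            y.append(4)   # final point, in B
        elif t < t1:
            y.append(1)   # reacting, C not yet visited
        elif t <= t2:
            y.append(2)   # between the first and last visits to C
        else:
            y.append(3)   # reacting, C will not be visited again
    return y


y_example = label_through_intermediate("ADDCDCDB")
```

Note that the search for t1 and t2 uses the whole segment, whereas checking that a labeled segment is valid requires only comparisons of successive labels, which is what allows the augmented process to be treated with local quantities.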
D. Augmented transition path theory
In this section, we introduce a function Ω(Y|X) for constructing an ensemble of trajectories augmented with labels from the distribution of trajectories X. We first consider the case of infinite length trajectories Z = (X, Y) and then the case of finite-length trajectories Zr:s = (Xr:s, Yr:s), to which one is limited in practice. We discuss two conditions that must hold for our framework. First, Ω(Yr:s|Xr:s) must be consistent with Ω(Y|X). Second, Zt must be Markovian. Later, in Sec. IV, we detail a specific construction of Y from X that requires examining only successive pairs of time points Xt:t+Δ and Yt:t+Δ.
We now present our augmented framework. We replace Xt with the augmented process Zt = (Xt, Yt), where Yt augments Xt with information about past and future events. Often, a single Y is associated with each infinite trajectory X because the latter contains full information about the past and future of any Xt. However, cases arise in which multiple Y can be associated with a given infinite trajectory X. For example, in the motivating reaction above, we define a Y for each reactive trajectory segment in X (i.e., we consider multiple r and s). It is, thus, necessary to consider a distribution of Y, and we compute integrals over the distribution of infinite trajectories Z as
(21) |
where Ω(Y|X) is the distribution of Y (and thus Z) for a given infinite trajectory X.
This immediately yields analogs of (15), (16), and (17),
(22) |
(23) |
(24) |
where sets A, B, and D are now defined on the augmented state space, and τ− (t) and τ+ (t) are now on the augmented process.
However, we cannot yet evaluate these integrals and expectations because Yt and, thus, each time point Zt depends on the infinite trajectory X. Instead, we must convert integrals over Z to integrals over X using
(25) |
where the weight of Yr:s (and thus Zr:s) given the trajectory segment Xr:s is
(26) |
When a single Y is associated with each infinite trajectory X, Ω(Yr:s|Xr:s) is the probability of Yr:s given Xr:s and so ∫Ω(Yr:s|Xr:s) dYr:s = 1; this is not true in the general case. Equation (26) is the first requirement of our augmented framework: We must be able to convert integrals involving Ω(Y|X), which depend on the infinite trajectory X, to those involving Ω(Yr:s|Xr:s), which depend only on the finite trajectory Xr:s.
Using (25), we can then write (22), (23), and (24) as
(27) |
(28) |
(29) |
(30) |
(31) |
For q−(Zt) and q+(Zt), we excluded Yt from the variables over which we integrate because we conditioned on it. We note that when ω(Zr:s) has no dependence on Yr:s [i.e., ω(Zr:s) = ω(Xr:s)] and there is a one-to-one correspondence between X and Z [i.e., ∫Ω(Y|X) dY = 1], we can recover the traditional TPT committors as
(32) |
(33) |
In traditional TPT, Xt must be a Markov process so that, from (14) to (15), we could take expectations of 1A(Xτ−(t)) and 1B(Xτ+(t+Δ)) to obtain committors q−(Xt) and q+(Xt+Δ). For the augmented process Zt to be similarly treatable, we also require it to be a Markov process. This requirement may be surprising because Yt can depend on the future of Xt. This can be understood by observing that, for the augmented process, the probability distribution of Xt+Δ depends on both Xt and Yt. For example, for q+(Zt) in (24), the distribution of Xt+Δ:∞ conditioned on Zt is not the same as that of Xt+Δ:∞ conditioned on Xt alone, since Yt specifies that Xt must undergo certain events in the future.
Since Xt and Zt are Markov processes, we can factor the path probabilities PX[Xr:s] and PZ[Zr:s] of the original and augmented processes as follows:
(34) |
(35) |
(36) |
(37) |
As above, the superscripts indicate distributions of infinite trajectories. Thus, for example,
(38) |
is the probability of observing Xr:s with all possible semi-infinite segments X−∞:r−Δ and Xs+Δ:∞ before and after r:s, respectively; PX[X] is the probability of a specific infinite trajectory X. The probability distribution of Z is
(39) |
with c = ∫Ω(Y|X)PX[X] dZ. Therefore, from (26),
(40) |
(41) |
(42) |
To compute Ω(Yr:s|Xr:s), we divide (37) by (35) and then apply (42),
(43) |
This factorization is the second requirement of our augmented framework: We must be able to construct Ω(Y|X), which depends on the infinite trajectory X, from Ω(Yt:t+Δ|Xt:t+Δ), which can only depend on pairs of successive time points Xt:t+Δ.
III. REACTIVE STATISTICS
In this section, we discuss TPT statistics that provide information about mechanisms. These include committors, the reactive flux, the reactive density, the reactive current, and expectations over reactive trajectories that they enable computing. We present expressions for augmented TPT in the form of (22), which can be evaluated using (27). The corresponding expressions for traditional TPT can be obtained by replacing Zt with Xt. The statistics are normalized so that different reactions that are specified through different ω(Zr:s) but calculated from the same distribution of infinite trajectories X are directly comparable.
We note that augmented TPT is useful even for reactions that can be described using traditional TPT [i.e., ω(Zr:s) = ω(Xr:s)]. The augmented process allows reaction mechanisms to be resolved in more detail, since committors and other statistics can be calculated on points that depend on both past and future behaviors of trajectories. Furthermore, the addition of past and future information enables the calculation of statistics with η(Xr:s) no longer restricted to the form in (5).
Several of the statistics that we discuss yield quantities on points v in a collective variable (CV) space θ, that is, a space of functions of a subset of the coordinates. We indicate this using the subscript θ. We express these statistics on a CV space rather than the state space of Zt because, for complex systems, it is often the case that the full state space contains variables that are irrelevant to understanding the reaction. This is particularly true for the augmented state space, which must contain the information required to select reactive trajectories using (1) and compute statistics using (5), both of which rely on Yt to obtain past or future information. Nevertheless, the theory holds for the choice θ(Zt) = Zt.
A. Reactive flux
The reactive flux is the expected number of reactive trajectories that start (or end) per unit time. We can express the reactive flux in the form of (22) by choosing γ(Zt:t+Δ) so that η(Zr:s) = 1 when Zr:s is reactive. Such choices of γ(Zt:t+Δ) include 1A(Zt)/Δ and 1B(Zt+Δ)/Δ, which are nonzero only when Zt:t+Δ is the first or last step of the reactive trajectory, respectively. Consequently, we can compute the reactive flux using
(44) |
(45) |
where we have applied the identities 1A(Zt)q−(Zt) = 1A(Zt) and 1B(Zt)q+(Zt) = 1B(Zt). Equation (44) counts the number of trajectories that exit A in the time interval Δ and then react; (45) is the analog for trajectories entering B.
The reactive flux is of interest not only in its own right but also for calculating expectations over reactive trajectories,
(46) |
For example, the duration N(Zr:s) = s − r of a trajectory can be expressed in the form of (5) with γ(Zt:t+Δ) = 1, and so the expected duration of a reactive trajectory is
(47) |
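As a check on the two choices of γ described above, the following Python sketch computes the reactive flux out of A and into B, as well as the expected duration of a reactive trajectory, for a five-state lazy random walk. The chain, the variable names, and the use of closed-form committors (linear in the state index for this reversible walk) are assumptions made for this illustration, not the method of the Appendix.

```python
import numpy as np

n, dt = 5, 1.0

# Lazy symmetric walk on states 0..n-1 (hypothetical example).
P = np.zeros((n, n))
for i in range(n):
    P[i, max(i - 1, 0)] += 0.5
    P[i, min(i + 1, n - 1)] += 0.5
pi = np.full(n, 1.0 / n)  # uniform stationary distribution, since P is symmetric

in_A = np.arange(n) == 0
in_B = np.arange(n) == n - 1

# Closed-form committors for this walk: q+ is linear, and q- = 1 - q+ by reversibility.
q_plus = np.arange(n) / (n - 1)
q_minus = 1.0 - q_plus


def tpt_statistic(gamma):
    """Stationary average of q-(x) gamma(x, y) q+(y) over one-step pairs (x, y)."""
    return np.sum(pi[:, None] * q_minus[:, None] * P * gamma * q_plus[None, :])


# gamma = 1_A(Z_t) / dt counts first steps; gamma = 1_B(Z_{t+dt}) / dt counts last steps.
flux_out_of_A = tpt_statistic(in_A[:, None] / dt)
flux_into_B = tpt_statistic(in_B[None, :] / dt)

# gamma = 1 accumulates the duration s - r; dividing by the flux normalizes per trajectory.
mean_duration = tpt_statistic(np.ones((n, n))) / flux_out_of_A
```

The two fluxes agree, as they must, because each reactive trajectory has exactly one first step and one last step.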
B. Reactive density
The reactive density is the distribution of configurations that belong to reactive trajectories. For a point v in the CV space θ, the reactive density ρθ(v) is the probability that θ(Zt) = v and Zt is part of a reactive trajectory. Equivalently, it is the expected fraction of time an infinite trajectory spends reactive at v. It can be expressed in the form of (22) as
(48) |
where δ is the Dirac delta function. When computing the expectation, δv(θ(Zt)) selects the points Zt with θ(Zt) = v. The term [δv(θ(Zt)) + δv(θ(Zt+Δ))]/2 corresponds to assuming that half of the time of each step Zt:t+Δ is spent in Zt and half of the time is spent in Zt+Δ.
In turn, the reactive density can be used to evaluate (22) when γ(Zt:t+Δ) = [f(θ(Zt)) + f(θ(Zt+Δ))]/2 is a path-independent function on the CV space,
(49) |
For instance, the expected fraction of time an infinite trajectory spends reactive can be obtained by setting f(v) = 1, so that
(50) |
where we have assumed the distribution of trajectories X to be a probability distribution, so that IX[1] = 1.
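For a discrete CV, the Dirac delta function reduces to a Kronecker delta, and the reactive density becomes a histogram of committor-weighted one-step pairs. The following Python sketch illustrates this for a five-state lazy random walk with the state index as the CV; the chain, the variable names, and the closed-form committors are assumptions made for this illustration.

```python
import numpy as np

n = 5

# Lazy symmetric walk on states 0..n-1 (hypothetical example).
P = np.zeros((n, n))
for i in range(n):
    P[i, max(i - 1, 0)] += 0.5
    P[i, min(i + 1, n - 1)] += 0.5
pi = np.full(n, 1.0 / n)  # uniform stationary distribution, since P is symmetric

# Closed-form committors for A = {0}, B = {n-1}: q+ is linear, q- = 1 - q+.
q_plus = np.arange(n) / (n - 1)
q_minus = 1.0 - q_plus

theta = np.arange(n)  # CV value (bin index) of each state
n_bins = theta.max() + 1

# Weight of each one-step pair (i, j) in the ensemble of reactive trajectories.
pair_weight = pi[:, None] * q_minus[:, None] * P * q_plus[None, :]

# Assign half of each step to its initial point and half to its final point.
rho = 0.5 * (
    np.bincount(theta, weights=pair_weight.sum(axis=1), minlength=n_bins)
    + np.bincount(theta, weights=pair_weight.sum(axis=0), minlength=n_bins)
)

# Summing over the CV (f = 1) gives the fraction of time spent reactive.
reactive_fraction = rho.sum()
```

A coarser CV can be obtained by mapping several states to the same entry of `theta`.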
We note that when the CV space θ is contained in the CV space θ′, i.e., we can write θ(Zt) = ζ(θ′(Zt)) for some ζ(v′), we can calculate ρθ(v) by projecting onto θ,
(51) |
We can do the same for functions f(v′) defined on the CV space θ′. We calculate Aθ[f](v), the expected value of f(v′) at a point v in the CV space θ, conditioned on trajectories passing through that point being reactive, as
(52) |
We emphasize that f(v′) can use Yt to obtain information from the past and future of Xt, and so (52) is significantly more powerful than its traditional TPT counterpart. For instance, we can calculate the conditional mean first passage time, the expected time it takes for Zt to hit B given that Zt is part of a reactive trajectory, using (52) as discussed further in Sec. III E, whereas in traditional TPT, we would need to employ a Feynman–Kac formula (e.g., see Ref. 15).
C. Reactive current
The reactive current Jθ(v) through a point v in the CV space θ is the net flow of reactive trajectories within θ through v. It can be expressed in the form of (22) as
(53) |
Conceptually, for each pair of successive time points Zt:t+Δ that are part of a reactive trajectory, we compute the numerical derivative [θ(Zt+Δ) − θ(Zt)]/Δ and then split it equally between Zt and Zt+Δ. In fact, in the limit Δ → 0, when θ(Zt) is differentiable, (53) becomes
(54) |
which is the time derivative of θ(Zt) integrated over the distribution of reactive trajectories Zt with θ(Zt) = v.
It can be useful to compute the reactive current along the gradient of a function f(v),
(55) |
In the limit Δ → 0, for differentiable f(v), (55) is Jθ[f](v) = Jθ(v) · ∇θf(v), which we can derive by observing that the finite differences in (53) and (55) are dθ(Zt)/dt and df(θ(Zt))/dt in this limit, respectively, and by the chain rule df(θ(Zt))/dt = dθ(Zt)/dt ⋅∇θf(θ(Zt)).
Like the reactive density, we can calculate Jθ(v) and Jθ[f](v) by projecting Jθ′(v′) and Jθ′[f◦ζ](v′) onto θ,
(56) |
(57) |
where (f◦ζ) (v′) = f(ζ(v′)).
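The finite-difference form of the reactive current can be illustrated on a one-dimensional discrete CV. In the following Python sketch, each committor-weighted step contributes its displacement divided by Δ, split equally between its two end points. The five-state lazy random walk, the variable names, and the closed-form committors are assumptions made for this illustration.

```python
import numpy as np

n, dt = 5, 1.0

# Lazy symmetric walk on states 0..n-1 (hypothetical example).
P = np.zeros((n, n))
for i in range(n):
    P[i, max(i - 1, 0)] += 0.5
    P[i, min(i + 1, n - 1)] += 0.5
pi = np.full(n, 1.0 / n)  # uniform stationary distribution, since P is symmetric

# Closed-form committors for A = {0}, B = {n-1}: q+ is linear, q- = 1 - q+.
q_plus = np.arange(n) / (n - 1)
q_minus = 1.0 - q_plus

theta = np.arange(n)  # CV value of each state; also used as the bin index

# Weight of each one-step pair (i, j) in the ensemble of reactive trajectories.
pair_weight = pi[:, None] * q_minus[:, None] * P * q_plus[None, :]

# Numerical derivative of the CV along each step, weighted by the pair weight.
velocity = (theta[None, :] - theta[:, None]) / dt
weighted = pair_weight * velocity

# Split each contribution equally between the initial and final points of the step.
J = 0.5 * (
    np.bincount(theta, weights=weighted.sum(axis=1), minlength=n)
    + np.bincount(theta, weights=weighted.sum(axis=0), minlength=n)
)

# Reactive flux out of A, for comparison with the current in the interior.
flux = np.sum(pair_weight[0, :]) / dt
```

In one dimension, every reactive trajectory must pass through each interior point, so the net current there equals the reactive flux; the boundary values are halved because only one side of each boundary state carries reactive steps.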
D. Committors
The committors q−(Zt) and q+(Zt) are defined on the state space of Zt, which makes them useful for calculating other statistics but can make them hard to interpret. To address this issue, we can treat the committors as reaction coordinates and project them onto a CV space θ as Aθ[q−](v) and Aθ[q+](v). These quantities have a physical interpretation. For instance, Aθ[q+](v) is the probability that a trajectory starting at a point Zt that is drawn from configurations with θ(Zt) = v in the ensemble of reactive trajectories enters B when it first leaves D. We note that, unlike most other reactive statistics, Aθ[q−](v) and Aθ[q+](v) with θ(Zt) = Xt are not independent of the choice of Yt, even when the same ensemble of reactive trajectories is selected because the likelihood that a trajectory contributes positively to the committor and the likelihood that it is reactive are correlated.
E. Conditional mean first and last passage times
The first passage time to the product is the time it takes for a trajectory starting at time t to reach the product B, at time τ+ (t). It can be expressed as
(58) |
where . This increments by Δ for each time step backward in time when Zt ∉ B and sets when Zt ∈ B. The conditional mean first passage time, m+(Zt), is the expected first passage time to the product for a point Zt that is part of a reactive trajectory. This statistic, and higher moments of the first passage time distribution, are useful for real-time forecasting, e.g., of weather.15 To compute the conditional mean first passage time, we take the conditional expectation of (58) with respect to the reactive density using (52), i.e.,
(59) |
where . We can also calculate more general statistics on the distribution of the first passage time. For instance, the conditional variance of the first passage time to the product is . Likewise, the last passage time from the reactant is the time it takes for a trajectory ending at time t to come from the reactant A, at time τ− (t), conditioned on Zt being part of a reactive trajectory, and can be expressed as
(60) |
where . The conditional mean last passage time from the reactant is then
(61) |
where . These statistics can be projected onto points v on a CV space θ(Zt) as Aθ[m−](v) and Aθ[m+](v).
IV. CONSTRUCTION OF THE AUGMENTED PROCESS
In this section, we describe a particularly useful way to define the augmented process: We decompose Ω(Y|X) into a product over functions of successive time points, κ(Yt:t+Δ|Xt:t+Δ). We show that Ω(Yr:s|Xr:s) so defined satisfies the required properties (26) and (43) by construction. We, then, demonstrate the construction of complex augmented processes from simpler ones by composition. Finally, we show how this machinery applies to the motivating reaction.
A. Decomposition of Ω
To apply our augmented framework, we need to construct the augmented process and, in turn, the distribution of infinite trajectories Z such that Ω(Yr:s|Xr:s) satisfies (26) and (43). One way to do so is to define Ω(Y|X), calculate Ω(Yr:s|Xr:s) from Ω(Y|X) using (26), and then verify that (43) holds. In this section, we present an alternative approach. We specify the augmented process through a function of pairs of successive time points, κ(Yt:t+Δ|Xt:t+Δ), and then use it to calculate Ω(Yr:s|Xr:s). This procedure satisfies (26) and (43) by construction.
We start by using the pair structure of (43) to factor
(62) |
It follows immediately that
(63) |
where
(64) |
This cleanly separates terms that depend on the past X−∞:r and future Xs:∞ from those that depend on the trajectory segment Xr:s.
Using (63), we can compute (26) as
(65) |
where we have defined
(66) |
(67) |
We describe how to calculate k−(Yt|Xt) and k+(Yt|Xt) in Sec. V. The case r = s = t is the weight of Yt given Xt,
(68) |
We can verify that the resulting Zt is a Markov process by substituting (65) and (68) into (43).
B. Building augmented processes by composition
The factorization in (62) has a number of advantages over (43). One is that it facilitates deriving useful expressions for treating multiple augmented processes. If we have Ω(Y|X) of the form
(69) |
where n labels different augmented processes, we can factor both sides by (62) to obtain
(70) |
which involves only successive time points. Each of the terms defines a process using information from the original process Xt and other processes , which we denote as . We can use this to combine multiple augmented processes, which may be defined independently or hierarchically.
For example, consider the augmented process , where is used to define pathways and is defined using . To compute statistics on the first and last passage times, we can augment this process with the augmented processes in (58) and (60) by defining
(71) |
(72) |
The combined process is then specified by
(73) |
This example furthermore shows how (62) allows forward-in-time and backward-in-time augmented processes to be treated in a unified manner and combined, which is not straightforward with (43).
C. Augmented process for the motivating reaction
We can also use (70) to construct augmented processes by combining simpler augmented processes. Here, we detail a possible construction of the augmented process (19) as a composite of three augmented processes. First, we define an augmented process that selects all reactive trajectories regardless of pathway,
(74) |
For a reactive trajectory Xr:s from A to B, splits the infinite trajectory X into three parts: time points X−∞:r before the reaction , time points Xr+Δ:s−Δ during the reaction , and time points Xs:∞ after the reaction . We list the possible transitions of this augmented process in Fig. 2(a). The nodes are sets in which may belong; an arrow from one set to another indicates that may transition to when is in the first set and is in the second set. For instance, the fourth term in (74) corresponds to the arrow from to .
Next, we employ additional augmented processes to find the first and last times t1 and t2 that the trajectory is in C. Using (74), we define the processes
(75) |
(76) |
During the reaction (i.e., ), for times t ≥ t1 and for times t ≤ t2. We can write (75) and (76) as
(77) |
(78) |
We can then combine (74), (77), and (78) using (70),
(79) |
where Zt = (Xt, Yt), and to match (19), we have merged , , and into
(80) |
We show the possible transitions of this augmented process in Fig. 2(b).
In Fig. 2(c), we illustrate the determination of Yr:s from κ(Yt:t+Δ|Xt:t+Δ) for the trajectory Xr:s in Fig. 1. For each time point Xt, we list the possible values of Yt from (79). We then stitch together trajectories by connecting successive time point pairs Zt:t+Δ and following the arrows in Fig. 2(b). The black elements in Fig. 2(c) indicate step pairs Yt:t+Δ in trajectories Yr:s that satisfy κ(Yr:s|Xr:s) = 1, and the gray elements indicate step pairs Yt:t+Δ that satisfy κ(Yt:t+Δ|Xt:t+Δ) = 1 but do not belong to any Yr:s that satisfies κ(Yr:s|Xr:s) = 1.
There is one Y associated with each reactive trajectory segment in X. The diagonal path in Fig. 2(c) represents one such reactive trajectory segment and can be selected using augmented TPT. The horizontal paths are collections of augmented processes corresponding to times before each future reaction (Yt = 0) and after each past reaction (Yt = 4).
V. ALGORITHM
In this section, we summarize the operational aspects of the method. For the numerical examples that we consider in the present paper, we evaluate integrals of the form in (3) using a finite difference approximation, which we detail in the Appendix; more complex systems can be treated by extending the approach in Refs. 16 and 17, which we leave for future work. In terms of the finite difference approximation, the algorithm for evaluating these statistics is as follows:
1. Define κ(Yt:t+Δ|Xt:t+Δ), ω(Zr:s), and γ(Zt:t+Δ) for the statistic of interest.
2. Compute k−(Yt|Xt) and k+(Yt|Xt), which account for weights associated with the past and future segments of trajectories. To this end, we express (66) and (67) as
(81) |
(82) |
and solve these equations using (A9) and (A10).
3. Compute Ω(Yt:t+Δ|Xt:t+Δ) and Ω(Yt|Xt) by (65) and (68), respectively.
4. Compute q−(Zt) and q+(Zt). To this end, we express the committors as solutions to boundary value problems. For Zt ∈ D,
(83) |
(84) |
(85) |
(86) |
For Zt∉D, q−(Zt) = 1A(Zt) and q+(Zt) = 1B(Zt). Above, (83) and (85) result from applying the identities (12) and (13) to the definitions (16) and (17), and (84) and (86) follow, in turn, from (29) and (31). We solve (84) and (86) using (A9) and (A10).
VI. NUMERICAL EXAMPLES
In this section, we demonstrate our augmented framework on simple examples that make the limitations of traditional TPT apparent. The examples we consider employ overdamped Langevin dynamics on a potential U(x) and satisfy the Fokker–Planck equation given by
(87) |
We calculate all statistics using a quadrature scheme adapted from Ref. 16, which we detail in the Appendix.
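Although the statistics in this work are computed on a grid, trajectories of overdamped Langevin dynamics can also be generated directly, for example, to check grid-based results by sampling. The following Python sketch uses the Euler–Maruyama discretization with unit friction, which is an assumed convention; the one-dimensional double-well potential, the function names, and all parameter values are hypothetical and are not the potentials studied below.

```python
import numpy as np


def grad_U(x):
    """Gradient of the hypothetical double-well potential U(x) = sum((x**2 - 1)**2)."""
    return 4.0 * x * (x**2 - 1.0)


def euler_maruyama(x0, grad, beta, dt, n_steps, rng):
    """Integrate dX = -grad U(X) dt + sqrt(2 / beta) dW with the Euler-Maruyama scheme.

    Returns an array of shape (n_steps + 1, len(x0)) containing the trajectory.
    """
    x = np.empty((n_steps + 1, len(x0)))
    x[0] = x0
    noise_scale = np.sqrt(2.0 * dt / beta)  # standard deviation of each noise increment
    for k in range(n_steps):
        noise = rng.standard_normal(len(x0))
        x[k + 1] = x[k] - grad(x[k]) * dt + noise_scale * noise
    return x


rng = np.random.default_rng(0)
traj = euler_maruyama(np.array([-1.0]), grad_U, beta=2.0, dt=1e-3, n_steps=10_000, rng=rng)
```

The time step must be small relative to the stiffness of the potential for the explicit scheme to remain stable.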
A. Reaction through an intermediate
In our first example, we demonstrate the use of augmented TPT to resolve individual reaction steps. We consider a reaction through an intermediate with the three-dimensional potential
(88) |
where x = (x1, x2, x3). We visualize the U(x) = −3 isosurface and the probability density on the CV space θ(x) = (x1, x2) in Fig. 3.
Our reaction of interest is described by the indicator function
(89) |
where we have defined the reactant A, product B, and intermediate C to be
(90) |
and D = (A ∪ B ∪ C)c. This reaction represents, for instance, a catalyzed reaction where the interaction of the substrate with the catalyst (represented by x1) and a substrate internal coordinate (represented by x2) can be observed while the status of the reaction (represented by x3) cannot. The observable variables form the CV space (x1, x2), and the sets A, B, and C are defined on this CV space.
There are two pathways in this reaction: uncatalyzed and catalyzed. In the uncatalyzed pathway, the system transitions from the reactant A to S1, then crosses directly to S2 before entering the product B. In the catalyzed pathway, instead of directly crossing from S1 to S2, the system transitions from S1 into the intermediate C and then to S2.
We select trajectories that react through each pathway by applying augmented TPT to the reaction. We define an augmented process for each of the pathways by including only the terms from (79) that are involved in that pathway. For the uncatalyzed pathway, we remove all reactive trajectories that visit the intermediate C by removing all terms that contain from (79), yielding
(91) |
and then select reactive trajectories using
(92) |
For the catalyzed pathway, we retain only reactive trajectories that pass through the intermediate C by removing all terms that contain as well as the direct transition from to , which does not pass through C. This yields the augmented process
(93) |
We then select reactive trajectories using
(94) |
where . We show the possible transitions of (91) in Fig. 4(a) and (93) in Fig. 4(b).
Our goal is to visualize the mechanism of the reaction in the CV space (x1, x2) and quantify the relative rates of the two pathways. To this end, we examine four reactive statistics: the reactive density ρθ, the reactive current Jθ, the forward committor Aθ[q+], and the conditional mean first passage time to the product Aθ[m+].
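As a point of reference for the forward committor and the reactive flux, the following minimal sketch computes both for a discrete-time Markov chain, in the manner of TPT for Markov jump processes. The lazy random walk, the function names, and the choice of A and B are hypothetical and serve only as an illustration; they are not the three-dimensional model studied here.

```python
import numpy as np

def forward_committor(P, in_A, in_B):
    """Solve q = P q on D = (A ∪ B)^c with q = 0 on A and q = 1 on B."""
    in_D = ~(in_A | in_B)
    q = np.zeros(P.shape[0])
    q[in_B] = 1.0
    # Restrict (I - P) to D; the known values on B move to the right-hand side.
    lhs = np.eye(int(in_D.sum())) - P[np.ix_(in_D, in_D)]
    rhs = P[np.ix_(in_D, in_B)].sum(axis=1)
    q[in_D] = np.linalg.solve(lhs, rhs)
    return q

def reactive_flux(P, pi, q_plus, in_A):
    """Stationary probability per step of leaving A on a path that hits B before A."""
    return float(np.sum(pi[in_A][:, None] * P[in_A, :] * q_plus[None, :]))

# Toy model (hypothetical): lazy symmetric random walk on states 0..n-1.
n = 6
P = np.zeros((n, n))
for i in range(n):
    if i > 0:
        P[i, i - 1] = 0.5
    if i < n - 1:
        P[i, i + 1] = 0.5
    P[i, i] = 1.0 - P[i].sum()  # end states stay put with probability 1/2

pi = np.full(n, 1.0 / n)        # P is symmetric, so the stationary law is uniform
in_A = np.arange(n) == 0
in_B = np.arange(n) == n - 1

q_plus = forward_committor(P, in_A, in_B)
flux = reactive_flux(P, pi, q_plus, in_A)
print(q_plus, flux)
```

For this walk the committor is linear in the state index, which gives a simple check on the linear solve.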
We plot the reactive statistics from traditional TPT in Fig. 5(a). The reactive density ρθ reveals that reactive trajectories spend much of their time around (x1, x2) = (0, 0) and (x1, x2) = (2, 0), which is consistent with the presence of intermediates S1/S2 and C. The reactive current Jθ (vector field) suggests that the reaction is dominated by the uncatalyzed pathway, although a significant fraction does react through the catalyzed pathway. The forward committor Aθ[q+] changes rapidly around (x1, x2) = (0, 0), suggesting the presence of a bottleneck, corresponding to direct crossing from S1 to S2. It is almost uniform around C, suggesting the presence of the intermediate C. The conditional mean first passage time Aθ[m+] can be interpreted in the same way as the forward committor; however, it allows us to visualize the order in which states are visited more clearly. The region below A has a higher value of Aθ[m+] than the region around C, which suggests that reactive trajectories usually visit the former before the latter. Together, these reactive statistics suggest a cohesive picture. The reaction is dominated by the uncatalyzed pathway, which has a bottleneck around (x1, x2) = (0, 0). Reactive trajectories may leave this pathway before the bottleneck into the catalyzed pathway, which has an intermediate C, and return after the bottleneck (i.e., they appear to circumvent S1/S2).
Some of the results from traditional TPT are misleading. For example, traditional TPT suggests that the uncatalyzed pathway is dominant, yet the total reactive flux from A to B is 3.0 × 10⁻⁴, while the reactive flux for trajectories that visit C is 2.2 × 10⁻⁴, i.e., 73% of trajectories go through the intermediate. This results from the restriction of the observed coordinates to (x1, x2); in the full state space (x1, x2, x3), traditional TPT is capable of correctly resolving the two pathways. However, we note that even given (x1, x2, x3), traditional TPT cannot calculate dynamical statistics for the ensemble of trajectories that react through a particular pathway. Augmented TPT provides a solution to the overlap issue and enables the calculation of reactive statistics for individual steps of each pathway.
We first analyze the uncatalyzed pathway, which was wrongly suggested by traditional TPT to be the dominant pathway. Reactive statistics for this pathway are shown in Fig. 5(b). As we would expect, the reactive density ρθ shows that reactive trajectories spend much of their time around S1/S2, and the reactive current Jθ suggests that reactive trajectories flow directly from A to S1/S2 to B, without any notable deviation to the vicinity of C. On the uncatalyzed pathway, the forward committor Aθ[q+] changes rapidly around (x1, x2) = (0, 0) due to the transition from S1 to S2. Off the uncatalyzed pathway, the forward committor has a higher value closer to A and a lower value closer to B. This is surprising and results from slight differences in the reactive density at different values of x3. The conditional mean first passage time Aθ[m+] rapidly decreases near S1/S2, suggesting the same single bottleneck.
We now analyze the catalyzed pathway through the intermediate C, which dominates the rate. We select trajectories that react through this pathway using (94). Augmented TPT enables us to split the pathway into individual steps and thus to resolve the structure of each reaction step. The first step (Yt = 1) starts when the reactive trajectory leaves the reactant A and ends when it first enters the intermediate C. The second step (Yt = 2) starts at the first time the reactive trajectory enters C, and it ends at the last time the reactive trajectory leaves C. The third step (Yt = 3) starts when the reactive trajectory last leaves the intermediate C and ends when it enters the product B. As we now explain, separating the catalyzed pathway into these steps yields reactive statistics that lead to a different interpretation than those from traditional TPT.
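The three steps can be made concrete on a discretized trajectory. The sketch below is illustrative only: it assumes that each frame has already been assigned to one of the sets "A", "B", "C", or "D", and it uses the label 0 for every frame that is not on a catalyzed reactive segment, which is a simplification. The function name and the example trajectory are hypothetical.

```python
def label_catalyzed_steps(states):
    """Return 1, 2, or 3 for frames on a catalyzed reactive segment, else 0."""
    labels = [0] * len(states)
    last_boundary = None  # index of the most recent frame in A or B
    for t, s in enumerate(states):
        if s not in ("A", "B"):
            continue
        # A reactive segment is an excursion outside A and B that
        # starts from A and ends by entering B.
        if s == "B" and last_boundary is not None and states[last_boundary] == "A":
            inside = range(last_boundary + 1, t)
            visits = [u for u in inside if states[u] == "C"]
            if visits:  # keep only segments that pass through the intermediate
                first_in, last_in = visits[0], visits[-1]
                for u in inside:
                    if u < first_in:
                        labels[u] = 1   # before first entering C
                    elif u <= last_in:
                        labels[u] = 2   # between first entry and last exit
                    else:
                        labels[u] = 3   # after last leaving C
        last_boundary = t
    return labels

# Hypothetical trajectory: A -> D -> C -> D -> C -> D -> B -> D -> A
trajectory = ["A", "A", "D", "C", "D", "C", "D", "B", "D", "A"]
print(label_catalyzed_steps(trajectory))
```

The frame in D between the two visits to C receives the label 2 only because the trajectory later re-enters C. This is the sense in which the labels depend on future as well as past events.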
In the first step, the reactive density ρθ and reactive current Jθ clearly show that most reactive trajectories flow through an intermediate near (x1, x2) = (0, 0), in this case S1, rather than a more direct path from A to C, as suggested by the reactive current from traditional TPT. Likewise, in the last step, they show that most reactive trajectories flow through an intermediate near (x1, x2) = (0, 0), in this case S2. The absence of any significant reactive current in the second step, along with the high reactive density near C, suggests that reactive trajectories predominantly remain in C during this step, with a few trajectories transitioning back and forth to S1/S2, where there is a lower value of reactive density. The reactive current from traditional TPT is misleading because the flows from S1 to C and from C to S2 cancel each other, since S1 and S2 overlap in the CV space.
The forward committor Aθ[q+] on the catalyzed pathway is uniformly low in the first step and uniformly high in the third step. This suggests that the main bottleneck occurs in the second step, where Aθ[q+] ≈ 0.5 around C. The abrupt changes between steps suggest that the dynamics of the variables not captured within the CV space are influential in determining whether the reaction occurs. For the second step, we note that the low value of Aθ[q+] below A and high value above B reflect the full three-dimensional potential [Fig. 3(a)]. In traditional TPT, the low value of Aθ[q+] in the first step and the high value in the third step cancel, giving rise to the apparent rapid change near (x1, x2) = (0, 0).
In the first step, Aθ[m+] decreases from (x1, x2) = (0, 0) to C, suggesting the presence of a bottleneck between the intermediates S1 and C. The same holds for the second step, suggesting that if the system crosses back to an intermediate near (x1, x2) = (0, 0), it needs to overcome the same bottleneck to return to C. In the third step, Aθ[m+] decreases from (x1, x2) = (0, 0) to B, marking the transition from S2 to B. We note that this decrease occurs at a slightly lower value than in the uncatalyzed pathway. This separation of the two bottlenecks lies in contrast with Aθ[m+] from traditional TPT, where the superposition of the two bottlenecks creates an apparent bottleneck near (x1, x2) = (0, 0), which conflates the dynamics of the catalyzed and uncatalyzed pathways.
Overall, we see that the statistics from traditional TPT qualitatively resemble a superposition of those for the uncatalyzed pathway and those associated with the second step of the catalyzed pathway, in which the system is mainly localized at C. The important contributions from the first and third steps of the catalyzed pathway mask each other in traditional TPT.
B. Reaction with multiple pathways
For our second example, we demonstrate the use of augmented TPT to separate pathways that overlap in the CV space. We consider overdamped Langevin dynamics on the three-dimensional potential
(95) |
where x = (x1, x2, x3). As previously, these dynamics satisfy the Fokker–Planck equation in (87). The U(x) = −3 isosurface for this potential is shown in Fig. 6(a), and the probability distribution on the (x1, x2) coordinates, which we use as CVs, is shown in Fig. 6(b). We define the reactant A, product B, and intermediates C1, C2, C3, and C4 to be
(96) |
We also define the sets C = C1 ∪ C2 ∪ C3 ∪ C4 and D = (A ∪ B ∪ C)^c. The reaction of interest is specified through the indicator function
(97) |
We use the intermediates to define pathways. This is advantageous because it is not possible to divide the space into regions corresponding to different pathways owing to overlap. We define four pathways: major pathways I and II and minor pathways III and IV. We define pathway I to first hit intermediate Ci = C1 after leaving A and last hit intermediate Cj = C4 before hitting B. We likewise define pathway II with (Ci, Cj) = (C2, C3), pathway III with (Ci, Cj) = (C1, C3), and pathway IV with (Ci, Cj) = (C2, C4).
To define the means of selecting these pathways, we label a trajectory before the reaction with Yt = 0 and after the reaction with Yt = 4. When the reaction is in process, we label times before the reactive trajectory first enters Ci with Yt = 1, after the reactive trajectory last exits Cj with Yt = 3, and between those times with Yt = 2. Then, we select pathways using
(98) |
where . This choice of Yt corresponds to
(99) |
The first five terms of (99) denote the sets in which Xt may be for each of the labels and the remaining six terms describe the permitted transitions between the labels. We represent (99) visually in Fig. 7. For instance, the seventh term, , corresponds to the single timestep transition from A to Ci and associates this with a change in the label from Yt = 0 to Yt = 2.
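The pathway definitions, which depend only on the first and last intermediates visited, can be illustrated with a short classifier. This is a minimal sketch that assumes each reactive segment has already been extracted and discretized into set labels. The function name and the example segments are hypothetical and are not data from this study.

```python
from collections import Counter

INTERMEDIATES = {"C1", "C2", "C3", "C4"}
PATHWAYS = {
    ("C1", "C4"): "I",
    ("C2", "C3"): "II",
    ("C1", "C3"): "III",
    ("C2", "C4"): "IV",
}

def classify_pathway(segment):
    """Return the pathway name of one reactive segment, or None if it fits none.

    `segment` lists the set occupied at each frame strictly between leaving A
    and entering B ("D" marks frames outside every defined set).
    """
    hits = [s for s in segment if s in INTERMEDIATES]
    if not hits:
        return None
    # Only the first and last intermediates visited decide the pathway.
    return PATHWAYS.get((hits[0], hits[-1]))

# Hypothetical reactive segments, for illustration only.
segments = [
    ["D", "C1", "D", "C4", "D"],
    ["D", "C2", "D", "D", "C3"],
    ["D", "C1", "D", "C2", "D", "C3", "D"],
    ["D", "C2", "D", "C4"],
    ["D", "C1", "D"],
]
counts = Counter(classify_pathway(seg) for seg in segments)
print(counts)
```

The third segment is assigned to pathway III even though it visits C2 along the way, and the last segment, which touches a single intermediate, matches none of the four pathways.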
We determine the reactive flux associated with each pathway and compare it with the total reactive flux. For the reaction specified by (97), the total reactive flux is 7.5 × 10⁻⁴. Each of the major pathways has a reactive flux of 2.9 × 10⁻⁴, which is 39% of the total reactive flux. Each of the minor pathways has a reactive flux of 6.4 × 10⁻⁵, which is 8.5% of the total reactive flux. These four pathways, thus, give rise to 95% of the total reactive flux and, so, are representative of the majority of reactive trajectories. The remaining 5% results from trajectories that do not conform to these pathways (e.g., ones that pass through only a single intermediate).
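The percentages quoted above follow from simple ratios of the fluxes, as the sketch below shows. The minor-pathway flux is taken as 6.4 × 10⁻⁵, the value consistent with the stated 8.5%. Because the quoted fluxes are rounded to two significant figures, the recomputed combined fraction matches the quoted 95% only to within rounding.

```python
total_flux = 7.5e-4
pathway_flux = {"I": 2.9e-4, "II": 2.9e-4, "III": 6.4e-5, "IV": 6.4e-5}

# Fraction of the total reactive flux carried by each pathway.
fractions = {name: flux / total_flux for name, flux in pathway_flux.items()}
covered = sum(fractions.values())

for name, frac in fractions.items():
    print(f"pathway {name}: {100 * frac:.1f}% of the total reactive flux")
print(f"four pathways combined: {100 * covered:.1f}%")
```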
In Fig. 8, we plot four reactive statistics: the reactive density ρθ, the reactive current Jθ, the conditional mean last passage time from the reactant Aθ[m−], and the conditional mean first passage time to the product Aθ[m+].
Reactive statistics from traditional TPT are shown in Fig. 8(a). From the reactive density ρθ, we observe that reactive trajectories spend most of their time in the X-shaped region that connects the intermediates. The reactive current Jθ suggests that the majority of the reactive trajectories flow from the reactant A to either C1 or C2, then to either C3 or C4 via (x1, x2) = (0, 0), and finally to the product B. Importantly, even if we consider the full state space (x1, x2, x3), we cannot determine the relative weights of these four possible pathways because the pathways are composed of segments that belong to multiple pathways (e.g., the first half of pathway III overlaps with pathway I and the second half of pathway III overlaps with pathway II) and the transitions through (x1, x2, x3) = (0, 0, 0) along pathways III and IV occur in opposite directions. Other quantities calculated using traditional TPT have the same issue. Both Aθ[m−] and Aθ[m+] are unable to distinguish between the pathways and only indicate the presence of a bottleneck between C1 ∪ C2 and C3 ∪ C4.
In Fig. 8(b), we visualize the reactive statistics for the major pathways. As pathway I and pathway II are mirror images of one another, we discuss only pathway I. The reactive current Jθ clearly shows that the system transitions directly between on-pathway intermediates, from A to C1 to C4 to B. The reactive density ρθ corroborates this picture, with relatively little density in C2 and C3 compared to C1 and C4. We observe that Aθ[m−] and Aθ[m+] are highest near C2 and C3, suggesting that these configurations are dynamically disconnected from the main flow of the reactive trajectories. The transition from C1 to C4 is accompanied by an abrupt increase in Aθ[m−] and an abrupt decrease in Aθ[m+], which shows that a transition bottleneck is traversed. We note that the sharp increase in Aθ[m−] from A to C1 and the sharp decrease in Aθ[m+] from C4 to B imply the presence of bottlenecks between each of these pairs of states.
The minor pathways in Fig. 8(c) result from trajectories that switch between the major pathways. Since the two minor pathways are mirror images of each other, we discuss only pathway III, which involves a switch from pathway I to pathway II. In contrast with the major pathways, the reactive density ρθ indicates that the system is likely to visit off-pathway intermediates C2 and C4 on its way from C1 to C3. The conditional mean last passage time Aθ[m−] from the reactant is nearly identical to that of pathway I, with higher values near C2 and C3 and lower values near C1 and C4. However, the conditional mean first passage time Aθ[m+] to the product is nearly identical to that of pathway II. In conjunction with ρθ, the slight increase in Aθ[m−] from C1 to C4 suggests that these intermediates readily interconvert, and similarly for the slight increase in Aθ[m+] from C3 to C2. The larger change in Aθ[m−] and Aθ[m+] between C2 ∪ C3 and C1 ∪ C4 implies a bottleneck between C2 ∪ C3 and C1 ∪ C4. This bottleneck is significant because the transition from C1 to C3 must occur for this pathway. As with the major pathways, there are also bottlenecks between A and C1, and C3 and B.
VII. DISCUSSION
In this paper, we introduced an augmented process that labels sequences of events. This process enabled us to write statistics that depend on knowledge of past and future events in terms of quantities that are local in time and, in turn, to extend the TPT framework. We demonstrated how this framework can be used to separate statistics of competing pathways in reactions with intermediates to reveal features of mechanisms that are not apparent from traditional TPT analyses. Our framework can also be used to treat new classes of reactions that are not amenable to traditional TPT analyses. For instance, reactions with the same reactant and product states, such as cycles of oscillators and excitable systems, can be handled using augmented TPT but not traditional TPT.
Our framework generalizes a previous extension of TPT13 and history-augmented approaches for computing rates9–11 and reactive statistics.12 The augmented process that we introduce is distinct from that in Ref. 14, in which the state space is expanded to include a time variable to treat time-dependent processes, including transient relaxations and systems with periodically varying dynamics. As a result, the two approaches can be combined to treat sequences of events of finite-time processes.
Our focus here was on establishing the conceptual framework for augmented TPT, and the examples that we showed were sufficiently simple that the Fokker–Planck equations defining their dynamics could be numerically integrated in the variables by quadrature. The dynamics of models with larger numbers of variables must instead be sampled through simulations that generate stochastic realizations of trajectories (i.e., the dynamics of the variables are numerically integrated in time). Because, like traditional TPT, the framework casts statistics in terms of quantities that are local in time, we can extend methods that compute reactive statistics from short trajectories.12,16,17 Such efforts are underway.
ACKNOWLEDGMENTS
We thank Adam Antoszewski, Spencer Guo, John Strahan, and Bodhi Vani for useful discussions. We acknowledge the Research Computing Center at the University of Chicago for computational resources. This work was supported by National Institutes of Health Award No. R35 GM136381 and National Science Foundation Award No. DMS-2054306.
APPENDIX: FINITE DIFFERENCE SCHEME
For a time-reversible drift-diffusion process given by
(A1) |
with stationary distribution π(Xt) ∝ exp(−U(Xt)), the infinitesimal generator can be used to compute expectations forward-in-time as
(A2) |
To evaluate expectations by quadrature, we adapt the finite difference scheme from Ref. 16, which we reproduce here. We approximate (A1) as a discrete-time Markov jump process with time step Δ on a grid with uniform spacing ϵ. For a small change ϵi in the direction of the ith coordinate with magnitude ϵ, we substitute −∇U(Xt) = ∇π(Xt)/π(Xt) and then make the approximation
(A3) |
Alternatively, we can write
(A4) |
(A5) |
where P(Xt:t+Δ) represents the approximation of PX[Xt:t+Δ] on the grid. Above, the first equality follows from the definition of conditional expectation and the second assumes that all transitions within time Δ are to neighboring grid points.
By matching terms between (A5) and (A3), we find that the only nonzero entries of P(x, x′) are
(A6) |
(A7) |
(A8) |
Then, we can use the above expressions to estimate expectations using
(A9) |
(A10) |
(A11) |
where the sums are over points on the grid.
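To illustrate the kind of scheme described in this Appendix, the following sketch builds a nearest-neighbor transition matrix on a one-dimensional grid and evaluates two stationary expectations as sums over grid points. The jump weight 2π(x′)/[π(x) + π(x′)] is an assumption made for this illustration: it satisfies detailed balance with respect to π and recovers the generator −∇U · ∇ + ∇² as ϵ → 0, but it is not guaranteed to coincide with (A6)–(A8). The double-well potential, grid, and function name are likewise hypothetical.

```python
import numpy as np

def transition_matrix_1d(U, eps, dt):
    """Nearest-neighbor jump chain on a uniform 1D grid with potential values U."""
    pi = np.exp(-(U - U.min()))  # unnormalized stationary weights
    n = len(U)
    P = np.zeros((n, n))
    for i in range(n):
        for j in (i - 1, i + 1):
            if 0 <= j < n:
                # Assumed symmetric weighting; reversible with respect to pi.
                P[i, j] = (dt / eps**2) * 2.0 * pi[j] / (pi[i] + pi[j])
        P[i, i] = 1.0 - P[i].sum()  # remaining probability: no jump
    return P, pi / pi.sum()

# Hypothetical double-well example.
x = np.linspace(-2.0, 2.0, 81)
eps = x[1] - x[0]
dt = 0.2 * eps**2               # dt <= eps**2 / 4 keeps the diagonal nonnegative
U = (x**2 - 1.0) ** 2
P, pi = transition_matrix_1d(U, eps, dt)

# One-time expectation: sum over x of pi(x) f(x), here with f(x) = x^2.
mean_x2 = np.sum(pi * x**2)

# Two-time expectation: sum over x, x' of pi(x) P(x, x') f(x, x'),
# here the mean squared displacement over one time step.
dx2 = (x[None, :] - x[:, None]) ** 2
msd = np.sum(pi[:, None] * P * dx2)
print(mean_x2, msd / (2.0 * dt))
```

With a unit diffusion coefficient, the mean squared displacement over one step should be close to 2Δ, which provides a quick consistency check on the time step and grid spacing.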
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
Chatipat Lorpaiboon: Conceptualization (lead); Formal analysis (lead); Investigation (lead); Methodology (lead); Software (lead); Visualization (lead); Writing – original draft (lead); Writing – review & editing (equal). Jonathan Weare: Conceptualization (equal); Funding acquisition (equal); Supervision (equal); Writing – review & editing (equal). Aaron R. Dinner: Conceptualization (equal); Funding acquisition (equal); Supervision (equal); Writing – review & editing (equal).
DATA AVAILABILITY
Data sharing is not applicable to this article as no new data were created or analyzed in this study.
REFERENCES
1. Hänggi P., Talkner P., and Borkovec M., “Reaction-rate theory: Fifty years after Kramers,” Rev. Mod. Phys. 62, 251–341 (1990). doi:10.1103/revmodphys.62.251
2. Peters B., Reaction Rate Theory and Rare Events (Elsevier, 2017).
3. E W. and Vanden-Eijnden E., “Towards a theory of transition paths,” J. Stat. Phys. 123, 503–523 (2006). doi:10.1007/s10955-005-9003-9
4. Vanden-Eijnden E., “Transition path theory,” in Computer Simulations in Condensed Matter Systems: From Materials to Chemical Biology, Lecture Notes in Physics Vol. 1, edited by Ferrario M., Ciccotti G., and Binder K. (Springer, Berlin, Heidelberg, 2006), pp. 453–493.
5. E W. and Vanden-Eijnden E., “Transition-path theory and path-finding algorithms for the study of rare events,” Annu. Rev. Phys. Chem. 61, 391–420 (2010). doi:10.1146/annurev.physchem.040808.090412
6. Metzner P., Schütte C., and Vanden-Eijnden E., “Transition path theory for Markov jump processes,” Multiscale Model. Simul. 7, 1192–1219 (2009). doi:10.1137/070699500
7. Metzner P., Schütte C., and Vanden-Eijnden E., “Illustration of transition path theory on a collection of simple examples,” J. Chem. Phys. 125, 084110 (2006). doi:10.1063/1.2335447
8. Du R., Pande V. S., Grosberg A. Y., Tanaka T., and Shakhnovich E. S., “On the transition coordinate for protein folding,” J. Chem. Phys. 108, 334–350 (1998). doi:10.1063/1.475393
9. Suárez E., Adelman J. L., and Zuckerman D. M., “Accurate estimation of protein folding and unfolding times: Beyond Markov state models,” J. Chem. Theory Comput. 12, 3473–3481 (2016). doi:10.1021/acs.jctc.6b00339
10. Vanden-Eijnden E. and Venturoli M., “Exact rate calculations by trajectory parallelization and tilting,” J. Chem. Phys. 131, 044120 (2009). doi:10.1063/1.3180821
11. Dickson A., Warmflash A., and Dinner A. R., “Separating forward and backward pathways in nonequilibrium umbrella sampling,” J. Chem. Phys. 131, 154104 (2009). doi:10.1063/1.3244561
12. Vani B. P., Weare J., and Dinner A. R., “Computing transition path theory quantities with trajectory stratification,” J. Chem. Phys. 157, 034106 (2022). doi:10.1063/5.0087058
13. Miron P., Beron-Vera F. J., Helfmann L., and Koltai P., “Transition paths of marine debris and the stability of the garbage patches,” Chaos 31, 033101 (2021). doi:10.1063/5.0030535
14. Helfmann L., Ribera Borrell E., Schütte C., and Koltai P., “Extending transition path theory: Periodically driven and finite-time dynamics,” J. Nonlinear Sci. 30, 3321–3366 (2020). doi:10.1007/s00332-020-09652-7
15. Finkel J., Webber R. J., Gerber E. P., Abbot D. S., and Weare J., “Learning forecasts of rare stratospheric transitions from short simulations,” Mon. Weather Rev. 149, 3647–3669 (2021). doi:10.1175/mwr-d-21-0024.1
16. Thiede E. H., Giannakis D., Dinner A. R., and Weare J., “Galerkin approximation of dynamical quantities using trajectory data,” J. Chem. Phys. 150, 244111 (2019). doi:10.1063/1.5063730
17. Strahan J., Antoszewski A., Lorpaiboon C., Vani B. P., Weare J., and Dinner A. R., “Long-time-scale predictions from short-trajectory data: A benchmark analysis of the trp-cage miniprotein,” J. Chem. Theory Comput. 17, 2948–2963 (2021). doi:10.1021/acs.jctc.0c00933