Author manuscript; available in PMC: 2014 Sep 3.
Published in final edited form as: Mol Simul. 2014 Apr 22;40(10-11):784–793. doi: 10.1080/08927022.2014.907898

Recent developments in methods for identifying reaction coordinates

Wenjin Li 1, Ao Ma 1,*
PMCID: PMC4152980  NIHMSID: NIHMS621364  PMID: 25197161

Abstract

In the study of rare events in complex systems with many degrees of freedom, a key element is to identify the reaction coordinates of a given process. Over recent years, a number of methods and protocols have been developed to extract the reaction coordinates based on limited information from molecular dynamics simulations. In this review, we provide a brief survey of a number of major methods developed in the past decade, some of which are discussed in greater detail, to provide an overview of the problems that are partially solved and the challenges that still remain. A particular emphasis has been placed on methods for identifying reaction coordinates that are related to the committor.

Keywords: reaction coordinate, committor, molecular dynamics, rare events

1. Introduction

Many essential biological and biochemical processes, such as protein folding, conformational dynamics and enzymatic reactions, are rare events in the sense that they occur on time scales that are orders of magnitude slower than those of the elementary molecular motions. A standard simplified picture of rare events is a transition along a special degree of freedom termed the reaction coordinate between two stable states that are separated by a free energy barrier that is high compared with the thermal energy kBT. This picture has its roots in the transition state theory (TST) [1,2] and Kramers theory [3] for chemical reaction dynamics, in which the two stable states are the reactant and product states and the energy barrier locates the transition state. The transition from the reactant to the product requires overcoming the high free-energy barrier between them, which separates the time scale of reactive events from that of elementary atomic motions. The rare event nature of reactive processes and the critical role of the transition state in such processes have brought the reaction coordinate to centre stage in today's computational studies of complex systems.

In the earlier development of TST for small molecular systems, the identity of the reaction coordinates was implicitly assumed to be self-evident. The reaction dynamics of the system is then determined by the free energy profile (FEP) and the diffusion coefficient along the reaction coordinate. This situation changed dramatically when the focus of investigations on transition processes shifted to complex systems, where it was often found that the actual identities of the reaction coordinates, when they happen to be known, are more often than not counter-intuitive.

More specifically, the significance of reaction coordinates to studies of reactive dynamics of complex systems is reflected in the following aspects. First, knowledge of the correct reaction coordinates provides the fundamental details of the underlying mechanism of a given transition process. The FEP along the reaction coordinates allows us to determine the activation energy and transition states, and thus the essence of the reaction dynamics. In particular, the FEP provides a projection of the dynamics in high-dimensional space onto a few degrees of freedom that allows an intuitive and immediate grasp of a complex process. On the more practical side, reaction coordinates are also intimately related to the development of effective enhanced sampling methods. Straightforward molecular dynamics (MD) simulations spend the vast majority of simulation time sampling stable regions, whereas the more interesting transition regions are rarely visited, if at all. In order to study these rare events at the atomistic level, various enhanced sampling methods, e.g. umbrella sampling,[4] metadynamics,[5] orthogonal space sampling [6–8] and constrained dynamics,[9,10] have been developed to improve sampling of regions other than the stable basins. These methods rely on the application of a biasing potential to one or a small set of coordinates, usually termed reaction coordinates or order parameters, along which the progress of the transition can be quantified to a certain extent. In this regard, the best coordinates to bias are the true reaction coordinates, as a bias on the correct reaction coordinates will guide the simulation through the true dynamic bottleneck in the configuration space for the given process.

In spite of the importance of reaction coordinates, systematic research on how to identify them is a relatively new field and still at a rather primitive stage. The early studies on this subject bear a ‘trial-and-error’ flavour, and more systematic methods have only started to be developed recently. Several reasons contributed to this situation; two of the most prominent challenges are as follows: (1) it is computationally demanding to obtain data from MD simulations of complex macromolecular systems that are sufficient for the purpose of determining the reaction coordinates and (2) given sufficient data, correctly picking a few coordinates out of the enormous number of degrees of freedom of a complex system is in itself a challenging task.

The existing methods can be categorised from two aspects. On the one hand, they can be grouped according to the definition of reaction coordinates. In this regard, there are mainly two different views: (1) reaction coordinates should reveal the underlying mechanism of the process under study and (2) reaction coordinates should provide a reduced description of a given process that preserves some geometric or informatic metric of the configuration space of the system. Free energy-related definitions [11,12] and the committor are the prominent examples of the first group. The committor is gaining popularity as the measure of the quality of reaction coordinates due to its clear and specific relationship with reaction dynamics. The second category mainly includes dimensionality reduction-oriented methods such as Isomap,[13,14] diffusion map [15–17] and sketch-map.[18]

On the other hand, existing methods can also be categorised based on the way that the reaction coordinates are determined. Earlier methods are heavily ‘trial-and-error’ in nature: a structural coordinate is selected based on chemical and/or physical intuition, then biased molecular simulations are performed along the proposed coordinate to collect the necessary information, which is used to judge whether the proposed coordinate is a reaction coordinate or not. As the collected information is specific to the selected coordinate, it cannot be reused to test whether other coordinates are good reaction coordinates. If the proposed coordinate is not a reaction coordinate, then other coordinates will be selected and new biased molecular simulations will be performed. Due to the often counter-intuitive nature of the reaction coordinates, failures occur far more frequently than successes, making the ‘trial-and-error’ approach too costly. Consequently, more systematic methods have been developed that involve first preparing a database that contains information for determining the reaction coordinates and then using, typically, machine learning-inspired methods to identify the reaction coordinates out of a pool of candidates. Methods such as the genetic neural network (GNN),[19] likelihood maximisation method,[20–22] non-linear reaction coordinate analysis [23] and Kernel PCA [24] all belong to this group.

In this review, we discuss methods developed to identify the reaction coordinate, with emphasis on the methods that use the committor as the ideal reaction coordinate. In addition, the history and theoretical development of the committor will be introduced, as the methods based on the committor are becoming the dominant ones in the field and the committor itself is commonly used to check the quality of a coordinate as the reaction coordinate. We are aware of the limitations of the current review, as we mainly discuss methods that are close to our own field of expertise and interest.

2. Committor-based methods

In this section, we focus our discussion on methods that use the committor as the ideal reaction coordinate and judge the quality of any given physical coordinates as reaction coordinates based on their relationship with the committor.

2.1. Committor

For a transition between two stable basins, a trajectory initiating from a given configuration will commit to one of the basins. The probability of an arbitrary trajectory from a configuration to commit to the product state before the reactant state takes a fixed value for a given equilibrium ensemble; thus, this probability quantifies how close a configuration is to the product state in a parametric manner. Onsager was the first to use the probability that two particles will not combine to quantify the progress of ion-pair recombination.[25] This probability was termed the splitting probability, and its expression in diffusive systems was derived by Kampen [26] and Gardiner [27]. Pratt and Ryter appear to be the first to define the transition state using the concept of splitting probability [28–30]: they defined transition states as the states with a splitting probability of 0.5, namely the states with equal probability to relax to the reactant and the product. Therefore, the splitting probability can be used to test whether a configuration is a transition state or not. Later on, this definition was used in the study of activated escape of a Brownian particle from a potential well,[31] protein folding,[32] where the name pfold was used, and ion pair dissociation in water,[33] where the splitting probability was termed the commitment probability or committor (committor is used consistently in this review). Nowadays, the committor is widely used in the study of chemical and biochemical processes.[19,34–45]

Pratt proposed to initiate a number of Monte Carlo simulations from a configuration and use the fraction of trajectories that return to a reactant configuration to define the transition state.[28] This is equivalent to the committor, for which the probability of reaching the product region is counted instead. Such a procedure was employed by Du et al. to calculate the committor of protein folding,[32] which was the first practical application of the committor to evaluate whether a geometrical coordinate is a good reaction coordinate or not. Later, a similar procedure termed shooting was proposed by Geissler et al. to evaluate the committor with MD simulations.[33] By definition, the transition states that are determined by a good reaction coordinate should have a committor value of 0.5 or a narrow distribution of committor values centred around 0.5. This committor histogram test (CHIT), which tests a reaction coordinate based on the histogram of committor values of configurations at the critical value of the proposed reaction coordinate, is now widely adopted in various studies.[19–21,34–36,46,47] Recently, Peters analysed the statistical error of CHIT in the estimation of the reaction coordinate.[48]
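The shooting procedure described above can be illustrated with a minimal Python sketch. It is not taken from any of the cited studies: the one-dimensional double-well potential, the overdamped Langevin integrator with unit mobility, the basin boundaries at x = ±0.8 and all function names are assumptions chosen only to make the idea concrete.

```python
import numpy as np

def force(x):
    """Negative gradient of the model potential U(x) = (x**2 - 1)**2 (assumed toy system)."""
    return -4.0 * x * (x**2 - 1.0)

def estimate_committor(x0, n_shots=400, kT=0.5, dt=1e-3, basin=0.8,
                       max_steps=50_000, seed=0):
    """Fraction of trajectories started at x0 that reach the product
    (x >= basin) before the reactant (x <= -basin)."""
    rng = np.random.default_rng(seed)
    x = np.full(n_shots, float(x0))
    outcome = np.zeros(n_shots, dtype=int)    # 0 undecided, +1 product, -1 reactant
    noise = np.sqrt(2.0 * kT * dt)            # overdamped dynamics, unit mobility assumed
    for _ in range(max_steps):
        active = outcome == 0
        if not active.any():
            break
        xa = x[active]
        x[active] = xa + force(xa) * dt + noise * rng.standard_normal(xa.size)
        outcome[active & (x >= basin)] = 1
        outcome[active & (x <= -basin)] = -1
    committed = outcome != 0                  # ignore trajectories that never committed
    return float(np.mean(outcome[committed] == 1))

def committor_histogram(committors, bins=10):
    """Histogram of committor values for configurations that share the
    critical value of a proposed coordinate (the input to CHIT)."""
    return np.histogram(committors, bins=bins, range=(0.0, 1.0))
```

For this symmetric toy potential, `estimate_committor(0.0)` should come out close to 0.5, and a histogram of such estimates for configurations at the proposed transition state value is what CHIT inspects.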

The theoretical study of the committor was rare until recent years. For a system that can be described by the Smoluchowski equation, Gardiner derived the analytic formula of the committor.[27] Based on the analytic expression of the committor, Rhee and Pande [49] demonstrated that the committor is the reaction coordinate of a diffusive process with a parabolic barrier at the saddle point of the potential of mean force (PMF). When the direction of the reaction is defined as the gradient of the committor, it was found to be parallel to the eigenvector of the matrix VD, where V is the Hessian matrix of the PMF and D is the diffusion coefficient matrix. Under the same parabolic barrier approximation, Berezhkovskii and Szabo [50] demonstrated that the PMF along the eigenvector of the matrix VD preserves the exact mean first passage times and thus the rate constant of a diffusive process predicted by the multidimensional Kramers–Langer theory. Therefore, the PMF along the committor can reproduce the exact rate constant. In addition, Rhee and Pande [49] have proposed a reaction coordinate whose PMF can reproduce the probability density function (PDF) of the committor for configurations in an equilibrium ensemble but may not preserve the right rate constant. The construction of such a coordinate requires knowledge of the PDF of the committor, which is computationally expensive to obtain, and the proposed reaction coordinate is not a physical coordinate. Recently, a more general diffusion equation along the committor was derived by projecting multidimensional diffusive dynamics onto it, assuming the committor is the slowest coordinate of the system.[51] It was shown that the resulting diffusion equation preserves the exact reactive flux at equilibrium and thus the rate constant, but not the exact dynamics. Therefore, the committor can be considered as an ‘ideal’ reaction coordinate.

Remarkably, the above-mentioned studies [49,50] showed that the reaction coordinate, which is perpendicular to the isocommittor surface, is not necessarily parallel to the gradient of the PMF at the saddle point (the eigenvector of the matrix V) for anisotropic diffusion systems. It is actually parallel to the eigenvector of the matrix VD. Ma et al. have verified these results based on a study of an isomerisation reaction of an alanine dipeptide in implicit solvent and found that the two eigenvectors deviate by a small angle.[52] The PMF and the diffusion tensor along two reaction coordinates were estimated from MD simulations, and the committor was then estimated based on Berezhkovskii and Szabo's theory.[50] The predicted committor is consistent with the one estimated with MD simulations, demonstrating that a complex biological process can be simplified by a low-dimensional physical model.
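The difference between the two directions can be made concrete with a small numerical sketch. The matrices below are arbitrary illustrative values, not data from the cited work, and the code assumes that the relevant eigenvector in each case is the one with the negative eigenvalue (the unstable mode at the saddle point); the function names are hypothetical.

```python
import numpy as np

def reaction_directions(V, D):
    """Return unit vectors along the unstable eigenvector of V and of V @ D.
    V: symmetric Hessian of the PMF at the saddle (one negative eigenvalue assumed).
    D: symmetric positive-definite diffusion matrix."""
    w_v, vec_v = np.linalg.eigh(V)                 # V is symmetric
    e_v = vec_v[:, np.argmin(w_v)]
    w_vd, vec_vd = np.linalg.eig(V @ D)            # V @ D is generally non-symmetric
    e_vd = np.real(vec_vd[:, np.argmin(np.real(w_vd))])
    return e_v / np.linalg.norm(e_v), e_vd / np.linalg.norm(e_vd)

def angle_deg(a, b):
    """Angle between two directions in degrees, ignoring their sign."""
    c = min(abs(float(a @ b)), 1.0)
    return float(np.degrees(np.arccos(c)))

V = np.array([[-2.0, 1.0], [1.0, 3.0]])            # illustrative saddle-point Hessian
iso = angle_deg(*reaction_directions(V, np.eye(2)))
aniso = angle_deg(*reaction_directions(V, np.diag([1.0, 0.1])))
```

With isotropic diffusion the two directions coincide (`iso` is numerically zero), whereas the anisotropic `D` rotates the eigenvector of VD away from that of V, which is the effect discussed in the text.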

2.2. ‘Trial-and-error’ methods

Most early work on the search for reaction coordinates employed a trial-and-error process. Examples include but are not limited to the studies on simple solvated systems,[33] enzymatic reactions,[10,36,53,54] protein folding [32,55] and biomolecular conformational changes.[34,41] Typically, a reaction coordinate is proposed based on intuition and knowledge, and information is then collected to test whether it is a reaction coordinate or not based on various mechanism-oriented criteria. For instance, the free energy along the proposed coordinate was estimated with enhanced sampling methods and the reaction coordinate was considered to be the one with the highest free energy barrier,[54] or the reaction coordinate was approximated by a minimum free energy path with a free energy barrier consistent with experimental results.[10,53] More rigorous criteria are based on the committor, where committors of selected configurations at fixed values of a coordinate were estimated and CHIT was commonly used to determine the quality of the coordinate as a reaction coordinate.[32–34,36,41]

2.3. Methods based on p(TP|r)

Best and Hummer [35,56] proposed that transition states are configurations with the highest p(TP|x), where p(TP|x) is the probability for a trajectory that passes through a configuration x to be a transition path. The reaction coordinate is then a coordinate r with the sharpest peak in p(TP|r), where p(TP|r) is defined as

$$p(\mathrm{TP}\mid r) = \frac{\int p(\mathrm{TP}\mid x)\,\delta[r - r(x)]\,p_{\mathrm{eq}}(x)\,\mathrm{d}x}{\int \delta[r - r(x)]\,p_{\mathrm{eq}}(x)\,\mathrm{d}x},$$

which is the average probability for trajectories passing through configurations with the same value of r to be transition paths. Here, δ is the Dirac delta function and peq(x) is the equilibrium probability distribution of system configurations. Any coordinate can therefore be tested by obtaining p(TP|r) from a long equilibrium trajectory, and the coordinate with the highest peak of p(TP|r) can be identified as the reaction coordinate. Since it is not always possible to prepare a long enough equilibrium trajectory, they proposed a computationally less costly way to estimate p(TP|r). According to a Bayesian relationship between the equilibrium ensemble and the transition path ensemble, p(TP|r) = p(r|TP)p(TP)/peq(r) for Markovian processes. Here, peq(r) and p(r|TP) are the probability distributions of configurations projected onto the coordinate r for the equilibrium ensemble and the transition path ensemble, respectively, and p(TP) is the fraction of time that the system spends on transition paths, relative to the total time of the long equilibrium trajectory. To reduce the computational cost, peq(r) of a chosen coordinate can be obtained by enhanced sampling methods; e.g. umbrella sampling was used in the work by Best and Hummer.[35] The optimisation of p(TP|r) was later demonstrated to be equivalent to a method that optimises the stochastic separatrix.[57]
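As a rough illustration of the Bayesian estimate p(TP|r) = p(r|TP)p(TP)/peq(r), the sketch below labels transition path frames in a single time series of r and forms the ratio on a histogram grid. The basin boundaries `r_A` and `r_B`, the function names and the histogram-based estimator are assumptions made for this example; in practice peq(r) may instead come from enhanced sampling, as noted above.

```python
import numpy as np

def transition_path_mask(r, r_A, r_B):
    """Flag frames on transition paths: segments that leave one basin
    (r <= r_A or r >= r_B) and reach the other without returning."""
    r = np.asarray(r, dtype=float)
    state = np.where(r <= r_A, -1, np.where(r >= r_B, 1, 0))
    on_tp = np.zeros(r.size, dtype=bool)
    last_state, start = 0, None
    for i, s in enumerate(state):
        if s == 0:
            if start is None and last_state != 0:
                start = i                       # trajectory has just left a basin
        else:
            if start is not None and s == -last_state:
                on_tp[start:i] = True           # excursion connected the two basins
            last_state, start = s, None
    return on_tp

def p_tp_given_r(r, r_A, r_B, bins):
    """Histogram estimate of p(TP|r); `bins` should span all values of r."""
    r = np.asarray(r, dtype=float)
    on_tp = transition_path_mask(r, r_A, r_B)
    n_eq, edges = np.histogram(r, bins=bins)
    n_tp, _ = np.histogram(r[on_tp], bins=edges)
    p_tp = on_tp.mean()                          # p(TP)
    p_r_tp = n_tp / max(n_tp.sum(), 1)           # p(r|TP)
    p_r_eq = n_eq / n_eq.sum()                   # p_eq(r)
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = np.where(n_eq > 0, p_r_tp * p_tp / p_r_eq, np.nan)
    centres = 0.5 * (edges[:-1] + edges[1:])
    return centres, ratio
```

Competing candidate coordinates can then be compared by the height and sharpness of the peak in the returned profile.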

2.4. Methods that utilise machine learning algorithms

The biggest advantage of collecting sufficient data first and then analysing them to search for the reaction coordinate is that one can consider every possible coordinate as a candidate reaction coordinate without assuming that the reaction coordinates are contained in a small set of pre-selected collective variables. This assumption is not necessarily true, as the identities of the correct reaction coordinates are often counter-intuitive. What information is sufficient to identify the reaction coordinate? A long equilibrium trajectory that samples enough transitions between stable states should be sufficient, although the preparation of such a trajectory is only possible for small systems with MD simulations or for a few medium-sized systems by highly parallel distributed computing or by special purpose high-performance computing.[58–61] Since the committor is the ideal reaction coordinate, the committor of configurations in the transition path ensemble should contain sufficient information to extract the reaction coordinate.[19–21,23] In this regard, transition path sampling (TPS) [62–65] and committor estimation are the most commonly used ways to harvest the information.

2.4.1. GNN method

The method developed by Ma and Dinner [19] uses a GNN [66,67] method to extract the reaction coordinates from a pool of pre-selected candidates based on the committor information contained in a database of system configurations. The committor value for each configuration in the database was accurately evaluated using the shooting procedure. Configurations were harvested with TPS [62–65] and selected to ensure a uniform distribution of the committor values in the database to avoid bias in the training and testing of the neural network model. The GNN method was then applied to identify the combination of coordinates that produces the most accurate prediction of the committor, which is considered the best approximation of the reaction coordinate.

The GNN method is a combination of a genetic algorithm and a neural network method. The neural network is used to build the model that produces the best prediction of the committor value for a given combination of physical coordinates, and the genetic algorithm identifies the best combination of coordinates among all of those sampled from the candidate pool through a Monte Carlo-like procedure. The best combination of physical coordinates from the GNN procedure provides the components of the reaction coordinates. In addition, the GNN method provides a mathematical expression that can be used to predict the committor value of a given configuration with great accuracy. The neural network model establishes a relationship between the physical coordinates and the committor that is flexible and can in principle take into account a potentially complex non-linear relationship between them. The GNN method is not intrinsically coupled to TPS; other methods for harvesting transition paths, such as transition interface sampling [68,69] and forward flux sampling,[70] should work as well.
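The division of labour between the two components can be sketched in a few dozen lines of Python. This is only a toy illustration under stated assumptions, not the implementation of Ma and Dinner: the one-hidden-layer network, its training by full-batch gradient descent on a cross-entropy loss, the selection, crossover and mutation operators, and every name and parameter value below are hypothetical choices.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_error(X, p, n_hidden=3, epochs=500, lr=0.5, seed=0):
    """Train a one-hidden-layer network mapping coordinates X to committor
    values p; return the root-mean-square error of its predictions."""
    rng = np.random.default_rng(seed)
    X = (X - X.mean(axis=0)) / X.std(axis=0)           # standardise inputs
    W1 = rng.normal(scale=0.5, size=(X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.5, size=n_hidden)
    b2 = 0.0
    for _ in range(epochs):                            # full-batch gradient descent
        h = np.tanh(X @ W1 + b1)
        out = sigmoid(h @ W2 + b2)
        g_out = (out - p) / len(p)                     # cross-entropy gradient at the output
        g_h = np.outer(g_out, W2) * (1.0 - h**2)
        W2 = W2 - lr * (h.T @ g_out)
        b2 = b2 - lr * g_out.sum()
        W1 = W1 - lr * (X.T @ g_h)
        b1 = b1 - lr * g_h.sum(axis=0)
    out = sigmoid(np.tanh(X @ W1 + b1) @ W2 + b2)
    return float(np.sqrt(np.mean((out - p) ** 2)))

def genetic_search(X, p, k=2, pop_size=10, generations=6, seed=0):
    """Search for the k-coordinate subset (k < number of candidates)
    whose network best predicts the committor."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    pop = [tuple(sorted(int(i) for i in rng.choice(n, size=k, replace=False)))
           for _ in range(pop_size)]
    errors = {}
    for _ in range(generations):
        for s in pop:
            if s not in errors:
                errors[s] = fit_error(X[:, list(s)], p)
        pop.sort(key=errors.get)
        parents = pop[: pop_size // 2]                 # selection: keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.choice(len(parents), size=2)
            genes = sorted(set(parents[a]) | set(parents[b]))
            child = set(int(g) for g in rng.choice(genes, size=k, replace=False))  # crossover
            if rng.random() < 0.3:                     # mutation: swap one coordinate
                child.discard(sorted(child)[int(rng.integers(k))])
                child.add(int(rng.choice([g for g in range(n) if g not in child])))
            children.append(tuple(sorted(child)))
        pop = parents + children
    best = min(errors, key=errors.get)
    return best, errors[best]
```

Called as `genetic_search(X, p)` with `X` holding candidate coordinates (one column each) and `p` the committor values from shooting, the function returns the best subset of column indices found together with its prediction error.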

The GNN method has been applied to alanine dipeptide isomerisation reactions, a standard model system for reaction coordinate studies, in vacuum and in implicit and explicit water. In vacuum, the identified reaction coordinates are consistent with the results of a previous study.[34] In explicit water, a torque on the solute derived from electrostatic interactions with the solvent molecules was found to be a critical component of the reaction coordinate. CHIT showed that the identified reaction coordinate can predict transition states with committor values narrowly distributed around 0.5. In implicit water, the predicted components of the reaction coordinate were successfully used to construct a low-dimensional physical model to describe the dynamics of the complex biomolecular system.[52] Other applications of the method include the study of a nucleotide flipping facilitated by O6-alkylguanine-DNA alkyltransferase [38] and the study of the folding of a 20-residue antiparallel-sheet miniprotein.[71]

2.4.2. Likelihood maximisation method

Peters and Trout [20] have designed an aimless shooting algorithm for TPS, in which the momenta of the system for a given configuration are drawn from the Boltzmann distribution, instead of being derived from a small perturbation of the original momenta as implemented in the original TPS algorithm. In aimless shooting, each shooting trajectory can be considered as a realisation of the committor, and a large set of configurations with the committor estimated by such a one-time realisation can be collected from the TPS history. For a given set of coordinates, the relationship between the committor and a linear combination of coordinates is proposed to be the sigmoid function PB(r) = [1 + tanh(r)]/2, where PB(r) is the averaged committor of configurations along the reaction coordinate r and r is a linear combination of several physical coordinates. The reaction coordinate is approximated by the linear combination that maximises the likelihood [21]:

$$L = \prod_{P_B(x_k)=1} P_B(r(x_k)) \prod_{P_B(x_k)=0} \left[1 - P_B(r(x_k))\right]. \qquad (1)$$

This is the probability of observing the one-time committor realisations of these configurations, assuming that the committor along the reaction coordinate r is PB(r). Here, xk is one of the configurations whose committor is estimated, PB(xk) is the estimated committor of a configuration xk from a single shooting trajectory and PB(r(xk)) is the committor of xk estimated by the proposed sigmoid function.
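A minimal sketch of maximising Equation (1) is given below. It works with the logarithm of the likelihood and uses plain gradient ascent; the optimiser, the inclusion of a constant offset `alpha[0]` in r, the step size and the function names are assumptions for illustration, not the procedure used in the cited work.

```python
import numpy as np

def committor_model(alpha, q):
    """Sigmoid model P_B(r) = [1 + tanh(r)] / 2 with r = alpha[0] + q @ alpha[1:]."""
    r = alpha[0] + q @ alpha[1:]
    return 0.5 * (1.0 + np.tanh(r))

def log_likelihood(alpha, q, reached_B):
    """ln L: sum of ln P_B over outcomes that reached B and ln(1 - P_B) over the rest."""
    p = np.clip(committor_model(alpha, q), 1e-12, 1.0 - 1e-12)
    return float(np.sum(np.where(reached_B, np.log(p), np.log(1.0 - p))))

def maximise_likelihood(q, reached_B, n_iter=2000, lr=0.2):
    """Gradient ascent on ln L / N; returns the coefficients alpha."""
    y = np.asarray(reached_B, dtype=float)
    design = np.column_stack([np.ones(len(y)), q])     # constant term plus coordinates
    alpha = np.zeros(design.shape[1])
    for _ in range(n_iter):
        p = 0.5 * (1.0 + np.tanh(design @ alpha))
        grad = 2.0 * design.T @ (y - p) / len(y)       # d(ln L / N) / d(alpha)
        alpha += lr * grad
    return alpha
```

Here `q` holds one row of candidate coordinates per shooting point and `reached_B` records whether the single trajectory from that point ended in the product basin. Comparing `log_likelihood` across different coordinate subsets then mirrors the model comparison described next.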

Different numbers of coordinates and different combinations of coordinates can be tested to find the best approximation of the reaction coordinate by taking the combination of coordinates with the maximum likelihood. Typically, the more coordinates are included, the higher the likelihood of the resulting model. The optimal number of coordinates is reached when there is no significant increase of the likelihood on adding an extra coordinate.[20] The distribution of the configurations in the database was assumed to be peaked near the transition state region, as the aimless shooting procedure has the tendency to concentrate towards the transition state. A recent extension of the likelihood maximisation is the inertial likelihood maximisation method,[22] which takes into account the velocities projected onto the selected coordinates as well. For the systems studied by this method, the variance of the committor values of configurations on the transition state surface that is determined by the optimised reaction coordinate is in general smaller than that obtained by the original likelihood maximisation method. In addition, the transmission coefficients of proposed transition states from inertial likelihood maximisation are larger and closer to 1. Thus, the inertial likelihood maximisation method is an improvement over the original likelihood maximisation approach. Recently, Lechner et al. introduced non-linearity into the reaction coordinate in the likelihood maximisation method.[23] Learning from the string methods,[72–76] they replaced the linear combination of coordinates by a string of configurations in a low-dimensional collective variable space to approximate the reaction coordinate using likelihood maximisation, and the committor of configurations was obtained from replica exchange transition interface sampling.

The likelihood maximisation method has been applied to a number of systems: the mechanism of the partial unfolding transition in a photoactive yellow protein [77]; the folding details of Trp-cage protein in explicit solvent [78]; the homogeneous nucleation process of a crystal in a Gaussian core model [79,80]; and the diffusion of water molecules in a glassy polymer.[81]

The likelihood maximisation approach shares a number of similarities with the GNN method discussed earlier. Both methods assume that the committor of configurations is sufficient information for identifying the reaction coordinate. In the GNN method, the committor is evaluated with great accuracy, whereas in the likelihood maximisation method the committor is estimated by a one-time realisation. In the GNN method, the distribution of configurations along the committor is enforced to be uniform, whereas in the likelihood maximisation method it is a natural outcome of the aimless shooting procedure, which will in principle vary with the system under study but is likely to concentrate around the committor value of 0.5 due to the particular feature of aimless shooting. To extract the reaction coordinate from the given information, both methods resort to a sigmoid model. In the GNN method, the sigmoid model is employed inside the neural network, whereas in the likelihood maximisation method, it directly establishes the relationship between the committor and the coordinates. In fact, the sigmoid model in the likelihood maximisation can be considered as a specific neural network model in which there are no hidden layers, with the selected coordinates as input and the committor as the only output. The two methods can also be combined. For instance, the likelihood maximisation procedure can be applied to a set of configurations with committor values estimated from a standard shooting procedure instead of aimless shooting, although such a hybrid method was not able to identify the reaction coordinate in a study of the thiol/disulfide exchange in a protein.[39]

2.4.3. Transition state ensemble optimisation

In the standard picture of reaction dynamics, the reaction coordinate is perpendicular to the transition state surface at the saddle point of the free energy surface; methods were thus developed to identify geometrical coordinates that comprise a reaction coordinate in complex systems from the transition state ensemble alone. The transition state ensemble or the stochastic separatrix is a hypersurface in the configuration space on which there is no change in the progression of the reaction.[82] To characterise the behaviour of coordinates on the stochastic separatrix, Antoniou and Schwartz [82] plotted the distributions of all the coordinates and examined the width of their distributions on the separatrix in model systems of a double-well potential coupled to a single oscillator and to multiple oscillators. They found that the variation of the reaction coordinate is significantly smaller than that of the non-reactive coordinates. Based on this observation, they proposed a method to identify the coordinates with the smallest variations along the separatrix as the reaction coordinate. The method assumes a narrow transition state region and is, therefore, mainly applicable to systems with fast barrier crossing, but not so much to systems with diffusive barrier crossing.[82,83]
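The basic idea of ranking coordinates by their spread over the transition state ensemble can be sketched as follows. The function name is hypothetical, and the sketch assumes that the candidate coordinates are already expressed on comparable scales; it uses a simple standard deviation rather than any procedure specific to the cited work.

```python
import numpy as np

def rank_by_separatrix_width(ts_ensemble, names):
    """Order coordinates by their standard deviation over the transition
    state ensemble (rows are configurations, columns are coordinates).
    Narrowly distributed coordinates come first."""
    widths = np.std(np.asarray(ts_ensemble, dtype=float), axis=0)
    order = np.argsort(widths)
    return [(names[i], float(widths[i])) for i in order]
```

Coordinates at the head of the returned list are the candidates for components of the reaction coordinate under the narrow-separatrix assumption stated above.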

In a later application, this method was generalised to deal with the high dimensionality and non-linearity that are often present in real systems. The basic assumption is that the width of the separatrix along a coordinate that is part of the reaction coordinate must be small,[24] as the correct reaction coordinate should take a fixed value at the separatrix. In contrast, the non-reaction coordinates will show large variations in the transition state ensemble, as they do not correlate with the progression of the reaction. Therefore, one can check the width of the separatrix along all the coordinates and select those with narrow widths as the components of the reaction coordinate. Mathematically, the width of the separatrix along a coordinate can be quantified by the contribution of the coordinate to the direction of maximum variance on the separatrix. Coordinates that contribute little to the direction of maximum variance will be part of the reaction coordinate. Traditionally, principal component analysis is the standard method for identifying the direction of maximum variance. However, the separatrix of a complex system is a curved surface with great complexity and it is non-trivial to identify the dominant direction of the separatrix. Consequently, they proposed a kernel principal component analysis (kPCA) approach with a non-linear kernel function to identify the direction with the largest variance. The method has been tested on an enzymatic reaction catalysed by lactate dehydrogenase, which was previously studied by the same group in great detail with TPS and committor analysis.[36,84] The reaction coordinates identified by kPCA are consistent with those previously obtained with the ‘trial-and-error’ method.[36]

3. Free energy-related methods

Minimum energy path and minimum free energy path are conventionally associated with the concept of reaction coordinate [85–87] and this connection has recently been derived from transition path theory as well.[76,88,89] For simple chemical reactions, the minimum energy path can be found by searching along all the degrees of freedom. However, this procedure is practically impossible for complex systems due to the large number of degrees of freedom. Consequently, searching over several collective variables is employed for macromolecular systems and the minimum free energy path is searched for instead.

Various methods have been developed to find the minimum free energy paths or maximum flux paths for complex systems, e.g. string methods [72–76] and elastic band methods.[90–92] Both types of methods start with a chain of states that connects the reactant and the product; this chain of states is then evolved towards the minimum free energy path using optimisation algorithms. In string methods, the evolution of different states is essentially independent of each other, whereas in elastic band methods each state is subject to constraints from neighbouring states. Since the conformational search is performed in a low-dimensional space spanned by a few collective variables, the choice of the collective variables is crucial for the success of these methods. These methods have been successfully applied to studies of biological and material systems.[93–98] Some recent extensions have also been used in studies of reactive dynamics using CHIT.[46,47] For more details of these methods, we refer readers to recent reviews.[89,99]
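To make the chain-of-states idea concrete, the following sketch applies a simplified zero-temperature string iteration to an analytic two-dimensional surface. The model surface U(x, y) = (x² − 1)² + 5y², the step size and the function names are assumptions for illustration; real applications evolve the string on a free energy surface in collective variable space, which is not attempted here.

```python
import numpy as np

def grad_U(xy):
    """Gradient of the model surface U(x, y) = (x**2 - 1)**2 + 5 * y**2."""
    x, y = xy[:, 0], xy[:, 1]
    return np.column_stack([4.0 * x * (x**2 - 1.0), 10.0 * y])

def reparametrise(path):
    """Redistribute the images at equal arc-length spacing along the path."""
    seg = np.linalg.norm(np.diff(path, axis=0), axis=1)
    s = np.concatenate([[0.0], np.cumsum(seg)])
    s_new = np.linspace(0.0, s[-1], len(path))
    return np.column_stack([np.interp(s_new, s, path[:, k])
                            for k in range(path.shape[1])])

def string_method(path, n_iter=500, dt=0.01):
    """Relax a chain of images towards the minimum energy path."""
    path = np.array(path, dtype=float)
    for _ in range(n_iter):
        path = path - dt * grad_U(path)    # each image moves downhill independently
        path = reparametrise(path)         # the only step that couples the images
    return path

# Initial guess: an arc from the reactant (-1, 0) to the product (1, 0).
x0 = np.linspace(-1.0, 1.0, 21)
initial = np.column_stack([x0, 0.5 * np.sin(0.5 * np.pi * (x0 + 1.0))])
final = string_method(initial)
```

On this surface the relaxed string collapses onto the line y = 0 joining the two minima, with the images kept evenly spaced by the reparametrisation step.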

Krivov and Karplus proposed the concept of the cut-based FEP,[11,12] which was inspired by their earlier work on free energy disconnectivity map.[100,101] Previously, they showed that the partition function of the reactant region can preserve the energy barriers.[11,12] For conventional FEP, its partition function (ZH) is constructed from the density of configurations along a given coordinate. For cut-based FEP, its partition function (ZC) is proportional to the total number of transitions through a given point on the coordinate during a small time interval. There exists a useful relationship between ZC and ZH, in which ZC can be expressed as a function of ZH and position-dependent diffusion coefficient. Most importantly, a ‘natural coordinate’, which is a continuous and invertible function of a given coordinate, can be constructed such that ZC and ZH along the natural coordinate differ by a constant or the diffusion coefficient along the natural coordinate is constant. They demonstrated that the reaction coordinate is the one with the highest cut-based free energy barrier along its natural coordinate.[12]
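The two partition functions can be contrasted with a short sketch that builds both from a one-dimensional time series. This is an illustrative reading of the description above, not the authors' algorithm: ZH is taken as a histogram count and ZC as the number of recorded steps that cross each grid point, which is proportional to the number of transitions through that point per sampling interval. The function name, the uniform grid and kT = 1 are assumptions.

```python
import numpy as np

def conventional_and_cut_profiles(r, grid, kT=1.0):
    """Return (Z_H, Z_C, F_H, F_C) on a uniformly spaced `grid`.
    Z_H: number of frames in the bin centred on each grid point.
    Z_C: number of steps r[i] -> r[i+1] that cross each grid point."""
    r = np.asarray(r, dtype=float)
    grid = np.asarray(grid, dtype=float)
    width = grid[1] - grid[0]
    edges = np.concatenate([grid - 0.5 * width, [grid[-1] + 0.5 * width]])
    z_h, _ = np.histogram(r, bins=edges)
    lo = np.minimum(r[:-1], r[1:])[:, None]
    hi = np.maximum(r[:-1], r[1:])[:, None]
    z_c = np.sum((lo < grid) & (grid < hi), axis=0)   # memory scales as steps x grid points
    with np.errstate(divide="ignore"):
        f_h = -kT * np.log(z_h.astype(float))
        f_c = -kT * np.log(z_c.astype(float))
    return z_h, z_c, f_h, f_c
```

The profiles `f_h` and `f_c` are defined only up to an additive constant, so it is their shapes and barrier heights along a candidate coordinate that would be compared.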

Later, Krivov proposed the idea of the cut FEP, whose partition function is proportional to the sum of the diffusion distance from a given point during a small time interval; the reaction coordinate is the one with a constant cut FEP, which is independent of position and sampling interval.[102] A major concern with the above-mentioned methods is that it is computationally almost impossible to obtain a long enough equilibrium trajectory for real systems, and it is not trivial to construct and compare two cut-based FEPs.

4. Dimensionality reduction-oriented methods

Dimensionality reduction, which means projecting a high-dimensional data-set onto a few essential directions to gain a clear visualisation and facilitate conceptualisation of otherwise extremely messy and complex data, is an important subject in the fields of informatics and data mining in general. The same philosophy has been utilised in studies of protein dynamics, with principal component analysis [103] as the most familiar example. Recently, some of the most popular general-purpose dimensionality reduction methods, such as Isomap,[13,14] diffusion map [15–17,104] and sketch-map,[18] have been introduced to studies of reaction coordinates. These methods typically achieve dimensionality reduction based on the preservation of geometric measures of the configuration space, and the dynamical information contained in the temporal sequence of the configurations is usually ignored.[105] Here, we briefly discuss a few of them.

In Isomap, the separation of two configurations in the configuration space is quantified by the geodesic distance, which is defined as the length of the shortest path between them in a connected graph.[13] A configuration is connected to another configuration if their root mean square deviation (RMSD) is small or they are nearest neighbours. The underlying assumption is that configurations with small RMSD are mutually accessible without crossing any barrier, i.e. they belong to the same state, although there are configurations with small RMSD that are separated by a high energy barrier. A study of protein folding with a coarse-grained model showed that the transition state ensemble can be correctly identified from a free energy landscape in a low-dimensional space constructed with Isomap.[14]
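The geodesic distance used by Isomap can be sketched as a shortest-path calculation on a nearest-neighbour graph. The example below covers only this step, not the subsequent low-dimensional embedding, and it rests on several assumptions: plain Euclidean distance stands in for RMSD, the points lie on a half circle, and the function names and the value of k are illustrative.

```python
import heapq
import math


def knn_graph(points, k):
    """Connect each point to its k nearest neighbours (edges are symmetrised)."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in dists[:k]:
            graph[i][j] = d
            graph[j][i] = d
    return graph


def geodesic_distances(graph, source):
    """Dijkstra shortest-path lengths from source to every node of the graph."""
    dist = {node: math.inf for node in graph}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


if __name__ == "__main__":
    # 51 points on a half circle of unit radius.
    pts = [(math.cos(math.pi * i / 50), math.sin(math.pi * i / 50)) for i in range(51)]
    geo = geodesic_distances(knn_graph(pts, k=2), source=0)
    print("straight-line distance:", math.dist(pts[0], pts[50]))
    print("geodesic distance:     ", geo[50])
```

The two end points are 2 apart in a straight line, whereas the graph distance follows the arc and approaches its length, which is the sense in which the geodesic distance respects the shape of the sampled region.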

Diffusion map is designed to perform non-linear dimensionality reduction of a connected graph of points. From MD simulation data, a connected graph of configurations is constructed with a certain metric or kernel that quantifies the weight between two configurations. A Markov process on the graph is then constructed by renormalising the kernel so that it quantifies the probability for a random walker on the graph to make a step from one point to another. The diffusion distance quantifies the rate of connectivity of two points in the graph; it is robust to perturbations of the data and is preserved in the dimensionality reduction.[104] Dimensionality reduction can be performed to construct a few principal diffusion coordinates, whose ordinary Euclidean distance in the embedding space measures the diffusion distance. Then, the diffusion coordinates are correlated with quantities with more specific physical meaning.[99] For more details of these methods, we recommend an excellent recent review.[99]
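The construction described above (kernel, renormalisation to a Markov matrix, leading non-trivial eigenvector) can be sketched in a few lines. This is an illustration under stated assumptions rather than the procedure of any cited work: a Gaussian kernel with a single global width eps is assumed, only the first non-trivial diffusion coordinate is computed, and it is obtained by power iteration with deflation instead of a full eigensolver. All names and the toy data are hypothetical.

```python
import math


def diffusion_coordinate(points, eps, n_iter=500):
    """Return (eigenvalue, first non-trivial diffusion coordinate) for the points."""
    n = len(points)
    # Gaussian kernel on pairwise distances (assumed kernel choice).
    k = [[math.exp(-math.dist(p, q) ** 2 / eps) for q in points] for p in points]
    deg = [sum(row) for row in k]
    # P = D^-1 K is the Markov matrix; S = D^-1/2 K D^-1/2 is symmetric and
    # has the same eigenvalues, which makes power iteration straightforward.
    s = [[k[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    # Trivial top eigenvector of S (eigenvalue 1), normalised to unit length.
    norm0 = math.sqrt(sum(deg))
    v0 = [math.sqrt(d) / norm0 for d in deg]

    def deflate_and_normalise(v):
        proj = sum(a * b for a, b in zip(v, v0))
        v = [a - proj * b for a, b in zip(v, v0)]
        norm = math.sqrt(sum(a * a for a in v))
        return [a / norm for a in v]

    def apply_s(v):
        return [sum(s[i][j] * v[j] for j in range(n)) for i in range(n)]

    v = deflate_and_normalise([float(i) for i in range(n)])
    for _ in range(n_iter):
        v = deflate_and_normalise(apply_s(v))
    lam = sum(a * b for a, b in zip(v, apply_s(v)))  # Rayleigh quotient
    # Map back to a right eigenvector of P and scale by the eigenvalue.
    psi = [lam * v[i] / math.sqrt(deg[i]) for i in range(n)]
    return lam, psi


if __name__ == "__main__":
    # Two well-separated clusters on a line (toy data).
    pts = [(0.0,), (0.1,), (0.2,), (0.3,), (2.0,), (2.1,), (2.2,), (2.3,)]
    lam, psi = diffusion_coordinate(pts, eps=0.5)
    print(lam)
    print(psi)
```

For the two-cluster toy data the resulting coordinate takes opposite signs on the two clusters, so it acts as a crude indicator of which basin a configuration belongs to.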

5. Conclusions and prospectives

As our discussion suggests, the notion of reaction coordinates plays an essential role in today's understanding of the reaction mechanisms of complex systems and in the development of efficient methods for simulating such systems. Consequently, considerable effort has been devoted to developing methods for identifying reaction coordinates in complex systems over the past decade. Based on the definition of the reaction coordinates, these methods can be roughly grouped into those targeted at committor-based reaction coordinates and those targeted at a reduced description of a reaction process that preserves certain geometric measures in the configuration space. While the committor-based reaction coordinates have clear mechanistic implications for the systems and processes under study, establishing the specific meaning of reaction coordinates based on configuration space geometry may require an improved understanding of the geometric picture of reactive dynamics.

The emphasis of this review has been on the methods for identifying committor-based reaction coordinates. The current trend in this direction is to use machine learning methods to extract coordinates that exhibit strong ‘patterns’, defined by various committor-based metrics, from a candidate pool in a database prepared from simulation data. Here, there are three main issues: (1) the sufficient information required to correctly determine the reaction coordinates from the database, (2) the computational cost involved in preparing the database and (3) principles for constructing a complete candidate pool in the sense that the inclusion of the correct reaction coordinates is guaranteed.

Regarding the first issue, the GNN method [19] requires a database that contains accurate committor information over the entire range from 0 to 1, whereas the kernel PCA method [24,82] only requires information around the committor value of 0.5, which reduces the computational cost considerably but may also limit the applicability of the method to reactions with high barriers and ballistic dynamics. The likelihood maximisation method reduces the computational cost at the expense of the accuracy of the committor information in the database, which could potentially limit the reliability of the reaction coordinates selected by the method. Therefore, how to properly balance the accuracy and range of committor information against the computational cost remains a challenge at the current stage, and an improved understanding of the minimum sufficient information required for reliable selection of reaction coordinates will greatly help to clarify the situation. In addition, the current approach for preparing the candidate pool is intuition based and non-systematic. A drawback of such an approach is that it is difficult to determine when the candidate pool is complete. A clear example in this regard is the solvent coordinate essential for the isomerisation reaction of the alanine dipeptide in explicit water,[19] which turned out to be the electrostatic torque on the solute from the solvent molecules, a highly counter-intuitive coordinate that one would be expected to consider only after a sufficient amount of trial and error. Of course, our intuition on this will improve after more complex systems have been studied using the rigorous approach, but a more first-principles-based systematic approach should be the direction to pursue.

Acknowledgments

The research was supported in part by a National Institutes of Health grant (R01GM086536) awarded to A. Ma.

References

  • 1. Wigner E. The transition state method. Trans Faraday Soc. 1938;34:29–41.
  • 2. Chandler D. Statistical mechanics of isomerization dynamics in liquids and the transition state approximation. J Chem Phys. 1978;68:2959–2970.
  • 3. Kramers H. Brownian motion in a field of force and the diffusion model of chemical reactions. Physica. 1940;7:284–304.
  • 4. Torrie GM, Valleau JP. Nonphysical sampling distributions in Monte Carlo free-energy estimation: umbrella sampling. J Comput Phys. 1977;23(2):187–199.
  • 5. Laio A, Gervasio FL. Metadynamics: a method to simulate rare events and reconstruct the free energy in biophysics, chemistry and material science. Rep Prog Phys. 2008;71(12):126601.
  • 6. Zheng L, Chen M, Yang W. Random walk in orthogonal space to achieve efficient free-energy simulation of complex systems. Proc Natl Acad Sci. 2008;105(51):20227–20232. doi: 10.1073/pnas.0810631106.
  • 7. Zheng L, Chen M, Yang W. Simultaneous escaping of explicit and hidden free energy barriers: application of the orthogonal space random walk strategy in generalized ensemble based conformational sampling. J Chem Phys. 2009;130:234105. doi: 10.1063/1.3153841.
  • 8. Zheng L, Yang W. Practically efficient and robust free energy calculations: double-integration orthogonal space tempering. J Chem Theory Comput. 2012;8(3):810–823. doi: 10.1021/ct200726v.
  • 9. Sprik M, Ciccotti G. Free energy from constrained molecular dynamics. J Chem Phys. 1998;109:7737–7744.
  • 10. Li W, Rudack T, Gerwert K, Gräter F, Schlitter J. Exploring the multidimensional free energy surface of phosphoester hydrolysis with constrained QM/MM dynamics. J Chem Theory Comput. 2012;8(10):3596–3604. doi: 10.1021/ct300022m.
  • 11. Krivov SV, Karplus M. One-dimensional free-energy profiles of complex systems: progress variables that preserve the barriers. J Phys Chem B. 2006;110(25):12689–12698. doi: 10.1021/jp060039b.
  • 12. Krivov SV, Karplus M. Diffusive reaction dynamics on invariant free energy profiles. Proc Natl Acad Sci. 2008;105(37):13841–13846. doi: 10.1073/pnas.0800228105.
  • 13. Tenenbaum JB, De Silva V, Langford JC. A global geometric framework for nonlinear dimensionality reduction. Science. 2000;290(5500):2319–2323. doi: 10.1126/science.290.5500.2319.
  • 14. Das P, Moll M, Stamati H, Kavraki LE, Clementi C. Low-dimensional, free-energy landscapes of protein-folding reactions by nonlinear dimensionality reduction. Proc Natl Acad Sci. 2006;103(26):9885–9890. doi: 10.1073/pnas.0603553103.
  • 15. Rohrdanz M, Zheng W, Maggioni M, Clementi C. Determination of reaction coordinates via locally scaled diffusion map. J Chem Phys. 2011;134:124116. doi: 10.1063/1.3569857.
  • 16. Coifman R, Kevrekidis IG, Lafon S, Maggioni M, Nadler B. Diffusion maps, reduction coordinates, and low dimensional representation of stochastic systems. Multiscale Model Simul. 2008;7(2):842–864.
  • 17. Coifman R, Lafon S. Diffusion maps. Appl Comput Harmon Anal. 2006;21(1):5–30.
  • 18. Ceriotti M, Tribello G, Parrinello M. Simplifying the representation of complex free-energy landscapes using sketch-map. Proc Natl Acad Sci. 2011;108(32):13023–13028. doi: 10.1073/pnas.1108486108.
  • 19. Ma A, Dinner AR. Automatic method for identifying reaction coordinates in complex systems. J Phys Chem B. 2005;109(14):6769–6779. doi: 10.1021/jp045546c.
  • 20. Peters B, Trout BL. Obtaining reaction coordinates by likelihood maximization. J Chem Phys. 2006;125:054108. doi: 10.1063/1.2234477.
  • 21. Peters B, Beckham GT, Trout BL. Extensions to the likelihood maximization approach for finding reaction coordinates. J Chem Phys. 2007;127(3):034109. doi: 10.1063/1.2748396.
  • 22. Peters B. Inertial likelihood maximization for reaction coordinates with high transmission coefficients. Chem Phys Lett. 2012;554:248–253.
  • 23. Lechner W, Rogal J, Juraszek J, Ensing B, Bolhuis PG. Nonlinear reaction coordinate analysis in the reweighted path ensemble. J Chem Phys. 2010;133:174110. doi: 10.1063/1.3491818.
  • 24. Antoniou D, Schwartz SD. Toward identification of the reaction coordinate directly from the transition state ensemble using the kernel PCA method. J Phys Chem B. 2011;115(10):2465–2469. doi: 10.1021/jp111682x.
  • 25. Onsager L. Initial recombination of ions. Phys Rev. 1938;54(8):554–557.
  • 26. Van Kampen NG. Escape and splitting probabilities in diffusive and non-diffusive Markov processes. Prog Theor Phys Supp. 1978;64:389–401.
  • 27. Gardiner CW. Handbook of stochastic methods. Vol. 3. Berlin: Springer; 1985.
  • 28. Pratt LR. A statistical method for identifying transition states in high dimensional problems. J Chem Phys. 1986;85:5045–5048.
  • 29. Ryter D. Noise-induced transitions in a double-well potential at low friction. J Stat Phys. 1987;49(3–4):751–765.
  • 30. Ryter D. On the eigenfunctions of the Fokker–Planck operator and of its adjoint. Phys A Stat Mech Appl. 1987;142(1):103–121.
  • 31. Klosek MM, Matkowsky BJ, Schuss Z. The Kramers problem in the turnover regime: the role of the stochastic separatrix. Berich Bunseng Physi Chem. 1991;95(3):331–337.
  • 32. Du R, Pande VS, Grosberg AY, Tanaka T, Shakhnovich ES. On the transition coordinate for protein folding. J Chem Phys. 1998;108:334–350.
  • 33. Geissler PL, Dellago C, Chandler D. Kinetic pathways of ion pair dissociation in water. J Phys Chem B. 1999;103(18):3706–3710.
  • 34. Bolhuis PG, Dellago C, Chandler D. Reaction coordinates of biomolecular isomerization. Proc Natl Acad Sci. 2000;97(11):5877–5882. doi: 10.1073/pnas.100127697.
  • 35. Best RB, Hummer G. Reaction coordinates and rates from transition paths. Proc Natl Acad Sci USA. 2005;102(19):6732–6737. doi: 10.1073/pnas.0408098102.
  • 36. Quaytman S, Schwartz SD. Reaction coordinate of an enzymatic reaction revealed by transition path sampling. Proc Natl Acad Sci. 2007;104(30):12253–12258. doi: 10.1073/pnas.0704304104.
  • 37. Juraszek J, Bolhuis PG. Sampling the multiple folding mechanisms of Trp-cage in explicit solvent. Proc Natl Acad Sci. 2006;103(43):15859–15864. doi: 10.1073/pnas.0606692103.
  • 38. Hu J, Ma A, Dinner AR. A two-step nucleotide-flipping mechanism enables kinetic discrimination of DNA lesions by AGT. Proc Natl Acad Sci. 2008;105(12):4615–4620. doi: 10.1073/pnas.0708058105.
  • 39. Li W, Gräter F. Atomistic evidence of how force dynamically regulates thiol/disulfide exchange. J Am Chem Soc. 2010;132(47):16790–16795. doi: 10.1021/ja104763q.
  • 40. Snow CD, Rhee YM, Pande VS. Kinetic definition of protein folding transition state ensembles and reaction coordinates. Biophys J. 2006;91(1):14–24. doi: 10.1529/biophysj.105.075689.
  • 41. Hagan MF, Dinner AR, Chandler D, Chakraborty AK. Atomistic understanding of kinetic pathways for single base-pair binding and unbinding in DNA. Proc Natl Acad Sci. 2003;100(24):13922–13927. doi: 10.1073/pnas.2036378100.
  • 42. Radhakrishnan R, Trout BL. Nucleation of hexagonal ice (Ih) in liquid water. J Am Chem Soc. 2003;125(25):7743–7747. doi: 10.1021/ja0211252.
  • 43. Moroni D, ten Wolde PR, Bolhuis PG. Interplay between structure and size in a critical crystal nucleus. Phys Rev Lett. 2005;94(23):235703. doi: 10.1103/PhysRevLett.94.235703.
  • 44. Gsponer J, Caflisch A. Molecular dynamics simulations of protein folding from the transition state. Proc Natl Acad Sci. 2002;99(10):6719–6724. doi: 10.1073/pnas.092686399.
  • 45. Chu JW, Brooks BR, Trout B. Oxidation of methionine residues in aqueous solutions: free methionine and methionine in granulocyte colony-stimulating factor. J Am Chem Soc. 2004;126(50):16601–16607. doi: 10.1021/ja0467059.
  • 46. Chen M, Yang W. On-the-path random walk sampling for efficient optimization of minimum free-energy path. J Comput Chem. 2009;30(11):1649–1653. doi: 10.1002/jcc.21311.
  • 47. Cao L, Lv C, Yang W. Hidden conformation events in DNA base extrusions: a generalized-ensemble path optimization and equilibrium simulation study. J Chem Theory Comput. 2013;9(8):3756–3768. doi: 10.1021/ct400198q.
  • 48. Peters B. Using the histogram test to quantify reaction coordinate error. J Chem Phys. 2006;125(24):241101. doi: 10.1063/1.2409924.
  • 49. Rhee YM, Pande VS. One-dimensional reaction coordinate and the corresponding potential of mean force from commitment probability distribution. J Phys Chem B. 2005;109(14):6780–6786. doi: 10.1021/jp045544s.
  • 50. Berezhkovskii A, Szabo A. One-dimensional reaction coordinates for diffusive activated rate processes in many dimensions. J Chem Phys. 2005;122:014503. doi: 10.1063/1.1818091.
  • 51. Berezhkovskii AM, Szabo A. Diffusion along the splitting/commitment probability reaction coordinate. J Phys Chem B. 2013;117:13115–13119. doi: 10.1021/jp403043a.
  • 52. Ma A, Nag A, Dinner AR. Dynamic coupling between coordinates in a model for biomolecular isomerization. J Chem Phys. 2006;124(14):144911. doi: 10.1063/1.2183768.
  • 53. Klähn M, Rosta E, Warshel A. On the mechanism of hydrolysis of phosphate monoesters dianions in solutions and proteins. J Am Chem Soc. 2006;128(47):15310–15323. doi: 10.1021/ja065470t.
  • 54. Rosta E, Woodcock HL, Brooks BR, Hummer G. Artificial reaction coordinate “tunneling” in free-energy calculations: the catalytic reaction of RNase H. J Comput Chem. 2009;30(11):1634–1641. doi: 10.1002/jcc.21312.
  • 55. Dinner AR, Karplus M. The thermodynamics and kinetics of protein folding: a lattice model analysis of multiple pathways with intermediates. J Phys Chem B. 1999;103(37):7976–7994.
  • 56. Hummer G. From transition paths to transition states and rate coefficients. J Chem Phys. 2004;120:516–523. doi: 10.1063/1.1630572.
  • 57. Peters B. p(TP|q) peak maximization: necessary but not sufficient for reaction coordinate accuracy. Chem Phys Lett. 2010;494(1):100–103.
  • 58. Snow CD, Zagrovic B, Pande VS. The Trp cage: folding kinetics and unfolded state topology via molecular dynamics simulations. J Am Chem Soc. 2002;124(49):14548–14549. doi: 10.1021/ja028604l.
  • 59. Lane TJ, Bowman GR, Beauchamp K, Voelz VA, Pande VS. Markov state model reveals folding and functional dynamics in ultra-long MD trajectories. J Am Chem Soc. 2011;133(45):18413–18419. doi: 10.1021/ja207470h.
  • 60. Bowman GR, Pande VS. Protein folded states are kinetic hubs. Proc Natl Acad Sci. 2010;107(24):10890–10895. doi: 10.1073/pnas.1003962107.
  • 61. Piana S, Lindorff-Larsen K, Shaw DE. Protein folding kinetics and thermodynamics from atomistic simulation. Proc Natl Acad Sci. 2012;109(44):17845–17850. doi: 10.1073/pnas.1201811109.
  • 62. Dellago C, Bolhuis PG, Csajka FS, Chandler D. Transition path sampling and the calculation of rate constants. J Chem Phys. 1998;108:1964–1977.
  • 63. Dellago C, Bolhuis PG, Chandler D. On the calculation of reaction rate constants in the transition path ensemble. J Chem Phys. 1999;110:6617–6625.
  • 64. Bolhuis PG, Dellago C, Chandler D. Sampling ensembles of deterministic transition pathways. Faraday Discuss. 1998;110:421–436.
  • 65. Bolhuis PG, Chandler D, Dellago C, Geissler PL. Transition path sampling: throwing ropes over rough mountain passes, in the dark. Ann Rev Phys Chem. 2002;53(1):291–318. doi: 10.1146/annurev.physchem.53.082301.113146.
  • 66. So SS, Karplus M. Genetic neural networks for quantitative structure–activity relationships: improvements and application of benzodiazepine affinity for benzodiazepine/GABAA receptors. J Med Chem. 1996;39(26):5246–5256. doi: 10.1021/jm960536o.
  • 67. So SS, Karplus M. Evolutionary optimization in quantitative structure–activity relationship: an application of genetic neural networks. J Med Chem. 1996;39(7):1521–1530. doi: 10.1021/jm9507035.
  • 68. van Erp TS, Moroni D, Bolhuis PG. A novel path sampling method for the calculation of rate constants. J Chem Phys. 2003;118:7762–7774.
  • 69. Van Erp T, Bolhuis PG. Elaborating transition interface sampling methods. J Comput Phys. 2005;205(1):157–181.
  • 70. Allen RJ, Valeriani C, ten Wolde PR. Forward flux sampling for rare event simulations. J Phys Condens Matter. 2009;21(46):463102. doi: 10.1088/0953-8984/21/46/463102.
  • 71. Qi B, Muff S, Caflisch A, Dinner AR. Extracting physically intuitive reaction coordinates from transition networks of a β-sheet miniprotein. J Phys Chem B. 2010;114(20):6979–6989. doi: 10.1021/jp101476g.
  • 72. Weinan E, Ren W, Vanden-Eijnden E. String method for the study of rare events. Phys Rev B. 2002;66(5):052301. doi: 10.1021/jp0455430.
  • 73. Weinan E, Ren W, Vanden-Eijnden E. Simplified and improved string method for computing the minimum energy paths in barrier-crossing events. J Chem Phys. 2007;126:164103. doi: 10.1063/1.2720838.
  • 74. Maragliano L, Fischer A, Vanden-Eijnden E, Ciccotti G. String method in collective variables: minimum free energy paths and isocommittor surfaces. J Chem Phys. 2006;125:024106. doi: 10.1063/1.2212942.
  • 75. Maragliano L, Vanden-Eijnden E. On-the-fly string method for minimum free energy paths calculation. Chem Phys Lett. 2007;446(1):182–190.
  • 76. Weinan E, Vanden-Eijnden E. Towards a theory of transition paths. J Stat Phys. 2006;123(3):503–523.
  • 77. Vreede J, Juraszek J, Bolhuis PG. Predicting the reaction coordinates of millisecond light-induced conformational changes in photoactive yellow protein. Proc Natl Acad Sci. 2010;107(6):2397–2402. doi: 10.1073/pnas.0908754107.
  • 78. Juraszek J, Bolhuis PG. Rate constant and reaction coordinate of Trp-cage folding in explicit water. Biophys J. 2008;95(9):4246–4257. doi: 10.1529/biophysj.108.136267.
  • 79. Lechner W, Dellago C, Bolhuis PG. Role of the prestructured surface cloud in crystal nucleation. Phys Rev Lett. 2011;106(8):085701. doi: 10.1103/PhysRevLett.106.085701.
  • 80. Lechner W, Dellago C, Bolhuis PG. Reaction coordinates for the crystal nucleation of colloidal suspensions extracted from the reweighted path ensemble. J Chem Phys. 2011;135:154110. doi: 10.1063/1.3651367.
  • 81. Xi L, Shah M, Trout BL. Hopping of water in a glassy polymer studied via transition path sampling and likelihood maximization. J Phys Chem B. 2013;117(13):3634–3647. doi: 10.1021/jp3099973.
  • 82. Antoniou D, Schwartz SD. The stochastic separatrix and the reaction coordinate for complex systems. J Chem Phys. 2009;130:151103. doi: 10.1063/1.3123162.
  • 83. Antoniou D, Schwartz SD. Reply to “comment on ‘toward identification of the reaction coordinate directly from the transition state ensemble using the kernel PCA method’”. J Phys Chem B. 2011;115(43):12674–12675. doi: 10.1021/jp207463g.
  • 84. Basner JE, Schwartz SD. How enzyme dynamics helps catalyze a reaction in atomic detail: a transition path sampling study. J Am Chem Soc. 2005;127(40):13822–13831. doi: 10.1021/ja043320h.
  • 85. Fukui K. Formulation of the reaction coordinate. J Phys Chem. 1970;74(23):4161–4163.
  • 86. Quapp W, Heidrich D. Analysis of the concept of minimum energy path on the potential energy surface of chemically reacting systems. Theor Chim Acta. 1984;66(3–4):245–260.
  • 87. Gonzalez C, Schlegel HB. Reaction path following in mass-weighted internal coordinates. J Phys Chem. 1990;94(14):5523–5527.
  • 88. Vanden-Eijnden E. Computer simulations in condensed matter systems: from materials to chemical biology. Vol. 1. Berlin, Heidelberg: Springer; 2006. Transition path theory; pp. 453–493.
  • 89. Weinan E, Vanden-Eijnden E. Transition-path theory and path-finding algorithms for the study of rare events. Annu Rev Phys Chem. 2010;61:391–420. doi: 10.1146/annurev.physchem.040808.090412.
  • 90. Henkelman G, Uberuaga BP, Jónsson H. A climbing image nudged elastic band method for finding saddle points and minimum energy paths. J Chem Phys. 2000;113:9901–9904.
  • 91. Sheppard D, Terrell R, Henkelman G. Optimization methods for finding minimum energy paths. J Chem Phys. 2008;128:134106. doi: 10.1063/1.2841941.
  • 92. Jonsson H, Mills G, Jacobsen KW. Nudged elastic band method for finding minimum energy paths of transitions. In: Berne BJ, Ciccotti G, Coker DF, editors. Classical and quantum dynamics in condensed phase simulations. Singapore: World Scientific; 1998. pp. 385–404.
  • 93. Elder RM, Jayaraman A. Sequence-specific recognition of cancer drug–DNA adducts by HMGB1a repair protein. Biophys J. 2012;102(10):2331–2338. doi: 10.1016/j.bpj.2012.04.013.
  • 94. Miller TF, Vanden-Eijnden E, Chandler D. Solvent coarse-graining and the string method applied to the hydrophobic collapse of a hydrated chain. Proc Natl Acad Sci. 2007;104(37):14559–14564. doi: 10.1073/pnas.0705830104.
  • 95. Ren W, Vanden-Eijnden E, Maragakis P, Weinan E. Transition pathways in complex systems: application of the finite-temperature string method to the alanine dipeptide. J Chem Phys. 2005;123:134109. doi: 10.1063/1.2013256.
  • 96. Yazyev OV, Pasquarello A. Effect of metal elements in catalytic growth of carbon nanotubes. Phys Rev Lett. 2008;100(15):156102. doi: 10.1103/PhysRevLett.100.156102.
  • 97. Greeley J, Mavrikakis M. Alloy catalysts designed from first principles. Nat Mat. 2004;3(11):810–815. doi: 10.1038/nmat1223.
  • 98. Krasheninnikov AV, Lehtinen PO, Foster AS, Pyykkö P, Nieminen RM. Embedding transition-metal atoms in graphene: structure, bonding, and magnetism. Phys Rev Lett. 2009;102(12):126807. doi: 10.1103/PhysRevLett.102.126807.
  • 99. Rohrdanz MA, Zheng W, Clementi C. Discovering mountain passes via torchlight: methods for the definition of reaction coordinates and pathways in complex macromolecular reactions. Ann Rev Phys Chem. 2013;64:295–316. doi: 10.1146/annurev-physchem-040412-110006.
  • 100. Krivov SV, Karplus M. Free energy disconnectivity graphs: application to peptide models. J Chem Phys. 2002;117:10894.
  • 101. Krivov SV, Karplus M. Hidden complexity of free energy surfaces for peptide (protein) folding. Proc Natl Acad Sci USA. 2004;101(41):14766–14770. doi: 10.1073/pnas.0406234101.
  • 102. Krivov SV. On reaction coordinate optimality. J Chem Theory Comput. 2012;9(1):135–146. doi: 10.1021/ct3008292.
  • 103. Jolliffe IT. Principal component analysis. 2nd ed. New York: Springer; 2002.
  • 104. Coifman RR, Lafon S, Lee A, Maggioni M, Nadler B, Warner F, Zucker SW. Geometric diffusions as a tool for harmonic analysis and structure definition of data: diffusion maps. Proc Natl Acad Sci USA. 2005;102(21):7426–7431. doi: 10.1073/pnas.0500334102.
  • 105. Krivov SV. Numerical construction of the p fold (committor) reaction coordinate for a Markov process. J Phys Chem B. 2011;115(39):11382–11388. doi: 10.1021/jp205231b.
