Abstract
Milestoning is a theory and an algorithm that computes kinetics and thermodynamics at long time scales. It is based on partitioning the (phase) space into cells and running a large number of short trajectories between the boundaries of the cells. The termination points of the trajectories are analyzed with the Milestoning theory to obtain kinetic and thermodynamic information. Managing the tens to hundreds of thousands of Milestoning trajectories is a challenge, which we handle with a Python script, ScMiles. Here, we introduce a new version of the script, ScMiles2, to conduct Milestoning simulations. Major enhancements are: (i) post analysis of Milestoning trajectories to obtain the free energy, the mean first passage time, the committor function, and exit times; (ii) the same post analysis for a single long trajectory; (iii) support for the GROMACS software in addition to NAMD; (iv) a restart option; (v) automated finding, sampling, and launching of trajectories from new milestones that are found on the fly; and (vi) support for Milestoning calculations with several coarse variables and for complex reaction coordinates. We also evaluate the simulation parameters and suggest new algorithmic features to enhance the rate of convergence of observables. We propose the use of an iteration-averaged kinetic matrix for a rapid approach to asymptotic values. Illustrations are provided for small systems and one large example.

I. Introduction
One of the outstanding challenges in computer simulations of molecular processes is the observation of long-time events. Equations of motion are integrated as a function of time in small steps. The integration time step is of the order of one femtosecond (fs), which is many orders of magnitude shorter than the seconds to hours required for significant progress in many biochemical and molecular biophysics processes. Therefore, straightforward integration of trajectories to the timescales of minutes and hours with fs time steps is exceptionally difficult or infeasible, even when using specially designed computer hardware.1
Enhanced simulation methods to study long time scales were therefore developed. Examples of such techniques are the Markov State Model (MSM),2 Weighted Ensemble (WE),3–5 Transition Interface Sampling (TIS),6 Non-Equilibrium Umbrella Sampling (NEUS),7 Forward Flux,8 Adaptive Multilevel Splitting (AMS),9 and Milestoning.10 Variants of Milestoning were also proposed,11–12 as well as combinations of Milestoning with other techniques.13–14 The Milestoning approach is based on running short trajectories (typically of nanosecond length) and using this information to construct a stochastic model for the entire process.
The present manuscript describes an expansion of a software implementation of the Milestoning algorithm, ScMiles.15 Another software implementation of Milestoning, which emphasizes binding of small molecules to proteins, is available in SEEKR.16 ScMiles exploits the algorithm of exact Milestoning10 and allows the simulation of a multidimensional coarse space. A number of review articles on Milestoning and its applications have been published.17–20 Therefore, in this manuscript we only summarize the theory and the algorithm and focus instead on the implementation. We refer the reader to the detailed discussions available elsewhere.
At the core of the Milestoning theory is the partition of a coarse space into cells (Figure 1). The coarse space typically includes 1–3 internal variables such as distances, angles, and torsions. The only strict requirement from the coordinates of the coarse space is that they must distinguish the reactant from the product. Other considerations influence the efficiency of the algorithm, but not its correctness. For example, an efficient implementation will have the transition probabilities between milestones as close to uniform as possible.
Figure 1.

A schematic representation of Milestoning trajectories (red arrowed lines) in a two-dimensional coarse space (Q1 and Q2). The milestones are thin blue lines that bound the square cells. For illustration we marked one of the milestones by a thick purple line. Trajectories are initiated from the purple milestone and are continued until they terminate on any other milestone. Also shown are two cells representing the reactants and the products in blue and green, respectively.
We conduct short trajectories between the boundaries of the cells (Figure 1). Trajectories are computed from an initiating milestone until they cross for the first time another milestone. The same procedure is repeated for all active milestones. The simulation provides a set of hitting points at the terminating milestones, the probability to observe a transition between two milestones, and the times required for such transitions.
The exploitation of short trajectories between milestones is similar to the operations in the algorithms of the Weighted Ensemble (WE),5, 21 Adaptive Multilevel Splitting (AMS),9 Non-Equilibrium Umbrella Sampling (NEUS),7 and more. However, important differences remain. Critical simplifying features of Milestoning are that the trajectories are short and independent of each other, making it possible to use massively parallel resources. The weights of each of the sampled trajectories are identical (weight of one) providing ample and straightforward statistics of trajectory fragments.
The free energy and kinetics are recovered using the short trajectories and the Milestoning theory.17 In brief, let the probability of a trajectory initiated at milestone i and terminated at milestone j be Kij. This probability is estimated from the computed trajectories as
| $K_{ij}(x_i, x_j) \cong \dfrac{n_{ij}(x_i, x_j)}{n_i(x_i)}$ | (1) |
where $n_i(x_i)$ is the number of trajectories initiated at milestone i at position $x_i$, and $n_{ij}(x_i, x_j)$ is the number of trajectories initiated at milestone i at $x_i$ that next cross milestone j at $x_j$. The eigenvector of the K matrix with an eigenvalue of one is $q$, i.e., $q^{T} K = q^{T}$. The length of the vector q is the number of milestones, and the elements are the stationary fluxes through each of the milestones. Since the dimension of the matrix K can be large, we frequently focus on the calculation of q using power iterations of K, i.e.,
| $q_j^{(n+1)}(x_j) = \sum_i \int q_i^{(n)}(x_i)\, K_{ij}(x_i, x_j)\, dx_i$ | (2) |
The iteration number is n. The flux is expressed as a product of two terms, $q_i(x_i) = w_i f_i(x_i)$: $w_i$, which is the milestone weight, and $f_i(x_i)$, which is the normalized distribution of first crossing points at milestone i. The initial guess for the distribution of points, $f_i^{(0)}(x_i)$, is the canonical distribution restricted to the milestone. Substituting the more explicit expression for the flux in the iterative equation we obtain separate equations for the weights and the distributions in the milestone
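| $w_j^{(n+1)} f_j^{(n+1)}(x_j) = \sum_i w_i^{(n)} \int f_i^{(n)}(x_i)\, K_{ij}(x_i, x_j)\, dx_i$ | (3) |
We call for brevity $K_{ij}^{(n)} = \iint f_i^{(n)}(x_i)\, K_{ij}(x_i, x_j)\, dx_i\, dx_j$, which gives a compact linear equation for the weights, $w_j^{(n+1)} = \sum_i w_i^{(n)} K_{ij}^{(n)}$. Given an initial guess for $f^{(0)}$ we have a linear equation for the weights, which we solve first. Then we determine the distributions in the milestone for the next iteration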
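| $f_j^{(n+1)}(x_j) = \dfrac{1}{w_j^{(n)}} \sum_i w_i^{(n)} \int f_i^{(n)}(x_i)\, K_{ij}(x_i, x_j)\, dx_i$ | (4) |
Note that in Equation (4) we divide by $w_j^{(n)}$ instead of $w_j^{(n+1)}$. The replacement is exact once the iterations have converged, when $w^{(n+1)} = w^{(n)}$. From the short trajectories we also estimate the lifetime of milestone i, called $t_i$. It is the average of the trajectory times from initiation at milestone i until they reach any other milestone, with $t_l^{(i)}$ the termination time of the l-th of $n_i$ trajectories initiated at milestone i.
| $t_i = \dfrac{1}{n_i} \sum_{l=1}^{n_i} t_l^{(i)}$ | (5) |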
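We can use the iteration number to define a hierarchy of approximations that approaches the exact solution.10 In this hierarchy the result of the first iteration, which starts from the canonical distribution at the milestones, is the first approximation to the exact q and corresponds to the method of “classical Milestoning”.22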
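The power iteration of Eq. (2) can be illustrated for the coarse (position-integrated) transition matrix. The following is a minimal sketch and not the ScMiles2 implementation: the function name `stationary_flux` is hypothetical, and the damping step (averaging each update with the previous vector, which leaves the fixed point unchanged) is an assumption of this example, added so that the iteration also converges for periodic milestone chains.

```python
def stationary_flux(K, tol=1e-12, max_iter=100000):
    """Damped power iteration for the left eigenvector q of K (q = q K).

    K is a row-stochastic matrix given as a list of lists.
    """
    n = len(K)
    q = [1.0 / n] * n  # uniform initial guess
    for _ in range(max_iter):
        # One application of K from the left: (q K)_j = sum_i q_i K_ij
        qK = [sum(q[i] * K[i][j] for i in range(n)) for j in range(n)]
        # Damping: average with the previous vector (assumption of this sketch)
        new_q = [0.5 * (a + b) for a, b in zip(q, qK)]
        norm = sum(new_q)
        new_q = [x / norm for x in new_q]
        if max(abs(a - b) for a, b in zip(new_q, q)) < tol:
            return new_q
        q = new_q
    return q


# Three milestones with a cyclic boundary: the product (index 2) returns to the reactant (index 0).
K = [[0.0, 1.0, 0.0],
     [0.5, 0.0, 0.5],
     [1.0, 0.0, 0.0]]
q = stationary_flux(K)
```

For this small matrix the stationary vector can be checked by hand: q = (0.4, 0.4, 0.2).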
With an estimate for the stationary fluxes, q, and the life times, t, at hand, we can compute common thermodynamic and kinetic observables. The mean first passage time, τ, from the reactant to the products (with qf the flux of the product) is
| $\tau = \dfrac{\sum_i q_i t_i}{q_f}$ | (6) |
The free energy of milestone i, $F_i$, is given by
| $F_i = -k_B T \ln\left(q_i t_i\right)$ | (7) |
Current observables include: the free energy, the mean first passage time, the exit times, and the committor function.23 The observables are computed as a function of the milestone index.
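As an illustration of how the observables follow from q and t, the sketch below evaluates the standard Milestoning expressions for the mean first passage time (the sum of q_i t_i divided by the product flux) and the free energy (minus k_B T times the logarithm of q_i t_i). The function names and the convention of returning an infinite free energy for a milestone with zero weight are assumptions of this example, not part of ScMiles2.

```python
import math


def mean_first_passage_time(q, t, product):
    """MFPT from stationary fluxes q, lifetimes t, and the product milestone index."""
    return sum(qi * ti for qi, ti in zip(q, t)) / q[product]


def free_energy(q, t, kT=1.0):
    """Free energy of each milestone, -kT ln(q_i t_i), in units of kT by default."""
    energies = []
    for qi, ti in zip(q, t):
        p = qi * ti
        # A milestone with zero weight (e.g. an absorbing product with t = 0)
        # is assigned an infinite free energy in this sketch.
        energies.append(-kT * math.log(p) if p > 0 else math.inf)
    return energies


# Fluxes of a three-milestone cyclic model and hypothetical lifetimes.
q = [0.4, 0.4, 0.2]
t = [1.0, 2.0, 0.0]
tau = mean_first_passage_time(q, t, product=2)
F = free_energy(q, t)
```

With these numbers the MFPT equals 6 time units, which agrees with solving the first-passage equations of the three-state model directly.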
Eq. (2) implies that trajectories that terminate at a milestone and contribute to are continued in the next iteration until they cross yet another milestone. An exact continuation uses the stored coordinate and velocity vectors at the time of trajectory termination. However, the iteration process can lead to statistical challenges, which we address by “enrichment”. Enrichment is automatically invoked in exact Milestoning. Enrichment is a process to make the sampling at the milestones as uniform as possible. If one of the milestones accumulates only a small number of crossings in the last iteration, then the statistics of transitions from that milestone in the next iteration is likely to be poor. To obtain a uniform sampling at all the milestones we set the number of trajectories emerging from a milestone to be fixed, say L, which is typically in the hundreds. If the number of terminating trajectories at milestone i is smaller than L, we use some of the termination points to initiate more trajectories until we have L trajectories at milestone i. We use the same spatial coordinates to start additional trajectories, but vary the velocities or the forces.
The most drastic change we make to the termination point to initiate a new trajectory is to resample the velocities from the canonical distribution (note that the final velocities of Milestoning trajectories are kept upon termination and can be used to initiate a new iteration). The error compared to a direct continuation of the trajectory is therefore of order of the time step ∆t. This procedure is appropriate for dynamics with high friction which is close to overdamped dynamics. It is less appropriate for an underdamped Langevin with a small friction coefficient and significant velocity memory. A gentler approach is to enrich the trajectory while starting with the same coordinates and velocities and resampling random forces. In the last case the modified term is of order of and the generated configurations represent the canonical distribution.
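The enrichment by velocity resampling can be sketched as follows. This is a simplified illustration under stated assumptions: each termination point is a pair (coordinates, velocities) with one mass per degree of freedom, kT and the masses are in consistent units, and the function name `enrich` is hypothetical. The termination points are reused cyclically, and the added copies receive velocities drawn from the Maxwell-Boltzmann distribution.

```python
import math
import random


def enrich(termination_points, L, masses, kT, rng=None):
    """Return L starting points for the next iteration.

    termination_points: non-empty list of (coords, velocities) pairs.
    Existing points are kept as they are (exact continuation); extra points
    reuse the coordinates and draw new velocities with variance kT / m.
    """
    rng = rng or random.Random()
    starts = list(termination_points[:L])
    i = 0
    while len(starts) < L:
        coords, _ = termination_points[i % len(termination_points)]
        velocities = [rng.gauss(0.0, math.sqrt(kT / m)) for m in masses]
        starts.append((coords, velocities))
        i += 1
    return starts


points = [([0.0, 1.0], [0.1, 0.2]), ([1.0, 1.0], [0.3, 0.4])]
starts = enrich(points, L=5, masses=[1.0, 1.0], kT=1.0, rng=random.Random(0))
```

The gentler variant described above would instead keep both coordinates and velocities and rely on different random forces in the Langevin integrator, which is controlled by the MD engine rather than by a script of this kind.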
Noise and fluctuations may remain significant even with enrichment, and we take additional steps to reduce the noise. Consider the transition matrix K. The number of matrix elements is about the square of the number of milestones. Each of the elements has a statistical error that is reflected in the values of the observables, such as the MFPT. In the past we estimated the error bars for the transition matrix and the lifetime vector using sampling from model distributions.24–25 Here we use an alternative approach, which is appropriate for exact Milestoning. Instead of using the matrix of iteration i we use the average of the K matrices of previous iterations up to i, i.e.
| $\bar{K}^{(i)} = \dfrac{1}{i - k + 1} \sum_{l=k}^{i} K^{(l)}$ | (8) |
The averaging stabilizes the fluctuations and converges asymptotically to the right matrix. However, the rate of convergence to the final matrix is uncertain. Therefore, it is useful to consider a window for the averaging that starts at iteration k. In the current paper we show averages over the complete history for alanine dipeptide (i.e., k = 1), which show considerable improvement. We also show a conformational transition of adenosine, for which averages over the last 5 iterations (k = i − 4) are preferred.
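The windowed average of Eq. (8) is simple to write down. The sketch below assumes that the K matrices of all iterations have the same dimension (no milestones added between iterations) and are stored in a list ordered by iteration; the function name `averaged_K` is hypothetical.

```python
def averaged_K(history, k=1):
    """Average the K matrices of iterations k..i (1-based), where i = len(history)."""
    if not 1 <= k <= len(history):
        raise ValueError("k must be between 1 and the number of iterations")
    window = history[k - 1:]
    m = len(window)
    n = len(window[0])
    return [[sum(K[r][c] for K in window) / m for c in range(n)]
            for r in range(n)]


history = [
    [[0.0, 1.0], [1.0, 0.0]],
    [[0.5, 0.5], [0.5, 0.5]],
    [[1.0, 0.0], [0.0, 1.0]],
]
full_average = averaged_K(history, k=1)  # complete history
last_two = averaged_K(history, k=2)      # window of the last two iterations
```

Averaging over the last five iterations, as used for adenosine, corresponds to `k = len(history) - 4`.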
In complex systems the number of trajectories may exceed hundreds of thousands, which is a significant number to initiate, monitor, and analyze. In earlier applications of Milestoning the trajectories were monitored manually, adding significant human burden and potential errors to the studies. Therefore, we recently introduced a Python script called ScMiles15 to automate the launching of the trajectories, monitor their progress, and analyze the results. The present manuscript describes a significant upgrade of the first version, as discussed in the Methods section.
Significant new features of ScMiles2 are: (i) An analysis module that processes Milestoning trajectories to obtain kinetic and thermodynamic results within Milestoning theory. (ii) Similar to (i) but for a single long trajectory, which is a common practice in Molecular Dynamics (MD) simulations. The same module allows for manipulation and further analyses of the milestone network. (iii) The ability to run Milestoning with GROMACS, diversifying the choice of the Molecular Dynamics engine. (iv) A restart option to continue the simulation when it is stopped, even if the script was terminated prematurely. (v) Automated addition of newly detected milestones. (vi) Support for Milestoning calculations with several coarse variables and for complex reaction coordinates.
II. Method
II.1. Setting up the calculations
The first step in a Milestoning study is to partition the space into compartments and define the milestones as the dividers between cells. A convenient way to set milestones is as boundaries of Voronoi cells.11 The Voronoi cells are defined in a space of coarse variables, which typically includes several degrees of freedom. A position in the coarse space is captured by the vector Q, with several variables, while the coordinate vector of the entire system, X, includes tens of thousands to a million particles. Frequently, the coarse variables are internal degrees of freedom such as distances, torsions, or angles. A Voronoi cell is defined by a single central configuration in coarse space, which we call an anchor.
A point in coarse space, Q, is in Voronoi cell i if the distance di = |Q − Qi| is smaller than the distances dj to the centers, Qj, of all other Voronoi cells. The anchors and the Voronoi cells are determined from a preliminary sparse sampling of the reaction space. The sparse sampling may use high temperature or steered molecular dynamics trajectories,26 a numerically computed reaction coordinate,27 or chemical intuition. The cells must separate the reactants and products and ideally chart a dense network of Milestoning transitions between cells. The edges of the network between the cells should be accessible to efficient and short Milestoning trajectories (Figure 2).
Figure 2.

A schematic representation of the reaction space as a function of two coarse variables, Q1 and Q2. The space is partitioned into Voronoi cells. At the center of each Voronoi cell we find an anchor (a blue filled circle). The milestones are the thin blue lines that bound the Voronoi cells. Also shown is a single production (red) trajectory that travels between the boundaries of a Voronoi cell. The green curve is a “seek” trajectory between an anchor and the boundary of the cell. The purple thin line is a sampling trajectory at the milestone, which provides initial conditions for production. See text for more details.
The number of milestones grows exponentially with the dimensionality of the coarse space. Models with more than two coarse variables are therefore difficult to examine exhaustively. Instead of an exhaustive enumeration of milestones in more than one dimension we choose to sample them. We consider the calculation converged if no new milestones were discovered in the last iteration, or when major observables (such as the MFPT, the Mean First Passage Time) do not change significantly upon the addition of milestones discovered by sampling. The procedure of sampling milestones in the coarse space is called “seek” and is described in more detail in section II.2 Running ScMiles2. ScMiles2 adds new milestones automatically. However, the addition of anchors must be done manually by the user.
Note that during a single ScMiles2 run we do not allow the addition or removal of anchors. The anchors are selected prior to initiating the ScMiles2 script. Changes in anchors are nevertheless desired when trajectories are found with exceptionally long termination times. A solution for long-lived trajectories is the addition of proximate anchors, generating nearby milestones, and reducing trajectory termination times.
Once contributing milestones are detected in the “seek” phase (see next section) we sample configurations at the milestones. From those sampled points we initiate unbiased trajectories until they cross other milestones for the first time. These trajectories are used to estimate K and t (Eq. (1–4)). In section II.2 we discuss the data structure and how the different steps are implemented in the Python script.
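The estimation of K and t from the unbiased trajectories amounts to counting. The sketch below works with the coarse (position-integrated) transition matrix and assumes that each trajectory is summarized by a record (initial milestone, terminating milestone, duration); the record format, the milestone labels of the form "1_2", and the function name are assumptions of this example and do not describe the internal data structures of ScMiles2.

```python
from collections import defaultdict


def estimate_K_and_t(records):
    """Estimate transition probabilities and lifetimes from trajectory records.

    records: iterable of (start_milestone, end_milestone, duration).
    Returns (K, t) with K[i][j] = n_ij / n_i and t[i] the mean duration
    of the trajectories initiated at milestone i.
    """
    counts = defaultdict(lambda: defaultdict(int))
    n_start = defaultdict(int)
    total_time = defaultdict(float)
    for start, end, duration in records:
        counts[start][end] += 1
        n_start[start] += 1
        total_time[start] += duration
    K = {i: {j: c / n_start[i] for j, c in row.items()}
         for i, row in counts.items()}
    t = {i: total_time[i] / n_start[i] for i in n_start}
    return K, t


records = [
    ("1_2", "2_3", 10.0),
    ("1_2", "2_3", 20.0),
    ("1_2", "0_1", 30.0),
    ("2_3", "1_2", 5.0),
]
K, t = estimate_K_and_t(records)
```

Because every trajectory carries the same weight, the estimate of a matrix element is just the fraction of trajectories from milestone i that terminate at milestone j.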
Critical components of a Milestoning calculation are the determination of the state of a trajectory and of crossing events. The trajectory is conducted in the full atomistic space and we need to map it to the reduced space of milestones to use the Milestoning theory. Let the coarse variable vector be Q and the positions of the N anchors be $Q_1, \ldots, Q_N$. The distances from the configuration to each of the anchors are $d_i = \lvert Q - Q_i \rvert$, $i = 1, \ldots, N$. If dj is the minimal distance of the set, we say that the configuration is in cell j. When we consider a trajectory in coarse space we are interested in cell changes, that is, in crossings of cell boundaries or milestones. If the last milestone that was crossed is between cells j and k we say that the trajectory is in state jk. Configurations of this trajectory state are in cell j or k. Assume that the trajectory is in cell k. A transition is detected when the distance dm to another anchor m becomes smaller than all other distances. Hence the trajectory transitioned to the Milestoning state km.
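The bookkeeping described in this paragraph can be sketched in a few lines. The example assumes a Euclidean distance in coarse space and represents a milestone state as a sorted pair of cell indices; both choices, as well as the function names, are assumptions of this illustration (periodic variables such as torsions would need a different distance).

```python
import math


def nearest_anchor(Q, anchors):
    """Index of the anchor closest to the coarse-space point Q."""
    return min(range(len(anchors)), key=lambda a: math.dist(Q, anchors[a]))


def update_state(milestone, cell, Q, anchors):
    """Advance the (milestone, cell) state of a trajectory by one configuration.

    milestone: sorted pair of cell indices of the last crossed milestone (or None).
    cell: index of the cell that currently contains the trajectory.
    """
    new_cell = nearest_anchor(Q, anchors)
    if new_cell != cell:
        # The boundary between the old and the new cell was crossed.
        milestone = tuple(sorted((cell, new_cell)))
    return milestone, new_cell


anchors = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
state = (None, 0)
visited = []
for Q in [(0.4, 0.0), (0.6, 0.0), (0.4, 0.0), (0.6, 0.0), (1.6, 0.0)]:
    state = update_state(state[0], state[1], Q, anchors)
    visited.append(state)
```

In the example the trajectory recrosses the boundary between cells 0 and 1 several times without changing its Milestoning state; the state changes only when the boundary between cells 1 and 2 is reached.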
II.2. Running ScMiles2
The directory structure and files needed to run ScMiles2 are similar to the version described in the previous paper15 although several additional options have been added and the format of some files has changed as discussed below.
In Figure 3 we show the directory structure. The python scripts are stored below the current working directory in a folder ScMiles2.9. The directory that the user prepares is my_project_input that includes several input files and a pdb directory. Compared to the previous version, the input files are extended to include GROMACS28 files and options. The input files list the anchors (anchors.txt), define the coarse variables (colvar.txt), and include sample files for different runs. The directory “pdb” includes coordinates of anchors written in Protein Data Bank format (PDB). There are two output directories: “my_project_output” and “crd”. The directory “my_project_output” includes text files that report useful functions, such as the K matrix (k.txt), the life times (life_time.txt), the committor function in the space of the milestones (committor.txt),23 and a new observable, the exit times (exit_time.txt).18 The exit times are defined as the times to leave the transition domain for the first time either to the reactant or to the product. Finally, the “crd” folder includes raw coordinate files which are the output of the Molecular Dynamics (MD) engine. They can be used for further analyses of the simulations in addition to the results reported in the “my_project_output” folder.
Figure 3:

The directory structure of ScMiles2. Initially, the folders ScMiles2.9 and my_project_input are created by the user. The folders my_project_output and crd are generated during the ScMiles2 run. The input files include instructions for the MD engine. See text for more details.
Figure 4 is a flow chart describing the progression of a Milestoning calculation. We start with a seek option in which trajectories initiated at the anchors are integrated as a function of time until they terminate on nearby milestones (for an illustration see the green trajectory in Figure 2). Once the trajectories terminate and milestones are identified, a second phase of sampling at the milestone starts (purple trajectory in Figure 2). This sampling is conducted by a fixed time molecular dynamics trajectory constrained to be at the milestone (typically for a nanosecond to ensure that the configurations are uncorrelated). Points are selected from the constrained trajectories as initial configurations for the next step of unbiased (free) trajectories. These unbiased trajectories are integrated until they hit for the first time a milestone different from the milestone they started on (red curve in Figure 2).
Figure 4:

Flowchart of the Milestoning algorithm as executed by the ScMiles2 script. See text for more details.
In the sampling phase of Milestoning we conduct an MD trajectory which is restrained to a milestone (e.g. milestone ij). The restraining to a milestone consists of two conditions. The first is an equality restraint. We sample from the set of points, Q, such that
| $\lvert Q - Q_i \rvert = \lvert Q - Q_j \rvert$ | (9) |
A harmonic restraint on the difference $d_i - d_j$ ensures that the sampled points are placed at equal distance from the anchors of cell i and cell j. In more than one dimension (e.g. Figure 2) this restraint is not sufficient, since the sampling may slide to nearby cells even if it remains equidistant from both anchors. We therefore add conditional restraints. If $d_k < d_i$ or $d_k < d_j$ for a third anchor k, we add a penalty on $d_i - d_k$ or $d_j - d_k$, respectively.29 This test applies to all anchors that are different from i and j. Therefore, this calculation can be expensive if the number of anchors is large. It can be improved in a future implementation by adding a list of neighboring milestones to each of the milestones, similar to the non-bonded lists used in MD simulations. We finally comment that the conceptual implementation of the restraints is similar in both GROMACS28 and NAMD.30 However, the software packages used to enforce the restraints are different. They are PLUMED31–32 and COLVAR,33 respectively.
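To make the two restraint conditions concrete, the sketch below evaluates a restraint energy for a point in coarse space. The functional forms are assumptions of this example: a quadratic term in the difference of the distances to anchors i and j, and quadratic penalties that are switched on only when a third anchor is closer than i or j. The force constants `k_eq` and `k_pen`, the Euclidean distance, and the function name are hypothetical; in practice the restraints are enforced by COLVAR or PLUMED.

```python
import math


def milestone_restraint(Q, anchors, i, j, k_eq=1.0, k_pen=1.0):
    """Restraint energy that keeps Q on milestone ij (assumed quadratic forms)."""
    d = [math.dist(Q, A) for A in anchors]
    # Equality restraint: equal distance to anchors i and j.
    energy = k_eq * (d[i] - d[j]) ** 2
    # Conditional penalties: no other anchor may be closer than i or j.
    for m, d_m in enumerate(d):
        if m in (i, j):
            continue
        if d_m < d[i]:
            energy += k_pen * (d[i] - d_m) ** 2
        if d_m < d[j]:
            energy += k_pen * (d[j] - d_m) ** 2
    return energy


anchors = [(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)]
on_milestone = milestone_restraint((1.0, 0.0), anchors, 0, 1)    # zero energy
slid_to_cell_2 = milestone_restraint((1.0, 2.5), anchors, 0, 1)  # penalized
```

The loop over all anchors reflects the cost noted above; a neighbor list per milestone would restrict it to nearby anchors.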
The unbiased trajectories are independent and may hit new milestones not discovered in the “seek” phase. In the past (and in the first version of ScMiles), the newly sampled milestones were reported but not acted upon. After the ScMiles run was completed, the user could prepare another ScMiles cycle with the new milestones included.
ScMiles2 automates the finding and sampling of new milestones. During the run of unbiased trajectories, newly detected boundaries between cells are added to the list of active milestones. When the runs of unbiased trajectories of the current set of milestones are completed, the script returns one step back and samples configurations in the newly discovered milestones. Next, the script computes unbiased trajectories from the latest milestones. This process of sampling initial conditions in the newly detected milestones and conducting unbiased trajectories from them is repeated until no new milestones are found, or until other convergence criteria are satisfied.
When no new milestones are discovered the set of milestones is self-consistent. However, this goal may be difficult to achieve if the coarse and full spaces are large. If the new milestones are sampled rarely, they may have negligible influence on the outcomes and may be ignored. This removal can be done in the analysis step. For example, if major observables such as the MFPT or the free energy are accurate up to a threshold when the new milestones are removed, then convergence may be assumed.
It is clear from the above discussion that a source of error in Milestoning is an incomplete sampling of the space of milestones. This problem is similar to an incomplete sampling of configuration space, which is always a concern in MD simulations. Similar to MD we use the convergence of observables to conclude the calculations. Another source of error is the sampling in the milestone. The exact distribution of crossing trajectories at the milestone is the first hitting point distribution (FHPD), which was discussed extensively in the Transition Path Theory (TPT) of E and Vanden-Eijnden.34 It was also examined in the context of Milestoning.35 In the first step of the Milestoning calculations we approximate the FHPD by equilibrium sampling. While the equilibrium distribution is an excellent approximation to the FHPD in many cases, in some cases it is not. As explained in the Introduction (Eq. 2–4) the FHPD (or $f_i(x_i)$) can be recovered by the use of iterations of the type $q^{(n+1)} = q^{(n)} K$. In the language of trajectories, the iterations mean collecting the phase space points that terminate at a milestone and using them to initiate new trajectories from the terminating milestones. In other words, the trajectories are continued until the next milestone. In the limit of a large number of iterations, we obtain exceptionally long individual trajectories. Note, however, that we are not interested in the calculation of long trajectories. Rather, we consider the convergence of the distribution at the milestone, q, which typically converges in about 10 iterations. In practice we use the value of the free energy or the MFPT to determine the convergence of the simulation.
In the Supplementary Information (SI) we provide and explain sample input and output files that are found in the folders “my_project_input” and “my_project_output”.
II.3. New Features
After the discussion in the previous section of the general operation of ScMiles2 we focus in this section on enhancements to the original script.
a. The state of the trajectory in Milestoning space and implementation in GROMACS
The state of a configuration in coarse space is determined by the anchor with the smallest distance to the current values of the coarse-grained variables. The state of a trajectory is given by the last milestone or a boundary between cells that it crossed. Determining the state of a trajectory is critical in Milestoning analysis. The efficient detection of crossing events depends on the tools available within an MD engine.
The NAMD30 implementation of Milestoning (the original ScMiles script36) exploits the COLVAR program33 and Tcl scripts37 to define the coarse variables and to determine crossings of milestones (see SI). Since Tcl scripting is not available in GROMACS,28 a similar implementation was not possible. Instead, we used PLUMED31–32 to define and control the coarse variables. PLUMED and COLVAR are both designed to handle many types of collective variables. Their formats are, however, different, and we added a new Python script, plumed.py, to use PLUMED in ScMiles.
Similar to the application of COLVAR in NAMD, the sampling is restrained to the milestone. The restraints are imposed by commands in the plumed.dat file. We evaluate the distances (in coarse space) between the current configuration of the trajectory and every anchor. In the NAMD/COLVAR version of ScMiles we use “if” statements to determine the nearest anchor. In GROMACS/PLUMED there are no “if” statements, and we therefore use step functions and the committor facility. To distinguish the committor function of kinetics from the committor facility of PLUMED we call the latter Pcommittor in the current text.
Let us consider an example. A trajectory is initiated at milestone ij. It remains in the ij milestone state as long as one of the distances, $d_i$ or $d_j$, is smaller than the distances to all other anchors. Consider another anchor k with a distance $d_k$. If $d_k$ is smaller than both $d_i$ and $d_j$ then the trajectory must have transitioned to cell k. Lacking an “if” statement, we determine a new crossing as follows. We consider the step function $\theta(y)$, which returns zero if the argument, y, is negative and one if it is positive. We evaluate the sum, S, of two step functions,
| $S = \theta(d_i - d_k) + \theta(d_j - d_k)$ | (10) |
The function S has three possible values, 0, 1, and 2. A transition happened only if it equals two.
The Pcommittor function, PC, of PLUMED is activated only at a value of two. When S = 2, a signal is sent to GROMACS to write the last set of coordinates and terminate the current trajectory. The Pcommittor test is performed at every step.
Note that at this point we do not know which milestone was crossed, only that a transition happened. It could have been ik or jk. The identity of the crossed milestone is determined only at postprocessing, when the trajectory ensemble is completed, using the stored coarse configurations of the termination points. For example, if $d_i < d_j$ at the termination point then the milestone that was crossed was ik. An example of an input file to the Pcommittor function is in the SI, which also provides more details on how we used it to detect transitions.
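The logic of the step-function test and of the postprocessing can be mimicked in plain Python. This is only an illustration of the arithmetic; in the actual runs the test is evaluated by PLUMED, and the function names below are hypothetical.

```python
def theta(y):
    """Step function: 0 for negative (or zero) argument, 1 for positive."""
    return 1 if y > 0 else 0


def crossing_indicator(d_i, d_j, d_k):
    """Sum of two step functions; a transition to cell k happened only if it equals 2."""
    return theta(d_i - d_k) + theta(d_j - d_k)


def crossed_milestone(i, j, k, d_i, d_j):
    """Postprocessing: identify the crossed milestone from the termination point.

    The trajectory was in state ij and ended in cell k; the closer of the
    anchors i and j marks the cell it left.
    """
    return (i, k) if d_i < d_j else (j, k)


# Anchor k is closer than j but not closer than i: no transition yet.
no_transition = crossing_indicator(d_i=1.0, d_j=2.0, d_k=1.5)  # 1
# Anchor k is closer than both: terminate the trajectory.
transition = crossing_indicator(d_i=1.0, d_j=2.0, d_k=0.5)     # 2
milestone = crossed_milestone(0, 1, 2, d_i=0.8, d_j=1.9)       # (0, 2)
```

With more than three anchors the same indicator is evaluated for every anchor k different from i and j.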
b. Restart the ScMiles calculation
ScMiles runs require significant computational resources. The exact Milestoning simulations of the small systems we tested here took several days on a small GPU cluster with 10 nodes. Using the same cluster for a classical Milestoning calculation of a system with 50,000 atoms, 200 anchors and 500 milestones took about a month. Given the time it usually takes to finish a Milestoning job, having a restart option in ScMiles that continues an interrupted job where it stopped is beneficial. Premature program termination can be a computer stability problem, or it can be initiated by the user to rescue some data from a problematic run.
The restart option (turned on with the option “restart on” in input.txt) first reads the log file to determine as accurately as possible what the script was doing before it stopped. For example, in exact Milestoning the script determines the iteration that was running when it stopped.
c. Support for Milestoning calculations with three or more coarse variables
The previous version of ScMiles only supported the use of two coarse variables. In the present version, there are no limits on the number of coarse variables that can be used.
d. Support for complex custom functions
Complex coarse variables that are a combination of several collective variables are supported by the current version of ScMiles.
e. A more general handling of periodic coarse variables with NAMD
In the previous version of ScMiles, periodic coarse variables only worked for one-dimensional cases and when a milestone was located at the boundary of the periodic system. These restrictions were removed for the NAMD/COLVAR version, but they are still present in the GROMACS/PLUMED version.
f. Post analysis calculations of ScMiles results
At the end of a ScMiles computation the final results (transition and milestone probabilities, MFPT, free energy, committor, and exit times) are reported in the my_project_output folder. For exact Milestoning a similar set of results are reported for each iteration. For complex systems, with hundreds of anchors and thousands of milestones, there will be milestones that are visited rarely, having a low population probability, with associated observables with significant statistical errors. These milestones may be on a bottleneck of the reaction and therefore important (e.g. a transition state). Improved sampling of these milestones by adding trajectories at specific spots is therefore desired. In other cases, rarely sampled milestones, which are on the rim of the Milestoning network, and do not influence the core of the reaction will have little impact on observables such as the MFPT and can be removed from the list.
To assist in the study of the Milestoning network, a set of analysis tools (further_analysis.py) was added to ScMiles2. The user can explore how changes of the transition matrix (removing certain milestones or transitions) can change the results. The analysis is aimed at better understanding network bottlenecks, identifying critical milestones, and making it possible to focus new simulations on poorly sampled components. The fluxes through the different milestones in exact Milestoning depend on each other (the fluxes are independent in classical Milestoning), and therefore the analysis offered is qualitative. The analysis is done by running the command python3 ScMiles2.9/further_analysis.py at the ScMiles directory containing my_project_input, my_project_output, and crd folders. An additional input file named analysis.txt needs to be added to my_project_input. The file accepts these options:
source scmiles the data come from a ScMiles calculation.
ignore_milestones (for example ignore_milestones 3_5,7_9): use it if there are certain set of milestones that you want to ignore and explore the impact of their removal on the observables. The milestones will be removed from the transition matrix and the life time vector.
ignore_transitions (for example ignore_transitions 1_2-2_5,5_6-5_8): use it if there are certain milestone transitions that you wish to ignore. In the example the entries for the transitions between milestones 1_2 and 2_5 and milestones 5_6 and 5_8 will be made zero in the transition matrix. The corresponding elements for the transition time matrix are make zero as well. Elements of the transition time matrix, tij, are the average times of trajectories between pairs of milestones (e.g., a trajectory initiated at milestone i and terminating on milestone j). They are needed only in the calculations of the exit times.18 For free energy and MFPT calculations only the life time ti is needed.
iteration (for example iteration 5): select a specific exact Milestoning iteration for the analysis. In the example, the analysis is performed only for iteration 5 of an exact Milestoning calculation.
k_min_sum (for example k_min_sum 2): We first construct an unnormalized transition matrix. Every entry is the number of Milestoning trajectories that were sampled for a particular transition. If the number of trajectories that terminate at a milestone (sum) is less than a threshold (in the example it is 2) then the milestone and its lifetime are removed. This option helps in detecting poorly sampled milestones and the impact of their removal on observables.
max_lifetime (for example max_lifetime 20000): on rare occasions a few trajectories get trapped in a milestone state for a long time and do not terminate. This observation suggests that more anchors are needed in that region. If the option is set, the analysis will only consider trajectories with a lifetime smaller than max_lifetime. The idea is to explore the rest of the milestone network before fixing the poor sampling of the domain with long trajectories. The unit of max_lifetime is the femtosecond.
MS_list (for example MS_list 3_4,9_11, 13_17): this option does the opposite of the ignore_milestones option. It analyzes the subnetwork of the milestones in the MS_list. A reactant and product must be defined. If the original reactant or product is not included in the new list, then a new reactant and product should be provided.
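The effect of options such as ignore_milestones and k_min_sum can be illustrated with a small Python sketch. It is not the code of further_analysis.py; the function names, the toy count matrix, and the reading of "trajectories that terminate at a milestone" as a column sum of the count matrix are assumptions made for illustration only.

```python
import numpy as np

def prune_milestones(counts, lifetimes, labels, ignore=(), k_min_sum=0):
    """Remove ignored or poorly sampled milestones (hypothetical helper).

    counts[i, j] is the number of trajectories started at milestone i
    that terminated at milestone j (unnormalized transition matrix).
    """
    counts = np.asarray(counts, dtype=float)
    lifetimes = np.asarray(lifetimes, dtype=float)
    # Assumption: arrivals at milestone j are the sum of column j.
    arrivals = counts.sum(axis=0)
    keep = [i for i, name in enumerate(labels)
            if name not in ignore and arrivals[i] >= k_min_sum]
    sub = counts[np.ix_(keep, keep)]
    return sub, lifetimes[keep], [labels[i] for i in keep]

def row_normalize(counts):
    """Turn a count matrix into transition probabilities, row by row."""
    totals = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)

# Toy data: four milestones, the last one reached by a single trajectory.
labels = ["1_2", "2_3", "3_4", "3_5"]
counts = [[0, 10, 0, 0],
          [5, 0, 5, 1],
          [0, 8, 0, 0],
          [0, 1, 0, 0]]
lifetimes = [120.0, 95.0, 110.0, 400.0]  # fs, illustrative values only

sub_counts, sub_lifetimes, kept_labels = prune_milestones(
    counts, lifetimes, labels, k_min_sum=2)
K_pruned = row_normalize(sub_counts)
print(kept_labels)
print(K_pruned)
```

Comparing an observable computed from the pruned matrix with the same observable computed from the full matrix gives a qualitative measure of how much the removed milestones matter.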
g. Milestoning analysis for a long trajectory
The further_analysis.py tool can also be used to perform a Milestoning analysis of a long Molecular Dynamics trajectory. The analysis chops the long trajectory into fragments between milestones and conducts the analysis assuming the fragments are from a Milestoning run. Running long MD trajectories is common practice in the field. The analysis of these trajectories and the extraction of kinetic and equilibrium observables such as the MFPT, free energy, and the iso-committor surfaces can be, however, non-trivial. Here we offer analysis tools that use Milestoning partitions in coarse space (which are provided by the user) to detect transitions between milestones and analyze the trajectory using the Milestoning theory.17 To perform the analysis we need to obtain a file from the MD run that gives the time changes of the coarse variables to be used for the Milestoning description. This could be the colvars.traj output file obtained with COLVAR or PLUMED. To capture transitions between milestones more precisely, the collective variable data should be saved often during the MD run.
An alternative to using a collective variable file of the type COLVAR and PLUMED generate, is to use a path_history.dat file that contains information about every time the system gets closer to a different anchor. The file can be generated by the user using the colvars.traj and anchors.txt data. An advantage of using this file compared to the original colvars.traj file is that it will usually be a lot smaller. An additional advantage is that during the creation of this file the user can take into account the periodicity of the coarse variables.
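A minimal Python sketch of how a user might build such a history from a time series of coarse variables and a list of anchors is given below. The actual layout of path_history.dat is not specified here, so the in-memory format (time, cell index), the function names, the zero-based anchor indices, and the 360° period are all assumptions for illustration.

```python
import numpy as np

def periodic_delta(a, b, period=360.0):
    """Signed difference a - b wrapped into [-period/2, period/2)."""
    return (a - b + period / 2.0) % period - period / 2.0

def nearest_anchor(point, anchors, period=360.0):
    """Index of the anchor closest to point, using periodic distances."""
    d = periodic_delta(anchors, point, period)
    return int(np.argmin((d ** 2).sum(axis=1)))

def path_history(times, colvars, anchors, period=360.0):
    """Record (time, cell) every time the closest anchor changes."""
    history, current = [], None
    for t, x in zip(times, colvars):
        cell = nearest_anchor(np.asarray(x, dtype=float), anchors, period)
        if cell != current:
            history.append((t, cell))
            current = cell
    return history

def milestone_hits(history):
    """A change from cell a to cell b is a crossing of milestone (a, b)."""
    return [(t, tuple(sorted((prev, cur))))
            for (_, prev), (t, cur) in zip(history, history[1:])]

# Toy example with two dihedral-like coarse variables (degrees).
anchors = np.array([[-60.0, -60.0], [60.0, -60.0], [170.0, -60.0]])
times = [0, 1, 2, 3, 4]
colvars = [(-70, -60), (-10, -60), (20, -60), (130, -60), (-175, -60)]

history = path_history(times, colvars, anchors)
hits = milestone_hits(history)
print(history)  # the last frame stays in cell 2 because of periodicity
print(hits)
```

In the last frame the value −175° is only 15° away from the anchor at 170° once periodicity is taken into account, so no spurious transition is recorded.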
III. Results
III.1. Alanine dipeptide in vacuum using NAMD
In the first version of ScMiles (reference 15) we presented simulation results for alanine dipeptide in vacuum, adding barriers at the edges of the free energy surface as a function of the coarse variables. The barriers remove periodic transitions that are of lesser interest. Here we expand on these simulations by probing two characteristics of the simulation in more detail than in the first paper: (i) the initiation of enrichment trajectories and (ii) the averaging of Milestoning functions during iterations.
In Figure 5 we show the position of the anchors on the free energy landscape. The reactant and the product are marked with R and P respectively.
Figure 5:

Contour representation of the free energy landscape of alanine dipeptide in vacuum using the NAMD software and the CHARMM22 forcefield. The anchors for the Milestoning are shown with red circles and the positions of the Reactant (R) and the Product (P) are marked. The free energy is constructed from a single long trajectory. See text for more details.
The convergence of the MFPT as a function of the number of iterations in exact Milestoning is examined in Figure 6. The conditions of the simulation are the same as in reference 15. A 1 μs equilibrium trajectory was computed with a time step of one femtosecond. The trajectory was run at 600 K to make it possible to overcome the significant barrier at ϕ≅0. Since the system is small, ergodicity is an issue and we use Langevin dynamics to overcome this problem. The coupling coefficient was 5 ps−1. The trajectory moves back and forth between the reactant and product. The transitions are used to obtain an exact MFPT at equilibrium. The error bars of the MFPT, computed with the long trajectory, are small and are not shown.
Figure 6:

The MFPT for the conformational transition of alanine dipeptide in vacuum as a function of the number of iterations. The transition is from the upper left to the lower right minima of the free energy landscape (Figure 5). Top panel: the MFPT in each iteration when the enhancement is done with force enrichment. The black line displays the result when only data from the current iteration are used to construct the transition matrix, while for the red line the MFPT is computed using data from the current and the previous iteration. The green line (in both panels) is the MFPT obtained from a long MD trajectory. Using data from the previous iteration in addition to the current one provides better statistical properties and convergence for Milestoning. Bottom panel: the blue and orange lines use velocity enrichment with a friction coupling coefficient of 5 and 50 ps−1, respectively. Comparing both panels, it is clear that force enrichment converges to the exact result (green line) faster than velocity enrichment. For velocity enrichment, convergence to the exact result is better in the case of overdamped dynamics.
In each of the Milestoning simulations the force constant for the Milestoning restraints was 1 kcal/mol×deg−2. We also added half harmonic restraints at the boundaries of the (ϕ, ψ) map (±175°) to focus the transitions to the map center.
Below we illustrate numerically the velocity and force enrichment (see Introduction) and the impact of transition matrix averaging (Eq. (8)). In Figure 6 we show the MFPT calculation using several approaches. In the top panel we examine force enrichment. The green line is the reference MFPT from a long trajectory. The black line is a force enrichment calculation in which only data from the last iteration are used to generate the transition matrix. The red line represents calculations that used the last two iterations to get the transition matrix. The lower panel examines velocity enrichment for underdamped and overdamped cases. The blue curve is the MFPT for underdamped dynamics, and the orange curve is for overdamped dynamics. The fluctuations of the value and the error bars of the MFPT computed with underdamped velocity enrichment are high. In contrast, the curve of force enrichment converges much faster to the correct answer. Interestingly, the averaging of the transition matrix in conjunction with force enrichment (red curve) shows a much better relaxation rate and is the best approach for this system.
III.2. Alanine dipeptide in water using GROMACS
To illustrate the application of ScMiles2 with GROMACS we consider alanine dipeptide in a box of water. The alanine dipeptide molecule was solvated with 665 TIP3 water molecules38 and the force field was CHARMM36.39 The leap-frog stochastic dynamics integrator implemented in GROMACS was used to compute the trajectory with a temperature coupling of 2 ps and a temperature of 300 K. A distance cutoff of 12 Å was used for the non-covalent interactions. Particle Mesh Ewald was used to evaluate the long-range electrostatics. A time step of 0.1 fs was used for sampling and 1.0 fs for the free trajectories during Milestoning calculations. Bonds involving hydrogen atoms were constrained with the LINCS algorithm. Twenty-two anchors were used to cover most of the lower energy regions and possible transition pathways (Figure 7). The correct evaluation of periodicity of angular coarse variables is not included in the GROMACS implementation of ScMiles. Therefore, we added an additional harmonic wall constraint at the boundaries of 5.8 kcal/mol × deg−2 at ±166° for both the phi and psi dihedral angles. The force constant for the sampling constraints was 0.73 kcal/mol × deg−2. In GROMACS these additional constraints were added using PLUMED.31–32
Figure 7.

Free energy contour plot for alanine dipeptide in water computed as a function of the dihedral angles ϕ and ψ. The free energy is calculated with a 1 μs long trajectory using the same parameters as for the Milestoning run. The positions of the anchors are shown with red circles and the corresponding milestones are shown with black lines. The reactant (R) and product (P) milestones are indicated as well.
We computed the transition between the reactant and product states shown in Figure 7 and consider another computational scenario. The total cost of a Milestoning calculation is proportional to the number of force evaluations or, for fixed positions of anchors and milestones, to the number of Milestoning trajectories. The fewer trajectories needed to reach the desired accuracy, the more efficient the calculation. Therefore, there is a tradeoff between the number of trajectories that we run in a milestone and the number of iterations that is required for convergence. We assume convergence when the MFPT is within 5% of the result of the long trajectory (~170 ns).
In Figure 8 we show that the total number of trajectories used in the entire Milestoning run, with 5% relative accuracy, increases monotonically as a function of the number of trajectories per milestone and per iteration. The transition matrix was averaged over the last five iterations (Eq. (8)). Hence, according to this example it is better to use a small number of trajectories per iteration (~100) and iterate the system tens of times to approach the correct solution.
Figure 8.

The number of iterations (red plot) and the total number of free trajectories (blue plot) that are needed to have a relative error of 5% or less with respect to the MFPT of a long trajectory. The Milestoning MFPT results are assumed converged if they deviate by less than 5% in three successive iterations.
The choice of reactant and/or product state for the Milestoning calculation can be changed after the trajectory computations using the further_analysis tool of ScMiles. For example, we change the position of the reactant state to a milestone close to the location of the metastable state at ϕ = 60° and ψ = −150° (Figures 7 and 9). With the change of the reactant milestone the transition has to overcome the barrier at ϕ = 0° to reach the product state, and the MFPT is (7.6 ± 0.5) × 10^2 ns (Figure 9, top). Two possible pathways can pass through the barrier. With ScMiles we can ignore specific milestones in the calculation to investigate the two paths in detail. Blocking milestones to prevent passing of the barrier at negative ψ values shows a fast transition (Figure 9, middle), while the transition is much slower when we block the upper passage (Figure 9, bottom). In the latter case the pathway crosses the metastable state at ϕ = −60° and ψ = 0°, spending time in that basin.
Figure 9:


(Top): Milestone free energies and MFPT for transition between the reactant (R) milestone between anchors 11 and 12, and the product (P) milestone between anchors 9 and 10. The free energy of the milestones is color-coded in the lines. A black color milestone represents a non-active milestone. The two possible pathways are depicted. (Middle): Results obtained when milestones 11-21, 12-21, and 13-21 (marked with grey color and X symbols) are ignored in the Milestoning calculations. (Bottom): Results when milestones 17-22, 18-22, 19-22 and 20-22 are ignored.
III.3. Adenosine solvated in water
The third example we considered (using NAMD) is of a solvated adenosine.
The system was prepared with CHARMM-GUI.40 It contained one adenosine molecule surrounded by 456 water molecules. We use the CHARMM36 force field for the adenosine and the TIP3P model for the water molecules. The temperature was kept at 300K using Langevin dynamics with a coupling coefficient of 1 ps−1. For the free trajectories a time step of 1 fs was used. The time step was decreased to 0.1 fs for the sampling stage (for 400 ps). The cutoff for nonbonded interactions was 12 Å and PME was used for the treatment of long-range electrostatics. The SETTLE algorithm was used to keep the water molecules rigid. We used two different coarse variables to describe the dynamics of the system. These variables have been used in the past by some of us41 and others42 to describe the sugar ring puckering kinetics and thermodynamics. One is the pseudo-rotation phase angle p defined by
| (11) |
where the torsions in the set are, in order, the endocyclic torsions C1’-C2’-C3’-C4’, C2’-C3’-C4’-O4’, C3’-C4’-O4’-C1’, C4’-O4’-C1’-C2’, and O4’-C1’-C2’-C3’ (Figure 10). The second coarse variable is the glycosyl torsion χ (O4’-C1’-N9-C4).
Figure 10:

The adenosine configurations that are used as reactant (anchor 15) and product (anchor 6).
To determine anchors, we first run a 200 ns trajectory and select 20 conformations. The Voronoi cell of anchor 15 is defined as the reactant (i.e., all the milestones that surround anchor 15 are used as the reactant state) and the cell centered at anchor 6 is the product state. The transition corresponds to a larger motion in the glycosyl torsion χ. The force constant for the harmonic and half harmonic potentials used for the sampling step in Milestoning was 1 kcal/mol × deg−2. We applied periodic boundary conditions for both p (at −90° and 90°) and the angle χ (at −180° and 180°).
Figure 11 shows the MFPT for the transition from reactant to product for adenosine computed with exact Milestoning. We computed 200 free trajectories per iteration. The first iteration is inaccurate. Therefore, the average of the transition matrix converges more slowly if the initial iterations are included in the averaging window (k = 1, see Eq. 8). The plot illustrates that the results of exact Milestoning tend to converge faster when the transition matrices of the first iterations are discarded.
Figure 11:

The MFPT for the conformational transition of adenosine as a function of the number of Milestoning iterations. The black line shows results in which all the transition matrices from previous iterations are averaged (k = 1 in Eq. 8). The blue line shows the results where only the 5 previous iterations are averaged to evaluate the MFPT (k = i - 5 in Eq. 8). The red dashed line is the average MFPT extracted from two 200-ns long trajectories.
Figure 12 (top) compares the free energy results obtained with exact Milestoning (milestone lines, color-coded) and the free energy obtained from the long trajectory (color contours). We used the results at iteration 20 (blue line in Figure 11) to compute the free energy (Eq. 7) for Milestoning. For the low free energy regions, the Milestoning free energies are close to the exact results. Note that the Milestoning free energies are averages over the two Voronoi cells that border the milestone of interest. The averaging over the cells reduces the deviation between the free energies of different milestones.
Figure 12:


(Top): Comparison of the free energy obtained with Milestoning (colored milestone lines) and the results from two 200-ns long MD trajectories (contour lines). The reactant and product anchors are indicated on the map by R and P respectively. (Middle): The committor values at the milestones are color coded and are shown at the milestone lines. (Bottom): the exit times (the time it takes for the system to go to the reactant or product state) are color coded in each milestone.
The committor values (different from the Pcommittor function of PLUMED) are available in Milestoning23 and are computed by the ScMiles2 script. The middle panel of Figure 12 shows the committor values for every milestone. The committor of a milestone is the probability that a trajectory started at that milestone hits the product (anchor 6) before reaching the milestones of the reactant (anchor 15).
Finally, the bottom panel of Figure 12 shows the exit time for each milestone. The exit time is the time it takes the system starting at each milestone to reach the reactant or the product state.
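The committor at the milestones can be obtained from the transition matrix by solving a linear system. The Python sketch below uses the textbook boundary-value formulation (committor 0 at reactant milestones, 1 at product milestones, and C_i = Σ_j K_ij C_j elsewhere). It is an illustration on a toy matrix, not the implementation used in ScMiles2, and the function name is hypothetical.

```python
import numpy as np

def committor(K, reactant, product):
    """Solve C_i = sum_j K_ij C_j with C = 0 at reactant and C = 1 at product."""
    K = np.asarray(K, dtype=float)
    n = K.shape[0]
    A = np.eye(n) - K
    b = np.zeros(n)
    # Replace the rows of boundary milestones by the boundary conditions.
    for i in reactant:
        A[i, :] = 0.0
        A[i, i] = 1.0
        b[i] = 0.0
    for i in product:
        A[i, :] = 0.0
        A[i, i] = 1.0
        b[i] = 1.0
    return np.linalg.solve(A, b)

# Toy network: four milestones on a line with symmetric hops in the interior.
K = [[0.0, 1.0, 0.0, 0.0],
     [0.5, 0.0, 0.5, 0.0],
     [0.0, 0.5, 0.0, 0.5],
     [0.0, 0.0, 1.0, 0.0]]
C = committor(K, reactant=[0], product=[3])
print(C)  # increases linearly from 0 at the reactant to 1 at the product
```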
III.4. Translocation of NAF-1⁴⁴⁻⁶⁷ peptide through a DOPC membrane
NAF-1⁴⁴⁻⁶⁷ is a peptide which is a fragment of the protein NAF-1/CISD2 that resides in the mitochondria and endoplasmic reticulum membranes.43 NAF-1⁴⁴⁻⁶⁷ (FLGVLALLGYLAVRPFLPKKKQQK) has promising therapeutic benefits, selectively permeating and killing cancer cells while not permeating normal cells. Recently, we have used classical Milestoning to study the permeation of the peptide through a DOPC membrane. Here, we consider how changes in the number of trajectories run in each milestone and an additional iteration affect observables of the translocation of the peptide from the center of the phospholipid membrane to the aqueous solution. This is a large system with 54794 atoms and the overall time scale exceeds seconds. Therefore, the calculation of the kinetics is challenging.
The details of the simulation parameters, the generation of anchors for this system and selection of coarse variables are described in a previous paper.44 Three coarse variables were selected that specify the position of residues F1, R14, and K24 along a coordinate perpendicular to the membrane center (see left panel of Figure 13). The total number of anchors accounting for the motion of the peptide through the membrane is 191. At the first iteration of Milestoning we run 50, 100 and 200 free trajectories and analyze the impact of these variations on the free energy and MFPT. In the right panel of Figure 13 we show a free energy profile along a one-dimensional MaxFlux pathway, defined on the Milestoning network.45 The MaxFlux pathway carries the maximum number of trajectories from reactant to product across the Milestoning network.
Figure 13:

(Left): Molecular representation of NAF-1⁴⁴⁻⁶⁷ inside a DOPC membrane. The three residues used as coarse variables for the Milestoning calculation are labeled. The membrane center (used as a reference to determine the position of these three variables in the system) is represented with a dashed line. (Right): The free energy profile along a MaxFlux path for the exit of NAF-1⁴⁴⁻⁶⁷ from the membrane center (at milestone index 0) to the upper water layer (milestone index 47). The free energy for exact Milestoning with two iterations is shown in the black line profile. The results for classical Milestoning for 50, 100 and 200 unconstrained trajectories are shown as bars representing their deviations from the black line. The exact Milestoning plot is a second iteration that starts from the classical Milestoning results with 200 free trajectories.
The results are qualitatively similar in all cases, presenting a barrier for the exit of the peptide from the membrane. This observation is encouraging and illustrates the stability of the Milestoning algorithm under different setups. However, the magnitude of the barrier varies between the simplest and the most elaborate calculations. This shows that for this system running 50 trajectories was not enough to capture the correct results. The smaller barrier for the 50-trajectory case makes the MFPT for exit to the water layer a fraction of a second (MFPT = 0.8 ± 0.2 s) while for the other cases the MFPT is larger (44 ± 5 s, 47 ± 4 s and 30 ± 4 s for the 100, 200, and the second iteration of exact Milestoning, respectively). Results for the 100 and 200 free trajectory cases with classical Milestoning and exact Milestoning with a second iteration are close, and the differences are typical for kinetic calculations.
We also estimate the number of unique transitions between milestones for the different cases (the number of non-zero elements of the transition matrix K). For 50 trajectories the number of unique transitions was 2576. This number increases to 2742 (100 trajectories), 2858 (200 trajectories) and 2907 (for the exact Milestoning run with two iterations). Clearly, more trajectories enrich the connectivity of the transition network. However, conducting 100 free trajectories or more does not produce substantial changes in the overall kinetics and thermodynamics of this system.
IV. Discussion
Practical considerations for Milestoning runs
The ScMiles2 python script was designed to make it easier to launch the large number of trajectories needed in a Milestoning calculation. Still, the practical implementation of a ScMiles2 calculation requires user decisions that influence the efficiency and the accuracy of the calculation. In the Discussion we consider some of the bottlenecks of Milestoning.
IV.1. Generation of anchors
Determination of anchors is the first step in any Milestoning calculation and it therefore deserves careful consideration. We run long MD simulations to generate the anchors for the alanine dipeptide and adenosine systems. In more complex cases, a straightforward trajectory may be insufficient to sample an unbroken Milestoning network between the reactant and the product. Instead, we have used reaction paths,27 a high temperature MD trajectory46 and an SMD trajectory44 to generate anchors. For example, we pulled a peptide across a bilayer membrane to generate anchors for peptide permeation. Any of these methods generates anchor candidates rapidly. However, care must be used to ensure that the choice of the pulling variables (for example) is not biasing the Milestoning calculations. Milestoning is conducted in practice in the neighborhood of the anchors and is inefficient if sampling far from the anchor positions is required. In our applications we have used up to hundreds of anchors. Once an exact Milestoning calculation is finished and the need for more anchors is apparent, we have to restart the Milestoning calculation with the additional anchors from the beginning (the seek step).
Optimal distances between the anchors are system dependent. For coarse variables associated with Cartesian displacements we have used distances of 0.5-2.5 Å, and for torsional angles distances of tens of degrees or less. One way to determine early on if the separation between the anchors is too large is to use the seek step. A seek trajectory has an upper bound on its time length (we used 500 ps for the simulations of alanine dipeptide in water). If many seek trajectories finish before hitting a milestone, then the distances between the milestones are too large and more anchors are needed. Work in progress aims to determine an optimal placement of milestones in a given coarse space by searching for dynamics that is close to free diffusion. However, no such option is implemented in the present code.
IV.2. The choice of coarse variables
The choice of coarse variables has been considered extensively in the past, not only for Milestoning but for many other applications. Even with advances in statistical learning it is still an open problem.47–48 From a Milestoning perspective, a one-dimensional or a small coarse space will have fewer milestones to consider and is more efficient. In the past we showed that if the milestones are iso-committor surfaces (one-dimensional reaction coordinate) then a single iteration provides the exact answer for the MFPT.35 An intriguing approach to determine the committor function from general force was proposed recently.49 However, the space orthogonal to the one-dimensional reaction coordinate can be exceptionally large and difficult to sample. Identifying the committor surfaces remains a non-trivial task and we therefore use multiple approaches to determine the reaction space.
From a practical point of view, ScMiles can use any coarse variable accessible in COLVAR or PLUMED and any number of coarse variables. If the coarse space is highly confined (e.g., forming a tunnel between the reactant and product), then more variables can be used to describe the shape of the tunnel. In other words, if the reaction space is of low dimension, the coarse variables can be represented as a combination of many internal degrees of freedom (e.g. the p variable in the adenosine example of this paper). With ScMiles2, complex coarse variables (functions of several individual collective variables) can be used.
The calculations can help identify problems in milestone placement and selection of coarse variables. For example, if the value of the MFPT is infinite, it is an indicator that the network from the reactant to the product is broken. The network needs to be examined and additional sampling or perhaps additional anchors should be considered. The Milestoning calculation can also hint if the choice of coarse variables is poor. Unbiased trajectories may not transition to nearby milestones. Lack of transitions at short distances in coarse space suggests that hidden coarse variables that contribute to slow kinetics are present. We have no simple solution in this case. We recommend further exploration of the complete space and analyzing atomically detailed trajectories to identify the missing coordinates.
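A broken network can be detected before the MFPT is computed by checking whether the product is reachable from the reactant through non-zero elements of the transition matrix. The Python sketch below uses a breadth-first search on a toy matrix; the function name and the data are hypothetical and are not part of ScMiles2.

```python
from collections import deque

def reachable(K, start):
    """Set of milestones reachable from start via non-zero transitions."""
    n = len(K)
    seen = {start}
    queue = deque([start])
    while queue:
        i = queue.popleft()
        for j in range(n):
            if K[i][j] > 0 and j not in seen:
                seen.add(j)
                queue.append(j)
    return seen

# Toy network: milestone 3 (product) is never reached from milestone 0.
K_broken = [[0.0, 1.0, 0.0, 0.0],
            [0.5, 0.0, 0.5, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0]]
# After extra sampling, transitions from milestone 2 to 3 are observed.
K_fixed = [[0.0, 1.0, 0.0, 0.0],
           [0.5, 0.0, 0.5, 0.0],
           [0.0, 0.5, 0.0, 0.5],
           [0.0, 0.0, 1.0, 0.0]]

reactant, product = 0, 3
broken_ok = product in reachable(K_broken, reactant)
fixed_ok = product in reachable(K_fixed, reactant)
print(broken_ok, fixed_ok)
```

The milestones missing from the reachable set indicate where additional sampling or additional anchors are needed.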
IV.3. Exact and classical Milestoning
ScMiles2 is a script that focuses on exact Milestoning. The difference between classical and exact Milestoning is the number of iterations. In classical Milestoning we use only one iteration. In exact Milestoning we conduct several to several tens of iterations at each milestone until convergence is reached. Cost is, of course, an issue. If the system is relatively small (less than a few thousand atoms) then many iterations are feasible and are the recommended approach. In larger systems we use a smaller number of iterations and check the convergence of observables such as the MFPT. Velocity decorrelation times in large systems rarely exceed a few picoseconds.50 Thermal equilibration and ergodicity are therefore reached more quickly in large than in small systems. It is therefore no surprise that the initial guess for the flux q of thermal equilibrium is found adequate in many biological applications.
We comment on a computational trick that speeds up the convergence in exact Milestoning: the use of an average transition matrix instead of the value from the last iteration. In our experience, the averaging significantly reduces the statistical noise of the calculations and does not negatively impact the rate of convergence to the right answer. An interesting analysis examines the cost (or the number of Milestoning trajectories) as a function of the number of iterations and the number of trajectories at the milestone. A low-cost setup to reach an MFPT within a given error tolerance of the exact value is to use a small number of trajectories per milestone, and to average the transition matrix over multiple iterations. Running more trajectories at a milestone has a smaller effect than increasing the number of iterations.
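The averaging of the transition matrix over a window of iterations can be sketched in a few lines of Python. The sketch assumes that the average is a plain arithmetic mean of the per-iteration matrices over the most recent iterations, and it computes the MFPT from the standard first-passage equations τ_i = t_i + Σ_j K_ij τ_j with τ = 0 at the product. The function names and the toy matrices are hypothetical; this is not the ScMiles2 implementation.

```python
import numpy as np

def averaged_matrix(matrices, window=None):
    """Arithmetic mean of the last `window` matrices (all if window is None)."""
    mats = np.asarray(matrices, dtype=float)
    if window is not None:
        mats = mats[-window:]
    return mats.mean(axis=0)

def mfpt(K, lifetimes, reactant, product):
    """Mean first passage time from reactant to product milestone."""
    K = np.array(K, dtype=float)
    t = np.array(lifetimes, dtype=float)
    A = np.eye(K.shape[0]) - K
    # Absorbing boundary at the product: tau[product] = 0.
    A[product, :] = 0.0
    A[product, product] = 1.0
    t[product] = 0.0
    tau = np.linalg.solve(A, t)
    return tau[reactant]

# Toy transition matrices from three successive iterations.
K1 = [[0.0, 1.0, 0.0], [0.9, 0.0, 0.1], [0.0, 1.0, 0.0]]
K2 = [[0.0, 1.0, 0.0], [0.6, 0.0, 0.4], [0.0, 1.0, 0.0]]
K3 = [[0.0, 1.0, 0.0], [0.4, 0.0, 0.6], [0.0, 1.0, 0.0]]
lifetimes = [1.0, 1.0, 1.0]  # arbitrary time units

K_last_two = averaged_matrix([K1, K2, K3], window=2)
K_all = averaged_matrix([K1, K2, K3])
mfpt_last_two = mfpt(K_last_two, lifetimes, reactant=0, product=2)
mfpt_all = mfpt(K_all, lifetimes, reactant=0, product=2)
print(mfpt_last_two, mfpt_all)
```

In this toy case the inaccurate first iteration raises the MFPT estimate when it is included in the average, which is the reason for discarding early iterations from the window.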
Another algorithmic consideration reported in the present manuscript is how to enrich the statistics of trajectories at milestones that are poorly sampled. We enrich the number of trajectories by either sampling new velocities from the canonical distribution, or by sampling new values for the forces in Langevin dynamics. For underdamped dynamics, force enrichment works significantly better than velocity enrichment.
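Velocity enrichment, in its simplest form, draws several independent velocity sets from the Maxwell-Boltzmann distribution for the same starting configuration. The Python sketch below shows only this sampling step, with velocities in Å/ps, masses in amu, and energies in kcal/mol; in practice the MD engine performs this step, and the function names and parameters here are hypothetical. Force enrichment, which changes the random forces of the Langevin integrator, is not shown.

```python
import numpy as np

KB = 0.0019872041            # Boltzmann constant, kcal/(mol K)
KCAL_TO_AMU_A2_PS2 = 418.4   # 1 kcal/mol in amu * Angstrom^2 / ps^2

def sample_velocities(masses, temperature, rng):
    """Draw velocities (Angstrom/ps) from the Maxwell-Boltzmann distribution."""
    masses = np.asarray(masses, dtype=float)
    # Each Cartesian component is Gaussian with variance kB*T/m.
    sigma = np.sqrt(KB * temperature * KCAL_TO_AMU_A2_PS2 / masses)
    return rng.normal(size=(masses.size, 3)) * sigma[:, None]

def kinetic_temperature(masses, velocities):
    """Instantaneous temperature (K) from the kinetic energy."""
    masses = np.asarray(masses, dtype=float)
    twice_ke = (masses[:, None] * velocities ** 2).sum()
    return twice_ke / (3 * masses.size * KB * KCAL_TO_AMU_A2_PS2)

rng = np.random.default_rng(0)
masses = np.full(20000, 12.011)  # toy system of carbon-like particles

# Three enrichment trajectories would start from the same coordinates
# with three independent velocity sets.
velocity_sets = [sample_velocities(masses, 300.0, rng) for _ in range(3)]
t_est = kinetic_temperature(masses, velocity_sets[0])
print(round(t_est, 1))
```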
IV.4. Computer system considerations
Milestoning runs are expensive due to the large number of trajectories that are computed. About twelve thousand trajectories were computed in an exact Milestoning run in this manuscript. For complex systems hundreds of thousands of trajectories are generated. These multiple and independent trajectories run in trivial parallelism; however, they may fail at numerous points of the calculations (due, for example, to a power outage). The restart option discussed in the manuscript is therefore handy.
Another challenge in Milestoning calculations is storage. The computation of many trajectories generates many files that are saved on the computer disk. When running classical Milestoning with hundreds of anchors and thousands of milestones, terabytes of data can be generated. A rapidly growing output file is “traj”. The traj files contain the coarse variable distances from each anchor and are written every one or two steps of a free trajectory. Other files that are not large individually, but add up to large storage, are the restart files (generated when using NAMD). Given that disk storage can be critical in Milestoning runs, the template input files for the individual trajectory runs should not contain options that write unnecessary data. For example, saving trajectory data (dcd or trr files) is not required for Milestoning computations.
V. Conclusions
We presented a python script to run exact Milestoning, called ScMiles2. Milestoning is a theory and an algorithm that exploits the use of a large number of trajectories. Dividing coarse space into small cells and conducting many trajectories between the boundaries of the cells yields the kinetics and thermodynamics of the system. The advantage of the short trajectories is that they can be computed efficiently in parallel and their initial conditions are chosen to speed up passages over barriers. On the other hand, the large number of trajectories complicates the management of computer resources and the detection and correction of errors. ScMiles2 addresses many of these issues by automating the initiation and launching of Milestoning trajectories and by analyzing the results. This tool was useful in the recent simulation of peptide translocation through a membrane.44 It is therefore our hope that by placing the software at https://github.com/Milestoning/ScMiles2.0, others may find it useful as well. The software ScMiles2 can be integrated with other packages based on the Milestoning theory. For example, it can be used in SEEKR16 to conduct exact Milestoning calculations (with iterations). However, integration with other technologies can be more complex. The use of the Weighted Ensemble with Milestoning14 is not supported at present. Weighted Ensemble requires the modification of the trajectory weights. Changes of weights are not implemented in ScMiles2.
Supplementary Material
VII. Acknowledgements
This research was supported by grants from the NIH GM59796 and GM111364 and by a grant from the Welch Foundation F-1896.
Footnotes
VI. Supporting Information: Enumeration of input parameter for ScMiles2; description of input files, python scripts, output directories; more details about the new features of ScMiles: Gromacs implementation, restart option, and analysis for long trajectories (DOC)
References
- 1. Shaw DE; Deneroff MM; Dror RO; Kuskin JS; Larson RH; Salmon JK; Young C; Batson B; Bowers KJ; Chao JC; Eastwood MP; Gagliardo J; Grossman JP; Ho CR; Ierardi DJ; Kolossvary I; Klepeis JL; Layman T; McLeavey C; Moraes MA; Mueller R; Priest EC; Shan YB; Spengler J; Theobald M; Towles B; Wang SC, Anton, a special-purpose machine for molecular dynamics simulation. Communications of the ACM 2008, 51, 91–97.
- 2. Bowman GR; Pande VS, An Introduction to Markov State Models and Their Applications to Long Timescale Molecular Simulations. Springer: 2014; p 139.
- 3. Adhikari U; Mostofian B; Copperman J; Subramanian SR; Petersen AA; Zuckerman DM, Computational Estimation of Microsecond to Second Atomistic Folding Times. Journal of the American Chemical Society 2019, 141, 6519–6526.
- 4. Aristoff D, Analysis and optimization of weighted ensemble sampling. ESAIM-Mathematical Modelling and Numerical Analysis-Modelisation Mathematique Et Analyse Numerique 2018, 52, 1219–1238.
- 5. Zhang BW; Jasnow D; Zuckerman DM, The “weighted ensemble” path sampling method is statistically exact for a broad class of stochastic processes and binning procedures. J. Chem. Phys 2010, 132, 054107.
- 6. Moroni D; Bolhuis PG; van Erp TS, Rate constants for diffusive processes by partial path sampling. J. Chem. Phys 2004, 120, 4055–4065.
- 7. Dinner AR; Mattingly JC; Tempkin JOB; van Koten B; Weare J, Trajectory Stratification of Stochastic Dynamics. SIAM Review 2018, 60, 909–938.
- 8. Allen RJ; Frenkel D; ten Wolde PR, Forward flux sampling-type schemes for simulating rare events: Efficiency analysis. J. Chem. Phys 2006, 124, 17.
- 9. Lopes LJS; Lelievre T, Analysis of the Adaptive Multilevel Splitting Method on the Isomerization of Alanine Dipeptide. J. Comput. Chem 2019, 40, 1198–1208.
- 10. Bello-Rivas JM; Elber R, Exact milestoning. J. Chem. Phys 2015, 142, 094102.
- 11. Vanden-Eijnden E; Venturoli M, Markovian milestoning with Voronoi tessellations. J. Chem. Phys 2009, 130, 194101.
- 12. Grazioli G; Andricioaei I, Advances in milestoning. I. Enhanced sampling via wind-assisted reweighted milestoning (WARM). Journal of Chemical Physics 2018, 149, 084103.
- 13. Schutte C; Noe F; Lu JF; Sarich M; Vanden-Eijnden E, Markov state models based on milestoning. J. Chem. Phys 2011, 134, 204105.
- 14. Ray D; Andricioaei I, Weighted ensemble milestoning (WEM): A combined approach for rare event simulations. Journal of Chemical Physics 2020, 152, 234114.
- 15. Wei W; Elber R, ScMile: A Script to Investigate Kinetics with Short Time Molecular Dynamics Trajectories and the Milestoning Theory. J. Chem. Theory Comput 2020, 16, 860–874.
- 16. Jagger BR; Lee CT; Amaro RE, Quantitative Ranking of Ligand Binding Kinetics with a Multiscale Milestoning Simulation Approach. Journal of Physical Chemistry Letters 2018, 9, 4941–4948.
- 17. Elber R; Fathizadeh A; Ma P; Wang H, Modeling molecular kinetics with (information for this reference is incomplete).
- 18. Ma P; Elber R; Makarov DE, Value of Temporal Information When Analyzing Reaction Coordinates. J. Chem. Theory Comput 2020, 16, 6077–6090.
- 19. Elber R; Makarov DE; Orland H, Molecular Kinetics in Condensed Phases: Theory, Simulation, and Analysis. Wiley: Hoboken, NJ, USA, 2020.
- 20. Elber R, Milestoning: An Efficient Approach for Atomically Detailed Simulations of Kinetics in Biophysics. In Annual Review of Biophysics, Vol 49, 2020, Dill KA, Ed. 2020; Vol. 49, pp 69–85.
- 21. Huber GA; Kim S, Weighted-ensemble Brownian dynamics simulations for protein association reactions. Biophys. J 1996, 70, 97–110.
- 22. Faradjian AK; Elber R, Computing time scales from reaction coordinates by milestoning. J. Chem. Phys 2004, 120, 10880–10889.
- 23. Elber R; Bello-Rivas MJ; Ma P; Cardenas AE; Fathizadeh A, Calculating Iso-Committor Surfaces as Optimal Reaction Coordinates with Milestoning. Entropy 2017, 19, 219.
- 24. Mugnai ML; Elber R, Extracting the diffusion tensor from molecular dynamics simulation with Milestoning. J. Chem. Phys 2015, 142, 18.
- 25. Ma P; Cardenas AE; Chaudhari MI; Elber R; Rempe SB, The Impact of Protonation on Early Translocation of Anthrax Lethal Factor: Kinetics from Molecular Dynamics Simulations and Milestoning Theory. Journal of the American Chemical Society 2017, 139, 14837–14840.
- 26. Atis M; Johnson KA; Elber R, Pyrophosphate Release in the Protein HIV Reverse Transcriptase. J. Phys. Chem. B 2017, 121 (41), 9557–9565.
- 27. Elber R, A milestoning study of the kinetics of an allosteric transition: Atomically detailed simulations of deoxy Scapharca hemoglobin. Biophys. J 2007, 92, L85–L87.
- 28. Abraham MJ; Murtola T; Schulz R; Pall S; Smith JC; Hess B; Lindahl E, GROMACS: High Performance Molecular Simulations Through Multi-Level Parallelism From Laptops to Supercomputers. SoftwareX 2015, 1–2, 19–25.
- 29. Majek P; Elber R, Milestoning without a reaction coordinate. J. Chem. Theory Comput 2010, 6, 1805–1817.
- 30. Phillips JC; Braun R; Wang W; Gumbart J; Tajkhorshid E; Villa E; Chipot C; Skeel RD; Kale L; Schulten K, Scalable molecular dynamics with NAMD. J. Comput. Chem 2005, 26, 1781–1802.
- 31. The PLUMED Consortium, Promoting transparency and reproducibility in enhanced molecular simulations. Nat. Methods 2019, 16, 670.
- 32. Tribello GA; Bonomi M; Branduardi D; Camilloni C; Bussi G, PLUMED2: New Feathers for an Old Bird. Comp. Phys. Comm 2014, 185, 604.
- 33. Fiorin G; Klein ML; Hénin J, Using Collective Variables to Drive Molecular Dynamics Simulations. Molecular Physics 2013, 111, 3345–3362.
- 34. E WN; Vanden-Eijnden E, Transition-Path Theory and Path-Finding Algorithms for the Study of Rare Events. In Annual Review of Physical Chemistry, Vol 61, 2010; Vol. 61, pp 391–420.
- 35. Vanden-Eijnden E; Venturoli M; Ciccotti G; Elber R, On the assumption underlying Milestoning. J. Chem. Phys 2008, 129, 174102.
- 36. Desale K; Kuche K; Jain S, Cell-penetrating peptides (CPPs): an overview of applications for improving the potential of nanotherapeutics. Biomaterials Science 2021, 9, 1153–1188.
- 37. Ousterhout JK; Jones K, Tcl and Tk Toolkit. Addison Wesley: Upper Saddle River, NJ, 2010.
- 38. Jorgensen WL; Chandrasekhar J; Madura JD; Impey RW; Klein ML, Comparison of simple potential functions for simulating liquid water. J. Chem. Phys 1983, 79, 926–935.
- 39. Huang J; Rauscher S; Nawrocki G; Ran T; Feig M; de Groot BL; Grubmuller H; MacKerell AD, CHARMM36: An improved force field for folded and intrinsically disordered proteins. Biophys. J 2017, 112, 175A–176A.
- 40. Jo S; Lim JB; Klauda JB; Im W, CHARMM-GUI membrane builder for mixed bilayers and its application to yeast membranes. Biophys. J 2009, 97, 50–58.
- 41. Kirmizialtin S; Elber R, Revisiting and computing reaction coordinates with directional milestoning. Journal of Physical Chemistry A 2011, 115, 6137–6148.
- 42. Foloppe N; MacKerell AD, Intrinsic conformational properties of deoxyribonucleosides: Implicated role for cytosine in the equilibrium among the A, B, and Z forms of DNA. Biophys. J 1999, 76, 3206–3218.
- 43. Sohn Yang Sung; Iosub-Amir Anat; Cardenas Alfredo E.; Karmi Ola; Yahana Merav Darash; Gruman Tal; Rowland Linda; Webb Lauren J.; Mittler Ron; Elber Ron; Friedler Assaf; Nechushtai Rachel, A peptide-derived strategy for specifically targeting the mitochondrial-ER network of cancer cells: a new paradigm in fighting cancer. Chemical Science 2022, 13, 6929.
- 44. Cardenas AE; Drexler CI; Nechushtai R; Mittler R; Friedler A; Webb LJ; Elber R, Peptide permeation across a phosphocholine membrane: An atomically detailed mechanism determined through simulations and supported by experimentation. Journal of Physical Chemistry B 2022, 126, 2834–2849.
- 45. Viswanath S; Kreuzer SM; Cardenas AE; Elber R, Analyzing milestoning networks for molecular kinetics: Definitions, algorithms, and examples. J. Chem. Phys 2013, 139, 174105.
- 46. Kuczera K; Jas GS; Elber R, Kinetics of Helix Unfolding: Molecular Dynamics Simulations with Milestoning. Journal of Physical Chemistry A 2009, 113, 7461–7473.
- 47. Das P; Moll M; Stamati H; Kavraki LE; Clementi C, Low-dimensional, free-energy landscapes of protein-folding reactions by nonlinear dimensionality reduction. Proceedings of the National Academy of Sciences of the United States of America 2006, 103 (26), 9885–9890.
- 48. Evans L; Cameron MK; Tiwary P, Computing committors in collective variables via Mahalanobis diffusion maps. arXiv 2021, arXiv:2108.08979.
- 49. Wu S; Li H; Ma A, A rigorous method for identifying a one-dimensional reaction coordinate in complex molecules. J. Chem. Theory Comput 2022, 18, 2836–2844.
- 50. West AMA; Elber R; Shalloway D, Extending molecular dynamics time scales with milestoning: Example of complex kinetics in a solvated peptide. J. Chem. Phys 2007, 126, 145104.
