Author manuscript; available in PMC: 2022 Dec 4.
Published in final edited form as: J Phys Chem B. 2021 Jun 10;125(24):6609–6616. doi: 10.1021/acs.jpcb.1c02143

Optimizing String Method’s Reproducibility Using Generalized Solute Tempering Replica Exchange

Gourav Shrivastav 1, Cameron F Abrams 2
PMCID: PMC9719742  NIHMSID: NIHMS1853083  PMID: 34110824

Abstract

Obtaining accurate and reproducible free energies from molecular simulations is somewhat tricky due to incomplete knowledge of crucial slow degrees of freedom leading to hidden barriers that can stymie sampling. Employing a sufficiently large number of collective variables (CV) and ensuring ergodic sampling in orthogonal CV space, perhaps via tempering methods, can reduce these issues to some extent. For complex systems with high-dimensional free energy landscapes, both these approaches become computationally expensive. For high-dimensional landscapes, efficient exploration can be enabled by using temperature-accelerated MD (TAMD) and identification and characterization of minimum free energy pathways connecting minima can be found by using the string method (SM). Both TAMD and SM use mean-force estimates from finite MD simulations and are thus susceptible to sampling restrictions from hidden variables. A recent development in parallel tempering methods, “generalized replica exchange solute tempering” (gREST), can enhance sampling at a reasonable computational cost with its flexibility to target very specific “solutes” which can include arbitrary independent variables. Considering the advantages of both methods, we implement gREST-enabled TAMD and SM. By considering two different collective variable representations of the pentapeptide neurotransmitter met-enkephalin, we show that both gREST-enabled TAMD and SM yield more accurate and reproducible free energy predictions than TAMD and SM alone. Given the moderate computational cost of gREST compared with other replica-exchange methods, gREST-enabled SM represents a more attractive method for characterizing free energy minima and pathways among them for a large variety of systems.


INTRODUCTION

Molecular dynamics simulations have become a standard tool to study the thermodynamics and kinetics of complex systems. With advancements at the algorithmic level and in high-performance computing facilities, a typical MD simulation can now efficiently capture events at much longer time (milliseconds to seconds) and larger length (1 billion atoms) scales.1,2 For chemical and biochemical processes, free energy calculations are also regularly used to probe the macroscopic behavior of metastable states. These free energy methods can be broadly classified as enhanced sampling3–13 and path optimization methods.14,15 The former category focuses on generating the free energy landscape followed by the identification of transition pathways. In contrast, the latter concentrates on identifying the most probable transition pathway first and then computing the free energy differences. However, for both methods, adequate sampling needs to be ensured to capture the slow degrees of freedom coupled to system movements.

The reproducibility, consistency, and reliability of free energy calculations critically depend on the method of interest and system complexity. The more complex the system is, the more likely the chosen set of CVs lacks the ability to sample one or more critical slow modes of the system’s motion. These unknown slow modes then act as hidden variables and deteriorate the quality of free energy calculations due to undersampling in the relevant area of the free energy landscape. One way to reduce the impact of hidden variables is to choose a sufficiently large set of CVs. However, most of the available free energy methods become highly inefficient in exploring a multidimensional free energy landscape. Improving the sampling in orthogonal degrees of freedom is another approach to counter the issue of hidden variables.16–18 It is in general achieved by employing tempering schemes, such as replica-exchange (RE) simulations,19,20 as the data from a single trajectory lack sufficient information to estimate the uncertainty accurately. For instance, in the context of TAMD, it has been shown that combining TAMD and replica-exchange parallel tempering results in significantly more accurate calculation of free energies via on-the-fly parametrization21 and metadynamics.22,23 Although conventional RE simulations reduce the dependence on the choice of CVs, the method has a high computational cost. One may need many replicas with finely spaced temperature increments to reach high enough temperatures for sufficient barrier crossing in a reasonable time, because potential energies must sufficiently overlap between adjacent temperature replicas to enable sufficiently many swaps. Important modifications of RE methods have also been developed which focus only on varying the temperature of a small part of the simulation system, viz. solute tempering.24–27

Here, we aim to resolve the above-mentioned issues by combining both temperature-accelerated MD (TAMD) and the string method (SM) in collective variables14,15,28–32 with a recent solute tempering method called “generalized replica-exchange solute tempering” (gREST).27 TAMD enables a single simulation system to perform enhanced sampling in CV space, including ease in identifying local free energy minima. SM identifies minimum free energy pathways (MFEPs) connecting two states of interest in some CV space and works efficiently even in high-dimensional CV spaces. gREST allows for a precise definition of system fragments subjected to tempering. If the set of variables tempered is small (a few dozen to a few hundred), one needs only three or four replicas to span more than 100 K. Hence, gREST provides flexibility for targeted solute tempering over large temperature ranges with a much lower number of replicas.

In this study, we aim to show to what degree enabling SM with gREST leads to improved reproducibility in computing MFEPs. We focus on multidimensional CV spaces describing the conformational distribution of the pentapeptide neurotransmitter met-enkephalin in a vacuum (10-D CVs) and under aqueous conditions (2-D CVs). Despite its small size, met-enkephalin has reasonable complexity in its conformational space, in particular “hidden” barriers residing in side-chain rotameric states that can influence backbone dihedral angle conformational sampling, and therefore it remains a system of interest for testing a variety of free energy approaches.33–35 Our paper continues with a description of the methods and simulation protocols, followed by results and discussion involving both vacuum and solvated met-enkephalin, followed by a conclusion.

METHODS AND SYSTEM DETAILS

Temperature-Accelerated MD.

TAMD is a biasing method for enhanced sampling in CV space.36 In TAMD, a fast exploration of a multidimensional free energy landscape (at the target temperature) is achieved by evolving the CVs at a sufficiently high artificial temperature. For this, a chosen set of CVs (θ) is coupled via harmonic restraints to a fictitious particle in CV space (z) that evolves on a separate time scale. Though different choices are available, one of the most straightforward is to make the dynamics of this fictitious particle diffusive:

$$\gamma m \dot{z} = \kappa\left(\theta(x) - z\right) + \sqrt{2\beta^{-1}\gamma m}\,\eta(t) \qquad (1)$$

where x are the atomic coordinates, κ is the force constant, β = 1/(kBT) is the inverse temperature, and η(t) is a Gaussian white noise with mean zero and covariance ⟨ηi(t)ηj(t′)⟩ = δijδ(t − t′). γ is the friction coefficient, and m is the mass tensor. In the limit of sufficiently large values of κ and γ, the restraining force on the CVs approximates the negative gradient of the underlying free energy landscape. The complete details and implications of the method can be found in refs 12, 36, and 37.
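As an illustration of eq 1, the diffusive dynamics of the fictitious variable z can be advanced with a simple Euler–Maruyama step. The sketch below is not the PLUMED implementation described later; it assumes a scalar mass m rather than a tensor, treats θ(x) as fixed during the step, and the function name `tamd_step` and the time step `dt` are hypothetical.

```python
import numpy as np

def tamd_step(z, theta, kappa, gamma, m, beta, dt, rng):
    """One Euler-Maruyama step of eq 1 for the auxiliary variables z.

    Assumptions for illustration only: m is a scalar (not a tensor) and
    theta holds the instantaneous CV values theta(x).
    """
    z = np.asarray(z, dtype=float)
    theta = np.asarray(theta, dtype=float)
    # Deterministic part: harmonic pull of z toward theta(x).
    drift = kappa * (theta - z) / (gamma * m)
    # Stochastic part: discretized sqrt(2 gamma m / beta) * eta(t) term.
    noise_scale = np.sqrt(2.0 * dt / (beta * gamma * m))
    return z + drift * dt + noise_scale * rng.standard_normal(z.shape)
```

A call such as `tamd_step(z, theta, kappa, gamma, m, beta, dt, np.random.default_rng(0))` returns the updated z. In practice the restraining force would be averaged over a stretch of restrained MD before each update, as described in the Results section.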

String Method in Collective Variables.

The string method and its climbing variants allow identification of MFEPs connecting minima and their connecting saddle points in CV space of any dimensionality.38,39 An MFEP can be imagined as a curvilinear path z(α) in CV space which is tangential to the gradient of free energy (∇F) and thus satisfies38,40

$$\left[M(z(\alpha))\,\nabla F(z(\alpha))\right]^{\perp} = 0 \qquad (2)$$

where M(z(α)) and ∇F(z(α)) are the metric tensor and the gradient of the free energy, respectively, and ⊥ denotes the component perpendicular to the path. To converge to the MFEP between two locations in CV space, an initial guess of the curve z(α) is required, which comprises a string of N discrete images of the system with a particular parametrization such that parameter α = 0 corresponds to one end of the string and α = N the other end. The position of each intermediate image 0 ≤ α ≤ N on the string is then updated progressively according to

$$\gamma \dot{z}(\alpha,t) = -M(z(\alpha,t))\,\nabla F(z(\alpha,t)) + \lambda(\alpha,t)\,z'(\alpha,t) \qquad (3)$$

to optimize the pathway. Here, γ is the friction parameter, z˙ and z′ are the derivatives of z with respect to t and α, respectively, and λ is a Lagrange multiplier which renders the combined effect of the tangential force of the gradient and reparametrization of the string.

The M(z(α,t)) and ∇F(z(α,t)) in eq 3 are obtained as conditional expectations from restrained simulations at the assigned CV values for each image; that is, each is found from an independent system realization, requiring two MD systems per image. However, in the current setup, both M(z(α,t)) and ∇F(z(α,t)) are estimated by using the data from the same image. A critical parameter in SM is the duration of these restrained MD simulations between string updates. Integration of eq 3 thus proceeds to obtain the steady-state solution as z(α) satisfying eq 2. The full details on the string method can be found in refs 38 and 40–42.
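A minimal sketch of one string update may help make eq 3 concrete. It rests on several assumptions that are not taken from the implementation described in this paper: the gradient ∇F is supplied by an analytic function instead of restrained MD averages, the Lagrange-multiplier term is replaced by an interpolation-based reparametrization to equal arc length (the approach used in simplified string methods), and the periodicity of dihedral CVs is ignored. The names `reparametrize`, `string_step`, `grad_f`, and `metric` are hypothetical.

```python
import numpy as np

def reparametrize(images):
    """Redistribute images at equal arc length along the current polyline."""
    images = np.asarray(images, dtype=float)
    seg = np.linalg.norm(np.diff(images, axis=0), axis=1)
    s = np.concatenate(([0.0], np.cumsum(seg)))    # arc length at each image
    s_new = np.linspace(0.0, s[-1], len(images))   # equally spaced targets
    return np.column_stack([np.interp(s_new, s, images[:, d])
                            for d in range(images.shape[1])])

def string_step(images, grad_f, gamma, dt, metric=None):
    """Move each image along -M grad F (eq 3 without lambda), then reparametrize."""
    images = np.asarray(images, dtype=float)
    g = np.array([grad_f(z) for z in images])
    if metric is not None:
        # Apply the metric tensor M(z) image by image; identity if omitted.
        g = np.array([metric(z) @ gi for z, gi in zip(images, g)])
    return reparametrize(images - (dt / gamma) * g)
```

Iterating `string_step` until the images stop moving gives a discrete approximation to a path satisfying eq 2 on the model surface defined by `grad_f`.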

In climbing mode, one end of the string is attracted to (or fixed at) a minimum, while the other end of the string climbs uphill on the FES in search of saddles. After the string is updated according to eq 3, the forces acting on the final image are altered through the following boundary conditions for the end points:

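$$\begin{aligned}
\gamma \dot{z}(\alpha=0,t) &= -M(z(\alpha=0,t))\,\nabla F(z(\alpha=0,t)) \\
\gamma \dot{z}(\alpha=1,t) &= -M(z(\alpha=1,t))\,\nabla F(z(\alpha=1,t)) + \nu\,\tau(t)\left(\tau(t)\cdot M(z(\alpha=1,t))\,\nabla F(z(\alpha=1,t))\right)
\end{aligned} \qquad (4)$$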

where ν > 1 is a parameter that controls ascent speed and τ = z′(α = 1, t)/|z′(α = 1, t)| is the unit tangent along the string. That is, in the preceding equation, the tangential force along the string is reversed at the end. The climbing end at α = 1 now climbs in the direction tangent to the string.39,43
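The climbing boundary condition of eq 4 can be written as a short function. This is only a sketch: the tangent is approximated by a finite difference between the last two images, which is an assumption of this example, and the name `climbing_end_force` is hypothetical.

```python
import numpy as np

def climbing_end_force(m_grad_f, z_prev, z_end, nu):
    """Right-hand side of eq 4 for the climbing end (alpha = 1).

    m_grad_f is M grad F evaluated at the end image. The unit tangent is
    estimated from the last two images (an assumption of this sketch).
    """
    m_grad_f = np.asarray(m_grad_f, dtype=float)
    tau = np.asarray(z_end, dtype=float) - np.asarray(z_prev, dtype=float)
    tau = tau / np.linalg.norm(tau)
    # Downhill force plus nu times its tangential projection; for nu > 1
    # the tangential component is reversed, so the end image climbs.
    return -m_grad_f + nu * tau * np.dot(tau, m_grad_f)
```

With ν = 2 the tangential component of the force is exactly mirrored while the perpendicular component is left unchanged, so the end image relaxes across the string and ascends along it.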

Generalized Replica-Exchange Solute Tempering.

As the name suggests, gREST is a generalization of the REST2 framework and employs generalized scaling parameters for solute–solvent interactions.27 Consequently, it provides a much more flexible way to choose among intra- and intermolecular degrees of freedom as the solute, rather than restricting the choice to certain atoms. For example, in gREST, a single dihedral angle can be defined as a solute. Assuming two replicas a and b with solute temperature indexes ri, all the solute–solute interaction potentials (Euu) remain scaled by βri/β0 as in the case of REST2, where βri and β0 are the inverse temperatures of the solute and solvent regions, respectively. The key difference between gREST and REST2 lies in modeling the solute–solvent interactions (Euv). Based on the nature of the solute, gREST provides a dynamic scaling parameter for solute–solvent interactions, (βri/β0)^(ki/li). The scaling parameters li and ki represent the maximum number of atoms and the number of atoms from the solute in the ith solute–solvent interaction, respectively. In REST2, the values of ki and li are fixed to 1 and 2, respectively. For both gREST and REST2, the solvent–solvent interaction potential (Evv) remains unscaled. According to the gREST framework, the generalized potential energy of the replica a with solute (temperature index ri) and solvent subparts is expressed as27

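$$E_{m}^{\mathrm{gREST},[a]} = \frac{\beta_{r_i}}{\beta_0}\,E_{uu}\!\left(X^{[a]}\right) + \sum_i \left(\frac{\beta_{r_i}}{\beta_0}\right)^{k_i/l_i} E_{uv,i}\!\left(X^{[a]}\right) + E_{vv}\!\left(X^{[a]}\right) \qquad (5)$$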

where X[a] represents the target atomic coordinate of replica a. Given these potential energy parameters, the exchange probability between replicas a and b with temperature indexes ra and rb is calculated as

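$$\begin{aligned}
P(a \leftrightarrow b) &= \begin{cases} 1, & \Delta_{ab} \le 0 \\ \exp(-\Delta_{ab}), & \Delta_{ab} > 0 \end{cases} \\
\Delta_{ab} &= \left(\beta_{r_a} - \beta_{r_b}\right)\left(E_{uu}\!\left(X^{[a]}\right) - E_{uu}\!\left(X^{[b]}\right)\right) + \sum_i \beta_0^{\,1-k_i/l_i}\left(\beta_{r_a}^{\,k_i/l_i} - \beta_{r_b}^{\,k_i/l_i}\right)\left(E_{uv,i}\!\left(X^{[a]}\right) - E_{uv,i}\!\left(X^{[b]}\right)\right)
\end{aligned} \qquad (6)$$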
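To make the scaling in eq 5 concrete, the sketch below evaluates the gREST energy of one replica from precomputed energy components and applies the acceptance rule of eq 6 to a precomputed Δab. It is an illustration under stated assumptions, not the GROMACS/PLUMED code path: the function names are hypothetical, and the energy components are assumed to be available as plain numbers.

```python
import math

def grest_energy(beta_r, beta_0, e_uu, e_uv, k, l, e_vv):
    """Scaled potential energy of eq 5 for one replica.

    e_uv, k, and l are equal-length sequences: the i-th solute-solvent
    term and its scaling parameters k_i and l_i.
    """
    ratio = beta_r / beta_0
    e_uv_scaled = sum(ratio ** (ki / li) * e for e, ki, li in zip(e_uv, k, l))
    # Solute-solute scaled by ratio, solvent-solvent left unscaled.
    return ratio * e_uu + e_uv_scaled + e_vv

def acceptance_probability(delta_ab):
    """Metropolis acceptance of eq 6 for a given Delta_ab."""
    return 1.0 if delta_ab <= 0.0 else math.exp(-delta_ab)
```

Setting every k_i = 1 and l_i = 2 recovers the REST2 square-root scaling of the solute–solvent terms, and setting beta_r equal to beta_0 recovers the unscaled potential energy.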

TAMD and SM with gREST.

The central role of restrained MD in TAMD is estimation of mean forces used to update system position, and in SM MD estimates both mean forces and metric tensors used for image updates. Because these are expectations from finite simulations, they are susceptible to irreproducibility due to slow variations in unbiased, or “hidden”, variables. By employing a tempering approach, we aim to minimize the impact of these variables by ensuring they are sampled fully according to the correct equilibrium distributions. The tempering approach we have selected is gREST, primarily because it appears to be the most efficient way to perform temperature replica exchange. To implement gREST within TAMD and SM, the replicas corresponding to a given image are all restrained to the same location in CV space. After every position update, the lowest-temperature image communicates the new position to each replica. In the current implementation, only the samples from the lowest-temperature replica are used in the position update. Figure 1 shows a schematic of the integration of SM with gREST, a combination we refer to as generalized solute tempering with string method (gSTSM):

Figure 1.

String method in collective variables and gREST. (A) Schematic representation of a discrete N-image string in a 3-dimensional CV space. (B) A combined SM-gREST calculation for an N-image string with M gREST replicas per image. Low-T replicas along the string communicate reparametrization forces to one another to maintain equidistance along the string, while gREST replicas at each image are allowed to swap randomly to approach ergodic Boltzmann statistics in the lowest-temperature replicas and therefore more accurate estimates of mean forces and metric tensors for evolving the string in CV space.

Implementation.

Our combined TAMD/SM+gREST approach has been implemented by using PLUMED in conjunction with GROMACS. PLUMED is an open-source common platform44 which can be interfaced with many MD simulation host codes.45–47 PLUMED provides access to many collective variable-based enhanced sampling algorithms and tools to analyze molecular dynamics (MD) trajectories. Here, we add an original implementation of conventional SM and gSTSM in PLUMED to extend its utility to path optimization methods. All these steps are achieved by using a new class “MFEP” which inherits from the “BIAS” class of PLUMED. Our implementation heavily depends on the Hamiltonian replica-exchange framework of PLUMED as implemented in GROMACS, though small modifications are made to make it consistent with the SM framework. Here, all results are presented from PLUMED2–2.7.0 and GROMACS-2019.6.

Simulation Details.

For met-enkephalin (N-acetyl-Tyr-Gly-Gly-Phe-Met-methylamide) in a vacuum, a single chain in a cubic box of 30 Å is used. MD simulations were performed by employing the CHARMM22 force field48 without cmap corrections. The equations of motion were integrated by using the velocity-Verlet scheme with a 1 fs time step. For met-enkephalin under aqueous conditions, a single chain is solvated by using 984 water molecules. For both the systems, periodic boundary conditions were used, and long-range electrostatics were computed by using particle-mesh Ewald summation with a Fourier grid spacing of 1.2 Å. The van der Waals interactions were cut off beyond a distance of 10 Å, and a force switch was used at 9 Å. All simulations for met-enkephalin in a vacuum were performed in the canonical ensemble at 300 K. For simulation in solution, the system was first equilibrated in isothermal–isobaric ensemble at 300 K and 1 bar. The temperature was controlled using the stochastic velocity rescaling method as implemented in GROMACS using a damping constant of 0.1 ps−1. All string method calculations were performed in the canonical ensemble. For all the gREST enabled calculations, all the dihedral angles were designated as solute without altering any 1–4 pair interactions.

RESULTS AND DISCUSSION

Identifying a Minimum in a 10-D Torsional CV Space.

For met-enkephalin in a vacuum, a 10-D CV space comprising five Ramachandran angles associated with each residue is used. Initiating from a random configuration on a 10-D free energy landscape, the nearest minimum is identified by using zero-temperature temperature-accelerated molecular dynamics (TAMD) simulations. For TAMD simulations, the restraining forces, obtained with a harmonic force constant of 150 kcal mol−1 rad−2, are accumulated over 20 ps simulations, and CVs are updated by using a friction parameter of 0.5 ps−1. For gREST-enabled TAMD simulations, four replicas (Nr) per image (N) spanning a temperature range of 300–600 K are used. The exact temperature for each replica is assigned by using an exponential distribution, Ti = T0 exp(k × i), where k = log(Tmax/Tmin)/(Nr − 1). Exchanges between two consecutive replicas are attempted every 500 steps. Parameters for updating the CV values are kept the same as in the case of TAMD calculations. For both TAMD and gREST-enabled TAMD, five sets of simulations are performed by using different initial velocities. The average RMSD from these calculations is used as a metric to check the consistency and reproducibility of the results from the two methodologies.
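The exponential temperature ladder described above is easy to reproduce. The following sketch (the function name `replica_temperatures` is hypothetical) takes T0 = Tmin and evaluates Ti = T0 exp(k × i) for i = 0, ..., Nr − 1.

```python
import math

def replica_temperatures(t_min, t_max, n_replicas):
    """Exponentially spaced replica temperatures, T_i = T_0 * exp(k * i)."""
    if n_replicas < 2:
        raise ValueError("at least two replicas are needed to define k")
    k = math.log(t_max / t_min) / (n_replicas - 1)
    return [t_min * math.exp(k * i) for i in range(n_replicas)]
```

For four replicas between 300 and 600 K this gives approximately 300, 378.0, 476.2, and 600 K, which agrees to within rounding with the ladder quoted for the aqueous system later in the paper.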

Figure 2 shows the comparison of the average RMSD of the CVs obtained from TAMD and TAMD+gREST. For a given set of parameters, the average RMSD from TAMD+gREST simulations shows a faster and less noisy convergence than average RMSD from TAMD only. The local force components acting on 10-D CVs are then analyzed for further insights, and representative trajectories of 10-D mean-force components are shown in Figure 3. For TAMD+gREST, the forces seem to converge with uniform fluctuation around the mean. However, for TAMD, nonuniform fluctuations are observed in the forces. Such observed behavior from TAMD could be due to the rare transition events of hidden variables, which are the three side chain dihedral angles for Tyr, Phe, and Met. To clarify, the distribution of the Ramachandran angles and side chain dihedral angles is examined to confirm the role of hidden variables (Figure 4).

Figure 2.

Average CV RMSD vs elapsed time for TAMD simulations with and without gREST, as systems converge to a single minimum in CV space. For both TAMD and TAMD+gREST, the data are averaged over five different trajectories initiated from different velocities.

Figure 3.

Representative 10-D mean force components vs elapsed time used for updating the CVs in TAMD and TAMD+gREST simulations. In the lower panel, mean force components are taken from the lowest-temperature replica.

Figure 4.

Distributions of the Ramachandran (ψ, upper row) and (ϕ, middle row) angles, and side chain (χ, bottom row) dihedral angles, of each residue in met-enkephalin, computed from TAMD both with (blue) and without (red) gREST, averaged over five independent simulations. The probability distributions are normalized by using the number of data points and bin width. The cartoon shown in the bottom row shows the overlapped conformational states acquired by different residues in met-enkephalin.

In Figure 4, the observed distributions of all the five Ramachandran angles from TAMD are in poor agreement with each other across replicas, reflecting poor reproducibility. In contrast, TAMD+gREST-generated distributions are in near-perfect agreement across replicas. Similarly, the distributions of side-chain dihedral angles for three residues Tyr, Phe, and Met in TAMD+gREST simulations exhibit more uniform sampling than in TAMD. The poor reproducibility of the side chain dihedral angle distributions suggests the possibility of non-ergodically sampled transitions among the unbiased variables. For instance, at least one of the simulations completely undersamples the side chain dihedral for Phe at −1.3. Similar events are also observed for side chain dihedrals of Tyr and Met. These rare-event transitions further lead to nonconverged mean forces and may cause slow convergence and less reproducible outcomes. In contrast to TAMD, TAMD+gREST gives much more reproducible and consistent distributions across different simulations and, thus, better convergence and overall reproducibility. The inset in Figure 4 depicts an aligned overlay of 1000 configurations accessible to the Ramachandran angles and the side chains corresponding to the distributions shown in Figure 4. The configurations are obtained from 10 ns of a TAMD simulation.

We then examined the effect of exchange frequency and the temperature window for TAMD+gREST simulations. Figure 5 (left panel) shows the effect of different exchange frequencies on the RMSD, calculated by using four replicas spanning a temperature window of 300–600 K. The exchange probability for this temperature window is observed to be in the range of 0.47–0.50. For all examined exchange frequencies, the RMSD values are found to be well converged with mean RMSD < 0.05 rad. The exchange frequency is also known to affect the simulation performance, as a higher exchange frequency requires more exchange attempts to be tested, in addition to frequent neighbor list generation. For met-enkephalin in a vacuum, the efficiency goes down from 1 to 0.92 as the number of steps between consecutive exchange attempts is decreased from 1000 to 10. The drop in efficiency is not significant as the system is very small. Note that for all these simulations the frequency of neighbor-list updates is held constant at every 20 steps, rather than allowing for dynamic neighbor-list updates.

Figure 5 (right panel) shows the effect of temperature window on the RMSD. Four different temperature windows, 300–500, 300–600, 300–900, and 300–1200 K, are used. For each window, the exchange is attempted every 20 steps. As in the case of exchange frequency, the RMSD here is also observed to be well converged. The temperature window across the replicas governs the exchange probability and therefore needs to be chosen properly. The exchange probability for each temperature window used for met-enkephalin in a vacuum is given in Table 1. As a rule of thumb, an exchange probability of 0.25–0.35 is usually considered adequate. Considering this, for met-enkephalin in a vacuum, a maximum temperature of up to 900 K can be used across four replicas with sufficient overlapped sampling. Even under solvent conditions, the temperature window and corresponding exchange probabilities for met-enkephalin should not change much, as the solute size will remain the same.

Figure 5.

RMSD for TAMD+gREST simulations for finding a minimum. Left panel: TAMD calculations with four replicas spanning a temperature range of 300–600 K. The neighbor list is built every 20 steps, irrespective of exchange frequency. Exchange probability: 0.47–0.50. Right panel: TAMD calculations with four replicas at different temperature windows. The neighbor list is built every 20 steps, and exchange is attempted every 20 steps.

Table 1.

Exchange Probability as a Function of Temperature Window for TAMD+gREST Simulation of Met-Enkephalin

temperature window (K)    exchange probability
300–500                   0.59–0.61
300–600                   0.47–0.50
300–900                   0.28–0.32
300–1200                  0.16–0.24

Identifying a Saddle: The Climbing String Method.

Once a minimum is identified on a multidimensional free energy surface, the SM’s climbing version can locate connected saddles.39 Starting from a minimum located by using TAMD, climbing SM calculations are performed to identify the saddles connected to this minimum. A climbing string consisting of eight images and three replicas per image spanning over a temperature window of 300–600 K is employed in a 10-D CV space. Like TAMD+gREST calculations, all the dihedral angles of met-enkephalin are used as a gREST solute. The restraining forces on the CVs are accumulated for 10 ps by using a force constant of 150 kcal mol−1 rad−2. CVs are updated by using a friction parameter equal to 1 ps−1, with an ascent parameter ν of 2 for the climbing image.

Climbing is performed by using an initial small push along random directions. Because the motivation of the paper is to show the applications of the SM implementation in PLUMED, we focus on locating only a few saddle points. Once a saddle point is identified, the final image is pushed along the direction extrapolated from the last two images in order to find the connecting minimum on the other side. Three saddle points, and thus three other connecting minima, are identified and are shown in Figure 6. All three identified saddles (configurations shown in the middle circles in Figure 6) are connected to a common minimum (bottom circle, Figure 6), and the configurations of the other connecting minima are shown in the top circles of Figure 6. Among the three identified pathways, the pathway shown on the right side of Figure 6 is found to exhibit the lowest activation barrier (3.6 kcal/mol) and may possibly carry the maximum transition current. However, to identify if it is the only pathway carrying a maximum transition current, one needs to locate all the possible saddle states connected to a given minimum. As also shown in ref 39, the string method with its climbing variant can be used to generate a complete network of saddle points and minima in a high-dimensional free energy surface.

Figure 6.

Snapshots for the saddles (middle circles) and their connecting minima (top circles) obtained on climbing from the minimum configuration (bottom circle). The free energy (kcal/mol) in each circle is relative to the minimum at the bottom.

Effect on Dimensionality: A 2D Case.

For a more stringent test of gREST-enabled orthogonal sampling, we consider met-enkephalin under aqueous conditions, for which the number of CVs is further reduced from ten to two (dihedral angles defined only by Cα atoms). This means we consider a case where the contribution from hidden variables is potentially more severe. After equilibration with classical MD, a 150 ns long trajectory is generated by using four replicas spanning a temperature range of 300–600 K (300, 377.9, 476.2, and 600 K). Here again, all the dihedral angles, along with the CMAP energy terms, are employed as solute. The exchange between two consecutive replicas is attempted every 50 steps (100 fs). The overall exchange probability is found to be in the range 0.37–0.40. The free energy map is generated from the snapshots from all the trajectories by using the multistate Bennett acceptance ratio (MBAR)27 (Figure 7). Figure 7 shows two major minima M1 and M2 at (−0.9, 1.92) and (0.83, 0.85), respectively. The minima M1 and M2 correspond to U-shaped and helical conformations, respectively, and are in agreement with the literature.33,35,49

Figure 7.

Projection of free energy in 2-dimensional collective variable space with some structures for metastable states.

The SM and gSTSM calculations are compared based on the convergence to an MFEP connecting minima M1 and M2. For SM, a string of eight images is used. The initial path for the string is chosen to be slightly bent, because a linear initial path visually appears to be similar to the minimum free energy path. The position of the images is updated every 200 steps by using a friction parameter of 40 ps−1 and a force constant of 70 kcal mol−1 rad−2. Both ends of the string are also evolved to locate their respective minima. For gSTSM, a string of eight images with four replicas per image is used. The position of each image is updated by using the same parameters as in the case of SM. The temperatures of the four replicas for each image are kept at 300, 377.9, 476.2, and 600 K, and exchange is attempted every 20 steps. The exchange probability is observed to be 0.55. Figure 8 (left panel) shows the initial (white curve with solid circles) and converged free energy paths (color curves with open circles). To check the effect of Nr, two more calculations with Nr equal to 2 and 3 are also performed. For Nr = 3, the replicas are kept at 300, 377.9, and 476.2 K, and for Nr = 2, the two replicas are kept at 300 and 377.9 K. For these simulations, the same parameters for string update and exchanges are used. The corresponding string’s RMSD (averaged over all the images) as a function of Nr is shown in Figure 8 (right panel). All the RMSDs clearly indicate string convergence. However, Nr = 1, representing a conventional SM calculation, shows a relatively higher RMSD and exhibits large fluctuations. In contrast, the RMSDs from gSTSM converge much earlier (from 15000 iterations onward) and exhibit lower values.

Figure 8.

Left panel: minimum free energy path connecting minima M1 and M2 obtained with different numbers of replicas per image. Right panel: RMSD of the string connecting the M1 and M2 minima. Images on the string are updated every 200 time steps.

For the identified MFEPs, a 20 ns long restrained calculation is performed to obtain mean forces and, thus, the free energy differences. The strings connecting M1 and M2, consisting of eight images and three replicas per image at temperatures of 300, 387.2, and 500 K, are employed. Average mean forces are calculated with the MBAR method using snapshots from all the replicas. Subsequently, the free energy is obtained from thermodynamic integration of the mean forces. M1, corresponding to the U-shaped conformation, is found to be more stable than M2 with a free energy difference of 0.5 kcal/mol. The free energy difference between M1 and M2 as obtained directly from the free energy surface is found to be 0.12 kcal/mol. The small difference between these values could be due to the discrete nature of the path used for mean force calculations.
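The thermodynamic integration step can be sketched as a trapezoidal line integral of the free energy gradient along the discrete string. This is an illustrative assumption about the quadrature rather than the scheme used to produce the numbers above; the function name `integrate_mean_force` is hypothetical, the input is taken to be ∇F at each image (flip the sign if the mean force is defined as −∇F), and periodic wrapping of dihedral CVs is ignored.

```python
import numpy as np

def integrate_mean_force(images, grad_f):
    """Trapezoidal line integral of grad F along a discrete string.

    images: (N, d) CV values of the images.
    grad_f: (N, d) free energy gradients at those images.
    Returns F at every image relative to the first image.
    """
    images = np.asarray(images, dtype=float)
    grad_f = np.asarray(grad_f, dtype=float)
    dz = np.diff(images, axis=0)                 # displacement between neighbors
    g_mid = 0.5 * (grad_f[1:] + grad_f[:-1])     # average gradient per segment
    return np.concatenate(([0.0], np.cumsum(np.sum(g_mid * dz, axis=1))))
```

The last element of the returned array is the free energy difference between the two ends of the string. With only eight images the quadrature error of such a rule is not negligible on a curved path, which is consistent with the explanation offered above for the difference between the two estimates.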

CONCLUSION

Temperature-accelerated MD, the string method in collective variables, its climbing variant, and its combination with the orthogonal-space sampling approach of gREST27 are implemented in PLUMED for execution with GROMACS. gREST is shown to improve sampling in orthogonal space at a reasonable computational cost, and it is shown to improve the reproducibility of free energy calculations for met-enkephalin. In particular, we used our hybrid methods to identify the minima, saddles, and MFEPs for met-enkephalin under vacuum and aqueous conditions. Under vacuum and aqueous conditions, a 10-D CV space (five pairs of Ramachandran angles) and a 2-D CV space (dihedral angles of Cα atoms) are utilized, respectively. Under vacuum conditions, first a minimum is identified by using TAMD. Subsequently, three saddles connected to this identified minimum and their connecting minima on the other side are located by using climbing SM enabled with gREST. For met-enkephalin under aqueous conditions, an MFEP connecting minima M1 and M2 is identified by using SM, and their free energy difference is calculated. Under both vacuum and aqueous conditions, the string method calculations enabled with gREST show better convergence and reproducibility across multiple simulations.

In the current implementation of SM with gREST, we only use samples from the target temperature to update images on the string due to the bottleneck of generating the correct weights for snapshots from all temperatures, which requires all the modified potential energies for a given configuration. This enhancement will be considered in future efforts. In future implementations, we will also focus on improving the computational efficiency of the method. For example, the sampling efficiency could also be improved by an infinite-swapping-frequency implementation.50

Though gREST reduces the sensitivity of the free energy gradient estimation to the quality of the choice of the CVs used in TAMD or SM, it is important to note that gREST does not guarantee that poorly chosen CVs will be free from hidden-barrier sampling issues. Identification of a good set of CVs and a proper gREST solute region should still be considered carefully. Nevertheless, the increased reproducibility conferred by gREST orthogonal sampling at a reasonable cost should further enhance the attractiveness of TAMD and SM in a variety of applications.

ACKNOWLEDGMENTS

The authors acknowledge financial support from the National Institutes of Health (R01-GM-100472). This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation Grant ACI-15485862. The code for the PLUMED patch is available at https://github.com/cameronabrams/plumed-patch-grest-stringmethod.git.

Footnotes

Complete contact information is available at: https://pubs.acs.org/10.1021/acs.jpcb.1c02143

Publisher's Disclaimer: Published as part of The Journal of Physical Chemistry virtual special issue “Carol K. Hall Festschrift”.

The authors declare no competing financial interest.

Contributor Information

Gourav Shrivastav, Department of Chemical and Biological Engineering, Drexel University, Philadelphia, Pennsylvania 19104, United States.

Cameron F. Abrams, Department of Chemical and Biological Engineering, Drexel University, Philadelphia, Pennsylvania 19104, United States.

REFERENCES

  • (1) Wolf S; Lickert B; Bray S; Stock G Multisecond ligand dissociation dynamics from atomistic simulations. Nat. Commun. 2020, 11, 1–8.
  • (2) Jung J; Nishima W; Daniels M; Bascom G; Kobayashi C; Adedoyin A; Wall M; Lappala A; Phillips D; Fischer W; et al. Scaling molecular dynamics beyond 100,000 processor cores for large-scale biophysical simulations. J. Comput. Chem. 2019, 40, 1919–1930.
  • (3) Torrie G; Valleau J Nonphysical sampling distributions in Monte Carlo free-energy estimation: Umbrella sampling. J. Comput. Phys. 1977, 23, 187–199.
  • (4) Carter E; Ciccotti G; Hynes JT; Kapral R Constrained reaction coordinate dynamics for the simulation of rare events. Chem. Phys. Lett. 1989, 156, 472–477.
  • (5) Sprik M; Ciccotti G Free energy from constrained molecular dynamics. J. Chem. Phys. 1998, 109, 7737–7744.
  • (6) Darve E; Pohorille A Calculating free energies using average force. J. Chem. Phys. 2001, 115, 9169–9183.
  • (7) Laio A; Parrinello M Escaping free-energy minima. Proc. Natl. Acad. Sci. U. S. A. 2002, 99, 12562–12566.
  • (8) Rosso L; Mináry P; Zhu Z; Tuckerman ME On the use of the adiabatic molecular dynamics technique in the calculation of free energy profiles. J. Chem. Phys. 2002, 116, 4389–4402.
  • (9) Hénin J; Chipot C Overcoming free energy barriers using unconstrained molecular dynamics simulations. J. Chem. Phys. 2004, 121, 2904–2914.
  • (10) Maragliano L; Vanden-Eijnden E A temperature accelerated method for sampling free energy and determining reaction pathways in rare events simulations. Chem. Phys. Lett. 2006, 426, 168–175.
  • (11) Abrams JB; Tuckerman ME Efficient and Direct Generation of Multidimensional Free Energy Surfaces via Adiabatic Dynamics without Coordinate Transformations. J. Phys. Chem. B 2008, 112, 15742–15757.
  • (12) Abrams CF; Vanden-Eijnden E Large-scale conformational sampling of proteins using temperature-accelerated molecular dynamics. Proc. Natl. Acad. Sci. U. S. A. 2010, 107, 4961–4966.
  • (13) Dickson BM; Legoll F; Lelièvre T; Stoltz G; Fleurat-Lessard P Free Energy Calculations: An Efficient Adaptive Biasing Potential Method. J. Phys. Chem. B 2010, 114, 5823–5830.
  • (14) E W; Ren W; Vanden-Eijnden E String method for the study of rare events. Phys. Rev. B: Condens. Matter Mater. Phys. 2002, 66, 052301.
  • (15) E W; Ren W; Vanden-Eijnden E Transition pathways in complex systems: Reaction coordinates, isocommittor surfaces, and transition tubes. Chem. Phys. Lett. 2005, 413, 242–247.
  • (16) Zheng L; Chen M; Yang W Random walk in orthogonal space to achieve efficient free-energy simulation of complex systems. Proc. Natl. Acad. Sci. U. S. A. 2008, 105, 20227–20232.
  • (17) Zheng L; Chen M; Yang W Simultaneous escaping of explicit and hidden free energy barriers: Application of the orthogonal space random walk strategy in generalized ensemble based conformational sampling. J. Chem. Phys. 2009, 130, 234105.
  • (18) Zheng L; Yang W Practically efficient and robust free energy calculations: Double-integration orthogonal space tempering. J. Chem. Theory Comput. 2012, 8, 810–823.
  • (19) Sugita Y; Okamoto Y Replica-exchange molecular dynamics method for protein folding. Chem. Phys. Lett. 1999, 314, 141–151.
  • (20) Sugita Y; Kitao A; Okamoto Y Multidimensional replica-exchange method for free-energy calculations. J. Chem. Phys. 2000, 113, 6042–6051.
  • (21) Paz SA; Vanden-Eijnden E; Abrams CF Polymorphism at 129 dictates metastable conformations of the human prion protein N-terminal β-sheet. Chem. Sci. 2017, 8, 1225–1232.
  • (22) Prakash A; Fu CD; Bonomi M; Pfaendtner J Biasing Smarter, Not Harder, by Partitioning Collective Variables into Families in Parallel Bias Metadynamics. J. Chem. Theory Comput. 2018, 14, 4985–4990.
  • (23) Pfaendtner J; Bonomi M Efficient Sampling of High-Dimensional Free-Energy Landscapes with Parallel Bias Metadynamics. J. Chem. Theory Comput. 2015, 11, 5062–5067.
  • (24) Liu P; Kim B; Friesner RA; Berne B Replica exchange with solute tempering: A method for sampling biological systems in explicit water. Proc. Natl. Acad. Sci. U. S. A. 2005, 102, 13749–13754.
  • (25) Wang L; Friesner RA; Berne B Replica exchange with solute scaling: a more efficient version of replica exchange with solute tempering (REST2). J. Phys. Chem. B 2011, 115, 9431–9438.
  • (26) Terakawa T; Kameda T; Takada S On easy implementation of a variant of the replica exchange with solute tempering in GROMACS. J. Comput. Chem. 2011, 32, 1228–1234.
  • (27) Kamiya M; Sugita Y Flexible selection of the solute region in replica exchange with solute tempering: Application to protein-folding simulations. J. Chem. Phys. 2018, 149, 072304.
  • (28) Ren W; Vanden-Eijnden E; Maragakis P; E W Transition pathways in complex systems: Application of the finite-temperature string method to the alanine dipeptide. J. Chem. Phys. 2005, 123, 134109.
  • (29) E W; Ren W; Vanden-Eijnden E Finite temperature string method for the study of rare events. J. Phys. Chem. B 2005, 109, 6688–6693.
  • (30) Maragliano L; Vanden-Eijnden E On-the-fly String Method for Minimum Free Energy Paths Calculation. Chem. Phys. Lett. 2007, 446, 182–190.
  • (31) E W; Ren W; Vanden-Eijnden E Simplified and improved string method for computing the minimum energy paths in barrier-crossing events. J. Chem. Phys. 2007, 126, 164103.
  • (32) Vanden-Eijnden E; Venturoli M Revisiting the Finite Temperature String Method for the Calculation of Reaction Tubes and Free Energies. J. Chem. Phys. 2009, 130, 194103.
  • (33) Sanbonmatsu KY; García AE Structure of Met-enkephalin in explicit aqueous solution using replica exchange molecular dynamics. Proteins: Struct., Funct., Genet. 2002, 46, 225–234.
  • (34) Heyden A; Bell AT; Keil FJ Efficient methods for finding transition states in chemical reactions: Comparison of improved dimer method and partitioned rational function optimization method. J. Chem. Phys. 2005, 123, 224101.
  • (35) Wojtas-Niziurski W; Meng Y; Roux B; Bernèche S Self-learning adaptive umbrella sampling method for the determination of free energy landscapes in multiple dimensions. J. Chem. Theory Comput. 2013, 9, 1885–1895.
  • (36) Maragliano L; Vanden-Eijnden E A temperature accelerated method for sampling free energy and determining reaction pathways in rare events simulations. Chem. Phys. Lett. 2006, 426, 168–175.
  • (37) Abrams C; Bussi G Enhanced sampling in molecular dynamics using metadynamics, replica-exchange, and temperature-acceleration. Entropy 2014, 16, 163–199.
  • (38) Maragliano L; Fischer A; Vanden-Eijnden E; Ciccotti G String method in collective variables: Minimum free energy paths and isocommittor surfaces. J. Chem. Phys. 2006, 125, 024106.
  • (39) Shrivastav G; Vanden-Eijnden E; Abrams CF Mapping saddles and minima on free energy surfaces using multiple climbing strings. J. Chem. Phys. 2019, 151, 124112.
  • (40) Maragliano L; Vanden-Eijnden E On-the-fly string method for minimum free energy paths calculation. Chem. Phys. Lett. 2007, 446, 182–190.
  • (41) E W; Ren W; Vanden-Eijnden E String method for the study of rare events. Phys. Rev. B: Condens. Matter Mater. Phys. 2002, 66, 052301.
  • (42) E W; Ren W; Vanden-Eijnden E Finite Temperature String Method for the Study of Rare Events. J. Phys. Chem. B 2005, 109, 6688–6693.
  • (43) Ren W; Vanden-Eijnden E A climbing string method for saddle point search. J. Chem. Phys. 2013, 138, 134105.
  • (44) Tribello GA; Bonomi M; Branduardi D; Camilloni C; Bussi G PLUMED 2: New feathers for an old bird. Comput. Phys. Commun. 2014, 185, 604–613.
  • (45) Plimpton S Fast parallel algorithms for short-range molecular dynamics. J. Comput. Phys. 1995, 117, 1–19.
  • (46) Phillips JC; Braun R; Wang W; Gumbart J; Tajkhorshid E; Villa E; Chipot C; Skeel RD; Kalé L; Schulten K Scalable molecular dynamics with NAMD. J. Comput. Chem. 2005, 26, 1781–1802.
  • (47) Van Der Spoel D; Lindahl E; Hess B; Groenhof G; Mark AE; Berendsen HJ GROMACS: fast, flexible, and free. J. Comput. Chem. 2005, 26, 1701–1718.
  • (48) Foloppe N; MacKerell AD Jr. All-atom empirical force field for nucleic acids: I. Parameter optimization based on small molecule and condensed phase macromolecular target data. J. Comput. Chem. 2000, 21, 86–104.
  • (49) Hénin J; Fiorin G; Chipot C; Klein ML Exploring multidimensional free energy landscapes using time-dependent biases on collective variables. J. Chem. Theory Comput. 2010, 6, 35–47.
  • (50) Yu T-Q; Lu J; Abrams CF; Vanden-Eijnden E Multiscale implementation of infinite-swap replica exchange molecular dynamics. Proc. Natl. Acad. Sci. U. S. A. 2016, 113, 11744–11749.
