Abstract
A parallel implementation of the finite-temperature string method is described, which takes into account the invariance of coordinates with respect to rigid-body motions. The method is applied to the complex α-helix↔β-sheet transition in a β-hairpin miniprotein in implicit solvent, which exhibits much of the complexity of conformational changes in proteins. Two transition paths are considered, one derived from a linear interpolant between the endpoint structures and the other derived from a targeted dynamics simulation. Two methods for computing the conformational free energy (FE) along the string are compared, a restrained method, and a tessellation method introduced by E. Vanden-Eijnden and M. Venturoli [J. Chem. Phys. 130, 194103 (2009)]. It is found that obtaining meaningful free energy profiles using the present atom-based coordinates requires restricting sampling to a vicinity of the converged path, where the hyperplanar approximation to the isocommittor surface is sufficiently accurate. This sampling restriction can be easily achieved using restraints or constraints. The endpoint FE differences computed from the FE profiles are validated by comparison with previous calculations using a path-independent confinement method. The FE profiles are decomposed into the enthalpic and entropic contributions, and it is shown that the entropy difference contribution can be as large as 10 kcal/mol for intermediate regions along the path, compared to 15–20 kcal/mol for the enthalpy contribution. This result demonstrates that enthalpic barriers for transitions are offset by entropic contributions arising from the existence of different paths across a barrier. The possibility of using systematically coarse-grained representations of amino acids, in the spirit of multiple interaction site residue models, is proposed as a means to avoid ad hoc sampling restrictions to narrow transition tubes.
INTRODUCTION
Conformational transitions in proteins and DNA underlie many processes in biology, such as enzyme catalysis,1 DNA replication,2, 3 refolding of misfolded proteins,4 ATP synthesis and transport by molecular motors,5, 6 muscle contraction,7 and multidrug resistance.8 The modeling of conformational changes in biomolecules remains one of the most challenging problems in computational biophysics because transitions in proteins often require microseconds to milliseconds, while typical atomistic MD simulations of proteins are restricted to hundreds of nanoseconds. Although special-purpose hardware that accelerates MD simulations enough to study millisecond phenomena is becoming more readily available,9, 10, 11 many macromolecular processes remain beyond the reach of conventional MD.
To overcome the time-scale limitation of standard MD simulations, many enhanced sampling techniques have been developed, such as Umbrella Sampling (US),12, 13 Metadynamics,14 Transition Path Sampling (TPS),15, 16 Adaptive Biasing Force (ABF),17, 18 Adiabatic MD,19 Targeted MD,20 Temperature Accelerated (TA) MD,21 Milestoning,22, 23, 24 and the String Method.25, 26 Most of these methods require a priori specification of a reaction coordinate or of coarse-grained variables (e.g., US, metadynamics, ABF, TAMD, string method in collective variables26) which are not generally obvious for complicated molecular systems. TPS15, 16 can refine an initial path connecting two metastable states without the need to specify a reaction coordinate or collective variables by using a random walk in the space of transition paths. However, because TPS relies on the generation of unbiased trajectories by MD, it can be very costly to apply to large biomolecules. Targeted MD requires only the specification of the endpoint structures, and generates transition paths within a prescribed simulation time using constraints20, 27, 28 or restraints.29 However, because the dynamics in TMD are biased, the resulting transition paths can be far from equilibrium.30 Further discussion of the various sampling methods can be found in recent reviews.31, 32
In this study, we focus on another approach, the finite-temperature string method25, 33 (FTS), which was proposed to circumvent some of the requirements of the above sampling methods. FTS can be applied directly in Cartesian coordinates, without requiring additional coarse-grained variables or a reaction coordinate. FTS does not rely on the generation of unbiased trajectories, and is therefore expected to converge faster than TPS methods. In addition, free energy profiles of transitions can be calculated from the converged path in a straightforward way. FTS still requires the user to supply an initial path, which can be obtained from methods of varying computational complexity (e.g., interpolation, the nudged elastic band method,34 conjugate peak refinement,35 zero-temperature string (ZTS),36 TMD). This choice can have a significant effect on the converged path in systems with rugged energy landscapes (e.g., see Sec. 3 in the present paper).
We describe the application of the FTS to the α-helix↔β-sheet transition in a G-protein peptide.37 This model system possesses much of the complexity of realistic proteins, and is therefore used here to assess the suitability of using the FTS to study arbitrary conformational changes. In Sec. 2, we summarize the theory underlying the FTS, present a parallel implementation in rigid-body-invariant coordinates, and describe two techniques for computing the free energy profile from the string. In Sec. 3, we describe the transition paths resulting from the application of the FTS to the G-protein peptide, along with the profiles of free energy, enthalpy, and entropy. In Sec. 4, we discuss the limitations of applying the FTS in the given coordinates, and in Sec. 5, we provide a summary and directions for future work.
METHODS
Summary of the finite-temperature string method
A rigorous derivation of the FTS method from transition path theory (TPT)38 has been given in Refs. 25 and 38, 39, 40. Briefly, in TPT the central object is the committor function q, which corresponds to the solution to the backward Kolmogorov equation in probability theory.41 Given a continuous time-evolving stochastic system whose state is described by y, together with definitions of a reactant set and a product set, the committor corresponds to the probability that a system in state y will enter (commit to) the product set before reaching the reactant set. The committor function is therefore a natural measure of the reaction progress, and it has been used as a reaction coordinate in TPS simulations of various phenomena.15, 16, 42, 43, 44, 45 In the remainder of the paper, we will not make a distinction between the terms “committor” and “reaction coordinate.”
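To make the committor definition concrete, the following minimal sketch solves the discrete analog of the backward Kolmogorov equation for a nearest-neighbor random walk on a chain of sites. It is a toy illustration only, not part of the FTS method; the function name `committor_1d`, the site layout, and the tolerances are assumptions introduced here.

```python
def committor_1d(n_sites, p_right=0.5, tol=1e-12, max_sweeps=100000):
    """Committor q[i] for a nearest-neighbor random walk on sites 0..n_sites-1.

    Site 0 plays the role of the reactant set (q = 0) and the last site
    the product set (q = 1). Interior values satisfy the discrete
    backward equation
        q[i] = p_right * q[i+1] + (1 - p_right) * q[i-1],
    which is solved here by Gauss-Seidel sweeps.
    """
    q = [0.0] * n_sites
    q[-1] = 1.0
    for _ in range(max_sweeps):
        change = 0.0
        for i in range(1, n_sites - 1):
            new = p_right * q[i + 1] + (1.0 - p_right) * q[i - 1]
            change = max(change, abs(new - q[i]))
            q[i] = new
        if change < tol:
            break
    return q


# Unbiased walk on 11 sites: the committor grows linearly from 0 to 1.
q = committor_1d(11)
```

For the unbiased walk the committor is linear in the site index, so the midpoint site has equal probability of reaching either end first.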
The FTS can be derived from the backward Kolmogorov equation for Langevin dynamics on a potential energy surface () in the high-friction limit39 under the assumption that the isosurface q = q0 of the committor can be represented by a hyperplane, i.e.,
| (1) |
where δ is the Dirac distribution, is the Jacobian factor, is a unit normal to the hypersurface at q = q0, and ϕ is constrained to satisfy the equation
| (2) |
Z(q0) is the partition function corresponding to the isosurface and the subscript denotes canonical sampling restricted to the hyperplane at q = q0. Thus, ϕ(q0) is the average position in the hyperplane associated with q0, and the curve ϕ, defined as the string, is the average reaction path. Provided that the variance of the quantity () is not too large, ϕ also represents the dominant reaction path. With the definitions in Eqs. 1, 2 it can be shown that, for Langevin dynamics in the high-friction limit, the hyperplane normal is parallel to the tangent vector to the string,39
| (3) |
i.e., within the accuracy of the hyperplane approximation, the isosurfaces of the reaction coordinate can be obtained from the string directly.
The FTS is an iterative algorithm for solving a discretized version of Eqs. 2, 3.25, 33, 40 From an approximation to the string at iteration i (ϕi) one obtains ni using Eq. 3, which permits computing ϕi + 1 from Eq. 2. The procedure is repeated until ϕi does not change. The free energy can then be obtained by thermodynamic integration of the free energy derivatives sampled on the hyperplanes,25, 33 or by sampling a Voronoi tessellation.46 The FTS algorithm with Voronoi tessellations has also been applied to compute probability distributions and reactive fluxes in nonequilibrium steady-state problems.47, 48, 49
The original publications of the FTS25, 40 did not explicitly consider the rigid-body invariance of systems that are free to translate and rotate in space, which includes most macromolecules simulated by molecular dynamics (MD). In the first application of the string method to the model biomolecule alanine dipeptide,33 simple harmonic restraints were applied to keep the molecule from rotating or translating in space. Other applications of the string method50, 51 used “best-fit” transformations of the absolute coordinates to a frame of reference associated with a particular structure52, 53, 54, 55 (e.g., an X-ray or an energy-minimized structure of a protein) to represent the string of replicas. This approach dates back to the chain-of-states methods56, 57 such as the nudged elastic band,34, 58 or path functional minimization.59 It has been shown that the best-fitting is equivalent to the application of the Eckart conditions.30, 60, 61, 62, 63 Vanden-Eijnden and Venturoli46 described a tessellation approach to remove degeneracy due to translation and rotation using a minimal distance metric, computed after the instantaneous simulation coordinates are best-fit onto a string replica. While the tessellation approach appears technically simple, it is based on constrained dynamics, and requires modifying the MD integrator. It is therefore not portable between different MD programs, or even between different integrators within the same MD program. To address this difficulty, Maragliano, Vanden-Eijnden, and Roux64 presented a method to sample a Voronoi tessellation using one-sided restraint dynamics, and applied it to solvated alanine dipeptide in collective variable space.
The present implementation of the FTS described below combines the idea of the degenerate distance metric of Vanden-Eijnden and Venturoli46 with the idea of using restraints to compute one-dimensional free energy profiles.64 The approach uses restrained dynamics with a harmonic potential centered on each hypersurface, and is therefore expected to be easily portable between MD programs. Both the restrained sampling method and the tessellation method are used to calculate the free energy profiles. The results are compared in Sec. 3C. An extension of the tessellation sampling approach that does not involve modifications to the integrator is the weighted ensemble (WE) string method.65 In this method, sampling constraints are not enforced explicitly during the MD simulation. Instead, simulations corresponding to different cells in the tessellation are reinitialized periodically after a prescribed time interval τ to ensure uniform sampling of the tessellation. The efficiency of the WE method appears to be sensitive to the choice of the parameter τ, which should be greater than the trajectory decorrelation time of the system.65 However, a comparison between the WE method and the restrained method presented here is beyond the scope of this study.
There are other methods that account for the degeneracy due to rigid body motion by re-expressing atomic coordinates in the reference frame of the moment-of-inertia tensor.66, 67 These methods require additional care in systems for which the eigenvalues of the tensor are not well separated to avoid discontinuities in the eigenvectors during MD simulation.
Rotation-invariant FTS
In this subsection, we describe a rigid-body-invariant implementation of the FTS based on Eqs. 2, 3. Let be set of N coordinate triplets of the atoms that undergo a conformational transition, and , where mi are the atomic masses. Center-of-mass (COM) coordinates are defined by . Henceforth, we assume that all coordinates, including points along the string, have been shifted to the corresponding COMs. Defining a parameter α ∈ [0, 1], we represent the string as . The parametrization rather than is used because the values of the committor q0 can be obtained from the string only after it is converged.25
By analogy with Eqs. 1, 3, we assume that the isosurface (or level set) of the committor function at q = q0(α) can be approximated by hypersurfaces defined by
| (4) |
where with ATA = I is a rotation matrix for which the squared norm
| (5) |
is minimized over all possible rotations. A corresponds to the “best-fit” superposition of the image ϕ(α) onto the coordinates .68 (Conversely, AT corresponds to the best-fit of onto ϕ(α), i.e., =.) For brevity, we will sometimes omit the explicit dependence of A on and . Invariance with respect to rotations of is implicit in Eq. 4 because for any that satisfies Eq. 4 with some A = A*, and any rigid rotation operation R, will also satisfy Eq. 4 with A = RA*. The relationship between Eq. 4 and the distance metric used by Vanden-Eijnden and Venturoli46 is made clear in Sec. 2D in which the tessellation approach for computing free energy profiles is reviewed.
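The best-fit superposition described above can be sketched with the standard SVD-based (Kabsch) solution. This is an illustrative sketch under stated assumptions rather than the implementation used in the paper: NumPy is assumed to be available, the squared norm is assumed to be mass-weighted, and the names `shift_to_com` and `best_fit_rotation` are hypothetical.

```python
import numpy as np


def shift_to_com(coords, masses):
    """Shift (N, 3) coordinates so that the center of mass is at the origin."""
    m = np.asarray(masses, dtype=float)
    com = (m[:, None] * coords).sum(axis=0) / m.sum()
    return coords - com


def best_fit_rotation(x, phi, masses):
    """Rotation A (3x3, A^T A = I, det A = +1) minimizing
    sum_i m_i |x_i - A phi_i|^2, i.e., superposing the image phi onto x.

    Both coordinate sets are assumed to be COM-centered already.
    """
    w = np.asarray(masses, dtype=float)[:, None]
    c = (w * x).T @ phi                      # c = sum_i m_i x_i phi_i^T
    u, _, vt = np.linalg.svd(c)
    d = np.sign(np.linalg.det(u @ vt))       # exclude improper rotations
    return u @ np.diag([1.0, 1.0, d]) @ vt


# Toy check: rotate a random structure about z and recover the rotation.
rng = np.random.default_rng(0)
masses = np.ones(10)
phi = shift_to_com(rng.normal(size=(10, 3)), masses)
theta = 0.7
rot = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                [np.sin(theta), np.cos(theta), 0.0],
                [0.0, 0.0, 1.0]])
x = phi @ rot.T                              # x_i = rot @ phi_i
A = best_fit_rotation(x, phi, masses)
```

In this convention A rotates the string image onto the instantaneous coordinates, and its transpose performs the converse fit.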
The connection between Eq. 4 and the original hyperplanar ansatz in Eq. 1 can be seen by considering the change of variables and the associated mass-weighted metric tensor
| (6) |
It was shown in Ref. 26 that, for an arbitrary change of variables , the vector normal to the isocommittor surface transforms according to . The use of Eq. 4 is therefore justified when , i.e., when M is approximately equal to the diagonal inverse mass matrix. The metric tensor was computed for two systems, alanine dipeptide in vacuum, and a 16-residue β-hairpin peptide in implicit solvent. The details and results of the calculations are provided in the Appendix. The norm of the difference tensor = defined as
| (7) |
was ≃0.05 for the alanine dipeptide (which has 22 atoms in the all-atom representation, but N = 10 because hydrogen atoms were not treated as independent particles in the FTS), and ≃0.005 for a 16-residue β-hairpin peptide (which has 247 atoms, but only N = 107 atoms were selected as independent particles; see Sec. 3). Equation 4 is therefore a good approximation to the original condition Eq. 1, and is used in the present method to find an iterative solution to the string , up to a rigid-body rotation, i.e.,
| (8) |
We note that the use of the best fitting operation in Eq. 4 is equivalent to the construction of the Eckart reference frame60, 61, 69, 70, 71 for the approximate separation of rotational and vibrational motions in semi-rigid molecules. For the present calculations using the FTS it was found that some instantaneous molecular configurations () are distributed broadly around the string (), with the distances in Eq. 5 ∼3 Å (which underscores the importance of configurational entropy to the free energy of proteins in solution; see Sec. 3). Although such large values may invalidate the assumption of small vibrational displacements used in the Eckart construction, the relatively small norm of the difference tensor justifies the use of the Eckart frame.
Because practical use of the FTS requires a discrete representation of the string by a finite set of points (called “images” below), an additional condition must be imposed to maintain a prescribed numerical resolution. In the present case, we use
| (9) |
which corresponds to parametrization by arclength.40 The difference between Eq. 9 and the original constraint equation ( = 0)40 becomes nonzero for the discretized version of the string method, as can be seen by rewriting Eq. 9 as a finite-difference limit (and using ),
| (10) |
Thus, Eq. 9 fixes the distances between adjacent string images at Ldα, where L is string length, after each string image is best-fit to the preceding image.
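The equal-spacing condition can be illustrated by redistributing points along a piecewise-linear curve at equal arclength. The sketch below is a simplification: it assumes that the images have already been superposed onto one another, so that plain Euclidean distances are meaningful, and it omits the best-fit rotation entirely. The function name `reparametrize` is hypothetical.

```python
import math


def reparametrize(images, n_out=None):
    """Redistribute points along a piecewise-linear curve at equal arclength.

    images: list of points (each a list of floats); at least two are needed.
    Returns n_out points (default: same number as the input).
    """
    n_out = n_out or len(images)
    # Cumulative arclength at each input image.
    s = [0.0]
    for a, b in zip(images, images[1:]):
        s.append(s[-1] + math.dist(a, b))
    total = s[-1]
    out = []
    j = 0
    for k in range(n_out):
        target = total * k / (n_out - 1)
        # Advance to the segment [s[j], s[j+1]] that contains the target.
        while j < len(s) - 2 and s[j + 1] < target:
            j += 1
        seg = s[j + 1] - s[j]
        t = (target - s[j]) / seg if seg > 0 else 0.0
        out.append([(1 - t) * a + t * b
                    for a, b in zip(images[j], images[j + 1])])
    return out


# Unevenly spaced points on a straight line become evenly spaced.
new_images = reparametrize([[0.0, 0.0], [1.0, 0.0], [4.0, 0.0]])
```

On a curved path a single pass equalizes arclength along the old polyline, so the distances between the new images are only approximately equal; repeating the pass reduces the residual.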
Defining the unit normal = , the canonical average of a quantity restricted to the hypersurface = q0(α) (henceforth denoted by Sα) is
| (11) |
where is the potential energy, Z(α) is the partition function
| (12) |
and is the Jacobian associated with the change of variables . The Jacobian can be computed from the (unweighted) metric tensor determinant as , after the six rigid body degrees of freedom are accounted for, as done in the Appendix. Using the β-hairpin system as an example, it is shown that the Jacobian factor is nearly constant along the string, varying within about 3% of the average value. This observation is consistent with the fact that M is close to the identity, and suggests that the Jacobian can be replaced by a constant, which disappears from the configurational averages in Eq. 11. The resulting partition function Z(α) excludes translational and rotational contributions, which can be included approximately a posteriori (see Sec. 4). The small deviations of the Jacobian and the metric tensor from unity are caused by the inexact separability of rotational and internal degrees of freedom.60, 71, 72, 73
To emphasize the fact that the molecular system confined to the hypersurface described by Eq. 4 can rotate freely, we can switch to the (instantaneous) frame of by applying I = ATA inside Eq. 4 to obtain
| (13) |
where the third equality follows from the fact that and are both zero. (The first term is zero because AT′A is skew-symmetric; that the second term is zero can be seen by setting to zero the variation of the best-fit condition [Eq. 5] with respect to the infinitesimal rotation εAT′A, as done in Ref. 30.) Equation 13 shows that the hypersurface constraints (or restraints) are applied effectively after the string image is optimally rotated to overlap with the instantaneous coordinates . To compute the free energy associated with the hyperplane at α, i.e.,
| $F(\alpha) = -k_B T \ln Z(\alpha)$ (14) |
we define the auxiliary potential
| (15) |
The corresponding partition function and free energy are
| (16) |
| (17) |
Up to errors of order O(1/k),26
| (18) |
The right-hand side of Eq. 18 can be estimated from a molecular dynamics simulation using the auxiliary potential , i.e.,
| (19) |
The free energy F(α) can be computed using thermodynamic integration (TI) and Eq. 19 with a = ∂Fk/∂α, i.e.,
| (20) |
To evaluate dFk/dα, we first differentiate the scalar product
| (21) |
where we have defined a curvature projection C(α) and used parametrization by arclength Eq. 9. Previous implementations of the FTS25, 33 used localization assumptions, e.g., |C(α)| ≪ L, to neglect the curvature term in the force computation. The assumption is justified if either the curvature of the string, or the width of the transition tube are small. We have found that the curvature term gives rise to contributions of 1 kcal/mol to the free energy profile for the alanine dipeptide in vacuum (see the supplementary material74). Therefore, the term is retained in the present calculations.
Using Eq. 21, and defining ξ ≡ C(α)/L, we have
| (22) |
where the last integral represents an average over a MD trajectory of duration T performed with the potential .
Discretization of the FTS
In this section, we give discretized analogs of Eqs. 20, 21, 22 needed for practical FTS calculations. Let M correspond to the number of points in the discretization of the string. The discrete string then consists of M “images” indexed by M values of the parameter α, i.e., = . From Eq. 13, the hypersurface corresponding to the string image is defined by
| (23) |
The reason for using the form in Eq. 23 (i.e., differentiating A along with ) is that the free energy can then be computed without aligning any of the string images to each other, as will be seen below. However, a discretization based on Eq. 4 gives essentially the same results, provided that care is taken to superpose each string image onto the preceding image prior to computing the free energy. For an internal string image (i.e., with 1 < i < M), we define a finite difference approximation to the unit normal to Si using the neighboring images ,
| (24) |
where Ai = is the best-fit rotation matrix for the image . The unit normals associated with the endpoint string images and are defined using one-sided analogs of Eq. 24, i.e.,
| (25) |
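A finite-difference construction of this kind can be sketched as follows. The sketch assumes central differences for interior images and one-sided differences at the endpoints, and it assumes the images are already mutually aligned, so the best-fit rotation matrices are left out. The function name `unit_tangents` is hypothetical.

```python
import math


def unit_tangents(images):
    """Finite-difference unit tangents for a discretized string.

    Interior images use the two neighbors (central difference); the first
    and last images use their single neighbor (one-sided difference).
    """
    m = len(images)
    tangents = []
    for i in range(m):
        lo = max(i - 1, 0)
        hi = min(i + 1, m - 1)
        diff = [b - a for a, b in zip(images[lo], images[hi])]
        norm = math.sqrt(sum(d * d for d in diff))
        tangents.append([d / norm for d in diff])
    return tangents


# Three images on a quarter circle of unit radius.
c = math.sqrt(0.5)
tangents = unit_tangents([[1.0, 0.0], [c, c], [0.0, 1.0]])
```

For the middle image the central difference reproduces the exact tangent of the circle, while the endpoint tangents carry the larger error expected of a one-sided difference.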
To second-order accuracy in the distance between string images L/(M − 1), where L is the string length, we make the approximation
| (26) |
Using Eqs. 24, 26, Eq. 23 can be written as
| (27) |
where we have defined the variables εi to simplify notation below. The hypersurfaces associated with the endpoint string images and are defined using one-sided analogs of Eq. 27,
| (28) |
| (29) |
Introducing the constants
| (30) |
the finite-difference approximation to the canonical average in Eq. 11 at α = αi is
| (31) |
where = + (k/2), and Zi and are normalization constants.
The curvature correction ξ(α) is computed for internal string images with the help of the approximation
| (32) |
where we used for all 0 < i < M with dα ≈ 1/(M − 1). For the endpoint string images, the curvature correction was set to zero. From Eqs. 20, 22 and the trapezoidal rule, the free energy associated with the hypersurface Sj is computed as
| (33) |
where
| (34) |
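The trapezoidal integration of the free energy derivative can be sketched as below. The sketch assumes equally spaced images and takes the mean-force estimates as given; how they are obtained from the restrained simulations, including the curvature correction, is not reproduced here. The function name `integrate_free_energy` is hypothetical.

```python
def integrate_free_energy(mean_force, d_alpha):
    """Cumulative trapezoidal integral of dF/dalpha over the string images.

    mean_force[i] is the estimated free energy derivative at image i;
    the free energy of the first image is taken as the zero of the profile.
    """
    profile = [0.0]
    for a, b in zip(mean_force, mean_force[1:]):
        profile.append(profile[-1] + 0.5 * (a + b) * d_alpha)
    return profile


# Toy input: dF/dalpha = 2*alpha on five images, so F(alpha) = alpha**2.
alphas = [i / 4 for i in range(5)]
profile = integrate_free_energy([2.0 * a for a in alphas], d_alpha=0.25)
```

The trapezoidal rule is exact for a linear integrand, so the toy profile reproduces the quadratic free energy at every image.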
Free energy computation by tessellations
As mentioned in the Introduction, the free energy can also be evaluated by constructing a tessellation of the configurational space based on the discretized string.46 The tessellation can be sampled using either constrained46, 67, 75 or restrained64 dynamics, and the free energy of the tessellation can be obtained from the steady-state solution of the corresponding Master equation.46, 64, 67, 75 For the present calculations, the original constrained method was implemented, as follows. To each string image we assign a cell Bi, defined by
| (35) |
The equation for the boundary between the adjacent cells Bi, Bi+1, can be obtained by setting j = i+1, and rewriting the equality
| (36) |
where the last line corresponds to the second-order central finite-difference approximation to Eq. 23 at α = (αi+1+αi)/2. Equation 36, therefore, shows that boundaries between adjacent cells approximate isocommittor surfaces, as noted before.46, 67 The free energy of the tessellation can be computed by performing M independent MD simulations in the canonical ensemble, with the simulated systems constrained to remain inside the corresponding Bi. In the present calculations, the simple Brünger-Brooks-Karplus (BBK) Langevin Dynamics integrator76 is used (which is the default integrator in CHARMM77 and in NAMD78), with the following reflection rule to enforce the constraint:
| (37) |
where the term represents the propagation of the particle i over the period [t, t + δt]. The free energy of the tessellation can be obtained from the time record of reflections, as described before.75
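The cell assignment and the constraint can be sketched as follows. Two assumptions are made and should be checked against the original references: the distance is taken to be plain Euclidean (the best-fit alignment is omitted), and the reflection is assumed to restore the previous position and reverse the previous velocity. This is an assumed form, not a transcription of Eq. 37. The function names are hypothetical.

```python
import math


def nearest_image(x, images):
    """Index of the string image closest to x, i.e., the Voronoi cell of x."""
    return min(range(len(images)), key=lambda i: math.dist(x, images[i]))


def constrained_step(x, v, x_trial, v_trial, cell, images):
    """Accept a trial integrator step if it stays inside `cell`.

    Otherwise keep the old position and reverse the old velocity
    (assumed reflection rule). Returns (position, velocity, attempted),
    where `attempted` is the index of the cell the trial move tried to
    enter, or None if the move was accepted.
    """
    target = nearest_image(x_trial, images)
    if target == cell:
        return x_trial, v_trial, None
    return x, [-c for c in v], target


# One-dimensional toy string with three images.
images = [[0.0], [1.0], [2.0]]
accepted = constrained_step([0.4], [1.0], [0.45], [1.0], 0, images)
reflected = constrained_step([0.4], [1.0], [0.6], [1.0], 0, images)
```

Recording the `attempted` index at each reflection gives the time record of boundary collisions from which the free energy of the tessellation is computed.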
Limiting the width of the transition tube
A central assumption of the FTS (Eq. 1) is that isocommittor surfaces can be approximated by hyperplanes. While the orientation of the hyperplane (i.e., the normal ) is derived rigorously in transition path theory,38, 39 the distance from the string over which the hyperplane representation is reasonably accurate is unknown, and dependent on the details of the system. Even if a higher-order expansion of the isocommittor were available (e.g., with local curvature information), practical calculations would quickly become prohibitive in high-dimensional spaces because they would require multiplications of tensors of progressively higher rank. For these reasons it may be useful to impose additional restraints (or constraints) on the simulation system to keep it sufficiently close to the string, so that the hyperplane approximation is accurate. (This approach neglects some entropic contributions to the free energy profiles and therefore can lead to an increase in the calculated free energy barriers.)
We define the following optional restraint potential that limits the width of the transition tube to a prescribed value η0,
| (38) |
where ηi = and
| (39) |
is the component of [] orthogonal to the string at . Using Eqs. 24, 26, and the corresponding one-sided definitions for the endpoint images (Eqs. 28, 29), ηi and its derivatives can be computed with very little effort once εi and its derivatives are known (see the Appendix).
Tube width restrictions can also be implemented in the sampling of the tessellation defined in Subsection 2D by modifying the reflection rule (Eq. 37),
| (40) |
Tests of the effect of these restrictions can be made by varying the prescribed width parameter η0 in Eq. 38, as done in Sec. 3.
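A tube-width restraint of this type can be sketched as a penalty on the distance from the string. The functional form below (zero inside the tube, harmonic outside) is an assumption, since the expression in Eq. 38 is not reproduced here, and the best-fit rotation of the image onto the instantaneous coordinates is omitted. The function name `tube_restraint` and the argument names are hypothetical.

```python
import math


def tube_restraint(x, image, tangent, eta0, kappa):
    """Return (eta, energy) for a flat-bottom harmonic tube restraint.

    eta is the norm of the component of (x - image) orthogonal to the
    unit tangent of the string at the image. The energy is zero for
    eta <= eta0 and 0.5 * kappa * (eta - eta0)**2 otherwise (assumed form).
    """
    d = [a - b for a, b in zip(x, image)]
    along = sum(di * ti for di, ti in zip(d, tangent))
    perp = [di - along * ti for di, ti in zip(d, tangent)]
    eta = math.sqrt(sum(p * p for p in perp))
    excess = max(eta - eta0, 0.0)
    return eta, 0.5 * kappa * excess * excess


# A point 3 units off a string running along x, with a tube of width 2.
eta_out, e_out = tube_restraint([1.0, 3.0, 0.0], [0.0, 0.0, 0.0],
                                [1.0, 0.0, 0.0], eta0=2.0, kappa=10.0)
eta_in, e_in = tube_restraint([1.0, 1.0, 0.0], [0.0, 0.0, 0.0],
                              [1.0, 0.0, 0.0], eta0=2.0, kappa=10.0)
```

Because the penalty vanishes inside the tube, sampling close to the string is unaffected and only excursions beyond η0 are suppressed.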
Parallel implementation
The FTS requires an initial string , , which can be obtained from, e.g., a minimum energy path (MEP) or a biased dynamics trajectory (see Sec. 3). To each image we assign a separate CPU group I and a complete all-atom MD simulation system, denoted by , to be used for calculating the expectation values in Eq. 31.
Each CPU group is provided with the (neighbor) image(s) in addition to . A running average of is initialized as = . Starting at iteration n = 0, the following sequence of steps is repeated until the string is converged:
1. Sampling. Using the potential (and , if limiting the transition tube width is desired), each CPU group I performs a MD simulation with a time step Δt (typically 1–2 fs) for a prescribed simulation duration (where T is defined in Eq. 19, and is an equilibration period during which samples are discarded). The running average is updated after each m successive steps:
| (41) |
2. Reparametrization. The running average is reparametrized to enforce Eq. 9 using linear interpolation, as described in Refs. 26, 36, 67, except that each string image is first superposed onto the neighbor image consistent with the parametrization constraint (Eq. 9). Communication between different CPU groups is required for this step. As in the previous implementations,26, 36, 67 the reparametrization procedure is approximate and can be performed iteratively to achieve arbitrary accuracy. The reparametrization error is measured by the quantity
| (42) |
which is computed after superposition of each string image onto the preceding image, as prescribed by Eq. 10. In the present calculations, once the string reached steady state (see Convergence Test below), err was around 0.01 after a single iteration. With the distance between the adjacent images of about 0.25 Å, the 1% error corresponds to uncertainties in the string image positions of only ∼0.0025 Å.
3. Update. The string images are updated from the running average
| (43) |
4. Convergence test. Convergence of the string is checked by monitoring the norm
| (44) |
which corresponds to the root-mean-square distance through which the string images have moved at iteration n + 1. The loop is terminated after ρn reaches a plateau. Before returning to step 1, the neighboring images are communicated to each CPU group.
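Two bookkeeping pieces of the loop above, the running average of step 1 and the convergence measure of step 4, can be sketched as follows. Both forms are assumptions: the average is a plain cumulative mean, which may differ from the weighting in Eq. 41, and the displacement is averaged over images without any per-atom normalization or superposition. The function names are hypothetical.

```python
import math


def update_running_average(avg, sample, n_samples):
    """Fold one new sample into a mean that already holds n_samples samples."""
    w = 1.0 / (n_samples + 1)
    return [(1.0 - w) * a + w * s for a, s in zip(avg, sample)]


def rms_displacement(old_images, new_images):
    """Root-mean-square distance moved by the string images in one iteration."""
    sq = [math.dist(a, b) ** 2 for a, b in zip(old_images, new_images)]
    return math.sqrt(sum(sq) / len(sq))


# Running mean of the samples 1, 2, 3 for a single coordinate.
avg = [1.0]
avg = update_running_average(avg, [2.0], n_samples=1)
avg = update_running_average(avg, [3.0], n_samples=2)

# Two images, one of which moves by a distance of 5.
rho = rms_displacement([[0.0, 0.0], [0.0, 0.0]], [[3.0, 4.0], [0.0, 0.0]])
```

Monitoring `rho` over successive iterations and stopping when it levels off corresponds to the plateau criterion of step 4.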
Because step 1 of the above algorithm does not involve communication between different CPU groups, high parallel efficiency is achieved when the MD sampling time per iteration is long, e.g., 1 ps. Since conformational transitions of many macromolecules are slow (typical time scales of 1 μs–1 ms), such sampling times are often required.
After the string has converged, the free energy profile is computed from additional MD simulations (which can be performed in parallel or in series) using Eqs. 33, 34. An advantage of performing the simulations in parallel is that the Hamiltonian replica exchange method (HREM)79 can be used to accelerate convergence of the expectation values, as follows.
After a prescribed number of iterations, we attempt to switch the atomic coordinates of the MD replicas and that correspond to the CPU groups I and , respectively,
| (45) |
where i is chosen randomly from 0, 1, …, M − 1. The acceptance probability of the move in Eq. 45 is computed according to the Metropolis criterion:
| (46) |
where
| (47) |
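The exchange test can be sketched with the standard Metropolis criterion for swapping coordinates between two Hamiltonians. The energy bookkeeping below is the generic form of Hamiltonian replica exchange and is assumed, not confirmed, to match Eqs. 46 and 47. The function and argument names are hypothetical.

```python
import math
import random


def exchange_probability(u_ii, u_jj, u_ij, u_ji, beta):
    """Metropolis acceptance probability for swapping replicas i and j.

    u_ab is the energy of the coordinates currently held by replica b,
    evaluated with the (restrained) potential of replica a.
    beta = 1 / (k_B T) in units consistent with the energies.
    """
    delta = (u_ij + u_ji) - (u_ii + u_jj)
    return 1.0 if delta <= 0.0 else math.exp(-beta * delta)


def attempt_exchange(u_ii, u_jj, u_ij, u_ji, beta, rng=random):
    """Return True if the coordinate swap is accepted."""
    return rng.random() < exchange_probability(u_ii, u_jj, u_ij, u_ji, beta)


# A swap that lowers the total energy is always accepted;
# one that raises it by 2/beta is accepted with probability exp(-2).
p_down = exchange_probability(1.0, 1.0, 0.5, 0.5, beta=1.0)
p_up = exchange_probability(0.0, 0.0, 1.0, 1.0, beta=1.0)
```

Only the restraint terms differ between neighboring replicas, so in practice the four energies reduce to differences of the restraint potentials evaluated on the two coordinate sets.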
The performance of the present implementation of the FTS is described in Sec. 3 for a β-sheet miniprotein, and in the supplementary material74 for the alanine dipeptide.
RESULTS
In this section, we describe in detail the application of the rotation-invariant FTS to the β-sheet↔α-helix conformational transition of the 16-residue fragment of streptococcal protein G37 (sequence 41-GEWTYDDATKTFTVTE-56). This mini-protein has been used for testing the suitability of enhanced sampling methods to biomolecules.80, 81, 82 In addition, the differences in the free energy, the enthalpy, and the entropy between the two conformations were recently computed using a path-independent confinement method,83 and are used here to validate the results of the present calculations. The results are organized as follows. In Sec. 3A, we describe the generation of initial paths for the FTS. The application of the FTS to obtain converged transition paths is presented in Sec. 3B. Profiles of the free energy computed by the restrained and tessellation methods are described in Sec. 3C. We conclude the section with a description of the transition paths corresponding to the equilibrated paths in Sec. 3D. The FTS was also applied to the alanine dipeptide (AD) in vacuum as a model system. The results for AD are presented in the supplementary material.74
Initial minimum-energy paths
The preparation of the simulation structures is discussed in detail in Ref. 83, and is summarized only briefly here. A modified version of the program CHARMM77, 84 version c37a1 was used in all calculations. The essential modifications involved implementing the restraint potentials Uk and Vκ, and an inter-processor communication scheme to collect string image coordinates necessary for reparametrization (see Sec. 2). The Hamiltonian replica exchange method was based on the ENSEMBLE module in CHARMM. The modifications will be included in an upcoming release of CHARMM.
The β-sheet conformation was taken from the X-ray structure of protein G,37 and the α-helical conformation was generated from default internal coordinate entries in CHARMM,85 except for the backbone ϕ and ψ dihedrals, which were set to −57° and −47°, respectively. The all-atom force field with the CMAP correction85 was used to represent the polypeptide, and the FACTS solvation model86 was used to approximate the effects of solvent. The structures were subjected to 2000 steps of Adopted Basis Newton-Raphson (ABNR) minimization. The minimization of the α-helical structure was performed in the presence of harmonic restraints on the backbone dihedrals to maintain the α-helical conformation. The minimized structures were then equilibrated at 300 K in a 1 ns MD simulation with harmonic restraints on the backbone atom positions. An additional equilibration for 30 ns was performed without restraints. The α and β conformations after equilibration are shown in Fig. 1. The N-terminal turn of the α-helix conformation unwinds (Fig. 1a), but the rest of the helix is stable for the 30 ns MD simulation. In particular, the average and the standard deviation values of the helix backbone dihedral angles of res. 47–56 (i.e., excluding the partially unwound N-terminus) computed from the simulation were (ϕ, ψ) = (−63 ± 7°, −43 ± 7°). These values are within the rather broad α-helical region of the Ramachandran plot87, 88, 89, 90 (for example, Best et al.90 use the limits [−100° < ϕ < −30°] × [−67° < ψ < −7°]). We note that slightly different backbone dihedral values would result with the CHARMM36 force field, which includes a reoptimized backbone CMAP correction.90
Figure 1.
Equilibrated structures of the 16-residue peptide 41-GEWTYDDATKTFTVTE-56 from protein G in (a) α-helical conformation, (b) β-sheet conformation. The N-terminal domain in panel (a) is at the top. Residue side chain atoms that are excluded from the definition of the string are drawn in black lines. Most of the excluded atoms are related by two-fold symmetry (i.e., CD and CE atoms in Phe and Tyr, terminal oxygens in Asp, Glu and in the C-terminal carboxyl group) and the remaining atoms belong to the aromatic rings of Trp (with the exception of CH). Further details are given in the text.
Transitions to the α form have been observed in previous MD simulations in implicit solvent starting from the β-sheet structure, and metastable states with significant helical content have been found in explicit solvent simulations (see review by Zhou91). Whether the α-helical conformation of the β-hairpin peptide actually plays a physical role is unclear because solution experiments have not found evidence for appreciable α-helical content.92
To initialize the FTS algorithm, two different MEPs were generated. The ZTS36 method was used to generate the first path using six cycles of interpolation and minimization, as described previously.67 In each cycle, a linear interpolation in Cartesian space that doubles the number of replicas was performed, followed by one thousand iterations of the ZTS method. Each iteration consists of 20 steps of steepest-descent (SD) minimization followed by a reparametrization step, to preserve equal distances between adjacent replicas. Because interpolation in Cartesian coordinates can lead to distortions in the local geometry (e.g., incorrect bond lengths, angles, or overlapping atoms), intermediate replicas may exhibit chirality errors. To maintain the correct (L-) amino-acid stereochemistry, the proton Hα on the Cα carbon was occasionally inverted through the plane normal to the Cα–Hα bond. Similarly, to maintain the correct stereochemistry of the Thr Cβ center, the proton Hβ on the Cβ carbon was occasionally inverted through the plane normal to the Cβ–Hβ bond. To prevent the occurrence of cis-peptide bonds during minimization, weak dihedral-angle restraints centered on π rad with force constant Kω = 2 kcal/mol/rad² were added to each peptide bond (ω-dihedral angle). The destabilizing effects of chirality errors and cis-peptide bonds on the structure of small α-helical proteins in MD simulations were discussed recently by Schreiner et al.93 The final 128-replica path was subjected to 25 000 additional ZTS iterations. The ZTS iterations are computationally inexpensive because they are dominated by SD minimization, and because each replica is assigned to a separate processor. The final ZTS minimization required approximately 1.5 h on 32 quad-core 2.4 GHz Opteron processors.
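The interpolation and reparametrization steps described above can be illustrated with a minimal sketch. It is not the CHARMM implementation: images are treated as plain coordinate tuples, the energy minimization between reparametrizations is omitted, and the function names are hypothetical. The sketch also assumes that the doubling starts from the two endpoint structures, which is consistent with six doublings yielding 128 replicas but is not stated explicitly in the text.

```python
import math

def arc_length_params(images):
    """Cumulative arc length of a piecewise-linear path, normalized to [0, 1].

    Assumes at least two images that are not all identical.
    """
    s = [0.0]
    for a, b in zip(images, images[1:]):
        s.append(s[-1] + math.dist(a, b))
    total = s[-1]
    return [x / total for x in s]

def reparametrize(images, n_new=None):
    """Place n_new images at equal arc-length spacing by linear interpolation.

    With n_new = 2 * len(images) this also serves as the replica-doubling step.
    """
    n_new = n_new or len(images)
    s = arc_length_params(images)
    out = []
    j = 0  # index of the segment [s[j], s[j+1]] containing the target
    for i in range(n_new):
        target = i / (n_new - 1)
        while j < len(s) - 2 and s[j + 1] < target:
            j += 1
        seg = s[j + 1] - s[j]
        w = 0.0 if seg == 0.0 else (target - s[j]) / seg
        out.append(tuple((1 - w) * p + w * q
                         for p, q in zip(images[j], images[j + 1])))
    return out

# Six doublings of a two-image path (hypothetical 2D endpoints).
path = [(0.0, 0.0), (3.0, 4.0)]
for _ in range(6):
    path = reparametrize(path, 2 * len(path))
print(len(path))
```

In an actual ZTS cycle, each call to `reparametrize` would be preceded by the steepest-descent steps on every replica; only the geometric part is shown here.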
The second path was generated by a variant of Targeted Molecular Dynamics20, 27, 28 (TMD) using restraints,29 as implemented in Ref. 30. A 2 ns Restrained TMD (RTMD) simulation was carried out, starting from the α-helical structure, and targeting the β-sheet structure. The initial RMSD between the two structures (≃9 Å) was reduced to the final RMSD value (which was below 1 Å) in ΔRMSD increments of 10⁻⁵ Å per timestep (2 fs, with constraints on bond lengths involving hydrogens94). The target β-sheet structure was appended as the final point in the trajectory to ensure that the endpoints in the two initial paths were identical, and the resulting path was reduced to 128 replicas using linear interpolation. The reduced path was minimized using the ZTS method, as described for the final stage of the ZTS path above.
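As a consistency check on the RTMD schedule, the number of timesteps needed for the RMSD ramp can be estimated from the quantities quoted above. The sketch below assumes a final RMSD of exactly 1 Å (the text states only that it was below 1 Å) and a constant decrement per step; the function name is hypothetical.

```python
def rtmd_ramp(rmsd_start, rmsd_end, delta_per_step, timestep_fs):
    """Return (number of MD steps, duration in ns) of a linear RMSD ramp."""
    steps = round((rmsd_start - rmsd_end) / delta_per_step)
    duration_ns = steps * timestep_fs * 1e-6  # 1 ns = 1e6 fs
    return steps, duration_ns

# Values from the text; rmsd_end = 1.0 is an assumed upper bound.
steps, duration_ns = rtmd_ramp(9.0, 1.0, 1e-5, 2.0)
print(steps, duration_ns)
```

Under these assumptions the ramp takes about 8 × 10⁵ steps (roughly 1.6 ns), which fits within the 2 ns simulation.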
The potential energy profiles for the ZTS and RTMD paths are shown in Fig. 2 as a function of the normalized string length (α). The contributions to the potential energy for each MEP are listed in Table 1 (each value represents an average over the profile). From Fig. 2, we see that the initial MEP obtained from RTMD has significantly lower energies (the maximum is ∼60 kcal/mol at α ≃ 0.82) compared to the MEP obtained using ZTS (with a maximum of ∼75 kcal/mol at α ≃ 0.45). For this system, the TMD method appears to produce lower-energy transition paths than the ZTS. The contributions from the different energy terms77, 86 listed in Table 1 differ between ZTS and RTMD mainly in the electrostatic and the polar solvation energies (computed using the FACTS model). In particular, the ZTS path has markedly higher electrostatic energy, which is offset by the lower polar solvation energy. The significance of the polar solvation term in the ZTS path suggests that the path proceeds through a more extensive exposure of the charged residues to the solvent, and the loss of favorable intramolecular interactions is compensated for by a gain in interactions with the solvent. To investigate this hypothesis quantitatively, we decomposed the polar solvation free energy into residue contributions, as implemented in the FACTS model in CHARMM, and compared the solvation energy profiles for the two paths for three different cases: all residues included, charged residues only (G41 [charged N-terminus], E42, D46, D47, K50, E56), and neutral residues only (W43, T44, Y45, A48, T49, T51, F52, T53, V54, T55). The comparison is shown in Fig. S5a in the supplementary material.74 If the total polar solvation energy includes only the neutral residues, the profile energies for the two paths have similar magnitudes. If only the charged residues are included, the profiles differ by amounts similar to those found when all the residues are included.
Examination of the motions of the charged residues along the transition paths was consistent with this analysis. For example, the number of hydrogen bonds (HBs) involving at least one sidechain atom of a charged residue follows a similar trend as the polar solvation energy profile (see Fig. S5b in the supplementary material74), with the solvation energy being lower (i.e., more favorable) for replicas with fewer hydrogen bonds. (Following DeLoof, Nilsson, and Rigler,95 who evaluated several HB identification criteria, we used the “COOR HBOND” facility in CHARMM with the simple criterion that the hydrogen-acceptor distance must not be greater than 2.4 Å.) Structures illustrating the differences in the hydrogen bonding involving charged residues are shown in Fig. S5, panels (c) and (d) of the supplementary material.74
Figure 2.
Potential energy profiles along the MEPs that connect the α-helix and β-sheet conformations. Open symbols correspond to the MEPs from which the FTS iterations were initiated, and filled symbols correspond to the MEPs that were obtained by a minimization of the final equilibrated paths after the FTS converged (see text for details); —○— ZTS (initial); —•— ZTS (final); —□— RTMD (initial); —■— RTMD (final). The abscissa corresponds to the normalized string length (α). The dotted horizontal lines are drawn at the vertical locations of the energy maxima of the paths after minimization.
Table 1.
Profile-average energy components for the MEPs in Fig. 2. The energy values corresponding to the first image (α = 0) were subtracted from each average. The last row corresponds to the sum of the electrostatic energy and the polar solvation energy.
| Energy contribution | ZTS (initial) | ZTS (final) | RTMD (initial) | RTMD (final) |
|---|---|---|---|---|
| Total | 45.49 | 28.21 | 31.85 | 24.03 |
| Bond | −0.41 | 0.16 | −0.25 | −0.30 |
| Angle | 2.14 | 0.98 | 1.70 | 1.52 |
| Dihedral | 11.55 | 13.25 | 13.88 | 12.87 |
| Improper | −0.39 | −0.45 | −0.70 | −0.42 |
| Van der Waals | −0.89 | −1.75 | −1.12 | −5.65 |
| Electrostatic | 94.08 | 55.81 | 10.73 | 1.53 |
| Solvation (polar) | −57.20 | −34.85 | 14.54 | 20.28 |
| Solvation (nonpolar) | 0.54 | −1.06 | −0.28 | −0.87 |
| Elec. + Solv.(p.) | 36.88 | 20.95 | 25.28 | 21.81 |
The differences in the residue motions that are responsible for the different electrostatic and polar solvation energies can be understood by considering the methods (RTMD and ZTS) used to obtain the initial paths. The RTMD path was obtained from an MD trajectory biased to evolve continuously from one endpoint to another along the potential energy landscape. Since the MD forces generally act in the direction opposite to the energy gradient vector (the thermostat produces additional small random forces), the simulated system resists overcoming energetic barriers, including the breaking of hydrogen bonds. In the ZTS method, the initial path is obtained by interpolation between the endpoint states, which does not take into account energies, possibly generating high-energy configurations. Although the subsequent minimization steps in the ZTS rapidly reduce high energies due to distorted bonds, angles, torsions, and overlapping atoms, they generally do not restore broken hydrogen bonds or salt bridges if the separation between the groups involved is large. In such cases, the electrostatic interactions are “screened” by the implicit solvation model, with a corresponding decrease in the polar solvation energy.
It is noteworthy that the average sum of the electrostatic and polar solvation energies has very similar values in the final ZTS and RTMD paths. As the above discussion suggests, the energetic compensation occurs because the electrostatic energy between protein atoms is computed in vacuum (i.e., with the dielectric of the medium set everywhere to unity), and is therefore overestimated. The FACTS solvation model essentially subtracts the overestimate and effectively shields the electrostatic interactions, as would be expected for a protein in solution. Profiles of the electrostatic energies, polar solvation energies, and their sum are shown in Fig. S6 of the supplementary material.74 In addition, the difference between the energy maxima along the profiles corresponding to the annealed paths (i.e., after the application of FTS; see horizontal lines in Fig. 2) is only about 1.5 kcal/mol, even though the paths exhibit different sequences of events, and the maxima correspond to different α values. This fact underscores the importance of finite-temperature equilibration for comparing different MEPs.
Further discussion of the paths is deferred to Sec. 3D, in which the transition mechanisms are described along with the corresponding free energy profiles.
Path optimization using FTS
Starting from the MEPs described in Sec. 3A, equilibrated transition paths were computed using the FTS, as follows. To each of the 128 images we assigned a separate MD simulation system running on 4 processor cores, for a total of 512 cores per FTS simulation. The initial MD coordinates (as well as the initial string images) were taken from the MEPs computed above. Hydrogen atoms were not included in the definition of the string (equivalently, the weights wi corresponding to the hydrogen atoms were zero). For reasons of computational efficiency, certain heavy side-chain atoms were also excluded from the string definition. These are (i) CD and CE carbon atoms in residues Phe and Tyr, (ii) terminal OD and OE oxygen atoms in Asp, Glu, and in the C-terminal carboxyl group, and (iii) atoms from the aromatic rings of Trp, with the exception of CH. The excluded atoms are illustrated using black lines in Fig. 1. For convenience, the (N) atoms treated explicitly by the FTS are henceforth called reaction atoms, and the remaining atoms are called bath atoms. Reducing the number of reaction atoms decreases the number of rotation matrix derivatives needed to evaluate atomic forces and free energy derivatives, which must be calculated at each step of the MD calculation. In addition, removing ring and symmetric atoms prevents the path from lengthening artificially, e.g., if a ring were to flip during string equilibration. (Although ring rotation is often a rare event in equilibrium MD simulations, it is more likely to occur for high-energy intermediates along the string during evolution.)
The above choice of the reaction atoms assumes implicitly that the transition can be described accurately using only the positions of these atoms (after rigid body motions are removed). In principle, this assumption can be validated by an a posteriori committor test.16, 26, 43, 44 However, the full committor test can be impractical for diffusive or multi-barrier transitions, in which very long integration times are required for trajectories to commit to a basin. Using Markovian milestoning,67, 75 we estimated the mean first passage times of the transition to be ⩾7 ms (see Fig. S8 in the supplementary material74). Given the ruggedness of the computed free energy profiles (discussed in Subsection 3C), the full committor test would require integration times that are not affordable. We therefore performed a partial committor test for the dominant free energy barrier in the ZTS path, in which trajectories were only required to commit to the metastable states on either side of the barrier. Trajectories were launched from the two states along the reaction path adjacent to the highest free energy barrier. We found that 22% and 78% of the trajectories launched from the state closer to the product side terminate on the reactant and product sides, respectively, compared to 99% and 1% of the trajectories that were launched from the state closer to the reactant side. These values indicate that the transition state for the barrier (for which the “splitting” probability would be 50%/50%) is located between the two chosen states, as expected. The details of the calculation are given in Sec. SIV of the supplementary material.74
Each replica was simulated in the canonical ensemble at 300 K using the Langevin dynamics thermostat coupled to heavy atoms with a friction constant of 1 ps⁻¹. Covalent bonds to hydrogen atoms were constrained with SHAKE,94 and the FACTS implicit solvent model was used.86 The same simulation conditions were used previously to calculate free energy differences by a confinement method.83 The approximate time required to integrate 1 ns of MD was 2.85 CPU-core-hours per replica using a multiprocessor cluster equipped with quad-core AMD Opteron CPUs.
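The resource figures quoted in this subsection can be combined into a rough cost estimate for one FTS simulation. The sketch below is illustrative only: it assumes that the quoted rate of 2.85 core-hours per ns per replica applies unchanged to the FTS runs, that each run extends to ∼120 ns, and that parallel scaling over the 4 cores per image is ideal. The function name is hypothetical.

```python
def fts_cost(n_images=128, cores_per_image=4,
             core_hours_per_ns=2.85, length_ns=120.0):
    """Return (total cores, total core-hours, ideal wall-clock hours)."""
    total_cores = n_images * cores_per_image
    core_hours = n_images * core_hours_per_ns * length_ns
    wall_hours = core_hours / total_cores  # assumes ideal parallel scaling
    return total_cores, core_hours, wall_hours

total_cores, core_hours, wall_hours = fts_cost()
print(total_cores, round(core_hours), round(wall_hours, 1))
```

Under these assumptions, one path optimization uses 512 cores for approximately 85 wall-clock hours, or about 4.4 × 10⁴ core-hours.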
The parameters k, κ, and η0 in the FTS restraint potentials Uk and Vκ (Eqs. 15, 38, respectively) were selected empirically as follows. As done in previous studies,30, 33, 57, 96 we employed a potential Vκ to limit the width of the transition tube around the string. The tube width parameter η0 was set to 1 Å for all images, which was roughly equal to the RMS fluctuations computed from unrestrained endpoint MD trajectories. The tube width can also be estimated a posteriori (see Sec. 4). We note that the value of the tube width restraint can be important for the calculation of free energy profiles (discussed in Subsection 3C), but should not have a significant effect on the converged path itself.
The restraint force constants k and κ were set to 10 kcal/mol/Å² and 20 kcal/mol/Å², respectively. The value of k is sufficiently high that the free energy profiles computed by the restrained method were in good agreement with those obtained by the tessellation method, yet low enough to achieve significant replica exchange acceptance frequencies during FE computation (discussed in Sec. 3C). The value of κ was not optimized. However, unless κ is taken to be very low, its value is expected to have a much smaller effect on the free energy profile than that of the tube width parameter η0.
Starting from each MEP, the string was evolved from the initial condition until convergence, as follows. The running averages for the internal string images were updated every 10 MD steps; after every 1000 MD steps, the averages were reparametrized and the string images were updated. The string image update was performed gradually using linear interpolation over 100 MD steps to avoid numerical instability. The running averages were not updated during this equilibration interval. The endpoint string images were not updated to facilitate comparisons between the free energy (FE) differences obtained from different paths (i.e., the endpoints are exactly the same).
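The update schedule described above can be summarized with a short bookkeeping sketch. It assumes that the 100-step interpolation window falls within the 1000-step update cycle (consistent with 500 reparametrizations per ns at a 2 fs timestep, but not stated explicitly), and it counts events only; no MD or averaging is performed. The function names are hypothetical.

```python
def fts_schedule(n_steps, avg_every=10, update_every=1000, blend_steps=100):
    """Count running-average updates and reparametrizations over n_steps MD steps."""
    n_avg = n_reparam = 0
    blend_left = 0  # MD steps remaining in the gradual image update
    for step in range(1, n_steps + 1):
        if blend_left > 0:
            blend_left -= 1          # images are being interpolated; no averaging
        elif step % avg_every == 0:
            n_avg += 1               # accumulate a sample into the running average
        if step % update_every == 0:
            n_reparam += 1           # reparametrize averages, start image update
            blend_left = blend_steps
    return n_avg, n_reparam

def blended_image(old, new, m, blend_steps=100):
    """String image after m of blend_steps interpolation steps (linear in m)."""
    lam = m / blend_steps
    return [(1 - lam) * o + lam * n for o, n in zip(old, new)]

# 1 ns of MD at a 2 fs timestep corresponds to 500 000 steps.
n_avg, n_reparam = fts_schedule(500_000)
print(n_avg, n_reparam)
```

With these settings, each full update cycle after the first contributes 90 samples to the running averages, since the 10 sampling points that fall inside the interpolation window are skipped.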
Convergence of the paths was followed by monitoring the quantity ρ(t) (Eq. 44). The behavior of ρ is shown as a function of simulation time in Fig. 3. Although convergence is reached after ∼60 ns for each path, the simulations were continued to ∼120 ns to ascertain that there was no systematic change in the evolution of ρ. For each simulation, the path RMSD between the initial and final paths was ∼2.5 Å (which is relatively small compared to an RMSD of ∼9 Å between the endpoint states), indicating that the string images do not travel very far from their initial positions, as was previously observed for simulations of myosin VI using the string method in collective variables.67 (Further discussion is deferred until Sec. 4.) The calculation of FE profiles is discussed next.
Figure 3.
Convergence of the FTS simulations monitored by the evolution of the distance ρ. —○— ZTS path evolution; —□— RTMD path evolution; the locations at which the string images were fixed for the calculation of the free energy profiles are marked by solid vertical lines for the ZTS path (t1 = 84 ns and t2 = 121 ns), and by a dashed line for the RTMD path (t1 = 107 ns). The data are plotted in 1 ns increments, so that each interval corresponds to 500 reparametrizations.
Calculation of free energy profiles
FE profiles were computed for the instantaneous realizations of the strings marked by vertical lines in Fig. 3. To determine the sensitivity of the free energy profiles to dynamical fluctuations in the string coordinates within the converged region (i.e., after ∼60 ns of simulation time), two time points corresponding to t1 = 84 ns and t2 = 121 ns were taken from the ZTS path. In principle, the time point at which the free energy profile is computed should not matter, as long as the corresponding string images are taken from the converged region, in which the fluctuations are small and random. We observed, however, that the computed profiles differ quantitatively, as described further in this subsection. For the RTMD path, a single realization of the string was taken at t = 107 ns.
The FE calculations performed with the tessellation (T) method, ZT1 and RT1 (Table 2), do not have any adjustable parameters, and we take the corresponding profiles as a reference for testing the restrained (R) method, which requires values for the force constants and for the transition tube width. Although the tessellation method may be implemented with tube width constraints (see Sec. 2), including such constraints did not make a significant difference for the present calculations.
Table 2.
Summary of FE simulations performed in this study.
| Index | Path | Method (R/T)¹ | t (ns) | k, κ (kcal/mol/Å²), η0 (Å) | ΔFα→β (kcal/mol) |
|---|---|---|---|---|---|
| ZT1² | ZTS(t2) | T | 20 | n/a | −8.25 |
| ZR1 | ZTS(t2) | R | 20 | 10, 20, 0.5 | −8.4 ± 1.5 |
| ZR2 | ZTS(t2) | R | 20 | 10, 20, 1 | −2 ± 3 |
| ZR3 | ZTS(t2) | R | 10 | 3, 20, 0.5 | −5.6 ± 1.0 |
| ZR4 | ZTS(t2) | R | 20 | 10, 0, ∞ | −1.6 ± 1.3 |
| ZR5 | ZTS(t1) | R | 15 | 10, 20, 0.5 | −6.1 ± 1.8 |
| RT1² | RTMD | T | 20 | n/a | −8.3 |
| RR1 | RTMD | R | 20 | 10, 20, 0.5 | −8.2 ± 2.4 |
| SCM | | | | | −6.7 ± 0.4 |

¹ R and T refer to the Restrained and Tessellation methods for the calculation of FE profiles. SCM refers to the Simplified Confinement Method described in Ref. 83.
² The parameters k, κ, and η0 are not relevant for the T method. In simulation ZR4, tube-width restraints were not used. ΔF is the free energy difference between the β-sheet and α-helix conformations.
The parameters of the FE simulations are summarized in Table 2. The FE differences between the β and α conformations (ΔFα→β) computed from each profile are included. The value for ΔFα→β computed by a path independent confinement method83 is also listed. The restrained FE calculations were performed with the HREM to accelerate convergence, as described in Sec. 2 and discussed previously.67 The exchange of adjacent MD replicas was attempted after 500 successive MD steps, and the average acceptance ratio was ∼40% for the calculations (ZR1 and RR1) as discussed below.
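For orientation, the exchange step in Hamiltonian replica exchange between two replicas at the same temperature that differ only in their restraint potentials is commonly decided by a Metropolis test. The sketch below shows that standard criterion for a one-dimensional harmonic restraint; it is not taken from the present implementation, and the restraint form, the scalar coordinate, the function names, and the example centers and positions are all assumptions made for illustration.

```python
import math

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def harmonic(k, center, x):
    """Harmonic restraint energy (k/2)(x - center)^2 in kcal/mol."""
    return 0.5 * k * (x - center) ** 2

def exchange_probability(k, c_i, c_j, x_i, x_j, temperature=300.0):
    """Metropolis acceptance probability for swapping configurations x_i and x_j
    between replicas restrained at centers c_i and c_j."""
    beta = 1.0 / (KB * temperature)
    delta = (harmonic(k, c_i, x_j) + harmonic(k, c_j, x_i)
             - harmonic(k, c_i, x_i) - harmonic(k, c_j, x_j))
    if delta <= 0.0:
        return 1.0
    return math.exp(-beta * delta)

# Hypothetical neighbors 0.3 Å apart, each sitting at its own restraint center.
p_k10 = exchange_probability(10.0, 0.0, 0.3, 0.0, 0.3)
p_k3 = exchange_probability(3.0, 0.0, 0.3, 0.0, 0.3)
print(p_k10 < p_k3)
```

In this toy setting a lower force constant gives a higher acceptance probability for the same pair of configurations, which is the qualitative trade-off considered when choosing k.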
The best match between the FE profiles computed using the T and R methods is obtained with the reference tube width η0 set to 0.5 Å for both paths. The profile comparison is shown in Fig. 4. Although the profiles differ at multiple locations by several kcal/mol (e.g., α = 0.3–0.4 in panel (a)), the agreement is fair, given the differences in the sampling regions between the restrained and tessellation methods. (While the R method uses a uniform η0 to sample the hyperplanes, in the T method the tube width is set implicitly to the sizes of the cells in the tessellation.) The profiles in Fig. 4 show similar ΔFα→β ≈ −8.3 kcal/mol, although the uncertainties for the restrained method are significant (≃2 kcal/mol). Given the uncertainty of the FTS methods, the value ΔFα→β ≈ −8.3 kcal/mol is also within the error limits of the value ΔFα→β ≈ −6.7 ± 0.4 kcal/mol obtained by the (very different) SCM method.83 The profiles corresponding to the remaining cases in Table 2 are discussed next.
Figure 4.

Free energy profiles of transition (ΔF = F(α)-F0) computed with the restrained (——) and tessellation (– – –) methods. (a) Initial path obtained using the linear interpolation combined with zero-temperature string (ZTS); (b) initial path obtained from a RTMD simulation and refined using ZTS.
The effect of the tube width η0 on the FE profiles is illustrated in Fig. 5a. In the absence of the restraint (η0 = ∞), the FE profile fluctuates within ∼3 kcal/mol around zero without displaying significant reaction barriers. This behavior is expected if the reaction coordinate isosurfaces are not very accurate, e.g., if they do not follow the actual FE barriers, as discussed by others.16, 43 In the present calculation, some instantaneous configurations were more than 4 Å away from the string, where the hyperplane approximation is likely to be inaccurate. With η0 = 1 Å (ZR2), the FE profile looks similar to that from simulation ZR1 (η0 = 0.5). Because of the larger tube width, however, the convergence rate of the FE derivatives was slower, resulting in higher standard errors. This behavior was also observed in previous string calculations on the protein calmodulin,30 in which a similarly narrow transition tube (η = 0.75 Å) was used to improve convergence of the FE profiles.
Figure 5.

(a) The influence of the transition tube width restraint (η0) on the FE profile obtained using the restrained method. (b) Illustration of the effect of lowering the restraint force constant k (—— vs. — · —), and of the dependence of the FE profile on the time point from which the path is taken (—— vs. – – –). The initial MEP was generated using the ZTS method and linear interpolation.
The effect of lowering the restraint force constant k on the FE profile is demonstrated in Fig. 5b. As noted previously,67 the effect of an insufficiently high force constant is to smooth the energy landscape by a convolution with a multivariate Gaussian with variances (βk)⁻¹. The FE profile corresponding to the force constant k = 3 (ZR3) has lower FE barriers, and a lower ΔFα→β, as expected from a smoothed landscape. We also note that the calculation ZR3 converged more rapidly than the other calculations because lower force constants increased the replica exchange frequency (to ∼70%), as observed previously.67
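To give a sense of scale for the smoothing, the standard deviation of the Gaussian kernel, (βk)^(−1/2), can be evaluated at 300 K for the two force constants used here. This is a minimal sketch; the function name is hypothetical, and force constants are taken in kcal/mol/Å².

```python
import math

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def smoothing_width(k, temperature=300.0):
    """Standard deviation (in Å) of a Gaussian with variance (beta*k)^-1."""
    return math.sqrt(KB * temperature / k)

sigma_k10 = smoothing_width(10.0)
sigma_k3 = smoothing_width(3.0)
print(round(sigma_k10, 2), round(sigma_k3, 2))
```

The kernel width is approximately 0.24 Å for k = 10 and 0.45 Å for k = 3, i.e., the lower force constant smooths the landscape over nearly twice the distance.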
Figure 5b also illustrates the effect of relatively minor variations in the string on the details of the FE profile. Simulations ZR1 and ZR5 use the instantaneous string obtained after 121 ns and 84 ns, respectively, and use identical simulation parameters. Both of the time instants are sampled after the RMS distance from the initial MEP ρ reaches a plateau (∼60 ns in Fig. 3). However, small differences between the paths accumulate over the intervening 37 ns despite only small changes in ρ(t), and result in differences in the FE profiles. The profiles are similar qualitatively, in that the locations of the FE minima and FE barriers are approximately preserved, but differ quantitatively. To understand the source of the differences, we computed the RMS difference between the reaction atoms in the corresponding images for the two paths. The average distance was ∼0.7 Å. The largest difference (∼1.5 Å) was found near α ∼ 0.8. However, the difference in this region was below 1 Å if the images corresponding to 121 ns were shifted forward by ∼0.07 (in terms of α). This suggests that, in the time interval t = (84 ns, 121 ns), the parametrization of the images at α ∼ 0.8 changed to α ∼ 0.73, i.e., the images were reindexed closer to the reactant basin, but did not undergo a significant structural change. This analysis (see also Sec. 4) reveals the limitations of the atom-based parametrization chosen for this study because of the high sensitivity of the string parametrization to small changes in the structure. This observation was an important motivation for the introduction of the string method in collective variables (SMCV),26 developed for use with coarse-grained collective variables. Suitable coarse-grained variables, however, are not generally known for complex transitions, and for such cases (as for the case at hand), the present FTS method is the more appropriate choice, in spite of its limitations.
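The image-shift comparison described above can be sketched as a search over index offsets. The example below is a simplified illustration, not the analysis code used in the study: it assumes that the two paths have already been superposed (no rigid-body alignment is performed), represents each image as a list of (x, y, z) tuples, and uses hypothetical function names and synthetic test paths.

```python
import math

def rms_difference(a, b):
    """RMS difference per atom between two images (lists of (x, y, z) tuples)."""
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(a, b)) / len(a))

def best_index_shift(path_a, path_b, i, max_shift):
    """Offset s in [-max_shift, max_shift] that minimizes the RMS difference
    between path_a[i] and path_b[i + s]."""
    candidates = [s for s in range(-max_shift, max_shift + 1)
                  if 0 <= i + s < len(path_b)]
    return min(candidates,
               key=lambda s: rms_difference(path_a[i], path_b[i + s]))

# Synthetic one-atom paths: path_a is path_b reindexed by three images.
path_b = [[(float(j), 0.0, 0.0)] for j in range(20)]
path_a = [[(float(j + 3), 0.0, 0.0)] for j in range(20)]
shift = best_index_shift(path_a, path_b, 5, 5)

# With 128 images, adjacent images are separated by 1/127 in alpha.
images_per_shift = round(0.07 * 127)
print(shift, images_per_shift)
```

For a 128-image string, a shift of ∼0.07 in α corresponds to about nine images.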
We conclude this section with a decomposition of the free energy profiles into the enthalpic and entropic contributions, shown in Fig. 6. The enthalpy profiles were computed by averaging the potential energy in the tessellation calculations ZT1 and RT1, and the entropic profile was obtained by subtracting from it the FE of the tessellation. While the enthalpy profile provides the larger contribution to the FE (15–20 kcal/mol), it is clear that the entropic “offset” is also significant, reaching values of ∼10 kcal/mol at some locations. For example, in Fig. 6a, the entropy term for α = 0.7 is ≃8 kcal/mol higher than that of the α-helix, and ≃13 kcal/mol higher than that of the β-sheet. It is noteworthy that the entropy of intermediate states tends to be higher than that of the endpoint states, suggesting the importance of multiple transition pathways. Furthermore, peaks in the entropy tend to accompany enthalpic peaks, thereby reducing the overall free energy barrier. This observation suggests the importance of enthalpy-entropy compensation in protein conformational transitions, which has been observed experimentally in protein-ligand association.97, 98, 99 Specifically, the binding of a ligand lowers the enthalpy of the bound complex, compared to the unbound form, but does so at the expense of (1) removing the entropy in the rotational and translational degrees of freedom (DOFs) of the ligand that are eliminated in the binding (the internal DOFs created by the formation of the complex contain significantly less entropy because the ligand is effectively immobilized in the protein binding site100, 101) and (2) reducing the conformational entropy of the protein residues involved in the binding, whose mobility is lower due to the favorable enthalpy of binding.
In the case of conformational transitions, the protein must proceed through intermediate (higher-energy) states to switch between (low-energy) endpoint states; when the number of possible intermediate configurations is large, compared to those for the endpoints, the enthalpic barriers will be partially offset by entropic effects, as observed in Fig. 6.
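The decomposition used in this subsection can be written compactly as follows. The notation is an assumption introduced here for illustration: ΔE(α) denotes the average potential energy in the tessellation cell at string position α relative to the first image, and ΔF(α) is the corresponding free energy from the tessellation calculation.

```latex
\begin{align}
  \Delta F(\alpha) &= \Delta E(\alpha) - T\,\Delta S(\alpha), \\
  T\,\Delta S(\alpha) &= \Delta E(\alpha) - \Delta F(\alpha),
  \qquad
  \Delta E(\alpha) = \langle E \rangle_{\alpha} - \langle E \rangle_{0} .
\end{align}
```

In this form, a positive entropic term TΔS(α) at the location of an enthalpic peak lowers ΔF(α) relative to ΔE(α), which corresponds to the partial offset of the enthalpic barriers described in the text.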
Figure 6.

Decomposition of the FE profiles computed with the tessellation method (········) into enthalpic (——) and entropic (– – –) contributions. (a) Simulation ZT1; (b) simulation RT1.
Illustration of transition paths
To illustrate the transition, eight string images were selected from each path. The coordinates of the reaction atoms are shown in the α → β direction in Figures 7a (Multimedia view) and 7b (Multimedia view) for the ZTS and RTMD paths, respectively. The converged paths differ quantitatively in terms of the associated FE profiles (see Fig. 4), but also qualitatively in terms of the sequence of atomic motions. The first FE barrier along the ZTS path occurs for α ≃ 0.12 (Fig. 4a), corresponding to the unwinding of the first helical loop at the N-terminal end, which requires the breaking of intra-helical hydrogen bonds. The next FE barrier at α ≃ 0.22 corresponds to twisting motions within the partially unwound loop, which results in a nearly straight β-strand (α ≃ 0.37). The dominant FE barrier of the ZTS transition path (ΔFE ≃14 kcal/mol) corresponds to the formation of the hairpin (images at α ≃ 0.37 and α ≃ 0.47). It is noteworthy that this event also corresponds to the highest barrier in the RTMD path, although in that path it occurs at a much later stage (α ≃ 0.88 in Fig. 4b). The images corresponding to α ≃ 0.47 and α ≃ 0.59 are the most significant metastable states along the ZTS-derived path, with the FE profile indicating barriers of ∼7 kcal/mol on either side of the minima. The images correspond to “hybrid” structures in which the N-terminal half of the protein is a β-strand, and the C-terminal end has 1–2 turns of the α-helix. The final barrier corresponds to the unwinding of the remaining α-helical turn at α ≃ 0.72, which is followed by intra-strand hydrogen bond formation that leads to the β-sheet.
Figure 7.
Snapshots from finite temperature paths that illustrate the transition mechanism. (a) Initial path obtained using the linear interpolation combined with ZTS, t = 121 ns; (b) initial path obtained from a RTMD simulation and refined with ZTS, t = 107 ns. The peptide backbone is drawn in ribbon representation, and the reaction atoms are drawn in ball-and-stick representation. Forming or breaking hydrogen bonds are indicated by dotted lines. For each step, regions of the protein that are undergoing significant structural changes are highlighted with a green circle. Bath atoms are not shown because they are not part of the string definition (their string coordinates are undefined). All of the atoms are shown in Fig. 1. (Multimedia view)
The sequence of transition events along the RTMD path is different, and is consistent with previous analyses of paths generated by TMD,28, 102, 103, 104 in which large-scale rearrangements precede small-scale motions.30 In particular, the initial stage of transition is dominated by a large displacement of the N-terminal end toward the C-terminus (α ≃ 0 to α ≃ 0.38), and the final stage (α ⩾ 0.88) consists of local adjustments to the conformation of the hairpin head. The transition begins with the unwinding of the first N-terminal helical loop (α ≃ 0.19) in a similar fashion and with a similar FE barrier as observed in the ZTS path, except that in this case the loop rapidly moves toward its final position in the β-hairpin. The C-terminal helical turn unwinds next, and the N-terminus moves further toward the C-terminus to form the initial hydrogen bond between the strands of the emerging hairpin (α ≃ 0.25 to α ≃ 0.6). These events are accompanied by a gradual increase in the FE along the profile (Fig. 4b) with rather shallow metastable states (barriers of 5 kcal/mol or less). The most metastable state is structurally similar to the image at α ≃ 0.59, and shows a hybrid structure, with the interior of the peptide sequence α-helical, and the ends resembling a β-sheet. The dominant FE barrier along this path occurs near α ≃ 0.88, and corresponds to the formation of the correct hairpin.
DISCUSSION
Calculating free energy profiles for conformational changes in proteins from molecular simulations is challenging because an appropriate reaction coordinate is difficult to determine. In the present study, we used the finite temperature string method (FTS) in rotation-invariant coordinates to approximate the reaction coordinate isosurfaces for a 16-residue β-hairpin. Free energy profiles were computed either by sampling the hyperplanes directly using restrained dynamics, or by constructing tessellations of the configurational space associated with the string. A central assumption in the FTS is that the most probable transition paths connecting reactants and products are clustered together in relatively narrow transition tubes in configurational space. Conceptually, this assumption implies that all transition events occur in a similar fashion, and therefore can be represented in an average sense, i.e., by a “string” of images that correspond to typical configurations at each stage of the reaction. Mathematically, the narrow-tube assumption is used to justify approximating the reaction coordinate isosurfaces by hyperplanes in the vicinity of a maximum in the density of reactive trajectories.25, 26, 40, 67 This approximation is crucial as it makes sampling the reaction coordinate isosurfaces computationally tractable. 
Qualitative support for the assumption can be found in the micro- to millisecond long MD trajectories obtained by Shaw and co-workers, who observed that most folding events for the WW domain Fip35 occurred essentially in the same order, and that the conformational transitions in the protein BPTI proceed via a small number of structurally distinct states.105, 106 An independent analysis of the Fip35 folding trajectories105 using Markov state modeling (MSM) concluded that there are multiple pathways to the native structure,107 in accord with earlier MSM predictions based on short MD simulations.108 Nevertheless, the folding pathway observed by Shaw et al.105 was found to carry the majority of the reactive flux, indicating that the other pathways are less important kinetically. Generally, the accuracy of the narrow tube assumption in the FTS method will depend on the specific system under study; the FTS method employed here will be most useful for transitions for which the narrow-tube assumption is accurate.
Quantitatively, the transition tube width is difficult to estimate because it depends on the coordinates chosen to represent the actual transition path ensemble (here, the string). In the present calculations, we used restraint potentials to investigate the effect of an imposed tube width on the FE profile, and found a strong dependence (Fig. 5a). To obtain agreement between the FE profiles computed by tessellation and those computed using restraints, it was necessary to limit MD sampling in the restrained calculations to relatively narrow tubes of radius ≃0.5 Å around the string. This value also gives the approximate range of validity of the hyperplane approximation (in the present atom-based coordinates). In the absence of tube width restraints, the restrained method produced an unrealistic FE profile with small barriers (Fig. 5a), as would be expected for an incorrect reaction coordinate.16, 43 Furthermore, using the instantaneous configurations from MD sampling of the tessellation in simulations ZT1 and RT1 (Table 2), the distances from the string to the MD replicas were found to vary from 0.5 Å to 2.5 Å along the transition path, roughly consistent with the value 0.5 Å used in the restrained simulations. The strong dependence of the FE profiles on the imposed tube width indicates that the narrow-tube assumption is inaccurate for some transitions (and/or some variables) and requires additional tube restraints to avoid sampling incorrect regions. Because the tessellation method prevents the crossing of hyperplanes by construction, it appears to be technically superior to the restraint method, which requires (ad hoc) specification of tube restraints. However, because neither method can guarantee the validity of the hyperplane approximation away from the string, comparing the FE profiles obtained from both methods, as done here, is useful for qualitative validation.
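The effect of a tube-width restraint can be sketched with a flat-bottom harmonic potential on the distance η = ‖v⊥‖ from the string, where v⊥ is the component of the displacement perpendicular to the local string tangent. The functional form, the force constant `k`, and the function name `tube_restraint` are assumptions made for illustration; coordinates are treated as flat vectors, and the rotation-matrix derivative terms that arise with best-fit superposition are deliberately left out.

```python
import numpy as np


def tube_restraint(x, image, tangent, radius=0.5, k=10.0):
    """Flat-bottom harmonic restraint on the distance from the string.

    x, image, tangent: flat coordinate vectors of equal length (Angstrom).
    radius: tube radius in Angstrom; k: hypothetical force constant in
    kcal/mol/Angstrom^2. Returns (energy, force on x).
    """
    t = tangent / np.linalg.norm(tangent)    # unit tangent to the string
    v = x - image                            # displacement from the image
    v_perp = v - np.dot(v, t) * t            # remove the along-string part
    eta = np.linalg.norm(v_perp)             # eta = ||v_perp||
    if eta <= radius:
        return 0.0, np.zeros_like(v)         # inside the tube: no bias
    energy = 0.5 * k * (eta - radius) ** 2
    force = -k * (eta - radius) * v_perp / eta   # -dU/dx, since d(eta)/dx = v_perp/eta
    return energy, force
```

Because the restraint vanishes inside the tube, it leaves sampling near the string unbiased and only penalizes excursions beyond the chosen radius, which is the quantity that must be set by trial and error.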
The requirement of a narrow tube restraint found in this study reflects, in part, the difficulty of finding suitable coordinates for coarse-graining the energy landscape. In a previous study, we considered several types of collective variables (CVs) to represent the string for a transition in the converter domain of Myosin VI.67 In that case, however, the choice of CVs was simplified because the differences in the endpoint conformations could be described by reorientations of α-helices and a β-sheet, connected by short loops. More generally, for transitions that involve complex motions of many atoms, such as the protein secondary structure changes considered here, finding appropriate coarse-grained CVs can be very difficult.
The choice of coordinates to represent the string also determines the dimensionality of the FE landscape. A larger number of coordinates will generally produce a more rugged landscape with more possible transition paths, which also require a finer discretization. In the present case, the FTS simulations were initialized from two different MEPs, each exhibiting a different order of conformational changes. Each FTS simulation converged in the vicinity of the initial path (RMS distance of ∼2.5 Å, see Fig. 3). The two very distinct paths had similar free energy profiles, indicating that both paths should be taken into account in describing the ensemble of actual transition paths. Because the free energy profile, per se, does not include information on the reaction rate, we estimated the mean first passage times (MFPT) from the α-helical state to the β-sheet state along the two paths using Markovian milestoning.95 The MFPTs were found to be ≃7 ms and ≃200 ms for the RTMD and ZTS paths, respectively, suggesting that the RTMD path carries more reactive flux. This result is also consistent with the fact that the computed FE barrier for the RTMD path is slightly lower than that for the ZTS path, although the difference must be interpreted in light of the ∼2 kcal/mol uncertainty in the FE profile values. The existence of the two distinct transition paths described here suggests that a larger number of paths (in the present string coordinates) could be required to characterize the transition completely. Alternatively, or in addition, there may exist paths not found by the present approach, along which the free energy barriers are lower, which carry more reactive flux, and which would be more kinetically significant than the paths described here. Clearly, the need to account for multiple paths remains a challenge in the study of protein folding and transitions by all presently available methods, including the FTS.
An important conclusion from the enthalpy/entropy decomposition of the FE profiles (Fig. 6) is that entropic differences can offset the enthalpic barriers in conformational transitions by as much as 10 kcal/mol. This implies that in the vicinity of energy barriers, the actual transition trajectories can follow a large number of paths. For the present β-hairpin system this is consistent with partially unfolded configurations (at e.g., α ≃ 0.45 or α ≃ 0.7 in Figs. 6a, 7a). In addition, the contributions from the entropy may actually be underestimated because of the limits on the transition tube width (which are implicit in the tessellation calculation).
A related point concerns the differences between minimum energies and free energies computed along each transition tube. A comparison between the profiles in Fig. 6 and the minimum energy profiles annealed from the string images in Fig. 2 shows that the values along the minimum energy profiles can be 20–30 kcal/mol higher than those of the free energy. This further highlights the fact that regions of configurational space large enough to capture entropic contributions to the free energy, rather than single-point configurations, should be considered when evaluating the stability of a particular state.
Finally, to the extent that the rigid-body motions are separable from the internal configurational motions, the present FTS algorithm computes only the internal part of the free energy, since rigid-body motions are not restrained. While the translational contribution to the total partition function is a constant that is proportional to the concentration, the rotational contribution will be variable along the string, because it is proportional to the square root of the inertia tensor determinant, so that an extended system is entropically favored over a compact system. To estimate the rotational contribution to the total free energy, we computed the inertial tensor determinants for the string corresponding to simulation ZTS(t2) (see the supplementary material74). The β-sheet conformation is favored over the α-helical conformation by about 0.1 kcal/mol, and the intermediate configurations around α = 0.25 are disfavored relative to the α-helix by about 0.12 kcal/mol. These trends are qualitatively consistent with the snapshots in Fig. 7a, in which the β-sheet configurations are the most extended, and the transitional structures around α = 0.22 are the most compact. The magnitude of the rotational corrections, however, is about an order of magnitude less than the standard error in the configurational free energy, and the corrections are therefore not significant for the present system.
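The size of such a rotational correction can be estimated with a short sketch. Since the rotational partition function is proportional to the square root of the inertia tensor determinant, the free energy difference between two conformations a and b is −(k_BT/2) ln(det I_b / det I_a). The function names and the temperature of 300 K below are assumptions for illustration, not values taken from the simulations.

```python
import numpy as np

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def inertia_tensor(coords, masses):
    """Inertia tensor of an (N, 3) coordinate set about its center of mass."""
    r = coords - np.average(coords, axis=0, weights=masses)
    r2 = np.sum(r**2, axis=1)
    # I = sum_i m_i (|r_i|^2 * 1 - r_i r_i^T)
    return np.sum(masses * r2) * np.eye(3) - (masses[:, None] * r).T @ r


def rotational_free_energy_difference(coords_a, coords_b, masses,
                                      temperature=300.0):
    """F_rot(b) - F_rot(a) in kcal/mol, using Z_rot proportional to sqrt(det I)."""
    _, logdet_a = np.linalg.slogdet(inertia_tensor(coords_a, masses))
    _, logdet_b = np.linalg.slogdet(inertia_tensor(coords_b, masses))
    return -0.5 * KB * temperature * (logdet_b - logdet_a)
```

A negative return value means that conformation b is rotationally favored, which is the case whenever b is more extended (larger det I) than a.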
CONCLUSION
We described a parallel implementation of the FTS method in rotation-invariant and translation-invariant coordinates that uses the degenerate distance metric of Vanden-Eijnden and Venturoli46 with hypersurface restraints to compute one-dimensional free energy profiles. The main difference between the present approach and earlier FTS applications to rigid-body invariant systems50, 51 is that coordinates are implicitly transformed to the reference frames of the corresponding string images, eliminating the need for arbitrarily predefined reference structures. The method was applied to the β-sheet↔α-helix transition in a 16-residue miniprotein to calculate transition tubes and free energy profiles associated with two distinct initial paths. The computed free energy differences were found to be within the uncertainty limits of the value obtained previously from a confinement free energy analysis.83 The present results demonstrate the utility of the FTS for computing transition paths and the associated free energy profiles in complex biomolecular systems. In particular, the transition considered here involves a complete change of the protein secondary structure, for which it is difficult to find a reaction coordinate,43, 44, 45 or even a suitable set of collective variables. The presence of many reaction barriers in the free energy profiles (e.g., Fig. 4) would also make it nontrivial to use methods such as TPS,15, 16 as each barrier would require a separate application.109 Thus, the FTS appears to be well-suited for obtaining transition paths and free energies in complex systems such as the one examined here. A technical caveat associated with the calculation of the free energy profiles concerns the accuracy of the hyperplane approximation implicit in the FTS.
In particular, FE profiles computed using a method based on harmonic restraints to the hyperplane require the additional specification of a tube width restraint, whose reference value is generally determined by trial and error. The dependence of the computed free energy profiles on the tube width implies that the hyperplane approximation is not very accurate in regions farther than ∼1 Å away from the string, using the atom-based coordinates employed here. A possible direction for the construction of coarser-grained coordinates better suited for generic protein transitions could start from bead models, in the spirit of the moderately coarse-grained residue representation used in the MARTINI force field,110 with multiple interaction sites per residue, and proceed systematically toward further coarse-graining, using, e.g., a single interaction site model.111, 112 The use of such coarse-graining schemes would require only minor modifications to the FTS implementation outlined here. At each stage of coarse-graining, the FE profile computed from the reduced set of coordinates can be validated against those computed using finer sets. These explorations are left to a future study.
ACKNOWLEDGMENTS
The authors are grateful to Eric Vanden-Eijnden for a critical reading of the paper and for suggestions that improved the final version. V.O. is grateful to Arnaud Blondel for a discussion of amino acid stereochemistry in transition path optimization. Computer resources were provided by the National Energy Research Scientific Computing Center, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231, and by the Faculty of Arts and Sciences Computing Group at Harvard. The work done at Harvard was supported in part by the National Institutes of Health.
APPENDIX: CALCULATION OF THE METRIC TENSOR AND PROJECTION DERIVATIVES
First, we compute the metric tensor defined by Eq. 6. Noting that the derivatives with respect to the absolute coordinates () are related to those with respect to the coordinates relative to the COM by = , and using the identity30, we obtain
| (A1) |
where summation over the indices d and e is implied, and the transposition operator is omitted from A for brevity. The first term on the right-hand side of Eq. A1 is the inverse mass tensor, and the remaining terms constitute the difference tensor (defined in Eq. 7). Figure 8 shows the normalized tensor computed for the β-hairpin molecule, averaged over the 128 replicas in the converged path ZTS(t2) (see Table 2). The average deviation from zero is ≃0.005, and the highest deviation is ≃0.045. The largest contributions to the deviations arise from the last term in Eq. A1. Because the term contributes only to the diagonals of the 3 × 3 coordinate submatrices, the tensor has a “grainy” appearance. The corresponding tensor for the alanine dipeptide is shown in the supplementary material.74
Figure 8.
Normalized deviation tensor for the β-hairpin miniprotein (see Eq. A1). The filled contours are shown as a series of adjacent 3 × 3 submatrices, i.e., the component indices a and b vary faster than the coordinate indices i and j in the horizontal and vertical directions, respectively.
For a regular coordinate transformation, the Jacobian can be computed from the (non-mass-weighted) metric tensor using $J = (\det M)^{1/2}$. However, the transformation is not invertible, because any rotation and/or translation of the absolute coordinates maps to the same (invariant) coordinates, and M in this case is singular with a zero determinant. The degeneracy of M can be removed by computing the “configurational determinant,” i.e., omitting the six lowest eigenvalues of M, which correspond to rotation and translation and are nearly zero, with the remaining eigenvalues close to unity. This procedure is analogous to normal mode calculations, in which the six lowest-frequency normal modes are excluded from analysis because they correspond to rigid motions.113 The Jacobian is then calculated as
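$$J = \left( \prod_{i=7}^{3N} \lambda_i \right)^{1/2}, \qquad \lambda_1 \le \lambda_2 \le \cdots \le \lambda_{3N} \qquad \text{(A2)}$$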
where the λi are the eigenvalues of M computed with the masses set to unity to avoid mass-weighting, and N is the number of string atoms. Figure 9 shows the variation of J(α) along the string. Both an instantaneous curve and a curve averaged over the 20 ns ZT1 trajectory (see Table 2) are plotted. The Jacobian fluctuates within about 3% of the average value, and it is expected that the fluctuations would make negligible contributions to the free energy profile. For example, $k_B T \ln 1.03 \approx 0.017$ kcal/mol, which is well below the standard errors reported in Table 2. Therefore, the omission of the Jacobian factor from Eq. 11 is fully justified.
Figure 9.
Jacobian of the transformation normalized by its average over the string in simulation ZT1; (– – –) instantaneous curve computed at the beginning of the simulation; (——) simulation average.
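The configurational determinant described above can be sketched numerically as follows. Because the metric tensor itself depends on equations not reproduced in this sketch, a hypothetical stand-in is used: the identity minus the projector onto the six rigid-body directions, which has six (near-)zero eigenvalues and all others equal to one. The function names and the temperature of 300 K are assumptions.

```python
import numpy as np

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def configurational_jacobian(metric, n_rigid=6):
    """sqrt of the product of eigenvalues of a symmetric metric tensor,
    omitting the n_rigid lowest ones (translation and rotation)."""
    eigenvalues = np.linalg.eigvalsh(metric)   # returned in ascending order
    kept = eigenvalues[n_rigid:]
    return float(np.exp(0.5 * np.sum(np.log(kept))))  # log-sum for stability


def rigid_body_projector(coords):
    """Projector onto translations and infinitesimal rotations of (N, 3) coords."""
    n = coords.shape[0]
    r = coords - coords.mean(axis=0)
    basis = np.zeros((3 * n, 6))
    for k in range(3):
        axis = np.zeros(3)
        axis[k] = 1.0
        basis[:, k] = np.tile(axis, n)               # translation along axis k
        basis[:, 3 + k] = np.cross(axis, r).ravel()  # rotation about axis k
    q, _ = np.linalg.qr(basis)                       # orthonormalize the six vectors
    return q @ q.T


def jacobian_free_energy(ratio, temperature=300.0):
    """Free energy change k_B T ln(ratio) for a relative Jacobian variation."""
    return KB * temperature * np.log(ratio)
```

With unit masses, `configurational_jacobian(np.eye(3 * n) - rigid_body_projector(coords))` evaluates to one for the stand-in metric, and `jacobian_free_energy(1.03)` gives roughly 0.018 kcal/mol at the assumed 300 K, in line with the estimate quoted in the text.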
Next, we compute the gradients of the quantities εi and ηi that are required to apply the forces arising from the potentials and . Each εi in Eq. 27 can be rewritten more concisely as
| (A3) |
where and are constant, and minimize the norms and , respectively, and Δ=‖B − A‖. Using (′) to represent partial derivatives with respect to (t-coordinate of atom i), we have
| (A4) |
and
| (A5) |
The computation of derivatives of rotation matrices is described in the supplementary section of Ref. 30. Derivatives of η = ‖v⊥‖ can be computed with very little additional effort by noting that (with A and B as in Eq. A3)
| (A6) |
(compare with Eq. 39), which implies
| (A7) |
We note that in each of Eqs. A5, A7, the last term on the right-hand side is due to rotation matrix derivatives. The additional terms are needed to keep the MD replica on the reaction coordinate hypersurfaces because the optimal rotations will change when the coordinates are perturbed.
References
- Benkovic S. and Hammes-Schiffer S., Science 301, 1196 (2003). doi:10.1126/science.1085515
- Johnson K., Annu. Rev. Biochem. 62, 685 (1993). doi:10.1146/annurev.bi.62.070193.003345
- Joyce C. and Steitz T., Annu. Rev. Biochem. 63, 777 (1994). doi:10.1146/annurev.bi.63.070194.004021
- Ma J., Siegler P., Xu Z., and Karplus M., J. Mol. Biol. 302, 303 (2000). doi:10.1006/jmbi.2000.4014
- Karplus M. and Gao Y., Curr. Opin. Struct. Biol. 14, 250 (2004). doi:10.1016/j.sbi.2004.03.012
- Houdusse A. and Sweeney H. L., Curr. Opin. Struct. Biol. 11, 182 (2001). doi:10.1016/S0959-440X(00)00188-3
- Geeves M. and Holmes K., Annu. Rev. Biochem. 68, 687 (1999). doi:10.1146/annurev.biochem.68.1.687
- Morrison E. A., DeKoster G. T., Dutta S., Vafabakhsh R., Clarkson M., Bahl A., Ha D. K. T., and Henzler-Wildman K., Nature (London) 481, 45 (2012). doi:10.1038/nature10703
- Dror R., Arlow D., Borhani D., Jensen M., Piana S., and Shaw D., Proc. Natl. Acad. Sci. U.S.A. 106, 4689 (2009). doi:10.1073/pnas.0811065106
- Friedrichs M., Eastman P., Vaidyanathan V., Houston M., Legrand S., Beberg A., Ensign D., Bruns C., and Pande V., J. Comput. Chem. 30, 864 (2009). doi:10.1002/jcc.21209
- Harvey M., Giupponi G., and De Fabritis G., J. Chem. Theory Comput. 5, 1632 (2009). doi:10.1021/ct9000685
- Torrie G. and Valleau J., J. Comput. Phys. 23, 187 (1977). doi:10.1016/0021-9991(77)90121-8
- Bartels C., Schaefer M., and Karplus M., J. Chem. Phys. 111, 8048 (1999). doi:10.1063/1.480139
- Laio A. and Parrinello M., Proc. Natl. Acad. Sci. U.S.A. 99, 12562 (2002). doi:10.1073/pnas.202427399
- Dellago C., Bolhuis P., and Geissler P., Adv. Chem. Phys. 123, 1 (2003). doi:10.1002/0471231509.ch1
- Bolhuis P., Chandler D., Dellago C., and Geissler P., Annu. Rev. Phys. Chem. 53, 291 (2002). doi:10.1146/annurev.physchem.53.082301.113146
- Hénin J. and Chipot C., J. Chem. Phys. 121, 2904 (2004). doi:10.1063/1.1773132
- Hénin J., Fiorin G., Chipot C., and Klein M. L., J. Chem. Theory Comput. 6, 35 (2010). doi:10.1021/ct9004432
- Rosso L. and Tuckerman M., Mol. Sim. 28, 91 (2002). doi:10.1080/08927020211977
- Schlitter J., Engels M., Krüger P., Jacoby E., and Wollmer A., Mol. Sim. 10, 291 (1993). doi:10.1080/08927029308022170
- Maragliano L. and Vanden-Eijnden E., Chem. Phys. Lett. 426, 168 (2006). doi:10.1016/j.cplett.2006.05.062
- Faradjian A. and Elber R., J. Chem. Phys. 120, 10880 (2004). doi:10.1063/1.1738640
- Shalloway D. and Faradjian K., J. Chem. Phys. 124, 054112 (2006). doi:10.1063/1.2161211
- Kirmizialtin S., Nguyen V., Johnson K., and Elber R., Structure 20, 618 (2012). doi:10.1016/j.str.2012.02.018
- E W., Ren W., and Vanden-Eijnden E., Chem. Phys. Lett. 413, 242 (2005). doi:10.1016/j.cplett.2005.07.084
- Maragliano L., Fischer A., Vanden-Eijnden E., and Ciccotti G., J. Chem. Phys. 125, 024106 (2006). doi:10.1063/1.2212942
- Schlitter J., Engels M., and Krüger P., J. Mol. Graph. 12, 84 (1994). doi:10.1016/0263-7855(94)80072-3
- van der Vaart A. and Karplus M., J. Chem. Phys. 122, 114903 (2005). doi:10.1063/1.1861885
- Apostolakis J., Ferrara P., and Caflisch A., J. Chem. Phys. 110, 2099 (1999). doi:10.1063/1.477819
- Ovchinnikov V. and Karplus M., J. Phys. Chem. B 116, 8584 (2012). doi:10.1021/jp212634z
- Schlick T., F1000 Biology Reports 1, 51 (2009). doi:10.3410/B1-51
- Lei H. and Duan Y., Curr. Opin. Struct. Biol. 17, 187 (2007). doi:10.1016/j.sbi.2007.03.003
- Ren W., Vanden-Eijnden E., Maragakis P., and E W., J. Chem. Phys. 123, 134109 (2005). doi:10.1063/1.2013256
- Jónsson H., Mills G., and Jacobsen K., in Classical and Quantum Dynamics in Condensed Phase Simulations, edited by Berne B., Ciccotti G., and Coker D. (World Scientific, Singapore, 1998), pp. 385–404.
- Fischer S. and Karplus M., Chem. Phys. Lett. 194, 252 (1992). doi:10.1016/0009-2614(92)85543-J
- E W., Ren W., and Vanden-Eijnden E., J. Chem. Phys. 126, 164103 (2007). doi:10.1063/1.2720838
- Gronenborn A., Filpula D., Essig N., Achari A., Whitlow M., Wingfield P., and Clore G., Science 253, 657 (1991). doi:10.1126/science.1871600
- E W. and Vanden-Eijnden E., Annu. Rev. Phys. Chem. 61, 391 (2010). doi:10.1146/annurev.physchem.040808.090412
- E W. and Vanden-Eijnden E., in Multiscale Modeling and Simulation, edited by Attinger S. and Koumoutsakos P. (Springer-Verlag, Berlin, 2004).
- E W., Ren W., and Vanden-Eijnden E., Phys. Rev. B 66, 052301 (2002). doi:10.1103/PhysRevB.66.052301
- Gardiner C., Handbook of Stochastic Methods, 3rd ed. (Springer-Verlag, Berlin, 2003).
- Hummer G., J. Chem. Phys. 120, 516 (2004). doi:10.1063/1.1630572
- Best R. and Hummer G., Proc. Natl. Acad. Sci. U.S.A. 102, 6732 (2005). doi:10.1073/pnas.0408098102
- Ma A. and Dinner A., J. Phys. Chem. B 109, 6769 (2005). doi:10.1021/jp045546c
- Peters B. and Trout B., J. Chem. Phys. 125, 054108 (2006). doi:10.1063/1.2234477
- Vanden-Eijnden E. and Venturoli M., J. Chem. Phys. 130, 194103 (2009). doi:10.1063/1.3130083
- Dickson A., Warmflash A., and Dinner A. R., J. Chem. Phys. 130, 074104 (2009). doi:10.1063/1.3070677
- Vanden-Eijnden E. and Venturoli M., J. Chem. Phys. 131, 044120 (2009). doi:10.1063/1.3180821
- Dickson A., Warmflash A., and Dinner A. R., J. Chem. Phys. 131, 154104 (2009). doi:10.1063/1.3244561
- Khavrutskii I., Arora K., and Brooks C. III, J. Chem. Phys. 125, 174108 (2006). doi:10.1063/1.2363379
- Pan A., Sezer D., and Roux B., J. Phys. Chem. B 112, 3432 (2008). doi:10.1021/jp0777059
- Farrell J. L., Stuelpnagel J. C., Wessner R. H., Velman J. R., and Brook J. E., SIAM Rev. 8, 384 (1966). doi:10.1137/1008080
- Kabsch W., Acta Cryst. A32, 922 (1976). doi:10.1107/S0567739476001873
- Horn B., J. Opt. Soc. Am. A 4, 629 (1987). doi:10.1364/JOSAA.4.000629
- Coutsias E., Seok C., and Dill K., J. Comput. Chem. 25, 1849 (2004). doi:10.1002/jcc.20110
- Elber R. and Karplus M., Chem. Phys. Lett. 139, 375 (1987). doi:10.1016/0009-2614(87)80576-6
- Neria E., Fischer S., and Karplus M., J. Chem. Phys. 105, 1902 (1996). doi:10.1063/1.472061
- Chu J.-W., Trout B., and Brooks B., J. Chem. Phys. 119, 12708 (2003). doi:10.1063/1.1627754
- Czerminski R. and Elber R., J. Chem. Phys. 92, 5580 (1990). doi:10.1063/1.458491
- Eckart C., Phys. Rev. 47, 552 (1935). doi:10.1103/PhysRev.47.552
- Pickett H. and Strauss H., J. Am. Chem. Soc. 92, 7281 (1970). doi:10.1021/ja00728a009
- Kudin K. and Dymarsky A., J. Chem. Phys. 122, 224105 (2005). doi:10.1063/1.1929739
- Kneller G., J. Chem. Phys. 128, 194101 (2008). doi:10.1063/1.2902290
- Maragliano L., Vanden-Eijnden E., and Roux B., J. Chem. Theory Comput. 5, 2589 (2009). doi:10.1021/ct900279z
- Adelman J. L. and Grabe M., J. Chem. Phys. 138, 044105 (2013). doi:10.1063/1.4773892
- Maragliano L., Cottone G., Ciccotti G., and Vanden-Eijnden E., J. Am. Chem. Soc. 132, 1010 (2010). doi:10.1021/ja905671x
- Ovchinnikov V., Karplus M., and Vanden-Eijnden E., J. Chem. Phys. 134, 085103 (2011). doi:10.1063/1.3544209
- Kabsch W., Acta Cryst. A34, 827 (1978). doi:10.1107/S0567739478001680
- Sayvetz A., J. Chem. Phys. 7, 383 (1939). doi:10.1063/1.1750455
- Ferigle S. and Weber A., Am. J. Phys. 21, 102 (1953). doi:10.1119/1.1933365
- Wilson E. Jr., Decius J., and Cross P., Molecular Vibrations (McGraw-Hill, New York, 1955).
- Herschbach D. R., Johnston H. S., and Rapp D., J. Chem. Phys. 31, 1652 (1959). doi:10.1063/1.1730670
- Boresch S. and Karplus M., J. Chem. Phys. 105, 5145 (1996). doi:10.1063/1.472358
- See supplementary material at http://dx.doi.org/10.1063/1.4871685 for the alanine dipeptide test case, for discussions of the electrostatic and solvation energies and of rotational contributions to the free energy of the β-hairpin, and for commitment analysis of the ZTS path.
- Vanden-Eijnden E. and Venturoli M., J. Chem. Phys. 130, 194101 (2009). doi:10.1063/1.3129843
- Brünger A., Brooks C., and Karplus M., Chem. Phys. Lett. 105, 495 (1984). doi:10.1016/0009-2614(84)80098-6
- Brooks B., Brooks C. III, MacKerell A. Jr., Nilsson L., Petrella R., Roux B., Won Y., Archontis G., Bartels C., Boresch S. et al., J. Comput. Chem. 30, 1545 (2009). doi:10.1002/jcc.21287
- Phillips J., Braun R., Wang W., Gumbart J., Tajkhorshid E., Villa E., Chipot C., Skeel R., Kale L., and Schulten K., J. Comput. Chem. 26, 1781 (2005). doi:10.1002/jcc.20289
- Fukunishi H., Watanabe O., and Takada S., J. Chem. Phys. 116, 9058 (2002). doi:10.1063/1.1472510
- Bussi G., Gervasio F. L., Laio A., and Parrinello M., J. Am. Chem. Soc. 128, 13435 (2006). doi:10.1021/ja062463w
- Cecchini M., Krivov S., Spichty M., and Karplus M., J. Phys. Chem. B 113, 9728 (2009). doi:10.1021/jp9020646
- Spichty M., Cecchini M., and Karplus M., J. Phys. Chem. Lett. 1, 1922 (2010). doi:10.1021/jz1005016
- Ovchinnikov V., Cecchini M., and Karplus M., J. Phys. Chem. B 117, 750 (2013). doi:10.1021/jp3080578
- Brooks B., Bruccoleri R., Olafson B., States D., Swaminathan S., and Karplus M., J. Comput. Chem. 4, 187 (1983). doi:10.1002/jcc.540040211
- MacKerell A. Jr., Feig M., and Brooks C. III, J. Comput. Chem. 25, 1400 (2004). doi:10.1002/jcc.20065
- Haberthür U. and Caflisch A., J. Comput. Chem. 29, 701 (2008). doi:10.1002/jcc.20832
- Ramachandran G., Ramakrishnan C., and Sasisekharan V., J. Mol. Biol. 7, 95 (1963). doi:10.1016/S0022-2836(63)80023-6
- Kleywegt G. J. and Jones T. A., Structure (London) 4, 1395 (1996). doi:10.1016/S0969-2126(96)00147-5
- Hooft R. W., Sander C., and Vriend G., Comput. Appl. Biosci. 13, 425 (1997). doi:10.1093/bioinformatics/13.4.425
- Best R., Zhu X., Shim J., Lopes P., Mittal J., Feig M., and MacKerell A. Jr., J. Chem. Theory Comput. 8, 3257 (2012). doi:10.1021/ct300400x
- Zhou R., Proteins: Struct. Funct. Genet. 53, 148 (2003). doi:10.1002/prot.10483
- Blanco F., Rivas G., and Serrano L., Nat. Struct. Biol. 1, 584 (1994). doi:10.1038/nsb0994-584
- Schreiner E., Trabuco L., Freddolino P., and Schulten K., BMC Bioinformatics 12, 190 (2011). doi:10.1186/1471-2105-12-190
- Ryckaert J.-P., Ciccotti G., and Berendsen H., J. Comput. Phys. 23, 327 (1977). doi:10.1016/0021-9991(77)90098-5
- De Loof H., Nilsson L., and Rigler R., J. Am. Chem. Soc. 114, 4028 (1992). doi:10.1021/ja00037a002
- Dickson B., Huang H., and Post C. B., J. Phys. Chem. B 116, 11046 (2012). doi:10.1021/jp304720m
- Searle M. and Williams D., J. Am. Chem. Soc. 114, 10690 (1992). doi:10.1021/ja00053a002
- Gilli P., Ferretti V., Gilli G., and Borea P., J. Phys. Chem. 98, 1515 (1994). doi:10.1021/j100056a024
- Frederick K., Marlow M., Valentine K., and Wand J., Nature (London) 448, 325 (2007). doi:10.1038/nature05959
- Dunitz J., Science 264, 670 (1994). doi:10.1126/science.264.5159.670
- Dunitz J., Chem. Biol. 2, 709 (1995). doi:10.1016/1074-5521(95)90097-7
- Yu H., Ma L., Yang Y., and Cui Q., PLoS Comput. Biol. 3, 0214 (2007).
- Huang H., Ozkirimli E., and Post C., J. Chem. Theory Comput. 5, 1304 (2009). doi:10.1021/ct9000153
- Golosov A. A., Warren J. J., Beese L. S., and Karplus M., Structure 18, 83 (2010). doi:10.1016/j.str.2009.10.014
- Shaw D. E., Maragakis P., Lindorff-Larsen K., Piana S., Dror R. O., Eastwood M. P., Bank J. A., Jumper J. M., Salmon J. K., Shan Y., and Wriggers W., Science 330, 341 (2010). doi:10.1126/science.1187409
- Lindorff-Larsen K., Piana S., Dror R. O., and Shaw D. E., Science 334, 517 (2011). doi:10.1126/science.1208351
- Lane T. J., Bowman G. R., Beauchamp K., Voelz V. A., and Pande V. S., J. Am. Chem. Soc. 133, 18413 (2011). doi:10.1021/ja207470h
- Noé F., Schütte C., Vanden-Eijnden E., Reich L., and Weikl T. R., Proc. Natl. Acad. Sci. U.S.A. 106, 19011 (2009). doi:10.1073/pnas.0905466106
- Radhakrishnan R. and Schlick T., Proc. Natl. Acad. Sci. U.S.A. 101, 5970 (2004). doi:10.1073/pnas.0308585101
- de Jong D., Singh G., Bennett W., Arnarez C., Wassenaar T., Schäfer L., Periole X., Tieleman D., and Marrink S., J. Chem. Theory Comput. 9, 687 (2013). doi:10.1021/ct300646g
- Veitshans T., Klimov D., and Thirumalai D., Fold. Des. 2, 1 (1997). doi:10.1016/S1359-0278(97)00002-3
- Levitt M. and Warshel A., Nature (London) 253, 694 (1975). doi:10.1038/253694a0
- Brooks B., Janežič D., and Karplus M., J. Comput. Chem. 16, 1522 (1995). doi:10.1002/jcc.540161209





