Normal mode partitioning of Langevin dynamics for biomolecules

Christopher R Sweet; Paula Petrone; Vijay S Pande; Jesús A Izaguirre

doi:10.1063/1.2883966

. 2008 Apr 8;128(14):145101. doi: 10.1063/1.2883966

Normal mode partitioning of Langevin dynamics for biomolecules

Christopher R Sweet ¹, Paula Petrone ², Vijay S Pande ^3,^a), Jesús A Izaguirre ^1,^b)

PMCID: PMC2673189 PMID: 18412479

Abstract

We propose a novel normal mode multiple time stepping Langevin dynamics integrator called NML. The aim is to approximate the kinetics or thermodynamics of a biomolecule by a reduced model based on a normal mode decomposition of the dynamical space. Our basis set uses the eigenvectors of a mass reweighted Hessian matrix calculated with a biomolecular force field. This particular choice has the advantage of an ordering according to the eigenvalues, which have a physical meaning of being the square of the mode frequency. Low frequency eigenvalues correspond to more collective motions, whereas the highest frequency eigenvalues are the limiting factor for the stability of the integrator. In NML, the higher frequency modes are overdamped and relaxed near their energy minimum while respecting the subspace of low frequency dynamical modes. Our numerical results confirm that both sampling and rates are conserved for an implicitly solvated alanine dipeptide model, with only 30% of the modes propagated, when compared to the full model. For implicitly solvated systems, NML gives a twofold improvement in efficiency over plain Langevin dynamics for sampling a small 22 atom (alanine dipeptide) model and in excess of an order of magnitude for sampling an 882 atom (bovine pancreatic trypsin inhibitor) model, with good scaling with system size subject to the number of modes propagated. NML has been implemented in the open source software PROTOMOL.

INTRODUCTION

Molecular dynamics (MD) is a useful technique for studying the kinetics and thermodynamics of proteins and other biomolecules. A primary limitation to the application of MD to study biologically relevant processes such as protein folding, docking, and functionally important conformational changes is the small time step size of conventional MD. Whereas the latter is typically measured in femtoseconds, dynamical processes of interest happen in microseconds and longer time scales. We seek to construct a reduced model of the kinetics or thermodynamics of biomolecules, that is, separate the dynamics into fast, uninteresting modes, and slow, functionally relevant modes. Once the space has been partitioned in this way, we perform MD in the reduced space to compute observables.

Among the most popular approaches in finding reduced dynamical spaces for biomolecules are normal mode analysis (NMA) and principal component analysis (PCA). NMA has proven successful in representing collective motions of atoms, such as domain and hinge bending in proteins.¹^,²^,³^,⁴^,⁵^,⁶^,⁷^,⁸^,⁹^,¹⁰^,¹¹^,¹²^,¹³ Because of this, NMA has been used to generate a generalized coordinate set in which to represent molecular motion. PCA, e.g., Ref. 14, or the essential dynamics analysis¹⁵ uses several conformations or relatively long simulations to capture slow motions. In comparison, NMA requires only an initial structure and no simulation to determine the essential space.

Methods to compute observables given a reduced space model have been previously developed. Some MD propagators have attempted to take advantage of the frequency spectra provided by NMA. For instance, a scheme called LIN (Ref. ¹⁶) propagates the high frequency normal modes using the analytical solution of a Langevin dynamics equation. Updating these modes requires frequent rediagonalization of the mass reweighted Hessian (MRH) every few femtoseconds or picoseconds. Also, the low frequency modes are updated using an implicit solver, which is generally very costly. These choices make LIN computationally expensive and limit the potential speedup of the method.

The amplified collective motion (ACM) MD method¹⁷ attempts to excite motions along a few low frequency modes (typically three to five) obtained from the anisotropic network model (ANM).¹⁸ After estimating the slowest ANM modes, the velocities are split into the essential subspace and the remainder subspace. The essential subspace is weakly coupled to a higher temperature bath. The modes are updated very frequently, every few femtoseconds. Simulations of ACM on proteins show enhanced sampling. The authors of ACM do not show what the stationary distribution and partition function of the method are. Another method, LLMOD,¹⁹ obtains the low frequency modes of the Hessian and does an analytical propagation along the low frequency modes. This integration step is nonphysical because it ignores the coupling with high frequency modes and quickly becomes invalid. Thus, it is not possible to determine the ensemble that is sampled. An interesting feature of LLMOD is that it uses an iterative Arnoldi method to compute the low frequency eigenvectors and is thus applicable to larger molecules. Similarly, ElNémo, a web server to compute an elastic network model using a rotation-translation-block approximation, is able to compute the 100 lowest frequency modes of a protein.²⁰

The subspace MD (SMD) (Ref. ²¹) is a method in which only the low frequency subspace is propagated according to projected Newtonian equations of motion, while high frequency modes remain unaltered. This elegant scheme, which works well in crystals, loses efficiency when extended to biomolecules.²² In this case, the system is not periodic and large interactions between subspaces occur, and failure to frequently rediagonalize the MRH results in constraining the dynamics to the neighborhood of the initial equilibrium structure.

The collective Langevin dynamics²³ uses PCA to obtain a reduced space of low frequency modes. The dynamics of the remaining space is accounted for through a generalized Langevin equation with a memory kernel. This method, though elegant, requires a relatively long simulation to obtain the memory kernel for the Langevin equation.

Other works have attempted to build longer time step integrators for MD.²⁴^,²⁵^,²⁶^,²⁷^,²⁸ One effort in this direction has been the use of multiple time step²⁹ (MTS) integrators. Linear and nonlinear resonances limit the length of the time step to about one-third of the period of the fastest vibrational mode. For MD of proteins using r-RESPA,³⁰^,³¹ a prototypical MTS integrator, the stability limit is at a time step of 3.3 fs.³² More stable MTS integrators only offer a modest improvement of roughly a factor of 1.5.³³^,³⁴

The largest impediment to a time step increase comes from the fastest internal vibrations. Algorithms to constrain bonds, such as SHAKE,³⁵^,³⁶^,³⁷ are normally implemented and popular solvent models constrain the bond lengths and angles for water. It is also a standard procedure to constrain covalent bonds to hydrogen in all-atom representations. This approach allows the time step to be doubled over unconstrained MD.³⁸

It is unclear, however, how to use constraints on atomic degrees of freedom beyond these obvious holonomic constraints without significantly affecting the dynamics of the molecule. Several methods have been proposed that treat whole regions of a biomolecule as rigid entities, e.g. Ref. 39. Because the definition of these regions is arbitrary, the error in the kinetics or thermodynamics incurred by this procedure is difficult to estimate. In general, constraints beyond just covalent bonds damage the thermodynamical and kinetic properties of the system.⁴⁰

A MTS algorithm for the diffusive Langevin equation has been devised.⁴¹ Here, bonds and angles are constrained to achieve a long outer time step, and then the unconstrained system is propagated to bring the high frequencies to thermal equilibrium. The method relies on separation of time scales introduced by the constraints and splitting of the forces. Unfortunately, in general, the separation of frequencies between these time scales is not well defined and the coupling between high and low frequencies is not always weak.⁴²^,⁴³^,⁴⁴ Also, relaxing through unconstrained steps limits the efficiency of the method, since many short time steps still need to be carried out.

We introduce a MTS propagator for Langevin dynamics of implicitly solvated biomolecules called normal mode Langevin (NML). We obtain the reduced model (low frequency modes) through a NMA of a MRH of a biomolecular force field [currently, CHARMM 22 (Refs. ⁴⁵^,⁴⁶)]. This partitioning is used to construct a damping matrix for the Langevin equation that overdamps and propagates the high frequency modes using Brownian dynamics, whereas it propagates the relevant low frequency modes using full Langevin dynamics. In our discretization, we accelerate the convergence of Brownian dynamics on high frequencies by an energy minimization.

The rationale for this treatment of the high frequencies can be understood if we consider the work of Petrone and Pande.⁴⁷ They have shown that, contrary to previous observations,⁷ the low frequency normal modes only are not enough to describe the transition between two conformations, and instead, a broad spectrum of frequencies is required. Since high frequency eigenvectors have short persistence throughout a simulation, we cannot propagate them without expensive rediagonalization.¹⁶ As regards the treatment of high frequency modes, constraining their amplitude to zero damages the long time dynamics,²¹ whereas relaxing them to thermal equilibrium through small time step integration is expensive.⁴¹ NML minimizes the energy with respect to the high frequency modes, which in the limit of large damping coefficient is equivalent to propagating these modes using Brownian dynamics. This is a more efficient way of thermalizing these high frequency modes, as shown below.

In our NML implementation, the force calculation, integration, and minimization or dampening of fast modes are performed in Cartesian coordinates. Because of this, our method combines the advantages of a generalized coordinate obtained through NMA from only one structure to select the relevant motions and the simplicity of the equations of motion given by the Cartesian coordinates.

To validate our method we use two test cases, the alanine dipeptide (AD) and bovine pancreatic trypsin inhibitor (BPTI), for which extensive data are available in the literature. We show the range of validity of our method by calculating the kinetic (for AD) and sampling properties of the test systems (AD, BPTI) as a function of the number of dynamical modes.

In Sec. 2, we describe the underlying rationale and theory for the NML method and derive the Langevin impulse (LÎ) version. In Sec. 3, we apply NML to explore the sampling or kinetics of blocked AD using a distance-dependent dielectric implicit solvent model and BPTI using the screened Coulomb potential implicit solvent model (SCPISM). In the AD tests, we demonstrate that using NML, when only the first 22–26 modes (of 66) are used, results in both Ramachandran plots and isomerization transition rates that are in good agreement with those obtained from the LÎ propagator. This leads to only a twofold increase in efficiency due to the small size of the system. To show the scalability of the method, we also test a BPTI model, demonstrating fair to good agreement with Ramachandran plots when propagating 100–300 modes (of 2646), with efficiency increases in excess of an order of magnitude. These results are reviewed and interpreted in Sec. 4.

THEORY OF NORMAL MODE PARTITIONING

We partition the system’s degrees of freedom into a subspace C and its complement C^⊥ according to frequency using NMA. NMA finds a harmonic approximation to the system of interest around an initial equilibrium structure using the MRH. The diagonalization of the MRH generates the normal mode vectors, ranked in increasing order of their frequency (square root of eigenvalue). These normal modes therefore become the generalized coordinates with which to capture the dynamics of the system. In subspace C, normal modes will be propagated, whereas in C^⊥, modes will be dampened to near their energy minimum while respecting the subspace of dynamical modes. In the case where C includes all the modes, the original Langevin dynamics are recovered. Since we wish to use standard MD force fields such as CHARMM,⁴⁵^,⁴⁶ which are more readily calculated in the 3N Cartesian space for system size N, the system is propagated in these coordinates rather than the normal mode coordinates for greater efficiency.

We describe the method for partitioning spaces and determining the largest time step in Sec. 2A. We derive the Langevin equations using a damping matrix to partition the system in Sec. 2B, and perform a MTS discretization called normal mode impulse (NMI), (Sec. 2C) and a more efficient method called NML (Sec. 2D) using a minimizer(Sec. 2E).

NMA and Hessian diagonalization

Many MD systems of interest can be described by a separable Hamiltonian with 3N Cartesian positions x and momenta p, potential energy (PE) U, and diagonal mass matrix M,

H (x, p) = \frac{1}{2} p^{T} M^{- 1} p + U (x) .

(1)

For these systems, we can formally expand the PE about an equilibrium point x₀, which we assume is a local minimum. The Hessian H is a factor in the first nonconstant, nonzero term of this expansion and a harmonic approximation to the original system can be found by truncating the expansion at this point. To rewrite the harmonic approximation as a set of decoupled oscillators, it is insufficient to diagonalize the Hessian as the resulting oscillators would be coupled through the projected mass matrix. Instead we mass reweight the system using $\hat{x} = M^{1 ∕ 2} (x - x_{0})$ and then diagonalize the resulting MRH,

M^{- 1 ∕ 2} H M^{- 1 ∕ 2} Q = Q Λ,

(2)

where Λ is the diagonal matrix of ordered eigenvalues λ₁,…,λ_3N and Q the matrix of column eigenvectors e₁,…,e_3N. This means that the frequency of each oscillator is uniquely defined by its associated eigenvalue. We can now project between the mode space and Cartesian space using the eigenvectors Q and mass matrix M,

c = Q^{T} M^{1 ∕ 2} (x - x_{0}), x = M^{- 1 ∕ 2} Q c + x_{0},

(3)

for mode amplitude vector c. We choose a cutoff frequency $f_{c} = \sqrt{λ_{i}}$ to partition the normal modes such that $Q = [Q, \bar{Q}]$ , Q=[e₁,…,e_i], and $\bar{Q} = [e_{i + 1}, \dots, e_{3 N}]$ are rectangular matrices whose columns span the subspace C and complement subspace C^⊥, respectively. In the following discussion, we will assume the dimensions of Q to be 3N×m. In the linear case, the time step is bounded by the asymptotic stability of the method⁴⁸^,⁴⁹ at a frequency equal to the square root of the ith eigenvalue rather than the highest frequency in the system. This can be estimated by

Δ t_{i} \approx Δ t_{3 N} \sqrt{\frac{λ_{3 N}}{λ_{i}}},

(4)

where λ_i is the ith diagonal element of Λ, λ_3N is the largest eigenvalue, and Δt_3N is the time step required to propagate the nonpartitioned system. The tests in Sec. 3 confirm that this is a good heuristic for the choice of time step for the full system.

Propagation using the Langevin equation

The system will be time propagated using the Langevin equation

d x = v d t, M d v = f d t - Γ M v d t + \sqrt{2 k_{B} T} Γ^{1 ∕ 2} M^{1 ∕ 2} d W (t),

(5)

where f are the internal forces between particles in the system, t is the time, W(t) is a collection of Wiener processes, k_B is the Boltzmann constant, T is the desired system temperature, v are the velocities, and Γ is the diagonalizable damping matrix. The system diffusion tensor D gives rise to Γ:

Γ = k_{B} T D^{- 1} M^{- 1} .

(6)

D could be chosen to model the dynamics of an implicit solvent, in which case the damping is related to solvent viscosity, or simply to sample from a canonical ensemble or for the equilibration phase of a subsequent microcanonical simulation.

We make the following physical approximations about the system:

Physical approximation 1. We use Γ to partition the system into two subspaces, C spanned by the low frequency eigenvectors Q and C^⊥ spanned by the high frequency eigenvectors $\bar{Q}$ , so that we can define independent damping heuristics for the subspaces C and C^⊥. In particular, we want to overdamp the high frequency modes.

Physical approximation 2. We assume that the solution in C^⊥, the high frequency modes, can be modeled by Brownian dynamics, the diffusive form of the Langevin equation. To this end, we select a large damping coefficient $\bar{γ} ⪢ γ_{c}$ , where γ_c is the critical damping coefficient, which is fixed for C^⊥. The damping matrix applied to C^⊥ is then the diagonal matrix ${\bar{Γ}}_{i i} = \bar{γ}$ and C^⊥ is “overdamped.”

By selecting the damping matrix applied to C to emulate system characteristics, such as water viscosity, in conjunction with the choice of $\bar{γ}$ , we can split the system’s degrees of freedom between C and C^⊥ using Γ. Here we assume that the damping matrix applied to C is a diagonal matrix ${\hat{Γ}}_{i i} = γ$ . The resulting system damping matrix is

Γ = P_{f} \hat{Γ} + P_{f}^{⊥} \bar{Γ},

(7)

with the projection matrices P_f and $P_{f}^{⊥}$ given in the Appendix, Eqs. A6, A7.

Numerically solving the Langevin equation 5 using Eq. 7 would be computationally expensive to implement since Γ is a matrix. Instead, we split Eq. 5 using physical approximation 2:

d \hat{x} = \hat{v} d t,

(8)

M d \hat{v} = \hat{f} d t - \hat{Γ} M \hat{v} d t + \sqrt{2 k_{B} T} {\hat{Γ}}^{1 ∕ 2} P_{f} M^{1 ∕ 2} d W_{1} (t),

d \bar{x} = {\bar{Γ}}^{- 1} M^{- 1} \bar{f} d t + \sqrt{2 k_{B} T} {\bar{Γ}}^{- 1 ∕ 2} M^{- 1} P_{f}^{⊥} M^{1 ∕ 2} d W_{2} (t)

(9)

for $\hat{x} = P_{x} x$ , $\hat{v} = P_{x} v$ , and $\hat{f} = P_{f} f$ in C, $\bar{x} = P_{x}^{⊥} x$ , $\bar{v} = P_{x}^{⊥} v$ , and $\bar{f} = P_{f}^{⊥} f$ in C^⊥, and W₁, W₂ are collections of independent Wiener processes. P_x, $P_{x}^{⊥}$ , P_f, and $P_{f}^{⊥}$ are given in the Appendix, Eqs. A3, A4, A6, A7.

MTS discretization (NMI)

We discretize the Langevin equations 8, 9 using the LÎ method.⁵⁰ LÎ reduces to the Euler–Maruyama method⁵¹^,⁵² when applied to the Brownian dynamics equation 9.

We apply the following numerical approximations to the system for computational efficiency.

Numerical approximation 1. We assume that the C, C^⊥ splitting at eigenvector i ensures that the frequencies found in C are bounded above by $\sqrt{λ_{i}}$ and that the time step Δt given in Eq. 4 guarantees stability of the discretization of Eq. 8.

Numerical approximation 2. We assume that the PE U, given f=−∇_xU, can be split into U_fast giving rise to the fast (local) forces $\bar{f}$ in C^⊥ and U_slow giving rise to the slow forces such that

U = U_{fast} + U_{slow} .

(10)

Then we propagate the Brownian dynamics equation 9 using the fast forces $\bar{f} \approx - \nabla_{x} U_{fast}$ .

This discretization becomes a MTS Langevin method with a large time step Δt associated with Eq. 8 and a smaller time step Δτ associated with Eq. 9. We call this method NMI:

Half-kick.

v^{n + ϵ} = v^{n} - \frac{Δ t}{2} M^{- 1} \hat{f} (x^{n}) .

(11)

Fluctuation 1.

v^{n + 1 ∕ 2} = \exp (- γ Δ t) v^{n + ϵ} + \sqrt{τ_{2}} \sqrt{2 γ k_{B} T} M^{- 1} P_{f} M^{1 ∕ 2} z_{1}^{n + 1},

(12)

x^{n + 1 ∕ (p + 1)} = x^{n} + P_{x} (τ_{1} v^{n + ϵ} + \frac{1}{γ} \sqrt{2 γ k_{B} T} M^{- 1} P_{f} M^{1 ∕ 2} (τ_{3} z_{1}^{n + 1} + τ_{4} z_{2}^{n + 1})) .

(13)

Fluctuation 2.

x^{n + (i + 1) ∕ (p + 1)} = x^{n + i ∕ (p + 1)} + Δ τ {\bar{γ}}^{- 1} M^{- 1} \bar{f} (x^{n + i ∕ (p + 1)}) + \sqrt{2 Δ τ {\bar{γ}}^{- 1} k_{B} T} M^{- 1} P_{f}^{⊥} M^{1 ∕ 2} z^{n + (i + 1) ∕ (p + 1)},

(14)

where i=1,…,p.

Half-kick.

v^{n + 1} = v^{n + 1 ∕ 2} - \frac{Δ t}{2} M^{- 1} \hat{f} (x^{n + 1}) .

(15)

Fluctuation 1 updates the slow modes and fluctuation 2 updates the high frequency modes; γ and $\bar{γ}$ are the scalar Langevin damping coefficients for C and C^⊥ subspaces, respectively; k_B is the Boltzmann constant and T is the temperature. z₁ and z₂ are joint, independently sampled, random variable vectors sampled from a Gaussian distribution. Constants τ₁, τ₂, τ₃, and τ₄ are given by

τ_{1} = \frac{1 - \exp (- γ Δ t)}{γ},

(16)

τ_{2} = \frac{1 - \exp (- 2 γ Δ t)}{2 γ},

(17)

τ_{3} = \frac{τ_{1} - τ_{2}}{\sqrt{τ_{2}}},

(18)

τ_{4} = \sqrt{Δ t - \frac{τ_{1}^{2}}{τ_{2}}} .

(19)

These constants are a consequence of the LÎ method and their derivation can be found in Ref. 50.

We evaluate NMI in Sec. 3B. Even though it is possible to obtain some computational gain by the slow-fast splitting of the PE and overdamping overcomes problems associated with MTS methods for MD, the computational gain is limited by the need to do O(Δt) Brownian dynamics steps with a small time step Δτ, e.g., 1 or 2 fs.

NML discretization

The computational efficiency limitation of NMI motivates the use of a further numerical approximation. This leads to the derivation of the NML method described below.

Numerical approximation 3. We replace Eq. 14 in the Euler–Maruyama method with a mass-weighted minimization scheme followed by some steps of noise. This approach comes from the observation that for sufficiently large $\bar{γ}$ and neglecting the noise term, Eq. 14 minimizes the PE with respect to C^⊥ as shown in Proposition 1, Sec. 2E. Furthermore, each minimization step will partially remove the previous noise term in Eq. 14. Thus, we replace the p steps in Eq. 14 by

{\tilde{x}}^{k} = {[G (x^{n + ϵ})]}^{k},

(20)

x^{n + 1} = {[G ({\tilde{x}}^{k}) + \sqrt{2 ζ_{l} k_{B} T} M^{- 1} P_{f}^{⊥} M^{1 ∕ 2} z_{2}^{k}]}^{l},

(21)

where k is the number of “nonperturbed” minimization steps and l the number of “perturbed” minimization steps, and ζ_l is the minimizer “line-search” solution described below. Here [⋅]^j denotes repeating the step j times using the positions from the previous step. An example of a solution method is provided in Sec. 2E. The minimization function G, for step n, minimization iteration i, and position x is

x^{n + (i + 1) ∕ (p + 1)} = G (x^{n + i ∕ (p + 1)}) = x^{n + i ∕ (p + 1)} + ζ_{i} M^{- 1} \bar{f} (x^{n + i ∕ (p + 1)}),

(22)

where $M^{- 1} \bar{f}$ is the direction of the position move and scalar ζ_i the distance in that direction determined by a line-search method. Equation 22 is equivalent to Eq. 14 with $Δ τ {\bar{γ}}^{- 1} = ζ_{i}$ and without noise. Further details of the minimization scheme can be found in Sec. 2E. Minimization allows the forces introduced by coupling into the slow subspace C to be reduced to a value which is arbitrarily close to zero. If a local minimum is attained, then the forces introduced into the slow subspace by the fast subspace are due to the random forces only, justifying physical approximations 1 and 2 in Sec. 2B. NML is thus defined by the equations of NMI with Eq. 14 replaced by Eqs. 20, 21, 22.

Finally, we introduce a computational optimization which is equivalent to NML but more efficient. Note that the positions in C are moved only in fluctuation 1 and those in C^⊥ are moved only in fluctuation 2. From this and since the minimizer both relaxes the system after the fluctuation 1 step and removes the previous random perturbation, defined as

δ \bar{x} = \sqrt{2 ζ_{l} k_{B} T} M^{- 1} P_{f}^{⊥} M^{1 ∕ 2} z_{2}^{k},

(23)

some computational advantage can be gained by removing δx before the next fluctuation 2. Equation 20 then becomes

{\tilde{x}}^{k} = {[G (x^{n + ϵ} - δ \bar{x})]}^{k} .

(24)

NML minimization

We first show that a mass-weighted steepest descent minimizer can be used to replace the drift term in Eq. 14 in the Euler–Maruyama method.

Proposition 1: For large $\bar{γ}$ , the drift term in Eq. 14 minimizes the PE with respect toC^⊥.

If we were to propagate Eq. 14, neglecting the noise term, at a step number n and for force

{\bar{f}}^{n} = - \nabla_{\bar{x}} U (\hat{x} + {\bar{x}}^{n} + x_{0}) = - \nabla_{\bar{x}} U^{n},

(25)

then ${\bar{x}}^{n + 1} = {\bar{x}}^{n} - Δ τ {\bar{γ}}^{- 1} M^{- 1} \nabla_{\bar{x}} U^{n}$ . We can expand U around $\hat{x} + x_{0}$ using a Taylor series

U^{n + 1} = U^{n} - Δ τ {\bar{γ}}^{- 1} {(\nabla_{\bar{x}} U^{n})}^{T} M^{- 1} \nabla_{\bar{x}} U^{n} + O {(Δ τ {\bar{γ}}^{- 1})}^{2} (.) + \dots .

(26)

We choose $Δ τ {\bar{γ}}^{- 1}$ to be sufficiently small, such that Eq. 26 converges and all terms above the second term can be ignored. Then, since ${(\nabla_{\bar{x}} U^{n})}^{T} M^{- 1} \nabla_{\bar{x}} U^{n} ⩾ 0$ , we have

U^{n + 1} ⩽ U^{n} \forall n,

(27)

with equality when $\bar{f} = 0$ . This means that for sufficiently large $\bar{γ}$ and number of steps p, such that pΔτ=Δt for C propagator time step Δt, the drift term in Eq. 14 will minimize the PE U with respect to C^⊥, up to random perturbation.◻

By considering Proposition 1 and defining $\bar{γ} (t, Δ τ)$ , dependent on time and step size, we can construct a method which minimizes more rapidly. The minimizer function G for mass-weighted steepest descent Eq. 22 is equivalent to the drift term of Eq. 14 for large $\bar{γ}$ when

Δ τ \bar{γ} {(t^{n} + i Δ τ, Δ τ)}^{- 1} = ζ_{i},

(28)

where scalar ζ_i represents the distance to move in the direction of the mass-weighted force, tⁿ is the time corresponding to ${\bar{x}}^{n}$ , and Δτ and i are as defined in Eq. 14. This method will require much fewer steps to achieve a given level of minimization by determining $\bar{γ} (t, Δ τ)$ as part of a minimization scheme, improving efficiency. To obtain an increase in efficiency, we require that the number of minimizer iterations k<p∕n_f, where p is the number of steps in Eq. 14 and n_f is the number of force evaluations per minimization step. This scheme has the advantage that using the mass-weighted force as a position move direction guarantees that the solution is in C^⊥.

We must choose a threshold u_th as a termination condition for the minimization algorithm. If the method gives solutions Uⁿ which are asymptotic to the actual minimum, then comparing subsequent solutions Uⁿ⁺¹−Uⁿ at minimization step n+1 will indicate how far from the minimum the solution is. The algorithm will then terminate when

U^{n + 1} - U^{n} ⩽ u_{th}

(29)

is satisfied.

The mass-weighted steepest descent method uses a quadratic approximation of the PE $U (\bar{x} + ζ M^{- 1} \bar{f}) \approx a ζ^{2} + b ζ + U (\bar{x})$ , where ζ is the scalar distance along the direction of the mass-weighted force $M^{- 1} \bar{f}$ . This approximation is then used to estimate the ζ corresponding to minimum U. The method is defined as follows:

ζ_{I} = λ_{3 N}^{- 1},

(30)

{\bar{f}}^{k - 1} = - P_{f}^{⊥} \nabla_{x} U_{fast} ({\bar{x}}^{k - 1}),

(31)

{\bar{f}}^{ϵ} = - P_{f}^{⊥} \nabla_{x} U_{fast} ({\bar{x}}^{k - 1} + ζ_{I} M^{- 1} {\bar{f}}^{k - 1}),

(32)

a = - \frac{1}{ζ_{I}} (\frac{U_{fast} ({\bar{x}}^{k - 1} + ζ_{I} M^{- 1} {\bar{f}}^{k - 1}) - U_{fast} ({\bar{x}}^{k - 1})}{ζ_{I}} - M^{- 1} {\bar{f}}^{k - 1} \cdot {\bar{f}}^{ϵ}),

(33)

b = M^{- 1} {\bar{f}}^{k - 1} \cdot {\bar{f}}^{ϵ} - 2 a ζ_{I},

(34)

ζ_{F} = - \frac{b}{2 a},

(35)

G ({\bar{x}}^{k - 1}) = {\bar{x}}^{k} = {\bar{x}}^{k - 1} + ζ_{F} M^{- 1} {\bar{f}}^{k - 1},

(36)

where U_fast is the PE that gives rise to the fast (local) forces. These were defined as bonded plus short-range nonbonded forces in our tests. The initial estimate of ζ_I is found by solving the discreet quadratic approximation, from the MRH diagonalization, for the highest frequency mode. For a system size N, this is mode number 3N, denoted c_3N, and the conservative forces are f_3N=−λ_3Nc_3N, with solution $ζ_{I} = λ_{3 N}^{- 1}$ for eigenvalue λ_3N. The number of steps required k is found from satisfying the termination condition in Eq. 29 for a given threshold u_th. The “simple” mass-weighted method uses ζ_I directly to calculate $U_{fast} ({\bar{x}}^{k - 1} + ζ_{I} M^{- 1} {\bar{f}}^{k - 1})$ . If this is less than $U_{fast} ({\bar{x}}^{k - 1})$ , we set ${\bar{x}}^{k} = {\bar{x}}^{k - 1} + ζ_{I} M^{- 1} {\bar{f}}^{k - 1}$ ; otherwise the quadratic fit is used to find ζ_F.

For this minimization step, the damping coefficient required to obtain the equivalent damping using Eq. 14 can be calculated from Eq. 28. Using this $\bar{γ}$ in the noise term of Eqs 14, 21 satisfies the fluctuation-dissipation theorem.⁵³

We recognize that for much larger systems, a more sophisticated approach may be necessary. Note that when PE minimization techniques which use a position move in the force direction are to be used, the position change must be restricted to C^⊥. This holds for the mass-weighted force, but for other methods, an additional projection is required to provide a new C^⊥ force

{\bar{f}}^{*} = P_{x}^{⊥} \bar{f}

(37)

for projection matrix $P_{x}^{⊥}$ given by Eq. A4, such that the new position $(x + ζ_{F} {\bar{f}}^{*})$ is invariant in C.

COMPUTATIONAL TESTS

We apply NML to study the kinetics and sampling of implicit solvent models of blocked AD (ACE ALA NME) and BPTI using the CHARMM 22 force field. NML is implemented in the open source software PROTOMOL.⁵⁴

Supplementary information (SI) to this paper is available.⁵⁵ It contains a detailed comparison of NML and NMI. In particular, it is shown that using the optimal damping coefficient, NMI is able to recover good sampling compared to LÎ. The SI contains all input files and analysis scripts to reproduce results in this section and instructions to download the executable. Figures S1 and S2 in the SI show that the dynamics using SMD are localized around the equilibrium structure.

The implicit solvent model used for the AD tests is a sigmoidal distance-dependent dielectric⁵⁶ to account for screening of electrostatic interactions due to solvent. The equation for the dielectric ε(r) as a function of distance r, parameters ϵ_s (the bulk dielectric constant), and S (screening factor), is

ε (r) = ϵ_{s} \frac{ϵ_{s} - 1}{2} (S^{2} r^{2} + 2 S r + 2) \exp (- S r) .

(38)

For these simulations, we used parameter values ϵ_s=78 and S=0.3.

For BPTI, the distance-dependent dielectric gave unsatisfactory results for both LÎ and NML: For instance, the Ramachandran plots were severely distorted compared to the explicit solvent; the molecule kept expanding in volume; these results were sensitive to the value of S. Thus, we used a more accurate implicit solvent model, the SCPISM, which overcame these problems. SCPISM uses the relation between the physically measurable dielectric function ε(r) and the screening function D(r), which for a single spherical particle in a pure solvent is given by

ε (r) = D (r) [1 + \frac{r}{D (r)} \frac{d}{d r} D (r)] .

(39)

Experiments and classical electrostatic theory of polar liquids (Debye–Lorentz–Sack theory) suggest that ε(r) has a sigmoidal distance-dependent form, a fact exploited by SCPISM and Ref. 56. SCPISM uses ε(r) obtained from experiments or theory as input to the model and then performs a numerical integration to obtain D(r). The SCPISM uses atom-type-dependent sigmoidal functions parametrized for the CHARMM 22 force field. Importantly, the D(r) approaches bulk medium value of the screening far from the protein, and thus SCPISM does not introduce a boundary that separates the protein from the solvent.

The electrostatic component of the energy of a protein in a polar solvent is given by

U (r) = \frac{1}{2} \sum_{i \neq j}^{N} \frac{q_{i} q_{j}}{D_{s} (r_{i j}) r_{i j}} + \frac{1}{2} \sum_{i = 1}^{N} \frac{q_{i}^{2}}{D_{s} (R_{i, B})} [\frac{1}{D_{s} (R_{i, B})} - 1],

(40)

where the first term corresponds to the interaction energy and the second to the self-energy. D_s(r) is a nonlinear screening term that accounts for all screening mechanisms, and R_i,B is the Born radius of atom i, which accounts for the local environment around the atom and thus changes with protein conformation. The Born radii are estimated using a contact map method, which uses a cutoff computation around each atom, and are also modified to account for individual hydrogen bonding interactions in the protein. The latter improves the accuracy of the method. The form of the screening function is

D_{s} (r_{i j}) = \frac{1 + ϵ_{s}}{1 + (ϵ_{s} - 1) \exp ∕ 2 (- α_{i j} r_{i j})},

(41)

where ϵ_s is the bulk dielectric as before, and α_ij is a parameter that depends on the interacting atoms. More implementation details can be found in the references, including the forces and parameters for CHARMM 22.⁵⁷^,⁵⁸^,⁵⁹^,⁶⁰^,⁶¹^,⁶²^,⁶³ For use of SCPISM in NMI and NML, we derived and implemented the Hessian for SCPISM (derivation found in the SI). Since we desire to reproduce kinetics for the implicit solvent biomolecules simulated, the equations of motion in C are described by a scalar Langevin equation with damping coefficient γ=91 ps⁻¹ to account for water viscosity.

For all of the following tests, the NMA was carried out using a Hessian which had been averaged over a 500 fs MD constant energy trajectory using leapfrog. The mass-weighted steepest descent minimizer was sufficient and extremely efficient for AD and BPTI. It generally reduced the PE to thresholds of u_th=0.1 kcal mol⁻¹, cf. Eq. 29, within two to five iterations for AD and two to eight iterations for BPTI.

Section 3A provides discussion on the choice of the number of modes to propagate. Sampling results are presented in Sec. 3B; rates for AD are shown in Sec. 3C. In Sec. 3D, we address the efficiency of NML. The stability of the thermostat is determined in Sec. 3E. In all cases, the results are compared to the LÎ method.

It is important to notice that the diagonalization of MRH often incurs negative eigenvalues for the lowest frequencies. Numerical experiments showed that the NML’s stability improves if these negative eigenvalues are not present. We eliminate this artifact by averaging the Hessian in the neighborhood of the minimized structure x₀. A comparison showing that the number of negative eigenvalues decreases with averaging trajectory length can be found in Fig. S3 of the SI.

Better minimization schemes, such as those based on quasi-Newton methods, may remove the negative eigenvalues without the necessity of averaging.

Choice of the number of modes to propagate

The number of modes propagated will determine the dynamics which can be recovered from the simulations. To estimate the dimension of the propagated subspace, we consider the dihedral space of the molecule. Based on some numerical experiments, there seems to be a correlation between the number of backbone plus side chain dihedrals and the number of modes that should be propagated in the slow subspace. BPTI has 57 residues, and thus 171 backbone dihedrals. Of these 57 residues, 42 have side chain dihedrals χ₁ thru possibly χ₄. GLY and ALA have no side chain dihedrals, and PRO has only two conformations. We use the following guesses for the number of modes. Note that six modes to account for translational and rotational invariances of the PE should be added to these numbers:

(1)
Number of backbone ϕ, ψ dihedrals: 114 for BPTI.
(2)
Number of backbone ϕ, ψ dihedrals plus side chain dihedrals: BPTI has 109 side chain dihedrals for a total of 223.
(3)
Number of backbone ϕ, ψ, ω dihedrals plus side chain dihedrals: for BPTI this number is 280.

We used for simplicity 100, 200, and 300 modes. The results below show that choice (3) above was a good guess on the number of modes needed to obtain very good agreement with the Ramachandran plot of the LÎ simulation. Choices (1) and (2) give rougher agreement: The location of the minima are preserved but their probability gets accentuated, which is consistent with the notion that the sampling is more localized to the low frequency events [see Figs. S5(a)–S5(d) in the SI, which show three-dimensional sampling plots]. Blocked AD has backbone ϕ, ψ, and ω but no side chain dihedrals. The analysis above is not helpful in the case of this small molecule.

Alternatively, if one selects a target time step, it is possible to find an approximate number of modes that can be propagated within the stability limit of the method. For example, we are given a model which can be propagated with the LÎ method using a time step of Δt_LI and has a target time step Δt_T. If the eigenvalues are ordered from smallest to largest and the largest has value λ_3N, then we find the ith eigenvalue such that

λ_{i} ⩽ \frac{λ_{3 N} Δ t_{L I}^{2}}{Δ t_{T}^{2}}

(42)

is satisfied. The required set of ordered eigenvectors is then {e₁,e₂,…,e_i}. Examples of potential speedups for BPTI, PIN1 WW domain, and calmodulin are in Fig. S9 and Table S1 in the SI.

Sampling using NML or NMI

Tests were conducted using NML to determine the minimum number of propagated modes, which was consistent with good sampling. The AD tests use the implicit solvent model of Eq. 38, and the BPTI tests use the SCPISM. The BPTI crystal structure was obtained from the protein data bank (PDB), with PDB id 6PTI. After short minimization and equilibration for 50 ps using CHARMM (Ref. ⁶⁴) version 33, the structure was run for 10 ns using PROTOMOL. The switching function used to partition the forces at a cutoff point is given in the SI. Both programs used the CHARMM 22 force field.

For AD, good sampling was observed when 22 (of 66) or more modes were propagated, and the Ramachandran free energy plot (in units of kcal mol⁻¹) for NML is comparable to that of LÎ in Fig. 1 at this point, both simulated for 500 ns. We use the metric proposed in Ref. 41 to compare the free energy plots. Over the regions which are well sampled on both simulations, that is, free energy <3k_BT, the root mean square deviation (RMSD) between the LÎ and NML free energy plots is 0.12 kcal mol⁻¹. This compares favorably with the results of the MTS diffusive Langevin method where a RMSD of 0.17 kcal mol⁻¹ was obtained for AD.⁴¹ Sampling of AD with NMI is reported in Figs. S4(a) and S4(b) in the SI. An optimum choice of damping coefficient for NMI allows recovery of sampling, as expected from the theory.

Comparison of AD Ramachandran free energy plots for (a) LÎ Δt=1 fs and (b) NML with 22 modes (of 66) propagated Δt=5.5 fs. Data were obtained from 500 ns trajectories with γ=91 ps⁻¹. The color bar represents the free energy in kcal mol⁻¹ and the RMSD free energy difference between the figures is 0.119 kcal mol⁻¹, which compares well with the MTS diffusive Langevin method (Ref. ⁴¹) RMSD of 0.17 kcal mol⁻¹.

The probability of being in the C7 equatorial, C5 axial, and α_R, as defined in Fig. 1, is given in Table 1 for both methods.

Table 1.

Probabilities of the AD model occupying C7 equatorial, C5 axial, and α_R positions in the Φ-Ψ plane, as defined in Fig. 1, for NML and LÎ propagators.

Method	C7 equ.	C5 ax.	α_R
LÎ	0.183	0.637	0.049
NML	0.271	0.632	0.039

Open in a new tab

This level of coarse graining allows a 5.5 fs time step with an efficiency increase of 2.1 for a minimization termination threshold Eq. 29 of 1.0 kcal mol⁻¹. It was found empirically that to obtain sampling comparable to that produced by the LÎ propagator required that the minimization in C^⊥ be resolved to 1 kcal mol⁻¹ or better for AD.

The BPTI Ramachandran free energy comparisons are shown in Fig. 2 for 300 modes propagated, of a total of 2646, for a 10 ns trajectory. The plots were constructed by including 43 of the 57 backbone residues dihedral ϕ-ψ angle pairs and were chosen to remove the residue set {1 4 5 6 12 28 29 36 37 47 49 50 56 57}. These dihedrals pairs are ones for which the minimized structure give points in the disallowed region of the Ramachandran plot and includes all GLY. Our conclusions are not modified if we include all dihedrals, but we do this for greater clarity. The results compare well with the LÎ results. Propagating 300 modes allows a 20 fs time step with an efficiency increase of 6, 7, 8, 9, 10, 11-fold for minimization termination thresholds 29 of 0.5–1.0 kcal mol⁻¹. The RMSD between the LÎ and NML free energy plots, over the regions which are well sampled on both simulations, that is, free energy <3k_BT, is 0.184 kcal mol⁻¹.

Comparison of BPTI Ramachandran free energy plots for (a) LÎ Δt=1 fs and γ=91 ps⁻¹ and (b) NML with 300 modes propagated (of 2646) with Δt=20 fs and γ=30 ps⁻¹. Data are for 43 of the 57 residues backbone dihedral Φ-Ψ pairs from 10 ns trajectories. The color bar represents the free energy in kcal mol⁻¹ and the RMSD free energy difference between the figures is 0.184 kcal mol⁻¹.

For Langevin propagators, stability or accuracy issues can arise if the condition γΔt⩽1 is not met. Since the viscosity of water requires a coefficient of around γ=91 ps⁻¹, then for a 1 fs time step, we have γΔt=0.091; however, for a 20 fs time step, we have γΔt=1.82. In our tests, we limit γ such that the condition is met for large time steps; for a time step of 20 fs, we use γ=30 ps⁻¹. Since there is experimental evidence that slow modes tend to be overdamped, we do not envisage this to be a problem until extremely large time steps are used for sampling. This might be an issue for kinetics.

To investigate the effect of a further reduced set of modes, the BPTI Ramachandran free energy for both 100 and 200 modes propagated, of a total of 2646, for 10 ns trajectories is shown in Fig. 3. Again the plots include 43 of the 57 residues backbone dihedral Φ-Ψ angle pairs. Depending on the minimization level (0.1–1.0 kcal mol⁻¹), propagating only 200 modes allows a 34 fs time step for real speedups between 12 and 16, whereas propagating 100 modes allows a 70 fs time step for real speedups between 21 and 30. The RMSD between the LÎ and NML free energy plots, over the regions which are well sampled on both simulations, that is, free energy 3k_BT, is 0.234 kcal mol⁻¹ for Fig. 3a and 0.213 kcal mol⁻¹ for Fig. 3b, which also compares well with the results of Ref. 41.

BPTI Ramachandran free energy plots for NML with (a) 200 modes propagated (of 2646) with Δt=34 fs and γ=28 ps⁻¹ and (b) 100 (of 2646) with Δt=70 fs and γ=12 ps⁻¹. Data are for 43 of the 57 residues backbone dihedral Φ-Ψ pairs from 10 ns trajectories. The color bar represents the free energy in kcal mol⁻¹; the RMSD free energy difference between (a) and Fig. 2a is 0.213 kcal mol⁻¹ and that between (b) and Fig. 2a is 0.234 kcal mol⁻¹.

Rate calculation using NML or NMI

With a small molecule like AD it is possible to sample for a sufficient length of time to measure the rates of transition between two states: In this case, we measure the isomerization rate between the C7 equatorial and α_R PE wells, cf. Ref. 65. For a model with states A and B, the rate from A to B, denoted k_AB, can be calculated using the approximation proposed by Best and Hummer⁶⁶ from the probability of transition P_TP and the average transition time ⟨t_TP⟩,

2 c_{A} k_{A B} = \frac{P_{T P}}{⟨ t_{T P} ⟩},

(43)

where c_A is the equilibrium mole fraction of conformation A. We consider conformation A to be C7 equatorial and C5 axial combined, giving c_A=0.83 from Table 1. Since the C7 axial conformation is also present for this model, we ignore trajectories which include it.

Rates for NMI were calculated for a model with 30 modes propagated and the remaining 36 modes overdamped. A time step of Δt=3 fs was used to solve Eq. 8 and the overdamped Eq. 14 was propagated with time steps Δτ of 0.5 and 1 fs. The behavior of the method with varying $\bar{γ}$ was assessed by determining the rates, and the results are shown in Fig. 4. Although the results compare poorly with the LÎ result of 2.75 ns⁻¹, the optimum $Δ τ ∕ \bar{γ}$ lies in the region between $λ_{3 N}^{- 1}$ , the discrete quadratic solution for a mass-weighted minimizer, and the stability limit for small $\bar{γ}$ .

Calculated transition rates for the overdamped NMI method for different $\bar{γ}$ , where 30 modes were propagated and the remaining 36 modes overdamped with time steps of Δτ=0.5 fs and Δτ=1 fs. The comparable rate for LÎ is 2.75 ns⁻¹. Results show that the optimum $Δ τ ∕ \bar{γ}$ lies in the region between $λ_{3 N}^{- 1}$ , the discrete quadratic solution for a mass-weighted minimizer, and the stability limit for small $\bar{γ}$ .

The NML method results are compared to the LÎ dynamics as shown in Figs. 5a, 5b for 100 ns trajectories with 1 fs time step and minimization thresholds of 0.1 and 0.5 kcal mol⁻¹, respectively. We denote the rates obtained by propagating the first n modes using NML as k_n and the rate obtained from LÎ as k_LI≈2.75 ns⁻¹. A similar rate calculation for AD on explicit solvent has been reported as 10 ns⁻¹.⁶⁷ The crude screened Coulomb implicit solvent model used probably accounts for the difference in rate with the LÎ simulation. The results show good correlation between k_n and k_LI in the region above n=24 for minimizations to 0.1 and 0.5 kcal mol⁻¹, with slightly better results for the lower threshold below n=36.

Calculated transition rates for NML for different minimization accuracies compared to LÎ (red line) for AD as a function of modes propagated. Mode number indicates the last propagated mode, after which all higher frequency mode forces are relaxed. The error bars are determined from running simulations with different initial random velocity seeds and initial conditions and represent one standard deviation. As expected, for mode 66, all modes are propagated and the rate is close to the LÎ rate.

We determined empirically that the transition time T fits a gamma distribution with shape parameter k=2.7 and scale parameter θ=2,549. That means that the standard deviation $σ_{T} = \sqrt{k} θ$ is close to its expected value E(T)=kθ. Thus, in Fig. 5 we do not use σ(T) to quantify error bars. Rather, we compute the standard deviation resulting from running four simulations for each data point, with two different initial random velocity seeds and two different initial configurations.

For NML to provide rates comparable to those produced by the LÎ propagator, it can be seen from Fig. 5 that the minimization threshold Eq. 27 must be resolved to 0.5 kcal mol⁻¹ or lower. This is a more stringent requirement than was found for sampling in Sec. 3B, where 1 kcal mol⁻¹ was sufficient.

From our results, we observe that in the case of AD, NML (a) permits the simulation of the transition between two energy minima (states α_R and C7 equatorial) and (b) provides rates that agree with LÎ. Despite the harmonic approximation of the Hessian that produced the original eigenvectors in state C7 equatorial, the low frequency eigenvectors are long lived enough to be able to allow a transition between C7 equatorial and α_R.

Efficiency of NML

The increased efficiency of the NML method comes primarily from the increased time steps available from reducing the number of modes to be propagated. The efficiency is then determined from the reduced number of propagation steps, inversely proportional to the time step, and the cost of minimization and diagonalization. Diagonalization can be amortized over the simulation length without significant cost. For AD, only one diagonalization was necessary to sample for up to 500 ns, whereas for BPTI, we sampled for up to 10 ns with one diagonalization.

As explained in Sec. 2A, the increase in time step is dependent on the frequency f_c at which the modes are partitioned and the relationship is shown for both the AD and BPTI models in Figs. 6 7. The diagrams in (a) show the maximum time step assuming a 1 fs time step for the LÎ method. To assess the stability margin of the method, the time steps were increased until the expected value, and variance of the kinetic energy deviated from the statistically correct values. A threshold of 5% error was used for trajectories of 1 ns for AD, and 0.1 ns for BPTI. On average this occurred when the actual time step is between two and three times the projected time step, which compares well with a similar test for the LÎ method where just under a 3 fs time step was possible for both models.⁶⁸ For both AD and BPTI, we observe a jump in the projected maximum time step in Figs. 6a, 7a (blue plot). This shows the separation of scales between the H–x (X is some heavy atom) bonds and confirms the twofold increase in time step observed empirically when using constraint methods such as SHAKE.³⁵^,³⁶^,³⁷

(a) AD maximum expected time step (blue) and maximum obtained time step for 1 ns trajectories (green). (b) NML efficiency increases by reducing the number of propagated modes. Mode number indicates the last propagated mode. Efficiency also depends on the minimization step, which has a computational cost depending on the threshold. We show curves for 1.0, 0.5, and 0.1 kcal mol⁻¹ thresholds compared to LÎ.

BPTI maximum expected time step (blue) and maximum obtained time step for 0.1 ns trajectories (green). (b) NML efficiency increases by reducing the number of propagated modes. Mode number indicates the last propagated mode. Efficiency also depends on the minimization step and we show curves for 1.0, 0.5, and 0.1 kcal mol⁻¹ thresholds compared to LÎ.

The efficiency calculations are based on 1 ns trajectories for the AD and 0.05 ns trajectories for BPTI using the PROTOMOL⁵⁴ package on a 3 GHz dual core Intel computer. The time steps used for different number of modes correspond with the eigenvector predicted values (blue line) rather than the maximum attainable for good stability and so that a comparison can be made with LÎ at 1 fs time step. For AD, a 1 ns trajectory took an average (over five runs) of 175.3 s and for the BPTI, a 0.05 ns trajectory took an average (again over five runs) of 6820.8 s. The minimizer used was the simple steepest descent using mass weighting of the forces and the minimization was carried out to different accuracies. The range of the Lennard–Jones and Coulombic forces is limited by using a C² continuity switch, details of which are given in the SI. For AD, the switch distances were switched on at 4.5 Å and off at 5.5 Å for the minimizer, and switch on at 9 Å and off at 12 Å for the propagator. For BPTI, the switch distances were switch on at 5.5 Å and off at 6.5 Å for the minimizer, and switch on at 15 Å and off at 20 Å for the propagator. The cost of diagonalization was amortized over the 1 and 0.05 ns trajectories, respectively. For AD, the diagonalization took less than 0.2 s, for BPTI, all of the modes were found in 57 s and 300 of 2646 could be found in 31 s.

The results are shown in Figs. 6b, 7b. It is clear that the small size of the AD limits the efficiency gains possible, as the calculation of both minimization and propagator force fields is of comparable cost. Propagating 22 modes using a 5.5 fs time step (the point beyond which sampling and rates begin to deviate from the LÎ results) gives an efficiency increase of 2.1 times. For BPTI, minimizing to 1.0 kcal mol⁻¹, which is appropriate for sampling, gives the following efficiency gains: 11 for propagating 300 modes using a time step of 20 fs, 16 for propagating 200 modes using a time step of 34 fs, and 30 for propagating 100 modes using a time step of 70 fs.

The cost of the method can be separated into the normal CHARMM force field calculations, e.g., O(N) for cutoff computations or multigrid methods or O(N ln N) for particle mesh Ewald; the cost for diagonalization O(N³), and the NM projection cost O(mN), where m is number of low frequency modes.

For the simple AD model, only one diagonalization was required, but we envisage that this will need to be carried out periodically for larger systems. If we diagonalize as often as once every N steps, the cost reduces to O(N²).

The projection cost will depend on the number of modes propagated; if this is proportional to the size of the system, this will have a cost of O(N²). However, larger systems will have a richer set of low frequency modes, so it may be desirable to increase the time step, reducing the cost to the point that, at fixed m, we have O(N). This trend can be observed even in the small models discussed above: To obtain sampling comparable to the LÎ propagator for the 22 atom model required 30% of the modes, whereas the 882 atom model required only 11%. Note that the latter value is not a lower bound, that is, we did not perform an exhaustive analysis to determine the minimum number of modes that need to be kept for BPTI.

Thermostat stability

We investigated the stability of our thermostat by studying the effect of increasing the time step on the mean and variance of the kinetic energy of the system. In Fig. 8, results for AD for time steps from 1 to 16 fs are presented when propagating 22 modes. The results show that the thermostat works well beyond the predicted time step of 5.5 fs and is consistent with stability results for Langevin dynamics integrators.⁶⁸^,⁶⁹

Kinetic energy stability for AD, propagating 22 modes, with increasing time step. The time step predicted from the eigenvalues is 5.5 fs.

DISCUSSION

We have developed a method that allows a long time step for Langevin dynamics simulations. The degrees of freedom of the biomolecule are partitioned into its normal modes ranked by frequency. The low frequency modes are propagated according to a Langevin integrator and the high frequency modes are dampened near their energy minimum and propagated using an accelerated Brownian dynamics propagator. This scheme allows us to speed up the run time step twofold for a small system such as AD. We also show that it scales extremely well with system size, subject to the number of modes propagated: The time step is increased 11-fold for a molecule of 882 atoms (BPTI). The two Langevin integrators presented here satisfy fluctuation dissipation.

Interest has grown in the use of internal coordinates as an alternative to Cartesian coordinates for the calculation of normal modes. Since they describe the positions of the atoms in terms of distances, angles, and dihedral angles with respect to some origin atom, they are seen as a more natural method of describing the geometry of biomolecules. However, the equations of motion in nonlinear coordinates become very difficult to solve. In contrast, NML can be used with well established force fields such as CHARMM 22, which are readily calculated in Cartesian coordinates.

In our tests, sampling and kinetics of small biomolecules are preserved when even only 10%–30% of modes are propagated, as indicated by the transition rates and Ramachandran plots studied. NML permits an increase in run time efficiency which is close to the theoretical limit of being proportional to inverse time step for a few propagated modes. The increase in time step is essential to simulate time scales that are currently out of reach of conventional dynamics. NML can be readily applied with better implicit solvent models, such as Poisson–Boltzmann or generalized Born (GB), which could improve agreement with explicitly solvated simulations and experiment. All that is needed is to compute Hessians. For instance, the package NAB implements GB with analytical Hessians.⁷⁰

We would like to apply NML to study proteins undergoing conformational changes. We need to understand when the propagated space C no longer adequately describes the real dynamical space of the protein. In this case, NML needs to perform a rediagonalization to obtain an updated propagated space. Work is underway to develop a metric to measure the validity of C and trigger rediagonalization.

The speed of the method can also be optimized. In particular, the diagonalization of the averaged Hessian with a few propagated modes can be improved through a Krylov subspace approximation, and it might be possible to construct more efficient minimizers. Since NML is linear-algebra intensive, it would map well to modern specialized architectures such as graphics processing unit.⁷¹

ACKNOWLEDGMENTS

J.A.I. acknowledges partial funding from NSF Grant Nos. CCF-0135195, DBI-0450067, and CCF-0622940. V.S.P. and P.P. acknowledge support from the Simbios Center (NIH U54 GM072970). J.A.I. performed this work while on sabbatical at the Institute for Computational and Mathematical Engineering at Stanford University. Santanu Chatterjee, Paul Brenner, and Trevor Cickovski from Notre Dame coded several aspects of this method and provided many of the analysis scripts. Thanks to John Chodera and Peter Minary from Stanford University and Bob Skeel from Purdue University for critical reading of earlier versions of the manuscript and several suggestions. The authors acknowledge helpful discussions with Eric Darve from Stanford. Sergio Hassan from NIH greatly assisted us in correctly implementing SCPISM in our code and comparing it to the CHARMM implementation.

APPENDIX: PROJECTION MATRICES

The NMA partitions the system into two subspaces, C spanned by the matrix of low frequency eigenvectors Q and C^⊥ spanned by the high frequency eigenvectors $\bar{Q}$ . To project the system positions x, velocities v, and forces f onto these subspaces requires a set of projection matrices which we derive below. The position and velocity projection matrices are equivalent and we denote them as P_x and $P_{x}^{⊥}$ for C and C^⊥, respectively. We denote the force projection matrices as P_f for C and $P_{f}^{⊥}$ for C^⊥.

The choice of system partition separates the positions around the equilibrium point x₀ into $\hat{x}$ in C and $\bar{x}$ in C^⊥ such that $x = \hat{x} + \bar{x} + x_{0}$ . Differentiating this equation with respect to time gives the associated velocity equation $v = \hat{v} + \bar{v}$ . The projections are

\hat{x} = P_{x} (x - x_{0}), \bar{x} = P_{x}^{⊥} (x - x_{0}),

(A1)

\hat{v} = P_{x} v, \bar{v} = P_{x}^{⊥} v,

(A2)

where

P_{x} = M^{- 1 ∕ 2} Q Q^{T} M^{1 ∕ 2},

(A3)

and

P_{x}^{⊥} = M^{- 1 ∕ 2} (I - Q Q^{T}) M^{1 ∕ 2}

(A4)

from Eq. 3, where the positions are mapped from the Cartesian space to mode space and back to Cartesian space.

We also need projection matrices applicable to the forces. Given the system forces f, we can determine $\hat{f}$ in C by considering Newton’s second law $M \ddot{x} = f$ and Eq. 3 to derive a projection matrix. We can map Newton’s second law into $\hat{c}$ , the modes in the C subspace

\ddot{\hat{c}} = Q^{T} M^{- 1 ∕ 2} f .

(A5)

To evolve the system in Cartesian space, we can map the equation back to the Cartesian space and multiply through by M. The normal mode subspace force projected onto the real space is

\hat{f} = P_{f} f = (M^{1 ∕ 2} Q Q^{T} M^{- 1 ∕ 2}) f

(A6)

for C force projection matrix P_f.

$\bar{f}$ can be found in a similar manner to $\hat{f}$ but using the complement of the projection matrix

\bar{f} = P_{f}^{⊥} f = (M^{1 ∕ 2} (I - Q Q^{T}) M^{- 1 ∕ 2}) f

(A7)

for C^⊥ force projection matrix $P_{f}^{⊥}$ .

References

Levitt C. S. M. and Stern P., J. Mol. Biol. 10.1016/0022-2836(85)90230-X 181, 423 (1985). [DOI] [PubMed] [Google Scholar]
Marques O. and Sanejouand Y., Proteins 10.1002/prot.340230410 23, 557 (1995). [DOI] [PubMed] [Google Scholar]
Tama F. and Sanejouand Y., Protein Eng. 10.1093/protein/14.1.1 14, 1 (2001). [DOI] [PubMed] [Google Scholar]
Li G. and Cui Q., Biophys. J. 86, 743 (2004). [DOI] [PMC free article] [PubMed] [Google Scholar]
Ma J., Structure (London) 10.1016/j.str.2005.02.002 13, 373 (2005). [DOI] [PubMed] [Google Scholar]
Tama O. M. F. and M. C. L.BrooksIII, J. Struct. Biol. 10.1016/j.jsb.2004.03.002 147, 315 (2004). [DOI] [PubMed] [Google Scholar]
Tama O. M. F., Gadea F. X., and Sanejouand Y., Proteins 41, 1 (2000). [DOI] [PubMed] [Google Scholar]
Delarue M. and Sanejouand Y., J. Mol. Biol. 10.1016/S0022-2836(02)00562-4 320, 1011 (2002). [DOI] [PubMed] [Google Scholar]
Durand G. T. P. and Sanejouand Y., Biopolymers 10.1002/bip.360340608 34, 759 (1994). [DOI] [Google Scholar]
Brooks B. and Karplus M., Proc. Natl. Acad. Sci. U.S.A. 10.1073/pnas.82.15.4995 82, 4995 (1985). [DOI] [PMC free article] [PubMed] [Google Scholar]
Li G. and Cui Q., Biophys. J. 83, 2457 (2002). [DOI] [PMC free article] [PubMed] [Google Scholar]
Bahar A. A. I. and Erman B., Folding Des. 10.1016/S1359-0278(97)00024-2 2, 173 (1997). [DOI] [PubMed] [Google Scholar]
Doruker R. J. P. and Bahar I., J. Comput. Chem. 10.1002/jcc.1160 23, 119 (2002). [DOI] [PubMed] [Google Scholar]
Balsera M. A., Wriggers W., Oono Y., and Schulten K., J. Phys. Chem. 10.1021/jp9536920 100, 2567 (1996). [DOI] [Google Scholar]
Amadei A., Linnsen A. B. M., and Berendsen H. J. C., Proteins: Struct., Funct., Genet. 10.1002/prot.340170408 17, 412 (1993). [DOI] [PubMed] [Google Scholar]
Zhang G. and Schlick T., J. Comput. Chem. 10.1002/jcc.540141011 14, 1212 (1993). [DOI] [Google Scholar]
Zhang Z., Shi Y., and Liu H., Biophys. J. 84, 3583 (2003). [DOI] [PMC free article] [PubMed] [Google Scholar]
Atilgan A. R., Durell S. R., Jernigan R. L., Demirel M. C., Keskin O., and Bahar I., Biophys. J. 10.1016/S0006-3495(01)76033-X 80, 505 (2001). [DOI] [PMC free article] [PubMed] [Google Scholar]
Kolossvary I. and Keser G., J. Comput. Chem. 22, 21 (2001). [DOI] [Google Scholar]
Suhre K. and Sanejouand Y.-H., Nucleic Acids Res. 10.1093/nar/gkh368 32, W610 (2004). [DOI] [PMC free article] [PubMed] [Google Scholar]
Space B. and Herschel R., J. Chem. Phys. 10.1063/1.465573 99, 9070 (1993). [DOI] [Google Scholar]
Askar B. S. A. and Rabitz A., J. Phys. Chem. 10.1021/j100019a017 99, 7330 (1995). [DOI] [Google Scholar]
Lange O. and Grubmuller H., J. Chem. Phys. 10.1063/1.2199530 124, 2149 (2006). [DOI] [PubMed] [Google Scholar]
Schlick E. B. T. and Mandziuk M., Annu. Rev. Biophys. Biomol. Struct. 10.1146/annurev.biophys.26.1.181 26, 181 (1997). [DOI] [PubMed] [Google Scholar]
Nyberg A. and Schlick T., Chem. Phys. Lett. 10.1016/0009-2614(92)85028-9 198, 538 (1992). [DOI] [Google Scholar]
Watanabe K., J. Phys. Chem. 10.1021/j100015a061 99, 5680 (1995). [DOI] [Google Scholar]
Gibson S., J. Comput. Chem. 10.1002/jcc.540110406 11, 468 (1990). [DOI] [Google Scholar]
Scully H., Mol. Simul. 10.1080/08927029308022177 11, 67 (1993). [DOI] [Google Scholar]
Streett D. T. W. B. and Saville G., Mol. Phys. 10.1080/00268977800100471 35, 639 (1978). [DOI] [Google Scholar]
Grubmüller H., Heller H., Windemuth A., and Schulten K., Mol. Simul. 10.1080/08927029108022142 6, 121 (1991). [DOI] [Google Scholar]
Tuckerman M., Berne B. J., and Martyna G. J., J. Chem. Phys. 10.1063/1.463137 97, 1990 (1992). [DOI] [Google Scholar]
Ma Q., Izaguirre J. A., and Skeel R. D., SIAM J. Sci. Comput. (USA) 10.1137/S1064827501399833 24, 1951 (2003). [DOI] [Google Scholar]
García-Archilla B., Sanz-Serna J. M., and Skeel R. D., SIAM J. Sci. Comput. (USA) 10.1137/S1064827596313851 20, 930 (1998). [DOI] [Google Scholar]
Izaguirre J. A., Reich S., and Skeel R. D., J. Chem. Phys. 10.1063/1.478995 110, 9853 (1999). [DOI] [Google Scholar]
Ryckaert G. C. J. P. and Berendsen H. J., J. Comput. Phys. 10.1016/0021-9991(77)90098-5 23, 327 (1977). [DOI] [Google Scholar]
van Gunsteren W. F. and Berendsen H. J. C., Mol. Phys. 10.1080/00268977700102571 34, 1311 (1977). [DOI] [Google Scholar]
Andersen H., J. Comput. Phys. 10.1016/0021-9991(83)90014-1 52, 24 (1983). [DOI] [Google Scholar]
Hammonds K. D. and Ryckaert J.-P., Comput. Phys. Commun. 10.1016/0010-4655(91)90105-T 62, 336 (1991). [DOI] [Google Scholar]
Ciccotti G., Ferrario M., and Ryckaert J.-P., Mol. Phys. 10.1080/00268978200100942 47, 1253 (1982). [DOI] [Google Scholar]
van Gunsteren W. F. and Karplus M., Macromolecules 10.1021/ma00234a015 15, 1528 (1982). [DOI] [Google Scholar]
Eastman P. and Doniach S., Proteins 30, 215 (1998). [DOI] [PubMed] [Google Scholar]
van Gunsteren W. F., Mol. Phys. 10.1080/00268978000102101 40, 1015 (1980). [DOI] [Google Scholar]
Kottalam J. and Case D., Biopolymers 10.1002/bip.360291008 29, 1409 (1990). [DOI] [PubMed] [Google Scholar]
Smith J., Q. Rev. Biophys. 24, 227 (1991). [DOI] [PubMed] [Google Scholar]
A. D.MacKerell, Jr., Bashford D., Bellott M., R. L.Dunbrack, Jr., Evanseck J., Field M. J., Fischer S., Gao J., Guo H., Ha S., Joseph D., Kuchnir L., Kuczera K., Lau F. T. K., Mattos C., Michnick S., Ngo T., Nguyen D. T., Prodhom B., Roux B., Schlenkrich M., Smith J., Stote R., Straub J., Watanabe M., Wiòrkiewicz-Kuczera J., Yin D., and Karplus M., FASEB J. A143, 6 (1992). [Google Scholar]
A. D.MacKerell, Jr., Bashford D., Bellott M., R. L.Dunbrack, Jr., Evanseck J., Field M. J., Fischer S., Gao J., Guo H., Ha S., Joseph-McCarthy D., Kuchnir L., Kuczera K., Lau F. T. K., Mattos C., Michnick S., Ngo T., Nguyen D. T., Prodhom B., W. E.ReiherIII, Roux B., Schlenkrich M., Smith J. C., Stote R., Straub J., Watanabe M., Wiòrkiewicz-Kuczera J., Yin D., and Karplus M., , J. Phys. Chem. B 10.1021/jp973084f 102, 3586 (1998). [DOI] [PubMed] [Google Scholar]
Petrone P. and Pande V. S., Biophys. J. 10.1529/biophysj.105.070045 90, 1583 (2006). [DOI] [PMC free article] [PubMed] [Google Scholar]
Leimkuhler B. and Reich S., Simulating Hamiltonian Dynamics (Cambridge University Press, Cambridge, 2004). [Google Scholar]
Batcho P. F. and Schlick T., J. Chem. Phys. 10.1063/1.1389855 115, 4019 (2001). [DOI] [Google Scholar]
Skeel R. D. and Izaguirre J. A., Mol. Phys. 10.1080/0026897021000018321 100, 3885 (2002). [DOI] [Google Scholar]
Ermak D. L. and McCammon J. A., J. Chem. Phys. 10.1063/1.436761 69, 1352 (1987). [DOI] [Google Scholar]
Kloeden P. E. and Platen E., Numerical Solution of Stochastic Differential Equations, Applications of Mathematics: Stochastic Modelling and Applied Probability Vol. 23 (Springer-Verlag, New York, 1992) (second corrected printing, 1995). [Google Scholar]
Gardiner C. W., Handbook of Stochastic Methods for Physics, Chemistry, and the Natural Sciences (Springer-Verlag, New York, 1985). [Google Scholar]
Matthey T., Cickovski T., Hampton S. S., Ko A., Ma Q., Nyerges M., Raeder T., Slabach T., and Izaguirre J. A., ACM Trans. Math. Softw. 10.1145/1024074.1024075 30, 237 (2004). [DOI] [Google Scholar]
See EPAPS Document No. E-JCPSA6-128-016810 for comparison of NML and NMI showing optimal damping coefficient; Input files, analysis scripts and executable to reproduce computational test results; and illustration of localized dynamics when using SMD. For more information on EPAPS, see (http://www.aip.org/pubservs/epaps.html).
Shen K. F. F. M.-Y., J. Comput. Chem. 10.1002/jcc.20211 26, 691 (2005). [DOI] [PubMed] [Google Scholar]
Hassan S., Mehler E., Zhang D., and Weinstein H., Proteins: Struct., Funct., Genet. 10.1002/prot.10330 51, 109 (2003). [DOI] [PubMed] [Google Scholar]
Hassan S., Guarneri F., and Mehler E., J. Phys. Chem. B 10.1021/jp993895e 104, 6478 (2000). [DOI] [Google Scholar]
Hassan S. A., Guarnieri F., and Mehler E., J. Phys. Chem. B 10.1021/jp9938967 104, 6490 (2000). [DOI] [Google Scholar]
Hassan S. and Mehler E., Proteins: Struct., Funct., Genet. 10.1002/prot.10059 47, 45 (2002). [DOI] [PubMed] [Google Scholar]
Hassan S., J. Phys. Chem. B 50, 19501 (2004). [Google Scholar]
Hassan S. and Mehler E., Int. J. Quantum Chem. 10.1002/qua.20526 102, 986 (2005). [DOI] [Google Scholar]
Li X., Hassan S., and Mehler E., Proteins: Struct., Funct., Bioinf. 60, 464 (2005). [DOI] [PMC free article] [PubMed] [Google Scholar]
Brooks B. R., Bruccoleri R. E., Olafson B. D., States D. J., Swaminathan S., and Karplus M., J. Comput. Chem. 10.1002/jcc.540040211 4, 187 (1983). [DOI] [Google Scholar]
Chodera J. D., Swope W. C., Pitera J. W., and Dill K. A., Multiscale Model. Simul. 10.1137/06065146X 5, 1214 (2006). [DOI] [Google Scholar]
Best R. B. and Hummer G., Proc. Natl. Acad. Sci. U.S.A. 10.1073/pnas.0408098102 102, 6732 (2005). [DOI] [PMC free article] [PubMed] [Google Scholar]
Bolhuis P. G., Dellago C., and Chandler D., Proc. Natl. Acad. Sci. U.S.A. 10.1073/pnas.100127697 97, 5877 (2000). [DOI] [PMC free article] [PubMed] [Google Scholar]
Mishra B. and Schlick T., J. Chem. Phys. 10.1063/1.471875 105, 299 (1996). [DOI] [Google Scholar]
Beard D. A. and Schlick T., J. Chem. Phys. 10.1063/1.481331 112, 7313 (2000). [DOI] [Google Scholar]
Brown R. and Case D., J. Comput. Chem. 10.1002/jcc.20479 27, 1662 (2006). [DOI] [PubMed] [Google Scholar]
Elsen E., Houston M., Vishal V., Darve E., Hanrahan P., and Pande V., Proceedings of Supercomputing, 2006. (unpublished).

[c1] Levitt C. S. M. and Stern P., J. Mol. Biol. 10.1016/0022-2836(85)90230-X 181, 423 (1985). [DOI] [PubMed] [Google Scholar]

[c2] Marques O. and Sanejouand Y., Proteins 10.1002/prot.340230410 23, 557 (1995). [DOI] [PubMed] [Google Scholar]

[c3] Tama F. and Sanejouand Y., Protein Eng. 10.1093/protein/14.1.1 14, 1 (2001). [DOI] [PubMed] [Google Scholar]

[c4] Li G. and Cui Q., Biophys. J. 86, 743 (2004). [DOI] [PMC free article] [PubMed] [Google Scholar]

[c5] Ma J., Structure (London) 10.1016/j.str.2005.02.002 13, 373 (2005). [DOI] [PubMed] [Google Scholar]

[c6] Tama O. M. F. and M. C. L.BrooksIII, J. Struct. Biol. 10.1016/j.jsb.2004.03.002 147, 315 (2004). [DOI] [PubMed] [Google Scholar]

[c7] Tama O. M. F., Gadea F. X., and Sanejouand Y., Proteins 41, 1 (2000). [DOI] [PubMed] [Google Scholar]

[c8] Delarue M. and Sanejouand Y., J. Mol. Biol. 10.1016/S0022-2836(02)00562-4 320, 1011 (2002). [DOI] [PubMed] [Google Scholar]

[c9] Durand G. T. P. and Sanejouand Y., Biopolymers 10.1002/bip.360340608 34, 759 (1994). [DOI] [Google Scholar]

[c10] Brooks B. and Karplus M., Proc. Natl. Acad. Sci. U.S.A. 10.1073/pnas.82.15.4995 82, 4995 (1985). [DOI] [PMC free article] [PubMed] [Google Scholar]

[c11] Li G. and Cui Q., Biophys. J. 83, 2457 (2002). [DOI] [PMC free article] [PubMed] [Google Scholar]

[c12] Bahar A. A. I. and Erman B., Folding Des. 10.1016/S1359-0278(97)00024-2 2, 173 (1997). [DOI] [PubMed] [Google Scholar]

[c13] Doruker R. J. P. and Bahar I., J. Comput. Chem. 10.1002/jcc.1160 23, 119 (2002). [DOI] [PubMed] [Google Scholar]

[c14] Balsera M. A., Wriggers W., Oono Y., and Schulten K., J. Phys. Chem. 10.1021/jp9536920 100, 2567 (1996). [DOI] [Google Scholar]

[c15] Amadei A., Linnsen A. B. M., and Berendsen H. J. C., Proteins: Struct., Funct., Genet. 10.1002/prot.340170408 17, 412 (1993). [DOI] [PubMed] [Google Scholar]

[c16] Zhang G. and Schlick T., J. Comput. Chem. 10.1002/jcc.540141011 14, 1212 (1993). [DOI] [Google Scholar]

[c17] Zhang Z., Shi Y., and Liu H., Biophys. J. 84, 3583 (2003). [DOI] [PMC free article] [PubMed] [Google Scholar]

[c18] Atilgan A. R., Durell S. R., Jernigan R. L., Demirel M. C., Keskin O., and Bahar I., Biophys. J. 10.1016/S0006-3495(01)76033-X 80, 505 (2001). [DOI] [PMC free article] [PubMed] [Google Scholar]

[c19] Kolossvary I. and Keser G., J. Comput. Chem. 22, 21 (2001). [DOI] [Google Scholar]

[c20] Suhre K. and Sanejouand Y.-H., Nucleic Acids Res. 10.1093/nar/gkh368 32, W610 (2004). [DOI] [PMC free article] [PubMed] [Google Scholar]

[c21] Space B. and Herschel R., J. Chem. Phys. 10.1063/1.465573 99, 9070 (1993). [DOI] [Google Scholar]

[c22] Askar B. S. A. and Rabitz A., J. Phys. Chem. 10.1021/j100019a017 99, 7330 (1995). [DOI] [Google Scholar]

[c23] Lange O. and Grubmuller H., J. Chem. Phys. 10.1063/1.2199530 124, 2149 (2006). [DOI] [PubMed] [Google Scholar]

[c24] Schlick E. B. T. and Mandziuk M., Annu. Rev. Biophys. Biomol. Struct. 10.1146/annurev.biophys.26.1.181 26, 181 (1997). [DOI] [PubMed] [Google Scholar]

[c25] Nyberg A. and Schlick T., Chem. Phys. Lett. 10.1016/0009-2614(92)85028-9 198, 538 (1992). [DOI] [Google Scholar]

[c26] Watanabe K., J. Phys. Chem. 10.1021/j100015a061 99, 5680 (1995). [DOI] [Google Scholar]

[c27] Gibson S., J. Comput. Chem. 10.1002/jcc.540110406 11, 468 (1990). [DOI] [Google Scholar]

[c28] Scully H., Mol. Simul. 10.1080/08927029308022177 11, 67 (1993). [DOI] [Google Scholar]

[c29] Streett D. T. W. B. and Saville G., Mol. Phys. 10.1080/00268977800100471 35, 639 (1978). [DOI] [Google Scholar]

[c30] Grubmüller H., Heller H., Windemuth A., and Schulten K., Mol. Simul. 10.1080/08927029108022142 6, 121 (1991). [DOI] [Google Scholar]

[c31] Tuckerman M., Berne B. J., and Martyna G. J., J. Chem. Phys. 10.1063/1.463137 97, 1990 (1992). [DOI] [Google Scholar]

[c32] Ma Q., Izaguirre J. A., and Skeel R. D., SIAM J. Sci. Comput. (USA) 10.1137/S1064827501399833 24, 1951 (2003). [DOI] [Google Scholar]

[c33] García-Archilla B., Sanz-Serna J. M., and Skeel R. D., SIAM J. Sci. Comput. (USA) 10.1137/S1064827596313851 20, 930 (1998). [DOI] [Google Scholar]

[c34] Izaguirre J. A., Reich S., and Skeel R. D., J. Chem. Phys. 10.1063/1.478995 110, 9853 (1999). [DOI] [Google Scholar]

[c35] Ryckaert G. C. J. P. and Berendsen H. J., J. Comput. Phys. 10.1016/0021-9991(77)90098-5 23, 327 (1977). [DOI] [Google Scholar]

[c36] van Gunsteren W. F. and Berendsen H. J. C., Mol. Phys. 10.1080/00268977700102571 34, 1311 (1977). [DOI] [Google Scholar]

[c37] Andersen H., J. Comput. Phys. 10.1016/0021-9991(83)90014-1 52, 24 (1983). [DOI] [Google Scholar]

[c38] Hammonds K. D. and Ryckaert J.-P., Comput. Phys. Commun. 10.1016/0010-4655(91)90105-T 62, 336 (1991). [DOI] [Google Scholar]

[c39] Ciccotti G., Ferrario M., and Ryckaert J.-P., Mol. Phys. 10.1080/00268978200100942 47, 1253 (1982). [DOI] [Google Scholar]

[c40] van Gunsteren W. F. and Karplus M., Macromolecules 10.1021/ma00234a015 15, 1528 (1982). [DOI] [Google Scholar]

[c41] Eastman P. and Doniach S., Proteins 30, 215 (1998). [DOI] [PubMed] [Google Scholar]

[c42] van Gunsteren W. F., Mol. Phys. 10.1080/00268978000102101 40, 1015 (1980). [DOI] [Google Scholar]

[c43] Kottalam J. and Case D., Biopolymers 10.1002/bip.360291008 29, 1409 (1990). [DOI] [PubMed] [Google Scholar]

[c44] Smith J., Q. Rev. Biophys. 24, 227 (1991). [DOI] [PubMed] [Google Scholar]

[c45] A. D.MacKerell, Jr., Bashford D., Bellott M., R. L.Dunbrack, Jr., Evanseck J., Field M. J., Fischer S., Gao J., Guo H., Ha S., Joseph D., Kuchnir L., Kuczera K., Lau F. T. K., Mattos C., Michnick S., Ngo T., Nguyen D. T., Prodhom B., Roux B., Schlenkrich M., Smith J., Stote R., Straub J., Watanabe M., Wiòrkiewicz-Kuczera J., Yin D., and Karplus M., FASEB J. A143, 6 (1992). [Google Scholar]

[c46] A. D.MacKerell, Jr., Bashford D., Bellott M., R. L.Dunbrack, Jr., Evanseck J., Field M. J., Fischer S., Gao J., Guo H., Ha S., Joseph-McCarthy D., Kuchnir L., Kuczera K., Lau F. T. K., Mattos C., Michnick S., Ngo T., Nguyen D. T., Prodhom B., W. E.ReiherIII, Roux B., Schlenkrich M., Smith J. C., Stote R., Straub J., Watanabe M., Wiòrkiewicz-Kuczera J., Yin D., and Karplus M., , J. Phys. Chem. B 10.1021/jp973084f 102, 3586 (1998). [DOI] [PubMed] [Google Scholar]

[c47] Petrone P. and Pande V. S., Biophys. J. 10.1529/biophysj.105.070045 90, 1583 (2006). [DOI] [PMC free article] [PubMed] [Google Scholar]

[c48] Leimkuhler B. and Reich S., Simulating Hamiltonian Dynamics (Cambridge University Press, Cambridge, 2004). [Google Scholar]

[c49] Batcho P. F. and Schlick T., J. Chem. Phys. 10.1063/1.1389855 115, 4019 (2001). [DOI] [Google Scholar]

[c50] Skeel R. D. and Izaguirre J. A., Mol. Phys. 10.1080/0026897021000018321 100, 3885 (2002). [DOI] [Google Scholar]

[c51] Ermak D. L. and McCammon J. A., J. Chem. Phys. 10.1063/1.436761 69, 1352 (1987). [DOI] [Google Scholar]

[c52] Kloeden P. E. and Platen E., Numerical Solution of Stochastic Differential Equations, Applications of Mathematics: Stochastic Modelling and Applied Probability Vol. 23 (Springer-Verlag, New York, 1992) (second corrected printing, 1995). [Google Scholar]

[c53] Gardiner C. W., Handbook of Stochastic Methods for Physics, Chemistry, and the Natural Sciences (Springer-Verlag, New York, 1985). [Google Scholar]

[c54] Matthey T., Cickovski T., Hampton S. S., Ko A., Ma Q., Nyerges M., Raeder T., Slabach T., and Izaguirre J. A., ACM Trans. Math. Softw. 10.1145/1024074.1024075 30, 237 (2004). [DOI] [Google Scholar]

[c55] See EPAPS Document No. E-JCPSA6-128-016810 for comparison of NML and NMI showing optimal damping coefficient; Input files, analysis scripts and executable to reproduce computational test results; and illustration of localized dynamics when using SMD. For more information on EPAPS, see (http://www.aip.org/pubservs/epaps.html).

[c56] Shen K. F. F. M.-Y., J. Comput. Chem. 10.1002/jcc.20211 26, 691 (2005). [DOI] [PubMed] [Google Scholar]

[c57] Hassan S., Mehler E., Zhang D., and Weinstein H., Proteins: Struct., Funct., Genet. 10.1002/prot.10330 51, 109 (2003). [DOI] [PubMed] [Google Scholar]

[c58] Hassan S., Guarneri F., and Mehler E., J. Phys. Chem. B 10.1021/jp993895e 104, 6478 (2000). [DOI] [Google Scholar]

[c59] Hassan S. A., Guarnieri F., and Mehler E., J. Phys. Chem. B 10.1021/jp9938967 104, 6490 (2000). [DOI] [Google Scholar]

[c60] Hassan S. and Mehler E., Proteins: Struct., Funct., Genet. 10.1002/prot.10059 47, 45 (2002). [DOI] [PubMed] [Google Scholar]

[c61] Hassan S., J. Phys. Chem. B 50, 19501 (2004). [Google Scholar]

[c62] Hassan S. and Mehler E., Int. J. Quantum Chem. 10.1002/qua.20526 102, 986 (2005). [DOI] [Google Scholar]

[c63] Li X., Hassan S., and Mehler E., Proteins: Struct., Funct., Bioinf. 60, 464 (2005). [DOI] [PMC free article] [PubMed] [Google Scholar]

[c64] Brooks B. R., Bruccoleri R. E., Olafson B. D., States D. J., Swaminathan S., and Karplus M., J. Comput. Chem. 10.1002/jcc.540040211 4, 187 (1983). [DOI] [Google Scholar]

[c65] Chodera J. D., Swope W. C., Pitera J. W., and Dill K. A., Multiscale Model. Simul. 10.1137/06065146X 5, 1214 (2006). [DOI] [Google Scholar]

[c66] Best R. B. and Hummer G., Proc. Natl. Acad. Sci. U.S.A. 10.1073/pnas.0408098102 102, 6732 (2005). [DOI] [PMC free article] [PubMed] [Google Scholar]

[c67] Bolhuis P. G., Dellago C., and Chandler D., Proc. Natl. Acad. Sci. U.S.A. 10.1073/pnas.100127697 97, 5877 (2000). [DOI] [PMC free article] [PubMed] [Google Scholar]

[c68] Mishra B. and Schlick T., J. Chem. Phys. 10.1063/1.471875 105, 299 (1996). [DOI] [Google Scholar]

[c69] Beard D. A. and Schlick T., J. Chem. Phys. 10.1063/1.481331 112, 7313 (2000). [DOI] [Google Scholar]

[c70] Brown R. and Case D., J. Comput. Chem. 10.1002/jcc.20479 27, 1662 (2006). [DOI] [PubMed] [Google Scholar]

[c71] Elsen E., Houston M., Vishal V., Darve E., Hanrahan P., and Pande V., Proceedings of Supercomputing, 2006. (unpublished).

PERMALINK

Normal mode partitioning of Langevin dynamics for biomolecules

Christopher R Sweet

Paula Petrone

Vijay S Pande

Jesús A Izaguirre

Abstract

INTRODUCTION

THEORY OF NORMAL MODE PARTITIONING

NMA and Hessian diagonalization

Propagation using the Langevin equation

MTS discretization (NMI)

NML discretization

NML minimization

COMPUTATIONAL TESTS

Choice of the number of modes to propagate

Sampling using NML or NMI

Figure 1.

Table 1.

Figure 2.

Figure 3.

Rate calculation using NML or NMI

Figure 4.

Figure 5.

Efficiency of NML

Figure 6.

Figure 7.

Thermostat stability

Figure 8.

DISCUSSION

ACKNOWLEDGMENTS

APPENDIX: PROJECTION MATRICES

References

ACTIONS

PERMALINK

RESOURCES

Cite

Add to Collections

PERMALINK

Normal mode partitioning of Langevin dynamics for biomolecules

Christopher R Sweet

Paula Petrone

Vijay S Pande

Jesús A Izaguirre

Abstract

INTRODUCTION

THEORY OF NORMAL MODE PARTITIONING

NMA and Hessian diagonalization

Propagation using the Langevin equation

MTS discretization (NMI)

NML discretization

NML minimization

COMPUTATIONAL TESTS

Choice of the number of modes to propagate

Sampling using NML or NMI

Figure 1.

Table 1.

Figure 2.

Figure 3.

Rate calculation using NML or NMI

Figure 4.

Figure 5.

Efficiency of NML

Figure 6.

Figure 7.

Thermostat stability

Figure 8.

DISCUSSION

ACKNOWLEDGMENTS

APPENDIX: PROJECTION MATRICES

References

ACTIONS

PERMALINK

RESOURCES

Similar articles

Cited by other articles

Links to NCBI Databases