Abstract
Self-guided molecular/Langevin dynamics (SGMD/SGLD) simulation methods were developed to enhance conformational sampling through promoting low frequency motion of molecular systems and have been successfully applied in many simulation studies. Quantitative understanding of conformational distribution in SGLD has been achieved by separating microscopic properties according to frequency. However, a missing link between the guiding factors and conformational distributions makes it highly empirical and system dependent when choosing the values of the guiding parameters. Based on the understanding that molecular interactions are the source of energy barriers and diffusion friction, this work reformulates the equation of the low frequency motion to resemble Langevin dynamics. This reformulation leads to new forms of guiding forces and establishes a relation between the guiding factors and conformational distributions. We call simulations with these new guiding forces the generalized self-guided molecular/Langevin dynamics (SGMDg/SGLDg). In addition, we present a new way to calculate low frequency properties and an efficient algorithm to implement SGMDg/SGLDg that minimizes memory usage and inter-processor communication. Through example simulations with a skewed double well system, an argon fluid, and a cryo-EM map flexible fitting case, we demonstrate the guiding effects on conformational distributions and conformational searching.
INTRODUCTION
The self-guided (SG) molecular simulation methods, namely, the self-guided molecular dynamics (SGMD)1,2 and the self-guided Langevin dynamics (SGLD),3–5 were developed for efficient conformational searching. Most enhanced conformational sampling methods utilize bias potentials to help overcome energy barriers, such as metadynamics,6 accelerated molecular dynamics,7 and self-adapted accelerated molecular dynamics.8 SG methods do not rely on a priori energy barrier information to enhance sampling. Instead, they achieve an enhanced conformational search by promoting low frequency motion. Unlike high temperature simulations that raise all thermal motion, SG simulations focus only on the low frequency motion, which is a well-known limiting factor for conformational searching and sampling. Because the low frequency motion accounts for a very small portion of thermal motion, an enhancement in the low frequency motion, together with a slight suppression in the high frequency motion, causes relatively small perturbations to the conformational distribution but dramatically accelerates conformational search. Therefore, SG simulations sample mostly the original conformational distribution. SG methods are unique in extracting low frequency properties through a simple local averaging scheme. SGMD/SGLD has been applied to many studies of long time scale events such as peptide folding,9–12 conformational reorganization,13 conformational state recognition,14 conformational transitions,15,16 and protein denaturation.17
One main question about SG simulations is how the guiding forces quantitatively affect the conformational distribution. By separating microscopic properties (i.e., momentum, force, and potential energy) into low frequency and high frequency parts, we were able to quantitatively describe the SGLD conformational distribution, which allows ensemble properties to be calculated through reweighting.4 Based on this understanding, we developed the force–momentum-based self-guided Langevin dynamics (SGLDfp) simulation method18 to directly sample canonical ensembles. Later, following the generalized Langevin equation (GLE), we developed the SGLD-GLE method, which samples exactly the canonical ensemble.19 However, these approaches do not apply to SGMD, where no friction constant is available to define the momentum guiding force. In addition, the guiding effects on conformational distributions are not quantitatively related to the guiding factors, which makes choosing the values of the guiding factors highly empirical and system dependent.
In molecular systems, molecular interactions not only determine conformational distribution, but also affect molecular movement. Molecular interactions act as collisions to change moving directions and as friction to slow atom migration. Based on this understanding, we reformulate the equation of the low frequency motion to resemble Langevin dynamics, from which we propose the generalized self-guided molecular/Langevin dynamics simulation method, abbreviated as SGMDg or SGLDg. This reformulation produces new forms of guiding forces and allows us to establish a relation between the guiding factors and the conformational distribution, for both SGMDg and SGLDg.
In the “Theory and Methods” section, we first describe the SGMDg/SGLDg method and then introduce a new way to calculate low frequency properties and an efficient algorithm to implement the method. In the “Demonstration Simulations” section, we present the conformational distribution and conformational search of the new method with several example systems.
THEORY AND METHODS
Reformulate the equation of the low frequency motion
The Newtonian equation of motion for any atom has the following form:
| $\dot{p} = F$ | (1) |
Here, p is the atom’s momentum and $\dot{p}$ is its time derivative. F is the apparent force acting on the atom, including all contributions such as molecular interactions, constraint forces from SHAKE20 if present, friction forces and random forces in Langevin dynamics, velocity scaling for constant temperature simulations, and the guiding forces in a self-guided simulation.
In a many body system, atoms interact with surrounding atoms and experience frequent collisions. As a result, atoms move along zigzagging paths. The energy surface is highly rugged, which acts as friction to slow down atom motion, as barriers to trap atoms, and as walls to change atoms’ moving directions. Even more complicated, this energy surface changes with time due to the motion of the local environment.
However, if we look at the energy surface as a big picture, it can be generalized as a smooth surface with noise. This smooth surface can be created through certain averaging schemes. In SG simulations, we use the following local average scheme along trajectories:1–3
| (2) |
Here, P is any time dependent property and $\tilde{P}$ is its local average. tL is the local average time, which defines a frequency threshold, 1/tL. Properties with frequencies higher than this threshold are smoothed, and those with lower frequencies are changed little by the averaging.
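The local averaging scheme can be illustrated with a short sketch. It assumes the recursive, exponentially weighted update commonly used in SG simulations, $\tilde{P}(t) = (1 - \delta t/t_L)\,\tilde{P}(t - \delta t) + (\delta t/t_L)\,P(t)$. The function name `local_average` and the choice to start the average at the first sample are illustrative assumptions, not part of the method's definition.

```python
def local_average(series, dt, t_local):
    """Running local average of a scalar time series (illustrative sketch).

    Assumed update: avg <- (1 - dt/t_local) * avg + (dt/t_local) * value.
    """
    weight = dt / t_local
    avg = series[0]  # assumption: start the average at the first sample
    averaged = []
    for value in series:
        avg = (1.0 - weight) * avg + weight * value
        averaged.append(avg)
    return averaged


# A fast oscillation (far above the 1/t_local threshold) is smoothed away,
# while a constant (zero-frequency) signal passes through unchanged.
fast = [(-1.0) ** n for n in range(2000)]
slow = [3.0] * 2000
fast_avg = local_average(fast, dt=0.001, t_local=0.2)
slow_avg = local_average(slow, dt=0.001, t_local=0.2)
```

With these inputs, the averaged fast signal settles near zero, whereas the averaged constant signal stays at its original value, which is the frequency separation the text describes.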
By performing local averaging on both sides of Eq. (1), we obtain the following equation of the low frequency motion:
| $\dot{\tilde{p}} = \tilde{F}$ | (3) |
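The low frequency apparent force, $\tilde{F}$, can be split into three parts: a smoothed force, which is a local average of $\tilde{F}$, denoted as $\tilde{\tilde{F}}$; a friction force, $-\xi\tilde{p}$; and the remaining part resembling a collision force, $\tilde{R} = \tilde{F} - \tilde{\tilde{F}} + \xi\tilde{p}$. Therefore, Eq. (3) is reformulated to mimic Langevin dynamics,
| $\dot{\tilde{p}} = \tilde{\tilde{F}} - \xi\tilde{p} + \tilde{R}$ | (4) |
Here, $\tilde{R}$ is not truly a random force as suggested in Langevin dynamics, and ξ represents an apparent friction constant that can be estimated from simulation based on the requirement that $\tilde{R}$ is independent of $\tilde{p}$,
| $\langle \tilde{R} \cdot \tilde{p} \rangle = \langle (\tilde{F} - \tilde{\tilde{F}} + \xi\tilde{p}) \cdot \tilde{p} \rangle = 0$ | (5) |
From Eq. (5), we can solve for ξ,
| $\xi = -\dfrac{\langle (\tilde{F} - \tilde{\tilde{F}}) \cdot \tilde{p} \rangle}{\langle \tilde{p} \cdot \tilde{p} \rangle}$ | (6) |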
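As a minimal sketch, the apparent friction constant of one atom could be estimated from stored low frequency properties as follows. The sketch assumes Eq. (6) has the form $\xi = -\langle(\tilde{F} - \tilde{\tilde{F}})\cdot\tilde{p}\rangle / \langle\tilde{p}\cdot\tilde{p}\rangle$, where $\tilde{\tilde{F}}$ is the local average of the low frequency force $\tilde{F}$; this form follows from requiring the collision force to be uncorrelated with the low frequency momentum. The function and argument names are hypothetical.

```python
def apparent_friction(force_dev, p_avg):
    """Estimate the apparent friction constant of one atom (illustrative sketch).

    force_dev: sequence of 3-vectors, low frequency force minus its local average
    p_avg:     sequence of 3-vectors, low frequency momentum at the same frames
    """
    fp = 0.0  # accumulates (force deviation) . (low frequency momentum)
    pp = 0.0  # accumulates (low frequency momentum) . (low frequency momentum)
    for dev, mom in zip(force_dev, p_avg):
        fp += sum(d * m for d, m in zip(dev, mom))
        pp += sum(m * m for m in mom)
    return -fp / pp


# Synthetic check: if the force deviation is a pure friction term, -xi * p,
# the estimator should recover xi.
momenta = [(1.0, 0.0, -2.0), (0.5, 1.5, 0.0), (-1.0, 2.0, 0.5)]
xi_true = 2.5
deviations = [tuple(-xi_true * c for c in p) for p in momenta]
xi_est = apparent_friction(deviations, momenta)
```

Because both averages run over the same frames, the ratio of the accumulated sums equals the ratio of the averages, so no explicit normalization is needed.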
Equation (4) describes a low frequency motion on a smoothed local average energy surface under the effects of a friction force and collision force. The friction force and the collision force control the temperature of the low frequency motion.
Guiding forces to manipulate the low frequency motion
The low frequency motion described by Eq. (4) is important for conformational searches. While thermal motion of a molecular system is constrained by ensemble variables such as temperature in an NVT ensemble and total energy in an NVE ensemble, the low frequency motion can be manipulated by adjusting the friction force and/or the collision force in Eq. (4).
From Eq. (4), we can see that the low frequency properties that control the low frequency motion are $\tilde{p}$ and $\tilde{F} - \tilde{\tilde{F}}$. Changes in the friction force or the collision force can be achieved by adding forces proportional to $\tilde{p}$ or $\tilde{F} - \tilde{\tilde{F}}$. In other words, $\tilde{p}$ or $\tilde{F} - \tilde{\tilde{F}}$ can be used as guiding vectors to manipulate low frequency motion. Therefore, we can write the equation of the self-guided low frequency motion in the following form:
| (7) |
where λ is the momentum guiding factor and μ is the force guiding factor. Please note the differences in these guiding forces from previous definitions. Previous SGLD methods3,18,19 use the Langevin friction constant, γ, to define the momentum guiding force, $\lambda\gamma\tilde{p}$. Because γ = 0 in an MD simulation, no valid momentum guiding force is defined for SGMD.1,2 Now that the momentum guiding force uses the apparent friction constant, which exists in both MD and LD, it can be equally applied to both SGLD and SGMD. Previously, SGMD and SGLDfp methods1,2,18 used the low frequency force as the guiding force, which significantly biases toward low energy states. Now the force guiding force uses the deviation of the low frequency forces, $\tilde{F} - \tilde{\tilde{F}}$, reducing the bias effect toward low energy states.
Replacing the low frequency motion in Eq. (3) by Eq. (7) and substituting it into Eq. (1), we obtain the equation of the generalized self-guided motion,
| (8) |
With the two guiding factors, λ and μ, the low frequency motion is promoted to achieve enhanced conformational search. While both guiding forces can influence the low frequency motion, the momentum guiding force promotes diffusion limited conformational search, and the force guiding force promotes energy barrier limited conformational search. We call simulations based on Eq. (8) generalized self-guided molecular dynamics (SGMDg) when γ = 0 and generalized self-guided Langevin dynamics (SGLDg) when γ > 0.
Conformational distribution of SGMDg/SGLDg
While accelerating conformational search is crucial for studying long time scale events, an accurate conformational distribution is valuable for quantitative studies. It is therefore important to know how the guiding forces affect the conformational distribution.
First, let us take a look at the collision force. Because this collision force is due to the deviation of the low frequency force, it is proportional to the apparent force F. Assuming the energy surface is scaled up by 1 + μ, the low frequency equation of motion would be
| (9) |
Because the apparent friction constant, ξ, is proportional to the time correlation of the collision force, which is proportional to F, the apparent friction constant must be proportional to the square of the scaling factor. Therefore, the new collision force would have the following relation with the scaling factor:
| (10) |
This new collision force corresponds to an energy surface scaled up by 1 + μ. To maintain a conformational distribution unchanged on the low frequency energy surface, the low frequency temperature needs to be scaled up by 1 + μ, which can be achieved by scaling down the friction constant by ,
| (11) |
With the new random force given by Eq. (10) and the new apparent friction constant given by Eq. (11), the equation of the low frequency motion has the following form:
| (12) |
Comparing Eq. (12) with Eq. (7), we can see in order to maintain the conformational distribution, the momentum guiding factor should be
| $\lambda_\mu = (1+\mu)^2 - \dfrac{1}{1+\mu}$ | (13) |
We call λμ the balanced momentum guiding factor. Equation (13) is derived to maintain the conformational distribution on the low frequency energy surface. However, it does not consider the effect on high frequency motion, and it assumes that the guiding factors are small and that the low frequency motion is negligible compared with the high frequency motion. The momentum guiding force, $\lambda\xi\tilde{p}$, can cancel the effect of the force guiding force, $\mu(\tilde{F} - \tilde{\tilde{F}})$, if λ and μ satisfy Eq. (13). Also, from Eq. (13), we can calculate the balanced force guiding factor, μλ, from λ,
| $\mu_\lambda = \sqrt[3]{\tfrac{1}{2} + \sqrt{\tfrac{1}{4} - \tfrac{\lambda^3}{27}}} + \sqrt[3]{\tfrac{1}{2} - \sqrt{\tfrac{1}{4} - \tfrac{\lambda^3}{27}}} - 1$ | (14) |
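The balanced relation can be checked numerically with a short sketch. It assumes the balanced momentum guiding factor has the form $\lambda_\mu = (1+\mu)^2 - 1/(1+\mu)$, a form adopted here because it reproduces the balanced pairs quoted in this paper (λ = 1 with μ = 0.3247, and λ = −1 with μ = −0.318). The function names and the bisection bounds are illustrative choices.

```python
def balanced_lambda(mu):
    """Balanced momentum guiding factor for a force guiding factor mu (assumed form)."""
    x = 1.0 + mu
    return x * x - 1.0 / x


def balanced_mu(lam, lo=-0.9, hi=2.0, tol=1e-12):
    """Invert balanced_lambda by bisection.

    balanced_lambda increases monotonically for mu > -1, so a single root
    exists between the (assumed) bounds lo and hi for the lambda range used here.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if balanced_lambda(mid) < lam:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


mu_for_lambda_plus1 = balanced_mu(1.0)
mu_for_lambda_minus1 = balanced_mu(-1.0)
```

Under this assumed form, the two computed values agree with the factors 0.3247 and −0.318 used in the demonstration simulations, and μ = 0 maps to λ = 0, i.e., a regular simulation.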
Because the guiding force corresponds to an energy surface , we can infer that the guiding force corresponds to an energy surface . For any SG simulation with the guiding factors, λ and μ, the potential energy surface is incremented by . Therefore, the partition function has the following form:
| (15) |
An ensemble average of any property can be calculated in a SG simulation through reweighting,
| (16) |
Here, the summation is over simulation conformations and i is the conformation index. For convenience, index i is dropped in all following discussions. From the partition function, Eq. (15), we can see that the reweighting factor is given by
| (17) |
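The reweighting step can be sketched as a normalized weighted average over saved conformations. The sketch assumes Eq. (16) has the usual form $\langle P \rangle = \sum_i w_i P_i / \sum_i w_i$ and takes the per-conformation weights of Eq. (17) as given inputs; the function name and the sample numbers are hypothetical.

```python
def reweighted_average(values, weights):
    """Weighted ensemble average of a property over saved conformations (sketch)."""
    total_weight = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total_weight


# Hypothetical property values for three conformations.
energies = [1.0, 2.0, 3.0]
# With all weights equal to 1 (balanced guiding factors), this is a plain mean.
uniform = reweighted_average(energies, [1.0, 1.0, 1.0])
# Unequal weights shift the average toward the more heavily weighted conformations.
skewed = reweighted_average(energies, [1.0, 1.0, 2.0])
```

In practice the weights would be evaluated for each saved conformation during the simulation, so the reweighted average can be accumulated on the fly.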
When μ = μλ, we have w(λ, μ) = 1 and a SG simulation will sample the canonical ensemble. Through Eqs. (14)–(17), we achieve our goal of establishing a relation between the guiding factors and the conformational distribution. In previous SGLD methods, the reweighting factor depends directly on the low frequency energy, which makes reweighting difficult for large systems.4,5,18 The reweighting factor described by Eq. (17) depends on the deviation of low frequency energies, which has a much smaller variance than the low frequency energies themselves, therefore making reweighting easier for large systems.
Please note that the above derivation involves only one particle’s properties and is independent of other particles. Therefore, it can be extended in such a way that each particle has its own guiding factors. One typical application is a system of protein in water, where the guiding forces can be applied only to protein atoms, with the guiding factors being zero for water molecules.
The momentum guiding factors represent a reduction in the apparent friction to the low frequency motion. The upper limit would be λ = 1, which corresponds to no friction to the low frequency motion. For the force guiding factor, the lower limit would be the one to have a balanced momentum guiding factor of λμ = −1, which corresponds to μ = −0.318. Therefore, the recommended guiding factor range is λ ≤ 1 and μ ≥ −0.318. To achieve maximum acceleration, we suggest using λ = 1, μ = 0 or λ = 0, μ = −0.318. Clearly, these ranges are not system dependent.
Since the development of the self-guided Langevin dynamics via generalized Langevin equation (SGLD-GLE) method,19 we have developed several versions of “generalized” self-guided simulation methods for both Langevin dynamics and molecular dynamics. These methods have been implemented as “SGLDg” in AMBER since 2014 and in CHARMM since 2016. However, no direct relation was built between the momentum guiding factor and the conformational distribution. The SGMDg/SGLDg method presented here introduces the apparent friction constant and the apparent collision force, which allow the momentum guiding factor to be directly related to the conformational distribution. We plan to implement SGMDg/SGLDg in future versions of AMBER and CHARMM to replace previous self-guided simulation methods, including SGLDfp and “SGLDg” methods.
Calculation of local average properties
SG simulations employ a very efficient way to estimate low-frequency properties through the so-called local averaging,
| (18) |
Here, P represents an instantaneous property, such as interaction energy, force, or momentum. A series of local average properties must be calculated for a SG simulation, e.g., $\tilde{p}$ and $\tilde{F}$. F may contain many components, such as molecular interaction force, constraint force, friction force, and random force. It is beneficial to make these calculations more efficient. We find that many local average properties can be calculated from local average positions,
| (19) |
Therefore, the velocity and momentum local averages can be calculated by
| (20) |
| (21) |
The low frequency apparent force can also be related to momentum and so to positions,
| (22) |
Therefore, we have
| (23) |
| (24) |
Because some algorithms for the integration of the equation of motion, like the leap frog algorithm, do not automatically provide velocities at current time step, t, we can calculate current velocity or momentum from local average properties,
| (25) |
With Eqs. (20)–(25), all low frequency properties needed for SG simulations can be calculated from position properties: r(t), $\tilde{r}(t)$, and $\tilde{\tilde{r}}(t)$. In periodic boundary simulations, when a particle transfers from one box to another, all three position properties must be transferred together. For example, in a rectangular box, L = (Lx, Ly, Lz), the periodic boundary condition (PBC) is applied to all position properties: r(t), $\tilde{r}(t)$, and $\tilde{\tilde{r}}(t)$,
| (26a) |
| (26b) |
| (26c) |
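The requirement that all three position properties move together can be sketched as follows. The sketch assumes a rectangular box with coordinates wrapped into [0, L) along each axis, and it shifts the local average positions by the same lattice vector as the instantaneous position so that their differences, from which the low frequency properties are derived, are unchanged. The function name and the wrapping convention are assumptions for illustration.

```python
import math


def wrap_position_set(r, r_avg, r_avg2, box):
    """Apply one common periodic shift to r and its two local averages (sketch).

    r:      instantaneous position (x, y, z)
    r_avg:  local average position
    r_avg2: local average of the local average position
    box:    box lengths (Lx, Ly, Lz); wrapping into [0, L) is an assumed convention
    """
    wrapped = ([], [], [])
    for x, xa, xaa, length in zip(r, r_avg, r_avg2, box):
        shift = length * math.floor(x / length)  # decided by r only
        wrapped[0].append(x - shift)
        wrapped[1].append(xa - shift)   # same shift, so r - r_avg is preserved
        wrapped[2].append(xaa - shift)  # same shift, so r_avg - r_avg2 is preserved
    return wrapped


r_new, r_avg_new, r_avg2_new = wrap_position_set(
    [10.5, -0.5, 3.0], [10.2, -0.1, 3.1], [10.0, 0.2, 3.2], (10.0, 10.0, 10.0)
)
```

Note that the shifted averages may lie slightly outside the box; only the instantaneous position decides the shift, which keeps the derived momenta and forces continuous across the boundary.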
A leap-frog algorithm for SGMDg/SGLDg
Here, we present a leap-frog algorithm to implement the generalized SG simulation method. This algorithm needs only extra vector arrays to store , , and and uses these arrays to calculate all other quantities. In addition, atoms are independent of each other to integrate their equations of motion, saving inter-process communication, that benefits parallel computing.
1. At each time step, t, calculate the potential energy, Ep, and forces, f, which include the random forces in Langevin dynamics.
2. Calculate local average properties.
Local average energies:
| (27) |
| (28) |
From $\tilde{E}_p$ and $\tilde{\tilde{E}}_p$, we can calculate the reweighting factor, w(λ, μ), of this conformation according to Eq. (17).
Local average positions:
| (29) |
| (30) |
To calculate local average forces, we need the local average momentum and the momentum at the current time step,
| (31) |
| (32) |
So, we can calculate
| (33) |
3. Calculate apparent friction constants
We need ensemble averages of $(\tilde{F} - \tilde{\tilde{F}}) \cdot \tilde{p}$ and $\tilde{p} \cdot \tilde{p}$ to calculate the apparent friction constant of each atom according to Eq. (6). To calculate them on the fly, we replace the ensemble averages by long time local averages. An average time, tavg, typically 10 times the local average time, tL, is defined for this purpose. Two scalar arrays, FP and PP, are used to store the averages,
| (34) |
| (35) |
| (36) |
The averages and friction constants are calculated for each atom. Alternatively, the apparent friction constants of atoms can be calculated from previous simulations and read in at the beginning of a simulation to avoid their calculation on the fly.
4. Calculate the guiding forces:
| (37) |
5. Calculate the energy conservation scaling factor:
An energy conservation scaling factor, η, is used to cancel energy input due to the guiding force. Previously, a uniform energy conservation factor was used for all atoms. Its calculation needs the guiding forces and momenta of all atoms and requires extra inter-processor communication in parallel computing. Here, we use atom-specific scaling factors whose calculation needs no information from other atoms, therefore saving inter-processor communication in parallel computing. The atom-specific scaling factor is determined by the energy conservation requirement,
| (38) |
Here, is the average velocity during the time step, and . γ is the friction constant in Langevin dynamics. γ = 0 for a MD simulation. From Eq. (38), we can solve
| (39) |
Here, .
6. Calculate velocities at t + δt/2:
| (40) |
Please note that, for SGLDg, γ is the friction constant and f is the interaction force plus the random force, and for SGMDg, γ = 0 and there is no random force in f.
7. Calculate positions at t + δt:
| (41) |
8. Go back to step 1 and repeat the above steps for the next time step.
DEMONSTRATION SIMULATIONS
SGMDg/SGLDg employs new forms of guiding forces, and the guiding factors are quantitatively related to conformational distributions. Here, we present several example simulations to demonstrate the effects of the guiding factors on conformational distribution and on conformational search.
A skewed double well system
We designed the following potential function to produce a skewed double well (SDW) surface,
| (42) |
where a = 1 kcal/mol is the well depth parameter, y0 = 1 Å is the well position parameter, b = 0.25 kcal/mol is the skew parameter, and c = 1000 kcal/mol is the restriction parameter. Figure 1 shows the energy profile along the y-axis at x = 0 and z = 0. As can be seen, there are two wells defined by the first term of Eq. (42). These two wells have different depths as specified by the second term of Eq. (42). The energy surface is narrowly restricted in the x and z directions as specified by the third term of Eq. (42), which represents the high frequency dimensions where motions are not supposed to be enhanced.
FIG. 1.
The skewed double well potential profile along the y-axis at x = 0 and z = 0. The potential is defined by Eq. (42) with a = 1 kcal/mol, b = 0.25 kcal/mol, c = 1000 kcal/mol, and y0 = 1 Å.
We performed 100 ns SGLDg simulations of an argon atom on the SDW energy surface with a friction constant of 10/ps. The local average time is set to tL = 0.2 ps. Figure 2 shows the distributions of the argon atom along the y-axis at various combinations of guiding factors. The top panel examines the momentum guiding force effect on conformational distribution (λ ≠ 0, μ = 0). Compared with the LD simulation (black line), which corresponds to λ = 0, μ = 0, SGLDg with λ = −1, μ = 0 shows a higher peak at the low energy well and a lower peak at the high energy well (red line). With λ = 1, μ = 0, SGLDg produces a lower peak at the low energy well and a higher peak at the high energy well (green line). These results show that the momentum guiding force favors high energy states.
FIG. 2.
Distributions of the SDW system along the y-axis in the SGLDg simulations. Simulations are performed at γ = 10/ps, T = 100 K. The guiding factors are labeled in each panel. Top panel: simulations with only momentum guiding forces. Middle panel: simulations with only force guiding forces. Bottom panel: simulations with balanced guiding forces.
The middle panel of Fig. 2 examines the effect of the force guiding force (λ = 0, μ ≠ 0). When λ = 0, μ = −0.318, compared with LD (black line), SGLDg increases the high energy peak and decreases the low energy peak (red line). When λ = 0, μ = 0.3247, SGLDg increases the low energy peak and decreases the high energy peak (green line). These results indicate that the force guiding force favors low energy states.
The bottom panel of Fig. 2 compares SGLDg at λ = −1, μ = −0.318 and λ = 1, μ = 0.3247 with LD (black line). These are balanced guiding factors satisfying Eq. (13). Both SGLDg distributions overlap with the LD distribution, indicating that the bias effects from the momentum guiding force and from the force guiding force cancel each other. These results confirm that Eq. (13) is a relation for balanced guiding factors.
The average potential energies of the SDW system with different guiding factors are plotted in Fig. 3. With only the momentum guiding force, the potential energy increases with λ because a larger λ results in a stronger low frequency motion, which in turn samples more high energy states. With only the force guiding force, the potential energy decreases with μ or λμ because a larger μ increases more of the local average potential energy, which favors low energy states. In SGLDg simulations with balanced guiding forces, average energies remain almost constant from λ = −1 to λ = 1, indicating that the effects of the two guiding forces cancel each other.
FIG. 3.
Average potential energies of the SDW system from the SGLDg simulations. For SGLDg simulations with λ = 0, x-axis uses the λμ values converted from μ according to Eq. (13).
The bias effects of the guiding forces can be described quantitatively by the reweighting factor calculated through Eq. (17) and the conformational distribution can be corrected through reweighting using Eq. (16). The reweighted averages from the SGLDg simulations are also shown in Fig. 3. As can be seen from Fig. 3, the reweighted averages agree well with the canonical averages from the LD (λ = 0, μ = 0) simulation.
The SDW system allows us to examine the energy barrier crossing ability. Figure 4 shows the number of energy barrier crossings during the SGLDg simulations. The crossing numbers clearly depend strongly on the guiding factors. A larger momentum guiding factor results in a stronger low frequency motion and provides more power to overcome energy barriers. A more negative force guiding factor corresponds to a lower energy barrier, making energy barrier crossing easier. Balanced guiding factors result in little change in crossing numbers. Therefore, a positive λ or a negative μ is an effective way to enhance energy barrier crossing.
FIG. 4.
Number of energy barrier crossings during the SGLDg simulations of the SDW system. For SGLDg simulations with λ = 0, x-axis uses the λμ values converted from μ according to Eq. (13).
Argon fluid
An argon fluid under periodic boundary condition is a convenient system to examine conformational sampling of molecular dynamics. The system contains 500 argon atoms interacting with Lennard-Jones potential. Long-range contributions are calculated with the isotropic periodic sum (IPS) method.21 The IPS radius or the cutoff distance is 10 Å. All simulations are performed at a constant temperature of 100 K and a constant volume of 28.53 × 28.53 × 28.53 Å3.
SGMDg simulations of 10 ns are performed with various guiding factors. The local average time is set to tL = 0.2 ps. The potential energies are shown in Fig. 5. A regular MD simulation is just a SGMDg simulation at λ = 0, μ = 0. As can be seen in Fig. 5, when μ = 0, a larger λ results in a higher average potential energy. When λ = 0, a larger μ results in a lower average energy. With balanced λ and μ that satisfy Eq. (13), the energies depend only weakly on the guiding factors, indicating that the guiding force effects mostly cancel each other. Because Eq. (13) is derived with the consideration of only the low frequency motion, we do not expect the effects of the guiding forces to cancel exactly.
FIG. 5.
Potential energies of the argon fluid in the SGMDg simulations. For SGMDg simulations with λ = 0, x-axis uses the λμ value converted from μ according to Eq. (13).
Figure 6 shows the power spectra of the argon fluid from the SGMDg simulations. The top panel shows the SGMDg results with the force guiding force only (i.e., λ = 0). When the force guiding factor changes from negative to positive, the spectrum’s high frequency part goes up while the spectrum’s low frequency part goes down. This means that the force guiding force suppresses the low frequency motion and enhances the high frequency motion. The middle panel shows the SGMDg results with only the momentum guiding forces (i.e., μ = 0). When the momentum guiding factor changes from negative to positive, the high frequency portion goes down and the low frequency portion goes up, indicating the momentum guiding force enhances the low frequency motion and suppresses the high frequency motion, opposite to the force guiding force.
FIG. 6.
The spectra of the argon fluid obtained from the SGMDg simulations. The guiding factors are labeled on each panel. Top panel: simulations with only force guiding forces, middle panel: simulations with only momentum guiding forces; bottom panel: simulations with balanced guiding forces.
The bottom panel shows the SGMDg simulations with balanced guiding factors. The overall effect is that when the momentum guiding factor changes from negative to positive, the high frequency portion goes down and the low frequency portion goes up, like the cases with only the momentum guiding force but with smaller changing scale. Figure 6 clearly shows how the guiding forces alter molecular motion in a frequency dependent way.
Diffusion constants correspond to the lowest frequency motion. Figure 7 shows the diffusion constant in the SGMDg simulations. The curve with μ = 0 represents the momentum guiding force effect on diffusion. As can be seen, a larger λ leads to a higher diffusion constant. The curve with λ = 0 shows the force guiding force effect on diffusion. In contrast to the momentum guiding force, a larger μ results in a lower diffusion constant. With balanced guiding factors, larger guiding factors still result in higher diffusion constants, indicating that the two guiding forces do not cancel each other exactly.
FIG. 7.
Diffusion constants of the argon fluid in the SGMDg simulations. For SGMDg simulations with λ = 0, x-axis uses λμ values converted from μ according to Eq. (13).
Cryo-EM flexible fitting
Macromolecular simulation studies rely heavily on conformational search. Thousands or millions of degrees of freedom make conformational search a very time-consuming process. An enhancement in conformational search efficiency will save a significant amount of time and improve the quality of simulation results. Here, we use a MapSGLD flexible fitting22 as an example to demonstrate the acceleration of conformational search by SGLDg.
We use the cryo-EM structure of SARS-CoV-2 spike ectodomain (pdb code: 6vxx)23 as a template to generate a model of the protein using CHARMM-GUI.24,25 There are many missing residues in the 6vxx structure, possibly due to insufficient density in the cryo-EM map to locate these residues. These missing residues are modeled by CHARMM-GUI, as shown in Fig. 8(a) (colored blue). Superimposing the model with the cryo-EM map (EMDB code: emd_21452), we can see many of these modeled residues stick out of the map [Fig. 8(b)]. The positions of these modeled residues are very likely to deviate from the true structure and cannot be determined with the map density only. To fix these structure uncertainties, MapSGLD is an excellent tool to perform flexible fitting to derive a structure agreeing with the map and favored by the force field. Because flexible fitting involves local conformational changes, there are many energy barriers to cross. SGLDg provides enhanced energy barrier crossing ability and can accelerate the flexible fitting process.
FIG. 8.
MapSGLD flexible fitting of SARS-CoV-2 spike ectodomain. (a) The homotrimer model of the spike protein built from the cryo-EM structure (pdb code: 6vxx, cyan). The missing residues are modeled through CHARMM-GUI and are colored blue. (b) The model superimposed with the cryo-EM map (EMDB code: emd_21452). The missing residues are in the low-density regions insufficient to determine their positions. (c) The structure after the MapSGLD simulation. The missing residues (colored purple) are positioned according to map density and the force field, including solvation.
The system has 3416 residues, including 3 protein chains and 63 sugar molecules. There are 53 376 atoms in total. To compare the conformational search efficiency, we perform SGLDg with four sets of parameters: (a) λ = 0, μ = 0, which is an LD simulation, (b) λ = 1, μ = 0.324, which are balanced guiding factors to preserve the canonical ensemble, (c) λ = 1, μ = 0, which employs only the momentum guiding force, and (d) λ = 0, μ = −0.318, which utilizes only the force guiding force. All simulations are performed at γ = 1/ps, T = 300 K, and tL = 0.2 ps. The SCPISM implicit solvation model26 is used to describe the solvent effect. The map potential constant is 0.1 kcal/mol. The system is first minimized for 200 steps using the adopted basis Newton–Raphson method27 to remove clashes between atoms. Starting from the same energy minimized structure, 5000 ps simulations are performed with the four guiding factor sets.
The flexible fitting processes can be monitored by the potential energies. Figure 9 shows the total potential energies during the four simulations. These total potential energies contain molecular interaction, solvation, and map restraint. To suppress the fluctuations in the energy profiles, 10 ps sub-averages of the potential energies are plotted in Fig. 9. Also, to clearly show the trend of energy changes, the x-axis is plotted in a logarithmic scale. As can be seen, the four curves descend at different rates. The simulation with the balanced guiding forces (red line) descends slower at first than LD (black line) but catches up and reaches lower energies than the LD simulation at 200 ps and remains at lower energies thereafter. The simulation with only the momentum guiding force (green line) has a lower energy than the LD simulation after 40 ps, while the simulation with only the force guiding force (blue line) has a lower energy since the beginning and has the lowest energies throughout the simulation period.
FIG. 9.
Sub-average potential energies during MapSGLD simulations. Sub-averages are calculated over a 10 ps period.
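The 10 ps sub-averaging used for Fig. 9 can be sketched as a block average over consecutive, non-overlapping time windows. This is a minimal illustration, not the analysis script used for the figure; the function name `sub_averages` and the choice to report each window by its center time are assumptions.

```python
def sub_averages(times_ps, energies, window_ps=10.0):
    """Average energies over consecutive non-overlapping time windows.

    Returns a list of (window_center_ps, mean_energy) pairs, one per
    window that contains at least one sample.
    """
    sums = {}
    for t, e in zip(times_ps, energies):
        k = int(t // window_ps)  # index of the window this sample falls in
        total, count = sums.get(k, (0.0, 0))
        sums[k] = (total + e, count + 1)
    return [
        ((k + 0.5) * window_ps, total / count)
        for k, (total, count) in sorted(sums.items())
    ]


# Four samples, two per 10 ps window.
print(sub_averages([0, 5, 10, 15], [1.0, 3.0, 5.0, 7.0]))
# → [(5.0, 2.0), (15.0, 6.0)]
```

Block averaging of this kind removes fast fluctuations while keeping the slow descent visible, which is what matters when the curves are compared on a logarithmic time axis.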
For this system of 53 376 atoms, 5000 ps is obviously not enough to reach the global minimum state in any of these four simulations. Among the four simulations, the one with only the force guiding force reaches low energy states the fastest. A simple estimate from the slope of the energy descent, shown by the dashed lines, suggests that it would take 1 000 000 ps, 210 000 ps, and 8700 ps for LD, SGLDg with the balanced guiding forces, and SGLDg with only the momentum guiding force, respectively, to reach the energy that SGLDg with only the force guiding force reaches in 5000 ps. The time ratios are 200, 42, and 1.7, respectively. In other words, SGLDg with only the force guiding force speeds up the conformational search by an estimated factor of 200 over the LD simulation. Obviously, these estimates are very crude, and a linear extrapolation based on the slope is overly simplified. Nonetheless, this example demonstrates a significant acceleration in conformational search. The structure from SGLDg with only the force guiding force is shown in Fig. 8(c). It is clear that the protruding modeled residues (purple) undergo significant conformational change. This structure is much improved over the initial model in terms of map agreement, molecular interaction, and solvation. Detailed structure descriptions and comparisons are beyond the scope of this work.
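The quoted time ratios follow directly from the estimated times. The sketch below reproduces that arithmetic and adds a hypothetical helper for the kind of slope-based extrapolation described above. The helper assumes the dashed lines are straight in a plot of energy against log10(time), which is an assumption suggested by the logarithmic x-axis rather than a stated detail; all function and variable names are illustrative.

```python
import math

REFERENCE_TIME_PS = 5000.0  # SGLDg with only the force guiding force

# Estimated times to reach the same energy, as quoted in the text.
ESTIMATED_TIMES_PS = {
    "LD": 1_000_000.0,
    "SGLDg, balanced guiding forces": 210_000.0,
    "SGLDg, momentum guiding force only": 8_700.0,
}


def time_ratios(estimates, reference):
    """Ratio of each estimated time to the reference time."""
    return {name: t / reference for name, t in estimates.items()}


def log_linear_time_to_reach(target_energy, t1, e1, t2, e2):
    """Time at which a line through (log10 t1, e1) and (log10 t2, e2)
    reaches target_energy. Assumes the energy is linear in log10(time)."""
    slope = (e2 - e1) / (math.log10(t2) - math.log10(t1))
    return 10 ** (math.log10(t1) + (target_energy - e1) / slope)


for name, ratio in time_ratios(ESTIMATED_TIMES_PS, REFERENCE_TIME_PS).items():
    print(f"{name}: {ratio:.2f}")
```

The ratios come out as 200, 42, and 1.74. A small error in a fitted slope shifts an extrapolated time by a large factor on a logarithmic axis, which is one reason these estimates should be read as crude.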
CONCLUSION
We reformulate the equation of the low frequency motion to resemble Langevin dynamics, from which we propose the generalized self-guided molecular/Langevin dynamics simulation method, abbreviated as SGMDg or SGLDg. The momentum guiding force is defined as the low frequency momentum multiplied by the apparent friction constant and the momentum guiding factor. The force guiding force is defined as the deviation of the low frequency apparent force multiplied by the force guiding factor. These new forms of the guiding forces allow us to establish a relationship between the guiding factors and the conformational distribution for both SGLDg and SGMDg. The reweighting factors with the new guiding forces have a smaller variance than with the previous guiding forces and, therefore, can be applied to larger systems.
This work further presents an efficient algorithm to implement SGMDg/SGLDg. All necessary low frequency properties can be calculated from local average positions. Integration of the equation of the self-guided motion for any atom is independent of other atoms. These improvements save memory and are efficient for parallel computing.
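To illustrate why a per-atom local average is enough, the sketch below assumes the local average is an exponential moving average with averaging time tL, as in the local averaging scheme of the earlier SG literature; the exact update used in this work is given in the methods and may differ in detail. Under that assumption the low frequency velocity is the rate of change of the local average position, so one stored value per coordinate suffices and no information from other atoms is needed. The function name and arguments are hypothetical.

```python
def update_local_average(x_avg, x_new, dt, t_local):
    """One exponential-averaging step for a single coordinate.

    Assumes x_avg(t) = (1 - dt/tL) * x_avg(t - dt) + (dt/tL) * x(t).
    Returns the new local average and a low frequency velocity estimate
    taken as the finite-difference rate of change of the local average.
    """
    a = dt / t_local
    x_avg_new = x_avg + a * (x_new - x_avg)
    v_low = (x_avg_new - x_avg) / dt
    return x_avg_new, v_low


# A coordinate drifting at constant velocity: the low frequency velocity
# estimate should converge to that drift velocity.
dt, t_local, v = 0.002, 0.2, 1.5  # ps, ps, arbitrary length units per ps
x_avg, v_low = 0.0, 0.0
for n in range(1, 5001):
    x_avg, v_low = update_local_average(x_avg, v * n * dt, dt, t_local)
print(v_low)  # converges toward the drift velocity v
```

Because each call touches only one coordinate and its own running average, the update can be carried out wherever that atom is stored, without inter-processor communication.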
We examined SGMDg/SGLDg with a skewed double well system, an argon fluid, and a cryo-EM map flexible fitting case. These results show that both the momentum guiding force and the force guiding force can be used to enhance conformational search. Their effects on conformational distribution can be corrected through reweighting. With balanced guiding factors, their effects on conformational distribution largely cancel each other.
DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.
ACKNOWLEDGMENTS
This research was supported by the Intramural Research Programs of the National Heart, Lung, and Blood Institute (Grant No. Z01 HL001027-30). We thank Mr. Andrew Brooks for editing the manuscript.
Note: This paper is part of the JCP Special Topic on Classical Molecular Dynamics (MD) Simulations: Codes, Algorithms, Force Fields, and Applications.
REFERENCES
- 1. Wu X. and Wang S., J. Chem. Phys. 110, 9401 (1999). doi:10.1063/1.478948
- 2. Wu X. and Wang S., J. Phys. Chem. B 102, 7238 (1998). doi:10.1021/jp9817372
- 3. Wu X. and Brooks B. R., Chem. Phys. Lett. 381, 512 (2003). doi:10.1016/j.cplett.2003.10.013
- 4. Wu X. and Brooks B. R., J. Chem. Phys. 134, 134108 (2011). doi:10.1063/1.3574397
- 5. Wu X., Damjanovic A., and Brooks B. R., in Advances in Chemical Physics, edited by Rice S. A. and Dinner A. R. (John Wiley & Sons, Inc., Hoboken, 2012), p. 255.
- 6. Christen M. and Van Gunsteren W. F., J. Comput. Chem. 29, 157 (2008). doi:10.1002/jcc.20725
- 7. Hamelberg D., Mongan J., and McCammon J. A., J. Chem. Phys. 120, 11919 (2004). doi:10.1063/1.1755656
- 8. Gao N., Yang L., Gao F., Kurtz R. J., West D., and Zhang S., J. Phys.: Condens. Matter 29, 145201 (2017). doi:10.1088/1361-648x/aa574b
- 9. Wu X., Wang S., and Brooks B. R., J. Am. Chem. Soc. 124, 5282 (2002). doi:10.1021/ja0257321
- 10. Wu X. and Wang S., J. Phys. Chem. B 105, 2227 (2001). doi:10.1021/jp004048a
- 11. Wu X. and Wang S., J. Phys. Chem. B 104, 8023 (2000). doi:10.1021/jp000529i
- 12. Wu X.-W. and Sung S.-S., Proteins 34, 295 (1999).
- 13. Damjanovic A. et al., Biophys. J. 95, 4091 (2008). doi:10.1529/biophysj.108.130906
- 14. Damjanovic A. et al., J. Chem. Inf. Model. 48, 2021 (2008). doi:10.1021/ci800263c
- 15. Damjanović A., García-Moreno E. B., and Brooks B. R., Proteins: Struct., Funct., Bioinformatics 76, 1007 (2009). doi:10.1002/prot.22439
- 16. Pendse P. Y., Brooks B. R., and Klauda J. B., J. Mol. Biol. 404, 506 (2010). doi:10.1016/j.jmb.2010.09.045
- 17. Lee C.-I. and Chang N.-y., Biophys. Chem. 151, 86 (2010). doi:10.1016/j.bpc.2010.05.002
- 18. Wu X. and Brooks B. R., J. Chem. Phys. 135, 204101 (2011). doi:10.1063/1.3662489
- 19. Wu X., Brooks B. R., and Vanden-Eijnden E., J. Comput. Chem. 37, 595 (2016). doi:10.1002/jcc.24015
- 20. Ryckaert J.-P., Ciccotti G., and Berendsen H. J. C., J. Comput. Phys. 23, 327 (1977). doi:10.1016/0021-9991(77)90098-5
- 21. Wu X. and Brooks B. R., J. Chem. Phys. 122, 044107 (2005). doi:10.1063/1.1836733
- 22. Wu X., Subramaniam S., Case D. A., Wu K. W., and Brooks B. R., J. Struct. Biol. 183, 429 (2013). doi:10.1016/j.jsb.2013.07.006
- 23. Walls A. C., Park Y.-J., Tortorici M. A., Wall A., McGuire A. T., and Veesler D., Cell 181, 281 (2020). doi:10.1016/j.cell.2020.02.058
- 24. Jo S. et al., Adv. Protein Chem. Struct. Biol. 96, 235 (2014). doi:10.1016/bs.apcsb.2014.06.002
- 25. Jo S., Kim T., Iyer V. G., and Im W., J. Comput. Chem. 29, 1859 (2008). doi:10.1002/jcc.20945
- 26. Hassan S. A., Mehler E. L., Zhang D., and Weinstein H., Proteins 51, 109 (2003). doi:10.1002/prot.10330
- 27. Brooks B. R. et al., J. Comput. Chem. 30, 1545 (2009). doi:10.1002/jcc.21287