Proceedings of the National Academy of Sciences of the United States of America. 2023 Jul 25;120(31):e2220068120. doi: 10.1073/pnas.2220068120

Fast protein folding is governed by memory-dependent friction

Benjamin A Dalton a, Cihan Ayaz a, Henrik Kiefer a, Anton Klimek a, Lucas Tepper a, Roland R Netz a,1
PMCID: PMC10401029  PMID: 37490533

Significance

To describe protein-folding dynamics, protein molecular structures are typically projected onto some low-dimensional reaction coordinate. This dimensionality reduction produces non-Markovian friction, resulting in protein dynamics that is typically overdamped and influenced by the history of the reaction coordinate. Despite this, protein-folding dynamics is usually described as a memoryless Markovian process with effective friction fitted to experimental or simulation data. Using recently developed techniques, we extract friction memory kernels from molecular dynamics simulations of 8 fast-folding proteins. We show that memory effects are significant and that the folding times depend more on the friction magnitudes than on the free-energy profiles. Quantitative theoretical predictions of protein folding times thus require full knowledge of the friction memory kernel and appropriate non-Markovian reaction rate theory.

Keywords: protein folding, reaction rate theory, molecular friction, generalized Langevin equation, non-Markovian dynamics

Abstract

When described by a low-dimensional reaction coordinate, the folding rates of most proteins are determined by a subtle interplay between free-energy barriers, which separate folded and unfolded states, and friction. While it is commonplace to extract free-energy profiles from molecular trajectories, a direct evaluation of friction is far more elusive and typically relies on fits of measured reaction rates to memoryless reaction-rate theories. Here, using memory-kernel extraction methods founded on a generalized Langevin equation (GLE) formalism, we directly calculate the time-dependent friction acting on the fraction of native contacts reaction coordinate Q, evaluated for eight fast-folding proteins, taken from a published set of large-scale molecular dynamics protein simulations. Our results reveal that, across the diverse range of proteins represented in this dataset, friction is more influential than free-energy barriers in determining protein folding rates. We also show that proteins fold in a regime where the finite decay time of friction significantly reduces the folding times, in some instances by as much as a factor of 10, compared to predictions based on memoryless friction.


For many proteins, functionality depends on successfully folding into a specific three-dimensional conformational state. This requires that a linear polypeptide chain is driven to explore conformation space by interactions with a solvating environment and is shaped by both solvent interactions and internal interactions between amino acids. When described by a low-dimensional reaction coordinate, folding and unfolding are typically described by a free-energy landscape (1–6), with distinct states separated by free-energy barriers. For proteins that fold in less than 100 μs, so-called “fast-folding” proteins, barrier heights determined from experiment, theory, and simulation are of the order of just a few kBT. Theories for describing protein folding are often phrased as reaction rate theories with an explicit dependence on the free energy barrier and friction. An accurate understanding of the kinetics in barrier-crossing reactions is important not only in characterizing protein folding but also in many other domains, including nucleation theory and chemical reaction kinetics. Reaction rate theory dates back to Arrhenius, who, in 1889, showed that the transition times between reactant and product scale according to $\kappa e^{U_0/k_BT}$, where U0 represents the energy barrier height between the two states (7). In principle, the prefactor κ can depend on both the free-energy profile and friction. Building on the transition state theory of Eyring (8), Kramers was the first to include an explicit dependence of κ on solvent friction (9). The friction in Kramers’ theory is frequency-independent, suggesting that the environment of the reacting system, and possibly internal processes, relax infinitely fast compared to the barrier crossing time. Despite its simplicity, Kramers’ theory predicts reaction rates in both overdamped and inertia-dominated regimes, making it suitable for describing a wide range of systems.
Several advancements have emerged that bridge the gap between the overdamped and inertia-dominated regimes, while also accommodating finite friction relaxation times (10–14). When considering proteins, it is understood that friction arises from a complex interplay between interactions with the surrounding solvent and internal intramolecular interactions (15–18). This ultimately leads to finite relaxation times (19). A direct evaluation of the friction acting on a protein has not been possible and approaches have rather relied on determining friction indirectly from memoryless reaction rate theory (20–23). This paper presents a method for directly extracting the friction that affects the conformational dynamics of proteins from molecular dynamics simulation trajectories, which accounts for both internal and external friction contributions, as well as finite relaxation times.

We evaluate friction for eight fast-folding proteins using previously published large-scale molecular dynamics (MD) simulation trajectories (24). These simulations, conducted by the Shaw group on the purpose-built Anton super-computer (25, 26), represent a breakthrough in large-scale simulation. The resulting trajectories cover protein lengths ranging from 10 to 80 amino acid residues and simulation times of 100 μs to 3 ms. They represent a diverse protein dataset, composed of mixtures of α-helix and β-hairpin secondary structures and an assortment of different tertiary structures.

We analyze the protein folding trajectories in the framework of a non-Markovian, generalized Langevin equation (GLE). Although previous studies have examined non-Markovian effects in protein folding (27, 28) and Markov-state models have been expanded to include memory effects (29), our investigation is the first to employ such extensive simulation data and utilize recent GLE memory-kernel extraction methods (30–32). The GLE is a reduced dynamic description of a complex multidimensional system (33, 34). Here, we map the all-atom dynamics of the composite water–protein systems onto a one-dimensional fraction of native contacts reaction coordinate Q (35, 36) and hence extract friction memory kernels from Q(t). Q has been previously established as a good reaction coordinate for the proteins studied in this paper (36). Free-energy profiles and friction are specific to a given reaction coordinate. The extracted friction kernels reflect the combined dissipative effects of solvent and internal relaxation on the reaction coordinate over a wide range of time scales. To simulate multiple folding and unfolding events, Lindorff-Larsen et al. selected for proteins that have fast folding times (24). Using the extracted free-energy profiles and friction kernels, we predict folding times based on reaction rate theory on different levels of sophistication and compare to folding times from MD simulations. Our analysis shows that the variation of folding and unfolding times across the considered proteins is correlated more strongly with friction than with free energy barriers. In fact, the most accurate prediction for protein folding times is given by a previously developed multi-time-scale, non-Markovian theory (13, 14). Interestingly, we find that memory decay times are long for all proteins in the dataset, with some being as long as the folding times themselves. This indicates that Q is a poor reaction coordinate when judged based on its non-Markovianity.
Overall, our results suggest that friction plays a dominant role in governing the fast conformational dynamics of proteins, and when represented by a reduced-dimensional reaction coordinate, one must account for strong non-Markovian effects.

Results & Discussion

Extracting Friction from Protein Simulations.

The native state structures and the number of residues of each protein are presented in Fig. 1A. The fraction of native contacts reaction coordinate Q provides a contact-based measure for the deviation of a given state away from the native configuration, which typically represents the folded state. Q is closer to Q = 1 in the folded state and closer to Q = 0 when unfolded. In Fig. 1B, we show a 250 μs trajectory segment for the α3D protein (for a summary of simulation details, see Methods and SI Appendix, section 1). The folded and unfolded states are discernible, located at Qf = 0.75 and Qu = 0.51, respectively, separated by a barrier at Qb = 0.66. Throughout this paper, we consider folding transitions as leading from the unfolded state minimum to the barrier top (Qu → Qb) and unfolding transitions as leading from the folded state minimum to the barrier top (Qf → Qb). This accounts for the asymmetries observed in the free-energy profiles for the eight proteins. In the Left magnification above the trajectory, we identify a sequence of folding first-passage events. The mean of all first-passage events gives the mean first-passage time $\tau_{\mathrm{MFP}}^{\mathrm{MD}}$ for the folding reaction. Similarly, we evaluate the unfolding times as the mean of all first-passage events from Qf to Qb. In the Right magnification, we present an example folding transition path, linking the unfolded state minimum directly to the barrier top. The mean transition path time τMTP is the mean of all transition path durations for either folding or unfolding transitions. From the trajectories of Q(t) for each protein, we calculate the free-energy profile U(Q) acting on the reaction coordinate. In Fig. 1C, we show $U(Q) = -k_BT\log[\rho(Q)]$ for the α3D protein, where ρ(Q) is the probability density for Q(t), kB is Boltzmann’s constant, and T is the temperature, which is different for each protein.
The free-energy profiles for all proteins are asymmetric, so that the barrier heights faced by the folded and unfolded states ($U_0^u$ and $U_0^f$) are not equal. Similarly, the distances in reaction-coordinate space from the minima to the barrier top (Lf and Lu) are also different. The free-energy profiles for all proteins are shown in SI Appendix, section 2.
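
As a minimal sketch of the relation U(Q) = −kBT log[ρ(Q)], the Python snippet below estimates a free-energy profile from a histogram of a Q(t) time series. The function name `free_energy_profile`, the bin count, and the synthetic two-basin data are illustrative assumptions and are not taken from the paper; the basin centers merely echo the α3D values quoted above.

```python
import numpy as np

def free_energy_profile(q_traj, n_bins=50):
    """Return bin centers and U(Q)/k_BT = -log(rho(Q)) from a Q(t) time series."""
    # A normalized histogram approximates the probability density rho(Q).
    rho, edges = np.histogram(q_traj, bins=n_bins, range=(0.0, 1.0), density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    populated = rho > 0  # skip empty bins, where log(0) is undefined
    u_over_kT = -np.log(rho[populated])
    # Free energies are defined up to a constant: shift the global minimum to zero.
    return centers[populated], u_over_kT - u_over_kT.min()

# Synthetic two-state data standing in for a real Q(t) trajectory (assumption).
rng = np.random.default_rng(0)
q_traj = np.concatenate([
    rng.normal(0.51, 0.04, 40_000),  # "unfolded" basin
    rng.normal(0.75, 0.03, 60_000),  # "folded" basin
])
q_traj = np.clip(q_traj, 0.0, 1.0)
centers, u = free_energy_profile(q_traj)
```

Barrier heights such as U(Qb) − U(Qu) can then be read off the returned arrays, in units of kBT, once the minima and the barrier top have been located.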

Fig. 1.


The folding and unfolding of eight fast-folding proteins. (A) Native states for all proteins with the number of amino acids indicated. All protein trajectories are taken from ref. 24. (B) 250 μs trajectory segment for the Q(t) reaction coordinate (α3D protein). Left magnification: a sequence of folding first-passage events, from the unfolded state Qu to the barrier top Qb. Right magnification: an example folding transition path and corresponding transition path time τTP. (C) Free-energy profile for the α3D protein. Configuration snapshots show example unfolded, barrier-top, and folded states. Qu, Qb, and Qf are the reaction coordinate values in the unfolded, barrier-top, and folded states, respectively. For α3D, the barrier faced by the unfolded protein, $U_0^u = U(Q_b) - U(Q_u) = 1.8\,k_BT$, is less than the barrier faced by the folded protein, $U_0^f = U(Q_b) - U(Q_f) = 3.2\,k_BT$. The distance from the unfolded state to the barrier top Lu = Qb − Qu is greater than the distance from the folded state to the barrier top Lf = Qf − Qb. (D) Normalized GLE memory kernel Γ(t), extracted from Q(t) for the α3D protein. Inset: Running integral G(t) (black line), the limiting total friction γ (dashed line), and an exponential fit (red curve). (E) Total friction γ for each protein, divided by kBT, plotted as a function of the number of residues N; the red line is a power law with exponent ∼2.8. (F) Effective mass m, plotted as a function of N; the red line is a power law with exponent ∼1.5. (G) and (H) show the Arrhenius factor $e^{U_0/k_BT}$ and barrier crossing times $\tau_{\mathrm{MFP}}^{\mathrm{MD}}$ for folding and unfolding transitions individually.

Central to this paper is extracting time-dependent friction kernels Γ(t) from Q(t) trajectories of each of the eight proteins. We describe the stochastic dynamics of Q(t) with the one-dimensional generalized Langevin equation (GLE)

$$m\ddot{Q}(t) = -\int_0^t \Gamma(t-t')\,\dot{Q}(t')\,dt' - \nabla U[Q(t)] + F_R(t),$$ [1]

where Γ(t) is the friction memory kernel, FR(t) is the random force term satisfying the fluctuation-dissipation theorem ⟨FR(t)FR(t′)⟩ = kBTΓ(t − t′), m is the effective mass of the reaction coordinate, and U(Q) is the free-energy profile. Although Eq. 1 neglects nonlinear friction and is approximate (37–39), it has been demonstrated to be accurate for polypeptides (32), pair reactions in water (40), and chemical bond vibrations (41). There are different techniques for extracting dynamic friction from discrete time-series trajectories (30, 31, 42–44). We explain the method used to extract Γ(t) from Q(t) in detail in SI Appendix, section 3. In short, we directly extract the running integral of the memory kernel $G(t) = \int_0^t \Gamma(t')\,dt'$ using a Volterra extraction scheme, which is appropriate for arbitrary, nonlinear free-energy profiles U(Q) (31, 32). The memory kernel is then given by Γ(t) = dG(t)/dt, which is evaluated numerically. In Fig. 1D, we show the memory kernel for the α3D protein, normalized by Γ(0). The dynamics of the α3D protein exhibit significant memory effects, which is evident from the inset that shows the running integral G(t) reaching a plateau after about 10 μs. We present the memory kernels for all proteins in SI Appendix, section 4.
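
The final step described above, recovering Γ(t) from the running integral G(t) by numerical differentiation, can be sketched as follows. This is not the Volterra extraction scheme itself (that is detailed in SI Appendix, section 3); it assumes G(t) is already available and uses a synthetic single-exponential G(t) with hypothetical parameters `gamma` and `tau` so that the result can be compared against a known kernel.

```python
import numpy as np

# Synthetic running integral for a single-exponential kernel (assumption):
# Gamma(t) = (gamma/tau) exp(-t/tau)  =>  G(t) = gamma (1 - exp(-t/tau)).
gamma, tau = 2.0, 0.5
t = np.linspace(0.0, 10.0, 10_001)
G = gamma * (1.0 - np.exp(-t / tau))

# Memory kernel as the numerical time derivative of G(t);
# edge_order=2 keeps second-order accuracy at the first and last samples.
kernel = np.gradient(G, t, edge_order=2)
kernel_exact = (gamma / tau) * np.exp(-t / tau)

# The total friction is the long-time plateau of G(t).
gamma_est = G[-1]
```

With real data, G(t) is noisy, so the derivative would typically need smoothing or a fit; that choice is left open here.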

Chain-Length Dependence of Protein Dynamics.

After extracting the full time-dependent friction kernel via G(t), we have access to the steady-state friction γ = G(t → ∞), which we refer to as the total friction throughout. In Fig. 1E, we show γ for each protein as a function of the number of residues in each chain N. Since each system has a unique temperature, we divide by kBT and hence display the inverse total diffusivity along Q. We fit a power law and find that $\gamma/k_BT = 3.2 \times 10^{-3}\,N^{2.8}\ \mu\mathrm{s}$ (red line). For a normalized reaction coordinate that is a sum of N uncorrelated atomic distances, one expects linear scaling in N (SI Appendix, section 5). However, Q is a highly nonlinear reaction coordinate. Additionally, it is known that reptation dominates the dynamics of collapsed polymers (45, 46). Both of these effects contribute to the super-linearity in N-scaling for the total friction. The effective mass of Q, calculated using the equipartition theorem $m = k_BT/\langle \dot{Q}^2 \rangle$, also depends on N (Fig. 1F). The power law scaling is given by $m = 3.4 \times 10^{6}\,N^{1.5}\ \mathrm{u\,nm^2}$ (red line). The super-linearity here is solely due to the nonlinearity of Q (37). The Arrhenius factors $e^{U_0/k_BT}$, plotted in Fig. 1G, show very little correlation with protein size. The folding and unfolding kinetics, however, quantified by the mean first-passage times $\tau_{\mathrm{MFP}}^{\mathrm{MD}}$ (Fig. 1H), increase with N. The correlation is weak but the results are consistent with experimental results for a different dataset that encompasses larger proteins (47) (see SI Appendix, section 6 for a statistical analysis). Overall, Fig. 1 E–H reveal that, when projected onto Q, friction plays the dominant role in governing the dynamics and kinetics of the proteins considered here. A similar conclusion was previously reported using a 1D Markovian Langevin model (48). We explore this in more detail in a subsequent section.
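
The equipartition estimate of the effective mass, m = kBT/⟨Q̇²⟩, can be illustrated with a short sketch. The helper `effective_mass`, the finite-difference velocity, and the synthetic trajectory with a known mass `m_true` are assumptions made for illustration; the sketch ignores the time-discretisation issues that the paper addresses in SI Appendix, section 7.

```python
import numpy as np

def effective_mass(q_traj, dt, kT):
    """Equipartition estimate m = k_B T / <Qdot^2> from finite-difference velocities."""
    q_dot = np.diff(q_traj) / dt
    return kT / np.mean(q_dot**2)

# Synthetic check in arbitrary units (assumption): velocities are drawn with
# variance kT/m_true and integrated to give a trajectory Q(t).
rng = np.random.default_rng(1)
kT, m_true, dt = 2.5, 4.0, 0.01
v = rng.normal(0.0, np.sqrt(kT / m_true), 200_000)
q_traj = np.cumsum(v) * dt
m_est = effective_mass(q_traj, dt, kT)
```

Because Q is dimensionless, the resulting mass carries units of energy times time squared, which is why the paper reports it in u nm².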

Memory Duration Is Significant in Protein Folding.

Extracting Γ(t) provides not only the total friction γ but also a time scale for the duration of memory effects. In addition, γ can be used predictively to assess other key dynamic time scales. The inertia time τm = m/γ is the time scale at which the system becomes diffusive (49), while the diffusion time,

$$\tau_D = \frac{\gamma L^2}{k_BT},$$ [2]

is the time taken for a Brownian particle to diffuse over a characteristic distance L in the absence of free-energy gradients. L is the distance from the unfolded state minimum to the top of the folding barrier (Lu) or the distance from the folded state minimum to the barrier top (Lf). A system is considered to be in the overdamped regime when τm ≪ τD. All eight proteins in this study meet this condition for both Lu and Lf (Fig. 2A). One borderline case is Chignolin unfolding (τm/τD = 0.027), for which both Lf and γ are small.
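
As a short worked step, using only the definitions of the inertia time τm = m/γ and the diffusion time of Eq. 2, the overdamping criterion can be written directly in terms of m, γ, and L:

$$\frac{\tau_m}{\tau_D} = \frac{m/\gamma}{\gamma L^2/k_BT} = \frac{m\,k_BT}{\gamma^2 L^2} \ll 1.$$

The ratio scales as 1/(γ²L²), which is consistent with the borderline case appearing where both Lf and γ are small.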

Fig. 2.


Comparison of relevant time scales for protein folding and unfolding kinetics. (A) The inertia time scales τm compared to the diffusion times τD, showing that all systems are in the over-damped regime. (B) Memory decay time scales τmem, calculated from the first moment of the memory kernel Γ(t), compared to the diffusion times τD. The light gray lines indicate the bounding regime for τmem between τD × 10−2 and τD × 101, which is the domain in which memory-induced kinetic acceleration is expected. (C) Memory times τmem compared to the folding and unfolding times, expressed as the mean first-passage times $\tau_{\mathrm{MFP}}^{\mathrm{MD}}$. (D) Memory times compared to the transition path times τMTP leading from the folded and unfolded state minima to the barrier tops. The broken lines in each plot indicate exact equivalence between the respective times.

By comparing τD to the decay times of extracted memory kernels, we can assess the expected influence of memory on barrier-crossing processes (12). Memory-dependent friction is typically multimodal, with time scales spanning many orders of magnitude (32). To assign a single effective time-scale to the memory decay, we evaluate the first-moment according to $\tau_{\mathrm{mem}} = \int_0^{\infty} t\,\Gamma(t)\,dt \big/ \int_0^{\infty} \Gamma(t)\,dt$. In SI Appendix, section 7, we address the issue of temporal discretisation of the MD data. Fig. 2B shows that memory times are nonnegligible when compared to diffusion times. Memory has been shown to accelerate barrier crossing kinetics in the range 1 × 10−2 < τmem/τD < 1 × 101 (12–14), as indicated in Fig. 2B. Thus, for the proteins considered here, memory is predicted to accelerate protein folding and unfolding, which we quantify further below. In Fig. 2C, we see that memory times are of the same order as the reaction times, indicating that Q is a poor reaction coordinate. The overall concurrence between τmem, τD, and $\tau_{\mathrm{MFP}}^{\mathrm{MD}}$ predicts a coupling between memory effects and folding kinetics.
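
To make the first-moment definition of τmem concrete, the sketch below evaluates it numerically for a two-component exponential kernel and compares it with the closed form Σγiτi/Σγi that follows for a sum of exponentials of the form (γi/τi)exp(−t/τi). The amplitudes, decay times, grid, and the helper names `trapezoid` and `memory_time` are all illustrative assumptions.

```python
import numpy as np

def trapezoid(y, x):
    """Plain trapezoidal rule on a sampled function."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def memory_time(t, kernel):
    """First moment of the kernel: int t*Gamma dt / int Gamma dt."""
    return trapezoid(t * kernel, t) / trapezoid(kernel, t)

# Hypothetical bi-exponential kernel with well-separated time scales.
gammas = np.array([1.0, 3.0])
taus = np.array([0.01, 1.0])
t = np.linspace(0.0, 40.0, 400_001)  # fine enough to resolve the fastest decay
kernel = np.sum((gammas / taus)[:, None] * np.exp(-t[None, :] / taus[:, None]), axis=0)

tau_mem = memory_time(t, kernel)
tau_mem_exact = np.sum(gammas * taus) / np.sum(gammas)
```

The closed form shows that τmem is a friction-weighted average of the individual decay times, so a slow component with a large amplitude dominates the effective memory time.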

In Fig. 2D, we show that transition path times are much shorter than memory times for all proteins, sometimes by up to two orders of magnitude. This is significant since it indicates that entire transition processes occur under the influence of memory effects that extend back relatively far into the protein’s dynamic history. Transition paths, which are the segments of a trajectory where the protein undergoes a transition from one state to a target state, are of great interest since they reveal the actual folding mechanisms of a protein. They have been extensively studied in experiments (23, 50–55) and in simulations and theory (24, 36, 56–60).

Non-Markovian Rate Theory Predicts Protein Folding Times.

From the simulation trajectories, we extract the folding and unfolding times, evaluated via the mean first-passage times $\tau_{\mathrm{MFP}}^{\mathrm{MD}}$. Since we also extract the friction directly from the simulation trajectories, we can compare the performance of various reaction rate theories. To quantify deviations between predicted and measured folding times, we calculate the root-mean-square logarithmic deviation (RMSLD), which is a logarithmic measure of the relative deviations given by $\mathrm{RMSLD} = \sqrt{\sum_{i=1}^{n}\left(\log\tau_{\mathrm{MFP},i}^{\mathrm{MD}} - \log\tau_{\mathrm{MFP},i}^{\mathrm{theo}}\right)^2 / n}$. Here, $\tau_{\mathrm{MFP},i}^{\mathrm{MD}}$ is the value measured from MD, $\tau_{\mathrm{MFP},i}^{\mathrm{theo}}$ is the theoretical prediction for a given model, and n = 16 is the total number of folding and unfolding reactions (for details, see SI Appendix, section 8). Fig. 3A shows that τD provides a remarkably good prediction of the MD data, despite completely neglecting contributions from free-energy profiles. A Markovian prediction, which in addition to friction explicitly accounts for the extracted free energy profile (49), is given by

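$$\tau_{\mathrm{MFP}}^{\mathrm{Mar}}(Q_s, Q_e) = \tau_D\,\xi_U = \tau_D \int_{Q_s}^{Q_e} e^{U(x)/k_BT} \left[\int_{-\infty}^{x} e^{-U(y)/k_BT}\,\frac{dy}{L}\right] \frac{dx}{L}.$$ [3]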

Fig. 3.


Comparison of simulated protein folding and unfolding times $\tau_{\mathrm{MFP}}^{\mathrm{MD}}$ with predictions on different levels of theory. (A) Diffusion times τD, according to Eq. 2, which do not depend on free-energy barriers. (B) Markovian predictions $\tau_{\mathrm{MFP}}^{\mathrm{Mar}}$, according to Eq. 3, which use the extracted free-energy profile for each protein. (C) Free-energy-dependent factor ξU that accounts for the effect of the free energy profile but not for the friction, according to Eq. 3. (D) Non-Markovian predictions $\tau_{\mathrm{MFP}}^{\mathrm{noMar}}$, according to Eq. 5. Multiexponential memory kernels are explicitly accounted for. The root-mean-square deviation of the logarithmically transformed data (RMSLD) provides a measure of relative deviations for each model prediction from the observed data $\tau_{\mathrm{MFP}}^{\mathrm{MD}}$ over the combined set of folding and unfolding times. A lower RMSLD indicates that a given model prediction is more accurate; the dashed lines indicate an exact prediction. In (C), τD is calculated using $\gamma = 1.1 \times 10^{9}\ \mathrm{u\,nm^2\,\mu s^{-1}}$ for all proteins, chosen to minimize the RMSLD. See Fig. 2 for symbol legend.

Eq. 3 assumes constant friction across Q and evaluates the mean first-passage times between a start and end-point on the free-energy profile, Qs and Qe. For folding reactions, for example, we calculate $\tau_{\mathrm{MFP}}^{\mathrm{Mar}}(Q_u, Q_b)$ (SI Appendix, section 9). In Fig. 3B, we see that the Markovian predictions are generally slow compared to the MD data. The RMSLD of 1.41 indicates that the Markovian prediction (Eq. 3) is slightly more accurate than τD, which has an RMSLD of 1.51, indicating that by explicitly including the free-energy profile via the correction factor ξU, defined in Eq. 3, we only slightly improve the prediction. To emphasize this, we consider the predictive power of the free energy profiles alone and hence evaluate τD′ξU (Fig. 3C), where τD′ is the diffusion time using a common friction coefficient for all proteins, chosen to minimize the RMSLD. The relatively large RMSLD value of 1.98 indicates that ξU is not a good predictor for folding times. We conclude that friction alone describes the variation of folding and unfolding times among this set of proteins well, while the explicit inclusion of free-energy profiles does little to improve the prediction.
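
The Markovian prediction can be sketched numerically as a double integral over the free-energy profile. The snippet below is a hedged illustration and not the authors' implementation: the function `xi_U`, the finite lower limit `q_min` that stands in for the lower bound of the inner integral (acting as a reflecting boundary), the grid size, the test potentials, and the value of `tau_D` are all assumptions. It only handles transitions toward increasing Q; the opposite direction would require mirroring the coordinate.

```python
import numpy as np

def xi_U(u_over_kT, q_min, q_start, q_end, n=20_001):
    """Free-energy factor of the Markovian mean first-passage time:
    (1/L^2) * int_{q_start}^{q_end} dx e^{U(x)} int_{q_min}^{x} dy e^{-U(y)},
    with U given in units of k_B T and q_min <= q_start < q_end."""
    x = np.linspace(q_min, q_end, n)
    u = u_over_kT(x)
    dx = x[1] - x[0]
    boltz = np.exp(-u)
    # Inner integral as a cumulative trapezoidal sum starting at q_min.
    inner = np.concatenate(([0.0], np.cumsum(0.5 * (boltz[1:] + boltz[:-1]) * dx)))
    integrand = np.exp(u) * inner
    # Outer integral restricted to [q_start, q_end] (q_start is snapped to the grid).
    keep = x >= q_start
    xs, fs = x[keep], integrand[keep]
    outer = np.sum(0.5 * (fs[1:] + fs[:-1]) * np.diff(xs))
    return float(outer / (q_end - q_start) ** 2)

# Flat profile: free diffusion away from a reflecting wall.
xi_flat = xi_U(lambda q: np.zeros_like(q), 0.0, 0.0, 1.0)
# Linear ramp rising by 3 k_BT: an uphill walk that slows the passage.
xi_ramp = xi_U(lambda q: 3.0 * q, 0.0, 0.0, 1.0)

tau_D = 2.0  # hypothetical diffusion time in microseconds
tau_mfp_markov = tau_D * xi_ramp
```

For the flat profile the factor reduces to 1/2, the familiar result for one-dimensional diffusion over a distance L from a reflecting boundary, which gives a quick sanity check before applying the routine to an extracted U(Q).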

We now consider the influence of non-Markovian effects on the folding time predictions. A formula was recently constructed that predicts the memory-induced speed-up and slow-down regimes (12–14). Tested against extensive simulations, this formula explicitly accounts for non-Markovian dynamics correctly for multiexponential memory. Based on this formula, we introduce a non-Markovian correction factor ξnoMar as

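$$\xi_{\mathrm{noMar}} = \frac{\tau_{\mathrm{MFP}}^{\mathrm{H}}(\tau_D, U_0, \gamma_i, \tau_i, \tau_m)}{\tau_{\mathrm{MFP}}^{\mathrm{H}}(\tau_D, U_0, \gamma_i, 0, 0)}.$$ [4]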

Here, $\tau_{\mathrm{MFP}}^{\mathrm{H}}$ is the barrier-crossing time of a one-dimensional reaction coordinate with finite mass and a memory kernel $\Gamma(t) = \sum_{i=1}^{M} (\gamma_i/\tau_i)\exp(-t/\tau_i)$, characterized by M friction amplitudes {γi} and M memory times {τi} (13, 14) (see Methods section Eqs. M2–M4 and SI Appendix, section 10 for details of ξnoMar and $\tau_{\mathrm{MFP}}^{\mathrm{H}}$). The multiexponential fits of our extracted memory kernels are discussed in SI Appendix, section 4; for Chignolin we use M = 2 and for all other proteins M = 3. We note that all parameters in Eq. 4 are uniquely extracted from the MD simulations. The denominator in Eq. 4 represents the overdamped Markovian limit, where the inertial time and all memory times tend to zero. The non-Markovian prediction for the barrier-crossing time can thus be written as

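$$\tau_{\mathrm{MFP}}^{\mathrm{noMar}} = \xi_{\mathrm{noMar}}\,\tau_{\mathrm{MFP}}^{\mathrm{Mar}} = \xi_{\mathrm{noMar}}\,\xi_U\,\tau_D.$$ [5]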

In Fig. 3D, we see that $\tau_{\mathrm{MFP}}^{\mathrm{noMar}}$ improves the overall prediction of the MD folding and unfolding times (except for Chignolin, Villin and Protein G), as confirmed by the relatively low RMSLD value of 0.86. See SI Appendix, section 8 for additional statistical analysis of the data in Fig. 3.
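
The RMSLD used to rank the models can be computed in a few lines. In this sketch the function name `rmsld` is hypothetical, and the use of the natural logarithm is an assumption, since the text does not state the base; the input values are toy numbers, not data from the paper.

```python
import numpy as np

def rmsld(tau_md, tau_theo):
    """Root-mean-square logarithmic deviation between measured and predicted times.
    Natural logarithm assumed; the base is not specified in the text."""
    d = np.log(np.asarray(tau_md, dtype=float)) - np.log(np.asarray(tau_theo, dtype=float))
    return float(np.sqrt(np.mean(d**2)))

# Toy values: a perfect prediction, and one that is off by a factor e everywhere.
perfect = rmsld([1.0, 10.0, 100.0], [1.0, 10.0, 100.0])
factor_e = rmsld([np.e, 10.0 * np.e], [1.0, 10.0])
```

Under the natural-log assumption, an RMSLD of 1 corresponds to predictions that are typically off by a factor of e, so the measure treats over- and underestimates by the same factor symmetrically.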

Proteins Fold in the Memory-Induced Speed-Up Regime.

Memory effects can either speed up or slow down barrier crossing when compared to the memoryless limit, depending on the value of τmem/τD (12–14). In our analysis, we use $\tau_{\mathrm{MFP}}^{\mathrm{Mar}}$, as given by Eq. 3, to represent the overdamped, memoryless limit. By plotting $\tau_{\mathrm{MFP}}^{\mathrm{MD}}/\tau_{\mathrm{MFP}}^{\mathrm{Mar}}$ against the rescaled memory times τmem/τD for both folding (red symbols) and unfolding (black symbols) of all proteins in Fig. 4A, we reveal that speed-up is most pronounced for intermediate values of 1 × 10−1 < τmem/τD < 1 × 101, while in the short memory-time limit τmem/τD → 0, the reaction times tend toward Markovian behaviour, i.e., $\tau_{\mathrm{MFP}}^{\mathrm{MD}}/\tau_{\mathrm{MFP}}^{\mathrm{Mar}} \to 1$. This is in line with previous treatments of model systems (13, 14) and simulations of short homo-peptide chains (32). Fig. 4A, therefore, indicates that protein folding and unfolding times are significantly accelerated by memory effects, in some instances by as much as a factor of 10, which is a remarkable contribution.

Fig. 4.


Barrier crossing times indicate memory-induced speed up. (A) Deviations from Markovian barrier crossing kinetics, plotted as a function of rescaled memory times τmem/τD, for the folding and unfolding of all eight proteins. Rescaled MD simulation values for each protein ($\tau_{\mathrm{MFP}}^{\mathrm{MD}}/\tau_{\mathrm{MFP}}^{\mathrm{Mar}}$, red and black symbols) are compared to the multimodal, non-Markovian prediction (ξnoMar, Eq. 4, blue and yellow symbols). See Fig. 2 for symbol legend. (B) Reaction-time curves generated by scaling all memory times by a common factor α. Vertical dashed lines indicate α = 1, which are different for folding (f) and unfolding (u) due to unique τD values. The multimodal non-Markovian prediction ($\xi_{\mathrm{noMar}}^{\alpha}$ from Eq. 4; folding (blue) and unfolding (yellow)) is compared to the Grote–Hynes prediction ($\tau_{\mathrm{GH},\alpha}$ from Eq. M6), rescaled by the memoryless, high-friction limit ($\tau_{\mathrm{GH}}^{\mathrm{HF}}$, Eq. M10) for folding (orange) and unfolding (grey). The blue and yellow symbols shown in (A) for the four example proteins coincide with the curve intercepts of the α = 1 lines in (B) for $\xi_{\mathrm{noMar}}^{\alpha}$. The red and black symbols are the same as those shown in (A), corresponding to the MD results for each protein, located at α = 1. The solid black lines show quadratic scaling, predicted for ξnoMar in the long memory-time limit. Dashed horizontal black lines show unity for all plots.

This notion can be made more quantitative by using ξnoMar predictively. According to Eq. 5, ξnoMar describes the ratio $\tau_{\mathrm{MFP}}^{\mathrm{noMar}}/\tau_{\mathrm{MFP}}^{\mathrm{Mar}}$ and quantifies deviations from Markovianity. In Fig. 4A, we observe an excellent agreement between ξnoMar (blue and yellow symbols for folding and unfolding) and the rescaled simulation data, suggesting that a multimodal, non-Markovian reaction rate theory accurately describes the accelerated barrier-crossing in protein folding simulations, even predicting a 10-fold acceleration for intermediate memory times.

To gain a deeper understanding of how memory affects protein folding times, we rescale all memory times in Eq. 4 by a common factor α and generate an α-dependent memory kernel $\Gamma_{\alpha}(t) = \sum_{i=1}^{M} (\gamma_i/\alpha\tau_i)\exp(-t/\alpha\tau_i)$. Such time-rescaling can be considered analogous to varying the solvent viscosity. In Fig. 4B, we show $\xi_{\mathrm{noMar}}^{\alpha}$ from Eq. 4 using the α-dependent memory kernel Γα(t) for four proteins as solid lines, plotted as a function of ατmem/τD. The vertical dashed lines denote α = 1, corresponding to the original proteins described by the MD simulations. We overlay the MD results ($\tau_{\mathrm{MFP}}^{\mathrm{MD}}/\tau_{\mathrm{MFP}}^{\mathrm{Mar}}$, with the same colors and symbols as in Fig. 4A) at α = 1, which show good agreement with Eq. 4. For comparison, we also show the folding and unfolding times predicted by Grote–Hynes (GH) theory (11) as broken lines, which also explicitly accounts for memory effects (see Methods section Eqs. M6–M10). The Grote–Hynes curves, $\tau_{\mathrm{GH},\alpha}/\tau_{\mathrm{GH}}^{\mathrm{HF}}$, agree with $\xi_{\mathrm{noMar}}^{\alpha}$ for small memory times but disagree significantly for longer memory times. Here, $\tau_{\mathrm{GH},\alpha}$ is the GH prediction (τGH, Eq. M6) with memory times rescaled by α, and $\tau_{\mathrm{GH}}^{\mathrm{HF}}$ is the high-friction, memoryless GH limit (Eq. M10) (see SI Appendix, section 10 for details). Previous work has discussed the breakdown of GH theory for long memory times for model systems (12, 40). Here, we demonstrate the relevance for protein folding kinetics. While GH theory agrees with ξnoMar for some proteins (e.g., α3D and the λ-repressor in Fig. 4B), it significantly fails for other proteins (e.g., Trp-Cage and WW Domain; see also SI Appendix, Figs. S8 and S9), depending on the value of τmem/τD. Specifically, Trp-Cage unfolding is described by a particularly high value of τmem/τD = 2, which corresponds to the onset of memory-induced slowdown, as predicted well by Eq. 4 but not by GH theory.

Conclusion

We have extracted memory kernels from published MD simulation trajectories of eight fast-folding proteins (24) and have shown that, for the Q reaction coordinate, each protein’s memory time is comparable to its folding time and typically vastly exceeds its transition path time. By extracting the memory kernels, and from that the total friction γ, we show that a simple prediction for the barrier crossing times, which only depends on friction, i.e., τD, represents the simulated barrier crossing times well (Fig. 3A). Explicitly accounting for the exact free-energy profiles (Figs. 3 B and C) does not particularly improve upon this prediction. A theory that explicitly accounts for multi-time-scale memory-dependent friction most accurately predicts the folding and unfolding times observed in the MD simulations (Fig. 3D), demonstrating that non-Markovian effects must be accounted for when describing protein dynamics via some reduced-dimensional reaction coordinate.

The quality of reaction coordinates has been estimated using transition-state-ensemble and commitment-probability methods (58, 61, 62), based on the idea that an ideal reaction coordinate should project all transition states onto a single point and be Markovian (28). Using such methods and the same dataset from the Shaw group as we use here (24), Best and Hummer argued that Q is a good reaction coordinate for describing protein folding dynamics (36). However, we demonstrate that Q displays strong non-Markovianity, suggesting that qualifying a reaction coordinate is subtle. In SI Appendix, section 11, we show memory kernels for other standard reaction coordinates, such as the end-to-end distance, radius of gyration, and the RMSD from the native state. The resultant memory times span orders of magnitude but are typically of the order of the diffusion time scale τD. Therefore, none of these reaction coordinates can be considered as a good reaction coordinate when judged based on their Markovianity. Regardless, by faithfully extracting memory friction functions from MD trajectories and using non-Markovian reaction rate theories, the folding kinetics of proteins can be analyzed and accurately predicted. Such a method may prove to be ideal in the experimental context (50–52, 63, 64) where the investigator does not have the luxury to choose optimal reaction coordinates.

Alternative models, which neglect non-Markovian effects but assume position-dependent friction (5, 58), fail to generate a unique friction profile γ(Q) that describes both folding and unfolding kinetics (SI Appendix, section 12). This was already demonstrated for a simple α-helix forming homo-alanine chain (32). Similarly, we show in SI Appendix, section 13 that Markovian simulations with position-independent friction cannot reproduce the characteristic subdiffusive dynamics exhibited by proteins, contrary to simple GLE simulations. It is possible that position-dependent memory effects may also be important for protein folding in general (37–39), but as we demonstrate in SI Appendix, section 13, any such effects are minor and therefore negligible for the proteins studied here. Finally, we mention that multidimensional projection techniques may significantly improve our analysis, especially for proteins that exhibit multiple distinct folding or unfolding pathways, such as NTL9 and the WW-Domain (24). However, this is beyond the scope of our present investigation.

Methods

In SI Appendix, section 1, we present details of the molecular dynamics simulations, including the relevant simulation parameters. Additionally, we include a range of measured time scales and other extracted quantities, and we compare the reaction times from our analysis to those from Lindorff-Larsen et al. (24).

The Fraction of Native Contacts Reaction Coordinate.

For each protein, we project the backbone Cα atomic positions from the all-atom trajectories taken from ref. 24 onto the fraction of native contacts reaction coordinate Q, evaluated with a soft cut-off potential (36). The evaluation of Q(t) requires a reference state, which we take to be the native state for each trajectory. To determine the native state, we follow previous implementations and select from amongst the member states of the trajectories (as opposed to using, for example, the PDB entry). The approach is similar to that used by Lindorff-Larsen et al. (24) and Best et al. (36), which follows from ref. 65. Briefly, we sample a subset of evenly spaced states from the full trajectory. For each pair of states, we calculate the root-mean-squared deviation (RMSD) between the two states. If the RMSD between two states is less than 0.2 nm, then we place the pair into a list. We assign the state that has the most listed pairs satisfying the RMSD condition as the native state for a given protein. The native states for each protein are displayed in Fig. 1A. Note that for proteins with more than one independent trajectory segment, we select a single native state from amongst all trajectory segments, which we then use for all segments. In the native state, we define as native contacts all Cα pairs that are separated by at least 5 residues in the primary sequence and by less than 0.9 nm in Cartesian distance. Each protein has $N_{\mathrm{nc}}$ native contacts. $\mathbf{s}_{ij}^{0}$ are the separation vectors for all native contact pairs in the native state, which have magnitudes $s_{ij}^{0} = \sqrt{\mathbf{s}_{ij}^{0} \cdot \mathbf{s}_{ij}^{0}}$. $\mathbf{s}_{ij}(t)$ are the separation vectors for all native contact pairs at each time, with magnitudes $s_{ij}(t) = \sqrt{\mathbf{s}_{ij}(t) \cdot \mathbf{s}_{ij}(t)}$. This gives the fraction of the native contacts that are deemed to be in contact at time t as

$$Q(t) = \frac{1}{N_{\mathrm{nc}}} \sum_{i<j} \frac{1}{1 + e^{\beta \left( s_{ij}(t) - \gamma s_{ij}^{0} \right)}}, \quad [M1]$$

where the summation indices i and j run only over native contact pairs. Here, we set the parameters such that $\beta = 30\ \mathrm{nm}^{-1}$ and $\gamma = 1.6$.
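The native-state selection, the contact definition, and Eq. M1 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the array layouts, and the precomputed pairwise RMSD matrix are all hypothetical, and structural alignment for the RMSD calculation is left out.

```python
import numpy as np

def select_native_state(rmsd, cutoff=0.2):
    """Pick the frame with the most partners closer than `cutoff` (nm).

    `rmsd` is an assumed, precomputed symmetric (n, n) matrix of pairwise
    RMSD values between evenly spaced frames, with a zero diagonal.
    """
    partners = (rmsd < cutoff).sum(axis=1) - 1  # subtract the self-pair
    return int(np.argmax(partners))

def native_contacts(ca_native, min_seq_sep=5, cutoff=0.9):
    """Native C-alpha contacts: |i - j| >= min_seq_sep and distance < cutoff (nm).

    `ca_native` has shape (n_res, 3). Returns index arrays i, j and the
    native distances s0 for the selected pairs.
    """
    i, j = np.triu_indices(len(ca_native), k=min_seq_sep)
    s0 = np.linalg.norm(ca_native[i] - ca_native[j], axis=1)
    keep = s0 < cutoff
    return i[keep], j[keep], s0[keep]

def fraction_native_contacts(ca_traj, i, j, s0, beta=30.0, gamma=1.6):
    """Q(t) of Eq. M1 for `ca_traj` of shape (n_frames, n_res, 3), in nm."""
    s = np.linalg.norm(ca_traj[:, i] - ca_traj[:, j], axis=2)
    x = beta * (s - gamma * s0)
    # 1 / (1 + exp(x)) written with tanh to avoid overflow for large x
    return np.mean(0.5 * (1.0 - np.tanh(0.5 * x)), axis=1)
```

In this sketch the contact list is built once from the native frame and then reused for every trajectory segment, mirroring the single native state per protein described above.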

Non-Markovian Correction Factor ξnoMar.

Following Kappler et al. (13) and Lavacchi et al. (14), we describe multiexponential memory-dependent barrier-crossing times in terms of M overdamped contributions $\tau_{\mathrm{OD}}^{i}$ and M energy-diffusion contributions $\tau_{\mathrm{ED}}^{i}$, where i = 1, 2, ..., M. The individual contributions are defined as follows. For the overdamped contributions, we have

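$$\tau_{\mathrm{OD}}^{i} = \tau_D \frac{\gamma_i}{\gamma} \frac{e^{\beta U_0}}{\beta U_0} \times \left[ \frac{\pi}{2\sqrt{2}} \frac{1}{1 + 10 \beta U_0 \tau_i / \tau_D} + \sqrt{\beta U_0 \frac{\tau_m}{\tau_D}} \right], \quad [M2]$$

and for the energy-diffusion contributions

$$\tau_{\mathrm{ED}}^{i} = \tau_D \frac{\gamma}{\gamma_i} \frac{e^{\beta U_0}}{\beta U_0} \times \left[ \frac{\tau_m}{\tau_D} + 4 \beta U_0 \left( \frac{\tau_i}{\tau_D} \right)^{2} + \sqrt{\beta U_0 \frac{\tau_m}{\tau_D}} \right]. \quad [M3]$$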

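Here, $\beta = 1/k_B T$ is the inverse thermal energy (not to be confused with the contact parameter β of Eq. M1, which has units of inverse length). We combine Eqs. M2 and M3 such that the predicted mean first-passage times are given by

$$\tau_{\mathrm{MFP}}^{\mathrm{H}}(\tau_D, U_0, \{\gamma_i\}, \{\tau_i\}, \tau_m) = \sum_{i=1}^{M} \tau_{\mathrm{OD}}^{i} + \left[ \sum_{i=1}^{M} \frac{1}{\tau_{\mathrm{ED}}^{i}} \right]^{-1}. \quad [M4]$$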

$\{\gamma_i\}$ and $\{\tau_i\}$ are the sets of M amplitudes and time scales that appear in Eqs. M2 and M3. The total friction γ that appears in Eqs. M2 and M3 is the sum of the amplitudes, $\gamma = \sum_{i=1}^{M} \gamma_i$. The overdamped Markovian limit is achieved by setting all memory time scales and the inertial time equal to 0 and is given by

$$\tau_{\mathrm{MFP}}^{\mathrm{H}}(\tau_D, U_0, \{\gamma_i\}, 0, 0) = \tau_D \frac{\pi e^{\beta U_0}}{2\sqrt{2}\, \beta U_0}, \quad [M5]$$

leading to the non-Markovian correction factor $\xi_{\mathrm{noMar}}$ and the non-Markovian barrier crossing time $\tau_{\mathrm{MFP}}^{\mathrm{noMar}}$, as given by Eqs. 4 and 5, respectively. The α-scaling $\xi_{\mathrm{noMar}}^{\alpha}$, as discussed in the main text, is described in detail in SI Appendix, section 10.
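Eqs. M2 to M5 translate directly into a short numerical routine. The sketch below is illustrative only: the function names are hypothetical, all times are assumed to share one unit, and the barrier height enters as the dimensionless product βU0. The ratio of the full result to the Markovian limit is computed in the test values purely for comparison; whether it coincides exactly with the correction factor of Eq. 4 is an assumption, since that equation is not reproduced in this section.

```python
import numpy as np

def tau_mfp_heuristic(tau_D, beta_U0, gammas, taus, tau_m):
    """Heuristic mean first-passage time of Eq. M4.

    gammas, taus: memory amplitudes and memory times (length M).
    beta_U0: barrier height in units of k_B T.
    """
    gammas = np.asarray(gammas, dtype=float)
    taus = np.asarray(taus, dtype=float)
    gamma = gammas.sum()                              # total friction
    prefactor = tau_D * np.exp(beta_U0) / beta_U0
    inertial = np.sqrt(beta_U0 * tau_m / tau_D)
    # Overdamped contributions, Eq. M2
    tau_od = prefactor * (gammas / gamma) * (
        np.pi / (2.0 * np.sqrt(2.0)) / (1.0 + 10.0 * beta_U0 * taus / tau_D)
        + inertial
    )
    # Energy-diffusion contributions, Eq. M3
    tau_ed = prefactor * (gamma / gammas) * (
        tau_m / tau_D + 4.0 * beta_U0 * (taus / tau_D) ** 2 + inertial
    )
    if np.any(tau_ed == 0.0):
        # A vanishing term makes the inverse sum diverge, so the bracket is 0
        ed_total = 0.0
    else:
        ed_total = 1.0 / np.sum(1.0 / tau_ed)
    return tau_od.sum() + ed_total

def tau_mfp_markov(tau_D, beta_U0):
    """Overdamped Markovian limit, Eq. M5."""
    return tau_D * np.pi * np.exp(beta_U0) / (2.0 * np.sqrt(2.0) * beta_U0)
```

Setting all memory times and the inertial time to zero in `tau_mfp_heuristic` recovers `tau_mfp_markov`, which is a convenient consistency check on any reimplementation.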

Grote–Hynes Theory and Barrier Crossing Predictions.

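The Grote–Hynes (GH) prediction for barrier crossing with frequency-dependent friction is given by (11)

$$\tau_{\mathrm{GH}} = \frac{2\pi \omega_{\max}}{\lambda\, \omega_{\min}} e^{\beta U_0}, \quad [M6]$$

where $\omega_{\max} = \sqrt{|U''_{\max}|/m}$ and $\omega_{\min} = \sqrt{U''_{\min}/m}$ are the barrier frequencies at the free energy maximum and minimum. The free energy is assumed to be parabolic at these locations, such that $U''_{\max}$ and $U''_{\min}$ are the quadratic curvatures there. λ is the barrier reactive frequency determined by solving the Grote–Hynes equation

$$\lambda^{2} + \lambda \frac{\tilde{\Gamma}(\lambda)}{m} = \omega_{\max}^{2}. \quad [M7]$$

$\tilde{\Gamma}(\lambda)$ is the Laplace transform of the friction memory kernel, given by

$$\tilde{\Gamma}(\lambda) = \int_{0}^{\infty} \Gamma(t)\, e^{-\lambda t}\, dt. \quad [M8]$$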

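In the Markovian limit, i.e., when Γ(t)=γδ(t), it can be shown that Eq. M6 reduces to the well-known Kramers (Kr) prediction for a massive particle in the medium-to-high friction regime

$$\tau_{\mathrm{Kr}}^{\mathrm{GH}} = \left[ \sqrt{\frac{\gamma^{2}}{4 m^{2}} + \omega_{\max}^{2}} - \frac{\gamma}{2m} \right]^{-1} \frac{2\pi \omega_{\max}}{\omega_{\min}} e^{\beta U_0}. \quad [M9]$$

In the high-friction (HF) limit, i.e., $\tau_m/\tau_D \to 0$, Eq. M9 further reduces to

$$\tau_{\mathrm{HF}}^{\mathrm{GH}} = \frac{2\pi \gamma}{m\, \omega_{\min} \omega_{\max}} e^{\beta U_0}. \quad [M10]$$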
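For readers who want the intermediate steps, the two limits can be sketched as follows. The derivation assumes that the Markovian kernel contributes its full weight to the half-line integral of Eq. M8, so that the Laplace transform is the constant γ; this is the reading under which Eq. M7 reproduces Eq. M9.

```latex
\begin{align}
% Markovian limit: Eq. M7 with \tilde{\Gamma}(\lambda) = \gamma is a quadratic in \lambda
\lambda^{2} + \frac{\gamma}{m}\lambda - \omega_{\max}^{2} = 0
\quad &\Rightarrow \quad
\lambda = \sqrt{\frac{\gamma^{2}}{4m^{2}} + \omega_{\max}^{2}} - \frac{\gamma}{2m}, \\
% High-friction limit: expand the square root for \gamma/(2m) \gg \omega_{\max}
\lambda \approx \frac{m\,\omega_{\max}^{2}}{\gamma}
\quad &\Rightarrow \quad
\tau_{\mathrm{GH}} \approx \frac{2\pi\omega_{\max}}{\omega_{\min}}
\frac{\gamma}{m\,\omega_{\max}^{2}}\, e^{\beta U_0}
= \frac{2\pi\gamma}{m\,\omega_{\min}\omega_{\max}}\, e^{\beta U_0}.
\end{align}
```

Only the positive root is kept in the first line, since λ is a growth rate at the barrier top. Inserting it into Eq. M6 gives Eq. M9, and the expansion in the second line gives Eq. M10.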

An extended discussion of the Grote–Hynes calculation, including the α-scaling discussed in the main text, is given in SI Appendix, section 10.
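A numerical solution of Eq. M7 can be sketched as follows. The multi-exponential kernel form Γ(t) = Σ (γi/τi) exp(−t/τi), with Laplace transform Σ γi/(1 + λτi), is an assumption made for this illustration, as are the function names; the bisection solver is a generic choice and not necessarily the method used by the authors.

```python
import math

def laplace_kernel(lam, gammas, taus):
    """Laplace transform of the assumed kernel sum_i (gamma_i/tau_i) exp(-t/tau_i).

    A memory time of zero gives a constant term gamma_i (Markovian component).
    """
    return sum(g / (1.0 + lam * t) for g, t in zip(gammas, taus))

def grote_hynes_lambda(omega_max, m, gammas, taus, tol=1e-12):
    """Solve Eq. M7 for the reactive frequency lambda by bisection."""
    def f(lam):
        return lam ** 2 + lam * laplace_kernel(lam, gammas, taus) / m - omega_max ** 2

    # f(0) < 0 and f(omega_max) >= 0; f increases monotonically for this kernel
    lo, hi = 0.0, omega_max
    while hi - lo > tol * omega_max:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def tau_grote_hynes(omega_max, omega_min, lam, beta_U0):
    """Barrier-crossing time of Eq. M6 for a given reactive frequency."""
    return 2.0 * math.pi * omega_max / (lam * omega_min) * math.exp(beta_U0)
```

With a single component and a memory time of zero, the solver reproduces the closed-form root that enters Eq. M9. A finite memory time lowers the effective friction at the barrier, which raises λ and shortens the predicted crossing time.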

Supplementary Material

Appendix 01 (PDF)

Acknowledgments

We acknowledge support by the ERC Advanced Grant No. 835117 NoMaMemo and by the Deutsche Forschungsgemeinschaft (DFG) via the project SFB 1449-431232613 A02 and A03. We gratefully acknowledge computing time on the HPC clusters at the Physics department and Die Zentraleinrichtung für Datenverarbeitung (ZEDAT), FU Berlin.

Author contributions

B.A.D. and R.R.N. designed research; B.A.D. performed research; B.A.D., C.A., H.K., A.K., and L.T. analyzed data; C.A., H.K., A.K., and L.T. provided supplementary data and analysis; and B.A.D. and R.R.N. wrote the paper.

Competing interests

The authors declare no competing interest.

Footnotes

This article is a PNAS Direct Submission. J.E.S. is a guest editor invited by the Editorial Board.

Data, Materials, and Software Availability

Previously published data were used for this work (24). Processed data is available upon reasonable request.

References

  • 1. Bryngelson J. D., Wolynes P. G., Intermediates and barrier crossing in a random energy model (with applications to protein folding). J. Phys. Chem. 93, 6902–6915 (1989).
  • 2. Bryngelson J. D., Onuchic J. N., Socci N. D., Wolynes P. G., Funnels, pathways, and the energy landscape of protein folding: A synthesis. Proteins: Struct. 21 (1995).
  • 3. Dill K. A., Chan H. S., From Levinthal to pathways to funnels. Nat. Struct. Biol. 4, 10–19 (1997).
  • 4. Schuler B., Eaton W. A., Protein folding studied by single-molecule FRET. Curr. Opin. Struct. Biol. 18, 16–26 (2008).
  • 5. Hinczewski M., Gebhardt J. C. M., Rief M., Thirumalai D., From mechanical folding trajectories to intrinsic energy landscapes of biopolymers. Proc. Natl. Acad. Sci. 110, 4500–4505 (2013).
  • 6. Plotkin S. S., Wolynes P. G., Buffed energy landscapes: Another solution to the kinetic paradoxes of protein folding. Proc. Natl. Acad. Sci. 100, 4417–4422 (2003).
  • 7. Arrhenius S., Über die Reaktionsgeschwindigkeit bei der Inversion von Rohrzucker durch Säuren. Zeitschrift für Physikalische Chemie 4U, 226–248 (1889).
  • 8. Eyring H., The Activated Complex in Chemical Reactions. J. Chem. Phys. 3, 107–115 (1935).
  • 9. Kramers H. A., Brownian motion in a field of force and the diffusion model of chemical reactions. Physica 7, 284–304 (1940).
  • 10. Mel’nikov V. I., Meshkov S. V., Theory of activated rate processes: Exact solution of the Kramers problem. J. Chem. Phys. 85, 1018–1027 (1986).
  • 11. Grote R. F., Hynes J. T., The stable states picture of chemical reactions. II. Rate constants for condensed and gas phase reaction models. J. Chem. Phys. 73, 2715–2732 (1980).
  • 12. Kappler J., Daldrop J. O., Brünig F. N., Boehle M. D., Netz R. R., Memory-induced acceleration and slowdown of barrier crossing. J. Chem. Phys. 148, 14903 (2018).
  • 13. Kappler J., Hinrichsen V. B., Netz R. R., Non-Markovian barrier crossing with two-time-scale memory is dominated by the faster memory component. Euro. Phys. J. E 42, 119 (2019).
  • 14. Lavacchi L., Kappler J., Netz R. R., Barrier crossing in the presence of multi-exponential memory functions with unequal friction amplitudes and memory times. EPL 131 (2020).
  • 15. Ansari A., Jones C. M., Henry E. R., Hofrichter J., Eaton W. A., The role of solvent viscosity in the dynamics of protein conformational changes. Science 256, 1796–1798 (1992).
  • 16. Echeverria I., Makarov D. E., Papoian G. A., Concerted Dihedral Rotations Give Rise to Internal Friction in Unfolded Proteins. J. Am. Chem. Soc. 136, 8708–8713 (2014).
  • 17. Schulz J. C. F., Schmidt L., Best R. B., Dzubiella J., Netz R. R., Peptide Chain Dynamics in Light and Heavy Water: Zooming in on Internal Friction. J. Am. Chem. Soc. 134, 6273–6279 (2012).
  • 18. de Sancho D., Sirur A., Best R. B., Molecular origins of internal friction effects on protein-folding rates. Nat. Commun. 5, 4307 (2014).
  • 19. Lange O. F., Grubmüller H., Collective Langevin dynamics of conformational motions in proteins. J. Chem. Phys. 124, 214903 (2006).
  • 20. Best R. B., Hummer G., Diffusive model of protein folding dynamics with Kramers turnover in rate. Phys. Rev. Lett. 96, 228104 (2006).
  • 21. Best R. B., Hummer G., Coordinate-dependent diffusion in protein folding. Proc. Natl. Acad. Sci. 107, 1088–1093 (2010).
  • 22. Hinczewski M., von Hansen Y., Dzubiella J., Netz R. R., How the diffusivity profile reduces the arbitrariness of protein folding free energies. J. Chem. Phys. 132, 245103 (2010).
  • 23. Chung H. S., Piana-Agostinetti S., Shaw D. E., Eaton W. A., Structural origin of slow diffusion in protein folding. Science (New York, N.Y.) 349, 1504–1510 (2015).
  • 24. Lindorff-Larsen K., Piana S., Dror R. O., Shaw D. E., How Fast-Folding Proteins Fold. Science 334, 517–520 (2011).
  • 25. Shaw D. E., et al., Millisecond-scale molecular dynamics simulations on Anton, in Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, (2009), pp. 1–11.
  • 26. Shaw D. E., et al., Atomic-level characterization of the structural dynamics of proteins. Science (New York, N.Y.) 330, 341–346 (2010).
  • 27. Plotkin S. S., Wolynes P. G., Non-Markovian Configurational Diffusion and Reaction Coordinates for Protein Folding. Phys. Rev. Lett. 80, 5015–5018 (1998).
  • 28. Berezhkovskii A. M., Makarov D. E., Single-Molecule Test for Markovianity of the Dynamics along a Reaction Coordinate. J. Phys. Chem. Lett. 9, 2190–2195 (2018).
  • 29. Cao S., Montoya-Castillo A., Wang W., Markland T. E., Huang X., On the advantages of exploiting memory in Markov state models for biomolecular dynamics. J. Chem. Phys. 153, 14105 (2020).
  • 30. Daldrop J. O., Kappler J., Brünig F. N., Netz R. R., Butane dihedral angle dynamics in water is dominated by internal friction. Proc. Natl. Acad. Sci. 115, 5169–5174 (2018).
  • 31. Kowalik B., et al., Memory-kernel extraction for different molecular solutes in solvents of varying viscosity in confinement. Phys. Rev. E 100, 12126 (2019).
  • 32. Ayaz C., et al., Non-Markovian modeling of protein folding. Proc. Natl. Acad. Sci. 118, e2023856118 (2021).
  • 33. Zwanzig R., Memory Effects in Irreversible Thermodynamics. Phys. Rev. 124, 983–992 (1961).
  • 34. Mori H., Transport, Collective Motion, and Brownian Motion. Progr. Theor. Phys. 33, 423–455 (1965).
  • 35. Shakhnovich E., Farztdinov G., Gutin A. M., Karplus M., Protein folding bottlenecks: A lattice Monte Carlo simulation. Phys. Rev. Lett. 67, 1665–1668 (1991).
  • 36. Best R. B., Hummer G., Eaton W. A., Native contacts determine protein folding mechanisms in atomistic simulations. Proc. Natl. Acad. Sci. 110, 17874–17879 (2013).
  • 37. Ayaz C., Scalfi L., Dalton B. A., Netz R. R., Generalized Langevin equation with a nonlinear potential of mean force and nonlinear memory friction from a hybrid projection scheme. Phys. Rev. E 105, 54138 (2022).
  • 38. Vroylandt H., Monmarché P., Position-dependent memory kernel in generalized Langevin equations: Theory and numerical estimation. J. Chem. Phys. 156, 244105 (2022).
  • 39. Vroylandt H., On the derivation of the generalized Langevin equation and the fluctuation-dissipation theorem. Euro. Lett. 140, 62003 (2022).
  • 40. Brünig F. N., Daldrop J. O., Netz R. R., Pair-Reaction Dynamics in Water: Competition of Memory, Potential Shape, and Inertial Effects. J. Phys. Chem. B 126, 10295–10304 (2022).
  • 41. Brünig F. N., Geburtig O., von Canal A., Kappler J., Netz R. R., Time-Dependent Friction Effects on Vibrational Infrared Frequencies and Line Shapes of Liquid Water. J. Phys. Chem. B 126, 1579–1589 (2022).
  • 42. Straub J. E., Borkovec M., Berne B. J., Calculation of dynamic friction on intramolecular degrees of freedom. J. Phys. Chem. 91, 4995–4998 (1987).
  • 43. Daldrop J. O., Kowalik B. G., Netz R. R., External Potential Modifies Friction of Molecular Solutes in Water. Phys. Rev. X 7, 41065 (2017).
  • 44. Darve E., Solomon J., Kia A., Computing generalized Langevin equations and generalized Fokker-Planck equations. Proc. Natl. Acad. Sci. 106, 10884–10889 (2009).
  • 45. de Gennes P. G., Reptation of a Polymer Chain in the Presence of Fixed Obstacles. J. Chem. Phys. 55, 572–579 (1971).
  • 46. Alexander-Katz A., Wada H., Netz R. R., Internal Friction and Nonequilibrium Unfolding of Polymeric Globules. Phys. Rev. Lett. 103, 28102 (2009).
  • 47. Plaxco K. W., Simons K. T., Baker D., Contact order, transition state placement and the refolding rates of single domain proteins. J. Mol. Biol. 277, 985–994 (1998).
  • 48. Zheng W., Best R. B., Reduction of All-Atom Protein Folding Dynamics to One-Dimensional Diffusion. J. Phys. Chem. B 119, 15247–15255 (2015).
  • 49. Zwanzig R., Nonequilibrium Statistical Mechanics (Oxford University Press, 2001).
  • 50. Chung H. S., Eaton W. A., Single-molecule fluorescence probes dynamics of barrier crossing. Nature 502, 685–688 (2013).
  • 51. Chung H. S., McHale K., Louis J. M., Eaton W. A., Single-Molecule Fluorescence Experiments Determine Protein Folding Transition Path Times. Science 335, 981–984 (2012).
  • 52. Neupane K., et al., Direct observation of transition paths during the folding of proteins and nucleic acids. Science 352, 239–242 (2016).
  • 53. Neupane K., Manuel A. P., Woodside M. T., Protein folding trajectories can be described quantitatively by one-dimensional diffusion over measured energy landscapes. Nat. Phys. 12, 700–703 (2016).
  • 54. Sturzenegger F., et al., Transition path times of coupled folding and binding reveal the formation of an encounter complex. Nat. Commun. 9, 4708 (2018).
  • 55. Zijlstra N., Nettels D., Satija R., Makarov D. E., Schuler B., Transition Path Dynamics of a Dielectric Particle in a Bistable Optical Trap. Phys. Rev. Lett. 125, 146001 (2020).
  • 56. Satija R., Das A., Makarov D. E., Transition path times reveal memory effects and anomalous diffusion in the dynamics of protein folding. J. Chem. Phys. 147, 152707 (2017).
  • 57. Daldrop J. O., Kim W. K., Netz R. R., Transition paths are hot. EPL (Europhys. Lett.) 113, 18004 (2016).
  • 58. Hummer G., From transition paths to transition states and rate coefficients. J. Chem. Phys. 120, 516–523 (2004).
  • 59. Laleman M., Carlon E., Orland H., Transition path time distributions. J. Chem. Phys. 147, 214103 (2017).
  • 60. Carlon E., Orland H., Sakaue T., Vanderzande C., Effect of Memory and Active Forces on Transition Path Time Distributions. J. Phys. Chem. B 122, 11186–11194 (2018).
  • 61. Geissler P. L., Dellago C., Chandler D., Kinetic Pathways of Ion Pair Dissociation in Water. J. Phys. Chem. B 103, 3706–3710 (1999).
  • 62. Bolhuis P. G., Chandler D., Dellago C., Geissler P. L., Transition path sampling: Throwing ropes over rough mountain passes, in the dark. Annu. Rev. Phys. Chem. 53, 291–318 (2002).
  • 63. Pirchi M., et al., Photon-by-Photon Hidden Markov Model Analysis for Microsecond Single-Molecule FRET Kinetics. J. Phys. Chem. B 120, 13065–13075 (2016).
  • 64. Aviram H. Y., et al., Direct observation of ultrafast large-scale dynamics of an enzyme under turnover conditions. Proc. Natl. Acad. Sci. 115, 3243–3248 (2018).
  • 65. Daura X., et al., Peptide Folding: When Simulation Meets Experiment. Angew. Chem. Int. Edition 38, 236–240 (1999).
