Proceedings of the National Academy of Sciences of the United States of America. 2014 Dec 8;111(51):18243–18248. doi: 10.1073/pnas.1420914111

Folding kinetics of WW domains with the united residue force field for bridging microscopic motions and experimental measurements

Rui Zhou a,b,c,1, Gia G Maisuradze a,1,2, David Suñol d, Toni Todorovski d, Maria J Macias d,e, Yi Xiao b, Harold A Scheraga a,2, Cezary Czaplewski c, Adam Liwo c,1,2
PMCID: PMC4280612  PMID: 25489078

Significance

In spite of recent advances made in computer simulation techniques, one of the main challenges in the protein-folding field is to bridge microscopic motions and experimental measurements. This paper demonstrates that the physics-based, coarse-grained united-residue (UNRES) force field, which has the ability to simulate folding of small- and midsize proteins in the millisecond timescale, can predict the folding kinetics correctly and bridge theoretical and experimental worlds. The results suggest that the use of the UNRES force field will open a new door to the understanding of protein motions at much longer timescales and help explain the differences between theoretical results and experimental observations.

Keywords: FBP28 WW domain, nonnatural variants, folding rates, free-energy landscapes, millisecond-timescale canonical MD simulations

Abstract

To demonstrate the utility of the coarse-grained united-residue (UNRES) force field to compare experimental and computed kinetic data for folding proteins, we have performed long-time millisecond-timescale canonical Langevin molecular dynamics simulations of the triple β-strand from the Formin binding protein 28 WW domain and six nonnatural variants, using UNRES. The results have been compared with available experimental data in both a qualitative and a quantitative manner. Complexities of the folding pathways, which cannot be determined experimentally, were revealed. The folding mechanisms obtained from the simulated folding kinetics are in agreement with experimental results, with a few discrepancies for which we have accounted. The origins of single- and double-exponential kinetics and their correlations with two- and three-state folding scenarios are shown to be related to the relative barrier heights between the various states. The rate constants obtained from time profiles of the fractions of the native, intermediate, and unfolded structures, and the kinetic equations fitted to them, correlate with the experimental values; however, they are about three orders of magnitude larger than the experimental ones for most of the systems. These differences are in agreement with the timescale extension derived by scaling down the friction of water and averaging out the fast degrees of freedom when passing from all-atom to a coarse-grained representation. Our results indicate that the UNRES force field can provide accurate predictions of folding kinetics of these WW domains, often used as models for the study of the mechanisms of protein folding.


Recent advances in computer simulation techniques have facilitated the direct study of the folding process of small fast-folding proteins, using all-atom force fields (1). However, it is important to validate the simulation methodologies, and the only way to accomplish this is a quantitative comparison with experimental data with proper statistics. The validation of all-atom simulation methodologies is still a major problem because of the differences between the experimental timescale (from multiple microseconds to seconds) and the theoretical one (from hundreds of nanoseconds to microseconds). To overcome this problem, many approximate coarse-grained methods have been developed during the past decade (2–5). One of them makes use of a physics-based united-residue (UNRES) force field developed in our group over the past years (6–14) (SI Appendix, Fig. S1 and SI Materials and Methods).

The folding and unfolding rates are among the most accessible quantitative observables for two- and multistate folding proteins; therefore, a study of protein folding kinetics can bridge microscopic motions and the world of experimental measurements. In analyzing protein folding kinetics, the differential rate equations and their integrated forms become more complex as the number of intermediate forms between the completely unfolded form and the native form increases. Therefore, to determine the mechanisms and the microscopic rate constants, it is necessary to vary them to obtain a computed folding trajectory that matches the one that is simulated by molecular dynamics.

To cover a sufficiently large timescale and obtain a stable folding trajectory theoretically, it is necessary to use a coarse-grained, rather than an all-atom, force field. For this purpose, the UNRES force field is used to compute folding trajectories by canonical Langevin dynamics simulations. Then, any intermediate states are identified, the fractions of unfolded, intermediate, and native states of the protein (averaged over all trajectories) are determined as a function of time, and the kinetic equations are fitted to these data to compare the calculated rate constants with those determined experimentally (15).

This general approach is illustrated here, as an example, with the triple-β–stranded WW domain from the Formin binding protein 28 (FBP28) (PDB ID 1E0L) (16) and its full-size and truncated mutants (15) (SI Appendix, Fig. S2). The FBP28 WW domain is a member of the WW-domain family (17), with its kinetics examined by possible two-state and three-state models. The FBP28 WW domain is a good model with which to study β-sheet formation. It should be noted that the WW domains have been the subject of extensive theoretical (1, 14, 18–28) and experimental (15–17, 29–34) studies because of their small size, biological importance (35), and interesting fast-folding kinetics.

As indicated here, a controversy still exists about whether the FBP28 WW domain folding proceeds by a two- or three-state mechanism. Using temperature denaturation and laser temperature-jump relaxation experiments, Nguyen et al. (15) concluded that the FBP28 WW domain can be tuned between two-state and three-state kinetics by temperature changes, selected point mutation, and truncation. These experiments (15) indicated that, below the transition midpoint, the wild-type (WT) FBP28 WW domain deviates from single-exponential kinetics, implying at least a three-state folding scenario, with two folding rate constants of 0.030 μs⁻¹ and <0.0011 μs⁻¹, respectively. Above the folding-transition temperature, single-exponential kinetics were observed with a folding-rate constant of 0.071 μs⁻¹ (15). On the other hand, Ferguson et al. (31) argued that the kinetics proceed by a two-state mechanism and that the biphasic kinetics observed by Nguyen et al. (15) might be related to misfolding and aggregation, rapidly forming ribbon-like fibrils at physiological temperature and pH, with morphology typical of amyloid fibrils. Recently, by using infrared and fluorescence spectroscopy in studying the FBP28 WW domain and its tryptophan mutants (W8Y and W30Y), Davis and Dyer (36) found that the folding mechanism for the FBP28 WW domain is similar to that proposed by others (15, 29, 32), but their W8Y and W30Y mutants (36) provide evidence of an intermediate dry molten globule state.

Many computational studies were also performed on the FBP28 WW domain to gain atomic details and femtosecond temporal resolution. By carrying out all-atom molecular dynamics simulations with a Gō-like model on the FBP28 WW domain, Karanicolas and Brooks (18) demonstrated that biphasic kinetics originated from slow formation of the C-terminal β-hairpin in the FBP28 WW domain. Xu et al. (25) applied an all-atom Monte Carlo simulation to study the folding kinetics and revealed that two major folding pathways exist that differ in the order and mechanism of hairpin formation. Mu et al. (20) carried out replica exchange molecular dynamics (REMD) with explicit water at the all-atom level to explore the details of the kinetics. The results of Mu et al. showed that the formation of the second turn in the transition-state structure was responsible for the stable intermediate state that would lead to aggregation and misfolded structures as proposed by Ferguson et al. (31). By studying the effects of macromolecular crowding and confinement on the transition state structures in comparison with bulk for the 1PIN WW domain, Cheung and Thirumalai (37) found that (i) the folding rates of this protein in the presence of crowding or in confined spaces typically, but not always, increase because of entropic destabilization of the denatured states and, (ii) depending on which phenomenon is dominant, the entropic stabilization or enthalpic interactions, the transition state structures can be similar to or different from those in bulk.

One way to settle the discrepancies between the theoretical and experimental studies is to compare the same parameters obtained from theoretical simulations and experimental data. Therefore, we have generated a large number of trajectories by canonical Langevin dynamics simulations with the UNRES force field to analyze the folding kinetics of the FBP28 WW domain and its six mutants (Y11R, Y19L, W30F, ΔNY11R, ΔNΔCY11R, and ΔNΔCY11R/L26A, where ΔN and ΔC denote the deletion of the five N-terminal and the four C-terminal residues, respectively). These mutants, as well as the WT FBP28 WW domain, are associated with a strand-crossing hydrophobic cluster Tyr11/Tyr19/Trp30, involved in either hairpin of each protein (15). We ran 512 trajectories for each system (about 5–8 μs formal time and effectively 5–8 ms of each trajectory, per the UNRES timescale) of canonical Langevin dynamics simulations, starting from the fully extended structure. Then, we identified the native, intermediate, and unfolded states and determined the dependence of the fractions of these three states of each protein (averaged over all trajectories) as a function of time and fitted kinetic equations to these time-dependent fractions. The calculated rate constants were compared with those determined experimentally (15). Moreover, the MD trajectories were analyzed in terms of free-energy landscapes (FELs) along the Cα-rmsd from the native structure and radius of gyration (Rg). We have also characterized the structures of all variants, using the data obtained from high-resolution NMR (SI Appendix, Fig. S2). The variants investigated adopt a similar fold to that of the wild-type protein, with small differences in and around the sites where mutations and deletions have been introduced.

Results

Kinetic Studies for Wild-Type FBP28 Domain and Its Mutants.

The simulated results for the three full-size and three truncated mutants, as well as the WT FBP28 WW domain (SI Appendix, Fig. S2), are presented here. All parameters obtained from the fitting of the simulation data by Eq. 1 (single-exponential kinetics) and Eqs. 2 and 3 (double-exponential kinetics) are summarized in Table 1. Plots of the fractions of native structures vs. time, [N](t) (light blue ragged lines), and the fractions of intermediate structures vs. time, [I](t) (light blue ragged lines), along with fitting curves [single-exponential kinetics (dotted black line) and double-exponential kinetics (solid black line)] for the WT FBP28 WW domain (Fig. 1 A and B) and its two mutants, Y11R (Fig. 1 C and D) and ΔNΔCY11R/L26A (Fig. 1 E and F), are shown (similar plots for the rest of the mutants are illustrated in SI Appendix, Fig. S3). The fitting results, illustrated in Fig. 1 and SI Appendix, Fig. S3, and χ2 values in Table 1 indicate that all systems, except for the Y19L and W30F mutants (SI Appendix, Fig. S3 A–D), exhibit double-exponential kinetics, i.e., are three-state folders. The single-exponential kinetics of W30F are in agreement with experimental results (15), whereas the single- and double-exponential kinetics of Y19L (SI Appendix, Fig. S3 A and B) and ΔNΔCY11R (SI Appendix, Fig. S3 G and H) mutants, respectively, differ from experimental results, in which Y19L and ΔNΔCY11R mutants exhibit double- and single-exponential kinetics, respectively (15).

Table 1.

Fitting results of native states by single- (Eq. 1) and double-exponential (Eqs. 2 and 3) kinetics

| Name | C0 | λ0*, ×10³ ns⁻¹ | χ²†, ×10² (two-state model) | C1 | m1 | C2 | m2 | λ1‡ (sim), ns⁻¹ | λ2‡ (sim), ns⁻¹ | χ²†, ×10² (three-state model) | λ1¶ (exp)§, μs⁻¹ | λ2¶ (exp)§, μs⁻¹ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Wild type | 0.67 | 0.43 | 2.4 | 0.72 | 0.11 | 0.15 | 0.57 | 0.085 | 0.00032 | 1.4 | 0.030(1)# | <0.0011 |
| Y11R | 0.80 | 3.0 | 1.8 | 0.80 | 0.12 | 0.10 | 1.97 | 0.040 | 0.0026 | 1.3 | 0.025(4) | <0.0014 |
| Y19L | 0.018 | 6.2 | 0.23 | 0.019 | 0.79 | 0.57 | 0.35 | 0.24 | 0.00041 | 0.65 | 0.035(2) | <0.0021 |
| W30F | 0.16 | 1.3 | 0.84 | 0.31 | 0.26 | 0.68 | 0.33 | 0.36 | 0.00012 | 1.8 | 0.054(2) | |
| ΔNY11R | 0.37 | 2.8 | 2.2 | 0.37 | 0.05 | 0.24 | 0.81 | 0.76 | 0.0025 | 2.0 | 0.026(3) | <0.0016 |
| ΔNΔCY11R | 0.33 | 3.2 | 1.8 | 0.33 | 0.17 | 0.37 | 0.51 | 0.061 | 0.0025 | 1.7 | 0.050(2) | |
| ΔNΔCY11R/L26A | 0.46 | 2.2 | 2.0 | 0.47 | 0.14 | 0.29 | 0.55 | 0.039 | 0.0019 | 1.7 | 0.044(1) | <0.0020 |

The columns C0, λ0, and the first χ² belong to the two-state model; C1 through the second χ² belong to the three-state model; the last two columns are experimental data.

*λ0 of Eq. 1.

†Sum of squares of SI Appendix, Eq. S24, divided by the number of degrees of freedom.

‡λ1 (sim) and λ2 (sim) of Eqs. 2 and 3.

§Ref. 15.

¶λ1 (exp) and λ2 (exp) are the same as k1 and k2 in ref. 15.

#SDs are in parentheses.
Fig. 1.

Fractions of native (A, C, and E) and intermediate (B, D, and F) structures as functions of time for the wild-type FBP28 WW domain (A and B) and its Y11R (C and D) and ΔNΔCY11R/L26A (E and F) mutants. Light blue ragged lines represent the simulation data; dotted lines correspond to the fits by single-exponential kinetics (Eq. 1, fractions of only native structures); solid lines correspond to the fits by double-exponential kinetics (Eqs. 2 and 3, fractions of the native and intermediate structures, respectively). Insets show enlarged views of the fractions of native structures for the first several hundred nanoseconds.

For the three-state folding systems, both macroscopic rate constants λ1 and λ2 in Table 1 have meaningful (positive) values, and the intermediate state is present in appreciable quantity even at the end of the simulations for all molecules, sometimes in an amount comparable to or greater than that of the native state, as illustrated for ΔNΔCY11R/L26A, ΔNY11R, and ΔNΔCY11R in Fig. 1F and SI Appendix, Fig. S3 F and H, respectively. Table 1 also includes the experimental rate constants determined by Nguyen et al. (15). The rate constants determined by fitting to the simulation results are about three orders of magnitude greater than the experimental values (except for ΔNY11R), which is consistent with the fact that the timescale of UNRES is extended by about three orders of magnitude because of averaging out the fast motions of the secondary degrees of freedom (19) and scaling down water friction in our Langevin dynamics simulations by a factor of 1,000.
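The order-of-magnitude comparison can be checked directly from the fast-phase rate constants listed in Table 1. The following is a minimal sketch, not part of the original analysis: the helper name `speedup` and the dictionary layout are illustrative choices, and the only inputs are the tabulated λ1 (sim) values in ns⁻¹ and λ1 (exp) values in μs⁻¹.

```python
import math

# Fast-phase rate constants copied from Table 1:
# (lambda1 simulated, ns^-1; lambda1 experimental, us^-1).
FAST_RATES = {
    "Wild type": (0.085, 0.030),
    "Y11R": (0.040, 0.025),
    "Y19L": (0.24, 0.035),
    "W30F": (0.36, 0.054),
    "ΔNY11R": (0.76, 0.026),
    "ΔNΔCY11R": (0.061, 0.050),
    "ΔNΔCY11R/L26A": (0.039, 0.044),
}


def speedup(sim_per_ns, exp_per_us):
    """Ratio of simulated to experimental rate after converting ns^-1 to us^-1."""
    return sim_per_ns * 1000.0 / exp_per_us


# Hypothetical helper output: one ratio per variant.
ratios = {name: speedup(sim, exp) for name, (sim, exp) in FAST_RATES.items()}

for name, ratio in ratios.items():
    print(f"{name}: ratio = {ratio:.0f}, log10 = {math.log10(ratio):.2f}")
```

With these inputs the wild-type ratio is roughly 2.8 × 10³, and ΔNY11R gives the largest ratio of the seven systems, in line with its being singled out as the exception.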

The values of the fast-phase rate constants (λ1) determined by simulations correlate well with their experimental counterparts except for the truncated ΔNY11R, ΔNΔCY11R, and ΔNΔCY11R/L26A mutants, which are clear outliers (Fig. 2). Unfortunately, the experimental values of the slow-phase rate constants (λ2), reported only as upper bounds in Table 1, are not accurate enough to allow a similarly meaningful comparison with their simulated counterparts.
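As a rough illustration of this statement, the Pearson correlation coefficient can be computed from the λ1 values in Table 1, with and without the truncated mutants. This is only a sketch under stated assumptions: it is not the procedure used to produce Fig. 2, and the function name `pearson_r` and the grouping into `FULL_SIZE` and `TRUNCATED` are introduced here for illustration.

```python
import math

# (lambda1 experimental, us^-1; lambda1 simulated, ns^-1), copied from Table 1.
FULL_SIZE = {
    "Wild type": (0.030, 0.085),
    "Y11R": (0.025, 0.040),
    "Y19L": (0.035, 0.24),
    "W30F": (0.054, 0.36),
}
TRUNCATED = {
    "ΔNY11R": (0.026, 0.76),
    "ΔNΔCY11R": (0.050, 0.061),
    "ΔNΔCY11R/L26A": (0.044, 0.039),
}


def pearson_r(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r_full_size = pearson_r(list(FULL_SIZE.values()))
r_all = pearson_r(list(FULL_SIZE.values()) + list(TRUNCATED.values()))
print(f"full-size variants only: r = {r_full_size:.2f}")
print(f"all seven systems:       r = {r_all:.2f}")
```

For the tabulated values, the correlation is strongly positive for the four full-size systems and drops markedly once the three truncated mutants are included.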

Fig. 2.

A correlation plot of the experimental (15) and simulated fast cumulative constants (λ1 of Table 1) with a least-squares fitting line.

The preexponential parameter m1 in Eq. 2 determines the percentage of pathways following the fast-phase (single-exponential) folding kinetics; consequently, the parameter (1 − m1) indicates the percentage of pathways following the slow-phase (double-exponential) folding kinetics. Indeed, the results for the m1 parameter, shown in Table 1, are in agreement with the fitting results. For example, the m1 parameter for WT and Y19L is 0.11 and 0.79, respectively; this indicates that 11% and 79% of the pathways of WT and Y19L, respectively, follow the fast-phase route and 89% and 21% of the pathways of WT and Y19L, respectively, follow the slow-phase route. A disagreement between the fitting results and the m1 parameter occurs only for the W30F mutant. A plausible explanation of this discrepancy is given below in Discussion.

Free-Energy Landscapes of Wild-Type FBP28 WW Domain and Its Mutants.

To obtain more insights into the folding kinetics of these proteins and explain the causes of the discrepancies between the experimental and computational results shown above (Figs. 1 and 2 and SI Appendix, Fig. S3), we analyzed the distribution of the conformational states in terms of free-energy landscapes along the Cα-rmsd from the native structure and the radius of gyration (Rg) as order parameters in three time intervals: initial, for which the fraction of the native structures is below 20% of the maximum (equilibrium) fraction; intermediate, for which the fraction of the native structures is from 20% to 50% of the maximum fraction of the native structures; and final, for which the fraction of the native structures exceeds 50% of the maximum value. The FELs along Cα-rmsd and Rg [$\mu(\mathrm{rmsd}, R_g) = -k_B T \ln P(\mathrm{rmsd}, R_g)$, where P, T, and kB are the probability distribution function (pdf), the absolute temperature, and the Boltzmann constant, respectively] for the WT FBP28 WW domain and its Y11R and ΔNΔCY11R/L26A mutants are shown in Figs. 3–5, respectively. The FELs for the rest of the mutants are illustrated in SI Appendix, Figs. S4–S7.
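The free-energy definition used here, μ = −kB T ln P, can be sketched as a two-dimensional histogram over the two order parameters. The example below is a minimal illustration, not the analysis code used in this work: the bin width, the temperature of 300 K, and the function name `free_energy_landscape` are assumptions made for the sake of the example.

```python
import math
from collections import Counter

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def free_energy_landscape(rmsd, rg, bin_width=0.5, temperature=300.0):
    """Return {(rmsd_bin, rg_bin): -kB*T*ln(P)} in kcal/mol.

    P is the fraction of samples falling into each 2D bin; empty bins
    are simply absent from the result. Bin width and temperature are
    illustrative defaults, not values taken from the paper.
    """
    if len(rmsd) != len(rg) or not rmsd:
        raise ValueError("rmsd and rg must be non-empty and of equal length")
    counts = Counter(
        (math.floor(x / bin_width), math.floor(y / bin_width))
        for x, y in zip(rmsd, rg)
    )
    total = sum(counts.values())
    kt = K_B * temperature
    return {bin_id: -kt * math.log(n / total) for bin_id, n in counts.items()}


# Toy data: three native-like samples and one unfolded sample.
fel = free_energy_landscape([1.1, 1.2, 1.3, 6.0], [9.1, 9.2, 9.3, 14.0])
for bin_id, value in sorted(fel.items()):
    print(bin_id, round(value, 3))
```

In this toy example the bin holding three of the four samples lies lower in free energy than the bin holding one sample by kB T ln 3.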

Fig. 3.

Variation of the distribution of conformational states in terms of FELs (in kcal/mol) along the Cα-rmsd and Rgyr order parameters for the wild-type FBP28 WW domain. The data have been collected from different sections of all 512 trajectory sets for the molecule (shown in A–C, respectively). The FEL corresponding to the initial parts of the trajectories (with the average fraction of the native structures up to 20% of the maximum fraction) is shown in A; the FEL from the middle parts of the trajectories (the fraction of the native structures between 20% and 50% of the maximum fraction) is shown in B; and the FEL from the final parts of the trajectories (the fraction of the native structures exceeds 50% of the maximum fraction) is shown in C, respectively. The letters “U,” “I,” and “N” correspond to unfolded, intermediate, and native states, respectively. The representative structures of unfolded, intermediate, and native states are plotted on top of each state. Hairpin 1 and hairpin 2 are circled by black and red lines, correspondingly, in A.

Fig. 5.

(A–C) Same as in Fig. 3 but for the ΔNΔCY11R/L26A truncated mutant.

As was expected, and can be deduced easily from the fraction curves of Fig. 1 and SI Appendix, Fig. S3, the FELs of WT and all full-size and truncated mutants exhibit three-state folding kinetics; however, there are some discrepancies in the folding pathways and in the depths of the states. For example, mutants Y11R and Y19L fold along two different folding pathways (Fig. 4 and SI Appendix, Fig. S4), whereas WT and the rest of the mutants (W30F, ΔNY11R, ΔNΔCY11R, and ΔNΔCY11R/L26A) follow a single folding pathway (Figs. 3 and 5 and SI Appendix, Figs. S5–S7). Both Y11R and Y19L fold either along the pathway, in which hairpin 1 forms first in the intermediate state and then the protein jumps to the native state, or with a different order of formation of hairpins; i.e., hairpin 2 forms first in the intermediate state before the protein reaches the native state. These findings are in agreement with our previous study (27). The replacement of a nonpolar aromatic amino acid (Tyr) by a charged extremely hydrophilic amino acid (Arg) for the Y11R mutant and by a very nonpolar branched aliphatic amino acid (Leu) for the Y19L mutant, both in hairpin 1, may destabilize this hairpin and induce the second folding pathway, as was pointed out in our earlier work (27) (see the spherical representation of the mutated residue in the representative structures of the intermediate state in Fig. 4 and SI Appendix, Fig. S4). The WT and the rest of the mutants fold along a pathway, in which hairpin 1 forms first in the intermediate state before the protein reaches the native state (Figs. 3 and 5 and SI Appendix, Figs. S5–S7). These findings are in agreement with previous experimental (15, 30, 32, 33, 36) and theoretical (18, 24, 27, 28) studies. However, by performing unfolding MD simulations of the FBP28 WW domain, Petrovich et al. 
(32) found that, in 30% of the trajectories, the nonnative helical structure is formed, instead of the first β-strand, allowing the second hairpin to form first. We also observed the nonnative helical structure formed in the N-terminal region in the unfolded state for WT; however, it did not help the second hairpin to form first. Moreover, Petrovich et al. (32) explained the formation of the first hairpin in the remaining 70% of the trajectories; in particular, they found that the first loop is present in the early stage as a kink in the backbone of the protein to allow long-range interactions to occur, followed by side-chain interactions and hydrophobic collapse of residues to enable the first and second β-strands to form the first hairpin. This finding is in agreement with our recent study (28).

Fig. 4.

(A–C) Same as in Fig. 3 but for the Y11R full-size mutant.

The folding scenarios illustrated in the FELs for Y19L and W30F mutants (SI Appendix, Figs. S4 and S5) contradict our findings obtained by fitting the fractions of native and intermediate structures (SI Appendix, Fig. S3 A–D). In particular, the FELs of Y19L and W30F mutants clearly show a three-state folding scenario, whereas the fitting results in SI Appendix, Fig. S3 indicate single-exponential kinetics, which normally implies a two-state folding scenario.

Finally, it should be noted that, based on the results shown in SI Appendix, Figs. S3 A and B and S4, the replacement of Tyr-19 by Leu had the most destabilizing influence among all mutations, which indicates the importance of the Tyr-19 residue for the stability of the protein. These results are in agreement with the findings of an earlier experimental study (16), in which the structural roles of the conserved residues were studied by using site-directed mutagenesis of the FBP28 WW domain.

Discussion

To account for the discrepancies between the results obtained from the kinetic studies and FELs, we drew the free-energy diagrams (SI Appendix, Fig. S8) based on the distributions of the populations of the native, intermediate, and unfolded structures (PN, PI, PU) for all of the systems studied.

The main differences, illustrated in these free-energy diagrams, between the mutants exhibiting single- and double-exponential kinetics are (i) in the depth of the intermediate state and (ii) in the barrier heights of the unfolded/intermediate and intermediate/native states (SI Appendix, Fig. S8). In particular, the intermediate state is the deepest and the barrier height between the unfolded and intermediate states is much lower than the one between the intermediate and native states [ratios (R) between the barrier heights for all systems are in SI Appendix, Fig. S9] for the mutants exhibiting single-exponential kinetics (SI Appendix, Fig. S8 D and E); whereas the systems exhibiting double-exponential kinetics have the deepest native state, and the barrier height between the unfolded and intermediate states is higher than (or comparable to) that between the intermediate and native states (SI Appendix, Figs. S8 A–C and F and S9). This explains why single-exponential kinetics emerge during three-state folding. The point is that the barrier height between the intermediate and native states is so high compared with that between the unfolded and intermediate states [three (for Y19L) or four (for W30F) times higher than that between the unfolded and intermediate states] that the timescale separation occurs and only the slowest step is observed by experiment due to experimental limitations; hence, single-exponential kinetics arise. Among the studied mutants, ΔNΔCY11R is the only exception, which exhibits double-exponential kinetics from the fitting, but the features (depth of the intermediate state, barrier heights) illustrated in the free-energy diagram (SI Appendix, Fig. S8G) are characteristic of single-exponential kinetics, which actually were observed in the experiment (15) for this mutant. 
The reason for the double-exponential kinetics for the ΔNΔCY11R mutant is that the difference between the barrier heights of the unfolded/intermediate (0.53 kBT) and intermediate/native (0.60 kBT) states is not large enough to lead to single-exponential kinetics. However, the difference between the values of χ2 (0.018 and 0.017, respectively) (Table 1) of the single- and double-exponential fits for the ΔNΔCY11R mutant is the smallest among all mutants, which indicates that there is a tiny threshold between single- and double-exponential kinetics for this mutant.

In a perceptive analysis of the timescales for protein folding kinetics, Thirumalai (38) showed that, at the atomic level, the free-energy barrier height scales as $N^{1/2}$, where N is the number of residues; i.e., the free-energy barrier heights for the FBP28 WW domain and its full-size (n = 37) and truncated (n = 32 and n = 28) mutants should vary between 5.3 kBT and 6 kBT. However, because of averaging out the fast motions of the secondary degrees of freedom, at the coarse-grained level, the free-energy barriers, illustrated in SI Appendix, Fig. S8, are lower than those at the atomic level.
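The quoted range follows from taking the square root of the chain lengths. A short worked check, assuming a proportionality constant of one (barrier height equal to the square root of N in units of kBT); the function name is illustrative:

```python
import math


def barrier_estimate_kbt(n_residues):
    """Square-root scaling estimate of the barrier height, in units of kB*T.

    Assumes a prefactor of 1, i.e. barrier = sqrt(N) * kB*T.
    """
    return math.sqrt(n_residues)


# Chain lengths of the full-size and truncated variants quoted in the text.
estimates = {n: round(barrier_estimate_kbt(n), 2) for n in (37, 32, 28)}
print(estimates)
```

The three lengths give about 6.08, 5.66, and 5.29 kBT, which brackets the 5.3 to 6 kBT range stated above.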

Thus, we have illustrated that single-exponential kinetics do not arise only in two-state folding. They may emerge during three-state (or multistate) folding when one of the free-energy barriers is much higher than the other; consequently, a separation of timescales occurs and single-exponential kinetics arise.
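The timescale-separation argument can be illustrated numerically. The sketch below assumes a simplified, irreversible sequential scheme U → I → N with hypothetical rate constants k1 (U → I) and k2 (I → N); this is not the kinetic model derived in SI Appendix, and the rate values are arbitrary. It compares the exact native fraction of this scheme with a single exponential governed by the slower step.

```python
import math


def native_fraction(t, k1, k2):
    """Native fraction for irreversible U -> I -> N starting from pure U (k1 != k2)."""
    return 1.0 - (k1 * math.exp(-k2 * t) - k2 * math.exp(-k1 * t)) / (k1 - k2)


def max_deviation_from_single_exp(k1, k2, t_max=10.0, n_points=1001):
    """Largest gap between the three-state curve and 1 - exp(-k_slow * t)."""
    slow = min(k1, k2)
    worst = 0.0
    for i in range(n_points):
        t = t_max * i / (n_points - 1)
        single = 1.0 - math.exp(-slow * t)
        worst = max(worst, abs(native_fraction(t, k1, k2) - single))
    return worst


# Hypothetical rates in arbitrary inverse-time units.
separated = max_deviation_from_single_exp(k1=100.0, k2=1.0)   # one step much slower
comparable = max_deviation_from_single_exp(k1=2.0, k2=1.0)    # similar barriers
print(f"well-separated rates: max deviation = {separated:.3f}")
print(f"comparable rates:     max deviation = {comparable:.3f}")
```

When the two rates differ by two orders of magnitude, the three-state curve is nearly indistinguishable from a single exponential, whereas comparable rates produce a clearly visible second phase.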

To explain the discrepancy between the fitting results (SI Appendix, Fig. S3 C and D) and the m1 parameter for the W30F mutant, we have calculated the changes in the populations of the intermediate state over the above-defined time intervals (shown in Fig. 3 legend). It turns out that, unlike in the other mutants (in which PI increases gradually over time), PI in W30F increases rapidly (∼28%) in the initial part of the trajectories, then decreases slightly (∼25%) in the middle part of the trajectories, and then increases again (∼38%). This is reflected in the somewhat unusual behavior of the fitted curve of the fractions of the intermediate structures (light blue ragged line in SI Appendix, Fig. S3D); consequently, the m1 parameter obtained from this fitting differs from the expected one (>0.5).

It should be noted that, for a similar reason, i.e., a rapid increase of PI in the initial part of the trajectories, the fast-phase rate constant of the ΔNY11R mutant increased by one order of magnitude.

Finally, the outlier behavior of the truncated mutants in Fig. 2 might be caused by unusual states with structures similar to the native states, but with the second hairpin shifted (Fig. 5 and SI Appendix, Figs. S6 and S7), which suggests that these states might derail folding to misfolded states (20, 39). One of the reasons for the “distortion” of structures of the ΔNΔCY11R and ΔNΔCY11R/L26A mutants could be the deletion of the C-terminal Leu36 residue, part of a delocalized hydrophobic core, Trp8/Tyr20/Pro33. As for the ΔNY11R mutant, based on earlier experimental studies (15, 16), the truncation of the N-terminal residues has no observable effect on the stability of the domain; hence, the reasons for the distortion of the structure must be different. An investigation of the reasons for this distortion is beyond the scope of the present study, but it is worth pursuing in the future.

Conclusions

In this study, we carried out a quantitative analysis of the simulated kinetics of the folding of the FBP28 WW domain and its six mutants at the coarse-grained level. By analysis of the fractions of the native structures, we found double-exponential kinetics (three-state folding) for the WT and its mutants except for the Y19L and W30F mutants, which exhibited single-exponential kinetics implying two-state folding. The results from the FELs along the Cα-rmsd and Rg indicate that the WT and all its mutants are three-state folders. For most of the mutants, the obtained results are in agreement with experiment (15). The discrepancies between the results of our simulation and the experimental (15) kinetics studies and the FELs for some mutants, as well as the origins of single- and double-exponential kinetics and their correlations with two- and three-state folding, are explained in terms of the free-energy barrier heights. In particular, we have shown that single-exponential kinetics can emerge even in three-state (or multistate) folding when one of the free-energy barriers is much higher than the other. The calculated fast-phase rate constants correlate with the experimental values determined by Nguyen et al. (15) except for the truncated mutants. For most of the systems, the rate constants obtained by simulations are greater by about three orders of magnitude, which is caused by averaging out the fast degrees of freedom in our coarse-grained treatment (19) and by scaling down water friction by a factor of 1,000. It should be noted that scaling down the friction is a common practice in coarse-grained Langevin-dynamics simulations to accelerate the simulations (40, 41).

An important finding of this work is that, for such small proteins as the FBP28 WW domain and its variants, the intermediate and unfolded states are at equilibrium with the native state in water solutions (Fig. 1 and SI Appendix, Fig. S3). Therefore, estimating the content of the native state based on, e.g., the CD or fluorescence spectra of a protein solution must be considered with caution because the unfolded protein/intermediate state can be present even at temperatures well below the folding-transition temperature.

As in our previous studies (27, 28), consistent with previous findings (15, 18, 24, 30, 32, 33, 36), we found that the intermediate state consists predominantly of conformations with hairpin 1 well established and hairpin 2 only outlined. The Y11R and Y19L mutants are exceptions (Fig. 4 and SI Appendix, Fig. S4) because of the destabilization of hairpin 1 caused by the replacement of a hydrophobic tyrosine residue with a charged extremely hydrophilic (arginine) and a very nonpolar branched aliphatic (leucine) residue.

Materials and Methods

To determine the rate constants for protein folding kinetics, we first derived the rate equations for two- and three-state folding models. All steps of derivations are given in SI Appendix, SI Materials and Methods. Here, we present only the equations, in final form, for the mole fractions of the native state as a function of time, [N](t), for a two-state folding model,

$$[N](t) = C_0\left[1 - \exp(-\lambda_0 t)\right], \tag{1}$$

and for the mole fractions of native and intermediate states as functions of time, [N](t) and [I](t), respectively, for a three-state folding model

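$$[N](t) = C_1\left\{1 - m_1 \exp(-\lambda_1 t) - (1 - m_1)\exp(-\lambda_2 t)\right\} \tag{2}$$

$$[I](t) = C_2\left\{1 - m_2 \exp(-\lambda_1 t) - (1 - m_2)\exp(-\lambda_2 t)\right\}. \tag{3}$$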

To obtain the rate constants, the fractions of the U, I, and N states averaged over 512 MD trajectories at each time interval were fitted by Eq. 1 for a two-state model and by Eqs. 2 and 3 for a three-state model. In other words, the mole fractions of the native state (two-state model) and of the native and intermediate states (three-state model) calculated from simulation data were fitted by Eq. 1 (with C0 and λ0 as determinable parameters) and Eqs. 2 and 3 (with C1, C2, m1, m2, λ1, and λ2 as determinable parameters) (details of all parameters, the simulation, and data analysis are in SI Appendix).
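The fitting step can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the fitting procedure used in this work (which is described in SI Appendix): it implements the two-state and three-state native-fraction expressions, with decaying exponentials as implied by the positive rate constants in Table 1, and fits the two-state form by a grid search over λ0 with a closed-form least-squares amplitude. The function names, the synthetic data, and the grid are all hypothetical.

```python
import math


def two_state_native(t, c0, lam0):
    """Eq. 1: [N](t) = C0 * (1 - exp(-lam0 * t))."""
    return c0 * (1.0 - math.exp(-lam0 * t))


def three_state_native(t, c1, m1, lam1, lam2):
    """Eq. 2: [N](t) = C1 * {1 - m1*exp(-lam1*t) - (1 - m1)*exp(-lam2*t)}."""
    return c1 * (1.0 - m1 * math.exp(-lam1 * t) - (1.0 - m1) * math.exp(-lam2 * t))


def fit_two_state(times, fractions, lam_grid):
    """Grid search over lam0; for each lam0 the best C0 is linear least squares.

    Returns (c0, lam0, sum_of_squared_residuals) for the best grid point.
    """
    best = None
    for lam in lam_grid:
        basis = [1.0 - math.exp(-lam * t) for t in times]
        denom = sum(b * b for b in basis)
        if denom == 0.0:
            continue
        c0 = sum(y * b for y, b in zip(fractions, basis)) / denom
        sse = sum((y - c0 * b) ** 2 for y, b in zip(fractions, basis))
        if best is None or sse < best[2]:
            best = (c0, lam, sse)
    return best


# Synthetic, noise-free data generated with the wild-type two-state
# parameters from Table 1 (C0 = 0.67, lam0 = 0.43e-3 ns^-1).
times = [100.0 * i for i in range(81)]             # 0 to 8000 ns
data = [two_state_native(t, 0.67, 0.43e-3) for t in times]
lam_grid = [1e-5 * i for i in range(1, 201)]       # 1e-5 to 2e-3 ns^-1

c0_fit, lam0_fit, sse = fit_two_state(times, data, lam_grid)
print(f"C0 = {c0_fit:.3f}, lam0 = {lam0_fit:.5f} ns^-1")
```

A grid search is used only to keep the sketch dependency-free; a nonlinear least-squares routine would be the usual choice for the three-state form, where four parameters enter Eq. 2.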

Details of the preparation of FBP28 WW mutants and of the calculated structures are given in SI Appendix.

Acknowledgments

This research was conducted by using the resources of (i) our 588-processor Beowulf cluster at the Baker Laboratory of Chemistry and Chemical Biology, Cornell University; (ii) the Informatics Center of the Metropolitan Academic Network in Gdansk; (iii) our 96-processor cluster at the Biomolecular Physics and Modeling Group, Department of Physics, Huazhong University of Science and Technology; and (iv) the National Science Foundation Terascale Computing System at the Pittsburgh Supercomputer Center. This work was supported by grants from the National Institutes of Health (GM-14312) and the National Science Foundation (MCB10-19767), by Grant 530-8370-D498-14 from the Polish Ministry of Science and Education, by Grant 31300598 from the National Science Foundation of China, and by the Spanish National Research Program [Ministry of Economy and Competitiveness, SAF2011-25119 (to M.J.M.)]. D.S. has a “la Caixa”/Institute for Research in Biomedicine Barcelona International PhD Programme fellowship, and T.T. is a recipient of a European Union Co-funding of Regional, National, and International Programmes - Marie Curie Actions grant. M.J.M. is a Catalan Institution for Research and Advanced Studies Programme Investigator.

Footnotes

The authors declare no conflict of interest.

Data deposition: The atomic coordinates have been deposited in the Protein Data Bank (PDB), www.pdb.org [PDB ID codes 2mw9 (Y11R), 2mwa (Y19L), 2mwb (W30F), 2mwf (ΔNY11R), 2mwd (ΔNΔCY11R), and 2mwe (ΔNΔCY11R/L26A)], and the BioMagResBank (BMRB), www.bmrb.wisc.edu [BMRB ID codes 25309 (Y11R), 25310 (Y19L), 25311 (W30F), 25315 (ΔNY11R), 25313 (ΔNΔCY11R), and 25314 (ΔNΔCY11R/L26A)].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1420914111/-/DCSupplemental.

References

  • 1. Lindorff-Larsen K, Piana S, Dror RO, Shaw DE. How fast-folding proteins fold. Science. 2011;334(6055):517–520. doi: 10.1126/science.1208351.
  • 2. Skolnick J, et al. TOUCHSTONE: A unified approach to protein structure prediction. Proteins. 2003;53(Suppl 6):469–479. doi: 10.1002/prot.10551.
  • 3. Scheraga HA, et al. The protein folding problem: Global optimization of the force fields. Front Biosci. 2004;9:3296–3323. doi: 10.2741/1482.
  • 4. Tozzini V. Coarse-grained models for proteins. Curr Opin Struct Biol. 2005;15(2):144–150. doi: 10.1016/j.sbi.2005.02.005.
  • 5. Thorpe IF, Zhou J, Voth GA. Peptide folding using multiscale coarse-grained models. J Phys Chem B. 2008;112(41):13079–13090. doi: 10.1021/jp8015968.
  • 6. Liwo A, Pincus MR, Wawak RJ, Rackovsky S, Scheraga HA. Prediction of protein conformation on the basis of a search for compact structures: Test on avian pancreatic polypeptide. Protein Sci. 1993;2(10):1715–1731. doi: 10.1002/pro.5560021016.
  • 7. Liwo A, et al. A united-residue force field for off-lattice protein-structure simulations. I. Functional forms and parameters of long-range side-chain interaction potentials from protein crystal data. J Comput Chem. 1997;18:849–873.
  • 8. Liwo A, et al. A united-residue force field for off-lattice protein-structure simulations. II: Parameterization of local interactions and determination of the weights of energy terms by Z-score optimization. J Comput Chem. 1997;18:874–887.
  • 9. Liwo A, Czaplewski C, Pillardy J, Scheraga HA. Cumulant-based expressions for the multibody terms for the correlation between local and electrostatic interactions in the united-residue force field. J Chem Phys. 2001;115:2323–2347.
  • 10. Liwo A, Ołdziej S, Czaplewski C, Kozlowska U, Scheraga HA. Parametrization of backbone-electrostatic and multibody contributions to the UNRES force field for protein-structure prediction from ab initio energy surfaces of model systems. J Phys Chem B. 2004;108:9421–9438.
  • 11. Ołdziej S, Liwo A, Czaplewski C, Pillardy J, Scheraga HA. Optimization of the UNRES force field by hierarchical design of the potential-energy landscape. 2. Off-lattice tests of the method with single proteins. J Phys Chem B. 2004;108:16934–16949.
  • 12. Ołdziej S, et al. Optimization of the UNRES force field by hierarchical design of the potential-energy landscape. 3. Use of many proteins in optimization. J Phys Chem B. 2004;108:16950–16959.
  • 13. Liwo A, et al. Modification and optimization of the united-residue (UNRES) potential energy function for canonical simulations. I. Temperature dependence of the effective energy function and tests of the optimization method with single training proteins. J Phys Chem B. 2007;111(1):260–285. doi: 10.1021/jp065380a.
  • 14. Maisuradze GG, Senet P, Czaplewski C, Liwo A, Scheraga HA. Investigation of protein folding by coarse-grained molecular dynamics with the UNRES force field. J Phys Chem A. 2010;114(13):4471–4485. doi: 10.1021/jp9117776.
  • 15. Nguyen H, Jager M, Moretto A, Gruebele M, Kelly JW. Tuning the free-energy landscape of a WW domain by temperature, mutation, and truncation. Proc Natl Acad Sci USA. 2003;100(7):3948–3953. doi: 10.1073/pnas.0538054100.
  • 16. Macias MJ, Gervais V, Civera C, Oschkinat H. Structural analysis of WW domains and design of a WW prototype. Nat Struct Biol. 2000;7(5):375–379. doi: 10.1038/75144.
  • 17. Sudol M, Hunter T. NeW wrinkles for an old domain. Cell. 2000;103(7):1001–1004. doi: 10.1016/s0092-8674(00)00203-8.
  • 18. Karanicolas J, Brooks CL 3rd. The structural basis for biphasic kinetics in the folding of the WW domain from a formin-binding protein: Lessons for protein design? Proc Natl Acad Sci USA. 2003;100(7):3954–3959. doi: 10.1073/pnas.0731771100.
  • 19. Liwo A, Khalili M, Scheraga HA. Ab initio simulations of protein-folding pathways by molecular dynamics with the united-residue model of polypeptide chains. Proc Natl Acad Sci USA. 2005;102(7):2362–2367. doi: 10.1073/pnas.0408885102.
  • 20. Mu Y, Nordenskiöld L, Tam JP. Folding, misfolding, and amyloid protofibril formation of WW domain FBP28. Biophys J. 2006;90(11):3983–3992. doi: 10.1529/biophysj.105.076406.
  • 21. Maisuradze GG, Liwo A, Scheraga HA. Principal component analysis for protein folding dynamics. J Mol Biol. 2009;385(1):312–329. doi: 10.1016/j.jmb.2008.10.018.
  • 22. Maisuradze GG, Liwo A, Scheraga HA. Relation between free energy landscapes of proteins and dynamics. J Chem Theory Comput. 2010;6(2):583–595. doi: 10.1021/ct9005745.
  • 23. Shaw DE, et al. Atomic-level characterization of the structural dynamics of proteins. Science. 2010;330(6002):341–346. doi: 10.1126/science.1187409.
  • 24. Piana S, et al. Computational design and experimental testing of the fastest-folding β-sheet protein. J Mol Biol. 2011;405(1):43–48. doi: 10.1016/j.jmb.2010.10.023.
  • 25. Xu J, Huang L, Shakhnovich EI. The ensemble folding kinetics of the FBP28 WW domain revealed by an all-atom Monte Carlo simulation in a knowledge-based potential. Proteins. 2011;79(6):1704–1714. doi: 10.1002/prot.22993.
  • 26. A Beccara S, Škrbić T, Covino R, Faccioli P. Dominant folding pathways of a WW domain. Proc Natl Acad Sci USA. 2012;109(7):2330–2335. doi: 10.1073/pnas.1111796109.
  • 27. Maisuradze GG, Zhou R, Liwo A, Xiao Y, Scheraga HA. Effects of mutation, truncation, and temperature on the folding kinetics of a WW domain. J Mol Biol. 2012;420(4–5):350–365. doi: 10.1016/j.jmb.2012.04.027.
  • 28. Maisuradze GG, Liwo A, Senet P, Scheraga HA. Local vs global motions in protein folding. J Chem Theory Comput. 2013;9(7):2907–2921. doi: 10.1021/ct4001558.
  • 29. Jäger M, Nguyen H, Crane JC, Kelly JW, Gruebele M. The folding mechanism of a beta-sheet: The WW domain. J Mol Biol. 2001;311(2):373–393. doi: 10.1006/jmbi.2001.4873.
  • 30. Ferguson N, Johnson CM, Macias M, Oschkinat H, Fersht A. Ultrafast folding of WW domains without structured aromatic clusters in the denatured state. Proc Natl Acad Sci USA. 2001;98(23):13002–13007. doi: 10.1073/pnas.221467198.
  • 31. Ferguson N, et al. Rapid amyloid fiber formation from the fast-folding WW domain FBP28. Proc Natl Acad Sci USA. 2003;100(17):9814–9819. doi: 10.1073/pnas.1333907100.
  • 32. Petrovich M, Jonsson AL, Ferguson N, Daggett V, Fersht AR. Phi-analysis at the experimental limits: Mechanism of beta-hairpin formation. J Mol Biol. 2006;360(4):865–881. doi: 10.1016/j.jmb.2006.05.050.
  • 33. Jäger M, et al. Structure-function-folding relationship in a WW domain. Proc Natl Acad Sci USA. 2006;103(28):10648–10653. doi: 10.1073/pnas.0600511103.
  • 34. Sharpe T, Jonsson AL, Rutherford TJ, Daggett V, Fersht AR. The role of the turn in beta-hairpin formation during WW domain folding. Protein Sci. 2007;16(10):2233–2239. doi: 10.1110/ps.073004907.
  • 35. Serpell LC. Alzheimer’s amyloid fibrils: Structure and assembly. Biochim Biophys Acta. 2000;1502(1):16–30. doi: 10.1016/s0925-4439(00)00029-6.
  • 36. Davis CM, Dyer RB. WW domain folding complexity revealed by infrared spectroscopy. Biochemistry. 2014;53(34):5476–5484. doi: 10.1021/bi500556h.
  • 37. Cheung MS, Thirumalai D. Effects of crowding and confinement on the structures of the transition state ensemble in proteins. J Phys Chem B. 2007;111(28):8250–8257. doi: 10.1021/jp068201y.
  • 38. Thirumalai D. From minimal models to real proteins: Time scales for protein folding kinetics. J Phys I. 1995;5:1457–1467.
  • 39. Neudecker P, et al. Structure of an intermediate state in protein folding and aggregation. Science. 2012;336(6079):362–366. doi: 10.1126/science.1214203.
  • 40. Veitshans T, Klimov D, Thirumalai D. Protein folding kinetics: Timescales, pathways and energy landscapes in terms of sequence-dependent properties. Fold Des. 1997;2(1):1–22. doi: 10.1016/S1359-0278(97)00002-3.
  • 41. Cieplak M, Hoang TX, Robbins MO. Thermal folding and mechanical unfolding pathways of protein secondary structures. Proteins. 2002;49(1):104–113. doi: 10.1002/prot.10188.
