Proceedings of the National Academy of Sciences of the United States of America. 2014 Oct 27;111(45):15975–15980. doi: 10.1073/pnas.1404213111

Benchmarking all-atom simulations using hydrogen exchange

John J Skinner a,1, Wookyung Yu a,b,1, Elizabeth K Gichana a, Michael C Baxa a,c, James R Hinshaw d, Karl F Freed d,e,f,2, Tobin R Sosnick a,c,f,2
PMCID: PMC4234613  PMID: 25349413

Significance

Molecular dynamics simulations have recently become capable of observing multiple protein unfolding and refolding events in a single-millisecond–long trajectory. This major advance produces atomic-level information with nanosecond resolution, a feat unmatched by experimental methods. Such simulations are being extensively analyzed to assess their description of protein folding, yet the results remain difficult to validate experimentally. We apply a combination of hydrogen exchange, NMR, and other techniques to test the simulations with a resolution of single H-bonds. Several significant discrepancies between the simulations and experimental data were uncovered for regions of the energy surface outside of the native basin. This comparison yields suggestions for improving the force fields and provides a general method for experimentally validating folding simulations.

Keywords: molecular dynamics, unfolded state, denatured states, HX, protein folding

Abstract

Long-time molecular dynamics (MD) simulations are now able to fold small proteins reversibly to their native structures [Lindorff-Larsen K, Piana S, Dror RO, Shaw DE (2011) Science 334(6055):517–520]. These results indicate that modern force fields can reproduce the energy surface near the native structure. To test how well the force fields recapitulate the other regions of the energy surface, MD trajectories for a variant of protein G are compared with data from site-resolved hydrogen exchange (HX) and other biophysical measurements. Because HX monitors the breaking of individual H-bonds, this experimental technique identifies the stability and H-bond content of excited states, thus enabling quantitative comparison with the simulations. Contrary to experimental findings of a cooperative, all-or-none unfolding process, the simulated denatured state ensemble, on average, is highly collapsed with some transient or persistent native 2° structure. The MD trajectories of this protein G variant and other small proteins exhibit excessive intramolecular H-bonding even for the most expanded conformations, suggesting that the force fields require improvements in describing H-bonding and backbone hydration. Moreover, these comparisons provide a general protocol for validating the ability of simulations to accurately capture rare structural fluctuations.


Molecular dynamics (MD) simulations can now probe protein dynamics on millisecond timescales and thereby enable investigation of a variety of biological problems, including binding, conformational changes, and folding. A landmark example is the all-atom simulations by Shaw and coworkers where multiple folding and unfolding events were observed in long time trajectories (1, 2). In addition to predicting or matching observed folding rates with a single set of parameters, these simulations produced native-like models for 12 small, fast-folding proteins. Equally impressive is their observation of multiple discrete folding and unfolding transitions, which indicates that folding proceeds on an energy landscape with two major states separated by a free energy barrier. This barrier-limited folding behavior replicates that observed for many proteins. Not surprisingly, these remarkable simulations are being extensively analyzed (3–5).

The applicability of MD for many situations is limited by the extent to which the entire landscape is recapitulated. An accurate representation of native-like states does not imply a correct representation of other states (e.g., intermediates and unfolded structures). Proper validation requires a comparison with experiments that probe lowly populated conformations. NMR measurements probe subsecond dynamics with single residue resolution, although with a limitation to states with populations exceeding 0.5% (6). Fluorescence, CD, FRET, and small angle X-ray scattering (SAXS) measurements are well adapted to kinetic studies but provide limited spatial resolution.

Hydrogen exchange (HX) data report on the H-bond patterns and populations of extremely rare states, independent of the timescale on which these states are sampled (7, 8). Furthermore, the cooperativity and spatial extent of fluctuations can be assayed in a site-resolved manner by using NMR methods to measure HX as a function of denaturant concentration (9). Thus, HX is an ideal probe for characterizing excited states, such as the denatured state ensemble (DSE) and kinetically invisible intermediates, even under fully native conditions.

Here we describe a detailed comparison of the folding simulations to experimental data for a fast-folding variant of protein G [NuG2b; 76% identity (1)] with an optimized amino-terminal hairpin (10). The simulated DSE of NuG2b (and other proteins) contains mostly overly compact conformations, as recently noted by Shaw and coworkers (11) and others (3). We also find that the DSEsim exhibits high levels of H-bonding, often at levels comparable to the native state ensemble (NSE) and with some secondary structural elements arranged in a native-like manner.

For NuG2b, experimental measurements imply that folding is highly cooperative and the DSEexp is largely unstructured. Based on HX measurements, the DSE is highly solvated and devoid of stable H-bonding. The linear dependence on denaturant concentration of the stability and activation folding free energy (ΔGf) along with the chevron analysis indicate that DSEexp remains solvated and unstructured, even in the absence of denaturant. In addition, kinetic SAXS measurements reveal that DSEexp is highly expanded. Taken together, these measurements provide a detailed description of DSEexp, which is compared with DSEsim.

Results

Our results are presented in stages. We first analyze the level of structure present in DSEsim, primarily focusing on the extent of H-bonding. Next, we present the experimental methodology and describe the results. The section ends with a comparison between the experiment and simulation.

Properties of the Simulated DSE.

The four NuG2b MD trajectories (1) were analyzed to identify the number of native and total H-bonds, the Cα-rmsd from the native state, the radius of gyration (Rg), and the TM score [a 0-to-1 metric for structural similarity (12)] (Fig. 1A and Fig. S1). These quantities often change in concert during discrete folding transitions. The discrete character of the transitions enabled us to identify NSEsim as the set of conformations for which the rmsd is below 4.0 Å, the number of native H-bonds exceeds 20, and the TM score is greater than 0.6 (Fig. S2A). All other conformations were classified as members of DSEsim. The time-averaged H-bond fraction for each residue was calculated using all 6 × 10⁶ conformations in the trajectories (SI Methods and Fig. S2 B and C). For each conformation, an amide proton was considered protected from HX if it participated in an intramolecular H-bond within 0.5 ns, irrespective of whether the partner was the native carbonyl oxygen.
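The three-part NSEsim criterion above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Frame` record and its field names are hypothetical, and the per-frame rmsd, native H-bond count, and TM score are assumed to have been computed elsewhere. Only the three thresholds come from the text.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """Hypothetical per-conformation record; field names are illustrative."""
    rmsd: float         # C-alpha rmsd from the native state, in angstroms
    native_hbonds: int  # number of native H-bonds present in the frame
    tm_score: float     # TM score relative to the native structure (0 to 1)


def is_native(frame, rmsd_max=4.0, hbonds_min=20, tm_min=0.6):
    """Apply the three thresholds quoted in the text; all must hold."""
    return (
        frame.rmsd < rmsd_max
        and frame.native_hbonds > hbonds_min
        and frame.tm_score > tm_min
    )


def split_ensembles(frames):
    """Return (NSE, DSE): frames passing the criterion and all others."""
    nse = [f for f in frames if is_native(f)]
    dse = [f for f in frames if not is_native(f)]
    return nse, dse
```

Because the criterion is a strict conjunction, a frame with exactly 20 native H-bonds is assigned to DSEsim even when its rmsd and TM score are native-like.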

Fig. 1.


Analysis of H-bonds and 2° structure in MD simulations. (A) H-bonding (green, native; red, total), Rg, backbone rmsd, and TM score for one MD trajectory (three others are shown in Fig. S1). The right column presents the distribution for the DSE. RgSAXS is indicated for the chemically denatured state (blue line). (B) H-bond frequencies, color-coded according to whether they are the native partners, or within ±1, ±2, or >±2 of the native partners. (C) Secondary structure frequencies (red, α, 3–10 and π helices; blue, extended; green, other; cyan, turn). (D) H-bond and contact maps.

The total number of native plus nonnative H-bonds remains relatively constant across the entire trajectory, even in many of the expanded conformations of DSEsim (Fig. 1A). On average, DSEsim structures are highly collapsed with RgDSE/RgNSE = 12.0 Å/10.5 Å = 1.15. A comparison of the solvent-accessible surface areas (ASAs) produces a similar result, <ASADSE>/<ASANSE> = 1.1, whereas this ratio is 1.5 for expanded conformations in DSEsim with Rg >20 Å and 1.9 for a self-avoiding statistical coil model (<Rg> = 24 Å) (13). The ASA of the polar atoms is similar for DSEsim and NSEsim, but the hydrophobic exposure is more variable and typically is greater in DSEsim (Fig. S2D).

Both the residue–residue contact map and the pattern of H-bond partners in DSEsim resemble those of NSEsim (Fig. 1D). The major fraction (∼80%) of the conformations in DSEsim contains a folded N-terminal β hairpin, whereas about half the ensemble retains at least a portion of the helix (Fig. 1). The C-terminal hairpin frequently remains fully extended and forms nonnative H-bonds with the β1 strand and with residues that are helical in the native state (Fig. 1D).

Even when the chain contains nonnative contacts in DSEsim, H-bonds still form, often within one or two residues of the native partner for the helix and the N-terminal hairpin (Fig. 1B), reflecting helical over/underwinding and β slippage, respectively. The N- and C-terminal strands often are in contact in DSEsim, although typically with the nonnative antiparallel orientation. The largest cluster (5) in DSEsim (36%) has the N-terminal hairpin and most of the helix positioned in a native-like arrangement, with an average TM score of 0.46. The majority of highly extended structures in the DSEsim also contain one or more elements of native 2° structure. In addition, some conformations of DSEsim adopt native-like topologies with a high TM score, 0.6. The average TM score in DSEsim is 0.34 and exceeds that for a random pair of structures (0.17) (12). Overall, DSEsim is often highly collapsed and extensively H-bonded with some structural elements arranged in a native-like manner.

Experimental Characterization of the DSE.

HX, kinetic, CD, and SAXS measurements were conducted to characterize NuG2b’s energy surface to compare with the simulations. The following two sections provide conceptual background for the interpretation of the HX and kinetic data.

Background: Hydrogen exchange.

The analysis of HX data assumes that backbone amide protons (NH) can exist in states that are either HX-competent or -incompetent, termed open and closed, respectively (7, 8). Typically, closed states are involved in intramolecular H-bonds. The measured HX rate (kex) for each NH is a function of the rates of opening (kop), closing (kcl), and chemical exchange from the open state (kchem) according to the reaction $\mathrm{Closed} \underset{k_{cl}}{\overset{k_{op}}{\rightleftharpoons}} \mathrm{Open} \xrightarrow{k_{chem}} \mathrm{Exchanged}$. Intrinsic exchange rates, kchem, are well characterized for amide protons in unstructured regions (14, 15). Under most experimental conditions (including ours, where kfold = kcl > 10⋅kchem), kcl >> kchem and the system is in the so-called EX2 limit where the observed rate is given by

$$k_{ex} = \frac{k_{chem}\,k_{op}}{k_{op} + k_{cl}} = \frac{k_{chem}\,K_{op}}{K_{op} + 1}, \qquad [1]$$

where Kop = kop/kcl is the equilibrium constant for the open–closed reaction, and the free energy is given by ΔGHX = −RT ln Kop. Accordingly, a direct comparison of ΔGisim and ΔGiHX is possible for each residue i, even for reactions that are orders of magnitude faster than the HX measurement timescale.
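As a worked illustration of Eq. 1, the sketch below converts an observed exchange rate into an opening free energy. It assumes the EX2 limit and takes the stability as the positive free energy of opening, −RT ln Kop, with Kop recovered by inverting Eq. 1. The function names and the rate constants used in the usage note are hypothetical and are not values from this study.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def ex2_rate(k_chem, k_op, k_cl):
    """Observed exchange rate in the EX2 limit (Eq. 1)."""
    return k_chem * k_op / (k_op + k_cl)


def dg_hx(k_ex, k_chem, temp_k):
    """Opening free energy (kcal/mol) from the observed and intrinsic rates.

    Eq. 1 gives k_ex / k_chem = K_op / (K_op + 1), so
    K_op = ratio / (1 - ratio), and the stability is -RT ln K_op.
    """
    ratio = k_ex / k_chem
    if not 0.0 < ratio < 1.0:
        raise ValueError("k_ex must be positive and smaller than k_chem")
    k_open = ratio / (1.0 - ratio)
    return -R_KCAL * temp_k * math.log(k_open)
```

For example, `dg_hx(ex2_rate(10.0, 1.0, 1e5), 10.0, 313.0)` corresponds to Kop = 10⁻⁵ and returns about 7.2 kcal⋅mol−1, which is the order of magnitude expected for a globally exchanging site.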

The combined use of chemical denaturants with HX enables the quantification of the amount of exposed protein surface in the various conformational states. Denaturants such as guanidinium chloride (GdmCl) preferentially stabilize states with more exposed surface area (16), primarily peptide group ASA (17). We note that denaturants do not create new states, but rather promote preexisting ones with larger ASA (7–9). The sensitivity of the population shift can be translated into a plot of ΔG versus [GdmCl], which frequently is linear with a slope termed the m-value (units of kilocalories per mole per molar), that is, ΔG([den]) = ΔG(0) − m·[den]. When monitoring equilibrium folding as a function of denaturant (e.g., by CD or fluorescence), the m-value reflects the difference in ASA exposure between the NSE and the DSE, termed “mglobal.”
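The linear relation ΔG([den]) = ΔG(0) − m·[den] can be fit by ordinary least squares. The sketch below is one minimal way to do so, not the fitting procedure used in this work. The function name is hypothetical, and in the usage note the input points are synthetic values generated from a straight line rather than measured data.

```python
def fit_linear_extrapolation(den, dg):
    """Least-squares fit of dG([den]) = dG(0) - m * [den].

    den: denaturant concentrations (M); dg: free energies (kcal/mol).
    Returns (dG(0), m), with m reported as a positive m-value.
    """
    n = len(den)
    if n < 2 or n != len(dg):
        raise ValueError("need at least two matched (den, dg) points")
    mean_x = sum(den) / n
    mean_y = sum(dg) / n
    sxx = sum((x - mean_x) ** 2 for x in den)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(den, dg))
    slope = sxy / sxx
    dg_zero = mean_y - slope * mean_x
    return dg_zero, -slope  # the m-value is minus the fitted slope
```

Feeding the function points built as `8.1 - 1.31 * c` for `c` in `[0.0, 0.5, 1.0, 1.4]` recovers ΔG(0) = 8.1 and m = 1.31, which simply confirms that the sign convention matches the equation in the text.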

In combination with NMR methods, HX also enables determination of the residue-specific mHX-values that reflect the extent of structure remaining in states where a specific amide proton is exchange competent (Fig. S3A). Accordingly, the mHX-value for each residue depends upon whether its exchange occurs in the NSE, the DSE, or an intermediate state (7–9). A residue produces mHX = 0 if the amide proton exchanges from the NSE through a local perturbation that exposes little or no additional ASA and has lower free energy than the globally unfolded state. Amide protons that require global unfolding should exchange with the global stability (ΔGiHX = ΔGglobal) and with an m-value matching the global value (mHX = mglobal). Intermediate exchange behavior representing subglobal structural openings also is possible and is observed in larger proteins (7–9). However, only global and local opening events appear for NuG2b.

Background: Chevron analysis to test for residual DSE structure.

The denaturant dependence of the folding and unfolding activation free energies, ΔGf and ΔGu, is presented using the chevron analysis, a widely used method to demonstrate that a DSE is devoid of significant residual structure (18–20). This test involves a comparison of the free energy difference between the DSE and NSE, measured independently in equilibrium melting and in kinetic folding experiments (ΔGeq = ΔGu − ΔGf). A similar comparison is made for surface burial parameters (mglobal = mu − mf), where mf and mu are the denaturant dependence of the ΔGf and ΔGu and provide measures of the ASA that is buried going to the transition state ensemble (TSE) from the DSE or the NSE, respectively. Agreement between the equilibrium and kinetically determined parameters for ΔGeq and mglobal, irrespective of the probe used to measure folding, implies that no conformations with significant stabilization or surface burial are populated on the DSE side of the TSE, and that the DSE under refolding conditions has the same level of exposed denaturant sensitive surface area as the DSE studied by equilibrium denaturation at elevated denaturant concentrations.

Experimental folding behavior of NuG2b.

The extent of H-bonding for the experimental ensemble was determined by collecting HX data as a function of denaturant (Fig. 2, Upper). HX rates were obtained by measuring the increase in amide peak volumes in 1H-15N heteronuclear single quantum coherence (HSQC) spectra over time for predeuterated NuG2b exchanging in 90% H2O at 313 K (Fig. 2 and Fig. S3B). Based upon stability and denaturant dependence, 12 sites throughout the protein seem to exchange through a global unfolding event. They have ΔGHX = 8.1 ± 0.3 kcal⋅mol−1 and mHX = 1.31 ± 0.08 kcal⋅mol−1⋅M−1 (errors reflect SD for the 12 sites).

Fig. 2.


Experimental studies. (Upper) Dependence on GdmCl concentration for D-to-H exchange of the 12 most stable, globally exchanging amide sites in 90% H2O at 313 K (expanded view in Fig. S3B). Stabilities are compared with values obtained from equilibrium denaturation (black) and kinetic fluorescence measurements in 100% H2O (red, width of the triangle reflects the statistical error on extrapolation to zero denaturant). The pHread values for each denaturant concentration were as follows: 0 mM, pH 7.51; 250 mM, pH 6.98; 500 mM, pH 6.87; 750 mM, pH 6.83; 1 M, pH 7.12; 1.25 M, pH 6.66; 1.40 M, pH 6.52. (Inset) Locations of residues undergoing global exchange are mapped onto the NuG2b structure. (Middle) Equilibrium denaturation monitored by CD at 313 K. (Lower) Folding rates at 293, 313, and 333 K.

To test whether exchange of the most stable amide protons occurs from an unstructured DSE, the ΔGHX and mHX-values were compared with those obtained from equilibrium and kinetic folding studies at 313 K in 100% H2O (Fig. 2). The CD-monitored equilibrium unfolding data fit well to a two-state model, yielding a stability of 7.3 ± 0.2 kcal⋅mol−1 and mglobal = 1.31 ± 0.04 kcal⋅mol−1⋅M−1. These values closely match those obtained from kinetic data using chevron analysis under the same conditions (Fig. 2 and Fig. S4; ΔGeq = 7.7 ± 0.4 kcal⋅mol−1, mglobal = 1.39 ± 0.04 kcal⋅mol−1⋅M−1). Hence, NuG2b folding satisfies the chevron criterion indicative of an unstructured DSE.
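The chevron criterion amounts to asking whether the equilibrium and kinetic estimates of ΔG and m agree within error. The sketch below expresses one simple way to quantify that agreement. It assumes independent errors combined in quadrature and a two-sigma cutoff, both of which are illustrative choices and not the criterion applied in this study. The function names are hypothetical.

```python
import math


def z_score(estimate_a, estimate_b):
    """Difference between two (value, error) estimates in combined-error units."""
    value_a, err_a = estimate_a
    value_b, err_b = estimate_b
    return abs(value_a - value_b) / math.hypot(err_a, err_b)


def chevron_consistent(dg_eq, dg_kin, m_eq, m_kin, n_sigma=2.0):
    """True if both the stabilities and the m-values agree within n_sigma."""
    return z_score(dg_eq, dg_kin) <= n_sigma and z_score(m_eq, m_kin) <= n_sigma
```

With the values quoted above, `chevron_consistent((7.3, 0.2), (7.7, 0.4), (1.31, 0.04), (1.39, 0.04))` evaluates to `True`: the stabilities differ by less than one combined error and the m-values by about 1.4 combined errors.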

Overall, the similarity of the ΔG and m-values calculated from CD, kinetic, and HX measurements suggests that they are probing the same all-or-none unfolding transition. Nearly identical m-values were observed despite the different denaturant ranges accessible to each experiment, with the HX measured down to zero denaturant. Furthermore, the folding arm of the chevron plot is linear from 6 M to the lowest measurable denaturant concentration (0.8 M at 293 K, Fig. 2). This linearity indicates that no partially collapsed species accumulate before the major folding phase even for jumps to low denaturant (18, 21).

The overall dimensions of DSEexp were derived from SAXS data collected for NuG2b in 7 M GdmCl at 291 K. DSEexp has an average radius of gyration of 23–25 Å (Fig. S5A), consistent with a self-avoiding statistical coil (13), as found for many chemically denatured small proteins (22–27). The Rg of DSEexp at low denaturant was unobtainable using rapid mixing methods with millisecond resolution owing to the extremely fast folding rate of NuG2b. As a surrogate, we performed mixing measurements on the slower-folding protein G. No measurable decrease in the Rg was found when jumping from 4 to 0.5 M GdmCl before the observed folding phase (kf = 20 s−1; Fig. S5B), indicating that protein G's DSEexp remains expanded in low denaturant.

To further investigate the degree of structure in DSEexp, we performed CD measurements on a peptide corresponding to NuG2b’s redesigned N-terminal hairpin, because this region contains the most residual structure in the MD simulations. The CD spectrum of the 20-mer displays a negative minimum at 200 nm that is very similar to that of the spectra of other unfolded states, such as a thermally denatured protein G mutant, but quite distinct from the native NuG2b spectrum (Fig. S6). Moreover, the peptide’s spectrum is invariant over the measured temperature range (293–333 K). Because residual H-bonded structure is very likely to be thermally labile, the temperature invariance of the spectrum implies that the peptide does not contain residual H-bonded structure.

Limits on structure in the DSEexp.

HX, CD, and kinetic data produce similar global stabilities and denaturant dependence despite being conducted under a wide range of denaturant concentrations. The extrapolated stability from the denaturant melt at 6 M matches the HX value in water and its dependence on GdmCl. These results, together with the SAXS and peptide CD data and the chevron criterion, indicate that, under native conditions, the chains in DSEexp are expanded and expose the same amount of surface area as at high denaturant. In addition, the most protected H-bonds, which appear in all 2° structure elements, undergo exchange with the global m-value. All these results point to a cooperative unfolding process to an unstructured ensemble under all measured conditions.

However, the 12 amide sites exchanging with the global m-value have an average ΔGHX that is 0.4 kcal⋅mol−1 higher than the stability determined using the kinetic data (which requires less extrapolation than the CD data). This mild difference could reflect residual structure, although it is at the level of our statistical error and also could result from minor errors in pH and T, or the inaccuracy of the linear extrapolation. Recently, the same energy difference was observed for the Fyn SH3 domain by Kay and coworkers (28), who likewise interpreted HX as occurring from an unstructured state because the sites exhibited random coil chemical shifts in the DSE.

Alternatively, low levels of residual H-bonded structure could exist under native conditions. Six amide sites yield ΔGHX ≥0.4 kcal⋅mol−1, consistent with these H-bonds being formed with KeqHB∼1 [ΔG = RT ln (1+ KeqHB)]. The sites are located on the N- and C-terminal strands as well as on the helix, so they are unlikely to form in a concerted fashion, and hence other alternatives should be considered such as transient helix formation. Using the recent parameterization of m-values with ASA (17), we estimate that a time average of four helical or six nonlocal H-bonds would reduce the m-value by ∼0.3 kcal⋅mol−1⋅M−1 (and ∼0.04 kcal⋅mol−1⋅M−1 for each additional H-bond). This reduction would noticeably affect the mHX and mf-values of 1.3 and 1.1 kcal⋅mol−1⋅M−1, respectively, and ΔGHX. Hence, we believe this level provides a reasonable upper bound on the amount of residual structure consistent with ΔGHX and mHX matching their equilibrium counterparts, the two-state chevron fit with linear folding arms, and the N-terminal hairpin being unstructured in water. The measured Rg, which is consistent with a random coil (23), provides an additional constraint on the level of residual structure.
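The relation ΔG = RT ln(1 + KeqHB) quoted above is easy to check numerically. The sketch below evaluates it and its inverse. The function names are hypothetical, and the temperature in the usage note is the 313 K of the HX measurements.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def protection_free_energy(k_hb, temp_k):
    """Apparent protection (kcal/mol) from an H-bond with equilibrium constant k_hb."""
    return R_KCAL * temp_k * math.log(1.0 + k_hb)


def hbond_equilibrium_constant(dg, temp_k):
    """Invert the relation: H-bond equilibrium constant implied by a protection dg."""
    return math.exp(dg / (R_KCAL * temp_k)) - 1.0
```

At 313 K, `protection_free_energy(1.0, 313.0)` gives about 0.43 kcal⋅mol−1, so an H-bond formed half of the time (KeqHB = 1) accounts for a protection of roughly 0.4 kcal⋅mol−1.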

Comparison Between Experiment and Simulation.

Our analysis of the NuG2b simulations indicates that the DSEsim is typically compact and highly H-bonded, often with some secondary structural elements arranged in a native-like manner. Experimentally, the HX, CD, and kinetic data produce similar global stabilities despite being conducted under a wide range of denaturant concentrations. These results, together with the SAXS and peptide data and the two-state chevron fit, indicate that under highly native conditions the chains in DSEexp are expanded and expose the same amount of surface area as at high denaturant.

Benchmarking simulations against experimental data is best conducted when both are collected under the same conditions. Our comparisons for NuG2b, however, use simulations conducted at 350 K (near the Tmsim) to maximize the number of folding transitions, whereas the HX measurements were performed under more stabilizing conditions, 313 K, to accurately measure global exchange events. To adjust for the 37-K difference, a Boltzmann reweighting of the microstates is theoretically possible but would introduce otherwise indeterminate errors. As an alternative, the stabilities of the 12 globally exchanging sites are referenced to the global stability (Fig. 3). The stabilities of the 12 H-bonds (ΔGisim) in the MD trajectories exceed the simulated global stability by 0.5–2.4 kcal⋅mol−1, reflecting a significant degree of residual H-bonding. The most stable H-bonds are in the N-terminal hairpin and a portion of the helix. In contrast, the 12 most protected H-bonds observed experimentally include every structural element and have stabilities within 0–0.8 kcal⋅mol−1 of the global stability.

Fig. 3.


Comparison of HX data to simulations. The stabilities of individual H-bonds derived from HX data (Table S1) are referenced to the global stability determined from chevron analysis (black). H-bond stabilities calculated from the MD trajectories (red) are referenced to the global stability (using the relative populations of the NSE and DSE; ΔGglobal = +0.2 kcal⋅mol−1 with Keq = [NSEsim]/[DSEsim]). Dashed lines indicate mean stability differences relative to ΔGglobal.
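Referencing each simulated H-bond stability to the simulated global stability can be sketched as follows. The global term follows the caption, ΔGglobal = RT ln([NSEsim]/[DSEsim]). The per-site term is an assumption made here for illustration: it takes the stability as RT ln of the closed-to-open population ratio from the time-averaged H-bond fraction, which may differ from the procedure in SI Methods. All function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def global_stability(pop_native, pop_denatured, temp_k):
    """Global stability (kcal/mol) from NSE and DSE populations or frame counts."""
    return R_KCAL * temp_k * math.log(pop_native / pop_denatured)


def site_stability(fraction_bonded, temp_k):
    """Per-site stability (kcal/mol), assumed here to be RT ln(closed/open)."""
    if not 0.0 < fraction_bonded < 1.0:
        raise ValueError("fraction_bonded must lie strictly between 0 and 1")
    return R_KCAL * temp_k * math.log(fraction_bonded / (1.0 - fraction_bonded))


def excess_stability(fraction_bonded, pop_native, pop_denatured, temp_k):
    """Site stability minus global stability; zero for a strictly two-state site."""
    return site_stability(fraction_bonded, temp_k) - global_stability(
        pop_native, pop_denatured, temp_k
    )
```

Under this convention, an H-bond that is formed only in the native ensemble has an excess of exactly zero, so any positive excess reports H-bonding that persists in DSEsim. A native-to-denatured population ratio of 4:3 at 350 K gives a global stability of about +0.2 kcal⋅mol−1, the value quoted in the caption; the 4:3 ratio itself is chosen here only to reproduce that number.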

In addition to the differences in magnitude and pattern of H-bond protection, quite dissimilar opening events lead to H-bond breakage in the simulations and experiments. When any one of the 12 globally exchanging H-bonds is broken for 1+ ns in DSEsim, on average about two-thirds of the H-bonds remain intact. Experimentally, however, the 12 most stable amide sites exchange through a large unfolding transition involving the simultaneous breakage of most or all H-bonds and with a denaturant dependence that is consistent with global unfolding.

An analysis of surface exposure produces an even larger discrepancy. For the states where at least one of the 12 globally exchanging H-bonds is broken, ASAHX/ASANSE = 1.14–1.17, whereas ASAstatistical coil/ASANSE = 1.9 (Fig. S2D). Using these values and a recent m-value:ASA parameterization (17), we obtain msim HX = 0.20 and mglobal = 1.35 kcal⋅mol−1⋅M−1, a readily distinguishable difference. The calculated mglobal matches the experimental value, providing support for the parameterization and our use of a statistical coil model for the DSE. Hence, the structural fluctuations that break the most stable H-bonds are very different for the simulations and the experiments.

The less structured DSE observed experimentally might be argued as a consequence of a difference in conditions. However, the experimental characterization was conducted at lower temperatures where residual structure would have been stabilized. Therefore, more H-bonded structure is expected to appear at lower temperatures rather than less. In addition, the HX measurements were performed under fully aqueous conditions, and the global m-value agrees with the value at high denaturant. Finally, the N-terminal hairpin, the predominant residual structure in the simulations, is an unstructured peptide in water at 293–313 K (Fig. S6). Hence, the discrepancies between experiment and simulation are unlikely to be attributable to differences in conditions.

Nevertheless, the kfsim and Tmsim are remarkably similar to their experimental counterparts without any adjustment of the force field to fit the NuG2b data [Tmexp ∼370 K, Tmsim ∼345 K; kfexp (333 K) ∼14,000 ± 1,400 s−1; kfsim (345 K) ∼18,000 s−1]. These similarities now seem rather unexpected given the high levels of structure in DSEsim. Experimentally, the barrier for folding typically represents the formation of most 2° structure elements and the acquisition of the native topology starting from a largely unstructured DSE. In contrast, a high level of structure is already present in DSEsim, and hence much less structure and topology forms en route from DSEsim to TSEsim and then on to the NSE. It is unclear why this difference does not translate into a larger discrepancy in the predicted kf and Tm. These issues may also explain the underestimation of the changes in folding enthalpy and heat capacity observed in MD, although the enthalpy of NuG2b is not underestimated (11).

Discussion

Long-time MD simulations are accurate enough to refold small proteins to within 2 Å rmsd with multiple discrete folding transitions (1). However, the DSEs found in MD simulations often contain overly compact structures (11) with extensive amounts of H-bonding compared with the experimental data. The high level of residual structure indicates that the force field requires further improvement to characterize nonnative regions of the energy landscape and to describe the high degree of folding cooperativity observed experimentally. Nevertheless, the level of H-bonding in DSEsim is physically reasonable from the standpoint that buried H-bond donors and acceptors nearly always form H-bonds (29). However, the predominance of collapsed species in DSEsim suggests that terms in the force field promoting protein–protein and/or water–water H-bonds are too strong relative to those associated with backbone hydration.

In contrast, the experimental HX, denaturation, CD, peptide, two-state chevron fit, and SAXS data are all consistent with an extremely cooperative unfolding process to an unstructured DSE. The DSEexp contains highly solvated and expanded conformations with minimal H-bonded structure for concentrations of GdmCl from 0 to 7 M. The most stable H-bonds exchange through a highly cooperative transition to a globally unfolded state that contains few, if any, H-bonds, even in the absence of denaturant. Furthermore, linear folding arms of the chevron plot also point to a highly solvated DSEexp under native conditions. A more collapsed DSE under native conditions would have produced a measurable downward curvature in the folding arm owing to smaller ΔASA between the DSE and TSE. Our studies do not rule out a subtler change in the DSE, such as a change in local structure [e.g., a shift in the population of conformations having backbone angles in the polyproline II region of the Ramachandran map to angles in the helical region (13, 30)], which explains the frequently observed sloping DSE baseline in CD222nm-monitored denaturation (31, 32).

Similar signatures of an unstructured DSE appear in other proteins. For a ubiquitin variant (33) and a coiled-coil (34), the HX determined stabilities and m-values match those derived from global measurements. Also, Pace and coworkers (35) found that HX and equilibrium-determined stability matched for 19 of 20 proteins, indicating that the two methods typically probe the same transition to a DSE that is devoid of stable H-bonds. The DSE for many [but not all (36)] small proteins remains expanded upon a shift to native conditions according to SAXS (22, 24–26). In addition, most [but not all (37)] small proteins satisfy the two-state chevron criterion, sometimes even down to 0 M denaturant (18, 19). This highly cooperative behavior is not unique to experiments because some coarse-grained models now can produce nearly all-or-none folding behavior with linear (38) or near linear chevrons (39).

Analyses of the trajectories of Shaw and coworkers for λ repressor (1) and ubiquitin (40) produce similar pictures of highly collapsed and H-bonded DSEsims containing considerable native character, although to a lesser degree than observed for NuG2b (Figs. S7 and S8). Other all-atom MD simulations also yield collapsed DSEs (41–43). Experimentally, λ repressor and ubiquitin satisfy the authentic two-state chevron criterion (19, 21, 44, 45), again indicating that their DSEs at low denaturant concentration are highly solvated and that early intermediates are not significantly populated. Furthermore, NMR tracking of the thermal denaturation of λ repressor beautifully demonstrated that this protein fully unfolds in a single transition without populating any other species (46). In addition, time-resolved SAXS studies of ubiquitin indicate that chains in DSEexp remain expanded down to 0.7 M GdmCl at 293 K (25). The FRET-determined Rg of the low denaturant DSE often is less than that observed by SAXS [e.g., Rg ∼22 versus ∼26 Å (47)], possibly owing to transient hydrophobic contacts (48), but it still is considerably larger than found in the all-atom simulations (27). Thus, the choice of experimental technique is unlikely to provide an alternative explanation for the discrepancy, suggesting that the result is general.

HX–MD Comparisons.

HX is an excellent tool for verifying MD simulations because it provides the stability and cooperativity of unfolding events with individual H-bond resolution (7, 8). Past attempts to compare MD to HX have been limited by the short durations of simulations and by the uncertainty in how to relate structural ensembles and HX rates (49). Recent technical advances in MD simulations (50) combined with the modeling of HX protection presented here make it possible to directly compare these two techniques.

The H-bond pattern in the NuG2b trajectories clearly differs from what was observed experimentally. However, future simulations may differ in subtler ways. In such cases, one should use a site-by-site comparison of the ΔG and m-values by reweighting each state in DSEsim according to surface exposure, ΔΔG([den]) = (%ASA)·[den]·mglobal. The reweighted ensemble can be used to calculate the H-bond populations as a function of denaturant from which individual m-values can be determined. This method could thereby be applied to various systems to extend the comparison beyond the DSE to include states that break only a subset of H-bonds, such as subglobal openings.
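The reweighting scheme described above can be sketched for a discrete set of states. This is a minimal illustration under several assumptions: "%ASA" is read as the fraction (0 to 1) of the fully unfolded denaturant-sensitive surface that a state exposes, each state is either H-bonded or not at the site of interest, the site stability is taken as RT ln(closed/open), and the m-value is estimated from two denaturant concentrations. The function names and the two-state test ensemble are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def reweight(weights, asa_fraction, den, m_global, temp_k):
    """Boltzmann-reweight state populations at denaturant concentration den.

    Each state is stabilized by ddG = asa_fraction * den * m_global.
    """
    rt = R_KCAL * temp_k
    new = [w * math.exp(a * den * m_global / rt)
           for w, a in zip(weights, asa_fraction)]
    total = sum(new)
    return [w / total for w in new]


def site_stability(weights, bonded, asa_fraction, den, m_global, temp_k):
    """RT ln(closed/open) for one amide site in the reweighted ensemble."""
    pops = reweight(weights, asa_fraction, den, m_global, temp_k)
    closed = sum(p for p, b in zip(pops, bonded) if b)
    opened = sum(p for p, b in zip(pops, bonded) if not b)
    if closed <= 0.0 or opened <= 0.0:
        raise ValueError("site must sample both bonded and open states")
    return R_KCAL * temp_k * math.log(closed / opened)


def site_m_value(weights, bonded, asa_fraction, m_global, temp_k,
                 den_lo=0.0, den_hi=1.0):
    """Two-point estimate of the site m-value (kcal/mol/M)."""
    lo = site_stability(weights, bonded, asa_fraction, den_lo, m_global, temp_k)
    hi = site_stability(weights, bonded, asa_fraction, den_hi, m_global, temp_k)
    return (lo - hi) / (den_hi - den_lo)
```

For a strictly two-state ensemble, in which the native state exposes no additional surface and the unfolded state exposes all of it, the site m-value returned equals mglobal. Sites that open in compact states with small ASA fractions return correspondingly smaller m-values, which is the signature that distinguishes local from global openings.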

The present analysis assumes that the HX rate is governed by the H-bond fraction and the intrinsic HX rate (kchem). This assumption is valid when considering unfolding of a segment to a well-solvated conformation, but it may not apply for small openings where local geometry may affect kchem. We leave this issue for future studies because it does not affect the present conclusions.
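As a worked illustration of this assumption, the sketch below converts an H-bond fraction into an observed HX rate, a protection factor, and an apparent free energy. It assumes a per-site two-state (H-bonded versus open) picture in the EX2 limit, so that the observed rate is the open fraction times kchem. The function name and the example numbers are hypothetical and are not taken from the paper.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def hx_from_hbond_fraction(f_hbond, k_chem, temp=298.0):
    """Return (k_obs, protection_factor, dG_HX) for one amide site.

    Assumptions: exchange occurs only from the non-H-bonded fraction, at the
    intrinsic rate k_chem (EX2 limit), and the site is two-state.
    """
    if not 0.0 < f_hbond < 1.0:
        raise ValueError("f_hbond must lie strictly between 0 and 1")
    f_open = 1.0 - f_hbond
    k_obs = f_open * k_chem                  # observed exchange rate
    protection_factor = k_chem / k_obs       # equals 1 / f_open
    dg_hx = R_KCAL * temp * math.log(f_hbond / f_open)  # kcal/mol
    return k_obs, protection_factor, dg_hx


# Example with made-up values: an H-bond formed 99% of the time, k_chem = 10 s^-1
k_obs, pf, dg = hx_from_hbond_fraction(0.99, 10.0)
```

With these inputs the protection factor is about 100 and the apparent stability is RT·ln(99), roughly 2.7 kcal/mol at 298 K. Where local geometry alters kchem in small openings, as cautioned above, this simple product would no longer hold.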

Conclusions

The MD simulations by Shaw and coworkers can reversibly fold small proteins and reproduce their native structures, folding rates, and melting temperatures. However, the discrete transitions observed in the MD simulations for NuG2b and other small proteins occur between the native state and a highly collapsed, H-bonded species. In contrast, experimental data indicate that small proteins often fold much more cooperatively with a denatured state that is expanded and largely unstructured even at low denaturant concentrations. The excessive amount of intramolecular H-bonding observed in the simulations suggests that changes in the force field are warranted to produce the correct balance between H-bonding, backbone solvation, and nonbonded interactions.

Methods

Before HX measurements, NuG2b was exchanged into D2O and then lyophilized. Exchange measurements were initiated by the addition of solvent [10% (vol/vol) D2O] to lyophilized samples. HX was measured by standard NMR methods using a 500-MHz magnet with a Bruker AVANCE III console. Refolding measurements used a Biologic SFM-400, -4000 instrument integrated with a PTI light source, and CD measurements were taken on a Jasco 715 spectrometer [100% (vol/vol) H2O]. See SI Methods for additional details.

Supplementary Material

Supplementary File
pnas.201404213SI.pdf (2.2MB, pdf)

Acknowledgments

We thank S. Piana-Agostinetti and D. Shaw for helpful discussions, comments on the paper, and data sharing. This work was supported by National Institute of General Medical Sciences (NIGMS) Research Grant R01 GM055694 and National Science Foundation Grant CHE-13630120 (to K.F.F.). Use of the Advanced Photon Source, an Office of Science User Facility, operated for the Department of Energy (DOE) Office of Science by Argonne National Laboratory, was supported by the DOE under Contract DE-AC02-06CH11357. This project was supported by National Center for Research Resources Grant 2P41RR008630-18 and NIGMS Grant 9 P41 GM103622-18, both from the National Institutes of Health. W.Y. was supported in part by National Creative Research Initiatives (Center for Proteome Biophysics) of the National Research Foundation of Korea Grant 2011-0000041.

Footnotes

The authors declare no conflict of interest.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1404213111/-/DCSupplemental.

References

  • 1. Lindorff-Larsen K, Piana S, Dror RO, Shaw DE. How fast-folding proteins fold. Science. 2011;334(6055):517–520. doi: 10.1126/science.1208351.
  • 2. Shaw DE, et al. Atomic-level characterization of the structural dynamics of proteins. Science. 2010;330(6002):341–346. doi: 10.1126/science.1187409.
  • 3. Best RB, Hummer G, Eaton WA. Native contacts determine protein folding mechanisms in atomistic simulations. Proc Natl Acad Sci USA. 2013;110(44):17874–17879. doi: 10.1073/pnas.1311599110.
  • 4. Beauchamp KA, McGibbon R, Lin YS, Pande VS. Simple few-state models reveal hidden complexity in protein folding. Proc Natl Acad Sci USA. 2012;109(44):17807–17813. doi: 10.1073/pnas.1201810109.
  • 5. Dickson A, Brooks CL 3rd. Native states of fast-folding proteins are kinetic traps. J Am Chem Soc. 2013;135(12):4729–4734. doi: 10.1021/ja311077u.
  • 6. Mulder FA, Mittermaier A, Hon B, Dahlquist FW, Kay LE. Studying excited states of proteins by NMR spectroscopy. Nat Struct Biol. 2001;8(11):932–935. doi: 10.1038/nsb1101-932.
  • 7. Englander SW, Sosnick TR, Englander JJ, Mayne L. Mechanisms and uses of hydrogen exchange. Curr Opin Struct Biol. 1996;6(1):18–23. doi: 10.1016/s0959-440x(96)80090-x.
  • 8. Englander SW, Mayne L, Bai Y, Sosnick TR. Hydrogen exchange: The modern legacy of Linderstrøm-Lang. Protein Sci. 1997;6(5):1101–1109. doi: 10.1002/pro.5560060517.
  • 9. Bai Y, Sosnick TR, Mayne L, Englander SW. Protein folding intermediates: Native-state hydrogen exchange. Science. 1995;269(5221):192–197. doi: 10.1126/science.7618079.
  • 10. Nauli S, Kuhlman B, Baker D. Computer-based redesign of a protein folding pathway. Nat Struct Biol. 2001;8(7):602–605. doi: 10.1038/89638.
  • 11. Piana S, Klepeis JL, Shaw DE. Assessing the accuracy of physical models used in protein-folding simulations: Quantitative evidence from long molecular dynamics simulations. Curr Opin Struct Biol. 2014;24:98–105. doi: 10.1016/j.sbi.2013.12.006.
  • 12. Zhang Y, Skolnick J. Scoring function for automated assessment of protein structure template quality. Proteins. 2004;57(4):702–710. doi: 10.1002/prot.20264.
  • 13. Jha AK, Colubri A, Freed KF, Sosnick TR. Statistical coil model of the unfolded state: Resolving the reconciliation problem. Proc Natl Acad Sci USA. 2005;102(37):13099–13104. doi: 10.1073/pnas.0506078102.
  • 14. Bai Y, Milne JS, Mayne L, Englander SW. Primary structure effects on peptide group hydrogen exchange. Proteins. 1993;17(1):75–86. doi: 10.1002/prot.340170110.
  • 15. Connelly GP, Bai Y, Jeng M-F, Englander SW. Isotope effects in peptide group hydrogen exchange. Proteins. 1993;17(1):87–92. doi: 10.1002/prot.340170111.
  • 16. Auton M, Holthauzen LM, Bolen DW. Anatomy of energetic changes accompanying urea-induced protein denaturation. Proc Natl Acad Sci USA. 2007;104(39):15317–15322. doi: 10.1073/pnas.0706251104.
  • 17. Guinn EJ, Kontur WS, Tsodikov OV, Shkel I, Record MT Jr. Probing the protein-folding mechanism using denaturant and temperature effects on rate constants. Proc Natl Acad Sci USA. 2013;110(42):16784–16789. doi: 10.1073/pnas.1311948110.
  • 18. Jackson SE, Fersht AR. Folding of chymotrypsin inhibitor 2. 1. Evidence for a two-state transition. Biochemistry. 1991;30(43):10428–10435. doi: 10.1021/bi00107a010.
  • 19. Krantz BA, Mayne L, Rumbley J, Englander SW, Sosnick TR. Fast and slow intermediate accumulation and the initial barrier mechanism in protein folding. J Mol Biol. 2002;324(2):359–371. doi: 10.1016/s0022-2836(02)01029-x.
  • 20. Jackson SE. How do small single-domain proteins fold? Fold Des. 1998;3(4):R81–R91. doi: 10.1016/S1359-0278(98)00033-9.
  • 21. Krantz BA, Sosnick TR. Distinguishing between two-state and three-state models for ubiquitin folding. Biochemistry. 2000;39(38):11696–11701. doi: 10.1021/bi000792+.
  • 22. Yoo TY, et al. Small-angle X-ray scattering and single-molecule FRET spectroscopy produce highly divergent views of the low-denaturant unfolded state. J Mol Biol. 2012;418(3-4):226–236. doi: 10.1016/j.jmb.2012.01.016.
  • 23. Kohn JE, et al. Random-coil behavior and the dimensions of chemically unfolded proteins. Proc Natl Acad Sci USA. 2004;101(34):12491–12496. doi: 10.1073/pnas.0403643101.
  • 24. Jacob J, Dothager RS, Thiyagarajan P, Sosnick TR. Fully reduced ribonuclease A does not expand at high denaturant concentration or temperature. J Mol Biol. 2007;367(3):609–615. doi: 10.1016/j.jmb.2007.01.012.
  • 25. Jacob J, Krantz B, Dothager RS, Thiyagarajan P, Sosnick TR. Early collapse is not an obligate step in protein folding. J Mol Biol. 2004;338(2):369–382. doi: 10.1016/j.jmb.2004.02.065.
  • 26. Plaxco KW, Millett IS, Segel DJ, Doniach S, Baker D. Chain collapse can occur concomitantly with the rate-limiting step in protein folding. Nat Struct Biol. 1999;6(6):554–556. doi: 10.1038/9329.
  • 27. Sosnick TR, Barrick D. The folding of single domain proteins--have we reached a consensus? Curr Opin Struct Biol. 2011;21(1):12–24. doi: 10.1016/j.sbi.2010.11.002.
  • 28. Long D, Bouvignies G, Kay LE. Measuring hydrogen exchange rates in invisible protein excited states. Proc Natl Acad Sci USA. 2014;111(24):8820–8825. doi: 10.1073/pnas.1405011111.
  • 29. Fleming PJ, Rose GD. Do all backbone polar groups in proteins form hydrogen bonds? Protein Sci. 2005;14(7):1911–1917. doi: 10.1110/ps.051454805.
  • 30. Jha AK, et al. Helix, sheet, and polyproline II frequencies and strong nearest neighbor effects in a restricted coil library. Biochemistry. 2005;44(28):9691–9702. doi: 10.1021/bi0474822.
  • 31. Sosnick TR, Shtilerman MD, Mayne L, Englander SW. Ultrafast signals in protein folding and the polypeptide contracted state. Proc Natl Acad Sci USA. 1997;94(16):8545–8550. doi: 10.1073/pnas.94.16.8545.
  • 32. Qi PX, Sosnick TR, Englander SW. The burst phase in ribonuclease A folding and solvent dependence of the unfolded state. Nat Struct Biol. 1998;5(10):882–884. doi: 10.1038/2321.
  • 33. Zheng Z, Sosnick TR. Protein vivisection reveals elusive intermediates in folding. J Mol Biol. 2010;397(3):777–788. doi: 10.1016/j.jmb.2010.01.056.
  • 34. Meisner WK, Sosnick TR. Barrier-limited, microsecond folding of a stable protein measured with hydrogen exchange: Implications for downhill folding. Proc Natl Acad Sci USA. 2004;101(44):15639–15644. doi: 10.1073/pnas.0404895101.
  • 35. Huyghues-Despointes BMP, Scholtz JM, Pace CN. Protein conformational stabilities can be determined from hydrogen exchange rates. Nat Struct Biol. 1999;6(10):910–912. doi: 10.1038/13273.
  • 36. Kimura T, et al. Specific collapse followed by slow hydrogen-bond formation of beta-sheet in the folding of single-chain monellin. Proc Natl Acad Sci USA. 2005;102(8):2748–2753. doi: 10.1073/pnas.0407982102.
  • 37. Religa TL, Markson JS, Mayor U, Freund SM, Fersht AR. Solution structure of a protein denatured state and folding intermediate. Nature. 2005;437(7061):1053–1056. doi: 10.1038/nature04054.
  • 38. Liu Z, Reddy G, O’Brien EP, Thirumalai D. Collapse kinetics and chevron plots from simulations of denaturant-dependent folding of globular proteins. Proc Natl Acad Sci USA. 2011;108(19):7787–7792. doi: 10.1073/pnas.1019500108.
  • 39. Chen T, Chan HS. Effects of desolvation barriers and sidechains on local-nonlocal coupling and chevron behaviors in coarse-grained models of protein folding. Phys Chem Chem Phys. 2014;16(14):6460–6479. doi: 10.1039/c3cp54866j.
  • 40. Piana S, Lindorff-Larsen K, Shaw DE. Atomic-level description of ubiquitin folding. Proc Natl Acad Sci USA. 2013;110(15):5915–5920. doi: 10.1073/pnas.1218321110.
  • 41. Voelz VA, Singh VR, Wedemeyer WJ, Lapidus LJ, Pande VS. Unfolded-state dynamics and structure of protein L characterized by simulation and experiment. J Am Chem Soc. 2010;132(13):4702–4709. doi: 10.1021/ja908369h.
  • 42. Liu Y, Strümpfer J, Freddolino PL, Gruebele M, Schulten K. Structural characterization of λ-repressor folding from all-atom molecular dynamics simulations. J Phys Chem Lett. 2012;3(9):1117–1123. doi: 10.1021/jz300017c.
  • 43. Lindorff-Larsen K, Trbovic N, Maragakis P, Piana S, Shaw DE. Structure and dynamics of an unfolded protein examined by molecular dynamics simulation. J Am Chem Soc. 2012;134(8):3787–3791. doi: 10.1021/ja209931w.
  • 44. Burton RE, Huang GS, Daugherty MA, Calderone TL, Oas TG. The energy landscape of a fast-folding protein mapped by Ala-->Gly substitutions. Nat Struct Biol. 1997;4(4):305–310. doi: 10.1038/nsb0497-305.
  • 45. Krantz BA, et al. Understanding protein hydrogen bond formation with kinetic H/D amide isotope effects. Nat Struct Biol. 2002;9(6):458–463. doi: 10.1038/nsb794.
  • 46. Huang GS, Oas TG. Structure and stability of monomeric lambda repressor: NMR evidence for two-state folding. Biochemistry. 1995;34(12):3884–3892. doi: 10.1021/bi00012a003.
  • 47. Merchant KA, Best RB, Louis JM, Gopich IV, Eaton WA. Characterizing the unfolded states of proteins using single-molecule FRET spectroscopy and molecular simulations. Proc Natl Acad Sci USA. 2007;104(5):1528–1533. doi: 10.1073/pnas.0607097104.
  • 48. Meng W, Lyle N, Luan B, Raleigh DP, Pappu RV. Experiments and simulations show how long-range contacts can form in expanded unfolded proteins with negligible secondary structure. Proc Natl Acad Sci USA. 2013;110(6):2123–2128. doi: 10.1073/pnas.1216979110.
  • 49. Skinner JJ, Lim WK, Bédard S, Black BE, Englander SW. Protein hydrogen exchange: Testing current models. Protein Sci. 2012;21(7):987–995. doi: 10.1002/pro.2082.
  • 50. Lane TJ, Shukla D, Beauchamp KA, Pande VS. To milliseconds and beyond: Challenges in the simulation of protein folding. Curr Opin Struct Biol. 2013;23(1):58–65. doi: 10.1016/j.sbi.2012.11.002.


Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
