Prediction and Validation of a Protein’s Free Energy Surface Using Hydrogen Exchange and (Importantly) Its Denaturant Dependence

Xiangda Peng; Michael Baxa; Nabil Faruk; Joseph R Sachleben; Sebastian Pintscher; Isabelle A Gagnon; Scott Houliston; Cheryl H Arrowsmith; Karl F Freed; Gabriel J Rocklin; Tobin R Sosnick

doi:10.1021/acs.jctc.1c00960

J Chem Theory Comput. 2021 Dec 22;18(1):550–561. doi: 10.1021/acs.jctc.1c00960

PMCID: PMC8757463 PMID: 34936354

Abstract

The denaturant dependence of hydrogen–deuterium exchange (HDX) is a powerful measurement to identify the breaking of individual H-bonds and map the free energy surface (FES) of a protein including the very rare states. Molecular dynamics (MD) can identify each partial unfolding event with atomic-level resolution. Hence, their combination provides a great opportunity to test the accuracy of simulations and to verify the interpretation of HDX data. For this comparison, we use Upside, our new and extremely fast MD package that is capable of folding proteins with an accuracy comparable to that of all-atom methods. The FESs of two naturally occurring and two designed proteins are so generated and compared to our NMR/HDX data. We find that Upside’s accuracy is considerably improved upon modifying the energy function using a new machine-learning procedure that trains for proper protein behavior including realistic denatured states in addition to stable native states. The resulting increase in cooperativity is critical for replicating the HDX data and protein stability, indicating that we have properly encoded the underlying physicochemical interactions into an MD package. We did observe some mismatch, however, underscoring the ongoing challenges faced by simulations in calculating accurate FESs. Nevertheless, our ensembles can identify the properties of the fluctuations that lead to HDX, whether they be small-, medium-, or large-scale openings, and can speak to the breadth of the native ensemble that has been a matter of debate.

Introduction

Proteins populate high-energy states as determined by their free energy surfaces. These states are often relevant in folding, catalysis, binding, conformational selection, aggregation, and allostery.¹ An ongoing challenge is to accurately calculate the free energy surface, including the generation of the Boltzmann ensemble of all major species. By providing the free energies for the breaking of individual hydrogen bonds, ΔG_HX, HDX and its denaturant dependence is an excellent method to identify excited states and test the veracity of a simulated free energy surface.

HDX occurs when an amide proton (NH) normally participating in an H-bond becomes exposed to solvent in a transient “open state” (Figure 1). A major advance in the interpretation of HDX data came about with the measurement of the denaturant dependence of exchange, which provides an indicator of the size of the opening event.^2,3 For many proteins, HDX of the most stable H-bonds has a large denaturant dependence, and the exchange process corresponds to the global unfolding of the protein, for example, as measured by temperature or chemical denaturation.⁴ The other, less stable H-bonds have a reduced sensitivity to denaturant, indicating that the structural opening involves only a portion of the protein.

Figure 1. Denaturant dependence of HDX. (A) H-bonds are broken when the protein undergoes global, subglobal, and local openings that can be identified by their sensitivity to denaturant (m-values). (B) The observed exchange rate is the sum of the rates from each class of openings, global (U), subglobal (Sub), or local (Loc), which results in the observed denaturant dependence (black curved line). The free energy diagrams illustrate the shift in populations as the denaturant concentration is increased, which changes the state that dominates the exchange process.

The size of the opening varies, ranging from only one or a few amides in “local” openings to larger “subglobal” openings to complete or “global” unfolding. Rather than the breaking of a specific H-bond, local opening events have been proposed to reflect population shifts within a broad native ensemble.^5,6 For some proteins, subglobal openings likely represent unfolding intermediates where the disordered regions correspond to the unfolding of one or more helices and/or strands (“foldons”) that exchange in a concerted manner.³

The prediction of a free energy surface and its validation using HDX data is challenging. An implication of having one or more NHs exchanging with ΔG_HX matching the global stability ΔG_eq, and its denaturant dependence, is that the amount of residual H-bonded structure must be minimal in the denatured state ensemble (DSE). Achieving such a high degree of folding cooperativity is difficult as it requires that the energy function strike the proper balance between the stabilizing interactions needed to fold the protein and the destabilizing protein–solvent interactions necessary to produce a DSE devoid of residual structure.^7,8

All-atom molecular dynamics (MD) simulations in principle are well suited to generate Boltzmann ensembles for HDX calculations.⁹⁻¹⁴ Previously, however, we examined DESRES folding simulations for NuG2b, a small α/β protein, and found that the DSE possessed a near-native radius of gyration (R_g) and native-like H-bonding levels. These properties are inconsistent with our HDX and small-angle scattering data.¹⁵ Generally, MD force fields have excess residual structure, which inhibits their ability to predict HDX, although improvements have been made.^7,8

Here, we advance our near-atomic level MD Upside algorithm that can fold proteins with accuracy comparable to that of all-atom methods but in CPU-hours^16,17 and test whether it can generate ensembles that can reproduce HDX data for four proteins: two Rosetta-designed proteins,¹⁸ mammalian ubiquitin, and a ubiquitin variant (L50E). To predict the larger openings and their denaturant dependence, we find it necessary to train our force field in a manner that reduces the amount of residual structure in the DSE and increases folding cooperativity. Although the agreement between our simulations and the HDX data is laudable, areas for improvement include increasing cooperativity and reducing the number of spurious small-scale openings. In general, our study provides a firmer foundation for testing simulations against HDX data, which should lead to improvements in our ability to simulate the free energy surface and interpret HDX data.

Results

We first describe the Upside model and conduct dual-target contrastive divergence (ConDiv) training of our energy function using both the native state ensemble (NSE) and an unstructured DSE. This procedure is designed to balance the energy terms to decrease the amount of residual structure while still being able to fold proteins. Next, methods for comparing an MD trajectory to HDX data and its denaturant dependence are presented, followed by comparisons to data.

Upside is a near-atomic, implicit solvent model that conducts Langevin dynamics on just the three backbone N, C_α, and C atoms, with the backbone conformation guided by neighbor- and residue-dependent (ϕ,ψ) torsion maps.^16,17 Upside’s speed in part arises from explicitly including only the three backbone atoms during the dynamics portion, but it uses the inferred positions of amide hydrogens, carbonyl oxygens, C_β atoms, and side chains during the force calculation. The side chains are represented by multiposition, amino acid-, and directional-dependent beads. After every Verlet integration time step, the bead position probabilities of all side chains are determined in a single global side chain packing step that produces the lowest side chain free energy. This step greatly reduces side chain friction, which, along with the lack of explicit solvent, explains much of the 10³–10⁴-fold speedup as compared to standard MD.

Contrastive Divergence (ConDiv) Training

Critical to Upside’s success is the development of a force field having the proper balancing of energy terms. Balancing is achieved by simultaneously training essentially all parameters using our version of the machine learning ConDiv method.¹⁷ Here, one considers two ensembles, the first restrained to be near the native structure and the second that is free to diffuse away during a simulation. For a perfect energy function, the unrestrained ensemble will remain close to the native ensemble. However, differences will arise with an imperfect function, which can be reduced by changing the strength of the energy terms to preferentially stabilize the native conformers as compared to the unconstrained ensemble. For example, if too many H-bonds form in the unconstrained ensemble, the H-bond energy is reduced, and the training simulations are repeated. This iterative procedure of updating the energy terms to preferentially stabilize the NSE continues until no energy parameter can be updated to produce a better NSE (Figure 2). To avoid overtraining, this procedure is run for 456 proteins in batches of 19 to produce our 2018 energy function, “FF1”.
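The update rule described above can be illustrated with a toy example. The sketch below is not Upside's training code; it assumes a hypothetical energy that is linear in its parameters, E = θ·f(x), for which a contrastive-divergence-style step moves each parameter by the difference between the feature averages of the unrestrained and the native-restrained ensembles. The function name `condiv_step`, the learning rate, and the H-bond counts are all invented for illustration.

```python
import numpy as np

def condiv_step(theta, feats_restrained, feats_free, lr=0.1, kT=1.0):
    """One contrastive-divergence-style update for a linear energy E = theta . f(x).

    feats_restrained: (n_frames, n_terms) features from the native-restrained run
    feats_free:       (n_frames, n_terms) features from the unrestrained run
    The linear energy form and the step size are assumptions of this sketch.
    """
    grad = (feats_free.mean(axis=0) - feats_restrained.mean(axis=0)) / kT
    return theta + lr * grad

# Hypothetical single energy term: a favorable (negative) weight per H-bond.
theta = np.array([-1.0])
native_hb = np.array([[20.0], [21.0], [19.0]])  # H-bond counts near the native state
free_hb = np.array([[25.0], [26.0], [24.0]])    # unrestrained run forms too many H-bonds
theta_new = condiv_step(theta, native_hb, free_hb)
print(theta, "->", theta_new)
```

Because the unrestrained ensemble over-forms H-bonds in this made-up example, the step makes the H-bond weight less negative, which mirrors the case in the text where the H-bond energy is reduced before the training simulations are repeated.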

Figure 2. “Dual objective” contrastive divergence training. Left: Before training, simulations that start from the DSE (blue) or NSE (red) are unstable and relax to an ensemble with residual structure or misfolded states, respectively. Right: Force field parameters are iteratively updated to increase the native stability while reducing the amount of residual structure in the DSE. Ideally, new simulations that start in the DSE and NSE retain their respective conformational ensembles (dotted gray energy surface becomes solid black).

An improved FF2 version is obtained in the present study by adding an explicit backbone desolvation term, a better side chain burial calculation to include the chemistry of the side chain’s environment, and modifying the H-bond energy to be secondary structure-dependent (varies by 10%, Supporting Information Improving the Energy Function). These modifications follow FF1’s overall philosophy of using as few physically based energy terms as possible while still maintaining accuracy (Supporting Information Improving the Energy Function).

At temperatures above the melting temperature (T_m), the DSE with FF1 has minimal H-bonded structure, but below the T_m, some residual structure remains in the DSE (Figure S6). To reduce this structure and increase folding cooperativity, we developed a ConDiv procedure where training is conducted on both the DSE and the NSE simultaneously. The goal is to achieve the delicate balance between reducing the amount of residual structure in the DSE while still being able to fold proteins. A DSE training ensemble was generated using expanded conformations from simulations run at ∼T_m (Figures 2 and S3, Supporting Information Parameterization by ConDiv). The target ensemble for the DSE was taken to be a self-avoiding random walk (SARW). This dual-target ConDiv training procedure was used to obtain FF2 (Supporting Information Parameterization by ConDiv).

The performance of FF2 was evaluated in terms of structural accuracy for de novo folding, the ability to remain in the native well, the degree of folding cooperativity, and the overall thermodynamic stability. The ability of FF2 to both fold proteins de novo and maintain the native state as a stable minimum was noticeably improved for a validation set of 16 proteins. For de novo folding with FF1 and FF2, the average TM-scores¹⁹ were 0.37 (⟨C_α-RMSD⟩ = 7.4 Å) and 0.42 (⟨C_α-RMSD⟩ = 6.1 Å), respectively. TM-score is a similarity metric that is independent of protein size and reflects the fraction of the predicted structure that can be aligned to the native or some other reference structure within a certain distance cutoff (e.g., 5 Å).¹⁹ Likewise, for trajectories starting from the native state, the TM-scores of the excursions within the native basin were 0.45 and 0.55 (RMSD values of 5.7 and 4.0 Å), respectively (Figures 3A, S4, and S5). The improvement in the ability to remain near the native state is especially significant as the trajectories that begin in the native state are those that are used to calculate HDX data.

Figure 3. Improved performance with force field FF2. (A) C_α-RMSD distribution of the validation set starting from either a native or an unfolded conformation (average values indicated with dashed lines). (B) H-bond distribution at the T_m for a set of designed mini-proteins and the corresponding melting curves. (C) Predicted and experimental ΔG values at 298 K for the mini-proteins using the original FF1 and the new FF2 energy function. The correlation coefficients and the slope of the best fit are listed with a 90% confidence interval.

To examine the importance of our dual-target ConDiv procedure, we created FF1+, a version that contains the extra energy terms found in FF2 but lacks the dual-target training protocol. A comparison of FF1+ to FF1 indicates that the new energy terms did help increase folding cooperativity (Figures 3, S6, and S7), but they did not significantly improve our estimation of global stability (Figures S8 and S9). Hence, our dual-target ConDiv training procedure is a key factor for the improvement of FF2. Effectively, having a disordered DSE serves as a regularizer that produces the minimum interaction energy that can still generate both a stable NSE and a disordered DSE.

The extent of folding cooperativity in FF2 was assessed by comparing the fraction of H-bonds formed in the conformations that lie outside the NSE to the fraction that are in the NSE at the T_m. After dual-target training, the H-bond distributions show increased bimodal character, and the ratio of the non-NSE H-bond fraction to the NSE H-bond fraction decreased from 38–50% to 20–31% for two Rosetta-designed mini-proteins, mammalian ubiquitin and an L50E variant (Figures 3B, S6, and S7). This reduction in residual H-bond levels should improve our ability to identify large opening events using HDX.

To make a direct comparison to the experimental data, our first step was to calibrate the Upside temperature scale using the measured stability of 13 Rosetta-designed mini-proteins (Table S1) that Upside can reversibly fold to near-native resolution, C_α-RMSD ≈ 1.2–5.0 Å (average 2.3 Å) at 298 K.¹⁸ For each protein, temperature replica exchange molecular dynamics (T-REMD) simulations^20,21 were run to generate melting curves that were fit assuming a two-state U-to-N model (Figure S8). The single Upside simulation temperature that best reproduced the experimental stabilities for the 13 proteins was defined to be 298 K (Supporting Information Temperature Calibration).

For this set, Upside’s estimation of global stability differs from experiment by an average of 1.3 kcal·mol^–1 (⟨ΔT_m⟩ = 7 K). The stability predicted by FF2 showed an improved correlation with the experimental values, having a Pearson correlation coefficient of 0.60 and a slope of 0.82, as compared to values of −0.14 and −0.15 for FF1 (Figures 3C and S9). This improvement is notable as the changes in FF2 were not explicitly intended to improve Upside’s ability to calculate stability.

Free Energy Surface of Two Designed Mini-Proteins

We ran Upside simulations for two Rosetta-designed α/β mini-proteins, termed EHEE_rd2_0005 (40 residues) and HEEH_rd4_0097 (43 residues).¹⁸ EHEE_rd2_0005 has a three-stranded β sheet with a helix inserted between the amino-terminal strand and the carboxy-terminal hairpin. HEEH_rd4_0097 has a central hairpin flanked by two helices. Reversible folding/unfolding behavior is observed for both proteins with the NSE having an RMSD ≈ 1 Å from the designed targets (Figure 4) and 1.1–1.7 Å to the 20 NMR structures (PDB: 5UYO)¹⁸ for HEEH_rd4_0097.

Figure 4. Reversible folding of EHEE_rd2_0005 (left) and HEEH_rd4_0097 (right). Predicted (cyan) and Rosetta-designed targets (red) are shown along with reversible folding trajectories, corresponding histograms, and heat maps of the FES at the T_m (320 K for EHEE_rd2_0005, 317 K for HEEH_rd4_0097). The FES is shown using two different sets of axes: (1) R_g and the number of H-bonds and (2) the first two components in the PCA. Representative pathways and structures are shown, which highlight the diverse folding routes.

To obtain the free energy surface, we performed T-REMD simulations and combined the different replicas using the multistate Bennett acceptance ratio (MBAR) method.²² For both proteins, the free energy surface contains two distinct energy wells for the NSE and DSE, the latter of which is largely devoid of H-bonds (Figures 4 and S13). Principal component analysis (PCA) was conducted using the C_α–C_α contact matrix as the coordinates (C_α–C_α distance < 10 Å). For EHEE_rd2_0005, two native-like intermediates appear that contain either the helix (I2) or the helix plus hairpin (I1). A minor folding pathway exists, which contains a lowly populated intermediate having only the carboxy hairpin (I3) at a population of only 5 × 10^–4 relative to I2 at 287 K (Figures 4 and S14). For HEEH_rd4_0097, there are two minor depressions in the free energy surface representing the loss of one or both helices, while an even lower populated species is observed lacking the internal hairpin (Figure 4). The PCA map suggests three reversible pathways, but none is dominant (Figure S15). These simulated ensembles were used to calculate HDX patterns.

HDX Calculations

HDX occurs when an H-bond is broken in a transient open state, and the NH is exposed to solvent:²³
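The displayed scheme is missing from this copy of the text. A standard Linderstrøm-Lang form, consistent with the rate constants k_open, k_close, and k_chem named in the following paragraphs, would be (the state labels are assumed here for illustration):

```latex
\mathrm{NH_{closed}}
\;\underset{k_{\mathrm{close}}}{\overset{k_{\mathrm{open}}}{\rightleftharpoons}}\;
\mathrm{NH_{open}}
\;\xrightarrow{\;k_{\mathrm{chem}}\;}\;
\mathrm{ND}
```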

The observed HDX rate (k_obs) can be expressed as
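The expression itself is not reproduced in this copy. Assuming the usual steady-state treatment of a closed/open/exchanged scheme, the observed rate is commonly written as follows; under native conditions k_open ≪ k_close, so the k_open term in the denominator is often dropped:

```latex
k_{\mathrm{obs}} = \frac{k_{\mathrm{open}}\,k_{\mathrm{chem}}}
                        {k_{\mathrm{open}} + k_{\mathrm{close}} + k_{\mathrm{chem}}}
\;\approx\; \frac{k_{\mathrm{open}}\,k_{\mathrm{chem}}}{k_{\mathrm{close}} + k_{\mathrm{chem}}}
```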

The relative slowing of k_obs as referenced to the intrinsic chemical exchange rate, k_chem, is termed the protection factor (PF = k_chem/k_obs). Under the typical “EX2” condition where the closing rate, k_close, of the H-bond is much faster than k_chem, the PF is directly related to the equilibrium stability according to
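The relation is not reproduced in this copy. For an NH that is closed most of the time and exchanges in the EX2 limit, the usual form is (assumed here, with K_open = k_open/k_close denoting the opening equilibrium constant, a symbol introduced only for this sketch):

```latex
\Delta G_{\mathrm{HX}} = -RT \ln K_{\mathrm{open}}
                       = -RT \ln\!\left(\frac{k_{\mathrm{open}}}{k_{\mathrm{close}}}\right)
                       = RT \ln(\mathrm{PF})
```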

(See the section Testing for EX2 Behavior.) In our ensembles, exchange competent NHs are identified using Halle et al.’s criteria that the H-bond must be broken and the NH be coordinated to at least two nearby waters.⁹ These criteria are adapted to our implicit solvent simulations using our backbone burial level (BL^H) term and H-bond score. This score is bimodal, being near 0 (broken) or 1 (made) during the simulations (Figure S11A). The burial level of the NH (BL^H) comes from two contributions (Figure S11B), the heavy atoms on the backbone (BL_bb^H) and the side chain bead (BL_sc):
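The burial expression is missing from this copy. Since the next paragraph describes a factor of 5 applied to the side chain contribution, it presumably has the form (a reconstruction, not a quotation of the original equation):

```latex
BL^{H} = BL^{H}_{bb} + 5\,BL_{sc}
```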

The value of 5 is used to increase the side chain score relative to that of the three backbone atoms, amide proton, and oxygen as the side chain bead is a single interaction center and hence under-represents the volume occupied by the side chain (Figure S11C). An NH is considered protected (PS_i = 1) when it is either H-bonded or buried with BL_i^H > 5; otherwise, PS_i = 0. For a simulation, the free energy for each NH is calculated according to
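The free energy expression is missing from this copy. A form consistent with the definitions of PS_i, n, and N given around it would be (a reconstruction under the EX2 assumption, writing PS_{i,n} for the protection state of NH i in frame n):

```latex
\Delta G_{\mathrm{HX},i} = RT \ln\!\left(
  \frac{\sum_{n=1}^{N} PS_{i,n}}{\sum_{n=1}^{N} \left(1 - PS_{i,n}\right)}
\right)
```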

where n is the frame index and N is the number of frames.
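The protection criteria in this section lend themselves to a short numerical sketch. It assumes per-frame arrays of the H-bond score and the two burial contributions are already available; the array names, the H-bond cutoff of 0.5 (chosen because the score is described as bimodal near 0 or 1), and the closed-to-open ratio form of the free energy are assumptions of this sketch rather than Upside's actual interface.

```python
import numpy as np

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)

def protection_states(hbond_score, bl_backbone, bl_sidechain,
                      hb_cutoff=0.5, burial_cutoff=5.0, sc_weight=5.0):
    """Return PS[n, i] = 1 if NH i is H-bonded or buried in frame n, else 0.

    burial_cutoff and sc_weight follow the values quoted in the text;
    hb_cutoff is an assumed threshold for the bimodal H-bond score.
    """
    burial = bl_backbone + sc_weight * bl_sidechain
    protected = (hbond_score > hb_cutoff) | (burial > burial_cutoff)
    return protected.astype(int)

def dg_hx(ps, temperature=298.0):
    """Per-residue opening free energy from closed and open frame counts."""
    n_closed = ps.sum(axis=0)
    n_open = ps.shape[0] - n_closed
    with np.errstate(divide="ignore"):  # never-open sites give +inf
        return R_KCAL * temperature * np.log(n_closed / n_open)

# Hypothetical 4-frame, 2-residue trajectory with no burial contribution.
hb = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0]])
zeros = np.zeros_like(hb)
ps = protection_states(hb, zeros, zeros)
dg = dg_hx(ps)
print(dg)
```

A site that is never exchange competent in the sampled frames returns an infinite free energy, which is a reminder that the largest ΔG_HX values are limited by how well rare openings are sampled.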

Calculating the Denaturant Dependence of HDX

The analysis of the denaturant dependence of ΔG_HX parallels that used in equilibrium measurements where the global stability is linearly dependent on the denaturant concentration, ΔG = ΔG₀ – m₀[den]. For global unfolding, the m₀ value is proportional to the denaturant-sensitive surface area exposed upon unfolding of the protein.²⁴ Similarly, for HDX, the slope of ΔG_HX versus [den] reflects the size of the lowest energy opening where NH is in an exchange competent state. Such openings generally occur through global, subglobal, and local openings^3,25−33 (Figure 1). The HDX for these three classes has a large, medium, or near-zero sensitivity to denaturant, respectively.

A complicating aspect with HDX is that any given NH can exchange through multiple processes with fluxes that depend on the relative energy of each process, which can change with denaturant concentration. For example, for an NH that can be opened in each of the three classes, the net HDX rate is given by
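The rate expression is missing from this copy. A form consistent with Figure 1B, where the observed rate is the sum over the three classes of openings, and with the EX2 limit would be (a reconstruction; each ΔG denotes the opening free energy of that class at the given denaturant concentration):

```latex
k_{\mathrm{obs}} = k_{\mathrm{global}} + k_{\mathrm{subglobal}} + k_{\mathrm{local}}
 = k_{\mathrm{chem}} \left(
     e^{-\Delta G_{\mathrm{global}}/RT}
   + e^{-\Delta G_{\mathrm{subglobal}}/RT}
   + e^{-\Delta G_{\mathrm{local}}/RT}
   \right)
```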

This NH may start exchanging through a local opening if ΔG_local < ΔG_subglobal, ΔG_global in the absence of denaturant. With added denaturant, larger openings are preferentially promoted as they are more sensitive to denaturant. As a result, the NH that started exchanging via a local opening will transition to exchanging through openings having a stronger denaturant dependence with increasing denaturant once these opening events become the lowest free energy states. Likewise, a subglobal exchange process may in turn be overtaken by the global opening once the global opening becomes the open state with the lowest free energy. The transition from one class of opening to another provides a stringent test of a model’s ability to generate an accurate Boltzmann ensemble.
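The switch between exchange routes can be made concrete with a small calculation. The sketch assumes EX2 exchange, three parallel openings whose free energies fall linearly with denaturant, and entirely hypothetical values for ΔG₀ and m; the function name `dg_hx_multi` is likewise invented for illustration.

```python
import numpy as np

RT = 1.987204e-3 * 298.0  # kcal/mol at 298 K

def dg_hx_multi(den, dg0, m):
    """Apparent dG_HX and dominant route for one NH with parallel openings.

    den: denaturant concentrations (M); dg0, m: per-route dG at 0 M and m-value.
    """
    den = np.atleast_1d(den)[:, None]
    dg = np.asarray(dg0) - np.asarray(m) * den   # each route is linear in [den]
    k_eq = np.exp(-dg / RT)                      # opening equilibrium constants
    apparent = -RT * np.log(k_eq.sum(axis=1))    # summed flux through all routes
    return apparent, dg.argmin(axis=1)

# Hypothetical routes: local, subglobal, global (kcal/mol and kcal/mol/M).
dg0 = [4.0, 5.5, 8.0]
m = [0.0, 0.6, 1.2]
den = np.array([0.0, 3.0, 6.0])
apparent, dominant = dg_hx_multi(den, dg0, m)
print(apparent, dominant)
```

With these made-up numbers the dominant route moves from local to subglobal to global as the denaturant concentration rises, and the apparent ΔG_HX is always slightly below the lowest single-route value because the other routes still contribute some flux.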

To include denaturant in the simulations, it is assumed to destabilize individual conformations proportional to the number of the protected backbone NHs (N_closed^HB) according to
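The expression is missing from this copy. Based on the stated proportionality to N_closed^HB and on the scale factor s defined next, it plausibly reads (a reconstruction; the linear dependence on [den] is assumed by analogy with ΔG = ΔG₀ – m₀[den]):

```latex
\Delta\Delta G([\mathrm{den}]) = s \cdot N^{HB}_{closed} \cdot [\mathrm{den}]
```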

where s is a scale factor that is selected to reproduce the slope of the experimental denaturant dependence for each protein (Table S4). The value of ΔΔG([den]) for each conformation is used to reweight its population at each simulated denaturant concentration. The use of NH protection as a proxy for the effects of urea is supported by a strong correlation between the number of H-bonds and exposed surface area in our simulations (R² = 0.97, Figure S12) as well as HDX and transfer studies that indicate that backbone exposure relates to denaturant sensitivity for urea.^34,35
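The reweighting step can be sketched as follows. It assumes each frame is penalized by s × (number of protected NHs) × [den] and then Boltzmann-reweighted, which is one reading of the procedure described here; the function name, the toy ensemble, and the value of s are hypothetical.

```python
import numpy as np

RT = 1.987204e-3 * 298.0  # kcal/mol at 298 K

def reweighted_dg_hx(ps, den, s):
    """Per-residue dG_HX at denaturant concentration `den` (M).

    ps: (n_frames, n_res) protection states (1 = protected, 0 = exchange competent)
    s:  assumed penalty in kcal/mol/M per protected NH
    """
    n_protected = ps.sum(axis=1)
    ddg = s * n_protected * den                  # frame-wise destabilization
    w = np.exp(-(ddg - ddg.min()) / RT)          # Boltzmann reweighting factors
    closed = (w[:, None] * ps).sum(axis=0)
    opened = (w[:, None] * (1 - ps)).sum(axis=0)
    return RT * np.log(closed / opened)

# Toy ensemble: 96 native frames, 3 with a local opening at residue 0, 1 unfolded.
ps = np.array([[1, 1, 1]] * 96 + [[0, 1, 1]] * 3 + [[0, 0, 0]] * 1)
dg_0M = reweighted_dg_hx(ps, den=0.0, s=0.1)
dg_2M = reweighted_dg_hx(ps, den=2.0, s=0.1)
m_values = (dg_0M - dg_2M) / 2.0                 # finite-difference m-value
print(dg_0M, m_values)
```

In this toy case the residues that exchange only through the unfolded frame show the largest m-value, while the residue with an additional local opening has a lower ΔG_HX and a weaker denaturant dependence.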

Comparison to Experiment

HDX data were acquired for EHEE_rd2_0005 at 298 K, pD_read 7.1, and for HEEH_rd4_0097 at 278 K, pD_read 4.6, where the most stable site on each protein has a stability of 8.6 and 4.3 kcal·mol^–1, respectively. To focus on comparing FESs, we chose to compare the simulations at temperatures of 293 and 299 K, where the simulated stabilities best match the experimental values for EHEE_rd2_0005 and HEEH_rd4_0097, respectively (Figure 5; Supporting Information Experimental Data Fitting, with comparisons at the experimental temperatures presented in Figure S16).

Figure 5. HDX and its denaturant dependence for EHEE_rd2_0005 and HEEH_rd4_0097. Only residues with experimentally determined ΔG values are shown.

The pattern of site-resolved ΔG_HX and m-values from simulations is similar to that of their experimental counterparts (Figures 6, 7, and Table S14). As compared to the original FF1, the dual-target training of FF2 improved the prediction of HDX (Figure S16). The RMSE of FF2 to experimental ΔG_HX values is smaller by 2.5 and 0.3 kcal·mol^–1 for EHEE_rd2_0005 and HEEH_rd4_0097, respectively (Figure S17). The RMSE of the m-value is also lowered by 0.05 to 0.1 kcal·mol^–1·M^–1, which indicates that folding is more cooperative with FF2.

Figure 6. Comparison of the experimental and simulated ΔG_HX and m-values for EHEE_rd2_0005. Values from experiment and simulation are compared in the absence and presence of denaturant. The corresponding secondary structure is shown on the upper left (strands in yellow, helices in red). The correlation is shown on the right in each panel. The blue line is the best fit line. The Pearson (P) and Spearman (S) correlation coefficients and the 90% confidence interval (in parentheses) are provided.

Figure 7. Comparison of the experimental and simulated ΔG_HX and m-values for HEEH_rd4_0097.

Both proteins have openings with mild to large denaturant dependence, although for the simulations with FF2, neither protein has any sites that exchange only via global unfolding. The highest ΔG_HX values for EHEE_rd2_0005 and HEEH_rd4_0097 from simulation fall short of the experimental values by 1.2 and 0.5 kcal·mol^–1, respectively.

An examination of the simulated free energy surface provides an explanation for why there are no global exchanging sites (Figure 4). For EHEE_rd2_0005, an intermediate exists on the major folding route that contains just the helix. While the helix potentially could then exchange through global unfolding from this state, the helix has another exchange route with an energy just below the global stability where only the β-hairpin remains intact. In fact, every secondary structure in EHEE_rd2_0005 can exchange prior to global unfolding on some pathway between the native and unfolded state. Likewise, for HEEH_rd4_0097, the helices and the hairpin can unfold independently. As a result, all sites can more readily exchange through a subglobal opening than through global unfolding.

Hence, Upside has not quite achieved the folding cooperativity under native conditions needed to reproduce the all-or-none behavior seen in the many proteins having sites that only exchange through global unfolding. At elevated urea concentrations, however, Upside’s performance is improved as the simulated m-values of most residues approach m_global, and the individual ΔG_HX traces merge into the global unfolding “isotherm” seen experimentally and computationally.

HDX Calculations for Ubiquitin and an L50E Mutant

We next studied ubiquitin (Ub, 76 residues) and a 5–6 kcal·mol^–1 destabilized L50E variant.³² The substituted glutamic acid is located on the β5 strand and points toward the hydrophobic core of the protein. Rather than undergoing an energetically costly pK_a shift to the neutral Glu° form and remaining folded, the β5 strand unfolds to solvate the Glu^– state.³² As a result of this “vivisection” strategy, the ground state of the L50E mutant lacks a folded β5 strand.

Upside is able to fold Ub and the L50E variant to within 4.0 Å under stabilizing conditions (280 K) beginning from an unstructured state. When starting from the native state, the native structure is maintained to within 3.5 Å (Figures S18 and S19). However, we have difficulty observing reversible refolding in a single CPU-week due to the presence of long-lived misfolded species (Figure S20). These kinetically trapped states prevent refolding for trajectories that have passed from the native state to the unfolded state. Without reversibility, the free energy surface cannot be sampled as thoroughly as it was for the mini-proteins.

To circumvent this issue, we propose that one can still obtain a reasonable conformational ensemble to predict HDX by including only the structures present on the free energy surface lying between the NSE and DSE (Figures 8 and S22). In this procedure, which mimics the actual HDX experiment as the protein starts from the native state, trajectories beginning in the native state are allowed to continue until the protein unfolds and then tries to refold. From this point forward, no more conformations from this trajectory are included in the HDX calculation. Because the DSE is under-sampled, this method may overestimate the native stability. However, if the protein adequately samples the region of the FES between the NSE and the DSE (Figure S21), then ΔG_HX, which is referenced to the native state, should be correct for the states that lie between the NSE and DSE.

Figure 8. Proposed strategy for calculating HDX patterns when folding is irreversible. HDX is calculated using only the portion of the trajectories that go from the native to the unfolded state (U), prior to misfolding (gray regions).

We compared the results of this strategy to new HDX data on Ub acquired at 273 K, pD_read 7.6 and published data on L50E (277 K, pD_read 7.5).³² According to data for the sites with the largest ΔG_HX and m-values, the global stabilities are 8.6 and 4.7 kcal·mol^–1 for Ub and the variant, respectively (Figure 9). For purposes of comparing the HDX patterns, we selected simulation temperatures of 300 and 305 K to match the experimental stabilities (simulations at the experimental temperatures are shown in Figure S25).

Figure 9. HDX data for Ub and the destabilized L50E variant.

As with the designed proteins, the simulated Ub and L50E ensembles did not have any sites that exchanged solely by global unfolding in the absence of urea, although L50E came close. The most stable NHs in the simulations were 1.3 and 0.2 kcal·mol^–1 less stable than their experimental counterparts. Consistent with the experiment, the simulations found that the amino-terminal hairpin and helix are more stable than the carboxy-half of the protein, which largely exchanged through smaller-scale openings (lower m-values). As the denaturant increased, the m-values increased as larger openings were preferentially stabilized by denaturant.

The overall HDX pattern and its denaturant dependence were much better predicted for the L50E variant (Figures 10, 11, and Table S14). This difference is due in part to the 4 kcal·mol^–1 decrease in the energy gap between the native and the fully unfolded state for the variant. For larger energy gaps, there is an increased probability that a spurious small-scale opening event occurs in the simulations that can be the dominant exchange route and worsen the agreement with the HDX data. Examples of such fluctuations are β-strand register shifts of 1 or 2 residues (Figure S23).

Figure 10. Comparison of the experimental and simulated ΔG_HX and m-values for Ub.

Figure 11. Comparison of the experimental and simulated ΔG_HX and m-values for L50E.

The free energy surfaces for Ub and the variant were calculated from the same portions of the trajectories used to calculate the HDX pattern. The surfaces contain two defined wells corresponding to the NSE and DSE (Figure S23). The PCA map has two additional wells near PC1=2 representing states with the β2−β3 strand pairing having different registers. There are various states between the NSE and DSE, and the projection of the trajectory onto the PC1 and PC2 axes finds multiple pathways with the amino-terminal hairpin and α-helix being the most stable structures, consistent with the HDX data (Figure S24).

Interpretation of Small m-Values

The m-value is the denaturant dependence of an H-bond’s free energy, reflecting the difference in the amount of exposed denaturant-sensitive area between the ensemble having the H-bond broken and the ensemble having the bond formed. We approximate the m-value as the difference in the total number of H-bonds formed in the closed and open states, that is, m ∝ N_closed^HB – N_open^HB.

Traditionally, a small m-value is interpreted as an opening event involving only one or a few H-bonds in an otherwise very native-like conformation, that is, N_open^HB ≈ N_total^HB – 1. In principle, however, a small m-value could still occur in a broad native well having multiple H-bonds broken at any given time, so long as N_closed^HB – N_open^HB is small for the specific site under consideration.^5,6

We investigated this possibility by examining N_closed^HB for a variety of H-bonds having small m-values. In the Upside trajectories, we observe a narrow native ensemble with nearly all H-bonds being formed in the native well at low denaturant (N_closed^HB ≈ N_total^HB) (Figures 12, S26, and S27). For example, the H-bond involving Arg22 of EHEE_rd2_0005 has an m-value that is 5% of the global m-value, and when the H-bond is formed, all other native H-bonds are formed >96% of the time. Additionally, when the H-bond of Arg22 is broken in the NSE, nearly all other H-bonds remain (i.e., 79% and 16% of the NSE have 0 or 1 additional H-bonds broken, respectively). For this and other sites having small m-values, the local opening events reflect the breaking of a single or a few H-bonds rather than a population redistribution within a broad native ensemble. Our simulations generally find that local openings occur at the termini of helices and strands or at turns. In helices, the donor and acceptor residues separate, whereas the NH becomes exposed in strands via individual crankshaft motions (Figure 12D).
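The two diagnostics used in this analysis, the H-bond count difference that serves as an m-value proxy and the number of additional H-bonds broken when a given site is open, can be computed from a per-frame H-bond matrix. The sketch below assumes such a matrix restricted to native-ensemble frames; the function names and the five-frame example are hypothetical, and both functions assume the site is open in at least one frame.

```python
import numpy as np

def m_value_proxy(hb, site):
    """Mean total H-bond count with the site closed minus with the site open.

    hb: (n_frames, n_hbonds) array with 1 = formed, 0 = broken.
    """
    total = hb.sum(axis=1)
    closed = hb[:, site] == 1
    return total[closed].mean() - total[~closed].mean()

def extra_broken_when_open(hb, site):
    """Fraction of site-open frames with 0, 1, 2, ... other H-bonds broken."""
    open_frames = hb[hb[:, site] == 0]
    others = np.delete(open_frames, site, axis=1)
    n_broken = (others == 0).sum(axis=1)
    counts = np.bincount(n_broken, minlength=others.shape[1] + 1)
    return counts / len(open_frames)

# Hypothetical 5-frame, 3-H-bond native ensemble.
hb = np.array([[1, 1, 1],
               [0, 1, 1],
               [0, 1, 1],
               [0, 0, 1],
               [1, 1, 0]])
proxy = m_value_proxy(hb, site=0)
fractions = extra_broken_when_open(hb, site=0)
print(proxy, fractions)
```

A narrow native ensemble of the kind reported here would show most of the weight in the first one or two entries of `fractions`, whereas a broad native well would spread the weight toward larger numbers of broken H-bonds.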

Figure 12. Local opening events of EHEE_rd2_0005 at T = 293 K. (A) HDX for seven residues that exchange via local or near-local openings. (B) Their m-values are decomposed into values for the closed and open states (m_closed and m_open). The m_closed value of all of the local residues is close to 0 at low [urea] as all of the native H-bonds are formed >96% of the time when the H-bond of interest is formed. When the H-bond is broken, 0 or 1 additional H-bonds are broken 79% and 16% of the time, respectively. (C) Distribution of the number of H-bonds at T = 293 K. (D) Example structures for the local openings.

Testing for EX2 Behavior

An assumption in the calculation of stability from HDX data is that exchange is occurring in the thermodynamic EX2 limit where k_close ≫ k_chem and the rate is proportional to the fraction of time the NH is exchange competent, that is:

k_obs = k_chem × k_open/(k_open + k_close)

In the other EX1 limit where k_close ≪ k_chem, exchange occurs every time the NH becomes exchange competent so that the observed rate matches the opening rate, k_obs = k_open.
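For reference, both limits follow from the standard two-state (Linderstrøm-Lang) exchange scheme. The derivation below is a textbook sketch rather than a result specific to this work. It assumes a single closed/open equilibrium per NH and a steady state for the open form; the symbol K_op is introduced here for the opening equilibrium constant.

```latex
\begin{align}
&\mathrm{NH_{closed}}
  \underset{k_{\mathrm{close}}}{\overset{k_{\mathrm{open}}}{\rightleftharpoons}}
  \mathrm{NH_{open}}
  \xrightarrow{\;k_{\mathrm{chem}}\;} \mathrm{exchanged} \\[4pt]
k_{\mathrm{obs}} &= \frac{k_{\mathrm{open}}\,k_{\mathrm{chem}}}
                        {k_{\mathrm{open}} + k_{\mathrm{close}} + k_{\mathrm{chem}}} \\[4pt]
\text{EX2 } (k_{\mathrm{close}} \gg k_{\mathrm{chem}}):\quad
k_{\mathrm{obs}} &\approx \frac{k_{\mathrm{open}}}{k_{\mathrm{open}} + k_{\mathrm{close}}}\,k_{\mathrm{chem}}
  \approx K_{\mathrm{op}}\,k_{\mathrm{chem}},
  \qquad K_{\mathrm{op}} = \frac{k_{\mathrm{open}}}{k_{\mathrm{close}}}
  \ \ (k_{\mathrm{open}} \ll k_{\mathrm{close}}) \\[4pt]
\Delta G_{\mathrm{HX}} &= -RT \ln K_{\mathrm{op}}
  = -RT \ln\!\left(\frac{k_{\mathrm{obs}}}{k_{\mathrm{chem}}}\right) \\[4pt]
\text{EX1 } (k_{\mathrm{close}} \ll k_{\mathrm{chem}}):\quad
k_{\mathrm{obs}} &\approx k_{\mathrm{open}}
\end{align}
```

Only in the EX2 limit does the measured rate report on an equilibrium constant, which is why the limit must be verified before rates are converted to ΔG_HX.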

For EHEE_rd2_0005, we measured the refolding kinetics as a function of urea, tracking tryptophan fluorescence at 298 K, pH 7.54 to match the HDX condition. The folding rate extrapolated to 0 M urea was ∼1700 s^–1, whereas k_chem for the measured NHs was between 9 and 26 s^–1 for the five most stable sites. At the highest urea concentration (6 M), the corrected k_chem is 4–12 s^–1 for these residues, while k_f = 390 s^–1 (in 4 M urea, 1.25 M guanidine hydrochloride). Under the experimental conditions for HEEH_rd4_0097 (278 K, pD_read 4.6), k_chem is ∼0.008 s^–1, which makes it highly likely that this small protein exchanges in the EX2 limit.^36 Likewise, for Ub and the L50E variant, folding time constants are in the sub-20 ms range (i.e., >50 s^–1),^32,37 whereas ⟨k_chem⟩ = 2.4 s^–1 at 273 K, pD_read 7.6 for the seven most stable sites. Hence, the data for all four proteins are likely in the EX2 limit.
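To put the rate comparison in energetic terms, the sketch below estimates how far an EX2-based ΔG_HX would deviate when k_chem is not entirely negligible relative to k_close. It assumes the steady-state two-state exchange scheme with k_open ≪ k_close and, as in the comparison above, takes the measured folding rate as a stand-in for k_close; the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def ex2_error_kcal(k_close, k_chem, temperature):
    """Deviation of the EX2-derived ΔG_HX from the true value, in kcal/mol.

    With k_open << k_close, the steady-state rate is
        k_obs = k_open * k_chem / (k_close + k_chem),
    whereas the EX2 analysis assumes k_obs = k_open * k_chem / k_close.
    Interpreting the measured rate with the EX2 formula therefore shifts
    ΔG_HX by RT * ln(1 + k_chem / k_close).
    """
    return R_KCAL * temperature * math.log1p(k_chem / k_close)

# Rates quoted in the text for EHEE_rd2_0005 at 298 K (fastest k_chem in each case)
for label, k_close, k_chem in [("0 M urea", 1700.0, 26.0),
                               ("high denaturant", 390.0, 12.0)]:
    err = ex2_error_kcal(k_close, k_chem, 298.0)
    print(f"{label}: k_close/k_chem = {k_close / k_chem:.0f}, shift = {err:.3f} kcal/mol")
```

Under these assumptions the shift is on the order of 0.01–0.02 kcal/mol, far smaller than the 0.2–1.5 kcal/mol differences discussed for the simulations.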

Discussion

The goals of our study are to advance the Upside model’s ability to calculate the free energy surface and to develop protocols for validating this and similar calculations using HDX and its denaturant dependence. A comparison involving the denaturant dependence provides a sensitive test of the veracity of the simulated ensemble as both the free energy and the structural content of the partially folded states must be correctly predicted. This task is particularly challenging for the larger openings as it requires that the simulations have low levels of residual H-bonded structure in the disordered regions. Achieving such cooperativity requires stabilizing interactions that are strong enough to fold the protein without overstabilizing residual structure.

We achieved moderate success with two small, designed proteins as well as Ub and an L50E variant, observing openings having probabilities as low as 1 part in a million for the two most stable proteins, EHEE_rd2_0005 and Ub. In the absence of denaturant, however, these rare events do not quite represent global unfolding. Because of an overprediction of the partially folded states, the ΔG_HX values for the most stable NHs in the simulations are below the experimentally determined values for the sites that likely exchange by global unfolding by 0.2–1.5 kcal mol^–1. With urea, these states are destabilized, and the HDX shifts from being dominated by local and subglobal openings to occurring via larger-scale openings, including global exchange, in agreement with experiment.
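As a rough guide to the energy scale involved, an opening probability of 1 part in a million can be converted to a free energy with the two-state relation below. The temperature of 298 K is an assumption made for illustration, and P_open denotes the population of exchange-competent states for a given NH.

```latex
\begin{align}
\Delta G_{\mathrm{HX}} &= -RT \ln\!\left(\frac{P_{\mathrm{open}}}{1 - P_{\mathrm{open}}}\right)
  \approx -RT \ln\!\left(10^{-6}\right) \\
&= \left(1.987 \times 10^{-3}\ \mathrm{kcal\,mol^{-1}\,K^{-1}}\right)(298\ \mathrm{K})(13.82)
  \approx 8.2\ \mathrm{kcal\,mol^{-1}}
\end{align}
```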

Upside’s performance is the product of our ConDiv training procedure, which trains for both stable native states and disordered unfolded states. The dual-target training increased folding cooperativity and reduced the amount of residual H-bonded structure, which is essential for accurately predicting HDX patterns. Other improvements included energy terms related to backbone and side chain desolvation.

We are able to reversibly fold EHEE_rd2_0005 and HEEH_rd4_0097 but did not achieve this level of success for Ub and the L50E variant. For these larger proteins, we adopted a strategy of calculating HDX using only the portion of the trajectories that connects the NSE to the DSE, mimicking actual HDX measurements. The general agreement, especially for the L50E variant, argues that this approach does provide a way forward for handling larger proteins, especially for describing the lower energy, smaller-scale opening events.

Other challenges exist in predicting HDX data. Sampling must be sufficient for the ensemble to properly reflect the Boltzmann weighting, especially for the rare opening events. Enhanced sampling methods such as replica exchange, combined with reweighting methods such as MBAR,^22 are extremely useful in this regard.
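Once per-frame statistical weights are available (for example, from a reweighting method), a per-residue ΔG_HX can be estimated from the weighted open and closed populations. The sketch below assumes the weights and a per-frame protection assignment are supplied by the user; it does not call any reweighting library, and the function name and inputs are hypothetical.

```python
import numpy as np

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def delta_g_hx(protected, weights, temperature):
    """Per-site ΔG_HX (kcal/mol) from a weighted ensemble.

    protected   : bool array (n_frames, n_sites); True means the NH is protected
    weights     : array (n_frames,), non-negative frame weights (need not be normalized)
    temperature : temperature in K
    Sites that never open in the sample return +inf, which signals insufficient
    sampling of the opening rather than infinite stability.
    """
    protected = np.asarray(protected, dtype=bool)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    p_closed = w @ protected.astype(float)     # weighted population with NH protected
    p_open = w @ (~protected).astype(float)    # weighted population with NH exposed
    with np.errstate(divide="ignore", invalid="ignore"):
        return R_KCAL * temperature * np.log(p_closed / p_open)

# Toy example: one site, open in 1 of 4 equally weighted frames
print(delta_g_hx([[True], [True], [True], [False]], [1, 1, 1, 1], 298.0))
```

Because rare openings contribute only a handful of frames, the statistical uncertainty of such an estimate grows quickly with ΔG_HX, which is one reason enhanced sampling is needed.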

Another challenge relates to the observation of rare events. Observing a rare opening in an HDX experiment requires that no lower energy opening exists for the same NH, as the lower energy opening would dominate the exchange process. Hence, to match the experimental observation of a high energy species, the model must both accurately simulate the rare species and, at the same time, not have any lower energy states in which those NHs are not involved in H-bonding. Reproducing HDX data therefore requires an accurate calculation of the free energy surface at both high and low energies.
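The masking of a rare opening by a lower energy one can be made concrete with a small calculation. The sketch assumes that an NH can exchange from several distinct open states, each with its own free energy relative to the closed state, and that their opening equilibrium constants add in the EX2 limit; the function name and the example energies are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def apparent_dg_hx(channel_dgs, temperature):
    """Apparent ΔG_HX (kcal/mol) for an NH that can open through several states.

    channel_dgs : iterable of free energies (kcal/mol) of each open state
                  relative to the closed state
    The opening equilibrium constants add, so the apparent value is
    -RT * ln(sum_k exp(-ΔG_k / RT)), dominated by the lowest ΔG_k.
    """
    rt = R_KCAL * temperature
    k_op_total = sum(math.exp(-dg / rt) for dg in channel_dgs)
    return -rt * math.log(k_op_total)

# Global unfolding alone, then with a spurious 5 kcal/mol partial opening added
print(apparent_dg_hx([8.0], 298.0))
print(apparent_dg_hx([8.0, 5.0], 298.0))
```

In this example a single overpredicted opening at 5 kcal/mol lowers the apparent ΔG_HX to just under 5 kcal/mol, hiding the 8 kcal/mol event entirely.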

This issue becomes more problematic with more stable proteins, as overprediction of exchange-competent states becomes more likely as the energy gap between the DSE and NSE increases. This effect can be seen in our poorer prediction of HDX for wild-type ubiquitin as compared to the L50E variant, which is 5 kcal·mol^–1 less stable.

From the experimental perspective, the conversion of experimental HDX rates to ΔG_HX requires knowledge of the intrinsic rate k_chem. The standard intrinsic rates were obtained using peptides in 0.5 M KCl, which is appropriate for unfolded chains.³⁸ While deviations in k_chem have been observed in unusual electrostatic environments^39,40 or with highly charged proteins,⁴⁰ we believe these effects are likely to have minimal impact on the large-scale unfolding events we are investigating as the charge density is lower in these expanded states. Nevertheless, these effects could affect the comparison between local unfolding events. Future HDX studies may be conducted in the presence of divalent cations to reduce these effects.⁴¹

Previous HDX Prediction Studies

The challenge of calculating a conformational ensemble that determines the free energy surface and HDX pattern is quite different from the task of predicting ΔG_HX. A variety of strategies have predicted ΔG_HX using properties of the native state,^23 sometimes augmented by HDX-enhancing motions.^42,43 HDX data have also been used to improve computational ensembles.^44–46 For the purposes of our study, however, we are focused on de novo HDX calculations that are based solely on the NH protection levels in a predicted Boltzmann ensemble.

Generally, at least one amide proton exchanges with the stability and denaturant dependence consistent with total unfolding.^4,47 This observation implies that HDX for such sites occurs from a highly expanded state. Hence, models where exchange is predicted to occur from near-native or collapsed, H-bonded conformations do not accurately describe the Boltzmann ensemble, regardless of how well they may predict ΔG_HX in the absence of denaturant.

All-atom simulations are well suited to predict HDX.^9–14 Although improvements have been made,^7,8 these simulations generally have too much residual structure to be consistent with the strong denaturant dependence seen in HDX data. Other approaches with the potential to predict Boltzmann ensembles and HDX patterns include coarse-grained Go models^48 and COREX.^5,49 The latter method generates energy-weighted ensembles in which each site is treated as either native or unfolded, in windows of 6+ residues.^5,49

In their comparison of COREX to HDX data,^6 Hilser et al. suggested that the NSE is broad with 10–20% of the buried surface exposed. Within the context of this broad ensemble, they proposed that small m-values arise when the average exposure in the subensemble having the specific H-bond closed matches the exposure in the subensemble where the H-bond is open: that is, m = m_closed – m_open ≈ 0.^5 As discussed earlier, such a broad NSE is not observed in the Upside simulations. Rather, we find that small m-values reflect H-bonds predominantly breaking as singletons due to local deformations at the ends of secondary structures and turns (Figure 12). This observation supports the standard view that local unfolding events occur with the breaking of only one or a few H-bonds^3 and the native well is a relatively narrow ensemble (Figures S26 and S27).

Conclusion

We found that the prediction of the free energy surface and validation using HDX is extremely challenging and hence provides a very rigorous test of the accuracy of an energy function and sampling engine. The pair must predict the probability and structural content of rare events while having a DSE with minimal residual structure and avoid overpredicting intermediate-level fluctuations. In this light, we view the outcome of the present study to be commendable, although we appreciate that there is room for improvement.

We improved our performance by training the energy function to simultaneously fold proteins and have an unstructured DSE. This results in more realistic denatured states and increased folding cooperativity, which improves the match to HDX data. The dual-target training procedure also improved our ability to predict protein stability even though this was not an explicit goal, suggesting that the training procedure should be useful in improving other energy functions.

Methods

MD simulation and sampling can be found in the Supporting Information Simulation Details and Sampling. Protein sequences, expression, HDX/NMR, and kinetic studies can be found in the Supporting Information Experimental Methods. The Upside package is available for download at https://github.com/sosnicklab/upside-md.

Acknowledgments

We thank W. Yu for his advice on HDX calculations and the Protein Production Core (Lauren Carter, director) at the University of Washington Institute for Protein Design for providing labeled EHEE_rd2_0005 and HEEH_rd4_0097 for NMR studies. This work is supported by NIGMS grants GM55694 (T.R.S., K.F.F.) and R01 GM130122 (T.R.S., P. Clark), NSF grants MCB-1517221 (B. Roux) and MCB-2023077 (T.R.S.), Natural Sciences, and the Princess Margaret Cancer Centre. NMR spectroscopy was performed in the Biomolecular NMR Core Facility at the University of Chicago. The Structural Genomics Consortium is a registered charity (no: 1097737) that receives funds from Bayer AG, Boehringer Ingelheim, Bristol Myers Squibb, Genentech, Genome Canada through Ontario Genomics Institute [OGI-196], EU/EFPIA/OICR/McGill/KTH/Diamond Innovative Medicines Initiative 2 Joint Undertaking [EUbOPEN grant 875510], Janssen, Merck KGaA (aka EMD in Canada and US), Pfizer, and Takeda.

Supporting Information Available

The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jctc.1c00960.

Detailed discussion of the energy function improvements, dual-objective contrastive divergence training, temperature calibration, stability prediction, simulation/sampling methods, and experimental methods (PDF)

Author Contributions

X.P., M.B., K.F.F., G.J.R., and T.R.S. designed the research; X.P., N.F., K.F.F., and T.R.S. developed the new force field; X.P. and N.F. implemented the algorithms; M.B., J.R.S., I.A.G., S.H., S.P., C.H.A., and T.R.S. conducted the NMR and HX experiments; X.P., N.F., M.B., J.R.S., S.P., G.J.R., and T.R.S. performed the analysis; and X.P., M.B., J.R.S., N.F., S.H., G.J.R., and T.R.S. wrote the paper.

The authors declare no competing financial interest.

Supplementary Material

ct1c00960_si_001.pdf (4 MB, PDF)

References

(1) Schug A.; Onuchic J. N. From Protein Folding to Protein Function and Biomolecular Binding by Energy Landscape Theory. Curr. Opin. Pharmacol. 2010, 10, 709–714. 10.1016/j.coph.2010.09.012.
(2) Mayo S. L.; Baldwin R. L. Guanidinium Chloride Induction of Partial Unfolding in Amide Proton Exchange in RNase A. Science (Washington, DC, U. S.) 1993, 262 (5135), 873–876. 10.1126/science.8235609.
(3) Bai Y.; Sosnick T. R.; Mayne L.; Englander S. W. Protein Folding Intermediates: Native-State Hydrogen Exchange. Science 1995, 269 (5221), 193–197. 10.1126/science.7618079.
(4) Huyghues-Despointes B. M. P.; Scholtz J. M.; Pace C. N. Protein Conformational Stabilities Can Be Determined from Hydrogen Exchange Rates. Nat. Struct. Biol. 1999, 6 (10), 910–912. 10.1038/13273.
(5) Wooll J. O.; Wrabl J. O.; Hilser V. J. Ensemble Modulation as an Origin of Denaturant-Independent Hydrogen Exchange in Proteins. J. Mol. Biol. 2000, 301 (2), 247–256. 10.1006/jmbi.2000.3889.
(6) Hilser V. J.; Freire E. Structure-Based Calculation of the Equilibrium Folding Pathway of Proteins. J. Mol. Biol. 1996, 262 (5), 756–772. 10.1006/jmbi.1996.0550.
(7) Best R. B.; Zheng W.; Mittal J. Balanced Protein-Water Interactions Improve Properties of Disordered Proteins and Non-Specific Protein Association. J. Chem. Theory Comput. 2014, 10 (11), 5113–5124. 10.1021/ct500569b.
(8) Robustelli P.; Piana S.; Shaw D. E. Developing a Molecular Dynamics Force Field for Both Folded and Disordered Protein States. Proc. Natl. Acad. Sci. U. S. A. 2018, 115 (21), E4758–E4766. 10.1073/pnas.1800690115.
(9) Persson F.; Halle B. How Amide Hydrogens Exchange in Native Proteins. Proc. Natl. Acad. Sci. U. S. A. 2015, 112 (33), 10383–10388. 10.1073/pnas.1506079112.
(10) Devaurs D.; Antunes D. A.; Papanastasiou M.; Moll M.; Ricklin D.; Lambris J. D.; Kavraki L. E. Coarse-Grained Conformational Sampling of Protein Structure Improves the Fit to Experimental Hydrogen-Exchange Data. Front. Mol. Biosci. 2017, 4 (MAR), 13. 10.3389/fmolb.2017.00013.
(11) García A. E.; Hummer G. Conformational Dynamics of Cytochrome c: Correlation to Hydrogen Exchange. Proteins: Struct., Funct., Genet. 1999, 36 (2), 175–91.
(12) Mohammadiarani H.; Shaw V. S.; Neubig R. R.; Vashisth H. Interpreting Hydrogen-Deuterium Exchange Events in Proteins Using Atomistic Simulations: Case Studies on Regulators of G-Protein Signaling Proteins. J. Phys. Chem. B 2018, 122 (40), 9314–9323. 10.1021/acs.jpcb.8b07494.
(13) Sheinerman F. B.; Brooks C. L. Molecular Picture of Folding of a Small α/β Protein. Proc. Natl. Acad. Sci. U. S. A. 1998, 95 (4), 1562–1567. 10.1073/pnas.95.4.1562.
(14) Markwick P. R. L.; Peacock R. B.; Komives E. A. Accurate Prediction of Amide Exchange in the Fast Limit Reveals Thrombin Allostery. Biophys. J. 2019, 116 (1), 49–56. 10.1016/j.bpj.2018.11.023.
(15) Skinner J. J.; Yu W.; Gichana E. K.; Baxa M. C.; Hinshaw J. R.; Freed K. F.; Sosnick T. R. Benchmarking All-Atom Simulations Using Hydrogen Exchange. Proc. Natl. Acad. Sci. U. S. A. 2014, 111 (45), 15975–15980. 10.1073/pnas.1404213111.
(16) Jumper J. M.; Faruk N. F.; Freed K. F.; Sosnick T. R. Accurate Calculation of Side Chain Packing and Free Energy with Applications to Protein Molecular Dynamics. PLoS Comput. Biol. 2018, 14 (12), e1006342. 10.1371/journal.pcbi.1006342.
(17) Jumper J. M.; Faruk N. F.; Freed K. F.; Sosnick T. R. Trajectory-Based Training Enables Protein Simulations with Accurate Folding and Boltzmann Ensembles in Cpu-Hours. PLoS Comput. Biol. 2018, 14 (12), e1006578. 10.1371/journal.pcbi.1006578.
(18) Rocklin G. J.; Chidyausiku T. M.; Goreshnik I.; Ford A.; Houliston S.; Lemak A.; Carter L.; Ravichandran R.; Mulligan V. K.; Chevalier A.; Arrowsmith C. H.; Baker D. Global Analysis of Protein Folding Using Massively Parallel Design, Synthesis, and Testing. Science (Washington, DC, U. S.) 2017, 357 (6347), 168–175. 10.1126/science.aan0693.
(19) Zhang Y.; Skolnick J. Scoring Function for Automated Assessment of Protein Structure Template Quality. Proteins: Struct., Funct., Genet. 2004, 57 (4), 702–710. 10.1002/prot.20264.
(20) Swendsen R. H.; Wang J. S. Replica Monte Carlo Simulation of Spin-Glasses. Phys. Rev. Lett. 1986, 57 (21), 2607–2609. 10.1103/PhysRevLett.57.2607.
(21) Sugita Y.; Okamoto Y. Replica-Exchange Molecular Dynamics Method for Protein Folding. Chem. Phys. Lett. 1999, 314 (1–2), 141–151. 10.1016/S0009-2614(99)01123-9.
(22) Shirts M. R.; Chodera J. D. Statistically Optimal Analysis of Samples from Multiple Equilibrium States. J. Chem. Phys. 2008, 129 (12), 124105. 10.1063/1.2978177.
(23) Skinner J. J.; Lim W. K.; Bédard S.; Black B. E.; Englander S. W. Protein Hydrogen Exchange: Testing Current Models. Protein Sci. 2012, 21 (7), 987–995. 10.1002/pro.2082.
(24) Myers J. K.; Nick Pace C.; Martin Scholtz J. Denaturant m Values and Heat Capacity Changes: Relation to Changes in Accessible Surface Areas of Protein Unfolding. Protein Sci. 1995, 4 (10), 2138–2148. 10.1002/pro.5560041020.
(25) Takei J.; Pei W.; Vu D.; Bai Y. Populating Partially Unfolded Forms by Hydrogen Exchange-Directed Protein Engineering. Biochemistry 2002, 41 (41), 12308–12312. 10.1021/bi026491c.
(26) Kato H.; Vu N. D.; Feng H.; Zhou Z.; Bai Y. The Folding Pathway of T4 Lysozyme: An On-Pathway Hidden Folding Intermediate. J. Mol. Biol. 2007, 365 (3), 881–891. 10.1016/j.jmb.2006.10.048.
(27) Bai Y.; Feng H.; Zhou Z. Population and Structure Determination of Hidden Folding Intermediates by Native-State Hydrogen Exchange-Directed Protein Engineering and Nuclear Magnetic Resonance. Methods Mol. Biol. 2006, 350, 69–81. 10.1385/1-59745-189-4:69.
(28) Chamberlain A. K.; Handel T. M.; Marqusee S. Detection of Rare Partially Folded Molecules in Equilibrium with the Native Conformation of RNaseH. Nat. Struct. Mol. Biol. 1996, 3 (9), 782–787. 10.1038/nsb0996-782.
(29) Hollien J.; Marqusee S. Structural Distribution of Stability in a Thermophilic Enzyme. Proc. Natl. Acad. Sci. U. S. A. 1999, 96 (24), 13674–13678. 10.1073/pnas.96.24.13674.
(30) Parker M. J.; Marqusee S. A Kinetic Folding Intermediate Probed by Native State Hydrogen Exchange. J. Mol. Biol. 2001, 305 (3), 593–602. 10.1006/jmbi.2000.4314.
(31) Hu W.; Walters B. T.; Kan Z. Y.; Mayne L.; Rosen L. E.; Marqusee S.; Englander S. W. Stepwise Protein Folding at near Amino Acid Resolution by Hydrogen Exchange and Mass Spectrometry. Proc. Natl. Acad. Sci. U. S. A. 2013, 110 (19), 7684–7689. 10.1073/pnas.1305887110.
(32) Zheng Z.; Sosnick T. R. Protein Vivisection Reveals Elusive Intermediates in Folding. J. Mol. Biol. 2010, 397 (3), 777–788. 10.1016/j.jmb.2010.01.056.
(33) Yu W.; Baxa M. C.; Gagnon I.; Freed K. F.; Sosnick T. R. Cooperative Folding near the Downhill Limit Determined with Amino Acid Resolution by Hydrogen Exchange. Proc. Natl. Acad. Sci. U. S. A. 2016, 113 (17), 4747–4752. 10.1073/pnas.1522500113.
(34) Auton M.; Holthauzen L. M. F.; Bolen D. W. Anatomy of Energetic Changes Accompanying Urea-Induced Protein Denaturation. Proc. Natl. Acad. Sci. U. S. A. 2007, 104 (39), 15317–15322. 10.1073/pnas.0706251104.
(35) Lim W. K.; Rösgen J.; Englander S. W. Urea, but Not Guanidinium, Destabilizes Proteins by Forming Hydrogen Bonds to the Peptide Group. Proc. Natl. Acad. Sci. U. S. A. 2009, 106 (8), 2595–2600. 10.1073/pnas.0812588106.
(36) Jaswal S. S.; Miranker A. D. Scope and Utility of Hydrogen Exchange as a Tool for Mapping Landscapes. Protein Sci. 2007, 16 (11), 2378–2390. 10.1110/ps.072994207.
(37) Krantz B. A.; Sosnick T. R. Distinguishing between Two-State and Three-State Models for Ubiquitin Folding. Biochemistry 2000, 39 (38), 11696–11701. 10.1021/bi000792+.
(38) Bai Y.; Milne J. S.; Mayne L.; Englander S. W. Primary Structure Effects on Peptide Group Hydrogen Exchange. Proteins: Struct., Funct., Genet. 1993, 17 (1), 75–86. 10.1002/prot.340170110.
(39) Anderson J. S.; Hernández G.; LeMaster D. M. A Billion-Fold Range in Acidity for the Solvent-Exposed Amides of Pyrococcus Furiosus Rubredoxin. Biochemistry 2008, 47 (23), 6178–6188. 10.1021/bi800284y.
(40) Hernández G.; Anderson J. S.; LeMaster D. M. Experimentally Assessing Molecular Dynamics Sampling of the Protein Native State Conformational Distribution. Biophys. Chem. 2012, 163–164, 21–34. 10.1016/j.bpc.2012.02.002.
(41) Abdolvahabi A.; Gober J. L.; Mowery R. A.; Shi Y.; Shaw B. F. Metal-Ion-Specific Screening of Charge Effects in Protein Amide H/D Exchange and the Hofmeister Series. Anal. Chem. 2014, 86 (20), 10303–10310. 10.1021/ac502714v.
(42) Claesen J.; Politis A. POPPeT: A New Method to Predict the Protection Factor of Backbone Amide Hydrogens. J. Am. Soc. Mass Spectrom. 2019, 30 (1), 67–76. 10.1007/s13361-018-2068-x.
(43) Bahar I.; Wallqvist A.; Covell D. G.; Jernigan R. L. Correlation between Native-State Hydrogen Exchange and Cooperative Residue Fluctuations from a Simple Model. Biochemistry 1998, 37 (4), 1067–1075. 10.1021/bi9720641.
(44) Vendruscolo M.; Paci E.; Dobson C. M.; Karplus M. Rare Fluctuations of Native Proteins Sampled by Equilibrium Hydrogen Exchange. J. Am. Chem. Soc. 2003, 125 (51), 15686–15687. 10.1021/ja036523z.
(45) Best R. B.; Vendruscolo M. Structural Interpretation of Hydrogen Exchange Protection Factors in Proteins: Characterization of the Native State Fluctuations of CI2. Structure 2006, 14 (1), 97–106. 10.1016/j.str.2005.09.012.
(46) Bradshaw R. T.; Marinelli F.; Faraldo-Gómez J. D.; Forrest L. R. Interpretation of HDX Data by Maximum-Entropy Reweighting of Simulated Structural Ensembles. Biophys. J. 2020, 118 (7), 1649–1664. 10.1016/j.bpj.2020.02.005.
(47) Bai Y.; Englander J. J.; Mayne L.; Milne J. S.; Englander S. W. Thermodynamic Parameters from Hydrogen Exchange Measurements. Methods Enzymol. 1995, 259 (C), 344–356. 10.1016/0076-6879(95)59051-X.
(48) Craig P. O.; Lätzer J.; Weinkam P.; Hoffman R. M. B.; Ferreiro D. U.; Komives E. A.; Wolynes P. G. Prediction of Native-State Hydrogen Exchange from Perfectly Funneled Energy Landscapes. J. Am. Chem. Soc. 2011, 133 (43), 17463–17472. 10.1021/ja207506z.
(49) Liu T.; Pantazatos D.; Li S.; Hamuro Y.; Hilser V. J.; Woods V. L. Quantitative Assessment of Protein Structural Models by Comparison of H/D Exchange MS Data with Exchange Behavior Accurately Predicted by DXCOREX. J. Am. Soc. Mass Spectrom. 2012, 23 (1), 43–56. 10.1007/s13361-011-0267-9.

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

ct1c00960_si_001.pdf^{(4MB, pdf)}

[ref1] Schug A.; Onuchic J. N. From Protein Folding to Protein Function and Biomolecular Binding by Energy Landscape Theory. Current Opinion in Pharmacology. Curr. Opin. Pharmacol. 2010, 10, 709–714. 10.1016/j.coph.2010.09.012. [DOI] [PubMed] [Google Scholar]

[ref2] Mayo S. L.; Baldwin R. L. Guanidinium Chloride Induction of Partial Unfolding in Amide Proton Exchange in RNase A. Science (Washington, DC, U. S.) 1993, 262 (5135), 873–876. 10.1126/science.8235609. [DOI] [PubMed] [Google Scholar]

[ref3] Bai Y.; Sosnick T. R.; Mayne L.; Englander S. W. Protein Folding Intermediates: Native-State Hydrogen Exchange. Science 1995, 269 (5221), 193–197. 10.1126/science.7618079. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref4] Huyghues-Despointes B. M. P.; Scholtz J. M.; Pace C. N. Protein Conformational Stabilities Can Be Determined from Hydrogen Exchange Rates. Nat. Struct. Biol. 1999, 6 (10), 910–912. 10.1038/13273. [DOI] [PubMed] [Google Scholar]

[ref5] Wooll J. O.; Wrabl J. O.; Hilser V. J. Ensemble Modulation as an Origin of Denaturant-Independent Hydrogen Exchange in Proteins. J. Mol. Biol. 2000, 301 (2), 247–256. 10.1006/jmbi.2000.3889. [DOI] [PubMed] [Google Scholar]

[ref6] Hilser V. J.; Freire E. Structure-Based Calculation of the Equilibrium Folding Pathway of Proteins. J. Mol. Biol. 1996, 262 (5), 756–772. 10.1006/jmbi.1996.0550. [DOI] [PubMed] [Google Scholar]

[ref7] Best R. B.; Zheng W.; Mittal J. Balanced Protein-Water Interactions Improve Properties of Disordered Proteins and Non-Specific Protein Association. J. Chem. Theory Comput. 2014, 10 (11), 5113–5124. 10.1021/ct500569b. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref8] Robustelli P.; Piana S.; Shaw D. E. Developing a Molecular Dynamics Force Field for Both Folded and Disordered Protein States. Proc. Natl. Acad. Sci. U. S. A. 2018, 115 (21), E4758–E4766. 10.1073/pnas.1800690115. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref9] Persson F.; Halle B. How Amide Hydrogens Exchange in Native Proteins. Proc. Natl. Acad. Sci. U. S. A. 2015, 112 (33), 10383–10388. 10.1073/pnas.1506079112. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref10] Devaurs D.; Antunes D. A.; Papanastasiou M.; Moll M.; Ricklin D.; Lambris J. D.; Kavraki L. E. Coarse-Grained Conformational Sampling of Protein Structure Improves the Fit to Experimental Hydrogen-Exchange Data. Front. Mol. Biosci. 2017, 4 (MAR), 13. 10.3389/fmolb.2017.00013. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref11] García A. E.; Hummer G. Conformational Dynamics of Cytochrome c: Correlation to Hydrogen Exchange. Proteins: Struct., Funct., Genet. 1999, 36 (2), 175–91. . [DOI] [PubMed] [Google Scholar]

[ref12] Mohammadiarani H.; Shaw V. S.; Neubig R. R.; Vashisth H. Interpreting Hydrogen-Deuterium Exchange Events in Proteins Using Atomistic Simulations: Case Studies on Regulators of G-Protein Signaling Proteins. J. Phys. Chem. B 2018, 122 (40), 9314–9323. 10.1021/acs.jpcb.8b07494. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref13] Sheinerman F. B.; Brooks C. L. Molecular Picture of Folding of a Small α/β Protein. Proc. Natl. Acad. Sci. U. S. A. 1998, 95 (4), 1562–1567. 10.1073/pnas.95.4.1562. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref14] Markwick P. R. L.; Peacock R. B.; Komives E. A. Accurate Prediction of Amide Exchange in the Fast Limit Reveals Thrombin Allostery. Biophys. J. 2019, 116 (1), 49–56. 10.1016/j.bpj.2018.11.023. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref15] Skinner J. J.; Yu W.; Gichana E. K.; Baxa M. C.; Hinshaw J. R.; Freed K. F.; Sosnick T. R. Benchmarking All-Atom Simulations Using Hydrogen Exchange. Proc. Natl. Acad. Sci. U. S. A. 2014, 111 (45), 15975–15980. 10.1073/pnas.1404213111. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref16] Jumper J. M.; Faruk N. F.; Freed K. F.; Sosnick T. R. Accurate Calculation of Side Chain Packing and Free Energy with Applications to Protein Molecular Dynamics. PLoS Comput. Biol. 2018, 14 (12), e1006342 10.1371/journal.pcbi.1006342. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref17] Jumper J. M.; Faruk N. F.; Freed K. F.; Sosnick T. R. Trajectory-Based Training Enables Protein Simulations with Accurate Folding and Boltzmann Ensembles in Cpu-Hours. PLoS Comput. Biol. 2018, 14 (12), e1006578 10.1371/journal.pcbi.1006578. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref18] Rocklin G. J.; Chidyausiku T. M.; Goreshnik I.; Ford A.; Houliston S.; Lemak A.; Carter L.; Ravichandran R.; Mulligan V. K.; Chevalier A.; Arrowsmith C. H.; Baker D. Global Analysis of Protein Folding Using Massively Parallel Design, Synthesis, and Testing. Science (Washington, DC, U. S.) 2017, 357 (6347), 168–175. 10.1126/science.aan0693. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref19] Zhang Y.; Skolnick J. Scoring function for automated assessment of protein structure template quality. Proteins: Struct., Funct., Genet. 2004, 57 (4), 702–710. 10.1002/prot.20264. [DOI] [PubMed] [Google Scholar]

[ref20] Swendsen R. H.; Wang J. S. Replica Monte Carlo Simulation of Spin-Glasses. Phys. Rev. Lett. 1986, 57 (21), 2607–2609. 10.1103/PhysRevLett.57.2607. [DOI] [PubMed] [Google Scholar]

[ref21] Sugita Y.; Okamoto Y. Replica-Exchange Molecular Dynamics Method for Protein Folding. Chem. Phys. Lett. 1999, 314 (1–2), 141–151. 10.1016/S0009-2614(99)01123-9. [DOI] [Google Scholar]

[ref22] Shirts M. R.; Chodera J. D. Statistically Optimal Analysis of Samples from Multiple Equilibrium States. J. Chem. Phys. 2008, 129 (12), 124105. 10.1063/1.2978177. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref23] Skinner J. J.; Lim W. K.; Bédard S.; Black B. E.; Englander S. W. Protein Hydrogen Exchange: Testing Current Models. Protein Sci. 2012, 21 (7), 987–995. 10.1002/pro.2082. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref24] Myers J. K.; Nick Pace C.; Martin Scholtz J. Denaturant m Values and Heat Capacity Changes: Relation to Changes in Accessible Surface Areas of Protein Unfolding. Protein Sci. 1995, 4 (10), 2138–2148. 10.1002/pro.5560041020. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref25] Takei J.; Pei W.; Vu D.; Bai Y. Populating Partially Unfolded Forms by Hydrogen Exchange-Directed Protein Engineering. Biochemistry 2002, 41 (41), 12308–12312. 10.1021/bi026491c. [DOI] [PubMed] [Google Scholar]

[ref26] Kato H.; Vu N. D.; Feng H.; Zhou Z.; Bai Y. The Folding Pathway of T4 Lysozyme: An On-Pathway Hidden Folding Intermediate. J. Mol. Biol. 2007, 365 (3), 881–891. 10.1016/j.jmb.2006.10.048. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref27] Bai Y.; Feng H.; Zhou Z. Population and Structure Determination of Hidden Folding Intermediates by Native-State Hydrogen Exchange-Directed Protein Engineering and Nuclear Magnetic Resonance. Methods Mol. Biol. 2006, 350, 69–81. 10.1385/1-59745-189-4:69. [DOI] [PubMed] [Google Scholar]

[ref28] Chamberlain A. K.; Handel T. M.; Marqusee S. Detection of Rare Partially Folded Molecules in Equilibrium with the Native Conformation of RNaseH. Nat. Struct. Mol. Biol. 1996, 3 (9), 782–787. 10.1038/nsb0996-782. [DOI] [PubMed] [Google Scholar]

[ref29] Hollien J.; Marqusee S. Structural Distribution of Stability in a Thermophilic Enzyme. Proc. Natl. Acad. Sci. U. S. A. 1999, 96 (24), 13674–13678. 10.1073/pnas.96.24.13674. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref30] Parker M. J.; Marqusee S. A Kinetic Folding Intermediate Probed by Native State Hydrogen Exchange. J. Mol. Biol. 2001, 305 (3), 593–602. 10.1006/jmbi.2000.4314. [DOI] [PubMed] [Google Scholar]

[ref31] Hu W.; Walters B. T.; Kan Z. Y.; Mayne L.; Rosen L. E.; Marqusee S.; Englander S. W. Stepwise Protein Folding at near Amino Acid Resolution by Hydrogen Exchange and Mass Spectrometry. Proc. Natl. Acad. Sci. U. S. A. 2013, 110 (19), 7684–7689. 10.1073/pnas.1305887110. [DOI] [PMC free article] [PubMed] [Google Scholar]

[ref32] Zheng Z.; Sosnick T. R. Protein Vivisection Reveals Elusive Intermediates in Folding. J. Mol. Biol. 2010, 397 (3), 777–788. 10.1016/j.jmb.2010.01.056.

[ref33] Yu W.; Baxa M. C.; Gagnon I.; Freed K. F.; Sosnick T. R. Cooperative Folding near the Downhill Limit Determined with Amino Acid Resolution by Hydrogen Exchange. Proc. Natl. Acad. Sci. U. S. A. 2016, 113 (17), 4747–4752. 10.1073/pnas.1522500113.

[ref34] Auton M.; Holthauzen L. M. F.; Bolen D. W. Anatomy of Energetic Changes Accompanying Urea-Induced Protein Denaturation. Proc. Natl. Acad. Sci. U. S. A. 2007, 104 (39), 15317–15322. 10.1073/pnas.0706251104.

[ref35] Lim W. K.; Rösgen J.; Englander S. W. Urea, but Not Guanidinium, Destabilizes Proteins by Forming Hydrogen Bonds to the Peptide Group. Proc. Natl. Acad. Sci. U. S. A. 2009, 106 (8), 2595–2600. 10.1073/pnas.0812588106.

[ref36] Jaswal S. S.; Miranker A. D. Scope and Utility of Hydrogen Exchange as a Tool for Mapping Landscapes. Protein Sci. 2007, 16 (11), 2378–2390. 10.1110/ps.072994207.

[ref37] Krantz B. A.; Sosnick T. R. Distinguishing between Two-State and Three-State Models for Ubiquitin Folding. Biochemistry 2000, 39 (38), 11696–11701. 10.1021/bi000792+.

[ref38] Bai Y.; Milne J. S.; Mayne L.; Englander S. W. Primary Structure Effects on Peptide Group Hydrogen Exchange. Proteins: Struct., Funct., Genet. 1993, 17 (1), 75–86. 10.1002/prot.340170110.

[ref39] Anderson J. S.; Hernández G.; LeMaster D. M. A Billion-Fold Range in Acidity for the Solvent-Exposed Amides of Pyrococcus furiosus Rubredoxin. Biochemistry 2008, 47 (23), 6178–6188. 10.1021/bi800284y.

[ref40] Hernández G.; Anderson J. S.; LeMaster D. M. Experimentally Assessing Molecular Dynamics Sampling of the Protein Native State Conformational Distribution. Biophys. Chem. 2012, 163–164, 21–34. 10.1016/j.bpc.2012.02.002.

[ref41] Abdolvahabi A.; Gober J. L.; Mowery R. A.; Shi Y.; Shaw B. F. Metal-Ion-Specific Screening of Charge Effects in Protein Amide H/D Exchange and the Hofmeister Series. Anal. Chem. 2014, 86 (20), 10303–10310. 10.1021/ac502714v.

[ref42] Claesen J.; Politis A. POPPeT: A New Method to Predict the Protection Factor of Backbone Amide Hydrogens. J. Am. Soc. Mass Spectrom. 2019, 30 (1), 67–76. 10.1007/s13361-018-2068-x.

[ref43] Bahar I.; Wallqvist A.; Covell D. G.; Jernigan R. L. Correlation between Native-State Hydrogen Exchange and Cooperative Residue Fluctuations from a Simple Model. Biochemistry 1998, 37 (4), 1067–1075. 10.1021/bi9720641.

[ref44] Vendruscolo M.; Paci E.; Dobson C. M.; Karplus M. Rare Fluctuations of Native Proteins Sampled by Equilibrium Hydrogen Exchange. J. Am. Chem. Soc. 2003, 125 (51), 15686–15687. 10.1021/ja036523z.

[ref45] Best R. B.; Vendruscolo M. Structural Interpretation of Hydrogen Exchange Protection Factors in Proteins: Characterization of the Native State Fluctuations of CI2. Structure 2006, 14 (1), 97–106. 10.1016/j.str.2005.09.012.

[ref46] Bradshaw R. T.; Marinelli F.; Faraldo-Gómez J. D.; Forrest L. R. Interpretation of HDX Data by Maximum-Entropy Reweighting of Simulated Structural Ensembles. Biophys. J. 2020, 118 (7), 1649–1664. 10.1016/j.bpj.2020.02.005.

[ref47] Bai Y.; Englander J. J.; Mayne L.; Milne J. S.; Englander S. W. Thermodynamic Parameters from Hydrogen Exchange Measurements. Methods Enzymol. 1995, 259 (C), 344–356. 10.1016/0076-6879(95)59051-X.

[ref48] Craig P. O.; Lätzer J.; Weinkam P.; Hoffman R. M. B.; Ferreiro D. U.; Komives E. A.; Wolynes P. G. Prediction of Native-State Hydrogen Exchange from Perfectly Funneled Energy Landscapes. J. Am. Chem. Soc. 2011, 133 (43), 17463–17472. 10.1021/ja207506z.

[ref49] Liu T.; Pantazatos D.; Li S.; Hamuro Y.; Hilser V. J.; Woods V. L. Quantitative Assessment of Protein Structural Models by Comparison of H/D Exchange MS Data with Exchange Behavior Accurately Predicted by DXCOREX. J. Am. Soc. Mass Spectrom. 2012, 23 (1), 43–56. 10.1007/s13361-011-0267-9.
