Predicting X‐ray solution scattering from flexible macromolecules

Hao Zhou; Hugo Guterres; Carla Mattos; Lee Makowski

doi:10.1002/pro.3508

2018 Oct 16;27(12):2023–2036. doi: 10.1002/pro.3508

PMCID: PMC6237699 PMID: 30230663

Abstract

Wide‐angle X‐ray solution scattering (WAXS) patterns contain substantial information about the structure and dynamics of a protein. Solution scattering from a rigid protein can be predicted from atomic coordinate sets to within experimental error. However, structural fluctuations of proteins in solution can lead to significant changes in the observed intensities. The magnitude and form of these changes contain information about the nature and spatial extent of structural fluctuations in the protein. Molecular dynamics (MD) simulations based on a crystal structure and selected force field generate models for protein internal motions, and here we demonstrate that they can be used to predict the impact of structural fluctuations on solution scattering data. In cases where the observed and calculated intensities correspond, we can conclude that the X‐ray scattering provides direct experimental validation of the structural and MD results. In cases where calculated and observed intensities are at odds, the inconsistencies can be used to determine the origins of these discrepancies. They may be because of overestimates or underestimates of structural fluctuations in MD simulations, under‐sampling of the structural ensemble in the simulations, errors in the structural model, or a mismatch between the experimental conditions and the parameters used in carrying out the MD simulation.

Keywords: MD simulation, SAXS, WAXS, vector‐length convolution, RAS, HIVp, sigma‐r plots

Abbreviations

MD: molecular dynamics
NMR: nuclear magnetic resonance
SAXS: small‐angle X‐ray solution scattering
aMD: accelerated molecular dynamics
WAXS: wide‐angle X‐ray solution scattering
cMD: conventional molecular dynamics
*: convolution
nsec: nanosecond

Introduction

Accurate prediction of the solution scattering pattern of a protein has many potential benefits, including testing models of protein structure and flexibility. Characterization of conformational fluctuations of proteins in solution may provide important insights into protein activity because these fluctuations frequent low energy pathways that have evolved in support of protein function.1 Paradoxically, slow, large‐scale fluctuations associated with many protein functions often prove the most challenging to study: (i) they may be constrained by intermolecular contacts in crystal lattices, restricting their study by crystallographic techniques; (ii) their relaxation times are incompatible with conventional nuclear magnetic resonance (NMR) techniques, and indirect NMR approaches produce limited information that precludes detailed analyses;2 and (iii) these motions occur over time frames prohibitively long for conventional molecular dynamics (MD) simulations, motivating the development of coarse‐grained approaches that rely on approximations that may not be fully validated. The limitations of existing biophysical methods for studying these fluctuations provide strong motivation for developing novel approaches.

Wide‐angle X‐ray solution scattering (WAXS) has proven relatively sensitive to structural fluctuations,3 in contrast to the lower resolution data obtained by conventional small‐angle X‐ray scattering (SAXS). Although containing no dynamical information, per se, WAXS data reflect the structural heterogeneity intrinsic to the entire conformational ensemble in solution. Comparison of WAXS from proteins in solution with that predicted for rigid proteins often identifies discrepancies that can be attributed to structural fluctuations and can provide information about the spatial scale and nature of those fluctuations4, 5 in a manner not possible with SAXS. The Debye formula6 should, in principle, provide a simple basis for the calculation of solution scattering from atomic coordinates, but accounting for solvent molecules in the hydration layer and the volume of solvent excluded by the protein complicates the calculation. Several strategies have been used to account for excluded volume,7, 8, 9 most involving recalculation of the electron density of protein atoms relative to the average electron density of solvent. However, the hydration layer also contributes to the scattering because packing of water molecules immediately adjacent to the protein is typically ~10% denser than in bulk.10 These issues were addressed in development of the widely used software package, CRYSOL,11 which uses a continuum model for the hydration layer and an efficient computation of excluded volume. It further provides the option to refine parameters characterizing excluded volume and hydration to minimize difference between calculated and observed intensities. 
With default parameters, CRYSOL is often accurate in the lower resolution SAXS regime, but not necessarily at wider angles of scattering, typically leading to an underestimate of scattering in the higher resolution WAXS regime.12 Refinement with CRYSOL often results in good agreement to the data, but may require physically unrealistic parameter values for excluded volume and hydration layer, bringing into question the relevance of the refinement process.13 Modeling the hydration layer using explicit atom water provides a more accurate estimate of scattered intensity without the need for adjustable parameters.12, 14 This approach requires placement of individual water molecules about the protein, most accurately carried out using MD simulations,12 although more empirical approaches have exhibited success.14 Typically, 50 or more “snapshots” of water about the protein are required to produce a meaningful average calculated intensity. Excluded volume is not an issue when explicit atom water molecules are used. Calculating the difference in predicted intensity between a “water droplet” containing the protein and a second “water droplet” of identical size and shape, containing only water molecules, accounts for the impact of bulk solvent at the protein surface.12 This approach provides prediction of intensity that, in many cases, is within experimental error for rigid proteins.
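As a point of reference for the discussion above, the Debye formula for N point scatterers can be written as follows. The symbols f_i(q) (atomic form factor) and r_ij (distance between atoms i and j) are notation introduced here for illustration. The expression is the bare in‐vacuum form and omits the excluded‐volume and hydration‐layer contributions that packages such as CRYSOL and XS are designed to handle.

```latex
I(q) = \sum_{i=1}^{N} \sum_{j=1}^{N} f_{i}(q)\, f_{j}(q)\, \frac{\sin(q\, r_{ij})}{q\, r_{ij}},
\qquad \frac{\sin(q\, r_{ij})}{q\, r_{ij}} \to 1 \ \text{for } i = j .
```

Because the double sum depends on the coordinates only through the distances r_ij, the same intensity can be obtained from a histogram of interatomic distances, which is the property the vector‐length approach relies on.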

However, solution scattering data is highly sensitive to structural fluctuations, and the scattering from a flexible protein in solution can be very different from that predicted assuming it is rigid.4, 5 Many approaches to this problem have been discussed, such as producing a representative ensemble of protein structures, predicting the scattering from each member of the ensemble and modeling the observed scattering as a linear combination of these scattering patterns;15, 16 or predicting the SAXS profile by taking account of the thermal motions on the molecular component of the solution scattering of a single state of the macromolecule.17 Although a powerful and informative approach, construction of an ensemble of this kind may require postulation of details that will not be evident in the solution scattering pattern, and thereby not experimentally testable. Consequently, simpler approaches to modeling of scattering from a structural ensemble are worth pursuing.

In fact, all the information needed to reconstruct solution scattering is contained in a histogram of interatomic distances, the pair‐distribution function, p(r). It can be calculated from solution scattering intensity and, conversely, can be used to calculate the solution scattering intensity.18, 19 The p(r) also provides a simple way to model the impact of fluctuations on solution scattering: interatomic vectors that would all be the same length in a solution of identical, rigid proteins, will, in a solution of flexible proteins, take on a range of lengths. This means that the pair‐distribution function of a flexible protein can be calculated from that of a rigid protein by convolution of the pair‐distribution function of a rigid protein, p_r(r), with the distribution of length changes generated by intramolecular motion. The nature and breadth of that distribution will vary with interatomic vector length and its impact on p(r) can be expressed as a generalized convolution:

$$p(r) = p_{r}(s) * v(r - s, r), \tag{1}$$

where molecular motion alters interatomic vector lengths, r, by amounts, s, with a measure of how often each fluctuation occurs given by v(s, r), which is, itself, a function of interatomic vector length. The standard deviation of v(s, r) is designated as σ(r) and its variation with r is referred to here as a sigma‐r distribution. In wide‐angle solution scattering, it is possible to estimate σ(r) from the difference between the pair distribution function of a rigid protein, p_r(r), and that calculated directly from observed intensity, p_obs(r), by searching for an estimate, v_e(r − s, r), that minimizes

$$\min_{\sigma} \left\{ p_{obs}(r) - \left[ p_{r}(s) * v_{e}(r - s, r) \right] \right\}. \tag{2}$$

In many cases, the distribution, v_e(r − s, r), can be approximated as a Gaussian,

$$v_{e}(r - s, r) = \exp\left[ -\frac{(r - s)^{2}}{2\, \sigma(r)^{2}} \right]. \tag{3}$$

Under conditions where this approximation is valid, the combination of pair‐distribution function and sigma‐r distribution contains all information necessary for calculating the impact of fluctuations on the solution scattering. p(r) can be calculated from an atomic coordinate set such as those available in the Protein Data Bank (PDB), and σ(r) can be calculated from atomic coordinates generated by an MD trajectory. To be used as a basis for calculation of solution scattering data, the MD trajectory must be long enough to provide an unbiased sample of the ensemble of structures that the protein exhibits in solution. MD approaches to assure the generation of an unbiased sampling of a structural ensemble remain uncertain and developing approaches for experimental validation is important.
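The vector‐length convolution with a Gaussian kernel can be sketched numerically as below. This is a minimal illustration, not the authors' implementation: the function name, the uniform distance grid, and the unit‐area normalization of the Gaussian are assumptions made here, and σ is evaluated at the output length r, following the way Eq. (1) is written.

```python
import numpy as np

def vector_length_convolution(r, p_rigid, sigma):
    """Broaden a rigid-protein pair-distribution function.

    r       : 1-D uniform grid of interatomic distances (angstroms)
    p_rigid : rigid-protein pair distribution sampled on r
    sigma   : sigma(r) sampled on the same grid (angstroms, all > 0)
    """
    r = np.asarray(r, dtype=float)
    p_rigid = np.asarray(p_rigid, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    dr = r[1] - r[0]

    # diff[i, j] = r_i - s_j; row i uses sigma(r_i), as in Eq. (1)
    diff = r[:, None] - r[None, :]
    kernel = np.exp(-diff**2 / (2.0 * sigma[:, None] ** 2))
    # Assumption: each Gaussian is scaled to unit area so that pair
    # counts are approximately conserved.
    kernel /= sigma[:, None] * np.sqrt(2.0 * np.pi)

    # Discrete form of the integral over s
    return kernel @ p_rigid * dr
```

With a constant σ this reduces to an ordinary convolution, so a feature of width w in the rigid distribution broadens to a width of sqrt(w² + σ²) while its area is preserved.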

Reduction of the information in an MD trajectory to a sigma‐r distribution amounts to a huge data compression from the gigabytes needed to specify the atomic coordinates of every atom at every time point in the trajectory, to a single one‐dimensional plot. Nevertheless, we have shown that it has the advantages of (i) supporting computationally efficient estimation of scattering from a flexible protein; (ii) making possible direct comparison of MD results with results from WAXS experiments; and (iii) providing a simple, intuitively clear representation of the nature and magnitude of global fluctuations in the molecule during the trajectory.20
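One way to reduce a trajectory to a sigma‐r distribution is sketched below. The binning scheme is an assumption made for illustration (the standard deviation of each pair distance over the trajectory, averaged over all pairs whose mean distance falls in the same bin); the function name and `bin_width` parameter are hypothetical. Because every atom pair is stored for every frame, the sketch is only practical for a reduced atom set such as Cα atoms.

```python
import numpy as np

def sigma_r_from_trajectory(coords, bin_width=1.0):
    """Estimate sigma(r) from coordinates of shape (n_frames, n_atoms, 3)."""
    coords = np.asarray(coords, dtype=float)
    n_atoms = coords.shape[1]

    # All unique atom pairs (i < j)
    i, j = np.triu_indices(n_atoms, k=1)
    # distances[f, k] = length of pair k in frame f
    distances = np.linalg.norm(coords[:, i, :] - coords[:, j, :], axis=2)

    mean_len = distances.mean(axis=0)  # average length of each pair
    std_len = distances.std(axis=0)    # fluctuation of each pair

    # Group pairs by mean length and average their fluctuations
    bins = np.floor(mean_len / bin_width).astype(int)
    n_bins = bins.max() + 1
    counts = np.bincount(bins, minlength=n_bins)
    sums = np.bincount(bins, weights=std_len, minlength=n_bins)

    sigma = np.full(n_bins, np.nan)    # NaN marks empty bins
    occupied = counts > 0
    sigma[occupied] = sums[occupied] / counts[occupied]

    r_centers = (np.arange(n_bins) + 0.5) * bin_width
    return r_centers, sigma
```

The output is a single one‐dimensional curve regardless of trajectory length, which is the data compression described above.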

Comparison of sigma‐r plots from WAXS patterns and MD simulations from HIV protease5 indicated that the protein in solution was undergoing structural fluctuations with length scales much greater than those indicated by the simulations. This observation suggested either that the MD simulations did not reflect a full structural ensemble or that the solution conditions supported far greater structural fluctuations than reflected in the MD trajectories.

Here we use a combination of structural and dynamic information to predict WAXS scattering from a protein undergoing structural fluctuations in solution. We use the software package, XS,12 to calculate the WAXS pattern from a rigid protein; calculate the corresponding sigma‐r plot from an MD simulation; and then carry out vector‐length convolution to predict the solution scattering that would occur from a protein undergoing fluctuations of the magnitude present in the MD simulation. Comparison of the calculated scattering with that observed provides a measure of the consistency of the results of experimental and computational studies of the dynamics of a protein.
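The final step of such a workflow, converting a pair‐distribution function back into a scattering curve, can be sketched with the standard sinc transform. This is an illustrative sketch only: the function name is hypothetical, a uniform r grid is assumed, and the overall 4π prefactor depends on how p(r) is normalized.

```python
import numpy as np

def intensity_from_pr(q, r, p):
    """I(q) = 4*pi * integral of p(r) * sin(q r) / (q r) dr on a uniform r grid."""
    q = np.asarray(q, dtype=float)
    r = np.asarray(r, dtype=float)
    p = np.asarray(p, dtype=float)
    dr = r[1] - r[0]

    # np.sinc(x) is sin(pi x) / (pi x), so divide the argument by pi;
    # this also handles q = 0 or r = 0 without a division by zero.
    kernel = np.sinc(np.outer(q, r) / np.pi)
    return 4.0 * np.pi * (kernel @ p) * dr
```

Applying this transform to a rigid p(r) and to its broadened counterpart shows the expected effect of fluctuations: oscillations at wider angles are damped, flattening peaks and filling troughs.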

We have chosen the human H‐Ras GTPase as a model system to demonstrate this approach, as there is evidence to indicate that hydrolysis of GTP to GDP by Ras is accompanied by a significant decrease in Ras flexibility.21, 22 Ras is a GTPase at the hub of signal transduction cascades that control cell proliferation, migration and survival, and its mutants are found in about 20% of human cancers.23 Ras is at rest when bound to GDP and active in the GTP‐bound form.24 While Ras bound to GDP has essentially the same canonical conformation in all structures found in the PDB, regardless of isoform (H‐Ras, K‐Ras or N‐Ras), a variety of conformational states have been observed for Ras proteins bound to GTP analogues that reflect a significantly larger range of accessible conformations.25 The structures of H‐Ras and its mutants bound to the GTP analogue guanylyl‐5‐imidodiphosphate (GNP for short) have been extensively studied, revealing several conformational states associated with Switch I (residues 30–40) and Switch II (residues 60–76).26 In particular, Switch I can attain a closed state compatible with effector binding and GTP hydrolysis and an open state associated with nucleotide exchange.27 Switch II is modulated by an allosteric site, which promotes the transition from a disordered T‐state to an ordered R‐state conformation conducive to GTP hydrolysis.25, 28 Here we use WAXS data to experimentally compare the flexibility of H‐Ras‐GDP to that of H‐Ras‐GNP in solution. We use accelerated MD (aMD)29 to overcome the energy barriers for transitions between the various conformational states for Switch I and Switch II. The scattering pattern calculated from the resulting trajectory is in excellent agreement with that obtained experimentally by WAXS. Furthermore, we demonstrate the detection of changes in the equilibrium between conformational states in wild type H‐Ras and its mutant Y71A, implicated in conformational transitions associated with Switch II. 
As a further demonstration of the approach, we use HIV protease as a model to test this method on an even more flexible protein with complex dynamic behavior.

Results

SAXS/WAXS of H‐Ras

Figure 1(a) includes plots of observed scattering intensity from H‐Ras‐GDP and H‐Ras‐GNP. Scattering patterns from the two samples are similar but not identical. The intensity of the peak at q ~ 0.28 Å⁻¹ is lower for H‐Ras‐GNP than H‐Ras‐GDP, indicating that the GDP form is more rigid than the GNP form. Figure 1(b) shows the predicted scattering patterns calculated from the rigid models of GDP‐bound (PDB ID 4Q21) and GNP‐bound (PDB ID 3K8Y) Ras. The major difference between these predicted patterns [Fig. 1(b)] and those observed [Fig. 1(a)] is the observed flattening of peaks and filling of troughs—hallmarks of the impact of structural fluctuations on solution scattering.

Figure 1. Impact of GDP/GNP exchange on solution scattering. (a) Comparison of observed solution scattering from H‐Ras‐GDP (blue) and H‐Ras‐GNP (red). (b) Comparison of scattering calculated for rigid H‐Ras‐GDP (blue) and H‐Ras‐GNP (red).

MD of H‐Ras

MD simulations of H‐Ras‐GDP (4Q21) and H‐Ras‐GNP (3K8Y; no bound ligands other than the nucleotide in either case) were carried out for 90 nsec using aMD. Figure 2 shows the sigma‐r plots calculated from each of these simulations. The smaller sigma‐r values associated with the longer interatomic vectors in H‐Ras‐GDP compared to the corresponding vectors in H‐Ras‐GNP indicate a more rigid GDP‐bound protein, qualitatively consistent with the results of the solution scattering experiments shown in Figure 1. The challenge here is to demonstrate quantitative consistency of the MD results with the variations seen in the solution scattering data.

Prediction of SAXS/WAXS patterns

We combined the calculated scattering patterns for H‐Ras‐GDP and H‐Ras‐GNP shown in Figure 1(b) with the sigma‐r plots shown in Figure 2 to calculate the predicted scattering pattern from Ras undergoing the fluctuations generated in the aMD trajectories. Comparisons of the calculated and observed scattering patterns for both complexes are shown in Figure 3; the comparisons of SAXS patterns are shown in the insets on a semilog scale. It is immediately clear that predictions assuming a rigid protein [Fig. 3(a,c), black curves] do not adequately reproduce the observed data (red curves), as indicated by the fact that the scattering predicted for a rigid protein exhibits sharper peaks and deeper troughs than observed. Flattening of the peaks and filling of troughs account for the impact of flexibility in the plots shown in Figure 3(b,d) (blue curves), particularly for H‐Ras‐GNP.

Figure 2. Comparison of sigma‐r plots generated from 90 nsec aMD trajectories of H‐Ras‐GDP (blue) and H‐Ras‐GNP (red).

Figure 3. Comparison of calculated and observed scattering from WT H‐Ras. (a) Comparison of scattering observed from H‐Ras‐GNP (red) and calculated for a rigid protein from XS (black). (b) Comparison of scattering observed for H‐Ras‐GNP (red) and that calculated for a protein undergoing fluctuations at the scale predicted from aMD (blue); the SAXS comparison is on a semilog scale. (c) Comparison of scattering observed from H‐Ras‐GDP (red) and calculated for a rigid protein from XS (black). (d) Comparison of scattering observed for H‐Ras‐GDP (red) and that calculated for a protein undergoing fluctuations at the scale predicted from aMD (blue); the SAXS comparison is on a semilog scale.

In the case of H‐Ras‐GNP the fluctuations generated in the aMD simulations are entirely consistent with the SAXS/WAXS observations, as the two curves in Figure 3(b) overlap nearly perfectly [see inset in Fig. 3(b)]. For H‐Ras‐GDP the correspondence is not quite so good. As the details in Figure 3(d) suggest, although the estimate generated from the aMD simulation is far closer to that observed than that calculated from a rigid protein, the impression is that the simulation has overestimated the fluctuations [inset in Fig. 3(d)]. Overall, it is clear that the combination of crystal structure and MD simulation can improve prediction of solution scattering. This leads to the conclusion that aMD of H‐Ras‐GNP, and, to a lesser extent, H‐Ras‐GDP generates structural fluctuations consistent with experimental observations. The good agreement for H‐Ras‐GNP indicates that the 90 nsec aMD simulation provides an unbiased sampling of the conformational ensemble. For H‐Ras‐GDP the fit is good, although the protein dynamics are somewhat overestimated by aMD. It appears that for the more rigid H‐Ras‐GDP the boost potential applied in aMD is not necessary and may lead to an overestimate of the flexibility in solution (see below).

Detection of substates in H‐Ras‐GTP protein solution

This approach can also be applied to detection of structural states in solution. H‐Ras‐GTP takes on two distinct states over the course of the 21,000 snapshots of the aMD simulation: the catalytically competent R‐state and the T‐state with a disordered Switch II25, 30 (Fig. 4). The starting model for the simulations represents Ras in the R‐state, pinning the structure to the R‐state early in the simulation. However, in the absence of allosteric effectors, H‐Ras‐GTP transitions to the T‐state over the course of the simulation. Is the protein in solution represented best by R‐state or T‐state? To address this, we divided the 21,000‐snapshot aMD simulation into 21 segments of 1000 snapshots each. These calculations indicate that a 1000‐snapshot simulation is too short to adequately sample the conformational ensemble; the protein does not have adequate time to fluctuate through the entire available structural space. To quantify the extent of this shortfall, we calculated the R‐factors between observed and calculated intensities to assess the evolution of dynamics during the trajectory and compare them to those observed in solution. We applied vector‐length convolution to generate the 21 best fits (calculated from 21 different sigma‐r plots obtained from the 21 aMD simulation segments). The results of these calculations are shown in Figure 5. The systematic decrease in R‐factor over the course of the simulation indicates that the latter parts of the aMD trajectory better represent the observed structural and dynamic state of the protein than the earlier parts. This suggests that H‐Ras‐GNP in the absence of an allosteric ligand is predominantly in the T‐state in solution. It also indicates an evolution toward a different structural/dynamic state, consistent with the trajectory representing a transition from R‐state to T‐state, as shown in Figure 4.
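The segment‐by‐segment comparison can be sketched as below. The R‐factor definition is not given in this section, so the form used here (sum of absolute differences normalized by the summed observed intensity, after a least‐squares scale factor) is an assumption, and both function names are hypothetical.

```python
import numpy as np

def r_factor(i_obs, i_calc):
    """Assumed definition: sum|I_obs - k * I_calc| / sum|I_obs|."""
    i_obs = np.asarray(i_obs, dtype=float)
    i_calc = np.asarray(i_calc, dtype=float)
    # Least-squares scale factor placing I_calc on the scale of I_obs
    k = np.dot(i_obs, i_calc) / np.dot(i_calc, i_calc)
    return np.abs(i_obs - k * i_calc).sum() / np.abs(i_obs).sum()

def split_into_segments(n_snapshots, n_segments):
    """Return (start, stop) snapshot indices for consecutive segments."""
    edges = np.linspace(0, n_snapshots, n_segments + 1).astype(int)
    return [(int(a), int(b)) for a, b in zip(edges[:-1], edges[1:])]

# 21,000 snapshots divided into 21 segments of 1000 snapshots each
segments = split_into_segments(21000, 21)
```

Each (start, stop) pair would select the snapshots used to build one sigma‐r plot; the R‐factor of the resulting predicted intensity against the observed data can then be followed as a function of segment index.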

Figure 4. R‐state and T‐state conformational states associated with early and late time points in the H‐Ras‐GTP aMD simulations. In green is H‐Ras in the R‐state early in the simulation, with Helix 3 and Loop 7 shifted toward Helix 4 and an ordered Switch II. In yellow is H‐Ras in the T‐state, representative of most of the trajectory, with Helix 3 and Loop 7 shifted toward Switch II, resulting in a more disordered Switch II. For comparison, crystal structures of H‐Ras‐GNP in the R‐state and T‐state are shown in gray (PDB ID 3K8Y) and black (PDB ID 2RGE), respectively.

Figure 5. R‐factors of the 21 best fits obtained using the 21 segments of the aMD simulation. The trend in R‐factor indicates that, over the course of the trajectory, the simulation approaches the dynamic behavior of the protein in solution.

Identifying shortcomings in the in silico mutation model for H‐RasY71A‐GTP

Solution scattering from H‐RasY71A‐GNP suggests that this variant exhibits a greater level of fluctuation in solution than the wild type protein (Fig. 6). Figure 7 compares the observed scattering from Y71A with that calculated for a rigid protein using PDB file 3K8Y (which is a wild type structure) and using a model for the mutant in which the wild type crystal structure (PDB ID 3K8Y) was mutated in silico. The key features of the scattering pattern from Y71A are similar to those from wild type except that the peaks are further muted and the troughs further filled in, indicative of increased flexibility in the mutant. When the dynamics of the mutant are predicted from aMD simulations based on the wild type crystal structure mutated in silico, the magnitude of the sigma‐r plot is not significantly changed. When this sigma‐r plot is used to predict scattering from the Y71A mutant, the calculated intensities have sharper peaks and deeper troughs than those observed [Fig. 7(b)], which indicates that the flexibility of the mutant protein in solution is significantly greater than that predicted by aMD.

Figure 6. Comparison of observed scattering from H‐RasY71A‐GNP and wild type H‐Ras‐GNP. The decreased peak at q ~ 0.28 Å⁻¹ and the increased peak at q ~ 0.38 Å⁻¹ indicate that the Y71A mutant undergoes a greater level of fluctuation in solution than wild type H‐Ras‐GNP.

Figure 7. Comparison of calculated and observed solution scattering from H‐RasY71A‐GNP. (a) Comparison of observed intensity with that calculated from a rigid homology model for the mutant based on 3K8Y. (b) Comparison of observed scattering with that calculated for H‐RasY71A‐GNP assuming a rigid protein and taking into account the fluctuations generated by the aMD trajectory (inset includes observed and prediction for fluctuating protein). (c) Comparison of observed intensity with that calculated from WT 3K8Y. (d) Comparison of observed scattering from H‐RasY71A‐GNP with that calculated for WT assuming a rigid protein and taking into account the fluctuations generated by the aMD trajectory (inset includes observed and prediction for fluctuating protein). (e) Sigma‐r plot from 90 nsec aMD simulation of H‐RasY71A‐GNP.

When the structure of wild type H‐Ras‐GNP (PDB ID 3K8Y) was used as the XS input (rather than the in silico mutated model) the change of a single residue led to a poorer agreement [Fig. 7(d)]. The R‐factor between the observed and the calculated intensity is ~0.057, which is worse than that obtained using the in silico mutated model as the XS input (~0.049). In addition, the peaks and troughs of calculated intensities are sharper than those of observed intensities in Figure 7(b,d), indicating that a 90 nsec aMD simulation is inadequate to produce an unbiased sample of the conformational ensemble, regardless of whether the simulations were started with the wild type or the in silico mutated structure. From these calculations we conclude that the replacement of Y71 by alanine (i) disrupts the packing of amino acids in the structure, resulting in a conformation that is not well represented by a simple in silico mutated model; and (ii) increases the dynamic fluctuations of the protein to an extent not readily represented by the aMD simulations starting with the in silico mutated model.

Comparison of cMD and aMD

Accelerated molecular dynamics (aMD) is an efficient simulation method that allows broader sampling of the conformational space by reducing energy barriers between different states of a system. Compared to conventional molecular dynamics (cMD), aMD will access more conformational space in a specific simulation time. The sigma‐r plot is a convenient way to illustrate how dramatic the impact of the reduction of energy barriers in aMD can be. Figure 8 shows a comparison of sigma‐r plots for aMD and cMD simulations of the same length for H‐Ras‐GTP and H‐Ras‐GDP. The differences are striking. The variation of interatomic vector lengths is on average 50% higher in the aMD simulation than in cMD. This demonstrates the power of a sigma‐r plot to distinguish between different MD methods.
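A comparison of this kind can be quantified from two sigma‐r curves sampled on the same distance grid, as in the short sketch below. How the average was taken is not stated in this section, so the unweighted mean of the point‐by‐point ratio used here is an assumption, and the function name is hypothetical.

```python
import numpy as np

def mean_percent_increase(sigma_amd, sigma_cmd):
    """Average percentage by which sigma_amd exceeds sigma_cmd."""
    sigma_amd = np.asarray(sigma_amd, dtype=float)
    sigma_cmd = np.asarray(sigma_cmd, dtype=float)
    # Skip empty bins (NaN) and bins where the reference value is zero
    valid = np.isfinite(sigma_amd) & np.isfinite(sigma_cmd) & (sigma_cmd > 0)
    return 100.0 * np.mean(sigma_amd[valid] / sigma_cmd[valid] - 1.0)
```

A ratio taken bin by bin weights short and long interatomic vectors equally; weighting by the number of atom pairs in each bin would be an equally defensible choice and could give a different number.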

Figure 8. Comparison of sigma‐r plots for aMD (red) and cMD (blue) simulations of the same time length for (a) H‐Ras‐GNP and (b) H‐Ras‐GDP.

We applied the results of the cMD simulations to predict the SAXS/WAXS patterns of H‐Ras‐GNP and H‐Ras‐GDP (Fig. 9). Whereas the aMD simulation led to accurate prediction of the WAXS pattern for H‐Ras‐GNP [Fig. 3(b)], the cMD simulation resulted in a less accurate prediction of the observed intensities [Fig. 9(b)]. By contrast, the aMD simulation overestimated the degree of fluctuation in H‐Ras‐GDP [Fig. 3(d)], and the lower degree of fluctuation in the cMD simulation more closely reflected the experimental results [Fig. 9(d)]. In Figure 9(a,c), the sigma‐r plots from the aMD and cMD simulations are shown in different colors; the differences indicate that aMD samples more conformations than cMD.

Figure 9. Comparison of calculated and observed intensities from H‐Ras‐GNP and H‐Ras‐GDP using cMD sigma‐r plots. (a) Sigma‐r plots from the same length of cMD and aMD H‐Ras‐GNP trajectories. (b) Comparison of observed intensity with that calculated from PDB 3K8Y using nonscaled cMD sigma‐r plot. (c) Sigma‐r plots from the same length of cMD and aMD H‐Ras‐GDP trajectories. (d) Comparison of observed intensity with that calculated from PDB 4Q21 using nonscaled cMD sigma‐r plot.

Characterizing large‐scale dynamics not reflected in MD simulations

HIV protease (HIVp) is a homodimer of 99 amino acid polypeptides that carries out sequence‐specific cleavages of the viral polyprotein needed to create the protein components that are essential for viral replication. HIVp is a principal drug target for AIDS therapies and has become one of the most thoroughly studied of all proteins. Amino acid threonine 80 is highly conserved in all patient populations, both treated and untreated. Although T80 makes no direct contact with substrate or inhibitors and T80N has a crystal structure essentially unchanged from WT,31 its replacement with any other amino acid greatly diminishes or abolishes enzymatic activity. The search for the mechanism by which this amino acid replacement impacts function has led to examination of the dynamics of T80N, and MD simulations demonstrated that the mutant has decreased conformational mobility in the flap regions that must open to allow substrate access to the enzymatic site.31 X‐ray solution scattering studies5 provided further evidence for the importance of dynamics in HIVp function and the suppression of dynamics in the nonfunctional T80N. We elaborate on those studies in the context of the methods introduced here.

Utilizing the approach described for H‐Ras (above), we combined X‐ray solution scattering observations from HIVp with the results of 100 nsec all‐atom MD simulations for both WT and T80N protease. As shown in Figure 10, there is a substantial difference between observed scattering from HIVp and that calculated for a rigid protein. The features in the predicted pattern are strongly muted in the observations, suggestive of significant structural fluctuation. Furthermore, the degree of large‐scale molecular motion suggested by the observations cannot be explained by the level of motion exhibited by 100 nsec MD simulations (Fig. 10). From this, we can conclude that the dynamic behavior represented in these MD simulations does not come close to explaining the structural polymorphisms within the protein solution. Much longer simulations may be required. However, the divergence at wider scattering angles indicates that the dynamics of HIVp may be too complex for the model of a single structure perturbed by fluctuations. The large‐scale fluctuations of the flaps (and potentially other features) may not be readily accounted for by a single representative structure and its variants. Other factors may also be involved. First, the flap region of the HIV protease is highly flexible and capable of undergoing slow, large‐scale motion, so an unbiased sampling of the structural ensemble may require far longer MD trajectories, bordering on microseconds to milliseconds. Second, there could be an intrinsic, static polymorphism because of damage to the proteins during isolation and purification. Third, there could be components within the experimental solution that generate structural polymorphism (either static or dynamic) that are not represented by the MD simulations.

Figure 10. Comparison of calculated and observed solution scattering from HIVp. (a) Solution scattering from WT HIVp as observed and as calculated for a rigid protein. (b) Sigma‐r plot calculated from a 100 nsec MD simulation of WT HIVp. (c) Comparison of predicted scattering from rigid WT HIVp and HIVp fluctuating at a level predicted by MD with that observed. (d) Solution scattering from T80N HIVp as observed and as calculated for a rigid protein. (e) Sigma‐r plot calculated from a 100 nsec MD simulation of T80N HIVp. (f) Comparison of predicted scattering from rigid T80N HIVp and HIVp fluctuating at a level predicted by MD with that observed.

Discussion

We have shown here that it is possible to predict, to within experimental error, the X‐ray solution scattering from a protein undergoing structural fluctuations by using a combination of structural information from X‐ray crystallography and dynamic information from MD simulations. We have done this without the need to construct an ensemble of structures or postulate (or fit) a set of relative abundances of representative structures. This capability provides a basis for evaluating structural ensembles generated through MD simulations; for providing experimental validation of novel MD techniques or force fields; and for exploring the impact of solution conditions on dynamics and structure.

Where this approach accurately predicts the solution scattering data, the correspondence provides strong experimental validation of the underlying model. Where the model fails to predict the solution scattering, the discrepancies provide a basis for rethinking the model, the experimental conditions under which the data were collected, or the presumption that experimental and computational conditions represent comparable conditions for evaluation of molecular behavior. Interpreting these discrepancies will be challenging and may require different strategies for different molecular systems.

When the impact of structural fluctuations is underestimated by the MD simulations, the source of the discrepancy may be an insufficiently long simulation that led to a biased or incomplete ensemble. This can be addressed by longer simulations, by modifications to the potential to overcome energy barriers for conformational state transitions as in aMD, or potentially, by employing a coarse‐grained approach. Where the MD simulation appears to generate greater flexibility than is apparent from the experimental data, the source of the differences may be less obvious. One possibility is the use of an experimental (solution state) condition that limits molecular motions in unexpected ways, or an aMD simulation that is not reweighted adequately to recover the original free energy landscape of the protein.32 Another might be the postulation of a representative structure (from crystallography) that is not representative of the conformational ensemble in the solution conditions used. A third possibility is that the model of a single dominant structure perturbed by structural fluctuations is inadequate to represent the solution ensemble. If a protein exhibits multiple conformations in solution, the approach outlined here may not be adequate to model the scattering and may require a more complex approach incorporating multiple conformations.15

For molecular systems that fluctuate around a single representative structure, the approach presented here provides a powerful tool to validate dynamic models. Comparison with observations may provide a means to evaluate how long a simulation will be required to provide an unbiased sample of the structural ensemble.

The power of our new method to evaluate how well different simulation approaches represent protein dynamics as observed in solution was exemplified with four proteins: H‐Ras‐GDP, H‐Ras‐GNP/GTP, H‐RasY71A‐GNP/GTP, and HIVp, listed here in order of increasing intrinsic dynamic fluctuations as observed experimentally. These examples showcase the strengths and limitations of our method across a wide range of protein flexibility. Comparing H‐Ras in the GTP‐ versus GDP‐bound states shows that the simulation method needed to represent the proper protein dynamics, given a 90 nsec simulation time, varies according to the properties of the protein. While aMD most adequately represents the conformational range in H‐Ras‐GTP, cMD is best for the more rigid H‐Ras‐GDP. From this, we can conclude that H‐Ras‐GTP overcomes higher energy barriers in sampling its conformational states than does H‐Ras‐GDP, which most likely hovers around a single global energy minimum. The comparison with solution scattering data can serve as a guide and/or validation of MD simulation methods in a way that has not been possible to date. In an appropriate context, it can be used to evaluate and select among the many existing simulation approaches or to validate new simulation methods being developed. In H‐RasY71A‐GNP, the point mutation increases the flexibility of the G‐domain such that aMD simulations are no longer adequate to represent its dynamic behavior in solution, although it is possible that other simulation strategies or a different starting model could yield better results. The HIVp example, where the simulations are far from representing the dynamic range in solution, calls for great caution in interpreting simulation results in terms of functional roles. Multiple simulations from different starting structures may be necessary, but it is not clear that an appropriate range of models is currently available.

In general, our results demonstrate that it is important to know something a priori about the extent of dynamics associated with a given protein structure to select the most appropriate method to simulate its behavior. To date, experimental validation of MD simulations has been challenging. The method presented here provides a simple and relatively quick way to directly compare the range of flexibility obtained computationally with that observed by WAXS in solution experiments. Application of this approach could significantly increase confidence in conclusions about protein function that are accessible solely by computational methods.

Conclusion

In this paper, we present a new method that combines the crystal structure of a protein with a corresponding MD simulation to predict a WAXS pattern. We have applied this method to two proteins: Ras, which exhibits relatively small conformational changes in solution, and HIV protease, which exhibits larger scale fluctuations in its dynamics. We find that for Ras there is good agreement between calculated and observed WAXS patterns, indicating that the MD trajectories provided an unbiased sample of conformational states comparable to that observed in solution. For a protein like Ras our method is robust and allows the selection and validation of appropriate simulation methods. Although the approach may be limited for more flexible systems such as the HIV protease, it provides a measure of the mismatch between computed and observed flexibility. The method is able to distinguish between these two cases and can serve as a general measure of how well MD simulations represent protein ensembles in solution. The quality of fit between calculated and observed intensity provides a measure of the self‐consistency of experimental and computational estimates of flexibility. MD simulations are widely used to study protein conformational properties and dynamics, with very few experimental resources available to check the quality of conformational space coverage. The results presented here indicate that WAXS data can be used to validate the extent to which computational trajectories represent an unbiased sample of the experimental ensemble.

Materials and Methods

Calculation of solution scattering from a rigid protein (XS)

To calculate the scattering intensity from a rigid protein we used the software package XS.12 XS starts with the atomic coordinates of a protein, for instance generated by X‐ray crystallography, and places water molecules around the protein, filling all space within at least 7 Å of any protein atom. It then carries out a short MD simulation of the water positions (holding the protein rigid), selects 50 snapshots of the water positions, and calculates the scattering pattern corresponding to each snapshot. The MD calculations are performed using VMD33 and NAMD34 at 277 K using the TIP3P water model35 and the CHARMM22 force field.36 A similar process is carried out for a “droplet” of water (identical in shape to the protein‐water drop), and the predicted scattering from the water droplet is subtracted from that of the protein.12 In most cases, the predicted scattering intensity of a rigid protein is within experimental error of that observed. It is then possible to calculate the pair distribution function, p(r), from the intensity distribution using an indirect Fourier transform:37

$$p_{r}(r) = 4\pi r^{2} \int_{0}^{\infty} I(h)\, \frac{\sin(2\pi h r)}{2\pi h r}\, h^{2}\, dh, \tag{4}$$

where $h = \frac{1}{d} = \frac{2\sin\theta}{\lambda}$, 2θ is the scattering angle, and λ is the wavelength of the X‐ray beam.
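As an illustration of Equation (4) only, the sketch below evaluates the integral by direct trapezoidal quadrature over a sampled, finite h range. This is an assumption made for clarity: it is not the regularized indirect transform cited above, and the function name `pair_distribution` and its arguments are hypothetical rather than part of XS.

```python
import numpy as np

def pair_distribution(h, intensity, r):
    """Evaluate Equation (4) by trapezoidal quadrature (illustrative sketch).

    h         : increasing 1-D array of h = 2 sin(theta)/lambda values (1/Angstrom)
    intensity : I(h) sampled at those h values
    r         : 1-D array of distances (Angstrom) at which p(r) is wanted
    """
    h = np.asarray(h, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    r = np.asarray(r, dtype=float)

    # np.sinc(x) = sin(pi x)/(pi x), so np.sinc(2 h r) = sin(2 pi h r)/(2 pi h r)
    kernel = np.sinc(2.0 * np.outer(r, h))      # shape (len(r), len(h))
    integrand = kernel * intensity * h**2

    # Trapezoidal rule along the h axis; the upper limit is the last sampled h,
    # not infinity, so truncation ripples are expected for real data.
    dh = np.diff(h)
    integral = 0.5 * np.sum((integrand[:, 1:] + integrand[:, :-1]) * dh, axis=1)
    return 4.0 * np.pi * r**2 * integral
```

Because measured intensities are noisy and end at a finite h, a direct quadrature of this kind amplifies noise at large h; that is the usual motivation for the regularized indirect transform used in practice.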

Calculation of sigma‐r distribution from an MD trajectory (SIGMA‐R)

An MD trajectory contains the coordinates of each atom at many time steps (for these calculations the trajectory is sampled at ~10 psec intervals). Calculation of the sigma‐r distribution requires determination of the standard deviation of each interatomic vector length over the course of the simulation. Interatomic distances are sorted from small to large, and their corresponding standard deviations are averaged within intervals of Δr to determine the average σ(r) value for each interatomic distance. For comparison with X‐ray results, atoms must be weighted according to atomic number because X‐rays scatter from electrons and the scattering amplitude of an atom is proportional to its number of electrons.

This process condenses a huge four‐dimensional (x, y, z, t) MD trajectory dataset into a one‐dimensional plot containing the standard deviation of interatomic distances sorted and averaged as a function of interatomic vector length. To minimize computation time we use a GPU + CPU combination: the CPU handles the sequential parts of the program, and the GPU is used for parallel calculation of the distance between each atom pair at each snapshot. A protein with n atoms has $\frac{n(n-1)}{2}$ interatomic distances. Adequate sampling of an MD simulation for sigma‐r plots should be in excess of 10,000 snapshots, which means that interatomic distances must be computed more than $\frac{10{,}000\, n(n-1)}{2}$ times. Because a protein typically contains more than 10³ atoms, this corresponds to at least ~5 × 10⁹ distance calculations (10,000 × 10³ × 10³/2), and considerably more for larger proteins. Because the interatomic distance of each atom pair at each snapshot is independent, we can assign the distance calculation of each snapshot to a GPU. For the results reported here, we utilized a locally developed Python code (“SIGMA‐R”, available on request) run on NVIDIA Tesla C2075 GPUs.
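The sigma‐r calculation described above can be sketched on the CPU with NumPy. This is not the SIGMA‐R code itself. Binning each pair by its mean distance over the trajectory and weighting each pair by the product of the two atomic numbers are assumptions chosen for illustration, and the function name `sigma_r` and its arguments are hypothetical.

```python
import numpy as np

def sigma_r(trajectory, weights=None, dr=1.0):
    """Average standard deviation of interatomic distances per distance bin.

    trajectory : array of shape (n_frames, n_atoms, 3), coordinates in Angstrom
    weights    : optional per-atom weights (e.g. atomic numbers); a pair is
                 weighted by the product of its two atom weights (assumption)
    dr         : bin width (the interval called delta-r in the text)
    """
    traj = np.asarray(trajectory, dtype=float)
    n_atoms = traj.shape[1]
    i, j = np.triu_indices(n_atoms, k=1)        # the n(n-1)/2 unique pairs

    # Distance of every pair in every frame: shape (n_frames, n_pairs).
    # This holds all distances in memory, so it only suits small systems;
    # a real protein would need chunking over pairs or a GPU kernel.
    d = np.linalg.norm(traj[:, i, :] - traj[:, j, :], axis=2)
    mean_d = d.mean(axis=0)
    std_d = d.std(axis=0)

    if weights is None:
        w = np.ones(len(i))
    else:
        weights = np.asarray(weights, dtype=float)
        w = weights[i] * weights[j]

    # Assign each pair to a bin by its mean distance, then take the
    # weighted average of the standard deviations within each bin.
    bins = np.floor(mean_d / dr).astype(int)
    n_bins = bins.max() + 1
    w_sum = np.bincount(bins, weights=w, minlength=n_bins)
    ws_sum = np.bincount(bins, weights=w * std_d, minlength=n_bins)

    sigma = np.full(n_bins, np.nan)             # NaN marks empty bins
    filled = w_sum > 0
    sigma[filled] = ws_sum[filled] / w_sum[filled]
    r_centers = (np.arange(n_bins) + 0.5) * dr
    return r_centers, sigma
```

The per‐pair distance evaluation is the step that is independent across pairs and snapshots, which is why it is the natural part to offload to a GPU.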

Vector‐length convolution (FLEX)

Combining the pair distribution function calculated for a rigid protein with the sigma‐r distribution calculated from an MD trajectory makes possible the calculation of the p(r) distribution for a flexible protein as in Equation (1) and the corresponding scattering intensity using

$$I_{c}(h) = 4\pi \int_{0}^{D_{\max}} p(r)\, \frac{\sin(2\pi h r)}{2\pi h r}\, dr. \tag{5}$$

As reported above, the results of this calculation often reproduce the observed scattering to within experimental error. Where calculated and observed intensities disagree, the direction of the discrepancy indicates whether the fluctuations in the MD trajectory are overestimated or underestimated relative to those present in solution. These calculations are carried out by a locally developed Python program, FLEX.
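The two steps of this section can be sketched as follows. Equation (1) is not reproduced in this section, so the smearing step assumes one plausible form of the vector‐length convolution, in which each element of the rigid p(r) is spread over r by a Gaussian whose width is the local σ(r). That form, and the names `smear_pair_distribution` and `intensity_from_pr`, are illustrative assumptions and not the FLEX implementation. The second function is a direct trapezoidal evaluation of Equation (5).

```python
import numpy as np

def smear_pair_distribution(r, p_rigid, sigma):
    """Spread each p_rigid(r') over r with a Gaussian of width sigma(r').

    Assumed form of the vector-length convolution (see lead-in).
    r, p_rigid and sigma are 1-D arrays on the same grid.
    """
    r = np.asarray(r, dtype=float)
    p_rigid = np.asarray(p_rigid, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    p_flex = np.zeros_like(p_rigid)
    for r0, p0, s0 in zip(r, p_rigid, sigma):
        if s0 <= 0.0:
            # No fluctuation at this distance: leave the contribution in place
            p_flex[np.argmin(np.abs(r - r0))] += p0
            continue
        g = np.exp(-0.5 * ((r - r0) / s0) ** 2)
        g /= g.sum()        # normalise on the grid so the summed p(r) is conserved
        p_flex += p0 * g
    return p_flex

def intensity_from_pr(r, p, h):
    """Equation (5) by trapezoidal quadrature over r from r[0] to r[-1] = Dmax."""
    r = np.asarray(r, dtype=float)
    p = np.asarray(p, dtype=float)
    h = np.asarray(h, dtype=float)
    # np.sinc(x) = sin(pi x)/(pi x), so np.sinc(2 h r) = sin(2 pi h r)/(2 pi h r)
    integrand = np.sinc(2.0 * np.outer(h, r)) * p   # shape (len(h), len(r))
    dr = np.diff(r)
    integral = 0.5 * np.sum((integrand[:, 1:] + integrand[:, :-1]) * dr, axis=1)
    return 4.0 * np.pi * integral
```

Normalising each Gaussian on the grid keeps the summed p(r) unchanged by the smearing, at the cost of folding any weight that would fall outside the grid back inside it.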

Protein expression

H‐Ras (residues 1–166) was expressed in Escherichia coli BL21 cells and purified as previously described.38 The purified protein bound to GDP was flash‐frozen and stored at −80 °C until needed for WAXS data collection. Ras bound to the nonhydrolyzable GTP analog GNP was obtained by exchanging GDP for GNP as previously described.28 The analog is necessary because GTP would hydrolyze to GDP during the experimental time frame. The mutant H‐RasY71A (residues 1–166) was generated using QuikChange (Stratagene), then expressed and purified, and its GDP exchanged for GNP, as for wild‐type H‐Ras. HIV protease was prepared as previously described.5

SAXS/WAXS

Samples of H‐Ras‐GDP, H‐Ras‐GNP, and H‐RasY71A‐GNP were prepared for solution scattering at protein concentrations of 5 and 10 mg/mL. X‐ray solution scattering data were collected at the G1 beamline at the Cornell High Energy Synchrotron Source (CHESS) using an X‐ray energy of 9.846 keV (λ = 1.259 Å) and a beam size of ~250 μm × 250 μm.39 By collecting data simultaneously on both SAXS and WAXS detectors and combining circularly averaged data from both, intensities in the range of 0.009 < q < 0.72 Å⁻¹ were obtained. For most calculations reported here, intensities at q > 0.54 Å⁻¹ were very weak and were not utilized. The SAXS and WAXS data were scaled and merged with the software RAW40 and smoothed by a robust local regression method with a span of 10% of the total data points. Data from HIV protease were collected at the BioCAT undulator beamline (18ID) at the Advanced Photon Source41 as reported previously.5
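As a small worked check on the units above, the sketch below converts the photon energy to a wavelength and relates q to the variable h used in Equations (4) and (5). It assumes the conventional definition q = 4π sin θ/λ, which this section does not state explicitly; under that assumption q = 2πh. The function names are hypothetical.

```python
import math

# Planck constant times speed of light, in keV * Angstrom
HC_KEV_ANGSTROM = 12.39841984

def wavelength_from_energy(energy_kev):
    """X-ray wavelength in Angstrom for a photon energy in keV."""
    return HC_KEV_ANGSTROM / energy_kev

def h_from_q(q):
    """Convert q = 4 pi sin(theta)/lambda to h = 2 sin(theta)/lambda (assumed convention)."""
    return q / (2.0 * math.pi)

def d_spacing_from_q(q):
    """Real-space length scale d = 1/h = 2 pi/q probed at a given q."""
    return 2.0 * math.pi / q

wavelength = wavelength_from_energy(9.846)   # energy quoted for the CHESS data
```

With these helpers, the upper q limits quoted above can be translated into the smallest real‐space distances that contribute to the measured pattern.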

Protein structure files

PDB files 3K8Y (ref. 28) and 4Q21 (ref. 42) were used to represent the crystal structures of H‐Ras bound to GNP and GDP, respectively. The impact of the Y71A mutation on the structure of H‐Ras‐GNP was modeled by homology to PDB file 3K8Y. PDB files 1F7A (ref. 43) and 2FGV (ref. 31) were used to represent wild‐type HIV protease and its T80N mutant, respectively.

MD simulations

Accelerated MD (aMD) simulation trajectories were generated with NAMD,34 including 90 nsec trajectories for H‐Ras‐GTP, H‐Ras‐GDP, and H‐RasY71A. For these simulations, the GNP molecule in each of the crystal structures was changed to GTP. The analog must be used in the experiments and is a good representative of GTP that has been used for decades in structural biology work on Ras. However, force field parameters are well established for GTP but not for GNP. The conformational space sampled by Ras‐GNP and Ras‐GTP is expected to be highly similar, such that it is reasonable to compare the simulations on Ras‐GTP with solution scattering of Ras‐GNP. Because the sampling rate varied along the trajectories, 12,000 frames from the 90 nsec data were used to calculate the sigma‐r distribution. We also utilized a conventional 60 nsec MD simulation trajectory of H‐Ras‐GDP with 10,500 frames. The simulation protocol used for both conventional MD and aMD was the same as previously published,44 except that the aMD simulations contained the boost potential calculated as previously described.29
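For orientation, the boost potential in the standard aMD formulation (ref. 29) raises the energy only where the potential V lies below a threshold E, by ΔV = (E − V)² / (α + E − V), and leaves it unchanged elsewhere. The sketch below evaluates that expression. The function and parameter names are hypothetical, and the threshold and α values used in the simulations reported here are not given in this section.

```python
import numpy as np

def amd_boost(potential, threshold, alpha):
    """Boost energy dV added to the potential in accelerated MD.

    potential : potential energy V (scalar or array)
    threshold : energy E below which the boost is applied
    alpha     : positive tuning parameter; small alpha flattens the surface
                below E, large alpha leaves it close to the original
    All three must be in the same energy units.
    """
    v = np.asarray(potential, dtype=float)
    gap = np.maximum(threshold - v, 0.0)    # zero wherever V >= E, so dV = 0 there
    return gap**2 / (alpha + gap)
```

Because the boost distorts the sampled distribution, ensemble averages from an aMD trajectory need to be reweighted to recover the original energy landscape, which is the issue raised in the Discussion.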

Conventional MD simulations extending to 100 nsec were run on both WT and T80N HIVp structures with the AMBER8 software package45 as described previously.31, 46, 47 The data‐collecting portion of the simulation was performed at constant temperature and pressure for 10 or 100 nsec with 1 fsec time steps.

Acknowledgment

The authors thank Richard Gillilan (CHESS) for assistance in collection of WAXS data.

This work is based in part upon research conducted at the Cornell High Energy Synchrotron Source (CHESS), which is supported by the National Science Foundation and the National Institutes of Health/National Institute of General Medical Sciences under NSF award DMR‐0936384, using the Macromolecular Diffraction at CHESS (MacCHESS) facility, which is supported by award GM‐103485 from the National Institutes of Health, through its National Institute of General Medical Sciences. Additional data was collected using resources of the Advanced Photon Source, a U.S. Department of Energy (DOE) Office of Science User Facility operated for the DOE Office of Science by Argonne National Laboratory under Contract No. DE‐AC02‐06CH11357. Use of BioCAT was supported by grant 9 P41 GM103622 from the National Institute of General Medical Sciences of the National Institutes of Health. This research was supported by NSF grant MCB‐1517295.

References

1. Henzler‐Wildman KA, Thai V, Lei M, Ott M, Wolf‐Watz M, Fenn T, Pozharski E, Wilson MA, Petsko GA, Karplus M, Hübner CG, Kern D (2007) Intrinsic motions along an enzymatic reaction trajectory. Nature 450:838–844.
2. Kay LE, Ikura M, Tschudin R, Bax A (2011) Three‐dimensional triple‐resonance NMR spectroscopy of isotopically enriched proteins. J Magn Reson 213:423–441.
3. Makowski L, Rodi DJ, Mandava S, Minh DD, Gore DB, Fischetti RF (2008) Molecular crowding inhibits intramolecular breathing motions in proteins. J Mol Biol 375:529–546.
4. Makowski L, Gore D, Mandava S, Minh D, Park S, Rodi DJ, Fischetti RF (2011) X‐ray solution scattering studies of the structural diversity intrinsic to protein ensembles. Biopolymers 95:531–542.
5. Zhou H, Li S, Badger J, Nalivaika E, Cai Y, Foulkes‐Murzycki J, Schiffer C, Makowski L (2015) Modulation of HIV protease flexibility by the T80N mutation. Proteins 83:1929–1939.
6. Debye P (1915) Zerstreuung von Röntgenstrahlen. Ann Phys 462:309–823.
7. Fraser RD, MacRae TP, Suzuki E (1978) An improved method for calculating the contribution of solvent to the X‐ray diffraction pattern of biological macromolecules. J Appl Cryst 11:693–694.
8. Lattman EE (1989) Rapid calculation of the solution scattering profile from a macromolecule of known structure. Proteins 5:149–155.
9. Pickover CA, Engelman DM (1982) On the interpretation and prediction of X‐ray scattering profiles of biomolecules in solution. Biopolymers 21:817–831.
10. Svergun DI, Richard S, Koch MH, Sayers Z, Kuprin S, Zaccai G (1998) Protein hydration in solution: experimental observation by X‐ray and neutron scattering. Proc Natl Acad Sci USA 95:2267–2272.
11. Svergun D, Barberato C, Koch MHJ (1995) Crysol—a program to evaluate X‐ray solution scattering of biological macromolecules from atomic coordinates. J Appl Cryst 28:768–773.
12. Park S, Bardhan JP, Roux B, Makowski L (2009) Simulated X‐ray scattering of protein solutions using explicit‐solvent models. J Chem Phys 130:134114.
13. Bardhan J, Park S, Makowski L (2009) Softwaxs: a computational tool for modeling wide‐angle X‐ray solution scattering from biomolecules. J Appl Cryst 42:932–943.
14. Grishaev A, Guo L, Irving T, Bax A (2010) Improved fitting of solution X‐ray scattering data to macromolecular structures and structural ensembles by explicit water modeling. J Am Chem Soc 132:15484–15486.
15. Onuk E, Badger J, Wang YJ, Bardhan J, Chishti Y, Akcakaya M, Brooks DH, Erdogmus D, Minh DDL, Makowski L (2017) Effects of catalytic action and ligand binding on conformational ensembles of adenylate kinase. Biochemistry 56:4559–4567.
16. Yang S, Park S, Makowski L, Roux B (2009) A rapid coarse residue‐based computational method for X‐ray solution scattering characterization of protein folds and multiple conformational states of large protein complexes. Biophys J 96:4449–4463.
17. Moore PB (2014) The effects of thermal disorder on the solution‐scattering profiles of macromolecules. Biophys J 106:1489–1496.
18. Luzzati V, Tardieu A (1980) Recent developments in solution X‐ray scattering. Ann Rev Biophys Bioengin 9:1–29.
19. Putnam CD, Hammel M, Hura GL, Tainer JA (2007) X‐ray solution scattering (SAXS) combined with crystallography and computation: defining accurate macromolecular structures, conformations and assemblies in solution. Q Rev Biophys 40:191–285.
20. Zhou H, Li S, Makowski L (2016) Visualizing global properties of a molecular dynamics trajectory. Proteins 84:82–91.
21. Gorfe AA, Grant BJ, McCammon JA (2008) Mapping the nucleotide and isoform‐dependent structural and dynamical features of Ras proteins. Structure 16:885–896.
22. Harrison RA, Lu J, Carrasco M, Hunter J, Manandhar A, Gondi S, Westover KD, Engen JR (2016) Structural dynamics in Ras and related proteins upon nucleotide switching. J Mol Biol 428:4723–4735.
23. Prior IA, Lewis PD, Mattos C (2012) A comprehensive survey of Ras mutations in cancer. Cancer Res 72:2457–2467.
24. Bourne HR, Sanders DA, McCormick F (1991) The GTPASE superfamily: conserved structure and molecular mechanism. Nature 349:117–127.
25. Johnson CW, Reid D, Parker JA, Salter S, Knihtila R, Kuzmic P, Mattos C (2017) The small gtpases k‐Ras, n‐Ras, and h‐Ras have distinct biochemical properties determined by allosteric effects. J Biol Chem 292:12981–12993.
26. Marcus K, Mattos C (2015) Direct attack on Ras: intramolecular communication and mutation‐specific effects. Clin Cancer Res 21:1810–1818.
27. Spoerner M, Hozsa C, Poetzl JA, Reiss K, Ganser P, Geyer M, Kalbitzer HR (2010) Conformational states of human rat sarcoma (Ras) protein complexed with its natural ligand GTP and their role for effector interaction and GTP hydrolysis. J Biol Chem 285:39768–39778.
28. Buhrman G, Holzapfel G, Fetics S, Mattos C (2010) Allosteric modulation of Ras positions q61 for a direct role in catalysis. Proc Natl Acad Sci USA 107:4931–4936.
29. Hamelberg D, Mongan J, McCammon JA (2004) Accelerated molecular dynamics: a promising and efficient simulation method for biomolecules. J Chem Phys 120:11919–11929.
30. Johnson CW, Mattos C (2013) The allosteric switch and conformational states in Ras GTPASE affected by small molecules. Enzymes 33:41–67.
31. Foulkes JE, Prabu‐Jeyabalan M, Cooper D, Henderson GJ, Harris J, Swanstrom R, Schiffer CA (2006) Role of invariant THR80 in human immunodeficiency virus type 1 protease structure, function, and viral infectivity. J Virol 80:6906–6916.
32. Miao Y, Sinko W, Pierce L, Bucher D, Walker RC, McCammon JA (2014) Improved reweighting of accelerated molecular dynamics simulations for free energy calculation. J Chem Theory Comput 10:2677–2689.
33. Humphrey W, Dalke A, Schulten K (1996) VMD: visual molecular dynamics. J Mol Graph 14:27–38.
34. Phillips JC, Braun R, Wang W, Gumbart J, Tajkhorshid E, Villa E, Chipot C, Skeel RD, Kale L, Schulten K (2005) Scalable molecular dynamics with NAMD. J Comput Chem 26:1781–1802.
35. Miyamoto S, Kollman PA (1992) Settle: an analytical version of the shake and rattle algorithm for rigid water models. J Comput Chem 13:952–962.
36. MacKerell AD, Bashford D, Bellott M, Dunbrack RL, Evanseck JD, Field MJ, Fischer S, Gao J, Guo H, Ha S, Joseph‐McCarthy D, Kuchnir L, Kuczera K, Lau FTK, Mattos C, Michnick S, Ngo T, Nguyen DT, Prodhom B, Reiher WE III, Roux B, Schlenkrich M, Smith JC, Stote R, Straub J, Watanabe M, Wiórkiewicz‐Kuczera J, Yin D, Karplus M (1998) All‐atom empirical potential for molecular modeling and dynamics studies of proteins. J Phys Chem B 102:3586–3616.
37. Svergun D (1992) Determination of the regularization parameter in indirect‐transform methods using perceptual criteria. J Appl Cryst 25:495–503.
38. Johnson CW, Buhrman G, Ting PY, Colicelli J, Mattos C (2016) Expression, purification, crystallization and X‐ray data collection for Ras and its mutants. Data Brief 6:423–427.
39. Acerbo AS, Cook MJ, Gillilan RE (2015) Upgrade of macchess facility for X‐ray scattering of biological macromolecules in solution. J Synchrotron Radiat 22:180–186.
40. Nielsen SS, Toft KN, Snakenborg D, Jeppesen MD, Jacobson JK, Vestergaard B, Kutter JP, Arleth L (2009) Bioxtas raw, a software program for high‐throughput automated small‐angle X‐ray scattering data reduction and preliminary analysis. J Appl Cryst 42:959–964.
41. Fischetti R, Stepanov S, Rosenbaum G, Barrea R, Black E, Gore D, Heurich R, Kondrashkina E, Kropf AJ, Wang S, Zhang K, Irving TC, Bunker GB (2004) The biocat undulator beamline 18id: a facility for biological non‐crystalline diffraction and X‐ray absorption spectroscopy at the advanced photon source. J Synchrotron Rad 11:399–405.
42. Milburn MV, Tong L, deVos AM, Brunger A, Yamaizumi Z, Nishimura S, Kim SH (1990) Molecular switch for signal transduction: structural differences between active and inactive forms of protooncogenic Ras proteins. Science 247:939–945.
43. Prabu‐Jeyabalan M, Nalivaika E, Schiffer CA (2000) How does a symmetric dimer recognize an asymmetric substrate? A substrate complex of HIV‐1 protease. J Mol Biol 301:1207–1220.
44. Fetics SK, Guterres H, Kearney BM, Buhrman G, Ma B, Nussinov R, Mattos C (2015) Allosteric effects of the oncogenic RASQ61l mutant on RAF‐RBD. Structure 23:505–516.
45. Case DA, Cheatham TE 3rd, Darden T, Gohlke H, Luo R, Merz KM Jr, Onufriev A, Simmerling C, Wang B, Woods RJ (2005) The amber biomolecular simulation programs. J Comput Chem 26:1668–1688.
46. Cai Y, Schiffer CA (2010) Decomposing the energetic impact of drug resistant mutations in HIV‐1 protease on binding DRV. J Chem Theory Comput 6:1358–1368.
47. Mittal S, Cai Y, Nalam MN, Bolon DN, Schiffer CA (2012) Hydrophobic core flexibility modulates enzyme activity in HIV‐1 protease. J Am Chem Soc 134:4163–4168.

[pro3508-bib-0001] 1. Henzler‐Wildman KA, Thai V, Lei M, Ott M, Wolf‐Watz M, Fenn T, Pozharski E, Wilson MA, Petsko GA, Karplus M, Hübner CG, Kern D (2007) Intrinsic motions along an enzymatic reaction trajectory. Nature 450:838–844. [DOI] [PubMed] [Google Scholar]

[pro3508-bib-0002] 2. Kay LE, Ikura M, Tschudin R, Bax A (2011) Three‐dimensional triple‐resonance NMR spectroscopy of isotopically enriched proteins. J Magn Reson 213:423–441. [DOI] [PubMed] [Google Scholar]

[pro3508-bib-0003] 3. Makowski L, Rodi DJ, Mandava S, Minh DD, Gore DB, Fischetti RF (2008) Molecular crowding inhibits intramolecular breathing motions in proteins. J Mol Biol 375:529–546. [DOI] [PMC free article] [PubMed] [Google Scholar]

[pro3508-bib-0004] 4. Makowski L, Gore D, Mandava S, Minh D, Park S, Rodi DJ, Fischetti RF (2011) X‐ray solution scattering studies of the structural diversity intrinsic to protein ensembles. Biopolymers 95:531–542. [DOI] [PMC free article] [PubMed] [Google Scholar]

[pro3508-bib-0005] 5. Zhou H, Li S, Badger J, Nalivaika E, Cai Y, Foulkes‐Murzycki J, Schiffer C, Makowski L (2015) Modulation of HIV protease flexibility by the T80N mutation. Proteins 83:1929–1939. [DOI] [PMC free article] [PubMed] [Google Scholar]

[pro3508-bib-0006] 6. Debye P (1915) Zerstreuung von rotgenstrahlen. Ann Phys 462:309–823. [Google Scholar]

[pro3508-bib-0007] 7. Fraser RD, MacRae TP, Suzuki E (1978) An improved method for calculating the contribution of solvent to the X‐ray diffraction pattern of biological macromolecules. J Appl Cryst 11:693–694. [Google Scholar]

[pro3508-bib-0008] 8. Lattman EE (1989) Rapid calculation of the solution scattering profile from a macromolecule of known structure. Proteins 5:149–155. [DOI] [PubMed] [Google Scholar]

[pro3508-bib-0009] 9. Pickover CA, Engelman DM (1982) On the interpretation and prediction of X‐ray scattering profiles of biomolecules in solution. Biopolymers 21:817–831. [Google Scholar]

[pro3508-bib-0010] 10. Svergun DI, Richard S, Koch MH, Sayers Z, Kuprin S, Zaccai G (1998) Protein hydration in solution: experimental observation by X‐ray and neutron scattering. Proc Natl Acad Sci USA 95:2267–2272. [DOI] [PMC free article] [PubMed] [Google Scholar]

[pro3508-bib-0011] 11. Svergun D, Barberato C, Koch MHJ (1995) Crysol—a program to evaluate X‐ray solution scattering of biological macromolecules from atomic coordinates. J Appl Cryst 28:768–773. [Google Scholar]

[pro3508-bib-0012] 12. Park S, Bardhan JP, Roux B, Makowski L (2009) Simulated X‐ray scattering of protein solutions using explicit‐solvent models. J Chem Phys 130:134114. [DOI] [PMC free article] [PubMed] [Google Scholar]

[pro3508-bib-0013] 13. Bardhan J, Park S, Makowski L (2009) Softwaxs: a computational tool for modeling wide‐angle X‐ray solution scattering from biomolecules. J Appl Cryst 42:932–943. [DOI] [PMC free article] [PubMed] [Google Scholar]

[pro3508-bib-0014] 14. Grishaev A, Guo L, Irving T, Bax A (2010) Improved fitting of solution X‐ray scattering data to macromolecular structures and structural ensembles by explicit water modeling. J Am Chem Soc 132:15484–15486. [DOI] [PMC free article] [PubMed] [Google Scholar]

[pro3508-bib-0015] 15. Onuk E, Badger J, Wang YJ, Bardhan J, Chishti Y, Akcakaya M, Brooks DH, Erdogmus D, Minh DDL, Makowski L (2017) Effects of catalytic action and ligand binding on conformational ensembles of adenylate kinase. Biochemistry 56:4559–4567. [DOI] [PubMed] [Google Scholar]

[pro3508-bib-0016] 16. Yang S, Park S, Makowski L, Roux B (2009) A rapid coarse residue‐based computational method for X‐ray solution scattering characterization of protein folds and multiple conformational states of large protein complexes. Biophys J 96:4449–4463. [DOI] [PMC free article] [PubMed] [Google Scholar]

[pro3508-bib-0017] 17. Moore PB (2014) The effects of thermal disorder on the solution‐scattering profiles of macromolecules. Biophys J 106:1489–1496. [DOI] [PMC free article] [PubMed] [Google Scholar]

[pro3508-bib-0018] 18. Luzzati V, Tardieu A (1980) Recent developments in solution X‐ray scattering. Ann Rev Biophys Bioengin 9:1–29. [DOI] [PubMed] [Google Scholar]

[pro3508-bib-0019] 19. Putnam CD, Hammel M, Hura GL, Tainer JA (2007) X‐ray solution scattering (SAXS) combined with crystallography and computation: defining accurate macromolecular structures, conformations and assemblies in solution. Q Rev Biophys 40:191–285. [DOI] [PubMed] [Google Scholar]

[pro3508-bib-0020] 20. Zhou H, Li S, Makowski L (2016) Visualizing global properties of a molecular dynamics trajectory. Proteins 84:82–91. [DOI] [PubMed] [Google Scholar]

[pro3508-bib-0021] 21. Gorfe AA, Grant BJ, McCammon JA (2008) Mapping the nucleotide and isoform‐dependent structural and dynamical features of Ras proteins. Structure 16:885–896. [DOI] [PMC free article] [PubMed] [Google Scholar]

[pro3508-bib-0022] 22. Harrison RA, Lu J, Carrasco M, Hunter J, Manandhar A, Gondi S, Westover KD, Engen JR (2016) Structural dynamics in Ras and related proteins upon nucleotide switching. J Mol Biol 428:4723–4735. [DOI] [PMC free article] [PubMed] [Google Scholar]

[pro3508-bib-0023] 23. Prior IA, Lewis PD, Mattos C (2012) A comprehensive survey of Ras mutations in cancer. Cancer Res 72:2457–2467. [DOI] [PMC free article] [PubMed] [Google Scholar]

[pro3508-bib-0024] 24. Bourne HR, Sanders DA, McCormick F (1991) The GTPASE superfamily: conserved structure and molecular mechanism. Nature 349:117–127. [DOI] [PubMed] [Google Scholar]

[pro3508-bib-0025] 25. Johnson CW, Reid D, Parker JA, Salter S, Knihtila R, Kuzmic P, Mattos C (2017) The small gtpases k‐Ras, n‐Ras, and h‐Ras have distinct biochemical properties determined by allosteric effects. J Biol Chem 292:12981–12993. [DOI] [PMC free article] [PubMed] [Google Scholar]

[pro3508-bib-0026] 26. Marcus K, Mattos C (2015) Direct attack on Ras: intramolecular communication and mutation‐specific effects. Clin Cancer Res 21:1810–1818. [DOI] [PubMed] [Google Scholar]

[pro3508-bib-0027] 27. Spoerner M, Hozsa C, Poetzl JA, Reiss K, Ganser P, Geyer M, Kalbitzer HR (2010) Conformational states of human rat sarcoma (Ras) protein complexed with its natural ligand GTP and their role for effector interaction and GTP hydrolysis. J Biol Chem 285:39768–39778. [DOI] [PMC free article] [PubMed] [Google Scholar]

[pro3508-bib-0028] 28. Buhrman G, Holzapfel G, Fetics S, Mattos C (2010) Allosteric modulation of Ras positions q61 for a direct role in catalysis. Proc Natl Acad Sci USA 107:4931–4936. [DOI] [PMC free article] [PubMed] [Google Scholar]

[pro3508-bib-0029] 29. Hamelberg D, Mongan J, McCammon JA (2004) Accelerated molecular dynamics: a promising and efficient simulation method for biomolecules. J Chem Phys 120:11919–11929. [DOI] [PubMed] [Google Scholar]

[pro3508-bib-0030] 30. Johnson CW, Mattos C (2013) The allosteric switch and conformational states in Ras GTPASE affected by small molecules. Enzymes 33:41–67. [DOI] [PubMed] [Google Scholar]

[pro3508-bib-0031] 31. Foulkes JE, Prabu‐Jeyabalan M, Cooper D, Henderson GJ, Harris J, Swanstrom R, Schiffer CA (2006) Role of invariant THR80 in human immunodeficiency virus type 1 protease structure, function, and viral infectivity. J Virol 80:6906–6916. [DOI] [PMC free article] [PubMed] [Google Scholar]

[pro3508-bib-0032] 32. Miao Y, Sinko W, Pierce L, Bucher D, Walker RC, McCammon JA (2014) Improved reweighting of accelerated molecular dynamics simulations for free energy calculation. J Chem Theory Comput 10:2677–2689. [DOI] [PMC free article] [PubMed] [Google Scholar]

[pro3508-bib-0033] 33. Humphrey W, Dalke A, Schulten K (1996) VMD: visual molecular dynamics. J Mol Graph 14:27–38. [DOI] [PubMed] [Google Scholar]

[pro3508-bib-0034] 34. Phillips JC, Braun R, Wang W, Gumbart J, Tajkhorshid E, Villa E, Chipot C, Skeel RD, Kale L, Schulten K (2005) Scalable molecular dynamics with NAMD. J Comput Chem 26:1781–1802. [DOI] [PMC free article] [PubMed] [Google Scholar]

[pro3508-bib-0035] 35. Miyamoto S, Kollman PA (1992) Settle: an analytical version of the shake and rattle algorithm for rigid water models. J Comput Chem 13:952–962. [Google Scholar]

[pro3508-bib-0036] 36. MacKerell AD, Bashford D, Bellott M, Dunbrack RL, Evanseck JD, Field MJ, Fischer S, Gao J, Guo H, Ha S, Joseph‐McCarthy D, Kuchnir L, Kuczera K, Lau FTK, Mattos C, Michnick S, Ngo T, Nguyen DT, Prodhom B, Reiher WE III, Roux B, Schlenkrich M, Smith JC, Stote R, Straub J, Watanabe M, Wiórkiewicz‐Kuczera J, Yin D, Karplus M (1998) All‐atom empirical potential for molecular modeling and dynamics studies of proteins. J Phys Chem B 102:3586–3616. [DOI] [PubMed] [Google Scholar]

37. Svergun D (1992) Determination of the regularization parameter in indirect‐transform methods using perceptual criteria. J Appl Cryst 25:495–503.

38. Johnson CW, Buhrman G, Ting PY, Colicelli J, Mattos C (2016) Expression, purification, crystallization and X‐ray data collection for Ras and its mutants. Data Brief 6:423–427.

39. Acerbo AS, Cook MJ, Gillilan RE (2015) Upgrade of MacCHESS facility for X‐ray scattering of biological macromolecules in solution. J Synchrotron Radiat 22:180–186.

40. Nielsen SS, Toft KN, Snakenborg D, Jeppesen MD, Jacobson JK, Vestergaard B, Kutter JP, Arleth L (2009) BioXTAS RAW, a software program for high‐throughput automated small‐angle X‐ray scattering data reduction and preliminary analysis. J Appl Cryst 42:959–964.

41. Fischetti R, Stepanov S, Rosenbaum G, Barrea R, Black E, Gore D, Heurich R, Kondrashkina E, Kropf AJ, Wang S, Zhang K, Irving TC, Bunker GB (2004) The BioCAT undulator beamline 18ID: a facility for biological non‐crystalline diffraction and X‐ray absorption spectroscopy at the Advanced Photon Source. J Synchrotron Radiat 11:399–405.

42. Milburn MV, Tong L, deVos AM, Brunger A, Yamaizumi Z, Nishimura S, Kim SH (1990) Molecular switch for signal transduction: structural differences between active and inactive forms of protooncogenic Ras proteins. Science 247:939–945.

43. Prabu‐Jeyabalan M, Nalivaika E, Schiffer CA (2000) How does a symmetric dimer recognize an asymmetric substrate? A substrate complex of HIV‐1 protease. J Mol Biol 301:1207–1220.

44. Fetics SK, Guterres H, Kearney BM, Buhrman G, Ma B, Nussinov R, Mattos C (2015) Allosteric effects of the oncogenic RasQ61L mutant on Raf‐RBD. Structure 23:505–516.

45. Case DA, Cheatham TE 3rd, Darden T, Gohlke H, Luo R, Merz KM Jr, Onufriev A, Simmerling C, Wang B, Woods RJ (2005) The Amber biomolecular simulation programs. J Comput Chem 26:1668–1688.

46. Cai Y, Schiffer CA (2010) Decomposing the energetic impact of drug resistant mutations in HIV‐1 protease on binding DRV. J Chem Theory Comput 6:1358–1368.

47. Mittal S, Cai Y, Nalam MN, Bolon DN, Schiffer CA (2012) Hydrophobic core flexibility modulates enzyme activity in HIV‐1 protease. J Am Chem Soc 134:4163–4168.
