Abstract
Many proteins display complex dynamical properties that are often intimately linked to their biological functions. As the native state of a protein is best described as an ensemble of conformations, it is important to be able to generate models of native state ensembles with high accuracy. Due to limitations in sampling efficiency and force field accuracy it is, however, challenging to obtain accurate ensembles of protein conformations by the use of molecular simulations alone. Here we show that dynamic ensemble refinement, which combines an accurate atomistic force field with commonly available nuclear magnetic resonance (NMR) chemical shifts and NOEs, can provide a detailed and accurate description of the conformational ensemble of the native state of a highly dynamic protein. As both NOEs and chemical shifts are averaged on timescales up to milliseconds, the resulting ensembles reflect the structural heterogeneity that goes beyond that probed, e.g., by NMR relaxation order parameters. We selected the small protein domain NCBD as object of our study since this protein, which has been characterized experimentally in substantial detail, displays a rich and complex dynamical behaviour. In particular, the protein has been described as having a molten-globule like structure, but with a relatively rigid core. Our approach allowed us to describe the conformational dynamics of NCBD in solution, and to probe the structural heterogeneity resulting from both short- and long-timescale dynamics by the calculation of order parameters on different time scales. These results illustrate the usefulness of our approach since they show that NCBD is rather rigid on the nanosecond timescale, but interconverts within a broader ensemble on longer timescales, thus enabling the derivation of a coherent set of conclusions from various NMR experiments on this protein, which could otherwise appear in contradiction with each other.
Keywords: Protein structure, Protein dynamics, NMR, Molecular dynamics, Force field, Ensemble refinement, Maximum entropy
Introduction
Molecular dynamics (MD) simulations have the potential ability to provide an accurate, atomic-level description of the conformational ensembles of proteins and their macromolecular complexes (Lindorff-Larsen et al., 2005; Dror et al., 2012; Perilla et al., 2015). Nevertheless, simulations are limited by both the accuracy of the physical models (force fields) and the precision due to conformational sampling (Mobley, 2012; Esteban-Martín, Bryn Fenwick & Salvatella, 2012). To overcome these problems, it is possible to bias the simulations using experimental data as structural restraints, taking into account the inherent averaging in the experiments (Lindorff-Larsen et al., 2005; Camilloni et al., 2012; Lehtivarjo et al., 2012; Pitera & Chodera, 2012; Camilloni & Vendruscolo, 2014; Ravera et al., 2016). In this way, the experimental data can be included as a system-specific force-field correction that combines the two sources of information using Bayesian statistics or the maximum entropy principle (Pitera & Chodera, 2012; Roux & Weare, 2013; Cavalli, Camilloni & Vendruscolo, 2013; Boomsma, Ferkinghoff-Borg & Lindorff-Larsen, 2014; White & Voth, 2014; Olsson et al., 2014; MacCallum, Perez & Dill, 2015; Hummer & Köfinger, 2015; Bonomi et al., 2016; Bonomi et al., 2017; Bottaro et al., 2018). Among the many techniques that can be used to probe structure and dynamics of proteins, NMR spectroscopy stands out as being able to provide a number of different parameters that are sensitive to protein dynamics over different timescales, as well as to probe the “average structure” in solution.
Previously, replica-averaged simulations have provided a wealth of information about the dynamical ensembles that proteins can attain in solution (Lindorff-Larsen et al., 2005; Tang, Schwieters & Clore, 2007; Fenwick et al., 2011; Camilloni et al., 2012; Ángyán & Gáspári, 2013; Camilloni, Cavalli & Vendruscolo, 2013a; Camilloni, Cavalli & Vendruscolo, 2013b; Islam et al., 2013; Vögeli et al., 2014; Camilloni & Vendruscolo, 2014). Exploiting improvements in the accuracy and speed of predicting protein NMR chemical shifts from protein structure (Kohlhoff et al., 2009; Han et al., 2011; Li & Brüschweiler, 2012), it is now possible to combine experimental chemical shifts with molecular simulations to study protein structure and dynamics (Wishart & Case, 2001; Cavalli et al., 2007; Shen et al., 2008; Wishart et al., 2008; Robustelli et al., 2009; Robustelli et al., 2010; Boomsma et al., 2014). In particular, chemical shifts can be used as replica-averaged structural restraints to determine the conformational fluctuations in proteins (Camilloni et al., 2012; Camilloni, Cavalli & Vendruscolo, 2013a; Camilloni, Cavalli & Vendruscolo, 2013b; Kannan et al., 2014; Kukic et al., 2014; Krieger et al., 2014). By using experimental data as a “system specific force field correction” (Boomsma, Ferkinghoff-Borg & Lindorff-Larsen, 2014) such experimentally-restrained simulations remove some of the uncertainty associated with imperfect force fields and sampling (Tiberti et al., 2015; Löhr, Jussupow & Camilloni, 2017).
Previously, we developed a dynamic-ensemble refinement (DER) approach for determining simultaneously the structure and dynamics of proteins by combining distance restraints from nuclear Overhauser effect (NOE) experiments, dynamical information from relaxation order parameters and MD simulations (Lindorff-Larsen et al., 2005). Similarly, it has been demonstrated that accurate ensembles of conformations that represent longer timescale dynamics can be obtained from residual dipolar couplings (Lange et al., 2008; De Simone et al., 2009; De Simone et al., 2015). These applications have, however, relied on a type of data (relaxation order parameters or residual dipolar couplings) that may not be readily available.
We therefore sought to extend this approach to study conformational variability using more commonly available data, thus making the DER method more generally applicable. We thus focus on using NMR chemical shifts and NOEs as these are both commonly available and are averaged over long, millisecond timescales. We demonstrate the potential by describing the structural heterogeneity of a highly dynamic protein. Our method relies on supplementing the sparse experimental data with the experimentally-validated CHARMM22* force field (Piana, Lindorff-Larsen & Shaw, 2011), which provides a relatively accurate description of the subtle balance among the stability of the different secondary structure classes, and which has been shown to provide a good description of many structural and dynamical aspects related to protein structure (Shaw et al., 2010; Lindorff-Larsen et al., 2012a; Lindorff-Larsen et al., 2012b; Piana, Lindorff-Larsen & Shaw, 2012; Papaleo et al., 2014; Rauscher et al., 2015). Our hypothesis was that using a more accurate force field would make it possible to determine an accurate ensemble from less information-rich experimental data. In particular, though chemical shifts in principle contain very detailed information, this information is difficult to extract using current methods.
As object of our study we selected NCBD (the Nuclear Coactivator Binding Domain) of CBP (CREB Binding Protein), a 59-residue protein domain that has been experimentally characterized in substantial detail. Experiments on NCBD have revealed a rich and complex dynamical behaviour of the protein in solution (Demarest et al., 2004; Ebert et al., 2008; Kjaergaard, Teilum & Poulsen, 2010; Kjaergaard, Poulsen & Teilum, 2012; Kjaergaard et al., 2013). For a protein of its size, NCBD displays surprisingly broad NMR peaks, suggestive of conformational heterogeneity with relatively slow interconversion between different states. Nevertheless, it was possible to assign both backbone and side chain chemical shifts and determine a number of conformationally-averaged inter-nuclear distances, including a few long-range contacts, via NOE experiments (Ebert et al., 2008; Kjaergaard, Teilum & Poulsen, 2010; Kjaergaard, Poulsen & Teilum, 2012). NMR relaxation experiments suggest that the protein, at least on the nanosecond timescale, is relatively rigid (Kjaergaard, Poulsen & Teilum, 2012). NCBD forms complexes with several other proteins, where it intriguingly folds into remarkably different tertiary structures (Demarest et al., 2002; Qin et al., 2005). For example, the structure of NCBD in complex with ACTR (Demarest et al., 2002) and certain other partners (Waters et al., 2006; Lee et al., 2010) resembles the average structure populated by NCBD in the absence of binding partners (Fig. 1), whereas the structure of NCBD is markedly different when bound to the protein IRF-3 (Qin et al., 2005). Thus, the dynamical properties of NCBD, and its ability to adopt different conformations, appear crucial for its diverse biological functions.
Our results show that a dynamic ensemble refinement that combines NOEs, chemical shifts and the CHARMM22* force field provides a rather accurate description of the structural dynamics of the ground state structure of NCBD. We show via cross-validation with independent NMR data that all three components (the two sources of experimental information and the force field) contribute to the overall accuracy. The ensemble that we obtained reveals a relatively broad distribution of conformations, reflecting the conformational heterogeneity of NCBD on the millisecond timescale. Further, we quantified the level of structural fluctuations that would be measured by relaxation experiments and demonstrate that, on the nanosecond timescale, NCBD is more rigid, thus helping to reconcile earlier conflicting views of this protein.
Materials and Methods
Ensemble generation
MD simulations were performed using Gromacs 4.5 (Pronk et al., 2013) coupled to a modified version of Plumed 1.3 (Bonomi et al., 2009), using either the CHARMM22* (Piana, Lindorff-Larsen & Shaw, 2011) or CHARMM22 (MacKerell et al., 1998) force fields. As the starting structure for most simulations we used the first conformer from a previously determined NMR structure of free NCBD as deposited in the PDB entry 2KKJ (Kjaergaard, Teilum & Poulsen, 2010). To evaluate the effect of our choice of the initial structure, we also performed one simulation starting from an alternative NCBD conformation (PDB entry: 1ZOQ, chain C) (Qin et al., 2005). Missing residues in 1ZOQ (compared to 2KKJ) were rebuilt with Modeller 9.11 (Fiser & Šali, 2003).
The protein was embedded in a dodecahedral box containing 8372 TIP3P water molecules (Jorgensen et al., 1983) and simulated using periodic boundary conditions with a 2 fs timestep and LINCS constraints (Hess et al., 1993). Production simulations were performed in the NVT ensemble with the Bussi thermostat (Bussi, Donadio & Parrinello, 2007) using a pre-equilibrated starting structure for which the volume was selected based on a short NPT simulation. NaCl was added to a concentration of ∼20 mM to reproduce the experimental conditions at which chemical shifts and NOEs were determined (Kjaergaard, Teilum & Poulsen, 2010). The van der Waals and short-range electrostatic interactions were truncated at 9 Å, whereas long-range electrostatic effects were treated with the particle mesh Ewald method (Essmann et al., 1995).
We carried out MD simulations with replica-averaged experimental restraints using 1, 2, 4 or 8 replicas (Table S1 gives an overview of the simulations that were performed). Replica-averaged restrained simulations enable us to use different equilibrium experimental observables as restraints in MD simulations in a way that minimises the risk of over-restraining, because replica-averaging is a practical implementation of the maximum entropy principle. As a control we also performed a simulation that was not biased by any experimental restraints (i.e., an unbiased simulation). To examine the role played by each of the different types of experimental data, we also performed simulations in which we included different combinations of the experimental restraints: chemical shifts only (CS), NOEs only (NOE), and both chemical shifts and NOEs (CS-NOE). In the simulations, each replica was evolved through a series of simulated annealing (SA) cycles between 304 and 454 K for a total duration of 0.6 ns per cycle. Specifically, for each SA cycle we performed: (i) 100 ps at 304 K, (ii) a linear increase of the temperature from 304 to 454 K over 100 ps, (iii) 100 ps at 454 K, and (iv) a linear cooling from 454 K to 304 K in the remaining 300 ps. Each new cycle was initiated from the final structure of the previous cycle. We only used structures from the 304 K portions of the simulations for our analyses, corresponding also to the temperature at which the NMR data were recorded (Kjaergaard, Teilum & Poulsen, 2010). Example scripts for performing the simulations are available as supporting information.
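As an illustration of the four-phase SA cycle described above, the following minimal Python sketch maps a simulation time to its target temperature. The function name `sa_temperature` and its arguments are hypothetical and are not taken from the supporting scripts; only the phase lengths and temperatures come from the text.

```python
def sa_temperature(t_ps, t_low=304.0, t_high=454.0):
    """Target temperature (K) at time t_ps (ps) for a 600 ps SA cycle.

    Hypothetical helper: the phase lengths follow the text, the
    function itself is an illustrative assumption.
    """
    t = t_ps % 600.0  # position within the current 0.6 ns cycle
    if t < 100.0:
        # (i) 100 ps at the low temperature
        return t_low
    if t < 200.0:
        # (ii) linear heating over 100 ps
        return t_low + (t_high - t_low) * (t - 100.0) / 100.0
    if t < 300.0:
        # (iii) 100 ps at the high temperature
        return t_high
    # (iv) linear cooling over the remaining 300 ps
    return t_high - (t_high - t_low) * (t - 300.0) / 300.0
```

Only frames for which this schedule returns 304 K in phase (i) would correspond to the portions of the trajectories used for analysis.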
Chemical shifts for the backbone atoms (Cα, C′, Hα, H and N) and Cβ CS (deposited in BMRB entry 16363) were used as restraints (with the exception of the Cβ of glutamines, which we have sometimes found to be imprecisely predicted). The resulting dataset includes 54 Cα, 37 Cβ, 52 Hα and 48 C′, H and N chemical shifts, respectively. The backbone chemical shifts cover most of the NCBD sequence with the exception of the first four to six N-terminal residues, depending on type of chemical shifts. The Cβ chemical shifts for the first seven N-terminal and last five C-terminal residues, as well as for some residues of the loops connecting the α-helices, are missing with few exceptions.
During the structure determination protocol, chemical shifts were calculated by CamShift (Kohlhoff et al., 2009) for all the nuclei for which an experimental value is available and then averaged over the replicas. The resulting average over the replicas was compared with the experimental value, and the ensemble as a whole restrained using a harmonic function with a force constant of 5.2 kJ mol−1 ppm−2 (Camilloni et al., 2012; Camilloni, Cavalli & Vendruscolo, 2013a). At the higher temperatures, T, explored during the simulated annealing, the force constant was scaled by a factor of (304 K/T). The value of the force constant was chosen roughly to match the calculated chemical shifts to experiments within the uncertainty of the CamShift predictor; the experimental uncertainty of the chemical shifts is negligible in comparison.
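The replica-averaged chemical shift restraint can be sketched as follows. This is a simplified illustration and not the Plumed/CamShift implementation: the function name, the data layout, and the exact form of the harmonic term (here k times the squared deviation, without a factor of 1/2) are assumptions; only the force constant of 5.2 kJ mol−1 ppm−2 and the (304 K/T) scaling come from the text.

```python
def cs_restraint_energy(calc_shifts, exp_shifts, temperature=304.0,
                        k_ref=5.2, t_ref=304.0):
    """Harmonic penalty (kJ/mol) on replica-averaged chemical shifts.

    calc_shifts: one list of predicted shifts (ppm) per replica,
                 all in the same nucleus order as exp_shifts.
    The energy form k * (average - experiment)**2 is an assumption.
    """
    n_rep = len(calc_shifts)
    # Force constant scaled by (304 K / T) at higher temperatures
    k = k_ref * (t_ref / temperature)
    energy = 0.0
    for i, d_exp in enumerate(exp_shifts):
        # Linear average of the predicted shift over the replicas
        d_avg = sum(rep[i] for rep in calc_shifts) / n_rep
        energy += k * (d_avg - d_exp) ** 2
    return energy
```

Because the penalty acts on the average, two replicas with shifts of 1.0 and 3.0 ppm incur no penalty against an experimental value of 2.0 ppm, even though neither replica matches it individually.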
NOE restraints were derived from 455 NOE-derived distance intervals (Kjaergaard, Teilum & Poulsen, 2010) (BMRB entry 16363), of which 46 were long-range (i.e., separated by more than 4 residues). The proton–proton distances, r, were calculated and averaged as r−6 over the replicas (Tropp, 1980; Lindorff-Larsen et al., 2005). We used a flat-bottomed harmonic function implemented in Gromacs to restrain the calculated averaged distances within the experimentally-derived intervals. We used a variable force constant for the NOE restraints during the SA cycles, allowing the protein to sample more diverse structures in the high-temperature regime and thus decreasing the risk of getting trapped in local minima. Force constants of 1,000, 20 and 125 kJ mol−1 nm−2 were used for the 304 K phase, the heating phase (from 304 K to 454 K) and the cooling phase (from 454 K to 304 K), respectively.
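The r−6 averaging over replicas and a flat-bottomed harmonic penalty can be illustrated with the short Python sketch below. The function names are hypothetical, and the penalty is a simplified form that is harmonic on both sides of the allowed interval; the exact functional form used by Gromacs is not reproduced here.

```python
def r6_average(distances):
    """r^-6 average of one proton-proton distance over the replicas."""
    n = len(distances)
    mean_inv6 = sum(r ** -6 for r in distances) / n
    return mean_inv6 ** (-1.0 / 6.0)


def flat_bottom_energy(r_avg, lower, upper, k):
    """Simplified flat-bottomed harmonic penalty (assumed form).

    Zero inside [lower, upper]; 0.5 * k * violation**2 outside.
    Distances in nm, k in kJ mol^-1 nm^-2.
    """
    if r_avg < lower:
        return 0.5 * k * (r_avg - lower) ** 2
    if r_avg > upper:
        return 0.5 * k * (r_avg - upper) ** 2
    return 0.0


# Force constants (kJ mol^-1 nm^-2) for the three SA phases given in the text
NOE_FORCE_CONSTANTS = {"304K": 1000.0, "heating": 20.0, "cooling": 125.0}
```

Since r−6 averaging is dominated by short distances, a contact that is formed in only some of the replicas can still satisfy an NOE-derived upper bound.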
In short, in the replica-averaged simulations we calculated at each step and for each replica-conformation the atomic distances that were measured by the NOE experiments and the backbone chemical shifts. These calculated single-conformer values were then averaged (linearly for the shifts and using r−6 averaging for the distances) to determine the replica-averaged values, which were then compared to the experimentally determined values. Thus, the simulations penalize deviations between the calculated ensemble averages and experimental values but allow fluctuations of individual structures. In this way, the simulations are biased so as to agree with the experimental data as a whole, while allowing individual conformations to take on conformations whose NMR parameters differ from the experimentally derived averages.
To examine the role of the force field used in our approach, we compared the results from two different force fields belonging to the same family (CHARMM). These force fields differ mostly in the main-chain dihedral angle potential, as well as in a few parameters for certain side chains. Specifically, we used either the CHARMM22* (Piana, Lindorff-Larsen & Shaw, 2011) or CHARMM22 (MacKerell et al., 1998) force fields. The CHARMM22* force field is a refined version of CHARMM22 that includes modified backbone torsion angles optimized to give improved agreement with a range of NMR data in simulations of peptides of various lengths and secondary structure propensities. Furthermore, in a previous comprehensive evaluation of protein force fields, it was demonstrated that these two force fields resulted in very different levels of agreement between simulations and experiments (Lindorff-Larsen et al., 2012a), making it possible for us to evaluate the importance of force field accuracy in restrained simulations.
Unbiased simulations for the calculation of fast-timescale order parameters
We also performed 28 independent unbiased MD simulations, each 50 ns long, at 304 K and with the same computational setup as the restrained simulations, but without any restraints. As starting points, we selected seven different structures from each of the four replicas obtained in the CS-NOE-4 ensemble (Table S1). In particular, the seven structures were selected from the SA cycles after convergence (i.e., at SA cycles 65, 75, 85, 95, 100, 110, 125). We calculated fast timescale order parameters, which correspond to those measured by NMR relaxation measurements, from these 28 unbiased simulations using a previously described approach (Maragakis et al., 2008). In particular, we calculated bond-vector autocorrelation functions (independently from each simulation) including both internal motions and overall tumbling of NCBD. The resulting correlation functions were then averaged over the 28 simulations and subsequently fitted globally to a Lipari-Szabo model (Lipari & Szabo, 1982) to yield relaxation order parameters. To calculate order parameters that report on the long-timescale motions we first aligned the full ensemble and then calculated order parameters as ensemble averages (Maragakis et al., 2008).
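To make the ensemble-averaged order parameter concrete, the sketch below uses a commonly used expression for bond vectors in an aligned ensemble, S2 = (3/2) Σ_ab ⟨μ_a μ_b⟩² − 1/2, where a and b run over the Cartesian components of the unit bond vector μ. Whether this exact estimator matches the one used in the cited approach is an assumption, and the function name is hypothetical.

```python
import math


def ensemble_order_parameter(vectors):
    """S2 of one bond vector from its orientations in an aligned ensemble.

    vectors: list of (x, y, z) bond vectors, one per conformation,
             taken after superimposing the ensemble.
    Assumed estimator: S2 = 1.5 * sum_ab <mu_a mu_b>^2 - 0.5.
    """
    units = []
    for x, y, z in vectors:
        norm = math.sqrt(x * x + y * y + z * z)
        units.append((x / norm, y / norm, z / norm))
    n = len(units)
    total = 0.0
    for a in range(3):
        for b in range(3):
            # Ensemble average of the product of two components
            mean_ab = sum(u[a] * u[b] for u in units) / n
            total += mean_ab ** 2
    return 1.5 * total - 0.5
```

A bond vector with the same orientation in every conformation gives S2 = 1, whereas one distributed equally over three orthogonal directions gives S2 = 0.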
Analyses of convergence and cross validation
We used two different methods to examine the convergence of our simulations. First, we used the ENCORE ensemble comparison method (Lindorff-Larsen & Ferkinghoff-Borg, 2009; Tiberti et al., 2015) to quantify the overlap between the structural ensembles. This method is based on clustering the structures using affinity propagation (setting the “preference value” in the clustering to 12) and subsequently comparing pairs of ensembles by calculating the Jensen–Shannon (JS) divergence between the ways in which they populate the different clusters. For additional details, please refer to the original descriptions of the method (Lindorff-Larsen & Ferkinghoff-Borg, 2009; Tiberti et al., 2015). As an alternative method, we calculated the Root Mean Square Inner Product (RMSIP) over the first 10 eigenvectors obtained from a principal component analysis of the covariance matrix of atomic (Cα-atoms) fluctuations (Amadei, Linssen & Berendsen, 1993).
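The comparison of cluster populations by the JS divergence can be sketched as below. This is not the ENCORE code: the helper names are hypothetical, the clustering step is assumed to have been done already (each structure carries a cluster index), and the use of the natural logarithm is an assumption.

```python
import math


def cluster_populations(labels, n_clusters):
    """Fraction of an ensemble's structures falling in each cluster."""
    counts = [0] * n_clusters
    for label in labels:
        counts[label] += 1
    return [c / len(labels) for c in counts]


def js_divergence(p, q):
    """Jensen-Shannon divergence between two cluster distributions.

    Natural logarithm assumed, so the maximum value is ln 2.
    """
    m = [(a + b) / 2.0 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero population contribute nothing
        return sum(a * math.log(a / b) for a, b in zip(x, y) if a > 0.0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

Two ensembles that populate the clusters identically give a divergence of zero, and two ensembles that share no clusters give the maximum value.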
To cross-validate our ensembles we calculated the chemical shifts of side chain methyl hydrogen and carbon atoms using CH3Shift (Sahakyan et al., 2011) (both 1H and 13C shifts) and PPM (Li & Brüschweiler, 2012) (only 1H shifts) and compared them to the previously determined experimental side chain chemical shifts. In particular, we compared the calculated side chain chemical shifts with the experimental values (deposited in BMRB entry 16363) using a reduced χ2 metric. In this metric, the squared deviations between the calculated and experimental values were normalized by the variance of the chemical shift predictor (for each type of chemical shift) and by the total number of chemical shifts, so that low numbers indicate good agreement between experimental and calculated chemical shifts.
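The reduced χ2 metric described above can be written as a short function. The sketch assumes one predictor standard deviation per chemical shift type, passed in as `sigma`; the function name and argument layout are hypothetical.

```python
def reduced_chi2(calc, exp, sigma):
    """Reduced chi-squared between calculated and experimental shifts.

    calc, exp: ensemble-averaged calculated and experimental shifts (ppm)
    sigma:     standard deviation (ppm) of the chemical shift predictor
               for this shift type (assumed to be a single value)
    """
    n = len(exp)
    # Squared deviations normalized by the predictor variance and by N
    return sum((c - e) ** 2 for c, e in zip(calc, exp)) / (sigma ** 2 * n)
```

A value close to 1 would indicate deviations comparable to the predictor's own uncertainty, and lower values indicate better agreement.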
Results
Convergence of the simulations
Before assessing the accuracy of the different structural ensembles that we generated, we first ensured that the simulated annealing protocol allowed us to obtain converged ensembles that represent the dynamical properties encoded in the experimental restraints and the molecular force field. To quantify convergence of the ensembles, we calculated two different measures of the overlap between the subspaces sampled by different simulations.
First, we used a previously described approach (Lindorff-Larsen & Ferkinghoff-Borg, 2009; Tiberti et al., 2015), which is based on a quantification of the extent to which the different ensembles mix during conformational clustering, to calculate the Jensen–Shannon (JS) divergence between the ensembles (Fig. 2). A JS divergence of zero is evidence of identical ensembles, and it has previously been observed that a JS divergence in the range of 0.1–0.3 represents similar ensembles (Lindorff-Larsen & Ferkinghoff-Borg, 2009; Tiberti et al., 2015). We expect that, in a converged replica-averaged simulation, the different replicas should populate the different structural basins equally. With this in mind, we calculated the JS divergence between two replicas in a simulation restrained by NOEs and chemical shifts (Fig. 2, black line). We find that after approximately 30 cycles of simulated annealing the two replicas have covered approximately the same conformational space, with the JS divergence stabilizing around 0.2–0.3; the remaining fluctuations in the JS divergence reflect the stochastic nature of the simulations. Thus, we decided to discard the first 45 simulated annealing cycles from all the simulations. As an alternative measure of ensemble similarity we also calculated the Root Mean Square Inner Product (Hess, 2002) (RMSIP), with very similar results. In particular, the similarity of the two replicas converges to an RMSIP value greater than 0.83 (here RMSIP = 1 is expected for fully overlapping ensembles).
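For reference, the RMSIP between two sets of principal components can be computed as in the sketch below, which assumes that each set contains the same number of orthonormal eigenvectors (10 in the analysis above). The function name is hypothetical, and the principal component analysis itself is not shown.

```python
import math


def rmsip(eigvecs_a, eigvecs_b):
    """Root Mean Square Inner Product between two eigenvector sets.

    eigvecs_a, eigvecs_b: lists of equally many orthonormal eigenvectors
    (each a list of floats) from the PCA of two ensembles.
    """
    n = len(eigvecs_a)
    total = 0.0
    for v in eigvecs_a:
        for w in eigvecs_b:
            dot = sum(x * y for x, y in zip(v, w))
            total += dot * dot
    # Normalize by the number of eigenvectors in each set
    return math.sqrt(total / n)
```

Identical subspaces give a value of 1, and mutually orthogonal subspaces give 0.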
As a second, perhaps even more stringent, test of convergence we also examined whether two simulations with the same number of replicas and experimental restraints, but initiated from substantially different starting structures, converge to similar ensembles. Indeed, we find that simulations initiated from two distinct structures of NCBD (Table S1) converge to similar ensembles when the first 45 cycles are discarded as initial equilibration (Fig. 2, grey line). Thus, based on these two tests we concluded that our sampling protocol allows us to obtain structural ensembles that represent the force field and restraints employed.
Assessment of the accuracy of the NCBD ensembles
Once we had assessed the convergence of the simulations, we analysed the different ensembles to evaluate their accuracy. To do so, we back-calculated experimental parameters that were not used as restraints and compared them with the experimental values. As our different simulations employed different sets of experimental restraints, not all experimental data can be employed for validation purposes. For example, while the NOEs can be used to evaluate the quality of an ensemble obtained using CS-restraints, they can obviously not be used to validate an ensemble that was generated using those NOEs as restraints.
We first examined whether the CS or NOE restraints alone are sufficient to increase the accuracy in the description of the conformational ensemble of NCBD. We thus compared unbiased simulations with simulations biased by either CS or NOEs by cross-validation with the measured NOEs and CS, respectively.
We back-calculated NOEs from the inter-proton distances and observed substantial violations (some greater than 2 Å) in both unbiased and CS ensembles (Fig. S1) independently of the number of replicas used for the averaging. To determine the origin of these discrepancies we calculated intramolecular contacts between side chains, and observed an overall decrease in these (from 27 in the previously-determined NMR ensemble, to 14 and 17 in unbiased and CS-restrained, respectively). More specifically we found a loss of inter-helical contacts between helices α1 and α2 in the simulations, in agreement with our finding of several long-range NOEs that are violated in these ensembles.
These results demonstrate that the CS-restraints and MD force field, as implemented here, are not sufficient to provide a fully accurate description of the conformational ensemble of NCBD. Similarly, we found that back-calculation of backbone chemical shifts from the unbiased simulation and, to a lesser extent, from an NOE-restrained ensemble resulted in deviations from experiments. We therefore decided to determine conformational ensembles that combine the information of the NOEs, chemical shifts and force field in replica-averaged simulations (CS-NOE), aiming to provide a more accurate structural ensemble of NCBD than possible via the application of just one of the two classes of restraints. We also assessed the influence of the choice of force field since we expected that a more accurate ensemble could be obtained with the relatively limited amounts of experimental data when using a more accurate force field. Thus, we compared simulations using either the CHARMM22 force field (CS-NOE-4-C22 simulation), or a more recent and accurate force field variant, CHARMM22* (CS-NOE simulations).
As both the NOEs and backbone chemical shifts were used as restraints they cannot be used for validation of these ensembles. Instead, we turned to side-chain methyl chemical shifts for a comparison and validation of the different ensembles. Methyl-containing residues, for which the chemical shifts are available, cover the entire protein structure and are thus excellent probes of both local structure (13C methyl chemical shifts, which are mostly dependent on the rotameric state) and long-range contacts (1H methyl chemical shifts). The methyl chemical shifts were predicted by CH3Shift (Sahakyan et al., 2011) and the resulting values compared to experiments, separating the contributions from 13C and 1H. We then calculated the reduced χ2, thus taking into account the inherent uncertainty of the chemical shift predictions (Sahakyan et al., 2011).
As also indicated by the calculation of NOEs and backbone chemical shifts, we find that the side chain chemical shifts predicted from the unbiased simulation (green line in Fig. 3) deviate substantially from experiments. The introduction of backbone chemical shift restraints (CS ensembles, orange line in Fig. 3) provides a better structural ensemble than the force field alone, especially for 13C methyl chemical shifts and when averaged over 2 or 4 replicas. We also calculated the chemical shifts from NOE-derived ensembles, obtained with or without replica-averaging. Surprisingly, we find that the ensembles obtained using NOEs as replica-averaged restraints (NOE, magenta line in Fig. 3) perform slightly worse than the CS ensemble. Thus, when evaluated in this way, ensembles derived by MD refinement using either backbone chemical shifts or NOEs do not increase accuracy compared to the ensemble deposited in the PDB.
By combining the NOEs, chemical shifts and the CHARMM22* force field we were, however, able to obtain even more accurate ensembles, in particular when averaging over four replicas, as assessed by the ability to predict side chain 13C and 1H methyl chemical shifts (Fig. 3). Interestingly we find that not only the experimental data but also the CHARMM22* force field contributes to the improved agreement with the experimental data. Indeed, when we employ both chemical shift and NOE-based restraints in simulations averaged over 4 replicas, but replacing the CHARMM22* force field by an earlier, less accurate variant of the same force field (CHARMM22; CS-NOE-4-C22) (Lindorff-Larsen et al., 2012a) we find that the accuracy decreases dramatically. Calculations of 1H methyl chemical shifts using PPM (Li & Brüschweiler, 2012) instead of CH3Shift demonstrate that the conclusions are robust to the method for calculating the chemical shifts (Fig. S2). Similarly, calculations of the chemical shifts using the ensemble generated from the alternative starting structure (CS-NOE-2-1ZOQ) resulted in essentially the same agreement with the experimental data as when simulations were initiated from the 2KKJ structure (Fig. 3), confirming the conclusions from the convergence analysis described above (Fig. 2). The CS-NOE-4 ensemble, which we found to provide the most accurate representation of the free state of NCBD in solution, is shown in Fig. 4. It is a relatively broad ensemble of conformations, where the three helical regions are maintained overall, but differ in the lengths and relative positions of the three α-helices.
Small Angle X-ray scattering (SAXS) measurements have been carried out for NCBD in solution (Kjaergaard, Teilum & Poulsen, 2010) and have previously been compared to simulation-derived ensembles of NCBD (Knott & Best, 2012; Naganathan & Orozco, 2013). We thus calculated the radius of gyration (Rg) using CRYSOL (Svergun, Barberato & Koch, 1995) for the various ensembles. In all cases we find that the average Rg values are in the range of 13.7 Å–14.9 Å. These values are comparable to the value obtained previously from simulations (13.7 Å) (Knott & Best, 2012) but lower than the values estimated from a Guinier analysis of the experimental data (∼16.5 Å) or an ensemble-optimization method (18.8 Å) (Kjaergaard, Teilum & Poulsen, 2010). We note, however, that the experimental values also include contributions from a ∼8% population of unfolded protein that is not captured by our simulations. Although a detailed understanding of the role of solvation in the SAXS properties of partially disordered proteins is lacking, we expect that the discrepancy between experiment and simulation should be ascribed to remaining force field deficiencies. Indeed, overly large compaction of proteins is a common problem of most atomistic force fields (Piana, Klepeis & Shaw, 2014), though recent work suggests that, at least for fully disordered proteins, modified protein-water interactions can improve accuracy (Nerenberg et al., 2012; Best, Zheng & Mittal, 2014; Henriques, Cragnell & Skepö, 2015; Mercadante et al., 2015; Piana et al., 2015). We also note that while the force field used here (CHARMM22*) has in certain cases been shown to produce too compact structures (Piana et al., 2015), in other cases it appears to perform quite well (Rauscher et al., 2015). We expect that resolving these issues will require both further force field developments (Best, 2017) and improved methods for comparing simulations and SAXS experiments (Hub, 2018).
A unified view of NCBD dynamics
While the broad peaks and sparse NOEs are suggestive of a rather dynamic protein, previous NMR relaxation measurements of side chain dynamics found relatively high order parameters (S2) comparable to values found in well-ordered proteins (Kjaergaard, Poulsen & Teilum, 2012). To shed light on this apparent discrepancy and to assess whether our relatively broad structural ensemble is compatible with mobility on different timescales, we calculated S2 values representing different timescales.
To mimic the dynamics probed in relaxation experiments we selected 28 structures, seven from each of the four replicas of the CS-NOE-4 ensemble, sampled at seven different SA cycles. Starting from each of these conformations we performed 50 ns of unbiased MD simulation (in total 1.4 µs, Fig. S3), and from each simulation we calculated the autocorrelation functions of the N-H bond vectors (without removing the overall rotational motion of the protein). These correlation functions were subsequently averaged and fitted to the Lipari-Szabo model to estimate the S2 values, which report on the nanosecond dynamics of the protein (Fig. 5, black line). The results show a relatively rigid ensemble on the ns timescale, attested by high order parameters throughout most of the polypeptide backbone.
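The bond-vector autocorrelation function and a model function of the Lipari-Szabo type can be sketched as follows. This is an illustration under stated assumptions, not the analysis code used here: the function names are hypothetical, the correlation function is taken to be the second-order Legendre polynomial of the dot product of unit bond vectors, and overall tumbling is assumed to be isotropic with a single correlation time `tau_c`.

```python
import math


def p2_autocorrelation(vectors, max_lag):
    """C(lag) = <P2(mu(t) . mu(t + lag))> for one bond vector.

    vectors: unit bond vectors, one per trajectory frame
    max_lag: largest lag (in frames); must be smaller than len(vectors)
    """
    n = len(vectors)
    corr = []
    for lag in range(max_lag + 1):
        total = 0.0
        for i in range(n - lag):
            dot = sum(a * b for a, b in zip(vectors[i], vectors[i + lag]))
            total += 1.5 * dot * dot - 0.5  # second Legendre polynomial
        corr.append(total / (n - lag))
    return corr


def lipari_szabo(t, s2, tau_e, tau_c):
    """Assumed model form including isotropic overall tumbling:
    C(t) = exp(-t/tau_c) * (S2 + (1 - S2) * exp(-t/tau_e)).
    """
    return math.exp(-t / tau_c) * (s2 + (1.0 - s2) * math.exp(-t / tau_e))
```

In such a scheme the correlation functions from the individual simulations would be averaged before fitting `s2`, `tau_e` and `tau_c`; the fitting step itself is not shown.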
To quantify the backbone dynamics on the longer timescales that may influence both the NOEs and chemical shifts (but to which the relaxation measurements would not be sensitive) we defined and calculated “S2”-values from the structural variability in the ensemble after aligning the structures. These S2 values also include contributions from any millisecond-timescale motions that might be present in the ground state of NCBD. As internal and overall motions cannot be decoupled, the results of such calculations will depend on how the ensemble is aligned. In our calculations we chose theseus (Theobald & Steindel, 2012) as the least biased method to align the structures (Fig. 4). These order parameter calculations reveal a broader distribution of conformations with additional, longer-timescale dynamics evident both in loop regions and in the C-terminal region, even though relatively high S2 values are found in the regions of secondary structure (Fig. 5, grey line).
A similar analysis of side chain motions suggests even greater differences between the motions present on the relaxation and chemical shift timescales. In particular, we find that, for methyl-bearing side chains, the ensemble-derived S2 values are on average lower than the nanosecond-timescale S2 values by 0.4, compared to an average difference of 0.2 for the backbone amides. Finally, we note that although both sets of calculated S2 values correlate strongly with the experimentally determined side chain S2 values (Spearman correlation coefficients of 0.9 and 0.8), a more quantitative analysis is hampered by several issues including: (i) the presence of a small population of unfolded protein in the experiments, (ii) the difficulty in appropriate model selection for the calculated correlation functions, (iii) the well-known observation of too-fast rotational motions of proteins in the TIP3P model that we used and (iv) uncertainties in the parameterization of the rotational motions in the experimental analyses. We also note the potential complications that arise from the fact that the ensemble-derived S2 values were obtained from simulations with an experimental bias, whereas the nanosecond-timescale S2 values were obtained from simulations starting from such a biased ensemble, but performed with the standard CHARMM22* force field.
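An order parameter derived from structural variability in an aligned ensemble can be sketched as follows. The expression used here, S2 = (3/2) Σij ⟨μi μj⟩² − 1/2 over the Cartesian components of the unit bond vector, is a standard equilibrium formula; that this is exactly the expression used in this work is an assumption, and the function name is hypothetical. The input vectors are taken to come from structures that have already been superimposed.

```python
import math

def ensemble_order_parameter(bond_vectors):
    """S2 of one bond from its orientation in each aligned ensemble member.

    bond_vectors: list of (x, y, z) tuples, one per structure, e.g. the
    N-H vector of a given residue after superimposing the ensemble.
    """
    n = len(bond_vectors)
    # Normalise every vector so that only the orientation matters
    units = []
    for v in bond_vectors:
        norm = math.sqrt(sum(x * x for x in v))
        units.append(tuple(x / norm for x in v))
    # Sum over i, j of the squared ensemble average <mu_i * mu_j>
    total = 0.0
    for i in range(3):
        for j in range(3):
            mean_ij = sum(u[i] * u[j] for u in units) / n
            total += mean_ij ** 2
    return 1.5 * total - 0.5

if __name__ == "__main__":
    rigid = [(0.0, 0.0, 1.0)] * 4
    isotropic = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
    print(ensemble_order_parameter(rigid))      # fully ordered bond
    print(ensemble_order_parameter(isotropic))  # orientationally averaged bond
```

A bond with the same orientation in every structure gives a value of 1, and a bond whose orientations average out isotropically gives 0, so the result depends directly on how the structures were aligned.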
Discussion
We have performed restrained simulations of the small protein NCBD and find that after ∼30 cycles of simulated annealing two “identical” replicas have covered approximately the same conformational space, as judged by the JS divergence between them. Similarly, we find that simulations initiated from two distinct structures of NCBD converge to similar ensembles when the first 45 cycles are discarded. Thus, based on these two tests we concluded that our sampling protocol allows us to obtain structural ensembles that represent the force field and restraints employed.
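The Jensen-Shannon (JS) divergence used as a convergence measure can be illustrated with a minimal sketch for two discrete distributions, for example histograms of a structural coordinate sampled by two replicas. How the ensembles were actually projected or binned before the comparison is not stated here, so the histogram input, the function name and the use of the natural logarithm are assumptions.

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) between two histograms.

    p, q: sequences of non-negative counts or probabilities over the
    same bins; they are normalised internally.
    """
    sp, sq = float(sum(p)), float(sum(q))
    p = [x / sp for x in p]
    q = [x / sq for x in q]
    # Mixture distribution M = (P + Q) / 2
    m = [(a + b) / 2.0 for a, b in zip(p, q)]

    def kl(a, b):
        # Kullback-Leibler divergence; bins with a == 0 contribute nothing,
        # and b > 0 wherever a > 0 because b is the mixture
        return sum(x * math.log(x / y) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

if __name__ == "__main__":
    replica_a = [10, 30, 40, 20]
    replica_b = [12, 28, 38, 22]
    print(js_divergence(replica_a, replica_b))
```

With this definition the divergence is 0 for identical distributions and reaches its maximum, ln 2, for distributions with no overlapping bins, so small values indicate that two simulations have sampled similar regions.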
Once we had assessed the convergence of the simulations, we analysed the different ensembles to evaluate their accuracy. As our different simulations employed different sets of experimental restraints, not all experimental data can be employed for validation purposes.
Our results revealed that the CS-restraints and MD force field, as implemented here, are not alone enough to describe accurately the conformational ensemble of NCBD. We therefore determined conformational ensembles that combine the information of the NOEs, chemical shifts and force field, and validated them using side-chain methyl chemical shifts. The results show that by combining the NOEs, chemical shifts and the CHARMM22* force field we are able to obtain even more accurate ensembles (compared to using these data individually), in particular when averaging over four replicas. Thus, we find that the CS-NOE-4 ensemble provides the most accurate representation of the free state of NCBD in solution among the different ensembles we have studied. We, however, find that this ensemble is slightly more compact than expected from experiment, and suggest that a more careful analysis of the SAXS data and a force field that gives a better balance between compact and expanded structures are necessary to solve these issues.
Our results also shed new light on the amount and timescales of the dynamics in NCBD. In particular, our calculations of order parameters demonstrate that NCBD may be described as a semi-rigid protein on fast timescales, but with additional dynamics in the backbone and, in particular, side chains on timescales longer than the rotational correlation time of the protein, as also previously suggested (Kjaergaard, Poulsen & Teilum, 2012).
Conclusions
We have presented an application of the dynamic-ensemble refinement method to study the native state dynamics of NCBD. In the original implementation of DER we combined NMR relaxation order parameters with NOEs in MD simulations (Lindorff-Larsen et al., 2005). This approach was here extended to the combination of chemical shifts and NOEs to make it more generally applicable. In particular, our results show that it is possible to combine NOEs, backbone chemical shifts and an accurate MD force field into replica-averaged restrained simulations, and that all three components add substantially to the accuracy of the resulting NCBD ensemble.
NMR structures are typically obtained by combining distance information from NOE measurements with in vacuo simulations, in certain cases with subsequent refinement by short MD simulations in explicit solvent. Further, the inherent ensemble averaging of the experimental data is typically not exploited explicitly. In this way, standard NMR structures can provide highly accurate models of the “average structure” of a protein, but only little information about the conformational heterogeneity around this average.
Replica-averaged MD simulations make it possible to obtain structural ensembles that match the experimental data according to the principle of maximum entropy (Pitera & Chodera, 2012; Roux & Weare, 2013; Cavalli, Camilloni & Vendruscolo, 2013; Boomsma, Ferkinghoff-Borg & Lindorff-Larsen, 2014; White & Voth, 2014; Olsson et al., 2014). In such calculations prior information, here in the form of a molecular mechanics force field, is biased in a minimal fashion to agree with the experimental data. Thus, to obtain an accurate ensemble, such simulations require an accurate force field, an efficient sampling approach as well as sufficient experimental information. Our results show that, at least in the case of the small, but relatively mobile protein NCBD, it is possible to perform such simulations when NOEs are supplemented by the information available in the backbone chemical shifts and a well-parameterized molecular force field. The application of the experimentally-derived structural restraints helps overcome at least some of the deficiencies in force field accuracy and also improves sampling of the relevant regions of conformational space. While we find that four replicas are optimal for the system and data studied here, we expect that this value might vary between systems and hence recommend evaluating it, e.g., by comparing to independently measured data such as the side chain shifts analysed here.
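The minimal-bias idea described above is often written in the following generic form, given here as a sketch rather than as the specific restraint function used in this work. The symbols are introduced for illustration: P0 is the prior distribution defined by the force field energy E_FF, f_i(x) is the value of observable i calculated for conformation x, and the Lagrange multipliers λi are fixed by requiring that the ensemble averages reproduce the experimental values.

```latex
P_{\mathrm{ME}}(x) \propto P_0(x)\,\exp\!\left(-\sum_i \lambda_i f_i(x)\right),
\qquad
P_0(x) \propto \exp\!\left(-\frac{E_{\mathrm{FF}}(x)}{k_B T}\right),
\qquad
\int P_{\mathrm{ME}}(x)\, f_i(x)\,\mathrm{d}x = f_i^{\mathrm{exp}}
```

Replica-averaged restraints approximate this distribution without solving for the multipliers explicitly, which is one reason the number of replicas is a parameter worth checking against independent data.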
Our approach also allowed us to probe the structural heterogeneity arising from both short- and long-timescale dynamics by the calculation of order parameters. In the case of NCBD we found that this protein can be described as a relatively rigid domain on a fast timescale, as attested by the high relaxation order parameters, but one that nevertheless displays additional motions in both the backbone and side chains on longer timescales. This situation is reminiscent of the molten globule state of apomyoglobin, which also displays restricted motions on the nanosecond timescale but greater motions on a slower timescale (Eliezer et al., 2000; Meinhold & Wright, 2011). The current study also provides the groundwork for further studies of NCBD's intricate conformational dynamics and their relationship to ligand binding (Dogan et al., 2012; Zijlstra et al., 2017). Given the importance of understanding and quantifying protein dynamics, in particular on long timescales, we expect that our approach, which uses only commonly available data, possibly combined with novel algorithms for enhancing sampling (Bonomi et al., 2016; Bonomi, Camilloni & Vendruscolo, 2016), will have a wide range of applications.
Supplemental Information
Acknowledgments
We would like to thank Magnus Kjaergaard, Wouter Boomsma, Matteo Tiberti and Peter Wright for fruitful discussion and comments.
Funding Statement
Elena Papaleo and Kresten Lindorff-Larsen were supported by a Hallas-Møller stipend from the Novo Nordisk Foundation (to Kresten Lindorff-Larsen). The project was also supported by the Danish e-Infrastructure Cooperation HPC Grant 2013 and the PRACE Research Infrastructure Resource Curie (France, 7th PRACE Tier0, NMRFUNC). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Additional Information and Declarations
Competing Interests
Elena Papaleo is an Academic Editor for PeerJ.
Author Contributions
Elena Papaleo performed the experiments, analyzed the data, contributed reagents/materials/analysis tools, prepared figures and/or tables, authored or reviewed drafts of the paper, approved the final draft.
Carlo Camilloni, Kaare Teilum and Michele Vendruscolo analyzed the data, contributed reagents/materials/analysis tools, authored or reviewed drafts of the paper, approved the final draft.
Kresten Lindorff-Larsen conceived and designed the experiments, analyzed the data, contributed reagents/materials/analysis tools, authored or reviewed drafts of the paper, approved the final draft.
Data Availability
The following information was supplied regarding data availability:
The raw data are provided in a Supplemental File.
References
- Amadei, Linssen & Berendsen (1993). Amadei A, Linssen AB, Berendsen HJ. Essential dynamics of proteins. Proteins. 1993;17:412–425. doi: 10.1002/prot.340170408.
- Ángyán & Gáspári (2013). Ángyán AF, Gáspári Z. Ensemble-based interpretations of NMR structural data to describe protein internal dynamics. Molecules. 2013;18:10548–10567. doi: 10.3390/molecules180910548.
- Best (2017). Best RB. Computational and theoretical advances in studies of intrinsically disordered proteins. Current Opinion in Structural Biology. 2017;42:147–154. doi: 10.1016/j.sbi.2017.01.006.
- Best, Zheng & Mittal (2014). Best RB, Zheng W, Mittal J. Balanced protein-water interactions improve properties of disordered proteins and non-specific protein association. Journal of Chemical Theory and Computation. 2014;10:5113–5124. doi: 10.1021/ct500569b.
- Bonomi et al. (2009). Bonomi M, Branduardi D, Bussi G, Camilloni C, Provasi D, Raiteri P, Donadio D, Marinelli F, Pietrucci F, Broglia RA, Parrinello M. PLUMED: a portable plugin for free-energy calculations with molecular dynamics. Computer Physics Communications. 2009;180:1961–1972. doi: 10.1016/j.cpc.2009.05.011.
- Bonomi et al. (2016). Bonomi M, Camilloni C, Cavalli A, Vendruscolo M. Metainference: a Bayesian inference method for heterogeneous systems. Science Advances. 2016;2:e1501177. doi: 10.1126/sciadv.1501177.
- Bonomi, Camilloni & Vendruscolo (2016). Bonomi M, Camilloni C, Vendruscolo M. Metadynamic metainference: enhanced sampling of the metainference ensemble using metadynamics. Scientific Reports. 2016;6:31232. doi: 10.1038/srep31232.
- Bonomi et al. (2017). Bonomi M, Heller GT, Camilloni C, Vendruscolo M. Principles of protein structural ensemble determination. Current Opinion in Structural Biology. 2017;42:106–116. doi: 10.1016/J.SBI.2016.12.004.
- Boomsma, Ferkinghoff-Borg & Lindorff-Larsen (2014). Boomsma W, Ferkinghoff-Borg J, Lindorff-Larsen K. Combining experiments and simulations using the maximum entropy principle. PLOS Computational Biology. 2014;10:e1003406. doi: 10.1371/journal.pcbi.1003406.
- Boomsma et al. (2014). Boomsma W, Tian P, Frellsen J, Ferkinghoff-Borg J, Hamelryck T, Lindorff-Larsen K, Vendruscolo M. Equilibrium simulations of proteins using molecular fragment replacement and NMR chemical shifts. Proceedings of the National Academy of Sciences of the United States of America. 2014;111:13852–13857. doi: 10.1073/pnas.1404948111.
- Bottaro et al. (2018). Bottaro S, Bussi G, Kennedy SD, Turner DH, Lindorff-Larsen K. Conformational ensembles of RNA oligonucleotides from integrating NMR and molecular simulations. Science Advances. 2018;4:eaar8521. doi: 10.1126/sciadv.aar8521.
- Bussi, Donadio & Parrinello (2007). Bussi G, Donadio D, Parrinello M. Canonical sampling through velocity rescaling. The Journal of Chemical Physics. 2007;126:14101. doi: 10.1063/1.2408420.
- Camilloni, Cavalli & Vendruscolo (2013a). Camilloni C, Cavalli A, Vendruscolo M. Assessment of the use of NMR chemical shifts as replica-averaged structural restraints in molecular dynamics simulations to characterise the dynamics of proteins. The Journal of Physical Chemistry B. 2013a;117:1838–1843. doi: 10.1021/jp3106666.
- Camilloni, Cavalli & Vendruscolo (2013b). Camilloni C, Cavalli A, Vendruscolo M. Replica-averaged metadynamics. Journal of Chemical Theory and Computation. 2013b;9:5610–5617. doi: 10.1021/ct4006272.
- Camilloni et al. (2012). Camilloni C, Robustelli P, De Simone A, Cavalli A, Vendruscolo M. Characterisation of the conformational equilibrium between the two major substates of RNase A using NMR chemical shifts. Journal of the American Chemical Society. 2012;134:3968–3971. doi: 10.1021/ja210951z.
- Camilloni & Vendruscolo (2014). Camilloni C, Vendruscolo M. Statistical mechanics of the denatured state of a protein using replica-averaged metadynamics. Journal of the American Chemical Society. 2014;136:8982–8991. doi: 10.1021/ja5027584.
- Cavalli, Camilloni & Vendruscolo (2013). Cavalli A, Camilloni C, Vendruscolo M. Molecular dynamics simulations with replica-averaged structural restraints generate structural ensembles according to the maximum entropy principle. The Journal of Chemical Physics. 2013;138:94112. doi: 10.1063/1.4793625.
- Cavalli et al. (2007). Cavalli A, Salvatella X, Dobson CM, Vendruscolo M. Protein structure determination from NMR chemical shifts. Proceedings of the National Academy of Sciences of the United States of America. 2007;104:9615–9620. doi: 10.1073/pnas.0610313104.
- De Simone et al. (2015). De Simone A, Aprile FA, Dhulesia A, Dobson CM, Vendruscolo M. Structure of a low-population intermediate state in the release of an enzyme product. eLife. 2015;4:e02777. doi: 10.7554/eLife.02777.
- De Simone et al. (2009). De Simone A, Richter B, Salvatella X, Vendruscolo M. Toward an accurate determination of free energy landscapes in solution states of proteins. Journal of the American Chemical Society. 2009;131:3810–3811. doi: 10.1021/ja8087295.
- Demarest et al. (2004). Demarest SJ, Deechongkit S, Dyson HJ, Evans RM, Wright PE. Packing, specificity, and mutability at the binding interface between the p160 coactivator and CREB-binding protein. Protein Science. 2004;13:203–210. doi: 10.1110/ps.03366504.
- Demarest et al. (2002). Demarest SJ, Martinez-Yamout M, Chung J, Chen H, Xu W, Dyson HJ, Evans RM, Wright PE. Mutual synergistic folding in recruitment of CBP/p300 by p160 nuclear receptor coactivators. Nature. 2002;415:549–553. doi: 10.1038/415549a.
- Dogan et al. (2012). Dogan J, Schmidt T, Mu X, Engström Å, Jemth P. Fast association and slow transitions in the interaction between two intrinsically disordered protein domains. The Journal of Biological Chemistry. 2012;287:34316–34324. doi: 10.1074/jbc.M112.399436.
- Dror et al. (2012). Dror RO, Dirks RM, Grossman JP, Xu H, Shaw DE. Biomolecular simulation: a computational microscope for molecular biology. Annual Review of Biophysics. 2012;41:429–452. doi: 10.1146/annurev-biophys-042910-155245.
- Ebert et al. (2008). Ebert M-O, Bae S-H, Dyson HJ, Wright PE. NMR relaxation study of the complex formed between CBP and the activation domain of the nuclear hormone receptor coactivator ACTR. Biochemistry. 2008;47:1299–1308. doi: 10.1021/bi701767j.
- Eliezer et al. (2000). Eliezer D, Chung J, Dyson HJ, Wright PE. Native and non-native secondary structure and dynamics in the pH 4 intermediate of apomyoglobin. Biochemistry. 2000;39:2894–2901. doi: 10.1021/BI992545F.
- Essmann et al. (1995). Essmann U, Perera L, Berkowitz ML, Darden T, Lee H, Pedersen LG. A smooth particle mesh Ewald method. The Journal of Chemical Physics. 1995;103:8577. doi: 10.1063/1.470117.
- Esteban-Martín, Bryn Fenwick & Salvatella (2012). Esteban-Martín S, Bryn Fenwick R, Salvatella X. Synergistic use of NMR and MD simulations to study the structural heterogeneity of proteins. Wiley Interdisciplinary Reviews: Computational Molecular Science. 2012;2:466–478. doi: 10.1002/wcms.1093.
- Fenwick et al. (2011). Fenwick RB, Esteban-Martín S, Richter B, Lee D, Walter KFA, Milovanovic D, Becker S, Lakomek NA, Griesinger C, Salvatella X. Weak long-range correlated motions in a surface patch of ubiquitin involved in molecular recognition. Journal of the American Chemical Society. 2011;133:10336–10339. doi: 10.1021/ja200461n.
- Fiser & Šali (2003). Fiser A, Šali A. MODELLER: generation and refinement of homology-based protein structure models. Methods in Enzymology. 2003;374:461–491. doi: 10.1016/S0076-6879(03)74020-8.
- Han et al. (2011). Han B, Liu Y, Ginzinger SW, Wishart DS. SHIFTX2: significantly improved protein chemical shift prediction. Journal of Biomolecular NMR. 2011;50:43–57. doi: 10.1007/s10858-011-9478-4.
- Henriques, Cragnell & Skepö (2015). Henriques J, Cragnell C, Skepö M. Molecular dynamics simulations of intrinsically disordered proteins: force field evaluation and comparison with experiment. Journal of Chemical Theory and Computation. 2015;11:3420–3431. doi: 10.1021/ct501178z.
- Hess (2002). Hess B. Convergence of sampling in protein simulations. Physical Review. E, Statistical, Nonlinear, and Soft Matter Physics. 2002;65:31910. doi: 10.1103/PhysRevE.65.031910.
- Hess et al. (1993). Hess B, Bekker H, Berendsen H, Fraaije J. LINCS: a linear constraint solver for molecular simulations. Journal of Computational Chemistry. 1993;12:1463–1472.
- Hub (2018). Hub JS. Interpreting solution X-ray scattering data using molecular simulations. Current Opinion in Structural Biology. 2018;49:18–26. doi: 10.1016/J.SBI.2017.11.002.
- Hummer & Köfinger (2015). Hummer G, Köfinger J. Bayesian ensemble refinement by replica simulations and reweighting. The Journal of Chemical Physics. 2015;143:243150. doi: 10.1063/1.4937786.
- Islam et al. (2013). Islam SM, Stein RA, McHaourab HS, Roux B. Structural refinement from restrained-ensemble simulations based on EPR/DEER data: application to T4 lysozyme. The Journal of Physical Chemistry B. 2013;117:4740–4754. doi: 10.1021/jp311723a.
- Jorgensen et al. (1983). Jorgensen WL, Chandrasekhar J, Madura JD, Impey RW, Klein ML. Comparison of simple potential functions for simulating liquid water. The Journal of Chemical Physics. 1983;79:926. doi: 10.1063/1.445869.
- Kannan et al. (2014). Kannan A, Camilloni C, Sahakyan AB, Cavalli A, Vendruscolo M. A conformational ensemble derived using NMR methyl chemical shifts reveals a mechanical clamping transition that gates the binding of the HU protein to DNA. Journal of the American Chemical Society. 2014;136:2204–2207. doi: 10.1021/ja4105396.
- Kjaergaard et al. (2013). Kjaergaard M, Andersen L, Nielsen LD, Teilum K. A folded excited state of ligand-free nuclear coactivator binding domain (NCBD) underlies plasticity in ligand recognition. Biochemistry. 2013;52:1686–1693. doi: 10.1021/bi4001062.
- Kjaergaard, Poulsen & Teilum (2012). Kjaergaard M, Poulsen FM, Teilum K. Is a malleable protein necessarily highly dynamic? The hydrophobic core of the nuclear coactivator binding domain is well ordered. Biophysical Journal. 2012;102:1627–1635. doi: 10.1016/j.bpj.2012.02.014.
- Kjaergaard, Teilum & Poulsen (2010). Kjaergaard M, Teilum K, Poulsen FM. Conformational selection in the molten globule state of the nuclear coactivator binding domain of CBP. Proceedings of the National Academy of Sciences of the United States of America. 2010;107:12535–12540. doi: 10.1073/pnas.1001693107.
- Knott & Best (2012). Knott M, Best RB. A preformed binding interface in the unbound ensemble of an intrinsically disordered protein: evidence from molecular simulations. PLOS Computational Biology. 2012;8:e1002605. doi: 10.1371/journal.pcbi.1002605.
- Kohlhoff et al. (2009). Kohlhoff KJ, Robustelli P, Cavalli A, Salvatella X, Vendruscolo M. Fast and accurate predictions of protein NMR chemical shifts from interatomic distances. Journal of the American Chemical Society. 2009;131:13894–13895. doi: 10.1021/ja903772t.
- Krieger et al. (2014). Krieger JM, Fusco G, Lewitzky M, Simister PC, Marchant J, Camilloni C, Feller SM, De Simone A. Conformational recognition of an intrinsically disordered protein. Biophysical Journal. 2014;106:1771–1779. doi: 10.1016/j.bpj.2014.03.004.
- Kukic et al. (2014). Kukic P, Camilloni C, Cavalli A, Vendruscolo M. Determination of the individual roles of the linker residues in the interdomain motions of calmodulin using NMR chemical shifts. Journal of Molecular Biology. 2014;426:1826–1838. doi: 10.1016/j.jmb.2014.02.002.
- Lange et al. (2008). Lange OF, Lakomek N-A, Farès C, Schröder GF, Walter KFA, Becker S, Meiler J, Grubmüller H, Griesinger C, De Groot BL. Recognition dynamics up to microseconds revealed from an RDC-derived ubiquitin ensemble in solution. Science. 2008;320:1471–1475. doi: 10.1126/science.1157092.
- Lee et al. (2010). Lee CW, Martinez-Yamout MA, Dyson HJ, Wright PE. Structure of the p53 transactivation domain in complex with the nuclear receptor coactivator binding domain of CREB binding protein. Biochemistry. 2010;49:9964–9971. doi: 10.1021/bi1012996.
- Lehtivarjo et al. (2012). Lehtivarjo J, Tuppurainen K, Hassinen T, Laatikainen R, Peräkylä M. Combining NMR ensembles and molecular dynamics simulations provides more realistic models of protein structures in solution and leads to better chemical shift prediction. Journal of Biomolecular NMR. 2012;52:257–267. doi: 10.1007/s10858-012-9609-6.
- Li & Brüschweiler (2012). Li D-W, Brüschweiler R. PPM: a side-chain and backbone chemical shift predictor for the assessment of protein conformational ensembles. Journal of Biomolecular NMR. 2012;54:257–265. doi: 10.1007/s10858-012-9668-8.
- Lindorff-Larsen et al. (2005). Lindorff-Larsen K, Best RB, Depristo MA, Dobson CM, Vendruscolo M. Simultaneous determination of protein structure and dynamics. Nature. 2005;433:128–132. doi: 10.1038/nature03199.
- Lindorff-Larsen & Ferkinghoff-Borg (2009). Lindorff-Larsen K, Ferkinghoff-Borg J. Similarity measures for protein ensembles. PLOS ONE. 2009;4:e4203. doi: 10.1371/journal.pone.0004203.
- Lindorff-Larsen et al. (2012a). Lindorff-Larsen K, Maragakis P, Piana S, Eastwood MP, Dror RO, Shaw DE. Systematic validation of protein force fields against experimental data. PLOS ONE. 2012a;7:e32131. doi: 10.1371/journal.pone.0032131.
- Lindorff-Larsen et al. (2012b). Lindorff-Larsen K, Trbovic N, Maragakis P, Piana S, Shaw DE. Structure and dynamics of an unfolded protein examined by molecular dynamics simulation. Journal of the American Chemical Society. 2012b;134:3787–3791. doi: 10.1021/ja209931w.
- Lipari & Szabo (1982). Lipari G, Szabo A. Model-free approach to the interpretation of nuclear magnetic resonance relaxation in macromolecules. 1. Theory and range of validity. Journal of the American Chemical Society. 1982;104:4546–4559. doi: 10.1021/ja00381a009.
- Löhr, Jussupow & Camilloni (2017). Löhr T, Jussupow A, Camilloni C. Metadynamic metainference: convergence towards force field independent structural ensembles of a disordered peptide. The Journal of Chemical Physics. 2017;146:165102. doi: 10.1063/1.4981211.
- MacCallum, Perez & Dill (2015). MacCallum JL, Perez A, Dill KA. Determining protein structures by combining semireliable data with atomistic physical models by Bayesian inference. Proceedings of the National Academy of Sciences of the United States of America. 2015;112:6985–6990. doi: 10.1073/pnas.1506788112.
- MacKerell et al. (1998). MacKerell AD, Bashford D, Dunbrack RL, Evanseck JD, Field MJ, Fischer S, Gao J, Guo H, Ha S, Joseph-McCarthy D, Kuchnir L, Kuczera K, Lau FTK, Mattos C, Michnick S, Ngo T, Nguyen DT, Prodhom B, Reiher WE, Roux B, Schlenkrich M, Smith JC, Stote R, Straub J, Watanabe M, Wiórkiewicz-Kuczera J, Yin D, Karplus M. All-atom empirical potential for molecular modeling and dynamics studies of proteins. The Journal of Physical Chemistry B. 1998;102:3586–3616. doi: 10.1021/jp973084f.
- Maragakis et al. (2008). Maragakis P, Lindorff-Larsen K, Eastwood MP, Dror RO, Klepeis JL, Arkin IT, Jensen MØ, Xu H, Trbovic N, Friesner RA, Palmer AG, Shaw DE. Microsecond molecular dynamics simulation shows effect of slow loop dynamics on backbone amide order parameters of proteins. The Journal of Physical Chemistry B. 2008;112:6155–6158. doi: 10.1021/jp077018h.
- Meinhold & Wright (2011). Meinhold DW, Wright PE. Measurement of protein unfolding/refolding kinetics and structural characterization of hidden intermediates by NMR relaxation dispersion. Proceedings of the National Academy of Sciences of the United States of America. 2011;108:9078–9083. doi: 10.1073/pnas.1105682108.
- Mercadante et al. (2015).Mercadante D, Milles S, Fuertes G, Svergun DI, Lemke EA, Gräter F. Kirkwood-Buff approach rescues over-collapse of a disordered protein in canonical protein force fields. The Journal of Physical Chemistry B. 2015;119:7975–7984. doi: 10.1021/acs.jpcb.5b03440. [DOI] [PubMed] [Google Scholar]
- Mobley (2012).Mobley DL. Let’s get honest about sampling. Journal of Computer-aided Molecular Design. 2012;26:93–95. doi: 10.1007/s10822-011-9497-y. [DOI] [PubMed] [Google Scholar]
- Naganathan & Orozco (2013).Naganathan AN, Orozco M. The conformational landscape of an intrinsically disordered DNA-binding domain of a transcription regulator. The Journal of Physical Chemistry B. 2013;117:13842–13850. doi: 10.1021/jp408350v. [DOI] [PubMed] [Google Scholar]
- Nerenberg et al. (2012).Nerenberg PS, Jo B, So C, Tripathy A, Head-Gordon T. Optimizing solute-water van der Waals interactions to reproduce solvation free energies. The Journal of Physical Chemistry B. 2012;116:4524–4534. doi: 10.1021/jp2118373. [DOI] [PubMed] [Google Scholar]
- Olsson et al. (2014).Olsson S, Vögeli BR, Cavalli A, Boomsma W, Ferkinghoff-Borg J, Lindorff-Larsen K, Hamelryck T. Probabilistic determination of native state ensembles of proteins. Journal of Chemical Theory and Computation. 2014;10:3484–3491. doi: 10.1021/ct5001236. [DOI] [PubMed] [Google Scholar]
- Papaleo et al. (2014).Papaleo E, Sutto L, Gervasio FL, Lindorff-Larsen K. Conformational changes and free energies in a proline isomerase. Journal of Chemical Theory and Computation. 2014;10:4169–4174. doi: 10.1021/ct500536r. [DOI] [PubMed] [Google Scholar]
- Perilla et al. (2015).Perilla JR, Goh BC, Cassidy CK, Liu B, Bernardi RC, Rudack T, Yu H, Wu Z, Schulten K. Molecular dynamics simulations of large macromolecular complexes. Current Opinion in Structural Biology. 2015;31:64–74. doi: 10.1016/j.sbi.2015.03.007. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Piana et al. (2015).Piana S, Donchev AG, Robustelli P, Shaw DE. Water dispersion interactions strongly influence simulated structural properties of disordered protein states. The Journal of Physical Chemistry B. 2015;119:5113–5123. doi: 10.1021/jp508971m. [DOI] [PubMed] [Google Scholar]
- Piana, Klepeis & Shaw (2014).Piana S, Klepeis JL, Shaw DE. Assessing the accuracy of physical models used in protein-folding simulations: quantitative evidence from long molecular dynamics simulations. Current Opinion in Structural Biology. 2014;24:98–105. doi: 10.1016/j.sbi.2013.12.006. [DOI] [PubMed] [Google Scholar]
- Piana, Lindorff-Larsen & Shaw (2011).Piana S, Lindorff-Larsen K, Shaw DE. How robust are protein folding simulations with respect to force field parameterization? Biophysical Journal. 2011;100:L47–L49. doi: 10.1016/j.bpj.2011.03.051. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Piana, Lindorff-Larsen & Shaw (2012).Piana S, Lindorff-Larsen K, Shaw DE. Protein folding kinetics and thermodynamics from atomistic simulation. Proceedings of the National Academy of Sciences of the United States of America. 2012;109:17845–17850. doi: 10.1073/pnas.1201811109. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Pitera & Chodera (2012).Pitera JW, Chodera JD. On the use of experimental observations to bias simulated ensembles. Journal of Chemical Theory and Computation. 2012;8:3445–3451. doi: 10.1021/ct300112v. [DOI] [PubMed] [Google Scholar]
- Pronk et al. (2013).Pronk S, Páll S, Schulz R, Larsson P, Bjelkmar P, Apostolov R, Shirts MR, Smith JC, Kasson PM, Van der Spoel D, Hess B, Lindahl E. GROMACS 4.5: a high-throughput and highly parallel open source molecular simulation toolkit. Bioinformatics. 2013;29:845–854. doi: 10.1093/bioinformatics/btt055. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Qin et al. (2005).Qin BY, Liu C, Srinath H, Lam SS, Correia JJ, Derynck R, Lin K. Crystal structure of IRF-3 in complex with CBP. Structure. 2005;13:1269–1277. doi: 10.1016/j.str.2005.06.011. [DOI] [PubMed] [Google Scholar]
- Rauscher et al. (2015).Rauscher S, Gapsys V, Gajda MJ, Groot BL De, Grubmüller H. Structural ensembles of intrinsically disordered proteins depend strongly on force field: a comparison to experiment structural ensembles of intrinsically disordered proteins depend strongly on force field: a comparison to experiment. Journal of Chemical Theory and Computation. 2015;11:5513–5524. doi: 10.1021/acs.jctc.5b00736. [DOI] [PubMed] [Google Scholar]
- Ravera et al. (2016). Ravera E, Sgheri L, Parigi G, Luchinat C. A critical assessment of methods to recover information from averaged data. Physical Chemistry Chemical Physics. 2016;18:5686–5701. doi: 10.1039/c5cp04077a.
- Robustelli et al. (2009). Robustelli P, Cavalli A, Dobson CM, Vendruscolo M, Salvatella X. Folding of small proteins by Monte Carlo simulations with chemical shift restraints without the use of molecular fragment replacement or structural homology. The Journal of Physical Chemistry B. 2009;113:7890–7896. doi: 10.1021/jp900780b.
- Robustelli et al. (2010). Robustelli P, Kohlhoff K, Cavalli A, Vendruscolo M. Using NMR chemical shifts as structural restraints in molecular dynamics simulations of proteins. Structure. 2010;18:923–933. doi: 10.1016/j.str.2010.04.016.
- Roux & Weare (2013). Roux B, Weare J. On the statistical equivalence of restrained-ensemble simulations with the maximum entropy method. The Journal of Chemical Physics. 2013;138:84107. doi: 10.1063/1.4792208.
- Sahakyan et al. (2011). Sahakyan AB, Vranken WF, Cavalli A, Vendruscolo M. Structure-based prediction of methyl chemical shifts in proteins. Journal of Biomolecular NMR. 2011;50:331–346. doi: 10.1007/s10858-011-9524-2.
- Shaw et al. (2010). Shaw DE, Maragakis P, Lindorff-Larsen K, Piana S, Dror RO, Eastwood MP, Bank JA, Jumper JM, Salmon JK, Shan Y, Wriggers W. Atomic-level characterization of the structural dynamics of proteins. Science. 2010;330:341–346. doi: 10.1126/science.1187409.
- Shen et al. (2008). Shen Y, Lange O, Delaglio F, Rossi P, Aramini JM, Liu G, Eletsky A, Wu Y, Singarapu KK, Lemak A, Ignatchenko A, Arrowsmith CH, Szyperski T, Montelione GT, Baker D, Bax A. Consistent blind protein structure generation from NMR chemical shift data. Proceedings of the National Academy of Sciences of the United States of America. 2008;105:4685–4690. doi: 10.1073/pnas.0800256105.
- Svergun, Barberato & Koch (1995). Svergun D, Barberato C, Koch MHJ. CRYSOL—a program to evaluate X-ray solution scattering of biological macromolecules from atomic coordinates. Journal of Applied Crystallography. 1995;28:768–773. doi: 10.1107/S0021889895007047.
- Tang, Schwieters & Clore (2007). Tang C, Schwieters CD, Clore GM. Open-to-closed transition in apo maltose-binding protein observed by paramagnetic NMR. Nature. 2007;449:1078–1082. doi: 10.1038/nature06232.
- Theobald & Steindel (2012). Theobald DL, Steindel PA. Optimal simultaneous superpositioning of multiple structures with missing data. Bioinformatics. 2012;28:1972–1979. doi: 10.1093/bioinformatics/bts243.
- Tiberti et al. (2015). Tiberti M, Papaleo E, Bengtsen T, Boomsma W, Lindorff-Larsen K. ENCORE: software for quantitative ensemble comparison. PLOS Computational Biology. 2015;11:e1004415. doi: 10.1371/journal.pcbi.1004415.
- Tropp (1980). Tropp J. Dipolar relaxation and nuclear Overhauser effects in nonrigid molecules: the effect of fluctuating internuclear distances. The Journal of Chemical Physics. 1980;72:6035. doi: 10.1063/1.439059.
- Vögeli et al. (2014). Vögeli B, Orts J, Strotz D, Chi C, Minges M, Wälti MA, Güntert P, Riek R. Towards a true protein movie: a perspective on the potential impact of the ensemble-based structure determination using exact NOEs. Journal of Magnetic Resonance. 2014;241:53–59. doi: 10.1016/j.jmr.2013.11.016.
- Waters et al. (2006). Waters L, Yue B, Veverka V, Renshaw P, Bramham J, Matsuda S, Frenkiel T, Kelly G, Muskett F, Carr M, Heery DM. Structural diversity in p160/CREB-binding protein coactivator complexes. The Journal of Biological Chemistry. 2006;281:14787–14795. doi: 10.1074/jbc.M600237200.
- White & Voth (2014). White AD, Voth GA. Efficient and minimal method to bias molecular simulations with experimental data. Journal of Chemical Theory and Computation. 2014;10:3023–3030. doi: 10.1021/ct500320c.
- Wishart et al. (2008). Wishart DS, Arndt D, Berjanskii M, Tang P, Zhou J, Lin G. CS23D: a web server for rapid protein structure generation using NMR chemical shifts and sequence data. Nucleic Acids Research. 2008;36:W496–W502. doi: 10.1093/nar/gkn305.
- Wishart & Case (2001). Wishart DS, Case DA. Use of chemical shifts in macromolecular structure determination. Methods in Enzymology. 2001;338:3–34. doi: 10.1016/s0076-6879(02)38214-4.
- Zijlstra et al. (2017). Zijlstra N, Dingfelder F, Wunderlich B, Zosel F, Benke S, Nettels D, Schuler B. Rapid microfluidic dilution for single-molecule spectroscopy of low-affinity biomolecular complexes. Angewandte Chemie. 2017;129:7232–7235. doi: 10.1002/ange.201702439.
Data Availability Statement
The following information was supplied regarding data availability:
The raw data are provided in a Supplemental File.