Biophysical Journal. 2017 Aug 8;113(3):550–557. doi: 10.1016/j.bpj.2017.06.042

An Efficient Method for Estimating the Hydrodynamic Radius of Disordered Protein Conformations

Mads Nygaard 1,2, Birthe B Kragelund 1, Elena Papaleo 1,2, Kresten Lindorff-Larsen 1
PMCID: PMC5550300  PMID: 28793210

Abstract

Intrinsically disordered proteins play important roles throughout biology, yet our understanding of the relationship between their sequences, structural properties, and functions remains incomplete. The dynamic nature of these proteins, however, makes them difficult to characterize structurally. Many disordered proteins can attain both compact and expanded conformations, and the level of expansion may be regulated and important for function. Experimentally, the level of compaction and shape is often determined either by small-angle x-ray scattering experiments or pulsed-field-gradient NMR diffusion measurements, which provide ensemble-averaged estimates of the radius of gyration and hydrodynamic radius, respectively. Often, these experiments are interpreted using molecular simulations or are used to validate them. We here provide, to our knowledge, a new and efficient method to calculate the hydrodynamic radius of a disordered protein chain from a model of its structural ensemble. In particular, starting from basic concepts in polymer physics, we derive a relationship between the radius of gyration of a structure and its hydrodynamic radius, which in turn can be used, for example, to compare a simulated ensemble of conformations to NMR diffusion measurements. The relationship may also be valuable when using NMR diffusion measurements to restrain molecular simulations.

Introduction

Intrinsically disordered proteins (IDPs) and intrinsically disordered regions (IDRs) of proteins play important roles in central cellular processes such as cell-cycle regulation (1), transcription (2), membrane receptor signaling (3, 4), and nuclear transport (5). Thus, despite lacking a globular, folded structure—and often being substantially disordered under physiological conditions—they are able to perform specific and important biological functions (6).

Due to their high flexibility and fast dynamics, IDPs are difficult to characterize structurally, and are thus often described through integrative structural biology approaches (4, 7, 8). In addition to biophysical experiments, molecular simulation methods have emerged as central in our ability to describe disordered proteins and to interpret experimental data on these complex systems. In particular, much of our knowledge of the structural properties of IDPs and IDRs stems from combinations of molecular dynamics (MD) or Monte Carlo simulations and NMR spectroscopy (9, 10, 11, 12).

Recent years have witnessed dramatic advances in both the force fields and sampling methods used in MD simulations, and detailed comparisons, e.g., between simulations and NMR experiments, have shown continued accuracy in simulations of globular proteins, short, flexible peptides, and protein folding (13). In contrast, simulations of the unfolded state of folded proteins or IDPs (10, 14, 15, 16, 17, 18, 19) have suggested that many force fields result in overly compact structures. To help alleviate this problem and enable more accurate simulations of disordered proteins, several approaches for force field improvements have been suggested (15, 20, 21, 22, 23). Nevertheless, it still remains unclear which force fields perform best for a given system and molecular property (16).

Statistical coil models have also been used extensively to describe IDPs (24, 25, 26). Because of their computational efficiencies, these models are particularly attractive for sampling the many different kinds of structures IDPs may attain. Further, they have been shown to provide a relatively accurate description of the sequence-local structural properties as well as the overall expansion of the polypeptide chain (27).

The biological functions of disordered regions are intimately linked to their dynamical behavior, and the overall expansion of the polypeptide chain can be important for the ability of these regions to act as scaffolds and choreographers in, for instance, signaling. Specifically, different IDPs and IDRs have been found to have varying amounts and types of transient local structures, and they appear to be differentially compacted, likely reflecting the distribution of charges and/or hydrophobic amino acids along the chain (28).

Despite the increased focus on describing the global and local structural properties of IDPs and the molecular reasons for why individual disordered proteins have different levels of expansion, we still do not fully understand the relationship between protein sequence and structural properties. Similarly, the relationship between the local and global structural properties remains incompletely understood (29, 30, 31). Although computation has proven efficient in linking sequence with both local and global structure of IDPs, as well as their functions, the link between computation and experiment is far from trivial, and there is a continued need for validation of molecular simulations against experiments.

Whereas local structural properties can be experimentally assessed by a variety of NMR properties, including scalar and residual dipolar couplings and chemical shifts to provide residue-specific information (11), the overall expansion of the chain is accessible through other methods, including small-angle x-ray scattering (SAXS) (32, 33), pulsed-field gradient NMR diffusion measurements (PFG NMR) (34), size-exclusion chromatography (SEC) (35), or fluorescence correlation spectroscopy and dynamic light scattering (36). Although SAXS experiments probe the radius of gyration (Rg), PFG NMR, SEC, and fluorescence correlation spectroscopy depend on the hydrodynamic properties of the protein chain. In particular, PFG NMR generally reports on the translational diffusion coefficient (Dt) of a protein, although rotational motions may contribute under special conditions (37). Dt is in turn related to the hydrodynamic radius, Rh, through the Stokes-Einstein equation (38),

$$D_t = \frac{k_B T}{6 \pi \eta R_h}. \tag{1}$$

Thus, by measuring Dt for a protein and a reference molecule with known Rh, PFG NMR provides a convenient and accurate method to measure Rh (34).
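The conversion in Eq. 1 can be sketched in a few lines. The function names and the example diffusion coefficients below are hypothetical and chosen only for illustration; the default temperature (298 K) and viscosity (1 cP) are the values quoted later in Methods. The second function assumes that the protein and the reference molecule are measured in the same sample, so that temperature and viscosity cancel.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def hydrodynamic_radius(d_t, temperature=298.0, viscosity=1.0e-3):
    """Stokes-Einstein relation (Eq. 1) solved for R_h, returned in metres.

    d_t is in m^2/s, temperature in K, viscosity in Pa*s (1 cP = 1e-3 Pa*s).
    """
    return K_B * temperature / (6.0 * math.pi * viscosity * d_t)


def rh_from_reference(d_protein, d_reference, rh_reference):
    """R_h of the protein from a co-measured reference with known R_h.

    Dividing Eq. 1 for the two species cancels k_B, T and eta, so only the
    ratio of diffusion coefficients is needed.
    """
    return rh_reference * d_reference / d_protein


# Hypothetical example values, not taken from the paper.
rh_angstrom = hydrodynamic_radius(1.0e-10) * 1.0e10
print(f"R_h = {rh_angstrom:.1f} A")
```

Using a reference molecule avoids having to know the sample viscosity precisely, which is presumably why the ratio form is the one used in practice.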

Although both Rg and Rh depend on the overall expansion of the polypeptide chain, they do so via different physical principles (Fig. 1), and they thus contain different information about proteins. Furthermore, because of the dynamic nature of IDPs, measured values of Rg and Rh are averages over a very large number of individual conformations that may differ substantially in size and shape. Because SAXS provides information about ⟨Rg²⟩ and PFG NMR provides information on ⟨Rh⁻¹⟩, the two experiments effectively report on different statistical moments of the distribution of expanded conformations. This effect was elegantly exploited in a study that combined SAXS and NMR to investigate the unfolded state of an SH3 domain (39).
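The different averaging of the two observables can be illustrated with a toy calculation. The sketch below assumes per-conformer radii are already available; the function names and the two-state ensemble are hypothetical. It uses the same numbers for both averages purely to show that a mean-square average is weighted toward expanded conformers, whereas an inverse average is weighted toward compact ones.

```python
import math


def ensemble_rg(rg_values):
    """SAXS-like average: square root of the mean-square radius of gyration."""
    return math.sqrt(sum(r * r for r in rg_values) / len(rg_values))


def ensemble_rh(rh_values):
    """PFG-NMR-like average: inverse of the mean inverse hydrodynamic radius."""
    return len(rh_values) / sum(1.0 / r for r in rh_values)


# Hypothetical two-state ensemble (values in Angstrom): one compact and one
# expanded conformer with equal weights.
radii = [10.0, 30.0]
rms_average = ensemble_rg(radii)      # dominated by the expanded conformer
inverse_average = ensemble_rh(radii)  # dominated by the compact conformer
print(rms_average, inverse_average)
```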

Figure 1.


Visual representation of the radius of gyration and hydrodynamic radius. (Left) The radius of gyration (Rg) of an object can be calculated as the root mean-square distance between each point in the object and its center of mass. Thus, for a protein, it directly reports on the typical distance between an atom and the center of mass of the protein. In the case of a solid sphere of radius r, Rg = r√(3/5). (Right) The Stokes radius or hydrodynamic radius (Rh) of a solute is the corresponding radius of a hard sphere that diffuses at the same rate as that solute. Dt is the translational diffusion coefficient.

To utilize the information available in PFG NMR experiments to validate and determine computationally generated protein ensembles, it is desirable to have an effective and accurate method for calculating Rh from large conformational ensembles. Also, when experimental measurements are available for both Rg and Rh, it would be useful to have a method that relates the two at the molecular level. One method currently used to calculate hydrodynamic properties with substantial accuracy, including Rh and Dt, is provided by the HYDROPRO program (40, 41), which uses a surface-shell model and target-function minimization calculations. The surface-shell model is created by representing the molecule’s shape with a number of spheres. As the friction of the molecule depends only on the parts of the molecule at the solute-solvent interface, the spheres inside the molecule are removed and the hydrodynamic properties are calculated based on the surface shell. The procedure is repeated for different levels of fine graining of the shell model and extrapolated to high resolution. The accuracy of these calculations, however, comes at a computational cost. Thus, for example, the calculation of Rh for a single conformation of a 200-residue protein on a single Intel i5 2.7 GHz CPU core takes up to ∼30 min (depending on the accuracy required). This may complicate applications on ensembles that consist of many thousands of conformations, or when Rh needs to be calculated on the fly when used as a restraint in structure determination. We therefore sought to combine the accuracy of HYDROPRO and the ease of calculating the Rg by developing an efficient and sufficiently accurate method to relate Rg and Rh for unfolded and disordered protein ensembles.

We set out from earlier work (39) that had established an empirical relationship between Rg and Rh for selected proteins, but noted that the resulting parameters varied between the different proteins. We sought to expand that work by explicitly studying the chain-length dependency of the relationship between the ratio Rg/Rh and Rg. We thus generated coil models of different chain lengths and sequences, and used these to derive a relationship that can be used to estimate Rh from Rg (or vice versa) for proteins between 20 and 450 residues in length. The relationship is highly accurate, with a relative error of 3% in the estimated value of Rh, and it should be directly applicable to validation of structural ensembles of IDPs using experimental measurements of hydrodynamic properties.

Methods

We used Flexible-Meccano (24) with default settings to generate ensembles of three different types of polypeptide sequences with different chain lengths for each type (Table 1): 1) poly-valine, 2) polypeptides with a sequence composition similar to that of IDPs (Table S1) (42), and 3) a set of 12 IDPs whose Rh has previously been measured by PFG NMR experiments (Table S2) (43). For poly-valine and IDP-like polymers, we used chain lengths of N = 20, 30, 40, 80, 100, 200, 300, 400, and 450 residues, whereas the experimentally characterized IDPs were of lengths between 40 and 237 residues (43). The sequences of the random IDP-like polymers were generated to match the amino acid composition of the Disprot database (42) after removing engineered proteins, variants, fragments of <15 residues, and duplicate sequences.

Table 1.

Overview of Peptides Used in This Study

| Ensemble | No. of Sequences | Sequence Length, N | Reference |
|---|---|---|---|
| Poly-valine | 9 | 20, 30, 40, 80, 100, 200, 300, 400, 450 | N/A |
| IDP-like | 9 | 20, 30, 40, 80, 100, 200, 300, 400, 450 | DisProt |
| IDP | 12 | 40, 61, 92, 95, 104, 110, 112, 140, 189, 198, 234, 237 | (43) |

See Supporting Material for additional details.

For each of the resulting 30 polypeptides, we generated 100 conformations using Flexible-Meccano. To examine whether our final model for calculating Rh is biased by the use of Flexible-Meccano to sample the structures, we also tested it using conformations sampled by all-atom MD simulations. In particular, we extracted ∼100 conformers from each of two previously published (20) very long simulations of HIV1-integrase and α-synuclein (12 and 20 μs, respectively) performed using the CHARMM22 force field (44) in conjunction with the TIP4P-D water model (20).

We used HYDROPRO (40, 41) to calculate Dt for each of the structures. Before the HYDROPRO calculations, we added side chains to the Flexible-Meccano structures using PULCHRA with default settings (45). As the number of mini-spheres for calculating the surface-shell model reached the upper limit allowed by HYDROPRO for peptides with chain length >300, we opted to calculate the hydrodynamic properties of these longer peptides using the coarser-grained “residue-based” model (HYDROPRO INDMODE 4 with default settings). For peptides with chain length ≤300, calculations were generally performed using the “all-atom” INDMODE 1 with default settings, but the values were also calculated using INDMODE 4 to examine the effect of this coarser model. In all cases, the resulting Dt values were converted to an Rh using the Stokes-Einstein relationship at 298 K and with η = 1 cP. For all conformations, we also calculated the Rg from the positions of the Cα atoms only, so that the relationship we have derived can be applied to backbone as well as all-atom models.
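As an illustration of the Cα-based Rg, a minimal sketch is given here. It assumes the Cα coordinates have already been read into a list of (x, y, z) tuples in Å and that all atoms carry equal weight; the function name and the toy coordinates are hypothetical, and reading the coordinates from a structure file is not shown.

```python
import math


def radius_of_gyration(coords):
    """Root mean-square distance of the points from their unweighted centroid."""
    n = len(coords)
    centroid = [sum(p[k] for p in coords) / n for k in range(3)]
    mean_sq = sum(
        sum((p[k] - centroid[k]) ** 2 for k in range(3)) for p in coords
    ) / n
    return math.sqrt(mean_sq)


# Hypothetical toy "chain": four points on a straight line, 3.8 A apart.
toy_ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
rg_toy = radius_of_gyration(toy_ca)
print(f"Rg = {rg_toy:.2f} A")
```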

We also used the Kirkwood definition (46) to calculate Rh, using the pairwise distances, rij, between the Cα atoms only:

$$R_h^{-1} = \left\langle r_{ij}^{-1} \right\rangle_{i \neq j}. \tag{2}$$
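A minimal sketch of this estimate follows, assuming the average in Eq. 2 runs over all distinct pairs of Cα atoms with equal weights. The function name and the toy coordinates are hypothetical.

```python
import math
from itertools import combinations


def kirkwood_rh(coords):
    """Inverse of the mean inverse pairwise distance over all distinct pairs."""
    inverse_sum = 0.0
    n_pairs = 0
    for p, q in combinations(coords, 2):
        inverse_sum += 1.0 / math.dist(p, q)
        n_pairs += 1
    return n_pairs / inverse_sum


# Hypothetical check: three points on an equilateral triangle of side 2 A.
# Every pairwise distance is 2 A, so the estimate is 2 A as well.
triangle = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, math.sqrt(3.0), 0.0)]
rh_triangle = kirkwood_rh(triangle)
print(rh_triangle)
```

Averaging over unordered pairs gives the same result as averaging over ordered pairs with i ≠ j, because each distance simply appears twice in the latter.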

We calculated the asphericity (Δ) and prolateness (S) using

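$$\Delta = \frac{3}{2} \, \frac{\mathrm{tr}(\hat{Q}^2)}{(\mathrm{tr}\, Q)^2} \tag{3}$$

and

$$S = 27 \, \frac{\det(\hat{Q})}{(\mathrm{tr}\, Q)^3}, \tag{4}$$

where $\hat{Q} = Q - \frac{1}{3}\mathrm{tr}(Q)\, I$ is the traceless matrix derived from the gyration tensor, Q (47).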
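The two shape parameters can be computed directly from a set of coordinates, as sketched below. The sketch assumes equal weights for all points, assumes that the traceless matrix is obtained by subtracting one-third of the trace of Q from its diagonal, and assumes that both parameters are normalised by the trace of the full gyration tensor Q (the usual convention, since the trace of the traceless matrix is zero). All function names are hypothetical.

```python
def gyration_tensor(coords):
    """3x3 gyration tensor Q of equally weighted points about their centroid."""
    n = len(coords)
    c = [sum(p[k] for p in coords) / n for k in range(3)]
    return [
        [sum((p[a] - c[a]) * (p[b] - c[b]) for p in coords) / n for b in range(3)]
        for a in range(3)
    ]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def shape_parameters(coords):
    """Return (asphericity, prolateness) for a set of 3D points."""
    q = gyration_tensor(coords)
    tr_q = q[0][0] + q[1][1] + q[2][2]
    # Traceless part: subtract tr(Q)/3 from each diagonal element.
    q_hat = [
        [q[a][b] - (tr_q / 3.0 if a == b else 0.0) for b in range(3)]
        for a in range(3)
    ]
    tr_q_hat_sq = sum(q_hat[a][b] * q_hat[b][a] for a in range(3) for b in range(3))
    asphericity = 1.5 * tr_q_hat_sq / tr_q ** 2
    prolateness = 27.0 * det3(q_hat) / tr_q ** 3
    return asphericity, prolateness


# Hypothetical limiting cases: a straight rod and a symmetric octahedron.
rod = [(-1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
octahedron = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
rod_shape = shape_parameters(rod)
octahedron_shape = shape_parameters(octahedron)
```

With these conventions a perfectly linear arrangement gives an asphericity of 1 and a prolateness of 2, whereas a fully symmetric arrangement gives 0 for both.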

Results and Discussion

As the starting point for our approach, we followed previous work that developed an approximate relationship between Rg and Rh for specific proteins (9, 39). Theory suggests that in specific limiting cases, there is a simple relationship between Rg and Rh (48). For an idealized spherical molecule (representing either folded proteins or compact conformations of disordered proteins), for example, the ratio Rg/Rh = √(3/5) ≈ 0.78. At the other end of the scale, renormalization group theory shows that for a disordered coil, the same ratio is between 1.2 and 1.6, depending on whether the chain is self-avoiding or not (49). Thus, the ratio Rg/Rh depends on shape and compaction.

The fact that the Rg/Rh ratio depends on the level of expansion is also reflected in the known, experimentally parameterized scaling laws for proteins that relate the chain length, N, to Rg and Rh. These generally take the form

$$R_x^s = R_{0,x}^s \, N^{\nu_x^s}, \tag{5}$$

where x = {g, h} determines whether the relationship refers to Rg or Rh, and s = {folded, unfolded, IDP} refers to which of these states the scaling law is meant to describe. Empirically determined values for these parameters (Table 2) reveal that the scaling exponents ($\nu_x^s$) are ∼0.33 for folded proteins and ∼0.6 for disordered proteins. The value for the compact state is thus as expected for structures where the volume scales linearly with the number of monomers. A scaling exponent of ∼0.6 for a disordered chain follows from both basic considerations of polymer chains (50) and more detailed renormalization group calculations (51).

Table 2.

Previously Determined Scaling Laws for Proteins

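| Rx | State | R0 (Å) | ν | Reference |
|---|---|---|---|---|
| Rg | folded | 2.2 | 0.38 | (56) |
| Rg | unfolded | 1.9 | 0.60 | (57) |
| Rh | folded | 4.8 | 0.29 | (34) |
| Rh | folded | 4.9 | 0.28 | (43) |
| Rh | unfolded | 2.2 | 0.57 | (34) |
| Rh | unfolded | 2.3 | 0.55 | (43) |
| Rh | IDP | 2.5 | 0.51 | (43) |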

In contrast to the similar scaling exponents, it is evident that the scaling factors, $R_{0,x}^s$, differ for Rg and Rh, and that they also depend on whether the protein is compact or expanded. Thus, as expected from theory, the scaling laws also show that the ratio Rg/Rh increases substantially in an expanded state compared to a compact state. Together, these results reiterate how Rg and Rh contain independent information that reports on the overall properties of the chain, so that their ratio depends on how expanded the chain is.
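The effect of the different prefactors can be made concrete by evaluating the scaling laws for one chain length. The sketch below uses a subset of the Table 2 parameters (the Rg entries and the Rh entries from reference 34); the dictionary layout, the function names, and the choice of N = 100 are illustrative assumptions.

```python
# (R0 in Angstrom, exponent nu), keyed by (radius type, state); from Table 2.
SCALING = {
    ("g", "folded"): (2.2, 0.38),
    ("g", "unfolded"): (1.9, 0.60),
    ("h", "folded"): (4.8, 0.29),
    ("h", "unfolded"): (2.2, 0.57),
}


def scaled_radius(radius_type, state, n_residues):
    """Evaluate the power law R = R0 * N**nu for the chosen radius and state."""
    r0, nu = SCALING[(radius_type, state)]
    return r0 * n_residues ** nu


def scaling_ratio(state, n_residues):
    """Ensemble-level Rg/Rh implied by the two scaling laws for one state."""
    return scaled_radius("g", state, n_residues) / scaled_radius(
        "h", state, n_residues
    )


ratio_folded = scaling_ratio("folded", 100)
ratio_unfolded = scaling_ratio("unfolded", 100)
print(f"folded: {ratio_folded:.2f}, unfolded: {ratio_unfolded:.2f}")
```

For a 100-residue chain these particular parameters give a clearly larger ratio for the unfolded state than for the folded state, in line with the argument above. The absolute values depend on which empirical parameter sets are combined.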

Based on the considerations outlined above, one might expect a phenomenological relationship that relates the ratio Rg/Rh to both the compaction of the chain, e.g., quantified via Rg as well as chain length, N. Such a relationship could be very useful to help interpret experimental measurements of Rg and Rh. At this point, it is, however, worth stressing that both the experimental and theoretical scaling laws generally refer to ensemble-averaged quantities. Thus, the parameters in Table 2 and the theoretical scaling exponents for disordered polymers refer to averages observed over an ensemble and are not expected to be directly applicable to individual conformations. Instead, we aim to fit them by calculating Rh from the Rg values of single structures.

To calculate Rh from Rg for individual conformations, Choy et al. (39) developed phenomenological relationships between the ratio Rg/Rh and the overall chain expansion, quantified as the Rg for single conformations. In particular, they created structural models of disordered conformations of several proteins, with various levels of expansion for each protein, and calculated Rg and Rh (using HYDROPRO) for each conformation. They found empirically that for each protein, the ratio Rg/Rh was well described as a linear function of Rg as

$$\frac{R_g}{R_h} = a R_g + b \tag{6}$$

(or, equivalently, Rh⁻¹ = a + b·Rg⁻¹), and that the ratios for the smallest and largest Rg values converged to the values roughly expected for a spherical molecule and disordered state, respectively. Although a roughly linear relationship was observed for each protein, the values of a and b differed. Thus, for the shortest protein (crambin; 46 residues) they found a = 0.034 Å⁻¹ and b = 0.38, whereas for the longest (reduced lysozyme; 129 residues) they found a = 0.015 Å⁻¹ and b = 0.53.
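Rearranging the linear relationship gives Rh = Rg/(a·Rg + b) for a single conformation. The short sketch below applies this with the two parameter pairs quoted above; the function name and the example Rg value are hypothetical.

```python
# (a in 1/Angstrom, b dimensionless) as quoted in the text for two proteins.
LINEAR_PARAMS = {
    "crambin": (0.034, 0.38),
    "reduced lysozyme": (0.015, 0.53),
}


def rh_from_linear_relation(rg, a, b):
    """Solve Rg/Rh = a*Rg + b for Rh (same length unit as rg)."""
    return rg / (a * rg + b)


# Hypothetical conformation with Rg = 15 A, evaluated with both parameter sets.
rg_example = 15.0
rh_by_protein = {
    name: rh_from_linear_relation(rg_example, a, b)
    for name, (a, b) in LINEAR_PARAMS.items()
}
print(rh_by_protein)
```

The same Rg maps to noticeably different Rh values under the two parameter sets, which is the protein dependence that motivates a chain-length-aware relationship.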

These results suggested that there appears to be a general theory-based, but phenomenological, linear relationship between the ratio Rg/Rh and Rg, but that the details of this relationship depend on the length of the polypeptide chain. This dependency can be conceptually understood by the fact that the magnitude of Rg that is needed to be in the coil regime, and hence for the ratio Rg/Rh to increase, depends on the chain length.

We sought to extend this work to derive a relationship that can be used to estimate Rh from the calculated value of Rg for a given conformation for an unfolded protein of any length. This is important also because the length span of IDPs and IDRs is very wide (52). Using Flexible-Meccano (24), we generated conformational ensembles of three series of polypeptides (30 peptides in total) and with different chain lengths between 20 and 450 residues (Table 1; Tables S1 and S2), giving a total of 3000 individual structures. We chose this sampling method because it has previously been shown to provide accurate models of the local structure of unfolded proteins as well as a reasonably accurate description of the overall expansion of the chain (27). As we use the conformations simply to relate Rh to Rg, and not to model other properties of unfolded configurations, we expected the method to be sufficiently accurate, and validated this assumption using conformations obtained from state-of-the-art MD simulations (see below).

To account for potential sequence dependencies, we performed the Flexible-Meccano calculations on 1) poly-valine homopolymers, 2) sequences with an IDP-like amino acid composition, and 3) 12 authentic IDPs with measured Rh. In addition to the IDPs and IDP-like sequences, we chose to study poly-valine, since this amino acid is expected to sample expanded conformations similar to unfolded proteins and because it had previously been used as a homopolymeric model of protein conformations (53). Because the Flexible-Meccano model only has local structural propensities and excluded volume, and, e.g., no hydrophobic effect, we did not expect that the actual details of the sequence would have a big effect, nor did we observe such an effect (see below).

For each of the 100 conformations of the 30 peptides, we calculated Rh using HYDROPRO and Rg from the Cα atoms. We found that the Rg/Rh ratio for different individual peptide conformations spanned roughly the same range (0.8–1.6) as that suggested by theory for ensembles of compact globules and expanded chains, respectively (Fig. 2 A). As short polypeptides obviously are maximally expanded at a lower value of Rg compared to a long polypeptide, the slope depends on the chain length. This is clearly evident from a more detailed view of the ranges of Rg and Rh, and of the Rg/Rh ratio, that we sampled for each peptide (Figs. S1, S2, and S3), which also shows that the linear relationship holds for all three classes of peptides. For each peptide, we thus fitted the data separately to the linear relationship (Eq. 6), with the values of the two parameters, aN and bN, being different for each polypeptide and with the dependency of the chain length, N, explicitly stated (Fig. 2 A; Figs. S1, S2, and S3).
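The per-peptide fit can be sketched as an ordinary least-squares line through the points (Rg, Rg/Rh). The code below is a generic closed-form fit and is not necessarily the fitting procedure used for the figures; the function names and the synthetic data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def fit_ratio_vs_rg(rg_values, rh_values):
    """Fit Rg/Rh = a_N * Rg + b_N for the conformers of one peptide."""
    ratios = [rg / rh for rg, rh in zip(rg_values, rh_values)]
    return fit_line(rg_values, ratios)


# Synthetic, noise-free data generated from known parameters, so the fit
# should recover them.
true_a, true_b = 0.02, 0.6
rg_data = [10.0, 15.0, 20.0, 25.0]
rh_data = [rg / (true_a * rg + true_b) for rg in rg_data]
a_fit, b_fit = fit_ratio_vs_rg(rg_data, rh_data)
```

Repeating this fit for every peptide yields one (aN, bN) pair per chain length.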

Figure 2.


An empirical relationship between Rg and Rh. For each of 30 polypeptides varying in length between 20 and 450 residues, we sampled 100 structures and calculated the hydrodynamic radius (Rh) and radius of gyration (Rg) for each structure. (A) In line with previous findings, we observed an approximately linear relationship between the ratio Rg/Rh and Rg, but with slope and intercept differing between polypeptides of different lengths (indicated by different colors). We fitted each dataset (indicated by the different shapes: squares for poly-valine; circles for IDP-like; and triangles for IDPs) to a straight line and observed that both the slope (B) and the intercept (C) systematically depended on the number of amino acid residues in the polypeptide. Error bars represent the error of fits. Note that the different sets of peptides appear to follow the same trends, suggesting that in this model it is the length of the peptide, not the composition, that is most relevant. The data for each peptide are shown separately in Figs. S1, S2, and S3.

The best-fit values of aN (Fig. 2 B) and bN (Fig. 2 C) revealed the expected chain-length dependency of these two parameters. We also found that the values did not appear to depend on the sequence of the polypeptide, since peptides from the three different classes of the same length had comparable values of aN and bN (Fig. 2, B and C). We note here that this observation likely just reflects the choice of sampling model used (Flexible-Meccano), since it only takes sequence effects into account when modeling the local structural properties. Thus, in reality, one would expect that different peptides of the same length could have different compactions depending on their sequence composition (28). As the goal here, however, is to “translate” between Rg and Rh, we focus just on sampling different levels of compaction for proteins of different lengths.

Based on these data and the finding that the parameters aN and bN appeared to depend systematically on N, we aimed to derive a simple relationship to predict Rh from Rg and N. As a starting point for finding such a relationship, we used the theoretically and empirically justified scaling laws (Eq. 5; Table 2). By also assuming the empirically observed linear relationship (Fig. 2 A; Eq. 6) and making simplifying assumptions, we obtained the following expression (see also Supporting Material and Eq. S6), which we find describes the entire dataset with sufficiently high accuracy:

$$\frac{R_g}{R_h}(N, R_g) = \frac{\alpha_1 \left(R_g - \alpha_2 N^{0.33}\right)}{N^{0.60} - N^{0.33}} + \alpha_3. \tag{7}$$

We subsequently fitted the three parameters in Eq. 7 globally to the full set of Rg and Rh data for the 30 peptides. As Rh is averaged as ⟨Rh⁻¹⟩, we fitted (by the least-squares approach) a form of Eq. 7 expressing Rh⁻¹ as a function of Rg, N, and the three parameters and obtained the best-fit parameters α1 = (0.216 ± 0.001) Å⁻¹, α2 = (4.06 ± 0.02) Å, and α3 = (0.821 ± 0.002).
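A minimal sketch of how the fitted relationship could be applied follows. It assumes that Eq. 7 has the form Rg/Rh = α1(Rg − α2·N^0.33)/(N^0.60 − N^0.33) + α3, with Rg in Å and the best-fit parameters quoted above. The function names, the example Rg values, and the chain length are hypothetical, and the ensemble average is taken as the inverse of the mean of Rh⁻¹ with equal weights.

```python
ALPHA_1 = 0.216  # 1/Angstrom
ALPHA_2 = 4.06   # Angstrom
ALPHA_3 = 0.821  # dimensionless


def predicted_ratio(rg, n_residues):
    """Rg/Rh for one conformation from its C-alpha Rg (A) and chain length."""
    compact = n_residues ** 0.33
    expanded = n_residues ** 0.60
    return ALPHA_1 * (rg - ALPHA_2 * compact) / (expanded - compact) + ALPHA_3


def predicted_rh(rg, n_residues):
    """Rh (A) of one conformation from its Rg and chain length."""
    return rg / predicted_ratio(rg, n_residues)


def predicted_ensemble_rh(rg_values, n_residues):
    """Ensemble Rh as the inverse of the equally weighted mean of 1/Rh."""
    inverse_mean = sum(
        1.0 / predicted_rh(rg, n_residues) for rg in rg_values
    ) / len(rg_values)
    return 1.0 / inverse_mean


# Hypothetical ensemble of a 100-residue chain (Rg values in Angstrom).
rh_single = predicted_rh(30.0, 100)
rh_ensemble = predicted_ensemble_rh([22.0, 30.0, 38.0], 100)
print(f"single: {rh_single:.1f} A, ensemble: {rh_ensemble:.1f} A")
```

In this form the ratio reduces to α3 when Rg equals α2·N^0.33, that is, at the compact end of the range, which is close to the value expected for a sphere.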

As a test of the robustness of the calculations and the dependency of the types of peptides we used in the fit, we also performed individual fits to the three peptide sets (poly-valine, IDPs, and IDP-like polymers). The results of the three fits were very similar, as evidenced, e.g., by the very similar models obtained for short, medium, and long chain lengths (Fig. S4). As a consistency check, we also compared the Rh values calculated using HYDROPRO with those calculated directly using the Kirkwood formula (Eq. 2) using the pairwise distances between the backbone Cα atoms (Fig. S5). As expected, the values derived by HYDROPRO, which take solvation effects into account, were ∼19% larger than those calculated directly from the atomic positions. Nevertheless, the two are surprisingly strongly correlated, suggesting that one may also estimate Rh using Eq. 2 (Fig. S5).

To visualize the quality of the global fit to Eq. 7, we used the equation to predict the entire set of Rh values from the Cα Rg data and chain lengths, and compared these values to those obtained directly using HYDROPRO. The results showed a very good relationship (Fig. 3 A), with a Pearson correlation coefficient of 0.99, an overall root mean-square deviation between the two values of 2 Å, and an average relative error in the predicted Rh of 3% and between 1.7 and 5.9% for the individual peptides. The average signed error is 0.4%, varying from −3.0 to 4.8% for the individual peptides, demonstrating that on average the equation provides an almost unbiased estimate of Rh.
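The summary statistics mentioned here can be computed as sketched below. The exact definitions used for the published numbers are not spelled out in the text, so the ones here (Pearson correlation, root mean-square deviation, and mean unsigned and signed relative errors with respect to the reference values) are assumptions, as are the function names and the toy data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rmsd(predicted, reference):
    """Root mean-square deviation between predicted and reference values."""
    n = len(predicted)
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(predicted, reference)) / n)


def mean_relative_error(predicted, reference, signed=False):
    """Mean of (predicted - reference) / reference, unsigned unless signed=True."""
    errors = [(p - r) / r for p, r in zip(predicted, reference)]
    if not signed:
        errors = [abs(e) for e in errors]
    return sum(errors) / len(errors)


# Toy data: predictions that are uniformly 10% too large.
reference_rh = [10.0, 20.0, 30.0]
predicted_rh_values = [11.0, 22.0, 33.0]
r_value = pearson_r(predicted_rh_values, reference_rh)
rmsd_value = rmsd(predicted_rh_values, reference_rh)
unsigned_error = mean_relative_error(predicted_rh_values, reference_rh)
signed_error = mean_relative_error(predicted_rh_values, reference_rh, signed=True)
```

The toy data show why the signed error is reported alongside the correlation: a uniform bias leaves the correlation coefficient at 1 but is visible in the signed error.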

Figure 3.


Quality of model for predicting the hydrodynamic radius. We assessed the quality of the global model (Eq. 7) by back-calculating Rh from Rg and N, and then compared the resulting values to those obtained by HYDROPRO. (A) The results show a strong correlation across the entire range of levels of compactions and chain lengths, with (B) a small increase in error (unsigned error in black) and a bias (signed error in red) toward overestimating Rh for the longest peptides (see also main text). Error bars represent the standard deviation. The differently shaped symbols represent the different datasets: squares for poly-valine, circles for IDP-like, and triangles for IDPs.

We also analyzed to what extent the errors depended on the chain length. The results showed that the relative error was mostly constant for chain lengths between 20 and 300 residues (relative error 1–3%) (Fig. 3 B). For the two longest peptides (400 and 450 residues), for which the hydrodynamic properties were calculated using a coarser-grained model (see Methods), the errors were slightly larger (5–6%) with Eq. 7 generally overestimating Rh.

To test whether the differences in model used might be the cause of this observation, we also compared Rh calculated using the more detailed (atom-based) calculations with the coarser (residue-based) model for the peptides (N ≤ 300) where both calculations were possible (Fig. S6). The results suggest that the residue-based method underestimates Rh for the longest peptides, suggesting that the apparent overestimation from Eq. 7 compared to HYDROPRO for the longest peptides (Fig. 3) might in part be explained by the HYDROPRO values being underestimated in the residue-based model.

We also examined whether particular shapes of the conformers caused systematic effects on the error when using Eq. 7 to predict the Rh value. The asphericity (Δ) and prolateness (S) parameters have previously been used to describe the shapes of both folded and unfolded protein chains (54), and so we calculated these values for each of the conformations. In particular, we correlated the error of the predicted Rh values (Eq. 7 versus HYDROPRO) with the asphericity and prolateness (Fig. S7) and found a weak correlation between the error and these parameters.

All of the calculations described above are based on conformations of disordered protein structures that were generated by Flexible-Meccano. Because this model only takes steric repulsion and local structural preferences into account, we wanted to examine whether the observed relationship between Rg and Rh also holds for conformations generated by more realistic energy functions. Thus, we extracted ∼100 conformations from two previously published long MD simulations of the disordered apo N-terminal zinc-binding domain of HIV1 integrase and α-synuclein (20). These simulations were based on the CHARMM22 force field in conjunction with the TIP4P-D water model, a combination that has been shown to provide a relatively realistic description of IDPs (20). We thus compared the Rh values calculated from Eq. 7 and HYDROPRO, and the results (Fig. S8) show that for these conformations also there is very good agreement between the two. Thus, the model that we obtained appears to be broadly applicable, and we conclude that overall, Eq. 7 provides a sufficiently accurate estimate of Rh, with an accuracy comparable to that inherent in using HYDROPRO (41).

Conclusions

IDPs are generally characterized by their lack of a well-defined secondary and tertiary structure and a broad distribution of conformations. Depending on the overall amino acid composition and sequence patterns, IDPs may also differ substantially in their compaction (28), which may, in turn, have important consequences for function and biophysical properties (55). Molecular simulations and modeling offer a unique opportunity to provide a link between sequence, structural properties, and function, but experimental validation is still required. Here, we provide a simple, general, fast, and accurate approach to calculate the Rh for large ensembles of disordered proteins from their Rg and chain length, N, and thereby enable comparison of computationally generated conformational ensembles against experimental values from, e.g., PFG NMR or SEC measurements. The model that we derived should also be useful when constraining, e.g., distributions of Rg using measurements of the average values of Rg and Rh (39). Future studies could also explore whether the relationship may be used for globular proteins with IDRs. The expression may also potentially be used in methods for restraining simulations using experimental data (12) and to exploit more generally the different averaging properties of SAXS, PFG NMR, and SEC experiments that depend on both the level of expansion and the shape of the disordered conformations.

Author Contributions

K.L.-L. conceived the idea; M.N. and E.P. performed the simulations and calculations; M.N. and K.L.-L. performed the fitting analyses; E.P., B.B.K., and K.L.-L. designed the research; M.N., B.B.K., E.P., and K.L.-L. analyzed the data and wrote the manuscript.

Acknowledgments

We acknowledge D. E. Shaw Research for sharing their MD simulations of disordered proteins.

This research was supported by a Hallas-Møller stipend from the Novo Nordisk Foundation (to K.L.-L.). E.P.’s group is currently supported by the Center of Excellence in Autophagy, Recycling and Disease (CARD), funded by the Danish National Research Foundation, and B.B.K.’s group is supported by the Novo Nordisk Foundation and The Danish National Research Foundation (8481-00344).

Editor: Rohit Pappu.

Footnotes

Supporting Discussion, eight figures, and two tables are available at http://www.biophysj.org/biophysj/supplemental/S0006-3495(17)30692-6.

Supporting Material

Document S1. Supporting Discussion, Figs. S1–S8, and Tables S1 and S2
Document S2. Article plus Supporting Material

References

  • 1. Yoon M.-K., Mitrea D.M., Kriwacki R.W. Cell cycle regulation by the intrinsically disordered proteins p21 and p27. Biochem. Soc. Trans. 2012;40:981–988. doi: 10.1042/BST20120092.
  • 2. Zhang Z., Boskovic Z., Tjian R. Chemical perturbation of an intrinsically disordered region of TFIID distinguishes two modes of transcription initiation. eLife. 2015;4:e07777. doi: 10.7554/eLife.07777.
  • 3. Haxholm G.W., Nikolajsen L.F., Kragelund B.B. Intrinsically disordered cytoplasmic domains of two cytokine receptors mediate conserved interactions with membranes. Biochem. J. 2015;468:495–506. doi: 10.1042/BJ20141243.
  • 4. Bugge K., Papaleo E., Kragelund B.B. A combined computational and structural model of the full-length human prolactin receptor. Nat. Commun. 2016;7:11578. doi: 10.1038/ncomms11578.
  • 5. Wright P.E., Dyson H.J. Intrinsically disordered proteins in cellular signalling and regulation. Nat. Rev. Mol. Cell Biol. 2015;16:18–29. doi: 10.1038/nrm3920.
  • 6. Babu M.M. The contribution of intrinsically disordered regions to protein function, cellular complexity, and human disease. Biochem. Soc. Trans. 2016;44:1185–1200. doi: 10.1042/BST20160172.
  • 7. Aznauryan M., Delgado L., Schuler B. Comprehensive structural and dynamical view of an unfolded protein from the combination of single-molecule FRET, NMR, and SAXS. Proc. Natl. Acad. Sci. USA. 2016;113:E5389–E5398. doi: 10.1073/pnas.1607193113.
  • 8. Sibille N., Bernadó P. Structural characterization of intrinsically disordered proteins by the combined use of NMR and SAXS. Biochem. Soc. Trans. 2012;40:955–962. doi: 10.1042/BST20120149.
  • 9. Lindorff-Larsen K., Kristjansdottir S., Vendruscolo M. Determination of an ensemble of structures representing the denatured state of the bovine acyl-coenzyme A binding protein. J. Am. Chem. Soc. 2004;126:3291–3299. doi: 10.1021/ja039250g.
  • 10. Lindorff-Larsen K., Trbovic N., Shaw D.E. Structure and dynamics of an unfolded protein examined by molecular dynamics simulation. J. Am. Chem. Soc. 2012;134:3787–3791. doi: 10.1021/ja209931w.
  • 11. Jensen M.R., Zweckstetter M., Blackledge M. Exploring free-energy landscapes of intrinsically disordered proteins at atomic resolution using NMR spectroscopy. Chem. Rev. 2014;114:6632–6660. doi: 10.1021/cr400688u.
  • 12. Marsh J.A., Forman-Kay J.D. Structure and disorder in an unfolded state under nondenaturing conditions from ensemble models consistent with a large number of experimental restraints. J. Mol. Biol. 2009;391:359–374. doi: 10.1016/j.jmb.2009.06.001.
  • 13. Lindorff-Larsen K., Maragakis P., Shaw D.E. Systematic validation of protein force fields against experimental data. PLoS One. 2012;7:e32131. doi: 10.1371/journal.pone.0032131.
  • 14. Piana S., Klepeis J.L., Shaw D.E. Assessing the accuracy of physical models used in protein-folding simulations: quantitative evidence from long molecular dynamics simulations. Curr. Opin. Struct. Biol. 2014;24:98–105. doi: 10.1016/j.sbi.2013.12.006.
  • 15. Best R.B., Zheng W., Mittal J. Balanced protein-water interactions improve properties of disordered proteins and non-specific protein association. J. Chem. Theory Comput. 2014;10:5113–5124. doi: 10.1021/ct500569b.
  • 16. Rauscher S., Gapsys V., Grubmüller H. Structural ensembles of intrinsically disordered proteins depend strongly on force field: a comparison to experiment. J. Chem. Theory Comput. 2015;11:5513–5524. doi: 10.1021/acs.jctc.5b00736.
  • 17. Palazzesi F., Prakash M.K., Barducci A. Accuracy of current all-atom force-fields in modeling protein disordered states. J. Chem. Theory Comput. 2015;11:2–7. doi: 10.1021/ct500718s.
  • 18. Do T.N., Choy W.Y., Karttunen M. Accelerating the conformational sampling of intrinsically disordered proteins. J. Chem. Theory Comput. 2014;10:5081–5094. doi: 10.1021/ct5004803.
  • 19. Zerze G.H., Miller C.M., Mittal J. Free energy surface of an intrinsically disordered protein: comparison between temperature replica exchange molecular dynamics and bias-exchange metadynamics. J. Chem. Theory Comput. 2015;11:2776–2782. doi: 10.1021/acs.jctc.5b00047.
  • 20. Piana S., Donchev A.G., Shaw D.E. Water dispersion interactions strongly influence simulated structural properties of disordered protein states. J. Phys. Chem. B. 2015;119:5113–5123. doi: 10.1021/jp508971m.
  • 21. Nerenberg P.S., Jo B., Head-Gordon T. Optimizing solute-water van der Waals interactions to reproduce solvation free energies. J. Phys. Chem. B. 2012;116:4524–4534. doi: 10.1021/jp2118373.
  • 22. Mercadante D., Milles S., Gräter F. Kirkwood-Buff approach rescues overcollapse of a disordered protein in canonical protein force fields. J. Phys. Chem. B. 2015;119:7975–7984. doi: 10.1021/acs.jpcb.5b03440.
  • 23. Huang J., Rauscher S., MacKerell A.D., Jr. CHARMM36m: an improved force field for folded and intrinsically disordered proteins. Nat. Methods. 2017;14:71–73. doi: 10.1038/nmeth.4067.
  • 24. Ozenne V., Bauer F., Blackledge M. Flexible-meccano: a tool for the generation of explicit ensemble descriptions of intrinsically disordered proteins and their associated experimental observables. Bioinformatics. 2012;28:1463–1470. doi: 10.1093/bioinformatics/bts172.
  • 25. Pietrucci F., Mollica L., Blackledge M. Mapping the native conformational ensemble of proteins from a combination of simulations and experiments: new insight into the src-SH3 domain. J. Phys. Chem. Lett. 2013;4:1943–1948. doi: 10.1021/jz4007806.
  • 26. Jha A.K., Colubri A., Sosnick T.R. Statistical coil model of the unfolded state: resolving the reconciliation problem. Proc. Natl. Acad. Sci. USA. 2005;102:13099–13104. doi: 10.1073/pnas.0506078102.
  • 27. Bernadó P., Blackledge M. A self-consistent description of the conformational behavior of chemically denatured proteins from NMR and small angle scattering. Biophys. J. 2009;97:2839–2845. doi: 10.1016/j.bpj.2009.08.044.
  • 28. Das R.K., Ruff K.M., Pappu R.V. Relating sequence encoded information to form and function of intrinsically disordered proteins. Curr. Opin. Struct. Biol. 2015;32:102–112. doi: 10.1016/j.sbi.2015.03.008.
  • 29. Tran H.T., Wang X., Pappu R.V. Reconciling observations of sequence-specific conformational propensities with the generic polymeric behavior of denatured proteins. Biochemistry. 2005;44:11369–11380. doi: 10.1021/bi050196l.
  • 30. Ding F., Jha R.K., Dokholyan N.V. Scaling behavior and structure of denatured proteins. Structure. 2005;13:1047–1054. doi: 10.1016/j.str.2005.04.009.
  • 31. Wang Z., Plaxco K.W., Makarov D.E. Influence of local and residual structures on the scaling behavior and dimensions of unfolded proteins. Biopolymers. 2007;86:321–328. doi: 10.1002/bip.20747.
  • 32. Fitzkee N.C., Rose G.D. Reassessing random-coil statistics in unfolded proteins. Proc. Natl. Acad. Sci. USA. 2004;101:12497–12502. doi: 10.1073/pnas.0404236101.
  • 33. Bernadó P., Mylonas E., Svergun D.I. Structural characterization of flexible proteins using small-angle x-ray scattering. J. Am. Chem. Soc. 2007;129:5656–5664. doi: 10.1021/ja069124n.
  • 34. Wilkins D.K., Grimshaw S.B., Smith L.J. Hydrodynamic radii of native and denatured proteins measured by pulse field gradient NMR techniques. Biochemistry. 1999;38:16424–16431. doi: 10.1021/bi991765q.
  • 35. Wang Y., Teraoka I., Hassager O. A theoretical study of the separation principle in size exclusion chromatography. Macromolecules. 2010;43:1651–1659.
  • 36. Nettels D., Müller-Späth S., Schuler B. Single-molecule spectroscopy of the temperature-induced collapse of unfolded proteins. Proc. Natl. Acad. Sci. USA. 2009;106:20740–20745. doi: 10.1073/pnas.0900622106.
  • 37. Baldwin A.J., Christodoulou J., Lippens G. Contribution of rotational diffusion to pulsed field gradient diffusion measurements. J. Chem. Phys. 2007;127:114505. doi: 10.1063/1.2759211.
  • 38. Einstein A. Über die von der molekularkinetischen Theorie der Wärme geforderte Bewegung von in ruhenden Flüssigkeiten suspendierten Teilchen. Ann. Phys. 1905;17:549–560.
  • 39. Choy W.-Y., Mulder F.A., Kay L.E. Distribution of molecular size within an unfolded state ensemble using small-angle x-ray scattering and pulse field gradient NMR techniques. J. Mol. Biol. 2002;316:101–112. doi: 10.1006/jmbi.2001.5328.
  • 40. Ortega A., Amorós D., García De La Torre J. Prediction of hydrodynamic and other solution properties of rigid proteins from atomic- and residue-level models. Biophys. J. 2011;101:892–898. doi: 10.1016/j.bpj.2011.06.046.
  • 41. García De La Torre J., Huertas M.L., Carrasco B. Calculation of hydrodynamic properties of globular proteins from their atomic-level structure. Biophys. J. 2000;78:719–730. doi: 10.1016/S0006-3495(00)76630-6.
  • 42. Sickmeier M., Hamilton J.A., Dunker A.K. DisProt: the database of disordered proteins. Nucleic Acids Res. 2007;35:D786–D793. doi: 10.1093/nar/gkl893.
  • 43. Marsh J.A., Forman-Kay J.D. Sequence determinants of compaction in intrinsically disordered proteins. Biophys. J. 2010;98:2383–2390. doi: 10.1016/j.bpj.2010.02.006.
  • 44. Piana S., Lindorff-Larsen K., Shaw D.E. How robust are protein folding simulations with respect to force field parameterization? Biophys. J. 2011;100:L47–L49. doi: 10.1016/j.bpj.2011.03.051.
  • 45. Rotkiewicz P., Skolnick J. Fast procedure for reconstruction of full-atom protein models from reduced representations. J. Comput. Chem. 2008;29:1460–1465. doi: 10.1002/jcc.20906.
  • 46. Kirkwood J.G. The general theory of irreversible processes in solutions of macromolecules. J. Polym. Sci., Polym. Phys. Ed. 1954;12:1–14.
  • 47. Aronovitz J.A., Nelson D.R. Universal features of polymer shapes. J. Phys. 1986;47:1445–1456.
  • 48. Burchard W., Schmidt M., Stockmayer W.H. Information on polydispersity and branching from combined quasi-elastic and integrated scattering. Macromolecules. 1980;13:1265–1272.
  • 49. Oono Y., Kohmoto M. Renormalization group theory of transport properties of polymer solutions. I. Dilute solutions. J. Chem. Phys. 1983;78:520–528.
  • 50. Flory P.J. Principles of Polymer Chemistry. Cornell University Press; Ithaca, NY: 1953.
  • 51. Le Guillou J.C., Zinn-Justin J. Critical exponents for the n-vector model in three dimensions from field theory. Phys. Rev. Lett. 1977;39:95–98.
  • 52. Uversky V.N., Gillespie J.R., Fink A.L. Why are “natively unfolded” proteins unstructured under physiologic conditions? Proteins. 2000;41:415–427. doi: 10.1002/1097-0134(20001115)41:3<415::aid-prot130>3.0.co;2-7.
  • 53. Cossio P., Trovato A., Laio A. Exploring the universe of protein structures beyond the Protein Data Bank. PLOS Comput. Biol. 2010;6:e1000957. doi: 10.1371/journal.pcbi.1000957.
  • 54. Dima R.I., Thirumalai D. Asymmetry in the shapes of folded and denatured states of proteins. J. Phys. Chem. B. 2004;108:6564–6570.
  • 55. Bah A., Vernon R.M., Forman-Kay J.D. Folding of an intrinsically disordered protein by phosphorylation as a regulatory switch. Nature. 2015;519:106–109. doi: 10.1038/nature13999.
  • 56. Skolnick J., Kolinski A., Ortiz A.R. MONSSTER: a method for folding globular proteins with a small number of distance restraints. J. Mol. Biol. 1997;265:217–241. doi: 10.1006/jmbi.1996.0720.
  • 57. Kohn J.E., Millett I.S., Plaxco K.W. Random-coil behavior and the dimensions of chemically unfolded proteins. Proc. Natl. Acad. Sci. USA. 2004;101:12491–12496. doi: 10.1073/pnas.0403643101.

Articles from Biophysical Journal are provided here courtesy of The Biophysical Society
