Abstract
The intrinsically disordered 4E-BP2 protein regulates mRNA cap-dependent translation through interaction with the predominantly folded eukaryotic initiation factor 4E (eIF4E). Phosphorylation of 4E-BP2 dramatically reduces the level of eIF4E binding, in part by stabilizing a binding-incompatible folded domain. Here, we used a Rosetta-based sampling algorithm optimized for IDRs to generate initial ensembles for two phospho forms of 4E-BP2, non- and 5-fold phosphorylated (NP and 5P, respectively), with the 5P folded domain flanked by N- and C-terminal IDRs (N-IDR and C-IDR, respectively). We then applied an integrative Bayesian approach to obtain NP and 5P conformational ensembles that agree with experimental data from nuclear magnetic resonance, small-angle X-ray scattering, and single-molecule Förster resonance energy transfer (smFRET). For the NP state, inter-residue distance scaling and 2D maps revealed the role of charge segregation and pi interactions in driving contacts between distal regions of the chain (~70 residues apart). The 5P ensemble shows prominent contacts of the N-IDR region with the two phosphosites in the folded domain, pT37 and pT46, and, to a lesser extent, delocalized interactions with the C-IDR region. Agglomerative hierarchical clustering led to partitioning of each of the two ensembles into four clusters with different global dimensions and contact maps. This helped delineate an NP cluster that, based on our smFRET data, is compatible with the eIF4E-bound state. 5P clusters were differentiated by interactions of C-IDR with the folded domain and of the N-IDR with the two phosphosites in the folded domain. Our study provides both a better visualization of fundamental structural poses of 4E-BP2 and a set of falsifiable insights on intrachain interactions that bias folding and binding of this protein.
INTRODUCTION
Proteins are inherently dynamic and adopt conformations that range from very stable to completely disordered.1,2 An extreme case of protein polymorphism, intrinsically disordered proteins (IDPs) have been found to perform an increasingly diverse range of cellular functions, despite (or perhaps due to) lacking stable secondary and tertiary structure.3 Statistics of the human proteome revealed that nearly 60% of proteins contain stretches of greater than 30 residues of intrinsic disorder and ~5% of proteins are completely disordered.4 IDPs are highly involved in cellular signaling and regulation, function as hubs of protein–protein interaction (PPI) networks,5,6 show unexpected mechanisms of PPIs,7 and are drivers of protein phase separation.8,9 They are particularly sensitive to posttranslational modifications (PTMs), which can result in either stabilization or destabilization of transient secondary structures10,11 and induce order-to-disorder12 or disorder-to-order transitions.13 IDPs are enriched in many neurodegenerative and cancer pathways,14 but they are challenging therapeutic targets due to the lack of stable binding pockets for small molecules.15
Eukaryotic translation is a highly regulated process, with most mRNAs requiring interaction with the eukaryotic translation initiation factor 4E (eIF4E) to be translated.13,16–18 The eIF4F complex is formed by assembly of eIF4E and eIF4G, which is subsequently recruited to the 40S subunit of the ribosome.17 The assembly of the eIF4F complex is inhibited by the intrinsically disordered 4E-BPs (eIF4E binding proteins), which compete with eIF4G for an overlapping surface of eIF4E.19
The neuronal-specific 4E-BP isoform, 4E-BP2, modulates neuroplasticity, impacts learning and memory formation,20,21 and is associated with autism spectrum disorders.22 4E-BP2 binds eIF4E at both the canonical 54YDRKFLLDRR63 and a secondary 78IPGVT82 binding site; the canonical motif binds to eIF4E in a helical motif on the same convex surface as eIF4G,19,23 while the secondary binding site is more dynamic and binds to the lateral surface of eIF4E.24
Hierarchical phosphorylation of 4E-BP2 at residues T37, T46, T70, S65, and S83 results in the five-phosphorylated (5P) state and decreases the affinity of the 4E-BP2:eIF4E complex by ~4000-fold compared to the nonphosphorylated (NP) state, via the formation of a 4-stranded β-sheet structure from residues 18–62.13,25 The initial two phosphorylations at residues T37 and T46 result in a ~100-fold decrease in eIF4E affinity, while the additional phosphorylations in the C-terminal intrinsically disordered region (C-IDR) cause a further ~40-fold decrease.13,25 Because of this, interactions with the C-IDR containing the additional three phosphosites were proposed to enhance the stability of the folded β-sheet structure (which would reduce binding). In order to support this hypothesis or otherwise explain the enhanced stability and reduced 4E binding, structural models of full-length 4E-BP2 in both phosphostates are required.
The free energy landscapes of IDPs are typically shallow but not featureless, with local energy minima corresponding to transient secondary and tertiary structural biases which confer functional attributes.26–29 The potentially vast number of relevant structures makes the experimental and computational characterization of IDPs difficult. Modeling them necessitates a framework of sufficient complexity to capture relevant features while remaining small enough to be computationally tractable. IDPs are often modeled as conformational ensembles, which are sets of 3D structures (having x,y,z coordinates of each atom) with associated weights.30–32 Data from nuclear magnetic resonance (NMR), small-angle X-ray scattering (SAXS), and single-molecule Förster resonance energy transfer (smFRET) can be used to refine a starting pool of conformations by imposing agreement with the experimental data.33,34 Different experiments are sensitive to different length scales and time scales, with different degrees of time-averaging and ensemble-averaging. This is a heavily underdetermined inverse problem, as the experimental restraints available are vastly insufficient to determine a unique conformational ensemble.
Several approaches have been applied to generate disordered conformational ensembles, such as Trajectory Directed Ensemble Sampling (TraDES),35,36 flexible-meccano,37 IDPConformerGenerator38 and FastFloppyTail (FFT).39 TraDES generates conformers by first building the backbone from Φ/Ψ angles sampled from a nonredundant set of structures from the PDB, with geometric restraints and a Lennard-Jones potential to avoid steric clashes. Flexible-meccano samples amino acid-specific Φ/Ψ potential wells from a compilation of non-secondary structure (loop) elements derived from the PDB. IDPConformerGenerator samples ϕ, ψ, and ω torsion angles from the PDB for various fragment lengths and with different secondary structural biases, including based on experimental NMR chemical shifts. FFT is a PyRosetta based method that samples three-residue fragments from the PDB with a bias toward loop regions.
Optimization methods such as ENSEMBLE,40 Extended Experimental Inferential Structure Determination (X-EISD),41 and Bayesian Maximum Entropy (BME)42 reweight or select a subset of the initial conformational ensemble so that back-calculated biophysical observables match their experimental counterparts. The ENSEMBLE method uses pseudoenergy terms to quantify agreement between computation and experiment, and deviation from the initial ensemble is not penalized. In contrast, the X-EISD and BME methods use Bayesian frameworks that account for uncertainties in both experimental data and back-calculators. For example, BME treats the experimental data as time/ensemble averages and reweights the prior ensemble such that it agrees with experiments while maximizing the relative Shannon entropy. In this way, confidence is given to both the prior ensemble and the experimental data to prevent overfitting.
Arranging conformations into groups that share structural similarities, i.e., clusters, can lead to better visualization of heterogeneous IDP ensembles and help formulate structure–function relationships.43 The high degree of conformational disorder makes traditional similarity measures that require atomic superimposition of conformers ill-suited for IDPs.44 Conversely, a similarity criterion based on inter-residue α-carbon (Cα) Euclidean distance can be applied in agglomerative hierarchical clustering, which was shown to be a useful tool to characterize the heterogeneity of IDPs.45
In this work, we applied the BME method42 to optimize 4E-BP2 ensembles in both NP and 5P states that were generated by FFT.39 Agreement with experimental data, namely the SAXS curve, two mean FRET efficiencies derived from smFRET histograms, and Cα/Cβ chemical shifts (CS) for most of the chain (excluding residues within the folded domain in the 5P state), was imposed in the optimization procedure. An independent data set, the Paramagnetic Relaxation Enhancements (PREs) at several positions distributed along the 120-residue chain, was reserved for cross-validation and tuning the hyperparameters of the BME optimization.
Structure-based clustering suggests that NP 4E-BP2 predominantly samples four overall structural states. One of these clusters shares structural features with the eIF4E-bound state, indicating that some conformations contain preformed features that may enhance the probability of complex formation upon collision with eIF4E. Contact maps of the 5P ensemble revealed pronounced interactions of the folded-domain phosphorylation sites pT37 and pT46 with the N-IDR (residues 1–17), while contacts with the C-IDR were less frequent and more delocalized. 5P clustering analysis led to the separation of these interactions into four different clusters. This work describes highly probable structural poses and provides novel insights into the structure–function relationship of a fascinating disordered protein that regulates translation initiation. Importantly, it also provides specific ideas valuable for designing experiments to test the validity of these insights.
METHODS
Conformer Generation.
4E-BP2 conformers were generated using the FastFloppyTail (FFT) algorithm, an optimized version of the Rosetta-based FloppyTail program which is ~10 times faster and has enhanced accuracy via an improved fragment selection scheme39 (see Supporting Information, section 1.1). FFT has been applied to model interdomain linkers46 and several IDPs such as α-synuclein, Sic1 and the unfolded state of the drkN SH3 domain.39
For NP 4E-BP2, we generated 20,000 conformers using FFT and a disorder prediction file created by the PsiPred DISOPRED3 web server.47 For partially folded 5P 4E-BP2, we used FFT to sample the N- and C-terminal IDRs (residues 1–17 and 63–121, respectively) with PsiPred DISOPRED3 disorder predictions. The folded domain (residues 18–62) consisted of the 20 lowest-energy NMR-derived structures,13 which were held fixed during FFT sampling of the IDRs. Each of the 20 models (PDB ID: 2MX4) was used with equal weight in generating the 5P 4E-BP2 ensemble (1000 structures per folded domain for a 20,000-conformer ensemble). The starting 5P structures had N- and C-terminal IDRs concatenated to the folded domain by using the “bond” function between pairs of carbon atoms in PyMOL.48 To create ideal bond lengths and angles while avoiding steric clashes, the “Idealize” and “Relax” Rosetta algorithms49,50 were applied to the structures.
FRET Calculations.
Back-calculated mean FRET efficiency values, 〈E〉, of the IDP ensembles were computed via accessible volume simulations51,52 using the AvTraj53 and MDTraj54 Python packages. We used dye parameters for the Alexa488 and Alexa647 dye-linker systems documented previously.55,56 The back-calculation uncertainty, $\sigma_{\mathrm{calc}}$, was taken as the difference between the mean FRET efficiencies of an ensemble computed using the lower and upper bounds of the Förster radius, respectively,34 and was computed for all ensembles. For use in BME, we added this uncertainty in quadrature to the experimental uncertainty, $\sigma_{\mathrm{exp}}$, giving the combined uncertainty $\sigma_{\mathrm{tot}} = \sqrt{\sigma_{\mathrm{calc}}^2 + \sigma_{\mathrm{exp}}^2}$ used for BME calculations. This combined uncertainty was also used to calculate the z-test statistic as the absolute value of the difference between back-calculated and experimental mean FRET efficiency values divided by the combined uncertainty, $z = |\langle E\rangle_{\mathrm{calc}} - \langle E\rangle_{\mathrm{exp}}|/\sigma_{\mathrm{tot}}$.
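The uncertainty handling described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the study; the function names and the numerical values in the usage note are hypothetical.

```python
import math


def back_calc_uncertainty(e_mean_r0_low, e_mean_r0_high):
    """Spread of the ensemble-mean FRET efficiency between the two
    Forster-radius bounds (hypothetical helper name)."""
    return abs(e_mean_r0_high - e_mean_r0_low)


def combined_uncertainty(sigma_calc, sigma_exp):
    """Add back-calculation and experimental uncertainties in quadrature."""
    return math.hypot(sigma_calc, sigma_exp)


def fret_z_score(e_calc, e_exp, sigma_calc, sigma_exp):
    """|<E>_calc - <E>_exp| divided by the combined uncertainty."""
    return abs(e_calc - e_exp) / combined_uncertainty(sigma_calc, sigma_exp)
```

For instance, with made-up uncertainties of 0.03 and 0.04, the combined uncertainty is 0.05, and a 0.10 difference between back-calculated and measured efficiencies corresponds to z = 2.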
SAXS Data and Calculations.
The cloning, expression, purification, and phosphorylation of 4E-BP2 were performed as previously described.13,19,25 A SAXSpace instrument with an ASX autosampler (Anton-Paar GmbH, Austria) was used to conduct small-angle X-ray scattering experiments. The SAXSpace was equipped with a long fine focus glass sealed copper tube using line collimation focus (40 kV/50 mA, Kα = 0.1542 nm), TCStage 150 sample holder and a 1D CMOS Mythen2 R 1K detector. 4E-BP2 protein samples at concentrations of 2–20 mg/mL were loaded into a 1 mm diameter quartz flow cell using the autosampler and six 10 min exposure frames were collected at 20 °C under vacuum. Data were corrected for background scattering using sample buffer alone analyzed under the same conditions. SAXStreat software (Anton-Paar GmbH, Austria) was used to define the origin of the scattering curve, correct image distortion, and convert the data to 1D scattering profiles. SAXSQuant (Anton-Paar GmbH, Austria) was then used to desmear the data.
The Pepsi-SAXS method57 with default solvation parameters (r0 = 1.61 Å, dρ = 0.0167 e/Å3) was used to back-calculate SAXS curves from IDP ensembles. Due to the unavailability of scattering factors for phosphorylated residues, the phosphosites in the 5P ensemble were treated as being unphosphorylated in Pepsi-SAXS calculations. Pepsi-SAXS is an efficient method that utilizes the multipole expansion scheme for scattering intensities and has been validated on more than 50 experimental SAXS scattering profiles. Only the experimental SAXS scattering intensity uncertainties were utilized in the BME optimization.
NMR Data and Calculations.
The ShiftX program58 was used to back-calculate secondary structure chemical shifts (CS) from the IDP ensembles. The ShiftX method can quickly compute backbone and side-chain 1H, 13C and 15N chemical shifts for a single ~100-residue conformer. All experimentally measured chemical shifts19,25 were employed in our BME calculations except in the 5P 4E-BP2 ensemble, where phosphosites (65, 70, and 83) and immediately subsequent residues (66, 71, and 84) were excluded because ShiftX cannot accurately predict shifts affected by phosphates. We also excluded 5P 4E-BP2 CS values assigned to residues within the folded domain (residues 19–61) since BME would not be able to refine the disordered conformer ensemble otherwise. Back-calculation uncertainties of 0.98 and 1.10 ppm were used for Cα and Cβ chemical shifts, respectively, in BME calculations.
To generate PRE data for NP 4E-BP2, we first generated single-cysteine mutant constructs using a cysteineless version with C35 and C73 mutated to serines. Single cysteines were then introduced at positions 35, 65, 73, 91, 110, and 121 in order to attach a paramagnetic spin label at these positions. 15N-labeled proteins were purified, and a TEMPOL-maleimide (Toronto Research Chemicals) spin label was covalently linked as previously described.25 Two matched samples were made for each protein with the spin label in either an oxidized or reduced state. Samples were oxidized or reduced by addition of either a 5-fold excess of TEMPOL (Toronto Research Chemicals) or 1 mM ascorbic acid, respectively. Prior to NMR experiments, the samples were buffer exchanged into a buffer containing 30 mM sodium phosphate, 100 mM sodium chloride, 1 mM EDTA, and 1 mM benzamidine, pH 6, using argon purged buffers to maintain the oxidation state of the spin label.
For all samples, sensitivity-enhanced HSQC experiments59 and unenhanced-NH-T2 experiments60 were recorded at 20 °C on an 800 MHz Bruker spectrometer equipped with a triple-resonance cryoprobe. Relaxation delays for the T2 experiment were 7, 9, 14, 20, 26, 33, 41, 49, 59, 70, 82, and 95 ms, with the 14 and 59 ms points repeated for error estimation. A comparison of the T2 data and the ratios of the oxidized and reduced samples revealed very similar trends. Though less rigorously quantitative, the peak intensities from the HSQC experiments were used as input for DEER-PREdict (see below) because the T2 data and HSQC shared highly similar trends. PRE data for NP 4E-BP2 are included in the Supporting Information, and the PRE data for 5P 4E-BP2 have been published previously.25
The DEER-PREdict program was used to back-calculate PRE intensity ratios.61 The parameters used were the same for both phosphoforms: total correlation time τt = 0.5 ns, spin label effective correlation time τC = 4 ns, total INEPT time td = 10 ms, transverse relaxation rate in the reduced state R2 = 6 Hz, and proton Larmor frequency ωH/2π = 800.14 MHz. PRE data points for which the spin-labeled residue and the residue experiencing the relaxation enhancement were both in the folded domain were excluded from the analysis, as they experienced little or no change. The metric chosen to quantify agreement is the root-mean-squared average over the root-mean-squared deviations between back-calculated and experimental PRE intensity ratios (PRE RMSD) (see Supporting Information, section 2.8).
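As an illustration of the PRE RMSD metric, the sketch below computes one RMSD per spin-label position and then takes the root-mean-square over positions. It assumes unweighted averages and skips residues without an experimental value; the exact definition is the one in Supporting Information, section 2.8, and the function names and data layout here are hypothetical.

```python
import math


def profile_rmsd(calc, exp):
    """RMSD between back-calculated and experimental intensity ratios for
    one spin-label position; residues with exp = None are skipped."""
    pairs = [(c, e) for c, e in zip(calc, exp) if e is not None]
    return math.sqrt(sum((c - e) ** 2 for c, e in pairs) / len(pairs))


def pre_rmsd(profiles):
    """Root-mean-square of the per-label RMSDs.

    profiles maps a spin-label position to a (calc, exp) pair of
    equally long lists of intensity ratios.
    """
    per_label = [profile_rmsd(calc, exp) for calc, exp in profiles.values()]
    return math.sqrt(sum(r ** 2 for r in per_label) / len(per_label))
```

Taking the root-mean-square over labels, rather than pooling all residues, gives every spin-label position the same influence regardless of how many peaks were resolved for it.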
Ensemble Refinement.
We used the BME method42 to refine the starting FFT conformational ensembles based on information supplied by experimental data. BME accounts for the uncertainty in estimating the confidence in the unrestrained FFT ensemble versus the experimental data by means of a tunable hyperparameter (θ). Given certain restraints, it holds that the most probable distribution compatible with the experimental data is the distribution of maximal entropy.47,62 As such, conformer weights are tuned to minimize the following objective function:
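$$\mathcal{L}(\omega_1, \ldots, \omega_M) = \frac{1}{2}\tilde{\chi}^2(\omega_1, \ldots, \omega_M) - \theta\, S_{\mathrm{rel}}(\omega_1, \ldots, \omega_M) \qquad (1)$$
where ωi is the weight for conformer i in the reweighted ensemble, M is the total number of conformers, θ is a hyperparameter which represents the degree of ensemble refinement, $\tilde{\chi}^2$ is the nonreduced χ-squared, and $S_{\mathrm{rel}} = -\sum_{i=1}^{M} \omega_i \ln(\omega_i/\omega_i^0)$ is the relative Shannon entropy with respect to the prior weights $\omega_i^0$. Lower values of θ result in greater agreement with experimental data but with greater deviation from the prior ensemble, which is quantified by a lower effective fraction of conformations in the posterior ensemble, Neff. Note that we refer to the reduced χ-squared (normalized by the number of degrees of freedom) without a tilde (χ2) and to the nonreduced variant with a tilde ($\tilde{\chi}^2$). For more details, see ref 42 and Supporting Information, section 2.1.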
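To make the terms of the objective concrete, the following sketch evaluates a BME-style objective for a given set of weights, assuming the standard form ½ × (nonreduced χ-squared) − θ × (relative entropy), with Neff taken as the exponential of the relative entropy. It only evaluates the function; the actual minimization is performed by the BME software, and all names below are illustrative.

```python
import math


def relative_entropy(w, w0):
    """S_rel = -sum_i w_i ln(w_i / w0_i); zero-weight conformers contribute 0."""
    return -sum(wi * math.log(wi / w0i) for wi, w0i in zip(w, w0) if wi > 0)


def n_eff(w, w0):
    """Effective fraction of conformers retained after reweighting."""
    return math.exp(relative_entropy(w, w0))


def chi2_nonreduced(w, calc, exp, sigma):
    """Nonreduced chi-squared; calc[i][k] is observable k of conformer i."""
    total = 0.0
    for k, (e, s) in enumerate(zip(exp, sigma)):
        avg = sum(wi * calc[i][k] for i, wi in enumerate(w))
        total += ((avg - e) / s) ** 2
    return total


def bme_objective(w, w0, calc, exp, sigma, theta):
    """0.5 * chi2 - theta * S_rel for normalized weights w and prior w0."""
    return 0.5 * chi2_nonreduced(w, calc, exp, sigma) - theta * relative_entropy(w, w0)
```

With uniform prior weights, keeping only half of the conformers at equal weight gives Neff = 0.5, which is the sense in which Neff measures how much of the prior pool survives refinement.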
In the absence of a clear minimum of the BME optimization curve, L-curve analysis was applied to find the “knee” point using the kneed package in Python.63 As such, the value of Neff for the optimized ensembles was assigned to the point of maximum curvature. The variance in Neff was estimated by taking the largest absolute difference among replicate ensembles that were generated by splitting the 20,000-conformer ensemble into four equally sized ensembles (see Supporting Information, Tables S7 and S8). This results in Neff uncertainty of ±0.03 and ±0.05 for the NP and 5P 4E-BP2 ensembles, respectively.
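The idea of a "knee" point can be illustrated with a simple geometric heuristic: after rescaling both axes to the unit interval, pick the point farthest from the straight line joining the two ends of the curve. This is a stand-in for illustration only and is not the algorithm implemented in the kneed package; the function name is hypothetical.

```python
def knee_index(x, y):
    """Index of the point farthest from the chord joining the curve ends.

    Both axes are rescaled to [0, 1] first so that the result does not
    depend on the units of x and y.
    """
    x_min, x_max = min(x), max(x)
    y_min, y_max = min(y), max(y)
    if x_max == x_min or y_max == y_min:
        raise ValueError("curve must vary along both axes")
    xn = [(v - x_min) / (x_max - x_min) for v in x]
    yn = [(v - y_min) / (y_max - y_min) for v in y]
    dx, dy = xn[-1] - xn[0], yn[-1] - yn[0]
    offset = xn[-1] * yn[0] - yn[-1] * xn[0]

    def chord_distance(i):
        # Proportional to the perpendicular distance from the chord
        return abs(dy * xn[i] - dx * yn[i] + offset)

    return max(range(len(x)), key=chord_distance)
```

For a curve that drops steeply and then flattens, the selected point sits where the slope changes most abruptly, which is where further refinement stops paying off.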
Upon refining ensembles with BME, the difficulty of fitting the mean FRET efficiency for NP 4E-BP2 labeled at residues 32 and 91 (〈E〉32–91) became apparent. Indeed, a large fraction of the prior ensemble must be discarded (Neff = 0.03) to obtain good agreement with the experimental averages. This behavior arises because the BME protocol minimizes the sum $\tilde{\chi}^2 = \tilde{\chi}^2_{\mathrm{SAXS}} + \tilde{\chi}^2_{\mathrm{CS}} + \tilde{\chi}^2_{\mathrm{FRET}}$, where each term in the sum is a nonreduced chi-squared. This means that experiments with many experimental data points, although not all independent (e.g., SAXS and CS), contribute much more to the total $\tilde{\chi}^2$ than smFRET. Hence, the optimization will be heavily biased toward reducing their $\tilde{\chi}^2$ values. To correct this, a hyperparameter (Ω) controlling the weight of $\tilde{\chi}^2_{\mathrm{FRET}}$ in the BME optimization was introduced, modifying $\tilde{\chi}^2$ to the following form: $\tilde{\chi}^2 = \tilde{\chi}^2_{\mathrm{SAXS}} + \tilde{\chi}^2_{\mathrm{CS}} + \Omega\,\tilde{\chi}^2_{\mathrm{FRET}}$, as implemented in our previous study.64 The hyperparameter Ω was tuned such that FRET was in good agreement with experimental values with negligible changes to the other restraints (see Supporting Information, Figure S10).
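The imbalance between data-rich and data-poor restraints, and the effect of up-weighting the FRET term, can be seen in a short numerical sketch. It assumes the total is a plain sum of nonreduced chi-squared terms with Ω multiplying only the FRET term; the function names and all numbers are hypothetical.

```python
def total_chi2(chi2_saxs, chi2_cs, chi2_fret, omega=1.0):
    """Sum of nonreduced chi-squared terms, with the FRET term scaled by omega."""
    return chi2_saxs + chi2_cs + omega * chi2_fret


def fret_share(chi2_saxs, chi2_cs, chi2_fret, omega=1.0):
    """Fraction of the total contributed by the (weighted) FRET term."""
    return omega * chi2_fret / total_chi2(chi2_saxs, chi2_cs, chi2_fret, omega)
```

With illustrative values of 300 (SAXS), 200 (CS), and 2 (FRET), the FRET term supplies under 1% of the total when Ω = 1, but about 17% when Ω = 50, so the optimizer can no longer ignore it.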
Due to inaccuracies in both prior ensemble and experimental data, it is not clear what θ value should be selected for the most probable ensemble that fits all restraints. To resolve this issue, PRE data were not integrated as a restraint and were used instead to determine an optimal Neff by choosing the “knee” point on the PRE RMSD curve that is uniformly sampled 1000 times for the full range of Neff after spline interpolation. For comparison of experimental and back-calculated PRE NMR data, we compared ratios of intensities of peaks in the oxidized and reduced samples to back-calculated data using DEER-PREdict;61 see above. We prefer comparing intensity ratios in contrast to a generally utilized strategy which converts PRE intensity ratios to distances.65 Such estimates are highly imprecise, and due to the required r−6 averaging, PRE distances act as a weak restraint where only a few conformers are needed to fit the data in order to achieve good agreement.66
Hierarchical Clustering.
The NP and 5P 4E-BP2 ensembles were divided into subensembles by hierarchical clustering with the Ward variance minimization algorithm.67 The distance metric for conformer (dis)similarity is computed as the Euclidean distance in the 7260-dimensional space where conformers are represented as vectors containing all nondegenerate pairwise inter-residue Cα–Cα distances. The distance Di,j between two conformers i and j is
$$D_{i,j} = \sqrt{\sum_{k=1}^{N(N-1)/2} \left(d_k^{(i)} - d_k^{(j)}\right)^2} \qquad (2)$$
where N is the number of residues (121 in this case) and $d_k^{(i)}$ is the distance between Cα atoms of the kth residue pair for the ith conformer.
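The conformer dissimilarity described above can be sketched as follows: each conformer is flattened into a vector of all unique Cα–Cα distances, and two conformers are compared by the Euclidean distance between their vectors. Because only internal distances are used, no structural superposition is needed. The function names are illustrative, and Cα coordinates are assumed to be given as (x, y, z) tuples in Å.

```python
import math
from itertools import combinations


def pairwise_ca_distances(coords):
    """All N(N-1)/2 unique inter-residue Calpha-Calpha distances."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def conformer_distance(coords_i, coords_j):
    """Euclidean distance between the distance vectors of two conformers."""
    d_i = pairwise_ca_distances(coords_i)
    d_j = pairwise_ca_distances(coords_j)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(d_i, d_j)))
```

For a 121-residue chain the distance vector has 121 × 120 / 2 = 7260 entries. Vectors of this kind could, for example, be passed to `scipy.cluster.hierarchy.linkage` with `method="ward"`, although whether that particular routine was used here is not stated.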
The dendrogram distance axis does not have a simple biophysical interpretation (see Supporting Information, section 2.4); we therefore transformed the dendrogram distance axis to a Euclidean distance between cluster means (DT) using the relation given by eq S8 in the Supporting Information, section 2.5.68 We then divide this value by the square root of the number of nondegenerate inter-residue distance combinations to obtain an RMSD value of inter-residue Cα distances. We name this quantity, which is analogous to the atomic RMSD for protein structures (eq S4), “normalized variance”. To determine a cutoff for clustering, the number of clusters was plotted against the normalized variance (see Figure S8), and L-curve analysis was applied to find the optimum number of clusters. However, when this leads to very low-population clusters (~5% or less), the cutoff is reduced to the nearest point to eliminate them.
RESULTS
Optimized 4E-BP2 Ensembles.
Motivated by the availability of structural data yet a lack of appropriate full-length computational ensembles of the 4E-BP2 protein, we calculated conformational ensembles consisting of 20,000 static conformers for both the NP and 5P variants. Our approach utilizes optimization and analysis methods that have been previously applied to model IDP ensembles.39,46 A unique aspect of 4E-BP2 in comparison with other IDPs is the presence of a folded domain within the otherwise disordered 5P phosphoform. In this hyperphosphorylated state, a four-stranded beta-fold domain spanning residues 18–62 is stabilized. Modeling such a case motivated our choice of the FFT conformer generator,39 which allows the N- and C-IDRs to be sampled separately while maintaining folded domain poses derived from solution NMR experiments.13
Optimization of the NP and 5P ensembles was performed with the BME method42 using our previously published CS and smFRET data and new SAXS data (see Methods). [Note that, while sampling IDR tails and internal IDRs of proteins with folded domains is now possible within IDPConformerGenerator,69 it was not when our study began, nor was the current X-EISDv2 version with enhanced accessibility.41] To validate and/or further optimize these ensembles, we evaluated their ability to reproduce experimental data that was withheld from the BME refinement process.64 As such, we further tuned the ensemble optimization using PRE data with its sensitivity to inter-residue contacts (<25 Å).
For the NP ensemble (Figure 1A), χ2 decreases as the initial pool is reweighted and the effective fraction of conformations (Neff) decreases (see Methods). The decrease is initially steep, but then it levels off with a markedly flatter slope below Neff ≈ 0.6. The region of steep decrease is where the conformations that are least consistent with experimental data are essentially discarded; i.e., their weights go to zero. As the slope flattens, further optimization only marginally increases the agreement with experiments. After an initial increase (see Figure S1A), PRE RMSD follows a similar downward trend, although shifted to a lower Neff range than χ2. To avoid overfitting, Neff = 0.40 (θ = 35) was chosen at the “knee” point of the sampled PRE RMSD curve (Methods) for the optimized NP ensemble.
Figure 1.

BME optimization for NP (A) and 5P 4E-BP2 (B) ensembles using FFT-generated prior pools with 20,000 conformers and imposing agreement with experimental data (SAXS, CS and FRET). A combination of fitting the restraints (χ2) and cross-validation (PRE RMSD, see Supporting Information, section 2.8) was used to determine the global fitting parameter, Neff, indicated as dashed vertical lines and gray areas (see Methods).
Similarly, for the 5P ensemble (Figure 1B), increased conformer reweighting leads to improved agreement with both the restraints incorporated within BME (decrease of χ2) and the external data (decrease of PRE RMSD). The knee points of the two curves are very close to each other, with the lower of the two, Neff = 0.78 (θ = 27), being chosen for the optimized 5P ensemble. Fitting parameters of the BME-optimized ensembles are shown in Table 1, and comparison of experimental and back-calculated values throughout the sequence for CS and PRE data is shown in Supporting Information (Figures S12–S14).
Table 1.
Fitness Parameters and Back-Calculated Global Parameters for Ensembles of NP and 5P 4E-BP2a
| | Neff | | | | | Rg (Å) | Rh, HYDROPRO (Å) | Rh, Kirkwood–Riseman (Å) |
|---|---|---|---|---|---|---|---|---|
| NP 4E-BP2 | 0.40 | 0.64 | 1.03 | 0.67 | 0.62 | 28.7 ± 0.1 | 29.0 ± 1.5 | 23.5 ± 0.1 |
| 5P 4E-BP2 | 0.78 | 0.76 | 1.20 | 0.95 | 0.37 | 26.5 ± 0.1 | 26.8 ± 1.5 | 20.8 ± 0.1 |
Uncertainties of Rg and Rh are the weighted standard deviation of the mean of the ensemble distributions.
Optimization curves for each restraint are shown in Figures S1 and S2 in the Supporting Information, and the initial and optimized fitness parameters are displayed in Tables S1 and S2. The effect of optimization can be visualized by the change in the distribution of conformer weights (Figure S3). The NP distribution contains distinct outlier values that are well-separated from those in the bulk. In addition, 61% of the initial conformers have 95% of the weight in the optimized NP ensemble, while for 5P the fraction is much higher, 83%. This was perhaps expected since the 5P initial ensemble integrates atomic coordinates derived from the NMR solution structure of the folded domain (~40 residues), and fewer residues required refinement.
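A statistic of the kind quoted above (the fraction of conformers that together carry 95% of the weight) can be obtained by sorting the weights in descending order and accumulating them until the target is reached. The sketch below assumes that definition, which is not spelled out in the text; the function name is hypothetical.

```python
def fraction_carrying_weight(weights, target=0.95):
    """Smallest fraction of conformers whose summed weight reaches
    `target` of the total, taking the heaviest conformers first."""
    total = sum(weights)
    running = 0.0
    for n, w in enumerate(sorted(weights, reverse=True), start=1):
        running += w
        # Small tolerance guards against floating-point round-off
        if running >= target * total - 1e-12:
            return n / len(weights)
    return 1.0
```

A value close to 1 means the weights remain nearly uniform after refinement, whereas a small value means a few conformers dominate the ensemble.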
Table 1 also includes back-calculated global size parameters, radii of gyration, and hydrodynamic radii (Rg and Rh), of the two optimized ensembles. The back-calculated Rg values are close to those derived by Guinier analysis from the SAXS data, i.e., 32.18 ± 2.22 and 27.06 ± 0.69 Å for NP and 5P states, respectively (Figure S11), and confirm that the 5P state is overall more compact than the NP state. The Rh of the optimized NP 4E-BP2 ensemble, back-calculated using the Kirkwood–Riseman approximation (23.5 ± 0.1 Å), is closer to the value measured by FCS (24.8 ± 0.1 Å)26 than the value back-calculated with HYDROPRO (29.5 ± 1.5 Å). Our results are consistent with a recent comparative study, where the Kirkwood–Riseman approach was shown to be a better predictor of experimental hydrodynamic radii of IDP ensembles and resulted in values ~20% lower than HYDROPRO predictions.70 However, the Kirkwood–Riseman prediction for the 5P ensemble (20.8 ± 0.1 Å) is significantly smaller than the FCS-measured value (27.9 ± 1.1 Å), while the HYDROPRO prediction (26.8 ± 1.5 Å) is in better agreement. This discrepancy is perhaps not surprising, given that a significant fraction of the 5P protein (~1/3 of the sequence) forms a stable fold, and HYDROPRO has been optimized using folded proteins.
Charge Segregation and Global Compaction of NP 4E-BP2.
Despite showing significant structural flexibility, IDPs have transiently sampled contacts due to intrachain interactions such as hydrophobic,71,72 electrostatic,73,74 and π interactions.75,76 Considering the global compaction of NP 4E-BP2 (see above), we asked whether there are indicators of nonlocal residue interactions in the optimized ensemble. As such, we analyzed the relation between mean inter-residue distances (R|i-j|) and residue separations (|i – j|), i.e., the Internal Scaling Profile (ISP). Distances were calculated as double averages, first for each conformer and then within the ensemble.34 For comparison with a null-hypothesis lacking preferential interactions, we generated an ensemble consisting of 20,000 self-avoiding random coil (RC) conformations using TraDES35,36 and computed its ISP curve.
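The double average behind an ISP can be sketched as follows: for each sequence separation, the Cα–Cα distance is first averaged over all residue pairs with that separation within one conformer, and the per-conformer values are then averaged over the ensemble using the conformer weights. The sketch assumes arithmetic means and per-conformer lists of Cα coordinates; the function name is illustrative.

```python
import math


def internal_scaling_profile(conformers, weights=None):
    """Map each separation |i - j| to the ensemble-averaged mean distance.

    conformers: list of conformers, each a list of (x, y, z) Calpha
    coordinates; weights: optional conformer weights (uniform if None).
    """
    n_res = len(conformers[0])
    if weights is None:
        weights = [1.0] * len(conformers)
    w_sum = sum(weights)
    profile = {}
    for sep in range(1, n_res):
        ensemble_avg = 0.0
        for coords, w in zip(conformers, weights):
            # First average: over all pairs with this separation in one conformer
            per_conf = sum(
                math.dist(coords[i], coords[i + sep]) for i in range(n_res - sep)
            ) / (n_res - sep)
            # Second average: weighted over the ensemble
            ensemble_avg += w * per_conf
        profile[sep] = ensemble_avg / w_sum
    return profile
```

For a fully extended rod with 3.8 Å spacing the profile is simply 3.8 × |i − j|, which provides a quick sanity check before applying the function to a real ensemble.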
Within the polymer physics framework, the ISP curve is typically fitted to the following power-law relation:
$$R_{|i-j|} = \sqrt{2 l_p b}\,|i-j|^{\nu} \qquad (3)$$
where b is the distance between bonded Cα atoms (3.8 Å), ν is the Flory scaling exponent and the persistence length lp was fixed at lp = 4 Å (see Supporting Information, Table S3, for fitting parameter values). This persistence length is commonly applied to model disordered proteins and has been shown to be applicable for unfolded and disordered proteins.77 The behavior of infinitely long homopolymer models representing the comparative strength of protein–protein interactions (PPIs) versus protein–solvent interactions (PSIs) converge for three distinct cases. A case in which PPIs dominate is termed the poor-solvent state (ν ~ 0.33), PPIs being equal to PSIs is denoted as the θ-state (ν ~ 0.5) and a chain with dominating PSIs is termed the good-solvent state, or the excluded-volume (EV) limit (ν ~ 0.59).
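With the prefactor fixed, the scaling exponent is the only free parameter and can be estimated by a least-squares fit in log–log space. The sketch below assumes the commonly used form R = sqrt(2·lp·b)·|i − j|^ν with lp = 4 Å and b = 3.8 Å; it is an illustration, not the fitting routine used for Table S3, and the function names are hypothetical.

```python
import math


def scaling_prefactor(lp=4.0, b=3.8):
    """sqrt(2 * lp * b), in Angstrom."""
    return math.sqrt(2.0 * lp * b)


def fit_flory_exponent(separations, distances, lp=4.0, b=3.8):
    """Least-squares estimate of nu in R = prefactor * s**nu.

    Taking logs gives ln(R / prefactor) = nu * ln(s), a line through the
    origin, so nu = sum(x * y) / sum(x * x).
    """
    a = scaling_prefactor(lp, b)
    x = [math.log(s) for s in separations]
    y = [math.log(r / a) for r in distances]
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
```

Fitting only a restricted window of separations (for example 10–40) yields a local exponent, which makes it possible to compare different regions of the ISP curve.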
To facilitate comparison, the ISPs of the optimized NP 4E-BP2 and TraDES RC ensembles are plotted together with the ISPs of the EV limit and θ-state homopolymers (Figure 2A). For sequence separations 10 ≤ |i – j| ≤ 40 the NP 4E-BP2 scaling resembles the TraDES RC ensemble (ν = 0.556), while for the largest separations, 100 ≤ |i – j| ≤ 120, the scaling exponent decreases only slightly (ν = 0.539). In the intermediate range, 60 ≤ |i – j| ≤ 95, the ISP curve flattens and undergoes a change in concavity, so it cannot be fit to a simple power-law dependence. In addition, intrachain distances in the NP 4E-BP2 ensemble start to deviate from those in the TraDES RC ensemble for |i – j| ≥ 20 (Figure 2A). Taken together, this suggests that scale invariance breaks down due to specific intrachain contacts, which are also responsible for the high transient helical content spanning the entire chain19 (Figure S4).
Figure 2.

(A) Internal scaling profiles of the optimized NP 4E-BP2 ensemble (red), the TraDES random coil ensemble (blue), excluded-volume (black, dotted), and θ-solvent (black, dashed) homopolymers, and fits of the regions 10–40 and 100–121 to eq 3 (green dashed). A concave region of the ISP curve, spanning residue separations of 60–95, is indicated by a gray shaded box. (B) Net charge per residue (NCPR) index calculated using a five-residue sliding window: blue, positive; red, negative.
Charge segregation or patterning within a disordered chain can be quantified by the parameter κ, 0 ≤ κ ≤ 1, with the low limit corresponding to well-mixed charges and the high limit to positive and negative charges separated in the two halves of the chain,78 or by the sequence charge decoration (SCD) parameter.79 Das and Pappu tested the effects of charge segregation on the ISP behavior for a 50-residue model chain consisting of two oppositely charged residues that are distributed in patches of variable size across the sequence.78 They also observed a concavity “dip” in the ISP curves of model sequences, which became more pronounced with increasing κ. Interestingly, their model sequence with the closest κ value to that of NP 4E-BP2 (κ = 0.1552) has an ISP curve with a dip similar to that of our NP ensemble.
We evaluated various sequence-charge parameters using the Classification of Intrinsically Disordered Ensemble Relationships (CIDER) program80 (Table S4). For example, the net charge per residue (NCPR) has been previously used to relate global dimensions of IDPs to electrostatic interactions.81,74 The NCPR distribution of NP 4E-BP2 (Figure 2B) shows patches of oppositely charged residues in the sequence, which may cause the dip in the ISP curve for 60 ≤ |i – j| ≤ 95 via electrostatic attraction. By considering the entire residue spans of net positive/negative NCPR and matching them based on the residue separations depicted in the gray region of Figure 2A, we identified three such attractive pairs: 11–24 (positive NCPR) with 85–98 (negative NCPR), 22–37 (negative NCPR) with 103–111 (positive NCPR), and 47–63 (positive NCPR) with 108–121 (negative NCPR).
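As a minimal sketch of the sliding-window NCPR calculation, the snippet below assigns +1 to K/R and –1 to D/E and averages over a centered five-residue window. The treatment of histidine as neutral, the truncation of windows at the chain termini, and the toy sequence are assumptions made for illustration; this is not the CIDER implementation and the sequence is not that of 4E-BP2.

```python
# Charge assignment assumed for this sketch: His is treated as neutral.
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}

def ncpr_profile(sequence, window=5):
    """Net charge per residue averaged over a centered sliding window.

    Windows are truncated at the chain termini (an assumption), so the
    first and last residues average over fewer positions.
    """
    half = window // 2
    charges = [CHARGE.get(aa, 0) for aa in sequence]
    profile = []
    for i in range(len(charges)):
        lo = max(0, i - half)
        hi = min(len(charges), i + half + 1)
        segment = charges[lo:hi]
        profile.append(sum(segment) / len(segment))
    return profile

# Hypothetical toy sequence with a basic patch followed by an acidic patch.
toy = "GSKKRAGGDEEDSGG"
profile = ncpr_profile(toy)
for aa, value in zip(toy, profile):
    print(aa, f"{value:+.2f}")
```

Contiguous runs of positive or negative values in such a profile correspond to the charged patches that are paired in the analysis above.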
To better visualize the proximity between different regions of the NP 4E-BP2 chain in our optimized ensemble, we constructed a 2D map of mean pairwise inter-residue Cα–Cα distances, each normalized by the corresponding value from the RC ensemble (Figure 3A). The most prominent region of compaction is between residues ~20–40 and ~80–100. The putative interacting regions based on NCPR analysis (Figures 3B–D) also contain hydrophobic, hydrogen-bonding, and π-containing residues. This suggests that transient contacts are formed through a combined effect of charge-based attraction and other physicochemical interactions, potentially including the hydrophobic effect, hydrogen bonding, and π interactions.
Figure 3.

2D maps of mean inter-residue distances in NP 4E-BP2. (A) Distances in the BME-optimized ensemble normalized by the TraDES RC ensemble (red-expanded, blue-compacted). Zoomed-in views of the regions corresponding to pairs with opposite-sign NCPRs (see Figure 2): (B) residues 11–24 with residues 85–98, (C) 22–37 with 103–111, and (D) 47–63 with 108–121; residue color scheme: positive, blue; negative, red; hydrophobic, green; aromatic, magenta; hydrogen bonding, italic.
In particular, π contacts between two tyrosines (Y34 and Y54) and two C-terminal lysines (K92 and K107) and/or an arginine (R106) could contribute synergistically to the nonlocal interactions causing the dip in the ISP curve of NP 4E-BP2. Notably, for the first pair, the largest deviations from random coil expectations are located in residues of the positive NCPR selection and contain sites which are functionally relevant: the phosphoregulatory RAIP site (residues 15–18),82 and a region following the secondary binding site.
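The normalized distance maps discussed in this section can be sketched as follows. Each conformer is represented as a list of Cα coordinates, conformers may carry weights (e.g., from BME), and the weighted mean distance for each residue pair is divided by the corresponding value of a reference ensemble. The function names and the three-residue toy conformers are hypothetical and serve only to show the arithmetic.

```python
import math

def mean_distance_map(conformers, weights=None):
    """Weighted mean Calpha-Calpha distance for every residue pair."""
    n_conf = len(conformers)
    n_res = len(conformers[0])
    if weights is None:
        weights = [1.0] * n_conf
    total = sum(weights)
    dmap = [[0.0] * n_res for _ in range(n_res)]
    for coords, w in zip(conformers, weights):
        for i in range(n_res):
            for j in range(i + 1, n_res):
                dmap[i][j] += w * math.dist(coords[i], coords[j]) / total
                dmap[j][i] = dmap[i][j]
    return dmap

def normalized_map(optimized, reference):
    """Element-wise ratio; values < 1 indicate compaction vs the reference."""
    n = len(optimized)
    return [[1.0 if i == j else optimized[i][j] / reference[i][j]
             for j in range(n)] for i in range(n)]

# Hypothetical three-residue conformers (coordinates in angstroms).
extended = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
bent = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]

optimized_map = mean_distance_map([extended, bent])
reference_map = mean_distance_map([extended])
ratio = normalized_map(optimized_map, reference_map)
print(f"ratio(1,3) = {ratio[0][2]:.3f}")
```

In this toy case the 1–3 ratio falls below 1 because half of the "optimized" population is bent, which is the same reading applied to the blue (compacted) regions of the maps.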
Nonlocal Contacts That Stabilize the Folded Domain of 5P 4E-BP2.
Phosphorylation at residues T37, T46, S65, T70 and S83 induces the formation of a four-stranded β-fold between residues 18–62, which sequesters the canonical eIF4E binding motif and is incompatible with binding.13 Phosphorylation is hierarchical. Initial phosphorylation at residues T37 and T46 leads to folding of a marginally stable domain, decreasing the eIF4E binding affinity by ca. 100-fold. Subsequent phosphorylation of the C-IDR at residues S65, T70 and S83 decreases the binding affinity by a further ca. 40-fold,13 primarily by stabilization of the folded domain and not by direct interactions with eIF4E. The noncooperative folding/stabilization of this domain allows a graded inhibition of translation by phosphorylation-induced tuning of the eIF4E:4E-BP2 affinity.25
However, no structural models exist to provide detailed information on how the three additional C-IDR phosphorylation sites stabilize the folded domain, despite several experimental studies probing the properties of 5P 4E-BP2.13,25,26 Molecular dynamics simulations have studied the formation of the four-stranded β-fold, but the N-IDR and C-IDR were omitted.83,84 NP 4E-BP2 contains significant transient α-helical structure, particularly between residues 49–67, partially preordering the canonical helical eIF4E-binding element, and in the C-terminal region.19 Phosphorylation at residues T37 and T46 switches this helical character to extended β-like, and the additional C-IDR phosphorylations result in additional helical character in residues proximal to the canonical binding element as well as in the C-IDR, with pS65 having the largest effect.25 We examined our models to better understand stabilization of the fold by identifying potential C-IDR phosphorylation-induced stabilizing contacts between the folded domain and the rest of 4E-BP2 and potential destabilizing contacts present in the NP state that are abolished in the 5P state.
To evaluate 5P intrachain interactions in the context of “topological” features imposed by the presence of a fixed folded domain, we compared the optimized 5P 4E-BP2 ensemble to the 5P coil ensemble (see Supporting Information 1.2). Similar to the NP analysis above, normalized pairwise inter-residue Cα–Cα distances reveal regions of compaction and expansion. Most inter-residue distances are shorter in the 5P BME-optimized ensemble than in the 5P coil ensemble, with the closest contacts (besides those within the folded domain) involving residues of the folded domain with those of the N-IDR (Figure 4A). Interestingly, the NP ensemble (Figure 3A) showed greater distances between residues of the canonical binding motif 54YXXXXLΦ60 and the N-terminus (residues 1–17) than the 5P state.
Figure 4.

5P 4E-BP2 inter-residue distance and contact maps of optimized vs. coil ensembles. (A) 2D map of the mean inter-residue distances of the 5P 4E-BP2 optimized ensemble normalized by the 5P coil ensemble (red–expanded, blue–compacted). (B) Difference contact map obtained by subtracting the fractional degree of inter-residue contacts in the 5P coil ensemble from those in the BME-optimized 5P ensemble. Two residues are in contact if their Cα atoms are within 8 Å.
These changes are consistent with the observation that the chemical shift changes between the NP and the 5P state are the largest at the canonical binding site residues.25 In the NP state there are larger distances between the T46 phosphorylation site and all residues that will become the “N-IDR” upon phosphorylation than for the coil ensemble, and there are also larger distances between T37 and some residues in this N-IDR forming domain than in the coil (Figure 3A). Conversely, in the 5P ensemble, the residues near phosphorylation sites pT37 and pT46 have distances that are the most reduced compared with the 5P coil ensemble. This can be seen more clearly by considering the difference contact map (Figure 4B), where differences in fractional occupancy of inter-residue contacts between the optimized and the coil 5P ensembles are shown, with a contact defined as a Cα–Cα distance less than 8 Å (see Supporting Information, section 2.9). The areas of greatest positive contact difference are centered around the T37 and T46 phosphorylation sites and the N-IDR.
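The difference contact map can be illustrated with a minimal sketch, assuming only what is stated above: two residues are in contact when their Cα atoms are closer than 8 Å, the fractional contact occupancy is the (weighted) fraction of conformers showing that contact, and the coil map is subtracted from the optimized map. The function names, weights, and toy coordinates are hypothetical.

```python
import math

def contact_fraction_map(conformers, weights=None, cutoff=8.0):
    """Weighted fraction of conformers with Calpha-Calpha distance < cutoff."""
    n_conf = len(conformers)
    n_res = len(conformers[0])
    if weights is None:
        weights = [1.0] * n_conf
    total = sum(weights)
    cmap = [[0.0] * n_res for _ in range(n_res)]
    for coords, w in zip(conformers, weights):
        for i in range(n_res):
            for j in range(i + 1, n_res):
                if math.dist(coords[i], coords[j]) < cutoff:
                    cmap[i][j] += w / total
                    cmap[j][i] = cmap[i][j]
    return cmap

def difference_map(map_a, map_b):
    """map_a - map_b; positive values mean more contacts in ensemble A."""
    n = len(map_a)
    return [[map_a[i][j] - map_b[i][j] for j in range(n)] for i in range(n)]

# Hypothetical three-residue conformers (coordinates in angstroms).
extended = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
bent = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (5.0, 5.0, 0.0)]

# The "optimized" ensemble up-weights the bent conformer; the coil pool is uniform.
optimized = contact_fraction_map([extended, bent], weights=[0.25, 0.75])
coil = contact_fraction_map([extended, bent])
diff = difference_map(optimized, coil)
print(f"delta contact(1,3) = {diff[0][2]:+.2f}")
```

Sequence neighbors are within the cutoff in every conformer, so their difference is zero; only nonlocal pairs carry information in such a map.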
It has been proposed that C-IDR phosphorylation induces stabilizing contacts with the folded domain, possibly via electrostatic attractions between the C-IDR phosphate groups and the basic regions of the folded domain.13,25 In our analysis, although the C-IDR is more compact than the random coil and shows sparse contacts with the folded domain, these contacts are not exclusive to the phosphorylation sites, implying that the underlying interactions are of a mean-field nature. Instead, our results allude to a potential major role of the N-IDR in stabilizing the structure of the folded domain. The NCPR for 5P 4E-BP2 (see Supporting Information, Figure S5) illustrates that the N-IDR is predominantly positive, while phosphorylation at T37 and T46 leads to a charge difference of −4 in the folded domain.
A combination of electrostatic interactions between the basic N-IDR and the negative phosphosites of the folded domain and between the basic parts of the folded domain and the negative phosphosites in the C-IDR may increase the stability of the folded domain. At the same time, our analysis suggests that C-IDR phosphorylation disrupts the network of intramolecular interactions at regions far away from the phosphorylation sites with only small changes to the global dimensions, similar to other multiphosphorylated proteins.34,85,86
Prominent 4E-BP2 Structural States Revealed by Clustering.
In contrast to stable folded proteins, IDPs feature a shallow and rugged free-energy landscape without a pronounced global minimum. This facilitates fast conformational exchange; however, weakly funneled landscapes exist for various IDPs.29,87,88 Our previous NMR studies have shown that intrachain interactions significantly affect conformational propensities of 4E-BP2 in different phosphorylation states.13,19,25
To better define nonlocal interactions impacting the 4E-BP2 structure and function, we applied agglomerative hierarchical clustering to partition the two optimized ensembles.89,45 The partitioning leads to a separation of global dimensions and shape, such as radius of gyration, end-to-end distance and asphericity (see Supporting Information, section 2.6, Figures S6 and S7). The dendrogram obtained from hierarchical clustering provides a visualization of the conformer amalgamation process (Figure 5A). Next, we characterized each resulting 4E-BP2 cluster from the perspective of inter-residue contact maps, with the aim of identifying distinguishing structural features and shedding new light on the conformational landscape and the functional aspects (e.g., (un)folding and (un)binding) of this protein.
Figure 5.

Agglomerative hierarchical clustering applied to the unrestrained NP 4E-BP2 ensemble. (A) Dendrogram showing the 4 resulting clusters: Cluster 1 (green), Cluster 2 (purple), Cluster 3 (brown), and Cluster 4 (pink). Inter-residue distance maps for each cluster normalized by the entire BME-optimized NP ensemble: (B) Cluster 1, (C) Cluster 2, (D) Cluster 3, and (E) Cluster 4.
The NP ensemble (unrestrained) partitions first into a small (23%, 4510 conformers) and a large (77%, 15490 conformers) cluster. These clusters then split twice before the cutoff criterion is satisfied (Figure S8A), which brings the total number of clusters to six. Due to the low populations (1.3%, 2.9%, and 7.3%) that result from splitting the initial “small” cluster, we kept it intact so that a total of four clusters were obtained (Figure 5A). Upon reweighting the conformers with their BME-derived weights, the abundance of each cluster in the optimized ensemble is obtained (Table S5). Mean pairwise Cα inter-residue distances in each reweighted cluster were normalized by the corresponding distances for the optimized ensemble (Figures 5B–E).
These maps confirm that the clusters have clearly distinct distributions of inter-residue distances, as expected since the dissimilarity metric used was a Euclidean distance between inter-residue distances in different conformers (see Methods). Note that such populations could not be trivially determined by analyzing the distribution of global parameters such as the radius of gyration (see Supporting Information, Figure S9), underscoring the utility of clustering to disentangle coarse-grained structural propensities in a large and disordered protein ensemble.
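The clustering and reweighting steps can be sketched in a few lines. The snippet uses the dissimilarity stated above (Euclidean distance between the inter-residue distance vectors of two conformers), but the average-linkage criterion, the stopping rule based on a target number of clusters rather than a cutoff distance, and all names and toy data are assumptions for illustration; the procedure actually used is described in Methods.

```python
import math
from itertools import combinations

def distance_features(coords):
    """Flattened upper triangle of the inter-residue distance matrix."""
    return [math.dist(coords[i], coords[j])
            for i, j in combinations(range(len(coords)), 2)]

def agglomerative_clusters(features, n_clusters):
    """Naive average-linkage agglomerative clustering (assumed linkage)."""
    clusters = [[i] for i in range(len(features))]

    def linkage(a, b):
        pair_sum = sum(math.dist(features[i], features[j]) for i in a for j in b)
        return pair_sum / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Merge the two clusters with the smallest average dissimilarity.
        p, q = min(combinations(range(len(clusters)), 2),
                   key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]))
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters

def cluster_populations(clusters, weights):
    """Cluster abundance after reweighting: sum of member weights."""
    total = sum(weights)
    return [sum(weights[i] for i in members) / total for members in clusters]

# Hypothetical three-residue conformers: two extended, two bent.
conformers = [
    [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)],
    [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.5, 0.5, 0.0)],
    [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (4.0, 3.7, 0.0)],
]
features = [distance_features(c) for c in conformers]
clusters = agglomerative_clusters(features, n_clusters=2)

# Hypothetical reweighting that favors the bent conformers.
weights = [0.1, 0.4, 0.1, 0.4]
populations = cluster_populations(clusters, weights)
print(clusters, populations)
```

Because clustering is performed on the unweighted pool and the weights are applied afterwards, cluster membership is unchanged by reweighting and only the populations shift, as in the change from 23% to ~12% reported for Cluster 1.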
Cluster 1 (green), whose fraction was reduced from 23% to ~12% upon BME optimization, is the most expanded of all clusters (Figure 5B). In particular, the N- and C-terminal regions are further apart, indicative of extended, quasi-linear poses. In contrast, Cluster 2 (purple) is the most compact overall, while the other two clusters (3-brown, 4-pink) have complementary distance maps, with a mixture of expansion and compaction compared to the full ensemble. Motivated by the growing literature on the binding mechanisms of IDPs90–92 and the expansion we previously captured between residues 32–91 and 73–121 of NP 4E-BP2 upon binding to eIF4E,26 we asked whether the expanded clusters were conformationally similar to bound-state structures.
To this end, back-calculated mean FRET values for each NP cluster (see Methods) were compared via a z-test (Table 2) to the experimental values obtained for the eIF4E-bound state, E32–91 = 0.26 ± 0.02 and E73–121 = 0.51 ± 0.02, and for the apo-state, E32–91 = 0.63 ± 0.02 and E73–121 = 0.58 ± 0.02.26 With this metric, Cluster 1 resembles the eIF4E-bound state, as it agrees within a 3σ tolerance level to the experimental values, in particular, regarding E32–91. In contrast, the other three clusters have significantly higher E32–91 values than the bound state but instead agree within 3σ tolerance with the apo-state value.
Table 2.
Mean FRET Efficiencies of NP 4E-BP2 Clusters Compared to Experimental Apo and eIF4E-Bound FRET Values via a z-Test
| Cluster # | E 32–91 | E 73–121 | Bound z32–91 | Bound z73–121 | Apo z32–91 | Apo z73–121 |
|---|---|---|---|---|---|---|
| Cluster 1 | 0.34 | 0.56 | 2.1 | 1.4 | 8.2 | 0.5 |
| Cluster 2 | 0.70 | 0.60 | 12.2 | 2.4 | 1.9 | 0.5 |
| Cluster 3 | 0.65 | 0.62 | 10.8 | 3.0 | 0.6 | 1.0 |
| Cluster 4 | 0.58 | 0.49 | 9.0 | 0.6 | 1.3 | 2.6 |
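The comparison in Table 2 can be illustrated with a generic two-sample z-score. The exact test statistic is defined in Methods; the sketch below assumes the common form z = |E_model − E_exp| / sqrt(σ_model² + σ_exp²), and the model uncertainty of 0.03 is a hypothetical placeholder, so the printed values are not expected to reproduce the table entries exactly.

```python
import math

def z_score(e_model, sigma_model, e_exp, sigma_exp):
    """Assumed z-statistic for comparing a back-calculated and an
    experimental FRET efficiency with independent uncertainties."""
    return abs(e_model - e_exp) / math.sqrt(sigma_model ** 2 + sigma_exp ** 2)

# Cluster 1 mean E(32-91) from Table 2; experimental values from the text.
E_CLUSTER1 = 0.34
SIGMA_MODEL = 0.03  # hypothetical uncertainty of the cluster mean

z_bound = z_score(E_CLUSTER1, SIGMA_MODEL, 0.26, 0.02)
z_apo = z_score(E_CLUSTER1, SIGMA_MODEL, 0.63, 0.02)

for label, z in (("bound", z_bound), ("apo", z_apo)):
    verdict = "agrees" if z <= 3.0 else "disagrees"
    print(f"{label}: z = {z:.1f} ({verdict} within 3 sigma)")
```

With any model uncertainty of this magnitude, the qualitative outcome matches the table: the Cluster 1 value for the 32–91 pair is compatible with the bound state and incompatible with the apo state at the 3σ level.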
The presence of a sizable cluster resembling the bound-state within the apo-state ensemble cannot be predicted a priori, as an ensemble could be split/clustered in many ways. Our results allude to a subclass of extended 4E-BP2 conformations that could maximize attractive interactions with the eIF4E surface and initiate binding. At the same time, a large majority of conformers (88%) are not compatible with bound-state FRET. Although our model is not able to determine whether the binding occurs through an induced fit or a conformational selection mechanism, the presence of a significant binding-compatible population is intriguing. This remains an area of interest in the field, as both binding models have been proposed for disordered proteins93 and combined binding mechanisms have also been described.94,95
For the 5P ensemble, the normalized cutoff distance clearly levels off when the number of clusters is increased above N = 4 (see Supporting Information, Figure S8B). BME reweighting increases the population of the most expanded cluster from 20.1% to 27.7%, while reducing the population of the most compact cluster from 41.3% to 33.1% (Table S6). These observations agree with the overall expansion observed in 4E-BP2 upon hyper-phosphorylation using smFRET.26
A more obvious clustering cutoff distance and a more balanced distribution of cluster fractions for the 5P ensemble compared to the NP ensemble suggest that the former energy landscape has fewer and deeper “structural wells” than the latter. The inter-residue distance maps reveal the complementarity of clusters (Figures 6B–E). For instance, Cluster 1 mostly consists of conformers that are expanded throughout the entire chain, while the opposite is true for Cluster 2; similarly, Cluster 3 is compact in regions where Cluster 4 is expanded and vice versa.
Figure 6.

Agglomerative hierarchical clustering on the unrestrained 5P 4E-BP2 ensemble. (A) Dendrogram showing the 4 resulting clusters: Cluster 1 (green), Cluster 2 (purple), Cluster 3 (brown) and Cluster 4 (pink). Inter-residue distance maps for each cluster normalized by the entire BME-optimized 5P ensemble: (B) Cluster 1, (C) Cluster 2, (D) Cluster 3, and (E) Cluster 4.
Since interactions of disordered tails with the folded domain in 5P 4E-BP2 are thought to increase the stability of the fold,25 we then analyzed the abundance of intramolecular contacts at the cluster level (Figure 7). Cluster 1 shows prominent contacts between the N-IDR and a segment around pT46 in the folded domain, while Cluster 2 shows a delocalized contact pattern between the C-IDR and the folded domain. Conversely, Clusters 3 and 4 contain prominent contacts between the N-IDR and pT46 and pT37 sites in the folded domain, respectively. We refer to Clusters 1 and 2 as C-interaction mode (CIM) clusters and Clusters 3 and 4 as N-interaction mode (NIM) clusters. To differentiate within the same mode, we denote Cluster 1 as CIM-off and Cluster 2 as CIM-on, while Clusters 3 and 4 are denoted as NIM-pT46 and NIM-pT37, respectively.
Figure 7.

Difference contact maps obtained by subtracting the fractional level of inter-residue contacts in the entire BME-optimized 5P ensemble from those in each cluster. (A) Cluster 1 or CIM-off; (B) Cluster 2 or CIM-on; (C) Cluster 3 or NIM-pT46; (D) Cluster 4 or NIM-pT37. Representative conformations of each cluster are shown in the upper region of each panel. Two residues are in contact if their Cα atoms are within 8 Å.
These pairs of clusters represent two prominent modes of interaction within the BME-optimized 5P 4E-BP2 ensemble. The CIM-on cluster is enriched in contacts between residues throughout the C-IDR and residues in the N-IDR. Such contacts may be stabilized by attractive charge-based interactions as the aforementioned C-IDR residues have negative NCPR values and those within the N-IDR have positive NCPR values (Figure S5). To a lesser degree, contacts are formed between pT37 and pT46 of the folded domain and the entire C-IDR. This implies that CIM-on conformations exhibit more contacts between the N-IDR and the C-IDR, which also brings the C-IDR in closer proximity to the folded domain for possible interactions. Such contacts are absent in the CIM-off cluster. Instead, CIM-off is enriched in contacts between a region near residue pT46 (45–51) and the entire N-IDR.
This N-IDR interaction with the folded domain (residues 45–51) is more prominent in the NIM-pT46 cluster, with the highest contact fractions between residues 14–15 and residue 47. This cluster has minimal C-IDR contacts, resembling the CIM-off map. The NIM-pT37 cluster also has a high contact fraction between the N-IDR and the folded domain, except that it is centered around residue pT37 and involves residues 1–11 of the N-IDR. The opposite-sign NCPR values in these regions may aid in driving such contacts (Figure S5). For these conformers, prominent contacts are also formed between the entire N-IDR and residues 64–98 in the C-IDR. This is similar to what is observed for CIM-on conformers, suggesting that contacts occurring between the N- and C-IDRs facilitate interactions between the folded domain and the C-IDR.
DISCUSSION AND CONCLUSIONS
Ensemble modeling of dynamic and/or disordered proteins is a growing area of research,33,34,96 reflecting the increased awareness of their functional importance. Recently, we assessed the effects of various starting conformer pools and optimization methods for the integrative modeling of the disordered protein Sic1.64 The quality of initial conformer pools had the highest impact in obtaining good agreement of the optimized ensemble with experimental data and is positively correlated with the Neff value for the optimized ensemble. Using MD priors for the Sic1 protein, we found Neff ≈ 0.75,64 while a study that applied BME to MD simulations of the ACTR protein with CS restraints found Neff ≈ 0.67.97
For 4E-BP2, the Neff obtained for the optimized 5P ensemble is significantly higher than the value for the optimized NP ensemble, 0.78 vs. 0.40. This result is consistent with the aforementioned studies as 5P conformers contain a significant folded fraction (residues 18–62). Although we included 20 different PDB structures to describe the folded domain,13 initial 5P conformers are much more restrained than initial NP conformers. For both phosphoforms, two smFRET efficiencies and distances were the most powerful restraints for the optimization procedure, while chemical shifts had the least impact, reflecting the high uncertainty from back-calculation of these values.
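For readers unfamiliar with Neff, the sketch below computes the effective fraction of the prior pool retained after reweighting, assuming the definition commonly used with BME: Neff = exp(S_rel), with S_rel = −Σ w_i ln(w_i / w0_i), where w0_i are the prior weights. The function name and the toy weights are hypothetical; the values reported above come from the actual optimization, not from this snippet.

```python
import math

def effective_fraction(weights, prior=None):
    """Neff = exp(S_rel), S_rel = -sum_i w_i * ln(w_i / w0_i).

    Both weight vectors are normalized internally; a uniform prior is
    assumed when none is given. Returns a value in (0, 1].
    """
    n = len(weights)
    if prior is None:
        prior = [1.0] * n
    total_w = sum(weights)
    total_p = sum(prior)
    s_rel = 0.0
    for w, w0 in zip(weights, prior):
        w_norm = w / total_w
        w0_norm = w0 / total_p
        if w_norm > 0.0:  # terms with zero weight contribute nothing
            s_rel -= w_norm * math.log(w_norm / w0_norm)
    return math.exp(s_rel)

# Unchanged weights keep the whole pool; discarding half of it halves Neff.
neff_uniform = effective_fraction([0.25, 0.25, 0.25, 0.25])
neff_half = effective_fraction([0.5, 0.5, 0.0, 0.0])
print(neff_uniform, neff_half)
```

A low Neff therefore means that agreement with experiment was reached by concentrating the weight on a small part of the initial pool, which is why it is sensitive to the quality of the starting conformers.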
Interestingly, the NP ensemble appears to be more expanded overall than the 5P ensemble by Rg (experimental, back-calculated) and Rh (back-calculated) measures, while Rh measured by FCS and two internal distances measured by smFRET show the opposite trend.25,26 This may be a real effect reflecting different shapes/topologies of the two 4E-BP2 phosphoforms, or it may be an artifact due to limited sampling in the initial pools. Future studies will benefit from more accurate and diverse sampling of the conformational space, e.g., by better sampling at and around the five phosphorylation sites. In addition, more reliable back-calculators for NMR quantities (CS and PRE) and more smFRET distance restraints would significantly increase the confidence in the optimized ensembles.
Analysis of the optimized NP ensemble revealed a pronounced concavity in the inter-residue scaling profile for sequence separations on the order of 60–80 residues. This is likely caused by a combination of electrostatic charge segregation, hydrophobic interactions between residues 20–40 and 80–100, and pi–pi and cation–pi interactions involving tyrosines Y34 and Y54 and C-terminal lysines and arginines. Future experiments on specific mutant constructs are needed to reveal the precise identity and nature of the physical-chemical interactions driving the concavity in the ISP profile of NP 4E-BP2.
Residues 19–28 show increased flexibility when bound to eIF4E.19 Our 2D distance maps point to interactions between these residues and residues in the C-terminal region controlling the nonlocal compaction of NP 4E-BP2. Binding to the surface of eIF4E may release these intramolecular interactions, enhancing chain dynamics. This scenario would be consistent with our recent findings, in which we captured the expansion and increased local dynamics of 4E-BP2 upon binding to eIF4E.26 Furthermore, clustering analysis reveals only a minor subpopulation (~12%) that is “bound-state like”, also indicating major rearrangements of the chain when 4E-BP2 binds to eIF4E. smFRET experiments are currently under way probing different segments of the chain in the apo vs the bound state. The new data will add important restraints to our modeling and help define the binding mechanism of this IDP system.
Difference contact maps reveal that the residues in the folded 5P domain show fewer contacts with the C-terminal region than they do in the NP state. Previously, we found evidence of a fast exchange between α-helical and β-strand conformations in the 2P state (pT37 and pT46), especially between residues 49 and 67.13 Our results suggest that the addition of three phosphate groups in the C-terminal region may break the residue contacts that stabilize the α-helix, thus favoring the β-fold. Clustering analysis of the 5P ensemble captured four different modes of nonlocal interaction that define major topologies of the 5P state. This categorization of the ensemble’s heterogeneity hints at a mechanism for stabilization of the folded domain by the C-IDR in which the N-IDR acts as both a chaperone and an inhibitor.
This mechanism could act as follows: interaction of the N-IDR with the folded domain driven by contacts with pT46 (present in CIM-off and more prominently in NIM-pT46) stabilizes a pose in which the N- and C-IDRs are brought closer to each other. The ensemble then shifts toward a NIM-pT37 average conformation, in which the pT46 contact is broken and the N-IDR forms contacts with the folded domain around pT37. This allows the C-IDR to form contacts with the starting residues of the N-IDR and to interact loosely with the starting residues of the folded domain. Finally, the N-IDR moves away from the folded domain as 5P 4E-BP2 enters the CIM-on state. Here, residues throughout the C-IDR form contacts around residues pT37 and pT46 after being led there by the N-IDR.
Additional ensemble models of the other 4E-BP2 phosphorylation states, in particular the 2-fold phosphorylated state, would contribute to unraveling this mechanism. Undoubtedly, combined efforts in improving the quality of the starting conformers, increasing the accuracy of back-calculators, and obtaining new restrictive experimental data will be instrumental in solving the fascinating molecular puzzle that is the 4E-BP2:eIF4E system.
ACKNOWLEDGMENTS
We thank John J. Ferrie from the Department of Chemistry at the University of Pennsylvania for advice on applying FastFloppyTail to generate 4E-BP2 conformers, as well as modifying the FastFloppyTail program so that it can include phosphorylated residues for the 5P state. This work has been supported by the Natural Sciences and Engineering Research Council of Canada (NSERC RGPIN-2023-04864 to C.C.G.) and the Canadian Institutes of Health Research (CIHR FND-148375 to J.D.F.-K.).
Footnotes
Supporting Information
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jpcb.3c04052.
Extended description of FFT conformer generation, BME ensemble optimization, hierarchical clustering, polymer scaling and 2-D contact maps, and additional figures and tables (PDF)
Intensity data (XLSX)
Scattering data (XLSX)
The authors declare no competing financial interest.
Contributor Information
Thomas E. Tsangaris, Department of Physics, University of Toronto, Toronto, Ontario M5S 1A7, Canada; Department of Chemical & Physical Sciences, University of Toronto Mississauga, Mississauga, Ontario L5L 1C6, Canada.
Spencer Smyth, Department of Physics, University of Toronto, Toronto, Ontario M5S 1A7, Canada; Department of Chemical & Physical Sciences, University of Toronto Mississauga, Mississauga, Ontario L5L 1C6, Canada.
Gregory-Neal W. Gomes, Department of Physics, University of Toronto, Toronto, Ontario M5S 1A7, Canada; Department of Chemical & Physical Sciences, University of Toronto Mississauga, Mississauga, Ontario L5L 1C6, Canada; Present Address: Altos Laboratories, San Diego, California 92121, United States
Zi Hao Liu, Program in Molecular Medicine, Hospital for Sick Children, Toronto, Ontario M5G 0A4, Canada; Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada.
Moses Milchberg, Program in Molecular Medicine, Hospital for Sick Children, Toronto, Ontario M5G 0A4, Canada; Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada; Present Address: Department of Biochemistry, University of Wisconsin-Madison, 433 Babcock Drive, Madison, WI, 53706, United States.
Alaji Bah, Program in Molecular Medicine, Hospital for Sick Children, Toronto, Ontario M5G 0A4, Canada; Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada; Present Address: Department of Biochemistry and Molecular Biology, SUNY Upstate Medical University, Syracuse, NY 13210, United States.
Gregory A. Wasney, Peter Gilgan Centre for Research and Learning, Hospital for Sick Children, Toronto, Ontario M5G 0A4, Canada
Julie D. Forman-Kay, Program in Molecular Medicine, Hospital for Sick Children, Toronto, Ontario M5G 0A4, Canada; Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada
Claudiu C. Gradinaru, Department of Physics, University of Toronto, Toronto, Ontario M5S 1A7, Canada; Department of Chemical & Physical Sciences, University of Toronto Mississauga, Mississauga, Ontario L5L 1C6, Canada
REFERENCES
- (1).Fisher CK; Stultz CM Protein Structure along the Order-Disorder Continuum. J. Am. Chem. Soc 2011, 133, 10022–10025. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (2).Forman-Kay JD; Mittag T From Sequence and Forces to Structure, Function, and Evolution of Intrinsically Disordered Proteins. Structure 2013, 21, 1492–1499. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (3).Berlow RB; Dyson HJ; Wright PE Functional advantages of dynamic protein disorder. Febs Lett. 2015, 589, 2433–2440. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (4).Tsang B; Pritišanac I; Scherer SW; Moses AM; Forman-Kay JD Phase Separation as a Missing Mechanism for Interpretation of Disease Mutations. Cell 2020, 183, 1742–1756. [DOI] [PubMed] [Google Scholar]
- (5).Csizmok V; Follis AV; Kriwacki RW; Forman-Kay JD Dynamic Protein Interaction Networks and New Structural Paradigms in Signaling. Chem. Rev 2016, 116, 6424–6462. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (6).Dosztanyi Z; Chen J; Dunker AK; Simon I; Tompa P Disorder and sequence repeats in hub proteins and their implications for network evolution. J. Proteome Res 2006, 5, 2985–2995. [DOI] [PubMed] [Google Scholar]
- (7).Borgia A; Borgia MB; Bugge K; Kissling VM; Heidarsson PO; Fernandes CB; Sottini A; Soranno A; Buholzer KJ; Nettels D; Kragelund BB; Best RB; Schuler B Extreme disorder in an ultrahigh-affinity protein complex. Nature 2018, 555, 61–66. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (8).Martin EW; Holehouse AS Intrinsically disordered protein regions and phase separation: sequence determinants of assembly or lack thereof. Emerg Top Life Sci. 2020, 4, 307–329. [DOI] [PubMed] [Google Scholar]
- (9).Pak CW; Kosno M; Holehouse AS; Padrick SB; Mittal A; Ali R; Yunus AA; Liu DR; Pappu RV; Rosen MK Sequence Determinants of Intracellular Phase Separation by Complex Coacervation of a Disordered Protein. Mol. Cell 2016, 63, 72–85. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (10).Baker JMR; Hudson RP; Kanelis V; Choy WY; Thibodeau PH; Thomas PJ; Forman-Kay JD CFTR regulatory region interacts with NBD1 predominantly via multiple transient helices. Nat. Struct Mol. Biol 2007, 14, 738–745. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (11).He YN; Chen YH; Mooney SM; Rajagopalan K; Bhargava A; Sacho E; Weninger K; Bryan PN; Kulkarni P; Orban J Phosphorylation-induced Conformational Ensemble Switching in an Intrinsically Disordered Cancer/Testis Antigen. J. Biol. Chem 2015, 290, 25090–25102. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (12).Banerjee PR; Mitrea DM; Kriwacki RW; Deniz AA Asymmetric Modulation of Protein Order–Disorder Transitions by Phosphorylation and Partner Binding. Angew. Chem., Int. Ed 2016, 55, 1675–1679. [DOI] [PMC free article] [PubMed] [Google Scholar]
- (13).Bah A; Vernon RM; Siddiqui Z; Krzeminski M; Muhandiram R; Zhao C; Sonenberg N; Kay LE; Forman-Kay JD Folding of an intrinsically disordered protein by phosphorylation as a regulatory switch. Nature 2015, 519, 106–109. [DOI] [PubMed] [Google Scholar]
- (14).Uversky VN; Oldfield CJ; Dunker AK Intrinsically disordered proteins in human diseases: Introducing the D-2 concept. Annu. Rev. Biophys 2008, 37, 215–246. [DOI] [PubMed] [Google Scholar]
- (15).Biesaga M; Frigole-Vivas M; Salvatella X Intrinsically disordered proteins and biomolecular condensates as drug targets. Curr. Opin Chem. Biol 2021, 62, 90–100.
- (16).Tait S; Dutta K; Cowburn D; Warwicker J; Doig AJ; McCarthy JEG Local control of a disorder-order transition in 4E-BP1 underpins regulation of translation via eIF4E. Proc. Natl. Acad. Sci. U. S. A 2010, 107, 17627–17632.
- (17).Sonenberg N; Hinnebusch AG Regulation of Translation Initiation in Eukaryotes: Mechanisms and Biological Targets. Cell 2009, 136, 731–745.
- (18).De Benedetti A; Harris AL eIF4E expression in tumors: its possible role in progression of malignancies. Int. J. Biochem. Cell Biol 1999, 31, 59–72.
- (19).Lukhele S; Bah A; Lin H; Sonenberg N; Forman-Kay JD Interaction of the Eukaryotic Initiation Factor 4E with 4E-BP2 at a Dynamic Bipartite Interface. Structure 2013, 21, 2186–2196.
- (20).Banko JL; Merhav M; Stern E; Sonenberg N; Rosenblum K; Klann E Behavioral alterations in mice lacking the translation repressor 4E-BP2. Neurobiol Learn Mem 2007, 87, 248–256.
- (21).Klann E; Sweatt JD Altered protein synthesis is a trigger for long-term memory formation. Neurobiol Learn Mem 2008, 89, 247–259.
- (22).Gkogkas CG; Khoutorsky A; Ran I; Rampakakis E; Nevarko T; Weatherill DB; Vasuta C; Yee S; Truitt M; Dallaire P; Major F; Lasko P; Ruggero D; Nader K; Lacaille JC; Sonenberg N Autism-related deficits via dysregulated eIF4E-dependent translational control. Nature 2013, 493, 371–377.
- (23).Peter D; Igreja C; Weber R; Wohlbold L; Weiler C; Ebertsch L; Weichenrieder O; Izaurralde E Molecular Architecture of 4E-BP Translational Inhibitors Bound to eIF4E. Mol. Cell 2015, 57, 1074–1087.
- (24).Igreja C; Peter D; Weiler C; Izaurralde E 4E-BPs require non-canonical 4E-binding motifs and a lateral surface of eIF4E to repress translation. Nat. Commun 2014, 5, 4790.
- (25).Dawson JE; Bah A; Zhang ZF; Vernon RM; Lin H; Chong PA; Vanama M; Sonenberg N; Gradinaru CC; Forman-Kay JD Non-cooperative 4E-BP2 folding with exchange between eIF4E-binding and binding-incompatible states tunes cap-dependent translation inhibition. Nat. Commun 2020, 11, 3146.
- (26).Smyth S; Zhang Z; Bah A; Tsangaris TE; Dawson J; Forman-Kay JD; Gradinaru CC Multisite phosphorylation and binding alter conformational dynamics of the 4E-BP2 protein. Biophys. J 2022, 121, 3049–3060.
- (27).Mittag T; Marsh J; Grishaev A; Orlicky S; Lin H; Sicheri F; Tyers M; Forman-Kay JD Structure/Function Implications in a Dynamic Complex of the Intrinsically Disordered Sic1 with the Cdc4 Subunit of an SCF Ubiquitin Ligase. Structure 2010, 18, 494–506.
- (28).Marsh JA; Dancheck B; Ragusa MJ; Allaire M; Forman-Kay JD; Peti W Structural Diversity in Free and Bound States of Intrinsically Disordered Protein Phosphatase 1 Regulators. Structure 2010, 18, 1094–1103.
- (29).Papoian GA Proteins with weakly funneled energy landscapes challenge the classical structure-function paradigm. Proc. Natl. Acad. Sci. U. S. A 2008, 105, 14237–14238.
- (30).Bonomi M; Heller GT; Camilloni C; Vendruscolo M Principles of protein structural ensemble determination. Curr. Opin Struc Biol 2017, 42, 106–116.
- (31).Jensen MR; Zweckstetter M; Huang JR; Blackledge M Exploring Free-Energy Landscapes of Intrinsically Disordered Proteins at Atomic Resolution Using NMR Spectroscopy. Chem. Rev 2014, 114, 6632–6660.
- (32).Marsh JA; Forman-Kay JD Ensemble modeling of protein disordered states: Experimental restraint contributions and validation. Proteins-Structure Function and Bioinformatics 2012, 80, 556–572.
- (33).Naudi-Fabra S; Tengo M; Jensen MR; Blackledge M; Milles S Quantitative Description of Intrinsically Disordered Proteins Using Single-Molecule FRET, NMR, and SAXS. J. Am. Chem. Soc 2021, 143, 20109–20121.
- (34).Gomes GNW; Krzeminski M; Namini A; Martin EW; Mittag T; Head-Gordon T; Forman-Kay JD; Gradinaru CC Conformational Ensembles of an Intrinsically Disordered Protein Consistent with NMR, SAXS, and Single-Molecule FRET. J. Am. Chem. Soc 2020, 142, 15697–15710.
- (35).Feldman HJ; Hogue CW A fast method to sample real protein conformational space. Proteins 2000, 39, 112–131.
- (36).Feldman HJ; Hogue CWV Probabilistic sampling of protein conformations: New hope for brute force? Proteins-Structure Function and Bioinformatics 2002, 46, 8–23.
- (37).Ozenne V; Bauer F; Salmon L; Huang JR; Jensen MR; Segard S; Bernado P; Charavay C; Blackledge M Flexible-meccano: a tool for the generation of explicit ensemble descriptions of intrinsically disordered proteins and their associated experimental observables. Bioinformatics 2012, 28, 1463–1470.
- (38).Teixeira JMC; Liu ZH; Namini A; Li J; Vernon RM; Krzeminski M; Shamandy AA; Zhang O; Haghighatlari M; Yu L; Head-Gordon T; Forman-Kay JD IDPConformerGenerator: A Flexible Software Suite for Sampling the Conformational Space of Disordered Protein States. J. Phys. Chem. A 2022, 126, 5985–6003.
- (39).Ferrie JJ; Petersson EJ A Unified De Novo Approach for Predicting the Structures of Ordered and Disordered Proteins. J. Phys. Chem. B 2020, 124, 5538–5548.
- (40).Krzeminski M; Marsh JA; Neale C; Choy WY; Forman-Kay JD Characterization of disordered proteins with ENSEMBLE. Bioinformatics 2013, 29, 398–399.
- (41).Lincoff J; Haghighatlari M; Krzeminski M; Teixeira JMC; Gomes GNW; Gradinaru CC; Forman-Kay JD; Head-Gordon T Extended experimental inferential structure determination method in determining the structural ensembles of disordered protein states. Commun. Chem 2020, 3, 74.
- (42).Bottaro S; Bengtsen T; Lindorff-Larsen K Integrating Molecular Simulation and Experimental Data: A Bayesian/Maximum Entropy Reweighting Approach. Methods Mol. Biol 2020, 2112, 219–240.
- (43).Appadurai R; Koneru JK; Bonomi M; Robustelli P; Srivastava A Demultiplexing the heterogeneous conformational ensembles of intrinsically disordered proteins into structurally similar clusters. bioRxiv 2022, 2022.11.11.516231.
- (44).Lazar T; Guharoy M; Vranken W; Rauscher S; Wodak SJ; Tompa P Distance-Based Metrics for Comparing Conformational Ensembles of Intrinsically Disordered Proteins. Biophys. J 2020, 118, 2952–2965.
- (45).Baul U; Chakraborty D; Mugnai ML; Straub JE; Thirumalai D Sequence Effects on Size, Shape, and Structural Heterogeneity in Intrinsically Disordered Proteins. J. Phys. Chem. B 2019, 123, 3462–3474.
- (46).Graham TGW; Ferrie JJ; Dailey GM; Tjian R; Darzacq X Detecting molecular interactions in live-cell single-molecule imaging with proximity-assisted photoactivation (PAPA). Elife 2022, 11, No. e76870.
- (47).Jones DT; Cozzetto D DISOPRED3: precise disordered region predictions with annotated protein-binding activity. Bioinformatics 2015, 31, 857–863.
- (48).Schrödinger LLC The PyMOL Molecular Graphics System, Version 1.8; 2015.
- (49).Bonneau R; Tsai J; Ruczinski I; Chivian D; Rohl C; Strauss CE; Baker D Rosetta in CASP4: progress in ab initio protein structure prediction. Proteins Suppl 2001, 45, 119–126.
- (50).Rohl CA; Strauss CE; Misura KM; Baker D Protein structure prediction using Rosetta. Methods Enzymol 2004, 383, 66–93.
- (51).Kalinin S; Peulen T; Sindbert S; Rothwell PJ; Berger S; Restle T; Goody RS; Gohlke H; Seidel CA A toolkit and benchmark study for FRET-restrained high-precision structural modeling. Nat. Methods 2012, 9, 1218–1225.
- (52).Sindbert S; Kalinin S; Nguyen H; Kienzler A; Clima L; Bannwarth W; Appel B; Muller S; Seidel CA Accurate distance determination of nucleic acids via Förster resonance energy transfer: implications of dye linker length and rigidity. J. Am. Chem. Soc 2011, 133, 2463–2480.
- (53).Dimura M; Peulen TO; Hanke CA; Prakash A; Gohlke H; Seidel CA Quantitative FRET studies and integrative modeling unravel the structure and dynamics of biomolecular systems. Curr. Opin Struct Biol 2016, 40, 163–185.
- (54).McGibbon RT; Beauchamp KA; Harrigan MP; Klein C; Swails JM; Hernandez CX; Schwantes CR; Wang LP; Lane TJ; Pande VS MDTraj: A Modern Open Library for the Analysis of Molecular Dynamics Trajectories. Biophys. J 2015, 109, 1528–1532.
- (55).Gebhardt C; Lehmann M; Reif MM; Zacharias M; Gemmecker G; Cordes T Molecular and Spectroscopic Characterization of Green and Red Cyanine Fluorophores from the Alexa Fluor and AF Series. ChemPhysChem 2021, 22, 1546.
- (56).Peulen TO; Opanasyuk O; Seidel CAM Combining Graphical and Analytical Methods with Molecular Simulations To Analyze Time-Resolved FRET Measurements of Labeled Macromolecules Accurately. J. Phys. Chem. B 2017, 121, 8211–8241.
- (57).Grudinin S; Garkavenko M; Kazennov A Pepsi-SAXS: an adaptive method for rapid and accurate computation of small-angle X-ray scattering profiles. Acta Crystallographica Section D 2017, 73, 449–464.
- (58).Neal S; Nip AM; Zhang H; Wishart DS Rapid and accurate calculation of protein 1H, 13C and 15N chemical shifts. Journal of Biomolecular NMR 2003, 26, 215–240.
- (59).Kay LE; Keifer P; Saarinen T Pure absorption gradient enhanced heteronuclear single quantum correlation spectroscopy with improved sensitivity. J. Am. Chem. Soc 1992, 114, 10663–10665.
- (60).Kay LE; Torchia DA; Bax A Backbone dynamics of proteins as studied by 15N inverse detected heteronuclear NMR spectroscopy: application to staphylococcal nuclease. Biochemistry 1989, 28, 8972–8979.
- (61).Tesei G; Martins JM; Kunze MBA; Wang Y; Crehuet R; Lindorff-Larsen K DEER-PREdict: Software for efficient calculation of spin-labeling EPR and NMR data from conformational ensembles. PLoS Comput. Biol 2021, 17, No. e1008551.
- (62).Rozycki B; Kim YC; Hummer G SAXS Ensemble Refinement of ESCRT-III CHMP3 Conformational Transitions. Structure 2011, 19, 109–116.
- (63).Satopaa V; Albrecht J; Irwin D; Raghavan B Finding a "Kneedle" in a Haystack: Detecting Knee Points in System Behavior, In 2011 31st International Conference on Distributed Computing Systems Workshops, 2011; pp 166–171.
- (64).Gomes GW; Namini A; Gradinaru CC Integrative Conformational Ensembles of Sic1 Using Different Initial Pools and Optimization Methods. Front Mol. Biosci 2022, 9, 910956.
- (65).Gillespie JR; Shortle D Characterization of long-range structure in the denatured state of staphylococcal nuclease. I. Paramagnetic relaxation enhancement by nitroxide spin labels. J. Mol. Biol 1997, 268, 158–169.
- (66).Ganguly D; Chen J Structural interpretation of paramagnetic relaxation enhancement-derived distances for disordered protein states. J. Mol. Biol 2009, 390, 467–477.
- (67).Ward JH Hierarchical Grouping to Optimize an Objective Function. J. Am. Stat Assoc 1963, 58, 236–244.
- (68).Wishart D 256. Note: An Algorithm for Hierarchical Classifications. Biometrics 1969, 25, 165–170.
- (69).Liu ZH; Teixeira JMC; Zhang O; Tsangaris TE; Li J; Gradinaru CC; Head-Gordon T; Forman-Kay JD Local Disordered Region Sampling (LDRS) for Ensemble Modeling of Proteins with Experimentally Undetermined or Low Confidence Prediction Segments. bioRxiv 2023, 2023.07.25.550520 DOI: 10.1101/2023.07.25.550520.
- (70).Pesce F; Newcombe EA; Seiffert P; Tranchant EE; Olsen JG; Grace CR; Kragelund BB; Lindorff-Larsen K Assessment of models for calculating the hydrodynamic radius of intrinsically disordered proteins. Biophys. J 2023, 122, 310–321.
- (71).Felitsky DJ; Lietzow MA; Dyson HJ; Wright PE Modeling transient collapsed states of an unfolded protein to provide insights into early folding events. Proc. Natl. Acad. Sci. U. S. A 2008, 105, 6278–6283.
- (72).Yamada J; Phillips JL; Patel S; Goldfien G; Calestagne-Morelli A; Huang H; Reza R; Acheson J; Krishnan VV; Newsam S; Gopinathan A; Lau EY; Colvin ME; Uversky VN; Rexach MF A bimodal distribution of two distinct categories of intrinsically disordered structures with separate functions in FG nucleoporins. Mol. Cell Proteomics 2010, 9, 2205–2224.
- (73).Sizemore SM; Cope SM; Roy A; Ghirlanda G; Vaiana SM Slow Internal Dynamics and Charge Expansion in the Disordered Protein CGRP: A Comparison with Amylin. Biophys. J 2015, 109, 1038–1048.
- (74).Bianchi G; Longhi S; Grandori R; Brocca S Relevance of Electrostatic Charges in Compactness, Aggregation, and Phase Separation of Intrinsically Disordered Proteins. Int. J. Mol. Sci 2020, 21, 6208.
- (75).Jo Y; Jang J; Song D; Park H; Jung Y Determinants for intrinsically disordered protein recruitment into phase-separated protein condensates. Chem. Sci 2022, 13, 522–530.
- (76).Vernon RM; Chong PA; Tsang B; Kim TH; Bah A; Farber P; Lin H; Forman-Kay JD Pi-Pi contacts are an overlooked protein feature relevant to phase separation. Elife 2018, 7, No. e31486.
- (77).Hofmann H; Soranno A; Borgia A; Gast K; Nettels D; Schuler B Polymer scaling laws of unfolded and intrinsically disordered proteins quantified with single-molecule spectroscopy. Proc. Natl. Acad. Sci. U. S. A 2012, 109, 16155–16160.
- (78).Das RK; Pappu RV Conformations of intrinsically disordered proteins are influenced by linear sequence distributions of oppositely charged residues. Proc. Natl. Acad. Sci. U. S. A 2013, 110, 13392–13397.
- (79).Sawle L; Ghosh K A theoretical method to compute sequence dependent configurational properties in charged polymers and proteins. J. Chem. Phys 2015, 143, 085101.
- (80).Holehouse AS; Das RK; Ahad JN; Richardson MO; Pappu RV CIDER: Resources to Analyze Sequence-Ensemble Relationships of Intrinsically Disordered Proteins. Biophys. J 2017, 112, 16–21.
- (81).Mao AH; Crick SL; Vitalis A; Chicoine CL; Pappu RV Net charge per residue modulates conformational ensembles of intrinsically disordered proteins. Proc. Natl. Acad. Sci. U. S. A 2010, 107, 8183–8188.
- (82).Wang X; Beugnet A; Murakami M; Yamanaka S; Proud CG Distinct Signaling Events Downstream of mTOR Cooperate To Mediate the Effects of Amino Acids and Insulin on Initiation Factor 4E-Binding Proteins. Mol. Cell. Biol 2005, 25, 2558–2572.
- (83).Bomblies R; Luitz MP; Zacharias M Molecular Dynamics Analysis of 4E-BP2 Protein Fold Stabilization Induced by Phosphorylation. J. Phys. Chem. B 2017, 121, 3387–3393.
- (84).Zeng J; Jiang F; Wu YD Mechanism of Phosphorylation-Induced Folding of 4E-BP2 Revealed by Molecular Dynamics Simulations. J. Chem. Theory Comput 2017, 13, 320–328.
- (85).Gibbs EB; Lu F; Portz B; Fisher MJ; Medellin BP; Laremore TN; Zhang YJ; Gilmour DS; Showalter SA Phosphorylation induces sequence-specific conformational switches in the RNA polymerase II C-terminal domain. Nat. Commun 2017, 8, 15233.
- (86).Martin EW; Holehouse AS; Grace CR; Hughes A; Pappu RV; Mittag T Sequence Determinants of the Conformational Properties of an Intrinsically Disordered Protein Prior to and upon Multisite Phosphorylation. J. Am. Chem. Soc 2016, 138, 15323–15335.
- (87).Qiao Q; Bowman GR; Huang X Dynamics of an intrinsically disordered protein reveal metastable conformations that potentially seed aggregation. J. Am. Chem. Soc 2013, 135, 16092–16101.
- (88).Choi UB; Sanabria H; Smirnova T; Bowen ME; Weninger KR Spontaneous Switching among Conformational Ensembles in Intrinsically Disordered Proteins. Biomolecules 2019, 9, 114.
- (89).Samanta HS; Chakraborty D; Thirumalai D Charge fluctuation effects on the shape of flexible polyampholytes with applications to intrinsically disordered proteins. J. Chem. Phys 2018, 149, 163323.
- (90).Schuler B; Borgia A; Borgia MB; Heidarsson PO; Holmstrom ED; Nettels D; Sottini A Binding without folding – the biomolecular function of disordered polyelectrolyte complexes. Curr. Opin Struc Biol 2020, 60, 66–76.
- (91).Wang W Recent advances in atomic molecular dynamics simulation of intrinsically disordered proteins. Phys. Chem. Chem. Phys 2021, 23, 777–784.
- (92).Luong TDN; Nagpal S; Sadqi M; Munoz V A modular approach to map out the conformational landscapes of unbound intrinsically disordered proteins. Proc. Natl. Acad. Sci. U. S. A 2022, 119, No. e2113572119.
- (93).Uversky VN Multitude of binding modes attainable by intrinsically disordered proteins: a portrait gallery of disorder-based complexes. Chem. Soc. Rev 2011, 40, 1623–1634.
- (94).Arai M; Sugase K; Dyson HJ; Wright PE Conformational propensities of intrinsically disordered proteins influence the mechanism of binding and folding. Proc. Natl. Acad. Sci. U. S. A 2015, 112, 9614–9619.
- (95).Chu X; Nagpal S; Muñoz V Molecular Simulations of Intrinsically Disordered Proteins and Their Binding Mechanisms, In Protein Folding: Methods and Protocols; Muñoz V, Ed.; Springer US: New York, NY, 2022; pp 343–362.
- (96).Lazar T; Martínez-Pérez E; Quaglia F; Hatos A; Chemes LB; Iserte JA; Méndez NA; Garrone NA; Saldaño TE; Marchetti J; Rueda AJV; Bernadó P; Blackledge M; Cordeiro TN; Fagerberg E; Forman-Kay JD; Fornasari MS; Gibson TJ; Gomes G-NW; Gradinaru CC; Head-Gordon T; Jensen MR; Lemke EA; Longhi S; Marino-Buslje C; Minervini G; Mittag T; Monzon AM; Pappu RV; Parisi G; Ricard-Blum S; Ruff KM; Salladini E; Skepö M; Svergun D; Vallet SD; Varadi M; Tompa P; Tosatto SCE; Piovesan D PED in 2021: a major update of the protein ensemble database for intrinsically disordered proteins. Nucleic Acids Res. 2021, 49, D404–D411.
- (97).Crehuet R; Buigues PJ; Salvatella X; Lindorff-Larsen K Bayesian-Maximum-Entropy Reweighting of IDP Ensembles Based on NMR Chemical Shifts. Entropy 2019, 21, 898.
