Abstract
Recent advances in residue-level coarse-grained (CG) computational models have enabled molecular-level insights into biological condensates of intrinsically disordered proteins (IDPs), shedding light on the sequence determinants of their phase separation. However, existing CG models that treat protein chains as flexible molecules connected via harmonic bonds cannot populate common secondary structure elements. Here, we present a CG dihedral angle potential between four neighboring beads centered at Cα atoms to faithfully capture the transient helical structures of IDPs. To parameterize and validate the new model, we propose Cα-based helix assignment rules based on dihedral angles that reproduce the atomistic helicity results of a polyalanine peptide and folded proteins. We then introduce sequence-dependent dihedral angle potential parameters (εd) and use experimentally available helical propensities of the twenty naturally occurring amino acids to find their optimal values. The single-chain helical propensities from the CG simulations of commonly studied, prion-like IDPs are in excellent agreement with NMR-based α-helix fraction predictions, demonstrating that the new HPS-SS model can accurately reproduce structural features of IDPs. Furthermore, the simplicity of the model makes it easy to implement for large-scale simulations.
Introduction
Intrinsically disordered proteins (IDPs) and regions (IDRs) play an important role in many biological functions in the human proteome, including DNA repair1, 2, cellular signaling3, and stress response4, and have been linked to a broad range of neurodegenerative diseases such as ALS1, Alzheimer’s5, and dementia6. IDPs are also associated with cellular liquid−liquid phase separation (LLPS), which forms membrane-less organelles5 and may drive pathological aggregation1. Interestingly, they are biologically functional without adopting a well-defined, stable tertiary structure, instead sampling many distinct residue-level conformations. Over the past three decades, several studies have focused on the determination of transient secondary structure in IDPs7–11, and various methods combining experimental and computational tools have been introduced to predict those residual structures12–21. Recent studies have shed light on the existence of residual helices22, 23 and β-sheet structures24, 25 in IDPs, which are found to be important for phase separation because they contribute to intermolecular interactions. Accurately identifying the transient secondary structures of IDPs that promote LLPS is vital to understanding the driving forces of membrane-less organelle formation and their role in disease pathology.
Numerous biophysical experimental methods have been used to characterize IDP conformational properties; these methods provide ensemble-averaged values rather than complete distributions of conformational and structural variables. Local secondary structure propensities can be obtained from residue-specific chemical shifts and residual dipolar couplings measured by solution nuclear magnetic resonance (NMR) spectroscopy26 and from circular dichroism (CD) ellipticities27, whereas small-angle X-ray scattering (SAXS) intensities give low-resolution information on the shape, size, and folding state of IDPs28, 29. Recent studies have also suggested that single-molecule Förster resonance energy transfer (smFRET) can provide transient populations of IDP structural conformations30. However, smFRET is a low-resolution tool that lacks atomistic detail31, and thus molecular dynamics (MD) simulations are essential as complementary tools for studying IDPs.
Fully atomistic force fields and water models for MD simulations have been improved extensively for studying IDPs, making them transferable without the need for any experimental input32. Although all-atom simulations are a reliable complement to experimental methods, they are not computationally efficient for simulating longer proteins and LLPS, which require very long timescales for convergence. In contrast, coarse-grained (CG) models that replace the atomistic details with a coarser protein description, often no more than 3 or 4 beads per residue, provide the speed-up needed to reach the time and length scales required to study phase separation of IDPs. The computational expense of conformational sampling is further reduced in CG simulations by representing the solvent implicitly rather than with explicit atomistic solvent models.
Numerous CG models have been created to study the structural and dynamic properties of IDP conformations. The commonly used coarse-grained force field Martini33, which maps two to four non-hydrogen atoms to one bead, can accurately capture electrostatic interactions but relies on constraints from native structures to maintain secondary structure. Therefore, it cannot accurately explore protein conformational fluctuations or predict the transient secondary structure propensity of proteins. Other models such as CAMELOT34 are parameterized for a specific system using machine-learning tools based on all-atom data, making them non-transferable. In recent CG models, higher levels of detail have been added to the backbone to account for changes in structural conformations in IDPs. The hybrid resolution (HyRes) model developed by Chen and Liu35 uses an atomistic description of the backbone and provides semi-quantitative secondary structure propensities, whereas the SOP-IDP force field36 is a two-bead model parameterized to reproduce the SAXS scattering profiles of IDPs. AWSEM-IDP, developed by Wolynes, Papoian, and colleagues37, includes Cα, Cβ, and O atoms and has been parameterized for IDPs using a bioinformatics-based memory term built from NMR experimental data and atomistic simulations to structurally bias protein fragments. Despite the challenges of using single Cα-based beads to describe local structure propensities, Latham and Zhang developed MOFF-IDP38 using a combination of maximum entropy biasing, least-squares fitting, and basic principles of energy landscape theory; the model reproduces the experimental radius of gyration for various IDPs, but its secondary structure potential is protein-specific.
Another attempt introduced a multibody pseudo-improper-dihedral potential term39 into the single-bead model to reproduce the nature of backbone and side-chain interactions between amino acids; however, the addition of this complex term made the simulations five times slower and harder to parallelize. Despite many efforts to develop CG models with backbone constraints, there is still a need for a transferable CG model that can capture the transient secondary structure of IDPs.
In parallel, numerous residue-level CG models that focus mainly on the LLPS of IDPs have been designed, namely Kim-Hummer40, HPS-KR41, FB-HPS42, TSCL-M243, and Mpipi44. We have also recently developed a Cα-based single-bead CG model that considers the physicochemical properties of all 20 naturally occurring amino acids, making it suitable for studying LLPS45. The model uses short-range van der Waals (vdW) and long-range electrostatic terms to describe non-bonded interactions, while bonded interactions are based only on the harmonic bond potential. The short-range potential uses the Ashbaugh-Hatch functional form46, parameterized based on the hydrophobicity scale proposed by Urry et al.47 Despite its success in studying IDPs, including predicting the short sticky regions that drive phase separation45, it cannot capture the local secondary structure of IDPs, since it lacks chain rigidity and depends solely on the harmonic bond potential between two consecutive beads.
Here, we introduce a transferable approach that does not require further experimental input by developing new dihedral angle potentials with control parameters that modulate the populations of α-helical and extended configurations. The dihedral angle potential parameters are derived from the experimentally measured helical propensities of all twenty naturally occurring amino acids48, using the Lifson & Roig model of the helix-coil transition49 with a model Ala-based host-guest peptide50. We also introduce Cα-based helix definition rules based on dihedral angles to compute the helical propensity at the residue level. Sequence-dependent dihedral angle parameters are introduced to control the helicity and are computed using various mixing rules that combine contributions from neighboring residues. We show the ability of our model to predict the transient α-helical structure of prion-like, low-complexity IDPs (TDP-43 CTD, hnRNPA2 LC, and FUS LC) and of nine semi-randomly selected test IDPs. Furthermore, we show how introducing the backbone potentials affects IDP conformations compared to the fully flexible model, and discuss the implications for studying LLPS of IDPs.
Model development
A. Coarse-Grained IDP model
The previously developed IDP model, “HPS-Urry,” has been successfully applied to understand the sequence-dependent LLPS of various IDPs45. This model is based on a single-bead representation of each amino acid, where the beads are centered at the corresponding Cα positions and connected by harmonic springs. However, the HPS-Urry model cannot capture the secondary structure elements of IDPs due to the absence of angle and dihedral angle potentials between successive beads. Here we develop novel, sequence-dependent dihedral angle potentials between four consecutive residues to properly capture the backbone rigidity and the secondary structure elements, especially the helical propensity. The total potential energy of the model is given by
$$U = U_{vdw} + U_{elec} + U_{bond} + U_{angle} + U_{dihed} \qquad (1)$$
where Uvdw and Uelec are the non-bonded pairwise short-range van der Waals (vdW) and long-range electrostatic interactions, respectively, while Ubond, Uangle, and Udihed are the bond, angle, and dihedral angle potentials between two, three, and four consecutive beads (Fig. 1A). Here, the first three terms, Uvdw, Uelec, and Ubond, in the potential energy are adopted from the HPS-Urry model (see Ref. 45 for more details).
Figure 1.

(a) Cα-based single-bead representation of amino acids. Schematic illustration of the bond, angle, and dihedral angle between two, three, and four consecutive beads in the HPS-SS model. (b) Dihedral angle potential energy. Potential energy vs. the control parameter, εd, which accounts for the populations of extended and α-helical configurations. The dashed vertical lines denote the inflection points of the dihedral-angle free energy from the all-atom Ala40 data (see Fig. S1). A more negative εd means a more stable α-helix configuration.
The non-bonded vdW interaction potential is modeled using the Ashbaugh and Hatch functional form given by46
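$$U_{vdw}(r) = \begin{cases} U_{LJ}(r) + (1-\lambda)\,\epsilon, & r \le 2^{1/6}\sigma \\ \lambda\, U_{LJ}(r), & \text{otherwise} \end{cases} \qquad (2)$$
where $U_{LJ}$ is the usual Lennard-Jones (LJ) potential:
$$U_{LJ}(r) = 4\epsilon\left[\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6}\right] \qquad (3)$$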
As in previous studies45, the LJ energy scale ε (in kcal/mol) is held fixed and, for a pair of residues (i,j), the pair values of λ and σ are obtained from the per-residue values, where λi and σi are the hydropathy value and vdW diameter of residue i, respectively. The electrostatic interaction between two charged residues is given by the simple Debye-Hückel screened potential
$$U_{elec}(r) = \frac{q_i q_j}{4\pi D r}\, e^{-\kappa r} \qquad (4)$$
where qi is the charge of residue i, D is the dielectric constant of water (set equal to 80), and κ is the inverse screening length (= 0.1 Å–1 at physiological salt concentration). Among the twenty amino acids, we set qi = +q0 for ARG and LYS and qi = −q0 for ASP and GLU, where q0 is the elementary charge of a proton.
The bond potential, Ubond, between two neighboring residues is represented by a harmonic potential
| (5) |
where r is the distance between two residues, k is the spring constant (set equal to 10 kcal/mol/Å2) and r0 is the equilibrium distance (= 3.82 Å).
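The non-bonded terms described above can be sketched in a few lines. This is a minimal illustration, not the authors' LAMMPS implementation: the function names are hypothetical, the default `eps=0.2` kcal/mol is an assumed placeholder for the LJ energy scale, and the Coulomb prefactor is the standard conversion constant for charges in units of the elementary charge, distances in Å, and energies in kcal/mol. The dielectric constant (80) and inverse screening length (0.1 Å–1) are the values stated in the text.

```python
import math

# Standard Coulomb conversion constant, kcal*Angstrom/(mol*e^2).
COULOMB_KCAL = 332.0637


def lennard_jones(r, eps, sigma):
    """Plain 12-6 Lennard-Jones energy."""
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 * sr6 - sr6)


def ashbaugh_hatch(r, lam, sigma, eps=0.2):
    """Ashbaugh-Hatch form: LJ shifted inside the minimum, scaled by lam outside.

    eps=0.2 kcal/mol is an ASSUMED illustrative value, not taken from the text.
    """
    r_min = 2.0 ** (1.0 / 6.0) * sigma
    if r <= r_min:
        return lennard_jones(r, eps, sigma) + (1.0 - lam) * eps
    return lam * lennard_jones(r, eps, sigma)


def debye_huckel(r, qi, qj, dielectric=80.0, kappa=0.1):
    """Screened Coulomb energy; charges in units of e, r in Angstrom."""
    return COULOMB_KCAL * qi * qj / (dielectric * r) * math.exp(-kappa * r)
```

With λ = 0 the pair interaction is purely repulsive, and with λ = 1 the full LJ attraction is recovered; the two branches meet continuously at the LJ minimum, where the energy equals −λε.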
B. Angle potential
The angle potential between three consecutive Cα atoms is represented by a double-well Gaussian with the parameterized functional form taken from Kim and Hummer40
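$$U_{angle}(\theta) = -\frac{1}{\gamma}\ln\left[ e^{-\gamma\left[k_\alpha(\theta-\theta_\alpha)^2 + \epsilon_\alpha\right]} + e^{-\gamma k_\beta(\theta-\theta_\beta)^2} \right] \qquad (6)$$
where θ is the angle between three consecutive residues, and the parameters γ (in mol/kcal), εα (in kcal/mol), θα and θβ (in rad), and kα and kβ (in kcal/(mol rad2)) are taken from Ref. 40. This double-well potential accounts for both helical angles (centered around θα) and extended angles (centered around θβ).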
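A short sketch can make the double-well shape of this angle potential concrete. The functional form below is the two-Gaussian logarithmic form associated with the Kim-Hummer family of models; the numerical parameter values are placeholders assumed here for illustration only and should be replaced by the values from Ref. 40. The function and constant names are hypothetical.

```python
import math

# ASSUMED placeholder parameters for illustration; take the real values from Ref. 40.
GAMMA = 0.1          # mol/kcal
EPS_ALPHA = 4.3      # kcal/mol
THETA_ALPHA = 1.60   # rad, helical well
THETA_BETA = 2.27    # rad, extended well
K_ALPHA = 106.4      # kcal/(mol rad^2)
K_BETA = 26.3        # kcal/(mol rad^2)


def angle_energy(theta):
    """Double-well angle energy (kcal/mol) for a Calpha-Calpha-Calpha angle in rad."""
    helical = math.exp(-GAMMA * (K_ALPHA * (theta - THETA_ALPHA) ** 2 + EPS_ALPHA))
    extended = math.exp(-GAMMA * K_BETA * (theta - THETA_BETA) ** 2)
    return -math.log(helical + extended) / GAMMA


if __name__ == "__main__":
    # Scan the angle to see two wells separated by a barrier.
    for theta in (1.5, 1.7, 1.95, 2.27, 2.6):
        print(f"theta = {theta:.2f} rad  U = {angle_energy(theta):7.3f} kcal/mol")
```

Because the two Gaussian terms are summed inside the logarithm, the potential follows whichever well dominates locally, and the barrier between the wells stays finite.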
C. Dihedral angle potential
Here our goal is to develop a dihedral angle potential that can accurately capture the transient α-helical structures of IDPs, which are prevalent in proteins and comparatively easy to predict. Several dihedral angle potentials based on periodic trigonometric functions have been introduced for protein CG models51. The dihedral angle force field developed by Karanicolas and Brooks52 uses a four-term cosine series for each pair of middle residues, resulting in 1600 sets of parameters (four cosine terms for each pair), and is based on statistical potentials derived from folded protein structures. In this article, we instead derive the dihedral angle potential phenomenologically from the experimentally available helical propensities of the twenty amino acids48. Specifically, the functional form of the dihedral angle potential is inspired by the dihedral angle distribution of the Cα-based coarse-grained trajectory obtained from the all-atom Ala40 simulation data, which shows a double-well shape (see Fig. S1). Therefore, we adopt a functional form similar to the double-well form of the angle potential:
| (7) |
where the dihedral angle is defined by four consecutive Cα atoms, the two multi-Gaussian functions represent the helical and extended configurations, respectively, and εd is the sequence-dependent parameter that controls the relative well depth of the two configurations. The helical part of the function is given by
| (8) |
where , , , , . The functional form for the extended configuration, , is given by
| (9) |
Here εd is the control parameter that depends on the sequence of the residues constituting the dihedral angle. As in the angle potential, the terms e0, e1, and e2 are responsible for the relative depths of the two basins as well as the energy barriers. The exponential functions with quartic terms are added to provide a reasonable energy barrier between the two configurations (Fig. 1B). Without these quartic terms, the energy barriers would be too high for meaningful exchange. The terms whose exponents are shifted by 2π guarantee the periodicity of Udihed as a function of the dihedral angle. The widths of the helix and extended basins are controlled by the respective width parameters (a larger value gives a narrower helix basin; a smaller value gives a wider extended basin), whereas the positions of their minima are set by the corresponding angle parameters.
Figure 1B shows the dihedral angle potential function for different values of the helix-control parameter, εd. As seen in the figure, the well depths of the two configurations are controlled by εd: negative εd favors the helical conformation, while positive εd favors the extended conformation. The εd parameter for a given dihedral angle is determined by the identity of its constituent residue beads and, if necessary, neighboring residues, as discussed below.
Simulation Details
CG simulations were performed using the LAMMPS molecular dynamics simulation package (Oct 2020 version)53, in which both the HPS-Urry and HPS-SS models have been implemented. The simulations were run in the NVT ensemble using a Langevin thermostat with a damping coefficient of 1 ps–1 and periodic boundary conditions. The production simulations were 5 μs long, following a 10 ns equilibration. Simulation data were saved every 1000th timestep.
All-atom simulations (AAMD) of Ala40 were run using OpenMM 7.5.154 for 1.4 μs with the Amber99SBws-STQ force field32 (with the TIP4P/2005 water model55), solvated in a truncated octahedron box of 65 Å.
Results & Discussion
A. Cα-based helix assignment rules
Prediction of secondary structure elements (SSEs) (helix, strand, coil) is an important step in characterizing IDP conformational ensembles. At the atomistic level, the secondary structure of IDPs can be assigned using automated tools such as DSSP56, 57 and Structural Identification (STRIDE)58. DSSP is the most widely used tool and assigns SSEs based on N-H and C=O hydrogen-bonding patterns. STRIDE uses both hydrogen-bonding patterns, with a modified hydrogen bond energy function, and backbone geometry for secondary structure analysis. Since our model is based on a single-bead representation of each amino acid, computing the helix propensities from Cα coordinates alone is a challenge. Several Cα-based assignment methods such as P-SEA59, SABA60, and PCASSO61 have been developed. P-SEA uses Cα distances (i to i + 2, i + 3, and i + 4) and two dihedral angle criteria for secondary structure assignment, whereas SABA uses pseudo-center (PC) positions between neighboring Cα beads and then assigns SSEs based on PC-dependent geometric criteria. The recently developed PCASSO predicts SSEs using a random forest (RF) approach, achieving high accuracy with respect to DSSP. Although these methods can be adopted for computing the helix propensities of Cα-based trajectories, here we introduce new helix assignment rules based solely on dihedral angles, mainly because the helix conformation in our model is controlled by the dihedral angle potential only.
In our scheme, each residue is assigned a binary code, either 0 or 1, depending on dihedral angles. Specifically, when the dihedral angle defined by residues i to i+3 falls within a specified range of the helical basin, we assign either 0 or 1 to each of the four residues according to the assignment rule. For example, under the (i,i+3) assignment rule, when the dihedral angle is in the helical range, the outer two residues i and i+3 are assigned 1 and the others 0 (see Fig. S2). Under the (i+1,i+2) assignment rule, the middle two residues, i+1 and i+2, are assigned 1 (Fig. S2). When the dihedral angle lies outside the helical range, all four constituent residues are assigned 0. Once a residue is assigned 1 by any of the dihedral angles in which it participates, it keeps the value 1 regardless of the other dihedral angles. A residue assigned 1 is said to be in a pseudo-helical (h’) state that can initiate α-helix formation (see Fig. S2). A residue i is then defined as helical (h) if the residue and its 2n neighboring residues (i-n, i-n+1, ..., i+n-1, i+n; n = 0, 1, 2, ...) are all in the pseudo-helical (h’) state (see Fig. S2). For n = 0, a pseudo-helical (h’) residue is automatically helical (h), while for n ≥ 1, the cooperativity of neighboring residues is required for a residue to be part of a helical conformation.
To determine the optimal helix assignment rule among all possible cases, we carried out an all-atom MD simulation of polyalanine, Ala40, and calculated the residue-level helical fractions (including α, 310, and π helices) using the DSSP algorithm (Fig. 2A). As anticipated from the high helical propensity of alanine, the all-atom MD simulation yields relatively high helical fractions (> 0.4) for Ala40. To calculate the helical fractions from Cα coordinates using our assignment rules, it is first necessary to identify the helical range of the dihedral angle. The free energy profile of the dihedral angle between successive Cα coordinates from the all-atom Ala40 data shows a steep potential well around 0.9 with inflection points at ~0.28 and ~1.48 (see Fig. S1). As a starting point, it is natural to define the helical basin as the region between these two inflection points.
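As an illustration of the quantity being analyzed, the following sketch computes the pseudo-dihedral angle from four consecutive Cα positions with NumPy. The helper that builds an ideal α-helix uses textbook helix geometry (radius of about 2.3 Å, rise of 1.5 Å, and 100° per residue); those numbers, the function names, and the sign convention (right-handed helix gives a positive angle) are assumptions of this sketch rather than details taken from the text.

```python
import numpy as np


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (rad) defined by four points, in (-pi, pi]."""
    b1, b2, b3 = p1 - p0, p2 - p1, p3 - p2
    n1 = np.cross(b1, b2)          # normal of the plane (p0, p1, p2)
    n2 = np.cross(b2, b3)          # normal of the plane (p1, p2, p3)
    y = np.linalg.norm(b2) * np.dot(b1, n2)
    x = np.dot(n1, n2)
    return float(np.arctan2(y, x))


def ideal_helix_ca(n_res, radius=2.3, rise=1.5, turn_deg=100.0):
    """Calpha positions of an ideal right-handed alpha-helix (ASSUMED textbook geometry)."""
    k = np.arange(n_res)
    ang = np.deg2rad(turn_deg) * k
    return np.column_stack([radius * np.cos(ang), radius * np.sin(ang), rise * k])


if __name__ == "__main__":
    coords = ideal_helix_ca(8)
    angles = [dihedral(*coords[k:k + 4]) for k in range(len(coords) - 3)]
    print(np.round(angles, 3))
```

For this idealized geometry the angle comes out close to the position of the free-energy well reported for the Ala40 trajectory, which is a useful sanity check on the sign convention before analyzing real trajectories.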
Figure 2. Helical assignment rule.

(a) Cα-based helical definitions based on all-atom MD (AAMD) Cα-trajectory worked well to match the helix fraction of Ala40 computed via all-atom DSSP criteria. (b) The selection of the single dihedral angle-based helix assignment rule, comparing the chi-squared (χ2) errors between the helix fractions obtained from AAMD Cα-trajectory and DSSP criteria.
The Cα-based helical fractions are then calculated for the different assignment rules using (0.28,1.48) as the helical range. Figure 2B shows χ2, the deviation of the CG-based helical fractions from the all-atom DSSP helical fractions, for all eleven possible assignment rules in which at least two residues are assigned the pseudo-helical state (h’), with varying n. The rules with two pseudo-helical states, i.e., (i,i+1), (i,i+2), …, show comparable χ2 minima, while those with more pseudo-helical states reach their minima at a different n. Closer examination of Fig. 2B shows that the (i+1,i+2) rule exhibits the smallest deviation from the DSSP helical fractions, although the differences from the other rules are insignificant. The (i+1,i+2) rule can be considered the most natural single-dihedral-angle rule, since the dihedral angle is defined as the rotation angle between the two planes formed by (i,i+1,i+2) and (i+1,i+2,i+3) around the middle bond between residues i+1 and i+2. Similarly, the dihedral angle potentials derived by Karanicolas and Brooks52 depend only on the identities of the two middle residues. Therefore, the (i+1,i+2) assignment rule with n = 1 is used herein to calculate the helicity of MD trajectories.
The sensitivity of the results to the helical range of the dihedral angle is further tested using the (i+1,i+2) assignment rule (with n = 1) and is found to be modest (see Fig. S3). For ranges with bounds around the inflection points (0.28 and 1.48), the CG-based helical fractions match the all-atom DSSP helical fractions very well, with χ2 less than 0.06 (Fig. S3). The dihedral angle range (0.25,1.3) gives the lowest χ2 value and is therefore adopted throughout this article.
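The single-dihedral assignment scheme described in this section can be sketched as follows. The function takes the list of dihedral angles along a chain (entry k is the dihedral of residues k to k+3, zero-based) and returns the pseudo-helical (h’) and helical (h) codes per residue. The function name is hypothetical, and two details are assumptions of this sketch because the text does not specify them: the helical range is treated as an open interval, and residues whose ±n window extends past a chain terminus are labeled non-helical.

```python
def assign_helix(dihedrals, helical_range=(0.25, 1.3), rule=(1, 2), n=1):
    """Single-dihedral helix assignment.

    rule lists the offsets (relative to i) that become pseudo-helical when the
    dihedral of residues i..i+3 is in range; (1, 2) is the (i+1,i+2) rule.
    Returns (pseudo, helical), two lists of 0/1 with one entry per residue.
    """
    n_res = len(dihedrals) + 3
    lo, hi = helical_range
    pseudo = [0] * n_res
    for k, phi in enumerate(dihedrals):
        if lo < phi < hi:                       # ASSUMPTION: open interval
            for offset in rule:
                pseudo[k + offset] = 1          # a 1 is never reset to 0
    helical = [0] * n_res
    for i in range(n_res):
        window = range(i - n, i + n + 1)
        # ASSUMPTION: windows reaching past the termini count as non-helical.
        if all(0 <= j < n_res and pseudo[j] for j in window):
            helical[i] = 1
    return pseudo, helical


if __name__ == "__main__":
    pseudo, helical = assign_helix([0.9, 0.9, 0.9, 0.9, 0.9])
    print(pseudo)   # per-residue h' codes
    print(helical)  # per-residue h codes
```

Averaging the `helical` list over all frames of a trajectory gives the residue-level helix fraction that is compared against DSSP in Fig. 2.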
B. Determination of residue-specific dihedral-angle potential parameters
Once the helical assignment rule is chosen, our next goal is to determine the sequence-dependent helix-control parameter, εd, of the dihedral angle potential in Eqs. (8) and (9). For a given dihedral angle, εd is determined by its four constituent residues i, i+1, i+2, and i+3 with appropriate weights:
$$\varepsilon_d = \frac{\sum_{j=i}^{i+3} \eta_j\, \varepsilon_{d,j}}{\sum_{j=i}^{i+3} \eta_j} \qquad (10)$$
where εd,j is the helical “energy” of residue j, which depends on the residue type, and ηj is the unnormalized weight. Here we derive εd,j from the experimental context-independent helical propensities of Moreau et al.48 for the 20 naturally occurring amino acids. For this purpose, the Lifson & Roig (LR) model of the helix-coil transition is adopted to estimate the helical propensities of the twenty amino acids from coarse-grained MD simulations. In the LR analysis, a residue is in a helical state if it is helical (h) according to our assignment rule; otherwise it is in a coil state. A residue in a coil state is assigned a statistical weight of 1.0. A residue of type i in a helical state but not within a helical segment is assigned a weight vi, while a residue within a helical segment is assigned a weight wi, except for the terminal residues of the segment, which have weight vi.49 The helical propensities, wi, are then fitted against the experimental data.48
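To make the Lifson-Roig weighting concrete, the sketch below evaluates the partition function of a homopolymer with the standard 3×3 LR transfer matrix and obtains the average fraction of w-weighted residues as a numerical derivative of ln Z with respect to ln w. This is the textbook forward model only; the fitting of wi from simulated trajectories used in this work is not reproduced here, and the function names and the finite-difference step are choices made for this illustration.

```python
import numpy as np


def lr_partition(n_res, w, v):
    """Lifson-Roig partition function of a homopolymer with n_res residues.

    Coil residues carry weight 1, helical residues with two helical
    neighbors carry w, and all other helical residues carry v.
    """
    m = np.array([[w, v, 0.0],
                  [0.0, 0.0, 1.0],
                  [v, v, 1.0]])
    left = np.array([0.0, 0.0, 1.0])
    right = np.array([0.0, 1.0, 1.0])
    return float(left @ np.linalg.matrix_power(m, n_res) @ right)


def lr_helix_fraction(n_res, w, v, dlnw=1e-6):
    """Mean fraction of w-weighted residues, (1/N) d ln Z / d ln w."""
    z_plus = lr_partition(n_res, w * np.exp(dlnw), v)
    z_minus = lr_partition(n_res, w * np.exp(-dlnw), v)
    return (np.log(z_plus) - np.log(z_minus)) / (2.0 * dlnw) / n_res


if __name__ == "__main__":
    for w in (0.5, 1.0, 1.5, 2.0):
        print(w, round(lr_helix_fraction(40, w, v=0.05), 4))
```

For a three-residue chain the matrix product can be checked by hand: enumerating the eight helix/coil states gives Z = 1 + 3v + 3v² + wv², which the code reproduces.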
First, the parameter εd for alanine is determined from the CG Ala40 simulations by computing the helical propensities for different εd values (Fig. S4). The chain length of 40 is chosen to minimize the effects of chain length on the helical propensities. Using the LR analysis, the helical propensity of alanine, wA, is calculated from the CG trajectories. The estimated helical propensity of alanine decreases monotonically with εd, as anticipated from the fact that the helical basin of the dihedral angle potential becomes shallower with increasing εd. The non-bonded interactions have little effect on wA, as three different values of the hydrophobicity, λ, yield results within 2% of one another. The simulated helical propensity, wA, is then compared to the experimental helical propensity of alanine48 to determine εd for alanine; the value that reproduces the experimental propensity is shown in Fig. S4.
Next, the εd parameters of the remaining 19 amino acids are determined by simulating Ala-based guest-host systems similar to the one used by Moreau et al.48 Here we use two systems, A20X4A20 and A20XA20, where X is the guest residue. Unlike in polyalanine A40, the identity of the guest residue influences the total potential energy: the non-bonded interactions depend on the hydrophobicity value of X, and the dihedral angle potential parameter, εd, is affected not only by the residue-specific parameter value but also by the weight factors, η. However, since the polyalanine simulations show little or no effect of the non-bonded pair interactions on the helical propensity, we expect this to be the case for the guest-host systems as well. Thus, we take λ for alanine from the HPS-Urry model, while two different λ values, including λ = 1, are used for the guest residue to confirm that the non-bonded pairwise interactions have a negligible effect on the estimated εd. For the dihedral angle potential, two weighting schemes, or mixing rules, are used to examine their effects on εd. These are referred to as the 0110 and 1111 mixing rules: under the 0110 mixing rule, the unnormalized weight factor, η, is set equal to 1 for residues i+1 and i+2 and 0 for the others, while under the 1111 mixing rule, the weight factors, η, of all four residues are set equal to 1.
Figure 3A shows the helical propensity of the guest residue estimated via the LR analysis for the 1111 mixing rule; the guest-residue helical propensity decreases to zero as εd increases. As expected, the hydrophobicity value of the guest residue has no effect on the helical propensity. Using the nineteen helical propensity data of Moreau et al.48, the control parameters, εd, are then determined for the remaining residues (see Table S1). Figure 3B shows that the estimated control parameters for the 1111 and 0110 mixing rules from the two guest-host systems have a linear relationship with the relative helical free energies derived from the NMR chemical shifts for the 20 amino acids by Moreau et al.48
Figure 3. Parameterization of the dihedral angle potential.

(a) Parameterization of amino acids using the A20XA20 guest-host system, via the (i+1,i+2) assignment rule with n = 1 and the 1111 mixing rule. (b) Control parameters (εd) obtained using the (i+1,i+2) assignment rule with n = 1 for different guest-host models and mixing rules have a linear relationship with the relative free energies for the 20 amino acids derived from the NMR chemical shifts48.
C. CG prediction of TDP-43 helical propensities
To test the ability of our CG model, with the parameters derived above, to predict helical secondary structure in IDPs, we simulated the C-terminal domain (CTD) of the TDP-43 protein, an RNA-binding protein whose aggregates have been implicated in neurodegenerative diseases such as amyotrophic lateral sclerosis (ALS)23. The TDP-43 CTD has been shown to exhibit a transient α-helical region spanning residues 320–34123 and is thus well suited for evaluating the performance of our HPS-SS model. Furthermore, ALS mutations, as well as targeted mutations designed through both experimental and computational efforts, have established the importance of the helical region in tuning the phase behavior of TDP-43.23, 62 It is therefore desirable that the HPS-SS model, with its ability to predict helical structure, can be applied to simulate the LLPS of TDP-43 with the correct helical conformations.
From the available NMR data23, a single residue-specific secondary structure propensity (SSP) score63 is calculated based on the deviations of the NMR chemical shifts from the Poulsen IDP/IUP random coil chemical shifts64, known as secondary chemical shifts.65 The SSP score in Fig. 4 shows a helical region between residues 320 and 341 with helical fractions exceeding 0.2. An MD simulation was then performed on the TDP-43 CTD with a GHM tag (TDP-43267–414) using our HPS-SS model. The residue-level helix fraction was computed using the Cα-based helical assignment rule and compared with the experimental data. Figure 4 shows the comparison between the CGMD results and the experimental SSP score. Overall, the HPS-SS model with three different parameter sets captures the most helical region (aa:321–334) correctly, with comparable helix fractions. On the other hand, the HPS-SS model with parameters derived from the A20X4A20 system overestimates the helicity outside 321–334. This can be explained by the fact that the εd parameters for non-alanine residues are lower than those from the A20XA20 system, making the corresponding dihedral angles more favorable to helix formation. The model parameters derived from the A20XA20 guest-host system with the 1111 mixing rule underestimate the helix fraction by ~10 % per residue in the region 321–334, while predicting overall low helix fractions outside this region, in agreement with the experimental data.
Figure 4.

Comparison of the helix fraction of TDP-43 CTD obtained via parameterization of residues using the (i+1,i+2) assignment rule with n = 1 for different mixing rules with the NMR-based SSP-predicted helix fraction.
The largest discrepancy between simulations and experiments occurs at residues 320 and 321 (320 being proline). The NMR chemical shifts show that the α-helical segment (aa: 320–340) of TDP-43267–414 starts with Pro-320. On the other hand, the εd parameter of Pro in the HPS-SS model is the most positive, implying that Pro is the least helix-stabilizing residue. Within the current mixing rules, the presence of Pro in the sequence proves to be helix-breaking, leading to a near-zero helix fraction.
D. New helical assignment rule based on two dihedral angles
The main drawback of the simple, single-dihedral-angle based helical assignment rule is the failure to detect the helicity of proline that starts as the first residue of a helix segment. Although proline is known to act as a helix breaker, it can act as a hydrogen bond acceptor via the carbonyl group, forming the hydrogen bond with the N-H group of the amino acid located four residues later (i+4) along the protein sequence66 (Fig. 5A). Therefore, a new helical assignment rule that accounts for the (i,i+4) hydrogen bonding is warranted.
Figure 5.

Schematic representation of (a) hydrogen bond formation of proline (residue i) via its carbonyl oxygen with the amino group of residue i+4 to stabilize the α-helix. (b) The (i,i+4) helical assignment rule based on the two consecutive dihedral angles between residues i and i+4. (c) Control parameters (εd) obtained using the (i,i+4) helical assignment rule with the 1–1001-1 mixing rule and the A20XA20 guest-host system.
Here we consider the two consecutive dihedral angles spanning residues i to i+4 to define helicity. When both dihedral angles lie in the helical range, residues i and i+4 are defined as pseudo-helical (h’), as shown in Fig. 5B; we refer to this as the (i,i+4) assignment rule. As above, a residue i is defined as helical (h) if the residue and its neighboring residues (i-1 and i+1 for n = 1) are in the pseudo-helical (h’) state (Fig. 5B). Note that our (i,i+4) rule is similar to the P-SEA scheme (with n = 0)59 except for the distance criteria. The dihedral angle range for the helical basin is optimized by fitting the helix fraction against the all-atom Ala40 simulation data. With a helical range of (0.25,1.7), the (i,i+4)-rule-based helical fractions for the helical region match the trajectory-averaged all-atom DSSP results very well (see Fig. S5). Again, these values are not far from the inflection points (0.28,1.48) of the dihedral angle free energy of the all-atom Ala40 trajectory (Fig. S1). The new (i,i+4) rule is also in good agreement with the all-atom DSSP results (α-helix only) in capturing the individual conformations of a given residue over the trajectory, as presented in Fig. S6. For instance, the new rule was 93% accurate relative to the DSSP assignments for the 10th residue (see Fig. S6A and S7A), whereas the minimum accuracy, 72%, was found for the residue at position 30 (see Fig. S6B and S7B). Additionally, the new helical assignment rule is validated against the folded proteins from Lindorff-Larsen et al.67 by comparing the helix fractions from DSSP and from the (i,i+4) rule based on Cα coordinates from the Protein Data Bank68 (see Fig. S8). The average χ2 value is 0.14, which shows that the assignment rule faithfully locates the helix positions in folded proteins.
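The two-dihedral (i,i+4) rule can be sketched in the same style as the single-dihedral rule. Entry k of the input list is the dihedral of residues k to k+3 (zero-based), so residues k and k+4 become pseudo-helical when entries k and k+1 are both in the helical range. The function name is hypothetical; treating the range (0.25,1.7) as an open interval and labeling residues whose ±n window extends past a terminus as non-helical are assumptions of this sketch.

```python
def assign_helix_i_i4(dihedrals, helical_range=(0.25, 1.7), n=1):
    """(i,i+4) helix assignment from two consecutive dihedral angles.

    Returns (pseudo, helical), two lists of 0/1 with one entry per residue.
    """
    n_res = len(dihedrals) + 3
    lo, hi = helical_range
    in_range = [lo < phi < hi for phi in dihedrals]   # ASSUMPTION: open interval
    pseudo = [0] * n_res
    for k in range(len(dihedrals) - 1):
        # Both dihedrals spanning residues k..k+4 must be helical.
        if in_range[k] and in_range[k + 1]:
            pseudo[k] = 1
            pseudo[k + 4] = 1
    helical = [0] * n_res
    for i in range(n_res):
        window = range(i - n, i + n + 1)
        # ASSUMPTION: windows reaching past the termini count as non-helical.
        if all(0 <= j < n_res and pseudo[j] for j in window):
            helical[i] = 1
    return pseudo, helical


if __name__ == "__main__":
    pseudo, helical = assign_helix_i_i4([0.9] * 7)
    print(pseudo)
    print(helical)
```

Unlike the (i+1,i+2) rule, this rule marks the first residue spanned by the two dihedral angles as pseudo-helical, which is what allows a helix-initiating residue such as proline to be counted.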
The control parameter for a given dihedral angle is determined by considering the fact that two hydrogen bonds, (i−1,i+3) and (i,i+4), force the dihedral angle to be in the helical range. Therefore, the dihedral angle potential is affected mostly by its two outer residues, i and i+3, as well as by their neighboring residues, i−1 and i+4. Hence, a new mixing rule for a dihedral angle, denoted by 1–1001-1 (where 1- and -1 refer to the preceding and succeeding residues, respectively), is given by
| (11) |
with the weight of residue j equal to 1 for j = i−1, i, i+3, and i+4 and 0 otherwise. The residue-specific parameters for the twenty amino acids are derived via the same procedure as for the (i+1,i+2) rule above (see Fig. S9) and range from −2.59 for Ala to 3.70 for Pro, as shown in Fig. 5C.
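To make the 1–1001-1 weighting concrete, the following Python sketch combines residue-specific parameters for the dihedral formed by residues i, i+1, i+2, and i+3. It is only an illustration under stated assumptions: the exact form of Eq. 11 (in particular, whether the weighted sum is normalized) is not reproduced here, so normalization is exposed as an option; neighbors that fall beyond the chain ends are simply skipped; and the dictionary holds only the two values quoted in the text (Ala and Pro), with the remaining residues left to Table S1. The names `EPS_D`, `WEIGHTS`, and `mixed_control_parameter` are hypothetical.

```python
# Residue-specific control parameters. Only the two values quoted in the text
# are included here; the other 18 residues would be filled in from Table S1.
EPS_D = {"A": -2.59, "P": 3.70}

# 1-1001-1 weights for offsets -1..+4 relative to the first residue i of the
# dihedral formed by residues i, i+1, i+2, i+3.
WEIGHTS = {-1: 1, 0: 1, 1: 0, 2: 0, 3: 1, 4: 1}


def mixed_control_parameter(seq, i, eps=EPS_D, normalize=False):
    """Combine residue parameters for the dihedral starting at residue i (0-based).

    ASSUMPTION: the mixing is a weighted sum (optionally divided by the total
    weight); the published Eq. 11 may differ in its normalization.
    """
    if not 0 <= i <= len(seq) - 4:
        raise ValueError("dihedral i..i+3 must lie inside the sequence")
    total, weight_sum = 0.0, 0
    for offset, w in WEIGHTS.items():
        j = i + offset
        # ASSUMPTION: neighbors beyond the chain ends do not contribute.
        if w and 0 <= j < len(seq):
            total += w * eps[seq[j]]
            weight_sum += w
    if normalize and weight_sum:
        return total / weight_sum
    return total


if __name__ == "__main__":
    seq = "AAPAAA"
    for i in range(len(seq) - 3):
        print(i, round(mixed_control_parameter(seq, i), 2))
```

In this toy sequence, the proline at position 2 raises the value only for dihedrals in which it sits at a weighted position (i−1, i, i+3, or i+4), which is the behavior the mixing rule is meant to capture.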
Figure 6 shows the residue-level helical fractions for TDP-43 from the CG model with the new (i,i+4) helical assignment rule and the corresponding parameters (see Table S1), compared to the experimental SSP score. The CG model captures both the local trend and the magnitude of the α-helical fraction of TDP-43 (residues 267–414). In particular, the helical fraction at Pro-320 is significantly higher (~0.2) than in the single-dihedral-angle-based model presented above. Helix fractions of the conserved region (aa:320–343) match quantitatively with the NMR data. The HPS-SS model is also applied to two other RNA-binding proteins, namely FUS LC69 and hnRNPA2 LC70, which have been shown experimentally to be fully disordered (i.e., low helical fraction). Our model also predicts low helix fractions for both proteins, in agreement with the NMR chemical shifts (Fig. S10). In summary, the HPS-SS model with the dihedral potential parameters accurately predicts the local helical propensities of the prion-like, low-complexity IDPs.
Figure 6.

Helix fractions obtained using the newly developed HPS-SS model for TDP-43 CTD, compared against the SSP-predicted helix fraction. Our model faithfully captures both the trend and the magnitude of the α-helical fraction. Snapshot of TDP-43 CTD simulated using the HPS-SS model (aa:320–334 is shown in red).
E. Effects of bonded potentials on conformational ensemble of IDPs
To explore how the addition of backbone constraints via angle and dihedral angle potentials affects the conformations of IDPs, MD simulations are carried out on 42 previously studied IDPs45 of various lengths and sequences using the HPS-SS model with the (i,i+4) rule-based parameters (see Table S1).
Figure 7 shows the average radius of gyration (Rg) for these IDPs in comparison with the previous model (HPS-Urry), which has no angle or dihedral angle potentials. As shown in Fig. 7, the Rg values from the HPS-SS model are smaller than those from the HPS-Urry model except for a few IDPs (notably tau). The Rg distributions of most IDPs from the HPS-SS model are shifted towards smaller values (see Fig. S11). The smaller Rg values for the HPS-SS model may be due to cooperative non-bonded interactions between locally rigid chain segments, resulting in stronger attractions. However, over all 42 IDPs, the addition of the angle and dihedral angle potentials results in an overall deviation of less than ~9% from the HPS-Urry model based on the chi-square scoring function45. Among the 42 IDPs simulated here, r15, r17, protα-N, protα-C, protein-L, and erm show significant helix fractions (> 0.3, see Fig. S12). The difference between the Rg distributions, as well as between the helix fractions, from the two models is measured using the relative entropy (Kullback-Leibler divergence)71 and the Jensen-Shannon divergence metric72. These measures show that the major deviations in the Rg distribution from the HPS-Urry model come from the high helical content of the proteins (ProTα-C, R17d, IBB, hCyp, K44) captured by the HPS-SS model (Figs. S11 and S12).
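As an illustration of how two Rg distributions can be compared with these measures, the sketch below computes the Kullback-Leibler and Jensen-Shannon divergences between histograms built on shared bin edges. It is a generic textbook implementation, not the authors' analysis script: the function names, the binning, the use of natural logarithms, and the handling of empty bins are assumptions made here. Note that the Jensen-Shannon divergence itself is returned; its square root is the quantity that satisfies the properties of a metric.

```python
import math


def histogram(samples, edges):
    """Normalized histogram over shared bin edges (samples outside the edges are ignored)."""
    counts = [0] * (len(edges) - 1)
    for x in samples:
        for k in range(len(counts)):
            if edges[k] <= x < edges[k + 1]:
                counts[k] += 1
                break
    total = sum(counts)
    if total == 0:
        raise ValueError("no samples fall inside the bin edges")
    return [c / total for c in counts]


def kl_divergence(p, q):
    """Relative entropy D_KL(p || q) in nats for two discrete distributions."""
    d = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            if qi == 0.0:
                return math.inf  # p has weight where q has none
            d += pi * math.log(pi / qi)
    return d


def js_divergence(p, q):
    """Jensen-Shannon divergence in nats; symmetric and bounded by ln 2."""
    m = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


if __name__ == "__main__":
    # Hypothetical Rg samples (nm) from two models, for illustration only.
    rg_model_a = [2.1, 2.3, 2.4, 2.6, 2.8, 3.0]
    rg_model_b = [1.9, 2.0, 2.2, 2.3, 2.5, 2.6]
    edges = [1.5, 2.0, 2.5, 3.0, 3.5]
    p, q = histogram(rg_model_a, edges), histogram(rg_model_b, edges)
    print("KL:", kl_divergence(p, q), "JS:", js_divergence(p, q))
```

Because the Kullback-Leibler divergence diverges whenever one model never visits a bin that the other populates, the bounded Jensen-Shannon form is often the more robust choice for finite trajectories.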
Figure 7.

Scatter plot of simulated Rg using the HPS-Urry and the HPS-SS models for 42 IDPs. Black (dot-dash) lines represent an ideal linear relationship between these two models. Simulated Rg values from the two models have a strong linear relationship.
F. Limitations of the model and future work
We showed that our model works well with prion-like, low-complexity IDPs. The predictive capabilities of the model are tested against nine semi-randomly selected IDPs with available NMR data, namely ACTR73, α-synuclein74, p5375, tau-k1876, SIC177, paaA278, Ntail79, NCBD80, and N17-polyQ81. Overall, our model correctly predicts the helix positions in most proteins but fails to match the magnitude of the helicity for some of them (see Fig. 8). As for the low-complexity IDPs with no substantial helical propensity, the new model does not overestimate the helicities of p53, tau-k18, and SIC1. It also identifies the helical regions of IDPs with high residual helicity such as paaA2, NCBD, Ntail, and N17-polyQ. Among the test proteins, α-synuclein is predicted by our model to have quite high helicity even though it should have minimal helical propensity in its free state74. However, the locations of the helices predicted by the model match the helix locations of micelle-bound α-synuclein74. The same holds for ACTR73, 82, for which the model predicts two helical domains compared to one main residual helical structure in the free state73. When the amino acid sequences of the test proteins are analyzed, the charged residues arginine and glutamic acid are found to show larger deviations from the experimental helix fractions. Helix formation by charged residues may be more complex than assumed in our model since it involves salt bridges83 and side-chain interactions84. Given its simplicity (it is based on Cα positions only), the current model can still be considered a good approach, with limitations that can be addressed in future work.
Figure 8.

Helix fractions obtained using the HPS-SS model for 9 IDPs, compared against the SSP-predicted helix fractions. For ACTR and α-synuclein, the experimental helix fractions in the free (red) and bound (magenta) states are given.
Conclusions
Here, we have introduced new dihedral angle potentials that allow us to compute the transient helical propensity of IDPs. The dihedral angle potentials do not require any further input for helical fraction calculations and are parameterized using experimental context-independent helical propensities of amino acids. We proposed new Cα-based helical assignment rules, based solely on dihedral angles, that agree well with DSSP results for all-atom MD simulations and folded proteins. The single-chain helical propensities of TDP-43 CTD, FUS LC, and hnRNPA2 LC computed using our CG model are in good agreement with NMR data. The new dihedral angle potentials allow us to accurately predict the helical propensities of IDPs, with some limitations, and can be easily implemented in large-scale assembly simulations of IDPs, thus enabling us to explore the role of structure in the function and phase separation of IDPs.
Supplementary Material
Acknowledgments
We would like to thank Dr. Greg Dignon and Prof. Wenwei Zheng for their useful discussions relating to many of the topics discussed in this article. Research reported in this publication was supported by the National Institute of General Medical Sciences of the National Institutes of Health under Award Number R01GM136917. Research on TDP-43 phase separation was supported by NINDS and NIA R01NS116176. Y.C.K. is supported by the Office of Naval Research via the U.S. Naval Research Laboratory base program.
Footnotes
Data and Software Availability
The source code needed to run the HPS-SS model within the LAMMPS (Oct 2020) package is provided at the following location (https://github.com/azamat-rizuan/HPS-SS-model). The repository also contains the script for the Cα-based helix assignment rules along with validation data from an atomistic single-chain simulation of the A40 peptide. Input parameters for the new model with example input files, a helix propensity script based on the adopted Lifson & Roig (LR) model of the helix-coil transition, validation data sets, and radius of gyration data are also available within the repository. Other source data can be obtained from the corresponding authors upon request.
Supporting Information
Details of the helical assignment rules and dihedral potential: graphical explanations of the assignment rules, correlation between the rules and DSSP in estimating individual residue conformations over the trajectory, validation of the assignment rules against folded proteins, parameterization and testing of the HPS-SS model using those rules, a table of the control parameters for the 20 naturally occurring amino acids obtained using the helical assignment rules, and the free energy distribution of the dihedral angle obtained from the all-atom Ala40 simulation trajectory.
References
- 1. Patel A; Lee HO; Jawerth L; Maharana S; Jahnel M; Hein MY; Stoynov S; Mahamid J; Saha S; Franzmann TM; Pozniakovski A; Poser I; Maghelli N; Royer LA; Weigert M; Myers EW; Grill S; Drechsel D; Hyman AA; Alberti S, A Liquid-to-Solid Phase Transition of the ALS Protein FUS Accelerated by Disease Mutation. Cell 2015, 162, 1066–1077.
- 2. Altmeyer M; Neelsen KJ; Teloni F; Pozdnyakova I; Pellegrino S; Grofte M; Rask MD; Streicher W; Jungmichel S; Nielsen ML; Lukas J, Liquid demixing of intrinsically disordered proteins is seeded by poly(ADP-ribose). Nat Commun 2015, 6, 8088.
- 3. Wippich F; Bodenmiller B; Trajkovska MG; Wanka S; Aebersold R; Pelkmans L, Dual specificity kinase DYRK3 couples stress granule condensation/dissolution to mTORC1 signaling. Cell 2013, 152, 791–805.
- 4. Biamonti G; Vourc'h C, Nuclear stress bodies. Cold Spring Harb Perspect Biol 2010, 2, a000695.
- 5. Boeynaems S; Alberti S; Fawzi NL; Mittag T; Polymenidou M; Rousseau F; Schymkowitz J; Shorter J; Wolozin B; Van Den Bosch L; Tompa P; Fuxreiter M, Protein Phase Separation: A New Phase in Cell Biology. Trends Cell Biol 2018, 28, 420–435.
- 6. Mateju D; Franzmann TM; Patel A; Kopach A; Boczek EE; Maharana S; Lee HO; Carra S; Hyman AA; Alberti S, An aberrant phase transition of stress granules triggered by misfolded protein and prevented by chaperone function. EMBO J 2017, 36, 1669–1687.
- 7. Zhang O; Forman-Kay JD, Structural characterization of folded and unfolded states of an SH3 domain in equilibrium in aqueous buffer. Biochemistry 1995, 34, 6784–94.
- 8. Bertoncini CW; Jung YS; Fernandez CO; Hoyer W; Griesinger C; Jovin TM; Zweckstetter M, Release of long-range tertiary interactions potentiates aggregation of natively unstructured alpha-synuclein. Proc Natl Acad Sci U S A 2005, 102, 1430–5.
- 9. Neri D; Billeter M; Wider G; Wuthrich K, NMR determination of residual structure in a urea-denatured protein, the 434-repressor. Science 1992, 257, 1559–63.
- 10. Sgourakis NG; Yan Y; McCallum SA; Wang C; Garcia AE, The Alzheimer's peptides Abeta40 and 42 adopt distinct conformations in water: a combined MD / NMR study. J Mol Biol 2007, 368, 1448–57.
- 11. Jensen MR; Bernado P; Houben K; Blanchard L; Marion D; Ruigrok RWH; Blackledge M, Structural Disorder within Sendai Virus Nucleoprotein and Phosphoprotein: Insight into the Structural Basis of Molecular Recognition. Protein and Peptide Letters 2010, 17, 952–960.
- 12. Dill KA; Shortle D, Denatured states of proteins. Annu Rev Biochem 1991, 60, 795–825.
- 13. Dyson HJ; Wright PE, Peptide Conformation and Protein-Folding. Curr. Opin. Struct. Biol 1993, 3, 60–65.
- 14. Smith LJ; Bolin KA; Schwalbe H; MacArthur MW; Thornton JM; Dobson CM, Analysis of main chain torsion angles in proteins: prediction of NMR coupling constants for native and random coil conformations. J Mol Biol 1996, 255, 494–506.
- 15. Bernado P; Blanchard L; Timmins P; Marion D; Ruigrok RW; Blackledge M, A structural model for unfolded proteins from residual dipolar couplings and small-angle x-ray scattering. Proc Natl Acad Sci U S A 2005, 102, 17002–7.
- 16. Bernadó P; Mylonas E; Petoukhov MV; Blackledge M; Svergun DI, Structural characterization of flexible proteins using small-angle X-ray scattering.
- 17. Camilloni C; De Simone A; Vranken WF; Vendruscolo M, Determination of Secondary Structure Populations in Disordered States of Proteins Using Nuclear Magnetic Resonance Chemical Shifts. Biochemistry 2012, 51, 2224–2231.
- 18. Yachdav G; Kloppmann E; Kajan L; Hecht M; Goldberg T; Hamp T; Honigschmid P; Schafferhans A; Roos M; Bernhofer M; Richter L; Ashkenazy H; Punta M; Schlessinger A; Bromberg Y; Schneider R; Vriend G; Sander C; Ben-Tal N; Rost B, PredictProtein--an open resource for online prediction of protein structural and functional features. Nucleic Acids Res 2014, 42, W337–43.
- 19. Sormanni P; Camilloni C; Fariselli P; Vendruscolo M, The s2D method: simultaneous sequence-based prediction of the statistical populations of ordered and disordered regions in proteins. J Mol Biol 2015, 427, 982–996.
- 20. Estana A; Barozet A; Mouhand A; Vaisset M; Zanon C; Fauret P; Sibille N; Bernado P; Cortes J, Predicting Secondary Structure Propensities in IDPs Using Simple Statistics from Three-Residue Fragments. J Mol Biol 2020, 432, 5447–5459.
- 21. Thomasen FE; Lindorff-Larsen K, Conformational ensembles of intrinsically disordered proteins and flexible multidomain proteins. Biochem Soc Trans 2022, 50, 541–554.
- 22. Mittal J; Yoo TH; Georgiou G; Truskett TM, Structural ensemble of an intrinsically disordered polypeptide. J Phys Chem B 2013, 117, 118–24.
- 23. Conicella AE; Zerze GH; Mittal J; Fawzi NL, ALS Mutations Disrupt Phase Separation Mediated by alpha-Helical Structure in the TDP-43 Low-Complexity C-Terminal Domain. Structure 2016, 24, 1537–49.
- 24. Murray DT; Kato M; Lin Y; Thurber KR; Hung I; McKnight SL; Tycko R, Structure of FUS Protein Fibrils and Its Relevance to Self-Assembly and Phase Separation of Low-Complexity Domains. Cell 2017, 171, 615–627 e16.
- 25. Hughes MP; Sawaya MR; Boyer DR; Goldschmidt L; Rodriguez JA; Cascio D; Chong L; Gonen T; Eisenberg DS, Atomic structures of low-complexity protein segments reveal kinked β sheets that assemble networks. Science 2018, 359, 698–701.
- 26. Schneider R; Huang JR; Yao M; Communie G; Ozenne V; Mollica L; Salmon L; Jensen MR; Blackledge M, Towards a robust description of intrinsic protein disorder using nuclear magnetic resonance spectroscopy. Mol Biosyst 2012, 8, 58–68.
- 27. Greenfield NJ, Using circular dichroism spectra to estimate protein secondary structure. Nat Protoc 2006, 1, 2876–90.
- 28. Kachala M; Valentini E; Svergun DI, Application of SAXS for the Structural Characterization of IDPs. In Intrinsically Disordered Proteins Studied by NMR Spectroscopy, Felli IC; Pierattelli R, Eds. Springer International Publishing: Cham, 2015; pp 261–289.
- 29. Bernado P; Svergun DI, Structural analysis of intrinsically disordered proteins by small-angle X-ray scattering. Mol Biosyst 2012, 8, 151–67.
- 30. Schuler B; Soranno A; Hofmann H; Nettels D, Single-Molecule FRET Spectroscopy and the Polymer Physics of Unfolded and Intrinsically Disordered Proteins. Annu Rev Biophys 2016, 45, 207–31.
- 31. Bari KJ; Prakashchand DD, Fundamental Challenges and Outlook in Simulating Liquid-Liquid Phase Separation of Intrinsically Disordered Proteins. J Phys Chem Lett 2021, 12, 1644–1656.
- 32. Tang WS; Fawzi NL; Mittal J, Refining All-Atom Protein Force Fields for Polar-Rich, Prion-like, Low-Complexity Intrinsically Disordered Proteins. J Phys Chem B 2020, 124, 9505–9512.
- 33. Marrink SJ; Risselada HJ; Yefimov S; Tieleman DP; de Vries AH, The MARTINI force field: coarse grained model for biomolecular simulations. J Phys Chem B 2007, 111, 7812–24.
- 34. Ruff KM; Harmon TS; Pappu RV, CAMELOT: A machine learning approach for coarse-grained simulations of aggregation of block-copolymeric protein sequences. J Chem Phys 2015, 143, 243123.
- 35. Liu X; Chen J, HyRes: a coarse-grained model for multi-scale enhanced sampling of disordered protein conformations. Phys Chem Chem Phys 2017, 19, 32421–32432.
- 36. Baul U; Chakraborty D; Mugnai ML; Straub JE; Thirumalai D, Sequence Effects on Size, Shape, and Structural Heterogeneity in Intrinsically Disordered Proteins. J Phys Chem B 2019, 123, 3462–3474.
- 37. Wu H; Wolynes PG; Papoian GA, AWSEM-IDP: A Coarse-Grained Force Field for Intrinsically Disordered Proteins. J Phys Chem B 2018, 122, 11115–11125.
- 38. Latham AP; Zhang B, Maximum Entropy Optimized Force Field for Intrinsically Disordered Proteins. J Chem Theory Comput 2020, 16, 773–781.
- 39. Mioduszewski L; Rozycki B; Cieplak M, Pseudo-Improper-Dihedral Model for Intrinsically Disordered Proteins. J Chem Theory Comput 2020, 16, 4726–4733.
- 40. Kim YC; Hummer G, Coarse-grained models for simulations of multiprotein complexes: application to ubiquitin binding. J Mol Biol 2008, 375, 1416–33.
- 41. Dignon GL; Zheng W; Kim YC; Best RB; Mittal J, Sequence determinants of protein phase behavior from a coarse-grained model. PLoS Comput Biol 2018, 14, e1005941.
- 42. Dannenhoffer-Lafage T; Best RB, A Data-Driven Hydrophobicity Scale for Predicting Liquid-Liquid Phase Separation of Proteins. J Phys Chem B 2021, 125, 4046–4056.
- 43. Tesei G; Schulze TK; Crehuet R; Lindorff-Larsen K, Accurate model of liquid-liquid phase behavior of intrinsically disordered proteins from optimization of single-chain properties. Proc Natl Acad Sci U S A 2021, 118, e2111696118.
- 44. Joseph JA; Reinhardt A; Aguirre A; Chew PY; Russell KO; Espinosa JR; Garaizar A; Collepardo-Guevara R, Physics-driven coarse-grained model for biomolecular phase separation with near-quantitative accuracy. Nature Computational Science 2021, 1, 732–743.
- 45. Regy RM; Thompson J; Kim YC; Mittal J, Improved coarse-grained model for studying sequence dependent phase separation of disordered proteins. Protein Sci 2021, 30, 1371–1379.
- 46. Ashbaugh HS; Hatch HW, Natively unfolded protein stability as a coil-to-globule transition in charge/hydropathy space. J Am Chem Soc 2008, 130, 9536–42.
- 47. Urry DW; Gowda DC; Parker TM; Luan CH; Reid MC; Harris CM; Pattanaik A; Harris RD, Hydrophobicity scale for proteins based on inverse temperature transitions. Biopolymers 1992, 32, 1243–50.
- 48. Moreau RJ; Schubert CR; Nasr KA; Torok M; Miller JS; Kennedy RJ; Kemp DS, Context-independent, temperature-dependent helical propensities for amino acid residues. J Am Chem Soc 2009, 131, 13107–16.
- 49. Lifson S; Roig A, On the Theory of Helix-Coil Transition in Polypeptides. The Journal of Chemical Physics 1961, 34, 1963–1974.
- 50. Best RB; de Sancho D; Mittal J, Residue-specific alpha-helix propensities from molecular simulation. Biophys J 2012, 102, 1462–7.
- 51. Chng CP; Yang LW, Coarse-grained models reveal functional dynamics--II. Molecular dynamics simulation at the coarse-grained level--theories and biological applications. Bioinform Biol Insights 2008, 2, 171–85.
- 52. Karanicolas J; Brooks CL 3rd, The origins of asymmetry in the folding transition states of protein L and protein G. Protein Sci 2002, 11, 2351–61.
- 53. Plimpton S, Fast Parallel Algorithms for Short-Range Molecular Dynamics. Journal of Computational Physics 1995, 117, 1–19.
- 54. Eastman P; Swails J; Chodera JD; McGibbon RT; Zhao Y; Beauchamp KA; Wang LP; Simmonett AC; Harrigan MP; Stern CD; Wiewiora RP; Brooks BR; Pande VS, OpenMM 7: Rapid development of high performance algorithms for molecular dynamics. PLoS Comput Biol 2017, 13, e1005659.
- 55. Abascal JL; Vega C, A general purpose model for the condensed phases of water: TIP4P/2005. J Chem Phys 2005, 123, 234505.
- 56. Kabsch W; Sander C, Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 1983, 22, 2577–637.
- 57. Touw WG; Baakman C; Black J; te Beek TA; Krieger E; Joosten RP; Vriend G, A series of PDB-related databanks for everyday needs. Nucleic Acids Res 2015, 43, D364–8.
- 58. Frishman D; Argos P, Knowledge-based protein secondary structure assignment. Proteins 1995, 23, 566–79.
- 59. Labesse G; Colloc'h N; Pothier J; Mornon JP, P-SEA: a new efficient assignment of secondary structure from C alpha trace of proteins. Comput Appl Biosci 1997, 13, 291–5.
- 60. Park SY; Yoo MJ; Shin J; Cho KH, SABA (secondary structure assignment program based on only alpha carbons): a novel pseudo center geometrical criterion for accurate assignment of protein secondary structures. BMB Rep 2011, 44, 118–22.
- 61. Law SM; Frank AT; Brooks CL 3rd, PCASSO: a fast and efficient Calpha-based method for accurately assigning protein secondary structure elements. J Comput Chem 2014, 35, 1757–61.
- 62. Conicella AE; Dignon GL; Zerze GH; Schmidt HB; D'Ordine AM; Kim YC; Rohatgi R; Ayala YM; Mittal J; Fawzi NL, TDP-43 alpha-helical structure tunes liquid-liquid phase separation and function. Proc Natl Acad Sci U S A 2020, 117, 5883–5894.
- 63. Marsh JA; Singh VK; Jia Z; Forman-Kay JD, Sensitivity of secondary structure propensities to sequence differences between alpha- and gamma-synuclein: implications for fibrillation. Protein Sci 2006, 15, 2795–804.
- 64. Kjaergaard M; Poulsen FM, Sequence correction of random coil chemical shifts: correlation between neighbor correction factors and changes in the Ramachandran distribution. J Biomol NMR 2011, 50, 157–65.
- 65. Wang Y; Wishart DS, A simple method to adjust inconsistently referenced 13C and 15N chemical shift assignments of proteins. J Biomol NMR 2005, 31, 143–8.
- 66. Chakrabarti P; Chakrabarti S, C--H...O hydrogen bond involving proline residues in alpha-helices. J Mol Biol 1998, 284, 867–73.
- 67. Lindorff-Larsen K; Piana S; Dror RO; Shaw DE, How fast-folding proteins fold. Science 2011, 334, 517–20.
- 68. Berman HM; Westbrook J; Feng Z; Gilliland G; Bhat TN; Weissig H; Shindyalov IN; Bourne PE, The Protein Data Bank. Nucleic Acids Res 2000, 28, 235–42.
- 69. Burke KA; Janke AM; Rhine CL; Fawzi NL, Residue-by-Residue View of In Vitro FUS Granules that Bind the C-Terminal Domain of RNA Polymerase II. Mol Cell 2015, 60, 231–41.
- 70. Ryan VH; Dignon GL; Zerze GH; Chabata CV; Silva R; Conicella AE; Amaya J; Burke KA; Mittal J; Fawzi NL, Mechanistic View of hnRNPA2 Low-Complexity Domain Structure, Interactions, and Phase Separation Altered by Mutation and Arginine Methylation. Mol Cell 2018, 69, 465–479 e7.
- 71. Kullback S; Leibler RA, On Information and Sufficiency. The Annals of Mathematical Statistics 1951, 22, 79–86.
- 72. Fuglede B; Topsoe F, Jensen-Shannon divergence and Hilbert space embedding. In International Symposium on Information Theory, 2004. ISIT 2004. Proceedings., 27 June-2 July 2004; 2004; p 31.
- 73. Ebert MO; Bae SH; Dyson HJ; Wright PE, NMR relaxation study of the complex formed between CBP and the activation domain of the nuclear hormone receptor coactivator ACTR. Biochemistry 2008, 47, 1299–308.
- 74. Rao JN; Kim YE; Park LS; Ulmer TS, Effect of pseudorepeat rearrangement on alpha-synuclein misfolding, vesicle binding, and micelle binding. J Mol Biol 2009, 390, 516–29.
- 75. Wong TS; Rajagopalan S; Freund SM; Rutherford TJ; Andreeva A; Townsley FM; Petrovich M; Fersht AR, Biophysical characterizations of human mitochondrial transcription factor A and its binding to tumor suppressor p53. Nucleic Acids Res 2009, 37, 6765–83.
- 76. Barre P; Eliezer D, Structural transitions in tau k18 on micelle binding suggest a hierarchy in the efficacy of individual microtubule-binding repeats in filament nucleation. Protein Sci 2013, 22, 1037–48.
- 77. Mittag T; Marsh J; Grishaev A; Orlicky S; Lin H; Sicheri F; Tyers M; Forman-Kay JD, Structure/function implications in a dynamic complex of the intrinsically disordered Sic1 with the Cdc4 subunit of an SCF ubiquitin ligase. Structure 2010, 18, 494–506.
- 78. Sterckx YG; Volkov AN; Vranken WF; Kragelj J; Jensen MR; Buts L; Garcia-Pino A; Jove T; Van Melderen L; Blackledge M; van Nuland NA; Loris R, Small-angle X-ray scattering- and nuclear magnetic resonance-derived conformational ensemble of the highly flexible antitoxin PaaA2. Structure 2014, 22, 854–65.
- 79. Houben K; Marion D; Tarbouriech N; Ruigrok RW; Blanchard L, Interaction of the C-terminal domains of sendai virus N and P proteins: comparison of polymerase-nucleocapsid interactions within the paramyxovirus family. J Virol 2007, 81, 6807–16.
- 80. Karlsson E; Sorgenfrei FA; Andersson E; Dogan J; Jemth P; Chi CN, The dynamic properties of a nuclear coactivator binding domain are evolutionarily conserved. Commun Biol 2022, 5, 286.
- 81. Urbanek A; Popovic M; Morato A; Estana A; Elena-Real CA; Mier P; Fournet A; Allemand F; Delbecq S; Andrade-Navarro MA; Cortes J; Sibille N; Bernado P, Flanking Regions Determine the Structure of the Poly-Glutamine in Huntingtin through Mechanisms Common among Glutamine-Rich Human Proteins. Structure 2020, 28, 733–746 e5.
- 82. Demarest SJ; Chung J; Dyson HJ; Wright PE, Assignment of a 15 kDa protein complex formed between the p160 coactivator ACTR and CREB binding protein.
- 83. Meuzelaar H; Vreede J; Woutersen S, Influence of Glu/Arg, Asp/Arg, and Glu/Lys Salt Bridges on alpha-Helical Stability and Folding Kinetics. Biophys J 2016, 110, 2328–2341.
- 84. Maxfield FR; Scheraga HA, The effect of neighboring charges on the helix forming ability of charged amino acids in proteins. Macromolecules 1975, 8, 491–3.
