Abstract
The formation of membraneless organelles (MLOs) via liquid-liquid phase separation (LLPS) of biomolecules has recently garnered significant attention in the scientific community. Experimental studies have revealed that intrinsically disordered proteins (IDPs) may play a major role in driving the formation of these droplets via LLPS by forming multivalent interactions between amino acids. Quantifying these interactions is an arduous task, as it is difficult to investigate them at the amino acid level using currently available experimental tools. It therefore becomes necessary to complement experimental studies with appropriate computational methods, such as coarse-grained models of IDPs, that allow one to simulate biomolecular LLPS on general-purpose hardware. Here, we summarize our coarse-grained modeling framework, which uses a resolution of a single bead per amino acid together with the coexistence sampling technique to study sequence-specific protein phase separation using molecular dynamics simulations. We further discuss the caveats and technicalities that one must consider while using this method to obtain thermodynamic phase diagrams. To ease the learning curve, we provide our implementations of the coarse-grained potentials in the HOOMD-Blue simulation package and associated Python scripts to run such simulations.
Keywords: HPS model, co-existence slab simulations, coarse-grained modeling
1. Introduction
Membraneless organelles (MLOs) are concentrated compartments consisting of biomolecules that lack a membrane separating their contents from the surrounding cytosol [1-4]. These MLOs are formed via liquid-liquid phase separation (LLPS), in which the loss of entropy upon demixing of a homogeneous mixture is compensated by the favorable enthalpy of intermolecular interactions between the components [5]. Previous studies have shown that for most MLOs, intrinsically disordered proteins (IDPs) play a significant role in driving or promoting phase separation. IDPs are proteins that lack a well-defined three-dimensional structure and instead adopt a range of conformations [6,7]. Because IDPs require an ensemble description, it is difficult for any single experimental technique to provide a complete view of their structural properties [8]. Computational methods can therefore offer complementary information to describe the phase separation properties [9-11].
The investigation of the intermolecular interactions driving protein phase separation has been an uphill task for experimental studies as these IDPs are known to have multiple different modes of interactions amongst themselves, which is challenging to quantify solely based on experimental studies. One would have to look for solutions in the computational domain where methods like all-atom simulations [12-14] would provide a detailed view of all possible interactions driving or disrupting phase separation. Currently, it is computationally prohibitive to directly simulate protein phase separation using an all-atom representation, and these methods are instead used to provide molecular information on the atomic modes of interactions [15-18]. Hence, one must turn to simpler models, which can capture the sequence-specific details of proteins deemed necessary to study their phase behavior while still providing molecular or sequence-level information [19]. These methods range from analytical and empirical theories, bioinformatics, to coarse-grained (CG) molecular simulations. Every model has its merit based on the questions one would like to ask. For instance, analytical or numerical mean-field polymer models have the advantage of quickly providing valuable information on the chain-length dependent multivalent interactions, the contribution of charges, and other sequence-descriptors based on amino acid properties [5,20-25]. However, such models often require experimental input in a system-specific manner and are limited in their ability to provide details on molecular structures and dynamics. Molecular simulations based on highly simplified particle-based CG models can further bridge this gap and allow one to study phase behaviors, molecular interactions responsible for stabilizing the liquid-like state, and the dynamics within the condensed phase. 
The simplest CG model that can capture the IDP sequences is based on representing each amino acid with a single bead, and such a model was proposed recently by us to study sequence-dependent LLPS of IDPs [9].
In this chapter, we discuss the technical details of this off-lattice CG model, henceforth referred to as the HPS (HydroPathy Scale) model, which represents IDPs as flexible polymers with 20 different types of monomer beads and solvent interactions captured implicitly in terms of nonbonded inter-residue interactions. Even with such a highly CG representation, it is almost impossible to compute the phase behavior (co-existence envelope between the high-density and low-density phases) using commonly employed sampling techniques such as replica exchange molecular dynamics, as expected based on the nature of the phase transition involved [26]. We have successfully overcome this challenge by using the co-existence method [27], in which the two phases are simulated next to each other in a single simulation cell. Here, we provide more details on how to construct the system using the HPS model and practically conduct MD simulations on a publicly available software package HOOMD-Blue [28]. We also provide scripts using which the reader would be able to set up such simulations for a system of their interest. Our primary goal in writing this chapter is to help the reader understand the underlying details of the model, the methods employed in our previous work and what challenges to expect when conducting such simulations.
2. HPS model
Our CG modeling approach for IDPs uses one bead per amino acid level of resolution (Fig. 1A), and IDP sequence details are preserved using different parameter values for all the twenty naturally occurring amino acids (Fig. 1B). The following parameters completely define the identity of each amino acid in our model: the mass of the amino acid, the charge of the amino acid at near neutral pH, the diameter of the amino acid (σ), and the hydropathy (λ) that we believe can capture the short-range interactions between amino acids. We compute σ by mapping the van der Waals volume of an amino acid to a sphere [29]. The λ value for each amino acid is estimated using an atomistic hydropathy scale proposed by Kapcha and Rossky [30], which is based on partial charges of atoms belonging to an amino acid in an all-atom force field. For convenience, the λ values for all the amino acids are scaled to range from 0 (Arginine) to 1 (Phenylalanine).
Fig. 1.
HPS model. A) One residue per bead coarse-grained (CG) representation of a peptide. B) Parameters used in the HPS model. C) The energy function of short-range pairwise interaction for representative amino acids.
In the CG model energy function, we have three different types of interactions, namely, bonded, electrostatic, and short-range pairwise interactions. The bonds between consecutive amino acids in the protein sequence are modeled using a harmonic potential with a spring constant of 10 kJ/Å² and a bond length of 3.8 Å. Hence the protein representation in our CG model resembles a freely jointed polymer chain where amino acids are represented by spheres that are connected by harmonic springs. The electrostatic interactions are modeled using a Debye-Hückel electrostatic screening term [31],
$$E_{ij}(r) = \frac{q_i q_j}{4 \pi D r} \exp\left(-\frac{r}{\kappa}\right) \tag{1}$$
in which κ is the Debye screening length, which depends on the ionic strength, and D = 80 is the dielectric constant of the solvent. The Debye length is inversely proportional to the square root of the ionic strength, and a Debye screening length of 10 Å corresponds to ~100 mM ionic strength at room temperature. This functional form is available in HOOMD-Blue using the function hoomd.md.pair.yukawa [28].
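The relation between ionic strength and screening length can be checked with a few lines of Python. The following is a minimal sketch rather than part of the distributed scripts: the function names are hypothetical, the Debye length is computed from the standard expression for an electrolyte in SI units, and the Coulomb prefactor of 332.0637 kcal Å/(mol e²) is an assumed unit convention for evaluating the screened Coulomb term of eq. 1.

```python
import math

# Physical constants in SI units
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
KB = 1.380649e-23            # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19   # elementary charge, C
NA = 6.02214076e23           # Avogadro constant, 1/mol
COULOMB = 332.0637           # e^2/(4 pi eps0) in kcal Å/(mol e^2), assumed units


def debye_length_angstrom(ionic_strength_molar, temperature=298.15, dielectric=80.0):
    """Debye screening length in Å for an ionic strength given in mol/L."""
    ionic_strength_si = ionic_strength_molar * 1000.0  # mol/L -> mol/m^3
    length_m = math.sqrt(EPS0 * dielectric * KB * temperature
                         / (2.0 * NA * E_CHARGE ** 2 * ionic_strength_si))
    return length_m * 1e10  # m -> Å


def debye_huckel_energy(r, qi, qj, kappa, dielectric=80.0):
    """Screened Coulomb energy (kcal/mol); r and kappa in Å, charges in units of e."""
    return COULOMB * qi * qj / (dielectric * r) * math.exp(-r / kappa)


if __name__ == "__main__":
    kappa = debye_length_angstrom(0.1)  # 100 mM, D = 80, 298.15 K
    print(f"Debye length: {kappa:.2f} Å")
    print(f"Energy of a +1/-1 pair at 10 Å: {debye_huckel_energy(10.0, 1, -1, kappa):.3f} kcal/mol")
```

With D = 80 and T = 298.15 K, 100 mM gives roughly 9.7 Å, in line with the ~10 Å quoted above. To our understanding, the Yukawa potential in HOOMD-Blue is written in terms of exp(−κr), so the value passed as its kappa argument would be the inverse of the length computed here; this should be checked against the documentation of the installed version.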
The following functional form by Ashbaugh-Hatch is used for the short-range pairwise interactions [32]:
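$$\Phi(r) = \begin{cases} \Phi_{LJ}(r) + (1 - \lambda)\,\epsilon, & r \le 2^{1/6}\sigma \\ \lambda\,\Phi_{LJ}(r), & \text{otherwise} \end{cases} \tag{2}$$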
where ΦLJ is the standard 12-6 Lennard-Jones potential given by,
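$$\Phi_{LJ}(r) = 4\epsilon\left[\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6}\right] \tag{3}$$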
Both the hydropathy (λij) and the diameter (σij) for a pair of amino acids (i,j) are set to the arithmetic average of the corresponding values of the two amino acids. Small values of λij represent weak interactions between the two amino acids, whereas high values represent strong inter-residue interactions, although the exact atomic modes of interaction, such as hydrophobic, cation-π, and hydrogen bonding, are not explicitly included in our CG approach. The interaction parameter (ε) in eq. 3 is a free parameter that sets the absolute scale of interactions among different amino acid pairs; it was set to 0.2 kcal/mol to reproduce the experimentally measured radii of gyration of IDPs [9]. The resulting potential energy as a function of distance for three representative amino acids is shown in Fig. 1C. In practice, the functional form for these nonbonded interactions can be realized using the function azplugins.pair.ashbaugh() provided within the ‘azplugins’ component (plugin) for HOOMD-Blue, found at https://github.com/mphowardlab/azplugins.
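To make the shape of the short-range potential concrete, the sketch below evaluates the Ashbaugh-Hatch form in plain Python, assuming the standard piecewise definition in which the Lennard-Jones potential is shifted by (1 − λ)ε inside its minimum at 2^(1/6)σ and scaled by λ outside it. The function names and the bead diameter of 6.0 Å used in the example are illustrative assumptions, not parameters of the model.

```python
EPSILON = 0.2  # kcal/mol, the interaction scale quoted in the text


def lennard_jones(r, sigma, epsilon=EPSILON):
    """Standard 12-6 Lennard-Jones potential."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def ashbaugh_hatch(r, sigma, lam, epsilon=EPSILON):
    """Ashbaugh-Hatch potential: LJ shifted inside the minimum, scaled by lam outside."""
    r_min = 2.0 ** (1.0 / 6.0) * sigma  # location of the LJ minimum
    if r <= r_min:
        return lennard_jones(r, sigma, epsilon) + (1.0 - lam) * epsilon
    return lam * lennard_jones(r, sigma, epsilon)


def pair_parameters(sigma_i, lam_i, sigma_j, lam_j):
    """Arithmetic-mean combination rule for the diameter and hydropathy of a pair."""
    return 0.5 * (sigma_i + sigma_j), 0.5 * (lam_i + lam_j)


if __name__ == "__main__":
    # Hypothetical pair: equal diameters, hydropathies 0 and 1
    sigma, lam = pair_parameters(6.0, 0.0, 6.0, 1.0)
    for r in (6.0, 6.5, 7.0, 8.0, 10.0):
        print(f"r = {r:5.1f} Å  U = {ashbaugh_hatch(r, sigma, lam):8.4f} kcal/mol")
```

The two branches meet at the minimum, where the potential equals −λε, so λ = 1 recovers the full Lennard-Jones potential and λ = 0 leaves only the repulsive core.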
Next, we would like to point out some salient features of our model to the reader, which make it well suited for studying phase separation and the formation of MLOs compared to other existing coarse-grained models for IDPs and folded proteins [33,34]. First of all, the sequence specificity of the HPS model allows us to investigate interactions at the amino acid level and therefore the effects of mutations and sequence patterning on phase behavior [9,35]. Second, the HPS model is a transferable modeling framework, which does not require experimental input on a system of interest after parameterization, with potential extensibility to non-natural amino acids and other biomolecules such as nucleic acids. Given the limited number of free parameters in the HPS model, it should be relatively easier to optimize model parameters further as more experimental data is available [36]. Third, additional features such as temperature dependence of short-range interactions can be introduced in the model to capture the thermoresponsive LLPS behavior of IDPs accurately [37]. These features of the CG modeling framework make it a nifty tool to utilize when studying LLPS in protein systems of interest in biology or material science.
3. Simulation strategy for obtaining phase diagrams
To construct a thermodynamic phase diagram, estimation of the protein saturation concentrations in the protein-rich and protein-deficient phases is required as a function of control variables such as temperature. Obtaining these estimates using commonly used simulation techniques such as the Gibbs ensemble method or grand canonical simulations for phase behavior calculations of simple soft matter systems is a herculean task that cannot be accomplished in a reasonable time even with a CG off-lattice model [38,39]. An alternate sampling method is, therefore, necessary to overcome the enthalpy gap between the two phases [26,27]. One efficient way of achieving this in brute-force molecular dynamics simulations is to construct the system in a way by which the two phases coexist (commonly referred to as the co-existence method) with a common interface to allow for exchanges of molecules between the two phases for sufficient equilibration of densities in these phases. The use of planar interfaces in a slab configuration is effective in reducing the system-size effects as opposed to a spherical droplet configuration to sample the interfacial properties and saturation densities of co-existing phases [27,40].
The following section provides detailed information on how to set up and analyze co-existence simulations to construct phase diagrams and is organized into three separate subsections, each explaining the steps the reader would have to take to implement our methodology, i.e., 1) how to create the initial configuration of protein chains for a coexistence simulation, 2) how to calculate density profiles over the simulation box and check for convergence, and 3) how to construct the phase diagram given the density profiles as a function of the temperature.
3.1. Preparing an initial slab configuration
In the first step, we select an equilibrated IDP conformation based on the HPS model. This can be done by using the HPS model and HOOMD-Blue, or by using any other software packages such as LAMMPS [41] or ABSINTH [10]. Then we place required copies of the protein chain in a selected conformation or multiple conformations on a periodic lattice inside a cubic simulation box large enough to avoid clashes between monomers of different protein chains (Fig. 2A). The selection of the number of chains for such simulations is a balance between the accuracy (system-size dependent effects) and the computational power needed to conduct simulations to sufficient timescales. We provide some practical guidance in this direction later in this chapter.
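The lattice placement described above can be sketched as follows. This is an illustrative sketch only: the helper names are hypothetical, a simple cubic lattice is assumed, and the lattice spacing is left to the user, who would choose it larger than the maximum extent of the selected chain conformation so that copies do not clash.

```python
import itertools


def lattice_origins(n_chains, spacing):
    """Return n_chains sites of a simple cubic lattice centered on the origin, and the box side."""
    n_side = 1
    while n_side ** 3 < n_chains:  # smallest grid that holds all chains
        n_side += 1
    box = n_side * spacing
    sites = itertools.product(range(n_side), repeat=3)
    origins = [tuple((i + 0.5) * spacing - 0.5 * box for i in site)
               for site in itertools.islice(sites, n_chains)]
    return origins, box


def place_chains(template, n_chains, spacing):
    """Copy one chain conformation (list of (x, y, z) in Å) onto each lattice site."""
    n_beads = len(template)
    center = [sum(bead[k] for bead in template) / n_beads for k in range(3)]
    centered = [tuple(bead[k] - center[k] for k in range(3)) for bead in template]
    origins, box = lattice_origins(n_chains, spacing)
    chains = [[(x + ox, y + oy, z + oz) for x, y, z in centered]
              for ox, oy, oz in origins]
    return chains, box


if __name__ == "__main__":
    # Hypothetical three-bead chain with 3.8 Å bonds, copied 10 times
    template = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    chains, box = place_chains(template, 10, spacing=20.0)
    print(len(chains), "chains in a cubic box of side", box, "Å")
```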
Fig. 2.
Illustration of preparing a slab configuration. A) IDPs in arbitrary lattice positions. B) Configurations equilibrated with NPT simulations in a periodic cubic box. C) The z-axis of the cubic box is extended to make a slab configuration.
Once the proteins are arranged in a low-density configuration, we use it as the initial configuration for an NPT simulation to condense these protein chains into a smaller cubic periodic box of the desired side length. This generates a high-density configuration at protein concentrations higher than those observed experimentally for many IDPs, e.g., 200-500 mg/ml (Fig. 2B). This step can be done efficiently in HOOMD-Blue using the MTK barostat [42] provided in the function hoomd.md.integrate.npt. One can iteratively select a target pressure for a given temperature to condense protein chains into the appropriate box size. In our experience, this process is usually rapid and completes within tens of nanoseconds of simulation time. An alternative method to generate a high-density condensed phase is to simply reduce the box size over a certain runtime using the function hoomd.update.box_resize(). The resulting configuration can be well equilibrated in the high-density condensed phase before the next step, either by allowing the box resizing to happen over a sufficiently long time or by conducting an NVT simulation over tens of nanoseconds, so that the next step starts from a low potential energy configuration.
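When choosing the target box size for the condensation step, it helps to convert between the side length of the cubic box and the protein concentration. The short sketch below does this arithmetic; the function names are hypothetical, and the chain mass of 17,000 g/mol used in the example is an assumed round number for illustration rather than a value taken from a specific protein.

```python
AVOGADRO = 6.02214076e23      # 1/mol
ANGSTROM3_PER_ML = 1.0e24     # 1 mL = 1 cm^3 = 1e24 Å^3


def box_side_for_concentration(n_chains, chain_mass, conc_mg_per_ml):
    """Side length (Å) of a cubic box holding n_chains at the target concentration.

    chain_mass is the molar mass of one chain in g/mol.
    """
    total_mass_mg = n_chains * chain_mass / AVOGADRO * 1000.0  # g -> mg
    volume_ml = total_mass_mg / conc_mg_per_ml
    return (volume_ml * ANGSTROM3_PER_ML) ** (1.0 / 3.0)


def concentration_mg_per_ml(n_chains, chain_mass, side):
    """Protein concentration (mg/mL) of n_chains in a cubic box of the given side (Å)."""
    total_mass_mg = n_chains * chain_mass / AVOGADRO * 1000.0
    return total_mass_mg / (side ** 3 / ANGSTROM3_PER_ML)


if __name__ == "__main__":
    # Hypothetical system: 100 chains of 17,000 g/mol condensed to 400 mg/mL
    side = box_side_for_concentration(100, 17000.0, 400.0)
    print(f"Target cubic box side: {side:.1f} Å")
```

For this hypothetical system the target side length comes out at a little over 190 Å, which gives a sense of how compact the condensed cubic box is relative to the initial lattice.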
Selecting a condensed configuration within the cubic box from the NPT simulation or after resizing the box, we now extend the box length in the z-direction to create empty space to allow for redistribution of protein chains in two co-existing high- and low-density phases (Fig 2C). To avoid boundary artifacts, as the periodic configuration from the previous step is placed in a new periodic box with a different z-dimension, care should be taken to move the monomers of a protein chain that are split across the periodic boundary in the z-direction to make each protein whole. This extended slab with protein chains in the center (Fig. 3A) would then serve as the initial configuration for our coexistence simulations, as described in the next section.
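Making each chain whole before extending the box amounts to a minimum-image unwrapping along z. A minimal sketch is given below, assuming that the z-coordinates of one chain are stored in sequence order and that no bond is longer than half the box length; the function name is hypothetical.

```python
def make_chain_whole(z_coords, box_z):
    """Unwrap the z-coordinates of one chain so that bonded beads are nearest images.

    z_coords: z positions of consecutive beads, wrapped into the periodic box
    box_z:    box length along z before the extension
    """
    whole = [z_coords[0]]
    for z in z_coords[1:]:
        dz = z - whole[-1]
        dz -= box_z * round(dz / box_z)  # minimum-image bond vector along z
        whole.append(whole[-1] + dz)
    return whole


if __name__ == "__main__":
    # A chain split across the boundary of a 100 Å box
    print(make_chain_whole([48.0, 49.5, -49.0, -47.5], 100.0))
```

After unwrapping, some beads lie slightly outside the original box, which is harmless once the z-dimension has been extended, since the new box is longer than the old one.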
Fig. 3.
Obtaining the density profile using the co-existence simulation method.
3.2. Obtaining density profiles
With the initial slab configuration (Fig. 3A), we conduct 5 μs NVT simulations at a range of temperatures so that we can obtain coexistence densities to make a phase diagram as a function of protein concentration and temperature. As we go from low to high temperatures, we see that protein chains break off the dense central region and enter the space created in the extended z-direction of the slab simulation box more frequently, until a dynamic equilibrium is established after a sufficient simulation time (Fig. 3B). At very high temperatures, we see all the proteins that were initially condensed at the center break off and become distributed throughout the simulation box in a single phase, which indicates that the simulation temperature is higher than the critical temperature. The presence of a critical temperature above which the system exists in a single phase is characteristic of systems with an upper critical solution temperature (UCST). Many systems can also display lower critical solution temperature (LCST) behavior, for which phase separation occurs above the critical temperature [43]. Some systems show both UCST and LCST behavior, which would present an hourglass-shaped phase diagram, as a single phase is observed between the two critical temperatures. We have recently proposed an extension of our coarse-grained modeling framework to capture the thermoresponsive behavior of proteins by introducing temperature-dependent nonbonded interactions [37]. For the sake of simplicity, we will proceed to explain our methodology with the original model here.
During the co-existence NVT simulation, the interface between the low- and high-density phases tends to drift along the z-direction (Fig. 3B), which makes it difficult to determine the average concentration profile as a function of z without further processing to account for this slab mobility. To overcome this, we center all the saved trajectory frames so that the dense phase is placed at z=0 in every frame. In practice, we first identify the largest protein cluster inside the simulation box by counting the number of protein chains in the clusters found in each frame. We then center the simulation box on the center of mass of the largest cluster for each frame, as shown in Fig. 3C. This gives us a processed trajectory file in which the dense slab phase is at the center of the simulation box and which can be used for further analysis. Another way to fix the location of the interface is to use a spatial autocorrelation function to align the concentration profiles of each frame so that the highest peak of intensity lies at the center of the simulation box [44]. The ‘centered’ trajectory is then binned along the z-direction to calculate the protein density as a function of the z-coordinate, averaged over all the frames, as shown in Fig. 3D. To estimate the sampling errors and to provide guidance on the simulation runtime needed to achieve sufficient convergence of the co-existing densities, we use block averages, dividing the trajectory into blocks of 1 μs each. Before we move on to the calculation of the coexistence densities based on the computed density profiles, we would like to make the reader aware of some issues that must be taken into account to use this simulation method properly.
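The centering and binning steps can be illustrated with the following sketch. Note that it substitutes a simpler technique for the cluster-based centering described above: the slab position in each frame is estimated from the circular mean of all bead z-coordinates, which is adequate only when most beads belong to a single dense slab. The function names, the data layout (one list of z-coordinates per frame), and the unit bead mass are assumptions made for illustration.

```python
import math


def slab_center(z_coords, box_z):
    """Periodic center of the beads along z, estimated with a circular mean."""
    angles = [2.0 * math.pi * z / box_z for z in z_coords]
    mean_sin = sum(math.sin(a) for a in angles) / len(angles)
    mean_cos = sum(math.cos(a) for a in angles) / len(angles)
    return box_z * math.atan2(mean_sin, mean_cos) / (2.0 * math.pi)


def center_frame(z_coords, box_z):
    """Shift one frame so the slab sits at z = 0, wrapping into [-box_z/2, box_z/2)."""
    center = slab_center(z_coords, box_z)
    return [(z - center + 0.5 * box_z) % box_z - 0.5 * box_z for z in z_coords]


def density_profile(frames, box_z, n_bins, area, bead_mass=1.0):
    """Average density along z over all frames, after centering each frame.

    Returns (bin centers, densities); densities are mass per unit volume in the
    units implied by bead_mass, area (box_x * box_y), and box_z.
    """
    width = box_z / n_bins
    totals = [0.0] * n_bins
    for z_coords in frames:
        for z in center_frame(z_coords, box_z):
            index = min(int((z + 0.5 * box_z) / width), n_bins - 1)
            totals[index] += bead_mass
    norm = len(frames) * area * width
    centers = [-0.5 * box_z + (i + 0.5) * width for i in range(n_bins)]
    return centers, [total / norm for total in totals]


if __name__ == "__main__":
    # Two toy frames of the same slab, drifted to different positions in a 100 Å box
    frame_a = [46.0, 47.0, 48.0, 49.0, -49.0, -48.0, -47.0, -46.0]
    frame_b = [-24.0, -23.0, -22.0, -21.0, -19.0, -18.0, -17.0, -16.0]
    centers, rho = density_profile([frame_a, frame_b], 100.0, n_bins=10, area=1.0)
    for z, value in zip(centers, rho):
        print(f"{z:6.1f}  {value:.3f}")
```

For production analysis, the cluster-based centering described in the text is more robust near the critical temperature, where a substantial fraction of chains resides in the dilute phase and biases a mean over all beads.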
The first issue is how to decide on the number of protein chains that can provide a trustworthy estimate of the coexistence densities. The most important indicators of a suitable system size are the ability to observe a flat density profile in the middle region corresponding to the high-density liquid phase and that the protein concentration in this phase is statistically independent of any further increase in the number of protein chains in the same size simulation box. In a previous study of the FUS low complexity (LC) disordered region (163 residues) with liquid phase densities between 300 and 600 mg/mL, we found that 100 FUS LC chains are sufficient to achieve reasonable accuracy and the results with 200 chains are almost identical (Fig. 4A,B) [25]. One should perform a similar analysis on a system of interest to decide on a suitable system-size.
Fig. 4.
Issues with obtaining density profiles from simulations. A) Density profiles by using 100 (blue) and 200 (red) chains of proteins. B) Phase diagram by using 100 and 200 chains of proteins. C) Density profiles highlighting that condensed slab configuration can be obtained from a wholly dispersed configuration after several microseconds of simulation time.
The second issue is deciding on the simulation run length to obtain a converged density profile. In our recent work, we used a simulation time of 5 μs because we stop seeing any significant deviations in the density profile when comparing 1 μs block averages over the full trajectory. The choice can be system-specific but careful attention should be paid to the evolution of the density profiles over time and if the information obtained from multiple independent trajectories or block averages over a long trajectory is within desirable statistical uncertainty values. We note that there is an inherent advantage in using the pre-formed slab configuration as opposed to waiting for fully dispersed chains in a long simulation box to form a desired planar condensed phase spontaneously. As shown in Fig. 4C starting from a dispersed configuration, the evolution of the density profile starts to look similar to the profile obtained from an initial slab configuration after about 3 μs of simulation time. This time can be saved by condensing a low-density configuration within tens of nanoseconds, as we discussed earlier in the chapter.
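The block-averaging check mentioned above can be sketched in a few lines. The example assumes that a per-frame estimate of a quantity of interest, such as the dense-phase density, is already available as a list in time order; the function names and the toy numbers are hypothetical.

```python
import math


def block_averages(series, n_blocks):
    """Split a time-ordered series into n_blocks equal blocks and return the block means.

    Trailing frames that do not fill a complete block are ignored.
    """
    block_len = len(series) // n_blocks
    return [sum(series[i * block_len:(i + 1) * block_len]) / block_len
            for i in range(n_blocks)]


def mean_and_standard_error(block_means):
    """Mean of the block means and its standard error, treating blocks as independent."""
    n = len(block_means)
    mean = sum(block_means) / n
    variance = sum((m - mean) ** 2 for m in block_means) / (n - 1)
    return mean, math.sqrt(variance / n)


if __name__ == "__main__":
    # Toy series standing in for the dense-phase density of a 5 μs run with 1 μs blocks
    series = [500.0, 505.0, 495.0, 502.0, 498.0] * 4
    blocks = block_averages(series, 5)
    print(blocks, mean_and_standard_error(blocks))
```

A systematic trend across consecutive block means, rather than scatter around a common value, is a sign that the density profile has not yet converged.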
For benchmarking purposes, we were able to simulate a 100-chain system with a 163 amino acid long protein chain, for 5 μs with a wallclock time of approximately 36 hours on a standard desktop computer using HOOMD-Blue (v.2.1.5) with a single Nvidia GTX1080 GPU.
3.3. Generating phase diagrams
Once we have the converged density profiles from the simulations at different temperatures, we can estimate the densities ρH (high-density liquid phase) and ρL (low-density dilute phase). In a well-sampled density profile from a converged simulation, we would see a flat peak at the center of the slab, i.e., at z=0 with either half of the density profile resembling a sigmoidal curve. And as shown in Fig. 4A, the density at the center z=0 would start decreasing, and densities away from the center would start increasing as we go to higher temperatures closer to the critical temperature. We can define the regions corresponding to the low and high-density phases along the z-axis and take averages over bins in the respective regions to get ρH and ρL. However, in practice, there can be sampling issues at specific temperatures, especially on the low-density side. Hence, the following sigmoidal function can also be used to fit the density profile [27]
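$$\rho(z) = \frac{\rho_H + \rho_L}{2} + \frac{\rho_L - \rho_H}{2} \tanh\left(\frac{|z| - z_0}{d}\right) \tag{4}$$
where z0 is the position of the interface and d is its width.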
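Both ways of extracting the coexistence densities can be illustrated with a short sketch. It assumes the commonly used hyperbolic-tangent form of the interfacial profile, with the slab centered at z = 0, an interface position z0, and an interface width d; the function names and the cutoffs that define the plateau regions are hypothetical choices that the user would adapt to the system at hand.

```python
import math


def tanh_profile(z, rho_h, rho_l, z0, d):
    """Hyperbolic-tangent density profile of a slab centered at z = 0."""
    return (0.5 * (rho_h + rho_l)
            + 0.5 * (rho_l - rho_h) * math.tanh((abs(z) - z0) / d))


def plateau_densities(centers, densities, dense_half_width, dilute_start):
    """Average the profile over the dense plateau (|z| < dense_half_width)
    and over the dilute region (|z| > dilute_start)."""
    dense = [rho for z, rho in zip(centers, densities) if abs(z) < dense_half_width]
    dilute = [rho for z, rho in zip(centers, densities) if abs(z) > dilute_start]
    return sum(dense) / len(dense), sum(dilute) / len(dilute)


if __name__ == "__main__":
    # Synthetic profile: 500 and 5 mg/mL plateaus, interface at 40 Å, width 5 Å
    centers = [-149.0 + 2.0 * i for i in range(150)]
    profile = [tanh_profile(z, 500.0, 5.0, 40.0, 5.0) for z in centers]
    print(plateau_densities(centers, profile, dense_half_width=15.0, dilute_start=80.0))
```

In practice, the four parameters of the profile would be obtained with a nonlinear least-squares routine applied to the measured profile, and the plateau averages provide reasonable starting values for such a fit.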
Once we have ρH and ρL at different temperatures, we can estimate the critical temperature using the following equation,
$$\rho_H - \rho_L = A\,(T_c - T)^{\beta} \tag{5}$$
where β is the critical exponent (β=0.325 for the universality class of the 3D Ising model [45]), A is a protein-specific fitting parameter, and Tc is the critical temperature (Fig. 5A). The critical concentration ρc can be further obtained by fitting to,
$$\frac{\rho_H + \rho_L}{2} = \rho_c + B\,(T_c - T) \tag{6}$$
where B is a fitting parameter. We note that eq. 5 is only asymptotically valid near the critical temperature, so this should be accounted for when selecting appropriate data for fitting [27]. The fit results should be relatively independent of the maximum temperature near the Tc to a reasonable degree, but this should be checked for a specific system of interest carefully [25]. Of course, additional simulations should be conducted if there is a large gap between the estimated Tc and the highest temperature simulated.
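The two fitting steps can be sketched without any external libraries by exploiting the fact that β is fixed. The sketch assumes the scaling form ρH − ρL = A(Tc − T)^β, which becomes linear in T after raising both sides to the power 1/β, and the law of rectilinear diameters in the form (ρH + ρL)/2 = ρc + B(Tc − T). The function names and the synthetic data are hypothetical, and the linearization is a swapped-in simplification: it weights the data points differently from a direct nonlinear least-squares fit, so the two approaches need not agree exactly on noisy data.

```python
BETA = 0.325  # critical exponent of the 3D Ising universality class


def linear_fit(x, y):
    """Ordinary least-squares line through (x, y); returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_critical_point(temps, rho_h, rho_l, beta=BETA):
    """Estimate (Tc, A, rho_c, B) from coexistence densities at several temperatures."""
    # (rho_h - rho_l)^(1/beta) = A^(1/beta) * (Tc - T) is a straight line in T
    y = [(high - low) ** (1.0 / beta) for high, low in zip(rho_h, rho_l)]
    slope, intercept = linear_fit(temps, y)
    t_c = -intercept / slope
    amplitude = (-slope) ** beta
    # Rectilinear diameters: (rho_h + rho_l)/2 = rho_c + B * (Tc - T)
    distance = [t_c - t for t in temps]
    midpoint = [0.5 * (high + low) for high, low in zip(rho_h, rho_l)]
    b, rho_c = linear_fit(distance, midpoint)
    return t_c, amplitude, rho_c, b


if __name__ == "__main__":
    # Synthetic, noise-free data generated from Tc = 320 K, A = 150, rho_c = 200, B = 1.5
    temps = [280.0, 290.0, 300.0, 310.0]
    diff = [150.0 * (320.0 - t) ** BETA for t in temps]
    mid = [200.0 + 1.5 * (320.0 - t) for t in temps]
    high = [m + 0.5 * d for m, d in zip(mid, diff)]
    low = [m - 0.5 * d for m, d in zip(mid, diff)]
    print(fit_critical_point(temps, high, low))
```

Repeating the fit while dropping the lowest temperatures one at a time is a simple way to check how sensitive the estimated Tc is to the temperature window used.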
Fig. 5.
Obtaining the phase diagrams from density profiles. A) Fitting the density profile to obtain the critical temperature. B) Drawing phase diagrams.
As it may be difficult to ascertain a suitable temperature range for a new system a priori, one can utilize the empirical correlation between Tc and the theta-solvent temperature (Tθ), i.e., the temperature at which a protein behaves like a random coil [25]. Tθ can be determined efficiently for a single-chain system using HOOMD-Blue or CPU-based codes such as LAMMPS [41], and the calculation can be further facilitated by replica exchange molecular dynamics. This should result in a significant saving of computational effort when searching for a suitable temperature range in which to conduct co-existence simulations.
With ρH and ρL from density profiles at a range of temperatures, and the critical temperature Tc and critical concentration ρc obtained from fitting to eqs. 5 and 6, we can plot a typical thermodynamic phase diagram with the densities on the x-axis and the temperature on the y-axis (Fig. 5B). The left arm of the phase diagram represents the protein concentration in the low-density phase, also referred to as the saturation concentration csat in the protein LLPS literature, whereas the right arm represents the protein concentration in the high-density liquid phase. The region inside these two curves and below Tc is the coexistence region; a protein solution prepared at a concentration within this region will undergo phase separation into two co-existing phases. The concentrations of these two phases can be estimated by drawing a horizontal tie line that connects the two sides of the phase envelope and passes through the initial concentration point. In fact, these fundamental features of a thermodynamic phase diagram were used for setting up the co-existence simulations and for the successful generation of phase diagrams. These phase diagrams provide a rigorous thermodynamic basis for comparing different variants of a protein sequence and for understanding how underlying changes due to mutations or additions/deletions of domains may affect their phase behavior. In experimental studies, salt concentration can be used instead of temperature as the parameter of interest for constructing the phase diagram. In our coarse-grained modeling framework, changing the salt concentration alters the electrostatic interactions by modifying the Debye length in eq. 1. One can also use our model to study the sequence-specific characteristics contributing to the phase separation propensity of a protein, including changes arising from point mutations [15,17].
Summary.
In this chapter, we introduce the reader to a practical and efficient way of computing the phase behavior of an intrinsically disordered protein using our coarse-grained (CG) but sequence-specific protein model together with a co-existence method in a molecular dynamics simulation. We also highlight some of the nuances of our methodology that the reader needs to be cautious about when putting this method into practice. The computer codes and scripts needed for implementation are provided and work with HOOMD-Blue, a publicly available software package that uses graphics processing units (GPUs) to achieve significant computational speed in such calculations. Our CG modeling strategy is a flexible and extensible framework that is, most importantly, transferable without requiring any additional experimental input on a system of interest. It has already been applied to numerous important biological systems to provide critical molecular insights into sequence-dependent LLPS behavior. We hope that the information contained in this chapter will lower the barrier for others to integrate our modeling framework into their workflow as an efficient way of estimating the phase behavior of IDPs in systems of their choice, and will help move the field forward. The Python scripts and the instructions for using them can be found at the following website: https://bitbucket.org/jeetain/hoomd_slab_builder.
Acknowledgment
Our work summarized in this book chapter was supported by the US Department of Energy, Office of Science, Basic Energy Sciences Award DE-SC00013979, the National Institutes of Health grants R01GM118530, R01NS116176, and R01GM120537, and the National Science Foundation (NSF) grants 2004796 and 1720530. W.Z. acknowledges the support from NSF MCB-2015030. This research used resources of the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported under Contract No. DE-AC02-05CH11231. Use of the high-performance computing capabilities of the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by the National Science Foundation, project no. TG-MCB120014, is also gratefully acknowledged.
References
- [1].Brangwynne CP, Eckmann CR, Courson DS, Rybarska A, Hoege C, Gharakhani J, Jülicher F, Hyman AA, Julicher F, Gharakhani J, Eckmann CR, Hoege C, Rybarska A, Hyman AA, Germline P Granules Are Liquid Droplets That Localize by Controlled Dissolution/Condensation, Science (80-. ) 324 (2009) 1729–1732. 10.1126/science.1172046. [DOI] [PubMed] [Google Scholar]
- [2].Brangwynne CP, Mitchison TJ, Hyman AA, Active liquid-like behavior of nucleoli determines their size and shape in Xenopus laevis oocytes, Proc. Natl. Acad. Sci 108 (2011) 4334–4339. 10.1073/pnas.1017150108. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [3].Elbaum-Garfinkle S, Kim Y, Szczepaniak K, Chen CC-H, Eckmann CR, Myong S, Brangwynne CP, The disordered P granule protein LAF-1 drives phase separation into droplets with tunable viscosity and dynamics, Proc. Natl. Acad. Sci 112 (2015) 7189–7194. 10.1073/pnas.1504822112. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [4].Burke KA, Janke AM, Rhine CL, Fawzi NL, Residue-by-residue view of in vitro FUS granules that bind the C-terminal domain of RNA polymerase II, Mol. Cell 60 (2015) 231–241. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [5].Brangwynne CP, Tompa P, V Pappu R, Polymer physics of intracellular phase transitions, Nat. Phys 11 (2015) 899. [Google Scholar]
- [6].Tompa P, Intrinsically disordered proteins: A 10-year recap, Trends Biochem. Sci 37 (2012) 509–516. 10.1016/j.tibs.2012.08.004. [DOI] [PubMed] [Google Scholar]
- [7].Wright PE, Dyson HJ, Intrinsically unstructured proteins: re-assessing the protein structure-function paradigm, J. Mol. Biol 293 (1999) 321–331. 10.1006/jmbi.1999.3110. [DOI] [PubMed] [Google Scholar]
- [8].Fisher CK, Stultz CM, Constructing ensembles for intrinsically disordered proteins, Curr. Opin. Struct. Biol 21 (2011) 426–431. 10.1016/j.sbi.2011.04.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [9].Dignon GL, Zheng W, Kim YC, Best RB, Mittal J, Sequence determinants of protein phase behavior from a coarse-grained model, PLoS Comput. Biol 14 (2018) e1005941 10.1101/238170. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [10].Vitalis A, V Pappu R, ABSINTH: A new continuum solvent model for simulations of polypeptides in aqueous solutions, J. Comput. Chem 30 (2008) 673–699. [DOI] [PMC free article] [PubMed] [Google Scholar]
- [11].Lin Y-H, Forman-Kay JD, Chan HS, Sequence-specific polyampholyte phase separation in membraneless organelles, Phys. Rev. Lett 117 (2016) 178101.
- [12].Best RB, Zheng W, Mittal J, Balanced protein-water interactions improve properties of disordered proteins and non-specific protein association, J. Chem. Theor. Comput 10 (2014) 5113–5124.
- [13].Piana S, Donchev AG, Robustelli P, Shaw DE, Water dispersion interactions strongly influence simulated structural properties of disordered protein states, J. Phys. Chem. B 119 (2015) 5113–5123.
- [14].Huang J, Rauscher S, Nawrocki G, Ran T, Feig M, de Groot BL, Grubmüller H, MacKerell AD Jr, CHARMM36m: an improved force field for folded and intrinsically disordered proteins, Nat Methods. 14 (2017) 71.
- [15].Murthy AC, Dignon GL, Kan Y, Zerze GH, Parekh SH, Mittal J, Fawzi NL, Molecular interactions underlying liquid–liquid phase separation of the FUS low-complexity domain, Nat. Struct. Mol. Biol 26 (2019) 637–648. 10.1038/s41594-019-0250-x.
- [16].Schuster BS, Dignon GL, Tang WS, Kelley FM, Ranganath AK, Jahnke CN, Simpkins AG, Regy RM, Hammer DA, Good MC, Mittal J, Identifying Sequence Perturbations to an Intrinsically Disordered Protein that Determine Its Phase Separation Behavior, BioRxiv. (2020) 894576 10.1101/2020.01.06.894576.
- [17].Conicella AE, Dignon GL, Zerze GH, Schmidt HB, D’Ordine AM, Kim YC, Rohatgi R, Ayala YM, Mittal J, Fawzi NL, TDP-43 α-helical structure tunes liquid–liquid phase separation and function, Proc. Natl. Acad. Sci. U. S. A 117 (2020) 5883–5894. 10.1073/pnas.1912055117.
- [18].Conicella AE, Zerze GH, Mittal J, Fawzi NL, ALS mutations disrupt phase separation mediated by α-helical structure in the TDP-43 low-complexity C-terminal domain, Structure. 24 (2016) 1537–1549. 10.1016/j.str.2016.07.007.
- [19].Dignon GL, Zheng W, Mittal J, Simulation methods for liquid–liquid phase separation of disordered proteins, Curr. Opin. Chem. Eng 23 (2019) 92–98. 10.1016/j.coche.2019.03.004.
- [20].Flory PJ, Principles of Polymer Chemistry, Cornell University Press, 1953.
- [21].de Gennes P-G, Scaling Concepts in Polymer Physics, Cornell University Press, 1978.
- [22].Zheng W, Zerze GH, Borgia A, Mittal J, Schuler B, Best RB, Inferring properties of disordered chains from FRET transfer efficiencies, J. Chem. Phys (2018). 10.1063/1.5006954.
- [23].Zheng W, Dignon G, Brown M, Kim YC, Mittal J, Hydropathy patterning complements charge patterning to describe conformational preferences of disordered proteins, BioRxiv. (2020) 2020.01.25.919498. 10.1101/2020.01.25.919498.
- [24].Sawle L, Ghosh K, A theoretical method to compute sequence dependent configurational properties in charged polymers and proteins, J. Chem. Phys 143 (2015). 10.1063/1.4929391.
- [25].Dignon GL, Zheng W, Best RB, Kim YC, Mittal J, Relation between single-molecule properties and phase behavior of intrinsically disordered proteins, Proc. Natl. Acad. Sci 115 (2018) 9929–9934. 10.1073/pnas.1804177115.
- [26].Kim J, Keyes T, Straub JE, Generalized replica exchange method, J. Chem. Phys 132 (2010) 224107.
- [27].Blas FJ, MacDowell LG, de Miguel E, Jackson G, Vapor-liquid interfacial properties of fully flexible Lennard-Jones chains, J. Chem. Phys 129 (2008) 144703 10.1063/1.2989115.
- [28].Anderson JA, Glaser J, Glotzer SC, HOOMD-blue: A Python package for high-performance molecular dynamics and hard particle Monte Carlo simulations, Comput. Mater. Sci 173 (2020) 109363 10.1016/j.commatsci.2019.109363.
- [29].Creighton TE, Proteins: structures and molecular properties, Macmillan, 1993.
- [30].Kapcha LH, Rossky PJ, A simple atomic-level hydrophobicity scale reveals protein interfacial structure, J. Mol. Biol 426 (2014) 484–498. 10.1016/j.jmb.2013.09.039.
- [31].Debye P, Hückel E, De la theorie des electrolytes. I. abaissement du point de congelation et phenomenes associes, Phys. Zeitschrift 24 (1923) 185–206.
- [32].Ashbaugh HS, Hatch HW, Natively unfolded protein stability as a coil-to-globule transition in charge/hydropathy space, J. Am. Chem. Soc 130 (2008) 9536–9542.
- [33].Clementi C, Coarse-grained models of protein folding: toy models or predictive tools?, Curr. Opin. Struct. Biol 18 (2008) 10–15. 10.1016/j.sbi.2007.10.005.
- [34].Ayton GS, Noid WG, Voth GA, Multiscale modeling of biomolecular systems: in serial and in parallel, Curr. Opin. Struct. Biol 17 (2007) 192–198.
- [35].Monahan Z, Ryan VH, Janke AM, Burke KA, Zerze GH, O’Meally R, Dignon GL, Conicella AE, Zheng W, Best RB, Cole RN, Mittal J, Shewmaker F, Phosphorylation of FUS low-complexity domain disrupts phase separation, aggregation, and toxicity, EMBO J. (2017) e201696394.
- [36].Latham AP, Zhang B, Improving Coarse-Grained Protein Force Fields with Small-Angle X-ray Scattering Data, J. Phys. Chem. B 123 (2019) 1026–1034. 10.1021/acs.jpcb.8b10336.
- [37].Dignon GL, Zheng W, Kim YC, Mittal J, Temperature-Controlled Liquid-Liquid Phase Separation of Disordered Proteins, ACS Cent. Sci 5 (2019) 821–830. 10.1021/acscentsci.9b00102.
- [38].Potoff JJ, Panagiotopoulos AZ, Critical point and phase behavior of the pure fluid and a Lennard-Jones mixture, J. Chem. Phys 109 (1998) 10914–10920. 10.1063/1.477787.
- [39].Panagiotopoulos AZ, Monte Carlo methods for phase equilibria of fluids, J. Phys. Condens. Matter 12 (2000) R25 10.1088/0953-8984/12/3/201.
- [40].Silmore KS, Howard MP, Panagiotopoulos AZ, Vapour–liquid phase equilibrium and surface tension of fully flexible Lennard–Jones chains, Mol. Phys 115 (2017) 320–327.
- [41].Plimpton S, Crozier P, Thompson A, LAMMPS-large-scale atomic/molecular massively parallel simulator, Sandia Natl. Lab 18 (2007) 43.
- [42].Martyna GJ, Tobias DJ, Klein ML, Constant pressure molecular dynamics algorithms, J. Chem. Phys 101 (1994) 4177–4189. 10.1063/1.467468.
- [43].Quiroz FG, Chilkoti A, Sequence heuristics to encode phase behaviour in intrinsically disordered protein polymers, Nat. Mater 14 (2015) 1164–1171. 10.1038/nmat4418.
- [44].Jung H, Yethiraj A, A simulation method for the phase diagram of complex fluid mixtures, J. Chem. Phys 148 (2018). 10.1063/1.5033958.
- [45].Rowlinson JS, Widom B, Molecular theory of capillarity, Courier Corporation, 2013.





