Author manuscript; available in PMC: 2022 Jan 1.
Published in final edited form as: Methods Enzymol. 2020 Aug 4;646:185–222. doi: 10.1016/bs.mie.2020.07.002

Small angle x-ray scattering experiments of monodisperse intrinsically disordered protein samples close to the solubility limit

Erik W Martin a, Jesse B Hopkins b, Tanja Mittag a,1
PMCID: PMC8370720  NIHMSID: NIHMS1730894  PMID: 33453925

Abstract

The condensation of biomolecules into biomolecular condensates via liquid-liquid phase separation (LLPS) is a ubiquitous mechanism that drives cellular organization. To enable these functions, biomolecules have evolved to drive LLPS and facilitate partitioning into biomolecular condensates. Determining the molecular features of proteins that encode LLPS will provide critical insights into a plethora of biological processes. Problematically, probing biomolecular dense phases directly is often technologically difficult or impossible. By capitalizing on the symmetry between the conformational behavior of biomolecules in dilute solution and dense phases, it is possible to infer details critical to phase separation by precise measurements of the dilute phase thus circumventing complicated characterization of dense phases. The symmetry between dilute and dense phases is found in the size and shape of the conformational ensemble of a biomolecule – parameters that small-angle x-ray scattering (SAXS) is ideally suited to probe. Recent technological advances have made it possible to accurately characterize samples of intrinsically disordered protein regions at low enough concentration to avoid interference from intermolecular attraction, oligomerization or aggregation, all of which were previously roadblocks to characterizing self-assembling proteins. Herein, we describe the pitfalls inherent to measuring such samples, the details required for circumventing these issues and analysis methods that place the results of SAXS measurements into the theoretical framework of LLPS.

Keywords: SAXS, intrinsically disordered protein, IDP, IDR, LLPS, phase separation, self-association, size-exclusion chromatography, coflow

1. Introduction

Intrinsically disordered protein regions (IDRs) are a class of proteins which do not adopt stable secondary or tertiary structure (Oldfield & Dunker, 2014; Tompa, 2012; van der Lee et al., 2014). This lack of structure can be thought of as arising from the flattening of the conformational energy landscape which results in a large ensemble of interconverting conformations with similar free energies. Thus, the characterization of the structure of IDRs must account for the ensemble nature of IDRs (Mittag & Forman-Kay, 2007). Many traditional structural techniques fail to provide information on IDRs; those that do, including NMR spectroscopy and single molecule fluorescence, provide largely ensemble-average or time-average observables.

Important measures of the structural features of IDRs are parameters that report on their global dimensions – quantities that small-angle x-ray scattering (SAXS) is ideally suited to measuring. Assuming a homogeneous, dilute solution, SAXS directly probes the spatial distribution of atoms in the measurement volume thus providing direct access to the size and shape of the ensemble of conformations. The utility of SAXS in probing the ensemble properties of IDRs is well established and has been reviewed several times (Bernado & Svergun, 2012; Cordeiro et al., 2017; Kachala, Valentini, & Svergun, 2015; Rambo & Tainer, 2011).

Recent developments in cell biology have drawn attention to the physical process of liquid-liquid phase separation (LLPS) as a relevant driver of non-stoichiometric molecular assembly and cellular compartmentalization (Banani, Lee, Hyman, & Rosen, 2017; Boeynaems et al., 2018; Shin & Brangwynne, 2017). IDRs were suspected early on as being able to provide the driving force for biomolecular phase separation (Kato et al., 2012; Molliex et al., 2015), perhaps due to their qualitative similarity to well characterized aqueous multi-phase systems involving synthetic polymers (e.g., polyethylene glycol and dextran). Indeed, disordered regions have been shown to be sufficient for driving the phase transitions of many proteins (Conicella, Zerze, Mittal, & Fawzi, 2016; Kato et al., 2012; Molliex et al., 2015; Nott et al., 2015; Patel et al., 2015). Determining which elements within IDRs provide the adhesive interactions that drive phase transitions has spurred increased interest in the conformational behavior of IDRs (Brady et al., 2017; Burke, Janke, Rhine, & Fawzi, 2015; Martin et al., 2020; Ryan et al., 2018).

Polymer phase transitions occur if the sum of three attractive potentials, i.e. polymer-polymer, solvent-solvent and polymer-solvent potentials, favors the solvation of polymer molecules by themselves rather than by solvent and is able to overcome the favorable entropy of mixing. This is captured in the Flory-Huggins mean field theory describing phase separating polymers (Flory, 1942; Huggins, 1942). If a polymer is of sufficient length (often termed the infinite chain limit), Flory-Huggins theory predicts that the emergent properties of a solution of many molecules are recapitulated internally in a single polymer; the affinity between monomers in the same polymer is identical to that across polymers (Figure 1A). While IDRs are heteropolymers and of finite length, homopolymer theory often describes their behavior well, suggesting that heterogeneous, long-range interactions within IDRs are sufficiently dynamic that they may cancel on the ensemble level (Hofmann et al., 2012; Martin et al., 2016; Schuler, Soranno, Hofmann, & Nettels, 2016). Current research indeed supports the prediction from homopolymer theory that the dimensions of IDRs are predictive of their phase behavior (Dignon, Zheng, Best, Kim, & Mittal, 2018; Lin & Chan, 2017; Martin et al., 2020). Therefore, the details of molecular interactions that can be inferred from precise measurements of dilute IDR samples provide insight into the properties of dense IDR phases. For example, perturbations to the chemical nature of an IDR either via mutation/permutation of the amino acid sequence or via posttranslational modification (phosphorylation, methylation, etc.) impact the single chain protein properties and can be measured by SAXS; these changes are reflective of changes in the shape of the coexistence curve, or binodal, which quantifies the free energy of concentrated protein solutions. As a consequence of this symmetry, quantitative measurements informative of phase behavior can be made in technologically accessible dilute regimes. In the context of SAXS, this means that IDR shape information can be measured without convolution with interparticle interactions inherent in semi-dilute or concentrated solutions.

Figure 1:

Single chain IDR dimensions report on emergent properties. (A) Conditions that lead to dilute chain compaction similarly drive phase separation, while conditions that result in expansion maintain disperse protein. (B) The resolution of a SAXS experiment. The parameter D represents the resolution or distances within the protein that are primarily contributing to the SAXS curve at a defined point. The fractal dimension, d, is defined as the negative of the slope of log(I(q)) vs log(q) at the higher angles.

2. Resolution of structural information from SAXS

The minimum distance accessible to a SAXS experiment (Figure 1B) is related to the momentum transfer vector, q.

$$q = \frac{4\pi\sin(\theta)}{\lambda}, \tag{1}$$

where λ is the x-ray wavelength (1–2 Å at synchrotron sources) and 2θ is the scattering angle. The resolution increases with increasing angle such that its length scale is $D = 2\pi/q$. The typical SAXS experiment provides a resolution range of 0.005 Å⁻¹ < q < 0.5 Å⁻¹. Therefore, the highest obtainable resolution is ~10 Å. However, this cutoff is not strictly analogous to the resolution in high-resolution structural techniques such as x-ray crystallography or cryo-electron microscopy (Tuukkanen, Kleywegt, & Svergun, 2016). While it is true that SAXS does not provide details at the amino acid level, parameters such as the radius of gyration (RG) can be calculated at precision far exceeding this resolution (Svergun & Feǐgin, 1986). This apparent dichotomy is rooted in the fact that SAXS is a ‘low information’ technique where only a limited number of parameters can be extracted precisely (Koch, Vachette, & Svergun, 2003).
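As a quick numerical check of Eq. 1 and of the length scale quoted above, the following minimal sketch converts between scattering angle, q and D. The function names and the example wavelength (about 1.03 Å, corresponding to 12 keV photons) are illustrative choices, not part of any beamline software.

```python
import math

def momentum_transfer(two_theta_deg: float, wavelength_A: float) -> float:
    """Eq. 1: q = 4*pi*sin(theta)/lambda, where 2*theta is the scattering angle."""
    theta = math.radians(two_theta_deg) / 2.0
    return 4.0 * math.pi * math.sin(theta) / wavelength_A

def real_space_length(q: float) -> float:
    """Length scale D = 2*pi/q probed at momentum transfer q (in 1/Angstrom)."""
    return 2.0 * math.pi / q

# Illustrative wavelength only (12 keV photons are roughly 1.03 Angstrom).
wavelength = 1.033
print(f"q at 2theta = 1 deg: {momentum_transfer(1.0, wavelength):.4f} 1/A")

# Limits of the typical q range quoted in the text.
for q in (0.005, 0.5):
    print(f"q = {q} 1/A  ->  D = {real_space_length(q):.1f} A")
```

The upper limit q = 0.5 Å⁻¹ gives D of roughly 12.6 Å, consistent with the ~10 Å figure given above.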

In addition to the available q range, the resolution of protein SAXS data is limited by the scattering intensity, which is related to the electron scattering length density (SLD) difference between the protein and the solvent. The SLD quantifies the scattering power of an object. X-rays interact with electrons and thus the SLD for SAXS is related to electron density. Generally, the SLD of a particular atom scales with the number of electrons and therefore with atomic number (Z). However, due to absorption, the relationship between Z and SLD is not completely monotonic – a property exploited in anomalous x-ray scattering experiments. Due to the presence of only atoms with low Z, protein samples suffer from low scattering contrast that becomes vanishingly small at higher angles. This is particularly problematic in IDRs where the electron density is spread over a wider distribution of distances. Therefore, measurements at wider angles provide little additional information.

The scattering intensity as a function of the scattering angle, I(q), from completely disordered IDRs is typically characterized by two distinct resolution regimes. At the smallest angles, the scattering intensity approaches a finite value, I0, that, in the absence of correlations between molecules, is dependent only on concentration and molecular mass. At wider angles, the scattering is approximately a power law, $I(q) \propto q^{-d}$. The exponent d is the fractal dimension which quantifies self-similarity and is a measure of how the mass increases relative to volume (where $\text{mass} \propto \text{distance}^{d}$). For folded proteins, the mass roughly scales directly with volume and d approaches 3. However, inhomogeneities in electron density caused by the non-random atom distribution in folded domain structure result in departures from pure power law behavior – a feature exploited to model protein shape. In contrast, the diverse ensembles of conformations that are characteristic of IDRs average out many inhomogeneities and the scattering intensity at higher angles is often well characterized by $I(q) \propto q^{-d}$.

The fractal dimension, d, thus reports on the ensemble averaged density in the sample. At the extremes, d ≈ 3 for a collapsed globule, similar to folded proteins, and for a fully extended rod, d = 1. For a homopolymer near the infinite length limit, values of d are bounded between ~1.6 and 2 which represent the limits of a self-avoiding random walk and a Gaussian coil, respectively (Rubinstein & Colby, 2003). The self-avoiding limit applies to a polymer with a net repulsion between monomers whereas the Gaussian coil limit applies at the specific condition, designated the ‘theta state’, where repulsive and attractive forces cancel. IDRs most often have a fractal dimension that falls within this window (Bernado & Blackledge, 2009; Riback et al., 2017). Due to the finite length of IDRs, regions of local or long-range attraction or repulsion could drive the conformational ensemble outside of the theoretical limits of a homopolymer toward either a disordered globule (Hofmann et al., 2012; Martin et al., 2020) or rod-like shape (Muller-Spath et al., 2010). For example, short polyelectrolytes are more extended than a self-avoiding polymer and have values of d less than 1.6. Values of d between 2–3 could suggest partial folding or collapse into a molten globule state (Ptitsyn, 1995; Uversky, 2002). In short, the fractal dimension reports on the packing of the IDR and hence is diagnostic of the shape of the ensemble. The crossover between fractal regime into the small-angle regime is defined by a correlation length past which the IDR no longer exhibits fractal-like behavior. This characteristic length is directly related to the RG of the IDR (Figure 1B) (Hammouda, 1993).
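To make the power-law regime concrete, the sketch below estimates d as the negative slope of a straight-line fit to log(I) versus log(q). It is an illustration on synthetic, noise-free data; the function name and the chosen q window are assumptions, and real data would additionally require error weighting and a justified choice of fitting range.

```python
import math

def fit_fractal_dimension(q, intensity):
    """Unweighted least-squares slope of log(I) vs log(q); returns d = -slope."""
    x = [math.log(v) for v in q]
    y = [math.log(v) for v in intensity]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return -sxy / sxx

# Synthetic power law with d = 1.7 (illustrative values, not measured data).
q = [0.10 + 0.01 * i for i in range(16)]      # 0.10 to 0.25 1/A
intensity = [5.0 * v ** -1.7 for v in q]
d = fit_fractal_dimension(q, intensity)
print(f"fitted fractal dimension d = {d:.3f}")
```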

Both d and RG are properties determined by the complete distribution of atomic distances in the sample and thus ensemble averaged quantities that report on the entire accessible conformational space of the IDR. Assuming that the sample is monodisperse and sufficiently dilute, RG can be determined precisely and reflects the mean distance of all protein atoms from the center of mass (Koch et al., 2003). Determination of the fractal dimension, d, can be less straightforward. If the IDR truly behaves like a homopolymer in the sense that the amino acid positions can be represented by a smooth distribution (i.e., all heterogeneity is averaged out), the fractal dimension can be accurately obtained and should be coupled to RG via the scaling relation $R_G \sim N^{1/d}$, where N is the number of amino acids in the IDR. The inverse of the fractal dimension is the Flory scaling exponent (Flory, 1953; Rubinstein & Colby, 2003), $\nu = 1/d$. Strictly speaking, the scaling relation applies to a homopolymer of infinite length – a condition not met by finite length IDRs. Therefore, the measured value of ν can be thought of as an “apparent scaling exponent” comparing the IDR to a similar homopolymer and allowing values outside the theoretical limits.
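As a worked illustration of the scaling relation (a sketch that assumes the prefactor is identical for the two chains being compared, which is not guaranteed for real IDRs), the expected change in RG on doubling the chain length follows directly from the apparent exponent:

```latex
\frac{R_G(2N)}{R_G(N)} = 2^{\nu} = 2^{1/d}, \qquad
2^{1/3} \approx 1.26 \ (d = 3), \quad
2^{1/2} \approx 1.41 \ (d = 2,\ \text{theta state}), \quad
2^{0.6} \approx 1.52 \ (d \approx 1.7,\ \text{self-avoiding}).
```

The spread between these ratios is modest, which is one reason precise RG values are needed to distinguish the regimes.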

If the distribution of monomers in the IDR deviates from statistical randomness as a result of long-range correlations between protein regions, which can arise due to strong binary interactions between distant regions, RG and ν are decoupled (Banks, Qin, Weiss, Stanley, & Zhou, 2018; Fuertes et al., 2018; Riback et al., 2019). The case of an IDR that is linked to a folded domain is an extreme example of long-range correlations and could have a scattering profile with additional correlation lengths at short distances that reflect the atomic distance distribution in the folded domain. In short, due to the limited resolution of SAXS experiments, the parameters RG and ν contain the majority of the information content for samples of completely disordered IDRs. The goal of a SAXS experiment is to accurately extract these parameters to obtain insight into the size and shape of the conformational ensemble and quantify deviations from ideal, random behavior.

3. Complications of SAXS measurements of self-assembling proteins

SAXS measurements on soluble biomolecules are straightforward and often done rapidly and effectively in high-throughput facilities (Classen et al., 2013). However, the very features that allow some IDRs to phase separate limit our ability to make high-quality SAXS measurements. The primary issues fall into two categories. (1) Some phase-separating proteins form soluble oligomers via oligomerization domains which enhance phase separation (Mitrea et al., 2018; Powers et al., 2019; Wang et al., 2018); they may also form off-pathway aggregates (Conicella, Zerze, Mittal, & Fawzi, 2016; Molliex et al., 2015; Patel et al., 2015; Schmidt, Barreau, & Rohatgi, 2019). The SAXS curve of a heterogeneous ensemble is the mass average, not the number average, of components, and small populations of oligomers or aggregates contribute disproportionately to the scattering. Additionally, the contribution of large species is particularly high at the small angles which are required to precisely determine RG. While SAXS can be used to monitor both oligomerization (Williamson, Craig, Kondrashkina, Bailey-Kellogg, & Friedman, 2008) and aggregation (Herranz-Trillo et al., 2017), both processes severely impair analysis of the ensemble of monomers. (2) The saturation concentration (csat), i.e. the concentration above which the protein forms a dense phase, determines a concentration limit above which SAXS data on monomeric samples is not accessible. However, the effective limit is often lower because the attractive interaction potential between proteins that mediates phase separation can lead to an upturn in intensity at small angles even below csat. This feature is indicative of interference between molecules arising from intermolecular distances that are short enough to lie within the SAXS resolution. These features can be modeled by a function called the structure factor which decays to unity at high angles and is discussed further in section 7. Measurements at these low concentrations limit the signal-to-noise of IDR SAXS data with low scattering contrast. These issues are so pervasive that SAXS measurements have historically favored soluble proteins. This point is well illustrated by the fact that the overwhelming majority of IDRs measured by SAXS are more expanded than the polymer theta state (i.e., ν > 0.5), which implies self-avoidance (Bernado & Blackledge, 2009; Cordeiro et al., 2017). These observations have fostered the view that IDRs are generally highly soluble and expanded (Riback et al., 2017). However, phase separating IDRs – under conditions that promote phase separation – have been reported to have ν < 0.5 (Martin et al., 2020), highlighting the ability of adhesive elements in the chain to act inter- as well as intramolecularly.
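The disproportionate weight of large species noted in point (1) can be illustrated with a short calculation. The sketch below assumes that all species share the same scattering contrast, so that the forward scattering of each scales with mass concentration times molar mass, and that the Guinier RG of a mixture is the correspondingly weighted average of the squared radii. The sample composition is hypothetical and chosen only to show the scale of the effect.

```python
import math

def scattering_weights(species):
    """Fractional contribution of each species to the forward scattering I0.

    species: list of (mass_concentration, molar_mass, rg) tuples.
    Assumes equal contrast, so that I0_i is proportional to c_i * M_i.
    """
    weights = [c * m for c, m, _ in species]
    total = sum(weights)
    return [w / total for w in weights]

def apparent_rg(species):
    """Guinier Rg of a mixture: Rg_app^2 = sum_i w_i * Rg_i^2."""
    weights = scattering_weights(species)
    return math.sqrt(sum(w * rg ** 2 for w, (_, _, rg) in zip(weights, species)))

# Hypothetical sample: 99% monomer by mass plus 1% of a 100-mer aggregate.
sample = [
    (0.99, 15.0, 25.0),     # monomer: 15 kDa, Rg = 25 A
    (0.01, 1500.0, 100.0),  # aggregate: 1500 kDa, Rg = 100 A
]
print("I0 fractions:", scattering_weights(sample))
print("apparent Rg (A):", apparent_rg(sample))
```

In this hypothetical case, an aggregate that makes up 1% of the protein mass accounts for about half of the forward scattering, and the apparent RG is far larger than the monomer value.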

In order to obtain high-quality data on challenging IDRs characterized by limited solubility, contrast and high aggregation potential, we recommend size-exclusion chromatography (SEC)-coupled measurements at high-flux synchrotron radiation sources. SEC-coupled SAXS (SEC-SAXS) eliminates small aggregates and provides flexibility in sample buffer conditions, superior baseline subtraction and rigorous analysis of interparticle interference which we will discuss below.

4. IDR sample measurement in SEC-SAXS mode

A critical consideration when developing a SAXS experiment on an associative IDR is to map the conditions under which the IDR is soluble, forms aggregates and phase separates. These conditions will set the practical limits for the experiment. In the ideal SEC-SAXS setup, the SEC column will be plumbed to minimize the dead volume between the elution from the column and the x-ray scattering measurement. This minimizes the possibility of sample aggregation or phase separation allowing measurements close to the solubility limit. To have the highest obtainable concentration at the x-ray beam, samples need to be loaded onto the SEC column several fold higher in concentration to account for the on-column dilution factor. A practical way to approach this is to load the samples onto the column in a buffer in which the protein is soluble to high concentrations (Riback et al., 2017). For IDRs, this buffer can contain a denaturant such as guanidinium hydrochloride because there is no risk of perturbing important structural features in the protein in an irreversible manner. Alternatively, buffers that include high (or very low) salt concentrations can be effective in maintaining solubility. The experiment then relies on the buffer-exchanging ability of SEC to place the sample in the correct solution conditions at the x-ray beam.

As a practical consideration, the associative properties of phase-separating proteins can slow their passage through a SEC column. The solubilizing additives in the sample loading buffer can ‘push’ the protein through the column. This will manifest as one protein peak eluting where expected based on the protein’s hydrodynamic radius, followed by a second peak that contains more protein as well as the additives. For this reason, it is useful if the loading buffer has a significantly different conductance than the running buffer to easily distinguish elution frames containing the sample in the desired buffer.

SEC columns are selected such that the resin sufficiently separates the protein from small molecules. SAXS beamlines differ in their preference for silica- or dextran-based SEC resins. Superdex Increase resins from GE Healthcare have been used successfully at the BioCAT beamline at the Advanced Photon Source at Argonne National Lab. An important consideration in column selection is the maximum flow rate. Radiation damage is mitigated by the sample flow and, in a traditional system, the minimum flow rate will be determined by the rate of x-ray damage to the sample – faster flow results in shorter exposure of any given sample volume resulting in less damage. The coflow systems implemented at the Australian National Synchrotron and BioCAT, which will be discussed in detail later, allow for the use of smaller columns and slower flow. A GE Healthcare Superdex Increase 5/150 column with 3 mL volume, which has a maximum flow rate of 0.45 mL/min, is typically sufficient to avoid radiation damage. The minimum allowable flow rate is beamline dependent. While it is impossible to exceed the solubility limit of a particular protein, longer columns allow for larger volumes to be injected; they spread the elution over a larger volume and therefore more frames that can be averaged. The experimental conditions thus represent a tradeoff that has to take into account protein solubility, flux, column flow rate and sample availability.

The SAXS data is collected as individual exposures; their length is a balance between flow rate, the volume in which the sample elutes and the x-ray flux. With a 3 mL SEC column eluting at 0.4 mL/min into a coflow sample chamber, 0.5 second exposures are a good compromise. The resulting data is a series of detector images that can be analyzed in a similar fashion to all SAXS data with freely available software in packages such as BioXTAS RAW (Hopkins, Gillilan, & Skou, 2017) and CHROMIXS in ATSAS (Panjkovich & Svergun, 2018).
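The balance between flow rate and exposure length can be made explicit with simple arithmetic. The sketch below uses the flow rate and exposure time quoted above; the 300 μL peak width is a hypothetical value for illustration, and any dead time between exposures is ignored.

```python
def volume_per_exposure_uL(flow_mL_per_min: float, exposure_s: float) -> float:
    """Eluted volume (in microliters) that passes the beam during one exposure."""
    return flow_mL_per_min * 1000.0 / 60.0 * exposure_s

def frames_across_peak(peak_volume_uL: float, flow_mL_per_min: float,
                       exposure_s: float) -> float:
    """Approximate number of exposures collected while a peak elutes."""
    return peak_volume_uL / volume_per_exposure_uL(flow_mL_per_min, exposure_s)

# Flow rate and exposure time from the text; peak width is an assumed example.
per_frame = volume_per_exposure_uL(0.4, 0.5)
n_frames = frames_across_peak(300.0, 0.4, 0.5)
print(f"{per_frame:.2f} uL per exposure, about {n_frames:.0f} frames across the peak")
```

Under these assumptions each exposure samples about 3.3 μL of eluate, so even a narrow peak is spread over many frames that can be inspected individually before averaging.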

The obvious benefit of SEC-SAXS is the elimination of oligomers and aggregates (Figure 2). Equally important for the analysis of adhesive proteins is an ideal baseline subtraction (Figure 2) and a built-in continuous concentration gradient (Figure 3). SEC-SAXS provides the best possible baseline subtraction via the selection of buffer frames that are as close to the protein elution peak as possible. As a result, the composition of the buffer is identical to that in the sample. While this is often a minor consideration when dealing with concentrated samples with high relative signal to background, the quality of the buffer match becomes critically important as the sample concentration decreases. The continuous concentration gradient that is intrinsic in the elution peak from a SEC column (Figure 3) enables the critical examination of the data for interparticle interference. Interparticle interference will appear as a dome when the RG is plotted as a function of elution frame (Figure 3). If associative IDRs oligomerize in a concentration dependent manner on the column, the RG plot will have a negative slope, because larger species elute earlier from SEC columns (Figure 3). If the association happens with a lag time but is faster than the deadtime between SEC elution and data collection, the RG plot will also appear as a dome because self-association is strongest at the highest protein concentrations in the middle of the elution peak (Figure 3). Even in the presence of such effects, regions of the SEC elution with the lowest concentration may be selectively averaged to ensure any of these artifacts are eliminated.
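The three RG-versus-frame signatures described above (flat, negative slope, dome) can be distinguished with a very simple heuristic, sketched below. This is not how BioXTAS RAW or CHROMIXS evaluate a series; the function name, the three-segment comparison and the 0.5 Å tolerance are all hypothetical choices, and in practice the tolerance should reflect the uncertainty of the per-frame RG values.

```python
def classify_rg_trend(rg_by_frame, tolerance=0.5):
    """Classify Rg across an elution peak by comparing early, middle and late thirds.

    rg_by_frame: Rg values (Angstrom) ordered by elution frame.
    tolerance:   minimum difference (Angstrom) treated as significant (assumed value).
    """
    n = len(rg_by_frame)
    if n < 6:
        raise ValueError("need at least 6 frames to compare segments")
    third = n // 3

    def mean(values):
        return sum(values) / len(values)

    early = mean(rg_by_frame[:third])
    middle = mean(rg_by_frame[third:n - third])
    late = mean(rg_by_frame[n - third:])

    if middle - early > tolerance and middle - late > tolerance:
        return "dome"            # concentration-dependent effect peaking mid-elution
    if early - late > tolerance:
        return "negative slope"  # larger species eluting first
    if late - early > tolerance:
        return "other"           # a pattern not described in the text
    return "flat"                # no detectable concentration dependence

print(classify_rg_trend([25.0, 25.1, 24.9, 25.0, 25.0, 25.1, 24.9, 25.0, 25.0]))
```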

Figure 2:

SEC-SAXS enables the separation of monomeric IDR from unwanted oligomers and aggregates. The baseline can be chosen from statistically similar frames, where no part of the curve deviates from expected random noise, near the sample elution. The desired sample data is averaged from a region where the RG is not concentration dependent.

Figure 3:

The RG as a function of elution time is diagnostic of intermolecular interactions. The elution from the SEC column is monitored by UV absorption. The magnitude of the absorption is directly related to the protein concentration. The RG is calculated from a rolling average along the protein elution. If the RG is flat across all elution concentrations, there are no inter-monomer interactions and the sample is monodisperse. A negative slope indicates that larger species are separated by the column and are eluting first. A domed elution profile indicates that the protein self-associates in a concentration dependent manner after elution from the column.

5. Primary data analysis

The goal of the SAXS experiment is to extract the ensemble-average RG of the IDR and the shape of the ensemble of conformations. The primary analysis of data on adhesive IDRs proceeds through a series of relatively simple transformations of the scattering profile (Figure 4) which inform on the size and shape.

Figure 4:

Primary SAXS data transformations. (A) Raw SAXS data on IDRs presented in log-log format highlights the low q data, power law regimes and correlation length. Binned data is shown as circles. (B) The Guinier transformation of low q (qRG < 1) data is used to calculate RG. Lower plot shows the residuals. (C) The Kratky transformation enables visual inspection of the fractal dimension as indicated by the slope of the data at high q relative to Gaussian chain and globule references. On a dimensionless Kratky plot, the intersection of red lines indicates the location of the maximum for a globule. Deviation of the maximum from this point indicates flexibility.

5.1. Guinier analysis

The easiest way to obtain size information from a SAXS profile is a linear fit to the Guinier transform of small angle scattering data. The Guinier equation results from a first order expansion of the Debye scattering equation in the limit of very small angles.

$$I(q) = I_0 \exp\left(-\frac{q^2 R_G^2}{3}\right). \tag{2}$$

I0, the zero-angle scattering, is determined by protein concentration and molecular weight. In a Guinier plot, data is plotted as the natural logarithm of the intensity as a function of q² (Figure 4B). Thus, the resulting linear slope is proportional to the square of RG. Because the Guinier equation discards all higher-order terms in the expansion, the equation is only valid at very small angles, which for IDRs is taken to be approximately $q < 1/R_G$ (Svergun, Feǐgin, & Taylor, 1987). Guinier analysis provides a good first approximation of the RG and information on sample quality. The pattern of the residuals in a linear fit of ln(I) versus q² (Figure 4B) can indicate aggregation or interparticle interference, both of which manifest as an upturn at the smallest angles. While these data points can be excluded, the effects impact the whole scattering profile. When analyzing low-concentration data on IDRs, Guinier analysis can be of limited utility. The noise combined with the limited utilizable q range can result in a high uncertainty of the RG, and these issues are magnified if there is the slightest concern that the sample is not monodisperse.
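The Guinier procedure described above can be sketched in a few lines: fit ln(I) against q², convert the slope to RG, and repeat until the fitted window satisfies the q·RG limit. The following is a minimal, unweighted illustration on synthetic noise-free data, not the algorithm used by any particular software package; the function names, the starting window of 10 points and the iteration cap are assumptions, and q is assumed to be sorted in ascending order.

```python
import math

def linear_fit(x, y):
    """Unweighted least-squares line; returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    return slope, mean_y - slope * mean_x

def guinier_fit(q, intensity, qrg_max=1.0, n_start=10, max_iter=50):
    """Iterative Guinier fit of ln(I) vs q^2 restricted to q*Rg <= qrg_max.

    Returns (Rg, I0). Assumes q is sorted in ascending order.
    """
    n_points = min(n_start, len(q))
    for _ in range(max_iter):
        slope, intercept = linear_fit(
            [v * v for v in q[:n_points]],
            [math.log(i) for i in intensity[:n_points]],
        )
        if slope >= 0:
            raise ValueError("non-negative Guinier slope; inspect the low-q data")
        rg = math.sqrt(-3.0 * slope)   # Eq. 2: slope = -Rg^2 / 3
        new_n = max(3, sum(1 for v in q if v * rg <= qrg_max))
        if new_n == n_points:
            break                      # fitting window is self-consistent
        n_points = new_n
    return rg, math.exp(intercept)

# Synthetic Guinier-regime data with Rg = 25 A and I0 = 10 (illustrative only).
q = [0.005 + 0.002 * i for i in range(100)]
intensity = [10.0 * math.exp(-(v * 25.0) ** 2 / 3.0) for v in q]
rg, i0 = guinier_fit(q, intensity)
print(f"Rg = {rg:.2f} A, I0 = {i0:.2f}")
```

With noisy low-concentration data the window may not converge cleanly, which is one practical expression of the high uncertainty discussed above.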

5.2. Kratky transformation

The Kratky transformation scales the intensity by q² (Figure 4C). In folded proteins, a Kratky plot can inform on the flexibility of the system. The Kratky plots of well-folded proteins have bell curves with well-defined maxima and converge to zero at higher q (Figure 4C). In contrast, Kratky plots of unfolded proteins appear hyperbolic and serve as a visual indicator of d. An IDR at the theta point, with d = 2, reaches a plateau that is flat until the end of the fractal region of the curve at high angles (a limit which is rarely experimentally accessible and not discussed here). For self-avoiding IDRs with d < 2, the Kratky plot at high q has a positive slope. For self-interacting IDRs with d > 2, the slope is negative (Figure 4C). The Kratky plot can be normalized and made dimensionless by scaling the intensity by the zero-angle scattering and the scattering angle by RG. By normalizing the intensity by I0, samples of different concentrations or molecular weights can be compared. Multiplying q by RG normalizes the distance resolution by the protein radius. In the context of a Kratky plot, these normalizations allow for direct comparison of d, and therefore the scaling exponent ν, across samples (Durand et al., 2010).
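The dimensionless Kratky transformation and its two reference shapes can be reproduced with model curves. The sketch below uses the Guinier form as a stand-in for a compact globule (an approximation that is only meaningful near the maximum) and the Debye function for a Gaussian chain; the function names and the RG value are illustrative assumptions.

```python
import math

def dimensionless_kratky(q, intensity, rg, i0):
    """Return x = q*Rg and y = (q*Rg)^2 * I(q)/I0 for each data point."""
    x = [qi * rg for qi in q]
    y = [xi ** 2 * ii / i0 for xi, ii in zip(x, intensity)]
    return x, y

def guinier_globule(q, rg, i0=1.0):
    """Guinier approximation, used here as a simple compact-particle reference."""
    return [i0 * math.exp(-(qi * rg) ** 2 / 3.0) for qi in q]

def debye_gaussian_chain(q, rg, i0=1.0):
    """Debye function for a Gaussian chain (d = 2)."""
    out = []
    for qi in q:
        u = (qi * rg) ** 2
        out.append(i0 * 2.0 * (math.exp(-u) + u - 1.0) / u ** 2)
    return out

rg = 25.0                                   # illustrative value
q = [0.0004 * i for i in range(1, 501)]     # q*Rg from 0.01 to 5
x, y_globule = dimensionless_kratky(q, guinier_globule(q, rg), rg, 1.0)
_, y_chain = dimensionless_kratky(q, debye_gaussian_chain(q, rg), rg, 1.0)

peak = max(range(len(x)), key=lambda i: y_globule[i])
print(f"globule maximum near q*Rg = {x[peak]:.2f}, height = {y_globule[peak]:.3f}")
print(f"Gaussian chain at q*Rg = 5: {y_chain[-1]:.2f} (approaches a plateau of 2)")
```

The maximum of the compact reference falls at q·RG = √3 with a height of 3/e, which is the crossing point conventionally marked on dimensionless Kratky plots, while the Gaussian chain rises toward a plateau.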

5.3. Comments on indirect Fourier transform

In x-ray crystallography (XRC), molecules are present in regular orientation in the crystal lattice and the resulting diffraction pattern represents the position of individual atoms in inverse space. With knowledge of phases, Fourier transform of this data yields the position of atoms in 3D space. Similarly, the SAXS pattern is also determined by the distribution of atoms in space. However, unlike in XRC, the particles in a SAXS sample are typically randomly distributed resulting in spherical averaging of the particle electron density. In the context of IDRs and highly flexible systems, the data represents an additional average over the ensemble of conformations. The quantity derived from a SAXS experiment that is analogous to 3D coordinates in XRC is the atomic pair distribution function, P(r). The SAXS intensity can be directly calculated from the Fourier transform of P(r).

$$I(q) = 4\pi \int_0^{\infty} P(r)\,\frac{\sin(qr)}{qr}\,dr \tag{3}$$
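The forward direction of Eq. 3 is straightforward to evaluate numerically, as the sketch below illustrates with trapezoidal integration. The test case is the analytical P(r) of a uniform sphere, chosen only because its RG is known exactly (RG² = 3R²/5); the function names and grid spacing are assumptions. The sketch also uses the standard relation between RG and the second moment of P(r).

```python
import math

def trapezoid(y, x):
    """Trapezoidal integral of samples y over grid x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1))

def intensity_from_pr(q, r, pr):
    """Eq. 3 with the integral truncated at max(r), where P(r) has decayed to zero."""
    def sinc(t):
        return 1.0 if t == 0.0 else math.sin(t) / t
    return 4.0 * math.pi * trapezoid([p * sinc(q * ri) for p, ri in zip(pr, r)], r)

def rg_from_pr(r, pr):
    """Rg^2 = integral(r^2 P(r) dr) / (2 * integral(P(r) dr))."""
    second_moment = trapezoid([p * ri * ri for p, ri in zip(pr, r)], r)
    return math.sqrt(second_moment / (2.0 * trapezoid(pr, r)))

# Analytical test case: uniform sphere of radius R, P(r) nonzero for 0 <= r <= 2R.
R = 20.0
r = [2.0 * R * i / 2000 for i in range(2001)]
pr = [ri ** 2 * (1.0 - 0.75 * ri / R + ri ** 3 / (16.0 * R ** 3)) for ri in r]

rg = rg_from_pr(r, pr)
print(f"Rg from P(r): {rg:.3f} A; expected sqrt(3/5)*R = {math.sqrt(0.6) * R:.3f} A")
print(f"I(q = 0.05) / I(0) = {intensity_from_pr(0.05, r, pr) / intensity_from_pr(0.0, r, pr):.4f}")
```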

In principle, recovering the P(r) distribution from the SAXS intensities should be possible by the inverse Fourier transform. In practice, experimental limitations in measurable scattering angles would require extrapolation to zero angle and wide angles. To circumvent this issue, use of the indirect Fourier transform (IFT), in which the data is represented by a series of basis functions, was proposed by Otto Glatter (Glatter, 1977).

Similar methods are implemented in most SAXS data analysis software and generally are effective in determining parameters such as the RG and the shape within the distance resolution of the measurement. This information is used for ab initio shape determination for well-folded systems (Svergun, 1999). However, IFT is problematic for SAXS data analysis of highly flexible, disordered systems. The issue is inherent to finding the solution to the IFT which is an ill-posed inverse problem. The fit therefore relies heavily on regularization to ensure that the solution is smooth. Further, the IFT requires the probability density to converge to zero at zero distance and at a fixed maximum dimension (Dmax). Finding an appropriate value of Dmax for proteins with well-defined shape can be accomplished by sampling values around an estimated solution while monitoring the quality of fit. In the case of IDRs, the Dmax is poorly defined. Conformations with a Dmax that approaches the contour length of the IDR may exist but with vanishingly small populations, a problem that is magnified by the weaker contrast for very extended conformations. Given that the solution for RG from IFT depends on the choice of Dmax, the RG from IFT for IDRs can, at best, have a large error and, at worst, be inaccurate.

The P(r) distribution has some utility in analyzing SAXS data on phase-separating proteins. The shape of the P(r) distribution is likely similar at intermediate distances for all possible solutions and could thus be used to assess changes in shape. If phase separation in a particular protein is initiated by folding or changes in relative domain orientations, the shape of the P(r) distribution will be diagnostic of these changes. However, it is important to keep in mind that the information content of the P(r) distribution is limited by the resolution of the experiment. For small IDRs, high signal beyond q = 0.25–0.3 Å⁻¹ is rare. Changes in P(r) that suggest shape changes at intermediate distances may thus be caused by artifacts from low signal or suspect baseline subtraction. The same problem is equally likely for larger proteins if higher-angle data that suffer from low signal or baseline subtraction artifacts are included in the fit. IFT is often advertised as a more precise method for obtaining radii compared to Guinier analysis because it uses scattering data from the full q range, but for IDRs, the problems resulting from IFT often outweigh the benefits.

6. Synchrotron SAXS beamline hardware

Making high quality measurements of IDRs with low inherent contrast and concentration mandates attention to maximizing signal-to-noise. This requires the use of high-flux synchrotron radiation sources that are specifically designed for biological SAXS measurements.

The Advanced Photon Source (APS) at Argonne National Lab is a high brilliance, third generation synchrotron radiation source ideally suited for measurement of demanding biological samples. The Biophysical Collaborative Access Team (BioCAT) beamline at Sector 18 is dedicated to biological SAXS, both of fibers and muscles (fiber diffraction) and of solutions, and has optimized x-ray optics and detection hardware. The BioCAT beamline is a good model for discussing hardware and will thus be the template for this section. Similar beamlines at synchrotrons around the world will be discussed at the end of this section. The overall layout and primary x-ray optics of the beamline are well described (Fischetti et al., 2004). In brief, BioCAT uses both vertical and horizontal focusing optics to run monochromatic SAXS experiments at 12 keV with ~2 × 10¹³ ph/s in the full 30 × 140 μm² focused beam. Collimation for solution SAXS experiments results in an available experimental flux of ~6 × 10¹² ph/s. A Pilatus3 X 1M detector (Dectris) is used to measure the resulting scattering.

The signal-to-noise in SAXS measurements can be improved in three ways via the beamline hardware: improving x-ray detection, reducing background scattering, and increasing x-ray flux. X-ray detector technology is already mostly optimized. In the last ~10 years, x-ray pixel array detectors (PADs) (Barna et al., 1995) have become the most common detector type for SAXS. These are single photon counting detectors with zero readout noise, high dynamic range, and detection efficiencies near 100% for x-ray energies in the range of 10–12 keV. Therefore, the noise level is only limited by the shot noise (Poisson noise) inherent in a counting experiment. The previous and current generation of PADs, most commonly the Pilatus and Eiger detectors made by Dectris (Kraft et al., 2009), provide large detection areas (up to ~300 × 300 mm²), fast readouts (up to 2000 Hz), and minimal (~1 ms) or no deadtime between images. While improvements are possible for certain applications, these detectors are already ideal for SAXS. Beamlines typically have the largest area detector they can afford to maximize the percentage of scattered photons captured. Charge-coupled device (CCD)-based detectors may still be used at some beamlines but are non-ideal for SAXS due to generally slow readouts, non-zero readout noise, and lower detection efficiencies.
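As a rough illustration of what a shot-noise-limited measurement implies, the following sketch computes the fractional Poisson uncertainty, sqrt(N)/N, for a photon count N. The counts are hypothetical values chosen only to show the scaling and do not describe any particular beamline or detector.

```python
import math

def relative_shot_noise(counts):
    """Fractional Poisson uncertainty, sqrt(N)/N, of a photon count N."""
    if counts <= 0:
        raise ValueError("counts must be positive")
    return 1.0 / math.sqrt(counts)

# Hypothetical photon counts in a single q-bin, for illustration only.
low_flux_counts = 1.0e4
high_flux_counts = 1.0e6   # 100-fold more detected photons

noise_low = relative_shot_noise(low_flux_counts)
noise_high = relative_shot_noise(high_flux_counts)
improvement = noise_low / noise_high

print(noise_low, noise_high, improvement)
```

Under this idealization, a 100-fold increase in detected photons improves the relative noise only 10-fold, which is why gains in usable flux and reductions in background are both valuable for weakly scattering samples.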

SAXS measures x-rays coming from three sources: the sample, the solution, and the SAXS instrument itself. Minimizing the number of x-rays from the instrument can significantly improve the signal-to-noise of a measurement. Instrumental x-ray scattering, also called background or parasitic scattering, is the dominant component of measured x-rays from the instrument, though there may also be a contribution from x-ray fluorescence. Instrumental scattering comes from anything in the beam path, including x-ray optics, x-ray windows, beam monitoring devices, and most significantly air. All SAXS beamlines use at least a partial vacuum environment, which removes air scattering from the measurement. Some SAXS beamlines use in-air sample cells, which increases flexibility for doing experiments with different types of sample holders at the cost of adding scattering from two x-ray windows and a small amount of air. BioCAT and other SAXS beamlines run experiments in a full vacuum environment, including using an in-vacuum sample cell. This removes as much scattering as possible from the x-ray flight path.

Some things are required to be in the beam path, for example x-ray optical elements used to clean and shape the beam before measuring the sample. The scattering from these items is generally divergent from the incident x-ray beam. By placing these items far away from the sample and using one or more sets of slits or pinholes carefully positioned in the beam path, the excess scattering from these items can be almost entirely removed from the measurement. The exact type and layout of these slits or pinholes varies from beamline to beamline, depending on available space. BioCAT uses three sets of single crystal bladed scatterless x-ray slits with both horizontal and vertical blades (Xenocs and JJ X-ray) to collimate the beam and remove excess instrumental scattering. This, combined with minimizing the number of scattering sources by using an in-vacuum sample cell, keeps the intensity of instrumental scattering ~100-fold lower than the solution/sample scattering.

Signal-to-noise levels for a measurement can also be increased by increasing the x-ray flux on the sample. The maximum flux available at a beamline is limited by the x-ray source and beamline design and cannot be easily improved without expensive and intrusive upgrades. However, an additional, practical limitation on flux is defined by the susceptibility of the sample to radiation damage (Hopkins & Thorne, 2016; Jeffries, Graewert, Svergun, & Blanchet, 2015). Thus, strategies which minimize radiation damage without compromising flux are crucial to maximizing signal-to-noise. Historically, strategies for reducing radiation damage have involved a combination of continuously flowing the sample through the x-ray beam to spread the incident flux over more sample and the addition of buffer components that act as radical scavengers. While these methods are partially effective, the majority of high-flux beamlines like BioCAT still require attenuation of the incident beam to prevent damage.

Recently, a novel sample cell design, called a coflow cell (Figure 5A,B), was proposed to mitigate issues surrounding radiation damage (Kirby et al., 2016). The coflow cell provides an exterior sheath of buffer ‘coflowing’ with a central core of sample solution in a laminar flow regime. This flow geometry both minimizes the likelihood of damage to the sample and prevents the spurious background scattering that can occur when damaged sample adheres to the cell. Using this design, Kirby et al. were able to use an order of magnitude more x-ray flux without damage to the sample, and were able to take full advantage of the high flux beamline (Kirby et al., 2016). Versions of the coflow cell are used at the SAXS/WAXS beamline at the Australian Synchrotron and at BioCAT and allow the beamlines to use all available flux without causing radiation damage, which increases signal-to-noise. The increase in flux coupled with the decrease in noise allows for measurement at concentrations as low as 0.4 mg/mL (prior to injection onto an SEC column) for a 12 kDa IDR (Figure 5C). This advance has been critical to acquiring interpretable SAXS data on IDRs with low csat.

Figure 5:

The coflow sample chamber enables maximizing signal-to-noise while minimizing radiation damage. (A) A schematic view of the coflow sample chamber. The top of the chamber contains buffer inlet and an outlet to remove excess buffer. The sample is injected into the middle of the capillary where a constant flow is maintained by the total flow outlet pump. The result is the sample ‘coflowing’ with the buffer in the laminar regime and not mixing. (B) The coflow sample chamber installed at the BioCAT beamline (18ID-D) at the Advanced Photon Source. (C) Low q data shown in Guinier format indicating that identical RGs can be calculated from 0.05 mL samples injected onto a 3 mL SEC column at concentrations ranging from 13.5 to 0.4 mg/mL (12.5 kDa protein).

In the next ~3 years, the APS will be undergoing an upgrade (APS-U), which will increase flux at the BioCAT beamline. BioCAT will also be adding a new multilayer monochromator, which will further increase available flux. While not all of this flux may be useable for SAXS experiments, any portion that is will further improve signal-to-noise.

There are a number of x-ray beamlines around the world with dedicated biological solution SAXS setups similar to that at BioCAT. While details often differ, depending on the specifics of the x-ray source and layout, all beamlines with a dedicated biological SAXS program provide broadly similar resources. These include in-line SEC-SAXS, and increasingly SEC-MALS-SAXS, batch mode SAXS measurements, and often automated sample changing and sample cell cleaning. All beamlines also provide PAD detectors, usually Dectris Pilatus or Eiger detectors with detecting areas equal to or greater than the Pilatus 1M.

Despite the similarities, different beamlines have different specialties. For example, the SIBYLS beamline at the Advanced Light Source (Classen et al., 2013) is highly specialized for high-throughput batch-mode SAXS measurements, using a commercial pipetting robot with custom built sample cells to measure a full 96 well plate every few hours. However, a tradeoff to the focus on high throughput is that the experiments must be done in an air gap, which reduces the signal-to-noise of the measurement. Some BioSAXS beamlines are limited by the source, such as BM29 at the ESRF (Pernot et al., 2013) which is on a bending magnet, limiting the total flux at the beamline to around 10¹² ph/s. Additionally, the adoption of the coflow technology has been somewhat slow, and only two beamlines worldwide currently use the coflow cell (though others are either in the process of or are planning to implement it), which limits the number of high flux beamlines that can fully utilize all available photons. Taken together, there are only a few beamlines particularly suitable for experiments attempting to measure extremely low signal-to-noise systems.

In addition to BioCAT, the other optimal beamline for these measurements is the SAXS/WAXS beamline at the Australian Synchrotron where the coflow system was designed (T. M. Ryan et al., 2018). It has a full beam flux similar to that at BioCAT (~5 × 10¹² ph/s), uses an in-vacuum sample cell, and incorporates numerous other improvements to minimize excess scattering from beamline components. The P12 beamline at Petra III (Blanchet et al., 2015) and the SWING beamline at Synchrotron SOLEIL (David & Perez, 2009) both provide high flux (up to ~10¹³ and 5 × 10¹² ph/s respectively), and a fully optimized system with an in-vacuum sample cell and minimal parasitic scattering. However, neither beamline uses a coflow cell, which limits the useable flux and thus the signal-to-noise achievable. Most other SAXS beamlines around the world lack either the flux or the coflow cell, while several also do not use an in-vacuum sample cell.

7. Model dependent analysis

Phase separation of synthetic polymers which are analogous to IDRs has been theoretically modeled for decades. Homopolymer theories either treat the interaction between monomers with a mean field approach (i.e. Flory-Huggins theory) (Flory, 1942; Huggins, 1942) or explicitly consider the networking of monomer units (i.e. Flory-Stockmayer theory) (Flory, 1941; Stockmayer, 1944). In either case, the phase-separating system is considered to be composed of identical monomers distributed in space based on connectivity and entropy. Percolation and networking models can be extended to heterogeneous ‘stickers-and-spacers’ systems (Choi, Holehouse, & Pappu, 2020) wherein the adhesive elements within the polymer – Flory-Stockmayer monomers – are the ‘stickers’ and they are linked by ‘spacers’ which are similar to non-interacting monomers with positive excluded volume in Flory-Huggins theory. Deducing the adhesive elements in IDRs via careful SAXS experiments can parameterize an IDR system such that it can be treated with an appropriate polymer model with the level of complexity relevant to the system. This section will discuss analytic form factors that can model x-ray scattering from polymer systems. The degree of complexity required to model the single chain system hints at the complexity needed to model phase separation.

When polymer theories are applied to IDRs, it is assumed that IDRs can be treated similarly to ideal polymers or with an additional level of complexity. In this context, it can be useful to interpret SAXS data from IDRs through the lens of polymer models which define the probability density of monomers in the polymer chain based on their covalent connections and non-covalent contacts. Different types of polymer chain models result in different form factors and therefore ensemble size distributions. The form factor P(q) is related to the scattering intensity normalized by I0:

$$I(q) = I_0\,P(q)\,S(q) \tag{4}$$

This form is apparent in equation 2, where the first order expansion of the Debye scattering equation is an estimate of the form factor at small angles. It is important to note that the structure factor S(q) should always equal 1 in dilute solutions. In the presence of interparticle interference, S(q) can take on values below 1 (indicating repulsion) and above 1 (indicating attraction) at small angles; it converges to 1 at wide angles. Procedures to model S(q) are well established for a number of colloidal systems (Lindner & Zemb, 2002) but are beyond the scope of this chapter. If these features appear in dilute phase data, it is best to further dilute the sample and only analyze low concentration fractions.

Given the nature of SAXS resolution, all variations of analytical form factors based on the Debye scattering equation (shown in equation 2) should converge on the same RG at very small angles. The angle, relative to the IDR RG, up to which the form factor fits the data varies between models. Understanding the assumptions inherent in a model and determining systematic deviations from it can provide valuable information about the nature of the IDR versus an ideal homopolymer.

7.1. Gaussian Chain

The simplest polymer form factor describes a random chain in which the monomers obey Gaussian statistics, i.e. the orientation of each monomer is not correlated with any other monomer. The Gaussian chain model results in a relationship between the end-to-end distance Ree, RG and number of residues N.

$$R_{ee}^2 = N b^2 \tag{5}$$
$$R_G^2 = \frac{R_{ee}^2}{6} \tag{6}$$

This relationship leads to the Flory scaling exponent ν = 0.5 (Flory, 1953). The Kuhn length, b, is the renormalized monomer length that is required so that the IDR can follow Gaussian statistics. Classically, this model would be constructed such that an IDR with N residues is renormalized to N’ residues each with a length b such that the model holds. The Kuhn length would then be a function of features of the IDR that result in departure from Gaussian statistics such as steric clashes between sidechains and restrictions to bond rotation. Experimentally, and conveniently, the Kuhn length for IDRs is often ~1 residue (Kohn et al., 2004; Wilkins et al., 1999). The Gaussian chain model is physically unrealistic in that it allows multiple monomers to occupy the same space. Although this aspect of the model is unrealistic and does not allow the extraction of ensembles of conformations, if the form factor fits the data, the real IDR ensemble will be statistically identical to the unrealistic model. Gaussian chain statistics have the following form factor:

$$P(q) = \frac{2}{q^4 R_G^4}\left[e^{-q^2 R_G^2} - 1 + q^2 R_G^2\right]. \tag{7}$$

This form factor was derived by Peter Debye and is a special case of the Debye scattering function where the explicit sum over all atom pairs is replaced by an integral over the probability density (or pair distribution function) of interatom distances in a Gaussian chain. The Gaussian chain form factor has the same limiting behavior as the Debye scattering equation at small angles:

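$$P(q) \approx \frac{2}{q^4 R_G^4}\left[-1 + q^2 R_G^2 + 1 - q^2 R_G^2 + \frac{(q^2 R_G^2)^2}{2} - \frac{(q^2 R_G^2)^3}{6} + \cdots\right] \approx 1 - \frac{q^2 R_G^2}{3} \tag{8}$$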

At wide angles, the equations can be simplified to:

$$P(q) \approx \frac{2}{q^2 R_G^2} \sim q^{-2} \tag{9}$$

The fractal dimension of the Gaussian chain, d = 2, is characteristic of the theta state of a polymer where attractive interactions between monomers exactly balance with excluded volume, i.e. repulsive interactions. The theta state can also be thought of as a tipping point above which the polymer is well solvated and below which attractive interactions between monomers are dominant and the polymer collapses. This collapse below the theta state is analogous to phase separation in polymer dense solutions. A consequence of this connection is that an IDR, under conditions where it will phase separate, will likely be near the theta state and therefore a simple Gaussian chain model may be a good fit to experimental data (Figure 6A,B). The RG is the only free parameter determining the shape of the Gaussian chain form factor (equation 7). Given that the small angles are sufficient to determine the RG (equation 8), the Gaussian chain form factor should fit a wide range of data up to q = 1/RG. How far experimental data deviates from the model in the crossover regime and the fractal scaling regime is diagnostic of how similar the conformational ensemble of the IDR is to that of the theta state of a homopolymer (Figure 6A,B).
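The Gaussian chain form factor and its two limits can be checked numerically with a few lines of code. The sketch below evaluates equation 7 directly and compares it with the small-angle limit of equation 8 and the wide-angle limit of equation 9. The function name, the RG of 25 Å and the q values are hypothetical choices made for illustration and are not taken from the data in this chapter.

```python
import math

def gaussian_chain_ff(q, rg):
    """Gaussian chain (Debye) form factor of equation 7, normalized to P(0) = 1."""
    x = (q * rg) ** 2
    if x < 1e-3:
        # Series expansion 1 - x/3 + x^2/12 avoids round-off in exp(-x) - 1 + x.
        return 1.0 - x / 3.0 + x * x / 12.0
    return 2.0 * (math.exp(-x) - 1.0 + x) / (x * x)

rg = 25.0  # hypothetical radius of gyration in Angstrom

# Small angles: P(q) approaches the Guinier-like limit 1 - q^2 RG^2 / 3.
q_small = 0.004
x_small = (q_small * rg) ** 2
p_small = gaussian_chain_ff(q_small, rg)
guinier_limit = 1.0 - x_small / 3.0

# Wide angles: P(q) approaches 2 / (q^2 RG^2), i.e. a q^-2 power law.
q_wide = 0.4
x_wide = (q_wide * rg) ** 2
p_wide = gaussian_chain_ff(q_wide, rg)
power_law_limit = 2.0 / x_wide

print(p_small, guinier_limit)
print(p_wide, power_law_limit)
```

In a fit to experimental data, RG would be the single adjustable shape parameter, with I0 as a scale factor, consistent with the discussion above.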

Figure 6:

Model-dependent fitting of SAXS data on a phase-separating IDR. Fits using the Gaussian Chain, Swollen Gaussian Chain and the Empirical Molecular FF to experimental data on a phase-separating IDR are shown in (A) raw and (B) Kratky format. The Swollen Gaussian Chain and Empirical Molecular FF provide good fits to the data and near equal RG. However, the scaling exponents differ significantly.

7.2. Swollen Gaussian Chain

Real polymer chains cannot intersect with themselves, which results in self-avoidance and can be quantified by a parameter called “excluded volume”. Excluded volume is added as an exponent to the Gaussian chain statistics from equations 5 and 6:

$$R_{ee}^2 = N^{2\nu} b^2 \tag{10}$$
$$R_G^2 = b^2\left[\frac{N^{2\nu}}{(2\nu+1)(2\nu+2)}\right] \tag{11}$$

where ν is the Flory scaling exponent discussed in section 2, which is generally between 1/3 for collapsed polymers and globules and 3/5 for self-avoiding polymers. The scaling exponent of 1/2 represents the theta state and restores the Gaussian statistics in the previous section – a special case where the excluded volume is effectively zero. Unlike for the Gaussian chain, there is no analytic solution to the form factor that includes self-avoidance for any value of ν other than 1/2. The solution was originally put into integral form by Henri Benoit in 1957 and was more recently put in a more tractable ‘semi analytic form’ by Boualem Hammouda (Hammouda, 1993) using a combination of incomplete gamma functions.

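$$P(q) = \frac{1}{\nu U^{1/2\nu}}\left[\gamma\!\left(\frac{1}{2\nu}, U\right) - \frac{1}{U^{1/2\nu}}\,\gamma\!\left(\frac{1}{\nu}, U\right)\right] \tag{12}$$
$$U = \frac{q^2 R_G^2 (2\nu+1)(2\nu+2)}{6} \tag{13}$$
$$\gamma(x, U) = \int_0^U e^{-t}\, t^{x-1}\, dt \tag{14}$$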

This form factor can be useful for obtaining RG because it is valid to wider angles than the Guinier approximation or the fit to a Gaussian chain form factor due to the variable parameter ν (Figure 6A,B). The exponent ν itself is a useful parameter for the characterization of IDR size scaling and can even hint at a driving force for phase separation in some proteins (Martin et al., 2020). However, the absolute accuracy of the parameter ν extracted by a fit to this form factor has been called into question (Figure 6B) (Riback et al., 2017). Specifically, the swollen Gaussian chain form factor can fail to reproduce parameters of limiting cases such as self-avoiding walks generated by molecular simulations. Given these limitations, it is important to be cautious when interpreting fitted values. Differences in values in response to changes in condition or IDR sequence are likely meaningful, while the absolute values with respect to theoretical limiting cases should be viewed with skepticism.
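A minimal numerical sketch of the swollen Gaussian chain form factor follows. It assumes the Hammouda form, in which the second incomplete gamma function takes the argument 1/ν, and evaluates the lower incomplete gamma function with its power series so that only the standard library is needed. The function names, the monomer length b and the chain length are hypothetical, and the series is intended only for moderate values of U. It is not the implementation used by any particular SAXS package.

```python
import math

def lower_gamma(s, x, tol=1e-14, max_terms=10000):
    """Lower incomplete gamma function, integral of exp(-t) t^(s-1) from 0 to x.

    Uses gamma(s, x) = x^s exp(-x) sum_k x^k / (s (s+1) ... (s+k)),
    which is adequate for moderate x (roughly x < 500 in double precision).
    """
    if x <= 0.0:
        return 0.0
    term = 1.0 / s
    total = term
    for k in range(1, max_terms):
        term *= x / (s + k)
        total += term
        if term < tol * total:
            break
    return (x ** s) * math.exp(-x) * total

def swollen_chain_ff(q, rg, nu):
    """Swollen Gaussian chain form factor, normalized to P(0) = 1."""
    u = (q * rg) ** 2 * (2.0 * nu + 1.0) * (2.0 * nu + 2.0) / 6.0
    if u < 1e-12:
        return 1.0
    a = 1.0 / (2.0 * nu)
    first = lower_gamma(a, u) / u ** a
    second = lower_gamma(2.0 * a, u) / u ** (2.0 * a)
    return (first - second) / nu

def rg_from_scaling(n_res, nu, b):
    """Radius of gyration from the scaling relation of equation 11."""
    return b * math.sqrt(n_res ** (2.0 * nu) / ((2.0 * nu + 1.0) * (2.0 * nu + 2.0)))

# Hypothetical chain: 150 residues with an assumed monomer length b in Angstrom.
n_res, b = 150, 5.5
rg_theta = rg_from_scaling(n_res, 0.5, b)   # reduces to sqrt(N b^2 / 6)
rg_good = rg_from_scaling(n_res, 0.6, b)    # expanded chain, larger RG

q = 0.05  # 1/Angstrom
p_theta = swollen_chain_ff(q, rg_theta, 0.5)
p_good = swollen_chain_ff(q, rg_theta, 0.6)
print(rg_theta, rg_good, p_theta, p_good)
```

Setting ν = 1/2 in this sketch recovers the Gaussian chain form factor of equation 7, which is a convenient consistency check before ν is allowed to vary in a fit.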

7.3. Empirical derivations

Instead of deriving a form factor based on an ideal, random pair distribution function, an alternate strategy is to empirically derive a form factor based on molecular simulations. This strategy was effectively implemented using Monte Carlo simulations of polystyrene by Pedersen and Schurtenberger (Pedersen & Schurtenberger, 1996). In this application, simulations were used to parameterize numerical solutions to the form factor by linear combination of the form factor of a Gaussian chain and an infinitely thin rod. Riback et al. used a similar approach to parameterize a form factor specific to IDRs. They used a series of molecular dynamics simulations in which the excluded volume was effectively titrated by adjusting the attractive potential between beta sidechain carbons in a poly-alanine chain (Riback et al., 2017). The simulations were used to parameterize an empirical function that fits scattering data as a function of RG and ν (Figure 6A,B). The empirical molecular form factor (FF) has proven useful in analyzing SAXS data on IDRs (Banks et al., 2018; Martin et al., 2020; Riback et al., 2017; Riback et al., 2019). The IDR-specific parameterization aids in the ability of this model to reproduce values of ν across sequences (Figure 6B). Further, using a model derived from IDRs allows for rapid evaluation of SAXS data relative to an ideal, homopolymer IDR sequence. Specifically, by evaluating the prefactor, obtained from the form factor fit for a given IDR sequence length, RG and ν, relative to the prefactor in the poly-alanine model, it is possible to diagnose deviations from random statistics. For example, local stiffness or transient structuring could result in deviations in the Kuhn length which would be manifest as an atypical prefactor. Additionally, a non-random distribution of monomers in an IDR will impact the shape of the SAXS curve outside of the small angle regime. In extreme cases where there are significant long-range correlations between monomers, Riback et al. parameterized an extension of the empirical form factor (Riback et al., 2019). It is important to note that high signal-to-noise and ideal baseline subtraction are required to quantitatively compare a model including heterogeneity with a homogeneous ensemble.

7.4. Simulation-based analysis

Typical samples of synthetic polymers contain a distribution of molecular weights. Assuming that the IDR sample is pure (a feature that should be controlled for and ensured by using SEC-SAXS), knowledge of the chemical identity of the IDR can be used to improve SAXS data interpretation over that with the polymer models from above. Knowledge of the sequence length and amino acid identity can provide details aiding in analysis.

Incorporating information about protein sequence into the analysis usually involves generating an explicit ensemble of conformations that fits the experimental data. A number of different methods exist, but all rely on the concept that the data can be represented by a linear combination of representative conformations. First, an ensemble of conformations is generated. This can be done by sampling from amino-acid specific degrees of freedom, creating the starting distribution of conformations. The ensemble can also be generated by Molecular Dynamics or Monte Carlo simulations with a physics-based force field. Second, the fit of the data to the ensemble is inspected. In the best-case scenario (e.g. in the case of a well-performing force field and sampling method), the starting ensemble gives good agreement with the data. If the fit is deemed suboptimal, the conformations in the ensemble can be reweighted or a sub-ensemble can be selected. This process has been accomplished via Bayesian inference optimization of the complete prior distribution (Antonov, Olsson, Boomsma, & Hamelryck, 2016), genetic algorithms which subsample and reweight the prior (Bernado, Mylonas, Petoukhov, Blackledge, & Svergun, 2007; Pelikan, Hura, & Hammel, 2009), clustering followed by maximum entropy reweighting (Rozycki, Kim, & Hummer, 2011), and reweighting the prior based on minimization of a pseudo-energy through simulated annealing (Krzeminski, Marsh, Neale, Choy, & Forman-Kay, 2013; Marsh & Forman-Kay, 2012). The quality of the reweighted ensemble strongly depends on the starting ensemble; if the distribution of properties in the starting ensemble is smooth and already recapitulates the experimental data closely (e.g. because the force field performs well), then reweighting has been shown to result in ensembles with physically reasonable properties such as smooth size distributions (Tria, Mertens, Kachala, & Svergun, 2015). If the starting ensemble is in poor agreement with the experimental data, reweighting can result in ensembles with rough distributions of properties that are physically unreasonable. Hence, realistic and extensive sampling of the conformational space of the sequence in question is key to ensuring the extracted model is minimally biased by the prior distribution (Hummer & Kofinger, 2015; A. H. Larsen et al., 2020).
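The idea that the data are represented by a linear combination of conformer profiles can be sketched in a few lines. The example below averages per-conformer scattering profiles with a set of weights, fits a single scale factor analytically, and reports a reduced chi-squared. It is a generic illustration only and does not reproduce the algorithm of any of the methods cited above. All profiles, uncertainties and weights are made-up numbers, and the function names are hypothetical.

```python
def ensemble_profile(profiles, weights):
    """Weighted average of per-conformer scattering profiles on a common q grid."""
    total = sum(weights)
    n_q = len(profiles[0])
    return [
        sum(w * p[i] for w, p in zip(weights, profiles)) / total
        for i in range(n_q)
    ]

def reduced_chi_squared(i_exp, sigma, i_calc):
    """Reduced chi^2 after fitting one linear scale factor in closed form."""
    num = sum(e * c / s ** 2 for e, c, s in zip(i_exp, i_calc, sigma))
    den = sum(c * c / s ** 2 for c, s in zip(i_calc, sigma))
    scale = num / den
    chi2 = sum(((e - scale * c) / s) ** 2 for e, c, s in zip(i_exp, i_calc, sigma))
    return chi2 / (len(i_exp) - 1), scale

# Made-up profiles for two conformers (compact and expanded) on four q points.
profiles = [
    [1.0, 0.8, 0.5, 0.2],
    [1.0, 0.9, 0.7, 0.4],
]
# Synthetic, noise-free "experiment": a 25:75 mixture on an arbitrary intensity scale.
i_exp = [2.0 * v for v in ensemble_profile(profiles, [0.25, 0.75])]
sigma = [0.02] * len(i_exp)

chi2_prior, scale_prior = reduced_chi_squared(
    i_exp, sigma, ensemble_profile(profiles, [0.5, 0.5]))
chi2_reweighted, scale_reweighted = reduced_chi_squared(
    i_exp, sigma, ensemble_profile(profiles, [0.25, 0.75]))

print(chi2_prior, chi2_reweighted, scale_reweighted)
```

In practice the weights would be optimized under a restraint that keeps them close to the prior, which is the point at which the quality of the starting ensemble becomes decisive.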

The low information content of SAXS data results in a danger of overfitting. This is particularly true if the number of conformations in a minimal ensemble is similar to the information content in the SAXS curve, which may erroneously suggest heterogeneity in the ensemble. A solution to preventing overfitting is implemented in the MultiFoXS algorithm which increases the size of the ensemble until it no longer significantly improves the quality of the fit (Schneidman-Duhovny, Hammel, Tainer, & Sali, 2016). This can also be accomplished through Bayesian inference optimization of minimal ensembles (Yang, Parisien, Major, & Roux, 2010). Intriguingly, the majority of SAXS curves of IDRs can be well represented with fewer than five conformations. These results stress the low information content in SAXS data and the fact that useful data analysis should be testing compatibility with a defined physical model. The molecular details of conformations in minimal ensembles, including the contacts that result in their size and shape and correlations between such interactions, all of which are sought when trying to understand how self-association is encoded in IDRs, cannot be extracted from minimal ensembles. Therefore, the only reliable parameter in minimal models may be the ensemble-average level of compaction – a parameter that is rapidly accessible from analytic models – and not features like intramolecular contacts.

All-atom or coarse-grained simulations using physics-based force fields can, in favorable cases, be used for the molecular interpretation of SAXS data without fitting. The SAXS form factor calculated from an ensemble that represents the conformational space within the simulation can be directly compared with experimental data. This type of approach has been effectively applied using Monte Carlo simulations with the ABSINTH implicit solvent model (Martin et al., 2016; Martin et al., 2020). If the form factors from simulation and experiment agree, the simulations are likely capturing physically realistic conformational behavior of the IDR. Comparison to additional data, e.g. from NMR experiments, helps to establish whether the simulations capture additional features of the conformational behavior of the IDR such as formation of transient contacts or residual secondary structure.

Molecular-detail insight from simulations is particularly useful when evaluating the conformations of phase-separating IDRs because intra-chain contacts can be assumed to also exist intermolecularly in semi-dilute and dense solutions and drive phase transitions. Thus, SAXS-validated simulations can aid in identifying the adhesive elements in IDRs. The synergy between simulation and experiment can be exploited to devise a mutational strategy in which proposed adhesive elements in an IDR are rapidly probed via simulation and their impact verified by SAXS (Martin et al., 2020).

A well-known difficulty in this approach is finding a simulation paradigm that accurately reflects experimental data – a problem particularly relevant to IDRs for which simulations often result in overly compact ensembles (Best & Mittal, 2010). Continued force field development with particular attention being paid to the interaction between solvent and protein has helped ameliorate these shortcomings in recent years (Best, Zheng, & Mittal, 2014; Robustelli, Piana, & Shaw, 2018). If current forcefields that are optimized for IDRs are insufficient to match experimental data, ad hoc modification of the force field or simulation parameters may help. This can often be accomplished by tuning the simulation temperature (Das, Huang, Phillips, Kriwacki, & Pappu, 2016; Francis, Lindorff-Larsen, Best, & Vendruscolo, 2006; Martin et al., 2020) or by subtle modifications to the interactions between solvent and protein (Best et al., 2014; Andreas Haahr Larsen et al., 2019; Piana, Donchev, Robustelli, & Shaw, 2015). Tuning force field parameters and reweighting ensembles can even be mathematically equivalent. In the context of a mutation-based experimental design, the force field could be tuned to recreate experimental data of an experimentally well-characterized IDR sequence. This same force field is used to simulate sequence variants where proposed adhesive elements are modified. If the force field modifications succeed in capturing physically realistic behavior, the simulation results should be a good fit to experimental data on all variants without any further modification.

An alternative approach is bottom-up coarse-grained simulations in which inter-residue interaction potentials are parameterized to match experimental data (Norgaard, Ferkinghoff-Borg, & Lindorff-Larsen, 2008). These types of simulations can be implemented in generalized coarse-grained engines such as HOOMD-Blue (Anderson, Lorenz, & Travesset, 2008). This analysis is particularly attractive for studying phase-separating systems. Parameterizing a coarse-grained simulation enables testing of models in which the affinities between particular residues or groups of residues, so-called ‘stickers’, are titrated and the resulting global dimensions are compared to experimental data. The profound advantage of coarse-grained systems is that the same force field can be transferred to simulations of dense IDR solutions to assay how the interaction potentials translate into emergent properties. One example of this type of simulation is coarse-grained on-lattice Monte Carlo simulation, which allows for rapid sampling and convergence in simulations containing many IDR molecules (Choi, Dar, & Pappu, 2019; Holehouse & Pappu, 2020). The PIMMS and LASSI software packages are designed specifically to implement this workflow. PIMMS was used recently to translate the interaction potential between individual amino acids parameterized based on experimental SAXS data into the calculation of complete coexistence curves for a phase-separating IDR (Martin et al., 2020). An alternate approach relies on ‘slab geometry’ molecular dynamics simulations (Dignon, Zheng, Best, et al., 2018; Dignon, Zheng, Kim, Best, & Mittal, 2018). These methods also allow the direct transfer of a coarse-grained force field from single chain to dense solutions and have the advantage that they are off-lattice and could contain more information about dense phase structure. However, this comes at the cost of higher simulation time.

8. Implementation of SAXS measurements of single chain IDR behavior for characterizing phase behavior

Careful analysis of dilute protein data has the potential to reveal vital information about the nature of sequence features that contribute to higher-order assembly and phase separation. Data we carefully collected on a series of IDRs with progressively higher aromatic amino acid content displayed an increasing drive to compact (Martin et al., 2020) (Figure 7). The compaction was quantified by the decreasing scaling exponent (Figure 7B) which correctly predicted the phase behavior of these sequences. This example suggests a coherent workflow where IDR sequence or solution conditions can be systematically titrated, and the results directly mapped to protein phase behavior. If the IDR behaves like a homopolymer on the length scales measurable by SAXS, this connection will likely exist. However, as IDRs are heteropolymers of finite length, cases will exist with decoupling and symmetry breaking between single chain and emergent properties. This behavior can be predicted within the ‘stickers-and-spacers’ formalism recently described by Choi et al. (Choi et al., 2020). If the ‘stickers’ are strong enough and the ‘spacers’ have sufficient excluded volume, it is possible that the ‘stickers’ on a single chain may not interact with each other. Importantly, even if RG and ν do not directly report on csat, the dilute protein ensemble still informs on the characteristics of the dense phase via the excluded volume of spacers and the cumulative strength of interactions between stickers – parameters that can be inferred from the scaling exponent and csat respectively. Systems where RG and ν do not, a priori, suggest a low csat, but the stickers are strong enough to drive assembly, could result in a lower dense phase concentration or even a percolated, gelled network in the absence of phase separation (Harmon, Holehouse, Rosen, & Pappu, 2017).

Figure 7:

Changes in sequence features that influence the driving force for phase separation are reflected in the protein dimensions under dilute conditions. (A) Raw data on a series of IDR variants with different numbers of aromatic amino acids show the predicted decrease in radius with increased propensity to phase separate. (B) The scaling exponent increases with the removal of aromatic amino acids, indicating more favorable solvation. Dashed lines are fits to the empirical molecular form factor.

9. Conclusions

SAXS has always been a valuable tool for characterizing IDRs because it provides information on their ensemble-average sizes and size distributions. However, these measurements have been largely restricted to highly soluble IDRs due to the requirement for aggregate- and oligomer-free samples of high enough concentration to achieve sufficient signal-to-noise. We propose that these limitations resulted in a bias toward IDR sequences that behave as if they are in a good solvent (v > 0.5) (Bernado & Blackledge, 2009; Cordeiro et al., 2017; Riback et al., 2017). While the oligomerization of self-associating IDRs has been monitored by SAXS in individual, interesting examples (Herranz-Trillo et al., 2017), this category of protein has largely gone unexamined in the dilute regime. Recent advancements in hardware at synchrotron radiation sources, along with refined sample preparation, have opened the door to SAXS characterization of IDRs that have traditionally been inaccessible.

Acknowledgement

The authors would like to thank Thomas Boothby for providing the sample used in Figure 2. T.M. acknowledges funding by NIH grant R01GM112846, the St. Jude Children’s Research Hospital Research Collaborative on Membrane-less Organelles in Health and Disease and the American Lebanese Syrian Associated Charities. This research used resources of the Advanced Photon Source, a U.S. Department of Energy (DOE) Office of Science User Facility operated for the DOE Office of Science by Argonne National Laboratory under Contract No. DE-AC02-06CH11357. This project was supported by grant 9 P41 GM103622 from the National Institute of General Medical Sciences of the National Institutes of Health. Use of the Pilatus 3 1M detector was provided by grant 1S10OD018090-01 from NIGMS. The content is solely the responsibility of the authors and does not necessarily reflect the official views of the National Institute of General Medical Sciences or the National Institutes of Health.

References

  1. Anderson JA, Lorenz CD, & Travesset A (2008). General purpose molecular dynamics simulations fully implemented on graphics processing units. Journal of Computational Physics, 227(10), 5342–5359. doi: 10.1016/j.jcp.2008.01.047
  2. Antonov LD, Olsson S, Boomsma W, & Hamelryck T (2016). Bayesian inference of protein ensembles from SAXS data. Phys Chem Chem Phys, 18(8), 5832–5838. doi: 10.1039/c5cp04886a
  3. Banani SF, Lee HO, Hyman AA, & Rosen MK (2017). Biomolecular condensates: organizers of cellular biochemistry. Nat Rev Mol Cell Biol, 18(5), 285–298. doi: 10.1038/nrm.2017.7
  4. Banks A, Qin S, Weiss KL, Stanley CB, & Zhou HX (2018). Intrinsically Disordered Protein Exhibits Both Compaction and Expansion under Macromolecular Crowding. Biophys J, 114(5), 1067–1079. doi: 10.1016/j.bpj.2018.01.011
  5. Barna S, Shepherd J, Wixted R, Tate M, Rodricks B, & Gruner S (1995). Development of a fast pixel array detector for use in microsecond time-resolved x-ray diffraction (Vol. 2521): SPIE.
  6. Bernado P, & Blackledge M (2009). A self-consistent description of the conformational behavior of chemically denatured proteins from NMR and small angle scattering. Biophys J, 97(10), 2839–2845. doi: 10.1016/j.bpj.2009.08.044
  7. Bernado P, Mylonas E, Petoukhov MV, Blackledge M, & Svergun DI (2007). Structural characterization of flexible proteins using small-angle X-ray scattering. J Am Chem Soc, 129(17), 5656–5664. doi: 10.1021/ja069124n
  8. Bernado P, & Svergun DI (2012). Analysis of intrinsically disordered proteins by small-angle X-ray scattering. Methods Mol Biol, 896, 107–122. doi: 10.1007/978-1-4614-3704-8_7
  9. Best RB, & Mittal J (2010). Protein simulations with an optimized water model: cooperative helix formation and temperature-induced unfolded state collapse. J Phys Chem B, 114(46), 14916–14923. doi: 10.1021/jp108618d
  10. Best RB, Zheng W, & Mittal J (2014). Balanced Protein-Water Interactions Improve Properties of Disordered Proteins and Non-Specific Protein Association. J Chem Theory Comput, 10(11), 5113–5124. doi: 10.1021/ct500569b
  11. Blanchet CE, Spilotros A, Schwemmer F, Graewert MA, Kikhney A, Jeffries CM, … Svergun DI (2015). Versatile sample environments and automation for biological solution X-ray scattering experiments at the P12 beamline (PETRA III, DESY). Journal of Applied Crystallography, 48(Pt 2), 431–443. doi: 10.1107/S160057671500254X
  12. Boeynaems S, Alberti S, Fawzi NL, Mittag T, Polymenidou M, Rousseau F, … Fuxreiter M (2018). Protein Phase Separation: A New Phase in Cell Biology. Trends Cell Biol, 28(6), 420–435. doi: 10.1016/j.tcb.2018.02.004
  13. Brady JP, Farber PJ, Sekhar A, Lin YH, Huang R, Bah A, … Kay LE (2017). Structural and hydrodynamic properties of an intrinsically disordered region of a germ cell-specific protein on phase separation. Proc Natl Acad Sci U S A, 114(39), E8194–E8203. doi: 10.1073/pnas.1706197114
  14. Burke KA, Janke AM, Rhine CL, & Fawzi NL (2015). Residue-by-Residue View of In Vitro FUS Granules that Bind the C-Terminal Domain of RNA Polymerase II. Mol Cell, 60(2), 231–241. doi: 10.1016/j.molcel.2015.09.006
  15. Choi JM, Dar F, & Pappu RV (2019). LASSI: A lattice model for simulating phase transitions of multivalent proteins. PLoS Comput Biol, 15(10), e1007028. doi: 10.1371/journal.pcbi.1007028
  16. Choi JM, Holehouse AS, & Pappu RV (2020). Physical Principles Underlying the Complex Biology of Intracellular Phase Transitions. Annu Rev Biophys. doi: 10.1146/annurev-biophys-121219-081629
  17. Classen S, Hura GL, Holton JM, Rambo RP, Rodic I, McGuire PJ, … Tainer JA (2013). Implementation and performance of SIBYLS: a dual endstation small-angle X-ray scattering and macromolecular crystallography beamline at the Advanced Light Source. J Appl Crystallogr, 46(Pt 1), 1–13. doi: 10.1107/S0021889812048698
  18. Conicella AE, Zerze GH, Mittal J, & Fawzi NL (2016). ALS Mutations Disrupt Phase Separation Mediated by alpha-Helical Structure in the TDP-43 Low-Complexity C-Terminal Domain. Structure, 24(9), 1537–1549. doi: 10.1016/j.str.2016.07.007
  19. Cordeiro TN, Herranz-Trillo F, Urbanek A, Estaña A, Cortés J, Sibille N, & Bernadó P (2017). Structural Characterization of Highly Flexible Proteins by Small-Angle Scattering. In Chaudhuri B, Muñoz IG, Qian S, & Urban VS (Eds.), Biological Small Angle Scattering: Techniques, Strategies and Tips (pp. 107–129). Singapore: Springer Singapore.
  20. Das RK, Huang Y, Phillips AH, Kriwacki RW, & Pappu RV (2016). Cryptic sequence features within the disordered protein p27Kip1 regulate cell cycle signaling. Proc Natl Acad Sci U S A, 113(20), 5616–5621. doi: 10.1073/pnas.1516277113
  21. David G, & Perez J (2009). Combined sampler robot and high-performance liquid chromatography: a fully automated system for biological small-angle X-ray scattering experiments at the Synchrotron SOLEIL SWING beamline. Journal of Applied Crystallography, 42(5), 892–900. doi: 10.1107/S0021889809029288
  22. Dignon GL, Zheng W, Best RB, Kim YC, & Mittal J (2018). Relation between single-molecule properties and phase behavior of intrinsically disordered proteins. Proc Natl Acad Sci U S A, 115(40), 9929–9934. doi: 10.1073/pnas.1804177115
  23. Dignon GL, Zheng W, Kim YC, Best RB, & Mittal J (2018). Sequence determinants of protein phase behavior from a coarse-grained model. PLoS Comput Biol, 14(1), e1005941. doi: 10.1371/journal.pcbi.1005941
  24. Durand D, Vivès C, Cannella D, Pérez J, Pebay-Peyroula E, Vachette P, & Fieschi F (2010). NADPH oxidase activator p67phox behaves in solution as a multidomain protein with semi-flexible linkers. Journal of Structural Biology, 169(1), 45–53. doi: 10.1016/j.jsb.2009.08.009
  25. Fischetti R, Stepanov S, Rosenbaum G, Barrea R, Black E, Gore D, … Bunker GB (2004). The BioCAT undulator beamline 18ID: a facility for biological non-crystalline diffraction and X-ray absorption spectroscopy at the Advanced Photon Source. J Synchrotron Radiat, 11(Pt 5), 399–405. doi: 10.1107/S0909049504016760
  26. Flory PJ (1941). Molecular Size Distribution in Three Dimensional Polymers. I. Gelation. Journal of the American Chemical Society, 63(11), 3083–3090. doi: 10.1021/ja01856a061
  27. Flory PJ (1942). Thermodynamics of High Polymer Solutions. The Journal of Chemical Physics, 10(1), 51–61. doi: 10.1063/1.1723621
  28. Flory PJ (1953). Principles of polymer chemistry. Ithaca: Cornell University Press.
  29. Francis CJ, Lindorff-Larsen K, Best RB, & Vendruscolo M (2006). Characterization of the residual structure in the unfolded state of the Δ131Δ fragment of staphylococcal nuclease. Proteins: Structure, Function, and Bioinformatics, 65(1), 145–152. doi: 10.1002/prot.21077
  30. Fuertes G, Banterle N, Ruff KM, Chowdhury A, Pappu RV, Svergun DI, & Lemke EA (2018). Comment on “Innovative scattering analysis shows that hydrophobic disordered proteins are expanded in water”. Science, 361(6405). doi: 10.1126/science.aau8230
  31. Glatter O (1977). A new method for the evaluation of small-angle scattering data. Journal of Applied Crystallography, 10(5), 415–421. doi: 10.1107/S0021889877013879
  32. Hammouda B (1993). SANS from homogeneous polymer mixtures: A unified overview. In Polymer Characteristics (pp. 87–133). Berlin, Heidelberg: Springer Berlin Heidelberg.
  33. Harmon TS, Holehouse AS, Rosen MK, & Pappu RV (2017). Intrinsically disordered linkers determine the interplay between phase separation and gelation in multivalent proteins. Elife, 6. doi: 10.7554/eLife.30294
  34. Herranz-Trillo F, Groenning M, van Maarschalkerweerd A, Tauler R, Vestergaard B, & Bernado P (2017). Structural Analysis of Multi-component Amyloid Systems by Chemometric SAXS Data Decomposition. Structure, 25(1), 5–15. doi: 10.1016/j.str.2016.10.013
  35. Hofmann H, Soranno A, Borgia A, Gast K, Nettels D, & Schuler B (2012). Polymer scaling laws of unfolded and intrinsically disordered proteins quantified with single-molecule spectroscopy. Proc Natl Acad Sci U S A, 109(40), 16155–16160. doi: 10.1073/pnas.1207719109
  36. Holehouse AS, & Pappu RV (2020). PIMMS (0.24 pre-beta). Zenodo. doi: 10.5281/zenodo.3588456
  37. Hopkins JB, Gillilan RE, & Skou S (2017). BioXTAS RAW: improvements to a free open-source program for small-angle X-ray scattering data reduction and analysis. J Appl Crystallogr, 50(Pt 5), 1545–1553. doi: 10.1107/S1600576717011438
  38. Hopkins JB, & Thorne RE (2016). Quantifying radiation damage in biomolecular small-angle X-ray scattering. J Appl Crystallogr, 49(Pt 3), 880–890. doi: 10.1107/S1600576716005136
  39. Huggins ML (1942). Some Properties of Solutions of Long-chain Compounds. The Journal of Physical Chemistry, 46(1), 151–158. doi: 10.1021/j150415a018
  40. Hummer G, & Kofinger J (2015). Bayesian ensemble refinement by replica simulations and reweighting. J Chem Phys, 143(24), 243150. doi: 10.1063/1.4937786
  41. Jeffries CM, Graewert MA, Svergun DI, & Blanchet CE (2015). Limiting radiation damage for high-brilliance biological solution scattering: practical experience at the EMBL P12 beamline PETRAIII. J Synchrotron Radiat, 22(2), 273–279. doi: 10.1107/S1600577515000375
  42. Kachala M, Valentini E, & Svergun DI (2015). Application of SAXS for the Structural Characterization of IDPs. Adv Exp Med Biol, 870, 261–289. doi: 10.1007/978-3-319-20164-1_8
  43. Kato M, Han TW, Xie S, Shi K, Du X, Wu LC, … McKnight SL (2012). Cell-free formation of RNA granules: low complexity sequence domains form dynamic fibers within hydrogels. Cell, 149(4), 753–767. doi: 10.1016/j.cell.2012.04.017
  44. Kirby N, Cowieson N, Hawley AM, Mudie ST, McGillivray DJ, Kusel M, … Ryan TM (2016). Improved radiation dose efficiency in solution SAXS using a sheath flow sample environment. Acta Crystallogr D Struct Biol, 72(Pt 12), 1254–1266. doi: 10.1107/S2059798316017174
  45. Koch MH, Vachette P, & Svergun DI (2003). Small-angle scattering: a view on the properties, structures and structural changes of biological macromolecules in solution. Q Rev Biophys, 36(2), 147–227. doi: 10.1017/s0033583503003871
  46. Kohn JE, Millett IS, Jacob J, Zagrovic B, Dillon TM, Cingel N, … Plaxco KW (2004). Random-coil behavior and the dimensions of chemically unfolded proteins. Proceedings of the National Academy of Sciences of the United States of America, 101(34), 12491. doi: 10.1073/pnas.0403643101
  47. Kraft P, Bergamaschi A, Broennimann C, Dinapoli R, Eikenberry EF, Henrich B, … Schmitt B (2009). Performance of single-photon-counting PILATUS detector modules. J Synchrotron Radiat, 16(Pt 3), 368–375. doi: 10.1107/S0909049509009911
  48. Krzeminski M, Marsh JA, Neale C, Choy WY, & Forman-Kay JD (2013). Characterization of disordered proteins with ENSEMBLE. Bioinformatics, 29(3), 398–399. doi: 10.1093/bioinformatics/bts701
  49. Larsen AH, Wang Y, Bottaro S, Grudinin S, Arleth L, & Lindorff-Larsen K (2019). Combining molecular dynamics simulations with small-angle X-ray and neutron scattering data to study multi-domain proteins in solution. bioRxiv, 2019.12.26.888834. doi: 10.1101/2019.12.26.888834
  50. Larsen AH, Wang Y, Bottaro S, Grudinin S, Arleth L, & Lindorff-Larsen K (2020). Combining molecular dynamics simulations with small-angle X-ray and neutron scattering data to study multi-domain proteins in solution. PLoS Comput Biol, 16(4), e1007870. doi: 10.1371/journal.pcbi.1007870
  51. Lin YH, & Chan HS (2017). Phase Separation and Single-Chain Compactness of Charged Disordered Proteins Are Strongly Correlated. Biophys J, 112(10), 2043–2046. doi: 10.1016/j.bpj.2017.04.021
  52. Lindner P, & Zemb T (2002). Neutrons, X-rays, and light: scattering methods applied to soft condensed matter (1st ed.). Amsterdam; Boston: Elsevier.
  53. Marsh JA, & Forman-Kay JD (2012). Ensemble modeling of protein disordered states: experimental restraint contributions and validation. Proteins, 80(2), 556–572. doi: 10.1002/prot.23220
  54. Martin EW, Holehouse AS, Grace CR, Hughes A, Pappu RV, & Mittag T (2016). Sequence Determinants of the Conformational Properties of an Intrinsically Disordered Protein Prior to and upon Multisite Phosphorylation. J Am Chem Soc, 138(47), 15323–15335. doi: 10.1021/jacs.6b10272
  55. Martin EW, Holehouse AS, Peran I, Farag M, Incicco JJ, Bremer A, … Mittag T (2020). Valence and patterning of aromatic residues determine the phase behavior of prion-like domains. Science, 367(6478), 694–699. doi: 10.1126/science.aaw8653
  56. Mitrea DM, Cika JA, Stanley CB, Nourse A, Onuchic PL, Banerjee PR, … Kriwacki RW (2018). Self-interaction of NPM1 modulates multiple mechanisms of liquid-liquid phase separation. Nat Commun, 9(1), 842. doi: 10.1038/s41467-018-03255-3
  57. Mittag T, & Forman-Kay JD (2007). Atomic-level characterization of disordered protein ensembles. Curr Opin Struct Biol, 17(1), 3–14. doi: 10.1016/j.sbi.2007.01.009
  58. Molliex A, Temirov J, Lee J, Coughlin M, Kanagaraj AP, Kim HJ, … Taylor JP (2015). Phase separation by low complexity domains promotes stress granule assembly and drives pathological fibrillization. Cell, 163(1), 123–133. doi: 10.1016/j.cell.2015.09.015
  59. Muller-Spath S, Soranno A, Hirschfeld V, Hofmann H, Ruegger S, Reymond L, … Schuler B (2010). From the Cover: Charge interactions can dominate the dimensions of intrinsically disordered proteins. Proc Natl Acad Sci U S A, 107(33), 14609–14614. doi: 10.1073/pnas.1001743107
  60. Norgaard AB, Ferkinghoff-Borg J, & Lindorff-Larsen K (2008). Experimental Parameterization of an Energy Function for the Simulation of Unfolded Proteins. Biophysical Journal, 94(1), 182–192. doi: 10.1529/biophysj.107.108241
  61. Nott TJ, Petsalaki E, Farber P, Jervis D, Fussner E, Plochowietz A, … Baldwin AJ (2015). Phase transition of a disordered nuage protein generates environmentally responsive membraneless organelles. Mol Cell, 57(5), 936–947. doi: 10.1016/j.molcel.2015.01.013
  62. Oldfield CJ, & Dunker AK (2014). Intrinsically disordered proteins and intrinsically disordered protein regions. Annu Rev Biochem, 83, 553–584. doi: 10.1146/annurev-biochem-072711-164947
  63. Panjkovich A, & Svergun DI (2018). CHROMIXS: automatic and interactive analysis of chromatography-coupled small-angle X-ray scattering data. Bioinformatics, 34(11), 1944–1946. doi: 10.1093/bioinformatics/btx846
  64. Patel A, Lee HO, Jawerth L, Maharana S, Jahnel M, Hein MY, … Alberti S (2015). A Liquid-to-Solid Phase Transition of the ALS Protein FUS Accelerated by Disease Mutation. Cell, 162(5), 1066–1077. doi: 10.1016/j.cell.2015.07.047
  65. Pedersen JS, & Schurtenberger P (1996). Scattering Functions of Semiflexible Polymers with and without Excluded Volume Effects. Macromolecules, 29(23), 7602–7612. doi: 10.1021/ma9607630
  66. Pelikan M, Hura GL, & Hammel M (2009). Structure and flexibility within proteins as identified through small angle X-ray scattering. Gen Physiol Biophys, 28(2), 174–189. doi: 10.4149/gpb_2009_02_174
  67. Pernot P, Round A, Barrett R, De Maria Antolinos A, Gobbo A, Gordon E, … McSweeney S (2013). Upgraded ESRF BM29 beamline for SAXS on macromolecules in solution. J Synchrotron Radiat, 20(Pt 4), 660–664. doi: 10.1107/S0909049513010431
  68. Piana S, Donchev AG, Robustelli P, & Shaw DE (2015). Water dispersion interactions strongly influence simulated structural properties of disordered protein states. J Phys Chem B, 119(16), 5113–5123. doi: 10.1021/jp508971m
  69. Powers SK, Holehouse AS, Korasick DA, Schreiber KH, Clark NM, Jing H, … Strader LC (2019). Nucleo-cytoplasmic Partitioning of ARF Proteins Controls Auxin Responses in Arabidopsis thaliana. Mol Cell, 76(1), 177–190 e175. doi: 10.1016/j.molcel.2019.06.044
  70. Ptitsyn OB (1995). Molten globule and protein folding. Adv Protein Chem, 47, 83–229. doi: 10.1016/s0065-3233(08)60546-x
  71. Rambo RP, & Tainer JA (2011). Characterizing flexible and intrinsically unstructured biological macromolecules by SAS using the Porod-Debye law. Biopolymers, 95(8), 559–571. doi: 10.1002/bip.21638
  72. Riback JA, Bowman MA, Zmyslowski AM, Knoverek CR, Jumper JM, Hinshaw JR, … Sosnick TR (2017). Innovative scattering analysis shows that hydrophobic disordered proteins are expanded in water. Science, 358(6360), 238–241. doi: 10.1126/science.aan5774
  73. Riback JA, Bowman MA, Zmyslowski AM, Plaxco KW, Clark PL, & Sosnick TR (2019). Commonly used FRET fluorophores promote collapse of an otherwise disordered protein. Proc Natl Acad Sci U S A, 116(18), 8889–8894. doi: 10.1073/pnas.1813038116
  74. Robustelli P, Piana S, & Shaw DE (2018). Developing a molecular dynamics force field for both folded and disordered protein states. Proc Natl Acad Sci U S A, 115(21), E4758–E4766. doi: 10.1073/pnas.1800690115
  75. Rozycki B, Kim YC, & Hummer G (2011). SAXS ensemble refinement of ESCRT-III CHMP3 conformational transitions. Structure, 19(1), 109–116. doi: 10.1016/j.str.2010.10.006
  76. Rubinstein M, & Colby RH (2003). Polymer physics. Oxford; New York: Oxford University Press.
  77. Ryan TM, Trewhella J, Murphy JM, Keown JR, Casey L, Pearce FG, … Kirby N (2018). An optimized SEC-SAXS system enabling high X-ray dose for rapid SAXS assessment with correlated UV measurements for biomolecular structure analysis. Journal of Applied Crystallography, 51(1), 97–111. doi: 10.1107/S1600576717017101
  78. Ryan VH, Dignon GL, Zerze GH, Chabata CV, Silva R, Conicella AE, … Fawzi NL (2018). Mechanistic View of hnRNPA2 Low-Complexity Domain Structure, Interactions, and Phase Separation Altered by Mutation and Arginine Methylation. Mol Cell, 69(3), 465–479 e467. doi: 10.1016/j.molcel.2017.12.022
  79. Schmidt HB, Barreau A, & Rohatgi R (2019). Phase separation-deficient TDP43 remains functional in splicing. Nature Communications, 10(1), 4890. doi: 10.1038/s41467-019-12740-2
  80. Schneidman-Duhovny D, Hammel M, Tainer JA, & Sali A (2016). FoXS, FoXSDock and MultiFoXS: Single-state and multi-state structural modeling of proteins and their complexes based on SAXS profiles. Nucleic Acids Res, 44(W1), W424–429. doi: 10.1093/nar/gkw389
  81. Schuler B, Soranno A, Hofmann H, & Nettels D (2016). Single-Molecule FRET Spectroscopy and the Polymer Physics of Unfolded and Intrinsically Disordered Proteins. Annu Rev Biophys, 45, 207–231. doi: 10.1146/annurev-biophys-062215-010915
  82. Shin Y, & Brangwynne CP (2017). Liquid phase condensation in cell physiology and disease. Science, 357(6357). doi: 10.1126/science.aaf4382
  83. Stockmayer WH (1944). Theory of Molecular Size Distribution and Gel Formation in Branched Polymers II. General Cross Linking. The Journal of Chemical Physics, 12(4), 125–131. doi: 10.1063/1.1723922
  84. Svergun DI (1999). Restoring low resolution structure of biological macromolecules from solution scattering using simulated annealing. Biophysical Journal, 76(6), 2879–2886. doi: 10.1016/S0006-3495(99)77443-6
  85. Svergun DI, & Feǐgin LA (1986). Rentgenovskoe i neǐtronnoe malouglovoe rasseianie. Moskva: “Nauka,” Glav. red. fiziko-matematicheskoǐ lit-ry.
  86. Svergun DI, Feǐgin LA, & Taylor GW (1987). Structure analysis by small-angle x-ray and neutron scattering. New York: Plenum Press.
  87. Tompa P (2012). Intrinsically disordered proteins: a 10-year recap. Trends Biochem Sci, 37(12), 509–516. doi: 10.1016/j.tibs.2012.08.004
  88. Tria G, Mertens HD, Kachala M, & Svergun DI (2015). Advanced ensemble modelling of flexible macromolecules using X-ray solution scattering. IUCrJ, 2(Pt 2), 207–217. doi: 10.1107/S205225251500202X
  89. Tuukkanen AT, Kleywegt GJ, & Svergun DI (2016). Resolution of ab initio shapes determined from small-angle scattering. IUCrJ, 3(6), 440–447. doi: 10.1107/S2052252516016018
  90. Uversky VN (2002). Natively unfolded proteins: a point where biology waits for physics. Protein Sci, 11(4), 739–756. doi: 10.1110/ps.4210102
  91. van der Lee R, Buljan M, Lang B, Weatheritt RJ, Daughdrill GW, Dunker AK, … Babu MM (2014). Classification of intrinsically disordered regions and proteins. Chem Rev, 114(13), 6589–6631. doi: 10.1021/cr400525m
  92. Wang A, Conicella AE, Schmidt HB, Martin EW, Rhoads SN, Reeb AN, … Fawzi NL (2018). A single N-terminal phosphomimic disrupts TDP-43 polymerization, phase separation, and RNA splicing. EMBO J, 37(5). doi: 10.15252/embj.201797452
  93. Wilkins DK, Grimshaw SB, Receveur V, Dobson CM, Jones JA, & Smith LJ (1999). Hydrodynamic Radii of Native and Denatured Proteins Measured by Pulse Field Gradient NMR Techniques. Biochemistry, 38(50), 16424–16431. doi: 10.1021/bi991765q
  94. Williamson TE, Craig BA, Kondrashkina E, Bailey-Kellogg C, & Friedman AM (2008). Analysis of self-associating proteins by singular value decomposition of solution scattering data. Biophys J, 94(12), 4906–4923. doi: 10.1529/biophysj.107.113167
  95. Yang S, Parisien M, Major F, & Roux B (2010). RNA structure determination using SAXS data. J Phys Chem B, 114(31), 10039–10048. doi: 10.1021/jp1057308
