Abstract
Structural biology is entering an exciting time where many new high-resolution structures of large complexes and membrane proteins are determined regularly. These advances have been driven by over fifteen years of technology advancements, first in macromolecular crystallography and recently in cryo-electron microscopy. These structures are allowing detailed questions about functional mechanisms of the structures, and the biology enabled by these structures, to be addressed for the first time. At the same time, mass spectrometry technologies for protein structure analysis, “footprinting” studies, have improved their sensitivity and resolution dramatically, and can provide detailed sub-peptide and residue level information for validating structures and interactions or understanding the dynamics of structures in the context of ligand binding or assembly. In this perspective, we review the use of protein footprinting to extend our understanding of macromolecular systems, particularly for systems challenging for analysis by other techniques, such as intrinsically disordered proteins, amyloidogenic proteins, and other proteins/complexes so far recalcitrant to existing methods. We also illustrate how the availability of high-resolution structural information can be a foundation for a suite of hybrid approaches to divine structure-function relationships beyond what individual techniques can deliver.
Keywords: proteins, mass spectrometry, structural biology, footprinting, hybrid methods
Advances in structural biology and biophysics applications of macromolecular crystallography, cryo-EM and NMR have revolutionized our access to protein and nucleic acid structural information. These structural advances, coupled to advances in genomic sequencing and molecular biology, have provided rapid identification of drug targets and have revolutionized rational drug development for both small molecules and biologics [1]. A major theme emerging in structural biology is that of combining different methods through appropriate data integration. In Figure 1 we illustrate this cycle of analysis, which starts with high-resolution structural information (or models) from macromolecular crystallography, Cryo-EM, or other methods and through the application of a range of orthogonal approaches asks questions like: What are the structures of multi-component macromolecular complexes in varying molecular states? How do they structurally interconvert between these functional states? And, what structural features drive the kinetics and thermodynamics of assembly?
Fig 1. Integrated workflow for structure assessment.
Reprinted with permission [1].
The structural genomics revolution provided template structures for most soluble domains from crystallographic and NMR data [2, 3], but these “Lego blocks” representing discrete tertiary structural elements need additional data to “assemble” them into their physiological context as components of complex macromolecular machines. Techniques like small angle x-ray scattering (SAXS) and electron microscopy (EM) are powerful approaches that provide global shape envelopes (Figure 1); these can help to understand the different ways of assembling these “Lego blocks” [4, 5]. Along with these shape measures, native mass spectrometry (MS) [6, 7] has been quite valuable in elucidating the composition of these large assemblies. In this MS approach (also called “top-down” because an intact protein species is introduced to the instrument), macromolecular complexes are ionized and introduced to the gas phase, then subjected to cycles of analysis and fragmentation to understand subunit stoichiometries, topology, and binding to ligands.
This top-down approach is complementary to classical “bottom-up” mass spectrometry, where proteins are digested by proteases into a set of constituent peptides (ranging from 5–15 residues each) and the peptides are separated by chromatography and individually analyzed. In bottom-up footprinting experiments the protein complexes are chemically labeled, then digested to the peptide level and introduced to the instrument while preserving as much of the labeling as possible, and the labeling patterns of the peptides are analyzed to infer important structural and dynamics information. The most popular bottom-up approaches are cross-linking, hydrogen deuterium exchange (HDX), and irreversible covalent labeling. Cross-linking typically uses bifunctional chemical reagents with linkers of defined length (from so-called zero-length to longer) that react with side chain groups (e.g. Lys), linking side chains that can achieve close approach. These data can be deconvolved to determine which proteins are adjacent in the case of a large complex (inter-protein crosslinks), or they can be used to understand domain-domain distances for single proteins as well (intra-protein crosslinks) [8, 9]. HDX and various other footprinting (FP) approaches use mono-functional reagents that specifically label the backbone (in the case of HDX) [10] or side chains [1, 11–13]. Changes in the reactivity of these reagents with respect to the specific sites on the protein can reveal the peptide (or sub-peptide) level details of secondary (HDX) or tertiary and quaternary (HDX and FP) structural changes.
In order to better understand the strengths and weaknesses of these evolving approaches, and determine the right integrative approach for a particular problem, we have prepared a summary of the salient features of major structural biology techniques familiar to most modern research laboratories. Table 1 lays out the major techniques, provides a graphic illustrating the method or results, lists the number of structures associated with that technique currently in the Protein Data Bank (https://www.rcsb.org/stats/summary), states limitations of molecular size or amounts of material associated with typical studies, and adds additional commentary on significance. The vast number of deposited structures, exceeding 160,000 (as of 2/2020), is dominated by x-ray crystallography, due to the very high throughput nature of modern synchrotron data collection and structure solution. Other methods also make very important contributions, with NMR exceeding 12,000 structures and single particle analysis by cryo-EM increasing quite rapidly with now over 4000 structures deposited. Structures based on integrative or hybrid methods, including low-resolution EM, crosslinking, footprinting, and SAXS, are also emerging, and several dozen structures have been deposited in a separate database, called PDB-Dev (https://pdb-dev.wwpdb.org/), designed specifically for these integrative approaches.
Table 1:
Comparison of major structural biology techniques in terms of resolution, limitations, and other relevant figures of merit. Tomography figure courtesy of Chiu lab.
Technique (PDB deposits) | Size (Sample state) | Resolution Limits | Amounts | Notes
---|---|---|---|---
NMR (12835) | <100 kDa (solution) | ~3–4 Å | μmoles/milligrams | Requires labeled recombinant protein; disordered regions can be observed but may not be assigned
X-ray Crystallography (141165) | Limited by crystal quality | <1–3 Å | μmoles/milligrams | Mutant constructs necessary for many membrane proteins; disordered regions invisible. Gold standard for structural water
Cryo-EM: Single particle (4092) | >100 kDa (vitrified ice) | Mostly >3 Å | nanomoles/micrograms | Resolution and size limits improving; best samples have symmetry; disordered regions invisible
Cryo-EM: Tomography | Cells or tissues | 30–40 Å | thin sections/individual cells | Resolution improving; captures large-scale spatial organization in cells
SAXS | >10 kDa (solution) | >20 Å | nanomoles/micrograms | Native material can usually be used (similar to FP samples)
Footprinting: HRF-MS [and HDX-MS] | HRF-MS: No limit [<100 kDa for HDX] (solution) | Peptide to single-residue (single base for NA) | picomoles/nanograms | Native material can usually be used (both); absolute surface area can be estimated (HRF); disordered regions visible (HRF). Studies in cells/tissue possible (HRF)
NMR is a valuable solution-based technique, most effective on proteins < 100 kDa. Milligrams of material and micromolar concentrations of sample are typical; some experiments require expensive isotope labeling and/or expensive expression systems for some proteins. Data collection times can be hours or days. X-ray crystallography, conducted at synchrotrons, is very high throughput, as hundreds of crystals can be analyzed in a single day. It has very high potential resolution, and is currently essential for reliably identifying sites of water or metal ion occupancy. Protein stability issues and limitations of crystallization are challenges for membrane proteins and larger complexes [14, 15]. Cryo-EM has provided a breakthrough for solving medium to high resolution structures of large complexes and membrane proteins, and sample amounts are considerably reduced, from the milligrams required for crystallography and NMR to micrograms for the cryo-EM vitrification sample preparation process. An additional complexity is that molecules at the air/water interface can be denatured as a result of the vitrification, and data collection times may also be hours or days. Another current limitation of cryo-EM is that >90% of the structures are at resolutions no better than 3 Å, and samples that lack symmetry or are under 100 kDa suffer relatively rapid radiation damage, limiting data collection and/or resolution [16, 17]. Thus, NMR and cryo-EM have quite orthogonal optima for protein size. Cryo-electron tomography similarly has the potential to elucidate the molecular envelopes of large macromolecular complexes while operating in situ, albeit currently at resolutions of 20–30 Å [18, 19]. Small-angle x-ray scattering can be applied in solution, even for relatively large complexes, and provides shape information at ~20 Å resolution [20].
With the strengths and weaknesses of each of these techniques in mind, structural mass spectrometry can be quite useful to fill in gaps in both approach and information. Some advantages of MS include: only nanograms of material are needed (three orders of magnitude less than cryo-EM and six orders less than crystallography/NMR); samples generally do not need to be engineered/re-engineered or labeled (e.g. can be native); samples are assessed in solution or other convenient matrices; samples can be of almost any size or complexity (with the caveat that HDX is less effective for systems >100 kDa due to back exchange); and MS experiments can provide high resolution assessments of secondary, tertiary and quaternary structure [1].
Figure 1 illustrates the workflow we have adopted to answer key questions in structure assessment. The workflow’s foundation is high resolution structural information from crystallography, NMR or cryo-EM and relevant homology modeling; on this foundation we envision experiments that assess the temporal and spatial features of the assembly and dynamics of the molecular systems under study, particularly those that represent a perturbation of the “canonical” structure. It is often the case that crystallographic or cryo-EM information is available only for one form of a complex of interest, e.g. with a ligand, while the “apo” structure without a ligand may not be known. In these cases, techniques sensitive to local structural information (HDX and FP) can be used to infer the ligand binding site and sites of potential allosteric change, while global measures such as native MS, EM and SAXS can determine ligand-dependent shape changes for the overall protein/complex. Repeated test and validation cycles, where structural models are evaluated by mutagenesis or other orthogonal biophysics experiments, are essential for establishing rigor of results.
A number of interesting and challenging problems in structural biology can now be solved by this enabling suite of approaches; some of these are outlined in Table 2. In this perspective we illustrate novel integrations of SAXS, FP, and molecular docking to overcome these challenges to provide structure. In terms of potential applications, structures of macromolecular complexes, for example antibody/antigen complexes and checkpoint interactions in immune cells, are of high priority for analysis, but some systems, due to disorder or size, are not amenable to standard approaches. In addition, the study of membrane proteins enabled by crystallography and cryo-EM can be followed up with MS approaches. FP has the potential to provide protein interaction and dynamic data for membrane proteins [21, 22] as it is challenging to obtain models/structures of all the productive complexes involved in cellular signaling and function. For example, HDX and FP have recently leveraged high resolution structural information to address the dynamics of signaling and the mechanisms of GPCR-G-protein complex assembly [23]. In this study, the details of the dynamics of interconversions between known states, including temporal and mechanistic details of interactions, were revealed providing a new window into biological function. Specifically, an intermediate in the reaction to form the GPCR-G-protein complex was observed on the hundreds of milliseconds timescale, well prior to actual cellular signaling events.
Table 2:
Major opportunity areas for structural footprinting.
■ Protein complexes/ligand interactions not amenable to crystallography or cryo-EM
■ Membrane proteins in native-like states
■ Intrinsically disordered proteins
■ Amyloidogenic proteins
■ Structural kinetics and intermediates of protein machines
■ Structure of molecules in cells/tissue
A key requirement in integrative structural studies of challenging systems is a flexible, capable analytical toolbox to integrate disparate types of structural data; this has the potential to provide high resolution structural information even in the absence of success via conventional methods. A further challenge is assessing native structures at least in solution, if not in their physiological cellular and/or tissue context. Successful completion of such integrative, “in (cellular) context” studies will require integration of multiple types of experimental data using varying computational strategies, including integration of high-resolution structural information spanning both local and global scales. An additional set of unmet challenges is to obtain a readout of protein dynamics to both expand and validate these models, while attaining the readouts at time resolutions sufficient to capture signatures of intermediate states of signaling and dynamics. Mass spectrometry will play a major role in sampling dynamics of structural proteomes for both in vitro systems and on cellular scales [24].
Footprinting for Structural Biology
Footprinting as a structural tool is a chemistry-based approach, wherein small molecules are designed to covalently modify macromolecules at sites of structural interest [1, 25]. Hydroxyl radical footprinting (HRF) is the most popular approach for macromolecule modification or cleavage, although its utility can be reduced by the presence of scavenging elements in solution, including buffering agents or stabilizers needed to poise the macromolecule or cell in the biochemical state of interest. These scavengers soak up the dose of OH radicals, requiring chemical approaches to increase OH radical concentrations or increased timescales to accumulate sufficient dose to efficiently detect products. This is undesirable as it increases the potential for secondary radical generation, complicating the analysis of primary radical reactions with the macromolecules and leading to significant challenges in acquiring reproducible or accurate data [26]. Due to these limiting factors, high-flux sources of hydroxyl radicals, using photolysis of peroxide [27–29] or radiolysis of water [1, 11, 13], have long been popular as ways to achieve sufficient OH radical concentrations to optimize labeling for successful footprinting.
Synchrotron resources have long been used as sources of OH radicals for footprinting. The high dose (HD) of radicals available has particular advantages, including higher coverage of modifications in challenging samples such as membrane proteins or prions, and HD applications are essential for enhanced prospects for footprinting in organelles or cells. HD applications also enable time-resolved applications in conjunction with rapid mixing or other fast reaction initiation schemes. Recently, the Center for Synchrotron Biosciences (CSB) completed the construction and commissioning of a new state-of-the-art beamline for synchrotron footprinting at the National Synchrotron Light Source-II, the 17-BM X-ray Footprinting of Biomaterials (XFP) beamline [30], which delivers world-leading flux densities and photon dose capabilities (>10¹⁶ photons/s, and up to 500 W/mm absorbed dose achievable). This new beamline (Figure 2) is also adjacent to other cutting-edge NSLS-II structural biology beamlines for crystallography, small angle scattering and biological X-ray imaging, providing an integrated “Structural Biology Village” environment. This modern resource for synchrotron footprinting provides the ideal opportunity to rectify some of the limitations of present sample handling devices and develop an integrated platform for footprinting, which when coupled to new data analysis tools, can drive adoption of the technique by the broader structural biology community and the adoption of radiolytic HRF at other synchrotron beamlines both in the US and internationally [30–36].
Fig 2. XFP beamline at NSLS-II.
Synchrotron white light (pale pink cylinder) is produced from a 3-pole wiggler (3PW) source at NSLS-II. White light passes through a set of beam-defining white-beam slits, and then intercepts a toroidal mirror that focuses the X-ray light and rejects high-energy X-rays (>16 keV), producing “pink” beam (pink cylinder). Pink beam passes through the shield wall into the experimental hutch, where it intercepts beam defining pink beam slits and then achieves 1:1 focus at the High Dose footprinting endstation before diverging to a larger beam at the downstream High Throughput endstation. (right) photo of the XFP experimental hutch in the High-Throughput X-ray footprinting configuration. The 3PW, white beam slits, and toroidal focusing mirror are located upstream behind the shield wall and are not visible in this photo. The first element in the hutch is the pink-beam slits in a photon delivery chamber, followed by transport pipe to the High Dose and High Throughput endstations. Changes between the two endstations are accomplished by removal or installation of an exchangeable beampipe, as required.
Integrating Footprinting and Stopped-Flow Kinetics
With the availability of footprinting beamlines capable of delivering sufficient flux densities to label proteins within microseconds in the context of flow-based exposure cells, studying single residue dynamics at the protein interface during formation of a complex is clearly enabled on such timescales if the samples can be appropriately poised. This includes examining the structural transitions that govern the formation of a productive protein-protein binding interface or the protein dynamics that govern the interactions of a protein with its ligand partner. This approach provides a much-needed temporal window into the structures provided by X-ray crystallography and cryo-EM structural studies, which by their very nature can only reveal static snapshots of energetic intermediates/endpoints that may not fully represent or recapitulate aspects of the physiological transition. Furthermore, as HRF techniques can utilize native protein, HRF results have the potential to deliver a more physiologically relevant understanding of protein-protein and protein–ligand complex kinetic parameters in the absence of the conditions, mutations, crystal contacts, and constructs often required to stabilize proteins for structural studies. While competing techniques such as HDX are limited in their temporal resolution, essentially providing snapshots limited by the exchange rate of label, the speed of OH radical generation using synchrotron radiation opens up new temporal vistas, enabling us to observe microsecond to millisecond (as well as longer) timescale events.
We recently employed the HRF technique in combination with stopped flow kinetics to probe the structural and dynamic transitions involved in the formation of the β2-Adrenergic receptor (β2-AR) Gs signaling complex (Figure 3) after mixing the individual species of G-protein and receptor. The structural details of the formation of this complex are of great scientific interest: the complex elucidates aspects of heart rate and blood pressure signaling, is a target of a number of commonly prescribed blood pressure medications, and serves as a prototypical receptor for the understanding of G protein-coupled receptor signaling in general. The customized beamline front end of 17-BM allowed us to label (and thus observe) components of the β2-AR Gs signaling complex at time points ranging from milliseconds after mixing out to 10 minutes, providing a kinetic trace with amino acid resolution for several component regions of the receptor and G protein as they interact to form the signal initiating complex. This experimental methodology provides the first look into the residue level changes in β2-AR/Gs dynamics during complex formation as well as providing a mechanistic understanding of Gsα subunit recognition by activated receptor.
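As a rough illustration of how a residue-level kinetic trace of this kind might be reduced to an apparent rate, the sketch below fits a single-exponential approach to a plateau. The single-exponential model, the function name `apparent_rate`, and all numbers are assumptions made for illustration only; they are not the analysis or the data of the study [23].

```python
import math


def apparent_rate(times_s, signal, plateau):
    """Estimate k for the assumed model signal(t) = plateau * (1 - exp(-k * t)).

    The model is linearized as ln(1 - signal / plateau) = -k * t and fitted
    by least squares with a line through the origin.
    """
    xs, ys = [], []
    for t, s in zip(times_s, signal):
        frac_remaining = 1.0 - s / plateau
        # Skip points that cannot be log-transformed (t = 0 or at the plateau).
        if t > 0 and frac_remaining > 0:
            xs.append(t)
            ys.append(math.log(frac_remaining))
    if not xs:
        raise ValueError("no usable time points")
    return -sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


# Hypothetical, noise-free time course (seconds after mixing); not measured data.
times = [0.05, 0.2, 1.0, 5.0]
true_k = 2.0
trace = [1.0 - math.exp(-true_k * t) for t in times]

k_fit = apparent_rate(times, trace, plateau=1.0)
print(f"apparent rate: {k_fit:.3f} per second")
```

With real, noisy traces the plateau is usually not known in advance, so a nonlinear fit of both amplitude and rate, or a multi-phase model, may be more appropriate than this linearized form.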
Figure 3. Time-Resolved Analysis of GPCR-Gs Complex Formation by HRF-MS.
Utilizing synchrotron-based HRF technologies in combination with hydrogen deuterium exchange mass spectrometry data, it is possible to probe both the formation of a protein complex and the amino acid/peptide resolution changes that underlie it, over timescales ranging from tens of milliseconds to minutes. In this study, the changes in protein dynamics and solvent accessibility that underlie the formation of the β2-AR-Gs signaling complex were studied, yielding temporal data on the small scale changes that enable the GPCR to specifically recognize its cognate G protein and initiate allosteric/structural changes that enable the process of nucleotide exchange required for downstream signaling. (A) X-ray generated radiolytic oxidative modification profiles of selected peptides or residues from the β2-AR or Gαs. Oxidative modification changes of Gαs upon incubation with the β2-AR were analyzed. The modified peptides or residues are indicated as colored regions or sticks on the X-ray crystal structure of the β2-AR-Gs complex (PDB: 3SN6). (B) The surrounding environment of M386. In the GDP-bound Gs structure, M386 is located within a pocket formed by four amino acids (green spheres) with limited solvent exposure.
(C) Rearrangement of interactions with M221 and F376 of Gαs following formation of the nucleotide-free β2-AR-Gs complex. In the GDP-bound Gs structure, M221 and F376 form interactions with residues within β2-β3 strands and α1 helix (left), which are lost in the β2-AR-bound nucleotide-free structure (PDB: 3SN6) (right). In the β2-AR-bound nucleotide-free structure (PDB: 3SN6), F376 forms new interactions with F139 of the β2-AR and amino acids in the αN/β1 hinge and β2/β3 loop (right). Adapted with permission [23].
Integrations of Footprinting and SAXS
Advanced FP approaches can measure the relative or absolute solvent accessibility of side chain residues in Å² via a protection factor (PF) analysis, and thus can provide a quantitative input to modeling exercises, where the predicted solvent accessibility from a model can be compared to experimentally derived solvent accessibility data from footprinting for scoring purposes [37, 38]. FP, however, as it reports only local structural information, is not sufficient to understand the full context and scope of all relevant structural interactions. The synergism of FP with other complementary techniques such as SAXS, which provides overall protein shape information, is able to overcome the limitations of the individual techniques, providing a more complete and comprehensive picture of a given system of interest.
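The comparison of model-predicted and footprinting-derived solvent accessibility for scoring can be sketched as follows. This is a minimal illustration, not the scoring function of any published package: it assumes that log PF falls as side chain solvent accessible surface area rises, and it ranks candidate models by the Pearson correlation between the two. The names `pearson` and `rank_models` and all values are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def rank_models(log_pf, model_sasa):
    """Return (model name, correlation) pairs, best agreement first.

    Assumption: a model agrees with the footprinting data when residues with
    high log PF (more protected) have low predicted accessibility, so the most
    negative correlation ranks first.
    """
    scores = {name: pearson(sasa, log_pf) for name, sasa in model_sasa.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical log PF values for five probed residues (not measured data).
log_pf = [3.0, 2.5, 2.0, 1.2, 0.8]

# Hypothetical per-residue accessibility (in Å²) predicted from two candidate models.
models = {
    "model_A": [5.0, 30.0, 55.0, 90.0, 120.0],
    "model_B": [120.0, 20.0, 80.0, 10.0, 60.0],
}

ranking = rank_models(log_pf, models)
for name, r in ranking:
    print(f"{name}: r = {r:.2f}")
```

A correlation is only one possible agreement measure; a residual-based score against a calibrated PF-versus-accessibility relationship would serve the same purpose.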
Consistent with the data integration philosophy articulated in Figure 1, the iSPOT (integration of Scattering, footPrinting, and dOcking simulaTion) software [20] has been developed to integrate crystallographic or other high-resolution structural models of macromolecules with SAXS and FP data to resolve macromolecular structures (Figure 4). iSPOT combines multiple sources of structural data and it builds on approaches pioneered by others such as IMP, HADDOCK, and CNS [39–43]. CNS uses information from X-ray crystallography or NMR, and can also integrate data from EM, while HADDOCK can integrate NMR, SAXS and EM data. The IMP approach developed in the Sali group was initially focused on modeling using low resolution EM data, but has the capability to incorporate many different types of data (such as cryo-EM, X-ray crystallography and chemical cross-linking) for enabling the modeling of very large molecular complexes such as the nuclear pore [43]. In contrast to these three methods, a key focus of iSPOT is the orthogonality and complementarity of FP and SAXS: the latter sensitive to protein shape and overall arrangement and the former sensitive to residue-specific solvent exposure. iSPOT fills a gap in modeling as it can address protein-protein complexes that are too large for NMR, too dynamic for crystallization, and/or too small/dynamic for cryo-EM. While iSPOT is the first approach to include FP data for structural modeling, the addition of HRF data has been shown to improve the overall performance of the Rosetta modeling for protein structure prediction as well [44].
Fig 4. The iSPOT workflow.
It consists of four components: (a) computational protein-protein docking, (b) experimental SAXS and footprinting data acquisition, (c) scoring and selection, and (d) structural model optimization. Reprinted with permission [42].
Computational docking is central to iSPOT modeling by providing a basis set of conformations to fit the SAXS and FP data simultaneously. One such method is straightforward rigid-body docking, where the known structures of individual components are treated as two rigid particles and fed into one of the many available software packages such as ClusPro [45]. The binding of two interacting proteins, however, often induces conformational changes. Under this circumstance, flexible docking is required to account for binding-induced structural changes. When such flexibility occurs at the amino acid level, the side chains at the interface can be optimized during the docking process, as implemented in HADDOCK [39]. In other scenarios, the conformational flexibility can go beyond local side chain changes, e.g., involving tightening of secondary structure and packing at the interface due to the binding. These induced-fit changes need to be considered and can be addressed. For iSPOT, coarse-grained molecular dynamics simulations are implemented to allow large-scale induced-fit structural changes [20, 42]. Even larger-scale conformational changes, such as allosteric dynamics that travel beyond the protein interface, are quite challenging to accommodate; this has been one limiting factor of current integrative modeling approaches. Nonetheless, as computing power increases, integrative modeling is on a path toward accurate and reliable treatment of both rigid-body and flexible induced-fit protein-protein docking.
Our success in integrating SAXS and FP datasets to model protein-protein interacting complexes [20, 42] includes the determination of the multidomain structure of human estrogen receptor (Fig. 5a) [46]. Although the individual structures of the DNA and ligand binding domains were known at high resolution, all previous attempts to solve the structure of the complex at high resolution were unsuccessful [47]. To understand the structure of the complex, we first measured its experimental protection factors (PF), which are rates of hydroxyl radical reactivity measured by experiment and then normalized by intrinsic reactivity of the individual amino acid side chains, for twenty independent side chain sites in the complex. We then plotted the solvent accessible surface area for the individual DNA and ligand binding domains from crystallography vs. the PF for the 20 residues. Fourteen of these 20 sites obeyed the expected linear relationship of PF and solvent accessible surface area, indicating that the PF values accurately reflected the crystal structures’ orientations for these residues. The other six did not obey the known relationship; all were much more protected (higher PF) than expected, and were thus identified as potential interfacial residues. Instructively, three of the residues formed a tight patch on the DNA binding domain and the other three formed a tight patch on the ligand binding domain. Thus, an important interaction point between the two domains was clearly identified; this assumption was used to drive a molecular docking strategy (Figure 4) that generated thousands of potential conformations that maintained this interface point. Scoring schemes were then employed to compare experimental FP and SAXS data to define a subset of the simulations that were self-consistent with all the data (Figure 4).
The novel 3-D arrangement of the ligand and DNA binding domains revealed a previously unknown interface at high resolution (~3–4 Å), and can be used to drive the development of small molecules to interact with and modulate estrogen receptor function.
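The outlier logic used above to nominate interfacial residues can be sketched in a few lines. This is a simplified illustration under stated assumptions: log PF is taken to vary linearly with the solvent accessible surface area computed from the isolated domain structures, higher PF means more protected, and a residue is flagged when its log PF exceeds the fitted line by more than a chosen threshold. The function names, residue labels, threshold, and numbers are all hypothetical and are not the published data [46].

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


def flag_overprotected(residues, sasa, log_pf, threshold):
    """Return residues whose log PF lies more than `threshold` above the fit.

    Such residues are more protected in the complex than the isolated-domain
    accessibility predicts, making them candidate interface residues.
    """
    slope, intercept = fit_line(sasa, log_pf)
    flagged = []
    for name, x, y in zip(residues, sasa, log_pf):
        residual = y - (slope * x + intercept)
        if residual > threshold:
            flagged.append(name)
    return flagged


# Hypothetical residues: six follow a linear trend, one (R7) is over-protected.
residues = ["R1", "R2", "R3", "R4", "R5", "R6", "R7"]
sasa = [0.0, 20.0, 40.0, 60.0, 80.0, 100.0, 50.0]  # Å², from isolated domains
log_pf = [3.0, 2.6, 2.2, 1.8, 1.4, 1.0, 4.0]

candidates = flag_overprotected(residues, sasa, log_pf, threshold=1.0)
print(candidates)
```

Because the outliers themselves pull an ordinary least-squares line toward them, a robust or iterative fit that excludes flagged residues before refitting may be preferable when many sites deviate.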
Fig 5. Example of data integration via iSPOT.
(a) Multidomain architecture of human estrogen receptor (ER) homodimer (ligand-binding domain or LBD in green and DNA-binding domain or DBD in blue). Neither SAXS nor FP alone has been able to unambiguously depict a meaningful picture of the LBD-DBD complex, although the combination of structural information from FP (on a set of 20 residues, six of which are solvent-protected at the domain interface) and SAXS (shape and spatial distribution) is shown to successfully determine the ensemble-structures that are subsequently validated, structurally and functionally. Modified with permission [46]. (b) Ensemble-structures of the intrinsically disordered region of ER’s N-terminal transactivation domain or NTD, where FP probes a set of 16 residues along the protein amino acid sequence, joining forces with SAXS data and molecular dynamics simulations for the structural-ensemble characterization of the NTD as an IDP. Modified with permission [51].
In the above case, high resolution structures of DNA and ligand binding domains were available and were essentially correctly docked by the application of SAXS, FP and computation. However, this combination of techniques mediated by iSPOT can also be quite valuable even when virtually no structural information is available or when the macromolecules are present in an ensemble of distributed structures. For example, we have recently combined SAXS and FP data to describe the ensemble-structures of the estrogen receptor’s N-terminal transactivation domain (NTD), which is an intrinsically disordered protein (IDP) (Fig. 5b). In this latter case, FP protection factors, which are highly correlated with solvent accessible surface area (SA), were calculated for multiple residues in the NTD; these data reflected the average SA for all members of the ensemble [37, 38, 48]. Coupling these constraints with analysis of SAXS data [20, 49–51], we determined that this IDP has a novel structured element, whose release is likely relevant to receptor function. Overall, we anticipate that the proposed multi-technique integrative modeling approach is well suited for studies of IDPs.
Challenges.
One of the key challenges in these integrative studies is coordinating the acquisition of the individual datasets, i.e., the separate HRF and SAXS measurements. For example, one of our recent samples is very prone to aggregation because the protein itself degrades within several hours after size-exclusion chromatography (SEC). To tackle this issue, we performed SEC purification on-site so that identical, fresh protein samples could be used for HRF and SAXS data acquisition at the same time; SAXS was performed at the NSLS-II LiX (16-ID) beamline, which is adjacent to the XFP (17-BM) beamline. This physical proximity is a key advantage of our data acquisition strategy, and it will be implemented with dedicated analytical coordination prior to our follow-up iSPOT-based structure determinations. Overall, the NSLS-II “Biology Village” project can be leveraged to promote novel integrated structural assessments of interesting biological systems, providing a one-stop shop for streamlined multimodal structural analysis.
Conclusions.
Structural biology has entered a new era in which complex questions about dynamics and multiprotein complex assembly can be answered. Going forward, integrated structural biology and continuing technology development will enable the most challenging systems to be tackled. The next frontier is the extension of these studies to the physiological environment, to better understand the structural basis of critical functions in the cell.
Highlights:
1. Novel structures of large complexes and membrane proteins are newly available.
2. Protein footprinting can leverage these structure data with follow-on experiments to understand the kinetics of assembly or ligand-induced conformational changes.
3. Protein footprinting has special advantages for examining intrinsically disordered proteins, amyloidogenic proteins, and other systems recalcitrant to existing methods.
4. Protein footprinting coupled to small-angle X-ray scattering can provide high-resolution structures of proteins and accurate and useful descriptions of protein ensembles.
Acknowledgements.
This research was supported by grants R01-GM-114056 from the National Institute of General Medical Sciences (NIH) and from the Mt. Sinai Healthcare Foundation. Support for the Center for Synchrotron Biosciences is through P30-EB-009998. Funding for development of 17-BM was provided by the National Science Foundation, Division of Biological Infrastructure (grant No. 1228549). 17-BM at the National Synchrotron Light Source II is a U.S. Department of Energy (DOE) Office of Science User Facility operated for the DOE Office of Science by Brookhaven National Laboratory under Contract No. DE-SC0012704. We thank Tiffany Bowman (Brookhaven National Laboratory Graphic Design) and Ricarda Laasch (NSLS-II User Office) for assistance with the Figure 2 graphic.
Footnotes
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Conflict of interest. MRC is a scientific advisory board member and a shareholder of GenNext Technologies, and is also a Founder and Chief Scientific Officer of Neo Proteomics. JK is a consultant for Neo Proteomics.
References
- [1] Kiselar J, Chance MR. High-Resolution Hydroxyl Radical Protein Footprinting: Biophysics Tool for Drug Discovery. Annu Rev Biophys. 2018.
- [2] Pieper U, Chiang R, Seffernick JJ, Brown SD, Glasner ME, Kelly L, et al. Target selection and annotation for the structural genomics of the amidohydrolase and enolase superfamilies. J Struct Funct Genomics. 2009;10:107–25.
- [3] Structural Genomics C, China Structural Genomics C, Northeast Structural Genomics C, Graslund S, Nordlund P, Weigelt J, et al. Protein production and purification. Nat Methods. 2008;5:135–46.
- [4] Cerofolini L, Fragai M, Ravera E, Diebolder CA, Renault L, Calderone V. Integrative Approaches in Structural Biology: A More Complete Picture from the Combination of Individual Techniques. Biomolecules. 2019;9.
- [5] Kim JS, Afsari B, Chirikjian GS. Cross-Validation of Data Compatibility Between Small Angle X-ray Scattering and Cryo-Electron Microscopy. J Comput Biol. 2017;24:13–30.
- [6] Gault J, Robinson CV. Cracking Complexes To Build Models of Protein Assemblies. ACS Cent Sci. 2019;5:1310–1.
- [7] Leney AC, Heck AJ. Native Mass Spectrometry: What is in the Name? J Am Soc Mass Spectrom. 2017;28:5–13.
- [8] Fajardo JE, Shrestha R, Gil N, Belsom A, Crivelli SN, Czaplewski C, et al. Assessment of chemical-crosslink-assisted protein structure modeling in CASP13. Proteins. 2019;87:1283–97.
- [9] Koukos PI, Bonvin A. Integrative Modelling of Biomolecular Complexes. J Mol Biol. 2019.
- [10] Zheng J, Strutzenberg T, Pascal BD, Griffin PR. Protein dynamics and conformational changes explored by hydrogen/deuterium exchange mass spectrometry. Curr Opin Struct Biol. 2019;58:305–13.
- [11] Guan JQ, Almo SC, Reisler E, Chance MR. Structural reorganization of proteins revealed by radiolysis and mass spectrometry: G-actin solution structure is divalent cation dependent. Biochemistry. 2003;42:11992–2000.
- [12] Kiselar JG, Mahaffy R, Pollard TD, Almo SC, Chance MR. Visualizing Arp2/3 complex activation mediated by binding of ATP and WASp using structural mass spectrometry. Proc Natl Acad Sci U S A. 2007;104:1552–7.
- [13] Takamoto K, Chance MR. Radiolytic protein footprinting with mass spectrometry to probe the structure of macromolecular complexes. Annu Rev Biophys Biomol Struct. 2006;35:251–76.
- [14] Maeda S, Koehl A, Matile H, Hu H, Hilger D, Schertler GFX, et al. Development of an antibody fragment that stabilizes GPCR/G-protein complexes. Nat Commun. 2018;9:3712.
- [15] Rosenbaum DM, Cherezov V, Hanson MA, Rasmussen SG, Thian FS, Kobilka TS, et al. GPCR engineering yields high-resolution structural insights into beta2-adrenergic receptor function. Science. 2007;318:1266–73.
- [16] Khoshouei M, Danev R, Plitzko JM, Baumeister W. Revisiting the Structure of Hemoglobin and Myoglobin with Cryo-Electron Microscopy. J Mol Biol. 2017;429:2611–8.
- [17] Khoshouei M, Radjainia M, Baumeister W, Danev R. Cryo-EM structure of haemoglobin at 3.2 Å determined with the Volta phase plate. Nat Commun. 2017;8:16099.
- [18] Koning RI, Koster AJ, Sharp TH. Advances in cryo-electron tomography for biology and medicine. Ann Anat. 2018;217:82–96.
- [19] Lucic V, Rigort A, Baumeister W. Cryo-electron tomography: the challenge of doing structural biology in situ. J Cell Biol. 2013;202:407–19.
- [20] Huang W, Ravikumar KM, Parisien M, Yang S. Theoretical modeling of multiprotein complexes by iSPOT: Integration of small-angle X-ray scattering, hydroxyl radical footprinting, and computational docking. J Struct Biol. 2016;196:340–9.
- [21] Gupta S, Chai J, Cheng J, D’Mello R, Chance MR, Fu D. Visualizing the kinetic power stroke that drives proton-coupled zinc(II) transport. Nature. 2014;512:101–4.
- [22] Orban T, Jastrzebska B, Gupta S, Wang B, Miyagi M, Chance MR, et al. Conformational dynamics of activation for the pentameric complex of dimeric G protein-coupled receptor and heterotrimeric G protein. Structure. 2012;20:826–40.
- [23] Du Y, Duc NM, Rasmussen SGF, Hilger D, Kubiak X, Wang L, et al. Assembly of a GPCR-G Protein Complex. Cell. 2019;177:1232–42 e11.
- [24] de Souza N, Picotti P. Mass spectrometry analysis of the structural proteome. Curr Opin Struct Biol. 2019;60:57–65.
- [25] Johnson DT, Di Stefano LH, Jones LM. Fast photochemical oxidation of proteins (FPOP): A powerful mass spectrometry-based structural proteomics tool. J Biol Chem. 2019;294:11969–79.
- [26] Xu G, Chance MR. Hydroxyl radical-mediated modification of proteins as probes for structural proteomics. Chem Rev. 2007;107:3514–43.
- [27] Gau BC, Sharp JS, Rempel DL, Gross ML. Fast photochemical oxidation of protein footprints faster than protein unfolding. Anal Chem. 2009;81:6563–71.
- [28] Sharp JS, Becker JM, Hettich RL. Analysis of protein solvent accessible surfaces by photochemical oxidation and mass spectrometry. Anal Chem. 2004;76:672–83.
- [29] Hambly DM, Gross ML. Laser flash photolysis of hydrogen peroxide to oxidize protein solvent-accessible residues on the microsecond timescale. J Am Soc Mass Spectrom. 2005;16:2057–63.
- [30] Asuru A, Farquhar ER, Sullivan M, Abel D, Toomey J, Chance MR, et al. The XFP (17-BM) beamline for X-ray footprinting at NSLS-II. J Synchrotron Radiat. 2019;26:1388–99.
- [31] Baud A, Ayme L, Gonnet F, Salard I, Gohon Y, Jolivet P, et al. SOLEIL shining on the solution-state structure of biomacromolecules by synchrotron X-ray footprinting at the Metrology beamline. J Synchrotron Radiat. 2017;24:576–85.
- [32] Bohon J, D’Mello R, Ralston C, Gupta S, Chance MR. Synchrotron X-ray footprinting on tour. J Synchrotron Radiat. 2014;21:24–31.
- [33] Bohon J, Sullivan M, Dvorak J, Abel D, Toomey J, Chance M. Development of the XFP Beamline X-ray Footprinting at NSLS-II. 12th International Conference on Synchrotron Radiation Instrumentation: American Institute of Physics, Inc; 2016.
- [34] Gupta S, Celestre R, Petzold CJ, Chance MR, Ralston C. Development of a microsecond X-ray protein footprinting facility at the Advanced Light Source. J Synchrotron Radiat. 2014;21:690–9.
- [35] Gupta S, Feng J, Chance M, Ralston C. Recent Advances and Applications in Synchrotron X-Ray Protein Footprinting for Protein Structure and Dynamics Elucidation. Protein Pept Lett. 2016;23:309–22.
- [36] Gupta S, Sullivan M, Toomey J, Kiselar J, Chance MR. The Beamline X28C of the Center for Synchrotron Biosciences: a national resource for biomolecular structure and dynamics experiments using synchrotron footprinting. J Synchrotron Radiat. 2007;14:233–43.
- [37] Huang W, Ravikumar KM, Chance MR, Yang S. Quantitative mapping of protein structure by hydroxyl radical footprinting-mediated structural mass spectrometry: a protection factor analysis. Biophys J. 2015;108:107–15.
- [38] Kaur P, Kiselar J, Yang S, Chance MR. Quantitative protein topography analysis and high-resolution structure prediction using hydroxyl radical labeling and tandem-ion mass spectrometry (MS). Mol Cell Proteomics. 2015;14:1159–68.
- [39] Dominguez C, Boelens R, Bonvin AM. HADDOCK: a protein-protein docking approach based on biochemical or biophysical information. J Am Chem Soc. 2003;125:1731–7.
- [40] Russel D, Lasker K, Webb B, Velazquez-Muriel J, Tjioe E, Schneidman-Duhovny D, et al. Putting the pieces together: integrative modeling platform software for structure determination of macromolecular assemblies. PLoS Biol. 2012;10:e1001244.
- [41] Brunger AT, Adams PD, Clore GM, DeLano WL, Gros P, Grosse-Kunstleve RW, et al. Crystallography & NMR system: A new software suite for macromolecular structure determination. Acta Crystallogr D Biol Crystallogr. 1998;54:905–21.
- [42] Hsieh A, Lu L, Chance MR, Yang S. A Practical Guide to iSPOT Modeling: An Integrative Structural Biology Platform. Adv Exp Med Biol. 2017;1009:229–38.
- [43] Alber F, Dokudovskaya S, Veenhoff LM, Zhang W, Kipper J, Devos D, et al. Determining the architectures of macromolecular assemblies. Nature. 2007;450:683–94.
- [44] Aprahamian ML, Chea EE, Jones LM, Lindert S. Rosetta Protein Structure Prediction from Hydroxyl Radical Protein Footprinting Mass Spectrometry Data. Anal Chem. 2018;90:7721–9.
- [45] Kozakov D, Hall DR, Xia B, Porter KA, Padhorny D, Yueh C, et al. The ClusPro web server for protein-protein docking. Nat Protoc. 2017;12:255–78.
- [46] Huang W, Peng Y, Kiselar J, Zhao X, Albaqami A, Mendez D, et al. Multidomain architecture of estrogen receptor reveals interfacial cross-talk between its DNA-binding and ligand-binding domains. Nat Commun. 2018;9:3520.
- [47] Yi P, Wang Z, Feng Q, Pintilie GD, Foulds CE, Lanz RB, et al. Structure of a biologically active estrogen receptor-coactivator complex on DNA. Mol Cell. 2015;57:1047–58.
- [48] Gustavsson M, Wang L, van Gils N, Stephens BS, Zhang P, Schall TJ, et al. Structural basis of ligand interaction with atypical chemokine receptor 3. Nat Commun. 2017;8:14135.
- [49] Ravikumar KM, Huang W, Yang S. Fast-SAXS-pro: a unified approach to computing SAXS profiles of DNA, RNA, protein, and their complexes. J Chem Phys. 2013;138:024112.
- [50] Yang SC, Park S, Makowski L, Roux B. A Rapid Coarse Residue-Based Computational Method for X-Ray Solution Scattering Characterization of Protein Folds and Multiple Conformational States of Large Protein Complexes. Biophys J. 2009;96:4449–63.
- [51] Peng Y, Cao S, Kiselar J, Xiao X, Du Z, Hsieh A, et al. A Metastable Contact and Structural Disorder in the Estrogen Receptor Transactivation Domain. Structure. 2019;27:229–40 e4.