SUMMARY
Multidomain proteins in which individual domains are connected by linkers often possess inherent inter-domain flexibility that significantly complicates their structural characterization in solution using either NMR spectroscopy or small-angle X-ray scattering (SAXS) alone. Here we report a novel protocol for joint refinement of flexible multidomain protein structures against NMR distance and angular restraints, residual dipolar couplings and SAXS data. The protocol is based on the ensemble optimization method (EOM) principle (Bernado et al., 2007) and is compared with different refinement strategies for the structural characterization of the flexible two-domain protein sf3636 from Shigella flexneri 2a. The results of our refinement suggest the existence of a dominant population of configurational states in solution possessing an overall elongated shape and restricted relative twisting of the two domains.
INTRODUCTION
It is estimated that up to 65% of prokaryotic and 70% of eukaryotic proteins belong to multi-domain protein families (Apic et al., 2001). The abundance of multidomain architectures in nature creates a challenge for structural and functional studies and has inspired the development of new strategies to study the behavior of multi-domain proteins as a whole, rather than domains individually. Solution NMR has demonstrated the ability to characterize multi-domain structural states and their dynamics in detail, especially when combined with other structural methods. For example, Liu et al. (2010) used residual dipolar couplings (RDCs) and paramagnetic relaxation enhancements (PREs) to assess the mobility and orientation of the N-terminal bicelle-associated domain and the C-terminal catalytic domain of a full-length perdeuterated and myristoylated construct of yeast Arf1, which was bound to GTP in a membrane mimetic. Yang et al. introduced EPR data into the program CYANA for homodimer structure calculations (Yang et al., 2010). Here we implement a combinatorial approach that makes use of RDC, nuclear Overhauser effect (NOE), and small-angle X-ray scattering (SAXS) data.
SAXS, a well-established method that provides data on the size and overall shape of the molecule in solution, can be used to extract the low-resolution spatial organization of the domains from large multi-domain proteins and their complexes (Jacques and Trewhella, 2010; Mertens and Svergun, 2010). Several computational approaches have been developed for the refinement of NMR structures against SAXS data and integrated into the structural calculation programs CNSSOLVE (Grishaev et al., 2005; Gabel et al., 2008) and Xplor-NIH (Schwieters et al., 2007, 2010; Wang et al., 2009). These programs implement a single-conformation approach that is appropriate for the refinement of relatively rigid multi-domain proteins and complexes for cases where a single dominant conformation of the system exists in solution. This approach is not suitable for a flexible protein system that explores a range of conformations in solution. Several methods have been reported to characterize flexible systems in solution using SAXS data, including the ensemble optimization method (EOM) (Bernado et al., 2007), minimal ensemble search (Pelikan et al., 2009), basis-set supported SAXS (Yang et al., 2010a), integrative modeling platform (Forster et al., 2008), maximum-entropy refinement (Rozycki et al., 2011), and the maximum occurrence method MaxOcc (Bertini et al., 2012). In EOM, a large pool of random configurations is generated first to explore and sample the accessible conformational space, and a genetic algorithm is then used to select a subset of conformers that fit the experimental SAXS data. EOM and other methods based on the same basic principle differ from one another in the way the pool is generated and in how the optimal ensemble is selected from the pool.
For example, the recently reported MaxOcc method also relies on a random pool that is generated as in EOM; however, the construction of the best-fitting ensemble is performed using a procedure that combines target function minimization and a heuristic random search. In addition to the SAXS data, the MaxOcc method relies on the use of pseudocontact shifts and self-orientation RDCs originating from a paramagnetic center bound in a metal binding site or introduced by covalent tagging.
Here, we have developed an ensemble refinement protocol based on the EOM principle to refine NMR structures of flexible protein systems, which in addition to SAXS data, utilizes NOEs and RDCs measured in three alignment media. In this protocol, distance restraints derived from NOE data are incorporated in both the pool generation and the optimal ensemble search. We have applied our protocol to characterize the solution structure of sf3636 from Shigella flexneri 2a, a two-domain protein with inter-domain flexibility. To our knowledge, this work is the first report of the joint ensemble refinement of a flexible multi-domain protein against NOE, RDC, and SAXS data.
Shigella flexneri 2a is a pathogenic Gram-negative bacterium that causes an acute bloody diarrhea known as shigellosis or bacillary dysentery in humans. The 120-residue protein sf3636 is a member of the protein superfamily DUF2810, whose members share a common modular architecture and have unknown function. Figure S1A presents the sequence alignment of Shigella flexneri 2a sf3636 with homologues from a range of pathogenic Enterobacteriaceae. Initial characterization of the NMR structure of sf3636 based on NOEs, chemical shift and scalar coupling data could not accurately define the interdomain configuration. Additional RDC and SAXS data proved to be critical for validating the initial structure, for assessing the level of interdomain flexibility, and for the final structure refinement. We have compared different structure refinement approaches and demonstrate that the addition of NOE-derived distance restraints is particularly valuable for the determination of the dominant population of conformers in solution.
RESULTS
Results of Single Conformation refinement with NMR, RDC and SAXS data
Structure description
The input for the structure refinement calculations by CNSSOLVE consisted of a total of 3371 NMR-derived structural restraints (3059 NOEs, 112 hydrogen bonds and 200 dihedral angles), 355 RDC restraints measured in the three alignment media and SAXS-derived restraints. A representative structure of sf3636 in solution is shown in Figure 1A. The protein comprises an N-terminal domain (NTD, residues 1–56) and a C-terminal domain (CTD, residues 63–120). In the N-terminal domain, two long helices, helix α1 (residues 8–25) and helix α2 (residues 34–56), are connected by a short loop and form an extended antiparallel coiled-coil structure. The two α-helices pack at an angle of ~154°. A structural comparison of the N-terminal domain with the DALI database (Holm and Sander 1993) identified many proteins containing the helix-loop-helix motif with Z-scores higher than 4.5. The C-terminal domain adopts an α/β fold consisting of a central three-stranded antiparallel β-sheet surrounded by four α-helices on the external surface. The strands (residues 71–76, 94–99 and 112–116) are arranged with a β1β3β2 topology. For this domain, the DALI search identified structures with low Z-scores of 2.0. A superposition of these structures showed significantly different overall folding, indicating that the structure of the C-terminal domain represents a novel fold.
Figure 1.
Structure of sf3636 and global variables describing the relative configuration of the two domains. The centers of mass of the N-terminal (NTD) and C-terminal (CTD) domains, rNTD and rCTD, respectively, are shown as blue spheres. a) Ribbon diagram of a representative structure of the SCP ensemble. b) Relative domain position is described by three spherical coordinates (RNC, θ, φ) that specify the position of the CTD center-of-mass rCTD in the spherical coordinate system defined by the principal axes ix, iy, and iz of the NTD inertia tensor. RNC is the distance between the NTD and CTD centers-of-mass. c) Relative domain orientation is described by three angles ζ1, ζ2, and Ω. ζ1 is a bond angle formed by vectors RNC and uN, where RNC = rCTD − rNTD, and uN is a vector connecting the positions of the Cα atoms of residues Leu16 and Lys42. ζ2 is a bond angle formed by vectors RNC and uC, where uC is a vector connecting the positions of the Cα atoms of residues Thr113 and Phe115. Ω is a torsion angle formed by vectors uN, RNC, and uC. See also Figure S1.
A wide and open groove runs at the interface between the two domains. The electrostatic potential surface of sf3636 (Figure S1B) shows that an array of acidic residues is aligned along the edge of the two-helix bundle, creating an attractive potential for target binding. The C-terminal novel domain exhibits an asymmetric distribution of basic residues over the surface. Almost all residues participating in the positively charged strips on the surface of the C-terminal domain are highly conserved and probably essential to common sf3636 functions.
Structural diversity in the SCP ensemble
Each structure in the final ensemble of 20 structures refined using the single-conformation protocol (SCP) approach has an elongated molecular shape with two domains bridged by a six-residue linker (S57QKLSK62). The RMSD to the mean structure calculated for residues 6–56 in the NTD is 0.52 ± 0.11 Å for the backbone atoms and 0.84 ± 0.13 Å for the heavy atoms. The CTD has an RMSD, calculated for residues 62–117, of 0.60 ± 0.12 Å for the backbone atoms and 0.96 ± 0.12 Å for the heavy atoms. The precision of the structure of the N- and C-terminal domains alone is much higher than that of the full-length protein. The pair-wise backbone RMSD, calculated for residues 5–56 and 62–117, between two structures in the SCP ensemble can be as large as 4.5 Å, which indicates that the relative interdomain configuration varies significantly within the ensemble. At the same time, all structures in the SCP ensemble have very similar overall size and shape: the radius of gyration Rg and the shape anisotropy of the structures are 21.5 ± 0.01 Å and 2.99 ± 0.01, respectively. The structures in the ensemble fit the experimental data equally well and have equally good quality scores. The structure parameters of the SCP ensemble are summarized in Table 1, and the scores reflecting the goodness-of-fit of the ensemble to the NMR and SAXS data are presented in Table 2.
Table 1.
Summary of NMR and structural statistics(a) for Shigella flexneri 2a sf3636
Distance restraints | ||
All | 3171 | |
Intraresidue | 600 | |
Sequential (|i-j| = 1) | 772 | |
Medium range (2 ≤ |i-j| ≤4) | 922 | |
Long range (|i-j| > 4) | 765 | |
Hydrogen bonds | 56 | |
Inter-domain | 0 | |
NTD to linker | 46 | |
CTD to linker | 26 | |
Dihedral angle restraints | ||
All | 200 | |
φ | 100 | |
ψ | 100 | |
Residual dipolar coupling restraints | ||
Neutral stretched gel | ||
1DNH | 68 | |
1DNC′ | 79 | |
Positively charged PEGCTAB | ||
1DNH | 68 | |
1DNC′ | 70 | |
Positively charged stretched gel | ||
1DNH | 70 | |
Number of violations | ||
Distance restraints ( > 0.5Å ) | 0 | |
Dihedral angle restraints (> 5°) | 0 | |
RMSD from experimental restraints | ||
distances ( Å ) | 0.018 ± 0.0011 | |
dihedral angles (°) | 0.13 ± 0.07 | |
RMSD from idealized covalent geometry | ||
bond lengths ( Å ) | 0.0142 ± 0.0002 | |
bond angles (°) | 1.089 ± 0.015 | |
Pairwise RMSD | ||
Ordered region(b,c) (6–117) | ||
backbone atoms | 1.20 ± 0.71 | |
all heavy atoms | 1.47 ± 0.73 | |
NTD (6–56) | ||
backbone atoms | 0.52 ± 0.11 | |
all heavy atoms | 0.84 ± 0.13 | |
CTD (63–117) | ||
backbone atoms | 0.60 ± 0.12 | |
all heavy atoms | 0.96 ± 0.12 | |
Ramachandran statistics (%)(b,c) | ||
Residues in most favored regions | 91.1 | |
Residues in additionally allowed regions | 8.7 | |
Residues in generously allowed regions | 0.2 | |
Residues in disallowed regions | 0.0 | |
Global quality scores(c) | ||
| | Raw | Z-score |
Verify3d | 0.32 | −2.25 |
ProsaII | 0.73 | 0.33 |
Procheck (phi-psi)(b) | −0.23 | −0.59 |
Procheck (all)(b) | −0.26 | −1.54 |
MolProbity clash-score | 16.52 | −1.31 |
(a) Structural statistics were computed for the ensemble of 20 deposited structures (PDB ID: 2LF0).
(b) Calculated for the ordered region: residues 6–117.
(c) Calculated using the NESG PSVS program suite (Bhattacharya et al., 2007).
Table 2.
Goodness-of-fit to the NMR, RDC, and SAXS data for different structural ensembles of sf3636.
| | NMR(c) | NMR+RDC(c) | SCP(d) | OEP1(e) | OEP2(f) | OEP12(g) |
---|---|---|---|---|---|---|
| ||||||
Data used to generate the ensemble | NOE | NOE + RDC1 + RDC2 + RDC3 | NOE + RDC1 + RDC2 + RDC3 + SAXS | NOE + RDC1 + RDC2 + RDC3 + SAXS | RDC1 + RDC2 + RDC3 + SAXS | NOE + RDC1 + RDC2 + RDC3 + SAXS |
| ||||||
SAXS: | ||||||
χens(a) | 7.21 | 4.21 | 1.49 | 1.48 | 1.50 | 1.47 |
<χ>(b) | 7.71 ± 0.44 | 3.75 ± 0.18 | 1.46 ± 0.01 | 1.93 ± 0.49 | 3.47 ± 1.69 | 2.97 ± 1.83 |
| ||||||
RDC1 (neutral gel): | ||||||
Qens(a) | 0.96 | 0.23 | 0.29 | 0.27 | 0.19 | 0.23 |
<Q>(b) | 0.85 ± 0.02 | 0.20 ± 0.01 | 0.24 ± 0.02 | 0.36 ± 0.10 | 0.59 ± 0.16 | 0.44 ± 0.18 |
| ||||||
RDC2 (PEGCTAB): | ||||||
Qens(a) | 0.97 | 0.37 | 0.35 | 0.26 | 0.26 | 0.22 |
<Q>(b) | 0.93 ± 0.01 | 0.21 ± 0.02 | 0.28 ± 0.03 | 0.35 ± 0.07 | 0.57 ± 0.15 | 0.43 ± 0.19 |
| ||||||
RDC3 (charged gel): | ||||||
Qens(a) | 0.92 | 0.97 | 0.76 | 0.27 | 0.23 | 0.19 |
<Q>(b) | 0.49 ± 0.03 | 0.29 ± 0.01 | 0.20 ± 0.01 | 0.36 ± 0.12 | 0.37 ± 0.13 | 0.37 ± 0.13 |
| ||||||
DP-score(h) | 0.86 | 0.84 | 0.82 | 0.82 | 0.81 | 0.81 |
| ||||||
NOE violations(i): | ||||||
Nvio(j) | 0 | 0 | 0 | 0 | 53 | 0 |
(a) Goodness-of-fit for the ensemble fitting;
(b) Ensemble average and standard deviation values of goodness-of-fit of the individual members of the ensemble;
(c) Ensemble of 20 NMR structures;
(d) Ensemble of 20 structures refined using the single-conformation refinement protocol;
(e) Optimal ensemble of 100 structures selected from a pool of 16,000 NOE-restrained structures (pool P1);
(f) Optimal ensemble of 100 structures selected from a pool of 18,000 random structures (pool P2);
(g) Optimal ensemble of 100 structures selected from the combination of pools P1 and P2;
(h) The DP score (Huang et al. 2005) reflects the goodness-of-fit of an ensemble to the NMR data;
(i) Violation by more than 0.5 Å of the NOE-derived distance restraints used to calculate the NMR structure;
(j) Nvio is the number of distance restraints violated by the ensemble as a whole. An r−6 summed average is used to assess the restraint violations.
Impact of including SAXS data in the structure refinement on the relative domain configuration
We use the following six variables to describe the relative domain configuration quantitatively (see Figure 1B,C). The position of the C-terminal domain relative to the N-terminal domain in each structure can be described in terms of the vector RNC that connects the centers of mass of the two domains: RNC = rCTD − rNTD, where rNTD and rCTD are the positions of the centers of mass of the NTD and CTD, respectively. The vector RNC can be specified by three coordinates (RNC, θ, φ) in the spherical coordinate system defined by the principal axes of the NTD tensor of inertia (see Figure 1B,C), where RNC is the distance between the CTD and NTD centers-of-mass.
The relative orientation of the two domains is described in terms of three angles, Ω, ζ1, and ζ2, which are defined as shown in Figure 1C. The angle Ω measures the relative twisting of the two domains about the line connecting their centers of mass.
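The six variables can be computed directly from atomic coordinates. The following is a minimal sketch rather than the code used in this work. It assumes that the NTD principal axes are supplied as three orthonormal vectors with iz taken as the polar (unique) axis, and that Ω is measured as the signed angle between the projections of uN and uC onto the plane perpendicular to RNC; the sign convention and all function names are illustrative assumptions.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(dot(a, a))

def angle_deg(a, b):
    """Unsigned angle between two vectors, in degrees."""
    c = dot(a, b) / (norm(a) * norm(b))
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))

def domain_position(r_ntd, r_ctd, axes):
    """Return (R_NC, theta, phi) of the CTD center of mass in the NTD frame.

    axes = (ix, iy, iz): orthonormal NTD principal axes (assumed given);
    iz is assumed to be the polar axis of the spherical system.
    """
    r_nc = tuple(c - n for c, n in zip(r_ctd, r_ntd))
    x, y, z = (dot(r_nc, ax) for ax in axes)
    dist = norm(r_nc)
    theta = math.degrees(math.acos(max(-1.0, min(1.0, z / dist))))
    phi = math.degrees(math.atan2(y, x)) % 360.0
    return dist, theta, phi

def domain_orientation(r_nc, u_n, u_c):
    """Return (zeta1, zeta2, omega) in degrees.

    omega is the signed twist about r_nc (assumed convention): the angle
    from the projection of u_n to the projection of u_c in the plane
    perpendicular to r_nc.
    """
    zeta1 = angle_deg(r_nc, u_n)
    zeta2 = angle_deg(r_nc, u_c)
    length = norm(r_nc)
    axis = tuple(x / length for x in r_nc)
    p_n = tuple(u - dot(u_n, axis) * a for u, a in zip(u_n, axis))
    p_c = tuple(u - dot(u_c, axis) * a for u, a in zip(u_c, axis))
    omega = math.degrees(math.atan2(dot(cross(p_n, p_c), axis), dot(p_n, p_c)))
    return zeta1, zeta2, omega
```

With this convention φ falls in the range 0° to 360° and Ω in the range −180° to 180°, which matches the ranges of the values reported in Table 3.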
In order to assess the impact of including SAXS data in the structure determination of sf3636, we have compared the relative configuration of the two domains in the structures refined using the SCP protocol versus those calculated from NMR-derived data, with and without inclusion of RDC measurements. Figure 2 shows that the predicted scattering intensity curves generated from the two ensembles of NMR-derived structures differ from the experimental SAXS profile of sf3636, while an analogous curve generated from SCP-refined structures fits the SAXS data well. The structures from all three of the ensembles fit the NMR data equally well (see Table 2). The ensemble average and standard deviation values of the global variables describing the relative configuration of the two domains are presented in Table 3. With respect to the interdomain configuration, one can see that the inclusion of SAXS data in the refinement results in more compact structures: the radius of gyration decreases from 22.4 Å to 21.5 Å and the degree of shape anisotropy also decreases by ~1.0. The global variables most affected by the inclusion of SAXS-derived restraints are the inter-domain distance RNC and the spherical angle θ. The RNC in the SCP structures is on average ~4 Å shorter than it is in the NMR+RDC structures, and the angle θ differs in the two ensembles on average by ~20° (see Table 3 and Figure 3). Two additional points should be noted: (i) all structures from the SCP ensemble have practically the same inter-domain distance, reflected in the sharp distribution of RNC, in contrast to the broader distribution of RNC in the NMR+RDC ensemble; and (ii) the angle Ω undergoes a significant variation, of up to 50°, within the SCP ensemble.
Figure 2.
Comparison of experimental (black circles) with predicted SAXS data. Prediction was made by ensemble fitting of the SCP (red lines), NMR (cyan lines), and NMR+RDC (green lines) structural ensembles. See Table 2 and text for the ensemble definitions. A) Scattering intensity plot. B) Kratky plot. C) Pair distance distribution function P(r).
Table 3.
Global variables(a) describing the relative configuration of two domains and overall structure shape calculated for different structural ensembles.
| | NMR(b) | NMR+RDC(b) | SCP(b) | OEP1(b) | OEP2(b) | OEP12(b) |
---|---|---|---|---|---|---|
RNC(Å) | 23.6 ± 0.8 | 41.2 ± 1.1 | 37.7 ± 0.1 | 37.7 ± 2.2 | 36.2 ± 5.5 | 37.1 ± 5.5 |
θ (deg) | 61.4 ± 2.0 | 13.4 ± 2.1 | 32.0 ± 1.2 | 26.5 ± 7.9 | 42.2 ± 18.2 | 33.9 ± 15.3 |
φ (deg) | 161.2 ± 2.8 | 189.9 ± 8.8 | 196.1 ± 3.1 | 194.6 ± 12.8 | 152.9 ± 80.3 | 172.7 ± 51.1 |
Ω (deg) | −115.4 ± 5.8 | −101.4 ± 17.3 | −86.5 ± 16.9 | −70.6 ± 13.9 | 28.7 ± 93.0 | −34.0 ± 73.8 |
ζ1(deg) | 86.8 ± 3.2 | 72.0 ± 2.7 | 63.6 ± 3.0 | 64.4 ± 5.9 | 75.0 ± 29.3 | 73.2 ± 24.0 |
ζ2(deg) | 45.7 ± 2.2 | 89.2 ± 2.9 | 80.4 ± 2.9 | 80.4 ± 10.5 | 94.2 ± 27.9 | 90.6 ± 22.3 |
Rg(Å) | 17.1 ± 0.2 | 22.4 ± 0.5 | 21.5 ± 0.01 | 21.4 ± 0.8 | 21.2 ± 1.9 | 21.3 ± 1.8 |
𝒜(c) | 1.81 ± 0.06 | 3.95 ± 0.10 | 2.99 ± 0.01 | 3.08 ± 0.23 | 2.75 ± 0.48 | 2.92 ± 0.54 |
Dmax(Å) | 60.5 ± 1.4 | 78.2 ± 1.1 | 71.6 ± 0.8 | 70.3 ± 2.8 | 69.5 ± 5.0 | 69.4 ± 5.7 |
(a) Ensemble average and standard deviation values are presented for the inter-domain distance RNC, the radius of gyration Rg, the angles θ, φ, Ω, ζ1, and ζ2 (see Figure 1 and text), the anisotropy of the overall molecular shape 𝒜, and the maximum inter-atomic distance Dmax;
(b) Ensembles are defined in Table 2;
(c) 𝒜 is calculated as 2A1/(A2+A3), where A1, A2, and A3 are principal values of the inertia tensor ordered so that A1≥A2≥A3;
Figure 3.
Comparison of relative domain configuration in the structures of three different ensembles obtained using SCP approach. The backbone traces of the superimposed structures comprising the SCP (data in red), NMR (data in cyan), and NMR+RDC (data in green) ensembles are shown in E, F, and G, respectively. See Table 2 and text for the ensemble definitions. The ensembles are overlaid with the ab initio SAXS-predicted envelope and shown from two points of view. The right and left views are related by a 90° rotation. A) Distribution of the distance between the NTD and CTD centers-of-mass. B) Distribution of the inter-domain twist angle Ω (see Figure 1C). C) The spherical angles θ and φ specifying the relative domain position (see Figure 1B). D) The backbone trace of the 60 structures comprising the NMR (shown in cyan), NMR+RDC (shown in green), and SCP (shown in red) ensembles. The N-terminal domains of all shown structures are superimposed. The two vectors that form the spherical angle θ, the vector RNC and the unique principal axis of the NTD, are shown schematically as solid and dashed gray lines, respectively.
Configurational flexibility of sf3636
The fact that refinement with SAXS data using the SCP method did not result in convergence of the sf3636 structures to a common interdomain configuration indicates some level of configurational disorder due to inter-domain motion. The relative mobility of the two domains is mediated by the flexibility of the linker. Linker residues Lys59, Leu60 and Ser61 are strictly conserved within the DUF2810 family, suggesting functional or structural roles for this region. The amide resonances of Gln58 and Lys59 are weak in the 15N-1H HSQC spectrum, and those of Leu60 and Ser61 are not visible. Therefore, the linker sequence may mediate conformational transitions among multiple ordered states on a μs-ms time scale.
15N relaxation data reflect anisotropy and flexibility
15N relaxation measurements at 500 MHz were performed to probe the dynamic features of sf3636. Due to resonance overlap or weak cross-peak intensities, twenty residues, including linker residues Gln58 – Ser61, were excluded from the relaxation analysis. Complete quantitative relaxation measurements were made for 97 residues of sf3636 (49 residues of the NTD and 48 residues of the CTD). The experimental relaxation parameters obtained for sf3636 are shown in Figure 4. The NOE data show a significant decrease in the N-terminal (residues Met1 – Asn7) and α1/α2 loop (residues Asp25 – Ala31) regions, indicating that these regions of the domain are flexible. Excluding the disordered residues enriched in picosecond-to-nanosecond motions, the average NOE is 0.73 ± 0.06, and the NOE values averaged across the N-terminal and C-terminal domains are approximately the same as the average across the entire protein.
Figure 4.
Experimental and calculated relaxation data for sf3636 at 11.74 Tesla. The experimental values (black bars) of amide T1, T2, and NOE versus residue number are shown in A, B, and C, respectively. The orange lines show the values obtained by averaging the experimental data over all residues from the secondary structure elements. The theoretical values of T1 and T2 were calculated for each SCP structure using the program HYDRONMR with the parameter a set to 2.2 Å. The ensemble-averaged values of T1 and T2 are shown as blue filled circles. The horizontal bars on the top of each graph indicate the position of the secondary structure elements in the protein sequence. See also Figure S2 and Tables S1 and S2.
The values of the relaxation parameters averaged over the residues belonging to well-defined secondary structure elements are shown in Table S1. In contrast to the NOEs, both the longitudinal and transverse relaxation times have on average noticeably different values in the two domains. Correlated deviation of the T1 and T2 values from the protein-wide average in opposite directions is indicative of rotational diffusion anisotropy. One of the characteristic features of anisotropic tumbling is that the T1/T2 ratio depends on the angle α between the N-H bond vector and the unique axis of the rotational diffusion tensor. Namely, for a prolate ellipsoid, T1 and the T1/T2 ratio increase when the angle α is less than 54.7°, while T1 and the T1/T2 ratio decrease when α is greater than 54.7°. There is an approximately linear relationship between the relaxation parameters and the angle α in the range 15° < α < 70°. The inertia tensor of sf3636 calculated from a representative SCP structure has principal values in the ratio 1.0:0.97:0.19, indicating that the protein can be modeled as an axially symmetric prolate ellipsoid with large anisotropy. We have estimated the rotational diffusion tensor from the experimental T1/T2 ratio and a representative structure of the SCP ensemble, assuming axially symmetric rotational diffusion. The angle α calculated for residues in the well-defined secondary structural elements shows a significant correlation with the observed T1/T2 ratio, with a linear correlation coefficient of 0.88. This correlation can be seen qualitatively in Figure 5B. It should also be noted that the average value of the angle α differs significantly between the two domains: α = 24.2 ± 10° and 60.1 ± 7.6° for the NTD and CTD, respectively.
This difference in angle α values should result in the effective correlation time experienced by NH bonds in the N-terminal domain being larger than that in the C-terminal domain, which is in qualitative agreement with the observed effective correlation times (see Table S1 and Figure 5).
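The dependence of T1/T2 on the angle α can be illustrated with the standard spectral density for axially symmetric rotational diffusion (Woessner model) combined with the usual dipolar and chemical shift anisotropy expressions for 15N relaxation. The sketch below is illustrative only and is not the MODELFREE or HYDRONMR calculation used in this work: it treats the molecule as rigid, and the N-H bond length (1.02 Å), the 15N CSA (−160 ppm), and the diffusion constants used in the example are assumed values.

```python
import math

GAMMA_H = 2.6752e8     # 1H gyromagnetic ratio, rad s^-1 T^-1
GAMMA_N = -2.7126e7    # 15N gyromagnetic ratio, rad s^-1 T^-1
HBAR = 1.0546e-34      # J s
MU0_4PI = 1.0e-7       # T m A^-1
R_NH = 1.02e-10        # N-H bond length in m (assumed)
CSA = -160e-6          # 15N chemical shift anisotropy (assumed)

def woessner_terms(alpha_deg, d_par, d_perp):
    """Amplitudes and correlation times for axially symmetric diffusion.

    alpha_deg: angle between the N-H vector and the unique diffusion axis.
    d_par, d_perp: diffusion constants (s^-1) parallel and perpendicular
    to the unique axis.
    """
    c2 = math.cos(math.radians(alpha_deg)) ** 2
    s2 = 1.0 - c2
    amps = ((3.0 * c2 - 1.0) ** 2 / 4.0, 3.0 * s2 * c2, 0.75 * s2 * s2)
    taus = (1.0 / (6.0 * d_perp),
            1.0 / (5.0 * d_perp + d_par),
            1.0 / (2.0 * d_perp + 4.0 * d_par))
    return list(zip(amps, taus))

def spectral_density(omega, terms):
    """J(omega) for rigid anisotropic tumbling (no internal motion)."""
    return 0.4 * sum(a * t / (1.0 + (omega * t) ** 2) for a, t in terms)

def t1_over_t2(alpha_deg, d_par, d_perp, b0=11.74):
    """T1/T2 (= R2/R1) of a backbone 15N at field b0 (Tesla)."""
    terms = woessner_terms(alpha_deg, d_par, d_perp)
    w_h = abs(GAMMA_H) * b0
    w_n = abs(GAMMA_N) * b0
    d2 = (MU0_4PI * HBAR * GAMMA_H * GAMMA_N / R_NH ** 3) ** 2
    c2 = (w_n * CSA) ** 2 / 3.0

    def j(w):
        return spectral_density(w, terms)

    r1 = d2 / 4.0 * (j(w_h - w_n) + 3.0 * j(w_n) + 6.0 * j(w_h + w_n)) + c2 * j(w_n)
    r2 = (d2 / 8.0 * (4.0 * j(0.0) + j(w_h - w_n) + 3.0 * j(w_n)
                      + 6.0 * j(w_h) + 6.0 * j(w_h + w_n))
          + c2 / 6.0 * (4.0 * j(0.0) + 3.0 * j(w_n)))
    return r2 / r1
```

For a prolate tensor (d_par > d_perp), for example the hypothetical values d_par = 3.2e7 s^-1 and d_perp = 1.6e7 s^-1, the function returns a larger T1/T2 for a bond vector nearly parallel to the unique axis than for one nearly perpendicular to it, which is the trend described above for the NTD and CTD.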
Figure 5.
Effect of anisotropy on T1/T2 ratio for sf3636. A) The angle α (black bars) between N-H bond vector and the unique principal axis of the diffusion tensor as a function of residue number is calculated for a representative structure of the SCP ensemble. The domain-averaged values are shown as orange lines. B) The experimental (filled circles) and predicted (blue line) T1/T2 ratio as a function of the angle α. The angle α and predicted T1/T2 values were calculated using the rotational diffusion tensor that was obtained by fitting the structure to the experimental data and assuming rigid body tumbling of the protein molecule. The program MODELFREE was used to calculate the rotational diffusion tensor.
T1 and T2 values can be accurately predicted from the atomic coordinates of a protein molecule using, for example, the rigid-body hydrodynamic “shell” model implemented in the HYDRONMR program (Garcia de la Torre et al., 2000). This model is characterized by a single adjustable parameter a. We predicted T1 and T2 values from each structure of the SCP ensemble using HYDRONMR with the adjustable parameter a set to 2.2 Å, the value found to be optimal for this ensemble. The comparison of the experimental and the ensemble-averaged predicted T1 and T2 values is shown in Figure 4A,B. One can see that the experimental values do not agree with the predicted values. The experimental T1 values are lower, and the T2 values are considerably higher, than the values predicted from the rigid protein model. The two domains behave as if they belong to a lower molecular weight protein (see Figure S2 and Table S2), and they possess some degree of motion that is independent of the motion of sf3636 as a whole. Taking into account that the NOE data indicate that both the N-terminal and C-terminal domains are expected to exhibit rigid-body hydrodynamics, we conclude that the relaxation data provide evidence of inter-domain mobility that occurs on a time scale faster than the rotational correlation time of the entire protein. More detailed analysis of the relaxation data is hampered by the coupling between the relative domain motion and global tumbling.
The RDC data also do not support a single relative configuration of the two domains. When the two domains are fitted to the RDC data separately, the alignment tensors for the NTD and CTD have different parameters. For example, for sf3636 in a positively charged stretched gel, the values of the HN RDC amplitude/rhombicity for the NTD and CTD are 10.5 Hz/0.12 and 17.1 Hz/0.24, respectively. This indicates that the two domains have different mobility. SAXS data provide further evidence supporting inter-domain motion (see Figure 2). Both the Kratky plot and the pair-distance distribution function profile suggest that the two domains do not adopt a unique arrangement in solution but rather form a flexible system.
Results of Ensemble refinement with NOE, RDC and SAXS data
Comparison of the results of single-conformation and ensemble refinements
For a flexible system, the solution state should be modeled as an ensemble of conformers rather than as a single conformer. For sf3636, we used an ensemble refinement protocol to build an ensemble of conformers that, as a whole, fits both the NMR and SAXS data. The optimal ensemble, consisting of 100 structures (referred to here as the OEP1 ensemble), was selected from the NOE-restrained pool of structures, pool P1, as described in the Experimental Procedures.
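The idea of fitting the data with the ensemble as a whole can be sketched for the SAXS term alone. The example below assumes that a scattering curve has already been predicted for every pool member on the experimental q-grid, averages the curves of the selected members, and scores the average against the data with a fitted scale factor. The search shown is a simple greedy random-replacement scheme used here as a stand-in; it is not the genetic algorithm of EOM or the search used in this work, whose target function also contains RDC and NOE terms. All function names are hypothetical.

```python
import math
import random

def ensemble_curve(curves, members):
    """Average the predicted intensities of the selected pool members."""
    n = len(members)
    return [sum(curves[m][k] for m in members) / n
            for k in range(len(curves[0]))]

def chi(i_exp, sigma, i_calc):
    """Discrepancy between data and a predicted curve after optimal scaling."""
    num = sum(e * c / s ** 2 for e, c, s in zip(i_exp, i_calc, sigma))
    den = sum((c / s) ** 2 for c, s in zip(i_calc, sigma))
    scale = num / den
    chi2 = sum(((e - scale * c) / s) ** 2
               for e, c, s in zip(i_exp, i_calc, sigma)) / (len(i_exp) - 1)
    return math.sqrt(chi2)

def select_ensemble(curves, i_exp, sigma, size, steps=2000, seed=0):
    """Greedy random-replacement search for a sub-ensemble of given size.

    Pool members may be selected more than once (an assumption of this sketch).
    """
    rng = random.Random(seed)
    members = [rng.randrange(len(curves)) for _ in range(size)]
    best = chi(i_exp, sigma, ensemble_curve(curves, members))
    for _ in range(steps):
        trial = members[:]
        trial[rng.randrange(size)] = rng.randrange(len(curves))
        score = chi(i_exp, sigma, ensemble_curve(curves, trial))
        if score <= best:  # accept only moves that do not worsen the fit
            members, best = trial, score
    return members, best
```

The distinction between the ensemble fit and the fit of individual members is the same one drawn in Table 2 between χens and <χ>: an ensemble can reproduce the data well even when no single member does.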
Here the OEP1 ensemble is compared with the SCP ensemble in terms of the global variables describing the relative configuration of the two domains (see Table 3 and Figures 3 and 6). With the exception of the twist angle Ω, the distribution of global variables among the OEP1 structures is broader than among those generated by the SCP method, but the average values are similar. For example, the inter-domain distance RNC in the SCP ensemble has a very narrow distribution with a sharp peak at RNC = 37.7 Å (Figure 3A). In contrast, the RNC distribution of the OEP1 ensemble is rather broad and covers the RNC range from about 34 to 44 Å (Figure 6E).
Figure 6.
Comparison of relative domain configuration in the structures of three different ensembles obtained using the EOM approach. The backbone traces of the superimposed structures comprising the OEP1, OEP2, and OEP12 ensembles are shown in A, B, and C, respectively. See Table 2 and text for the ensemble definitions. The ensembles are overlaid with the ab initio SAXS-predicted envelope (gray mesh) and shown from two points of view. The right and left views are related by a 90° rotation. D) The positions of the CTD centers-of-mass in all structures from the OEP1 (blue spheres), OEP2 (magenta spheres), and OEP12 (brown spheres) ensembles with superimposed N-terminal domains (residues 8–55). For clarity, the structure of the superimposed NTDs is shown as a ribbon diagram. The distributions of the inter-domain distance RNC in the OEP1 (E, blue), OEP2 (G, magenta) and OEP12 (F, brown) ensembles are shown as bars. The distributions of the interdomain twist angle Ω in the OEP1 (H, blue), OEP2 (J, magenta) and OEP12 (I, brown) ensembles are shown as bars. For comparison, the distributions in the NOE-restricted (E, H, black) and in the random (F, G, I, J, orange) pools of structures are shown as solid lines. See also Figures S3 and S4.
Both the OEP1 and SCP ensembles as a whole fit the SAXS data equally well, with a discrepancy χ ~ 1.48. The average scattering pattern predicted for OEP1 is graphically indistinguishable from that for the SCP ensemble (see red solid curve in Figure 2). At the same time, while the OEP1 ensemble fits the RDC data well, the fit of the SCP ensemble as a whole to the RDC data is not satisfactory (see Table 2).
Impact of inclusion of NOE-derived restraints on the result of EOM ensemble refinement
To gain better insight into the impact of including NOE restraints in the ensemble refinement protocol, we generated two more related optimal ensembles and compared their structural characteristics. The ensemble OEP2 was generated without using NOE restraints in either the pool creation step or the ensemble selection. Its pool of starting structures (named P2) consists of structures that have a random relative interdomain configuration. Such a random pool is generally used in EOM modeling with SAXS data for flexible systems. As expected, the distribution of the global variables describing the relative domain configuration in P2 is significantly broader than in P1 (see Figure S3). The ensemble OEP12 consists of structures taken from both the P1 and P2 pools, and the NOE restraints were used in the ensemble selection (see Table 2).
The three optimal ensembles OEP1, OEP2, and OEP12 are shown in Figure 6. As one can see from Table 2, all three ensembles fit the SAXS and NMR data well, except that the OEP2 ensemble violates 53 NOE distance restraints. We used an r−6 summed average to assess the restraint violations by an ensemble as a whole (see Experimental Procedures). At the same time, the configurational characteristics of the ensembles are noticeably different. The inter-domain distance distributions in the optimal ensembles and in the respective pools are compared in Figures 6E to 6G. The RNC distribution in the OEP2 and OEP12 ensembles is much broader than that in the OEP1 ensemble and covers the RNC range from 19 Å up to 50 Å, corresponding to compact and extended domain configurations, respectively. In the compact configuration, the CTD is packed closely against the NTD, Rg ~ 16 Å, and the shape anisotropy is as low as 𝒜 ~ 1.4. In the extended configuration, the NTD extends away from the CTD, Rg ~ 26 Å, and 𝒜 ~ 4.4. In contrast, all structures in the OEP1 ensemble display an open domain arrangement with Rg = 21.4 ± 0.8 Å, and all have a similar shape with anisotropy 𝒜 = 3.1 ± 0.2 (see Figures 6, S4 and S5 and Table 3). It is interesting that the twist angle distribution in both OEP2 and OEP12 covers the whole range of possible Ω values, in contrast to the relatively narrow Ω distribution in OEP1 (Figures 6H to 6J). In addition, the distribution of Ω in the OEP2 ensemble is close to uniform, while there is a peak centered at −80° in the Ω distribution for the OEP12 ensemble.
Cross-validation with RDCs
The availability of three different sets of RDC data enabled us to perform cross-validation calculations. Optimal ensembles were generated as described above for OEP1, OEP2 and OEP12, except that only one or two of the three sets of RDCs were included in the selection procedure. The omitted set(s) of RDC data were then used to validate the ensemble. Our cross-validation results indicate that when two sets of RDC data are included, all generated optimal ensembles fit the omitted set of RDCs well. However, when only one set of RDC data is included, the goodness-of-fit to the omitted RDC data sets is significantly better if NOE-derived restraints are included in the ensemble generation (Qens ≤ 0.35) than if the ensemble is generated without NOEs (Qens ~ 0.65) (see Tables S3 and S4).
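The cross-validation idea can be illustrated with a short sketch. The text does not define the quality factor here, so the code assumes the commonly used form Q = rms(Dexp − Dcalc) / rms(Dexp); the function name and the numerical values are hypothetical and serve only to show how an omitted RDC set would be scored.

```python
import math

def q_factor(d_exp, d_calc):
    """RDC quality factor, assuming Q = rms(D_exp - D_calc) / rms(D_exp)."""
    if len(d_exp) != len(d_calc) or not d_exp:
        raise ValueError("need two non-empty lists of equal length")
    num = sum((e - c) ** 2 for e, c in zip(d_exp, d_calc))
    den = sum(e ** 2 for e in d_exp)
    return math.sqrt(num / den)

# Hypothetical held-out RDC set (Hz) that was NOT used in ensemble selection,
# and hypothetical predictions from two candidate ensembles.
held_out = [10.0, -5.0, 8.0, -12.0]
pred_a = [9.0, -4.0, 9.0, -11.0]
pred_b = [2.0, 3.0, 1.0, -4.0]

q_a = q_factor(held_out, pred_a)  # smaller Q means better agreement
q_b = q_factor(held_out, pred_b)
```

A lower Q on the omitted set indicates that the ensemble predicts data it was not fitted to, which is the sense in which the Qens values above are compared.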
DISCUSSION
Structural determination of sf3636 is challenging because of the elongated shape of the molecule, which is formed from two non-interacting domains connected by a short linker. Although the ensemble of NMR-derived structures in each domain is well defined, the relative configuration of the two domains in solution was not accurately determined by the NOE, RDC and dihedral angle restraints. The RDCs were not sufficient to define inter-domain orientation because of the presence of relative domain motion. The inclusion of SAXS data in the refinement using the SCP approach produced structures with modestly defined levels of inter-domain configuration. Furthermore, the SCP ensemble as a whole does not fit the RDC data well.
We developed a protocol based on EOM to characterize the sf3636 structure. The protocol generates an ensemble of conformers that fits the experimental NMR and SAXS data. It differs from the original EOM in the way the pool of representative structures is generated and in its use of RDC and NOE data in the selection of an optimal ensemble. In EOM a large pool of random conformers is built using a library of random loops, while here replica exchange MD simulations were used to generate protein structures and NOE restraints were applied in the simulation. The NOE distance restraints for sf3636 included 21 long-range contacts between linker residues and the two domains. In particular, NOEs can be seen from the hydrophobic side chain of Leu60 to the side chains of His56, Ala64, Gln65, Gln106 and Met107 (see Figure S3D). Though the distance restraints include no direct inter-domain NOEs, they provide important restrictions on the movement of residues in the linker region. In the ensemble refinement protocol, the target function used in the MC genetic search for an optimal ensemble includes a term that penalizes the violation of the NOE restraints by the ensemble as a whole. It is assumed that each conformer in the ensemble contributes to the NOESY cross peak volume independently, and an r−6 summed average is used to assess the violation of restraints.
The OEP1 ensemble was selected from the NOE-restrained pool P1, so that each of its members had no NOE violations. This ensemble could not reflect the composition of the entire ensemble of conformers in solution for a flexible protein such as sf3636. The random pool P2 represents more extensive sampling of the protein configuration space than the P1 pool does (see Figure S3). Each random structure in the P2 pool violates a number of NOEs. The OEP12 ensemble, the best-fit ensemble selected from the extended pool (a combination of the P1 and P2 pools), contains random structures that comprise about 40% of the ensemble. At the same time, the distributions of all global variables, including the twist angle Ω, in the OEP12 ensemble indicate that only about 20% of OEP12 conformers could be considered outliers of the corresponding distribution in the OEP1 ensemble. We do not have direct experimental evidence that would allow us to clarify whether these 20% of conformers reflect reality or are a result of data over-fitting. It is interesting to note that we failed to select from the P2 pool alone an optimal ensemble that fits all experimental data satisfactorily. The OE3P2 ensemble was generated by selecting structures from P2 with NOE restraints included in the selection (see Table S3). The OE3P2 ensemble violates 11 NOE restraints, which are the key restraints that restrict the linker flexibility (Figure S3D).
The OEP2 ensemble was generated without using NOE restraints to elucidate the role of these restraints in the ensemble refinement. The goodness-of-fit to the SAXS and RDC data obtained for OEP1 is almost equivalent to that obtained for OEP2 (see Table 2), indicating that the SAXS and RDC data alone cannot distinguish between the two ensembles. At the same time, OEP1 and OEP2 have notably different structural characteristics of the domain arrangement. More than 30% of the OEP2 conformers fit the SAXS data badly, with χ-values > 4, and have Rg values significantly different from the experimentally estimated value of 21.75 Å (see Figures 7, S4 and S5). These extremely elongated or compact structures contribute to the successful fit of the OEP2 ensemble as a whole. In contrast, the OEP1 conformers all have a shape and radius of gyration similar to those obtained from the experimental data. The most significant impact of the NOE restraints can be seen in the ensemble distribution of the interdomain twist angle Ω. The OEP1 conformers have Ω restricted to between −120° and −30°, and this range is highly populated (75%) by OEP12 conformers as well. In contrast, there is no Ω region favored by the OEP2 conformers (see Figure 6). Therefore, without including the NOE restraints in the ensemble refinement, we were unable to identify the dominant population of conformers characterized by an overall elongated shape and restricted interdomain twisting.
Figure 7.
Comparison of the goodness-of-fit to the SAXS data for different structural models of sf3636 with their radius of gyration Rg (A) and maximum inter-atomic distance Dmax (B). The goodness-of-fit to the SAXS data is measured by the discrepancy χ. The data are shown for all structures in the random pool (orange dots) and in the SCP (red filled circles), OEP1 (blue filled circles), OEP2 (magenta filled circles), OEP12 (black filled circles), NMR (cyan filled circles), and NMR+RDC (green filled circles) ensembles. See Table 2 and the text for the ensemble definitions. χ-values for each structure were calculated using the program CRYSOL. See also Figure
Concluding remarks
In summary, we have characterized the solution structure and dynamics of sf3636 from Shigella flexneri 2a, which consists of a coiled-coil domain and a novel C-terminal α/β domain connected by a short semi-flexible linker. Our results suggest, although there is no direct experimental evidence for it, that sf3636 predominantly populates configurational states that have an extended arrangement of the two domains, with the relative twisting of the domains restricted to a ~80° interval. We developed and implemented an ensemble refinement protocol that is based on EOM and incorporates NOE-derived distance restraints and RDCs in the optimal ensemble generation procedure. We have compared different refinement strategies and found that refinement using a standard SCP approach provides good models representing an average of the conformations accessible to sf3636 in solution. We have also demonstrated that the inclusion of NOE restraints in the ensemble refinement of sf3636 was essential to deduce the relative domain configurations in the dominant population of sf3636 conformers.
EXPERIMENTAL PROCEDURES
Protein purification
Full-length sf3636 from Shigella flexneri 2a was subcloned into the vector p15Tv LIC with an N-terminal His tag and expressed in Escherichia coli BL21 (DE3) cells. Uniformly 15N- and 15N/13C-labeled proteins were prepared for NMR experiments by growing cells in M9 minimal medium containing 15NH4Cl with or without 13C-glucose. The protein was purified to homogeneity using metal affinity chromatography as described previously (Yee et al., 2002). The N-terminal His tag was cleaved with TEV protease and the reaction mixture was passed through a nickel affinity column. The purified protein was then concentrated by ultrafiltration to 1.0 mM to 1.5 mM in the NMR buffer containing 10 mM Tris (pH 7.0), 300 mM NaCl, 10 mM DTT, 10 μM ZnSO4, 1 mM benzamidine, 0.01% NaN3, 1 × inhibitor cocktail (Roche Applied Science), 95% H2O/5% D2O. Based on 1D 15N T1 and T2 relaxation measurements, the overall rotational correlation time of sf3636 is ~9.0 ns, indicating that the protein is monomeric under the conditions used in the NMR studies.
NMR spectroscopy and structure calculation
The NMR experiments were carried out at 25°C on either 600 or 800 MHz Bruker Avance spectrometers equipped with cryogenic probes. All 3D spectra employed non-uniform sampling in the indirect dimensions and were reconstructed with the multi-dimensional decomposition software MDDNMR (Gutmanas et al., 2002; Orekhov et al., 2003). The resonance assignment and structure calculation were performed using the approach described previously (Lemak et al., 2008; 2010). The details of NMR data collection and calculation of the NMR structural ensemble can be found in the Supplemental Experimental Procedures.
Residual dipolar couplings
The backbone dipolar couplings 1DNH and 1DNC′ were measured in two different media: a stretched neutral polyacrylamide gel and a C12E5 alkyl polyethylene glycol (PEG) bicelle given a positive charge by the addition of 1:30 cetyltrimethylammonium bromide (CTAB) relative to PEG. An additional set of 1DNH couplings was measured in a stretched acrylamide gel made positive by using 50% 3-acrylamidopropyl-trimethylammonium chloride in place of acrylamide. Preparation of these media, as well as a device for stretching polyacrylamide gels, has been described previously (Barb et al., 2011; Liu and Prestegard, 2010). Samples were supplied in the buffers described above for the other NMR experiments. A sample at 400 μM protein concentration was used to swell the neutral stretched gels directly. The PEG sample was diluted by a factor of approximately 0.75 in the process of adding a concentrated PEG preparation for the bicelle experiments to reach a final concentration of 400 μM. The positively charged stretched gel was swelled with a 400 μM protein sample prepared in 10 mM Tris (pH 7.0) buffer with 150 mM NaCl. RDCs were collected on a Varian INOVA 600 MHz spectrometer using J-modulation experiments along the lines of those described previously for 1DNC′ RDCs (Liu and Prestegard, 2009).
15N Relaxation Measurements
All backbone dynamics experiments were performed using a Bruker Avance 500 MHz spectrometer on the 15N-labeled sample (Farrow et al., 1994). The relaxation delays for the T1 experiments were 5, 65, 145, 245, 366, 527, 757, 1148 and 1500 ms and those for the T2 experiments were 34, 51, 68, 85, 102, 119, 136, and 150 ms. The 15N{1H}NOE spectra were acquired with and without 1H saturation in an interleaved manner.
SAXS experiments
SAXS data were acquired at beamline 12-ID-C of the Advanced Photon Source, Argonne National Laboratory. Data collection, processing and analysis were performed as described previously (Liao et al., 2011). The energy of the X-ray beam was 18 keV, and two setups (small- and wide-angle X-ray scattering, SAXS and WAXS) were used. The data were collected using a MAR CCD detector positioned at a distance from the sample that was adjusted to achieve scattering q values of 0.006 < q < 2.3 Å−1, where q = (4π/λ)sinθ and 2θ is the scattering angle. Data were analyzed using programs from the ATSAS package (Konarev et al., 2006). See also Supplemental Experimental Procedures.
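As an illustration of the definition q = (4π/λ)sinθ, the following minimal sketch converts a photon energy to a wavelength and a scattering angle 2θ to q. The function names are hypothetical, and the conversion constant hc ≈ 12.39842 keV·Å is a standard physical value rather than something taken from the text.

```python
import math

HC_KEV_ANGSTROM = 12.39842  # h*c expressed in keV * Angstrom

def wavelength_angstrom(energy_kev):
    """X-ray wavelength (Angstrom) for a given photon energy (keV)."""
    return HC_KEV_ANGSTROM / energy_kev

def momentum_transfer(two_theta_deg, energy_kev):
    """q = (4*pi/lambda) * sin(theta), where 2*theta is the scattering angle."""
    theta = math.radians(two_theta_deg) / 2.0
    return 4.0 * math.pi * math.sin(theta) / wavelength_angstrom(energy_kev)

# Wavelength at the 18 keV beam energy quoted above (about 0.69 Angstrom).
lam = wavelength_angstrom(18.0)
# q (in inverse Angstrom) for an illustrative scattering angle of 1 degree.
q_example = momentum_transfer(1.0, 18.0)
```

Because q depends on both the angle and the wavelength, moving the detector changes the accessible angular range and hence the q range, which is why two detector setups are needed to cover both the small- and wide-angle regions.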
Single Conformation Refinement with NMR and SAXS data
Our initial attempt to refine the NMR ensemble structures followed a SAXS-refinement strategy based on fitting the data to a single conformation (Grishaev et al., 2005; Gabel et al., 2008; Schwieters et al., 2007, 2010). The SAXS-derived restraints were incorporated in the NMR structure refinement protocol by developing modules for fitting both solution scattering intensity and radius of gyration and incorporating them into the existing CNSSOLVE version 1.3 program suite (Brunger et al., 1998).
Our refinement protocol consists of two phases. In the first phase, simulated annealing from 2000 K to 1 K in 10 K increments, with Cartesian dynamics for 0.2 ps at each temperature, was performed for the protein in vacuum. Potential energy terms used in the refinement include NOE, RDC, and SAXS restraints, dihedral angle restraints, and empirical geometrical energy.
Force constants for all energy terms except the RDC restraints were kept constant during this phase. The force constant for the RDC restraints was scaled geometrically from its initial value of 0.01 to its final value of 0.3. The second phase of the protocol consists of refinement in explicit solvent using a standard protocol (Linge et al., 2003) that is modified by including both RDC and SAXS-derived restraints as additional energy terms. More details of the protocol can be found in the Supplemental Experimental Procedures.
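The cooling schedule and the geometric scaling of the RDC force constant can be sketched as follows. This is only an illustration: the function names are hypothetical, and two details are assumptions not stated in the text, namely that the schedule steps down in 10 K decrements and ends with a final stage at 1 K, and that the force constant is updated once per temperature stage.

```python
def cooling_schedule(t_start=2000.0, t_end=1.0, step=10.0):
    """Temperatures visited during annealing, ending exactly at t_end (assumed)."""
    temps = []
    t = t_start
    while t > t_end:
        temps.append(t)
        t -= step
    temps.append(t_end)
    return temps

def geometric_ramp(k_start, k_end, n_steps):
    """Values from k_start to k_end, multiplied by a constant ratio each step."""
    if n_steps < 2:
        raise ValueError("need at least two steps")
    ratio = (k_end / k_start) ** (1.0 / (n_steps - 1))
    return [k_start * ratio ** i for i in range(n_steps)]

temps = cooling_schedule()
# One RDC force constant per temperature stage (assumption), from 0.01 to 0.3.
k_rdc = geometric_ramp(0.01, 0.3, len(temps))
```

A geometric ramp increases the restraint weight by the same factor at every stage, so the RDC term is weak while the structure is still mobile at high temperature and grows smoothly as the system cools.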
In the present work, we used 30 equally q-spaced SAXS data points in the range 0 < q ≤ 0.3 Å−1. RDCs from disordered residues were excluded from the calculations. The 20 structures comprising the NMR ensemble were refined using this protocol and deposited in the Protein Data Bank (PDB ID: 2LF0).
Ensemble Refinement with NOE, RDC and SAXS data
We also implemented an ensemble refinement protocol based on the ensemble optimization method (EOM) (Bernadó et al., 2007). The protocol consists of two steps: 1) generating a pool of representative structures and 2) selecting a subset (ensemble) of structures from the pool that fits the experimental data. In the first step of the protocol, we used the replica exchange molecular dynamics method (Sugita and Okamoto, 1999) to generate a pool of representative structures. In our calculations, 20 replicas of the protein were simulated independently in parallel over a wide range of temperatures from 250 K to 1500 K. Each replica was started from a structure taken from the NMR ensemble, and the torsional MD simulation lasted for 22 ns. Every 15 ps the MD runs were stopped, the conformations of the protein were saved, and exchanges of conformations between neighboring replicas were attempted and accepted when the Metropolis criterion was met. We performed the MD runs with CNSSOLVE, while the exchange of conformations between replicas (replica exchange) was performed using in-house scripts. In the MD simulations both the N-terminal and C-terminal domains of the protein were kept rigid so that only residues 1–8, 54–64, and 117–120 were flexible, and all NOE-derived inter-proton distance restraints used to calculate the NMR structure were also included in the simulation. The structures saved during the last 14 ns of the simulations comprise the pool of representative structures.
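The exchange step of replica exchange MD can be sketched as below. This is a generic illustration of the Metropolis criterion of Sugita and Okamoto (1999), not the in-house scripts mentioned above. The geometric spacing of the 20 temperatures, the energy unit (kcal/mol), the alternating even/odd pairing and all function names are assumptions made for the example.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def temperature_ladder(t_min=250.0, t_max=1500.0, n=20):
    """Geometrically spaced temperatures (the spacing is an assumption)."""
    ratio = (t_max / t_min) ** (1.0 / (n - 1))
    return [t_min * ratio ** i for i in range(n)]

def exchange_probability(e_i, t_i, e_j, t_j):
    """Metropolis acceptance for swapping conformations held at t_i and t_j."""
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)

def attempt_exchanges(conf_ids, energies, temps, rng, offset=0):
    """Try swaps between neighbouring temperatures, pairs (i, i+1) from offset.

    conf_ids[i] is the conformation currently simulated at temps[i];
    energies[c] is the potential energy of conformation c.
    """
    conf_ids = list(conf_ids)
    for i in range(offset, len(temps) - 1, 2):
        a, b = conf_ids[i], conf_ids[i + 1]
        p = exchange_probability(energies[a], temps[i], energies[b], temps[i + 1])
        if rng.random() < p:
            conf_ids[i], conf_ids[i + 1] = b, a
    return conf_ids

temps = temperature_ladder()
```

A swap that moves the lower-energy conformation to the lower temperature is always accepted; the reverse swap is accepted only with a Boltzmann-weighted probability, which lets conformations found at high temperature migrate down the ladder.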
In the second step of the ensemble refinement protocol, a genetic algorithm based Monte Carlo (MC) search was used to find the ensemble of Nens protein conformations that best fits the experimental data. During the MC search, which consisted of 5000 steps, a scoring function was minimized.
The scoring function combines three terms: χsaxs, the discrepancy between the ensemble-predicted and experimental SAXS scattering profiles; Qens, the quality factor of agreement between experimental and predicted dipolar couplings; and the number of NOE-derived distance restraints violated by the ensemble. The RDC and NOE terms are weighted by the factors wRDC and wNOE, both set to 0.5 in the present calculations. The predicted scattering profile Iens and the predicted dipolar couplings are ensemble averages. To assess an NOE-derived distance restraint against an ensemble of structures, we used r−6 summed ensemble averaging of the corresponding inter-proton distance (Nilges, 1995).
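To make the ingredients of the score concrete, the following sketch evaluates an r−6 averaged distance, counts violated restraints, and combines the three terms. It rests on explicit assumptions: the r−6 average is normalized by the number of conformers, a restraint counts as violated when the averaged distance exceeds its upper bound, and the terms are combined as a simple weighted sum. The exact functional form used in the protocol is not given in the text, and all function names and numbers are hypothetical.

```python
def r6_ensemble_distance(distances):
    """Effective distance from r^-6 averaging over conformers (assumed 1/N norm)."""
    n = len(distances)
    mean_inv6 = sum(r ** -6 for r in distances) / n
    return mean_inv6 ** (-1.0 / 6.0)

def count_noe_violations(distances_per_restraint, upper_bounds, tolerance=0.0):
    """Number of restraints whose r^-6 averaged distance exceeds the upper bound."""
    n_viol = 0
    for dists, upper in zip(distances_per_restraint, upper_bounds):
        if r6_ensemble_distance(dists) > upper + tolerance:
            n_viol += 1
    return n_viol

def ensemble_score(chi_saxs, q_ens, n_viol, w_rdc=0.5, w_noe=0.5):
    """Weighted sum of SAXS, RDC and NOE terms (assumed functional form)."""
    return chi_saxs + w_rdc * q_ens + w_noe * n_viol

# Two restraints evaluated over a hypothetical two-conformer ensemble (Angstrom).
dists = [[3.0, 10.0], [8.0, 9.0]]
bounds = [5.0, 5.0]
n_viol = count_noe_violations(dists, bounds)
score = ensemble_score(chi_saxs=1.0, q_ens=0.3, n_viol=n_viol)
```

Because short distances dominate an r−6 average, the first restraint is satisfied even though one conformer has the proton pair 10 Å apart. This is the property that allows an ensemble to satisfy a restraint that some of its members violate individually.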
Iens was obtained by averaging the individual scattering patterns calculated with CRYSOL (Svergun et al., 1995) from the ensemble members. To calculate the ensemble-averaged dipolar couplings, the RDCs of a given nuclear pair were first calculated from each individual structure using an alignment tensor predicted from the 3D molecular shape. Each ensemble member has its own values of the axial and rhombic alignment tensor components, while a single orientation of the tensor is used for all ensemble members. In the case of positively charged alignment media we used PALES (Zweckstetter, 2008) to predict the alignment tensor, while in the case of neutral alignment media an in-house routine based on the method of Almond and Axelsen (2002) was used. The average dipolar coupling of a given nuclear pair was approximated by the mean of the predicted RDCs over all models in the ensemble. The average RDCs were scaled to maximize the agreement with the experimental RDCs. See also Supplemental Experimental Procedures.
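The averaging and scaling of the predicted RDCs can be sketched as follows. The per-model RDC values are assumed to be already available (for example from an alignment tensor prediction), and the single scale factor is chosen here by linear least squares, which is one plausible reading of "scaled to maximize the agreement" rather than a documented detail of the protocol. Function names and data are hypothetical.

```python
def ensemble_average(rdcs_per_model):
    """Mean predicted RDC for each nuclear pair over all models in the ensemble."""
    n_models = len(rdcs_per_model)
    n_pairs = len(rdcs_per_model[0])
    return [sum(model[k] for model in rdcs_per_model) / n_models
            for k in range(n_pairs)]

def optimal_scale(d_exp, d_calc):
    """Scale s minimizing sum((d_exp - s * d_calc)^2) (least-squares assumption)."""
    num = sum(e * c for e, c in zip(d_exp, d_calc))
    den = sum(c * c for c in d_calc)
    return num / den

# Hypothetical predicted RDCs (Hz) for three nuclear pairs in two models.
models = [[2.0, -4.0, 6.0],
          [4.0, -2.0, 2.0]]
d_exp = [6.0, -6.0, 8.0]

d_avg = ensemble_average(models)          # per-pair mean over the two models
scale = optimal_scale(d_exp, d_avg)       # single factor applied to all pairs
d_scaled = [scale * d for d in d_avg]
```

A single scale factor adjusts the overall degree of alignment without changing the relative pattern of couplings, so the comparison with experiment tests the predicted orientations rather than the absolute alignment strength.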
HIGHLIGHTS.
Novel joint ensemble refinement protocol for flexible multidomain proteins
Comparison with other refinement methods
Detection of inter-domain motions
Acknowledgments
This work was supported by the NIH Protein Structure Initiative grant U54 GM094597, the Natural Sciences and Engineering Research Council of Canada, and the Canadian Institutes of Health Research through the Canada Research Chairs program (to C.H.A.).
Footnotes
Supplemental information includes five figures, two tables, and Supplemental Experimental Procedures and can be found with this article online.
AUTHORS CONTRIBUTIONS
A.Y., M.G., and A.S. designed and prepared protein samples for NMR, RDC, and SAXS experiments. S.H. and A.G. collected and processed NMR data. X.F. collected and processed SAXS data. H.W.L. collected and processed RDC data. C.H.A., J.H.P., and Y.X.W. contributed to the interpretation of NMR, RDC, and SAXS data. B.W. processed and analyzed NMR data; determined solution NMR structure. A.L. analyzed and interpreted NMR relaxation data; developed and implemented joint refinement protocol; performed and analyzed refinement calculations. A.L., B.W., S.H., A.Y., H.W.L., X.F., J.H.P., and C.H.A. wrote the manuscript.
References
- Almond A, Axelsen JB. Physical interpretation of residual dipolar couplings in neutral aligned media. J Am Chem Soc. 2002;124:9986–9987. doi: 10.1021/ja026876i.
- Apic G, Gough J, Teichmann SA. Domain combinations in archaeal, eubacterial and eukaryotic proteomes. J Mol Biol. 2001;310:311–325. doi: 10.1006/jmbi.2001.4776.
- Barb AW, Cort JR, Seetharaman J, et al. Structures of domains I and IV from YbbR are representative of a widely distributed protein family. Protein Sci. 2011;20:396–405. doi: 10.1002/pro.571.
- Bernadó P, Mylonas E, Petoukhov MV, Blackledge M, Svergun DI. Structural characterization of flexible proteins using small-angle X-ray scattering. J Am Chem Soc. 2007;129:5656–5664. doi: 10.1021/ja069124n.
- Bertini I, Ferella L, Luchinat C, Parigi G, Petoukhov MV, Ravera E, Rosato A, Svergun DI. MaxOcc: a web portal for maximum occurrence analysis. J Biomol NMR. 2012;53:271–280. doi: 10.1007/s10858-012-9638-1.
- Bhattacharya A, Tejero R, Montelione GT. Evaluating protein structures determined by structural genomics consortia. Proteins. 2007;66:778–795. doi: 10.1002/prot.21165.
- Brunger AT, Adams PD, Clore GM, DeLano WL, Gros P, et al. Crystallography & NMR system: A new software suite for macromolecular structure determination. Acta Crystallogr D Biol Crystallogr. 1998;54:905–921. doi: 10.1107/s0907444998003254.
- Farrow NA, Muhandiram R, Singer AU, Pascal SM, Kay CM, Gish G, Shoelson SE, Pawson T, Forman-Kay JD, Kay LE. Backbone dynamics of a free and phosphopeptide-complexed Src homology 2 domain studied by 15N NMR relaxation. Biochemistry. 1994;33:5984–6003. doi: 10.1021/bi00185a040.
- Forster F, Webb B, Krukenberg KA, Tsuruta H, Agard DA, Sali A. Integration of small-angle X-ray scattering data into structural modeling of proteins and their complexes. J Mol Biol. 2008;382:1089–1106. doi: 10.1016/j.jmb.2008.07.074.
- Gabel F, Simon B, Nilges M, Petoukhov M, Svergun D, Sattler M. A structure refinement protocol combining NMR residual dipolar couplings and small angle scattering restraints. J Biomol NMR. 2008;41:199–208. doi: 10.1007/s10858-008-9258-y.
- Garcia de la Torre J, Huertas ML, Carrasco B. HYDRONMR: Prediction of NMR relaxation of globular proteins from atomic-level structures and hydrodynamic calculations. J Magn Reson. 2000;147:138–146. doi: 10.1006/jmre.2000.2170.
- Grishaev A, Wu J, Trewhella J, Bax A. Refinement of multidomain protein structures by combination of solution small-angle X-ray scattering and NMR data. J Am Chem Soc. 2005;127:16621–16628. doi: 10.1021/ja054342m.
- Gutmanas A, Jarvoll P, Orekhov VY, Billeter M. Three-way decomposition of a complete 3D 15N-NOESY-HSQC. J Biomol NMR. 2002;24:191–201. doi: 10.1023/a:1021609314308.
- Holm L, Sander C. Protein structure comparison by alignment of distance matrices. J Mol Biol. 1993;233:123–138. doi: 10.1006/jmbi.1993.1489.
- Huang YJ, Powers R, Montelione GT. Protein NMR recall, precision, and F-measure scores (RPF scores): structure quality assessment measures based on information retrieval statistics. J Am Chem Soc. 2005;127:1665–1674. doi: 10.1021/ja047109h.
- Jacques DA, Trewhella J. Small-angle scattering for structural biology – Expanding the frontier while avoiding the pitfalls. Protein Sci. 2010;19:642–657. doi: 10.1002/pro.351.
- Konarev PV, Petoukhov MV, Volkov VV, Svergun DI. ATSAS 2.1 program package for small-angle scattering data analysis. J Appl Cryst. 2006;39:277–286. doi: 10.1107/S0021889812007662.
- Lemak A, Steren CA, Arrowsmith CH, Llinás M. Sequence specific resonance assignment via multicanonical Monte Carlo search using an ABACUS approach. J Biomol NMR. 2008;41:29–41. doi: 10.1007/s10858-008-9238-2.
- Lemak A, Gutmanas A, Chitayat S, Karra M, Farès C, Sunnerhagen M, Arrowsmith CH. A novel strategy for NMR resonance assignment and protein structure determination. J Biomol NMR. 2010;49:27–38. doi: 10.1007/s10858-010-9458-0.
- Liao JC, Lam R, Brazda V, Duan S, Ravichandran M, Ma J, Xiao T, Tempel W, Zuo X, Wang YX, Chirgadze NY, Arrowsmith CH. Interferon-inducible protein 16: insight into the interaction with tumor suppressor p53. Structure. 2011;19:418–429. doi: 10.1016/j.str.2010.12.015.
- Linge JP, Williams MA, Spronk CA, Bonvin AM, Nilges M. Refinement of protein structures in explicit solvent. Proteins. 2003;50:496–506. doi: 10.1002/prot.10299.
- Liu Y, Prestegard JH. Measurement of one and two bond N-C couplings in large proteins by TROSY-based J-modulation experiments. J Magn Reson. 2009;200:109–118. doi: 10.1016/j.jmr.2009.06.010.
- Liu Y, Kahn RA, Prestegard JH. Dynamic structure of membrane-anchored Arf•GTP. Nat Struct Mol Biol. 2010;17:876–881. doi: 10.1038/nsmb.1853.
- Liu Y, Prestegard JH. A device for the measurement of residual chemical shift anisotropy and residual dipolar coupling in soluble and membrane-associated proteins. J Biomol NMR. 2010;47:249–258. doi: 10.1007/s10858-010-9427-7.
- Mareuil F, Sizun C, Perez J, Schoenauer M, Lallemand JY, Bontems F. A simple genetic algorithm for the optimization of multidomain protein homology models driven by NMR residual dipolar coupling and small angle X-ray scattering data. Eur Biophys J. 2007;37:95–104. doi: 10.1007/s00249-007-0170-2.
- Nilges M. Calculation of protein structures with ambiguous distance restraints. J Mol Biol. 1995;245:645–660. doi: 10.1006/jmbi.1994.0053.
- Mertens HDT, Svergun DI. Structural characterization of proteins and complexes using small-angle X-ray solution scattering. J Struct Biol. 2010;172:128–141. doi: 10.1016/j.jsb.2010.06.012.
- Orekhov V, Ibraghimov I, Billeter M. Optimizing resolution in multidimensional NMR by three-way decomposition. J Biomol NMR. 2003;27:165–173. doi: 10.1023/a:1024944720653.
- Pelikan M, Hura GL, Hammel M. Structure and flexibility within proteins as identified through small angle X-ray scattering. Gen Physiol Biophys. 2009;28:174–189. doi: 10.4149/gpb_2009_02_174.
- Rozycki B, Kim YC, Hummer G. SAXS ensemble refinement of ESCRT-III CHMP2 conformational transitions. Structure. 2011;19:109–116. doi: 10.1016/j.str.2010.10.006.
- Schwieters CD, Clore GM. A physical picture of atomic motions within the Dickerson DNA dodecamer in solution derived from joint ensemble refinement against NMR and Large-Angle X-ray Scattering data. Biochemistry. 2007;46:1152–1166. doi: 10.1021/bi061943x.
- Schwieters CD, Suh JY, Grishaev A, Ghirlando R, Takayama Y, Clore GM. Solution structure of the 128 kDa enzyme I dimer from Escherichia coli and its 146 kDa complex with HPr using residual dipolar couplings and small- and wide-angle X-ray scattering. J Am Chem Soc. 2010;132:13026–13045. doi: 10.1021/ja105485b.
- Sugita Y, Okamoto Y. Replica-exchange molecular dynamics method for protein folding. Chem Phys Lett. 1999;314:141–151.
- Svergun DI, Barberato C, Koch MHJ. CRYSOL – a program to evaluate X-ray solution scattering of biological macromolecules from atomic coordinates. J Appl Crystallogr. 1995;28:768–773.
- Yang Y, Ramelot TA, McCarrick RM, Ni S, et al. Combining NMR and EPR methods for homodimer protein structure determination. J Am Chem Soc. 2010;132:11910–11913. doi: 10.1021/ja105080h.
- Yang S, Blachowicz L, Makowski L, Roux B. Multidomain assembled states of Hck tyrosine kinase in solution. Proc Natl Acad Sci USA. 2010a;107:15757–15762. doi: 10.1073/pnas.1004569107.
- Yee A, Chang X, Pineda-Lucena A, Wu B, et al. An NMR approach to structural proteomics. Proc Natl Acad Sci USA. 2002;99:1825–1830. doi: 10.1073/pnas.042684599.
- Wang J, Zuo X, Yu P, Byeon IJ, Jung J, Wang X, Dyba M, Seifert S, Schwieters CD, Qin J, Gronenborn AM, Wang YX. Determination of multicomponent protein structures in solution using global orientation and shape restraints. J Am Chem Soc. 2009;131:10507–10515. doi: 10.1021/ja902528f.
- Zweckstetter M. NMR: prediction of molecular alignment from structure using the PALES software. Nat Protoc. 2008;3:679–690. doi: 10.1038/nprot.2008.36.