The Journal of Chemical Physics. 2009 Sep 1;131(9):095101. doi: 10.1063/1.3211103

Atomic resolution protein structure determination by three-dimensional transferred echo double resonance solid-state nuclear magnetic resonance spectroscopy

Andrew J Nieuwkoop 1, Benjamin J Wylie 1, W Trent Franks 1, Gautam J Shah 1, Chad M Rienstra 1,a)
PMCID: PMC2832044  PMID: 19739873

Abstract

We show that quantitative internuclear 15N–13C distances can be obtained in sufficient quantity to determine a complete, high-resolution structure of a moderately sized protein by magic-angle spinning solid-state NMR spectroscopy. The three-dimensional ZF-TEDOR pulse sequence is employed in combination with sparse labeling of 13C sites in the β1 domain of the immunoglobulin binding protein G (GB1), as obtained by bacterial expression with 1,3-13C or 2-13C-glycerol as the 13C source. Quantitative dipolar trajectories are extracted from two-dimensional 15N–13C planes, in which ∼750 cross peaks are resolved. The experimental data are fit to exact theoretical trajectories for spin clusters (consisting of one 13C and several 15N each), yielding quantitative precision as good as 0.1 Å for ∼350 sites, better than 0.3 Å for another 150, and ∼1.0 Å for 150 distances in the range of 5–8 Å. Along with isotropic chemical shift-based (TALOS) dihedral angle restraints, the distance restraints are incorporated into simulated annealing calculations to yield a highly precise structure (backbone RMSD of 0.25±0.09 Å), which also demonstrates excellent agreement with the most closely related crystal structure of GB1 (2QMT, bbRMSD 0.79±0.03 Å). Moreover, side chain heavy atoms are well restrained (0.76±0.06 Å total heavy atom RMSD). These results demonstrate for the first time that quantitative internuclear distances can be measured throughout an entire solid protein to yield an atomic-resolution structure.

INTRODUCTION

Magic-angle spinning (MAS) solid-state NMR (SSNMR) has been used to solve several complete protein structures in recent years,1, 2, 3, 4, 5, 6 and the precision of these structures is improving.5 These advances are important because SSNMR is able to investigate samples with only local order and offers the most promising avenue toward routinely solving the structures of insoluble proteins that do not form single crystals. Specifically, substantial progress has been made in SSNMR structural investigations of both membrane proteins7, 8, 9, 10, 11, 12, 13, 14 and protein aggregates.6, 15, 16, 17, 18, 19, 20 One major remaining goal is to solve structures of these systems at a rate and quality comparable to those produced by solution NMR and x-ray crystallography. Most SSNMR structures so far have been solved with a combination of semiquantitative 13C–13C, 15N–15N, and 1H–1H distances, of precision comparable to nuclear Overhauser effects (NOEs) measured in solution. In addition, empirical dihedral angle restraints from TALOS assist in computational refinement of secondary structure elements.21 Until recently, the highest quality SSNMR structures were reported with backbone RMSD (bbRMSD) values of ∼0.8–1.4 Å.1, 2, 3, 4 We have recently reported that with large numbers of distances, as well as high precision vector angle restraints, structures can be refined to ∼0.2 Å bbRMSD.5 Likewise, we have shown that chemical shift tensors provide another avenue to atomic-resolution structure refinement.22

Here we investigate the measurement of high precision heteronuclear (15N–13C) distance restraints to refine protein structure to atomic resolution. Heteronuclear dipolar couplings, normally averaged by MAS, can be reintroduced with a train of rotor synchronized rf pulses, as in the rotational echo double resonance (REDOR) (Ref. 23) experiment and its analog transferred echo double resonance (TEDOR).24 For spin pairs or clusters, these methods offer exquisitely high precision determination of internuclear distances by reporting dipolar dephasing (for REDOR) or buildup (for TEDOR) trajectories under the influence of a pulse sequence in which the multiple spin-pair interactions mutually commute with one another.25, 26 Furthermore, for application to proteins with a large number of isotopic labels, the three-dimensional (3D) z-filtered TEDOR (ZF-TEDOR) (Ref. 27) pulse sequence was developed, enabling quantitative trajectories to be extracted from a series of two-dimensional (2D) spectra in which the 13C and 15N chemical shifts for each unique spin pair are encoded. Therefore many dozens of distances could be measured in uniformly 13C, 15N-labeled peptides.28, 29
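As a rough numerical guide to the distance dependence that these recoupling experiments exploit, the sketch below evaluates the 15N–13C dipolar coupling constant, d = (μ0/4π)γCγNħ/(2πr^3), in Hz. The gyromagnetic ratios are standard literature values, and the function names are illustrative assumptions, not part of any software used in this work.

```python
import math

# Physical constants (SI). Gyromagnetic ratios are standard literature values.
MU0_OVER_4PI = 1.0e-7        # T m A^-1
HBAR = 1.054571817e-34       # J s
GAMMA_13C = 67.2828e6        # rad s^-1 T^-1
GAMMA_15N = -27.116e6        # rad s^-1 T^-1 (only the magnitude is used below)


def dipolar_coupling_hz(r_angstrom: float) -> float:
    """Magnitude of the 15N-13C dipolar coupling constant (Hz) at distance r (Angstrom)."""
    r_m = r_angstrom * 1.0e-10
    return MU0_OVER_4PI * GAMMA_13C * abs(GAMMA_15N) * HBAR / (2.0 * math.pi * r_m**3)


def distance_angstrom(d_hz: float) -> float:
    """Invert the r^-3 dependence: distance (Angstrom) for a coupling magnitude in Hz."""
    return (dipolar_coupling_hz(1.0) / d_hz) ** (1.0 / 3.0)


if __name__ == "__main__":
    for r in (2.44, 4.0, 8.0):
        print(f"r = {r:4.2f} A  ->  d = {dipolar_coupling_hz(r):7.1f} Hz")
```

With these constants the coupling is about 210 Hz at 2.44 Å and about 6 Hz at 8 Å, which illustrates why the longest distances are detected only at the longest mixing times.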

Although the ZF-TEDOR experiment addresses the direct and indirect effects of 13C–13C scalar couplings in the observed trajectories and peak line shapes, the modulations arising from one-bond couplings compromise the dynamic range of the measurement of weak dipolar couplings. This well-recognized challenge has previously been addressed by band-selective decoupling (BASE TEDOR),27 as well as a semiconstant time TEDOR version.30 Here we address this in a third, complementary way by preparing protein samples where the percentage of directly bonded 13C pairs is minimized, as derived from either 1,3-13C-glycerol (along with natural abundance carbonate) or 2-13C-glycerol (and 13C carbonate) as the sole sources of carbon. This expression scheme, originally developed by LeMaster et al.,31 and utilized previously in SSNMR,1, 5, 32 produces an anticorrelated “checkerboard” pattern of 13C and 12C labeling for most amino acids, enhancing resolution and sensitivity most notably in 13C–13C 2D and 15N–13C–13C 3D experiments.

We show here that for 3D ZF-TEDOR experiments, this glycerol-derived labeling pattern is particularly beneficial, enabling high-quality experimental data to be acquired with hundreds of resolved correlations. We then demonstrate a protocol for fitting the dipolar trajectories quantitatively, modeling the spin dynamics for each set of several 15N spins coupled to each resolved 13C site. This approach enables quantitative analysis for distances of up to at least ∼5 Å, as well as detection of much longer distances (∼8 Å) with moderate precision. These distance restraints prove useful in determining protein tertiary structure.

EXPERIMENTAL AND COMPUTATIONAL METHODS

Sample preparation

Samples of GB1, a 6 kDa streptococcal protein expressed from E. coli, were 13C and 15N isotopically labeled by bacterial overexpression in media containing 15N ammonium chloride as the sole nitrogen source and either (a) 2-13C-glycerol and calcium 13C-carbonate, or (b) 1,3-13C-glycerol and natural abundance carbonate, as the sole carbon sources.1, 3, 5, 31, 33 We refer to these preparations throughout the text as [2]-GB1 and [1,3]-GB1, respectively. In each case, ∼18 mg of nanocrystalline protein was packed into the central 80% of a limited speed 3.2 mm Varian rotor (Varian, Inc., Fort Collins, CO). All experiments were performed using a 500 MHz InfinityPlus spectrometer (Varian, Inc., Palo Alto, CA and Fort Collins, CO) equipped with a 3.2 mm T3 Balun 1H–13C–15N MAS probe. Pulse widths (π∕2) for 1H, 13C, and 15N were 2.6, 3.0, and 6.0 μs, respectively. Spinning was controlled via a Varian MAS controller to 11 111±2 Hz.

NMR spectroscopy

A series of 2D 15N–13C planes was acquired according to the 3D ZF-TEDOR pulse sequence.27 TPPM decoupling was used during acquisition with a 1H field of ∼70 kHz (6.7 μs, 15.0° total phase difference). TPPM conditions during REDOR (5.0 μs, 18.0°) were optimized with 5.76 ms mixing and a nominal 1H field of ∼100 kHz. 13C and 15N π-pulse widths during TEDOR were 6.0 and 15.2 μs, respectively, the latter value adjusted to ensure that the ratio of 1H to 15N nutation frequency was at least 3:1 to minimize decoupling interference.34, 35 Spectra were acquired with the minimum phase cycle of 16 scans per row, resulting in 4.5 h blocks of measurement time with each 2D plane digitized to 640 points (TPPI) by 45 μs per row in t1 (15N, 28.8 ms maximum evolution time) and 3072 complex points (15 μs dwell) in the t2 acquisition dimension (13C, 46.1 ms acquisition time). Data processing was performed with 15 and 5 Hz net line broadening in the direct and indirect dimensions, respectively (using a 1:2.5 ratio of negative Lorentzian to positive Gaussian apodization) and zero filling to 8192 by 8192 complex points.
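The timing figures quoted above can be cross-checked with simple arithmetic. The sketch below converts the TEDOR π-pulse widths into nutation frequencies, compares the nominal 1H field during REDOR with the 15N field, and recomputes the evolution and acquisition times from the number of points and dwell. The helper name and variable names are illustrative assumptions.

```python
def nutation_khz(pi_pulse_us: float) -> float:
    """Nutation frequency (kHz) implied by a pi-pulse width given in microseconds."""
    # A pi pulse satisfies 2*pi*nu1*tau = pi, so nu1 = 1 / (2*tau).
    return 1.0e3 / (2.0 * pi_pulse_us)


# pi-pulse widths during TEDOR, as quoted in the text
n15_khz = nutation_khz(15.2)
c13_khz = nutation_khz(6.0)

# Nominal 1H decoupling field during the REDOR periods, as quoted in the text
h1_redor_khz = 100.0
ratio_h_to_n = h1_redor_khz / n15_khz

# Indirect (15N) and direct (13C) sampling: points x dwell, in milliseconds
t1_max_ms = 640 * 45e-3
t2_acq_ms = 3072 * 15e-3

if __name__ == "__main__":
    print(f"15N nutation: {n15_khz:.1f} kHz, 13C nutation: {c13_khz:.1f} kHz")
    print(f"1H:15N nutation ratio: {ratio_h_to_n:.2f}")
    print(f"t1 max: {t1_max_ms:.1f} ms, t2 acquisition: {t2_acq_ms:.1f} ms")
```

The 15.2 μs 15N π pulse corresponds to roughly 33 kHz, so the nominal 100 kHz 1H field satisfies the stated 3:1 condition with little margin.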

The 15N–13C dipolar trajectories were sampled with values of tmix (according to Jaroniec et al., Fig. 1) (Ref. 27) incremented from 1.44 to 14.40 ms in steps of 1.44 ms. Signal averaging times were increased for longer mixing times, with a minimum of two blocks (9 h) acquired with tmix=1.44 ms and a maximum of nine blocks (40.5 h) with tmix=14.40 ms. Spectra were normalized, according to the total number of scans per row, prior to further analysis.
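The sampling schedule can be tabulated as follows. This is a minimal sketch: the observation that 1.44 ms corresponds to about 16 rotor periods at 11 111 Hz is our arithmetic rather than a statement from the text, the number of blocks at intermediate mixing times is not given and is therefore left as an input, and the function names are hypothetical.

```python
MAS_RATE_HZ = 11111.0            # spinning rate quoted in the text
BLOCK_HOURS = 4.5                # one 2D block (16 scans per row)
SCANS_PER_ROW_PER_BLOCK = 16

# Ten mixing times from 1.44 to 14.40 ms in 1.44 ms steps
mixing_times_ms = [round(1.44 * k, 2) for k in range(1, 11)]


def rotor_periods(t_mix_ms: float, mas_rate_hz: float = MAS_RATE_HZ) -> float:
    """Number of rotor periods spanned by a mixing time."""
    return t_mix_ms * 1.0e-3 * mas_rate_hz


def measurement_hours(n_blocks: int) -> float:
    """Total measurement time for a plane built from n_blocks blocks."""
    return n_blocks * BLOCK_HOURS


def normalize(intensity: float, n_blocks: int) -> float:
    """Scale a peak intensity by the total number of scans per row."""
    return intensity / (n_blocks * SCANS_PER_ROW_PER_BLOCK)


if __name__ == "__main__":
    for t in mixing_times_ms:
        print(f"t_mix = {t:5.2f} ms  ~ {rotor_periods(t):6.2f} rotor periods")
    print(measurement_hours(2), measurement_hours(9))
```

Normalizing by scans per row places planes acquired with different numbers of blocks on a common intensity scale before the trajectories are fit.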

Figure 1.


Carbonyl 13C region of 2D 15N–13C TEDOR spectra (1.44 ms mixing time). Spectra of GB1 samples prepared from (a) 2-13C-glycerol, calcium 13C-carbonate, and 15NH4Cl ([2]-GB1) and (b) 1,3-13C-glycerol, natural abundance carbonate, and 15NH4Cl ([1,3]-GB1). The sequence of GB1 (c) is provided for reference. Spectra were acquired at 500 MHz 1H frequency. Peaks are labeled with their backbone 15N and 13C frequencies, respectively, except for side chain 15N sites which are explicitly indicated. Data were processed with 15 Hz net line broadening (Lorentzian-to-Gaussian apodization) in 13C and 5 Hz in 15N. Acquisition time was 9 h for each spectrum.

Numerical simulations and data fitting

Spectra were processed in NMRPipe (Ref. 36) and peak intensities extracted from the 2D 15N–13C planes using the nlinLS package, resulting in trajectories of integrated peak intensities as a function of mixing time, from 1.44 to 14.4 ms in 1.44 ms increments.

Simulated trajectories were generated using SPINEVOLUTION (Ref. 37) to model spin dynamics of ZF-TEDOR with the following assumptions: (1) All spin coherences derived from a single 13C site commute with all other spin operators involving other 13C spins; (2) each 13C site is coupled to n 15N sites, which are not coupled to other 15N spins; (3) the relative orientations of the heteronuclear dipoles could be ignored. (We show experimentally (vide infra) that these approximations are valid in the limit of weak couplings.) Thus, simulated trajectories for each of the n 15N sites coupled to a single 13C resonance were generated.

The dipolar couplings and other incidental parameters were then determined by minimizing the global difference between the simulated and experimental trajectories, using in-house FORTRAN code that called MINUIT minimization libraries and SPINEVOLUTION.37 The distance between 13C and 15N spin pairs was simulated for each point on the trajectory and a scaling factor was applied to the result to account for the 13C labeling percentage. In this scheme, y-scaling, corresponding to labeling percentage, is held fixed over the ensemble and T2 relaxation, applied as an exponential decay, is allowed to vary by 10% over the ensemble, to account for possible higher order relaxation from 15N CSA recoupling during the REDOR periods.
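To illustrate the roles of the coupling, the y-scaling, and the exponential T2 decay in such a fit, the sketch below computes a buildup curve for a single isolated 15N–13C pair. It is a deliberately simplified stand-in and not the SPINEVOLUTION model used here: it assumes ideal pulses and the REDOR average Hamiltonian, neglects passive couplings to other 15N spins, and uses a coarse grid for the powder average. The function name and its parameters are hypothetical.

```python
import math


def tedor_buildup(t_mix_ms, d_hz, scale=1.0, t2_ms=float("inf"), n_alpha=48, n_beta=48):
    """Powder-averaged sin^2 transfer for one isolated spin pair (simplified model).

    Each crystallite (alpha, beta) accumulates the phase
    sqrt(2) * d * t_mix * sin(2*beta) * sin(alpha) in each half of the mixing
    period (d in Hz); the transferred signal is the sin^2 of that phase.
    """
    t_s = t_mix_ms * 1.0e-3
    total = 0.0
    weight = 0.0
    for i in range(n_beta):
        beta = math.pi * (i + 0.5) / n_beta          # midpoint grid in beta
        w = math.sin(beta)                           # solid-angle weight
        for j in range(n_alpha):
            alpha = 2.0 * math.pi * (j + 0.5) / n_alpha
            phase = math.sqrt(2.0) * d_hz * t_s * math.sin(2.0 * beta) * math.sin(alpha)
            total += w * math.sin(phase) ** 2
            weight += w
    decay = math.exp(-t_mix_ms / t2_ms)              # exponential T2 relaxation
    return scale * decay * total / weight


if __name__ == "__main__":
    times = [round(1.44 * k, 2) for k in range(1, 11)]
    for t in times:
        # ~211 Hz corresponds to a two-bond distance of about 2.44 Angstrom
        print(f"{t:5.2f} ms  {tedor_buildup(t, 211.0, scale=1.0, t2_ms=20.0):.3f}")
```

In this model the scaling factor multiplies the whole curve while the coupling sets the time axis, so a trajectory that passes through its maximum constrains both parameters, whereas one sampled only on its rising edge does not.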

RESULTS AND DISCUSSION

2D TEDOR heteronuclear correlation spectroscopy

As demonstrated previously,38 the 2D 15N–13C spectra of [2]-GB1 [Fig. 1a] and [1,3]-GB1 [Fig. 1b] are well resolved, with natural 13C linewidths of ∼0.2 ppm and 15N linewidths of ∼0.5 ppm, principally limited by instrumental factors (such as the B0 homogeneity and the maximum time for which high power 1H decoupling can be applied). Spectra acquired at 1.44 ms 15N–13C mixing, the shortest value utilized here, contain strong peaks from one-bond correlations (e.g., N[i]-C[i-1]), with the relative intensity depending principally upon the efficiency of the 13C labeling within each amino acid. For example, in the [2]-GB1 sample, carbonyl sites for Leu are nearly 100% labeled, whereas Ala, Gly, Val, and other amino acids from the glycolysis pathway are labeled only to a small extent; in the [1,3]-GB1 sample, a complementary pattern is observed. Carbonyl sites from amino acids in the citric acid cycle (most notably, Gln∕Glu, Asp∕Asn, Lys, and Thr) are observed with approximately equal intensity in the [2]-GB1 and [1,3]-GB1 samples, due to the scrambling of 13C within the cycle. These signals serve as valuable internal controls to validate data analysis procedures.

Likewise, the TEDOR spectra are well resolved in the CA region for both the [2]-GB1 [Fig. 2a] and [1,3]-GB1 [Fig. 2b] samples, with nearly every expected cross peak resolved. Partial overlap of the N8, A20, and D46 resonances in the [2]-GB1 spectrum is alleviated by the reduction in A20 intensity in the [1,3]-GB1 spectrum. An additional benefit of the glycerol labeling scheme is the absence of most one-bond homonuclear J-couplings, with the exception of the Val and Ile CA-CB couplings, which lead to doublet patterns observed for I6 and the four Val residues (21, 29, 39, and 54). Moreover, broadening from off or near rotational resonance (R2) is avoided.33, 39, 40, 41, 42, 43 These factors together contribute to high observed signal-to-noise ratio (SNR) for both samples, approximately 500:1 in 9 h of data collection for directly bonded 15N–13C pairs that were ∼100% labeled and proportionately lower for the fractionally labeled sites. Thus it was also possible to observe weak peaks at natural abundance and∕or with a small percentage of labeling; in several instances, sites that are nominally unlabeled appeared with approximately 5% of the maximum intensity [e.g., A48 C in the [2]-GB1 sample, Fig. 1a].

Figure 2.


CA region of 2D 15N–13C TEDOR spectra (1.44 ms mixing time). (a) [2]-GB1; (b) [1,3]-GB1. Spectra were processed with 15 Hz net line broadening (Lorentzian-to-Gaussian apodization) in 13C and 5 Hz in 15N. Acquisition time was 9 h for each spectrum.

The glycerol-derived isotopic incorporation on an amino acid specific basis is likely to vary among proteins, due to differences in the relative percentages of each amino acid; in the case of GB1, such effects are likely to be exaggerated by the very high levels of expression (>100 mg∕L) and the fact that GB1 lacks several amino acids (Cys, His, Pro, Ser, Arg) and has a large number of others (such as Thr, Asp, and Glu). Therefore the NCA and NCO 2D spectra can be utilized to approximate the percentage of 13C labeling in our samples. This percentage was incorporated as a parameter into the fitting procedure as an intensity scaling factor; this treatment is valid in the limit that the relaxation parameters are similar for all sites, which is satisfactory in the limit of short TEDOR mixing times. The results of this analysis are shown in Fig. 3. Overall patterns were very similar to those reported previously,1, 3 but with substantial signal intensities (∼10% of the maximum) for sites that were not expected to be labeled, such as CO resonances for Gly, Trp, Phe, Tyr, and Ala in the [2]-GB1 sample. Despite the presence of these peaks, we consistently observed lineshapes lacking fine structure from 1JCC couplings, consistent with the anticorrelated labeling of neighboring carbons.

Figure 3.


13C labeling pattern measured by TEDOR in GB1. Red represents [2]-GB1 and green [1,3]-GB1. Each circle represents the intensity of correlations to the carbon type indicated at the top in the residue group indicated to the left.

At longer 15N–13C mixing times, additional 13C resonances are observed at each 15N frequency (Fig. 4); many of these correlations are consistent with medium and long-range distance restraints. For example, the methyl region of [1,3]-GB1 contains strong crosspeaks for two-bond N[i]-CB[i] pairs (dipolar coupling ∼200 Hz) with SNR values of ∼30, or ∼6% of the maximum intensity for the aforementioned one-bond correlations at 1.44 ms 15N–13C mixing. At progressively longer mixing times, these crosspeaks increase in intensity to a maximum of ∼150 at 5.76 ms (∼30% of the maximum one-bond crosspeak intensity). Even at the 1.44 ms mixing time, some inter-residue correlations are observed (e.g., D40-V39CG1, G41-V54CG2, W43NE1-V54CG2), which must have distances of greater than 3 Å. Peaks first appearing at 2.88 ms include those reporting on backbone-to-side-chain distances of ∼4 Å (Val and Thr N-CG) as well as various long-range correlations (G9-V39CG1, L7-V54CG1, Q2-T18CG2, W43-V54CG1). Nearly every 13C and 15N frequency is uniquely resolved, enabling straightforward assignment of the longer mixing time spectra. Chemical shifts agreed well with previously published values.44

Figure 4.


Methyl region of 2D 15N–13C TEDOR spectrum of [1,3]-GB1 at several mixing times. (a) 1.44 ms; (b) 2.88 ms; (c) 4.32 ms; (d) 5.76 ms. Peaks are labeled only in the spectrum where they first appear. Labels indicate the backbone 15N site followed by the 13C to which it is correlated, except for side chain 15N sites that are specifically noted. Intraresidue correlations have no second residue number before the 13C label. Spectra were processed with 15 Hz net line broadening (Lorentzian-to-Gaussian apodization) in 13C and 5 Hz in 15N.

To evaluate the distance measurement range using this approach, we examined the spectrum at 14.4 ms mixing, the longest utilized in this study. At selected 13C frequencies, correlations are observed to nearly a dozen 15N sites. For example, I6CD1 (a uniquely resolved 13C frequency at 12.6 ppm, Fig. 5) shows correlations not only within the β1 strand to L5, I6, and L7 but also to β2 (G14, E15, and T16) and β4 (F52 and T53). These correlations are potentially useful in restraining the relative position and register of these three secondary structure elements. Likewise, L5CD2 exhibits correlations within the β1 strand (to L5∕L7, I6), β2 (T16), and β4 (F52 and T53). Complementary information is observed for V54CG1, within β4, with correlations to its neighboring residues (T53 and T55), β1 (I6, L7, G9), and β3 (G41, W43). Such correlations between secondary structure elements are reliably observed, and in several cases unambiguous peaks corresponding to distances of 8 Å or more (based on the 2QMT x-ray crystal structure45) are observed. Examples shown in Fig. 5 include F52N-I6CG1 (7.79 Å distance in the crystal structure), V21N-M1CE (8.43 Å), and G14N-I6CD1 (8.71 Å). These long-range peaks are the most useful with regard to defining the tertiary structure of the protein.

Figure 5.


Methyl region of [1,3]-GB1 2D 15N–13C TEDOR spectrum (14.4 ms 15N–13C mixing, 40 h measurement time). Data were processed with 15 Hz net line broadening (Lorentzian-to-Gaussian apodization) in 13C and 5 Hz in 15N.

We considered whether some of the observed cross peaks might arise due to a multi-step polarization transfer involving two 13C spins and one 15N. This seemed unlikely because the MAS rate was chosen to avoid R2 conditions.33, 39, 40, 41, 42, 43 Furthermore, at the longest mixing time utilized here (i.e., 14.4 ms), with high power decoupling the rate of homonuclear transfer via spin diffusion must be substantially slower than what we observed in GB1 with rotary resonance recoupling on the 1H channel. In that case,5 we found it necessary to mix at least 50 ms to observe significant (>5% of the diagonal) intensity between 13C pairs separated by 2.5 Å. We examined a specific case, that of Ile6 CD1, in the resulting GB1 structure to determine whether there were additional 13C sites physically positioned between source (Ile6 CD1) and destination (G14N) spins; in this case there were none within 5 Å. We attribute this, and the general lack of spin diffusion by 13C–13C couplings, to the sparse, glycerol-derived isotopic labeling.

Quantitative analysis of polarization transfer trajectories

As illustrated above, a large number of crosspeaks are observed in the 2D TEDOR spectra, and the corresponding distances may be approximated based on relative intensity within individual spectra, in a manner analogous to NOE analysis by solution NMR. However, for purposes of quantitative distance determinations, it is necessary to fit the trajectories with rigorous spin physics models. To do so, we next illustrate the fitting procedure for the polarization transfer trajectories. Like REDOR difference spectra,23 TEDOR trajectories24 for spin pairs can readily be fit to analytical (or numerically exact) spin dynamics models to extract accurate 15N–13C distances, and similar logic has been utilized to examine clusters of several spins.46, 47, 48 In contrast to REDOR, however, the fit of the TEDOR trajectories is somewhat complicated by the fact that the maximum absolute intensity in the trajectory depends not only on heteronuclear dipolar coupling, but also the 13C labeling percentage, as well as other factors (relaxation, presence of multiple dipoles, relative orientations, etc.).27 In particular, the labeling percentage is a critical parameter that contributes to the normalization of the spectral intensity, which distinguishes between (a) short distances involving a fractionally labeled 13C and (b) long distances involving a 13C that is ∼100% labeled. These factors must be explicitly considered in order to determine quantitative coupling constants. To first approximation, we considered the couplings, labeling percentage, and relaxation as fit parameters, assuming that the labeling percentage and relaxation must be the same for each 13C site, and that all 15N sites were labeled to 100%. The relative orientations of multiple 13C–15N dipoles were ignored.

Figure 6 illustrates two typical scenarios encountered in fitting ensembles of 15N sites correlated with a single 13C. The A34CB resonance [Fig. 6a] has intraresidue, sequential, and long-range correlations. The intraresidue (A34N-A34CB) trajectory exhibits a maximum at 5.76 ms mixing, from which the distance of 2.44 Å is fit directly, also uniquely constraining the intensity scaling factor. With accurate knowledge of this scaling factor, the remaining trajectories can be fitted unambiguously. In comparison to the crystal structure [2QMT, Fig. 6c], we find reasonable agreement for the short (A34N-A34CB, 2.44±0.10 Å TEDOR versus 2.44 Å x ray), medium (N35N-A34CB, 3.29±0.10 Å TEDOR versus 3.09 Å x ray), and long (V39N-A34CB, 4.46±0.86 Å TEDOR versus 5.13 Å x ray; G41N-A34CB, 5.30±1.79 Å TEDOR versus 6.74 Å x ray; W43NE1-A34CB, 5.9±3.2 Å TEDOR versus 5.47 Å x ray) distances. Uncertainties in the TEDOR-determined distances (as determined by the MINUIT routine) are largest for the peaks observed only at the long mixing times, such as W43NE1-A34CB, which is at the borderline of sensitivity in this experiment.
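The agreement quoted above for A34CB can be expressed as the deviation between the TEDOR and x-ray distances in units of the reported uncertainty. The sketch below uses only the values given in the text; the dictionary layout and function name are illustrative.

```python
# (TEDOR distance, TEDOR uncertainty, x-ray distance), all in Angstrom,
# for 15N sites correlated with A34CB, as quoted in the text.
a34cb = {
    "A34N":   (2.44, 0.10, 2.44),
    "N35N":   (3.29, 0.10, 3.09),
    "V39N":   (4.46, 0.86, 5.13),
    "G41N":   (5.30, 1.79, 6.74),
    "W43NE1": (5.90, 3.20, 5.47),
}


def deviation_in_sigma(fit: float, sigma: float, xray: float) -> float:
    """Absolute TEDOR/x-ray difference divided by the fit uncertainty."""
    return abs(fit - xray) / sigma


z_scores = {site: deviation_in_sigma(*vals) for site, vals in a34cb.items()}

if __name__ == "__main__":
    for site, z in z_scores.items():
        print(f"{site:7s} {z:4.2f} sigma")
```

All five distances fall within about two reported standard deviations of the crystal structure values; the largest normalized deviation is for N35N, where the reported uncertainty is the 0.10 Å minimum.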

Figure 6.


TEDOR polarization transfer trajectories of selected 13C–15N correlations in GB1. Trajectories for (a) A34CB in [1,3]-GB1 and (b) W43CZ3 in [2]-GB1. Correlations labeled on the 2QMT crystal structure for (c) A34CB and (d) W43CZ3. Intensities in the trajectories are derived from integrated volumes of peaks in 2D 15N–13C TEDOR experiments at the respective mixing times. Simulated trajectories were generated using SPINEVOLUTION simulations of the TEDOR pulse sequence as described in the text. Fit and x-ray distances for A34CB [(a) and (c)] are, respectively, as follows: A34N, 2.44±0.10 Å vs 2.44 Å; N35N, 3.29±0.10 Å vs 3.09 Å; V39N, 4.46±0.86 Å vs 5.13 Å; G41N, 5.30±1.79 Å vs 6.74 Å; W43NE1, 5.9±3.2 Å vs 5.47 Å. Fit and x-ray distances for W43CZ3 [(b) and (d)] are, respectively, as follows: W43NE1, 4.10 vs 4.06 Å; K31N, 4.53 vs 4.61 Å; V54N, 4.55 vs 4.85 Å; T53N, 4.90 vs 5.18 Å; W43N, 6.02 vs 6.36 Å; T44N, 6.04 vs 6.15 Å; K31NZ, 6.38 vs 6.00 Å. The molecular graphic shows the backbone of helices in purple, strands in yellow, and loops in cyan; the dark blue circles are nitrogen atoms, and the distances to the labeled 13C site are indicated with dotted lines.

In general, 15N–13C distances of less than ∼3.5 Å will yield trajectories exhibiting maxima at less than 14.4 ms, which can be uniquely fit to intensity scaling and heteronuclear coupling. In cases where no 15N nuclei are within 3.5 Å of an observed 13C resonance, this procedure becomes problematic and the final scaling factor, and therefore the distances, become less reliable. For example, the W43CZ3 TEDOR trajectory [Fig. 6b] has not yet reached its maximum at 14.4 ms; trajectories in the initial rate regime exhibit a strong covariance between the intensity scaling factor and the dipolar coupling. Therefore the certainty of distance determination is limited by knowledge of the hypothetical maximum signal intensity. To address this problem, known molecular geometry can be utilized as an internal control to disambiguate the fit parameters. Specifically, the W43NE1-W43CZ3 distance is dictated by the geometry of the indole ring to be 4.06 Å. Thus in fitting this group of trajectories we constrained this distance to be 4.05±0.05 Å, and the global fit for all six distances was found to be internally consistent. Distances calculated from this trajectory were systematically short without the control, but using this chemical restraint the 15N to W43CZ3 distances converge very well to those observed in the x-ray structure (K31N, 4.53±0.10 versus 4.61 Å x ray; V54N, 4.55±0.10 versus 4.85 Å; T53N, 4.90±3.90 versus 5.18 Å; W43N, 6.02±0.10 versus 6.36 Å; T44N, 6.04±0.10 versus 6.15 Å; K31NZ, 6.38±0.10 versus 6.00 Å).

When fixing a distance within the ensemble, error determinations for the remaining distances become less reliable because the MIGRAD algorithm in MINUIT cannot properly sample the solution space to determine the reliability of the fit, although a reasonable estimate of the uncertainty for all trajectories shown here is better than ±0.4 Å (i.e., the range of distances from the mean required to generate simulations that encompass all observed experimental data points). A total of 726 distance restraints (340 from the [1,3]-GB1 and 386 from the [2]-GB1) were determined. Among these, 454 had distance uncertainties of 0.2 Å or less; another 95 had uncertainties between 0.2 and 0.5 Å, 48 had uncertainties between 0.5 and 1.0 Å, and the remaining 64 had uncertainties greater than 1.0 Å. In some instances (including most of the 64 reported to have greater than 1.0 Å uncertainty), MINUIT failed to report reasonable errors [for example, T53N-W43CZ3, Fig. 6b] although by manual inspection it was clear that the trajectories were well fit. Nevertheless, the large uncertainties were used in subsequent calculations (vide infra).
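The covariance between the intensity scaling factor and the dipolar coupling for trajectories in the initial rate regime can be made explicit with a short-time expansion. The following is a sketch for a single isolated 15N–13C pair, assuming ideal pulses and the REDOR average Hamiltonian; the symbols λ (intensity scaling), Γ (relaxation rate), D (coupling in Hz), and D_0 (the coupling constant at 1 Å) are notation introduced here for illustration and are not taken from the fitting code used in this work.

```latex
\begin{aligned}
I(t_\mathrm{mix}) &= \lambda\, e^{-\Gamma t_\mathrm{mix}}
  \left\langle \sin^{2}\!\left(\sqrt{2}\, D\, t_\mathrm{mix}\,
  \sin 2\beta \,\sin\alpha\right) \right\rangle_{\alpha,\beta} \\
  &\approx \tfrac{8}{15}\,\lambda\, D^{2}\, t_\mathrm{mix}^{2}
  \qquad (D\, t_\mathrm{mix} \ll 1,\ \Gamma\, t_\mathrm{mix} \ll 1), \\
D &= \frac{D_{0}}{r^{3}}
  \;\;\Longrightarrow\;\;
  I(t_\mathrm{mix}) \approx \tfrac{8}{15}\,
  \frac{\lambda\, D_{0}^{2}}{r^{6}}\; t_\mathrm{mix}^{2}.
\end{aligned}
```

In this limit only the product λD^2 (equivalently λ/r^6) is determined by the data, so an error of a factor f in the scaling is absorbed by rescaling the fitted distance by f^{1/6}. A trajectory that reaches its maximum, or a fixed internal distance such as the indole-ring W43NE1-W43CZ3 distance, removes this degeneracy.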

We identified and removed intermolecular correlations based on similarity to contacts observed previously.5, 30, 49 The remaining TEDOR-determined 13C–15N distances showed good agreement with the x-ray structure (Fig. 7). Excluding the prochiral methyl carbons with ambiguous assignments and side chain 15N sites, 240 out of 327 distances (73%) determined in the [2]-GB1 sample were within 10% of the x-ray determined distances, and 313 out of 327 (96%) were within 25% of the x-ray distance. For [1,3]-GB1, likewise, 168 out of 260 (65%) were within 10%, and 235 (90%) were within 25%. There is a trend evident in the plot: outlying points with shorter-than-expected NMR distances are observed more frequently than those with longer-than-expected NMR distances. We attribute this to the fact that the shorter-than-expected NMR distances are derived from stronger-than-expected cross peaks, which are above the noise floor, whereas the weaker-than-expected cross peaks are not observed at all. In addition, variations between the TEDOR and x-ray distances are expected because the sample preparations are not identical. For example, we previously found significant differences in side chain conformation due to subtle differences in crystal polymorphism45 which more detailed NMR studies could elucidate.
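The percentage criteria used in this comparison amount to a relative-deviation test against the x-ray distance. A minimal sketch follows, applied for illustration to the W43CZ3 distances listed in the Fig. 6 caption; the function names are hypothetical, and the final lines simply recompute the quoted percentages from the quoted counts.

```python
# (15N site, TEDOR distance, x-ray distance) in Angstrom for W43CZ3 (Fig. 6 caption)
w43cz3 = [
    ("W43NE1", 4.10, 4.06),
    ("K31N",   4.53, 4.61),
    ("V54N",   4.55, 4.85),
    ("T53N",   4.90, 5.18),
    ("W43N",   6.02, 6.36),
    ("T44N",   6.04, 6.15),
    ("K31NZ",  6.38, 6.00),
]


def relative_deviation(nmr: float, xray: float) -> float:
    """Fractional difference of the NMR distance from the x-ray distance."""
    return abs(nmr - xray) / xray


def fraction_within(pairs, tolerance: float) -> float:
    """Fraction of (site, nmr, xray) entries whose deviation is within tolerance."""
    hits = sum(1 for _, nmr, xray in pairs if relative_deviation(nmr, xray) <= tolerance)
    return hits / len(pairs)


# Counts quoted in the text: (within 10%, within 25%, total)
counts = {"[2]-GB1": (240, 313, 327), "[1,3]-GB1": (168, 235, 260)}
percentages = {
    sample: (round(100 * n10 / total), round(100 * n25 / total))
    for sample, (n10, n25, total) in counts.items()
}

if __name__ == "__main__":
    print(fraction_within(w43cz3, 0.10), percentages)
```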

Figure 7.


Scatter plot of 15N–13C distances determined by TEDOR experiments and x-ray crystallography (PDB entry 2QMT). (a) Plot including all 15N sites. (b) Plot including only backbone amide sites. The solid line has a slope of 1, to guide the eye, and the dotted lines indicate variations of ±25% from the nominal distance.

Protein structure calculations with heteronuclear distance restraints

We next calculated a de novo structure of GB1 in XPLOR-NIH (Ref. 50) using as experimental restraints the complete set of TEDOR-determined 15N–13C distances. Experimental uncertainties were assumed to have a minimum of 0.1 Å, even in cases where MINUIT reported smaller uncertainties. In addition to distances, TALOS (Ref. 21) was used to determine backbone dihedral angles from the isotropic chemical shifts. Initial structures calculated this way converged to an ∼0.6 Å backbone RMSD fold of GB1, with 36 15N–13C distances violating by more than 0.5 Å. In cases where distance restraints were consistently violated in the majority of lowest energy structures, the uncertainties were increased by 1.0 Å and the calculation repeated. Two restraints (out of the 36), T53N-I6CG1 from [1,3]-GB1 and K10NZ-CG from [2]-GB1, continued to violate despite this increase in uncertainty and were removed altogether. The backbone RMSD converged to 0.35 Å after this round of calculations. The violation threshold was then reduced to 0.3 Å, and another 52 violating distance restraints were relaxed (86 relaxed restraints in total) and three more, W43N-V54CG1 from both [1,3]-GB1 and [2]-GB1 and T53N-I6CG1 from [2]-GB1, were removed. The process was repeated until no violations greater than 0.3 Å remained. Table 1 shows all restraints used in the calculation sorted by the distance in the primary structure. Long-range (|i−j|>3), medium-range (1<|i−j|<4), sequential (|i−j|=1), and intraresidue correlations are listed along with the number of violations seen in each category. The majority of all restraints (∼88%) never violated in any of the calculations.
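The bookkeeping of this iterative protocol can be sketched as follows. This is an illustration only: the set of violated restraints in each round would come from the structure calculation itself, which is not modeled here, and the `Restraint` class, function names, and example restraint names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Restraint:
    name: str
    distance: float        # Angstrom
    uncertainty: float     # Angstrom
    times_relaxed: int = 0


def apply_minimum_uncertainty(restraints, floor=0.1):
    """Enforce a minimum uncertainty, even where the fit reported a smaller one."""
    for r in restraints:
        r.uncertainty = max(r.uncertainty, floor)


def update_restraints(restraints, violated_names, relax_by=1.0):
    """One round: relax first-time violators by relax_by, drop repeat violators."""
    kept = []
    for r in restraints:
        if r.name in violated_names:
            if r.times_relaxed >= 1:
                continue                    # violated again after relaxation: remove
            r.uncertainty += relax_by
            r.times_relaxed += 1
        kept.append(r)
    return kept


if __name__ == "__main__":
    # Hypothetical restraints, for illustration only
    rs = [Restraint("example-1", 4.0, 0.05), Restraint("example-2", 6.0, 0.3)]
    apply_minimum_uncertainty(rs)
    rs = update_restraints(rs, {"example-2"})   # first violation: relaxed
    rs = update_restraints(rs, {"example-2"})   # second violation: removed
    print([r.name for r in rs])
```

Keeping a per-restraint counter makes the "violate once" and "violate twice" categories of Table 1 fall out directly from the update rule.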

Table 1.

TEDOR restraints used in structure calculations.

Restraint type        Never violate (a)   Violate once (b)   Violate twice (c)   Total
Long (d)              108                 23                 4                   135
Medium (e)            53                  4                  0                   57
Sequential (f)        211                 17                 0                   228
Intraresidue          268                 42                 1                   311
Total                 640                 86                 5                   731
Percentage of total   87.6                11.8               0.7

(a) Restraints that did not violate by more than 0.3 Å in a majority of the lowest energy models.
(b) Restraints that violated once and had 1 Å added to their error.
(c) Restraints that violated after being lengthened and were removed from additional calculations.
(d) |i−j|≥4.
(e) 1<|i−j|≤3.
(f) |i−j|=1.
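The restraint categories and the totals in Table 1 can be reproduced with a few lines. The classification thresholds follow the definitions given in the text (long-range for a sequence separation greater than 3, medium-range for separations of 2 or 3); the function name and dictionary layout are illustrative.

```python
def restraint_class(i: int, j: int) -> str:
    """Classify a restraint by the sequence separation of residues i and j."""
    sep = abs(i - j)
    if sep == 0:
        return "intraresidue"
    if sep == 1:
        return "sequential"
    if sep <= 3:
        return "medium"
    return "long"


# Rows of Table 1: (never violate, violate once, violate twice)
table1 = {
    "long":         (108, 23, 4),
    "medium":       (53, 4, 0),
    "sequential":   (211, 17, 0),
    "intraresidue": (268, 42, 1),
}

column_totals = [sum(col) for col in zip(*table1.values())]
grand_total = sum(column_totals)
percentages = [round(100 * c / grand_total, 1) for c in column_totals]

if __name__ == "__main__":
    print(restraint_class(52, 6))        # F52N-I6CG1, a long-range restraint
    print(column_totals, grand_total, percentages)
```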

The violating distance restraints were predominantly in one of two categories: (1) long-range restraints to 13C or 15N sites on flexible side chains and (2) distances involving 13C nuclei with no 15N within ∼4 Å. Examples of the first case are I6CD1, I6CG1∕2, L7CG, L7CD1∕2, K13CE, K13CD, and several Lys NZ resonances. As shown by Ishii and Terao, molecular motion alters the effective dipolar couplings in SSNMR,51 in a manner that for strong couplings will generally result in larger values for r determined by ⟨1∕r³⟩ (the NMR observable) than from ⟨r⟩ (the x-ray diffraction observable). In the case of 13C nuclei with no nearby 15N, the y-scaling factor is highly uncertain, and therefore the covariance between intensity and coupling in this initial rate regime gives the distance estimations a high uncertainty. These effectively serve as long-range distance restraints of quality similar to NOEs.

Once violating distance restraints were removed by this procedure, the resulting structure calculation converged to a family of structures (ten lowest energy out of 260 calculated) with a backbone RMSD of 0.25±0.09 Å and an all atom RMSD of 0.76±0.06 Å. The average structure showed good agreement with the most closely related crystal structure (2QMT), with a bbRMSD of 0.79±0.03 Å. Figure 8 shows the family of best structures with (a) only backbone atoms and (b) all side chain atoms. Most variation among the structures is observed in the turns at the beginning and end of the helix. The side chains of residues in the core of the protein are very well constrained, leading to very high similarity as indicated by the low all atom RMSD. Thus a majority of the variance in the all atom RMSD is from external side chains that are less well constrained. Table 2 shows the average XPLOR energies for this family of structures, which indicate that a majority of the pseudoenergy in these structures arises from the TEDOR distance constraints and the XPLOR angle restraints.

Figure 8.

GB1 structure calculated using TEDOR distance and TALOS dihedral restraints. (a) Backbone trace of the ten lowest energy structures (out of 260 calculated). (b) Side chains drawn in CPK coloring scheme. The backbone RMSD is 0.25±0.09 Å within the family of the ten best NMR structures, and the backbone RMSD in comparison to the crystal structure (2QMT) is 0.79±0.03 Å. The all heavy atom RMSD is 0.76±0.06 Å.

Table 2.

XPLOR calculation energies for the ten lowest energy structures.

Energy term    Average (kcal mol−1)    Standard deviation (kcal mol−1)
Total          228.78                  3.52
Bonds          19.35                   1.18
Angles         80.15                   2.28
VDW            0.03                    0.09
TEDOR          115.96                  2.93
Dihedral       4.56                    0.55
Improper       8.72                    0.76
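
As a quick arithmetic check on Table 2, the following sketch computes the fraction of the total pseudoenergy contributed by each term. The values are copied from the table; the variable and function names are hypothetical and chosen only for illustration.

```python
# Average XPLOR energy terms from Table 2, in kcal/mol.
energies = {
    "Bonds": 19.35,
    "Angles": 80.15,
    "VDW": 0.03,
    "TEDOR": 115.96,
    "Dihedral": 4.56,
    "Improper": 8.72,
}
reported_total = 228.78  # "Total" row of Table 2


def energy_fractions(terms, total):
    """Return each term's share of the total energy as a fraction."""
    return {name: value / total for name, value in terms.items()}


fractions = energy_fractions(energies, reported_total)
component_sum = sum(energies.values())
dominant_share = fractions["TEDOR"] + fractions["Angles"]

if __name__ == "__main__":
    # List terms from largest to smallest contribution.
    for name, frac in sorted(fractions.items(), key=lambda kv: -kv[1]):
        print(f"{name:9s} {frac:6.1%}")
    print(f"Sum of components: {component_sum:.2f} kcal/mol")
    print(f"TEDOR + Angles share: {dominant_share:.1%}")
```

The components sum to the reported total within rounding, and the TEDOR and angle terms together account for roughly 86% of it.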

CONCLUSIONS

We have demonstrated that the TEDOR pulse sequence can be applied to a preparation of GB1 that is uniformly 15N labeled and sparsely 13C labeled. This sparse 13C labeling allows long-range TEDOR distances to be observed for every 13C site in GB1. Multiple 15N–13C mixing times were used to create dipolar coupling trajectories for carbon and nitrogen sites up to 6 Å apart and to detect correlations arising from pairs up to 8 Å apart. These trajectories were fit to exact numerical simulations of TEDOR, allowing 15N–13C distances to be determined with both precision and accuracy. When incorporated into simulated annealing calculations, these distance restraints, combined with dihedral restraints, produced a very high-resolution fold of GB1 (bbRMSD of 0.25±0.09 Å). Substantially fewer distances were required to determine this structure than in solution NMR studies of similar precision, because short-to-medium TEDOR distances can be determined with greater precision and accuracy than 1H–1H NOEs. Because SSNMR has no intrinsic upper molecular weight limit, this approach is promising for the study of larger nanocrystalline proteins, membrane proteins, and insoluble aggregates, in cases where high-resolution 2D 15N–13C spectra can be obtained.

ACKNOWLEDGMENTS

This work was supported by the National Institutes of Health Grant No. R01-GM073770. A.J.N. thanks the Molecular Biophysics Training Program at the University of Illinois for financial support.

References

  1. Castellani F., van Rossum B., Diehl A., Schubert M., Rehbein K., and Oschkinat H., Nature (London) 420, 98 (2002). doi: 10.1038/nature01070
  2. Lange A., Becker S., Seidel K., Giller K., Pongs O., and Baldus M., Angew. Chem., Int. Ed. 44, 2089 (2005). doi: 10.1002/anie.200462516
  3. Zech S. G., Wand A. J., and McDermott A. E., J. Am. Chem. Soc. 127, 8618 (2005). doi: 10.1021/ja0503128
  4. Zhou D. H., Shea J. J., Nieuwkoop A. J., Franks W. T., Wylie B. J., Mullen C., Sandoz D., and Rienstra C. M., Angew. Chem., Int. Ed. 46, 8380 (2007). doi: 10.1002/anie.200702905
  5. Franks W. T., Wylie B. J., Schmidt H. L. F., Nieuwkoop A. J., Mayrhofer R. M., Shah G. J., Graesser D. T., and Rienstra C. M., Proc. Natl. Acad. Sci. U.S.A. 105, 4621 (2008). doi: 10.1073/pnas.0712393105
  6. Wasmer C., Lange A., Van Melckebeke H., Siemer A. B., Riek R., and Meier B. H., Science 319, 1523 (2008). doi: 10.1126/science.1151839
  7. Opella S. J. and Marassi F. M., Chem. Rev. (Washington, D.C.) 104, 3587 (2004). doi: 10.1021/cr0304121
  8. Gammeren A. J., Hulsbergen F. B., Hollander J. G., and de Groot H. J., J. Biomol. NMR 31, 279 (2005). doi: 10.1007/s10858-005-1604-8
  9. Kobayashi M., Matsuki Y., Yumen I., Fujiwara T., and Akutsu H., J. Biomol. NMR 36, 279 (2006). doi: 10.1007/s10858-006-9094-x
  10. Frericks H. L., Zhou D. H., Yap L. L., Gennis R. B., and Rienstra C. M., J. Biomol. NMR 36, 55 (2006). doi: 10.1007/s10858-006-9070-5
  11. Etzkorn M., Martell S., Andronesi O. C., Seidel K., Engelhard M., and Baldus M., Angew. Chem., Int. Ed. 46, 459 (2007). doi: 10.1002/anie.200602139
  12. Kijac A. Z., Li Y., Sligar S. G., and Rienstra C. M., Biochemistry 46, 13696 (2007). doi: 10.1021/bi701411g
  13. Hiller M., Higman V. A., Jehle S., van Rossum B. J., Kuhlbrandt W., and Oschkinat H., J. Am. Chem. Soc. 130, 408 (2008). doi: 10.1021/ja077589n
  14. Li Y., Berthold D. A., Gennis R. B., and Rienstra C. M., Protein Sci. 17, 199 (2008). doi: 10.1110/ps.073225008
  15. Heise H., Hoyer W., Becker S., Andronesi O. C., Riedel D., and Baldus M., Proc. Natl. Acad. Sci. U.S.A. 102, 15871 (2005). doi: 10.1073/pnas.0506109102
  16. Tycko R., Q. Rev. Biophys. 39, 1 (2006). doi: 10.1017/S0033583506004173
  17. Kloepper K. D., Woods W. S., Winter K. A., George J. M., and Rienstra C. M., Protein Expression Purif. 48, 112 (2006). doi: 10.1016/j.pep.2006.02.009
  18. Siemer A. B., Arnold A. A., Ritter C., Westfeld T., Ernst M., Riek R., and Meier B. H., J. Am. Chem. Soc. 128, 13224 (2006). doi: 10.1021/ja063639x
  19. Kloepper K. D., Hartman K. L., Ladror D. T., and Rienstra C. M., J. Phys. Chem. B 111, 13353 (2007). doi: 10.1021/jp077036z
  20. Varga K., Tian L., and McDermott A. E., Biochim. Biophys. Acta 1774, 1604 (2007).
  21. Cornilescu G., Delaglio F., and Bax A., J. Biomol. NMR 13, 289 (1999). doi: 10.1023/A:1008392405740
  22. Wylie B. J., Schwieters C. D., Oldfield E., and Rienstra C. M., J. Am. Chem. Soc. 131, 985 (2009). doi: 10.1021/ja804041p
  23. Gullion T. and Schaefer J., J. Magn. Reson. (1969-1992) 81, 196 (1989). doi: 10.1016/0022-2364(89)90280-1
  24. Hing A., Vega S., and Schaefer J., J. Magn. Reson. (1969-1992) 96, 205 (1992). doi: 10.1016/0022-2364(92)90305-Q
  25. Michal C. A. and Jelinski L. W., J. Am. Chem. Soc. 119, 9059 (1997). doi: 10.1021/ja9711730
  26. Jaroniec C. P., Lansing J. C., Tounge B. A., Belenky M., Herzfeld J., and Griffin R. G., J. Am. Chem. Soc. 123, 12929 (2001). doi: 10.1021/ja016923r
  27. Jaroniec C. P., Filip C., and Griffin R. G., J. Am. Chem. Soc. 124, 10728 (2002). doi: 10.1021/ja026385y
  28. Jaroniec C. P., MacPhee C. E., Astrof N. S., Dobson C. M., and Griffin R. G., Proc. Natl. Acad. Sci. U.S.A. 99, 16748 (2002). doi: 10.1073/pnas.252625999
  29. Jaroniec C. P., MacPhee C. E., Bajaj V. S., McMahon M. T., Dobson C. M., and Griffin R. G., Proc. Natl. Acad. Sci. U.S.A. 101, 711 (2004). doi: 10.1073/pnas.0304849101
  30. Helmus J. J., Nadaud P. S., Hofer N., and Jaroniec C. P., J. Chem. Phys. 128, 052314 (2008). doi: 10.1063/1.2817638
  31. LeMaster D. M., J. Am. Chem. Soc. 118, 9255 (1996). doi: 10.1021/ja960877r
  32. Hong M., J. Magn. Reson. 139, 389 (1999). doi: 10.1006/jmre.1999.1805
  33. Wylie B. J., Sperling L. J., Frericks H. L., Shah G. J., Franks W. T., and Rienstra C. M., J. Am. Chem. Soc. 129, 5318 (2007). doi: 10.1021/ja0701199
  34. Ishii Y., Ashida J., and Terao T., Chem. Phys. Lett. 246, 439 (1995). doi: 10.1016/0009-2614(95)01136-5
  35. Bennett A. E., Rienstra C. M., Griffiths J. M., Zhen W. G., Lansbury P. T., and Griffin R. G., J. Chem. Phys. 108, 9463 (1998). doi: 10.1063/1.476420
  36. Delaglio F., Grzesiek S., Vuister G. W., Zhu G., Pfeifer J., and Bax A., J. Biomol. NMR 6, 277 (1995). doi: 10.1007/BF00197809
  37. Veshtort M. and Griffin R. G., J. Magn. Reson. 178, 248 (2006). doi: 10.1016/j.jmr.2005.07.018
  38. Wylie B. J., Sperling L. J., and Rienstra C. M., Phys. Chem. Chem. Phys. 10, 405 (2008). doi: 10.1039/b710736f
  39. Raleigh D. P., Levitt M. H., and Griffin R. G., Chem. Phys. Lett. 146, 71 (1988). doi: 10.1016/0009-2614(88)85051-6
  40. Levitt M. H., Raleigh D. P., Creuzet F., and Griffin R. G., J. Chem. Phys. 92, 6347 (1990). doi: 10.1063/1.458314
  41. Helmle M., Lee Y. K., Verdegem P. J. E., Feng X., Karlsson T., Lugtenburg J., de Groot H. J. M., and Levitt M. H., J. Magn. Reson. 140, 379 (1999). doi: 10.1006/jmre.1999.1843
  42. Duma L., Hediger S., Lesage A., Sakellariou D., and Emsley L., J. Magn. Reson. 162, 90 (2003). doi: 10.1016/S1090-7807(02)00174-X
  43. Igumenova T. I. and McDermott A. E., J. Magn. Reson. 164, 270 (2003). doi: 10.1016/S1090-7807(03)00239-8
  44. Franks W. T., Zhou D. H., Wylie B. J., Money B. G., Graesser D. T., Frericks H. L., Sahota G., and Rienstra C. M., J. Am. Chem. Soc. 127, 12291 (2005). doi: 10.1021/ja044497e
  45. Frericks Schmidt H. L., Sperling L. J., Gao Y. G., Wylie B. J., Boettcher J. M., Wilson S. R., and Rienstra C. M., J. Phys. Chem. B 111, 14362 (2007). doi: 10.1021/jp075531p
  46. Mueller K. T., Jarvie T. P., Aurentz D. J., and Roberts B. W., Chem. Phys. Lett. 242, 535 (1995). doi: 10.1016/0009-2614(95)00773-W
  47. Mueller K. T., J. Magn. Reson., Ser. A 113, 81 (1995). doi: 10.1006/jmra.1995.1059
  48. Schaefer J., J. Magn. Reson. 137, 272 (1999). doi: 10.1006/jmre.1998.1643
  49. Peng X. H., Libich D., Janik R., Harauz G., and Ladizhansky V., J. Am. Chem. Soc. 130, 359 (2008). doi: 10.1021/ja076658v
  50. Schwieters C. D., Kuszewski J. J., Tjandra N., and Clore G. M., J. Magn. Reson. 160, 65 (2003). doi: 10.1016/S1090-7807(02)00014-9
  51. Ishii Y., Terao T., and Hayashi S., J. Chem. Phys. 107, 2760 (1997). doi: 10.1063/1.474633

Articles from The Journal of Chemical Physics are provided here courtesy of American Institute of Physics
