Author manuscript; available in PMC: 2008 Aug 22.
Published in final edited form as: J Am Chem Soc. 2006 Jun 21;128(24):7855–7870. doi: 10.1021/ja060406x

Variability of the 15N Chemical Shielding Tensors in the B3 Domain of Protein G from 15N Relaxation Measurements at Several Fields

Implications for Backbone Order Parameters

Jennifer B Hall 1, David Fushman 1,*
PMCID: PMC2519110  NIHMSID: NIHMS61090  PMID: 16771499

Abstract

We applied a combination of 15N relaxation and CSA/dipolar cross-correlation measurements at five magnetic fields (9.4, 11.7, 14.1, 16.4, and 18.8 Tesla) to determine the 15N chemical shielding tensors for backbone amides in protein G in solution. The data were analyzed using various model-independent approaches and those based on the Lipari-Szabo approximation, all of them yielding similar results. The results indicate a range of site-specific values of the anisotropy (CSA) and orientation of the 15N chemical shielding tensor, similar to those in ubiquitin. Assuming a Gaussian distribution of the 15N CSA values, the mean anisotropy is -173.9 to -177.2 ppm (for 1.02-Å NH-bond length) and the site-to-site CSA variability is ±17.6 to ±21.4 ppm, depending on the method used. This CSA variability is significantly larger than derived previously for ribonuclease H or recently, using “meta-analysis”, for ubiquitin. Standard interpretation of 15N relaxation studies of backbone dynamics in proteins involves an a priori assumption of a uniform 15N CSA. We show that this assumption leads to a significant discrepancy between the order parameters obtained at different fields. Using the site-specific CSAs obtained from our study removes this discrepancy and allows simultaneous fit of relaxation data at all five fields to Lipari-Szabo spectral densities. These findings emphasize the necessity of taking into account the variability of 15N CSA for accurate analysis of protein dynamics from 15N relaxation measurements.

Introduction

The chemical shielding tensor (CST) reflects the local electronic environment of a nucleus under nuclear magnetic resonance observation and therefore contains valuable information on the local chemical structure and conformation of a molecule. Fast random molecular tumbling in solution averages the individual components of the tensor, so that only its trace, reflected in the isotropic chemical shift, is directly observed in high-resolution NMR spectra. Site-specific variations in the isotropic chemical shifts allow separation of NMR signals from various sites in macromolecules, and deviations of chemical shifts from their random coil values are widely used for secondary 1,2 and tertiary 3 structure predictions in proteins. The anisotropic components of the tensor contribute directly to nuclear spin relaxation; their knowledge is therefore essential for NMR applications to protein dynamics 4-8, especially at higher field strengths, the development of TROSY techniques to study large molecules 9, and for the use of chemical shielding anisotropy for structure refinement 10.

Understanding of the relationship between the chemical shielding tensor and protein structure is likely to facilitate the development of new approaches to structure prediction and to refine the theoretical models for chemical shielding calculations. Amide 15N CSTs in proteins present a particular challenge, because they are susceptible to a variety of factors, including conformations (torsion angles) of both current and preceding residues, hydrogen bonding, solvent accessibility, long range electrostatics, etc 11-14.

The complete chemical shielding tensor could, in principle, be measured directly by solid-state NMR methods, and such studies provided valuable information on 15N CSTs in short peptides 15-21. However, applications of these techniques to uniformly labeled proteins are still in development. Recent solution NMR approaches based on orientation-dependent changes in 15N resonances in weakly aligned protein solutions 22-24 are very promising, although the accuracy and precision of these measurements do not yet allow site-specific determination of the CST values.

It has been demonstrated 4-6,8,25,26 that the anisotropy (CSA) of the 15N CST can be directly obtained from 15N relaxation measurements in proteins in solution. Measurements in ubiquitin 5,6 revealed a range of site-specific backbone 15N CSA values, from approximately -120 to -220 ppm, with a mean of -157 ppm and a standard deviation (not the site-to-site variability) of 19 ppm. This range includes data for both conformationally well-defined amides and those located in the flexible regions. The angle between the unique axis of the 15N CST and the NH bond was found to vary from 6° to 26°, with a mean of 15.7° and a standard deviation (std) of 5° 5,6. These findings were confirmed by independent relaxation studies in ubiquitin 27. An average CSA that is higher in absolute value, -173 ppm (converted to an NH distance of 1.02 Å), with site-to-site variation of up to ±17 ppm was derived from shifts in peak positions in weakly aligned solutions of ubiquitin 24, while recent MAS studies 28 of aligned ubiquitin in a similar medium yielded -162.0 ± 4.3 ppm for the mean 15N CSA and 18.6° ± 0.5° for the angle, in agreement with those from previous 15N relaxation data 5,6. A similar range of site-specific 15N CSA values (-129 to -213 ppm) was reported for ribonuclease H 8, although with a somewhat different mean (-172 ppm), for a selection of well-ordered amides. For this subset of residues, the site-to-site variability in CSA was estimated to be ±5.5 ppm (upper limit ±9.6 ppm at 95% confidence), assuming a Gaussian distribution for 15N CSA values. This number is relatively small, given the ∼30 ppm range of variation in the isotropic chemical shifts, and could be a result of the limited experimental precision in the CSA data, as the experimental uncertainties (±13 ppm) in the individual 15N CSA values in that paper are noticeably bigger than the reported variability. A recent study 29 combining new experimental measurements in ubiquitin with the literature data 5,27 resulted in an even more extreme mean 15N CSA of -179.6 ppm (converted to an NH distance of 1.02 Å) and a CSA variability comparable to that in ribonuclease H. However, the results of another recent study based on a combination of fourteen auto- and cross-correlation rates in ubiquitin 30 agree with the earlier data 5,6, and give average CSAs ranging from -146.4 to -164.0 ppm and angles from 17.5° to 18.9°, depending on the choice of local motional model.

While the existence of some site-specific variability in 15N CSA is now established, it still remains to be understood whether the differences between the reported data reflect some protein specificity of the CSA distribution or differences in the experimental approaches and/or in data analyses. Measurements in other proteins and with higher experimental precision are therefore required in order to address this issue.

It is even more important to understand the effect of site-specific variations in 15N CSA on the motional characteristics of proteins derived from 15N relaxation data, in order to improve the accuracy of NMR approaches to protein dynamics. Although computer simulations 31 show that ignoring the variability in CSA values could significantly affect the NMR-derived picture of the backbone dynamics, a direct experimental analysis of this issue has not been at hand.

Here we apply a combination of NMR relaxation and cross-correlation measurements at several magnetic fields to determine the 15N chemical shielding tensors in a 56-amino acid protein, the third immunoglobulin-binding domain of protein G (further called GB3). We use several model-independent methods of data analysis to derive the 15N CSA values and compare them with the values obtained assuming the Lipari-Szabo spectral densities. We then analyze the effect of these site-specific CSA values on the LS analysis of the backbone dynamics and on the derived order parameters in GB3.

Materials and Methods

Sample preparation and NMR measurements

The GB3 domain construct (56 a.a.) in these studies was the same as in 32. The protein was a generous gift from Dr. Ad Bax, NIH. All measurements were performed on the same protein sample containing 1.8 mM of uniformly 15N-enriched GB3 dissolved in 30 mM phosphate buffer (pH 5.8) containing 9% D2O. Sample temperature was set to 24 °C using a glycerol temperature calibration sample, with each spectrometer being individually calibrated.

Relaxation measurements included rates of 15N longitudinal (R1) and transverse (R2) relaxation and the rate of 15N-1H cross-relaxation measured via steady-state 15N{1H} nuclear Overhauser effect (NOE). These experiments were performed at five magnetic fields of 9.4, 11.7, 14.1, 16.4, and 18.8 Tesla, and used standard pulse sequences (e.g. 33). The NOEs were determined using a flip-back measurement scheme 34 for water suppression; the recycling delay was 4-5 s (see Supporting Table 1 for the complete list of relaxation delays). The R1, R2, and NOE measurements at 9.4 Tesla were performed twice, on different instruments (at UMD and at CERM), yielding similar results. 15N CSA/15N-1H dipolar cross-correlation measurements were performed using the method described in 35,36. Transverse cross-correlation rates (ηxy) were measured at 9.4, 11.7, 14.1, and 18.8 Tesla, while longitudinal cross-correlation (ηz) was measured at 9.4, 11.7, and 14.1 Tesla. To verify that the ηz values were not affected by dipolar cross-relaxation of proton magnetization, the ηz measurements at 14.1 Tesla were repeated on a perdeuterated 15N-labeled GB3 sample and yielded the same results as for the protonated sample (Hall & Fushman, ms in preparation). Note that the ηxy values at 11.7 and 14.1 Tesla were also measured using the spin-state selection method, yielding similar results 37.

The spectra were recorded in an interleaved fashion, as detailed in 32, and then processed using XWINNMR. Further analysis including automatic peak picking and integration, relaxation curve fitting, and data analysis was performed using an in-house suite of Matlab programs. The program DYNAMICS 32,33 was modified to include the site-specific 15N CSA as an additional fitting parameter.

Auto-relaxation and cross-correlation rates were obtained from least-squares fitting of peak intensities in the corresponding series of 2D spectra to a mono-exponential decay. The heteronuclear NOE values were obtained from the ratio of peak intensities in the NOE and NONOE experiments 33. Experimental errors in peak intensities were estimated in two ways 38: by integrating regions of spectra containing no cross peaks or, where applicable, from repeated (quadruplicate) measurements, using the method of 39. The errors in the rates were estimated using a Monte Carlo simulation of 500 experimental data sets per residue and assuming a normal distribution of experimental errors in peak intensities. The experimental errors in relaxation rates were around 1% on average: 1.16%, 0.83%, 1.43%, 1.09%, and 1.37% for R1; 1.21%, 1.21%, 1.33%, 0.96%, and 1.30% for R2; and 1.13%, 1.14%, 1.05%, 1.00%, and 1.06% for NOE values measured at 9.4, 11.7, 14.1, 16.4, and 18.8 Tesla, respectively. The average errors in ηxy were 1.37%, 1.50%, 1.67%, and 1.47% at 9.4, 11.7, 14.1 and 18.8 T, respectively; the errors in ηz were 1.27%, 1.16%, and 1.52% at 9.4, 11.7, and 14.1 T.
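The Monte Carlo error estimate described above can be illustrated with a minimal Python sketch. It is not the in-house Matlab code used in this work: the fit is a simplified log-linear regression rather than a fit of the intensities themselves, and the function names, delays, decay rate, and noise level are hypothetical choices made only for illustration.

```python
import math
import random
import statistics

def fit_decay(times, intensities):
    """Log-linear least-squares fit of I(t) = I0 * exp(-R * t); returns (R, I0)."""
    logs = [math.log(v) for v in intensities]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    slope = sxy / sxx
    return -slope, math.exp(y_mean - slope * t_mean)

def monte_carlo_rate_error(times, intensities, sigma, n_sets=500, seed=0):
    """Return (rate, error): fitted rate and the spread of rates refitted
    from n_sets synthetic data sets with Gaussian intensity noise."""
    rng = random.Random(seed)
    rate, i0 = fit_decay(times, intensities)
    model = [i0 * math.exp(-rate * t) for t in times]
    rates = []
    for _ in range(n_sets):
        noisy = [v + rng.gauss(0.0, sigma) for v in model]
        rates.append(fit_decay(times, noisy)[0])
    return rate, statistics.stdev(rates)

# Hypothetical example: six delays (s), a 5 s^-1 decay, noise of 1% of I(0).
delays = [0.01, 0.05, 0.10, 0.15, 0.20, 0.30]
peaks = [math.exp(-5.0 * t) for t in delays]
rate, err = monte_carlo_rate_error(delays, peaks, sigma=0.01)
```

The fixed seed only makes the sketch reproducible; the default of 500 synthetic data sets mirrors the number quoted in the text.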

Determination of 15N CSA and the backbone dynamics from the experimental data

The 15N chemical shielding anisotropies were derived from the measured relaxation and cross-correlation rates using five different methods outlined below.

The R/η method

This method is a generalization of that of ref 5 and is based on the fact that the ratio of the corresponding cross-correlation and auto-relaxation rates is independent, to a good approximation, of the spectral densities J(ω) 25,7:

$$\frac{R_2'}{\eta_{xy}} = \frac{R_1'}{\eta_z} = \frac{d^2 + c^2}{2\,d\,c_g} \tag{1}$$

Here $d = -\mu_0\gamma_H\gamma_N\hbar/(8\pi r_{HN}^3)$ is the strength of the 15N-1H dipolar coupling, and $c = \gamma_N B_0\Delta\sigma/3 = -\omega_N\Delta\sigma/3$ and $c_g = -\omega_N\Delta\sigma_g/3$ represent the 15N CSA contributions to auto-relaxation and cross-correlation rates, respectively, where 40

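$$\Delta\sigma = \left[\sigma_{xx}^2 + \sigma_{yy}^2 + \sigma_{zz}^2 - \left(\sigma_{xx}\sigma_{yy} + \sigma_{xx}\sigma_{zz} + \sigma_{yy}\sigma_{zz}\right)\right]^{1/2}, \tag{2}$$

$$\Delta\sigma_g = (\sigma_{zz} - \sigma_{yy})\,P_2(\cos\beta_z) + (\sigma_{yy} - \sigma_{xx})\,P_2(\cos\beta_x); \tag{3}$$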

σii are the principal values of the 15N CST, P2(x) is the second-order Legendre polynomial, and βz, βx are the intervening angles between the principal axes (z, x) of the 15N CST and the N-H bond vector. rHN is the internuclear distance (here assumed to be 1.02 Å for all backbone amides), γH, γN and ωH, ωN are the gyromagnetic ratios and the absolute values of the Larmor frequencies, respectively, of 1H and 15N, and μ0 is the permeability of vacuum. Δσ has the meaning of the effective anisotropy of the 15N CST, and will be referred to as the 15N CSA throughout this paper; Δσg has the meaning of a “projection” of the CSA tensor onto the NH vector and can be represented as Δσ times an orientation factor. Note that here we use the convention that σzz≤σyy≤σxx and define the principal axes of the 15N CST such that its z-axis corresponds to the least shielded component (σzz), i.e. is close in orientation to the NH bond. The other two axes are then defined such that the y-axis is approximately orthogonal to the peptide plane, and the x-axis is located approximately in-plane. Under the assumption of an axial symmetry of the 15N CST (σxx = σyy = σ⊥, σzz = σ∥), Eqs. 2 and 3 simplify into their more “conventional” form (e.g. 25):

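$$\Delta\sigma = \sigma_\parallel - \sigma_\perp \quad\text{and}\quad \Delta\sigma_g = \Delta\sigma\,P_2(\cos\beta_z). \tag{4}$$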
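To give a feel for the magnitudes involved, the following Python sketch evaluates d, c, and the axially symmetric Δσg of Eq. 4. The physical constants are standard values (the 15N gyromagnetic ratio is approximate), and the function names, the field of 14.1 T, and the CSA of -174 ppm are illustrative assumptions rather than values prescribed by the text.

```python
import math

MU0 = 4e-7 * math.pi        # permeability of vacuum (T m / A)
HBAR = 1.054571817e-34      # reduced Planck constant (J s)
GAMMA_H = 2.6752218744e8    # 1H gyromagnetic ratio (rad / s / T)
GAMMA_N = -2.7116e7         # 15N gyromagnetic ratio (rad / s / T), approximate
R_NH = 1.02e-10             # N-H distance assumed in the text (m)

def dipolar_constant(r_nh=R_NH):
    """d = -mu0 * gammaH * gammaN * hbar / (8 pi r^3), in s^-1; positive since gammaN < 0."""
    return -MU0 * GAMMA_H * GAMMA_N * HBAR / (8.0 * math.pi * r_nh ** 3)

def csa_constant(b0_tesla, delta_sigma_ppm):
    """c = -omegaN * delta_sigma / 3, with omegaN = |gammaN| * B0."""
    omega_n = abs(GAMMA_N) * b0_tesla
    return -omega_n * delta_sigma_ppm * 1e-6 / 3.0

def p2(x):
    """Second-order Legendre polynomial."""
    return 0.5 * (3.0 * x * x - 1.0)

def projected_csa(delta_sigma_ppm, beta_z_deg):
    """Delta sigma_g for an axially symmetric tensor (Eq. 4), in ppm."""
    return delta_sigma_ppm * p2(math.cos(math.radians(beta_z_deg)))

d = dipolar_constant()                 # roughly 3.6e4 s^-1
c = csa_constant(14.1, -174.0)         # roughly 2.2e4 s^-1 at 14.1 T
```

With these inputs c becomes comparable to d at the higher fields, which is why the CSA term matters increasingly as the field grows.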

The primes in Eq. 1 and throughout this paper indicate “reduced” relaxation rates: the contributions from high-frequency components of the spectral density were subtracted as follows 6:

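$$R_1' = R_1\left[1 - 1.249\left|\frac{\gamma_N}{\gamma_H}\right|(1 - \mathrm{NOE})\right] = 3\,(d^2 + c^2)\,J(\omega_N), \tag{5}$$

$$R_2' = R_2 - 1.079\left|\frac{\gamma_N}{\gamma_H}\right| R_1\,(1 - \mathrm{NOE}) = 0.5\,(d^2 + c^2)\left[4J(0) + 3J(\omega_N)\right] \tag{6}$$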
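The subtraction in Eqs. 5 and 6 amounts to a two-line computation. A minimal Python sketch follows; the function name and the input rates are hypothetical, and the 15N gyromagnetic ratio is an approximate value.

```python
GAMMA_H = 2.6752218744e8   # 1H gyromagnetic ratio (rad / s / T)
GAMMA_N = -2.7116e7        # 15N gyromagnetic ratio (rad / s / T), approximate

def reduced_rates(r1, r2, noe):
    """Return (R1', R2'): rates with the high-frequency terms removed (Eqs. 5, 6)."""
    high_freq = abs(GAMMA_N / GAMMA_H) * r1 * (1.0 - noe)
    return r1 - 1.249 * high_freq, r2 - 1.079 * high_freq

# Hypothetical rates (s^-1) and NOE for a well-ordered amide.
r1_red, r2_red = reduced_rates(r1=2.0, r2=5.0, noe=0.75)
```

For these inputs the correction is about 3% of R1 and about 1% of R2.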

Equations (1) can be recast to yield a linear dependence on ωN2,

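$$\frac{2\omega_N R_2'}{\eta_{xy}} = \frac{2\omega_N R_1'}{\eta_z} = -\frac{3d}{\Delta\sigma_g} - \frac{(\Delta\sigma)^2}{3d\,\Delta\sigma_g}\,\omega_N^2, \tag{7}$$

which can then be fit to a straight line, m · x + b (where $x = \omega_N^2$), using a simple linear regression. This form allows a direct determination of Δσg and Δσ from the intercept b and slope m of this line: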

$$\Delta\sigma_g = -\frac{3d}{b}; \tag{8}$$

$$\Delta\sigma = -3d\left(\frac{m}{b}\right)^{1/2}. \tag{9}$$

The choice of the sign in Eq. 9 reflects negative 15N CSA, according to solid-state NMR data. For an axially symmetric 15N CST this gives 5,6 (cf. Eq. 4) $\sigma_\parallel - \sigma_\perp = -3d\,(m/b)^{1/2}$ and $P_2(\cos\beta_z) = (m\cdot b)^{-1/2}$.

It should be mentioned here that the CSA parameters (Δσ, Δσg) obtained using this method are independent of the motional characteristics of the molecule, as discussed in ref 25.
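The R/η procedure can be sketched in Python as an ordinary linear regression of 2ωN·R′/η against ωN2 followed by Eqs. 8 and 9. The sketch assumes unweighted least squares, an approximate value of d for a 1.02 Å bond, and synthetic noise-free ratios generated from Eq. 1; all function names and input values are hypothetical.

```python
import math

GAMMA_N = -2.7116e7   # 15N gyromagnetic ratio (rad / s / T), approximate
D = 3.604e4           # dipolar constant d for r_HN = 1.02 A (s^-1), approximate

def linear_fit(x, y):
    """Unweighted least-squares line y = m * x + b; returns (m, b)."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    m = sxy / sxx
    return m, y_mean - m * x_mean

def csa_from_rate_ratios(b0_tesla, ratios, d=D):
    """ratios[i] = R2'/eta_xy (or R1'/eta_z) at field b0_tesla[i].
    Returns (delta_sigma, delta_sigma_g) in ppm and P2(cos beta_z)."""
    omega = [abs(GAMMA_N) * b for b in b0_tesla]
    m, b = linear_fit([w * w for w in omega],
                      [2.0 * w * r for w, r in zip(omega, ratios)])
    delta_sigma_g = -3.0 * d / b                 # Eq. 8
    delta_sigma = -3.0 * d * math.sqrt(m / b)    # Eq. 9
    return delta_sigma * 1e6, delta_sigma_g * 1e6, 1.0 / math.sqrt(m * b)

def synthetic_ratio(b0, ds_ppm, dsg_ppm, d=D):
    """Right-hand side of Eq. 1 for given CSA parameters (test data only)."""
    w = abs(GAMMA_N) * b0
    c, cg = -w * ds_ppm * 1e-6 / 3.0, -w * dsg_ppm * 1e-6 / 3.0
    return (d * d + c * c) / (2.0 * d * cg)

fields = [9.4, 11.7, 14.1, 18.8]
ratios = [synthetic_ratio(b, -174.0, -160.0) for b in fields]
ds, dsg, p2_value = csa_from_rate_ratios(fields, ratios)
```

Because the spectral densities cancel in the ratio, no motional parameter appears anywhere in this calculation.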

The 2R2-R1 method

This method is based on a quadratic field dependence of the following combination of the auto-relaxation rates (e.g. 6),

$$2R_2' - R_1' = 4d^2 J(0) + \tfrac{4}{9}\,J(0)\,(\Delta\sigma)^2\,\omega_N^2, \tag{10}$$

which allows a direct determination of J(0) and Δσ from the intercept b and the slope m of the line m · ωN2 + b representing a linear dependence of 2R2′ - R1′ on ωN2:

$$J(0) = \frac{b}{4d^2}; \tag{11}$$

$$\Delta\sigma = -3d\left(\frac{m}{b}\right)^{1/2}. \tag{12}$$

In this method, the spectral density J(0) is determined solely from the intercept of the fitting line, and therefore is independent of the 15N CSA. We assume throughout this paper that the conformational exchange contribution to R2 is negligible, which holds for all residues in GB3 except possibly Val39 32. When present, conformational exchange contribution (in the case of fast exchange) has the same field dependence as the (Δσ)2 term (e.g. Eqs.7,10), and special care is required in order to separate them 6,7.
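A minimal Python sketch of the 2R2-R1 analysis is given below, assuming a weighted least-squares line (weights 1/σ2), negligible conformational exchange, and an approximate value of d. The function names, the value J(0) = 1.3 ns, and the 1% uncertainties are hypothetical; the data are synthetic and noise-free.

```python
import math

GAMMA_N = -2.7116e7   # 15N gyromagnetic ratio (rad / s / T), approximate
D = 3.604e4           # dipolar constant d for r_HN = 1.02 A (s^-1), approximate

def weighted_linear_fit(x, y, sigma):
    """Weighted least-squares line y = m * x + b with weights 1/sigma^2."""
    w = [1.0 / (s * s) for s in sigma]
    s0 = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = s0 * sxx - sx * sx
    return (s0 * sxy - sx * sy) / det, (sxx * sy - sx * sxy) / det

def j0_and_csa(b0_tesla, combo, combo_err, d=D):
    """combo[i] = 2*R2' - R1' at field b0_tesla[i].
    Returns J(0) in s (Eq. 11) and delta_sigma in ppm (Eq. 12)."""
    x = [(abs(GAMMA_N) * b) ** 2 for b in b0_tesla]
    m, b = weighted_linear_fit(x, combo, combo_err)
    return b / (4.0 * d * d), -3.0 * d * math.sqrt(m / b) * 1e6

# Synthetic data generated from Eq. 10 with J(0) = 1.3 ns and CSA = -174 ppm.
fields = [9.4, 11.7, 14.1, 16.4, 18.8]
j0_true, ds_true = 1.3e-9, -174.0e-6
combo = [4.0 * D * D * j0_true
         + (4.0 / 9.0) * j0_true * ds_true ** 2 * (abs(GAMMA_N) * b) ** 2
         for b in fields]
j0, ds = j0_and_csa(fields, combo, [0.01 * v for v in combo])
```

The intercept alone fixes J(0), so the recovered J(0) does not depend on the value assumed for the CSA.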

The 2ηxy - ηz method

This method utilizes a linear field dependence of the following combination of the cross-correlation rates:

$$2\eta_{xy} - \eta_z = -\tfrac{8}{3}\,d\,\Delta\sigma_g\,J(0)\,\omega_N = m\,\omega_N, \tag{13}$$

which allows determination of the product, Δσg·J(0), directly from the slope m of the fitting line with zero intercept:

$$\Delta\sigma_g\,J(0) = -\frac{3m}{8d}. \tag{14}$$

This approach has the advantage over the abovementioned methods in that (1) it is not affected by the possible conformational exchange contribution to R2 and (2) it does not require correction for the high-frequency components of the spectral density (cf. Eqs.5, 6). The drawback is that it does not allow separate determination of Δσg and J(0). If one of these parameters is known (e.g. J(0) from the 2R2-R1 method), then the other one (in this case, Δσg) can be directly obtained from Eq.14.
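As an illustration, the zero-intercept fit of Eq. 13 and the subsequent use of a known J(0) can be sketched in Python as follows. The function names, the approximate value of d, and the synthetic inputs (J(0) = 1.3 ns, Δσg = -160 ppm) are assumptions made for the example only.

```python
GAMMA_N = -2.7116e7   # 15N gyromagnetic ratio (rad / s / T), approximate
D = 3.604e4           # dipolar constant d for r_HN = 1.02 A (s^-1), approximate

def slope_through_origin(x, y):
    """Least-squares slope of a line y = m * x constrained to zero intercept."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

def projected_csa_times_j0(b0_tesla, combo, d=D):
    """combo[i] = 2*eta_xy - eta_z at field b0_tesla[i].
    Returns delta_sigma_g * J(0) (Eq. 14), with delta_sigma_g as a fraction."""
    omega = [abs(GAMMA_N) * b for b in b0_tesla]
    return -3.0 * slope_through_origin(omega, combo) / (8.0 * d)

# Synthetic data from Eq. 13; eta_z is available at three fields only.
fields = [9.4, 11.7, 14.1]
j0_true, dsg_true = 1.3e-9, -160.0e-6
combo = [-(8.0 / 3.0) * D * dsg_true * j0_true * abs(GAMMA_N) * b for b in fields]
product = projected_csa_times_j0(fields, combo)
dsg_ppm = product / j0_true * 1e6   # needs J(0) from elsewhere, e.g. the 2R2-R1 fit
```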

Analyses of the relaxation data using Lipari-Szabo approximation

While the methods outlined above are truly independent of the model of local and overall motion, the following approaches use a specific, so-called “model-free” or Lipari-Szabo (LS) form of the spectral density function 41-43 that describes the backbone dynamics in terms of an order parameter S and a correlation time τloc of local motion.

“Standard” Lipari-Szabo approach (LS)

The now standard, LS-type analysis of the relaxation data (R1, R2, NOE) (see e.g. 33,44) was performed using the program DYNAMICS and assuming a uniform 15N CSA value, as described in 32,33. Up to eight motional models (listed in 33) were considered per residue, depending on the number of available observables. The overall tumbling of GB3 was assumed anisotropic, described by the average diffusion tensor shown in Table 1 (see text). For amides in the loop regions, where the NH-vector orientation is less well defined than in the elements of secondary structure, we adopted a conservative approach, in that the overall tumbling was assumed isotropic, in order to avoid bias by a particular loop conformation captured in the crystal structure. Using the anisotropic diffusion model and crystal structure coordinates for residues in the loop regions resulted in slightly different values of the order parameters32 but did not alter the conclusions of the analysis. The same approach was also adopted for the other LS-based models throughout this paper.

Table 1. Statistics of the 15N CSA values in GB3 determined here using various methods.

| Method | Analyzed set of residues | Number of residues | max(Δσ)^a (ppm) | min(Δσ)^b (ppm) | <Δσ>^c (ppm) | μ^d (ppm) | median^e (ppm) | <δΔσ>^f (ppm) | std(Δσ)^g (ppm) | Λ^h (ppm) |
|---|---|---|---|---|---|---|---|---|---|---|
| 2R2-R1 | All^i | 47 | -111.3 | -241.0 | -174.2 | -173.9 | -175.4 | 6.0 | 22.2 | 21.4 |
| 2R2-R1 | χ2/df of fit < 95% cutoff | 32 | -154.0 | -207.0 | -178.1 | -178.2 | -178.9 | 7.0 | 12.9 | 10.6 |
| 2R2-R1 | α-helix^j | 11 | -140.4 | -198.2 | -175.8 | -176.4 | -177.0 | 7.6 | 18.1 | 14.1 |
| 2R2-R1 | β-strands^j | 19 | -154.0 | -241.0 | -180.3 | -180.2 | -177.5 | 7.3 | 19.1 | 16.3 |
| R/η | All^i | 44 | -127.9 | -237.9 | -177.4 | -177.2 | -178.4 | 7.5 | 19.5 | 17.6 |
| R/η | χ2/df of fit < 95% cutoff | 33 | -155.7 | -203.5 | -178.2 | -178.2 | -178.3 | 7.8 | 12.5 | 10.2 |
| R/η | α-helix^j | 11 | -141.6 | -203.5 | -177.6 | -179.3 | -178.3 | 9.2 | 16.7 | 8.3 |
| R/η | β-strands^j | 19 | -159.2 | -237.9 | -181.1 | -180.7 | -178.5 | 7.5 | 18.3 | 14.7 |
| LS-CSA | All^i | 32 | -126.0 | -243.4 | -176.9 | -176.9 | -176.8 | 3.1 | 20.0 | 19.2 |
| LS-CSA | χ2/df of fit < 95% cutoff | 25 | -158.1 | -201.9 | -178.3 | -178.3 | -177.2 | 3.3 | 12.6 | 11.9 |
| LS-CSA | α-helix^j | 11 | -126.0 | -196.9 | -174.3 | -174.3 | -180.5 | 3.4 | 21.3 | 19.9 |
| LS-CSA | β-strands^j | 16 | -159.3 | -243.4 | -180.7 | -180.6 | -175.9 | 3.1 | 20.9 | 19.6 |
| Average of all 3 methods | All^i | 50 | -111.3 | -240.8 | -174.2 | -173.8 | -175.9 | 7.1 | 22.2 | 21.2 |
| Average of all 3 methods | χ2/df of fit < cutoff | 35 | -155.7 | -203.4 | -177.7 | -177.2 | -178.3 | 7.9 | 11.9 | 9.1 |
| Average of all 3 methods | α-helix^j | 11 | -136.0 | -196.3 | -176.0 | -177.3 | -184.6 | 9.2 | 18.1 | 12.0 |
| Average of all 3 methods | β-strands^j | 20 | -159.8 | -240.8 | -180.3 | -179.9 | -177.8 | 8.2 | 18.6 | 14.5 |
^a The smallest absolute value of the 15N CSA.

^b The largest absolute value of the 15N CSA.

^c The arithmetic mean of measured values of the 15N CSA.

^d The value of μ that maximizes the likelihood function p(μ,Λ) (Eq. 19); μ is an estimate of the true mean of the CSA distribution.

^e Median of measured values of the 15N CSA.

^f The arithmetic mean of experimental uncertainties in the 15N CSA.

^g The standard deviation of the measured values of the 15N CSA.

^h The value of Λ that maximizes the likelihood function p(μ,Λ); Λ is an estimate of the true site-to-site variability in the CSA distribution.

^i All residues with acceptable agreement of regression methods (out of 50 analyzable residues, see text).

^j The α-helix in GB3 extends from Ala23 to Asp36 with Thr25, Glu27, and Asn35 impossible to resolve in the spectra due to overlap (hence 11 analyzable residues). The β-strands comprise Tyr3-Ile7, Gly14-Lys19, Val42-Asp46, and Thr51-Thr55, with Glu15 excluded due to overlap (altogether 20 analyzable residues). Gln2 was excluded from the LS analyses (see text).

Lipari-Szabo approach including CSA (LS-CSA)

This approach is an extension of the “standard” LS analysis of the relaxation data (R1, R2, NOE) (see above) that here includes site-specific 15N CSA (Δσ) as an additional adjustable parameter. The LS-CSA method, therefore, yields Δσ and the conventional LS parameters (e.g. S2, τloc) and possibly Rex, depending on the model selection for local dynamics. Up to eight motional models (listed in 33) were considered per residue, depending on the number of available observables. For these purposes, the recent version of our computer program DYNAMICS 32 that already accounts for the overall rotational anisotropy was upgraded to include Δσ as an additional fitting parameter in a simplex-based optimization. Previously, a similar type of inclusion of CSA in the derivation of the LS parameters has been used to assess the accuracy of overall rotational diffusion parameters 45 and for simultaneous analysis of single-field relaxation and cross-correlation data 27.

The robustness of this procedure of deriving Δσ was tested on 1,000 sets (per model) of synthetic relaxation data (R1, R2, NOE at the five field strengths) containing 1% “experimental” noise. The range of the input parameters for simple LS models was: S2 from 0.6 to 1, τloc from 0 to 100 ps (typical range of values for elements of secondary structure), and Δσ from -100 to -300 ppm. The output order parameters and the Δσ were within 4.38% (mean 0.004%, std 1.08%) and 6.68% (mean -0.012%, std 1.71%), respectively, from their input values, though only 94.9% of the data could be fit to within a 95% confidence level with this level of noise. In the case of the “extended model-free” model 43, the fast dynamics were characterized by Sfast2 from 0.7 to 1 (with S2 = Sslow2 ·Sfast2 < S2fast) and τfast from 0 to 100 ps, while the slow motions had Sslow2 from 0.6 to 1 and τslow from 200 to 500 ps. Here the output order parameters and the Δσ were within 4.78% (mean 0.008%, std 1.11%) and 8.93% (mean -0.02%, std 1.86%), respectively, from their input values, and 95.9% of the data could be fit to within a 95% confidence level with this level of noise. No Rex contributions to R2 were included in the simulation. From these analyses, we concluded that the order parameter and CSA could be fit to within reasonable uncertainty with the existing errors in the experimental relaxation data.

Lipari-Szabo analysis of spectral densities directly (LS-SDF)

The CSA values were also derived by simultaneous fitting of the spectral densities measured at all five fields to a LS spectral density, JLS(ω) 41, that describes local dynamics in terms of S2 and τloc only. The JLS(ω) values included the effect of the overall rotational anisotropy 46,47, calculated from the diffusion tensor characteristics (Table 1) and the orientation of a given NH vector reconstructed according to the crystal structure of GB3 (1IGD.pdb). For each residue, the experimental values of the spectral density function J(ω) at ω=0, ωN, and 0.87ωH were directly derived from the relaxation data (R1, R2, NOE) at each field strength using the reduced spectral density approximation 48,49, as follows:

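$$J(0.87\omega_H) = \left|\frac{\gamma_N}{\gamma_H}\right|\frac{(1 - \mathrm{NOE})\,R_1}{5d^2}; \tag{15}$$

$$J(\omega_N) = \frac{R_1 - 7\,(0.87/0.921)^2\,d^2\,J(0.87\omega_H)}{3\,(d^2 + c^2)}; \tag{16}$$

$$J(0) = \frac{2R_2 - R_1 - 6\,(0.87)^2\,d^2\,J(0.87\omega_H)}{4\,(d^2 + c^2)}. \tag{17}$$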
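Eqs. 15-17 translate directly into code. The Python sketch below assumes a CSA value is supplied for the residue (it enters through c), uses approximate constants, and includes a hypothetical helper, `rates_from_densities`, which simply inverts the same three equations so that the mapping can be checked by a round trip; none of these names comes from the programs used in this work.

```python
GAMMA_H = 2.6752218744e8   # 1H gyromagnetic ratio (rad / s / T)
GAMMA_N = -2.7116e7        # 15N gyromagnetic ratio (rad / s / T), approximate
D = 3.604e4                # dipolar constant d for r_HN = 1.02 A (s^-1), approximate
RATIO = abs(GAMMA_N / GAMMA_H)

def csa_constant(b0_tesla, delta_sigma_ppm):
    """c = -omegaN * delta_sigma / 3."""
    return -abs(GAMMA_N) * b0_tesla * delta_sigma_ppm * 1e-6 / 3.0

def spectral_densities(r1, r2, noe, b0_tesla, delta_sigma_ppm, d=D):
    """Return (J(0), J(omegaN), J(0.87 omegaH)) from Eqs. 15-17."""
    d2 = d * d
    dc2 = d2 + csa_constant(b0_tesla, delta_sigma_ppm) ** 2
    j_h = RATIO * (1.0 - noe) * r1 / (5.0 * d2)                        # Eq. 15
    j_n = (r1 - 7.0 * (0.87 / 0.921) ** 2 * d2 * j_h) / (3.0 * dc2)    # Eq. 16
    j_0 = (2.0 * r2 - r1 - 6.0 * 0.87 ** 2 * d2 * j_h) / (4.0 * dc2)   # Eq. 17
    return j_0, j_n, j_h

def rates_from_densities(j_0, j_n, j_h, b0_tesla, delta_sigma_ppm, d=D):
    """Algebraic inverse of Eqs. 15-17, used here only as a consistency check."""
    d2 = d * d
    dc2 = d2 + csa_constant(b0_tesla, delta_sigma_ppm) ** 2
    r1 = 3.0 * dc2 * j_n + 7.0 * (0.87 / 0.921) ** 2 * d2 * j_h
    noe = 1.0 - 5.0 * d2 * j_h / (RATIO * r1)
    r2 = 0.5 * (4.0 * dc2 * j_0 + r1 + 6.0 * 0.87 ** 2 * d2 * j_h)
    return r1, r2, noe

# Hypothetical spectral densities (s) at 14.1 T with a CSA of -174 ppm.
r1, r2, noe = rates_from_densities(1.3e-9, 4.3e-10, 5.0e-12, 14.1, -174.0)
j_0, j_n, j_h = spectral_densities(r1, r2, noe, 14.1, -174.0)
```

Only J(0) and J(ωN) depend on the assumed CSA; J(0.87ωH) is obtained from R1 and the NOE alone.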

Altogether this resulted in 15 values of J(ω) per residue, five of which were J(0) values derived from different-field measurements and which are expected to be the same within experimental precision. The LS parameters, (S2, τloc), and the 15N CSA value for each residue were obtained from an unconstrained nonlinear minimization of the following target function:

$$\chi^2_{MF} = \sum_i \left[\frac{J(\omega_i) - J_{LS}(\omega_i)}{\delta J_i}\right]^2 \tag{18}$$

where the sum is over all available ωi values for a given residue, and δJi represents the experimental error in J(ωi). This method is analogous to the “classical” LS analysis except that reduced spectral densities are being used and the CSA is an additional fitting parameter.
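The target function of Eq. 18 can be sketched in Python under simplifying assumptions that differ from the analysis in this paper: the overall tumbling is taken as isotropic with a single correlation time `tau_c` (the actual analysis includes rotational anisotropy), the simple LS spectral density is written in its common form with a 2/5 normalization, and the CSA dependence of the experimental J(ω) values is not modeled. The function names and all numerical inputs are hypothetical.

```python
GAMMA_N = -2.7116e7   # 15N gyromagnetic ratio (rad / s / T), approximate

def j_ls(omega, s2, tau_loc, tau_c):
    """Simple Lipari-Szabo spectral density for isotropic overall tumbling."""
    tau = tau_c * tau_loc / (tau_c + tau_loc)   # effective local correlation time
    return 0.4 * (s2 * tau_c / (1.0 + (omega * tau_c) ** 2)
                  + (1.0 - s2) * tau / (1.0 + (omega * tau) ** 2))

def chi2_mf(omegas, j_exp, j_err, s2, tau_loc, tau_c):
    """Eq. 18: sum of squared, error-normalized deviations from J_LS."""
    return sum(((j - j_ls(w, s2, tau_loc, tau_c)) / e) ** 2
               for w, j, e in zip(omegas, j_exp, j_err))

# Synthetic "experimental" spectral densities at omega = 0 and omega_N (five fields).
tau_c = 3.3e-9                                   # hypothetical overall correlation time (s)
omegas = [0.0] + [abs(GAMMA_N) * b for b in (9.4, 11.7, 14.1, 16.4, 18.8)]
j_exp = [j_ls(w, 0.85, 50e-12, tau_c) for w in omegas]
j_err = [0.01 * j for j in j_exp]

chi2_true = chi2_mf(omegas, j_exp, j_err, 0.85, 50e-12, tau_c)
chi2_off = chi2_mf(omegas, j_exp, j_err, 0.70, 50e-12, tau_c)
```

In a full analysis this function would be handed to a nonlinear minimizer, with the CSA entering through the conversion of rates to J(ω).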

Robust analysis of data

The methods described above usually rely on a least-squares fit of experimental data. Given the small number of available experimental data points per residue, the results of such a fit are susceptible to experimental errors. Measures were taken to ensure that the conditions of each experiment were identical within practical limits; however, there are outlying data points in several residues, as can be seen, for example, from the linear regression plots (Supporting Information). These deviations do not seem to come from the random noise in the spectra, but rather are a result of spectral artifacts caused by baseline drift, water suppression problems, etc., the distribution of which is unknown and cannot be readily determined from the small sample of measurements. Least-squares fits (including linear regression) are particularly susceptible to outliers 50-52, as their contributions to the target function increase as the square of the deviation from the fitting curve. In light of this, for each method of deriving the CSA, in addition to the “standard” least-squares regression analysis, two so-called “robust” regression methods 50,51 were used to obtain alternative values of the CSA and other pertinent parameters, with slightly different weights given to outlying data points. A least-squares regression involves minimization of the target function $\rho(z) = \tfrac{1}{2}z^2$, where $z = [y_i^{meas} - y^{pred}(x_i)]/\delta y_i$, $y_i^{meas}$ and $y^{pred}(x_i)$ are the measured and predicted data, respectively, for a given residue, and $\delta y_i$ is the experimental uncertainty in $y_i^{meas}$. For this type of ρ(z), the more deviant the point from the model, the greater the weight that this point is given in the minimization. Robust regression methods involve minimization of alternative functions of z. Here we use two such functions as the target of the minimization 50,51: (1) the absolute value of z, ρ(z) = |z|, in which all deviant points are given the same relative weight, and (2) $\rho(z) = \log(1 + \tfrac{1}{2}z^2)$, where the relative weight given to deviant points initially increases with deviation (while z < √2) and then decreases, so that those points which are the furthest from the fitting curve are given the least relative weight.

For the majority of residues in GB3 the results of the least-squares regression and the two robust methods agreed within their estimated uncertainties. For these residues the average of the parameters from the three types of regression is reported. As the experimental uncertainties in the derived parameters we report the biggest of the errors from the least-squares fit (using standard equations50 for uncertainties in linear regression parameters or Monte-Carlo simulations) and from the robust methods (using Monte-Carlo simulations), estimated by propagating the experimental errors in relaxation and cross-correlation rates.

For those few residues where the three methods disagreed (i.e. where using a different weight function ρ(z) for the same data set resulted in significant changes in the derived fitting parameters) no CSA is reported - except those cases where the deviation in the results of the least-squares regression can be unambiguously ascribed to undue weight given to a single clearly outlying data point (see examples in Supporting Information). For these residues, the average of the two robust methods is reported. All three fits (least-squares and the two robust methods) for each model-independent method for every amide are shown in Supporting Material.

Separation of true variability in the CSA from experimental uncertainty

The observed range of site-specific 15N CSA values reflects both true CSA variability and random statistical errors in the measured parameters 8. To address the actual variability of the CSA tensor we adopted the same statistical approach as in 8,29, which assumes that the CSA values in proteins follow a Gaussian distribution. Assuming that the experimentally determined uncertainties are correct, the “true” values of the mean CSA (μ, in ppm) and site-to-site CSA variability (Λ, also in ppm) can be determined by maximizing the following likelihood function 29,50:

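$$p(\mu,\Lambda) = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi\left(\Lambda^2 + (\delta\Delta\sigma_i)^2\right)}}\,\exp\!\left(-\frac{(\mu - \Delta\sigma_i)^2}{2\left(\Lambda^2 + (\delta\Delta\sigma_i)^2\right)}\right). \tag{19}$$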

Here N is the number of residues probed in the measured distribution, and Δσi and δΔσi are the measured CSA value and its experimental uncertainty for residue i. The confidence limits for μ and Λ were estimated from the boundaries of a 95% bivariate confidence region determined from the following equation: $p(\mu,\Lambda)/\max\{p(\mu,\Lambda)\} = \exp(-0.5\,\chi^2_{0.95,2})$, where $\chi^2_{0.95,2} = 5.99$ is the 95th-percentile point of the chi-square distribution with 2 degrees of freedom.
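A minimal Python sketch of this maximum-likelihood estimate is shown below. It works with the logarithm of Eq. 19 and locates the maximum by a brute-force grid search, which is a simplification chosen for transparency and not necessarily the optimization used in this work; the function names, grids, and the four CSA values with 5 ppm uncertainties are hypothetical.

```python
import math

def log_likelihood(mu, lam, values, errors):
    """Natural logarithm of p(mu, Lambda) in Eq. 19."""
    total = 0.0
    for value, err in zip(values, errors):
        var = lam * lam + err * err
        total -= 0.5 * math.log(2.0 * math.pi * var) + (mu - value) ** 2 / (2.0 * var)
    return total

def maximize_likelihood(values, errors, mu_grid, lam_grid):
    """Grid search; returns (mu, Lambda, maximum log-likelihood)."""
    best = None
    for mu in mu_grid:
        for lam in lam_grid:
            ll = log_likelihood(mu, lam, values, errors)
            if best is None or ll > best[2]:
                best = (mu, lam, ll)
    return best

def in_95_region(mu, lam, values, errors, max_ll):
    """True if p/max(p) >= exp(-0.5 * 5.99), i.e. inside the 95% bivariate region."""
    return log_likelihood(mu, lam, values, errors) - max_ll >= -0.5 * 5.99

# Hypothetical CSA values (ppm), each with a 5 ppm experimental uncertainty.
csa = [-160.0, -170.0, -180.0, -190.0]
csa_err = [5.0] * 4
mu_grid = [-190.0 + 0.5 * i for i in range(61)]   # -190 ... -160 ppm
lam_grid = [0.5 * i for i in range(1, 61)]        # 0.5 ... 30 ppm
mu_hat, lam_hat, ll_max = maximize_likelihood(csa, csa_err, mu_grid, lam_grid)
```

In this example the raw spread of the four values corresponds to a variance of 125 ppm2, of which 25 ppm2 is attributed to measurement error, so the estimated true variability Λ is 10 ppm rather than about 11.2 ppm.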

Results

The transverse (R2) and longitudinal (R1) 15N relaxation rates and the steady-state 15N{1H} NOEs in GB3 were measured at five magnetic fields, 9.4, 11.7, 14.1, 16.4 and 18.8 T. The transverse (ηxy) and longitudinal (ηz) 15N CSA/dipolar cross-correlation measurements were performed at four fields (9.4, 11.7, 14.1, and 18.8 T) for ηxy and at three fields (9.4, 11.7, and 14.1 T) for ηz. The experimental details are outlined in Materials and Methods; the actual data and errors are listed in Supporting Table 2. Overall, 50 out of 55 amides were analyzed; residues Glu15, Thr25, Glu27, and Asn35 were excluded because of signal overlap and Val39 due to conformational exchange 32. Gln2 was excluded from LS analyses since the atom coordinates for this residue (which is a mutation) were not available from the crystal structure.

Model-independent determination of 15N CSA

The use of data measured at multiple field strengths is expected to improve the accuracy of the derived picture of protein dynamics by allowing direct and independent determination of the spectral densities and the 15N CSA 31. The 15N CSA values for backbone amides in GB3 were obtained using three different model-independent methods, detailed in Materials and Methods. These methods are model-independent in the sense that they involve no assumption about a particular parameterization of the spectral density function.

The 2R2-R1 Method

The 15N Δσ values and the spectral density J(0) were determined directly from the observed field dependence of the combination of reduced auto-relaxation rates, 2R2′ - R1′ (Fig.1). Relaxation data (R1, R2, and NOE) at all five fields were used for each residue. The data were fitted to a linear dependence on ωN2 (Eq.10) using the three linear regression methods (least-squares and two robust methods) as discussed in Materials and Methods; the quality of the fit for each residue is shown in Supporting Material. All three regression methods had good agreement (both slope and intercept agreed within the experimental uncertainty) for 38 out of 50 residues in GB3. For an additional 9 residues (Leu12, Ala20, Val21, Gly38, Asp40, Asp47, Ala48, Thr49, and the C-terminal residue, Glu56) the two robust methods agreed within their experimental uncertainties (68.3% confidence interval). Only for 3 residues (Lys10, Gly41, and Lys50, all of which are in the loops in GB3) can no definitive CSA be reported because all three regression methods disagree for the 2R2′ - R1′ fit.

Figure 1. Representative fits of the dependence of 2R2′ - R1′ on ωN2.

Shown are fits from the 2R2-R1 method for three residues in GB3. This plot also illustrates the variation in the 15N CSA values between these residues. The amides shown here have very similar values of J(0), as evidenced by the fact that they have the same intercept b (cf. Eq. 11), but exhibit strikingly different slopes reflecting the difference in their CSA values (Eq. 12). The plots of 2R2′ - R1′ versus ωN2 for all residues in GB3 can be found in the Supporting Material. The error bars here and in all other figures represent standard errors (corresponding to 68.3% confidence intervals).

The average site-specific 15N CSA values from the three fits are presented in Fig. 2 (solid squares); the values of J(0) are shown in Fig. 3 (solid squares). The site-specific 15N CSAs from this method range from -111.3 ± 1.7 ppm (Leu12) to -241.0 ± 8.7 ppm (Phe52), with a mean of <Δσ> = -174.2 ppm and a standard deviation of 22.2 ppm. The median Δσ is -175.4 ppm, in good agreement with the mean, indicating that the mean is not dominated by a small number of outliers (Table 1).

Figure 2. Site-specific 15N CSA values in GB3 obtained using the three methods (2R2-R1, R/η, and LS-CSA).


(a) The site-specific 15N CSAs, from the 2R2-R1 method (black squares), R/η method (blue circles), and the LS-CSA method (green triangles) versus residue number. The secondary structure of GB3 is indicated at the top of the panel.

(b) Correlation between 15N CSA values measured using the model-independent methods, 2R2-R1 and R/η. The Pearson’s correlation coefficient r for these two data sets is 0.79; 81% of these CSA data agree within the experimental uncertainties. These values improve to r=0.80 and 87% agreement if only those data (shown as solid squares) where the least-squares fits pass the 95%-confidence level χ2/df cutoff are considered. (c) Correlation between the CSAs from 2R2-R1 and LS-CSA methods. The correlation coefficient is 0.95; it decreases to r=0.93 if only those fits that pass the χ2/df cutoff (solid squares) are included, though the percent agreement improves from 94% to 96%. (d) Correlation between the results from R/η and LS-CSA methods. The correlation coefficient is 0.80 and remains unchanged when the χ2/df cutoff is applied (solid squares). The percent agreement increases from 84% for all considered residues to 88% for those residues with the χ2/df below the cutoff value. In all correlation plots (panels b-d) the solid symbols represent values obtained for least squares fits that passed the χ2/df cutoff while open symbols correspond to the remaining residues. Outliers and extreme values of the CSA are labeled. Note that those few residues that show significant differences in the CSA values between the methods are all located in the loops/termini. Also in the loops are all residues where only one out of the three methods resulted in an acceptable fit (panel a).

Figure 3. The agreement between the spectral density component, J(0), measured using the 2R2-R1 method and reconstructed from the LS parameters.


The spectral density component J(0) obtained from the 2R2-R1 method directly (solid symbols) and calculated from the order parameters and local correlation times obtained in the LS-CSA method (open symbols). Throughout this paper, the factor 2/5 arising from the normalization of the spectral density of the overall rotational diffusion is explicitly included in the corresponding expression for J(ω).

The observed site-specific 15N CSA values were analyzed assuming a normal distribution of the true CSA values, as detailed in Materials and Methods. The average estimated relative uncertainty is 2.67% for J(0) and 3.44% (or 6.0 ppm) for Δσ. From these data, the true CSA values in GB3 are characterized by a mean of μ = -173.9 ppm and the site-to-site variability Λ = 21.4 ppm (see Eq. 19). We estimate a joint 95% confidence interval for μ from this method to range from -165.7 to -182.2 ppm and for Λ from 16.6 to 28.6 ppm (Fig. 4). It is worth pointing out that site-to-site variability in the CSA in GB3 is evident from a comparison of the linear dependence of 2R2′ - R1′ on ωN2 (Eqs.10-12) for three residues with similar J(0) values (Fig. 1).
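The separation of true site-to-site variability from measurement error can be sketched as follows. This is a minimal illustration, assuming that each observed CSA is drawn from a normal distribution with mean μ and variance Λ2 + σi2, where σi is the experimental uncertainty for residue i; the exact form of the likelihood is given by Eq. 19 in Materials and Methods, and the function name, grid, and toy input below are hypothetical.

```python
import numpy as np


def fit_csa_distribution(csa_ppm, err_ppm, lam_grid=None):
    """Maximum-likelihood mean (mu) and true site-to-site spread (lam) of the CSA.

    Assumes each observed value is drawn from N(mu, lam^2 + err_i^2), err_i > 0.
    """
    x = np.asarray(csa_ppm, dtype=float)
    s2 = np.asarray(err_ppm, dtype=float) ** 2
    if lam_grid is None:
        lam_grid = np.linspace(0.0, 60.0, 6001)   # ppm, 0.01-ppm steps
    best = (-np.inf, None, None)
    for lam in lam_grid:
        var = lam**2 + s2                         # total variance per residue
        mu = np.sum(x / var) / np.sum(1.0 / var)  # weighted mean maximizes L at fixed lam
        loglik = -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mu) ** 2 / var)
        if loglik > best[0]:
            best = (loglik, mu, lam)
    return best[1], best[2]


# Toy input (not data from this study): four values with equal 10-ppm errors
mu_hat, lam_hat = fit_csa_distribution([-200.0, -180.0, -160.0, -140.0], [10.0] * 4)
print(mu_hat, lam_hat)
```

For this toy input with equal errors, the estimate reduces to the sample mean (-170 ppm) and to Λ = sqrt(500 - 100) = 20 ppm, that is, the observed variance minus the error variance. This is why Λ is smaller than the raw standard deviation of the measured CSAs.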

Figure 4. The likelihood functions (Eq. 19) obtained from different methods and sets of data show significant site-to-site variability in the 15N CSA values.


Contour plots of the likelihood functions p(μ, Λ) (Eq. 19) corresponding to the 15N CSA values from the three methods (2R2-R1 (black), R/η (blue), and LS-CSA (green)) (a) for all analyzed residues in GB3 and (b) for only those residues where χ2/df from the least-squares fits passed the goodness-of-fit test at a 95% confidence level. Also shown (in cyan), for comparison, is the analogous likelihood function obtained for the recently reported 15N CSAs in ubiquitin 29, scaled to a NH-bond length of 1.02 Å. The location of the maximum for each function is indicated by a dot (see also Table 1), the contour lines represent 68.3%, 90% and 95% bivariate confidence regions for μ and Λ. In panel a, the 95% joint confidence intervals (in ppm) for μ and Λ are (-165.7, -182.2) and (16.6, 28.6) from 2R2-R1, (-169.9, -184.6) and (13.2, 24.3) from R/η, and (-168.0, -185.7) and (14.3, 27.3) from LS-CSA methods. For a subset of residues (panel b) that pass the χ2/df cutoff, the corresponding confidence intervals for μ and Λ are (-172.7, -185.2) and (6.8, 17.1) from 2R2-R1, (-172.5, -183.9) and (6.4, 15.8) from R/η, and (-171.8, -184.7) and (8.5, 18.0) from LS-CSA methods.

The R/η Method

This method is based on the field dependence of the ratio of the (reduced) auto-relaxation rate (R2′ or R1′) and the corresponding 15N CSA/dipolar cross-correlation rate (ηxy or ηz, respectively), Eqs.1-9. Both R2′/ηxy and R1′/ηz ratios are expected to have the same values (Eq.1); therefore, these data were analyzed together (see also below). The analysis included R2′/ηxy data at four fields and R1′/ηz at three fields for each residue. Using both R2′/ηxy and R1′/ηz data improves the accuracy of analysis by increasing the number of data points included in the fit. In addition, the R1′/ηz values have the advantage of being free of any contribution from conformational exchange. The quality of the fit for each residue in GB3 is shown in Supporting Material. All three regression methods were in good agreement (both slope and intercept agreed within the experimental uncertainty) for 37 out of 50 amides in GB3. For an additional 7 residues (Gly9, Thr11, Lys13, Ala26, Gly38, Phe52, and the C-terminal Glu56) the two robust methods agreed within their experimental uncertainties (68.3% confidence interval). For 6 residues (Leu12, Ala20, Asp40, Gly41, Ala48, and Thr49, all of which are in loop/turn regions of GB3), no CSA is reported here because all three regression methods disagree in the R/η fit. The 15N CSA values (Δσ) obtained using this approach are shown in Fig. 2; the values of Δσg are presented in Fig. 5. These 15N CSAs range from -127.9 ± 4.0 ppm (Gly38) to -237.9 ± 11.1 ppm (Phe52), with a mean value of -177.4 ppm and a standard deviation of 19.5 ppm. The median is -178.4 ppm. The average estimated level of the experimental errors is 4.23% (or 7.5 ppm) for Δσ. The maximization of the likelihood function (Eq. 19, Materials and Methods) yielded the true variability in Δσ of Λ = 17.6 ppm and a true mean CSA of -177.2 ppm. We estimate a 95% confidence interval on μ from this method to be from -169.9 to -184.6 ppm and for Λ from 13.2 to 24.3 ppm (Fig. 4).

Figure 5. The values of Δσg and the βz angles from the R/η and 2ηxyz methods.


(a) Measured site-specific 15N Δσg values for GB3 from the R/η (black squares) and the 2ηxyz methods (blue circles). The Δσg values range from -108.9 ppm (Ala20, 2ηxyz) to -189.8 ppm (Phe52, 2ηxyz). (b) Correlation between Δσg values measured using the R/η and 2ηxyz methods. The correlation coefficient is 0.94 for all residues and 0.96 for only those fits that pass the χ2/df cutoff. (c) βz angles (in degrees) determined from the R/η method (black squares) and by combining the Δσg values from the 2ηxyz method with the Δσ values from 2R2-R1 (blue circles). The Pearson’s correlation coefficient for the agreement of the β angles from these two measurements is 0.94. The derivation of βz assumed axial symmetry of the 15N chemical shielding tensor. The secondary structure of GB3 is indicated on the top of panels (a) and (c).

The angles βz derived from these Δσ and Δσg values assuming axial symmetry of the 15N CST are shown in Fig. 5c (black squares). The range of βz values is from 7.5° (Val6) to 27.6° (Thr11) with a mean value of 19.9° and standard deviation of 4.5°, in agreement with the βz values observed in ubiquitin 5,24,28. Very similar βz values were also determined from a combination of the Δσg values from the 2ηxy-2ηz method with the Δσ values from 2R2-R1 (see below).
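The conversion from the pair (Δσ, Δσg) to an angle can be sketched as below. The sketch assumes that, for an axially symmetric tensor, Δσg = Δσ·P2(cos βz), with P2(x) = (3x2 - 1)/2; this relation is stated here as an assumption rather than quoted from the Methods, and the function name is hypothetical.

```python
import math


def beta_from_csa(delta_sigma, delta_sigma_g):
    """Angle (degrees) between the unique tensor axis and the NH bond.

    Assumes an axially symmetric tensor, for which
    delta_sigma_g = delta_sigma * P2(cos beta), P2(x) = (3x^2 - 1)/2.
    """
    cos2 = (2.0 * delta_sigma_g / delta_sigma + 1.0) / 3.0
    if not 0.0 <= cos2 <= 1.0:
        raise ValueError("ratio outside the range allowed for an axially symmetric tensor")
    return math.degrees(math.acos(math.sqrt(cos2)))


# Round trip with an illustrative angle of 20 degrees and CSA of -170 ppm
p2 = (3.0 * math.cos(math.radians(20.0)) ** 2 - 1.0) / 2.0
beta_deg = beta_from_csa(-170.0, -170.0 * p2)
print(beta_deg)
```

Only the ratio Δσg/Δσ enters, so the result does not depend on the NH-bond length used to scale both quantities, provided the same scaling is applied to each.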

Note that using the mean of R2′/ηxy and R1′/ηz as the R/η value at a given field (where both data are available, at 9.4, 11.7, and 14.1 T) resulted in CSA values from -127.9 to -237.9 ppm, with a mean CSA of -177.4 ppm and a standard deviation of 19.5 ppm. These results have a Pearson’s correlation coefficient r of 0.97 with the CSA values obtained using the individual measurements (see above). Fitting the R2′/ηxy values alone gave 15N CSA values in the range from -140.5 to -234.8 ppm, with a mean of -179.2 ppm and a standard deviation of 19.2 ppm, with a correlation coefficient of 0.91 with the CSAs derived from both transverse and longitudinal data. The R1′/ηz data alone yielded a broader spread of CSAs, from -129.9 to -251.6 ppm, with somewhat larger absolute values of the mean (-185.5 ppm) and standard deviation (23.9 ppm). These data show a poor correlation (r=0.13) with the CSAs obtained from both transverse and longitudinal data together, which likely reflects a lesser accuracy of the R1′/ηz data alone due to a narrower range of magnetic fields covered by the ηz measurements.

It is worth mentioning that the 15N CSAs obtained by the R/η method are expected to be independent of the magnitude of the spectral density function 5,25,31. Indeed, the correlation coefficient between the J(0) values derived from the 2R2-R1 method (these values are independent of Δσ) and the CSA values from the R/η approach was -0.23.

Quality Control Using the 2ηxyz Method

The field dependence of the cross-correlation data alone (Eq.13, 14) yields the product of Δσg and J(0). The quality of fit is shown in Supporting Material. This analysis is totally independent of the auto-relaxation data. We then used the value of J(0) derived from the 2R2-R1 method (this value is independent of Δσ) to obtain Δσg (Fig.5). The Δσg values thus obtained range from -107.2 (±1.2) ppm for Leu12 to -186.1 (±1.0) ppm for Ala34, with the mean value of -154.4 ppm, and a median at -154.1 ppm. These values were then compared with the Δσg values derived from the R/η approach, which are independent of J(0). The excellent agreement (r=0.94 to 0.96, Fig.5b) between the values of the same parameter determined independently from different sets of measurements thus provides strong quality control for our analysis.

Assuming axial symmetry of the 15N CST, and using Δσ values from the 2R2-R1 method, we determined the angle βz between the unique (least shielded) component of the tensor and the NH bond vector (Fig.5c). These βz values are in very good agreement (r=0.93) with βz derived from the R/η method described above.

Determination of 15N CSA and order parameters in GB3 using Lipari-Szabo approximation

The analysis of relaxation data was also performed assuming the so-called “model-free” form of the spectral density 41-43, using both the conventional Lipari-Szabo approach (LS) and its modifications, LS-CSA and LS-SDF, described in Materials and Methods.

Analysis of the overall tumbling

The importance of a correct treatment of the overall tumbling of a molecule for the accurate derivation of local motional parameters has been established in the literature 32,45,53-55. The overall rotational diffusion tensor of GB3 was derived from the 15N relaxation data (R1, R2, NOE) at each field using the program ROTDIF 56. The diffusion tensor obtained by this method is independent, to a good approximation, of site-specific values of the 1H-15N dipolar interaction, 15N CSA, and NH order parameters 57,58. This follows from the fact that R2′ and R1′ are both proportional to (d2 + c2) (see Eqs.5,6) and, for protein core residues with restricted backbone mobility, also to S2 (because J(0), J(ωN) ∝ S2). Thus, in the R2′/R1′ ratio, analyzed in ROTDIF, all these factors unrelated to overall rotational diffusion cancel out. The results of the analyses are shown in Table 2. At all five field strengths an axially symmetric diffusion tensor was found to be a significant improvement over an isotropic model (evaluated by the statistical F-test 50), whereas the use of a more complex, fully anisotropic diffusion tensor model was not statistically warranted (see also below).

Table 2. Characteristics of the overall rotational diffusion tensor of GB3 derived from 15N relaxation data at different magnetic fields.

The N-H vectors for this analysis were taken from the original crystal structure of GB3 (PDB entry 1IGD); similar results were obtained for GB3 structures refined using residual dipolar couplings (PDB entries 1P7E and 1P7F 70) (Supporting Table 3). These diffusion tensor characteristics are in good agreement with those theoretically predicted from the shape of the molecule 32. Also shown are the diffusion tensors derived from the cross-correlation rates, ηxy and ηz.

| Magnetic field (Tesla) | 1H resonance frequency (MHz) | D⊥ a (10^7 s^-1) | D∥ a (10^7 s^-1) | Φ b (°) | Θ b (°) | τc c (ns) | Anisotropy d | χ2/df e | P f |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| From auto- and cross-relaxation rate measurements | | | | | | | | | |
| 9.4 | 400 | 4.40(0.19) | 6.13(0.62) | 89(18) | 66(23) | 3.35(0.20) | 1.39(0.13) | 0.64 | 6·10^-11 |
| 11.7 | 500 | 4.45(0.31) | 6.20(1.12) | 95(15) | 68(19) | 3.31(0.32) | 1.39(0.24) | 0.69 | 4·10^-13 |
| 14.1 | 600 | 4.45(0.15) | 6.05(0.44) | 90(8) | 70(10) | 3.34(0.14) | 1.36(0.09) | 0.72 | 2·10^-13 |
| 16.4 | 700 | 4.44(0.14) | 6.24(0.41) | 99(7) | 63(11) | 3.31(0.13) | 1.41(0.08) | 0.88 | 6·10^-19 |
| 18.8 | 800 | 4.46(0.08) | 6.15(0.27) | 100(7) | 67(10) | 3.32(0.08) | 1.38(0.06) | 0.74 | 3·10^-14 |
| Averaged tensor | | 4.44 | 6.14 | 99 | 66 | 3.33 | 1.38 | | |
| Global-fit tensor | | 4.44 | 6.14 | 95 | 66 | 3.33 | 1.38 | 0.72 | 6·10^-15 |
| From cross-correlation rate measurements | | | | | | | | | |
| 9.4 | 400 | 4.50(0.16) | 6.00(0.52) | 101(9) | 77(13) | 3.33(0.16) | 1.33(0.11) | 0.66 | 9·10^-11 |
| 11.7 | 500 | 4.38(0.12) | 6.14(0.40) | 90(6) | 59(9) | 3.36(0.12) | 1.40(0.08) | 0.96 | 1·10^-12 |
| 14.1 | 600 | 4.40(0.06) | 6.20(0.19) | 93(4) | 65(6) | 3.33(0.06) | 1.41(0.04) | 0.51 | 3·10^-17 |

Numbers in parentheses represent standard deviations.

a Principal values of the rotational diffusion tensor.

b Polar and azimuthal angles {Θ, Φ} (in degrees) describe the orientation of the diffusion tensor axis with respect to the protein coordinate frame.

c Overall rotational correlation time of the molecule, τc=1/[2 tr(D)].

d The degree of anisotropy of the diffusion tensor, D∥/D⊥.

e Residuals of the fit divided by the number of degrees of freedom.

f The probability that the reduction in the χ2 compared to the isotropic diffusion model occurred by chance.
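The derived quantities in Table 2 follow directly from the principal values of the tensor. The short sketch below reproduces τc and the anisotropy for the 9.4 T entry, taking the two diffusion-tensor columns as D⊥ and D∥ (an assumption that is consistent with the tabulated anisotropy) and using tr(D) = 2D⊥ + D∥ for an axially symmetric tensor. The function name is hypothetical.

```python
def tumbling_summary(d_perp, d_par):
    """Return (tau_c in ns, anisotropy) for an axially symmetric diffusion tensor.

    d_perp and d_par are the principal values in s^-1; tr(D) = 2*d_perp + d_par.
    """
    trace = 2.0 * d_perp + d_par
    tau_c_ns = 1e9 / (2.0 * trace)      # tau_c = 1 / [2 tr(D)], converted to ns
    return tau_c_ns, d_par / d_perp


tau_c, aniso = tumbling_summary(4.40e7, 6.13e7)   # 9.4 T values from Table 2
print(f"{tau_c:.2f} ns, anisotropy {aniso:.2f}")  # prints "3.35 ns, anisotropy 1.39"
```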

The good agreement (within the experimental errors) between the diffusion tensors determined at different fields indicates that there is no significant difference in the experimental conditions (in particular, temperature) between the measurements on different spectrometers. This then justifies the simultaneous analysis of these relaxation data acquired at various fields for the purpose of extracting field-independent parameters, like CSA, S2 etc. Note also that there is practically no difference between the diffusion tensors derived using the crystal and solution structures of GB3 (cf. Table 2 and Supporting Table 3). Also there is no significant difference between the diffusion tensor obtained from a simultaneous (global) fit of all the data and the result of averaging the diffusion tensors obtained at each field (Table 2). Therefore for our LS analyses, we used the diffusion tensor resulting from the simultaneous fit of all data.

The overall rotational diffusion tensor can also be derived from cross-correlation measurements 7, independently of the auto- and cross-relaxation measurements, when both ηxy and ηz data are available. This approach has the advantage of being essentially free of any effects of conformational exchange contributions to R2 and also does not require correction for high-frequency components of the spectral density. The diffusion tensors obtained from ηxy and ηz measurements at 9.4, 11.7, and 14.1 T (shown in Table 2) are in excellent agreement with those derived from the auto-relaxation rates and NOE.

Backbone order parameters: assuming a uniform 15N CSA

When relaxation data (R1, R2, NOE) at several fields are available, order parameters for a given NH vector can be obtained from the data at each field separately or from a simultaneous fit of the relaxation data for all available field strengths. Because the LS backbone dynamics should not depend on the applied magnetic field, all these order parameters are expected to agree with each other.

We first analyzed the relaxation data at each field separately using a standard LS approach 32 assuming a uniform value of 15N CSA of -160 ppm. In all these analyses the quality of fit was very good: the residuals of the fit for the majority of residues (96% at 9.4 T, 96% at 11.7 T, 98% at 14.1 T, 94% at 16.4 T, 84% at 18.8 T, and 94% overall) were within the acceptance level for a 95%-confidence goodness-of-fit test 50, which indicates that the uncertainties in the experimental data are correct or overestimated, but not underestimated. The results, however, show a striking discrepancy between the derived order parameters corresponding to different field strengths (Figs. 6a,f): for most residues in GB3 the observed variation in the derived S2 values among the fields exceeds their experimental uncertainties. Even in the well-ordered parts of the protein, the difference in derived S2 values between 800 and 400 MHz data exceeded 0.10 for some residues. Similar results were obtained when using a 15N CSA of -170 ppm (suggested in ref. 59) or the mean CSA of -174.2 ppm (the mean CSA from the three determination methods, 2R2-R1, R/η, and LS-CSA, Figs. 6b,g). The observed disagreement between the derived S2 values obtained for the same NH group from the measurements at different fields thus raises significant concern about the accuracy of the order parameters derived by the standard analysis.

Figure 6. Backbone order parameters determined from 15N relaxation data at each field using different CSA models.


Shown are backbone order parameters in GB3 derived from a LS analysis of the 15N relaxation data (R1, R2, NOE) at different fields (left panels). Right panels represent the differences, ΔS2=S2 - S2(9.4T), between the S2 values at a particular field and at 9.4 Tesla, where the 15N CSA contribution to 15N relaxation rates is the weakest. (a, f) The LS analysis was performed in a conventional way, i.e. assuming a uniform CSA of -160 ppm for all residues. (b, g) The LS analysis was performed assuming a uniform CSA of -174.2 ppm (the average of the site-specific CSAs in GB3, see Table 1) for all residues. (c, h) Site-specific 15N CSA values from the 2R2-R1 method were used as input parameters. (d, i) Site-specific 15N CSA values from the R/η method were used as input parameters. (e, j) The LS analysis was performed for each field separately using the site-specific CSAs derived from the global fit (LS-CSA) of all five fields. Also shown as open circles in panel (e) are the order parameters from the global fit. The coloring is as follows: the 18.8 T data are shown in black, 16.4 T in red, 14.1 T in green, 11.7 T in blue, and 9.4 T data in cyan. The dashed horizontal lines represent the average estimated level (±0.029) of the experimental uncertainty in ΔS2. Val39 has been removed from all panels because of the conformational exchange contribution 32. In order to exclude deviations in S2 due to a change in the model selection for different fields in a few residues, all data presented here were obtained assuming a model of local motion (model 2 in 71, model “B” in 33) that includes S2 and τloc as fitting parameters. Our model-selection analysis showed that for the majority of residues in the secondary-structure elements of GB3 this was the preferred model 32. Allowing freedom in the model selection led to even greater discrepancies between the order parameters from different fields, which, however, exhibit the same behavior as shown here (Supporting Fig. 2).
As a measure of the discrepancy in order parameters, the rmsd from the average (over all five fields) S2 value for each method is 0.024 (panel a), 0.015 (b), 0.010 (c), 0.012 (d), and 0.009 (e), calculated for the secondary structure elements only.

We also attempted to analyze simultaneously the relaxation data at all five fields using a uniform CSA of -160 ppm and the average diffusion tensor and NH vector orientations from the crystal structure. This analysis indicated serious fitting problems: only 8 out of 51 amides (Tyr3, Lys4, Leu5, Val6, Thr16, Ala23, Lys28 and Ala29) had residuals of the fit (χ2) which passed the goodness-of-fit test at the 95% confidence level 50. Using a uniform CSA of -170 ppm did not significantly improve the fit: here only 12 residues (Tyr3, Lys4, Leu5, Thr16, Thr18, Lys19, Ala23, Lys28, Gln32, Asp46, Thr51, and Thr55) had acceptable χ2 values. Using the mean CSA value of -174.2 ppm gave only 14 residues (Gln2, Tyr3, Leu5, Thr16, Thr18, Lys19, Ala23, Lys28, Gln32, Ala34, Val42, Tyr45, Asp46, and Thr51) with acceptable χ2 values. These results from multiple approaches clearly indicate that the conventional LS treatment assuming a uniform 15N CSA fails to describe the multi-field experimental data in GB3.
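The goodness-of-fit criterion used throughout can be illustrated with a χ2 cutoff at the 95% confidence level. The sketch below is a generic version of such a test, not the implementation of ref. 50; the number of data points (15, i.e. R1, R2, and NOE at five fields) and of fitted parameters (2, i.e. S2 and τloc) are illustrative assumptions, and the function name is hypothetical.

```python
from scipy.stats import chi2


def passes_goodness_of_fit(chi2_value, n_data, n_params, confidence=0.95):
    """Return (passed, cutoff expressed as chi2/df) for a chi-square goodness-of-fit test."""
    dof = n_data - n_params
    cutoff = chi2.ppf(confidence, dof)     # chi2 value exceeded by chance 5% of the time
    return bool(chi2_value <= cutoff), cutoff / dof


# Illustrative numbers only: 15 data points, 2 fitted parameters, chi2 = 12
ok, cutoff_per_dof = passes_goodness_of_fit(12.0, n_data=15, n_params=2)
print(ok, round(cutoff_per_dof, 2))
```

With 13 degrees of freedom the cutoff is χ2 ≈ 22.4, or χ2/df ≈ 1.72, so a mean χ2/df near 5 signals a fit that is rejected for most residues.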

A similar problem was previously noted by Farrow et al. 48, who observed that order parameters obtained from fitting relaxation data measured at several field strengths have low precision (though they are more accurate than order parameters obtained from data at one field strength) due to poor fits of multi-field data to a LS spectral density function. Other examples of discrepancies in the LS parameters derived from relaxation measurements at several fields can be found elsewhere 45,59,60.

It is noteworthy that for most residues in GB3, the observed difference in the order parameters appears systematic, i.e. it increases with the field strength (Figs. 6a,f). This tendency is present even in the data obtained using the average CSA of -174.2 ppm (Figs. 6b,g). This behavior could arise from (1) conformational exchange contributions to 15N R2 not accounted for in the analysis or (2) deviations in the site-specific values of 15N CSA from their assumed values. Site-specific deviations in the 1H-15N bond length from a uniform value of 1.02 or 1.04 Å could also result in erroneous order parameters; however, the currently available experimental data on variations in the NH bond length in proteins are insufficient to address this issue rigorously. A failure of the LS spectral density model to accurately represent data at multiple fields cannot be excluded (e.g. 45,54), particularly with regard to the uncoupling of local and global motions; however, our analysis (below) indicates that a modified LS model (using site-specific CSAs) fits the observed spectral densities in GB3 well.

Several lines of evidence suggest that conformational exchange is not the source of the observed discrepancy in the order parameters in GB3. As shown earlier 32, conformational exchange contributions are negligible for most amides in GB3, except Val39. This conclusion is also confirmed by the agreement (Supporting Fig. 1) between the measured R2s and their reconstructed “exchange-free” values 7, R2free′ = R1′·ηxy/ηz. The exclusion of conformational exchange as a possible cause of the observed discrepancy between the S2 values is further supported by the results of a LS analysis of the data at the individual fields. Here, 12 residues (excluding Val39) (Tyr3, Leu5, Ile7, Thr16, Ala23, Tyr33, Asp36, Asn37, Asp40, Thr44, Ala48, and Thr51) required an Rex-containing model of local motion 33 at 18.8 T, where the Rex contribution is expected to be the strongest. These Rex values were relatively small (maximum 0.53±0.10 s-1 for Asp36 at 18.8 T) and likely reflect errors in LS model-selection, because the only residue that systematically showed conformational exchange at all five fields was Val39. In addition, excluding R2s from the simultaneous analysis of the five-field data (hence using only R1s and NOEs, as suggested in 45) did not improve the quality of fit for CSA=-160 ppm: only 9 residues passed the goodness-of-fit test (Tyr3, Leu5, Lys13, Thr16, Lys19, Ala23, Ala29, Thr51, and Thr5) in this case. Note also that in terms of spectral densities, the presence of an Rex contribution will affect J(0) but not the J(ωN) values (Eqs.17,16); therefore, the introduction of the Rex terms might force the J(0) values from different fields to converge, but will not improve the fit of spectral densities at ω=ωN (Fig. 7, Supporting Fig. 7) derived assuming a uniform CSA of -160 ppm or even -174.2 ppm (see below). Finally, the Rex-free values of the overall diffusion tensor obtained solely from the cross-correlation measurements are in excellent agreement with those from the R2/R1 ratio (Table 2).

Figure 7. Illustration of the LS fit of the spectral density components determined at all five fields.


Representative LS fit of all spectral density components from the five-field measurements for Phe30. Symbols depict the J(ω) values for ω = 0, ωN, and 0.87ωH derived from relaxation data for each field separately (Eqs. 15-17) assuming a CSA of -160 ppm (open circles) or the CSA value of -199.1 ppm for Phe30 that optimizes the fit (solid circles). The corresponding fitting curves are shown as dashed and solid lines, respectively. Shown in the insets is a blow-up of the regions corresponding to ω = ωN and 0.87ωH, indicated as “ωN” and “ωH”. The values of S2 and τloc were 0.93 and 3.0 ps when using a CSA of -160 ppm, and 0.81 and 10.3 ps for the fitted CSA value. A 35-fold decrease in χ2/df was observed when using the CSA and the LS parameters from the LS-SDF fit. The Δσ value derived using the 2R2-R1 method (-194.3 ppm for Phe30) resulted in a fit which was practically indistinguishable from the LS-SDF fit shown here, as did the use of the CSA value (Δσ = -196.9 ± 2.93 ppm) from the LS-CSA fit for Phe30. For comparison, the result of this fit when the mean site-specific CSA of -174.2 ppm is used is shown in Supporting Fig. 7.

Backbone order parameters: the effect of site-specific 15N CSAs

To verify that the observed field-dependence in the order parameters (Fig.6a) could reflect site-specific variations in the 15N CSA (Fig.2) unaccounted for in the conventional LS analysis, we performed the same derivation as above, this time using as input the site-specific 15N CSA values measured using the model-independent approaches. As shown in Figs.6c,d,h,i, the inclusion of site-specific 15N CSA has dramatically reduced the variation in the order parameters among the fields, which is now within the level of experimental noise for most residues.

We therefore modified the LS analysis by including the CSA as an additional fitting parameter (LS-CSA method, Materials and Methods). This resulted in a significant improvement in the quality of fit of the five-field data for the majority of residues in GB3. For example, when the 15N CSA was allowed to vary in the LS-CSA method, the mean χ2/df for residues in the secondary structure dropped from 5.12 (for a uniform CSA of -160 ppm) to a value of 0.92. All of the secondary structure residues except for Ala26 and Phe52 now have χ2/df low enough to pass the goodness-of-fit test at a 95% confidence level. Altogether, 47 out of 49 analyzed residues exhibited a decrease in χ2 of the LS fit, and in 40 residues there is also a decrease in χ2/df. The residues where the χ2/df is not improved (Asn8, Leu12, Lys13, Thr16, Gly38, Asp40, Gly41, Asp47, and Thr49) are all in flexible regions of GB3 except for Thr16 for which the resulting CSA (-162.3 ppm) is very close to -160 ppm and the residuals of fit were already sufficiently low: χ2/df =0.56 and 0.67 for the LS and LS-CSA methods, respectively.

For those residues where a reduction in χ2 was accompanied by an increase in the number of fitting parameters (33 residues in GB3), a statistical F-test was performed 50 to determine if the improvement in the χ2 was significant. For 31 (94%) of these residues, the reduction in the χ2 is statistically justified at a 95% significance level or higher (i.e. the probability, P, that the reduction in χ2 occurred by chance is P < 0.05). For 25 (76%) of these residues the significance level is higher than 99% (i.e. P < 10-2), and for 22 (67%) of these residues the significance level is even higher than 99.9% (i.e. P < 10-3).
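The F-test for an added fitting parameter can be sketched as follows. This is a generic nested-model F-test, assumed here to correspond to the test of ref. 50; the χ2 values, the number of data points (15), and the parameter counts (2 versus 3, i.e. with the CSA as the extra parameter) are illustrative and are not taken from the GB3 data. The function name is hypothetical.

```python
from scipy.stats import f as f_dist


def f_test_extra_parameter(chi2_simple, p_simple, chi2_extended, p_extended, n_data):
    """F statistic and probability P that the chi-square reduction occurred by chance."""
    dof_extra = p_extended - p_simple          # number of added parameters
    dof_resid = n_data - p_extended            # degrees of freedom of the extended fit
    f_stat = ((chi2_simple - chi2_extended) / dof_extra) / (chi2_extended / dof_resid)
    return f_stat, f_dist.sf(f_stat, dof_extra, dof_resid)


# Illustrative numbers only: chi2 drops from 30 to 10 when one parameter is added
f_stat, p_chance = f_test_extra_parameter(30.0, 2, 10.0, 3, n_data=15)
print(f_stat, p_chance)
```

In this example F = (20/1)/(10/12) = 24, and the corresponding P falls below 10-3, so the extra parameter would be accepted at the 99.9% significance level.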

The order parameters derived from a simultaneous (global) fit of data from all five fields using the LS-CSA method are shown as open symbols in Fig. 6e. All three regression methods (the least-squares and two robust methods) were in good agreement (within the experimental uncertainty for both S2 and the CSA) for 28 out of 49 amides in GB3 (Gln2 is not included here because its coordinates are unavailable from the crystal structure). For an additional four residues (Gly9, Asp36, Asn37, and Gly41) the two robust methods agreed within their experimental uncertainties (68.3% confidence interval). For 17 residues (Tyr3, Ile7, Asn8, Lys10, Thr11, Leu12, Ala20, Val21, Asp22, Gly38, Asp40, Asp46, Asp47, Ala48, Thr49, Lys50, and the C-terminal Glu56), all of which are either in the loops/termini or at the edges of secondary structure elements, no CSA is reported here for the LS-CSA method because all three regression methods disagreed for either S2 or Δσ.

The site-specific 15N CSA values from the LS-CSA method were in the range from -126.0 ± 3.9 ppm (Ala26) to -243.4 ± 4.7 ppm (Phe52), with a mean of -176.9 ppm, a median of -176.8 ppm, and standard deviation of 20.0 ppm. The average estimated level of the experimental errors is 1.76% (or 3.1 ppm) for the CSA, which gives a true site-to-site CSA variability Λ of 19.2 ppm and a true mean of -176.9 ppm. We estimate 95% confidence limits from this method to be from -168.0 to -185.7 ppm for μ and from 14.3 to 27.3 ppm for Λ (Fig. 4).

Using these site-specific 15N CSA values as input for the LS analyses at separate fields resulted in a further reduction in the spread of the order parameters among the fields (Figs. 6e,j). These results clearly indicate that the discrepancy in the order parameters in Fig.6a is caused by site-specific variations in the 15N CSA.

LS fit of the spectral densities directly

A direct analysis of the spectral densities produced similar results. For a uniform CSA of -160 ppm, the χ2/df of the fit of the spectral density functions at all five fields for the secondary structure elements of GB3 ranges from 0.46 (Thr16) to 20.6 (Trp43) with a mean value of 4.73. The quality of the fits of the spectral density functions is shown in Fig. 7; a similar comparison for a CSA of -174.2 ppm can be found in Supporting Fig. 7. Overall, the major discrepancies between the experimental data and the LS model were at ω=0, due to the spread in the J(0) values derived at various fields, and at ω=ωN, where the experimental J(ωN) values noticeably deviate from the theoretical curve. There is good agreement for the high-frequency components (which are CSA-independent), particularly taking into account the reduced spectral density approximation 48,49 (Eq.15) made when deriving J(0.87ωH) from the experimental data.

The inclusion of CSA as a third fitting parameter (in addition to S2 and τloc, see LS-SDF in Materials and Methods) resulted in the reduction of the residuals of fit for 29 out of 35 residues (or 83%) in the secondary structure elements; the χ2/df with CSA as an additional adjustable parameter ranged from 0.3 (Thr18) to 6.1 (Phe52) with a mean of 1.25. The LS-SDF method resulted in a significantly better convergence of J(0) values from different fields and, at the same time, in a better fit of the J(ωN) values (Fig. 7). A similar improvement in the fit was obtained when using site-specific CSA values from the 2R2-R1 method, resulting in reduced χ2/df for 27 amides in the secondary structure.

Discussion

The agreement between the 15N CSA values in GB3 derived from various methods

There is excellent agreement between the results of the LS-CSA and LS-SDF methods: for the residues in the secondary structure, the CSAs from the two methods agree within their errors and have a correlation coefficient of 0.99. The order parameters and τloc values derived using these methods agree within their respective errors for all but two residues (Ala23 and Lys28) in the secondary structure. For those residues where there is good agreement, this indicates that the use of reduced spectral densities does not significantly alter the values of these parameters.

Furthermore, the CSA values from these two approaches based on the LS form of the spectral density function are in good agreement with the results of the model-independent approaches (Fig.2c,d). For all residues in GB3, the Pearson’s correlation coefficient is 0.95 between the CSAs from the LS analyses and the 2R2-R1 method and 0.80 between the CSA values from the LS analyses and those measured using the R/η method. To validate the characteristics of the backbone dynamics (S2, τloc) derived simultaneously with site-specific 15N CSAs (LS-CSA method), we compared the spectral density J(ω) at ω=0 reconstructed from these data with J(0) values obtained directly from the 2R2-R1 method (recall that this latter J(0) is independent of the 15N CSA). The good agreement between the two values of J(0) (Fig.3) for the secondary structure elements of GB3 thus validates the LS parameters derived using the LS-CSA method.

The distribution of site-specific 15N CSA values

The range of 15N CSAs obtained from all abovementioned methods for each residue in GB3 is shown in Fig. 8, together with a histogram of the average CSA values (from the three determination methods) for each residue. The likelihood functions p(μ,Λ) (Eq. 19) generated from the results of each of the three CSA determination methods are shown in Fig. 4. The true mean CSA values (μ) from these methods are on average slightly higher in absolute value than those observed earlier in ubiquitin (mean CSA = -157 ppm) 5,6,27 and in Rnase H (μ = -172 ppm) 8, and slightly lower than those recently reported for ubiquitin 29 (μ = -179.6 ppm when scaled to a NH bond length of 1.02 Å), although within the average uncertainty of these measurements. These site-specific 15N CSA values were then combined with the isotropic chemical shift data in order to reconstruct the individual components of the 15N CST in GB3, assuming axial symmetry of the tensor (Supporting Tables 5a,b).

Figure 8. Site-specific 15N CSA values, averaged over all three methods, show significant CSA variability in GB3.


(a) Range of 15N CSAs for each backbone amide in GB3 from the three methods (2R2-R1, R/η, and LS-CSA) shown as solid vertical bars. The open symbols represent the average site-specific CSA, Δσ, from the three methods; the error bars represent the maximum error from the three methods for each residue. (b) A histogram of the average site-specific CSA values shown in panel (a). Including these average site-specific CSA values into the analysis of the derivation of the true CSA values (Eq.19) resulted in the true mean μ = -173.8 ppm and the site-to-site variability Λ = 21.2 ppm (Table 1). The black curve represents a Gaussian distribution with the mean of -174.2 ppm and the standard deviation of 22.2 ppm. The dashed curve is also a Gaussian, with the same mean but with a standard deviation of 13.0 ppm - this curve corresponds to the case when all seven outliers in panel (b) are taken out.

We observed no significant correlation between CSA values and secondary structure or amino acid type. There is no obvious correlation with the isotropic chemical shifts (Supporting Fig.8), although some residues with large |Δσ|, in particular Phe52 and Trp43, do show large isotropic shifts, while Thr49 and Gly38 have both small isotropic chemical shifts and |Δσ|. The mean CSAs of residues in the α-helix and β-strands are shown in Table 1. There is a weak correlation between the βz angles and secondary structure, with slightly smaller angles in the β-strands (mean angle 18.9°) and turns (mean angle 19.1°) than in the helix (where the mean angle is 21.0°). Both the CSAs and βz angles show smaller variation in the α-helix (where the standard deviations in the CSA and the angle are 18.1 ppm and 3.1°, respectively) compared to the β-strands (18.6 ppm and 4.7°), and even larger variations were observed in the loops/turns (26.3 ppm and 7.5°), possibly consistent with significantly different electronic arrangement in the secondary structures.

Site-to-site 15N CSA variability in GB3: comparison with literature data

The true site-to-site variability Λ in 15N CSA obtained here is comparable to the standard deviation of the CSA values in ubiquitin 5,6 but significantly larger than the Λ values reported for Rnase H 8 and recently for ubiquitin 29. The CSA distribution in ubiquitin, reconstructed from the individual CST components reported in ref 30, is in better agreement with our data for GB3: the standard deviations in these CSAs range from 10.1 to 13.7 ppm, and the site-to-site variability, Λ, from 7.8 to 10.5 ppm, depending on the model of local motion.

The value of Λ extracted from the observed site-specific CSA values, naturally, depends on the experimental uncertainties in CSA. Therefore, at least in principle, higher Λ values in GB3 could be a result of an underestimation of the experimental errors in the CSA. However, several lines of evidence suggest that this is not the case here. First of all, the residuals of fit from the diffusion tensor analyses (Table 2, rightmost column) are smaller than the ideal value of χ2/df ∼ 1. This suggests that the errors in the relaxation and cross-correlation rates were possibly overestimated rather than underestimated. Second, the residuals of fit in the LS analysis (uniform CSA of -160 ppm) of the autorelaxation data and NOEs at each field separately passed the goodness-of-fit test for the overwhelming majority of residues in GB3 (98%, 96%, 100%, 98%, and 84% of residues passed the 95% confidence test at 9.4, 11.7, 14.1, 16.4 and 18.8 T, respectively, and 97% overall), also suggesting that the errors in the relaxation data were not underestimated. Similar results were obtained for a CSA of -174.2 ppm. Third, in order to reduce Λ to the 5 ppm level reported in 8,29, we had to scale up significantly the experimental errors in CSA (by a factor of 3 for the R/η method, 4 for the LS-CSA method, and >6.5 for the 2R2-R1 method) assuming that all errors are uniformly underestimated. This scaling factor is too big, given the reasonable χ2/df values in all other fits presented here.
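The interplay between the assumed experimental errors and the extracted Λ can be illustrated with a minimal sketch. It assumes that the likelihood p(μ,Λ) has the usual form for normally distributed true CSAs observed with independent Gaussian errors, so that each measured value has variance Λ² + σi²; the exact form of Eq. 19 is not reproduced here and may differ in detail. The data are synthetic and all names and numbers are hypothetical.

```python
import numpy as np

def log_likelihood(mu, lam, csa, err):
    """Log-likelihood of (mu, Lambda): each CSA ~ N(mu, Lambda^2 + err_i^2)."""
    var = lam ** 2 + err ** 2
    return float(-0.5 * np.sum(np.log(2.0 * np.pi * var) + (csa - mu) ** 2 / var))

def fit_mu_lambda(csa, err, mu_grid, lam_grid):
    """Grid search for the (mu, Lambda) pair that maximizes the likelihood."""
    best = max((log_likelihood(m, l, csa, err), m, l)
               for m in mu_grid for l in lam_grid)
    return best[1], best[2]

# Synthetic data set (not the GB3 data): true mean -174 ppm, true Lambda 20 ppm,
# and an experimental error of 8 ppm for every residue.
rng = np.random.default_rng(0)
n = 50
err = np.full(n, 8.0)
csa = -174.0 + rng.normal(0.0, 20.0, n) + rng.normal(0.0, err)

mu_grid = np.arange(-190.0, -159.5, 0.5)
lam_grid = np.arange(0.0, 40.5, 0.5)

mu_hat, lam_hat = fit_mu_lambda(csa, err, mu_grid, lam_grid)
# Same data, but with all errors assumed to be uniformly underestimated by a factor of 3
mu_scaled, lam_scaled = fit_mu_lambda(csa, 3.0 * err, mu_grid, lam_grid)
print(mu_hat, lam_hat, mu_scaled, lam_scaled)
```

Inflating the errors moves variance from the site-to-site term into the measurement term, so the fitted Λ decreases. This is the sense in which a uniform error scaling would be needed to bring Λ down to the 5 ppm level.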

In addition, to further explore this issue, we introduced a χ2/df cutoff level (determined here by a 95% confidence level for the goodness-of-fit test 50) as a highly conservative criterion for eliminating fits from consideration. This cutoff excludes those residues where the robust regressions were acceptable but the χ2/df of the least-squares fit was too high due to an outlier that was effectively ignored by the robust methods: there are 9 such exclusions from the 2R2-R1 method, 6 from R/η, and 4 from the LS-CSA fit. If only those residues with the χ2/df of the least-squares fit lower than its 95% confidence limit are considered (32 amides from the 2R2-R1 method, 33 from R/η, and 25 from LS-CSA, represented by the filled symbols in Figs. 2b,c,d, and Fig. 5b), the CSA variability from each method is reduced to what could probably be considered its lower bound in GB3: Λ2R2-R1=10.6 ppm, ΛR/η=10.2 ppm, and ΛLS-CSA=11.9 ppm. These estimates of the site-to-site CSA variability are still, consistently, almost a factor of two higher than those reported for Rnase H 8 or recently for ubiquitin 29.

The results obtained here also differ from the 15N CSA statistics in short peptides, where for a set of 39 solid-state NMR data (summarized in ref 60) we estimate a mean CSA of -155.8 ppm and a standard deviation of the distribution of 5.8 ppm. The larger range of CSA variability in GB3 compared to peptides could reflect greater internal structural heterogeneity in proteins.

To explore the effect of outliers as a possible source of the higher CSA variability observed here, we excluded from the set of residues for which p(μ, Λ) was generated for each method the extrema of the corresponding CSA range (Fig.2a, Fig.8b). The mean CSA values were largely unchanged (μ=-174.0, -177.4, and -176.3 ppm, for 2R2-R1, R/η, and LS-CSA, with Leu12 and Phe52, Ala26 and Phe52, and Gly38 and Phe52, excluded respectively) and the measures of the site-to-site variability Λ were reduced to 17.2, 14.1, and 13.3 ppm, respectively. Restricting the CSA distribution even further by excluding all seven outliers in Fig.8 (Leu12, Ala20, Ala26, Gly38, Ala48, Thr49, and Phe52), thus effectively reducing the distribution to that contained within the dashed Gaussian curve shown in Fig. 8, reduced the calculated site-to-site CSA variability Λ to 11.5, 13.8, and 13.1 ppm (for 2R2-R1, R/η, and LS-CSA, respectively), while the values of the true mean μ were only slightly affected (-176.5, -177.9, and -176.3 ppm, respectively). These exclusions also resulted in similar changes for the distribution function generated from the average CSAs of the three methods (Fig. 8): μ = -175.1 ppm and Λ = 13.5 ppm. Note that all these reduced estimates of the site-to-site variability in 15N CSA are still significantly larger than those reported in 8,29.

In summary, all these data then suggest that the site-to-site variability in 15N CSA reported here for GB3 is most probably correctly estimated, or underestimated. This conclusion has important implications for the analysis of protein dynamics, since this degree of variability in the 15N CSA means that the assumption of a uniform 15N CSA value could result in significant errors in LS parameters.

Is there a correlation between the individual components of the 15N chemical shielding tensor?

It is instructive to discuss the CSA variability obtained here in relationship to the spread in the isotropic chemical shifts in GB3. The isotropic chemical shift (δiso) and the CSA are both combinations of the principal values of the 15N CST: δiso = (δxx + δyy + δzz)/3 ≈ σref - (σxx + σyy + σzz)/3; Δσ ≈ σzz - (σxx + σyy)/2, where σref is the isotropic shielding of the reference compound, and the equation for Δσ used here is an approximate form of Eq.2, which is exact in the case of the axial symmetry of the CST, Eq.4. Assuming a random model, when all three components of the 15N CST are allowed to vary from site to site and are normally distributed with equal variances 61, one can easily obtain from these equations the following relationship between the standard deviations in the CSA (here referred to as the variability Λ) and in the isotropic chemical shift (Δδiso):

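$$\Lambda = \frac{3}{\sqrt{2}}\,\kappa\,\Delta\delta_{iso}, \qquad (20)$$

where κ is a numeric coefficient reflecting the interrelationship between the individual components of the CST:

$$\kappa = \sqrt{\frac{3 - 2R_{zx} - 2R_{zy} + R_{xy}}{3 + 2R_{zx} + 2R_{zy} + 2R_{xy}}} \qquad (21)$$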

Here Rij is the correlation coefficient between σii and σjj. In a particular case when all three CST components vary completely independently, κ = 1. Given that the standard deviation of the isotropic chemical shift in GB3 is 6.5 ppm, the expected value of Λ in this case is 13.8 ppm. This number is smaller than the CSA variability obtained for all residues in GB3 (Λ2R2-R1=21.4, ΛR/η=17.6, and ΛLS-CSA=19.2 ppm) but slightly larger than the values (Λ2R2-R1=10.6, ΛR/η=10.2, and ΛLS-CSA=11.9 ppm) obtained when considering only those residues with χ2/df below the 95% goodness-of-fit cutoff. The deviation in the value of κ from 1 suggests that the individual components of the 15N CST are not independent of each other; however, it is impossible at this stage to draw a more definitive conclusion about the correlation coefficients between the individual components, and further studies are required to address this issue.

For example, it follows from Eq.20 that a positive correlation between σxx and σyy, both being independent of σzz, will give κ < 1 (with the lower bound at κ = 2/√5), while an anti-correlation of these two components will result in κ > 1 (up to κ = √2), with the upper bound on the CSA variability at Λ = 3 Δδiso (or 19.5 ppm for GB3). It has been suggested 22 that σxx and σyy possibly vary in an anti-correlated manner; this would be consistent with the CSA variability in GB3 larger than 13.8 ppm. However, if the 15N CST is truly axially symmetric (i.e. σxx = σyy, hence Rxy = 1), then the Λ value is expected to be smaller, Λ = 3√(2/5) Δδiso, which gives the CSA variability around 12.3 ppm for GB3, again assuming that σxx and σzz (or σ⊥ and σ∥ in this case) are normally distributed and vary independently. A positive correlation between σ∥ and σ⊥ will further reduce the Λ values, down to zero at full correlation, while the anti-correlation will result in greater Λs, with an upper bound at Λ = 6 Δδiso = 39 ppm. Using the correlation coefficients calculated from a collection 60 of 39 solid-state NMR data on short peptides, Rzx = 0.06, Rzy = 0.43, Rxy = -0.12, one would expect Λ of 14 ppm in GB3. Inserting into Eq.20 the correlation coefficients between the individual components of the 15N CST recently measured in ubiquitin 30, we estimate Λ to range from 9.6 to 13.3 ppm in ubiquitin (where the standard deviation in the isotropic chemical shift is 5.9 ppm) and from 10.5 to 14.6 ppm in GB3.
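The limiting cases discussed above can be checked numerically. The sketch below reads Eqs. 20 and 21 as Λ = (3/√2) κ Δδiso with κ² = (3 − 2Rzx − 2Rzy + Rxy)/(3 + 2Rzx + 2Rzy + 2Rxy), which is what follows from the expressions for δiso and Δσ under the equal-variance assumption; the function names are hypothetical.

```python
import math

def kappa(r_zx=0.0, r_zy=0.0, r_xy=0.0):
    """Coefficient of Eq. 21 for given correlation coefficients R_ij."""
    num = 3.0 - 2.0 * r_zx - 2.0 * r_zy + r_xy
    den = 3.0 + 2.0 * r_zx + 2.0 * r_zy + 2.0 * r_xy
    return math.sqrt(num / den)

def csa_variability(sd_iso, r_zx=0.0, r_zy=0.0, r_xy=0.0):
    """Expected CSA variability Lambda (Eq. 20) from the spread in delta_iso."""
    return 3.0 / math.sqrt(2.0) * kappa(r_zx, r_zy, r_xy) * sd_iso

sd_iso_gb3 = 6.5  # ppm, standard deviation of the isotropic shift in GB3

independent = csa_variability(sd_iso_gb3)                  # all components independent
xy_correlated = csa_variability(sd_iso_gb3, r_xy=1.0)      # axial symmetry, sigma_zz independent
xy_anticorrelated = csa_variability(sd_iso_gb3, r_xy=-1.0)
axial_anticorrelated = csa_variability(sd_iso_gb3, r_zx=-1.0, r_zy=-1.0, r_xy=1.0)

print(round(independent, 1), round(xy_correlated, 1),
      round(xy_anticorrelated, 1), round(axial_anticorrelated, 1))
```

These four cases correspond to the values of 13.8, 12.3, 19.5, and 39 ppm quoted in the text.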

Possible sources of systematic errors in 15N CSA determination from multiple-field data

In addition to the imprecision in the CSA values caused by random noise associated with the measurements, there could be systematic errors (largely inaccuracy) stemming from the underlying assumptions in the analysis. Here we focus on some of them; a detailed analysis can be found elsewhere 31.

The N-H bond length

As is clear from Eqs. 9, 12, 14, the 15N CSA values are determined via the dipolar term d and hence depend on our knowledge of the NH-bond length. Two aspects are of importance here. First, a uniform value of the NH bond length is usually assumed. Site-to-site variations in rHN will necessarily affect the Δσ values. Thus, a small, unaccounted for, deviation in the bond length by δrHN will introduce a relative error in the CSA value of the order of 3(δrHN/rHN). However, the currently available information on the variations in the NH-bond length is insufficient for a rigorous analysis of this issue. Second, the CSA values derived here were obtained assuming the NH-bond length of 1.02 Å. For comparison with the CSA data obtained for rHN = 1.04 Å, our results should be uniformly scaled by (1.02/1.04)³ = 0.94 (see also 31). Thus, the mean 15N CSA and the site-to-site variability (average of all three methods) obtained here correspond to -164.3 ppm and 20.0 ppm, respectively, if rHN is 1.04 Å.
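The bond-length rescaling is a one-line calculation; a minimal sketch (function names hypothetical) that reproduces the scaling factor and the rescaled mean CSA quoted above:

```python
def rescale_csa(csa_ppm, r_from=1.02, r_to=1.04):
    """Rescale a CSA derived for NH bond length r_from (in Angstrom) to r_to."""
    return csa_ppm * (r_from / r_to) ** 3

def relative_csa_error(delta_r, r_hn=1.02):
    """Approximate relative CSA error caused by a bond-length deviation delta_r."""
    return 3.0 * delta_r / r_hn

scale = (1.02 / 1.04) ** 3
print(round(scale, 2))                  # scaling factor
print(round(rescale_csa(-174.2), 1))    # mean CSA rescaled to r_HN = 1.04 Angstrom
print(round(relative_csa_error(0.01), 3))
```

The last line shows that an unaccounted deviation of 0.01 Å in the bond length corresponds to a relative error in the CSA of about 3%.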

The spectral densities

The usual assumption made when analyzing 15N relaxation data, be it the LS approach or the model-independent analyses, is to neglect the difference between the spectral densities describing the effect of motion on the contributions to the spin Hamiltonian from the 15N-1H dipolar interaction (JDD(ω)) and from the 15N CSA (JCSA(ω)), i.e. JDD(ω)=JCSA(ω)=J(ω). In general, however 62,31, JDD(ω) ≠ JCSA(ω), and a correction for the difference between the spectral densities can be included as:

$$\Delta\sigma_{correct} = \Delta\sigma \cdot f, \qquad (22)$$

where f is the correction factor: f = [JDD(0)/JCSA(0)]½ for the 2R2-R1 method, f = {[4JDD(0) + 3JDD(ωN)]/[4JCSA(0) + 3JCSA(ωN)]}½ for R2/ηxy, and f = [JDD(ωN)/JCSA(ωN)]½ for R1/ηz. There are several reasons why the spectral densities JDD(ω) and JCSA(ω) are not the same 31.

First, the very nature of the chemical shielding suggests that it should fluctuate when the local environment of a nucleus changes as a result of internal motions in a protein. Here not only the orientation of the CST (as usually assumed in the equations relating relaxation rates to the spectral densities) but also the principal values themselves are expected to fluctuate. In contrast, the NH-bond length is less likely to change, except when transient hydrogen bonding occurs in the course of protein dynamics. Note also that the changes in local environment that modulate the CST do not necessarily have to affect the orientation of the NH bond. A detailed analysis of the “breathing” of the 15N CST requires molecular dynamics simulations (e.g. 63) and is beyond the scope of this paper.

Second, even when neglecting the differences in the mechanisms of modulation of these two tensors by motions within a protein, the difference between the spectral densities is expected to arise from the fact that the CSA and dipolar tensors are not collinear. As follows from our data (Fig.5c), the average angle βz between the NH vector and the z-axis of the CSA tensor is 19.9°. The effect of CSA-dipolar noncollinearity on the contribution to the spectral density from anisotropic overall tumbling has been analyzed in detail in 62. Our calculations (not shown) using the average site-specific CSAs from the three methods and the βz angles (from R/η, Fig.5c) for GB3 resulted in the contributions from the noncollinearity to relaxation and cross-correlation rates that were on average within their respective experimental errors. As a result, the inclusion of these corrections in the model-independent and LS methods outlined above had no significant effect on the derived CSA values.

In addition, because of the anisotropic character of backbone motion in proteins 64,65, where the principal mode of motion is rocking of the peptide plane about the Cα-Cα axis, the CSA-dipolar noncollinearity will result in different amplitudes (and associated order parameters) of the NH-vector and CSA tensor motions. To investigate the effect of noncollinearity due to anisotropic backbone motions, we explored the difference in the order parameters for the NH vector and for a vector (representing the σzz axis) tilted by 20° towards the carbonyl atom in the peptide plane in a model system undergoing angular fluctuations about the Cα-Cα axis. We found that the maximum difference in the squared order parameters for these vectors was 5%, with S²CSA always smaller than S²NH, for a rotational angle of 40°, which is well above the maximum amplitude of Gaussian angular fluctuations (GAF) about this axis recently reported for GB3 66. Assuming that the correlation time of the GAF motion is similar to that of the LS model, and that the order parameters are close to 1, Eq.22 gives f ≈ SNH/SCSA < 1.03. This difference in the order parameters is insufficient to account for the large variability in the CSA that we observe in GB3. For example, if we assume for the sake of argument that the CSA in GB3 has a uniform mean value of -174.2 ppm, the factor f would have to range from 0.7 to 1.6 (hence JDD(0)/JCSA(0) from 0.5 to 2.6) to account for the observed range of CSAs from the 2R2-R1 method. Similarly, to account for all the variability in the R/η measurements with respect to the average, f would have to vary from 0.7 to 1.4.
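The size of the correction factor in these two scenarios can be compared with a short calculation. This is a sketch only, using the approximation f ≈ SNH/SCSA stated above for the GAF case and the relation JDD(0)/JCSA(0) = f² for the 2R2-R1 method; the function names are hypothetical.

```python
import math

def f_from_order_parameters(s2_nh, s2_csa):
    """Correction factor f ~ S_NH / S_CSA (order parameters close to 1)."""
    return math.sqrt(s2_nh / s2_csa)

def spectral_density_ratio(f):
    """J_DD(0)/J_CSA(0) implied by a given f for the 2R2-R1 method."""
    return f ** 2

# Maximum 5% difference in squared order parameters found for the GAF model
f_gaf = f_from_order_parameters(1.0, 0.95)

# Range of f that would be needed to explain the 2R2-R1 CSA range with a uniform CSA
ratio_low = spectral_density_ratio(0.7)
ratio_high = spectral_density_ratio(1.6)

print(round(f_gaf, 3), round(ratio_low, 1), round(ratio_high, 1))
```

The GAF-based factor stays within about 3% of unity, whereas explaining the observed CSA range would require the ratio of the two spectral densities to vary by roughly a factor of five across the protein.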

The assumption of axial symmetry of the overall tumbling

The order parameters and the 15N CSA values derived from the LS-based methods (but not those from the model-independent approaches) are sensitive to the model of overall tumbling used for the analysis. As demonstrated earlier 32 and further supported by the data presented here (Table 2, Supporting Tables 3,4), the overall tumbling of GB3 in solution is anisotropic. While the axially symmetric and fully anisotropic tumbling models both provide a significant improvement in the fit over the isotropic diffusion model, the axially symmetric model for the overall tumbling was assumed here, based on several lines of evidence.

  1. The molecular shape of GB3 to a good approximation is axially symmetric. The ratio of the principal values of the inertia tensor of the molecule is 1.80:1.79:1.00. Moreover, theoretical predictions for GB3 32 based on hydrodynamic calculations using the HYDRONMR program 67 give a rotational diffusion tensor with the ratio of the principal components of 1.00:1.05:1.43, which suggests a high degree of axial symmetry.

  2. The fully anisotropic diffusion tensor derived from the relaxation data (Supplementary Table 3) also shows a high degree of axial symmetry, with the principal values of the tensor, Dxx and Dyy, within their mutual errors at all fields. Also, a global fit of the relaxation data at all five fields resulted in a diffusion tensor with near zero rhombicity (0.08). This is also reflected in the large experimental uncertainties in the orientation of the x- and y-axes of the fully anisotropic tensor (angle Ψ, Supporting Table 4), indicating that the orientations of these axes of the diffusion tensor are not well defined.

  3. Based on the statistical F-test 50, the probabilities that the observed reduction in χ2 for the fully anisotropic model compared to the axially symmetric model occurred by chance (rightmost column in Supporting Table 4) at each field are not low enough (or the corresponding F-values are not high enough) in order to reject with certainty the null-hypothesis that both models fit the data similarly well.

  4. The fact that the spectral densities obtained using the axially symmetric model are in good agreement with the model-independent analyses (Fig.3), which were derived without any assumption about the overall tumbling, further supports this conclusion.

All these observations support the conclusion that the axially symmetric diffusion tensor is an adequate model for rotational diffusion of GB3. (Note that the theoretical predictions mentioned above suggest that this is likely due to the overall shape of the protein, rather than a consequence of the quality or limited amount of experimental data.) However, to completely rule out the possibility that some 15N CSA values might be affected by the neglect of the deviation of the diffusion tensor from axial symmetry and therefore might appear site-specific due to the individual orientations of amide bonds with respect to the protein’s diffusion tensor, we also performed the LS-CSA analysis using the globally-fit fully anisotropic diffusion tensor (Supporting Table 4). These 15N CSAs are in excellent agreement (Supporting Figure 6) with the above-reported CSA values derived for the axially symmetric diffusion tensor. It should be noted here that the model-independent methods for CSA determination (2R2-R1, R/η, and ηxyz) presented in the text do not depend on the overall diffusion model. Therefore, the good agreement between the 15N CSAs determined using the LS-CSA and LS-SDF-CSA methods for an axially symmetric tumbling model with the results of the model-independent analyses (Figure 2c,d) is also a strong indication that the results presented in this paper are not affected by the axially symmetric diffusion tensor representation.

15N CSAs and the order parameters: what errors in the order parameters to expect?

As shown in this study, relaxation data at multiple fields allowed an accurate assessment of the site-specific 15N CSAs, and these values, in turn, influence the order parameters extracted from the data. Because measurements at multiple fields (particularly higher fields) are not always available to a general NMR user, it is instructive to estimate the level of uncertainties in the order parameters expected from the use of a constant CSA instead of the true CSA values. A comparison of the order parameters obtained from the LS-CSA analysis of all five-field data with those obtained for a typical field of 14.1 T, assuming a constant CSA, gave pair-wise rmsd values of 0.06 (or 6.5%, range of deviations from -0.06 to 0.11) for -160 ppm and 0.04 (or 4.1%, range from -0.09 to 0.07) for -174.2 ppm. The corresponding numbers for 11.7 T were, naturally, smaller: rmsd = 0.04 (4.9%, range from -0.04 to 0.09) for -160 ppm and 0.03 (3.2%, range -0.07 to 0.06) for -174.2 ppm. This comparison included only residues in the secondary structure of GB3; the deviations in the loop regions could be larger. Thus, even at low fields, the errors in the order parameters might not be negligible, particularly for those applications where quantitative changes in order parameters are of importance (e.g. entropy changes monitored by 15N relaxation).

Conclusions

Here we presented a comprehensive study of the 15N chemical shielding anisotropy in a protein based on a combination of 15N relaxation and CSA/dipolar cross-correlation measurements at five static magnetic fields. The analysis was performed using various combinations of the experimental data and using model-independent approaches as well as methods based on the Lipari-Szabo approximation. The results indicate significant site-to-site variations in the principal values and the orientation of the 15N CSA, similar to those observed earlier in ubiquitin 5,6. Our estimates of the true variability in the 15N CSA in GB3 depend to some degree upon which method for determining the CSA was used and which subset of residues is considered. These estimates range from 10.2 ppm (for 33 residues that pass the χ2/df cutoff from the R/η method) to 21.4 ppm for all 47 residues from the 2R2-R1 method. Although this range of values could be a result of limited statistics, all of these estimates are still larger than the derived variability in the 15N CSA from studies of ribonuclease H 8 or recently of ubiquitin using a subset of the methods used here 29. The true mean CSA values range from -173.9 ppm (2R2-R1) to -177.2 ppm (R/η).

Our data show that using the site-specific values of the 15N chemical shielding anisotropy obtained here significantly improves the agreement between LS order parameters measured at different fields and allows a simultaneous fit of the 15N relaxation data at five fields to LS spectral densities. These findings emphasize the necessity of taking into account the variability of the 15N chemical shielding tensor for accurate analysis of protein dynamics from 15N relaxation measurements. This can be achieved by including CSA as an additional fitting parameter in the LS analysis of multiple-field data, provided the sample temperature and other experimental conditions were the same at all fields/spectrometers. These analyses also show that the Lipari-Szabo form of the spectral density provides a satisfactory approximation for the experimental spectral densities obtained using the reduced spectral density approach.

Significant variation in the true CSAs from their assumed values will affect several applications of NMR relaxation that depend on 15N CSA. These include, in addition to the backbone order parameters and local correlation times from Lipari-Szabo analysis, the spectral density components (specifically, J(0) and J(ωN), but not J(ωH)) obtained from spectral density mapping, conformational exchange contributions derived from the field dependence of the 15N relaxation rates (see discussion in 6), as well as local molecular geometries and order parameters determined from cross-correlation measurements involving the CSA mechanism. Among other characteristics that could be influenced by CSA variability are changes in the local conformational entropy (for example, accompanying ligand binding) estimated from the differences in order parameters. As mentioned above, characteristics of the overall tumbling are not expected to be sensitive to the 15N CSA values when determined from the ratio of cross-correlation or reduced relaxation rates.

In contrast to other heteronuclei (e.g. carbonyl 13C 68,69) the 15N shifts (and CSAs) in proteins are not yet predictable, indicating that the subtleties of non-covalent bonding forces are still poorly understood in proteins. The site-specific 15N CSA values presented above provide experimental data for testing and calibration of theoretical methods for shielding tensor predictions.


Acknowledgements

Supported by NIH grant GM65334 to D.F. We thank Dr. Ad Bax for providing us with the protein sample and for the encouragement and Dr. David Cowburn for useful suggestions to the manuscript. The measurements at 9.4, 16.4, and 18.8 T were performed at CERM (University of Florence), and we thank Prof. Ivano Bertini for the NMR time and Dr. Rainer Kümmerle (Bruker) for help with the experimental setup.

The modified version of our program DYNAMICS that includes 15N CSA as an additional fitting parameter is available from the authors upon request.

References

  • (1).Wishart DS, Sykes BD. J. Biomol. NMR. 1994;4:171–180. doi: 10.1007/BF00175245. [DOI] [PubMed] [Google Scholar]
  • (2).Wuthrich K. NMR of Proteins and Nucleic Acids. Wiley; New York: 1986. [Google Scholar]
  • (3).Cornilescu G, Delaglio F, Bax A. J Biomol NMR. 1999;13:289–302. doi: 10.1023/a:1008392405740. [DOI] [PubMed] [Google Scholar]
  • (4).Tjandra N, Szabo A, Bax A. J. Am. Chem. Soc. 1996;118:6986–6991. [Google Scholar]
  • (5).Fushman D, Tjandra N, Cowburn D. J.Am.Chem.Soc. 1998;120:10947–10952. [Google Scholar]
  • (6).Fushman D, Tjandra N, Cowburn D. J. Am. Chem. Soc. 1999;121:8577–8582. [Google Scholar]
  • (7).Kroenke CD, Loria JP, Lee LK, Rance M, Palmer AGI. J. Am. Chem. Soc. 1998;120:7905–7915. [Google Scholar]
  • (8).Kroenke CD, Rance M, Palmer AGI. J.Am.Chem.Soc. 1999;121:10119–10125. [Google Scholar]
  • (9).Pervushin K, Riek R, Wider G, Wuthrich K. Proc.Natl.Acad.Sci.USA. 1997;94:12366–12371. doi: 10.1073/pnas.94.23.12366. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (10).Lipsitz RS, Tjandra N. J Magn Reson. 2003;164:171–176. doi: 10.1016/s1090-7807(03)00176-9. [DOI] [PubMed] [Google Scholar]
  • (11).Glushka J, Lee M, Coffin S, Cowburn D. J Am Chem Soc. 1989;111:7716–7722. [Google Scholar]
  • (12).Oldfield E. J.Biomol. NMR. 1995;5:217–225. doi: 10.1007/BF00211749. [DOI] [PubMed] [Google Scholar]
  • (13).Sitkoff D, Case D. Progr.NMR Spectr. 1998;32:165–190. [Google Scholar]
  • (14).Poon A, Birn J, Ramamoorthy A. Journal of Physical Chemistry B. 2004;108:16577–16585. doi: 10.1021/jp0471913. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (15).Harbison GS, Jelinski LW, Stark RE, Torchia DA, Herzfeld J, Griffin RG. J.Magn.Reson. 1984;60:79–82. [Google Scholar]
  • (16).Hartzell CJ, Whitfield M, Oas TG, Drobny GP. J.Am.Chem.Soc. 1987;109:5966–5969. [Google Scholar]
  • (17).Oas TG, Hartzell CJ, Dahlquist FW, Drobny GP. J.Am.Chem.Soc. 1987;109:5962–5966. [Google Scholar]
  • (18).Hiyama Y, Niu C, Silverton J, Bavoso A, Torchia D. J.Amer.Chem.Soc. 1988;110:2378–2383. [Google Scholar]
  • (19).Shoji A, Ozaki T, Fujito T, Deguchi K, Ando S, Ando I. Macromolecules. 1989;22:2860–2863. [Google Scholar]
  • (20).Mai W, Hu W, Wang C, Cross TA. Protein Science. 1993;2:532–542. doi: 10.1002/pro.5560020405. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (21).Wu CH, Ramamoorthy A, Gierasch LM, Opella SJ. J.Amer.Chem.Soc. 1995;117:6148–6149. [Google Scholar]
  • (22).Cornilescu G, Marquardt JL, Ottiger M, Bax A. J. Am. Chem. Soc. 1998;120:6836–6837. [Google Scholar]
  • (23).Boyd J, Redfield C. J. Am. Chem. Soc. 1999;121:7441–7442. [Google Scholar]
  • (24).Cornilescu G, Bax A. J. Am. Chem. Soc. 2000;122:10143–10154. [Google Scholar]
  • (25).Fushman D, Cowburn D. J.Am.Chem.Soc. 1998;120:7109–7110. [Google Scholar]
  • (26).Damberg P, Jarvet J, Allard P, Graslund A. J.Biomol.NMR. 1999;15:27–37. doi: 10.1023/A:1008308224556. [DOI] [PubMed] [Google Scholar]
  • (27).Kover KE, Batta G. J Magn Reson. 2001;150:137–146. doi: 10.1006/jmre.2001.2322. [DOI] [PubMed] [Google Scholar]
  • (28).Kurita J, Shimahara H, Utsunomiya-Tate N, Tate S. J Magn Reson. 2003;163:163–173. doi: 10.1016/s1090-7807(03)00080-6. [DOI] [PubMed] [Google Scholar]
  • (29).Damberg P, Jarvet J, Graslund A. J Am Chem Soc. 2005;127:1995–2005. doi: 10.1021/ja045956e. [DOI] [PubMed] [Google Scholar]
  • (30).Loth K, Pelupessy P, Bodenhausen G. J Am Chem Soc. 2005;127:6062–6068. doi: 10.1021/ja042863o. [DOI] [PubMed] [Google Scholar]
  • (31).Fushman D, Cowburn D. In: Methods in Enzymology. James T, Schmitz U, Doetsch V, editors. Vol. 339. 2001. pp. 109–126. [DOI] [PubMed] [Google Scholar]
  • (32).Hall JB, Fushman D. J Biomol NMR. 2003;27:261–275. doi: 10.1023/a:1025467918856. [DOI] [PubMed] [Google Scholar]
  • (33).Fushman D, Cahill S, Cowburn D. J. Mol. Biol. 1997;266:173–194. doi: 10.1006/jmbi.1996.0771. [DOI] [PubMed] [Google Scholar]
  • (34).Grzesiek S, Bax A. J.Am.Chem.Soc. 1993;115:12593–12594. [Google Scholar]
  • (35).Hall JB, Dayie KT, Fushman D. J Biomol NMR. 2003;26:181–186. doi: 10.1023/a:1023546107553. [DOI] [PubMed] [Google Scholar]
  • (36).Hall JB, Fushman D. Magnetic Resonance in Chemistry. 2003;41:837–842. [Google Scholar]
  • (37).Vasos PR, Hall JB, Fushman D. J.Biomol.NMR. 2005;31:149–154. doi: 10.1007/s10858-004-7562-8. [DOI] [PubMed] [Google Scholar]
  • (38).Note that the results previously presented in ref.32 were obtained using experimental uncertainties estimated from noise integration.
  • (39).Skelton N, Palmer A, Akke M, Kordel J, Rance M, Chazin W. J. Magn. Reson. 1993;B102:253–264. [Google Scholar]
  • (40).Canet D. Concepts Magn. Reson. 1998;10:291–297. [Google Scholar]
  • (41).Lipari G, Szabo A. J.Am.Chem.Soc. 1982;104:4546–4559. [Google Scholar]
  • (42).Lipari G, Szabo A. J.Am.Chem.Soc. 1982;104:4559–4570. [Google Scholar]
  • (43).Clore GM, Szabo A, Bax A, Kay LE, Driscoll PC, Gronenborn AM. J.Am.Chem.Soc. 1990;112:4989–4991. [Google Scholar]
  • (44).Mandel AM, Akke M, Palmer AG III. J. Molecular Biology. 1995;246:144–163. doi: 10.1006/jmbi.1994.0073. [DOI] [PubMed] [Google Scholar]
  • (45).Lee AL, Wand AJ. J.Biomol.NMR. 1999;13:101–112. doi: 10.1023/a:1008304220445. [DOI] [PubMed] [Google Scholar]
  • (46).Woessner D. J. Chem. Phys. 1962;37:647–654. [Google Scholar]
  • (47).Tjandra N, Feller SE, Pastor RW, Bax A. J. Am. Chem. Soc. 1995;117:12562–12566. [Google Scholar]
  • (48).Farrow NA, Zhang O, Szabo A, Torchia DA, Kay LE. J Biomol NMR. 1995;6:153–162. doi: 10.1007/BF00211779. [DOI] [PubMed] [Google Scholar]
  • (49).Ishima R, Nagayama K. Biochemistry. 1995;34:3162–3171. doi: 10.1021/bi00010a005. [DOI] [PubMed] [Google Scholar]
  • (50).Press WH, Teukolsky SA, Vetterling WT, Flannery BP. Numerical Recipes in C. Cambridge University Press; NY: 1992. [Google Scholar]
  • (51).Rousseeuw PJ, Leroy AM. Robust regression and outlier detection. John Wiley & Sons; New York, NY USA: 2003. [Google Scholar]
  • (52).Draper NR, Smith H. Applied regression analysis. 3rd ed. John Wiley & Sons; New York: 1998. [Google Scholar]
  • (53).Luginbuhl P, Pervushin KV, Iwai H, Wuthrich K. Biochemistry. 1997;36:7305–7312. doi: 10.1021/bi963161h. [DOI] [PubMed] [Google Scholar]
  • (54).Korzhnev DM, Orekhov VY, Arseniev AS. J Magn Reson. 1997;127:184–191. doi: 10.1006/jmre.1997.1190. [DOI] [PubMed] [Google Scholar]
  • (55).Fushman D, Cowburn D. In: Structure, Motion, Interaction and Expression of Biological Macromolecules. Sarma R, Sarma M, editors. Adenine Press; Albany, NY: 1998. pp. 63–77. [Google Scholar]
  • (56).Walker O, Varadan R, Fushman D. J. Magn. Reson. 2004;168:336–345. doi: 10.1016/j.jmr.2004.03.019. [DOI] [PubMed] [Google Scholar]
  • (57).Fushman D, Varadan R, Assfalg M, Walker O. Progress NMR Spectroscopy. 2004;44:189–214. [Google Scholar]
  • (58).Fushman D, Cowburn D. In: Protein NMR for the Millennium (Biological Magnetic Resonance Vol 20) Krishna NR, editor. Kluwer; 2002. pp. 53–78. L. B. [Google Scholar]
  • (59).Tjandra N, Wingfield P, Stahl S, Bax A. J Biomol NMR. 1996;8:273–284. doi: 10.1007/BF00410326. [DOI] [PubMed] [Google Scholar]
  • (60).Korzhnev DM, Billeter M, Arseniev AS, Orekhov VY. Progress NMR Spectroscopy. 2001;38:197–266. [Google Scholar]
  • (61).The standard deviations of the individual components of the 15N chemical shift tensor from a collection of 39 solid-state measurements in short peptides (see p. 221 in ref.60 for individual references) are approximately equal: 5.7, 7.3, and 6.5 ppm for δzz, δyy, and δxx, respectively. The standard deviations of the individual components of the 15N CST from solution NMR measurements in ubiquitin (ref.28) are also similar to one another, ranging (depending on the model of local motion) from 6.8 to 9.1 ppm, 11.3 to 13.2 ppm, and 7.3 to 8.8 ppm for δzz, δyy, and δxx, respectively. Recall that the 15N CST components are defined here such that σzz≤σyy≤σxx, i.e., σzz is the least shielded component.
  • (62).Fushman D, Cowburn D. J. Biomol. NMR. 1999;13:139–147. doi: 10.1023/a:1008349331773. [DOI] [PubMed] [Google Scholar]
  • (63).Scheurer C, Skrynnikov NR, Lienin SF, Straus SK, Bruschweiler R, Ernst RR. J Am Chem Soc. 1999;121:4242–4251. [Google Scholar]
  • (64).Bremi T, Bruschweiler R. J.Amer.Chem.Soc. 1997;119:6672–6673. [Google Scholar]
  • (65).Lienin SF, Bremi T, Brutscher B, Bruschweiler R, Ernst RR. J.Amer.Chem.Soc. 1998;120:9870–9879. [Google Scholar]
  • (66).Bouvignies G, Bernado P, Meier S, Cho K, Grzesiek S, Bruschweiler R, Blackledge M. Proc Natl Acad Sci U S A. 2005;102:13885–13890. doi: 10.1073/pnas.0505129102. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • (67).García de la Torre J, Huertas ML, Carrasco B. J. Magn. Reson. 2000;B147:138–146. doi: 10.1006/jmre.2000.2170. [DOI] [PubMed] [Google Scholar]
  • (68).Markwick PR, Sattler M. J Am Chem Soc. 2004;126:11424–11425. doi: 10.1021/ja047859r. [DOI] [PubMed] [Google Scholar]
  • (69).Markwick PR, Sprangers R, Sattler M. Angew Chem Int Ed Engl. 2005;44:3232–3237. doi: 10.1002/anie.200462495. [DOI] [PubMed] [Google Scholar]
  • (70).Ulmer TS, Ramirez BE, Delaglio F, Bax A. J Am Chem Soc. 2003;125:9179–9191. doi: 10.1021/ja0350684. [DOI] [PubMed] [Google Scholar]
  • (71).Mandel AM, Akke M, Palmer AG., 3rd J Mol Biol. 1995;246:144–163. doi: 10.1006/jmbi.1994.0073. [DOI] [PubMed] [Google Scholar]
