Abstract
Residual dipolar couplings (RDCs) are NMR parameters that provide both structural and dynamic information concerning inter-nuclear vectors, such as N–HN and Cα–Hα bonds within the protein backbone. Two approaches for extracting this information from RDCs are the model free analysis (MFA) (Meiler et al. in J Am Chem Soc 123:6098–6107, 2001; Peti et al. in J Am Chem Soc 124:5822–5833, 2002) and the direct interpretation of dipolar couplings (DIDCs) (Tolman in J Am Chem Soc 124:12020–12030, 2002). Both methods have been incorporated into iterative schemes, namely the self-consistent RDC based MFA (SCRM) (Lakomek et al. in J Biomol NMR 41:139–155, 2008) and iterative DIDC (Yao et al. in J Phys Chem B 112:6045–6056, 2008), with the goal of removing the influence of structural noise in the MFA and DIDC formulations. Here, we report a new iterative procedure entitled Optimized RDC-based Iterative and Unified Model-free analysis (ORIUM). ORIUM unifies theoretical concepts developed in the MFA, SCRM, and DIDC methods to construct a computationally less demanding approach to determine these structural and dynamic parameters. In all schemes, dynamic averaging reduces the actual magnitude of the alignment tensors complicating the determination of the absolute values for the generalized order parameters. To readdress this scaling issue that has been previously investigated (Lakomek et al. in J Biomol NMR 41:139–155, 2008; Salmon et al. in Angew Chem Int Edit 48:4154–4157, 2009), a new method is presented using only RDC data to establish a lower bound on protein motion, bypassing the requirement of Lipari–Szabo order parameters. ORIUM and the new scaling procedure are applied to the proteins ubiquitin and the third immunoglobulin domain of protein G (GB3). Our results indicate good agreement with the SCRM and iterative DIDC approaches and signify the general applicability of ORIUM and the proposed scaling for the extraction of inter-nuclear vector structural and dynamic content.
Electronic supplementary material
The online version of this article (doi:10.1007/s10858-013-9775-1) contains supplementary material, which is available to authorized users.
Keywords: RDCs, Dynamics, Ubiquitin, GB3
Introduction
Protein structure and dynamics are routinely investigated at atomic resolution with NMR spectroscopy. An essential NMR parameter that provides both structural and dynamic information is the residual dipolar coupling (RDC) between two nuclear magnetic moments, for example an N–HN or Cα–Hα vector within the polypeptide backbone of a protein (Tjandra and Bax 1997; Tolman et al. 1997). An important application of RDCs as a probe for protein dynamics was demonstrated recently: RDCs measured in 36 different alignment media showed that the ground state conformational ensemble of ubiquitin covers the conformational space captured in crystal structures of ubiquitin complexes (Lange et al. 2008; Lakomek et al. 2008). These findings provide strong support for conformational selection as a means for molecular recognition. Therefore, extracting the dynamical content from RDCs has implications for understanding the mechanisms of molecular recognition and protein function.
The significance of the RDC’s dynamical content is highlighted when considering other approaches for studying protein dynamics. Measurements of NMR spin-relaxation provide information concerning amplitudes of inter-nuclear vector motions occurring on time-scales faster than the rotational correlation time (τc) of the protein (picosecond to nanosecond) (Kay et al. 1989), which are parameterized by the Lipari–Szabo order parameter (S 2LS) (Lipari and Szabo 1982a, b). Relaxation dispersion methods probe the kinetics of conformational exchange that modulates the isotropic chemical shift and contributes to the effective transverse relaxation rate (Palmer 2004). To date, relaxation dispersion techniques have been limited to time-scales slower than approximately 25 μs (Ban et al. 2011, 2012). By contrast, RDCs provide vital insight into the amplitude and direction of internal vector motions on time-scales covering the previously inaccessible time window spanning τc to ~25 μs (referred to as the supra-τc window).
Residual dipolar couplings (RDCs) arise from placing a protein in an anisotropic medium, such as filamentous phages or lipid bilayers, or paramagnetic tagging, leading to partial alignment of the protein with respect to the external magnetic field. In the anisotropic media, all possible orientations for an inter-nuclear vector are populated with unequal probability, resulting in the dipolar couplings no longer averaging to zero. The magnitude of the measured RDC is given by the time-averaged angle between the inter-nuclear vector and the magnetic field (Tolman et al. 1997).
Since the potential to extract dynamics from RDCs was first recognized, two schemes for extracting the dynamical content from these NMR parameters in the form of a generalized order parameter have been proposed. In the model free analysis (MFA), five independent alignment media are necessary to calculate the five independent elements of the inter-nuclear vector tensor (Meiler et al. 2001; Peti et al. 2002). Figure 1 illustrates the three frames of reference used in the analysis of RDCs: the molecular frame (MF), the alignment frame (AF), and the vector frame (VF). Knowledge of the protein structure is necessary to determine the alignment tensors. With the alignment tensor information, the averages over the second rank spherical harmonics describing the mean orientations of the vectors, contained within the inter-nuclear vector tensor (see also Fig. 2), provide the desired structural and dynamic content. An alternative approach, the direct interpretation of dipolar couplings (DIDCs), was developed to bypass the need for structural input in the calculation of the inter-nuclear vector’s structural and dynamic content (Tolman 2002). Five independent alignment media are also necessary for the DIDC method. A single matrix equation is employed to represent the RDC data obtained in multiple alignment media. The inter-nuclear vector tensors are optimized simultaneously and variation in is minimized.
Both the MFA and DIDC methods have been incorporated into iterative schemes with the goal of improving the accuracy of the alignment tensor calculation by reducing the effects of the structural noise, termed the self-consistent RDC based MFA (SCRM) and iterative DIDC (Lakomek et al. 2008; Yao et al. 2008). The iterative schemes achieve this by using the refined dynamically averaged coordinates as input for additional runs of either MFA or DIDC; however, each approach, as implemented, relies on computationally expensive procedures. In the iterative DIDC method, a grid search is performed which minimizes the difference between the vector’s coordinates and a pool of possible solutions built from an exhaustive list of (θ, ϕ) combinations to find dynamically averaged coordinates (Yao et al. 2008). As for SCRM, the dynamic average orientation of each vector is calculated by performing a coordinate transformation with maximization of (Lakomek et al. 2008). Recently, it has been shown that this transformation can be replaced with the diagonalization of the local ordering Saupe matrix (Meirovitch et al. 2012), which is computationally less demanding.
Here, we describe a new iterative procedure for extracting structural and dynamic information from RDCs entitled Optimized RDC-based Iterative and Unified Model-free analysis (ORIUM). ORIUM unifies the theoretical concepts developed in the MFA, SCRM, and DIDC methods. In addition, a new method is presented to establish a lower bound on protein motion using RDC data alone without requiring a separate determination of Lipari–Szabo order parameters. While this has been achieved previously (Salmon et al. 2009) based on several sets of RDCs assuming Gaussian fluctuations, the method introduced here works on a single set of RDCs and does not require a motional model. The applicability of ORIUM and the new scaling procedure is tested with the model proteins ubiquitin and the third immunoglobulin domain of protein G (GB3).
Theory
Optimized RDC-based Iterative and Unified Model-free analysis (ORIUM) consists of three principal stages for the extraction of RDC order parameters from data measured in multiple alignment media (see Fig. 2 for schematic diagram). First, the matrix formalism introduced by Tolman in the DIDC approach is utilized to calculate refined structural coordinates from the alignment tensors (Tolman 2002). From here, each refined vector is put into a local axis system in order to determine the vector specific structural and dynamic information (Meirovitch et al. 2012; Meiler et al. 2001; Peti et al. 2002). Finally, the resulting Euler angles are used as structural input to restart the calculation in an iterative fashion, similar to SCRM (Lakomek et al. 2008). ORIUM continues until the variation in for the entire dataset falls below a certain threshold.
Alignment tensor calculation
For two nuclear spins, the observed resonance splitting (Hz) resulting from the partial alignment of a protein emanates from the secular part of the magnetic dipole interaction
$$D_k = D_k^{\max}\left\langle \frac{3\cos^{2}\theta_k - 1}{2} \right\rangle \qquad (1)$$
$$D_k^{\max} = -\frac{\mu_0\,\gamma_i\,\gamma_j\,\hbar}{4\pi^{2}\,r_{ij}^{3}} \qquad (1a)$$
where $\mu_0$ is the permeability of vacuum, $\gamma_X$ is the gyromagnetic ratio of spin X, $\hbar$ is the reduced Planck constant, $r_{ij}$ is the distance between nuclei i and j (assumed to be fixed at 1.02 Å for the N–HN and 1.095 Å for the Cα–Hα vectors), and $\theta_k$ is the angle between the inter-nuclear vector formed by nuclear spin pair k and the magnetic field ($B_0$). The angular brackets denote ensemble averaging. As Eq. (1) explicitly illustrates, the magnitude of $D_k$ depends on $\langle\cos^{2}\theta_k\rangle$. By definition, the term $\cos\theta_k$ is the scalar product between an inter-nuclear vector and the vector parallel to $B_0$.
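As a numerical illustration of the static prefactor, the following minimal sketch evaluates the dipolar splitting constant for the two fixed bond lengths quoted above. It assumes the common convention D_max = −μ0 γi γj ħ / (4π² r³) in Hz; the function names `d_max_hz` and `rdc_hz` and the tabulated gyromagnetic ratios are illustrative choices, not taken from the original work, and sign conventions differ between papers.

```python
import math

MU_0 = 4.0e-7 * math.pi          # vacuum permeability, T m / A
HBAR = 1.054571817e-34           # reduced Planck constant, J s
GAMMA = {                        # gyromagnetic ratios, rad s^-1 T^-1
    "1H": 2.6752218744e8,
    "13C": 6.728284e7,
    "15N": -2.71261804e7,
}

def d_max_hz(spin_i, spin_j, r_angstrom):
    """Static dipolar splitting prefactor in Hz (assumed convention)."""
    r = r_angstrom * 1.0e-10     # convert Angstrom to metres
    return -MU_0 * GAMMA[spin_i] * GAMMA[spin_j] * HBAR / (4.0 * math.pi**2 * r**3)

def rdc_hz(spin_i, spin_j, r_angstrom, cos2_avg):
    """RDC in Hz for a given ensemble average <cos^2(theta)>."""
    return d_max_hz(spin_i, spin_j, r_angstrom) * (3.0 * cos2_avg - 1.0) / 2.0

d_nh = d_max_hz("15N", "1H", 1.02)     # N-HN at 1.02 Angstrom
d_cah = d_max_hz("13C", "1H", 1.095)   # Calpha-Halpha at 1.095 Angstrom
```

With these inputs the N–HN prefactor is on the order of 23 kHz, and an isotropic average, ⟨cos²θ⟩ = 1/3, gives a vanishing coupling, which is why partial alignment is needed to observe RDCs.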
When considering a rigid molecule, the coordinates of an inter-nuclear vector can be described within an arbitrary reference frame, termed the molecular frame (MF), and defined by three angles, β x, β y, and β z, between the vector and the respective MF axes. In a similar fashion, the vector parallel to B 0 can be expressed by three angles representing the instantaneous orientation of B 0 relative to the MF axes, α x, α y, and α z. Within the MF, can be recast as
2 |
where is the scalar product of two vectors representing the inter-nuclear orientations (B) and the B 0 orientations (A). Here, A is the alignment tensor and B is the inter-nuclear vector tensor. Both A and B contain five independent terms and are related to a 3 × 3 second rank Cartesian order tensor as follows (Saupe 1964, 1968; Snyder 1965)
3 |
where the orientation of B 0 in the MF is given by
3a |
and
4 |
where the orientation of the inter-nuclear vector in the MF is described by
4a |
The term δ mn represents the Kronecker delta function, l is the alignment condition, and m, n = x, y, z.
A matrix formalism is introduced to render analysis of the RDC data in a more intuitive manner (Tolman 2002). When K RDCs are measured under L alignments, then Eq. (2) becomes
$$\mathbf{D} = \mathbf{B}\,\mathbf{A} \qquad (5)$$
where D is a K × L matrix, B is a K × 5 matrix, and A is a 5 × L matrix. In Eq. (5), the term is included in . The rows of B are defined by Eq. (4) and the columns of A are given by Eq. (3). An inherent assumption in the present analysis is that inter-nuclear dynamics are uncorrelated with the alignment process; hence the averages of and are independent of each other. This assumption can be tested with the SECONDA analysis (Hus and Brüschweiler 2002; Hus et al. 2003). When the structure of the molecule is known and RDCs for at least five linearly independent inter-nuclear vectors are measured, the matrix B (input from the rigid structure or random structural coordinates) and the measured RDCs are used to calculate
$$\mathbf{A} = \mathbf{B}^{+}\,\mathbf{D} \qquad (6)$$
where B + is the pseudo-inverse of B. It should be noted that a single alignment tensor per alignment medium is necessary for the successful application of the following protocols. Intrinsically disordered proteins (see Bertoncini et al. 2005; Bernadó et al. 2005) and multiple domain proteins (see Bertini et al. 2004; Rodriguez-Castañeda et al. 2006) will have to be described by several alignment tensors per alignment medium and will not be amenable to the present analysis.
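The pseudo-inverse step can be sketched numerically as follows. This is a minimal illustration under stated assumptions: the five-element basis used for each row of B, [x² − z², y² − z², 2xy, 2xz, 2yz], is one common Cartesian choice and is not necessarily the basis of Eqs. (3) and (4); the function names and the synthetic, noise-free data are hypothetical.

```python
import numpy as np

def vector_basis(unit_vectors):
    """K x 5 matrix with rows [x^2 - z^2, y^2 - z^2, 2xy, 2xz, 2yz]."""
    x, y, z = np.asarray(unit_vectors, dtype=float).T
    return np.column_stack([x*x - z*z, y*y - z*z, 2*x*y, 2*x*z, 2*y*z])

def fit_alignment(B, D):
    """Least-squares alignment tensors, A = pinv(B) @ D, shape 5 x L."""
    return np.linalg.pinv(B) @ D

# Synthetic rigid test case: K = 40 vectors, L = 6 alignment conditions.
rng = np.random.default_rng(0)
v = rng.normal(size=(40, 3))
v /= np.linalg.norm(v, axis=1, keepdims=True)   # normalise to unit vectors
B = vector_basis(v)                             # K x 5 structural input
A_true = rng.normal(size=(5, 6))                # 5 x L, prefactor folded in
D = B @ A_true                                  # K x L noise-free couplings
A_fit = fit_alignment(B, D)
```

For noise-free data from a rigid structure the fitted tensors reproduce the input exactly; with experimental noise or structural error the same call returns the least-squares estimate.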
Each column of , given by Eq. (3), can be recast into L symmetric 3 × 3 second rank Cartesian order tensors, . These order matrices are then redefined in a principal axis system (PAS), termed the alignment frame (AF), where Eq. (1) becomes (Bax et al. 2001)
7 |
In Eq. (7), the magnitude of the alignment tensor is , the rhombicity is are the polar angles defining the inter-nuclear vector in the AF, and are the eigenvalues resulting from the diagonalization of . From the eigenvectors , the Euler angles describing the rotation of into the PAS are defined
8 |
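The transformation into the principal axis system can be illustrated with a short sketch. It assumes the five independent elements are stored as (Sxx, Syy, Sxy, Sxz, Syz), orders the eigenvalues as |Sxx| ≤ |Syy| ≤ |Szz|, and uses the illustrative definitions axial = (3/2)Szz and rhombicity = (Sxx − Syy)/axial; normalization conventions for the magnitude and rhombicity vary in the literature, and all function names here are hypothetical. The Euler angles of Eq. (8) would follow from the eigenvector matrix in whichever rotation convention is adopted.

```python
import numpy as np

def tensor_from_five(a):
    """Symmetric traceless 3x3 tensor from (Sxx, Syy, Sxy, Sxz, Syz)."""
    sxx, syy, sxy, sxz, syz = a
    return np.array([[sxx, sxy, sxz],
                     [sxy, syy, syz],
                     [sxz, syz, -sxx - syy]])

def principal_axis_system(S):
    """Eigenvalues ordered |Sxx| <= |Syy| <= |Szz| with eigenvectors as columns."""
    w, V = np.linalg.eigh(S)
    order = np.argsort(np.abs(w))
    w, V = w[order], V[:, order]
    if np.linalg.det(V) < 0:          # keep a right-handed axis system
        V[:, 0] = -V[:, 0]
    return w, V

def axial_and_rhombic(w):
    """Axial component and rhombicity (illustrative normalisation)."""
    s_xx, s_yy, s_zz = w
    axial = 1.5 * s_zz
    return axial, (s_xx - s_yy) / axial

S = tensor_from_five([1e-4, -1e-3, 2e-4, 0.0, 0.0])
w, V = principal_axis_system(S)
axial, rhombicity = axial_and_rhombic(w)
```

With this ordering the rhombicity is confined to the interval from 0 (axially symmetric) to 2/3 (fully rhombic).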
Model free analysis
With the MFA (Meiler et al. 2001; Peti et al. 2002), the five parameters describing each alignment tensor in the PAS, , are used to construct the F matrix which is needed to derive the five dynamically averaged second order spherical harmonics
9a |
9b |
9c |
Equation (7) can be recast in terms of dynamically averaged second order spherical harmonics
10 |
where
10a |
10b |
The F matrix relates the measured RDCs to the spherical harmonics defined in the MF by a Wigner rotation from the MF to the AF
11 |
with
12 |
In analogy to the component definition from Eq. (5), Y is a K × 5 matrix containing the dynamically averaged spherical harmonics in the MF and F is a 5 × L matrix containing the alignment tensor information. The matrix is determined in direct correspondence to Eq. (6)
13 |
Here, represents in order to normalize the contributions of each alignment condition to the calculation of refined structural coordinates. Each row of is used to determine
14 |
From the dynamically averaged spherical harmonics, the dynamically averaged orientations for each inter-nuclear vector, , can be obtained. Maximizing places the z axis of the vector’s axis system, termed the vector frame (VF), in the center of the inter-nuclear vector’s orientational distribution,
15 |
The terms vanish in the VF and possesses information on the amplitude of anisotropy, η k, and the orientation of anisotropic motions,
16 |
17 |
It should be noted that is the same in any frame, thus
18 |
which is equivalent to Eq. (14).
Standard tensorial analysis
Recalling Eq. (4), the following relationships are established in order to construct (Snyder 1965)
19a |
19b |
19c |
19d |
19e |
The resulting eigenvalues ( contain the dynamic information for each vector , while the eigenvectors, , encompass the bond orientations and the direction of the anisotropic local motion . The following equations detail how the dynamic parameters are calculated from . The Saupe order parameters are defined as
20a |
20b |
21 |
22 |
For each inter-nuclear vector, and are extracted from the transpose of the resulting matrix
23 |
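The tensorial analysis for a single inter-nuclear vector can be sketched as below. The sketch assumes the vector's five averaged elements are stored as ⟨x² − z²⟩, ⟨y² − z²⟩, 2⟨xy⟩, 2⟨xz⟩, 2⟨yz⟩ (an illustrative basis), that the order parameter is S² = (2/3) Σ λ², which equals Szz² + (Sxx − Syy)²/3 for a traceless tensor, and that η follows the MFA-style ratio of the rhombic contribution to the total. These definitions are stated here as assumptions and may differ in detail from Eqs. (20)–(23).

```python
import numpy as np

def saupe_from_moments(b):
    """Traceless 3x3 order tensor from <x^2-z^2>, <y^2-z^2>, 2<xy>, 2<xz>, 2<yz>."""
    b0, b1, b2, b3, b4 = b
    return np.array([
        [(2*b0 - b1) / 2, 0.75 * b2,       0.75 * b3],
        [0.75 * b2,       (2*b1 - b0) / 2, 0.75 * b4],
        [0.75 * b3,       0.75 * b4,       -(b0 + b1) / 2],
    ])

def vector_parameters(S):
    """Return S^2, eta, and polar angles (theta, phi) of the mean orientation."""
    w, V = np.linalg.eigh(S)            # eigenvalues in ascending order
    s2 = (2.0 / 3.0) * np.sum(w**2)     # equals Szz^2 + (Sxx - Syy)^2 / 3
    eta = np.sqrt(((w[1] - w[0])**2 / 3.0) / s2)
    z = V[:, 2]                         # axis of the largest eigenvalue
    if z[2] < 0:                        # eigenvector sign is arbitrary
        z = -z
    theta = np.arccos(np.clip(z[2], -1.0, 1.0))
    phi = np.arctan2(z[1], z[0])
    return s2, eta, theta, phi

# Rigid vector along (1, 1, 1)/sqrt(3): all <r_i r_j> equal 1/3.
rigid = vector_parameters(saupe_from_moments([0.0, 0.0, 2/3, 2/3, 2/3]))

# Two-site jump of +/- 20 degrees about z within the xz plane.
a = np.deg2rad(20.0)
jump = vector_parameters(saupe_from_moments(
    [np.sin(a)**2 - np.cos(a)**2, -np.cos(a)**2, 0.0, 0.0, 0.0]))
```

The rigid vector returns S² = 1 and η = 0, while the two-site jump yields a reduced order parameter and a non-zero η because the motion is confined to one plane.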
Direct interpretation of dipolar couplings
With DIDC, once is determined from Eq. (6), is used to directly calculate a new set of dynamically averaged coordinates, , without extracting each set of , according to
24 |
This formula leaves the information for in the MF. It should be noted that the previous implementations of DIDC did not scale the RDCs by as in the MFA (see Eq. 13), which is necessary to normalize the contributions of each alignment condition for the calculation of refined structural coordinates. Therefore, we have modified Eq. (24) as follows
25 |
where and represent the RDCs and alignment tensors divided by .
As described by Tolman, the first term in Eqs. (24) and (25) encompasses the contribution of the measured RDCs to determining (Tolman 2002). When the rank of is smaller than 5, then the second term accounts for the degeneracy in the possible solutions that results from B. Otherwise, this term will equal zero for data representing more than five alignment media. With , the 3 × 3 second rank Cartesian tensor, , for each inter-nuclear vector is constructed, diagonalized into the VF, and Eqs. (21), (22), and (23) calculate each set of .
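The two-term structure described above can be sketched as follows, assuming the form B = D A⁺ + B₀(1 − A A⁺), where B₀ denotes the structural input. The per-medium normalization uses the column norm of A as a stand-in for the alignment magnitude, which is an assumption made only for illustration; the function name `didc_vectors` is hypothetical.

```python
import numpy as np

def didc_vectors(D, A, B0=None, normalize=True):
    """Averaged vector tensors: B = D A^+ + B0 (1 - A A^+)."""
    D = np.asarray(D, dtype=float)
    A = np.asarray(A, dtype=float)
    if normalize:
        scale = np.linalg.norm(A, axis=0)   # one factor per medium (assumed measure)
        D, A = D / scale, A / scale
    A_pinv = np.linalg.pinv(A)              # L x 5
    B = D @ A_pinv                          # contribution of the measured RDCs
    if B0 is not None:                      # non-zero only when rank(A) < 5
        B = B + np.asarray(B0, dtype=float) @ (np.eye(5) - A @ A_pinv)
    return B

rng = np.random.default_rng(1)
B_true = rng.normal(size=(30, 5))

A_full = rng.normal(size=(5, 8))            # eight media, rank 5
B_full = didc_vectors(B_true @ A_full, A_full)

A_def = rng.normal(size=(5, 3))             # three media, rank 3
B_def = didc_vectors(B_true @ A_def, A_def, B0=B_true)
```

When the alignment tensors span all five dimensions the second term vanishes and the RDCs alone determine B; with fewer independent media the unsampled directions are filled in from the structural input.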
ORIUM procedure
Optimized RDC-based Iterative and Unified Model-free analysis (ORIUM) is an iterative approach (see Fig. 2) and is related to but different from the SCRM and iterative DIDC approaches as discussed in this section. The scheme is summarized as follows. First, alignment tensors, , are calculated with Eq. (6) and are used to determine . A comparison of the effects of scaling the RDCs with in the determination of will be presented in the Applications section below [see Eqs. (24) and (25)]. Based on , the 3 × 3 symmetric Saupe matrix is constructed for the inter-nuclear vectors using expressions defined in Eqs. (19a)–(19e), and is put into the local PAS. Utilizing Eqs. (21), (22), and (23), each set of is extracted. These refined angles (θ MFk, ϕ MFk) are used as input for the next cycle of ORIUM. The cycle is finished when the convergence of order parameter is achieved using the relationship
26 |
where r is a cycle of iteration.
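The iterative cycle can be summarized in a compact sketch. It is a simplified illustration, not the published implementation: it uses an illustrative five-element Cartesian basis, normalizes each medium by the column norm of A (an assumption), takes the axis of the largest eigenvalue as the refined orientation, stops when the largest change in S² between cycles falls below a tolerance (the exact form of Eq. (26) is not reproduced), and omits the removal of highly mobile residues.

```python
import numpy as np

def basis(v):
    """K x 5 rows [x^2 - z^2, y^2 - z^2, 2xy, 2xz, 2yz] from unit vectors."""
    x, y, z = np.asarray(v, dtype=float).T
    return np.column_stack([x*x - z*z, y*y - z*z, 2*x*y, 2*x*z, 2*y*z])

def saupe(b):
    """Traceless 3x3 order tensor from one row of the averaged B matrix."""
    b0, b1, b2, b3, b4 = b
    return np.array([[(2*b0 - b1) / 2, 0.75*b2, 0.75*b3],
                     [0.75*b2, (2*b1 - b0) / 2, 0.75*b4],
                     [0.75*b3, 0.75*b4, -(b0 + b1) / 2]])

def orium_sketch(D, v0, tol=1e-6, max_cycles=200):
    """Iterate tensor fit -> averaged vectors -> diagonalization."""
    v = np.asarray(v0, dtype=float)
    s2_old = None
    for cycle in range(1, max_cycles + 1):
        A = np.linalg.pinv(basis(v)) @ D                  # alignment tensors
        scale = np.linalg.norm(A, axis=0)                 # per-medium normalization
        B_avg = (D / scale) @ np.linalg.pinv(A / scale)   # averaged vector tensors
        s2 = np.empty(len(v))
        v_new = np.empty_like(v)
        for k, b in enumerate(B_avg):
            w, V = np.linalg.eigh(saupe(b))
            s2[k] = (2.0 / 3.0) * np.sum(w**2)            # relative order parameter
            v_new[k] = V[:, -1]                           # refined mean orientation
        v = v_new
        if s2_old is not None and np.max(np.abs(s2 - s2_old)) < tol:
            break
        s2_old = s2
    return v, s2, A, cycle

# Rigid synthetic data: the iteration should return S^2 = 1 for every vector.
rng = np.random.default_rng(2)
v_true = rng.normal(size=(30, 3))
v_true /= np.linalg.norm(v_true, axis=1, keepdims=True)
A_true = rng.normal(size=(5, 8))
D = basis(v_true) @ A_true
v_fit, s2_fit, A_fit, n_cycles = orium_sketch(D, v_true)
```

Because the basis is quadratic in the coordinates, the arbitrary sign of each eigenvector does not affect the next cycle.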
The ORIUM approach differs from the SCRM method as follows. A minor difference is that, with SCRM, the inter-nuclear vector coordinates are defined in terms of spherical harmonics, while ORIUM utilizes Cartesian coordinates. The relationship between the spherical harmonics and the Cartesian coordinates is given by Eqs. (19a)–(19e). A key difference is that SCRM requires the five alignment tensor parameters to construct the F matrix for the determination of . DIDC and ORIUM calculate directly from . Finally, SCRM maximizes , whereas ORIUM places into a local axis system by diagonalization of the symmetric 3 × 3 second rank Cartesian tensor.
There are three key differences between ORIUM and the iterative DIDC approach. First, a grid search is implemented with the iterative DIDC scheme which minimizes the difference between the vector’s coordinates obtained from and an exhaustive list of combinations to find dynamically averaged coordinates. As stated above, ORIUM diagonalizes into a local axis system in order to extract this information. The second key difference is that with the iterative DIDC scheme each inter-nuclear vector is constrained to be rigid . Only during the final iterative run is the constraint removed. ORIUM never constrains the dynamics of the inter-nuclear vectors during the iterative procedure. A final divergence between the two procedures is how flexible inter-nuclear vectors are removed from the calculation of the alignment tensors. In the iterative DIDC procedure, RDC data for an individual inter-nuclear vector is removed from the calculation of the alignment tensors if the error in the experimental and back-calculated RDCs is greater than a factor of 2. The calculation is restarted and RDC data for the next inter-nuclear vector is once again removed from the calculation if the deviation is greater than a factor of 2. This procedure is repeated until all inter-nuclear vectors fulfill the threshold for the error in experimental and back-calculated RDCs. At this point, the constraint is removed and a final iteration is performed. ORIUM removes the most flexible residues , after Eq. (26) has been satisfied (see below) and then ORIUM is restarted until Eq. (26) is once again fulfilled.
As with the RDC-based model free analysis, the fundamental assumption is that the internal protein dynamics of each inter-nuclear vector are uncorrelated with fluctuations of the alignment tensor. Thus, a single average alignment tensor can be utilized for each medium. Molecular dynamics simulations indicate that this assumption is true for secondary structural elements; however, alignment and dynamics may be correlated for the most mobile regions of a protein (Louhivuori et al. 2006; Salvatella et al. 2008). To circumvent this potential inseparability of mobile inter-nuclear vectors and the alignment tensor fluctuations, the approach outlined in the SCRM procedure is followed (Lakomek et al. 2008). After convergence is achieved with Eq. (26), the residues that are the most mobile, as determined by fulfilling the relationship , are removed from the calculation and ORIUM is restarted with from the previous iteration until Eq. (26) is once again satisfied.
The validity of ORIUM was assessed with synthetic RDC data containing a measurement error (0.3 Hz) for the 36 alignment media, which were generated using the RDC refined ubiquitin ensemble ERNST (PDB:2KOX) (Fenwick et al. 2011). The corresponding dynamic parameters ( and η) were also calculated using the same ensemble. ORIUM was run on these synthetic RDC data, and the resulting dynamic parameters were compared with those calculated from the ensemble. The Pearson correlations of and η are 0.97 and 0.93, respectively.
It should be noted that the local PAS differs from the VF when is a negative value, although the local PAS is usually the VF. In this case, the averaged vector orientation is actually orthogonal to the z axis of the PAS. This issue can be alleviated by choosing a new axis system referred to as the vector frame system (VFS), with eigenvalues ordered instead of . It should also be noted that from the VFS and the PAS can be significantly different in the case that has a negative value. ORIUM utilizes the VFS after removal of residues with to obtain dynamically averaged angles of the bond vector distribution.
Determination of scaling factor:
An inherent complication when calculating from experimental RDCs is that dynamic averaging will reduce the actual magnitude of or . This reduced magnitude will result in some parameters over 1, and thus can only be considered as relative gauge of the actual amplitudes of motion, defined as (Lakomek et al. 2006; Meiler et al. 2001). It should be noted that the alignment tensor parameters are unaffected by the reduction in the magnitude of or . Two avenues to circumvent this complication have been developed. Either all the order parameters are scaled relative to the largest leaving one order parameter equal to one (iterative DIDC approach) (Tolman 2002; Yao et al. 2008), or is scaled relative to the Lipari-Szabo order parameters () calculated for each residue (MFA/SCRM approach) (Lakomek et al. 2006, 2008). The problem with the first approach is that the resulting parameters will underestimate the amplitude of motion for each inter-nuclear vector. Overestimation can only occur if the largest parameter has a large experimental error, leading to an artificially greater value for this parameter than in reality. Sub- and supra-τc motion happening for each vector equally will not be picked up by this approach, underestimating the motion except for the mentioned case. As for the second approach, are required which may not be available for the vectors being analyzed. While this approach has been successfully applied, it may also underestimate motion since a general supra-τc motion affecting all the nuclei will not be picked up by this approach. Comparison of the derived in Lakomek et al. 2008 and Lange et al. 2008 with the average order parameter from solid state data (Schanda et al. 2010) shows that the solid state NMR derived average order parameter is smaller than the one derived by this second approach suggesting that supra-τc motion affecting all nuclei is seen by solid state NMR but not the versus approach.
Here, we present a new method for determining without the requirement of additional information, such as Lipari–Szabo order parameters, which may not be available for the inter-nuclear vectors under investigation. The scaling procedure separates an inter-nuclear vector’s motion into its principal axes in Cartesian space and leads to parameters that have a more straightforward physical interpretation. The inter-nuclear vector’s motional variance is directly related to the resulting eigenvalues calculated from diagonalization of into a local axis system. The methodology outlined below exploits the fact that variance cannot be negative by definition. Therefore, a uniform scaling parameter, , is necessary to ensure that the variance for each inter-nuclear vector about each of the three principal axes is positive. In the following, we present a brief outline for the derivation of bond vector motional variance for the determination of .
For each vector, the relationships between the dynamically averaged eigenvalues and the unit vector coordinates (x, y, z) within the VF, as shown in Eq. (4a), are as follows
27 |
The normalization condition sets , which also implies . Therefore, can be recast as
28 |
Utilizing the definitions of and η [Eqs. (21), and (22)], we can now reformulate and η in terms of the Cartesian coordinates defined within the VF
29 |
30 |
The definition of variance is , where k = x, y. Therefore, can be substituted for . Now, and η are defined in terms of variance
31 |
32 |
Solving the system of equations gives the inverse relationships
33 |
34 |
A graphical depiction of the mapping between these parameters is shown in Figure S1. Using the relationship (), these equations can be written as
35 |
36 |
Since the variance must always be positive, the variance about the axis with the least variance must also be positive. Thus, the following inequalities are derived relating and η to
37 |
38 |
Using Eq. (38), residue-specific can be obtained using and η, or from the lowest eigenvalue. The eigenvalue definition of follows directly from Eq. (27).
Since the reduction of magnitude in the alignment due to dynamic averaging is a global effect throughout all residues, the least residue-specific may be utilized as the scaling factor, if there is no experimental error. The previous method in which all order parameters are scaled relative to the largest , leaving one order parameter equal to one (iterative DIDC approach) (Tolman 2002; Yao et al. 2008) is related to this new approach. If bond vector anisotropy is assumed to be axially symmetric (η = 0), turns into . This is identical to scaling all inter-nuclear vectors such that the largest is 1 (Figure S1).
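The positivity argument can be illustrated numerically. The sketch below assumes that the true eigenvalues equal the relative (unscaled) eigenvalues multiplied by a uniform factor s, and that each eigenvalue is related to the second moment along its principal axis by λ = (3⟨k²⟩ − 1)/2. Requiring ⟨k²⟩ ≥ 0 for the most negative eigenvalue then gives the residue-specific bound s ≤ −1/(2 λ_min). The function names are hypothetical and the test tensors are invented for illustration.

```python
import numpy as np

def residue_scaling_bound(S_rel):
    """Largest uniform factor that keeps all three axis second moments >= 0."""
    lam_min = np.linalg.eigvalsh(S_rel)[0]     # most negative eigenvalue
    if lam_min >= 0.0:                         # only possible for a zero tensor
        return np.inf
    return -1.0 / (2.0 * lam_min)

def overall_scaling(S_rel_list):
    """Scaling factor for the whole data set: the smallest residue bound."""
    return min(residue_scaling_bound(S) for S in S_rel_list)

def axis_second_moments(S_rel, s):
    """<k^2> along the three principal axes after scaling by s (ascending)."""
    lam = np.linalg.eigvalsh(S_rel) * s
    return (2.0 * lam + 1.0) / 3.0

# Two axially symmetric relative tensors with apparent Szz of 1.1 and 0.9.
rigid_axial = np.diag([-0.5, -0.5, 1.0])
tensors = [1.1 * rigid_axial, 0.9 * rigid_axial]
s_overall = overall_scaling(tensors)
moments = axis_second_moments(tensors[0], s_overall)
```

For axially symmetric tensors the bound reduces to the reciprocal of the largest relative order parameter, in line with the relation to the earlier scaling approach noted above; a non-zero η makes the bound tighter.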
This scaling approach using the lowest residue-specific may introduce a systematic bias due to the fact that experimental data contain errors. In order to alleviate the systematic bias, we used a statistical procedure accounting for the effect of experimental noise on without any knowledge of unlike the SCRM approach. First, scaling factors were calculated from the original data as well as datasets with noise added equivalent to the experimental error. These scaling factors were used to determine a value (which we term ), below which there was a 95 % chance that the true would fall. Given the maximum scaling factor from the original data that fulfills the constraint equation for all inter-nuclear vectors, , and corresponding set of values from noise added data (NAD), , the value can be calculated as follows:
39 |
where the quantile function returns the given quantile of the set. The quantile prefactor compensates for systematic shifts resulting from the addition of experimental error. With the previous study (Lakomek et al. 2008), the determination of was conservative in order to circumvent the chance for over-estimating the supra-τc motion, reflected in the reported . Here, the criterion for scaling is that should be positive, which possesses no time-scale bias. Yet, it should be noted that this overall order parameter is an upper limit for since it could underestimate motion if there is a uniform sub- or supra-τc motion affecting all vectors. This is summarized in Table 1.
Table 1.
Scaling | Largest order parameter set to one (iterative DIDC) | Lipari–Szabo order parameters (SCRM) | Non-negative variance (ORIUM) | Solid state | 3D GAF cross validation |
---|---|---|---|---|---|
Advantage | No other data required | Sub-τc motion included | No other data required | All motion is reflected in the order parameters | No other data required |
Disadvantage | Motion in the most rigid inter-nuclear vector leads to underestimation of the amplitude of motion | Homogeneous supra-τc motion leads to underestimation of the amplitude of motion | Homogeneous sub- and supra-τc motion leads to underestimation of the amplitude of motion | Different sample than solution | Gaussian fluctuations assumed, several sets of RDCs required |
Reference | Yao et al. (2008) | Lakomek et al. (2008) | This work | Schanda et al. (2010) | Salmon et al. (2009) |
Applications
Ubiquitin: comparison of ORIUM with SCRM
In order to compare ORIUM with the SCRM method, N–HN RDCs were used from measurements performed in 36 different alignment media (D36M) for the 76-residue protein ubiquitin (see Lakomek et al. 2008 for RDCs and references therein). The X-ray structure 1UBI of ubiquitin was used as the input structure for the first cycle of ORIUM (Ramage et al. 1994). For the error estimation of each extracted set of , 1,000 Monte Carlo simulations were performed by adding uncertainty to the RDCs drawn from a Gaussian distribution with a standard deviation given by the error in the RDC set (0.3 Hz). On a single core of an Intel Core i7-2635QM CPU, the 1,000 Monte Carlo simulations required 18 min for ORIUM versus 83 min for SCRM, where the convergence criterion was set as in the SCRM implementation (Lakomek et al. 2008). When we used a 100-fold stricter convergence criterion than SCRM (see Eq. 26), the calculation was still faster on the same CPU (74 min). Thus, ORIUM is an optimized approach for the better convergence of the dynamic parameters.
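The Monte Carlo error estimation described above can be sketched generically. In this illustration `analyse` stands for any function that maps an RDC matrix to the parameters of interest; it is a hypothetical placeholder, here replaced by a simple per-vector mean so that the example is self-contained, and does not represent the ORIUM analysis itself.

```python
import numpy as np

def monte_carlo_errors(D, sigma_hz, analyse, n_trials=1000, seed=0):
    """Mean and standard deviation of analyse(D + noise) over noisy replicas."""
    rng = np.random.default_rng(seed)
    D = np.asarray(D, dtype=float)
    results = np.array([
        analyse(D + rng.normal(0.0, sigma_hz, size=D.shape))  # Gaussian RDC noise
        for _ in range(n_trials)
    ])
    return results.mean(axis=0), results.std(axis=0, ddof=1)

# Placeholder analysis: average coupling of each vector over 36 media.
D_demo = np.zeros((10, 36))
mean_demo, err_demo = monte_carlo_errors(
    D_demo, sigma_hz=0.3, analyse=lambda d: d.mean(axis=1))
```

For this placeholder the propagated uncertainty approaches 0.3/√36 = 0.05 Hz, which provides a quick sanity check before substituting a full analysis function.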
A comparison of the D36M RDC data set analyzed with ORIUM and the SCRM method (re-implemented in this study) is presented in Fig. 3, which shows and η determined for each residue (see Table S1 for the actual values calculated from the ORIUM analysis of the D36M data set). The correlation coefficients for the and η parameters are both 0.99, which shows that in principle both the iterative DIDC and SCRM should yield identical results. However, in the previous implementations of DIDC (Tolman 2002; Yao et al. 2008), the effect of the alignment magnitude on the angle calculation, as shown in Eqs. (24) and (25), was not recognized, leading to variations in the and η values (Figure S2).
To determine whether normalization by alignment strength produces more accurate results, we compared Q factors for each alignment condition (Figure S3). For both the standard fitting procedure and a cross validation procedure, normalization produced significantly better Q factors (with respective p values of 0.022 and 0.00033). In the unnormalized case without cross validation, stronger alignment conditions showed lower Q factors than weak conditions, indicating that they were contributing disproportionately to the fit. This lack of proportionality is also evident in an examination of the alignment tensors themselves. To estimate the degree to which the five dimensional alignment tensor space is uniformly sampled, condition numbers can be used, where a lower value indicates more uniform sampling (Peti et al. 2002). We checked the condition numbers of and for ORIUM implemented with Eqs. (24) and (25), respectively. Figure S4 plots the condition number versus ORIUM iteration until Eq. (26) is satisfied. For the unnormalized alignment tensors ( using Eq. 24), the condition number finishes at 8.36, while for the normalized alignment tensors ( using Eq. 25) the condition number is significantly lower finishing at 6.19. This shows that normalization of the RDCs based on is important to adjust the contributions of strong versus weak alignments in the calculation of inter-nuclear vector orientations and dynamics. It should also be mentioned that the different values of the condition numbers do not directly indicate the reliability of the dynamics but rather the degree to which alignment space is uniformly sampled.
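Both diagnostics used in this comparison are simple to compute. The sketch below assumes the condition number is the ratio of the largest to the smallest singular value of the 5 × L alignment tensor matrix, and that the Q factor is the root-mean-square deviation between experimental and back-calculated RDCs divided by the root-mean-square of the experimental RDCs; the exact Q definition used for Figure S3 is not stated in the text, so this form is an assumption.

```python
import numpy as np

def condition_number(A):
    """Ratio of largest to smallest singular value of the alignment matrix."""
    s = np.linalg.svd(np.asarray(A, dtype=float), compute_uv=False)
    return s[0] / s[-1]

def q_factor(d_exp, d_calc):
    """rms(d_exp - d_calc) / rms(d_exp) for one alignment condition."""
    d_exp = np.asarray(d_exp, dtype=float)
    d_calc = np.asarray(d_calc, dtype=float)
    return np.sqrt(np.sum((d_exp - d_calc)**2) / np.sum(d_exp**2))

# Five orthogonal tensors of equal strength sample alignment space uniformly.
uniform = np.eye(5)
# One medium ten times stronger than the others dominates the fit.
unbalanced = np.diag([1.0, 1.0, 1.0, 1.0, 10.0])
cond_uniform = condition_number(uniform)
cond_unbalanced = condition_number(unbalanced)
```

The toy example shows why normalizing by alignment strength lowers the condition number: rescaling the strong medium to unit magnitude restores uniform sampling without changing the tensor orientations.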
The slight deviation in between ORIUM and SCRM (Fig. 3b) originates from the , which is 0.87 from the ORIUM approach compared with the reported value of 0.89 from the SCRM report using to calculate . Remarkably, the present approach for determination yields scaled parameters that are below or within error of the values without utilizing in the calculation of . Due to the slightly lower with ORIUM, the average for ORIUM and SCRM is 0.69 and 0.72, respectively.
In addition to starting with the 1UBI structure, the ORIUM calculation was also tested with random coil input structural coordinates, and the results are identical to those obtained starting with 1UBI. This worked for ORIUM, and not for other procedures like SCRM, because at the beginning of the iteration a sizable fraction of the residues used for alignment tensor calculation had their largest eigenvalue with a negative sign. For these residues, the angle used for tensor calculation was then orthogonal to the mean angle. By the time the iteration was completed, no residues had negative maximum eigenvalues. This result shows the potential utility of ORIUM for calculating structural parameters from random structural input and perhaps for refining conformational ensembles. Both ideas are currently under investigation.
It is also interesting to compare the results from ORIUM, specifically in regard to the calculated of 0.87, to a recent study examining the dynamics of ubiquitin in the microcrystalline state (Schanda et al. 2010). Here, the scaling of the solid-state order parameters () is unnecessary since the protein is not tumbling and therefore the calculated order parameters should reflect the absolute magnitude of the amplitudes of motion for each inter-nuclear vector. The time-scale of motion embodied in spans up to about one digit microsecond (Chevelkov et al. 2010), whereas encompasses motion up to about one millisecond. In principle, is expected to be higher than due to the time-scale of motion embodied by the two order parameters, assuming that both the conditions for ubiquitin in microcrystalline and in solution are identical. The previously reported values using of 0.89 (Lakomek et al. 2008) are on average higher than the order parameters reported for the microcrystalline state (Schanda et al. 2010). The average order parameter values of and for residues 2–70 are 0.75 and 0.72, respectively. This may be due to the fact that the determination of was done to circumvent the chance for over-estimating the supra-τc motion and thus the was too conservative as described earlier. Figure S5 shows that with the approach presented here for the calculation of without any time-scale bias most of the residues possess that are comparable or within the error of .
GB3: comparison of ORIUM with iterative DIDC
We compared ORIUM with the iterative DIDC method. N–HN and Cα–Hα RDCs were used from measurements performed on 6 mutants of the third immunoglobulin domain of protein G (GB3) aligned with Pf1 phages (Yao et al. 2008). The reported errors in the N–HN and Cα–Hα RDCs were 0.2 and 0.4 Hz, respectively. The NMR structure 2OED of GB3 was used as the input structure for the first cycle of ORIUM (Ulmer et al. 2003). Based on the SECONDA analysis of the RDC data described in the iterative DIDC publication (see Fig. 4 in Yao et al. 2008), the following RDCs were removed from the entire analysis due to inconsistencies in the structure and dynamics of these inter-nuclear vectors over the 6 different alignment media: residues 19 and 41 for the N–HN RDCs and residues 11, 25, 30, and 40 for the Cα–Hα RDCs. Error estimation followed the same protocol used for ubiquitin.
The ORIUM calculation was performed with only the N–HN RDCs or only the Cα–Hα RDCs, using the actual errors in the RDC data of 0.2 and 0.4 Hz, respectively. The results are illustrated in Figs. 4 and 5 and compiled in Tables S2 and S3. For the N–HN RDCs, the correlation with iterative DIDC is 0.91 for the order parameters and 0.92 for η. For the Cα–Hα RDCs, the correlation coefficients for the order parameters and η are 0.85 and 0.77, respectively. Here, the overall scaling factor calculated from the ORIUM approach is 0.83 for both RDC types. This scaling leads to an average N–HN order parameter of 0.65 and an average Cα–Hα order parameter of 0.66. Thus, the values are on average 22 % lower than in the iterative DIDC publication. As shown in Figs. 4e and 5e, iterative DIDC has significantly more negative variances than ORIUM. This is primarily because the constraint used for ORIUM is more restrictive than the constraint used for iterative DIDC (Figure S1), making the ORIUM values lower than those from iterative DIDC.
In principle, both ORIUM and the iterative DIDC should give identical results, as described earlier. The discrepancy reflected in the correlations may originate from the removal of the bias in the calculated structural and dynamic parameters due to the magnitude of alignment media [see Eqs. (24) and (25)]. To check this possibility, we ran ORIUM on the N–HN RDCs and the Cα–Hα RDCs simultaneously using Eq. (24) instead of Eq. (25). For these calculations, uncertainties of 0.3 and 0.6 Hz were used for the MC analysis, as done in the iterative DIDC method, and the Cα–Hα RDCs were scaled by a factor of 1/2.08 (Bax et al. 2001) (see Figures S6 and S7). Because of the small number of alignment conditions for GB3, it is not possible to determine significant differences in a Q factor analysis with and without normalization. However, we checked the condition numbers (Peti et al. 2002) for ORIUM implemented with either Eq. (24) or Eq. (25) (see Figure S8). For the unnormalized alignment tensors, the condition number finishes at 7.84, while for the normalized alignment tensors the condition numbers are again lower, finishing at an average of 7.15. For the N–HN RDCs, the correlation is 0.94 for the order parameters and 0.96 for η. For the Cα–Hα RDCs, the correlation coefficients for the order parameters and η are 0.88 and 0.91, respectively. Although the correlations are improved, the remaining discrepancies may result from differences in the implementation of ORIUM versus the iterative DIDC. The iterative DIDC method utilizes a grid search for each inter-nuclear vector. During the grid search, each vector is constrained to be rigid until the final iterative run, when the constraints are lifted and the dynamic parameters are calculated. As in the case of normalized ORIUM, unnormalized ORIUM shows lower values than DIDC, which results from the less restrictive DIDC constraint allowing more negative variances (Figures S6 and S7).
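The two bookkeeping steps mentioned above, scaling the Cα–Hα RDCs by 1/2.08 before combining them with the N–HN RDCs and comparing fits through a Q factor, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, and the Q factor is taken as the rms residual divided by the rms of the observed couplings, which is one common definition and may differ from the one used for Figure S3.

```python
import numpy as np

# Scaling factor quoted in the text (Bax et al. 2001).
CA_HA_SCALE = 1.0 / 2.08


def scale_ca_ha(rdcs_hz):
    """Scale Calpha-Halpha RDCs (Hz) so they can be combined with N-HN RDCs."""
    return np.asarray(rdcs_hz, dtype=float) * CA_HA_SCALE


def q_factor(observed, calculated):
    """rms(observed - calculated) / rms(observed); an assumed, common definition."""
    observed = np.asarray(observed, dtype=float)
    calculated = np.asarray(calculated, dtype=float)
    residual = observed - calculated
    return np.sqrt(np.mean(residual ** 2) / np.mean(observed ** 2))


# Hypothetical couplings in Hz, for illustration only.
ca_ha_observed = np.array([20.8, -10.4, 4.16])
print(scale_ca_ha(ca_ha_observed))

# The same factor applies to the uncertainties: 0.6 Hz becomes about 0.29 Hz.
print(0.6 * CA_HA_SCALE)
```

After scaling, the 0.6 Hz uncertainty of the Cα–Hα RDCs is close to the 0.3 Hz used for the N–HN RDCs, so both RDC types enter a joint analysis with comparable weight.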
Conclusions
With ORIUM, we present a computationally efficient method for extracting structural and dynamic information for inter-nuclear vectors from RDCs, unifying previously published concepts into one compact protocol. Furthermore, we demonstrate a new scheme for scaling the derived parameters based on variances of a single type of RDC, without needing Lipari–Szabo order parameters as a constraint that constitutes an upper limit of the RDC-derived order parameters. Dynamics occurring on time-scales slower than the rotational correlation time of proteins, encoded in RDCs, have important implications for protein functionality, including enzyme catalysis, molecular recognition, and correlated motions. The concepts set forth in this paper will go far in streamlining the procedure for calculating the dynamic average orientation and associated amplitudes of motion for inter-nuclear vectors in an iterative manner, as long as RDC data sets are acquired in at least five independent alignment media.
Acknowledgments
This paper is dedicated to Richard R. Ernst on the occasion of his 80th birthday. We thank Beat Vögeli and Bert L. de Groot for useful discussions. This work has been supported by the Max-Planck Society and the EU (ERC grant agreement number 233227). T.M.S and C.A.S. were supported by an Alexander von Humboldt Foundation postdoctoral research fellowship.
Footnotes
T. Michael Sabo and Colin A. Smith have contributed equally to this work.
Contributor Information
Donghan Lee, Phone: +49-551-2012201, Phone: +49-551-2012251, FAX: +49-551-2012202, Email: dole@nmr.mpibpc.mpg.de.
Christian Griesinger, Phone: +49-551-2012201, Phone: +49-551-2012251, FAX: +49-551-2012202, Email: cigr@nmr.mpibpc.mpg.de.
References
- Ban D, Funk M, Gulich R, Egger D, Sabo TM, Walter KFA, Fenwick RB, Giller K, Pichierri F, de Groot BL, Lange OF, Grubmüller H, Salvatella X, Wolf M, Loidl A, Kree R, Becker S, Lakomek NA, Lee D, Lunkenheimer P, Griesinger C. Kinetics of conformational sampling in ubiquitin. Angew Chem Int Edit. 2011;50:11437–11440. doi: 10.1002/anie.201105086.
- Ban D, Gossert AD, Giller K, Becker S, Griesinger C, Lee D. Exceeding the limit of dynamics studies on biomolecules using high spin-lock field strengths with a cryogenically cooled probehead. J Magn Reson. 2012;221:1–4. doi: 10.1016/j.jmr.2012.05.005.
- Bax A, Kontaxis G, Tjandra N. Dipolar couplings in macromolecular structure determination. Method Enzymol. 2001;339:127–174. doi: 10.1016/S0076-6879(01)39313-8.
- Bernadó P, Blanchard L, Timmins P, Marion D, Ruigrok RWH, Blackledge M. A structural model for unfolded proteins from residual dipolar couplings and small-angle X-ray scattering. Proc Natl Acad Sci USA. 2005;102:17002–17007. doi: 10.1073/pnas.0506202102.
- Bertini I, Del Bianco C, Gelis I, Katsaros N, Luchinat C, Parigi G, Peana M, Provenzani A, Zoroddu MA. Experimentally exploring the conformational space sampled by domain reorientation in calmodulin. Proc Natl Acad Sci USA. 2004;101:6841–6846. doi: 10.1073/pnas.0308641101.
- Bertoncini CW, Jung YS, Fernandez CO, Hoyer W, Griesinger C, Jovin TM, Zweckstetter M. Release of long-range tertiary interactions potentiates aggregation of natively unstructured alpha-synuclein. Proc Natl Acad Sci USA. 2005;102:1430–1435. doi: 10.1073/pnas.0407146102.
- Chang S-L, Tjandra N. Temperature dependence of protein backbone motion from carbonyl 13C and amide 15N NMR relaxation. J Magn Reson. 2005;174:43–53. doi: 10.1016/j.jmr.2005.01.008.
- Chevelkov V, Xue Y, Linser R, Skrynnikov NR, Reif B. Comparison of solid-state dipolar couplings and solution relaxation data provides insight into protein backbone dynamics. J Am Chem Soc. 2010;132:5015–5017. doi: 10.1021/ja100645k.
- Fenwick RB, Esteban-Martín S, Richter B, Lee D, Walter KFA, Milovanovic D, Becker S, Lakomek NA, Griesinger C, Salvatella X. Weak long-range correlated motions in a surface patch of ubiquitin involved in molecular recognition. J Am Chem Soc. 2011;133:10336–10339. doi: 10.1021/ja200461n.
- Hall JB, Fushman D. Characterization of the overall and local dynamics of a protein with intermediate rotational anisotropy: differentiating between conformational exchange and anisotropic diffusion in the B3 domain of protein G. J Biomol NMR. 2003;27:261–275. doi: 10.1023/A:1025467918856.
- Hus JC, Brüschweiler R. Principal component method for assessing structural heterogeneity across multiple alignment media. J Biomol NMR. 2002;24:123–132. doi: 10.1023/A:1020927930910.
- Hus JC, Peti W, Griesinger C, Brüschweiler R. Self-consistency analysis of dipolar couplings in multiple alignments of ubiquitin. J Am Chem Soc. 2003;125:5596–5597. doi: 10.1021/ja029719s.
- Kay LE, Torchia DA, Bax A. Backbone dynamics of proteins as studied by 15N inverse detected heteronuclear NMR spectroscopy: application to staphylococcal nuclease. Biochemistry. 1989;28:8972–8979. doi: 10.1021/bi00449a003.
- Lakomek NA, Carlomagno T, Becker S, Griesinger C, Meiler J. A thorough dynamic interpretation of residual dipolar couplings in ubiquitin. J Biomol NMR. 2006;34:101–115. doi: 10.1007/s10858-005-5686-0.
- Lakomek NA, Walter KF, Farés C, Lange OF, de Groot BL, Grubmüller H, Brüschweiler R, Munk A, Becker S, Meiler J, Griesinger C. Self-consistent residual dipolar coupling based model-free analysis for the robust determination of nanosecond to microsecond protein dynamics. J Biomol NMR. 2008;41:139–155. doi: 10.1007/s10858-008-9244-4.
- Lange OF, Lakomek NA, Farés C, Schröder GF, Walter KFA, Becker S, Meiler J, Grubmüller H, Griesinger C, de Groot BL. Recognition dynamics up to microseconds revealed from an RDC-derived ubiquitin ensemble in solution. Science. 2008;320:1471–1475. doi: 10.1126/science.1157092.
- Lipari G, Szabo A. Model-free approach to the interpretation of nuclear magnetic-resonance relaxation in macromolecules. 1. Theory and range of validity. J Am Chem Soc. 1982;104:4546–4559. doi: 10.1021/ja00381a009.
- Lipari G, Szabo A. Model-free approach to the interpretation of nuclear magnetic-resonance relaxation in macromolecules. 2. Analysis of experimental results. J Am Chem Soc. 1982;104:4559–4570. doi: 10.1021/ja00381a010.
- Louhivuori M, Otten R, Lindorff-Larsen K, Annila A. Conformational fluctuations affect protein alignment in dilute liquid crystal media. J Am Chem Soc. 2006;128:4371–4376. doi: 10.1021/ja0576334.
- Meiler J, Prompers JJ, Peti W, Griesinger C, Brüschweiler R. Model-free approach to the dynamic interpretation of residual dipolar couplings in globular proteins. J Am Chem Soc. 2001;123:6098–6107. doi: 10.1021/ja010002z.
- Meirovitch E, Lee D, Walter KF, Griesinger C. Standard tensorial analysis of local ordering in proteins from residual dipolar couplings. J Phys Chem B. 2012;116:6106–6117. doi: 10.1021/jp301451v.
- Palmer AG 3rd. NMR characterization of the dynamics of biomacromolecules. Chem Rev. 2004;104:3623–3640. doi: 10.1021/cr030413t.
- Peti W, Meiler J, Brüschweiler R, Griesinger C. Model-free analysis of protein backbone motion from residual dipolar couplings. J Am Chem Soc. 2002;124:5822–5833. doi: 10.1021/ja011883c.
- Ramage R, Green J, Muir TW, Ogunjobi OM, Love S, Shaw K. Synthetic, structural and biological studies of the ubiquitin system: the total chemical synthesis of ubiquitin. Biochem J. 1994;299:151–158. doi: 10.1042/bj2990151.
- Rodriguez-Castañeda F, Haberz P, Leonov A, Griesinger C. Paramagnetic tagging of diamagnetic proteins for solution NMR. Magn Reson Chem. 2006;44:S10–S16. doi: 10.1002/mrc.1811.
- Salmon L, Bouvignies G, Markwick P, Lakomek NA, Showalter S, Li DW, Walter KFA, Griesinger C, Brüschweiler R, Blackledge M. Protein conformational flexibility from structure-free analysis of NMR dipolar couplings: quantitative and absolute determination of backbone motion in ubiquitin. Angew Chem Int Edit. 2009;48:4154–4157. doi: 10.1002/anie.200900476.
- Salvatella X, Richter B, Vendruscolo M. Influence of the fluctuations of the alignment tensor on the analysis of the structure and dynamics of proteins using residual dipolar couplings. J Biomol NMR. 2008;40:71–81. doi: 10.1007/s10858-007-9210-6.
- Saupe A. Kernresonanzen in kristallinen Flüssigkeiten und in kristallinflüssigen Lösungen. I. Z Naturforsch A. 1964;19:161–171.
- Saupe A. Recent results in field of liquid crystals. Angew Chem Int Edit. 1968;7:97–118. doi: 10.1002/anie.196800971.
- Schanda P, Meier BH, Ernst M. Quantitative analysis of protein backbone dynamics in microcrystalline ubiquitin by solid-state NMR spectroscopy. J Am Chem Soc. 2010;132:15957–15967. doi: 10.1021/ja100726a.
- Snyder LC. Analysis of nuclear magnetic resonance spectra of molecules in liquid–crystal solvents. J Chem Phys. 1965;43:4041–4050. doi: 10.1063/1.1696638.
- Tjandra N, Bax A. Direct measurement of distances and angles in biomolecules by NMR in a dilute liquid crystalline medium. Science. 1997;278:1111–1114. doi: 10.1126/science.278.5340.1111.
- Tolman JR. A novel approach to the retrieval of structural and dynamic information from residual dipolar couplings using several oriented media in biomolecular NMR spectroscopy. J Am Chem Soc. 2002;124:12020–12030. doi: 10.1021/ja0261123.
- Tolman JR, Flanagan JM, Kennedy MA, Prestegard JH. NMR evidence for slow collective motions in cyanometmyoglobin. Nat Struct Biol. 1997;4:292–297. doi: 10.1038/nsb0497-292.
- Ulmer TS, Ramirez BE, Delaglio F, Bax A. Evaluation of backbone proton positions and dynamics in a small protein by liquid crystal NMR spectroscopy. J Am Chem Soc. 2003;125:9179–9191. doi: 10.1021/ja0350684.
- Yao L, Vögeli B, Torchia DA, Bax A. Simultaneous NMR study of protein structure and dynamics using conservative mutagenesis. J Phys Chem B. 2008;112:6045–6056. doi: 10.1021/jp0772124.