Author manuscript; available in PMC: 2016 Jul 1.
Published in final edited form as: J Biomol NMR. 2015 Jun 5;62(3):353–371. doi: 10.1007/s10858-015-9951-6

Information content of long-range NMR data for the characterization of conformational heterogeneity

Witold Andrałojć a, Konstantin Berlin b, David Fushman b,*, Claudio Luchinat a,c,*, Giacomo Parigi a,c, Enrico Ravera a,c, Luca Sgheri d
PMCID: PMC4782772  NIHMSID: NIHMS757743  PMID: 26044033

Abstract

Long-range NMR data, namely residual dipolar couplings (RDCs) from external alignment and paramagnetic data, are becoming increasingly popular for the characterization of conformational heterogeneity of multidomain biomacromolecules and protein complexes. The question addressed here is how much information is contained in these averaged data. We have analyzed and compared the information content of conformationally averaged RDCs caused by steric alignment and of both RDCs and pseudocontact shifts caused by paramagnetic alignment, and found that, despite the substantial differences, they contain a similar amount of information. Furthermore, using several synthetic tests we find that both sets of data are equally good at recovering the major state(s) in conformational distributions.

Keywords: paramagnetic NMR, residual dipolar couplings, two-domain proteins, protein mobility, conformational variability

Introduction

Biological macromolecules are inherently flexible objects and often accomplish their task through extensive conformational rearrangement (Sicheri and Kuriyan, 1997; Pickford and Campbell, 2004; Zhang and Zuiderweg, 2004; Tonks, 2006; Chuang et al., 2010). Characterization of such rearrangements and the relevant conformational states can provide important clues about the mechanisms underlying biological function. This however is a challenging task because the system is underdetermined, implying a large degeneracy in the reconstructed solutions, and requires extensive experimental work often involving multiple techniques (Bonvin and Brunger, 1996; Choy and Forman-Kay, 2001; Svergun et al., 2001; Burgi et al., 2001; Clore and Schwieters, 2004; Schroeder et al., 2004; Iwahara et al., 2004; Bertini et al., 2004a; Blackledge, 2005; Lindorff-Larsen et al., 2005; Fragai et al., 2006; Tolman and Ruan, 2006; Boehr et al., 2006; Ryabov and Fushman, 2006; Chen et al., 2007; Bernadò et al., 2007; Bertini et al., 2007; Ryabov and Fushman, 2007; Lange et al., 2008; Hulsker et al., 2008; Korzhnev and Kay, 2008; Nodet et al., 2009; Boehr et al., 2009; Stelzer et al., 2009; Huang and Grzesiek, 2010; Fisher et al., 2010; Bashir et al., 2010; Rinnenthal et al., 2011; Bothe et al., 2011; Fisher and Stultz, 2011; Berlin et al., 2013; Russo et al., 2013; Guerry et al., 2013; Kukic et al., 2014; Ravera et al., 2014; Torchia, 2015). Therefore, it is important to know the information content provided by various experimental methods in order to decide on an optimal set of experiments a priori.

Residual dipolar couplings (RDC) (Lohman and Maclean, 1978) are widely used as a source of information on biomolecular structure and dynamics (Tolman, 2001; Tolman and Ruan, 2006; Berlin et al., 2013; Ravera et al., 2014). They arise in the presence of partial molecular orientation, which can be achieved by interactions with alignment media surrounding the molecule (Tolman et al., 1995; Tjandra and Bax, 1997; Hansen et al., 1998; Losonczi and Prestegard, 1998; Ramirez and Bax, 1998; Wang et al., 1998; Al-Hashimi et al., 2000; Prestegard et al., 2000; Zweckstetter and Bax, 2001; Lakomek et al., 2008) and/or by the preferential orientation of the molecule itself in a magnetic field due to its magnetic susceptibility anisotropy (Lohman and Maclean, 1978; Tolman et al., 1995; Zhang et al., 2003; Latham et al., 2008; Ravera et al., 2014; Musiani et al., 2014). RDCs obtained by alignment induced by an external orienting medium, herein referred to as diamagnetic RDCs (dRDC), depend on the nature of the interactions of the biomolecule with the medium. These interactions can be steric and/or electrostatic and, because of this, dRDC are reporters also on the overall shape of the macromolecule and/or its charge distribution (Zweckstetter and Bax, 2000; Zweckstetter, 2008; Berlin et al., 2009; Maltsev et al., 2014). On the other hand, RDCs caused by molecular self-alignment, often induced by the presence of a paramagnetic center with an anisotropic magnetic susceptibility, herein termed paramagnetic RDCs (pRDC), only depend on the orientation of the internuclear vectors in the reference frame of the magnetic susceptibility tensor and are generally independent of the shape of the molecule. 
However, the presence of an anisotropic magnetic susceptibility also gives rise to pseudocontact shifts (PCS) (Kurland and McGarvey, 1970), which are reporters on the positions of the nuclei in the principal axis frame of the magnetic susceptibility tensor centered on the paramagnetic site, and therefore contain information about the structure/shape of a molecule. The use of paramagnetism-induced restraints (Gochin and Roder, 1995a; Gochin and Roder, 1995b; Banci et al., 1996; Banci et al., 1998; Bertini et al., 2001a; Gaponenko et al., 2004; Bertini et al., 2005; Diaz-Moreno et al., 2005; Jensen et al., 2006; Bertini et al., 2008; Schmitz et al., 2012; Yagi et al., 2013b) is becoming increasingly popular because of the introduction of lanthanide binding tags (Barthelmes et al., 2011; Wöhnert et al., 2003; Rodriguez-Castañeda et al., 2006; Su et al., 2006; John and Otting, 2007; Pintacuda et al., 2007; Zhuang et al., 2008; Su et al., 2008b; Su et al., 2008a; Keizers et al., 2008; Häussinger et al., 2009; Su and Otting, 2010; Hass et al., 2010; Man et al., 2010; Das Gupta et al., 2011; Saio et al., 2011; Swarbrick et al., 2011b; Swarbrick et al., 2011a; Bertini et al., 2012a; Liu et al., 2012; Kobashigawa et al., 2012; Cerofolini et al., 2013; Yagi et al., 2013a; Gempf et al., 2013; Loh et al., 2013), that extend the range of applications from paramagnetic metalloproteins (Banci et al., 1996; Banci et al., 1997) (or proteins in which the naturally occurring metal can be replaced by a paramagnetic one (Allegrozzi et al., 2000; Bertini et al., 2001a; Bertini et al., 2001c; Bertini et al., 2001b; Bertini et al., 2003; Bertini et al., 2004b; Balayssac et al., 2008; Bertini et al., 2010a; Luchinat et al., 2012b)) to, in principle, any protein.

Given the various possibilities and limited resources, choosing the optimal set of observables for the characterization of protein conformational heterogeneity is important. In this work we analyze the information content associated with the two commonly used types of experimental data (dRDC and paramagnetic data) and discuss their features, advantages, and pitfalls. Specifically, we want to understand what information can be recovered and to what extent. Importantly, the methodology that we develop below is not limited to dRDC or paramagnetic data, and can be applied to any set of experimental observables.

Theory

Formulation of the ensemble problem

We focus on analyzing the ensemble information content of three specific types of NMR restraints, dRDC, pRDC, and PCS, in the case of proteins composed of two domains connected by a flexible linker. We have used the two-domain protein calmodulin (CaM) as a test case. As done previously (Bertini et al., 2007; Berlin et al., 2013), we assume that all three types of NMR restraints considered here represent a population-weighted average of the corresponding values for the individual conformers, and therefore have a linear dependence on the ensemble populations, such that

$$y = a_1 x_1 + \cdots + a_N x_N + \varepsilon = Ax + \varepsilon \qquad (1)$$

where y is a length-L column vector representing the experimental data (dRDC, pRDC, PCS, or some combination thereof), A is an LxN prediction matrix consisting of N column-vectors aj (j=1…N) representing the predicted data for each of the N conformers, xj is the population weight for the jth conformer, and ε is the difference between y and Ax due to the presence of experimental error. This assumption seems reasonable for pRDC and PCS (Bertini et al., 2012c), whereas for dRDC the interconversion between conformers can occur on a timescale that could be comparable to the one of the interaction with the alignment medium; additionally, the latter may perturb the system.

Since in general recovering x from equation 1 is an ill-posed problem, having an infinite number of solutions, we seek to recover the minimum ensemble (sparsest solution) satisfying the experimental observables, which we express as a constrained linear least-squares problem (Berlin et al., 2013),

$$x^* = \arg\min_x \|W(Ax - y)\|_2 \quad \text{s.t.} \quad x \ge 0,\ \|x\|_0 = M \qquad (2)$$

where W is the weight matrix that non-uniformly weighs the residuals between y and Ax, M is the desired ensemble size, ‖·‖2 is the Euclidean norm, and ‖x‖0 is the ℓ0 quasinorm of x, i.e. the number of nonzero elements in x. Typically the experimental errors are assumed to be uncorrelated, in which case W is simply a diagonal matrix with Wii = 1/σi, where σi is the estimated experimental error of the ith observation yi. For simplicity, for the rest of the manuscript we will drop W from our equations by assuming that A and y are already multiplied by W. In the Sparse Ensemble Selection (SES) method the ensemble size is chosen by solving the problem for reasonable values of M and using the L-curve to select the appropriate M value (Berlin et al., 2013). A different approach was also applied, based on the calculation of the maximum occurrence allowed for each conformer (MaxOcc, see below) (Bertini et al., 2002a; Gardner et al., 2005; Longinetti et al., 2006; Bertini et al., 2007; Bertini et al., 2010b; Luchinat et al., 2012a; Bertini et al., 2012b; Bertini et al., 2012c; Andralojc et al., 2014).
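As a rough illustration of Eq. 2, the sketch below folds the diagonal weights 1/σi into A and y and then searches all M-column subsets exhaustively, solving a non-negative least-squares problem for each. This is not the MOMP algorithm used by SES; exhaustive search is swapped in because it is easy to verify, and it is only feasible for very small pools. The function name `best_subset_nnls` and the toy data are assumptions made for illustration.

```python
from itertools import combinations

import numpy as np
from scipy.optimize import nnls


def best_subset_nnls(A, y, sigma, M):
    """Exhaustive search for the best M-conformer ensemble (toy sizes only)."""
    # Fold the diagonal weight matrix W (W_ii = 1/sigma_i) into A and y.
    Aw = A / sigma[:, None]
    yw = y / sigma
    best_cols, best_x, best_res = None, None, np.inf
    for cols in combinations(range(A.shape[1]), M):
        # Non-negative least squares restricted to the chosen columns.
        x_sub, res = nnls(Aw[:, list(cols)], yw)
        if res < best_res:
            best_cols, best_x, best_res = cols, x_sub, res
    x = np.zeros(A.shape[1])
    x[list(best_cols)] = best_x
    return x, best_res


rng = np.random.default_rng(0)
A = rng.normal(size=(20, 8))      # 20 observables, 8 candidate conformers
x_true = np.zeros(8)
x_true[[1, 5]] = [0.7, 0.3]       # two populated conformers
sigma = np.full(20, 0.5)          # assumed uniform experimental error
y = A @ x_true                    # noiseless synthetic data
x_hat, residual = best_subset_nnls(A, y, sigma, M=2)
```

With noiseless data and generic columns, the search should return the two populated conformers with their original weights; with noisy data the residual as a function of M is what an L-curve analysis would inspect.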

Predicting RDC and PCS data

For steric dRDC data, we generate the prediction matrix A using the program PATI (Berlin et al., 2009; Berlin et al., 2013), which assumes the presence of a steric planar alignment medium (Fig. 1A). Electrostatically induced RDCs were similarly simulated using PALES (Zweckstetter, 2008). The absolute scaling in the predicted dRDC values is regulated by changing the value of the parameter “liquid crystal concentration” (Zweckstetter and Bax, 2000) that controls the distance between the planar steric barriers. In the SES model the absolute scaling of the predicted dRDC is treated as an implicit parameter since the sum of all weights (Σj xj) is not constrained (Berlin et al., 2013).

Figure 1.


Schematic illustration of the relationship between the conformation of a multidomain protein and the alignment tensor for the two experimental methods considered here. In the case of partial orientation induced by external orienting media the alignment tensor changes for different conformations of a two-domain protein (A) whereas in the case of partial orientation induced by a paramagnetic metal ion attached to the protein the alignment tensor is invariant with respect to the orientation of the domain where it is attached (B).

For pRDC and PCS, without loss of generality, we can assume that a metal ion tag is located on the first (rigid) domain of the protein (Bertini et al., 2003). Therefore, the position of the metal ion relative to that domain is the same for all conformers. So, instead of performing the prediction of pRDC and PCS values for both domains, we obtain the prediction matrix A for a two-domain rigid system by first deriving the magnetic susceptibility anisotropy tensor (and metal ion’s position) from the experimental data for the first domain, and then use these tensors to predict the matrix A values for the second domain based on its position relative to the first domain (Fig. 1B). This formulation assumes that the distribution of the relative positions of the two domains is independent of the orientation of the magnetic susceptibility anisotropy tensor in the magnetic field (Bertini et al., 2002a).

Given a specific conformer, the pRDC values in the A matrix are thus predicted by first deriving the vector containing the 5 independent components of the alignment tensor, S*, directly from the experimental data for the first domain:

$$S^* = \arg\min_S \|V_1 S - y_1\|_2 \qquad (3)$$

where V1 is a 5-column matrix, the elements of which depend on the orientations of the normalized bond vectors in the fixed frame (Losonczi et al., 1999; Valafar and Prestegard, 2004; Berlin et al., 2009; Simin et al., 2014) and y1 are the observed experimental pRDC values for the first domain. Then, using the derived S*, we predict the pRDC for the second domain of the jth conformer (ApRDC,j) as

$$A_{\mathrm{pRDC},j} = V_{2j} S^* \qquad (4)$$

where V2j is the 5-column matrix of the bond vectors for the second domain in the jth conformer.
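Eqs. 3 and 4 can be sketched numerically as follows. The five-column row layout [y²−x², z²−x², 2xy, 2xz, 2yz], paired with the component ordering (Syy, Szz, Sxy, Sxz, Syz), is one common convention and is an assumption here, as is absorbing the dipolar coupling constant into S. The bond vectors are random stand-ins for real N-H orientations, and `rdc_design_matrix` is a hypothetical helper name.

```python
import numpy as np


def rdc_design_matrix(bonds):
    """Rows [y^2-x^2, z^2-x^2, 2xy, 2xz, 2yz] for normalized bond vectors."""
    b = bonds / np.linalg.norm(bonds, axis=1, keepdims=True)
    bx, by, bz = b[:, 0], b[:, 1], b[:, 2]
    return np.column_stack([by**2 - bx**2, bz**2 - bx**2,
                            2 * bx * by, 2 * bx * bz, 2 * by * bz])


rng = np.random.default_rng(1)
bonds_dom1 = rng.normal(size=(40, 3))     # stand-in bond vectors, domain 1
bonds_dom2_j = rng.normal(size=(30, 3))   # domain 2 vectors in conformer j

S_true = np.array([1.0, -2.0, 0.5, 0.3, -0.7])   # (Syy, Szz, Sxy, Sxz, Syz)
V1 = rdc_design_matrix(bonds_dom1)
y1 = V1 @ S_true                          # noiseless synthetic pRDC, domain 1

# Eq. 3: least-squares fit of the five independent tensor components.
S_star, *_ = np.linalg.lstsq(V1, y1, rcond=None)

# Eq. 4: predicted pRDC column for the second domain of conformer j.
V2j = rdc_design_matrix(bonds_dom2_j)
A_prdc_j = V2j @ S_star
```

Because the model is linear in the tensor components, the fit is a plain linear least-squares solve, and the same fitted vector is reused for every conformer.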

Similarly, the PCS values for the first domain can be used to derive the magnetic susceptibility anisotropy tensor T*, represented by a 3×3 traceless symmetric matrix, and the metal ion’s position p* (computed by alternating between solving a non-linear least-squares problem for p*, and a linear problem for T*). These values are then used to predict the PCS for the second domain of the jth conformer (APCS,j). The elements of the APCS,j vector are the PCSs predicted for each nucleus i of the second domain, according to the relationship

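$$A_{\mathrm{PCS},j,i} = \frac{1}{12\pi \|r_{ij}\|_2^5}\, \mathrm{tr}\left( \begin{bmatrix} 3r_{ij,1}^2 - \|r_{ij}\|_2^2 & 3r_{ij,1}r_{ij,2} & 3r_{ij,1}r_{ij,3} \\ 3r_{ij,1}r_{ij,2} & 3r_{ij,2}^2 - \|r_{ij}\|_2^2 & 3r_{ij,2}r_{ij,3} \\ 3r_{ij,1}r_{ij,3} & 3r_{ij,2}r_{ij,3} & 3r_{ij,3}^2 - \|r_{ij}\|_2^2 \end{bmatrix} T^* \right) \qquad (5)$$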

where rij = [rij,1, rij,2, rij,3] is the vector connecting the metal ion (located at p*) and the ith atom in the jth conformer, and tr(…) designates the trace of a matrix. The elements of the tensor T* and the components of the alignment tensor S* are related to one another by a proportionality constant (Bertini et al., 2002b), so that each of the two can be easily calculated from the other.
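A minimal numerical sketch of Eq. 5 is given below. The function name `pcs`, the axially symmetric test tensor, and the arbitrary units are assumptions chosen so that the result can be checked against the familiar axial expression Δχax(3cos²θ − 1)/(12πr³).

```python
import numpy as np


def pcs(coords, metal_pos, T):
    """PCS of Eq. 5 for nuclei at `coords` (n x 3), given tensor T (3 x 3)."""
    r = coords - metal_pos                    # metal-to-nucleus vectors
    d = np.linalg.norm(r, axis=1)             # distances |r_ij|
    # 3 r r^T - |r|^2 I for every nucleus, shape (n, 3, 3).
    G = 3 * r[:, :, None] * r[:, None, :] - (d**2)[:, None, None] * np.eye(3)
    # tr(G T) = sum_ab G_ab T_ba, evaluated per nucleus.
    return np.einsum("nab,ba->n", G, T) / (12 * np.pi * d**5)


# Axially symmetric, traceless tensor in its principal frame (arbitrary units).
dchi_ax = 3.0
T = np.diag([-dchi_ax / 3, -dchi_ax / 3, 2 * dchi_ax / 3])
metal = np.zeros(3)
nuclei = np.array([[0.0, 0.0, 2.0],    # on the z axis (theta = 0)
                   [2.0, 0.0, 0.0]])   # in the xy plane (theta = 90 degrees)
shifts = pcs(nuclei, metal, T)
```

The r⁻³ distance dependence visible here is what makes PCS values large only for conformers that bring the second domain close to the metal ion.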

Similarly to dRDC for multiple alignment media, pRDC and PCS from multiple metal ion derivatives (determined from the S* and T* tensors, respectively, of the corresponding metals) can be combined together in a single A matrix of predicted data.

Methods

Constraining SES ensemble populations

Since the scaling of the predicted dRDC values has an uncertainty (Berlin et al., 2013), when recovering SES ensembles using dRDC, we allow the total sum of x, Σj xj, to float, and only use the restraint x ≥ 0 (see Eq. 2).

By contrast, the values of pRDC and PCS are determined without any adjustable scaling factor, and thus the two datasets can be directly combined into a single population-constrained pRDC+PCS SES problem,

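$$x^* = \arg\min_x \left\| \begin{bmatrix} y_{\mathrm{pRDC}} \\ y_{\mathrm{PCS}} \end{bmatrix} - \begin{bmatrix} A_{\mathrm{pRDC}} \\ A_{\mathrm{PCS}} \end{bmatrix} x \right\|_2 \quad \text{s.t.} \quad x \ge 0,\ \sum_j x_j \le c,\ \|x\|_0 = M \qquad (6)$$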

where c is the upper bound on the total population weight. Since Σj xj represents the total population weight, it should equal 1. However, we allow the sum of the weights to be less than 1, since we aim at recovering the sparsest ensemble representing the major states (potentially there could be a very large set of transient minor states). The validity of the recovered solution can be evaluated from the geometrical interpretation of pRDC: a solution is a convex combination of a set of conformers such that the averaged pRDC belong to the polyhedron with vertices in the conformers (see Figure S5) (Gardner et al., 2005; Longinetti et al., 2006). Since the problem is underdetermined, there will be many solutions, and the SES method chooses to limit the number of vertices to M. In order to find a solution with this constraint, we need to use c<1 in Eq. (6). This is equivalent to shrinking the vertices of the polyhedron towards the origin by a factor c and renormalizing the weighting factors to 1. However, since the origin is an acceptable point (Sgheri, 2010a) and the set is convex, the shrunk vertices remain acceptable points. In other words, if c is relatively close to 1, the conformers representing the vertices are still good representatives of the conformational freedom of the system. Finally, the Σj xj ≤ 1 restraint prevents unphysical solutions from being found.

SES algorithm implementation

SES ensemble recovery was implemented using the multi-orthogonal matching pursuit (MOMP) algorithm (Berlin et al., 2013). We modified the MOMP method to handle the Σj xj ≤ c requirement using the active set method (O’Leary, 2009) to restrain our solution for each iteration of MOMP. Given that there are two restraints on x, x ≥ 0 and Σj xj ≤ c, during each iteration of the MOMP algorithm there are four possible sets of active restraints: (i) no restraints are active; (ii) the Σj xj ≤ c restraint is active; (iii) the x ≥ 0 restraint is active; or (iv) both x ≥ 0 and Σj xj ≤ c are active. To summarize, the constrained least-squares problem is solved as follows: update the solution using the conjugate gradient (CG) method; if the solution violates x ≥ 0 or Σj xj ≤ c, solve the linearly constrained linear least-squares problem by using a “feasible direction” method (O’Leary, 2009); if the solution still violates x ≥ 0, drop this solution from a list of possible solutions stored in a priority queue. This procedure is repeated for all propagated solutions from the previous iteration.
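The inner problem solved for a fixed set of active columns, least squares subject to x ≥ 0 and Σj xj ≤ c, can be illustrated with a simple projected-gradient solver. This is a stand-in chosen for brevity, not the active-set and feasible-direction scheme described above; the function names, iteration count, and toy data are assumptions.

```python
import numpy as np


def project_capped_simplex(v, c):
    """Euclidean projection onto {x >= 0, sum(x) <= c}."""
    x = np.maximum(v, 0.0)
    if x.sum() <= c:
        return x
    # Sum restraint is active: project onto the simplex {x >= 0, sum(x) = c}.
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - c
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)


def constrained_lsq(A, y, c, n_iter=5000):
    """Minimize ||Ax - y||_2 s.t. x >= 0, sum(x) <= c by projected gradient."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / largest eigenvalue of A^T A
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = project_capped_simplex(x - step * (A.T @ (A @ x - y)), c)
    return x


rng = np.random.default_rng(3)
A = rng.normal(size=(30, 4))                 # four active columns
x_true = np.array([0.5, 0.3, 0.0, 0.1])      # weights sum to 0.9
y = A @ x_true
x_hat = constrained_lsq(A, y, c=1.0)         # sum restraint inactive
x_capped = constrained_lsq(A, y, c=0.5)      # sum restraint active
```

With c = 1.0 the true weights are feasible and should be recovered; with c = 0.5 the sum restraint becomes active and the weights are forced to add up to c.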

The time vs. accuracy tradeoff in the MOMP algorithm is controlled by how many top solutions, K, from the current iteration are propagated to the next iteration of MOMP (Berlin et al., 2013). In order to improve the memory requirement for running SES using very large K values (>10^6), we modified the algorithm used to solve the overdetermined linear least-squares problem for each iteration of SES, when a new solution must be computed right after one new column is added to the list of active columns (see Supporting Information in (Berlin et al., 2013)). In the previous implementation (Berlin et al., 2013), the least-squares solution was efficiently updated by doing a rank-1 update of the QR decomposition. However, this approach requires us to store K QR decompositions during each iteration. In our current updated version, we switched to an iterative CG least-squares solver, which requires that we only store the previous-iteration solution, rather than the QR decomposition. This significantly reduced the SES memory footprint for large K. The full A^T A matrix required for the CG algorithm is never explicitly formed, and instead the multiplication step in the CG algorithm is computed as A^T(Ax). With the CG implementation we are able to run SES on a 10 GB RAM desktop for K=10^6, without any sacrifice in computational time or accuracy, as compared to the previous implementation.
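The idea of running CG on the least-squares problem without ever forming A^T A can be sketched with the textbook CGLS recurrence shown below. This is a generic version, not the authors' code; the function name `cgls`, the tolerance, and the warm-start argument `x0` (standing in for the stored previous-iteration solution) are assumptions.

```python
import numpy as np


def cgls(A, y, x0=None, n_iter=200, tol=1e-20):
    """Conjugate gradient on the normal equations without forming A^T A."""
    x = np.zeros(A.shape[1]) if x0 is None else np.array(x0, dtype=float)
    r = y - A @ x                 # residual in data space
    s = A.T @ r                   # negative gradient, A^T r
    p = s.copy()                  # first search direction
    gamma = s @ s
    for _ in range(n_iter):
        if gamma < tol:           # squared gradient norm small enough
            break
        q = A @ p                 # only products with A and A^T are needed
        alpha = gamma / (q @ q)
        x += alpha * p
        r -= alpha * q
        s = A.T @ r
        gamma_new = s @ s
        p = s + (gamma_new / gamma) * p
        gamma = gamma_new
    return x


rng = np.random.default_rng(4)
A = rng.normal(size=(50, 6))
y = rng.normal(size=50)
x_cg = cgls(A, y)
x_ref = np.linalg.lstsq(A, y, rcond=None)[0]   # dense reference solution
```

Only the current iterate and a few work vectors are stored, which is the memory advantage over keeping a QR factorization for each of the K propagated solutions.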

MaxOcc calculations

The Maximum Occurrence (MaxOcc) of each and every conformer is defined as the maximum weight that it can obtain when part of a conformational ensemble without violating the constraints of the experimental data. No restriction is posed on the number of conformations to be included in the ensemble. Maximum occurrence (MaxOcc) can be interpreted as the maximum fraction of time that a conformation can exist, when taken together with any ensemble of conformations with optimized weights (Longinetti et al., 2006; Bertini et al., 2007; Sgheri, 2010b; Bertini et al., 2010b; Das Gupta et al., 2011; Luchinat et al., 2012a; Bertini et al., 2012b; Bertini et al., 2012c; Cerofolini et al., 2013).

We formulate MaxOcc as a convex regularization problem, where for each conformer j we find the weight vector x which minimizes

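$$\arg\min_x \left\{ \|Ax - y\|_2^2 + \lambda (x_j - x_{MO})^2 + \lambda \Big(1 - x_{MO} - \sum_{i=1,\, i \ne j}^{N} x_i\Big)^2 \right\} \quad \text{s.t.} \quad x \ge 0 \qquad (7)$$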

where xMO is the desired weight of the conformation j, and λ is a weighting factor. The calculations are repeated for increasing values of xMO; the MaxOcc of conformation j is defined as the highest xMO providing a value of the expression in Eq. 7 not exceeding the minimum value by more than a prefixed threshold, for example 20%. The value of λ was fixed to 15, as found with the L-curve method, as a compromise between a good fit of the experimental observables and the proximity of the sum of the weights to 1. A frugal coordinate descent algorithm, combined with random coordinate search (Nesterov, 2012), is used to solve Eq. 7.
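One way to see the structure of Eq. 7 is to note that both penalty terms are squared linear functions of x, so they can be appended to A as two extra rows scaled by √λ and the whole problem handed to a non-negative least-squares solver. The sketch below uses SciPy's `nnls` in place of the coordinate descent algorithm mentioned above; the function names, the grid of xMO values, and the toy data are assumptions.

```python
import numpy as np
from scipy.optimize import nnls


def maxocc_objective(A, y, j, x_mo, lam=15.0):
    """Minimum of the Eq. 7 target for conformer j at a fixed weight x_mo."""
    pin = np.zeros(A.shape[1])
    pin[j] = 1.0                      # picks out x_j
    rest = 1.0 - pin                  # sums all other conformers
    # Penalty terms become extra least-squares rows scaled by sqrt(lam).
    A_aug = np.vstack([A, np.sqrt(lam) * pin, np.sqrt(lam) * rest])
    y_aug = np.concatenate([y, [np.sqrt(lam) * x_mo,
                                np.sqrt(lam) * (1.0 - x_mo)]])
    x, rnorm = nnls(A_aug, y_aug)
    return rnorm**2, x


def maxocc(A, y, j, grid, threshold=0.2, lam=15.0):
    """Largest x_mo whose target stays within (1 + threshold) of the minimum."""
    values = np.array([maxocc_objective(A, y, j, w, lam)[0] for w in grid])
    allowed = values <= (1.0 + threshold) * values.min()
    return grid[allowed].max()


rng = np.random.default_rng(5)
A = rng.normal(size=(12, 5))
x_true = np.array([0.5, 0.3, 0.2, 0.0, 0.0])
y = A @ x_true + 0.1 * rng.normal(size=12)    # noisy synthetic data
grid = np.linspace(0.0, 1.0, 21)
mo = maxocc(A, y, j=0, grid=grid)
```

The threshold is applied relative to the smallest target value found on the grid, so the scan is only meaningful for noisy data where that minimum is not zero.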

Calculations are also performed to determine the maximum occurrence of a region (MaxOR) defined in the conformational space of the protein (Andralojc et al., 2014). The MaxOR, similar to MaxOcc, is defined as the maximum weight that a region in conformational space (composed of multiple structures) can have in an ensemble without causing a violation of the experimental restraints. First, the highest-MaxOcc structures are clustered according to their positions using a k-means algorithm as implemented in the Python library SciPy (Jones et al., 2001). The number of clusters is set to the highest value yielding reproducible clustering by the algorithm. Once the clusters are built, small regions are defined around the centers of the clusters, which include all conformations within a given distance Δ from the center of the cluster. The MaxORs of these regions are determined by solving

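$$\arg\min_x \left\{ \|Ax - y\|_2^2 + \lambda \left[ \Big(x_{MO} - \sum_{i \in C} x_i\Big)^2 + \Big(1 - x_{MO} - \sum_{i \in D} x_i\Big)^2 \right] \right\} \quad \text{s.t.} \quad x \ge 0 \qquad (8)$$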

where xMO is the fixed value that must correspond to the sum of the weights of all conformations within the region, and C and D indicate the structures within and outside that region, respectively. Again, the largest xMO providing a good fit of the experimental data defines the MaxOR of the region.

Results and Discussion

An important theoretical question that we would like to answer a priori, before performing any time-consuming simulation or experiment, is how much information for ensemble recovery is contained in dRDC vs. pRDC vs. PCS and in dRDC vs. pRDC+PCS combined. For example, intuitively, dRDC should contain more information than pRDC, since dRDC contain shape/size-related information, while the relative information content of PCS is harder to quantify intuitively. To what extent does combining pRDC with PCS yield better results than either dataset alone? Is the information provided by pRDC+PCS similar to that provided by dRDC? Would several different metal ions be needed to obtain results comparable to those obtained with multiple sets of dRDC, or do they produce a better set of experimental data for the characterization of the conformational heterogeneity?

In order to answer these questions, we analyzed several algebraic properties of eight experimentally feasible datasets: (i) single-alignment medium dRDC; (ii) single-metal ion pRDC; (iii) single-metal ion PCS; (iv) single-metal ion pRDC+PCS combined; and (v-viii) datasets analogous to (i-iv) but with three alignment media or three metal ions. We will refer to the one and three media/metal ions datasets as the one- and three-restraint datasets, respectively.

The datasets were generated for a pool of 32723 conformers of calmodulin (CaM), a protein composed of two rigid domains connected by a 4-residue flexible linker (Barbato et al., 1992; Tjandra et al., 1995; Chou et al., 2001; Kukic et al., 2014). This large pool of sterically allowed conformations of the protein was taken from reference (Bertini et al., 2010b), where it was generated using the program RanCh (Bernadò et al., 2007). For each conformer and for each aligning medium or metal ion, a set of dRDCs, pRDCs, and PCSs was generated, as described in the Theory section.

Simulated PCS and pRDC data

The paramagnetic restraints consisted of PCS of the amide H atoms and pRDC of amide N-H pairs of the C-terminal domain of CaM induced by the presence of a paramagnetic center in its N-terminal domain. Three metals with non-coinciding magnetic susceptibility tensors (corresponding to the experimental ones obtained for Tb(III), Tm(III), and Dy(III) CaM) were used to generate three sets of PCSs (132 observations in total) and pRDCs (112 observations in total). The magnetic susceptibility anisotropy tensors were taken from reference (Bertini et al., 2009).

Simulated dRDC data

The simulated diamagnetic restraints were amide 15N-1H dRDCs (219 in total) induced in both CaM domains by 3 independent external alignment media: flat uncharged discs and either positively or negatively charged rods. In the first case, dRDCs were generated using PATI (Berlin et al., 2009), in the other cases using PALES (Zweckstetter and Bax, 2000; Zweckstetter, 2008). In both cases, the calculation of the alignment tensors, and of the corresponding dRDC, are performed under the assumption that the protein’s conformations are rigid during the time course of its interaction with the alignment medium. As a word of caution we note that every interaction of a protein with the alignment medium might actually perturb its conformation, and these interactions can occur on a timescale that is slower than the conformational averaging itself. The assumption that the averaged dRDCs correspond to a weighted average of the RDCs calculated for the individual conformations, although universally used, might fall short in representing the real physical picture.

SVD of prediction matrices

The first and simplest analysis we performed was aimed at evaluating the theoretical information content of the eight different datasets described above. This was done through the spectral analysis of the prediction matrix A for each dataset. The spectral analysis measures the number of significant linearly independent components present in the data, by counting the significant singular values of A. This directly provides an upper bound on the number of independent conformers we can hope to extract. Trying to recover a larger number of independent conformers would result in overfitting. The results are shown in Fig. 2A,D.

Figure 2.


Singular value decomposition (left panels), histogram of column correlations (center panels), and condition number of randomly subselected set of columns (right panels), for the eight described datasets. The results for a single medium/metal ion are shown on the top, and the results for the 3 media/metal ions are shown on the bottom. (A, D) The 35 largest singular values of the associated A matrices. (B, E) The distribution of the uncentered correlations between all pairs of columns in the A matrix, estimated from 20000 random samples. (C, F) The expected mean and standard deviation of the relative error for recovering population weights from an arbitrary M=1,…,10 subset of columns.

As shown in Eq. 3, any vector of RDC values (either pRDC or dRDC) from a rigid domain can be expressed as a matrix V, which can be determined from the orientations of the bond vectors of that domain, multiplied by the 5 independent components of the alignment tensor matrix. Since there is a linear dependence of the observed data on the 5 components of the alignment tensor, we expect the A matrix for dRDC to have rank 10 (5 independent parameters for each of the two domains), and for pRDC to have rank 5, since only the second domain data are used for ensemble recovery. The number of unknowns in the paramagnetic case is also smaller because the alignment tensor for the first domain (5 parameters) can be easily determined from PCS and pRDC measured for this domain, as they are not averaged by conformational variability.

Numerical spectral analyses of the generated prediction matrices for dRDC and pRDC (Fig. 2A,D) support our theoretical analysis, and show that the number of singular values of matrix A for one-restraint dRDC and pRDC data is 10 and 5, respectively. Going from 1 to 3 alignment tensors triples the number of non-zero singular values for dRDC and pRDC, as would be expected for linearly independent alignments. The large decrease in the magnitude of singular values for the last 10 dRDC and 5 pRDC non-zero singular values in the three-restraint datasets likely reflects the difficulty in experimentally obtaining three fully independent alignment tensors. The larger magnitudes of dRDC singular values compared to the singular values for pRDC are not related to their information content, but merely reflect the relative strength of diamagnetic versus paramagnetic alignment in the simulated data. On the contrary, it is the decrease in the relative magnitude of the singular values with respect to the largest value, calculated from a set of data, that reflects the difficulty in exploiting the associated restraints, and is hence ultimately related to the information content.
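The rank-5 behaviour of a one-restraint pRDC matrix can be reproduced on toy data. In the sketch below, 200 hypothetical conformers are generated by applying random orthogonal matrices to a fixed set of bond vectors of the second domain, with the tensor held fixed in the frame of the first domain. The bond vectors, tensor values, row convention, and relative threshold of 1e-10 are assumptions; a proper-rotation sampler would be used for real structures.

```python
import numpy as np


def rdc_row_terms(b):
    """Five-column RDC design matrix for unit vectors b (n x 3)."""
    x, y, z = b[:, 0], b[:, 1], b[:, 2]
    return np.column_stack([y**2 - x**2, z**2 - x**2,
                            2 * x * y, 2 * x * z, 2 * y * z])


rng = np.random.default_rng(2)
bonds = rng.normal(size=(60, 3))
bonds /= np.linalg.norm(bonds, axis=1, keepdims=True)   # 60 unit bond vectors
S_star = np.array([1.0, -0.4, 0.2, 0.1, -0.3])          # fixed tensor components

columns = []
for _ in range(200):                                    # 200 toy conformers
    Q = np.linalg.qr(rng.normal(size=(3, 3)))[0]        # random orthogonal matrix
    columns.append(rdc_row_terms(bonds @ Q.T) @ S_star) # reoriented second domain
A = np.column_stack(columns)                            # 60 observables x 200 conformers

sing = np.linalg.svd(A, compute_uv=False)
n_significant = int(np.sum(sing > 1e-10 * sing[0]))
```

Although A has 60 rows and 200 columns, only five singular values stand above numerical noise, because each column is a linear function of the five components of the reoriented tensor.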

Similarly, the observed PCS data for a rigid domain that does not contain the paramagnetic ion (i.e. the second domain) can be expressed using 8 parameters: the 5 independent components defining the T tensor, and the 3 parameters describing the metal-ion’s position p with respect to this domain. However, since the observed PCS vector y is not linearly related to p, the rank of APCS (calculated from the PCSs in the second domain) is much higher than 8, and greater than that for dRDC or pRDC datasets. The rank of APCS can in fact approach the number of observations; however, as Figure 2A,D show, the magnitude of the singular values decreases very rapidly. This decrease reflects the strong difference in the PCS values between conformers where the C-terminal (second) domain is close to the metal ion (paramagnetic center) and those where it is far away. After the first ≈15 entries, in the one-restraint case the singular values are very small because similar PCS values are calculated for conformers not very far from one another and for nuclei which are spatially close to several other nuclei. When using three sets of metal ions, the number of conformers with large and different PCS values increases. Thus, the decrease in the magnitude of the singular values is significantly slower than in the case of a single metal ion (Fig. 2A,D).

One major advantage of using metal ions instead of steric alignment is that both pRDC and PCS are collected from the same biochemical construct. Thus, two independent datasets can be directly combined, as described in Eq. 6. When combining these datasets, a significantly slower decay in singular values of A is obtained compared to the pRDC and PCS datasets analyzed independently. This supports the accepted intuition that pRDC and PCS provide orthogonal structural restraints (pRDCs are very sensitive to orientation, PCSs mostly provide distance restraints).

Histograms of prediction matrices

The spectral analysis of the A matrices suggests that pRDC+PCS and even PCS alone provide better restraints for ensemble selection than dRDC. However, singular values are not an exhaustive description of the overall vector distribution. Therefore, we directly analyzed the distribution of correlations between all columns of the matrix A calculated for dRDC, pRDC, and PCS. The uncentered correlation distributions between all pairs of columns are shown in Fig. 2B,E. The more uncorrelated the columns of each specific A (AdRDC, ApRDC, APCS) the smaller the chance that an alternative conformer can explain the same subset of experimental data, thus decreasing the number of viable alternative ensembles. In the optimal case, all columns would have zero correlation, and the ensemble solution would be unique.
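The uncentered correlation between two columns is their dot product divided by the product of their Euclidean norms, with no mean subtraction. A minimal sketch of sampling such correlations over random column pairs is given below; the function name, the default of 20000 samples, and the random test matrix are assumptions.

```python
import numpy as np


def sampled_uncentered_correlations(A, n_samples=20000, seed=0):
    """Uncentered correlation (cosine) between randomly drawn column pairs."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    i = rng.integers(0, n, size=n_samples)
    j = rng.integers(0, n - 1, size=n_samples)
    j = np.where(j >= i, j + 1, j)            # shift so that j never equals i
    An = A / np.linalg.norm(A, axis=0)        # scale every column to unit length
    return np.einsum("ls,ls->s", An[:, i], An[:, j])


rng = np.random.default_rng(6)
A_demo = rng.normal(size=(50, 100))           # stand-in prediction matrix
corr = sampled_uncentered_correlations(A_demo)
hist, edges = np.histogram(corr, bins=40, range=(-1.0, 1.0))
```

A histogram concentrated near zero indicates nearly orthogonal columns, whereas mass near ±1 signals conformers whose predicted data are almost interchangeable.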

Figure 2B,E clearly demonstrate that even though the number of singular values of PCS is larger than that of dRDC and pRDC, the correlation distribution is actually significantly worse than for any other dataset, so that their information content could not be larger. The higher correlation for a large fraction of the conformers reflects a distribution of PCS where very large changes occur in the proximity of the metal ion only, whereas almost no change occurs far away from the metal ion. Additional metal ions can significantly improve the distribution of correlations, although it remains poor with respect to that of the other restraints.

Since pRDCs are distance-independent, they provide a more uniform distribution of values, so that their correlation distribution is much better than for PCS. The pRDC distribution is nevertheless worse than that of dRDC in the one-restraint case; it significantly improves, essentially to the level of dRDC, in the three-restraint case. Interestingly, the dRDC distribution changes only slightly between one and three restraints, which suggests that the information contained in the additional dRDC datasets is more redundant than in the pRDC case.

Combining pRDC with PCS results in a better correlation distribution than for pRDC and PCS individually. In turn, the correlation distribution of pRDC+PCS is very similar to that of dRDC in the one-restraint case and actually somewhat better in the three-restraint case.

Expected relative error

While the correlation plots in Fig. 2B,E provide an estimate of the A matrix column vector distribution, they do not directly tell how well ensembles of more than two conformers can be recovered, nor do they take the signal-to-noise ratio into account. To assess how well larger ensembles can be recovered, we computed the mean and standard deviation of the relative error from synthetically generated y data (with added Gaussian error) for M=1,…,10 columns. The mean and standard deviation were computed by randomly sampling, for each M value, M columns and generating the associated population weights x uniformly at random. The synthetic y was generated as y=Ax+N(0,1), where N(0,1) is the zero-mean Gaussian distribution with σ=1. The vector x* and the associated relative error, ‖x−x*‖₂/‖x‖₂, were recovered by solving Eqs. 2 and 6. In order to guarantee a less than 0.1% relative error with greater than 99.999% confidence using the Chernoff bound, the process was repeated 40000 times for each M. The results for all datasets are shown in Fig. 2C,F.
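The sampling procedure described above can be illustrated with a small Monte Carlo sketch. Several choices here are assumptions rather than the published protocol: Eqs. 2 and 6 are replaced by a plain non-negative least-squares fit over all columns of A (solved with a simple projected-gradient loop), the random weights are normalised to sum to one, and the prediction matrix is a hypothetical toy example.

```python
import math
import random

def matvec(A, x):
    """Product of a matrix (list of rows) with a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def nnls_projected_gradient(A, y, iterations=500):
    """Minimise 0.5*||Ax - y||^2 subject to x >= 0 (stand-in for Eqs. 2 and 6)."""
    n_rows, n_cols = len(A), len(A[0])
    # Safe step size: 1 / ||A||_F^2 never exceeds 1 / lambda_max(A^T A).
    step = 1.0 / sum(a * a for row in A for a in row)
    x = [0.0] * n_cols
    for _ in range(iterations):
        residual = [p - t for p, t in zip(matvec(A, x), y)]
        gradient = [sum(A[i][j] * residual[i] for i in range(n_rows))
                    for j in range(n_cols)]
        x = [max(0.0, xj - step * gj) for xj, gj in zip(x, gradient)]
    return x

def relative_error(x_true, x_est):
    """||x - x*||_2 / ||x||_2."""
    num = math.sqrt(sum((a - b) ** 2 for a, b in zip(x_true, x_est)))
    return num / math.sqrt(sum(a * a for a in x_true))

def expected_relative_error(A, M, trials, sigma=1.0, seed=0):
    """Mean and standard deviation of the relative error for ensembles of size M."""
    rng = random.Random(seed)
    n_cols = len(A[0])
    errors = []
    for _ in range(trials):
        chosen = rng.sample(range(n_cols), M)      # M randomly chosen conformers
        raw = [rng.random() for _ in chosen]       # uniform random weights
        x_true = [0.0] * n_cols
        for j, r in zip(chosen, raw):
            x_true[j] = r / sum(raw)               # normalised to 1 (assumption)
        y = [v + rng.gauss(0.0, sigma) for v in matvec(A, x_true)]
        errors.append(relative_error(x_true, nnls_projected_gradient(A, y)))
    mean = sum(errors) / trials
    std = math.sqrt(sum((e - mean) ** 2 for e in errors) / trials)
    return mean, std

# Toy 6 x 4 prediction matrix (hypothetical values, in units of sigma).
A_toy = [
    [12.0, -3.0, 5.0, 0.0],
    [-4.0, 9.0, 1.0, 6.0],
    [7.0, 2.0, -8.0, 3.0],
    [0.0, -6.0, 4.0, 10.0],
    [5.0, 5.0, 2.0, -7.0],
    [-9.0, 1.0, 6.0, 2.0],
]
for M in (1, 2, 3):
    mean, std = expected_relative_error(A_toy, M, trials=50)
    print(f"M={M}: relative error {mean:.3f} +/- {std:.3f}")
```

The number of trials is kept small here so that the sketch runs quickly; the analysis in the text used 40000 repetitions per M.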

For the one-restraint datasets, dRDC has lower relative error than pRDC, PCS, or pRDC+PCS. As expected, there is a rapid growth in pRDC errors due to the low matrix rank, and high errors overall in PCS due to the high correlation between columns. In the case of the three-restraint datasets, dRDC has significantly lower relative error than pRDC, even though on the correlation plot the two distributions are very similar. Interestingly, combining pRDC+PCS yields only slightly higher error rate than for dRDC.

Recovering the conformational variability from synthetic datasets

In the previous sections we theoretically analyzed the information content of 8 datasets of synthetic dRDC, pRDC and/or PCS data. Here we perform a direct comparison of the performance of the different restraints in recovering information on the structural variability of the system. To achieve this, we determined i) the minimum-size sparsest ensemble solution using the Sparse Ensemble Selection (SES) method (Berlin et al., 2013) and ii) the conformations (as well as the regions in the conformational space) with the highest MaxOcc values. In this way it becomes possible to analyze the accuracy of the recovered solutions from the different sets of synthetic averaged data.

For this purpose, we devised three simulations modeling i) extensive mobility around a single conformation, ii) two-site exchange with limited mobility around each center, and iii) two-site exchange with a reduced difference in the orientations of the two centers. In each of the simulations, the two-domain protein CaM was allowed to sample different, well defined, parts of its sterically allowed conformational space. Synthetic restraints were calculated as weighted averages over the values of dRDC, pRDC, and PCS of the individual conformations belonging to the sampled regions. These average data were perturbed with a Gaussian error with a standard deviation of 1, 2, or 3 Hz for pRDC and dRDC and of 0.01, 0.02, or 0.03 ppm for PCS.

In the following descriptions of the simulated conformational ensembles, the N-terminal domain of CaM is taken as the frame of reference, and each conformation is described by the different position and orientation of the C-terminal domain with respect to the N-terminal domain. The exact details of each simulation, although described accurately for completeness, are not crucial for the success of the ensemble recovery attempts.

Simulation 1

In this first simulation we consider the case of conformational variability centered at a single extended conformation of CaM. The sampled ensemble consists of all the conformers, present in the pool of the 32723 sterically allowed conformers, within a distance Δ (measured as a combination of translation and rotation) from the central extended structure (Fig. 3A) (Bertini et al., 2012b). Specifically, this distance is defined as:

$$\Delta = d + f\,(1 - \cos\alpha) \tag{9}$$

where d is the translation of the center of mass of the C-terminal domain from the central structure, and α is the angle of rotation from the central structure, calculated as α = arccos(|qc · q|), where qc and q are the unit quaternions describing the central structure and the other structure. Note that the two structures are actually 2α apart in Cartesian space (Kuffner, 2004). Δ defines the largest allowed spatial displacement (when α is 0) and the largest allowed rotation (when d is 0; it also depends on the factor f) from the position of the central conformer. In the present simulation, conformations with Δ up to 30 Å (a reasonable estimate for this system) were accepted and the value of f was set to 84 Å. In this way, the conformers in the constructed ensemble can have the center of mass of the C-terminal domain at a maximum distance of 30 Å from the conformer at the center of the distribution if they have the same orientation (this distance decreases as the difference in orientation increases). Their C-terminal domain can be rotated by up to 100° (α=50°) with respect to the central conformer if there is no translation of the center of mass (and gradually less as the translational component increases). The weight of each conformation in the ensemble depends on its Δ and is assigned according to a Gaussian distribution centered at Δ=0, with the standard deviation chosen to provide weights close to zero when Δ is close to 30 Å.
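As a numerical check of the distance measure, the following sketch reads Eq. 9 as Δ = d + f(1 − cos α), which reproduces the limits quoted in the text (f = 84 Å with α = 50° gives Δ ≈ 30 Å; f = 42.7 Å with α = 40°, used in Simulations 2 and 3, gives Δ ≈ 10 Å). The helper names, the rotation axis, and the Gaussian width `sigma` are illustrative assumptions; the text does not report the standard deviation that was used.

```python
import math

def rotation_half_angle(q_center, q_other):
    """alpha = arccos(|q_c . q|) for two unit quaternions given as (w, x, y, z)."""
    dot = abs(sum(a * b for a, b in zip(q_center, q_other)))
    return math.acos(min(1.0, dot))  # clamp guards against rounding above 1

def displacement(d, alpha, f):
    """Eq. 9: Delta = d + f * (1 - cos(alpha)); d and f in angstrom, alpha in radians."""
    return d + f * (1.0 - math.cos(alpha))

def quaternion_about_z(angle_deg):
    """Unit quaternion for a rotation by angle_deg about the z axis (arbitrary axis choice)."""
    half = math.radians(angle_deg) / 2.0
    return (math.cos(half), 0.0, 0.0, math.sin(half))

def gaussian_weight(delta, sigma):
    """Unnormalised Gaussian weight centered at Delta = 0 (sigma is an assumption)."""
    return math.exp(-0.5 * (delta / sigma) ** 2)

q_center = quaternion_about_z(0.0)

# Simulation 1 limit: a pure rotation of 100 degrees corresponds to alpha = 50 degrees.
alpha_1 = rotation_half_angle(q_center, quaternion_about_z(100.0))
delta_1 = displacement(0.0, alpha_1, f=84.0)

# Simulations 2 and 3 limit: a pure rotation of 80 degrees corresponds to alpha = 40 degrees.
alpha_2 = rotation_half_angle(q_center, quaternion_about_z(80.0))
delta_2 = displacement(0.0, alpha_2, f=42.7)

print(f"alpha_1 = {math.degrees(alpha_1):.1f} deg, Delta_1 = {delta_1:.2f} A")
print(f"alpha_2 = {math.degrees(alpha_2):.1f} deg, Delta_2 = {delta_2:.2f} A")
```

Because the quaternion dot product yields half of the Cartesian rotation angle, the factor f is what converts the quoted maximum rotation into the same length scale as the translation threshold.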

Figure 3.

The simulated ensembles. Different positions of the C-terminal domain of CaM are represented by a triad of Cartesian axes, centered at the center of mass of the C-terminal domain. The conformers are color coded according to their relative weights (from red = high weight to blue = low weight). (A) Simulation 1, (B) Simulation 2, (C) Simulation 3. The ensembles are shown from two different points of view in the left and right panels. All the conformers are superimposed by the N-terminal domain, which is shown in cartoon representation.

Simulation 2

This simulation models the case of a two-site exchange, with limited mobility allowed around each of the two main conformers (Fig. 3B). The two centers were separated by approximately 30 Å and their C-terminal domains were rotated by ca. 140° with respect to each other. The mobility around each center was simulated as in the previous case with the threshold on Δ set to 10 Å and f equal to 42.7 Å, which corresponds to a maximum allowed angular displacement with respect to the central conformer of 80° (α=40°).

Simulation 3

This simulation is similar to Simulation 2, with the difference that the angular distance between the two sites was decreased almost twofold (Fig. 3C). Sites with more similar orientations are likely to present a bigger challenge in ensemble recovery using restraints which depend on the domain orientations. The distance between the centers (both distinct from those used in Simulation 2) is 30 Å while the difference in orientation of the C-terminal domains is now 80°. The threshold of Δ and the value of f used to simulate the residual mobility around each center were the same as in Simulation 2, hence the same upper limit on the angle α.

SES ensembles

We applied the SES method to these simulated datasets and analyzed how the various restraints affect the recovery of the main conformations contained in the synthetic ensembles used to generate the data. The recovered ensembles were evaluated in terms of their sizes (number of major states) and of the proximity of recovered structures to the centers of the synthetic ensembles (in terms of spatial and angular displacement). As already mentioned in the Theory section, the ensemble size was chosen using the L-curve method (Berlin et al., 2013) (see Figure S6).

The results are presented in Tables 1, 2 and 3. In general, dRDCs allowed a reasonably accurate recovery of the major states that were used to generate the synthetic datasets (see, for instance, Fig. 4A). However, in all three simulations, some solutions contained one additional conformer, albeit with a relatively low weight. This additional conformer either belongs to the distribution of conformers around one of the main centers (as in Simulation 1 with error of 1 Hz and 2 Hz, and in Simulation 2 with error of 2 Hz, Fig. 4B) or is positioned in between the two major states (as in Simulation 2 with error of 1 Hz, Fig. 4C). In the first case its presence may reflect conformational heterogeneity; in the second case it is likely an artifact. Such artifacts may arise because ‘average conformers’ can be more compatible with the averaged experimental observables than any of the actually sampled conformations taken individually.

Table 1.

Results of Simulation 1. The table reports the sizes of the recovered ensembles, the specific weights ascribed to their constituent conformers, and for each of these conformers their spatial and angular displacement from the center of the original ensemble.

| Restraint | Error | Ensemble size | Total weight | Weight conformer 1 | Weight conformer 2 | Conformer 1 translation [Å] | Conformer 1 rotation [deg] | Conformer 2 translation [Å] | Conformer 2 rotation [deg] |
|---|---|---|---|---|---|---|---|---|---|
| PCS | 0.01 ppm | 1 | 0.792 | 0.792 | --- | 3.85 | 15.6 | --- | --- |
| PCS | 0.02 ppm | 1 | 0.799 | 0.799 | --- | 3.85 | 15.6 | --- | --- |
| PCS | 0.03 ppm | 1 | 0.771 | 0.771 | --- | 3.85 | 15.6 | --- | --- |
| dRDC | 1 Hz | 2 | 0.866 | 0.663 | 0.203 | 3.85 | 15.6 | 12.78 | 107.3 |
| dRDC | 2 Hz | 2 | 0.839 | 0.482 | 0.356 | 8.68 | 22.4 | 6.34 | 39.5 |
| dRDC | 3 Hz | 1 | 0.755 | 0.755 | --- | 3.85 | 15.6 | --- | --- |
| pRDC | 1 Hz | 1 | 0.598 | 0.598 | --- | 10.48 | 11.8 | --- | --- |
| pRDC | 2 Hz | 1 | 0.597 | 0.597 | --- | 10.48 | 11.8 | --- | --- |
| pRDC | 3 Hz | 1 | 0.608 | 0.608 | --- | 14.01 | 15.6 | --- | --- |
| pRDC+PCS | 1 Hz/0.01 ppm | 1 | 0.755 | 0.755 | --- | 3.85 | 15.6 | --- | --- |
| pRDC+PCS | 2 Hz/0.02 ppm | 1 | 0.761 | 0.761 | --- | 3.85 | 15.6 | --- | --- |
| pRDC+PCS | 3 Hz/0.03 ppm | 1 | 0.742 | 0.742 | --- | 3.85 | 15.6 | --- | --- |

Table 2.

Results of Simulation 2. The table reports the sizes of the recovered ensembles, the specific weights ascribed to their constituent conformers and, for each of these conformers, the closest of the two original sites together with its spatial and angular displacement from that site.

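| Restraint | Error | Ensemble size | Total weight | Weight conformer 1 | Weight conformer 2 | Weight conformer 3 | Conformer 1 closest center | Conformer 1 translation [Å] | Conformer 1 rotation [deg] | Conformer 2 closest center | Conformer 2 translation [Å] | Conformer 2 rotation [deg] | Conformer 3 closest center | Conformer 3 translation [Å] | Conformer 3 rotation [deg] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PCS | 0.01 ppm | 2 | 1.000 | 0.617 | 0.383 | --- | B | 3.65 | 26.2 | A | 0.00 | 0.0 | --- | --- | --- |
| PCS | 0.02 ppm | 2 | 1.000 | 0.594 | 0.406 | --- | B | 3.65 | 26.2 | A | 0.00 | 0.0 | --- | --- | --- |
| PCS | 0.03 ppm | 2 | 1.000 | 0.557 | 0.443 | --- | A | 5.44 | 16.8 | B | 3.67 | 42.6 | --- | --- | --- |
| dRDC | 1 Hz | 3 | 0.754 | 0.345 | 0.272 | 0.137 | A | 0.00 | 0.0 | B | 3.99 | 31.7 | B | 14.99 | 63.4 |
| dRDC | 2 Hz | 3 | 0.881 | 0.423 | 0.328 | 0.130 | A | 3.12 | 8.6 | B | 3.67 | 42.6 | B | 4.17 | 37.5 |
| dRDC | 3 Hz | 2 | 0.737 | 0.384 | 0.353 | --- | A | 6.49 | 14.4 | B | 5.16 | 19.7 | --- | --- | --- |
| pRDC | 1 Hz | 2 | 0.674 | 0.338 | 0.336 | --- | A | 8.45 | 14.1 | B | 14.52 | 21.6 | --- | --- | --- |
| pRDC | 2 Hz | 2 | 0.636 | 0.360 | 0.275 | --- | A | 12.22 | 32.8 | B | 14.86 | 13.2 | --- | --- | --- |
| pRDC | 3 Hz | 2 | 0.692 | 0.365 | 0.327 | --- | B | 9.66 | 22.0 | A | 15.56 | 14.6 | --- | --- | --- |
| pRDC+PCS | 1 Hz/0.01 ppm | 2 | 0.741 | 0.384 | 0.357 | --- | B | 7.72 | 34.1 | A | 0.00 | 0.0 | --- | --- | --- |
| pRDC+PCS | 2 Hz/0.02 ppm | 2 | 0.926 | 0.481 | 0.445 | --- | B | 7.85 | 24.3 | A | 0.00 | 0.0 | --- | --- | --- |
| pRDC+PCS | 3 Hz/0.03 ppm | 2 | 1.000 | 0.546 | 0.454 | --- | B | 7.85 | 24.3 | A | 7.63 | 12.0 | --- | --- | --- |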

Table 3.

Results of Simulation 3. The table reports the sizes of the recovered ensembles, the specific weights ascribed to their constituent conformers and, for each of these conformers, the closest of the two original sites together with its spatial and angular displacement from that site.

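| Restraint | Error | Ensemble size | Total weight | Weight conformer 1 | Weight conformer 2 | Weight conformer 3 | Conformer 1 closest center | Conformer 1 translation [Å] | Conformer 1 rotation [deg] | Conformer 2 closest center | Conformer 2 translation [Å] | Conformer 2 rotation [deg] | Conformer 3 closest center | Conformer 3 translation [Å] | Conformer 3 rotation [deg] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PCS | 0.01 ppm | 1 | 1.000 | 1.000 | --- | --- | A | 10.92 | 24.8 | --- | --- | --- | --- | --- | --- |
| PCS | 0.02 ppm | 1 | 1.000 | 1.000 | --- | --- | A | 10.92 | 24.8 | --- | --- | --- | --- | --- | --- |
| PCS | 0.03 ppm | 1 | 1.000 | 1.000 | --- | --- | A | 10.92 | 24.8 | --- | --- | --- | --- | --- | --- |
| dRDC | 1 Hz | 3 | 0.965 | 0.451 | 0.378 | 0.137 | A | 2.15 | 19.6 | B | 4.05 | 19.6 | B | 9.31 | 86.9 |
| dRDC | 2 Hz | 2 | 0.820 | 0.431 | 0.389 | --- | A | 7.09 | 21.6 | B | 4.25 | 24.3 | --- | --- | --- |
| dRDC | 3 Hz | 2 | 0.876 | 0.440 | 0.436 | --- | A | 3.29 | 18.1 | B | 4.05 | 19.6 | --- | --- | --- |
| pRDC | 1 Hz | 2 | 0.739 | 0.374 | 0.364 | --- | B | 9.60 | 20.5 | A | 12.36 | 11.0 | --- | --- | --- |
| pRDC | 2 Hz | 2 | 0.697 | 0.361 | 0.336 | --- | B | 10.62 | 11.1 | A | 9.38 | 14.2 | --- | --- | --- |
| pRDC | 3 Hz | 2 | 0.720 | 0.362 | 0.358 | --- | B | 4.25 | 24.3 | A | 5.20 | 13.9 | --- | --- | --- |
| pRDC+PCS | 1 Hz/0.01 ppm | 2 | 0.772 | 0.391 | 0.380 | --- | A | 3.29 | 18.1 | B | 6.70 | 8.9 | --- | --- | --- |
| pRDC+PCS | 2 Hz/0.02 ppm | 2 | 0.799 | 0.408 | 0.392 | --- | B | 17.82 | 19.9 | A | 8.03 | 20.0 | --- | --- | --- |
| pRDC+PCS | 3 Hz/0.03 ppm | 2 | 0.823 | 0.445 | 0.377 | --- | A | 3.05 | 11.6 | B | 6.70 | 8.9 | --- | --- | --- |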

Figure 4.

SES recovery using dRDC data. Color code for the C-terminal domain: green - simulated conformers in the centers of the regions, red, blue, and yellow - reconstructed conformers with highest, intermediate, and lowest weight, respectively. (A) An ensemble with correctly recovered major states (Simulation 3, 3 Hz error), (B) An ensemble with an additional state present next to one of the centers (Simulation 2, 2 Hz error), (C) An ensemble with an additional state (yellow) recovered (Simulation 2, 1 Hz error). The ensembles are shown from two different points of view in the left and right panels. All conformers are superimposed by the N-terminal domain.

In the case of pRDCs, the right number of major states was always recovered (Fig. S1), and in the corresponding conformers the domains were oriented with an accuracy comparable to that achieved with dRDC. It should be recalled that pRDCs contain no information whatsoever on the relative positions of the domains, which therefore results in inaccuracy of their positioning.

PCS data alone were sufficient in two out of three simulations to recover the correct solutions (Fig. S2) in terms of ensemble sizes and locations of the major states (with an accuracy similar to that of dRDC). However, in Simulation 3, where the two states are more similar to one another, the calculations provide only a single state (Fig. S2B) situated between the two actual centers (in terms of both translation and orientation). The recovery of such an incorrect state is most likely, as already mentioned for dRDC, the outcome of the averaging of the experimental observables. Using all the paramagnetic data together (i.e., pRDC and PCS) improved the robustness of the recovery: both translations and orientations were satisfactorily accurate in all cases (Fig. 5). The translation and rotation with respect to the conformers at the center of the distributions were within approximately 4 Å and 16° for Simulation 1, 8 Å and 34° for Simulation 2, and 3 Å and 18° or 7 Å and 9° for Simulation 3 (1 Hz and 0.01 ppm error case). The ensemble recovery is robust, as increased errors did not noticeably affect the accuracy of solutions.

Figure 5.

SES recovery with all the paramagnetic data. Color code for the C-terminal domain: green - simulated conformers in the centers of the regions, red, blue - reconstructed conformers with higher and lower weight, respectively. (A) Simulation 1, (B) Simulation 2, (C) Simulation 3, with 0.03 ppm and 3 Hz errors. The ensembles are shown from two different points of view in the left and right panels. All conformers are superimposed by the N-terminal domain.

In conclusion, diamagnetic RDC, as well as the combination of paramagnetic RDC and PCS, are both equally suitable restraints for the recovery of the major states present in conformational ensembles. Special attention should be paid to the fact that, occasionally, ‘average conformers’ may be recovered.

MaxOcc analysis

As for the SES analysis, we performed the MaxOcc analysis on the same datasets. From the MaxOcc values, it is possible to determine which conformers can be sampled with the largest weights. To speed up the computational analysis, we used random sampling to detect regions with potentially high MaxOcc conformers, and then expanded those regions to find the globally best solution. To do this, we first computed MaxOcc for 400 conformers, randomly chosen from the generated pool (Bertini et al., 2010b; Bertini et al., 2012b; Cerofolini et al., 2013). Then the conformers with the highest MaxOcc (at least 0.8 of the MaxOcc of the highest scoring conformer) were selected and the MaxOcc of their neighboring conformers (in the conformational space) were calculated. The procedure was repeated until no more neighbors with high MaxOcc were found. The neighboring conformers scored at each iteration were chosen using Eq. 9 with the threshold on Δ of 5 Å and f=40 Å. If the final distribution of the highest MaxOcc conformers was broad, the analysis was supplemented by the Maximum Occurrence of Regions (MaxOR) approach, which makes it possible to discriminate between high MaxOcc conformers that are actually sampled by the protein and high MaxOcc conformers that arise from data averaging (Andralojc et al., 2014).
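The search strategy described above (random seeding followed by iterative expansion of the best-scoring conformers) can be sketched as follows. This is only an illustration of the control flow: `maxocc` and `neighbors` are hypothetical callables standing in for the actual MaxOcc calculation and for the Eq. 9 neighbor criterion, and the one-dimensional toy "conformational space" is invented for the example.

```python
import math
import random

def find_high_maxocc(pool, maxocc, neighbors, n_seeds=400, fraction=0.8, seed=0):
    """Score random seeds, then repeatedly score the neighbors of every conformer
    whose MaxOcc is at least `fraction` of the current best value."""
    rng = random.Random(seed)
    seeds = rng.sample(pool, min(n_seeds, len(pool)))
    scores = {c: maxocc(c) for c in seeds}
    expanded = set()
    while True:
        cutoff = fraction * max(scores.values())
        frontier = [c for c, s in scores.items()
                    if s >= cutoff and c not in expanded]
        if not frontier:          # no high-scoring conformer left to expand
            break
        for conformer in frontier:
            expanded.add(conformer)
            for other in neighbors(conformer):
                if other not in scores:
                    scores[other] = maxocc(other)
    cutoff = fraction * max(scores.values())
    return {c: s for c, s in scores.items() if s >= cutoff}

# Toy example (hypothetical): conformers are integers 0..999 and the MaxOcc
# landscape is a single smooth peak of height 0.7 at conformer 500.
pool = list(range(1000))

def toy_maxocc(c):
    return 0.7 * math.exp(-((c - 500) / 50.0) ** 2)

def toy_neighbors(c):
    # Stand-in for "all conformers within the Eq. 9 threshold of c".
    return [n for n in range(c - 5, c + 6) if 0 <= n < 1000 and n != c]

region = find_high_maxocc(pool, toy_maxocc, toy_neighbors, n_seeds=40)
print(min(region), max(region), round(max(region.values()), 3))
```

On a landscape with several separated maxima the outcome depends on the random seeds, which is why a sufficiently large initial sample (400 conformers in the text) is needed before the expansion step.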

The results of the MaxOcc analysis are reported in Table 4 for all three simulations. In Simulation 1, for both the paramagnetic and diamagnetic data, the analysis revealed that all the conformers with the highest MaxOcc (from 0.8 to 1 of the highest MaxOcc, corresponding to 0.58–0.73 for the paramagnetic data and 0.57–0.71 for the dRDC) form a single, relatively compact, region in the conformational space (Fig. 6A,C). In order to quantify its agreement with the original distribution, the center of the region was calculated by averaging the translational and orientation parameters of the highest MaxOcc conformers. The conformation so obtained was then compared with the conformation at the center of the original distribution. As shown in Table 4 and Fig. 6B,D, the agreement was very good in terms of spatial and angular displacement for both the diamagnetic and the paramagnetic data, either for 1 Hz/0.01 ppm or for 3 Hz/0.03 ppm errors.
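The averaging step used to locate the center of a high-MaxOcc region can be sketched as below. The text does not specify how the orientation parameters were averaged, so the quaternion-based mean used here (sign alignment followed by renormalisation, which is reasonable only for tightly clustered orientations) is an assumption, as are the function names.

```python
import math

def mean_position(positions):
    """Arithmetic mean of centre-of-mass positions given as (x, y, z) tuples."""
    n = len(positions)
    return tuple(sum(p[k] for p in positions) / n for k in range(3))

def mean_quaternion(quaternions):
    """Approximate mean orientation of clustered unit quaternions (w, x, y, z)."""
    reference = quaternions[0]
    total = [0.0, 0.0, 0.0, 0.0]
    for q in quaternions:
        # q and -q encode the same rotation; flip q into the reference hemisphere.
        sign = 1.0 if sum(a * b for a, b in zip(reference, q)) >= 0.0 else -1.0
        for k in range(4):
            total[k] += sign * q[k]
    norm = math.sqrt(sum(t * t for t in total))
    return tuple(t / norm for t in total)

# Two toy conformers rotated by +10 and -10 degrees about z (hypothetical values).
half = math.radians(10.0) / 2.0
q_plus = (math.cos(half), 0.0, 0.0, math.sin(half))
q_minus = (math.cos(half), 0.0, 0.0, -math.sin(half))

center_position = mean_position([(0.0, 0.0, 0.0), (2.0, 4.0, 6.0)])
center_orientation = mean_quaternion([q_plus, q_minus])
```

The resulting center can then be compared with the center of the simulated distribution through its translation and rotation, as reported in Table 4.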

Table 4.

The MaxOcc/MaxOR analysis. For each simulation the spatial and angular displacement of the center of the ensemble of the best scoring conformers from the center of the actually sampled distribution is reported, together with the indication of the closest site, if applicable. The cases where MaxOR was used are indicated with an asterisk next to the error level.

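| Simulation | Restraint | Error | Recovered center 1: closest center | Recovered center 1: translation [Å] | Recovered center 1: rotation [deg] | Recovered center 2: closest center | Recovered center 2: translation [Å] | Recovered center 2: rotation [deg] |
|---|---|---|---|---|---|---|---|---|
| Simulation 1 | dRDC | 1 Hz | --- | 3.42 | 5.7 | --- | --- | --- |
| Simulation 1 | dRDC | 3 Hz | --- | 2.65 | 10.3 | --- | --- | --- |
| Simulation 1 | pRDC+PCS | 1 Hz/0.01 ppm | --- | 4.72 | 9.1 | --- | --- | --- |
| Simulation 1 | pRDC+PCS | 3 Hz/0.03 ppm | --- | 4.16 | 9.5 | --- | --- | --- |
| Simulation 2 | dRDC | 1 Hz | A | 3.44 | 4.4 | B | 1.71 | 25.5 |
| Simulation 2 | dRDC | 3 Hz | A | 3.27 | 9.7 | B | 3.26 | 29.8 |
| Simulation 2 | pRDC+PCS | 1 Hz/0.01 ppm* | A | 4.42 | 14.4 | B | 1.02 | 28.1 |
| Simulation 2 | pRDC+PCS | 3 Hz/0.03 ppm | A | 7.15 | 25.1 | B | 6.60 | 24.7 |
| Simulation 3 | dRDC | 1 Hz* | A | 0.99 | 14.3 | B | 8.17 | 16.6 |
| Simulation 3 | dRDC | 3 Hz | A | 5.32 | 28.9 | B | 5.35 | 16.0 |
| Simulation 3 | pRDC+PCS | 1 Hz/0.01 ppm* | A | 4.39 | 10.8 | B | 3.57 | 31.4 |
| Simulation 3 | pRDC+PCS | 3 Hz/0.03 ppm | A | 5.45 | 9.9 | B | 5.80 | 40.3 |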

Figure 6.

MaxOcc results for Simulation 1. Each conformation is represented by a triad of Cartesian axes, centered at the center of mass of the C-terminal domain. In panels A and C the conformers are color coded according to the MaxOcc value (0.0-blue, 0.8-red). (A) The conformers with the highest MaxOcc recovered with the paramagnetic data (with error of 1 Hz for pRDC and 0.01 ppm for PCS). (B) The center of the distribution shown in panel A (red) versus the center of the simulated region (black). (C) The conformers with the highest MaxOcc recovered with dRDC (with error of 1 Hz). (D) The center of the distribution shown in panel C (red) versus the center of the simulated region (black). The results are shown from two different points of view in the left and right panels. All conformers are superimposed by the N-terminal domain, shown as a ribbon.

In Simulation 2, i.e., the case of two well-separated conformational regions, when dRDC are used, the highest MaxOcc conformers are positioned in two distinct, clearly separated regions (Fig. 7A), the centers of which lie very close to the centers of the actually sampled distribution (Table 4, Fig. 7B). When paramagnetic data (PCS+pRDC) are used, the highest MaxOcc (0.41–0.51) conformers are positioned in one elongated, banana-shaped region in the conformational space (Fig. 8A), which includes the two actually sampled centers but also many conformers situated between them (their high score is an outcome of conformational averaging, as described in the SES results paragraph). From these results, one cannot conclude whether the studied conformational ensemble mainly reflects a two-site exchange or the sampling of all the conformations within the determined region. In order to distinguish between these two cases, MaxOR calculations were performed. The highest MaxOcc conformers were clustered in 5 regions, shown in Fig. 8B, each of which includes all conformations with distance Δ ≤ 5 Å from its central conformation (calculated using Eq. 9, with f = 147 Å). The MaxOR values for these regions are reported in Table S1 (diagonal entries). All regions have similar MaxOR values (up to 0.60), not much higher than the largest MaxOcc values for the individual conformations. If, however, MaxOR values are calculated for pairs of regions (off-diagonal entries of Table S1), strong differences arise. All pairs yielding the highest MaxOR (0.90–1.00) are composed of regions at opposite sides of the distribution of the highest MaxOcc conformers, whereas all pairs composed of regions located on the same side of the distribution or, more importantly, containing a region in the middle have significantly lower MaxOR (up to 0.63 and 0.78, respectively). This strongly suggests a two-site exchange model. The central conformations of the pair of regions with the highest MaxOR are in good agreement with the conformations at the centers of the distributions in the synthetic ensemble, with an accuracy comparable to that obtained by SES (Table 4, Fig. 8D).

Figure 7.

MaxOcc results for Simulation 2 with dRDC with error of 1 Hz. Each conformation is represented by a triad of Cartesian axes, centered at the center of mass of the C-terminal domain. (A) The conformers with the highest MaxOcc, color code: according to the MaxOcc value (0.0-blue, 0.6-red). (B) The center of the distribution shown in panel A (red) versus the center of the simulated region (black). All conformers are superimposed by the N-terminal domain, shown as a ribbon.

Figure 8.

MaxOcc/MaxOR results for Simulation 2 with paramagnetic data (1 Hz error for pRDC and 0.01 ppm error for PCS). Each conformation is represented by a triad of Cartesian axes, centered at the center of mass of the C-terminal domain. (A) The conformers with the highest MaxOcc, color code – according to the MaxOcc value (0.0-blue, 0.6-red). (B) The five clusters formed by the conformers in panel A. (C) The centers of the clusters. (D) The centers of clusters with the largest MaxOR versus the centers of the simulated regions (black). All conformers are superimposed by the N-terminal domain, shown as a ribbon.

In Simulation 3, for both the paramagnetic and diamagnetic data, the conformers recovered by MaxOcc form elongated regions comprising both centers as well as conformers situated between them (Figs. S3A and S4A). MaxOR was thus applied in both cases. As in the previous simulation, no single region has MaxOR significantly higher than the others, but the analysis of pairs of regions again indicated the occurrence of a two-site exchange (Tables S2 and S3). The two central conformations of the synthetic ensemble were identified with good accuracy (Table 4, Figs. S3D and S4D) using both kinds of experimental restraints. Again, the results are robust, as increased errors did not substantially affect the accuracy of the solutions.

The performed MaxOcc/MaxOR analysis, as it appears from Table 4 as a whole, confirms the conclusion from the SES results that paramagnetic and diamagnetic restraints are equally useful for the recovery of conformational ensembles.

Conclusions

In many experimental studies RDCs have been shown to be valuable restraints for analyzing molecular conformational freedom (Montalvao et al., 2014; Ravera et al., 2014; Camilloni and Vendruscolo, 2015; Torchia, 2015). Here we compared paramagnetic and diamagnetic RDCs and found substantial differences in their information content in the case of multidomain proteins. We found that the information content of dRDC is larger than that of pRDC in terms of number of singular values, and this reflects the shape dependence of dRDC. However, since the internal alignment due to paramagnetism also gives rise to PCSs, the total information content recovered in a paramagnetic experiment is at least on par with dRDCs.

We have performed several simulations to evaluate the capability of recovering the conformational variability of two-domain proteins by the use of two different approaches, SES and MaxOcc/MaxOR. The main states of the protein were recovered reasonably well for both paramagnetic and diamagnetic datasets, with both approaches (see Tables 1–4 and also Table S4). Even for rather large experimental errors, we have found that both datasets still retain the ability to recover the main conformational states, which makes them appealing for the analysis of averaged experimental data, possibly also in the case of large systems, where RDCs are affected by large errors. Of course, since the problem is underdetermined, the reconstruction of the main states may fail for other, rather unpredictable, conformational distributions.

This analysis suggests that pRDC+PCS provide a very promising alternative to dRDC data. It is important to note that this analysis does not include modeling error, which is harder to quantify. Therefore, our analysis does not capture the principal advantage of pRDC+PCS over dRDC, namely that no barrier model needs to be assumed in order to predict the alignment. In addition, one has to consider that the interactions of the protein with the alignment medium might actually perturb the system, and that these interactions can occur on a timescale that is slower than the conformational averaging itself, so that the assumption that the measured dRDCs can be represented as a population-weighted average of the RDCs for the individual (rigid) conformers may fall short in representing the real physical picture.

Finally, the availability of a number of rigid lanthanide-binding tags nowadays may make the acquisition of three independent metal ion datasets more practical and safer than the acquisition and prediction of three independent alignment media. One current limitation of using metal ions is the low signal-to-noise ratio in pRDC and PCS data, which could potentially be improved with better technology and methodology.

Supplementary Material

Supplemental Info

Acknowledgments

This work has been supported by Ente Cassa di Risparmio di Firenze, MIUR PRIN 2012SK7ASN, NIH grant GM065334, European Commission projects BioMedBridges No. 284209, pNMR No. 317127, and Instruct, part of the European Strategy Forum on Research Infrastructures (ESFRI) and supported by national member subscriptions. Specifically, we thank the EU ESFRI Instruct Core Centre CERM, Italy.

Footnotes

Compliance with Ethical Standards

Conflict of Interest: The authors declare that they have no conflict of interest.

This article does not contain any studies with human participants or animals performed by any of the authors.

Reference List

  1. Al-Hashimi HM, Valafar H, Terrell M, Zartler ER, Eidsness MK, Prestegard JH. Variation of molecular alignment as a means of resolving orientational ambiguities in protein structures from dipolar couplings. J Magn Reson. 2000;143:402–406. doi: 10.1006/jmre.2000.2049. [DOI] [PubMed] [Google Scholar]
  2. Allegrozzi M, Bertini I, Janik MBL, Lee Y-M, Liu G, Luchinat C. Lanthanide induced pseudocontact shifts for solution structure refinements of macromolecules in shells up to 40 A from the metal ion. J Am Chem Soc. 2000;122:4154–4161. [Google Scholar]
  3. Andralojc W, Luchinat C, Parigi G, Ravera E. Exploring regions of conformational space occupied by two-domain proteins. J Phys Chem B. 2014;118:10576–10587. doi: 10.1021/jp504820w. [DOI] [PubMed] [Google Scholar]
  4. Balayssac S, Bertini I, Bhaumik A, Lelli M, Luchinat C. Paramagnetic shifts in solid-state NMR of proteins to elicit strucutral information. Proc Natl Acad Sci USA. 2008;105:17284–17289. doi: 10.1073/pnas.0708460105. [DOI] [PMC free article] [PubMed] [Google Scholar]
  5. Banci L, Bertini I, Bren KL, Cremonini MA, Gray HB, Luchinat C, Turano P. The use of pseudocontact shifts to refine solution structures of paramagnetic metalloproteins: Met80Ala cyano-cytochrome c as an example. J Biol Inorg Chem. 1996;1:117–126. [Google Scholar]
  6. Banci L, Bertini I, Gori Savellini G, Romagnoli A, Turano P, Cremonini MA, Luchinat C, Gray HB. Pseudocontact shifts as constraints for energy minimization and molecular dynamic calculations on solution structures of paramagnetic metalloproteins. Proteins Struct Funct Genet. 1997;29:68–76. doi: 10.1002/(sici)1097-0134(199709)29:1<68::aid-prot5>3.0.co;2-b. [DOI] [PubMed] [Google Scholar]
  7. Banci L, Bertini I, Huber JG, Luchinat C, Rosato A. Partial orientation of oxidized and reduced cytochrome b5 at high magnetic fields: Magnetic susceptibility anisotropy contributions and consequences for protein solution structure determination. J Am Chem Soc. 1998;120:12903–12909. [Google Scholar]
  8. Barbato G, Ikura M, Kay LE, Pastor RW, Bax A. Backbone dynamics of calmodulin studied by 15N relaxation using inverse detected two-dimensional NMR spectroscopy; the central helix is flexible. Biochemistry. 1992;31:5269–5278. doi: 10.1021/bi00138a005. [DOI] [PubMed] [Google Scholar]
  9. Barthelmes K, Reynolds AM, Peisach E, Jonker HRA, DeNunzio NJ, Allen KN, Imperiali B, Schwalbe H. Engineering Encodable Lanthanide-Binding Tags into Loop Regions of Proteins. J Am Chem Soc. 2011;133:808–819. doi: 10.1021/ja104983t. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Bashir Q, Volkov AN, Ullmann GM, Ubbink M. Visualization of the Encounter Ensemble of the Transient Electron Transfer Complex of Cytochrome c and Cytochrome c Peroxidase. J Am Chem Soc. 2010;132:241–247. doi: 10.1021/ja9064574. [DOI] [PubMed] [Google Scholar]
  11. Berlin K, Castañeda CA, Schneidman-Dohovny D, Sali A, Nava-Tudela A, Fushman D. Recovering a Representative Conformational Ensemble from Underdetermined Macromolecular Structural Data. J Am Chem Soc. 2013;135:16595–16609. doi: 10.1021/ja4083717. [DOI] [PMC free article] [PubMed] [Google Scholar]
  12. Berlin K, O’Leary DP, Fushman D. Improvement and analysis of computational methods for prediction of residual dipolar couplings. J Magn Reson. 2009;201:25–33. doi: 10.1016/j.jmr.2009.07.028. [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Bernadò P, Mylonas E, Petoukhov MV, Blackledge M, Svergun DI. Structural characterization of flexible proteins using small-angle X-ray scattering. J Am Chem Soc. 2007;129:5656–5664. doi: 10.1021/ja069124n. [DOI] [PubMed] [Google Scholar]
  14. Bertini I, Bhaumik A, De Paepe G, Griffin RG, Lelli M, Lewandowski JR, Luchinat C. High-Resolution Solid-State NMR Structure of a 17.6 kDa Protein. J Am Chem Soc. 2010a;132:1032–1040. doi: 10.1021/ja906426p. [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. Bertini I, Calderone V, Cerofolini L, Fragai M, Geraldes CFGC, Hermann P, Luchinat C, Parigi G, Teixeira JMC. The catalytic domain of MMP-1 studied through tagged lanthanides. Dedicated to Prof. A.V. Xavier. FEBS Lett. 2012a;586:557–567. doi: 10.1016/j.febslet.2011.09.020. [DOI] [PubMed] [Google Scholar]
  16. Bertini I, Del Bianco C, Gelis I, Katsaros N, Luchinat C, Parigi G, Peana M, Provenzani A, Zoroddu MA. Experimentally exploring the conformational space sampled by domain reorientation in calmodulin. Proc Natl Acad Sci USA. 2004a;101:6841–6846. doi: 10.1073/pnas.0308641101. [DOI] [PMC free article] [PubMed] [Google Scholar]
  17. Bertini I, Donaire A, Jiménez B, Luchinat C, Parigi G, Piccioli M, Poggi L. Paramagnetism-based Versus Classical Constraints: An Analysis of the Solution Structure of Ca Ln Calbindin D9k. J Biomol NMR. 2001a;21:85–98. doi: 10.1023/a:1012422402545. [DOI] [PubMed] [Google Scholar]
  18. Bertini I, Ferella L, Luchinat C, Parigi G, Petoukhov MV, Ravera E, Rosato A, Svergun DI. MaxOcc: a web portal for Maximum Occurrence Analysis. J Biomol NMR. 2012b;53:271–280. doi: 10.1007/s10858-012-9638-1. [DOI] [PubMed] [Google Scholar]
  19. Bertini I, Fragai M, Lee Y-M, Luchinat C, Terni B. Paramagnetic metal ions in ligand screening: the CoII matrix metalloproteinase 12. Angew Chem Int Ed. 2004b;43:2254–2256. doi: 10.1002/anie.200353453. [DOI] [PubMed] [Google Scholar]
  20. Bertini I, Gelis I, Katsaros N, Luchinat C, Provenzani A. Tuning the Affinity for Lanthanides of Calcium Binding Proteins. Biochemistry. 2003;42:8011–8021. doi: 10.1021/bi034494z. [DOI] [PubMed] [Google Scholar]
  21. Bertini I, Giachetti A, Luchinat C, Parigi G, Petoukhov MV, Pierattelli R, Ravera E, Svergun DI. Conformational space of flexible biological macromolecules from average data. J Am Chem Soc. 2010b;132:13553–13558. doi: 10.1021/ja1063923. [DOI] [PubMed] [Google Scholar]
  22. Bertini I, Gupta YK, Luchinat C, Parigi G, Peana M, Sgheri L, Yuan J. Paramagnetism-Based NMR Restraints Provide Maximum Allowed Probabilities for the Different Conformations of Partially Independent Protein Domains. J Am Chem Soc. 2007;129:12786–12794. doi: 10.1021/ja0726613. [DOI] [PubMed] [Google Scholar]
  23. Bertini I, Janik MBL, Lee Y-M, Luchinat C, Rosato A. Magnetic Susceptibility Tensor Anisotropies for a Lanthanide Ion Series in a Fixed Protein Matrix. J Am Chem Soc. 2001b;123:4181–4188. doi: 10.1021/ja0028626. [DOI] [PubMed] [Google Scholar]
  24. Bertini I, Janik MBL, Liu G, Luchinat C, Rosato A. Solution structure calculations through self-orientation in a magnetic field of cerium (III) substituted calcium-binding protein. J Magn Reson. 2001c;148:23–30. doi: 10.1006/jmre.2000.2218. [DOI] [PubMed] [Google Scholar]
  25. Bertini I, Kursula P, Luchinat C, Parigi G, Vahokoski J, Wilmanns M, Yuan J. Accurate solution structures of proteins from X-ray data and minimal set of NMR data: calmodulin peptide complexes as examples. J Am Chem Soc. 2009;131:5134–5144. doi: 10.1021/ja8080764. [DOI] [PubMed] [Google Scholar]
  26. Bertini I, Longinetti M, Luchinat C, Parigi G, Sgheri L. Efficiency of paramagnetism-based constraints to determine the spatial arrangement of α-helical secondary structure elements. J Biomol NMR. 2002a;22:123–136. doi: 10.1023/a:1014214015858. [DOI] [PubMed] [Google Scholar]
  27. Bertini I, Luchinat C, Nagulapalli M, Parigi G, Ravera E. Paramagnetic relaxation enhancements for the characterization of the conformational heterogeneity in two-domain proteins. Phys Chem Chem Phys. 2012c;14:9149–9156. doi: 10.1039/c2cp40139h. [DOI] [PubMed] [Google Scholar]
  28. Bertini I, Luchinat C, Parigi G. Magnetic susceptibility in paramagnetic NMR. Progr NMR Spectrosc. 2002b;40:249–273. [Google Scholar]
  29. Bertini I, Luchinat C, Parigi G, Pierattelli R. NMR of paramagnetic metalloproteins. ChemBioChem. 2005;6:1536–1549. doi: 10.1002/cbic.200500124. [DOI] [PubMed] [Google Scholar]
  30. Bertini I, Luchinat C, Parigi G, Pierattelli R. Perspectives in NMR of paramagnetic proteins. Dalton Trans. 2008;2008:3782–3790. doi: 10.1039/b719526e. [DOI] [PubMed] [Google Scholar]
  31. Blackledge M. Recent progress in the study of biomolecular structure and dynamics in solution from residual dipolar couplings. Progress in NMR Spectroscopy. 2005;46:23–61. [Google Scholar]
  32. Boehr DD, McElheny D, Dyson HJ, Wright PE. The dynamic energy landscape of dihydrofolate reductase catalysis. Science. 2006;313:1638–1642. doi: 10.1126/science.1130258. [DOI] [PubMed] [Google Scholar]
  33. Boehr DD, Nussinov R, Wright PE. The role of dynamic conformational ensembles in biomolecular recognition (vol 5, pg 789, 2009) Nature Chemical Biology. 2009;5:954. doi: 10.1038/nchembio.232. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Bonvin AM, Brunger AT. Do NOE distances contain enough information to assess the relative populations of multi-conformer structures? J Biomol NMR. 1996;7:72–76. doi: 10.1007/BF00190458. [DOI] [PubMed] [Google Scholar]
  35. Bothe JR, Nikolova EN, Eichhorn CD, Chugh J, Hansen AL, Al-Hashimi HM. Characterizing RNA dynamics at atomic resolution using solution-state NMR spectroscopy. Nat Methods. 2011;8:919–931. doi: 10.1038/nmeth.1735. [DOI] [PMC free article] [PubMed] [Google Scholar]
  36. Burgi R, Pitera J, Van Gunsteren WF. Assessing the effect of conformational averaging on the measured values of observables. J Biomol NMR. 2001;19:305–320. doi: 10.1023/a:1011295422203. [DOI] [PubMed] [Google Scholar]
  37. Camilloni C, Vendruscolo M. A Tensor-Free Method for the Structural and Dynamical Refinement of Proteins using Residual Dipolar Couplings. J Phys Chem B. 2015;119:653–661. doi: 10.1021/jp5021824. [DOI] [PubMed] [Google Scholar]
  38. Cerofolini L, Fields GB, Fragai M, Geraldes CFGC, Luchinat C, Parigi G, Ravera E, Svergun DI, Teixeira JMC. Examination of Matrix Metalloproteinase-1 (MMP-1) in solution: a preference for the pre-collagenolysis state. J Biol Chem. 2013;288:30659–30671. doi: 10.1074/jbc.M113.477240. [DOI] [PMC free article] [PubMed] [Google Scholar]
  39. Chen Y, Campbell SL, Dokholyan NV. Deciphering protein dynamics from NMR data using explicit structure sampling and selection. Biophys J. 2007;93:2300–2306. doi: 10.1529/biophysj.107.104174. [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Chou JJ, Li S, Klee CB, Bax A. Solution structure of Ca2+ calmodulin reveals flexible hand-like properties of its domains. Nature Struct Biol. 2001;8:990–997. doi: 10.1038/nsb1101-990. [DOI] [PubMed] [Google Scholar]
  41. Choy W-Y, Forman-Kay JD. Calculation of Ensembles of Structures Representing the Unfolded State of an SH3 Domain. J Mol Biol. 2001;308:1011–1032. doi: 10.1006/jmbi.2001.4750. [DOI] [PubMed] [Google Scholar]
  42. Chuang GY, Mehra-Chaudhary R, Ngan CH, Zerbe BS, Kozakov D, Vajda S, Beamer LJ. Domain motion and interdomain hot spots in a multidomain enzyme. Protein Sci. 2010;19:1662–1672. doi: 10.1002/pro.446. [DOI] [PMC free article] [PubMed] [Google Scholar]
  43. Clore GM, Schwieters CD. How Much Backbone Motion in Ubiquitin Is Required To Account for Dipolar Coupling Data Measured in Multiple Alignment Media as Assessed by Independent Cross-Validation? J Am Chem Soc. 2004;126:2923–2938. doi: 10.1021/ja0386804. [DOI] [PubMed] [Google Scholar]
  44. Das Gupta S, Hu X, Keizers PHJ, Liu W-M, Luchinat C, Nagulapalli M, Overhand M, Parigi G, Sgheri L, Ubbink M. Narrowing the conformational space sampled by two-domain proteins with paramagnetic probes in both domains. J Biomol NMR. 2011;51:253–263. doi: 10.1007/s10858-011-9532-2. [DOI] [PubMed] [Google Scholar]
  45. Diaz-Moreno I, Diaz-Quintana A, De la Rosa MA, Ubbink M. Structure of the complex between plastocyanin and cytochrome f from the cyanobacterium Nostoc sp. PCC 7119 as determined by paramagnetic NMR. J Biol Chem. 2005;280:18908–18915. doi: 10.1074/jbc.M413298200. [DOI] [PubMed] [Google Scholar]
  46. Fisher CK, Huang A, Stultz CM. Modeling Intrinsically Disordered Proteins with Bayesian Statistics. J Am Chem Soc. 2010;132:14919–14927. doi: 10.1021/ja105832g. [DOI] [PMC free article] [PubMed] [Google Scholar]
  47. Fisher CK, Stultz CM. Constructing ensembles for intrinsically disordered proteins. Curr Opin Struct Biol. 2011;21:426–431. doi: 10.1016/j.sbi.2011.04.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  48. Fragai M, Luchinat C, Parigi G. “Four-dimensional” protein structures: examples from metalloproteins. Acc Chem Res. 2006;39:909–917. doi: 10.1021/ar050103s. [DOI] [PubMed] [Google Scholar]
  49. Gaponenko V, Sarma SP, Altieri AS, Horita DA, Li J, Byrd RA. Improving the accuracy of NMR structures of large proteins using pseudocontact shifts as long-range restraints. J Biomol NMR. 2004;28:205–212. doi: 10.1023/B:JNMR.0000013706.09264.36. [DOI] [PubMed] [Google Scholar]
  50. Gardner RJ, Longinetti M, Sgheri L. Reconstruction of orientations of a moving protein domain from paramagnetic data. Inv Probl. 2005;21:879–898. [Google Scholar]
  51. Gempf KL, Butler SJ, Funk AM, Parker D. Direct and selective tagging of cysteine residues in peptides and proteins with 4-nitropyridyl lanthanide complexes. Chem Commun (Camb). 2013;49:9104–9106. doi: 10.1039/c3cc45875j. [DOI] [PubMed] [Google Scholar]
  52. Gochin M, Roder H. Protein Structure Refinement based on Paramagnetic NMR shifts. Applications to Wild-Type and mutants forms of Cytochrome c. Protein Sci. 1995a;4:296–305. doi: 10.1002/pro.5560040216. [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Gochin M, Roder H. Use of pseudocontact shifts as a structural constraint for macromolecules in solution. Bull Magn Reson. 1995b;17:1–4. [Google Scholar]
  54. Guerry P, Salmon L, Mollica L, Ortega Roldan JL, Markwick P, van Nuland NA, McCammon JA, Blackledge M. Mapping the population of protein conformational energy sub-states from NMR dipolar couplings. Angew Chem Int Ed Engl. 2013;52:3181–3185. doi: 10.1002/anie.201209669. [DOI] [PubMed] [Google Scholar]
  55. Hansen MR, Mueller L, Pardi A. Tunable alignment of macromolecules by filamentous phage yields dipolar coupling interactions. Nat Struct Biol. 1998;5:1065–1074. doi: 10.1038/4176. [DOI] [PubMed] [Google Scholar]
  56. Hass MAS, Keizers PHJ, Blok A, Hiruma Y, Ubbink M. Validation of a Lanthanide Tag for the Analysis of Protein Dynamics by Paramagnetic NMR Spectroscopy. J Am Chem Soc. 2010;132:9952–9953. doi: 10.1021/ja909508r. [DOI] [PubMed] [Google Scholar]
  57. Häussinger D, Huang J, Grzesiek S. DOTA-M8: An extremely Rigid, High-Affinity Lanthanide Chelating Tag for PCS NMR Spectroscopy. J Am Chem Soc. 2009;131:14761–14767. doi: 10.1021/ja903233w. [DOI] [PubMed] [Google Scholar]
  58. Huang J, Grzesiek S. Ensemble Calculations of Unstructured Proteins Constrained by RDC and PRE Data: A Case study of Urea-Denatured Ubiquitin. J Am Chem Soc. 2010;132:694–705. doi: 10.1021/ja907974m. [DOI] [PubMed] [Google Scholar]
  59. Hulsker R, Baranova MV, Bullerjahn GS, Ubbink M. Dynamics in the transient complex of plastocyanin-cytochrome f from Prochlorothrix hollandica. J Am Chem Soc. 2008;130:1985–1991. doi: 10.1021/ja077453p. [DOI] [PubMed] [Google Scholar]
  60. Iwahara J, Schwieters CD, Clore GM. Ensemble approach for NMR structure refinement against H-1 paramagnetic relaxation enhancement data arising from a flexible paramagnetic group attached to a macromolecule. J Am Chem Soc. 2004;126:5879–5896. doi: 10.1021/ja031580d. [DOI] [PubMed] [Google Scholar]
  61. Jensen MR, Hansen DF, Ayna U, Dagil R, Hass MA, Christensen HE, Led JJ. On the use of pseudocontact shifts in the structure determination of metalloproteins. Magn Reson Chem. 2006;44:294–301. doi: 10.1002/mrc.1771. [DOI] [PubMed] [Google Scholar]
  62. John M, Otting G. Strategies for Measurements of Pseudocontact Shifts in Protein NMR Spectroscopy. ChemPhysChem. 2007;8:2309–2313. doi: 10.1002/cphc.200700510. [DOI] [PubMed] [Google Scholar]
  63. Jones E, Oliphant T, Peterson P, et al. SciPy: Open Source Scientific Tools for Python. 2001 [Google Scholar]
  64. Keizers PHJ, Saragliadis A, Hiruma Y, Overhand M, Ubbink M. Design, Synthesis, and Evaluation of a Lanthanide Chelating Protein Probe: CLaNP-5 Yields Predictable Paramagnetic Effects Independent of Environment. J Am Chem Soc. 2008;130:14802–14812. doi: 10.1021/ja8054832. [DOI] [PubMed] [Google Scholar]
  65. Kobashigawa Y, Saio T, Ushio M, Sekiguchi M, Yokochi M, Ogura K, Inagaki F. Convenient method for resolving degeneracies due to symmetry of the magnetic susceptibility tensor and its application to pseudo contact shift-based protein-protein complex structure determination. J Biomol NMR. 2012;53:53–63. doi: 10.1007/s10858-012-9623-8. [DOI] [PMC free article] [PubMed] [Google Scholar]
  66. Korzhnev DM, Kay LE. Probing invisible, low-populated States of protein molecules by relaxation dispersion NMR spectroscopy: an application to protein folding. Acc Chem Res. 2008;41:442–451. doi: 10.1021/ar700189y. [DOI] [PubMed] [Google Scholar]
  67. Kuffner JJ. Effective sampling and distance metrics for 3D rigid body path planning. Proc. IEEE Intl. Conf. on Robotics and Automation (ICRA) 2004;4:3993. Apr 26, 2004. [Google Scholar]
  68. Kukic P, Camilloni C, Cavalli A, Vendruscolo M. Determination of the individual roles of the linker residues in the interdomain motions of calmodulin using NMR chemical shifts. J Mol Biol. 2014;426:1826–1838. doi: 10.1016/j.jmb.2014.02.002. [DOI] [PubMed] [Google Scholar]
  69. Kurland RJ, McGarvey BR. Isotropic NMR shifts in transition metal complexes: calculation of the Fermi contact and pseudocontact terms. J Magn Reson. 1970;2:286–301. [Google Scholar]
  70. Lakomek NA, Walter KF, Fares C, Lange OF, de Groot BL, Grubmuller H, Bruschweiler R, Munk A, Becker S, Meiler J, Griesinger C. Self-consistent residual dipolar coupling based model-free analysis for the robust determination of nanosecond to microsecond protein dynamics. J Biomol NMR. 2008;41:139–155. doi: 10.1007/s10858-008-9244-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  71. Lange OF, Lakomek N-A, Farès C, Schröder GF, Walter KFA, Becker S, Meiler J, Grubmüller H, Griesinger C, de Groot BL. Recognition Dynamics Up to Microseconds Revealed from an RDC-Derived Ubiquitin Ensemble in Solution. Science. 2008;320:1471–1475. doi: 10.1126/science.1157092. [DOI] [PubMed] [Google Scholar]
  72. Latham MP, Hanson P, Brown DJ, Pardi A. Comparison of alignment tensors generated for native tRNA(Val) using magnetic fields and liquid crystalline media. J Biomol NMR. 2008;40:83–94. doi: 10.1007/s10858-007-9212-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  73. Lindorff-Larsen K, Best RB, DePristo MA, Dobson CM, Vendruscolo M. Simultaneous determination of protein structure and dynamics. Nature. 2005;433:128–132. doi: 10.1038/nature03199. [DOI] [PubMed] [Google Scholar]
  74. Liu WM, Keizers PH, Hass MA, Blok A, Timmer M, Sarris AJ, Overhand M, Ubbink M. A pH-sensitive, colorful, lanthanide-chelating paramagnetic NMR probe. J Am Chem Soc. 2012;134:17306–17313. doi: 10.1021/ja307824e. [DOI] [PubMed] [Google Scholar]
  75. Loh CT, Ozawa K, Tuck KL, Barlow N, Huber T, Otting G, Graham B. Lanthanide tags for site-specific ligation to an unnatural amino acid and generation of pseudocontact shifts in proteins. Bioconjug Chem. 2013;24:260–268. doi: 10.1021/bc300631z. [DOI] [PubMed] [Google Scholar]
  76. Lohman JAB, Maclean C. Alignment effects on high resolution NMR spectra induced by the magnetic field. Chem Phys. 1978;35:269–274. [Google Scholar]
  77. Longinetti M, Luchinat C, Parigi G, Sgheri L. Efficient determination of the most favored orientations of protein domains from paramagnetic NMR data. Inv Probl. 2006;22:1485–1502. [Google Scholar]
  78. Losonczi JA, Andrec M, Fischer MW, Prestegard JH. Order matrix analysis of residual dipolar couplings using singular value decomposition. J Magn Reson. 1999;138:334–342. doi: 10.1006/jmre.1999.1754. [DOI] [PubMed] [Google Scholar]
  79. Losonczi JA, Prestegard JH. Improved dilute bicelle solutions for high-resolution NMR of biological macromolecules. J Biomol NMR. 1998;12:447–451. doi: 10.1023/a:1008302110884. [DOI] [PubMed] [Google Scholar]
  80. Luchinat C, Nagulapalli M, Parigi G, Sgheri L. Maximum occurrence analysis of protein conformations for different distributions of paramagnetic metal ions within flexible two-domain proteins. J Magn Reson. 2012a;215:85–93. doi: 10.1016/j.jmr.2011.12.016. [DOI] [PubMed] [Google Scholar]
  81. Luchinat C, Parigi G, Ravera E, Rinaldelli M. Solid state NMR crystallography through paramagnetic restraints. J Am Chem Soc. 2012b;134:5006–5009. doi: 10.1021/ja210079n. [DOI] [PubMed] [Google Scholar]
  82. Maltsev AS, Grishaev A, Roche J, Zasloff M, Bax A. Improved cross validation of a static ubiquitin structure derived from high precision residual dipolar couplings measured in a drug-based liquid crystalline phase. J Am Chem Soc. 2014;136:3752–3755. doi: 10.1021/ja4132642. [DOI] [PMC free article] [PubMed] [Google Scholar]
  83. Man B, Su XC, Liang H, Simonsen S, Huber T, Messerle BA, Otting G. 3-Mercapto-2,6-Pyridinedicarboxylic Acid: A Small Lanthanide-Binding Tag for Protein Studies by NMR Spectroscopy. Chem Eur J. 2010;16:3827–3832. doi: 10.1002/chem.200902904. [DOI] [PubMed] [Google Scholar]
  84. Montalvao R, Camilloni C, De SA, Vendruscolo M. New opportunities for tensor-free calculations of residual dipolar couplings for the study of protein dynamics. J Biomol NMR. 2014;58:233–238. doi: 10.1007/s10858-013-9801-3. [DOI] [PubMed] [Google Scholar]
  85. Musiani F, Rossetti G, Capece L, Gerger TM, Micheletti C, Varani G, Carloni P. Molecular dynamics simulations identify time scale of conformational changes responsible for conformational selection in molecular recognition of HIV-1 transactivation responsive RNA. J Am Chem Soc. 2014;136:15631–15637. doi: 10.1021/ja507812v. [DOI] [PMC free article] [PubMed] [Google Scholar]
  86. Nesterov Y. Efficiency of coordinate descent methods on huge-scale optimization problems. SIAM Journal on Optimization. 2012;22:341–362. [Google Scholar]
  87. Nodet L, Salmon L, Ozenne V, Meier S, Jensen MR, Blackledge M. Quantitative Description of Backbone Conformational Sampling of Unfolded Proteins at Amino Acid Resolution from NMR Residual Dipolar Couplings. J Am Chem Soc. 2009;131:17908–17918. doi: 10.1021/ja9069024. [DOI] [PubMed] [Google Scholar]
  88. O’Leary DP. Scientific computing with case studies. SIAM; 2009. [Google Scholar]
  89. Pickford AR, Campbell ID. NMR studies of modular protein structures and their interactions. Chem Rev. 2004;104:3557–3566. doi: 10.1021/cr0304018. [DOI] [PubMed] [Google Scholar]
  90. Pintacuda G, John M, Su XC, Otting G. NMR Structure Determination of Protein-Ligand Complexes by Lanthanide Labeling. Acc Chem Res. 2007;40:206–212. doi: 10.1021/ar050087z. [DOI] [PubMed] [Google Scholar]
  91. Prestegard JH, Al-Hashimi HM, Tolman JR. NMR structures of biomolecules using field oriented media and residual dipolar couplings. Q Rev Biophys. 2000;33:371–424. doi: 10.1017/s0033583500003656. [DOI] [PubMed] [Google Scholar]
  92. Ramirez BE, Bax A. Modulation of the Alignment Tensor of Macromolecules Dissolved in a Dilute Liquid Crystalline Medium. J Am Chem Soc. 1998;120:9106–9107. [Google Scholar]
  93. Ravera E, Salmon L, Fragai M, Parigi G, Al-Hashimi HM, Luchinat C. Insights into domain-domain motions in proteins and RNA from solution NMR. Acc Chem Res. 2014;47:3118–3126. doi: 10.1021/ar5002318. [DOI] [PMC free article] [PubMed] [Google Scholar]
  94. Rinnenthal J, Buck J, Ferner J, Wacker A, Furtig B, Schwalbe H. Mapping the landscape of RNA dynamics with NMR spectroscopy. Acc Chem Res. 2011;44:1292–1301. doi: 10.1021/ar200137d. [DOI] [PubMed] [Google Scholar]
  95. Rodriguez-Castañeda F, Haberz P, Leonov A, Griesinger C. Paramagnetic tagging of diamagnetic proteins for solution NMR. Magn Reson Chem. 2006;44:S10–S16. doi: 10.1002/mrc.1811. [DOI] [PubMed] [Google Scholar]
  96. Russo L, Maestre-Martinez M, Wolff S, Becker S, Griesinger C. Interdomain dynamics explored by paramagnetic NMR. J Am Chem Soc. 2013;135:17111–17120. doi: 10.1021/ja408143f. [DOI] [PubMed] [Google Scholar]
  97. Ryabov YE, Fushman D. Analysis of interdomain dynamics in a two-domain protein using residual dipolar couplings together with 15N relaxation data. Magn Reson Chem. 2006;44:S143–S151. doi: 10.1002/mrc.1822. [DOI] [PubMed] [Google Scholar]
  98. Ryabov YE, Fushman D. A model of Interdomain Mobility in a Multidomain Protein. J Am Chem Soc. 2007;129:3315–3327. doi: 10.1021/ja067667r. [DOI] [PMC free article] [PubMed] [Google Scholar]
  99. Saio T, Ogura K, Shimizu K, Yokochi M, Burke TR, Jr, Inagaki F. An NMR strategy for fragment-based ligand screening utilizing a paramagnetic lanthanide probe. J Biomol NMR. 2011;51:395–408. doi: 10.1007/s10858-011-9566-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  100. Schmitz C, Vernon R, Otting G, Baker D, Huber T. Protein Structure Determination from Pseudocontact Shifts Using ROSETTA. J Mol Biol. 2012;416:668–677. doi: 10.1016/j.jmb.2011.12.056. [DOI] [PMC free article] [PubMed] [Google Scholar]
  101. Schroeder R, Barta A, Semrad K. Strategies for RNA folding and assembly. Nat Rev Mol Cell Biol. 2004;5:908–919. doi: 10.1038/nrm1497. [DOI] [PubMed] [Google Scholar]
  102. Sgheri L. Conformational freedom of proteins and the maximal probability of sets of orientations. Inv Probl. 2010a;26:035003-1–035003-19. [Google Scholar]
  103. Sgheri L. Joining RDC data from flexible protein domains. Inv Probl. 2010b;26:115021-1–115021-12. [Google Scholar]
  104. Sicheri F, Kuriyan J. Structures of Src-family tyrosine kinases. Curr Opin Struct Biol. 1997;7:777–785. doi: 10.1016/s0959-440x(97)80146-7. [DOI] [PubMed] [Google Scholar]
  105. Simin M, Irausquin S, Cole CA, Valafar H. Improvements to REDCRAFT: a software tool for simultaneous characterization of protein backbone structure and dynamics from residual dipolar couplings. J Biomol NMR. 2014;60:241–264. doi: 10.1007/s10858-014-9871-x. [DOI] [PubMed] [Google Scholar]
  106. Stelzer AC, Frank AT, Bailor MH, Andricioaei I, Al-Hashimi HM. Constructing atomic-resolution RNA structural ensembles using MD and motionally decoupled NMR RDCs. Methods. 2009;49:167–173. doi: 10.1016/j.ymeth.2009.08.006. [DOI] [PMC free article] [PubMed] [Google Scholar]
  107. Su XC, Huber T, Dixon NE, Otting G. Site-Specific Labelling of Proteins with a Rigid Lanthanide-Binding Tag. ChemBioChem. 2006;7:1599–1604. doi: 10.1002/cbic.200600142. [DOI] [PubMed] [Google Scholar]
  108. Su XC, Man B, Beeren S, Liang H, Simonsen S, Schmitz C, Huber T, Messerle BA, Otting G. A Dipicolinic Acid Tag for Rigid Lanthanide Tagging of Proteins and Paramagnetic NMR Spectroscopy. J Am Chem Soc. 2008a;130:10486–10487. doi: 10.1021/ja803741f. [DOI] [PubMed] [Google Scholar]
  109. Su XC, McAndrew K, Huber T, Otting G. Lanthanide-binding peptides for NMR measurements of residual dipolar couplings and paramagnetic effects from multiple angles. J Am Chem Soc. 2008b;130:1681–1687. doi: 10.1021/ja076564l. [DOI] [PubMed] [Google Scholar]
  110. Su XC, Otting G. Paramagnetic labelling of proteins and oligonucleotides for NMR. J Biomol NMR. 2010;46:101–112. doi: 10.1007/s10858-009-9331-1. [DOI] [PubMed] [Google Scholar]
  111. Svergun DI, Petoukhov MV, Koch MHJ. Determination of domain structure of proteins from X-ray solution scattering. Biophys J. 2001;80:2946–2953. doi: 10.1016/S0006-3495(01)76260-1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  112. Swarbrick JD, Ung P, Chhabra S, Graham B. An iminodiacetic acid based lanthanide binding tag for paramagnetic exchange NMR spectroscopy. Angew Chem Int Ed Engl. 2011a;50:4403–4406. doi: 10.1002/anie.201007221. [DOI] [PubMed] [Google Scholar]
  113. Swarbrick JD, Ung P, Su XC, Maleckis A, Chhabra S, Huber T, Otting G, Graham B. Engineering of a bis-chelator motif into a protein alpha-helix for rigid lanthanide binding and paramagnetic NMR spectroscopy. Chem Commun (Camb). 2011b;47:7368–7370. doi: 10.1039/c1cc11893e. [DOI] [PubMed] [Google Scholar]
  114. Tjandra N, Bax A. Direct measurement of distances and angles in biomolecules by NMR in a dilute liquid crystalline medium. Science. 1997;278:1111–1114. doi: 10.1126/science.278.5340.1111. [DOI] [PubMed] [Google Scholar]
  115. Tjandra N, Kuboniwa H, Ren H, Bax A. Rotational dynamics of calcium-free calmodulin studied by 15N-NMR relaxation measurements. Eur J Biochem. 1995;230:1014–1024. doi: 10.1111/j.1432-1033.1995.tb20650.x. [DOI] [PubMed] [Google Scholar]
  116. Tolman JR. Dipolar couplings as a probe of molecular dynamics and structure in solution. Curr Opin Struct Biol. 2001;11:532–539. doi: 10.1016/s0959-440x(00)00245-1. [DOI] [PubMed] [Google Scholar]
  117. Tolman JR, Flanagan JM, Kennedy MA, Prestegard JH. Nuclear magnetic dipole interactions in field-oriented proteins: information for structure determination in solution. Proc Natl Acad Sci USA. 1995;92:9279–9283. doi: 10.1073/pnas.92.20.9279. [DOI] [PMC free article] [PubMed] [Google Scholar]
  118. Tolman JR, Ruan K. NMR residual dipolar couplings as probes of biomolecular dynamics. Chem Rev. 2006;106:1720–1736. doi: 10.1021/cr040429z. [DOI] [PubMed] [Google Scholar]
  119. Tonks NK. Protein tyrosine phosphatases: from genes, to function, to disease. Nat Rev Mol Cell Biol. 2006;7:833–846. doi: 10.1038/nrm2039. [DOI] [PubMed] [Google Scholar]
  120. Torchia DA. NMR studies of dynamic biomolecular conformational ensembles. Prog Nucl Magn Reson Spectrosc. 2015;84–85:14–32. doi: 10.1016/j.pnmrs.2014.11.001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  121. Valafar H, Prestegard JH. REDCAT: a residual dipolar coupling analysis tool. J Magn Reson. 2004;167:228–241. doi: 10.1016/j.jmr.2003.12.012. [DOI] [PubMed] [Google Scholar]
  122. Wang H, Eberstadt M, Olejniczak ET, Meadows RP, Fesik SW. A liquid crystalline medium for measuring residual dipolar couplings over a wide range of temperatures. J Biomol NMR. 1998;12:443–446. [Google Scholar]
  123. Wöhnert J, Franz KJ, Nitz M, Imperiali B, Schwalbe H. Protein alignment by a coexpressed lanthanide-binding tag for the measurement of residual dipolar couplings. J Am Chem Soc. 2003;125:13338–13339. doi: 10.1021/ja036022d. [DOI] [PubMed] [Google Scholar]
  124. Yagi H, Maleckis A, Otting G. A systematic study of labelling an alpha-helix in a protein with a lanthanide using IDA-SH or NTA-SH tags. J Biomol NMR. 2013a;55:157–166. doi: 10.1007/s10858-012-9697-3. [DOI] [PubMed] [Google Scholar]
  125. Yagi H, Pilla KB, Maleckis A, Graham B, Huber T, Otting G. Three-Dimensional Protein Fold Determination from Backbone Amide Pseudocontact Shifts Generated by Lanthanide Tags at Multiple Sites. Structure. 2013b;21:883–890. doi: 10.1016/j.str.2013.04.001. [DOI] [PubMed] [Google Scholar]
  126. Zhang Q, Throolin R, Pitt SW, Serganov A, Al-Hashimi HM. Probing motions between equivalent RNA domains using magnetic field induced residual dipolar couplings: accounting for correlations between motions and alignment. J Am Chem Soc. 2003;125:10530–10531. doi: 10.1021/ja0363056. [DOI] [PubMed] [Google Scholar]
  127. Zhang Y, Zuiderweg ER. The 70-kDa heat shock protein chaperone nucleotide-binding domain in solution unveiled as a molecular machine that can reorient its functional subdomains. Proc Natl Acad Sci U S A. 2004;101:10272–10277. doi: 10.1073/pnas.0401313101. [DOI] [PMC free article] [PubMed] [Google Scholar]
  128. Zhuang T, Lee HS, Imperiali B, Prestegard JH. Structure determination of a Galectin-3-carbohydrate complex using paramagnetism-based NMR constraints. Protein Sci. 2008;17:1220–1231. doi: 10.1110/ps.034561.108. [DOI] [PMC free article] [PubMed] [Google Scholar]
  129. Zweckstetter M. NMR: prediction of molecular alignment from structure using the PALES software. Nat Protoc. 2008;3:679–690. doi: 10.1038/nprot.2008.36. [DOI] [PubMed] [Google Scholar]
  130. Zweckstetter M, Bax A. Prediction of Sterically Induced Alignment in a Dilute Liquid Crystalline Phase: Aid to Protein Structure Determination by NMR. J Am Chem Soc. 2000;122:3791–3792. [Google Scholar]
  131. Zweckstetter M, Bax A. Characterization of molecular alignment in aqueous suspensions of Pf1 bacteriophage. J Biomol NMR. 2001;20:365–377. doi: 10.1023/a:1011263920003. [DOI] [PubMed] [Google Scholar]
