Abstract
Crucial to the function of proteins is their existence as conformational ensembles sampling numerous and structurally diverse substates. Despite this widely accepted notion, there is still a high demand for meaningful and reliable approaches to characterize protein ensembles in solution. Because it is usually conducted in solution, NMR spectroscopy offers unique possibilities to address this challenge. In particular, cross‐correlated relaxation (CCR) effects have long been established to encode both protein structure and dynamics in a compelling manner. However, this wealth of information often limits their use in practice, as structure and dynamics can be difficult to disentangle. Using a modern Maximum Entropy (MaxEnt) reweighting approach to interpret CCR rates of Ubiquitin, we demonstrate that these uncertainties do not necessarily impair resolving CCR‐encoded structural information. Instead, a suitable balance between complementary CCR experiments and prior information is found to be the most crucial factor in mapping backbone dihedral angle distributions. Experimental and systematic deviations, such as oversimplified dynamics, appear to be of minor importance. We further demonstrate that CCR rates are capable of characterizing rigid and flexible residues alike, indicating their unharnessed potential in studying disordered proteins.
Keywords: cross-correlated relaxation, NMR spectroscopy, protein dynamics, protein structures, statistical inference
Cross‐correlated spin relaxation (CCR) rates uniquely encode both protein structure and dynamics but their quantitative interpretation is challenging in the presence of conformational heterogeneity. Using a Maximum Entropy approach, we demonstrate that CCR rates are capable of characterizing dihedral angle distributions of both rigid and flexible residues in Ubiquitin.

1. Introduction
The structure‐function paradigm is compelling not only because of its conceptual simplicity. The possibility of describing a protein system by a single minimum‐energy configuration greatly reduces the complexity with which the underlying structural ensemble has to be modeled. Most importantly, the assumed absence of conformational averaging allows a single model structure to be derived directly from the experimental data, a key concept of conventional structure calculation methods. [1] However, this simplifying assumption fails in cases of pronounced structural flexibility. While arguably inconvenient for structure determination purposes, the ability to sample numerous and structurally diverse substates is crucial for many proteins to function [2] as it enables important regulatory processes such as allosteric regulation. [3, 4, 5] An extreme case of conformational flexibility is found in so‐called intrinsically disordered proteins (IDPs) that appear unstructured under native conditions. Disordered regions of at least 30 residues in length are estimated to occur in 10–35 % of prokaryotic and 15–45 % of eukaryotic proteins. [6]
In order to model and characterize protein systems in their full structural diversity, methods and protocols are required that do not make overly restrictive assumptions about the characteristics of the underlying ensemble. However, reconstructing ensembles from averaged data is generally an ill‐posed problem, i.e. very different ensembles might be compatible with the data at hand, and only a substantial increase in the number of experimental observables can alleviate this problem. NMR spectroscopy offers numerous experimental parameters to study protein conformational ensembles in solution, such as hydrogen exchange rates (HDX), scalar couplings (3J), chemical shifts (CS), residual dipolar couplings (RDCs), nuclear Overhauser effects (NOEs) and paramagnetic relaxation enhancements (PREs), which allow (even transient) structural elements as well as their encoded dynamics to be probed with atomic resolution. [7, 8, 9]
Here we advocate a less common approach using cross‐correlated relaxation (CCR) mechanisms to probe backbone dihedral angle distributions of proteins in solution. Cross‐correlated relaxation arises from interference effects between the fluctuations of different relaxation mechanisms, typically between two different dipolar (DP) and/or chemical shift anisotropy (CSA) interactions. Any two relaxation mechanisms arising from tensorial interactions of equal rank can lead to observable relaxation interference. In theory, this interference is independent of the distance between the interactions; in practice, it is limited by the effective means of coherence transfer to more remote nuclei. Still, interactions within sequentially adjacent residues are generally accessible, allowing a wide variety of CCR effects to be probed, [10, 11] most of which depend on the backbone dihedral angles. While inherently non‐injective, these dependencies have been shown to be complementary, i.e. CCR rates can be analyzed in a combined fashion to resolve their mutual ambiguities assuming a stable fold. [12, 13] We argue that this complementarity goes further than simply resolving angular ambiguities in folded proteins.
We have recently demonstrated that CCR rates reveal surprising structural and dynamical propensities in intrinsically disordered proteins (IDPs). [14] However, in the presence of substantial conformational averaging, quantitative interpretation of these rates appears difficult due to their non‐trivial convolution of structural and dynamical contributions. Uncertainties about the dynamics translate into uncertainties about the structure and vice versa. For this reason, CCR rates seem to demonstrate their full potential predominantly in well‐characterized protein systems.[ 15 , 16 , 17 , 18 , 19 , 20 ] Using Ubiquitin as an example, we show that imprecisely modeled protein dynamics do not necessarily obfuscate CCR‐encoded structural information. It is neither necessary to require precise quantitative agreements between model and experiment nor to assume a rigid backbone to reliably characterize backbone dihedral angle distributions. Using a Maximum Entropy (MaxEnt) approach, we find that the complementary nature of CCR rates allows them to compensate for dynamical and geometrical uncertainties even in cases of substantial structural flexibility.
While traditional approaches developed for folded proteins are often ill‐equipped to model heterogeneous structural ensembles, MaxEnt inspired methods have made tremendous progress in the recent past. Particularly promising are MaxEnt reweighting approaches, which have matured to well‐founded and efficient protocols over the last few years.[ 21 , 22 , 23 , 24 , 25 ] Given a predefined population‐weighted set of conformations, called the prior, these methods form an ensemble estimate by reweighting the prior as little as necessary to match the experimental data. This works well in practice as long as all experimentally relevant conformations can safely be specified a priori. [26] While this requirement cannot be met in every case, it does generally apply if the protein ensemble is considered not in its full 3D complexity but in terms of the experimentally relevant variables, such as interatomic distances or backbone dihedral angles. This dimensionality reduction not only allows for simple and extensive definitions of the prior conformations but also tends to increase the relative information content of the experiments. A backbone dihedral angle distribution is more easily defined and restrained than conformational space in 3D. Thus, we consider MaxEnt reweighting as the method of choice to assess the structural information encoded by CCR rates.
We start by deriving an alternative approach to existing MaxEnt heuristics from first principles. Applying the method to Ubiquitin, we find that CCR‐guided MaxEnt reweighting is capable of characterizing rigid and flexible protein backbone regions alike. Most importantly, our results suggest that neither experimental nor systematic errors are the most relevant factors in mapping backbone dihedral angle distributions, but rather a reasonable balance between complementary experiments and prior information. Our findings indicate a surprisingly low sensitivity to experimental uncertainties and oversimplified dynamics, highlighting the potential of CCR rates for the characterization of conformational ensembles of proteins.
2. Theory
2.1. Cross‐Correlated Relaxation
Cross‐correlated relaxation (CCR) effects result from correlated interferences of simultaneous spin relaxation processes. Using these effects to study protein backbone geometry was first proposed in the late 1990s by Reif et al., [27] who deduced ψ from relaxation interference of interresidual Cα‐Hα and N‐HN dipolar vectors. Other interactions probing dihedral angles along the protein backbone were soon proposed; [28, 29, 30, 31, 32, 33, 34] for more in‐depth reviews see e.g. Refs. [10, 11, 35].
The angular information is encoded in the spectral density function, which is most commonly modeled under the simplifying assumption of isotropic molecular tumbling with no internal dynamics.[ 27 , 28 ] The correlation time is often scaled by a heuristic order parameter [36] to mimic the effects of local motions,[ 11 , 13 ] see Refs. [11, 37] for more sophisticated models. Here, we consider the most commonly exploited types of relaxation mechanisms, dipolar (DP) and chemical shift anisotropy (CSA), under the simplifying assumptions above. Three different combinations can be distinguished,
| (1) |
| (2) |
| (3) |
a, b, c and d denote nuclei subject to dipolar coupling, u and v are nuclei with CSA, γ is the gyromagnetic ratio, r is the distance between two nuclei, σxx, σyy and σzz are the components of the diagonal CSA tensor (in ppm), μ0 is the vacuum permeability, ħ is the reduced Planck constant, B0 is the magnetic field strength, τc is the overall correlation time, S² is the local order parameter and θ denotes the projection angle between the dipolar vectors (ab, cd) and/or the principal axes of the CSA tensor coordinate system of nucleus u or nucleus v. These projection angles are related to the backbone geometry, allowing us to map them to the backbone dihedral angles (φ,ψ). Figure 1 shows the angular dependencies of all CCR rates employed in this work.
Figure 1.

Angular dependencies of different dipole/CSA interferences assuming a rigid backbone geometry, with one panel for each of the seven CCR rates.
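The non‐injective character of a single CCR rate can be illustrated with a minimal Python sketch. It assumes that the angular dependence enters through the second‐rank Legendre polynomial P2(cos θ) = (3cos²θ − 1)/2, as is standard for interference between rank‐2 interactions in the rigid limit. The helper `scaled_ccr` and its lumped `prefactor` argument are hypothetical stand‐ins for the physical constants; S² = 0.7 and τc = 4.1 ns are the values used in the Results.

```python
import math

def p2(cos_theta):
    """Second-rank Legendre polynomial P2(x) = (3x^2 - 1) / 2."""
    return 0.5 * (3.0 * cos_theta ** 2 - 1.0)

def scaled_ccr(prefactor, s2, tau_c, theta):
    """Hypothetical helper: prefactor * S^2 * tau_c * P2(cos(theta)).

    `prefactor` lumps all physical constants (assumption, set to 1 below).
    """
    return prefactor * s2 * tau_c * p2(math.cos(theta))

theta = math.radians(30.0)
mirror = math.pi - theta  # the supplementary projection angle

rate_a = scaled_ccr(1.0, 0.7, 4.1e-9, theta)
rate_b = scaled_ccr(1.0, 0.7, 4.1e-9, mirror)

# Both projection angles give the same rate, so one rate alone
# cannot distinguish them.
print(rate_a, rate_b)
```

Because P2 is even in cos θ, the angles θ and 180° − θ are indistinguishable from one rate alone, which is why several complementary rates are combined.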
2.2. Maximum Entropy
We consider a distribution of backbone dihedral angles in (φ,ψ)‐space in terms of discretized populations $p_i$, with $i = 1, \dots, n$. The information about the $p_i$ is represented by m population‐averaged CCR rates $\bar{\Gamma}_j$, $j = 1, \dots, m$. These observables might not be able to uniquely determine the underlying populations $p_i$. If this is the case, Jaynes’ Maximum Entropy (MaxEnt) principle [38] identifies the optimal solution as the maximum entropy distribution compatible with the data. Following Jaynes’ original narrative, this principle can be derived from the intuitive notion of entropy as an information measure: any lower‐entropy solution would contain more information than originally supplied. While the original calculus has remained essentially unchanged, its justifications and axiomatizations have been thoroughly investigated over the years [39, 40, 41, 42, 43, 44, 45] and linked to the principles of Bayesian inference. [46, 47, 48]
Irrespective of its interpretation, the MaxEnt calculus takes the form of a constrained optimization problem, see e.g. Refs. [49–52]. Jaynes’ originally suggested maximization of Shannon entropy [53] is generalized in
$$\max_{\mathbf{p}} \; S(\mathbf{p}) \tag{4}$$
where $\mathbf{p} = (p_1, \dots, p_n)$ is the probability vector and S is the negative Kullback‐Leibler divergence [54] or relative entropy,
$$S(\mathbf{p}) = -\sum_{i=1}^{n} p_i \ln \frac{p_i}{p_i^0} \tag{5}$$
where the $p_i^0$ denote the so‐called prior probabilities, which result from the relative nature of entropy measures in general. Analogous to its counterpart in Bayesian statistics, the prior is often interpreted as a state of a priori beliefs. Here, the prior probabilities are predefined and fixed, hence S is a function of the $p_i$ alone.
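To make the relative entropy concrete, the following minimal Python sketch evaluates the negative Kullback‐Leibler divergence, S = −Σ p_i ln(p_i / p_i⁰), for a small discrete distribution. The three‐state prior and the reweighted populations are hypothetical numbers chosen purely for illustration.

```python
import math

def relative_entropy(p, p0):
    """S = -sum_i p_i ln(p_i / p0_i); terms with p_i == 0 contribute zero."""
    return -sum(pi * math.log(pi / qi) for pi, qi in zip(p, p0) if pi > 0.0)

prior = [0.5, 0.3, 0.2]       # hypothetical prior populations
reweighted = [0.7, 0.2, 0.1]  # hypothetical reweighted populations

s_prior = relative_entropy(prior, prior)         # no perturbation of the prior
s_shifted = relative_entropy(reweighted, prior)  # perturbed populations

print(s_prior, s_shifted)
```

S vanishes when the populations equal the prior and becomes negative for any perturbation, so maximizing S amounts to staying as close to the prior as the constraints allow.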
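In addition, the $p_i$ are subject to a set of constraints,
$$\sum_{i=1}^{n} p_i = 1 \tag{6}$$
$$\sum_{i=1}^{n} p_i \, \Gamma_j(\varphi_i, \psi_i) = \bar{\Gamma}_j, \quad j = 1, \dots, m \tag{7}$$
where (6) represents the normalization condition and (7) requires the population‐weighted average of the rates $\Gamma_j(\varphi_i, \psi_i)$ predicted for each conformation i to match the averaged rates $\bar{\Gamma}_j$. As shown in the Appendix, the problem is solved by forming the Lagrangian (17) and inspecting its partial derivatives (18) with respect to $p_i$, which yields the general expression
$$p_i = \frac{p_i^0}{Z} \exp\Big(-\sum_{j=1}^{m} \lambda_j \, \Gamma_j(\varphi_i, \psi_i)\Big) \tag{8}$$
where Z is the partition function,
$$Z = \sum_{i=1}^{n} p_i^0 \exp\Big(-\sum_{j=1}^{m} \lambda_j \, \Gamma_j(\varphi_i, \psi_i)\Big) \tag{9}$$
which results from the normalization condition (6). Note that the MaxEnt distribution (8) is a function solely of the m Lagrange multipliers λj, hence the dimensionality of the original problem (4), (6), (7) is drastically reduced, from n populations to m multipliers.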
While the resulting system of m equations and m unknowns could readily be solved, a common strategy involves forming the so‐called dual Lagrangian instead. [49, 55, 56, 57] Substitution of (8) into the Lagrangian yields a convex “free energy” potential of $\boldsymbol{\lambda} = (\lambda_1, \dots, \lambda_m)$, see Appendix Eq. (23),
$$L(\boldsymbol{\lambda}) = \ln Z(\boldsymbol{\lambda}) + \sum_{j=1}^{m} \lambda_j \bar{\Gamma}_j \tag{10}$$
which allows the originally constrained problem (4), (6), (7) to be solved by unconstrained minimization of (10) instead. At its minimum, the vanishing partial derivatives of L reproduce the conditions stated in (7),
$$\frac{\partial L}{\partial \lambda_j} = \bar{\Gamma}_j - \langle \Gamma_j \rangle = 0 \tag{11}$$
where angled brackets denote the population‐weighted average over the MaxEnt distribution (8). The minimizing multipliers uniquely characterize (8) with maximum S (4) subject to the constraints (6) and (7). The Hessian of L,
$$\frac{\partial^2 L}{\partial \lambda_j \, \partial \lambda_k} = \langle \Gamma_j \Gamma_k \rangle - \langle \Gamma_j \rangle \langle \Gamma_k \rangle \tag{12}$$
corresponds to a positive semi‐definite covariance matrix [49] or Fisher information metric. [54] Minimization of (10) is thus straightforward to implement numerically.
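The dual minimization can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this work: it assumes the sign convention p_i ∝ p_i⁰ exp(−Σ_j λ_j Γ_ij), uses plain gradient descent rather than a second‐order method, and replaces real CCR rates by two hypothetical bounded observables (cos and sin of a single angle) on a 36‐point grid with a uniform prior.

```python
import math

def maxent_weights(prior, rates, lam):
    """Populations p_i proportional to p0_i * exp(-sum_j lam_j * rates[i][j])."""
    unnorm = [p0 * math.exp(-sum(l * g for l, g in zip(lam, row)))
              for p0, row in zip(prior, rates)]
    z = sum(unnorm)  # partition function Z
    return [u / z for u in unnorm]

def averages(weights, rates):
    """Population-weighted average of every observable."""
    m = len(rates[0])
    return [sum(w * row[j] for w, row in zip(weights, rates)) for j in range(m)]

def minimize_dual(prior, rates, targets, step=1.0, iters=5000, tol=1e-12):
    """Gradient descent on L(lam) = ln Z + sum_j lam_j * target_j."""
    lam = [0.0] * len(targets)
    for _ in range(iters):
        avg = averages(maxent_weights(prior, rates, lam), rates)
        grad = [t - a for t, a in zip(targets, avg)]  # dL/dlam_j
        if max(abs(g) for g in grad) < tol:
            break
        lam = [l - step * g for l, g in zip(lam, grad)]
    return lam

# Toy problem (hypothetical): 36 grid points, two bounded "rates" per point.
# The Hessian is a covariance matrix with trace <= 1 here, so step=1.0 is safe.
angles = [math.radians(-180 + 10 * k) for k in range(36)]
rates = [[math.cos(a), math.sin(a)] for a in angles]
prior = [1.0 / len(angles)] * len(angles)
targets = [0.3, -0.2]

lam = minimize_dual(prior, rates, targets)
weights = maxent_weights(prior, rates, lam)
fitted = averages(weights, rates)
print(lam, fitted)
```

At convergence the reweighted averages reproduce the targets, which is the optimality condition of the dual; only the two multipliers are optimized although 36 populations are determined.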
2.3. Maximum Entropy Regression
At first glance, the MaxEnt framework appears like a natural fit for dealing with ensemble averaged data. Instead of modeling the populations explicitly, they can be obtained from a simple and low‐dimensional optimization problem. However, we have not yet accounted for the possibility of errors in the data. This is a key limitation of MaxEnt in practice, because experimental information rarely takes the form of expectation constraints as assumed in (7). Instead, the conditional probability terms of Bayesian statistics are generally required to properly account for experimental uncertainties and/or inaccuracies of the forward model. [58] However, defining and sampling meaningful priors and likelihoods over ensembles, i. e. distributions over distributions, is challenging both conceptually and practically. While this concept has been explored,[ 59 , 60 , 61 ] most pragmatic approaches opt for a simple point estimate instead.
Here, we suggest a novel perspective that aims to preserve the advantages of MaxEnt as introduced in Section 2.2, such as low dimensionality and well‐defined optimality conditions. To distinguish from noise‐ and error‐free constraints $\bar{\Gamma}_j$, we denote experimentally determined rates as $\Gamma_j^{\mathrm{exp}}$. While the general notion of population averaging still applies, we acknowledge that the associated equality constraints (7) are ill‐suited for erroneous $\Gamma_j^{\mathrm{exp}}$. Thus, a MaxEnt representation as in (8) cannot be derived from $\Gamma_j^{\mathrm{exp}}$, at least in a strict sense. However, we suggest that the functional form of (8) still holds some validity. Reconsidering the dual Lagrangian (10), we note that the gradient (11) evaluates the difference between prediction and observation, i.e. the residual. Instead of requiring this term to be zero, we could choose to minimize a suitable norm, which allows us to reframe MaxEnt in terms of a simple regression model. For now, we consider a conventional χ² fit,
$$\chi^2(\boldsymbol{\lambda}) = \sum_{j=1}^{m} \frac{\big(\langle \Gamma_j \rangle - \Gamma_j^{\mathrm{exp}}\big)^2}{\sigma_j^2} \tag{13}$$
where angled brackets denote the population‐weighted average over (8) and σj represents the standard deviation associated with $\Gamma_j^{\mathrm{exp}}$. This expression is of course easily generalized to include multiple measurements of $\Gamma_j^{\mathrm{exp}}$. The similarities with canonical MaxEnt are best illustrated by the optimality conditions,
$$\frac{\partial \chi^2}{\partial \lambda_k} = -2 \sum_{j=1}^{m} \frac{\langle \Gamma_j \rangle - \Gamma_j^{\mathrm{exp}}}{\sigma_j^2} \big( \langle \Gamma_j \Gamma_k \rangle - \langle \Gamma_j \rangle \langle \Gamma_k \rangle \big) = 0 \tag{14}$$
or equivalently, considering (11) and (12) with the experimental rates $\Gamma_j^{\mathrm{exp}}$ in place of the constraint values,
$$\frac{\partial \chi^2}{\partial \lambda_k} = 2 \sum_{j=1}^{m} \frac{1}{\sigma_j^2} \, \frac{\partial L}{\partial \lambda_j} \, \frac{\partial^2 L}{\partial \lambda_j \, \partial \lambda_k} = 0 \tag{15}$$
The gradient entries (14) and (15) combine the gradient of the MaxEnt dual (11) and its Hessian (12) in a sum of products, balancing the residuals of each rate weighted with their respective variances and covariances. Qualitatively speaking, if the residuals are approaching zero, i.e. a very good fit can be achieved, the χ² fit will resemble an orthodox MaxEnt solution. If this is not the case, the covariances are adjusted and scaled down in order to minimize the non‐zero residuals. Of course, the latter case bears little similarity with a conventional MaxEnt representation. Rather than increasing the inferential uncertainty, experimental errors lead to smaller covariances, implying averages over smaller subspaces and thus lower rather than higher entropy solutions. In a way, the expression for (8) was used for its convenient parametrization without keeping its original justification intact. Thus, entropy has to be reintroduced to the equation, which can be achieved by regularization. Since explicit entropy regularization has already been investigated by other studies, [21, 23, 24, 62, 63, 64] we choose to explore it implicitly using a more conventional quadratic regularization term commonly known in the context of Tikhonov regularization [65, 66] or ridge regression, [67, 68]
$$F_\beta(\boldsymbol{\lambda}) = \chi^2(\boldsymbol{\lambda}) + \beta \sum_{j=1}^{m} \lambda_j^2 \tag{16}$$
where β is a free parameter, often identified as “temperature”, balancing the contributions of χ² and the L2 penalty. This indirect way of entropy penalization is particularly suited for biasing techniques [22, 69, 70, 71] that do not allow for explicit calculation of S. Its effect on the entropy can easily be assessed from Eq. (8), noticing that smaller λj imply smaller perturbations of the prior. An established heuristic for the choice of β is the L‐curve criterion. [72, 73] As nonlinear least squares problems, both Eq. (13) and (16) are locally convex and straightforward to minimize using the Levenberg‐Marquardt algorithm. [74]
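The regularized fit can be sketched as follows. This is a hedged illustration only: it swaps the Levenberg‐Marquardt algorithm for plain gradient descent with backtracking to remain dependency‐free, assumes the sign convention p_i ∝ p_i⁰ exp(−Σ_j λ_j Γ_ij), and uses hypothetical toy observables (cos and sin on a 36‐point grid), target values and uncertainties. The gradient is written as the residual‐times‐covariance sum described above.

```python
import math

def weights(prior, rates, lam):
    unnorm = [p0 * math.exp(-sum(l * g for l, g in zip(lam, row)))
              for p0, row in zip(prior, rates)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

def objective(lam, prior, rates, g_exp, sigma, beta):
    """Return (chi2 + beta * |lam|^2, its gradient, chi2) at lam."""
    w = weights(prior, rates, lam)
    m = len(lam)
    mean = [sum(wi * row[j] for wi, row in zip(w, rates)) for j in range(m)]
    cov = [[sum(wi * row[j] * row[k] for wi, row in zip(w, rates)) - mean[j] * mean[k]
            for k in range(m)] for j in range(m)]
    scaled = [(mean[j] - g_exp[j]) / sigma[j] ** 2 for j in range(m)]
    chi2 = sum(scaled[j] * (mean[j] - g_exp[j]) for j in range(m))
    # d<rate_j>/dlam_k = -cov[j][k]: residuals weighted by (co)variances.
    grad = [-2.0 * sum(scaled[j] * cov[j][k] for j in range(m)) + 2.0 * beta * lam[k]
            for k in range(m)]
    return chi2 + beta * sum(l * l for l in lam), grad, chi2

def fit(prior, rates, g_exp, sigma, beta, iters=500):
    """Gradient descent with Armijo backtracking (not Levenberg-Marquardt)."""
    lam = [0.0] * len(g_exp)
    value, grad, chi2 = objective(lam, prior, rates, g_exp, sigma, beta)
    for _ in range(iters):
        step, accepted = 1.0, False
        while step > 1e-12 and not accepted:
            trial = [l - step * g for l, g in zip(lam, grad)]
            t_value, t_grad, t_chi2 = objective(trial, prior, rates, g_exp, sigma, beta)
            accepted = t_value <= value - 1e-4 * step * sum(g * g for g in grad)
            step *= 0.5
        if not accepted:
            break
        lam, value, grad, chi2 = trial, t_value, t_grad, t_chi2
    return lam, chi2

# Hypothetical toy data: two observables on a 36-point grid, uniform prior.
angles = [math.radians(-180 + 10 * k) for k in range(36)]
rates = [[math.cos(a), math.sin(a)] for a in angles]
prior = [1.0 / len(angles)] * len(angles)
g_exp, sigma = [0.3, -0.2], [0.1, 0.1]

lam_free, chi2_free = fit(prior, rates, g_exp, sigma, beta=0.0)
lam_reg, chi2_reg = fit(prior, rates, g_exp, sigma, beta=50.0)
print(lam_free, chi2_free)
print(lam_reg, chi2_reg)
```

Increasing β shrinks the multipliers toward zero, i.e. toward the prior, at the expense of a larger χ², which is the trade‐off that the choice of β has to balance.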
Naturally, Eq. (16) shares many similarities with previously devised approaches. Already in 1978, Gull and Daniell assumed reduced χ²‐statistics to repurpose the MaxEnt Lagrangian for erroneous data. [75] Other authors chose to preserve the Lagrangian by extending the equality constraints assuming additive errors, an idea first sketched in the 1990s [76, 77, 78] and recently rediscovered for structural biology purposes by Cesari et al. [22, 70] Assuming Gaussian errors, both approaches can be shown (Ref. [24] and [70], supplementary material) to correspond to the Bayesian MAP estimate of Hummer and Köfinger, [21, 24] who reframed their entropy‐regularized χ²‐fitting procedure [62, 63, 64] in Bayesian terms assuming a Gaussian likelihood and an entropy‐inspired prior. A different perspective has been sketched by Dudik et al., who investigated the Lagrangian under various “relaxed” constraints to avoid overfitting. [79, 80] While the heuristic proposed here is derived from slightly different considerations, it illustrates the same general principle: uncertainty in the data must be reflected in a stronger emphasis on the prior.
3. Results
Eq. (16) depends on various parameters and assumptions which need to be carefully assessed before interpreting its solution, namely the variances weighting different rates according to their overall size and spread, the regularization parameter β balancing experimental and prior information and the forward model relating the observed rates Γj to the backbone dihedral angles.
In a first step, the variances of each rate were assessed by comparing the experimental CCR rates with rates predicted from the Ubiquitin ensemble of Lange et al., [81] PDB code 2k39. Simulating the general case of limited prior knowledge, CSA tensors and dynamics were assumed uniform for all rates and residues. With a global correlation time τc of 4.1 ns, an overall order parameter S² of 0.7 (similar to Refs. [19, 82]) was found to give reasonable but far from perfect agreement between measured and predicted rates. Four examples are depicted in Figure 2; the remaining rates are shown in Figure S1, Supporting Information. Accounting for differences in temperature and/or magnetic field strength, the obtained range of experimental values agrees well with the original publications validated on Ubiquitin. [14, 28, 29, 30, 32] As can be seen from Figure 2, not all measured rates can safely be modeled assuming simple Gaussian noise. Since the conventional variance estimate is highly sensitive to outliers, the squared median absolute deviation (MAD²) was considered a better suited estimate for the subsequent fitting procedure. Due to the noticeable presence of outliers, systematic deviations and the poor agreement of rate (d), the reported minimum requirement of five observables per residue [13] was applied. For 56 out of 74 non‐terminal residues, five or more CCR rates could be quantified. Residues that yielded four or fewer CCR rates (18, including 5 glycines and 3 prolines) were excluded from the analysis.
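The difference between the conventional variance and the squared median absolute deviation can be illustrated with a short Python sketch. The residuals below are hypothetical numbers, taken here to mean experimental minus predicted rates, and no normal‐consistency scaling factor is applied to the MAD; whether such a factor was used in this work is not specified, so this is an assumption.

```python
from statistics import median, pvariance

def mad_squared(values):
    """Squared median absolute deviation (no consistency factor applied)."""
    values = list(values)
    center = median(values)
    return median([abs(v - center) for v in values]) ** 2

# Hypothetical residuals (experimental minus predicted rate) with one outlier.
residuals = [-0.4, 0.1, -0.2, 0.3, 0.0, 0.2, -0.1, 6.0]

robust = mad_squared(residuals)   # barely affected by the outlier
classic = pvariance(residuals)    # dominated by the outlier

print(robust, classic)
```

A single outlier inflates the conventional variance by roughly two orders of magnitude in this toy example, whereas MAD² remains at the scale of the bulk of the data.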
Figure 2.

Comparison of CCR rates Γ2k39 calculated from the Lange ensemble, [81] PDB code 2k39, and the experimentally obtained rates Γexp for rates (a), (b), (c) and (d). The squared median absolute deviations MAD² are specified in the lower right corners. Outliers of rate (b) that were found critical in the subsequent fitting procedure are highlighted in red. Diamond: 30I, Triangle: 41Q, Square: 70V. The remaining three rates are depicted in Figure S1, Supporting Information.
With τc, S² and the variances σj² fixed, the regularization parameter β is the only free parameter left. It is often chosen using the L‐curve criterion. For each residue, Eq. (16) is minimized for different β using the random coil prior defined in Sec. 7. The pairwise contributions of χ² and the L2 penalty are plotted on a logarithmic scale. The knee point of the curve indicates the region where χ² and the penalty are balanced in the sense that lower β allow for high λ‐variability with little gain in χ², while higher β restrain the λj at noticeable expense of χ². An exemplary L‐curve for I14 is shown in Figure 3.
Figure 3.

Red circles: Exemplary L‐curve for I14 obtained by optimizing Eq. (16) for different β (between 0 and 1000). The contributions of χ² and the L2 penalty describe an L‐shaped curve in a log‐log plot. Blue squares: The entropy S, Eq. (5), of the backbone dihedral angle distribution, Eq. (8), corresponding to the minimum of Eq. (16). Points are linearly interpolated for improved readability. The knee point is highlighted, as is the regularization parameter chosen for further evaluations, see Figures 4, 5 and 6.
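Locating a knee point can be automated in several ways. The Python sketch below uses one common heuristic, assumed here for illustration and not necessarily the procedure used in this work: in the log‐log plane, it picks the point with the largest perpendicular distance from the straight line connecting the two ends of the curve. The β grid and the (χ², penalty) pairs are hypothetical.

```python
import math

def knee_index(chi2, penalty):
    """Index of the point farthest from the chord of the log-log L-curve."""
    x = [math.log10(v) for v in penalty]
    y = [math.log10(v) for v in chi2]
    x0, y0, x1, y1 = x[0], y[0], x[-1], y[-1]
    norm = math.hypot(x1 - x0, y1 - y0)
    # Perpendicular distance of each point from the line through the end points.
    dist = [abs((x1 - x0) * (y0 - yi) - (x0 - xi) * (y1 - y0)) / norm
            for xi, yi in zip(x, y)]
    return max(range(len(dist)), key=dist.__getitem__)

# Hypothetical sweep: chi2 rises and the L2 penalty falls with increasing beta.
betas = [0.01, 0.1, 1.0, 10.0, 100.0]
chi2 = [1.0, 1.1, 1.3, 10.0, 100.0]
penalty = [100.0, 10.0, 1.0, 0.8, 0.7]

knee = knee_index(chi2, penalty)
print(betas[knee])
```

Such a purely geometric criterion only weighs χ² against the penalty and carries no information about data quality, so its result should be treated as a starting point rather than a final choice of β.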
However, considering the quality of the data and the obvious presence of outliers, Figure 2, the L‐curve criterion is likely too optimistic in favoring low χ² solutions. This was confirmed by comparing the fitting results with the Lange ensemble [81] in terms of average dihedral angles for different choices of β, Figure S2, Supporting Information, bottom row. By evaluating the L‐curves collectively, a more conservative estimate was found in the highest overall knee point of β=2.5. The corresponding fitting results are summarized in Figure 4.
Figure 4.

Comparison of average backbone dihedral angles in Ubiquitin between the Lange ensemble, [81] PDB code 2k39, and the CCR‐derived (φ,ψ)‐distributions obtained from Eq. (16) with β=2.5. As specified in the legend, different markers are used to indicate the number of experimental CCR rates used. The three strong ψ‐outliers correspond to the outliers of rate (b) highlighted in Figure 2, panel (b).
As both the entropy S and the L2 penalty are defined relative to a prior distribution, it is important to assess the influence of the random coil prior used. Thus, the fitting procedure was repeated using the uniform prior. The results are summarized in Figure S3, Supporting Information. Compared to the random coil prior, Figure S2, Supporting Information, the uniform prior is less capable of correcting for improbable (φ,ψ)‐assignments.
Finally, the forward model itself represents an additional source of uncertainty. The functional forms of Eq. (1) and (2) build on the assumption of isotropic molecular tumbling in the absence of internal dynamics, which leads to the convenient factorization of structural and dynamical contributions. To assess the influence of the assumed dynamic scaling, the order parameter x ≙ S² was varied between 0.1 and 1.2, as shown in Figure 5. A comparison of fitting results with different x and β is shown in Figure S2, Supporting Information.
Figure 5.

Average fitting results of Eq. (16) with β=2.5 for all 56 residues with five or more quantifiable CCR rates assuming different dynamic scalings x ≙ S² and τc=4.1 ns. Deviations are calculated with respect to the average backbone dihedral angles of the Lange ensemble, [81] PDB code 2k39. Points are linearly interpolated for improved readability.
To evaluate φ and ψ not only in terms of averages, a selection of flexible residues is compared to the ensemble of Lange et al. [81] in terms of (φ,ψ)‐distributions in Figure 6. The full set of (φ,ψ)‐distributions is shown in Figure S4, Supporting Information.
Figure 6.

Comparison of selected backbone dihedral angle distributions in Ubiquitin between the Lange ensemble [81] in column (a) and the CCR‐derived fitting results in column (b) obtained from Eq. (16) with β=2.5. Populations are color‐coded according to the linear color gradient at the bottom. Residue type and number are indicated in the top right corner. The absolute differences in average φ and ψ between (a) and (b) are indicated, and #rates is the number of CCR rates used to derive (b).
4. Discussion
In Figure 2, a selection of experimentally determined rates is compared to their expected values calculated from the ensemble of Lange et al. [81] While an overall correlation between experiments and predictions can be observed, there are obvious differences between different rates. For rate (a), deviations are relatively small and could be reasonably approximated by Gaussian noise. The spread of rate (b) is not only larger but appears to contain a few obvious outliers, three of which have been found to heavily influence the subsequent analysis (highlighted in red). Our newly proposed experiment, [14] rate (c), shows systematic deviations from predicted values, most pronounced for higher rates. This is consistent with our previous findings, suggesting a higher overall correlation time as well as a strong sensitivity to variations of the CSA tensor and/or the backbone geometry. [14] Rate (d) shows particularly poor agreement. The expected angular dependency is obscured by considerable scatter and systematic deviations, likely due to its small functional range with the highest sensitivity in mostly unpopulated regions with positive φ, see Figure 1. The three remaining rates (e), (f) and (g), Figure S1, Supporting Information, behave similarly to (a) and (c). Errors appear mostly randomly distributed; minor systematic biases can be observed for rates (e) and (g).
Considering these quantitative discrepancies, the agreement in terms of MaxEnt‐derived average φ and ψ, Figure 4, might appear quite surprising. However, as established in Sec. 2.3, the experimental and theoretical uncertainties are reflected in a stronger emphasis on the prior (larger β). Instead of forcing χ² close to its absolute minimum, a suitable balance between χ² and the L2 penalty suppresses overfitting, revealing the complementary structural dependencies of CCR rates even in the presence of systematic and experimental errors.
To achieve this balance, i.e. finding an appropriate β, the L‐curve offers an intuitive heuristic. As explained in Sec. 3, the pairwise contributions of χ² and the L2 penalty for different β are compared on a log‐log plot as illustrated for I14 in Figure 3. At the knee point, λ‐variability is considered sufficiently penalized without excessive restraint of χ². However, these knee point solutions, with β between 0.01 and 2.5 for different residues, do not necessarily yield the best results, see Figure S2, Supporting Information, bottom row. Instead, we find that a more strongly weighted prior of β=2.5 for all residues leads to better agreement in terms of average dihedral angles. As described in Sec. 2.3, the presence of errors should be reflected by an increase in inferential uncertainty, yielding higher rather than lower entropy solutions. Thus, a low χ² does not necessarily indicate a good solution, especially if it is achieved at the expense of an informative prior. While the L‐curve provides a useful visual guideline for choosing β, the knee point criterion appears too reliant on χ² to avoid overfitting; prior information needs to be considered more explicitly. Alternatively, the observed change in slope of S or the L2 penalty, i.e. the second derivative, might provide a reasonable heuristic.
Of course, this does not imply that the prior can substitute for the experimental information encoded in χ²; the experimental data still determines the solution. This is well illustrated by the highlighted outliers of rate (b), Figure 2. As can be seen by their vertical and horizontal projections, these points correspond to very plausible values for ψ. As a consequence, these discrepancies are mirrored by the obvious ψ‐deviations in Figure 4. Upon exclusion of rate (b), the predicted (φ,ψ)‐densities improve drastically (Figure S5, Supporting Information). The prior can only correct for implausible conformations; χ² still shapes the solution. Which conformations are deemed implausible depends, of course, on the prior. See Figures S2 and S3, Supporting Information, to compare the fitting results between random coil and uniform prior.
These findings suggest another interesting implication. If the absolute value of χ² is of minor importance, how crucial is the choice of the effective correlation time? This question is explored in Figure 5, which summarizes the average fitting results assuming “order parameters” x ≙ S² between 0.1 and 1.2. Here, the “order parameter” simply serves to rescale the functional range of Γ. Thus, it is not surprising that the average χ² increases with lower x. However, the average deviations in φ and ψ still appear quite stable over a considerable range, illustrating the general robustness with respect to systematic errors. As long as the angular dependencies (1), (2) and (3) are reasonably applicable, their overall scale is of less importance. However, if the scaling is underestimated by too much (x=0.1–0.3), the data might point towards a different (φ,ψ)‐region, analogous to the outliers in Figure 4. In Figure S2, Supporting Information, this increase of incorrectly assigned (φ,ψ)‐pairs can readily be seen. Overestimating the dynamic scaling (x=0.9–1.2) leads to an entirely different behavior. As can be seen by the vertical spreads in Figures S1 and S2, Supporting Information, increasing the functional range of Γ puts less restraint on (φ,ψ)‐space, allowing for better fits in terms of χ². However, the underlying (φ,ψ)‐distributions can be implausibly distorted as a consequence.
These results suggest that the dynamical contributions, attributed to a simple effective correlation time, are less crucial than the angular information encoded in locally concerted motions. Of course, dynamics and geometry are more intertwined in reality. For IDPs in particular, more pronounced diffusion anisotropy and local dynamics will lead to even further deviations from the simple angular dependencies assumed in Eqs. (1), (2) and (3). Still, a surprising robustness can be observed for Ubiquitin despite its notable deviations from theory, Figure 2. While the assumed dynamic model must be verified and possibly adapted for other protein systems, this robust fitting behavior might hold true in the general case. To this end, we expect that Figure 5 could be used similarly to the L‐curve, Figure 3, by evaluating the relative changes in the fitting results upon variation of the dynamics.
Finally, to compare φ and ψ in terms of actual distributions, a selection of flexible residues is compared to the Lange ensemble [81] in Figure 6.
Q40 is part of a turn‐motif preceding a β‐annotated segment from 41 to 45. While featuring mostly right‐handed helical propensities in the PDB ensemble (a), there is noticeable density in the left‐handed region as well. This feature is reproduced by the CCR‐derived density (b), albeit with slightly different populations. Still, and are in very good agreement. In addition, the CCR data seems to suggest a slight β‐propensity, which might be noteworthy considering the transition from turn to β‐strand.
E51 has no assigned secondary structure motif, but precedes a short turn segment, indicated by a small α‐propensity in the PDB ensemble (a) alongside a broadly populated β‐region. Again, the CCR‐derived densities are in strong qualitative and quantitative agreement.
Q62 is similar to E51. It has no assigned secondary structure but is located right before a short turn segment. However, in terms of and , the agreement between the PDB ensemble (a) and the CCR‐derived populations (b) appears rather poor. Still, the CCR fit (b) is quite remarkable. Broad β‐populations extending to negative ψ are well reproduced. Propensities of α and ζ [83] ( , ) in (a) are correctly predicted but more heavily weighted in (b). Interestingly, even the small ϵ‐populations around and are reproduced, which is expected to be very sparsely populated for non‐glycine residues. This presence of positive φ likely explains the noticeable and incorrect emphasis on left‐handed α in (b), which is far more frequent in the prior than ϵ and thus more pronounced.
R72 is annotated as bend‐like and part of the flexible C‐terminal tail. It follows a β‐strand stretch from 66–71. Its flexible nature is well‐reflected in the PDB ensemble (a) with broadly distributed populations in β, ζ and α‐regions. Interestingly, propensities with positive φ are not centered around canonical left‐handed α, but rather shifted towards smaller values of ψ. Overall, these propensities are well reproduced by the CCR‐derived distribution (b) with similar densities of β and right‐handed α. However, (b) appears far smoother than (a), a result of comparing 116 snapshots to a comparatively fine‐grained distribution. In addition, little ζ‐propensity is found in (b) and positive ψ values are biased towards the left‐handed α‐region, again illustrating the influence of the chosen prior.
As expected, average φ and ψ are not perfectly representative of (φ, ψ)‐distributions, especially in flexible residues. While noticeable deviations might be observed on average, there are still similarities to be discovered in detail. Overall, CCR‐guided MaxEnt reweighting appears well suited to characterize rigid and flexible residues alike. However, details can only be resolved as far as experiments and prior allow. Occasional artifacts of experimental deviations and conflicting prior assumptions are still noticeable and should be addressed with care. Conformational heterogeneity can reflect structural flexibility as well as inferential uncertainty, i.e., rigid residues can appear rather flexible (see Figure S4, Supporting Information). Including additional observables, such as further CCR rates, scalar couplings, chemical shifts or short‐range NOEs, can be expected to alleviate these effects. Quantitative agreements could be further improved by multiple measurements and/or alternative pulse sequences to allow for outlier corrections, cross‐validation and improved variance estimation.
In more general terms, it is important to emphasize that the information contained in CCR rates strongly depends on the prior employed. Detailed structural priors allow us to refine, calculate and/or analyze structural ensembles of folded proteins in terms of their dynamics.[ 17 , 18 , 19 , 20 ] Accurate models of both structure and dynamics make the encoded CSA tensors accessible.[ 15 , 16 ] The MaxEnt approach presented here provides a framework for the interpretation of CCR rates in cases of unspecific prior knowledge. While structural interpretations appear limited to backbone dihedral angle distributions as a consequence, we foresee a variety of possible extensions. Firstly, the interresidual CCR rate / might allow for joint analysis of sequential residues that would otherwise be treated independently. Secondly, by translating ensemble averaged observables into distributions over so‐called “collective variables”, our method might provide valuable inputs for metadynamics‐based MaxEnt biasing techniques.[ 71 , 84 , 85 ] Thirdly, our approach could of course be adapted to reweight a molecular dynamics simulation. [25] For this case in particular we expect CCR rates to provide valuable experimental constraints reflecting both structure and dynamics.
5. Conclusions
Cross‐correlated relaxation (CCR) rates were shown to resolve dihedral angle distributions of both rigid and flexible residues in Ubiquitin. This was achieved using a modern Maximum Entropy (MaxEnt) reweighting approach that accounts for the presence of structural flexibility in proteins.
While classical MaxEnt is ill‐equipped to deal with the errors that come with experimental data, we show that it can be recast into a simple regression scheme, retaining its low dimensionality and well‐defined optimality conditions. Crucially, the procedure does not depend on the number of ensemble members but only on the number of Lagrange multipliers, one for each experiment. This implicit way of modeling both entropy and conformational space proves to be very robust even outside its canonical scope. Although the classical Lagrangian is not strictly applicable, it can still be repurposed to approximate the Lagrange multipliers. Whereas many approaches achieve this by loosening the equality constraints, we derived a simple and robust χ²‐type cost function, which we expect to be particularly useful for MaxEnt biasing techniques that still build on rather cautious modifications of the Lagrange Dual.[ 22 , 69 , 70 , 71 ]
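The low dimensionality mentioned above can be illustrated with a minimal sketch of classical (equality‐constrained) MaxEnt reweighting. This is not the error‐aware regression of Eq. (16); it only shows that the unknowns are the Lagrange multipliers, one per observable, irrespective of the number of ensemble members. All names (`maxent_reweight`, `prior`, `observables`, `targets`) are hypothetical, and plain Newton steps on the dual are an assumption made for brevity.

```python
import numpy as np

def _weights(prior, observables, lam):
    """MaxEnt weights p_i proportional to q_i * exp(-sum_j lam_j * f_ij)."""
    log_w = np.log(prior) - observables @ lam
    w = np.exp(log_w - log_w.max())  # shift by the maximum for numerical stability
    return w / w.sum()

def maxent_reweight(prior, observables, targets, n_iter=50, tol=1e-10):
    """Minimize the dual ln Z(lam) + lam . targets with Newton steps.

    prior:       (n_states,) strictly positive prior weights summing to 1
    observables: (n_states, n_obs) value of each observable in each state
    targets:     (n_obs,) ensemble averages to be reproduced
    """
    prior = np.asarray(prior, dtype=float)
    observables = np.asarray(observables, dtype=float)
    targets = np.asarray(targets, dtype=float)
    lam = np.zeros(observables.shape[1])  # one multiplier per observable
    for _ in range(n_iter):
        p = _weights(prior, observables, lam)
        mean = p @ observables
        grad = targets - mean  # gradient of the dual
        if np.linalg.norm(grad) < tol:
            break
        centered = observables - mean
        hess = centered.T @ (p[:, None] * centered)  # covariance = Hessian of the dual
        lam = lam - np.linalg.solve(hess, grad)
    return _weights(prior, observables, lam), lam

# Toy example (hypothetical numbers): five states, one observable, uniform prior.
values = np.arange(5.0).reshape(-1, 1)
prior = np.full(5, 0.2)
p, lam = maxent_reweight(prior, values, targets=[2.5])
```

In this toy case a single multiplier is optimized even though five weights are returned; the same would hold for a prior defined on a much finer grid.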
However, since MaxEnt requires errors to be treated somewhat ad hoc to arrive at a simple point estimate, free parameters must always be treated with care. Firstly, the accuracy of each experiment is represented by a variance parameter to weight the contribution of each observable. While easy to estimate post hoc for a system like Ubiquitin, studies of less well‐known proteins might require additional considerations. Secondly, the balance between residual and prior must be assessed with care. We have found that the widely‐used L‐curve criterion might be too biased towards small residuals. Our data suggests that entropy or suitable estimates thereof are better suited to properly avoid overfitting. This aspect is particularly important if the fitting results cannot be compared with independent data. Thirdly, the prior itself must not be overlooked. Since overfitting is suppressed by enforcing entropy, the prior conformations necessarily affect the solution. While a uniform prior merely flattens the predicted densities, more informed priors put stronger emphasis on regions deemed probable a priori. For Ubiquitin, this does not mean that a random coil prior necessarily leads to broad distributions for rigid residues. However, an artificial bias towards more strongly populated regions in (φ,ψ)‐space could be observed for residues with unusual propensities especially with positive φ. Assessing the influence of different priors is thus highly recommended.
Lastly, the forward model relating experimental data to structural information is of crucial importance. In this work, we examined Ubiquitin under the oversimplified assumption of isotropic molecular tumbling without internal dynamics. Most notably, this implies that dynamical and structural contributions factorize, reducing the effect of dynamics to a simple scaling factor. While this model did not yield quantitative agreements of comparable quality between different CCR rates, the underlying (φ,ψ)‐distributions could still be resolved with surprising levels of detail, despite the presence of outliers and systematic deviations. These results suggest that the structural information encoded in CCR rates is capable of outweighing other sources of uncertainty, such as experimental errors, CSA tensor variations, simplified backbone geometries and imprecise dynamics. This robustness was observed even for residues with considerable amounts of conformational flexibility, indicating the unharnessed potential of CCR rates for studying disordered protein systems. While the simplified model of a rigid protein under isotropic tumbling must still be critically assessed and possibly adapted, we expect that CCR rates will allow us to better understand and potentially disentangle the subtle interplay of structure and dynamics in intrinsically disordered proteins.
6. Computational Methods
A total of seven different CCR interactions were employed, including our newly proposed experiment probing [14] and the closely related interference. The remaining five interactions have been described elsewhere.[ 27 , 28 , 29 , 30 , 32 ] To ensure internal consistency and reproducibility, all were calculated according to Eqs. (1) and (2) by rotating an Avogadro‐generated [86] backbone geometry with (Table S1, Supporting Information). Parameters were adapted primarily from Engh and Huber; [87] angles involving hydrogens were taken from Momany et al. [88] The principal axes of the carbonyl CSA tensor were set according to Teng et al. [89] The Z‐axis was defined as the cross product of the ‐O and the ‐ bond unit vectors, the X‐ and Y‐axis as clockwise rotations of the ‐O bond unit vector around the Z‐axis by 82° and −8°, effectively approximating the O‐ ‐N angle with 120°. The tensor components of Ubiquitin were taken from Cisnetti et al. [15] Specifically, σxx and σzz were set according to the reported averages as 249.4 ppm and 87.9 ppm. Following the suggested calibration, the average σyy was obtained from the chemical shifts (BMRB ID 17769 [90] ) as 191.1 ppm. The resulting angular dependencies are depicted in Figure 1. A correlation time τc of 4.1 ns was assumed, [91] and S² was treated as a free parameter. B₀ was set in accordance with the experimental magnetic field of 600 MHz.
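The construction of the CSA principal axes can be sketched as follows. This is an illustration under stated assumptions, not the code used in this work: the function names are hypothetical, the second bond vector is passed in generically as `bond_other`, and "clockwise by an angle a" is interpreted as a right‐handed rotation by −a about the Z‐axis.

```python
import numpy as np

def rotate_about_axis(vec, axis, angle_deg):
    """Rotate vec about axis by angle_deg (right-hand rule, Rodrigues formula)."""
    k = np.asarray(axis, dtype=float)
    k = k / np.linalg.norm(k)
    v = np.asarray(vec, dtype=float)
    t = np.deg2rad(angle_deg)
    return (v * np.cos(t)
            + np.cross(k, v) * np.sin(t)
            + k * np.dot(k, v) * (1.0 - np.cos(t)))

def csa_principal_axes(bond_o, bond_other, x_angle=82.0, y_angle=-8.0):
    """Return unit vectors (x, y, z) of the tensor frame.

    bond_o:     bond vector pointing to the carbonyl oxygen
    bond_other: second in-plane bond vector used for the cross product
    """
    u = np.asarray(bond_o, dtype=float)
    u = u / np.linalg.norm(u)
    w = np.asarray(bond_other, dtype=float)
    w = w / np.linalg.norm(w)
    z = np.cross(u, w)
    z = z / np.linalg.norm(z)
    # Assumption: a clockwise rotation by a equals a right-handed rotation by -a.
    x = rotate_about_axis(u, z, -x_angle)
    y = rotate_about_axis(u, z, -y_angle)
    return x, y, z

# Toy planar geometry (hypothetical): two bonds enclosing 120 degrees.
bond_o = np.array([1.0, 0.0, 0.0])
bond_other = np.array([np.cos(np.deg2rad(120.0)), np.sin(np.deg2rad(120.0)), 0.0])
x, y, z = csa_principal_axes(bond_o, bond_other)
```

Because the two rotation angles differ by exactly 90°, X and Y are orthogonal by construction, and under the sign convention assumed here the frame is right‐handed.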
For quantitative comparison, the ensemble of Lange et al. [81] was used (PDB code 2k39). CCR rates were calculated from φ and ψ for every structure according to Eqs. (1) and (2) using the above‐stated CSA tensor, backbone geometry and τc, assuming an order parameter S² of 0.7. Subsequent averaging over all 116 structures yielded the CCR rate predictions.
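The averaging described above can be sketched as follows. This is a minimal illustration, not the authors' code: `toy_rate` is a hypothetical stand‐in for the angular dependencies of Eqs. (1) and (2), and the function names are assumptions. It only shows that, under the rigid isotropic model, S² enters as a common scaling factor before averaging over the ensemble members.

```python
import numpy as np

def toy_rate(phi_deg, psi_deg):
    """Hypothetical angular dependence (second Legendre polynomial of an angle).

    Stand-in only; the actual forward model is given by Eqs. (1) and (2).
    """
    theta = np.deg2rad(np.asarray(psi_deg) - np.asarray(phi_deg))
    return 0.5 * (3.0 * np.cos(theta) ** 2 - 1.0)

def ensemble_average_rate(phi_deg, psi_deg, rate_fn, s2=0.7):
    """Scale the per-structure rates by S^2 and average over all structures."""
    phi = np.asarray(phi_deg, dtype=float)
    psi = np.asarray(psi_deg, dtype=float)
    per_structure = s2 * rate_fn(phi, psi)
    return float(per_structure.mean())

# Toy ensemble (hypothetical angles): theta = 0 for all members.
avg_aligned = ensemble_average_rate([-60.0, -120.0], [-60.0, -120.0], toy_rate)
# Toy ensemble with theta = 90 degrees for all members.
avg_perp = ensemble_average_rate([-60.0, -120.0], [30.0, -30.0], toy_rate)
```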
For MaxEnt regression according to Eq. (16), the variances were estimated by the median absolute deviation between the 2k39‐based predictions and the experimental values. The CCR rates were approximated as 360×360 arrays by rotating the backbone (Table S1, Supporting Information) in 1° steps. Analogously, the random coil prior was defined on a 360×360 grid using the coil library of Mantsyzov et al. [92] After discarding ill‐defined terminal segments as well as glycine and proline residues, a total of 147091 angle pairs was extracted and rounded to integers. The prior was then obtained as the normalized histogram of angle pair counts (Figure S6, Supporting Information). Out of 76 residues, 56 yielded five or more CCR rates. These residues were analyzed according to Eq. (16) with different β (0 to 1000) and S² (0.1 to 1.2). Minimization was achieved using the Levenberg‐Marquardt algorithm implemented in SciPy [93] version 1.0.0 with iteratively updated scaling. [94]
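Two of the preprocessing steps above, the histogram prior and the median absolute deviation, can be sketched as follows. This is an illustration under assumptions: the function names are hypothetical, angles are taken to lie in [−180°, 180°] with −180° and 180° sharing one bin, and the sketch returns the median absolute deviation itself without asserting how it is converted into the variance used in Eq. (16).

```python
import numpy as np

def ramachandran_prior(phi_deg, psi_deg):
    """Normalized 360x360 histogram of integer-rounded (phi, psi) pairs.

    Index 0 corresponds to -180 degrees (an assumed convention).
    """
    i = (np.rint(phi_deg).astype(int) + 180) % 360
    j = (np.rint(psi_deg).astype(int) + 180) % 360
    counts = np.zeros((360, 360))
    np.add.at(counts, (i, j), 1.0)  # accumulate repeated pairs correctly
    return counts / counts.sum()

def median_absolute_deviation(predicted, measured):
    """Median of |measured - predicted| over all residues of one experiment."""
    diff = np.asarray(measured, dtype=float) - np.asarray(predicted, dtype=float)
    return float(np.median(np.abs(diff)))

# Toy data (hypothetical): three angle pairs and three rates.
phi = np.array([-60.2, -60.4, 179.6])
psi = np.array([-45.1, -44.9, -180.0])
prior = ramachandran_prior(phi, psi)
mad = median_absolute_deviation([1.0, 2.0, 3.0], [1.5, 2.0, 5.0])
```

`np.add.at` is used instead of fancy‐index assignment so that angle pairs falling into the same bin are counted once each.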
Experimental Section
The sample of 2 mM C, N‐uniformly labeled Ubiquitin dissolved in 50 mM phosphate buffer of pH , was purchased from Giotto Biotech.
All experiments were performed at 298 K on a Bruker 600 MHz spectrometer equipped with a 5 mm room‐temperature probe. The experimental parameters are summarized in Table 1.
Table 1.
Experimental parameters for all data sets (dim ‐ dimensionality of the experiment, sw ‐ spectral width, td ‐ number of points acquired in a given dimension (real plus imaginary), version: J‐res ‐ J‐resolved type of experiment, ref/cross ‐ reference/transfer version of quantitative gamma experiment type, ns ‐ number of scans).
| CCR rate | Tc [ms] | dim | C’ sw [Hz] | C’ td | N sw [Hz] | N td | version | ns | time |
|---|---|---|---|---|---|---|---|---|---|
| | 28 | 3D | 2600 | 90 | 2300 | 140 | J‐res | 8 | 37 h |
| | 28 | 3D | 2600 | 96 | 2300 | 104 | J‐res | 4 | 14 h |
| | 28 | 2D | – | – | 2127 | 96 | ref | 32 | 1 h |
| | | | | | | | cross | 256 | 8 h |
| | 28 | 2D | – | – | 2127 | 220 | ref | 16 | 1 h |
| | | | | | | | cross | 128 | 8 h |
| | 28 | 2D | – | – | 2300 | 670 | ref | 4 | 1 h |
| | | | | | | | cross | 48 | 12 h |
| | 28 | 2D | – | – | 1850 | 120 | ref | 48 | 2 h |
| | | | | | | | cross | 384 | 16 h |
| | 66 | 3D | 1850 | 238 | 2200 | 192 | J‐res | 4 | 67 h |
The pulse sequence programs were prepared for Bruker spectrometers according to the original publications.[ 14 , 27 , 28 , 29 , 30 , 32 ]
For some pulse sequences minor optimizations were introduced. These will be described in a future publication alongside the pulse sequence used to measure . All pulse sequences are available from the authors upon request.
All experiments were performed using the conventional sampling scheme. Data were processed using the fast Fourier transform algorithm implemented in mddnmr. [95] The data were displayed and analyzed using Sparky. [96]
7. Appendix
The constrained optimization problem (4), (6), (7) is recast into the Lagrangian
| (17) |
The partial derivative with respect to is
| (18) |
Setting the partial derivative to zero, we obtain
| (19) |
Taking the exponential and rearranging yields
| (20) |
By applying the normalization condition (6),
| (21) |
the partition function is obtained as
| (22) |
To arrive at the Lagrange Dual (10), we first rearrange the Lagrangian (17) and then substitute for the MaxEnt distribution (8)/(20), making use of the normalization condition (6),
| (23) |
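In generic notation, the steps above follow the standard MaxEnt argument. The following sketch is an illustration under assumed symbols rather than a transcription of Eqs. (17)–(23): p_i denotes the ensemble weights, q_i the prior, f_ij the value of observable j in conformation i, F_j the corresponding experimental average, and λ_j and μ the Lagrange multipliers. Sign conventions for the multipliers may differ from those used in the main text.

```latex
\begin{align}
\mathcal{L} &= -\sum_i p_i \ln\frac{p_i}{q_i}
  - \sum_j \lambda_j \Big( \sum_i p_i f_{ij} - F_j \Big)
  - \mu \Big( \sum_i p_i - 1 \Big) \\
\frac{\partial \mathcal{L}}{\partial p_i} &= -\ln\frac{p_i}{q_i} - 1
  - \sum_j \lambda_j f_{ij} - \mu = 0 \\
p_i &= q_i \, e^{-1-\mu} \exp\Big( -\sum_j \lambda_j f_{ij} \Big) \\
\sum_i p_i = 1 \;\Rightarrow\; e^{1+\mu}
  &= \sum_i q_i \exp\Big( -\sum_j \lambda_j f_{ij} \Big) \equiv Z(\lambda) \\
p_i &= \frac{q_i}{Z(\lambda)} \exp\Big( -\sum_j \lambda_j f_{ij} \Big) \\
\mathcal{D}(\lambda) &= \ln Z(\lambda) + \sum_j \lambda_j F_j
\end{align}
```

The last line follows by inserting the MaxEnt distribution into the Lagrangian: the terms containing the ensemble averages of f_ij cancel and the normalization term vanishes, so only ln Z and the coupling to the experimental averages remain. Its gradient with respect to λ_j is F_j minus the ensemble average of f_ij, which vanishes exactly when the constraints are met.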
Conflict of interest
The authors declare no conflict of interest.
Acknowledgements
This work was supported by the Austrian Science Fund FWF (P28359 and P28937). A. Z.‐K. was supported by a FWF Lise‐Meitner Postdoctoral Fellowship (M 2084).
C. Kauffmann, A. Zawadzka-Kazimierczuk, G. Kontaxis, R. Konrat, ChemPhysChem 2021, 22, 18.
Contributor Information
Clemens Kauffmann, Email: clemens.kauffmann@univie.ac.at.
Prof. Robert Konrat, Email: robert.konrat@univie.ac.at.
References
- 1. Nilges M., Habeck M., Rieping W., C. R. Chim. 2008, 11, 356–369.
- 2. Tantos A., Han K.-H., Tompa P., Mol. Cell. Endocrinol. 2012, 348, 457–465.
- 3. Garcia-Pino A., Balasubramanian S., Wyns L., Gazit E., De Greve H., Magnuson R. D., Charlier D., van Nuland N. A. J., Loris R., Cell 2010, 142, 101–111.
- 4. Ferreon A. C. M., Ferreon J. C., Wright P. E., Deniz A. A., Nature 2013, 498, 390–394.
- 5. Motlagh H. N., Wrabl J. O., Li J., Hilser V. J., Nature 2014, 508, 331–339.
- 6. Tompa P., Trends Biochem. Sci. 2012, 37, 509–516.
- 7. Dyson H. J., Wright P. E., Chem. Rev. 2004, 104, 3607–3622.
- 8. Mittag T., Forman-Kay J. D., Curr. Opin. Struct. Biol. 2007, 17, 3–14.
- 9. Mittermaier A. K., Kay L. E., Trends Biochem. Sci. 2009, 34, 601–611.
- 10. Schwalbe H., Carlomagno T., Hennig M., Junker J., Reif B., Richter C., Griesinger C., et al., Methods Enzymol. 2002, 338, 35–81.
- 11. Vögeli B., Vugmeyster L., ChemPhysChem 2019, 20, 178–196.
- 12. Yang D., Kay L. E., J. Am. Chem. Soc. 1998, 120, 9880–9887.
- 13. Kloiber K., Schüler W., Konrat R., J. Biomol. NMR 2002, 22, 349–363.
- 14. Kauffmann C., Kazimierczuk K., Schwarz T. C., Konrat R., Zawadzka-Kazimierczuk A., J. Biomol. NMR 2020, 74, 257–265.
- 15. Cisnetti F., Loth K., Pelupessy P., Bodenhausen G., ChemPhysChem 2004, 5, 807–814.
- 16. Loth K., Pelupessy P., Bodenhausen G., J. Am. Chem. Soc. 2005, 127, 6062–6068.
- 17. Vugmeyster L., McKnight C. J., Biophys. J. 2008, 95, 5941–5950.
- 18. Fenwick R. B., Esteban-Martín S., Richter B., Lee D., Walter K. F. A., Milovanovic D., Becker S., Lakomek N. A., Griesinger C., Salvatella X., J. Am. Chem. Soc. 2011, 133, 10336–10339.
- 19. Fenwick R. B., Schwieters C. D., Vögeli B., J. Am. Chem. Soc. 2016, 138, 8412–8421.
- 20. Sabo T. M., Gapsys V., Walter K. F. A., Fenwick R. B., Becker S., Salvatella X., de Groot B. L., Lee D., Griesinger C., Methods 2018, 138–139, 85–92.
- 21. Hummer G., Köfinger J., J. Chem. Phys. 2015, 143, 243150.
- 22. Cesari A., Reißer S., Bussi G., Computation 2018, 6, 15.
- 23. Bottaro S., Bussi G., Kennedy S. D., Turner D. H., Lindorff-Larsen K., Sci. Adv. 2018, 4, eaar8521.
- 24. Köfinger J., Stelzl L. S., Reuter K., Allande C., Reichel K., Hummer G., J. Chem. Theory Comput. 2019, 15, 3390–3401.
- 25. Orioli S., Haahr Larsen A., Bottaro S., Lindorff-Larsen K., in Prog. Mol. Biol. Transl. Sci., Vol. 170, Elsevier, 2020, pp. 123–176.
- 26. Rangan R., Bonomi M., Heller G. T., Cesari A., Bussi G., Vendruscolo M., J. Chem. Theory Comput. 2018, 14, 6632–6641.
- 27. Reif B., Hennig M., Griesinger C., Science 1997, 276, 1230–1233.
- 28. Yang D., Konrat R., Kay L. E., J. Am. Chem. Soc. 1997, 119, 11938–11940.
- 29. Pelupessy P., Chiarparin E., Ghose R., Bodenhausen G., J. Biomol. NMR 1999, 14, 277–280.
- 30. Kloiber K., Konrat R., J. Biomol. NMR 2000, 17, 265–268.
- 31. Skrynnikov N. R., Konrat R., Muhandiram D. R., Kay L. E., J. Am. Chem. Soc. 2000, 122, 7059–7071.
- 32. Kloiber K., Konrat R., J. Am. Chem. Soc. 2000, 122, 12033–12034.
- 33. Chiarparin E., Pelupessy P., Ghose R., Bodenhausen G., J. Am. Chem. Soc. 2000, 122, 1758–1761.
- 34. Pelupessy P., Ravindranathan S., Bodenhausen G., J. Biomol. NMR 2003, 25, 265–280.
- 35. Kumar A., Grace R. C. R., Madhu P. K., Nucl. Magn. Reson. 2000, 37, 191–319.
- 36. Lipari G., Szabo A., J. Am. Chem. Soc. 1982, 104, 4546–4559.
- 37. Vögeli B., J. Chem. Phys. 2010, 133, 014501.
- 38. Jaynes E. T., Phys. Rev. 1957, 108, 171.
- 39. Csiszár I., Entropy 2008, 10, 261–273.
- 40. Shore J., Johnson R., IEEE Trans. Inf. Theory 1980, 26, 26–37.
- 41. Skilling J., in Maximum-Entropy and Bayesian Methods in Science and Engineering, Springer, 1988, pp. 173–187.
- 42. Paris J. B., Vencovská A., Int. J. Approx. Reasoning 1990, 4, 183–223.
- 43. Csiszar I., The Annals of Statistics 1991, 19, 2032–2066.
- 44. Csiszar I., in Maximum entropy and Bayesian methods, Springer, 1996, pp. 35–50.
- 45. Knuth K. H., Skilling J., Axioms 2012, 1, 38–73.
- 46. Williams P. M., British J. Philosophy Science 1980, 31, 131–144.
- 47. Caticha A., AIP Conf. Proc. 2004, 707, 75–96.
- 48. Caticha A., Giffin A., AIP Conf. Proc. 2006, 872, 31–42.
- 49. Wu Z., Phillips G. N. Jr., Tapia R., Zhang Y., SIAM Review 2001, 43, 623–642.
- 50. Bricogne G., Acta Crystallogr. Sect. A 1984, 40, 410–445.
- 51. Bricogne G., Acta Crystallogr. Sect. D 1993, 49, 37–60.
- 52. Jaynes E. T., Probability theory: The logic of science, Cambridge University Press, 2003.
- 53. Shannon C. E., Bell Syst. Tech. J. 1948, 27, 379–423.
- 54. Kullback S., Leibler R. A., Ann. Math. Stat. 1951, 22, 79–86.
- 55. Charnes A., Wager Cooper W., Atti Accad. Naz. Lincei, Cl. Sci. Fis., Mat. Nat., Rend. 1975, 58, 568–576.
- 56. Charnes A., Wager Cooper W., Seiford L., Math. Operationsforschung Statistik. Series Optimization 1978, 9, 21–29.
- 57. Alhassid Y., Agmon N., Levine R. D., Chem. Phys. Lett. 1978, 53, 22–26.
- 58. Rieping W., Habeck M., Nilges M., Science 2005, 309, 303–306.
- 59. Fisher C. K., Huang A., Stultz C. M., J. Am. Chem. Soc. 2010, 132, 14919–14927.
- 60. Olsson S., Frellsen J., Boomsma W., Mardia K. V., Hamelryck T., PLoS One 2013, 8, e79439.
- 61. Beauchamp K. A., Pande V. S., Das R., Biophys. J. 2014, 106, 1381–1390.
- 62. Różycki B., Kim Y. C., Hummer G., Structure 2011, 19, 109–116.
- 63. Mantsyzov A. B., Maltsev A. S., Ying J., Shen Y., Hummer G., Bax A., Protein Sci. 2014, 23, 1275–1290.
- 64. Mantsyzov A. B., Shen Y., Lee J. H., Hummer G., Bax A., J. Biomol. NMR 2015, 63, 85–95.
- 65. Phillips D. L., J. ACM (JACM) 1962, 9, 84–97.
- 66. Tikhonov A. N., Soviet Math. 1963, 4, 1035–1038.
- 67. Hoerl A. E., Kennard R. W., Technomet 1970, 12, 55–67.
- 68. Hoerl A. E., Kennard R. W., Technomet 1970, 12, 69–82.
- 69. White A. D., Voth G. A., J. Chem. Theory Comput. 2014, 10, 3023–3030.
- 70. Cesari A., Gil-Ley A., Bussi G., J. Chem. Theory Comput. 2016, 12, 6192–6200.
- 71. Amirkulova D. B., White A. D., Mol. Simul. 2019, 45, 1285–1294.
- 72. Miller K., SIAM J. Math. Analysis 1970, 1, 52–74.
- 73. Hansen P. C., SIAM Rev. 1992, 34, 561–580.
- 74. Marquardt D. W., J. Soc. Industrial Appl. Math. 1963, 11, 431–441.
- 75. Gull S. F., Daniell G. J., Nature 1978, 272, 686–690.
- 76. Golan A., Judge G., Perloff J. M., J. Am. Stat. Assoc. 1996, 91, 841–853.
- 77. Chen S. F., Rosenfeld R., A gaussian prior for smoothing maximum entropy models, Technical report, Carnegie-Mellon Univ Pittsburgh PA School of Computer Science, 1999.
- 78. Chen S. F., Rosenfeld R., IEEE Trans. Speech Audio Proc. 2000, 8, 37–50.
- 79. Dudík M., Phillips S. J., Schapire R. E., in International Conference on Computational Learning Theory, Springer, 2004, pp. 472–486.
- 80. Dudík M., Phillips S. J., Schapire R. E., J. Mach. Learn. Res. 2007, 8, 1217–1260.
- 81. Lange O. F., Lakomek N.-A., Farés C., Schröder G. F., Walter K. F. A., Becker S., Meiler J., Grubmüller H., Griesinger C., de Groot B. L., Science 2008, 320, 1471–1475.
- 82. Vögeli B., J. Biomol. NMR 2017, 67, 211–232.
- 83. Karplus P. A., Protein Sci. 1996, 5, 1406–1420.
- 84. Marinelli F., Faraldo-Gómez J. D., Biophys. J. 2015, 108, 2779–2782.
- 85. White A. D., Dama J. F., Voth G. A., J. Chem. Theory Comput. 2015, 11, 2451–2460.
- 86. Hanwell M. D., Curtis D. E., Lonie D. C., Vandermeersch T., Zurek E., Hutchison G. R., J. Cheminform. 2012, 4, 17.
- 87. Engh R. A., Huber R., Structure quality and target parameters, chapter 18.3, pp. 382–392, American Cancer Society, 2006.
- 88. Momany F. A., McGuire R. F., Burgess A. W., Scheraga H. A., J. Phys. Chem. 1975, 79, 2361–2381.
- 89. Teng Q., Iqbal M., Cross T. A., J. Am. Chem. Soc. 1992, 114, 5312–5321.
- 90. Cornilescu G., Marquardt J. L., Ottiger M., Bax A., J. Am. Chem. Soc. 1998, 120, 6836–6837.
- 91. Schneider D. M., Dellwo M. J., Wand A. J., Biochemistry 1992, 31, 3645–3652.
- 92. Mantsyzov A. B., Shen Y., Lee J. H., Hummer G., Bax A., MERA: Maximum Entropy Ramachandran map Analysis from NMR data, 2015, https://spin.niddk.nih.gov/bax/software/MERA (accessed August 26, 2020).
- 93. Jones E., Oliphant T., Peterson P., SciPy: Open source scientific tools for Python, 2001.
- 94. Moré J., in Numerical analysis, Springer, 1978, pp. 105–116.
- 95. Orekhov V. Y., Jaravine V., Mayzel M., Kazimierczuk K., MddNMR – Reconstruction of NMR spectra from NUS signal using MDD and CS, 2004–2020.
- 96. Goddard T. D., Kneller D. G., Sparky 3, University of California, San Francisco, 2002.
