The Journal of Chemical Physics. 2008 Sep 10;129(10):104511. doi: 10.1063/1.2975206

Z-matrix formalism for quantitative noise assessment of covariance nuclear magnetic resonance spectra

David A Snyder 1, Arindam Ghosh 2, Fengli Zhang 1, Thomas Szyperski 2, Rafael Brüschweiler 1,a)
PMCID: PMC2669766  PMID: 19044928

Abstract

Due to the limited sensitivity of many nuclear magnetic resonance (NMR) applications, careful consideration must be given to the effect of NMR data processing on spectral noise. This work presents analytical relationships as well as simulated and experimental results characterizing the propagation of noise by unsymmetric covariance NMR processing, which concatenates two NMR spectra along a common dimension, resulting in a new spectrum showing spin correlations as cross peaks that are not directly measured in either of the two input spectra. It is shown that the unsymmetric covariance spectrum possesses an inhomogeneous noise distribution across the spectrum, with the least amount of noise in regions whose rows and columns do not contain any cross or diagonal peaks and with the largest amount of noise on top of signal peaks. Therefore, methods of noise estimation commonly used in Fourier transform spectroscopy underestimate the amount of uncertainty in unsymmetric covariance spectra. Different data processing procedures, including the Z-matrix formalism, thresholding, and maxima ratio scaling, are described to assess noise contributions and to reduce noise inhomogeneity. In particular, determination of a Z score, which measures the difference in standard deviations of a statistic from its mean, for each spectral point yields a Z matrix, which indicates whether a given peak intensity above a threshold arises from the covariance of signals in the input spectra or whether it is likely to be caused by noise. Application to an unsymmetric covariance spectrum, obtained by concatenating 2D 13C–1H heteronuclear single quantum coherence (HSQC) and 13C–1H heteronuclear multiple bond correlation (HMBC) spectra of a metabolite mixture along their common proton dimension, reveals that for sufficiently sensitive input spectra the reduction in sensitivity due to covariance processing is modest.

INTRODUCTION

Because NMR measurement time is often limited by the achievable sensitivity, careful consideration must be given to the effect of NMR data processing on spectral noise. While for standard Fourier transform (FT) the effect of noise on spectra is well understood,1 more recent processing methods can have advantages, in particular, when the shortening of measurement time of multidimensional spectra is essential.2, 3, 4, 5, 6 However, many of these methods affect the noise signature resulting in changes in both the apparent and the actual sensitivity.4, 7

Due to its linear nature, the FT method converts a free induction decay that includes additive white noise into a spectrum that is superimposed on a homogeneous noise floor. This property allows a straightforward assessment of the signal-to-noise (S∕N) ratio by comparing signal intensities to a summary statistic, such as the standard deviation or the median absolute value of the noise floor in a peak-free region. Nonlinear methods, on the other hand, may affect the noise lying away from a signal peak differently from the noise on the signal peak itself, and thereby they may improve the apparent but not the actual sensitivity.

Covariance NMR is a recently introduced method for spectral resolution enhancement of multidimensional spectra.3 Direct covariance processing endows the indirect (or donor) dimension(s) of a spectrum with the same resolution and spectral width as the corresponding acceptor dimensions, which include the high-resolution detection dimension.3, 5, 8 Conversely, indirect covariance maps an indirect dimension of a spectrum onto the direct dimension.9 When the covariance spectrum is symmetric, application of the matrix square root strongly attenuates or eliminates artifacts due to relay effects and chemical shift near degeneracy.6 In fact, with regularization10 applied as necessary, the covariance transform followed by a matrix square root leaves the signal-to-noise properties of a spectrum essentially unperturbed.

The covariance NMR concept can be generalized to pairs of spectra, which has been referred to as “unsymmetric” covariance NMR, by multiplying the matrices belonging to two spectra along a common dimension resulting in a spectrum that is generally nonsymmetric.7 Unsymmetric covariance NMR has been demonstrated for small molecule NMR where it provides a rapid computational approach to correlate spin resonances for which the experimental measurement of correlations would be very time consuming.7, 11, 12, 13 A related concept was introduced in the context of “hyperdimensional” NMR of proteins14 and extensions have recently been reported for protein backbone assignment, such as COBRA (Ref. 15) and Burrow–Owl,16 where pairs of three-dimensional (3D) spectra are combined to four-dimensional (4D) spectra. It should be noted that, despite the covariance name, these spectra generally do not fulfill the mathematical properties of covariance matrices any longer.

Unsymmetric covariance NMR involves the matrix product of two spectra and, thus, is a nonlinear transformation. As a consequence, the propagation of noise through unsymmetric covariance processing is fundamentally different from FT spectroscopy. This paper demonstrates analytically, by simulation, and with experimental data how the uniform noise distribution of input FT spectra propagates through unsymmetric covariance processing in an inhomogeneous fashion, thereby endowing different regions of the unsymmetric covariance spectrum with different amounts of noise. The analytical relationships developed here provide a rigorous framework for assessing the sensitivity and experimental statistical errors of peaks in unsymmetric covariance spectra, and they are tested for both simulated and experimental spectra. This work does not address the separate issue of systematic errors of unsymmetric covariance spectra, for example, those due to chemical shift degeneracy.

THEORY

Consider two real N1×N2 two-dimensional (2D) FT spectra represented by matrices A and B that contain additive noise, where each noise element is an independent, identically distributed (i.i.d.) Gaussian random variable with zero mean and standard deviation σA and σB, respectively. Unsymmetric covariance processing multiplies the two spectra so that the common ω1 dimension is contracted,7

$$C = A^{T} B. \tag{1}$$

Equation 1 implies that element Ckl is the inner product of column k of A with column l of B, where both of these column vectors have N1 elements.
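As a minimal sketch of Eq. 1, assuming the spectra are stored as plain Python lists of rows (the function name `unsymmetric_covariance` and the toy data are illustrative and not taken from the paper), each element of C can be formed as the inner product of one column of A with one column of B:

```python
def unsymmetric_covariance(A, B):
    """Return C = A^T B for two spectra stored as lists of rows.

    A and B must share the number of rows N1 (the common dimension).
    """
    n1 = len(A)
    if len(B) != n1:
        raise ValueError("A and B must have the same number of rows (N1)")
    n_cols_a = len(A[0])
    n_cols_b = len(B[0])
    # C[i][j] is the inner product of column i of A with column j of B.
    return [
        [sum(A[k][i] * B[k][j] for k in range(n1)) for j in range(n_cols_b)]
        for i in range(n_cols_a)
    ]


# Toy 3x2 spectra (hypothetical): column 0 of A and column 1 of B
# share a "peak" in row 1, so C[0][1] carries the cross peak.
A = [[0.0, 0.1], [1.0, 0.0], [0.0, 0.2]]
B = [[0.1, 0.0], [0.0, 2.0], [0.3, 0.0]]
C = unsymmetric_covariance(A, B)
```

Here the only large element of C is the one whose two contributing columns have intensity in the same row, which is the situation the following paragraphs call a region of signal.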

To determine the signal-to-noise ratio of C we separately consider a region of signal and different regions of noise. A region of signal around matrix element (i,j) of C arises when columns i and j of A and B have one or several peaks at similar (or identical) positions. The expected signal at point (i,j) is

$$\mathrm{Signal} = C_{ij} = \sum_{k=1}^{N_1} \hat{A}_{ki}\hat{B}_{kj}, \tag{2}$$

where A^ki and B^kj denote the noise-free spectral data points belonging to the noise-affected data points Aki and Bkj, respectively.

Unlike for FT spectra, the amount of noise in C varies depending on the spectral region considered. For a peak-free region around point (p,q) where only noise is multiplied and coadded according to Eq. 1,

$$\mathrm{Noise}_{\mathrm{free}} = C_{pq} = \sum_{k=1}^{N_1} A_{kp}B_{kq}. \tag{3}$$

The elements Akp and Bkq are i.i.d. Gaussian distributed. While products of Gaussian random variables do not follow the Gaussian distribution,17 statistical tests of computationally generated samples of sums of products of Gaussian distributed random variables indicate that, for most practical applications, the sum of N1>25 products is in very good approximation Gaussian distributed.

We find (see EPAPS supporting information18) that the sum of products of i.i.d. Gaussian variables of mean zero has in very good approximation the variance

$$\mathrm{var}_{\mathrm{free}} = N_1\sigma_A^2\sigma_B^2. \tag{4}$$
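The baseline variance N1σA²σB² can be checked numerically. The following is a rough Monte Carlo sketch using only the Python standard library; the function name, the number of trials, the seed, and the chosen noise levels are illustrative assumptions, not values from the paper:

```python
import random
import statistics


def simulate_free_variance(n1, sigma_a, sigma_b, trials, seed=0):
    """Empirical variance of sum_k A_k * B_k for two pure-noise columns."""
    rng = random.Random(seed)
    sums = []
    for _ in range(trials):
        total = 0.0
        for _ in range(n1):
            # Product of two independent zero-mean Gaussian noise points.
            total += rng.gauss(0.0, sigma_a) * rng.gauss(0.0, sigma_b)
        sums.append(total)
    return statistics.pvariance(sums)


n1, sigma_a, sigma_b = 32, 1.0, 2.0
empirical = simulate_free_variance(n1, sigma_a, sigma_b, trials=20000)
predicted = n1 * sigma_a**2 * sigma_b**2  # baseline variance of Eq. 4
```

With these settings the empirical and predicted variances should agree to within a few percent; the residual difference is sampling error of the simulation rather than a failure of the formula.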

When Eq. 4 is combined with Eq. 1, one can define a S∕N ratio,

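$$\left(\frac{\mathrm{Signal}}{\mathrm{Noise}}\right)_{\mathrm{free}} = \frac{\sum_{k=1}^{N_1} A_{ki}B_{kj}}{\sqrt{N_1}\,\sigma_A\sigma_B}, \tag{5}$$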

for any data point Cij in a “peak-free” region, which is a region that neither aligns horizontally nor vertically with a cross peak (note that Aki, Bkj denote noise-affected data points).

However, some regions of C contain noise at a level that is different from the one determined above. These regions originate from the inner product of a column of B(A) that contains one or several signal peaks with a column of A(B) that contains only noise. These regions are located in C along a column (row) that contains a cross peak. The variances of these inner products are given by

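$$\mathrm{var}_{\mathrm{col}} = N_1\sigma_A^2\sigma_B^2 + \sum_{k=1}^{N_1}\hat{B}_{kj}^2\sigma_A^2 \quad \text{and} \quad \mathrm{var}_{\mathrm{row}} = N_1\sigma_A^2\sigma_B^2 + \sum_{k=1}^{N_1}\hat{A}_{ki}^2\sigma_B^2. \tag{6}$$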

It follows that the S∕N of a column and row that belong to a cross peak in C centered around point (i,j) is

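$$\left(\frac{\mathrm{Signal}}{\mathrm{Noise}}\right)_{\mathrm{col}\,j} = \frac{\sum_{k=1}^{N_1} A_{ki}B_{kj}}{\sigma_A\left(N_1\sigma_B^2 + \sum_{k=1}^{N_1}\hat{B}_{kj}^2\right)^{1/2}}, \qquad \left(\frac{\mathrm{Signal}}{\mathrm{Noise}}\right)_{\mathrm{row}\,i} = \frac{\sum_{k=1}^{N_1} A_{ki}B_{kj}}{\sigma_B\left(N_1\sigma_A^2 + \sum_{k=1}^{N_1}\hat{A}_{ki}^2\right)^{1/2}}. \tag{7}$$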

The variance of the cross peak, which is defined as the variance of the peak center over multiple identical experiments, is given by

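$$\mathrm{var}_{\mathrm{peak}} = N_1\sigma_A^2\sigma_B^2 + \sum_{k=1}^{N_1}\left(\hat{A}_{ki}^2\sigma_B^2 + \hat{B}_{kj}^2\sigma_A^2\right). \tag{8}$$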

Accordingly, the S∕N ratio of the peak, defined by the average peak height (i.e., averaged over multiple experiments) divided by its standard deviation, is

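$$\left(\frac{\mathrm{Signal}}{\mathrm{Noise}}\right)_{\mathrm{peak}} = \frac{\sum_{k=1}^{N_1}\hat{A}_{ki}\hat{B}_{kj}}{\left[N_1\sigma_A^2\sigma_B^2 + \sum_{k=1}^{N_1}\left(\hat{A}_{ki}^2\sigma_B^2 + \hat{B}_{kj}^2\sigma_A^2\right)\right]^{1/2}}. \tag{9}$$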

It should be noted that Eq. 9 does not only apply to a signal peak at location (i,j), but also to all locations in C where “noise ridges” intersect. The latter are caused by inner products of columns of A and B that contain signal peaks at different locations. Equation 8 subsumes Eqs. 4, 6 when considering that A^ki and B^kj are zero in the noise columns and rows, respectively (and are both zero in the baseline noise region). Thus, Eqs. 8, 9 quantify the noise variance and the signal-to-noise ratio for any point in the covariance spectrum. Equations 8, 9, however, assume knowledge of noise-free input spectra, which are experimentally unattainable. Substituting experimentally measured Aki and Bkj for their noise-free counterparts A^ki and B^kj in Eqs. 8 or 9 will thereby result in a biased estimator. On the other hand, an unbiased estimator for Eq. 8 can be derived (see EPAPS supporting information18)

$$\mathrm{var}_{\mathrm{unbiased}} = -N_1\sigma_A^2\sigma_B^2 + \sum_{k=1}^{N_1}\left(A_{ki}^2\sigma_B^2 + B_{kj}^2\sigma_A^2\right), \tag{10}$$

and the corresponding unbiased S∕N ratio (applicable for any point in the spectrum) is

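$$Z_{ij} = \left(\frac{\mathrm{Signal}}{\mathrm{Noise}}\right)_{\mathrm{unbiased}} = \frac{\sum_{k=1}^{N_1} A_{ki}B_{kj}}{\left[-N_1\sigma_A^2\sigma_B^2 + \sum_{k=1}^{N_1}\left(A_{ki}^2\sigma_B^2 + B_{kj}^2\sigma_A^2\right)\right]^{1/2}}. \tag{11}$$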
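The reason plain substitution is biased can be stated in one step: a measured point is the noise-free point plus zero-mean noise, so the expectation of Aki² is A^ki² + σA², and the sum over measured squares overestimates the corresponding noise-free sum in Eq. 8 by 2N1σA²σB². The sketch below is one possible point-by-point evaluation of the Z matrix under the assumption that the N1σA²σB² term therefore enters the unbiased estimator with a negative sign. The function name `z_matrix`, the toy input, and the guard against non-positive variance estimates are our additions and are not part of the paper:

```python
import math


def z_matrix(A, B, sigma_a, sigma_b):
    """Z score for every point of C = A^T B (sketch of Eqs. 10 and 11)."""
    n1 = len(A)
    n_a, n_b = len(A[0]), len(B[0])
    # Column-wise sums of squares of the measured (noisy) input spectra.
    ss_a = [sum(A[k][i] ** 2 for k in range(n1)) for i in range(n_a)]
    ss_b = [sum(B[k][j] ** 2 for k in range(n1)) for j in range(n_b)]
    base = n1 * sigma_a**2 * sigma_b**2
    Z = []
    for i in range(n_a):
        row = []
        for j in range(n_b):
            c_ij = sum(A[k][i] * B[k][j] for k in range(n1))
            # Assumed sign: the baseline term is subtracted to remove bias.
            var = ss_a[i] * sigma_b**2 + ss_b[j] * sigma_a**2 - base
            # Guard (our addition): the estimate can be <= 0 for noisy data.
            row.append(c_ij / math.sqrt(var) if var > 0 else 0.0)
        Z.append(row)
    return Z


# Hypothetical 4x2 spectra: column 0 holds a peak of height 10,
# column 1 holds only unit-amplitude "noise".
A = [[10.0, 1.0], [0.0, -1.0], [0.0, 1.0], [0.0, -1.0]]
B = [row[:] for row in A]
Z = z_matrix(A, B, sigma_a=1.0, sigma_b=1.0)
```

In this toy case the peak/peak element has the largest Z score, while the peak/noise elements that would form a ridge in C are divided by a correspondingly larger standard deviation.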

Equation 11 defines a Z matrix whose elements provide a Z score for testing the hypothesis that the intensity at (i,j) arises from a signal∕signal covariance. If the p value, associated with Zij via the standard normal distribution, is less than a critical probability α, the intensity Cij is not likely to arise due to noise but rather due to covariance of input signals. Equivalently, if Zij is greater than the critical Z value (Zcrit), Cij likely arises from a covariance of input signals rather than noise. Since this hypothesis test is implicitly done for every point in the spectrum, a modification (see EPAPS supporting information18), such as the Dunn–Sidak correction,19 is required to control the false positive rate.
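A short sketch of how a critical Z score could be obtained from a spectrum-wide α is given below. It assumes a one-sided test and the usual Dunn–Sidak per-test level 1−(1−α)^(1/m) for m tests; the details of the authors' procedure are in their supporting information and may differ. The function name `critical_z` is hypothetical:

```python
import math
from statistics import NormalDist


def critical_z(alpha, n_tests):
    """One-sided critical Z score after a Dunn-Sidak correction."""
    # Per-test level 1 - (1 - alpha)**(1/n_tests), computed in a
    # numerically stable way for very large n_tests.
    per_test = -math.expm1(math.log1p(-alpha) / n_tests)
    return NormalDist().inv_cdf(1.0 - per_test)


z_crit_sim = critical_z(0.01, 128 * 128)    # simulated 128x128 spectrum
z_crit_exp = critical_z(0.01, 2048 * 2048)  # 2048x2048 hypothesis tests
```

Under these assumptions the two values come out close to the critical Z scores of 4.85 and 5.85 quoted later in the text for the simulated and experimental spectra, respectively.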

Taken together, these results show that the noise distribution in an unsymmetrical covariance spectrum is in good approximation Gaussian with a standard deviation that varies for different spectral regions. Regions that neither align vertically nor horizontally with any signal peak contain less noise than a column or row that contains a peak. Generally,

$$(\mathrm{Noise})_{\mathrm{peak}} > (\mathrm{Noise})_{\mathrm{row/column}} > (\mathrm{Noise})_{\mathrm{free}}, \tag{12}$$

which implies that the precision of a spectrum is lowest precisely at the positions of the signal peaks and the cross sections between noise ridges. This behavior of unsymmetric covariance processing is in stark contrast to the 2D FT spectrum where random noise is evenly distributed over different regions. The second-most noisy regions are the “noise ridges” that align with peaks either horizontally or vertically (while these features are reminiscent of “t1 noise” in 2D FT spectra, their origin is entirely different). According to Eq. 6, the presence of a very intense signal in the ith column of A (or the jth column of B) results in increased noise for the ith row (respectively, the jth column) of C, possibly to the point where weaker cross peaks, with their intensities potentially further reduced due to noise [Eqs. 8, 10, 12] in that row (respectively, column), are obscured.

In the remainder of this section, two alternative approaches for the reduction of inhomogeneous noise effects are described, which use either thresholding or maxima ratio scaling.

Thresholding

The thresholding of regions in A and B in which signals are not expected (by removing all points in A and B for which the intensity is close to or below that of the noise floor) eliminates several of the terms in Eqs. 4, 6, 10. Thresholding prior to covariance computation helps to render the baseline noise of covariance spectra uniform and also eliminates the additional noise at the intersection of “noise ridges” where no peak is actually present. This leads to a clear reduction of the inhomogeneity of the noise in the covariance spectrum. However, thresholding will not improve the uncertainty of peak heights due to noise, since it cannot remove the noise superimposed onto the FT peaks that correlate to yield a covariance peak.
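Thresholding of an input spectrum can be sketched as follows, assuming the noise standard deviation σ has already been estimated from a peak-free region. The function name, the default of three standard deviations (the value used for the figures later in the text), and the toy data are illustrative:

```python
def threshold_spectrum(S, sigma, n_sigma=3.0):
    """Set every point of S with intensity below n_sigma * sigma to zero."""
    cutoff = n_sigma * sigma
    return [[x if x >= cutoff else 0.0 for x in row] for row in S]


# Hypothetical 2x3 spectrum with unit noise level.
S = [[0.5, -4.0, 3.0], [10.0, 2.9, 0.0]]
thresholded = threshold_spectrum(S, sigma=1.0)
```

Only points at or above the cutoff survive, so the subsequent matrix product contains no noise/noise or peak/noise terms from the zeroed regions; the noise riding on the retained peak points is left untouched, which is why the peak-height uncertainty does not improve.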

Maxima ratio scaling

Thresholding relies on the measurement of the noise level and requires some knowledge about the expected peak heights. Direct multiplication of the input matrices A^T and B according to Eq. 1, on the other hand, leads to the appearance of noise ridges along the rows and columns of all cross peaks. These features are caused by the inner product of a column of A that contains a peak with a column of B that contains only noise or vice versa. To suppress such effects, a differential scaling procedure referred to as “maxima ratio scaling” (mrs) can be applied by multiplying each element of C=A^T B by the weighting factor

$$W_{ij} = \frac{1}{\exp\left\{\left|\ln\left(\dfrac{\max_k(A_{ki})}{\max_l(B_{lj})}\right)\right|\right\}}. \tag{13}$$

The maxima ratio scaled unsymmetric covariance matrix Cmrs has elements

$$C_{ij}^{\mathrm{mrs}} = W_{ij}\sum_{k=1}^{N_1} A_{ki}B_{kj}, \tag{14}$$

where the weights Wij reflect the magnitude ratio of column i of A and column j of B. If the columns have a similar maximum value, which is the case if both columns either contain peaks of similar intensity or if both columns contain only noise, Wij≈1 and therefore Cij^mrs≈Cij. However, if column i contains one or several peaks and column j contains only noise, or vice versa, Wij≪1 and Cij^mrs≪Cij, i.e., mrs leads to the desired reduction of the ridges along the columns and rows of a cross peak.
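A minimal sketch of maxima ratio scaling is shown below. It assumes that the weight depends on the absolute value of the log ratio of the column maxima, so that it equals the smaller of the ratio and its inverse, and that every column maximum is positive. Function names and the toy data are illustrative and not from the paper:

```python
import math


def mrs_weights(A, B):
    """Weights W_ij = exp(-|ln(max_k A_ki / max_l B_lj)|), cf. Eq. 13."""
    n1 = len(A)
    max_a = [max(A[k][i] for k in range(n1)) for i in range(len(A[0]))]
    max_b = [max(B[k][j] for k in range(n1)) for j in range(len(B[0]))]
    if min(max_a) <= 0 or min(max_b) <= 0:
        # Assumption of this sketch: the log ratio needs positive maxima.
        raise ValueError("column maxima must be positive")
    return [[math.exp(-abs(math.log(ma / mb))) for mb in max_b] for ma in max_a]


def mrs_covariance(A, B):
    """Element-wise product of the weights with C = A^T B, cf. Eq. 14."""
    W = mrs_weights(A, B)
    n1 = len(A)
    return [
        [W[i][j] * sum(A[k][i] * B[k][j] for k in range(n1))
         for j in range(len(B[0]))]
        for i in range(len(A[0]))
    ]


# Hypothetical input: column 0 holds a strong peak, column 1 only weak points.
A = [[100.0, 1.0], [0.5, 0.8]]
B = [row[:] for row in A]
W = mrs_weights(A, B)
C_mrs = mrs_covariance(A, B)
```

Columns with matching maxima keep a weight of one, while a peak column paired with a weak column is scaled by the ratio of their maxima, here 0.01, which is what suppresses the ridges.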

MATERIALS AND METHODS

Simulations

In order to test the equations derived in the Theory section, numerical simulations of model spectra were performed where each spectrum has a statistically independent noise floor and contains a single peak in the same location along the axis concatenated via covariance. The spectral widths of both the direct and the indirect dimensions are 2.5 ppm with N1=32 and N2=128 data points. For the present purposes, “signal spectra” with a single diagonal peak at 1 ppm and a full width at half maximum peak height of 12.75 Hz (assuming a 600 MHz resonance frequency to convert from Hz to ppm) were simulated in the program MATLAB.20 Gaussian distributed noise floors were generated in MATLAB and added to the signal spectrum to generate two independent noise and signal spectra to be combined via unsymmetric covariance processing according to Eq. 1.

Experimental

2D 13C–1H–HSQC and 13C–1H–HMBC spectra were recorded at 298 K for a mixture of seven common metabolites at natural 13C abundance, namely, D-carnitine, D-glucose, L-glutamine, L-histidine, L-lysine, myo-inositol, and shikimic acid (each at a concentration of 10 mM in D2O), on a Bruker AVANCE 800 spectrometer equipped with a cryogenic probe. The direct 1H dimension of each spectrum was acquired with 2048 complex points and a spectral width of 8013 Hz. The indirect 13C dimensions were acquired with 1024 complex points. The 13C spectral widths of the HSQC and HMBC spectra were set to 50 314 and 32 206 Hz, respectively. For the HMBC spectrum, the “magnitude” spectrum was calculated after FT.21

Both data sets were processed using NMRPipe,22 leaving each spectrum unapodized in the direct dimension (to be concatenated via covariance), while the indirect dimensions were subjected to exponential-to-Gaussian apodization. The indirect, unsymmetric covariance calculations, yielding an HMBC-HSQC covariance spectrum,12 were performed in MATLAB.

RESULTS AND DISCUSSION

Theory

The signal-to-noise ratio of a covariance peak has according to Eq. 9 a characteristic dependence on the S∕N ratio of the input spectra A and B as well as on the size N1 of the concatenated dimension. Figure 1 shows this dependence for input spectra A and B that have a single peak represented by the Kronecker delta function for variable σA and σB. In Fig. 1a, σ=σA=σB takes values 0.05, 0.1, and 0.2 corresponding to S∕N ratios of 20:1, 10:1, and 5:1, respectively. For increasing N1, the sensitivity decreases with O(1∕√N1). The spectrum with the lowest sensitivity is most strongly affected for increasing N1, whereas the change in sensitivity of the highest sensitivity spectrum is relatively modest. Figure 1b shows the situation when the sensitivity of spectrum A is constant at 20:1 while the S∕N of spectrum B takes the values 20:1, 10:1, and 5:1. The S∕N of the resulting unsymmetric covariance spectrum C steadily decreases with increasing N1 and it has a S∕N that is always below that of the less sensitive input spectrum.
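For unit-height Kronecker delta peaks the sums in Eq. 9 collapse to single terms, giving S∕N = 1∕(N1σA²σB² + σA² + σB²)^(1∕2). The sketch below evaluates this special case; the function name and the sampled N1 values are illustrative choices:

```python
import math


def covariance_peak_snr(n1, sigma_a, sigma_b):
    """Eq. 9 for unit-height, single-point peaks in both input spectra."""
    variance = n1 * sigma_a**2 * sigma_b**2 + sigma_b**2 + sigma_a**2
    return 1.0 / math.sqrt(variance)


# Both inputs at S/N = 20 (sigma = 0.05), for a few sizes of the
# concatenated dimension.
snr_by_n1 = {n1: covariance_peak_snr(n1, 0.05, 0.05) for n1 in (32, 128, 512, 2048)}

# Small-N1 limit: the input S/N divided by the square root of two.
small_n1_limit = covariance_peak_snr(0, 0.05, 0.05)
```

The values fall monotonically with N1, and once the N1σA²σB² term dominates the variance, the S∕N scales as 1∕√N1.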

Figure 1.

Signal-to-noise (S∕N) ratio of a peak arising from the covariance of a pair of peaks, computed using Eq. 9 as a function of N1 and the S∕N ratio of the input peaks. (a) S∕N for the covariance between peaks each having the indicated (5, 10, and 20) S∕N values. (b) S∕N for the covariance between a peak with S∕N=20 and a peak with the indicated S∕N. Note that the lower the signal to noise of the weaker peak, the lower the signal to noise of the covariance peak. However, in the limit where the weaker peak is much weaker than the stronger peak, so long as N1 is small, the signal to noise of the covariance peak approaches that of the weaker peak. The sensitivity of an unsymmetric covariance spectrum, for small values of N1, is not much lower than that of the less sensitive of the two spectra subject to covariance, with signal-to-noise values decreasing at most by a factor of 2^(1∕2) from that of the less sensitive of the two input spectra.

Simulations

Numerical simulations were performed to test the results presented in the Theory section. The model spectrum A with S∕N=45 [Fig. 2a] was unsymmetrically covariance processed with a spectrum B that has a signal peak identical to that of spectrum A, but a different random Gaussian noise floor with the same standard deviation. The corresponding unsymmetric covariance spectrum C=A^T B, depicted in Fig. 2b, demonstrates the presence of an uneven noise floor with noise ridges in the same row and column as the signal peak. The variances given in Table 1 quantify the difference between the relatively low variance of baseline noise in covariance spectra and the higher variance in noise intensities in the same column or row as the covariance peak.

Figure 2.

Noise propagation through unsymmetric covariance. (a) Simulated (noisy) input spectrum A with S∕N=45. (b) Covariance spectrum A^T B, where B has the same signal peak as A and the same noise level. (c) The variance calculated using Eq. 10 at each point of the covariance spectrum. (d) Z matrix calculated according to Eq. 11. (e) Same as (d) after setting to zero all elements with a S∕N ratio less than Zcrit (4.85), the Z score belonging to the critical p value for which the Dunn–Sidak correction yields a spectrum-wide α=0.01. (f) The covariance spectrum produced by thresholding, i.e., by setting all elements of A and B less than 3σ to zero prior to covariance. In (a), (b), (d), (e), and (f) the cross peak is truncated to highlight noise features: the actual peak heights are 44, 2112, 33, 33, and 2101, respectively.

Table 1.

Variance of noise intensities in a simple unsymmetric covariance spectrum: simulation vs theory.

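| Location^a | Simulation^b: mean | Simulation^b: std. dev. | Simulation^b: example^d | Theory^c: idealized | Theory^c: mean | Theory^c: std. dev. |
| --- | --- | --- | --- | --- | --- | --- |
| Free (noise) | 32.1 | 1.10 | 31.3 | 32 | 32.1 | 11.3 |
| Row | 2150 | 265.3 | 2579 | 2174 | 2159 | 109.5 |
| Column | 2167 | 320.2 | 1936 | 2174 | 2185 | 93.3 |
| Peak | 4341 | 679.2 | 0 | 4316 | 4315 | 150.6 |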
^a Variance measures var[location] defined in the Theory section.

^b Statistics given for varfree, varrow, and varcol represent averages over 10 000 simulations.

^c Idealized theory uses the signal spectra to calculate var[location]. Additionally, var[location] is calculated using Eq. 10 and then taking the mean (and standard deviation) for all points in the given location over the simulations performed (100 replications for varfree, varrow, and varcol and 100 rounds of 100 replications for varpeak).

^d Variances for the simulation displayed in Fig. 2.

Table 1 gives both idealized variance estimates, calculated from Eqs. 4, 6, 8 using noise-free spectra, and estimates obtained using Eq. 10; together they demonstrate the excellent correspondence between theory (both idealized and using unbiased estimators with simulated noisy input data) and simulation. Figure 2c plots the variance predicted by Eq. 10 on a point-by-point basis, displaying the expected “noise ridges” and highest variance at the location of the peak itself. Division of the signal by the square root of the predicted variance [Eq. 11] yields a noise floor that, like the noise floor in an FT spectrum, is homogeneous [Fig. 2d]. Figure 2e thresholds Fig. 2d at the critical Z score of 4.85, as adjusted by the Dunn–Sidak correction19 for 128×128=16 384 hypothesis tests, demonstrating that the Z score is an effective statistic for distinguishing between signal and noise in the covariance spectrum.

While thresholding eliminates most of the noise along the noise floor, including the noise ridges [Fig. 2f], the variance in peak height, calculated analogously to the corresponding variance reported in Table 1, is 4319 (with a standard deviation of 661) indicating that thresholding does not significantly improve the precision of the covariance peak intensities. For comparison, maxima ratio scaling suppresses noise ridges (Fig. 3) and yields 5687±886 for the corresponding variance in peak height. Thus, the precision of peak intensities produced by the mrs method is only slightly lower than the one obtained by thresholding.

Figure 3.

Covariance of simulated spectra (as described in text) subjected to maxima ratio scaling (mrs). The peak is truncated and has an actual height of 2059.

The simulations described above were repeated with a S∕N ratio of 7 for both input spectra (Fig. 4). Because varfree [defined by Eq. 4] scales with σA²σB² while the additional noise present in varcol and varrow scales with σA² and σB², respectively, decreased signal-to-noise results in less pronounced noise ridges (Fig. 4b) than are seen in Fig. 2. The chosen S∕N ratio yields a covariance peak with S∕N [estimated by Eq. 11] of 5, which is just above Zcrit [Fig. 4d] and only slightly lower than the input S∕N ratio of 7. The mean local noise variances estimated by Eq. 10 for varfree, varrow, and varcol are 6.25×10^4, 1.42×10^5, and 1.63×10^5, respectively, which closely match the variances directly calculated from the covariance spectrum (6.21×10^4, 1.37×10^5, and 1.94×10^5, respectively). Together with Table 1, these results demonstrate the applicability of Eqs. 10, 11 in estimating the signal-to-noise ratio of covariance spectra over a wide S∕N range.

Figure 4.

Noise propagation through unsymmetric covariance. (a) Simulated input spectrum A as in Fig. 2 with S∕N=7. (b) Covariance spectrum A^T B, where B has the same signal peak as A and the same noise level. (c) Z matrix calculated according to Eq. 11. (d) Same as (c) after setting to zero all elements with a S∕N ratio less than Zcrit (4.85), cf. Fig. 2. (e) The covariance spectrum produced by thresholding, i.e., by setting all elements of A and B less than 3σ to zero prior to covariance. (f) Covariance of simulated spectra subjected to maxima ratio scaling (mrs). In each panel, the cross peak is truncated to highlight noise features: the actual peak heights are 48, 2387, 5, 5, 2170, and 2246, respectively.

Experiment

The variance of the measured spectral intensities in a peak-free region of the HMBC (A) spectrum is 58.5 while the variance of intensities in a peak-free region of the HSQC spectrum (B) is 108.8. Equation 4 predicts a variance of 2.61×10^7 for regions of the unsymmetric HMBC-HSQC covariance spectrum (C=A^T B) that do not align (either by row or column) with covariance peaks. The region of the covariance spectrum representing the covariance of the peak-free regions of the HSQC and HMBC spectra used to evaluate the noise levels of those input spectra has calculated intensities with a variance of 2.69×10^7, which differs by less than 5% from the predicted variance. Such a minor discrepancy may result from “colored noise” effects.23

Table 2 compares noise variances calculated according to Eq. 10 for two randomly selected covariance peaks, which are (1) the lysine Cγ–Cε cross peak at (24.12,41.76) ppm and (2) the carnitine Cβ–Cα cross peak at (45.66,66.79) ppm. Equation 10 predicts noise generally within 5% of the expected variance in noise intensities. The S∕N values for peaks 1 and 2 roughly correspond to the S∕N values of their associated traces in the FT spectra subject to covariance (Table 3).

Table 2.

Variances of column∕row noise and peak intensity for two covariance peaks.

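| | Experiment^a: column | Experiment^a: row | Theory^b: column | Theory^b: row | Relative error (%)^c: column | Relative error (%)^c: row |
| --- | --- | --- | --- | --- | --- | --- |
| Peak 1^d | 9.56×10^9 | 1.30×10^10 | 7.56×10^9 | 1.24×10^10 | 26.5 | 5.1 |
| Peak 2^d | 4.63×10^9 | 3.32×10^9 | 4.69×10^9 | 3.44×10^9 | 1.4 | 3.5 |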
^a Variances calculated from intensities (in indicated, peak-free locations relative to the center of the peak at hand) in the covariance spectrum obtained by multiplying an experimentally measured HMBC spectrum with a corresponding HSQC spectrum.

^b Values obtained from Eq. 10.

^c Difference between experiment and theory relative to the theoretical value.

^d Peaks in locations indicated in text.

Table 3.

Expected signal-to-noise ratios for two covariance peaks.

| | Covariance^a | HSQC^b | HMBC^b |
| --- | --- | --- | --- |
| Peak 1^c | 437 | 664 | 492 |
| Peak 2^c | 294 | 346 | 258 |
^a Expected S∕N ratio according to Eq. 11.

^b S∕N ratio (peak intensity divided by standard deviation of measured intensities in a representative peak-free region) for the maximally intense peak contributing to the given covariance peak in the indicated spectrum.

^c Peaks in locations indicated in text.

As is the case with simulated data, the increased noise in the same column∕row of covariance peaks is visible upon plotting of covariance spectra [Fig. 5a], which shows the enhancement of noise in the same column and row as the (45.66,66.79) ppm cross peak in the covariance spectra as well as enhanced noise along the same row or the same column as other peaks. Some of the peaks leaving such ridges in the depicted region of the covariance spectrum are located in distal regions of the spectra, which are not displayed in Fig. 5a. [Note that Eqs. 6, 10 do not predict any attenuation of the noise “ridges” with increased distance from a peak along a column or row.]

Figure 5.

Selected spectral region taken from an experimental unsymmetric HSQC-HMBC covariance spectrum of a metabolite mixture using different processing schemes. (a) Covariance spectrum computed according to Eq. 1. (b) The variance calculated by Eq. 10 at each point in the covariance spectrum. (c) Z matrix calculated according to Eq. 11. (d) Same as (c) after setting to zero all elements having a S∕N ratio less than Zcrit (5.85), the Z score belonging to the critical p value for which the Dunn–Sidak correction yields a spectrum-wide α=0.01. (e) Spectrum computed using thresholding at 3σ applied to the input HMBC and HSQC spectra. (f) Spectrum computed using maxima ratio scaling according to Eqs. 13, 14. In (a), (c), (d), (e), and (f), the cross peak (corresponding to peak 2 in text and tables) has been clipped: the maximum amplitude of this peak is 263 [(a) and (e)], 294 [(c) and (d)], and 144 (f).

Consistent with the simulations shown in Figs. 2f, 3, and 4e, thresholding the input spectra or maxima ratio scaling during the covariance process strongly suppresses noise ridges as seen in Figs. 5e, 5f. Thresholding does not affect the peak height, while mrs reduces the peak height by a factor (<2), which is much smaller than ridge suppression (<10). The 3σ threshold used in generating the covariance spectrum shown in Fig. 5e is the same as that used in the simulation shown in Figs. 2f, 4e. Increasing this threshold would further attenuate the ridges at the risk of removing true signals from the input spectra and hence attenuate or eliminate true peaks in the resulting covariance spectrum.

Figures 5b, 5c display the variance calculated by Eq. 10 and the Z score calculated by Eq. 11 on a point-by-point basis. Figure 5d thresholds the Z score by its critical value of 5.85, as adjusted via the Dunn–Sidak correction for 2048×2048 hypothesis tests. The only points in the depicted region of the unsymmetric HMBC-HSQC covariance spectrum with signal to noise above the critical value are those associated with peak 2.

CONCLUSION

A chief utility of signal-to-noise comparisons is in determining whether a particular intensity is a signal of potential interest or whether it can be explained as random noise. For a Gaussian noise distribution, the signal-to-noise ratio corresponds directly to a Z score for testing such a hypothesis. In Fourier transform spectra, the homogeneity of the noise floor allows for a simple evaluation of the standard deviation of the noise distribution by computing statistics over a spectral region that is void of peak signals. When applied to nonlinearly processed datasets, such as the unsymmetric covariance spectra discussed here, this approach can lead to spectral distortions and the emergence of false peaks. Such peaks are most likely to occur at the intersection of noise ridges caused by strong signal peaks in the input spectra.

For two FT input spectra A and B whose noise floor standard deviations σA and σB have been determined by standard methods, an unbiased S∕N ratio matrix Z can be calculated for each point in the unsymmetric covariance spectrum. Because Z has a homogeneous noise distribution, its noise interpretation is analogous to the one of a FT spectrum. In fact, the S∕N ratio of an unsymmetric covariance peak is generally quite close in value to the S∕N ratios of the FT peaks whose covariance gives rise to that covariance peak.

From a statistical perspective, the Z matrix provides a Z score for each spectral point, which translates into a probability that a given Zij intensity represents an actual signal as opposed to random noise. This feature puts the unsymmetric covariance Z matrix on the same quantitative footing as its input FT spectra.

Alternatively, the inhomogeneity of the baseline noise in unsymmetric covariance spectra can be eliminated by thresholding, at the risk of eliminating weak cross peaks that are true, or by maxima ratio scaling. Application of the mrs method ensures that the covariance between two columns, one containing a peak and the other containing only noise, is smaller than the covariance between two columns that both contain noise. The mrs method, however, also scales down covariance peak intensities. A strong noise ridge in an unsymmetric covariance spectrum derives from a strong input peak. The mrs method reduces ridge intensity by scaling down such strong input peaks, which also reduces the intensity of covariance peaks associated with strong noise ridges. Thus, to a first approximation, mrs leaves apparent S∕N ratios of peaks unaffected.

Unsymmetric covariance processing generally has limited consequences for sensitivity. For example, a homonuclear 13C or 15N spectrum obtained via unsymmetric covariance has a sensitivity nearing that of proton-detected NMR rather than the sensitivity experimentally available with 13C or 15N direct detection, confirming previous assessments of indirect∕unsymmetric covariance processing.9,13 In addition, thresholding, maxima ratio scaling, or Z-matrix analysis provides an effective means for the suppression of ridge artifacts in the final unsymmetric covariance spectrum based on a minimal assumption, namely, that the Gaussian noise levels of the spectra subjected to covariance processing are known.

The Z-matrix formalism presented in this work provides a general link between standard multidimensional FT spectroscopy and covariance NMR. It enables one to quantify the sensitivity of covariance spectra and to evaluate whether covariance intensities arise from noise or from covariances between input signals, thereby helping NMR spectroscopists to optimize the acquisition and analysis of datasets subjected to this type of processing.

ACKNOWLEDGMENTS

This work was supported by the National Science Foundation (Grant No. MCB 0416899 to T.S.) and the National Institutes of Health (Grant No. GM 066041 to R.B.). The NMR experiments were conducted at the National High Magnetic Field Laboratory (NHMFL) supported by cooperative agreement DMR 0654118 between the NSF and the State of Florida.

References

  1. Ernst R. R., Bodenhausen G., and Wokaun A., Principles of Nuclear Magnetic Resonance in One and Two Dimensions (Clarendon, Oxford, 1987); Ernst R. R., Adv. Magn. Reson. 2, 1 (1966); Hoch J. C. and Stern A. S., NMR Data Processing (Wiley-Liss, New York, 1996).
  2. Billeter M. and Orekhov V., in Computational Science-ICCS 2003, International Conference, Melbourne, Australia and St. Petersburg, Russia, 2003, Proceedings, Part I, Lecture Notes in Computer Science, Vol. 2657, edited by Sloot P. M. A., Abramson D., Bogdanov A. V., Dongarra J. J., Zomaya A. Y., and Gorbachev Y. E. (Springer, Berlin∕Heidelberg, 2003); Kim S. and Szyperski T., J. Am. Chem. Soc. 125, 1385 (2003), doi:10.1021/ja028197d; Mobli M., Maciejewski M. W., Gryk M. R., and Hoch J. C., J. Biomol. NMR 39, 133 (2007); Orekhov V. Y., Ibraghimov I., and Billeter M., J. Biomol. NMR 27, 165 (2003), doi:10.1023/A:1024944720653; Szyperski T. and Atreya H. S., Magn. Reson. Chem. 44, S51 (2006), doi:10.1002/mrc.1817; Kupce E. and Freeman R., J. Am. Chem. Soc. 126, 6429 (2004), doi:10.1021/ja049432q.
  3. Brüschweiler R., J. Chem. Phys. 121, 409 (2004), doi:10.1063/1.1755652; Brüschweiler R. and Zhang F., J. Chem. Phys. 120, 5253 (2004), doi:10.1063/1.1647054.
  4. Donoho D. L., Johnstone I. M., Stern A. S., and Hoch J. C., Proc. Natl. Acad. Sci. U.S.A. 87, 5066 (1990); Mandelshtam V. A., Prog. Nucl. Magn. Reson. Spectrosc. 38, 159 (2001), doi:10.1016/S0079-6565(00)00032-7.
  5. Snyder D. A., Zhang F., and Brüschweiler R., J. Biomol. NMR 39, 165 (2007), doi:10.1007/s10858-007-9187-1.
  6. Trbovic N., Smirnov S., Zhang F., and Brüschweiler R., J. Magn. Reson. 171, 277 (2004), doi:10.1016/j.jmr.2004.08.007.
  7. Blinov K. A., Larin N. I., Kvasha M. P., Moser A., Williams A. J., and Martin G. E., Magn. Reson. Chem. 43, 999 (2005), doi:10.1002/mrc.1674.
  8. Snyder D. A., Xu Y. Q., Yang D. W., and Brüschweiler R., J. Am. Chem. Soc. 129, 14126 (2007).
  9. Zhang F. and Brüschweiler R., J. Am. Chem. Soc. 126, 13180 (2004), doi:10.1021/ja047241h.
  10. Chen Y. B., Zhang F., Snyder D., Gan Z. H., Bruschweiler-Li L., and Brüschweiler R., J. Biomol. NMR 38, 73 (2007).
  11. Blinov K. A., Larin N. I., Williams A. J., Mills K. A., and Martin G. E., J. Heterocycl. Chem. 43, 163 (2006).
  12. Blinov K. A., Larin N. I., Williams A. J., Zell M., and Martin G. E., Magn. Reson. Chem. 44, 107 (2006), doi:10.1002/mrc.1766.
  13. Blinov K. A., Williams A. J., Hilton B. D., Irish P. A., and Martin G. E., Magn. Reson. Chem. 45, 544 (2007).
  14. Kupce E. and Freeman R., J. Am. Chem. Soc. 128, 6020 (2006).
  15. Lescop E. and Brutscher B., J. Am. Chem. Soc. 129, 11916 (2007).
  16. Benison G., Berkholz D. S., and Barbar E., J. Magn. Reson. 189, 173 (2007).
  17. Springer M. D. and Thompson W. E., SIAM J. Appl. Math. 18, 721 (1970), doi:10.1137/0118065.
  18. See EPAPS Document No. E-JCPSA6-129-604835 for derivations of formulas used in the text. For more information on EPAPS, see http://www.aip.org/pubservs/epaps.html.
  19. Sidak Z., J. Am. Stat. Assoc. 62, 626 (1967).
  20. The Mathworks Inc., MATLAB, Natick, MA, 2005.
  21. Bax A. and Summers M. F., J. Am. Chem. Soc. 108, 2093 (1986), doi:10.1021/ja00268a061.
  22. Delaglio F., Grzesiek S., Vuister G. W., Zhu G., Pfeifer J., and Bax A., J. Biomol. NMR 6, 277 (1995), doi:10.1007/BF00197809.
  23. Grage H. and Akke M., J. Magn. Reson. 162, 176 (2003).

Articles from The Journal of Chemical Physics are provided here courtesy of American Institute of Physics
