Abstract
High-dimensional nuclear magnetic resonance (NMR) spectroscopy can assist in determining protein structure, but it requires time-consuming acquisition. Deep learning enables ultrafast reconstruction but is limited to spectra of up to three dimensions and cannot provide faithful reconstruction under unseen acceleration factors. Extending deep learning to handle higher-dimensional spectra and varying acceleration factors is desirable. However, scalability requires complex networks and more data, seriously hindering applications. To address this, we designed a network to learn data in one dimension (1D). First, time-domain signals were modeled as the outer product of 1D exponentials. Then, each 1D exponential was approximated with a rank-one Hankel matrix. Last, reconstruction error was corrected with a neural network. Here, we demonstrate robust 3D NMR reconstruction across acceleration factors (2 to 33) using one trained network. In addition, we find that reconstruction of 4D NMR is possible with artificial intelligence. This work opens an avenue for accelerating arbitrarily high-dimensional NMR.
Separable dimensional deep learning provides robust and fast high-dimensional nuclear magnetic resonance spectroscopy of proteins.
INTRODUCTION
Multidimensional (N ≥ 3) nuclear magnetic resonance (NMR) spectroscopy is a fundamental analytical tool for studying the structure of proteins. It can resolve the peak overlap that occurs in one-dimensional (1D) or 2D NMR spectra and substantially increase the quality of structure determination for proteins (1). For example, the presence of a faithful peak in 4D methyl-methyl nuclear Overhauser effect spectroscopy (NOESY) indicates that two methyls of the protein are close to each other (typically <0.5 nm) (2). This information is valuable for protein structure determination (1, 3, 4) and contributes to assigning methyl groups if the structure is known (5).
The acquisition of multidimensional NMR signals comes at the cost of exponentially increased sampling time (6). This time could be reduced by acquiring a partial time-domain signal, known as the free induction decay (FID), through nonuniform sampling (NUS). However, NUS introduces artifacts into the reconstructed spectrum (7–10). To remove these artifacts, state-of-the-art reconstruction methods leverage specific signal priors to infer the missing data points. These priors include (i) few nonzero spectral intensities (sparsity) in compressed sensing (CS) (9–12), (ii) a minimal number of 1D or 2D peaks (low-rank Hankel matrix representation) (7, 13, 14), and (iii) a minimal number of multidimensional peaks (low-rank tensor representation) (6, 15).
Challenge for multidimensional spectrum reconstruction
CS enables efficient reconstruction of 2D/3D/4D spectra due to its iterative procedure, which only relies on fast Fourier transform and soft thresholding (9, 10, 16). However, CS often fails to faithfully reconstruct broad peaks, which typically have low intensity, because they violate the underlying sparsity assumption, particularly when FID is highly undersampled (7). Low-rank Hankel matrix methods effectively address this limitation because their core assumption of a minimal number of peaks is independent of signal intensity (7, 17). The low-rank Hankel matrix methods can be extended to higher-dimensional spectra by minimizing the number of peaks along each dimension (15). However, methods based on low-rank Hankel matrices are time-consuming due to demanding matrix computation (section S1).
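The CS iteration described above, which alternates a fast Fourier transform with soft thresholding, can be sketched in a few lines. The following is a minimal illustration on a 1D NUS signal, not the implementation used in (9, 10, 16); the function names, the threshold schedule (`decay`), and the toy on-grid signal are assumptions made for this example.

```python
import numpy as np

def soft_threshold(z, thr):
    """Shrink each complex magnitude by thr while keeping the phase."""
    mag = np.abs(z)
    scale = np.maximum(1.0 - thr / np.maximum(mag, 1e-12), 0.0)
    return z * scale

def ist_reconstruct(y, mask, n_iter=200, decay=0.97):
    """Iterative soft thresholding for a 1D NUS FID.

    y    : FID with zeros at the unsampled positions
    mask : boolean array, True where a point was acquired
    """
    x = y.astype(complex)
    thr = 0.9 * np.abs(np.fft.fft(x)).max()   # assumed starting threshold
    for _ in range(n_iter):
        spec = soft_threshold(np.fft.fft(x), thr)  # promote spectral sparsity
        x = np.fft.ifft(spec)
        x[mask] = y[mask]                          # keep the acquired points
        thr *= decay                               # lower the threshold gradually
    return x

# Toy signal: three narrow, on-grid peaks (hypothetical values).
n = 128
t = np.arange(n)
full = sum(a * np.exp(2j * np.pi * k * t / n)
           for a, k in [(1.0, 10), (0.7, 40), (0.5, 90)])

rng = np.random.default_rng(0)
mask = np.zeros(n, dtype=bool)
mask[rng.choice(n, size=48, replace=False)] = True  # 37.5% sampling
y = np.where(mask, full, 0)

rec = ist_reconstruct(y, mask)
rel_err = np.linalg.norm(rec - full) / np.linalg.norm(full)
```

The toy peaks are undamped and therefore perfectly sparse. Broad, low-intensity peaks spread over many spectral points, which is the case where the sparsity assumption, and with it this kind of thresholding, becomes less reliable.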
In the era of artificial intelligence, deep learning NMR (DLNMR) (8) has enabled ultrafast and reliable performance in NMR spectrum reconstruction (8, 18–28), DEER data processing (29), parameter estimation from Laplace spectra (30, 31), NMR denoising (32), and peak picking (33). DLNMR can reconstruct high-quality spectra but only when the sampling rate matches the one used in training, which has two disadvantages: (i) The spectrometer needs more memory to store a separate set of model weights for each sampling rate. (ii) Users must choose the weights that match the sampling rate of the NUS spectrum, and the reconstruction quality degrades if the sampling rates are mismatched.
In practice, a mismatch between the training acceleration factor (AF) and the target AF inevitably leads to degraded reconstruction quality. DLNMR performs poorly when applied to data acquired with unseen AFs (34). Although extension to unseen AFs is possible, it is restricted to a narrow range of sampling rates (5 to 12%, equivalent to AF = 20 to 8.3) (34). Therefore, achieving faithful reconstruction of accelerated 3D NMR data acquisition across diverse AFs remains a critical challenge.
In addition, deep learning (DL) has not been applied to 4D NMR because of the exponentially increased data size (6, 15) and the hypercomplex signal format (35). Yet NUS is badly needed for 4D NMR because full sampling requires extremely long acquisition times (up to weeks or months).
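The sampling rates and AFs quoted in this paper are related by AF = 100 / (sampling rate in %). A small helper, whose name is an assumption made for this example, makes the conversion explicit:

```python
def acceleration_factor(sampling_rate_percent):
    """Acceleration factor for a NUS rate given in percent."""
    return 100.0 / sampling_rate_percent

# Sampling rates mentioned in the text (percent).
for rate in (50, 12, 9, 5, 3, 1.56):
    print(f"{rate}% -> AF = {acceleration_factor(rate):.1f}")
```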
Brief introduction to the proposed method
Herein, we propose rank-one approximation decomposition (ROAD), a DL method for reconstructing multidimensional (N ≥ 3) NMR spectra. ROAD can handle unseen AFs (i.e., AFs not encountered during training) by integrating principles from low-rank reconstruction techniques (7, 17) into a DL framework. ROAD is extended to 4D NMR through separable learning of 1D exponentials, thereby avoiding the need for a complex network structure and a massive training dataset.
ROAD integrates the merits of state-of-the-art reconstruction methods into a single unified framework. These merits include (i) fundamental modeling of time-domain signals in exponentials (6, 7, 36); (ii) low-rank tensor representation of multidimensional spectrum (6, 7, 15); (iii) theoretically best sparsity, i.e., minimal number of peaks, in each dimension (7, 37); and (iv) fast and enhanced reconstruction with DL correction (8, 38). The first merit makes the modeling and reconstruction reliable. The second and third merits allow the joint search of low-dimensional signal space. This leads to fast and robust reconstruction, particularly for data acquired with unseen AFs. The fourth merit compensates for residual errors introduced by the fast iterative reconstruction steps and further accelerates the overall process through neural network inference.
RESULTS
Experimental setups for NMR spectra and description of sampling schedules are provided in section S5.
Reconstruction for 3D spectra
For two retrospectively undersampled 3D NMR spectra, 2D NUS is applied to the indirect plane of the 3D spectrum. The proposed ROAD is compared with four state-of-the-art DL reconstruction methods: DLNMR (8), model-inspired deep thresholding network (MoDern) (34), weak peak reconstruction network (WPR-Net) (26), and joint time-frequency domain deep learning network (JTF-Net) (27) (details of training in section S8). In addition, ROAD is compared with three high-fidelity iterative reconstruction methods, namely, sparse multidimensional iterative lineshape-enhanced (SMILE) (39, 40), CS (10, 41), and hypercomplex low rank matrix factorization (HLRF) (35), which do not require any training set and can handle the hypercomplex FID (35). We reconstruct NMR data under Poisson gap NUS (42) in the indirect dimensions.
The false peak rate (FPR), the missing peak rate (MPR), the frequency error (FE) of true peaks (43, 44), and the squared Pearson correlation coefficient (SPCC) (45) of peak intensities are used to quantify the peak fidelity between the reconstructed and fully sampled spectra (definitions of the quantitative metrics are in section S11). Peak positions are detected with the peak picking of the CcpNmr software (46). Lower values of FPR, MPR, and FE and higher values of SPCC indicate better spectrum reconstruction.
For the 3D HNCACB (experiment that correlates the resonances of the amide 1H and 15N frequencies with those of the intra- and interresidue 13Cα and 13Cβ resonances of proteins) spectra under the sampling rate of 5%, among all compared DL methods, the proposed ROAD achieves the lowest FPR (2.7) and MPR (5.76), a low FE (0.52), and the highest SPCC (99.95). MoDern and DLNMR also achieve low FE (0.41 and 0.55) but present a high MPR (>14), indicating that some peaks are missing (marked with pink arrows in Fig. 1, G and H). Besides, under an increased sampling rate of 20%, MoDern and DLNMR fail to reconstruct the spectrum, with a high FPR (>65), while ROAD still achieves good performance on all metrics. As the sampling rate increases, JTF-Net reconstructs the spectrum with lower MPR (15.71) and FE (0.44), although both remain higher than those of ROAD (4.19 and 0.21). WPR-Net only supports spectra reconstruction under 1D NUS; thus, extending it to reconstruct 2D NUS of 3D spectra introduces obvious artifacts (Fig. 1E).
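Of these metrics, the SPCC has a standard closed form and can be sketched directly; the FPR, MPR, and FE follow the definitions in section S11 and are not reproduced here. The sketch below assumes that the two intensity lists are already matched peak by peak, and the function name and toy intensities are hypothetical. The reported values (e.g., 99.95) suggest that the paper expresses the SPCC in percent, whereas this helper returns a fraction.

```python
import numpy as np

def spcc(ref_intensities, rec_intensities):
    """Squared Pearson correlation of matched peak intensities (0 to 1)."""
    r = np.corrcoef(ref_intensities, rec_intensities)[0, 1]
    return float(r ** 2)

# Hypothetical matched peak intensities.
ref = np.array([1.0, 0.8, 0.5, 0.2, 0.1])
scaled = 0.9 * ref                                      # uniform scaling only
distorted = ref + np.array([0.0, -0.1, 0.1, 0.05, -0.05])

spcc_scaled = spcc(ref, scaled)
spcc_distorted = spcc(ref, distorted)
```

A uniform scaling of all peaks leaves the SPCC at its maximum, so the metric measures how well relative peak intensities are preserved rather than the absolute scale.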
Fig. 1. Reconstructed 3D HNCACB spectra of GB1-HttNTQ7 protein.
(A) Fully sampled spectrum; (B to E and G to J) reconstructions by SMILE, CS, HLRF, WPR-Net, DLNMR, MoDern, JTF-Net, and ROAD from 5% data; (L to O) reconstructions by DLNMR, MoDern, JTF-Net, and ROAD from 20% data. (F and K) Sampling patterns at NUS rates of 5 and 20%. Note: The first and second rows are the spectrum projections on 1H-15N and 1H-13C planes. Note: The original FID is fully sampled and in size of 90 × 44 × 1024. The NUS is performed on the indirect plane of 90 × 44. JTF-Net is trained by a 2D NUS dataset of sampling rates from 6 to 8%, and WPR-Net is trained by 1D NUS dataset under a sampling rate of 20%. Model weights are shared by the original authors. Other DL methods are trained by 2D NUS dataset under a sampling rate of 5%. Values in pink and blue denote the best evaluation metric under each sampling rate. Arrows in pink and blue mark missing peaks and false peaks, respectively. ppm, parts per million.
Compared with the non-DL method CS, the proposed ROAD can provide high-quality spectra with low FPR (2.7), MPR (5.76), and FE (0.52). Another non-DL method, SMILE, is sensitive to the shape of peaks and has a high MPR (12.04) due to peak distortion in the 13C dimension (Fig. 1B). Besides, ROAD takes only 3.2 s, which is faster than SMILE (23 s) and much faster than CS (169 s) and HLRF (8913 s) (computation platforms and reconstruction time are provided in section S12).
A similar observation can be found in the reconstruction of 3D HNCO (experiment that provides correlation between 15N and its attached 1H and 13C from the carbonyl carbon of previous amino acid) spectra of Azurin protein (Fig. 2). Compared with the reconstruction of 3D HNCACB spectra under a sampling rate of 5%, the 3D HNCO spectra reconstructed by all methods under a sampling rate of 3% have similar or better performance in FPR, MPR, and FE but slightly worse SPCC. The proposed ROAD achieves a high-quality reconstruction with the lowest FPR (0), MPR (3.73), and FE (0.28) among all methods. SMILE, CS, DLNMR, and MoDern lose some weak peaks, while HLRF, WPR-Net, and JTF-Net lose all weak peaks in the marked region (zoomed-in views are provided in section S11.4). As the sampling rate is increased to 50%, DLNMR and MoDern still fail to reconstruct spectra due to a lack of robustness to sampling rate. ROAD provides the best performance on all metrics among all DL methods.
Fig. 2. Reconstructed 3D HNCO spectra of Azurin protein.
(A) Fully sampled spectrum; (B to E and G to J) reconstructions by SMILE, CS, HLRF, WPR-Net, DLNMR, MoDern, JTF-Net, and ROAD from 3% data; (L to O) reconstructions by DLNMR, MoDern, JTF-Net, and ROAD from 50% data. (F and K) Sampling patterns at NUS rates of 3 and 50%. Note: The first and second rows are the spectrum projections on 1H-15N and 1H-13C planes. The original FID is fully sampled and in size of 60 × 60 × 1024. NUSs are performed on the indirect plane of 60 × 60. JTF-Net is trained by a 2D NUS dataset of sampling rates from 6 to 8%, and WPR-Net is trained by 1D NUS dataset under a sampling rate of 20%. Model weights are shared by the original authors. Other DL methods are trained by 2D NUS dataset under a sampling rate of 5%. The 2D planes of spectra marked with a red box are shown in section S11.4. Values in pink and blue denote the best metrics under sampling rates of 3 and 50%.
Furthermore, quantitative metrics under each sampling rate are calculated over 10 Monte Carlo trials (Fig. 3). The DL methods, JTF-Net and ROAD, and the non-DL methods are robust to various sampling rates. Under all sampling rates, ROAD is superior to JTF-Net and provides performance comparable to that of the non-DL methods in all quantitative metrics. For 3D HNCO spectra of Azurin protein, ROAD even yields no false peaks at sampling rates above 5% (Fig. 3E). More results of different NUS for 3D NMR and synthetic signals are provided in sections S9 to S11.
Fig. 3. Mean of quantitative metrics of reconstructed spectra.
(A to D) FPR, MPR, error of the center frequency of peaks, and SPCC of peak intensities of 3D HNCACB spectra of GB1-HttNTQ7 protein. (E to H) FPR, MPR, error of the center frequency of peaks, and SPCC of peak intensities of 3D HNCO spectra of Azurin protein. Note: Each data point represents the quantitative metric of one NUS experiment. Lower value in (A) to (C) and (E) to (G) and higher value in (D) and (H) mean better reconstruction.
Reconstruction for 4D spectra
For 3D NUS (4D NMR), ROAD is not retrained; the model trained for 2D NUS is extended directly. To ensure a faithful reconstruction by the compared methods, we compare ROAD with the reconstruction method used in the original paper of the corresponding spectrum.
The 4D NOESY spectrum of human mucosa-associated lymphatic tissue 1 (MALT1) protein was originally sampled in 84 hours at a NUS rate of 9% (AF = 11.1) (5) in three indirect dimensions. All sampled FID data points are used in reconstruction. The proposed ROAD is compared with a fast iterative sparse reconstruction method, CS (10, 41). Due to the lack of full sampling, we adopt the peak assignment to evaluate the reconstruction performance.
According to the projections of the reconstructed spectrum of human MALT1 protein (Fig. 4, A and B), ROAD can successfully handle the reconstruction of 4D spectra, as confirmed by the similar projections obtained by CS. The same peak assignment can also be provided by CS and ROAD (Fig. 4, C to F). The result demonstrates the feasibility of reconstructing 4D NMR spectra with DL. Besides, ROAD runs in 24 s, which is 12 times faster than CS (5 min).
Fig. 4. Reconstructed 4D 13C-13C-SF HMQC NOESY spectrum of human MALT1 protein.
(A and B) Reconstructions performed by CS and ROAD from 9% of the data. (C to F) The assignments of diagonal and cross peaks from (A) and (B), respectively. Note: The first, second, and third columns are projections of the reconstructed spectrum on 1H-1H, 13C-1H, and 13C-13C planes. The fourth and fifth columns represent the 2D planes in the 13C/1H dimensions of 24.702/0.817 ppm and 20.401/0.139 ppm. The original FID is NUS sampled and with a size of 32 × 40 × 40 × 1024, and NUS is performed on the first three dimensions.
Another 4D methyl heteronuclear multiple-quantum coherence (HMQC)–NOESY–HMQC spectrum of the isoleucine, leucine and valine methyl-labeled m04 protein of cytomegalovirus (39) is reconstructed by SMILE and ROAD. This spectrum was originally sampled at a NUS rate of 1.56% (AF = 64) in three indirect dimensions. Compared to SMILE, ROAD provides the reconstructed spectrum with similar peaks (Fig. 5). Besides, ROAD takes 19 min, which is 13 times faster than SMILE (4.4 hours).
Fig. 5. Reconstructed 4D methyl HMQC-NOESY-HMQC spectrum of the isoleucine, leucine and valine methyl-labeled m04 protein of cytomegalovirus.
(A and B) Reconstructions performed by SMILE and ROAD from 1.56% of the data. (C to F) The 2D planes extracted in 13C/1H dimension at 15.066/0.717 ppm and 23.474/0.558 ppm, respectively. Note: The first, second, and third columns are projections of the reconstructed spectrum on 1H-1H, 13C-1H, and 13C-13C planes. The original FID is NUS sampled and with a size of 56 × 80 × 80 × 1024, and NUS is performed on the first three dimensions.
DISCUSSION
Limitation of weak cross peaks
In the 4D NOESY spectrum of human MALT1 protein, cross peaks exhibit notably lower intensity than their corresponding diagonal peaks. ROAD can preserve the strong cross peaks but may lose some weak peaks (Fig. 6); for example, the diagonal peak is more than 100 times stronger than cross peaks 2 and 6. These weak cross peaks belong to the methyl groups of amino acids, which are located farther away from 705ValHga,Cga in MALT1 spatial structure than other methyl groups. The loss of weak peaks in the 4D NOESY spectrum has two causes:
Fig. 6. Assignment of peaks in 2D planes of the reconstructed 4D 13C-13C-SF HMQC NOESY spectrum of human MALT1 protein.
(A and B) Reconstructions performed by CS and ROAD from 9% of the data. The original FID is NUS sampled and with a size of 32 × 40 × 40 × 1024, and the 2D plane is extracted at the 13C/1H dimensions of 20.453/0.329 ppm.
1) Simple thresholding operator to extract the peak. The peak retrieval module uses five times the standard deviation (SD) of the input spectrum as the threshold for extracting peaks. If the previous block recovers the strong peaks with high quality, then an unrecovered weak peak can appear in the residual signal and be extracted (section S4.5). However, for NOESY spectra, the intensity of strong peaks (diagonal peaks) is hundreds of times that of some weak peaks (cross peaks). A slight error in a strong peak may exceed the intensity of the weak peaks, so that the unrecovered peaks fail to be extracted.
2) Mismatched reconstruction from the 2D NUS to 3D NUS spectrum. Reconstruction for 3D NUS is extended from the model for 2D NUS. Because the whole high-dimensional signal, rather than its 1D components, is used to train ROAD, limited computing memory cannot support effective training for 3D NUS. Thus, weights for 2D NUS are applied in the model for 3D NUS, which causes a mismatched dimension of signals and decreased reconstruction quality.
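The thresholding rule in item 1 can be illustrated with a toy 1D spectrum. This is a minimal sketch, assuming the SD is taken over the magnitude of the whole input; the function name and the peak values are hypothetical, and the actual peak retrieval module is described in section S4.5.

```python
import numpy as np

def retrieve_peaks(spectrum, k=5.0):
    """Return indices whose magnitude exceeds k times the SD of the input."""
    mag = np.abs(spectrum)
    threshold = k * np.std(mag)
    return np.flatnonzero(mag > threshold), threshold

# One strong (diagonal-like) peak and one weak (cross-like) peak, 100:1.
spectrum = np.zeros(1024)
spectrum[100] = 100.0
spectrum[500] = 1.0

found, threshold = retrieve_peaks(spectrum)
```

Because the strong peak dominates the SD, the threshold ends up well above the weak peak, and only the strong peak is returned. The same effect applies to a residual spectrum whenever the leftover error of a strong peak is larger than the weak peaks still to be recovered.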
Spectrum with zero-order and first-order phases
Besides fast reconstruction, compared with CS and SMILE, ROAD does not require phase correction before reconstruction. This advantage of ROAD comes from our two core designs: (i) The low rank property does not depend on the phase in the Hankel matrix. (ii) The neural network implicitly learns phase information.
Zero-order phase (Ph0) of spectrum arises from a mismatch between the reference phase and the receiver detector phase, while first-order phase (Ph1) arises from a time delay between excitation and detection (47). The NUS spectrum in Fig. 4 has a Ph0 of 90° in one of 13C dimensions. Another NUS spectrum in Fig. 5 has Ph0 of 87° and Ph1 of 180° in one of 13C dimensions. The successful reconstruction of traditional methods, CS and SMILE, relies on accurate phase correction (section S11.5). This process requires time-consuming manual calibration from the NUS spectrum. Without the prestep of phase correction, CS (Fig. 7) introduces obvious artifacts and may lose some peaks, while ROAD still successfully reconstructs spectra.
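The phase independence of the low-rank Hankel property can be checked numerically. For a sum of exponentials, a zero-order phase multiplies the whole FID by a constant, and a time delay multiplies each exponential by its own constant, so neither changes the rank of the Hankel matrix. The sketch below uses ordinary complex numbers and hypothetical peak parameters; `hankel`, `numerical_rank`, and `fid` are helper names introduced for this example.

```python
import numpy as np

def hankel(x, n_rows):
    """Hankel matrix with entries H[i, j] = x[i + j]."""
    n_cols = len(x) - n_rows + 1
    return np.array([x[i:i + n_cols] for i in range(n_rows)])

def numerical_rank(m, tol=1e-8):
    """Number of singular values above tol times the largest one."""
    s = np.linalg.svd(m, compute_uv=False)
    return int(np.sum(s > tol * s[0]))

def fid(t, ph0=0.0, delay=0.0):
    """Two damped exponentials with a zero-order phase and a time delay."""
    peaks = [(1.0, 0.02, 0.13), (0.5, 0.03, 0.31)]  # (amplitude, damping, frequency)
    signal = sum(a * np.exp((-d + 2j * np.pi * f) * (t + delay))
                 for a, d, f in peaks)
    return np.exp(1j * ph0) * signal

t = np.arange(64)
rank_plain = numerical_rank(hankel(fid(t), 32))
rank_phased = numerical_rank(hankel(fid(t, ph0=np.deg2rad(90.0), delay=0.5), 32))
```

Both ranks equal the number of peaks (two here), with or without the phase terms.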
Fig. 7. Reconstructed 4D 13C-13C-SF HMQC NOESY spectrum of human MALT1 protein.
(A and B) Reconstructions performed by CS and ROAD from 9% of the data. The Ph0 in one of 13C dimensions is not adjusted to zero before reconstruction. The first, second, and third columns are projections of the reconstructed spectrum on 1H-1H, 13C-1H, and 13C-13C planes.
In a single reconstruction, CS takes 5 min for the 4D NOESY spectrum of size 32 × 40 × 40 × 1024 (5), and SMILE takes 4.4 hours for another 4D NOESY spectrum with more FID data points, of size 56 × 80 × 80 × 1024 (39). High-quality spectra reconstructed by CS and SMILE may require multiple interactive rounds of phase correction and reconstruction by experienced experts. For someone who is not familiar with phase correction, the process is time-consuming. In contrast, ROAD needs only a single reconstruction, within 24 s for the spectrum of (5) and about 20 min for the spectrum of (39). Thus, ROAD reconstruction is faster than CS and SMILE. More reconstruction results of one 4D NUS NOESY spectrum (48) of spindle and kinetochore-associated protein 1 are provided in section S11.5.
Concluding remarks
In summary, we proposed ROAD, a scalable multidimensional DL network designed to accelerate 3D/4D NMR spectral reconstruction. To address challenges posed by network complexity and extensive training data requirements, we model time-domain signals as the outer product of 1D exponentials and train ROAD within a 1D signal space. The framework outperforms state-of-the-art DL methods, demonstrating robust 3D spectrum reconstruction across varying acceleration factors. Furthermore, through simple extension, we achieve the successful reconstruction of 4D NMR spectra via DL, opening an avenue for artificial-intelligence-driven fast data acquisition of arbitrarily high-dimensional NMR spectroscopy.
METHODS
Signal model of multidimensional spectrum
This design flow follows the mathematical modeling of multidimensional time-domain signals, also called the FID. A fully sampled $N$-dimensional FID is expressed as an outer product of hypercomplex exponentials (Fig. 8) (35, 36)

$$x(t_1, \ldots, t_N) = \sum_{r=1}^{R} a_r \prod_{n=1}^{N} e^{\mathrm{i}_n \phi_{r,n}}\, e^{\left(-\tau_{r,n} + \mathrm{i}_n 2\pi f_{r,n}\right) t_n} \qquad (1)$$

where $\sum_{r=1}^{R}$ is the addition of $R$ peaks, $\prod_{n=1}^{N}$ is the $N$-dimensional multiplication, and $t_n$ denotes the time variable of the $n$th dimension. The FID is hypercomplex because the imaginary unit in each dimension is different, i.e., $\mathrm{i}_n^2 = -1$ but $\mathrm{i}_n \mathrm{i}_m \neq -1$ if $n \neq m$ (section S3). For the $r$th peak, $a_r$ is the amplitude, and $\phi_{r,n}$, $\tau_{r,n}$, and $f_{r,n}$ denote the phase, damping factor, and normalized frequency in the $n$th dimension, respectively. The amplitude of each exponential is the same for each dimension. Hence, the subscript $n$ is not presented in $a_r$.
Fig. 8. Multidimensional modeling of NMR with outer product of exponentials.
(A) Spectrum. (B) FID.
The low-rank structure of the FID tensor $\mathcal{X}$ is threefold (see the toy example in section S2): (i) The tensor rank of $\mathcal{X}$ is $R$ (6, 15, 49). (ii) The rank of the factor matrix $\mathbf{U}_n$ is $R$ (6, 15). The factor matrix is defined by placing each single exponential in the $n$th dimension as a column, i.e., $\mathbf{U}_n = [\mathbf{u}_{1,n}, \ldots, \mathbf{u}_{R,n}]$. (iii) The Hankel matrix of the $r$th column of $\mathbf{U}_n$, i.e., $\mathcal{H}\mathbf{u}_{r,n}$, is rank one (15). The third property has been shown to provide high-fidelity reconstruction at the cost of long computation time (15). These low-rank properties point the way to a low-dimensional space in which the signal resides, thus making it possible to reconstruct a faithful spectrum from highly undersampled FID data.
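The three low-rank properties can be verified on a small synthetic 2D example. The sketch below is a simplification: it uses ordinary complex numbers with a single imaginary unit instead of the hypercomplex format, so the 2D FID is a matrix and its tensor rank reduces to the matrix rank. All peak parameters and helper names are hypothetical.

```python
import numpy as np

def exponential(n_points, phase, damping, freq):
    """Sampled 1D exponential exp(i*phase) * exp((-damping + i*2*pi*freq) * t)."""
    t = np.arange(n_points)
    return np.exp(1j * phase) * np.exp((-damping + 2j * np.pi * freq) * t)

def hankel(x, n_rows):
    """Hankel matrix with entries H[i, j] = x[i + j]."""
    n_cols = len(x) - n_rows + 1
    return np.array([x[i:i + n_cols] for i in range(n_rows)])

def numerical_rank(m, tol=1e-8):
    """Number of singular values above tol times the largest one."""
    s = np.linalg.svd(m, compute_uv=False)
    return int(np.sum(s > tol * s[0]))

# Hypothetical peaks: (phase, damping, normalized frequency) per dimension.
dim1 = [(0.0, 0.02, 0.10), (0.5, 0.03, 0.25), (1.0, 0.01, 0.40)]
dim2 = [(0.2, 0.04, 0.15), (0.0, 0.02, 0.30), (0.7, 0.05, 0.45)]
amps = [1.0, 0.6, 0.3]
n1, n2 = 60, 40

# Factor matrices: one exponential per column.
U1 = np.stack([exponential(n1, *p) for p in dim1], axis=1)
U2 = np.stack([exponential(n2, *p) for p in dim2], axis=1)

# FID as a sum of R outer products of 1D exponentials.
fid = sum(a * np.outer(U1[:, r], U2[:, r]) for r, a in enumerate(amps))

rank_fid = numerical_rank(fid)                      # property (i)
rank_factor = numerical_rank(U1)                    # property (ii)
rank_columns = [numerical_rank(hankel(U1[:, r], n1 // 2))
                for r in range(len(amps))]          # property (iii)
```

With three peaks, the FID and the factor matrix both have rank three, while the Hankel matrix of every single column has rank one. Property (iii) is the one that reduces the problem to 1D signals.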
Scalable DL reconstruction
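The NUS FID $\mathcal{Y}$ acquired with the undersampling operator $\mathcal{U}$ is reconstructed by solving a ROAD model. It finds the FID that contains at most $R$ peaks in multidimensional space and has only one peak in each column of $\mathbf{U}_n$. The mathematical model is

$$\min_{\mathbf{U}_1, \ldots, \mathbf{U}_N} \sum_{n=1}^{N} \sum_{r=1}^{R} \left\| \mathcal{H}\mathcal{P}_r \mathbf{U}_n \right\|_* + \frac{\lambda}{2} \left\| \mathcal{Y} - \mathcal{U}\mathcal{X} \right\|_F^2, \quad \mathcal{X} = [\![\mathbf{U}_1, \ldots, \mathbf{U}_N]\!] \qquad (2)$$

where $\mathcal{P}_r$ selects the $r$th column of $\mathbf{U}_n$, $\mathcal{H}$ converts this column into a Hankel matrix, $[\![\cdot]\!]$ means arranging factor matrices of all dimensions to compose a fully sampled FID $\mathcal{X}$, $\lambda$ is a regularization parameter, $\|\cdot\|_*$ is the nuclear norm, and $\|\cdot\|_F$ is the Frobenius norm of a hypercomplex matrix.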
A fast algorithm (Fig. 9) is proposed to solve ROAD model (derivations of the algorithm are in section S4). After initializing a spectrum that has NUS artifacts and inaccurate exponentials in each dimension obtained by N-D ESPRIT (multidimensional estimation of signal parameters via rotational invariance techniques) (50), multiple iteration blocks gradually reduce spectrum artifacts (see section S6). This iteration block consists of four modules, including peak retrieval, rank-one approximation (ROA), data consistency, and factor matrix correction module (FMCM). Similar to peak detection and selection of SMILE (39), the peak retrieval module finds the center frequency of peaks from the spectrum with NUS artifact, improving the reconstruction quality of low-intensity peaks (Fig. 9B). ROA enables fast matrix decomposition with its one row and column (Fig. 9D), avoiding the slow singular value decomposition in minimizing nuclear norm. Data consistency enforces the reconstruction being aligned to the acquired FID data, showing a weak ability to correct the false peak (Fig. 10, C and D). To interpret the mechanism for the improved generalization, one ablation experiment is provided in section S7.
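Two of the modules above lend themselves to a short numerical sketch. The derivations actually used by ROAD are in section S4 and are not reproduced here; the code below only illustrates the underlying ideas with generic textbook forms. `rank_one_from_cross` shows that a rank-one matrix is fully determined by one row and one column, so no singular value decomposition is needed, and `data_consistency` shows a common way to pull an estimate back toward the acquired samples. The function names, the weight `lam`, and the toy signal are assumptions.

```python
import numpy as np

def hankel(x, n_rows):
    """Hankel matrix with entries H[i, j] = x[i + j]."""
    n_cols = len(x) - n_rows + 1
    return np.array([x[i:i + n_cols] for i in range(n_rows)])

def rank_one_from_cross(h, i=0, j=0):
    """Rebuild a rank-one matrix from its i-th row and j-th column."""
    return np.outer(h[:, j], h[i, :]) / h[i, j]

def data_consistency(x_est, y, mask, lam=None):
    """Align an estimate with the acquired samples.

    lam=None replaces sampled points by the data (hard constraint).
    Otherwise the sampled points minimize |x - x_est|^2 + lam * |x - y|^2.
    """
    out = np.array(x_est, dtype=complex)
    if lam is None:
        out[mask] = y[mask]
    else:
        out[mask] = (out[mask] + lam * y[mask]) / (1.0 + lam)
    return out

# A single damped exponential gives an exactly rank-one Hankel matrix.
t = np.arange(32)
x = np.exp((-0.03 + 2j * np.pi * 0.2) * t)
h = hankel(x, 16)
is_exact = np.allclose(h, rank_one_from_cross(h))

# Data consistency on a toy estimate (all zeros) with every other point sampled.
mask = np.arange(32) % 2 == 0
hard = data_consistency(np.zeros(32), x, mask)
soft = data_consistency(np.zeros(32), x, mask, lam=1.0)
```

For a noisy or artifact-laden column the Hankel matrix is only approximately rank one, so the row-and-column product becomes an approximation rather than an identity.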
Fig. 9. Algorithm pipeline of ROAD.
(A) The iteration block for 2D NUS reconstruction. (B) Peak retrieval module. (C) Factor matrix correction module (FMCM). (D) ROA module. (E) The iteration block for 3D NUS reconstruction. In (A), the blue lines with arrows at the top represent the undersampled FID and the undersampling operator, which are passed unchanged from the input of ROAD to the modules. The blue lines with arrows at the bottom and in the middle represent the intermediate time-domain variables, which change as they flow from one module to the next.
Fig. 10. Algorithm interpretation with a toy 2D example.
(A) A NUS spectrum before reconstruction; (B) one peak of (A); (C and D) the reconstruction of (B) in the third iteration block without and with data consistency (without neural network correction); (E) fully sampled spectrum; (F) one peak of (E); (H and G) the reconstructed peaks in the 3rd and 10th iteration blocks with neural network correction.
To solve the global optimization problem of NUS reconstruction, ROAD applies the dual domain correction, i.e., frequency domain and time domain, because one single point in time (or frequency) domain stands for a global signal in the frequency (or time) domain. In the frequency domain, the DL module FMCM uses the convolution to fast remove the NUS artifact locally. In the time domain, optimization solvers with low-rank prior reconstruct high-quality exponential functions globally.
An essential part of ROAD is the FMCM, designed with a neural network (inputs and outputs are provided in section S7). It remedies the signal loss introduced by other operations. We use a toy example to interpret the algorithm (Fig. 10). A spectral peak under fully sampled conditions, as an outer product of 1D peaks, has a clear line shape in each dimension (Fig. 10F). FMCM greatly improves the intensity (Fig. 10H) lost in ROA (Fig. 10C). After 10 iteration blocks, ROAD obtains high-fidelity 2D peak (and 1D peaks in Fig. 10G) that is very close to the ground truth (Fig. 10F).
The FMCM is a DL network built from DenseNet (51) with a residual structure (52) and the Fourier transform. DenseNet consists of 1D convolutional layers connected by ReLU activation functions (53). The weights of DenseNet can be shared across all dimensions because the learned mapping in each dimension has similar inputs and outputs, except for their lengths. This sharing reduces the number of parameters and improves the robustness to different signal lengths. This advantage also helps us to extend the method to handle 4D NMR.
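The weight sharing across dimensions relies on the fact that a 1D convolution does not depend on the length of its input. The following NumPy sketch of a densely connected 1D block with a residual connection illustrates this with untrained random weights. It is not the FMCM itself: the kernel size of 3, the two input channels (real and imaginary parts), and the channel layout are assumptions, and the Fourier transform step is omitted.

```python
import numpy as np

def conv1d_same(x, w, b):
    """1D convolution with zero padding. x: (c_in, L), w: (c_out, c_in, k)."""
    c_out, c_in, k = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad)))
    out = np.zeros((c_out, x.shape[1]))
    for o in range(c_out):
        for c in range(c_in):
            # Cross-correlation, as used by most DL frameworks.
            out[o] += np.correlate(xp[c], w[o, c], mode="valid")
        out[o] += b[o]
    return out

def make_weights(rng, c_in=2, growth=12, n_layers=6, k=3):
    """Random weights: hidden layers with `growth` filters, then an output layer."""
    weights, c = [], c_in
    for _ in range(n_layers - 1):
        weights.append((0.1 * rng.standard_normal((growth, c, k)), np.zeros(growth)))
        c += growth                      # dense connectivity widens the input
    weights.append((0.1 * rng.standard_normal((c_in, c, k)), np.zeros(c_in)))
    return weights

def dense_block(x, weights):
    """Each layer sees all earlier feature maps; the output is added to the input."""
    feats = [x]
    for w, b in weights[:-1]:
        feats.append(np.maximum(conv1d_same(np.concatenate(feats), w, b), 0.0))
    w, b = weights[-1]
    return x + conv1d_same(np.concatenate(feats), w, b)

rng = np.random.default_rng(0)
weights = make_weights(rng)

# The same weights process signals of different lengths (real/imag channels).
out_long = dense_block(rng.standard_normal((2, 89)), weights)
out_short = dense_block(rng.standard_normal((2, 43)), weights)
```

Because no layer is tied to a fixed length, one set of weights serves both indirect dimensions of a 2D NUS plane, and the same property is what allows reuse for a third indirect dimension.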
Within the training dataset, 40,000 pairs of 2D hypercomplex exponential functions are synthesized with varied amplitudes, phases, damping factors, and normalized frequencies and a signal size of 89 × 43. NUS patterns are generated following the 2D Poisson distribution (42, 54). Gaussian noise with an SD from 0 to 0.02 is added. Within the synthetic dataset, 90% of the pairs are used for training and 10% for validation.
During the training of ROAD, the estimated number of exponentials is set to 20, which is twice the maximal number of exponentials of each spectrum in the training dataset. ROAD is built from K = 10 iteration blocks. In each FMCM, DenseNet consists of six convolutional layers, each with 12 filters (detailed description in section S4.4).
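How an undersampled, noisy training input might be produced from a fully sampled signal can be sketched as follows. This is a 1D simplification in the spirit of sine-weighted Poisson-gap sampling (42); the paper uses 2D patterns, and the function names, the toy exponential, and the retry limit are assumptions made for this example.

```python
import numpy as np

def poisson_gap_1d(n_total, n_keep, seed=0, max_tries=100000):
    """Sine-weighted Poisson-gap schedule with exactly n_keep points."""
    rng = np.random.default_rng(seed)
    adj = 2.0 * (n_total / n_keep - 1.0)       # initial mean-gap guess
    for _ in range(max_tries):
        picks, i = [], 0
        while i < n_total:
            picks.append(i)
            i += 1
            # Gaps grow toward the end, where the signal has decayed.
            weight = np.sin((i + 0.5) / (n_total + 1) * np.pi / 2)
            i += int(rng.poisson(adj * weight))
        if len(picks) == n_keep:
            return np.array(picks)
        # Too many points: widen the gaps. Too few: narrow them.
        adj = adj * 1.02 if len(picks) > n_keep else adj / 1.02
    raise RuntimeError("no schedule with the requested size was found")

def undersample(fid, picks, noise_sd, seed=0):
    """Add complex Gaussian noise, then zero the unsampled points."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(fid.shape) + 1j * rng.standard_normal(fid.shape)
    mask = np.zeros(len(fid), dtype=bool)
    mask[picks] = True
    return np.where(mask, fid + noise_sd * noise, 0), mask

# Toy 1D exponential with hypothetical parameters, 25% sampling.
t = np.arange(64)
full = np.exp((-0.03 + 2j * np.pi * 0.2) * t)
schedule = poisson_gap_1d(64, 16)
nus, mask = undersample(full, schedule, noise_sd=0.02)
```

The pair (`nus`, `full`) plays the role of one network input and its label; the schedule always contains the first point and becomes sparser toward the end of the acquisition.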
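To force the reconstructed signal in each iteration block to approach the noise-free fully sampled signal $\mathcal{X}$, the mean squared errors between the signal $\hat{\mathcal{X}}_k$ after the data consistency (DC) module of each iteration block $k$ and the fully sampled signal are summed as the loss function

$$\mathcal{L}(\Theta) = \sum_{k=1}^{K} \operatorname{MSE}\left(\hat{\mathcal{X}}_k, \mathcal{X}\right) \qquad (3)$$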
where $\Theta$ denotes all learnable parameters, including the values of the convolution kernels in FMCM and the regularization parameters in the ROA module and in the data consistency module. The parameters of all modules are jointly optimized, and their optimal values are obtained by minimizing the loss function with the Adam optimizer (55). The learning rate drops gradually from $10^{-3.0}$ to $10^{-4.5}$ by a factor of $10^{-0.5}$ each time the relative norm error stops decreasing.
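The learning-rate schedule amounts to four values, each a factor of 10^-0.5 (about 0.316) below the previous one. The helper below only lists these values; the plateau detection that triggers each drop is not shown, and the function name is an assumption.

```python
def learning_rates(start_exp=-3.0, stop_exp=-4.5, step_exp=-0.5):
    """Learning rates 10**e for e from start_exp down to stop_exp."""
    rates, e = [], start_exp
    while e >= stop_exp - 1e-9:      # small slack against rounding
        rates.append(10.0 ** e)
        e += step_exp
    return rates

rates = learning_rates()
for lr in rates:
    print(f"{lr:.3e}")
```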
Extend to reconstruction for 3D NUS of 4D spectra
In essence, ROAD reconstructs a multidimensional spectrum by updating factor matrices in each dimension. From 2D NUS to 3D NUS, ROAD only increases the number of ROA and FMCM modules (Fig. 9E). Because the weights of FMCM are shared across all dimensions during training on 2D NUS, ROAD can load the shared 2D NUS weights to reconstruct 3D NUS data. Through derivations similar to those for 2D NUS reconstruction, the other modules, including peak retrieval and data consistency, also support 3D NUS reconstruction.
Acknowledgments
We thank the reviewers for constructive comments.
Funding: This work was supported, in part, by the National Natural Science Foundation of China (62331021 and 62122064 to X.Q. and 62371410 to D.G.), the Natural Science Foundation of Fujian Province of China under grant (2023 J02005 to X.Q.), Industry-University Cooperation Projects of the Ministry of Education of China (231107173160805 to X.Q.), National Key R&D Program of China (2023YFF0714200 to X.Q.), the President Fund of Xiamen University (20720220063 to X.Q.), the Xiamen University Nanqiang Outstanding Talents Program (to X.Q.), Swedish Research Council (2023-03485 and 2024-06251 to V.O.).
Author contributions: Conceptualization: X.Q. and Y.H. Methodology: Y.H. and X.Q. Investigation: Y.H., Y.G., and Z.T. Visualization: Y.H., Y.G., T.A., V.O., S.G.H., and G.W. Resources: T.A., V.O., S.G.H., G.W., Z.C., D.G., and X.Q. Formal analysis: Y.H., T.A., V.O., S.G.H., and G.W. Supervision and funding acquisition: D.G. and X.Q. Writing—original draft: Y.H., Y.G., D.G., and X.Q. Writing—review and editing: All authors.
Competing interests: The authors declare that they have no competing interests.
Data and materials availability: The demonstration code, 3D/4D NMR data, and the NMRPipe scripts have been deposited in the Zenodo under https://zenodo.org/uploads/15852769. All other data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials.
Supplementary Materials
This PDF file includes:
Supplementary Text
Sections S1 to S13
Tables S1 to S7
Figs. S1 to S30
References
REFERENCES AND NOTES
- 1. Hiller S., Ibraghimov I., Wagner G., Orekhov V. Y., Coupled decomposition of four-dimensional NOESY spectra. J. Am. Chem. Soc. 131, 12970–12978 (2009).
- 2. A. Marintchev, D. Frueh, G. Wagner, in Methods in Enzymology, J. Lorsch, Ed. (Academic Press, 2007), vol. 430, pp. 283–331.
- 3. Tugarinov V., Choy W.-Y., Orekhov V. Y., Kay L. E., Solution NMR-derived global fold of a monomeric 82-kDa enzyme. Proc. Natl. Acad. Sci. U.S.A. 102, 622–627 (2005).
- 4. Hiller S., Garces R. G., Malia T. J., Orekhov V. Y., Colombini M., Wagner G., Solution structure of the integral human membrane protein VDAC-1 in detergent micelles. Science 321, 1206–1210 (2008).
- 5. Han X., Levkovets M., Lesovoy D., Sun R., Wallerstein J., Sandalova T., Agback T., Achour A., Agback P., Orekhov V. Y., Assignment of IVL-Methyl side chain of the ligand-free monomeric human MALT1 paracaspase-IgL3 domain in solution. Biomol. NMR Assign. 16, 363–371 (2022).
- 6. Jaravine V., Ibraghimov I., Orekhov V. Y., Removal of a time barrier for high-resolution multidimensional NMR spectroscopy. Nat. Methods 3, 605–607 (2006).
- 7. Qu X., Mayzel M., Cai J.-F., Chen Z., Orekhov V., Accelerated NMR spectroscopy with low-rank reconstruction. Angew. Chem. Int. Ed. Engl. 54, 852–854 (2015).
- 8. Qu X., Huang Y., Lu H., Qiu T., Guo D., Agback T., Orekhov V., Chen Z., Accelerated nuclear magnetic resonance spectroscopy with deep learning. Angew. Chem. Int. Ed. Engl. 59, 10297–10300 (2020).
- 9. Holland D. J., Bostock M. J., Gladden L. F., Nietlispach D., Fast multidimensional NMR spectroscopy using compressed sensing. Angew. Chem. Int. Ed. Engl. 50, 6548–6551 (2011).
- 10. Kazimierczuk K., Orekhov V. Y., Accelerated NMR spectroscopy by using compressed sensing. Angew. Chem. Int. Ed. Engl. 50, 5556–5559 (2011).
- 11. X. Qu, X. Cao, D. Guo, Z. Chen, “Compressed sensing for sparse magnetic resonance spectroscopy,” in Proceedings of the International Society for Magnetic Resonance in Medicine Scientific Meeting (Curran Associates, 2010), p. 3371.
- 12. Qu X., Guo D., Cao X., Cai S., Chen Z., Reconstruction of self-sparse 2D NMR spectra from undersampled data in the indirect dimension. Sensors 11, 8888–8909 (2011).
- 13. Guo D., Lu H., Qu X., A fast low rank Hankel matrix factorization reconstruction method for non-uniformly sampled magnetic resonance spectroscopy. IEEE Access 5, 16033–16039 (2017).
- 14. Lu H., Zhang X., Qiu T., Yang J., Ying J., Guo D., Chen Z., Qu X., Low rank enhanced matrix recovery of hybrid time and frequency data in fast magnetic resonance spectroscopy. IEEE Trans. Biomed. Eng. 65, 809–820 (2018).
- 15. Ying J., Lu H., Wei Q., Cai J.-F., Guo D., Wu J., Chen Z., Qu X., Hankel matrix nuclear norm regularized tensor completion for N-dimensional exponential signals. IEEE Trans. Signal Process. 65, 3702–3717 (2017).
- 16. Pustovalova Y., Mayzel M., Orekhov V. Y., XLSY: Extra-large NMR spectroscopy. Angew. Chem. Int. Ed. Engl. 57, 14043–14045 (2018).
- 17. Qiu T., Wang Z., Liu H., Guo D., Qu X., Review and prospect: NMR spectroscopy denoising and reconstruction with low-rank Hankel matrices and tensors. Magn. Reson. Chem. 59, 324–345 (2021).
- 18. Chen D., Wang Z., Guo D., Orekhov V., Qu X., Review and prospect: Deep learning in nuclear magnetic resonance spectroscopy. Chem. Eur. J. 26, 10391–10401 (2020).
- 19. Hansen D. F., Using deep neural networks to reconstruct non-uniformly sampled NMR spectra. J. Biomol. NMR 73, 577–585 (2019).
- 20. Shukla V. K., Karunanithy G., Vallurupalli P., Hansen D. F., A combined NMR and deep neural network approach for enhancing the spectral resolution of aromatic side chains in proteins. Sci. Adv. 10, eadr2155 (2024).
- 21. Luo Y., Zheng X., Qiu M., Gou Y., Yang Z., Qu X., Chen Z., Lin Y., Deep learning and its applications in nuclear magnetic resonance spectroscopy. Prog. Nucl. Magn. Reson. Spectrosc. 146-147, 101556 (2025).
- 22. Zheng X., Yang Z., Yang C., Shi X., Luo Y., Luo J., Zeng Q., Lin Y., Chen Z., Fast acquisition of high-quality nuclear magnetic resonance pure shift spectroscopy via a deep neural network. J. Phys. Chem. Lett. 13, 2101–2106 (2022).
- 23. Jahangiri A., Han X., Lesovoy D., Agback T., Agback P., Achour A., Orekhov V., NMR spectrum reconstruction as a pattern recognition problem. J. Magn. Reson. 346, 107342 (2023).
- 24. Zhan H., Liu J., Fang Q., Chen X., Hu L., Accelerated pure shift NMR spectroscopy with deep learning. Anal. Chem. 96, 1515–1521 (2024).
- 25. Zhan H., Liu J., Fang Q., Chen X., Ni Y., Zhou L., Fast pure shift NMR spectroscopy using attention-assisted deep neural network. Adv. Sci. 11, 2309810 (2024).
- 26. Chen X., Zhou L., Ni Y., Liu J., Fang Q., Huang Y., Chen Z., Xia H., Zhan H., WPR-net: A deep learning protocol for highly accelerated NMR spectroscopy with faithful weak peak reconstruction. Anal. Chem. 97, 7010–7019 (2025).
- 27. Luo Y., Chen W., Su Z., Shi X., Luo J., Qu X., Chen Z., Lin Y., Deep learning network for NMR spectra reconstruction in time-frequency domain and quality assessment. Nat. Commun. 16, 2342 (2025).
- 28. Luo J., Zeng Q., Wu K., Lin Y., Fast reconstruction of non-uniform sampling multidimensional NMR spectroscopy via a deep neural network. J. Magn. Reson. 317, 106772 (2020).
- 29. Worswick S. G., Spencer J. A., Jeschke G., Kuprov I., Deep neural network processing of DEER data. Sci. Adv. 4, eaat5218 (2018).
- 30. Chen B., Wu L., Cui X., Lin E., Cao S., Zhan H., Huang Y., Yang Y., Chen Z., High-quality reconstruction for Laplace NMR based on deep learning. Anal. Chem. 95, 11596–11602 (2023).
- 31. Chen B., Fang Z., Zhang Y., Guan X., Lin E., Feng H., Zeng Y., Cai S., Yang Y., Huang Y., Chen Z., Two-dimensional Laplace NMR reconstruction through deep learning enhancement. J. Am. Chem. Soc. 146, 21591–21599 (2024).
- 32. Wu K., Luo J., Zeng Q., Dong X., Chen J., Zhan C., Chen Z., Lin Y., Improvement in signal-to-noise ratio of liquid-state NMR spectroscopy via a deep neural network DN-Unet. Anal. Chem. 93, 1377–1382 (2021).
- 33. Li D.-W., Hansen A. L., Yuan C., Bruschweiler-Li L., Brüschweiler R., Deep picker is a deep neural network for accurate deconvolution of complex two-dimensional NMR spectra. Nat. Commun. 12, 5229 (2021).
- 34. Wang Z., Guo D., Tu Z., Huang Y., Zhou Y., Wang J., Feng L., Lin D., You Y., Agback T., Orekhov V., Qu X., A sparse model-inspired deep thresholding network for exponential signal reconstruction-application in fast biological spectroscopy. IEEE Trans. Neural Netw. Learn. Syst. 34, 7578–7592 (2023).
- 35. Guo Y., Zhan J., Tu Z., Zhou Y., Wu J., Hong Q., Huang Y., Orekhov V., Qu X., Guo D., Hypercomplex low rank reconstruction for NMR spectroscopy. Signal Process. 203, 108809 (2023).
- 36. J. C. Hoch, A. S. Stern, NMR Data Processing (Wiley, 1996).
- 37. Shchukina A., Kasprzak P., Dass R., Nowakowski M., Kazimierczuk K., Pitfalls in compressed sensing reconstruction and how to avoid them. J. Biomol. NMR 68, 79–98 (2017).
- 38. Huang Y., Zhao J., Wang Z., Orekhov V., Guo D., Qu X., Exponential signal reconstruction with deep Hankel matrix factorization. IEEE Trans. Neural Netw. Learn. Syst. 34, 6214–6226 (2023).
- 39. Ying J., Delaglio F., Torchia D. A., Bax A., Sparse multidimensional iterative lineshape-enhanced (SMILE) reconstruction of both non-uniformly sampled and conventional NMR data. J. Biomol. NMR 68, 101–118 (2017).
- 40. Delaglio F., Grzesiek S., Vuister G. W., Zhu G., Pfeifer J., Bax A., NMRPipe: A multidimensional spectral processing system based on UNIX pipes. J. Biomol. NMR 6, 277–293 (1995).
- 41. Mayzel M., Kazimierczuk K., Orekhov V., The causality principle in the reconstruction of sparse NMR spectra. Chem. Commun. 50, 8947–8950 (2014).
- 42. Hyberts S. G., Milbradt A. G., Wagner A. B., Arthanari H., Wagner G., Application of iterative soft thresholding for fast reconstruction of NMR data non-uniformly sampled with multidimensional Poisson gap scheduling. J. Biomol. NMR 52, 315–327 (2012).
- 43. Pustovalova Y., Delaglio F., Craft D. L., Arthanari H., Bax A., Billeter M., Bostock M. J., Dashti H., Hansen D. F., Hyberts S. G., Johnson B. A., Kazimierczuk K., Lu H., Maciejewski M., Miljenović T. M., Mobli M., Nietlispach D., Orekhov V., Powers R., Qu X., Robson S. A., Rovnyak D., Wagner G., Ying J., Zambrello M., Hoch J. C., Donoho D. L., Schuyler A. D., NUScon: A community-driven platform for quantitative evaluation of nonuniform sampling in NMR. Magn. Reson. 2, 843–861 (2021).
- 44. Karunanithy G., Hansen D. F., FID-Net: A versatile deep neural network architecture for NMR spectral reconstruction and virtual decoupling. J. Biomol. NMR 75, 179–191 (2021).
- 45. J. Benesty, J. Chen, Y. Huang, I. Cohen, Noise Reduction in Speech Processing (Springer Science & Business Media, 2009), vol. 2.
- 46. Skinner S. P., Fogh R. H., Boucher W., Ragan T. J., Mureddu L. G., Vuister G. W., CcpNmr AnalysisAssign: A flexible platform for integrated NMR analysis. J. Biomol. NMR 66, 111–124 (2016).
- 47. T. D. Claridge, High-Resolution NMR Techniques in Organic Chemistry (Elsevier, 2016), vol. 27.
- 48. Boeszoermenyi A., Schmidt J. C., Cheeseman I. M., Oberer M., Wagner G., Arthanari H., Resonance assignments of the microtubule-binding domain of the C. elegans spindle and kinetochore-associated protein 1. Biomol. NMR Assign. 8, 275–278 (2014).
- 49. Orekhov V. Y., Ibraghimov I. V., Billeter M., MUNIN: A new approach to multi-dimensional NMR spectra interpretation. J. Biomol. NMR 20, 49–60 (2001).
- 50. Sahnoun S., Usevich K., Comon P., Multidimensional ESPRIT for damped and undamped signals: Algorithm, computations, and perturbation analysis. IEEE Trans. Signal Process. 65, 5897–5910 (2017).
- 51. G. Huang, Z. Liu, L. Van Der Maaten, K. Q. Weinberger, “Densely connected convolutional networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (IEEE, 2017), pp. 4700–4708.
- 52. K. He, X. Zhang, S. Ren, J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (IEEE, 2016), pp. 770–778.
- 53. V. Nair, G. E. Hinton, “Rectified linear units improve restricted Boltzmann machines,” in Proceedings of the International Conference on Machine Learning (ICML, 2010), pp. 807–814.
- 54. Hyberts S. G., Takeuchi K., Wagner G., Poisson-gap sampling and forward maximum entropy reconstruction for enhancing the resolution and sensitivity of protein NMR data. J. Am. Chem. Soc. 132, 2145–2147 (2010).
- 55. D. P. Kingma, J. Ba, Adam: A method for stochastic optimization. arXiv:1412.6980 (2014). 10.48550/arXiv.1412.6980.
- 56. Orekhov V. Y., Ibraghimov I., Billeter M., Optimizing resolution in multidimensional NMR by three-way decomposition. J. Biomol. NMR 27, 165–173 (2003).
- 57. Tugarinov V., Kay L. E., Ibraghimov I., Orekhov V. Y., High-resolution four-dimensional 1H−13C NOE spectroscopy using methyl-TROSY, sparse data acquisition, and multidimensional decomposition. J. Am. Chem. Soc. 127, 2767–2775 (2005).
- 58. Huang Y., Wang Z., Zhang X., Cao J., Tu Z., Lin M., Li L., Jiang X., Guo D., Qu X., Improve robustness to mismatched sampling rate: An alternating deep low-rank approach for exponential function reconstruction and its biomedical magnetic resonance applications. J. Magn. Reson. 376, 107898 (2025).
- 59. Delsuc M. A., Spectral representation of 2D NMR spectra by hypercomplex numbers. J. Magn. Reson. 77, 119–124 (1988).
- 60. I. L. v. Kantor, A. S. Solodovnikov, A. Shenitzer, Hypercomplex Numbers: An Elementary Introduction to Algebras (Springer, 1989), vol. 302.
- 61. Ying J., Cai J.-F., Guo D., Tang G., Chen Z., Qu X., Vandermonde factorization of Hankel matrix for complex exponential signal recovery—Application in fast NMR spectroscopy. IEEE Trans. Signal Process. 66, 5520–5533 (2018).
- 62. Yeniay Ö., Penalty function methods for constrained optimization with genetic algorithms. Math. Comput. Appl. 10, 45–56 (2005).
- 63. Cai J.-F., Candes E. J., Shen Z., A singular value thresholding algorithm for matrix completion. SIAM J. Optim. 20, 1956–1982 (2010).
- 64. Blümich B., Ziessow D., Skyline projections in two-dimensional NMR spectroscopy. J. Magn. Reson. 49, 151–154 (1982).
- 65. Korzhnev D. M., Karlsson B. G., Orekhov V. Y., Billeter M., NMR detection of multiple transitions to low-populated states in azurin. Protein Sci. 12, 56–65 (2003).
- 66. Kotler S. A., Tugarinov V., Schmidt T., Ceccon A., Libich D. S., Ghirlando R., Schwieters C. D., Clore G. M., Probing initial transient oligomerization events facilitating Huntingtin fibril nucleation at atomic resolution by relaxation-based NMR. Proc. Natl. Acad. Sci. U.S.A. 116, 3562–3571 (2019).
- 67. Zhang F., Quaternions and matrices of quaternions. Linear Alg. Appl. 251, 21–57 (1997).
- 68. Antun V., Renna F., Poon C., Adcock B., Hansen A. C., On instabilities of deep learning in image reconstruction and the potential costs of AI. Proc. Natl. Acad. Sci. U.S.A. 117, 30088–30095 (2020).
- 69. Maciejewski M. W., Schuyler A. D., Gryk M. R., Moraru I. I., Romero P. R., Ulrich E. L., Eghbalnia H. R., Livny M., Delaglio F., Hoch J. C., NMRbox: A resource for biomolecular NMR computation. Biophys. J. 112, 1529–1534 (2017).
- 70. Hyberts S. G., Arthanari H., Robson S. A., Wagner G., Perspectives in magnetic resonance: NMR in the post-FFT era. J. Magn. Reson. 241, 60–73 (2014).