Abstract
Purpose:
To accelerate the radially sampled diffusion weighted spin-echo (Rad-DW-SE) acquisition method for generating high quality apparent diffusion coefficient (ADC) maps.
Methods:
A deep learning method was developed to generate accurate ADC maps from accelerated DWI data acquired with the Rad-DW-SE method. The method integrates convolutional neural networks (CNNs) with vision transformers to generate high quality ADC maps from accelerated DWI data, regularized by a monoexponential model fitting term. The model was trained on DWI data of 147 mice and evaluated on DWI data of 36 mice, with acceleration factors of 4x and 8x relative to the original acquisition parameters. We have made our code publicly available at GitHub: https://github.com/ymli39/DeepADC-Net-Learning-Apparent-Diffusion-Coefficient-Maps, and our dataset can be downloaded at https://pennpancreaticcancerimagingresource.github.io/data.html.
Results:
Ablation studies and experimental results have demonstrated that the proposed deep learning model generates higher quality ADC maps from accelerated DWI data than alternative deep learning methods under comparison, with performance quantified on whole images as well as on regions of interest, including tumors, kidneys, and muscles.
Conclusions:
The deep learning method with integrated CNNs and transformers provides an effective means to accurately compute ADC maps from accelerated DWI data acquired with the Rad-DW-SE method.
Keywords: Parametric estimation, self-attention, monoexponential model, diffusion weighted MRI, apparent diffusion coefficient, convolutional neural network
1. Introduction
Diffusion weighted (DW) MRI provides quantitative metrics related to the Brownian motion of water hindered by microstructures present in biological tissues (1,2). The apparent diffusion coefficient (ADC) of water, derived from DW images at multiple b-values, has been employed extensively as a biomarker in neurological and oncological applications (3–6). Accurate ADC map generation can assist in differentiating between benign and malignant tumors, determining tumor aggressiveness, and monitoring tumor response to treatment (7,8). Furthermore, accurate ADC assessment with reduced scan time has several advantages, including better subject tolerance of shorter scans and potential reduction of motion-related artifacts, which are exacerbated by long scan times. Since DWI pulse sequences are sensitive to motion on the micrometer scale, macroscopic (millimeter) scale movement of tissue or organs due to respiratory motion can introduce artifacts, resulting in errors in quantitative measurement of ADC in the affected tissue. In clinical DWI, respiratory motion is often mitigated by breath-holds or respiratory navigators, as well as by rapid acquisition schemes such as single-shot EPI (9) that minimize motion corruption. In DWI of mice, however, owing to higher respiration rates and increased magnetic susceptibility effects at high magnetic field strength, EPI-based DWI performed on preclinical MRI instruments suffers greater levels of distortion and artifact (10). By leveraging the intrinsic motion insensitivity of radial k-space sampling, previous studies have shown that the radially sampled diffusion weighted spin-echo (Rad-DW-SE) acquisition method effectively suppresses respiratory motion artifacts in DW-MR images of the mouse abdomen over a wide range of b-values (10,11). However, compared to single-shot EPI, the acquisition time of Rad-DW-SE is substantially longer. An effective means to shorten the Rad-DW-SE scan time is to acquire under-sampled k-space data. However, accelerating Rad-DW-SE k-space data acquisition degrades image quality dramatically, especially at higher b-values with their lower signal-to-noise ratios (SNR), and subsequently degrades the derived ADC maps.
Two approaches can be adopted to generate high quality ADC maps from accelerated DW images: 1) generating high quality DW images followed by fitting a diffusion model to estimate ADC (12); or 2) directly generating high quality ADC maps from the ADC maps derived from accelerated data. High quality DW images can be generated using deep learning methods that have achieved promising performance from accelerated k-space data in the k-space domain (13–15), the image domain (16–21), or both (22–24). However, the performance of such indirect methods hinges on the quality of the generated DW images at different b-values with varied signal-to-noise ratios. On the other hand, directly generating high quality ADC maps from under-sampled ADC maps can be implemented using a deep learning model in a supervised learning setting. However, such an approach utilizes only ADC maps and ignores the individual DW images.
In this study, we develop a deep learning model, referred to as DeepADC-Net, to generate high quality ADC maps from radially accelerated DW data, in conjunction with a monoexponential diffusion model that estimates the ADC maps from the DW images. Our deep learning model takes the accelerated DW images at multiple b-values and their derived ADC map as a multi-channel input to generate high quality ADC maps. The model is trained to optimize two complementary loss functions: 1) the difference between the ADC maps generated by the deep learning model and those derived from fully-sampled DW images using a monoexponential model, and 2) the difference between the fully-sampled DW images and the DW images estimated from the learned ADC maps (12). Different from existing deep learning based MR image generation methods, which are typically built on convolutional neural networks (CNNs), our method integrates CNNs with vision transformers (25), which have shown great potential for learning global context information through self-attention in conjunction with CNNs for feature extraction (26). Extensive ablation studies and experimental results demonstrate that the monoexponential model and the integration of CNNs with vision transformers enhance the generation of high quality ADC maps from accelerated data. Although many deep learning methods have been developed for MRI data generation tasks with different image acquisition methods, our method is developed to improve ADC computation from accelerated DW data collected with the radially sampled diffusion weighted spin-echo (Rad-DW-SE) acquisition method.
2. Methods
2.1. Datasets
All animal handling protocols were reviewed and approved by the IACUC of the University of Pennsylvania. Animal studies employed a genetically engineered mouse model of pancreatic ductal adenocarcinoma that spontaneously develops premalignant pancreatic intraepithelial neoplastic lesions at 7–10 weeks of age. Animals were prepared for the MRI exam by induction of general anesthesia and placement of vital signs probes as detailed elsewhere (10). MRI was performed on a 9.4T horizontal bore scanner (Bruker, Billerica, MA) using a 35 mm quadrature birdcage RF coil (M2M, Cleveland, OH) for transmit and reception (10). Following a set of localizers, a contiguous series of axial slices spanning the tumor volume was acquired using a diffusion weighted, radially sampled spin echo sequence (Rad-DW-SE) with even sampling of the view angles over 360 degrees by acquiring one spoke per echo (403 views, five b-values, total acquisition time of approximately 25 minutes). To achieve sufficient SNR and image quality, 403 spokes were used as the reference acquisition, although the Nyquist criterion requires fewer views (~150). Subsequently, 4x and 8x reductions in sampling from the reference acquisition were used to evaluate the effectiveness of our deep learning strategy for accelerating the data acquisition.
Furthermore, we also collected “real-world” accelerated data at a factor of 4x and compared ADC maps generated by our deep learning model with those computed from fully sampled data that were subsequently collected using the same protocol. It is worth noting that the real-world 4x accelerated data might capture diffusion information different from that captured by the fully sampled data because the two acquisitions were not collected simultaneously. All relevant results are presented in the supplemental document.
Based on the fully sampled Rad-DW-SE data, we evaluated our proposed model at two acceleration factors, 4x and 8x. The 4x accelerated DW data were generated by sampling one out of every four radial views in k-space, resulting in a total of 101 views, while the 8x accelerated DW data were generated by sampling one out of every eight radial views in k-space, resulting in a total of 50 views. DW images were reconstructed from both the fully sampled and accelerated k-space data using Python code developed in-house. Following zero and first order phase correction of each acquired view, the k-space data were regridded to a 96×96 Cartesian array using a Kaiser-Bessel kernel with a convolution window width of four. The regridded data were then Fourier transformed and divided by the deapodization (deconvolution) function to yield the reconstructed images.
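For concreteness, the following is a minimal NumPy sketch of this gridding reconstruction. The 96×96 grid and the window width of four come from the text; the Kaiser-Bessel β value, the function names, and the numerical deapodization are illustrative assumptions, and the radial samples are assumed to have been density-compensated (e.g., ramp-weighted) before gridding.

```python
import numpy as np

def kb_kernel(u, width=4.0, beta=13.9):
    """Kaiser-Bessel convolution kernel evaluated at offsets u (grid units)."""
    z = 1.0 - (2.0 * np.asarray(u, float) / width) ** 2
    out = np.where(z > 0, np.i0(beta * np.sqrt(np.clip(z, 0, None))), 0.0)
    return out / np.i0(beta)

def grid_recon(samples, kx, ky, n=96, width=4.0):
    """Regrid density-compensated radial k-space samples onto an n x n
    Cartesian array, Fourier transform, and deapodize. kx, ky are sample
    coordinates in grid units, centered at 0."""
    grid = np.zeros((n, n), dtype=complex)
    half = width / 2.0
    for s, x, y in zip(samples, kx, ky):
        xs = np.arange(int(np.ceil(x - half)), int(np.floor(x + half)) + 1)
        ys = np.arange(int(np.ceil(y - half)), int(np.floor(y + half)) + 1)
        w = np.outer(kb_kernel(ys - y, width), kb_kernel(xs - x, width))
        grid[np.ix_((ys + n // 2) % n, (xs + n // 2) % n)] += s * w
    img = np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(grid)))
    # Deapodization: divide by the image-domain response of the kernel,
    # obtained here by gridding a unit sample at the k-space origin.
    kern = np.zeros((n, n), dtype=complex)
    offs = np.arange(-int(half), int(half) + 1)
    kern[np.ix_(offs + n // 2, offs + n // 2)] = np.outer(
        kb_kernel(offs, width), kb_kernel(offs, width))
    deap = np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(kern))).real
    return img / np.where(np.abs(deap) > 1e-8 * np.abs(deap).max(), deap, 1.0)
```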
ADC maps were computed from both the accelerated and the fully sampled DW images by least-squares fitting of the monoexponential model (27). The ADC maps derived from the fully sampled DW images were used as ground truth, and ADC values were excluded if they fell outside the range [0, 0.0032] mm²/s, since 0.0032 mm²/s corresponds to the ADC of free water at 37°C (28). We split the entire dataset into a training subset with scans of 147 animals comprising a total of 2255 slices, and a testing subset with scans of 36 animals comprising a total of 557 slices.
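A log-linear least-squares fit of the monoexponential model (Equation (1) below) can be sketched in a few lines of NumPy; the function name and masking convention are ours, and the in-house implementation may differ (e.g., it may use nonlinear fitting).

```python
import numpy as np

def fit_adc(dwi, bvals, adc_max=0.0032):
    """Voxelwise least-squares fit of S(b) = S0 * exp(-b * ADC).

    dwi:   (n_b, H, W) array of DW images, one per b-value
    bvals: (n_b,) array of b-values in s/mm^2
    Returns (adc, s0); ADC values outside [0, adc_max] are masked to zero.
    """
    n_b, h, w = dwi.shape
    y = np.log(np.clip(dwi.reshape(n_b, -1), 1e-8, None))  # log-signal
    A = np.stack([np.ones(n_b), -np.asarray(bvals, float)], axis=1)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)           # rows: [ln S0, ADC]
    s0 = np.exp(coef[0]).reshape(h, w)
    adc = coef[1].reshape(h, w)
    return np.where((adc >= 0) & (adc <= adc_max), adc, 0.0), s0
```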
2.2. DeepADC-Net
2.2.1. Problem formulation
Given Rad-DW-SE scans collected at different b-values, an ADC map can be computed from the DW images by fitting a monoexponential model:
$$S(b_i) = S_0 \, e^{-b_i \cdot \mathrm{ADC}} \qquad (1)$$
where $S(b_i)$ is the intensity value of a DW scan at a b-value of $b_i$, $i = 1, \ldots, N$, ADC is the apparent diffusion coefficient parametric map, and $S_0$ is the intensity value of a DW scan in the absence of diffusion weighting. Based on the monoexponential model, the ADC map can be calculated from DW images collected at as few as two different b-values:
$$\mathrm{ADC} = \frac{\ln S(b_1) - \ln S(b_2)}{b_2 - b_1} \qquad (2)$$
where $b_1$ and $b_2$ are two different b-values, and $S(b_1)$ and $S(b_2)$ are their corresponding DW images. To make the estimation robust, DW images are typically collected at three or more b-values, and a least-squares fitting algorithm is then adopted to estimate the ADC and $S_0$ values.
Given accelerated DW images, we aim to optimize DeepADC-Net to generate high-quality ADC maps close to those estimated from the corresponding fully sampled DW images:
$$\min_{\theta} \; \mathcal{D}\!\left(f_{\theta}\big(S_a(b_1), \ldots, S_a(b_N), \mathrm{ADC}_a\big), \; \mathrm{ADC}_f\right) \qquad (3)$$
where $f_{\theta}$ is a deep learning model with parameters $\theta$, its input consists of the accelerated DW images $S_a(b_i)$ and their corresponding ADC map $\mathrm{ADC}_a$ computed by fitting the monoexponential model, $\mathrm{ADC}_f$ is the fully sampled ADC map computed using the monoexponential model from the corresponding fully sampled DW images, and $\mathcal{D}$ is a dissimilarity measure between two ADC maps.
2.2.2. DeepADC Network architecture
DeepADC-Net is constructed to generate high quality ADC maps from accelerated DWI data collected with the Rad-DW-SE sequence, as schematically illustrated in Figure 1. The input to the deep learning model includes the accelerated DW images and their corresponding ADC maps. The model consists of two parts in the training setting: 1) using the accelerated DW images and their corresponding ADC maps as a multi-channel input, as illustrated in Figure 1(a), to generate a high quality ADC map and an $S_0$ map, which corresponds to a DW image in the absence of diffusion weighting (12), as shown in Figure 1(b); and 2) estimating high quality DW images from the generated ADC and $S_0$ maps with the monoexponential model, as illustrated in Figure 1(c). In the inference setting, the deep learning model is applied to accelerated DW images and their corresponding ADC maps to generate high quality ADC maps, as shown in Figure 1(a) and (b).
Figure 1.
DeepADC-Net flowchart: a) the input consists of multiple channels, including the accelerated DW images and their corresponding ADC maps generated by fitting a monoexponential diffusion model; b) a densely connected encoder-decoder backbone that contains a bottleneck transformer with self-attention; c) the output includes a high quality ADC map and an $S_0$ map, from which high quality DW images are generated using the monoexponential model.
Encoder-Decoder architecture:
In deep learning tasks, the encoder compresses the input data into a low-dimensional representation and the decoder generates high-dimensional output data from the low-dimensional representation; U-Net is a type of Encoder-Decoder network (29) that includes skip connections between the encoder and the decoder (30). In our approach, high quality ADC maps are generated using an Encoder-Decoder network with five densely connected blocks in both the encoder and the decoder (31). This network takes both the accelerated DW images at $N$ b-values and their corresponding ADC map as a multi-channel input of dimension $(N+1) \times H \times W$, where $H$ and $W$ are the dimensions of the image matrix. The decoder's last layer has two parallel output heads, $h_{ADC}$ and $h_{S_0}$, that generate a high quality ADC map and its corresponding DW image at a b-value of 0, denoted by $S_0$, respectively.
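The sketch below illustrates this design in PyTorch at a reduced scale (two resolution levels and small dense blocks rather than the five-block backbone); layer sizes and class names are illustrative assumptions, not the exact DeepADC-Net configuration, and the head activations of Equations (4) and (5) below are applied downstream.

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Densely connected conv block: each layer receives all earlier features."""
    def __init__(self, in_ch, growth=16, n_layers=3):
        super().__init__()
        self.layers = nn.ModuleList()
        ch = in_ch
        for _ in range(n_layers):
            self.layers.append(nn.Sequential(
                nn.Conv2d(ch, growth, 3, padding=1),
                nn.BatchNorm2d(growth),
                nn.ReLU(inplace=True)))
            ch += growth
        self.out_ch = ch

    def forward(self, x):
        for layer in self.layers:
            x = torch.cat([x, layer(x)], dim=1)
        return x

class EncoderDecoderSketch(nn.Module):
    """Two-level encoder-decoder with a skip connection and two output heads."""
    def __init__(self, n_b=5):
        super().__init__()
        self.enc1 = DenseBlock(n_b + 1)      # input: n_b DW images + 1 ADC map
        self.pool = nn.MaxPool2d(2)
        self.enc2 = DenseBlock(self.enc1.out_ch)
        self.up = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False)
        self.dec1 = DenseBlock(self.enc2.out_ch + self.enc1.out_ch)
        self.head_adc = nn.Conv2d(self.dec1.out_ch, 1, 1)  # raw ADC output
        self.head_s0 = nn.Conv2d(self.dec1.out_ch, 1, 1)   # raw S0 features

    def forward(self, x):                    # x: (B, n_b + 1, H, W)
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        d1 = self.dec1(torch.cat([self.up(e2), e1], dim=1))
        return self.head_adc(d1), self.head_s0(d1)
```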
To generate ADC values within a physiologically plausible range, we adopt a scaled sigmoid activation function in the decoder's output head $h_{ADC}$, formulated as:
$$\mathrm{ADC} = A_{\min} + \left(A_{\max} - A_{\min}\right) \cdot \sigma\!\left(x_{ADC}\right) \qquad (4)$$
where $A_{\min}$ and $A_{\max}$ are the lower and upper boundaries of ADC values respectively, $\sigma$ is the sigmoid function, and $x_{ADC}$ is the output of $h_{ADC}$ before the activation layer. $A_{\min}$ was set to 0, the smallest possible value, indicating no diffusion, and $A_{\max}$ was set to the ADC of free water at 37°C, which equals 0.0032 mm²/s and is presumably the largest possible value in vivo.
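As a concrete illustration, the activation of Equation (4) is a one-liner in PyTorch; the function name is ours, and the default bounds follow the text.

```python
import torch

def scaled_sigmoid(x, a_min=0.0, a_max=0.0032):
    """Squash raw head output into the physiologic ADC range
    [a_min, a_max] mm^2/s, per Equation (4)."""
    return a_min + (a_max - a_min) * torch.sigmoid(x)
```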
According to the monoexponential model specified in Equation (1), both ADC and $S_0$ are learnable parametric maps, where $S_0$ represents the intensity values of the DW image in the absence of diffusion weighting, and DW images are collected with diffusion weighting at nonzero b-values. The intensity values of DW images at different b-values are positive and decrease as the b-value increases. Therefore, the output of $h_{S_0}$ should be equal to or larger than its corresponding DW scan acquired at the lowest b-value, denoted by $S(b_{\min})$. Accordingly, the output of $h_{S_0}$ is formulated as:
$$S_0 = S(b_{\min}) + \mathrm{ReLU}\!\left(x_{S_0}\right) \qquad (5)$$
where $S(b_{\min})$ is the DW scan collected at the lowest b-value, and $x_{S_0}$ is the feature map generated by the decoder's output head $h_{S_0}$.
Given $\widehat{\mathrm{ADC}}$ and $\hat{S}_0$ generated by the deep learning model and their ground truths $\mathrm{ADC}_f$ and $S_{0,f}$ obtained from the monoexponential model using fully-sampled DW images, DW images at different b-values can be calculated to regularize the output maps by encouraging the generated DW images to be close to their corresponding fully sampled DW images $S_f(b_i)$. According to Equation (1), DW images, referred to as $\hat{S}(b_i)$, can be computed from the generated $\widehat{\mathrm{ADC}}$ and $\hat{S}_0$ with the monoexponential model at the $N$ b-values used for collecting the DW images.
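Under Equation (1), this regeneration step reduces to a broadcasted exponential; a minimal PyTorch sketch (names are ours) is:

```python
import torch

def synthesize_dwi(adc, s0, bvals):
    """Regenerate DW images from learned ADC and S0 maps via Equation (1).

    adc, s0: tensors of shape (B, 1, H, W); bvals: 1-D tensor of b-values.
    Returns a (B, n_b, H, W) tensor used in the DWI consistency loss."""
    b = bvals.view(1, -1, 1, 1).to(adc)
    return s0 * torch.exp(-b * adc)
```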
Bottleneck Self-Attention:
The attention mechanism is a technique used in deep learning to selectively focus on different parts of the input data, allowing deep learning models to capture the relationships between different elements of the input and make more informed decisions (32). Bottleneck self-attention is a variant of self-attention that operates on the bottleneck of an Encoder-Decoder deep learning network (26). Let $X$ be the input feature map, $Q$ the queries, $K$ the keys, and $V$ the values; the output of the self-attention layer can be computed as:
$$y_i = \sum_{j} \mathrm{softmax}_j\!\left(q_i^{\top} k_j\right) v_j \qquad (6)$$
where $i$ and $j$ represent two different locations, and $q_i$, $k_j$, and $v_j$ are elements of the learned attention matrices $Q$, $K$, and $V$, respectively. To make the self-attention mechanism sensitive to positional information, a learnable position encoding is incorporated into the self-attention layer. A multi-head self-attention module is applied, using different query, key, and value matrices to enable the attention layers to focus on different parts of the input feature maps.
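A simplified PyTorch sketch of such a bottleneck MHSA layer follows. Adding a learnable positional encoding to the keys is a simplification of the relative-position scheme of (26), and the class name and sizes are our assumptions; the positional tensor must be at least as large as the bottleneck feature map.

```python
import torch
import torch.nn as nn

class BottleneckMHSA(nn.Module):
    """Multi-head self-attention over a 2-D bottleneck feature map with a
    learnable positional encoding (simplified from ref. 26)."""
    def __init__(self, channels, heads=4, height=12, width=12):
        super().__init__()
        self.heads, self.dh = heads, channels // heads
        self.qkv = nn.Conv2d(channels, channels * 3, kernel_size=1)
        self.pos = nn.Parameter(torch.randn(1, channels, height, width) * 0.02)

    def forward(self, x):
        b, c, h, w = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=1)
        k = k + self.pos[..., :h, :w]          # position-aware keys
        def split(t):                          # (B, C, H, W) -> (B, heads, HW, dh)
            return t.reshape(b, self.heads, self.dh, h * w).transpose(2, 3)
        q, k, v = split(q), split(k), split(v)
        attn = torch.softmax(q @ k.transpose(2, 3) / self.dh ** 0.5, dim=-1)
        return (attn @ v).transpose(2, 3).reshape(b, c, h, w)
```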
Loss functions:
Multiple loss functions are adopted to optimize the network for generating high quality ADC maps, including:
$$\mathcal{L}_{ADC} = \big\lVert \widehat{\mathrm{ADC}} - \mathrm{ADC}_f \big\rVert_1, \quad \mathcal{L}_{S_0} = \big\lVert \hat{S}_0 - S_{0,f} \big\rVert_1, \quad \mathcal{L}_{DWI} = \sum_{i=1}^{N} \big\lVert \hat{S}(b_i) - S_f(b_i) \big\rVert_1 \qquad (7)$$
where $\widehat{\mathrm{ADC}}$ is the generated ADC map, $\hat{S}_0$ is the generated DW image at a b-value of 0, $\hat{S}(b_i)$ are the DW images computed from the generated $\widehat{\mathrm{ADC}}$ and $\hat{S}_0$ using Equation (1) at the b-values $b_i$, and $\mathrm{ADC}_f$, $S_{0,f}$, and $S_f(b_i)$ are their counterparts computed from the fully sampled data. The overall loss function is:
$$\mathcal{L} = \lambda_1 \mathcal{L}_{ADC} + \lambda_2 \mathcal{L}_{S_0} + \lambda_3 \mathcal{L}_{DWI} \qquad (8)$$
where $\lambda_1$, $\lambda_2$, and $\lambda_3$ are regularization parameters, whose values were selected to yield the overall best results among a range of candidate values, as detailed in Supplementary Table S2 and Section 1.2.
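Assembled in PyTorch, the objective of Equations (7) and (8) might look as follows; the L1 penalty and unit default weights are illustrative assumptions.

```python
import torch.nn.functional as F

def total_loss(adc_hat, s0_hat, dwi_hat, adc_f, s0_f, dwi_f,
               lambdas=(1.0, 1.0, 1.0)):
    """Weighted sum of the three loss terms of Equation (8)."""
    l_adc = F.l1_loss(adc_hat, adc_f)   # ADC fidelity
    l_s0 = F.l1_loss(s0_hat, s0_f)      # S0 fidelity
    l_dwi = F.l1_loss(dwi_hat, dwi_f)   # monoexponential DWI consistency
    return lambdas[0] * l_adc + lambdas[1] * l_s0 + lambdas[2] * l_dwi
```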
2.3. Implementation Details and Evaluation Metrics
We performed our experiments on a single NVIDIA TITAN RTX GPU with a PyTorch implementation, using the Adam optimizer with fixed learning rate and weight decay. We chose a head size of four for the multi-head self-attention module (26) in our bottleneck transformer. The model was trained for a total of 1000 epochs in approximately two hours. While the dynamic ranges of the signal intensities of the DW images at the five b-values vary widely, the corresponding ADC maps lie in the range [0, 0.0032] mm²/s, where the maximal value corresponds to the ADC of free water at 37°C. To normalize the DW images and their corresponding ADC maps into feasible scales for the deep learning model, we clipped the DWI data at the 99th percentile and normalized the clipped data into the [0, 1] range. The ADC maps were normalized into the [0, 1] range during training, and the predicted ADC maps were scaled back to their original range.
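A sketch of this normalization under the stated conventions (99th percentile clipping, free-water cap of 0.0032 mm²/s); the function name and return convention are ours.

```python
import numpy as np

def normalize_inputs(dwi, adc, adc_max=0.0032):
    """Scale DW images and the ADC map into [0, 1] for the network.

    dwi: (n_b, H, W) array; adc: (H, W) array in mm^2/s.
    Returns normalized arrays plus the DWI scale for later inversion."""
    scale = np.percentile(dwi, 99)              # clip at the 99th percentile
    dwi_n = np.clip(dwi, 0, scale) / scale
    adc_n = np.clip(adc, 0, adc_max) / adc_max  # free-water cap
    return dwi_n, adc_n, scale

# After inference, predicted ADC maps are mapped back to physical units:
# adc_pred = adc_pred_normalized * 0.0032
```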
We evaluated ADC map generation on the testing images. In particular, ADC maps computed from the fully sampled imaging data were used as ground truth. To quantitatively evaluate the generated ADC maps, we used the correlation coefficient (CC) to quantify the voxelwise linear relationship between the generated and ground truth ADC maps, in addition to the structural similarity (SSIM) index, peak signal-to-noise ratio (PSNR), and normalized mean square error (NMSE), which are widely adopted in image generation studies (22–24). We evaluated the generated ADC maps of the testing data on both whole images and regions of interest (ROIs), including tumor, muscle, and kidney. For whole-image evaluation, we used all testing images with background excluded by masking out non-tissue regions to reduce the influence of noise. Instead of manually generating tissue masks, we automatically excluded image pixels whose ADC values fell outside [0, 0.0032] mm²/s or whose DWI intensity values were not in descending order with ascending b-values. The ROIs were manually labeled.
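For reference, the scalar metrics over a tissue mask can be sketched as follows; the PSNR peak convention is an assumption, and SSIM is omitted here (a standard implementation such as skimage.metrics.structural_similarity can be used).

```python
import numpy as np

def evaluate(pred, gt, mask):
    """Correlation coefficient, PSNR (dB), and NMSE between a predicted and
    a ground-truth ADC map over a boolean tissue mask."""
    p, g = pred[mask], gt[mask]
    cc = np.corrcoef(p, g)[0, 1]
    mse = np.mean((p - g) ** 2)
    psnr = 10 * np.log10(g.max() ** 2 / mse)
    nmse = np.sum((p - g) ** 2) / np.sum(g ** 2)
    return cc, psnr, nmse
```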
2.4. Comparison with state-of-the-art deep learning and compressed sensing methods
We compared our DeepADC-Net with state-of-the-art deep learning methods, including: 1) U-Net (30), 2) DenseU-Net (31), 3) FBP-ConvNet (16), and 4) Att-UNet (17). These methods were implemented with the same network architectures as reported in their corresponding papers to generate high quality ADC maps from the accelerated ADC maps. We used the same training and inference settings to train all the models, saving the best model based on the best correlation coefficient score estimated on the training dataset, and evaluated the models' overall performance on the testing dataset with all four quantitative metrics described in Section 2.3 (CC, SSIM, PSNR, and NMSE). Specifically, the U-Net model contained an encoder and a decoder, each with 4 convolutional blocks; the DenseU-Net model contained an encoder and a decoder, each with 5 densely connected blocks; the FBP-ConvNet model used a U-Net based architecture with a skip connection between the input and the output; and the Att-UNet model utilized a channel attention mechanism within the U-Net backbone. Furthermore, we also compared our method with a compressed sensing (CS) method (33) by adopting the implementation provided by SparseMRI V0.2, using MATLAB (version R2022a, MathWorks, Natick, MA). Since the CS method performs best with a randomized sampling scheme, we selected the views for the accelerated datasets using block randomization, in which a random view is selected out of every four.
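The block-randomized view selection can be sketched as follows; the seed handling and the treatment of leftover views beyond the last full block are our assumptions.

```python
import numpy as np

def block_randomized_views(n_views=403, factor=4, seed=0):
    """Pick one view at random from each consecutive block of `factor`
    views, giving an irregular radial subsampling pattern suited to CS."""
    rng = np.random.default_rng(seed)
    starts = np.arange(0, n_views - factor + 1, factor)
    return starts + rng.integers(0, factor, size=starts.size)
```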
2.5. Ablation studies
We carried out ablation studies to determine how individual components of the proposed deep learning method contribute to ADC map generation by systematically removing them, including different inputs, different combinations of loss function terms, and self-attention, as summarized in Table 1. All models under evaluation had the same DenseU-Net backbone (31); one model took as input only the ADC map generated from the accelerated DW images by fitting the monoexponential model, and all other models took the same multi-channel input of both the accelerated DW images and their associated ADC map. All models were trained and evaluated with the same training and inference settings. All ablation studies were performed on DW images with an acceleration factor of 4.
Table 1.
Ablation studies of deep learning models trained with (✓) and without (✗) the indicated components of the proposed deep learning method, including different inputs, different loss functions, and self-attention.
| Models | Inputs | $\mathcal{L}_{ADC}$ | $\mathcal{L}_{S_0}$ | $\mathcal{L}_{DWI}$ | Self-Attention |
|---|---|---|---|---|---|
| DenseU-Net | ADC | ✓ | ✗ | ✗ | ✗ |
| DenseU-ADC | DWI + ADC | ✓ | ✗ | ✗ | ✗ |
| DenseU-DWI | DWI + ADC | ✗ | ✗ | ✓ | ✗ |
| DenseU-ADC-DWI | DWI + ADC | ✓ | ✓ | ✓ | ✗ |
| DeepADC-Net | DWI + ADC | ✓ | ✓ | ✓ | ✓ |
2.5.1. Ablation studies on network inputs
As indicated in Table 1 (rows 1 and 2), we evaluated how different inputs contributed to ADC map generation with deep learning models trained by optimizing $\mathcal{L}_{ADC}$. Specifically, the DenseU-Net model with the accelerated ADC map alone as its input is referred to as DenseU-Net, whereas the DenseU-Net model with the multi-channel input of both the accelerated DW images at multiple b-values and their ADC map is referred to as DenseU-ADC.
2.5.2. Ablation study on loss functions
We evaluated how the loss function terms contributed to ADC map generation with deep learning models trained on the multi-channel input of both the accelerated DW images at multiple b-values and their ADC map. Specifically, we trained two deep learning models to generate ADC and $S_0$ maps, which are used to further compute DW images, by optimizing either $\mathcal{L}_{DWI}$ alone or a combination of $\mathcal{L}_{ADC}$, $\mathcal{L}_{S_0}$, and $\mathcal{L}_{DWI}$. The former is referred to as DenseU-DWI and the latter as DenseU-ADC-DWI.
2.5.3. Ablation study on self-attentions
We also investigated how the self-attention module contributed to the generation of ADC maps. The multi-head self-attention (MHSA) module was implemented in the network bottleneck, which consists of three densely connected convolutional layers, with the MHSA module applied after the first and second convolutional layers. This model is referred to as DeepADC-Net and was compared with DenseU-ADC-DWI.
3. Results
3.1. Comparison with state-of-the-art Methods on the Entire Cross-Section
Figure 2 shows representative fully-sampled and accelerated DW images and their corresponding ADC maps obtained by fitting the monoexponential model, indicating that the accelerated DW images were noisy, especially at higher b-values, and that the derived ADC maps lost anatomical detail.
Figure 2.
Diffusion weighted images and ADC maps from fully sampled and accelerated Rad-DW-SE scans at different b-values. The accelerated images were obtained by down-sampling the fully-sampled data with acceleration factors of four and eight, and the ADC maps were computed from their corresponding DW images by fitting a monoexponential model. Compared with their fully-sampled counterparts, the accelerated DW images appear noisy, especially at higher b-values, and the derived ADC maps lose anatomical detail, as shown in the green bounding boxes. Image quality decreases as the degree of acceleration increases.
Table 2 summarizes quantitative evaluation metrics for ADC maps generated by DeepADC-Net and the alternative state-of-the-art methods under comparison. The ADC maps estimated from accelerated DW images with least-squares fitting were substantially different from those derived from the corresponding fully sampled DW images. U-Net, DenseU-Net, Att-UNet, and FBP-ConvNet yielded ADC maps with improved similarity to the fully sampled data compared with those estimated directly from accelerated DW images with least-squares fitting. DeepADC-Net yielded the best similarity on all metrics studied, improving upon the second-best method by 3.15%, 0.13%, 1.29 dB, and 0.63% in terms of CC, SSIM, PSNR, and NMSE, respectively. The quantitative evaluation results summarized in Table 2 also demonstrate that it was more challenging to estimate high quality ADC maps from the 8x accelerated data than from the 4x accelerated data.
Table 2.
Quantitative evaluation of ADC maps generated by DeepADC-Net and alternative state-of-the-art methods on both the 4x and 8x accelerated testing datasets. Results are shown as mean ± standard deviation.
| Models | Sampling Factor | Correlation (%) | SSIM (%) | PSNR (dB) | NMSE (%) |
|---|---|---|---|---|---|
| Least-Squares-Fitting | 4x | 68.35 ±6.25 | 96.13 ±1.60 | 14.98 ±1.69 | 13.39 ±3.76 |
| Compressed Sensing | 72.14 ± 6.87 | 98.79 ± 0.43 | 17.24 ± 1.44 | 7.81 ± 2.13 | |
| FBPConvNet | 87.28 ±3.66 | 99.49 ±0.15 | 21.89 ±1.11 | 2.45 ±0.34 | |
| AttUnet | 87.76 ±3.41 | 99.45 ±0.08 | 21.73 ±0.84 | 2.61 ±0.38 | |
| DenseUnet | 87.68 ±3.48 | 99.47 ±0.10 | 21.85 ±0.96 | 2.47 ±0.30 | |
| Unet | 87.41 ±3.57 | 99.48 ±0.11 | 21.89 ±1.09 | 2.46 ±0.32 | |
| DeepADC-Net | 90.91 ±2.28 | 99.62 ±0.06 | 23.18 ±0.90 | 1.82 ±0.19 | |
| Least-Squares-Fitting | 8x | 46.00 ±7.90 | 84.86 ±4.44 | 9.60 ±1.53 | 43.15 ±9.34 |
| Compressed Sensing | 57.56 ±22.0 | 97.36 ±3.44 | 15.90 ±2.76 | 11.46 ±10.05 | |
| FBPConvNet | 76.01 ±5.45 | 99.02 ±0.18 | 19.35 ±0.92 | 4.41 ±0.63 | |
| AttUnet | 76.76 ±5.49 | 99.00 ±0.18 | 19.30 ±0.91 | 4.52 ±0.79 | |
| DenseUnet | 76.16 ±5.31 | 99.03 ±0.19 | 19.40 ±0.92 | 4.37 ±0.64 | |
| Unet | 76.27 ±5.43 | 99.03 ±0.18 | 19.39 ±0.92 | 4.36 ±0.62 | |
| DeepADC-Net | 85.77 ±3.15 | 99.37 ±0.09 | 21.06 ±0.75 | 2.97 ±0.49 |
We also compared our method with a compressed sensing (CS) method (33). As summarized in Table 2, the CS method performed better than the least-squares fitting method but worse than DeepADC-Net on all four performance evaluation metrics. Please refer to Supplementary Section 1.6 for more details.
Representative ADC maps computed from the same fully-sampled data and their accelerated versions, generated by all the methods under comparison, are shown in Figure 3 (rows 1 and 4). The ADC maps estimated from the accelerated DW images using least-squares fitting lose texture details compared with those from the corresponding fully sampled DW images. While the CS method yielded better visual quality than least-squares fitting, the CNN based methods, including U-Net, DenseU-Net, Att-UNet, and FBP-ConvNet, outperformed both CS and least-squares fitting in generating ADC maps. Notably, DeepADC-Net generated the best accelerated ADC maps, with smaller errors than those generated by the alternative methods under comparison for both 4x and 8x accelerated image slices.
Figure 3.
Visualization of ADC maps for 4x and 8x accelerated image slices obtained by all methods under comparison. The displayed image slice was randomly selected among those with median NMSE performance. The first and fourth rows show the ground truth and the generated ADC maps. The second and fifth rows show the absolute error maps, with the display range set to 75% of the maximum difference. The third and sixth rows show the absolute error maps in the tumor region, with the display range set to 25% of the maximum difference.
3.2. Comparison with state-of-the-art methods on the regions of interest
We evaluated all the methods under comparison in three ROIs: tumor, muscle, and kidney. As summarized in Table 3 for the 4x accelerated testing dataset, least-squares fitting showed the worst performance on all three ROIs, and the CS method performed better than least-squares fitting on all three ROIs. Among the remaining deep learning methods under comparison, Att-UNet tended to achieve the best correlation coefficient, whereas FBP-ConvNet tended to yield the best SSIM, PSNR, and NMSE across the three ROIs. Our DeepADC-Net obtained the overall best performance on all three ROIs. Similar trends were observed on the 8x accelerated testing dataset, as demonstrated by the results summarized in Table 4. Specifically, our DeepADC-Net improved upon the second best method by 10.81%, 21.06%, and 15.62% in correlation coefficient on tumor, muscle, and kidney, respectively.
Table 3.
Quantitative comparison of ADC maps generated by DeepADC-Net and alternative state-of-the-art methods for the 4x accelerated testing dataset on different ROIs. Results are shown as mean ± standard deviation.
| Models | ROIs | Correlation (%) | SSIM (%) | PSNR (dB) | NMSE (%) |
|---|---|---|---|---|---|
| Least-Squares-Fitting | Tumor | 69.43 ±15.9 | 97.38 ±3.00 | 16.16 ±3.46 | 10.22 ±9.70 |
| Compressed Sensing | 71.88 ±19.1 | 99.09 ±0.86 | 18.95 ±3.19 | 4.76 ±3.48 | |
| FBPConvNet | 84.62 ±9.1 | 99.50 ±0.27 | 21.44 ±2.07 | 2.50 ±1.35 | |
| AttUnet | 84.79 ±9.5 | 99.46 ±0.37 | 21.27 ±2.19 | 2.67 ±1.87 | |
| DenseUnet | 84.60 ±9.6 | 99.47 ±3.11 | 21.24 ±2.03 | 2.61 ±1.51 | |
| Unet | 84.67 ±9.1 | 99.49 ±0.28 | 21.43 ±2.10 | 2.51 ±1.26 | |
| DeepADC-Net | 88.37 ±6.9 | 99.62 ±0.18 | 22.49 ±1.92 | 1.90 ±0.85 | |
| Least-Squares-Fitting | Muscle | 60.11 ±19.2 | 94.84 ±5.41 | 13.49 ±3.72 | 17.08 ±14.71 |
| Compressed Sensing | 65.35 ±19.3 | 98.58 ±1.60 | 16.87 ±2.93 | 6.44 ±5.77 | |
| FBPConvNet | 79.74 ±10.0 | 99.41 ±0.31 | 20.20 ±1.83 | 2.85 ±1.26 | |
| AttUnet | 80.95 ±9.20 | 99.41 ±0.31 | 20.17 ±1.78 | 2.88 ±1.41 | |
| DenseUnet | 81.08 ±8.9 | 99.42 ±0.28 | 20.18 ±1.66 | 2.84 ±1.28 | |
| Unet | 79.94 ±9.9 | 99.41 ±0.31 | 20.17 ±1.84 | 2.87 ±1.27 | |
| DeepADC-Net | 86.27 ±6.0 | 99.59 ±0.19 | 21.51 ±1.50 | 2.08 ±0.84 | |
| Least-Squares-Fitting | Kidney | 61.76 ±19.5 | 96.04 ±4.48 | 14.52 ±3.54 | 14.21 ±12.77 |
| Compressed Sensing | 64.24 ±20.4 | 98.89 ±1.14 | 17.56 ±2.87 | 5.75 ±4.26 | |
| FBPConvNet | 80.40 ±12.2 | 99.45 ±2.82 | 20.67 ±1.92 | 2.74 ±1.18 | |
| AttUnet | 80.80 ±12.5 | 99.40 ±0.34 | 20.49 ±2.01 | 2.93 ±1.66 | |
| DenseUnet | 80.31 ±12.8 | 99.41 ±0.33 | 20.53 ±1.88 | 2.85 ±1.88 | |
| Unet | 80.64 ±11.7 | 99.44 ±0.28 | 20.65 ±1.89 | 2.73 ±1.12 | |
| DeepADC-Net | 85.30 ±9.7 | 99.57 ±0.20 | 21.76 ±1.76 | 2.11 ±0.87 |
Table 4.
Quantitative comparison of ADC maps generated by DeepADC-Net and alternative state-of-the-art methods for the 8x accelerated testing dataset on different ROIs. Results are shown as mean ± standard deviation.
| Models | ROIs | Correlation (%) | SSIM (%) | PSNR (dB) | NMSE (%) |
|---|---|---|---|---|---|
| Least-Squares-Fitting | Tumor | 45.64 ±18.5 | 89.24 ±7.92 | 10.51 ±3.11 | 33.84 ±20.72 |
| Compressed Sensing | 58.18 ±22.3 | 97.25 ±5.27 | 16.14 ±3.89 | 10.79 ±13.10 | |
| FBPConvNet | 72.11 ±14.0 | 99.10 ±0.51 | 18.99 ±2.52 | 4.42 ±2.52 | |
| AttUnet | 72.68 ±14.2 | 99.06 ±0.53 | 18.85 ±2.01 | 4.63 ±2.89 | |
| DenseUnet | 73.01 ±16.3 | 99.16 ±0.46 | 19.17 ±1.97 | 4.17 ±2.16 | |
| Unet | 71.94 ±14.0 | 99.09 ±0.50 | 19.00 ±1.97 | 4.42 ±2.52 | |
| DeepADC-Net | 83.82 ±9.7 | 99.45 ±0.31 | 21.08 ±2.00 | 2.76 ±1.32 | |
| Least-Squares-Fitting | Muscle | 27.32 ±17.10 | 75.14 ±9.48 | 6.24 ±1.84 | 68.55 ±19.74 |
| Compressed Sensing | 49.77 ±20.9 | 95.81 ±6.84 | 13.85 ±3.36 | 14.23 ±16.53 | |
| FBPConvNet | 59.19 ±17.5 | 98.92 ±0.48 | 17.57 ±1.62 | 5.11 ±1.93 | |
| AttUnet | 59.64 ±17.2 | 98.88 ±0.55 | 17.48 ±1.68 | 5.27 ±2.43 | |
| DenseUnet | 58.80 ±17.5 | 98.91 ±0.21 | 17.55 ±1.60 | 5.13 ±1.97 | |
| Unet | 59.23 ±16.7 | 98.92 ±0.46 | 17.55 ±1.58 | 5.12 ±1.91 | |
| DeepADC-Net | 80.70 ±7.4 | 99.41 ±0.30 | 19.63 ±1.55 | 2.94 ±1.76 | |
| Least-Squares-Fitting | Kidney | 34.10 ±16.8 | 80.11 ±10.11 | 7.58 ±2.28 | 56.61 ±22.48 |
| Compressed Sensing | 49.99 ±22.5 | 96.53 ±5.48 | 14.55 ±3.57 | 13.34 ±14.81 | |
| FBPConvNet | 62.29 ±17.7 | 98.91 ±0.60 | 17.99 ±1.92 | 5.16 ±2.61 | |
| AttUnet | 62.36 ±17.5 | 98.85 ±0.66 | 17.84 ±1.92 | 5.42 ±3.23 | |
| DenseUnet | 61.99 ±17.8 | 98.91 ±5.63 | 18.01 ±1.92 | 5.13 ±2.47 | |
| Unet | 62.53 ±17.1 | 98.91 ±0.57 | 18.01 ±1.94 | 5.12 ±2.43 | |
| DeepADC-Net | 78.15 ±11.5 | 99.30 ±0.32 | 19.73 ±1.80 | 3.43 ±1.69 |
Figure 3 (rows 3 and 6) also shows error maps in a tumor region for both 4x and 8x accelerated data, showing that direct least-squares fitting of the accelerated images yielded the worst performance. All the deep learning methods improved upon it, and our method obtained the overall best performance.
3.3. Ablation studies on network inputs, loss functions and self-attentions
Table 5 summarizes quantitative performance measures of the deep learning models built by our method with different components. Specifically, DenseU-Net performed worse than DenseU-ADC, indicating that the multi-channel input of both the accelerated DW images and their ADC map provided richer information than the accelerated ADC map alone for generating high-quality ADC maps.
Table 5.
Ablation studies of ADC map generation for the 4x accelerated testing dataset. Results are shown as mean ± standard deviation.
| Models | Correlation (%) | SSIM (%) | PSNR (dB) | NMSE (%) |
|---|---|---|---|---|
| DenseU-Net | 87.68 ± 3.48 | 99.47 ± 0.10 | 21.85 ± 0.96 | 2.47 ± 0.30 |
| DenseU-ADC | 90.15 ± 2.65 | 99.59 ± 0.06 | 22.81 ± 0.91 | 1.96 ± 0.21 |
| DenseU-DWI | 84.37 ± 3.13 | 99.37 ± 0.10 | 20.80 ± 0.70 | 3.15 ± 0.48 |
| DenseU-ADC-DWI | 90.63 ± 2.46 | 99.62 ± 0.07 | 23.05 ± 0.88 | 1.87 ± 0.24 |
| DeepADC-Net | 90.91 ± 2.28 | 99.62 ± 0.06 | 23.18 ± 0.90 | 1.82 ± 0.19 |
The deep learning models with the same multi-channel input but trained with different combinations of loss function terms performed differently. Specifically, DenseU-DWI, which was trained by optimizing $\mathcal{L}_{DWI}$ alone, had the worst performance, falling below the two models trained by optimizing the three complementary loss function terms $\mathcal{L}_{ADC}$, $\mathcal{L}_{S_0}$, and $\mathcal{L}_{DWI}$. DenseU-ADC-DWI shared with DeepADC-Net the same input and the same loss function, but it performed worse than DeepADC-Net, indicating that the multi-head self-attention module was useful for improving ADC map generation.
4. Discussion
This study has demonstrated that a new deep learning method, referred to as DeepADC-Net, successfully generated ADC maps from accelerated diffusion-weighted MR data. The network was trained with three complementary loss functions: the differences between the learned ADC and $S_0$ maps and those computed from fully-sampled DW images ($\mathcal{L}_{ADC}$ and $\mathcal{L}_{S_0}$), and the difference between the fully-sampled DW images and the DW images derived from the learned ADC and $S_0$ maps ($\mathcal{L}_{DWI}$). The network was further enhanced by a bottleneck transformer with a multi-head self-attention module. The method has been evaluated on DW images accelerated by 4x and 8x. Quantitative performance measures have demonstrated that our method obtained an accurate estimation of ADC maps while reducing the acquisition time from 25 minutes to just over six minutes. Since DW-MRI is highly sensitive to any unwanted motion that is not diffusion-related, and motion is often exacerbated by long scan times, such a reduction in scan time could potentially yield more accurate ADC values and lead to better distinction between benign and malignant tumors or better assessment of changes in response to treatment.
We have compared our method with conventional least-squares fitting and state-of-the-art deep learning methods. In particular, we compared our method with the least-squares fitting algorithm, U-Net (30), DenseU-Net (31), FBP-ConvNet (16), and Att-UNet (17) under the same training and testing settings. Comparison results, as summarized in Tables 2, 3, and 4, have demonstrated that DeepADC-Net obtained the best results among all methods under comparison for ADC map generation on both the entire cross-section images and specific ROIs. Figure 3 visualizes differences between ADC maps generated by DeepADC-Net and the alternative methods under comparison, demonstrating that DeepADC-Net generated high-quality ADC maps with the smallest errors and the least discrepancy from the fully sampled ADC maps. Furthermore, DeepADC-Net outperformed the CS method in generating ADC maps, as demonstrated in Table 2, while the CS method performed better than the least-squares fitting method. This shows that deep learning can overcome the limitations of the CS method, such as inefficiency in capturing complex features through sparsifying transforms (34,35).
The present study built deep learning models to compute high quality ADC maps from accelerated DW images, a task different from typical image super-resolution studies. Though many deep learning algorithms have been developed for accelerated MRI data, our method was applied to a dataset obtained with the radially sampled diffusion weighted spin-echo (Rad-DW-SE) acquisition method for quantitative ADC estimation. Our approach is fundamentally different from existing deep learning methods that focus on improving MR images reconstructed from accelerated k-space data. Our method directly computes ADC maps in the deep learning process with regularization terms built upon the monoexponential model, facilitating end-to-end learning of ADC maps with improved quality and efficiency. It is noteworthy that our proposed model can be considered a plug-and-play module that can be adopted by other deep learning approaches, including those under comparison. Since no alternative deep learning method is available for a direct comparison, we evaluated the proposed method through extensive ablation studies. Our ablation studies have demonstrated that 1) the accelerated DWI and their derived ADC maps as multi-channel inputs improved the model performance compared to the model with only ADC maps as its input, 2) minimizing the discrepancy between the DW images computed from the generated maps and the fully-sampled DW images regularized the deep-learning-generated ADC maps, and 3) the bottleneck transformer was useful for improving the generation of ADC maps. Additionally, we have evaluated DeepADC-Net on both real-world and simulated 4x accelerated datasets (Supplementary Section 1.1, Table S1), and experimental results show that DeepADC-Net outperformed standard curve-fitting methods on real-world accelerated data. However, the results should be interpreted with the caveat that the real-world 4x accelerated data might capture diffusion information different from that captured by the fully sampled data because they were not collected simultaneously.
Our research has several limitations. First, our ablation studies were carried out on the 4x accelerated dataset, and we tuned model parameters by fixing some of them while varying the rest, instead of using a fine-grid search. Second, we trained and evaluated our method largely on simulated data at two different acceleration factors, since it is difficult to collect both real-world accelerated and fully-sampled data simultaneously. We did evaluate our method on real-world accelerated data and compared the derived ADC maps with those computed from fully-sampled data collected subsequently; however, such pairs of data might capture different diffusion information. Third, the present study only considered identical, evenly spaced view angles for the DW images at all b-values. A more complete analysis would involve other view ordering schemes, such as golden-angle ordering, or schemes in which different view angles are used to encode different b-values. In our current work, although fewer b-values are required to fit an exponential curve, five b-values were collected to improve the least-squares fitting and to leave open the possibility of assessing diffusion kurtosis in subsequent analyses. The availability of additional b-values may have facilitated the high acceleration factors achieved in this work relative to what would have been possible with fewer b-values. Lastly, our research has focused on preclinical imaging studies; therefore, we have not specifically tested our method on human data, such as those collected using PROPELLER (36). We believe that our method can be applied to data acquired with other imaging schemes, given that the proposed method is based on deep learning, which can learn the underlying relationships among k-space data regardless of the acquisition scheme used to obtain the data. The network, however, would need to be retrained with data acquired using the desired acquisition scheme.
5. Conclusion
We developed a deep learning method, referred to as DeepADC-Net, to generate apparent diffusion coefficient (ADC) maps from accelerated diffusion-weighted MR data, achieving 4- to 8-fold acceleration of the DWI acquisition. The proposed DeepADC-Net, which integrates a densely connected Encoder-Decoder architecture with a vision transformer, is shown to perform better than a widely used compressed sensing method and several state-of-the-art deep learning models for computing ADC maps.
Supplementary Material
Acknowledgements
This work was supported in part by the National Institutes of Health [grant numbers: MH120811, EB022573, AG066650, and U24 CA231858 (Penn Pancreatic Cancer Imaging Resource)].
References
- 1. White NS, McDonald C, Farid N, Kuperman J, Karow D, Schenker-Ahmed NM, Bartsch H, Rakow-Penner R, Holland D, Shabaik A, Bjornerud A, Hope T, Hattangadi-Gluth J, Liss M, Parsons JK, Chen CC, Raman S, Margolis D, Reiter RE, Marks L, Kesari S, Mundt AJ, Kane CJ, Carter BS, Bradley WG, Dale AM. Diffusion-weighted imaging in cancer: physical foundations and applications of restriction spectrum imaging. Cancer Res 2014;74(17):4638–4652.
- 2. Le Bihan D. Apparent diffusion coefficient and beyond: what diffusion MR imaging can tell us about tissue structure. Radiology 2013;268(2):318–322.
- 3. Guo Y, Cai YQ, Cai ZL, Gao YG, An NY, Ma L, Mahankali S, Gao JH. Differentiation of clinically benign and malignant breast lesions using diffusion-weighted imaging. Journal of Magnetic Resonance Imaging 2002;16(2):172–178.
- 4. Marini C, Iacconi C, Giannelli M, Cilotti A, Moretti M, Bartolozzi C. Quantitative diffusion-weighted MR imaging in the differential diagnosis of breast lesion. European Radiology 2007;17(10):2646–2655.
- 5. Sinha S, Lucas-Quesada FA, Sinha U, DeBruhl N, Bassett LW. In vivo diffusion-weighted MRI of the breast: potential for lesion characterization. Journal of Magnetic Resonance Imaging 2002;15(6):693–704.
- 6. Tong D, Yenari M, Albers G, O'Brien M, Marks M, Moseley M. Correlation of perfusion- and diffusion-weighted MRI with NIHSS score in acute (<6.5 hour) ischemic stroke. Neurology 1998;50(4):864–869.
- 7. Yamasaki F, Kurisu K, Satoh K, Arita K, Sugiyama K, Ohtaki M, Takaba J, Tominaga A, Hanaya R, Yoshioka H. Apparent diffusion coefficient of human brain tumors at MR imaging. Radiology 2005;235(3):985–991.
- 8. Romanello Joaquim M, Furth EE, Fan Y, Song HK, Pickup S, Cao J, Choi H, Gupta M, Cao Q, Shinohara R. DWI metrics differentiating benign intraductal papillary mucinous neoplasms from invasive pancreatic cancer: a study in GEM models. Cancers 2022;14(16):4017.
- 9. Bammer R, Keeling SL, Augustin M, Pruessmann KP, Wolf R, Stollberger R, Hartung HP, Fazekas F. Improved diffusion-weighted single-shot echo-planar imaging (EPI) in stroke using sensitivity encoding (SENSE). Magnetic Resonance in Medicine 2001;46(3):548–554.
- 10. Cao J, Song HK, Yang H, Castillo V, Chen J, Clendenin C, Rosen M, Zhou R, Pickup S. Respiratory motion mitigation and repeatability of two diffusion-weighted MRI methods applied to a murine model of spontaneous pancreatic cancer. Tomography 2021;7(1):66–79.
- 11. Zaitsev M, Maclaren J, Herbst M. Motion artifacts in MRI: a complex problem with many partial solutions. Journal of Magnetic Resonance Imaging 2015;42(4):887–901.
- 12. Neil JJ, Bretthorst GL. On the use of Bayesian probability theory for analysis of exponential decay data: an example taken from intravoxel incoherent motion experiments. Magnetic Resonance in Medicine 1993;29(5):642–647.
- 13. Akçakaya M, Moeller S, Weingärtner S, Uğurbil K. Scan-specific robust artificial-neural-networks for k-space interpolation (RAKI) reconstruction: database-free deep learning for fast imaging. Magnetic Resonance in Medicine 2019;81(1):439–453.
- 14. Lee J-H, Kang J, Oh S-H, Ye DH. Multi-domain Neumann network with sensitivity maps for parallel MRI reconstruction. Sensors 2022;22(10):3943.
- 15. Wang S, Cheng H, Ying L, Xiao T, Ke Z, Zheng H, Liang D. DeepcomplexMRI: exploiting deep residual network for fast parallel MR imaging with complex convolution. Magnetic Resonance Imaging 2020;68:136–147.
- 16. Jin KH, McCann MT, Froustey E, Unser M. Deep convolutional neural network for inverse problems in imaging. IEEE Transactions on Image Processing 2017;26(9):4509–4522.
- 17. Lee J, Kim H, Chung H, Ye JC. Deep learning fast MRI using channel attention in magnitude domain. 2020. IEEE. p 917–920.
- 18. Quan TM, Nguyen-Duc T, Jeong W-K. Compressed sensing MRI reconstruction using a generative adversarial network with a cyclic loss. IEEE Transactions on Medical Imaging 2018;37(6):1488–1497.
- 19. Wang S, Su Z, Ying L, Peng X, Zhu S, Liang F, Feng D, Liang D. Accelerating magnetic resonance imaging via deep learning. 2016. IEEE. p 514–517.
- 20. Gibbons EK, Hodgson KK, Chaudhari AS, Richards LG, Majersik JJ, Adluru G, DiBella EV. Simultaneous NODDI and GFA parameter map generation from subsampled q-space imaging using deep learning. Magnetic Resonance in Medicine 2019;81(4):2399–2411.
- 21. Golkov V, Dosovitskiy A, Sperl JI, Menzel MI, Czisch M, Sämann P, Brox T, Cremers D. Q-space deep learning: twelve-fold shorter and model-free diffusion MRI scans. IEEE Transactions on Medical Imaging 2016;35(5):1344–1351.
- 22. Du T, Zhang H, Li Y, Pickup S, Rosen M, Zhou R, Song HK, Fan Y. Adaptive convolutional neural networks for accelerating magnetic resonance imaging via k-space data interpolation. Medical Image Analysis 2021;72:102098.
- 23. Eo T, Jun Y, Kim T, Jang J, Lee HJ, Hwang D. KIKI-net: cross-domain convolutional neural networks for reconstructing undersampled magnetic resonance images. Magnetic Resonance in Medicine 2018;80(5):2188–2201.
- 24. Souza R, Lebel RM, Frayne R. A hybrid, dual domain, cascade of convolutional neural networks for magnetic resonance image reconstruction. 2019. PMLR. p 437–446.
- 25. Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, Dehghani M, Minderer M, Heigold G, Gelly S. An image is worth 16x16 words: transformers for image recognition at scale. arXiv preprint arXiv:2010.11929, 2020.
- 26. Srinivas A, Lin T-Y, Parmar N, Shlens J, Abbeel P, Vaswani A. Bottleneck transformers for visual recognition. 2021. p 16519–16529.
- 27. de Figueiredo EH, Borgonovi AF, Doring TM. Basic concepts of MR imaging, diffusion MR imaging, and diffusion tensor imaging. Magnetic Resonance Imaging Clinics 2011;19(1):1–22.
- 28. Patlak CS, Hospod FE, Trowbridge SD, Newman GC. Diffusion of radiotracers in normal and ischemic brain slices. Journal of Cerebral Blood Flow & Metabolism 1998;18(7):776–802.
- 29. Li Y, Fan Y. DeepSEED: 3D squeeze-and-excitation encoder-decoder convolutional neural networks for pulmonary nodule detection. 2020. IEEE. p 1866–1869.
- 30. Ronneberger O, Fischer P, Brox T. U-net: convolutional networks for biomedical image segmentation. 2015. Springer. p 234–241.
- 31. Li Y, Li H, Fan Y. ACEnet: anatomical context-encoding network for neuroanatomy segmentation. Medical Image Analysis 2021;70:101991.
- 32. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I. Attention is all you need. Advances in Neural Information Processing Systems 2017;30.
- 33. Lustig M, Donoho D, Pauly JM. Sparse MRI: the application of compressed sensing for rapid MR imaging. Magnetic Resonance in Medicine 2007;58(6):1182–1195.
- 34. Arshad M, Qureshi M, Inam O, Omer H. Transfer learning in deep neural network based under-sampled MR image reconstruction. Magnetic Resonance Imaging 2021;76:96–107.
- 35. Sandilya M, Nirmala S. Compressed sensing trends in magnetic resonance imaging. Engineering Science and Technology, an International Journal 2017;20(4):1342–1352.
- 36. Pipe JG. Motion correction with PROPELLER MRI: application to head motion and free-breathing cardiac imaging. Magnetic Resonance in Medicine 1999;42(5):963–969.