2025 Aug 14;95(1):138–156. doi: 10.1002/mrm.70033

Comparative evaluation of supervised and unsupervised deep learning strategies for denoising hyperpolarized 129Xe lung MRI

Abdullah S Bdaiwi 1, Matthew M Willmering 1, Riaz Hussain 1, Erik Hysinger 1,2, Jason C Woods 1,2,3, Laura L Walkup 1,2,3,4, Zackary I Cleveland 1,2,3,4
PMCID: PMC12370284  NIHMSID: NIHMS2100809  PMID: 40810302

Abstract

Purpose

Reduced signal‐to‐noise ratio (SNR) in hyperpolarized 129Xe MR images can affect accurate quantification for research and diagnostic evaluations. Thus, this study explores the application of supervised deep learning (DL) denoising approaches, traditional (Trad) and Noise2Noise (N2N), and the unsupervised Noise2Void (N2V) approach to 129Xe MR imaging.

Methods

The DL denoising frameworks were trained and tested on 952 129Xe MRI data sets (421 ventilation, 125 diffusion‐weighted, and 406 gas‐exchange acquisitions) from healthy subjects and participants with cardiopulmonary conditions and compared with the block matching 3D denoising technique. Evaluation involved mean signal, noise standard deviation (SD), SNR, and sharpness. Ventilation defect percentage (VDP), apparent diffusion coefficient (ADC), membrane uptake, red blood cell (RBC) transfer, and RBC:Membrane were also evaluated for ventilation, diffusion, and gas‐exchange images, respectively.

Results

Denoising methods significantly reduced noise SDs and enhanced SNR (p < 0.05) across all imaging types. The traditional ventilation model (Tradvent) improved sharpness in ventilation images but underestimated VDP (bias = −1.37%) relative to raw images, whereas N2Nvent overestimated VDP (bias = +1.88%). Block matching 3D and N2Vvent showed minimal VDP bias (≤ 0.35%). Denoising significantly reduced ADC mean and SD (p < 0.05, bias ≤ −0.63 × 10⁻²). Both Tradvent and N2Nvent increased mean membrane and RBC (p < 0.001) with no change in RBC:Membrane. Denoising also reduced SDs of all gas‐exchange metrics (p < 0.01).

Conclusions

Low SNR may impair the potential of 129Xe MRI for clinical diagnosis and lung function assessment. The supervised and unsupervised DL denoising methods evaluated here enhanced 129Xe image quality, offering promise for improved clinical interpretation and diagnosis.

Keywords: deep learning, denoise, hyperpolarized 129Xe, noise reduction, SNR enhancement

1. INTRODUCTION

Regional lung function and structure (i.e., ventilation, airspace microstructure, and gas‐exchange) can effectively be evaluated in both research and clinical settings using hyperpolarized xenon‐129 (129Xe) MRI. 1 , 2 However, low signal‐to‐noise ratio (SNR) can affect accurate quantification and clinical interpretation of these images. Factors that contribute to low SNR in hyperpolarized 129Xe imaging include modest polarization levels—often due to long delays between polarization and imaging, end‐of‐life optical pumping cells, or suboptimal laser performance 3 , 4 —and the use of large coils to accommodate the entire chest volume, which reduces the filling factor. Additional factors include limited patient compliance (e.g., incomplete inhalation of 129Xe) and the need to shorten scan times due to breath‐hold constraints. Even when polarization is high (> 30%), 129Xe signal is inherently limited by the need to conserve the finite polarization of 129Xe, which decays over time due to radiofrequency pulsing and T1 relaxation. 5 Previous studies have shown that 129Xe MRI can be biased when SNR is low, particularly affecting the precision of ventilation defect percentage (VDP) 6 and apparent diffusion coefficient (ADC) measurements. 7 , 8 Consequently, postprocessing techniques to enhance SNR could be beneficial for accurate quantification of 129Xe images. 6 , 7

To mitigate the effects of low SNR in MRI, various denoising techniques have been developed to improve image quality and support more reliable quantification. Classical approaches such as block‐matching 3D (BM3D) filtering, 9 and its higher‐dimensional variants BM4D and BM5D, exploit spatial redundancy across image patches and have been successfully applied in MRI. 10 More advanced techniques include higher‐order singular value decomposition, initially developed for diffusion‐weighted brain imaging 11 , 12 and later adapted for hyperpolarized 13C 13 and 129Xe 14 MRI. Tensor Marchenko‐Pastur principal component analysis, a computationally efficient method with minimal input requirements, has also been proposed specifically for x‐nuclei imaging. 12 However, many of these methods were designed for four‐dimensional data and may be less effective for 129Xe images with lower dimensionality. Some also require long processing times, parameter tuning, or make assumptions that limit their practicality in lung imaging.

In recent years, image denoising has advanced significantly with the rise of deep learning (DL) methods, many of which have been applied to MRI for enhancing low‐signal images. 15 , 16 , 17 These range from basic convolutional neural networks (CNNs) 18 to more complex architectures like encoder–decoders, 19 U‐Net, 20 and generative adversarial networks. 21 Most of these are supervised models requiring paired noisy and clean images—often unavailable for inherently low‐SNR data. Their generalizability also depends on consistent image characteristics, prompting interest in unsupervised approaches. 22 , 23 Noise2Noise (N2N) reduces reliance on clean targets by using multiple noisy realizations, 24 , 25 although it still requires repeated acquisitions. More recently, Noise2Void (N2V) further relaxes these requirements by training on single noisy images. 26

In this work, we evaluate supervised and unsupervised deep learning–based denoising methods for 129Xe ventilation, diffusion‐weighted, and gas‐exchange MRI. To our knowledge, this is the first study to directly compare these approaches across multiple 129Xe imaging types. Given the challenges of low SNR in 129Xe imaging, these methods offer a practical solution to improve the accuracy of key VDP, ADC, and gas‐exchange metrics. We also compare these approaches to classical BM3D 9 filtering to assess relative performance. These methods offer important advantages by reducing or eliminating the need for high‐SNR reference images—an important step toward robust and feasible denoising for hyperpolarized gas MRI.

2. METHODS

2.1. Study population

Data were retrospectively collected (December 2014 to November 2023) from protocols approved by the institutional review board at Cincinnati Children's Hospital and under Food and Drug Administration IND‐123577. Written informed consent (from adults or parents) and age‐appropriate assent (from pediatric subjects) were obtained for all subjects undergoing 129Xe MRI. A total of 952 129Xe data sets were used: 421 ventilation, 125 diffusion, and 406 gas‐exchange. These data sets have been published previously in related studies (ventilation studies, 27 , 28 , 29 , 30 , 31 , 32 , 33 , 34 , 35 , 36 , 37 diffusion studies, 7 , 14 , 32 , 38 , 39 and gas‐exchange studies 40 , 41 ). Subjects were categorized into nine clinical groups: healthy controls, individuals with asthma, adult survivors of childhood cancer treated with bleomycin, bone marrow transplant recipients, patients with bronchiolitis obliterans syndrome, cases of bronchopulmonary dysplasia (BPD), individuals with cystic fibrosis, subjects with fibrotic lung diseases, which included idiopathic pulmonary fibrosis and other fibrosing interstitial lung diseases, and patients with lymphangioleiomyomatosis (LAM). No repeat visits were considered in the analysis (first visit used only). Subject demographics are detailed in Table 1. Based on the inclusion and exclusion criteria of the previously mentioned protocols, all patients were 5 years of age or older and capable of completing a 16‐s breath‐hold, whereas exclusions covered recent respiratory illness, SpO2 ≤ 95%, pregnancy, and standard MRI contraindications.

TABLE 1.

Subjects' demographics and MRI acquisition parameters.

Subject demographics

| Image type | Disease | N | Age (years) | Sex |
|---|---|---|---|---|
| Ventilation | Control | 60 | 15.4 ± 7.6 | 23 F, 37 M |
| | Asthma | 25 | 13.3 ± 8.3 | 13 F, 12 M |
| | BLEO | 14 | 31.0 ± 8.0 | 3 F, 11 M |
| | BMT | 64 | 12.0 ± 5.0 | 32 F, 32 M |
| | BOS | 7 | 16.2 ± 4.3 | 3 F, 4 M |
| | BPD | 19 | 12.5 ± 8.5 | 10 F, 9 M |
| | CF | 90 | 15.7 ± 7.5 | 44 F, 46 M |
| | FLD | 92 | 35.2 ± 21.5 | 54 F, 38 M |
| | LAM | 50 | 47.3 ± 10.6 | 49 F, 1 M |
| | Total | 421 | | 230 F, 191 M |
| Diffusion | Control | 38 | 18.7 ± 10.1 | 17 F, 21 M |
| | CF | 40 | 15.0 ± 7.7 | 19 F, 21 M |
| | LAM | 27 | 15.0 ± 12.6 | 27 F, 0 M |
| | Asthma | 20 | 13.0 ± 8.1 | 12 F, 8 M |
| | Total | 125 | | 75 F, 50 M |
| Gas‐exchange | Control | 106 | 34.1 ± 18.8 | 48 F, 58 M |
| | BLEO | 15 | 31.2 ± 7.3 | 3 F, 12 M |
| | BMT | 47 | 14.3 ± 5.6 | 15 F, 32 M |
| | BPD | 23 | 11.9 ± 6.8 | 12 F, 11 M |
| | CF | 115 | 14.8 ± 5.9 | 58 F, 57 M |
| | FLD | 12 | 41.9 ± 29.3 | 4 F, 8 M |
| | LAM | 31 | 46.6 ± 9.4 | 31 F, 0 M |
| | Others ᵇ | 57 | 19.6 ± 13.1 | 26 F, 31 M |
| | Total | 406 | | 197 F, 209 M |

Acquisition parameters

| Parameter | Ventilation | Diffusion | Gas‐exchange |
|---|---|---|---|
| Sequence | 2D gradient echo | 2D gradient echo | 3D radial |
| Voxel size (mm³) | 3–4 × 3–4 × 15 | 5 × 5 × 15 | 5.8 × 5.8 × 5.8 |
| FOV | 275–325 (mm²) | 250–350 (mm²) | 325³ (mm³) |
| TE/TR (ms) | 3.75/7.73 (Cartesian); 1.52/12.6 (spiral) | 2.9/6.1 (Cartesian) | ~0.46 (TE = 90ᵃ)/5 |
| Flip angle (°) | 6–12 (Cartesian); 10–30 (spiral) | 5–15 | 20 (dissolved); 0.5 (gas) |
| BW (Hz/pixel) | ~270 (Cartesian); ~160 (spiral) | 264.8 | ~471 |
| b‐values (s/cm²) | | 5 values, 0–25 (increment of 6.25) | |

Abbreviations: BLEO, adult survivors of childhood cancer who were treated with bleomycin; BMT, bone‐marrow transplantation; BOS, bronchiolitis‐obliterans syndrome; BPD, bronchopulmonary dysplasia; BW, bandwidth; CF, cystic fibrosis; F, female; FLD, fibrotic lung diseases; FOV, field of view; LAM, lymphangioleiomyomatosis; M, male; TE, echo time; TR, repetition time.

ᵃ Echo time needed to achieve a 90° phase separation for 1‐point Dixon imaging.

ᵇ Other populations include systemic juvenile idiopathic arthritis (n = 3), Down syndrome (n = 4), posterior urethral valve (n = 1), post‐COVID (n = 2), common variable immunodeficiency with interstitial lung diseases (n = 3), pulmonary alveolar proteinosis (n = 4), sarcoidosis/pneumatocele (n = 1), pulmonary arterial hypertension (n = 2), neuroendocrine cell hyperplasia of infancy (n = 5), pre‐BMT (n = 8), asthma (n = 3), dyskeratosis congenita (n = 2), Ruxolitinib treatment (n = 2), pre–hematopoietic stem cell transplant (n = 8), and mixed complicated cases (n = 9).

2.2. 129Xe polarization and delivery

Isotopically enriched xenon (85% 129Xe; Linde Elec & Specialty Gases, Alpha, NJ, USA) was polarized to 10%–40% using a Model 9810 or 9820 from Polarean Imaging (Durham, NC, USA) and dispensed into Tedlar bags (Jensen Inert Products, Coral Springs, FL, USA). The total xenon dose administered was one‐sixth of the predicted total lung capacity based on sex and height, up to 1 L maximum. 129Xe images were then obtained during a breath‐hold lasting up to 16 s. Throughout the procedure, a medical professional was present to monitor heart rate and blood oxygenation.

2.3. MRI data acquisition

129Xe imaging was performed using either a 3T Philips Achieva or Ingenia scanner (Philips Healthcare, Best, Netherlands). The imaging was conducted with either a custom‐built dual‐loop, single‐channel 129Xe transmit‐receive coil or a flexible transmit/receive 129Xe chest coil (Clinical MR Solutions, Brookfield, WI, USA). Ventilation images were acquired using a two‐dimensional (2D) multislice gradient‐echo sequence, in either an axial or coronal orientation, with Cartesian or spiral k‐space sampling. 5 , 33 For diffusion images, a 2D axial, multislice, and multiple b‐value gradient‐echo sequence was used. 38 , 39 Gas‐exchange images were obtained using a 1‐point Dixon, three‐dimensional (3D) radial acquisition. 41 , 42 Detailed acquisition parameters for all imaging types can be found in Table 1.

Ventilation images acquired using Cartesian sequences were reconstructed using the standard Philips on‐line reconstruction pipeline. However, for ventilation images acquired with spiral sequences and for diffusion‐weighted images, complex k‐space data were exported after applying a DC offset correction and reconstructed offline. Similarly, gas‐exchange imaging data were exported after DC offset correction, gridded onto a Cartesian matrix using iterative density compensation, and reconstructed offline. Membrane and red blood cells (RBC) images were then extracted from the dissolved‐phase images using 1‐point Dixon separation. 41 , 42 , 43 Additionally, dissolved‐phase images were reconstructed at a sharper reconstruction kernel relative to the standard reconstruction pipeline, as a proof of concept to demonstrate that higher‐resolution dissolved images can be obtained with acceptable SNR. These reconstructions were performed offline using our recently published open‐source package, XIPline (129Xe Image Processing Pipeline), 43 developed in MATLAB (MathWorks, Natick, MA, USA).

For the ventilation and gas‐exchange images, a 1H anatomical scan was performed during a separate breath‐hold using a volume‐matched air bag, with the same imaging sequence to segment the thoracic cavity.

2.4. Training and validation

All networks were implemented on the TensorFlow platform, with all computations carried out using a 16‐GB NVIDIA‐RTX A4000 graphics card. Each network was trained for 200 epochs. Our implementation was based on the CSBDeep framework, using a U‐Net architecture with batch normalization added before each activation function. Input data consisted of 64 × 64 patches for 2D images (ventilation and diffusion) and 64 × 64 × 8 patches for 3D images (gas‐exchange: gas and dissolved). All image data were normalized to a [0, 1] intensity range before data‐set partitioning. The data sets were first split by subject and disease type into 85% for training/validation and 15% for testing. The training/validation subset was further divided, with 85% used for training and 15% for validation. The loss function used was mean absolute error for the Traditional approach and mean squared error for the N2N and N2V models. Optimization was performed using the Adam optimizer with a learning rate of 0.0004 and batch sizes of 8, 16, and 64 for the Traditional, N2N, and N2V models, respectively. Separate models were trained for each imaging type, including 2D Traditional ventilation model (Tradvent), 2D N2N ventilation model (N2Nvent), 2D N2V ventilation model (N2Vvent), 2D N2N diffusion model (N2Ndiff), 2D N2V diffusion model (N2Vdiff), 3D N2V gas model (N2Vgas), and 3D N2V dissolved model (N2Vdiss). A summary of the models, workflow, and training configurations is provided in Table S1 and Figure S1. The Python scripts used for training the models are available at https://github.com/aboodbdaiwi/XeDL_Denoising.
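The two-stage, subject-level partitioning described above (85%/15% for training-validation versus testing, then 85%/15% again within the training-validation subset) can be sketched as follows. This is a minimal illustration rather than the authors' released code: the function name `stratified_split`, the subject IDs, the group labels, and the rounding rule are all assumptions made for the example.

```python
import random
from collections import defaultdict

def stratified_split(subjects, holdout_fraction, seed=0):
    """Split (subject_id, group) pairs into (kept, held_out), group by group.

    Splitting on subject IDs keeps every slice of a subject in one partition.
    """
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for subject_id, group in subjects:
        by_group[group].append(subject_id)
    kept, held_out = [], []
    for group in sorted(by_group):
        ids = sorted(by_group[group])
        rng.shuffle(ids)
        n_held = round(len(ids) * holdout_fraction)  # rounding rule is an assumption
        held_out += [(s, group) for s in ids[:n_held]]
        kept += [(s, group) for s in ids[n_held:]]
    return kept, held_out

# Hypothetical cohort: 80 subjects in two equally sized groups.
subjects = [(f"sub{i:03d}", "CF" if i % 2 else "Control") for i in range(80)]
train_val, test = stratified_split(subjects, 0.15, seed=1)
train, val = stratified_split(train_val, 0.15, seed=2)
print(len(train), len(val), len(test))
```

Splitting before patch extraction, as here, avoids patches from the same subject appearing in both training and test partitions.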

In traditional supervised training, 26 a CNN (U‐Net) is trained to map a noisy input image to a clean output image. 129Xe ventilation images, which typically exhibit the highest SNR compared with diffusion and gas‐exchange images, were selected as the clean data (Slice SNR ≥ 20) for training the traditional CNN model. To create corresponding noisy data sets, we first generated complex k‐space data from the clean images by applying the inverse fast Fourier transform. Gaussian noise was then added to the k‐space data, resulting in three levels of noisy images with approximate SNRs of 15, 10, and 5. A total of 2819 high‐SNR slices from 356 data sets were used to train and validate the model, with 2396 slices allocated for training and 423 for validation. For testing, the model was evaluated on 65 independent ventilation data sets consisting of 1036 slices.
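The noisy-image synthesis described above (transform a clean image to k-space, add Gaussian noise, transform back) can be illustrated with the sketch below. It is not the authors' implementation. The Fourier transform direction and normalization are conventions chosen for the example, the naive pure-Python DFT stands in for an FFT library call, and the function names, the toy 8 × 8 image, and the mask are hypothetical. The noise level is set so that the per-channel image-domain noise SD equals the mean masked signal divided by the target SNR, which only approximates an SNR measured from magnitude-image background.

```python
import cmath
import math
import random

def dft2(a, inverse=False):
    """Naive 2D DFT of a list of lists; the inverse is scaled by 1/(rows*cols)."""
    rows, cols = len(a), len(a[0])
    sign = 1.0 if inverse else -1.0
    out = []
    for u in range(rows):
        out_row = []
        for v in range(cols):
            total = 0j
            for r in range(rows):
                for c in range(cols):
                    angle = sign * 2.0 * math.pi * (u * r / rows + v * c / cols)
                    total += a[r][c] * cmath.exp(1j * angle)
            out_row.append(total / (rows * cols) if inverse else total)
        out.append(out_row)
    return out

def add_kspace_noise(image, mask, target_snr, rng):
    """Add complex Gaussian noise in k-space; return (noisy complex image, image-domain sigma)."""
    rows, cols = len(image), len(image[0])
    signal = [image[r][c] for r in range(rows) for c in range(cols) if mask[r][c]]
    sigma_img = (sum(signal) / len(signal)) / target_snr
    # With this DFT scaling, k-space sigma maps to image sigma / sqrt(rows*cols).
    sigma_k = sigma_img * math.sqrt(rows * cols)
    kspace = dft2(image)
    noisy_k = [[val + complex(rng.gauss(0.0, sigma_k), rng.gauss(0.0, sigma_k))
                for val in row] for row in kspace]
    return dft2(noisy_k, inverse=True), sigma_img

rng = random.Random(0)
# Hypothetical "clean" image: a bright 4 x 4 square on an empty background.
clean = [[1.0 if 2 <= r < 6 and 2 <= c < 6 else 0.0 for c in range(8)] for r in range(8)]
mask = [[value > 0 for value in row] for row in clean]
noisy_complex, sigma_img = add_kspace_noise(clean, mask, target_snr=10, rng=rng)
noisy_magnitude = [[abs(value) for value in row] for row in noisy_complex]
```

Calling `add_kspace_noise` repeatedly with different target SNR values (for example 15, 10, and 5) would yield the several noise levels paired with each clean slice.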

N2N 24 training eliminates the need for clean ground truth images by using pairs of noisy images with different noise realizations from the same distribution. For ventilation images, similar procedures, as described previously, were used to generate multiple noisy images. However, in this case, five different noise levels were produced, yielding ventilation images with approximate SNRs ranging from 5 to 20. The same number of ventilation data sets used in the Traditional approach was also used for training and validation. For diffusion images, images across the diffusion‐encoding dimension (i.e., the b‐value images) are used as the noisy images. A total of 526 slices from 105 diffusion data sets were used, with 421 slices for training and 105 slices for validation. The model was evaluated on 20 independent diffusion data sets consisting of 108 test slices.

Noise2Void training further eliminates the need for paired images altogether, allowing training from single noisy images. Because this method does not require clean or multiple noisy images, all 129Xe images, including ventilation (356 data sets, 4007 2D slices), diffusion (105 data sets, 526 2D slices), and gas‐exchange (346 3D data sets), were used for training and validation regardless of their SNR levels, without adding additional noise. The model was then evaluated on independent test sets consisting of 65 ventilation data sets (1036 slices), 20 diffusion data sets (108 slices), and 60 gas‐exchange data sets.
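The idea that lets Noise2Void train on single noisy images is blind-spot masking: a few pixels in each patch are overwritten with values from nearby pixels, and the loss is evaluated only at those positions, so the network cannot learn the identity mapping. The sketch below illustrates that principle only. The function names, the neighborhood radius, and the number of masked pixels are assumptions for illustration and do not describe the configuration used in this study or the exact sampling scheme of any N2V library.

```python
import random

def n2v_mask_patch(patch, n_masked, rng, radius=2):
    """Return (masked_patch, coords); each selected pixel takes a random neighbor's value."""
    rows, cols = len(patch), len(patch[0])
    masked = [row[:] for row in patch]
    all_coords = [(r, c) for r in range(rows) for c in range(cols)]
    coords = rng.sample(all_coords, n_masked)
    for r, c in coords:
        while True:  # redraw until the neighbor is inside the patch and not the pixel itself
            dr, dc = rng.randint(-radius, radius), rng.randint(-radius, radius)
            rr, cc = r + dr, c + dc
            if (dr, dc) != (0, 0) and 0 <= rr < rows and 0 <= cc < cols:
                break
        masked[r][c] = patch[rr][cc]
    return masked, coords

def masked_mse(prediction, target, coords):
    """Mean squared error evaluated only at the masked (blind-spot) positions."""
    return sum((prediction[r][c] - target[r][c]) ** 2 for r, c in coords) / len(coords)

# Hypothetical 8 x 8 patch with unique values so that replacements are visible.
patch = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
masked, coords = n2v_mask_patch(patch, n_masked=6, rng=random.Random(0))
# During training, the network sees `masked` and is scored against `patch` at `coords` only.
```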

2.5. Testing and image analysis

The testing data set consisted of images from 145 129Xe scans, including 65 ventilation (1036 slices), 20 diffusion (108 slices), and 60 gas‐exchange data sets (3D). Preprocessing steps involved image resizing (128 × 128 for ventilation, 64 × 64 for diffusion, and 112 × 112 × 112 for gas‐exchange) and normalization to the [0, 1] range. The denoising of magnitude images was then performed using the trained models. Due to the limited availability of clean (high SNR) diffusion and gas‐exchange data and to test generalizability/transferability, the 2D Tradvent model was applied to all image types, including ventilation, diffusion, and gas‐exchange. Additionally, the 2D N2Nvent model was used to denoise gas‐exchange images (gas and dissolved). The performance of these denoising frameworks was compared with BM3D, 9 which relies on an enhanced sparse representation in a transform domain. The noise standard deviation (SD) for the BM3D filter was set at 5%, balancing SNR enhancement and image blurring.

A binary mask of the lung parenchyma, excluding large airways, was generated either manually or using an in‐house pretrained deep learning model, 43 with additional manual modifications as necessary. For diffusion images, masks were created using signal thresholding on the first b‐value image to restrict analysis to ventilated lung regions excluding large airways, with additional manual editing if necessary. SNR was calculated for all raw and denoised images as Slung/σBG, where Slung represents the mean signal amplitude within the lung mask, and σBG is the SD of background noise. Image sharpness was evaluated using the gradient of the magnitude of both raw and denoised images, quantified with the Sobel operator (x).
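The two image-quality metrics defined above can be written compactly. The sketch below computes SNR as the mean lung signal divided by the SD of background noise, and sharpness as the mean Sobel gradient magnitude. It is a simplified 2D illustration: the function names are hypothetical, the population SD is used for the background, and the gradient is averaged over all interior pixels, whereas the region actually used in the study is not specified in the text.

```python
import math
import statistics

def snr(image, lung_mask, background_mask):
    """Mean signal inside the lung mask divided by the SD of background pixels."""
    lung = [v for row, m_row in zip(image, lung_mask) for v, m in zip(row, m_row) if m]
    background = [v for row, m_row in zip(image, background_mask) for v, m in zip(row, m_row) if m]
    return statistics.fmean(lung) / statistics.pstdev(background)

def mean_sobel_magnitude(image):
    """Average Sobel gradient magnitude over interior pixels of a 2D image."""
    rows, cols = len(image), len(image[0])
    total, count = 0.0, 0
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = (image[r - 1][c + 1] + 2 * image[r][c + 1] + image[r + 1][c + 1]) \
               - (image[r - 1][c - 1] + 2 * image[r][c - 1] + image[r + 1][c - 1])
            gy = (image[r + 1][c - 1] + 2 * image[r + 1][c] + image[r + 1][c + 1]) \
               - (image[r - 1][c - 1] + 2 * image[r - 1][c] + image[r - 1][c + 1])
            total += math.hypot(gx, gy)
            count += 1
    return total / count

# Hypothetical toy data: one background row and one "lung" row.
toy = [[1.0, -1.0, 1.0, -1.0], [10.0, 10.0, 10.0, 10.0]]
lung_mask = [[False] * 4, [True] * 4]
background_mask = [[True] * 4, [False] * 4]
toy_snr = snr(toy, lung_mask, background_mask)

# A vertical step edge: columns 0-1 are dark, columns 2-4 are bright.
step_edge = [[0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
edge_sharpness = mean_sobel_magnitude(step_edge)
```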

For ventilation images, after denoising (if performed), VDP was calculated using a thresholding method, 27 , 32 , 43 , 44 as the proportion of lung volume with signal intensity below 60% of the whole lung mean. ADC maps for diffusion images were generated from both raw and denoised images using a log‐linear fit. 7 , 43 Membrane and RBC images were extracted from the raw gas‐exchange images using a 1‐point Dixon analysis and then denoised with all methods. Maps of Membrane‐uptake, RBC‐transfer, and RBC:Membrane were then generated based on a healthy reference distribution using the generalized linear binning method. 41 , 45 Both diffusion and gas‐exchange images were scaled to the mean of the raw images following denoising.
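Both quantitative read-outs described above reduce to short calculations. VDP is the percentage of lung voxels whose signal falls below 60% of the whole-lung mean, and ADC follows from a log-linear fit of the mono-exponential model S(b) = S0 · exp(−b · ADC), that is, the negative slope of ln S versus b. The sketch below assumes equal-volume voxels (so a voxel fraction equals a volume fraction) and an unweighted least-squares fit. The function names and the sample values are hypothetical.

```python
import math

def vdp(lung_signal, threshold_fraction=0.6):
    """Percentage of lung voxels with signal below threshold_fraction * mean lung signal."""
    cutoff = threshold_fraction * sum(lung_signal) / len(lung_signal)
    defects = sum(1 for s in lung_signal if s < cutoff)
    return 100.0 * defects / len(lung_signal)

def adc_log_linear(b_values, signals):
    """ADC as the negative least-squares slope of ln(signal) versus b-value."""
    logs = [math.log(s) for s in signals]
    n = len(b_values)
    b_mean = sum(b_values) / n
    log_mean = sum(logs) / n
    numerator = sum((b - b_mean) * (l - log_mean) for b, l in zip(b_values, logs))
    denominator = sum((b - b_mean) ** 2 for b in b_values)
    return -numerator / denominator

# Hypothetical lung voxels: four ventilated voxels and one defect.
example_vdp = vdp([1.0, 1.0, 1.0, 1.0, 0.1])

# Five b-values from 0 to 25 s/cm2 in steps of 6.25, as in Table 1.
b_values = [0.0, 6.25, 12.5, 18.75, 25.0]
true_adc = 0.03  # cm2/s, chosen for illustration
signals = [math.exp(-b * true_adc) for b in b_values]
fitted_adc = adc_log_linear(b_values, signals)
```

Because the logarithm amplifies noise in low-signal voxels, noisy high-b-value images can distort the fitted slope, which is one way denoising can shift both the mean and the SD of ADC.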

2.6. Statistical analysis

Mean lung signal, noise SD, SNR, and image sharpness (x) were compared across raw and denoised images. Pairwise comparisons across methods were performed using the Friedman test with Bonferroni correction to adjust the critical values for multiple comparisons. The same statistical approach was used to compare VDPs (ventilation), mean and SD of ADC (diffusion), and mean and SD of gas‐exchange maps (Membrane‐uptake, RBC‐transfer, and RBC:Membrane).
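For readers unfamiliar with the test, the Friedman statistic ranks the methods within each subject and asks whether the rank sums differ more than chance would allow. The sketch below computes the statistic without a tie correction, plus a Bonferroni-adjusted significance level for all pairwise comparisons. It is illustrative only: the function names and the data are hypothetical, the specific post hoc procedure used in the study is not detailed in the text, and converting the statistic to a p-value (chi-square with k − 1 degrees of freedom) is left to a statistics package.

```python
def friedman_statistic(table):
    """Friedman chi-square statistic; table[i][j] is subject i under method j (no ties assumed)."""
    n, k = len(table), len(table[0])
    rank_sums = [0.0] * k
    for row in table:
        order = sorted(range(k), key=lambda j: row[j])
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)

def bonferroni_alpha(alpha, n_methods):
    """Per-comparison significance level when all method pairs are compared."""
    n_pairs = n_methods * (n_methods - 1) // 2
    return alpha / n_pairs

# Hypothetical SNR-like values: 4 subjects (rows) under 3 methods (columns).
table = [
    [1.0, 2.0, 3.0],
    [1.5, 2.5, 3.5],
    [0.9, 1.9, 2.9],
    [1.1, 2.2, 3.3],
]
q = friedman_statistic(table)
# Raw plus four denoising methods gives 5 conditions, hence 10 pairwise comparisons.
alpha_per_pair = bonferroni_alpha(0.05, n_methods=5)
```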

3. RESULTS

3.1. Denoising ventilation images

The evaluated denoising methods significantly improved 129Xe ventilation image quality compared with raw images. As shown in Figure 1A for a bone marrow transplant patient, noise SD was evidently reduced with corresponding increases in SNR. Additionally, DL‐based models offered faster processing (0.4 s per slice) than BM3D (0.6 s). VDP was notably altered by denoising: Tradvent decreased VDP by about 15% (yellow arrow), whereas N2Nvent increased it by about 10% (blue arrow), compared with the raw images. A similar example for a healthy subject is provided in Figure S2.

FIGURE 1.


Denoising 129Xe ventilation images. (A) 129Xe ventilation image for a 15‐year‐old bone‐marrow transplantation (BMT) patient before and after denoising. Signal‐to‐noise ratio (SNR) increased significantly in denoised images using block matching and 3D filtering (BM3D) and deep learning models compared with the raw images. Ventilation defect percentage (VDP) decreased drastically in Tradvent (yellow arrow) and increased in N2Nvent (blue arrow) compared with the raw and the other denoising methods. (B) Overall, the SNR showed significant improvement in the denoised images relative to the raw images across all testing data sets. (C) The VDP decreased significantly in Tradvent and increased significantly in N2Nvent compared with the raw and the other denoising methods. VDP maps (red) show the proportion of lung volume with signal intensity below 60% of the whole lung mean. Raw, original image; N2N, noise2noise; N2V, noise2void; Trad, traditional.

Ventilation testing (n = 65) results are summarized in Table 2. The Tradvent model significantly increased mean lung signal compared with both raw and other denoised images (p < 0.001; Figure S3A). All denoising methods significantly reduced noise SD relative to raw images (p < 0.001; Figure S3B), resulting in noticeably improved SNR (p < 0.001; Figure 1B). Sharpness (mean gradient magnitude, x) significantly increased in Tradvent‐denoised and N2Vvent‐denoised images (p < 0.001), whereas N2Nvent resulted in a sharpness decrease compared with raw and other denoised images (p < 0.001; Figure S4).

TABLE 2.

Image quantification parameters for raw 129Xe ventilation (n = 65) and diffusion‐weighted (n = 20) test images, as well as all applied denoising techniques.

| Image type | Parameter | Raw | BM3D | Trad | N2N | N2V | p‐Value |
|---|---|---|---|---|---|---|---|
| Ventilation (N = 65) | Lung signal | 0.466 (0.093) | 0.468 (0.095) | 0.503 (0.097) | 0.460 (0.096) | 0.469 (0.095) | < 0.001 |
| | Noise SD (×10⁻²) | 1.63 (5.7) | 1.00 (5.0) | 0.90 (4.4) | 0.92 (3.8) | 0.96 (5.2) | < 0.001 |
| | SNR | 35.6 (16.1) | 82.2 (50.2) | 86.3 (21.9) | 80.4 (25.7) | 105.7 (70.2) | < 0.001 |
| | Sharpness (x) | 1.62 (0.42) | 1.61 (0.45) | 1.66 (0.47) | 1.48 (0.38) | 1.70 (0.47) | < 0.001 |
| | VDP (%) | 9.3 (6.8) | 9.5 (7.0) | 8.7 (6.4) | 11.2 (7.7) | 9.7 (7.1) | < 0.001 |
| Diffusion (N = 20) | Lung signal for b0 | 0.331 (0.064) | 0.330 (0.066) | 0.341 (0.064) | 0.366 (0.065) | 0.336 (0.067) | < 0.001 |
| | Noise SD (×10⁻²) for b0 | 2.15 (0.9) | 0.81 (0.79) | 0.50 (0.44) | 0.48 (0.22) | 0.37 (0.42) | < 0.001 |
| | SNR for b0 | 19.3 (7.3) | 176.8 (143.8) | 112.3 (34.1) | 91.8 (22.0) | 172.4 (42.6) | < 0.001 |
| | Sharpness (x) (b0) | 1.37 (0.36) | 1.29 (0.34) | 1.18 (0.30) | 1.41 (0.29) | 1.24 (0.27) | < 0.001 |
| | ADC (mm²/s) | 3.15 (0.73) | 3.13 (0.72) | 2.79 (0.39) | 2.98 (0.66) | 3.09 (0.73) | < 0.001 |
| | SD ADC | 1.21 (0.29) | 1.09 (0.25) | 0.92 (0.25) | 0.70 (0.19) | 0.88 (0.22) | < 0.001 |

Note: Values are presented as mean (SD). The p‐values in this table represent the overall comparison across methods using the Friedman test, while the pairwise comparisons results are shown in the corresponding figures and/or text.

Abbreviations: ADC, apparent diffusion coefficient; BM3D, block‐matching and 3D filtering; b0, b‐value = 0 s/cm2; N2N, noise2noise; N2V, noise2void; Raw, original image; SD, standard deviation; SNR, signal‐to‐noise ratio; Trad, traditional.

To further assess the performance of the denoising methods, we applied them across multiple noise levels in a ventilation data set (Figure S5). The N2Nvent method outperformed all others, delivering the greatest improvement in image SNR, even at very high noise levels. Both Tradvent and N2Vvent also surpassed the performance of the BM3D method.

The VDP across all subjects (Figure 1C) significantly decreased when using the Tradvent method and significantly increased when using the N2Nvent method compared with the VDP obtained from raw images and other denoising methods. Mean VDP across all subjects was BM3D = 9.5 ± 7.0%, Tradvent = 8.7 ± 6.4%, N2Nvent = 11.2 ± 7.7%, N2Vvent = 9.7 ± 7.1%, and the raw images = 9.3 ± 6.8%. Additionally, Bland–Altman plots (Figure S6) of the VDPs derived from raw and denoised images revealed method‐specific biases: −0.13% for BM3D, 1.37% for Tradvent, −1.88% for N2Nvent, and −0.35% for N2Vvent, relative to the VDP calculated from the raw images.
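The Bland–Altman bias quoted above is the mean of the paired differences between two measurements of the same quantity, usually reported together with 95% limits of agreement. A minimal sketch follows. The function name and the VDP values are hypothetical, and the sign convention (comparison minus reference) is an assumption, since the direction of subtraction used for the reported biases is not stated in the text.

```python
import statistics

def bland_altman(reference, comparison):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Differences are taken as comparison minus reference (an assumed convention).
    """
    diffs = [c - r for r, c in zip(reference, comparison)]
    bias = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical per-subject VDP values (%), raw versus denoised.
raw_vdp = [10.0, 12.0, 8.0, 15.0]
denoised_vdp = [9.0, 11.0, 8.0, 14.0]
bias, lower, upper = bland_altman(raw_vdp, denoised_vdp)
```

Under this convention a negative bias means the denoised images yield a lower VDP than the raw images; reversing the subtraction flips the sign, so the convention should be stated whenever biases are compared across reports.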

3.2. Denoising diffusion images

The denoising techniques applied to 129Xe diffusion‐weighted images noticeably enhanced image quality relative to raw images. Figure 2 shows a representative slice from a LAM patient, including b = 0 and b = 25 s/cm2 images and the corresponding ADC maps across all methods. Denoised images exhibited substantially higher SNR than raw images. The 2D deep learning models also achieved faster processing, completing denoising across all b‐values in 1.6 s per slice versus 6 s with BM3D. Although mean ADC values were slightly reduced, ADC SD decreased, indicating less variability. Orange and red arrows highlight elevated ADC in cyst regions. A healthy participant example is shown in Figure S7.

FIGURE 2.


129Xe diffusion‐weighted images (lowest and highest b‐value images) for a 25‐year‐old patient with lymphangioleiomyomatosis (LAM) before and after denoising. The signal‐to‐noise ratio (SNR) showed a significant increase in the denoised images compared with the raw images. The mean and the standard deviation of apparent diffusion coefficient (ADC) were significantly reduced in the denoised images. Orange arrows indicate cyst regions in the LAM patient, highlighting elevated ADC values in these areas, as further emphasized by the red arrows. BM3D, block matching and 3D filtering; Raw, original image; N2N, noise2noise; N2V, noise2void.

Table 2 and Figure 3 summarize the denoising performance across all diffusion testing data sets (n = 20). For the first b‐value images (Figure S8A), the N2Ndiff method yielded significantly higher mean lung signal intensity than all other methods except Tradvent (p < 0.001), whereas Tradvent also showed a significantly higher mean signal than BM3D (p < 0.001). All denoising methods significantly reduced noise SD compared with raw images (p < 0.001; Figure S8B), resulting in noticeable SNR improvements (Figure 3A). Similar trends were observed for the final b‐value images (Figures 3B and S8C,D). Sharpness, x, was significantly reduced in all denoising methods except N2Ndiff (p < 0.05), as indicated in Table 2 and Figure S9.

FIGURE 3.


Signal‐to‐noise ratio (SNR) for 129Xe diffusion images (first and last b‐value images), along with mean and standard deviation (SD) of apparent diffusion coefficient (ADC) for the testing data across all denoising methods. (A) SNR demonstrated a significant improvement in the denoised images relative to the raw images. (B) A similar trend was observed in the last b‐value images. (C,D) The mean and the SD of ADC were significantly reduced in the denoised images. BM3D, block matching and 3D filtering; N2N, noise2noise; N2V, noise2void; Raw, original image; Trad., traditional.

To further evaluate the effectiveness of the denoising techniques under varying noise conditions, we applied them across a range of noise levels in a diffusion‐weighted data set (Figure S10). Among the methods, N2Ndiff demonstrated superior performance, providing the highest SNR improvement even under high noise levels. Tradvent and N2Vdiff also outperformed BM3D.

Across all subjects (Figure 3C), mean ADC values were significantly lower (p < 0.01) in the denoised images (BM3D = 0.0313 ± 0.0072, Tradvent = 0.0279 ± 0.0039, N2Ndiff = 0.0298 ± 0.0066, N2Vdiff = 0.0309 ± 0.0073 cm2/s) compared with the raw images (0.0322 ± 0.0073 cm2/s). Tradvent also produced significantly lower mean ADC than all other methods (p < 0.05). ADC SD was significantly reduced (p < 0.01) by all DL methods (Tradvent = 0.0092 ± 0.0025, N2Ndiff = 0.0070 ± 0.0019, N2Vdiff = 0.0088 ± 0.0022) versus the raw images (0.0121 ± 0.0029), with N2Ndiff achieving the lowest variability (p < 0.05). Bland–Altman analysis (Figure S11) revealed ADC biases relative to raw images, in units of 10⁻² cm2/s, of −0.09 (BM3D), −0.63 (Tradvent), −0.24 (N2Ndiff), and −0.14 (N2Vdiff).

3.3. Denoising gas‐exchange images

Figure 4 shows a representative slice from an adult with a history of BPD across denoising methods. All denoised images—gas, dissolved, membrane, and RBC—exhibited higher SNR compared with the raw images. Notably, the higher‐resolution dissolved images also showed noticeable SNR improvement. The DL models were highly efficient, completing denoising in 0.6 s per data set versus 69.4 s for BM3D. Similar SNR enhancements are illustrated for a healthy subject in Figure S12.

FIGURE 4.


Denoised 129Xe gas‐exchange images, including gas, low‐resolution (LR) reconstructed dissolved images, 1‐point Dixon separated membrane images, red blood cell (RBC) images, and high‐resolution (HR) reconstructed dissolved images for a patient with bronchopulmonary dysplasia before and after denoising. The signal‐to‐noise ratio exhibited a significant enhancement in the denoised images compared with the raw images. Blue line indicates the lung mask boundaries. Images are normalized to 0–1. BM3D, block matching and 3D filtering; N2N, noise2noise; N2V, noise2void; Raw, original image.

Table 3 and Figure 5 summarize denoising results across gas‐exchange testing data sets (n = 60). For gas images, mean lung signal was significantly higher with Tradvent and N2Nvent compared with raw, BM3D, and N2Vgas (p < 0.001; Figure S13A). Denoising also significantly reduced noise SD (p < 0.01; Figure S13B), leading to substantial SNR improvements in gas images (p < 0.001; Figure 5A). Similar patterns were observed across dissolved, membrane, RBC, and high‐resolution dissolved images, as detailed in Table 3, Figures 5B–E, and Figures S13C–J. Additionally, image sharpness in gas images was significantly higher with Tradvent and N2Nvent compared with raw, BM3D, and N2Vgas (p < 0.05; Figure S14A). Similar effects were observed in the dissolved, membrane, and RBC images (Table 3; Figure S14B–D).

TABLE 3.

Image quantification parameters across raw 129Xe gas‐exchange test images (n = 60) and all applied denoising techniques.

| Image type | Parameter | Raw | BM3D | Trad | N2N | N2V | p‐Value |
|---|---|---|---|---|---|---|---|
| Gas | Lung signal | 0.384 (0.094) | 0.387 (0.095) | 0.564 (0.081) | 0.543 (0.081) | 0.374 (0.089) | < 0.001 |
| | Noise SD (×10⁻²) | 2.02 (0.51) | 1.37 (0.52) | 1.45 (0.87) | 1.64 (0.55) | 1.28 (0.43) | < 0.001 |
| | SNR | 19.9 (4.8) | 32.2 (11.8) | 104.9 (32.7) | 55.7 (16.0) | 32.0 (8.6) | < 0.001 |
| | Sharpness (x) | 0.83 (0.17) | 0.57 (0.17) | 2.78 (0.25) | 2.10 (0.13) | 0.50 (0.13) | < 0.001 |
| Dissolved | Lung signal | 0.471 (0.060) | 0.475 (0.059) | 0.689 (0.061) | 0.636 (0.062) | 0.426 (0.065) | < 0.001 |
| | Noise SD (×10⁻²) | 3.39 (0.66) | 3.25 (0.69) | 5.36 (1.21) | 3.10 (0.60) | 1.37 (0.49) | < 0.001 |
| | SNR | 15.4 (3.0) | 16.4 (3.4) | 20.7 (6.3) | 40.9 (7.5) | 47.1 (12.6) | < 0.001 |
| | Sharpness (x) | 1.62 (0.42) | 1.61 (0.45) | 1.66 (0.47) | 1.48 (0.38) | 1.70 (0.47) | < 0.001 |
| Membrane | Lung signal | 0.468 (0.061) | 0.479 (0.059) | 0.690 (0.060) | 0.642 (0.057) | 0.424 (0.066) | < 0.001 |
| | Noise SD (×10⁻²) | 3.65 (0.71) | 3.47 (0.72) | 5.89 (1.31) | 3.25 (0.59) | 1.43 (0.52) | < 0.001 |
| | SNR | 14.1 (2.8) | 15.3 (3.1) | 18.2 (5.5) | 39.0 (7.1) | 44.5 (12.0) | < 0.001 |
| | Sharpness (x) | 0.57 (0.11) | 0.45 (0.09) | 1.66 (0.28) | 1.30 (0.22) | 0.37 (0.07) | < 0.001 |
| | Mean membrane uptake (×10⁻³) | 9.0 (1.2) | 9.0 (1.2) | 9.6 (0.9) | 9.6 (0.8) | 9.0 (1.3) | < 0.001 |
| | SD membrane uptake (×10⁻³) | 2.3 (0.5) | 2.3 (0.5) | 2.7 (0.4) | 2.7 (0.4) | 2.3 (0.5) | < 0.001 |
| RBC | Lung signal | 0.408 (0.079) | 0.424 (0.076) | 0.597 (0.082) | 0.568 (0.078) | 0.357 (0.083) | < 0.001 |
| | Noise SD (×10⁻²) | 4.03 (0.63) | 3.63 (0.62) | 4.99 (1.33) | 3.24 (0.85) | 1.17 (0.35) | < 0.001 |
| | SNR | 10.7 (2.7) | 12.5 (3.2) | 20.2 (8.7) | 28.3 (8.8) | 40.8 (13.9) | < 0.001 |
| | Sharpness (x) | 0.95 (0.19) | 0.78 (0.19) | 2.40 (0.23) | 1.94 (0.20) | 0.41 (0.08) | < 0.001 |
| | Mean RBC transfer (×10⁻³) | 3.4 (0.7) | 3.4 (0.7) | 3.6 (0.7) | 3.6 (0.6) | 3.4 (0.8) | < 0.001 |
| | SD RBC transfer (×10⁻⁴) | 7.7 (2.9) | 7.6 (2.9) | 7.4 (2.9) | 6.8 (2.7) | 8.0 (2.9) | < 0.001 |

Note: Values are presented as mean (SD). The p‐values in this table represent the overall comparison across methods using the Friedman test, while the pairwise comparisons results are shown in the corresponding figures and/or text.

Abbreviations: BM3D, block‐matching and 3D filtering; N2N, noise2noise; N2V, noise2void; Raw, original image; RBC, red blood cell; SD, standard deviation; SNR, signal‐to‐noise ratio; Trad, traditional.
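The overall p‐values in Table 3 come from a Friedman test across the five methods. As a rough illustration of what that test computes, the following is a minimal sketch in plain Python, assuming rows are subjects and columns are methods. The SNR values are hypothetical (not study data), the helper names are illustrative, and no tie correction is applied.

```python
def average_ranks(values):
    """Rank values from 1..k, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(table):
    """Friedman chi-square for rows = subjects, columns = methods (no tie correction)."""
    n, k = len(table), len(table[0])
    rank_sums = [0.0] * k
    for row in table:
        for col, rank in enumerate(average_ranks(row)):
            rank_sums[col] += rank
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)


# Hypothetical SNR values for 4 subjects x 5 methods (Raw, BM3D, Trad, N2N, N2V).
snr = [
    [19.0, 30.0, 100.0, 55.0, 31.0],
    [22.0, 35.0, 110.0, 60.0, 36.0],
    [15.0, 28.0, 90.0, 50.0, 29.0],
    [25.0, 40.0, 120.0, 65.0, 41.0],
]
q = friedman_statistic(snr)
CHI2_CRIT_DF4 = 9.488  # 95th percentile of chi-square with k - 1 = 4 degrees of freedom
print(round(q, 3), q > CHI2_CRIT_DF4)
```

Because every hypothetical subject ranks the methods in the same order, the statistic reaches its maximum of n(k − 1) = 16, which exceeds the critical value. A significant overall result only indicates that at least one method differs, which is why the pairwise comparisons are reported separately.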

FIGURE 5.

Signal‐to‐noise ratio (SNR) for 129Xe gas‐exchange images, including gas image, low‐resolution (LR) reconstructed dissolved images, 1‐point Dixon‐separated membrane images, red blood cell (RBC) images, and high‐resolution (HR) reconstructed dissolved images across all denoising methods in the testing data. SNR showed significant improvement in the denoised images, particularly when using the deep learning models, compared with the raw images across all types. BM3D, block matching and 3D filtering; N2N, noise2noise; N2V, noise2void; Raw, original image.

To further evaluate denoising performance under varying noise conditions, all methods were tested across a range of noise levels for a gas image from a representative gas‐exchange data set (Figure S15). N2Nvent and N2Vgas consistently achieved the highest SNR improvements, even at elevated noise levels, demonstrating strong robustness. Tradvent also outperformed BM3D across all noise levels.
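The noise‐level test can be pictured with a small sketch that adds Gaussian noise of increasing SD to an image and recomputes SNR. This shows only the test protocol, not any of the denoisers. The SNR definition (mean lung signal divided by background SD), the synthetic image, and the baseline noise level are assumptions made for illustration. The added SDs of 0, 75, and 150 follow the levels listed for Figure S15.

```python
import random
import statistics


def snr(image, lung_mask):
    """SNR = mean signal inside the mask / SD of the signal outside it (assumed definition)."""
    lung = [v for v, m in zip(image, lung_mask) if m]
    background = [v for v, m in zip(image, lung_mask) if not m]
    return statistics.mean(lung) / statistics.stdev(background)


def add_gaussian_noise(image, sd, rng):
    """Return a copy of the image with zero-mean Gaussian noise of the given SD added."""
    return [v + rng.gauss(0.0, sd) for v in image]


rng = random.Random(0)
# Hypothetical flattened image: 2000 "lung" voxels at intensity 200, 2000 background voxels at 0.
lung_mask = [True] * 2000 + [False] * 2000
clean = [200.0 if m else 0.0 for m in lung_mask]
baseline = add_gaussian_noise(clean, 10.0, rng)  # stands in for noise already present in a raw image

results = {}
for extra_sd in (0.0, 75.0, 150.0):
    noisy = add_gaussian_noise(baseline, extra_sd, rng)
    results[extra_sd] = snr(noisy, lung_mask)
    print(f"added SD = {extra_sd:5.1f}  SNR = {results[extra_sd]:.1f}")
```

In a full comparison, each noisy image would be passed through every denoising method before SNR is measured, so that the curves of SNR versus added noise can be compared across methods.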

Figure 6A shows representative slices of Membrane‐uptake, RBC‐transfer, and RBC:Membrane maps from a BPD subject (same as in Figure 4), before and after denoising. Both Tradvent and N2Nvent methods led to increased mean Membrane and RBC signal values compared with the raw, BM3D, and N2Vdiss images, with this trend consistently observed across the full cohort (Figure 6B,C). These increases were statistically significant (p < 0.001). However, no significant changes were detected in the mean RBC:Membrane ratio (p > 0.05). Additionally, SDs of all three maps were significantly reduced using Tradvent and N2Nvent methods, indicating more stable signal distributions. Full quantitative results are provided in Table 3.

FIGURE 6.

129Xe gas‐exchange metrics before and after denoising. (A) Slice‐selective membrane, red blood cell (RBC), and RBC:Membrane maps from the same bronchopulmonary dysplasia subject as in Figure 4. The Tradvent and N2Nvent methods resulted in significant increases in mean Membrane and RBC values compared with the raw data and other denoising techniques, whereas no significant changes were observed in RBC:Membrane values across all methods. (B) These trends were consistent across all subjects, with significant increases in mean Membrane and RBC observed with the Tradvent and N2Nvent methods, but no significant changes in RBC:Membrane values. (C) The standard deviation (SD) of the Membrane, RBC, and RBC:Membrane maps showed significant reductions with the Tradvent and N2Nvent methods compared with the raw data and other denoising techniques. BM3D, block matching and 3D filtering; N2N, noise2noise; N2V, noise2void; Raw, original image.

Bland–Altman analyses (Figures S16–S18) demonstrated the bias between raw and denoised images across all methods. For mean Membrane‐uptake values, the observed biases were BM3D = −0.3 × 10−5, Tradvent = −0.6 × 10−3, N2Nvent = −0.6 × 10−3, and N2Vdiss = −0.04 × 10−4 (Figure S16). For mean RBC‐transfer, the corresponding biases were BM3D = −2.2 × 10−6, Tradvent = −2.0 × 10−4, N2Nvent = −2.3 × 10−4, and N2Vdiss = −1.1 × 10−5 (Figure S17). For RBC:Membrane ratios, bias values were BM3D = −0.08 × 10−4, Tradvent = 0.01 × 10−1, N2Nvent = −0.01 × 10−1, and N2Vdiss = 0.87 × 10−3 (Figure S18).
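For readers less familiar with Bland–Altman analysis, the bias is the mean of the paired differences and the limits of agreement sit at ±1.96 SD around it. The sketch below assumes differences are taken as raw minus denoised, a sign convention inferred from the negative membrane and RBC biases alongside the higher denoised means in Table 3. The five paired values are hypothetical, not study data.

```python
import statistics


def bland_altman(raw, denoised):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Differences are taken as raw - denoised (an assumed sign convention), so a
    negative bias means the denoised value is larger on average.
    """
    diffs = [r - d for r, d in zip(raw, denoised)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical mean membrane-uptake values for five subjects (not study data).
raw = [8.1e-3, 9.4e-3, 9.0e-3, 10.2e-3, 8.6e-3]
denoised = [8.8e-3, 9.9e-3, 9.6e-3, 10.7e-3, 9.3e-3]
bias, lower, upper = bland_altman(raw, denoised)
print(f"bias = {bias:.2e}, limits of agreement = [{lower:.2e}, {upper:.2e}]")
```

A bias close to zero with narrow limits indicates that a denoising method leaves the metric essentially unchanged, whereas a consistent offset points to a systematic shift introduced by the method.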

4. DISCUSSION

This study shows that DL‐based denoising improves image quality across 129Xe imaging types—ventilation, diffusion, and gas‐exchange—improving quantification and reducing low‐SNR‐related bias. These improvements are particularly important as denoising becomes a key focus in clinical MRI, where enhancing image fidelity can directly affect diagnostic accuracy and reproducibility. All major MRI vendors are actively developing DL‐based denoising methods, highlighting the clinical value these techniques offer for improving lung function and structure assessment. Although the core denoising architectures in this study were adapted from established frameworks, the novelty lies in the systematic application, tuning, and validation of these models across all major 129Xe imaging types. Additionally, to our knowledge, this is the first study to train and deploy both 2D and 3D denoising networks for 129Xe images—serving as a proof of concept.

4.1. Ventilation

Tradvent‐denoised images showed increased sharpness and mean signal intensity but underestimated VDP (−1.37% bias), likely due to overemphasized structural edges and aggressive noise smoothing, which suppressed low‐intensity signals and reduced the visibility of mild ventilation defects. In contrast, N2Nvent‐denoised images exhibited reduced sharpness and lower mean signal intensity, resulting in VDP overestimation (+1.88% bias). This result aligns with a recent study using a commercial DL reconstruction pipeline for denoising 129Xe ventilation images, where a similar bias (+1.3%) was observed [16]. BM3D and N2Vvent denoising methods exhibited minimal changes in sharpness and mean signal, resulting in the smallest VDP biases (+0.13% and +0.35%, respectively). These methods appear to achieve a better balance between noise reduction and the preservation of quantitative measurements. It is worth noting that the ventilation images evaluated were already of high quality (mean SNR = 35 ± 16), so limited changes in VDP are expected. Substantial VDP shifts are more likely to occur when the SNR falls below ~8 [6].

Denoising 129Xe ventilation images provides clear clinical value by improving image quality. In low‐SNR images, noise can obscure ventilation defects and lead to underestimation of disease severity, particularly in patients with mild or heterogeneous abnormalities. Figure 7A illustrates this in a pediatric cystic fibrosis subject, where denoising reveals additional ventilation defects that were not visible in the raw image. In this example, the VDP increased by 4.1% (~44% relative increase) following denoising, a change that exceeds the minimal clinically important difference (~3% [46]). This finding aligns with a prior study demonstrating that VDP is underestimated in lower‐SNR images [6]. By reducing noise, denoising enhances visibility of regional ventilation patterns, supports more confident identification of abnormalities, and minimizes quantification bias. This may support more reliable disease assessment, enhance longitudinal monitoring, and contribute to increased clinical confidence when interpreting subtle changes in lung function.

FIGURE 7.

Denoising enhances image quality and clinical interpretability across 129Xe MRI types. (A) In a pediatric cystic fibrosis (CF) subject, low signal‐to‐noise ratio (SNR) in the raw ventilation image led to underestimation of ventilation defects, whereas denoising revealed additional abnormalities and increased ventilation defect percentage. (B) In a 44‐year‐old lymphangioleiomyomatosis (LAM) subject, apparent diffusion coefficient (ADC) maps from raw diffusion images showed high variability, which was substantially reduced after denoising, improving reliability. (C) In a younger LAM subject (25 years old), denoising improved delineation of microstructural airway and cyst regions, enhancing visualization of disease‐specific parenchymal changes.

Although some of the denoising methods (BM3D and N2Vvent) preserved VDP values relative to raw images, denoising remains clinically valuable even when quantitative changes are modest. Clinical decision making often relies on a combination of both visual interpretation and quantitative metrics. In particular, denoising may help improve the interpretability of low‐SNR images, aiding radiologists and clinicians in confidently identifying subtle ventilation abnormalities that may be visually ambiguous in noisy data. Furthermore, even small biases in VDP—particularly underestimations caused by high noise levels—can lead to clinically relevant misclassification. Therefore, selecting a denoising method should consider both its ability to preserve quantitative accuracy and its impact on visual clarity, as both influence disease assessment and monitoring in clinical settings.

4.2. Diffusion

Mean ADC values were consistently lower in denoised images compared with raw images, with observed biases (× 10−2) of −0.09 for BM3D, −0.63 for Tradvent, −0.24 for N2Ndiff, and −0.14 for N2Vdiff. Previous simulation studies have shown that an SNR greater than 15 is required for accurate and reliable ADC estimation in hyperpolarized gas diffusion imaging [7, 8]. At lower SNR levels, increased noise can systematically bias ADC values downward, leading to underestimation of true diffusivity.
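The downward bias at low SNR can be illustrated with a simplified two‐point calculation, ADC = ln(S0/Sb)/b, in which the magnitude noise floor inflates the weaker high‐b signal more than the b = 0 signal. The sketch below is deterministic and rests on several assumptions: a two‐point estimate rather than a multi‐b fit, the Rician second moment E[M^2] = S^2 + 2σ^2 as a stand‐in for the noise floor, and illustrative values for the true ADC and b‐value that are not taken from the study.

```python
import math


def two_point_adc(s0, sb, b):
    """Two-point ADC estimate: ADC = ln(S0 / Sb) / b."""
    return math.log(s0 / sb) / b


def magnitude_with_noise_floor(signal, sigma):
    """Approximate magnitude-image signal as sqrt(S^2 + 2*sigma^2).

    Uses the Rician second moment E[M^2] = S^2 + 2*sigma^2 as a simple,
    deterministic stand-in for the noise floor; it is not a full noise model.
    """
    return math.sqrt(signal**2 + 2.0 * sigma**2)


# Illustrative, assumed values (not taken from the study).
TRUE_ADC = 0.035  # cm^2/s
B_VALUE = 25.0    # s/cm^2
S0 = 1.0          # noiseless signal at b = 0

sb = S0 * math.exp(-B_VALUE * TRUE_ADC)
estimates = {}
for snr0 in (50.0, 15.0, 5.0):
    sigma = S0 / snr0
    estimates[snr0] = two_point_adc(
        magnitude_with_noise_floor(S0, sigma),
        magnitude_with_noise_floor(sb, sigma),
        B_VALUE,
    )
    print(f"SNR(b=0) = {snr0:4.0f}  estimated ADC = {estimates[snr0]:.4f} cm^2/s")
```

Under these assumptions the estimate always falls below the true value, and the shortfall grows as SNR decreases, which is the direction of bias described above.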

In the present study, although denoising led to a reduction in mean ADC, this is more plausibly attributed to decreased variability and suppression of noise‐driven high‐ADC outliers rather than an artifact of bias. Denoising mitigates spurious signal fluctuations, narrows the distribution of ADC values, and reduces noise‐related dispersion, resulting in more consistent and physiologically plausible measurements. Figure 7B,C illustrates how noise in raw images can elevate ADC estimates and increase regional variability, particularly in diseased lungs. Following denoising, ADC variability was substantially reduced, improving visualization of regional microstructure (e.g., cyst regions) in subjects with LAM. These findings suggest that noise not only obscures subtle pathological changes but may also falsely imply disease severity or presence.

Furthermore, the observed reductions in ADC SD in denoised images indicate improved measurement consistency. This is critical in 129Xe diffusion MRI, where image noise can significantly affect the precision of quantitative maps and hinder accurate assessment of lung microstructure [7, 8, 47]. These findings are consistent with our previous work using higher‐order singular value decomposition–based denoising for 129Xe diffusion imaging [14], which also demonstrated substantial reductions in ADC variability.

4.3. Gas exchange

Gas‐exchange 129Xe MR images pose significant challenges due to their inherently low signal strength and high sensitivity to noise, making accurate quantification difficult. These signals are further complicated by their dependence on physiological processes within the lungs, such as ventilation, perfusion, and diffusion, leading to greater variability and noise compared with ventilation and diffusion imaging. As a result, gas‐exchange imaging is more challenging to acquire and interpret, yet it plays a crucial role in assessing pulmonary function at the alveolar‐capillary level, providing critical insights into gas transfer abnormalities in different lung diseases [41, 42].

To address these challenges, we applied denoising techniques that significantly improved SNR for gas, dissolved, membrane, and RBC images. These improvements enhance image quality and may facilitate a more reliable quantification of gas‐exchange metrics, ensuring better differentiation of functional abnormalities and improving the accuracy of pulmonary function assessments.

Moreover, gas‐exchange imaging often uses undersampled (by > 50%) 3D radial acquisition techniques [42], which can lead to noise‐like blurring and streaking artifacts that further diminish SNR. To counteract these effects, gas‐exchange images are frequently reconstructed with larger kernel sizes (lower resolution) to improve SNR. However, this approach can compromise image quality by blurring or obscuring fine anatomical or physiological details that are critical for accurate interpretation, especially in clinical settings where subtle variations in tissue properties are key to diagnosis. As proof of concept, we demonstrated that dissolved‐phase images can be reconstructed with a sharper reconstruction kernel (higher resolution) while maintaining acceptable SNR using the evaluated denoising methods. This finding suggests that it is possible to enhance spatial resolution without sacrificing SNR.

The effect of different denoising methods on Membrane‐uptake, RBC‐transfer, and RBC:Membrane maps was also evaluated. Both Tradvent and N2Nvent methods resulted in increased mean Membrane and RBC metrics compared with raw and other denoising methods. This is likely due to the ability of these methods to reduce high‐frequency noise, which increases signal detection sensitivity and leads to higher observed values. However, despite these increases, no significant differences were observed in the RBC:Membrane ratio across all methods, suggesting that the denoising methods predominantly affect individual components rather than their relative balance. Additionally, both Tradvent and N2Nvent methods led to significant reductions in SD of the Membrane‐uptake, RBC‐transfer, and RBC:Membrane maps, indicating that the denoised images provided more consistent measurements. However, the clinical impact of denoising on gas‐exchange images was limited, primarily because membrane and RBC images already undergo intrinsic smoothing during reconstruction using large kernel sizes. As a result, the benefit of denoising is most pronounced in the gas‐phase images, where higher noise levels and finer structural detail make signal enhancement more impactful. Finally, with sufficiently robust denoising of gas‐phase images, it may be possible to derive ventilation‐like information directly from gas‐exchange acquisitions, potentially reducing the need for separate ventilation imaging.
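The stability of RBC:Membrane can be checked with simple arithmetic on the cohort means in Table 3: when Membrane and RBC values rise by a similar factor, their ratio barely moves. The sketch below uses the Raw and Trad columns. Note that it takes the ratio of cohort means, which is only an approximation of the per‐subject RBC:Membrane values analyzed in the study.

```python
# Cohort means from Table 3 (Raw vs. Trad columns), in units of 10^-3.
membrane_raw, membrane_trad = 9.0, 9.6
rbc_raw, rbc_trad = 3.4, 3.6

# Multiplicative change in each component after denoising.
membrane_gain = membrane_trad / membrane_raw
rbc_gain = rbc_trad / rbc_raw

# Ratio of cohort means (an approximation of the per-subject ratio).
ratio_raw = rbc_raw / membrane_raw
ratio_trad = rbc_trad / membrane_trad
ratio_change_pct = 100.0 * (ratio_trad - ratio_raw) / ratio_raw

print(f"membrane gain = {membrane_gain:.3f}, RBC gain = {rbc_gain:.3f}")
print(f"RBC:Membrane raw = {ratio_raw:.3f}, Trad = {ratio_trad:.3f} ({ratio_change_pct:+.2f}%)")
```

Both components increase by roughly 6% to 7%, so the ratio changes by less than 1%, consistent with the absence of a significant difference in RBC:Membrane.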

4.4. Denoising model application and advantages

Due to the lack of clean, high‐SNR diffusion and gas‐exchange images, the Tradvent and N2Nvent models, which were trained solely on ventilation images, were applied to denoise diffusion images (using Tradvent) and both the gas and dissolved‐phase images (using Tradvent and N2Nvent). This approach served as proof of concept, demonstrating the applicability of these models under various imaging conditions, such as different image resolutions. However, applying these models led to a significant increase in both mean signal and noise SD (Figures S8 and S13), primarily because the models were trained on higher‐resolution images, whereas the diffusion and gas‐exchange images are acquired at lower resolution. Despite this limitation, the models still produced substantial improvements in SNR across all image types. Thus, although further optimization is needed for lower‐resolution diffusion and gas‐exchange imaging, these denoising models are effective tools for enhancing SNR.

Applying advanced denoising techniques offers multiple advantages in both research and clinical settings, improving image quality and expanding the potential for more efficient and accessible 129Xe MRI. One key benefit is the reduction in the need for scan repetitions due to low SNR, which can result from several factors such as suboptimal performance of the 129Xe polarizer or incomplete inhalation of the gas dose, particularly in populations with compliance challenges, such as pediatric subjects. By improving SNR, these methods can enhance the reliability of single‐breath imaging and reduce the burden on patients. Another important advantage is the potential for image acceleration and multibreath imaging approaches, which can offer more detailed and comprehensive assessments of pulmonary function during respiration [48, 49]. These techniques open the door for faster acquisition protocols while maintaining image fidelity, improving the practicality of 129Xe MRI in clinical practice [50]. Additionally, significant cost savings could be achieved by using naturally abundant 129Xe for imaging, which is less expensive than enriched 129Xe but typically produces lower‐SNR images. Although this was not specifically examined in the current study, denoising algorithms could help compensate for the reduced SNR, enabling high‐quality imaging without relying solely on enriched gas. This is particularly relevant as the cost of enriched 129Xe continues to rise, posing increasing financial challenges for both research and clinical implementation. This combination of improved image quality, reduced costs, and enhanced acquisition protocols strengthens the case for widespread adoption of denoising techniques in pulmonary imaging.

4.5. Limitations

The study has several limitations:

  • It relies exclusively on synthetic noise for training the denoising models, which may not fully replicate the noise characteristics encountered in clinical data—particularly for non‐Cartesian acquisition data.

  • Due to the lack of clean, high‐SNR diffusion and gas‐exchange images, the Tradvent and N2Nvent models were used to denoise diffusion and gas‐exchange images. Although these models yielded better‐quality images, they were trained on ventilation images, which might limit their performance. Dedicated models for each imaging type will be required for better performance and to reduce biases.

  • Although the unsupervised N2V method offers significant practical utility, it is inherently limited by its dependence on noisy images for training. This approach may struggle to accurately denoise unique voxels that differ substantially from their neighbors, potentially leading to erroneous estimates.

  • These models have only been applied to magnitude images, limiting their use for gas‐exchange imaging, which relies on separating RBC and membrane signals into real and imaginary channels [51, 52, 53]. However, future work could adapt these denoising approaches to complex data, enabling Dixon separation and improving the overall analysis of gas‐exchange imaging.

  • Results from healthy participants and disease subjects were not evaluated separately during the denoising method comparisons. However, because pairwise comparisons were performed, the results are expected to be consistent across all images.

  • This study lacks external validation, which limits assessment of generalizability across sites and scanners. Future work will address this by testing model performance on independent data sets.

  • This study uses a relative threshold for VDP, which varies across denoising methods due to changes in mean signal intensity. This may affect defect classification and highlights the need for standardized VDP approaches in future work.
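The last limitation can be made concrete with a toy calculation. The sketch below assumes the relative threshold described for the VDP maps in Figure S2 (defect = signal below 60% of the whole‐lung mean). The voxel values and the "denoiser" that lifts only well‐ventilated signal are hypothetical, chosen to show how a shift in mean signal moves the threshold and reclassifies borderline voxels.

```python
def vdp(lung_voxels, fraction=0.60):
    """Percentage of lung voxels with signal below `fraction` of the whole-lung mean."""
    threshold = fraction * sum(lung_voxels) / len(lung_voxels)
    defects = sum(1 for v in lung_voxels if v < threshold)
    return 100.0 * defects / len(lung_voxels)


# Hypothetical lung: 80 well-ventilated, 10 borderline, and 10 poorly ventilated voxels.
raw = [1.00] * 80 + [0.56] * 10 + [0.20] * 10
# Hypothetical denoiser that lifts only the well-ventilated signal by 10%.
boosted = [v * 1.10 if v >= 1.0 else v for v in raw]
# Uniform rescaling of every voxel, for comparison.
rescaled = [v * 2.0 for v in raw]

print(vdp(raw), vdp(boosted), vdp(rescaled))
```

In this toy case the borderline voxels are not altered at all, yet they cross the threshold once the mean rises, whereas a uniform rescaling leaves VDP unchanged. A method‐dependent change in mean signal can therefore shift VDP in either direction independently of any true change in ventilation.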

5. CONCLUSIONS

This study highlights the effectiveness of DL‐based denoising methods in improving the quality of 129Xe MRI data, particularly for ventilation, diffusion, and gas‐exchange images. The improvements in SNR and noise reduction observed in all imaging protocols suggest that DL‐based denoising methods hold great promise for clinical translation. These techniques could enable more reliable diagnostic assessments, especially in scenarios where scan time, breath‐hold duration, or patient compliance limits image quality. Future work should address the limitations of synthetic‐noise training and non‐Rician data and explore the broader applicability of these methods to real‐world clinical data.

FUNDING INFORMATION

The study was supported by the National Institutes of Health (R00HL111217, R01HL131012, R01HL166335, 2R01HL126771, R01HL151588, R01HL143011, and R44HL123299), University of Cincinnati Cancer Center, Cystic Fibrosis Foundation (CLEVEL16A0), and the National Organization for Rare Disorders (20003).

CONFLICT OF INTEREST

Matthew M. Willmering and Jason C. Woods are consultants to Polarean Imaging, plc.

Supporting information

Figure S1. Workflow for data preparation, patch generation, and model training for Traditional, N2N, and N2V denoising methods applied to 129Xe MRI.

Figure S2. 129Xe ventilation images for a 14 year‐old healthy control subject before and after denoising. Signal‐to‐noise ratio (SNR) increased significantly in denoised images using block‐matching and 3D filtering (BM3D) and deep‐learning models compared with the raw images. Ventilation defect percentage (VDP) maps (red), indicating lung regions with signal < 60% of the whole‐lung mean, showed minimal defects. N2N, noise2noise; N2V, noise2void; Raw, original image; Trad, traditional.

Figure S3. Mean and noise standard deviation (SD) of lung signal for the testing ventilation data across all denoising methods. (A) Mean lung signal was significantly higher in the Tradvent relative to the other images. (B) The noise SD was significantly reduced in the denoised images compared with the raw images.

Figure S4. Quantitative comparison of sharpness in raw and denoised ventilation images. Image sharpness was significantly increased in the Tradvent method and decreased in the N2Nvent method compared with the raw and other denoised images (p < 0.001).

Figure S5. Performance of denoised methods on a ventilation data set across different noise levels. Left: Average signal‐to‐noise ratio (SNR) values as a function of the amount of added Gaussian noise. Right: Qualitative results of the denoising methods compared with the raw images with Gaussian noise (SD = 0, 50, 100).

Figure S6. Bland–Altman plots of the difference in ventilation defect percentage (VDP) obtained from raw and denoised images. (A–D) Bland–Altman plots comparing raw VDP to VDP values calculated using block matching 3D (BM3D), Tradvent, N2Nvent, and N2Vvent techniques, respectively. The solid black lines represent the bias (mean difference) between the VDP values calculated from raw and denoised images, whereas the dashed red lines indicate the standard deviation (σ) at ±1.96σ from the mean (95% confidence interval).

Figure S7. 129Xe diffusion‐weighted images (lowest and highest b‐value images) for a 31 year‐old control subject before and after denoising. The signal‐to‐noise ratio (SNR) showed a significant increase in the denoised images compared with the raw images. The standard deviation of apparent diffusion coefficient (ADC) was significantly reduced in the denoised images.

Figure S8. Image parameters for the first and last b‐value images, along with mean apparent diffusion coefficient (ADC) and standard deviation (SD) of ADC for the testing data across all denoising methods. (A) Mean lung signal increased significantly in N2Ndiff relative to the other images. (B) The noise SD was significantly lower in the denoised images compared with the raw images. (C,D) A similar trend was observed in the last b‐value images.

Figure S9. Quantitative comparison of sharpness in raw and denoised diffusion images. Image sharpness was significantly decreased in Tradvent, block matching 3D (BM3D), and N2Vdiff methods (p < 0.05), whereas sharpness remained unchanged in the N2Ndiff method compared with the raw images.

Figure S10. Performance of denoised methods on a diffusion data set across different noise levels. Left: Average signal‐to‐noise ratio (SNR) values as a function of the amount of added Gaussian noise. Right: Qualitative results of the denoising methods compared with the raw images with Gaussian noise (SD = 0, 12.5, 25).

Figure S11. Bland–Altman plots of the difference in apparent diffusion coefficient (ADC) obtained from raw images and denoised images. (A–D) Bland–Altman plots comparing raw ADC to ADC values calculated from denoised images using block matching 3D (BM3D), Tradvent, N2Ndiff, and N2Vdiff techniques, respectively. The solid black lines represent the bias (mean difference) between the ADC values calculated from raw and denoised images, whereas the dashed red lines indicate the 95% confidence interval.

Figure S12. Denoised 129Xe gas‐exchange images, including gas, low‐resolution reconstructed dissolved images, 1‐point Dixon separated membrane images, red blood cell (RBC) images, and high‐resolution reconstructed dissolved images for a healthy subject before and after denoising. The signal‐to‐noise ratio (SNR) exhibited a significant enhancement in the denoised images compared with the raw images. Blue line indicates the lung mask boundaries.

Figure S13. Image parameters for 129Xe gas‐exchange images, including gas image, low‐resolution reconstructed dissolved images, 1‐point Dixon‐separated membrane images, RBC images, and high‐resolution reconstructed dissolved images across all denoising methods in the testing data. The mean lung signal was significantly higher in the Tradvent and N2Nvent methods compared with the raw and other denoising methods for all image types. In contrast, the noise standard deviation (SD) was significantly lower in block matching 3D (BM3D), N2Nvent, N2Vgas, and N2Vdiss, but significantly higher in Tradvent method compared with the raw images.

Figure S14. Quantitative comparison of sharpness of raw and denoised gas‐exchange images. Sharpness of denoised gas‐exchange images using Tradvent and N2Nvent was significantly higher compared with the sharpness of raw images and images denoised using block matching 3D (BM3D), N2Vgas, and N2Vdiss.

Figure S15. Performance of denoised methods on a gas data set across different noise levels. Left: Average signal‐to‐noise ratio (SNR) values as a function of the amount of added Gaussian noise. Right: Qualitative results of the denoising methods compared with the raw images with Gaussian noise (standard deviation [SD] = 0, 75, 150).

Figure S16. Bland–Altman plots of the difference in membrane obtained from raw and denoised gas‐exchange images. (A–D) Bland–Altman plots comparing raw membrane to membrane values calculated from denoised images using block matching 3D (BM3D), Tradvent, N2Nvent, and N2Vdiss methods, respectively. The solid black lines represent the bias (mean difference) between the membrane values calculated from raw and denoised images, whereas the dashed red lines indicate the 95% confidence interval.

Figure S17. Bland–Altman plots of the difference in red blood cells (RBCs) obtained from raw and denoised gas‐exchange images. (A–D) Bland–Altman plots comparing raw RBC to RBC values calculated from denoised images using block matching 3D (BM3D), Tradvent, N2Nvent, and N2Vdiss methods, respectively. The solid black lines represent the bias (mean difference) between the RBC values calculated from raw and denoised images, whereas the dashed red lines indicate the 95% confidence interval.

Figure S18. Bland–Altman plots of the difference in red blood cell (RBC): membrane obtained from raw and denoised gas‐exchange images. (A–D) Bland–Altman plots comparing raw RBC:Membrane to RBC:Membrane values calculated from denoised images using block matching 3D (BM3D), Tradvent, N2Nvent, and N2Vdiss methods, respectively. The solid black lines represent the bias (mean difference) between the RBC:Membrane values calculated from raw and denoised images, whereas the dashed red lines indicate the 95% confidence interval.

Table S1. Denoising model summary.

MRM-95-138-s001.docx (7MB, docx)

ACKNOWLEDGMENTS

The authors thank Andrew Bryan, Joseph Plummer, and Carter McMaster for preparing HP 129Xe gas, and Kaley Bridgewater, Kelsey Murphy, Lacey Haas, Brynne Williams, and John Lanier for operating the MRI hardware.

Bdaiwi A. S., Willmering M. M., Hussain R., et al., “Comparative evaluation of supervised and unsupervised deep learning strategies for denoising hyperpolarized 129Xe lung MRI,” Magnetic Resonance in Medicine 95, no. 1 (2026): 138–156, 10.1002/mrm.70033.

DATA AVAILABILITY STATEMENT

Python scripts for training the models described in this study are located here: https://github.com/aboodbdaiwi/XeDL_Denoising.

REFERENCES

  • 1. Walkup LL, Woods JC. Translational applications of hyperpolarized 3He and 129Xe. NMR Biomed. 2014;27:1429‐1438.
  • 2. Roos JE, McAdams HP, Kaushik SS, Driehuys B. Hyperpolarized gas MR imaging: technique and applications. Magn Reson Imaging Clin N Am. 2015;23:217‐229.
  • 3. Khan AS, Harvey RL, Birchall JR, et al. Enabling clinical technologies for hyperpolarized 129Xenon magnetic resonance imaging and spectroscopy. Angew Chem Int Ed Engl. 2021;60:22126‐22147.
  • 4. Goodson BM. Nuclear magnetic resonance of laser‐polarized noble gases in molecules, materials, and organisms. J Magn Reson. 2002;155:157‐216.
  • 5. Bdaiwi AS, Costa ML, Plummer JW, Willmering MM, Walkup LL, Cleveland ZI. B1 and magnetization decay correction for hyperpolarized 129Xe lung imaging using sequential 2D spiral acquisitions. Magn Reson Med. 2023;90:473‐482.
  • 6. He M, Zha W, Tan F, Rankine L, Fain S, Driehuys B. A comparison of two hyperpolarized 129Xe MRI ventilation quantification pipelines: the effect of signal to noise ratio. Acad Radiol. 2019;26:949‐959.
  • 7. Bdaiwi AS, Niedbalski PJ, Hossain MM, et al. Improving hyperpolarized 129Xe ADC mapping in pediatric and adult lungs with uncertainty propagation. NMR Biomed. 2021;35:e4639.
  • 8. O'Halloran RL, Holmes JH, Altes TA, Salerno M, Fain SB. The effects of SNR on ADC measurements in diffusion‐weighted hyperpolarized He‐3 MRI. J Magn Reson. 2007;185:42‐49.
  • 9. Dabov K, Foi A, Katkovnik V, Egiazarian K. Image denoising by sparse 3‐D transform‐domain collaborative filtering. IEEE Trans Image Process. 2007;16:2080‐2095.
  • 10. Kong Z, Han L, Liu X, Yang X. A new 4‐D nonlocal transform‐domain filter for 3‐D magnetic resonance images denoising. IEEE Trans Med Imaging. 2017;37:941‐954.
  • 11. Zhang X, Peng J, Xu M, et al. Denoise diffusion‐weighted images using higher‐order singular value decomposition. NeuroImage. 2017;156:128‐145.
  • 12. Christensen NV, Vaeggemose M, Bøgh N, et al. A user independent denoising method for x‐nuclei MRI and MRS. Magn Reson Med. 2023;90:2539‐2556.
  • 13. Kim Y, Chen HY, Autry AW, et al. Denoising of hyperpolarized 13C MR images of the human brain using patch‐based higher‐order singular value decomposition. Magn Reson Med. 2021;86:2497‐2511.
  • 14. Soderlund SA, Bdaiwi AS, Plummer JW, Woods JC, Walkup LL, Cleveland ZI. Improved diffusion‐weighted hyperpolarized 129Xe lung MRI with patch‐based higher‐order, singular value decomposition denoising. Acad Radiol. 2024;31:5289‐5299.
  • 15. Manjón JV, Coupe P. MRI Denoising Using Deep Learning. Springer; 2018:12‐19.
  • 16. Stewart NJ, de Arcos J, Biancardi AM, et al. Improving Xenon‐129 lung ventilation image SNR with deep‐learning based image reconstruction. Magn Reson Med. 2024;92:2546‐2559.
  • 17. Kang B, Lee W, Seo H, Heo HY, Park H. Self‐supervised learning for denoising of multidimensional MRI data. Magn Reson Med. 2024;92:1980‐1994.
  • 18. Tripathi PC, Bag S. CNN‐DMRI: a convolutional neural network for denoising of magnetic resonance images. Pattern Recogn Lett. 2020;135:57‐63.
  • 19. Couturier R, Perrot G, Salomon M. Image Denoising Using a Deep Encoder‐Decoder Network with Skip Connections. Springer; 2018:554‐565.
  • 20. Mehta D, Padalia D, Vora K, Mehendale N. MRI Image Denoising Using U‐Net and Image Processing Techniques. IEEE; 2022:306‐313.
  • 21. Tian M, Song K. Boosting magnetic resonance image denoising with generative adversarial networks. IEEE Access. 2021;9:62266‐62275.
  • 22. Moreno López M, Frederick JM, Ventura J. Evaluation of MRI denoising methods using unsupervised learning. Front Artif Intell. 2021;4:642731.
  • 23. Rai S, Bhatt JS, Patra SK. An unsupervised deep learning framework for medical image denoising. arXiv:2103.06575, 2021.
  • 24. Lehtinen J, Munkberg J, Hasselgren J, et al. Noise2Noise: learning image restoration without clean data. arXiv:1803.04189, 2018.
  • 25. de Negreiros ACSV, Giraldi G, Werner H, Santos ÍMF. Self‐Supervised Image Denoising Methods: an Application in Fetal MRI. In Workshop de Visão Computacional (WVC). São Bernardo do Campo, Brazil: SBC, FEI; 2023:137‐141.
  • 26. Krull A, Buchholz T‐O, Jug F. Noise2Void - learning denoising from single noisy images. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2019:2129‐2137.
  • 27. Roach DJ, Willmering MM, Plummer JW, et al. Hyperpolarized 129Xenon MRI ventilation defect quantification via thresholding and linear binning in multiple pulmonary diseases. Acad Radiol. 2022;29:S145‐S155.
  • 28. Hussain R, Plummer J, Willmering M, Bdaiwi A, Walkup L, Cleveland Z. Addressing healthy aging variations in ventilation from childhood to older age using xenon MRI. B80‐1 Methodological Advancements in Pulmonary Imaging. American Thoracic Society; 2024:A4489.
  • 29. Lin NY, Roach DJ, Willmering MM, et al. 129Xe MRI as a measure of clinical disease severity for pediatric asthma. J Allergy Clin Immunol. 2021;147:2146‐2153.
  • 30. Willmering M, Hysinger E, Janjindamai C, et al. Assessment of obstructive and restrictive lung disease via MRI in bronchopulmonary dysplasia patients. C29 Advanced Lung Imaging; the Pulmonary Paparazzi. American Thoracic Society; 2024:A5211.
  • 31. Walkup LL, Myers K, El‐Bietar J, et al. Xenon‐129 MRI detects ventilation deficits in paediatric stem cell transplant patients unable to perform spirometry. Eur Respir J. 2019;53:1801779.
  • 32. Walkup LL, Roach DJ, Hall CS, et al. Cyst ventilation heterogeneity and alveolar airspace dilation as early disease markers in lymphangioleiomyomatosis. Ann Am Thorac Soc. 2019;16:1008‐1016.
  • 33. Bdaiwi AS, Willmering MM, Woods JC, Walkup LL, Cleveland ZI. Quantifying spatial distribution of ventilation defects in multiple pulmonary diseases with hyperpolarized 129Xenon MRI. J Magn Reson Imaging. 2024;61:1860‐1873.
  • 34. Hussain R, Plummer J, Bdaiwi A, Willmering M, Walkup L, Cleveland Z. Improved quantification of ventilation heterogeneity in hyperpolarized xenon MRI using physics‐rooted bias‐field correction. B80‐1 Methodological Advancements in Pulmonary Imaging. American Thoracic Society; 2024:A4488.
  • 35. Hussain R, Plummer JW, Bdaiwi AS, Willmering MM, Walkup LL, Cleveland ZI. Improved efficiency and quantitative accuracy in hyperpolarized 129Xe ventilation imaging using 2D spiral acquisition. In: Proceedings of the 32nd Joint ISMRM & ISMRT Annual Meeting, Toronto, Ontario, Canada, 2023.
  • 36. Bdaiwi AS, Roach DJ, West ME, et al. Longitudinal monitoring of Lumacaftor/Ivacaftor response in young children with cystic fibrosis lung disease using 129Xe MRI. Acad Radiol. 2025.
  • 37. Matheson AM, Bdaiwi AS, Willmering MM, et al. Disease classification of pulmonary xenon ventilation MRI using artificial intelligence. Acad Radiol. 2025.
  • 38. Bdaiwi AS, Svoboda AM, Murdock KE, et al. Quantifying abnormal alveolar microstructure in cystic fibrosis lung disease via hyperpolarized 129Xe diffusion MRI. J Cyst Fibros. 2024;23:926‐935.
  • 39. Bdaiwi AS, Willmering MM, Wang H, Cleveland ZI. Diffusion weighted hyperpolarized 129Xe MRI of the lung with 2D and 3D (FLORET) spiral. Magn Reson Med. 2022;89:1342‐1356.
  • 40. Plummer JW, Willmering MM, Cleveland ZI, Towe C, Woods JC, Walkup LL. Childhood to adulthood: accounting for age dependence in healthy‐reference distributions in 129Xe gas‐exchange MRI. Magn Reson Med. 2023;89:1117‐1133.
  • 41. Willmering MM, Walkup LL, Niedbalski PJ, et al. Pediatric 129Xe gas‐transfer MRI—feasibility and applicability. J Magn Reson Imaging. 2022;56:1207‐1219.
  • 42. Wang Z, He M, Bier E, et al. Hyperpolarized 129Xe gas transfer MRI: the transition from 1.5 T to 3T. Magn Reson Med. 2018;80:2374‐2383.
  • 43. Bdaiwi AS, Willmering MM, Plummer JW, et al. 129Xe image processing pipeline: an open‐source, graphical user interface application for the analysis of hyperpolarized 129Xe MRI. Magn Reson Med. 2024;93:1220‐1237.
  • 44. Thomen RP, Walkup LL, Roach DJ, Cleveland ZI, Clancy JP, Woods JC. Hyperpolarized 129Xe for investigation of mild cystic fibrosis lung disease in pediatric patients. J Cyst Fibros. 2017;16:275‐282.
  • 45. Wang Z, Robertson SH, Wang J, et al. Quantitative analysis of hyperpolarized 129Xe gas transfer MRI. Med Phys. 2017;44:2415‐2428.
  • 46. McIntosh MJ, Biancaniello A, Kooner HK, et al. 129Xe MRI ventilation defects in asthma: what is the upper limit of normal and minimal clinically important difference? Acad Radiol. 2023;30:3114‐3123.
  • 47. Sukstanskii AL, Bretthorst GL, Chang YV, Conradi MS, Yablonskiy DA. How accurately can the parameters from a model of anisotropic 3He gas diffusion in lung acinar airways be estimated? Bayesian view. J Magn Reson. 2007;184:62‐71.
  • 48. Hamedani H, Ma K, DiBardino D, et al. Simultaneous imaging of ventilation and gas exchange with hyperpolarized 129Xe MRI for monitoring patients with endobronchial valve interventions. Am J Respir Crit Care Med. 2022;205:e48‐e50.
  • 49. Hamedani H, Amzajerdian F, Kadlecek S, et al. Quantifying ventilation using dynamic xenon MRI during free breathing. In: Proceedings of the Annual Meeting of ISMRM, London, UK, 2022.
  • 50. Plummer JW, Hussain R, Bdaiwi AS, et al. A decay‐modeled compressed sensing reconstruction approach for non‐Cartesian hyperpolarized 129Xe MRI. Magn Reson Med. 2024;92:1363‐1375.
  • 51. Kaushik SS, Robertson SH, Freeman MS, et al. Single‐breath clinical imaging of hyperpolarized (129)Xe in the airspaces, barrier, and red blood cells using an interleaved 3D radial 1‐point Dixon acquisition. Magn Reson Med. 2016;75:1434‐1443.
  • 52. Wang Z, He M, Bier E, et al. Hyperpolarized 129Xe gas transfer MRI: the transition from 1.5T to 3T. Magn Reson Med. 2018;80:2374‐2383.
  • 53. Willmering MM, Cleveland ZI, Walkup LL, Woods JC. Removal of off‐resonance xenon gas artifacts in pulmonary gas‐transfer MRI. Magn Reson Med. 2021;86:907‐915.

Associated Data

Supplementary Materials

Figure S1. Workflow for data preparation, patch generation, and model training for Traditional, N2N, and N2V denoising methods applied to 129Xe MRI.

Figure S2. 129Xe ventilation images for a 14‐year‐old healthy control subject before and after denoising. Signal‐to‐noise ratio (SNR) increased significantly in images denoised using block‐matching and 3D filtering (BM3D) and deep‐learning models compared with the raw images. Ventilation defect percentage (VDP) maps (red), indicating lung regions with signal < 60% of the whole‐lung mean, showed minimal defects. N2N, Noise2Noise; N2V, Noise2Void; Raw, original image; Trad, traditional.

Figure S3. Mean and noise standard deviation (SD) of lung signal for the testing ventilation data across all denoising methods. (A) Mean lung signal was significantly higher in the Tradvent relative to the other images. (B) The noise SD was significantly reduced in the denoised images compared with the raw images.

Figure S4. Quantitative comparison of sharpness in raw and denoised ventilation images. Image sharpness was significantly increased in the Tradvent method and decreased in the N2Nvent method compared with the raw and other denoised images (p < 0.001).

Figure S5. Performance of denoising methods on a ventilation data set across different noise levels. Left: Average signal‐to‐noise ratio (SNR) values as a function of the amount of added Gaussian noise. Right: Qualitative results of the denoising methods compared with the raw images with Gaussian noise (SD = 0, 50, 100).

Figure S6. Bland–Altman plots of the difference in ventilation defect percentage (VDP) obtained from raw and denoised images. (A–D) Bland–Altman plots comparing raw VDP to VDP values calculated using block matching 3D (BM3D), Tradvent, N2Nvent, and N2Vvent techniques, respectively. The solid black lines represent the bias (mean difference) between the VDP values calculated from raw and denoised images, whereas the dashed red lines mark ±1.96σ from the bias (95% confidence interval), where σ is the standard deviation of the differences.

Figure S7. 129Xe diffusion‐weighted images (lowest and highest b‐value images) for a 31‐year‐old control subject before and after denoising. The signal‐to‐noise ratio (SNR) increased significantly in the denoised images compared with the raw images. The standard deviation of the apparent diffusion coefficient (ADC) was significantly reduced in the denoised images.

Figure S8. Image parameters for the first and last b‐value images, along with mean apparent diffusion coefficient (ADC) and standard deviation (SD) of ADC for the testing data across all denoising methods. (A) Mean lung signal increased significantly in N2Ndiff relative to the other images. (B) The noise SD was significantly lower in the denoised images compared with the raw images. (C,D) A similar trend was observed in the last b‐value images.

Figure S9. Quantitative comparison of sharpness in raw and denoised diffusion images. Image sharpness was significantly decreased in Tradvent, block matching 3D (BM3D), and N2Vdiff methods (p < 0.05), whereas sharpness remained unchanged in the N2Ndiff method compared with the raw images.

Figure S10. Performance of denoising methods on a diffusion data set across different noise levels. Left: Average signal‐to‐noise ratio (SNR) values as a function of the amount of added Gaussian noise. Right: Qualitative results of the denoising methods compared with the raw images with Gaussian noise (SD = 0, 12.5, 25).

Figure S11. Bland–Altman plots of the difference in apparent diffusion coefficient (ADC) obtained from raw images and denoised images. (A–D) Bland–Altman plots comparing raw ADC to ADC values calculated from denoised images using block matching 3D (BM3D), Tradvent, N2Nvent, and N2Vvent techniques, respectively. The solid black lines represent the bias (mean difference) between the ADC values calculated from raw and denoised images, whereas the dashed red lines indicate the 95% confidence interval.

Figure S12. Denoised 129Xe gas‐exchange images, including gas, low‐resolution reconstructed dissolved images, 1‐point Dixon separated membrane images, red blood cell (RBC) images, and high‐resolution reconstructed dissolved images for a healthy subject before and after denoising. The signal‐to‐noise ratio (SNR) exhibited a significant enhancement in the denoised images compared with the raw images. Blue line indicates the lung mask boundaries.

Figure S13. Image parameters for 129Xe gas‐exchange images, including gas images, low‐resolution reconstructed dissolved images, 1‐point Dixon‐separated membrane images, RBC images, and high‐resolution reconstructed dissolved images across all denoising methods in the testing data. The mean lung signal was significantly higher in the Tradvent and N2Nvent methods compared with the raw and other denoising methods for all image types. In contrast, the noise standard deviation (SD) was significantly lower in block matching 3D (BM3D), N2Nvent, N2Vgas, and N2Vdiss, but significantly higher in the Tradvent method compared with the raw images.

Figure S14. Quantitative comparison of sharpness of raw and denoised gas‐exchange images. Sharpness of denoised gas‐exchange images using Tradvent and N2Nvent was significantly higher compared with the sharpness of raw images and images denoised using block matching 3D (BM3D), N2Vgas, and N2Vdiss.

Figure S15. Performance of denoising methods on a gas data set across different noise levels. Left: Average signal‐to‐noise ratio (SNR) values as a function of the amount of added Gaussian noise. Right: Qualitative results of the denoising methods compared with the raw images with Gaussian noise (standard deviation [SD] = 0, 75, 150).

Figure S16. Bland–Altman plots of the difference in membrane obtained from raw and denoised gas‐exchange images. (A–D) Bland–Altman plots comparing raw membrane to membrane values calculated from denoised images using block matching 3D (BM3D), Tradvent, N2Nvent, and N2Vdiss methods, respectively. The solid black lines represent the bias (mean difference) between the membrane values calculated from raw and denoised images, whereas the dashed red lines indicate the 95% confidence interval.

Figure S17. Bland–Altman plots of the difference in red blood cells (RBCs) obtained from raw and denoised gas‐exchange images. (A–D) Bland–Altman plots comparing raw RBC to RBC values calculated from denoised images using block matching 3D (BM3D), Tradvent, N2Nvent, and N2Vdiss methods, respectively. The solid black lines represent the bias (mean difference) between the RBC values calculated from raw and denoised images, whereas the dashed red lines indicate the 95% confidence interval.

Figure S18. Bland–Altman plots of the difference in the red blood cell to membrane ratio (RBC:Membrane) obtained from raw and denoised gas‐exchange images. (A–D) Bland–Altman plots comparing raw RBC:Membrane to RBC:Membrane values calculated from denoised images using block matching 3D (BM3D), Tradvent, N2Nvent, and N2Vdiss methods, respectively. The solid black lines represent the bias (mean difference) between the RBC:Membrane values calculated from raw and denoised images, whereas the dashed red lines indicate the 95% confidence interval.

Table S1. Denoising model summary.

MRM-95-138-s001.docx (7MB, docx)
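
Two quantities recur in the captions above: VDP, defined in Figure S2 as the fraction of lung with signal below 60% of the whole‐lung mean, and the Bland–Altman bias with ±1.96σ lines described in Figure S6. The following is a minimal sketch of both calculations, not the authors' pipeline. The function names, the denoised‐minus‐raw sign convention, and the use of the sample standard deviation are assumptions made for illustration.

```python
import numpy as np


def ventilation_defect_percentage(image, lung_mask, threshold=0.60):
    """Percentage of lung voxels with signal below threshold x whole-lung mean.

    `threshold=0.60` follows the 60% cutoff quoted in Figure S2; the function
    name and signature are hypothetical.
    """
    image = np.asarray(image, dtype=float)
    lung_mask = np.asarray(lung_mask, dtype=bool)
    lung = image[lung_mask]                      # voxels inside the lung mask
    defect = lung < threshold * lung.mean()      # voxels flagged as defect
    return 100.0 * defect.sum() / lung.size


def bland_altman(raw, denoised):
    """Return (bias, lower, upper) for paired metric values.

    Assumption: difference is denoised - raw, so a negative bias means the
    denoised images underestimate the metric relative to raw.
    """
    diff = np.asarray(denoised, dtype=float) - np.asarray(raw, dtype=float)
    bias = diff.mean()                           # solid line in the plots
    sigma = diff.std(ddof=1)                     # sample SD of the differences
    return bias, bias - 1.96 * sigma, bias + 1.96 * sigma


if __name__ == "__main__":
    # Toy 2 x 2 "lung" with one low-signal voxel.
    img = np.array([[1.0, 1.0], [1.0, 0.1]])
    mask = np.ones_like(img, dtype=bool)
    print(ventilation_defect_percentage(img, mask))

    # Toy paired VDP values (percent) from raw and denoised images.
    print(bland_altman([10.0, 20.0, 30.0], [9.0, 21.0, 30.0]))
```

With the difference taken as denoised minus raw, a negative bias corresponds to a metric that is underestimated after denoising, which matches the sign of the biases reported in the Results.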

Data Availability Statement

Python scripts for training the models described in this study are located here: https://github.com/aboodbdaiwi/XeDL_Denoising.


Articles from Magnetic Resonance in Medicine are provided here courtesy of Wiley