Abstract
Coronary CT angiography is increasingly used for cardiac diagnosis. Dose modulation techniques can reduce radiation dose, but the resulting functional images are noisy and challenging for functional analysis. This retrospective study describes and evaluates a deep learning method for denoising functional cardiac imaging that takes advantage of multiphase information in a three-dimensional convolutional neural network. Coronary CT angiograms (n = 566) were used to derive synthetic data for training. Deep learning–based image denoising was compared with unprocessed images and a standard noise reduction algorithm (block-matching and three-dimensional filtering [BM3D]). Noise and signal-to-noise ratio measurements, as well as expert evaluation of image quality, were performed. To validate the use of the denoised images for cardiac quantification, threshold-based segmentation was performed, and results were compared with manual measurements on unprocessed images. Deep learning–based denoised images showed significantly lower noise than images denoised with the standard algorithm (SD of left ventricular blood pool, 20.3 HU ± 42.5 [SD] vs 33.4 HU ± 39.8 for deep learning–based image denoising vs BM3D; P < .0001). Expert ratings of image quality were significantly higher for deep learning–based denoised images than for images with standard denoising. Semiautomatic left ventricular size measurements on deep learning–based denoised images showed excellent correlation with expert quantification on unprocessed images (intraclass correlation coefficient, 0.97). Deep learning–based denoising using a three-dimensional approach achieved excellent denoising performance and facilitated valid automatic processing of cardiac functional imaging.
Keywords: Cardiac CT Angiography, Deep Learning, Image Denoising
Supplemental material is available for this article.
© RSNA, 2024
Summary
Deep learning–based denoising of dose-modulated cardiac CT angiographic examinations using a three-dimensional approach resulted in excellent image quality compared with conventional methods, and left ventricular segmentations on the denoised images were strongly correlated with expert manual segmentations.
Key Points
■ Deep learning–based denoised images showed significantly lower noise compared with images denoised using standard methods (SD of left ventricular blood pool, 20.3 HU vs 33.4 HU for deep learning vs standard denoising; P < .001).
■ Expert evaluations showed significantly higher quality scores for deep learning–based denoised images, with a median quality score of 5 (scale of 1 to 5, with 5 representing excellent quality) versus 2 for images with standard denoising (P < .001).
■ A threshold-based algorithm for left ventricular segmentation applied to deep learning–based denoised images resulted in excellent agreement with expert segmentations (intraclass correlation coefficient, 0.97), indicating valid measurements on processed images.
Introduction
Cardiac CT angiography (CTA) has moved from being a niche procedure to being a cornerstone of cardiac diagnostics in the last decade. Retrospectively gated studies are frequently performed due to their robustness in a variety of clinical scenarios, including high heart rate, arrhythmia, and tachypnea (1). Retrospective gating with dose modulation produces high-noise functional data that are usually discarded. This study aims to make such data usable for downstream applications like measuring the left ventricular ejection fraction (LVEF). Functional imaging can have additional clinical value compared with coronary evaluation alone (2–4). However, if functional data are not desired and there is a regular rhythm, prospective gating offers an option with generally lower radiation doses.
Low-dose functional images, which may be reconstructed as a byproduct on retrospectively gated dose-modulated coronary CTA acquisitions, are often not usable for diagnostic purposes due to high noise. Kang et al (5) used a cycle-consistent generative adversarial network (CycleGAN) to learn denoising of the high-noise portions of unpaired data. Another approach is to create synthetic high-noise data and use a convolutional neural network to learn denoising (6). These methods do not leverage the fact that a very similar high-quality time frame is available. Therefore, each time point should ideally not be analyzed in isolation, and all available data should be used for denoising.
In this study, we propose a method to create synthetic ground truth data using image coregistration to register the lowest noise time point with the higher noise time points. While the synthetic labels are far from perfect, deep learning has been shown to tolerate imperfect labels (7). Thus, these data may improve denoising performance. Additionally, we used a three-dimensional (3D) U-Net architecture, with time as the third dimension (two-dimensional [2D]+t), as we assumed it would be superior to a 2D U-Net, which has no access to prior and later time points.
Materials and Methods
Data Collection
This Health Insurance Portability and Accountability Act–compliant retrospective study was approved by our institutional review board, and the requirement for patient consent was waived due to the use of anonymized data. Data from 566 consecutive adult patients undergoing cardiac CTA for evaluation of anatomy and pathology of the coronary arteries (see Appendix S1) with dose-modulated retrospective gating were collected. Cardiac CTA was performed with a dual-source CT scanner (Siemens Force) using intravenous contrast material (iopamidol, 370 mg/mL; Bracco). The reconstruction kernel was Bf32, and the section thickness was 2 mm with a reduced reconstruction matrix (256 × 256). We further downsampled the images to 128 × 128 pixels. For the 3D U-Net architecture, we preprocessed each 2D axial section with 10 time points as the third dimension. In addition, we padded three sections along the time axis in a wraparound fashion, so that three frames of the cycle are appended at the beginning and end, resulting in final dimensions of 128 × 128 × 16 and effectively presenting 1.6 cardiac cycles to the convolutional neural network. The final dataset, split on a per-patient basis with a ratio of 80:10:10, consisted of 29 699, 3087, and 3325 time series for the training, validation, and test datasets, respectively.
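As an illustration of the temporal preprocessing described above, the following minimal sketch (Python/NumPy; array names and exact shapes are our assumptions) pads one axial section cyclically along the time axis:

```python
import numpy as np

def wraparound_pad(series: np.ndarray, pad: int = 3) -> np.ndarray:
    # series: one axial section over the cardiac cycle, shape (T, H, W),
    # e.g. (10, 128, 128). Cyclic ("wrap") padding prepends the last `pad`
    # frames of the cycle and appends the first `pad` frames.
    return np.pad(series, pad_width=((pad, pad), (0, 0), (0, 0)), mode="wrap")

section = np.zeros((10, 128, 128), dtype=np.float32)  # 10 cardiac phases
net_input = wraparound_pad(section)                   # shape (16, 128, 128)
```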
Generation of Synthetic Denoised Data
We automatically selected the time point with the lowest noise level (usually diastole) using the difference between the image data (3D) and median-filtered data (kernel size, 2). We then used nonrigid image coregistration to deform this frame to match each of the higher noise frames, thereby generating a low-noise synthetic time series. For nonrigid image coregistration, we used the freely available software elastix 5.0.0 (8). Detailed registration parameters are stated in Appendix S1.
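A minimal sketch of the frame selection step follows; we assume the per-frame noise score is summarized as the mean absolute difference between the frame and its median-filtered version (the aggregation statistic is our assumption).

```python
import numpy as np
from scipy.ndimage import median_filter

def lowest_noise_time_point(series: np.ndarray, kernel_size: int = 2) -> int:
    # series: time series of 3D volumes, shape (T, Z, Y, X). For each frame,
    # estimate noise as the mean absolute difference between the frame and
    # its median-filtered version, then pick the frame with the lowest score.
    scores = [
        np.abs(frame - median_filter(frame, size=kernel_size)).mean()
        for frame in series
    ]
    return int(np.argmin(scores))  # usually a diastolic frame
```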
U-Net Architecture
We used a modified U-Net with residual connections and group normalization (9–12). L1 loss was used, as it has previously been shown to be superior to L2 loss in image restoration (13). We used 64 initial feature maps, a learning rate of 0.0001, a batch size of 4, and the Adam optimizer. To provide additional temporal context, we used a 3D U-Net with two spatial dimensions and one temporal dimension (2D+t). For comparison, we also trained a 2D U-Net. Training was performed on two V100 GPUs (NVIDIA) and took a total of 14 hours.
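The exact layer layout is not specified beyond residual connections and group normalization; the PyTorch sketch below shows one plausible building unit together with the stated training configuration (L1 loss, Adam, learning rate 0.0001). The layer order and number of normalization groups are our assumptions.

```python
import torch
import torch.nn as nn

class ResidualBlock3d(nn.Module):
    # Residual convolutional block with group normalization; one plausible
    # building unit for the modified 2D+t U-Net (layer order is assumed).
    def __init__(self, channels: int, groups: int = 8):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv3d(channels, channels, kernel_size=3, padding=1),
            nn.GroupNorm(groups, channels),
            nn.ReLU(inplace=True),
            nn.Conv3d(channels, channels, kernel_size=3, padding=1),
            nn.GroupNorm(groups, channels),
        )
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(x + self.body(x))

# Training configuration as stated in the text.
stem = nn.Conv3d(1, 64, kernel_size=3, padding=1)  # 64 initial feature maps
model = nn.Sequential(stem, ResidualBlock3d(64))   # stand-in for the full U-Net
criterion = nn.L1Loss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```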
Comparison Algorithm
For comparison, we used the standard algorithm block-matching and 3D filtering (BM3D; pybm3d implementation), which has been shown to deliver good results across a wide range of applications.
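A minimal per-frame usage sketch follows; we assume pybm3d exposes a bm3d(image, sigma) entry point and that the noise SD parameter is tuned to the data (both are assumptions, not details from the text).

```python
import numpy as np
import pybm3d

frame = np.random.rand(128, 128).astype(np.float32)  # stand-in axial frame
sigma = 0.1  # assumed noise SD on the image's intensity scale
denoised = pybm3d.bm3d.bm3d(frame, sigma)  # assumed pybm3d entry point
```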
Evaluation
Objective measurements.— For image quality evaluation, we defined noise as the SD of CT numbers of a 1-cm region of interest placed in the blood pool in the left ventricle (LV) (n = 20). We also calculated signal-to-noise ratio (SNR) by dividing the mean by the SD of CT numbers within the blood pool region of interest. These measurements and calculations were performed for each frame in a time series.
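These definitions translate directly into code; a minimal sketch follows, in which the boolean ROI mask stands in for the manually placed 1-cm region.

```python
import numpy as np

def roi_noise_and_snr(frame: np.ndarray, roi: np.ndarray) -> tuple[float, float]:
    # frame: CT numbers (HU) of one time frame; roi: boolean mask of the
    # LV blood pool region of interest. Noise is the SD of CT numbers in
    # the ROI; SNR is the ROI mean divided by the ROI SD.
    values = frame[roi]
    noise = float(values.std())
    snr = float(values.mean() / noise)
    return noise, snr
```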
Observer evaluation.— Experienced physician readers (one radiologist with >3 years of cardiovascular imaging experience [D.M.] and one physician radiology researcher with extensive cardiovascular imaging research experience [M.J.W.]) evaluated the overall image quality in terms of noise level and the number of artifacts present on a scale from 1 to 5 (with 5 describing the highest image quality). Raters had access to the full time series of a single section and were instructed to score the most problematic frame of each time series. Images were presented blinded to denoising method and in random order (n = 50).
Evaluation of usability of reconstructions for LV segmentation.— To determine if automatic LV measurements on denoised CT images are valid, we used a simple proof-of-concept threshold-based LV segmentation algorithm and compared results to manual segmentations (n = 20 patients, single section, 10 frames, performed by V.S.). In a preprocessing step, all high attenuation areas not within the LV (eg, bones or the left atrium) were manually masked. A threshold for LV segmentation was selected based on mean LV blood pool attenuation multiplied by 0.75. All unmasked pixels above the threshold were considered to be LV lumen.
In addition, we performed analyses of the LVEF, which was calculated from single-section areas as follows: LVEF_Area = 100 × (end-diastolic area − end-systolic area) / end-diastolic area.
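The segmentation and ejection fraction steps above can be summarized in the following sketch (pixel spacing, mask names, and selection of end-diastole/end-systole as the extremal areas are our assumptions):

```python
import numpy as np

def lv_area_mm2(frame: np.ndarray, exclusion_mask: np.ndarray,
                blood_pool_mean: float, pixel_area_mm2: float) -> float:
    # Threshold = 0.75 x mean LV blood pool attenuation; unmasked pixels
    # above the threshold are counted as LV lumen.
    threshold = 0.75 * blood_pool_mean
    lumen = (frame > threshold) & ~exclusion_mask
    return float(lumen.sum()) * pixel_area_mm2

def lvef_area(areas: np.ndarray) -> float:
    # Area-based ejection fraction over one cardiac cycle; end-diastole is
    # taken as the largest area, end-systole as the smallest (assumed).
    eda, esa = areas.max(), areas.min()
    return 100.0 * (eda - esa) / eda
```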
Statistical Analysis
All statistical analyses were performed using R (version 4.0; The R Foundation). For comparison of multiple measurements, the Friedman test with a paired Conover test and Bonferroni adjustment was performed (R package PMCMR) after measurements were averaged over all frames for each case. For LV measurement comparisons, intraclass correlation coefficients (ICCs) (R package psych, command ICC) with CIs were calculated (using bootstrapping, R package boot). In addition, Bland-Altman analysis was performed (R packages blandr, ggplot2). For CI-based comparisons, significance at the P < .05 level was assumed when CIs did not overlap.
Results
Example Images
Figure 1 and Movies 1 and 2 show example coronary CT angiograms with large variability of noise over the cardiac cycle, the corresponding denoising results, and the synthetic training data. Subjectively, the 3D U-Net shows good denoising performance. Figure 1 and Movie 1 show an example of imperfect synthetic data in which the coregistration algorithm produced an exaggerated deformation of the aortic root, likely related to changes in contrast attenuation in the right atrium (best seen in Movie 1). However, the deep learning denoising results do not show such a deformation.
Figure 1:
Example images of a cine series with severe noise and artifacts are shown for unprocessed, block-matching and three-dimensional filtering (BM3D), two-dimensional (2D) U-Net, and 3D U-Net images. The rightmost column shows the corresponding synthetic training data.
Movie 1:
Video of the data in Figure 1 of the main document. The images from left to right show unprocessed, BM3D, 2D U-Net, 3D U-Net, and synthetic data.
Movie 2:
Additional cine videos of 8 cases. Columns from left to right show unprocessed, BM3D, 2D U-Net, and 3D U-Net images.
Noise Measurements
Mean and highest noise measurements.— The noise levels varied considerably over the cardiac cycle (Fig S1). Although noise varied over time, we first analyzed the mean blood pool noise level. Figure 2 (left upper panel) shows that 3D U-Net denoising provided superior results compared with BM3D (P < .001). The 2D U-Net showed mean noise levels similar to those of BM3D (Fig 2; P = .64). The relatively high SDs reflect the large range of noise levels of the analyzed images.
Figure 2:
Box-and-whisker plots show results for noise measurements using the SD of the blood pool in Hounsfield units (upper panels) and the signal-to-noise ratio (SNR, lower panels) measured within the left ventricular cavity. The left plots show mean measurements including all time frames of the cardiac cycle, while the right plots show results for the most problematic time frame, which usually limits quantitative functional analysis. The box midline represents the median, the box borders indicate the first and third quartiles, and the whiskers extend to 1.5 times the IQR. BM3D = block-matching and three-dimensional filtering, 2D = two-dimensional.
The limiting factor for advanced functional analysis is the highest noise time point of the cardiac cycle. Therefore, we also performed an analysis based on the highest noise occurrence in each time series. The results shown in Figure 2 (right upper panel) indicate that the advantage of the 3D U-Net is more pronounced for the most critical time points compared with the average values (24.0 HU vs 60.8 HU for 3D U-Net and BM3D, respectively; P < .001).
SNR values.— The SNR was calculated for the blood pool (Fig 2, lower panels). In line with findings of noise levels, the deep learning–based denoising method showed superior results on analysis of mean and lowest SNR per time series (P < .001).
Observer Evaluation
Figure 3 demonstrates that the 3D U-Net achieved meaningfully higher subjective image quality in all subcategories (overall quality, noise, and artifacts; P < .001 for all characteristics).
Figure 3:
Expert evaluations of overall image quality (upper panel), noise-related image quality (middle panel), and artifact-related image quality (lower panel) are shown using a score from 1 (red, unusable) to 5 (dark green, excellent quality). The three-dimensional (3D) U-Net has the highest proportion of high-quality scores in all three categories by a large margin (dark green bar). The block-matching and 3D filtering (BM3D) images show higher quality scores in the noise category compared with the unprocessed images (middle panel). Of note, the quality scores in regard to artifacts are lower for BM3D images compared with unprocessed images (lower panel). The table shows scores as medians with IQRs in parentheses.
Evaluation of Validity of LV Measurements on Denoised Images
As a proof of concept, we used a simple threshold-based segmentation algorithm with the 3D U-Net denoised images as input. Agreement of automatic LV area measurements with the reference standard of manual segmentation was excellent (ICC, 0.97; 95% CI: 0.96, 0.98; Fig S2) and significantly better compared with using BM3D denoised images (ICC, 0.69; 95% CI: 0.61, 0.75). Bland-Altman analysis also showed significantly narrower limits of agreement for the 3D U-Net image–based measurements. LVEF measurements showed similar results (Figs S2–S4).
Discussion
Our results show that a 3D U-Net trained with synthetic data was highly effective in removing noise and artifacts from the high-noise portions of the cardiac cycle with reliable delineation of the LV endocardial contour. The 3D U-Net outperformed sophisticated conventional denoising methods like BM3D by a clinically meaningful margin.
The results for the 2D U-Net approach were more in line with conventional methods. This indicates that receiving the full information of the cardiac cycle including low and high noise portions, as the 3D U-Net does, is key to allowing the convolutional neural network to successfully remove severe noise and noise-related artifacts. The expert assessment of image quality corroborated these objective measures.
A large body of literature exists on the issue of denoising of CT images using a variety of algorithms (14), but the specific issue of multi-time-point acquisition is rarely addressed. A major concern is that deep learning reconstruction networks may inaccurately reconstruct anatomy in ways that bias quantitative measurements (eg, cardiac function assessment). Networks are optimized to reduce noise and artifacts but may not preserve true anatomy. As shown in the results, exaggerated deformations can occur in the synthetic training data. Deep learning methods have been shown to affect anatomic measurements at coronary CT (15). Therefore, to provide additional validation, we performed a comparison of threshold-based LV measurements performed on denoised images which showed excellent agreement with expert LV measurements. The synthetic training data generation relies on the performance of the coregistration algorithm in the setting of high noise. Multiscale registration methods, as used in this work, have been shown to be particularly effective in noisy images (16). Of note, our method aims to generate low-noise low-resolution images from very high-noise images for use in LV functional assessment but does not have the detail to be usable for coronary vessel or valve evaluation. Other measurements such as size of the aortic root were not validated in the current study which focused on LV function.
Based on these observations, it is reasonable to assume that automatic functional analysis will be possible in the subgroup of patients who, for a variety of clinical reasons, are undergoing dose-modulated coronary CTA scans. This could provide additional clinically valuable information such as LVEF and potentially LV strain.
Our study had several limitations. While our dataset is relatively large, it originates from a single center and a single vendor, limiting generalizability. In addition, generalizability is limited by the use of standardized injection protocols. Due to anonymization processes, we cannot correlate results with specific patient characteristics. We did not compare our method directly to other previously described deep learning denoising methods. We used only a simple 2D thresholding method for automated analysis of the LV as a proof of concept.
The results show the high performance of deep learning denoising in the setting of time-variable noise occurring at dose-modulated cardiac CT. Future research directions include extension of this method to the noise properties of photon-counting CT and evaluation of the feasibility of LV strain measurements in high-noise acquisitions.
Acknowledgments
The authors would like to acknowledge support from the Stanford Radiology Department Radcombinator program and computer support from the Stanford 3D and Quantitative Imaging Laboratory.
The authors declared no funding for this work.
Disclosures of conflicts of interest: V.S. No relevant relationships. M.J.W. Payments made to institution from the American Heart Association; founder and CEO of Segmed; stock options in Segmed. M.C. Postdoctoral scholarship from the American Heart Association (no. 826389); payment for lecture on research methodologies from FASTeR; holds shares in Tempus AI and stock options in Arterys; associate editor for Radiology: Artificial Intelligence; editorial board member for European Radiology Experimental; scientific editorial board member for European Radiology. D.M. Research grant from the National Institute of Biomedical Imaging and Bioengineering (no. 5T32EB009035); consulting fees and stock options in Segmed; trainee editorial board member of Radiology: Cardiothoracic Imaging. D.F. Deputy editor for Radiology: Cardiothoracic Imaging.
Abbreviations:
- BM3D = block-matching and three-dimensional filtering
- CTA = cardiac CT angiography
- ICC = intraclass correlation coefficient
- LV = left ventricle
- LVEF = LV ejection fraction
- SNR = signal-to-noise ratio
- 3D = three-dimensional
- 2D = two-dimensional
References
- 1. Maroules CD, Rybicki FJ, Ghoshhajra BB, et al. 2022 use of coronary computed tomographic angiography for patients presenting with acute chest pain to the emergency department: an expert consensus document of the Society of Cardiovascular Computed Tomography (SCCT): endorsed by the American College of Radiology (ACR) and North American Society for Cardiovascular Imaging (NASCI). J Cardiovasc Comput Tomogr 2023;17(2):146–163.
- 2. Schlett CL, Banerji D, Siegel E, et al. Prognostic value of CT angiography for major adverse cardiac events in patients with acute chest pain from the emergency department: 2-year outcomes of the ROMICAT trial. JACC Cardiovasc Imaging 2011;4(5):481–491.
- 3. Kang DK, Lim SH, Park JS, Sun JS, Ha T, Kim TH. Clinical utility of early postoperative cardiac multidetector computed tomography after coronary artery bypass grafting. Sci Rep 2020;10(1):9186.
- 4. Arsanjani R, Berman DS, Gransar H, et al. Left ventricular function and volume with coronary CT angiography improves risk stratification and identification of patients at risk for incident mortality: results from 7758 patients in the prospective multinational CONFIRM observational cohort study. Radiology 2014;273(1):70–77.
- 5. Kang E, Koo HJ, Yang DH, Seo JB, Ye JC. Cycle-consistent adversarial denoising network for multiphase coronary CT angiography. Med Phys 2019;46(2):550–562.
- 6. Green M, Marom EM, Konen E, Kiryati N, Mayer A. 3-D neural denoising for low-dose coronary CT angiography (CCTA). Comput Med Imaging Graph 2018;70:185–191.
- 7. Rolnick D, Veit A, Belongie S, Shavit N. Deep learning is robust to massive label noise. arXiv 1705.10694 [preprint] https://arxiv.org/abs/1705.10694. Published May 30, 2017. Accessed March 12, 2024.
- 8. Klein S, Staring M, Murphy K, Viergever MA, Pluim JPW. elastix: a toolbox for intensity-based medical image registration. IEEE Trans Med Imaging 2010;29(1):196–205.
- 9. Sandfort V, Yan K, Pickhardt PJ, Summers RM. Data augmentation using generative adversarial networks (CycleGAN) to improve generalizability in CT segmentation tasks. Sci Rep 2019;9(1):16884.
- 10. Wu Y, He K. Group normalization. arXiv 1803.08494 [preprint] https://arxiv.org/abs/1803.08494. Published March 22, 2018. Accessed March 12, 2024.
- 11. Çiçek Ö, Abdulkadir A, Lienkamp SS, Brox T, Ronneberger O. 3D U-Net: learning dense volumetric segmentation from sparse annotation. In: Ourselin S, Joskowicz L, Sabuncu MR, Unal G, Wells W, eds. Medical Image Computing and Computer-Assisted Intervention – MICCAI 2016. Lecture Notes in Computer Science, vol 9901. Springer, 2016; 424–432.
- 12. Isensee F, Kickingereder P, Wick W, Bendszus M, Maier-Hein KH. Brain tumor segmentation and radiomics survival prediction: contribution to the BRATS 2017 challenge. In: Crimi A, Bakas S, Kuijf H, Menze B, Reyes M, eds. Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries. BrainLes 2017. Lecture Notes in Computer Science, vol 10670. Springer, 2018; 287–297.
- 13. Zhao H, Gallo O, Frosio I, Kautz J. Loss functions for image restoration with neural networks. IEEE Trans Comput Imaging 2017;3(1):47–57.
- 14. McCollough CH, Bartley AC, Carter RE, et al. Low-dose CT for the detection and classification of metastatic liver lesions: results of the 2016 Low Dose CT Grand Challenge. Med Phys 2017;44(10):e339–e352.
- 15. Denzinger F, Wels M, Breininger K, et al. How scan parameter choice affects deep learning-based coronary artery disease assessment from computed tomography. Sci Rep 2023;13(1):2563.
- 16. Paquin D, Levy D, Xing L. Multiscale deformable registration of noisy medical images. Math Biosci Eng 2008;5(1):125–144.