Table 1. A summary of deep learning techniques for PET image enhancement.
| Task | Learning Style | Paper | Method and Architecture | Data and Radiotracer | Loss Function | Input and Output | Evaluation Metric |
|---|---|---|---|---|---|---|---|
| Denoising | Supervised | da Costa-Luis et al. [27, 28] | 3D CNN, 3 layers | 18F-FDG simulation and human data | L2 loss | Input: Low-count PET images with and without resolution modeling, T1-weighted MR, and T1-guided NLM filtering of the resolution-modeling reconstruction; Output/Training target: Full-count PET | NRMSE, bias vs. variance curves |
| | | Gong et al. [29] | CNN with residual learning, 5 residual blocks | 18F-FDG simulation data, 18F-FDG human data | L2 loss + perceptual loss | Input: Low-count PET; Output/Training target: Full-count PET | CRC vs. variance curves |
| | | Xiang et al. [30] | Deep auto-context CNN, 12 convolutional layers | 18F-FDG human data | L2 loss + L2-norm weight regularization | Input: Low-count PET image, T1-weighted MRI; Output/Training target: Full-count PET | NRMSE, PSNR |
| | | Chen et al. [31] | 2D residual U-Net | 18F-florbetaben human brain data | L1 loss | Input: Low-count PET image, multi-contrast MR images (T1-weighted, T2-weighted, T2 FLAIR); Output/Training target: Full-count PET image | NRMSE, PSNR, SSIM |
| | | Spuhler et al. [32] | 2D residual dilated CNN | 18F-FDG human data | L1 loss | Input: Low-count PET; Output/Training target: Full-count PET | SSIM, PSNR, MAPE |
| | | Serrano-Sosa et al. [33] | 2.5D U-Net with residual learning and dilated convolution | 18F-FDG human brain data | - | Input: Low-count PET; Output/Training target: Full-count PET | SSIM, PSNR, MAPE |
| | | Schaefferkoetter et al. [34] | 3D U-Net | 18F-FDG human data | L2 loss | Input: Low-count PET; Output/Training target: Full-count PET | CRC |
| | | Sano et al. [35] | 2D residual U-Net | Proton-induced PET data from simulations and a human head and neck phantom study | L2 loss | Input: Noisier low-count PET; Output/Training target: Less noisy low-count PET | PSNR |
| | | Wang et al. [36] | GAN; Generator: 3D U-Net; Discriminator: 4-convolution-layer CNN | 18F-FDG simulated data, 18F-FDG human brain data | L1 loss + adversarial loss | Input: Low-count PET, T1-weighted MRI, fractional anisotropy and mean diffusivity images computed from diffusion MRI; Output/Training target: Full-count PET | PSNR, SSIM |
| | | Zhao et al. [37] | CycleGAN; Generator: multi-layer CNN; Discriminator: 4-convolution-layer CNN | 18F-FDG simulated data, 18F-FDG human data | L1 supervised loss + Wasserstein adversarial loss + cycle-consistency loss + identity loss | Input: Low-count PET; Output/Training target: Full-count PET | NRMSE, SSIM, PSNR, learned perceptual image patch similarity, SUV bias |
| | | Xue et al. [38] | Least-squares GAN; Generator: 3D U-Net-like network with residual learning and self-attention modules; Discriminator: 4-convolution-layer CNN | 18F-FDG human data | L2 loss + adversarial loss | Input: Low-count PET; Output/Training target: Full-count PET | PSNR, SSIM |
| | | Wang et al. [39] | cGANs with progressive refinement; Generator: 3D U-Net; Discriminator: 4-convolution-layer CNN | 18F-FDG human brain data | L1 supervised loss + adversarial loss | Input: Low-count PET; Output/Training target: Full-count PET | NMSE, PSNR, SUV bias |
| | | Kaplan et al. [40] | GAN; Generator: 2D encoder-decoder with skip connections; Discriminator: 5-layer CNN | 18F-FDG human data | L2 loss + gradient loss + total variation loss + adversarial loss | Input: Low-count PET; Output/Training target: Full-count PET | RMSE, MSSIM, PSNR |
| | | Zhou et al. [41] | CycleGAN; Generator: multi-layer 2D CNN; Discriminator: 6-layer CNN | 18F-FDG human data | L1 supervised loss + Wasserstein adversarial loss + cycle-consistency loss + identity loss | Input: Low-count PET; Output/Training target: Full-count PET | NRMSE, SSIM, PSNR, SUV bias |
| | | Ouyang et al. [42] | GAN; Generator: 2.5D U-Net; Discriminator: 4-convolution-layer CNN | 18F-florbetaben human data | L1 loss + adversarial loss + task-specific perceptual loss | Input: Low-count PET; Output/Training target: Full-count PET | SSIM, PSNR, RMSE |
| | | Gong et al. [43] | GAN; Generator: hybrid 2D and 3D encoder-decoder; Discriminator: 6-layer CNN | 18F-FDG human data | L2 loss + Wasserstein adversarial loss | Input: Low-count PET; Output/Training target: Full-count PET | NRMSE, PSNR, Riesz transform-based feature similarity index, visual information fidelity |
| | | Liu et al. [44] | 3D U-Net with cross-tracer, cross-protocol transfer learning | 18F-FDG human data, 18F-FMISO human data, 68Ga-DOTATATE data | L2 loss | Input: Low-count PET; Output/Training target: Full-count PET | NRMSE, SNR, SUV bias |
| | | Lu et al. [45] | Network comparison: convolutional autoencoder, U-Net, residual U-Net, and GAN; 2D vs. 2.5D vs. 3D | 18F-FDG human lung data | L2 loss | Input: Low-count PET; Output/Training target: Full-count PET | NMSE, SNR, SUV bias |
| | | Ladefoged et al. [46] | 3D U-Net | 18F-FDG human cardiac data | Huber loss | Input: Low-count PET, CT; Output/Training target: Full-count PET | NRMSE, PSNR, SUV bias |
| | | Sanaat et al. [47] | 3D U-Net | 18F-FDG human data | L2 loss | Input: Low-dose PET image/sinogram; Output/Training target: Standard-dose PET image/sinogram | RMSE, PSNR, SSIM, SUV bias |
| | | He et al. [48] | Deep CNN | 18F-FDG simulated brain data, 18F-FDG dynamic data | L1 loss + gradient loss + total variation loss | Input: Noisy dynamic PET, MRI; Output/Training target: Composite dynamic images | RMSE, SSIM, CRC vs. variance curves |
| | | Wang et al. [49] | Deep CNN | 18F-FDG human whole-body data | Attention-weighted loss | Input: Low-count PET, T1-weighted LAVA MRI; Output/Training target: Full-count PET | NRMSE, SSIM, PSNR, SUV bias |
| | | Schramm et al. [50] | 3D CNN with residual learning | 18F-FDG, 18F-PE2I, and 18F-FET human data | L2 loss | Input: OSEM-reconstructed low-count PET, T1-weighted MRI; Output/Training target: Enhanced PET (based on anatomical guidance) | CRC, SSIM |
| | | Jeong et al. [51] | GAN; Generator: 2D U-Net; Discriminator: 3-layer CNN | 18F-FDG human brain data | L2 loss + adversarial loss | Input: Low-count PET; Output/Training target: Full-count PET | NRMSE, PSNR, SSIM, SUV bias |
| | | Tsuchiya et al. [52] | 2D CNN with residual learning | 18F-FDG human whole-body data | Weighted L2 loss | Input: Low-count PET image; Output/Training target: Full-count PET | SUV bias |
| | | Liu et al. [53] | 2D U-Net with asymmetric skip connections | Simulated 18F-FDG brain data | L2 loss | Input: Filtered-backprojection-reconstructed PET, T1-weighted MRI; Output/Training target: MLEM-reconstructed PET | MSE, CNR, bias-variance images |
| | | Sanaat et al. [54] | CycleGAN (Generator: 2D U-Net-like network; Discriminator: 9-layer CNN) and ResNet (20 convolutional layers) | 18F-FDG human data | CycleGAN: L1 loss + adversarial loss; ResNet: L2 loss | Input: Low-count PET; Output/Training target: Full-count PET | MSE, PSNR, SSIM, SUV bias |
| | | Chen et al. [55] | 2D U-Net with residual learning | 18F-FDG human brain data | L1 loss | Input: Low-count PET image, multi-contrast MRI (T1-weighted, T2-weighted, T2 FLAIR); Output/Training target: Full-count PET image | RMSE, PSNR, SSIM |
| | | Katsari et al. [56] | SubtlePET™ AI | 18F-FDG PET/CT human data | - | - | SUV bias, subjective image quality, lesion detectability |
| | Unsupervised, weakly-supervised, or self-supervised | Cui et al. [57] | Deep Image Prior; 3D U-Net | Simulation and human data from two radiotracers: 68Ga-PRGD2 (PET/CT) and 18F-FDG (PET/MR) | L2 loss | Input: CT/MR image; Output: Denoised PET; Training target: Noisy PET | CRC vs. variance curves |
| | | Hashimoto et al. [58] | Deep Image Prior; 3D U-Net | 18F-FDG simulated data, 18F-FDG monkey data | L2 loss | Input: Static PET; Training target: Noisy dynamic PET image; Output: Denoised dynamic PET | PSNR, SSIM, CNR |
| | | Hashimoto et al. [59] | 4D Deep Image Prior; shared 3D U-Net as feature extractor with a reconstruction branch for each output frame | 18F-FDG simulated data; 18F-FDG and 11C-raclopride monkey data | Weighted L2 loss | Input: Static PET; Training target: 4D dynamic PET; Output: Denoised dynamic PET | Bias vs. variance curves, PSNR, SSIM |
| | | Wu et al. [60] | Noise2Noise; 3D CNN encoder-decoder | 15O-water human data | L2 denoising loss + L2 bias-control loss + L2 content loss | Input: Low-count PET images from one injection; Output: Denoised low-count PET; Training target: Low-count PET images from another injection | CRC |
| | | Yie et al. [61] | Noisier2Noise; 3D U-Net | 18F-FDG human data | L2 loss | Input: Extremely low-count PET; Output: Denoised low-count PET; Training target: Low-count PET | PSNR, SSIM |
| Deblurring | Supervised | Song et al. [62] | Very deep CNN: 20-layer CNN with residual learning | 18F-FDG simulation and human data | L1 loss | Input: Low-resolution PET, T1-weighted MRI, spatial (radial + axial) coordinates; Output/Training target: High-resolution PET | PSNR, SSIM |
| | | Gharedaghi et al. [63] | Very deep CNN: 16-layer CNN with residual learning | Human data, radiotracer unknown | L2 loss | Input: Low-resolution PET; Output/Training target: High-resolution PET | PSNR, SSIM |
| | | Chen et al. [64] | CycleGAN; model trained on simulation data and applied to clinical data | 18F-FDG simulated images for training and human images for validation | Adversarial loss + cycle-consistency loss | Input: Low-resolution PET; Output/Training target: High-resolution PET | Visual examples only; no quantitative results |
| | Unsupervised, weakly-supervised, or self-supervised | Song et al. [65] | Dual GANs; Generator: 8-layer CNN; Discriminator: 12-layer CNN | 18F-FDG simulated images for pre-training and human images for validation | Two L2 adversarial losses + cycle-consistency loss + total variation penalty | Input: Low-resolution PET, T1-weighted MRI, spatial (radial + axial) coordinates; Output: High-resolution PET; Training target: Unpaired high-resolution PET | PSNR, RMSE, SSIM |
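
For concreteness, the sketch below illustrates the supervised setup shared by most denoising entries in Table 1: a network maps a low-count PET volume to a full-count training target under an L2 (MSE) loss. The small residual 3D CNN is loosely in the spirit of da Costa-Luis et al. [27, 28]; the layer widths, kernel size, optimizer settings, and toy tensor shapes are illustrative assumptions rather than any paper's published configuration.

```python
import torch
import torch.nn as nn

class Residual3DCNN(nn.Module):
    """A small 3-layer 3D CNN with a global residual connection."""
    def __init__(self, channels: int = 32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv3d(1, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(channels, 1, kernel_size=3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual learning: predict a correction to the noisy input
        # rather than the clean image directly.
        return x + self.body(x)

model = Residual3DCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.MSELoss()  # the "L2 loss" column of the table

def train_step(low_count: torch.Tensor, full_count: torch.Tensor) -> float:
    """One supervised step: low-count input, full-count training target."""
    optimizer.zero_grad()
    loss = criterion(model(low_count), full_count)
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy (batch, channel, depth, height, width) volumes stand in for
# reconstructed PET patches.
low = torch.rand(2, 1, 32, 32, 32)
full = torch.rand(2, 1, 32, 32, 32)
print(train_step(low, full))
```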
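
Many of the GAN-based entries augment this supervised term with an adversarial term (e.g., L1 loss + adversarial loss in Wang et al. [36] and Ouyang et al. [42]). A minimal sketch of one such composite generator objective follows; the discriminator depth, widths, and strides, the non-saturating binary cross-entropy adversarial form, and the weighting `lambda_adv` are assumptions here, and several entries instead use Wasserstein or least-squares variants.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PatchDiscriminator3D(nn.Module):
    """A 4-convolution-layer 3D CNN producing per-patch real/fake logits;
    the widths and strides are illustrative assumptions."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(1, 16, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv3d(16, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv3d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv3d(64, 1, 4, stride=1, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

def generator_loss(disc: nn.Module, fake: torch.Tensor,
                   target: torch.Tensor,
                   lambda_adv: float = 0.01) -> torch.Tensor:
    # The L1 term anchors the output voxel-wise to the full-count
    # target; the adversarial term pushes outputs toward the
    # full-count image distribution as judged by the discriminator.
    l1 = F.l1_loss(fake, target)
    logits = disc(fake)
    adv = F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
    return l1 + lambda_adv * adv

# Example: score a denoised volume against its full-count target.
disc = PatchDiscriminator3D()
fake = torch.rand(1, 1, 32, 32, 32)    # generator output (stand-in)
target = torch.rand(1, 1, 32, 32, 32)  # full-count volume (stand-in)
print(generator_loss(disc, fake, target))
```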
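
The unsupervised entries dispense with full-count targets altogether. The Deep Image Prior approach of Cui et al. [57] and Hashimoto et al. [58], sketched below, fits an untrained network to a single noisy PET volume, using the anatomical image (or static PET) as the fixed network input, with early stopping providing the denoising; the tiny stand-in network and the iteration budget are assumptions chosen for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in for the 3D U-Net used in the cited papers.
net = nn.Sequential(
    nn.Conv3d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Conv3d(16, 16, 3, padding=1), nn.ReLU(),
    nn.Conv3d(16, 1, 3, padding=1),
)
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)

anatomical_input = torch.rand(1, 1, 32, 32, 32)  # fixed input: CT/MR volume
noisy_pet = torch.rand(1, 1, 32, 32, 32)         # the only "label" available

for step in range(200):  # stop early: fitting too long re-learns the noise
    optimizer.zero_grad()
    loss = F.mse_loss(net(anatomical_input), noisy_pet)  # L2 to noisy target
    loss.backward()
    optimizer.step()

denoised = net(anatomical_input).detach()
```

Noise2Noise-style training (Wu et al. [60]; the Noisier2Noise variant of Yie et al. [61]) keeps the same L2 loss but changes the pairing: the network input is one noisy realization and the training target is an independent noisy realization of the same underlying activity, so no full-count image is required.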