Improving Diagnostic Accuracy in Low-Dose SPECT Myocardial Perfusion Imaging with Convolutional Denoising Networks

Albert Juan Ramon; Yongyi Yang; P Hendrik Pretorius; Karen L Johnson; Michael A King; Miles N Wernick

doi:10.1109/TMI.2020.2979940

Author manuscript; available in PMC: 2022 Sep 14.

Published in final edited form as: IEEE Trans Med Imaging. 2020 Mar 10;39(9):2893–2903. doi: 10.1109/TMI.2020.2979940


PMCID: PMC9472754 NIHMSID: NIHMS1625172 PMID: 32167887

Abstract

Lowering the administered dose in SPECT myocardial perfusion imaging (MPI) has become an important clinical problem. In this study we investigate the potential benefit of applying a deep learning (DL) approach for suppressing the elevated imaging noise in low-dose SPECT-MPI studies. We adopt a supervised learning approach to train a neural network by using image pairs obtained from full-dose (target) and low-dose (input) acquisitions of the same patients. In the experiments, we made use of acquisitions from 1,052 subjects and demonstrated the approach for two commonly used reconstruction methods in clinical SPECT-MPI: 1) filtered backprojection (FBP), and 2) ordered-subsets expectation-maximization (OSEM) with corrections for attenuation, scatter and resolution. We evaluated the DL output for the clinical task of perfusion-defect detection at a number of successively reduced dose levels (1/2, 1/4, 1/8, 1/16 of full dose). The results indicate that the proposed DL approach can achieve substantial noise reduction and lead to improvement in the diagnostic accuracy of low-dose data. In particular, at ½ dose, DL yielded an area-under-the-ROC-curve (AUC) of 0.799, which is nearly identical to the AUC=0.801 obtained by OSEM at full-dose (p-value=0.73); similar results were also obtained for FBP reconstruction. Moreover, even at 1/8 dose, DL achieved AUC=0.770 for OSEM, which is above the AUC=0.755 obtained at full-dose by FBP. These results indicate that, compared to conventional reconstruction filtering, DL denoising can allow for additional dose reduction without sacrificing the diagnostic accuracy in SPECT-MPI.

Keywords: SPECT-MPI, dose reduction, deep learning, convolutional neural networks

I. Introduction

MYOCARDIAL perfusion imaging (MPI) with SPECT was determined to be the second largest contributing source of increased radiation risk to the public (behind only CT) in medical imaging [1]. SPECT-MPI is a frequently ordered test in nuclear medicine that provides objective findings for detection of coronary artery disease (CAD) [2, 3], the leading cause of death in the US [4]. As in any nuclear imaging modality, lowering the radiation dose in SPECT-MPI has become critically important [5–8]. Recent guidelines of the American Society of Nuclear Cardiology (ASNC) mandate reducing the injected imaging dose to patients with advanced image reconstruction strategies [7].

In the literature many studies have shown that the administered activities (or imaging time) can be reduced by a factor of two or higher through use of iterative reconstruction algorithms in SPECT-MPI [9–17]. While lowering radiation exposure is critical, as pointed out in [7], reducing the amounts of injected radionuclide in patients also degrades image quality, which can impact the diagnostic accuracy in the clinic [18]. In our previous study [9], we quantified the diagnostic performance in detecting perfusion defects with different reduced dose levels, in which the spatial filters used for image reconstruction (i.e., pre- or post-reconstruction filters) were optimized for each dose level. In [12], we investigated a patient-specific (“personalized”) approach for tailoring the injected activities to the attributes of individual patients in order to achieve dose reduction.

As noted in [18], it is critical that in clinical SPECT studies the image quality must not be sacrificed while achieving dose reduction. In this study we investigate the potential benefit of applying a deep learning (DL) approach for suppressing the elevated imaging noise in clinical low-dose studies while maintaining or even improving the diagnostic accuracy in the image data. DL has emerged as a powerful tool in medical imaging [19, 20, 34, 35], which has produced promising results in medical image analysis [21], segmentation [22], image reconstruction [23], as well as image denoising [24–33]. In particular, several of the latter studies [27–33] demonstrate the potential of DL for denoising of low-dose CT data. However, to the best of our knowledge, there are no studies so far that evaluate DL denoising of low-dose SPECT-MPI data.

While DL networks have been demonstrated to be very effective for image denoising compared to conventional smoothing filters [25, 32, 33], it is important to investigate whether this can translate to a meaningful improvement in diagnostic accuracy in clinical studies. In several studies [27–29], image quality measures, including mean-squared error (MSE), peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM), were used to assess the impact of DL denoising on image quality at reduced-dose levels. In [30], diagnostic performance based on quantification of coronary calcium scoring in cardiac CT was evaluated. In this study, we focus on the clinical task of perfusion-defect detection in SPECT-MPI [9, 12]. To be realistic, we make use of a large set of clinical studies (as opposed to phantom simulations) for training and evaluating the DL networks employed in the study. This enables us to account for the large variability in patient characteristics observed clinically (as described later in Section II.C).

Based on the previous success of DL in low-dose CT studies [27–30], we adopt a supervised learning approach for denoising of low-dose cardiac SPECT images, wherein a neural network is trained with image pairs obtained from low-dose (input) and full-dose (target) acquisitions of the same patients. However, different from most of the previous applications in CT, we consider three-dimensional (3D) DL network structures [30], as opposed to two-dimensional ones, in which we make use of 3D convolutional layers to exploit the spatial correlations among the image voxels in the heart volume.

In the experiments we demonstrate the DL approach for the following two commonly used reconstruction methods in clinical SPECT-MPI: 1) FBP, and 2) OSEM with corrections for attenuation, scatter and resolution (AC-SC-RC). Both reconstruction methods were optimized previously for low-dose acquisitions [9]. For both methods, the performance of DL denoising is evaluated at a number of successively reduced dose levels (1/2, 1/4, 1/8, 1/16 of clinical full dose).

We note that some preliminary results of this study were presented previously in [36], where we demonstrated the feasibility of DL denoising for SPECT images at 1/8 of full clinical dose. Herein, we extend this previous work by: 1) studying multiple network structures and the effect of their variations (i.e., number of layers and number of filters per layer), 2) evaluating the performance of DL denoising at multiple reduced dose levels (i.e., 1/2, 1/4, 1/8, 1/16 of standard full dose), and 3) investigating training a “one-size-fits-all” network vs. a dose-level specific network, where for the former a single network is trained with data collectively from multiple reduced dose levels and for the latter a network is specifically trained with data from a given dose level. We note that, while our study is demonstrated on SPECT data, we expect the methodology and findings to be applicable to other imaging modalities such as PET.

The rest of the paper is organized as follows. In Section II, we describe the denoising problem in low-dose SPECT and the DL network structures considered in the study. In Section III, we describe the training procedure for the DL networks using clinical datasets, and the performance metrics for evaluation of the trained DL structures. We present the evaluation results in Section IV and conclusions in Section V.

II. Low-Dose SPECT Denoising via Deep Learning

A. Problem formulation

Let vector $x$ denote a reconstructed image volume from a low-dose SPECT-MPI acquisition, and vector $y$ the corresponding image reconstructed from a standard full-dose acquisition of a given subject. Our goal is to determine a mapping from $x$ to $y$ such that

$$y \approx f(x) \tag{1}$$

In (1), the reconstructed image $x$, with lower data counts, represents a noisy version of image $y$ (with full-dose data counts). The mapping $f (\cdot)$ in (1) can be viewed as a denoising operator. In this study, we investigate the use of a deep learning structure to learn $f (\cdot)$ from image pairs of low-dose and full-dose data obtained from a set of clinical acquisitions.

Specifically, assume a total of $T$ such training image pairs are available. Then the training dataset is formed by input-output pairs as:

$$\{(x^{(i)}, y^{(i)}),\ i = 1, \dots, T\} \tag{2}$$

where $x^{(i)}$ denotes the low-dose image from patient $i$ , and $y^{(i)}$ the corresponding full-dose image. Our goal is to determine the mapping $f (\cdot)$ based on the training set as:

$$\hat{f} = \underset{f}{\arg\min} \sum_{i = 1}^{T} \left\| y^{(i)} - f(x^{(i)}) \right\|^{2} \tag{3}$$

where $\hat{f}$ denotes the estimated mapping. Once optimized, $\hat{f}$ is applied subsequently to a low-dose (unseen) image $x$ to generate output $\tilde{y} = \hat{f} (x)$ , which is desired to be similar to what would be obtained with a full-dose acquisition.

Given a deep learning structure, the mapping $\hat{f}$ in (3) is obtained through optimizing the parameters and weights associated with the learning network, such that the image similarity between the network output and corresponding full-dose image is maximized.
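As a minimal sketch of the objective in (3), the snippet below evaluates the summed squared error of a candidate mapping over a set of training pairs. Images are represented here as flat sequences of voxel values, and the names `sum_squared_error` and `scale_by_two` are hypothetical, introduced only for illustration; in the study the mapping is a neural network rather than a fixed function.

```python
from typing import Callable, List, Sequence, Tuple

Image = Sequence[float]  # a reconstructed volume flattened to a 1D sequence


def sum_squared_error(pairs: List[Tuple[Image, Image]],
                      f: Callable[[Image], Image]) -> float:
    """Objective of Eq. (3): sum over training pairs of ||y - f(x)||^2."""
    total = 0.0
    for x, y in pairs:
        y_hat = f(x)
        total += sum((yi - yhi) ** 2 for yi, yhi in zip(y, y_hat))
    return total


def scale_by_two(x: Image) -> List[float]:
    """Toy candidate mapping (illustration only, not the trained network)."""
    return [2.0 * v for v in x]


# Two toy (low-dose, full-dose) pairs; the values are made up.
pairs = [([1.0, 2.0], [2.0, 4.0]), ([0.5, 1.5], [1.0, 2.0])]
loss = sum_squared_error(pairs, scale_by_two)
print(loss)  # 1.0
```

Training then amounts to searching over the network weights for the mapping that makes this quantity as small as possible.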

Below we consider two neural network structures, a convolutional autoencoder (CAE) and a convolutional neural network (CNN), for determining the mapping $f (\cdot)$ . Such structures are favored for their computational efficiency, which stems from the use of convolutional layers. Both have recently been applied to denoising of low-dose CT images [24, 27–29]. For example, in [29] a CAE structure was trained using pairs of standard-dose and low-dose CT patches; in [28] a residual encoder-decoder convolutional neural network was used. Herein, different from these previous applications, we consider three-dimensional (3D) network structures, as opposed to two-dimensional structures, in which we make use of 3D convolutional (Conv) layers to exploit the spatial correlations among the voxels in the heart volume [22]. We note that besides CAE and CNN structures, generative adversarial networks (GAN) have also been used for denoising in recent years [30, 31]. The latter differ from the former in that a discriminator network is used to distinguish denoised low-dose images from their full-dose counterparts.

B. Neural network structures for low-dose SPECT

1). 3D convolutional autoencoder (3D CAE)

The first structure we consider is formed by a combination of the classic autoencoder (AE) structure [37] with convolutional layers [22]. As illustrated in Fig. 1, we use an architecture formed by a cascade of convolutional layers (stacked encoders) followed by symmetric transposed convolutional (aka deconvolutional) layers (stacked decoders) [24, 27]. Here 3D convolutional layers are used for the stacked encoders and decoders. For our denoising problem, the input low-dose image $x$ is mapped to a latent representation via a sequence of encoding layers. Afterward, the latent representation is reconstructed back (via decoding layers) to form the (denoised) output image $\tilde{y}$ [37].

Fig. 1. — Three-dimensional convolutional autoencoder structure (3D-CAE). The network consists of symmetric 3D convolutional and transposed convolutional (aka deconvolutional) layers (four in total). Batch normalization is applied after each convolutional layer. Skip connections are used to connect the output of early encoding layers to the input of their symmetric decoding layers. This avoids potential loss of important image details from early layers to deeper layers in a deep network.

The operations of the encoder layers can be written as:

$$x_{l} = E^{(l)}(x_{l - 1}) = \mathrm{ReLU}(W_{l} * x_{l - 1} + b_{l}), \quad l = 1, \dots, L \tag{4}$$

where $x_{l - 1}$ denotes the output from the previous layer $l - 1$ , $W_{l}$ and $b_{l}$ denote the weights of the convolutional filters and biases of layer $l$ , respectively, and $L$ is the number of encoders. In (4), a rectified linear unit (ReLU), defined as $\mathrm{ReLU}(x) = \max (0, x)$ , is used as the activation function in each layer [24].

Similarly, the operations of the decoder layers can be written as:

$$v_{l} = D^{(l)}(v_{l - 1}) = \mathrm{ReLU}(W_{l}^{T} \otimes v_{l - 1} + b_{l}), \quad l = 1, \dots, L \tag{5}$$

where $v_{l - 1}$ denotes the output from the previous layer $l - 1$ , $W_{l}^{T}$ and $b_{l}$ denote the weights of the transposed convolutional filters and biases of layer $l$ , and $\otimes$ the transposed convolutional operator.

To ensure that the output image has the same dimension as the input, the transposed convolutional layers have the same filter size and convolutional stride as in the corresponding convolutional layers (i.e., the last transposed convolutional layer corresponds to the first convolutional layer, and so on). As in [28], no image padding is applied and a stride of one is used for the convolutions in each layer. Thus, the amount of dimensionality reduction across layers depends on the convolutional filter sizes applied in each layer.
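The size bookkeeping described above can be sketched as follows. With no padding and stride one, a convolution with a filter of width k shortens an axis from n to n - k + 1, and the mirrored transposed convolution lengthens it from n to n + k - 1. The function names are hypothetical, and the 21-voxel axis, 3-voxel filter and two encoder/decoder pairs are illustrative values only.

```python
from typing import List


def conv_out(n: int, k: int) -> int:
    """Axis length after an unpadded, stride-one convolution."""
    return n - k + 1


def transposed_conv_out(n: int, k: int) -> int:
    """Axis length after an unpadded, stride-one transposed convolution."""
    return n + k - 1


def cae_axis_sizes(n_in: int, kernel: int, n_encoders: int) -> List[int]:
    """Axis length after each encoder, then after each mirrored decoder."""
    sizes = [n_in]
    for _ in range(n_encoders):
        sizes.append(conv_out(sizes[-1], kernel))
    for _ in range(n_encoders):
        sizes.append(transposed_conv_out(sizes[-1], kernel))
    return sizes


# Illustrative values: 21-voxel axis, 3-voxel filter, 2 encoders + 2 decoders.
sizes = cae_axis_sizes(21, 3, 2)
print(sizes)  # [21, 19, 17, 19, 21]
```

Because each decoder undoes the shrinkage of its mirrored encoder, the final size always equals the input size, whatever the filter width.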

In Fig. 1, batch normalization layers [38] are also included after each convolutional and de-convolutional layer (but not shown in the figure). These layers are used to normalize the output at a given layer to achieve zero mean and unit standard deviation in each training batch.
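The normalization step itself can be sketched in a few lines. This is a simplified illustration under stated assumptions: it treats one feature as a flat list of activations across a batch, uses a small constant `eps` (an assumed value) for numerical stability, and omits the learnable scale and shift parameters that batch normalization layers [38] normally include.

```python
import math
from typing import List, Sequence


def batch_normalize(values: Sequence[float], eps: float = 1e-5) -> List[float]:
    """Shift and scale a batch of activations to zero mean and unit std."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n  # biased batch variance
    return [(v - mean) / math.sqrt(var + eps) for v in values]


# Toy activations of one feature across a batch of four samples.
batch = [0.2, 0.4, 0.6, 0.8]
normed = batch_normalize(batch)
```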

2). 3D convolutional neural network (3D CNN)

The second structure we consider is formed entirely of 3D convolutional layers, as shown in Fig. 2, and is referred to as 3D convolutional neural network (3D CNN) hereafter. Here the output of each layer is maintained to have the same size (zero padding and a stride of one are applied). Compared to the CAE structure above, there are no transposed convolutional layers in 3D CNN.

The operations of each convolutional layer are also as described in (4), except that now the output $x_{l}$ has the same dimensions for each layer $l = 1, \dots, L$ . The output of the final layer $L$ yields the denoised image $\tilde{y}$ [37]. As in CAE, batch normalization layers are also included after each convolutional layer in Fig. 2 (but not shown).

3). Use of skip connections

As shown in Figs. 1 and 2, we also consider inclusion of skip connections (indicated by dashed lines) in the 3D CAE and 3D CNN structures. These connections directly connect the output at early layers to the input of deeper layers in the network. Note that with the top skip connection present, the CAE/CNN network becomes a residual network. Skip connections can be useful both to prevent potential loss of image details in the deeper layers of the network and to address the issue of vanishing gradients [39] in training deeper networks [40, 41]. In this work, we made use of additive skip connections, which directly add the values from an input layer to the values at an output layer having the same dimension.

Also, for the case of CAE, increasing the number of encoding layers corresponds to more dimensionality reduction in the resulting latent representation [24], which can potentially lead to loss of important image details in the stacked encoders. The skip connections can help resolve this issue by providing a parallel path to feed the image features from shallower encoding layers to the deeper decoding layers [24].
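An additive skip connection reduces to an element-wise sum of two tensors of equal size. The sketch below shows this on flat lists; `add_skip` is a hypothetical helper, and the `residual` values are made-up stand-ins for what the intermediate layers would produce.

```python
from typing import List, Sequence


def add_skip(layer_input: Sequence[float],
             layer_output: Sequence[float]) -> List[float]:
    """Additive skip connection: element-wise sum of same-sized tensors."""
    if len(layer_input) != len(layer_output):
        raise ValueError("additive skip requires matching dimensions")
    return [a + b for a, b in zip(layer_input, layer_output)]


x = [0.5, 1.0, 0.25]          # values entering the skipped layers
residual = [-0.25, 0.0, 0.5]  # stand-in for the output of those layers
y = add_skip(x, residual)
print(y)  # [0.25, 1.0, 0.75]
```

With the outermost skip connection in place, the layers in between only need to model the correction to the input, which is the sense in which the network becomes a residual network.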

4). Network structure parameters for optimization

For each of the network structures described above, there are a number of global parameters that need to be determined during the training phase. They include the number of layers, the number of filters within each layer, and the size of the filters. In the experiments, we applied a training-validation procedure to select these parameters (Section III.A). Specifically, the number of layers was varied as 2, 4, 6, 8, and 10; the number of filters was varied as 10, 20, 32, and 64; and the filter size was varied as 3×3×3 and 5×5×5. For both 3D CAE and CNN, the number of filters was one in the final (output) layer of the network. Also, for 3D CAE, the number of layers is the total of both convolutional layers and transposed convolutional layers.

C. Training and evaluation with clinical SPECT-MPI data

To be realistic for clinical applications, we make use of a large set of clinical acquisitions (as opposed to phantom simulations) for training and evaluating the learning network. This enables us to accommodate the variability in the acquired data statistics among patients observed clinically. Specifically, we obtained a total of 1,052 clinical acquisitions under Institutional Review Board (IRB) approval with patients’ written consent. The studies were stress imaging acquired on a Philips BrightView SPECT/CT system in list-mode with Tc-99m sestamibi from 2013 to 2018 at the University of Massachusetts Medical School. All patients underwent a one-day rest/stress SPECT MPI protocol with Tc-99m sestamibi, with administered activity ranging from 370 to 444 MBq (10 to 12 mCi) for rest, adjusted according to the patient BMI (based on recommendations of the ASNC [42]); the activity level was 3 times higher for stress. Hereafter these acquisition data are referred to as full-dose data. In practice, when a different protocol is used, the optimized DL networks can be applied according to the actual injected activity level (instead of the nominal dose-reduction factor, such as 1/2 or 1/4 of full dose).

The SPECT/CT system was configured with two camera heads at 90° apart, and 64 projection angles over 180° were used. The acquired list-mode data were framed in 128×128×64 projection matrices with pixel size equal to 0.466 cm for image reconstruction. There was no subsampling applied for subsequent processing by the CAE/CNN networks. In addition, attenuation maps from cone-beam CT imaging prior to emission imaging were obtained for use in attenuation correction (AC) during image reconstruction [43]. The cone-beam acquisitions were acquired in 0.83-degree steps over 360°. Scatter correction (SC) was implemented using the triple energy window (TEW) method [44]. The primary energy window was centered at 140.5 keV with a 15% width and the scatter window was centered at 121 keV with a 4% width.

The characteristics of the 1,052 patients were as follows: 506/546 male/female, BMI: 32.4 ± 6.7 kg/m², age: 62.2 ± 11.1 years. Among them, 490 were read as having normal scans, 372 as having either perfusion or motion abnormalities, and the others were read as somewhat normal.

For network training and evaluation, the 1,052 patients were divided into three subsets: 1) one with 740 patients used for training, 2) one with 122 patients used as validation set, and 3) one with 190 patients for performance evaluation (denoted as test set). The latter set of 190 patients was previously used in [9, 12] for performance evaluation, and hence was also used in this study for ease of comparison. The rest of the cases were randomly divided into the training set and validation set, with the training set containing a mixture of 318 abnormal cases and 422 normal or somewhat normal cases. Note that the test cases need to have ground truth known as to the presence of perfusion defects for the purpose of diagnostic performance evaluation. In contrast, the images in the training/validation sets need not have such truth known, as the training target is the full-dose reference image for each case. The validation set was used to select the hyper-parameters of the CAE/CNN networks.
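The partition described above (a fixed test set, with the remaining cases split at random) can be sketched as below. The case identifiers, the random seed and the helper name `split_cases` are placeholders introduced for illustration; the actual assignment of patients is not specified beyond the subset sizes.

```python
import random
from typing import List, Sequence, Tuple


def split_cases(case_ids: Sequence[int], test_ids: Sequence[int],
                n_train: int, seed: int = 0) -> Tuple[List[int], List[int], List[int]]:
    """Hold out a fixed test set, then randomly split the remaining cases."""
    test = set(test_ids)
    remaining = [c for c in case_ids if c not in test]
    rng = random.Random(seed)  # assumed seed, for reproducibility of the sketch
    rng.shuffle(remaining)
    return remaining[:n_train], remaining[n_train:], sorted(test)


case_ids = list(range(1052))   # placeholder identifiers for the 1,052 cases
test_ids = list(range(190))    # placeholder for the 190 previously used cases
train, val, test = split_cases(case_ids, test_ids, n_train=740)
print(len(train), len(val), len(test))  # 740 122 190
```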

From the full-dose acquisitions, we obtained corresponding low-dose data by applying a statistical subsampling procedure [9], during which the photon events registered in the list-mode data at full dose were accepted or rejected according to a given probability (e.g., 1/2 dose data were obtained by subsampling with probability 0.5). With this procedure, we generated sinogram data for each of the following reduced dose levels: 1/2, 1/4, 1/8, and 1/16 of full dose.
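A minimal sketch of this subsampling step is given below: each recorded event is kept independently with a fixed probability. Here events are plain integers standing in for list-mode records (which in practice carry detector position, energy and timing), and `thin_events` and the seed are hypothetical choices made for the illustration.

```python
import random
from typing import List, Optional, Sequence, TypeVar

Event = TypeVar("Event")


def thin_events(events: Sequence[Event], keep_prob: float,
                seed: Optional[int] = None) -> List[Event]:
    """Keep each list-mode event independently with probability keep_prob."""
    if not 0.0 <= keep_prob <= 1.0:
        raise ValueError("keep_prob must lie in [0, 1]")
    rng = random.Random(seed)
    return [e for e in events if rng.random() < keep_prob]


full_dose = list(range(100_000))  # stand-in for full-dose list-mode events
half_dose = thin_events(full_dose, 1 / 2, seed=1)
sixteenth_dose = thin_events(full_dose, 1 / 16, seed=1)
```

Because each event is accepted or rejected independently, the retained counts fluctuate around the nominal fraction of the full-dose counts rather than matching it exactly.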

D. Dose specific network vs. one-size-fits-all network

We investigate two approaches for training the denoising neural networks. In the first approach, we train the 3D CAE/CNN to learn the mapping from a specific reduced-dose level (i.e., 1/2, 1/4, 1/8 or 1/16 dose) to full dose. That is, the denoising network is obtained specifically for a given dose level. In the second approach, we train a “one-size-fits-all” network by using data collectively from various reduced dose levels at the same time. That is, the network was trained with data from 1/2, 1/4, 1/8 and 1/16 dose all pooled together, resulting in four times as much training data. The resulting network was then evaluated at one dose level at a time (e.g., 1/16) on the test cases. The purpose is to determine whether having images from different noise levels during training would help the network to improve the denoising performance [27].
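The difference between the two training sets can be sketched as follows. Tuples of (patient, dose) stand in for image volumes, and the three patient labels and the helper `pool_dose_levels` are hypothetical, chosen only to make the bookkeeping visible.

```python
from typing import Dict, List, Tuple

Pair = Tuple[Tuple[str, float], Tuple[str, float]]  # (low-dose, full-dose)


def pool_dose_levels(pairs_by_dose: Dict[float, List[Pair]]) -> List[Pair]:
    """Concatenate the training pairs of all reduced-dose levels."""
    pooled: List[Pair] = []
    for dose in sorted(pairs_by_dose, reverse=True):
        pooled.extend(pairs_by_dose[dose])
    return pooled


doses = [1 / 2, 1 / 4, 1 / 8, 1 / 16]
patients = ["p1", "p2", "p3"]  # placeholder training patients
pairs_by_dose = {d: [((p, d), (p, 1.0)) for p in patients] for d in doses}

dose_specific_set = pairs_by_dose[1 / 8]             # one network per dose level
one_size_set = pool_dose_levels(pairs_by_dose)       # one network for all levels
print(len(dose_specific_set), len(one_size_set))     # 3 12
```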

III. Experimental Framework

A. Image reconstruction methods

In this study, we investigate the denoising approach for two reconstruction methods commonly used in clinical SPECT MPI: 1) FBP, and 2) ordered-subsets expectation-maximization (OSEM) with attenuation, scatter and resolution corrections. For FBP, the cutoff frequency for the pre-reconstruction Butterworth filter (order five) was set to be 0.22 cycles/pixel; for OSEM, the number of subsets was 16, the number of iterations was 12, and the width parameter of the post-smoothing Gaussian filter was 1.2 voxels. These settings were based on previously determined maximum perfusion-defect detection performance for full dose studies [9].

For the low-dose data, as input to the learning network, the images were reconstructed with the same settings as in the full-dose data for each reconstruction method. This is to ensure that the input low-dose images can retain the same frequency content of the image signal as their full-dose counterparts. The 3D CAE/CNN network can be viewed as a non-linear post-processing filter which was trained to extract the signal components while suppressing the elevated imaging noise in the low-dose data.

B. Network training and implementation

1). Extraction of training image samples for heart volumes

In SPECT-MPI, the main object of interest when evaluating the images is the myocardium, which is much smaller compared to the entire reconstructed image volume of a patient. For example, for the majority of patients in the dataset, the entire myocardium was found to be confined in a 21×21×21 voxel volume, whereas the reconstructed images were of 128×128×128 voxels. Therefore, the vast majority of the image voxels in a patient correspond to background regions (or other organs) outside the heart volume. To improve the learning efficiency, we extracted a volume-of-interest (VOI) of size 35×35×21 voxels centered on the heart of each patient for training the network. Note that the third dimension of the volume corresponds to the inferior-superior axis direction. Such a VOI provides not only image voxels from the myocardium but also many more image voxels outside the heart for training.

Next, to form training samples, we further extracted image patches of size 21×21×21 from the VOI of each patient. By dividing the VOI of each patient into multiple smaller patches, we can increase the number of different patients (hence patient variability) within each training batch, which can lead to faster convergence in training. In the experiments, the image patches were extracted with a sliding interval of 7 voxels in each dimension, resulting in 9 patches per VOI. The extracted image patches from the training set of patients were then collectively used to form the training set in (2). To accommodate the difference in count levels in the input images, the image VOI was normalized to have the same maximum intensity (value 1.0) in the heart wall for each study.
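The sliding-window bookkeeping can be sketched as below. Along an axis of length V, a patch of length P moved in steps of s fits at floor((V - P) / s) + 1 positions, and the number of 3D patches is the product over the three axes. The helper names and the toy volume size are assumptions chosen for illustration, not the dimensions used in the study.

```python
from itertools import product
from typing import List, Sequence, Tuple


def patch_starts(extent: int, patch: int, stride: int) -> List[int]:
    """Start indices of sliding patches that fit fully inside one axis."""
    if patch > extent:
        return []
    return list(range(0, extent - patch + 1, stride))


def patch_origins(volume_shape: Sequence[int], patch_shape: Sequence[int],
                  stride: int) -> List[Tuple[int, ...]]:
    """Corner indices of all 3D patches obtained with a common stride."""
    axes = [patch_starts(v, p, stride)
            for v, p in zip(volume_shape, patch_shape)]
    return list(product(*axes))


# Toy example: a 10x10x4 volume, 4x4x4 patches, stride 3.
origins = patch_origins((10, 10, 4), (4, 4, 4), 3)
print(len(origins))  # 9 (three positions along each of the first two axes)
```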

For the purpose of comparisons, we also experimented with using image patches extracted from the entire image volume (as opposed to heart VOI). This resulted in more image patches for training, but the overwhelming majority of them contain only noisy image background.

2). Network implementation

We implemented the DL networks using the Keras library with TensorFlow in Python 3.5. The Adam optimizer (a stochastic gradient-based method) was used for the optimization in (3) [45], for which the default parameters from the Keras library were used. The implementation was on an NVIDIA GeForce GTX 1080 Ti 11GB graphics processing unit. The training time was approximately 12 hours, and it took less than a minute to post-filter the 3D low-dose image for each patient. As in equation (3), the loss function used during training was mean-squared error. In our experiments we also considered using $L_{1}$ loss and found it to yield very similar results.

C. Performance evaluation

We assessed the output of the denoising network using two metrics. First, we quantified the similarity between the processed low-dose images and their full-dose counterparts. Next, we conducted a receiver operating characteristic (ROC) study to evaluate whether the processed low-dose images can yield performance in detecting perfusion defects comparable to that of full-dose data. Below we describe the two performance metrics in detail.

1). Image similarity to full-dose data

For a given patient, we compared the network output $\tilde{y}$ with its corresponding full-dose image $y$ using the Pearson correlation, which is defined as

$$\rho = \frac{\sum_{i = 1}^{n} (y_{i} - \mu_{y}) ({\tilde{y}}_{i} - \mu_{\tilde{y}})}{\sqrt{\sum_{i = 1}^{n} {(y_{i} - \mu_{y})}^{2}} \sqrt{\sum_{i = 1}^{n} {({\tilde{y}}_{i} - \mu_{\tilde{y}})}^{2}}} \tag{6}$$

where ${\tilde{y}}_{i}$ , $y_{i}$ denote the image value at voxel $i$ of $\tilde{y}$ and $y$ , respectively, $n$ is the number of voxels, and $μ_{\tilde{y}}$ , $μ_{y}$ denote the mean value of $\tilde{y}$ and $y$ (over all voxels in the volume), respectively. Since our main object of interest is the heart, $ρ$ was computed only for a segmented myocardium volume of 21×21×21 in (6). The correlation coefficient $ρ$ was computed for each patient, and then averaged among all the patients in the test (or validation) set. The segmented myocardium volume was obtained with a procedure as follows. First, we located the image voxel having the maximum value in the top-left quadrant of the reconstructed volume (i.e., the region containing the heart). Then, with this voxel as seed, region-growing was applied to obtain an approximate segmentation of the heart. Afterward, a 3D bounding box (21×21×21 in size) was placed around the center of the segmented heart to extract the heart image region. The segmentation result was visually inspected and in those cases where it failed (due to presence of extra-cardiac activities) the center of the bounding box was manually adjusted.
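Equation (6) translates directly into code. The sketch below computes the correlation for one patient from two flattened volumes and then averages over patients; the function name `pearson` and the toy voxel values are illustrative assumptions, and a constant image (zero variance) is not handled.

```python
import math
from typing import Sequence


def pearson(y: Sequence[float], y_tilde: Sequence[float]) -> float:
    """Pearson correlation of Eq. (6) between two flattened volumes."""
    n = len(y)
    if n == 0 or n != len(y_tilde):
        raise ValueError("volumes must be non-empty and of equal size")
    mu_y = sum(y) / n
    mu_t = sum(y_tilde) / n
    num = sum((a - mu_y) * (b - mu_t) for a, b in zip(y, y_tilde))
    den = (math.sqrt(sum((a - mu_y) ** 2 for a in y))
           * math.sqrt(sum((b - mu_t) ** 2 for b in y_tilde)))
    return num / den


# Toy (full-dose, denoised) voxel values for two hypothetical patients.
cases = [([0.2, 0.5, 0.9, 1.0], [0.25, 0.45, 0.85, 1.0]),
         ([0.1, 0.4, 0.8, 1.0], [0.15, 0.35, 0.80, 0.95])]
rho_per_patient = [pearson(y, y_t) for y, y_t in cases]
rho_mean = sum(rho_per_patient) / len(rho_per_patient)
```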

2). Perfusion defect detection

To evaluate the task-based performance of the denoising network, we performed ROC studies to quantify the perfusion-defect detection at different dose levels. For this purpose, we made use of the set of 190 test patients in the same way as in [9], and computed the total perfusion deficit (TPD) score using the Quantitative Perfusion SPECT (QPS) software package [46, 47] to assess the detectability of perfusion defects in the reconstructed images as done clinically. The TPD score measures both the severity (magnitude of decrease in uptake) and extent (area over which change occurs), and has been found to correlate well with clinical interpretations.

Specifically, among the 190 patients, 60 (30 male and 30 female) were set aside as the reference database for the QPS observer, and the remaining 130 patients were used for perfusion defect detection. Of these, 58 patients were used as normal and 72 patients were used to create hybrid studies [46] in which perfusion defects were introduced with varying vascular locations, sizes and contrast levels as described in [9].

The 60 reference cases were used by QPS to derive the normal limits for computing the total perfusion deficit (TPD) scores. They were set aside in order to achieve complete independence from the study patients in the ROC study. When computing the TPD scores, these reference cases were processed the same way (i.e., using the same network for a given algorithm and dose level) as the study patients [9].
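To illustrate how a scalar score such as TPD leads to an AUC value, the sketch below uses the nonparametric (Mann-Whitney) estimate: the fraction of defect/normal pairs in which the defect case receives the higher score, with ties counted as one half. This is a swapped-in simplification for illustration and not the ROC fitting software [48] used in the study; the scores are made-up numbers.

```python
from typing import Sequence


def empirical_auc(normal_scores: Sequence[float],
                  abnormal_scores: Sequence[float]) -> float:
    """Mann-Whitney AUC: P(abnormal > normal) + 0.5 * P(tie)."""
    if not normal_scores or not abnormal_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for a in abnormal_scores:
        for n in normal_scores:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(abnormal_scores) * len(normal_scores))


normal_tpd = [1.0, 2.5, 3.0, 4.0]  # made-up scores for normal cases
defect_tpd = [2.0, 5.0, 6.5, 8.0]  # made-up scores for cases with defects
auc = empirical_auc(normal_tpd, defect_tpd)
print(auc)  # 0.8125
```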

IV. Results

A. Network structure optimization

1). OSEM reconstruction

In Fig. 3 we show the image similarity results obtained with different 3D CAE structures when trained on the 1/8 dose data with OSEM reconstruction. In these results the correlation coefficient $ρ$ was averaged over all the patients in the validation set with 1/8 dose. The results in Fig. 3(a) were obtained without using skip connections in the CAE, and the results in Fig. 3(b) were with skip connections. In each figure, the results are given as the number of layers was varied from 2 to 10 (x-axis) and the number of filters in each layer was varied from 10 to 64 (indicated by different curves). The filter size used was 3×3×3.

Note that the optimal $ρ = 0.988 \pm 0.001$ was obtained in the CAE with skip connections wherein 4 layers and 10 filters are used; in comparison, without CAE processing, OSEM reconstruction (i.e., network input) yielded $ρ = 0.971 \pm 0.001$ (p-value < 0.05, paired t-test).

Similarly, in Fig. 4 we show the image similarity results obtained with different 3D CNN structures on the 1/8 dose data with OSEM reconstruction. The optimal $ρ = 0.987 \pm 0.001$ was obtained in the CNN with skip connections wherein 6 layers and 20 filters are used. Moreover, this result was not found to be significantly different from the CAE result above (p-value > 0.43, paired t-test).

In both CAE and CNN above, we also varied the filter size to 5×5×5 in each layer. The obtained image similarity values were found to be lower than the corresponding results obtained with filter size 3×3×3. Therefore, the results are given only for filter size 3×3×3 from this point on.

2). FBP reconstruction

Similar results were also obtained for FBP reconstruction on the 1/8 dose data using different 3D CAE and CNN structures. For CAE, the best $ρ = 0.987 \pm 0.001$ was obtained with skip connections (4 layers, 10 filters); in comparison, without CAE processing, FBP reconstruction yielded $ρ = 0.969 \pm 0.001$ (p-value < 0.05). For CNN, the best $ρ = 0.987 \pm 0.001$ was also obtained with skip connections (8 layers, 64 filters).

From the results above it is observed that for both OSEM and FBP the best image similarity results were obtained by CAE with skip connections (4 layers, 10 filters). Similar results were also obtained by CNN with skip connections. However, the optimal CAE structure has fewer layers and filters, which can be computationally advantageous. Therefore, in the rest of the study we focus on the results obtained using the optimal CAE structure (with skip connections, 4 layers, 10 filters).

For subsequent performance evaluation, we re-trained the CAE network by using all the available cases in the training and validation sets at different reduced dose levels (i.e., 1/2, 1/4, 1/8 or 1/16 dose). Afterward, the network was tested on the cases in the test set. We provide the test results below for OSEM and FBP at these dose levels.

B. Test performance on image similarity to full-dose data

1). OSEM reconstruction

In Fig. 5 we show the image similarity results obtained by CAE when trained on a specific dose level (i.e., 1/2, 1/4, 1/8 or 1/16 dose), labelled as “Dose-level specific”. In addition, we also show the obtained image similarity results when the CAE was trained using data collectively from all reduced dose levels, i.e., a one-size-fits-all network. For comparison, the image similarity results are also given for the input OSEM images reconstructed using settings based on previously determined maximum perfusion-defect detection performance for full dose studies [9] (i.e., network input). These results were averaged over all the patients in the test set.

From Fig. 5 it is observed that the image similarity obtained with dose-level specific CAE is higher than that obtained with one-size-fits-all CAE. For example, at 1/4 dose, dose-level specific CAE yielded $ρ = 0.992 \pm 0.001$ , compared to $ρ = 0.989 \pm 0.002$ for one-size-fits-all CAE (p-value < 0.05, paired t-test); without CAE processing, $ρ = 0.982 \pm 0.001$ for the OSEM input.

2). FBP reconstruction

Similar test results were also obtained for FBP reconstruction at different reduced dose levels. Specifically, with CAE, the image similarity $ρ$ values were 0.992±0.001, 0.990±0.001, 0.985±0.002 and 0.971±0.002 at 1/2, 1/4, 1/8 and 1/16 dose, respectively; without CAE, the corresponding $ρ$ values were 0.990±0.001, 0.978±0.001, 0.967±0.002 and 0.921±0.003, respectively (p-value < 0.05 at each dose level in comparison between with vs. without CAE).
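This section does not restate how the image similarity $ρ$ is computed. Purely as an illustration, and assuming it can be read as a Pearson correlation coefficient between a processed volume and its full-dose reference over the same voxels, a minimal sketch is given below. The function name `pearson_rho` and the voxel values are hypothetical; the study's actual definition and region of evaluation may differ.

```python
import math


def pearson_rho(image_a, image_b):
    """Pearson correlation between two equal-length, flattened voxel lists."""
    n = len(image_a)
    if n != len(image_b) or n < 2:
        raise ValueError("need two equal-length voxel lists with at least 2 voxels")
    mean_a = sum(image_a) / n
    mean_b = sum(image_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(image_a, image_b))
    var_a = sum((a - mean_a) ** 2 for a in image_a)
    var_b = sum((b - mean_b) ** 2 for b in image_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant image")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical flattened voxel intensities (NOT data from the study).
full_dose = [10.0, 12.0, 30.0, 28.0, 11.0, 9.0]
low_dose = [11.5, 10.0, 27.0, 31.0, 13.0, 8.0]
print(f"rho = {pearson_rho(low_dose, full_dose):.3f}")
```

Values computed this way for each patient could then be averaged over the test set, as is done for the results reported above.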

C. Test performance on perfusion-defect detection

1). OSEM reconstruction

Figure 6 shows the results on perfusion-defect detection (measured by AUC in the ROC study using TPD as a numerical observer) obtained from CAE processing on OSEM reconstruction at different reduced dose levels. The detection performance obtained by OSEM with optimized Gaussian post-filtering [9] is also given for comparison. The error bars in the figure represent the standard deviations obtained from the ROC fitting software [48], which was also used to perform the statistical analysis comparing the AUC values below.

Fig. 6. — Perfusion-defect detection performance (measured by AUC) obtained with CAE processing on OSEM reconstruction at reduced dose levels. For reference, the performance obtained with optimized OSEM reconstruction [9] is also given.

It is observed that, at each dose level, the AUC value obtained by CAE is higher than that obtained by traditional post-filtering. Specifically, at 1/2 dose, CAE yielded AUC=0.799, which is very similar to the AUC (0.801) obtained by OSEM at full-dose (p-value = 0.73). Similarly, at 1/8 dose, CAE yielded AUC=0.770, which is similar to the AUC (0.764) obtained by OSEM at 1/4 dose (p-value = 0.57). These results indicate that, when CAE is applied, the dose could be reduced by a further factor of two while still achieving performance similar to that of OSEM with traditional post-filtering.
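The AUC values above were obtained with ROC fitting software [48]. As a simpler stand-in that conveys what the AUC measures, the sketch below computes the empirical (Mann-Whitney) AUC from per-case observer scores. This is a swapped-in nonparametric estimate, not the fitting procedure used in the study, and it assumes that a larger score (e.g., a larger TPD value) indicates a more abnormal study. The function name and the scores are hypothetical.

```python
def empirical_auc(scores_abnormal, scores_normal):
    """Mann-Whitney AUC: P(abnormal score > normal score), with ties counted as 1/2."""
    if not scores_abnormal or not scores_normal:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for a in scores_abnormal:
        for n in scores_normal:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(scores_abnormal) * len(scores_normal))


# Hypothetical observer scores (NOT data from the study).
tpd_defect = [6.1, 3.4, 9.8, 2.2, 7.5]
tpd_normal = [1.0, 2.9, 0.4, 3.6, 1.7]
print(f"empirical AUC = {empirical_auc(tpd_defect, tpd_normal):.2f}")
```

An AUC of 0.5 corresponds to chance-level separation of the two groups and 1.0 to perfect separation, so the reported differences of a few hundredths reflect modest shifts in how well the two groups are separated.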

2). FBP reconstruction

Figure 7 shows the results on perfusion-defect detection obtained from CAE processing on FBP reconstruction at different reduced dose levels. For comparison, the detection performance obtained by FBP with optimized filtering [9] is also given. As in the OSEM results above, at each dose level, the AUC value obtained by CAE is higher than that obtained by FBP with traditional pre-filtering. In particular, at 1/2 dose, CAE yielded AUC=0.752, which is comparable to the AUC (0.755) obtained by FBP at full-dose (p-value = 0.51).

Fig. 7. — Perfusion-defect detection performance (measured by AUC) obtained with CAE processing on FBP reconstruction at reduced levels. For reference, the performance obtained with optimized FBP reconstruction [9] is also given.

Note that, with CAE applied to OSEM reconstruction, the AUC (0.770) achieved at 1/8 dose is still above the AUC (0.755) obtained by full-dose FBP reconstruction (p-value = 0.42).

D. Example images

In Fig. 8 we show the reconstructed images of a subject (male, BMI=22, age=53) obtained at different dose levels. The subject was interpreted to have normal perfusion. Shown in each row are short-axis, horizontal long-axis and vertical long-axis slices, plus the polar map. As a reference, the images in Fig. 8(a) are from full-dose OSEM reconstruction. The images in Fig. 8(b) were obtained from 1/2, 1/4, 1/8 and 1/16 dose with OSEM reconstruction; the Gaussian post-filtering was optimized for each dose level in these results [9]. The images in Fig. 8(c) were obtained with CAE processing at the different dose levels.

It can be observed that the CAE images at the different reduced dose levels appear more similar to the full-dose images than the OSEM results do. The LV wall is more uniform and appears less blurry in the CAE images, whereas the OSEM images suffer from notably elevated noise (especially at 1/8 and 1/16 dose). Also, at reduced dose, the polar maps exhibit more uniform perfusion with CAE than with OSEM, and are more similar to the full-dose results.

Similarly, in Fig. 9 we show the reconstructed images of another subject (male, BMI=37.4, age=65) at different dose levels. This subject had a perfusion defect in the left anterior descending artery (LAD) territory. As in Fig. 8 above, the CAE images are more similar to the full dose images than OSEM with traditional post-filtering. Importantly, the extent and contrast of the perfusion defect are well preserved in the CAE images even at 1/8 and 1/16 doses, whereas noticeable distortions are observed in the OSEM results.

From Figs. 8&9 the 1/8 and 1/16 dose CAE images are also observed to show artifacts in the septal wall due to the extremely high noise present in the image data. These results indicate that, as with conventional Gaussian filtering, the CAE is also limited in its denoising capability when the noise level becomes too high.

V. Discussions

In this study, we applied several 3D deep-learning structures for suppressing the elevated imaging noise in low-dose SPECT-MPI studies. The results in Fig. 5 show that the proposed approach can achieve substantial noise reduction and yield images that are similar to what would be obtained with full dose using conventional reconstruction [9]; this was also reflected in the reconstruction examples in Figs. 8&9. The results in Figs. 6&7 show that this also leads to improved accuracy in perfusion-defect detection over optimized conventional reconstruction at reduced dose levels [9]. These results indicate that using CAE can allow for further dose reduction when compared to conventional filtering from our previous study [9]. In particular, using CAE with OSEM reconstruction, the dose could be reduced down to 1/8 without the AUC falling below that of FBP at full dose, as judged by the TPD numerical observer. The results in Figs. 3&4 reveal that when skip connections are used in the CAE or CNN networks, the image similarity results become less sensitive to the number of layers and filters used than without skip connections. We believe that a possible reason for this is that adding skip connections alleviates the over-fitting that can occur when the number of layers becomes unnecessarily large. This is especially the case for CAE (Fig. 3), in which increasing the number of layers corresponds to more dimensionality reduction of the input image.

The network optimization results (Section IV.A) suggest that the optimal structures needed for denoising of SPECT-MPI images do not need to be as complex as those seen in other applications such as low-dose CT, which can help mitigate the risk of overfitting. We believe this is due to the lower image resolution and less anatomical detail in SPECT-MPI images compared to other imaging modalities such as CT or MRI. In Section IV.A, we reported the optimization results of different structures based on the image data from 1/8 dose. The resulting optimal structure was subsequently applied to the other reduced dose levels. While we did not re-optimize the network for each specific dose level, we do not expect the optimal structure to vary much. This is reflected by the results in Figs. 3&4, where the image similarity showed little change as the number of layers and filters deviated from their optimal setting.

The results in Fig. 5 show that a dose-specific network can be more accurate than a “one-size-fits-all” network. We believe a reason for this is that, since a “one-size-fits-all” network must cover a wide range of noise levels (i.e., from 1/2 to 1/16 dose), it fails to adapt its internal filtering to the noise level in the input images. Regardless of the dose level, in our experiments the network was always trained with all the cases in the training set. In a situation where the number of available cases is small, it might be useful to use multiple noise levels for training (to help reduce overfitting).

As described in Section III.B, in the experiments we used training samples formed by image patches extracted from a VOI containing the myocardium in order to improve the training efficiency. For comparison, we also considered two alternatives: extracting image patches from the whole patient volume, and directly using the VOI as the training samples. The resulting image similarity values (on the validation set) for these two alternatives were $ρ = 0.976 \pm 0.003$ and $ρ = 0.983 \pm 0.001$, respectively, both lower than the optimal $ρ = 0.987 \pm 0.001$ in our patch-based training (p-value < 0.05 for each case, paired t-test).
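The patch-based sampling described above can be sketched as follows. This is an illustrative outline only: the patch size, stride, VOI bounds and the nested-list volume representation are all hypothetical choices made here, not the settings of Section III.B.

```python
def extract_patches(volume, voi, patch, stride):
    """Return cubic patches (nested lists) that lie fully inside the VOI.

    volume: 3D nested list indexed as volume[z][y][x]
    voi:    ((z0, z1), (y0, y1), (x0, x1)) half-open index ranges
    """
    (z0, z1), (y0, y1), (x0, x1) = voi
    patches = []
    for z in range(z0, z1 - patch + 1, stride):
        for y in range(y0, y1 - patch + 1, stride):
            for x in range(x0, x1 - patch + 1, stride):
                patches.append(
                    [
                        [row[x:x + patch] for row in plane[y:y + patch]]
                        for plane in volume[z:z + patch]
                    ]
                )
    return patches


# Toy 8x8x8 volume whose voxel value encodes its own (z, y, x) position.
size = 8
volume = [
    [[z * 64 + y * 8 + x for x in range(size)] for y in range(size)]
    for z in range(size)
]
voi = ((2, 6), (2, 6), (2, 6))  # hypothetical box around the region of interest
patches = extract_patches(volume, voi, patch=2, stride=2)
print(len(patches), "patches extracted from the VOI")
```

For supervised training, the same patch locations would be applied to the low-dose input and to the full-dose target so that each input patch is paired with its corresponding target patch.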

Clinically, the main object of interest in MPI is the heart (LV). The rest of the image is only examined for the presence of incidental findings (i.e., unexpected abnormality or pathology). As seen in the example images in Figs. 8 and 9, the trained CAE/CNN essentially functions as a “low-pass” filter. Consequently, the resulting image background also appears smoother (similar to that of Gaussian post-filtering).

In the experiments, we used the same cut-off frequency (i.e., 0.22 cycles/pixel), which was optimized on full-dose studies for perfusion detection [5], as the learning target for the reduced dose levels. The idea was to train the DL networks to achieve this optimal smoothing in the output. Alternatively, it might be beneficial to also adjust the smoothing level in the full-dose target according to the reduced-dose level, but this would greatly increase the complexity of the experiment design. In addition, in this study we have focused on DL denoising for the task of perfusion-defect detection. It would be important to further investigate DL denoising of reduced-dose data for the assessment of cardiac function on cardiac-gated data, as done previously with conventional smoothing [13].

In this study, we considered a post-filtering approach on reduced-dose images. Alternatively, one may consider training a CNN model to directly obtain image reconstruction from the sinogram data [49]. However, this would likely require a much more complex network, as the CNN would also need to learn how to incorporate attenuation correction into the reconstruction process.

Finally, in the experiments the data used were from one scanner. It would be of interest to also test the generalizability of our methods to other scanners. It is reasonable to expect that the denoising models demonstrated in the experiments would be applicable to other scanners as well, for the following reasons. First, two-headed rotating SPECT systems with parallel-hole collimators are fairly generic and the images from such systems are fundamentally similar. Second, in the experiments the optimal CNN structure was demonstrated to be essentially the same for OSEM and FBP reconstructions. Note that the FBP reconstruction does not incorporate the depth-dependent collimator response of the scanner (and is hence less specific to the scanner characteristics). In the event that the image characteristics vary due to a scanner change, one could apply the technique of transfer learning by using our trained networks as a starting point and refining them with a small number of new images [33].

VI. Conclusions

We investigated several 3D convolutional denoising network structures for the clinical task of perfusion-defect detection in SPECT-MPI at a number of successively reduced dose levels. The network structures with skip connections were found to be less sensitive to variations in the number of layers and filters used, and a dose-level specific network achieved higher performance than a “one-size-fits-all” network. It was also found that the optimal structures for denoising of SPECT-MPI images can be less complex than those seen in low-dose CT.

Our evaluation results indicate that the proposed DL structures can achieve substantial noise reduction and lead to improvement in the diagnostic accuracy of low-dose data. In particular, at ½ dose, DL denoising can achieve nearly the same diagnostic accuracy as the full-dose data in both conventional FBP and OSEM reconstruction. These results indicate that the proposed DL approach can allow for further dose reduction when compared to conventional filtering. Encouraged by these results, in the future we plan to conduct a reader study using clinical experts to further validate these results for potential clinical use.

Acknowledgments

This work was supported by the National Institutes of Health (NIH) under grant R01-HL122484. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.

Contributor Information

Albert Juan Ramon, Illinois Institute of Technology, Chicago, IL 60616 USA.

Yongyi Yang, Illinois Institute of Technology, Chicago, IL 60616 USA.

P. Hendrik Pretorius, Department of Radiology, Division of Nuclear Medicine, University of Massachusetts Medical School, Worcester, MA 01655 USA.

Karen L. Johnson, Department of Radiology, Division of Nuclear Medicine, University of Massachusetts Medical School, Worcester, MA 01655 USA.

Michael A. King, Department of Radiology, Division of Nuclear Medicine, University of Massachusetts Medical School, Worcester, MA 01655 USA.

Miles N. Wernick, Illinois Institute of Technology, Chicago, IL 60616 USA.

References

[1].NCRP., Ionizing radiation exposure of the population of the United States: NCRP Report No. 93, 1986.
[2].Stratmann HG, Williams GA, Wittry MD, Chaitman BR, and Miller DD, “Exercise technetium-99m sestamibi tomography for cardiac risk stratification of patients with stable chest pain,” Circulation, vol. 89, no. 2, pp. 615–622, 1994.
[3].Sabharwal N, and Lahiri A, “Role of myocardial perfusion imaging for risk stratification in suspected or known coronary artery disease,” Heart, vol. 89, no. 11, pp. 1291–1297, 2003.
[4].Finegold JA, Asaria P, and Francis DP, “Mortality from ischaemic heart disease by country, region, and age: statistics from World Health Organisation and United Nations,” Int. J. Cardiol, vol. 168, no. 2, pp. 934–945, 2013.
[5].Brenner DJ, and Hall EJ, “Computed Tomography—An Increasing Source of Radiation Exposure,” N. Engl. J. Med, vol. 357, pp. 2277–84, 2007.
[6].U.S. Food and Drug Administration, “Initiative to reduce unnecessary radiation exposure from medical imaging,” Center for Devices and Radiological Health, ed, 2010.
[7].Jerome SD, Tilkemeier PL, Farrell MB, and Shaw LJ, “Nationwide laboratory adherence to myocardial perfusion imaging radiation dose reduction practices,” JACC: Cardiovasc. Imaging, vol. 8, no. 10, pp. 1170–1176, 2015.
[8].Einstein AJ et al., “Current worldwide nuclear cardiology practices and radiation exposure: results from the 65 country IAEA Nuclear Cardiology Protocols Cross-Sectional Study (INCAPS),” Eur. Heart J, vol. 36, no. 26, pp. 1689–1696, 2015.
[9].Juan Ramon A, Yang Y, Pretorius PH, Slomka PJ, Johnson KL, King MA, and Wernick MN, “Investigation of dose reduction in cardiac perfusion SPECT via optimization and choice of the image reconstruction strategy,” J. Nucl. Card, 2017, pp. 1–12.
[10].Ali I, Ruddy TD, Almgrahi A, Anstett FG, and Wells RG, “Half-time SPECT myocardial perfusion imaging with attenuation correction,” J. of Nucl. Med, vol. 50, no. 4, pp. 554–562, 2009.
[11].DePuey EG, Bommireddipalli S, Clark J, Thompson L, and Srour Y, “Wide beam reconstruction “quarter-time” gated myocardial perfusion SPECT functional imaging: a comparison to “full-time” ordered subset expectation maximum,” J. Nucl. Card, vol. 16, no. 5, pp. 736–752, 2009.
[12].Juan Ramon A, Yang Y, Pretorius PH, Johnson KL, King MA, and Wernick MN, “Personalized Models for Injected Activity Levels in SPECT Myocardial Perfusion Imaging,” IEEE Trans. Med. Imag, 2018.
[13].Juan Ramon A, Yang Y, Wernick MN, Pretorius PH, Johnson KL, Slomka PJ, and King MA, “Evaluation of the effect of reducing administered activity on assessment of function in cardiac gated SPECT,” J. Nucl. Card, pp. 1–11, 2018.
[14].Modi B, Brown J, Kumar G, Driver R, Kelion A, Peters A, and Fowler J, “A qualitative and quantitative assessment of the impact of three processing algorithms with halving of study count statistics in myocardial perfusion imaging: filtered backprojection, maximal likelihood expectation maximisation and ordered subset expectation maximisation with resolution recovery,” J. Nucl. Card, vol. 19, no. 5, pp. 945–957, 2012.
[15].Lecchi M, Martinelli I, Zoccarato O, Maioli C, Lucignani G, and Del Sole A, “Comparative analysis of full-time, half-time, and quarter-time myocardial ECG-gated SPECT quantification in normal-weight and overweight patients,” J. Nucl. Card, pp. 1–12, 2016.
[16].Lyon MC et al., “Dose reduction in half-time myocardial perfusion SPECT-CT with multifocal collimation,” J. Nucl. Card, pp. 1–11, 2016.
[17].Oddstig J, Hindorf C, Hedeer F, Jögi J, Arheden H, Hansson MJ, and Engblom H, “The radiation dose to overweighted patients undergoing myocardial perfusion SPECT can be significantly reduced: validation of a linear weight-adjusted activity administration protocol,” J. Nucl. Card, pp. 1–10, 2016.
[18].Wells RG, “Dose reduction is good but it is image quality that matters,” J. Nucl. Cardiol, vol. 25, pp. 1–3, Jul. 2018.
[19].LeCun Y, Bengio Y, and Hinton G, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436, 2015.
[20].Lakhani P, Gray DL, Pett CR, Nagy P, and Shih G, “Hello World Deep Learning in Medical Imaging,” J. Dig. Imag, vol. 31, no. 3, pp. 283–289, 2018.
[21].Ker J, Wang L, Rao J, and Lim T, “Deep learning applications in medical image analysis,” IEEE Access, vol. 6, pp. 9375–9389, 2018.
[22].Casamitjana A, Puch S, Aduriz A, Sayrol E, and Vilaplana V, “3d convolutional networks for brain tumor segmentation,” Proc. of the MICCAI Chall. on Multim. Brain Tumor Image Segment. (BRATS), pp. 65–68, 2016.
[23].Boublil D, Elad M, Shtok J, and Zibulevsky M, “Spatially-adaptive reconstruction in computed tomography using neural networks,” IEEE Trans. Med. Imag, vol. 34, no. 7, pp. 1474–1485, 2015.
[24].Mao X, Shen C, and Yang Y-B, “Image restoration using very deep convolutional encoder-decoder networks with symmetric skip connections.” Adv. in Neur. Inf. Proc. Sys, pp. 2802–2810.
[25].Zhang K, Zuo W, Chen Y, Meng D, and Zhang L, “Beyond a gaussian denoiser: Residual learning of deep cnn for image denoising,” IEEE Trans. Imag. Proc, vol. 26, no. 7, pp. 3142–3155, 2017.
[26].Jiang D, Dou W, Vosters L, Xu X, Sun Y, and Tan T, “Denoising of 3D magnetic resonance images with multi-channel residual learning of convolutional neural network,” Jpn. J. Radiol, vol. 36, no. 9, pp. 566–574, 2018.
[27].Chen H, Zhang Y, Zhang W, Liao P, Li K, Zhou J, and Wang G, “Low-dose CT via convolutional neural network,” Biomed. Opt. Express, vol. 8, no. 2, pp. 679–694, 2017.
[28].Chen H. et al., “Low-dose CT with a residual encoder-decoder convolutional neural network,” IEEE Trans. Med. Imag, vol. 36, no. 12, pp. 2524–2535, 2017.
[29].Nishio M. et al., “Convolutional auto-encoder for image denoising of ultra-low-dose CT,” Heliyon, vol. 3, no. 8, pp. e00393, 2017.
[30].Wolterink JM, Leiner T, Viergever MA, and Išgum I, “Generative adversarial networks for noise reduction in low-dose CT,” IEEE Trans. Med. Imag, vol. 36, no. 12, pp. 2536–2545, 2017.
[31].Yang Q. et al., “Low dose CT image denoising using a generative adversarial network with Wasserstein distance and perceptual loss,” IEEE Trans. Med. Imag, 2018.
[32].Gong K, Guan J, Liu C-C, and Qi J, “PET image denoising using a deep neural network through fine tuning,” IEEE Trans. Rad. Plas. Med. Scien, vol. 3, no. 2, pp. 153–161, 2019.
[33].Shan H. et al., “3-D convolutional encoder-decoder network for low-dose CT via transfer learning from a 2-D trained network,” IEEE Trans. Med. Imag, vol. 37, no. 6, pp. 1522–1534, 2018.
[34].Duncan JS, Insana MF, and Ayache N, “Biomedical Imaging and Analysis in the Age of Big Data and Deep Learning,” Proc. IEEE, vol. 108, no. 1, pp. 3–10, 2019.
[35].Gong K, Berg E, Cherry SR, and Qi J, “Machine Learning in PET: From Photon Detection to Quantitative Image Reconstruction,” Proc. IEEE, vol. 108, no. 1, pp. 51–68, 2019.
[36].Juan Ramon A, Yongyi Y, Pretorius PH, Johnson KL, King MA, and Wernick MN, “Initial Investigation of Low-Dose SPECT-MPI via Deep Learning.” in Conf. Rec. IEEE Nucl. Sci. Symp., 2018.
[37].Vincent P, Larochelle H, Bengio Y, and Manzagol P-A, “Extracting and composing robust features with denoising autoencoders.” Proc. of the 25th Inter. Conf. on Mach. Learn., pp. 1096–1103.
[38].Ioffe S, and Szegedy C, “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift.” Proc. of the 32nd Inter. Conf. on Mach. Learn., pp. 448–456.
[39].Bengio Y, Simard P, and Frasconi P, “Learning long-term dependencies with gradient descent is difficult,” IEEE Trans. Neur. Net, vol. 5, no. 2, pp. 157–166, 1994.
[40].Pascanu R, Mikolov T, and Bengio Y, “Understanding the exploding gradient problem,” CoRR, abs/1211.5063, 2012.
[41].Dong C, Loy CC, He K, and Tang X, “Image super-resolution using deep convolutional networks,” IEEE Trans. Pattern. Anal. Mach. Intell, vol. 38, no. 2, pp. 295–307, 2016.
[42].Henzlova MJ, Duvall WL, Einstein AJ, Travin MI, Verberne HJ, “ASNC imaging guidelines for SPECT nuclear cardiology procedures: Stress, protocols, and tracers,” J. Nucl. Cardiol, vol. 23, no. 3, pp. 606–639, 2016.
[43].Bai C, Shao L, Da Silva AJ, and Zhao Z, “A generalized model for the conversion from CT numbers to linear attenuation coefficients,” IEEE Trans. Nucl. Sci, vol. 50, no. 5, pp. 1510–1515, 2003.
[44].Ogawa K, Harata Y, Ichihara T, Kubo A, and Hashimoto S, “A practical method for position-dependent Compton-scatter correction in single photon emission CT,” IEEE Trans. Med. Imag, vol. 10, no. 3, pp. 408–412, 1991.
[45].Kingma DP, and Ba J, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014.
[46].Arsanjani R. et al., “Comparison of fully automated computer analysis and visual scoring for detection of coronary artery disease from myocardial perfusion SPECT in a large population,” Journal of Nuclear Medicine, vol. 54, no. 2, pp. 221–228, 2013.
[47].Slomka PJ et al., “Automated quantification of myocardial perfusion SPECT using simplified normal limits,” J. Nucl. Card, vol. 12, no. 1, pp. 66–77, 2005.
[48].Metz C, Lorenzo LP, and Papaioannu J. “ROC-kit software,” http://metz-roc.uchicago.edu/MetzROC/software.
[49].Jin KH, McCann MT, Froustey E, and Unser M, “Deep convolutional neural network for inverse problems in imaging,” IEEE Trans. Imag. Proc, vol. 26, no. 9, pp. 4509–4522, 2017.


[R41] [41].Dong C, Loy CC, He K, and Tang X, “Image super-resolution using deep convolutional networks,” IEEE Trans. Pattern. Anal. Mach. Intell, vol. 38, no. 2, pp. 295–307, 2016. [DOI] [PubMed] [Google Scholar]

[R42] [42].Henzlova MJ, Duvall WL, Einstein AJ, Travin MI, Verberne HJ, “ASNC imaging guidelines for SPECT nuclear cardiology procedures: Stress, protocols, and tracers,” J. Nucl. Cardiol, vol. 23, no. 3, pp. 606–639, 2016. [DOI] [PubMed] [Google Scholar]

[R43] [43].Bai C, Shao L, Da Silva AJ, and Zhao Z, “A generalized model for the conversion from CT numbers to linear attenuation coefficients,” IEEE Trans. Nucl. Sci, vol. 50, no. 5, pp. 1510–1515, 2003. [Google Scholar]

[R44] [44].Ogawa K, Harata Y, Ichihara T, Kubo A, and Hashimoto S, “A practical method for position-dependent Compton-scatter correction in single photon emission CT,” IEEE Trans. Med. Imag, vol. 10, no. 3, pp. 408–412, 1991. [DOI] [PubMed] [Google Scholar]

[R45] [45].Kingma DP, and Ba J, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014.

[R46] [46].Arsanjani R. et al. , “Comparison of fully automated computer analysis and visual scoring for detection of coronary artery disease from myocardial perfusion SPECT in a large population,” Journal of Nuclear Medicine, vol. 54, no. 2, pp. 221–228, 2013. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R47] [47].Slomka PJ et al. , “Automated quantification of myocardial perfusion SPECT using simplified normal limits,” J. Nucl. Card, vol. 12, no. 1, pp. 66–77, 2005. [DOI] [PubMed] [Google Scholar]

[R48] [48].Metz C, Lorenzo LP, and Papaioannu J. “ROC-kit software,” http://metz-roc.uchicago.edu/MetzROC/software.

[R49] [49].Jin KH, McCann MT, Froustey E, and Unser M, “Deep convolutional neural network for inverse problems in imaging,” IEEE Trans. Imag. Proc, vol. 26, no. 9, pp. 4509–4522, 2017. [DOI] [PubMed] [Google Scholar]
