Segmentation-free PVC for Cardiac SPECT using a Densely-connected Multi-dimensional Dynamic Network

Huidong Xie; Zhao Liu; Luyao Shi; Kathleen Greco; Xiongchao Chen; Bo Zhou; Attila Feher; John C Stendahl; Nabil Boutagy; Tassos C Kyriakides; Ge Wang; Albert J Sinusas; Chi Liu

Author manuscript; available in PMC: 2024 May 2.

Published in final edited form as: IEEE Trans Med Imaging. 2023 May 2;42(5):1325–1336. doi: 10.1109/TMI.2022.3226604

PMCID: PMC10204821 NIHMSID: NIHMS1897477 PMID: 36459599

Abstract

In nuclear imaging, limited resolution causes partial volume effects (PVEs) that affect image sharpness and quantitative accuracy. Partial volume correction (PVC) methods incorporating high-resolution anatomical information from CT or MRI have been demonstrated to be effective. However, such anatomical-guided methods typically require tedious image registration and segmentation steps. Accurately segmented organ templates are also hard to obtain, particularly in cardiac SPECT imaging, due to the lack of hybrid SPECT/CT scanners with high-end CT and associated motion artifacts. Slight mis-registration/mis-segmentation would result in severe degradation in image quality after PVC. In this work, we develop a deep-learning-based method for fast cardiac SPECT PVC without anatomical information and associated organ segmentation. The proposed network involves a densely-connected multi-dimensional dynamic mechanism, allowing the convolutional kernels to be adapted based on the input images, even after the network is fully trained. Intramyocardial blood volume (IMBV) is introduced as an additional clinically relevant loss function for network optimization. The proposed network demonstrated promising performance on 28 canine studies acquired on a GE Discovery NM/CT 570c dedicated cardiac SPECT scanner with a 64-slice CT using Technetium-99m-labeled red blood cells. This work showed that the proposed network with the densely-connected dynamic mechanism produced superior results compared with the same network without such a mechanism. Results also showed that the proposed network without anatomical information could produce images with statistically comparable IMBV measurements to the images generated by anatomical-guided PVC methods, which could be helpful in clinical translation.

Keywords: Cardiac SPECT, Coronary Microvascular Disease, Dynamic Convolution, Deep Learning, Intramyocardial Blood Volume, Partial Volume Correction

I. Introduction

Single photon emission computed tomography (SPECT) is a nuclear imaging modality that visualizes and measures radiotracer activity within the body and is widely used for clinical purposes [1]. In SPECT imaging, the image quality is negatively affected by various factors, such as photon attenuation, photon scatter, low photon sensitivity, and motion [2]. In addition, limited spatial resolution generally causes partial volume effects (PVEs), in which contributions from different tissues may be combined into a single voxel. Because different tissues have different patterns and kinetics of tracer uptake, PVEs cause undesirable blurring in the reconstructed images. Furthermore, the reconstructed voxels of SPECT are usually larger than those of higher-resolution imaging modalities (such as computed tomography (CT)), leading to additional PVEs. In SPECT imaging, PVEs usually appear as spill-over of photon counts between different tissues [3]. For example, in cardiac imaging, photons emitted from the myocardium may spill over into the blood pool and vice versa, resulting in under-estimation or over-estimation of tracer activities in the reconstructed images.

The spatial resolution of SPECT imaging systems is characterized by their point spread functions (PSFs), and the goal of partial volume correction (PVC) is to reverse the effects of the system PSF. Using only the SPECT data, this can be done simply by deconvolution in the image domain or by incorporating the PSF into the system matrix for iterative reconstruction [4]. Nonetheless, both approaches result in undesirable artifacts due to the loss of high-frequency information [3]. As discussed in [5], this issue can be effectively addressed by incorporating anatomical information for PVC. For cardiac SPECT imaging, our group has investigated a series of anatomical-guided PVC methods incorporating high-resolution contrast-enhanced CT angiography (CTA) images [6], [7]. Among these methods, we found that the iterative Yang (iY) [3], [8] method is preferred, as this approach is robust to image noise [9]. Despite its promising results, iY requires tedious image registration and segmentation steps between SPECT data and anatomical organ templates, which typically take about 10 hours per scan for the canine studies used in this work. iY also assumes perfect registration/segmentation between SPECT data and anatomical organ templates, which may not be practical in clinical settings due to motion artifacts, especially for cardiac imaging. Lastly, contrast-enhanced CT (CECT) is usually used to obtain anatomical organ templates, which introduces additional radiation exposure to patients along with the need for non-contrast CT for attenuation correction.

We previously proposed an atlas-based method to achieve segmentation-free PVC [10]. This method requires a set of atlas images and the corresponding segmentation templates. Non-rigid registrations were needed between the target CECT images and the atlas CECT images (target images refer to the test data in the context of a neural network). The transformation matrices obtained from registrations were then applied to the atlas segmentation templates. Lastly, all the transformed segmentation templates were fused together for iY-PVC. Therefore, the manual segmentation step for iY-PVC can be avoided. However, this method is time- and memory-consuming due to numerous non-rigid registration steps on high-resolution CECT image volumes. Also, its performance is highly dependent on the atlas database. The atlas-based method needs to be individually optimized for different species, patient populations, and even different sizes of canine studies. The paper showed that the atlas-based method led to high variances for the PVC results [10]. Lastly, the atlas-based method only skips the manual segmentation step, but contrast CT is still needed.

Deep learning represents a new class of reconstruction algorithms [11] and may be an ideal tool to address the above-mentioned limitations of anatomical-guided or atlas-based PVC methods. In the past years, convolutional neural networks have been implemented in various medical imaging applications such as low-dose CT [12], few-view CT [13], [14], low-dose SPECT [15], few-view SPECT [16], [17], and attenuation map generation [18]. However, little attention has been given to deep-learning-based PVC.

All the convolutional networks mentioned above learn static convolutional kernels throughout the networks. In the context of this paper, a convolutional kernel is defined as a set of filters in a convolutional layer. Recent studies showed that dynamic convolutional networks can significantly improve the performance for classification tasks [19]. Initially proposed by Chen et al. [20] and Yang et al. [21], in a dynamic convolutional layer, instead of applying the same convolutional kernel to all the input data after the network is fully trained, dynamic convolution learns a weighted combination of multiple convolutional kernels adapted based on the input features, with negligible extra computational cost. Recently, as pointed out by Li et al. [19], the previously proposed dynamic convolution limits the adaptive capability to only one dimension (i.e., the number of convolutional kernels), and other dimensions were omitted (i.e., kernel spatial size, input channel size, and output channel size). Li et al. proposed an omni-dimensional dynamic convolution, which generalizes the dynamic mechanism to all the dimensions in convolutional networks. However, the dynamic weights of all the previously proposed dynamic mechanisms are calculated based on feature maps of only the previous layer. In this work, inspired by the Dense-net structure [22], we proposed a densely-connected multi-dimensional dynamic (DC-Dy) mechanism, in which dynamic weights are obtained not only from the previous layer, but also from all the preceding layers. This idea encourages the network to reuse feature maps from all the layers and improves the adaptive capability of the dynamic mechanism. Compared with the omni-dimensional dynamic mechanism, the proposed DC-Dy mechanism does not introduce additional parameters. To the best of our knowledge, dynamic convolutional networks have not been investigated so far in the field of medical imaging.

When training a medical image reconstruction network, global image quality metrics, such as mean-absolute-error (MAE) and structural similarity index measurement (SSIM), are widely implemented as the loss functions to optimize network parameters. Even though using these metrics has achieved promising results in various deep-learning networks, it is a well-known issue that these metrics do not necessarily reflect the true image quality in clinical settings, especially for cardiac images acquired on dedicated cardiac scanners like GE Discovery NM/CT 570c [23] or D-SPECT [24]. Since these scanners have a field-of-view (FOV) smaller than the reconstructed matrix size, there are some insignificant features outside the FOV. But these relatively unimportant features share the same weight in MAE or SSIM calculations. In this work, we proposed to incorporate a clinically-relevant image quantification metric as part of the loss function for network optimizations.

Coronary microvascular disease (CMVD) is a prevalent and critical global health problem, which is often unrecognized [25]. Non-invasive methods to evaluate myocardial micro-circulatory function to accurately diagnose CMVD remain complicated and time consuming. We previously proposed a methodological framework for intramyocardial blood volume (IMBV) quantification as a metric for microvascular function obtainable with SPECT imaging of ^99mTc-labeled red blood cells (RBCs) [9]. ^99mTc-RBCs are routinely used for cardiac blood-pool imaging in the assessment of regional and global left ventricular function. Previous results showed that SPECT images with anatomical-guided PVC methods produced more accurate estimation of IMBV [9]. IMBV could serve as a novel index to diagnose CMVD [26]. The noninvasive diagnosis of CMVD is a challenging clinical problem, especially in the presence of coexisting epicardial coronary artery disease [27]. Here, we focused on investigating ^99mTc-RBCs canine studies as an example for cardiac SPECT PVC, and we expect that the proposed method can be applied to other imaging tracers/modalities in the future.

In this work, we proposed a deep neural network with a densely-connected multi-dimensional dynamic (DC-Dy) mechanism for whole-volume cardiac SPECT PVC. Compared with the static counterpart, the proposed dynamic network demonstrated consistently better performance on canine studies acquired on the GE Discovery NM/CT 570c scanner for evaluation of IMBV. Additionally, based on the imaging tracer and medical application, IMBV was incorporated as a clinically-derived loss to optimize network parameters. Ablation experiments presented in this paper demonstrated the effectiveness of this clinically-derived loss function. Compared with the anatomical-guided iY method, the proposed deep-learning method demonstrated fast and comparable performance without CECT as prior knowledge. Lastly, we also tested the proposed network with both CECT and SPECT images as dual-channel input (but without segmented organ templates). The network trained with dual-channel input further enhanced the PVC quality without tedious image segmentation.

II. Methodology

A. Data Acquisition

A total of 28 resting canine studies scanned with ^99mTc-RBCs were included in this project to validate the effectiveness of the proposed deep-learning PVC method. Formulated in (1), IMBV is defined as the ratio between the mean activities of the entire myocardium ( $M_{m y o}$ ) and the mean activities of the left ventricular blood pool ( $M_{l v - b l p}$ ). IMBV served as the primary metric in this work to evaluate the image quality and quantitative accuracy before and after PVC. After PVC, IMBV values are expected to decrease as the $M_{m y o}$ decreases and $M_{l v - b l p}$ increases by reducing the spill-over effects. $M_{m y o}$ and $M_{l v - b l p}$ were calculated using the manually-segmented 3D organ templates obtained from the CECT image volumes.

IMBV = \frac{M_{m y o}}{M_{l v - b l p}}

(1)
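As a minimal sketch of (1), the IMBV can be computed from an image volume and two binary organ templates. The function name `imbv`, the toy volume, and the activity values below are hypothetical and only illustrate the ratio; they are not taken from the study data.

```python
import numpy as np

def imbv(volume, myo_mask, blp_mask):
    """Ratio of mean myocardial activity to mean LV blood-pool activity, as in (1)."""
    m_myo = volume[myo_mask].mean()
    m_blp = volume[blp_mask].mean()
    return m_myo / m_blp

# Toy volume (hypothetical values): blood pool at 100, myocardium at 20.
vol = np.zeros((4, 4, 4))
blp = np.zeros(vol.shape, dtype=bool)
myo = np.zeros(vol.shape, dtype=bool)
blp[1:3, 1:3, 1:3] = True   # central cube stands in for the LV blood pool
myo[0, :, :] = True         # one slab stands in for the myocardium
vol[blp] = 100.0
vol[myo] = 20.0

print(imbv(vol, myo, blp))  # 0.2
```

Spill-over from the blood pool into the myocardium raises the numerator and lowers the denominator, which is why IMBV is expected to decrease after PVC.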

Prior to imaging, about 3 ml of arterial blood was withdrawn from each animal from a femoral artery catheter into a heparinized vacutainer. RBCs were then labeled in vitro with sodium pertechnetate. About 30 minutes after RBC labeling, ^99mTc-RBCs (18.8 ± 4.3 mCi) were intravenously injected into the imaging subjects. SPECT scans were then performed 15 minutes after the injections. The SPECT scan time was about 10 minutes. A low-dose non-contrast CT was performed afterward for attenuation correction. Then, CECT scans were performed with retrospective electrocardiogram (ECG) gating during end-expiration. The animals were mechanically ventilated and were under general anesthesia (1-2% isoflurane and 55-60% nitrous oxide) during the procedure [28]. The use of animal data in this study was approved by the Institutional Animal Care & Use Committee (IACUC) of Yale University.

All the scans were performed on a GE Discovery NM/CT 570c dedicated cardiac SPECT/CT system. The SPECT scanner consists of 19 solid-state cadmium zinc telluride (CZT) detector modules to generate projections $P \in R^{32 \times 32 \times 19}$ with pixel size 2.46 × 2.46 mm². The system has pinhole collimators focusing on the heart with a FOV of about 19 cm in diameter. Images ( $I \in R^{70 \times 70 \times 50}$ ) were reconstructed using the maximum likelihood expectation-maximization (MLEM) algorithm [29] with 80 iterations and voxel size 4×4×4 mm³. No post-filtering was applied.

B. Iterative Yang PVC

The acquired SPECT list-mode data were rebinned into 8 cardiac gates during end-expiration to ensure alignments between CECT and SPECT. End-diastolic and end-systolic gates were selected for iY-PVC and network training/validation/testing. SPECT images were first resized to the dimension of CECT images. CECT images were then manually registered to SPECT images. Five organ templates, including myocardium ( $R O I_{m y o}$ ), blood pool ( $R O I_{b l p}$ ), liver ( $R O I_{l i v e r}$ ), lung ( $R O I_{l u n g}$ ), and background ( $R O I_{b k g}$ ), were then generated by manual segmentation from contrast CT for iY-PVC. The generated templates are binary and cover the entire 3D volumes. iY-PVC was achieved by calculating voxel-wise correction factors through the process summarized in Algorithm 1.

Algorithm 1 iY-PVC method

Require: $I$ ▷ Initial reconstructed image using MLEM
Require: $S$ ▷ The system matrix
Require: $ROI_{O}$, $O \in \{myo, blp, lung, liver, bkg\}$ ▷ Segmented organ templates
$k \leftarrow 0$
$I_{k} \leftarrow I$
while $k < 10$ do ▷ 10 iterations for iY-PVC
    $O_{mean} \leftarrow mean\{I_{k}(ROI_{O})\}$ ▷ Mean values in each organ
    $T \leftarrow \sum O_{mean} \cdot ROI_{O}$ ▷ Noise-free template
    $T_{proj} \leftarrow S \cdot T$ ▷ Forward projection
    $T_{recon} \leftarrow MLEM(T_{proj})$ ▷ MLEM reconstruction
    $F \leftarrow T / T_{recon}$ ▷ Correction factors
    $I_{k+1} \leftarrow I \cdot F$ ▷ Corrected image
    $k \leftarrow k + 1$
end while

In iY-PVC, a noise-free organ template is generated first using the mean values within the segmented templates. The voxel-wise correction factors are equal to the ratio between the noise-free template and the MLEM-reconstructed template after forward-projection using the system matrix. This method typically requires a few iterations to converge. In this work, 10 iterations were used for iY-PVC. The manual registration took a few minutes per scan and the manual segmentation for five organs took around 10 hours per scan. The time-consuming process of generating accurate organ templates demonstrates the need for a segmentation-free PVC method.
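The loop in Algorithm 1 can be sketched on a toy phantom as follows. This is an illustration under a strong simplifying assumption: the forward projection with the system matrix followed by MLEM reconstruction is replaced by a simple 3-point box blur (`box_blur`), which only mimics resolution loss. The function names, the two-region phantom, and the activity values are hypothetical.

```python
import numpy as np

def box_blur(vol):
    """Stand-in resolution model: 3-point box blur along each axis (periodic edges).
    ASSUMPTION: replaces 'forward project with S, then MLEM-reconstruct'."""
    out = vol.astype(float)
    for axis in range(out.ndim):
        out = (np.roll(out, 1, axis=axis) + out + np.roll(out, -1, axis=axis)) / 3.0
    return out

def iterative_yang(image, rois, degrade, n_iter=10, eps=1e-8):
    """Voxel-wise correction following the structure of Algorithm 1."""
    corrected = image.astype(float)
    for _ in range(n_iter):
        template = np.zeros_like(corrected)
        for mask in rois:
            # Piecewise-constant, noise-free template T from the regional means.
            template[mask] = corrected[mask].mean()
        # F = T / T_recon, guarded against division by zero.
        factors = template / np.maximum(degrade(template), eps)
        # I_{k+1} = I * F: factors are always applied to the original image.
        corrected = image * factors
    return corrected

# Toy phantom (hypothetical values): blood pool at 100 inside a background of 20.
truth = np.full((8, 8, 8), 20.0)
blp = np.zeros(truth.shape, dtype=bool)
blp[2:6, 2:6, 2:6] = True
truth[blp] = 100.0

observed = box_blur(truth)                       # image degraded by spill-over
corrected = iterative_yang(observed, [blp, ~blp], box_blur)
print(observed[blp].mean(), corrected[blp].mean())
```

In this toy setting the regional mean of the blood pool moves back toward its true value after correction, because the true piecewise-constant image is a fixed point of the update when the templates and the resolution model are exact.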

C. Densely-connected Multi-dimensional Dynamic Convolution

Mathematically, a static 3D convolutional operation [30] can be formulated as:

y = W ⊛ x + B

(2)

where $x \in R^{d \times w \times h \times C_{i n}}$ and $y \in R^{d \times w \times h \times C_{o u t}}$ represent the input and output feature maps of the convolutional layer, respectively. $d$ , $w$ , and $h$ denote the spatial dimensions of the input/output feature maps, which may differ between input and output depending on the chosen parameters of the convolutional layer. $C_{i n}$ and $C_{o u t}$ represent the input and output channel dimensions. $W \in R^{k \times k \times k \times C_{i n} \times C_{o u t}}$ represents the weights in the convolutional kernel. $k$ is the kernel spatial dimension. $B \in R^{C_{o u t}}$ represents the bias term. $⊛$ denotes the convolutional operation.

In the proposed DC-Dy convolutional layer, the kernel weight $W$ becomes adaptive based on the feature maps of all preceding layers on three kernel dimensions (i.e., spatial, input channel, and output channel dimensions). Specifically, three attention values, $a_{s p a} \in R^{k \times k \times k}$ , $a_{i n} \in R^{C_{i n}}$ , and $a_{o u t} \in R^{C_{o u t}}$ , are obtained via an attention mechanism. $a_{s p a}$ , $a_{i n}$ , and $a_{o u t}$ are different for different input image volumes. Mathematically, (2) becomes:

y = (\frac{1}{3} W ⊙ (a_{s p a} + a_{i n} + a_{o u t})) ⊛ x + B

(3)

The kernel weight $W$ is weighted using $a_{s p a}$ , $a_{i n}$ , and $a_{o u t}$ , which makes it different for different input image volumes across all spatial locations, input channels, and output channels. With this design, a dynamic convolutional layer can capture more informative features than its static counterpart.
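The kernel modulation in (3) amounts to broadcasting the three attention values over their respective kernel axes. The following sketch assumes the kernel layout $k \times k \times k \times C_{in} \times C_{out}$ given above; the function name `dynamic_kernel` and the random values are hypothetical, and the convolution itself is omitted.

```python
import numpy as np

def dynamic_kernel(W, a_spa, a_in, a_out):
    """Return (1/3) * W ⊙ (a_spa + a_in + a_out) for W of shape (k, k, k, C_in, C_out)."""
    a_spa = a_spa[:, :, :, None, None]        # (k, k, k, 1, 1)
    a_in = a_in[None, None, None, :, None]    # (1, 1, 1, C_in, 1)
    a_out = a_out[None, None, None, None, :]  # (1, 1, 1, 1, C_out)
    return W * (a_spa + a_in + a_out) / 3.0

rng = np.random.default_rng(0)
k, c_in, c_out = 3, 4, 8
W = rng.standard_normal((k, k, k, c_in, c_out))   # static kernel (random stand-in)
a_spa = rng.uniform(size=(k, k, k))               # attention values lie in (0, 1)
a_in = rng.uniform(size=c_in)
a_out = rng.uniform(size=c_out)

W_dyn = dynamic_kernel(W, a_spa, a_in, a_out)
print(W_dyn.shape)  # (3, 3, 3, 4, 8)
```

When all three attention values equal 1, the dynamic kernel reduces to the static kernel $W$, which is the role of the 1/3 factor.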

The attention mechanism used to obtain $a_{s p a}$ , $a_{i n}$ , and $a_{o u t}$ is adapted from the squeeze-and-excitation attention proposed by Hu et al. [31]. The input features $x$ are first squeezed along the spatial dimensions using a global average pooling (GAP) operation (GAP( $x$ ) $\in R^{1 \times 1 \times 1 \times C_{i n}}$ ), followed by a fully connected (FC) layer and a rectified linear unit (ReLU) as the activation function. Another FC layer is used to project the output from the previous FC layer to the size of the attention values, followed by a sigmoid activation function to squeeze the attention values between 0 and 1. The 2 FC layers contain $2 n$ and $n$ neurons, where $n = k^{3}$ for $a_{s p a}$ , $n = C_{i n}$ for $a_{i n}$ , and $n = C_{o u t}$ for $a_{o u t}$ . A graphical illustration of the proposed multi-dimensional dynamic convolution with kernel size 3 × 3 × 3 (i.e., $k$ = 3) is presented in Fig. 1. The attention mechanism used for the convolutional kernels is also presented in Fig. 1.
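The attention branch described above (GAP, FC with $2n$ neurons, ReLU, FC with $n$ neurons, sigmoid) can be sketched in a few lines. The weights here are random stand-ins for trained parameters, and the names `se_attention` and `random_branch` are hypothetical; a small feature map is used to keep the example light.

```python
import numpy as np

def se_attention(x, w1, b1, w2, b2):
    """Squeeze-and-excitation style attention for x of shape (d, w, h, C_in)."""
    pooled = x.mean(axis=(0, 1, 2))               # GAP over spatial axes -> (C_in,)
    hidden = np.maximum(pooled @ w1 + b1, 0.0)    # FC with 2n neurons + ReLU
    logits = hidden @ w2 + b2                     # FC with n neurons
    return 1.0 / (1.0 + np.exp(-logits))          # sigmoid -> values in (0, 1)

def random_branch(rng, c_in, n):
    """Random (untrained) parameters for one attention branch; illustration only."""
    return (rng.standard_normal((c_in, 2 * n)) * 0.1, np.zeros(2 * n),
            rng.standard_normal((2 * n, n)) * 0.1, np.zeros(n))

rng = np.random.default_rng(0)
k, c_in, c_out = 3, 4, 8
x = rng.standard_normal((5, 7, 7, c_in))          # small stand-in feature map

a_spa = se_attention(x, *random_branch(rng, c_in, k ** 3)).reshape(k, k, k)  # n = k^3
a_in = se_attention(x, *random_branch(rng, c_in, c_in))                      # n = C_in
a_out = se_attention(x, *random_branch(rng, c_in, c_out))                    # n = C_out
print(a_spa.shape, a_in.shape, a_out.shape)  # (3, 3, 3) (4,) (8,)
```

Because the attention values depend on the pooled input features, they change with every input volume even though the FC weights are fixed after training.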

Fig. 1. — Graphical illustration of the proposed DC-Dy convolution with kernel size 3 × 3 × 3, input channel $C_{in}$ and output channel $C_{out}$ . The attention mechanism used for convolutional kernels is also presented here. Small light-blue cubes represent values in the static convolutional kernel (before applying the attention). Cubes with other colors represent how the attention values are applied. The dynamic kernel weights are obtained by multiplying the kernel weights and three attention values. Three dynamic kernel weights are then added together before performing the convolutional operations. Input feature maps are obtained using (4).

With the proposed DC-Dy mechanism, three attention values are obtained based on input features of all preceding layers. Specifically, with the DC mechanism, the three attention weights (i.e., $a_{s p a}$ , $a_{i n}$ , and $a_{o u t}$ ) are obtained using feature maps from the previous layer ( $(l - 1)^{t h}$ ) and all the preceding layers (1^st to $(l - 2)^{t h}$ layers). Note that the DC mechanism only affects the three attention weights for convolutional kernels but not the input to the convolutional layers. As formulated in (4), to avoid introducing additional parameters and memory burden, the features for attention weight calculations are combined using additions:

x_{l}^{i n} = \frac{1}{2} (x_{l - 1} + x_{p r e v})

(4)

where $x_{l}^{i n}$ and $x_{l}$ denote the input features to the $l^{t h}$ layer for attention weight calculations and the output features from the $l^{t h}$ layer, respectively, and $x_{p r e v}$ denotes the features combined from the preceding layers. Note that by using (4), features from the $(l - 1)^{t h}$ layer have the largest weight when calculating the three attention values for the $l^{t h}$ layer, and the weights of all preceding layers gradually diminish. Intuitively, features that are the closest to the $l^{t h}$ layer should contribute most to the attention calculations. For the 1^st layer, the input image volume is used to calculate the three attention values. Feature maps are resized for dimension matching before (4).
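The diminishing weights can be made explicit with a short calculation. This sketch assumes that $x_{prev}$ in (4) is the attention input of the previous layer, i.e., $x_{l}^{in} = \frac{1}{2}(x_{l-1} + x_{l-1}^{in})$ with $x_{1}^{in}$ equal to the input volume $x_{0}$; the text implies this recursion but does not state it, and resizing of feature maps is ignored. The function name is hypothetical.

```python
def attention_input_weights(layer):
    """Weights of x_0, x_1, ..., x_{layer-1} inside x_layer^in.

    ASSUMPTION: x_l^in = 0.5 * (x_{l-1} + x_{l-1}^in), with x_1^in = x_0 (the input volume).
    """
    weights = [1.0]                                    # layer 1 sees only the input volume
    for _ in range(layer - 1):
        # Halve every earlier contribution and add the newest output with weight 0.5.
        weights = [0.5 * w for w in weights] + [0.5]
    return weights

print(attention_input_weights(4))  # [0.125, 0.125, 0.25, 0.5]
```

Under this assumption, the most recent layer always contributes half of the attention input, and each step further back halves the contribution again, while the weights still sum to 1.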

D. Network Structure

The proposed network adopts a U-net-like structure [32], and it is presented in Fig. 2. The proposed network takes a batch of 3D non-PVC image volumes as input and tries to perform PVC without anatomical information from CECT. The dimensions for both input and output are $N_{b}$ × 50 × 70 × 70, where $N_{b}$ represents the input batch size. As depicted in Fig. 2, the network contains 4 down-sampling blocks and 4 up-sampling blocks. Each down-sampling/up-sampling block has either a dynamic convolutional (Dy-Conv) or dynamic de-convolutional (Dy-DeConv) layer, followed by a dense-net block [22]. The size of the dynamic convolutional kernel used for down-sampling or up-sampling is 1 × 3 × 3 without zero-padding, so that the dimension along the z-axis remains the same throughout the network. The dynamic convolutional kernel size used in the dense-net block is 5 × 3 × 3 with zero-padding to allow the network to capture contextual information from adjacent slices. ReLU activation functions were used after all the convolutional layers. Conveying paths are used to connect earlier layers and later layers. All convolutional layers contain 32 filters.
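To see how unpadded 1 × 3 × 3 kernels change the in-plane size while leaving the z-axis untouched, the standard output-size formulas can be applied block by block. The stride is not stated in the text; this sketch assumes a stride of 1 for both the convolutional and de-convolutional layers, so the intermediate sizes below are illustrative only. The function names are hypothetical.

```python
def valid_conv_size(n, k, stride=1):
    """Output length of an unpadded convolution along one axis."""
    return (n - k) // stride + 1

def transposed_conv_size(n, k, stride=1):
    """Output length of an unpadded transposed convolution along one axis."""
    return (n - 1) * stride + k

in_plane, depth = 70, 50
sizes = [(depth, in_plane, in_plane)]
for _ in range(4):                      # 4 down-sampling blocks, kernel 1 x 3 x 3
    in_plane = valid_conv_size(in_plane, 3)
    depth = valid_conv_size(depth, 1)   # kernel size 1 leaves the z-axis unchanged
    sizes.append((depth, in_plane, in_plane))
for _ in range(4):                      # 4 up-sampling blocks, kernel 1 x 3 x 3
    in_plane = transposed_conv_size(in_plane, 3)
    depth = transposed_conv_size(depth, 1)
    sizes.append((depth, in_plane, in_plane))

print([s[1] for s in sizes])  # [70, 68, 66, 64, 62, 64, 66, 68, 70]
```

Under the stride-1 assumption the output returns to 50 × 70 × 70, matching the stated input and output dimensions.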

E. Optimization and Training

The proposed network was trained in a supervised manner with the MLEM-reconstructed images post-processed using iY as the reference. The objective function used to optimize all the trained networks in this study includes MAE and SSIM [33]. MAE loss can be formulated as:

ℓ_{MAE} (Y, X) = \frac{1}{N_{b} W H D} \sum_{i = 1}^{N_{b}} ‖ Y_{i} - X_{i} ‖_{1},

(5)

where $N_{b}$ is the input batch size. $W$ = 70, $H$ = 70, and $D$ = 50 are the width, height, and depth of the input/output image volumes, respectively. $X$ and $Y$ represent the output image volumes and the corresponding training labels, respectively.

SSIM measures the structural similarity between two images, and it equals 1 when the two images are identical. In this work, the convolutional window used to calculate SSIM is set as 11 × 11. The SSIM formula is expressed as:

SSIM (Y, X) = \frac{(2 μ_{Y} μ_{X} + C_{1}) (2 σ_{Y X} + C_{2})}{(μ_{Y}^{2} + μ_{X}^{2} + C_{1}) (σ_{Y}^{2} + σ_{X}^{2} + C_{2})}

(6)

where $C_{1} = (0.01 \cdot R)^{2}$ and $C_{2} = (0.03 \cdot R)^{2}$ are constants to stabilize the ratios when the denominator is too small. $R$ stands for the dynamic range of pixel values. $μ_{Y}$ , $μ_{X}$ , $σ_{Y}^{2}$ , $σ_{X}^{2}$ and $σ_{Y X}$ are the means of $Y$ and $X$ , the variances of $Y$ and $X$ , and the covariance between $Y$ and $X$ , respectively. Since the network output and the training label are both 3D image volumes, we took advantage of that by including the SSIM values along 3 planes (transverse, coronal, and sagittal) into the objective function for network optimization. The SSIM loss used to optimize the network parameters is expressed as: $ℓ_{SSIM} = 1 - S S I M (Y, X)$ .
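A simplified sketch of (6) and the three-plane SSIM loss follows. For brevity, each 2D slice is treated as a single window (global statistics) rather than using the sliding 11 × 11 window described above, so the values differ from a windowed SSIM. The function names, the toy volumes, and the choice of $R$ as the range of the label are assumptions for illustration.

```python
import numpy as np

def global_ssim(y, x, dynamic_range):
    """SSIM of (6) computed from whole-image statistics (no sliding window)."""
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    mu_y, mu_x = y.mean(), x.mean()
    var_y, var_x = y.var(), x.var()
    cov = ((y - mu_y) * (x - mu_x)).mean()
    num = (2 * mu_y * mu_x + c1) * (2 * cov + c2)
    den = (mu_y ** 2 + mu_x ** 2 + c1) * (var_y + var_x + c2)
    return num / den

def ssim_loss_three_planes(y, x, dynamic_range):
    """1 - SSIM, averaged over slices of each plane, with equal weight per plane."""
    per_plane = []
    for axis in range(3):
        scores = [global_ssim(np.take(y, i, axis=axis), np.take(x, i, axis=axis), dynamic_range)
                  for i in range(y.shape[axis])]
        per_plane.append(np.mean(scores))
    return 1.0 - float(np.mean(per_plane))

rng = np.random.default_rng(0)
label = rng.uniform(0.0, 100.0, size=(6, 7, 8))            # stand-in training label Y
output = label + rng.normal(0.0, 10.0, size=label.shape)   # stand-in network output X
R = label.max() - label.min()                              # assumed dynamic range

print(ssim_loss_three_planes(label, label, R), ssim_loss_three_planes(label, output, R))
```

The loss is zero for identical volumes and grows as the output deviates from the label.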

To emphasize the edges of different organs, the Sobel operator (SO) [34] is used to obtain edge images from the network output and the training label. The SO uses two separable filters to produce two gradient vector maps at each 2D spatial location. The MAEs between the gradient vector maps from the network output and the training label are included as part of the objective function. Similar to the SSIM calculations, the SO is applied along 3 planes, and the corresponding MAEs are included in the objective function. The MAE loss calculated from the gradient vector maps can be expressed as:

ℓ_{SO} (Y, X) = MAE (SO (X) - SO (Y))

(7)
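The Sobel-based loss in (7) can be sketched with array slicing. This version evaluates the two Sobel filters only on the interior of each slice (no padding) and averages the MAEs of both gradient maps over the 3 planes with equal weight; the boundary handling and the function names are assumptions, not details from the paper.

```python
import numpy as np

def sobel_last_two_axes(v):
    """Sobel gradients over the last two axes of v, on the unpadded interior."""
    # Difference along the last axis, [1, 2, 1] smoothing along the second-to-last.
    gx = ((v[..., :-2, 2:] + 2 * v[..., 1:-1, 2:] + v[..., 2:, 2:])
          - (v[..., :-2, :-2] + 2 * v[..., 1:-1, :-2] + v[..., 2:, :-2]))
    # Difference along the second-to-last axis, [1, 2, 1] smoothing along the last.
    gy = ((v[..., 2:, :-2] + 2 * v[..., 2:, 1:-1] + v[..., 2:, 2:])
          - (v[..., :-2, :-2] + 2 * v[..., :-2, 1:-1] + v[..., :-2, 2:]))
    return gx, gy

def sobel_loss(y, x):
    """MAE between Sobel gradient maps of y and x, averaged over the 3 planes."""
    losses = []
    for axis in range(3):
        ys = np.moveaxis(y, axis, 0)    # slices of this plane are indexed by axis 0
        xs = np.moveaxis(x, axis, 0)
        for grad_label, grad_out in zip(sobel_last_two_axes(ys), sobel_last_two_axes(xs)):
            losses.append(np.abs(grad_label - grad_out).mean())
    return float(np.mean(losses))

rng = np.random.default_rng(0)
label = rng.uniform(0.0, 100.0, size=(6, 7, 8))
output = label + rng.normal(0.0, 10.0, size=label.shape)
print(sobel_loss(label, label), sobel_loss(label, output))
```

Because the Sobel filters respond only to intensity changes, a constant offset between output and label leaves this term unchanged, which is why it complements rather than replaces the MAE term.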

Since the GE Discovery NM Alcyone scanner used in this work has a FOV smaller than the reconstructed matrix, features inside and outside the FOV share the same weights in the SSIM and MAE calculations. This is not ideal because features inside the FOV should be more clinically relevant. As the anatomical information is available from CECT, and IMBV is a more relevant image quality metric before and after PVC, we took advantage of that by including the IMBV values into the objective function. IMBV loss function can be formulated as:

ℓ_{IMBV} (Y, X) = ∣ IMBV (X) - IMBV (Y) ∣

(8)

IMBV values were calculated using the manually-segmented organ templates. Note that IMBV calculations were required only during the training stage. At the testing stage, no segmented template was needed.

SSIM values and MAE for gradient vector maps along 3 planes have the same weight in the composite loss function, which can be formulated as:

min_{θ_{N e t}} L = ℓ_{MAE} (Y, X) + λ_{a} ℓ_{SSIM} (Y, X) + λ_{b} ℓ_{SO} (Y, X) + λ_{c} ℓ_{IMBV} (Y, X)

(9)

where $λ_{a}$ = 0.8, $λ_{b}$ = 0.1, $λ_{c}$ = 0.1 are hyper-parameters used to balance different loss functions. $θ_{N e t}$ represents all the trainable parameters in the network.
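The weighting in (9) can be illustrated with a small numeric example. The MAE and IMBV terms are computed directly from toy volumes and templates, while the SSIM and Sobel terms are passed in as precomputed numbers; all values and function names below are hypothetical.

```python
import numpy as np

def imbv(vol, myo, blp):
    """IMBV as in (1): mean myocardial activity over mean blood-pool activity."""
    return vol[myo].mean() / vol[blp].mean()

def composite_loss(y, x, myo, blp, l_ssim, l_so, lam_a=0.8, lam_b=0.1, lam_c=0.1):
    """L = l_MAE + lam_a * l_SSIM + lam_b * l_SO + lam_c * l_IMBV, as in (9)."""
    l_mae = np.abs(y - x).mean()                            # (5) for a single volume
    l_imbv = abs(imbv(x, myo, blp) - imbv(y, myo, blp))     # (8)
    return l_mae + lam_a * l_ssim + lam_b * l_so + lam_c * l_imbv

# Toy label and output (hypothetical values).
blp = np.zeros((4, 4, 4), dtype=bool)
myo = np.zeros((4, 4, 4), dtype=bool)
blp[1:3, 1:3, 1:3] = True
myo[0, :, :] = True
label = np.zeros((4, 4, 4))
label[blp], label[myo] = 100.0, 20.0     # IMBV = 0.2
output = np.zeros((4, 4, 4))
output[blp], output[myo] = 80.0, 30.0    # residual spill-over, IMBV = 0.375

loss = composite_loss(label, output, myo, blp, l_ssim=0.1, l_so=0.2)
print(loss)
```

Here the MAE term is 5.0 and the IMBV term is 0.175, so with the stated weights the total is approximately 5.0 + 0.08 + 0.02 + 0.0175 = 5.1175. The organ templates enter only through the IMBV term, which is why they are needed during training but not at test time.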

The Adam method [35] was used to optimize all the trainable parameters in the network with 2 exponential decay rates $β_{1}$ = 0.9 and $β_{2}$ = 0.999. The Xavier method [36] was used to initialize all the convolutional kernel weights. All bias terms were initialized to 0. Among all the 28 canine studies, 15 were used for network training, 3 were used for validation, and the remaining 10 were used for network testing. Since end-diastolic and end-systolic gates are available for all the canine studies, there are a total of 30, 6, and 20 image volumes for network training, validation, and testing, respectively. Due to the limited amount of training data, the 30 training image volumes were augmented by rotating each image volume in 30° steps over the interval (0°, 360°) along all 3 planes, resulting in a total of 30×11×3+30 = 1,020 image volumes for network training. Batch size $N_{b}$ = 6 was used.
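The data split and the augmentation count follow from simple arithmetic, sketched below with hypothetical variable names.

```python
# Study split stated in the text; each study contributes 2 gates (end-diastole, end-systole).
studies = {"train": 15, "validation": 3, "test": 10}
gates_per_study = 2
volumes = {name: n * gates_per_study for name, n in studies.items()}
print(volumes)  # {'train': 30, 'validation': 6, 'test': 20}

# Rotations in 30-degree steps strictly inside (0, 360): 30, 60, ..., 330.
angles = list(range(30, 360, 30))
n_planes = 3
augmented_total = volumes["train"] * len(angles) * n_planes + volumes["train"]
print(len(angles), augmented_total)  # 11 1020
```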

F. Evaluations

In this work, image quality was quantitatively evaluated using SSIM, root-mean-squared-error (RMSE), peak signal-to-noise ratio (PSNR), and IMBV. One-way analysis of variance (ANOVA) and the Tukey multiple comparison test [37] were used to evaluate the statistical significance in this study. A p-value of p < 0.05 indicates statistical significance. Paired samples were assumed in the statistical tests. After PVC, the spill-over effects should be corrected, leading to a higher $M_{l v - b l p}$ and lower $M_{m y o}$ . Hence, lower IMBV values typically represent better image quality in this work. The IMBV values obtained from iY-PVC results served as the gold standard for comparison.
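For reference, RMSE and PSNR can be computed as in the following sketch. The text does not state which peak value is used for PSNR, so `peak` is left as an explicit argument; the function names and toy values are hypothetical.

```python
import numpy as np

def rmse(y, x):
    """Root-mean-squared error between reference y and image x."""
    return float(np.sqrt(np.mean((y - x) ** 2)))

def psnr(y, x, peak):
    """Peak signal-to-noise ratio in dB; `peak` is an assumed peak value of the reference."""
    return float(20.0 * np.log10(peak / rmse(y, x)))

reference = np.full((4, 4, 4), 10.0)
image = np.full((4, 4, 4), 8.0)
print(rmse(reference, image), psnr(reference, image, peak=20.0))
```

In this toy case the RMSE is 2 and, with a peak of 20, the PSNR is 20 dB.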

Six networks were trained in this work to evaluate the effectiveness of all the network components and the proposed IMBV-derived loss function $ℓ_{IMBV}$ . Six networks include: (1) a U-net (network depicted in Fig. 2 without the dynamic mechanism); (2) a Dynamic U-net (U-net with the proposed multi-dimensional dynamic mechanism but without $ℓ_{IMBV}$ , denoted as Dy-U); (3) a Dynamic U-net with CECT as the second-channel input but without $ℓ_{IMBV}$ (denoted as Dy-U-CT); (4) a Dynamic U-net with $ℓ_{IMBV}$ as an additional loss function (denoted as Dy-U-BV); (5) a Dynamic U-net with CECT and $ℓ_{IMBV}$ as an additional loss function (denoted as Dy-U-CT-BV); (6) a densely-connected dynamic U-net with $ℓ_{IMBV}$ as an additional loss function (denoted as DC-Dy-U-BV). Note that dynamic networks (2)-(5) did not implement the DC mechanism and the three attention values were obtained using features only from the previous layer. For networks with CECT as the second-channel input, CECT images were resized to SPECT dimension before concatenation.

It is expected that networks with dynamic mechanism would produce better results than the network without it, and networks with $ℓ_{IMBV}$ would produce images with better IMBV quantification than the networks without $ℓ_{IMBV}$ . Images post-processed with iY served as the reference in this work.

III. Results

In this section, various network components are progressively added to demonstrate their effectiveness for cardiac SPECT PVC. Numbers are presented as MEAN ± STD.

A. Results on Dynamic U-net

In this sub-section, the first 3 networks (U-net, Dy-U, and Dy-U-CT) are compared. Both U-net and Dy-U used only the SPECT image data for PVC without CECT and segmented organ templates. Dy-U-CT concatenated CECT and SPECT data together as input without segmented organ templates. Since CECT served as the prior knowledge in the iY-PVC method, we expected Dy-U-CT to capture more anatomical information from CECT and to perform better than the networks without CECT.

One sample canine study was selected and presented in Fig. 3. All three deep learning methods produced images with reduced PVEs. Especially in the coronal slice, the contour of the left-ventricular blood-pool is fused with the myocardium region and is not clearly visible due to spill-over effects. The deep learning methods significantly improved reconstruction results, and the contours of each organ became more aligned with the CECT image. In addition to visual observations in the figure, the lower IMBV values and the profile plots (Fig. 4) demonstrated that networks with dynamic mechanisms produced even better results than the U-net. As indicated by the blue arrows in the transverse slice in Fig. 3, U-net did not perform well at the septal wall between the right and left ventricular blood-pools. Using the same network design with the proposed multi-dimensional dynamic mechanism, Dy-U demonstrated superior performance over U-net. Dy-U-CT produced images with better estimations of the blood-pool region than Dy-U did (green arrows in the transverse slice of Fig. 3). For this study, images reconstructed using Dy-U-CT have lower IMBV values than iY-PVC images, even without segmented organs.

Fig. 4. — Profile plots along the dashed blue lines in Fig. 3. The upper and lower plots correspond to the images on the first and second rows of Fig. 3, respectively. Arrows with the same color point to the same regions in the images presented in Fig. 3.

The average IMBV values across all 20 testing image volumes are 0.209 ± 0.042, 0.193 ± 0.038, 0.185 ± 0.037, 0.172 ± 0.033, and 0.167 ± 0.033 for Non-PVC, U-net, Dy-U, Dy-U-CT, and iY-PVC results, respectively. This downward trend is consistent with our expectation. Statistically significant differences were observed between all 5 groups (p < 0.001), except for the difference between IMBV values obtained from iY-PVC and Dy-U-CT, which had a p-value of p = 0.152. The lack of a statistically significant difference from iY-PVC suggests that using CECT as network input is effective. Registered CECT images fused with the iY-PVC SPECT images are also included in Fig. 3.

B. Results on Dynamic U-net with IMBV Loss Function

Dy-U and Dy-U-CT were re-trained with $ℓ_{IMBV}$ as an additional loss function (denoted as Dy-U-BV and Dy-U-CT-BV, respectively) to demonstrate the effectiveness of the proposed IMBV-derived loss function. Ideally, using $ℓ_{IMBV}$ would force the network to concentrate more on the cardiac region and produce images with IMBV measurements closer to those of iY-PVC images than images generated by the networks without $ℓ_{IMBV}$. With the additional $ℓ_{IMBV}$, the average IMBV values of the 20 testing canine studies were 0.174 ± 0.036 and 0.166 ± 0.033 for Dy-U-BV and Dy-U-CT-BV, respectively. Compared with iY-PVC, the p-values were p = 0.009 and p = 0.986 for Dy-U-BV and Dy-U-CT-BV, respectively. Another canine study was selected and presented in Fig. 5. For the SPECT-only networks, the network with $ℓ_{IMBV}$ (Dy-U-BV) generated images with better reconstructed anatomical structure of the heart (blue arrows in Fig. 5) and reduced PVEs (green arrows in Fig. 5) compared with Dy-U. The networks using CECT as prior knowledge (Dy-U-CT and Dy-U-CT-BV) produced visually similar reconstructions, despite the improved quantitative accuracy of Dy-U-CT-BV.

Fig. 5. — Coronal slices of a canine study reconstructed using different methods. Blue arrows point to anatomical features of the heart. Corresponding IMBV values are also included in the figure.

C. Results on Densely-connected Dynamic U-net

Using the multi-dimensional dynamic mechanism and $ℓ_{IMBV}$ but without the DC mechanism, the proposed Dy-U-CT-BV network already produces images whose IMBV quantification shows no statistically significant difference from iY-PVC. However, most available GE Alcyone scanners installed globally do not have an integrated CT system, and an additional CT scan also results in a higher radiation dose to patients. Hence, a SPECT-only network has a higher potential for clinical translation. With the proposed DC-Dy mechanism and $ℓ_{IMBV}$, the DC-Dy-U-BV network also produced images whose IMBV measurements were not statistically significantly different from those of the iY method (0.169 ± 0.034, p = 0.946) using only the SPECT data. On the other hand, without the DC mechanism, the average IMBV value produced by the Dy-U-BV network is 0.174 ± 0.036, with p = 0.009 when compared with iY. One sample canine study was selected and presented in Fig. 6. The network with the DC-Dy mechanism produced images that are more aligned with the iY images.

Fig. 6. — Coronal slices of a canine study reconstructed using different methods. A profile plot along the dashed blue lines, illustrating the partial volume effects, is presented in Fig. 7. Blue arrows point to regions with noticeable spill-over effects. Corresponding IMBV values are also included in the figure.

D. Additional Quantitative Evaluations

To compare results across all 20 testing canine studies, 7 Bland-Altman plots and 7 linear fitting plots were generated from the IMBV values of all the testing canine studies, using the values obtained from iY-PVC results as the reference. They are presented in Figs. 8 and 9, where the effectiveness of the different network components can be clearly observed.

Fig. 8. — Bland-Altman plots for comparing non-PVC and deep learning methods. Results obtained from iY-PVC were used as the reference. IMBV values were used as the data points in the plots.
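As a reading aid for the Bland-Altman comparison, the quantities such a plot conventionally summarizes can be sketched as follows. This is a minimal illustration, assuming the usual definition (mean difference as bias, bias ± 1.96 sample standard deviations as limits of agreement); the function name `bland_altman` and the IMBV values are hypothetical and are not data from this study.

```python
from statistics import mean, stdev


def bland_altman(method, reference):
    """Return (bias, lower_limit, upper_limit) for paired measurements."""
    if len(method) != len(reference):
        raise ValueError("paired inputs must have the same length")
    # Per-study difference between the evaluated method and the reference.
    diffs = [m - r for m, r in zip(method, reference)]
    bias = mean(diffs)
    # 1.96 * sample standard deviation gives the conventional 95% limits.
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical IMBV values for illustration only (not data from the paper).
reference_imbv = [0.15, 0.17, 0.20, 0.16]
network_imbv = [0.16, 0.17, 0.22, 0.17]

bias, lower, upper = bland_altman(network_imbv, reference_imbv)
print(f"bias={bias:.4f}, limits of agreement=[{lower:.4f}, {upper:.4f}]")
```

A bias close to zero with narrow limits would indicate that a method agrees closely with the reference across studies.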

Fig. 9. — Linear fitting plots for comparing non-PVC and deep learning methods. Results obtained from iY-PVC were used as the reference. IMBV values were used as the data points in the plots. Coefficient of determination ( $R^{2}$ ) and Pearson correlation coefficient (Corr. Coef.) are included in the linear fitting plots.
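The two statistics reported in the linear fitting plots can be computed as in the sketch below. It assumes an ordinary least-squares fit with an intercept, in which case $R^{2}$ equals the square of the Pearson correlation coefficient; whether the plots were produced exactly this way is not stated here. The helper name `linear_fit_stats` and the sample values are hypothetical.

```python
from math import sqrt
from statistics import mean


def linear_fit_stats(x, y):
    """Least-squares line y = slope * x + intercept, plus Pearson r and R^2."""
    if len(x) != len(y):
        raise ValueError("paired inputs must have the same length")
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / sqrt(sxx * syy)
    # For simple linear regression with an intercept, R^2 equals r squared.
    return slope, intercept, r, r * r


# Hypothetical IMBV values for illustration only (not data from the paper).
reference_imbv = [0.15, 0.17, 0.20, 0.16]
network_imbv = [0.16, 0.17, 0.22, 0.17]

slope, intercept, r, r_squared = linear_fit_stats(reference_imbv, network_imbv)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, r={r:.3f}, R^2={r_squared:.3f}")
```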

A box plot summarizing the IMBV values of all the networks is also presented in Fig. 10. After applying $ℓ_{IMBV}$ and using CECT as the second-channel input, no statistical difference was observed for IMBV quantification between Dy-U-CT-BV results and the iY-PVC results (p = 0.986). Without CECT as network input, the p-value between Dy-U-BV and iY-PVC results was p = 0.0089. The difference between Dy-U-CT and iY-PVC was also not significant (p = 0.153). With only the SPECT data, the network with the proposed DC-Dy mechanism and $ℓ_{IMBV}$ (DC-Dy-U-BV) also yields IMBV quantification similar to that of iY (0.169 ± 0.034, p = 0.946).

In addition to IMBV calculations, other image quality measurements, including SSIM, PSNR, and RMSE, are presented in Table I. Using the same loss function and training strategy, Dy-U benefited from the proposed dynamic convolution and outperformed the U-net with superior quantitative measurements (p = 0.003, p = 0.018, p = 0.039, and p < 0.0001 for SSIM, RMSE, PSNR, and IMBV, respectively).
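For readers less familiar with these metrics, RMSE and PSNR can be sketched as below (SSIM is omitted because it needs a windowed implementation). The peak intensity of 1.0 is an assumption about image normalization that this section does not state, and the function names and voxel values are hypothetical.

```python
from math import log10, sqrt


def rmse(image, reference):
    """Root-mean-square error between two equally sized, flattened images."""
    if len(image) != len(reference):
        raise ValueError("images must have the same number of voxels")
    squared = sum((a - b) ** 2 for a, b in zip(image, reference))
    return sqrt(squared / len(image))


def psnr(image, reference, peak=1.0):
    """PSNR in dB; `peak` is the assumed maximum possible intensity."""
    err = rmse(image, reference)
    if err == 0:
        return float("inf")  # identical images
    return 20.0 * log10(peak / err)


# Hypothetical flattened voxel values for illustration only.
reference_img = [0.0, 0.5, 1.0, 0.5]
corrected_img = [0.1, 0.5, 0.9, 0.5]

print(f"RMSE={rmse(corrected_img, reference_img):.4f}")
print(f"PSNR={psnr(corrected_img, reference_img):.2f} dB")
```

Because PSNR is a monotone function of RMSE for a fixed peak, the two metrics rank methods identically on any single image pair; averaged over studies they can differ slightly.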

TABLE I.

Quantitative assessment of different methods (MEAN ± STD). For each metric, the best result is marked in red, and the second best result is marked in blue. The measurements were obtained by averaging the values on the 20 testing canine studies. Slices that do not contain the heart are excluded from calculations. For IMBV calculations, numbers that are closest to the iY-PVC results are considered the best.

	Non-PVC	U-net	Dy-U	Dy-U-BV	DC-Dy-U-BV	Dy-U-CT	Dy-U-CT-BV	iY-PVC
IMBV	0.209 ± 0.042	0.193 ± 0.038	0.185 ± 0.037	0.174 ± 0.036	0.169 ± 0.034	0.172 ± 0.033	0.166 ± 0.033	0.167 ± 0.033 (Reference)
PSNR	35.552 ± 2.095	37.130 ± 2.485	37.641 ± 2.803	37.686 ± 2.642	37.748 ± 2.804	40.319 ± 3.598	40.434 ± 3.662	╲
SSIM	0.967 ± 0.016	0.979 ± 0.014	0.9822 ± 0.0155	0.983 ± 0.015	0.9824 ± 0.015	0.9873 ± 0.0166	0.9875 ± 0.0166	╲
RMSE	0.017 ± 0.006	0.015 ± 0.006	0.0139 ± 0.0062	0.0138 ± 0.0060	0.0137 ± 0.0061	0.0107 ± 0.0067	0.0106 ± 0.0067	╲


Dy-U-CT, the network using CECT as additional input, produced better quantitative results than Dy-U did (p < 0.0001 for SSIM, RMSE, PSNR, and IMBV).

With $ℓ_{IMBV}$ as an additional loss function, both Dy-U-BV and Dy-U-CT-BV produced images with more accurate IMBV measurements compared with the networks without $ℓ_{IMBV}$ (p < 0.0001 for both networks when compared with their counterparts).

As presented in Table I, compared with the networks without $ℓ_{IMBV}$ , networks with $ℓ_{IMBV}$ produced images with superior IMBV measurements but similar SSIM, PSNR, and RMSE values. We believe such a small difference in the measured values will not affect overall image quality, especially in clinical settings. The DC mechanism also resulted in similar SSIM, PSNR, and RMSE values compared with its counterpart (Dy-U-BV) but superior IMBV measurements. We believe the improvements in IMBV quantification are more meaningful in clinical settings, especially for the problem targeted in this work.

E. Additional Ablation Studies

Two additional ablated networks were trained to demonstrate the effectiveness of IMBV as part of the loss function and CECT as a second-channel input to the network. These two networks are denoted as U-net-BV (U-net with IMBV as an additional loss function) and U-net-CT (U-net with CECT as a second-channel input). Corresponding quantitative measurements are presented in Table II.

TABLE II.

Quantitative assessment on different methods (MEAN ± STD). The measurements were obtained by averaging the values on the 20 testing canine studies. Slices that do not contain the heart are excluded from calculations.

	U-net-BV	U-net-CT
IMBV	0.180 ± 0.034	0.177 ± 0.033
PSNR	36.903 ± 2.383	38.980 ± 3.214
SSIM	0.980 ± 0.014	0.982 ± 0.015
RMSE	0.015 ± 0.006	0.012 ± 0.006


Similar to the dynamic networks, with IMBV as an additional loss function, U-net-BV produced images with superior IMBV quantification (p < 0.0001) but with similar SSIM, PSNR, and RMSE values, compared with the U-net.

With CECT images as additional prior knowledge, U-net-CT produced images with superior quantitative measurements compared with U-net (p < 0.02 for SSIM, RMSE, PSNR, and IMBV). However, its dynamic counterpart (Dy-U-CT) still outperformed U-net-CT across all the chosen metrics (p < 0.05 for SSIM, RMSE, PSNR, and IMBV), which again demonstrated the effectiveness of dynamic convolutions in the case of cardiac SPECT PVC.

IV. Discussion

Ischemic heart disease (IHD) is considered a leading cause of mortality globally, accounting for more than 9 million deaths in 2016 [38]. IHD can be divided into obstructive coronary artery disease (CAD) and/or CMVD. CAD affects larger epicardial arteries (>500 μm), while CMVD affects smaller ones (<500 μm). CMVD exists in a large proportion of patients with IHD and/or other cardiovascular diseases, and it may or may not co-exist with CAD [25]. IMBV could serve as a novel index for micro-vascular function to diagnose CMVD independent of CAD. Our previous work [9] showed that SPECT imaging using ^99mTc-RBCs could be implemented as a non-invasive imaging technique for IMBV quantification. However, physical limitations of SPECT imaging result in PVEs and compromise the quantitative accuracy. We previously demonstrated that SPECT images reconstructed with iY significantly improved IMBV quantification [9].

However, iY is not practical in routine use because of its tedious image registration and segmentation steps. Additional radiation dose introduced from CT scans with or without contrast administration is also not ideal in human studies. In this work, we proposed a deep-learning technique to perform fast and consistent PVC without requiring segmented organ templates from CECT. The proposed network features a U-net-like structure, the DC-Dy convolution, and an IMBV-derived loss function. Conventional convolution-based networks learn static convolutional kernels throughout the network. After training, the same set of convolutional kernels is applied to all the testing images, which is not ideal due to the large variability between imaging subjects. In this work, the proposed DC-Dy convolution allows the network to have adaptive convolutional kernels based on the input data to capture more informative contextual features for better performance and compensate for the variances between imaging subjects. Specifically, the DC-Dy mechanism in the network aims to assign different weights to the convolutional kernels along different dimensions for each input image volume. By doing so, the convolutional kernels in the proposed network can be adjusted even after the network is fully trained. Compared with the recently proposed omni-dimensional dynamic convolution [19], the proposed DC-Dy mechanism allows the network to learn the attention values based on features not only from the previous layer, but also from all the preceding layers, without introducing additional parameters. This newly proposed mechanism further strengthens the adaptive capability of the network and improves IMBV quantification. To the best of our knowledge, this is the first work to investigate dynamic networks in the medical imaging field.
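The idea that kernels remain input-dependent after training can be illustrated with a deliberately simplified toy. The sketch below is not the DC-Dy mechanism: it is one-dimensional, applies attention only over the kernel-number dimension, and has no dense connections. The candidate kernels, the gate weights, and all function names are hypothetical stand-ins for trained parameters.

```python
from math import exp


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    shift = max(values)
    exps = [exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_kernel(signal, candidate_kernels, gate_weights):
    """Blend fixed candidate kernels with attention computed from the input."""
    pooled = sum(signal) / len(signal)  # global average pooling of the input
    attention = softmax([w * pooled for w in gate_weights])
    size = len(candidate_kernels[0])
    return [
        sum(a * k[i] for a, k in zip(attention, candidate_kernels))
        for i in range(size)
    ]


def conv1d_valid(signal, kernel):
    """Cross-correlation without padding, as used in deep-learning layers."""
    n = len(signal) - len(kernel) + 1
    return [
        sum(signal[i + j] * kernel[j] for j in range(len(kernel)))
        for i in range(n)
    ]


# Hypothetical "trained" parameters: a smoothing and a sharpening kernel.
kernels = [[1 / 3, 1 / 3, 1 / 3], [-1.0, 3.0, -1.0]]
gates = [2.0, -2.0]

low_signal = [0.1, 0.1, 0.2, 0.1, 0.1]
high_signal = [1.9, 2.0, 2.2, 2.0, 1.9]

# Same fixed parameters, different effective kernel for each input.
kernel_low = dynamic_kernel(low_signal, kernels, gates)
kernel_high = dynamic_kernel(high_signal, kernels, gates)
print(kernel_low, conv1d_valid(low_signal, kernel_low))
print(kernel_high, conv1d_valid(high_signal, kernel_high))
```

In this toy, the parameters (`kernels`, `gates`) never change, yet the kernel actually applied differs between the two inputs, which is the property the paragraph above describes.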

The proposed network with both SPECT and CECT (without segmented organ templates) was validated with 28 canine studies, and produced images with comparable IMBV measurements to the iY-PVC images (p = 0.986). Since most installed GE Alcyone scanners are SPECT-only systems, the use of co-registered CECT images is not as easily accomplished. We also demonstrated that using only the SPECT data, the network with DC-Dy mechanism and $ℓ_{IMBV}$ (DC-Dy-U-BV) produced images with comparable IMBV measurements to the iY-PVC images (p = 0.946).

Note that the iY-PVC method assumes perfect alignment between SPECT images and segmented organ masks. In this work, to ensure perfect alignment, CECT images were acquired with retrospective ECG gating during end-expiration. Such a procedure is slow and introduces additional radiation. To demonstrate that the image quality degrades severely in the case of imperfect registration, iY-PVC was performed on a canine study (acquired in the end-diastolic gate) using CECT from another cardiac gate (the end-systolic gate). One transverse slice was selected and presented in Fig. 11. As pointed out by the blue arrows in Fig. 11, when the CECT is not well-aligned with the SPECT images, undesired artifacts are introduced in the reconstructed images. As indicated by the IMBV values, imperfectly registered CECT also negatively affects quantitative accuracy. The proposed deep-learning network overcame this limitation as CECT and segmented organ templates were not required. With only the SPECT data as input, the proposed network demonstrated the potential for accurate IMBV quantification without CECT information and alleviates the concern of any SPECT-CT mismatch.

Fig. 11. — A transverse slice of a canine study reconstructed using iY with CECT from a different cardiac gate to demonstrate the limitations of iY. Blue arrows point to undesired artifacts introduced by imperfect registration between CECT and SPECT.

In this work, the images acquired with ^99mTc-RBCs are used for IMBV quantification. However, superior SSIM or MAE values do not directly relate to better IMBV measurements. We therefore proposed to incorporate the IMBV-derived metric into the overall loss function for network training. The network can then concentrate more on the cardiac region to produce images with more accurate IMBV quantification for potentially better clinical results. It is worth noting that the dynamic network with $ℓ_{IMBV}$ (Dy-U-BV) has similar performance to the dynamic network with CECT (Dy-U-CT) in terms of IMBV quantification, which demonstrated the effectiveness of $ℓ_{IMBV}$. We believe that similar ideas incorporating clinical quantification measurements into loss functions could be implemented for a wide range of clinical problems and/or other imaging modalities/tracers.
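One way to picture a loss term derived from a clinical index is sketched below. It rests on two assumptions that this section does not confirm: that IMBV is the ratio of mean myocardial activity to mean blood-pool activity, and that the loss term is the absolute IMBV difference between the network output and the training label. The masks, values, and function names are hypothetical, and an actual training loss would be written with differentiable operations in a deep-learning framework.

```python
def mean_in_mask(volume, mask):
    """Mean of the voxel values selected by a boolean mask (flattened inputs)."""
    selected = [v for v, m in zip(volume, mask) if m]
    if not selected:
        raise ValueError("mask selects no voxels")
    return sum(selected) / len(selected)


def imbv(volume, myocardium_mask, blood_pool_mask):
    """Assumed definition: mean myocardial / mean blood-pool activity."""
    myocardium = mean_in_mask(volume, myocardium_mask)
    blood_pool = mean_in_mask(volume, blood_pool_mask)
    return myocardium / blood_pool


def imbv_loss(prediction, label, myocardium_mask, blood_pool_mask):
    """Assumed loss term: absolute IMBV difference, output versus label."""
    predicted = imbv(prediction, myocardium_mask, blood_pool_mask)
    target = imbv(label, myocardium_mask, blood_pool_mask)
    return abs(predicted - target)


# Hypothetical four-voxel example: two blood-pool voxels, two myocardial voxels.
blood_mask = [True, True, False, False]
myo_mask = [False, False, True, True]
label_img = [1.0, 1.0, 0.2, 0.2]       # sharp label image, IMBV = 0.2
predicted_img = [0.9, 0.9, 0.3, 0.3]   # residual spill-over raises the ratio

loss_value = imbv_loss(predicted_img, label_img, myo_mask, blood_mask)
print(f"IMBV loss term: {loss_value:.4f}")
```

Under these assumptions, masks would be needed only to compute the loss during training, and spill-over from the blood pool into the myocardium would raise the ratio and therefore the loss.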

The proposed multi-dimensional dynamic convolution is not limited to PVC for cardiac SPECT. Given its dynamic nature, the proposed dynamic mechanism could be implemented in various neural networks for medical image reconstruction across different imaging modalities to achieve better performance.

The proposed loss function $ℓ_{IMBV}$ and the DC mechanism produced images with statistically better IMBV quantification but did not lead to noticeable improvements in SSIM, RMSE, and PSNR measurements. We suspect this may be because IMBV calculations focus only on the heart, while the other metrics are based on the entire image volumes. Also, SSIM, RMSE, and PSNR were calculated using the iY-PVC images as the gold standard. However, iY is not a perfect method for PVC as it relies on numerous assumptions [3]. Thus, we believe IMBV is a more relevant metric in this work to evaluate the image quality before and after PVC.

The GE Discovery NM/CT 570c and 530c scanners are equipped with 19 CZT detectors mounted on an L-shaped arc that covers a nearly 180° range for stationary imaging. Therefore, these scanners essentially represent limited-view imaging systems with truncated projections. Our previous work [16] proposed a multi-angle reconstruction approach that acquires multi-angle projections with detector gantry rotations for improved image quality. However, obtaining multi-angle projections is not always feasible, and it complicates the acquisition and extends its duration. Combined corrections for both limited-view and partial volume artifacts will be incorporated in future studies to further improve the resolution and accuracy of reconstructions on scanners of this type.

V. Conclusion

In conclusion, we propose a deep-learning method for fast and robust PVC for cardiac SPECT imaging. The proposed network overcomes the limitations of the anatomical-guided PVC method. The network features densely-connected multi-dimensional dynamic convolutions that allow it to have adaptive convolutional kernels for each input image volume, even after the network is fully trained. The proposed network produced images with statistically comparable IMBV measurements to the gold-standard iY-PVC method, which demonstrates a strong potential for clinical implementation. However, the results presented in this paper are limited to canine studies. Further analysis is needed to evaluate the network performance in human studies for the diagnosis of CMVD. In our future studies, we also plan to validate the proposed method on other perfusion imaging tracers.

Fig. 7. — Profile plot along the dashed blue lines in Fig. 6. Blue arrows point to the same regions in the images presented in Fig. 6.

Acknowledgments

This work was supported by NIH grants R01HL154345, R01HL123949, T32HL098069, and S10RR025555.

Contributor Information

Huidong Xie, Department of Biomedical Engineering at Yale University.

Zhao Liu, Department of Radiology and Biomedical Imaging at Yale University.

Luyao Shi, Department of Biomedical Engineering at Yale University.

Kathleen Greco, Department of Radiology and Biomedical Imaging at Yale University.

Xiongchao Chen, Department of Biomedical Engineering at Yale University.

Bo Zhou, Department of Biomedical Engineering at Yale University.

Attila Feher, Department of Internal Medicine (Cardiology) at Yale University.

John C. Stendahl, Department of Internal Medicine (Cardiology) at Yale University.

Nabil Boutagy, Department of Pharmacology (Vascular Biology and Therapeutics) at Yale University.

Tassos C. Kyriakides, Department of Biostatistics at Yale University.

Ge Wang, Department of Biomedical Engineering at Rensselaer Polytechnic Institute.

Albert J. Sinusas, Department of Biomedical Engineering at Yale University; Department of Radiology and Biomedical Imaging at Yale University; Department of Internal Medicine (Cardiology) at Yale University.

Chi Liu, Department of Biomedical Engineering at Yale University; Department of Radiology and Biomedical Imaging at Yale University.

REFERENCES

[1] Bockisch A, Freudenberg LS, Schmidt D, and Kuwert T, “Hybrid Imaging by SPECT/CT and PET/CT: Proven Outcomes in Cancer Imaging,” Seminars in Nuclear Medicine, vol. 39, pp. 276–289, July 2009.
[2] Ritt P, Vija H, Hornegger J, and Kuwert T, “Absolute quantification in SPECT,” Eur J Nucl Med Mol Imaging, vol. 38, pp. 69–77, May 2011.
[3] Erlandsson K, Buvat I, Pretorius PH, Thomas BA, and Hutton BF, “A review of partial volume correction techniques for emission tomography and their applications in neurology, cardiology and oncology,” Phys Med Biol, vol. 57, pp. R119–159, Nov. 2012.
[4] Reader A, Julyan P, Williams H, Hastings D, and Zweit J, “EM algorithm system modeling by image-space techniques for PET reconstruction,” IEEE Transactions on Nuclear Science, vol. 50, pp. 1392–1397, Oct. 2003.
[5] Strul D and Bendriem B, “Robustness of Anatomically Guided Pixel-by-Pixel Algorithms for Partial Volume Effect Correction in Positron Emission Tomography,” J Cereb Blood Flow Metab, vol. 19, pp. 547–559, May 1999.
[6] Chan C, Liu H, Grobshtein Y, Stacy MR, Sinusas AJ, and Liu C, “Noise suppressed partial volume correction for cardiac SPECT/CT,” Medical Physics, vol. 43, no. 9, pp. 5225–5239, 2016.
[7] Liu H, Chan C, Grobshtein Y, Ma T, Liu Y, Wang S, Stacy MR, Sinusas AJ, and Liu C, “Anatomical-based partial volume correction for low-dose dedicated cardiac SPECT/CT,” Phys. Med. Biol, vol. 60, pp. 6751–6773, Aug. 2015.
[8] Yang J, Huang S, Mega M, Lin K, Toga A, Small G, and Phelps M, “Investigation of partial volume correction methods for brain FDG PET studies,” IEEE Transactions on Nuclear Science, vol. 43, pp. 3322–3327, Dec. 1996.
[9] Mohy-ud Din H, Boutagy NE, Stendahl JC, Zhuang ZW, Sinusas AJ, and Liu C, “Quantification of intramyocardial blood volume with 99mTc-RBC SPECT-CT imaging: A preclinical study,” J. Nucl. Cardiol, vol. 25, pp. 2096–2111, Dec. 2018.
[10] Liu Q, Mohy-ud Din H, Boutagy NE, Jiang M, Ren S, Stendahl JC, Sinusas AJ, and Liu C, “Fully automatic multi-atlas segmentation of CTA for partial volume correction in cardiac SPECT/CT,” vol. 62, no. 10, pp. 3944–3957. Publisher: IOP Publishing.
[11] Wang G, “A Perspective on Deep Imaging,” IEEE Access, vol. 4, pp. 8914–8924, 2016.
[12] Shan H, Zhang Y, Yang Q, Kruger U, Kalra MK, Sun L, Cong W, and Wang G, “3-D Convolutional Encoder-Decoder Network for Low-Dose CT via Transfer Learning From a 2-D Trained Network,” IEEE Transactions on Medical Imaging, vol. 37, pp. 1522–1534, June 2018.
[13] Xie H, Shan H, and Wang G, “Deep Encoder-Decoder Adversarial Reconstruction (DEAR) Network for 3D CT from Few-View Data,” Bioengineering, vol. 6, p. 111, Dec. 2019.
[14] Xie H, Shan H, Cong W, Liu C, Zhang X, Liu S, Ning R, and Wang G, “Deep Efficient End-to-End Reconstruction (DEER) Network for Few-View Breast CT Image Reconstruction,” IEEE Access, vol. 8, pp. 196633–196646, 2020.
[15] Aghakhan Olia N, Kamali-Asl A, Hariri Tabrizi S, Geramifar P, Sheikhzadeh P, Farzanefar S, Arabi H, and Zaidi H, “Deep learning–based denoising of low-dose SPECT myocardial perfusion images: quantitative assessment and clinical performance,” Eur J Nucl Med Mol Imaging, vol. 49, pp. 1508–1522, Apr. 2022.
[16] Xie H, Thorn S, Chen X, Zhou B, Liu H, Liu Z, Lee S, Wang G, Liu Y-H, Sinusas AJ, and Liu C, “Increasing angular sampling through deep learning for stationary cardiac SPECT image reconstruction,” J. Nucl. Cardiol, May 2022, DOI: 10.1007/s12350-022-02972-z.
[17] Ryden T, Essen M, Marin I, Svensson J, and Bernhardt P, “Deep learning generation of synthetic intermediate projections improves 177Lu SPECT images reconstructed with sparsely acquired projections,” Journal of Nuclear Medicine, Aug. 2020, DOI: 10.2967/jnumed.120.245548.
[18] Chen X, Zhou B, Xie H, Shi L, Liu H, Holler W, Lin M, Liu Y-H, Miller EJ, Sinusas AJ, and Liu C, “Direct and indirect strategies of deep-learning-based attenuation correction for general purpose and dedicated cardiac SPECT,” Eur J Nucl Med Mol Imaging, Feb. 2022.
[19] Li C, Zhou A, and Yao A, “Omni-dimensional dynamic convolution,” in International Conference on Learning Representations, 2022.
[20] Chen Y, Dai X, Liu M, Chen D, Yuan L, and Liu Z, “Dynamic Convolution: Attention over Convolution Kernels,” arXiv:1912.03458 [cs], Mar. 2020.
[21] Yang B, Bender G, Le QV, and Ngiam J, “CondConv: Conditionally Parameterized Convolutions for Efficient Inference,” arXiv:1904.04971 [cs], Sept. 2020.
[22] Huang G, Liu Z, van der Maaten L, and Weinberger K, “Densely Connected Convolutional Networks,” in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2261–2269, July 2017.
[23] Bocher M, Blevis I, Tsukerman L, Shrem Y, Kovalski G, and Volokh L, “A fast cardiac gamma camera with dynamic SPECT capabilities: design, system validation and future potential,” Eur J Nucl Med Mol Imaging, vol. 37, pp. 1887–1902, Oct. 2010.
[24] Gambhir SS, Berman DS, Ziffer J, Nagler M, Sandler M, Patton J, Hutton B, Sharir T, Haim SB, and Haim SB, “A Novel High-Sensitivity Rapid-Acquisition Single-Photon Cardiac Imaging Camera,” Journal of Nuclear Medicine, vol. 50, no. 4, pp. 635–643, 2009.
[25] Bradley C and Berry C, “Definition and epidemiology of coronary microvascular disease,” J. Nucl. Cardiol, May 2022.
[26] Chung KS and Nguyen PK, “Non-invasive measures of coronary microcirculation: Taking the long road to the clinic,” J. Nucl. Cardiol, vol. 25, pp. 2112–2115, Dec. 2018.
[27] Feher A and Sinusas AJ, “Quantitative Assessment of Coronary Microvascular Function,” Circulation: Cardiovascular Imaging, vol. 10, p. e006427, Aug. 2017. Publisher: American Heart Association.
[28] Feher A, Boutagy NE, Stendahl JC, Hawley C, Guerrera N, Booth CJ, Romito E, Wilson S, Liu C, and Sinusas AJ, “Computed Tomographic Angiography Assessment of Epicardial Coronary Vasore-activity for Early Detection of Doxorubicin-Induced Cardiotoxicity,” JACC: CardioOncology, vol. 2, pp. 207–219, June 2020.
[29] Shepp L and Vardi Y, “Maximum Likelihood Reconstruction for Emission Tomography,” IEEE Transactions on Medical Imaging, vol. 1, pp. 113–122, Oct. 1982.
[30] Yamashita R, Nishio M, Do RKG, and Togashi K, “Convolutional neural networks: an overview and application in radiology,” Insights Imaging, vol. 9, pp. 611–629, Aug. 2018.
[31] Hu J, Shen L, and Sun G, “Squeeze-and-Excitation Networks,” in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7132–7141, June 2018.
[32] Ronneberger O, Fischer P, and Brox T, “U-Net: Convolutional Networks for Biomedical Image Segmentation,” in Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015, pp. 234–241, Springer International Publishing, 2015.
[33] Wang Z, Bovik A, Sheikh H, and Simoncelli E, “Image quality assessment: from error visibility to structural similarity,” IEEE Transactions on Image Processing, vol. 13, pp. 600–612, Apr. 2004.
[34] Duda RO and Hart PE, “Pattern classification and scene analysis,” in A Wiley-Interscience publication, 1973.
[35] Kingma DP and Ba J, “Adam: A method for stochastic optimization,” in 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings, 2015.
[36] Glorot X and Bengio Y, “Understanding the difficulty of training deep feedforward neural networks,” in Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, vol. 9 of Proceedings of Machine Learning Research, pp. 249–256, 2010.
[37] Keselman HJ and Rogan JC, “The Tukey multiple comparison test: 1953–1976,” Psychological Bulletin, vol. 84, no. 5, pp. 1050–1056, 1977.
[38] Nowbar AN, Gitto M, Howard JP, Francis DP, and Al-Lamee R, “Mortality From Ischemic Heart Disease,” Circulation: Cardiovascular Quality and Outcomes, vol. 12, p. e005375, June 2019.

[R1] [1].Bockisch A, Freudenberg LS, Schmidt D, and Kuwert T, “Hybrid Imaging by SPECT/CT and PET/CT: Proven Outcomes in Cancer Imaging,” Seminars in Nuclear Medicine, vol. 39, pp. 276–289, July 2009. [DOI] [PubMed] [Google Scholar]

[R2] [2].Ritt P, Vija H, Hornegger J, and Kuwert T, “Absolute quantification in SPECT,” Eur J Nucl Med Mol Imaging, vol. 38, pp. 69–77, May 2011. [DOI] [PubMed] [Google Scholar]

[R3] [3].Erlandsson K, Buvat I, Pretorius PH, Thomas BA, and Hutton BF, “A review of partial volume correction techniques for emission tomography and their applications in neurology, cardiology and oncology,” Phys Med Biol, vol. 57, pp. R119–159, Nov. 2012. [DOI] [PubMed] [Google Scholar]

[R4] [4].Reader A, Julyan P, Williams H, Hastings D, and Zweit J, “EM algorithm system modeling by image-space techniques for PET reconstruction,” IEEE Transactions on Nuclear Science, vol. 50, pp. 1392–1397, Oct. 2003. [Google Scholar]

[R5] [5].Strul D and Bendriem B, “Robustness of Anatomically Guided Pixel-by-Pixel Algorithms for Partial Volume Effect Correction in Positron Emission Tomography,” J Cereb Blood Flow Metab, vol. 19, pp. 547–559, May 1999. [DOI] [PubMed] [Google Scholar]

[R6] [6].Chan C, Liu H, Grobshtein Y, Stacy MR, Sinusas AJ, and Liu C, “Noise suppressed partial volume correction for cardiac SPECT/CT,” Medical Physics, vol. 43, no. 9, pp. 5225–5239, 2016. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R7] [7].Liu H, Chan C, Grobshtein Y, Ma T, Liu Y, Wang S, Stacy MR, Sinusas AJ, and Liu C, “Anatomical-based partial volume correction for low-dose dedicated cardiac SPECT/CT,” Phys. Med. Biol, vol. 60, pp. 6751–6773, Aug. 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R8] [8].Yang J, Huang S, Mega M, Lin K, Toga A, Small G, and Phelps M, “Investigation of partial volume correction methods for brain FDG PET studies,” IEEE Transactions on Nuclear Science, vol. 43, pp. 3322–3327, Dec. 1996. [Google Scholar]

[R9] [9].Mohy-ud Din H, Boutagy NE, Stendahl JC, Zhuang ZW, Sinusas AJ, and Liu C, “Quantification of intramyocardial blood volume with 99mTc-RBC SPECT-CT imaging: A preclinical study,” J. Nucl. Cardiol, vol. 25, pp. 2096–2111, Dec. 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R10] [10].Liu Q, Mohy-ud Din H, Boutagy NE, Jiang M, Ren S, Stendahl JC, Sinusas AJ, and Liu C, “Fully automatic multi-atlas segmentation of CTA for partial volume correction in cardiac SPECT/CT,” vol. 62, no. 10, pp. 3944–3957. Publisher: IOP Publishing. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R11] [11].Wang G, “A Perspective on Deep Imaging,” IEEE Access, vol. 4, pp. 8914–8924, 2016. [Google Scholar]

[R12] [12].Shan H, Zhang Y, Yang Q, Kruger U, Kalra MK, Sun L, Cong W, and Wang G, “3-D Convolutional Encoder-Decoder Network for Low-Dose CT via Transfer Learning From a 2-D Trained Network,” IEEE Transactions on Medical Imaging, vol. 37, pp. 1522–1534, June 2018. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R13] [13].Xie H, Shan H, and Wang G, “Deep Encoder-Decoder Adversarial Reconstruction (DEAR) Network for 3D CT from Few-View Data,” Bioengineering, vol. 6, p. 111, Dec. 2019. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R14] [14].Xie H, Shan H, Cong W, Liu C, Zhang X, Liu S, Ning R, and Wang G, “Deep Efficient End-to-End Reconstruction (DEER) Network for Few-View Breast CT Image Reconstruction,” IEEE Access, vol. 8, pp. 196633–196646, 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R15] [15].Aghakhan Olia N, Kamali-Asl A, Hariri Tabrizi S, Geramifar P, Sheikhzadeh P, Farzanefar S, Arabi H, and Zaidi H, “Deep learning–based denoising of low-dose SPECT myocardial perfusion images: quantitative assessment and clinical performance,” Eur J Nucl Med Mol Imaging, vol. 49, pp. 1508–1522, Apr. 2022. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R16] [16].Xie H, Thorn S, Chen X, Zhou B, Liu H, Liu Z, Lee S, Wang G, Liu Y-H, Sinusas AJ, and Liu C, “Increasing angular sampling through deep learning for stationary cardiac SPECT image reconstruction,” J. Nucl. Cardiol, May 2022, DOI: 10.1007/s12350-022-02972-z. [DOI] [PubMed] [Google Scholar]

[R17] [17].Ryden T, Essen M, Marin I, Svensson J, and Bernhardt P, “Deep learning generation of synthetic intermediate projections improves 177Lu SPECT images reconstructed with sparsely acquired projections,” Journal of Nuclear Medicine, Aug. 2020, DOI: 10.2967/jnumed.120.245548. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R18] [18].Chen X, Zhou B, Xie H, Shi L, Liu H, Holler W, Lin M, Liu Y-H, Miller EJ, Sinusas AJ, and Liu C, “Direct and indirect strategies of deep-learning-based attenuation correction for general purpose and dedicated cardiac SPECT,” Eur J Nucl Med Mol Imaging, Feb. 2022. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R19] [19].Li C, Zhou A, and Yao A, “Omni-dimensional dynamic convolution,” in International Conference on Learning Representations, 2022. [Google Scholar]

[20] Chen Y, Dai X, Liu M, Chen D, Yuan L, and Liu Z, “Dynamic Convolution: Attention over Convolution Kernels,” arXiv:1912.03458 [cs], Mar. 2020.

[21] Yang B, Bender G, Le QV, and Ngiam J, “CondConv: Conditionally Parameterized Convolutions for Efficient Inference,” arXiv:1904.04971 [cs], Sept. 2020.

[22] Huang G, Liu Z, van der Maaten L, and Weinberger KQ, “Densely Connected Convolutional Networks,” in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2261–2269, July 2017.

[23] Bocher M, Blevis I, Tsukerman L, Shrem Y, Kovalski G, and Volokh L, “A fast cardiac gamma camera with dynamic SPECT capabilities: design, system validation and future potential,” Eur J Nucl Med Mol Imaging, vol. 37, pp. 1887–1902, Oct. 2010.

[24] Gambhir SS, Berman DS, Ziffer J, Nagler M, Sandler M, Patton J, Hutton B, Sharir T, Haim SB, and Haim SB, “A Novel High-Sensitivity Rapid-Acquisition Single-Photon Cardiac Imaging Camera,” Journal of Nuclear Medicine, vol. 50, no. 4, pp. 635–643, 2009.

[25] Bradley C and Berry C, “Definition and epidemiology of coronary microvascular disease,” J. Nucl. Cardiol, May 2022.

[26] Chung KS and Nguyen PK, “Non-invasive measures of coronary microcirculation: Taking the long road to the clinic,” J. Nucl. Cardiol, vol. 25, pp. 2112–2115, Dec. 2018.

[27] Feher A and Sinusas AJ, “Quantitative Assessment of Coronary Microvascular Function,” Circulation: Cardiovascular Imaging, vol. 10, p. e006427, Aug. 2017.

[28] Feher A, Boutagy NE, Stendahl JC, Hawley C, Guerrera N, Booth CJ, Romito E, Wilson S, Liu C, and Sinusas AJ, “Computed Tomographic Angiography Assessment of Epicardial Coronary Vasoreactivity for Early Detection of Doxorubicin-Induced Cardiotoxicity,” JACC: CardioOncology, vol. 2, pp. 207–219, June 2020.

[29] Shepp L and Vardi Y, “Maximum Likelihood Reconstruction for Emission Tomography,” IEEE Transactions on Medical Imaging, vol. 1, pp. 113–122, Oct. 1982.

[30] Yamashita R, Nishio M, Do RKG, and Togashi K, “Convolutional neural networks: an overview and application in radiology,” Insights Imaging, vol. 9, pp. 611–629, Aug. 2018.

[31] Hu J, Shen L, and Sun G, “Squeeze-and-Excitation Networks,” in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 7132–7141, June 2018.

[32] Ronneberger O, Fischer P, and Brox T, “U-Net: Convolutional Networks for Biomedical Image Segmentation,” in Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015, pp. 234–241, Springer International Publishing, 2015.

[33] Wang Z, Bovik A, Sheikh H, and Simoncelli E, “Image quality assessment: from error visibility to structural similarity,” IEEE Transactions on Image Processing, vol. 13, pp. 600–612, Apr. 2004.

[34] Duda RO and Hart PE, “Pattern classification and scene analysis,” in A Wiley-Interscience publication, 1973.

[35] Kingma DP and Ba J, “Adam: A method for stochastic optimization,” in 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings, 2015.

[36] Glorot X and Bengio Y, “Understanding the difficulty of training deep feedforward neural networks,” in Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, vol. 9 of Proceedings of Machine Learning Research, pp. 249–256, 2010.

[37] Keselman HJ and Rogan JC, “The Tukey multiple comparison test: 1953–1976,” Psychological Bulletin, vol. 84, no. 5, pp. 1050–1056, 1977.

[38] Nowbar AN, Gitto M, Howard JP, Francis DP, and Al-Lamee R, “Mortality From Ischemic Heart Disease,” Circulation: Cardiovascular Quality and Outcomes, vol. 12, p. e005375, June 2019.
