Author manuscript; available in PMC: 2021 Nov 1.
Published in final edited form as: IEEE Trans Med Imaging. 2020 Oct 28;39(11):3512–3522. doi: 10.1109/TMI.2020.2998480

Improved low-count quantitative PET reconstruction with an iterative neural network

Hongki Lim 1, Il Yong Chun 2, Yuni K Dewaraja 3, Jeffrey A Fessler 4
PMCID: PMC7685233  NIHMSID: NIHMS1641883  PMID: 32746100

Abstract

Image reconstruction in low-count PET is particularly challenging because gammas from natural radioactivity in Lu-based crystals cause high random fractions that lower the measurement signal-to-noise-ratio (SNR). In model-based image reconstruction (MBIR), using more iterations of an unregularized method may increase the noise, so incorporating regularization into the image reconstruction is desirable to control the noise. New regularization methods based on learned convolutional operators are emerging in MBIR. We modify the architecture of an iterative neural network, BCD-Net, for PET MBIR, and demonstrate the efficacy of the trained BCD-Net using XCAT phantom data that simulates the low true coincidence count-rates with high random fractions typical for Y-90 PET patient imaging after Y-90 microsphere radioembolization. Numerical results show that the proposed BCD-Net significantly improves CNR and RMSE of the reconstructed images compared to MBIR methods using non-trained regularizers, total variation (TV) and non-local means (NLM). Moreover, BCD-Net successfully generalizes to test data that differs from the training data. Improvements were also demonstrated for the clinically relevant phantom measurement data where we used training and testing datasets having very different activity distributions and count-levels.

Keywords: Iterative neural network, Regularized model-based image reconstruction, Low-count quantitative PET, Y-90

I. Introduction

Image reconstruction in low-count PET is particularly challenging because dominant gammas from natural radioactivity in Lu-based crystals cause high random fractions, lowering the measurement signal-to-noise-ratio (SNR) [1]. To accurately reconstruct images in low-count PET, regularized model-based image reconstruction (MBIR) solves the following optimization problem consisting of 1) a data fidelity term f(x) that models the physical PET imaging system, and 2) a regularization term R(x) that penalizes image roughness and controls noise [2]:

$$\hat{x} = \arg\min_{x \ge 0} \; f(x) + R(x), \qquad f(x) := 1^T (Ax + \bar{r}) - y^T \log(Ax + \bar{r}). \tag{1}$$

Here, $f(x)$ is the Poisson negative log-likelihood for the measurement $y$ and the estimated measurement means $\bar{y}(x) = Ax + \bar{r}$, the matrix $A$ denotes the system model, and $\bar{r}$ denotes the mean background events such as scatter and random coincidences. Learned regularizers have recently emerged as a choice for $R(x)$ in MBIR [3].
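As a minimal sketch of the data-fidelity term in (1), the following pure-Python snippet evaluates the Poisson negative log-likelihood and its gradient $A^T(1 - y / \bar{y}(x))$ on a tiny dense system. The 3-ray, 2-voxel matrix `A`, the background `r_bar`, and all function names are hypothetical illustrations, not the system model used in this work.

```python
import math

def forward(A, x, r_bar):
    """Estimated measurement means: y_bar(x) = A x + r_bar."""
    return [sum(a * v for a, v in zip(row, x)) + r for row, r in zip(A, r_bar)]

def poisson_nll(A, x, y, r_bar):
    """f(x) = 1^T (A x + r_bar) - y^T log(A x + r_bar)."""
    return sum(m - y_i * math.log(m) for m, y_i in zip(forward(A, x, r_bar), y))

def poisson_nll_grad(A, x, y, r_bar):
    """Gradient of f with respect to x: A^T (1 - y / y_bar(x))."""
    w = [1.0 - y_i / m for m, y_i in zip(forward(A, x, r_bar), y)]
    return [sum(A[i][j] * w[i] for i in range(len(A))) for j in range(len(x))]

# Hypothetical toy problem: 3 rays, 2 voxels, constant background.
A = [[1.0, 0.5], [0.2, 1.0], [0.7, 0.7]]
r_bar = [2.0, 2.0, 2.0]
x_ref = [4.0, 1.0]
y = forward(A, x_ref, r_bar)  # noiseless data equal to its own mean

# With noiseless data, the gradient vanishes at the image that generated it.
grad_at_ref = poisson_nll_grad(A, x_ref, y, r_bar)
```

In this noiseless toy case `x_ref` minimizes `poisson_nll`; with real Poisson data the unregularized minimizer is noisy, which is what motivates the regularizer $R(x)$.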

While there is much ongoing research on machine learning or deep-learning techniques applied to CT [4]–[8] and MRI [9]–[13] reconstruction problems, fewer studies have applied these techniques to PET. Most past PET studies used deep learning in image space without exploiting the physical imaging model in (1). For example, [14] applied a deep neural network (NN) mapping between reconstructed PET images with normal dose and reduced dose, and [15] applied a multilayer perceptron mapping between images reconstructed with a maximum a posteriori algorithm and a reference (true) image; their framework uses the acquisition data only to form the initial image. Recently, [16] trained a NN to reconstruct a 2D image directly from a PET sinogram, and [17], [18] proposed a PET MBIR framework using a deep-learning based regularizer. Our proposed MBIR framework, BCD-Net, also uses a regularizer that penalizes differences between the unknown image and “denoised” images given by a regression neural network in an iterative manner. In particular, whereas [17], [18] trained only a single image denoising NN, the proposed method is an iterative framework that includes multiple trained NNs. This iterative framework enables the NNs in the later stages to learn how to recover fine details. Our proposed BCD-Net also differs from [17], [18] in that our denoising NNs are defined by an optimization formulation with a mathematical motivation, whereas for the trained regularizer [17], [18] adopted U-Net [19] and DnCNN [20], which were developed for medical image segmentation and general Gaussian denoising, respectively. Our denoising NNs are also characterized by fewer parameters, thereby avoiding over-fitting and generalizing well to unseen data, especially when training samples are limited.

Iterative NNs [8]–[11], [21]–[24] are a broad family of methods that originate from an unrolling algorithm for solving an optimization problem and BCD-Net [25] is a specific example of an iterative NN. BCD-Net is constructed by unfolding a block coordinate descent (BCD) MBIR algorithm using “learned” convolutional analysis operators [26]–[28], leading to significantly improved image recovery accuracy in extreme imaging applications, e.g., low-dose CT [29], dual-energy CT [30], highly undersampled MRI [25], denoising low-SNR images [25], etc. A preliminary version of this paper was presented at the 2018 Nuclear Science Symposium and Medical Imaging Conference [31]. We significantly extended this work by applying our proposed method to measured PET data with newly developed techniques. We also added detailed analysis of our proposed method as well as comparisons to related works.

To show the efficacy of our proposed BCD-Net method in low-count PET imaging, we performed both digital phantom simulation and experimental measurement studies with activity distributions and count-rates that are relevant to clinical Y-90 PET imaging after liver radioembolization. Novel therapeutic applications have sparked growing interest in quantitative imaging of Y-90, an almost pure beta emitter that is widely used in internal radionuclide therapy. In addition to the FDA approved Y-90 microsphere radioembolization and Y-90 ibritumomab radioimmunotherapy, there are 50 active clinical trials for Y-90 labeled therapies (www.clinicaltrials.gov). However, the lack of gamma photons complicates imaging of Y-90; it involves SPECT via bremsstrahlung photons produced by the betas [32] or PET via a very low abundance positron in the presence of bremsstrahlung that leads to low signal-to-noise [33]. This paper applies a BCD-Net that is trained for realistic low-count PET imaging environments and compares its performance with those of non-trained regularizers. Our proposed BCD-Net applies to PET imaging in general, particularly in other imaging situations that also have low counts. Using shorter scan times and lower tracer activity in diagnostic PET has cost benefits and reduces radiation exposure, but at the expense of reduced counts that makes traditional iterative reconstruction challenging.

Section II develops the proposed BCD-Net architecture for PET MBIR. Section II also explains the simulation studies in the setting of Y-90 radioembolization and provides details on how we performed the physical phantom measurement. Section III presents how the different reconstruction methods perform with the simulation and measurement data. Section IV discusses which training and imaging factors most affect the generalization performance of BCD-Net. Section V concludes with future work.

II. Methods

This section presents the problem formulation of the BCD-Net and gives a detailed derivation that inspires the final form of BCD-Net. We also provide several techniques for BCD-Net that we specifically devised for PET data where each measurement has different count-level (and noise-level). Then we review the related works that we compare with BCD-Net such as MBIR methods using conventional non-trained regularizers. This section also describes the simulation setting and details on the measurement data and what evaluation metrics are used to assess the efficacy of each reconstruction algorithm.

A. BCD algorithm for MBIR using “learned” convolutional regularization

Conventional PET regularizers penalize differences between neighboring pixels [34]. That approach is equivalent to assuming that convolving the image with the [1,−1] finite difference filter along different directions produces sparse outputs. Using such “hand-crafted” filters is unlikely to be the best approach. A recent trend is to use training data to learn filters ck that produce sparse outputs when convolved with images of interest [26], [27], [35], [36]. Such learned filters can be used to define a regularizer that prefers images having sparse outputs, as follows [37]:

$$R(x) = \min_{\{z_k\}} \beta \left( \sum_{k=1}^{K} \frac{1}{2} \| c_k * x - z_k \|_2^2 + \alpha_k \| z_k \|_1 \right), \tag{2}$$

where $\beta$ is the regularization parameter, $\{c_k \in \mathbb{R}^{R} : k = 1, \ldots, K\}$ is a set of convolutional filters, $\{z_k \in \mathbb{R}^{n_p} : k = 1, \ldots, K\}$ is a set of sparse codes, $\{\alpha_k : k = 1, \ldots, K\}$ is a set of thresholding parameters controlling the sparsity of $\{z_k\}$, $n_p$ is the number of image voxels, and $R$ and $K$ denote the size and number of learned filters, respectively. BCD-Net is inspired by this type of “learned” regularizer. Ultimately, we hope that the learned regularizer can better separate true signal from noisy components compared to hand-crafted filters [29].

A natural BCD algorithm solves (1) with regularizer (2) by alternately updating $\{z_k\}$ and $x$:

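$$\{z_k^{(n+1)}\} = \arg\min_{\{z_k\}} \frac{1}{2} \| c_k * x^{(n)} - z_k \|_2^2 + \alpha_k \| z_k \|_1 = \mathcal{T}(c_k * x^{(n)}, \alpha_k) \tag{3}$$
$$x^{(n+1)} = \arg\min_{x \ge 0} f(x) + \frac{\beta}{2} \left( \sum_{k=1}^{K} \| c_k * x - z_k^{(n+1)} \|_2^2 \right), \tag{4}$$

where $\mathcal{T}(\cdot, \cdot)$ is the element-wise soft thresholding operator: $\mathcal{T}(t, q)_j := \mathrm{sign}(t_j) \max(|t_j| - q, 0)$.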

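Assuming that the learned filters $\{c_k\}$ satisfy the tight-frame condition, $\sum_{k=1}^{K} \| c_k * x \|_2^2 = \| x \|_2^2 \;\; \forall x$ [26], we rewrite the updates in (3)–(4) as follows:

$$u^{(n+1)} = \sum_{k=1}^{K} \tilde{c}_k * \left( \mathcal{T}(c_k * x^{(n)}, \alpha_k) \right) \tag{5}$$
$$x^{(n+1)} = \arg\min_{x \ge 0} f(x) + \frac{\beta}{2} \| x - u^{(n+1)} \|_2^2, \tag{6}$$

where $\tilde{c}_k$ denotes a rotated version of $c_k$. The operations of convolution, soft thresholding and then filtering again with summation typically have the effect of denoising the image $x^{(n)}$.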
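The denoising step (5) can be sketched in one dimension as follows. This is an illustrative toy, assuming circular boundary conditions and a hand-picked two-filter set (a scaled Haar-like pair) that satisfies the tight-frame condition; the filters, signal, and function names are hypothetical and are not the learned 3-D filters used in this work.

```python
import math

def soft_threshold(t, q):
    """Element-wise soft thresholding: sign(t_j) * max(|t_j| - q, 0)."""
    return [math.copysign(max(abs(v) - q, 0.0), v) for v in t]

def circ_conv(c, x):
    """Circular convolution (c * x)[i] = sum_m c[m] x[i - m]."""
    n = len(x)
    return [sum(c[m] * x[(i - m) % n] for m in range(len(c))) for i in range(n)]

def circ_conv_rotated(c, z):
    """Convolution with the flipped ("rotated") filter, the adjoint of circ_conv."""
    n = len(z)
    return [sum(c[m] * z[(i + m) % n] for m in range(len(c))) for i in range(n)]

def cid_denoise(x, filters, alphas):
    """u = sum_k rotated(c_k) * T(c_k * x, alpha_k), as in (5)."""
    u = [0.0] * len(x)
    for c, alpha in zip(filters, alphas):
        z = soft_threshold(circ_conv(c, x), alpha)
        u = [u_i + v for u_i, v in zip(u, circ_conv_rotated(c, z))]
    return u

# Hypothetical tight-frame pair: low-pass [1/2, 1/2] and high-pass [1/2, -1/2].
filters = [[0.5, 0.5], [0.5, -0.5]]
x = [1.0, 3.0, 2.0, 5.0]

u_identity = cid_denoise(x, filters, [0.0, 0.0])   # no thresholding: returns x
u_smooth = cid_denoise(x, filters, [0.0, 10.0])    # high-pass branch fully suppressed
```

With zero thresholds the tight-frame property makes the module an identity map; raising the threshold on the high-pass branch removes the fluctuating component while preserving the sum of the signal, which is the sense in which (5) denoises.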

For efficient image reconstruction (6) in PET, we use the standard EM-surrogate of Poisson log-likelihood function [38]:

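$$\begin{aligned} f(x) + \frac{\beta}{2} \| x - u^{(n+1)} \|_2^2 &= \sum_{i=1}^{n_d} [Ax]_i + \bar{r}_i - y_i \log([Ax]_i + \bar{r}_i) + \frac{\beta}{2} \sum_{j=1}^{n_p} (x_j - u_j^{(n+1)})^2 \\ &\le \sum_{j=1}^{n_p} \left\{ -e_j(x^{(n')}) \, x_j^{(n')} \log(x_j) + a_j x_j + \frac{\beta}{2} (x_j - u_j^{(n+1)})^2 \right\} = \sum_{j=1}^{n_p} Q_j(x_j), \end{aligned}$$

where $n'$ denotes the $n'$th inner-iteration in (6), $e_j(x^{(n')}) = \sum_{i=1}^{n_d} a_{ij} \frac{y_i}{\bar{y}_i(x^{(n')})}$, $a_{ij}$ denotes the element of the system model at the $i$th row and $j$th column, $n_d$ is the number of rays, and the inequality holds up to additive terms that do not depend on $x$. Equating $\frac{\partial Q_j(x_j)}{\partial x_j}$ to zero is equivalent to finding the root of the following quadratic equation:

$$\beta x_j^2 + (a_j - \beta u_j^{(n+1)}) x_j - e_j(x^{(n')}) \, x_j^{(n')} = 0,$$

and finding the root [39] leads to the minimizer:

$$x_j^{(n'+1)} = \begin{cases} \dfrac{\sqrt{\lambda^2 + \beta \nu} - \lambda}{\beta}, & \lambda < 0 \\[2ex] \dfrac{\nu}{\sqrt{\lambda^2 + \beta \nu} + \lambda}, & \lambda \ge 0, \end{cases}$$

where $\lambda = \frac{1}{2} (a_j - \beta u_j^{(n+1)})$, $\nu = e_j(x^{(n')}) \, x_j^{(n')}$, and $a_j = \sum_{i=1}^{n_d} a_{ij}$.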
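The closed-form inner update above can be sketched as follows. This is a minimal pure-Python illustration on a hypothetical 3-ray, 2-voxel dense system; the data values, the guard for a zero denominator, and the helper `cost` (which evaluates the objective in (6)) are assumptions added for the example.

```python
import math

def forward(A, x, r_bar):
    """Estimated measurement means: y_bar(x) = A x + r_bar."""
    return [sum(a * v for a, v in zip(row, x)) + r for row, r in zip(A, r_bar)]

def map_em_step(A, x, y, r_bar, u, beta):
    """One inner iteration of (6): per-voxel root of the quadratic equation."""
    n_d, n_p = len(A), len(x)
    y_bar = forward(A, x, r_bar)
    x_new = []
    for j in range(n_p):
        a_j = sum(A[i][j] for i in range(n_d))
        e_j = sum(A[i][j] * y[i] / y_bar[i] for i in range(n_d))
        lam = 0.5 * (a_j - beta * u[j])
        nu = e_j * x[j]
        root = math.sqrt(lam * lam + beta * nu)
        if lam < 0:
            x_new.append((root - lam) / beta)
        else:
            denom = root + lam
            # Guard (an assumption of this sketch): both terms zero means x_j = 0.
            x_new.append(nu / denom if denom > 0 else 0.0)
    return x_new

def cost(A, x, y, r_bar, u, beta):
    """Objective in (6): f(x) + (beta / 2) ||x - u||_2^2."""
    f = sum(m - y_i * math.log(m) for m, y_i in zip(forward(A, x, r_bar), y))
    return f + 0.5 * beta * sum((a - b) ** 2 for a, b in zip(x, u))

# Hypothetical toy data.
A = [[1.0, 0.5], [0.2, 1.0], [0.7, 0.7]]
y = [7.0, 3.0, 6.0]
r_bar = [2.0, 2.0, 2.0]
u = [4.0, 1.0]
beta = 0.5

x0 = [1.0, 1.0]
x1 = map_em_step(A, x0, y, r_bar, u, beta)
x2 = map_em_step(A, x1, y, r_bar, u, beta)
```

Using the two algebraically equivalent forms of the root, one per sign of $\lambda$, avoids subtracting nearly equal numbers, and each step keeps the image nonnegative without an explicit projection.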

B. BCD-Net for PET MBIR and training its denoising module

To further improve denoising capability by providing more trainable parameters, we extend the convolutional image denoiser (CID) in (5) [25] by replacing $\{\tilde{c}_k\}$ with separate decoding filters $\{d_k\}$. We define BCD-Net to use the following updates for each iteration:

$$u^{(n+1)} = \sum_{k=1}^{K} d_k^{(n+1)} * \left( \mathcal{T}(c_k^{(n+1)} * x^{(n)}, \alpha_k^{(n+1)}) \right) \tag{7}$$
$$x^{(n+1)} = \arg\min_{x \ge 0} f(x) + \frac{\beta}{2} \| x - u^{(n+1)} \|_2^2, \tag{8}$$

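Algorithm 1. BCD-Net for PET MBIR

Require: $\{c_k^{(n)}, d_k^{(n)}, \alpha_k^{(n)} : n = 1, \ldots, T\}$, $y$, $\bar{r}$, $A$, $c$
Initialize: $x^{(0)}$ using EM algorithm
Calculate $a_j = \sum_{i=1}^{n_d} a_{ij}$
for $n = 0, \ldots, T - 1$ do
  $u^{(n+1)} = \sum_{k=1}^{K} d_k^{(n+1)} * \left( \mathcal{T}(c_k^{(n+1)} * g_1(x^{(n)}), \alpha_k^{(n+1)}) \right)$
  $\beta^{(n+1)} = \dfrac{\left\| a_j - \sum_{i=1}^{n_d} a_{ij} \frac{y_i}{\bar{y}_i(x^{(n)})} \right\|_2}{\left\| x^{(n)} - g_2(u^{(n+1)}) \right\|_2} \cdot c$
  for $n' = 0, \ldots, T' - 1$ do
    $\lambda = \frac{1}{2} \left( a_j - \beta^{(n+1)} g_2(u_j^{(n+1)}) \right)$
    $\nu = x_j^{(n')} \left( \sum_{i=1}^{n_d} a_{ij} \frac{y_i}{\bar{y}_i(x^{(n')})} \right)$
    $x_j^{(n'+1)} = \begin{cases} \dfrac{\sqrt{\lambda^2 + \beta^{(n+1)} \nu} - \lambda}{\beta^{(n+1)}}, & \lambda < 0 \\ \dfrac{\nu}{\sqrt{\lambda^2 + \beta^{(n+1)} \nu} + \lambda}, & \lambda \ge 0 \end{cases}$
  end for
  $x^{(n+1)} = x^{(T')}$
end for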

where separate encoding and decoding filters $\{c_k\}$ and $\{d_k\}$ are learned for each iteration. Fig. 1 shows the corresponding BCD-Net architecture. We refer to the $u$ and $x$ updates in (7)–(8) as two modules: 1) the image denoising module and 2) the image reconstruction module. The final output image is from the reconstruction module.

Fig. 1.

Architecture of the proposed BCD-Net for PET. The proposed BCD-Net has an iterative NN architecture: each BCD-Net iteration uses three inputs – fixed measurement and mean background {y,r¯}, and the image x(n−1) reconstructed at the previous BCD-Net iteration – and provides the reconstructed image x(n). A circular arrow above MAP EM update indicates inner iterations. g1(·) and g2(·) are the normalization and scaling functions described in Section II-C.

The image denoising module consists of encoding and decoding filters $\{c_k^{(n+1)}\}$, $\{d_k^{(n+1)}\}$ and thresholding values $\{\alpha_k^{(n+1)}\}$. We train these parameters to “best map” from noisy images into high-quality reference images (e.g., true images if available) in the sense of mean squared error:

$$\arg\min_{\{c_k\}, \{d_k\}, \{\alpha_k\}} \sum_{l=1}^{L} \left\| x_{\mathrm{true},l} - \sum_{k=1}^{K} d_k * \left( \mathcal{T}(c_k * x_l^{(n)}, \alpha_k) \right) \right\|_2^2, \tag{9}$$

where $L$ is the total number of training samples, $\{x_{\mathrm{true},l} \in \mathbb{R}^{n_p} : l = 1, \ldots, L\}$ is a set of true images and $\{x_l^{(n)} \in \mathbb{R}^{n_p} : l = 1, \ldots, L\}$ is a set of images estimated by the image reconstruction module in the $n$th iteration. We train the set of filters and thresholding values iteration-by-iteration and do not include the system matrix or sinograms for training, as shown in (9). Moreover, we do not enforce the tight-frame condition when training the filters.

One can further extend the CID in (7) to a general regression NN, e.g., a deep U-Net [19]. We investigated if the iterative BCD-Net combined with U-Net denoisers (by replacing the denoising module in (7) with a U-Net) performs better than the proposed BCD-Net using CID (7). Section II-G2 gives the details of the U-Net implementation.

C. Adaptive BCD-Net generalizing to various count-levels

1). Normalization and scaling scheme:

Different PET images can have very different intensity values due to variations in scan time and activity, and it is important for trained methods to be able to generalize to a wide range of count levels. Towards this end, we implemented normalization and scaling techniques in BCD-Net. The work in [18] extended [17] by implementing “local linear fitting” to ensure that the denoising NN output has similar intensity to the input patch from the current estimated image. Our approach is different in that we normalize and scale the image with a global approach, not a patch-based approach. In particular, we modify the architecture in (7)–(8) as:

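$$u^{(n+1)} = \sum_{k=1}^{K} d_k^{(n+1)} * \left( \mathcal{T}_{\alpha_k^{(n+1)}}(c_k^{(n+1)} * g_1(x^{(n)})) \right) \tag{10}$$
$$x^{(n+1)} = \arg\min_{x \ge 0} f(x) + \frac{\beta}{2} \| x - g_2(u^{(n+1)}) \|_2^2, \tag{11}$$

where the normalization function $g_1(\cdot)$ is defined by $g_1(v) := \frac{1}{\sum_j v_j} v$ to ensure that $1^T g_1(v) = 1$, and the scaling function $g_2(\cdot)$ is defined by $g_2(v) := \{\arg\min_s f(s \cdot v)\} \, v$. We solve the optimization problem over $s$ using Newton’s method:

$$s^{(n'+1)} = s^{(n')} - \frac{\frac{\partial}{\partial s} f(s^{(n')} v)}{\frac{\partial^2}{\partial s^2} f(s^{(n')} v)} = s^{(n')} - \frac{\sum_{i=1}^{n_d} [Av]_i - y_i \frac{[Av]_i}{s^{(n')} [Av]_i + \bar{r}_i}}{\sum_{i=1}^{n_d} y_i \left( \frac{[Av]_i}{s^{(n')} [Av]_i + \bar{r}_i} \right)^2}. \tag{12}$$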
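A minimal sketch of the normalization $g_1$ and the Newton-based scaling $g_2$ in (12) is shown below. The starting value `s0`, the fixed iteration count, and the toy system are assumptions of this example. Because the derivative of $f(s \cdot v)$ with respect to $s$ is increasing and concave, a start at or below the minimizer approaches it monotonically, so the sketch assumes `s0` is chosen small.

```python
def g1(v):
    """Normalize so that the voxel values sum to one."""
    total = sum(v)
    return [v_j / total for v_j in v]

def newton_scale(Av, y, r_bar, s0=1.0, n_iter=20):
    """Minimize f(s * v) over the scalar s with the Newton iteration in (12)."""
    s = s0
    for _ in range(n_iter):
        grad = sum(p - y_i * p / (s * p + r) for p, y_i, r in zip(Av, y, r_bar))
        hess = sum(y_i * (p / (s * p + r)) ** 2 for p, y_i, r in zip(Av, y, r_bar))
        if hess == 0.0:  # e.g. all-zero measurements; stop rather than divide
            break
        s -= grad / hess
    return s

def g2(A, v, y, r_bar, s0=1.0):
    """Scale v by the data-consistent factor: g2(v) = argmin_s f(s v) * v."""
    Av = [sum(a * v_j for a, v_j in zip(row, v)) for row in A]
    s = newton_scale(Av, y, r_bar, s0)
    return [s * v_j for v_j in v]

# Hypothetical toy system: the data are exactly consistent with 5 * v.
A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
v = [1.0, 1.0]
r_bar = [1.0, 1.0, 1.0]
y = [6.0, 11.0, 16.0]  # 5 * A v + r_bar

normalized = g1([1.0, 3.0])
scaled = g2(A, v, y, r_bar)
```

Here `g1` removes the absolute intensity before the denoiser sees the image, and `g2` restores a scale that agrees with the measured counts.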

To be consistent with the modified CID in (10), we also apply this image-based normalization technique when training the convolutional filters and thresholding values:

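$$\arg\min_{\{c_k\}, \{d_k\}, \{\alpha_k\}} \sum_{l=1}^{L} \left\| g_1(x_{\mathrm{true},l}) - \sum_{k=1}^{K} d_k * \left( \mathcal{T}_{\alpha_k}(c_k * g_1(x_l^{(n)})) \right) \right\|_2^2.$$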

2). Adaptive regularization parameter scheme:

The best regularization parameter value can also vary greatly between scans, depending on the count level. Therefore, instead of choosing one specific value for the regularization parameter, we set the $\beta$ value for each iteration from the current gradients of the data-fidelity term and the regularization term:

$$\beta^{(n+1)} = \frac{\| \nabla_x f(x^{(n)}) \|_2}{\| \nabla_x R(x^{(n)}) \|_2} \cdot c = \frac{\| a_j - e_j(x^{(n)}) \|_2}{\| x^{(n)} - g_2(u^{(n+1)}) \|_2} \cdot c, \quad n = 0, \ldots, T - 1, \tag{13}$$

where $c$ is a constant specifying how we balance between the data-fidelity term and the regularization term, and $n$ denotes the $n$th outer-iteration. Algorithm 1 gives detailed pseudocode of the proposed method. $T$ denotes the total number of outer-iterations and $T'$ denotes the number of inner iterations used for (8). We use $x^{(n)}$ as the initial image when solving (11).
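The adaptive rule (13) reduces to a ratio of two vector norms, which the following sketch evaluates directly. The 2-by-2 identity system, the data, and the error raised when the image already equals the scaled denoiser output are assumptions made for illustration.

```python
import math

def adaptive_beta(A, x, y, r_bar, u_scaled, c):
    """beta = c * ||a - e(x)||_2 / ||x - g2(u)||_2, following (13)."""
    n_d, n_p = len(A), len(x)
    y_bar = [sum(A[i][j] * x[j] for j in range(n_p)) + r_bar[i] for i in range(n_d)]
    # Gradient of the data-fidelity term: a_j - e_j(x) = sum_i a_ij (1 - y_i / y_bar_i).
    grad_f = [sum(A[i][j] * (1.0 - y[i] / y_bar[i]) for i in range(n_d))
              for j in range(n_p)]
    # Gradient direction of the quadratic regularizer (without the beta factor).
    grad_r = [x[j] - u_scaled[j] for j in range(n_p)]
    num = math.sqrt(sum(g * g for g in grad_f))
    den = math.sqrt(sum(g * g for g in grad_r))
    if den == 0.0:
        raise ValueError("x equals the scaled denoiser output; beta is undefined")
    return c * num / den

# Hypothetical toy values: ||grad f|| = 1 and ||x - u|| = 5.
A = [[1.0, 0.0], [0.0, 1.0]]
x = [1.0, 1.0]
y = [4.0, 2.0]
r_bar = [1.0, 1.0]
u_scaled = [4.0, 5.0]

beta = adaptive_beta(A, x, y, r_bar, u_scaled, c=0.01)
```

Since the rule only needs quantities already computed for the EM update, it adds little cost per outer iteration, and the single constant `c` replaces a scan-specific search over $\beta$.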

D. Conventional MBIR methods: Non-trained regularizers

We compared the proposed BCD-Net with two MBIR methods that use standard non-trained regularizers.

1). Total-variation (TV):

TV regularization penalizes the sum of the absolute values of differences between adjacent voxels:

$$R(x) = \beta \| Cx \|_1,$$

where $C$ is a finite differencing matrix. Recent work [40] applied Primal-Dual Hybrid Gradient (PDHG) [41] for PET MBIR using TV regularization and demonstrated that PDHG-TV is superior to clinical reconstruction (e.g., OS-EM) for low-count datasets in terms of several image quality evaluation metrics such as contrast recovery and variability.
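For intuition, the TV penalty can be evaluated without forming $C$ explicitly. The sketch below assumes a 2-D image stored as nested lists and an anisotropic penalty (horizontal plus vertical first differences); the 3-D reconstructions in this work would add a third difference direction, and the function name is hypothetical.

```python
def tv_penalty(image, beta):
    """Anisotropic TV: beta * sum of |differences| between adjacent pixels."""
    rows, cols = len(image), len(image[0])
    horizontal = sum(abs(image[r][c + 1] - image[r][c])
                     for r in range(rows) for c in range(cols - 1))
    vertical = sum(abs(image[r + 1][c] - image[r][c])
                   for r in range(rows - 1) for c in range(cols))
    return beta * (horizontal + vertical)

flat = [[2.0, 2.0], [2.0, 2.0]]    # uniform image: no penalty
edged = [[1.0, 2.0], [4.0, 2.0]]   # differences: 1 + 2 (horizontal), 3 + 0 (vertical)

penalty_flat = tv_penalty(flat, beta=0.5)
penalty_edged = tv_penalty(edged, beta=0.5)
```

The penalty is zero for a uniform image and grows linearly with edge height, which favors piecewise-constant reconstructions.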

2). Non-local means (NLM):

NLM regularization penalizes the differences between nearby patches in the image:

$$R(x) = \beta \sum_{i,\, j \in S_i} p\left( \| N_i x - N_j x \|_2^2 \right),$$

where p(t) is a potential function of a scalar variable t, Si is the search neighborhood around the ith voxel, and Ni is a patch extraction operator at the ith voxel. We used the Fair potential function for p(t):

p(t)=σf2(tσf2Nf+log(1+tσf2Nf)),

where σf is a design parameter and Nf is the number of voxels in the patch Nix. Unlike conventional local filters that assume similarity between only adjacent voxels, NLM filters can average image intensities over distant voxels. As in [42], we used ADMM to accelerate algorithmic convergence with an adaptive penalty parameter selection method [43].

E. Experimental setup: Digital phantom simulation and experimental measurement

1). Y-90 PET/CT XCAT simulations:

We used the XCAT [44] phantom (Fig. 2) to simulate Y-90 PET following radioembolization. We set the image size to 128×128×100 with a voxel size of 4.0×4.0×4.0 mm³ and chose 100 slices ranging from lung to liver. To simulate extremely low count scans with high random fractions, typical for Y-90 PET, we set total true coincidences and random fractions based on numbers from patient PET imaging performed after radioembolization [45]. To test the generalization capability of the trained BCD-Net, we changed all imaging factors between the training and testing datasets. Here, imaging factors include activity distribution (shape and size of tumor and liver background, concentration ratio between hot and warm regions) and count-level (total true coincidences and random fraction). Fig. 2 and Table I provide details on how we changed the testing dataset from the training dataset. We trained BCD-Net using five pairs (L = 5) of 3D true images and estimated images at each iteration (1 true image, 5 realizations). We generated multiple (5) realizations to train the denoising NN to deal with the Poisson noise. We also generated 5 realizations (1 true image, 5 realizations) as a testing dataset to evaluate the noise across realizations.

Fig. 2.

XCAT phantom simulation: (First row) coronal and axial view of attenuation map and true relative activity distribution corresponding to axial attenuation map. (Second row) reconstructed images of one slice from different reconstruction methods. BCD-Net-CID/UNet is the BCD-Net with CID/UNet and params indicates the number of trainable parameters.

TABLE I.

Details on XCAT simulation data: variations between training and testing data.

| | Training data | Testing data |
|---|---|---|
| Concentration ratio (hot:warm) | 9:1 | 4:1 |
| Total net trues | 200 K | 500 K |
| Random fraction (%) | 90.9 | 87.5 |

2). Y90 PET/CT physical phantom measurements and patient scan:

For training BCD-Net, we used PET measurements of a sphere phantom (Fig. 4) in which six ‘hot’ spheres (2, 4, 8, 16, 30, and 113 mL; 0.5 MBq/mL) are placed in a ‘warm’ background (0.057 MBq/mL), with a total activity of 0.65 GBq. The phantom was scanned four times (L = 4) on a Siemens Biograph mCT PET/CT: three 40-minute acquisitions and one 80-minute acquisition. For testing BCD-Net and the other reconstruction algorithms, we used an anthropomorphic liver/lung torso phantom (Fig. 4) with total activity and distribution that are clinically realistic for imaging following radioembolization with Y-90 microspheres: 5% lung shunt, 1.17 MBq/mL in liver, and 3 hepatic lesions (4 and 16 mL spheres, 29 mL ovoid) of 6.6 MBq/mL. The phantom, with a total activity of 1.9 GBq, was scanned 5 times (30 minutes each) on a Siemens Biograph mCT PET/CT. Fig. 4 and Table II provide details on the count-level (random fraction) and activity distribution differences between the training (sphere phantom) and testing (liver phantom) datasets. We also tested BCD-Net with an actual Y-90 patient scan; Table III provides the count-level information.

Fig. 4.

Y90 PET/CT physical phantom measurement: (First row: training data, Second row: testing data) Attenuation map, true activity, and x(0) of regularized methods of sphere and liver phantom used for training and testing BCD-Net. (Third row) Reconstructed images of one slice from different reconstruction methods.

TABLE II.

Details on phantom measurement data: activity concentration ratio between hot and warm regions and randoms fractions for two phantom studies.

| | Sphere | Liver-torso |
|---|---|---|
| Total activity (GBq) | 0.65 | 1.9 |
| Concentration ratio (hot:warm) | 8.9:1 | 5.4:1 |
| Total prompts | 3.2 – 6.3 M | 2.3 M |
| Total randoms | 2.9 – 5.7 M | 2.1 M |
| Total net trues | 308 – 599 K | 220 K |
| Random fraction (%) | 90.3 – 90.5 | 90.7 |

TABLE III.

Details on typical patient measurement data: total trues and randoms fractions.

| | Patient A |
|---|---|
| Total activity (GBq) | 2.55 |
| Total prompts | 2.7 M |
| Total randoms | 2.3 M |
| Total net trues | 380 K |
| Random fraction (%) | 85.8 |
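As a quick consistency check on Tables II and III, the sketch below assumes that the random fraction is the ratio of randoms to prompts and that net trues are prompts minus randoms. These definitions are not stated explicitly in the text, and pairing the low ends of the sphere-phantom ranges with each other (and the high ends likewise) is also an assumption. Because the tabulated counts are rounded to 0.1 M, the recomputed values only approximate the reported ones.

```python
def random_fraction(randoms, prompts):
    """Random fraction in percent, assuming randoms / prompts."""
    return 100.0 * randoms / prompts

def net_trues(prompts, randoms):
    """Net trues, assuming prompts minus randoms."""
    return prompts - randoms

# (prompts, randoms) taken from Tables II and III, rounded as tabulated.
scans = {
    "sphere, low end": (3.2e6, 2.9e6),
    "sphere, high end": (6.3e6, 5.7e6),
    "liver-torso": (2.3e6, 2.1e6),
    "patient A": (2.7e6, 2.3e6),
}

summary = {name: (random_fraction(r, p), net_trues(p, r))
           for name, (p, r) in scans.items()}

for name, (fraction, trues) in summary.items():
    print(f"{name}: random fraction {fraction:.1f}%, net trues {trues / 1e3:.0f} K")
```

The recomputed fractions land within about one percentage point of the tabulated values, consistent with the rounding of the input counts.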

We acquired all measurement data with time-of-flight (TOF) information. The measurement data size is 200×168×621×13, where the last dimension is the number of TOF time bins. The reconstructed image size is 200×200×112 with a voxel size of 4.07×4.07×2.03 mm³. To reconstruct the image from the measurement data, we used a Siemens TOF system model ($A$ in (1)) along with manufacturer-provided attenuation/normalization correction, PSF modelling, and randoms/scatter estimation.

F. Evaluation metrics

For the XCAT phantom simulation, we evaluated each reconstruction with contrast recovery (CR) (volume-of-interest (VOI): cold spot indicated in Fig. 2), noise across realizations, root mean squared error (RMSE), and contrast to noise ratio (CNR). For the physical phantom measurement, we used CR (VOI: hot spheres) and CNR averaged over multiple hot spheres. We defined each VOI’s mask based on the attenuation map interpolated to the PET voxel size. For the patient measurement, we used CNR and the field of view (FOV) activity bias, since the total activity in the FOV is known (equal to the injected activity because the microspheres are trapped) whereas the activity distribution is unknown:

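$$\mathrm{CR}\,(\text{VOI: cold spot}) = \left( 1 - \frac{C_{\mathrm{VOI}}}{C_{\mathrm{BKG}}} \right) \times 100\,(\%)$$
$$\mathrm{CR}\,(\text{VOI: hot sphere}) = \frac{C_{\mathrm{VOI}} / C_{\mathrm{BKG}} - 1}{R_{\mathrm{True}} - 1} \times 100\,(\%)$$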
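$$\mathrm{Noise} = \frac{\sqrt{\frac{1}{J_{\mathrm{Liver}}} \sum_{j \in \mathrm{Liver}} \left( \frac{1}{M-1} \sum_{m=1}^{M} \left( \hat{x}_m[j] - \frac{1}{M} \sum_{m'=1}^{M} \hat{x}_{m'}[j] \right)^2 \right)}}{\frac{1}{J_{\mathrm{Liver}}} \sum_{j \in \mathrm{Liver}} x_{\mathrm{true}}[j]} \times 100\,(\%)$$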
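$$\mathrm{RMSE} = \sqrt{\frac{\sum_{j} (x_{\mathrm{true}}[j] - \hat{x}[j])^2}{J_{\mathrm{FOV}}}} \times 100\,(\%)$$
$$\mathrm{CNR} = \frac{C_{\mathrm{Lesion}} - C_{\mathrm{BKG}}}{\mathrm{STD}_{\mathrm{BKG}}}$$
$$\mathrm{FOV\ bias} = \frac{\sum_{j} \hat{x}[j] - x_{\mathrm{true}}[j]}{\sum_{j} x_{\mathrm{true}}[j]} \times 100\,(\%),$$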

where CVOI is mean counts in the VOI, RTrue is true ratio between hot and warm region, x[j] denotes the jth voxel of an image x, M is the number of realizations (M = 5 in both XCAT phantom simulation and physical phantom measurement) and JLiver is the number of voxels in the volume of liver, STDBKG is standard deviation between voxel values in uniform background liver (indicated in Fig. 2), and JFOV is the total number of voxels in the FOV. As the background region when calculating the patient CNR, we used a part of liver region that has relatively uniform activity distribution.
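The contrast, CNR, RMSE, and FOV-bias metrics can be sketched as plain functions over lists of voxel values. The function names and toy inputs are hypothetical, and using the sample standard deviation (`statistics.stdev`) for the background is an assumption, since the text does not say which estimator was used. The noise metric is omitted because it needs multiple realizations.

```python
import math
import statistics

def mean(values):
    return sum(values) / len(values)

def cr_cold(voi, bkg):
    """Contrast recovery (%) for a cold spot: (1 - C_VOI / C_BKG) * 100."""
    return (1.0 - mean(voi) / mean(bkg)) * 100.0

def cr_hot(voi, bkg, r_true):
    """Contrast recovery (%) for a hot sphere with true hot:warm ratio r_true."""
    return (mean(voi) / mean(bkg) - 1.0) / (r_true - 1.0) * 100.0

def cnr(lesion, bkg):
    """Contrast to noise ratio; sample standard deviation is an assumption."""
    return (mean(lesion) - mean(bkg)) / statistics.stdev(bkg)

def rmse(x_hat, x_true):
    """Root mean squared error over the FOV voxels, times 100 as in the text."""
    squared = sum((t - h) ** 2 for t, h in zip(x_true, x_hat))
    return math.sqrt(squared / len(x_true)) * 100.0

def fov_bias(x_hat, x_true):
    """Relative error (%) of the total activity in the field of view."""
    return (sum(x_hat) - sum(x_true)) / sum(x_true) * 100.0

# Hypothetical toy values.
perfect_cold = cr_cold([0.0, 0.0], [2.0, 2.0])       # fully recovered cold spot
perfect_hot = cr_hot([8.0, 8.0], [2.0, 2.0], 4.0)    # measured ratio equals true ratio
example_cnr = cnr([5.0, 5.0], [1.0, 3.0])
example_bias = fov_bias([1.1, 2.2], [1.0, 2.0])
```

Both contrast recovery functions return 100 when the reconstructed contrast matches the truth, so values below 100 indicate a loss of contrast.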

G. Training details

We trained the denoising network in each iteration with a stochastic gradient descent method using the PyTorch [46] deep-learning library.

1). BCD-Net with CID:

We trained a set of CIDs for the denoising module in BCD-Net, where each iteration has 78 sets of thresholding values and convolutional encoding/decoding filters (K = 78). We set the size of each filter to 3×3×3 ($R = 3^3$), and we set the initial thresholding values by sorting the voxel values of the initial image estimate and taking the 10% largest value. We used the Adam optimization method [47] with a learning rate decay scheme to train the NN. Because of the large size of the 3D input, we set the batch size to 1.

2). BCD-Net with U-Net:

We implemented a 3-D version of U-Net by modifying shared code¹ (implemented for denoising 2-D MRI images) for the fastMRI challenge [48]. We used a batch normalization layer instead of the instance normalization layer used in the baseline code. The ‘encoder’ part of U-Net consists of multiple sets of 1) a max pooling layer, 2) a 3×3×3 convolutional layer, 3) a batch normalization (BN) layer, and 4) a ReLU layer, and the ‘decoder’ part of U-Net consists of multiple sets of 1) upsampling with trilinear interpolation [17], 2) a 3×3×3 convolutional layer, 3) a BN layer, and 4) a ReLU layer. For training the U-Net, we used the same training dataset that we used for training the CID. We also used the Adam optimization method and the same settings (number of epochs, learning rate decay, batch size) as for the CID. We trained and tested two different U-Net sizes. At each BCD-Net iteration, the U-Net has either about 4 K (similar in size to the CID) or 1.4 M trainable parameters. For the U-Net with 1.4 M parameters, we set the number of convolutional filter channels of the first encoder layer to 12 and used 4 levels of contraction/expansion; for the U-Net with 4 K parameters, we used 5 channels and 1 level of contraction/expansion.

III. Results

A. Reconstruction setup

We compared the proposed BCD-Net method to the standard EM (1 subset), TV-based MBIR with the PDHG algorithm (PDHG-TV), and NLM-based MBIR with the ADMM algorithm (ADMM-NLM). For regularized MBIR methods including BCD-Net, we used 10 EM algorithm iterations to get the initial image $x^{(0)}$. For each regularization method, we finely tuned the regularization parameter $\beta$ (within the range $[2^{-15}, 2^{15}]$) by considering the recovery accuracy and noise. For NLM, we additionally tuned the window and search sizes. For the XCAT simulation data, we used 40 iterations for EM and 30 iterations (T = 30) for PDHG-TV, ADMM-NLM, and BCD-Net. For the measured data, we used 20 iterations for EM and 10 iterations (T = 10) for PDHG-TV, ADMM-NLM, and BCD-Net. In both cases, we used 1 inner-iteration (T′ = 1) for the reconstruction module (11) in each outer-iteration of BCD-Net. We set c = 0.01 in (13) in the XCAT simulation study and c = 0.005 in both the phantom measurement and patient studies.

B. Results: Reconstruction (testing) on simulation data

Figs. 2–3 show that the proposed iterative NN, BCD-Net, significantly improves overall reconstruction performance over the other non-trained regularized MBIR methods. Fig. 3 reports evaluation metrics averaged over realizations and shows that BCD-Net with a trained CID achieves the best results in most evaluation metrics. In particular, BCD-Net with a CID improves CNR and RMSE compared to PDHG-TV and ADMM-NLM. BCD-Net also improved contrast recovery in the cold region while not increasing noise compared to the initial EM reconstruction, whereas PDHG-TV and ADMM-NLM reduced noise while degrading the CR. For Fig. 2, we selected the iteration number for EM that gave the highest CNR and the last iteration number for the other methods. Fig. 2 shows that BCD-Net’s reconstructed image with a CID is closest to the true image, whereas PDHG-TV and ADMM-NLM excessively blur the cold region. BCD-Net with the U-Net denoiser shows good recovery for the cold region; however, it blurs the hot region. Moreover, the larger U-Net denoiser (params: 1.4 M) worsens the performance of BCD-Net, possibly due to over-fitting the training dataset.

Fig. 3.

(a) Plot of noise in background liver vs contrast recovery in cold spot (b) RMSE vs iteration (c) Contrast to noise ratio vs iteration. We initialized regularized methods with the 10th iterate of EM reconstruction.

C. Results: Reconstruction (testing) on measurement data

1). Phantom study:

Similar to the simulation results, Figs. 4–5 show that BCD-Net improved overall reconstruction performance over the other reconstruction methods. Fig. 4 shows that images reconstructed using PDHG-TV and ADMM-NLM have a more uniform texture in the background liver than EM; however, they excessively blur the regions around the hot spheres. The blurred hot region is more evident in the quantification results in Fig. 5. BCD-Net gives better visibility of the hot spheres, with a noisier texture in the uniform liver region. Fig. 5 shows that BCD-Net with a CID improves CNR compared to PDHG-TV and ADMM-NLM. BCD-Net with a CID also improved contrast recovery in the hot spheres while slightly increasing noise compared to the initial EM reconstruction. In Fig. 5(a), BCD-Net with the U-Net denoiser fluctuates with iterations; however, the trend of the plot is similar to that of BCD-Net with a CID.

Fig. 5.

Liver phantom measurement: (a) Plot of noise in background liver vs contrast recovery in hot spheres (b) Contrast to noise ratio vs iteration. We initialized regularized methods with the 10th iterate of EM reconstruction.

2). Patient study:

Because the true activity distribution is unknown, we quantitatively evaluated each reconstruction method with the FOV activity bias. In this quantitative evaluation, BCD-Net showed results similar to those of the other methods; see Figs. 6–7. Fig. 6 shows that the image quality of the different methods in the patient study is similar to that in the phantom measurement study shown in Fig. 4. Fig. 7(b) shows that the CNR trend in the patient study is similar to that of the XCAT simulation and the liver phantom measurement.

Fig. 6.

Y90 PET/CT patient measurement: Attenuation map and reconstructed images of one slice (coronal and axial view) using OSEM, TV, NLM, and BCD-Net. We visualized the reconstructed image of BCD-Net-UNet with 4 K parameters

Fig. 7.

Patient scan: (a) Field of view bias vs iteration. BCD-Net shows similar results compared to other methods. (b) Contrast to noise ratio vs iteration.

IV. Discussion

In this study we showed the efficacy of the trained BCD-Net on both qualitative and quantitative Y-90 PET/CT imaging and compared it with conventional non-trained regularizers. The proposed approach uses learned denoising NNs to lift estimated signals and thresholding operations to remove unwanted signals. In particular, the iterative framework of BCD-Net enables one to train the filters and thresholding values to deal with the different image roughness at each iteration. We experimentally demonstrated its generalization capabilities with simulation and measurement data. In the XCAT PET/CT simulation with activity distributions and count-rates mimicking Y-90 PET imaging, total counts in the cold spot were overestimated with standard reconstruction and other MBIR methods using non-trained regularization, yet approached the true value with the proposed approach. Improvements were also demonstrated for the measurement data, where we used training and testing datasets having very different activity distributions and count-levels. The architecture and size of the denoising NN significantly affect the performance of BCD-Net. In both simulation and measurement experiments, the CID outperformed the U-Net architectures. Using a U-Net with more trainable parameters degraded the performance, especially in the simulation study, due to the small size of the dataset. The size of the denoising NN should be set with consideration of the training dataset size.

We tested which imaging variable most affects the generalization performance of the proposed BCD-Net. Table IV shows how BCD-Net performs when the training and testing data had the same activity distribution and count-level (the only difference being Poisson noise) and how the performance of BCD-Net degrades when each imaging variable is changed between the training and testing datasets. We changed one of three factors (shape and size of tumor and liver, concentration ratio, count-level) in the training dataset compared to the testing dataset. The result shows that the generalization performance of the proposed BCD-Net depends largely on all imaging variables. However, training with a higher contrast and lower count-level dataset (compared to the testing dataset) gave less degradation of performance than the opposite cases. This result suggests that it is better for the training dataset to be noisier than the testing dataset. In other words, training for more noise reduction than needed is better than training for less noise reduction than needed.

TABLE IV.

Impact of imaging variable on generalization capability of BCD-Net-CID.

| Changed imaging variable | Training | Testing | RMSE | Drop (%) |
|---|---|---|---|---|
| Identical | - | - | 4.74 | - |
| Shape and size | See Fig. 2 | See Fig. 2 | 5.49 | 15.9 |
| Concentration ratio | 9:1 | 4:1 | 5.55 | 17.1 |
| Concentration ratio | 1.7:1 | 4:1 | 5.81 | 22.5 |
| Trues count-level | 2 × 10^5 | 5 × 10^5 | 5.01 | 5.7 |
| Trues count-level | 11 × 10^5 | 5 × 10^5 | 5.71 | 20.5 |
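The Drop (%) column of Table IV appears consistent with the relative increase in RMSE over the identical-condition baseline of 4.74. The short check below assumes that definition, which is not stated explicitly here; the RMSE values are copied from the table and the dictionary keys are only descriptive labels.

```python
# RMSE values copied from Table IV; the "drop" formula is an assumption.
baseline_rmse = 4.74  # training and testing data differ only in Poisson noise

mismatch_rmse = {
    "shape and size": 5.49,
    "concentration ratio 9:1 vs 4:1": 5.55,
    "concentration ratio 1.7:1 vs 4:1": 5.81,
    "count-level 2e5 vs 5e5": 5.01,
    "count-level 11e5 vs 5e5": 5.71,
}


def relative_drop(rmse, baseline):
    """Percent increase in RMSE relative to the baseline RMSE."""
    return 100.0 * (rmse - baseline) / baseline


drops = {name: relative_drop(v, baseline_rmse) for name, v in mismatch_rmse.items()}
for name, drop in drops.items():
    print(f"{name}: {drop:.1f}%")
```

Under this assumed definition, the computed percentages agree with the tabulated ones to within about 0.1 percentage point, a gap that is plausibly due to the RMSE entries being rounded to two decimals.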

We also investigated how each factor in training the denoising module (7) impacts the generalization capability of BCD-Net. Fig. 8(a)-(b) show the impact of the number and size of filters on performance. The plots show that the proposed BCD-Net achieved lower training RMSE with more and larger filters; however, this did not decrease the testing RMSE relative to fewer and smaller filters, and BCD-Net with larger filters excessively blurred the image, resulting in higher RMSE (see Fig. 8(e)). We also tested an l1 training loss to see whether it improves performance over the l2 loss (MSE) in (9). However, it led to unnaturally piecewise-constant images, and details in small cold regions were ignored.

Fig. 8.

(a)-(b) Impact of number/size of filter and training loss on testing dataset RMSE. (c) Reconstructed image from BCD-Net-CID with filters and thresholding values trained with l1-loss.

Fig. 9 shows how the regularization parameter β in (13) changes with iterations for the training and testing datasets. The β value converges to different limits in the training and testing cases. The adaptive scheme automatically increases β when the count-level decreases. This behavior agrees with the general understanding that more regularization is needed as the noise level increases. These empirical results underscore the importance in PET imaging of adaptive regularization parameter selection schemes such as the one proposed in Section II-C2.

Fig. 9.

Efficacy of adaptive selection of regularization parameter β.

Many related works [5], [14], [15], [49] use a single image-denoising (deep) NN (e.g., U-Net) as a post-reconstruction processing step, so we investigated how a denoising NN detached from the data-fit term performs compared to the iterative NN. Fig. 10 illustrates u(1) generated by CID and U-Net. As with the iterative NN, using more trainable parameters degraded the visual quality and RMSE for U-Net, and CID achieved better results than U-Net. In all cases, the iterative NN achieved lower RMSE than the corresponding post-reconstruction processed image, as shown in Table V.

Fig. 10.

g2(u(1)) generated by (a) CID and (b)-(c) U-Net. Using more parameters degraded the visual quality and RMSE value as in the iterative NN.

TABLE V.

Comparison between post reconstruction processing and iterative NN.

| Method | g2(u(1)) RMSE | x(30) RMSE | Δ (%) |
|---|---|---|---|
| CID (params: 4 k) | 6.87 | 6.36 | 7.52 |
| U-Net (params: 4 k) | 7.39 | 6.70 | 9.30 |
| U-Net (params: 1.4 M) | 7.67 | 7.04 | 8.15 |

BCD-Net is trained for a specific number of iterations, and its practical use would be akin to how ML-EM is used with a fixed number of iterations in clinical systems. If one is interested in convergence guarantees when running more iterations, one can extend the sequence convergence guarantee of BCD-Net in [23] by setting the nth adaptive denoiser as $\tilde{D}^{(n)} = g_2(D^{(n)}(g_1(x^{(n-1)})))$ with some nth denoiser $D^{(n)}$ (e.g., CID (7) or U-Net), ∀n, using sufficiently large T′ (so that MAP EM finds a critical point), and additionally assuming that β(n) converges. We empirically observed that β(n) tends to converge to a constant in this Y-90 PET application as well as in another application, Lu-177 SPECT.

To guarantee convergence in a more practical way, one could use training and testing datasets with similar count-levels and a regularization parameter value that is fixed across iterations, computed from the initial estimated image and the corresponding denoised image as follows:

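$$\beta = \frac{\|\nabla_x f(x^{(0)})\|_2}{\|\nabla_x R(x^{(0)})\|_2} \cdot c = \frac{\|a_j - e_j(x^{(0)})\|_2}{\|x^{(0)} - g_2(u^{(1)})\|_2} \cdot c.$$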
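As a rough numerical illustration of this fixed-β rule, the sketch below computes the ratio of gradient norms for a toy problem. It assumes a Poisson negative log-likelihood data-fit term, whose gradient is Aᵀ(1 − y/(Ax + r)), a quadratic regularizer whose gradient is the difference between the initial image and the denoised image, and a constant c that enters multiplicatively. The tiny system matrix, measurements, denoised image, and value of c are all made up for illustration.

```python
import math


def matvec(A, x):
    """Forward projection A x for a dense list-of-rows matrix."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def rmatvec(A, v):
    """Back projection A^T v."""
    return [sum(A[i][j] * v[i] for i in range(len(A))) for j in range(len(A[0]))]


def norm2(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(t * t for t in v))


def poisson_nll_gradient(A, x, y, r):
    """Gradient of sum_i (ybar_i - y_i * log(ybar_i)) with ybar = A x + r."""
    ybar = [p + ri for p, ri in zip(matvec(A, x), r)]
    return rmatvec(A, [1.0 - yi / yb for yi, yb in zip(y, ybar)])


def fixed_beta(A, x0, y, r, denoised, c):
    """Ratio of data-fit to regularizer gradient norms at x0, scaled by c."""
    grad_f = poisson_nll_gradient(A, x0, y, r)
    grad_r = [a - b for a, b in zip(x0, denoised)]  # gradient of 0.5 * ||x - u||^2
    denom = norm2(grad_r)
    if denom == 0.0:
        raise ValueError("denoised image equals x0; ratio is undefined")
    return c * norm2(grad_f) / denom


# Hypothetical toy problem: 3 measurement bins, 2 voxels.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [3.0, 1.0, 2.0]          # measured counts
r = [1.0, 1.0, 1.0]          # mean background (randoms and scatter)
x0 = [2.0, 1.0]              # initial image estimate
denoised = [1.5, 1.5]        # stand-in for the denoised initial image
beta = fixed_beta(A, x0, y, r, denoised, c=0.5)
print(beta)
```

The guard against a zero denominator matters in practice: if the denoiser leaves the initial image unchanged, the ratio is undefined and some fallback value would be needed.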

The convergence properties depend on additional technical assumptions detailed in [23].

V. Conclusion

It is important for a “learned” regularizer to have generalization capability to help ensure good performance when applying it to an unseen dataset. For low-count PET reconstruction, the proposed iterative NN, BCD-Net, showed reliable generalization capability even when the training dataset is small. The proposed BCD-Net achieved significant qualitative and quantitative improvements over the conventional MBIR methods using “hand-crafted” non-trained regularizers: TV and NLM. In particular, these conventional MBIR methods have a trade-off between noise and recovery accuracy, whereas the proposed BCD-Net improves CR for hot regions while not increasing the noise when the regularization parameter is appropriately set. Visual comparisons of the reconstructed images also show that the proposed BCD-Net significantly improves PET image reconstruction performance compared to MBIR methods using non-trained regularizers.

Future work includes investigating performance of BCD-Net trained with end-to-end training principles and adaptive selection of trainable parameter numbers depending on the size of training dataset.

VI. Acknowledgment

We acknowledge Se Young Chun (UNIST) for providing NLM regularization codes. We also acknowledge Maurizio Conti and Deepak Bharkhada (SIEMENS Healthcare Molecular Imaging) for providing the forward/back projector for TOF measurement data. This work was supported by grant R01 EB022075, awarded by the National Institute of Biomedical Imaging and Bioengineering, National Institutes of Health, U.S. Department of Health and Human Services.

Contributor Information

Hongki Lim, Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109 USA.

Il Yong Chun, Department of Electrical Engineering, University of Hawai’i–Mānoa, HI 96822 USA.

Yuni K. Dewaraja, Department of Radiology, University of Michigan, Ann Arbor, MI 48109 USA.

Jeffrey A. Fessler, Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109 USA.

References

  • [1]. Carlier T, Willowson KP, Fourkal E, Bailey DL, Doss M, and Conti M, “Y90-PET imaging: exploring limitations and accuracy under conditions of low counts and high random fraction,” Med. Phys., vol. 42, no. 7, pp. 4295–309, June 2015.
  • [2]. Ahn S, Ross SG, Asma E, Miao J, Jin X, Cheng L, Wollenweber SD, and Manjeshwar RM, “Quantitative comparison of OSEM and penalized likelihood image reconstruction using relative difference penalties for clinical PET,” Physics in Medicine & Biology, vol. 60, no. 15, p. 5733, 2015.
  • [3]. Wang G, Ye JC, Mueller K, and Fessler JA, “Image reconstruction is a new frontier of machine learning,” IEEE Trans. Med. Imag., vol. 37, no. 6, pp. 1289–96, June 2018.
  • [4]. Chen H, Zhang Y, Kalra MK, Lin F, Chen Y, Liao P, Zhou J, and Wang G, “Low-dose CT with a residual encoder-decoder convolutional neural network,” IEEE Trans. Med. Imag., vol. 36, no. 12, pp. 2524–2535, 2017.
  • [5]. Jin KH, McCann MT, Froustey E, and Unser M, “Deep convolutional neural network for inverse problems in imaging,” IEEE Trans. Image Process., vol. 26, no. 9, pp. 4509–4522, 2017.
  • [6]. Ye JC, Han Y, and Cha E, “Deep convolutional framelets: A general deep learning framework for inverse problems,” SIAM Journal on Imaging Sciences, vol. 11, no. 2, pp. 991–1048, 2018.
  • [7]. Gupta H, Jin KH, Nguyen HQ, McCann MT, and Unser M, “CNN-based projected gradient descent for consistent CT image reconstruction,” IEEE Trans. Med. Imag., vol. 37, no. 6, pp. 1440–1453, 2018.
  • [8]. Chun IY, Lim H, Huang Z, and Fessler JA, “Fast and convergent iterative signal recovery using trained convolutional neural networks,” in Proc. Allerton Conf. on Commun., Control, and Comput., Allerton, IL, Oct. 2018, pp. 155–159.
  • [9]. Aggarwal HK, Mani MP, and Jacob M, “MoDL: model-based deep learning architecture for inverse problems,” IEEE Trans. Med. Imag., vol. 38, no. 2, pp. 394–405, February 2019.
  • [10]. Hammernik K, Klatzer T, Kobler E, Recht MP, Sodickson DK, Pock T, and Knoll F, “Learning a variational network for reconstruction of accelerated MRI data,” Magn. Reson. Med., vol. 79, no. 6, pp. 3055–3071, 2018.
  • [11]. Sun J, Li H, Xu Z et al., “Deep ADMM-Net for compressive sensing MRI,” in Proc. NIPS, 2016, pp. 10–18.
  • [12]. Mardani M, Gong E, Cheng JY, Vasanawala SS, Zaharchuk G, Xing L, and Pauly JM, “Deep generative adversarial neural networks for compressive sensing MRI,” IEEE Trans. Med. Imag., vol. 38, no. 1, pp. 167–179, 2019.
  • [13]. Yang G, Yu S, Dong H, Slabaugh G, Dragotti PL, Ye X, Liu F, Arridge S, Keegan J, Guo Y et al., “DAGAN: deep de-aliasing generative adversarial networks for fast compressed sensing MRI reconstruction,” IEEE Trans. Med. Imag., vol. 37, no. 6, pp. 1310–1321, 2018.
  • [14]. Xu J, Gong E, Pauly J, and Zaharchuk G, “200x low-dose PET reconstruction using deep learning,” arXiv preprint arXiv:1712.04119, 2017.
  • [15]. Yang B, Ying L, and Tang J, “Artificial Neural Network Enhanced Bayesian PET Image Reconstruction,” IEEE Trans. Med. Imag., vol. 37, no. 6, pp. 1297–1309, June 2018.
  • [16]. Haggstrom I, Schmidtlein CR, Campanella G, and Fuchs TJ, “DeepPET: A deep encoder-decoder network for directly solving the PET image reconstruction inverse problem,” Med. Im. Anal., vol. 54, pp. 253–62, May 2019.
  • [17]. Gong K, Guan J, Kim K, Zhang X, Yang J, Seo Y, El Fakhri G, Qi J, and Li Q, “Iterative PET image reconstruction using convolutional neural network representation,” IEEE Trans. Med. Imag., vol. 38, no. 3, pp. 675–685, 2019.
  • [18]. Kim K, Wu D, Gong K, Dutta J, Kim JH, Son YD, Kim HK, El Fakhri G, and Li Q, “Penalized PET reconstruction using deep learning prior and local linear fitting,” IEEE Trans. Med. Imag., vol. 37, no. 6, pp. 1478–1487, 2018.
  • [19]. Ronneberger O, Fischer P, and Brox T, “U-net: Convolutional networks for biomedical image segmentation,” in Proc. Med. Image Compt. and Computer Assist. Interven. (MICCAI). Springer, 2015, pp. 234–241.
  • [20]. Zhang K, Zuo W, Chen Y, Meng D, and Zhang L, “Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising,” IEEE Trans. Image Process., vol. 26, no. 7, pp. 3142–3155, July 2017.
  • [21]. Gregor K and LeCun Y, “Learning fast approximations of sparse coding,” in Proc. ICML, 2010, pp. 399–406.
  • [22]. Chen Y and Pock T, “Trainable nonlinear reaction diffusion: A flexible framework for fast and effective image restoration,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 39, no. 6, pp. 1256–1272, 2017.
  • [23]. Chun IY, Huang Z, Lim H, and Fessler JA, “Momentum-net: Fast and convergent iterative neural network for inverse problems,” arXiv preprint arXiv:1907.11818, July 2019.
  • [24]. Ye S, Long Y, and Chun IY, “Momentum-Net for low-dose CT image reconstruction,” arXiv preprint arXiv:2002.12018, February 2020.
  • [25]. Chun IY and Fessler JA, “Deep BCD-net using identical encoding-decoding CNN structures for iterative image recovery,” in Proc. Image, Video, and Multidim. Signal Process. (IVMSP) Workshop, Zagori, Greece, Apr. 2018, pp. 1–5.
  • [26]. Chun IY and Fessler JA, “Convolutional analysis operator learning: acceleration and convergence,” IEEE Trans. Image Process., vol. 29, no. 1, pp. 2108–2122, 2020.
  • [27]. Chun IY, Hong D, Adcock B, and Fessler JA, “Convolutional analysis operator learning: Dependence on training data,” IEEE Signal Process. Lett., vol. 26, no. 8, pp. 1137–1141, 2019.
  • [28]. Crockett C, Hong D, Chun IY, and Fessler JA, “Incorporating handcrafted filters in convolutional analysis operator learning for ill-posed inverse problems,” in Proc. IEEE Intl. Workshop on Compt. Adv. in Multi-Sensor Adaptive Process (CAMSAP), Guadeloupe, West Indies, Dec. 2019, pp. 316–320.
  • [29]. Chun IY, Zheng X, Long Y, and Fessler JA, “BCD-Net for low-dose CT reconstruction: Acceleration, convergence, and generalization,” in Proc. Med. Image Compt. and Computer Assist. Interven. (MICCAI), Shenzhen, China, Oct. 2019, pp. 31–40.
  • [30]. Li Z, Chun IY, and Long Y, “Image-domain material decomposition using an iterative neural network for dual-energy CT,” in Proc. IEEE Intl. Symp. Biomed. Imag. (ISBI) (to appear), Iowa City, IA, Apr. 2020.
  • [31]. Lim H, Huang Z, Fessler JA, Dewaraja YK, and Chun IY, “Application of trained deep BCD-Net to iterative low-count PET image reconstruction,” in Proc. IEEE Nuclear Science Symposium and Medical Imaging Conference (NSS-MIC), Sydney, Australia, Nov. 2018, pp. 1–4.
  • [32]. Elschot M, Lam MG, van den Bosch MA, Viergever MA, and de Jong HW, “Quantitative Monte Carlo-based 90Y SPECT reconstruction,” J. Nucl. Med., vol. 54, no. 9, pp. 1557–1563, 2013.
  • [33]. Pasciak AS, Bourgeois AC, McKinney JM, Chang TT, Osborne DR, Acuff SN, and Bradley YC, “Radioembolization and the dynamic role of 90Y PET/CT,” Frontiers in Oncology, vol. 4, p. 38, 2014.
  • [34]. Nuyts J, Beque D, Dupont P, and Mortelmans L, “A concave prior penalizing relative differences for maximum-a-posteriori reconstruction in emission tomography,” IEEE Trans. Nucl. Sci., vol. 49, no. 1, pp. 56–60, 2002.
  • [35]. Pfister L and Bresler Y, “Learning sparsifying filter banks,” in Wavelets and Sparsity XVI, vol. 9597. International Society for Optics and Photonics, 2015, p. 959703.
  • [36]. Cai J-F, Ji H, Shen Z, and Ye G-B, “Data-driven tight frame construction and image denoising,” Applied and Computational Harmonic Analysis, vol. 37, no. 1, pp. 89–105, 2014.
  • [37]. Chun IY and Fessler JA, “Convolutional analysis operator learning: Application to sparse-view CT,” in Proc. Asilomar Conf. on Signals, Syst., and Comput., Pacific Grove, CA, Oct. 2018, pp. 1631–1635.
  • [38]. De Pierro AR, “A modified expectation maximization algorithm for penalized likelihood estimation in emission tomography,” IEEE Trans. Med. Imag., vol. 14, no. 1, pp. 132–7, March 1995.
  • [39]. Press WH, Flannery BP, Teukolsky SA, and Vetterling WT, Numerical recipes in C. New York: Cambridge Univ. Press, 1988.
  • [40]. Zhang Z, Rose S, Ye J, Perkins AE, Chen B, Kao C-M, Sidky EY, Tung C-H, and Pan X, “Optimization-based image reconstruction from low-count, list-mode TOF-PET data,” IEEE Transactions on Biomedical Engineering, vol. 65, no. 4, pp. 936–946, 2018.
  • [41]. Chambolle A and Pock T, “An introduction to continuous optimization for imaging,” Acta Numerica, vol. 25, pp. 161–319, 2016.
  • [42]. Chun SY, Dewaraja YK, and Fessler JA, “Alternating direction method of multiplier for tomography with nonlocal regularizers,” IEEE Trans. Med. Imag., vol. 33, no. 10, pp. 1960–1968, 2014.
  • [43]. Boyd S, Parikh N, Chu E, Peleato B, and Eckstein J, “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Found. & Trends in Machine Learning, vol. 3, no. 1, pp. 1–122, 2010.
  • [44]. Segars W, Sturgeon G, Mendonca S, Grimes J, and Tsui BM, “4D XCAT phantom for multimodality imaging research,” Medical Physics, vol. 37, no. 9, pp. 4902–4915, 2010.
  • [45]. Lim H, Dewaraja YK, and Fessler JA, “A PET reconstruction formulation that enforces non-negativity in projection space for bias reduction in Y-90 imaging,” Phys. Med. Biol., vol. 63, no. 3, p. 035042, February 2018.
  • [46]. Paszke A, Gross S, Chintala S, Chanan G, Yang E, DeVito Z, Lin Z, Desmaison A, Antiga L, and Lerer A, “Automatic differentiation in PyTorch,” in NIPS-W, 2017.
  • [47]. Kingma DP and Ba J, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014.
  • [48]. Zbontar J, Knoll F, Sriram A, Muckley MJ, Bruno M, Defazio A, Parente M, Geras KJ, Katsnelson J, Chandarana H et al., “fastMRI: An open dataset and benchmarks for accelerated MRI,” arXiv preprint arXiv:1811.08839, 2018.
  • [49]. Kang E, Min J, and Ye JC, “A deep convolutional neural network using directional wavelets for low-dose X-ray CT reconstruction,” Medical Physics, vol. 44, no. 10, pp. e360–e375, 2017.
