Author manuscript; available in PMC: 2023 Apr 7.
Published in final edited form as: Med Phys. 2022 Feb 17;49(4):2531–2544. doi: 10.1002/mp.15520

A back-projection-and-filtering-like (BPF-like) reconstruction method with the deep learning filtration from listmode data in TOF-PET

Li Lv 1, Gengsheng L Zeng 2, Yunlong Zan 3, Xiang Hong 1, Minghao Guo 4, Gaoyu Chen 1, Weijie Tao 1,3, Wenxiang Ding 1, Qiu Huang 1,3
PMCID: PMC10080664  NIHMSID: NIHMS1888308  PMID: 35122265

Abstract

Purpose:

The time-of-flight (TOF) information improves the signal-to-noise ratio (SNR) in positron emission tomography (PET) imaging. Existing analytical algorithms for TOF PET usually follow a filtered back-projection process to reconstruct images from sinogram data. This work aims to develop a back-projection-and-filtering-like (BPF-like) algorithm that rapidly reconstructs the TOF PET image directly from listmode data.

Methods:

We extended the 2D conventional non-TOF PET projection model to a TOF case, where projection data are represented as line integrals weighted by the one-dimensional TOF kernel along the projection direction. After deriving the central slice theorem and the TOF back-projection of listmode data, we designed a deep learning network with a modified U-net architecture to perform the spatial filtration (reconstruction filter). The proposed BP-Net method was validated via Monte Carlo simulations of TOF PET listmode data with three different time resolutions for two types of activity phantoms. The network was only trained on the simulated full-dose XCAT dataset and then evaluated on XCAT and Jaszczak data with different time resolutions and dose levels.

Results:

Reconstructed images show that, compared with the conventional BPF algorithm and the MLEM algorithm proposed for TOF PET, the proposed BP-Net method obtains better image quality in terms of peak signal-to-noise ratio, relative root mean square error, and structural similarity index; moreover, the reconstruction speed of the BP-Net is 1.75 times faster than BPF and 29.05 times faster than MLEM with 15 iterations. The results also indicate that the performance of the BP-Net degrades with worse time resolutions and lower tracer doses, but it degrades less than the BPF or MLEM reconstructions.

Conclusion:

In this work, we developed an analytical-like reconstruction in the form of BPF with the reconstruction filtering operation performed by a deep network. The method runs even faster than the conventional BPF algorithm and provides accurate reconstructions from listmode data in TOF-PET, without rebinning the data into a sinogram.

Keywords: analytical reconstruction, back-projection and filtering, deep learning, listmode, TOF PET

1. INTRODUCTION

Adding time-of-flight (TOF) information to positron emission tomography (PET) design has the potential to improve image quality, for instance, by reducing noise in the reconstructed images.1–7 TOF PET usually uses iterative reconstruction methods, such as TOF MLEM8,9 and TOF OSEM.9 They are typically able to reconstruct images with a better signal-to-noise ratio (SNR) than analytical methods but are computationally inefficient.

Research reported in reference 4 found that analytical reconstruction can compete with iterative reconstruction in terms of SNR, while being faster, when a sufficiently good time resolution (~200 ps) is reached. Generally, analytical reconstruction with TOF information is implemented by adding a one-dimensional TOF filtration along the TOF line on top of filtered back-projection (FBP). To perform the filtration, the confidence weighting (CW) method uses a 2D angle-dependent filter.1 Alternative methods use different TOF weighting functions10 or design the filter based on adaptive adjustment of the object size.11 These FBP algorithms need to bin the acquired listmode data into sinograms.

To obtain binning-error-free images, Tomitani2 proposed a BPF-based method comprising back-projection with a 1D TOF weighting kernel followed by a 2D filter in the Fourier domain. Zeng et al.12 used the Hankel transform to derive a simpler explicit representation, but the derivation of the filter model was incomplete. These methods share a significant drawback: the mean value of the reconstructed image is wrong. This error is caused by sampling the continuous filter to obtain a discrete filter in the Fourier domain, which results in a zero DC component and aliasing.13 Alternatively, one can use a deconvolution kernel in the spatial domain; however, it is challenging to derive an explicit expression for it. A third approach is to replace the filtration with a neural network.

Machine learning has become one of the most popular techniques in the field of image reconstruction owing to its data-driven nature and high computational efficiency.14 It avoids the effort of deriving an explicit model and is good at suppressing noise. Some studies adopted direct deep learning methods to learn a direct mapping between the data domain and the image domain with fully connected layers15,16 or convolutional neural networks.17 However, large amounts of training data are required to train these models. Unrolling methods18–21 turn a conventional iterative reconstruction method into an unfolded deep neural network. By introducing the physical model and data consistency, these models can usually achieve better image quality with much less data; however, due to the iterative process, they require long computation times. To speed up reconstruction, deep learning methods can also be combined with analytical reconstruction algorithms such as FBP14,22 or BPF.23 In addition, several works24–26 used back-projection-like histo-images as the network input for direct reconstruction, where the histo-image is an intermediate image between the projection data and the reconstructed image.

In this work, we first complete the derivation of the back-projection-and-filtering (BPF) reconstruction algorithm for TOF PET that was missing in Zeng et al.12 We then propose a BP-Net for TOF PET reconstruction, which combines the BPF algorithm with deep learning. Specifically, it learns the deconvolution in the spatial domain to avoid the wrong DC value. The proposed BP-Net has a high reconstruction speed comparable to the traditional BPF algorithm and needs less training data for convergence than direct deep learning methods.

This paper is organized as follows: Section 2 describes the properties of the TOF PET analytical model, introduces the proposed BP-Net framework, and then describes the generation of the simulation data and the implementation details of the BP-Net. Section 3 presents the experimental results, Section 4 provides the discussion, and Section 5 concludes the paper.

2 |. MATERIALS AND METHODS

2.1 |. Analytical model of 2D TOF PET

2.1.1 |. Projection model

Following Zeng,27 we extend the 2D conventional non-TOF PET projection model to the corresponding TOF case, where projection data are represented as line integrals weighted by the one-dimensional TOF kernel along the projection direction. For a 2D object f(x, y) defined in the Cartesian coordinate system, we define the projection along the direction \hat{u} = (-\sin\theta, \cos\theta) as:

p(\theta, s, t) = \int f\!\left(s\,\hat{u}^{\perp} + l\,\hat{u}\right) w_t(t - l)\, dl = \iint f(x, y)\, \delta(x\cos\theta + y\sin\theta - s)\, w_t(-x\sin\theta + y\cos\theta - t)\, dx\, dy, \quad (1)

where \delta is the Dirac delta function, \hat{u}^{\perp} = (\cos\theta, \sin\theta) is a 2D unit vector perpendicular to \hat{u}, and t is the measured time difference of the pair of coincident photons. A typical choice for the TOF kernel w_t(r) is the Gaussian function g(r; \sigma) defined for a scalar variable r:

g(r; \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{r^2}{2\sigma^2}}, \quad (2)

with \sigma = \frac{c\,\Delta t}{2\sqrt{8\ln 2}}, where c is the speed of light and \Delta t is the time resolution. This means that the spatial FWHM of g(r; \sigma) is consistent with the time resolution of the system.
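As a numerical illustration of Equation (2), the sketch below computes the spatial width of the Gaussian TOF kernel from the coincidence timing resolution. It is only a sketch accompanying this description, not code from the original implementation; the function names and the rounded value of the speed of light are our own choices.

```python
import numpy as np

C_MM_PER_PS = 0.2998  # speed of light, approximately 0.2998 mm per picosecond


def tof_sigma(delta_t_ps):
    """Spatial sigma (mm) of the TOF kernel for a timing FWHM of delta_t_ps,
    i.e. sigma = c * dt / (2 * sqrt(8 * ln 2))."""
    return C_MM_PER_PS * delta_t_ps / (2.0 * np.sqrt(8.0 * np.log(2.0)))


def tof_kernel(r_mm, sigma_mm):
    """Normalized 1D Gaussian TOF kernel g(r; sigma) of Equation (2)."""
    return np.exp(-r_mm ** 2 / (2.0 * sigma_mm ** 2)) / (np.sqrt(2.0 * np.pi) * sigma_mm)


# A 200-ps system localizes each event along its LOR with sigma of about 12.7 mm
# (spatial FWHM of about 30 mm); 600 ps gives roughly three times that width.
print(tof_sigma(200.0), tof_sigma(600.0))
```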

2.1.2 |. Central slice theorem

Similarly, the central slice theorem can also be extended to the TOF case.

Taking the 2D Fourier transform of the 2D TOF PET projection data with respect to s and t, we get:

P(\theta, \omega, \tau) = \iint p(\theta, s, t)\, e^{-2\pi i (s\omega + t\tau)}\, ds\, dt = \iint\!\!\iint f(x, y)\, \delta(x\cos\theta + y\sin\theta - s)\, w_t(-x\sin\theta + y\cos\theta - t)\, e^{-2\pi i (s\omega + t\tau)}\, dx\, dy\, ds\, dt = W_t(\tau) \iint f(x, y)\, e^{-2\pi i\left[x(\omega\cos\theta - \tau\sin\theta) + y(\omega\sin\theta + \tau\cos\theta)\right]}\, dx\, dy. \quad (3)

Let us introduce short-hand notations u and v as:

u = \omega\cos\theta - \tau\sin\theta, \qquad v = \omega\sin\theta + \tau\cos\theta. \quad (4)

The Fourier transform can be rewritten as follows:

P(\theta, \omega, \tau) = W_t(\tau)\, F(\omega\cos\theta - \tau\sin\theta,\ \omega\sin\theta + \tau\cos\theta) = W_t(\tau)\, F(u, v). \quad (5)

Here, F(u,v) is a short-hand notation for a rotated version of F(ω,τ).

We use these results to derive the TOF BPF reconstruction algorithm as follows.

2.1.3 |. Back-projection and filtering

Using (5), the TOF back-projection image b(x,y) can be expressed as follows:

b(x, y) = \int_0^{\pi}\!\!\int p(\theta, s, l)\, w_{bp}(t - l)\, dl\, d\theta = \int_0^{\pi} p(\theta, s, t) * w_{bp}(t)\, d\theta = \int_0^{\pi}\!\!\int_{\omega}\!\!\int_{\tau} F(u, v)\, W_t(\tau)\, W_{bp}(\tau)\, e^{2\pi i (\omega s + \tau t)}\, d\tau\, d\omega\, d\theta. \quad (6)

The first equality in (6) is the definition of the back-projection, with s = x\cos\theta + y\sin\theta and t = -x\sin\theta + y\cos\theta. In (6), u and v are not integration variables but the short-hand notations defined in (4). If we express \omega and \tau in terms of u and v, it can be verified that \omega s + \tau t = xu + yv. Then (6) becomes

b(x, y) = \int_0^{\pi}\!\!\int_u\!\!\int_v F(u, v)\, W_t(-u\sin\theta + v\cos\theta)\, W_{bp}(-u\sin\theta + v\cos\theta)\, e^{2\pi i (xu + yv)}\, du\, dv\, d\theta. \quad (7)

Taking the 2D Fourier transform of the TOF back-projected image with respect to x and y, we have

B(u, v) = \int_0^{\pi}\!\!\iint_{x,y}\!\iint_{\hat{u},\hat{v}} F(\hat{u}, \hat{v})\, W_t(-\hat{u}\sin\theta + \hat{v}\cos\theta)\, W_{bp}(-\hat{u}\sin\theta + \hat{v}\cos\theta)\, e^{2\pi i (x\hat{u} + y\hat{v})}\, e^{-2\pi i (xu + yv)}\, d\hat{u}\, d\hat{v}\, dx\, dy\, d\theta = \int_0^{\pi}\!\!\iint F(\hat{u}, \hat{v})\, W_t(-\hat{u}\sin\theta + \hat{v}\cos\theta)\, W_{bp}(-\hat{u}\sin\theta + \hat{v}\cos\theta) \left[\iint e^{2\pi i \left(x(\hat{u} - u) + y(\hat{v} - v)\right)}\, dx\, dy\right] d\hat{u}\, d\hat{v}\, d\theta = \int_0^{\pi}\!\!\iint F(\hat{u}, \hat{v})\, W_t(-\hat{u}\sin\theta + \hat{v}\cos\theta)\, W_{bp}(-\hat{u}\sin\theta + \hat{v}\cos\theta)\, \delta(\hat{u} - u, \hat{v} - v)\, d\hat{u}\, d\hat{v}\, d\theta = F(u, v) \int_0^{\pi} W_t(-u\sin\theta + v\cos\theta)\, W_{bp}(-u\sin\theta + v\cos\theta)\, d\theta. \quad (8)

Let W(ρ)=Wt(ρ)Wbp(ρ), and w(r) be its inverse Fourier transform. We have

F(u, v) \int_0^{\pi} W(-u\sin\theta + v\cos\theta)\, d\theta = F(u, v) \int_0^{\pi}\!\!\int w(r)\, e^{-2\pi i r(-u\sin\theta + v\cos\theta)}\, dr\, d\theta = F(u, v) \int_0^{\pi}\!\!\int \frac{w(r)}{|r|}\, e^{-2\pi i r(-u\sin\theta + v\cos\theta)}\, |r|\, dr\, d\theta = F(u, v) \iint \frac{w\!\left(\sqrt{x^2 + y^2}\right)}{\sqrt{x^2 + y^2}}\, e^{-2\pi i (xu + yv)}\, dx\, dy = F(u, v) \times G(u, v), \quad (9)

where

G(u, v) = \iint \frac{w\!\left(\sqrt{x^2 + y^2}\right)}{\sqrt{x^2 + y^2}}\, e^{-2\pi i (xu + yv)}\, dx\, dy. \quad (10)

In (9), x = -r\sin\theta, y = r\cos\theta, dx\, dy = |r|\, dr\, d\theta, and G(u, v) is the 2D Fourier transform of the convolution kernel function w\!\left(\sqrt{x^2 + y^2}\right) / \sqrt{x^2 + y^2}.

Let w_t(r) = g(r; \sigma_1) = \frac{1}{\sqrt{2\pi}\,\sigma_1} e^{-\frac{r^2}{2\sigma_1^2}} and w_{bp}(r) = g(r; \sigma_2) = \frac{1}{\sqrt{2\pi}\,\sigma_2} e^{-\frac{r^2}{2\sigma_2^2}}. By using the Hankel transform, it can readily be shown that the filter transfer function G(u, v) is a function of \rho = \sqrt{u^2 + v^2}.

The 2D convolution kernel is circularly symmetric and can be expressed as:

k(r) = \frac{w\!\left(\sqrt{x^2 + y^2}\right)}{\sqrt{x^2 + y^2}} = \frac{1}{\sqrt{2\pi}\,\sigma_3}\, \frac{e^{-\frac{r^2}{2\sigma_3^2}}}{|r|}, \quad (11)

where \sigma_3^2 = \sigma_1^2 + \sigma_2^2 and r^2 = x^2 + y^2.

The relationship between the back-projected image and the true image can be expressed as

b(x, y) = f(x, y) ** k(r), \quad (12)

There are two approaches to obtain the reconstruction filter. One is to take the 2D Fourier transform of both sides of the above equation; the reciprocal of the Fourier transform of k(r) is then the reconstruction filter. Zeng et al.12 used the Hankel transform to derive a simpler explicit representation of k(r) in the Fourier domain, and then obtained the explicit representation of the filter that maps b(x, y) to f(x, y). The other is to seek an equivalent deconvolution kernel in the spatial domain and perform deconvolution on b(x, y), which is more challenging. A neural network has the potential to learn this spatial deconvolution operator.
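To make the blur model of Equations (11) and (12) concrete, the following sketch discretizes k(r) on a grid with the 3.125-mm pixels used later in this work and blurs a point source with it. The clipping of the 1/|r| singularity at the central pixel and the example value of σ3 are illustrative assumptions, not choices made in the paper.

```python
import numpy as np
from scipy.signal import fftconvolve


def bpf_blur_kernel(sigma3_mm, pixel_mm=3.125, size=65):
    """Discretized 2D blur kernel k(r) = g(r; sigma3) / |r| of Equation (11).
    The 1/|r| singularity at the central pixel is clipped to half a pixel width,
    a crude discretization choice made only for this illustration."""
    coords = (np.arange(size) - size // 2) * pixel_mm
    x, y = np.meshgrid(coords, coords)
    r = np.maximum(np.hypot(x, y), 0.5 * pixel_mm)
    g = np.exp(-r ** 2 / (2.0 * sigma3_mm ** 2)) / (np.sqrt(2.0 * np.pi) * sigma3_mm)
    return g / r


# sigma3 = sqrt(sigma1^2 + sigma2^2); roughly 18 mm if both kernels match a 200-ps system.
k = bpf_blur_kernel(sigma3_mm=18.0)

# Blurring a point source with k mimics the back-projection image b = f ** k of
# Equation (12). The linear BPF route would divide the Fourier transform of b by that
# of k, whereas the BP-Net learns the corresponding deconvolution in the spatial domain.
f = np.zeros((200, 200))
f[100, 100] = 1.0
b = fftconvolve(f, k, mode="same")
```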

2.2 |. BP-Net framework

We propose a BP-Net method to learn the deconvolution operator in this work. The objective function is expressed as:

\arg\min_{\psi(\phi)} \left\| \psi(\phi, b) - I \right\|, \quad (13)

where the neural network \psi, with \phi being its parameters, takes the back-projected image b as the input and the ground-truth image I as the label. The estimated image \hat{I} is represented as \psi(\phi, b).

2.2.1 |. Network architecture

The network architecture proposed in this work, as shown in Figure 1, is a modified U-net.28 It consists of several repeated modules: (i) a convolution module; (ii) an up-sampling module; (iii) a shortcut module. In the convolution module, there are three types of convolution strategies: a 3×3 kernel with stride 1×1, a 3×3 kernel with stride 2×2 for down-sampling, and a 1×1 kernel with stride 1×1 to combine the features in every channel and generate the reconstructed image. Compared with the original U-net, we use a convolution kernel with stride 2×2 for down-sampling instead of the max-pooling module. Each convolution is followed by a rectified linear unit (ReLU). In the up-sampling module, we replace the deconvolution module with a 2×2 bi-linear up-sampling module to reduce checkerboard artifacts. To reduce the number of training parameters, the shortcut module adds feature maps of the same size from the encoder to the decoder (copy-and-add) instead of concatenating them. Furthermore, we remove the batch normalization (BN) module because it is not well suited to regression problems29 and has a high computational cost.30
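The PyTorch sketch below illustrates the modifications listed above (strided convolutions for down-sampling, bilinear up-sampling, copy-and-add shortcuts, no BN). The depth and channel widths are assumptions made for illustration; the exact configuration is the one depicted in Figure 1.

```python
import torch.nn as nn


class ConvBlock(nn.Module):
    """3x3 convolution followed by ReLU; stride 2 when used for down-sampling. No BN."""

    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=stride, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.body(x)


class BPNetFilter(nn.Module):
    """Sketch of the modified U-net used as the spatial filter (assumed widths)."""

    def __init__(self, ch=32):
        super().__init__()
        self.enc1 = ConvBlock(1, ch)
        self.down1 = ConvBlock(ch, 2 * ch, stride=2)   # strided conv replaces max pooling
        self.enc2 = ConvBlock(2 * ch, 2 * ch)
        self.down2 = ConvBlock(2 * ch, 4 * ch, stride=2)
        self.bottom = ConvBlock(4 * ch, 4 * ch)
        self.up2 = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.dec2 = ConvBlock(4 * ch, 2 * ch)
        self.up1 = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.dec1 = ConvBlock(2 * ch, ch)
        self.out = nn.Conv2d(ch, 1, kernel_size=1)     # 1x1 conv merges the channels

    def forward(self, x):                              # x: (batch, 1, 192, 192)
        e1 = self.enc1(x)
        e2 = self.enc2(self.down1(e1))
        btm = self.bottom(self.down2(e2))
        d2 = self.dec2(self.up2(btm)) + e2             # copy-and-add shortcut (not concat)
        d1 = self.dec1(self.up1(d2)) + e1
        return self.out(d1)
```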

FIGURE 1.

The overall architecture of the proposed deep filter network. Compared to the conventional U-net, we used a convolution kernel with stride 2 × 2 for down-sampling instead of the max pooling module; removed the BN module; used a 2×2 bi-linear up-sampling module for up-sampling instead of a deconvolution module; used a copy-and-add operation for shortcut instead of concatenating

2.3 |. Experimental setup

2.3.1 |. Data set generation

Monte Carlo simulation

Using Monte Carlo simulations via GATE,31 we generated PET listmode data with different TOF resolutions (200, 400, 600 ps). The modeled scanner system was a single-ring PET, details of which are shown in Table 1.

TABLE 1.

Simulated 2D TOF cylindrical PET scanner parameters

Parameter Value

Inner radius (mm) 424.5
Outer radius (mm) 444.5
Blocks per ring 48
Detectors per block 27
Block size (mm × mm × mm) 20 × 55.4 × 2
Detector size (mm × mm × mm) 20 × 2 × 2
Time resolution (ps) 200, 400, 600
Energy window 435–650 keV

In this work, two types of phantoms were simulated. The XCAT digital phantom32 was used to produce realistic three-dimensional (3D) PET images. To increase the diversity of the 3D XCAT phantoms, we randomly changed the gender, activity distribution, and presence of breasts; incorporated breathing and heartbeat motion; and inserted a random number of hot spheres with diameters ranging from 12.8 to 22.4 mm as lung lesions. A total of 1236 unique 2D activity maps of size 200 × 200 were generated. The other phantom was the Jaszczak phantom, a 20-cm diameter uniform cylinder with six hot cylinders inside; the activity concentration ratio between the hot cylinders and the background was 4:1. Since this work focuses on the BPF-like algorithm, scattering and attenuation were not considered when training the model. However, in the test process, we considered these physics effects to mimic practical data. Datasets with a full dosage of 5 × 10^6 events per slice in listmode format were simulated for training and testing. Subsequently, those listmode data were reprocessed to produce lower dosages of 10^6 (20%) and 5 × 10^5 (10%) events for testing the proposed algorithm at different count levels.
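The lower-dose test sets were derived from the full-dose listmode data. A minimal sketch of one way to do this, random thinning of events, is shown below; treating the "reprocessing" step as simple random subsampling is our assumption, not a description of the actual pipeline.

```python
import numpy as np


def thin_listmode(events, fraction, seed=0):
    """Randomly keep `fraction` of the listmode events to emulate a lower dose."""
    rng = np.random.default_rng(seed)
    return events[rng.random(len(events)) < fraction]


# e.g., 20% and 10% dose realizations from a full-dose (5e6-event) slice:
# events_20 = thin_listmode(events_full, 0.20)
# events_10 = thin_listmode(events_full, 0.10)
```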

Back-projection

The collected true coincidence data were back-projected to form our datasets. Since the distance-driven method offers lower computational cost and fewer artifacts than the ray-driven method,33 a distance-driven algorithm with TOF information was implemented for back-projection. The image matrix size was 200 × 200 and the pixel size was 3.125 mm × 3.125 mm.
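For illustration only, a much simplified ray-sampling TOF back-projector for 2D listmode events is sketched below. It is not the distance-driven projector used in this work, and the event format and the sign convention for the time difference are assumptions.

```python
import numpy as np


def tof_backproject(events, sigma_mm, img_size=200, pixel_mm=3.125, step_mm=1.0):
    """Deposit each listmode event as a 1D Gaussian TOF profile along its LOR.
    `events` is an (N, 5) array of [x1, y1, x2, y2, dt_ps]: detector endpoints
    in mm and the coincidence time difference in ps (sign convention assumed)."""
    c = 0.2998                                       # mm per picosecond
    b = np.zeros((img_size, img_size))
    half_fov = img_size * pixel_mm / 2.0
    for x1, y1, x2, y2, dt in events:
        p1, p2 = np.array([x1, y1]), np.array([x2, y2])
        u = (p1 - p2) / np.linalg.norm(p1 - p2)      # unit vector along the LOR
        center = 0.5 * (p1 + p2) + 0.5 * c * dt * u  # most likely annihilation point
        for s in np.arange(-3.0 * sigma_mm, 3.0 * sigma_mm + step_mm, step_mm):
            x, y = center + s * u
            i = int((y + half_fov) / pixel_mm)       # row index from y
            j = int((x + half_fov) / pixel_mm)       # column index from x
            if 0 <= i < img_size and 0 <= j < img_size:
                b[i, j] += np.exp(-s ** 2 / (2.0 * sigma_mm ** 2))
    return b
```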

2.3.2 |. BP-Net implementation

The network for each time resolution was trained separately. For all three time resolutions, the back-projected XCAT dataset (n = 1236), with 5 × 10^6 events in each data realization, was randomly divided into three groups: 871 samples for training (~70%), 118 for validation (~10%), and 247 for testing (~20%). All three groups of data were cropped to 192 × 192 before being input to the network. To make the value ranges of the ground-truth and back-projected images consistent, both types of images were normalized to [0, 1].

The network was implemented in the PyTorch deep learning toolbox,34 both trained and tested on NVIDIA GTX 1080Ti graphic processing unit (GPU). The mean squared error (MSE) was utilized as the loss function,

\mathrm{MSE} = \frac{1}{J} \sum_{j=1}^{J} \left(\hat{I}_j - I_j\right)^2, \quad (14)

where \hat{I} is the reconstructed image from the network, I is the ground truth, and J is the total number of pixels. The Adam optimization algorithm35 was chosen as the optimizer. The batch size and the decay factor were fixed throughout the training process, and the learning rate decayed every 100 epochs after initialization according to the decay factor. The decay factors were chosen empirically and experimentally: we compared different combinations of learning rate and decay factor for each time resolution and selected the optimal parameters based on the training loss. The numbers of epochs were chosen as a trade-off between convergence and overfitting of the networks. These settings are shown in Table 2.
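A minimal training sketch using the 200-ps settings of Table 2 is given below. `BPNetFilter` refers to the architecture sketch in Section 2.2.1, and `train_set` is an assumed dataset yielding (back-projected image, ground truth) tensor pairs, both normalized to [0, 1] and cropped to 192 × 192.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
net = BPNetFilter().to(device)                   # architecture sketch from Section 2.2.1
loader = DataLoader(train_set, batch_size=16, shuffle=True)
optimizer = torch.optim.Adam(net.parameters(), lr=5e-4)
# learning-rate decay every 100 epochs by the factor listed in Table 2 (0.6 at 200 ps)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=100, gamma=0.6)
criterion = nn.MSELoss()                         # Equation (14)

for epoch in range(400):                         # max training epochs for 200 ps
    for b_img, target in loader:
        b_img, target = b_img.to(device), target.to(device)
        optimizer.zero_grad()
        loss = criterion(net(b_img), target)
        loss.backward()
        optimizer.step()
    scheduler.step()
```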

TABLE 2.

Selected parameters for training BP-Net

Time resolution (ps) Batch size Learning rate Decay factor Max training epoch

200 16 0.0005 0.6 400
400 16 0.0005 0.5 800
600 16 0.0005 0.8 400

Specifically, the BP-Net model was trained on full dose data and then applied to lower dose data or other phantoms without fine-tuning.

2.3.3 |. Image reconstruction

The normalized and cropped back-projected images in the test set were reconstructed by a single forward pass through the trained BP-Net. As a comparison, the raw listmode data were reconstructed by a BPF-type analytical reconstruction algorithm,12 denoted as BPF, and a classic TOF iterative method,9 denoted as MLEM. For the MLEM algorithm, 15 iterations were computed. All three methods were implemented without post-reconstruction smoothing.

2.3.4 |. Image quality evaluation

To compare the performance of different methods, the relative bias and standard deviation (std) images between reconstructed images and the ground truth images were calculated. Taking 200 ps as an example, one sample was randomly chosen from the test set, and N different noise realizations of simulation data were generated with the full dosage. Images from these replicates were reconstructed via BPF, MLEM, and BP-Net to calculate the average bias and standard deviation:

\mathrm{Bias}_{\mathrm{image}} = \left(\bar{\hat{I}} - I\right) / \mu_I, \quad (15)
\mathrm{Std}_{\mathrm{image}} = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} \left(\frac{\hat{I}_i - \bar{\hat{I}}}{\mu_I}\right)^2}, \quad (16)

where N is the number of repetitions (i.e., noise realizations), I is the reference image (i.e., the ground truth), \mu_I denotes the average intensity of the whole image I, \hat{I}_i (i = 1, \ldots, N) denotes the estimated image of I from the i-th replicate, and \bar{\hat{I}} is the average of the N estimated images. In our experiments, N = 10.
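A direct NumPy implementation of Equations (15) and (16), sketched under the assumption that the N reconstructions are stacked into a single array, is:

```python
import numpy as np


def bias_std_images(recons, reference):
    """Relative bias and standard-deviation images over N noise realizations.
    `recons` has shape (N, H, W); `reference` is the ground-truth image."""
    recons = np.asarray(recons, dtype=float)
    mu_I = reference.mean()                  # average intensity of the ground truth
    mean_recon = recons.mean(axis=0)         # average estimate over the realizations
    bias = (mean_recon - reference) / mu_I   # Equation (15)
    std = np.sqrt(((recons - mean_recon) ** 2).sum(axis=0)
                  / (recons.shape[0] - 1)) / mu_I  # Equation (16)
    return bias, std
```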

To quantitatively evaluate the performance of our proposed BP-Net reconstruction method, three metrics were adopted: the peak signal-to-noise ratio (PSNR), the relative root mean square error (rRMSE), and the structural similarity index (SSIM), defined as follows:

\mathrm{PSNR} = 20 \log_{10}\!\left(\frac{I_{\max}}{\sqrt{\mathrm{MSE}(\hat{I}, I)}}\right), \quad (17)

where Imax is the maximum intensity of the ground truth.

\mathrm{rRMSE} = \sqrt{\mathrm{MSE}(\hat{I}, I)} \,/\, \mu_I, \quad (18)

where μI is the average intensity of the ground truth.

\mathrm{SSIM} = \frac{\left(2\mu_{\hat{I}}\,\mu_I + C_1\right)\left(2\sigma_{\hat{I}I} + C_2\right)}{\left(\mu_{\hat{I}}^2 + \mu_I^2 + C_1\right)\left(\sigma_{\hat{I}}^2 + \sigma_I^2 + C_2\right)}, \quad (19)

where C_1 and C_2 are small constants, often set empirically; \mu_{\hat{I}}, \mu_I, \sigma_{\hat{I}}, \sigma_I, and \sigma_{\hat{I}I} are statistics calculated pixel by pixel for the estimated image \hat{I} and the ground truth I.

PSNR and rRMSE compare the errors of corresponding pixels of two images, whereas SSIM evaluates two images in terms of brightness, contrast, and structural similarity. The higher the PSNR and SSIM and the lower the rRMSE, the better the image quality. Quantitative comparisons of the three algorithms were performed using the average values of these three metrics over all samples in the test set.
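For reference, straightforward NumPy versions of the three metrics are sketched below. The SSIM here uses global image statistics for brevity, whereas windowed implementations (e.g., skimage.metrics.structural_similarity) are more common in practice; the values of C1 and C2 are placeholders.

```python
import numpy as np


def psnr(est, ref):
    """Peak signal-to-noise ratio, Equation (17)."""
    mse = np.mean((est - ref) ** 2)
    return 20.0 * np.log10(ref.max() / np.sqrt(mse))


def rrmse(est, ref):
    """Relative root mean square error, Equation (18)."""
    return np.sqrt(np.mean((est - ref) ** 2)) / ref.mean()


def ssim_global(est, ref, c1=1e-4, c2=9e-4):
    """Single-window SSIM following Equation (19)."""
    mu_e, mu_r = est.mean(), ref.mean()
    cov = np.mean((est - mu_e) * (ref - mu_r))
    num = (2.0 * mu_e * mu_r + c1) * (2.0 * cov + c2)
    den = (mu_e ** 2 + mu_r ** 2 + c1) * (est.var() + ref.var() + c2)
    return num / den
```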

3 |. RESULTS

3.1 |. Model performance evaluation

3.1.1 |. Image quality analysis

The example bias and standard deviation images obtained with the different reconstruction algorithms at a time resolution of 200 ps are shown in Figure 2. As seen, BP-Net outperforms both BPF and MLEM with the lowest bias, and its variance is relatively low, comparable to that of MLEM. The high bias of BPF may be related to errors introduced by the FFT and IFFT of the finite back-projection matrix. Furthermore, the resulting image quality of the three methods is shown in Table 3. For all three metrics (PSNR, rRMSE, and SSIM), BP-Net performs the best while BPF performs the worst, with statistical significance (p < 10⁻¹⁰). The rRMSE of BPF is more than twice that of BP-Net, which indicates the denoising ability of the neural network in reconstruction. Also, the metrics for BPF are close to those for MLEM, which demonstrates the potential of the analytical method for TOF PET.

FIGURE 2.

Bias and standard deviation (STD) images reconstructed using three different methods for time resolution being 200 ps

TABLE 3.

Average quantitative measures (n = 247) for different methods at different time resolutions

Method   200 ps              400 ps              600 ps
         PSNR  rRMSE  SSIM   PSNR  rRMSE  SSIM   PSNR  rRMSE  SSIM

BPF      32.02 0.54   0.94   30.84 0.61   0.89   29.92 0.67   0.84
MLEM     32.46 0.52   0.96   30.98 0.61   0.95   30.11 0.67   0.94
BP-Net   38.05 0.27   0.99   36.47 0.32   0.98   36.47 0.32   0.98

Reconstructions of the same sample are shown in Figure 3. For each time resolution, the proposed BP-Net method is superior to the other two methods in suppressing noise and preserving edges. More details, including small targets and organ boundaries, are preserved by BP-Net. As the time resolution worsens, the noise increases noticeably and the recovery of small targets deteriorates in the BPF and MLEM reconstructions, whereas BP-Net maintains good performance, except for some tiny targets missing at a time resolution of 600 ps, as shown in the zoomed-in images. The quantitative metrics of the current slice show consistent results.

FIGURE 3.

Reconstructions of the XCAT phantom using two traditional methods (BPF and MLEM) and the proposed BP-Net method at different time resolutions (200, 400, 600 ps), with the quantitative metrics of the current slice shown at the lower left corner. BP-Net shows the best image quality compared to the other two methods

3.1.2 |. Reconstruction speed

Our proposed method reconstructs the XCAT phantom faster than the other two methods. The average time consumption, shown in Table 4, was calculated for the XCAT test dataset (n = 247) at the full dosage. With an average speed of 2.13 s per slice, BP-Net compares favorably to BPF, which averages 3.73 s per slice (1.75 times slower), and to MLEM with 15 iterations, which averages 61.88 s per slice (29.05 times slower).

TABLE 4.

The statistics (n = 247) of time consumption to reconstruct one slice in different methods

Method   Time (s)

BPF      3.73
MLEM     61.88
BP-Net   2.13

3.2 |. Generalization ability of the network

In this section, we evaluate the reconstruction performance of the BP-Net on different test sets: XCAT phantoms with lower dosages, XCAT data corrupted by attenuation, scatter, and randoms, and the Jaszczak phantom with a markedly different activity distribution. We directly apply the network trained in the previous section without retraining or fine-tuning.

3.2.1 |. XCAT phantom with lower dosages

As the dose goes down, BP-Net retains more details and better preserves the contrast between organs than the other two methods. From Figure 4, we see that MLEM loses contrast and details, while BPF is too noisy to resolve the boundaries. These effects are pronounced in the zoomed-in images. The average quantitative performance for the test set at dosages of 20% and 10% is shown in Table 5. Compared with Table 3, the performance of all three methods worsens because of the lower dosage, but BP-Net still performs the best with statistical significance (p < 10⁻¹⁰) while BPF remains the worst at all time resolutions. These results suggest that the BP-Net can be applied at lower dose levels, down to 20%.

FIGURE 4.

Reconstructions of the XCAT phantom with different doses (100%, 20%, and 10% of the dose used in Figure 3) for time resolution being 200 ps, with the quantitative metrics of the current slice shown at the lower left corner. Compared to the other two methods, the BP-Net retains more details and better reserves the contrast between organs as the dose goes down, up to fivefold

TABLE 5.

Average quantitative measures (n = 247) for different methods at lower dose

Dose  Method   200 ps              400 ps              600 ps
               PSNR  rRMSE  SSIM   PSNR  rRMSE  SSIM   PSNR  rRMSE  SSIM

20%   BPF      28.49 0.78   0.88   26.65 0.96   0.79   25.64 1.07   0.69
      MLEM     30.17 0.66   0.93   29.52 0.70   0.92   29.04 0.74   0.91
      BP-Net   33.63 0.43   0.97   32.19 0.51   0.96   31.57 0.54   0.95
10%   BPF      26.40 0.99   0.84   24.60 1.21   0.73   23.73 1.33   0.63
      MLEM     28.45 0.79   0.90   28.25 0.81   0.90   28.03 0.83   0.90
      BP-Net   30.69 0.60   0.94   29.37 0.70   0.94   28.70 0.75   0.92

3.2.2 |. XCAT phantom including attenuation, scatter, and randoms

In this section, physics effects such as attenuation, scatter, and randoms were taken into account in the GATE simulation to mimic actual data more closely. Both prompt and delayed coincidences were collected. The data were then pre-corrected for scatter and random coincidences as in references 36–38. The attenuation maps were used for attenuation correction according to references 39 and 40, and back-projection and reconstruction were then applied to the corrected emission data.

As shown in Figure 5, BP-Net is still clearly superior to both MLEM and BPF at all three time resolutions. Compared with Figure 3, all three methods show some degradation in image quality due to imperfections in the correction for the physics effects, but the degradation of BP-Net is the smallest, with the most details and the lowest image noise. The quantitative comparison is consistent with the visual observation. As shown in Table 6, BP-Net performs the best while BPF performs the worst, consistent with the conclusion in Section 3.1.1, where no attenuation, scatter, or randoms were considered.

FIGURE 5.

Reconstructions of the XCAT phantom when considering attenuation, scatter, and random effects at different time resolutions (200, 400, 600 ps), with the quantitative metrics of the current slice shown at the lower left corner. The BP-Net still shows the best image quality compared with the other two methods

TABLE 6.

Average quantitative measures (n = 247) for different methods at different time resolutions

Method   200 ps              400 ps              600 ps
         PSNR  rRMSE  SSIM   PSNR  rRMSE  SSIM   PSNR  rRMSE  SSIM

BPF      30.39 0.62   0.90   28.86 0.74   0.84   27.90 0.83   0.78
MLEM     31.5  0.57   0.94   29.93 0.66   0.93   29.11 0.73   0.92
BP-Net   33.43 0.44   0.97   31.30 0.57   0.96   31.03 0.58   0.96

3.2.3 |. Jaszczak phantom

Figure 6 shows the reconstructions of the Jaszczak phantom. At the full dosage, BP-Net generates visually comparable results to MLEM, and both maintain a lower noise level than BPF. The quantitative metrics show that BP-Net performs best at all three time resolutions. As the dosage decreases, reconstructions from BP-Net show increasing noise; at 600 ps, the BP-Net is more sensitive to the dose change than at the other time resolutions. These results demonstrate that the BP-Net has the potential to generalize well to new datasets.

FIGURE 6.

Reconstructions of the Jaszczak phantom using three methods at different time resolutions and using the BP-Net without batch normalization at different dose levels, with the quantitative metrics of the current slice shown at the lower left corner. BP-Net generates comparable results with MLEM and lower noise than BPF

3.3 |. Evaluation of network parameters

To further improve the network performance, we investigated whether adding the batch normalization module would make any difference. Similar to Table 2, Table 7 shows some hyper-parameter combinations that were used for the BP-Net with the BN module.

TABLE 7.

Selected parameters for training BP-Net with BN module

Time resolution (ps) Batch size Learning rate Decay factor Max training epoch

200 16 0.0005 0.6 400
400 16 0.0005 0.7 400
600 16 0.0005 0.8 400

Due to its simple structure, the Jaszczak phantom was again chosen for evaluation. Compared with Figure 6, the BP-Net with the BN module shows more heterogeneity in the background area, as indicated by the circular regions in Figure 7. The quantitative metrics, shown at the lower left corner of every image, confirm that the BP-Net without BN performs better. Based on this analysis, we selected the non-BN architecture as the final BP-Net.

FIGURE 7.

Results of BP-Net with batch normalization on Jaszczak phantom at different time resolutions and different dosages, with the quantitative metrics of the current slice shown at the lower left corner. The circular regions indicate the heterogeneous region of the background

4 |. DISCUSSION

In this work, we first defined the projection in TOF PET and proved the corresponding central slice theorem, similar to derivations by other researchers.1,2,10 Then the back-projection was derived as in Zeng et al.,12 where the detailed steps were not provided. Since the spatial deconvolution kernel is hard to derive, we proposed a modified U-net to replace the spatial filtration in this BPF-type algorithm. The modified U-net does not produce an explicit deconvolution kernel. The BP-Net is non-linear owing to components such as the ReLU units. If the ReLU units are removed, the proposed network degenerates to a linear regression model that performs worse than our non-linear model, especially in the presence of noise in the data, since non-linear activation functions improve the representational ability of a neural network.41,42

As another advantage, our proposed BP-Net algorithm is capable of performing TOF PET reconstruction directly from listmode data. Reconstruction directly from listmode data has been shown to achieve better image quality than reconstruction from default compressed or mashed sinograms.43 However, listmode data place a heavy burden on storage and computation because they record each event separately and do not bin repeated LORs. In the proposed method, we perform on-the-fly back-projection while collecting data and store only the back-projected image to reduce the storage space. In addition, the BP-Net needs a smaller training set (about 1300 slices) for convergence than networks based entirely on projection data (for instance, DeepPET requires about 20 000 slices17), because the TOF back-projection model adds information to the network.

Recently, in contrast to our proposed BP-Net, which uses a single-view back-projected image with TOF weighting along the LOR, some works have used back-projection-like histo-images as the network input for reconstruction.24–26 In reference 24, the TOF kernel weighting along the LOR was not used during back-projection; each event was back-projected to its most likely position, and the attenuation map was an additional input to the neural network. In reference 25, the attenuation effect was compensated during TOF back-projection, with no scatter or random corrections considered in this step; the target images were scatter and random corrected. In reference 26, the simulated histo-images were grouped into 48 views, and the effect of the number of views on the neural network performance was also studied. The main difference between our proposed method and the works discussed above is that we search for a tomographic filter in a BPF algorithm. This paper shows that our non-linear tomographic filter outperforms the linear tomographic filter derived mathematically in this paper.

The BP-Net produces more accurate reconstructions than the BPF filtering in the Fourier domain and the conventional MLEM algorithm, as seen in Figure 3 for different time resolutions and in Figure 4 for different noise levels. Although the noise in the reconstructed image still increases as the dose is lowered, the BP-Net preserves edges and recovers most details. Denoising techniques commonly used in reconstruction can also be applied to our method. For example, one promising solution is to add an extra network14,44–48 or a conventional post-processing filter, such as a Gaussian filter, for denoising. Another way is to fine-tune the network on a noisy dataset16,49 for each time resolution, which is also one of our future works.

As shown in Table 4, the average reconstruction at full dose with BP-Net is 1.75 times faster than BPF and 29.05 times faster than MLEM, for comparable (or even better) results. The BP-Net back-projects the data only once and filters the back-projected image using the network trained beforehand. Neither the complicated filtration of BPF nor the repeated forward-projection and back-projection operations of MLEM are required.

In the proposed BP-Net method, the TOF information affects network performance. Comparing the cases of 200, 400, and 600 ps, the better the time resolution, the better the image quality (shown in Tables 5 and 6) and the stronger the ability to recover small objects, which is consistent with previous research on TOF improvements.5,6,50

In the current BP-Net reconstruction model, the BN module was not adopted. We found that the BN module caused a loss of uniformity on the Jaszczak phantom and was more sensitive to time resolution and dose, as shown in Figure 7. Another research group made a similar observation that the BN module is not suitable when the training data have a large difference in distribution.51 Our training data set has inconsistent statistical means and variances among slices. Also, there is a large difference between the Jaszczak phantom and the XCAT phantom, which introduces artifacts when the BN module is used.52

The proposed algorithm was validated via GATE simulations of different phantoms at different noise levels in systems with different time resolutions. As shown in numerous works, GATE simulation can mimic real-world scenarios when taking into account various physical effects such as attenuation, scatter, and randoms. However, real patient data are still preferred for a complete evaluation of our method. Since the proposed BP-Net generalizes well from the XCAT phantom to the Jaszczak phantom, it has the potential to perform well on real patient data after being trained with experimental data, which will be part of our future work.

5 |. CONCLUSION

In this paper, we investigated the analytical model of 2D TOF PET directly from listmode data. We then proposed a BPF-like reconstruction algorithm with the filtering operation performed by a neural network. The method adopts a convolutional neural network to learn the deconvolution operator in the spatial domain. Simulation results on the XCAT and Jaszczak phantoms showed that the proposed BP-Net method is capable of performing image reconstruction at different time resolutions and dose levels. Compared with traditional analytical and iterative methods, the BP-Net method has the potential to improve image quality, with higher PSNR and SSIM and lower rRMSE, while saving computation time.

ACKNOWLEDGMENTS

This work was supported by the National Natural Science Foundation of China (51627807) and the Interdisciplinary Program of Shanghai Jiao Tong University (YG2021QN15).

Funding information

National Natural Science Foundation of China, Grant/Award Number: 51627807; Interdisciplinary Program of Shanghai Jiao Tong University, Grant/Award Number: YG2021QN15

Footnotes

CONFLICT OF INTEREST

The authors have no conflict of interest.

REFERENCES

  • 1.Snyder DL, Thomas LJ, Ter-Pogossian MM. A mathematical model for positron-emission tomography systems having time-of-flight measurements. IEEE Trans Nucl Sci.1981;28(3):3575–3583. [Google Scholar]
  • 2.Tomitani T Image reconstruction and noise evaluation in photon time-of-flight assisted positron emission tomography. IEEE Trans Nucl Sci. 1981;28(6):4581–4589. [Google Scholar]
  • 3.Karp JS, Surti S, Daube-Witherspoon ME, Muehllehner G. Benefit of time-of-flight in PET: experimental and clinical results. J Nucl Med. 2008;49(3):462–470. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Westerwoudt V, Conti M, Eriksson L. Advantages of improved time resolution for TOF PET at very low statistics. IEEE Trans Nucl Sci. 2014;61(1):126–133. [Google Scholar]
  • 5.Kadrmas DJ, Casey ME, Conti M, Jakoby BW, Lois C, Townsend DW. Impact of time-of-flight on PET tumor detection. J Nucl Med. 2009;50(8):1315–1323. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Surti S, Karp JS. Experimental evaluation of a simple lesion detection task with time-of-flight PET. Phys Med Biol. 2009;54(2):373–384. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Surti S Update on time-of-flight PET imaging. J Nucl Med. 2015;56(1):98–105. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Shepp LA, Vardi Y. Maximum likelihood reconstruction for emission tomography. IEEE Trans Med Imaging. 1982;1(2):113–122. [DOI] [PubMed] [Google Scholar]
  • 9.Parra L, Barrett HH. List-mode likelihood: EM algorithm and image quality estimation demonstrated on 2-D PET. IEEE Trans Med Imaging. 1998;17(2):228–235. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Watson CC. An evaluation of image noise variance for time-of-flight PET. IEEE Trans Nucl Sci. 2007;54(5):1639–1647. [Google Scholar]
  • 11.Watson CC. An improved weighting kernel for analytical time-of-flight PET reconstruction. IEEE Trans Nucl Sci. 2008;55(5):2551–2556. [Google Scholar]
  • 12.Zeng GL, Li Y, Huang Q. Analytic time-of-flight positron emission tomography reconstruction: two-dimensional case. Vis Comput Ind Biomed Art. 2019;2(1):1–7. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Zeng GL, Gullberg GT. Can the backprojection filtering algorithm be as accurate as the filtered backprojection algorithm? Proceedings of 1994 IEEE Nuclear Science Symposium - NSS’94. 1994;3:1232–1236. [Google Scholar]
  • 14.Wang B, Liu H. FBP-Net for direct reconstruction of dynamic PET images. Phys Med Biol. 2020;65(23):1–16. [DOI] [PubMed] [Google Scholar]
  • 15.Zhu B, Liu JZ, Cauley SF, Rosen BR, Rosen MS. Image reconstruction by domain-transform manifold learning. Nature 2018;555(7697):487–492. [DOI] [PubMed] [Google Scholar]
  • 16.He J, Wang Y, Ma J. Radon inversion via deep learning. IEEE Trans Med Imaging. 2018;39(6):2076–2087. [DOI] [PubMed] [Google Scholar]
  • 17.Häggström I, Schmidtlein CR, Campanella G, Fuchs TJ. Deep-PET: A deep encoder–decoder network for directly solving the PET image reconstruction inverse problem. Med Image Anal. 2019;54:253–262. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Lim H, Chun IY, Dewaraja YK, Fessler JA. Improved low-count quantitative PET reconstruction with an iterative neural network. IEEE Trans Med Imaging. 2020;39(11):3512–3522. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Gong K, Wu D, Li Q, et al. EMnet: an unrolled deep neural network for PET image reconstruction. Medical Imaging 2019: Physics of Medical Imaging. 2019;10948:185. [Google Scholar]
  • 20.Gong K, Wu D, Kim K, et al. MAPEM-Net: an unrolled neural network for fully 3D PET image reconstruction. 15th International Meeting on Fully Three-Dimensional Image Reconstruction in Radiology and Nuclear Medicine. 2019;11072:102. [Google Scholar]
  • 21.Mehranian A, Reader AJ. Model-based deep learning PET image reconstruction using forward–backward splitting expectation–maximization. IEEE Trans Radiat Plasma Med Sci. 2020;5(1):54–64. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Würfl T, Hoffmann M, Christlein V, et al. Deep learning computed tomography: Learning projection-domain weights from image domain in limited angle problems. IEEE Trans Med Imaging. 2018;37(6):1454–1463. [DOI] [PubMed] [Google Scholar]
  • 23.Ge Y, Zhang Q, Hu Z, et al. Deconvolution-based backproject-filter (BPF) computed tomography image reconstruction method using deep learning technique. arXiv preprint; 2018. http://arxiv.org/abs/1807.01833 [Google Scholar]
  • 24.Whiteley W, Panin V, Zhou C, Cabello J, Bharkhada D, Gregor J. FastPET: near real-time reconstruction of PET histo-image data using a neural network. IEEE Trans Radiat Plasma Med Sci. 2020;5(1):65–77. [Google Scholar]
  • 25.Feng T, Yao S, Xi C, et al. Deep learning-based image reconstruction for TOF PET with DIRECT data partitioning format. Phys Med Biol. 2021;66(16):165007. [DOI] [PubMed] [Google Scholar]
  • 26.Li Y, Matej S. DeepDIRECT: deep direct image reconstruction from multi-view TOF PET histoimages using convolutional LSTM. In: 16th International Meeting on Fully 3D Image Reconstruction in Radiology and Nuclear Medicine, online, 19–23 July 2021:111–115. [Google Scholar]
  • 27.Zeng GL. Medical Image Reconstruction: A Conceptual Tutorial. Springer Berlin Heidelberg; 2010. [Google Scholar]
  • 28.Ronneberger O, Fischer P, Brox T. U-net: convolutional networks for biomedical image segmentation. In: International Conference on Medical image computing and computer-assisted intervention. Vol 9351. Springer; 2015:234–241. [Google Scholar]
  • 29.Lim B, Son S, Kim H, Nah S, Lee KM. Enhanced deep residual networks for single image super-resolution. IEEE Comput Soc Conf Comput Vision Pattern Recognit Workshops. 2017;1132–1140. [Google Scholar]
  • 30.Zhang Y, Tian Y, Kong Y, Zhong B, Fu Y. Residual dense network for image super-resolution. Proc IEEE Comput Soc Conf Comput Vision Pattern Recognit. 2018;2472–2481. [Google Scholar]
  • 31.Strulab D, Santin G, Lazaro D, Breton V, Morel C. GATE (Geant4 Application for Tomographic Emission): a PET/SPECT general-purpose simulation platform. Nucl Phys B - Proc Suppl. 2003;125:75–79. [Google Scholar]
  • 32.Segars WP, Sturgeon G, Mendonca S, Grimes J, Tsui BMW. 4D XCAT phantom for multimodality imaging research. Med Phys. 2010;37(9):4902–4915. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 33.De Man B, Basu S. Distance-driven projection and backprojection in three dimensions. IEEE Nucl Sci Symp Med Imaging Conf. 2002;3:1477–1480. [Google Scholar]
  • 34.Paszke A, Gross S, Massa F, et al. PyTorch: an imperative style, high-performance deep learning library. Adv Neural Inf Process Syst. 2019;32. [Google Scholar]
  • 35.Kingma DP, Ba JL. Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980; 2014. [Google Scholar]
  • 36.Bailey DL, Townsend DW, Valk PE, Maisey MN. Positron Emission Tomography. London: Springer; 2005. [Google Scholar]
  • 37.Watson CC, Newport D, Casey ME. A single scatter simulation technique for scatter correction in 3D PET. In: Grangeat P, Amans JL, eds. Three-Dimensional Image Reconstruction in Radiology and Nuclear Medicine. Springer; 1996:255–268. [Google Scholar]
  • 38.Dawood M, Jiang X, Schäfers KP. Correction Techniques in Emission Tomography. CRC Press; 2012. [Google Scholar]
  • 39.Abella M, Alessio AM, Mankoff DA, et al. Accuracy of CT-based attenuation correction in PET/CT bone imaging. Phys Med Biol. 2012;57(9):2477–2490. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.Kinahan PE, Hasegawa BH, Beyer T. X-ray-based attenuation correction for positron emission tomography/computed tomography scanners. Semin Nucl Med. 2003;33(3):166–179. [DOI] [PubMed] [Google Scholar]
  • 41.LeCun Y, Bengio Y, Hinton G. Deep learning. Nature 2015;521(7553):436–444. [DOI] [PubMed] [Google Scholar]
  • 42.Goodfellow I, Bengio Y, Courville A. Deep Learning. MIT Press; 2016. [Google Scholar]
  • 43.Bharkhada D, Panin V, Conti M, Daube-Witherspoon ME, Matej S, Karp JS. Listmode reconstruction for Biograph Vision PET/CT scanner. 2019 IEEE Nucl. Sci. Symp. Med. Imaging Conf. NSS/MIC 2019, 2019;1–6. [Google Scholar]
  • 44.Han YS, Yoo J, Ye JC. Deep residual learning for compressed sensing CT reconstruction via persistent homology analysis. arXiv preprint; 2016. http://arxiv.org/abs/1611.06391 [Google Scholar]
  • 45.Lu W, Onofrey JA, Lu Y, et al. An investigation of quantitative accuracy for deep learning based denoising in oncological PET. Phys Med Biol. 2019;64(16):165019. [DOI] [PubMed] [Google Scholar]
  • 46.Chen H, Zhang Y, Kalra MK, et al. Low-dose CT with a residual encoder-decoder convolutional neural network. IEEE Trans Med Imaging. 2017;36(12):2524–2535. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.Kaplan S, Zhu YM. Full-dose PET image estimation from low-dose PET image using deep learning: a pilot study. J Digit Imaging. 2019;32(5):773–778. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 48.Zhou L, Schaefferkoetter JD, Tham IWK, Huang G, Yan J. Supervised learning with cyclegan for low-dose FDG PET image denoising. Med Image Anal. 2020;65:101770. [DOI] [PubMed] [Google Scholar]
  • 49.Gong K, Guan J, Liu C-C, Qi J. PET image denoising using a deep neural network through fine tuning. IEEE Trans Radiat Plasma Med Sci. 2018;3(2):153–161. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.van der Vos CS, Koopman D, Rijnsdorp S, et al. Quantification, improvement, and harmonization of small lesion detection with state-of-the-art PET. Eur J Nucl Med Mol Imaging. 2017;44(1):4–16. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 51.Yu J, Fan Y, Yang J, et al. Wide activation for efficient and accurate image super-resolution. arXiv preprint; 2018. http://arxiv.org/abs/1808.08718 [Google Scholar]
  • 52.Wang X, Yu K, Wu S, et al. ESRGAN: enhanced super-resolution generative adversarial networks. In: Proceedings of the European Conference on Computer Vision (ECCV) Workshops; 2018. [Google Scholar]
