Abstract
The ordered subset expectation maximization (OSEM) algorithm approximates the gradient of a likelihood function using a subset of projections instead of all projections, enabling fast image reconstruction for emission and transmission tomography such as SPECT, PET, and CT. However, OSEM does not significantly accelerate reconstruction with computationally expensive regularizers such as patch-based nonlocal (NL) regularizers, because the regularizer gradient is evaluated at every subset update. We propose using variable splitting to separate the likelihood term and the regularizer term in the penalized emission tomographic image reconstruction problem, and optimizing the result with the alternating direction method of multipliers (ADMM). We also propose a fast algorithm to optimize the ADMM parameter based on a convergence rate analysis. This new scheme enables more sub-iterations related to the likelihood term. We evaluated our ADMM for 3-D SPECT image reconstruction with a patch-based NL regularizer that uses the Fair potential function. Our proposed ADMM improved the speed of convergence substantially compared to existing methods such as gradient descent, EM, and OSEM using De Pierro’s approach, and the limited-memory Broyden–Fletcher–Goldfarb–Shanno algorithm.
Index Terms: Alternating direction method of multipliers, emission tomography, nonlocal (NL) regularizer, ordered-subset expectation maximization (OSEM)
I. Introduction
Incorporating noise models into tomographic image reconstruction can improve image quality. However, unlike analytical image reconstruction methods such as filtered back-projection (FBP), statistical image reconstruction methods such as the expectation-maximization (EM) algorithm [1] often require gradient-based iterative algorithms. Since the gradient of a likelihood function must be evaluated at each iteration, these algorithms (including EM) are undesirably slow.
Statistical image reconstruction for emission tomography started to be used widely in clinics and in commercial scanners after the fast algorithm called ordered-subset expectation-maximization (OSEM) was developed [2]. The main idea was to speed up gradient computation by approximating it using a subset of projections instead of using all projections (ordered-subset or OS approximation). This approximation has been used for unregularized emission tomographic image reconstruction [2] and regularized emission and transmission tomographic image reconstruction with simple quadratic or edge-preserving regularizers [3]. Since the computation cost for these regularizers is fairly low compared to that for the likelihood term, OS algorithms that approximate the gradient of the likelihood term often speed up penalized likelihood (PL) image reconstruction, too.
Recently, nonlocal (NL) regularizers have been proposed that improve image quality substantially compared to conventional regularizers such as quadratic or edge-preserving functions in image deconvolution [4], emission image reconstruction with convex functions [5], [6], and magnetic resonance imaging (MRI) image reconstruction with nonconvex functions [7]. NL regularizers have been extended to use high-resolution CT or MRI side information for emission and super-resolution image reconstruction for further improvement of image quality [8]–[11]. For emission tomography problems such as [6], [9]–[11], various optimization algorithms were used for image reconstruction such as gradient descent (GD) [10], EM (or OSEM) algorithm based on optimization transfer using De Pierro’s lemma [6], the EM algorithm using one-step late approach [11], and the quasi-Newton algorithm called the limited-memory Broyden–Fletcher–Goldfarb–Shanno with a Box constraint (L-BFGS-B) [9]. They showed promising improvement of image quality, but their algorithms were undesirably slow since the computation cost of the NL regularizers can be comparable to or even higher than that of the likelihood. Therefore, the OS approximation does not significantly accelerate the convergence rate of existing PL image reconstruction algorithms with these NL regularizers.
In this paper, we propose using variable splitting to separate the likelihood term and the regularizer term in the PL image reconstruction problem, and optimizing it using the alternating direction method of multipliers (ADMM) [12]. We also propose a fast algorithm to optimize the ADMM parameter based on a convergence rate analysis that extends the work of Ghadimi et al. for quadratic data fitting and regularizer terms [13]. There are existing methods that use variable splitting for the data fidelity term and the regularizer term [4], [14]–[16]. These previous methods address nonsmooth regularizers such as total variation (TV); a sub-problem with a nonsmooth regularizer can be solved quickly using efficient proximal operators such as shrinkage. Our proposed variable splitting has a different motivation: we divide the original optimization into a few sub-problems, solve the sub-problem related to the likelihood term using the OS approximation, and update the sub-problem related to the NL regularizer less often.
We evaluated our new ADMM for 3-D SPECT image reconstruction with a patch-based convex NL regularizer that uses the Fair potential [6]. Our simulation using the XCAT phantom [17] shows that our proposed ADMM for image reconstruction with NL regularizers accelerated convergence substantially compared to existing methods such as GD, EM, and OSEM using De Pierro’s approach [3], and the L-BFGS-B algorithm [18]. This paper is organized as follows. Section II reviews statistical image reconstruction in emission tomography, OS approximation, and various NL regularizers. Section III proposes an efficient method for NL regularized image reconstruction by using variable splitting and ADMM. Section IV proposes an analytical method to select automatically a suitable ADMM parameter for fast convergence rate. Lastly, Section V presents 3-D SPECT simulation results with the XCAT phantom [17] for an application in I-131 radioimmunotherapy (RIT) [19].
II. Image Reconstruction With NL Regularizers
A. Statistical Image Reconstruction for Tomography
The usual form of statistical image reconstruction is to perform the following optimization with respect to an image f:
$$\hat{f} = \arg\min_{f \ge 0} \; L(y|f) + \beta R(f) \tag{1}$$
where y is a measured sinogram data, L denotes a negative log-likelihood function, β is a regularization parameter, and R is a regularizer.
For emission tomography, the negative Poisson log-likelihood is
$$L(y|f) = \sum_{l \in \Omega} \left( \bar{y}_l(f) - y_l \log \bar{y}_l(f) \right) \tag{2}$$
where yl is the lth element of the measurement y, Ω is the set of indexes of all measurements, and ȳl (f) ≜ Σjaljfj + sl where alj is the element of the system matrix A at the lth row and the jth column, fj is the jth element of the image vector f, and sl is a random and scatter component for the lth measurement. We focus on SPECT imaging where we incorporate an attenuation map and a depth-dependent point spread function model including penetration tails [20] in the system matrix A. We assume known sl; in practice, this scatter component can be estimated by using a triple energy window (TEW) method or by Monte Carlo methods [21].
B. Ordered-Subset Approximation
Iterative image reconstruction algorithms for (1) usually require calculating the gradient of L(y|f) and R(f) at every iteration. The gradient is evaluated at f(n) where f(n) is an estimate of f at the nth iteration. These algorithms include GD, EM [1], and L-BFGS-B [18]. Calculating the gradient of typical R(f) such as quadratic or edge-preserving regularizers is very fast. Evaluating the gradient of L(y|f) is much slower since this requires one forward projection and one back projection of A for each iteration.
OS methods [2] approximate the gradient of L(y|f) at f(n) with the gradient of Lk(y|f) at f(n+k/K) where
$$L_k(y|f) \triangleq K \sum_{l \in \Omega_k} \left( \bar{y}_l(f) - y_l \log \bar{y}_l(f) \right) \tag{3}$$
Ωk are mutually exclusive, Ω = ∪kΩk, and k = 1, ···, K. Evaluating the OS approximated gradient in (3) is about K times faster than calculating the original gradient in (2). In this way, OSEM achieves faster image reconstruction. Note that OS methods with a fixed K > 1 do not guarantee convergence, but yield approximate PL images.
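As a concrete sketch of the OS idea, the following toy Python example compares the full gradient of the negative Poisson log-likelihood (2) with a K-times-cheaper scaled subset gradient of the form (3). The random system matrix and all sizes are illustrative stand-ins, not the paper's SPECT model:

```python
import numpy as np

rng = np.random.default_rng(0)
n_meas, n_vox, K = 60, 16, 6          # measurements, voxels, subsets (toy sizes)
A = rng.random((n_meas, n_vox))       # hypothetical system matrix
f = rng.random(n_vox) + 0.1           # current image estimate, strictly positive
s = 0.05 * np.ones(n_meas)            # known scatter/randoms component
y = rng.poisson(A @ f + s).astype(float)

def grad_negloglik(A_sub, y_sub, s_sub, f, scale=1.0):
    """Gradient of the negative Poisson log-likelihood (2), optionally scaled."""
    ybar = A_sub @ f + s_sub          # mean projections
    return scale * (A_sub.T @ (1.0 - y_sub / ybar))

full = grad_negloglik(A, y, s, f)

# OS approximation (3): one interleaved subset of the rows, scaled by K,
# costs about K times less than the full gradient.
subsets = [np.arange(k, n_meas, K) for k in range(K)]
idx = subsets[0]
approx = grad_negloglik(A[idx], y[idx], s[idx], f, scale=float(K))

rel = np.linalg.norm(full - approx) / np.linalg.norm(full)
print(rel)
```

With interleaved (angularly balanced) subsets, the scaled subset gradient typically points in nearly the same direction as the full gradient, which is why OS updates approximate full-gradient updates well in early iterations.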
C. Nonlocal Regularizers
Recently, many researchers have produced high-quality images in many image reconstruction problems by replacing R(f) in (1) with an NL regularizer [4]–[7]. A typical NL regularizer has the form
$$R(f) = \sum_{i} \sum_{j \in S_i} p_{ij}\left( \left\| N_i f - N_j f \right\| \right) \tag{4}$$
where pij(t) is a function of a scalar variable t, ||·|| is the ℓ2 norm, Si is the search neighborhood around the ith voxel (usually the set of all voxels within a fixed ℓ∞ distance from the ith voxel), and Ni is an operator on the image f such that Nif is a vector of image intensities of all voxels within a fixed ℓ∞ distance from the ith voxel (cube-shaped patch).
A typical choice for the function pij is [4], [5]
$$p_{ij}(t) = w_{ij}(\tilde{f}) \, t^2 \tag{5}$$
where Nf is the number of voxels in the patch Nif (assuming that the patch size is the same for all i), a weighting function is
$$w_{ij}(\tilde{f}) = \exp\left( - \frac{\| N_i \tilde{f} - N_j \tilde{f} \|^2}{2 N_f \sigma_f^2} \right) \tag{6}$$
and σf is a design parameter. For the image f̃, Lou et al. used an initial image from any analytical image reconstruction (e.g., FBP) [5] and Zhang et al. used an estimated image from the previous iteration f(n−1) so that pij(t) changes over iterations [4].
Yang et al. used a few nonconvex potentials including the Welsh potential [22]
| (7) |
and showed that using (7) is equivalent to using (5) with an estimated value f(n−1) for f̃ at the nth iteration [7]. Wang et al. used the Fair potential [23], [24]
$$p(t) = \sigma_f^2 \left( \frac{|t|}{\sigma_f} - \log\left( 1 + \frac{|t|}{\sigma_f} \right) \right) \tag{8}$$
Neither (7) nor (8) depends on an initial image, and (8) is convex while (7) is nonconvex. It has been reported that nonconvex functions yielded better image quality than convex ones [7].
One can also design NL regularizers that incorporate high-resolution side information such as CT or MR images [9]–[11] to further improve image quality. One way to incorporate high-resolution side information into the NL regularizer is to use the following type of NL regularizer [10]:
| (9) |
where
| (10) |
g is a high-resolution image such as MR or CT, Mi is an operator on the image g such that Mig is a vector of image intensities in a patch around the ith voxel, Na is the number of voxels in the patch Mig, and σa is another design parameter. Another NL regularizer incorporating high resolution side information is [9]
| (11) |
All of these NL regularizers are computationally expensive due to the calculation of (6) at each iteration. The gradients of both L(y|f) and R(f) must be evaluated at each iteration during optimization. Using OS approximations of the gradient of L(y|f) does not improve the speed of convergence much since one cannot use OS approximations for R(f), so the gradient of R(f) must be evaluated at every sub-iteration.
III. Alternating Direction Method of Multipliers
A. ADMM for NL Regularization
To benefit from an OS approximation for L(y|f), while avoiding heavy computation of the gradient of R(f) at each sub-iteration, we split the variable for the likelihood term and the regularizer term by replacing (1) with the following equivalent constrained optimization problem:
$$(\hat{f}, \hat{u}) = \arg\min_{f \ge 0,\, u} \; L(y|f) + \beta R(u) \quad \text{subject to} \quad u = f \tag{12}$$
The augmented Lagrangian for (12) is
$$\mathcal{L}_\mu(f, u, d) = L(y|f) + \beta R(u) + \frac{\mu}{2} \left\| u - f - d \right\|^2 \tag{13}$$
where μ is a scalar value (design parameter) and d is a Lagrangian multiplier vector. We need to find a saddle point of the augmented Lagrangian (13).
We solve (13) using the ADMM algorithm [25], [26] as follows:
For n = 1,2, …
$$u^{(n)} = \arg\min_{u} \; \beta R(u) + \frac{\mu}{2} \left\| u - f^{(n-1)} - d^{(n-1)} \right\|^2 \tag{14}$$
$$f^{(n)} = \arg\min_{f \ge 0} \; L(y|f) + \frac{\mu}{2} \left\| u^{(n)} - f - d^{(n-1)} \right\|^2 \tag{15}$$
$$d^{(n)} = d^{(n-1)} - \left( u^{(n)} - f^{(n)} \right) \tag{16}$$
End.
For convex R, this ADMM algorithm is guaranteed to converge for any μ > 0 [26]. We can solve the sub-problems (14) and (15) using any existing method; one need not solve them exactly to guarantee the convergence of the ADMM algorithm [26, Th. 8].
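The splitting structure can be sketched with toy quadratic stand-ins for both the likelihood and the regularizer, so that each sub-problem has a closed form. The update order and sign conventions below follow one common scaled form of ADMM and are assumptions for illustration, as are the toy operators; the paper's actual sub-problems use the Poisson likelihood and the NL regularizer:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20
A = rng.random((30, n))          # toy system matrix (stand-in for the SPECT A)
y = A @ rng.random(n)            # noiseless toy data
D = np.diff(np.eye(n), axis=0)   # first-difference operator (toy regularizer)
beta, mu = 0.1, 1.0

f = np.zeros(n)                  # image variable
u = np.zeros(n)                  # split variable, constrained to equal f
d = np.zeros(n)                  # scaled dual (multiplier) vector

for _ in range(300):
    # u-update, cf. (14): argmin_u beta*||Du||^2 + (mu/2)||u - f - d||^2
    u = np.linalg.solve(2 * beta * D.T @ D + mu * np.eye(n), mu * (f + d))
    # f-update, cf. (15): argmin_f ||Af - y||^2 + (mu/2)||u - f - d||^2
    f = np.linalg.solve(2 * A.T @ A + mu * np.eye(n), 2 * A.T @ y + mu * (u - d))
    # dual update, cf. (16)
    d = d - (u - f)

residual = np.linalg.norm(u - f)  # primal residual shrinks as ADMM converges
print(residual)
```

The point of the split is visible in the loop: the regularizer appears only in the u-update and the data term only in the f-update, so each can be given a different amount of computation per outer iteration.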
B. Optimization for the Sub-Problem (14)
We used the GD algorithm to solve (14) as follows:
| (17) |
where α is a step size and
| (18) |
We plug (17) into (14) to determine the step size as follows:
| (19) |
where
| (20) |
The gradient of Φ(n)(u) is
| (21) |
where
| (22) |
and ṗij(t) is the derivative of pij(t). Since solving (19) is an intermediate step of solving (14), we do not need to find an exact α value to minimize (19). We chose to use one step of Newton’s method for (19) as follows [27]:
| (23) |
where ϕ̇(n)(α) and ϕ̈(n)(α) are the first and second derivatives of ϕ(n)(α) with respect to α
| (24) |
where
| (25) |
We approximate ϕ̈(n)(α) by excluding the second derivative of p(t) as suggested in [28, p. 683].
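A single Newton update for the scalar line-search problem, in the spirit of (23), is a one-liner; for a quadratic ϕ it lands exactly on the minimizer. The function name and the test function are illustrative:

```python
def newton_step_size(phi_dot, phi_ddot, alpha0=0.0):
    """One Newton update for the line-search scalar alpha, cf. (23):
    alpha <- alpha0 - phi'(alpha0) / phi''(alpha0)."""
    return alpha0 - phi_dot(alpha0) / phi_ddot(alpha0)

# For the quadratic phi(alpha) = (alpha - 2)^2, one Newton step is exact:
alpha = newton_step_size(lambda a: 2.0 * (a - 2.0), lambda a: 2.0)
print(alpha)  # 2.0
```

Since (19) is only an inner step of (14), an inexact α like this single Newton step suffices, which is why the (approximate) second derivative that drops the curvature of p(t) is acceptable in practice.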
C. Optimization for the Sub-Problem (15)
One can solve (15) using any statistical image reconstruction algorithm with a slight modification for a shifted quadratic regularizer. OS approximation can usually be used to speed up the convergence rate. With this splitting, one can devote more computational resources to the sub-problem (15) than to the sub-problem (14), which may yield a faster overall convergence rate.
We modified De Pierro’s EM algorithm [3] for (15) by considering a simple shifted quadratic regularizer. The surrogate function for the likelihood term in (15) is
| (26) |
where Q̃j(fj|f(n−1)) is a surrogate function that can be found in [3] and [6]. We use a surrogate function Qj(fj|f(n−1)) that is equivalent to Q̃j(fj|f(n−1)) by omitting the terms that are independent of fj as follows:
| (27) |
where
and
| (28) |
Note that Q̃j(fj|f(n−1)) and Qj(fj|f(n−1)) are minimized at the same fj.
The regularizer in (15) is separable in the image domain
| (29) |
Therefore, one must minimize the following surrogate function to solve the sub-problem in (15):
| (30) |
for all j. Differentiating (30) with respect to fj and setting it to zero leads to the following second-order polynomial in fj
| (31) |
The nonnegative root of (31) is the minimizer of (30). This root always exists because ej(f(n−1)) and are nonnegative [29].
An OS approximation for this modified De Pierro’s algorithm can be easily done by substituting the most time-consuming part (28) in (15) at each iteration with the following approximate term
| (32) |
This new term (32) requires about K times less computation than the term (28) does. Since calculating (28) dominates the overall calculation of (31) for each iteration, OS approximation for (15) substantially reduces the computation time per update.
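The per-voxel update reduces to the nonnegative root of a second-order polynomial as in (31). A minimal sketch follows, with illustrative coefficient names (the paper's (31) defines its own coefficients), assuming a positive quadratic coefficient and a nonnegative constant term so that the nonnegative root always exists:

```python
import math

def nonneg_root(a, b, c):
    """Nonnegative root of a*f^2 + b*f - c = 0, assuming a > 0 and c >= 0.
    The discriminant b^2 + 4ac is then nonnegative, so the root is real."""
    return (-b + math.sqrt(b * b + 4.0 * a * c)) / (2.0 * a)

print(nonneg_root(1.0, 0.0, 4.0))  # f^2 - 4 = 0  ->  2.0
print(nonneg_root(1.0, 3.0, 0.0))  # f^2 + 3f = 0 ->  0.0
```

Choosing the "+" branch of the quadratic formula guarantees the root is nonnegative, which automatically enforces the constraint f ≥ 0 in each voxel update.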
In this section, we proposed using variable splitting and ADMM for efficient computation of NL regularized image reconstruction in (14), (15), and (16). We also described detailed algorithms for each sub-problem: gradient descent with one step of Newton’s method for (14), and a modified De Pierro’s OSEM algorithm for (15). Even though our proposed ADMM algorithm for convex R is guaranteed to converge for any μ, the parameter μ in (13) affects the convergence rate. In the next section, we propose a method to optimize μ automatically.
IV. Parameter Selection for ADMM
A. Ideal ADMM Update and Approximation
Ghadimi et al. optimized μ in ADMM for ℓ2-regularized minimization with a quadratic data fitting term [13]. We generalize and extend their analysis to optimize μ for our case of having NL regularizers with the negative Poisson likelihood term.
Let us derive the “ideal” ADMM update for (14), (15), and (16). Using the gradient of (14) with respect to u, the first update for (14) at the nth iteration is
| (33) |
where the approximate Hessian of (4) is
We used the GD algorithm in (17) since (33) is impractical. Similarly, using the gradient of (15) with respect to f and ignoring the non-negative constraint for f, “ideally” the update for (15) at the nth iteration is
| (34) |
where
Because (34) is impractical, we used De Pierro’s EM algorithm in (31). We can use (16) itself as the third “ideal” ADMM update at the nth iteration.
To facilitate the analysis using these ideal ADMM updates (33), (34), and (16), we need to fix R(u(n−1)) and W(n−1) for all iterations n. We conjecture that we can set u(n−1) = f(n−1) = f̂init for R(u(n−1)) and W(n−1) so that we can fix
| (35) |
and
| (36) |
We used these approximations only for selecting μ. The initial image f̂init can be any rough estimate, such as the FBP image or an OSEM reconstruction after a few iterations. For the results in Section V, we used the image reconstructed by unregularized OSEM with five iterations (six subsets). See Appendix A for technical details of the approximations (35) and (36).
B. Nearly Optimal ADMM Parameter
To study the convergence rate, we need to derive the update equation of f(n) in terms of f(n−1). Here, we generalize the approach of [13] for our “ideal” ADMM update equations for (14)–(16).
First, rearrange (34) with (36) and use it for (16). Then, (16) becomes
| (37) |
and this also holds for n − 1. Second, we use (37) with n − 1 in (14) with (35) to obtain
| (38) |
where c is a constant vector. Lastly, combining (37) (with n − 1), (38), and (34) [with (36)] yields
| (39) |
where c̃ is a constant vector and
| (40) |
where H ≜ A′WA.
With the goal of approximately optimizing the convergence rate, we choose the ADMM parameter μ as follows:
| (41) |
where ρ(Sμ) is the spectral radius of the matrix Sμ, i.e., its largest-magnitude eigenvalue. However, it is very challenging to find eigenvalues of large matrices such as Sμ, and it is also infeasible to form the matrix Sμ itself due to the inversion of large matrices such as H + μI and βR + μI.
To make the problem (41) tractable, we approximately optimize the ADMM parameter μ for the center of the image and use that parameter everywhere else. We approximate H and R as being locally circulant around that local area; a similar assumption is often used when analyzing spatial resolution properties [30], [31]. Then, the eigenvalue of the matrix Sμ for the pth eigenvector (or the pth discrete Fourier basis) becomes
| (42) |
where rp and hp are the eigenvalues of R and H for the corresponding pth eigenvector, respectively. Then, our choice of the ADMM parameter μ is
| (43) |
Unlike the case in [13], it is not easy to find an analytical solution for (43), but we can solve (43) quickly numerically. We find rp and hp for all p using the fast Fourier transform (FFT) of R1c and H1c, respectively, where 1c is a unit vector whose element is one at the center voxel of the image and zero elsewhere. Then we calculate maxp λp(Sμ) for a finite set of μ values and obtain μ∘ as the minimizer. Since λp(Sμ) is a nondecreasing function of μ beyond some point, the search can be restricted to a bounded interval. The computational cost of evaluating maxp λp(Sμ) is linear in the number of voxels, and this procedure is fast since it is parallelizable. One can speed it up further with a coarse-to-fine approach, or use the golden section search [28, p. 397] to find the μ∘ that minimizes maxp λp(Sμ).
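The grid search over μ can be sketched as follows. The per-mode convergence factor used here is an assumed Ghadimi-style bound for the quadratic case, not the paper's exact (42), and the eigenvalue spectra h and r are random stand-ins for the FFTs of H1c and R1c:

```python
import numpy as np

rng = np.random.default_rng(2)
h = rng.uniform(0.01, 10.0, 1000)   # stand-in eigenvalues of H = A'WA
r = rng.uniform(0.0, 1.0, 1000)     # stand-in eigenvalues of R
beta = 2.0 ** -13                   # regularization parameter

def pick_mu(h, r, beta, mu_grid):
    """Return the mu on the grid with the smallest worst-case per-mode factor.

    The factor |(mu - h_p)(mu - beta*r_p)| / ((mu + h_p)(mu + beta*r_p)) is an
    assumed Ghadimi-style bound; it lies in [0, 1] for mu, h_p, beta*r_p >= 0."""
    best_mu, best_val = None, np.inf
    for mu in mu_grid:
        lam = np.max(np.abs((mu - h) * (mu - beta * r))
                     / ((mu + h) * (mu + beta * r)))
        if lam < best_val:
            best_mu, best_val = mu, lam
    return best_mu, best_val

mu_grid = np.linspace(1e-4, 0.25, 2500)
mu_opt, lam_opt = pick_mu(h, r, beta, mu_grid)
print(mu_opt, lam_opt)
```

Each evaluation of the worst-case factor is a single vectorized pass over the modes, so the cost is linear in the number of voxels and the grid search parallelizes trivially, matching the procedure described above.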
V. Results
A. Simulation Setup
We simulated the Siemens Symbia Truepoint 3-D SPECT system with high-energy collimators (parallel hexagonal collimators with a septal thickness of 2 mm, a hole diameter of 4 mm, and a hole length of 59.7 mm) with a nonuniform attenuation map, depth-dependent collimator-detector response [20], and a scatter component (128 × 21 projection bins, 4.8 × 4.8 mm pixel size). The system resolution at 10 cm was 13.4 mm and 60 views were collected over 360°. We used the XCAT phantom [17] to generate the true SPECT image with activity distributions realistic for I-131 RIT. The dimension of the SPECT image was 128 × 128 × 21 with a 4.8 × 4.8 × 4.8 mm voxel size. Three spherical lesions with volumes of 176, 32, and 9 cc were placed within the XCAT phantom, as shown in the bottom-right panel of Fig. 4(a) or (b). Poisson noise was added after scaling the projections to the count level corresponding to day 2 post-therapy in I-131 RIT (about 600 K total counts per slice, with about 300 K scatter counts per slice).
Fig. 4.
Estimated images of different methods at 500 and 1000 s and the true image for the case of the Fair potential. ADMM yielded the best contrast recovery among all other methods. (a) 500 s. (b) 1000 s.
We set the common regularization parameters for all optimization methods as follows: β = 2^−13, σf = 2^1.5, a 3 × 3 × 3 patch size, and a 7 × 7 × 7 search neighborhood. Six subsets were used for OSEM and ADMM. ADMM separates the likelihood update and the regularizer update by splitting, and we chose to run more sub-iterations for the likelihood update (2 outer-iterations × 6 subsets) than for the regularizer update (one outer-iteration). We used six threads for computation (Intel Core i7, 2.8 GHz) for all methods, and all methods used the same compiled ANSI C99 code to evaluate the cost function and its gradient. We measured the normalized root mean square error (RMSE) over the whole image at each (outer) iteration
| (44) |
B. Parameter Selection
There are parameters to tune for each optimization method: the step size for GD and the number of past estimated images used for the Hessian approximation in L-BFGS-B. We chose values that empirically yielded the fastest convergence: a step size of 0.04 for GD and five past images for L-BFGS-B. In experiments not shown here, the GD step size was critical for convergence: GD diverged when the step size was too large and converged slowly when it was too small. In contrast, the number of past images for L-BFGS-B did not affect the convergence rate much.
We selected the ADMM parameter μ using (43). We first obtained maxp λp(Sμ) for μ = 0, 0.0001, 0.0002, ···, 0.2125, as shown in Fig. 1. Based on this plot, we chose the ADMM parameter μ to be 0.0106, marked with the red star (*) in Fig. 1.
Fig. 1.
Plot of maxp λp(Sμ) for μ = 0, 0.0001, 0.0002, ···, 0.1. Red star (*) denotes our choice of μ∘.
We also evaluated the ADMM convergence rate as a function of μ empirically. Fig. 2 shows that our chosen ADMM parameter μ = 0.0106 achieved reasonably fast convergence compared to other choices of μ. A μ that was too large, such as μ = 0.1250 (green square mark), yielded a slower convergence rate, and a μ that was too small, such as μ = 0.0005 (magenta triangle mark), resulted in fluctuating tails. Due to approximations such as local shift invariance and fixing matrices like R and W, we cannot claim that our choice of μ is optimal, but Fig. 2 suggests that it is adequate for fast convergence of ADMM.
Fig. 2.
RMSEs of estimated images over time using ADMM with different μ values. The automatically selected value (μ = 0.0106, red star *) yielded a relatively fast convergence rate of ADMM compared to other choices of μ.
C. Simulation Results for NL Regularization
We reconstructed images using different optimization algorithms for the cost function (1) with the (convex) Fair potential (8).
Fig. 3 shows plots of RMSE versus computation time for the different methods: GD, EM and OSEM using De Pierro’s lemma, L-BFGS-B, and the proposed ADMM. EM yielded a faster convergence rate than GD with a fixed step size. OSEM does improve convergence speed compared to EM, but it provides little acceleration because the computationally expensive NL regularizer is evaluated at every sub-iteration. L-BFGS-B yielded a convergence rate similar to OSEM with six subsets. Our proposed ADMM substantially improved convergence speed over all other methods: the other methods did not reach the minimum RMSE before 2000 s, whereas ADMM achieved it before 1000 s. These simulation results illustrate that frequent likelihood updates are more important for fast convergence than frequent regularizer updates.
Fig. 3.
RMSEs of estimated images using different algorithms versus time for NL regularization with the Fair potential. Proposed ADMM showed faster convergence rate than other methods.
Fig. 4(a) and (b) shows images reconstructed by the different methods at 500 and 1000 s, along with the true image, for NL regularization with the Fair potential. At the early time of 500 s, ADMM yielded the best contrast recovery among all methods. By 1000 s, the other methods also started to yield images similar to that of ADMM, since all optimization methods minimize almost the same cost function. However, the results may not be exactly the same due to the OS approximation.
Fig. 5(a)–(c) shows recovery coefficients (RCs) of the different methods over time when using the Fair potential. The larger the lesion, the faster it approached the achievable RC value. Note that we did not optimize NL regularizer parameters such as σf, and one may achieve a better RC for small lesions such as the 9 cc lesion than that shown in Fig. 5(c).
Fig. 5.
Recovery coefficients for different size lesions over time when using the Fair potential. ADMM yielded the best recovery coefficients among all other methods. (a) 176 cc. (b) 32 cc. (c) 9 cc.
VI. Discussion
We developed a new ADMM-based algorithm for tomography with computationally expensive NL regularizers. We also proposed a method to automatically determine a suitable μ value for a fast convergence rate. By combining ADMM with the OS approach, our method approached convergence much faster than existing methods such as GD, EM and OSEM using De Pierro’s lemma, and L-BFGS-B. Since it appears more important to update the likelihood part frequently, our ADMM yielded faster convergence. However, because the cost function has both a likelihood term and a regularization term, increasing the number of iterations for the likelihood term did not always further accelerate convergence; we chose a good combination of iterations for both terms empirically. Similarly, one could use an approximate gradient for the NL regularizer to reduce computation, but in simulations not shown here, our ADMM outperformed this approximation.
In this paper, we minimized the same NL regularized cost function (the Fair potential) as Wang et al. [6]. Whereas Wang et al. used De Pierro’s algorithm with a surrogate function of their NL regularizer, we use De Pierro’s algorithm with a shifted quadratic regularizer, which requires far less computation. Zhang et al. also applied a splitting approach to optimization problems with NL regularizers [4]. However, their motivation for splitting was to apply a shrinkage operator to a nonsmooth potential function such as TV, and our way of splitting in (12) differs from theirs. Xu et al. used the same type of splitting as ours for a nonsmooth regularizer [16]. They used a similar formula for the sub-problem (15), apart from a Lagrangian multiplier vector, but their motivation was to handle a nonsmooth regularizer rather than a computation-intensive NL regularizer.
Our approach to optimizing μ was based on “ideal” updates; in other words, we assumed fully converged images for the sub-iterations of (14) and (15). Nevertheless, our optimized μ worked well even when the sub-iterations did not converge fully. This may be because a few sub-iterations yielded good approximations of the fully converged solutions of the sub-problems (14) and (15).
The proposed method worked well for SPECT image reconstruction with the patch-based convex Fair potential function [6]. It can be easily extended to other computationally expensive NL regularizers [4], [5], [7] and to NL regularizers that use high-resolution side information [9]–[11], for both emission and transmission tomography. Even though ADMM is guaranteed to converge only in convex cases, our proposed ADMM can be a practical method for many nonconvex NL regularizers. Improving image quality with proper regularizers and appropriate regularization parameter selection using our fast ADMM algorithm is an important direction for future work.
Acknowledgments
This work was supported in part by the National Institutes of Health under Grant 2RO1 EB001994 and in part by the 2013 Research Fund (1.130006.01) of UNIST (Ulsan National Institute of Science and Technology).
The authors thank S. Ramani for helpful discussion on ADMM and H. Nien for bringing [13] to our attention.
Appendix A. Approximation for Ideal ADMM Update
For R, even though f̂init and u(n−1) may differ substantially, the patch selection operator Ni − Nj and the ℓ2 norm can make this difference much smaller; therefore, it seems reasonable to use the approximation (35). For W, the type of approximation in (36) has been used in various gradient-based analyses such as mean-variance analysis [32], spatial resolution analysis [30], [31], and noise analysis [33]. Even though there may be a non-negligible difference between f̂init and f(n−1), the approximation A′WA ≈ A′W(n−1)A still holds fairly well, so it also seems reasonable to use the approximation (36).
Contributor Information
Se Young Chun, Department of Electrical Engineering and Computer Science and Department of Radiology, University of Michigan, Ann Arbor, MI, USA. He is now with the School of Electrical and Computer Engineering, Ulsan National Institute of Science and Technology (UNIST), Ulsan 689-798, Republic of Korea.
Yuni K. Dewaraja, Department of Radiology, University of Michigan, Ann Arbor, MI 48109 USA.
Jeffrey A. Fessler, Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109 USA.
References
- 1.Dempster AP, Laird NM, Rubin DB. Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B. 1977;39(1):1–38. [Online]. Available: http://www.jstor.org/stable/info/2984875?seq=1. [Google Scholar]
- 2.Hudson HM, Larkin RS. Accelerated image reconstruction using ordered subsets of projection data. IEEE Trans Med Imag. 1994 Dec;13(4):601–609. doi: 10.1109/42.363108. [DOI] [PubMed] [Google Scholar]
- 3.De Pierro AR. A modified expectation maximization algorithm for penalized likelihood estimation in emission tomography. IEEE Trans Med Imag. 1995 Mar;14(1):132–137. doi: 10.1109/42.370409. [DOI] [PubMed] [Google Scholar]
- 4.Zhang X, Burger M, Bresson X, Osher S. Bregmanized nonlocal regularization for deconvolution and sparse reconstruction. SIAM J Imag Sci. 2010;3(3):253–276. [Google Scholar]
- 5.Lou Y, Zhang X, Osher S, Bertozzi A. Image recovery via nonlocal operators. J Sci Comput. 2010 Feb;42(2):185–197. [Google Scholar]
- 6.Wang G, Qi J. Penalized likelihood PET image reconstruction using patch-based edge-preserving regularization. IEEE Trans Med Imag. 2012 Dec;31(12):2194–2204. doi: 10.1109/TMI.2012.2211378. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Yang Z, Jacob M. Nonlocal regularization of inverse problems: A unified variational framework. IEEE Trans Image Process. 2013 Aug;22(8):3192–3203. doi: 10.1109/TIP.2012.2216278. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Rousseau F. A non-local approach for image super-resolution using intermodality priors. Med Image Anal. 2010 Aug;14(4):594–605. doi: 10.1016/j.media.2010.04.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Chun SY, Fessler JA, Dewaraja YK. Non-local means methods using CT side information for I-131 SPECT image reconstruction,” in. Proc IEEE Nucl Sci Symp Med Imag Conf. 2012:3362–3366. [Google Scholar]
- 10.Vunckx K, Atre A, Baete K, Reilhac A, Deroose C, Laere KV, Nuyts J. Evaluation of three MRI-based anatomical priors for quantitative pet brain imaging. IEEE Trans Med Imag. 2012 Mar;31(3):599–612. doi: 10.1109/TMI.2011.2173766. [DOI] [PubMed] [Google Scholar]
- 11. Nguyen V-G, Lee S-J. Incorporating anatomical side information into PET reconstruction using nonlocal regularization. IEEE Trans Image Process. 2013 Oct;22(10):3961–3973. doi: 10.1109/TIP.2013.2265881.
- 12. Chun SY, Dewaraja YK, Fessler JA. Alternating direction method of multiplier for emission tomography with non-local regularizers. Proc Int Meet Fully 3-D Image Recon Rad Nucl Med. 2013:62–65. [Online]. Available: proc/13/web/chun-13-adm.pdf.
- 13. Ghadimi E, Teixeira A, Shames I, Johansson M. On the optimal step-size selection for the alternating direction method of multipliers. Proc IFAC Workshop Estimat Control Netw Syst. 2012. [Online]. Available: www.ee.kth.se/mikaelj/papers/gtsj12.pdf.
- 14. Goldstein T, Osher S. The split Bregman method for L1-regularized problems. SIAM J Imag Sci. 2009;2(2):323–343.
- 15. Figueiredo MAT, Bioucas-Dias JM. Restoration of Poissonian images using alternating direction optimization. IEEE Trans Image Process. 2010 Dec;19(12):3133–3145. doi: 10.1109/TIP.2010.2053941.
- 16. Xu J, Chen S, Tsui BMW. Total variation penalized maximum-likelihood image reconstruction for a stationary small animal SPECT system. Proc Int Meet Fully 3-D Image Recon Rad Nucl Med. 2011:225–228.
- 17. Segars WP, Mahesh M, Beck TJ, Frey EC, Tsui BMW. Realistic CT simulation using the 4D XCAT phantom. Med Phys. 2008 Aug;35(8):3800–3808. doi: 10.1118/1.2955743.
- 18. Morales JL, Nocedal J. Remark on "Algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound constrained optimization". ACM Trans Math Softw. 2011 Nov;38(1):7:1–7:4.
- 19. Dewaraja YK, Schipper MJ, Roberson PL, Wilderman SJ, Amro H, Regan DD, Koral KF, Kaminski MS, Avram AM. 131I-tositumomab radioimmunotherapy: Initial tumor dose-response results using 3-dimensional dosimetry including radiobiologic modeling. J Nucl Med. 2010 Jul;51(7):1155–1162. doi: 10.2967/jnumed.110.075176.
- 20. Chun S, Fessler J, Dewaraja Y. Correction for collimator-detector response in SPECT using point spread function template. IEEE Trans Med Imag. 2013 Feb;32(2):295–305. doi: 10.1109/TMI.2012.2225441.
- 21. Dewaraja YK, Ljungberg M, Fessler JA. 3-D Monte Carlo-based scatter compensation in quantitative I-131 SPECT reconstruction. IEEE Trans Nucl Sci. 2006 Feb;53(1):181–188. doi: 10.1109/TNS.2005.862956.
- 22. Rivera M, Marroquin JL. Efficient half-quadratic regularization with granularity control. Imag Vision Comput. 2003 Apr;21(4):345–357.
- 23. Fair RC. On the robust estimation of econometric models. Ann Econ Social Measur. 1974 Oct;2:667–677. [Online]. Available: http://fairmodel.econ.yale.edu/rayfair/pdf/1974D.HTM.
- 24. Lange K. Convergence of EM image reconstruction algorithms with Gibbs smoothing. IEEE Trans Med Imag. 1990 Dec;9(4):439–446. doi: 10.1109/42.61759.
- 25. Boyd S, Parikh N, Chu E, Peleato B, Eckstein J. Distributed optimization and statistical learning via the alternating direction method of multipliers. Found Trends Mach Learn. 2010;3(1):1–122.
- 26. Eckstein J, Bertsekas DP. On the Douglas-Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math Program. 1992 Apr;55(1–3):293–318.
- 27. Fessler JA, Booth SD. Conjugate-gradient preconditioning methods for shift-variant PET image reconstruction. IEEE Trans Image Process. 1999 May;8(5):688–699. doi: 10.1109/83.760336.
- 28. Press WH, Flannery BP, Teukolsky SA, Vetterling WT. Numerical Recipes in C. 2nd ed. New York: Cambridge Univ. Press; 1992.
- 29. Fessler JA, Hero AO. Penalized maximum-likelihood image reconstruction using space-alternating generalized EM algorithms. IEEE Trans Image Process. 1995 Oct;4(10):1417–1429. doi: 10.1109/83.465106.
- 30. Fessler JA, Rogers WL. Spatial resolution properties of penalized-likelihood image reconstruction methods: Space-invariant tomographs. IEEE Trans Image Process. 1996 Sep;5(9):1346–1358. doi: 10.1109/83.535846.
- 31. Chun SY, Fessler JA. Spatial resolution properties of motion-compensated image reconstruction methods. IEEE Trans Med Imag. 2012 Jul;31(7):1413–1425. doi: 10.1109/TMI.2012.2192133.
- 32. Fessler JA. Mean and variance of implicitly defined biased estimators (such as penalized maximum likelihood): Applications to tomography. IEEE Trans Image Process. 1996 Mar;5(3):493–506. doi: 10.1109/83.491322.
- 33. Chun SY, Fessler JA. Noise properties of motion-compensated tomographic image reconstruction methods. IEEE Trans Med Imag. 2013 Feb;32(2):141–152. doi: 10.1109/TMI.2012.2206604.