Accelerating ordered subsets image reconstruction for X-ray CT using spatially non-uniform optimization transfer

Donghwan Kim; Debashish Pal; Jean-Baptiste Thibault; Jeffrey A Fessler

doi:10.1109/TMI.2013.2266898

. Author manuscript; available in PMC: 2014 Nov 1.

Published in final edited form as: IEEE Trans Med Imaging. 2013 Jun 7;32(11):10.1109/TMI.2013.2266898. doi: 10.1109/TMI.2013.2266898

Accelerating ordered subsets image reconstruction for X-ray CT using spatially non-uniform optimization transfer

Donghwan Kim ¹, Debashish Pal ², Jean-Baptiste Thibault ³, Jeffrey A Fessler ⁴

PMCID: PMC3818426 NIHMSID: NIHMS502911 PMID: 23751959

Abstract

Statistical image reconstruction algorithms in X-ray CT provide improved image quality for reduced dose levels but require substantial computation time. Iterative algorithms that converge in few iterations and that are amenable to massive parallelization are favorable in multiprocessor implementations. The separable quadratic surrogate (SQS) algorithm is desirable as it is simple and updates all voxels simultaneously. However, the standard SQS algorithm requires many iterations to converge. This paper proposes an extension of the SQS algorithm that leads to spatially non-uniform updates. The non-uniform (NU) SQS encourages larger step sizes for the voxels that are expected to change more between the current and the final image, accelerating convergence, while the derivation of NU-SQS guarantees monotonic descent. Ordered subsets (OS) algorithms can also accelerate SQS, provided suitable “subset balance” conditions hold. These conditions can fail in 3D helical cone-beam CT due to incomplete sampling outside the axial region-of-interest (ROI). This paper proposes a modified OS algorithm that is more stable outside the ROI in helical CT. We use CT scans to demonstrate that the proposed NU-OS-SQS algorithm handles the helical geometry better than the conventional OS methods and “converges” in less than half the time of ordinary OS-SQS.

Index Terms: Statistical image reconstruction, computed tomography, parallelizable iterative algorithms, ordered subsets, separable quadratic surrogates

I. Introduction

Statistical image reconstruction methods can improve resolution and reduce noise and artifacts by minimizing either penalized likelihood (PL) [1]–[3] or penalized weighted least-squares (PWLS) [4]–[6] cost functions that model the physics and statistics in X-ray CT. The primary drawback of these methods is their computationally expensive iterative algorithms. This paper describes new accelerated minimization algorithms for X-ray CT statistical image reconstruction.

There are several iterative algorithms for X-ray CT. Coordinate descent (CD) algorithms [7] (also known as Gauss Siedel algorithms [8, p. 507]) and block/group coordinate descent (BCD/GCD) algorithms [9]–[11], update one or a group of voxels sequentially. These can converge in few iterations but can require long computation time per iteration [6], [12]. Considering modern computing architectures, algorithms that update all voxels simultaneously and that are amenable to parallelization are desirable, such as ordered subsets based on separable quadratic surrogate (OS-SQS) [13]–[15] and preconditioned conjugate gradient (PCG) algorithms [16]. However, those highly parallelizable algorithms require more iterations than CD algorithms [6], [12], and thus it is desirable to reduce the number of iterations needed to reach acceptable images. Splitting techniques [17] can accelerate convergence [18], but require substantial extra memory.

In this paper, we propose an enhanced version of a highly parallelizable SQS algorithm that accelerates convergence. SQS algorithms are optimization transfer methods that replace the original cost function by a simple surrogate function [19], [20]. Here we construct surrogates with spatially non-uniform curvatures that provide spatially non-uniform step sizes to accelerate convergence.

Spatially non-homogeneous (NH) approach [7] accelerated the CD algorithm by more frequently visiting the voxels that need updates. This approach is effective because the differences between the initial and final images are non-uniform. Inspired by such ideas, we propose a spatially non-uniform (NU) optimization transfer method that encourages larger updates for voxels that are predicted to be farther from the optimal value, using De Pierro’s idea in SQS [21]. We provide a theoretical justification for the acceleration of NU method by analyzing the convergence rate of SQS algorithm (in Section II-D). The NH approach also balanced homogeneous and non-homogeneous updates for fast overall convergence rate [7]. Section III-C discusses similar considerations for the proposed NU approach.

Ordered subsets (OS), also known as incremental gradient methods [22], [23] or block iterative methods [24], can accelerate gradient-based algorithms by grouping the projection data into (ordered) subsets and updating the image using each subset. OS algorithms are most effective when a properly scaled gradient of each subset data-fit term approximates the gradient of the full data-fidelity term, and then it can accelerate convergence by a factor of the number of subsets. However, standard OS algorithms usually approach a limit-cycle where the sub-iterations loop around the optimal point. OS algorithms can be modified so that they converge by introducing relaxation [25], reducing the number of subsets, or by using incremental optimization transfer methods [26]. Unfortunately, such methods converge slower than ordinary OS algorithms in early iterations. Therefore, we investigated averaging the sub-iterations when the algorithm reaches a limit-cycle, which improves image quality without slowing convergence. (There was a preliminary simulation study of this idea in [27].)

In cone-beam CT, the user must define a region-of-interest (ROI) along the axial (z) direction for image reconstruction (see Fig. 1). Model-based reconstruction methods for cone-beam CT should estimate many voxels outside the ROI, because parts of each patient usually lie outside the ROI yet contribute to some measurements. However, accurately estimating non-ROI voxels is difficult since they are incompletely sampled, which is called the “long-object problem” [28]. Reconstructing the non-ROI voxels adequately is important, as they may impact the estimates within the ROI. Unfortunately in OS algorithms, the sampling of these extra slices leads to very imbalanced subsets particularly for large number of subsets, which can destabilize OS algorithm outside the ROI. This paper proposes an improved OS algorithm that is more stable for 3D helical CT by defining better scaling factors for the subset-based gradient [29].

The paper is organized as follows. Section II reviews PL and PWLS problems for X-ray CT image reconstruction. We review the optimization transfer methods including the SQS algorithm and analyze its convergence rate. Section III presents the proposed spatially non-uniform SQS algorithm (NU-SQS). Section IV reviews the standard OS algorithm and refines it for 3D helical CT. Section V shows the experimental results on various data sets, quantifying the convergence rate and reconstructed image quality. Finally, Section VI offers conclusions. The results show that the NU approach more than doubles the convergence rate, and the improved OS algorithm provides acceptable images in helical CT.

II. Statistical image reconstruction

A. Problem

We reconstruct a non-negative image $x = (x_{1}, \dots, x_{N_{p}}) \in ℝ_{+}^{N_{p}}$ from noisy measured transmission data Y ∈ IR^N_d by minimizing either penalized likelihood (PL) or penalized weighted least-squares (PWLS) cost functions:

\hat{x} = arg min_{x ≽ 0} Ψ (x),

(1)

Ψ (x) ≜ L (x) + R (x) = \sum_{i = 1}^{N_{d}} h_{i} ({[A x]}_{i}) + \sum_{k = 1}^{N_{r}} ψ_{k} ({[C x]}_{k}),

(2)

where x̂ is a minimizer of Ψ(x) subject to a non-negativity constraint. The function L(x) is a negative log-likelihood term (data-fit term) and R(x) is a regularizer. The matrix A = {a_ij} is a projection operator (a_ij ≥ 0 for all i, j) where ${[A x]}_{i} ≜ \sum_{j = 1}^{N_{p}} a_{i j} x_{j}$ , and C = {c_kj} is a finite differencing matrix considering 26 neighboring voxels in 3D image space.¹ The function ψ_k(t) is a (convex and typically non-quadratic) edge-preserving potential function. The function h_i(t) is selected based on the chosen statistics and physics:

Penalized likelihood (PL) for pre-log data Y_i with Poisson model [1]–[3] uses:
$h_{i} (t) = (b_{i} e^{- t} + r_{i}) - Y_{i} log (b_{i} e^{- t} + r_{i}),$ (3)

where b_i is the blank scan factor and r_i is the mean number of background events. The function h_i(·) is non-convex if r_i ≠ 0, or convex otherwise. A shifted Poisson model [30] that partially accounts for electronic recorded noise can be used instead.
Penalized weighted least squares (PWLS) for post-log data y_i = log (b_i/(Y_i − r_i)) with Gaussian model [4]–[6] uses a convex quadratic function:
$h_{i} (t) = \frac{1}{2} w_{i} {(t - y_{i})}^{2},$ (4)

where w_i = (Y_i − r_i)²/Y_i provides statistical weighting. We use the PWLS cost function for our experiments in Section IV and V.

The proposed NU-SQS algorithm, based on optimization transfer methods (in Section II-B), decreases the cost function Ψ(x) monotonically for either (3) or (4).

B. Optimization transfer method

When a cost function Ψ(x) is difficult to minimize, we replace Ψ(x) with a surrogate function φ⁽ⁿ⁾(x) at the nth iteration for computational efficiency. This method is called optimization transfer [19], [20], which is also known as a majorization principle [31], and a comparison function [32]. There are many optimization transfer algorithms such as expectation maximization (EM) algorithms [33], [34], separable surrogate algorithms based on De Pierro’s lemma [35]–[37] and surrogate algorithms using Lipschitz constants [38], [39].

The basic iteration of an optimization transfer method is

x^{(n + 1)} = arg min_{x ≽ 0} φ^{(n)} (x) .

(5)

To monotonically decrease Ψ(x), we design surrogate functions φ⁽ⁿ⁾(x) that satisfy the following majorization conditions:

\begin{array}{l} Ψ (x^{(n)}) = φ^{(n)} (x^{(n)}), \\ Ψ (x) \leq φ^{(n)} (x), \forall x \in ℝ_{+}^{N_{p}} . \end{array}

(6)

Constructing surrogates with smaller curvatures while satisfying condition (6) is the key to faster convergence in optimization transfer methods [11].

Optimization transfer has been used widely in tomography problems. De Pierro developed a separable surrogate (SS) approach in emission tomography [35], [36]. Quadratic surrogate (QS) functions have been derived for non-quadratic problems, enabling monotonic descent [1]. SQS algorithms combine SS and QS [14], and are the focus of this paper. Partitioned SQS methods for multi-core processors have been proposed for separating the image domain by the number of processors and updating each of them separately while preserving the monotonicity [40]. In addition, replacing $ℝ_{+}^{N_{p}}$ in (6) by an interval that is known to include the minimizer x̂ can reduce the surrogate curvature [7], [41].

Building on this history of optimization transfer methods that seek simple surrogates with small curvatures, we propose a spatially non-uniform SQS (NU-SQS) algorithm that satisfies condition (6) and converges faster than the standard SQS. We review the derivation of the SQS algorithm next.

C. Separable quadratic surrogate (SQS) algorithm

We first construct a quadratic surrogate at the nth iteration for the non-quadratic cost function in (2):

Ψ (x) = L (x) + R (x) \leq Q_{L}^{(n)} (x) + Q_{R}^{(n)} (x),

(7)

where $Q_{L}^{(n)} (x)$ and $Q_{R}^{(n)} (x)$ are quadratic surrogates for L(x) and R(x). Based on (2), the quadratic surrogate for L(x) has the form:

\begin{array}{l} Q_{L}^{(n)} (x) = \sum_{i = 1}^{N_{d}} q_{i}^{(n)} ({[{A x}^{(n)}]}_{i}), \\ q_{i}^{(n)} (t) ≜ h_{i} (t_{i}^{(n)}) + {\dot{h}}_{i} (t_{i}^{(n)}) (t - t_{i}^{(n)}) + \frac{{\overset{⌣}{c}}_{i}^{(n)}}{2} {(t - t_{i}^{(n)})}^{2}, \end{array}

(8)

where $t_{i}^{(n)} ≜ {[{A x}^{(n)}]}_{i}$ , and ${\overset{⌣}{c}}_{i}^{(n)} = max {{\overset{⌣}{c}}_{i}^{(n)}, η}$ is the curvature of $q_{i}^{(n)} (t)$ for some small positive value η that ensures the curvature ${\overset{⌣}{c}}_{i}^{(n)}$ positive [1]. In PWLS problem, h_i (·) is quadratic already, so $q_{i}^{(n)} (t) = h_{i} (t)$ . The quadratic surrogate $Q_{R}^{(n)} (x)$ for R(x) is defined similarly.

We choose curvatures ${{\overset{⌣}{c}}_{i}^{(n)}}$ that satisfy the monotonicity conditions in (6). For PL, the smallest curvatures:

{\overset{⌣}{c}}_{i}^{(n)} ≜ {\begin{cases} {[2 \frac{h_{i} (0) - h_{i} (t_{i}^{(n)}) + t_{i}^{(n)} {\dot{h}}_{i} (t_{i}^{(n)})}{{[t_{i}^{(n)}]}^{2}}]}_{+}, & t_{i}^{(n)} > 0, \\ {[{\ddot{h}}_{i} (0)]}_{+}, & t_{i}^{(n)} = 0, \end{cases}

(9)

where [t]₊ = max{t, 0}, called “optimal curvatures,” lead to the fastest convergence rate but require an extra back-projection each iteration for non-quadratic problems [1]. Alternatively, we may use “maximum curvatures”:

{\overset{⌣}{c}}_{i} ≜ max_{t \geq 0} {\ddot{h}}_{i} (t)

(10)

that we can precompute before the first iteration [1].

Next, we generate a separable surrogate of the quadratic surrogate. For completeness, we repeat De Pierro’s argument in [14]. We first rewrite forward projection [Ax]_i as follows:

{[A x]}_{i} = \sum_{j = 1}^{N_{p}} a_{i j} x_{j} = \sum_{\begin{matrix} j = 1 \\ a_{i j} \neq 0 \end{matrix}}^{N_{p}} π_{i j}^{(n)} (\frac{a_{i j}}{π_{i j}^{(n)}} (x_{j} - x_{j}^{(n)}) + {[{A x}^{(n)}]}_{i}),

(11)

where a non-negative real number $π_{i j}^{(n)}$ is zero only if a_ij is zero for all i, j, and satisfies $\sum_{j = 1}^{N_{p}} π_{i j}^{(n)} = 1$ for all i. Using the convexity of $q_{i}^{(n)} (\cdot)$ and the convexity inequality yields

q_{i}^{(n)} ({[A x]}_{i}) \leq \sum_{\begin{matrix} j = 1 \\ a_{i j} \neq 0 \end{matrix}}^{N_{p}} π_{i j}^{(n)} q_{i}^{(n)} (\frac{a_{i j}}{π_{i j}^{(n)}} (x_{j} - x_{j}^{(n)}) + {[{A x}^{(n)}]}_{i}) .

(12)

Thus we have the following separable quadratic surrogate $φ_{L}^{(n)} (x)$ (with a diagonal Hessian) for the data-fit term L(x):

L (x) \leq Q_{L}^{(n)} (x) \leq φ_{L}^{(n)} (x) ≜ \sum_{j = 1}^{N_{p}} φ_{L, j}^{(n)} (x_{j}),

(13)

φ_{L, j}^{(n)} (x_{j}) ≜ \sum_{\begin{matrix} i = 1 \\ a_{i j} \neq 0 \end{matrix}}^{N_{d}} π_{i j}^{(n)} q_{i}^{(n)} (\frac{a_{i j}}{π_{i j}^{(n)}} (x_{j} - x_{j}^{(n)}) + {[{A x}^{(n)}]}_{i}) .

(14)

The second derivative (curvature) of the surrogate $φ_{L, j}^{(n)} (x_{j})$ is

d_{j}^{L, (n)} ≜ \frac{\partial^{2}}{\partial x_{j}^{2}} φ_{L, j}^{(n)} (x_{j}) = \sum_{\begin{matrix} i = 1 \\ a_{i j} \neq 0 \end{matrix}}^{N_{d}} {\overset{⌣}{c}}_{i}^{(n)} \frac{a_{i j}^{2}}{π_{i j}^{(n)}} .

(15)

We can define a separable quadratic surrogate $φ_{R, j}^{(n)} (x_{j})$ for the regularizer similarly, and it has the curvature:

d_{j}^{R, (n)} ≜ \frac{\partial^{2}}{\partial x_{j}^{2}} φ_{R, j}^{(n)} (x_{j}) = \sum_{\begin{matrix} k = 1 \\ c_{k j} \neq 0 \end{matrix}}^{N_{r}} {\ddot{ψ}}_{k} (0) \frac{c_{k j}^{2}}{π_{k j}^{(n)}},

(16)

where $π_{k j}^{(n)}$ have similar constraints as $π_{i j}^{(n)}$ , ψ̈_k(0) = max_t ψ̈_k(t) for maximum curvature [14], or ψ̈_k(0) can be replaced by ψ̇_k([Cx⁽ⁿ⁾]_k)/[Cx⁽ⁿ⁾]_k for Huber’s optimal curvature [32, Lemma 8.3, p.184].

Combining the surrogates for the data-fit term and regularizer and minimizing it in (5) leads to the following separable quadratic surrogate (SQS) method [14] that updates all voxels simultaneously with a “denominator” $d_{j}^{(n)} ≜ d_{j}^{L, (n)} + d_{j}^{R, (n)}$ :

x_{j}^{(n + 1)} = {[x_{j}^{(n)} - \frac{1}{d_{j}^{(n)}} \frac{\partial}{\partial x_{j}} Ψ (x^{(n)})]}_{+},

(17)

where a clipping [·]₊ enforces the non-negativity constraint. This SQS decreases the cost function Ψ(x) monotonically, and it converges based on the proof in [20]. If Ψ(x) is convex, a sequence {x⁽ⁿ⁾} converges to x^(∞) that is a global minimizer x̂. Otherwise, {x⁽ⁿ⁾} converges to a local minimizer x^(∞) which may or may not be a global minimizer x̂ depending on the initial image x⁽⁰⁾.

The implementation and convergence rate of SQS depend on the choice of $π_{i j}^{(n)}$ . A general form for $π_{i j}^{(n)}$ is

π_{i j}^{(n)} ≜ \frac{λ_{i j}^{(n)}}{\sum_{\begin{matrix} l = 1 \\ a_{i l} \neq 0 \end{matrix}}^{N_{p}} λ_{i l}^{(n)}},

(18)

where a non-negative real number $λ_{i j}^{(n)}$ is zero only if a_ij is zero. Then (15) can be re-written as

d_{j}^{L, (n)} = \sum_{\begin{matrix} i = 1 \\ a_{i j} \neq 0 \end{matrix}}^{N_{d}} {\overset{⌣}{c}}_{i}^{(n)} \frac{a_{i j}^{2}}{λ_{i j}^{(n)}} (\sum_{\begin{matrix} l = 1 \\ a_{i l} \neq 0 \end{matrix}}^{N_{p}} λ_{i l}^{(n)}) .

(19)

Summations involving the constraint a_ij ≠ 0 require knowledge of the projection geometry, and thereby each summation can be viewed as a type of forward or back projection.

The standard choice [11], [14]:

{\bar{λ}}_{i j} = a_{i j}, {\bar{λ}}_{k j} = ∣ c_{k j} ∣,

(20)

leads to

{\bar{d}}_{j}^{L, (n)} = \sum_{i = 1}^{N_{d}} {\overset{⌣}{c}}_{i}^{(n)} a_{i j} (\sum_{l = 1}^{N_{p}} a_{i l}),

(21)

and

{\bar{d}}_{j}^{R, (n)} = \sum_{k = 1}^{N_{r}} {\ddot{ψ}}_{k} (0) ∣ c_{k j} ∣ (\sum_{l = 1}^{N_{p}} ∣ c_{k l} ∣) .

(22)

This choice is simple to implement, since the (available) standard forward and back projections can be used directly in (21). (Computing ${\bar{d}}_{j}^{R, (n)}$ in (22) is negligible compared with (21).) The standard SQS generates a sequence {x⁽ⁿ⁾} in (17) by defining the denominator as

{\bar{d}}_{j}^{(n)} ≜ {\bar{d}}_{j}^{L, (n)} + {\bar{d}}_{j}^{R, (n)} .

(23)

However, we prefer choices for $λ_{i j}^{(n)}$ (and $λ_{k j}^{(n)}$ ) that provide fast convergence. Therefore, we first analyze the convergence rate of the SQS algorithm in terms of the choice of $λ_{i j}^{(n)}$ in the next section. Section III introduces acceleration by choosing better $λ_{i j}^{(n)}$ (and $λ_{k j}^{(n)}$ ) than the standard choice (20).

D. Convergence rate of SQS algorithm

The convergence rate of the sequence {x⁽ⁿ⁾} generated by the SQS iteration (17) depends on the denominator $D^{(n)} ≜ diag {d_{j}^{(n)}}$ . This paper’s main goal is to choose $λ_{i j}^{(n)}$ so that the sequence {x⁽ⁿ⁾} converges faster.

The asymptotic convergence rate of a sequence {x⁽ⁿ⁾} that converges to x^(∞) is measured by the root-convergence factor defined as R₁{x⁽ⁿ⁾} ≜ lim sup_n_→∞ ||x⁽ⁿ⁾ − x^(∞)||^1/ⁿ in [31, p. 288]. The root-convergence factor at x^(∞) for SQS algorithm is given as R₁ {x⁽ⁿ⁾}= ρ(I − [D^(∞)]⁻¹H^(∞)) in [31, Linear Convergence Theorem, p. 301] and [42, Theorem 1], where the spectral radius ρ(·) of a square matrix is its largest absolute eigenvalue and H^(∞) ≜ ∇²Ψ(x^(∞)), assuming that D⁽ⁿ⁾ converges to D^(∞). For faster convergence, we want R₁{x⁽ⁿ⁾} and ρ(·) to be smaller. We can reduce the root-convergence factor based on² [42, Lemma 1], by using a smaller denominator D⁽ⁿ⁾ subject to the majorization conditions in (6) and (13).

However, the asymptotic convergence rate does not help us design D⁽ⁿ⁾ in the early iterations, so we consider another factor that relates to the convergence rate of SQS:

Lemma 1

For a fixed denominator D (using the maximum curvature (10)), a sequence {x⁽ⁿ⁾} generated by an SQS algorithm (17) satisfies

Ψ (x^{(n + 1)}) - Ψ (x^{(\infty)}) \leq \frac{{∣ ∣ x^{(0)} - x^{(\infty)} ∣ ∣}_{D}^{2}}{2 (n + 1)},

(24)

for any n ≥ 0, if Ψ (x) is convex. Lemma 1 is a simple generalization of Theorem 3.1 in [39], which was shown for a surrogate with a scaled identity Hessian (using Lipschitz constant). The inequality (24) shows that minimizing ||x⁽⁰⁾−x^(∞)||_D with respect to D will reduce the upper bound of Ψ(x⁽ⁿ⁾)−Ψ(x^(∞)), and thus accelerate convergence. (Since the upper bound is not tight, there should be a room for further acceleration by choosing better D, but we leave it as future work.)

We want to adaptively design D⁽ⁿ⁾ to accelerate convergence at the nth iteration. We can easily extend Lemma 1 to Corollary 1 by treating the current estimate x⁽ⁿ⁾ as an initial image for the next SQS iteration:

Corollary 1

A sequence {x⁽ⁿ⁾} generated by an SQS algorithm (17) satisfies

Ψ (x^{(n + 1)}) - Ψ (x^{(\infty)}) \leq \frac{{∣ ∣ x^{(n)} - x^{(\infty)} ∣ ∣}_{D^{(n)}}^{2}}{2}

(25)

for any n ≥ 0, if Ψ(x) is convex. The inequality (25) motivates us to use $∣ x_{j}^{(n)} - x_{j}^{(\infty)} ∣$ when selecting $d_{j}^{(n)}$ (and $λ_{i j}^{(n)}$ ) to accelerate convergence at nth iteration. We discuss this further in Section III-A. We fix D⁽ⁿ⁾ after the n_fix number of iterations to ensure convergence of SQS iteration (17), based on [20]. In this case, D⁽ⁿ⁾ must be generated by the maximum curvature (10) to guarantee the majorization condition (6) for subsequent iterations.

From (17) and (19), the step size $Δ_{j}^{(n)}$ of the SQS iteration (17) has this relationship:

Δ_{j}^{(n)} ≜ x_{j}^{(n + 1)} - x_{j}^{(n)} \propto \frac{1}{d_{j}^{(n)}},

(26)

where smaller $d_{j}^{(n)}$ (and relatively larger $λ_{i j}^{(n)}$ ) values lead to larger steps. Therefore, we should encourage $d_{j}^{(n)}$ to be small ( $λ_{i j}^{(n)}$ to be relatively large) to accelerate the SQS algorithm. However, we cannot reduce $d_{j}^{(n)}$ simultaneously for all voxels, due to the majorization conditions in (6) and (13). Lemma 1 (and Corollary 1) suggest intuitively that we should try to encourage larger steps $Δ_{j}^{(n)}$ (smaller $d_{j}^{(n)}$ ) for the voxels that are far from the optimum to accelerate convergence.

III. Spatially non-uniform separable quadratic surrogate (NU-SQS)

We design surrogates that satisfy condition (6) and provide faster convergence based on Section II-D. We introduce the “update-needed factors” and propose a spatially non-uniform SQS (NU-SQS) algorithm.

A. Update-needed factors

Based on Corollary 1, knowing $∣ x_{j}^{(n)} - x_{j}^{(\infty)} ∣$ would be helpful for accelerating convergence at the nth iteration, but $x_{j}^{(\infty)}$ is unavailable in practice. NH-CD algorithm [7] used the difference between the current and previous iteration instead:

u_{j}^{(n)} ≜ max {∣ x_{j}^{(n)} - x_{j}^{(n - 1)} ∣, δ^{(n)}},

(27)

which we call the “update-needed factors” (originally named a voxel selection criterion (VSC) in [7]). Including the small positive values {δ⁽ⁿ⁾} ensures all voxels to have at least a small amount of attention for updates. This $u_{j}^{(n)}$ accelerated the NH-CD algorithm by visiting voxels with large $u_{j}^{(n)}$ more frequently.

B. Design

For SQS, we propose to choose $λ_{i j}^{(n)}$ to be larger if the jth voxel is predicted to need more updates based on the “update-needed factors” (27) after the nth iteration. We select

{\tilde{λ}}_{i j}^{(n)} = a_{i j} u_{j}^{(n)},

(28)

which is proportional to $u_{j}^{(n)}$ and satisfies the condition for $λ_{i j}^{(n)}$ . This choice leads to the following NU-based denominator:

{\tilde{d}}_{j}^{L, (n)} = \frac{1}{u_{j}^{(n)}} \sum_{i = 1}^{N_{d}} {\overset{⌣}{c}}_{i}^{(n)} a_{i j} (\sum_{l = 1}^{N_{p}} a_{i l} u_{l}^{(n)}),

(29)

which leads to spatially non-uniform updates $Δ_{j}^{(n)} \propto u_{j}^{(n)}$ .

If it happened that

∣ x_{j}^{(n)} - x_{j}^{(\infty)} ∣ \approx B ∣ x_{j}^{(n)} - x_{j}^{(n - 1)} ∣ for all j,

(30)

where B is a constant, then the NU denominator ${\tilde{d}}_{j}^{L, (n)}$ would minimize the upper bound of Ψ(x⁽ⁿ⁺¹⁾) − Ψ(x^(∞)) in Corollary 1:

Lemma 2

The proposed choice ${\tilde{d}}_{j}^{L, (n)}$ in (29) minimizes the following weighted sum of the denominators

\sum_{j = 1}^{N_{p}} {(u_{j}^{(n)})}^{2} d_{j}^{L, (n)}

(31)

over all possible choices of the $d_{j}^{L, (n)}$ in (19).

Proof

In Appendix A.

The proposed ${\tilde{d}}_{j}^{L, (n)}$ in (29) reduces to the standard choice ${\bar{d}}_{j}^{L, (n)}$ in (21) when ${u_{j}^{(n)}}$ is uniform. Similar to the standard choice ${\bar{d}}_{j}^{L, (n)}$ , the proposed choice ${\tilde{d}}_{j}^{L, (n)}$ can be implemented easily using standard forward and back projection. However, since ${\tilde{d}}_{j}^{L, (n)}$ depends on iteration (n), additional projections required for ${\tilde{d}}_{j}^{L, (n)}$ at every iteration would increase computation. We discuss ways to reduce this burden in Section III-F.

Similar to the data-fit term, we derive the denominator of NU-SQS for the regularizer term to be:

{\tilde{d}}_{j}^{R, (n)} = \frac{1}{u_{j}^{(n)}} \sum_{k = 1}^{N_{r}} {\ddot{ψ}}_{k} (0) ∣ c_{k j} ∣ (\sum_{l = 1}^{N_{p}} ∣ c_{k l} ∣ u_{l}^{(n)}),

(32)

from the choice ${\tilde{λ}}_{k j}^{(n)} = ∣ c_{k j} ∣ u_{j}^{(n)}$ and the maximum curvature method in [14]. Alternatively, we may use Huber’s optimal curvature [32, Lemma 8.3, p.184] replacing ψ̈_k(0) in (32) by ψ̇_k([Cx⁽ⁿ⁾]_k)/[Cx⁽ⁿ⁾]_k. The computation of (32) is much less than that of the data-fit term.

Defining the denominator in the SQS iteration (17) as

{\tilde{d}}_{j}^{(n)} ≜ {\tilde{d}}_{j}^{L, (n)} + {\tilde{d}}_{j}^{R, (n)}

(33)

leads to the accelerated NU-SQS iteration, while the algorithm monotonically decreases Ψ(x) and is provably convergent [20]. We can further accelerate NU-SQS by ordered subsets (OS) methods [13], [14], while losing the guarantee of monotonicity. This algorithm, called ordered subsets algorithms based on a spatially non-uniform SQS (NU-OS-SQS), is explained in Section IV.

C. Dynamic range adjustment (DRA) of $u_{j}^{(n)}$ In reality, (30) will not hold, so (27) will be suboptimal. We could try to improve (27) by finding a function f⁽ⁿ⁾(·): [δ⁽ⁿ⁾, ∞) → [ε, 1] based on the following

arg min_{f^{(n)} (\cdot)} \sum_{j = 1}^{N_{p}} {(f^{(n)} (u_{j}^{(n)}) - \frac{∣ x_{j}^{(n)} - x_{j}^{(\infty)} ∣}{{max}_{j} ∣ x_{j}^{(n)} - x_{j}^{(\infty)} ∣})}^{2},

(34)

where ε is a small positive value. Then we could use $f^{(n)} (u_{j}^{(n)})$ as (better) update-needed factors. However, solving (34) is intractable, so we searched empirically for good candidates for a function f⁽ⁿ⁾(·).

Intuitively, if the dynamic range of the update-needed factors $u_{j}^{(n)}$ in (27) is too wide, then there will be too much focus on the voxels with relatively large $u_{j}^{(n)}$ , slowing the overall convergence rate. On the other hand, a narrow dynamic range of $u_{j}^{(n)}$ will provide no speed-up, since the algorithm will distribute its efforts uniformly. Therefore, adjusting the dynamic range of the update-needed factors is important to achieve fast convergence. This intuition corresponds to how the NH-CD approach balanced between homogeneous update orders and non-homogeneous update orders [7].

To adjust the dynamic range and distribution of $u_{j}^{(n)}$ , we first construct their empirical cumulative density function:

F_{cdf}^{(n)} (u) ≜ \frac{1}{N_{p}} \sum_{j = 1}^{N_{p}} I_{{u_{j}^{(n)} \leq u}}

(35)

to somewhat normalize their distribution, where I_B = 1 if B is true or 0 otherwise. Then we map the values of $F_{cdf}^{(n)} (u)$ by a non-decreasing function g(·): [0, 1] → [ε, 1] as follows

{\tilde{u}}_{j}^{(n)} ≜ f^{(n)} (u_{j}^{(n)}) = g (F_{cdf}^{(n)} (u_{j}^{(n)})),

(36)

which controls the dynamic range and distribution of ${{\tilde{u}}_{j}^{(n)}}_{j = 1}^{N_{p}}$ , and we enforce positivity in g(·) to ensure that the new adjusted parameter ${\tilde{λ}}_{i j}^{(n)} = a_{i j} {\tilde{u}}_{j}^{(n)}$ is positive if a_ij is positive. (We set δ⁽ⁿ⁾ in (27) to zero here, since a positive parameter ε ensures the positivity of ${\tilde{λ}}_{i j}^{(n)}$ if a_ij is positive.) The transformation (36) from $u_{j}^{(n)}$ to ${\tilde{u}}_{j}^{(n)}$ is called dynamic range adjustment (DRA), and two examples of such ${\tilde{u}}_{j}^{(n)}$ are presented in Fig. 2. Then we use ${\tilde{u}}_{j}^{(n)}$ instead of $u_{j}^{(n)}$ in (28).

Fig. 2 — A shoulder region scan: ${\tilde{u}}_{j}^{(2)}$ and ${\tilde{u}}_{j}^{(8)}$ after dynamic range adjustment (DRA) for NU-OS-SQS(82 subsets), with the choice g(v) = {max v¹⁰, 0.05}. NU-OS-SQS updates the voxels with large ${\tilde{u}}_{j}^{(n)}$ more, whereas ordinary OS-SQS updates all voxels equivalently.

Here, we focus on the following function for adjusting the dynamic range and distribution:

g (v) ≜ max {v^{t}, ε}

(37)

where t is a non-negative real number that controls the distribution of ${\tilde{u}}_{j}^{(n)}$ and ε is a small positive value that controls the maximum dynamic range of ${\tilde{u}}_{j}^{(n)}$ . The function reduces to the ordinary SQS choice in (20) when t = 0. The choice of g(·), particularly the parameters t and ε here, may influence the convergence rate of NU-SQS for different data sets, but we show that certain values for t and ε consistently provide fast convergence for various data sets.

D. Related work

In addition to the standard choice (20), the choice

λ_{i j}^{(n)} = a_{i j} max {x_{j}^{(n)}, δ},

(38)

with a small non-negative δ, has been used in emission tomography problems [35], [36] and in transmission tomography problems [11], [37]. This choice is proportional to $x_{j}^{(n)}$ , and thereby provides a relationship $Δ_{j}^{(n)} \propto x_{j}^{(n)}$ . This classical choice (38) can be also viewed as another NU-SQS algorithm based on “intensity”. However, intensity is not a good predictor of which voxels need more update, so (38) does not provide fast convergence based on the analysis in Section II-D.

E. Initialization of $u_{j}^{(0)}$

Unfortunately, $u_{j}^{(n)}$ in (27) is available only for n ≥ 1, i.e., after updating all voxels once. To define the initial update factors $u_{j}^{(0)}$ , we apply edge and intensity detectors to an initial filtered back-projection (FBP) image. This is reasonable since the initial FBP image is a good low-frequency estimate, so the difference between initial and final image will usually be larger near edges. We investigated one particular linear combination of edge and intensity information from an initial image. We used the 2D Sobel operator to approximate the gradient of the image within each transaxial plane. Then we normalized both the approximated gradient and the intensity of the initial image, and computed a linear combination of two arrays with a ratio 2: 1 for the initial update-needed factor $u_{j}^{(0)}$ , followed by DRA method. We have tried other linear combinations with different ratios, but the ratio 2 : 1 provided the fastest convergence rate in our experiments.

F. Implementation

The dependence of $λ_{i j}^{(n)}$ on iteration (n) increases computation, but we found two practical ways to reduce the burden. First, we found that it suffices to update ${\tilde{u}}_{j}^{(n)}$ (and ${\tilde{d}}_{j}^{(n)}$ ) every n_loop > 1 iterations instead of every iteration. This is reasonable since the update-needed factors usually change slowly with iteration. In this case, we must generate a surrogate with the maximum curvature (10) to guarantee the majorization condition (6) for all iterations. Second, we compute the NU-based denominator (29) simultaneously with the data-fit gradient in (17). In 3D CT, we use forward and back-projectors that compute elements of the system matrix A on the fly, and those elements are used for the gradient ∇L(x) in (17). For efficiency, we reuse those computed elements of A for the NU-based denominator (29). We implemented this using modified separable footprint projector subroutines [43] that take two inputs and project (or back-project) both. This approach required only 25% more computation time than a single forward projection rather than doubling the time (see Table I). Combining this approach with n_loop = 3 yields a NU-SQS algorithm that required only 13% more computation time per iteration than standard SQS, but converges faster.

TABLE I.

Run time of one iteration of NU-OS-SQS(82 subsets) for different choice of n_loop for GE performance phantom.

	SQS	OS-SQS(82)	NU-OS-SQS(82)
n_loop	·	·	1	3	5
1 Iter. [sec]	82	125	161	139	133

Open in a new tab

Computing ${\tilde{u}}_{j}^{(n)}$ and the corresponding NU-based denominator requires one iteration each. In the proposed algorithm, we computed ${\tilde{u}}_{j}^{(n)}$ during one iteration, and then computed the NU-based denominator (29) during the next iteration combined with the gradient computation ∇L(x). Then we used the denominator for n_loop iterations and then compute ${\tilde{u}}_{j}^{(n)}$ again to loop the process (see outline in Appendix B).

IV. Improved ordered subsets (OS) algorithm for helical CT

Ordered subsets (OS) methods can accelerate algorithms by a factor of the number of subsets in early iterations, by using a subset of the measured data for each subset update. However, in practice, OS methods break the monotonicity of SQS and NU-SQS, and typically approach a limit-cycle looping around the optimum. This section describes a simple idea that reduces this problem, only slightly affecting the convergence rate unlike previous convergent OS algorithms. In helical CT geometries, we observed that conventional OS algorithms for PL and PWLS problem are unstable for large subset numbers as they did not consider their non-uniform sampling. Thus, we describe an improved OS algorithm that is more stable for helical CT.

A. Ordinary OS algorithm

An OS algorithm (with M subsets) for accelerating the SQS or NU-SQS updates (17) has the following mth sub-iteration within the nth iteration using the denominator³ ${\tilde{d}}_{j}^{(n)}$ in (33):

x_{j}^{(n + \frac{m + 1}{M})} = {[x_{j}^{(n + \frac{m}{M})} - \frac{1}{{\tilde{d}}_{j}^{(n)}} ({\hat{γ}}_{j}^{(n + \frac{m}{M})} \frac{\partial}{\partial x_{j}} L_{m} (x^{(n + \frac{m}{M})}) + \frac{\partial}{\partial x_{j}} R (x^{(n + \frac{m}{M})}))]}_{+},

(39)

where ${\hat{γ}}_{j}^{(n + \frac{m}{M})}$ scales the gradient of a subset data-fit term L_m(x) = Σ_{i∈S_m} h_i([Ax]_i), and S_m consists of projection views in mth subset for m = 0, 1, ···, M − 1. We count one iteration when all M subsets are used once, since the projection A used for computing data-fit gradients is the dominant operation in SQS iteration.

If we use many subsets to attempt a big acceleration in OS algorithm, some issues arise. The increased computation for the gradient of regularizer in (39) can become a bottleneck (this has been relieved in [44]). Also having less measured data in each subset will likely break the subset balance condition

\nabla L_{0} (x) \approx \nabla L_{1} (x) \approx \dots \approx \nabla L_{M - 1} (x) .

(40)

The update in (39) would accelerate the SQS algorithm by exactly M if the scaling factor ${\hat{γ}}_{j}^{(n + \frac{m}{M})}$ satisfied the condition:

{\hat{γ}}_{j}^{(n + \frac{m}{M})} = \frac{\frac{\partial}{\partial x_{j}} L (x^{(n + \frac{m}{M})})}{\frac{\partial}{\partial x_{j}} L_{m} (x^{(n + \frac{m}{M})})} .

(41)

It would be impractical to compute this factor exactly, so the conventional OS approach is to simply use the constant γ = M. This “approximation” often works well in the early iterations when the subsets are suitably balanced, and for small number of subsets. But in general, the errors caused by the differences between ${\hat{γ}}_{j}^{(n + \frac{m}{M})}$ and a constant scaling factor γ cause two problems in OS methods. First, the choice γ = M causes instability in OS methods in a helical CT that has limited projection views outside ROI, leading to very imbalanced subsets. Therefore, we propose an alternative choice γ_j that better stabilizes OS for helical CT in Section IV-B. Second, even with γ replaced by γ_j, OS methods approach a limit-cycle that loops around the optimal point within sub-iterations [25]. Section IV-C considers a simple averaging idea that reduces this problem.

B. Proposed OS algorithm in helical CT

The constant scaling factor γ = M used in the ordinary regularized OS algorithm is reasonable when all the voxels are sampled uniformly by the projection views in all the subsets. But in geometries like helical CT, the voxels are non-uniformly sampled. In particular, voxels outside the ROI are sampled by fewer projection views than voxels within the ROI (see Fig. 1). So some subsets make no contribution to such voxels, i.e., very imbalanced subsets. We propose to use a voxel-based scaling factor γ_j that considers the non-uniform sampling, rather than a constant factor γ.

After investigating several candidates, we focused on the following scaling factor:

γ_{j} = \sum_{m = 0}^{M - 1} I_{{\sum_{i \in S_{m}} {\overset{⌣}{c}}_{i}^{(n)} a_{i j} (\sum_{l = 1}^{N_{p}} a_{i l} {\tilde{u}}_{l}^{(0)}) > 0}},

(42)

where I_B = 1 if B is true or 0 otherwise. As expected, γ_j < M for voxels outside the ROI and γ_j = M for voxels within the ROI. The scaling factor (42) has small compute overhead as it can be computed simultaneously with the precomputation of the initial data-fit denominator (29) by rewriting it as

{\tilde{d}}_{j}^{L, (0)} ≜ \frac{1}{{\tilde{u}}_{j}^{(0)}} \sum_{m = 0}^{M - 1} (\sum_{i \in S_{m}} {\overset{⌣}{c}}_{i}^{(n)} a_{i j} (\sum_{l = 1}^{N_{p}} a_{i l} {\tilde{u}}_{l}^{(0)})) .

(43)

We store (42) as a short integer for each voxel outside the ROI only, so it does not require very much memory.

We evaluated the OS algorithm with the proposed scaling factors (42) using the GE performance phantom. Fig. 3 shows that the OS algorithm using the proposed scaling factors (42) leads to more stable reconstruction than the ordinary OS approach which diverges outside the ROI. The instability seen with the ordinary OS approach may also degrade image quality within the ROI as seen by the noise standard deviations in Fig. 3. The results in Fig. 4 further show that the ordinary OS algorithm exhibits more variations within the ROI due to the instability outside ROI, whereas the proposed OS algorithm is robust.

Fig. 4 — GE performance phantom: mean and standard deviation within a uniform region in the first slice of the ROI (see Fig. 3) vs. iteration, showing the instability of ordinary OS approach with 328 subsets, compared with the proposed OS approach. Also shown is the result from a converged image x^(∞) generated from several iterations of a convergent algorithm.

C. OS algorithm with averaging

Although the new scaling factors (42) stabilize OS in helical CT and reduce artifacts, the final noise level is still worse than a convergent algorithm (see Fig. 3 and 4) because any OS method with constant scaling factors will not converge [45]. This section discusses one practical method that can reduce noise without affecting the convergence rate. This approach helps the OS algorithm come closer to the converged image, reducing the undesirable noise in images reconstructed using OS algorithms with large M.

To ensure convergence, the incremental optimization transfer method [26] was proposed, which involves a form of averaging, but the greatly increased memory space required has prevented its application in 3D X-ray CT. As a practical alternative, we investigated an approach where the final image is formed by averaging all of the sub-iterations at the final iteration n_end of the OS algorithm (after it approaches its limit cycle). A memory-efficient implementation of this approach uses a recursive in-place calculation:

{\bar{x}}^{(\frac{m + 1}{M})} = \frac{m}{m + 1} {\bar{x}}^{(\frac{m}{M})} + \frac{1}{m + 1} x^{(n_{end} - 1 + \frac{m + 1}{M})},

(44)

where x̄⁽⁰⁾ is an initial zero image, and x̄⁽¹⁾ is the final averaged image. There was a preliminary simulation investigation of averaging the final iteration in [27], and we applied the averaging technique to CT scans here. In Table II, we investigated this averaging method using a scan of the GEPP phantom and quantified the noise and resolution properties (as described in Fig. 3), and evaluated root mean square difference (RMSD⁴) between current and converged image within ROI. Table II shows that the averaging technique successfully reduces the noise and RMSD.

TABLE II.

GE performance phantom: Noise, resolution and RMSD behavior of OS-SQS(328 subsets) after 20 iterations followed by averaging.

	Smoothed FBP	OS-SQS(328)		Conv.
	Smoothed FBP	w/o averaging	w/ averaging	Conv.
Mean [HU]	1127.7	1123.3	1123.8	1123.7
Std. Dev. [HU]	2.3	8.0	7.2	6.6
FWHM [mm]	1.4	0.7	0.7	0.7
RMSD [HU]	9.4	3.4	0.8	·

Open in a new tab

Overall, we have enhanced the standard OS-SQS algorithm into the NU-OS-SQS method for 3D helical CT. First, we accelerated the standard OS-SQS algorithm by non-uniform (NU) approach, encouraging larger step sizes for the voxels that need more updates. We modified the algorithm to handle the helical CT geometry by introducing the scaling factor γ_j. We also averaged all sub-iterations at the final iteration to reduce noise. The outline of the proposed algorithm is presented in Appendix B. We investigate the performance of the NU-OS-SQS algorithm for various CT scans in the next section.

V. Experimental results

We investigated the proposed NU-OS-SQS algorithm for PWLS image reconstruction with a non-negativity constraint. The PWLS cost function is strictly convex and has a unique global minimizer [46]. We implemented the NU-OS-SQS algorithm in C and executed it on a Mac with two 2.26GHz quad-core Intel Xeon processors and a 16GB RAM. We used 16 threads, and projection views were grouped and assigned to each thread.

Three 3D helical CT data sets are used in this section to compare the proposed NU-OS-SQS algorithm to the ordinary OS-SQS algorithm, and we used the GE performance phantom (GEPP) to measure the resolution. We used two other clinical data sets to investigate the performance of NU approach. We investigated tuning the DRA function g(·) in (37) to provide fast convergence rate for various data sets. We also provide results from a simulation data set in a supplementary material for reproducibility.⁵

We chose the parameters of the cost function Ψ(x) in (2) to provide a good image. We defined an edge-preserving potential function as ψ_k([Cx]_k) ≜ ω̄_kψ([Cx]_k), where the function:⁶

ψ (t) = \frac{δ^{2}}{b^{3}} (\frac{{a b}^{2}}{2} {| \frac{t}{δ} |}^{2} + b (b - a) | \frac{t}{δ} | + (a - b) log (1 + b | \frac{t}{δ} |))

(45)

is a generalized version of a Fair potential function in [47], and the spatial weighting ω̄_k [48] provides resolution properties that emulate the GE product “Veo”. We used M = 82 subsets for the OS algorithms, assigning 12 out of 984 projection views per rotation to each subset. We used the maximum curvature (10) for generating the denominator of surrogate function of the cost function Ψ(x), and focused on n_loop = 3 which balances the convergence rate and run time, based on Table I.

In Section II-D, we recommended fixing the denominator ${\tilde{d}}_{j}^{(n)}$ (generated by the maximum curvature (10)) after n_fix iterations in NU-SQS algorithm to guarantee convergence. This condition is less important theoretically when we accelerate the NU-SQS algorithm with OS methods that break the convergence property. However, we still recommend fixing ${\tilde{d}}_{j}^{(n)}$ after n_fix iterations (before approaching the limit-cycle) in the NU-OS-SQS algorithm, because we observed some instability from updating ${\tilde{d}}_{j}^{(n)}$ (and ${\tilde{u}}_{j}^{(n)}$ ) every n_loop iterations near the limit-cycle in our experiments. We selected n_fix = 7 for GEPP, but we did not use n_fix for other two cases because the algorithm did not reach a limit-cycle within n_end = 20 iterations, and we leave optimizing n_fix as a future work.

In Section IV-B, we stabilized the OS-SQS algorithm outside ROI in helical geometry by using the factor γ_j in (42). However, we experienced some instability outside ROI in NU-OS-SQS methods even with (42), because a small NU denominator ${\tilde{d}}_{j}^{(n)}$ outside ROI is more likely to lead to instability than for voxels within the ROI due to the incomplete sampling outside ROI. Therefore, we prevent the denominator ${\tilde{d}}_{j}^{(n)}$ outside ROI from being very small. We empirically modified the DRA function in Section III-C, and used it for our experiments, improving stability outside ROI. We first modified the function (35) as follows:

F_{cdf}^{(n)} (u) ≜ \frac{1}{N_{p}} \sum_{j = 1}^{N_{p}} I_{{γ_{j} u_{j}^{(n)} \leq u}},

(46)

since the value of $u_{j}^{(n)}$ in (27) outside ROI was found to be relatively large due to the incomplete sampling. We further modified (36) and (37) to prevent ${\tilde{d}}_{j}^{(n)}$ from becoming very small outside ROI as follows with g(v; α) ≜ max {αv^t, ε}:

{\tilde{u}}_{j}^{(n)} ≜ {\begin{cases} g (F_{cdf}^{(n)} (γ_{j} u_{j}^{(n)}); 1), & if j th voxel within ROI, \\ g (F_{cdf}^{(n)} (γ_{j} u_{j}^{(n)}); 0.5), & otherwise . \end{cases}

(47)

A. GE performance phantom (GEPP)

We reconstructed 512 × 512 × 47 images of the GEPP from a 888 × 64 × 3071 sinogram (the number of detector columns × detector rows × projection views) with pitch 0.5. We evaluated the full width at half maximum (FWHM) of a tungsten wire (see Fig. 3). Fig. 5(a) shows the resolution versus run time and confirms that non-uniform (NU) approach accelerates the SQS algorithm. This dramatic speed-up in FWHM is promising since SQS-type algorithms are known to have slow convergence rate of high frequency components [6]. We also evaluated the convergence rate by computing RMSD between current and converged⁷ image versus run time, within ROI.

Fig. 5(a) and 5(b) illustrate that increasing t in g(·) in (37) accelerates the convergence of “update-needed” region, particularly the wire and edges in GEPP. However, highly focusing the updates on few voxels will not help speed up the overall convergence for all objects. Therefore, we further investigate the choice of g(·) using various patient CT scans.

The RMSD plots⁸ of NU-OS-SQS in Fig. 5(b) reached a limit-cycle after 1500 sec that did not approach zero. Averaging the sub-iterations at the final iteration improved the final image with small computation cost, yielding the drop in RMSD at the last 20th iteration in Fig. 5(b). The reduced noise was measurable in the reconstructed image as seen in Table II.

B. A shoulder region scan

In this experiment, we reconstructed a 512 × 512 × 109 image from a shoulder region scan 888 × 32 × 7146 sinogram with pitch 0.5. Fig. 6(a) and 6(b) show that the non-uniform approach accelerates convergence, depending on the choice of parameters in g(·). We investigated the relationship between the convergence rate and the DRA function g(·) by tuning both the parameters t and ε in (37). Fig. 6(a) shows that increasing t to 10 accelerated convergence, but larger t values did not help as the choice of t = 40 was slower than t = 10. In Fig. 6(b), decreasing ε to 0.01 accelerated the algorithm in this shoulder region scan, but not for the data set in Section V-C, so ε = 0.05 appears to be a reasonable choice overall.

We averaged the sub-iterations at the last iteration, but Fig. 6(a) and 6(b) did not show a drop at the final iteration (which appeared in Fig. 5(b)), because the algorithm had not yet reached a limit-cycle. Even though averaging did not noticeably decrease the RMSD, the reconstructed image had measurable noise reduction in regions that already reached a limit-cycle like uniform regions. (Results not shown.)

In Fig. 7(a), we illustrate that statistical image reconstruction can reduce noise and preserve image features compared to analytical FBP reconstruction. The reconstructed images of (NU-)OS-SQS show that NU approach helps OS-SQS to approach the converged image faster than the ordinary method. After the same computation (95 min.), the reconstructed image of OS-SQS still contains streaks from the initial FBP image, while NU-OS-SQS has reduced the streaks. This is apparent in the difference images between the reconstructed and converged images in Fig. 7(b).

By analyzing NU-OS-SQS in two CT scans, we observed that the parameters t = 10 and ε = 0.05 consistently accelerated the algorithm by about a factor of more than two.⁹ (The choice ε = 0.01 was too aggressive in our other experiments.) We also have observed more than two-fold accelerations in other experiments. (Results not shown.) Fig. 6(b) shows the RMSD plot using the (practically unavailable) oracle update-needed factor ${\hat{u}}_{j}^{(n)} ≜ ∣ x_{j}^{(n)} - x_{j}^{(\infty)} ∣$ instead of our heuristic choice ${\tilde{u}}_{j}^{(n)}$ . This result suggests that additional optimization of the DRA method and initialization of ${\tilde{u}}_{j}^{(0)}$ could further speed up the NU algorithm in future work.

C. A truncated abdomen scan

We also reconstructed a 390 × 390 × 239 image from a 888 × 64 × 3516 sinogram with pitch 1.0. This scan contains transaxial truncation and the initial FBP image has truncation artifacts [49] that can be reduced by iterative reconstruction. The choice of $u_{j}^{(0)}$ described in Section III-E did not consider truncation effects, and we found that NU-OS-SQS did not reduce such artifacts faster than standard OS-SQS. (The large patient size may also have reduced the possible speed-up by the NU method, compared to the previous two scans.) Therefore, we investigated an alternative NU method that can reduce truncation artifacts faster than standard algorithm.

We designed a modified NU method using a few (m_sub) sub-iterations of standard OS-SQS to generate the initial update-needed factor $u_{j}^{(0)}$ , which may also be a reasonable approach for other scans. We perform initial sub-iterations $x_{sub}^{(m / M)}$ in (39) efficiently using two-input projectors (in Section III-F) and replacing the all-view denominator ${\tilde{d}}_{j}^{L, (n)}$ in (29) by a standard subset-based denominator [25]:

{\tilde{d}}_{j, sub}^{L_{m}, (\frac{m}{M})} ≜ γ \sum_{i \in S_{m}} {\overset{⌣}{c}}_{i}^{(\frac{m}{M})} a_{i j} (\sum_{l = 1}^{N_{p}} a_{i l}),

(48)

where S_m consists of projection views in mth subset. The scaling factor γ_j in (42) is unavailable at this point, so we use γ = M instead. After m_sub sub-iterations, we compute the following initial update-needed factors:

{\tilde{u}}_{j}^{(0)} ≜ f_{sub}^{(\frac{m_{sub}}{M})} (| x_{j, sub}^{(\frac{m_{sub}}{M})} - x_{j}^{(0)} |),

(49)

where $f_{sub}^{(\frac{m_{sub}}{M})} (\cdot)$ is a DRA function in (36), and we use these to compute the NU denominators ${\tilde{d}}_{j}^{L, (0)}$ and ${\tilde{d}}_{j}^{R, (0)}$ that we use for first n_loop outer iterations.

Fig. 8(a) shows that statistical image reconstruction provides better image quality than FBP reconstruction. Fig. 8(b) illustrates that this NUsub-OS-SQS approach reduces the truncation artifacts faster than the standard OS-SQS and NU-OS-SQS. Although standard OS-SQS reduces noise faster than other two algorithms in Fig. 8(b), both NU-OS-SQS and NUsub-OS-SQS show better convergence near the spine, the boundary of patient, and other internal structures than OS-SQS at the same computation time (90 min.).

VI. Conclusion

This paper has presented a spatially non-uniform SQS algorithm that can efficiently minimize both PL and PWLS problems monotonically. The experimental results show that the proposed NU-SQS approach converged more than twice as fast as SQS. The OS algorithm, further applied to SQS method for acceleration, was enhanced to handle non-uniformly sampled geometries such as helical CT. The improvements showed promising results on large 3D helical CT data sets.

The key of the NU-SQS approach is designing “update-needed” factors $u_{j}^{(n)}$ in (27) that encourage larger step sizes for voxels that are predicted to need larger changes to reach the final image. Further optimization of these factors, e.g., by improving the initialization of ${\tilde{u}}_{j}^{(0)}$ and the DRA function in (36), should lead to further acceleration and stability of the proposed NU-SQS and NU-OS-SQS methods.

Supplementary Material

supplement

NIHMS502911-supplement-supplement.pdf^{(281.4KB, pdf)}

Acknowledgments

The first author would like to thank Sathish Ramani for suggesting Lemma 1. The authors would like to thank anonymous reviewers for their valuable comments. J. A. Fessler acknowledges helpful discussions with Johan Nuyts about the denominators for OS methods.

This work was supported in part by GE Healthcare, the National Institutes of Health under Grant R01-HL-098686, and equipment donations from Intel.

Appendix A. Proof of Lemma 2

The proposed choice ${\tilde{λ}}_{i j}^{(n)} = a_{i j} u_{j}^{(n)}$ in (28) and its corresponding ${\tilde{d}}_{j}^{L, (n)}$ in (29) are a choice that minimizes $\sum_{j = 1}^{N_{p}} {(u_{j}^{(n)})}^{2} d_{j}^{L, (n)}$ among all possible $d_{j}^{L, (n)}$ in (19), i.e.,

{{\tilde{λ}}_{i j}^{(n)}} = arg min_{{λ_{i j}^{(n)}}} \sum_{j = 1}^{N_{p}} {(u_{j}^{(n)})}^{2} (\sum_{\begin{matrix} i = 1 \\ a_{i j} \neq 0 \end{matrix}}^{N_{d}} {\overset{⌣}{c}}_{i}^{(n)} \frac{a_{i j}^{2}}{λ_{i j}^{(n)}} \sum_{\begin{matrix} l = 1 \\ a_{i l} \neq 0 \end{matrix}}^{N_{p}} λ_{i l}^{(n)}),

subject to the positivity constraint on $λ_{i j}^{(n)}$ if a_ij ≠ 0.

Proof

By the Schwarz inequality 〈s, t〉² ≤ ||s||²||t||², we have

{(\sum_{\begin{matrix} j = 1 \\ a_{i j} \neq 0 \end{matrix}}^{N_{p}} a_{i j} u_{j}^{(n)})}^{2} \leq \sum_{\begin{matrix} j = 1 \\ a_{i j} \neq 0 \end{matrix}}^{N_{p}} {(u_{j}^{(n)})}^{2} \frac{a_{i j}^{2}}{λ_{i j}^{(n)}} \sum_{\begin{matrix} l = 1 \\ a_{i l} \neq 0 \end{matrix}}^{N_{p}} λ_{i l}^{(n)},

where $s_{j} = \sqrt{\frac{λ_{i j}^{(n)}}{\sum_{l = 1}^{N_{p}} λ_{i l}^{(n)}}}$ and $t_{j} = a_{i j} u_{j}^{(n)} \sqrt{\frac{\sum_{l = 1}^{N_{p}} λ_{i l}^{(n)}}{λ_{i j}^{(n)}}}$ . Then,

\begin{array}{l} \sum_{j = 1}^{N_{p}} {(u_{j}^{(n)})}^{2} d_{j}^{L, (n)} = \sum_{i = 1}^{N_{d}} {\overset{⌣}{c}}_{i}^{(n)} (\sum_{\begin{matrix} j = 1 \\ a_{i j} \neq 0 \end{matrix}}^{N_{p}} {(u_{j}^{(n)})}^{2} \frac{a_{i j}^{2}}{λ_{i j}^{(n)}} \sum_{\begin{matrix} l = 1 \\ a_{i l} \neq 0 \end{matrix}}^{N_{p}} λ_{i l}^{(n)}) \\ \geq \sum_{i = 1}^{N_{d}} {\overset{⌣}{c}}_{i}^{(n)} {(\sum_{\begin{matrix} j = 1 \\ a_{i j} \neq 0 \end{matrix}}^{N_{p}} a_{i j} u_{j}^{(n)})}^{2} \\ = \sum_{j = 1}^{N_{p}} (\sum_{\begin{matrix} i = 1 \\ a_{i j} \neq 0 \end{matrix}}^{N_{d}} {\overset{⌣}{c}}_{i}^{(n)} a_{i j} u_{j}^{(n)} \sum_{\begin{matrix} l = 1 \\ a_{i l} \neq 0 \end{matrix}}^{N_{p}} a_{i l} u_{l}^{(n)}) \\ = \sum_{j = 1}^{N_{p}} {(u_{j}^{(n)})}^{2} (\frac{1}{u_{j}^{(n)}} \sum_{\begin{matrix} i = 1 \\ a_{i j} \neq 0 \end{matrix}}^{N_{d}} {\overset{⌣}{c}}_{i}^{(n)} a_{i j} \sum_{\begin{matrix} l = 1 \\ a_{i l} \neq 0 \end{matrix}}^{N_{p}} a_{i l} u_{l}^{(n)}) \\ = \sum_{j = 1}^{N_{p}} {(u_{j}^{(n)})}^{2} {\tilde{d}}_{j}^{L, (n)} . \end{array}

Appendix B. Outline of the proposed NU-OS-SQS algorithm]

Set M, n_end, n_loop and initialize x by an FBP image.

Generate u_j from an FBP image by edge and intensity detectors.

Compute the maximum curvature č_i = max {ḧ_i(0), η}.

d_{j}^{L} = 0

, γ_j = 0, x_j,_ref = x_j, and the final image x̄;_j = 0.

{\tilde{d}}_{j}^{R} = \frac{1}{u_{j}} \sum_{k = 1}^{N_{r}} {\ddot{ψ}}_{k} (0) ∣ c_{k j} ∣ (\sum_{l = 1}^{N_{p}} ∣ c_{k l} ∣ u_{l})

(50)

n = 0

for m = 0, 1, …, M − 1

\begin{array}{l} d_{j, sub}^{L} = \frac{1}{u_{j}} \sum_{i \in S_{m}} {\overset{⌣}{c}}_{i} a_{i j} (\sum_{l = 1}^{N_{p}} a_{i l} u_{l}) \\ d_{j}^{L} + = d_{j, sub}^{L} and γ_{j} + = I_{{d_{j, sub}^{L} > 0}} \end{array}

(51)

end

for n = 1, 2, …, n_end − 1

if n mod n_loop = 1 and n ≤ n_fix

{\tilde{d}}_{j}^{L} = d_{j}^{L}, and compute {\tilde{d}}_{j}^{R} by (50)

elseif n mod n_loop = n_loop − 1 and n ≤ n_fix − 2

x_{j, ref} = x_{j}

elseif n mod n_loop = 0 and n ≤ n_fix − 1

d_{j}^{L} = 0, and u_{j} = g (F_{cdf} (x_{j} - x_{j, ref}))

end

for m = 0, 1, …, M − 1

x_{j, prev} = x_{j}

if n mod n_loop ≠ 0 or n ≥ n_fix

g_{j, sub}^{L} = \frac{\partial}{\partial x_{j}} L_{m} (x_{prev})

(52)

else

compute both

d_{j, sub}^{L}

by (51) and

g_{j, sub}^{L}

by (52)

simultaneously using two-input projection function, and

d_{j}^{L} + = d_{j, sub}^{L}

end

x_{j} = {[x_{j, prev} - \frac{γ_{j} g_{j, sub}^{L} + \frac{\partial}{\partial x_{j}} R (x_{prev})}{{\tilde{d}}_{j}^{L} + {\tilde{d}}_{j}^{R}}]}_{+}

(53)

if n = n_end − 1

{\bar{x}}_{j} = \frac{m}{m + 1} {\bar{x}}_{j} + \frac{1}{m + 1} x_{j}

end

Open in a new tab

Footnotes

Each row of C consists of a permutation of (1, −1, 0, …, 0) ∈ IR^N_p where the indices of the nonzero entries 1 and −1 corresponds to adjacent voxel locations in 3D image space.

If $D_{s}^{- 1} ≽ D_{l}^{- 1} ≽ H^{(\infty)} ≽ 0$ , then $ρ (I - D_{s}^{- 1} H^{(\infty)}) \leq ρ (I - D_{l}^{- 1} H^{(\infty)}) < 1$ .

We consider the maximum curvature (10) here for computational efficiency in OS methods.

⁴

$RMSD ≜ {∣ ∣ x_{ROI}^{(n)} - x_{ROI}^{(\propto)} ∣ ∣}_{2} / \sqrt{N_{p, ROI}} [HU]$ , where N_p,_ROI is the number of voxels in the ROI.

⁵

The supplementary material is available at http://ieeexplore.ieee.org.

⁶

The gradient $\dot{ψ} (t) = t \frac{1 + a ∣ t / δ ∣}{1 + b ∣ t / δ ∣}$ avoids expensive power operations, saving computation for OS-type methods. The function reduces to a Fair potential function in [47] when a = 0 and b = 1. We used a = 0.0558, b = 1.6395, and δ = 10 in our experiments.

⁷

We ran 100 iterations of OS-SQS algorithm with 41 subsets, followed by each 100 iterations of OS-SQS algorithm with 4 subsets, and 2000 iterations of (convergent) SQS. We subsequently performed 100 iterations of (convergent) NH-ABCD-SQS [21] to generate (almost) converged images x^(∞).

⁸

We also provide the plots of the cost function for GEPP and shoulder region scan in the supplementary material.

⁹

We used the run time and RMSD of standard OS-SQS after 20 iterations (without averaging) as a reference to compare with the NU-OS-SQS for each data set. Then we compared the run time of NU-OS-SQS that is required for achieving the reference RMSD with the reference run time, and confirmed that NU provided more than two-fold accelerations in two CT scans.

This paper has supplementary downloadable materials available at http://iee-explore.ieee.org, provided by the author.

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Contributor Information

Donghwan Kim, Email: kimdongh@umich.edu, Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48105 USA.

Debashish Pal, Email: debashish.pal@ge.com, GE Healthcare Technologies, 3000 N Grandview Blvd, W-1180, Waukesha, WI 53188 USA.

Jean-Baptiste Thibault, Email: jean-baptiste.thibault@med.ge.com, GE Healthcare Technologies, 3000 N Grandview Blvd, W-1180, Waukesha, WI 53188 USA.

Jeffrey A. Fessler, Email: fessler@umich.edu, Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48105 USA.

References

1.Erdogan H, Fessler JA. Monotonic algorithms for transmission tomography. IEEE Trans Med Imag. 1999 Sep;18(9):801–14. doi: 10.1109/42.802758. [DOI] [PubMed] [Google Scholar]
2.Elbakri IA, Fessler JA. Statistical image reconstruction for polyenergetic X-ray computed tomography. IEEE Trans Med Imag. 2002 Feb;21(2):89–99. doi: 10.1109/42.993128. [DOI] [PubMed] [Google Scholar]
3.Fessler JA. Statistical image reconstruction methods for transmission tomography. In: Sonka M, Michael Fitzpatrick J, editors. Handbook of Medical Imaging, Volume 2. Medical Image Processing and Analysis. SPIE; Bellingham: 2000. pp. 1–70. [Google Scholar]
4.Thibault JB, Sauer K, Bouman C, Hsieh J. A three-dimensional statistical approach to improved image quality for multi-slice helical CT. Med Phys. 2007 Nov;34(11):4526–44. doi: 10.1118/1.2789499. [DOI] [PubMed] [Google Scholar]
5.Thibault J-B, Bouman CA, Sauer KD, Hsieh J. A recursive filter for noise reduction in statistical iterative tomographic imaging. Proc SPIE 6065 Computational Imaging IV. 2006:60650X. [Google Scholar]
6.Sauer K, Bouman C. A local update strategy for iterative reconstruction from projections. IEEE Trans Sig Proc. 1993 Feb;41(2):534–48. [Google Scholar]
7.Yu Z, Thibault JB, Bouman CA, Sauer KD, Hsieh J. Fast model-based X-ray CT reconstruction using spatially non-homogeneous ICD optimization. IEEE Trans Im Proc. 2011 Jan;20(1):161–75. doi: 10.1109/TIP.2010.2058811. [DOI] [PubMed] [Google Scholar]
8.Golub GH, Van Loan CF. Matrix computations. 2. Johns Hopkins Univ. Press; 1989. [Google Scholar]
9.Fessler JA, Kim D. Axial block coordinate descent (ABCD) algorithm for X-ray CT image reconstruction. Proc Intl Mtg on Fully 3D Image Recon in Rad and Nuc Med. 2011:262–5. [Google Scholar]
10.Benson TM, Man BKBD, Fu L, Thibault J-B. Block-based iterative coordinate descent. Proc IEEE Nuc Sci Symp Med Im Conf. 2010:2856–9. [Google Scholar]
11.Fessler JA, Ficaro EP, Clinthorne NH, Lange K. Grouped-coordinate ascent algorithms for penalized-likelihood transmission image reconstruction. IEEE Trans Med Imag. 1997 Apr;16(2):166–75. doi: 10.1109/42.563662. [DOI] [PubMed] [Google Scholar]
12.De Man B, Basu S, Thibault J-B, Hsieh J, Fessler JA, Bouman C, Sauer K. A study of different minimization approaches for iterative reconstruction in X-ray CT. Proc IEEE Nuc Sci Symp Med Im Conf. 2005;5:2708–10. [Google Scholar]
13.Hudson HM, Larkin RS. Accelerated image reconstruction using ordered subsets of projection data. IEEE Trans Med Imag. 1994 Dec;13(4):601–9. doi: 10.1109/42.363108. [DOI] [PubMed] [Google Scholar]
14.Erdogan H, Fessler JA. Ordered subsets algorithms for transmission tomography. Phys Med Biol. 1999 Nov;44(11):2835–51. doi: 10.1088/0031-9155/44/11/311. [DOI] [PubMed] [Google Scholar]
15.Nuyts J, De Man B, Dupont P, Defrise M, Suetens P, Mortelmans L. Iterative reconstruction for helical CT: A simulation study. Phys Med Biol. 1998 Apr;43(4):729–37. doi: 10.1088/0031-9155/43/4/003. [DOI] [PubMed] [Google Scholar]
16.Fessler JA, Booth SD. Conjugate-gradient preconditioning methods for shift-variant PET image reconstruction. IEEE Trans Im Proc. 1999 May;8(5):688–99. doi: 10.1109/83.760336. [DOI] [PubMed] [Google Scholar]
17.Goldstein T, Osher S. The split Bregman method for L1-regularized problems. SIAM J Imaging Sci. 2009;2(2):323–43. [Google Scholar]
18.Ramani S, Fessler JA. A splitting-based iterative algorithm for accelerated statistical X-ray CT reconstruction. IEEE Trans Med Imag. 2012 Mar;31(3):677–88. doi: 10.1109/TMI.2011.2175233. [DOI] [PMC free article] [PubMed] [Google Scholar]
19.Lange K, Hunter DR, Yang I. Optimization transfer using surrogate objective functions. J Computational and Graphical Stat. 2000 Mar;9(1):1–20. [Google Scholar]
20.Jacobson MW, Fessler JA. An expanded theoretical treatment of iteration-dependent majorize-minimize algorithms. IEEE Trans Im Proc. 2007 Oct;16(10):2411–22. doi: 10.1109/tip.2007.904387. [DOI] [PMC free article] [PubMed] [Google Scholar]
21.Kim D, Fessler JA. Parallelizable algorithms for X-ray CT image reconstruction with spatially non-uniform updates. Proc 2nd Intl Mtg on image formation in X-ray CT. 2012:33–6. [Google Scholar]
22.Bertsekas DP. A new class of incremental gradient methods for least squares problems. SIAM J Optim. 1997 Nov;7(4):913–26. [Google Scholar]
23.Nedic A, Bertsekas DP. Incremental subgradient methods for nondifferentiable optimization. SIAM J Optim. 2001;12(1):109–38. [Google Scholar]
24.Byrne CL. Block-iterative methods for image reconstruction from projections. IEEE Trans Im Proc. 1996 May;5(5):792–3. doi: 10.1109/83.499919. [DOI] [PubMed] [Google Scholar]
25.Ahn S, Fessler JA. Globally convergent image reconstruction for emission tomography using relaxed ordered subsets algorithms. IEEE Trans Med Imag. 2003 May;22(5):613–26. doi: 10.1109/TMI.2003.812251. [DOI] [PubMed] [Google Scholar]
26.Ahn S, Fessler JA, Blatt D, Hero AO. Convergent incremental optimization transfer algorithms: Application to tomography. IEEE Trans Med Imag. 2006 Mar;25(3):283–96. doi: 10.1109/TMI.2005.862740. [DOI] [PubMed] [Google Scholar]
27.Angelis GI, Reader AJ, Kotasidis FA, Lionheart WR, Matthews JC. The performance of monotonic and new non-monotonic gradient ascent reconstruction algorithms for high-resolution neuroreceptor PET imaging. Phys Med Biol. 2011 Jul;56(13):3895–917. doi: 10.1088/0031-9155/56/13/010. [DOI] [PubMed] [Google Scholar]
28.Defrise M, Noo F, Kudo H. A solution to the long-object problem in helical cone-beam tomography. Phys Med Biol. 2000 Mar;45(3):623–43. doi: 10.1088/0031-9155/45/3/305. [DOI] [PubMed] [Google Scholar]
29.Kim D, Pal D, Thibault J-B, Fessler JA. Improved ordered subsets algorithm for 3D X-ray CT image reconstruction. Proc 2nd Intl Mtg on image formation in X-ray CT. 2012:378–81. [Google Scholar]
30.Yavuz M, Fessler JA. Statistical image reconstruction methods for randoms-precorrected PET scans. Med Im Anal. 1998 Dec;2(4):369–78. doi: 10.1016/s1361-8415(98)80017-0. [DOI] [PubMed] [Google Scholar]
31.Ortega JM, Rheinboldt WC. Iterative solution of nonlinear equations in several variables. Academic; New York: 1970. [Google Scholar]
32.Huber PJ. Robust statistics. Wiley; New York: 1981. [Google Scholar]
33.Dempster AP, Laird NM, Rubin DB. Maximum likelihood from incomplete data via the EM algorithm. J Royal Stat Soc Ser B. 1977;39(1):1–38. [Google Scholar]
34.De Pierro AR. On the convergence of the iterative image space reconstruction algorithm for volume ECT. IEEE Trans Med Imag. 1987 Jun;6(2):174–5. doi: 10.1109/TMI.1987.4307819. [DOI] [PubMed] [Google Scholar]
35.De Pierro AR. On the relation between the ISRA and the EM algorithm for positron emission tomography. IEEE Trans Med Imag. 1993 Jun;12(2):328–33. doi: 10.1109/42.232263. [DOI] [PubMed] [Google Scholar]
36.De Pierro AR. A modified expectation maximization algorithm for penalized likelihood estimation in emission tomography. IEEE Trans Med Imag. 1995 Mar;14(1):132–7. doi: 10.1109/42.370409. [DOI] [PubMed] [Google Scholar]
37.Lange K, Fessler JA. Globally convergent algorithms for maximum a posteriori transmission tomography. IEEE Trans Im Proc. 1995 Oct;4(10):1430–8. doi: 10.1109/83.465107. [DOI] [PubMed] [Google Scholar]
38.Daubechies I, Defrise M, Mol CD. An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Comm Pure Appl Math. 2004 Nov;57(11):1413–57. [Google Scholar]
39.Beck A, Teboulle M. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J Imaging Sci. 2009;2(1):183–202. [Google Scholar]
40.Sotthivirat S, Fessler JA. Image recovery using partitioned-separable paraboloidal surrogate coordinate ascent algorithms. IEEE Trans Im Proc. 2002 Mar;11(3):306–17. doi: 10.1109/83.988963. [DOI] [PubMed] [Google Scholar]
41.Kim D, Fessler JA. Accelerated ordered-subsets algorithm based on separable quadratic surrogates for regularized image reconstruction in X-ray CT. Proc IEEE Intl Symp Biomed Imag. 2011:1134–7. [Google Scholar]
42.Fessler JA, Clinthorne NH, Rogers WL. On complete data spaces for PET reconstruction algorithms. IEEE Trans Nuc Sci. 1993 Aug;40(4):1055–61. [Google Scholar]
43.Long Y, Fessler JA, Balter JM. 3D forward and back-projection for X-ray CT using separable footprints. IEEE Trans Med Imag. 2010 Nov;29(11):1839–50. doi: 10.1109/TMI.2010.2050898. [DOI] [PMC free article] [PubMed] [Google Scholar]
44.Cho JH, Fessler JA. Accelerating ordered-subsets image reconstruction for X-ray CT using double surrogates. Proc SPIE 8313 Medical Imaging 2012: Phys Med Im. 2012:83131X. doi: 10.1109/TMI.2013.2266898. [DOI] [PMC free article] [PubMed] [Google Scholar]
45.Luo ZQ. On the convergence of the lms algorithm with adaptive learning rate for linear feedforward networks. Neural Computation. 1991 Jun;3(2):226–45. doi: 10.1162/neco.1991.3.2.226. [DOI] [PubMed] [Google Scholar]
46.Delaney AH, Bresler Y. Globally convergent edge-preserving regularized reconstruction: an application to limited-angle tomography. IEEE Trans Im Proc. 1998 Feb;7(2):204–21. doi: 10.1109/83.660997. [DOI] [PubMed] [Google Scholar]
47.Fair RC. On the robust estimation of econometric models. Ann Econ Social Measurement. 1974 Oct;2:667–77. [Google Scholar]
48.Fessler JA, Rogers WL. Spatial resolution properties of penalized-likelihood image reconstruction methods: Space-invariant tomographs. IEEE Trans Im Proc. 1996 Sep;5(9):1346–58. doi: 10.1109/83.535846. [DOI] [PubMed] [Google Scholar]
49.Yu L, Zou Y, Sidky EY, Pelizzari CA, Munro P, Pan X. Region of interest reconstruction from truncated data in circular cone-beam CT. IEEE Trans Med Imag. 2006 Jul;25(7):869–81. doi: 10.1109/tmi.2006.872329. [DOI] [PubMed] [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

supplement

NIHMS502911-supplement-supplement.pdf^{(281.4KB, pdf)}

[R1] 1.Erdogan H, Fessler JA. Monotonic algorithms for transmission tomography. IEEE Trans Med Imag. 1999 Sep;18(9):801–14. doi: 10.1109/42.802758. [DOI] [PubMed] [Google Scholar]

[R2] 2.Elbakri IA, Fessler JA. Statistical image reconstruction for polyenergetic X-ray computed tomography. IEEE Trans Med Imag. 2002 Feb;21(2):89–99. doi: 10.1109/42.993128. [DOI] [PubMed] [Google Scholar]

[R3] 3.Fessler JA. Statistical image reconstruction methods for transmission tomography. In: Sonka M, Michael Fitzpatrick J, editors. Handbook of Medical Imaging, Volume 2. Medical Image Processing and Analysis. SPIE; Bellingham: 2000. pp. 1–70. [Google Scholar]

[R4] 4.Thibault JB, Sauer K, Bouman C, Hsieh J. A three-dimensional statistical approach to improved image quality for multi-slice helical CT. Med Phys. 2007 Nov;34(11):4526–44. doi: 10.1118/1.2789499. [DOI] [PubMed] [Google Scholar]

[R5] 5.Thibault J-B, Bouman CA, Sauer KD, Hsieh J. A recursive filter for noise reduction in statistical iterative tomographic imaging. Proc SPIE 6065 Computational Imaging IV. 2006:60650X. [Google Scholar]

[R6] 6.Sauer K, Bouman C. A local update strategy for iterative reconstruction from projections. IEEE Trans Sig Proc. 1993 Feb;41(2):534–48. [Google Scholar]

[R7] 7.Yu Z, Thibault JB, Bouman CA, Sauer KD, Hsieh J. Fast model-based X-ray CT reconstruction using spatially non-homogeneous ICD optimization. IEEE Trans Im Proc. 2011 Jan;20(1):161–75. doi: 10.1109/TIP.2010.2058811. [DOI] [PubMed] [Google Scholar]

[R8] 8.Golub GH, Van Loan CF. Matrix computations. 2. Johns Hopkins Univ. Press; 1989. [Google Scholar]

[R9] 9.Fessler JA, Kim D. Axial block coordinate descent (ABCD) algorithm for X-ray CT image reconstruction. Proc Intl Mtg on Fully 3D Image Recon in Rad and Nuc Med. 2011:262–5. [Google Scholar]

[R10] 10.Benson TM, Man BKBD, Fu L, Thibault J-B. Block-based iterative coordinate descent. Proc IEEE Nuc Sci Symp Med Im Conf. 2010:2856–9. [Google Scholar]

[R11] 11.Fessler JA, Ficaro EP, Clinthorne NH, Lange K. Grouped-coordinate ascent algorithms for penalized-likelihood transmission image reconstruction. IEEE Trans Med Imag. 1997 Apr;16(2):166–75. doi: 10.1109/42.563662. [DOI] [PubMed] [Google Scholar]

[R12] 12.De Man B, Basu S, Thibault J-B, Hsieh J, Fessler JA, Bouman C, Sauer K. A study of different minimization approaches for iterative reconstruction in X-ray CT. Proc IEEE Nuc Sci Symp Med Im Conf. 2005;5:2708–10. [Google Scholar]

[R13] 13.Hudson HM, Larkin RS. Accelerated image reconstruction using ordered subsets of projection data. IEEE Trans Med Imag. 1994 Dec;13(4):601–9. doi: 10.1109/42.363108. [DOI] [PubMed] [Google Scholar]

[R14] 14.Erdogan H, Fessler JA. Ordered subsets algorithms for transmission tomography. Phys Med Biol. 1999 Nov;44(11):2835–51. doi: 10.1088/0031-9155/44/11/311. [DOI] [PubMed] [Google Scholar]

[R15] 15.Nuyts J, De Man B, Dupont P, Defrise M, Suetens P, Mortelmans L. Iterative reconstruction for helical CT: A simulation study. Phys Med Biol. 1998 Apr;43(4):729–37. doi: 10.1088/0031-9155/43/4/003. [DOI] [PubMed] [Google Scholar]

[R16] 16.Fessler JA, Booth SD. Conjugate-gradient preconditioning methods for shift-variant PET image reconstruction. IEEE Trans Im Proc. 1999 May;8(5):688–99. doi: 10.1109/83.760336. [DOI] [PubMed] [Google Scholar]

[R17] 17.Goldstein T, Osher S. The split Bregman method for L1-regularized problems. SIAM J Imaging Sci. 2009;2(2):323–43. [Google Scholar]

[R18] 18.Ramani S, Fessler JA. A splitting-based iterative algorithm for accelerated statistical X-ray CT reconstruction. IEEE Trans Med Imag. 2012 Mar;31(3):677–88. doi: 10.1109/TMI.2011.2175233. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R19] 19.Lange K, Hunter DR, Yang I. Optimization transfer using surrogate objective functions. J Computational and Graphical Stat. 2000 Mar;9(1):1–20. [Google Scholar]

[R20] 20.Jacobson MW, Fessler JA. An expanded theoretical treatment of iteration-dependent majorize-minimize algorithms. IEEE Trans Im Proc. 2007 Oct;16(10):2411–22. doi: 10.1109/tip.2007.904387. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R21] 21.Kim D, Fessler JA. Parallelizable algorithms for X-ray CT image reconstruction with spatially non-uniform updates. Proc 2nd Intl Mtg on image formation in X-ray CT. 2012:33–6. [Google Scholar]

[R22] 22.Bertsekas DP. A new class of incremental gradient methods for least squares problems. SIAM J Optim. 1997 Nov;7(4):913–26. [Google Scholar]

[R23] 23.Nedic A, Bertsekas DP. Incremental subgradient methods for nondifferentiable optimization. SIAM J Optim. 2001;12(1):109–38. [Google Scholar]

[R24] 24.Byrne CL. Block-iterative methods for image reconstruction from projections. IEEE Trans Im Proc. 1996 May;5(5):792–3. doi: 10.1109/83.499919. [DOI] [PubMed] [Google Scholar]

[R25] 25.Ahn S, Fessler JA. Globally convergent image reconstruction for emission tomography using relaxed ordered subsets algorithms. IEEE Trans Med Imag. 2003 May;22(5):613–26. doi: 10.1109/TMI.2003.812251. [DOI] [PubMed] [Google Scholar]

[R26] 26.Ahn S, Fessler JA, Blatt D, Hero AO. Convergent incremental optimization transfer algorithms: Application to tomography. IEEE Trans Med Imag. 2006 Mar;25(3):283–96. doi: 10.1109/TMI.2005.862740. [DOI] [PubMed] [Google Scholar]

[R27] 27.Angelis GI, Reader AJ, Kotasidis FA, Lionheart WR, Matthews JC. The performance of monotonic and new non-monotonic gradient ascent reconstruction algorithms for high-resolution neuroreceptor PET imaging. Phys Med Biol. 2011 Jul;56(13):3895–917. doi: 10.1088/0031-9155/56/13/010. [DOI] [PubMed] [Google Scholar]

[R28] 28.Defrise M, Noo F, Kudo H. A solution to the long-object problem in helical cone-beam tomography. Phys Med Biol. 2000 Mar;45(3):623–43. doi: 10.1088/0031-9155/45/3/305. [DOI] [PubMed] [Google Scholar]

[R29] 29.Kim D, Pal D, Thibault J-B, Fessler JA. Improved ordered subsets algorithm for 3D X-ray CT image reconstruction. Proc 2nd Intl Mtg on image formation in X-ray CT. 2012:378–81. [Google Scholar]

[R30] 30.Yavuz M, Fessler JA. Statistical image reconstruction methods for randoms-precorrected PET scans. Med Im Anal. 1998 Dec;2(4):369–78. doi: 10.1016/s1361-8415(98)80017-0. [DOI] [PubMed] [Google Scholar]

[R31] 31.Ortega JM, Rheinboldt WC. Iterative solution of nonlinear equations in several variables. Academic; New York: 1970. [Google Scholar]

[R32] 32.Huber PJ. Robust statistics. Wiley; New York: 1981. [Google Scholar]

[R33] 33.Dempster AP, Laird NM, Rubin DB. Maximum likelihood from incomplete data via the EM algorithm. J Royal Stat Soc Ser B. 1977;39(1):1–38. [Google Scholar]

[R34] 34.De Pierro AR. On the convergence of the iterative image space reconstruction algorithm for volume ECT. IEEE Trans Med Imag. 1987 Jun;6(2):174–5. doi: 10.1109/TMI.1987.4307819. [DOI] [PubMed] [Google Scholar]

[R35] 35.De Pierro AR. On the relation between the ISRA and the EM algorithm for positron emission tomography. IEEE Trans Med Imag. 1993 Jun;12(2):328–33. doi: 10.1109/42.232263. [DOI] [PubMed] [Google Scholar]

[R36] 36.De Pierro AR. A modified expectation maximization algorithm for penalized likelihood estimation in emission tomography. IEEE Trans Med Imag. 1995 Mar;14(1):132–7. doi: 10.1109/42.370409. [DOI] [PubMed] [Google Scholar]

[R37] 37.Lange K, Fessler JA. Globally convergent algorithms for maximum a posteriori transmission tomography. IEEE Trans Im Proc. 1995 Oct;4(10):1430–8. doi: 10.1109/83.465107. [DOI] [PubMed] [Google Scholar]

[R38] 38.Daubechies I, Defrise M, Mol CD. An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Comm Pure Appl Math. 2004 Nov;57(11):1413–57. [Google Scholar]

[R39] 39.Beck A, Teboulle M. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J Imaging Sci. 2009;2(1):183–202. [Google Scholar]

[R40] 40.Sotthivirat S, Fessler JA. Image recovery using partitioned-separable paraboloidal surrogate coordinate ascent algorithms. IEEE Trans Im Proc. 2002 Mar;11(3):306–17. doi: 10.1109/83.988963. [DOI] [PubMed] [Google Scholar]

[R41] 41.Kim D, Fessler JA. Accelerated ordered-subsets algorithm based on separable quadratic surrogates for regularized image reconstruction in X-ray CT. Proc IEEE Intl Symp Biomed Imag. 2011:1134–7. [Google Scholar]

[R42] 42.Fessler JA, Clinthorne NH, Rogers WL. On complete data spaces for PET reconstruction algorithms. IEEE Trans Nuc Sci. 1993 Aug;40(4):1055–61. [Google Scholar]

[R43] 43.Long Y, Fessler JA, Balter JM. 3D forward and back-projection for X-ray CT using separable footprints. IEEE Trans Med Imag. 2010 Nov;29(11):1839–50. doi: 10.1109/TMI.2010.2050898. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R44] 44.Cho JH, Fessler JA. Accelerating ordered-subsets image reconstruction for X-ray CT using double surrogates. Proc SPIE 8313 Medical Imaging 2012: Phys Med Im. 2012:83131X. doi: 10.1109/TMI.2013.2266898. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R45] 45.Luo ZQ. On the convergence of the lms algorithm with adaptive learning rate for linear feedforward networks. Neural Computation. 1991 Jun;3(2):226–45. doi: 10.1162/neco.1991.3.2.226. [DOI] [PubMed] [Google Scholar]

[R46] 46.Delaney AH, Bresler Y. Globally convergent edge-preserving regularized reconstruction: an application to limited-angle tomography. IEEE Trans Im Proc. 1998 Feb;7(2):204–21. doi: 10.1109/83.660997. [DOI] [PubMed] [Google Scholar]

[R47] 47.Fair RC. On the robust estimation of econometric models. Ann Econ Social Measurement. 1974 Oct;2:667–77. [Google Scholar]

[R48] 48.Fessler JA, Rogers WL. Spatial resolution properties of penalized-likelihood image reconstruction methods: Space-invariant tomographs. IEEE Trans Im Proc. 1996 Sep;5(9):1346–58. doi: 10.1109/83.535846. [DOI] [PubMed] [Google Scholar]

[R49] 49.Yu L, Zou Y, Sidky EY, Pelizzari CA, Munro P, Pan X. Region of interest reconstruction from truncated data in circular cone-beam CT. IEEE Trans Med Imag. 2006 Jul;25(7):869–81. doi: 10.1109/tmi.2006.872329. [DOI] [PubMed] [Google Scholar]

PERMALINK

Accelerating ordered subsets image reconstruction for X-ray CT using spatially non-uniform optimization transfer

Donghwan Kim

Debashish Pal

Jean-Baptiste Thibault

Jeffrey A Fessler

Roles

Abstract

I. Introduction

Fig. 1.

II. Statistical image reconstruction

A. Problem

B. Optimization transfer method

C. Separable quadratic surrogate (SQS) algorithm

D. Convergence rate of SQS algorithm

Lemma 1

Corollary 1

III. Spatially non-uniform separable quadratic surrogate (NU-SQS)

A. Update-needed factors

B. Design

Lemma 2

Proof

C. Dynamic range adjustment (DRA) of uj(n) In reality, (30) will not hold, so (27) will be suboptimal. We could try to improve (27) by finding a function f(n)(·): [δ(n), ∞) → [ε, 1] based on the following

Fig. 2.

D. Related work

E. Initialization of uj(0)

F. Implementation

TABLE I.

IV. Improved ordered subsets (OS) algorithm for helical CT

A. Ordinary OS algorithm

B. Proposed OS algorithm in helical CT

Fig. 3.

Fig. 4.

C. OS algorithm with averaging

TABLE II.

V. Experimental results

A. GE performance phantom (GEPP)

Fig. 5.

B. A shoulder region scan

Fig. 6.

Fig. 7.

C. A truncated abdomen scan

Fig. 8.

VI. Conclusion

Supplementary Material

Acknowledgments

Appendix A. Proof of Lemma 2

Proof

Appendix B. Outline of the proposed NU-OS-SQS algorithm]

Footnotes

Contributor Information

References

Associated Data

Supplementary Materials

ACTIONS

PERMALINK

RESOURCES

Similar articles

Cited by other articles

Links to NCBI Databases

C. Dynamic range adjustment (DRA) of $u_{j}^{(n)}$ In reality, (30) will not hold, so (27) will be suboptimal. We could try to improve (27) by finding a function f⁽ⁿ⁾(·): [δ⁽ⁿ⁾, ∞) → [ε, 1] based on the following

E. Initialization of $u_{j}^{(0)}$