A Fast Convergent Ordered-Subsets Algorithm with Subiteration-Dependent Preconditioners for PET Image Reconstruction

Jianfeng Guo; C Ross Schmidtlein; Andrzej Krol; Si Li; Yizun Lin; Sangtae Ahn; Charles Stearns; Yuesheng Xu

doi:10.1109/TMI.2022.3181813

Author manuscript; available in PMC: 2023 Nov 1.

Published in final edited form as: IEEE Trans Med Imaging. 2022 Oct 27;41(11):3289–3300. doi: 10.1109/TMI.2022.3181813

PMCID: PMC9810102 NIHMSID: NIHMS1846006 PMID: 35679379

Abstract

We investigated the imaging performance of a fast convergent ordered-subsets algorithm with subiteration-dependent preconditioners (SDPs) for positron emission tomography (PET) image reconstruction. In particular, we considered the use of SDP with the block sequential regularized expectation maximization (BSREM) approach with the relative difference prior (RDP) regularizer due to its prior clinical adoption by vendors. Because the RDP regularization promotes smoothness in the reconstructed image, the directions of the gradients in smooth areas more accurately point toward the objective function’s minimizer than those in variable areas. Motivated by this observation, two SDPs have been designed to increase iteration step-sizes in the smooth areas and reduce iteration step-sizes in the variable areas relative to a conventional expectation maximization preconditioner. The momentum technique used for convergence acceleration can be viewed as a special case of SDP. We have proved the global convergence of SDP-BSREM algorithms by assuming certain characteristics of the preconditioner. By means of numerical experiments using both simulated and clinical PET data, we have shown that the SDP-BSREM algorithms substantially improve the convergence rate, as compared to conventional BSREM and a vendor’s implementation, Q.Clear. Specifically, SDP-BSREM algorithms converge 35% to 50% faster in reaching the same objective function value than conventional BSREM and commercial Q.Clear algorithms. Moreover, we showed in phantoms with hot, cold and background regions that the SDP-BSREM algorithms approached the values of a highly converged reference image faster than conventional BSREM and commercial Q.Clear algorithms.

Index Terms: Image reconstruction, ordered-subsets, positron emission tomography, preconditioner, relative difference prior

I. Introduction

Positron emission tomography (PET) data are inherently count limited due to health considerations, basic physical processes, and patient tolerance. Moreover, these data must be reconstructed into images within a few minutes of acquisition. This creates a challenging situation in which vendors strive to produce high quality images in a clinically viable time frame. In this study, we introduce a method for accelerating the reconstruction of high quality PET images.

Over the last 20 years, non-penalized maximum-likelihood (ML) statistical approaches have become a preferred model for PET reconstruction [1], [2]. However, when iterated to full convergence, ML methods produce extremely noisy images and are sensitive to small statistical perturbations in the data. Hence, these methods are seldom run to full convergence; iterations are stopped before the fitted noise becomes unacceptable, at the expense of excessive blur in the reconstructed images. It has been demonstrated that penalized likelihood (PL) models, which include a data fidelity term (Kullback-Leibler divergence) and a regularization term, lead to improved quantification and better noise suppression compared to non-penalized reconstructions [3]. To reduce the computational expense, ordered-subsets expectation maximization (OSEM) algorithms proposed by Hudson and Larkin are widely used in unregularized PET image reconstruction [4]. However, OSEM is unsuitable for regularized image reconstruction, which led to the development of relaxation [5], [6] and the block sequential regularized expectation maximization (BSREM) algorithm [7].

Initially, quadratic penalties were explored [8], [9], but they resulted in over-smoothed edges and loss of detail in the reconstructed images. Later, a number of other penalties were developed to address these problems, but they often had undesirable properties, including being non-smooth [10], [11], being non-convex [12], or requiring additional hyper-parameters [13]. An example of such an approach is the total variation penalty, which is able to preserve sharp boundaries between low-variability regions [10], [14]. Thus, the ability to deal with non-smooth priors became an urgent issue. However, only a few reconstruction algorithms have been able to combine the Poisson noise model and non-smooth priors [15]–[17].

In an alternative approach, Nuyts et al. [18] introduced the relative difference prior (RDP) that preserved high spatial frequencies in reconstructed images while still being smooth and convex. This RDP was adopted by General Electric (GE) Healthcare as the penalty term in their PL PET reconstruction model. The penalty term is controlled by a single user-defined parameter called beta. GE Healthcare introduced a modified BSREM algorithm [8] to solve the PL model in their commercial clinical software, called Q.Clear, that is currently available on GE PET/CT scanners [3]. Other interesting methods, suitable for optimization with smooth penalties, include the optimization transfer descent algorithm (OTDA) [19], [20] and the preconditioned limited-memory Broyden-Fletcher-Goldfarb-Shanno with boundary constraints algorithm (L-BFGS-B-PC) [9]. While these algorithms converge very rapidly, they represent a substantial departure from the BSREM algorithm, which complicates their implementation.

The choice of preconditioners in the algorithm is well known to strongly affect the convergence rate [9], [16], [21], [22]. The widely used preconditioners have been designed based on the EM matrix [16], [21]–[23] or the Hessian matrix [9]. As part of the BSREM convergence proof, Ahn and Fessler [8] presented the subiteration-independent preconditioner, which can be viewed as a uniform operator of the image for all subiterations. However, a subiteration-independent preconditioner is overly restrictive and may result in a slower convergence rate. We believe that a well-designed subiteration-dependent preconditioner (SDP) will accelerate the algorithm convergence.

In the present study, we propose a subiteration-dependent preconditioned BSREM (SDP-BSREM) for the RDP regularized PET image reconstruction. We prove that it is convergent under certain assumptions imposed on the preconditioner. According to the smoothness-promoting property of the RDP regularization in the reconstructed image, the directions of the gradients in smooth areas more accurately point toward the objective function’s minimizer than those in variable areas. Inspired by this observation, we propose two SDPs satisfying the assumptions needed for the convergence proof. We note that the momentum technique is a special case of SDP. We have used the numerical gradient of the image to measure its smoothness. These two SDPs achieve larger step-sizes in the smooth areas of the image and smaller step-sizes in the variable areas of the image. The proposed algorithms have been compared with BSREM for simulations and with the Q.Clear method with data acquired from a GE PET/CT. In simulations, two numerical phantoms were used. In the clinical comparisons, data from a whole-body PET patient and an American College of Radiology (ACR) quality assurance phantom (Esser Flangeless PET phantom) were used both with and without time-of-flight data.

This paper is organized in five sections. In section II, we first describe the RDP regularized PET image reconstruction model and the modified BSREM algorithm and then develop our new SDP-BSREM algorithm. In section III, proofs for convergence of SDP-BSREM are provided with and without an interior assumption, and four SDPs satisfying the convergence conditions are presented. Section IV provides comprehensive comparisons, on simulated and clinical data, of our proposed SDP-BSREM methods against BSREM and Q.Clear. The conclusions are presented in section V. Two appendices with additional details are also provided.

II. Methodology

In this section, we develop the SDP-BSREM algorithm for solving the RDP regularized PET image reconstruction model.

A. RDP Regularized PET Image Reconstruction Model

We denote by $ℝ_{+}$ the set of all nonnegative real numbers, by $ℝ_{+ +}$ the set of all positive real numbers, by $ℕ$ the set of positive integers, and by $ℕ_{0} : = ℕ \cup \{0\}$ . For $p, q \in ℕ$ , we let $A \in ℝ_{+}^{p \times q}$ denote the PET system matrix whose entries are the probability of detection of the positron annihilation gamma photon pairs emitted from a particular voxel containing PET radiotracer, and let $γ \in ℝ_{+ +}^{p}$ denote the mean value of the background events produced by random and scatter coincidences. The relation of the radiotracer distribution $f \in ℝ_{+}^{q}$ within a patient with the projection data $g \in ℝ_{+}^{p}$ acquired by a PET scanner is described by the Poisson model

g = Poisson (Af + γ),

(1)

where Poisson(x) denotes a Poisson-distributed random vector with mean x.

Model (1) may be solved by minimizing the fidelity term

F (f) : = 〈Af, 1_{p}〉 - 〈 \ln (Af + γ), g 〉,

(2)

where $1_{p} \in ℝ^{p}$ denotes the vector with all components 1, ln $x : = {[\ln x_{1}, \ln x_{2}, \dots, \ln x_{n}]}^{⊤}$ is the logarithmic function at a vector $x \in ℝ_{+ +}^{n}$ and x_i is the i-th component of x, and $〈 x, y 〉 : = \sum_{i = 1}^{n} x_{i} y_{i}$ denotes the inner product of $x, y \in ℝ^{n}$ . It is well-known that model (2) is ill-posed [24] and its solutions may result in over-fitting in reconstructed images. Regularization is often used to avoid the over-fitting problem. A commonly used regularized PET image reconstruction model has the following form:

\underset{f \in ℝ_{+}^{q}}{\arg \min} Φ (f),

(3)

where

Φ (f) : = F (f) + β R (f),

(4)

with $β \in ℝ_{+}$ being the regularization parameter and R(f) representing the regularization term. In this study, we will consider the RDP [18] regularization term that is given by

R (f) : = \sum_{j = 1}^{q} \sum_{k \in N_{j}} \frac{{(f_{j} - f_{k})}^{2}}{(f_{j} + f_{k}) + γ_{R} |f_{j} - f_{k}| + ϵ},

(5)

where $γ_{R} \in ℝ_{+}$ controls the degree of edge preservation, and N_j is the neighborhood of pixel j.

In model (5), a small constant ϵ > 0 is added to the denominator to avoid singularities when both f_j and f_k are equal to zero. By its definition, RDP is a function of both differences of neighboring pixels and their sums. The inclusion of the sums term makes RDP differ from conventional regularization terms and causes the regularizer to be activity-dependent. We note that the function Φ(f) is twice differentiable since both F (f) and R(f) are twice differentiable [16], [18]. The inclusion of a small constant ϵ in the denominator of RDP provides the objective function Φ with two useful properties (the proofs are provided in appendix A): (i) it is strictly convex under an assumption that $A^{⊤} g$ is a nonzero vector; and (ii) it has a Lipschitz continuous gradient on $ℝ_{+}^{q}$ .
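As an illustration of definitions (2), (4) and (5), the following minimal sketch evaluates the objective on a toy problem. The function names, the dense list-of-lists system matrix, the two-pixel neighborhood and all numerical values are assumptions made for this example only and are not taken from the experiments in this paper.

```python
import math

def fidelity(A, f, g, gamma):
    """F(f) = <Af, 1> - <ln(Af + gamma), g>, as in (2)."""
    total = 0.0
    for row, g_i, gam_i in zip(A, g, gamma):
        proj = sum(a * x for a, x in zip(row, f))  # (Af)_i
        total += proj - g_i * math.log(proj + gam_i)
    return total

def rdp(f, neighbors, gamma_r=2.0, eps=1e-12):
    """R(f) from (5); neighbors[j] lists the pixel indices in N_j."""
    total = 0.0
    for j, nbrs in enumerate(neighbors):
        for k in nbrs:
            diff = f[j] - f[k]
            total += diff * diff / (f[j] + f[k] + gamma_r * abs(diff) + eps)
    return total

def objective(A, f, g, gamma, neighbors, beta, gamma_r=2.0, eps=1e-12):
    """Phi(f) = F(f) + beta * R(f), as in (4)."""
    return fidelity(A, f, g, gamma) + beta * rdp(f, neighbors, gamma_r, eps)

# Hypothetical toy problem: 3 detector bins, 2 pixels.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
g = [5.0, 3.0, 8.0]
gamma = [0.1, 0.1, 0.1]
neighbors = [[1], [0]]  # each pixel's only neighbor is the other pixel

flat = [2.0, 2.0]   # no difference between neighbors
edgy = [3.0, 1.0]   # same sum, but a step between neighbors
print(rdp(flat, neighbors), rdp(edgy, neighbors))
print(objective(A, edgy, g, gamma, neighbors, beta=0.1))
```

The penalty vanishes on the flat image and is positive on the image with a step, and the constant ϵ keeps the quotient finite even when both neighboring pixels are zero.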

B. Modified BSREM Algorithm

A modified BSREM [8] was adopted by GE Healthcare as the optimizer in the Q.Clear method [3] for solving the model (3). Here we describe and review the modified BSREM.

Minimization problem (3) is often solved by the gradient descent method. However, computing the whole gradient ∇Φ is computationally expensive. To alleviate this issue, the ordered-subsets (OS) algorithm was developed to accelerate its convergence [4]–[7]. For $n \in ℕ$ , let $ℕ_{n} : = \{1, 2, \dots, n\}$ . For $M \in ℕ$ , let $I : = \{I_{i} : i \in ℕ_{M}\}$ be a collection of disjoint subsets of $ℕ_{p}$ such that $\cup_{i = 1}^{M} I_{i} = ℕ_{p}$ . The partition $I$ is chosen as in [8]. According to the partition $I$ , we partition the system matrix A into M row sub-matrices A_i, and g and γ into M sub-vectors g_i and γ_i, respectively, for $i \in ℕ_{M}$ . We use |Ω| to denote the cardinality of set Ω. For $i \in ℕ_{M}$ , we define

Φ_{i} (f) : = 〈A_{i} f, 1_{|I_{i}|}〉 - 〈\ln (A_{i} f + γ_{i}), g_{i}〉 + \frac{β}{M} R (f) .

(6)

It follows that $Φ (f) = \sum_{i = 1}^{M} Φ_{i} (f)$ . An OS algorithm computes only one ∇Φ_i at each subiteration step.
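The identity $Φ (f) = \sum_{i = 1}^{M} Φ_{i} (f)$ can be checked numerically on a small example. The sketch below is illustrative only: the helper names, the interleaved partition and the data are assumptions, and a simple quadratic placeholder stands in for the regularizer R, since the identity does not depend on its form.

```python
import math

def subset_objective(A_i, g_i, gamma_i, f, beta, M, R):
    """Phi_i(f) from (6): subset fidelity plus a 1/M share of beta * R(f)."""
    value = 0.0
    for row, g_l, gam_l in zip(A_i, g_i, gamma_i):
        proj = sum(a * x for a, x in zip(row, f))
        value += proj - g_l * math.log(proj + gam_l)
    return value + beta / M * R(f)

def split(rows, subsets):
    """Pick the entries of `rows` indexed by each subset I_i."""
    return [[rows[l] for l in idx] for idx in subsets]

# Hypothetical toy data: p = 4 bins, q = 2 pixels, M = 2 interleaved subsets.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
g = [5.0, 3.0, 8.0, 12.0]
gamma = [0.1, 0.2, 0.1, 0.3]
subsets = [[0, 2], [1, 3]]            # disjoint, and their union covers all rows
R = lambda f: (f[0] - f[1]) ** 2      # placeholder regularizer, not the RDP
f, beta, M = [2.0, 1.5], 0.1, len(subsets)

parts = [subset_objective(A_i, g_i, gam_i, f, beta, M, R)
         for A_i, g_i, gam_i in zip(split(A, subsets),
                                    split(g, subsets),
                                    split(gamma, subsets))]
full = subset_objective(A, g, gamma, f, beta, 1, R)  # M = 1 recovers Phi
print(sum(parts), full)
```

The two printed values agree up to floating-point rounding, because every row of A appears in exactly one subset and each subset carries a 1/M share of the penalty.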

A subiteration-independent preconditioned OS algorithm was proposed in [8]. The preconditioner is designed by using an upper bound of the solution set of minimization problem (3). It was proved in [8] that for any projection data g, there exists a constant U > 0 such that the solution set $S^{*}$ of minimization problem (3) is contained in the bounded set

B : = \{f : f \in ℝ_{+}^{q}, 0 ⩽ f_{j} ⩽ U, j \in ℕ_{q}\} .

(7)

That is, $S^{*} \subset B$ . For $f \in B$ , a subiteration-independent diagonal preconditioner S(f) is defined as

S {(f)}_{j j} : = \begin{cases} f_{j} / p_{j} & \text{if } 0 \leq f_{j} < U / 2, \\ (U - f_{j}) / p_{j} & \text{if } U / 2 \leq f_{j} \leq U, \end{cases}

(8)

where p_j are defined by

p_{j} : = \begin{cases} {(A^{⊤} 1_{p})}_{j} / M & \text{if } {(A^{⊤} 1_{p})}_{j} > 0, \\ 1 / M & \text{if } {(A^{⊤} 1_{p})}_{j} = 0, \end{cases} \quad \text{for } j \in ℕ_{q} .

(9)

Note that the preconditioner S is uniform for all iterations.

For a small t ∈ (0, U) and $f \in ℝ^{q}$ , an operator $P_{t} : ℝ^{q} \to B$ is defined by

P_{t} {(f)}_{j} : = \begin{cases} t & \text{if } f_{j} ⩽ 0, \\ U - t & \text{if } f_{j} ⩾ U, \\ f_{j} & \text{otherwise} . \end{cases}

(10)

Using operator P_t, the modified BSREM algorithm [8] may be described as follows: for $k \in ℕ_{0}$ , $i \in ℕ_{M}$ ,

\left\{ \begin{array}{l} {\tilde{f}}^{k, i} = f^{k, i - 1} - λ_{k} S (f^{k, i - 1}) \nabla Φ_{i} (f^{k, i - 1}), \\ f^{k, i} = P_{t} ({\tilde{f}}^{k, i}), \end{array} \right.

(11)

with $f^{k, 0} : = f^{k}, f^{k + 1} : = f^{k, M}$ , where λ_k > 0 is the relaxation parameter. For simplicity of notation, we will refer to the modified form of BSREM simply as BSREM.
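The building blocks (8), (9) and (10) and a single update of (11) can be sketched as follows. The helper names and the numerical values are hypothetical and were chosen only to exercise each branch of the definitions; the subset gradient is passed in as a given vector rather than computed.

```python
def em_weights(A, M):
    """p_j from (9): column sums of A divided by M, or 1/M for empty columns."""
    q = len(A[0])
    col_sums = [sum(row[j] for row in A) for j in range(q)]
    return [(s if s > 0 else 1.0) / M for s in col_sums]

def preconditioner_diag(f, p, U):
    """Diagonal of S(f) from (8), for f inside the box B."""
    return [(x if x < U / 2 else U - x) / p_j for x, p_j in zip(f, p)]

def project(f, U, t):
    """P_t from (10): move entries at or beyond the bounds back inside B."""
    out = []
    for x in f:
        if x <= 0:
            out.append(t)
        elif x >= U:
            out.append(U - t)
        else:
            out.append(x)
    return out

def bsrem_subiteration(f, grad_i, p, U, t, lam):
    """One update of (11), given the subset gradient grad_i evaluated at f."""
    s = preconditioner_diag(f, p, U)
    f_tilde = [x - lam * s_j * g_j for x, s_j, g_j in zip(f, s, grad_i)]
    return project(f_tilde, U, t)

# Hypothetical values: the third column of A is empty, so p_3 = 1/M.
A = [[1.0, 0.0, 0.0], [1.0, 2.0, 0.0]]
p = em_weights(A, M=2)
f = [1.0, 8.0, 3.0]
print(p)
print(preconditioner_diag(f, p, U=10.0))
print(bsrem_subiteration(f, [4.0, -3.0, 0.0], p, U=10.0, t=1e-4, lam=1.0))
```

In this example the unprojected update leaves B in the first two components, so P_t resets them to t and U − t, respectively, while the third component is left unchanged.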

C. BSREM with Subiteration-Dependent Preconditioners

In this subsection we propose subiteration-dependent preconditioners (SDPs). To motivate them, we review the momentum approach. The momentum is an acceleration technique widely used in optimization [25]–[27]. The Nesterov momentum [25] has been combined with OS by Kim et al. [28] for CT image reconstruction. Recently, Lin et al. [22] successfully applied a different form of momentum to PET image reconstruction. However, no explicit convergence proof has been provided for the OS combined momentum methods. Instead, Kim et al. proved that the expectation of the successive steps converged, while Lin et al. proved convergence for the non-OS method. The momentum technique used in [22] can be described as follows: for $k \in ℕ_{0}$ , $i \in ℕ_{M}$ ,

\left\{ \begin{array}{l} {\tilde{f}}^{k, i} = \max [f^{k, i - 1} - λ_{k} S (f^{k, i - 1}) \nabla Φ_{i} (f^{k, i - 1}), 0], \\ f^{k, i} = (1 - α_{k, i}) f^{k, i - 1} + α_{k, i} {\tilde{f}}^{k, i}, \end{array} \right.

(12)

where 𝜶_k,i > 1 is the momentum sequence. Under an assumption that ${\tilde{f}}^{k, i}$ are non-negative, one can obtain

f^{k, i} = f^{k, i - 1} - λ_{k} α_{k, i} S (f^{k, i - 1}) \nabla Φ_{i} (f^{k, i - 1}) .

(13)
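The step from (12) to (13) is a direct substitution. As a short worked check, assume the maximum in (12) is inactive, so that ${\tilde{f}}^{k, i} = f^{k, i - 1} - λ_{k} S (f^{k, i - 1}) \nabla Φ_{i} (f^{k, i - 1})$ . Then:

```latex
\begin{aligned}
f^{k,i} &= (1 - \alpha_{k,i})\, f^{k,i-1} + \alpha_{k,i}\, \tilde{f}^{k,i} \\
        &= f^{k,i-1} + \alpha_{k,i} \bigl( \tilde{f}^{k,i} - f^{k,i-1} \bigr) \\
        &= f^{k,i-1} - \lambda_k\, \alpha_{k,i}\, S(f^{k,i-1})\, \nabla \Phi_i(f^{k,i-1}) .
\end{aligned}
```

In other words, the momentum weight simply rescales the preconditioned subset step by the factor 𝜶_k,i.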

By letting S^k,i(f) := 𝜶_k,iS(f), we may reinterpret S^k,i as an SDP and (13) as a BSREM algorithm with S^k,i. Inspired by this, we introduce SDPs by setting S^k,i(f) := diag(𝜶^k,i)S(f), where 𝜶^k,i is a positive vector sequence and diag(y) denotes the diagonal matrix with the diagonal entries being the components of the vector y. Using the same notation as for BSREM above, we have arrived at the SDP-BSREM algorithm for solving model (3) given in Table I.

TABLE I.

SDP-BSREM Algorithm

Preparation: $f^{0}$, $M$, $T$; $P_{t}$ is defined in (10)
for $k = 0, 1, 2, \dots, T$
    $f^{k, 0} = f^{k}$
    for $i = 1, 2, \dots, M$
        ${\tilde{f}}^{k, i} = f^{k, i - 1} - λ_{k} S^{k, i} (f^{k, i - 1}) \nabla Φ_{i} (f^{k, i - 1})$
        $f^{k, i} = P_{t} ({\tilde{f}}^{k, i})$
    end
    $f^{k + 1} = f^{k, M}$
end

Bearing in mind the momentum concept defined in [22], it is clear that the iteration sequence provided in (13) is a special case of our proposed SDP-BSREM algorithm with 𝜶^k,i := 𝜶_k,i1_q. For this reason, we expect that our proposed SDP-BSREM algorithm setting will allow us to choose an SDP to yield the convergence acceleration. We will evaluate its performance by means of numerical experiments to be presented in section IV.
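The loop in Table I can be sketched on a toy problem as follows. This is a minimal illustration under several assumptions that are not part of the paper: the function names and the `alpha_fn` callback are hypothetical, the system matrix is a small dense list of lists, the subset gradient is approximated by central finite differences instead of an analytic expression, the relaxation has the form λ₀/(ak + 1), and T outer iterations are run.

```python
import math

def subset_objective(A_i, g_i, gamma_i, f, beta_share, neighbors,
                     gamma_r=2.0, eps=1e-12):
    """Phi_i(f): subset Poisson fidelity plus beta_share * RDP(f), cf. (5), (6)."""
    val = 0.0
    for row, g_l, gam_l in zip(A_i, g_i, gamma_i):
        proj = sum(a * x for a, x in zip(row, f))
        val += proj - g_l * math.log(proj + gam_l)
    for j, nbrs in enumerate(neighbors):
        for k in nbrs:
            d = f[j] - f[k]
            val += beta_share * d * d / (f[j] + f[k] + gamma_r * abs(d) + eps)
    return val

def numeric_grad(fun, f, h=1e-6):
    """Central finite differences; a stand-in for the analytic gradient."""
    grad = []
    for j in range(len(f)):
        up = f[:j] + [f[j] + h] + f[j + 1:]
        dn = f[:j] + [f[j] - h] + f[j + 1:]
        grad.append((fun(up) - fun(dn)) / (2 * h))
    return grad

def sdp_bsrem(A, g, gamma, f0, subsets, neighbors, beta, U, t, T,
              lam0, a, alpha_fn):
    """Loop of Table I with S^{k,i}(f) = diag(alpha^{k,i}) S(f)."""
    M, q = len(subsets), len(f0)
    col = [sum(row[j] for row in A) for j in range(q)]
    p = [(c if c > 0 else 1.0) / M for c in col]            # (9)
    f = list(f0)
    for k in range(T):
        lam = lam0 / (a * k + 1)                            # relaxation lambda_k
        for i, idx in enumerate(subsets, start=1):
            A_i = [A[l] for l in idx]
            g_i = [g[l] for l in idx]
            gam_i = [gamma[l] for l in idx]
            phi_i = lambda x: subset_objective(A_i, g_i, gam_i, x,
                                               beta / M, neighbors)
            grad = numeric_grad(phi_i, f)
            alpha = alpha_fn(k, i, f)                       # vector alpha^{k,i}
            s = [(x if x < U / 2 else U - x) / pj for x, pj in zip(f, p)]  # (8)
            f_tilde = [x - lam * al * sj * gj
                       for x, al, sj, gj in zip(f, alpha, s, grad)]
            f = [t if x <= 0 else U - t if x >= U else x for x in f_tilde]  # (10)
    return f

# Hypothetical toy problem: 3 bins, 2 pixels that are each other's neighbor.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
g = [5.0, 3.0, 8.0]
gamma = [0.1, 0.1, 0.1]
neighbors = [[1], [0]]
ones = lambda k, i, f: [1.0] * len(f)      # alpha = 1_q recovers plain BSREM
boost = lambda k, i, f: [1.5] * len(f)     # scalar, momentum-like choice

f_start = [1.0, 1.0]
f_one = sdp_bsrem(A, g, gamma, f_start, [[0, 1, 2]], neighbors, beta=0.1,
                  U=100.0, t=1e-4, T=1, lam0=1.0, a=0.1, alpha_fn=ones)
f_sdp = sdp_bsrem(A, g, gamma, f_start, [[0, 2], [1]], neighbors, beta=0.1,
                  U=100.0, t=1e-4, T=5, lam0=1.0, a=0.1, alpha_fn=boost)
print(f_one, f_sdp)
```

Passing the preconditioner weights through a callback keeps the loop unchanged when switching between a constant weight, a scalar momentum sequence and a spatially varying vector. With a single subset, unit weights and λ₀ = 1, the first update coincides with an EM-type step, which lowers the objective in this example.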

III. Convergence of SDP-BSREM algorithm

In this section we present convergence properties of the SDP-BSREM algorithm. We also describe four specific SDPs that satisfy the convergence condition. To this end, we assume that the objective function Φ satisfies the following hypotheses:

(i) Φ has a unique minimizer on $B$;
(ii) Φ is convex and twice differentiable on $B$;
(iii) ∇Φ_i are Lipschitz continuous on $B$ for all $i \in ℕ_{M}$ .

The use of SDPs results in scaled subset gradients whose sum is inconsistent with the scaled full gradient. This complicates the convergence proof and requires additional assumptions on the preconditioner to make the inconsistencies asymptotically approach zero. For a general SDP, our proposed SDP-BSREM algorithm may not converge [8]. Nevertheless, by imposing certain assumptions on S^k,i, convergence to the desired optimum point can be ensured. For an SDP S^k,i(f) = diag(𝜶^k,i)S(f) with S(f) defined in (8), the required additional assumptions are as follows:

(iv) The relaxation sequence satisfies $\sum_{k = 0}^{\infty} λ_{k} = \infty$ and $\sum_{k = 0}^{\infty} λ_{k}^{2} < \infty$ ;
(v) There exists a positive vector 𝜶 such that $\lim_{k \to \infty} α^{k, i} = α$ for all $i \in ℕ_{M}$ ;
(vi) The vector series $\sum_{k = 0}^{\infty} λ_{k} (α - α^{k, i})$ converge for all $i \in ℕ_{M}$ .

Note that condition (iv) was imposed in [8], [29] for convergence proofs for relaxed OS algorithms. Conditions (v) and (vi) were imposed to overcome difficulties caused by the use of different preconditioners in different subiterations.

We now state a lemma regarding the Lipschitz continuity properties of SDPs. The proof is included in Appendix B.

Lemma 1: If conditions (iii) and (v) are satisfied, then $S^{k, i} (f) \nabla Φ_{i} (f)$ are uniformly bounded and Lipschitz continuous on $B$ with Lipschitz constants bounded above by a uniform constant, for all $k \in ℕ_{0}$ , $i \in ℕ_{M}$ .

The inclusion of the operator P_t in the SDP-BSREM algorithm complicates the convergence proof. Here, we present our approach in dealing with this difficulty. Let $int B$ denote the interior of $B$ . We prove the convergence of SDP-BSREM in two steps. We first prove it with the interior assumption ${\tilde{f}}^{k, i} \in int B$ for all $k \in ℕ_{0}$ , $i \in ℕ_{M}$ , and then prove it by showing that the interior assumption holds true under certain conditions.

We now proceed with the first step. If ${\tilde{f}}^{k, i} \in int B$ for all $k \in ℕ_{0}$ , $i \in ℕ_{M}$ , then $f^{k, i} = P_{t} ({\tilde{f}}^{k, i}) = {\tilde{f}}^{k, i}$ and the iteration scheme can be formulated as

f^{k, i} = f^{k} - λ_{k} \sum_{l = 1}^{i} S^{k, l} (f^{k, l - 1}) \nabla Φ_{l} (f^{k, l - 1}) .

(14)

We first establish a technical lemma.

Lemma 2: Suppose conditions (iii) and (v) are satisfied. If $\lim_{k \to \infty} λ_{k} = 0$ , and ${\tilde{f}}^{k, i} \in int B$ , for all $k \in ℕ_{0}$ , $i \in ℕ_{M}$ , then $\lim_{k \to \infty} (f^{k, i} - f^{k}) = 0$ , for all $i \in ℕ_{M}$ .

Proof: By Lemma 1, $S^{k, i} (f) \nabla Φ_{i} (f)$ , $k \in ℕ_{0}$ , $i \in ℕ_{M}$ , are uniformly bounded on $B$ . This combined with $\lim_{k \to \infty} λ_{k} = 0$ and (14) yields the desired result. ■

Let $δ^{k, i} : = α - α^{k, i}$ , $δ_{k} : = \max_{i \in ℕ_{M}, j \in ℕ_{q}} |δ_{j}^{k, i}|$ . We state a technical lemma whose proof is included in Appendix B.

Lemma 3: Suppose conditions (iii)-(vi) are satisfied and ${\tilde{f}}^{k, i} \in int B$ for all $k \in ℕ_{0}$ , $i \in ℕ_{M}$ . If $f_{j}^{k, i - 1} \in (0, U / 2)$ for all $i \in ℕ_{M}$ , then

f_{j}^{k} - f_{j}^{k + 1} = λ_{k} f_{j}^{k} [\frac{α_{j}}{p_{j}} \frac{\partial Φ (f^{k})}{\partial f_{j}} + O (δ_{k}) + O (λ_{k})] .

(15)

If $f_{j}^{k, i - 1} \in [U / 2, U)$ for all $i \in ℕ_{M}$ , then

f_{j}^{k} - f_{j}^{k + 1} = λ_{k} (U - f_{j}^{k}) [\frac{α_{j}}{p_{j}} \frac{\partial Φ (f^{k})}{\partial f_{j}} + O (δ_{k}) + O (λ_{k})] .

(16)

We recall that a cluster point of a sequence f^k is defined as the limit of a convergent subsequence of f^k and state a lemma whose proof is included in Appendix B.

Lemma 4: If conditions (i)-(vi) are satisfied and ${\tilde{f}}^{k, i} \in int B$ for all $k \in ℕ_{0}$ , $i \in ℕ_{M}$ , then (a) $Φ (f^{k})$ converges in $ℝ$ , (b) there exists a cluster point f^∗ of f^k with S(f^∗)∇Φ(f^∗) = 0, and (c) such a cluster point f^∗ described in (b) is a global minimizer of Φ over $B$ .

We are ready to prove convergence of f^k and f^k,i with the interior assumption.

Proposition 5: If conditions (i)-(vi) are satisfied and ${\tilde{f}}^{k, i} \in int B$ for all $k \in ℕ_{0}$ , $i \in ℕ_{M}$ , then both f^k and f^k,i converge to the global minimizer of Φ on $B$ .

Proof: According to Lemma 4 (c), f^∗ is a global minimizer of Φ over $B$ . Suppose there exists another cluster point f^∗∗ ≠ f^∗. By Lemma 4 (a), Φ(f^k) converges in $ℝ$ , which implies that Φ(f^∗) = Φ(f^∗∗). Then f^∗∗ is also a minimizer of Φ(f), which is a contradiction since Φ(f) has a unique minimizer on $B$ . Then we obtain $\lim_{k \to \infty} f^{k} = \arg \min_{f \in B} Φ (f)$ . The convergence of f^k,i follows from Lemma 2 and the convergence of f^k. ■

We have shown that the condition ${\tilde{f}}^{k, i} \in int B$ for all $k \in ℕ_{0}$ , $i \in ℕ_{M}$ is sufficient for the convergence of the SDP-BSREM algorithm. Next, we prove the convergence of SDP-BSREM without the interior assumption. The proof of convergence is completed by proving ${\tilde{f}}^{k, i} \in int B$ for all $i \in ℕ_{M}$ and k > K for some K > 0. We now state a lemma to prove it.

Lemma 6: Suppose condition (iii) is satisfied. If 𝜶^k,i is bounded and $\lim_{k \to \infty} λ_{k} = 0$ , then ${\tilde{f}}^{k, i} \in int B$ for all $i \in ℕ_{M}$ and k > K for some K > 0.

Proof: It suffices to prove that ${\tilde{f}}_{j}^{k, i} \in (0, U)$ for all $i \in ℕ_{M}$ , $j \in ℕ_{q}$ , k > K, for some K > 0. By condition (iii), $(\partial / \partial f_{j}) Φ_{i} (f)$ is bounded over $B$ for all $i \in ℕ_{M}$ , $j \in ℕ_{q}$ . Combining this with the boundedness of 𝜶^k,i, there exists c₁ > 0 such that $|α_{j}^{k, i} / p_{j} (\partial / \partial f_{j}) Φ_{i} (f)| ⩽ c_{1}$ , for all $i \in ℕ_{M}$ , $j \in ℕ_{q}$ , $k \in ℕ_{0}$ , and all $f \in B$ . Because $\lim_{k \to \infty} λ_{k} = 0$ , there exists K > 0 such that $|λ_{k}| < 1 / c_{1}$ for all k > K, so that $|λ_{k} α_{j}^{k, i} / p_{j} (\partial / \partial f_{j}) Φ_{i} (f^{k, i - 1})| < 1$ . Hence, for k > K, $i \in ℕ_{M}$ , if $f_{j}^{k, i - 1} \in (0, U / 2)$ , the preconditioner $S^{k, i} {(f^{k, i - 1})}_{j j} = α_{j}^{k, i} f_{j}^{k, i - 1} / p_{j}$ gives rise to ${\tilde{f}}_{j}^{k, i} = f_{j}^{k, i - 1} [1 - λ_{k} α_{j}^{k, i} / p_{j} (\partial / \partial f_{j}) Φ_{i} (f^{k, i - 1})]$ . Since the bracketed factor lies in (0, 2) and $f_{j}^{k, i - 1} < U / 2$ , it follows that ${\tilde{f}}_{j}^{k, i} \in (0, U)$ . Likewise, if $f_{j}^{k, i - 1} \in [U / 2, U)$ , the preconditioner $S^{k, i} {(f^{k, i - 1})}_{j j} = α_{j}^{k, i} (U - f_{j}^{k, i - 1}) / p_{j}$ gives that $(U - {\tilde{f}}_{j}^{k, i}) = (U - f_{j}^{k, i - 1}) [1 + λ_{k} α_{j}^{k, i} / p_{j} (\partial / \partial f_{j}) Φ_{i} (f^{k, i - 1})]$ . Since the bracketed factor again lies in (0, 2) and $U - f_{j}^{k, i - 1} \in (0, U / 2]$ , it follows that ${\tilde{f}}_{j}^{k, i} \in (0, U)$ . ■

We now arrive at the following theorem for the convergence of the SDP-BSREM algorithm without an interior assumption.

Theorem 7: If conditions (i)-(vi) are satisfied, then f^k,i converges to the global minimizer of Φ on $B$ .

Proof: We have that $\lim_{k \to \infty} λ_{k} = 0$ and 𝜶^k,i is bounded from conditions (iv) and (v) respectively. Thus, by Lemma 6, there exists K > 0 such that ${\tilde{f}}^{k, i} \in int B$ for all k > K, $i \in ℕ_{M}$ . Then the proof follows from Proposition 5. ■

We next propose four specific SDPs which satisfy the convergence conditions (iv)-(vi). For S^k,i(f) = diag(𝜶^k,i)S(f), let $α^{k, i} : = α_{k, i} ν^{k, i}$ , where 𝜶_k,i is a scalar sequence and ν^k,i is a vector sequence to be determined. Other, potentially better, choices of 𝜶^k,i are left as future work.

In this case, inspired by momentum techniques [22], [25], we consider the following two choices of 𝜶_k,i. The first one is derived from Nesterov momentum [25]:

α_{k, i} : = 1 + (t_{k, i} - 1) / t_{k, i + 1},

(17)

where $t_{k, i + 1} : = (1 + \sqrt{1 + 4 t_{k, i}^{2}}) / 2$ , $t_{0, 1} : = 1$ and $t_{k + 1, 1} : = t_{k, M + 1}$ , $k \in ℕ_{0}$ , $i \in ℕ_{M}$ . The second one has the following form:

α_{k, i} : = (ϱ (k M + i - 1) + δ_{2}) / (k M + i - 1 + δ_{1}),

(18)

for $k \in ℕ_{0}$ , $i \in ℕ_{M}$ , where $ϱ$ , δ₁ and δ₂ are positive parameters. We notice that this 𝜶_k,i is an extension of the momentum proposed in [22].
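The two scalar sequences (17) and (18) can be generated as in the sketch below, where the subiterations are listed in running order n = kM + i. The function names are hypothetical, and the values of ϱ, δ₁ and δ₂ in the usage lines are illustrative assumptions rather than the settings used in our experiments.

```python
import math

def nesterov_alphas(n_sub):
    """alpha_{k,i} from (17), in subiteration order n = k*M + i = 1, 2, ..."""
    alphas, t = [], 1.0                      # t_{0,1} = 1
    for _ in range(n_sub):
        t_next = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        alphas.append(1.0 + (t - 1.0) / t_next)
        t = t_next                           # t_{k+1,1} = t_{k,M+1}: no restart
    return alphas

def rational_alphas(n_sub, rho, delta1, delta2):
    """alpha_{k,i} from (18), with n = k*M + i - 1 = 0, 1, 2, ..."""
    return [(rho * n + delta2) / (n + delta1) for n in range(n_sub)]

first = nesterov_alphas(200)
second = rational_alphas(1000, rho=2.0, delta1=10.0, delta2=10.0)
print(first[0], first[-1])
print(second[0], second[-1])
```

Both sequences converge, the first to 2 and the second to ϱ, which is the kind of limiting behavior required by condition (v).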

The motivation for the design of ν^k,i is presented as follows. The use of different step-sizes for different regions in the image can accelerate the convergence of the algorithm. The diagonal nonnegative definite preconditioner plays an important role in rescaling the step-sizes of the algorithm. Hence, a good preconditioner can significantly accelerate the convergence of the algorithm. We propose a type of preconditioner that is related to the regularization term that promotes smoothness in the reconstructed image. Our goal is to find the minimizer of the objective function, which consists of the fidelity term and the regularization term. The fidelity term estimates the fitting quality of the reconstructed image to the data, and the regularization term defined in (5) promotes smoothness in the reconstructed image. A smaller fidelity term makes the reconstructed image more consistent with the data, and a smaller regularization term leads to a smoother reconstructed image.

Suppose f^∗ is a minimizer of the objective function Φ and the iteration scheme of the algorithm converges to f^∗. Let I_s and I_v be the smooth and variable areas of f^∗, respectively. Suppose $\nabla_{I_{s}} Φ$ and $\nabla_{I_{v}} Φ$ are the subsets of $\nabla Φ$ defined in the areas I_s and I_v, respectively. Then for the smooth areas I_s, the descent directions of the fidelity term and the regularization term are consistent whereas for the variable areas I_v, the descent directions of the fidelity term and the regularization term are inconsistent. Thus the direction of $\nabla_{I_{s}} Φ$ more accurately points toward the minimizer than the direction of $\nabla_{I_{v}} Φ$ . Therefore, we conclude that the directions of $\nabla_{I_{s}} Φ (f^{k, i})$ and $\nabla_{I_{s}} Φ (f^{k, i + 1})$ are more consistent than the directions of $\nabla_{I_{v}} Φ (f^{k, i})$ and $\nabla_{I_{v}} Φ (f^{k, i + 1})$ . We illustrate this conjecture (the subset gradient is used in practice) via numerical experiments (see Fig. 2). The preconditioner is designed to achieve larger iteration step-sizes in the smooth areas I_s and smaller iteration step-sizes in the variable areas I_v. The numerical gradient (the gradient function in Matlab) of f^∗ is applied to measure the smoothness degree of the image f^∗. Then larger and smaller step-sizes will be used in the areas having smaller and larger numerical gradients, respectively.

Fig. 2. Angle (left) and average angle (right) between the gradients of the successive subiterations vs. iterations projected in the smooth areas and variable areas in the reconstructed images, respectively, for the brain phantom with high count data. Top row: SDP-P1(24). Bottom row: SDP-P2(24).

Suppose $f \in ℝ_{+}^{q_{1} q_{2} \times 1}$ is a 2D image with size q₁ × q₂. Let $mat (f) \in ℝ_{+}^{q_{1} \times q_{2}}$ be the matrix form of f. Using the gradient function in Matlab, we compute the gradients of mat(f) along the x and y directions, namely, ${grad}_{x} (mat (f))$ and ${grad}_{y} (mat (f))$ . Let $grad (f) : = \sqrt{{({grad}_{x} (mat (f)))}^{2} + {({grad}_{y} (mat (f)))}^{2}}$ , where the square and square root operations are element-wise. For PET patient data, the minimizer f^∗ is unknown and we use f^k,i to approximate f^∗. Based on the consideration that areas with larger numerical gradients should have smaller step-sizes, we first let $μ^{k, i} : = \max (0.01, grad (f^{k, i}) / mean (f^{k, i}))$ , where $mean (f) : = \sum_{j = 1}^{q} f_{j} / q$ is used to normalize f^k,i. Instead of directly setting ν^k,i to 1/µ^k,i, we define a projection operator to avoid step-sizes that are too large or too small. For two positive numbers ν_m < ν_M, and $f \in ℝ_{+}^{q}$ , a projection operator $P_{ν_{m}}^{ν_{M}} : ℝ^{q} \to ℝ^{q}$ is defined by $P_{ν_{m}}^{ν_{M}} {(f)}_{j} : = \min \{ν_{M}, \max \{f_{j}, ν_{m}\}\}$ .

Let $J_{k, i} : = k M + i$ , $k \in ℕ_{0}$ , $i \in ℕ_{M}$ . For $0 < ν_{1} < ν_{2}$ , and $0 ⩽ J_{0} ⩽ J_{1} : = k_{1} M + i_{1}$ , the $ν^{k, i}$ is determined by

ν^{k, i} : = \begin{cases} 1_{q} & \text{if } J_{k, i} ⩽ J_{0}, \\ P_{ν_{1}}^{ν_{2}} (mean (μ^{k, i}) / μ^{k, i}) & \text{if } J_{0} < J_{k, i} ⩽ J_{1}, \\ ν^{k_{1}, i_{1}} & \text{if } J_{k, i} > J_{1} . \end{cases}

(19)
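The middle branch of (19) can be sketched in a few lines. The example below is a hedged illustration: the function names, the 4 × 4 test image and the bounds 0.5 and 2.0 are assumptions made for this sketch, and the numerical gradient is re-implemented with central differences in the interior and one-sided differences on the border, which is the convention of the Matlab gradient function for unit spacing.

```python
def gradient_magnitude(img):
    """Per-pixel sqrt(gx^2 + gy^2), with central differences in the interior
    and one-sided differences on the border (unit pixel spacing)."""
    n_rows, n_cols = len(img), len(img[0])

    def diff(values, idx):
        n = len(values)
        if n == 1:
            return 0.0
        if idx == 0:
            return values[1] - values[0]
        if idx == n - 1:
            return values[-1] - values[-2]
        return (values[idx + 1] - values[idx - 1]) / 2.0

    columns = [[img[r][c] for r in range(n_rows)] for c in range(n_cols)]
    return [[(diff(img[r], c) ** 2 + diff(columns[c], r) ** 2) ** 0.5
             for c in range(n_cols)] for r in range(n_rows)]

def nu_weights(img, nu_lo, nu_hi, floor=0.01):
    """Middle branch of (19): clip mean(mu) / mu to [nu_lo, nu_hi], element-wise."""
    pixels = [x for row in img for x in row]
    mean_f = sum(pixels) / len(pixels)       # assumed positive
    mu = [[max(floor, gv / mean_f) for gv in row]
          for row in gradient_magnitude(img)]
    mean_mu = sum(m for row in mu for m in row) / len(pixels)
    return [[min(nu_hi, max(mean_mu / m, nu_lo)) for m in row] for row in mu]

# Hypothetical 4x4 image: two flat halves separated by one vertical edge.
image = [[1.0, 1.0, 5.0, 5.0] for _ in range(4)]
nu = nu_weights(image, nu_lo=0.5, nu_hi=2.0)
print(nu[0])

flat_nu = nu_weights([[3.0] * 4 for _ in range(4)], nu_lo=0.5, nu_hi=2.0)
print(flat_nu[0])
```

Pixels away from the edge receive the upper bound, pixels next to the edge receive a weight close to the lower bound, and a completely flat image yields weights of 1 everywhere, so the preconditioner reduces to the unweighted one in that case.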

For the first J₀ subiterations, ν^k,i is set to the all-ones vector 1_q since f^k,i is still a poor approximation of f^∗. The approximation improves as the iteration continues, hence different step-sizes for different regions of the image are used for J₀ < J_k,i ≤ J₁. After the J₁-th subiteration, ν^k,i is then fixed due to the improved approximation. The preconditioners S^k,i(f) = S(f)diag(𝜶_k,iν^k,i) are denoted by P1 and P2 depending on 𝜶_k,i defined in (17) and (18) respectively. The momentum-like preconditioners S^k,i(f) = S(f)diag(𝜶_k,i1_q) are denoted by M1 and M2 depending on 𝜶_k,i defined in (17) and (18) respectively. Then we have the following theorem, whose proof can be found in Appendix B.

Theorem 8: The SDP-BSREM algorithm with Φ defined in (4), and with relaxation $λ_{k} : = λ_{0} / (a k + 1)$ , λ₀, a > 0, and preconditioner P1, P2, M1, or M2 is convergent.
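As a brief worked check, the relaxation sequence in Theorem 8 satisfies condition (iv). This covers only condition (iv); conditions (v) and (vi) for the four preconditioners are treated in the proof in Appendix B. For λ₀, a > 0, using ak + 1 ≤ max(a, 1)(k + 1) for the first bound and ak + 1 ≥ ak for k ≥ 1 in the second:

```latex
\sum_{k=0}^{\infty} \lambda_k
  = \sum_{k=0}^{\infty} \frac{\lambda_0}{a k + 1}
  \;\ge\; \frac{\lambda_0}{\max(a, 1)} \sum_{k=0}^{\infty} \frac{1}{k + 1} = \infty,
\qquad
\sum_{k=0}^{\infty} \lambda_k^2
  \;\le\; \lambda_0^2 \Bigl( 1 + \frac{1}{a^2} \sum_{k=1}^{\infty} \frac{1}{k^2} \Bigr)
  = \lambda_0^2 \Bigl( 1 + \frac{\pi^2}{6 a^2} \Bigr) < \infty .
```

In particular, $\lim_{k \to \infty} λ_{k} = 0$ , which is the property used in Lemmas 2 and 6.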

IV. Numerical results

In this section, we evaluate the performance of the SDP-BSREM algorithms by means of numerical experiments using both simulated and clinical PET data, in comparison with BSREM and with the clinical version of BSREM (Q.Clear, GE).

A. Simulation Setup

The algorithms were implemented using a 2D PET simulation model developed in the Matlab environment [22], [30]. The projection matrix, based on a single detector ring of a GE D710 PET/CT, was built using a ray-driven model with 32 parallel rays per detector pair. A cylindrical detector ring consisting of 576 detectors, each 4 mm wide, was used. The field of view (FOV) was set to 300 mm and 288 projection angles were used to reconstruct a 256×256 image with pixel size 1.17 mm × 1.17 mm. The true count projection data were obtained by forward projecting the phantom convolved in image space with an idealized point spread function (PSF). The PSF was a shift-invariant Gaussian function with full width at half maximum (FWHM) equal to 6.59 mm [31]. Uniform water attenuation, with attenuation coefficient 0.096 cm⁻¹, was simulated using the PET image as support. Scatter was simulated by adding a highly smoothed and scaled projection of the phantom to the attenuated image sinograms. The scaling factor was equal to the estimated scatter fraction SF := S/(T + S), where T and S are true and scatter counts respectively [32]. Random counts were simulated by adding a uniform distribution to the true and scatter count distributions scaled by a random fraction RF := R/(T + S + R), where R is the random count [32]. The total count was defined as TC := T + S + R. In the simulations, it was TC = 6.8×10⁶ for high count data and TC = 6.8×10⁵ for low count data. In both cases SF = RF = 0.25. The individual noise realizations were generated by adding the Poisson noise to the total count distribution. The same system matrix was used to simulate the data and to reconstruct them.
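The definitions of SF, RF and TC determine how the total count splits into true, scatter and random counts. The short sketch below inverts the two fractions for the stated totals; the function name is hypothetical and the computation is a direct consequence of the definitions rather than part of the simulation code.

```python
def split_counts(total, scatter_fraction, random_fraction):
    """Invert SF = S/(T+S) and RF = R/(T+S+R) for a given TC = T+S+R."""
    randoms = random_fraction * total
    trues_plus_scatter = total - randoms
    scatter = scatter_fraction * trues_plus_scatter
    trues = trues_plus_scatter - scatter
    return trues, scatter, randoms

for tc in (6.8e6, 6.8e5):
    trues, scatter, randoms = split_counts(tc, 0.25, 0.25)
    print(tc, trues, scatter, randoms)
```

With SF = RF = 0.25, one quarter of the total count is random, and the remaining three quarters split 3:1 between true and scatter counts.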

To investigate the convergence acceleration and its impact on the fidelity of the reconstructed images, two figures-of-merit were computed: the objective function value $Φ (f^{k})$ and the normalized root mean square difference (NRMSD). The region of interest (ROI) based NRMSD is defined by $\sqrt{\sum_{j \in Ω} {(f_{j}^{k} - f_{j}^{\infty})}^{2}} / \sqrt{\sum_{j \in Ω} {(f_{j}^{\infty})}^{2}}$ , where $f^{\infty}$ is the converged image obtained after 1000 iterations of the BSREM algorithm with 24 subsets in simulations, and Ω is the ROI. In the simulations the global NRMSD is obtained by setting the ROI to the whole image.
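As a minimal sketch of the NRMSD definition above, the following Python function evaluates it for an arbitrary boolean ROI mask, with the global NRMSD as the default. The name `nrmsd` and the mask-based interface are assumptions made for illustration.

```python
import numpy as np

def nrmsd(image, reference, roi_mask=None):
    """ROI-based NRMSD between an iterate f^k and a reference image f^inf.

    roi_mask is a boolean array selecting the ROI; None means the whole image.
    """
    image = np.asarray(image, dtype=float)
    reference = np.asarray(reference, dtype=float)
    if roi_mask is None:
        roi_mask = np.ones(reference.shape, dtype=bool)
    diff = image[roi_mask] - reference[roi_mask]
    return np.sqrt(np.sum(diff ** 2)) / np.sqrt(np.sum(reference[roi_mask] ** 2))
```

The value is 0 when the iterate equals the reference inside the ROI and 1 when the iterate is identically zero there.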

Two 256×256 numerical 2D phantoms shown in Fig. 1 were used in simulations. The brain phantom [30] was obtained from a high quality clinical PET image. The uniform phantom consists of 4 uniform hot spheres and 2 uniform cold spheres with distinct radii: 4, 6, 8 (cold), 10 (cold), 12, 14 pixels. The contrast ratios for the cold and hot spheres are 0 : 1 and 1 : 10, respectively. All simulations were performed on a 64-bit Windows 10 laptop with an Intel Core i7–8550U processor at 1.80 GHz, 16 GB of DDR4 memory and a 512 GB SATA SSD.

The parameter t in P_t was set to 10⁻⁴. The constant ϵ was set to 10⁻¹². The regularization parameter β in model (3) was set to 0.1 and 0.8 for high and low count data, respectively. In the RDP regularization term, the parameter γ_R was set to 2 and an 8-point neighborhood was considered. The initialization f⁰ was set to 1_q to examine the setting of ν^k,i. We used the relaxation sequence defined by $λ_{0} / (a k + 1)$ , a > 0. In all simulation experiments, for simplicity, we empirically set λ₀ = 1, J₀ = 3, J₂ = 1000 and δ₁ = δ₂. Other parameter values, shown in Table II, were chosen based on the resulting objective function values.

TABLE II.

Algorithmic parameters for 2D simulation reconstruction

Parameters	Brain phantom	Uniform phantom
BSREM(12)	high count: a = 1/400 low count: a = 1/18	-
SDP-P1(12)	high count: a = 1/13, ν₁ = 1.6, ν₂ = 2.4 low count: a = 0.5, ν₁ = 1.6, ν₂ = 2.4	-
SDP-P2(12)	high count: a = 1/5, ϱ = 5, δ₁ = 5, ν₁ = 0.8, ν₂ = 2.2 low count: a = 1.3, ϱ = 7.5, δ₁ = 5, ν₁ = 1.3, ν₂ = 2.1	-
BSREM(24)	high count: a = 1/35 low count: a = 1/5	high count: a = 1/35
SDP-P1(24)	high count: a = 0.35, ν₁ = 1.6, ν₂ = 2.4 low count: a = 1.3, ν₁ = 1.4, ν₂ = 2.5	high count: a = 0.5, ν₁ = 1.8, ν₂ = 2.5
SDP-P2(24)	high count: a = 0.45, ϱ = 4, δ₁ = 3, ν₁ = 0.8, ν₂ = 1.8 low count: a = 1.4, ϱ = 2.2, δ₁ = 1, ν₁ = 1.3, ν₂ = 2.4	high count: a = 0.7, ϱ = 3, δ₁ = 7, ν₁ = 1.4, ν₂ = 2.3
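To make the role of the parameter a concrete, here is a short Python sketch of the relaxation sequence $λ_{k} = λ_{0} / (a k + 1)$ evaluated with two values of a from Table II (brain phantom, high count data, 24 subsets). The function name `relaxation` is hypothetical. The sketch only evaluates the formula and does not reproduce any reconstruction.

```python
def relaxation(k, lam0=1.0, a=1.0 / 35.0):
    """Relaxation parameter lambda_k = lam0 / (a * k + 1) for iteration k >= 0."""
    if a <= 0 or lam0 <= 0:
        raise ValueError("lam0 and a must be positive")
    return lam0 / (a * k + 1.0)

# Values of a taken from Table II (brain phantom, high count, 24 subsets).
bsrem_steps = [relaxation(k, a=1.0 / 35.0) for k in range(5)]   # BSREM(24)
sdp_p1_steps = [relaxation(k, a=0.35) for k in range(5)]        # SDP-P1(24)
```

Both sequences start at λ₀ = 1. The larger a used with SDP-P1 makes the relaxation decay faster from the first iteration onward.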

B. Simulation Results

1). Comparison of Gradient Consistency:

To measure the directional consistency of two vectors, we computed the angle between them. The angle between vectors v₁ and v₂ is defined as $θ (v_{1}, v_{2}) : = \arccos (〈v_{1}, v_{2}〉 / ({‖v_{1}‖}_{2} {‖v_{2}‖}_{2}))$ . We define the smooth areas sequence by $I_{s}^{k, i} : = \{j \in ℕ_{q} : grad {(f^{k, i})}_{j} < 0.01 \cdot mean (f^{k, i})\}$ , and the variable areas sequence by $I_{v}^{k, i} : = \{j \in ℕ_{q} : grad {(f^{k, i})}_{j} > 0.2 \cdot mean (f^{k, i})\}$ . In order to estimate the consistency of $\nabla_{I_{s}^{k, i}} Φ_{i - 1} (f^{k, i - 1})$ and $\nabla_{I_{s}^{k, i}} Φ_{i} (f^{k, i})$ , we computed the angle between them. For smooth areas, we define the angle sequence by $θ_{k, i} : = θ (\nabla_{I_{s}^{k, i}} Φ_{i - 1} (f^{k, i - 1}), \nabla_{I_{s}^{k, i}} Φ_{i} (f^{k, i}))$ , and the average angle in each iteration by $θ_{k} : = \sum_{i = 1}^{M} θ_{k, i} / M$ , where $Φ_{0} : = Φ_{M}$ . Similarly, for variable areas, we computed ${\tilde{θ}}_{k, i} : = θ (\nabla_{I_{v}^{k, i}} Φ_{i - 1} (f^{k, i - 1}), \nabla_{I_{v}^{k, i}} Φ_{i} (f^{k, i}))$ and ${\tilde{θ}}_{k} : = \frac{1}{M} \sum_{i = 1}^{M} {\tilde{θ}}_{k, i}$ . In Fig. 2, we observed that for SDP-P1 and SDP-P2, the angle and average angle in smooth areas were smaller than those in the areas with more variability. This is consistent with our conjecture that the directions of gradients in the smooth areas more accurately point toward the minimizer than those in the variable areas. Hence, larger step-sizes in the smooth areas and smaller step-sizes in the variable areas are reasonable.
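The angle measure and the two index sets can be sketched in Python as follows. This is an illustration under assumptions: grad(f)_j is taken to be the finite-difference gradient magnitude of the image at pixel j (the text does not spell out the discretization), and the names `angle`, `region_masks` and `restricted_angle` are hypothetical.

```python
import numpy as np

def angle(v1, v2):
    """Angle theta(v1, v2) = arccos(<v1, v2> / (||v1||_2 ||v2||_2)) in radians."""
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return np.arccos(np.clip(cos, -1.0, 1.0))  # clip guards against round-off

def region_masks(f, low=0.01, high=0.2):
    """Boolean masks of the smooth and variable areas of a 2D image f.

    Assumption: grad(f)_j is the magnitude of the finite-difference gradient.
    """
    g_rows, g_cols = np.gradient(f)
    magnitude = np.hypot(g_rows, g_cols)
    mean_value = f.mean()
    smooth = magnitude < low * mean_value
    variable = magnitude > high * mean_value
    return smooth, variable

def restricted_angle(grad_prev, grad_curr, mask):
    """Angle between two gradient images restricted to the pixels in mask."""
    return angle(grad_prev[mask], grad_curr[mask])
```

Averaging `restricted_angle` over the M subiterations of iteration k gives the per-iteration quantities $θ_{k}$ and ${\tilde{θ}}_{k}$ defined above.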

2). Comparison of Preconditioners:

In order to reveal the improvement due to the application of a preconditioner, as compared to the use of a momentum, we compared the SDP-BSREM algorithm with four different preconditioners: P1, P2, M1, and M2, where M1 and M2 are surrogates of momenta. The parameter a in SDP-M1(12) and SDP-M1(24) was set to 1/50 and 1/6, respectively. The parameters (a, ϱ, δ₁) in SDP-M2(12) and SDP-M2(24) were set to (1/15, 3, 1) and (1/5, 2.6, 0.5), respectively. In Fig. 3, one can observe that SDP-P1 and SDP-P2 outperform SDP-M1 and SDP-M2, in reaching the same objective function value, by 25–30% and 25%, respectively.

Fig. 3. — Comparison of performance of preconditioners investigated in this study. Objective function vs. elapsed CPU time in reconstructions performed with SDP-BSREM algorithm with four preconditioners: M1, M2, P1, and P2, and with 12 subsets (left) and 24 subsets (right), respectively, for the brain phantom with high count data. Preconditioners P1 and P2 were generalized from M1 and M2, respectively.

3). Comparison of SDP-BSREM with BSREM:

In this sub-section, we analyzed the performance of SDP-BSREM algorithms compared to the BSREM algorithm. First, we showed the global NRMSD for all the algorithms, with 12 and 24 subsets, in Fig. 4, using the brain phantom. It showed that all algorithms converged to the same solution for both low and high count data. Further, this figure showed that SDP-P1 and SDP-P2 outperformed BSREM with respect to global NRMSD. To analyze convergence acceleration, we showed the objective function values of each algorithm in Fig. 5. In this figure, one can observe that both proposed algorithms, SDP-P1 and SDP-P2, outperform the BSREM algorithm, in reaching the same objective function value, by roughly a factor of two for all cases: 12 and 24 subsets for both low and high count data using the brain phantom.

Fig. 4. — Global NRMSD vs. Iterations in reconstructions performed with different algorithms: BSREM(12), SDP-P1(12), SDP-P2(12), BSREM(24), SDP-P1(24), SDP-P2(24) for the brain phantom with low (top row) and high (bottom row) count data, respectively.

Fig. 5. — Comparison of performance of SDP-BSREM vs. BSREM algorithm. Objective function vs. elapsed CPU time in reconstructions performed with BSREM, SDP-P1, and SDP-P2, with 12 (left) and 24 (right) subsets for the brain phantom with low (top row) and high (bottom row) count data, respectively. The dashed lines represent the objective function values of BSREM at 20 seconds CPU time.

Next, we examined the local convergence performance of SDP-BSREM algorithms by ROI based NRMSD in 8 different ROIs with different contrast ratios in the reconstructions of the uniform phantom. High count data and 24 subsets were used in this experiment. In Fig. 6, we observed that the proposed SDP-P1 and SDP-P2 algorithms converged faster than the BSREM algorithm in all 8 ROIs.

Fig. 6. — Comparison of local convergence performance of SDP-BSREM vs. BSREM algorithms for the uniform phantom with high count data. ROI based normalized root mean square difference (NRMSD) vs. iterations is shown. The eight ROIs are the 4 hot spheres, 2 cold spheres, 1 background sphere, and the region consisting of all the former 7 ROIs, named “all ROIs”.

C. Clinical Results

The reconstructions were performed with our SDP-BSREM algorithm with P2 preconditioner and with commercial Q.Clear by means of the GE toolbox [33] with the penalty weight (β) set to the default value of 350. Because the 2D-projectors used in the simulations and 3D-projectors used in clinical data reconstructions were scaled differently, the penalty values used in respective reconstructions differed substantially.

To mimic GE’s clinical implementation of Q.Clear, 25 and 8 iterations were used for non-TOF and TOF data, respectively, with 24 subsets in the experiments. For the same reason we initialized both the non-TOF and TOF reconstructions using OSEM with 2 iterations and 24 subsets. The reconstructions with TOF data were further initialized using 3 iterations with 24 subsets of the non-TOF algorithm. This gives a more clinically realistic view of the performance, but at the cost of not being able to isolate TOF performance.

The parameter values are shown in Table III. For simplicity, since good initializations were used, we set J₀ = 0 and J₁ = 1000. The other algorithmic parameters were found via an iterative golden search procedure using a single bed position (centered on the Derenzo region) from an ACR PET phantom [35] with similar count characteristics as the patient’s data. Using this phantom each parameter was sequentially optimized with 5% tolerance and then used in search for the next parameter until parameter values ceased to change (∼3 iterations).
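One way to picture the sequential search described above is a coordinate-wise golden-section search: each parameter is optimized in turn with the others held fixed, and sweeps repeat until no parameter changes by more than the tolerance. The Python sketch below is an assumption-laden illustration, not the authors' procedure: the objective, the bounds, the stopping rule and the names `golden_section_min` and `coordinate_search` are all hypothetical, and each one-dimensional objective is assumed to be unimodal on its bracket.

```python
import math

def golden_section_min(func, lo, hi, rel_tol=0.05, max_iter=100):
    """Minimize a unimodal func on [lo, hi] by golden-section search.

    Stops when the bracket width falls below rel_tol times its midpoint.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = func(c), func(d)
    for _ in range(max_iter):
        if (hi - lo) <= rel_tol * abs(0.5 * (lo + hi)):
            break
        if fc < fd:                      # minimum lies in [lo, d]
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = func(c)
        else:                            # minimum lies in [c, hi]
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = func(d)
    return 0.5 * (lo + hi)

def coordinate_search(objective, params, bounds, rel_tol=0.05, max_sweeps=10):
    """Optimize one parameter at a time until values cease to change."""
    params = dict(params)
    for _ in range(max_sweeps):
        previous = dict(params)
        for name, (lo, hi) in bounds.items():
            def along(value, name=name):
                trial = dict(params)     # other parameters stay fixed
                trial[name] = value
                return objective(trial)
            params[name] = golden_section_min(along, lo, hi, rel_tol)
        if all(abs(params[n] - previous[n]) <= rel_tol * abs(previous[n])
               for n in bounds):
            break
    return params
```

In practice `objective` would run a reconstruction of the phantom data with the trial parameters and return a scalar figure of merit, which makes each evaluation expensive. This is one reason a derivative-free bracketing search with a coarse tolerance is a natural fit.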

TABLE III.

Algorithmic parameters for 3D patient reconstruction

Algorithm	Parameters
Q.Clear(nonTOF)	λ₀ = 2, a = 1/5
Q.Clear(TOF)	λ₀ = 1.2, a = 1/5
SDP-P2(nonTOF)	λ₀ = 1.6, a = 1/4, ϱ = 2.2, δ₁ = 0.1, δ₂ = 1.6, ν₁ = 0.6, ν₂ = 1.25
SDP-P2(TOF)	λ₀ = 1.1, a = 1/3, ϱ = 2.4, δ₁ = 0.6, δ₂ = 1.6, ν₁ = 0.6, ν₂ = 1.25

In Fig. 8 we show convergence, via the objective function value as a function of iteration for non-TOF/TOF data, for an ¹⁸F-FDG whole-body PET clinical patient (shown in Fig. 7a). These data were obtained from 8 bed positions acquired on a GE D710 PET/CT. The nominal administered activity and post-administration acquisition were 444 MBq and 1 hour, respectively, with 3-minute dwell times and 25% overlap, resulting in [4.1/3.4/4.2/4.5/4.6/3.8/3.1/2.6] × 10⁷ total counts, where the fourth and sixth values correspond to bed positions 4 and 6 shown in Fig. 8.

Fig. 7. — Clinical PET patient and ACR phantom. a) Clinical PET patient: Coronal maximum intensity projection (MIP) of a clinical whole-body ¹⁸F-FDG PET patient image acquired on a GE D710 PET/CT and reconstructed using the Q.Clear clinical method [3]. b) Clinical ACR quality assurance phantom showing the regions of interest for cold/hot cylinders, 0:1 and 2.5:1 activity concentration ratios, respectively, and background [34].

We observed that our SDP-BSREM method outperformed the Q.Clear algorithm, in reaching the same objective function value, by 40–50% and 35–50% for non-TOF and TOF data, respectively. We note that both the clinical 3D projection and penalty operator have much greater computational complexity than the convergence acceleration scheme described in section III. Hence, the increased computational cost required for the use of the SDP is negligible.

To evaluate the local convergence, TOF data from a quarterly ACR quality assurance test were used. Following ACR guidelines [34] the activity corresponded to a nominal 444 MBq (12 mCi) of ¹⁸F-FDG administration used at MSKCC. The upper portion of this phantom contains 8 regions (3 cold, 4 hot, and background) with nominal contrast ratios of 0:1 and 2.5:1 for the cold and hot cylinders, respectively. ROIs were defined using the cylinder boundaries from registered CT images. Using the methodology described by Kim et al. [36], for each ROI we measured the NRMSD, where the f^∞ in NRMSD is the converged image at 300 iterations by Q.Clear without subsets. These reconstructions used the same parameters as those used in the whole body patient reconstructions (i.e., J₀ = 0, J₁ = 1000 and Table III). The results are shown in Fig. 9. For each ROI, the SDP-BSREM method converged to f^∞ faster than Q.Clear.

Fig. 9. — Local convergence is assessed using the 8 regions of interest from an ACR quality assurance test with TOF data [34]. These regions of interest can be seen in Fig. 7b. Each subplot represents one of the eight regions, which are from left to right and top to bottom: background, cold Teflon/air/water, and hot 8/12/16/25 mm cylinders, respectively.

V. Conclusion

In this paper, we have presented the SDP-BSREM algorithms with two SDPs and their global convergence theorems. The two SDPs were designed based on the smoothness-promoting effect of the regularization term on the reconstructed images. We tested these algorithms using both simulated and clinical PET data. Using two simulated phantoms, our numerical studies showed that, for solving the RDP regularized PET image reconstruction model, our proposed algorithms converged more quickly than BSREM. Similarly, when using clinical patient and phantom PET data, our proposed algorithm SDP-P2 outperformed Q.Clear. We plan to test the SDP-BSREM algorithm on more varied data sets acquired under a wide range of conditions seen in the clinic.

Acknowledgments

J. Guo is supported by the Special Project on High-performance Computing under the National Key R&D Program (No. 2016YFB0200602) and by the National Natural Science Foundation of China under grant 11771464; C. R. Schmidtlein is partially supported by the Memorial Sloan Kettering Cancer Center’s support grant from the National Cancer Institute (P30 CA008748); C. R. Schmidtlein, A. Krol and Y. Xu are partially supported by the National Cancer Institute of the National Institutes of Health under Award Number R21CA263876; S. Li is supported by the Natural Science Foundation of Guangdong Province under grant 2022A1515012379, by the Opening Project of Guangdong Province Key Laboratory of Computational Science at Sun Yat-sen University under grant 2021007 and by the National Natural Science Foundation of China under grant 11771464; Y. Lin is supported by Guangdong Basic and Applied Basic Research Foundation under grant 2021A1515110541, by the Fundamental Research Funds for the Central Universities of China under grant 21620352 and by the Opening Project of Guangdong Province Key Laboratory of Computational Science at Sun Yat-sen University under grant 2021006; Y. Xu is supported by the US National Science Foundation under grant DMS-1912958. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Appendix A

This appendix includes the proof of strict convexity and Lipschitz continuous gradient of the objective function Φ.

Proposition 9: If $A^{⊤} g \neq 0$ , then the objective function Φ(f) in (4) is strictly convex on $ℝ_{+}^{q}$ .

Proof: From (2), the gradient of fidelity term F is given by $\nabla F = A^{⊤} (1_{p} - g / (Af + γ))$ . Then the Hessian matrix of F can be computed as follows:

\nabla^{2} F (f) = A^{⊤} GA

(20)

with G a diagonal matrix and G = diag(g/(Af + γ)²). The first-order partial derivative of the RDP term R(f) is given by

\frac{\partial R}{\partial f_{j}} = 2 \sum_{k \in N_{j}} \frac{(f_{j} - f_{k}) (γ_{R} |f_{j} - f_{k}| + f_{j} + 3 f_{k} + 2 ϵ)}{{(f_{j} + f_{k} + γ_{R} |f_{j} - f_{k}| + ϵ)}^{2}} .

(21)

Then we have the second-order partial derivative of R as follows:

\frac{\partial^{2} R}{\partial f_{j} \partial f_{k}} = \left\{\begin{array}{ll} \sum_{l \in N_{j}} \frac{4 {(2 f_{l} + ϵ)}^{2}}{{(f_{j} + f_{l} + γ_{R} |f_{j} - f_{l}| + ϵ)}^{3}} & \text{if } k = j, \\ - \frac{4 (4 f_{j} f_{k} + 2 ϵ (f_{j} + f_{k}) + ϵ^{2})}{{(f_{j} + f_{k} + γ_{R} |f_{j} - f_{k}| + ϵ)}^{3}} & \text{if } k \in N_{j}, \\ 0 & \text{otherwise} . \end{array}\right.

(22)

For any $0 \neq x \in ℝ^{q}$ , ignoring the zero entries of ∇²R, we obtain

x^{⊤} \nabla^{2} R x = \sum_{j = 1}^{q} \frac{\partial^{2} R}{\partial f_{j}^{2}} x_{j}^{2} + \sum_{j = 1}^{q} \sum_{k \in N_{j}} \frac{\partial^{2} R}{\partial f_{j} \partial f_{k}} x_{j} x_{k} .

(23)

By (22) and (23), we have

x^{⊤} \nabla^{2} R x = \sum_{j = 1}^{q} \sum_{k \in N_{j}} \frac{2 {((2 f_{k} + ϵ) x_{j} - (2 f_{j} + ϵ) x_{k})}^{2}}{{(f_{j} + f_{k} + γ_{R} |f_{j} - f_{k}| + ϵ)}^{3}} .

(24)

We can see that $x^{⊤} \nabla^{2} R x ⩾ 0$ and for any $x \neq 0$ , $x^{⊤} \nabla^{2} R x = 0$ if and only if there exists a nonzero constant c, such that x = c(2f + ϵ). For any x = c(2f + ϵ) ≠ 0, we have $x^{⊤} \nabla^{2} F x = c^{2} {‖G^{1 / 2} A (2 f + ϵ)‖}^{2}$ , thus $x^{⊤} \nabla^{2} F x > 0$ by using $A^{⊤} g \neq 0$ . Since $\nabla^{2} Φ (f) = \nabla^{2} F (f) + β \nabla^{2} R (f)$ , one can obtain that $x^{⊤} \nabla^{2} Φ x > 0$ for all x ≠ 0. Then Φ is strictly convex on $ℝ_{+}^{q}$ . ■
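The formulas (21) and (24) can be checked numerically with a small Python sketch. Two assumptions are made for illustration and are not taken from the paper: the RDP term is written as $R (f) = \sum_{j} \sum_{k \in N_{j}} {(f_{j} - f_{k})}^{2} / (f_{j} + f_{k} + γ_{R} |f_{j} - f_{k}| + ϵ)$ , a form consistent with (21), and the neighborhood N_j is a 1D chain {j − 1, j + 1} rather than the 8-point neighborhood used in the experiments. All function names are hypothetical.

```python
import numpy as np

GAMMA_R, EPS = 2.0, 1e-12   # gamma_R and epsilon as set in the simulations

def neighbors(j, q):
    """1D chain neighborhood (an assumption made to keep the sketch short)."""
    return [k for k in (j - 1, j + 1) if 0 <= k < q]

def rdp_value(f):
    """Assumed RDP form: sum over ordered neighbor pairs (j, k)."""
    total = 0.0
    for j in range(len(f)):
        for k in neighbors(j, len(f)):
            d = f[j] - f[k]
            total += d * d / (f[j] + f[k] + GAMMA_R * abs(d) + EPS)
    return total

def rdp_gradient(f):
    """First-order partial derivatives of R as given in Eq. (21)."""
    grad = np.zeros(len(f))
    for j in range(len(f)):
        for k in neighbors(j, len(f)):
            d = f[j] - f[k]
            denom = f[j] + f[k] + GAMMA_R * abs(d) + EPS
            grad[j] += 2.0 * d * (GAMMA_R * abs(d) + f[j] + 3.0 * f[k] + 2.0 * EPS) / denom ** 2
    return grad

def rdp_quadratic_form(f, x):
    """x^T (Hessian of R at f) x evaluated through Eq. (24)."""
    total = 0.0
    for j in range(len(f)):
        for k in neighbors(j, len(f)):
            num = ((2.0 * f[k] + EPS) * x[j] - (2.0 * f[j] + EPS) * x[k]) ** 2
            denom = (f[j] + f[k] + GAMMA_R * abs(f[j] - f[k]) + EPS) ** 3
            total += 2.0 * num / denom
    return total
```

For a positive test vector f, `rdp_quadratic_form(f, 2.0 * f + EPS)` vanishes, which is the null direction x = c(2f + ϵ) identified in the proof, while a generic x gives a strictly positive value. Comparing `rdp_gradient` against central finite differences of `rdp_value` is a quick consistency check of (21) under the assumed form of R.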

Proposition 10: The objective function Φ(f) in (4) has a Lipschitz continuous gradient on $ℝ_{+}^{q}$ .

Proof: From (20), one can obtain that ${‖\nabla^{2} F‖}_{2} ⩽ {‖A‖}_{2}^{2} {‖g‖}_{\infty} / γ_{\min}$ where γ_min > 0 is the minimum entry of γ. For any $x \in ℝ^{q}$ with ${‖x‖}_{2} = 1$ and $f \in ℝ_{+}^{q}$ , let $h (f) : = x^{⊤} \nabla^{2} R (f) x$ . From (24), we know that h(f) is continuous on $ℝ_{+}^{q}$ and $\lim_{{‖f‖}_{2} \to \infty} h (f) = 0$ . Thus there exists C₁ > 0 such that $|h (f)| ⩽ C_{1}$ for any x with ${‖x‖}_{2} = 1$ . Then we have ${‖\nabla^{2} R‖}_{2} ⩽ C_{1}$ which, when combined with the boundedness of ${‖\nabla^{2} F‖}_{2}$ , implies that ${‖\nabla^{2} Φ‖}_{2} ⩽ {‖A‖}_{2}^{2} {‖g‖}_{\infty} / γ_{\min} + C_{1}$ . Using Lemma 1.2.2 in [37], one can obtain that ∇Φ(f) is Lipschitz continuous with constant $C : = ({‖A‖}_{2}^{2} {‖g‖}_{\infty}) / (γ_{\min}) + C_{1}$ on $ℝ_{+}^{q}$ . ■

Appendix B

This appendix includes proofs of Lemma 1, Lemma 3, Lemma 4, and Theorem 7. Here is the proof of Lemma 1.

Proof: We present only a proof of the Lipschitz continuity; the uniform boundedness can be shown in a similar manner. For any f, $\tilde{f} \in B$ , we consider the quantities $Δ : = S^{k, i} (f) \nabla Φ_{i} (f) - S^{k, i} (\tilde{f}) \nabla Φ_{i} (\tilde{f})$ , $Δ_{1} : = \nabla Φ_{i} (f) - \nabla Φ_{i} (\tilde{f})$ and $Δ_{2} : = S^{k, i} (f) - S^{k, i} (\tilde{f})$ .

By the triangle inequality and the Cauchy–Schwarz inequality, we have that

{‖Δ‖}_{2} ⩽ {‖S^{k, i} (f)‖}_{2} {‖Δ_{1}‖}_{2} + {‖\nabla Φ_{i} (\tilde{f})‖}_{2} {‖Δ_{2}‖}_{2} .

(25)

According to condition (v), 𝜶^k,i is bounded. This together with the definition of S(f) implies that S^k,i(f) := diag(𝜶^k,i)S(f) is bounded for all $k \in ℕ_{0}$ , $i \in ℕ_{M}$ and $f \in B$ . By condition (iii), there exists c₂ > 0 such that for all $k \in ℕ_{0}$ , $i \in ℕ_{M}$

{‖S^{k, i} (f)‖}_{2} {‖Δ_{1}‖}_{2} ⩽ c_{2} {‖f - \tilde{f}‖}_{2} .

(26)

We next prove that S^k,i(f) are Lipschitz continuous on $B$ with Lipschitz constants bounded above by a constant. For any f, $\tilde{f} \in B$ , if both $f_{j}, {\tilde{f}}_{j}$ are in [0, U/2) or in [U/2, U], then $|S^{k, i} {(f)}_{j j} - S^{k, i} {(\tilde{f})}_{j j}| = |α_{j}^{k, i} / p_{j}| \cdot |f_{j} - {\tilde{f}}_{j}|$ . If f_j and ${\tilde{f}}_{j}$ are not in the same interval, assume without loss of generality that $f_{j} \in [0, U / 2)$ and ${\tilde{f}}_{j} \in [U / 2, U]$ ; then $|f_{j} - (U - {\tilde{f}}_{j})| ⩽ |f_{j} - {\tilde{f}}_{j}|$ . Therefore, one can get $|S^{k, i} {(f)}_{j j} - S^{k, i} {(\tilde{f})}_{j j}| ⩽ |α_{j}^{k, i} / p_{j}| \cdot |f_{j} - {\tilde{f}}_{j}|$ . Thus, ${‖Δ_{2}‖}_{2} ⩽ ({‖α^{k, i}‖}_{2} / p_{0}) {‖f - \tilde{f}‖}_{2}$ , where $p_{0} : = \min \{p_{j} : j \in ℕ_{q}\}$ . Since 𝜶^k,i and $\nabla Φ_{i} (\tilde{f})$ are bounded and $p_{0} > 0$ , there exists a constant $c_{3} > 0$ such that for all $k \in ℕ_{0}$ , $i \in ℕ_{M}$ ,

{‖\nabla Φ_{i} (\tilde{f})‖}_{2} {‖Δ_{2}‖}_{2} ⩽ c_{3} {‖f - \tilde{f}‖}_{2} .

(27)

Let L := c₂ + c₃. It follows from (25), (26) and (27) that ${‖Δ‖}_{2} ⩽ L {‖f - \tilde{f}‖}_{2}$ . That is, $S^{k, i} (f) \nabla Φ_{i} (f)$ are Lipschitz continuous with Lipschitz constants bounded above by L for all $k \in ℕ_{0}$ , $i \in ℕ_{M}$ . ■

The proof of Lemma 3 is presented as follows.

Proof: Let $b^{k, i} : = diag (α - δ^{k, i}) \nabla Φ_{i} (f^{k, i - 1})$ for $k \in ℕ_{0}$ , $i \in ℕ_{M}$ . For $f_{j}^{k, i - 1} \in (0, U / 2)$ , $i \in ℕ_{M}$ , we have that $S^{k, i} {(f^{k, i - 1})}_{j j} = (α_{j} - δ_{j}^{k, i}) f_{j}^{k, i} / p_{j}$ . By assumption, $f^{k, i} = P_{t} ({\tilde{f}}^{k, i}) = {\tilde{f}}^{k, i}$ . The definition of f^k,i yields $f_{j}^{k, i} = f_{j}^{k, i - 1} (1 - λ_{k} / p_{j} b_{j}^{k, i})$ , which implies $f_{j}^{k + 1} = f_{j}^{k} Π_{i = 1}^{M} (1 - λ_{k} / p_{j} b_{j}^{k, i})$ . By the boundedness of $b_{j}^{k, i} / p_{j}$ , we have $f_{j}^{k + 1} = f_{j}^{k} (1 - λ_{k} / p_{j} \sum_{i = 1}^{M} b_{j}^{k, i} + O (λ_{k}^{2}))$ . We next estimate $\sum_{i = 1}^{M} b^{k, i}$ . By definition, $b^{k, i} = diag (α) \nabla Φ_{i} (f^{k, i - 1}) - diag (δ^{k, i}) \nabla Φ_{i} (f^{k, i - 1})$ . To estimate the first term of the last equation, we write $\nabla Φ_{i} (f^{k, i - 1}) = Δ^{k, i} + \nabla Φ_{i} (f^{k})$ , where $Δ^{k, i} : = \nabla Φ_{i} (f^{k, i - 1}) - \nabla Φ_{i} (f^{k})$ . It follows that $\sum_{i = 1}^{M} \nabla Φ_{i} (f^{k, i - 1}) = \sum_{i = 1}^{M} Δ^{k, i} + \nabla Φ (f^{k})$ . By condition (iii), there exists c₄ > 0 such that ${‖Δ^{k, i}‖}_{2} ⩽ c_{4} {‖f^{k, i - 1} - f^{k}‖}_{2}$ . Since ${\tilde{f}}^{k, i} \in int B$ by (14) and Lemma 1, there exists a constant c₅ > 0 such that

{‖f^{k, i - 1} - f^{k}‖}_{2} ⩽ c_{5} M λ_{k} .

(28)

Hence ${‖Δ^{k, i}‖}_{2} = O (λ_{k})$ and this gives that $\sum_{i = 1}^{M} \nabla Φ_{i} (f^{k, i - 1}) = \nabla Φ (f^{k}) + O (λ_{k})$ . Moreover, by condition (iii), we have that ${‖\sum_{i = 1}^{M} diag (δ^{k, i}) \nabla Φ_{i} (f^{k, i - 1})‖}_{2} = O (δ_{k})$ . Therefore, $\sum_{i = 1}^{M} b^{k, i} = diag (α) \nabla Φ (f^{k}) + O (λ_{k}) + O (δ_{k})$ . Thus we obtain (15).

Equation (16) may be shown in a similar manner. Indeed, for $f_{j}^{k, i - 1} \in [U / 2, U)$ , $i \in ℕ_{M}$ , we have that $S^{k, i} {(f^{k, i - 1})}_{j j} = α_{j}^{k, i} (U - f_{j}^{k, i}) / p_{j}$ . The definition of f^k,i yields that $(U - f_{j}^{k + 1}) = (U - f_{j}^{k}) Π_{i = 1}^{M} (1 + λ_{k} / p_{j} b_{j}^{k, i})$ . This equation with similar arguments leads to (16). ■

The proof of Lemma 4 is presented as follows.

Proof: We first prove part (a). Let H(f) denote the Hessian matrix of Φ(f), ‘ $⊙$ ’ denote the component-wise multiplication of two vectors, and $h^{k} : = f^{k + 1} - f^{k}$ . By the Taylor expansion, we have that

Φ (f^{k + 1}) = Φ (f^{k}) + {(\nabla Φ (f^{k}))}^{⊤} h^{k} + R_{k},

(29)

where $R_{k} : = {(h^{k})}^{⊤} H (f^{k} + θ ⊙ h^{k}) h^{k}$ , for some vector $θ \in ℝ_{+}^{q}$ with $θ_{j} \in (0, 1)$ for all $j \in ℕ_{q}$ . We now estimate R_k. By condition (iii), ∇Φ(f) is Lipschitz continuous. Hence H(f) is bounded on $B$ . Then we have $|R_{k}| = O ({‖h^{k}‖}_{2}^{2})$ . We next evaluate the term h^k. For notation simplicity, we let $e^{k, i} : = S^{k, i} (f^{k}) \nabla Φ_{i} (f^{k}) - S^{k, i} (f^{k, i - 1}) \nabla Φ_{i} (f^{k, i - 1})$ , and $e^{k} : = \sum_{i = 1}^{M} e^{k, i}$ . By Lemma 1, we have ${‖ e^{k, i} ‖}_{2} ⩽ L {‖ f^{k, i - 1} - f^{k} ‖}_{2}$ . This combined with (28) implies that ${‖e^{k, i}‖}_{2} = O (λ_{k})$ , and thus ${‖e^{k}‖}_{2} = O (λ_{k})$ . By assumption, ${\tilde{f}}^{k, i} \in int B$ , from (14), we obtain that

h^{k} = - λ_{k} \sum_{i = 1}^{M} S^{k, i} (f^{k}) \nabla Φ_{i} (f^{k}) + λ_{k} e^{k} .

(30)

Let $d^{k} : = S (f^{k}) \sum_{i = 1}^{M} diag (δ^{k, i}) \nabla Φ_{i} (f^{k})$ , and $J_{k} : = {(\nabla Φ (f^{k}))}^{⊤} diag (α) S (f^{k}) \nabla Φ (f^{k})$ . Then ${‖d^{k}‖}_{2} = O (δ_{k})$ by condition (v) and Lemma 1, and J_k ⩾ 0 by the positive semi-definiteness of diag(𝜶)S(f^k). Since S^k,i(f^k) = diag(𝜶 − δ^k,i)S(f^k), then we have

\sum_{i = 1}^{M} S^{k, i} (f^{k}) \nabla Φ_{i} (f^{k}) = diag (α) S (f^{k}) \nabla Φ (f^{k}) - d^{k} .

(31)

Combining this with (30), we have

h^{k} = - λ_{k} diag (α) S (f^{k}) \nabla Φ (f^{k}) + λ_{k} (d^{k} + e^{k}) .

(32)

By the boundedness of diag(𝜶)S(f)∇Φ(f) and the norm of e^k and d^k, we have ${‖h^{k}‖}_{2} = O (λ_{k})$ and hence, $|R_{k}| = O (λ_{k}^{2})$ . This combined with (29), (32) and the boundedness of $\nabla Φ (f)$ yields that

Φ (f^{k + 1}) = Φ (f^{k}) - λ_{k} J_{k} + O (λ_{k} δ_{k}) + O (λ_{k}^{2}) .

(33)

For $s \in ℕ$ , summing both sides of (33) for k from 0 to s, we obtain that

Φ (f^{s + 1}) = Φ (f^{0}) + \sum_{k = 0}^{s} [- λ_{k} J_{k} + O (λ_{k} δ_{k}) + O (λ_{k}^{2})] .

(34)

We now prove the convergence of the right hand side of (34). By condition (vi), $\sum_{k = 0}^{\infty} λ_{k} δ^{k, i}$ is convergent, and hence $\sum_{k = 0}^{\infty} λ_{k} δ_{k}$ is convergent. We thus have two facts in hand: (i) $\sum_{k = 0}^{\infty} λ_{k}^{2} < \infty$ (by condition (iv)); and (ii) $\sum_{k = 0}^{\infty} λ_{k} δ_{k}$ is convergent. It remains to show that $\sum_{k = 0}^{\infty} λ_{k} J_{k}$ is convergent. In view of facts (i) and (ii) and the boundedness of Φ (f) on $B$ , the partial sum $\sum_{k = 0}^{s} λ_{k} J_{k}$ is bounded, which combined with its monotonicity $(λ_{k} J_{k} ⩾ 0)$ implies its convergence.

We next prove part (b). Since for each k, J_k ≥ 0, there exists a subsequence $J^{k_{n}}$ such that $\lim_{n \to \infty} J^{k_{n}} = 0$ . In fact, assume to the contrary that there exists ϵ₀ > 0 and $K_{0} \in ℕ_{+}$ such that $J_{k} ⩾ ϵ_{0}$ , for all k > K₀. Because $\sum_{k = 0}^{\infty} λ_{k} = \infty$ , by condition (iv), and λ_k > 0, we would have $\sum_{k = 0}^{s} λ_{k} J_{k} ⩾ ϵ_{0} \sum_{k = 0}^{s} λ_{k} \to \infty$ , as $s \to \infty$ , a contradiction. Moreover, since $f^{k_{n}}$ is bounded, there exists a convergent subsequence $f^{k_{n}^{'}}$ having the limit $f^{*} \in B$ . Thus, ${(\nabla Φ (f^{*}))}^{⊤} diag (α) S (f^{*}) \nabla Φ (f^{*}) = \lim_{n \to \infty} J^{k_{n}^{'}} = 0$ . Letting $r_{j} : = (\partial / \partial f_{j}) Φ (f^{*})$ and $s_{j} : = S {(f^{*})}_{j j}$ , the last equation yields $\sum_{j = 1}^{q} α_{j} s_{j} r_{j}^{2} = 0$ . Since $s_{j} ⩾ 0$ and 𝜶_j > 0, we have for all $j \in ℕ_{q}$ that $α_{j} s_{j} r_{j}^{2} ⩾ 0$ , which implies that $s_{j} r_{j} = 0$ for all $j \in ℕ_{q}$ . That is, $S (f^{*}) \nabla Φ (f^{*}) = 0$ .

Finally, we show part (c). According to [38, Page 203], it suffices to prove that for each $j \in ℕ_{q}$ , (i) if $0 < f_{j}^{*} < U$ , then $r_{j} = 0$ ; (ii) if $f_{j}^{*} = 0$ then $r_{j} ⩾ 0$ ; and if $f_{j}^{*} = U$ , then $r_{j} ⩽ 0$ . Case (i) clearly follows from part (b), from which we have s_jr_j = 0 for all $j \in ℕ_{q}$ . By the definition (8) of S(f), s_j > 0 for $0 < f_{j}^{*} < U$ , and thus, r_j = 0.

It remains to prove case (ii). To this end, we let $J_{1} : = \{j^{'} \in ℕ_{q} : f_{j^{'}}^{*} = 0, r_{j^{'}} < 0\}$ , $J_{2} : = \{j^{″} \in ℕ_{q} : f_{j^{″}}^{*} = U, r_{j^{″}} > 0\}$ , $J : = J_{1} \cup J_{2}$ and show $J = \emptyset$ . Assume to the contrary that $J \neq \emptyset$ . Then, either $J_{1} \neq \emptyset$ or $J_{2} \neq \emptyset$ . Assume $J_{1} \neq \emptyset$ ; then for any $j^{'} \in J_{1}$ , since ∇Φ(f) is continuous at f^∗, there exists δ ∈ (0, U/4) such that for all $f \in B_{δ} : = \{f \in B : {‖f - f^{*}‖}_{2} < δ\}$ , there holds $(\partial / \partial f_{j^{'}}) Φ (f) < 0$ . By Lemma 2, there exists $K_{1} \in ℕ$ such that for k > K₁, ${‖f^{k, i - 1} - f^{k}‖}_{2} < δ$ . Then for k > K₁, if $f^{k} \in B_{δ}$ , we have ${‖f^{k, i - 1} - f^{*}‖}_{2} < 2 δ < U / 2$ , and hence, $f_{j^{'}}^{k, i - 1} \in (0, U / 2)$ for all $i \in ℕ_{M}$ , $j^{'} \in J_{1}$ . In this case, $S^{k, i} {(f^{k, i - 1})}_{j^{'} j^{'}} = α_{j^{'}}^{k, i} f_{j^{'}}^{k, i} / p_{j^{'}}$ for all $i \in ℕ_{M}$ , and hence Lemma 3 ensures that (15) holds. Since $(\partial / \partial f_{j^{'}}) Φ (f^{k}) < 0$ for $f^{k} \in B_{δ}$ and $λ_{k}$ , $δ_{k} \to 0$ as $k \to \infty$ , there exists K₂ > K₁ such that if k > K₂ and $f^{k} \in B_{δ}$ , we have $f_{j^{'}}^{k + 1} > f_{j^{'}}^{k}$ for any $j^{'} \in J_{1}$ . Similarly, if $J_{2} \neq \emptyset$ , for any $j^{″} \in J_{2}$ , there exists K₃ > K₂ such that for k > K₃, if $f^{k} \in B_{δ}$ , then (16) ensures that $f_{j^{″}}^{k + 1} < f_{j^{″}}^{k}$ .

Since $\lim_{n \to \infty} f^{k_{n}^{'}} = f^{*}$ , there exists K₄ > K₃ such that if $k_{n}^{'} > K_{4}$ , then $f^{k_{n}^{'}} \in B_{δ}$ . Suppose $k_{n_{0}}^{'} > K_{4}$ for some n₀. Let $t_{n} : = \max \{k : K_{4} < k < k_{n + n_{0}}^{'}, f^{k} \notin B_{δ}\}$ . If for some n, $f^{k} \in B_{δ}$ for all $K_{4} < k < k_{n + n_{0}}^{'}$ , set t_n := K₄, and hence $t_{n} ⩾ K_{4}$ . Clearly, we have $f^{k} \in B_{δ}$ if $t_{n} + 1 ⩽ k ⩽ k_{n + n_{0}}^{'}$ . Moreover, t_n is a monotone increasing sequence. Then, either (a) $\lim_{n \to \infty} t_{n} = t_{0} ⩾ K_{4}$ , or (b) $\lim_{n \to \infty} t_{n} = + \infty$ . In case (a), $f^{k} \in B_{δ}$ for all k > t₀. Thus, for m > l > t₀ we have $f_{j^{'}}^{m} > f_{j^{'}}^{l} > 0$ for $j^{'} \in J_{1}$ . This contradicts the fact that $f_{j^{'}}^{*} = 0$ for $j^{'} \in J_{1}$ . Hence, case (b) must hold. Since $f^{k} \in B_{δ}$ for $t_{n} + 1 ⩽ k ⩽ k_{n + n_{0}}^{'}$ , we have that $f_{j^{'}}^{k_{n + n_{0}}^{'}} ⩾ f_{j^{'}}^{t_{n} + 1} > 0$ for $j^{'} \in J_{1}$ and $f_{j^{″}}^{k_{n + n_{0}}^{'}} ⩽ f_{j^{″}}^{t_{n} + 1} < U$ for $j^{″} \in J_{2}$ . It follows that $\lim_{n \to \infty} f_{j^{'}}^{t_{n} + 1} = 0$ for $j^{'} \in J_{1}$ and $\lim_{n \to \infty} f_{j^{″}}^{t_{n} + 1} = U$ for $j^{″} \in J_{2}$ . Then, Lemma 2 ensures that $\lim_{n \to \infty} f_{j^{'}}^{t_{n}} = 0$ for $j^{'} \in J_{1}$ and $\lim_{n \to \infty} f_{j^{″}}^{t_{n}} = U$ for $j^{″} \in J_{2}$ . Thus, we can find a convergent subsequence $f^{t_{n_{l}}}$ of $f^{t_{n}}$ such that $\lim_{l \to \infty} f^{t_{n_{l}}} = f^{* *}$ with $f_{j^{'}}^{* *} = 0$ for $j^{'} \in J_{1}$ and $f_{j^{″}}^{* *} = U$ for $j^{″} \in J_{2}$ . Since $f^{t_{n}} \notin B_{δ}$ , we observe that $f^{* *} \notin B_{δ}$ , which ensures that $f^{* *} \neq f^{*}$ . By part (a), $Φ (f^{k})$ is convergent, which implies that $Φ (f^{* *}) = Φ (f^{*})$ . Let $D : = \{f \in B : f_{j^{'}} = 0 \text{ for } j^{'} \in J_{1} \text{ and } f_{j^{″}} = U \text{ for } j^{″} \in J_{2}\}$ . Thus, $f^{*}, f^{* *} \in D$ . It can be verified for any $f \in D$ that $〈f - f^{*}, \nabla Φ (f^{*})〉 ⩾ 0$ . By [38, Page 203], f^∗ is a minimizer of Φ over $D$ . Hence, Φ has two different minimizers f^∗ and f^∗∗ over the convex set $D$ . This contradicts the assumption that Φ has a unique minimizer on $B$ . Thus, we have that $J = \emptyset$ . ■

Here is the proof of Theorem 8.

Proof: From Proposition 9 and Proposition 10, we have that Φ satisfies conditions (i)-(iii). One can directly obtain that the relaxation sequence λ_k satisfies condition (iv).
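The claim about condition (iv) can be spelled out with a short worked derivation. It assumes, as condition (iv) is used in the proof of Lemma 4, that the condition requires $\sum_{k} λ_{k} = \infty$ and $\sum_{k} λ_{k}^{2} < \infty$ . Both bounds follow by comparing the sums with integrals, since $λ_{0} / (a x + 1)$ is positive and decreasing in x.

```latex
\begin{align*}
\sum_{k=0}^{s} \lambda_k
  &= \sum_{k=0}^{s} \frac{\lambda_0}{a k + 1}
  \geq \int_{0}^{s+1} \frac{\lambda_0}{a x + 1}\, dx
  = \frac{\lambda_0}{a} \ln\bigl(a (s + 1) + 1\bigr)
  \to \infty \quad (s \to \infty), \\
\sum_{k=0}^{\infty} \lambda_k^{2}
  &= \lambda_0^{2} + \sum_{k=1}^{\infty} \frac{\lambda_0^{2}}{(a k + 1)^{2}}
  \leq \lambda_0^{2} + \int_{0}^{\infty} \frac{\lambda_0^{2}}{(a x + 1)^{2}}\, dx
  = \lambda_0^{2} \Bigl(1 + \frac{1}{a}\Bigr) < \infty .
\end{align*}
```

Hence the relaxation sequence is positive, tends to zero, is not summable, and is square summable for every λ₀, a > 0.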

By Theorem 7, to prove the convergence of the SDP-BSREM algorithm with λ_k and S^k,i, it is sufficient to show that λ_k and S^k,i satisfy conditions (v) and (vi). To do this, for the subiteration-dependent preconditioner $S^{k, i} (f) = diag (α_{k, i} ν^{k, i}) S (f)$ , one needs to show that $\lim_{k \to \infty} α_{k, i} = α > 0$ for all $i \in ℕ_{M}$ , and that $\sum_{k = 0}^{\infty} λ_{k} (α - α_{k, i})$ converges for all $i \in ℕ_{M}$ , since ν^k,i is a positive vector sequence and $ν^{k, i} = ν^{k_{1}, M}$ for k > k₁ and $i \in ℕ_{M}$ .

For 𝜶_k,i defined in (17), the sequence t_k,i is increasing. By induction, we have that $t_{k, i} > (k M + i) / 2$ . Further, it can be shown that $\lim_{k \to \infty} t_{k, i} / k = M / 2$ , and thus $\lim_{k \to \infty} t_{k, i} / t_{k, i + 1} = 1$ for all $i \in ℕ_{M}$ . Then we can obtain $\lim_{k \to \infty} α_{k, i} = 1 + \lim_{k \to \infty} (t_{k, i} - 1) / t_{k, i + 1} = 2 > 0$ . By computing $2 - α_{k, i} = 1 / (2 t_{k, i + 1} (\sqrt{1 + 4 t_{k, i}^{2}} + 4 t_{k, i})) + 3 / (2 t_{k, i + 1})$ , we have that $\lim_{k \to \infty} k (2 - α_{k, i}) = 3 / M$ . Thus the series $\sum_{k = 0}^{\infty} λ_{k} (2 - α_{k, i})$ converges since $\sum_{k = 1}^{\infty} 1 / k^{2} < \infty$ . Therefore, for preconditioners P1 or M1 and relaxation λ_k, conditions (v) and (vi) are satisfied.

For 𝜶_k,i defined in (18), we have that 𝜶_k,i is monotone and $\lim_{k \to \infty} α_{k, i} = ϱ > 0$ . It can be shown that $\lim_{k \to \infty} k (ϱ - α_{k, i}) = (ϱ δ_{1} - δ_{2}) / M$ . Hence for preconditioner P2 or M2 and relaxation λ_k, conditions (v) and (vi) are satisfied. ■

Contributor Information

Jianfeng Guo, School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, 510275, China.

C. Ross Schmidtlein, Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA.

Andrzej Krol, Departments of Radiology and Pharmacology, SUNY Upstate Medical University, 750 East Adams Street, Syracuse NY 13210, USA.

Si Li, School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China.

Yizun Lin, Department of Mathematics, College of Information Science and Technology, Jinan University, Guangzhou 510632, China.

Sangtae Ahn, GE Research, Niskayuna, NY 12309, USA.

Charles Stearns, GE Healthcare, Waukesha, WI 53188, USA.

Yuesheng Xu, Department of Mathematics and Statistics, Old Dominion University, Norfolk, VA 23529, USA.

References

[1] Shepp LA and Vardi Y, “Maximum likelihood reconstruction for emission tomography,” IEEE Trans. Med. Imag., vol. 1, no. 2, pp. 113–122, Oct. 1982.
[2] Lange K and Carson R, “EM reconstruction algorithms for emission and transmission tomography,” J. Comput. Assist. Tomogr., vol. 8, no. 2, pp. 306–316, 1984.
[3] Ahn S, Ross SG, Asma E, Miao J, Jin X, Cheng L, Wollenweber SD, and Manjeshwar RM, “Quantitative comparison of OSEM and penalized likelihood image reconstruction using relative difference penalties for clinical PET,” Phys. Med. Biol., vol. 60, no. 15, pp. 5733–5751, 2015.
[4] Hudson HM and Larkin RS, “Accelerated image reconstruction using ordered subsets of projection data,” IEEE Trans. Med. Imag., vol. 13, no. 4, pp. 601–609, Dec. 1994.
[5] Browne J and De Pierro AR, “A row-action alternative to the EM algorithm for maximizing likelihood in emission tomography,” IEEE Trans. Med. Imag., vol. 15, no. 5, pp. 687–699, Oct. 1996.
[6] Bertsekas DP, “A new class of incremental gradient methods for least squares problems,” SIAM J. Optim., vol. 7, no. 4, pp. 913–926, 1997.
[7] De Pierro AR and Yamagishi MEB, “Fast EM-like methods for maximum a posteriori estimates in emission tomography,” IEEE Trans. Med. Imag., vol. 20, no. 4, pp. 280–288, Apr. 2001.
[8] Ahn S and Fessler JA, “Globally convergent image reconstruction for emission tomography using relaxed ordered subsets algorithms,” IEEE Trans. Med. Imag., vol. 22, no. 5, pp. 613–626, May 2003.
[9] Tsai Y-J, Bousse A, Ehrhardt MJ, Stearns CW, Ahn S, Hutton BF, Arridge S, and Thielemans K, “Fast quasi-Newton algorithms for penalized reconstruction in emission tomography and further improvements via preconditioning,” IEEE Trans. Med. Imag., vol. 37, no. 4, pp. 1000–1010, Apr. 2018.
[10] Rudin LI, Osher S, and Fatemi E, “Nonlinear total variation based noise removal algorithms,” Phys. D, vol. 60, no. 1–4, pp. 259–268, 1992.
[11] Bredies K, Kunisch K, and Pock T, “Total generalized variation,” SIAM J. Imag. Sci., vol. 3, no. 3, pp. 492–526, 2010.
[12] Geman S and McClure DE, “Statistical methods for tomographic image reconstruction,” Bull. Int. Statist. Inst., vol. 52, no. 4, pp. 5–21, 1987.
[13] Mumcuoglu EÜ, Leahy RM, and Cherry SR, “Bayesian reconstruction of PET images: methodology and performance analysis,” Phys. Med. Biol., vol. 41, no. 9, pp. 1777–1807, 1996.
[14] Zhang Z, Ye J, Chen B, Perkins AE, Rose S, Sidky EY, Kao C-M, Xia D, Tung C-H, and Pan X, “Investigation of optimization-based reconstruction with an image-total-variation constraint in PET,” Phys. Med. Biol., vol. 61, no. 16, pp. 6055–6084, Aug. 2016.
[15] Chambolle A and Pock T, “A first-order primal-dual algorithm for convex problems with applications to imaging,” J. Math. Imag. Vis., vol. 40, no. 1, pp. 120–145, 2011.
[16] Krol A, Li S, Shen L, and Xu Y, “Preconditioned alternating projection algorithms for maximum a posteriori ECT reconstruction,” Inverse Problems, vol. 28, no. 11, p. 115005, 2012.
[17] Li S, Zhang J, Krol A, Schmidtlein CR, Vogelsang L, Shen L, Lipson E, Feiglin D, and Xu Y, “Effective noise-suppressed and artifact-reduced reconstruction of SPECT data using a preconditioned alternating projection algorithm,” Med. Phys., vol. 42, no. 8, pp. 4872–4887, 2015.
[18] Nuyts J, Bequé D, Dupont P, and Mortelmans L, “A concave prior penalizing relative differences for maximum-a-posteriori reconstruction in emission tomography,” IEEE Trans. Nucl. Sci., vol. 49, no. 1, pp. 56–60, Feb. 2002.
[19] Wang G and Qi J, “Penalized likelihood PET image reconstruction using patch-based edge-preserving regularization,” IEEE Trans. Med. Imag., vol. 31, no. 12, pp. 2194–2204, Dec. 2012.
[20] ——, “Edge-preserving PET image reconstruction using trust optimization transfer,” IEEE Trans. Med. Imag., vol. 34, no. 4, pp. 930–939, Apr. 2015.
[21] Mumcuoglu EÜ, Leahy R, Cherry SR, and Zhou Z, “Fast gradient-based methods for Bayesian reconstruction of transmission and emission PET images,” IEEE Trans. Med. Imag., vol. 13, no. 4, pp. 687–701, Dec. 1994.
[22] Lin Y, Schmidtlein CR, Li Q, Li S, and Xu Y, “A Krasnoselskii-Mann algorithm with an improved EM preconditioner for PET image reconstruction,” IEEE Trans. Med. Imag., vol. 38, no. 9, pp. 2114–2126, Sep. 2019.
[23] Ehrhardt MJ, Markiewicz P, and Schönlieb C-B, “Faster PET reconstruction with non-smooth priors by randomization and preconditioning,” Phys. Med. Biol., vol. 64, no. 22, p. 225019, Nov. 2019.
[24] Yu DF and Fessler JA, “Edge-preserving tomographic reconstruction with nonlocal regularization,” IEEE Trans. Med. Imag., vol. 21, no. 2, pp. 159–173, Feb. 2002.
[25] Nesterov YE, “A method of solving a convex programming problem with convergence rate $O(1/k^{2})$,” Sov. Math. Dokl., vol. 27, no. 2, pp. 372–376, 1983.
[26] Beck A and Teboulle M, “A fast iterative shrinkage-thresholding algorithm for linear inverse problems,” SIAM J. Imag. Sci., vol. 2, no. 1, pp. 183–202, 2009.
[27] Zeng X, Shen L, and Xu Y, “A convergent fixed-point proximity algorithm accelerated by FISTA for the $\ell_{0}$ sparse recovery problem,” in Imaging, Vision and Learning Based on Optimization and PDEs, Tai X-C, Bae E, and Lysaker M, Eds. Cham, Switzerland: Springer Int. Publishing, 2018, pp. 27–45.
[28] Kim D, Ramani S, and Fessler JA, “Combining ordered subsets and momentum for accelerated X-ray CT image reconstruction,” IEEE Trans. Med. Imag., vol. 34, no. 1, pp. 167–178, Jan. 2015.
[29] Li Q, Asma E, Ahn S, and Leahy RM, “A fast fully 4-D incremental gradient reconstruction algorithm for list mode PET data,” IEEE Trans. Med. Imag., vol. 26, no. 1, pp. 58–67, Jan. 2007.
[30] Schmidtlein CR, Lin Y, Li S, Krol A, Beattie BJ, Humm JL, and Xu Y, “Relaxed ordered subset preconditioned alternating projection algorithm for PET reconstruction with automated penalty weight selection,” Med. Phys., vol. 44, no. 8, pp. 4083–4097, 2017.
[31] Moses WW, “Fundamental limits of spatial resolution in PET,” Nucl. Instrum. Methods Phys. Res. A, Accel. Spectrom. Detect. Assoc. Equip., vol. 648, pp. S236–S240, Aug. 2011.
[32] Berthon B, Häggström I, Apte A, Beattie BJ, Kirov AS, Humm JL, Marshall C, Spezi E, Larsson A, and Schmidtlein CR, “PETSTEP: generation of synthetic PET lesions for fast evaluation of segmentation methods,” Phys. Med., vol. 31, no. 8, pp. 969–980, 2015.
[33] Ross S and Thielemans K, General Electric PET Toolbox Release 5.0, 2011–2019.
[34] American Association of Physicists in Medicine et al., PET phantom instructions for evaluation of PET image quality, 2012.
[35] MacFarlane CR, “ACR accreditation of nuclear medicine and PET imaging departments,” J. Nucl. Med. Technol., vol. 34, no. 1, pp. 18–24, Mar. 2006.
[36] Kim K, Kim D, Yang J, El Fakhri G, Seo Y, Fessler JA, and Li Q, “Time of flight PET reconstruction using nonuniform update for regional recovery uniformity,” Med. Phys., vol. 46, no. 2, pp. 649–664, 2019.
[37] Nesterov YE, Introductory Lectures on Convex Optimization: A Basic Course. New York: Kluwer, 2004.
[38] Polyak BT, Introduction to Optimization. New York, NY, USA: Optim. Softw., 1987.
