Author manuscript; available in PMC: 2018 Apr 21.
Published in final edited form as: Phys Med Biol. 2017 Apr 21;62(8):3284–3298. doi: 10.1088/1361-6560/aa6392

Assessment of Vectorial Total Variation Penalties on Realistic Dual-Energy CT Data

David S Rigie 1, Adrian A Sanchez 1, Patrick J La Rivière 1
PMCID: PMC5575889  NIHMSID: NIHMS898593  PMID: 28350547

Abstract

Vectorial extensions of total variation have recently been developed for regularizing the reconstruction and denoising of multi-channel images, such as those arising in spectral computed tomography. Early studies have focused mainly on simulated, piecewise-constant images whose structure may favor total-variation penalties. In the current manuscript, we apply vectorial total variation to real dual-energy CT data of a whole turkey in order to determine if the same benefits can be observed in more complex images with anatomically realistic textures. We consider the total nuclear variation (TVN) as well as another vectorial total variation based on the Frobenius norm (TVF) and standard channel-by-channel total variation (TVS). We performed a series of 3D TV denoising experiments comparing the three TV variants across a wide range of smoothness parameter settings, optimizing each regularizer according to a very-high-dose “ground truth” image. Consistent with the simulation studies, we find that both vectorial TV variants achieve a lower error than the channel-by-channel TV and are better able to suppress noise while preserving actual image features. In this real data study, the advantages are subtler than in the previous simulation study, although the TVN penalty is found to have clear advantages over either TVS or TVF when comparing material images formed from linear combinations of the denoised energy images.

1. Background

Due to the recent progress in dual- and multi-energy x-ray CT, there has been growing interest in developing effective regularizers for multi-channel image reconstruction and denoising. In multi-energy scanning modes, the total x-ray dose is subdivided among two or more energy channels, thus reducing the signal-to-noise ratio (SNR) of each one. Furthermore, a pre- or post-reconstruction material decomposition step is usually performed, which greatly amplifies noise [1]. Some of these deleterious effects may be partially mitigated by including strong image priors into the reconstruction or post-processing steps. Several different approaches appear in the literature, including methods based on robust principal component analysis [2], patch-based penalties [3], and vectorial extensions of the total variation (TV) [4, 5, 6]. These varied techniques share a common goal of leveraging redundant structure across image channels (e.g. shared edges) in order to improve noise suppression while mitigating feature loss.

Previously, we applied a vectorial extension of the total variation penalty, which has been referred to as the “total nuclear variation” (TVN), to spectral CT reconstruction [6]. For clarity, we will recap several salient points from that manuscript. TVN extends the total variation to multi-channel images by penalizing the nuclear norm of the Jacobian. The resulting functional encourages multi-channel images to have shared edges and gradient vectors that point in the same direction, while being insensitive to contrast or scale differences. Furthermore, TVN is convex and reduces to the usual total variation when there is only a single image channel. In [6], we described a flexible image reconstruction framework based on a data-constrained TV minimization approach that allowed us to directly assess the performance of TVN compared to a channel-by-channel application of the conventional TV. By jointly processing several image channels with edge-coupling regularization, improved reconstructions were demonstrated from the same noisy data in a series of simulation studies with the numerical XCAT phantom. Furthermore, the TVN was applied to both energy and material images with similar gains relative to the channel-by-channel TV. Overall, TVN allowed for a greater reduction in image noise with less edge blurring due to its ability to pool edge information across all image channels.

In this paper, we will build on those findings by investigating whether the same benefits can be demonstrated on more realistic CT data. In doing so, we aim to address several limitations of the previous study. Firstly, we will evaluate the TVN on images that have a realistic texture. Although the XCAT phantom utilized in the previous study is based closely on real CT images of a human cadaver, its contrast is piecewise constant. This could be problematic for evaluating any sort of TV regularization, since TV is known to favor this kind of structure [7]. Furthermore, one of the potential benefits of the TVN is that it promotes aligned gradient vectors across energy channels, but due to its piecewise-constant nature, the XCAT phantom also trivially satisfies this condition. Other authors have applied a very similar regularizer to the joint reconstruction of PET-MR patient data with favorable results [8], but to our knowledge, no studies have reported on TVN for real multi-energy CT images.

To address the question of whether the previous findings are translatable to realistic multi-energy CT data, we will investigate the proposed regularizer on dual-energy images of a turkey acquired on a diagnostic, clinical scanner. While the previous study focused on image reconstruction, the TVN can also be applied to other image restoration problems [4, 5]. Since we did not have access to the projection data or geometry parameters of our scanner, we focus solely on denoising in this work. Because denoising is much less computationally expensive than reconstruction, this enabled us to conduct a thorough exploration of the parameter space, which was not feasible in the previous study. In addition to assessing the TVN on realistic data, we also address several other issues of practical interest. Firstly, in order to isolate the importance of edge-coupling and gradient alignment, we also study another multi-channel variant of the TV [9] that encourages joint sparsity of the gradient magnitude images, without explicit encouragement of aligned gradient vectors. Additionally, we demonstrate that the TVN can be efficiently extended to 3D image volumes, which is of practical relevance for reconstructing or denoising multi-detector and cone beam CT data.

2. Theory

2.1. Notation and definitions

In this section, we outline the notation that we will use for the remainder of the document. We adopt a somewhat unconventional set of indexing rules to maintain as much clarity as possible in describing algebraic operations on vectors with multiple dimensions of spatial and spectral information.

Single-channel images

Consider a discretized 3D image u ∈ I, where I = ℝ^{M·N·P} is a finite-dimensional vector space equipped with an inner product

\langle u, v \rangle_I = \sum_{i,j,k} u(i,j,k)\, v(i,j,k), \qquad \forall\, u, v \in I. \tag{1}

We use the convention u(i, j, k) to refer to a particular voxel, where i, j, and k specify the row, column, and slice indices, and u(i, j, k) ∈ ℝ is a discretized version of some continuous image function u(x, y, z). The integers M, N, and P specify the total number of rows, columns, and slices, respectively. The quantity ∇u is a vector in the space G = I × I × I, where the operator ∇ : I → G represents a discrete approximation to the gradient. At each voxel location we define the quantity (∇u)(i, j, k) ∈ ℝ³ as

(\nabla u)(i,j,k) = \begin{pmatrix} (\nabla u)_x(i,j,k) \\ (\nabla u)_y(i,j,k) \\ (\nabla u)_z(i,j,k) \end{pmatrix}, \tag{2}

where

(\nabla u)_x(i,j,k) = \begin{cases} u(i,j,k+1) - u(i,j,k), & k < P \\ 0, & k = P \end{cases} \tag{3}
(\nabla u)_y(i,j,k) = \begin{cases} u(i,j+1,k) - u(i,j,k), & j < N \\ 0, & j = N \end{cases} \tag{4}
(\nabla u)_z(i,j,k) = \begin{cases} u(i+1,j,k) - u(i,j,k), & i < M \\ 0, & i = M. \end{cases} \tag{5}

We also define an inner product in G,

\langle \boldsymbol{\nu}, \boldsymbol{z} \rangle_G = \sum_{\alpha \in \{x,y,z\}} \sum_{i,j,k} \nu_\alpha(i,j,k)\, z_\alpha(i,j,k), \qquad \forall\, \boldsymbol{\nu}, \boldsymbol{z} \in G. \tag{6}

Note that we use bold font to indicate that each spatial location (i, j, k) maps to a vector of gradient values. Further, we will need a discrete divergence operator div : G → I, which is chosen to be the negative transpose of the gradient operator, defined by

\langle \nabla u, \boldsymbol{z} \rangle_G = -\langle u, \operatorname{div} \boldsymbol{z} \rangle_I. \tag{7}

Lastly, we define the mixed ℓ1/ℓ2-norm in G as

\| \boldsymbol{z} \|_{1,2} = \sum_{i,j,k} \| \boldsymbol{z}(i,j,k) \|_2, \qquad \boldsymbol{z} \in G, \tag{8}

indicating that we take an ℓ1-norm over the spatial indices (i, j, k) and an ℓ2-norm of each 3-vector z(i, j, k). This mixed-norm notation is often used in the literature on sparse regression and occasionally to compactly define the isotropic total variation [10],

\mathrm{TV}(u) = \| \nabla u \|_{1,2}. \tag{9}
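As a concrete check of these definitions, the forward-difference gradient of eqs. (3)–(5), the divergence satisfying eq. (7), and the isotropic TV of eq. (9) can be sketched in NumPy. The function names and the (M, N, P) array layout are our own choices, not the authors':

```python
import numpy as np

def grad(u):
    """Forward-difference gradient of eqs. (3)-(5); returns shape (3, M, N, P),
    with zeros at the far boundary of each axis."""
    g = np.zeros((3,) + u.shape)
    g[0, :, :, :-1] = np.diff(u, axis=2)   # x: along slice index k
    g[1, :, :-1, :] = np.diff(u, axis=1)   # y: along column index j
    g[2, :-1, :, :] = np.diff(u, axis=0)   # z: along row index i
    return g

def div(z):
    """Discrete divergence, the negative transpose of grad, so that
    <grad(u), z> = -<u, div(z)> as in eq. (7)."""
    d = np.zeros(z.shape[1:])
    for a, ax in enumerate((2, 1, 0)):     # axes matching the x, y, z components
        zc = np.moveaxis(z[a], ax, -1)
        o = np.moveaxis(d, ax, -1)         # writable view into d
        o[..., 0] += zc[..., 0]
        o[..., 1:-1] += zc[..., 1:-1] - zc[..., :-2]
        o[..., -1] -= zc[..., -2]
    return d

def tv(u):
    """Isotropic TV of eq. (9): l2 norm per voxel, then l1 over voxels."""
    return np.sqrt((grad(u) ** 2).sum(axis=0)).sum()
```

Verifying the adjoint identity (7) numerically on random arrays is a useful sanity check when implementing these operators.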

Multi-channel images

Now, we consider a discrete image u ∈ 𝒥 with L spectral channels,

\boldsymbol{u}(i,j,k) = \begin{pmatrix} u_1(i,j,k) \\ u_2(i,j,k) \\ \vdots \\ u_L(i,j,k) \end{pmatrix}, \tag{10}

where 𝒥 = ℝ^{L·M·N·P}. The quantity u is a discretized version of some continuous vector field u(x, y, z), such as a multi-energy image volume. We also define an inner product in 𝒥,

\langle \boldsymbol{u}, \boldsymbol{v} \rangle_{\mathcal{J}} = \sum_{\ell=1}^{L} \sum_{i,j,k} u_\ell(i,j,k)\, v_\ell(i,j,k), \qquad \forall\, \boldsymbol{u}, \boldsymbol{v} \in \mathcal{J}, \tag{11}

where the subscript ℓ denotes a particular image channel. Since u is the discrete analog of a vector field, we can also define the discrete Jacobian, J : 𝒥 → 𝒢, which generalizes the gradient operator to vector fields. In particular we have

(J\boldsymbol{u})(i,j,k) = \begin{pmatrix} (\nabla u_1)_x(i,j,k) & (\nabla u_1)_y(i,j,k) & (\nabla u_1)_z(i,j,k) \\ (\nabla u_2)_x(i,j,k) & (\nabla u_2)_y(i,j,k) & (\nabla u_2)_z(i,j,k) \\ \vdots & \vdots & \vdots \\ (\nabla u_L)_x(i,j,k) & (\nabla u_L)_y(i,j,k) & (\nabla u_L)_z(i,j,k) \end{pmatrix}. \tag{12}

At every voxel (i, j, k), the L × 3 sub-matrix (Ju)(i, j, k) fully characterizes the first-order derivatives of u, with each row consisting of the gradient vector of one of the L image channels. The quantity Ju lies in the vector space 𝒢 = 𝒥 × 𝒥 × 𝒥 = ℝ^{(L·3)·(M·N·P)}, with an inner product

\langle \boldsymbol{V}, \boldsymbol{Z} \rangle_{\mathcal{G}} = \sum_{\ell=1}^{L} \sum_{\alpha \in \{x,y,z\}} \sum_{i,j,k} V_{\ell\alpha}(i,j,k)\, Z_{\ell\alpha}(i,j,k), \qquad \forall\, \boldsymbol{V}, \boldsymbol{Z} \in \mathcal{G}. \tag{13}

An element V ∈ 𝒢 is a discretized version of a tensor field V(x, y, z), so we use uppercase font to indicate that every spatial location (i, j, k) maps to a matrix. Again, we need an analog of the divergence operator, Div : 𝒢 → 𝒥, that is the negative transpose of J. It is constructed to satisfy

\langle J\boldsymbol{u}, \boldsymbol{Z} \rangle_{\mathcal{G}} = -\langle \boldsymbol{u}, \operatorname{Div} \boldsymbol{Z} \rangle_{\mathcal{J}}. \tag{14}

We define the mixed ℓ1/nuclear-norm as

\| \boldsymbol{Z} \|_{1,*} = \sum_{i,j,k} \| \boldsymbol{Z}(i,j,k) \|_*, \qquad \boldsymbol{Z} \in \mathcal{G}, \tag{15}

where ||Z(i, j, k)||∗ is referred to as the “nuclear norm” of the matrix Z(i, j, k) and is equal to the sum of its singular values. Additionally, we define the mixed ℓ1/Frobenius norm as

\| \boldsymbol{Z} \|_{1,F} = \sum_{i,j,k} \| \boldsymbol{Z}(i,j,k) \|_F, \qquad \boldsymbol{Z} \in \mathcal{G}, \tag{16}

where the Frobenius norm of a generic M × N matrix is given by

\| X \|_F = \sqrt{\sum_{i=1}^{M} \sum_{j=1}^{N} X_{ij}^2}. \tag{17}
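The mixed norms (15)–(16) can be evaluated directly with NumPy's batched SVD. This sketch assumes a channel-first (L, M, N, P) array layout, and the function names are ours:

```python
import numpy as np

def jacobian(u):
    """Per-voxel Jacobian of eq. (12): u of shape (L, M, N, P) is mapped to an
    array of shape (M, N, P, L, 3) holding one L-by-3 matrix per voxel."""
    J = np.zeros(u.shape[1:] + (u.shape[0], 3))
    J[:, :, :-1, :, 0] = np.moveaxis(np.diff(u, axis=3), 0, -1)  # x: slice index
    J[:, :-1, :, :, 1] = np.moveaxis(np.diff(u, axis=2), 0, -1)  # y: column index
    J[:-1, :, :, :, 2] = np.moveaxis(np.diff(u, axis=1), 0, -1)  # z: row index
    return J

def tvn(u):
    """Mixed l1/nuclear norm of Ju, eq. (15): sum of per-voxel singular values."""
    return np.linalg.svd(jacobian(u), compute_uv=False).sum()

def tvf(u):
    """Mixed l1/Frobenius norm of Ju, eq. (16)."""
    J = jacobian(u)
    return np.sqrt((J ** 2).sum(axis=(-2, -1))).sum()
```

When the channels are perfectly correlated, each per-voxel Jacobian has rank one, so its nuclear and Frobenius norms coincide and TVN equals TVF; in general TVN is never smaller.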

Denoising Model

In this section, we will describe a generic denoising model as well as the three variants of the TV that we have investigated. Our denoising model is specified by the following optimization problem:

\underset{\boldsymbol{u}}{\operatorname{argmin}}\ \frac{1}{2} \| \boldsymbol{u} - \boldsymbol{g} \|_2^2 + \lambda\, \mathrm{TV}(\boldsymbol{u}), \tag{18}

where g is the noisy image, λ controls the trade-off between data fidelity and smoothness, and TV refers generally to some variant of the total variation. The simplest way to generalize the TV to multi-channel images is to apply the standard isotropic TV separately to each energy channel; we refer to this as the channel-by-channel TV, denoted by TVS. Additionally, we can define a “vectorial” TV that simultaneously penalizes edges across all image channels, which is defined by the voxel-wise application of some matrix norm to the discrete Jacobian of the multi-channel image function. Two interesting choices are the Frobenius norm, which encourages joint sparsity in the image gradients, and the nuclear norm, which encourages joint sparsity and directional alignment of gradient vectors [5, 6]. We denote these two vectorial TV variants by TVF and TVN, respectively. All three of these regularizers are summarized in Table 1.

Table 2.

Relative error and UQI (in parenthesis) of each VTV variant at its respective optimal λ value. These correspond to the minima of the error curves in Figure 1, ε(λopt).

Regularizer 80 kVp 140 kVp
TVS 5.501e-2 (0.9981) 3.101e-2 (0.9994)
TVF 4.820e-2 (0.9985) 3.113e-2 (0.9994)
TVN 3.876e-2 (0.9990) 3.027e-2 (0.9994)

Primal-Dual Saddlepoint Problem

As in [6], the denoising model can be expressed as a saddlepoint problem, which is given by

\min_{\boldsymbol{u}} \max_{\boldsymbol{Z}}\ \langle J\boldsymbol{u}, \boldsymbol{Z} \rangle_{\mathcal{G}} - \delta_{\mathcal{K}}(\boldsymbol{Z}) + \frac{1}{2\lambda} \| \boldsymbol{u} - \boldsymbol{g} \|_2^2, \tag{19}

where δ𝒦 is a generic set indicator function defined by

\delta_{\mathcal{K}}(X) \equiv \begin{cases} 0, & X \in \mathcal{K} \\ \infty, & X \notin \mathcal{K}. \end{cases} \tag{20}

The set 𝒦 is equal to 𝒮, ℱ, or 𝒩 for TVS, TVF, and TVN, respectively. The sets 𝒮, ℱ, and 𝒩 are defined below:

\mathcal{S} = \{ \boldsymbol{Z} \in \mathcal{G} : \| \boldsymbol{Z}_\ell(i,j,k) \|_2 \le 1 \ \ \forall\, i,j,k,\ell \} \tag{21}
\mathcal{F} = \{ \boldsymbol{Z} \in \mathcal{G} : \| \boldsymbol{Z}(i,j,k) \|_F \le 1 \ \ \forall\, i,j,k \} \tag{22}
\mathcal{N} = \{ \boldsymbol{Z} \in \mathcal{G} : \sigma_{\max}(\boldsymbol{Z}(i,j,k)) \le 1 \ \ \forall\, i,j,k \}. \tag{23}

Algorithm Updates

As in [6], we solve this optimization problem using Chambolle and Pock’s first-order, primal-dual algorithm (CPPD) [11]. However, unlike the reconstruction problem described in [6], this saddlepoint problem is strongly convex in the primal variable u, with convexity constant λ⁻¹, so we can use their accelerated algorithm. This version of CPPD achieves a convergence rate of O(1/N²) rather than O(1/N). The updates are given in Algorithm 1.

The primal and dual stepsizes τ and σ must be initialized so that τσ||J||² ≤ 1, where ||J|| is the spectral norm (the maximum singular value) of the Jacobian operator J. For 3D image arrays, it can be shown that ||J|| ≤ √12 [12]. We find that convergence is always fastest when στ = 1/12; however, there still remains freedom in choosing the ratio of primal to dual stepsizes. The observed rate of convergence can be dramatically impacted by this choice, but we are unaware of any good method for selecting these parameters automatically. We have used τ = 0.05 for all of the results in this paper. We iterate until the primal-dual gap falls below 10⁻⁶, which we have found, empirically, to be a reasonable stopping point at which no appreciable voxel-value changes can be observed in the windows of interest.

Algorithm 1.

TV Denoising Update Equations

1: Initialize: σ0τ0 = 1/12
2: repeat
3: Z(k+1) = Π𝒦 (Z(k) + σk J ū(k))
4: u(k+1) = (λu(k) + τkg + λτk Div Z(k+1))/(λ + τk)
5: θk = √(λ/(λ + 2τk)), τk+1 = θkτk, σk+1 = σk/θk
6: ū(k+1) = u(k+1) + θk (u(k+1) − u(k))
7: until convergence criteria met
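For illustration, Algorithm 1 can be sketched end-to-end in NumPy for the TVF penalty, whose dual projection is the simplest. This is our own translation under an assumed channel-first (L, M, N, P) layout, not the authors' implementation; swapping in a nuclear-norm projection would give TVN instead:

```python
import numpy as np

def jac(u):
    """Per-voxel Jacobian, eq. (12): (L, M, N, P) -> (M, N, P, L, 3)."""
    J = np.zeros(u.shape[1:] + (u.shape[0], 3))
    J[:, :, :-1, :, 0] = np.moveaxis(np.diff(u, axis=3), 0, -1)
    J[:, :-1, :, :, 1] = np.moveaxis(np.diff(u, axis=2), 0, -1)
    J[:-1, :, :, :, 2] = np.moveaxis(np.diff(u, axis=1), 0, -1)
    return J

def jdiv(Z):
    """Negative adjoint of jac, eq. (14): (M, N, P, L, 3) -> (L, M, N, P)."""
    d = np.zeros((Z.shape[3],) + Z.shape[:3])
    for a, ax in enumerate((3, 2, 1)):
        z = np.moveaxis(np.moveaxis(Z[..., a], -1, 0), ax, -1)
        o = np.moveaxis(d, ax, -1)          # writable view into d
        o[..., 0] += z[..., 0]
        o[..., 1:-1] += z[..., 1:-1] - z[..., :-2]
        o[..., -1] -= z[..., -2]
    return d

def denoise_tvf(g, lam, n_iter=100, tau=0.05):
    """Accelerated CPPD for problem (18) with the Frobenius VTV (Algorithm 1)."""
    sigma = 1.0 / (12.0 * tau)              # enforce sigma * tau = 1/12
    u = g.copy()
    u_bar = g.copy()
    Z = np.zeros(g.shape[1:] + (g.shape[0], 3))
    for _ in range(n_iter):
        Z = Z + sigma * jac(u_bar)          # dual ascent step
        norms = np.sqrt((Z ** 2).sum(axis=(-2, -1), keepdims=True))
        Z /= np.maximum(norms, 1.0)         # project onto Frobenius unit balls (22)
        u_new = (lam * u + tau * g + lam * tau * jdiv(Z)) / (lam + tau)
        theta = np.sqrt(lam / (lam + 2.0 * tau))
        tau, sigma = theta * tau, sigma / theta
        u_bar = u_new + theta * (u_new - u)
        u = u_new
    return u
```

A quick sanity check: a constant input has zero Jacobian, so the dual variable never moves and the iterate stays at the data, as the model (18) predicts.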

Implementation of Projection Operators

The algorithm updates involve projecting the dual variable Z onto one of the convex sets, 𝒮, ℱ, or 𝒩, corresponding to TVS, TVF, or TVN. Pseudocode for each of these projection operators is given by Algorithm 2, Algorithm 3, and Algorithm 4, respectively. For TVS and TVF, the projection operators are very straightforward and only involve simple conditionals and vector normalization. The projection operator Π𝒩 for TVN is slightly more complicated and requires the computation of eigenvalues and eigenvectors of a real, symmetric 3×3 matrix for every image voxel. Luckily, analytic methods exist for diagonalizing 3×3 Hermitian matrices, so the computational burden is not too significant; we rely on the implementation described by Kopp [13]. We also note that for the special case of only two energy channels, there is an alternative implementation of Π𝒩 that only involves a 2 × 2 eigendecomposition, but we present this version because it generalizes to an arbitrary number of image channels, which is important for photon-counting spectral CT systems.

Algorithm 2.

Implementation of Projection Operator Π𝒮(Z)

1: for all i, j, k, ℓ do
2: x ← Zℓ(i, j, k)
3: if ||x||2 ≥ 1 then
4:   x ← x/||x||2
5: end if
6: Zℓ(i, j, k) ← x
7: end for
Algorithm 3.

Implementation of Projection Operator Πℱ(Z)

1: for all i, j, k do
2: XZ(i, j, k)
3: if ||X||F ≥ 1 then
4:   XX/||X||F
5: end if
6: Z(i, j, k) ← X
7: end for

3. Denoising Experiments

Data Acquisition

A series of scans of a frozen supermarket turkey was acquired at 80 and 140 kVp on a 256-slice multidetector CT scanner (Brilliance iCT, Philips Healthcare) in axial mode. At each kVp, we acquired both a “low-dose” scan (100 mAs) and ten high-dose scans (410 mAs for 80 kVp and 425 mAs for 120/140 kVp). The high-dose scans were repeated so that the resulting images could be averaged to create a very low-noise “ground truth” image. All 33 scans were acquired with the same table position, and no parameters other than the kVp and mAs were altered between acquisitions.

Algorithm 4.

Implementation of Projection Operator Π𝒩(Z)

1: for all i, j, k do
2: X ← Z(i, j, k)
3: (λ, V) ← Eig(XᵀX) ▷ compute eigenvalues and eigenvectors [13]
4: Σ ← diag(λ)^{1/2} ▷ singular values of X
5: if Tr (Σ) > 1 then
6:   Σp ← min(1, Σ) ▷ element-wise minimum
7:   X ← XVΣ⁺ΣpVᵀ ▷ Σ⁺ denotes the pseudoinverse of Σ
8: end if
9: Z(i, j, k) ← X
10: end for
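For illustration, the same per-voxel projection can be written with NumPy's batched SVD standing in for the analytic eigensolver of [13]; `project_nuclear_dual` is our own name for it, and it simply clips the per-voxel singular values at one:

```python
import numpy as np

def project_nuclear_dual(Z):
    """Project each per-voxel L-by-3 matrix of Z (shape (..., L, 3)) onto the
    set N of eq. (23), i.e. the sigma_max <= 1 ball, by clipping singular
    values at 1 and rebuilding the matrix."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    s = np.minimum(s, 1.0)                  # element-wise clip, as in Algorithm 4
    return (U * s[..., None, :]) @ Vt       # reassemble with the clipped spectrum
```

Matrices already inside the ball pass through unchanged, while any singular value exceeding one is reduced to exactly one.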

All of the scans were reconstructed with FBP (high-resolution mode, Filter E) via the manufacturer’s software, using identical parameters: 512 × 512 × 128 image volumes with 0.562 mm in-plane voxels and a 0.625 mm slice thickness. Linear interpolation was used along the z-axis (axial direction) to synthesize a volume with isotropic resolution (0.562 mm) and thereby avoid complications with the discrete gradient and divergence operators. Since the scan was acquired in axial mode, the slices toward the axial extremes of the acquired volume exhibited severe cone-beam artifacts. A smaller volume consisting of only the central 32 slices was extracted in order to minimize these effects.
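The axial resampling step can be sketched as follows; the function name and the (slice, row, column) array layout are assumptions, and plain vectorized linear interpolation stands in for whatever resampling routine one's pipeline provides:

```python
import numpy as np

def to_isotropic_z(vol, dz=0.625, dxy=0.562):
    """Linearly interpolate along the slice axis (axis 0) so the resampled
    spacing equals the in-plane voxel size, yielding isotropic voxels."""
    nz = vol.shape[0]
    z_new = np.arange(0, (nz - 1) * dz + 1e-9, dxy)  # new sample positions (mm)
    t = z_new / dz                                   # fractional slice index
    i0 = np.clip(np.floor(t).astype(int), 0, nz - 2)
    w = (t - i0).reshape(-1, 1, 1)                   # interpolation weights
    return (1 - w) * vol[i0] + w * vol[i0 + 1]
```

Because the interpolation is linear, a volume whose values vary linearly with z is reproduced exactly at the new sample positions.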

Comparison of TVS, TVF, and TVN

In order to compare these TV variants, we synthesized a dual-kVp image pair by extracting the same 3D volume from the 80 and 140 kVp low-dose (100 mAs) datasets. We then denoised this dual-energy data over a wide range of λ values with each of TVS, TVF, and TVN: we first identified λmin and λmax, corresponding to images that appeared under- and over-smoothed, and sampled 100 equally spaced values within this interval. Each result was then compared to the corresponding 10× averaged high-dose image, and the normalized Euclidean distance was computed as an error metric. This is defined as

\varepsilon(\lambda) = \frac{\| \boldsymbol{u}^*(\lambda) - \boldsymbol{u}_{\mathrm{ref}} \|_2}{\| \boldsymbol{u}_{\mathrm{ref}} \|_2}, \tag{24}

where u*(λ) is the denoised image at parameter value λ, and uref is the corresponding 10× high-dose reference image. In the case of TVF and TVN, the dual-energy data are processed jointly due to the inter-channel coupling of the regularizer, but we report the error metrics individually for each channel. This is shown in Figure 1.
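The parameter sweep can be sketched as below. The function names are ours, and `denoise` is a placeholder callable standing in for any of the three TV denoisers:

```python
import numpy as np

def relative_error(u, u_ref):
    """Eq. (24): normalized Euclidean distance to the high-dose reference."""
    return np.linalg.norm(u - u_ref) / np.linalg.norm(u_ref)

def sweep_lambda(denoise, g, u_ref, lam_min, lam_max, n=100):
    """Denoise g at n equally spaced lambda values and return the best
    (lambda, error) pair, i.e. the minimizer of the error curve."""
    lams = np.linspace(lam_min, lam_max, n)
    errs = [relative_error(denoise(g, lam), u_ref) for lam in lams]
    k = int(np.argmin(errs))
    return lams[k], errs[k]
```

In the paper, the reported optimal λ for each regularizer is exactly such a minimizer over 100 samples.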

Figure 1.

Figure 1

Relative error curves based on (24) are shown for the three different VTV variants studied at both 80 kVp (left) and 140 kVp (right). Optimal values of λ are identified by the minima of these curves.

In general, the error curves should not be compared at any one specific common value of λ because the functions TVS, TVF, and TVN have inherently different scales. Instead, the curves should be evaluated in terms of the possible family of solutions they each represent and, most importantly, their minimum-error solution. As can be seen in Figure 1, the channel-coupling of the VTV has little effect on the 140 kVp image; however, it does improve the accuracy of the noisier 80 kVp image, where it is clear that TVN outperforms TVF, which in turn outperforms TVS. The minima of the various error curves are reported for both the 80 and 140 kVp channels for all regularizers in Table 2. In addition to relative error, which can be thought of as a normalized root-mean-square error (RMSE), we also report the universal quality index (UQI) proposed in [14]. For two images x and y, the UQI is defined as

Table 1.

Three multi-channel variants of the TV studied in this work. Note, TVS induces no inter-channel coupling and is equivalent to separately denoising each image channel.

Regularizer Definition Edge Coupling Gradient Alignment
TVS TVS(u) = Σ_{ℓ=1}^{L} TV(u_ℓ) no no
TVF TVF(u) = ||Ju||_{1,F} yes no
TVN TVN(u) = ||Ju||_{1,∗} yes yes
\mathrm{UQI} = \frac{4\, \sigma_{xy}\, \bar{x}\, \bar{y}}{(\sigma_x^2 + \sigma_y^2)\left[ (\bar{x})^2 + (\bar{y})^2 \right]}, \tag{25}

where

\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i, \qquad \bar{y} = \frac{1}{N} \sum_{i=1}^{N} y_i,
\sigma_x^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2, \qquad \sigma_y^2 = \frac{1}{N-1} \sum_{i=1}^{N} (y_i - \bar{y})^2,
\sigma_{xy} = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y}).

In particular, the UQI is computed between the high-dose reference images and the denoised images and reported in Table 2 in parentheses. The dynamic range of the UQI is [−1, 1], where higher values are better.
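A direct NumPy transcription of (25), assuming the unbiased (N − 1) sample statistics shown above; the function name is ours:

```python
import numpy as np

def uqi(x, y):
    """Universal quality index of eq. (25) between two images x and y."""
    x = np.asarray(x, dtype=float).ravel()
    y = np.asarray(y, dtype=float).ravel()
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(ddof=1), y.var(ddof=1)            # unbiased variances
    cxy = ((x - mx) * (y - my)).sum() / (x.size - 1) # unbiased covariance
    return 4.0 * cxy * mx * my / ((vx + vy) * (mx ** 2 + my ** 2))
```

By construction the index equals 1 only when the two images are identical, and degrades as their means, variances, or correlation diverge.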

Figure 2 shows a comparison of the denoised 80 kVp images across the different VTV variants, each shown at its optimal λ value (the minimizer of the error curve); Figure 3 shows the same comparison for an ROI. It can be seen that the channel-coupling does improve the preservation of details and the suppression of noise. The differences between TVF and TVN are quite subtle, with TVN arguably having slightly cleaner edges, but the improvement over channel-by-channel TV (TVS) is somewhat more pronounced. Figures 4 and 5 show the corresponding results for the 140 kVp channel. In that case, differences between regularizers are practically indiscernible. This observation is consistent with our previous findings that the channel-coupling effect of these VTV penalties is primarily helpful to the noisier image channels because it allows edge information to be transferred from the less noisy channel.

Figure 2.

Figure 2

Reconstructed 80 kVp images corresponding to low-dose data (a), 10x averaged high-dose data (b) and denoised low-dose images with TVS, TVF, and TVN regularization, respectively (c–e). The parameter λ is selected to optimize each regularizer according to figure 1, and the display window is [−200 HU, 200 HU].

Figure 3.

Figure 3

ROI from reconstructed 80 kVp images corresponding to low-dose data (a), 10x averaged high-dose data (b) and denoised low-dose images with TVS, TVF, and TVN regularization, respectively (c–e). The parameter λ is selected to optimize each regularizer according to figure 1, and the display window is [−200 HU, 200 HU].

Figure 4.

Figure 4

Reconstructed 140 kVp images corresponding to low-dose data (a), 10x averaged high-dose data (b) and denoised low-dose images with TVS, TVF, and TVN regularization, respectively (c–e). The parameter λ is selected to optimize each regularizer according to figure 1, and the display window is [−200 HU, 200 HU].

Figure 5.

Figure 5

ROI from reconstructed 140 kVp images corresponding to low-dose data (a), 10× averaged high-dose data (b) and denoised low-dose images with TVS, TVF, and TVN regularization, respectively (c–e). The parameter λ is selected to optimize each regularizer according to figure 1, and the display window is [−200 HU, 200 HU].

4. Water-Subtracted Images

In principle, one can use dual-energy CT measurements to solve for quantitative material density maps; however, this sort of material decomposition requires either accurate knowledge of the input spectrum or a significant amount of calibration data. Alternatively, one can simply take linear combinations of the reconstructed 80 and 140 kVp images, where the coefficients are chosen to cancel out a given material [15]. For example, merely subtracting the 140 kVp image from the 80 kVp image should approximately cancel out any voxels that have an effective atomic number very similar to water, provided that the original images have already been converted to Hounsfield units (HU). Though the resultant image lacks quantitative accuracy, it is qualitatively very similar to a bone density image.
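A minimal sketch of the water-subtraction step, assuming both inputs are already in HU; the weight `w` is a hypothetical tunable coefficient (the text uses a plain difference, i.e. w = 1):

```python
import numpy as np

def water_subtract(hu_80, hu_140, w=1.0):
    """Linear combination of HU images chosen to cancel water-like voxels.
    Water is ~0 HU in both channels, so a plain difference suppresses it
    while bone, whose HU differs strongly between kVps, survives."""
    return hu_80 - w * hu_140
```

Voxels near 0 HU in both channels map to approximately zero, while materials with kVp-dependent attenuation (such as bone) retain contrast.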

After denoising the low-dose 80/140 kVp images with the various regularizers (TVS, TVF, and TVN), we formed water-subtracted images from each image pair. We find that the differences between the various regularizers are far more pronounced in this water-subtracted image than in either the 80 or 140 kVp images, as can be seen in Figures 7 and 8. Compared with TVS, TVN does a better job of preserving fine details in the bony structures while also achieving a less noisy texture. Visually, TVN also appears to slightly outperform TVF in this regard, indicating that the effect of encouraging aligned gradient vectors across image channels may be important, despite the fact that TVF and TVN show barely any visible differences in the raw 80/140 kVp images. These differences are quantified and confirmed in Table 3, wherein the contrast-to-noise ratio (CNR) is computed between an inclusion in the vertebral structure and the nearby bone for each water-subtracted image. We define CNR as

Figure 7.

Figure 7

Water-subtracted images corresponding to low-dose data (a), 10× averaged high-dose data (b) and denoised low-dose images with TVS, TVF, and TVN regularization, respectively (c–e). The parameter λ is selected to optimize each regularizer according to figure 1.

Figure 8.

Figure 8

ROI from water-subtracted images corresponding to low-dose data (a), 10× averaged high-dose data (b) and denoised low-dose images with TVS, TVF, and TVN regularization, respectively (c–e). The parameter λ is selected to optimize each regularizer according to figure 1.

Table 3.

A comparison of the contrast-to-noise ratio (CNR) between an inclusion in the vertebral structure and the adjacent solid bone across TVS, TVF, and TVN in the water-subtracted image.

Image CNR
Low Dose (Original) 2.4
10× Dose (Reference) 6.0
TVS 3.0
TVF 3.2
TVN 5.7
\mathrm{CNR} = \sqrt{\frac{2\left( \overline{\mathrm{HU}}_{\mathrm{bone}} - \overline{\mathrm{HU}}_{\mathrm{inclusion}} \right)^2}{\sigma_{\mathrm{bone}}^2 + \sigma_{\mathrm{inclusion}}^2}}, \tag{26}

where \overline{HU}_{inclusion} is the mean HU in the ROI shown in Figure 6 and \overline{HU}_{bone} is the mean HU in an identically sized ROI placed over bone; σ²_{inclusion} and σ²_{bone} are the HU variances within these ROIs.
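The CNR computation of (26) reduces to a few lines of NumPy, given the two ROI arrays; the function name and argument order are our own:

```python
import numpy as np

def cnr(roi_inclusion, roi_bone):
    """Eq. (26): contrast-to-noise ratio between the inclusion ROI and an
    equally sized bone ROI, using unbiased sample variances."""
    d = roi_bone.mean() - roi_inclusion.mean()
    var_sum = roi_bone.var(ddof=1) + roi_inclusion.var(ddof=1)
    return np.sqrt(2.0 * d ** 2 / var_sum)
```

Note that the factor of 2 makes the denominator an average of the two ROI variances rather than their sum.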

Figure 6.

Figure 6

The ROI placed over the inclusion is depicted by the yellow rectangle.

As in Table 2, we also report the relative error and UQI metrics for the water-subtracted images in Table 4.

Table 4.

Relative error and UQI of each VTV variant at its respective optimal λ value for the water-subtracted images.

Regularizer Relative Error Universal Quality Index (UQI)
TVS 0.6260 0.8044
TVF 0.3768 0.8885
TVN 0.3261 0.9258

5. Discussion

In this paper, we assessed the utility of the vectorial TV for realistic CT images. Consistent with [6], we found that the edge-coupling properties of TVN can help to preserve details and also improve quantitative accuracy, compared with channel-by-channel TV. The advantages were most striking when examining material images formed as linear combinations of the denoised energy images. In this study, the benefits were more subtle than in the reconstruction study utilizing the numerical XCAT phantom. This is not particularly surprising since that phantom is piecewise constant, which makes it possible to operate in a regime of very strong regularization.

In addition, we looked at TVF, which also encourages common edges but is ambivalent to gradient-vector alignment between image channels. Although TVN led to more accurate images (relative to the reference images) and arguably cleaner edges, especially for material images, TVF has some practical advantages, as it is simpler to implement and slightly less computationally expensive. However, we have demonstrated that TVN can be implemented efficiently in 3D for an arbitrary number of energy channels using the analytic matrix diagonalization method described in [13]. For this denoising study, TVN involves slightly longer computation times than TVF or TVS (roughly 20% longer run time in our implementation), but this difference is negligible for image reconstruction because the projection and backprojection steps are the dominant factors determining the per-iteration runtime.

Interestingly, the benefits of the inter-channel coupling were most apparent when there was a very large disparity in the noise levels between the low- and high-energy image channels. Intuitively, this makes sense because both TVF and TVN allow edge information to be transferred from the low-noise image into the noisy image. This could be useful for making multi-energy CT acquisitions more robust to scenarios where certain channels receive low count rates, such as dual-kVp imaging of obese patients.

6. Conclusion

The edge-coupling properties of TVF and TVN lead to improved image quality and accuracy compared to the conventional total variation when denoising multi-channel images. On real data, with realistic anatomical texture and appearance, the improvement is not as pronounced as it was in previous numerical experiments with piecewise-constant phantoms [6]. However, TVN consistently outperforms TVF and TVS in terms of root-mean-square error, universal quality index, and subjective visual assessment.

Acknowledgments

This work was supported in part by Toshiba Medical Research Institute USA, NIH grant R01EB017293, the National Institute of Biomedical Imaging and Bioengineering of the National Institutes of Health under grant number T32 EB002103, and the National Institute of General Medical Sciences of the National Institutes of Health under Award Number R25GM109439 (Project Title: University of Chicago Initiative for Maximizing Student Development (IMSD)). Partial funding for the computation in this work was provided by NIH Grant Nos. S10 RRO21039 and P30 CA14599. Its contents are solely the responsibility of the authors and do not necessarily represent the official views of the National Institutes of Health or Toshiba Medical Research Institute USA.

References
