Author manuscript; available in PMC: 2016 Apr 7.
Published in final edited form as: Phys Med Biol. 2015 Mar 17;60(7):2803–2818. doi: 10.1088/0031-9155/60/7/2803

Tensor-based Dictionary Learning for Dynamic Tomographic Reconstruction

Shengqi Tan 1,2, Yanbo Zhang 3,4, Ge Wang 5, Xuanqin Mou 3, Guohua Cao 6, Zhifang Wu 1,2, Hengyong Yu 4,*
PMCID: PMC4394841  NIHMSID: NIHMS676330  PMID: 25779991

Abstract

In dynamic computed tomography (CT) reconstruction, the data acquisition speed limits the spatio-temporal resolution. Recently, compressed sensing theory has been instrumental in improving CT reconstruction from few-view projections. In this paper, we present an adaptive method to train a tensor-based spatio-temporal dictionary for sparse representation of an image sequence during the reconstruction process. The correlations among atoms and across phases are considered to capture the characteristics of an object. The reconstruction problem is solved by the alternating direction method of multipliers. To recover fine or sharp structures such as edges, the nonlocal total variation is incorporated into the algorithmic framework. Preclinical examples, including a sheep lung perfusion study and a dynamic mouse cardiac imaging study, demonstrate that the proposed approach outperforms the vectorized dictionary-based CT reconstruction in the case of few-view reconstruction.

I. Introduction

X-ray computed tomography (CT) has been developed and widely used in hospitals and clinics for diagnosis and intervention. CT image quality can be significantly degraded by respiratory motion, making accurate estimation of physiological and pathological parameters difficult and unreliable. To solve this problem, the dynamic CT imaging technique was developed as a powerful tool for noninvasive visualization and analysis of changes inside a patient [1]–[4]. During a dynamic CT imaging procedure, a sequence of projections is collected at multiple time points or different phases. However, the data acquisition speed of current CT systems usually cannot provide a sufficient number of projections for each phase to reach the Shannon/Nyquist sampling rate required by classic reconstruction methods. Over recent years, compressive sensing (CS) theory has become popular for CT reconstruction from few-view projections [5][6]. In the CS-based reconstruction framework, the key is to exploit prior information, which generally focuses on the sparsity of the object to be reconstructed. In the CT reconstruction field, the discrete gradient transform (DGT) is the most popular sparsifying transform, implicitly assuming a piecewise constant image model, and the corresponding reconstruction technique is called total variation (TV) minimization [7]–[10]. However, most medical images are not piecewise constant. Hence, the DGT cannot distinguish true structures from image noise; some fine structures may be lost or distorted, and blocky artifacts are generated in reconstructed images [11]. Herman and Davidi showed that tumor-like structures may be removed by the TV-minimization based method in few-view reconstruction [12]. This inherent defect of the TV-minimization approach motivated us to investigate other sparse representations, such as dictionary learning (DL) based options [11].
Sparse representation in terms of a redundant dictionary has attracted great interest and achieved significant improvements in denoising [13], restoration [14], image analysis [15], and magnetic resonance imaging [16]. In particular, the DL method has demonstrated promising performance for CT reconstruction in terms of preserving fine structures and suppressing image noise [11]. A redundant dictionary is an over-complete basis whose atoms are learned from application-specific training images. Representative methods to optimize a dictionary include the method of optimal directions (MOD) [17] and K-SVD [18], both of which treat signals as vectors.

Dynamic volumetric imaging involves a time dimension, so the correlation across different phases must be taken into account [1]–[4], [19]. For example, a spatio-temporal dictionary trained to encode the whole dataset is superior to a purely spatial dictionary [3][4], because the temporal coherence across respiration phases is otherwise not guaranteed [20]. Generally speaking, the dimensionality of such a dynamic dictionary is three or higher, and the tensor notion has been used as a powerful tool to deal with multidimensional data [21]–[24]. It has been shown that tensor-based DL methods effectively capture structures and sparsely represent signals [21]–[24]. Tensor tools have also been employed to deal with high-dimensional data, including multi-energy or spectral CT reconstruction [25]–[27] and 4D CT reconstruction [28], [29]. Encouraging results were reported based on prior knowledge of rank, sparsity, and structure learning.

There are two widely used tensor methods to decompose datasets: one is the Tucker decomposition, and the other is the Canonical Decomposition (CANDECOMP, abbreviated CPD), also known as Parallel Factor Analysis (PARAFAC) [21]. In the Tucker decomposition, multidimensional signals are represented by sparse core tensors with block structures. In this paper, via the Tucker decomposition, we propose a tensor-based DL method to train a 3D spatio-temporal dictionary for sparse representation during CT reconstruction [30][31]. The alternating direction method of multipliers (ADMM) scheme [32] is introduced to accelerate the iteration. To recover fine structures with sharp edges, the nonlocal means (NLM) method is also well known: Buades et al. proposed an NLM approach that achieves excellent performance for image denoising [33]. Combining the TV and NLM methods, the nonlocal total variation (NLTV) regularization was applied and proven helpful in enhancing strong local edge information in reconstructed images [34]–[36]. The key idea behind these methods is to exploit the similarity between image patches.

The rest of this paper is organized as follows. In Section II, we introduce the tensor-based DL and the spatio-temporal dictionary training. In Section III, the proposed tensor-based DL reconstruction framework is presented, including the ADMM optimization scheme, and the performances of the two DL methods (tensor- and vector-based DL) are compared. In Section IV, the improved tensor DL method with NLTV is described, and representative experimental results reconstructed from preclinical projections are reported. Finally, related issues are discussed in Section V.

II. Tensor Based Dictionary Learning

Sparse coding has attracted much attention in many fields. A representative method is sparse representation by a well-trained over-complete dictionary. In 2006, Elad et al. proposed a K-SVD algorithm to train a dictionary for image denoising [18], which was extended by Mairal et al. for color image restoration [14]. In 2012, Xu et al. developed a low-dose x-ray CT reconstruction method using a dictionary learning technique [11]. However, the input signals/patches in the aforementioned methods were rearranged as vectors. This strategy can lose or distort the inherent spatial constraints of the original structures, especially for high-dimensional images. Tensor factorization and decomposition have been popular in the signal processing community for dealing with high-dimensional signals/images, and tensor-based methods have been investigated for DL to achieve better performance in the cases of multi-dimensional signals/images [21][22][23].

Briefly speaking, a vector is a one-dimensional (1D) tensor and a matrix is a two-dimensional (2D) tensor. Thus, a three-dimensional (3D) signal $\bar{U} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$ is a 3D tensor, where $\mathbb{R}$ represents the real space and $I_n$ ($n = 1,2,3$) are the dimensions of each mode. A 3D tensor can be unfolded into a matrix $U_{(n)}$ according to a specific order, i.e. $U_{(1)} \in \mathbb{R}^{I_1 \times I_2 I_3}$, $U_{(2)} \in \mathbb{R}^{I_2 \times I_1 I_3}$ and $U_{(3)} \in \mathbb{R}^{I_3 \times I_1 I_2}$. The Tucker tensor decomposition is defined as [37][38][39]:

$\bar{U} = \bar{G} \times_1 D_1 \times_2 D_2 \times_3 D_3$, (1)

where $D_n \in \mathbb{R}^{I_n \times R_n}$ are factor matrices, $\bar{G} \in \mathbb{R}^{R_1 \times R_2 \times R_3}$ is the coefficient tensor (also known as the core tensor) reflecting the interaction between the factor matrices, $R_n$ ($n = 1,2,3$) are the dimensions of each mode of $\bar{G}$, and $\times_n D_n$ denotes the mode-$n$ product of a tensor with the matrix $D_n$ ($n = 1,2,3$). The CPD can be seen as a special case of the Tucker decomposition in which the coefficient tensor is diagonal. Given a group of 3D training tensors $\bar{U}^{(t)}$, $t = 1,2,\cdots,T$, the dictionary learning problem can be formulated as the following minimization problem:

$\arg\min_{\bar{G}^{(t)}, D_n} \sum_{t=1}^{T} \left\| \bar{U}^{(t)} - \bar{G}^{(t)} \times_1 D_1 \times_2 D_2 \times_3 D_3 \right\|_F^2 \quad \text{s.t.} \quad \left\| \bar{G}^{(t)} \right\|_0 \le K$, (2)

where $K$ represents the sparsity level, i.e. the number of nonzero entries in the coefficient tensor $\bar{G}^{(t)}$. An alternating least-squares method can be used to update one $D_n$ at a time while the other variables are fixed. For example, to update $D_1$, the objective function can be rewritten in mode-1 as follows [21]:

$\sum_{t=1}^{T} \left\| \bar{U}^{(t)} - \bar{G}^{(t)} \times_1 D_1 \times_2 D_2 \times_3 D_3 \right\|_F^2 = \left\| \left[ U_{(1)}^{(1)} \cdots U_{(1)}^{(T)} \right] - D_1 \left[ G_{(1)}^{(1)} C^{(1)} \cdots G_{(1)}^{(T)} C^{(1)} \right] \right\|_F^2$, (3)

where $C^{(1)} = (D_3 \otimes D_2)^T$ and $\otimes$ represents the Kronecker product [37]. Denoting $U_s = [U_{(1)}^{(1)} \cdots U_{(1)}^{(T)}]$ and $W = [G_{(1)}^{(1)} C^{(1)} \cdots G_{(1)}^{(T)} C^{(1)}]$, the least-squares solution for $D_1$ can be written as

$D_1 = (U_s W^T)(W W^T)^{\dagger}$, (4)

where $\dagger$ represents the Moore-Penrose pseudo-inverse of a matrix. $D_2$ and $D_3$ can be computed in the same fashion. Once the $D_n$ ($n = 1,2,3$) are learned from the training dataset, we can apply the Kronecker-OMP algorithm [40] to update $\bar{G}^{(t)}$. The pseudo-code for the Tensor-MOD DL approach is summarized in Algorithm 1. The process stops when predefined criteria are reached (e.g. a maximum number of iterations or a target representation error).
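As a concrete illustration, the mode-1 least-squares update of Eqs. (3)-(4) can be sketched in NumPy; the array sizes and random data below are our own toy assumptions. Note one convention detail: with NumPy's row-major (C-order) unfolding, the paper's $(D_3 \otimes D_2)^T$ becomes $(D_2 \otimes D_3)^T$.

```python
import numpy as np

def unfold(t, mode):
    """Mode-n unfolding: mode-n fibers become rows (C-order flattening)."""
    return np.moveaxis(t, mode, 0).reshape(t.shape[mode], -1)

# Illustrative sizes only: 4x4x4 patches, 8x8x8 core tensors, T = 50 samples.
I, R, T = (4, 4, 4), (8, 8, 8), 50
rng = np.random.default_rng(0)
U = [rng.standard_normal(I) for _ in range(T)]              # training tensors U^(t)
G = [rng.standard_normal(R) for _ in range(T)]              # core tensors G^(t)
D = [rng.standard_normal((I[n], R[n])) for n in range(3)]   # factor matrices D_n

# With row-major unfolding, U_(1) = D1 G_(1) (D2 kron D3)^T, so:
C1 = np.kron(D[1], D[2]).T                                  # C^(1) in this convention
Us = np.hstack([unfold(u, 0) for u in U])                   # [U_(1)^(1) ... U_(1)^(T)]
W = np.hstack([unfold(g, 0) @ C1 for g in G])               # [G_(1)^(1)C^(1) ...]
D1 = (Us @ W.T) @ np.linalg.pinv(W @ W.T)                   # Eq. (4)
```

For random training data the resulting $D_1$ is simply the least-squares fit; when the data are exactly Tucker-representable and $W$ has full row rank, this update recovers the generating factor matrix.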

Algorithm 1: Tensor-MOD Dictionary Learning
Input: training set of tensors $\bar{U}^{(1)}, \bar{U}^{(2)}, \cdots, \bar{U}^{(T)} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$;
sparsity level K and tolerance ε.
Output: matrices $D_n \in \mathbb{R}^{I_n \times R_n}$ ($n = 1,2,3$).
1: Initializing matrices Dn (n = 1,2,3) randomly;
2: Computing the global error $e = \sum_{t=1}^{T} \left\| \bar{U}^{(t)} - \bar{G}^{(t)} \times_1 D_1 \times_2 D_2 \times_3 D_3 \right\|_F^2$.
3: While the stopping criteria are not satisfied (e.g. e > ε) do
4:   for n = 1 to 3 do
5:     updating Dn using Eq.(4);
6:   end for
7:   Normalizing columns of matrices Dn (n = 1,2,3);
8:   Updating the sparse core tensors $\bar{G}^{(t)} \in \mathbb{R}^{R_1 \times R_2 \times R_3}$ ($t = 1,2,\cdots,T$) using the Kronecker-OMP with the sparsity level $K$;
9:   Updating global error e;
10: end while
11: Return Dn (n = 1,2,3).
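Under toy assumptions, Algorithm 1 can be sketched end to end as below. The random data, sizes, and the simple hard-thresholding of a least-squares core (a stand-in for Kronecker-OMP, which we do not reimplement) are our own simplifications; only the alternating structure and the Eq. (4) update follow the algorithm.

```python
import numpy as np

def unfold(t, mode):
    return np.moveaxis(t, mode, 0).reshape(t.shape[mode], -1)

def fold(m, mode, shape):
    rest = [shape[i] for i in range(3) if i != mode]
    return np.moveaxis(m.reshape([shape[mode]] + rest), 0, mode)

def tucker(G, D):
    """U = G x1 D1 x2 D2 x3 D3 via successive mode-n products (Eq. (1))."""
    out = G
    for n, Dn in enumerate(D):
        shape = tuple(Dn.shape[0] if i == n else out.shape[i] for i in range(3))
        out = fold(Dn @ unfold(out, n), n, shape)
    return out

def hard_threshold(G, K):
    # Stand-in for Kronecker-OMP (step 8): keep the K largest-magnitude entries.
    cut = np.partition(np.abs(G).ravel(), -K)[-K]
    return np.where(np.abs(G) >= cut, G, 0.0)

def sparse_code(U_list, D, K):
    Dp = [np.linalg.pinv(Dn) for Dn in D]
    return [hard_threshold(tucker(u, Dp), K) for u in U_list]

def tensor_mod(U_list, I, R, K, iters=10, seed=0):
    rng = np.random.default_rng(seed)
    D = [rng.standard_normal((I[n], R[n])) for n in range(3)]    # step 1
    G_list = sparse_code(U_list, D, K)
    for _ in range(iters):                                       # step 3
        for n in range(3):                                       # steps 4-6: Eq. (4)
            o = [m for m in range(3) if m != n]
            Cn = np.kron(D[o[0]], D[o[1]]).T   # row-major analogue of (D3 kron D2)^T
            Us = np.hstack([unfold(u, n) for u in U_list])
            W = np.hstack([unfold(g, n) @ Cn for g in G_list])
            D[n] = (Us @ W.T) @ np.linalg.pinv(W @ W.T)
        D = [d / (np.linalg.norm(d, axis=0) + 1e-12) for d in D]  # step 7
        G_list = sparse_code(U_list, D, K)                        # step 8
    err = sum(np.linalg.norm(u - tucker(g, D)) ** 2               # step 9
              for u, g in zip(U_list, G_list))
    return D, G_list, err
```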

III. Tensor Based DL for CT Reconstruction

3.1. Reconstruction Method

Let $E_s$ be an operator that extracts a spatio-temporal patch (a 3D tensor) of size $(s_x, s_y, s_t)$ from a dynamic image sequence $u$, where $x$ and $y$ are the two spatial dimensions and $t$ is the temporal dimension. Based on the aforementioned DL model, the dynamic CT reconstruction problem can be formulated as follows:

$\min_{u, D, \{\bar{G}^{(s)}\}} \sum_s \left( \frac{\lambda_1}{2} \left\| \bar{G}^{(s)} \times_1 D_1 \times_2 D_2 \times_3 D_3 - E_s u \right\|_2^2 + \left\| \bar{G}^{(s)} \right\|_0 \right) + \frac{\lambda_2}{2} \| A u - f \|_2^2$, (5)

where $\lambda_1$ and $\lambda_2$ are tuning parameters. While the last term enforces the data fidelity in the projection domain with a system matrix $A$ and projection measurements $f$, the first two terms correspond to the tensor-based DL model. For convenience, we use $D g_s$ to represent $\bar{G}^{(s)} \times_1 D_1 \times_2 D_2 \times_3 D_3$, so that (5) can be simplified as

$\min_{u, D, \{g_s\}} \sum_s \left( \frac{\lambda_1}{2} \| D g_s - E_s u \|_2^2 + \| g_s \|_0 \right) + \frac{\lambda_2}{2} \| A u - f \|_2^2$. (6)

To solve the minimization problem Eq. (6), an alternating direction method of multipliers (ADMM) based on the augmented Lagrangian can be applied with two auxiliary variables, $b_s = E_s u - D g_s$ and the multiplier $h_s$. The ADMM method and its variants are widely applied in the field of image processing to solve convex minimization problems [41]. The detailed iterative procedure in our framework is as follows [32]:

$\{D^{k+1}, g_s^{k+1}, u^{k+1}, b_s^{k+1}\} = \arg\min_{D, \{g_s\}, u, b} \sum_s \left( \| g_s \|_0 + \frac{\lambda_1}{2} \| b_s \|_2^2 + \frac{\beta}{2} \left\| D g_s + b_s - E_s u - h_s / \beta \right\|_2^2 \right) + \frac{\lambda_2}{2} \| A u - f \|_2^2$, (7)
$h_s^{k+1} = h_s^k - \beta \left( D^{k+1} g_s^{k+1} + b_s^{k+1} - E_s u^{k+1} \right)$. (8)

Problem (7) can be broken down into the following three subproblems.

  1. Dictionary Learning and Sparse Coding

  Fixing the other variables, we update the dictionary $D$ and the sparse codes $g_s$ to represent the spatio-temporal image $u$ sparsely and accurately:
    $\{D^{k+1}, g_s^{k+1}\} = \arg\min_{D, \{g_s\}} \sum_s \| D g_s - E_s u \|_2^2 \quad \text{s.t.} \quad \| g_s \|_0 \le K_s$. (9)
    As mentioned in Section II, the Tensor-MOD dictionary learning method can be applied to update $D$, and the Kronecker-OMP can be used to compute $g_s$ for all patches.
  2. b-subproblem

  This subproblem views $b$ as the only free variable. As a result, Problem (7) becomes:
    $b_s^{k+1} = \arg\min_{b_s} \sum_s \left( \frac{\lambda_1}{2} \| b_s \|_2^2 + \frac{\beta}{2} \left\| D^{k+1} g_s^{k+1} + b_s - E_s u^k - h_s^k / \beta \right\|_2^2 \right)$. (10)
    This least-squares problem (10) is solved by
    $b_s^{k+1} = \frac{\beta}{\lambda_1 + \beta} \left( E_s u^k - D^{k+1} g_s^{k+1} + h_s^k / \beta \right)$. (11)
  3. u-subproblem

  The optimization problem for the variable $u$ can be expressed as:
    $u^{k+1} = \arg\min_u \sum_s \frac{\beta}{2} \left\| D^{k+1} g_s^{k+1} + b_s^{k+1} - E_s u - h_s^k / \beta \right\|_2^2 + \frac{\lambda_2}{2} \| A u - f \|_2^2$. (12)
    The problem (12) can be solved by the separable paraboloid surrogate method [42]:
    $u^{k+1} = u^k - \frac{\lambda_2 A^T (A u^k - f) + \beta \sum_s E_s^T \left( E_s u^k - D^{k+1} g_s^{k+1} - b_s^{k+1} + h_s^k / \beta \right)}{\lambda_2 A^T A + \beta \sum_s E_s^T E_s}$. (13)
    The pseudo-code of the tensor-based adaptive DL for dynamic CT reconstruction is summarized in Algorithm 2.
Algorithm 2: Tensor-Based Adaptive DL for Dynamic CT Reconstruction
Input: projection data f, regularization parameters λ1, λ2, β;
patch size $I_x \times I_y \times I_t$, core tensor size $R_1 \times R_2 \times R_3$, and other parameters.
Output: reconstructed image sequences u
1: Initializing matrices Dn (n = 1,2,3), u and b = 0, h = 0;
2: While the stopping criteria are not satisfied do
3:   Constructing a dictionary Dn (n = 1,2,3) and coefficients g using the Algorithm 1;
4:   Updating b using Eq. (11);
5:   Updating u using Eq. (13);
6:   Updating h using Eq. (8);
7: end while
8: Return the final reconstruction u.
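As a quick sanity check on step 4, the closed form of the b-update can be verified numerically: for a fixed residual $r = E_s u^k - D^{k+1} g_s^{k+1} + h_s^k/\beta$ (a toy random vector here, not real CT data), the minimizer of the quadratic in Eq. (10) is $\beta/(\lambda_1+\beta)\,r$, at which the gradient vanishes.

```python
import numpy as np

# Toy verification of the closed-form b-update (Eqs. (10)-(11)).
rng = np.random.default_rng(0)
lam1, beta = 1e-4, 0.2                    # parameter values from Section 3.2
r = rng.standard_normal(64)               # stands for E_s u - D g_s + h_s/beta
b = beta / (lam1 + beta) * r              # Eq. (11)

# Gradient of (lam1/2)||b||^2 + (beta/2)||b - r||^2 at the closed-form b:
grad = lam1 * b + beta * (b - r)
```

Since the objective is strictly convex in $b_s$, a vanishing gradient confirms that Eq. (11) is indeed the unique minimizer of Eq. (10).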

The tensor dictionary is acquired adaptively in our scheme and updated during each iteration step. As discussed by Xu et al. [11], when the training images do not match a specific application closely, a pre-trained global dictionary cannot reveal some details in the reconstruction. To reconstruct images stably, the tensor dictionary is trained adaptively in the following experiments, although the additional computational cost must be taken into account. When an excellent training dataset is available, a global tensor dictionary can be prepared in advance to accelerate the iteration. Besides, how to choose the tuning parameters is a common and interesting problem for almost all regularized reconstruction algorithms. In our work, these parameters were selected empirically to balance reconstructed image quality and computational cost.

3.2. Experimental Results

Projections acquired in a low-dose (80 kV/17 mAs) sheep lung perfusion study at 20 time points (20 phases) [43] were used to evaluate and validate the proposed algorithm. For each phase, 1160 projections were collected uniformly over a 360° range, and each projection contained 672 equiangularly distributed detector elements. First, 20 ground-truth images of 512×512 pixels were reconstructed from the fully sampled data of each phase by the conventional filtered backprojection (FBP), and a global dictionary was trained by the well-known K-SVD algorithm for the vector-based DL. Readers can refer to [43] for details of the data acquisition.

Throughout our experiments, the reconstructed images were matrices of 512×512 pixels covering a 29.09×29.09 cm² region, and the sparsity constraint was enforced on the entire lung region of 500×370 pixels. The regularization parameters were set to $\lambda_1 = 10^{-4}$, $\lambda_2 = 1$ and $\beta = 0.2$. For the vector-based DL, we empirically set the number of atoms to 512 and the sparsity level $K$ to 6; the patch size was 8×8, and the image sequence was reconstructed frame by frame. For the adaptive tensor-based DL, the default patch size was 4×4×4, covering the same number of pixels as in the vector-based DL, and the sparsity level $K$ was also 6. The core tensor size was 8×8×8, and the image sequence was treated as a volume. Because this method was developed for dynamic (sparse projections) and low-dose CT (high noise level), the projections for each phase were down-sampled from 1160 to 232 and 116 views, respectively, to test the method. The convergence of the DL-based method for CT reconstruction has been discussed in [11] and is not repeated in this paper. In our experiments, the iteration number was 70, and the reconstructed images changed little with further iterations for both methods. Two indexes were introduced to quantitatively evaluate the reconstructed image quality. The first is the root mean square error (RMSE): $\mathrm{RMSE} = \sqrt{\sum_{j=1}^{J} (\hat{u}_j - u_j^r)^2 / J}$, where $u_j^r$ is the reconstructed value and $\hat{u}_j$ the ground-truth value of the $j$-th pixel, and $J$ is the number of pixels in an image. The second is the widely used image quality assessment (IQA) index for structural similarity (SSIM) [44]. The closer the SSIM value is to 1, the higher the achieved structural similarity.
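The two metrics can be computed as follows. Note that the SSIM of [44] is evaluated with local sliding windows; the single-window version below is a simplified sketch of our own (the constants $L$, $k_1$, $k_2$ follow the defaults of the original SSIM paper).

```python
import numpy as np

def rmse(ref, rec):
    """Root mean square error between the ground truth and a reconstruction."""
    return np.sqrt(np.mean((np.asarray(ref) - np.asarray(rec)) ** 2))

def ssim_global(ref, rec, L=1.0, k1=0.01, k2=0.03):
    """Single-window SSIM; values closer to 1 mean higher structural similarity."""
    c1, c2 = (k1 * L) ** 2, (k2 * L) ** 2
    mx, my = ref.mean(), rec.mean()
    vx, vy = ref.var(), rec.var()
    cov = ((ref - mx) * (rec - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

For phase-by-phase curves as in Fig. 2, these functions would simply be applied to each reconstructed frame against its fully sampled FBP reference.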

Fig. 1 shows some representative results of the sheep lung perfusion study for phase 10. In this case, the results reconstructed from the down-sampled projections were very similar, and their differences were not visually obvious. Therefore, the difference images with respect to the reference are plotted for better visualization and comparison. The quantitative evaluation results in terms of RMSE and SSIM are given in Fig. 2.

Fig. 1.


Tensor dictionary learning based reconstruction results in the sheep lung perfusion study. (a) The reference image reconstructed from a full dataset (1160 views) in a display window of [−700 HU, 800 HU]; (b) the difference image between the absolute errors (with respect to that reference) of the vector and tensor dictionary learning methods from 116 down-sampled projections, in a display window of [−5 HU, 5 HU], where the white regions show greater reconstruction errors for the vector dictionary learning method; and (c) the counterpart of (b) for 232 projections.

Fig. 2.


Quantitative evaluation in the sheep lung perfusion study in terms of (a) RMSE and (b) SSIM, where the horizontal axis indicates the phase of the dynamic reconstruction.

Evidently, the proposed tensor-based DL method achieves lower RMSEs and higher SSIMs than the vector-based DL for dynamic CT reconstruction at all phases. Since both methods follow the same reconstruction framework, it is not easy to distinguish them visually from the reconstructed images, although the images reconstructed by the tensor-based DL are closer to the ground truth. The white regions in Figs. 1(b) and (c) indicate that the images reconstructed by the tensor-based DL method are more accurate. Therefore, we conclude that the tensor-based DL outperforms the vector-based DL method.

IV. Tensor-based DL Method with Nonlocal Total Variation

4.1. Reconstruction Approach

The DL method is not always effective in reconstructing some finer structures with strong edges, especially when the chosen patch size is not suitable. The TV constraint processes the image globally, which may generate blocky artifacts. To enhance strong edge information and local fine structures, we incorporate the nonlocal total variation (NLTV) regularization [46] into the tensor-based DL reconstruction framework. Nonlocal means schemes exploit the similarity between patches in the image: each pixel is replaced by a weighted mean of all pixels, with weights given by the similarity between those pixels and the target pixel [47].

Let $\Omega \subset \mathbb{R}^2$, $x \in \Omega$, and let $u: \Omega \to \mathbb{R}$ be a real-valued function. The nonlocal gradient $\nabla_{NL} u(x)$ is defined as the vector of all partial differences $\nabla_{NL} u(x, \cdot)$ at $x$:

$\nabla_{NL} u(x, y) = (u(y) - u(x)) \sqrt{w(x, y)}, \quad y \in \Omega$, (14)

where $w$ is a nonnegative symmetric weight function. The graph divergence of a vector field $p: \Omega \times \Omega \to \mathbb{R}$ is defined through the standard adjoint relation with the nonlocal gradient operator:

$\langle \nabla_{NL} u, p \rangle = -\langle u, \mathrm{div}_{NL}\, p \rangle, \quad p: \Omega \times \Omega \to \mathbb{R}$, (15)

where the graph divergence $\mathrm{div}_{NL}$ of $p: \Omega \times \Omega \to \mathbb{R}$ is obtained by

$\mathrm{div}_{NL}\, p(x) = \int_{\Omega} \left( p(x, y) - p(y, x) \right) \sqrt{w(x, y)}\, dy$. (16)

Then the $L_1$ norm of the weighted graph gradient $\nabla_{NL} u(x)$, that is, the NLTV, is defined as

$\| \nabla_{NL} u \|_1 = \int_{\Omega} \left| \nabla_{NL} u(x) \right|\, dx$. (17)
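A discrete version of Eqs. (14) and (17) can be sketched for a small image. The patch-based Gaussian weight below is one common choice for $w$ (the text does not fix a specific weight), and the dense pairwise computation is only practical at this toy scale.

```python
import numpy as np

def nl_weights(u, patch=1, h=0.5):
    """Nonnegative symmetric weights w(x, y) from patch similarity (one common choice)."""
    H, W = u.shape
    pad = np.pad(u, patch, mode='reflect')
    # One row per pixel, one column per offset inside the (2*patch+1)^2 window.
    P = np.stack([pad[i:i + H, j:j + W].ravel()
                  for i in range(2 * patch + 1)
                  for j in range(2 * patch + 1)], axis=1)
    d2 = ((P[:, None, :] - P[None, :, :]) ** 2).sum(-1)   # pairwise patch distances
    return np.exp(-d2 / h ** 2)

def nltv(u, w):
    """Discrete NLTV (Eq. (17)): sum over x of the norm of the nonlocal gradient."""
    uf = u.ravel()
    grad = (uf[None, :] - uf[:, None]) * np.sqrt(w)       # Eq. (14), entry (x, y)
    return np.sqrt((grad ** 2).sum(axis=1)).sum()
```

As expected, a constant image has zero NLTV, while any image with intensity variation yields a positive value.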

We incorporate the NLTV into our tensor based DL method by solving the following problem:

$\min_{u, D, \{g_s\}, d} \sum_s \left( \frac{\lambda_1}{2} \| D g_s - E_s u \|_2^2 + \| g_s \|_0 \right) + \frac{\lambda_2}{2} \| A u - f \|_2^2 + \lambda_3 \| \nabla_{NL} d \|_1, \quad \text{s.t.} \quad d = u$. (18)

The scaled augmented Lagrangian function of Eq. (18) can be expressed as [4], [32]:

$L_{\beta}(D, g_s, u, d, y) = \sum_s \left( \| g_s \|_0 + \frac{\lambda_1}{2} \| D g_s - E_s u \|_2^2 \right) + \frac{\lambda_2}{2} \| A u - f \|_2^2 + \lambda_3 \| \nabla_{NL} d \|_1 + \frac{\beta}{2} \| d - u - y \|_2^2$, (19)

where $y$ is the scaled dual variable. Similar to the iterative procedure in Section III, the iterative solution of problem Eq. (19) can be summarized as follows:

$\{D^{k+1}, g_s^{k+1}\} = \arg\min_{D, \{g_s\}} \sum_s \| D g_s - E_s u^k \|_2^2 \quad \text{s.t.} \quad \| g_s \|_0 \le K_s$, (20)
$d^{k+1} = \arg\min_d \lambda_3 \| \nabla_{NL} d \|_1 + \frac{\beta}{2} \| d - u^k - y^k \|_2^2$, (21)
$u^{k+1} = \arg\min_u \sum_s \left( \frac{\lambda_1}{2} \| D^{k+1} g_s^{k+1} - E_s u \|_2^2 \right) + \frac{\lambda_2}{2} \| A u - f \|_2^2 + \frac{\beta}{2} \| d^{k+1} - u - y^k \|_2^2$, (22)
$y^{k+1} = y^k + u^{k+1} - d^{k+1}$. (23)

The solution of subproblem (20) has been discussed in Section III, and $u$ can be updated by:

$u^{k+1} = u^k - \frac{\lambda_1 \sum_s E_s^T (E_s u^k - D^{k+1} g_s^{k+1}) + \lambda_2 A^T (A u^k - f) + \beta (u^k + y^k - d^{k+1})}{\lambda_1 \sum_s E_s^T E_s + \lambda_2 A^T A + \beta I}$. (24)

Therefore, here we focus on subproblem (21). Because there is no coupling between the elements of $d$, we can apply the split-Bregman method proposed by Zhang et al. [34] to this nonlocal optimization problem; the variable $d$ is updated phase by phase during the iteration. Readers can refer to [34][46] for more details on the NLTV optimization. Because nonlocal dictionary learning has achieved good results for image denoising [48], the combination of nonlocal TV and nonlocal sub-dictionary learning will be an interesting follow-up topic.
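The workhorse inside such split-Bregman iterations is the isotropic soft-thresholding ("shrink") operator; the sketch below shows only this operator, not the full d-update of [34].

```python
import numpy as np

def shrink(z, t):
    """shrink(z, t) = z / |z| * max(|z| - t, 0), applied along the last axis.

    Vectors whose norm is below the threshold t collapse to zero; longer
    vectors keep their direction and lose t from their length."""
    norm = np.maximum(np.linalg.norm(z, axis=-1, keepdims=True), 1e-12)
    return z / norm * np.maximum(norm - t, 0.0)
```

In a split-Bregman NLTV solver, this operator would be applied to the (weighted) nonlocal gradient field at each inner iteration, with the threshold set by the ratio of the regularization and penalty parameters.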

4.2. Sheep-lung Perfusion Study

In order to evaluate the improved tensor-based DL method with NLTV, we compared the results reconstructed by the tensor-based DL with and without NLTV. The regularization parameters used in the tensor DL NLTV were $\lambda_1 = 0.2$, $\lambda_2 = 1$ and $\beta = 0.1$; the other parameters remained the same as listed in Section 3.2. The improvements in terms of difference images and quantitative evaluation are shown in Figures 3 and 4, respectively.

Fig. 3.


Representative results in the sheep lung perfusion study for phase 10. (a) The difference image between the absolute errors (relative to the reconstruction from the full dataset) of the tensor DL and tensor DL NLTV methods from 116 down-sampled projections, in a display window of [−5 HU, 5 HU], where the tensor DL NLTV method produced more details in the white regions; and (b) the counterpart of (a) for 232 projections.

Fig. 4.


Quantitative evaluation for the tensor method with/without NLTV in terms of (a) RMSE and (b) SSIM in the sheep lung perfusion study, where the horizontal axis indicates the phase of the dynamic reconstruction.

Because the datasets used in the above evaluations were acquired from a real CT scanner, measurement noise has already been included. In order to evaluate the robustness of the proposed method at different noise levels, additive white Gaussian noise was manually added to the raw projections to simulate different signal-to-noise ratios (SNR): 25 dB (high noise) and 40 dB (low noise). While the Poisson noise model is widely accepted for simulating the interactions between the imaged object and x-ray photons, the Gaussian noise model is approximately equivalent to the Poisson model with an ideal bowtie filter that equalizes the expected photon number at all detector elements. The same parameters as in Section 3 were used for the vector-based and tensor-based DL methods. For the tensor-based DL NLTV method, the regularization parameters were $\lambda_1 = 0.2$ and $\lambda_2 = 1$, and different values of $\beta$ were chosen for the different SNRs: $\beta = 0.3$ (25 dB) and $\beta = 0.1$ (40 dB).
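Adding white Gaussian noise at a prescribed SNR can be sketched as follows; defining the SNR from the mean signal power is one common convention (the text does not state its exact definition), and the data here are synthetic.

```python
import numpy as np

def add_noise_snr(proj, snr_db, seed=0):
    """Add white Gaussian noise so that 10*log10(P_signal / P_noise) = snr_db."""
    rng = np.random.default_rng(seed)
    p_signal = np.mean(np.asarray(proj, dtype=float) ** 2)
    sigma = np.sqrt(p_signal / 10 ** (snr_db / 10))
    return proj + sigma * rng.standard_normal(np.shape(proj))
```

With enough samples, the empirical SNR of the noisy projections closely matches the requested value.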

The results of phase 10 reconstructed from 116 down-sampled projections with SNR = 40 dB are shown in Fig. 5, and the corresponding magnified local regions are shown in Fig. 6. The quantitative evaluation results in terms of RMSE and SSIM are shown in Fig. 7, where the left column is for data with SNR = 25 dB and the right column for SNR = 40 dB. Although the performance of all the methods degrades as the SNR and projection data quality decrease, the tensor-based DL NLTV method outperforms the other two methods under almost all conditions, with finer structures and less noise. It can be observed that the quality degradation of the tensor-based DL NLTV is not as apparent as that of the other two methods; therefore, we can conclude that the tensor-based DL NLTV method is robust. From Figs. 5–7, it can be concluded that the tensor-based DL method outperforms the vector-based DL method, and the improved tensor-based DL method with NLTV achieves the best image quality, with the lowest RMSEs and highest SSIMs, especially in the situations of few-view and noisy projections. In Fig. 6, it can be seen that the tensor-based DL NLTV method can recover local regions with stronger edges (see the regions indicated by arrows "A" and "C") and finer details (see the region indicated by arrow "B").

The advantages of tensor-based DL and NLTV are thus naturally combined into a new framework that further improves the reconstructed images.

Fig. 5.

Reconstructed images at the phase 10 with 40dB white Gaussian noise. (a) The reference image reconstructed from the full dataset (1160 views), and (b)–(d) the reconstructed images using the vector DL, tensor DL and tensor DL NLTV methods from 116 down-sampled projections, respectively. The display window is [−700 HU, 800 HU].

Fig. 6.


Magnified portions of the corresponding rectangular regions in Figure 5, in a display window of [−650 HU, 600 HU].

Fig. 7.


Quantitative evaluation results in the sheep lung perfusion study with additive noise in terms of RMSE and SSIM. The projection SNRs for the left and right columns are 25dB and 40dB, respectively.

4.3. Mouse Cardiac Imaging Study

To demonstrate the potential application of the proposed algorithms for dynamic cardiac imaging, we collected a group of existing raw projections from a mouse cardiac CT study (2 phases). 400 projections were equiangularly distributed over the range from 0° to 199.5°. Each projection contained 1200 equidistantly distributed detector elements, and the detector size was 60 mm. The radius of the scanning trajectory was 141.53 mm. In this study, images were reconstructed in a matrix of 512×512 pixels over a 39.168×39.168 mm² region. Throughout this experiment, the parameters were chosen as $\lambda_1 = 0.2$, $\lambda_2 = 1$ and $\beta = 0.3$. For the vector-based DL method, the patch size was 5×5, the number of atoms was set to 101, and the sparsity level $K$ was set to 5. Both phases were reconstructed successively. For the tensor-based DL methods with and without NLTV, the core tensor size was 7×7×2, the patch size was 5×5×2, and the sparsity level $K$ was 5. The global dictionary for the vector-based DL was trained from images reconstructed by the FBP using the fully sampled data. The reconstructed images are shown in Fig. 8. Because the raw projections were collected over a limited angular range, the FBP results exhibit more streak artifacts, whereas the images reconstructed by the DL-based methods are much better. Meanwhile, the NLTV regularization could further enhance the edges of the reconstructed images. The results for phase 2 of the mouse cardiac CT from 200 views are shown in Fig. 9.

Fig. 8.


Reconstructed images of the mouse cardiac imaging study (phase 1). From left to right, the images were reconstructed by the FBP, vector-based DL, tensor-based DL and tensor-based DL NLTV, respectively. From top to bottom, the images were reconstructed from projection data down-sampled to 100, 200 and 400 views. The display window is [−600 HU, 1500 HU].

Fig. 9.


Reconstructed images of the mouse cardiac CT (phase 2). From left to right, the images were reconstructed by the FBP, vector based DL, tensor based DL and tensor based DL NLTV, respectively. The original projections were down-sampled to 200 views and the display window is [−600 HU, 1500 HU].

Clearly, the conventional FBP algorithm performs worst in this few-view situation, while the images reconstructed by the tensor-based DL NLTV method are much cleaner. Due to the limited number of phases in this preclinical study, the information in the temporal dimension was not fully exploited. Because the image quality of the tensor-based DL NLTV method is a tradeoff between tensor-based DL and NLTV, the shortcomings of NLTV are also introduced. Both the DL and NLTV regularizations affect the reconstructed image quality, and the tradeoff between edge sharpness and noise suppression needs to be carefully evaluated. Besides, the sizes of the core tensor and patch are quite important for the reconstruction; they need to be selected according to the scanned objects and the required spatial resolution.

V. Discussions and Conclusion

Because a tensor dictionary reflects correlations between pixels and across phases, the intrinsic features are emphasized, and more information can be exploited for image reconstruction. When few-view and/or low-dose problems are addressed, better results with more details can be reconstructed. When the dictionary learning approach is applied to CT reconstruction, atoms should match structural features as closely as possible. Therefore, tensor atoms are preferred over vector atoms, in which the temporal variation is not easy to learn.

From the experimental results in Sections III and IV, it is seen that the tensor-based dictionary is superior to the vector-based dictionary, as evidenced by the richer structural information in the reconstructed results of the former. The redundancy in the data was learned, and the sparsity constraints were imposed on the tensor dictionary adaptively to exploit both temporal correlation and spatial structures. The improvements are evident in our experiments in terms of both quantitative evaluation and visual comparison.

The proposed algorithms were solved with ADMM, which converges at a fast rate, and several popular methods were introduced to deal with the subproblems in the iterations. By incorporating the nonlocal TV regularization into the tensor-based DL method, the images reconstructed from few-view and noisy projections are refined to recover finer structures. Further research is needed to optimize the controlling parameters, especially the sizes of the patch and core tensor. The combination of nonlocal TV and nonlocal sub-dictionary learning will be an interesting topic, because both may benefit from non-locality. In the Tucker decomposition, multidimensional signals can be efficiently represented by sparse core tensors with block structures. The sparsity of the core tensor (especially in the time dimension) could be coupled with a low-rank requirement, and data-structure-oriented tensor dictionary learning (such as a quadtree decomposition) is also worth serious investigation. In the near future, we will extend this method to cardiac CT reconstruction and systematically evaluate the imaging performance through extensive numerical simulations and preclinical/clinical experiments.

Acknowledgment

This work was partially supported by the NSF CAREER Award CBET-1149679, the NSF collaborative grant DMS-1210967, and NIH/NIBIB U01 grant (EB011740).

References

  • 1. Gao H, Cai J-F, Shen Z, Zhao H. Robust principal component analysis-based four-dimensional computed tomography. Phys. Med. Biol. 2011 Jun;56(11):3181–3198. doi: 10.1088/0031-9155/56/11/002.
  • 2. Cai J-F, Jia X, Gao H, Jiang SB, Shen Z, Zhao H. Cine cone beam CT reconstruction using low-rank matrix factorization: algorithm and a proof-of-principle study. arXiv:1204.3595. 2012 Apr. doi: 10.1109/TMI.2014.2319055.
  • 3. Caballero J, Price AN, Rueckert D, Hajnal JV. Dictionary learning and time sparsity for dynamic MR data reconstruction. IEEE Trans. Med. Imaging. 2014 Apr;33(4):979–994. doi: 10.1109/TMI.2014.2301271.
  • 4. Wang Y, Ying L. Compressed sensing dynamic cardiac cine MRI using learned spatiotemporal dictionary. IEEE Trans. Biomed. Eng. 2014 Apr;61(4):1109–1120. doi: 10.1109/TBME.2013.2294939.
  • 5. Donoho DL. Compressed sensing. IEEE Trans. Inf. Theory. 2006 Apr;52(4):1289–1306.
  • 6. Candes EJ, Romberg J, Tao T. Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inf. Theory. 2006 Feb;52(2):489–509.
  • 7. Yu H, Wang G. Compressed sensing based interior tomography. Phys. Med. Biol. 2009 May;54(9):2791. doi: 10.1088/0031-9155/54/9/014.
  • 8. Yu H, Yang J, Jiang M, Wang G. Supplemental analysis on compressed sensing based interior tomography. Phys. Med. Biol. 2009 Sep;54(18):N425–N432. doi: 10.1088/0031-9155/54/18/N04.
  • 9. Xu Q, Mou X, Wang G, Sieren J, Hoffman EA, Yu H. Statistical interior tomography. IEEE Trans. Med. Imaging. 2011 May;30(5):1116–1128. doi: 10.1109/TMI.2011.2106161.
  • 10. Ritschl L, Bergner F, Fleischmann C, Kachelrieß M. Improved total variation-based CT image reconstruction applied to clinical data. Phys. Med. Biol. 2011 Mar;56(6):1545. doi: 10.1088/0031-9155/56/6/003.
  • 11. Xu Q, Yu H, Mou X, Zhang L, Hsieh J, Wang G. Low-dose X-ray CT reconstruction via dictionary learning. IEEE Trans. Med. Imaging. 2012 Sep;31(9):1682–1697. doi: 10.1109/TMI.2012.2195669.
  • 12. Herman GT, Davidi R. Image reconstruction from a small number of projections. Inverse Probl. 2008 Aug;24(4):045011. doi: 10.1088/0266-5611/24/4/045011.
  • 13. Elad M, Aharon M. Image denoising via sparse and redundant representations over learned dictionaries. IEEE Trans. Image Process. 2006 Dec;15(12):3736–3745. doi: 10.1109/tip.2006.881969.
  • 14. Mairal J, Elad M, Sapiro G. Sparse representation for color image restoration. IEEE Trans. Image Process. 2008 Jan;17(1):53–69. doi: 10.1109/tip.2007.911828.
  • 15. Wright J, Yang AY, Ganesh A, Sastry SS, Ma Y. Robust face recognition via sparse representation. IEEE Trans. Pattern Anal. Mach. Intell. 2009 Feb;31(2):210–227. doi: 10.1109/TPAMI.2008.79.
  • 16. Ravishankar S, Bresler Y. MR image reconstruction from highly undersampled k-space data by dictionary learning. IEEE Trans. Med. Imaging. 2011 May;30(5):1028–1041. doi: 10.1109/TMI.2010.2090538.
  • 17. Engan K, Aase SO, Hakon Husoy J. Method of optimal directions for frame design. In: Proc. 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing; 1999;5:2443–2446.
  • 18. Aharon M, Elad M, Bruckstein A. K-SVD: an algorithm for designing overcomplete dictionaries for sparse representation. IEEE Trans. Signal Process. 2006 Nov;54(11):4311–4322.
  • 19. Jia X, Tian Z, Lou Y, Sonke J-J, Jiang SB. Four-dimensional cone beam CT reconstruction and enhancement using a temporal nonlocal means method. Med. Phys. 2012 Sep;39(9):5592–5602. doi: 10.1118/1.4745559.
  • 20. Wu G, Wang Q, Lian J, Shen D. Estimating the 4D respiratory lung motion by spatiotemporal registration and super-resolution image reconstruction. Med. Phys. 2013 Mar;40(3):031710. doi: 10.1118/1.4790689.
  • 21. Caiafa CF, Cichocki A. Multidimensional compressed sensing and their applications. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2013 Nov;3(6):355–380.
  • 22. Duan G, Wang H, Liu Z, Deng J, Chen Y-W. K-CPD: learning of overcomplete dictionaries for tensor sparse coding. In: 2012 21st International Conference on Pattern Recognition (ICPR); 2012. pp. 493–496.
  • 23. Zubair S, Wang W. Tensor dictionary learning with sparse Tucker decomposition. In: 2013 18th International Conference on Digital Signal Processing (DSP); 2013. pp. 1–6.
  • 24. Peng Y, Meng D, Xu Z, Gao C, Yang Y, Zhang B. Decomposable nonlocal tensor dictionary learning for multispectral image denoising. CVPR. 2014.
  • 25. Li L, Chen Z, Wang G, Chu J, Gao H. A tensor PRISM algorithm for multi-energy CT reconstruction and comparative studies. J. X-Ray Sci. Technol. 2014 Jan;22(2):147–163. doi: 10.3233/XST-140416.
  • 26. Chen G-H, Tang J, Leng S. Prior image constrained compressed sensing (PICCS): a method to accurately reconstruct dynamic CT images from highly undersampled projection data sets. Med. Phys. 2008 Feb;35(2):660–663. doi: 10.1118/1.2836423.
  • 27. Gao H, Yu H, Osher S, Wang G. Multi-energy CT based on a prior rank, intensity and sparsity model (PRISM). Inverse Probl. 2011 Nov;27(11):115012. doi: 10.1088/0266-5611/27/11/115012.
  • 28. Gao H, Li R, Lin Y, Xing L. 4D cone beam CT via spatiotemporal tensor framelet. Med. Phys. 2012 Nov;39(11):6943–6946. doi: 10.1118/1.4762288.
  • 29. Zhou W, Cai J-F, Gao H. Adaptive tight frame based medical image reconstruction: a proof-of-concept study for computed tomography. Inverse Probl. 2013 Dec;29(12):125006.
  • 30. Römer F, Galdo GD. Tensor-based dictionary learning for multidimensional sparse recovery. [Online]. Available: http://workshops.fhr.fraunhofer.de/cosera/pdf/A1_3t.pdf.
  • 31. Goldstein T, Osher S. The split Bregman method for L1-regularized problems. SIAM J. Imaging Sci. 2009 Jan;2(2):323–343.
  • 32. Liu Q, Liang D, Song Y, Luo J, Zhu Y, Li W. Augmented Lagrangian-based sparse representation method with dictionary updating for image deblurring. SIAM J. Imaging Sci. 2013 Jan;6(3):1689–1718.
  • 33. Buades A, Coll B, Morel J. A review of image denoising algorithms, with a new one. Multiscale Model. Simul. 2005 Jan;4(2):490–530.
  • 34. Zhang X, Burger M, Bresson X, Osher S. Bregmanized nonlocal regularization for deconvolution and sparse reconstruction. SIAM J. Imaging Sci. 2010 Jan;3(3):253–276.
  • 35. Jung C, Ju J, Jiao L, Yang Y. Enhancing dictionary-based super-resolution using nonlocal total variation regularization. Opt. Eng. 2013;52(1):017005.
  • 36. Yang Z, Jacob M. Nonlocal regularization of inverse problems: a unified variational framework. IEEE Trans. Image Process. 2013 Aug;22(8):3192–3203. doi: 10.1109/TIP.2012.2216278.
  • 37. Kolda T, Bader B. Tensor decompositions and applications. SIAM Rev. 2009 Aug;51(3):455–500.
  • 38. Bader BW, Kolda TG. MATLAB Tensor Toolbox Version 2.5. 2012. [Online]. Available: http://www.sandia.gov/~tgkolda/TensorToolbox/.
  • 39. Bader BW, Kolda TG. Efficient MATLAB computations with sparse and factored tensors. SIAM J. Sci. Comput. 2008 Jan;30(1):205–231.
  • 40. Caiafa CF, Cichocki A. Computing sparse representations of multidimensional signals using Kronecker bases. Neural Comput. 2012 Sep;25(1):186–220. doi: 10.1162/NECO_a_00385.
  • 41. Boyd S, Parikh N, Chu E, Peleato B, Eckstein J. Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 2011 Jan;3(1):1–122.
  • 42. Elbakri IA, Fessler JA. Statistical image reconstruction for polyenergetic X-ray computed tomography. IEEE Trans. Med. Imaging. 2002 Feb;21(2):89–99. doi: 10.1109/42.993128.
  • 43. Yu H, Zhao S, Hoffman EA, Wang G. Ultra-low dose lung CT perfusion regularized by a previous scan. Acad. Radiol. 2009 Mar;16(3):363–373. doi: 10.1016/j.acra.2008.09.003.
  • 44. Wang Z, Bovik AC, Sheikh HR, Simoncelli EP. Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 2004 Apr;13(4):600–612. doi: 10.1109/tip.2003.819861.
  • 45. Zhang L, Zhang D, Mou X, Zhang D. FSIM: a feature similarity index for image quality assessment. IEEE Trans. Image Process. 2011 Aug;20(8):2378–2386. doi: 10.1109/TIP.2011.2109730.
  • 46. Gilboa G, Osher S. Nonlocal operators with applications to image processing. Multiscale Model. Simul. 2008 Nov;7(3):1005–1028.
  • 47. Buades A, Coll B, Morel J-M. A non-local algorithm for image denoising. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2005); 2005;2:60–65.
  • 48. Yan R, Shao L, Liu Y. Nonlocal hierarchical dictionary learning using wavelets for image denoising. IEEE Trans. Image Process. 2013 Dec;22(12):4689–4698. doi: 10.1109/TIP.2013.2277813.