Author manuscript; available in PMC: 2011 Jul 20.
Published in final edited form as: Inf Process Med Imaging. 2011;22:550–561. doi: 10.1007/978-3-642-22092-0_45

Nonnegative Factorization of Diffusion Tensor Images and Its Applications

Yuchen Xie, Jeffrey Ho, Baba C Vemuri
PMCID: PMC3140004  NIHMSID: NIHMS301207  PMID: 21761685

Abstract

This paper proposes a novel method for computing linear basis images from tensor-valued image data. As a generalization of nonnegative matrix factorization, the proposed method aims to approximate a collection of diffusion tensor images using nonnegative linear combinations of basis tensor images. An efficient iterative optimization algorithm is proposed to solve this factorization problem. We present two applications: the DTI segmentation problem and a novel approach to discover informative and common parts in a collection of diffusion tensor images. The proposed method has been validated using both synthetic and real data, and experimental results have shown that it offers a competitive alternative to current state-of-the-art methods in terms of accuracy and efficiency.

1 Introduction

In this paper, we introduce the novel notion of nonnegative factorization of tensor fields and its application to the segmentation of diffusion tensor images (DTI). Tensor images (or tensor fields) abound in medical imaging applications, with well-known examples such as DTIs, conductivity tensor images (CTI) and elasticity tensors in Elastography. There is an increasing demand for a principled and versatile method that can automatically extract important features from single as well as multiple tensor images, and this paper takes a step forward in this direction by investigating the general factorization problem for tensor fields. Specifically, we formulate the factorization as an optimization problem, and as the main technical contribution, we present an iterative algorithm that can efficiently and reliably optimize the objective function and compute good-quality factorizations.

Matrix factorization in the form of factoring a data matrix V into a product of the basis matrix W and the coefficient matrix H appears frequently in computer vision and image processing. Examples include principal component analysis (PCA) and the Tomasi-Kanade factorization for multi-view 3D structure recovery. Algebraically, the factorization attempts to “discover” the linear subspace spanned by the columns of W that can be used to approximate the data given as the columns of V. Instead of the full subspace, the linear model can be further constrained by requiring that the data vectors belong to (or can be approximated by) the cone generated by the columns of W. Since a point in a cone can be written as a nonnegative linear combination of the cone’s generators, this gives rise to the nonnegative matrix factorization (NMF) first introduced in [7]. Specifically, using images as data, the matrices appearing in an NMF are all required to have nonnegative components. While the unconstrained matrix factorization admits a comparatively straightforward solution using linear algebra, nonnegative matrix factorization is generally more difficult to solve, as its objective function is non-convex and there exist no known linear algebraic methods that can compute its solution in closed form. Nevertheless, the optimization is not particularly involved, as the nonnegative constraint can be easily enforced, and NMF is almost always solved using an alternating sequence of nonnegative linear least squares problems with variables in W and H separately. More importantly, the nonnegativity constraint has been argued in the original paper [7] to be more capable, compared with the unconstrained factorization, of capturing features that are common among the data. In particular, the columns of the basis matrix W can be considered as the discovered common parts among the input data, and the coefficient matrix H provides the weights for reconstructing each input datum using these parts.
The power and versatility of NMF have been demonstrated to various degrees by a wide range of successful applications that include data clustering [2], parts discovery [9] and MR image analysis [6].

The proposed nonnegative factorization of tensor-valued images is an extension of the nonnegative matrix factorization method above. In our context, components in the matrices V and W are represented as symmetric positive semi-definite (PSD) tensors of rank two and H is the coefficient matrix with nonnegative real components. In particular, the nonnegativity constraint in our more general setting has now acquired two different realizations: the nonnegative constraint on the components of H, as in the original NMF, and the generalized nonnegative constraint on the components of W, namely that they belong to the space PSD(n) of n × n symmetric positive semi-definite matrices. Geometrically, this corresponds to replacing the nonnegative real line (intensity values) with the space PSD(n) (diffusion tensors), and it is their respective cone structures that permit us to formulate nonnegative factorization using these spaces. While our extension is easy to understand conceptually, the resulting optimization problem in our case is considerably more difficult to solve because of the generalized nonnegative constraint on W and the dimension of PSD(3). The former requires generalized matrix inequalities to enforce the constraint [1] and the latter introduces a large number of variables in the objective function. Therefore, a major portion of this paper is devoted to an optimization algorithm that can reliably and efficiently compute the factorization.

Having overcome this computational hurdle, we will next show that the proposed tensor image factorization method can be successfully applied to segmentation problems for single and multiple DTI images. For a single image, the data matrix V is a 1 × n array of PSD tensors, and a direct clustering on the coefficient matrix H gives the segmentation. For multiple tensor images, the basis matrix W gives, as before, a “part-decomposition” of the collection, and the common parts given as the columns of W can usually be realized as tensor fields with local support. In the current medical imaging literature, algorithms that segment DTI images can be broadly classified into two categories: the level-set based methods (e.g., [16, 8]) and the methods based on combinatorial optimization such as graph cuts [17]. In terms of its underlying motivation and numerics, our factorization-based method offers a completely different approach to the segmentation problem. We have validated the proposed method using both synthetic and real tensor images. Preliminary segmentation results from single and multiple DTI images indicate that the proposed factorization-based approach is a viable and competitive alternative method for segmenting tensor images.

2 Preliminaries

Given an n × m non-negative matrix V, each of whose columns $v_i$ represents an input image, non-negative matrix factorization (NMF) [7] attempts to factor it into two matrices with non-negative components, $V \approx WH$, where W denotes the n × r basis matrix and H denotes the r × m coefficient matrix. According to the geometric interpretation elucidated in [3], NMF determines a cone $\Sigma_W = \{x \mid x = \sum_{j=1}^{r} h_j w_j,\ h_j \ge 0\}$ that approximates the input images with the basis elements (or generators of $\Sigma_W$) $w_j$ given by the j-th column of W.
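As a point of reference, the scalar NMF described above can be sketched with alternating nonnegative least squares. This is a minimal illustration rather than the algorithm of [7]; the function name `nmf_anls`, the random initialization and the fixed iteration count are assumptions made here for the example.

```python
import numpy as np
from scipy.optimize import nnls


def nmf_anls(V, r, n_iter=50, seed=0):
    """Approximate a nonnegative n x m matrix V by W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, r))   # hypothetical random start for the basis
    H = np.zeros((r, m))
    for _ in range(n_iter):
        # Fix W: each column of H is an independent NNLS problem.
        for i in range(m):
            H[:, i], _ = nnls(W, V[:, i])
        # Fix H: each row of W is an NNLS problem in the transposed system.
        for k in range(n):
            W[k, :], _ = nnls(H.T, V[k, :])
    return W, H
```

Every column of `V` is then approximated by a point of the cone generated by the columns of `W`, with the cone coordinates stored in the matching column of `H`.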

For tensor-valued images, there is a d × d symmetric, positive semi-definite matrix associated with each pixel (or voxel); d = 3 for diffusion tensor images. The mathematics for nonnegative factorization of tensor images is straightforward, as it only requires a cone structure that is provided by PSD(d), the space of d × d symmetric positive semi-definite matrices. A collection of m tensor images of size n can be arranged into a block matrix $V = \begin{pmatrix} V_{11} & \cdots & V_{1m} \\ \vdots & \ddots & \vdots \\ V_{n1} & \cdots & V_{nm} \end{pmatrix}$, where $V_{ki} \in PSD(d)$, k = 1, …, n and i = 1, …, m. Each of the m columns represents one tensor image. As before, nonnegative factorization for tensor images attempts to factor V into a product of the basis matrix W and the coefficient matrix H, whose elements are non-negative real numbers:

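$$V \approx W * H = \begin{pmatrix} W_{11} & \cdots & W_{1r} \\ \vdots & \ddots & \vdots \\ W_{n1} & \cdots & W_{nr} \end{pmatrix} * \begin{pmatrix} h_{11} & \cdots & h_{1m} \\ \vdots & \ddots & \vdots \\ h_{r1} & \cdots & h_{rm} \end{pmatrix}, \tag{1}$$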

where the blocks Wij in W are matrices in PSD(d), and the blockwise product * is defined as

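$$W * H = \begin{pmatrix} \sum_{j=1}^{r} W_{1j} h_{j1} & \cdots & \sum_{j=1}^{r} W_{1j} h_{jm} \\ \vdots & \ddots & \vdots \\ \sum_{j=1}^{r} W_{nj} h_{j1} & \cdots & \sum_{j=1}^{r} W_{nj} h_{jm} \end{pmatrix}. \tag{2}$$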
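The blockwise product can be written as a single tensor contraction. The sketch below assumes a storage layout chosen here for illustration: `W` is an array of shape (n, r, d, d) and `H` an array of shape (r, m); the function name `block_product` is likewise hypothetical.

```python
import numpy as np


def block_product(W, H):
    """Blockwise product W * H of equation (2).

    W: array of shape (n, r, d, d); block W[k, j] is a d x d PSD matrix.
    H: array of shape (r, m) with nonnegative entries.
    Returns an array of shape (n, m, d, d) whose block [k, i] equals
    sum_j W[k, j] * H[j, i].
    """
    return np.einsum("kjab,ji->kiab", W, H)
```

With d = 1 the contraction is the ordinary matrix product, which matches the remark that the factorization reduces to the usual NMF in that case.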

r in the above equation is the number of basis elements (generators) used in the factorization, and it is clear that our nonnegative factorization reduces to the usual NMF when d = 1. We remark that the nonnegative factorization can be considered as a generative model for the collection of input tensor images as each input tensor image is approximated by a non-negative linear combination of r columns of W, each of which can be considered as a tensor image. To determine W, H from the data matrix V, we formulate a constrained optimization problem that minimizes the cost function

$$E(W, H) = \frac{1}{2} \sum_{i=1}^{m} \sum_{k=1}^{n} \Big\| V_{ki} - \sum_{j=1}^{r} W_{kj} h_{ji} \Big\|_F^2 \tag{3}$$

with the constraints Wkj ≽ 0 and hji ≥ 0, i = 1, …, m, j = 1, …, r and k = 1, …, n. ≽ denotes matrix inequality and ||·||F denotes the Frobenius norm.
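The cost function and its constraints are easy to evaluate numerically. The following sketch assumes the same illustrative array layout as before (V of shape (n, m, d, d), W of shape (n, r, d, d), H of shape (r, m)); the names `cost` and `is_feasible` and the tolerance `tol` are assumptions of this example.

```python
import numpy as np


def cost(V, W, H):
    """E(W, H) of equation (3)."""
    residual = V - np.einsum("kjab,ji->kiab", W, H)
    return 0.5 * np.sum(residual ** 2)  # sum of squared Frobenius norms


def is_feasible(W, H, tol=1e-10):
    """Check that every block W[k, j] is PSD and that H >= 0 entrywise."""
    eigenvalues = np.linalg.eigvalsh(W)  # batched over the leading axes
    return bool(eigenvalues.min() >= -tol and H.min() >= -tol)
```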

3 Algorithm and implementation details

In this section, we present an efficient algorithm that solves the constrained optimization defined above. While the objective function E(W, H) is not jointly convex, it is convex with respect to each of the two block variables W and H when the other is held fixed. A common approach to solve this type of constrained optimization problem is the block coordinate descent method [10], and in particular, we can alternately fix one block variable and improve the other as shown in Algorithm 1. Grippo and Sciandrone [5] have shown that every limit point of the sequence $\{W^t, H^t\}$ generated by Algorithm 1 is a stationary point of the optimization problem defined in (3). Since both sub-problems are convex, our algorithm can easily be shown to be provably convergent. The details for solving the two sub-problems efficiently will be discussed below.

Algorithm 1.

Alternating Non-negative Factorization

Initialize $H^1 \ge 0$.
For t = 1, 2, …
 - $W^{t+1} = \arg\min_W E(W, H^t)$, s.t. $W_{kj} \succeq 0$, ∀k, j.
 - $H^{t+1} = \arg\min_H E(W^{t+1}, H)$, s.t. $H \ge 0$.
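The alternating scheme of Algorithm 1 can be sketched end to end as follows. Two substitutions are made here and are not the paper's method: the W-step uses projected gradient descent with eigenvalue clipping in place of the interior-point solver, and the H-step calls SciPy's `nnls` in place of the fast active-set method of [15]. All function names, the array layout and the iteration counts are assumptions of this example.

```python
import numpy as np
from scipy.optimize import nnls


def project_psd(W):
    """Project each d x d block onto the PSD cone by clipping eigenvalues."""
    sym = 0.5 * (W + np.swapaxes(W, -1, -2))
    vals, vecs = np.linalg.eigh(sym)
    vals = np.clip(vals, 0.0, None)
    return (vecs * vals[..., None, :]) @ np.swapaxes(vecs, -1, -2)


def update_W(V, W, H, n_steps=100):
    """Stand-in W-step: projected gradient descent on E(., H)."""
    G = H @ H.T                                    # G[l, j] = sum_i h_li h_ji
    step = 1.0 / max(np.linalg.norm(G, 2), 1e-12)  # 1 / Lipschitz constant
    VH = np.einsum("kiab,ji->kjab", V, H)          # sum_i V_ki h_ji
    for _ in range(n_steps):
        grad = np.einsum("klab,lj->kjab", W, G) - VH
        W = project_psd(W - step * grad)
    return W


def update_H(V, W):
    """H-step: one nonnegative least squares problem per input image."""
    n, r, d, _ = W.shape
    M = W.transpose(0, 2, 3, 1).reshape(n * d * d, r)   # column j is vec(W[:, j])
    B = V.transpose(0, 2, 3, 1).reshape(n * d * d, -1)  # column i is vec(V[:, i])
    return np.column_stack([nnls(M, B[:, i])[0] for i in range(B.shape[1])])


def alternating_factorization(V, r, n_iter=30, seed=0):
    """V: (n, m, d, d). Returns W: (n, r, d, d) and H: (r, m)."""
    rng = np.random.default_rng(seed)
    n, m, d, _ = V.shape
    H = rng.random((r, m))                  # H^1 >= 0
    W = np.tile(np.eye(d), (n, r, 1, 1))    # feasible start for the W-step
    for _ in range(n_iter):
        W = update_W(V, W, H)
        H = update_H(V, W)
    return W, H
```

Projected gradient is chosen only because it fits in a few lines; it keeps every iterate feasible but is typically much slower to converge than a second-order method.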

3.1 Optimization with respect to the basis matrix W

When H is fixed, the optimization problem (3) reduces to a quadratic semi-definite programming problem with a large number of positive semi-definite matrices as the constrained variables. This is a challenging optimization problem without readily available solvers, as most available semi-definite programming solvers such as SeDuMi [11] and SDPT3 [14] require linear objective functions. Toh et al. [13] proposed an inexact primal-dual algorithm to solve a special class of convex quadratic semi-definite programming problems. However, their algorithm only deals with a single positive semi-definite matrix and cannot be directly applied to our optimization problem. Instead, we exploit the special feature of our problem that d = 3 is a relatively small number and design a specific algorithm based on the primal-dual path-following interior-point method to solve this subproblem.

The primal problem is given by

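$$\min_{W} \ \frac{1}{2} \sum_{i=1}^{m} \sum_{k=1}^{n} \Big\| V_{ki} - \sum_{j=1}^{r} W_{kj} h_{ji} \Big\|_F^2 \quad \text{s.t.} \quad W_{kj} \succeq 0, \ k = 1, \dots, n, \ j = 1, \dots, r. \tag{4}$$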

We introduce the d × d symmetric matrices Zkj associated with the matrix inequalities Wkj ≽ 0, where k = 1, …, n, j = 1, …, r. The Lagrangian of problem (4) is then

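$$L(W, Z) = \frac{1}{2} \sum_{i=1}^{m} \sum_{k=1}^{n} \Big\| V_{ki} - \sum_{j=1}^{r} W_{kj} h_{ji} \Big\|_F^2 - \sum_{k=1}^{n} \sum_{j=1}^{r} \mathrm{Tr}(Z_{kj} W_{kj}). \tag{5}$$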

If we take the derivative of L(W, Z) with respect to Wkj and set it to zero, we have

$$Z_{kj} = -\sum_{i=1}^{m} \Big( V_{ki} - \sum_{l=1}^{r} W_{kl} h_{li} \Big) h_{ji}. \tag{6}$$

Substituting this expression back into the Lagrangian gives the dual problem

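$$\begin{aligned} \max_{Z} \quad & \sum_{i=1}^{m} \sum_{k=1}^{n} \Big( -\frac{1}{2} \Big\| V_{ki} - \sum_{j=1}^{r} W_{kj} h_{ji} \Big\|_F^2 + \Big\langle V_{ki} - \sum_{j=1}^{r} W_{kj} h_{ji},\ V_{ki} \Big\rangle \Big) \\ \text{s.t.} \quad & \sum_{i=1}^{m} \Big( V_{ki} h_{ji} - \sum_{l=1}^{r} W_{kl} h_{ji} h_{li} \Big) + Z_{kj} = 0, \\ & Z_{kj} \succeq 0, \quad k = 1, \dots, n, \ j = 1, \dots, r. \end{aligned} \tag{7}$$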
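For readers who want the intermediate step, the substitution can be spelled out as follows. The shorthand $R_{ki} = V_{ki} - \sum_{j=1}^{r} W_{kj} h_{ji}$ for the residual is introduced here only for this derivation and is not part of the original notation.

$$\begin{aligned} Z_{kj} &= -\sum_{i=1}^{m} R_{ki} h_{ji}, \\ \sum_{k=1}^{n} \sum_{j=1}^{r} \mathrm{Tr}(Z_{kj} W_{kj}) &= -\sum_{i=1}^{m} \sum_{k=1}^{n} \Big\langle R_{ki},\ \sum_{j=1}^{r} W_{kj} h_{ji} \Big\rangle = -\sum_{i=1}^{m} \sum_{k=1}^{n} \langle R_{ki},\ V_{ki} - R_{ki} \rangle, \\ L(W, Z) &= \frac{1}{2} \sum_{i,k} \| R_{ki} \|_F^2 + \sum_{i,k} \langle R_{ki}, V_{ki} \rangle - \sum_{i,k} \| R_{ki} \|_F^2 = \sum_{i,k} \Big( -\frac{1}{2} \| R_{ki} \|_F^2 + \langle R_{ki}, V_{ki} \rangle \Big). \end{aligned}$$

The last expression is the dual objective, and the first line, rearranged, is the equality constraint of the dual problem.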

For the primal-dual interior-point method, we use the perturbed Karush-Kuhn-Tucker (KKT) conditions:

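$$\begin{aligned} & Z_{kj} + \sum_{i=1}^{m} \Big( V_{ki} - \sum_{l=1}^{r} W_{kl} h_{li} \Big) h_{ji} = 0, \\ & W_{kj} Z_{kj} = \nu_{kj} I, \\ & W_{kj} \succeq 0, \quad Z_{kj} \succeq 0, \quad k = 1, \dots, n, \ j = 1, \dots, r, \end{aligned} \tag{8}$$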

where νkj are positive parameters. Given the current iterate (W, Z), the search direction (ΔW, ΔZ) at each interior-point iteration is the solution of the following Newton system

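$$\sum_{l=1}^{r} \Delta W_{kl} \Big( \sum_{i=1}^{m} h_{li} h_{ji} \Big) - \Delta Z_{kj} = R_d^{kj} := Z_{kj} + \sum_{i=1}^{m} \Big( V_{ki} - \sum_{l=1}^{r} W_{kl} h_{li} \Big) h_{ji} \tag{9}$$

$$H_{P_{kj}}(\Delta W_{kj} Z_{kj} + W_{kj} \Delta Z_{kj}) = \sigma \mu_{kj} I - H_{P_{kj}}(W_{kj} Z_{kj} + \Delta W_{kj} \Delta Z_{kj}) \tag{10}$$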

where $\mu_{kj} = \langle W_{kj}, Z_{kj} \rangle / d$ and σ ∈ (0, 1) is the centering parameter. The symmetrization scheme $H_P$, defined as $H_P(M) = \frac{1}{2} \left[ P M P^{-1} + P^{-T} M^{T} P^{T} \right]$, is required here to generate symmetric $\Delta W_{kj}$ and $\Delta Z_{kj}$. Inspired by [12], we choose $P_{kj} = T_{kj}^{-1/2}$, where $T_{kj}$ is the Nesterov-Todd scaling matrix satisfying $T_{kj} Z_{kj} T_{kj} = W_{kj}$. The linearization of equation (10) gives
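For a single pair of positive definite blocks, the scaling matrix and the symmetrization operator can be computed directly. The closed form $T = W^{1/2} (W^{1/2} Z W^{1/2})^{-1/2} W^{1/2}$ used below is the standard expression for the Nesterov-Todd scaling; the function names are assumptions of this sketch, and strict positive definiteness of both inputs is assumed.

```python
import numpy as np


def sqrtm_spd(A):
    """Symmetric square root of a symmetric positive definite matrix."""
    vals, vecs = np.linalg.eigh(A)
    return (vecs * np.sqrt(vals)) @ vecs.T


def nt_scaling(W, Z):
    """Nesterov-Todd scaling matrix T with T @ Z @ T == W."""
    W_half = sqrtm_spd(W)
    M = sqrtm_spd(W_half @ Z @ W_half)
    return W_half @ np.linalg.inv(M) @ W_half


def symmetrize(M, P):
    """H_P(M) = (P M P^{-1} + P^{-T} M^T P^T) / 2."""
    P_inv = np.linalg.inv(P)
    return 0.5 * (P @ M @ P_inv + P_inv.T @ M.T @ P.T)
```

The scaling $P = T^{-1/2}$ is then obtained as `np.linalg.inv(sqrtm_spd(T))`.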

$$\mathcal{E}_{kj}(\Delta W_{kj}) + \mathcal{F}_{kj}(\Delta Z_{kj}) = R_c^{kj} := \sigma \mu_{kj} I - H_{P_{kj}}(W_{kj} Z_{kj}) \tag{11}$$

where $\mathcal{E}_{kj}$ and $\mathcal{F}_{kj}$ are linear operators. By eliminating $\Delta Z_{kj}$ in equations (9) and (11), we obtain

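$$\sum_{l=1}^{r} \Big( \sum_{i=1}^{m} h_{li} h_{ji} \Big) \Delta W_{kl} + \mathcal{F}_{kj}^{-1} \mathcal{E}_{kj}(\Delta W_{kj}) = R_d^{kj} + \mathcal{F}_{kj}^{-1} R_c^{kj}. \tag{12}$$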

Therefore, we just need to solve n linear systems to get ΔW. Each linear system includes d(d+1)r/2 equations, since each of the r symmetric d × d unknowns has d(d+1)/2 independent entries. Because both d and r are small for our problem, these linear systems can be solved efficiently, and ΔZ can be computed easily using equation (9). The detailed steps of the algorithm are shown in Algorithm 2.

Algorithm 2.

Quadratic semi-definite programming for nonnegative factorization

  1. Initialization: $W_{kj} \succeq 0$, $Z_{kj} \succeq 0$, ∀k, j and τ = 0.9.

  2. Convergence test: Stop the iteration if the accuracy measure φ is sufficiently small.
    $\phi = \dfrac{\sum_{k=1}^{n} \sum_{j=1}^{r} \langle W_{kj}, Z_{kj} \rangle}{1 + |\mathrm{pobj}| + |\mathrm{dobj}|}$

    where pobj and dobj are the values of the primal and dual objective functions.

  3. Predictor step: Compute the predictor search direction (δW, δZ) by choosing σ = 0.

  4. Predictor step-length: Compute $\alpha_p = \min(1, \tau\alpha)$, where α is the maximum step length that can be taken so that for k = 1, …, n and j = 1, …, r, $W_{kj} + \alpha\,\delta W_{kj}$ and $Z_{kj} + \alpha\,\delta Z_{kj}$ remain positive semidefinite.

  5. Centering rule: Set
    $\sigma = \dfrac{\sum_{k,j} \langle W_{kj} + \alpha_p\,\delta W_{kj},\ Z_{kj} + \alpha_p\,\delta Z_{kj} \rangle}{\sum_{k,j} \langle W_{kj}, Z_{kj} \rangle}.$

  6. Corrector step: Compute the search direction (ΔW, ΔZ) using
    $R_c^{kj} = \sigma \mu_{kj} I - H_{P_{kj}}(W_{kj} Z_{kj} + \delta W_{kj}\,\delta Z_{kj}).$

  7. Corrector step-length: Compute $\alpha_c$ as in step 4, with (δW, δZ) replaced by (ΔW, ΔZ).

  8. Update (W, Z) to the next iterate $(W^+, Z^+)$:
    $W_{kj}^{+} = W_{kj} + \alpha_c \Delta W_{kj}, \quad Z_{kj}^{+} = Z_{kj} + \alpha_c \Delta Z_{kj}.$

  9. Update the step-length parameter by $\tau^{+} = 0.9 + 0.08\,\alpha_c$.
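The step-length computation in steps 4 and 7 has a closed form for a positive definite iterate: writing $X = LL^T$, the matrix $X + \alpha\,\delta X$ is positive semidefinite exactly when $I + \alpha L^{-1} \delta X L^{-T}$ is. The sketch below uses this fact; the function names and the (n, r, d, d) array layout are assumptions of the example, and it presumes the current blocks are strictly positive definite.

```python
import numpy as np


def max_step_psd(X, dX):
    """Largest alpha >= 0 such that X + alpha * dX stays positive semidefinite.

    X must be symmetric positive definite and dX symmetric.
    Returns np.inf when X + alpha * dX is PSD for every alpha >= 0.
    """
    L = np.linalg.cholesky(X)        # X = L L^T
    L_inv = np.linalg.inv(L)
    S = L_inv @ dX @ L_inv.T         # X + a dX >= 0  <=>  I + a S >= 0
    lam_min = np.linalg.eigvalsh(0.5 * (S + S.T)).min()
    return np.inf if lam_min >= 0 else -1.0 / lam_min


def step_length(W, Z, dW, dZ, tau=0.9):
    """min(1, tau * alpha) over all blocks, as in steps 4 and 7."""
    n, r = W.shape[:2]
    alpha = min(
        min(max_step_psd(W[k, j], dW[k, j]), max_step_psd(Z[k, j], dZ[k, j]))
        for k in range(n)
        for j in range(r)
    )
    return min(1.0, tau * alpha)
```

The factor τ < 1 keeps the new iterate strictly inside the cone, so the Cholesky factorization remains valid at the next iteration.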

3.2 Optimization with respect to coefficient matrix H

With a fixed W, (3) becomes a convex optimization problem. Since

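$$\frac{\partial E}{\partial h_{ji}} = -\sum_{k=1}^{n} \mathrm{Tr}(V_{ki} W_{kj}) + \sum_{k=1}^{n} \sum_{l=1}^{r} \mathrm{Tr}(W_{kj} W_{kl})\, h_{li}, \tag{13}$$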

setting $\partial E / \partial h_{ji} = 0$ gives

$$\sum_{l=1}^{r} \Big( \sum_{k=1}^{n} \langle W_{kj}, W_{kl} \rangle \Big) h_{li} = \sum_{k=1}^{n} \langle V_{ki}, W_{kj} \rangle. \tag{14}$$

Thus we just need to solve a non-negative least squares problem $A_{r \times r} H_{r \times m} = B_{r \times m}$ to get $h_{ji}$, where $A_{jl} = \sum_{k=1}^{n} \langle W_{kj}, W_{kl} \rangle$ and $B_{ji} = \sum_{k=1}^{n} \langle V_{ki}, W_{kj} \rangle$, for all i, j, l. In our implementation, we use the fast active set method proposed by Van Benthem and Keenan [15] to solve this large-scale nonnegative least squares problem.

Once we have obtained the basis matrix W, we can easily compute the “projection” of a new diffusion tensor image $X = (X_1, \dots, X_n)^T$ by solving the following optimization problem

$$\min_{y \ge 0} \ \sum_{k=1}^{n} \Big\| X_k - \sum_{j=1}^{r} W_{kj} y_j \Big\|_F^2 \tag{15}$$

where each $X_k$ is a diffusion tensor and $y = (y_1, \dots, y_r)^T$ is the nonnegative coefficient vector. This problem, similar to the subproblem with respect to H above, can also be efficiently solved using the same fast active set method.
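Both the H-subproblem and the projection problem become ordinary nonnegative least squares once each basis tensor image is flattened into a column vector. The sketch below builds the matrices A and B of (14) and solves (15) with SciPy's `scipy.optimize.nnls`, used here as a readily available substitute for the fast active set method of [15]. The function names and the array layout (V of shape (n, m, d, d), W of shape (n, r, d, d), X of shape (n, d, d)) are assumptions of this example.

```python
import numpy as np
from scipy.optimize import nnls


def normal_equations(V, W):
    """A (r x r) and B (r x m) appearing in equation (14)."""
    A = np.einsum("kjab,klab->jl", W, W)   # A[j, l] = sum_k <W_kj, W_kl>
    B = np.einsum("kiab,kjab->ji", V, W)   # B[j, i] = sum_k <V_ki, W_kj>
    return A, B


def project_image(X, W):
    """Solve (15): nonnegative coefficients y of a tensor image X on the basis W."""
    n, r, d, _ = W.shape
    M = W.transpose(0, 2, 3, 1).reshape(n * d * d, r)  # column j is vec(W[:, j])
    y, _ = nnls(M, X.reshape(n * d * d))
    return y
```

Applying `project_image` to each column of V in turn gives the H-update of Algorithm 1.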

3.3 Diffusion tensor image segmentation

Because of its relations to K-means [2] and probabilistic latent semantic analysis [4], nonnegative matrix factorization has been widely used in data clustering, e.g., [19]. In this section, we formulate the diffusion tensor image segmentation problem as a special case of our more general nonnegative factorization problem with spatial constraints.

Specifically, given a diffusion tensor image of size m, we arrange the m tensors in a row to form the data matrix V = (V1, …, Vm). The original nonnegative factorization problem is modified as

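$$\min_{W, H} \ \frac{1}{2} \sum_{i=1}^{m} \Big\| V_i - \sum_{j=1}^{r} W_j h_{ji} \Big\|_F^2 + \frac{\lambda}{2} \sum_{(k,l) \in \Omega} \sum_{j=1}^{r} (h_{jk} - h_{jl})^2 \tag{16}$$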

with nonnegative constraints Wj ≽ 0 and hji ≥ 0, for all i, j. The first term is simply the objective function in (3) given V as a row vector. The second term is the spatial smoothness (soft) constraint that requires neighboring pixels to have similar coefficients, and in the equation above, Ω denotes the edge set of the discrete image graph, and λ the coupling parameter. The optimization problem (16) can be solved using the same alternating method discussed above as the second term is also quadratic in H. Once the coefficient matrix H has been determined, we cluster the columns of H to produce the diffusion tensor image segmentation. In our implementation, we use K-means for this last clustering step.
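The smoothness term can be evaluated directly from an edge list, and it equals $\frac{\lambda}{2} \mathrm{Tr}(H L H^T)$ with L the graph Laplacian, which makes its quadratic dependence on H explicit. The sketch below assumes a 4-connected pixel grid with row-major pixel indexing; the text only specifies "the edge set of the discrete image graph", so the connectivity and all function names are assumptions of this example.

```python
import numpy as np


def grid_edges(height, width):
    """Edge set of the 4-connected pixel graph; pixels are indexed row by row."""
    idx = np.arange(height * width).reshape(height, width)
    horizontal = np.stack([idx[:, :-1].ravel(), idx[:, 1:].ravel()], axis=1)
    vertical = np.stack([idx[:-1, :].ravel(), idx[1:, :].ravel()], axis=1)
    return np.vstack([horizontal, vertical])


def smoothness(H, edges, lam):
    """(lam / 2) * sum over edges (k, l) and rows j of (h_jk - h_jl)^2."""
    diff = H[:, edges[:, 0]] - H[:, edges[:, 1]]
    return 0.5 * lam * np.sum(diff ** 2)


def graph_laplacian(edges, m):
    """Laplacian L with sum over edges of (x_k - x_l)^2 equal to x^T L x."""
    L = np.zeros((m, m))
    for k, l in edges:
        L[k, k] += 1.0
        L[l, l] += 1.0
        L[k, l] -= 1.0
        L[l, k] -= 1.0
    return L
```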

4 Experiments

In this section, we present two sets of experimental results. The first experiment is on diffusion tensor image segmentation using the segmentation algorithm outlined in the previous section. For the second set of experiments, we work with multiple images, and the results demonstrate that, as for scalar-valued images, meaningful parts can be discovered or detected using nonnegative factorization.

4.1 Diffusion tensor image segmentation

Synthetic Tensor Images

In this experiment, we test the accuracy of the segmentation algorithm using synthetic tensor images with various levels of added noise. We first generate the 32 × 32 tensor image T shown in Figure 1 that contains only two tensors, the diagonal matrices diag(0.5, 0.25, 0.25) and diag(0.25, 0.5, 0.25), and this defines the ground truth of the segmentation. Different levels of Gaussian noise N(0, σ) are added to the tensor image T to generate noisy tensor images, and we compare the segmentation accuracy of our method with the segmentation algorithm based on clustering the pixels using K-means (on the tensors). In this experiment, the segmentation accuracy is defined as the percentage of correctly segmented pixels, and the comparison across different noise levels is plotted in Figure 1. The result clearly shows the robustness of our method when compared with K-means, especially in the presence of a substantial amount of noise.
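A test image of this kind and the accuracy measure can be generated as sketched below. Several details are assumptions of this example because the text does not specify them: the two regions are taken to be the left and right halves of the image, `sigma` is treated as the standard deviation of the noise, the noise is symmetrized so that each tensor stays symmetric, and noisy tensors are not re-projected onto PSD(3).

```python
import numpy as np


def synthetic_image(size=32, sigma=0.1, seed=0):
    """Two-region tensor image with symmetric Gaussian noise."""
    rng = np.random.default_rng(seed)
    labels = np.zeros((size, size), dtype=int)
    labels[:, size // 2:] = 1                     # assumed left/right split
    tensors = np.array([np.diag([0.5, 0.25, 0.25]),
                        np.diag([0.25, 0.5, 0.25])])
    image = tensors[labels]                       # shape (size, size, 3, 3)
    noise = rng.normal(0.0, sigma, image.shape)
    noise = 0.5 * (noise + np.swapaxes(noise, -1, -2))  # keep tensors symmetric
    return image + noise, labels


def segmentation_accuracy(predicted, truth):
    """Fraction of correctly labelled pixels for two clusters, up to label swap."""
    agree = np.mean(predicted == truth)
    return max(agree, 1.0 - agree)
```

The label swap in `segmentation_accuracy` is needed because a clustering algorithm such as K-means assigns cluster indices arbitrarily.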

Fig. 1.

Left: Input tensor field without noise. Right: Segmentation accuracy vs. different levels of added Gaussian noise N(0, σ) with covariance σ.

Diffusion Tensor Images

We next present segmentation results on three real diffusion tensor images shown in Figure 2. These are DTI images of the spinal cord, corpus callosum and an isolated hippocampus of a rat. The data were acquired using a PGSE sequence with TR = 1.5 s, TE = 28.3 ms and bandwidth = 35 kHz. Twenty-one diffusion weighted images with a b-value of 1250 s/mm² were collected. The sizes of the regions of interest for rat spinal cord, corpus callosum and hippocampus are 71 × 61, 101 × 74 and 71 × 39, respectively. In this experiment, the numbers of clusters for these images are four, eight and seven, respectively, and the number of basis elements (columns in W) is set to five for all three images. Again, we compared our method with the K-means based segmentation, and the results consistently demonstrate that our method can produce anatomically more accurate and meaningful segmentation than the straightforward K-means clustering. Finally, we note that both algorithms use K-means for clustering pixels. However, in our method, K-means is applied only to the coefficient vectors in H, while in the comparison method, K-means is applied directly to the tensors. While the clustering power of nonnegative matrix factorization is well known for scalar-valued data (e.g., [2]), our experimental results provide the first convincing evidence of its clustering power for tensor-valued data: the two sets of experiments have shown that there is no reason to expect that a direct clustering on tensors would produce the desired segmentation results, whereas direct clustering on coefficient vectors does yield satisfactory results.

Fig. 2.

First Row: Spinal cord of a rat. Second Row: Corpus callosum of a rat. Third Row: Hippocampus of a rat. Columns (a): Diffusion tensor images. (b): Segmentation results using K-means. (c): Segmentation results using our method. Segments are color-coded.

4.2 Nonnegative factorization with multiple tensor fields

For a suitable collection of images, it is well-known that NMF has the ability to automatically determine similar decompositions of the images into their common parts and provide part-based representations for the images in the collection [3]. In this experiment using synthetic data, we show that nonnegative factorization also has similar capability for tensor images. Twenty-seven tensor fields with size 5 × 3 (15 pixels) are generated and form the columns of the data matrix V as shown in Figure 3(a). We factor V into the coefficient matrix H and the basis matrix W shown in Figure 3(b). Our factorization algorithm correctly determines the nine basis elements (columns of W) required to form the data matrix V, and the coefficient matrix returned by our algorithm is a sparse matrix. Furthermore, the L2 factorization error is less than $8.57 \times 10^{-10}$.

Fig. 3.

(a) Visualization of the data matrix V, each of whose columns represents a 5 × 3 tensor image. (b) Visualization of the basis matrix W, each of whose columns represents a basis tensor image.

Finally, we present a preliminary result on applying our nonnegative factorization method to automatically discover and segment anatomically important regions from a collection of 53 rat brain diffusion tensor images. In the preprocessing step, all images are aligned using similarity transforms, and in each image, a region of interest of size 46 × 27 × 7 is manually selected. The left column of Figure 4 displays five sample slices from the input 3D diffusion tensor images. We apply the factorization algorithm with r = 5 (five basis elements) to this collection of DTIs, and one sample slice from each of the five basis images found by the algorithm is shown on the right column in Figure 4. Important anatomical regions such as white matter, putamen and nucleus accumbens are clearly represented in the five basis images.

Fig. 4.

Left column: Five sample slices from 53 input rat brain diffusion tensor images. Right column: Sample slices from the five basis tensor images produced by the proposed nonnegative factorization algorithm. We have enhanced the anisotropy for better visualization. The first, second and fourth basis images clearly include white matter, putamen and nucleus accumbens, respectively.

5 Conclusions

This paper introduces the novel notion of nonnegative factorization of single and multiple tensor-valued images. The well-known method of nonnegative matrix factorization is extended to tensor-valued data, and an algorithm is proposed to efficiently and reliably solve the new factorization problem formulated as an optimization problem with a non-convex objective function. We have formulated a new approach to DTI segmentation using the proposed nonnegative factorization, and experimental results have shown that our algorithm offers a competitive alternative to currently available methods in terms of its accuracy and efficiency. Perhaps more importantly, our work has demonstrated the usefulness and versatility of the notion of nonnegative factorization, now in the more general setting of tensor-valued images. We believe that this simple yet powerful notion will find its rightful place in the analysis of tensor-valued images. For example, it could potentially offer a new and effective approach to simultaneously analyzing a large number of diffusion tensor images, an essential task in group studies and other applications that can currently be achieved using only a limited number of available tools, such as standard statistical analysis in Euclidean space or principal geodesic analysis on Riemannian manifolds [18].

Contributor Information

Yuchen Xie, Email: yxie@cise.ufl.edu.

Jeffrey Ho, Email: jho@cise.ufl.edu.

Baba C. Vemuri, Email: vemuri@cise.ufl.edu.

References

  • 1. Boyd S, Vandenberghe L. Convex optimization. Cambridge University Press; 2004.
  • 2. Ding C, He X, Simon H. On the equivalence of nonnegative matrix factorization and spectral clustering. Proc SIAM Data Mining Conf. 2005.
  • 3. Donoho D, Stodden V. When does non-negative matrix factorization give a correct decomposition into parts? NIPS; 2004.
  • 4. Gaussier E, Goutte C. Relation between PLSA and NMF and implications. ACM SIGIR; 2005.
  • 5. Grippo L, Sciandrone M. On the convergence of the block nonlinear Gauss-Seidel method under convex constraints. Operations Research Letters. 2000;26(3):127–136.
  • 6. Joshi S, Karthikeyan S, Manjunath B, Grafton S, Kiehl K. Anatomical parts-based regression using non-negative matrix factorization. CVPR; 2010.
  • 7. Lee D, Seung H. Learning the parts of objects by non-negative matrix factorization. Nature. 1999;401:788–791. doi: 10.1038/44565.
  • 8. Lenglet C, Rousson M, Deriche R. DTI segmentation by statistical surface evolution. IEEE Trans on Medical Imaging. 2006;25(6):685–700. doi: 10.1109/tmi.2006.873299.
  • 9. Li S, Hou X, Zhang H, Cheng Q. Learning spatially localized, parts-based representation. CVPR; 2001.
  • 10. Nocedal J, Wright S. Numerical optimization. Springer; 2000.
  • 11. Sturm J. Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones. Optimization Methods and Software. 1999;11(1):625–653.
  • 12. Todd M, Toh K, Tutuncu R. On the Nesterov-Todd direction in semidefinite programming. SIAM Journal on Optimization. 1998;8(3):769–796.
  • 13. Toh K, Tutuncu R, Todd M. Inexact primal-dual path-following algorithms for a special class of convex quadratic SDP and related problems. Pac J Optim. 2007;3:135–164.
  • 14. Tutuncu R, Toh K, Todd M. Solving semidefinite-quadratic-linear programs using SDPT3. Mathematical Programming. 2003;95(2):189–217.
  • 15. Van Benthem M, Keenan M. Fast algorithm for the solution of large-scale non-negativity-constrained least squares problems. Journal of Chemometrics. 2004;18(10):441–450.
  • 16. Wang Z, Vemuri B. DTI segmentation using an information theoretic tensor dissimilarity measure. IEEE Trans on Medical Imaging. 2005;24(10):1267–1277. doi: 10.1109/TMI.2005.854516.
  • 17. Weldeselassie Y, Hamarneh G. DT-MRI segmentation using graph cuts. Proc of SPIE Medical Imaging: Image Processing. 2007.
  • 18. Xie Y, Vemuri B, Ho J. Statistical analysis of tensor fields. MICCAI; 2010.
  • 19. Xu W, Liu X, Gong Y. Document clustering based on non-negative matrix factorization. ACM SIGIR; 2003.
