Author manuscript; available in PMC: 2018 Jul 20.
Published in final edited form as: Med Image Comput Comput Assist Interv. 2015 Nov 18;9351:132–139.

Brain Tissue Segmentation Based on Diffusion MRI Using ℓ0 Sparse-Group Representation Classification

Pew-Thian Yap, Yong Zhang, Dinggang Shen
PMCID: PMC6054460  NIHMSID: NIHMS963633  PMID: 30035276

Abstract

We present a method for automated brain tissue segmentation based on diffusion MRI. This provides information that is complementary to structural MRI and facilitates fusion of information between the two imaging modalities. Unlike existing segmentation approaches that are based on diffusion tensor imaging (DTI), our method explicitly models the coexistence of various diffusion compartments within each voxel owing to different tissue types and different fiber orientations. This results in improved segmentation in regions with white matter crossings and in regions susceptible to partial volume effects. For each voxel, we tease apart possible signal contributions from white matter (WM), gray matter (GM), and cerebrospinal fluid (CSF) with the help of diffusion exemplars, which are representative signals associated with each tissue type. Each voxel is then classified by determining which of the WM, GM, or CSF diffusion exemplar groups explains the signal better with the least fitting residual. Fitting is performed using ℓ0 sparse-group approximation, circumventing various reported limitations of ℓ1 fitting. In addition, to promote spatial regularity, we introduce a smoothing technique that is based on ℓ0 gradient minimization, which can be viewed as the ℓ0 version of total variation (TV) smoothing. Compared with the latter, our smoothing technique, which also incorporates multi-channel WM, GM, and CSF concurrent smoothing, yields marked improvement in preserving boundary contrast and consequently reduces segmentation bias caused by smoothing at tissue boundaries. The results produced by our method are in good agreement with segmentation based on T1-weighted images.

1 Introduction

Brain tissue segmentation is most commonly performed using T1-weighted images, which are typically rich with anatomical details thanks to their higher spatial resolution (1 × 1 × 1 mm³). However, the recent availability of high spatial resolution (1.25 × 1.25 × 1.25 mm³) diffusion MRI data from the Human Connectome Project raises the following questions: 1) Can tissue segmentation be performed equally well solely based on diffusion data, therefore making it possible to avoid the technical difficulties involved in transferring segmentation information from T1-weighted images, such as geometric distortion and cross-modality registration? 2) Can diffusion data, acquired based on a totally different contrast mechanism, provide information complementary to T1-weighted images for further improving segmentation?

In this paper, we attempt to address these questions by introducing a segmentation method that works directly with diffusion MRI data. In contrast to existing segmentation methods that are based on diffusion tensor imaging (DTI) [1–3], our method explicitly models the coexistence of various diffusion compartments within each voxel owing to different tissue types and different fiber orientations. This improves segmentation in regions with white matter crossings and in regions susceptible to partial volume effects. For each voxel, we tease apart possible signal contributions from white matter (WM), gray matter (GM), and cerebrospinal fluid (CSF) with the help of diffusion exemplars, which are representative signals associated with each tissue type. More specifically, the WM diffusion exemplars are sampled from diffusion tensors oriented in different directions with different axial and radial diffusivities; GM from isotropic tensors of low diffusivities; and CSF from isotropic tensors of high diffusivities. Each voxel is then classified by determining which of the WM, GM, or CSF diffusion exemplars explains the signal with the least fitting residual.

Fitting is performed using ℓ0 sparse-group approximation, circumventing various reported limitations of ℓ1 fitting. The use of ℓ0 penalization is motivated by the observations reported in [4], where the authors have shown that the commonly used ℓ1-norm penalization [5, 6] conflicts with the unit sum requirement of the volume fractions and hence results in suboptimal solutions. To overcome this problem, the authors propose to employ the reweighted ℓ1 minimization approach described by Candès et al. [7] to obtain solutions with enhanced sparsity, approximating solutions given by ℓ0 minimization. However, despite giving improved results, this approach still relies on the suboptimal solution of the unweighted ℓ1 minimization problem that has to be solved in the first iteration of the reweighted minimization scheme. In the current work, we employ an algorithm that is based directly on ℓ0 minimization.

To promote spatial regularity, we introduce a smoothing technique that is based on ℓ0 gradient minimization [8]. This can be viewed as the ℓ0 version of total variation (TV) smoothing. Compared with the latter, our smoothing technique yields marked improvement in the preservation of boundary contrast. In addition, our method smooths the probability maps of WM, GM, and CSF concurrently. This is achieved by an ℓ0 adaptation of a multi-channel smoothing algorithm [9], solved using the alternating direction method of multipliers (ADMM) [10].

2 Approach

Our approach to tissue segmentation is inspired by the face recognition work of Wright et al. [11]. However, instead of the ℓ1 sparse approximation used in [11], we use a sparse-group ℓ0 minimization approach that circumvents the problems mentioned in [4]. To promote spatial regularity, we also propose a multi-channel gradient minimization algorithm for smoothing the tissue probability maps, which preserves edges better than smoothing based on TV regularization [12].

Linear Subspaces

We assume that the signal from each class of tissue lies in a linear subspace. The subspace is spanned by diffusion exemplars, which are hypothetical signal vectors generated using the tensor model, $S(b, \hat{g}) = S_0 \exp(-b\,\hat{g}^{\mathsf{T}} D \hat{g})$, with varying diffusion parameters. Here, $b$ is the diffusion weighting, $\hat{g}$ is a unit vector representing the gradient direction, $S_0$ is the baseline signal with no diffusion weighting, and $D$ is the diffusion tensor. The WM subspace is spanned by multiple groups of diffusion exemplars. Each WM diffusion exemplar group consists of signal vectors sampled from a set of unidirectional axial-symmetric diffusion tensor models with a range of typical axial and radial diffusivities. Multiple groups of WM diffusion exemplars are generated by tensors with principal directions uniformly covering the unit sphere. The GM and CSF diffusion exemplar groups consist of signal vectors sampled from isotropic tensors with GM diffusivities set lower than CSF diffusivities, consistent with what was reported in [1, 13]. For each class $c \in \mathcal{C} = \{\mathrm{WM}, \mathrm{GM}, \mathrm{CSF}\}$, we arrange the $n_c$ signal vectors of the diffusion exemplars as columns of a matrix $A_c = [s_{c,1}, s_{c,2}, \ldots, s_{c,n_c}]$. We then concatenate the exemplar matrices of all tissue classes into a matrix $A = [A_{\mathrm{WM}} | A_{\mathrm{GM}} | A_{\mathrm{CSF}}]$, where $A_{\mathrm{WM}} = [A_{\mathrm{WM}_1} | \ldots | A_{\mathrm{WM}_k} | \ldots | A_{\mathrm{WM}_{N_{\mathrm{WM}}}}]$ and each numerical subscript $k$ of the WM exemplar matrix $A_{\mathrm{WM}_k}$ denotes the index corresponding to a WM direction.
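
As an illustration of how a diffusion exemplar could be generated from the tensor model, the following is a minimal sketch and not the implementation used in this work. It assumes an axially symmetric tensor with principal direction v, for which the quadratic form reduces to λ⊥ + (λ∥ − λ⊥)(ĝ·v)². The function names, the three-direction toy acquisition, and the specific diffusivity values are hypothetical.

```python
import math

def tensor_signal(b, g, v, lam_axial, lam_radial, s0=1.0):
    """Signal S(b, g) = S0 * exp(-b * g^T D g) for an axially symmetric tensor.

    For D = lam_radial * I + (lam_axial - lam_radial) * v v^T with unit v,
    g^T D g = lam_radial + (lam_axial - lam_radial) * (g . v)^2.
    """
    dot = sum(gi * vi for gi, vi in zip(g, v))
    adc = lam_radial + (lam_axial - lam_radial) * dot * dot
    return s0 * math.exp(-b * adc)

def exemplar(bvals, gradients, v, lam_axial, lam_radial):
    """One exemplar: the signal vector over all (b, g) acquisition pairs."""
    return [tensor_signal(b, g, v, lam_axial, lam_radial)
            for b, g in zip(bvals, gradients)]

# Toy acquisition (hypothetical): three gradient directions at b = 1000 s/mm^2.
bvals = [1000.0, 1000.0, 1000.0]
gradients = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

# WM exemplar oriented along x; isotropic GM and CSF exemplars
# (equal axial and radial diffusivities, so the direction is irrelevant).
wm = exemplar(bvals, gradients, (1.0, 0.0, 0.0), 1.0e-3, 0.2e-3)
gm = exemplar(bvals, gradients, (1.0, 0.0, 0.0), 0.7e-3, 0.7e-3)
csf = exemplar(bvals, gradients, (1.0, 0.0, 0.0), 3.0e-3, 3.0e-3)
```

In this sketch the WM exemplar attenuates most along its principal direction, whereas the GM and CSF exemplars are direction independent, with CSF attenuating more strongly. Stacking such vectors as columns gives the class matrices described above.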

ℓ0 Sparse-Group Representation

Given the signal vector s of a voxel that we wish to classify, we first compute its sparse-representation coefficient vector f by solving the following ℓ0 sparse-group approximation problem:

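$$\min_{f \ge 0} \left\{ \phi(f) = \|Af - s\|_2^2 + \gamma \Big[ \alpha \|f\|_0 + (1 - \alpha) \sum_{g \in \mathcal{G}} \mathcal{I}(\|f_g\|_2) \Big] \right\}, \qquad (1)$$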

where $\mathcal{I}(z)$ is an indicator function returning 1 if $z \neq 0$ and 0 otherwise. The ℓ0-“norm” gives the cardinality of the support, i.e., $\|f\|_0 = |\mathrm{supp}(f)| = |\{k : f_k \neq 0\}|$. Parameters $\alpha \in [0, 1]$ and $\gamma > 0$ are for penalty tuning, analogous to those used in the sparse-group LASSO [14]. Note that $\alpha = 1$ gives the ℓ0 fit, whereas $\alpha = 0$ gives the group ℓ0 fit. $f_g$ denotes the subvector containing the elements associated with group $g \in \mathcal{G} = \{\mathrm{WM}_1, \ldots, \mathrm{WM}_{N_{\mathrm{WM}}}, \mathrm{GM}, \mathrm{CSF}\}$. We solve this problem using an algorithm called non-monotone iterative hard thresholding (NIHT) [15], inspired by [16, 17]. Proof of convergence can be obtained by modifying the results shown in [17].
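
To make the roles of the two penalty terms in (1) concrete, the following minimal sketch evaluates the objective for a given coefficient vector. It does not implement NIHT; it only computes ϕ(f). The function name, the toy matrix, and the group layout are hypothetical.

```python
def objective(A, f, s, groups, gamma, alpha):
    """phi(f) = ||Af - s||_2^2 + gamma * [alpha * ||f||_0 + (1 - alpha) * (number of active groups)].

    A: list of rows, f: coefficient list, s: signal list,
    groups: dict mapping a group name to the column indices it owns.
    """
    residual = [sum(a * x for a, x in zip(row, f)) - si for row, si in zip(A, s)]
    fit = sum(r * r for r in residual)
    # Cardinality of the support of f.
    l0 = sum(1 for x in f if x != 0)
    # A group is active when its subvector has nonzero norm.
    active = sum(1 for idx in groups.values() if any(f[k] != 0 for k in idx))
    return fit + gamma * (alpha * l0 + (1 - alpha) * active)

# Toy dictionary: columns 1 and 2 are identical, so both candidates fit s exactly.
A = [[1.0, 0.0, 0.0, 1.0],
     [0.0, 1.0, 1.0, 1.0]]
s = [0.5, 0.5]
groups = {"WM1": [0, 1], "GM": [2], "CSF": [3]}

f_one_group = [0.5, 0.5, 0.0, 0.0]   # two coefficients in a single group
f_two_groups = [0.5, 0.0, 0.5, 0.0]  # two coefficients spread over two groups

phi_one = objective(A, f_one_group, s, groups, gamma=1.0, alpha=0.05)
phi_two = objective(A, f_two_groups, s, groups, gamma=1.0, alpha=0.05)
```

Both candidates have the same fitting error and the same number of nonzero coefficients, but the group term makes the solution confined to a single group cheaper, which is the behavior the sparse-group penalty is meant to encourage.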

Tissue Classification

Each voxel is classified as the class with diffusion exemplars that best explain the signal. This is achieved, based on [11], by determining the class that gives the least reconstruction residual:

$$\min_{c} \left\{ r(s|c) = \|A\,\delta_c(f) - s\|_2 \right\}, \qquad (2)$$

where δc(f) is a new vector whose only nonzero entries are the entries in f that are associated with class c. We modify the above problem to become a maximum a posteriori (MAP) estimation problem:

$$\max_{c} \left\{ p(c|s) \propto p(s|c)\,p(c) \right\}, \qquad (3)$$

where p(c) is the prior probability and p(s|c) is the likelihood function defined as

$$p(s|c) = \frac{1}{\sigma_c \sqrt{2\pi}} \exp\left[ -\frac{r^2(s|c)}{2\sigma_c^2} \right]. \qquad (4)$$

The scale $\sigma_c$ can be determined from the data via $\sigma_c^2 = \frac{1}{|\Omega_c|} \sum_{i \in \Omega_c} r^2(s_i|c)$, where $\Omega_c \subset \Omega = \{1, \ldots, N\}$ is the subset of indices of voxels for which class $c$ gives the least residual, and $N$ is the total number of voxels. This alternative formulation allows us to visualize the posterior probability maps $\{p(c|s_i) \,|\, i \in \Omega, c \in \mathcal{C}\}$ (disregarding constant scaling) for qualitative assessment of tissue segmentation. The prior probabilities can be set according to a pre-computed probabilistic atlas for guided segmentation.
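
The classification steps in (2) to (4) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names and the toy residuals are hypothetical, every class is assumed to win at least one voxel, and all residuals are assumed nonzero so that each scale is well defined.

```python
import math

def class_residual(A, f, s, cols):
    """r(s|c) = ||A delta_c(f) - s||_2, keeping only the coefficients in `cols`."""
    keep = set(cols)
    recon = [sum(a * x for k, (a, x) in enumerate(zip(row, f)) if k in keep)
             for row in A]
    return math.sqrt(sum((r - si) ** 2 for r, si in zip(recon, s)))

def estimate_scales(residuals):
    """sigma_c^2: mean of r^2(s_i|c) over voxels whose least residual is class c."""
    sums, counts = {}, {}
    for r in residuals:                  # r maps class -> residual for one voxel
        c = min(r, key=r.get)
        sums[c] = sums.get(c, 0.0) + r[c] ** 2
        counts[c] = counts.get(c, 0) + 1
    return {c: sums[c] / counts[c] for c in sums}

def posterior(r, sigma2, prior):
    """Unnormalized p(c|s) = p(s|c) p(c) with the Gaussian likelihood of (4)."""
    return {c: prior[c] / math.sqrt(2 * math.pi * sigma2[c])
               * math.exp(-r[c] ** 2 / (2 * sigma2[c]))
            for c in r}

def classify(r, sigma2, prior):
    """MAP label: the class with the largest unnormalized posterior."""
    post = posterior(r, sigma2, prior)
    return max(post, key=post.get)

# Toy per-voxel residuals (hypothetical values).
residuals = [
    {"WM": 0.1, "GM": 0.5, "CSF": 0.9},
    {"WM": 0.6, "GM": 0.2, "CSF": 0.8},
    {"WM": 0.7, "GM": 0.6, "CSF": 0.1},
    {"WM": 0.3, "GM": 0.35, "CSF": 0.9},
]
prior = {"WM": 0.35, "GM": 0.50, "CSF": 0.15}
sigma2 = estimate_scales(residuals)
labels = [classify(r, sigma2, prior) for r in residuals]
```

Because the scale is class specific and the prior enters multiplicatively, the MAP label can in principle differ from the least-residual label of (2) for voxels whose residuals are close across classes.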

Multi-Channel Gradient Minimization

Tissue classification as discussed in the previous section can be improved in terms of robustness by imposing spatial regularity. To achieve this, we smooth the posterior probability maps of WM, GM, and CSF concurrently prior to MAP estimation. In contrast to the commonly used TV-regularized smoothing, which is essentially an ℓ1 gradient minimization (L1-GM) algorithm, we will use here ℓ0 gradient minimization (L0-GM), which has been shown in [8] to be more effective than L1-GM in preserving edges. Moreover, L0-GM is more suitable in our case due to the piecewise constant nature of the segmentation maps. Here, we describe a multi-channel version of L0-GM.

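We first define for the i-th voxel a probability vector $p_i \equiv p(s_i) = [p(\mathrm{WM}|s_i), p(\mathrm{GM}|s_i), p(\mathrm{CSF}|s_i)]^{\mathsf{T}}$. We then solve for a smoothed version of the probability map $\{p_i \in \mathbb{R}^{|\mathcal{C}|}, i \in \Omega\}$, i.e., $u = \{u_i \in \mathbb{R}^{|\mathcal{C}|}, i \in \Omega\}$, via the following problem: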

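$$\min_{u} \left\{ \psi(u) = \sum_{i} \|u_i - p_i\|_2^2 + \beta \sum_{i} \Big\| \sum_{d} \|D_{i,d} u\|_2^2 \Big\|_0 \right\}. \qquad (5)$$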

We let $D_{i,d} u \in \mathbb{R}^{1 \times |\mathcal{C}|}$, where $u \in \mathbb{R}^{N \times |\mathcal{C}|}$, be a row vector concatenating the finite difference values of all channels of $u$ in the d-th spatial dimension. Note that $D_{i,d} \in \mathbb{R}^{1 \times N}$ is the finite difference operator for voxel $i$ in dimension $d$. The first term in (5) maintains data fidelity and the second term penalizes small edges in a multi-channel image. If we replace the ℓ0-“norm” in the second term with the ℓ1-norm, the above problem becomes a TV-regularized smoothing problem. Note that the above optimization problem is known to be computationally intractable. We thus implement an approximate solution using ADMM [18] by introducing a number of auxiliary variables. The ADMM formulation amounts to repeatedly performing hard thresholding and spatial convolution/deconvolution [10].
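
The difference between the ℓ0 and ℓ1 gradient penalties can be illustrated on a one-dimensional, two-channel toy example. The sketch below only evaluates the penalties and the objective in (5); it is not the ADMM solver. The function names and the toy maps are hypothetical, and the ℓ1 variant shown is the anisotropic form of TV, which is an assumption made for simplicity.

```python
def l0_gradient_penalty(u):
    """Number of positions whose forward difference is nonzero in any channel."""
    return sum(1 for a, b in zip(u, u[1:])
               if sum((y - x) ** 2 for x, y in zip(a, b)) != 0)

def l1_gradient_penalty(u):
    """Anisotropic TV: sum of absolute forward differences over all channels."""
    return sum(abs(y - x) for a, b in zip(u, u[1:]) for x, y in zip(a, b))

def psi(u, p, beta):
    """Objective (5) on a 1D grid: data fidelity plus beta times the l0 gradient count."""
    fidelity = sum((x - y) ** 2 for ui, pi in zip(u, p) for x, y in zip(ui, pi))
    return fidelity + beta * l0_gradient_penalty(u)

# Two-channel toy probability maps: the same transition, sharp versus blurred.
sharp = [(1.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.0, 1.0)]
blurred = [(1.0, 0.0), (0.75, 0.25), (0.25, 0.75), (0.0, 1.0)]

l0_sharp, l0_blurred = l0_gradient_penalty(sharp), l0_gradient_penalty(blurred)
l1_sharp, l1_blurred = l1_gradient_penalty(sharp), l1_gradient_penalty(blurred)
```

In this toy case the ℓ1 penalty assigns the same cost to the sharp and the blurred transition, whereas the ℓ0 penalty charges the blurred transition three times as much, which is consistent with the better edge preservation attributed to L0-GM. Note also that the ℓ0 term counts a position once even when several channels change there, which couples the channels.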

3 Experiments

3.1 Data

Diffusion weighted (DW) datasets from the Human Connectome Project (HCP) [19] were used. DW images with 1.25 × 1.25 × 1.25 mm³ resolution were acquired with diffusion weightings b = 1000, 2000, and 3000 s/mm², each applied in 90 directions. In addition, 18 baseline images with low diffusion weighting b = 5 s/mm² were acquired. The DW datasets were acquired with reversed phase encoding to correct for EPI distortion. T1-weighted anatomical images were acquired as anatomical references.

3.2 Diffusion Parameters

The parameters of the tensors used to generate the diffusion exemplars were set to cover the typical values of the diffusivities of the WM, GM, and CSF voxels in the above dataset: $\lambda_{\parallel}^{\mathrm{WM}} = 1 \times 10^{-3}$ mm²/s, $\lambda_{\perp}^{\mathrm{WM}} = [0.1 : 0.1 : 0.3] \times 10^{-3}$ mm²/s, $\lambda^{\mathrm{GM}} = [0.00 : 0.01 : 0.80] \times 10^{-3}$ mm²/s, and $\lambda^{\mathrm{CSF}} = [1.0 : 0.1 : 3.0] \times 10^{-3}$ mm²/s. The notation [a : s : b] denotes values from a to b, inclusive, with step s. Note that in practice, these ranges do not have to be exact but should cover the possible parameter values. The direction of each group of the WM diffusion exemplars corresponds to one of the 321 points evenly distributed on a hemisphere, generated by subdividing the faces of an icosahedron three times.
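
The parameter grids and the number of directions stated above can be checked with a short sketch. The helper name is hypothetical. The column count at the end additionally assumes that each WM group contains one exemplar per combination of axial and radial diffusivity, which is an inference from the description rather than a figure reported in the text.

```python
def grid(a, s, b, scale=1e-3):
    """Values [a : s : b] (inclusive) scaled to mm^2/s; the count uses integer steps."""
    n = round((b - a) / s) + 1
    return [(a + k * s) * scale for k in range(n)]

lam_axial_wm = [1.0e-3]
lam_radial_wm = grid(0.1, 0.1, 0.3)
lam_gm = grid(0.00, 0.01, 0.80)
lam_csf = grid(1.0, 0.1, 3.0)

# An icosahedron has 12 vertices; after n 4-to-1 face subdivisions it has
# 10 * 4**n + 2 vertices. Antipodal pairs coincide on a hemisphere.
n_vertices = 10 * 4 ** 3 + 2
n_directions = n_vertices // 2

# Assumption: every (axial, radial) combination gives one exemplar per direction.
per_wm_group = len(lam_axial_wm) * len(lam_radial_wm)
n_columns = n_directions * per_wm_group + len(lam_gm) + len(lam_csf)
```

Computing the number of grid points from integer steps avoids the rounding issues that arise when a floating-point value is repeatedly incremented by the step size.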

3.3 Comparison Methods

We compared the proposed method (L200) with the following methods:

  • L211: Sparse-group LASSO [14] using diffusion exemplars identical to the proposed method. Similar to [4] and according to [7], we executed sparse-group LASSO multiple times, each time reweighting the ℓ2,1-norm and the ℓ1-norm so that they eventually approximate their ℓ0 counterparts.

  • L0: ℓ0 minimization using a single diffusion exemplar each for WM, GM, and CSF [13]. Similar to [13], WM-GM-CSF segmentation was used to help determine the parameters for the diffusion exemplars. The axial and radial diffusivities of the WM diffusion exemplars were determined based on WM voxels with fractional anisotropy (FA) greater than 0.7. The diffusivity of the isotropic GM/CSF diffusion exemplar was determined based on GM/CSF voxels with FA less than 0.2.

The tuning parameter γ was set to 1 × 10⁻⁴ for all methods. In addition, we set α = 0.05, β = 0.001, p(WM) = 0.35, p(GM) = 0.50, and p(CSF) = 0.15 for the proposed method.
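
For reference, the FA thresholds used to select voxels for the L0 comparison method rely on the standard definition of fractional anisotropy in terms of the tensor eigenvalues. The following minimal sketch computes it; the function name and the example eigenvalues are hypothetical and chosen only to resemble a WM-like and an isotropic tensor.

```python
import math

def fractional_anisotropy(l1, l2, l3):
    """FA from the three eigenvalues of a diffusion tensor (0 = isotropic, 1 = fully anisotropic)."""
    num = (l1 - l2) ** 2 + (l2 - l3) ** 2 + (l3 - l1) ** 2
    den = l1 ** 2 + l2 ** 2 + l3 ** 2
    return math.sqrt(0.5 * num / den) if den > 0 else 0.0

fa_wm_like = fractional_anisotropy(1.0e-3, 0.1e-3, 0.1e-3)   # elongated tensor
fa_isotropic = fractional_anisotropy(0.7e-3, 0.7e-3, 0.7e-3)  # isotropic tensor
```

An elongated tensor of this kind would pass the FA > 0.7 criterion, whereas an isotropic tensor has an FA of zero and would fall under the FA < 0.2 criterion.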

3.4 Results

Qualitative

Figure 1 indicates that the segmentation result of the proposed method, L200, closely resembles that produced using the T1-weighted image with the FSL FAST algorithm [20]. L211 produces a WM segmentation result that is similar to that of L200, but underestimates GM. Note that these two methods are able to separate the deep GM structures, such as the caudate and putamen, from the surrounding WM. The segmentation of the thalamus is more challenging because it is a mixture of GM and WM (see likelihood maps in the bottom row of Fig. 1).

Fig. 1.

(Top) T1-weighted, fractional anisotropy (FA), mean diffusivity (MD), and T1 segmentation images. (Middle) Segmentation maps given by L200 (proposed), L211, and L0. (Bottom) Likelihood maps for WM, GM, and CSF given by L200.

Quantitative

Figure 2 shows the Dice scores for WM-GM-CSF segmentation of 5 subjects from the HCP data repository, confirming again that the proposed method produces segmentation results that agree most with segmentation based on T1-weighted images. The average Dice scores for L200/L211/L0 are 0.8603/0.8581/0.8019 (WM), 0.8105/0.7177/0.6844 (GM), and 0.7204/0.5941/0.6985 (CSF).

Fig. 2.

Accuracy of segmentation outcomes evaluated based on Dice score using T1 segmentations as the ground truth.
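
The Dice score used for this evaluation can be computed per tissue label as in the following minimal sketch. The function name and the toy label lists are hypothetical, and returning 1 when a label is absent from both segmentations is a convention assumed here.

```python
def dice(a, b, label):
    """Dice score 2|A ∩ B| / (|A| + |B|) for one label in two flat label lists."""
    in_a = [x == label for x in a]
    in_b = [x == label for x in b]
    overlap = sum(1 for x, y in zip(in_a, in_b) if x and y)
    total = sum(in_a) + sum(in_b)
    return 2.0 * overlap / total if total else 1.0

# Toy voxel labels: diffusion-based result versus T1-based reference.
dmri = ["WM", "WM", "GM", "GM", "CSF", "GM"]
t1 = ["WM", "GM", "GM", "GM", "CSF", "CSF"]
scores = {c: dice(dmri, t1, c) for c in ("WM", "GM", "CSF")}
```

A score of 1 indicates perfect overlap for that label and 0 indicates none, so the per-class scores can be compared directly across methods.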

Smoothing

Figure 3 shows the effects of smoothing with different strengths using L1-GM and L0-GM. The results confirm that despite the increased smoothing strength, L0-GM can still preserve edges effectively. On the other hand, L1-GM blurs the edges when the smoothing strength is increased.

Fig. 3.

Effects of light and heavy smoothing using L1-GM and L0-GM. The WM, GM, and CSF posterior probability maps are smoothed concurrently. However, owing to space limitations, only the WM probability maps are shown here.

4 Conclusion

In this paper, we have presented a tissue segmentation method that works directly with diffusion MRI data. We demonstrated that the proposed method is able to produce segmentation results that are in good agreement with the more conventional T1-based segmentation. We also showed that diffusion MRI provides additional information for segmentation of deep gray matter structures, complementary to T1-weighted imaging, where image contrast in this region is typically low. Future research will be directed to further improving the segmentation of deep gray matter.

Acknowledgments

This work was supported in part by a UNC BRIC-Radiology startup fund and NIH grants (EB006733, EB009634, AG041721, MH100217, AA010723, and 1UL1TR001111).

References

  • 1. Liu T, Li H, Wong K, Tarok A, Guo L, Wong ST. Brain tissue segmentation based on DTI data. NeuroImage. 2007;38(1):114–123. doi: 10.1016/j.neuroimage.2007.07.002.
  • 2. Awate SP, Zhang H, Simon TJ, Gee JC. Multivariate segmentation of brain tissues by fusion of MRI and DTI data. IEEE International Symposium on Biomedical Imaging (ISBI). 2008. pp. 213–216.
  • 3. Lenglet C, Rousson M, Deriche R. A statistical framework for DTI segmentation. International Symposium on Biomedical Imaging. Apr, 2006. pp. 794–797.
  • 4. Daducci A, Van De Ville D, Thiran JP, Wiaux Y. Sparse regularization for fiber ODF reconstruction: From the suboptimality of ℓ2 and ℓ1 priors to ℓ0. Medical Image Analysis. 2014;18:820–833. doi: 10.1016/j.media.2014.01.011.
  • 5. Yap PT, Shen D. Spatial transformation of DWI data using non-negative sparse representation. IEEE Transactions on Medical Imaging. 2012;31(11):2035–2049. doi: 10.1109/TMI.2012.2204766.
  • 6. Ramirez-Manzanares A, Rivera M, Vemuri BC, Carney P, Mareci T. Diffusion basis functions decomposition for estimating white matter intra-voxel fiber geometry. IEEE Transactions on Medical Imaging. 2007;26(8):1091–1102. doi: 10.1109/TMI.2007.900461.
  • 7. Candès EJ, Wakin MB, Boyd SP. Enhancing sparsity by reweighted ℓ1 minimization. Journal of Fourier Analysis and Applications. 2008;14(5):877–905.
  • 8. Xu L, Lu C, Xu Y, Jia J. Image smoothing via ℓ0 gradient minimization. ACM Transactions on Graphics. 2011;30(5).
  • 9. Yang J, Yin W, Zhang Y, Wang Y. A fast algorithm for edge-preserving variational multichannel image restoration. SIAM Journal on Imaging Sciences. 2009;2(2):569–592.
  • 10. Tao M, Yang J. Alternating direction algorithms for total variation deconvolution in image reconstruction. Technical report. Department of Mathematics, Nanjing University; 2009.
  • 11. Wright J, Yang AY, Ganesh A, Sastry SS, Ma Y. Robust face recognition via sparse representation. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2009;31(2):1–18. doi: 10.1109/TPAMI.2008.79.
  • 12. Rudin LI, Osher S, Fatemi E. Nonlinear total variation based noise removal algorithms. Physica D. 1992;60:259–268.
  • 13. Jeurissen B, Tournier JD, Dhollander T, Connelly A, Sijbers J. Multi-tissue constrained spherical deconvolution for improved analysis of multi-shell diffusion MRI data. NeuroImage. 2014. doi: 10.1016/j.neuroimage.2014.07.061.
  • 14. Simon N, Friedman J, Hastie T, Tibshirani R. A sparse-group lasso. Journal of Computational and Graphical Statistics. 2013;22(2):231–245.
  • 15. Yap PT, Zhang Y, Shen D. Diffusion compartmentalization using response function groups with cardinality penalization. Medical Image Computing and Computer-Assisted Intervention. 2015. doi: 10.1007/978-3-319-24553-9_23.
  • 16. Blumensath T, Davies ME. Iterative thresholding for sparse approximations. Journal of Fourier Analysis and Applications. 2008;14(5–6):629–654.
  • 17. Lu Z. Iterative hard thresholding methods for ℓ0 regularized convex cone programming. Mathematical Programming. 2013:1–30.
  • 18. Boyd S, Parikh N, Chu E, Peleato B, Eckstein J. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning. 2010;3(1):1–122.
  • 19. Van Essen DC, Smith SM, Barch DM, Behrens TE, Yacoub E, Ugurbil K. The WU-Minn human connectome project: An overview. NeuroImage. 2013;80:62–79. doi: 10.1016/j.neuroimage.2013.05.041.
  • 20. Zhang Y, Brady M, Smith S. Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm. IEEE Transactions on Medical Imaging. 2001;20(1):45–57. doi: 10.1109/42.906424.
