Author manuscript; available in PMC: 2009 Nov 30.
Published in final edited form as: Med Image Comput Comput Assist Interv. 2008;11(Pt 1):279–286. doi: 10.1007/978-3-540-85988-8_34

Joint Segmentation of Thalamic Nuclei from a Population of Diffusion Tensor MR Images

Ulas Ziyan 1, Carl-Fredrik Westin 1,2
PMCID: PMC2785443  NIHMSID: NIHMS145394  PMID: 18979758

Abstract

Several recent studies explored the use of unsupervised segmentation methods for segmenting thalamic nuclei from diffusion tensor images. These methods provide a plausible segmentation on individual subjects; however, they do not address the problem of consistently identifying the same functional areas in a population. The lack of correspondence between the segmented nuclei makes it more difficult to use the results from the unsupervised segmentation tools for morphometry. In this paper we present a novel segmentation algorithm to automatically segment the gray matter nuclei while ensuring consistency between subjects in a population. This new algorithm, referred to as Consistency Clustering, finds correspondence between the nuclei as the segmentation is achieved through a single model for the whole population, similar to the brain atlases experts use to identify thalamic nuclei.

1 Introduction

Diffusion tensor imaging (DTI) is a relatively new imaging modality that measures free water diffusion, i.e. Brownian motion, of the endogenous water in tissue [1]. In human brain tissue, the water diffusion is not the same in all directions, since it is obstructed by structural elements such as cell membranes or myelin [1]. When this obstruction constrains the water diffusion in a coherent direction, such as within the cerebral white matter, the resulting water diffusion tensor becomes anisotropic, containing information about the directionality of the white matter connectivity. Thus, quantification of water diffusion in tissue through DTI provides a unique way to analyze white matter organization of the brain.

Unlike white matter, the tissue in gray matter is less organized in orientation. The lack of coherent orientation limits the use of DTI for gray matter analysis in some areas, such as the cerebral cortex. However, certain gray matter structures, such as the thalamus, exhibit coherence in diffusion direction due to the presence of coherent white matter near them. The thalamus acts as the central relay station of the brain: nearly all sensory tract projections pass through the thalamus on their way to the cortex. Since functionally related pathways target the same region of cortex once they leave the thalamus, they result in organization of diffusivity within the thalamus. This organized diffusion can be measured in DTI, and it has been proposed that the thalamic nuclei can be distinguished by their characteristic diffusion orientation [2].

Precise identification of the thalamic nuclei is essential in a clinical setting, since many motor-control disorders are surgically corrected by applying chronic electrical stimulation to the appropriate functional area of the thalamus. Currently, these regions are detected qualitatively before the operation using generic atlases along with structural MRI [6], which does not provide adequate contrast to identify the distinct nuclei. Changes have also been reported in the thalamic nuclei during the progression of a large number of diseases, including schizophrenia [7] and Parkinson’s disease [8].

Since the realization that thalamic nuclei can be resolved through DTI, several algorithms have been proposed to segment them. The earliest segmentation method, which depends on DTI data only from within the thalamus, uses the k-means clustering algorithm [2]. Later, other clustering methods were proposed that use spectral clustering [3], level-sets [4] and the mean-shift algorithm [5]. These later methods avoid some of the weaknesses of k-means, which include a bias toward ellipsoidal clusters and sensitivity to initialization. Even though each of these clustering algorithms produces plausible segmentations for any given subject in a population, they do not find a correspondence between the segments acquired from different subjects.

In this paper, we present a new approach to the segmentation of thalamic nuclei. Unlike the previous methods, this new algorithm, referred to as Consistency Clustering (CC), is designed to segment multiple subjects simultaneously and find a correspondence between the segmentation results (Figure 1). CC achieves these goals by learning a thalamic model of the population under investigation, which serves as a probabilistic atlas of the thalamic nuclei. This model involves a spatial component as well as a directional component for each nucleus. CC also performs a poly-rigid registration to account for inter-subject variability. Since the segmentation of each individual subject is done according to a common model, the consistency of segmentations between subjects is ensured. This joint segmentation approach results in a segmentation for each subject and determines a correspondence between subjects. Also, the thalamic model, which is learned from a population of labeled or unlabeled data, serves as an anatomical atlas for the population under investigation.

Fig. 1. Schematic description of previous thalamus segmentation algorithms [2,3,4,5] (left) as opposed to Consistency Clustering (right).

In the following sections, we first describe the theory behind the method, and then present results from several experiments that demonstrate the feasibility of the proposed method with DTI data from 10 healthy participants.

2 Theory

In this section we formulate the problem of joint segmentation of thalamic nuclei as a maximum likelihood problem and solve it using the generalized expectation maximization algorithm [9]. The algorithm iteratively increases the joint probability of observing the set of thalami under investigation. The joint probability is measured in terms of a mixture density model that accounts for spatial distribution of the nuclei as well as the principal diffusion orientation. The inter-subject variability is also handled within the same framework by introducing a set of parameters describing a poly-rigid registration.

The DTI data is modeled with a set of parameters, Θ = {πc, μc, Σc, νc, κc} ∪ {Rs}, where c is an index over clusters, i.e. c ∈ {1, 2, …, C}, and s is an index over subjects, i.e. s ∈ {1, 2, …, S}. Given these parameters, the likelihood of the subjects becomes:

$$\Lambda(X, V; \Theta) = \prod_{i=1}^{N} \sum_{c=1}^{C} \pi_c \, f_x(x_i; \Theta) \, f_\upsilon(\upsilon_i; \Theta),$$

where we assume independence between every observed sample (voxel) and also independence between the spatial location xi and principal diffusion orientation υi. We model the spatial distribution with a Gaussian:

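$$f_x(x; \Theta) = f_x(x; \mu_c, \Sigma_c) = \frac{1}{(2\pi)^{3/2} |\Sigma_c|^{1/2}} \exp\left(-\frac{1}{2} (x - \mu_c)^T \Sigma_c^{-1} (x - \mu_c)\right),$$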

where µc is the mean vector and Σc is the covariance matrix. We model the distribution of the principal diffusion directions with a von Mises-Fisher distribution:

$$f_\upsilon(\upsilon; \Theta) = f_\upsilon(\upsilon; \nu_c, \kappa_c) = C(\kappa_c) \exp\left(\kappa_c \nu_c^T \upsilon\right),$$

where νc is the mean orientation and κc is the concentration parameter. The constant is $C(\kappa) = \kappa^{1/2} / \left((2\pi)^{3/2} I_{1/2}(\kappa)\right)$, where $I_{1/2}(\kappa)$ is the modified Bessel function of the first kind of order 1/2. Under this model, we formulate our problem as a maximum likelihood estimation of the parameter set Θ:

$$\Theta^* = \arg\max_{\Theta} \Lambda(X, V; \Theta).$$
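As a rough illustration of the directional term of the model, the sketch below evaluates the von Mises-Fisher density on the unit sphere in three dimensions. It relies on the standard identity I_{1/2}(κ) = sqrt(2/(πκ)) sinh κ, under which the constant C(κ) given above reduces to κ/(4π sinh κ). The function names `vmf_log_norm_const` and `vmf_pdf` are illustrative assumptions and are not taken from the authors' implementation.

```python
import math


def vmf_log_norm_const(kappa):
    """Return log C(kappa) for the von Mises-Fisher density on the unit sphere in 3-D.

    Uses I_{1/2}(k) = sqrt(2 / (pi k)) * sinh(k), which gives C(k) = k / (4 pi sinh k).
    """
    if kappa < 1e-8:
        # Limit kappa -> 0: the density becomes uniform on the sphere.
        return -math.log(4.0 * math.pi)
    # log(sinh k) written as k + log(1 - exp(-2k)) - log 2 to avoid overflow for large k.
    log_sinh = kappa + math.log1p(-math.exp(-2.0 * kappa)) - math.log(2.0)
    return math.log(kappa) - math.log(4.0 * math.pi) - log_sinh


def vmf_pdf(v, nu, kappa):
    """Density C(kappa) * exp(kappa * nu^T v) for unit 3-vectors v and nu."""
    dot = sum(a * b for a, b in zip(nu, v))
    return math.exp(vmf_log_norm_const(kappa) + kappa * dot)
```

Working with the logarithm of the constant keeps the evaluation stable when κ is large, which is the regime of tightly aligned diffusion directions.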

In the next sections we present the update equations for our formulation to iteratively estimate Θ*. Detailed derivations have been omitted due to limited space.

2.1 E-Step

In the E-step, CC updates the membership probabilities for each voxel, given the estimate of the parameter set at iteration (n), Θ(n):

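$$p(c \mid x_i, \upsilon_i; \Theta^{(n)}) \propto \pi_c^{(n)} \, f_x(R_s^{(n)} x_i; \mu_c^{(n)}, \Sigma_c^{(n)}) \, f_\upsilon(R_s^{(n)} \upsilon_i; \nu_c^{(n)}, \kappa_c^{(n)}) \triangleq p_{ci}^{(n)},$$

where $p_{ci}^{(n)}$ is normalized at every iteration, so that $\sum_c p_{ci}^{(n)} = 1$ for all voxels.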
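The normalization in the E-step can be sketched for a single voxel as follows. This is a minimal illustration that assumes the three log factors have already been evaluated per cluster at the registered voxel; the function name `responsibilities` and the use of a log-sum-exp shift are choices made here, not details stated in the paper.

```python
import math


def responsibilities(log_pi, log_fx, log_fv):
    """Normalize pi_c * f_x * f_v over clusters for a single voxel.

    Each argument is a list with one entry per cluster, holding the log of the
    corresponding factor evaluated at the (registered) voxel.
    """
    log_joint = [a + b + c for a, b, c in zip(log_pi, log_fx, log_fv)]
    shift = max(log_joint)  # log-sum-exp shift for numerical stability
    weights = [math.exp(value - shift) for value in log_joint]
    total = sum(weights)
    return [w / total for w in weights]
```

Subtracting the largest log value before exponentiating leaves the normalized probabilities unchanged while avoiding underflow when all clusters assign a very small density to a voxel.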

2.2 M-Step

In the M-step, CC updates the parameter set Θ to maximize the expected value of the log likelihood. Ignoring constant terms that do not depend on Θ, the expected value of the log likelihood is:

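$$\beta(X, V; \Theta) = \sum_{i=1}^{N} \sum_{c=1}^{C} p_{ci}^{(n)} \left( \log \pi_c + \log f_x(R_s^{(n)} x_i; \Theta^{(n)}) + \log f_\upsilon(R_s^{(n)} \upsilon_i; \Theta^{(n)}) \right) \qquad (1)$$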

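For a given parameter set Θ(n), the update equations for Θ(n+1) are derived using Lagrange multipliers for the corresponding constraints and setting the derivative of (1) to zero. Let $P_c^{(n)} = \sum_{i=1}^{N} p_{ci}^{(n)}$; then the resulting update equations are:

$$\begin{aligned}
\pi_c^{(n+1)} &= P_c^{(n)} / N, \\
\mu_c^{(n+1)} &= \left(1 / P_c^{(n)}\right) \sum_{i=1}^{N} p_{ci}^{(n)} x_i, \\
\Sigma_c^{(n+1)} &= \left(1 / P_c^{(n)}\right) \sum_{i=1}^{N} p_{ci}^{(n)} (x_i - \mu_c^{(n+1)})(x_i - \mu_c^{(n+1)})^T, \\
r_c &= \sum_{i=1}^{N} p_{ci}^{(n)} \upsilon_i, \\
\bar{r}_c &= \|r_c\| / P_c^{(n)}, \\
\nu_c^{(n+1)} &= r_c / \|r_c\|, \\
\kappa_c^{(n+1)} &\approx (3 \bar{r}_c - \bar{r}_c^3) / (1 - \bar{r}_c^2),
\end{aligned}$$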

where the last equation is an approximation to the true parameter κc [10].
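The closed-form M-step updates for a single cluster can be sketched as below. This is an illustrative pure-Python version, not the authors' MATLAB code: the function names are assumptions, the voxel coordinates are taken to be already registered, and the degenerate cases (a zero resultant vector, or perfectly aligned directions where the mean resultant length equals 1) are not handled.

```python
import math


def update_spatial(p, xs):
    """Mixing weight, mean and covariance of one cluster.

    p  : membership probabilities p_ci of this cluster, one per voxel
    xs : voxel coordinates, one 3-tuple per voxel
    """
    P = sum(p)
    pi_c = P / len(p)
    mu = [sum(w * x[k] for w, x in zip(p, xs)) / P for k in range(3)]
    sigma = [[sum(w * (x[a] - mu[a]) * (x[b] - mu[b]) for w, x in zip(p, xs)) / P
              for b in range(3)] for a in range(3)]
    return pi_c, mu, sigma


def update_direction(p, dirs):
    """Mean direction and approximate concentration of one cluster.

    dirs : unit principal diffusion directions, one 3-tuple per voxel
    """
    P = sum(p)
    r = [sum(w * v[k] for w, v in zip(p, dirs)) for k in range(3)]
    r_norm = math.sqrt(sum(x * x for x in r))
    r_bar = r_norm / P  # mean resultant length, between 0 and 1
    nu = [x / r_norm for x in r]
    kappa = (3.0 * r_bar - r_bar ** 3) / (1.0 - r_bar ** 2)
    return nu, kappa
```

For example, two equally weighted orthogonal directions give a mean resultant length of about 0.707 and a concentration of about 3.54, whereas nearly parallel directions push the mean resultant length toward 1 and the concentration toward very large values.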

Registration

Registration parameters are also updated in the M-step. We parametrize the registration as one rigid transformation per cluster per subject, i.e.,

$$R_s^{(n)} x_i = R_{sc}^{(n)} (x_i - \mu_{sc}^{(n)}) + \mu_{sc}^{(n)} + t_{sc}^{(n)}, \qquad R_s^{(n)} \upsilon_i = R_{sc}^{(n)} \upsilon_i,$$

where $\mu_{sc}^{(n)} = \sum_{i \in s} p_{ci}^{(n-1)} x_i / \sum_{i \in s} p_{ci}^{(n-1)}$ represents the weighted mean of the voxel locations in a given subject. As with the other parameters, setting the derivative of (1) to zero gives:

$$t_{sc}^{(n+1)} = \mu_c^{(n+1)} - \mu_{sc}^{(n+1)},$$

where $\mu_c^{(n+1)} = \sum_{i \in s^c} p_{ci}^{(n)} x_i / \sum_{i \in s^c} p_{ci}^{(n)}$. Unfortunately, the same technique does not lead to a simple analytical solution for the rotation matrices $R_{sc}^{(n+1)}$. Instead, we derive a maximum likelihood objective function and optimize it using a numerical scheme. The resulting optimization problem is:

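$$R_{sc}^{(n+1)} = \arg\max_{R_{sc}} \sum_{i \in s} 2 \kappa_c^{(n+1)} \nu_c^{(n+1),T} R_{sc} \upsilon_i - x_i^T R_{sc}^T \Sigma_c^{(n+1),-T} R_{sc} x_i \quad \text{s.t.} \quad R_{sc} R_{sc}^T = I \ \text{and} \ |R_{sc}| = 1.$$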

We further parametrize Rsc using Euler angles so that the constraints are automatically met. Then we find optimal values for the Euler angles (and therefore Rsc) using a simplex search method [11].
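To illustrate why an Euler-angle parametrization satisfies the constraints automatically, the sketch below builds a rotation matrix from three angles and applies the per-cluster rigid transformation described above. The z-y-x composition order and the function names are assumptions made for this example, since the text does not state which Euler convention was used; the simplex search itself is not shown.

```python
import math


def euler_to_rotation(ax, ay, az):
    """Rotation matrix R = Rz(az) * Ry(ay) * Rx(ax) as nested lists (angles in radians).

    Any choice of angles yields an orthogonal matrix with determinant 1,
    so an unconstrained search over the angles stays on the rotation group.
    """
    cx, sx = math.cos(ax), math.sin(ax)
    cy, sy = math.cos(ay), math.sin(ay)
    cz, sz = math.cos(az), math.sin(az)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]


def apply_rigid(R, t, mu_sc, x):
    """Map x to R (x - mu_sc) + mu_sc + t, rotating about the subject's cluster centroid."""
    d = [x[k] - mu_sc[k] for k in range(3)]
    return [sum(R[a][b] * d[b] for b in range(3)) + mu_sc[a] + t[a] for a in range(3)]
```

With the identity rotation and the translation set to the difference between the population and subject centroids, `apply_rigid` moves the subject's cluster centroid onto the population centroid, which matches the translation update given earlier in this section.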

3 Methods

3.1 Image Acquisition and Pre-processing

DTI data were acquired using a twice-refocused spin-echo EPI sequence [12] on a 3 Tesla Siemens Trio MRI scanner using an 8-channel head coil. The sequence parameters were TR/TE = 8400/82 ms, b = 700 s/mm², gmax = 26 mT/m, 10 T2 images, 60 diffusion gradient directions. The resulting images had 2 × 2 mm in-plane resolution with a slice thickness of 2 mm with 0 mm gap.

Correction for motion and residual eddy current distortion was achieved by registering all of the scans to the first acquired non-diffusion-weighted scan for each participant. The registration used a 12 degree-of-freedom global affine transformation and a mutual information cost function [13]. Trilinear interpolation was used for the resampling. The diffusion tensors were calculated for each voxel using the formulas of [14].

The diffusion tensor volumes were normalized to MNI space (Montreal Neurological Institute) by registering each participant’s T2 volume to a skull-stripped version of the MNI 152-subject T2 template [15] with a 12 degree-of-freedom global affine transformation and then applying that transformation to the diffusion tensor volumes. The tensors were reoriented using the rotational portion of the atlas transformation.

Thalamus masks were then drawn manually for each individual by a trained neuro-anatomist. The masks were drawn for each hemisphere on each individual’s MNI-normalized FA map following the guidelines from [16]. Each hemisphere was further segmented into its seven nuclei on the corresponding tensor map by the neuro-anatomist following the drawings of [16].

3.2 Experiments

CC was validated on 10 normal subjects’ DTI datasets. Each subject’s thalamus (left hemisphere only) was segmented individually using the k-means algorithm as described in [2] and spectral clustering as described in [3] to create benchmarks. The same thalami were then segmented jointly using CC with the same (uniform) initialization used for the k-means algorithm. The joint segmentation resulted in corresponding segmentations in all subjects, whereas the k-means clustering did not (Figure 2A, B). To test the use of prior information, we repeated the joint segmentation experiment 10 times in a leave-one-out fashion, once for each held-out thalamus. For each joint segmentation, we fixed the voxel labels for 9 of the subjects at the expert labels (to “anchor” the model), and let the last subject’s labels vary. The resulting segmentations were not only consistent among subjects (Figure 2C), but also matched well qualitatively with the expert labels (Figure 2D).
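One plausible way to fix the labels of the anchoring subjects is to overwrite their membership probabilities with one-hot vectors after each E-step. The sketch below illustrates that idea only; the paper does not describe how the labels were fixed, and the function name `clamp_to_expert` and the use of `None` for unlabeled voxels are hypothetical.

```python
def clamp_to_expert(p, expert_label):
    """Return the membership probabilities of one voxel, clamped if a label is known.

    p            : membership probabilities over clusters from the E-step
    expert_label : cluster index given by the expert, or None for an unlabeled voxel
    """
    if expert_label is None:
        # Voxel from the held-out subject: keep the E-step estimate.
        return list(p)
    # Voxel from an anchoring subject: put all probability on the expert label.
    return [1.0 if c == expert_label else 0.0 for c in range(len(p))]
```

Under this assumption the M-step is unchanged: labeled voxels contribute to exactly one cluster, while the voxels of the held-out subject contribute in proportion to their estimated memberships.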

Fig. 2. Segmentation results from three subjects’ left thalamic hemispheres are shown. Colors indicate the mean diffusion orientation in each cluster. (A) Segmentations obtained using k-means have an ellipsoidal bias and do not correspond well between subjects. (B) Segmentations obtained using CC with no prior information are consistent among subjects. (C) Segmentations obtained using CC with prior information are both consistent among subjects and match well with the expert segmentations. (D) Expert-labeled thalami are shown. Note that even though the segmentations in (C) and (D) look very similar, they are not exactly the same (see Figure 3).

We also quantified the accuracy of the segmentation results against the expert labels using the Dice volume overlap measure [17] (Figure 3). Both spectral clustering and CC without prior information resulted in comparable volume overlaps, while k-means performed the worst due to its simple nature. Furthermore, CC resulted in a slightly higher average overlap and less variability around this average among subjects, indicating a better performance overall. The decrease in the variability is due to the increased consistency of segmentations between the subjects. Not surprisingly, the use of prior information improved the volume overlaps, indicating the need for prior information and the weakness of the unsupervised algorithms for replicating expert preference.
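For reference, the Dice overlap of a single nucleus between two labelings of the same voxels is twice the size of the intersection divided by the sum of the two region sizes. A minimal sketch follows, assuming both labelings are given as flat lists of cluster labels in the same voxel order; the function name and the convention for empty regions are choices made for this example.

```python
def dice_overlap(labels_a, labels_b, label):
    """Dice coefficient 2 |A and B| / (|A| + |B|) for one label over two voxel labelings."""
    a = {i for i, value in enumerate(labels_a) if value == label}
    b = {i for i, value in enumerate(labels_b) if value == label}
    if not a and not b:
        return 1.0  # convention assumed here: two empty regions overlap perfectly
    return 2.0 * len(a & b) / (len(a) + len(b))
```

A value of 1 indicates identical regions and 0 indicates no shared voxels; for instance, a region of three voxels compared with a region of two voxels contained in it yields 2 · 2 / (3 + 2) = 0.8.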

Fig. 3. Volume overlaps between the expert segmentations and the segmentations obtained using k-means [2], spectral clustering [3] and Consistency Clustering (CC), without and with prior information on the expert labels, for the left hemispheres of 10 subjects. The boxes indicate one standard deviation around the mean, and the thin lines indicate the range.

The algorithm took under 2 minutes to converge for the joint segmentation of 10 subjects, using a non-optimized MATLAB implementation on a desktop personal computer. Its complexity is linear in the number of voxels for a fixed number of clusters, similar to simple k-means clustering.

4 Discussion and Conclusion

In this paper we presented a novel algorithm, called Consistency Clustering, for jointly segmenting a population of diffusion tensor images of the deep gray matter. The joint segmentation resulted not only in plausible segmentations for each subject, but also in correspondence between the subjects. This is an important difference between CC and previously proposed algorithms for segmenting the gray matter, since without correspondence between the segmentations of individual subjects, it is difficult to assign consistent anatomical labels to the resulting segmentations. Also, without consistent and anatomically meaningful segmentations, quantitative morphometry in the gray matter becomes a challenge.

CC not only provided consistent segmentations for the population, but it was also able to handle prior information about the expert labels. Also, through the use of labels from other subjects in the population, the algorithm was able to produce segmentations that were both qualitatively and quantitatively very similar to the expert’s preference. Therefore, CC can be used in two different ways to produce consistent segmentations in a population. The first way involves running the algorithm unsupervised on the population, and then assigning anatomical labels to the segmentations only on one of the subjects. The labels are then automatically transferred to the rest of the subjects since the correspondence problem is already solved at this stage. The second way involves labeling one or several subjects by hand, and then using these labeled subjects as prior information to label the rest of the population according to the expert preference.

Either way, CC (or a variant with an improved model for the thalamic nuclei) is a powerful tool that provides fast and consistent segmentation of the deep gray matter and has a use in a variety of applications such as in quantitative morphometry studies and pre-surgical planning.

Footnotes

* The authors would like to thank Jonathan J Wisco for providing the thalamus masks and manual segmentations of the nuclei. This work was supported by NIH NIBIB NAMIC U54-EB005149, NIH NCRR NAC P41-RR13218 and R01-MH074794.

References

  • 1. Basser P, Mattiello J, Bihan DL. MR diffusion tensor spectroscopy and imaging. Biophys. J. 1994;66:259–267. doi: 10.1016/S0006-3495(94)80775-1.
  • 2. Wiegell MR, et al. Automatic segmentation of thalamic nuclei from diffusion tensor magnetic resonance imaging. Neuroimage. 2003;19:391–402. doi: 10.1016/s1053-8119(03)00044-2.
  • 3. Ziyan U, Tuch D, Westin CF. Segmentation of thalamic nuclei from DTI using spectral clustering. In: Larsen R, Nielsen M, Sporring J, editors. MICCAI 2006. LNCS. vol. 4191. Heidelberg: Springer; 2006. pp. 807–814.
  • 4. Jonasson L, et al. A level set method for segmentation of the thalamus and its nuclei in DT-MRI. Signal Process. 2007;87(2):309–321.
  • 5. Duan Y, Li X, Xi Y. Thalamus segmentation from diffusion tensor magnetic resonance imaging. Journal of Biomedical Imaging. 2007;2. doi: 10.1155/2007/90216.
  • 6. Guridi J, et al. Targeting the basal ganglia for deep brain stimulation in Parkinson’s disease. Neurology. 2000;55:S21–S28.
  • 7. Portas C, et al. Volumetric evaluation of the thalamus in schizophrenic male patients using magnetic resonance imaging. Biol. Psych. 1998:649–659. doi: 10.1016/s0006-3223(97)00339-9.
  • 8. Giroux ML, et al. Medication related changes in cerebral glucose metabolism in Parkinson’s disease. ICFMHB. 1998:237.
  • 9. Mclachlan GJ, Krishnan T. The EM Algorithm and Extensions. Chichester: Wiley-Interscience; 2007.
  • 10. Banerjee A, Dhillon IS, Ghosh J, Sra S. Clustering on the unit hypersphere using von Mises-Fisher distributions. J. Mach. Learn. Res. 2005;6:1345–1382.
  • 11. Lagarias JC, et al. Convergence properties of the Nelder-Mead simplex method in low dimensions. SIAM Journal on Optimization. 1998;9:112–147.
  • 12. Reese TG, et al. Reduction of eddy-current-induced distortion in diffusion MRI using a twice-refocused spin echo. MRM. 2003;49(1):177–182. doi: 10.1002/mrm.10308.
  • 13. Jenkinson M, et al. Improved optimization for the robust and accurate linear registration and motion correction of brain images. Neuroimage. 2002;17:825–841. doi: 10.1016/s1053-8119(02)91132-8.
  • 14. Basser PJ, Mattiello J, LeBihan D. Estimation of the effective self-diffusion tensor from the NMR spin echo. J. Magn. Reson. B. 1994;103(3):247–254. doi: 10.1006/jmrb.1994.1037.
  • 15. Mazziotta JC, et al. A probabilistic atlas of the human brain: theory and rationale for its development. Neuroimage. 1995;2(2):89–101. doi: 10.1006/nimg.1995.1012.
  • 16. Ooteman W, Cretsinger K. Thalamus Tracing Guidelines. http://www.psychiatry.uiowa.edu/mhcrc/IPLpages/manual_tracing.htm.
  • 17. Dice LR. Measures of the amount of ecologic association between species. Ecology. 1945;26:297–302.
