Author manuscript; available in PMC: 2018 Oct 5.
Published in final edited form as: Patch Based Tech Med Imaging (2016). 2016 Sep 22;9993:34–42. doi: 10.1007/978-3-319-47118-1_5

Consistent Multi-Atlas Hippocampus Segmentation for Longitudinal MR Brain Images with Temporal Sparse Representation

Lin Wang 1,2, Yanrong Guo 2, Xiaohuan Cao 2,3, Guorong Wu 2, Dinggang Shen 2
PMCID: PMC6172962  NIHMSID: NIHMS963642  PMID: 30294728

Abstract

In this paper, we propose a novel multi-atlas based longitudinal label fusion method that uses a temporal sparse representation technique to segment hippocampi at all time points simultaneously. First, we use groupwise longitudinal registration to simultaneously (1) estimate a group-mean image of a subject image sequence and (2) register all of its time-point images to the estimated group-mean image consistently over time. Then, by registering all atlases to the group-mean image, we align all atlases in a longitudinally consistent way to each time point of the subject image sequence. Finally, we propose a longitudinal label fusion method that propagates all atlas labels to the subject image sequence by simultaneously labeling a set of temporally corresponding voxels, with a temporal consistency constraint on the sparse representation. Experimental results demonstrate that our proposed method achieves more accurate and consistent hippocampus segmentation than state-of-the-art methods.

1 Introduction

The hippocampus plays a crucial role in the memory and spatial navigation functions of the brain [1]. Structural change of the hippocampus over time is closely related to many neurodegenerative diseases, such as Alzheimer's disease (AD). As a characteristic feature of AD, hippocampal atrophy is considered a potential biomarker for the diagnosis and assessment of AD in magnetic resonance (MR) imaging based neuroscience studies [2, 3]. To measure hippocampal atrophy over time, accurate quantification of hippocampal volumes from serial structural three-dimensional (3D) MR images is required. To this end, it is important to segment the four-dimensional (3D+t) hippocampus from longitudinal structural MR images both accurately and consistently over time.

Many automatic segmentation methods have been proposed to segment the 3D hippocampus independently from MR images of different time points [4–8]. In this case, hippocampus segmentation from longitudinal MR images is decomposed into a series of separate segmentations, one for each 3D MR image. However, because noise and hippocampal tissue contrast vary across the acquired longitudinal MR images, 3D segmentation methods applied to longitudinal MR images yield limited temporal consistency in the segmented hippocampi. To improve segmentation consistency for longitudinal MR images, several longitudinal segmentation methods have been proposed [9–12]. Wolz et al. [9] proposed a 4D graph-cut based method to simultaneously segment longitudinal MR images. By using a 4D graph to represent longitudinal MR data, the method segments MR images at all time points by solving the min-cut/max-flow problem on the 4D graph. Chincarini et al. [12] presented a hippocampal segmentation method that integrates longitudinal information. They implemented longitudinal analysis in four progressive steps and assessed the impact of these steps on the longitudinal performance of hippocampal volume measurements for early detection of AD. However, due to the large variance of noise level and intensity bias field across time points in longitudinal MR images, consistent hippocampus segmentation of serial MR images remains a challenging problem.

Accordingly, in this paper, we propose a 3D+t hippocampus segmentation method for longitudinal MR brain images that integrates temporal sparse representation into the multi-atlas patch-based label fusion framework. First, we use the groupwise longitudinal image registration toolbox (GLIRT) [13] to simultaneously (1) estimate a subject-specific group-mean image and (2) register all time-point images of the subject image sequence consistently to the estimated group-mean image. Then, by registering all atlas images to the estimated group-mean image, we align all atlases in a longitudinally consistent way to each time point of the subject image sequence. Given this temporal correspondence in the subject image sequence, we can form a 3D+t image patch at each location of the subject brain. We then use the temporal sparse representation technique to simultaneously determine labels for all time points of the subject image sequence by propagating labels from all aligned atlas images. Experimental results on both simulated and real longitudinal MR images demonstrate that our proposed method achieves more accurate and consistent hippocampus segmentation than state-of-the-art multi-atlas label fusion methods, which often apply label fusion to each time point independently.

2 Method

Before describing our method in detail, we first introduce some notation. A longitudinal subject image sequence (also called a 3D+t subject image) is denoted as $\{T'_t\}_{t=1}^{N}$, where $T'_t$ is a 3D image at time point $t$ ($t \in \{1, \dots, N\}$). The $M$ 3D atlas images are denoted as $I^{(1)}, \dots, I^{(M)}$, with their corresponding hippocampal label images denoted as $L^{(1)}, \dots, L^{(M)}$. Our goal is to segment the 3D+t subject image, i.e., to automatically estimate a hippocampal label image sequence $\{L'_t\}_{t=1}^{N}$ corresponding to the 3D+t subject image $\{T'_t\}_{t=1}^{N}$, as illustrated in Fig. 1, where $L'_t$ denotes a 3D subject label image at time point $t$ ($t \in \{1, \dots, N\}$).

Fig. 1. Schematic diagram of the proposed 3D+t hippocampus segmentation method.

2.1 Temporal Sparse Representation

Unbiased Groupwise Registration

The first step in our method is to estimate temporal correspondences across the time points of the subject image sequence. Here, we use GLIRT [13] to achieve this goal. Unlike pairwise registration methods, which need to choose a reference template, groupwise registration is free of template selection. It can thus build an unbiased subject-specific group-mean image and simultaneously register all time-point images to this group-mean image. Denoting by $\{\varphi'_t\}_{t=1}^{N}$ the deformation fields for the $N$ time points, the original 3D+t subject image $\{T'_t\}_{t=1}^{N}$ can be transformed to the group-mean image space as $\{T_t\}_{t=1}^{N}$, as illustrated in Fig. 1. Here, $T_t = \varphi'_t(T'_t)$.

Furthermore, all 3D atlas images are registered to the estimated group-mean image, first by affine registration [14] with 12 degrees of freedom and then by deformable registration [15]. By applying the same estimated affine transformation matrix and deformation field, the corresponding 3D hippocampal label image of each atlas is also registered to the estimated group-mean image space. As shown in Fig. 1, $I^{(1)}, \dots, I^{(M)}$ and $L^{(1)}, \dots, L^{(M)}$ denote the $M$ aligned 3D atlas images and their corresponding aligned 3D label images in the estimated group-mean image space, respectively.

Temporal Sparse Patch-Based Representation

For each voxel $x$ in the group-mean image, its 3D+t subject patch can be extracted as $\{\alpha_{(x,t)}\}_{t=1}^{N}$, where $\alpha_{(x,t)}$ represents a 3D patch centered at voxel $x$ of the aligned 3D+t subject image at time point $t$. Let $n(x)$ denote a spatial neighborhood of $x$. All candidate atlas patches within the search neighborhood $n(x)$ of the aligned atlas images $\{I^{(m)}\}_{m=1}^{M}$ are denoted as $\{\beta_z^{(m)} \mid z \in n(x),\ m = 1, \dots, M\}$, along with their corresponding center voxel labels $\{l_z^{(m)} \mid z \in n(x),\ m = 1, \dots, M\}$. The total number of atlas patches $\{\beta_z^{(m)} \mid z \in n(x),\ m = 1, \dots, M\}$ used to label the 3D+t subject patch $\{\alpha_{(x,t)}\}_{t=1}^{N}$ is $Q = M \times |n(x)|$, where $|n(x)|$ denotes the cardinality of $n(x)$.
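The candidate-patch collection described above can be sketched in a few lines. This is only an illustration under stated assumptions: volumes are plain nested lists indexed as `volume[z][y][x]`, the neighborhood is a cube, and the function names `extract_patch` and `build_dictionary` are hypothetical, not part of the authors' implementation. It assumes the voxel lies far enough from the image border that all indices are valid.

```python
from itertools import product


def extract_patch(volume, center, radius=1):
    """Return the cubic patch around `center` as a flat list of (2*radius+1)**3 values."""
    cz, cy, cx = center
    offsets = range(-radius, radius + 1)
    return [volume[cz + dz][cy + dy][cx + dx]
            for dz, dy, dx in product(offsets, repeat=3)]


def build_dictionary(atlas_images, atlas_labels, x, patch_radius=1, search_radius=1):
    """Collect candidate atlas patches and their center labels for voxel `x`.

    Returns (patches, labels). `patches` holds Q = M * |n(x)| flat patch
    vectors (the columns of the dictionary) and `labels` holds the label of
    each patch's center voxel, in the same order.
    """
    offsets = range(-search_radius, search_radius + 1)
    patches, labels = [], []
    for image, label_image in zip(atlas_images, atlas_labels):
        for dz, dy, dx in product(offsets, repeat=3):
            z = (x[0] + dz, x[1] + dy, x[2] + dx)  # candidate voxel in n(x)
            patches.append(extract_patch(image, z, patch_radius))
            labels.append(label_image[z[0]][z[1]][z[2]])
    return patches, labels
```

With `patch_radius=1` and `search_radius=1`, each patch has d = 27 voxels and |n(x)| = 27, so M atlases contribute Q = 27M candidate patches.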

After rearranging $\beta_z^{(m)}$ and $\alpha_{(x,t)}$ into $d$-dimensional column vectors $b_z^{(m)}$ and $a_{(x,t)}$, respectively, where $d$ is the number of voxels in each 3D patch, our temporal sparse representation can be formulated as the problem of finding the optimal sparse representation of the 3D+t subject patch vectors $\{a_{(x,t)}\}_{t=1}^{N}$ using all atlas patch vectors $\{b_z^{(m)}\}_{m=1}^{M}$, as follows [6]:

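$$\{\hat{w}_{(x,t)}\}_{t=1}^{N} = \arg\min_{\{w_{(x,t)}\}_{t=1}^{N}} \left\{ \frac{1}{2} \sum_{t=1}^{N} \left\| B_x w_{(x,t)} - a_{(x,t)} \right\|_2^2 + \lambda_1 \sum_{t=1}^{N} \left\| w_{(x,t)} \right\|_1 + \lambda_2 \sum_{t=1}^{N-1} \left\| w_{(x,t)} - w_{(x,t+1)} \right\|_1 \right\} \quad \text{s.t. } w_{(x,t)} \ge 0,\ t = 1, \dots, N \qquad (1)$$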

where $B_x \in \mathbb{R}^{d \times Q}$ denotes a dictionary matrix constructed by arranging $\{b_z^{(m)}\}_{m=1}^{M}$ column by column, $w_{(x,t)} \in \mathbb{R}^{Q}$ denotes a weight vector formed by arranging all non-negative weights $\{w_{(x,t),z}^{(m)} \mid w_{(x,t),z}^{(m)} \ge 0,\ z \in n(x),\ m = 1, \dots, M\}$ into a column vector, and $w_{(x,t),z}^{(m)}$ is the representation coefficient of the patch vector $b_z^{(m)}$ of the $m$-th atlas image in constructing the patch vector $a_{(x,t)}$ of the subject image at time point $t$. The first term in Eq. (1) is the reconstruction discrepancy. The second term is an $\ell_1$-norm penalty that enforces sparsity in $w_{(x,t)}$. The third term is the temporal fused smoothness term, which constrains the temporal consistency of two successive sparse representation vectors ($w_{(x,t)}$ and $w_{(x,t+1)}$). $\lambda_1$ and $\lambda_2$ are the two weighting parameters that balance the contributions of the second and third terms. The objective function in Eq. (1) can be solved by the fast proximal gradient method [16].
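To make the three terms of Eq. (1) concrete, the following sketch evaluates the objective for a given set of weight vectors. It does not solve the optimization problem (the paper uses the fast proximal gradient method [16] for that); it only computes the value being minimized. The function name `temporal_sparse_objective` and the list-based data layout are assumptions made for illustration.

```python
def temporal_sparse_objective(B, A, W, lam1, lam2):
    """Evaluate the objective of Eq. (1) for one voxel location.

    B: dictionary with d rows and Q columns, given as a list of d rows.
    A: N subject patch vectors, each of length d (one per time point).
    W: N weight vectors, each of length Q (one per time point).
    """
    if any(w < 0 for w_t in W for w in w_t):
        raise ValueError("weights must be non-negative")

    # Term 1: squared reconstruction error, summed over time points.
    recon = 0.0
    for a_t, w_t in zip(A, W):
        for row, a_i in zip(B, a_t):
            residual = sum(b * w for b, w in zip(row, w_t)) - a_i
            recon += residual ** 2

    # Term 2: l1 norm of every weight vector (sparsity).
    sparsity = sum(abs(w) for w_t in W for w in w_t)

    # Term 3: l1 norm of differences between successive time points.
    fused = sum(abs(u - v)
                for w_t, w_next in zip(W, W[1:])
                for u, v in zip(w_t, w_next))

    return 0.5 * recon + lam1 * sparsity + lam2 * fused
```

Setting `lam2 = 0` decouples the time points, so the problem reduces to an independent non-negative sparse coding problem at each time point. The fused term is what ties the N weight vectors together.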

2.2 Multi-Atlas Based Label Fusion with Temporal Sparse Representation

Once the temporal sparse code $\{\hat{w}_{(x,t)}\}_{t=1}^{N}$ is estimated by solving the optimization problem in Eq. (1), the label at voxel $(x, t)$ of the aligned subject image $T_t$ can be obtained by multi-atlas label fusion, i.e., by combining the center voxel labels $\{l_z^{(m)} \mid z \in n(x),\ m = 1, \dots, M\}$ using the estimated temporal sparse code $\{\hat{w}_{(x,t)}\}_{t=1}^{N}$.

Following the same order as the columns of the dictionary matrix $B_x$, the labels $\{l_z^{(m)} \mid z \in n(x),\ m = 1, \dots, M\}$ are arranged into a label vector $l_x$ ($l_x \in \mathbb{R}^{Q}$). Supposing there are $P$ possible labels $\{L_1, \dots, L_p, \dots, L_P\}$ in the atlases, the label at voxel $(x, t)$ of the aligned subject image $T_t$ is determined by:

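$$\hat{L}_{(x,t)} = \arg\max_{L_p,\ p = 1, \dots, P} \left\{ \sum_{j=1}^{Q} \hat{w}_{(x,t),j}\, \delta(l_{x,j}, L_p) \right\}, \quad t = 1, \dots, N \qquad (2)$$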

where $\hat{w}_{(x,t),j}$ and $l_{x,j}$ are the $j$-th components of $\hat{w}_{(x,t)}$ and $l_x$, respectively, and the function $\delta(l_{x,j}, L_p)$ is equal to 1 if $l_{x,j} = L_p$ and 0 otherwise.
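The fusion rule amounts to weighted voting: each atlas patch votes for its center label with its sparse weight, and the label that accumulates the largest total weight is assigned. A minimal sketch follows; the function name `fuse_label` and the default binary label set (0 for background, 1 for hippocampus) are assumptions for illustration.

```python
def fuse_label(weights, atlas_labels, candidate_labels=(0, 1)):
    """Return the candidate label with the largest accumulated sparse weight.

    weights: the Q estimated coefficients for one voxel and one time point.
    atlas_labels: the Q center-voxel labels, in dictionary-column order.
    Every atlas label is assumed to be one of `candidate_labels`.
    Ties are resolved in favor of the label listed first.
    """
    votes = {label: 0.0 for label in candidate_labels}
    for w, l in zip(weights, atlas_labels):
        votes[l] += w
    return max(candidate_labels, key=lambda label: votes[label])
```

Because the weights are sparse, most atlas patches contribute nothing to the vote, and only the few patches selected by the sparse representation determine the label.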

After determining the label $\hat{L}_{(x,t)}$ at each voxel $(x, t)$ of the aligned 3D+t subject image $\{T_t\}_{t=1}^{N}$, the aligned 3D+t subject label image $\{L_t\}_{t=1}^{N}$ in the group-mean image space is obtained. We then obtain the 3D+t subject label image $\{L'_t\}_{t=1}^{N}$ corresponding to the input 3D+t subject image $\{T'_t\}_{t=1}^{N}$ by transforming $\{L_t\}_{t=1}^{N}$ back to the subject image space via $L'_t = {\varphi'_t}^{-1}(L_t)$, where ${\varphi'_t}^{-1}$ is the inverse transformation field of $\varphi'_t$.

3 Experimental Results

In this section, we evaluate our proposed longitudinal (3D+t) hippocampus segmentation method on both simulated and real longitudinal MR brain image datasets. Specifically, we use 10 subjects with simulated hippocampal atrophy and 12 subjects from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (http://www.adni-info.org/), each with three MR images acquired at three time points. All images in both datasets are T1-weighted MR images, processed to have the same size of 256 × 256 × 256 voxels and the same resolution of 1 × 1 × 1 mm³. The proposed method is compared to two state-of-the-art label fusion methods, namely the nonlocal patch-based method (Non-local) [5] and the sparse patch-based method (SPBM) [6]. For a fair comparison, the same unbiased groupwise registration pipeline is applied to the two comparison methods. That is, each subject image is registered to the estimated group-mean image of the subject image sequence by GLIRT, and all atlases are registered to the estimated group-mean image by affine registration followed by deformable registration. The final segmentations are evaluated in each subject's own space. In the following experiments, the patch size and search neighborhood size are both set to 3 × 3 × 3, and the regularization coefficients $\lambda_1$ and $\lambda_2$ are both set to 0.001.

3.1 Experiments on Simulated Dataset

In the simulation experiments, 10 subjects with manual hippocampal labels at year 1 are used as the baseline data (t = 1) and then used to simulate hippocampal atrophy. For each subject, three longitudinal MR images are simulated, where the images at year 2 (t = 2) and year 3 (t = 3) are generated by an atrophy simulation model [17] to ensure shrinking hippocampal volumes along the temporal dimension. Thus, ten sets of simulated longitudinal data with about 5% annual hippocampal volume shrinkage are obtained. The total simulated atrophy rate of the hippocampus over the three years is 9.15%.

A leave-one-out strategy on the 10 simulated subjects is adopted to compare the segmentation performance of Non-local, SPBM, and our proposed method. Specifically, in each leave-one-out experiment, one subject is selected as the 3D+t subject image, and the remaining 9 subjects are used as 3D atlases (27 atlases in total, i.e., 9 subjects × 3 time points). For both the Non-local and SPBM methods, the three time-point images in the 3D+t subject image are segmented independently. The mean and standard deviation of the Dice ratios are shown in Table 1. Our proposed method achieves a significant improvement over both Non-local and SPBM in terms of Dice ratio according to the paired t-test (p < 0.05), and thus the best segmentation accuracy among the three methods.

Table 1. Mean and standard deviation of Dice ratio in hippocampus segmentation on simulated atrophy data (Unit: %).

| Method | Left hippo | Right hippo | Overall |
|---|---|---|---|
| Non-local | 81.27 ± 1.98 | 80.08 ± 2.82 | 80.69 ± 2.12 |
| SPBM | 80.80 ± 1.71 | 79.55 ± 2.43 | 80.19 ± 1.76 |
| Proposed | 82.71 ± 1.51* | 81.86 ± 2.44* | 82.30 ± 1.76* |

* Indicates significant improvement over Non-local and SPBM methods (p < 0.05).
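For reference, the Dice ratio reported above measures the overlap between an automatic and a manual segmentation as twice the number of shared voxels divided by the total number of voxels in both masks. A minimal sketch, assuming binary masks flattened into 0/1 sequences (the function name `dice_ratio` is illustrative):

```python
def dice_ratio(seg_a, seg_b):
    """Dice ratio (in %) between two binary masks given as flat 0/1 sequences."""
    if len(seg_a) != len(seg_b):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for a, b in zip(seg_a, seg_b) if a and b)
    total = sum(1 for a in seg_a if a) + sum(1 for b in seg_b if b)
    if total == 0:
        return 100.0  # both masks empty: treated here as perfect agreement
    return 100.0 * 2.0 * overlap / total
```

A value of 100% means the two masks coincide exactly, and 0% means they share no voxels.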

Figure 2(a) shows the curves of the longitudinal loss of overall hippocampus volume in a typical subject for Non-local, SPBM, and our method. Figure 2(b) shows the average loss of overall hippocampus volume estimated by these three methods. The final estimated mean and standard deviation of the loss of overall hippocampus volume is 11.84% ± 1.63% for Non-local, 12.31% ± 1.57% for SPBM, and 10.87% ± 1.21% for the proposed method. Our proposed method is the closest to the ground truth (the simulated total atrophy of 9.15%) and also shows the smallest standard deviation in measuring longitudinal hippocampal volume changes, which we attribute to the temporal consistency constraint on the sparse representation in multi-atlas based label fusion.
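The volume loss values can be computed from the segmented hippocampal volumes at each time point. The sketch below assumes that loss is expressed as a percentage of the volume at the first time point, which the text does not state explicitly. The function name `volume_loss_percent` is illustrative.

```python
def volume_loss_percent(volumes):
    """Percentage volume loss at each time point relative to the first one.

    volumes: segmented volumes (e.g., voxel counts or mm^3), ordered by time.
    """
    baseline = volumes[0]
    if baseline <= 0:
        raise ValueError("baseline volume must be positive")
    return [100.0 * (baseline - v) / baseline for v in volumes]
```

The first entry is always 0, and the last entry corresponds to the total loss over the whole sequence.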

Fig. 2. Demonstration of loss of overall hippocampus volume through 3 time points. (a) Loss of longitudinal hippocampus volume for a typical subject, and (b) average loss of longitudinal hippocampus volumes for all subjects.

3.2 Experiments on Real Dataset

In the experiments on real data, we randomly select 12 subjects from the ADNI dataset, each with 3 time points (baseline, 6 months, and 12 months). The hippocampi in these MR images have been manually labeled, and the manual labels are regarded as the ground truth.

We again adopt a leave-one-out strategy on the 12 real subjects for the hippocampus segmentation experiments. The mean and standard deviation of the Dice ratios and of the average symmetric surface distance (ASSD) for the segmentation results of Non-local, SPBM, and our proposed method are shown in Table 2. Our proposed method achieves a significant improvement over Non-local and SPBM in terms of both Dice ratio and ASSD according to the paired t-test (p < 0.05). Figure 3 shows the surface distances of the left hippocampus for a typical subject, between the manual segmentations and the automatic segmentations by Non-local, SPBM, and our proposed method. On both metrics, our proposed 3D+t hippocampus segmentation method achieves the best segmentation performance.

Table 2. Mean and standard deviation of Dice ratios and the average symmetric surface distance (ASSD) for automatic segmentations by Non-local, SPBM, and our proposed method.

| Metric | Method | Left hippo | Right hippo | Overall |
|---|---|---|---|---|
| Dice ratio (%) | Non-local | 83.53 ± 2.99 | 82.37 ± 4.02 | 82.93 ± 2.97 |
| Dice ratio (%) | SPBM | 82.32 ± 3.83 | 82.32 ± 2.64 | 82.31 ± 2.87 |
| Dice ratio (%) | Proposed | 84.85 ± 2.30* | 84.29 ± 2.34* | 84.55 ± 2.22* |
| ASSD (mm) | Non-local | 0.497 ± 0.072 | 0.524 ± 0.098 | 0.512 ± 0.067 |
| ASSD (mm) | SPBM | 0.520 ± 0.096 | 0.509 ± 0.057 | 0.515 ± 0.066 |
| ASSD (mm) | Proposed | 0.455 ± 0.049* | 0.476 ± 0.063* | 0.466 ± 0.053* |

* Indicates significant improvement over Non-local and SPBM methods (p < 0.05).
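The ASSD values above can be understood through the following sketch, which uses one common definition: surface voxels are foreground voxels with at least one 6-connected neighbor outside the mask, and the ASSD is the mean of all nearest-surface distances taken in both directions. The paper does not specify its exact surface extraction, so this definition, the brute-force nearest-neighbor search, and the names `surface_voxels` and `assd` are assumptions for illustration.

```python
import math

# Offsets of the six face-connected neighbors of a voxel.
NEIGHBORS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def surface_voxels(mask):
    """Voxels of `mask` (a set of (z, y, x) tuples) with a 6-neighbor outside it."""
    return {v for v in mask
            if any((v[0] + dz, v[1] + dy, v[2] + dx) not in mask
                   for dz, dy, dx in NEIGHBORS)}


def assd(mask_a, mask_b, spacing=(1.0, 1.0, 1.0)):
    """Average symmetric surface distance (in mm) between two voxel sets."""
    surf_a, surf_b = surface_voxels(mask_a), surface_voxels(mask_b)
    if not surf_a or not surf_b:
        raise ValueError("both masks must be non-empty")

    def to_mm(v):
        return tuple(c * s for c, s in zip(v, spacing))

    pts_a = [to_mm(v) for v in surf_a]
    pts_b = [to_mm(v) for v in surf_b]

    def summed_min_dist(src, dst):
        # Brute force: distance from every source point to its nearest target.
        return sum(min(math.dist(p, q) for q in dst) for p in src)

    total = summed_min_dist(pts_a, pts_b) + summed_min_dist(pts_b, pts_a)
    return total / (len(pts_a) + len(pts_b))
```

The brute-force search is quadratic in the number of surface voxels, which is acceptable for a small structure in a sketch. A distance transform or a spatial index would be the usual choice at scale.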

Fig. 3. Visualization of surface distances (in mm) for hippocampus segmentation results by three methods.

4 Conclusion

In this paper, we proposed a method that integrates temporal sparse representation with multi-atlas patch-based label fusion for longitudinal (3D+t) hippocampus segmentation in longitudinal MR images. To keep the registrations at different time points consistent for the subsequent analysis, we registered the 3D+t subject image and all atlases to the group-mean image of the 3D+t subject image by using GLIRT. Moreover, to respect the smooth change of a longitudinal structure (i.e., the hippocampus), we added a temporal fused smoothness term to the objective function of the sparse representation, which enforces a small difference between the sparse representation vectors of adjacent time points. Experimental results demonstrated improved segmentation accuracy and longitudinal consistency for our proposed method compared to both the Non-local and SPBM methods.

Acknowledgments

This work was supported in part by National Natural Science Foundation of China (No. 61503300) and China Postdoctoral Science Foundation (No. 2014M560801).

References

1. Bird CM, Burgess N. The hippocampus and memory: insights from spatial processing. Nat Rev Neurosci. 2008;9(3):182–194. doi: 10.1038/nrn2335.
2. Schuff N, et al. MRI of hippocampal volume loss in early Alzheimer's disease in relation to ApoE genotype and biomarkers. Brain. 2009;132(4):1067–1077. doi: 10.1093/brain/awp007.
3. Schröder J, Pantel J. Neuroimaging of hippocampal atrophy in early recognition of Alzheimer's disease - a critical appraisal after two decades of research. Psychiatry Res: Neuroimaging. 2016;247:71–78. doi: 10.1016/j.pscychresns.2015.08.014.
4. van der Lijn F, et al. Hippocampus segmentation in MR images using atlas registration, voxel classification, and graph cuts. NeuroImage. 2008;43(4):708–720. doi: 10.1016/j.neuroimage.2008.07.058.
5. Rousseau F, Habas PA, Studholme C. A supervised patch-based approach for human brain labeling. IEEE Trans Med Imaging. 2011;30(10):1852–1862. doi: 10.1109/TMI.2011.2156806.
6. Zhang D, Guo Q, Wu G, Shen D. Sparse patch-based label fusion for multi-atlas segmentation. In: Yap PT, Liu T, Shen D, Westin CF, Shen L, editors. MBIA 2012 LNCS. Vol. 7509. Springer; Heidelberg: 2012. pp. 94–102.
7. Zarpalas D, et al. Gradient-based reliability maps for ACM-based segmentation of hippocampus. IEEE Trans Biomed Eng. 2014;61(4):1015–1026. doi: 10.1109/TBME.2013.2293023.
8. Song Y, Wu G, Sun Q, Bahrami K, Li C, Shen D. Progressive label fusion framework for multi-atlas segmentation by dictionary evolution. In: Navab N, Hornegger J, Wells WM, Frangi AF, editors. MICCAI 2015, Part III LNCS. Vol. 9351. Springer; Heidelberg: 2015. pp. 190–197.
9. Wolz R, et al. Measurement of hippocampal atrophy using 4D graph-cut segmentation: application to ADNI. NeuroImage. 2010;52(1):109–118. doi: 10.1016/j.neuroimage.2010.04.006.
10. Leung KK, et al. Automated cross-sectional and longitudinal hippocampal volume measurement in mild cognitive impairment and Alzheimer's disease. NeuroImage. 2010;51(4):1345–1359. doi: 10.1016/j.neuroimage.2010.03.018.
11. Guo Y, Wu G, Yap PT, Jewells V, Lin W, Shen D. Segmentation of infant hippocampus using common feature representations learned for multimodal longitudinal data. In: Navab N, Hornegger J, Wells WM, Frangi AF, editors. MICCAI 2015, Part III LNCS. Vol. 9351. Springer; Heidelberg: 2015. pp. 63–71.
12. Chincarini A, et al. Integrating longitudinal information in hippocampal volume measurements for the early detection of Alzheimer's disease. NeuroImage. 2016;125:834–847. doi: 10.1016/j.neuroimage.2015.10.065.
13. Wu G, Wang Q, Shen D. Registration of longitudinal brain image sequences with implicit template and spatial-temporal heuristics. NeuroImage. 2012;59(1):404–421. doi: 10.1016/j.neuroimage.2011.07.026.
14. Jenkinson M, et al. Improved optimization for the robust and accurate linear registration and motion correction of brain images. NeuroImage. 2002;17(2):825–841. doi: 10.1016/s1053-8119(02)91132-8.
15. Vercauteren T, et al. Diffeomorphic demons: efficient non-parametric image registration. NeuroImage. 2009;45(1):S61–S72. doi: 10.1016/j.neuroimage.2008.10.040.
16. Beck A, Teboulle M. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J Imaging Sci. 2009;2(1):183–202.
17. Karacali B, Davatzikos C. Simulation of tissue atrophy using a topology preserving transformation model. IEEE Trans Med Imaging. 2006;25(5):649–652. doi: 10.1109/TMI.2006.873221.
