Abstract
The identification of subtle brain changes associated with mild cognitive impairment (MCI), the at-risk stage of Alzheimer’s disease, remains a challenging task. Unlike existing works, which employ multimodal data (e.g., MRI, PET or CSF) to distinguish MCI subjects from normal elderly controls, we use four MRI sequences: T1-weighted MRI (T1), Diffusion Tensor Imaging (DTI), Resting-State functional MRI (RS-fMRI) and Arterial Spin Labeling (ASL) perfusion imaging. Because these MRI sequences capture complementary aspects of brain structure and function within a routine clinical scan, the relationship between subjects can be modeled more easily by incorporating the mutual information among the sequences. To this end, we devise a hypergraph-based semi-supervised learning algorithm. In particular, we first construct a hypergraph for each MRI sequence separately using a star expansion method with both the training and testing data. A centralized learning step is then performed to model the optimal relevance between subjects by incorporating mutual information between different MRI sequences. We then combine all centralized hypergraphs by learning the optimal weight of each hypergraph based on the minimum Laplacian. We apply the proposed method to a cohort of 41 consecutive MCI subjects and 63 age- and gender-matched controls with four MRI sequences. Our method achieves at least a 7.61% improvement in classification accuracy compared to state-of-the-art methods using multiple MRI data.
1 Introduction
Alzheimer’s disease (AD) is the most common form of dementia in people over 65 years of age. The number of AD patients has reached 26.6 million and is expected to double within the next 20 years, leading to 1 in every 85 people worldwide being affected by AD by 2050. Therefore, the diagnosis of AD at its at-risk stage of mild cognitive impairment (MCI) [7] is essential and has attracted extensive research efforts in recent years [11, 9]. Previous studies [10] have shown that structural and functional brain changes may start before clinical conversion to AD and can be used as potential biomarkers for MCI identification.
Recent studies [4, 11] show great promise in integrating multiple modalities, e.g., MRI, PET and CSF, to improve AD/MCI diagnosis accuracy, and semi-supervised learning for multimodal data has also been investigated [2]. However, in most previous works, the relationship among subjects is modeled separately for each modality, ignoring the crucial mutual information between different modalities. In practice, integrating the information acquired from different modalities is a challenging task, since the relationship among subjects may differ across modalities.
On the other hand, multiple MR sequences, e.g., T1-weighted (T1), Diffusion Tensor Imaging (DTI) and Resting-State functional MRI (RS-fMRI), can be acquired in routine clinical scans to capture different aspects of brain structure and function. For instance, T1 provides the tissue type information of the brain, DTI measures macroscopic axonal organization in nervous system tissues, and RS-fMRI captures the regional interactions that take place when the subject is not performing an explicit task. As a relatively new technique, Arterial Spin Labeling (ASL) [1] perfusion imaging measures brain perfusion without injection of a contrast agent and has demonstrated a consistent reduction in basal perfusion, notably in the posterior cingulate cortex, in MCI and AD [1]. More recently, ASL was even able to predict very early cognitive decline in healthy elderly controls, i.e., the earliest stage of neurodegeneration. Multiple MRI sequences can be easily acquired together during routine clinical scans.
In this work, we propose a centralized hypergraph learning (CHL) method to better model the relationship among subjects with multiple MRIs for the purpose of MCI diagnosis. The basic idea of the proposed method is to estimate the relevance between subjects, which reflects how likely two subjects are to belong to the same category, by integrating multiple imaging data in a semi-supervised manner. The relationship information among subjects is represented by a hypergraph structure that connects all subjects in both the training and testing sets. Compared to a simple graph, in which an edge can only link two vertices, a hypergraph [12] conveys more information through a set of hyperedges, each of which can connect more than two vertices at the same time. Hypergraphs have been successfully applied to various applications, such as image and object retrieval [3]. A hypergraph structure is thus able to capture the higher-order relationship among subjects, i.e., whether a group of subjects share similar content. MCI diagnosis is then formulated as a binary classification task in the hypergraph structure to classify each subject as an MCI patient or a normal control (NC).
Figure 1 presents the schematic diagram of the proposed framework. We first construct a hypergraph using both the training and testing data for each of the multiple MRIs separately to reflect the higher-order relationship among subjects. In hypergraph construction, each subject is selected in turn as a centroid and connected to its K nearest subjects in the feature space via a hyperedge. We then conduct centralized hypergraph learning to explore the underlying relationship among the samples, where the relevance among subjects and the hyperedge weights are jointly optimized via an alternating optimization approach. Specifically, each time one hypergraph is selected as the core and the rest serve as auxiliary information in the learning process. This procedure is repeated for each hypergraph, producing a set of relevance scores for each subject for classification. To obtain the final decision, we combine the relevance scores using the optimal weights learned by minimizing the overall hypergraph Laplacian. Note that, for the testing subjects, we use only their imaging features to construct hypergraphs. Therefore, the relevance scores are conveyed globally, leading to a semi-supervised learning model that better avoids over-fitting to the training set.
Fig. 1. An overview of the proposed centralized hypergraph learning for MCI diagnosis.
2 Method
Data and Preprocessing
A dataset containing T1, DTI, RS-fMRI and ASL images from 41 MCI patients and 63 normal controls was collected at the University Hospital of Geneva, Switzerland. The T1 images were preprocessed by skull stripping, cerebellum removal, and tissue segmentation [6]. The Automated Anatomical Labeling (AAL) atlas, parcellated into 90 predefined regions-of-interest (ROIs) in the cerebrum, was aligned to the native space of each subject using a deformable registration algorithm. For T1 data, white matter (WM) and gray matter (GM) tissue volumes in each region were computed and normalized to generate a 180-dimensional feature vector, i.e., 90 WM and 90 GM features.
The DTI images were first parcellated into 90 regions by propagating the AAL ROIs to each image using a deformable DTI registration algorithm. Whole-brain streamline fiber tractography was performed on each DTI image. The number of fibers passing through each pair of regions was counted, and the average on-fiber fractional anisotropy (FA) for each ROI pair was computed to form a structural connectivity matrix.
For RS-fMRI, a 9-min ON-OFF CO2 challenge was employed during data acquisition, i.e., 1 min OFF, 2 min ON, 2 min OFF, 2 min ON, 2 min OFF. Slice timing correction and head-motion correction were performed using the Statistical Parametric Mapping software package. To ensure magnetization equilibrium, the first 10 acquired RS-fMRI images of each subject were discarded. The remaining 170 images were first corrected for acquisition time delays among different slices and then realigned to the first volume of the remaining images for head-motion correction. A Pearson correlation-based connectivity matrix of dimension 90 × 90, based on the AAL atlas, was constructed for each subject.
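As a minimal sketch of the last step, a Pearson connectivity matrix can be computed from regional mean time series. The function name and the random demo data are hypothetical, and zeroing the diagonal is an assumption that the text does not state.

```python
import numpy as np


def pearson_connectivity(roi_timeseries):
    """Pearson correlation between every pair of ROI time series.

    roi_timeseries: array of shape (n_rois, n_volumes), e.g. (90, 170).
    Returns a symmetric (n_rois, n_rois) matrix.
    """
    ts = np.asarray(roi_timeseries, dtype=float)
    conn = np.corrcoef(ts)       # each row is treated as one variable (ROI)
    np.fill_diagonal(conn, 0.0)  # assumption: self-connections are dropped
    return conn


# Synthetic stand-in for 90 AAL regions and 170 retained volumes.
rng = np.random.default_rng(0)
demo_ts = rng.standard_normal((90, 170))
demo_conn = pearson_connectivity(demo_ts)
```

Real input would be the mean BOLD signal of each AAL region after the preprocessing steps listed above.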
Preprocessing of ASL images was performed using ASLtbx. Similar to RS-fMRI, a symmetric connectivity matrix of dimension 90 × 90 was constructed for each subject.
For DTI, RS-fMRI and ASL, the local clustering coefficients, which quantify the cliquishness of the nodes [8], are computed for the connectivity networks. For these three MRI sequences, both the raw feature (8100-D) and the clustering coefficients-based feature (90-D) are employed.
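The two feature types can be illustrated with a short sketch. It uses the binary local clustering coefficient on a thresholded network; the threshold value, the binarization itself, and the function names are assumptions made for illustration, since the text does not say whether a weighted or binary variant from [8] was used.

```python
import numpy as np


def local_clustering(conn, threshold):
    """Binary local clustering coefficient of every node.

    The network is binarized at |conn| > threshold (an assumed choice).
    """
    a = (np.abs(np.asarray(conn, dtype=float)) > threshold).astype(float)
    a = np.maximum(a, a.T)                   # enforce symmetry
    np.fill_diagonal(a, 0.0)                 # no self-loops
    degree = a.sum(axis=1)
    triangles = np.diag(a @ a @ a) / 2.0     # triangles through each node
    possible = degree * (degree - 1) / 2.0   # neighbour pairs of each node
    coeff = np.zeros_like(degree)
    mask = possible > 0
    coeff[mask] = triangles[mask] / possible[mask]
    return coeff


def connectivity_features(conn, threshold=0.3):
    """Return the raw (n*n)-D feature and the n-D clustering feature."""
    raw = np.asarray(conn, dtype=float).ravel()
    return raw, local_clustering(conn, threshold)


# Tiny sanity examples: a triangle and a three-node path.
triangle = np.ones((3, 3)) - np.eye(3)
path = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
cc_triangle = local_clustering(triangle, 0.5)
cc_path = local_clustering(path, 0.5)
```

For a 90 × 90 connectivity matrix, `connectivity_features` yields the 8100-D and 90-D vectors mentioned above.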
Hypergraph Construction
For each type of imaging data, a hypergraph is constructed. For the i-th imaging data with N subjects, a hypergraph Gi = (Vi, Ei, Wi) with N vertices is constructed, where each subject is represented by a vertex. Here, Vi is the vertex set, Ei is the hyperedge set, and Wi contains the corresponding hyperedge weights. A star expansion method is employed to generate a set of hyperedges among vertices. Specifically, in each feature space, each vertex is selected in turn as the centroid vertex, and a hyperedge is formed by connecting it to its nearest neighbors within a distance of φd̄, where d̄ is the average distance between subjects in the feature space and φ is set to 1 in our experiments. The incidence matrix of the hypergraph is then generated to represent the relationship among vertices. The (v, e)-th entry of the incidence matrix indicates whether the vertex v is connected via the hyperedge e to other vertices. The incidence matrix Hi of hypergraph Gi = (Vi, Ei, Wi) is generated as
| (1) |
where di(v, vc) is the distance for the i-th imaging data between a vertex v and its corresponding centroid vertex vc in hyperedge e, and d̂i is the average pairwise subject distance for the i-th imaging data. Accordingly, the vertex degree of the vertex v ∈ Vi and the hyperedge degree of the hyperedge e ∈ Ei are calculated, respectively, as $d_i(v) = \sum_{e \in E_i} W_i(e) H_i(v, e)$ and $\delta_i(e) = \sum_{v \in V_i} H_i(v, e)$. Thus, we can define two diagonal matrices representing the vertex degrees and the hyperedge degrees, respectively. Note that all the hyperedges are initialized with an equal weight, e.g., 1.
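The construction can be sketched as follows for one imaging modality. This is an illustration under stated assumptions rather than the authors' implementation: Euclidean distance, the exponential form of the incidence entries, and the `scale` parameter are all assumed, and the function name is hypothetical.

```python
import numpy as np


def build_hypergraph(features, phi=1.0, scale=1.0):
    """Star-expansion hypergraph with one hyperedge per centroid subject.

    features: (n_subjects, n_features) array for one imaging modality.
    Returns the incidence matrix h (vertices x hyperedges), unit hyperedge
    weights, vertex degrees and hyperedge degrees.
    """
    x = np.asarray(features, dtype=float)
    n = x.shape[0]
    # Pairwise Euclidean distances (assumed metric).
    sq = (x ** 2).sum(axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (x @ x.T), 0.0)
    np.fill_diagonal(d2, 0.0)
    dist = np.sqrt(d2)
    mean_dist = dist[np.triu_indices(n, k=1)].mean()
    # Vertex v joins the hyperedge centred at e if it lies within phi * mean_dist.
    member = dist <= phi * mean_dist
    # Assumed soft incidence: closer vertices get entries nearer to 1.
    h = np.where(member, np.exp(-dist / (scale * mean_dist)), 0.0)
    w = np.ones(n)                # equal initial hyperedge weights
    vertex_degree = h @ w         # sum over e of W(e) * H(v, e)
    edge_degree = h.sum(axis=0)   # sum over v of H(v, e)
    return h, w, vertex_degree, edge_degree


rng = np.random.default_rng(0)
demo_features = rng.standard_normal((20, 5))
demo_h, demo_w, demo_dv, demo_de = build_hypergraph(demo_features)
```

Each centroid always belongs to its own hyperedge with an entry of 1, so every hyperedge degree is at least 1.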
Centralized Hypergraph Learning
To model the relationship of subjects with multiple imaging data, we propose a centralized hypergraph learning method. MCI diagnosis is formulated as a binary classification task in the hypergraph structure. Given four hypergraphs, at each stage one hypergraph is selected as the core hypergraph, while the rest are used to provide extra guidance for updating the hypergraphs.
During centralized learning, hypergraphs are assigned different weights according to their influence on the structure of the core hypergraph. The core hypergraph is assigned a weight of α1, while the other hypergraphs are assigned an equal weight, α2. Assuming that the j-th hypergraph is selected as the core hypergraph, we obtain the j-th centralized hypergraph. To learn the relevance among vertices and improve the structure of the j-th centralized hypergraph, we optimize the following objective function, which includes the weights of hyperedges, by imposing an ℓ2-norm regularizer on Wi (i = 1, 2, 3, 4) as
| (2) |
where the regularizer for the j-th centralized hypergraph aims to smooth the relationship among vertices on the hypergraph structure, Wi (e) is the weight of the hyperedge e ∈ Ei, Fj is the to-be-learned relevance matrix, Remp (Fj) is the empirical loss, and the norm penalty on Wi is the regularizer term used to learn the weights of hyperedges. The two constraints guarantee that the vertex degree is not changed during the learning process and that all the weights are non-negative, respectively. The regularizer for the j-th centralized hypergraph is defined as
| (3) |
where the first term is the regularizer for the core hypergraph and the second term is the regularizer for the other hypergraphs. We define the regularizer term Ωi (Fj ) as
| (4) |
where the terms can be obtained accordingly from Eq. (4), so that the regularizer for the j-th centralized hypergraph can be simplified as
| (5) |
Eq. (5) is expressed in terms of the j-th centralized hypergraph Laplacian.
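One way to form such a Laplacian is sketched below, assuming the normalized hypergraph Laplacian of Zhou et al. [12], I − Dv^(−1/2) H W De^(−1) Hᵀ Dv^(−1/2), for each hypergraph and a weighted combination with α1 for the core and α2 for the others. Dividing by the total weight is an added assumption that keeps the result positive semi-definite; the function names are hypothetical.

```python
import numpy as np


def hypergraph_theta(h, w):
    """Dv^(-1/2) H W De^(-1) H^T Dv^(-1/2) for one hypergraph."""
    h = np.asarray(h, dtype=float)
    w = np.asarray(w, dtype=float)
    vertex_degree = h @ w
    edge_degree = h.sum(axis=0)
    scaled = h / np.sqrt(vertex_degree)[:, None]      # Dv^(-1/2) H
    return (scaled * (w / edge_degree)) @ scaled.T    # columns scaled by W De^(-1)


def centralized_laplacian(thetas, core, alpha1, alpha2):
    """Laplacian of the centralized hypergraph with the given core index."""
    n = thetas[0].shape[0]
    combined = sum((alpha1 if i == core else alpha2) * t
                   for i, t in enumerate(thetas))
    total = alpha1 + alpha2 * (len(thetas) - 1)       # assumed normalization
    return np.eye(n) - combined / total


# Four synthetic incidence matrices standing in for the four MRI sequences.
rng = np.random.default_rng(0)
demo_hs = [rng.random((6, 6)) + np.eye(6) for _ in range(4)]
demo_thetas = [hypergraph_theta(h, np.ones(6)) for h in demo_hs]
demo_lap = centralized_laplacian(demo_thetas, core=0, alpha1=0.4, alpha2=0.2)
```

With non-negative incidence entries, each single-hypergraph Laplacian is positive semi-definite, so the normalized combination is as well.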
The empirical loss Remp (Fj) is defined as
| (6) |
where Y is the label matrix and each of its entries denotes the category a subject belongs to. Y(:, k) is the k-th column of Y. If the vertex v belongs to the first category, then the (v, 1)-th and (v, 2)-th entries in Y are set to 1 and 0, respectively.
Solution
To solve the optimization problem in Eq. (2), we employ an alternating optimization approach. We first fix Wi and optimize Fj as follows
| (7) |
which leads to the following closed-form solution for Fj
| (8) |
We then fix Fj and optimize Wi as follows
| (9) |
The above optimization task is convex in Wi and can be solved via quadratic programming. The learning process is repeated once for each hypergraph acting as the core hypergraph, generating a centralized relevance matrix for each imaging modality.
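The relevance update with fixed hyperedge weights can be illustrated under a common assumption: if the objective is tr(Fᵀ Δ F) + λ‖F − Y‖², with Δ the centralized Laplacian and λ a trade-off parameter (both the squared-error loss and the symbol λ are assumed here), then setting the gradient to zero gives F = (I + Δ/λ)⁻¹ Y. The hyperedge-weight update is omitted because it needs a quadratic programming solver.

```python
import numpy as np


def update_relevance(laplacian, labels, lam):
    """Closed-form minimizer of tr(F^T L F) + lam * ||F - Y||_F^2."""
    n = laplacian.shape[0]
    return np.linalg.solve(np.eye(n) + laplacian / lam, labels)


# Synthetic positive semi-definite matrix standing in for a Laplacian.
rng = np.random.default_rng(0)
a = rng.standard_normal((6, 6))
demo_lap = a @ a.T / 6.0

# Two labeled subjects; rows of unlabeled subjects stay zero (assumption).
demo_y = np.zeros((6, 2))
demo_y[0, 0] = 1.0   # subject 0 belongs to the first category
demo_y[5, 1] = 1.0   # subject 5 belongs to the second category

demo_f = update_relevance(demo_lap, demo_y, lam=1.0)
```

Using `np.linalg.solve` avoids forming the explicit inverse, which is usually the more stable choice.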
Relevance Matrices Fusion
To optimally integrate information from multiple MRIs, we learn the weight of each hypergraph by minimizing the overall hypergraph Laplacian. Letting ρi be the weight for the i-th centralized hypergraph, we impose an ℓ2-norm penalization on the weights of all centralized hypergraphs as follows
| (10) |
where η balances the Laplacian term and the weight regularizer. Eq. (10) can be solved using the Lagrangian method. The final relevance matrix is then computed as ΣρiFi. The class of a subject is then determined by its corresponding values in the final relevance matrix: if the relevance score to MCI is larger than that to NC, the subject is classified as MCI, and otherwise as NC.
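The fusion step can be sketched under an assumed form of the problem: minimize Σ ρi·ci + η Σ ρi² subject to Σ ρi = 1, where ci is a scalar smoothness cost of the i-th centralized hypergraph (for example its Laplacian regularizer value). The Lagrangian method then gives ρi = 1/n + (mean(c) − ci)/(2η). The costs, the column order (column 0 taken as MCI), and the function names are assumptions for illustration.

```python
import numpy as np


def fusion_weights(costs, eta):
    """Minimizer of sum(rho * c) + eta * sum(rho**2) with sum(rho) == 1.

    Non-negativity is not enforced; a small eta can give negative weights.
    """
    c = np.asarray(costs, dtype=float)
    return 1.0 / c.size + (c.mean() - c) / (2.0 * eta)


def fuse_and_classify(relevances, rho):
    """Weighted sum of relevance matrices and the resulting MCI decision."""
    fused = sum(r * f for r, f in zip(rho, relevances))
    is_mci = fused[:, 0] > fused[:, 1]   # assumed: column 0 = MCI, column 1 = NC
    return fused, is_mci


demo_rho = fusion_weights([1.0, 2.0, 3.0, 2.0], eta=5.0)
demo_relevances = [np.array([[0.8, 0.2], [0.1, 0.9]]) for _ in range(4)]
demo_fused, demo_is_mci = fuse_and_classify(demo_relevances, demo_rho)
```

Hypergraphs with a lower cost receive a larger weight, and a larger η pulls all weights toward the uniform value 1/n.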
3 Experiments
Experimental Settings
To evaluate the performance of the proposed CHL method, a 10-fold cross-validation strategy is used in our experiments. Specifically, we randomly partition the subjects in the MCI and NC groups into 10 non-overlapping sets of approximately equal size. Each set is left out in turn as the testing set, and the remaining sets are used as the training set. The training set is further divided into 5 subsets for a 5-fold inner cross-validation to learn the optimal parameters. This procedure is repeated 10 times, once for each of the 10 sets, to compute the overall cross-validation classification performance. Six statistical measures are used for performance evaluation: accuracy (ACC), sensitivity (SEN), specificity (SPEC), balanced accuracy (BAC), positive predictive value (PPV), and negative predictive value (NPV).
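The six measures follow from the confusion matrix, with MCI treated as the positive class. The sketch below also shows one possible way to draw the 10 outer folds; splitting each group separately and the function names are assumptions, and the inner 5-fold parameter search is not shown.

```python
import numpy as np


def stratified_folds(labels, n_folds=10, seed=0):
    """Assign each subject to a fold, partitioning each group separately."""
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    fold_of = np.empty(labels.size, dtype=int)
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        fold_of[idx] = np.arange(idx.size) % n_folds
    return fold_of


def classification_metrics(y_true, y_pred):
    """ACC, SEN, SPEC, BAC, PPV and NPV with True/1 meaning MCI.

    Assumes both classes occur in y_true and in y_pred.
    """
    t = np.asarray(y_true, dtype=bool)
    p = np.asarray(y_pred, dtype=bool)
    tp = float(np.sum(t & p))
    tn = float(np.sum(~t & ~p))
    fp = float(np.sum(~t & p))
    fn = float(np.sum(t & ~p))
    sen = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "ACC": (tp + tn) / (tp + tn + fp + fn),
        "SEN": sen,
        "SPEC": spec,
        "BAC": (sen + spec) / 2.0,
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
    }


demo_labels = np.array([1] * 41 + [0] * 63)   # 41 MCI subjects, 63 controls
demo_folds = stratified_folds(demo_labels)
demo_metrics = classification_metrics([1, 1, 1, 0, 0, 0, 0, 0],
                                      [1, 1, 0, 0, 0, 0, 0, 1])
```

With 41 MCI subjects and 63 controls, this split gives outer folds of 10 to 12 subjects each.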
Results
We compare the proposed method with two state-of-the-art methods: 1) multimodal multitask learning (M3T) [11], which employs multimodal imaging data for MCI classification, and 2) manifold regularized multitask feature selection (M2TFS) [5], which is another semi-supervised method for MCI classification. We further compare with two simplified versions of the proposed method: 1) centralized simple graph learning (CSL), in which simple graphs are employed instead of hypergraphs, and 2) multi-hypergraph learning (MHL) [3], which omits the centralized learning.
Figure 2 shows the MCI classification results of all compared methods using different imaging data individually and together. The proposed method (CHL) outperforms all competing state-of-the-art methods. Specifically, by using four types of imaging data, CHL achieves improvements of 8.65 ± 2.50% and 7.61 ± 1.92% in terms of ACC compared with M3T and M2TFS, respectively. The better performance of our method is attributed to the following aspects. First, unlike existing methods that employ only the labeled samples to train classifiers, the proposed method employs both the labeled and unlabeled samples to explore the underlying relationships among subjects in a semi-supervised way, leading to a more general model that is not over-fitted to the training data. Second, the employed hypergraph structure [12] is superior to a simple graph in formulating the joint relationship among multiple vertices. As shown in the results, CHL achieves an improvement of 5.10 ± 2.13% in terms of ACC compared with CSL when all four imaging data are employed. Third, the proposed method employs a centralized learning approach to model the subject relationship by using one of the multiple imaging data with guidance from the others, enabling more accurate modeling of the underlying relationships among samples.
Fig. 2. MCI identification performance using four types of imaging data individually and together.
Compared with MHL, the proposed CHL demonstrates better performance, especially when multiple types of imaging data are utilized. More specifically, CHL achieves an improvement of 4.42 ± 1.57% in terms of ACC compared with MHL by using four types of imaging data. This result demonstrates that the proposed centralized method can better explore the relationship underlying the multiple imaging data.
4 Conclusion
In this paper, we proposed a centralized hypergraph learning method to model the relationship among subjects with multiple MRIs for MCI identification. In our method, this relationship is encoded by the structure and the weights of hyperedges in a hypergraph. One hypergraph is constructed for each of the multiple MRIs. In the learning process, each time one hypergraph is selected as the core, and the relationships among subjects from the other hypergraphs provide extra guidance to improve the structure of the core hypergraph. Integrating hypergraphs with different weights enables optimal utilization of the supplementary information conveyed by different imaging data. Our findings demonstrate the effectiveness of the CHL method for MCI diagnosis.
References
1. Binnewijzend MA, Kuijer JP, Benedictus MR, van der Flier WM, Wink AM, Wattjes MP, van Berckel BN, Scheltens P, Barkhof F. Cerebral blood flow measured with 3D pseudocontinuous arterial spin-labeling MR imaging in Alzheimer disease and mild cognitive impairment: a marker for disease severity. Radiology. 2013;267(1):221–230. doi: 10.1148/radiol.12120928.
2. Cai X, Nie F, Cai W, Huang H. Heterogeneous image features integration via multimodal semi-supervised learning model. IEEE International Conference on Computer Vision. 2013. pp. 1737–1744.
3. Gao Y, Wang M, Tao D, Ji R, Dai Q. 3-D object retrieval and recognition with hypergraph analysis. IEEE Transactions on Image Processing. 2012;21(9):4290–4303. doi: 10.1109/TIP.2012.2199502.
4. Hinrichs C, Singh V, Xu G, Johnson SC, Alzheimer’s Disease Neuroimaging Initiative, et al. Predictive markers for AD in a multi-modality framework: an analysis of MCI progression in the ADNI population. NeuroImage. 2011;55(2):574–589. doi: 10.1016/j.neuroimage.2010.10.081.
5. Jie B, Zhang D, Cheng B, Shen D. Manifold regularized multi-task feature selection for multi-modality classification in Alzheimer’s disease. In: Mori K, Sakuma I, Sato Y, Barillot C, Navab N, editors. MICCAI 2013, Part I. LNCS; Springer, Heidelberg. 2013. pp. 275–283.
6. Lim KO, Pfefferbaum A. Segmentation of MR brain images into cerebrospinal fluid spaces, white and gray matter. Journal of Computer Assisted Tomography. 1989;13(4):588–593. doi: 10.1097/00004728-198907000-00006.
7. Richiardi J, Monsch AU, Haas T, Barkhof F, Van de Ville D, Radü EW, Kressig RW, Haller S. Altered cerebrovascular reactivity velocity in mild cognitive impairment and Alzheimer’s disease. Neurobiology of Aging. 2014. doi: 10.1016/j.neurobiolaging.2014.07.020.
8. Rubinov M, Sporns O. Complex network measures of brain connectivity: uses and interpretations. NeuroImage. 2010;52(3):1059–1069. doi: 10.1016/j.neuroimage.2009.10.003.
9. Wang H, Nie F, Huang H, Risacher SL, Saykin AJ, Shen L, et al. Identifying disease sensitive and quantitative trait-relevant biomarkers from multidimensional heterogeneous imaging genetics data via sparse multimodal multitask learning. Bioinformatics. 2012;28(12):i127–i136. doi: 10.1093/bioinformatics/bts228.
10. Ye J, Wu T, Li J, Chen K. Machine learning approaches for the neuroimaging study of Alzheimer’s disease. Computer. 2011;44(4):99–101.
11. Zhang D, Shen D. Multi-modal multi-task learning for joint prediction of multiple regression and classification variables in Alzheimer’s disease. NeuroImage. 2012;59(2):895–907. doi: 10.1016/j.neuroimage.2011.09.069.
12. Zhou D, Huang J, Schölkopf B. Learning with hypergraphs: Clustering, classification, and embedding. Proceedings of Advances in Neural Information Processing Systems. 2006. pp. 1601–1608.


