Abstract
Functional connectivity (FC) analysis is an appealing tool to aid diagnosis and elucidate the neurophysiological underpinnings of autism spectrum disorder (ASD). Many machine learning methods have been developed to distinguish ASD patients from healthy controls based on FC measures and to identify abnormal FC patterns of ASD. In particular, several studies have demonstrated that deep learning models can achieve better performance for ASD diagnosis than conventional machine learning methods. Although existing machine learning methods have achieved promising classification performance, they do not explicitly model the heterogeneity of ASD and are therefore incapable of disentangling its heterogeneous FC patterns. To achieve an improved diagnosis and a better understanding of ASD, we adopt capsule networks (CapsNets) to build classifiers that distinguish ASD patients from healthy controls based on FC measures and to stratify ASD patients into groups with distinct FC patterns. Evaluation results on a large multi-site dataset demonstrate that our method not only obtained better classification performance than state-of-the-art alternative machine learning methods, but also identified clinically meaningful subgroups of ASD patients based on the vectorized classification outputs of the CapsNet classification model.
Index Terms: Autism spectrum disorder, Heterogeneity, Functional connectivity, Capsule network
1. INTRODUCTION
Autism spectrum disorder (ASD) represents a heterogeneous group of neurodevelopmental disorders, characterized by a spectrum of phenotypes such as impaired social–communication function. To improve ASD diagnosis and elucidate its neurophysiological underpinnings, enormous effort has been put into disentangling the heterogeneity of ASD through genetic findings [1]. Recent neuroimaging studies have also revealed that ASD is associated with abnormal functional connectivity (FC) patterns and the FC measures derived from resting-state functional MRI (fMRI) data could be used in conjunction with machine learning techniques to aid ASD diagnosis [2].
Many machine learning (ML) methods have been developed to build classifiers to distinguish ASD patients from healthy controls (HCs) based on fMRI. Although the methods differ in many aspects, they typically adopt a similar workflow: (1) extracting FC features that are correlation measures between fMRI signals of functional brain network nodes; (2) building a classifier to distinguish ASD patients from HCs based on the FC features using an ML method, such as support vector machine (SVM); and (3) identifying FC features that contribute significantly to the classification, in order to understand how ASD patients differ from HCs in their FC features. It has been demonstrated that directly applying conventional ML methods, such as SVM, naïve Bayes, and random forests (RF), to FC measures may lead to limited classification performance [3]. To improve the classification performance, several feature mining and fusion strategies have been proposed. For example, low-rank representation learning was introduced to improve the discriminative ability of the FC features, and the method achieved improved classification performance for ASD diagnosis [4].
Deep learning (DL) has also been adopted to build classifiers for ASD diagnosis. For instance, two cascaded autoencoders were trained to learn low-dimension representations from the original high-dimension FC features for building classifiers to aid ASD diagnosis [2]. A graph convolutional neural network (GCN) model was adopted to build classifiers on FC measures to aid ASD diagnosis [5]. These DL methods have achieved competitive classification performance compared with the conventional ML methods, partially due to their powerful data-driven feature learning capacity [6]. However, none of these methods explicitly models the heterogeneity of ASD, which may not only hamper the understanding of its biological underpinnings but also limit their classification performance.
To improve the ASD diagnosis and elucidate its neurophysiological underpinnings, we adopt capsule networks (CapsNets) to build classifiers for distinguishing ASD patients from HCs based on their FC measures and stratify ASD patients into groups with distinct FC patterns based on vectorized outputs of the CapsNet classifiers.
Different from most existing ML/DL methods that build classifiers with scalar classification outputs, the CapsNet-based classifiers are built upon capsules, each containing a series of neurons, and have a vectorized classification output, with each element characterizing a distinct latent pattern learned through dynamic routing [8]. The vectorized classification output facilitates the stratification of ASD patients into subgroups with distinct FC patterns. We have validated our method on a large-scale multi-site ASD fMRI dataset, ABIDE I [7], and compared it with state-of-the-art ML/DL methods in terms of classification performance. We have also revealed heterogeneous FC patterns of ASD and subtypes of ASD with distinct FC patterns and clinical measures.
2. METHODS
Our CapsNet framework is illustrated by Fig. 1, consisting of (a) extracting FC features from fMRI data and (b) building a CapsNet classifier for ASD diagnosis. The classifier’s vectorized classification output is then used to stratify ASD patients into distinct subgroups and to identify subnetworks of FC patterns associated with each element of the output vector using class activation mapping [9].
Fig. 1.
A schematic framework of the proposed CapsNet method for ASD diagnosis, consisting of (a) extracting FC features from fMRI data and (b) the network structure of our CapsNet, with a flattened FC vector as its input that is successively propagated through a Fully Connected layer, a Reshape operation, and Dynamic routing between Representation Capsules and Diagnosis Capsules (one capsule for the category of ASD and the other for HC). All the operations are illustrated in (c). PCC: Pearson Correlation Coefficient.
2.1. Computation of FC features
Before computing FC features, fMRI data are processed using the C-PAC preprocessing pipeline, which includes slice-timing correction, motion correction, and intensity normalization [7]. Then, we adopt a functional brain network with 200 nodes defined by the CC200 functional parcellation atlas to compute the FC features for individual subjects [2]. As illustrated by Fig. 1a, the FC measure between each pair of nodes is computed as the Pearson correlation coefficient (PCC) between their fMRI signals. Following existing studies [2][5], the lower triangle of the whole-brain FC matrix (excluding its main diagonal elements) is flattened to form a vector of FC measures (a 19900-dimension vector for every subject) as the input to our CapsNet model.
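The feature extraction step can be illustrated with a minimal NumPy sketch. The random time series, its length of 150 time points, and the function name `fc_feature_vector` are assumptions made for illustration only; real inputs would be the preprocessed CC200 node signals.

```python
import numpy as np

def fc_feature_vector(timeseries: np.ndarray) -> np.ndarray:
    """Flatten the strict lower triangle of the Pearson correlation matrix.

    timeseries: array of shape (n_timepoints, n_nodes), one column per node.
    """
    # np.corrcoef treats rows as variables, so transpose to (n_nodes, n_timepoints).
    fc_matrix = np.corrcoef(timeseries.T)
    # k=-1 selects entries strictly below the main diagonal.
    rows, cols = np.tril_indices(fc_matrix.shape[0], k=-1)
    return fc_matrix[rows, cols]

rng = np.random.default_rng(0)
# Synthetic stand-in for the signals of 200 nodes (hypothetical data).
toy_timeseries = rng.standard_normal((150, 200))
features = fc_feature_vector(toy_timeseries)
print(features.shape)  # (19900,), i.e. 200 * 199 / 2
```

With 200 nodes the strict lower triangle holds 200 × 199 / 2 = 19900 entries, which matches the input dimension stated above.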
2.2. CapsNet for ASD diagnosis
Our ASD diagnosis model is built on CapsNets. Unlike other deep neural networks, each node of the capsule layers in a CapsNet is a capsule containing a series of neurons. The activity of each capsule is represented by an activation vector (the activation values of its neurons). The norm of this vector represents the probability that an object of interest possesses a certain property. The classification layer consists of classification capsules as the diagnosis output, whose norms represent the probability that an instance belongs to a certain class. The classification capsules’ vectors are then utilized to stratify instances of the same class into distinct subgroups using clustering techniques and to learn subnetworks of FC patterns associated with each of their elements. In the present study, ASD patients are grouped into subgroups.
Our CapsNet contains a feature representation layer and two Capsule layers, as illustrated by Fig. 1b. The feature representation layer consists of fully connected filters, which serve to reduce the dimension of the original FC features and obtain a high-level feature representation. The same parameters were used in this feature representation layer as in previous work [2]. In particular, 1000 filters are learned to generate a 1000-dimension feature vector F for each subject. In the Capsule layers, the feature vector F is first reshaped to form a series of 8-dimension capsules [f1, … , fi, … , fM], referred to as Representation Capsules. The dimension of the Representation Capsules is set following a typical parameter setting for CapsNets [8]. Then, [f1, … , fi, … , fM] are connected to two Diagnosis Capsules that model the probability that a subject is an ASD patient or an HC subject, respectively. Connections between capsules in directly connected layers are optimized via the “dynamic routing by agreement” algorithm [8], as summarized in the following paragraphs.
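The reshape from the feature vector F to the Representation Capsules can be sketched as follows. This is only an illustration under stated assumptions: the weights are random stand-ins for the learned filters, no activation function is applied because the text does not specify one, and the toy input has 500 entries rather than 19900 to keep the example light. The function name `representation_capsules` is hypothetical.

```python
import numpy as np

N_FILTERS = 1000   # size of the feature vector F (from the text)
CAPSULE_DIM = 8    # dimension of each Representation Capsule (from the text)

def representation_capsules(fc_vector, weights, bias):
    """Linear feature layer followed by the reshape into capsules."""
    feature_vector = fc_vector @ weights + bias      # F, shape (N_FILTERS,)
    return feature_vector.reshape(-1, CAPSULE_DIM)   # shape (M, CAPSULE_DIM)

rng = np.random.default_rng(0)
toy_input_dim = 500  # hypothetical; the real FC vector has 19900 entries
# Random weights stand in for the learned fully connected filters.
weights = rng.standard_normal((toy_input_dim, N_FILTERS)) * 0.01
bias = np.zeros(N_FILTERS)
fc_vector = rng.uniform(-1.0, 1.0, size=toy_input_dim)

capsules = representation_capsules(fc_vector, weights, bias)
print(capsules.shape)  # (125, 8), i.e. M = 1000 / 8 = 125
```

Under these settings the number of Representation Capsules is M = 1000 / 8 = 125.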
Denoting the output of capsule i in the Representation Capsules by μi, its prediction vector for parent capsule j in the Diagnosis Capsule layer is computed by
$$\hat{\mu}_{j|i} = W_{ij}\,\mu_i \tag{1}$$
where Wij is a trainable weight matrix between paired capsules in the capsule layers. A coupling coefficient cij between these two capsules is defined as
$$c_{ij} = \frac{\exp(b_{ij})}{\sum_{l=1}^{k} \exp(b_{il})} \tag{2}$$
where bij is the log prior probability that capsule i is coupled with capsule j, initialized to 0, and k is the number of capsules in the Diagnosis Capsule layer. The parent capsule j has an input sj, computed by
$$s_j = \sum_{i} c_{ij}\, W_{ij}\,\mu_i \tag{3}$$
Then, a squashing function, as formulated by Eq. 4, is applied to restrict the norm of output vector vj from the capsule j to the range of [0, 1]. Therefore, the norm of this vector can act as a probability for classification.
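$$v_j = \frac{\lVert s_j \rVert^{2}}{1 + \lVert s_j \rVert^{2}} \, \frac{s_j}{\lVert s_j \rVert} \tag{4}$$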
For the Diagnosis Capsules, the norm of vj represents the probability that a subject belongs to ASD or HC.
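A small numerical check of the squashing step may help. The sketch below assumes the standard squashing function of [8]; the small constant `eps` is an added safeguard against division by zero and is not part of the original formulation.

```python
import numpy as np

def squash(s, eps=1e-8):
    """Shrink a vector so that its norm lies in [0, 1) while keeping its direction."""
    squared_norm = np.sum(s ** 2, axis=-1, keepdims=True)
    scale = squared_norm / (1.0 + squared_norm)
    return scale * s / np.sqrt(squared_norm + eps)

s_j = np.array([3.0, 0.0, 0.0, 0.0])  # ||s_j|| = 3
v_j = squash(s_j)
print(np.linalg.norm(v_j))  # approximately 0.9, i.e. 9 / (1 + 9)
```

Long input vectors are mapped to norms close to 1 and short ones to norms close to 0, which is what allows the norm to be read as a class probability.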
An agreement factor aij between the capsule i and its parent capsule j is defined as an inner product, i.e.,
$$a_{ij} = v_j \cdot \left( W_{ij}\,\mu_i \right) \tag{5}$$
The agreement factor aij is added to bij in the next iteration step of the dynamic routing to enhance coupling between these two capsules. The loss function LD of CapsNet is defined by
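$$L_D = \sum_{c} \left[ T_c \max\left(0,\; m^{+} - \lVert v_c \rVert\right)^{2} + \lambda \left(1 - T_c\right) \max\left(0,\; \lVert v_c \rVert - m^{-}\right)^{2} \right] \tag{6}$$
where Tc = 1 iff an instance from class c (ASD or HC) is presented to the network, vc is the output of the capsule representing class c, λ is a weight set to 0.5, m+ = 0.9, and m− = 0.1 [8].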
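The routing procedure and loss described in this section can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' PyTorch implementation: it follows the standard formulation of [8], the number of routing iterations (3) is the default of [8] and is not stated in the text, and the weights and inputs are random placeholders.

```python
import numpy as np

def squash(s, eps=1e-8):
    squared_norm = np.sum(s ** 2, axis=-1, keepdims=True)
    return (squared_norm / (1.0 + squared_norm)) * s / np.sqrt(squared_norm + eps)

def dynamic_routing(mu, W, n_iterations=3):
    """mu: (M, d_in) Representation Capsules; W: (M, K, d_out, d_in) weights.

    Returns the K Diagnosis Capsule vectors, shape (K, d_out).
    """
    # Eq. 1: prediction of each capsule i for each parent capsule j.
    mu_hat = np.einsum("ijab,ib->ija", W, mu)          # (M, K, d_out)
    b = np.zeros(mu_hat.shape[:2])                      # b_ij, initialized to 0
    for _ in range(n_iterations):
        # Eq. 2: softmax over the parent capsules (numerically stabilized).
        e = np.exp(b - b.max(axis=1, keepdims=True))
        c = e / e.sum(axis=1, keepdims=True)
        # Eq. 3: weighted sum of predictions, then Eq. 4: squashing.
        s = np.einsum("ij,ija->ja", c, mu_hat)          # (K, d_out)
        v = squash(s)
        # Eq. 5: agreement, added to b_ij for the next iteration.
        b = b + np.einsum("ija,ja->ij", mu_hat, v)
    return v

def margin_loss(v, target, m_plus=0.9, m_minus=0.1, lam=0.5):
    """Eq. 6 for a single subject; target is the index of the true class."""
    norms = np.linalg.norm(v, axis=-1)
    T = np.zeros_like(norms)
    T[target] = 1.0
    present = T * np.maximum(0.0, m_plus - norms) ** 2
    absent = lam * (1.0 - T) * np.maximum(0.0, norms - m_minus) ** 2
    return float(np.sum(present + absent))

rng = np.random.default_rng(0)
mu = rng.standard_normal((125, 8))            # 125 capsules of dimension 8
W = rng.standard_normal((125, 2, 4, 8)) * 0.1  # 2 Diagnosis Capsules of dimension 4
v = dynamic_routing(mu, W)
print(v.shape, margin_loss(v, target=0))
```

The loss is zero only when the norm of the true-class capsule is at least m+ and the norm of the other capsule is at most m−.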
2.3. Characterization of the heterogeneity of ASD
The vectorized representation in the ASD capsule makes it possible to identify subtypes of ASD. To disentangle the heterogeneity of ASD, we group ASD patients into subgroups by applying k-means to their classification capsules, i.e., the vectorized classification outputs. The subgrouping result is then assessed in terms of group differences in their clinical measures. FC measures actively associated with each element of the vectorized classification outputs are quantified using class activation mapping (CAM) [9]. In particular, the activation values of the FC measures associated with each element are normalized to [−1, 1], and then the top-10 most activated FCs are visualized.
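The subgrouping and the top-10 selection can be sketched as below. Several details are assumptions, since the text does not specify them: the k-means initialization (a deterministic farthest-point rule is used here), the normalization to [−1, 1] (division by the maximum absolute activation), and the ranking criterion (largest normalized value). The capsule vectors and activations are synthetic, and the function names are hypothetical.

```python
import numpy as np

def farthest_point_init(points, k):
    """Deterministic initialization: start at points[0], then add the farthest point."""
    centers = [points[0]]
    while len(centers) < k:
        diffs = points[:, None, :] - np.array(centers)[None, :, :]
        nearest = np.min(np.linalg.norm(diffs, axis=2), axis=1)
        centers.append(points[np.argmax(nearest)])
    return np.array(centers)

def kmeans(points, k, n_iterations=100):
    """Plain Lloyd iterations on the capsule vectors."""
    centers = farthest_point_init(points, k)
    for _ in range(n_iterations):
        diffs = points[:, None, :] - centers[None, :, :]
        labels = np.linalg.norm(diffs, axis=2).argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

def top_activated_fcs(activation, n_nodes=200, top_k=10):
    """Scale activations to [-1, 1] and return (node_a, node_b, value) for the top_k."""
    scaled = activation / np.max(np.abs(activation))  # assumes a non-zero activation map
    rows, cols = np.tril_indices(n_nodes, k=-1)       # same ordering as the FC vector
    order = np.argsort(scaled)[::-1][:top_k]
    return [(int(rows[i]), int(cols[i]), float(scaled[i])) for i in order]

rng = np.random.default_rng(0)
# Two synthetic groups of 4-dimension capsule vectors (hypothetical data).
capsule_vectors = np.vstack([rng.normal(0.0, 0.5, (30, 4)),
                             rng.normal(10.0, 0.5, (30, 4))])
labels, centers = kmeans(capsule_vectors, k=2)

activation = np.zeros(19900)
activation[0], activation[5] = 2.0, -4.0
print(top_activated_fcs(activation, top_k=1))  # [(1, 0, 0.5)]
```

The index returned by `top_activated_fcs` is mapped back to a node pair with the same lower-triangle ordering used to flatten the FC matrix, so each selected feature can be drawn as an edge between two nodes.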
3. EXPERIMENT RESULTS
3.1. fMRI Dataset and Experimental Settings
Our CapsNet method was evaluated for distinguishing ASD patients from HCs on the ABIDE I dataset [7]. In particular, our study focused on 505 ASD patients and 530 HCs with fMRI data of high quality. The fMRI scans were preprocessed, and FC measures were computed as described in Section 2.1. To train our model, we adopted Adam as the optimizer, and the weights of our network were initialized using Xavier initialization. The number of neurons in the classification capsules was set to 4. A larger number of neurons in the classification capsules did not increase the classification accuracy. Our CapsNet model was implemented in PyTorch.
We also compared our method with state-of-the-art ML/DL methods, including DNN [2], SVM and RF [3], in terms of classification performance that was estimated using a 10-fold cross-validation procedure.
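A 10-fold split over the 1035 subjects can be sketched as follows. This is a plain shuffled split; whether the original folds were stratified by diagnosis or by acquisition site is not stated, so that aspect is left out. The function name `k_fold_indices` and the seed are hypothetical.

```python
import numpy as np

def k_fold_indices(n_subjects, n_folds=10, seed=0):
    """Yield (train_idx, test_idx) pairs; each subject is tested exactly once."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_subjects)
    folds = np.array_split(shuffled, n_folds)
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train_idx, test_idx

n_subjects = 505 + 530  # ASD patients + HCs
splits = list(k_fold_indices(n_subjects))
print(len(splits), [len(test) for _, test in splits])
```

Performance measures are then computed on each held-out fold and averaged across the 10 folds.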
3.2. Classification Performance
Table 1 summarizes classification performance measures obtained by all the methods under comparison, including Accuracy, Sensitivity, and Specificity. DL methods (our CapsNet model and the DNN model) obtained similar performance, better than the SVM and RF models. Overall, our method obtained the best accuracy.
Table 1.
Mean of classification performance measures of all the methods under comparison.
Methods | Accuracy | Sensitivity | Specificity |
---|---|---|---|
RF | 0.65 | 0.68 | 0.62 |
SVM | 0.63 | 0.69 | 0.58 |
DNN | 0.70 | 0.74 | 0.63 |
CapsNet | 0.71 | 0.73 | 0.66 |
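For reference, the three measures in Table 1 can be computed from predicted and true labels as in the sketch below. It assumes that ASD is coded as the positive class (1) and HC as the negative class (0), which the text does not state explicitly; the labels are toy values.

```python
import numpy as np

def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity with ASD = 1 (positive), HC = 0."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return {
        "accuracy": float((tp + tn) / (tp + tn + fp + fn)),
        "sensitivity": float(tp / (tp + fn)),  # fraction of ASD patients detected
        "specificity": float(tn / (tn + fp)),  # fraction of HCs correctly rejected
    }

metrics = classification_metrics([1, 1, 1, 0, 0], [1, 1, 0, 0, 1])
print(metrics)  # accuracy 0.6, sensitivity 2/3, specificity 0.5
```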
3.3. Heterogeneous FC Patterns of ASD
We identified two subgroups of ASD patients. As shown in Fig. 2a, the 4-dimension ASD capsule vectors of testing ASD patients, visualized in a 2D space via t-SNE [12], were distributed in two clusters. Therefore, k-means clustering with k = 2 was used to group ASD patients into 2 subtypes, which differed significantly in their Autism Diagnostic Observation Schedule (ADOS) scores (ranging from 0 to 22, with larger values representing greater ASD severity), as illustrated by the violin plot in Fig. 2b (p = 0.048). These results indicate that the identified subgroups were clinically meaningful.
Fig. 2.
Visualization of ASD subtype analysis results.
Fig. 3 shows subnetworks associated with elements of the 4-dimension ASD capsule vectors. Its top panel shows the top-10 most activated FC measures associated with each element, while its bottom panel shows their distributions on Yeo-7 functional brain parcellations [10]. In particular, the distributions were computed as the average number of the top-10 most activated FC measures of the subnetwork under consideration in each of the 7 functional brain parcellations. These results indicated that ASD was associated with a variety of FC measures that were mainly located in the somatomotor, attention, frontoparietal, and default mode networks, consistent with existing neuropsychiatric findings [11]. The distinct subnetworks also highlighted that ASD patients might be associated with heterogeneous abnormal FC patterns that could be effectively characterized by the CapsNet classification model.
Fig. 3.
Subnetworks associated with elements of the 4-dimension ASD capsule vectors of testing ASD patients. The top-10 most activated FC measures associated with each element are visualized on the top panel and their distributions on Yeo-7 functional brain parcellations are shown in the bottom panel.
4. CONCLUSIONS
Our study has demonstrated that the CapsNet-based classification method could effectively characterize heterogeneous FC patterns of ASD patients and improve classification performance compared with state-of-the-art ML/DL methods. Unlike the alternative ML/DL methods, whose scalar classification outputs are not equipped to differentiate subjects of the same clinical class, the proposed CapsNet classification model has vectorized classification outputs that facilitate stratification of ASD patients with distinct FC patterns. The vectorized classification outputs also help disentangle heterogeneous FC patterns of ASD patients, as reflected by the different subnetworks associated with their individual activation elements in capsules. Importantly, the subgroups identified based on the FC measures differed significantly in their ADOS scores, indicating that the subgrouping results were clinically meaningful. Finally, the FC measures that actively contributed to the classification were largely consistent with existing neuropsychiatric findings, demonstrating that the CapsNet classification model could not only obtain improved classification performance, but also capture heterogeneous and abnormal FC patterns associated with ASD.
ACKNOWLEDGEMENTS
This study was supported in part by National Institutes of Health grants [EB022573 and MH120811]. The funding sources were not involved in the study design, in the collection, analysis and interpretation of data, in the writing of the report, or in the decision to submit the article for publication.
5. REFERENCES
- [1] Jeste Shafali S., and Geschwind Daniel H. “Disentangling the heterogeneity of autism spectrum disorder through genetic findings.” Nature Reviews Neurology 10.2 (2014): 74.
- [2] Heinsfeld Anibal Sólon, et al. “Identification of autism spectrum disorder using deep learning and the ABIDE dataset.” NeuroImage: Clinical 17 (2018): 16–23.
- [3] Tejwani Ravi, et al. “Autism classification using brain functional connectivity dynamics and machine learning.” arXiv preprint arXiv:1712.08041 (2017).
- [4] Wang Mingliang, et al. “Low-Rank Representation for Multi-center Autism Spectrum Disorder Identification.” MICCAI, 2018.
- [5] Parisot Sarah, et al. “Disease prediction using graph convolutional networks: Application to ASD and Alzheimer’s disease.” Medical Image Analysis 48 (2018): 117–130.
- [6] LeCun Yann, Bengio Yoshua, and Hinton Geoffrey. “Deep learning.” Nature 521.7553 (2015): 436.
- [7] Nielsen Jared A., et al. “Multisite functional connectivity MRI classification of autism: ABIDE results.” Frontiers in Human Neuroscience 7 (2013): 599.
- [8] Sabour Sara, Frosst Nicholas, and Hinton Geoffrey E. “Dynamic routing between capsules.” Advances in Neural Information Processing Systems. 2017.
- [9] Zhou Bolei, et al. “Learning deep features for discriminative localization.” CVPR. 2016.
- [10] Yeo B. T. Thomas, et al. “The organization of the human cerebral cortex estimated by intrinsic functional connectivity.” Journal of Neurophysiology 106.3 (2011): 1125–1165.
- [11] Cerliani Leonardo, et al. “Increased functional connectivity between subcortical and cortical resting-state networks in autism spectrum disorder.” JAMA Psychiatry 72.8 (2015): 767–777.
- [12] van der Maaten Laurens, and Hinton Geoffrey. “Visualizing data using t-SNE.” Journal of Machine Learning Research 9 (2008): 2579–2605.