Improving Diagnosis of Autism Spectrum Disorder and Disentangling its Heterogeneous Functional Connectivity Patterns Using Capsule Networks

Zhicheng Jiao; Hongming Li; Yong Fan

doi:10.1109/isbi45749.2020.9098524

Author manuscript; available in PMC: 2021 Apr 1.

Published in final edited form as: Proc IEEE Int Symp Biomed Imaging. 2020 May 22;2020:1331–1334. doi: 10.1109/isbi45749.2020.9098524

PMCID: PMC7687286 NIHMSID: NIHMS1646957 PMID: 33250955

Abstract

Functional connectivity (FC) analysis is an appealing tool to aid diagnosis and elucidate the neurophysiological underpinnings of autism spectrum disorder (ASD). Many machine learning methods have been developed to distinguish ASD patients from healthy controls based on FC measures and to identify abnormal FC patterns of ASD. In particular, several studies have demonstrated that deep learning models can achieve better performance for ASD diagnosis than conventional machine learning methods. Although the existing machine learning methods have achieved promising classification performance, they do not explicitly model the heterogeneity of ASD and are therefore incapable of disentangling its heterogeneous FC patterns. To achieve an improved diagnosis and a better understanding of ASD, we adopt capsule networks (CapsNets) to build classifiers for distinguishing ASD patients from healthy controls based on FC measures and to stratify ASD patients into groups with distinct FC patterns. Evaluation results based on a large multi-site dataset have demonstrated that our method not only obtained better classification performance than state-of-the-art alternative machine learning methods, but also identified clinically meaningful subgroups of ASD patients based on the vectorized classification outputs of the CapsNet classification model.

Index Terms: Autism spectrum disorder, Heterogeneity, Functional connectivity, Capsule network

1. INTRODUCTION

Autism spectrum disorder (ASD) represents a heterogeneous group of neurodevelopmental disorders, characterized by a spectrum of phenotypes such as impaired social–communication function. To improve ASD diagnosis and elucidate its neurophysiological underpinnings, enormous effort has been put into disentangling the heterogeneity of ASD through genetic findings [1]. Recent neuroimaging studies have also revealed that ASD is associated with abnormal functional connectivity (FC) patterns and the FC measures derived from resting-state functional MRI (fMRI) data could be used in conjunction with machine learning techniques to aid ASD diagnosis [2].

Many machine learning (ML) methods have been developed to build classifiers to distinguish ASD patients from healthy controls (HCs) based on fMRI. Although the methods differ in many aspects, they typically adopt a similar workflow: (1) extracting FC features that are correlation measures between fMRI signals of functional brain network nodes; (2) building a classifier to distinguish ASD patients from HCs based on the FC features using an ML method, such as support vector machine (SVM); and (3) identifying FC features that contribute significantly to the classification for understanding how ASD patients differ from HCs in their FC features. It has been demonstrated that directly applying conventional ML methods, such as SVM, naïve Bayes, and random forests (RF), to FC measures may lead to limited classification performance [3]. To improve the classification performance, several feature mining and fusion strategies have been proposed. For example, low-rank representation learning was introduced to improve the discriminative ability of the FC features, and the method achieved improved classification performance for ASD diagnosis [4].

Deep learning (DL) has also been adopted to build classifiers for ASD diagnosis. For instance, two cascaded autoencoders were trained to learn low-dimensional representations from the original high-dimensional FC features for building classifiers to aid ASD diagnosis [2]. A graph convolutional neural network (GCN) model was adopted to build classifiers on FC measures to aid ASD diagnosis [5]. These DL methods have achieved competitive classification performance compared with the conventional ML methods, partially due to their powerful data-driven feature learning capacity [6]. However, none of these methods explicitly models the heterogeneity of ASD, which may not only hamper the understanding of its biological underpinnings but also limit their classification performance.

To improve the ASD diagnosis and elucidate its neurophysiological underpinnings, we adopt capsule networks (CapsNets) to build classifiers for distinguishing ASD patients from HCs based on their FC measures and stratify ASD patients into groups with distinct FC patterns based on vectorized outputs of the CapsNet classifiers.

Different from most existing ML/DL methods that build classifiers with scalar classification outputs, CapsNet-based classifiers are built upon capsules, each containing a series of neurons, and have a vectorized classification output in which each element characterizes a distinct latent pattern learned through dynamic routing [8]. The vectorized classification output facilitates the stratification of the ASD patients into subgroups with distinct FC patterns. We have validated our method on a large-scale multi-site ASD fMRI dataset, ABIDE I [7], and compared it with state-of-the-art ML/DL methods in terms of classification performance. We have also revealed heterogeneous FC patterns of ASD and subtypes of ASD with distinct FC patterns and clinical measures.

2. METHODS

Our CapsNet framework is illustrated by Fig. 1, consisting of (a) extracting FC features from fMRI data and (b) building a CapsNet classifier for ASD diagnosis. The classifier’s vectorized classification output is then used to stratify ASD patients into distinct subgroups and to identify subnetworks of FC patterns associated with each element of the output vector using class activation mapping [9].

2.1. Computation of FC features

Before computing FC features, fMRI data are processed using the C-PAC preprocessing pipeline, which includes slice timing correction, motion correction, and intensity normalization [7]. Then, we adopt a functional brain network with 200 nodes defined by the CC200 functional parcellation atlas to compute the FC features for individual subjects [2]. As illustrated by Fig. 1a, the FC measure between each pair of nodes is computed as the Pearson correlation coefficient (PCC) between their fMRI signals. Following existing studies [2][5], the lower triangle of the whole-brain FC matrix (excluding its main diagonal elements) is flattened to form a vector of FC measures (200 × 199 / 2 = 19900 dimensions for every subject) as the input to our CapsNet model.
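The feature extraction step described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `fc_features` is hypothetical, and the time series are synthetic random numbers standing in for node-averaged fMRI signals (150 time points is an arbitrary choice, not a value from the study).

```python
import numpy as np

def fc_features(ts):
    """Flatten the strict lower triangle of a Pearson correlation matrix.

    ts: array of shape (n_timepoints, n_nodes), one column per brain node.
    Returns a vector of length n_nodes * (n_nodes - 1) / 2.
    """
    # Pearson correlation between every pair of node time series
    corr = np.corrcoef(ts, rowvar=False)            # (n_nodes, n_nodes)
    # k=-1 excludes the main diagonal
    rows, cols = np.tril_indices(corr.shape[0], k=-1)
    return corr[rows, cols]

# Synthetic stand-in for one subject: 150 time points, 200 CC200 nodes
rng = np.random.default_rng(0)
ts = rng.standard_normal((150, 200))
feats = fc_features(ts)
print(feats.shape)  # (19900,)
```

With 200 nodes the strict lower triangle holds 200 × 199 / 2 = 19900 entries, which matches the input dimension stated in the text.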

2.2. CapsNet for ASD diagnosis

Our ASD diagnosis model is built based on CapsNets. Different from alternative deep neural networks, each node of capsule layers in the CapsNets is a capsule containing a series of neurons. The activity of each capsule is represented by an activation vector (activation values of the neurons). The norm of this vector is a probability that an object of interest possesses a certain property. The classification layer consists of classification capsules as the diagnosis output whose norm represents a probability that an instance belongs to a certain class. The classification capsules’ vectors are then utilized to stratify instances of the same class into distinct subgroups using clustering techniques and learn subnetworks of FC patterns associated with each of their elements. In the present study, ASD patients are grouped into subgroups.

Our CapsNet contains a feature representation layer and two Capsule layers, as illustrated by Fig. 1b. The feature representation layer consists of fully connected filters which serve to reduce the dimension of the original FC features and obtain a high-level feature representation. The same parameters were used in this feature representation layer as in a previous work [2]. Particularly, 1000 filters are learned to generate a 1000-dimension feature vector F for each subject. In the Capsule layers, the feature vector F is first reshaped to form a series of 8-dimension capsules [f₁, … , f_i, … , f_M], referred to as Representation Capsules. The dimension of the Representation Capsules is set following a typical parameter setting for CapsNets [8]. Then, [f₁, … , f_i, … , f_M] are connected to two Diagnosis Capsules that model the probability that a subject is an ASD patient or an HC subject, respectively. Connections between capsules in directly connected layers are optimized via the “dynamic routing by agreement” algorithm [8], as summarized in the following paragraphs.
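To make the shapes concrete, the following sketch maps a 19900-dimension FC vector to a 1000-dimension feature vector F and reshapes it into Representation Capsules. It is a minimal illustration, not the trained model: the weights are random, the helper names `feature_layer` and `to_representation_capsules` are hypothetical, and no nonlinearity is applied because the text does not specify one. Under these dimensions, M = 1000 / 8 = 125 capsules.

```python
import numpy as np

def feature_layer(x, W, b):
    """Fully connected map from FC features to the feature vector F.
    Plain affine map; any activation function is left out as unspecified."""
    return x @ W + b

def to_representation_capsules(F, capsule_dim=8):
    """Reshape (..., n_features) into (..., M, capsule_dim)."""
    n_features = F.shape[-1]
    if n_features % capsule_dim != 0:
        raise ValueError("feature length must be a multiple of capsule_dim")
    return F.reshape(F.shape[:-1] + (n_features // capsule_dim, capsule_dim))

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 19900), dtype=np.float32)         # 2 synthetic subjects
W = 0.01 * rng.standard_normal((19900, 1000), dtype=np.float32)  # random, untrained
b = np.zeros(1000, dtype=np.float32)

F = feature_layer(x, W, b)
caps = to_representation_capsules(F)
print(F.shape, caps.shape)  # (2, 1000) (2, 125, 8)
```

Each row of `caps[n]` is one capsule f_i for subject n, so consecutive groups of eight entries of F form one capsule.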

Denoting the output of capsule i in the Representation Capsules by μ_i, its prediction for parent capsule j in the Diagnosis Capsule layer is computed by

\mu_{j \mid i} = W_{ij} \mu_{i}, \qquad (1)

where W_ij is a trainable weight matrix between paired capsules in adjacent capsule layers. A coupling coefficient c_ij between these two capsules is defined as

c_{ij} = \frac{\exp(b_{ij})}{\sum_{k} \exp(b_{ik})}, \qquad (2)

where b_ij is the log prior probability that capsule i is coupled with capsule j, initialized to 0, and k indexes the capsules in the Diagnosis Capsule layer. The parent capsule j has an input s_j, computed by

s_{j} = \sum_{i} c_{ij} \mu_{j \mid i}. \qquad (3)

Then, a squashing function, as formulated by Eq. 4, is applied to restrict the norm of the output vector v_j of capsule j to the range [0, 1). Therefore, the norm of this vector can act as a probability for classification.

v_{j} = \frac{\| s_{j} \|^{2}}{1 + \| s_{j} \|^{2}} \frac{s_{j}}{\| s_{j} \|}. \qquad (4)

For the Diagnosis Capsules, the norm of v_j represents the probability that a subject belongs to ASD or HC.
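As a quick check of Eq. 4, taking norms on both sides shows that the output norm depends only on the input norm. The numerical inputs below are illustrative values, not results from the study:

\| v_{j} \| = \frac{\| s_{j} \|^{2}}{1 + \| s_{j} \|^{2}}, \qquad \| s_{j} \| = 0.1 \Rightarrow \| v_{j} \| \approx 0.0099, \quad \| s_{j} \| = 1 \Rightarrow \| v_{j} \| = 0.5, \quad \| s_{j} \| = 3 \Rightarrow \| v_{j} \| = 0.9

Short input vectors are shrunk toward zero and long ones approach, but never reach, unit length, while the direction of s_j is preserved.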

An agreement factor a_ij between capsule i and its parent capsule j is defined as the inner product, i.e.,

a_{ij} = v_{j} \cdot \mu_{j \mid i}. \qquad (5)
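Eqs. 1 to 5 can be combined into a short routing loop, in which the agreement a_ij is accumulated into b_ij between iterations. The sketch below is a NumPy illustration under stated assumptions rather than the authors' PyTorch implementation: the weights are random, the function names are hypothetical, the number of routing iterations (3) follows the common setting in [8] and is not stated in the text, and the capsule sizes (125 input capsules of dimension 8, 2 output capsules of dimension 4) are taken from the settings described elsewhere in the paper.

```python
import numpy as np

def squash(s, eps=1e-9):
    """Eq. 4: shrink each vector so that its norm lies in [0, 1)."""
    sq = np.sum(s ** 2, axis=-1, keepdims=True)
    return (sq / (1.0 + sq)) * s / np.sqrt(sq + eps)

def dynamic_routing(mu, W, n_iter=3):
    """Route M input capsules to K output capsules.

    mu: (M, d_in) outputs of the lower-level capsules.
    W:  (M, K, d_out, d_in) weight matrices W_ij.
    Returns v of shape (K, d_out) and couplings c of shape (M, K).
    """
    # Eq. 1: prediction of capsule i for parent capsule j
    mu_hat = np.einsum('ijab,ib->ija', W, mu)        # (M, K, d_out)
    b = np.zeros(mu_hat.shape[:2])                   # routing logits, start at 0
    for _ in range(n_iter):
        # Eq. 2: softmax over parent capsules (shifted for numerical stability)
        e = np.exp(b - b.max(axis=1, keepdims=True))
        c = e / e.sum(axis=1, keepdims=True)
        s = np.einsum('ij,ija->ja', c, mu_hat)       # Eq. 3
        v = squash(s)                                # Eq. 4
        a = np.einsum('ja,ija->ij', v, mu_hat)       # Eq. 5
        b = b + a                                    # strengthen agreeing routes
    return v, c

rng = np.random.default_rng(0)
mu = rng.standard_normal((125, 8))                   # synthetic input capsules
W = 0.05 * rng.standard_normal((125, 2, 4, 8))       # random, untrained weights
v, c = dynamic_routing(mu, W)
print(v.shape, np.linalg.norm(v, axis=1))
```

The two norms printed at the end play the role of the class probabilities for the two Diagnosis Capsules; with random weights they carry no diagnostic meaning.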

The agreement factor a_ij is added to b_ij in the next iteration step of the dynamic routing to enhance the coupling between these two capsules. The loss function L_D of the CapsNet is defined by

L_{D} = T_{c} \max(0, m^{+} - \| v_{c} \|)^{2} + \lambda (1 - T_{c}) \max(0, \| v_{c} \| - m^{-})^{2}, \qquad (6)

where T_c = 1 iff an instance from class c (ASD or HC) is presented to the network, v_c is the output of the capsule representing class c, λ is a weight set to 0.5, m⁺ = 0.9, and m⁻ = 0.1 [8].
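The margin loss of Eq. 6 can be written directly from the capsule norms. In this sketch the function name `margin_loss` is hypothetical, and summing the per-class terms into one loss per subject follows the convention of [8], which is an assumption here since the text gives only the per-class expression.

```python
import numpy as np

def margin_loss(v_norm, target, m_pos=0.9, m_neg=0.1, lam=0.5):
    """Eq. 6 evaluated for every class, then summed over classes.

    v_norm: (n_subjects, n_classes) norms of the Diagnosis Capsules.
    target: (n_subjects,) integer class labels.
    """
    T = np.eye(v_norm.shape[1])[target]               # one-hot T_c
    present = T * np.maximum(0.0, m_pos - v_norm) ** 2
    absent = lam * (1.0 - T) * np.maximum(0.0, v_norm - m_neg) ** 2
    return (present + absent).sum(axis=1)

# Two illustrative subjects, both with true class 0
v_norm = np.array([[0.95, 0.05],    # confident and correct: no penalty
                   [0.50, 0.50]])   # undecided: both terms are active
loss = margin_loss(v_norm, np.array([0, 0]))
print(loss)
```

For the second subject the loss is (0.9 − 0.5)² + 0.5 · (0.5 − 0.1)² = 0.16 + 0.08 = 0.24, while the first subject lies inside both margins and incurs zero loss.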

2.3. Characterization of the heterogeneity of ASD

The vectorized representation in the ASD capsule makes it possible to identify subtypes of ASD. To disentangle the heterogeneity of ASD, we group ASD patients into subgroups by applying k-means to their classification capsules, i.e., the vectorized classification outputs. The subgrouping result is then assessed in terms of group differences in their clinical measures. FC measures actively associated with each element of the vectorized classification outputs are quantified using class activation mapping (CAM) [9]. Particularly, the activation values of the FC measures associated with each element are normalized to [−1, 1], and then the top-10 most activated FCs are visualized.
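The subgrouping step can be illustrated with a small k-means (Lloyd's algorithm) written in NumPy. This is a sketch only: the function name `kmeans` is hypothetical, the initialization from randomly chosen data points is an assumption, and the 4-dimension vectors are synthetic blobs standing in for ASD capsule outputs.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns labels (n,) and centers (k, d)."""
    rng = np.random.default_rng(seed)
    # Initialize centers with k distinct data points
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every point to its nearest center
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Recompute centers; keep the old one if a cluster is empty
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Synthetic stand-in for 4-dimension ASD capsule vectors: two tight groups
rng = np.random.default_rng(1)
group_a = 0.2 + 0.02 * rng.standard_normal((30, 4))
group_b = 0.8 + 0.02 * rng.standard_normal((30, 4))
X = np.vstack([group_a, group_b])

labels, centers = kmeans(X, k=2)
print(np.bincount(labels))
```

In practice the resulting labels would then be compared against clinical measures, as the text describes; which cluster receives label 0 or 1 is arbitrary.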

3. EXPERIMENT RESULTS

3.1. fMRI Dataset and Experimental Settings

Our CapsNet method was evaluated for distinguishing ASD patients from HCs based on the ABIDE I dataset [7]. Particularly, our study focused on 505 ASD patients and 530 HCs with fMRI data of high quality. The fMRI scans were preprocessed, and FC measures were computed as described in Section 2.1. To train our model, we adopted Adam as the optimizer, and the weights of our network were initialized using Xavier initialization. The number of neurons in the classification capsules was set to 4. A larger number of neurons in the classification capsules did not increase the classification accuracy. Our CapsNet model was implemented in PyTorch.

We also compared our method with state-of-the-art ML/DL methods, including DNN [2], SVM and RF [3], in terms of classification performance that was estimated using a 10-fold cross-validation procedure.
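A 10-fold split of the 505 ASD patients and 530 HCs can be sketched as follows. The helper `stratified_kfold` is hypothetical, and stratifying the folds by diagnosis is an assumption: the text states only that 10-fold cross-validation was used, not how the folds were drawn.

```python
import numpy as np

def stratified_kfold(labels, n_splits=10, seed=0):
    """Yield (train_idx, test_idx) pairs with classes balanced across folds."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    fold_of = np.empty(len(labels), dtype=int)
    for cls in np.unique(labels):
        # Shuffle the subjects of this class, then deal them out round-robin
        idx = rng.permutation(np.flatnonzero(labels == cls))
        fold_of[idx] = np.arange(len(idx)) % n_splits
    for f in range(n_splits):
        yield np.flatnonzero(fold_of != f), np.flatnonzero(fold_of == f)

# 1 = ASD, 0 = HC, using the group sizes reported in the text
labels = np.array([1] * 505 + [0] * 530)
folds = list(stratified_kfold(labels))
for train_idx, test_idx in folds[:2]:
    print(len(train_idx), len(test_idx), int(labels[test_idx].sum()))
```

Each test fold then contains 53 HCs and either 50 or 51 ASD patients, and every subject appears in exactly one test fold.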

3.2. Classification Performance

Table 1 summarizes classification performance measures obtained by all the methods under comparison, including Accuracy, Sensitivity, and Specificity. DL methods (our CapsNet model and the DNN model) obtained similar performance, better than the SVM and RF models. Overall, our method obtained the best accuracy.

Table 1.

Mean of classification performance measures of all the methods under comparison.

Methods	Accuracy	Sensitivity	Specificity
RF	0.65	0.68	0.62
SVM	0.63	0.69	0.58
DNN	0.70	0.74	0.63
CapsNet	0.71	0.73	0.66
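For reference, the three measures in Table 1 can be computed from predicted and true labels as sketched below. The function name is hypothetical, treating ASD as the positive class is an assumption, and the labels in the example are made up for illustration rather than taken from the study.

```python
import numpy as np

def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity with 1 = ASD (positive), 0 = HC."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),   # fraction of ASD patients detected
        "specificity": tn / (tn + fp),   # fraction of HCs correctly rejected
    }

# Made-up labels for eight subjects
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In this toy case there are 3 true positives, 3 true negatives, 1 false positive and 1 false negative, so all three measures equal 0.75.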

3.3. Heterogeneous FC Patterns of ASD

We identified two subgroups of ASD patients. As shown in Fig. 2a, the 4-dimension ASD capsule vectors of testing ASD patients, visualized in a 2D space via t-SNE [12], were distributed in two clusters. Therefore, k-means clustering with two clusters was used to group ASD patients into 2 subtypes, which were significantly different in their Autism Diagnostic Observation Schedule (ADOS) scores (ranging from 0 to 22, with larger ADOS values representing higher ASD severity), as illustrated by the violin plot shown in Fig. 2b, with a p value of 0.048. These results indicated that the subgroups identified were clinically meaningful.

Fig. 3 shows subnetworks associated with elements of the 4-dimension ASD capsule vectors. Its top panel shows the top-10 most activated FC measures associated with each element, while its bottom panel shows their distributions on the Yeo-7 functional brain parcellations [10]. Particularly, the distributions were computed as the average number of the top-10 most activated FC measures of the subnetwork under consideration in each of the 7 functional brain parcellations. These results indicated that ASD was associated with a variety of FC measures that were mainly located in the somatomotor, attention, frontoparietal, and default mode networks, consistent with existing neuropsychiatric findings [11]. The distinct subnetworks also highlighted that ASD patients might be associated with heterogeneous abnormal FC patterns that could be effectively characterized by the CapsNet classification model.

4. CONCLUSIONS

Our study has demonstrated that the CapsNet-based classification method could effectively characterize heterogeneous FC patterns of ASD patients and improve classification performance compared with state-of-the-art ML/DL methods. Different from the alternative ML/DL methods, whose classification outputs are scalar values and are not equipped to differentiate subjects of the same clinical class, the proposed CapsNet classification model has vectorized classification outputs that facilitate stratification of ASD patients with distinct FC patterns. The vectorized classification outputs also help disentangle heterogeneous FC patterns of ASD patients, as reflected by the different subnetworks associated with their individual activation elements in capsules. Importantly, the subgroups identified based on the FC measures had a statistically significant difference in their ADOS scores, indicating that the subgrouping results were clinically meaningful. Finally, the FC measures that actively contributed to the classification were largely consistent with existing neuropsychiatric findings, demonstrating that the CapsNet classification model could not only obtain improved classification performance, but also capture heterogeneous and abnormal FC patterns associated with ASD.

ACKNOWLEDGEMENTS

This study was supported in part by National Institutes of Health grants [EB022573 and MH120811]. The funding sources were not involved in the study design, in the collection, analysis and interpretation of data, in the writing of the report, or in the decision to submit the article for publication.

5. REFERENCES

[1] Jeste Shafali S., and Geschwind Daniel H. “Disentangling the heterogeneity of autism spectrum disorder through genetic findings.” Nature Reviews Neurology 10.2 (2014): 74.
[2] Heinsfeld Anibal Sólon, et al. “Identification of autism spectrum disorder using deep learning and the ABIDE dataset.” NeuroImage: Clinical 17 (2018): 16–23.
[3] Tejwani Ravi, et al. “Autism classification using brain functional connectivity dynamics and machine learning.” arXiv preprint arXiv:1712.08041 (2017).
[4] Wang Mingliang, et al. “Low-Rank Representation for Multi-center Autism Spectrum Disorder Identification.” MICCAI, 2018.
[5] Parisot Sarah, et al. “Disease prediction using graph convolutional networks: Application to ASD and Alzheimer’s disease.” Medical Image Analysis 48 (2018): 117–130.
[6] LeCun Yann, Bengio Yoshua, and Hinton Geoffrey. “Deep learning.” Nature 521.7553 (2015): 436.
[7] Nielsen Jared A., et al. “Multisite functional connectivity MRI classification of autism: ABIDE results.” Frontiers in Human Neuroscience 7 (2013): 599.
[8] Sabour Sara, Frosst Nicholas, and Hinton Geoffrey E. “Dynamic routing between capsules.” Advances in Neural Information Processing Systems. 2017.
[9] Zhou Bolei, et al. “Learning deep features for discriminative localization.” CVPR. 2016.
[10] Thomas Yeo BT, et al. “The organization of the human cerebral cortex estimated by intrinsic functional connectivity.” Journal of Neurophysiology 106.3 (2011): 1125–1165.
[11] Cerliani Leonardo, et al. “Increased functional connectivity between subcortical and cortical resting-state networks in autism spectrum disorder.” JAMA Psychiatry 72.8 (2015): 767–777.
[12] van der Maaten Laurens, and Hinton Geoffrey. “Visualizing data using t-SNE.” Journal of Machine Learning Research 9.Nov (2008): 2579–2605.

