Author manuscript; available in PMC: 2018 Apr 12.
Published in final edited form as: Proc Int Soc Magn Reson Med Sci Meet Exhib Int Soc Magn Reson Med Sci Meet Exhib. 2017 Apr;2017:1719.

Learning Subnetwork Biomarkers via Hypergraph for Classification of Autism Disease

C Zu 1,2, Y Gao 3, B Munsell 4, M Kim 1, Z Peng 5, Y Zhu 1, W Gao 6, D Zhang 2, D Shen 1, G Wu 1
PMCID: PMC5896768  NIHMSID: NIHMS954491  PMID: 29657557

Synopsis

Most brain network connectivity models consider only correlations between the discrete time-series signals of two brain regions. Here we propose a method to explore sub-network biomarkers that are significantly distinguishable between two clinical cohorts. We construct a hypergraph by exhaustively inspecting all possible sub-networks for all subjects. The objective function of hypergraph learning jointly optimizes the weights of all hyperedges. We apply our method to find high order childhood autism biomarkers from rs-fMRI images. A comprehensive evaluation of their discriminative power in the diagnosis of autism yields promising results.

Purpose

We propose a novel learning-based method to discover high order sub-network connectome biomarkers that can be used to distinguish two clinical cohorts.

Method

Figure 1 illustrates the intuition behind our proposed learning-based method. We assume there are three subjects in one cohort (top left in Figure 1) and two subjects in another cohort (bottom left in Figure 1). Only two possible subnetworks (triangle cliques in purple and red) are under investigation. Our method aims to find the biomarkers by inspecting how well each hyperedge separates subjects from the two groups. To that end, we first assume the label of each subject is not yet known. We then use a hypergraph learning technique to estimate the likelihood $f_n$ for each subject $v_n$, which is driven by (a) the minimization of discrepancies between the ground truth label vector $y$ and the estimated likelihood vector $f = [f_1, f_2, \ldots, f_N]^T$, and (b) the consistency of clinical labels within each hyperedge. The consistency requirement can be defined as:

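$$\Omega_f(W) = \sum_{\theta=1}^{\Theta} \sum_{n,n'=1}^{N} \frac{w_\theta\, h(n,\theta)\, h(n',\theta)}{\delta(\theta)} \left( \frac{f_n}{d(n)} - \frac{f_{n'}}{d(n')} \right)^2. \tag{1}$$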
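As a minimal sketch, Eq. (1) can be evaluated by direct summation over hyperedges and subject pairs. Several details here are assumptions, since the text does not define them: the function name `hypergraph_regularizer`, the list-of-lists incidence layout, the vertex degree d(n) = Σ_θ w_θ h(n,θ), and the hyperedge degree δ(θ) = Σ_n h(n,θ). These degree definitions are the ones commonly used in hypergraph learning. The likelihoods are divided by d(n) as the text writes it; many hypergraph-learning formulations divide by the square root of d(n) instead, so a hypothetical `use_sqrt` flag is provided. The five-subject, two-hyperedge toy data is illustrative only.

```python
import math

def hypergraph_regularizer(H, w, f, use_sqrt=False):
    """Evaluate the consistency term of Eq. (1) by direct summation.

    H[n][t] is 1 if subject n belongs to hyperedge t, else 0.
    w[t] is the weight of hyperedge t; f[n] is the likelihood of subject n.
    Assumes every subject has positive degree and no hyperedge is empty.
    """
    n_vertices, n_edges = len(H), len(w)
    # Assumed definition: vertex degree d(n) = sum_t w[t] * H[n][t]
    d = [sum(w[t] * H[n][t] for t in range(n_edges)) for n in range(n_vertices)]
    # Assumed definition: hyperedge degree delta(t) = sum_n H[n][t]
    delta = [sum(H[n][t] for n in range(n_vertices)) for t in range(n_edges)]
    # Normalized likelihoods: f_n / d(n), or f_n / sqrt(d(n)) if use_sqrt is set
    g = [f[n] / (math.sqrt(d[n]) if use_sqrt else d[n]) for n in range(n_vertices)]

    total = 0.0
    for t in range(n_edges):
        for a in range(n_vertices):
            for b in range(n_vertices):
                total += w[t] * H[a][t] * H[b][t] / delta[t] * (g[a] - g[b]) ** 2
    return total

# Toy data: subjects 0-2 in one cohort (+1), subjects 3-4 in the other (-1).
labels = [1.0, 1.0, 1.0, -1.0, -1.0]
weights = [1.0, 1.0]
# Each hyperedge groups subjects from a single cohort.
H_pure = [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1]]
# Hyperedge 0 mixes the cohorts (subjects 0, 1, 3); hyperedge 1 holds 2 and 4.
H_mixed = [[1, 0], [1, 0], [0, 1], [1, 0], [0, 1]]

penalty_pure = hypergraph_regularizer(H_pure, weights, labels)
penalty_mixed = hypergraph_regularizer(H_mixed, weights, labels)
```

In this toy case the label-consistent hyperedges incur no penalty, while the mixed hyperedges incur a positive one, which is the behavior the consistency requirement is meant to capture.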

Figure 1. The overview of our learning-based method to discover high order brain connectome patterns by hypergraph inference.

The regularization term $\Omega_f(W)$ penalizes the label discrepancy by encouraging the difference between the normalized likelihoods $f_n/d(n)$ and $f_{n'}/d(n')$ to be as small as possible if $v_n$ and $v_{n'}$ are in the same hyperedge $e_\theta$. Because $\Omega_f(W)$ is a function of both $W$ and $f$, optimizing $W$ reflects the quality of each hyperedge as a biomarker. To avoid overfitting, we apply a Frobenius norm penalty to the weighting matrix $W$. The objective function to look for high order connectome patterns is therefore:

$$\arg\min_{W,f}\; \Omega_f(W) + \lambda \|y - f\|_2^2 + \mu \|W\|_F^2. \tag{2}$$

where λ and μ are two scalars controlling the strength of data fitting term and Frobenius norm on the weighting matrix W, respectively.
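The text does not say how Eq. (2) is minimized. One common strategy for objectives of this shape is alternating minimization over f and W; the sketch below covers only the f-step for a fixed W and should not be read as the authors' solver. It assumes W is diagonal with the hyperedge weights w_θ on its diagonal, d(n) = Σ_θ w_θ h(n,θ), δ(θ) = Σ_n h(n,θ), and normalization by d(n) as the text writes it. Under those assumptions the consistency term equals 2 fᵀ M f with M = D_v⁻¹ (D_v − H W D_e⁻¹ Hᵀ) D_v⁻¹, so setting the gradient with respect to f to zero gives the linear system (2M + λI) f = λy. The function names and toy data are hypothetical.

```python
import numpy as np

def objective(H, w, f, y, lam, mu):
    """Value of Eq. (2) for an incidence matrix H (N x Theta) and weights w (Theta,)."""
    d = H @ w                        # assumed vertex degrees d(n)
    delta = H.sum(axis=0)            # assumed hyperedge degrees delta(theta)
    g = f / d                        # normalized likelihoods f_n / d(n)
    diff = g[:, None] - g[None, :]   # pairwise differences g_n - g_n'
    # pair_weight[n, n'] = sum_theta w_theta h(n,theta) h(n',theta) / delta(theta)
    pair_weight = (H * (w / delta)) @ H.T
    omega = np.sum(pair_weight * diff ** 2)
    # Frobenius norm of a diagonal W reduces to the sum of squared weights.
    return omega + lam * np.sum((y - f) ** 2) + mu * np.sum(w ** 2)

def update_f(H, w, y, lam):
    """Closed-form minimizer of Eq. (2) over f for fixed w: (2M + lam*I) f = lam*y."""
    d = H @ w
    delta = H.sum(axis=0)
    laplacian = np.diag(d) - (H * (w / delta)) @ H.T
    M = laplacian / np.outer(d, d)   # D_v^{-1} L D_v^{-1}
    return np.linalg.solve(2.0 * M + lam * np.eye(len(y)), lam * y)

# Toy data: hyperedge 0 holds subjects 0, 1, 3; hyperedge 1 holds subjects 2, 4.
H = np.array([[1, 0], [1, 0], [0, 1], [1, 0], [0, 1]], dtype=float)
y = np.array([1.0, 1.0, 1.0, -1.0, -1.0])
w = np.array([1.0, 2.0])
f_star = update_f(H, w, y, lam=1.0)
```

A larger λ pulls `f_star` toward the labels y, while a smaller λ lets the hyperedge structure smooth the likelihoods within each hyperedge.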

Results

We applied our learning-based method to find critical subnetworks based on 45 ASD and 47 typical control (TC) subjects from the NYU site of the Autism Brain Imaging Data Exchange (ABIDE) database [1]. Figure 2 shows the top 10 most critical subnetworks (white triangle cliques) out of 253,460 candidates between the ASD and TC cohorts. The color of each vertex indicates its functional role in the human brain. Two observations stand out: (a) most of the brain regions involved in the top 10 critical subnetworks are located in key areas related to ASD, such as the amygdala, middle temporal gyrus, and superior frontal gyrus; and (b) most of the selected subnetworks span both subcortical and cortical regions, which agrees with recent findings on autism pathology in the neuroscience community [2].

In the following experiments, we use the functional connectivity flows on the top 200 critical subnetworks as the feature representation (feature dimension: 200×3) to classify ASD and TC subjects. A traditional Support Vector Machine (SVM) [3] is trained directly on the concatenated feature vector, denoted as Subnetwork-SVM. Since the functional connectivity flows come from individual subnetworks, it is straightforward to organize them into a tensor representation and use the Support Tensor Machine (STM) [4] to take advantage of the structured feature representation, denoted as Subnetwork-STM. To demonstrate the advantage of subnetworks over the conventional region-to-region connections in brain networks, we compare against two counterpart methods: Link-SVM (uses the Pearson's correlation on each link as the feature) and Toplink-SVM (selects the top 600 links by t-test and uses the Pearson's correlations on the selected links to form the feature vector).

As the classification performance plots and ROC curves in Figure 3 show, the classifiers trained on connectome features from our learned subnetworks achieve much higher classification performance than those trained with the same classification tool on connectome features from conventional region-to-region links. The substantial improvement of Subnetwork-STM over Subnetwork-SVM also indicates the benefit of a structured data representation in classification, since the learned subnetworks carry such high order information.
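The candidate count and the 200×3 feature layout can be illustrated with a short sketch. Two points are assumptions rather than statements from the text: 253,460 equals the number of distinct triangles on 116 regions, which suggests (but does not confirm) a 116-region parcellation; and the three values per subnetwork are taken here to be the pairwise Pearson's correlations on the triangle's three edges, since the text does not define "functional connectivity flow". The helper names and the toy time series are hypothetical.

```python
from itertools import combinations
from math import comb, sqrt

def pearson(x, y):
    """Pearson's correlation of two equal-length sequences with nonzero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def triangle_features(ts, triangles):
    """Return one row of three edge correlations per triangle (a, b, c).

    ts[r] is the time series of region r. Using the three edge correlations
    as the per-subnetwork feature is an assumption, not the authors' definition.
    """
    return [
        [pearson(ts[a], ts[b]), pearson(ts[b], ts[c]), pearson(ts[a], ts[c])]
        for a, b, c in triangles
    ]

# Assumed region count: the number of triangles on 116 regions matches the
# 253,460 candidates reported in the text.
N_REGIONS = 116
n_candidates = comb(N_REGIONS, 3)

# Toy example with four regions and five time points per region.
toy_ts = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],
    [5.0, 3.0, 4.0, 1.0, 2.0],
    [1.0, 0.0, 1.0, 0.0, 1.0],
]
toy_triangles = list(combinations(range(len(toy_ts)), 3))
toy_features = triangle_features(toy_ts, toy_triangles)
```

Keeping one row per triangle preserves the subnetwork structure, so the same array can be flattened into a vector for an SVM or kept as a matrix for a tensor-based classifier.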

Figure 2. The top 10 selected subnetworks (white triangle cliques), in which the functional connectivity flow differs significantly between the ASD and TC cohorts.

Figure 3. Classification performance of four different classification methods.

Discussion and conclusion

In this paper, we propose a novel learning method to discover high order brain connectome biomarkers that go beyond the region-to-region connections widely used in conventional brain network analysis. A hypergraph technique is introduced to model complex subject-wise relationships in terms of various subnetworks and to quantify the significance of each subnetwork based on its discriminative power across clinical groups and its consistency within each cohort. We apply our learning-based method to find the subnetwork biomarkers between ASD and TC subjects. The learned top subnetworks are not only consistent with recent clinical findings, but also significantly improve accuracy in identifying ASD subjects, supporting their potential use and impact in neuroscience studies and clinical practice.

References

1. Di Martino A, Yan CG, Li Q, Denio E, Castellanos FX, Alaerts K, Anderson JS, Assaf M, Bookheimer SY, Dapretto M, et al. The autism brain imaging data exchange: towards a large-scale evaluation of the intrinsic brain architecture in autism. Molecular Psychiatry. 2014;19(6):659–667. doi: 10.1038/mp.2013.78.
2. Minshew NJ, Williams DL. The new neurobiology of autism: cortex, connectivity, and neuronal organization. Archives of Neurology. 2007;64(7):945–950. doi: 10.1001/archneur.64.7.945.
3. Cortes C, Vapnik V. Support-vector networks. Machine Learning. 1995;20(3):273–297.
4. Tao D, Li X, Wu X, Hu W, Maybank SJ. Supervised tensor learning. Knowledge and Information Systems. 2007;13(1):1–42.
