Author manuscript; available in PMC: 2025 Jun 4.
Published in final edited form as: IEEE Trans Neural Netw Learn Syst. 2024 Jun 4;35(6):7363–7375. doi: 10.1109/TNNLS.2022.3220220

Contrastive Brain Network Learning via Hierarchical Signed Graph Pooling Model

Haoteng Tang 1, Guixiang Ma 2, Lei Guo 3, Xiyao Fu 4, Heng Huang 5, Liang Zhan 6
PMCID: PMC10183052  NIHMSID: NIHMS1864925  PMID: 36374890

Abstract

Recently, brain networks have been widely adopted to study brain dynamics, brain development, and brain diseases. Graph representation learning techniques on brain functional networks can facilitate the discovery of novel biomarkers for clinical phenotypes and neurodegenerative diseases. However, current graph learning techniques have several issues in brain network mining. First, most current graph learning models are designed for unsigned graphs, which hinders the analysis of many signed network data (e.g., brain functional networks). Meanwhile, the insufficiency of brain network data limits model performance on clinical phenotype prediction. Moreover, few current graph learning models are interpretable, which limits their ability to provide biological insights for model outcomes. Here, we propose an interpretable hierarchical signed graph representation learning (HSGRL) model to extract graph-level representations from brain functional networks, which can be used for different prediction tasks. To further improve model performance, we also propose a new strategy to augment functional brain network data for contrastive learning. We evaluate this framework on different classification and regression tasks using data from the Human Connectome Project (HCP) and the Open Access Series of Imaging Studies (OASIS). Our results from extensive experiments demonstrate the superiority of the proposed model compared with several state-of-the-art techniques. In addition, we use graph saliency maps derived from these prediction tasks to demonstrate the detection and interpretation of phenotypic biomarkers.

Keywords: Brain functional networks, contrastive learning, data augmentation, hierarchical graph pooling (HGP), interpretability, signed graph learning

I. Introduction

Understanding brain organization and its relationship with phenotypes (e.g., clinical outcomes, behavioral, or demographic variables) is of prime importance in modern neuroscience. One important research direction is to use noninvasive neuroimaging data (e.g., functional magnetic resonance imaging or fMRI) to identify potential imaging biomarkers for clinical purposes. Most previous studies focus on voxelwise and region-of-interest (ROI) imaging features [1], [2], [3]. However, evidence shows that the brain is a complex system whose function relies on a diverse set of interactions among brain regions, and these brain functions further determine human clinical or behavioral phenotypes [4], [5], [6], [7], [8], [9], [10], [11], [12], [13]. Therefore, more and more studies have been conducted to predict those phenotypes using the brain network as a delegate of the interactions among brain regions [14], [15], [16]. In addition, compared with traditional neuroimaging features, brain networks have more potential to yield interpretable, system-level insights into phenotype-induced brain dynamics [17]. A brain network is a 3-D graph model of the brain, where graph nodes represent the attributes of brain regions and graph edges represent the connections (or interactions) among these regions.

Many studies have analyzed brain networks based on graph theory; however, most of them focus on predefined network features, such as the clustering coefficient and small-worldness [18], [19], [20], [21], [22]. This may be suboptimal since these predefined features may not capture the characteristics of the whole brain network, yet the whole brain network is difficult to analyze directly due to its high dimensionality. To tackle this issue, the graph neural network (GNN), as an embedding technique, has gained increasing attention in recent years for exploring the biological characteristics of brain network–phenotype associations [23], [24], [25]. GNNs are a class of deep neural networks that can embed high-dimensional graph topological structures together with graph node features into a low-dimensional latent space based on a message passing mechanism [26], [27], [28]. A few studies proposed different GNNs to embed the nodes in brain networks and applied a global readout operation (e.g., global mean or sum) to summarize all the latent node features as the whole brain network representation for downstream tasks (e.g., behavioral score regression, clinical disease classification) [4], [24], [25], [29]. However, the message passing of GNNs is inherently "flat": it only propagates information across graph edges and cannot capture the hierarchical structures rooted in graphs, which are crucial in brain functional organization [30], [31], [32], [33]. To address this issue, many recent studies introduce hierarchical GNNs, including node embedding and hierarchical graph pooling (HGP) strategies, to embed the whole brain network in a hierarchical manner [30], [34], [35], [36], [37].

Although GNNs have achieved great progress in brain network mining, several issues remain. First, most existing GNNs are designed for unsigned graphs, in which all graph nodes are connected via nonnegative edges [i.e., edge weights are in the range [0, ∞)]. However, signed graphs are very common in brain research (e.g., fMRI-derived brain networks or brain functional networks), which creates a demand for signed graph embedding models. To tackle this issue, a few recent studies proposed signed graph embedding models based on the balance theory [38], [39], [40], [41]. The balance theory, motivated by human attitudes in social networks, describes node relationships in signed graphs: nodes connected by positive edges are considered "friends," while those connected by negative edges are considered "opponents." In the realm of brain functional networks, a positive edge means coactivation and a negative edge indicates antiactivation between the connected nodes. Meanwhile, the balance theory defines four higher order relationships among graph nodes: 1) the "friend" of a "friend" is a "friend;" 2) the "opponent" of a "friend" is an "opponent;" 3) the "friend" of an "opponent" is an "opponent;" and 4) the "opponent" of an "opponent" is a "friend." These definitions accord with the nodal relationships in the functional brain network, which indicates that the balance theory is applicable to brain functional network embedding. In this study, we adopt the balance theory to coembed the positive and negative edges as well as local brain nodes. Therefore, the generated latent node features include balanced and unbalanced feature components. Beyond focusing on local structures, we also consider the hierarchical structure in graphs as one of the global graph features.
As suggested by the literature [30], [42], [43], [44], the graph hierarchical structure facilitates yielding whole-graph representations and enables graph-level tasks (e.g., clinical disease classification based on whole brain networks). In particular, we propose a new hierarchical pooling module for signed graphs based on information theory and extend current signed graph methods from local embedding to global embedding.

The second issue is that most current GNNs in brain network studies are not interpretable and thus incapable of providing biological explanations or heuristic insights for model outcomes. This is mainly due to the black-box nature of neural networks. To address this issue, we propose a signed graph learning model with an interpretable graph pooling module. Previous studies indicated that brain networks are hierarchically organized, with some regions acting as neuro-information hubs and others as peripheral regions [45], [46], [47], [48]. In our graph pooling module, we compute an information score (IS) to measure the information gain for each brain node and choose the top-K nodes with the highest information gains as information hubs. The information of the other, peripheral brain nodes is then aggregated onto these hubs. Hence, the proposed pooling module can be interpreted as a brain information hub generator. By design, the outcome of this pooling module is a subgraph of the original brain network without creating any new nodes. Therefore, the yielded subgraph nodes can be regarded as potential biomarkers that provide heuristic biological explanations for the tasks.

To further boost the proposed model's performance on prediction tasks, we introduce graph contrastive learning into our proposed hierarchical signed graph representation learning (HSGRL) model. A data augmentation strategy that generates contrastive brain functional network samples is necessary to achieve graph contrastive learning. Data augmentation for contrastive learning aims at creating reasonable data samples, by applying certain transformations, that are similar to the original data samples. For example, image rotation and cropping are common transformations to generate new samples in image classification tasks [49], [50], [51], [52], [53]. For graph-structured data, a few studies proposed to use graph perturbations (i.e., adding/dropping graph nodes, manipulating graph edges) and graph view augmentation (e.g., graph diffusion) to generate contrastive graph samples from different views [54], [55], [56], [57], [58]. These strategies, although boosting model performance on large-scale benchmark datasets (e.g., CORA, CITESEER), may not be suitable for generating contrastive brain network samples. On one hand, each node in a brain network represents a defined brain region with specific brain activity information, so brain nodes cannot be arbitrarily removed or added. On the other hand, add/drop operations on the brain network may lead to unexpected model outcomes that are difficult to explain and understand from a biological view. Motivated by [59], [60], we generate contrastive brain functional network samples directly from the fMRI blood-oxygen-level-dependent (BOLD) signals, so that the generated contrastive samples are similar to the original ones and the internal biological structure is maintained. Our main contributions are summarized as follows.

  1. We propose an HSGRL model to embed the brain functional networks and we apply the proposed model on multiple phenotype prediction tasks.

  2. We propose a contrastive learning architecture with our proposed HSGRL model to boost the model performance on several prediction tasks. A graph augmentation strategy is proposed to generate contrastive samples for the fMRI-derived brain network data.

  3. The proposed HSGRL model is interpretable, which yields heuristic biological explanations.

  4. Extensive experiments are conducted to demonstrate the superiority of our method. Moreover, we draw graph saliency maps for clinical tasks, to enable interpretable identifications of phenotype biomarkers.

II. Related Works

A. GNNs and Brain Network Embedding

GNNs are generalized deep learning architectures broadly used for graph representation learning in many fields (e.g., social network mining [61], [62], molecule studies [63], [64], and brain network analysis [65]). Most existing GNN models (e.g., graph convolutional network (GCN) [26], GAT [27], GraphSage [66]) focus on node-level representation learning and only propagate information across graph edges in a flat way. When deploying these models on graph-level tasks (e.g., graph classification, graph similarity learning [42], [43], [44], [67]), the whole graph representations are obtained by a naive global readout operation (e.g., summing or averaging all the node feature vectors). However, this may lead to poor performance and low efficiency in graph-level tasks since the hierarchical structure, an important property of graphs, is ignored in these models. To explore and capture hierarchical structures in graphs, a few HGP strategies have been proposed to learn representations for the whole graph in a hierarchical manner [30], [34], [35], [68], [69]. The traditional methods to extract brain network patterns are based on graph theory [18], [19], [20], [21], [22], [70] or geometric network optimization [71], [72], [73], [74]. A few recent studies [24], [25], [75] introduce GNNs to discover brain patterns for phenotype prediction. However, hierarchical structures in the brain networks are not considered in these models, which limits model performance. Recently, a few hierarchical brain network embedding models have been proposed [36], [65], [76].

However, all the aforementioned GNNs are designed for unsigned graph representation learning. A few recent studies have been proposed to handle signed graphs; however, they only consider node-level representation learning [39], [41], [77], [78]. In this work, we design a signed graph hierarchical pooling strategy to extract graph-level representations from brain functional networks.

B. Interpretable Graph Learning Model

Generally, the mechanism by which GNNs embed graph nodes can be explained as a message passing process, which includes message aggregation from neighbor nodes and (nonlinear) message transformations [28], [36], [79]. However, most current hierarchical pooling strategies are not interpretable [30], [34], [35]. A few recent studies have proposed interpretable graph pooling strategies to make the pooling module intelligible to model users. Most of these pooling strategies downsample graphs based on network communities, which are one of the important hierarchical structures that can be interpreted [36], [37], [80], [81]. For example, [36] proposed an HGP neural network relying on the brain network community structure to yield interpretable biomarkers. The hierarchical pooling strategy proposed in this work instead relies on the network information hub, another important hierarchical structure in brain networks.

C. Data Augmentation for Graph Contrastive Learning

Most current graph contrastive learning methods augment graph contrastive samples by manipulating graph topological structures. For example, [55], [56] generate contrastive graph samples by dropping nodes and perturbing edges. Other studies generate contrastive samples by changing the graph's local receptive field, which is named graph view augmentation [54], [82]. In this work, we introduce graph contrastive learning into brain functional network analysis and generate contrastive samples from the fMRI BOLD signals.

III. Preliminaries of Brain Functional Networks

We denote a brain functional network with N nodes as G = {V, E} = (A, H). V is the graph node set, where each node v_i (i = 1, . . . , N) represents a brain region. E is the graph edge set, where each edge e_{i,j} describes the connection between nodes v_i and v_j. A ∈ ℝ^{N×N} is the graph adjacency matrix, where each element a_{i,j} ∈ A is the weight of edge e_{i,j}. H ∈ ℝ^{N×C} is the node feature matrix, where H_i (the ith row of H) is the feature vector of v_i. Let B ∈ ℝ^{N×D} be the fMRI BOLD signal matrix, where D is the signal length. Generally, the edge weights of a brain functional network are computed from the fMRI BOLD signals by a_{i,j} = corr(b_i, b_j), where b_i (the ith row of B) is the BOLD signal of v_i and corr(·) is the correlation coefficient operator. Note that a_{i,j} can be either positive or negative, so the brain functional network is a signed graph. For each subject, we use ˆ and ˇ to denote a contrastive pair of functional brain network samples [i.e., Ĝ = (Â, Ĥ) and Ǧ = (Ǎ, Ȟ)].
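
As a concrete illustration of this construction, the signed adjacency matrix can be computed directly from the BOLD matrix B via Pearson correlation. This is a minimal sketch; the function name `build_adjacency` and the zeroed diagonal (no self-loops) are our choices, not specified by the paper.

```python
import numpy as np

def build_adjacency(B: np.ndarray) -> np.ndarray:
    """B is the N x D BOLD signal matrix; returns the signed N x N adjacency
    matrix A, where a_{i,j} = corr(b_i, b_j) lies in [-1, 1]."""
    A = np.corrcoef(B)          # row-wise Pearson correlation
    np.fill_diagonal(A, 0.0)    # our choice: drop self-loops
    return A
```

Because Pearson correlation can be negative, the resulting adjacency matrix is that of a signed graph, as stated above.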

IV. Methodology

In this section, we first propose a data augmentation strategy to generate contrastive samples for the brain functional networks. Second, we introduce our proposed HSGRL model with node embedding and HGP modules. Finally, we deploy the contrastive learning framework on our proposed HSGRL model to yield the representations for the whole graph, which can be applied to downstream prediction tasks.

A. Contrastive Samples of Brain Functional Networks

The generation of contrastive samples aims at creating reasonable and similar functional brain network pairs by applying certain transformations. Here, we propose a new strategy to generate brain functional network contrastive samples from the fMRI BOLD signals. For each node v_i, we generate two sub-BOLD signals (b̂_i and b̌_i) by manipulating its original BOLD signal b_i. Specifically, we use a window (size = d) to clamp b_i from the signal head and tail, respectively,

b̂_i = b_i[d + 1, d + 2, . . . , D],   b̌_i = b_i[1, 2, . . . , D − d]. (1)

Obviously, b_i ∈ ℝ^{1×D}, while b̂_i, b̌_i ∈ ℝ^{1×(D−d)}. To keep Ĝ and Ǧ similar, we set the window size d ≪ D. After generating a pair of sub-BOLD signals, we compute the edge weights of the pairwise contrastive brain functional network samples by

â_{i,j} = corr(b̂_i, b̂_j),   ǎ_{i,j} = corr(b̌_i, b̌_j) (2)

where â_{i,j} ∈ Â and ǎ_{i,j} ∈ Ǎ are the weights of e_{i,j} in the two contrastive samples. We do not consider contrastive node features in this work; therefore, Ĥ = Ȟ = H. The generated contrastive sample pairs thus share the same node features and differ only slightly in edge weights. We will show this similarity in Section V-C.
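
The head/tail clamping of (1) and the recomputed correlations of (2) can be sketched as follows. The function name `contrastive_pair` is ours; the paper does not prescribe an implementation.

```python
import numpy as np

def contrastive_pair(B: np.ndarray, d: int = 10):
    """Clamp the length-D BOLD signals from the head and tail by a window of
    size d (Eq. (1)), then recompute the correlation matrices (Eq. (2))."""
    D = B.shape[1]
    B_hat = B[:, d:]          # drop the first d time points
    B_check = B[:, :D - d]    # drop the last d time points

    def corr(X):
        A = np.corrcoef(X)
        np.fill_diagonal(A, 0.0)  # our choice: no self-loops
        return A

    return corr(B_hat), corr(B_check)
```

With d ≪ D the two sub-signals overlap almost entirely, which is why the resulting pair of adjacency matrices stays similar (Section V-C).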

B. HSGRL Model

We present our HSGRL model in Fig. 1. The HSGRL model includes the balanced and unbalanced embedding (BUE) module and the HGP module.

Fig. 1.

Diagram of the proposed contrastive graph learning framework (bottom black box) with the HSGRL model (top black box) for functional brain network embedding and downstream tasks (i.e., phenotype classification or regression). The HSGRL model consists of cascaded balanced and unbalanced embedding (BUE) and HGP modules that extract graph-level representations of contrastive brain functional network pairs (i.e., X̂_G and X̌_G) in a hierarchical manner. X̂_G and X̌_G are used to build the contrastive loss for graph contrastive learning. Meanwhile, a concatenation operation generates the fused graph feature X_G = [X̂_G ∥ X̌_G], which is used for the downstream prediction tasks (i.e., graph classification and regression).

1). BUE Module:

The balance theory is broadly used to analyze node relationships in signed graphs. The theory states that, given a node v_i in a signed graph, any other node v_j can be assigned to either the balanced node set or the unbalanced node set of v_i with respect to a path between v_i and v_j. Specifically, if the number of negative edges in the path between v_i and v_j is even, then v_j belongs to the balanced set of v_i; otherwise, v_j belongs to the unbalanced set of v_i. The balance theory indicates the following.

  1. Each graph node, vj, can belong to either the balanced or unbalanced node set of a given target node vi.

  2. The path between vi and vj determines the balance attribute of vj.
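
The even/odd rule above amounts to a parity check on the edge signs along the path, which can be sketched mechanically (an illustrative example; `is_balanced` is our name, not the paper's):

```python
def is_balanced(edge_signs) -> bool:
    """A node v_j is balanced w.r.t. v_i iff the path between them contains an
    even number of negative edges, i.e., the product of edge signs is positive."""
    prod = 1
    for s in edge_signs:
        prod *= 1 if s > 0 else -1
    return prod > 0
```

The four higher order relationships listed earlier follow directly: a two-edge path with signs (+, +) or (−, −) is balanced ("friend"), while (+, −) or (−, +) is unbalanced ("opponent").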

Motivated by this, we adopt the idea of the signed graph attention networks from [41] to embed brain functional network nodes to generate latent node features with the balanced and unbalanced components

X_B, X_U = F_sign(A, H) (3)

where F_sign(·) is the signed graph attention encoder [41], and X_B and X_U are the balanced and unbalanced components of the latent node features, respectively. We fuse the two feature components into the latent node features by

X = [X_B ∥ X_U] (4)

where [·∥·] denotes the concatenation operation.

2). Hierarchical Signed Graph Pooling:

As shown in Fig. 1, the proposed HGP module consists of four steps: 1) IS computation; 2) top-K informative hub selection; 3) feature aggregation; and 4) graph pooling.

a). IS computation:

The IS of each node also contains balanced and unbalanced components, which measure the information quantity that each node gains from its balanced node set and unbalanced node set, respectively. We first split the signed graph (with adjacency matrix A) into a positive subgraph (with adjacency matrix A⁺) and a negative one (with adjacency matrix A⁻). Then we apply Laplacian normalization to these two adjacency matrices

Ā⁺ = D⁺^(−1/2) A⁺ D⁺^(−1/2),   Ā⁻ = D⁻^(−1/2) |A⁻| D⁻^(−1/2) (5)

where Ā⁺ and Ā⁻ are the normalized adjacency matrices, and D⁺ and D⁻ are the degree matrices of A⁺ and |A⁻|, respectively. Note that the ith row of Ā, denoted by Ā_{i,:}, represents the connectivity probability distribution between v_i and the other nodes. For each node v_i, we define the balanced and unbalanced components of the IS by

IS_i^B = ‖Ā⁺_{i,:}ᵀ ⊗ X_B‖_{L̃1} + ‖Ā⁻_{i,:}ᵀ ⊗ X_U‖_{L̃1},   IS_i^U = ‖Ā⁺_{i,:}ᵀ ⊗ X_U‖_{L̃1} + ‖Ā⁻_{i,:}ᵀ ⊗ X_B‖_{L̃1} (6)

where ‖·‖_{L̃1} is the line-wise L1 norm, ⊗ is the scalar multiplication between each line of the two matrices, and ᵀ represents the vector transpose. Then the IS of v_i is obtained by

IS_i = IS_i^B + IS_i^U. (7)
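
Under our reading of (5)–(7), the IS computation can be sketched as follows. The exact form of the line-wise L1 norm is our interpretation (weight the latent features by normalized connectivity, then sum absolute values per row), so this is a hedged sketch rather than the authors' implementation.

```python
import numpy as np

def information_scores(A, X_B, X_U, eps=1e-8):
    """Sketch of Eqs. (5)-(7): split the signed adjacency matrix A into
    positive/negative parts, Laplacian-normalize each, and combine with the
    balanced (X_B) and unbalanced (X_U) latent feature components."""
    A_pos = np.maximum(A, 0.0)             # positive subgraph A+
    A_neg = np.abs(np.minimum(A, 0.0))     # |A-| for the negative subgraph

    def lap_norm(M):
        # Symmetric normalization D^{-1/2} M D^{-1/2}; eps guards empty rows.
        d = M.sum(axis=1) + eps
        D_inv = np.diag(d ** -0.5)
        return D_inv @ M @ D_inv

    Ab_pos, Ab_neg = lap_norm(A_pos), lap_norm(A_neg)   # Eq. (5)
    # Our reading of Eq. (6): row-wise L1 norm of connectivity-weighted features.
    IS_B = np.abs(Ab_pos @ X_B).sum(axis=1) + np.abs(Ab_neg @ X_U).sum(axis=1)
    IS_U = np.abs(Ab_pos @ X_U).sum(axis=1) + np.abs(Ab_neg @ X_B).sum(axis=1)
    return IS_B + IS_U                                  # Eq. (7)
```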
b). Top-K node selection and feature aggregation:

After obtaining the IS for each brain node, we rank the ISs and select the K brain nodes with the top-K IS values as informative network hubs. For the remaining nodes, we aggregate their features onto the selected K network hubs based on feature attention. Specifically, the feature attention between v_i and v_j is computed by x_iᵀ x_j. The feature of each unselected node is added, weighted by the attention value, to the hub with which it has the largest attention value.

c). Graph pooling:

After feature aggregation, we down-scale the graph by removing all the unselected nodes. In other words, only the selected top-K network hubs and the edges among them are preserved after graph pooling. Since the functional brain network is a fully connected graph, no isolated nodes exist in the down-scaled graph.
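
Steps 2)–4) of the pooling module can be sketched in NumPy as follows (a minimal illustration; `hgp_pool` and the exact aggregation details are our assumptions about the procedure described above):

```python
import numpy as np

def hgp_pool(A, X, IS, ratio=0.5):
    """Keep the top-K nodes by IS as hubs, aggregate each dropped node's
    feature onto its most-attended hub (attention = x_i . x_j, used as the
    weight), then restrict A to the hub subgraph."""
    N = len(IS)
    K = max(1, int(N * ratio))
    hubs = np.argsort(IS)[::-1][:K]          # top-K informative hubs
    rest = np.setdiff1d(np.arange(N), hubs)  # peripheral nodes to aggregate

    X_out = X[hubs].copy()
    for i in rest:
        att = X[i] @ X[hubs].T               # feature attention to each hub
        j = int(np.argmax(att))              # hub with the largest attention
        X_out[j] = X_out[j] + att[j] * X[i]  # weighted add onto that hub

    return A[np.ix_(hubs, hubs)], X_out, hubs
```

Because the output is a node subset of the input graph (no new nodes are created), the retained hubs can be traced back to anatomical regions, which is the basis of the interpretability claim.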

C. Contrastive Learning Framework With BUE and HGP

The contrastive learning framework with HSGRL is presented in Fig. 1. When we forward a pair of contrastive graph samples through the proposed HSGRL model, we obtain two latent node feature matrices, X̂ and X̌, after the last pooling module. We first generate the graph-level representations of the two functional brain networks from the latent node features by a readout operator

X̂_G = Σ_{i=1}^{N′} x̂_i,   X̌_G = Σ_{i=1}^{N′} x̌_i (8)

where x̂_i and x̌_i are the ith rows of X̂ and X̌, respectively, and N′ (< N) is the number of nodes in the down-scaled graph generated by the last pooling module.

1). Contrastive Loss:

The normalized temperature-scaled cross-entropy loss [83], [84], [85] is used to construct the contrastive loss. In the training stage, we randomly sample M pairs from the generated contrastive graph samples as a mini-batch and forward them through the proposed HSGRL model to generate contrastive graph representation pairs (i.e., X̂_G and X̌_G). We use m ∈ {1, . . . , M} to denote the identity number (ID) of a sample pair. The contrastive loss of the mth sample pair is formulated as

ℓ_m = −log [ exp(Φ(X̂_G^m, X̌_G^m)/α) / Σ_{t=1, t≠m}^{M} exp(Φ(X̂_G^m, X̌_G^t)/α) ] (9)

where α is the temperature parameter. Φ(·) denotes a similarity function that

Φ(X̂_G^m, X̌_G^m) = (X̂_G^m)ᵀ X̌_G^m / (‖X̂_G^m‖ ‖X̌_G^m‖). (10)

The batch contrastive loss can be computed by

ℒ_contrastive = (1/M) Σ_{m=1}^{M} ℓ_m. (11)
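
A direct transcription of (9)–(11) as an illustrative sketch (`nt_xent` is our name; note that, as in (9), the denominator runs over the negatives t ≠ m only):

```python
import numpy as np

def nt_xent(Xg_hat, Xg_check, alpha=0.2):
    """Batch contrastive loss of Eqs. (9)-(11) for M graph-representation
    pairs (rows of Xg_hat / Xg_check), with temperature alpha."""
    def cos(u, v):  # cosine similarity Phi, Eq. (10)
        return (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8)

    M = len(Xg_hat)
    losses = []
    for m in range(M):
        pos = np.exp(cos(Xg_hat[m], Xg_check[m]) / alpha)
        neg = sum(np.exp(cos(Xg_hat[m], Xg_check[t]) / alpha)
                  for t in range(M) if t != m)   # negatives t != m, Eq. (9)
        losses.append(-np.log(pos / neg))
    return float(np.mean(losses))                # Eq. (11)
```

Minimizing this loss pulls the two views of the same subject together while pushing apart representations of different subjects in the batch.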

2). Downstream Task and Loss Functions:

We use a multilayer perceptron (MLP) to generate the framework's prediction for both the classification and regression tasks. Specifically, the prediction is generated by Y_pred = MLP([X̂_G ∥ X̌_G]). We use the negative log-likelihood loss (NLLLoss) and the L1 loss as the supervised loss functions (ℒ_supervised) of the classification and regression tasks, respectively. The whole framework can be trained in an end-to-end manner by optimizing

ℒ = η₁ ℒ_supervised + η₂ ℒ_contrastive (12)

where η1 and η2 are the loss weights.

V. Experiments

A. Datasets and Data Preprocessing

Two publicly available datasets were used to evaluate our framework. The first includes 1206 young healthy subjects (mean age 28.19 ± 7.15, 657 women) from the Human Connectome Project (HCP) [86]. The second includes 1326 subjects (mean age 70.42 ± 8.95, 738 women) from the Open Access Series of Imaging Studies (OASIS) dataset [87]. Details of each dataset can be found on their official websites.1,2 CONN [88] was used to preprocess the fMRI data, and the preprocessing pipeline follows our previous publications [89], [90]. For HCP data, each subject's network has a dimension of 82 × 82 based on 82 ROIs defined using FreeSurfer (V6.0) [91]. For OASIS data, each subject's network has a dimension of 132 × 132 based on the Harvard–Oxford Atlas and the automated anatomical labeling (AAL) Atlas. We deliberately chose different network resolutions for HCP and OASIS to evaluate whether the performance of our new framework is affected by the network dimension or atlas.

B. Implementation Details

We randomly split the entire functional brain network dataset into five disjoint subsets for fivefold cross-validation in our experiments. The values in the adjacency matrices Â and Ǎ of the brain functional networks are within the range [−1, 1]. We compute the kurtosis and skewness values of the fMRI BOLD signals as the node feature matrices (H). We use the Adam optimizer [92] to optimize the loss functions in our model with a batch size of 128. The initial learning rate is 1e−4 and is decayed by (1 − current_epoch/max_epoch)^{0.9}. We also regularize training with an L2 weight decay of 1e−5. We set the maximum number of training epochs to 1000 and, following the strategy in [34] and [93], stop training if the validation loss does not decrease for 50 epochs. The experiments were deployed on one NVIDIA RTX A6000 graphics processing unit (GPU).
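
The polynomial learning-rate decay described above can be written as a one-liner (a sketch; `poly_lr` is our name for the schedule):

```python
def poly_lr(initial_lr: float, epoch: int, max_epoch: int, power: float = 0.9) -> float:
    """Polynomial decay used in Section V-B:
    lr = initial_lr * (1 - epoch / max_epoch) ** power."""
    return initial_lr * (1.0 - epoch / max_epoch) ** power
```

The rate starts at `initial_lr`, decreases monotonically, and reaches zero at `max_epoch`; early stopping on the validation loss usually terminates training before that point.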

C. Similarities of Contrastive Samples

We use the L1 distance and cosine similarity to measure the similarities of the adjacency matrices of contrastive brain networks. Here, we set the window size d = 10 to generate the contrastive adjacency matrices. The inner-pair similarity is computed by (1/M) Σ_{m=1}^{M} Ψ(Â_m, Ǎ_m), and the inter-pair similarity is computed by (1/M²) Σ_{m=1}^{M} Σ_{t=1}^{M} Ψ(Â_m, Ǎ_t), where Ψ(·) is the similarity function (i.e., L1 distance or cosine similarity). The inner-pair L1 distances on HCP and OASIS data are 0.1301 and 0.0915, respectively. The inner-pair cosine similarities on HCP and OASIS data are 0.9283 and 0.9466, respectively. The inter-pair L1 distances on HCP and OASIS data are 0.2925 and 0.3137, respectively. The inter-pair cosine similarities on HCP and OASIS data are 0.7311 and 0.7014, respectively. We visualize the averaged adjacency matrices on HCP and OASIS data in Fig. 2(a) and (b), respectively, to show their similarities. The original sample is generated using the whole fMRI BOLD signal (i.e., d = 0).
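
The inner-/inter-pair similarity computation can be sketched as follows (shown with cosine similarity as Ψ only; `pair_similarities` is our name, and flattening the matrices before comparison is our implementation choice):

```python
import numpy as np

def pair_similarities(A_hats, A_checks):
    """Inner-pair similarity: (1/M) sum_m Psi(A_hat_m, A_check_m).
    Inter-pair similarity: (1/M^2) sum_m sum_t Psi(A_hat_m, A_check_t).
    Psi here is cosine similarity over the flattened adjacency matrices."""
    def cos(a, b):
        a, b = a.ravel(), b.ravel()
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

    M = len(A_hats)
    inner = float(np.mean([cos(A_hats[m], A_checks[m]) for m in range(M)]))
    inter = float(np.mean([cos(A_hats[m], A_checks[t])
                           for m in range(M) for t in range(M)]))
    return inner, inter
```

A high inner-pair and lower inter-pair similarity, as reported above for HCP and OASIS, indicates that each contrastive pair is self-consistent while distinct pairs remain distinguishable.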

Fig. 2.

Visualization of the averaged adjacency matrices for the original and contrastive samples on (a) HCP dataset and (b) OASIS dataset. The averaged contrastive sample pair is generated using a window size d = 10.

D. Classification Tasks

1). Experiment Setup:

For comparison, we adopted seven baseline models: two traditional graph embedding models (tensor-based brain network embedding (t-BNE) [73] and multimodal CCA + joint ICA (mCCA-ICA) [74]), one basic GNN (GCN [26]), two deep graph representation learning models designed for brain network embedding (BrainCheby [25] and BrainNetCNN [24]), and two hierarchical GNNs with graph pooling strategies (hierarchical graph representation learning with differentiable pooling (DIFFPOOL) [30] and self-attention graph pooling (SAGPOOL) [34]). As aforementioned, the existing GNN-based models cannot directly take signed graphs as input; we therefore use the absolute values of the graph adjacency matrices as input for these baseline models, consistent with previous studies [36], [94]. Meanwhile, we compare our model with and without the contrastive loss to demonstrate the effectiveness of contrastive learning in boosting model performance. The results for gender and Alzheimer's disease (AD) classification are reported as accuracy, precision, and F1-score with their standard deviations (std). The results for zygosity classification (a three-class classification task with labels: not twins, monozygotic twins, and dizygotic twins) are reported as accuracy and macro-F1-score with their std. The number of cascaded BUE and HGP modules is set to three, and the number of top-K nodes in the pooling module is 50% of the number of nodes in the current graph. We search the loss weights η₁ and η₂ in the ranges [0.1, 1, 5] and [0.01, 0.1, 0.5, 1], respectively, and set η₁ = 1 and η₂ = 0.1. The temperature parameter in the contrastive loss is set to 0.2. Details of the hyperparameter analysis are given in Section V-F.

2). Results:

Table I shows the results of gender, zygosity, and AD classification. Our model achieves the best performance compared with all the baseline methods on the three tasks. For example, in gender classification, our model outperforms the baselines by at least 8.56%, 8.18%, and 8.91% in accuracy, precision, and F1-score, respectively. In general, the deep GNNs are superior to the traditional graph embedding methods (i.e., t-BNE and mCCA-ICA). When we remove the supervision of the contrastive loss, the performance, though still comparable to the baselines, decreases to some extent. This demonstrates the effectiveness of contrastive learning, which can substantially boost model performance.

TABLE I.

Classification Accuracy With std Values Under Fivefold Cross-Validation on the Gender, Zygosity, and AD Classification Tasks. The Values in Bold Show the Best Results

Method                | HCP Gender: Acc. / Pre. / F1.              | HCP Zygosity: Acc. / Macro-F1. | OASIS AD: Acc. / Pre. / F1.
t-BNE                 | 63.84(2.09) / 64.17(1.90) / 63.264(2.12)   | 37.19(2.65) / 39.67(3.04)      | 61.26(2.31) / 63.58(2.06) / 62.05(1.97)
mCCA-ICA              | 61.21(4.03) / 63.11(3.75) / 62.20(3.59)    | 35.51(4.64) / 38.71(3.34)      | 63.37(1.98) / 62.06(2.12) / 64.37(2.09)
GCN                   | 66.76(2.22) / 65.09(3.13) / 67.58(2.84)    | 46.66(2.14) / 47.21(2.51)      | 67.37(2.69) / 69.21(2.00) / 68.51(4.29)
SAGPOOL               | 68.12(3.07) / 69.96(2.48) / 67.51(2.65)    | 49.91(2.22) / 51.07(2.31)      | 67.23(2.15) / 68.83(1.13) / 67.51(2.51)
DIFFPOOL              | 72.06(2.28) / 74.05(1.90) / 73.07(2.42)    | 53.37(1.88) / 54.28(2.14)      | 72.79(1.66) / 71.55(2.15) / 70.83(2.01)
BrainCheby            | 75.08(1.98) / 76.14(2.38) / 74.09(1.84)    | 56.25(2.12) / 57.37(2.05)      | 72.55(2.45) / 73.36(1.88) / 72.62(1.33)
BrainNet-CNN          | 74.09(2.49) / 73.71(1.96) / 73.27(2.21)    | 54.03(2.20) / 55.25(2.46)      | 68.37(1.71) / 69.97(1.30) / 68.51(2.02)
Ours w/o Contrastive  | 78.86(2.18) / 80.06(1.33) / 77.52(1.69)    | 61.05(1.70) / 63.24(2.51)      | 76.26(2.32) / 75.42(1.62) / 76.80(1.72)
Ours                  | 81.51(1.14) / 82.37(1.95) / 80.69(2.03)    | 63.33(2.06) / 64.51(1.74)      | 77.51(1.84) / 78.83(1.78) / 78.28(1.95)

E. Regression Tasks

1). Experiment Setup:

In the regression tasks, we use the same baselines for comparison. The regression tasks include predicting the mini-mental state exam (MMSE) score on OASIS data, and the Flanker score, Card-Sort score, and three Achenbach adult self-report (ASR) scores (i.e., Aggressive, Intrusive, and Rule-Break) on HCP data. The MMSE [95], Flanker [96], and Wisconsin Card-Sort [97], [98], [99] tests are three neuropsychological tests designed to measure the status and risks of neurodegenerative disease and mental illness. The ASR is a life-function measure used to assess the emotion and social support of adults. The structure of the proposed model remains unchanged. The loss weights are set as η₁ = 0.5 and η₂ = 1. The regression results are reported as the average mean absolute error (MAE) with its std under fivefold cross-validation.

2). Results:

The regression results are presented in Table II. Our model achieves the best MAE values compared with all the baseline methods. As in the classification tasks, the deep GNNs are superior to the traditional graph embedding methods (i.e., t-BNE and mCCA-ICA). Comparing our method with and without the supervision of the contrastive loss, we again conclude that contrastive learning further boosts model performance.

TABLE II.

Regression MAE With Std Under Fivefold Cross-Validation. The Values in Bold Show the Best Results

Method OASIS HCP

MMSE Flanker Card-Sort Aggressive Intrusive Rule-Break

t-BNE 2.02(0.36) 1.69(0.19) 1.58(0.22) 1.89(0.10) 1.84(0.22) 1.77(0.41)
mCCA-ICA 2.68(0.19) 1.82(0.21) 1.67(0.17) 1.47(0.26) 1.97(0.13) 1.61(0.29)

GCN 2.05(0.07) 1.67(0.15) 1.46(0.11) 1.59(0.32) 1.66(0.24) 1.69(0.08)
SAGPOOL 1.84(0.33) 1.55(0.06) 1.44(0.13) 1.52(0.18) 1.50(0.24) 1.74(0.23)
DIFFPOOL 1.27(0.20) 1.34(0.14) 1.16(0.30) 1.27(0.41) 1.25(0.07) 1.43(0.15)
BrainCheby 1.51(0.67) 1.17(0.26) 1.24(0.31) 0.79(0.06) 1.09(0.21) 1.58(0.41)
BrainNetCNN 1.26(0.19) 1.43(0.24) 0.91(0.11) 1.33(0.23) 1.14(0.13) 1.29(0.19)

Ours w/o Contrastive 1.02(0.11) 0.89(0.13) 0.97(0.20) 0.74(0.17) 0.96(0.15) 1.15(0.11)
Ours 0.83(0.24) 0.66(0.17) 0.69(0.14) 0.45(0.12) 0.73(0.08) 1.02(0.16)

F. Ablation Studies

In this section, we investigate the effect of four hyperparameters on our model performance: 1) the window size (d) used to clamp the fMRI BOLD signals when generating contrastive functional brain network samples; 2) the temperature parameter (α) in the contrastive loss; 3) the number of BUE and HGP modules in the HSGRL model; and 4) the loss weights η1 and η2. First, we set the window size to [0, 5, 10, 20, 30, 40, 50], respectively, and generate different contrastive samples as the input of our proposed model. The first column in Fig. 3 shows the analysis of the window size, indicating that the best window size is around d = 10. When the window size decreases to 0, the model performance declines since the data are only duplicated without any substantially new samples. Interestingly, the performance at d = 0 is even worse than that of the variant without contrastive learning, which is trained on contrastive samples generated with d = 10 (see ours w/o contrastive in Tables I and II); the reason is that data augmentation is introduced in the latter case but not in the former. Second, we increase the temperature α from 0.1 to 1.0 in steps of 0.1. The second column in Fig. 3 shows the analysis of the temperature parameter; the best temperature value for our framework is α = 0.2. Moreover, we set the number of BUE and HGP modules to [1, 2, 3, 4, 5], respectively. The third column in Fig. 3 shows that the framework performance is consistent and stable across different numbers of BUE and HGP modules; the best number of modules for almost all the tasks is three, except for the regression tasks on Flanker and Aggressive. Finally, we present the loss weight analysis (see Fig. 4) on the three classification tasks; the best results are achieved when η1 = 1 and η2 = 0.1.
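As an illustration of the first two hyperparameters analyzed above, the sketch below shows one plausible reading of the window-size augmentation (cropping the BOLD series by up to d time points before recomputing the correlation network, so d = 0 reduces to a duplicate of the original sample) together with a standard NT-Xent-style contrastive loss with temperature α. The paper's exact augmentation and loss may differ; all function names here are hypothetical.

```python
import numpy as np

def augment_network(bold, d, rng):
    """Hypothetical window-size augmentation: crop the BOLD series by a
    random offset of up to d time points at each end, then rebuild the
    signed functional network as the Pearson correlation of the cropped
    signals. Assumes n_timepoints > 2*d. bold: (n_regions, n_timepoints)."""
    t = bold.shape[1]
    start = rng.integers(0, d + 1)   # left crop in [0, d]; 0 when d = 0
    end = t - rng.integers(0, d + 1) # right crop in [0, d]
    return np.corrcoef(bold[:, start:end])

def nt_xent(z1, z2, alpha=0.2):
    """Minimal NT-Xent-style loss with temperature alpha over two batches
    of paired embeddings (z1[i], z2[i] are positives)."""
    z = np.concatenate([z1, z2])                      # (2N, dim)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine similarity
    sim = z @ z.T / alpha
    np.fill_diagonal(sim, -np.inf)                    # drop self-similarity
    n = len(z1)
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    logprob = sim[np.arange(2 * n), pos] - np.log(np.exp(sim).sum(axis=1))
    return -logprob.mean()
```

A smaller α sharpens the softmax over negatives, which is consistent with the sensitivity the ablation reports around α = 0.2.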

Fig. 3.

Parameter analysis. The model performance obtained with: contrastive samples generated by different window sizes (Column 1), different temperature parameters in contrastive loss (Column 2), and different numbers of the BUE and HGP modules (Column 3). (a) Analysis on the classification tasks. (b) Analysis on the regression tasks.

Fig. 4.

Loss weights analysis on the classification tasks. (a) Analysis on gender classification. (b) Analysis on zygosity classification. (c) Analysis on AD classification. The red points represent best results, where η1 = 1 and η2 = 0.1.

G. Interpretation With Brain Saliency Map

Within our new graph pooling module, an IS is designed to measure the information gain of each brain node; only the top-K nodes with high information gains are preserved as brain information hubs, while the information of the remaining peripheral nodes is aggregated onto these hubs. Through the final pooling layer, these hubs serve as the representative of the whole brain network and are then linked to clinical phenotypes (e.g., clinical/behavioral scores or diagnoses). Therefore, they can provide hints for further clinical analyses of how a phenotype is associated with the brain functional network from a global view. We use the class activation mapping (CAM) approach [100], [101], [102] to generate the brain network saliency map, which indicates the top brain regions associated with each prediction task. Figs. 5 and 6 illustrate the brain saliency maps for the classification and regression tasks, respectively. For example, in the classification task [AD versus normal control (NC)], the saliency map for AD highlights multiple regions (such as the planum polare, frontal operculum cortex, and supracalcarine cortex) that are conventionally regarded as biomarkers of AD in medical imaging analysis [103], [104], [105], [106]. Meanwhile, the saliency map for NC highlights many regions in the cerebellum and frontal lobe. These regions control cognitive thinking, motor control, and social mentalizing as well as emotional self-experience [107], [108], [109], in which patients with AD typically show deficits. Another example is the classification of male versus female on the HCP data. Females are more "emotional" or "sensitive," suggested by regions such as the isthmuscingulate and caudalanteriorcingulate, while males tend to be more competitive and dominant, manifested in regions such as the lateralorbitofrontal and precuneus. These results are consistent with previous findings in the literature [110], [111], [112], [113].
The details of all the highlighted brain regions in each task are summarized in Table III for the OASIS dataset and in Table IV(a) and (b) for the HCP dataset. These highlighted regions can help us locate the brain regions associated with a given phenotype, providing clues for future clinical investigations.
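The hub-selection idea described above can be sketched roughly as follows. This is an illustrative top-K pooling under assumed semantics of the information score, not the paper's exact HGP operator; the aggregation rule and function names are hypothetical.

```python
import numpy as np

def topk_hub_pool(adj, feat, scores, k):
    """Keep the k nodes with the highest information scores as hubs and
    aggregate each peripheral node's features onto the hubs, weighted by
    the (absolute) signed edge strength. Returns the coarsened network,
    the pooled hub features, and the hub indices."""
    hubs = np.argsort(scores)[-k:]                      # top-k node indices
    periph = np.setdiff1d(np.arange(len(scores)), hubs)
    pooled = feat[hubs].copy()
    W = np.abs(adj[np.ix_(hubs, periph)])               # hub-peripheral weights
    pooled += W @ feat[periph]                          # aggregate onto hubs
    sub_adj = adj[np.ix_(hubs, hubs)]                   # hub-induced subgraph
    return sub_adj, pooled, hubs
```

The returned hub indices are what a saliency analysis would map back to named brain regions, as in Tables III and IV.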

Fig. 5.

Brain saliency maps for the classification tasks. Here we identify: 1) top 15 regions associated with AD and NC from OASIS and 2) top 10 regions associated with each sex and each zygosity from HCP.

Fig. 6.

Brain saliency maps for regression tasks. Here we identify: 1) top 15 regions associated with MMSE from OASIS and 2) top 10 regions associated with Flanker score, Card-Sort score, Aggressive score, Intrusive score, and Rule-Break score from HCP.

TABLE III.

List of Highlighted Brain Regions for the OASIS Dataset, Including AD and NC Classification Tasks and MMSE Regression Task

AD Planum Polare Left Frontal Operculum Cortex Left Supracalcarine Cortex Left Left-Caudate Supramarginal Gyrus, anterior division Right Superior Temporal Gyrus, anterior division Right Middle Temporal Gyrus, posterior division Left Superior Temporal Gyrus, posterior division Left
Heschl's Gyrus Left Intracalcarine Cortex Left Middle Frontal Gyrus Left Planum Polare Right Temporal Fusiform Cortex, anterior division Left Middle Temporal Gyrus, temporooccipital part Left Supracalcarine Cortex Right
NC Paracingulate Gyrus Right Intracalcarine Cortex Right Frontal Pole Right Cerebelum 6 Right Paracingulate Gyrus Left Left-Putamen Cerebelum 8 Left Cerebelum 7b Right
Heschl's Gyrus Left Cuneal Cortex Right Precuneous Cortex Cerebelum Crus2 Left Lateral Occipital Cortex, superior division Right Brain-Stem Cerebelum 8 Right
MMSE Right-Caudate Temporal Pole Right Planum Temporale Left Cerebelum Crus1 Right Middle Temporal Gyrus, posterior division Right Temporal Occipital Fusiform Cortex Left Temporal Occipital Fusiform Cortex Right Middle Temporal Gyrus, temporooccipital part Left
Planum Temporale Right Frontal Orbital Cortex Left Vermis 9 Temporal Pole Left Middle Temporal Gyrus, temporooccipital part Right Left-Caudate Temporal Pole Left

TABLE IV.

List of Highlighted Brain Regions for Classification and Regression Tasks on the HCP Dataset

(a)

Male Female Not Twins Monozygotic Dizygotic

ctx-lh-precuneus ctx-rh-superiorfrontal ctx-lh-lateraloccipital ctx-lh-isthmuscingulate ctx-lh-postcentral
ctx-rh-superiorparietal Right-Accumbens-area ctx-rh-bankssts ctx-rh-pericalcarine ctx-rh-transversetemporal
Right-Hippocampus ctx-rh-caudalmiddlefrontal ctx-lh-precentral ctx-rh-frontalpole ctx-rh-transversetemporal
ctx-rh-parahippocampal ctx-lh-parsorbitalis ctx-lh-parahippocampal ctx-lh-fusiform Paracingulate Gyrus Right
Right-Amygdala Right-Amygdala ctx-lh-entorhinal ctx-lh-entorhinal ctx-lh-caudalanteriorcingulate
ctx-lh-pericalcarine ctx-rh-paracentral Right-Pallidum ctx-lh-superiorfrontal ctx-rh-parsorbitalis
Right-Putamen
ctx-lh-transversetemporal ctx-lh-precentral ctx-lh-superiortemporal ctx-lh-temporalpole ctx-rh-precentral
ctx-rh-transversetemporal ctx-lh-isthmuscingulate ctx-rh-parsorbitalis ctx-lh-superiorparietal ctx-rh-caudalmiddlefrontal
ctx-rh-lateralorbitofrontal ctx-rh-isthmuscingulate ctx-lh-superiorfrontal Left-Pallidum ctx-lh-precuneus
ctx-lh-temporalpole ctx-lh-caudalanteriorcingulate ctx-rh-caudalmiddlefrontal ctx-rh-parsorbitalis ctx-lh-temporalpole

(b)

Flanker Card-Sort Aggressive Intrusive Rule-Break

Left-Accumbens-area Left-Accumbens-area ctx-lh-bankssts ctx-lh-bankssts ctx-lh-precuneus
ctx-lh-inferiortemporal ctx-lh-caudalmiddlefrontal ctx-lh-inferiortemporal ctx-lh-inferiortemporal ctx-lh-inferiortemporal
ctx-rh-insula ctx-rh-frontalpole ctx-lh-lateraloccipital ctx-lh-parahippocampal Right-Caudate
ctx-lh-middletemporal ctx-lh-rostralanteriorcingulate ctx-lh-precentral ctx-rh-supramarginal ctx-rh-lateraloccipital
ctx-lh-postcentral ctx-rh-middletemporal ctx-rh-frontalpole ctx-rh-paracentral ctx-lh-supramarginal
ctx-lh-temporalpole ctx-lh-frontalpole ctx-rh-parsorbitalis ctx-rh-parstriangularis ctx-rh-insula
ctx-rh-superiortemporal ctx-rh-precentral ctx-rh-parstriangularis ctx-lh-caudalanteriorcingulate ctx-rh-parstriangularis
ctx-lh-frontalpole ctx-rh-caudalmiddlefrontal ctx-lh-middletemporal ctx-lh-precentral ctx-lh-lingual
ctx-rh-precentral ctx-rh-precuneus ctx-rh-entorhinal ctx-rh-caudalmiddlefrontal ctx-rh-temporalpole
ctx-rh-fusiform Left-Putamen ctx-rh-temporalpole ctx-lh-parsorbitalis Right-Amygdala

VI. Conclusion

We propose a novel contrastive learning framework with an interpretable HSGRL model for brain functional network mining. In addition, a new data augmentation strategy is designed to generate contrastive samples for brain functional network data. Our framework generates more accurate representations of brain functional networks than other state-of-the-art methods, and these representations can be used in various prediction tasks (e.g., classification and regression). Moreover, brain saliency maps may assist with phenotypic biomarker identification and provide interpretable explanations of the framework outcomes.

Acknowledgment

Data were provided in part by the Human Connectome Project, MGH-USC Consortium (Principal Investigators: Bruce R. Rosen, Arthur W. Toga, and Van Wedeen; U01MH093765) funded by the NIH Blueprint Initiative for Neuroscience Research Grant; in part by the National Institutes of Health under Grant P41EB015896; and in part by the Instrumentation under Grant S10RR023043, Grant 1S10RR023401, and Grant 1S10RR019307.

Part of the work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number ACI-1548562. Specifically, it used the Bridges system, which is supported by NSF award number ACI-1445606, at the Pittsburgh Supercomputing Center (PSC).

This work was supported in part by the National Institutes of Health under Grant R01AG071243, Grant R01MH125928, and Grant U01AG068057; and in part by the National Science Foundation under Grant IIS 2045848 and Grant IIS 1837956.

Biographies

Haoteng Tang (Graduate Student Member, IEEE) received the B.E. degree in biomedical engineering from Xi’an Jiaotong University (XJTU), Xi’an, China, in 2016, and the M.S. degree in biomedical engineering from the University of Southern California (USC), Los Angeles, CA, USA, in 2018. He is currently pursuing the Ph.D. degree with the Electrical and Computer Engineering Department, University of Pittsburgh, Pittsburgh, PA, USA.

His research interests include graph mining, brain network representation learning, and medical image analysis.

Guixiang Ma (Member, IEEE) received the Ph.D. degree in computer science from the University of Illinois at Chicago (UIC), Chicago, IL, USA, in 2019.

She is currently an artificial intelligence (AI) Research Scientist with the Intel Laboratory, Hillsboro, OR, USA. Her research interests include machine learning, data mining, graph representation learning, and their applications to various domains.

Lei Guo received the B.E. degree in electrical engineering from Xi’an Jiaotong University (XJTU), Xi’an, China, in 2003, and the M.S. degree in industrial and system engineering from the National University of Singapore, Singapore, in 2007. She is currently pursuing the Ph.D. degree in electrical and computer engineering with the University of Pittsburgh, Pittsburgh, PA, USA.

Her research interests include brain network mining and bioinformatics.

Xiyao Fu received the B.E. degree in computer science and engineering from the University of Electronic Science and Technology of China (UESTC), Chengdu, China, in 2017. He is currently pursuing the Ph.D. degree with the Department of Electrical and Computer Engineering, University of Pittsburgh, Pittsburgh, PA, USA.

His research interests include medical image analysis and graph mining.

Heng Huang received the B.S. and M.S. degrees from Shanghai Jiao Tong University (SJTU), Shanghai, China, in 1997 and 2001, respectively, and the Ph.D. degree in computer science from Dartmouth College, Hanover, NH, USA, in 2006.

He is currently the John A. Jurenko Endowed Professor of computer engineering with the Electrical and Computer Engineering Department, University of Pittsburgh, Pittsburgh, PA, USA. His research interests include machine learning, data mining, computer vision, pattern recognition, and biomedical data science.

Liang Zhan received the Ph.D. degree in biomedical engineering from the University of California, Los Angeles (UCLA), Los Angeles, CA, USA, in 2011.

He is currently an Associate Professor with the Departments of Electrical and Computer Engineering and Bioengineering, University of Pittsburgh, PA, USA. His research interests include computational neuroimaging, brain connectomics, machine learning, and bioinformatics.

Contributor Information

Haoteng Tang, Department of Electrical and Computer Engineering, University of Pittsburgh, Pittsburgh, PA 15260 USA.

Guixiang Ma, Intel Laboratory, Hillsboro, OR 97124 USA.

Lei Guo, Department of Electrical and Computer Engineering, University of Pittsburgh, Pittsburgh, PA 15260 USA.

Xiyao Fu, Department of Electrical and Computer Engineering, University of Pittsburgh, Pittsburgh, PA 15260 USA.

Heng Huang, Department of Electrical and Computer Engineering, University of Pittsburgh, Pittsburgh, PA 15260 USA.

Liang Zhan, Department of Electrical and Computer Engineering, University of Pittsburgh, Pittsburgh, PA 15260 USA.

References

[1] Rusinek H et al., "Regional brain atrophy rate predicts future cognitive decline: 6-year longitudinal MR imaging study of normal aging," Radiology, vol. 229, no. 3, pp. 691–696, 2003.
[2] Sabuncu MR and Konukoglu E, "Clinical prediction from structural brain MRI scans: A large-scale empirical study," Neuroinformatics, vol. 13, no. 1, pp. 31–46, 2015.
[3] Seo S, Mohr J, Beck A, Wüstenberg T, Heinz A, and Obermayer K, "Predicting the future relapse of alcohol-dependent patients from structural and functional brain images," Addiction Biol., vol. 20, no. 6, pp. 1042–1055, Nov. 2015.
[4] Zhang Y, Zhan L, Cai W, Thompson P, and Huang H, "Integrating heterogeneous brain networks for predicting brain disease conditions," in Proc. Int. Conf. Med. Image Comput. Comput.-Assist. Intervent. Cham, Switzerland: Springer, 2019, pp. 214–222.
[5] Genon S, Reid A, Langner R, Amunts K, and Eickhoff SB, "How to characterize the function of a brain region," Trends Cognit. Sci., vol. 22, no. 4, pp. 350–364, Apr. 2018.
[6] Kraljević N et al., "Behavioral, anatomical and heritable convergence of affect and cognition in superior frontal cortex," NeuroImage, vol. 243, Nov. 2021, Art. no. 118561.
[7] Bressler SL and Menon V, "Large-scale brain networks in cognition: Emerging methods and principles," Trends Cogn. Sci., vol. 14, no. 6, pp. 277–290, Jun. 2010.
[8] Tang H et al., "A hierarchical graph learning model for brain network regression analysis," Frontiers Neurosci., vol. 16, pp. 1–12, Nov. 2022.
[9] Sun H et al., "Linked brain connectivity patterns with psychopathological and cognitive phenotypes in drug-naïve first-episode schizophrenia," Psychoradiology, vol. 2, no. 2, pp. 43–51, 2022.
[10] Levenson RW, Sturm VE, and Haase CM, "Emotional and behavioral symptoms in neurodegenerative disease: A model for studying the neural bases of psychopathology," Annu. Rev. Clin. Psychol., vol. 10, p. 581, Mar. 2014.
[11] Yan J et al., "Modeling spatio-temporal patterns of holistic functional brain networks via multi-head guided attention graph neural networks (multi-head GAGNNs)," Med. Image Anal., vol. 80, Aug. 2022, Art. no. 102518.
[12] Baniqued PL et al., "Brain network modularity predicts exercise-related executive function gains in older adults," Frontiers Aging Neurosci., vol. 9, p. 426, Jan. 2018.
[13] Braun U, Muldoon SF, and Bassett DS, "On human brain networks in health and disease," eLS, pp. 1–9, Feb. 2015.
[14] van den Heuvel MP, Kahn RS, Goñi J, and Sporns O, "High-cost, high-capacity backbone for global brain communication," Proc. Nat. Acad. Sci. USA, vol. 109, no. 28, pp. 11372–11377, Jul. 2012.
[15] Sporns O, "The human connectome: Origins and challenges," NeuroImage, vol. 80, pp. 53–61, Oct. 2013.
[16] Mattar MG and Bassett DS, "Brain network architecture," in Network Science in Cognitive Psychology. Oxfordshire, U.K.: Routledge, 2019, p. 30.
[17] Zhang Y, Zhan L, Wu S, Thompson P, and Huang H, "Disentangled and proportional representation learning for multi-view brain connectomes," in Proc. Int. Conf. Med. Image Comput. Comput.-Assist. Intervent. Cham, Switzerland: Springer, 2021, pp. 508–518.
[18] Beaty RE et al., "Robust prediction of individual creative ability from brain functional connectivity," Proc. Nat. Acad. Sci. USA, vol. 115, no. 5, pp. 1087–1092, Jan. 2018.
[19] Brown CJ et al., "Prediction of brain network age and factors of delayed maturation in very preterm infants," in Proc. Int. Conf. Med. Image Comput. Comput.-Assist. Intervent. Cham, Switzerland: Springer, 2017, pp. 84–91.
[20] Eichele T et al., "Prediction of human errors by maladaptive changes in event-related brain networks," Proc. Nat. Acad. Sci. USA, vol. 105, no. 16, pp. 6173–6178, 2008.
[21] Li X, Li Y, and Li X, "Predicting clinical outcomes of Alzheimer's disease from complex brain networks," in Proc. Int. Conf. Adv. Data Mining Appl. Cham, Switzerland: Springer, 2017, pp. 519–525.
[22] Warren DE et al., "Brain network theory can predict whether neuropsychological outcomes will differ from clinical expectations," Arch. Clin. Neuropsychol., vol. 32, no. 1, pp. 40–52, 2017.
[23] Hu C, Ju R, Shen Y, Zhou P, and Li Q, "Clinical decision support for Alzheimer's disease based on deep learning and brain network," in Proc. IEEE Int. Conf. Commun. (ICC), May 2016, pp. 1–6.
[24] Kawahara J et al., "BrainNetCNN: Convolutional neural networks for brain networks; towards predicting neurodevelopment," NeuroImage, vol. 146, pp. 1038–1049, Feb. 2017.
[25] Ktena SI et al., "Metric learning with spectral graph convolutions on brain connectivity networks," NeuroImage, vol. 169, pp. 431–442, Apr. 2018.
[26] Kipf TN and Welling M, "Semi-supervised classification with graph convolutional networks," 2016, arXiv:1609.02907.
[27] Veličković P, Cucurull G, Casanova A, Romero A, Liò P, and Bengio Y, "Graph attention networks," 2017, arXiv:1710.10903.
[28] Ying Z, Bourgeois D, You J, Zitnik M, and Leskovec J, "GNNExplainer: Generating explanations for graph neural networks," in Proc. Adv. Neural Inf. Process. Syst., vol. 32, 2019, pp. 1–12.
[29] Bao R, Gu B, and Huang H, "Fast OSCAR and OWL regression via safe screening rules," in Proc. 37th Int. Conf. Mach. Learn., 2020, pp. 653–663.
[30] Ying Z, You J, Morris C, Ren X, Hamilton W, and Leskovec J, "Hierarchical graph representation learning with differentiable pooling," in Proc. Adv. Neural Inf. Process. Syst., vol. 31, 2018, pp. 1–11.
[31] Hilgetag CC and Goulas A, "'Hierarchy' in the organization of brain networks," Philos. Trans. Roy. Soc. B, vol. 375, no. 1796, 2020, Art. no. 20190319.
[32] Mastrandrea R, Gabrielli A, Piras F, Spalletta G, Caldarelli G, and Gili T, "Organization and hierarchy of the human functional brain network lead to a chain-like core," Sci. Rep., vol. 7, no. 1, pp. 1–13, Dec. 2017.
[33] Meunier D, "Hierarchical modularity in human brain functional networks," Frontiers Neuroinform., vol. 3, p. 37, Oct. 2009.
[34] Lee J, Lee I, and Kang J, "Self-attention graph pooling," in Proc. 36th Int. Conf. Mach. Learn., 2019, pp. 3734–3743.
[35] Zhang Z et al., "Hierarchical graph pooling with structure learning," 2019, arXiv:1911.05954.
[36] Li X et al., "BrainGNN: Interpretable brain graph neural network for fMRI analysis," Med. Image Anal., vol. 74, Dec. 2021, Art. no. 102233.
[37] Tang H, Ma G, He L, Huang H, and Zhan L, "CommPOOL: An interpretable graph pooling framework for hierarchical graph representation learning," Neural Netw., vol. 143, pp. 669–677, Nov. 2021.
[38] Cartwright D and Harary F, "Structural balance: A generalization of Heider's theory," Psychol. Rev., vol. 63, no. 5, p. 277, Sep. 1956.
[39] Derr T, Ma Y, and Tang J, "Signed graph convolutional networks," in Proc. IEEE Int. Conf. Data Mining (ICDM), Nov. 2018, pp. 929–934.
[40] Heider F, "Attitudes and cognitive organization," J. Psychol., vol. 21, no. 1, pp. 107–112, Jan. 1946.
[41] Li Y, Tian Y, Zhang J, and Chang Y, "Learning signed network embedding via graph attention," in Proc. AAAI Conf. Artif. Intell., 2020, vol. 34, no. 4, pp. 4772–4779.
[42] Li Y, Tarlow D, Brockschmidt M, and Zemel R, "Gated graph sequence neural networks," 2015, arXiv:1511.05493.
[43] Vinyals O, Bengio S, and Kudlur M, "Order matters: Sequence to sequence for sets," 2015, arXiv:1511.06391.
[44] Zhang M, Cui Z, Neumann M, and Chen Y, "An end-to-end deep learning architecture for graph classification," in Proc. 32nd AAAI Conf. Artif. Intell., 2018, pp. 1–8.
[45] van den Heuvel MP and Sporns O, "Network hubs in the human brain," Trends Cogn. Sci., vol. 17, pp. 683–696, Dec. 2013.
[46] Ilyas MU, Shafiq MZ, Liu AX, and Radha H, "A distributed and privacy preserving algorithm for identifying information hubs in social networks," in Proc. IEEE INFOCOM, Apr. 2011, pp. 561–565.
[47] Hwang K, Hallquist MN, and Luna B, "The development of hub architecture in the human functional brain network," Cerebral Cortex, vol. 23, no. 10, pp. 2380–2393, Oct. 2013.
[48] Zhan L et al., "The significance of negative correlations in brain connectivity," J. Comparative Neurol., vol. 525, no. 15, pp. 3251–3265, 2017.
[49] Khosla P et al., "Supervised contrastive learning," in Proc. Adv. Neural Inf. Process. Syst., vol. 33, 2020, pp. 18661–18673.
[50] Berthelot D, Carlini N, Goodfellow I, Papernot N, Oliver A, and Raffel CA, "MixMatch: A holistic approach to semi-supervised learning," in Proc. Adv. Neural Inf. Process. Syst., vol. 32, 2019, pp. 1–11.
[51] Xie Q, Dai Z, Hovy E, Luong T, and Le Q, "Unsupervised data augmentation for consistency training," in Proc. Adv. Neural Inf. Process. Syst., vol. 33, 2020, pp. 6256–6268.
[52] Wu Y, Wang Z, Zeng D, Li M, Shi Y, and Hu J, "Decentralized unsupervised learning of visual representations," in Proc. IJCAI, 2022, pp. 2326–2333.
[53] Wu Y, Zeng D, Wang Z, Shi Y, and Hu J, "Federated contrastive learning for volumetric medical image segmentation," in Proc. Int. Conf. Med. Image Comput. Comput.-Assist. Intervent. Cham, Switzerland: Springer, 2021, pp. 367–377.
[54] Hassani K and Khasahmadi AH, "Contrastive multi-view representation learning on graphs," in Proc. 37th Int. Conf. Mach. Learn., 2020, pp. 4116–4126.
[55] You Y, Chen T, Sui Y, Chen T, Wang Z, and Shen Y, "Graph contrastive learning with augmentations," in Proc. Adv. Neural Inf. Process. Syst., vol. 33, 2020, pp. 5812–5823.
[56] Zhu Y, Xu Y, Yu F, Liu Q, Wu S, and Wang L, "Graph contrastive learning with adaptive augmentation," in Proc. Web Conf., Apr. 2021, pp. 2069–2080.
[57] Zhao T, Liu Y, Neves L, Woodford O, Jiang M, and Shah N, "Data augmentation for graph neural networks," 2020, arXiv:2006.06830.
[58] Wu Y, Zeng D, Wang Z, Shi Y, and Hu J, "Distributed contrastive learning for medical image segmentation," Med. Image Anal., vol. 81, Oct. 2022, Art. no. 102564.
[59] Zhang L, Zaman A, Wang L, Yan J, and Zhu D, "A cascaded multi-modality analysis in mild cognitive impairment," in Proc. Int. Workshop Mach. Learn. Med. Imag. Cham, Switzerland: Springer, 2019, pp. 557–565.
[60] Zhang L, Wang L, and Zhu D, "Jointly analyzing Alzheimer's disease related structure-function using deep cross-model attention network," in Proc. IEEE 17th Int. Symp. Biomed. Imag. (ISBI), Apr. 2020, pp. 563–567.
[61] Chen J, Ma T, and Xiao C, "FastGCN: Fast learning with graph convolutional networks via importance sampling," 2018, arXiv:1801.10247.
[62] Huang W, Zhang T, Rong Y, and Huang J, "Adaptive sampling towards fast graph representation learning," in Proc. Adv. Neural Inf. Process. Syst., 2018, pp. 4558–4567.
[63] Dai H, Dai B, and Song L, "Discriminative embeddings of latent variable models for structured data," in Proc. 33rd Int. Conf. Mach. Learn., 2016, pp. 2702–2711.
[64] Duvenaud DK et al., "Convolutional networks on graphs for learning molecular fingerprints," in Proc. Adv. Neural Inf. Process. Syst., 2015, pp. 2224–2232.
[65] Liu J, Ma G, Jiang F, Lu C-T, Yu PS, and Ragin AB, "Community-preserving graph convolutions for structural and functional joint embedding of brain networks," in Proc. IEEE Int. Conf. Big Data (Big Data), Dec. 2019, pp. 1163–1168.
[66] Hamilton W, Ying Z, and Leskovec J, "Inductive representation learning on large graphs," in Proc. Adv. Neural Inf. Process. Syst., 2017, pp. 1024–1034.
[67] Ma G, Ahmed NK, Willke TL, and Yu PS, "Deep graph similarity learning: A survey," Data Mining Knowl. Discovery, vol. 35, no. 3, pp. 688–725, May 2021.
[68] Gao H and Ji S, "Graph U-Nets," in Proc. 36th Int. Conf. Mach. Learn., 2019, pp. 2083–2092.
[69] Yuan H and Ji S, "StructPool: Structured graph pooling via conditional random fields," in Proc. 8th Int. Conf. Learn. Represent., 2020, pp. 1–12.
[70] Ma G et al., "Deep graph similarity learning for brain data analysis," in Proc. 28th ACM Int. Conf. Inf. Knowl. Manag., 2019, pp. 2743–2751.
[71] Korthauer LE, Zhan L, Ajilore O, Leow A, and Driscoll I, "Disrupted topology of the resting state structural connectome in middle-aged APOE ε4 carriers," NeuroImage, vol. 178, pp. 295–305, Sep. 2018.
[72] Zhan L, Liu Y, Zhou J, Ye J, and Thompson PM, "Boosting classification accuracy of diffusion MRI derived brain networks for the subtypes of mild cognitive impairment using higher order singular value decomposition," in Proc. IEEE 12th Int. Symp. Biomed. Imag. (ISBI), Apr. 2015, pp. 131–135.
[73] Cao B et al., "t-BNE: Tensor-based brain network embedding," in Proc. SIAM Int. Conf. Data Mining, 2017, pp. 189–197.
[74] Sui J et al., "Discriminating schizophrenia and bipolar disorder by fusing fMRI and DTI in a multimodal CCA+ joint ICA model," NeuroImage, vol. 57, no. 3, pp. 839–855, 2011.
[75] Zhang Y and Huang H, "New graph-blind convolutional network for brain connectome data analysis," in Proc. Int. Conf. Inf. Process. Med. Imag. Cham, Switzerland: Springer, 2019, pp. 669–681.
[76] Jiang H, Cao P, Xu M, Yang J, and Zaiane O, "Hi-GCN: A hierarchical graph convolution network for graph embedding learning of brain network and brain disorders prediction," Comput. Biol. Med., vol. 127, Dec. 2020, Art. no. 104096.
[77] Jung J, Yoo J, and Kang U, "Signed graph diffusion network," 2020, arXiv:2012.14191.
[78] Shen X and Chung F-L, "Deep network embedding for graph representation learning in signed networks," IEEE Trans. Cybern., vol. 50, no. 4, pp. 1556–1568, Apr. 2020.
[79] Huang Q, Yamada M, Tian Y, Singh D, Yin D, and Chang Y, "GraphLIME: Local interpretable model explanations for graph neural networks," 2020, arXiv:2001.06216.
[80] Cui H, Dai W, Zhu Y, Li X, He L, and Yang C, "BrainNNExplainer: An interpretable graph neural network framework for brain network based disease analysis," 2021, arXiv:2107.05097.
[81] Ma G, Lu C-T, He L, Yu PS, and Ragin AB, "Multi-view graph embedding with hub detection for brain network analysis," in Proc. IEEE Int. Conf. Data Mining (ICDM), Nov. 2017, pp. 967–972.
[82] Xu D, Cheng W, Luo D, Chen H, and Zhang X, "InfoGCL: Information-aware graph contrastive learning," in Proc. Adv. Neural Inf. Process. Syst., vol. 34, 2021, pp. 1–12.
[83] Wu Z, Xiong Y, Yu SX, and Lin D, "Unsupervised feature learning via non-parametric instance discrimination," in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., Jun. 2018, pp. 3733–3742.
[84] Sohn K, "Improved deep metric learning with multi-class n-pair loss objective," in Proc. Adv. Neural Inf. Process. Syst., vol. 29, 2016, pp. 1–9.
[85] van den Oord A, Li Y, and Vinyals O, "Representation learning with contrastive predictive coding," 2018, arXiv:1807.03748.
[86] Van Essen DC et al., "The WU-Minn human connectome project: An overview," NeuroImage, vol. 80, pp. 62–79, Oct. 2013.
[87] LaMontagne PJ et al., "OASIS-3: Longitudinal neuroimaging, clinical, and cognitive dataset for normal aging and Alzheimer's disease," Alzheimer's Dementia, vol. 14, no. Suppl. 7, p. P1097, 2018. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S155252601831611X, doi: 10.1016/j.jalz.2018.06.1439.
[88] Whitfield-Gabrieli S and Nieto-Castanon A, "Conn: A functional connectivity toolbox for correlated and anticorrelated brain networks," Brain Connectivity, vol. 2, no. 3, pp. 125–141, Jun. 2012.
[89] Fortel I et al., "Connectome signatures of hyperexcitation in cognitively intact middle-aged female APOE-ε4 carriers," Cerebral Cortex, vol. 30, no. 12, pp. 6350–6362, 2020.
[90] Ajilore O et al., "Constructing the resting state structural connectome," Frontiers Neuroinform., vol. 7, p. 30, 2013.
[91] Fischl B, "FreeSurfer," NeuroImage, vol. 62, no. 2, pp. 774–781, Aug. 2012.
[92] Kingma DP and Ba J, "Adam: A method for stochastic optimization," 2014, arXiv:1412.6980.
[93] Shchur O, Mumme M, Bojchevski A, and Günnemann S, "Pitfalls of graph neural network evaluation," 2018, arXiv:1811.05868.
[94] Zhang L, Wang L, and Zhu D, "Predicting brain structural network using functional connectivity," Med. Image Anal., vol. 79, Jul. 2022, Art. no. 102463.
  • [95].Tombaugh TN and McIntyre NJ, “The mini-mental state examination: A comprehensive review,” J. Amer. Geriatrics Soc, vol. 40, no. 9, pp. 922–935, Sep. 1992. [DOI] [PubMed] [Google Scholar]
  • [96].Eriksen BA and Eriksen CW, “Effects of noise letters upon the identification of a target letter in a nonsearch task,” Perception Psychophys, vol. 16, no. 1, pp. 143–149, Jan. 1974. [Google Scholar]
  • [97].Pangman VC, Sloan J, and Guse L, “An examination of psychometric properties of the mini-mental state examination and the standardized mini-mental state examination: Implications for clinical practice,” Appl. Nursing Res, vol. 13, no. 4, pp. 209–213, Nov. 2000. [DOI] [PubMed] [Google Scholar]
  • [98].Monchi O, Petrides M, Petre V, Worsley K, and Dagher A, “Wisconsin card sorting revisited: Distinct neural circuits participating in different stages of the task identified by event-related functional magnetic resonance imaging,” J. Neurosci, vol. 21, no. 19, pp. 7733–7741, Oct. 2001. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [99].Berg EA, “A simple objective technique for measuring flexibility in thinking,” J. Gen. Psychol, vol. 39, no. 1, pp. 15–22, Jul. 1948. [DOI] [PubMed] [Google Scholar]
  • [100].Zhang W, Zhan L, Thompson P, and Wang Y, “Deep representation learning for multimodal brain networks,” in Proc. Int. Conf. Med. Image Comput. Comput.-Assist. Intervent. Cham, Switzerland: Springer, 2020, pp. 613–624. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [101].Arslan S, Ktena SI, Glocker B, and Rueckert D, “Graph saliency maps through spectral convolutional networks: Application to sex classification with brain connectivity,” in Graphs in Biomedical Image Analysis and Integrating Medical Imaging and Non-Imaging Modalities. Cham, Switzerland: Springer, 2018, pp. 3–13. [Google Scholar]
  • [102].Pope PE, Kolouri S, Rostami M, Martin CE, and Hoffmann H, “Explainability methods for graph convolutional neural networks,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2019, pp. 10772–10781. [Google Scholar]
  • [103].Rasero J et al. , “Multivariate regression analysis of structural MRI connectivity matrices in Alzheimer’s disease,” PLoS ONE, vol. 12, no. 11, 2017, Art. no. e0187281. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [104].Kutová M, Mrzílková J, Riedlová J, and Zach P, “Asymmetric changes in limbic cortex and planum temporale in patients with Alzheimer disease,” Current Alzheimer Res, vol. 15, no. 14, pp. 1361–1368, Nov. 2018. [DOI] [PubMed] [Google Scholar]
  • [105].Hiscox LV et al. , “Mechanical property alterations across the cerebral cortex due to Alzheimer’s disease,” Brain Commun, vol. 2, no. 1, 2020, Art. no. fcz049. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [106].Hafkemeijer A et al. , “Resting state functional connectivity differences between behavioral variant frontotemporal dementia and Alzheimer’s disease,” Frontiers Hum. Neurosci, vol. 9, p. 474, Sep. 2015. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [107].Stoodley CJ, Valera EM, and Schmahmann JD, “Functional topography of the cerebellum for motor and cognitive tasks: An fMRI study,” NeuroImage, vol. 59, no. 2, pp. 1560–1570, Jan. 2012. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [108].Van Overwalle F, Ma Q, and Heleven E, “The posterior crus II cerebellum is specialized for social mentalizing and emotional self-experiences: A meta-analysis,” Social Cognit. Affect. Neurosci, vol. 15, no. 9, pp. 905–928, Nov. 2020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [109].Sawyer RP, Rodriguez-Porcel F, Hagen M, Shatz R, and Espay AJ, “Diagnosing the frontal variant of Alzheimer’s disease: A clinician’s yellow brick road,” J. Clin. Movement Disorders, vol. 4, no. 1, pp. 1–9, 2017. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [110].Calati R et al. , “Repatriation is associated with isthmus cingulate cortex reduction in community-dwelling elderly,” World J. Biol. Psychiatry, vol. 19, no. 6, pp. 421–430, 2018. [DOI] [PubMed] [Google Scholar]
  • [111].Hornung J, Smith E, Junger J, Pauly K, Habel U, and Derntl B, “Exploring sex differences in the neural correlates of self-and other-referential gender stereotyping,” Frontiers Behav. Neurosci, vol. 13, p. 31, Feb. 2019. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • [112].Chen T et al. , “The neural substrates of sex differences in balanced time perspective: A unique role for the precuneus,” Brain Imag. Behav, vol. 16, pp. 2239–2247, Jun. 2022. [DOI] [PubMed] [Google Scholar]
  • [113].Adinoff B, Williams MJ, Best SE, Harris TS, Chandler P, and Devous MD, “Sex differences in medial and lateral orbitofrontal cortex hypoperfusion in cocaine-dependent men and women,” Gender Med, vol. 3, no. 3, pp. 206–222, Sep. 2006. [DOI] [PMC free article] [PubMed] [Google Scholar]