Abstract
Extensive studies focus on analyzing human brain functional connectivity from a network perspective, in which each network contains complex graph structures. Based on resting-state functional MRI (rs-fMRI) data, graph convolutional networks (GCNs) enable comprehensive mapping of brain functional connectivity (FC) patterns to depict brain activities. However, existing studies usually characterize static properties of the FC patterns, ignoring the time-varying dynamic information. In addition, previous GCN methods generally use a fixed group-level (e.g., patients or controls) representation of FC networks, and thus cannot capture subject-level FC specificity. To this end, we propose a Temporal-Adaptive GCN (TAGCN) framework that can not only take advantage of both spatial and temporal information using resting-state FC patterns and time series but also explicitly characterize subject-level specificity of FC patterns. Specifically, we first segment each ROI-based time series into multiple overlapping windows, and then employ an adaptive GCN to mine topological information. We further model the temporal patterns for each ROI along time to learn the periodic brain status changes. Experimental results on 533 major depressive disorder (MDD) and healthy control (HC) subjects demonstrate that the proposed TAGCN outperforms several state-of-the-art methods in MDD vs. HC classification, and can also be used to capture dynamic FC alterations and learn valid graph representations.
1. Introduction
Major depressive disorder (MDD), one of the most prevalent mental disorders, affects as many as 300 million people annually. Patients with this debilitating illness suffer from depressed mood, diminished interests, and impaired cognitive function [1,2]. Although many efforts have been made in different areas such as basic science, clinical neuroscience, and psychiatric research, the pathophysiology of MDD is still unclear. In addition, conventional diagnosis of MDD often depends on a subjective clinical impression based on the Diagnostic and Statistical Manual of Mental Disorders (DSM) criteria and on treatment responses. Recently, many researchers have developed various computer-aided diagnostic tools based on noninvasive neuroimaging techniques to better understand the neurobiological mechanisms underpinning this mental disorder [3–5].
Among various neuroimaging techniques, resting-state functional magnetic resonance imaging (rs-fMRI) can depict large-scale abnormality or dysfunction in brain connectivity networks by measuring the blood-oxygen level in the brain [6,7]. This technology has been widely used to distinguish MDD patients from healthy controls (HCs) [4,8]. Most of the existing rs-fMRI based studies rely on static functional connectivity (FC) networks, and thus hold an implicit but strong assumption that the brain functional connectivity network is temporally stationary throughout the whole scanning period. Therefore, these methods ignore the temporal dynamic information of brain FC networks, and cannot adequately track the changes of macroscopic neural activities underlying critical aspects of cognitive/behavioral disorders [9,10]. Spectral graph convolutional neural networks (GCNs) have been used to explicitly capture topological information for learning useful representations of brain FC networks [11,12]. However, conventional GCNs generally use a fixed group-level rather than subject-level adjacency matrix to model the relationships among different brain regions, and fail to capture the time-varying information in fMRI data. Intuitively, capturing subject-level specificity of functional connectivity could boost the performance of automated MDD identification.
In this paper, we propose a Temporal-Adaptive Graph Convolutional Network (TAGCN) to extract both static and dynamic information of brain FC patterns for MDD identification, as shown in Fig. 1. Specifically, we first extract rs-fMRI time-series signals from each region-of-interest (ROI), and employ fixed-size sliding windows to divide the time-series data into multiple overlapping blocks. For each block, an adaptive graph convolutional layer is subsequently used to generate a flexible connectivity matrix, which can help model multilevel semantic information within the whole time series. After that, convolution operations on each ROI along different blocks are used to capture temporal dynamics of the complete time series. Finally, a fully connected layer followed by a softmax function is used for MDD classification. Experimental results on 533 subjects from an open-source MDD dataset demonstrate the effectiveness of our TAGCN in capturing dynamic FC alterations and learning valid graph representations. Also, our TAGCN achieves better performance than several state-of-the-art methods in the task of MDD vs. HC classification. To the best of our knowledge, this is among the first attempts to use an end-to-end GCN model to capture adaptive FC topology for automated MDD diagnosis.
Fig. 1.

Illustration of the proposed Temporal-Adaptive Graph Convolutional Network (TAGCN) and details of the adaptive GCN layer. The whole framework contains three parts: 1) using sliding windows to generate several time-series blocks; 2) applying an adaptive GCN layer to construct a flexible brain functional connectivity topology within each block; and 3) employing a temporal convolutional layer to extract dynamic information across blocks for each ROI. Given the output of the temporal convolutional layer, a fully-connected layer is employed to predict the class label (with N-dimensional input and the class label as output). As shown in the right panel, three types of matrices (i.e., A, R, and S), the normalized embedded Gaussian function (with embeddings θ and ϕ), and several simple operations constitute the adaptive GCN layer. ⊕ and ⊗ denote element-wise summation and matrix multiplication, respectively. Pink boxes denote learnable parameters, while blue boxes denote fixed parts.
2. Method
2.1. Data and fMRI Pre-processing
A total of 533 subjects from an open-source MDD dataset [13] with rs-fMRI data are used in this work, including 282 MDD subjects and 251 healthy controls (HCs) recruited from Southwest University. For each scan, the TR (repetition time) is 2,000 ms, the TE (echo time) is 30 ms, the slice thickness is 3.0 mm, and the number of time points is 242. The demographic information of the studied subjects is provided in Table 1.
Table 1.
Demographic information of studied subjects in the MDD dataset.
| Category | Sex | Age | Education | First Period | On Medication | Duration of Illness |
|---|---|---|---|---|---|---|
| MDD | 99M 183F | 38.7 ± 13.6 | 10.8 ± 3.6 | 209(Y)/49(N) 24(D) | 124(Y)/125(N) 33(D) | 50.0 ± 65.9 35(D) |
| HC | 87M 164F | 39.6 ± 15.8 | 13.0 ± 3.9 | - | - | - |
M: Male; F: Female; Y: Yes; N: No; D: Lack of record; Mean ± Standard deviation.
Each rs-fMRI scan was pre-processed using the Data Processing Assistant for Resting-State fMRI (DPARSF) [14]. Specifically, we first discard the first ten time points, and then perform slice timing correction, head motion correction, and regression of nuisance covariates (i.e., head motion parameters, white matter, and cerebrospinal fluid (CSF) signals). Then, the fMRI data are normalized to an EPI template in the MNI space and resampled to a resolution of 3 × 3 × 3 mm³, followed by spatial smoothing using a 6 mm full width at half maximum Gaussian kernel. Finally, the Harvard-Oxford atlas, with 112 pre-defined regions-of-interest (ROIs) including cortical and subcortical areas, is nonlinearly aligned onto each scan to extract the mean time series for each ROI.
2.2. Proposed Temporal-Adaptive Graph Convolutional Network
As shown in Fig. 1, our model aims to capture temporal and graph topology information to identify MDD subjects from HCs based on rs-fMRI time series. Denote a subject as $X = (x_1, x_2, \cdots, x_N)^{\top} \in \mathbb{R}^{N \times M}$, where $x_n \in \mathbb{R}^{M}$ contains all time-series information at the n-th ROI. Here, N = 112 and M = 232 denote the number of ROIs and the number of time points, respectively. The sliding window size L is set to 25 TR (i.e., 50 s) and the stride of the sliding window is set to 10 TR (i.e., 20 s). To reduce the overlap of the last two blocks, we discard the first TR and generate T = 10 blocks for each subject.
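The windowing step can be illustrated with a minimal NumPy sketch. The function name `sliding_blocks` and the toy sizes are assumptions made for illustration; the sketch uses the standard layout (consecutive windows that start `stride` points apart, with incomplete trailing windows dropped) and does not attempt to reproduce the exact block layout that yields T = 10 in the text.

```python
import numpy as np

def sliding_blocks(x, window, stride):
    """Split an (N, M) ROI-by-time array into overlapping blocks.

    Returns an array of shape (T, N, window), where
    T = (M - window) // stride + 1. Trailing time points that do not
    fill a complete window are dropped.
    """
    n_rois, n_points = x.shape
    if window > n_points:
        raise ValueError("window is longer than the time series")
    starts = range(0, n_points - window + 1, stride)
    return np.stack([x[:, s:s + window] for s in starts])

# Toy example: 3 ROIs, 12 time points, window of 5, stride of 3.
x = np.arange(36, dtype=float).reshape(3, 12)
blocks = sliding_blocks(x, window=5, stride=3)
print(blocks.shape)  # (3, 3, 5): 3 blocks, 3 ROIs, 5 time points each
```

Consecutive blocks overlap by `window - stride` time points, which is what lets neighboring blocks share context.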
Spectral Graph Convolutional Network.
Spectral Graph Convolutional Network (GCN) has recently shown its superiority in learning high-level graph features from brain fMRI data [11,12,15]. Denote $f_{in}$ and $f_{out}$ as the input and output of a GCN layer, respectively. The simplest spectral GCN layer [16] can be formulated as:
$$f_{out} = \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} f_{in} W \tag{1}$$
where $\tilde{D}$ denotes the N × N degree matrix of $\tilde{A}$ (with N representing the number of ROIs), and W denotes the learnable weight matrix for those connected vertices. Here, $\tilde{A} = A + I$, where A and I denote the adjacency matrix and an identity matrix, respectively. However, in the definition of spectral graph convolution, the localized first-order approximation makes nodes i and j share the same parameters whenever node j is directly connected to node i. To assign different weights to different nodes in a neighborhood, the Graph Attention (GAT) [17] layer was further proposed, defined as follows:
$$f_{out} = \Lambda^{-\frac{1}{2}} (A \odot M) \Lambda^{-\frac{1}{2}} f_{in} W \tag{2}$$
where $\Lambda$ is an N × N degree matrix to which a small constant is added to avoid empty rows, ⊙ denotes the element-wise product, and M is an attention map that represents the importance of each node/vertex/ROI. However, both the conventional spectral GCN layer and the GAT layer still highly depend on the construction of the brain functional connectivity topology, while each fMRI scan is usually treated as a complete/fully-connected graph.
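As a rough illustration of the first-order spectral graph convolution of Kipf and Welling [16], the sketch below adds self-loops to an adjacency matrix, applies symmetric degree normalization, and propagates node features. The function name `gcn_layer`, the feature layout (nodes by channels), and the toy graph are assumptions for illustration only, not the implementation used in the experiments.

```python
import numpy as np

def gcn_layer(f_in, adj, weight):
    """One first-order spectral graph convolution.

    f_in:   (N, C_in) node features
    adj:    (N, N) symmetric, non-negative adjacency without self-loops
    weight: (C_in, C_out) learnable weights
    """
    a_tilde = adj + np.eye(adj.shape[0])      # add self-loops
    deg = a_tilde.sum(axis=1)                 # degrees are >= 1
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Symmetric normalization: D^{-1/2} (A + I) D^{-1/2}
    a_norm = a_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return a_norm @ f_in @ weight

# Toy path graph with 3 nodes: 0 - 1 - 2.
adj = np.array([[0.0, 1.0, 0.0],
                [1.0, 0.0, 1.0],
                [0.0, 1.0, 0.0]])
f_in = np.eye(3)       # one-hot features, so the output equals a_norm
weight = np.eye(3)
out = gcn_layer(f_in, adj, weight)
print(np.round(out, 3))
```

With identity features and weights, the output is the normalized adjacency itself, which makes it easy to see that every neighbor of a node is weighted only by the degrees, not by any node-specific learned coefficient.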
Adaptive Graph Convolutional Layer.
To solve the problem caused by the fixed topology of brain functional connectivity, we employ a new adjacency matrix A + R + S to build an end-to-end learning module. The definition of the adaptive graph convolutional layer is shown as follows:
$$f_{out} = (A + R + S) f_{in} W \tag{3}$$
where the definitions of A, R and S are shown below.
The matrix A is an N × N adjacency matrix, which determines whether a connection exists between two ROIs (i.e., vertices). Specifically, we first calculate the mean FC matrix of all training subjects within the same time-series block and then construct a k-Nearest Neighbour (KNN) graph by connecting each vertex with its top k nearest neighbors (with the Pearson’s correlation coefficient as the similarity metric).
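A possible construction of such a group-level KNN graph is sketched below. The helper name `knn_adjacency`, the use of signed (rather than absolute) correlation for ranking, the symmetrization rule, and the random toy data are all assumptions; the text does not specify these details.

```python
import numpy as np

def knn_adjacency(mean_fc, k):
    """Binary KNN graph from an (N, N) mean correlation matrix.

    Each vertex is linked to the k vertices most correlated with it
    (itself excluded). The result is symmetrized so that an edge exists
    if either endpoint selects the other (an assumption of this sketch).
    """
    n = mean_fc.shape[0]
    sim = np.array(mean_fc, dtype=float)       # working copy
    np.fill_diagonal(sim, -np.inf)             # never pick the vertex itself
    nearest = np.argsort(-sim, axis=1)[:, :k]  # top-k columns per row
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    return np.maximum(adj, adj.T)

# Toy data: 4 training subjects, 5 ROIs, 25 time points in one block.
rng = np.random.default_rng(0)
block = rng.standard_normal((4, 5, 25))
mean_fc = np.mean([np.corrcoef(subject) for subject in block], axis=0)
adj = knn_adjacency(mean_fc, k=2)
print(adj)
```

Because the graph is built from the mean FC matrix of the training subjects, the same A is shared by every subject within a given block; only R and S can adapt beyond it.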
The matrix R is an N × N adjacency matrix, which is parameterized and optimized in the training process together with the other tunable parameters. It is a data-driven matrix without any constraint, through which one can learn graphs that are more individualized for the different topology information of different time-series blocks. Although the attention matrix M in Eq. (2) can model the existence and strength of connections between two ROIs, the element-wise operation ⊙ forces the zero elements of the adjacency matrix A to remain 0 (i.e., they are not affected by M). Different from the attention matrix M, R is learned in a data-driven manner without such a constraint, and thus is more flexible.
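The contrast between element-wise masking and an additive learned matrix can be checked with a tiny numerical example. The values below are made up for illustration; `a`, `m`, and `r` stand in for A, M, and R.

```python
import numpy as np

# Illustrative values only (not taken from the paper).
a = np.array([[0.0, 1.0],
              [1.0, 0.0]])      # fixed adjacency with zero entries
m = np.array([[0.9, 0.5],
              [0.5, 0.9]])      # attention-style map
r = np.array([[0.3, -0.1],
              [0.2, 0.4]])      # unconstrained learned matrix

masked = a * m    # element-wise product: zeros of `a` stay zero
added = a + r     # additive form: every entry can change

print(masked)
print(added)
```

In the masked form the diagonal stays at 0 regardless of `m`, whereas in the additive form the diagonal takes the values of `r`, so connections absent from `a` can still be introduced.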
The matrix S is used to learn the topology information of brain functional connectivity in each time-series block. We employ a normalized embedding Gaussian function [18] to calculate the similarity of two ROIs in S. Specifically, this function determines whether two ROIs (e.g., ri and rj) should be connected and also the connection strength if the connection exists, defined as follows:
$$r_{ij} = \frac{e^{\theta(r_i)^{\top} \phi(r_j)}}{\sum_{j=1}^{N} e^{\theta(r_i)^{\top} \phi(r_j)}} \tag{4}$$
where $r_{ij}$ is an element of S, and θ(·) and ϕ(·) are two embedding functions. These two embedding functions map the input feature map (of size Cin × T × N) to the size of Ce × T × N, where Cin, Ce, and T denote the number of input channels, the embedding size, and the number of temporal blocks, respectively. We use a 1 × 1 convolutional layer as each embedding function. After rearranging the two new feature maps into the shapes of N × CeT and CeT × N, an N × N matrix S is generated by multiplying them and applying the normalization in Eq. (4). The element $r_{ij}$ in the matrix S denotes the similarity of two ROIs (i.e., $r_i$ and $r_j$), normalized to [0, 1]. Details of the adaptive graph convolutional layer are shown in the right panel of Fig. 1.
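The computation of S described above can be sketched in NumPy as follows. A 1 × 1 convolution is written here as a linear map across channels; the function name `embedded_gaussian_similarity`, the weight shapes, and the random inputs are assumptions for illustration, and the row-wise softmax is one standard way to realize a normalized embedded Gaussian.

```python
import numpy as np

def embedded_gaussian_similarity(f_in, w_theta, w_phi):
    """Row-normalized similarity between ROIs.

    f_in:    (C_in, T, N) feature map
    w_theta: (C_e, C_in) weights of the 1x1 convolution theta
    w_phi:   (C_e, C_in) weights of the 1x1 convolution phi
    Returns an (N, N) matrix whose rows sum to 1.
    """
    n = f_in.shape[-1]
    # A 1x1 convolution acts as a linear map over the channel axis.
    theta = np.einsum("ec,ctn->etn", w_theta, f_in)   # (C_e, T, N)
    phi = np.einsum("ec,ctn->etn", w_phi, f_in)       # (C_e, T, N)
    theta_flat = theta.reshape(-1, n).T               # (N, C_e * T)
    phi_flat = phi.reshape(-1, n)                     # (C_e * T, N)
    logits = theta_flat @ phi_flat                    # (N, N)
    logits -= logits.max(axis=1, keepdims=True)       # numerically stable softmax
    expd = np.exp(logits)
    return expd / expd.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
f_in = rng.standard_normal((2, 10, 6))   # C_in = 2, T = 10, N = 6
w_theta = rng.standard_normal((3, 2))    # C_e = 3
w_phi = rng.standard_normal((3, 2))
s = embedded_gaussian_similarity(f_in, w_theta, w_phi)
print(s.shape, s.sum(axis=1))
```

Since S is computed from the features of the input itself, it differs from subject to subject and from block to block, unlike the shared KNN matrix A.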
Temporal Convolutional Layer.
For the temporal dimension, since the number of blocks is fixed as T = 10 (as mentioned in Sect. 2.2), we perform the convolution along the temporal dimension in the same way as a traditional convolution operation. Specifically, a Kt × 1 convolution operation is applied to the output feature maps of the adaptive graph convolutional layer, where Kt is the kernel size along the temporal dimension. The kernel size is empirically set as Kt = 3.
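A Kt × 1 convolution over the block axis can be sketched as below. The function name `temporal_conv`, the zero padding that keeps T unchanged, and the absence of bias and stride are assumptions of this sketch rather than details given in the text.

```python
import numpy as np

def temporal_conv(f, kernel):
    """Apply a K_t x 1 convolution along the block (time) axis.

    f:      (C_in, T, N) feature map
    kernel: (C_out, C_in, K_t) weights, with K_t odd
    Returns (C_out, T, N); zero padding keeps T unchanged.
    """
    c_out, c_in, k_t = kernel.shape
    pad = k_t // 2
    t, n = f.shape[1], f.shape[2]
    f_pad = np.pad(f, ((0, 0), (pad, pad), (0, 0)))   # pad the time axis only
    out = np.zeros((c_out, t, n))
    for step in range(t):
        window = f_pad[:, step:step + k_t, :]         # (C_in, K_t, N)
        # Each ROI is filtered independently; ROIs are never mixed here.
        out[:, step, :] = np.einsum("ock,ckn->on", kernel, window)
    return out

f = np.ones((1, 10, 4))                # C_in = 1, T = 10 blocks, N = 4 ROIs
kernel = np.full((1, 1, 3), 1.0 / 3)   # K_t = 3 moving average
out = temporal_conv(f, kernel)
print(out[0, :, 0])
```

With this averaging kernel, interior blocks keep the value 1 while the first and last blocks drop to 2/3 because one of the three taps falls on the zero padding.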
Implementation.
We optimize the proposed TAGCN model via the Adam algorithm, with a learning rate of 0.001, 200 epochs, and a mini-batch size of 5. For a new test subject, our TAGCN takes about 8.6 seconds to predict its class label (i.e., MDD or HC) using a single GPU (NVIDIA GTX TITAN 12 GB).
3. Experiment
Experimental Setup.
We evaluate the proposed TAGCN on the MDD dataset based on a 5-fold cross-validation strategy. The performance of MDD identification from age-matched HCs is measured by four metrics, i.e., accuracy (ACC), sensitivity (SEN), specificity (SPE), and area under the ROC curve (AUC).
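For reference, the three threshold-based metrics can be computed from the confusion counts as in the sketch below. Coding MDD as 1 and HC as 0 is an assumption of this example, as are the function name and the toy labels; AUC is omitted because it requires continuous scores rather than hard predictions.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for binary labels.

    Labels are assumed to be 1 for MDD (positive) and 0 for HC (negative),
    and both classes are assumed to be present in y_true.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "ACC": (tp + tn) / len(pairs),
        "SEN": tp / (tp + fn),   # true positive rate
        "SPE": tn / (tn + fp),   # true negative rate
    }

# Toy labels: 4 MDD and 6 HC subjects.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In a 5-fold setting, these values would typically be computed on each held-out fold and then averaged.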
Competing Method.
We first compare the proposed TAGCN method with two baseline methods based on static FC matrices, i.e., (1) support vector machine (SVM) with a Radial Basis Function (RBF) kernel, and (2) Clustering Coefficients (CC) with SVM (CC+SVM). CC not only measures the clustering degree of each node in a graph but can also be treated as a feature selection algorithm. Hence, we employ SVM with and without CC to discriminate MDD from HCs based on their static FC matrices. Specifically, each static FC matrix (corresponding to a specific subject) is constructed based on the Pearson’s correlation between the whole time series of each pair of pre-defined ROIs. The SVM method directly performs classification based on the static FC matrix. The CC+SVM method depends on the degree of network sparsity, where the sparsity parameter is chosen from {0.10, 0.15, ⋯, 0.40} according to cross-validation performance. The parameter C in SVM with the RBF kernel is chosen from {0.80, 0.85, ⋯, 3.00} via cross-validation, and we use default values for the other parameters.
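A static FC matrix of this kind can be computed with a single call to `np.corrcoef`, as in the sketch below. Flattening the upper triangle into a feature vector is a common convention assumed here for illustration (the text does not state how the matrix is fed to the SVM), and the random array merely stands in for one subject's ROI time series.

```python
import numpy as np

def static_fc_features(x):
    """Static FC matrix and its vectorized upper triangle.

    x: (N, M) array with one row per ROI and one column per time point.
    Returns the (N, N) Pearson correlation matrix and a vector holding
    its N * (N - 1) / 2 off-diagonal upper-triangular entries.
    """
    fc = np.corrcoef(x)                     # rows are treated as variables
    upper = np.triu_indices_from(fc, k=1)   # skip the diagonal of ones
    return fc, fc[upper]

# Random stand-in for one subject: 112 ROIs, 232 time points.
rng = np.random.default_rng(0)
x = rng.standard_normal((112, 232))
fc, features = static_fc_features(x)
print(fc.shape, features.shape)  # (112, 112) (6216,)
```

With 112 ROIs, the symmetric matrix contains 112 × 111 / 2 = 6216 distinct ROI pairs, which is far more features than the 533 available subjects.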
We also compare our TAGCN with two state-of-the-art GCN methods, including (1) sGCN [16] shown in Eq. (1), and (2) GAT [17] shown in Eq. (2). Both networks are tested on static FC matrices generated from rs-fMRI. Li et al. [19] found that spectral GCN models can be explained as a special form of Laplacian smoothing, which employs the features of each vertex as well as those of its neighbors. In order to use the brain functional network more effectively, we construct a KNN graph by connecting each vertex with its k-nearest neighbors, instead of a fully-connected graph, which cannot capture the node-centralized local topology via spectral GCNs. It should be noted that the graph topology (reflected by vertices and their connectivity) of such a group-level (rather than subject-level) KNN graph is shared by all subjects. The parameter k for constructing KNN graphs is chosen from {1, 2, ⋯, 30}. These networks contain three graph convolutional layers and one fully-connected layer. Besides, these three graph convolutional layers share the same input size so that the features can be easily interpreted. The number of heads in GAT is chosen from {2, 3, ⋯, 6} via cross-validation. The attention dropout rate is 0.6 and the negative slope of the leaky ReLU is 0.2.
Result.
In the left panel of Fig. 2, we report the disease classification results achieved by two traditional machine learning methods (i.e., SVM and CC+SVM), two GCN methods (i.e., sGCN and GAT), and our TAGCN. We further show the ROC curves and AUC values of these methods in the right panel of Fig. 2. From Fig. 2, one can make the following observations. First, GCN-based models substantially outperform the traditional methods (i.e., SVM and CC+SVM). For instance, these traditional methods (which do not consider graph topology information) perform at least 5% worse than the GCN-based models. This demonstrates the necessity and effectiveness of exploiting graph topology on FC. Second, GAT (which assigns different weights to different nodes/ROIs in a neighborhood) outperforms sGCN, suggesting that GAT might mitigate the negative influence of using a group-level adjacency matrix. Besides, our proposed adaptive learning strategy with a flexible brain connectivity topology achieves better performance than GAT and sGCN. This implies that modeling subject-level functional connectivity topology helps capture more discriminative features than modeling group-level topology.
Fig. 2.

Three metrics (i.e., accuracy, sensitivity, and specificity), ROC curves, and related AUC values of five different methods in the task of MDD vs. HC classification.
Ablation Study.
To evaluate the contributions of the three proposed matrices and the temporal learning strategy, we further compare TAGCN with four variants, including (1) TAGCN_noT based on the static FC matrix, i.e., ignoring temporal dynamic information, (2) TAGCN without the KNN adjacency matrix A in Eq. (3), denoted as TAGCN_noA, (3) TAGCN without the randomly initialized adjacency matrix R (TAGCN_noR), and (4) TAGCN without the similarity matrix S (TAGCN_noS). For a fair comparison, all GCN-related layers in the six GCN methods (i.e., all except GAT) are followed by a batch normalization (BN) layer and a ReLU layer. The experimental results are shown in Fig. 3.
Fig. 3.

Three metrics (i.e., accuracy, sensitivity, and specificity), ROC curves, and related AUC values of our TAGCN and its four variants in the task of MDD vs. HC classification.
Figure 3 suggests that TAGCN with temporal information improves the classification results, compared with TAGCN_noT using the static FC matrix. This confirms that dynamic fluctuation in FCs also contributes to discriminating MDD from HCs. In addition, TAGCN_noR (i.e., without the randomly initialized matrix R) achieves the highest specificity, indicating that topological information based on KNN may pay more attention to abnormal FC. Also, three variants of TAGCN (i.e., TAGCN_noR, TAGCN_noS, and TAGCN_noA) yield results comparable to those of TAGCN, suggesting that the three matrices (i.e., R, S, and A) in Eq. (3) provide complementary useful information for MDD identification.
As shown in the right panels of Figs. 2 and 3, our proposed TAGCN achieves good ROC performance and the best AUC value when compared to the competing methods. These results further suggest the effectiveness of TAGCN in MDD vs. HC classification.
4. Conclusion
In this paper, we propose a temporal-adaptive graph convolutional network (TAGCN) to mine spatial and temporal information using rs-fMRI time series. Specifically, the time-series data are first segmented with fixed-size sliding windows. Then, an adaptive GCN module is employed to generate flexible topological information within each sliding window. We further model the temporal patterns of each ROI within the whole time series to learn periodic changes of the brain. The proposed TAGCN can not only learn fully data-driven graph topology information but also effectively capture dynamic variations of brain fMRI data. Instead of sharing one group-level adjacency matrix, TAGCN with an adaptive GCN layer takes subject-level topological information (i.e., a subject-specific adjacency matrix) into consideration. Experimental results on the MDD dataset demonstrate that our method yields state-of-the-art performance in identifying MDD patients from healthy controls.
In the current work, we only focus on using rs-fMRI data to capture subject-level connectivity topology. Other modalities (e.g., structural MRI and diffusion tensor imaging) can also help uncover the neurobiological mechanisms of MDD by providing more direct structural connectivity topology. In the future, we will extend TAGCN to multi-modal brain imaging data. Moreover, it would be interesting to design other strategies to generate and segment fMRI time series to take better advantage of temporal dynamics.
Acknowledgements.
This work was partly supported by NIH grant (No. MH108560).
References
- 1. World Health Organization, et al.: Depression and Other Common Mental Disorders: Global Health Estimates. Technical report, World Health Organization (2017)
- 2. Otte C, et al.: Major depressive disorder. Nat. Rev. Dis. Primers 2(1), 1–20 (2016)
- 3. Gray JP, Müller VI, Eickhoff SB, Fox PT: Multimodal abnormalities of brain structure and function in major depressive disorder: a meta-analysis of neuroimaging studies. Am. J. Psychiatry 177(5), 422–434 (2020)
- 4. Gao S, Calhoun VD, Sui J: Machine learning in major depression: from classification to treatment outcome prediction. CNS Neurosci. Ther. 24(11), 1037–1052 (2018)
- 5. Sui J, et al.: Multimodal neuromarkers in schizophrenia via cognition-guided MRI fusion. Nat. Commun. 9(1), 1–14 (2018)
- 6. Jie B, Liu M, Shen D: Integration of temporal and spatial properties of dynamic connectivity networks for automatic diagnosis of brain disease. Med. Image Anal. 47, 81–94 (2018)
- 7. Zhang D, Huang J, Jie B, Du J, Tu L, Liu M: Ordinal pattern: a new descriptor for brain connectivity networks. IEEE Trans. Med. Imaging 37(7), 1711–1722 (2018)
- 8. Li G, et al.: Identification of abnormal circuit dynamics in major depressive disorder via multiscale neural modeling of resting-state fMRI. In: Shen D, et al. (eds.) MICCAI 2019. LNCS, vol. 11766, pp. 682–690. Springer, Cham (2019). 10.1007/978-3-030-32248-9_76
- 9. Wang M, Lian C, Yao D, Zhang D, Liu M, Shen D: Spatial-temporal dependency modeling and network hub detection for functional MRI analysis via convolutional-recurrent network. IEEE Transactions on Biomedical Engineering. IEEE (2019)
- 10. Jiao Z, et al.: Dynamic routing capsule networks for mild cognitive impairment diagnosis. In: Shen D, et al. (eds.) MICCAI 2019. LNCS, vol. 11767, pp. 620–628. Springer, Cham (2019). 10.1007/978-3-030-32251-9_68
- 11. Yao D, et al.: Triplet graph convolutional network for multi-scale analysis of functional connectivity using functional MRI. In: Zhang D, Zhou L, Jie B, Liu M (eds.) GLMI 2019. LNCS, vol. 11849, pp. 70–78. Springer, Cham (2019). 10.1007/978-3-030-35817-4_9
- 12. Ktena SI, et al.: Metric learning with spectral graph convolutions on brain connectivity networks. NeuroImage 169, 431–442 (2018)
- 13. Yan CG, et al.: Reduced default mode network functional connectivity in patients with recurrent major depressive disorder. Proc. Nat. Acad. Sci. 116(18), 9078–9083 (2019)
- 14. Yan CG, Wang XD, Zuo XN, Zang YF: DPABI: data processing & analysis for (resting-state) brain imaging. Neuroinform. 14(3), 339–351 (2016)
- 15. Parisot S, Ktena SI, Ferrante E, Lee M, Guerrero R, Glocker B, Rueckert D: Disease prediction using graph convolutional networks: application to autism spectrum disorder and Alzheimer’s disease. Med. Image Anal. 48, 117–130 (2018)
- 16. Kipf TN, Welling M: Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016)
- 17. Veličković P, Cucurull G, Casanova A, Romero A, Lio P, Bengio Y: Graph attention networks. arXiv preprint arXiv:1710.10903 (2017)
- 18. Shi L, Zhang Y, Cheng J, Lu H: Two-stream adaptive graph convolutional networks for skeleton-based action recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 12026–12035. IEEE (2019)
- 19. Li Q, Han Z, Wu XM: Deeper insights into graph convolutional networks for semi-supervised learning. In: Thirty-Second AAAI Conference on Artificial Intelligence (2018)
