Classification of schizophrenia by intersubject correlation in functional connectome

Gong‐Jun Ji; Xingui Chen; Tongjian Bai; Lu Wang; Qiang Wei; Yaxiang Gao; Longxiang Tao; Kongliang He; Dandan Li; Yi Dong; Panpan Hu; Fengqiong Yu; Chunyan Zhu; Yanghua Tian; Yongqiang Yu; Kai Wang

doi:10.1002/hbm.24527

2019 Jan 21;40(8):2347–2357. doi: 10.1002/hbm.24527


PMCID: PMC6865403 PMID: 30663853

Abstract

Functional connectomes have been suggested as fingerprints for individual identification. Accordingly, we hypothesized that subjects in the same phenotypic group have similar functional connectome features, which could help to discriminate schizophrenia (SCH) patients from healthy controls (HCs) and from depression patients. To this end, we included resting‐state functional magnetic resonance imaging data of SCH patients, depression patients, and HCs from three centers. We first investigated the characteristics of connectome similarity between individuals, and found higher similarity between subjects belonging to the same group (e.g., SCH–SCH) than between subjects belonging to different groups (e.g., HC–SCH). These findings suggest that the average connectome within a group (termed the group‐specific functional connectome [GFC]) may help in individual classification. Consistently, significant accuracy (75–77%) and area under curve (81–86%) were found in discriminating SCH from HC or depression patients by GFC‐based leave‐one‐out cross‐validation. Cross‐center classification further suggests good generalizability of the GFC classification. We additionally included normal aging data (255 young and 242 old subjects with different scanning sequences) to identify factors that could be improved for better classification performance; the findings emphasized the importance of increasing sample size rather than temporal resolution during scanning. In conclusion, our findings suggest that the average functional connectome across subjects contains group‐specific biological features and may be helpful in the clinical diagnosis of schizophrenia.

Keywords: classification, functional connectome, functional magnetic resonance imaging, multicenter, resting state, schizophrenia

1. INTRODUCTION

Functional magnetic resonance imaging (fMRI) is a powerful noninvasive technique to investigate how the human brain works. Primarily based on group‐level analysis, we have accumulated abundant knowledge about the universal principles of brain function (Raichle, 2009). However, it is still undetermined how these advances can be generalized to solve individualized problems, such as clinical diagnosis (Arbabshirani, Plis, Sui, & Calhoun, 2017; Finn et al., 2017).

Recently, it has been demonstrated that the whole‐brain pattern of resting‐state functional connectivity (rsFC) can be used to identify an individual from a data pool with an accuracy of up to 99% (Anderson, Ferguson, Lopez‐Larson, & Yurgelun‐Todd, 2011; Finn et al., 2015; Miranda‐Dominguez et al., 2014). This finding was further validated by studies with large sample sizes (Waller et al., 2017) and disease conditions (Kaufmann et al., 2017; Rosenberg et al., 2016). Thus, whole‐brain rsFC patterns could be viewed as a connectome fingerprint containing an individual's identity features (Miranda‐Dominguez et al., 2014). This reliability feature of rsFC patterns within a subject coexists with its variability across subjects. For instance, canonical correlation analysis indicated that the functional connectome pattern across subjects is associated with specific sets of demographic/psychometric measures (Smith et al., 2015). Based on these intrasubject and intersubject features of rsFC, we hypothesized that subjects with similar psychiatric states may have similar connectome patterns; and inversely, their common connectome features (termed as group‐specific functional connectome [GFC]) (Gratton et al., 2018) may help in discriminating whether an unknown individual has a similar psychiatric state to that of the cohort study. In support of this hypothesis, Parkinson, Kleinbaum, and Wheatley (2018) found that the similarity of neural responses to audiovisual movies decreased with increasing distance between individuals in their shared social network.

In this study, we used schizophrenia (SCH) as an example to show the potential of GFC‐based classification in clinical diagnosis. SCH is a major psychiatric disorder, and many MRI‐based algorithms have been developed for its automatic diagnosis. However, a recent review (Wolfers, Buitelaar, Beckmann, Franke, & Marquand, 2015) and our summary (Table S1, Supporting Information) indicated that most of these classification studies relied on small sample sizes (n < 60) from a single center. A major limitation of the within‐center design is the undetermined generalizability of the classification models (Gabrieli, Ghosh, & Whitfield‐Gabrieli, 2015). Recently, three multicenter studies with large sample sizes considered this issue in SCH classification. The accuracy of cross‐center prediction ranged from 70 to 81% (Nieuwenhuis et al., 2012; Rozycki et al., 2017; Zeng et al., 2018). Although the accuracy was significantly higher than that of a random classification, it was still far from clinical application; at the same time, future directions for improving the performance of these classification models were not thoroughly investigated in these studies.

Recently, the performance of diagnosis prediction methods has been examined against a gold standard predefined by conventional clinical approaches (e.g., The Diagnostic and Statistical Manual of Mental Disorders). However, the psychiatric diagnostic system, largely based on self‐reported symptoms and subjective assessment (Drysdale et al., 2017; Insel et al., 2010), may lead to a high percentage of misdiagnoses (Altamura & Goikolea, 2008; Hirschfeld, Lewis, & Vornik, 2003). In contrast, there is no doubt that young (<40 years) and old (>50 years) subjects belong to different age groups. Thus, aging may provide a model with a gold standard for evaluating novel classification methods (Garrett, Kovacevic, McIntosh, & Grady, 2010). Additionally, the abundant publicly available datasets for subjects in different age groups (Nooner et al., 2012; Wei et al., 2018) also enable us to investigate factors (e.g., sample size and scanning sequence) that may affect classification performance, and to suggest ways of further improving it.

Based on resting‐state fMRI (rs‐fMRI) datasets from multiple centers, this study aimed to estimate the performance of GFC‐based classification in discriminating SCH patients from HCs and from depression patients. We also investigated the neural mechanism underlying this classification and included aging datasets (young vs. old subjects) to reveal factors important for improving classification performance.

2. MATERIAL AND METHODS

2.1. Experiment design and subjects

This study included rs‐fMRI datasets from five centers: Anhui Medical University Site 1 (AMU1), AMU Site 2 (AMU2), Southwest University (SWU) (Wei et al., 2018), Nathan Kline Institute (NKI) (Nooner et al., 2012), and the Center of Biomedical Research Excellence (COBRE) (Cetin et al., 2014). This study was approved by the AMU Ethics Committee (see Appendix S1, Supporting Information for data screening). All subjects from AMU were informed of the format of the study and gave written consent. The study design and the analyses performed on each dataset are illustrated in Figure 1.

Figure 1. Study design and analysis. Multicenter SCH and aging data were included to characterize the connectome similarity between subjects and groups, as well as the performance of connectome‐based classification (Analysis 1). Particularly, the SWU and NKI data were used to test the influence of sample size and scanning sequence on classification performance (Analysis 2). The findings provide important points for future studies to improve the classification performance. AMU = Anhui Medical University; COBRE = Center of Biomedical Research Excellence; GFC = group‐specific functional connectome; HC = healthy control; MD = major depression; NKI = Nathan Kline Institute; SCH = schizophrenia; SWU = Southwest University

Patients with mental disorders and healthy controls (HCs) were included from AMU1 (85 HCs, 46 major depression patients, and 66 SCH patients), AMU2 (38 HCs and 56 SCH patients), and COBRE (62 HCs and 62 SCH patients) (Table 1). Depression data were included to test the differential diagnosis ability of GFC‐based classification for SCH. Thus, we performed discriminative analysis between SCH and depression patients but not between depression patients and HCs. For the normal aging dataset, young (20–40 years of age) and old (50–70 years of age) healthy subjects were included from SWU (187 young and 184 old adults) and NKI (68 young and 58 old adults; Table 2).

Table 1.

Demographic and clinical characteristics of patients and matched controls

Characteristic	AMU1 (SCH vs. HC)	AMU1 (SCH vs. MD)	AMU2 (SCH vs. HC)	COBRE (SCH vs. HC)
Sample size	66, 85	46, 46	56, 38	62, 62
Age (yrs) ^a	33.4 (7.2), 33.3 (9.7)	34.4 (7.8), 35.7 (9.2)	24.9 (6.2), 27.3 (7.1)	39.4 (14.0), 38.4 (11.6)
Sex (m/f)	27/39, 44/41	16/30, 16/30	28/28, 14/24	51/11, 51/11
Education (yrs) ^a	10.2 (3.1), 10.8 (3.9)	9.8 (3.0), 9.3 (4.2)	11.9 (2.8), 11.6 (2.2)	4.3 (1.3), 4.0 (1.4) ^b
Duration (yrs) ^a	7.2 (6.4), NA	7.3 (6.7), 2.7 (3.2)	4.4 (3.6), NA	16.3 (13.2) ^c, NA
BPRS ^a	41.3 (10.6), NA	43.4 (10.8), NA	NA, NA	NA, NA
PANSS total ^a	NA, NA	NA, NA	45.8 (8.8), NA	58.4 (5.0), NA
PANSS positive ^a	NA, NA	NA, NA	11.1 (3.4), NA	14.8 (5.6), NA
PANSS negative ^a	NA, NA	NA, NA	10.1 (3.3), NA	15.2 (15.7), NA
TR (ms)	2,000	2,000	2,000	2,000
Total volumes	240	240	240	150
Voxel size (mm)	3.44 × 3.44 × 4.6	3.44 × 3.44 × 4.6	3.75 × 3.75 × 4.0	3.75 × 3.75 × 4.55


BPRS = The Brief Psychiatric Rating Scale; f = female; HC = healthy control; m = male; MD = major depression patient; NA = not available; PANSS = positive and negative syndrome scale; SCH = schizophrenia patient; TR = time of repetition; yrs = years.

^a Data are means, with SE in parentheses.

^b The education levels in the COBRE sample were classified into eight grades. The average scores, 4.3 and 4.0, represent education at the college level.

^c The durations of the COBRE sample were calculated by subtracting the “age at first psychiatric illness” from the “age at the time of experiment.”

Table 2.

Demographic characteristics and scanning parameters of healthy participants

Characteristic	SWU sample (young, old)	NKI sample (young, old; scanning values listed as NKI₆₄₅ / NKI₁₄₀₀ / NKI₂₅₀₀)
Age (yrs) ^a	26.3 (5.3), 58.6 (4.7)	25.6 (5.6), 61.5 (6.1)
Sex (m/f)	76/111, 73/111	34/34, 29/29
TR (ms)	2,000	645 / 1,400 / 2,500
Total volumes	242	900 / 404 / 120
Voxel size (mm)	3.4 × 3.4 × 3.0	3.0 × 3.0 × 3.0 / 2.0 × 2.0 × 2.0 / 3.0 × 3.0 × 3.0


^a Data are means, with SE in parentheses.

SCH patients from AMU met the following inclusion criteria: (a) diagnosis of SCH using the Structured Clinical Interview for the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM‐IV); (b) on psychotropic medication at steady dosages for at least 8 weeks prior to the study; and (c) a verbal intelligence quotient >85, as measured by a Chinese version of the National Adult Reading Test. Exclusion criteria were as follows: (a) history of significant head trauma or neurological disorders; (b) alcohol or drug abuse; (c) focal brain lesions on T1‐ or T2‐weighted fluid‐attenuated inversion‐recovery MRIs; (d) recent aggression or other forms of behavioral dysfunction; (e) head motion exceeding 3 mm in translation or 3° in rotation during resting‐state fMRI scanning; and (f) Hamilton Anxiety Rating Scale or Hamilton Depression Rating Scale score >7.

Depression patients from AMU were initially recruited for a study of the mechanism of electroconvulsive therapy. Thus, the recruitment criteria were designed for electroconvulsive therapy. All patients met the diagnostic criteria for depression in DSM‐IV and showed resistance to drug therapy or a severe suicidal tendency. We excluded patients with substance dependence, pregnancy, life‐threatening somatic disease, neurological disorders, other comorbid mental disorders, or MRI‐related contraindications (Wei et al., 2014). All the depression patients received electroconvulsive therapy after the MRI data acquisition, and part of the sample has been reported in our previous work (Wei et al., 2014). In this study, these depression data were included for differential diagnosis between SCH and depression. The subject recruitment criteria of COBRE can be found at http://cobre.mrn.org/phase1/recruitment/index.html.

2.2. Image acquisition and preprocessing

All rs‐fMRI datasets were collected on 3.0T MRI scanners using the standard protocols described in Appendix S1, Supporting Information. During scanning, participants were awake and did not perform any task.

fMRI datasets were preprocessed by Data Processing & Analysis for Brain Imaging (DPABI) (Yan, Wang, Zuo, & Zang, 2016), which synthesizes procedures in SPM12 software (http://www.fil.ion.ucl.ac.uk/spm). The first five images were excluded to ensure steady‐state longitudinal magnetization, and the remaining images were corrected for temporal differences and head motion. Images were then normalized to standard Montreal Neurological Institute space (voxel size = 3 × 3 × 3 mm³) and were spatially smoothed (4 mm full‐width at half‐maximum). To remove physiological noise, we regressed out nuisance variables including 24 parameters obtained by rigid body head motion correction (Friston, Williams, Howard, Frackowiak, & Turner, 1996; Yan et al., 2013) and three average signals from white matter, cerebrospinal fluid, and whole brain area, respectively. To keep consistent with the processing of Finn et al. (2015) and our previous work (Ji et al., 2017; Ji, Liao, Chen, Zhang, & Wang, 2017; Ji, Yu, Liao, & Wang, 2017), we did not remove images according to head motion parameters.

To construct the functional connectome, we subdivided the whole brain into 268 regions of interest (ROIs) using the same template as Finn et al. (2015), and computed Pearson's correlation between the time series of each pair of ROIs. Finally, the unique elements (268 × 267/2 = 35,778) of the correlation matrix were transformed by Fisher's z‐transformation for further analysis. Notably, most of the samples from SWU did not cover the whole cerebellum; thus, only the 216 of the 268 ROIs that were covered in all subjects during scanning were included to construct the correlation matrix.
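The connectome construction described above can be sketched as follows. This is a minimal illustration rather than the authors' DPABI‐based pipeline: the function name `connectome_vector`, the synthetic time series, and the clipping of correlations before the Fisher transform are all assumptions made for the example.

```python
import numpy as np

def connectome_vector(roi_timeseries):
    """Return the Fisher z-transformed unique edges of an ROI correlation matrix.

    roi_timeseries: array of shape (n_timepoints, n_rois).
    """
    ts = np.asarray(roi_timeseries, dtype=float)
    corr = np.corrcoef(ts, rowvar=False)         # (n_rois, n_rois) Pearson matrix
    iu = np.triu_indices(corr.shape[0], k=1)     # strictly upper triangle
    # Assumption: clip to keep arctanh finite if any |r| equals 1.
    r = np.clip(corr[iu], -0.999999, 0.999999)
    return np.arctanh(r)                         # Fisher's z-transformation

rng = np.random.default_rng(0)
demo = rng.standard_normal((235, 268))           # synthetic data, not real fMRI
edges = connectome_vector(demo)
print(edges.shape)                               # one value per unique ROI pair
```

With 268 ROIs the vector has 268 × 267/2 = 35,778 entries; with the 216 ROIs retained for SWU it would have 216 × 215/2 = 23,220.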

2.3. Group‐level functional connectome

We defined GFC as the average functional connectome across subjects in the same phenotypic group. This is consistent with current findings that functional brain networks are dominated by stable group and individual factors, not by cognitive or daily variations (Gratton et al., 2018). We then performed the following computations and statistical analyses.

2.3.1. Intersubject similarity of the connectome

We computed Pearson's correlation (followed by Fisher's z‐transformation) between the functional connectomes of each pair of subjects (Figure 2). Mean correlations of the intergroup and intragroup conditions were compared by two‐sample t tests. For SCH data, the intergroup condition indicated correlations between SCH patients and HCs, while the intragroup condition indicated correlations between two SCH patients or two HCs. For normal aging data, the intergroup condition indicated correlations between young and old subjects, while the intragroup condition indicated correlations between two young subjects or two old subjects.
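As a hedged illustration of this comparison, the sketch below computes pairwise similarities and splits them into intragroup and intergroup sets. The function names and the synthetic "group pattern plus noise" data are assumptions for the example only; the resulting two sets could then be compared with a two‐sample t test (for instance `scipy.stats.ttest_ind`).

```python
import numpy as np

def pairwise_similarity(connectomes):
    """Fisher z-transformed Pearson correlation between every pair of subjects.

    connectomes: array of shape (n_subjects, n_edges).
    """
    r = np.corrcoef(np.asarray(connectomes, dtype=float))
    np.fill_diagonal(r, 0.0)                      # self-similarity is not used
    return np.arctanh(np.clip(r, -0.999999, 0.999999))

def split_intra_inter(sim, labels):
    """Split the unique subject pairs into intragroup and intergroup values."""
    labels = np.asarray(labels)
    i, j = np.triu_indices(len(labels), k=1)
    same = labels[i] == labels[j]
    return sim[i, j][same], sim[i, j][~same]

# Synthetic example: each group shares a pattern; subjects add independent noise.
rng = np.random.default_rng(1)
n_edges = 500
pattern_a, pattern_b = rng.standard_normal((2, n_edges))
group_a = pattern_a + rng.standard_normal((10, n_edges))
group_b = pattern_b + rng.standard_normal((10, n_edges))
sim = pairwise_similarity(np.vstack([group_a, group_b]))
intra, inter = split_intra_inter(sim, [0] * 10 + [1] * 10)
print(intra.mean() > inter.mean())
```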

Figure 2. Intersubject/group similarity of the functional connectome. (a) Functional connectome similarities were assessed between each pair of subjects. The similarities were then categorized into intergroup (e.g., young–old) and intragroup (e.g., old–old and young–young) conditions. The latter condition showed higher similarity compared with the former across all five centers (***p < 0.001, **p < 0.01). (b) Taking normal aging as an example, 499 subgroups with the same sample size (N) were randomly selected from the young group and two independent subgroups (target and test group) were separated from the old group. The same procedure was performed for schizophrenia and healthy control groups. The line graph indicates that group identification accuracy increased with the sample size in all seven datasets

2.3.2. Stability of GFC

To test whether the GFC becomes more reliable and robust as the sample size increases, a “group identification” analysis, as described by Finn et al. (2015), was performed for each center. For convenience, we use the SWU sample as an example to describe the analysis. Specifically: (a) N subjects were randomly drawn from the young group 499 times and the GFC was calculated each time, which constituted part of a database pool; (b) two independent subgroups, each with N subjects, were randomly drawn from the old group. The GFC of one subgroup was added to the database pool (500 GFCs in total) as “target data” and the other subgroup was used as “test data”; (c) we correlated the test data with each component in the database pool. The “group identification” was successful if the test data showed a higher correlation coefficient with the target data than with any other component in the database pool; (d) we repeated Steps (b) and (c) 500 times and the identification accuracy was defined as the number of successful identifications divided by 500; and (e) finally, identification accuracy under different sample sizes was obtained by repeating the above steps. The sample size was initially set at two and was increased by one subject each time. The upper limit of the sample size was less than half the sample size of the old group because: (i) the number of subjects should be equal between “test data” and “target data”; and (ii) each subject could only be assigned to one of the two groups (test or target).
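The group identification steps can be sketched roughly as below. This is an illustrative reading of the procedure, not the authors' code: the function names, the random seed, the synthetic data, and the reduced numbers of decoys and repeats in the demo are assumptions chosen to keep the example fast.

```python
import numpy as np

def corr_with_rows(vec, rows):
    """Pearson correlation between one vector and each row of a matrix."""
    v = (vec - vec.mean()) / vec.std()
    m = (rows - rows.mean(axis=1, keepdims=True)) / rows.std(axis=1, keepdims=True)
    return m @ v / vec.size

def group_identification_accuracy(own_group, other_group, n,
                                  n_decoys=499, n_repeats=500, seed=0):
    """Fraction of repeats in which the test GFC matches the target GFC best."""
    if 2 * n > len(own_group) or n > len(other_group):
        raise ValueError("n is too large for the available subjects")
    rng = np.random.default_rng(seed)
    # Step (a): GFCs of random n-subject draws from the other group.
    decoys = np.stack([
        other_group[rng.choice(len(other_group), size=n, replace=False)].mean(axis=0)
        for _ in range(n_decoys)
    ])
    hits = 0
    for _ in range(n_repeats):
        # Step (b): two non-overlapping subgroups from the group of interest.
        order = rng.permutation(len(own_group))
        target = own_group[order[:n]].mean(axis=0)
        test = own_group[order[n:2 * n]].mean(axis=0)
        # Step (c): success if the target (row 0) has the highest correlation.
        pool = np.vstack([target[None, :], decoys])
        hits += int(np.argmax(corr_with_rows(test, pool)) == 0)
    return hits / n_repeats

# Synthetic example with a distinct pattern per group.
rng = np.random.default_rng(2)
n_edges = 300
young = rng.standard_normal(n_edges) + rng.standard_normal((30, n_edges))
old = rng.standard_normal(n_edges) + rng.standard_normal((30, n_edges))
acc = group_identification_accuracy(old, young, n=5, n_decoys=49, n_repeats=50)
print(acc)
```

Repeating the call over increasing values of `n` would trace a curve of identification accuracy against sample size.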

2.3.3. Individual classification

We determined the classification ability of GFC on SCH and aging data by leave‐one‐out as well as cross‐center validation. In each iteration of the leave‐one‐out cross‐validation, one subject (e.g., an old subject) would be left out as a test subject. Pearson's correlation coefficient between test data and the two GFCs (e.g., belonging to the young and old group, respectively) then determined to which group (the one with larger coefficient) the test subject was assigned. Sensitivity, specificity, accuracy, and area under curve (AUC) of the receiver operating characteristic (ROC) were reported for indicating classification performance. Sensitivity and specificity referred to the percentage of cases that the GFC method correctly identified old (or SCH) and young (or HCs) subjects, respectively. Accuracy referred to the total proportion of samples correctly classified. The ROC curves provided information regarding the balance between sensitivity and the false positive rate (1‐specificity) across a range of decision thresholds.
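A minimal sketch of leave‐one‐out GFC classification is given below. The source does not state which continuous score was used to build the ROC curve, so the difference between the two correlation coefficients is used here as an assumption; the function names and synthetic data are likewise illustrative.

```python
import numpy as np

def pearson(a, b):
    return float(np.corrcoef(a, b)[0, 1])

def loo_gfc_classify(connectomes, labels):
    """Leave-one-out GFC classification.

    labels: 1 = group of interest (e.g., SCH or old), 0 = reference group.
    Returns predicted labels and a score (r with GFC of group 1 minus group 0).
    """
    X = np.asarray(connectomes, dtype=float)
    y = np.asarray(labels)
    preds = np.empty(len(y), dtype=int)
    scores = np.empty(len(y))
    for i in range(len(y)):
        keep = np.arange(len(y)) != i             # leave subject i out
        gfc_pos = X[keep & (y == 1)].mean(axis=0)
        gfc_neg = X[keep & (y == 0)].mean(axis=0)
        scores[i] = pearson(X[i], gfc_pos) - pearson(X[i], gfc_neg)
        preds[i] = int(scores[i] > 0)             # assign to the larger coefficient
    return preds, scores

def summarize(y, preds, scores):
    """Sensitivity, specificity, accuracy, and AUC."""
    y = np.asarray(y)
    sens = float(np.mean(preds[y == 1] == 1))
    spec = float(np.mean(preds[y == 0] == 0))
    acc = float(np.mean(preds == y))
    # AUC as the probability that a group-1 subject outscores a group-0 subject.
    diff = scores[y == 1][:, None] - scores[y == 0][None, :]
    auc = float((np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size)
    return sens, spec, acc, auc

# Synthetic example: a weak group pattern plus subject-specific noise.
rng = np.random.default_rng(3)
n_edges = 400
hc = 0.5 * rng.standard_normal(n_edges) + rng.standard_normal((20, n_edges))
sch = 0.5 * rng.standard_normal(n_edges) + rng.standard_normal((20, n_edges))
X = np.vstack([hc, sch])
y = np.array([0] * 20 + [1] * 20)
preds, scores = loo_gfc_classify(X, y)
sens, spec, acc, auc = summarize(y, preds, scores)
print(sens, spec, acc, auc)
```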

Cross‐center validation was performed by predicting old subjects (or SCH patients) in one center based on the GFC information of another center. Our results are primarily based on accuracy and AUC, both of which were tested for significance by permutation tests. Specifically, the group labels of the subjects in the training data were randomly permuted 1,000 times, and the significance of the observed accuracy and AUC was determined from the resulting null distributions.
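The cross‐center prediction and its permutation test might be sketched as follows. The helper names, the synthetic "centers," and the `(count + 1) / (n_perm + 1)` p‐value convention are assumptions for illustration; only accuracy is tested here, and the demo uses 200 permutations instead of 1,000 to stay quick.

```python
import numpy as np

def zscore_rows(m):
    m = np.asarray(m, dtype=float)
    return (m - m.mean(axis=1, keepdims=True)) / m.std(axis=1, keepdims=True)

def gfc_predict(train_X, train_y, test_X):
    """Assign each test subject to the training GFC it correlates with most."""
    train_y = np.asarray(train_y)
    gfcs = np.stack([train_X[train_y == 0].mean(axis=0),
                     train_X[train_y == 1].mean(axis=0)])
    r = zscore_rows(test_X) @ zscore_rows(gfcs).T / test_X.shape[1]
    return np.argmax(r, axis=1)                   # column index equals the label

def permutation_p(train_X, train_y, test_X, test_y, n_perm=1000, seed=0):
    """Observed accuracy and its p value under permuted training labels."""
    rng = np.random.default_rng(seed)
    observed = np.mean(gfc_predict(train_X, train_y, test_X) == test_y)
    null = np.array([
        np.mean(gfc_predict(train_X, rng.permutation(train_y), test_X) == test_y)
        for _ in range(n_perm)
    ])
    # Assumed convention: add one to numerator and denominator.
    p = (np.sum(null >= observed) + 1) / (n_perm + 1)
    return float(observed), float(p)

# Two synthetic "centers" sharing the same group patterns.
rng = np.random.default_rng(4)
n_edges = 200
pat0, pat1 = 0.3 * rng.standard_normal((2, n_edges))

def simulate(n):
    X = np.vstack([pat0 + rng.standard_normal((n, n_edges)),
                   pat1 + rng.standard_normal((n, n_edges))])
    return X, np.array([0] * n + [1] * n)

train_X, train_y = simulate(30)
test_X, test_y = simulate(30)
acc, p = permutation_p(train_X, train_y, test_X, test_y, n_perm=200)
print(acc, p)
```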

2.3.4. Edgewise contribution for classification

When performing subject classification, Pearson's correlation coefficients were computed between the connectivity pattern of a single subject (e.g., an SCH patient) and the GFCs of the two groups (e.g., the HC and SCH groups). The subject was then assigned to the group with the larger correlation coefficient. Computationally, the Pearson's correlation of two vectors is proportional to the sum of their element‐wise products, given that the two vectors are z‐score normalized (zero mean with unit SD). Thus, the classification score (CS) of each edge could be computed as R_i × R_c − R_i × R_u, where R_i, R_c, and R_u are the normalized correlation coefficients of one edge in subject i, in the group with a similar phenotype to subject i (R_c), and in the group without a similar phenotype (R_u), respectively. The positive CS values were summed across classifications to indicate the total contribution of each edge. Negative values were excluded because of their negative contribution to correct classification. Finally, the edgewise CS was illustrated within and between networks, as defined in previous studies (Finn et al., 2015; Shen, Tokoglu, Papademetris, & Constable, 2013): specifically, the medial frontal, frontoparietal, default mode, subcortical–cerebellar, motor, and visual networks. The visual network included three visual components from the work of Finn et al. (2015). The CS for the psychometric model was first computed for the AMU1, AMU2, and COBRE datasets separately, and was then averaged across centers. The same processes were performed in the SWU and NKI₂₅₀₀ datasets.
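The edgewise decomposition can be checked numerically. In the sketch below (function names and random vectors are assumptions for illustration), the mean of the edgewise CS equals the difference between the two Pearson coefficients when population SDs are used for the z‐scores, which is why a positive CS at an edge pushes the subject toward the correct group.

```python
import numpy as np

def zscore(v):
    v = np.asarray(v, dtype=float)
    return (v - v.mean()) / v.std()               # population SD (ddof=0)

def classification_score(subject, gfc_own, gfc_other):
    """Edgewise CS = R_i * R_c - R_i * R_u on z-scored vectors."""
    r_i, r_c, r_u = zscore(subject), zscore(gfc_own), zscore(gfc_other)
    return r_i * r_c - r_i * r_u

rng = np.random.default_rng(5)
n_edges = 1000
subject, gfc_own, gfc_other = rng.standard_normal((3, n_edges))
cs = classification_score(subject, gfc_own, gfc_other)

r_c = np.corrcoef(subject, gfc_own)[0, 1]
r_u = np.corrcoef(subject, gfc_other)[0, 1]
print(np.isclose(cs.mean(), r_c - r_u))

# Only positive contributions are kept before summing across classifications.
contribution = np.where(cs > 0, cs, 0.0)
```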

2.3.5. Factors affecting individual classification

Sample size

Using the SWU sample, we tested whether the classification performance could be improved by increasing the sample size. The initial sample size was the smallest size corresponding to 100% accuracy in “group identification” (i.e., 50), after which the sample size was increased by five subjects in each step until it equaled the total number of old subjects (i.e., 184). In each step, N subjects were randomly drawn from each group 100 times, with a guarantee that no significant differences in gender existed between the groups. Finally, the average accuracy and the AUC were reported as measures of classification performance.
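A rough sketch of this subsampling analysis is shown below. It is a simplification under stated assumptions: gender matching between the drawn groups is omitted, only accuracy is tracked, and the function names, sample sizes, and synthetic data are hypothetical. The leave‐one‐out GFC is obtained by subtracting each subject from the group sum.

```python
import numpy as np

def rowcorr(A, B):
    """Row-wise Pearson correlation between two equally shaped matrices."""
    A = A - A.mean(axis=1, keepdims=True)
    B = B - B.mean(axis=1, keepdims=True)
    return (A * B).sum(axis=1) / np.sqrt((A ** 2).sum(axis=1) * (B ** 2).sum(axis=1))

def loo_accuracy(X0, X1):
    """Leave-one-out GFC accuracy for two groups (rows are subjects)."""
    correct = 0
    for own, other in ((X0, X1), (X1, X0)):
        loo_gfc = (own.sum(axis=0) - own) / (len(own) - 1)   # GFC without each subject
        other_gfc = np.broadcast_to(other.mean(axis=0), own.shape)
        correct += int(np.sum(rowcorr(own, loo_gfc) > rowcorr(own, other_gfc)))
    return correct / (len(X0) + len(X1))

def accuracy_by_sample_size(X0, X1, sizes, n_draws=100, seed=0):
    """Mean leave-one-out accuracy over random draws of n subjects per group."""
    rng = np.random.default_rng(seed)
    out = {}
    for n in sizes:
        accs = [loo_accuracy(X0[rng.choice(len(X0), size=n, replace=False)],
                             X1[rng.choice(len(X1), size=n, replace=False)])
                for _ in range(n_draws)]
        out[n] = float(np.mean(accs))
    return out

# Synthetic example with a deliberately weak group pattern.
rng = np.random.default_rng(6)
n_edges = 200
young = 0.15 * rng.standard_normal(n_edges) + rng.standard_normal((60, n_edges))
old = 0.15 * rng.standard_normal(n_edges) + rng.standard_normal((60, n_edges))
curve = accuracy_by_sample_size(young, old, sizes=[5, 20, 40], n_draws=20)
print(curve)
```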

MRI scanning sequence

Using the NKI database, we included subjects who took part in all fMRI experiments using three different protocols. According to the “Time of Repetition,” they were termed as protocol NKI₆₄₅, NKI₁₄₀₀, and NKI₂₅₀₀, respectively. The performance of GFC‐based classification was estimated by these datasets independently. Additionally, cross‐center validation was also performed using the SWU as training data and the NKI as test data. The classification accuracy and the AUC were estimated by permutation tests.

Data processing parameter

Two analyses were performed to test the effects of global brain signal regression and frame‐wise head motion, respectively, on the classification of SCH patients. In the first analysis, we kept all processing steps except global signal regression. In the second analysis, we performed volume censoring (“scrubbing”) according to frame‐wise displacement (FD) (Power, Barnes, Snyder, Schlaggar, & Petersen, 2012). Specifically, image frames with a large FD (>0.5 mm) were identified as bad time points (Yan et al., 2013); each bad time point, together with the one preceding and the two following time points, was deleted to exclude confounding motion‐related bias. Subjects were excluded if fewer than 100 volumes remained.
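The censoring rule can be illustrated with a short sketch. The function names and the toy FD trace are assumptions; the threshold, the one‐before/two‐after window, and the 100‐volume minimum follow the description above.

```python
import numpy as np

def scrub_mask(fd, threshold=0.5, n_before=1, n_after=2):
    """Boolean mask of volumes to keep after FD-based censoring."""
    fd = np.asarray(fd, dtype=float)
    keep = np.ones(fd.size, dtype=bool)
    for t in np.flatnonzero(fd > threshold):
        # Drop the bad frame, the one before it, and the two after it.
        keep[max(t - n_before, 0):t + n_after + 1] = False
    return keep

def has_enough_data(keep, min_volumes=100):
    """A subject is retained only if at least min_volumes frames remain."""
    return int(np.sum(keep)) >= min_volumes

fd_demo = [0.1, 0.1, 0.1, 0.9, 0.1, 0.1, 0.1, 0.1]   # toy FD trace in mm
keep = scrub_mask(fd_demo)
print(keep.tolist())
# [True, True, False, False, False, False, True, True]
```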

3. RESULTS

3.1. Demographics and clinical characteristics

Chi‐square tests indicated similar male/female ratios between young and old groups in both SWU (χ² = 0.04, p = 0.85) and NKI (χ² = 0, p = 1) samples. No significant age, education, or male/female ratio difference existed between SCH patients and HCs in AMU1 (t = 0.02, p = 0.99; t = 1.00, p = 0.32; χ² = 1.76, p = 0.18), AMU2 (t = 1.68, p = 0.10; t = 0.54, p = 0.59; χ² = 1.59, p = 0.21), or COBRE (t = 0.41, p = 0.68; χ² = 0, p = 1). In AMU1, 46 SCH patients and 46 major depression patients were matched for differential diagnosis. No significant age (t = 0.74, p = 0.46) or male/female ratio (χ² = 0, p = 1) differences existed between the SCH and depression groups, but disease duration was higher (t = 4.28, p < 0.01) in the SCH group.

3.2. Inter‐subject similarity of the connectome

The connectome similarity between intergroup subjects (e.g., old–young) and intragroup (e.g., old–old) subjects was compared in the SWU, NKI, AMU1, AMU2, and COBRE samples. All analyses showed significantly higher correlation in intragroup than intergroup conditions (Table S2, Supporting Information and Figure 2a).

3.3. Stability of the GFC

Overall, we found that the accuracy of group identification increased with sample size in an “S” shape (Figure 2b). Within the SWU data, the old group could be identified with 100% accuracy once the sample size was increased to 50 subjects. The highest accuracies for the AMU1, AMU2, and COBRE datasets were 98, 99, and 90%, reached when the sample size was 33, 26, and 28, respectively. The curves from the three NKI datasets were quite similar. All accuracies increased to 96% when 29 subjects were in the test/target groups.

3.4. Diagnosing individuals with SCH

GFC‐based classification differentiated SCH patients from HCs in the AMU1 dataset with 78% accuracy and 84% AUC; in the AMU2 dataset with 76% accuracy and 86% AUC; and in the COBRE dataset with 77% accuracy and 81% AUC (Table 3, all p < 0.001; Figure 3a). Additionally, the GFC discriminated SCH from depression patients in the AMU1 dataset with 76% specificity, 74% sensitivity, 75% accuracy, and 81% AUC (all p < 0.005, Figure 3a). Cross‐center classification showed that the GFC information from AMU1 (which had a large sample size) predicted SCH patients in AMU2 (accuracy = 73%, AUC = 81%) and in COBRE (accuracy = 77%, AUC = 83%) at levels comparable to the within‐center performances (all p < 0.001). Full cross‐center results are presented in Table 3 (Figure 3b–d).

Table 3.

Classification performance within and between centers

Training data	Test data (specificity/sensitivity/accuracy/AUC)
Young vs. old subjects
	SWU	NKI₆₄₅	NKI₁₄₀₀	NKI₂₅₀₀
SWU	87/83/85/92	99/24/64/80	91/33/55/72	91/71/82/86
NKI₆₄₅	88/36/63/74	84/76/80/89	–	–
NKI₁₄₀₀	90/27/59/69	–	81/76/79/88	–
NKI₂₅₀₀	96/33/65/80	–	–	90/74/83/87
SCH patients vs. healthy controls
	AMU1	AMU2	COBRE
AMU1	77/78/78/84	80/63/73/81	69/86/77/83
AMU2	76/71/73/82	68/80/76/86	60/69/65/74
COBRE	65/64/64/69	70/66/68/72	71/82/77/81


Bold numbers indicate within center performance. AMU1 = Anhui Medical University Site 1; AUC = area under curve; COBRE = Center of Biomedical Research Excellence; HC = healthy control; SCH = schizophrenia patient.

Figure 4. Classification performance of discriminating old from young adults. High AUCs were observed both within the SWU (a) and the NKI (b) datasets. Classification accuracy could be improved by increasing sample size (c). Scanning sequence had a significant effect on cross‐center (d, e) but not within‐center classification (b). Cross‐center classification indicated high AUC (up to 86%) when the two datasets were acquired by a similar scanning sequence (NKI₂₅₀₀ and SWU)

3.5. Discriminating old subjects from young adults

Based on the GFC information, old individuals in the SWU dataset could be discriminated from young individuals with significant accuracy (85%, p < 0.001) and AUC (92%, p < 0.001, Figure 4a). In the NKI datasets, the classification accuracy and the AUC were 80 and 89%, respectively, for NKI₆₄₅; 79 and 88%, respectively, for NKI₁₄₀₀; and 83 and 87%, respectively, for NKI₂₅₀₀ (Figure 4b, all p < 0.001).

Figure 3. Classification performance within and across centers. Within all centers (AMU1, AMU2, and COBRE), the GFC‐based method could discriminate schizophrenia (SCH) patients from healthy controls with an AUC > 0.80 (a). Particularly, a high AUC (0.81) was also found when discriminating SCH from major depression patients within the AMU1 center (a). Cross‐center classification shows a high AUC when using the (b) AMU1 and (c) AMU2 as training data but shows a relatively low AUC when using COBRE as training data (d)

We recomputed the functional connectome for the NKI dataset using 216 nodes from the SWU dataset, then performed cross‐center prediction for old subjects. Based on the GFC information of the SWU data, the classification accuracy and the AUC were 64 and 80%, respectively, for NKI₆₄₅ (both p < 0.01); 55 and 72%, respectively, for NKI₁₄₀₀ (both p < 0.01); and 82 and 86%, respectively, for NKI₂₅₀₀ (both p < 0.001). Inversely, NKI datasets also showed significant predictions for the SWU dataset in both accuracy and the AUC (Table 3).

3.6. Edgewise contribution for classification

In the demographic model (young vs. old), edgewise contribution was the average CS from the NKI and SWU datasets (Figure 5a). In the psychometric model (SCH vs. HC), edgewise contribution was the average CS from the AMU1, AMU2, and COBRE datasets (Figure 5b). To show the edges with a high contribution to classification, edges with a CS above the 99.9th percentile are illustrated in Figure 5. The highest contributions in the aging data mainly belonged to the subcortical–cerebellar, motor, and visual networks, while those in the SCH data were distributed across all six networks.

Edgewise contribution for classification. Numbers 1–6 indicate the medial frontal, frontoparietal, default mode, subcortical–cerebellar, motor, and visual network, respectively. The matrix shows edgewise contribution for each classification, while the circular diagram indicates the top contributors within (colored lines) and between (gray lines) networks. The circular diagram was designed using DynamicBC software, http://restfmri.net/forum/DynamicBC (Liao et al., 2014) [Color figure can be viewed at http://wileyonlinelibrary.com]