Schizophrenia Bulletin. 2022 Aug 4;48(6):1217–1227. doi: 10.1093/schbul/sbac096

Morphometric Integrated Classification Index: A Multisite Model-Based, Interpretable, Shareable and Evolvable Biomarker for Schizophrenia

Yingying Xie 1,2, Hao Ding 3,4,5, Xiaotong Du 6,7, Chao Chai 8,9, Xiaotong Wei 10,11, Jie Sun 12,13, Chuanjun Zhuo 14, Lina Wang 15, Jie Li 16, Hongjun Tian 17, Meng Liang 18,19,20, Shijie Zhang 21, Chunshui Yu 22,23,24, Wen Qin 25,26,✉
PMCID: PMC9673259  PMID: 35925032

Abstract

Background and Hypothesis

Multisite sharing of massive schizophrenia neuroimaging data is becoming critical for understanding the pathophysiological mechanisms of schizophrenia and for making an objective diagnosis; however, it remains challenging to obtain a generalizable, interpretable, shareable, and evolvable neuroimaging biomarker for schizophrenia diagnosis.

Study Design

A Morphometric Integrated Classification Index (MICI) was proposed as a potential biomarker for schizophrenia diagnosis based on structural magnetic resonance imaging data of 1270 subjects from 10 sites (588 schizophrenia patients and 682 normal controls). An optimal XGBoost classifier combined with a sample-weighted SHapley Additive exPlanations (SHAP) algorithm was used to construct the MICI measure.

Study Results

The MICI measure achieved performance comparable to that of the sample-weighted ensembling model and the merged model based on raw data (Delong test, P > 0.82), while outperforming the single-site models (Delong test, P < 0.05), in both the independent-sample testing datasets from the 9 sites and the independent-site dataset (generalizable). In addition, the performance of this measure gradually increased as new sites were embedded (evolvable). Finally, MICI was strongly associated with the severity of schizophrenia brain structural abnormality, with the patients’ positive and negative symptoms, and with the brain expression profiles of schizophrenia risk genes (interpretable).

Conclusions

In summary, the proposed MICI biomarker may provide a simple and explainable way to support clinicians in objectively diagnosing schizophrenia. We also developed an online model-sharing platform to promote biomarker generalization and provide free individual prediction services (http://micc.tmu.edu.cn/mici/index.html).

Keywords: schizophrenia, biomarker, multi-site, structural magnetic resonance imaging, machine learning, morphometric integrated classification index

Introduction

Schizophrenia is a severe chronic mental disorder that places heavy economic and social burdens on patients, families, and the community.1,2 Magnetic resonance imaging (MRI) has been frequently used to identify brain structural damage in schizophrenia because of its high spatial resolution and multi-dimensional measures. Early studies reported extensive structural abnormalities in schizophrenia,3–5 suggesting that MRI-derived structural measures are potential biomarkers for the diagnosis, treatment evaluation, and prognosis prediction of schizophrenia.6–8 In addition, recent studies combining MRI and machine learning (ML) methods have shown promising performance for schizophrenia classification.9–11

Previous MRI-based studies have shown great diversity in both brain structural damage and classification performance for schizophrenia,12–14 which leads to poor generalization of single-site studies.15–17 Recent advances in neuroimaging and ML have shown that multisite big data can minimize the effects of heterogeneity across sites and thus improve the generalization of predictions.18–21 However, sharing multisite raw MRI data is hindered by many factors.17,22,23 For example, massive storage, networking, and computing resources are required, leading to high human and resource costs for research and clinical institutions. In addition, raw MRI data contain sensitive, personally identifiable information, even when de-identification techniques are applied.24–26 A candidate solution is to share anonymous intermediate summary data rather than sensitive raw data. A successful case of intermediate neuroimaging data sharing is the Enhancing Neuro Imaging Genetics Through Meta-Analysis (ENIGMA) Consortium, which pools tens of thousands of neuroimaging datasets from sites around the world by sharing their MRI statistical results for meta-analysis (http://enigma.ini.usc.edu/).27 However, it is still challenging to find an effective way to ensemble multisite intermediate data so that prediction performance is comparable to or higher than that obtained with the original MRI data.

Another challenge of using structural MRI data for schizophrenia diagnosis is the poor interpretability of multivariable ML models.28,29 Interpretability of a biomarker is critically important for clinical practice, as it can increase understanding of the biological and clinical correlations between the biomarker and the disease and provide insight into how a model can be improved.30 With the rapid progress of machine learning and deep learning, many models can effectively discriminate schizophrenia patients from healthy controls11,31; however, most of them are “black boxes” whose algorithms and parameters are too complicated for a psychiatrist to comprehend, which is especially true for deep learning models composed of millions of nonlinearly combined parameters.32 To address the “black box” issue, several methods have recently been proposed to help users interpret ML predictions.33–36 For example, a recent study constructed a neuroimaging biomarker, the “FSA score,” based on the classification distance to the separating hyperplane of a support vector machine classifier and showed convincing results on both biological meaning and classification for schizophrenia.37 More recently, the SHAP (SHapley Additive exPlanations) framework was proposed to assign each feature an additive importance value for a particular prediction in a single model.36 However, the SHAP framework is designed to interpret a single model, and its extension to multisite models remains to be explored. In addition, its biological and clinical relevance needs further investigation.

This study aimed to extend the SHAP framework to multisite ML models to construct a generalizable and explainable neuroimaging biomarker named the Morphometric Integrated Classification Index (MICI) (see flowchart in figure 1). Based on structural MRI (sMRI) data of 1270 subjects (588 schizophrenia patients and 682 healthy controls) from 10 sites (Supplementary Tables S1, S2, Supplementary figure S1), our findings highlighted the generalizability, interpretability, evolvability, and shareability of the proposed MICI measure for schizophrenia diagnosis.

Figure 1.

Flowchart of the study design. Abbreviations: MICI, Morphometric Integrated Classification Index; NC, normal control; SHAP, SHapley Additive exPlanation; SZ, Schizophrenia.

Methods

Detailed descriptions of the methods are provided in the Supplementary Materials.

Results

Construction of the MICI and performance evaluation

A total of 10 datasets were used in this study, including three local datasets and seven public neuroimaging datasets.38–43 Schizophrenia was diagnosed according to the Diagnostic and Statistical Manual of Mental Disorders-IV (DSM-IV). After quality control (Supplementary figure S1), 588 SZ and 682 NC were finally included; the distributions of the contrast-to-noise ratio (CNR) and signal-to-noise ratio (SNR) at each site are shown in Supplementary figure S2. The detailed sources and scanning parameters of the sMRI data and the demographic information of the included participants are shown in Supplementary tables S1 and S2.

A total of 484 sMRI-derived morphometric features were chosen to construct the ensembling model and its derived MICI measure: 444 cortical features derived from the Destrieux atlas (average cortical thickness [148], total cortical surface area [148], and cortical volume [148]), 33 subcortical features derived from the Gaussian Classifier Atlas, and 7 whole-brain features (Supplementary table S3). First, we tested MRI-based machine learning models previously applied to schizophrenia to identify the best classification model for a single site (Supplementary table S4).44–50 Among the 11 candidate classifiers, XGBoost achieved the highest performance in both the single-site and merged models (Delong test, P < 0.05) (figure 2A, Supplementary figure S3A, B). Thus, the XGBoost classifier was chosen for multisite model ensembling.

Figure 2.

Classification performance of the multisite ensembling model and MICI prediction. (A) The classification performance of different single-site classifiers. (B) The classification performance of different multisite models. * The AUC of an independent-site dataset (Local3). (C) The intra-class correlation of AUCs between the MICI prediction and the SWEM in independent-sample datasets. (D) and (E) The average AUCs of the SWEM (D) and the MICI prediction (E) increase with the number of embedded sites. The box margins represent the 25th and 75th percentiles, and the line margins represent 1.5 times the interquartile range. (F) The intra-class correlation of AUCs between the SWEM and the MICI prediction across the 100 iterations. (G) and (H) The average AUCs of the SWEM (G) and the MICI prediction (H) in an independent testing site as a function of the number of embedded sites. The gray background represents one standard deviation. (I) The intra-class correlation of the prediction probability between the MICI and the SWEM in the independent-site dataset. Abbreviations: AUC, area under curve; MICI, Morphometric Integrated Classification Index; SWEM, sample-weighted ensembling model.

We then proposed a sample-weighted ensembling model (SWEM) to integrate the multiple pre-trained “single-site” classifiers into a single prediction, taking into account the sample size used for training at each site. To evaluate the proposed ensembling model, we first compared the classification performance of the SWEM with that of the traditional single-site models (satellite models) and the multisite raw data-based model (merged model) in the independent-sample testing datasets from the 9 sites and in an independent-site testing dataset. The SWEM achieved an area under the curve (AUC) (independent-sample: 0.82 ± 0.08; independent-site: 0.83) comparable to that of the merged model (independent-sample: 0.81 ± 0.11; independent-site: 0.82) (Delong test, P > 0.82) and a higher AUC than the satellite models (independent-sample: within-site AUC = 0.77 ± 0.07, cross-site AUC = 0.70 ± 0.08; independent-site: 0.72 ± 0.06) (Delong test, P < 0.05) (figure 2B and Supplementary tables S5–S7). To explore whether the classification performance of the proposed SWEM increases as new sites are embedded (evolvability), we randomly embedded the sites into the ensembling model over 100 iterations. The average classification AUC in the independent-sample testing datasets gradually increased from 0.72 (1 site) to 0.82 (9 sites) as new sites were added to the framework (figure 2D). This increasing pattern was reliably validated in each testing dataset (Supplementary figure S4) and was replicated in the independent-site testing dataset (AUC = [0.73:0.83]) (figure 2G).
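The sample-weighted ensembling idea can be illustrated with a minimal sketch. The exact SWEM formula is given in the Supplementary Methods and is not reproduced here; the code below assumes the simplest reading, a weighted average of the per-site predicted probabilities with weights proportional to each site's training sample size. The function name `swem_probability` and the example numbers are hypothetical.

```python
from typing import Sequence


def swem_probability(site_probs: Sequence[float], site_sizes: Sequence[int]) -> float:
    """Sample-size-weighted average of per-site predicted probabilities.

    Assumption (not taken from the paper): the weight of each single-site
    model is its training sample size divided by the total sample size.
    """
    if len(site_probs) != len(site_sizes) or not site_probs:
        raise ValueError("need one probability and one sample size per site")
    total = sum(site_sizes)
    return sum(p * n for p, n in zip(site_probs, site_sizes)) / total


# Hypothetical example: three single-site classifiers score the same subject.
probs = [0.80, 0.60, 0.30]   # predicted probability of schizophrenia per site model
sizes = [200, 100, 100]      # training sample size of each site model
print(swem_probability(probs, sizes))
```

Only the per-site predictions and sample sizes enter the calculation, which is why such an ensemble can be built without exchanging raw MRI data.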

A measure named MICI was derived from the SWEM to increase interpretability. The MICI values were calculated with the SHapley Additive exPlanation (SHAP) method,36 which is based on cooperative game theory51 and splits the final prediction into individual additive contributions (SHAP values) of each valid feature. In the present study, we extended the SHAP value to a multisite scenario by ensembling multiple “single-site” SHAP values into a scalar named the MICI value for each feature (MICIF) or each subject (MICIP), taking into account the sample size of the pre-trained model of each site (see the Methods for details). The proposed MICIP measure achieved performance (independent-sample AUC = 0.82 ± 0.08, independent-site AUC = 0.83) comparable to that of the SWEM and the merged model (Delong test, all P > 0.82) and higher than that of the satellite models (Delong test, all P < 0.05) (figure 2B, Supplementary figure S3C, D). In addition, as shown in figure 2C, the classification performance of the SWEM and that of the MICI were highly consistent across testing sites (ICC, R ≈ 1.00, P < 0.001). To test whether the CNR or SNR affects the classification performance, we also calculated the MICI after regressing out the CNR and SNR from the morphometric features. The classification performance of the new MICI showed no significant difference from that of the MICI without CNR/SNR regression (Delong test, P = 0.28 for the independent-sample testing datasets, P = 0.54 for the independent-site dataset) (Supplementary figure S5).
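As a rough illustration of how single-site SHAP values might be combined, the sketch below assumes that the feature-wise value (MICIF) is the sample-size-weighted average of the SHAP values produced by each site's model for one subject, and that the participant-wise value (MICIP) is the sum of the feature-wise values. Both assumptions, the function names, and the numbers are illustrative; the authors' exact definition is in the Supplementary Methods and in their public code.

```python
from typing import List, Sequence


def mici_feature_values(site_shap: Sequence[Sequence[float]],
                        site_sizes: Sequence[int]) -> List[float]:
    """Combine per-site SHAP vectors for one subject into feature-wise values.

    site_shap[s][j] is the SHAP value of feature j under site s's model.
    Assumption: sites are weighted by training sample size.
    """
    total = sum(site_sizes)
    n_features = len(site_shap[0])
    return [
        sum(shap[j] * n for shap, n in zip(site_shap, site_sizes)) / total
        for j in range(n_features)
    ]


def mici_participant_value(mici_f: Sequence[float]) -> float:
    """Assumed participant-wise value: the sum of the feature-wise values."""
    return sum(mici_f)


# Hypothetical subject with 3 features, explained by 2 single-site models.
site_shap = [[0.4, -0.1, 0.2],   # SHAP values from the model of site 1
             [0.2, 0.1, 0.0]]    # SHAP values from the model of site 2
site_sizes = [300, 100]
mici_f = mici_feature_values(site_shap, site_sizes)
print(mici_f, mici_participant_value(mici_f))
```

Because SHAP values are additive within each model, a weighted sum of them stays additive across features, which is what allows a single scalar per subject to be decomposed back into per-feature contributions.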

We also evaluated the evolvability of the MICI prediction. In the independent-sample testing datasets, the performance of the MICI increased monotonically as new sites were gradually embedded (AUC = [0.73:0.82]) (figure 2E). Moreover, the SWEM and MICI performances were highly similar (ICC = 0.99, P < 0.001) (figure 2F). The evolvability of the MICI was replicated in the independent-site dataset (AUC = [0.73:0.83], ICC = 0.99, P < 0.001) (figure 2H, I).
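The evolvability analysis (random embedding order, repeated 100 times, AUC recorded after each added site) can be sketched as follows. This is a simplified illustration on synthetic data, not the authors' pipeline: the ensemble is assumed to be a sample-size-weighted average of site probabilities, the AUC is computed by direct pairwise comparison, and all names and numbers are hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as 0.5 (Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evolvability_curve(site_probs, site_sizes, labels, n_iter=100, seed=0):
    """Mean AUC after embedding 1, 2, ..., S sites in random order.

    site_probs[s][i] is the probability that site s's model assigns to test
    subject i. Sites are combined by a sample-size-weighted average.
    """
    rng = random.Random(seed)
    n_sites = len(site_probs)
    sums = [0.0] * n_sites
    for _ in range(n_iter):
        order = rng.sample(range(n_sites), n_sites)  # random embedding order
        for k in range(1, n_sites + 1):
            chosen = order[:k]
            total = sum(site_sizes[s] for s in chosen)
            ensemble = [
                sum(site_probs[s][i] * site_sizes[s] for s in chosen) / total
                for i in range(len(labels))
            ]
            sums[k - 1] += auc(labels, ensemble)
    return [v / n_iter for v in sums]


# Synthetic stand-in data: 9 noisy site models scoring 40 test subjects.
rng = random.Random(42)
labels = [1] * 20 + [0] * 20
site_sizes = [60, 80, 100, 120, 140, 160, 180, 200, 230]
site_probs = [
    [min(1.0, max(0.0, 0.5 + 0.15 * (2 * y - 1) + rng.gauss(0, 0.25))) for y in labels]
    for _ in site_sizes
]
curve = evolvability_curve(site_probs, site_sizes, labels)
for k, value in enumerate(curve, start=1):
    print(f"{k} site(s): mean AUC = {value:.3f}")
```

With independent noise across site models, averaging more models tends to raise the AUC, which is one plausible mechanism behind the increasing pattern reported above; the synthetic numbers carry no empirical meaning.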

MICI differences between the SZ and NC groups

To explore the potential biological meaning of the proposed MICI value, we first compared the participant-wise MICI (MICIP) and feature-wise MICI (MICIF) values between the SZ and NC groups using two-sample t-tests. The MICIP in the SZ group was significantly higher than that in the NC group (T = 22.34, P = 1.77E-93), and this difference was replicated at each site (figure 3A). At the feature level, compared with the NC group, the SZ group showed increased MICIF values derived from cortical volume and thickness in the insula and the prefrontal, sensory, and parietal cortices, and from subcortical volume in regions including the putamen, pallidum, and lateral ventricle (P < 0.05, FWE correction for independent comparisons, figure 3B). Furthermore, we compared the inter-group differences in raw morphometric features and found significantly increased volumes of subcortical structures such as the lateral ventricle, putamen, and pallidum, and widely decreased thickness and volume in the cerebral cortex, such as the insula and the prefrontal, sensory, and parietal cortices (P < 0.05, FWE correction, figure 3C).

Figure 3.

Group differences in MICI and its association with schizophrenia morphometric abnormalities. (A) The histogram distributions of participant-wise MICI values in SZ and HC and their differences. (B) and (C) The intergroup difference in MICI values (B) and brain morphometric measures (C) between the SZ and HC groups (P < 0.05, FWE correction). Colorbar represents the T values. (D) The Spearman correlations between the schizophrenia-related absolute changes in MICI and those in brain morphometric measures (P < 0.05). Abbreviations: HC, healthy controls; MICI, Morphometric Integrated Classification Index; SZ, schizophrenia.

Associations between MICI and brain morphometric changes

Regarding the association between the MICI and gray matter morphometry, we found a significant negative correlation between the MICIF and cortical morphometric features (especially cortical volume and cortical thickness) in both the SZ and NC groups (Spearman correlation, P < 0.05, Bonferroni correction) (Supplementary figure S6), indicating that the MICIF can represent gray matter morphometry. Moreover, there was a significant positive correlation across features between the level of SZ-related brain morphometric abnormality and the level of MICIF change (R = 0.75, P = 1.81E-81). The associations between SZ-related MICIF changes and morphometric abnormalities were also identified within each type of morphometric feature, including cortical surface area (R = 0.56, P = 1.41E-12), cortical thickness (R = 0.77, P = 3.44E-28), cortical volume (R = 0.61, P = 2.69E-15), and subcortical volume (R = 0.83, P = 2.17E-19) (figure 3D). Finally, these associations were replicated at each site (P < 0.05, Bonferroni correction, Supplementary figure S7). These findings indicate that the MICI reflects the severity of schizophrenia brain morphometric abnormality.
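The across-feature correlation of absolute changes can be sketched with a small Spearman implementation. The T values below are hypothetical, and the example only illustrates why correlating absolute changes (severity) can give a different answer from correlating signed changes (direction).

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical group-difference T values for five features.
morph_t = [-4.1, -2.0, 3.5, -0.5, 1.2]   # SZ vs NC difference in raw morphometry
mici_t = [5.0, 2.2, 3.9, 0.3, 1.0]       # SZ vs NC difference in feature-wise MICI

rho_abs = spearman([abs(t) for t in morph_t], [abs(t) for t in mici_t])
rho_signed = spearman(morph_t, mici_t)
print(rho_abs, rho_signed)
```

In this toy example the absolute changes are perfectly rank-correlated while the signed changes are not, which mirrors the idea that such an index tracks how abnormal a feature is rather than whether it is increased or decreased.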

Association between MICI and clinical measures

At the participant level, in schizophrenia patients, significant positive correlations were identified between MICIP values and the total scores of the Scale for the Assessment of Negative Symptoms (SANS) (R = 0.38, P = 6.50E-10) and between MICIP values and the total scores of the Scale for the Assessment of Positive Symptoms (SAPS) (R = 0.24, P = 1.30E-4) (figure 4A). To further elucidate which MICI features contribute to the symptoms of schizophrenia, we performed Spearman correlations at the feature level. The MICIF values of three features were positively associated with SANS scores (P < 0.05/484, Bonferroni correction): the cortical volume of the superior frontal sulcus and of the planum temporale of the superior temporal gyrus (STG), and the cortical thickness of the middle occipital gyrus (MOG). The MICIF values of three features were positively associated with SAPS scores (P < 0.05/484, Bonferroni correction): the cortical volume of the anterior transverse collateral sulcus, and the cortical surface area of the medial occipitotemporal and lingual sulci and of the intraparietal and transverse parietal sulci (figure 4B). In addition, the MICIP values showed a weak positive correlation with the duration of schizophrenia (R = 0.18, P = 0.012, uncorrected) (figure 4A). We found no significant correlation between MICIP and the total chlorpromazine equivalent dose in schizophrenia patients (R = 0.099, P = 0.38).
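The Bonferroni threshold used for the feature-level tests is simple arithmetic: 0.05 divided by the 484 features, or roughly 1.03E-4. A minimal sketch, with hypothetical p-values, shows how features would be screened against it.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def significant_after_bonferroni(p_values, alpha=0.05):
    """Indices of tests whose p-value falls below alpha / number of tests."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [i for i, p in enumerate(p_values) if p < threshold]


threshold = bonferroni_threshold(0.05, 484)
print(f"{threshold:.3e}")

# Hypothetical p-values for 484 features: only the first survives correction.
p_values = [6.5e-10, 2.0e-4, 0.03] + [0.5] * 481
print(significant_after_bonferroni(p_values))
```

Note that a p-value of 2.0E-4, which would pass an uncorrected 0.05 threshold easily, does not survive the correction for 484 tests.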

Figure 4.

Associations between MICI values and clinical measures in schizophrenia. (A) The Spearman correlation of participant-wise MICI values with disease duration, SANS scores, and SAPS scores in schizophrenia patients. (B) The Spearman correlation of feature-level MICI values derived from cortical features with SANS and SAPS scores. Colorbar represents the -log(p) of correlation, and the significant brain areas were contoured by white lines. Abbreviations: MICI, Morphometric Integrated Classification Index; SANS, scale for the assessment of negative symptoms; SAPS, scale for the assessment of positive symptoms.

Association between MICI and brain expression of risk genes of schizophrenia

Among the 196 schizophrenia-associated risk genes, Spearman correlation identified 27 genes whose brain mRNA expression levels were significantly associated with the MICIF differences between schizophrenia patients and HC (P < 0.05, permutation-based FWE correction), including 19 genes associated with the MICIF derived from cortical thickness, seven with the MICIF from cortical surface area, and one with the MICIF from cortical volume (figure 5A, Supplementary table S8). Hierarchical clustering analysis grouped these 27 genes into two differentially expressed clusters, the solution with the highest Calinski-Harabasz score. Cluster 1 contained nine genes that mainly showed a negative correlation with the MICIF changes, among which CNTN4 showed the strongest negative correlation (R = -0.35, P < 0.05, FWE correction based on permutation). Cluster 2 included 18 genes that were positively correlated with the MICIF changes, among which TAF5 showed the strongest positive correlation (R = 0.41, P < 0.05, FWE correction based on permutation) (figure 5B). The lifespan brain expression profile of CNTN4 shows low expression at the early embryonic stage, an increase to a peak at the middle embryonic stage, and stable expression thereafter. In contrast, TAF5 shows its highest expression at the early embryonic stage, followed by a rapid decrease to its lowest level at one year after birth and stable expression afterward (figure 5C).
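One common way to obtain permutation-based FWE-corrected p-values across many genes is the max-statistic approach, sketched below. This is an assumption about the general technique, not a description of the authors' exact permutation scheme, which is specified in the Supplementary Methods. In particular, the sketch shuffles regions freely and ignores spatial autocorrelation. Gene names and values are placeholders.

```python
import random


def _ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    rx, ry = _ranks(x), _ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5


def max_stat_fwe(expr_by_gene, mici_t, n_perm=1000, seed=0):
    """Observed Spearman correlations and max-statistic FWE-corrected p-values.

    expr_by_gene maps a gene name to its expression across brain regions;
    mici_t holds the MICI group-difference T value of the same regions.
    """
    rng = random.Random(seed)
    observed = {g: spearman(e, mici_t) for g, e in expr_by_gene.items()}
    shuffled = list(mici_t)
    null_max = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the region-wise pairing
        null_max.append(max(abs(spearman(e, shuffled)) for e in expr_by_gene.values()))
    p_fwe = {
        g: (1 + sum(m >= abs(r) for m in null_max)) / (n_perm + 1)
        for g, r in observed.items()
    }
    return observed, p_fwe


# Placeholder data: 3 genes measured in 8 regions.
mici_t = [2.1, 0.4, 3.3, 1.2, 2.8, 0.9, 1.7, 3.9]
expr_by_gene = {
    "GENE_A": [5.0, 1.0, 7.0, 3.0, 6.0, 2.0, 4.0, 8.0],
    "GENE_B": [3.0, 2.5, 1.0, 4.0, 2.0, 3.5, 1.5, 0.5],
    "GENE_C": [1.0, 2.0, 1.5, 2.2, 0.8, 1.9, 2.4, 1.1],
}
observed, p_fwe = max_stat_fwe(expr_by_gene, mici_t)
for gene in expr_by_gene:
    print(gene, round(observed[gene], 3), round(p_fwe[gene], 4))
```

Comparing each observed correlation with the null distribution of the maximum absolute correlation across all genes controls the chance of any false positive among the genes tested, at the cost of being conservative for weaker effects.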

Figure 5.

Associations between MICI changes and brain transcription of schizophrenia risk genes. Spearman correlations between the average T-value of MICI changes and AHBA brain mRNA expression levels of 196 schizophrenia risk genes across brain areas were performed to identify genes whose expression levels were associated with the MICI changes (P < 0.05, permutation-based FWE correction). (A) The hierarchical clustering heatmap of identified 27 genes. Colorbar represents the Spearman correlation coefficients. (B) The scatter plot of the correlation between MICI changes and the representative genes of the 2 clusters, CNTN4 (upper) and TAF5 (bottom). (C) The human brain gene expression profiles throughout the lifespan of the two representative genes according to the Human Brain Transcriptome database (http://hbatlas.org/).

Discussion

This work proposed a potential biomarker named the Morphometric Integrated Classification Index to discriminate schizophrenia patients from normal controls based on sMRI data of 1270 subjects from 10 sites. This measure is derived from an XGBoost plus sample-weighted ensembling framework that ensembles anonymous single-site model parameters rather than the original private MRI data, thus helping to protect patients’ privacy and save computation resources. The MICI measure achieved better classification performance than the traditional single-site machine learning models, and performance comparable to that of the merged model trained with raw MRI data, in both the independent-sample and independent-site testing datasets. Moreover, a new site is easily included in this ensembling framework, and the classification performance of the MICI increases as new sites are embedded. Finally, the derived MICI measure was associated with brain structural abnormalities, clinical symptoms of schizophrenia, and the brain mRNA expression profiles of schizophrenia risk genes. Thus, these findings highlight the generalizability, shareability, evolvability, and interpretability of the proposed MICI for schizophrenia diagnosis using structural neuroimaging data acquired from different sites.

Although black-box machine learning models have been shown to outperform traditional clinical practices in many fields, their poor interpretability severely restricts their clinical application.28,29 There is a dilemma between the complexity and the interpretability of ML models: on the one hand, a model is expected to be simple enough for people to understand, which may sacrifice performance; on the other hand, a model is expected to be powerful, which may come at the cost of increased complexity and thus reduced interpretability. The multisite model-ensembling strategy proposed in the present study further complicates model interpretation. Thus, another challenge for machine learning is the tradeoff between model interpretability and model flexibility, so that predictions can be explained in a model-agnostic way.52 To address this issue, we derived a single scalar named the morphometric integrated classification index (MICI) from the ensembling framework based on the theory of SHAP values. SHAP fits a linear additive explanation model that splits the final prediction into individual additive contributions (SHAP values) of each valid feature, meaning that the sum of the SHAP values of all valid features, together with a base value, approximates the model prediction.36,51 However, the SHAP approach was originally designed for a single-model scenario. In this work, we extended it to a multisite scenario by ensembling the SHAP values of multiple sites into a single MICI value, accounting for the sample size of each model. Benefiting from the “local accuracy” property of SHAP values,36 our results showed that the MICI value achieved classification performance and evolvability comparable to those of the multivariate ensembling model and the merged model based on raw data. Moreover, the MICI values demonstrated good interpretability with respect to brain structural changes and clinical features of schizophrenia and to the brain transcriptional profiles of schizophrenia risk genes.
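The “local accuracy” property can be checked on a toy example by computing exact Shapley values by brute force. The sketch below is illustrative only: it replaces “missing” features with a single fixed baseline, which simplifies the expectation over background data used by SHAP, and the toy model, function names, and inputs are hypothetical. Practical SHAP implementations for tree ensembles use far more efficient algorithms than this enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of each feature for the prediction model(x).

    A feature outside the coalition takes its baseline value (a simplifying
    assumption made for this sketch).
    """
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight of a coalition of this size that excludes feature i
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    """Hypothetical nonlinear score with an interaction between features 0 and 2."""
    return 2.0 * z[0] - z[1] + z[0] * z[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
print(sum(phi) + toy_model(baseline), toy_model(x))
```

The two numbers on the last line agree: the contributions plus the baseline prediction reconstruct the model output, and the interaction term is split equally between the two features involved. This additivity is what lets a sum of SHAP-based contributions stand in for the prediction it explains.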

Although a growing number of studies have shown that multisite data ensembling can greatly improve generalization and ML prediction performance,18–20,53,54 multisite raw MRI data sharing is severely restricted by computing resources, personal privacy, and other factors.22–26 To overcome these barriers to MRI data sharing, we presented an XGBoost plus sample-weighted ensembling framework that integrates the parameters of anonymous single-site models, rather than the original private MRI data, to build a neuroimaging biomarker. Like other model-ensembling approaches,19,55 the presented biomarker has several advantages. First, it reduces the ethical and privacy issues raised by raw data sharing and thus improves data security. Second, the shared model-based data are much smaller than the raw MRI data and thus greatly save storage and networking resources during sharing. Finally, the present model-ensembling method does not need further re-training, thus saving computation resources.11 These features can accelerate the drive toward multisite data sharing and promote clinical applications.

Specifically, we found that MICI values in schizophrenia patients were consistently higher than those in healthy controls across sites, and the MICI values in schizophrenia were significantly higher in many cortical and subcortical structures, such as the prefrontal cortex, insula, sensory cortex, lateral ventricle, putamen, and pallidum, which is consistent with previous reports of abnormal cerebral structures in schizophrenia.56–58 This is substantiated by the significant correlations between MICI changes and brain structural changes in schizophrenia, indicating the capacity of MICI values to represent the brain structural abnormalities of schizophrenia. It should be noted that the direction of MICI changes was not always consistent with that of the morphometric changes. For example, the increase in cortical MICI was accompanied by a decrease in cortical thickness and volume, whereas the increase in subcortical MICI was accompanied by an increase in subcortical volume, suggesting that the MICI may be considered a measure of abnormality severity rather than of abnormality direction. We also demonstrated a positive association between the MICI and the positive and negative symptoms of schizophrenia, indicating that a higher MICI may reflect more severe clinical symptoms. The MICI-symptom association was also identified at the area level. For example, negative symptoms were significantly positively associated with the MICI of the STG, which is consistent with an early finding of a positive association between the GMV of this area and negative symptoms.59 In addition, positive symptoms were significantly associated with the MICI of the MOG. Similarly, an association between the thickness of the MOG and hallucination has been reported in first-episode psychosis patients.60 These findings indicate that MICI values can reflect the severity of clinical symptoms in schizophrenia.

Finally, we identified 27 schizophrenia risk genes whose brain mRNA expression levels were significantly associated with the MICI changes in schizophrenia. These genes can be classified into two clusters with opposite associations with the MICI. As the representative gene of cluster 1, CNTN4 showed the strongest negative association with the MICI increment; its brain expression level increases rapidly from the early to the middle embryonic stage and remains stable thereafter. CNTN4 encodes an axonal glycoprotein associated with cell adhesion molecules and plays important roles in axon arborization and neuronal development.61,62 Earlier reports indicated that it is a risk gene for schizophrenia and autism.63,64 Furthermore, overexpression of CNTN4 resulted in a significant decrease in the head size of zebrafish embryos, indicating an important role of this gene in regulating brain development.65 In contrast, TAF5, the representative gene of cluster 2, showed the strongest positive association with the MICI increment; its expression is highest at the early embryonic stage and decreases rapidly to its lowest level at one year after birth. TAF5 is related to DNA-binding transcription factor activity66 and has also been reported as a candidate risk gene for schizophrenia.67 These findings indicate that the proposed MICI measure can provide biologically meaningful information about schizophrenia.

However, several limitations should be mentioned when interpreting our findings. First, the generalization of the MICI still needs to be validated and improved with more datasets, especially datasets of drug-naïve first-episode schizophrenia patients. We therefore developed a free online sharing platform and hope that more sites will share their models to validate and improve the biomarker’s performance (http://micc.tmu.edu.cn/mici/index.html). This cloud platform also provides free individual prediction services for the public: it returns predictions for uploaded data and also the MICI values themselves, which can be used for further scientific research and clinical assistance. Second, we established the MICI based only on sMRI data; however, this measure is theoretically modality-free. Different neuroimaging modalities (such as diffusion MRI, functional MRI [fMRI], and PET) may provide unique information for schizophrenia diagnosis and treatment. Thus, this framework should be applied to other neuroimaging modalities in the future to obtain different types of ensembling biomarkers for schizophrenia. Moreover, because the symptoms of schizophrenia patients often change over time, combining fMRI with sMRI to construct the MICI may increase its power to characterize the state of schizophrenia and improve prognosis prediction.

Supplementary Material

sbac096_suppl_Supplementary_Materials

Acknowledgments

We are grateful to all patients and healthy volunteers who participated in this research. The authors have declared that there are no conflicts of interest in relation to the subject of this study. The code for MICI is available via https://github.com/BrainWanderLab/MICI.

Contributor Information

Yingying Xie, Department of Radiology, Tianjin Medical University General Hospital, Tianjin, China; Tianjin Key Lab of Functional Imaging, Tianjin Medical University General Hospital, Tianjin, China.

Hao Ding, Department of Radiology, Tianjin Medical University General Hospital, Tianjin, China; Tianjin Key Lab of Functional Imaging, Tianjin Medical University General Hospital, Tianjin, China; School of Medical Imaging, Tianjin Medical University, Tianjin, China.

Xiaotong Du, Department of Radiology, Tianjin Medical University General Hospital, Tianjin, China; Tianjin Key Lab of Functional Imaging, Tianjin Medical University General Hospital, Tianjin, China.

Chao Chai, Department of Radiology, Tianjin Medical University General Hospital, Tianjin, China; Tianjin Key Lab of Functional Imaging, Tianjin Medical University General Hospital, Tianjin, China.

Xiaotong Wei, Department of Radiology, Tianjin Medical University General Hospital, Tianjin, China; Tianjin Key Lab of Functional Imaging, Tianjin Medical University General Hospital, Tianjin, China.

Jie Sun, Department of Radiology, Tianjin Medical University General Hospital, Tianjin, China; Tianjin Key Lab of Functional Imaging, Tianjin Medical University General Hospital, Tianjin, China.

Chuanjun Zhuo, Department of Psychiatry Functional Neuroimaging Laboratory, Tianjin Mental Health Center, Tianjin Anding Hospital, Tianjin, China.

Lina Wang, Department of Psychiatry Functional Neuroimaging Laboratory, Tianjin Mental Health Center, Tianjin Anding Hospital, Tianjin, China.

Jie Li, Department of Psychiatry Functional Neuroimaging Laboratory, Tianjin Mental Health Center, Tianjin Anding Hospital, Tianjin, China.

Hongjun Tian, Tianjin Fourth Central Hospital, Tianjin, China.

Meng Liang, Department of Radiology, Tianjin Medical University General Hospital, Tianjin, China; Tianjin Key Lab of Functional Imaging, Tianjin Medical University General Hospital, Tianjin, China; School of Medical Imaging, Tianjin Medical University, Tianjin, China.

Shijie Zhang, Department of Pharmacology, Tianjin Medical University, Tianjin, China.

Chunshui Yu, Department of Radiology, Tianjin Medical University General Hospital, Tianjin, China; Tianjin Key Lab of Functional Imaging, Tianjin Medical University General Hospital, Tianjin, China; School of Medical Imaging, Tianjin Medical University, Tianjin, China.

Wen Qin, Department of Radiology, Tianjin Medical University General Hospital, Tianjin, China; Tianjin Key Lab of Functional Imaging, Tianjin Medical University General Hospital, Tianjin, China.

Funding

This work was supported by the National Key Research and Development Program of China (2018YFC1314300), the National Natural Science Foundation of China (81971599, 82030053, and 81971694), the Tianjin Key Project for Chronic Diseases Prevention (2017ZXMFSY00070), the Science & Technology Development Fund of Tianjin Education Commission for Higher Education (2018KJ082), and the Tianjin Applied Basic Research Diversified Investment Foundation (21JCYBJC01490).

Supplementary Materials

sbac096_suppl_Supplementary_Materials

Articles from Schizophrenia Bulletin are provided here courtesy of Oxford University Press
