Assisting schizophrenia diagnosis using clinical electroencephalography and interpretable graph neural networks: a real-world and cross-site study

Haiteng Jiang; Peiyin Chen; Zhaohong Sun; Chengqian Liang; Rui Xue; Liansheng Zhao; Qiang Wang; Xiaojing Li; Wei Deng; Zhongke Gao; Fei Huang; Songfang Huang; Yaoyun Zhang; Tao Li

doi:10.1038/s41386-023-01658-5

2023 Jul 25;48(13):1920–1930. doi: 10.1038/s41386-023-01658-5

PMCID: PMC10584957 PMID: 37491671

Abstract

Schizophrenia (SCZ) is a chronic and serious mental disorder with a high mortality rate. At present, there is a lack of objective, cost-effective and widely disseminated diagnosis tools to address this mental health crisis globally. Clinical electroencephalogram (EEG) is a noninvasive technique to measure brain activity with high temporal resolution, and accumulating evidence demonstrates that clinical EEG is capable of capturing abnormal SCZ neuropathology. Although EEG-based automated diagnostic tools have obtained impressive performance on individual datasets, the transportability of potential EEG biomarkers in cross-site real-world application is still an open question. To address the challenges of small sample sizes and population heterogeneity, we develop an advanced interpretable deep learning model using multimodal clinical EEG features and demographic information as inputs to graph neural networks, and further propose different transfer learning strategies to adapt to different clinical scenarios. Taking the discrimination of healthy controls (HCs) and SCZ patients with 1030 participants as a use case, our model is trained on a small clinical dataset (N = 188, Chinese) and enhanced using a large-scale public dataset (N = 508, American) of adult participants. Cross-site validation on an independent dataset of adult participants (N = 157, Chinese) produced stable performance, with AUCs of 0.793–0.852 and accuracies of 0.786–0.858 for different SCZ prevalence, respectively. In addition, cross-site validation on another dataset of adolescent boys (N = 84, Russian) yielded an AUC of 0.702 and an accuracy of 0.690. Moreover, feature visualization further revealed that the ranking of feature importance varied significantly among different datasets, and that EEG theta and alpha band power appeared to be the most significant and translational biomarkers of SCZ pathology.
Overall, our promising results demonstrate the feasibility of SCZ discrimination using EEG biomarkers in multiple clinical settings.

Subject terms: Schizophrenia, Diagnostic markers

Introduction

SCZ is a chronic and complex neuropsychiatric disorder affecting about 21 million people globally [1]. It is characterized by heterogenous positive (e.g., hallucinations and delusions) and negative (e.g., blunted affect) symptoms and is associated with functional impairments, decreased quality of life, and substantial mortality [2, 3]. In view of the global mental health crisis with unbalanced resources [4], accurate and timely screening/diagnosis of SCZ, especially with objective and feasible assessments in real-world settings, is crucial for precise prevention and intervention [2, 3].

Despite great efforts in developing the artificial intelligence (AI)-aided technologies for SCZ diagnosis [5–9], interpretable models, especially those transportable across different clinical sites, are still lacking [6, 10]. Previously, schizophrenia detection was performed using several widely accepted neuroimaging techniques, such as positron emission tomography (PET), magnetic resonance imaging (MRI), and functional MRI (fMRI) [11]. However, these neuroimaging techniques have high cost of imaging equipment and high demands on operational skills, limiting their wide dissemination in clinical practice. Therefore, there is an urgent need for easy-to-use and cost-effective brain monitoring techniques to aid in the diagnosis of schizophrenia.

This requirement can be fulfilled by electroencephalogram (EEG) technology, which enables the noninvasive monitoring of electrical activity in the brain using scalp electrodes. As an effective indicator of neuropathology, EEG shows great potential in the automatic diagnosis of neurological disorders and mental disorders such as epilepsy [12], Alzheimer’s disease [13], and depression [14]. Despite having a lower spatial resolution than other neuroimaging techniques such as fMRI, EEG has superior temporal resolution and therefore carries a wealth of temporal dynamic information. Recently, increasing evidence has shown that EEG is capable of capturing abnormal neuropathology, providing rich clues for the diagnosis of SCZ [5]. For example, SCZ patients were found to exhibit power disruptions across various frequency bands [15]. On this premise, numerous studies have employed machine learning (ML) techniques, encompassing both traditional and deep learning (DL) methods, to construct classification models based on EEG features [10, 16–22]. These studies propose that EEG signals contain discriminative information valuable for SCZ diagnosis, and that ML serves as an effective tool to help accomplish this objective. Promising performance was obtained in these studies, with accuracies ranging from 81.5% to 99% in different datasets (Supplementary Table 1).

However, limited translational ability has long been a barrier to the widespread dissemination of EEG-based automatic diagnosis tools [10, 12]. The daunting challenges for the generalization of mental disorder classifiers are population heterogeneity and real-world sampling variability [10, 23]. Furthermore, the limited data annotation in mental health institutions, especially of HCs, makes it infeasible to build automated diagnosis tools independently. Over the past decades, tremendous efforts have been spent on resolving the MRI issues of sample sizes, heterogeneity and sampling variability [24], leading to promising results from large multi-center studies with several hundreds of participants [25]. In contrast, applications of EEG for SCZ diagnosis are still at an early stage. Previous EEG studies have primarily worked on small-scale datasets from a single site, which are prone to overfitting and cannot cover the diverse population heterogeneity [26]. Therefore, the accuracies from these studies should be interpreted with caution. Besides, although multiple approaches have been proposed to remove artifacts from EEG signals [27], the EEG datasets used in previous studies were mostly sampled in research lab settings rather than collected during real-world clinical procedures, which may yield noisier signals. Notably, few studies have examined the performance of EEG-based diagnosis tools generated from one population on other populations, leaving the transportability of potential EEG biomarkers across population heterogeneity an open question.

Furthermore, transfer learning (TL) techniques in deep neural networks (DNNs) effectively transfer knowledge or patterns learned from (large-scale) available data to new tasks. Thus, the performance of new tasks could be improved with reduced data annotation in a more cost-effective way [28–33]. In particular, graph neural networks (GNNs) [34, 35] have demonstrated unique advantages in representing complex brain structures (such as connections between channels in different brain regions) in EEG-based BCI (Brain-Computer Interface) applications [36, 37]. However, few TL studies have worked on improving the generalizability and wide dissemination of EEG-based diagnosis tools for mental disorders. As a proof-of-concept study, taking the differential diagnosis of HCs and SCZs as a use case, we investigated the direct cross-site transportability of EEG-based diagnosis models and the feasibility of employing TL techniques to improve the transportability.

Our study design for the classification-based differential diagnosis between HCs and SCZs is illustrated in Fig. 1. First, 1030 eligible participants were collected from four data sources: 508 HCs from TUH (USA) and three independent groups from Chengdu (China, 95 HCs/93 SCZs), Hangzhou (China, 56 HCs/194 SCZs), and Moscow (Russia, 39 HCs/45 SCZs). Next, a range of interpretable frequency/time-domain EEG features were extracted, and graph convolutional neural networks (GCNs) were used to build the diagnosis models. Semantically interpretable multichannel EEG biomarkers and demographic information were used as features. Furthermore, different TL strategies were investigated in an integrated framework, EEG_TL, to increase the transportability of the diagnosis classifier. The performance of direct cross-site evaluation and of employing TL techniques was investigated systematically, with comparisons to different structures of EEG graphs and other classical DL/ML methods. Feature importance related to SCZ pathology was also measured, visualized, and validated based on neuropathology analysis. In particular, salient EEG biomarkers with strong transportability across multiple sites were explored.

Methods

Participants

In total, four EEG datasets were used in this study: the TUH dataset contained only HCs and was used as an auxiliary resource for transfer learning; the Chengdu dataset was used to build automatic diagnosis models; and the remaining two, Hangzhou and Moscow, were used for cross-site performance validation. All samples were collected in the eyes-closed resting state of participants using the standard 10–20 EEG montage. For simplicity, each dataset is referred to by its original location in this study. Details of each dataset are as follows:

TUH

508 HCs were selected from more than 1400 subjects with normal neurological pathology in the open-access TUH Abnormal EEG Corpus [38]. According to the physician’s report for each EEG recording, subjects with mental disorders, brain injuries, and epilepsy were excluded. Only subjects aged between 20 and 45 years were retained. The original EEG recordings had durations ranging from 15 to 48 min, sampled with 33 channels at 250 Hz.

Chengdu

We recruited 93 inpatients from West China Hospital (Chengdu, China), diagnosed with SCZ based on ICD-10 criteria, and 95 HCs through advertisement between January 2014 and February 2021. We included only patients who had no history of other mental disorders, and HCs were confirmed to have no personal or immediate family history of mental illness. Exclusion criteria included (1) current or past history of neurological or neurodevelopmental disorders, substance-related problems, or head trauma with loss of consciousness, (2) left-handedness, and (3) any other EEG contraindications. All patients were aged between 18 and 65 and had 3 minutes of clinical EEG data collected at the beginning of their routine examination. The EEG data were collected with a 16-channel Shanghai Nuocheng device at a sampling frequency of 128 Hz. The study was approved by the local ethics committee.

Hangzhou

We recruited 194 SCZ inpatients from Hangzhou Seventh Hospital and 56 HCs between November 2021 and April 2022 in Hangzhou, China. The protocol in Hangzhou was similar to Chengdu. Unlike Chengdu, the EEG data was acquired by 19-channel Beijing Solar device, and the sampling frequency was 256 Hz.

Moscow

We used the open-access EEG database, which was obtained and prepared by Dr. Gorbachevskaya (leading researcher at the Mental Health Research Center) and Dr. S.V. Borisov (senior researcher at the faculty of Biology M.V. Lomonosov Moscow State University) in the Laboratory for Neurophysiology and Neuro-Computer Interfaces at M.V. Lomonosov Moscow State University [39]. The EEG signal was recorded from two groups of male adolescent individuals, including 39 HCs and 45 with SCZ. Each EEG signal of this dataset consists of 16 channels. These recorded EEG signals are of 1-min duration, each containing 7680 EEG voltages of the order of mV, and are sampled at 128 Hz.

Data preprocessing

We used Automagic [40], an open-source MATLAB toolbox, to standardize the preprocessing of big EEG data. Automagic has proven effective at removing a large share of EEG artifacts by combining algorithms that identify artifactual channels with multiple artifact rejection algorithms. All EEG data from the four different sites were preprocessed using the same procedure. The standardized preprocessing pipeline includes the following steps: (1) 0.1–40 Hz bandpass filtering; (2) removal of flatline channels, low-frequency drifts, low-correlation channels and noise channels; (3) application of the Multiple Artifact Rejection Algorithm (MARA) for artifact correction; (4) channel interpolation and referencing.

To ensure data quality, we took the EEG recordings from the 3rd to the 15th minute of the TUH EEG corpus as samples. This avoids interference from electromyography and electrooculogram signals at the beginning of the acquisition, as well as the fatigue and sleepiness of the subject during longer acquisitions. To be consistent with Chengdu and Hangzhou, the recordings were split into 3232 two-minute samples, with a 1-minute overlap between adjacent samples.
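
The splitting step can be sketched as follows. This is a minimal illustration, assuming windows are laid out on a fixed grid inside the retained span; the function name `window_bounds` and its parameters are hypothetical and are not taken from the study's code.

```python
def window_bounds(start_s, end_s, win_s=120, step_s=60):
    """Return (start, end) times in seconds of fixed-length windows.

    A step shorter than the window length produces overlapping windows;
    a 120 s window with a 60 s step gives a 60 s overlap.
    """
    bounds = []
    t = start_s
    while t + win_s <= end_s:
        bounds.append((t, t + win_s))
        t += step_s
    return bounds


# Retained span of one recording: 3rd to 15th minute.
windows = window_bounds(start_s=3 * 60, end_s=15 * 60)
```

Under these assumptions a full 12-minute span yields 11 overlapping windows; a shorter retained span yields fewer.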

Besides, to address the long-tailed distribution of the time-domain characteristics of TUH EEG corpus, we introduce the Box–Cox transformation [11], which makes the distribution of TUH EEG corpus more similar to the normal distribution. The Box–Cox transformation equation is as follows.

y_{i}^{(λ)} = \begin{cases} \frac{y_{i}^{λ} - 1}{λ}, & λ \neq 0 \\ \ln (y_{i}), & λ = 0 \end{cases}
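
As a minimal sketch of the transformation, the two cases can be written directly in Python. The function name `box_cox` is illustrative, and the value of λ used in the study is not stated here, so it is left as a parameter.

```python
import math


def box_cox(values, lam):
    """Apply the Box-Cox transform to strictly positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive inputs")
    if lam == 0:
        # Limiting case of the power transform as lambda approaches 0.
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


transformed = box_cox([1.0, 4.0, 9.0], lam=0.5)  # [0.0, 2.0, 4.0]
```

In practice λ is often estimated from the data by maximum likelihood, for example with `scipy.stats.boxcox`; whether the study did so is an assumption.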

Feature generation

Sixteen channels common in all four datasets are selected, including FP1, FP2, F3, F4, C3, C4, P3, P4, O1, O2, F7, F8, T3, T4, T5, and T6. The electrode distribution for all channels is set according to the international 10-20 system. When considering features of EEG signal representation, although original time series are often used in DL-based methods, they are more prone to overfitting and have poor generalizability in external evaluations [36]. To simplify the EEG signal representations for a better generalizability, this study extracted semantically interpretable frequency/time-domain biomarkers [41, 42] that have already been validated in traditional ML methods, and applied them in DL-based models for the first time. As listed in Supplementary Table 2, we have extracted two types of frequency-domain features and four types of time-domain features to represent each channel of EEG signals. In addition, demographic features of each subject are also considered because of the large individual discrepancy in EEG signals.

Frequency-domain EEG feature: absolute power and differential entropy (DE).
Time-domain EEG feature: energy, amplitude, mean value, variance, 1st order difference, and 2nd order difference of signals, 1st order difference and 2nd order difference of normalized signals, Hjorth activity, Hjorth mobility, and Hjorth complexity, Petrosian fractal dimension.
Demographic feature: age and gender.
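
Several of the time-domain features listed above have standard closed-form definitions. The sketch below computes the Hjorth parameters and the Petrosian fractal dimension for one channel with NumPy, using the common textbook definitions; the exact variants used in the study (see Supplementary Table 2) may differ, and the function names are illustrative.

```python
import numpy as np


def hjorth_parameters(x):
    """Return Hjorth activity, mobility, and complexity of a 1-D signal."""
    x = np.asarray(x, dtype=float)
    dx = np.diff(x)    # first-order difference
    ddx = np.diff(dx)  # second-order difference
    activity = np.var(x)
    mobility = np.sqrt(np.var(dx) / activity)
    complexity = np.sqrt(np.var(ddx) / np.var(dx)) / mobility
    return activity, mobility, complexity


def petrosian_fd(x):
    """Petrosian fractal dimension based on sign changes of the first difference."""
    x = np.asarray(x, dtype=float)
    n = x.size
    dx = np.diff(x)
    n_delta = np.count_nonzero(dx[1:] * dx[:-1] < 0)
    return np.log10(n) / (np.log10(n) + np.log10(n / (n + 0.4 * n_delta)))


# Example: a 2 s, 10 Hz sine sampled at 128 Hz.
t = np.arange(256) / 128.0
activity, mobility, complexity = hjorth_parameters(np.sin(2 * np.pi * 10 * t))
```

For a pure sine the complexity is close to 1, and for a monotonic signal the Petrosian fractal dimension equals 1, which gives two quick sanity checks.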

In particular, we obtain frequency features from five different rhythms, including delta (1–4 Hz), theta (4–8 Hz), alpha (8–13 Hz), beta (13–30 Hz), and gamma (30–44 Hz). Thus, we can obtain 10 different frequency features for each channel. However, different sites employ various EEG devices, introducing measurement variability between the datasets. To address this issue, we implement a min-max normalization technique for all extracted features in each dataset independently prior to inputting them into the ML or DL models. This normalization step helps mitigate the effects of confounding factors, such as impedance difference, and ensures a fair comparison across the datasets. The two types of demographic characteristics are the age and gender of each subject.
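
The frequency features and the per-dataset normalization can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the study's implementation: band power is taken from a plain periodogram, bands are treated as half-open intervals so that shared edges are counted once, and differential entropy uses the Gaussian closed form 0.5·ln(2πe·σ²) with the band power as variance. All function names are hypothetical.

```python
import numpy as np

# Band edges in Hz, as listed in the text.
BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 44)}


def band_powers(signal, fs):
    """Absolute power per band for one channel, from a periodogram."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()
    n = x.size
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    # Scale so that the bins sum to the mean square of the signal.
    power = np.abs(np.fft.rfft(x)) ** 2 / n ** 2
    last = -1 if n % 2 == 0 else None  # the Nyquist bin is not doubled
    power[1:last] *= 2.0
    return {name: float(power[(freqs >= lo) & (freqs < hi)].sum())
            for name, (lo, hi) in BANDS.items()}


def differential_entropy(power, eps=1e-12):
    """DE of a band-limited signal, assuming it is Gaussian with variance `power`."""
    return 0.5 * np.log(2.0 * np.pi * np.e * (power + eps))


def min_max_normalize(features):
    """Scale each feature column to [0, 1] within a single dataset."""
    f = np.asarray(features, dtype=float)
    lo, hi = f.min(axis=0), f.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # constant columns map to 0
    return (f - lo) / span


# Example: a 2 s, 10 Hz sine at 128 Hz puts all its power in the alpha band.
t = np.arange(256) / 128.0
powers = band_powers(np.sin(2 * np.pi * 10 * t), fs=128)
alpha_de = differential_entropy(powers["alpha"])
```

Calling `min_max_normalize` once per site, rather than on the pooled data, mirrors the independent normalization described above.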

EEG-based diagnosis model using transfer learning

Transfer learning strategies

Three TL strategies were employed in this study to improve the transportability of EEG-based diagnosis models. Large-scale HC recordings in the external TUH dataset were leveraged for pre-training EEG graph representations (PT) and for meta learning-based data augmentation (META). For PT (Supplementary Fig. 1), EEG graph representations learned from TUH were fine-tuned on the Chengdu data for diagnosis classifier training. For META (Supplementary Fig. 2), TUH recordings were added into the training data as an augmentation for model learning. Both PT and META aimed to enrich the EEG representations for better generalizability. In addition, unsupervised domain adaptation (DA) was implemented during cross-site classification. For DA, unlabeled samples from the target site were used to reduce the discrepancies between the source data (on which the model is originally trained) and the target data, so as to adapt the classifier to the target site. More details about the TL strategies are given in the method introduction in the supplementary materials.

Diagnosis model building

EEG graph construction

A left-right symmetric graph structure was constructed to represent EEG signals of 16 channels in GCN, as shown in Fig. 2a. Multimodal (frequency/time) EEG features were extracted to represent each of the sixteen channels, namely each node in the graph. Edges between node pairs were created to represent connections between channels in the graph. EEG features and demographic features were first represented independently and then fused together for classification.
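
To make the graph representation concrete, the sketch below builds a normalized adjacency matrix over the 16 channels and applies one graph-convolution step, using the widely used propagation rule ReLU(D^{-1/2}(A + I)D^{-1/2} H W). The edge list is a hypothetical reading of "left-right symmetric" in which each left-hemisphere channel is linked to its right-hemisphere counterpart; the actual edges of Fig. 2a, the feature count per node, and the layer sizes may all differ.

```python
import numpy as np

CHANNELS = ["FP1", "FP2", "F3", "F4", "C3", "C4", "P3", "P4",
            "O1", "O2", "F7", "F8", "T3", "T4", "T5", "T6"]

# Hypothetical edges: homologous left/right channel pairs only.
EDGES = [("FP1", "FP2"), ("F3", "F4"), ("C3", "C4"), ("P3", "P4"),
         ("O1", "O2"), ("F7", "F8"), ("T3", "T4"), ("T5", "T6")]


def normalized_adjacency(channels, edges):
    """Return D^{-1/2} (A + I) D^{-1/2} for an undirected graph."""
    index = {name: i for i, name in enumerate(channels)}
    a = np.eye(len(channels))  # self-loops
    for u, v in edges:
        a[index[u], index[v]] = 1.0
        a[index[v], index[u]] = 1.0
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_hat, node_features, weights):
    """One graph convolution: ReLU(A_hat @ H @ W)."""
    return np.maximum(a_hat @ node_features @ weights, 0.0)


rng = np.random.default_rng(0)
a_hat = normalized_adjacency(CHANNELS, EDGES)
h0 = rng.random((16, 24))      # 16 nodes, 24 illustrative features per node
w0 = rng.normal(size=(24, 8))  # untrained weights, used for a shape check only
h1 = gcn_layer(a_hat, h0, w0)
```

With this edge list every node has exactly one neighbor, so each output row is an equal mix of a channel and its contralateral counterpart before the weight matrix is applied.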

Diagnosis model training

The diagnosis model was primarily trained using the Chengdu data. The GCN model pre-trained on the TUH dataset can be employed to initialize the parameters of the diagnosis classifier for fine-tuning. Additionally, the healthy control (HC) data from the TUH dataset are utilized as an augmentation to enrich the HC patterns in the training data. Specifically, we designed two meta-learners, which include a schizophrenia (SCZ) classifier based on contrastive learning and a domain discriminator to align the feature distributions across different datasets. In each iteration (i.e., every 300 epochs), HCs from the TUH dataset are randomly sampled in equivalent size and integrated into the training data as an augmentation. These two meta-learners are then used to incrementally optimize performance.

The SCZ classifier comprises a fully-connected layer with 2 output units, connected to a Softmax activation function. The InfoNCE loss function of contrastive learning is employed to enhance the classification performance (see Supplementary Equation 19). This approach aims to minimize the distance between a given sample (i.e., a query) and samples of the same class (i.e., positive samples) while maximizing the distance between a given sample and samples from another class (i.e., negative samples).
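
The contrastive objective can be illustrated with the generic form of InfoNCE for a single query. This is a sketch under assumptions: it uses cosine similarity and a temperature of 0.1, both chosen for illustration, and it is not a reproduction of Supplementary Equation 19.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(query, positive, negatives, temperature=0.1):
    """InfoNCE loss for one query with one positive and several negatives."""
    logits = [cosine(query, positive) / temperature]
    logits += [cosine(query, neg) / temperature for neg in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    peak = max(logits)
    log_denominator = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_denominator - logits[0]


query = [1.0, 0.0]
negatives = [[-1.0, 0.0], [0.0, -1.0]]
loss_near = info_nce(query, [0.9, 0.1], negatives)  # positive close to the query
loss_far = info_nce(query, [0.0, 1.0], negatives)   # positive far from the query
```

The loss is smaller when the same-class sample lies closer to the query than the other-class samples do, which is the behavior the paragraph above describes.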

Similarly, the domain discriminator also consists of a fully-connected layer of 2 output units, connected with a Softmax activation function. Taking the alignment of the TUH data (domain s) and the Chengdu data (domain t) as an example, let $F^{s} = \{f_{1}^{s}, f_{2}^{s}, \dots, f_{n}^{s}\}$ and $F^{t} = \{f_{1}^{t}, f_{2}^{t}, \dots, f_{n}^{t}\}$ denote their respective features. During training, domain-invariant features are rewarded to reduce the variance of the marginal distribution of the source and target data (see Supplementary Equation 20).

The diagnosis classification model is trained on the GCN model of EEG graph representation using the following loss function:

L (θ_{f}, θ_{c}, θ_{d}) = α L_{c} + β L_{d}

where α and β are the tradeoff parameters that balance the SCZ classifier loss $L_{c}$ and the domain discriminator loss $L_{d}$ for optimal performance.
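
A small numerical sketch of the weighted objective follows. The values of α and β, the classifier loss value, and the choice of softmax cross-entropy for the domain discriminator are all assumptions made for illustration; the study's actual settings are in its supplementary material.

```python
import math


def softmax_cross_entropy(logits, target_index):
    """Cross-entropy of a softmax over `logits` against one integer label."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_norm - logits[target_index]


def total_loss(loss_c, loss_d, alpha=1.0, beta=0.5):
    """Weighted sum L = alpha * L_c + beta * L_d."""
    return alpha * loss_c + beta * loss_d


# A discriminator that cannot tell the two domains apart outputs equal logits.
loss_d = softmax_cross_entropy([0.0, 0.0], target_index=0)  # equals ln 2
loss = total_loss(loss_c=0.4, loss_d=loss_d)  # 0.4 is a placeholder value
```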

More details of the diagnosis model training can be found in Supplementary eMethods.

Cross-site diagnosis

In cross-site diagnosis, an unlabeled sample from the external site went through the diagnosis model to identify its label as HC or SCZ. In this stage, the TL strategy of DA could be implemented, by sending the unlabeled sample through the domain discriminator as defined above to reduce the discrepancy between data from different sites. In this way, the diagnosis classifier was adapted to the external site and the diagnosis performance was boosted (Supplementary Fig. 3).

Evaluation

Experimental setup

(1) First, an intra-site evaluation was conducted on the Chengdu dataset using five-fold cross validation. (2) Next, a cross-site evaluation was conducted on the Hangzhou and Moscow datasets using classification models trained on the entire Chengdu dataset. In particular, subsets of different HC/SCZ splits from the Hangzhou dataset were used to test performance stability with diverse SCZ prevalence and population heterogeneity, as suggested by Fernandes et al. [7]. In addition, the performance on the Moscow dataset was examined to assess the transportability of models trained on adult data to adolescent data. (3) Furthermore, models with and without TL strategies were compared to check the original cross-site performance and potential transportability improvements after using TL strategies. (4) DL algorithms (EEGNet [43], LSTM [44], and GAT [45]) and ML algorithms (SVM [46], Logistic regression [47], Random Forest [48], and XGBoost [49]) commonly used for EEG-based classification were also implemented for performance comparison. Of note, raw EEG signals with time sequences and demographic features were used as features for EEGNet/LSTM, while GAT and all ML algorithms used the same set of features as our proposed GCN. Detailed experiment configuration and (hyper-)parameters are reported in the Supp - Training and parameters section.
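
The intra-site split in step (1) can be sketched as below. The text specifies only five-fold cross validation; stratifying by class, the random seed, and the function name `stratified_folds` are assumptions made for this illustration.

```python
import random


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds with similar class proportions."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        # Deal the shuffled members of each class round-robin into the folds.
        for position, idx in enumerate(members):
            folds[position % k].append(idx)
    return folds


# Chengdu: 95 HCs (label 0) and 93 SCZ patients (label 1).
labels = [0] * 95 + [1] * 93
folds = stratified_folds(labels)
fold_sizes = [len(f) for f in folds]
```

Each fold serves once as the held-out test set while the other four are used for training.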

Evaluation criteria

Taking SCZ as the positive class, standard metrics, i.e., AUC, ACC, sensitivity, specificity, precision, and F1, were reported to provide an objective model assessment from different aspects. Balanced sensitivity and specificity scores were reported based on the ROC (receiver operating characteristic) curves. All experiments were repeated 50 times, and the mean (±SD) of each evaluation criterion was reported.
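
The metrics above follow from the confusion matrix with SCZ coded as the positive class (label 1). The sketch below uses only the Python standard library; the function names are illustrative, the example labels and scores are toy values, and AUC is computed by pairwise comparison rather than with the study's tooling.

```python
def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, precision, and F1 (SCZ = 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)  # assumes at least one predicted positive
    return {
        "acc": (tp + tn) / len(pairs),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


def auc(y_true, scores):
    """Probability that a random SCZ sample outscores a random HC sample."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]           # toy classifier outputs
y_pred = [int(s >= 0.5) for s in scores]  # 0.5 threshold is illustrative
metrics = confusion_metrics(y_true, y_pred)
area = auc(y_true, scores)
```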

Statistical analysis

Statistical analyses were conducted with IBM SPSS Statistics 22 (IBM SPSS, Turkey). Categorical variables were compared between the two groups using the χ² test. Continuous variables were compared between the two groups using a two-tailed two-sample t test (after checking that the data were normally distributed). Unless specified otherwise, the significance level of all tests was set to p < 0.05. All statistical analyses were done using R (version 4.1).

Ethical aspects

The study was carried out in accordance with ethical principles for medical research involving humans (WMA, Declaration of Helsinki). All data were collected anonymously. The study protocol was approved by the Ethical Committee of the West China Hospital and the Hangzhou Seventh People’s Hospital.

Results

Data statistics

All EEG samples in this study were collected in an eyes-closed resting state using the standard 10–20 EEG montage. In total, four EEG datasets were used in this study, and their demographic statistics are listed in Table 1. Detailed data eligibility and sampling configurations can be found in the Methods section. Further data descriptions are given in Supplementary Tables 3 and 4.

Table 1. Demographics of samples in this study.

Characteristic	TUH HC (508)	Chengdu HC (95)	Chengdu SCZ (93)	Moscow HC (39)	Moscow SCZ (45)
Age^ (years)	32.54 ± 7.77	25.17 ± 5.22	29.25 ± 10.84**	10–19	10–19
Female#	345 (0.680)	52 (0.547)	46 (0.495)*	0 (0.0)	0 (0.0)

Characteristic	Hangzhou HC (56)	Hangzhou SCZ (194), 1:3	Hangzhou SCZ (112), 1:2	Hangzhou SCZ (56), 1:1	Hangzhou SCZ (28), 2:1	Hangzhou SCZ (19), 3:1
Age^ (years)	31.21 ± 12.25	32.27 ± 10.03*	32.04 ± 11.08*	31.19 ± 10.77**	31.40 ± 11.56**	32.14 ± 11.66**
Female#	38 (0.679)	119 (0.613)**	107 (0.637)**	72 (0.643)**	55 (0.655)**	49 (0.662)**

^ Values are shown as mean ± standard deviation.

# Values are shown as count (proportion).

1:3, 1:2, 1:1, 2:1, and 3:1 stand for the ratio between healthy controls (HCs) and schizophrenia (SCZ) patients.

* Statistically significant difference (p value < 0.05) between HCs and SCZ patients.

** Statistically significant difference (p value < 0.001) between HCs and SCZ patients.

EEG samples

Original EEG recordings were of different lengths (TUH: 15–45 min, Chengdu/Hangzhou: 3 min, Moscow: 1 min), and they went through a series of automated data cleaning, transformation and splitting steps before being used in experiments. Detailed information about data processing can be found in the Methods section. In a pilot study (Supp - EEG Sampling), EEG samples of different lengths (e.g., 1 second, 1 min, 2 min) and with different overlap windows (to generate samples of different scales) were drawn from the Chengdu and TUH datasets and compared on their five-fold cross-validation performance on the Chengdu data. In the end, one 2-min sample was generated for each participant in the Chengdu and Hangzhou datasets. A total of 3232 2-min TUH samples were used in the experiment. In contrast, the Moscow samples averaged only 56 seconds after data preprocessing.

Intra-site validation

Supplementary Fig. 4 provides the five-fold cross-validation performance on the Chengdu dataset. Supplementary Fig. 4a depicts the result histogram accompanied by a performance table comparing the different methods. Performance of the base GCN model, GCN with EEG_TL implementation (PT + META), and the other compared DL/ML methods is reported. Supplementary Fig. 4b depicts the ROC curves of the base GCN model and of GCN after adding the TL strategies PT, META and PT + META, respectively. The base GCN model obtained a promising AUC/ACC of 0.903/0.903. With the implementation of TL strategies, the performance of GCN improved further to 0.932/0.932. EEGNet used raw time series of EEG signals as features and yielded the highest performance (0.968/0.968). Detailed mean (std) performance values are provided in Supplementary Table 5.

Cross-site evaluation

Of note, achieving the optimal performance on a single dataset is not of primary importance in this study. Rather, transportability and performance stability across different sites with various population heterogeneity are the more critical goals here. Cross-site performance on the Hangzhou dataset is provided in Fig. 3. Performance of both direct cross-site evaluation using the base GCN model and EEG_TL integration is provided. Figure 3a depicts the result histogram accompanied by a performance table comparing the different methods; Fig. 3b shows a performance comparison among HC/SCZ ratios of 1:3, 1:2, 1:1, 2:1, and 3:1; Fig. 3c illustrates the ROC curves of the base GCN model and of GCN after adding each TL strategy and their combinations, respectively. An HC/SCZ ratio of 1:1 is used in Fig. 3a, c. More detailed performance with different HC/SCZ ratios is given in Supplementary Tables 6–10 and Supplementary Fig. 5.

Performance stability of different DL/ML methods: As shown in Fig. 3a, different EEG representations and DL/ML methods differ dramatically in cross-site diagnosis performance. The cross-site performance of EEGNet decreased significantly to 0.656/0.656 (vs. 0.968/0.968 on Chengdu), indicating overfitting and poor generalizability of the raw EEG time series. Despite its well-known transportability and robustness in previous works, the performance of LR also dropped sharply to 0.782/0.732 (vs. 0.921/0.875). In comparison, our base GCN model of multichannel-based graphs with interpretable biomarkers showed a relatively small decrease, to 0.798/0.798 (vs. 0.903/0.903).

Performance stability across different prevalence: Similar trends as in the HC/SCZ ratio of 1:1 (Fig. 3a) can also be observed in the performance of other ratios (Fig. 3b, Supplementary Tables 3–7, Supplementary Fig. 3). According to results of direct cross-site evaluation, the base GCN model demonstrated the strongest stability and robustness for different SCZ prevalence. The mean cross-site AUCs for the five HC/SCZ ratios were GCN 0.780 (±0.015) vs. LR 0.757 (±0.029), and mean ACCs were GCN 0.772 (±0.037) vs. LR 0.723 (±0.042).

Performance stability of different EEG graph representations: In addition to the simple left-right symmetric graph structure used in our base model (Fig. 2a), we also tested other structures with more complicated network connections between channels, as shown in Fig. 2b, c. We also explored adjusting the edge weights between node pairs (channel pairs) dynamically during model training. However, models built with these more complicated graph structures reduced the cross-site AUC by about 3% compared to the simple graph structure. One potential reason is that complex EEG networks lead to overfitting and reduce the generalizability of the diagnosis model.

Performance stability of TL strategies: Steady performance improvements can be observed when different TL strategies are implemented incrementally (Fig. 3b), with larger performance gains achieved by adding TUH (mean AUC: 0.787 vs. 0.824; mean ACC: 0.790 vs. 0.825). Overall, the proposed EEG_TL framework produced stable performance for different SCZ prevalence, with AUCs of 0.793–0.852 and accuracies of 0.786–0.858, respectively. An average improvement of 4.4% for AUC and 5.3% for ACC was achieved compared to the base model, and average improvements of 4.6–25.21% for AUC and 7.9–24.35% for ACC were achieved compared to the other implemented methods. A more detailed discussion of the performance contribution from each TL strategy can be found in the Discussion section.

In addition, performance on the Moscow dataset is reported in Supplementary Table 11. Notably, the Moscow dataset differs from Chengdu and TUH in age group (adolescents only), ethnicity (Russia vs. China vs. the USA), gender (boys only), and length of EEG samples (56 s vs. 2 min). Applying the base diagnosis model to the cross-site data directly yielded an AUC of 0.606 and an ACC of 0.595, lower than the cross-site performance on Hangzhou (0.798/0.798). Encouragingly, an improvement of ~10% was obtained by EEG_TL (0.702/0.690), demonstrating the potential of using TL strategies to scale up the dissemination of EEG-based diagnosis tools for SCZ.

Feature significance and generalization in HC and SCZ discrimination

Supplementary Fig. 6 and the section of Feature distribution comparison of HCs depict example feature distributions of HCs in the three datasets (i.e., TUH, Chengdu, Hangzhou). In addition, Fig. 4 visualizes feature comparisons between the Chengdu and Hangzhou datasets from different aspects:

Feature correlation heatmaps. In total, 4 groups of features in 5 frequency bands and 16 EEG channels are analyzed, including EEG frequency features of power (16*5), DE (16*5), EEG time-domain features (16*14), and demographic features (i.e., gender and age). SCZs in the Chengdu dataset are highly correlated with frequency features. In contrast, SCZs in the Hangzhou dataset also have a strong correlation with time-domain features in addition to frequency features.
Mean EEG power in each frequency band for HCs and SCZs. Significant differences in mean power values are observed not only between HCs/SCZs in two different datasets but also between HCs/SCZs of the same dataset, indicating the essential heterogeneity in EEG signals and the necessity to select the most informative features with good differential power and generalizability.
SHAP values. Fig. 4c shows the SHAP values of the top 15 features of the diagnosis model before and after TL, indicating the evolution of feature importance from Chengdu to Hangzhou through the TL process. Each row shows the importance of different values of a feature, with red for high feature values and blue for low feature values. The positive direction of the x-axis represents the importance of each feature value in supporting SCZ, and the negative direction represents the importance of each feature value in supporting HC. The rows are ranked vertically by overall feature importance.

As shown in Fig. 4c, salient features of Chengdu mainly consist of frequency features, while time-domain features (e.g., energy of time sequence) are among the top features in Hangzhou in addition to frequency features. This is aligned with the feature correlation shown in Fig. 4a and indicates the adaptation from Chengdu to Hangzhou. Despite significant changes in feature importance ranking between the Chengdu and Hangzhou datasets, it is interesting to note that the most important features remained the theta and alpha band power in both datasets. From a spatial perspective, we observed 5 common channels (Fp1, Fp2, P4, O1, and O2) in the top 15 features in both datasets, which may be related to essential heterogeneity. Although the consistency of spatial features is not as good as that of temporal features, it still suggests the existence of some common spatial-temporal EEG abnormality underlying SCZ pathology. Upon closer examination of the salient feature sets of Chengdu and Hangzhou, we found that different brain regions of the same frequency band showed varying degrees of importance in these two datasets (e.g., the alpha band power of the F4 channel vs. the O2 channel, the theta band power of the Fp1 channel vs. the O2 channel). In addition to the theta and alpha band power that demonstrated importance globally across different datasets, these findings suggest that different brain regions of the same frequency band may play different roles in distinguishing SCZ patients from healthy controls, highlighting the potential for personalized diagnosis and treatment based on individual differences in brain function.
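The feature layout and the channel comparison described above can be sketched as follows. The channel montage, band names, and feature naming scheme are assumptions for illustration (the text only specifies 16 channels, 5 frequency bands, 14 time-domain features per channel, and two demographic features), and the two short "top feature" lists are made up rather than taken from Fig. 4c.

```python
# Assumed 16-channel montage and band names (hypothetical; not confirmed here).
CHANNELS = ["Fp1", "Fp2", "F3", "F4", "F7", "F8", "C3", "C4",
            "T3", "T4", "T5", "T6", "P3", "P4", "O1", "O2"]
BANDS = ["delta", "theta", "alpha", "beta", "gamma"]
N_TIME_FEATURES = 14  # time-domain features per channel, as stated in the text


def build_feature_names():
    """Enumerate the four feature groups: power, DE, time-domain, demographics."""
    names = [f"power_{b}_{c}" for c in CHANNELS for b in BANDS]   # 16*5
    names += [f"de_{b}_{c}" for c in CHANNELS for b in BANDS]     # 16*5
    names += [f"time{k}_{c}" for c in CHANNELS
              for k in range(N_TIME_FEATURES)]                    # 16*14
    names += ["gender", "age"]                                    # demographics
    return names


def channels_in(features):
    """Channels referenced by a list of feature names (demographics are skipped)."""
    return {f.rsplit("_", 1)[-1] for f in features} & set(CHANNELS)


n_features = len(build_feature_names())

# Made-up top-feature lists for two sites (illustrative only).
top_site_a = ["power_theta_O2", "power_alpha_F4", "power_theta_Fp1", "age"]
top_site_b = ["power_alpha_O2", "time3_Fp1", "power_theta_P4"]
common = sorted(channels_in(top_site_a) & channels_in(top_site_b))
```

Under these assumptions the input has 16*5 + 16*5 + 16*14 + 2 = 386 features per participant, and the toy lists share the channels Fp1 and O2. The same set intersection applied to the real top-15 lists would give the five common channels reported above.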

Discussion

There is an urgent need for accurate and timely screening/diagnosis of SCZ globally, especially with objective and cost-effective assessments that can be widely disseminated in real-world settings [2, 3]. This study takes the initiative to investigate the cross-site transportability of EEG-based diagnosis models and the feasibility of employing TL techniques to improve the transportability. Overall, an AUC of 0.780 (±0.015) and an ACC of 0.772 (±0.037) were obtained for different SCZ prevalence in direct cross-site evaluation. The performance was further improved to an AUC of 0.824 (±0.027) and an ACC of 0.825 (±0.035) with TL strategies (Supplementary Fig. 2 and Supplementary Tables 3–7). Stable performance was produced in cross-site evaluation for different SCZ prevalence, indicating great potential in real clinical scenarios. Although feature importance varied significantly among different datasets, EEG theta and alpha power appeared to be the most significant and translational biomarkers of SCZ pathology.
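Summary figures of the form "mean (±spread)" across SCZ prevalence settings can be produced as in the sketch below. The prevalence levels and AUC values are made up, and the use of the sample standard deviation for the ± term is an assumption, since the text does not state which dispersion measure was used.

```python
from statistics import mean, stdev


def summarize(values):
    """Return the mean and the sample standard deviation of a list of metrics."""
    return mean(values), stdev(values)


# Hypothetical AUCs at three assumed prevalence settings (not study data).
auc_by_prevalence = {"10%": 0.70, "30%": 0.75, "50%": 0.80}

auc_mean, auc_sd = summarize(list(auc_by_prevalence.values()))
report = f"{auc_mean:.3f} (±{auc_sd:.3f})"
```

A small spread relative to the mean is what the text refers to as stable performance across prevalence settings.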

Interestingly, a combination of a simplified EEG graph structure and a set of semantically interpretable EEG features in DL-based models achieved the optimal transportability for SCZ diagnosis in this study. These results are encouraging for real-world applications and are aligned with findings on other disorders for cross-site generalizability [12, 50, 51]. It is important to note that EEG activity changes dynamically as the human brain develops from adolescence to adulthood [52]. Consequently, it is not surprising that when our model trained on adult data was applied to the adolescent Moscow dataset, its performance decreased. Additionally, medication may be a potential confounding factor. Patients in the Moscow adolescent dataset may be first-episode or in early stages, whereas the Chinese dataset consists of chronic inpatients, the majority of whom were receiving medication (Supplementary Table 12). These differences in both developmental stages and medication status may have contributed to the observed discrepancy in performance when the model was applied to the adolescent dataset. Encouragingly, an improvement of ~10% was obtained with TL strategies, demonstrating the potential to scale up the dissemination of EEG-based diagnosis tools for SCZ. Future research with enriched adolescent data will be conducted to further address this discrepancy.

Although previous studies have attempted to assist SCZ diagnosis through PET, MRI and fMRI, they have limitations in sample size, cost, promotion and feasibility in clinical practice. On the other hand, as a potentially cost-effective solution with wide dissemination, EEG is still rarely examined in large-scale research, which may be due not only to lack of information, but also to methodological problems. EEG is often highly contaminated with artifacts, and the quality of EEG data varies greatly across participants. Here, we applied a standardized preprocessing method, namely the Automagic method [40], which has been shown to be effective in automatically reducing a large number of EEG artifacts. Based on relatively large clinical EEG datasets from multiple sites, we have the opportunity to develop more advanced DL methods and provide new insights into SCZ pathology.

Although effective biomarkers are challenging to identify, they hold the potential not only to aid in the diagnosis, prognosis, risk stratification, and treatment monitoring of SCZ in clinical research, but also to provide insights into the biological foundations of the disorder. While psychiatric interview and examination can distinguish SCZ patients in clinical practice to some degree, EEG offers added value in understanding the underlying pathology of SCZ and guiding the monitoring of treatment response. Of note, the predictive value of a biomarker depends on its generalizability across clinical populations, as well as its sensitivity and specificity for the target population. In our study, although the top 15 features of the Chengdu and Hangzhou datasets are different, theta and alpha power disruptions appeared to be common in both datasets, suggesting that they are potential biomarkers of SCZ pathology. It is worth noting that previous studies have demonstrated that the abnormal EEG power of SCZ is not related to disease treatment or disease duration [53]. On the other hand, delta/theta power in patients with SCZ showed a significant positive correlation with frontal metabolism, while alpha power exhibited a significant positive correlation with subcortical metabolism. The positive association between low-frequency power in SCZ and frontal-subcortical metabolism [54] might be supported by the observed relationship between low-frequency power enhancement, negative symptoms, and ocular motor dysfunction. Furthermore, the disruption of thalamic-cortical networks could contribute to the symptomatology and information processing defects observed in patients with SCZ [55]. In summary, the clinical and biological correlates of low-frequency power abnormalities (theta and alpha) likely indicate the presence of thalamic and frontal-lobe dysfunction in SCZ.
Furthermore, low-frequency oscillations are associated with long-distance brain communication [62], which is particularly important when conceptualizing schizophrenia as a dysconnectivity syndrome [63, 64]. The dysconnection hypothesis in schizophrenia proposes a failure of functional integration within distributed but circumscribed neuronal systems, which may manifest as reduced functional connectivity compared to healthy controls [65]. Disruption of theta and alpha activity may thus establish a link between the symptoms and signs of schizophrenia and the dysconnection hypothesis. Additionally, further research should be undertaken on antipsychotic-naïve patients to examine whether EEG theta and alpha power can be utilized as potential biomarkers for predicting treatment response.

Translational ability and scalability of the proposed framework: (1) Given the limited sample sizes, population heterogeneity, and sampling variations, there is always a need for TL techniques for cross-site transportability. By integrating multiple effective TL strategies, the framework of EEG_TL is composable and flexible enough to adapt to various clinical settings with different resources. With the help of large-scale and low-cost external resources, model generalizability could be improved by enriched EEG representations. Diagnosis models can be applied directly to cross-site data, or first adjusted using inexpensive, unlabeled data from new sites if required. (2) Previous studies using MRI data have shown that hybrid data from multiple sources can increase model generalization on new datasets [56]. Our proposed framework is highly scalable with additional data sources. A domain discriminator can also be used in the pre-training stage to generalize feature representations from multi-source inputs for better performance.

Limitations and future work: (1) The experiments in this study used data from four sites. Although transportability across different population distributions was examined, the model needs to be validated in a next step on more samples from different settings. (2) Currently, EEG signals from the sixteen channels commonly present in the four resources were employed in this study. For the convenience of EEG signal sampling using wearable devices, performance and generalizability using a reduced number of channels (e.g., one or two) will be explored in the next step. (3) Moreover, the pre-configuration of external source ratios for data augmentation can be flexible in different clinical scenarios. (4) The proposed method can be easily generalized to other clinical diseases (such as depression and Alzheimer’s disease) and EEG-based applications (such as prognosis prediction). This has important implications for facilitating automated diagnosis of neurological and psychiatric disorders by leveraging large-scale, low-cost data resources. (5) Explainable Artificial Intelligence (XAI) comprises a collection of processes and techniques that enable human users to understand and trust the results and output generated by machine learning algorithms. While we have attempted to interpret our model using SHAP values in the present study, we acknowledge the value of incorporating XAI in future research. The approach will help illuminate the relevance of specific characteristics in the final model, further enhancing its trustworthiness and transparency.

In summary, this study investigated whether large-scale public EEG data from external domains and small-scale EEG data from different mental health sites can be leveraged to produce interpretable and translational models. Promising cross-site performance demonstrated the feasibility of using EEG biomarkers for differential diagnosis. The proposed framework is compositional and flexible to adapt to different clinical practice scenarios, and it can also be generalized to other clinical applications using EEG biomarkers.

Author contributions

The work presented here was carried out in collaboration among all authors. HJ, YZ, and TL designed the initial methods and experiments. HJ, LZ, QW, XL, WD, ZG, FH, SH, and TL discussed and refined the study design. CL and RX carried out the data collection; HJ, PC, and YZ analyzed the data; HJ and YZ interpreted the results and drafted the paper. All authors have contributed to, read, and approved the manuscript.

Funding

This work was partly supported by STI2030-Major Projects (grant numbers 2022ZD0212400, 2021ZD0200404), the Fundamental Research Funds for the Central Universities (grant number 226-2022-00138), Hangzhou Biomedical and Health Industry Special Projects for Science and Technology (2021WJCY240), the National Natural Science Foundation of China (grant number 81920108018), the Key R & D Program of Zhejiang (grant number 2022C03096), and the Project for Hangzhou Medical Disciplines of Excellence.

Data availability

The TUH dataset is openly available at https://isip.piconepress.com/projects/tuh_eeg/html/downloads.shtml, and the Moscow dataset is openly available at http://brain.bio.msu.ru/eeg_schizophrenia.htm. The Chengdu/Hangzhou clinical EEG datasets supporting the findings of this study are accessible from the corresponding author upon reasonable request, with the approval of the Institutional Review Board of the West China Hospital/the Hangzhou Seventh People’s Hospital.

Code availability

The code for the proposed approach is accessible at https://github.com/ChenPeiyin/Diagnosis-based-EEG-A-cross-site-study. Code for the traditional machine learning algorithms used in this study is publicly available at https://github.com/scikit-learn/scikit-learn. The code used for SHAP analysis is available at https://github.com/slundberg/shap.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

These authors contributed equally: Haiteng Jiang, Peiyin Chen.

These authors jointly supervised this work: Yaoyun Zhang, Tao Li.

Contributor Information

Yaoyun Zhang, Email: xiaoni5122@gmail.com.

Tao Li, Email: litaozjusc@zju.edu.cn.

Supplementary information

The online version contains supplementary material available at 10.1038/s41386-023-01658-5.

References

1. Saha S, Chant D, Welham J, McGrath J. A systematic review of the prevalence of schizophrenia. PLoS Med. 2005;2:e141. doi: 10.1371/journal.pmed.0020141.
2. Jin H, Mosweu I. The societal cost of schizophrenia: a systematic review. PharmacoEconomics. 2017;35:25–42. doi: 10.1007/s40273-016-0444-6.
3. Chong HY, Teoh SL, Wu DB, Kotirum S, Chiou CF, Chaiyakunapruk N. Global economic burden of schizophrenia: a systematic review. Neuropsychiatr Dis Treat. 2016;12:357–73. doi: 10.2147/NDT.S96649.
4. Qureshi O, Endale T, Ryan G, Miguel-Esponda G, Iyer SN, Eaton J, et al. Barriers and drivers to service delivery in global mental health projects. Int J Mental Health Syst. 2021;15:1–13. doi: 10.1186/s13033-020-00427-x.
5. Ke P, Xiong D, Li J, Pan Z, Zhou J, Li S, et al. An integrated machine learning framework for a discriminative analysis of schizophrenia using multi-biological data. Sci Rep. 2021;11:1–11. doi: 10.1038/s41598-021-94007-9.
6. Sadeghi D, Shoeibi A, Ghassemi N, Moridian P, Khadem A, Alizadehsani R, et al. An overview of artificial intelligence techniques for diagnosis of Schizophrenia based on magnetic resonance imaging modalities: Methods, challenges, and future works. Comput Biol Med. 2022;146:105554. doi: 10.1016/j.compbiomed.2022.105554.
7. Fernandes BS, Karmakar C, Tamouza R, Tran T, Yearwood J, Hamdani N, et al. Precision psychiatry with immunological and cognitive biomarkers: a multi-domain prediction for the diagnosis of bipolar disorder or schizophrenia using machine learning. Transl Psychiatry. 2020;10:1–13. doi: 10.1038/s41398-020-0836-4.
8. Shim M, Hwang H-J, Kim D-W, Lee S-H, Im C-H. Machine-learning-based diagnosis of schizophrenia using combined sensor-level and source-level EEG features. Schizophr Res. 2016;176:314–9. doi: 10.1016/j.schres.2016.05.007.
9. Schizophrenia. A survey of artificial intelligence techniques applied to detection and classification. Int J Environ Res Public Health. 2021;18:6099. doi: 10.3390/ijerph18116099.
10. Barros C, Silva CA, Pinheiro AP. Advanced EEG-based learning approaches to predict schizophrenia: Promises and pitfalls. Artif Intell Med. 2021;114:102039. doi: 10.1016/j.artmed.2021.102039.
11. Sakia RM. The Box‐Cox transformation technique: a review. J R Stat Soc Ser D (Statistician) 1992;41:169–78.
12. Saab K, Dunnmon J, Ré C, Rubin D, Lee-Messer C. Weak supervision as an efficient approach for automated seizure detection in electroencephalography. Npj Digit Med. 2020;3:1–12. doi: 10.1038/s41746-020-0264-0.
13. Tait L, Tamagnini F, Stothart G, Barvas E, Monaldini C, Frusciante R, et al. EEG microstate complexity for aiding early diagnosis of Alzheimer’s disease. Sci Rep. 2020;10:17627. doi: 10.1038/s41598-020-74790-7.
14. de Aguiar Neto FS, Rosa JLG. Depression biomarkers using non-invasive EEG: a review. Neurosci Biobehav Rev. 2019;105:83–93. doi: 10.1016/j.neubiorev.2019.07.021.
15. Newson JJ, Thiagarajan TC. EEG frequency bands in psychiatric disorders: a review of resting state studies. Front Hum Neurosci. 2019;12:521. doi: 10.3389/fnhum.2018.00521.
16. Bose T, Sivakumar SD, Kesavamurthy B. Identification of schizophrenia using EEG alpha band power during hyperventilation and post-hyperventilation. J Med Biol Eng. 2016;36:901–11.
17. Jeong JW, Wendimagegn TW, Chang E, Chun Y, Park JH, Kim HJ, et al. Classifying schizotypy using an audiovisual emotion perception test and scalp electroencephalography. Front Hum Neurosci. 2017;11:450. doi: 10.3389/fnhum.2017.00450.
18. Chu W-L, Huang M-W, Jian B-L, Cheng K-S. Analysis of EEG entropy during visual evocation of emotion in schizophrenia. Ann Gen Psychiatry. 2017;16:1–9. doi: 10.1186/s12991-017-0157-z.
19. Alimardani F, Boostani R. DB-FFR: a modified feature selection algorithm to improve discrimination rate between bipolar mood disorder (BMD) and schizophrenic patients. Iran J Sci Technol Trans Electr Eng. 2018;42:251–60.
20. Phang C-R, Ting C-M, Samdin SB, Ombao H. Classification of EEG-based effective brain connectivity in schizophrenia using deep neural networks. In: 2019 9th International IEEE/EMBS Conference on Neural Engineering (NER). IEEE; 2019. pp. 401–406.
21. Siuly S, Khare SK, Bajaj V, Wang H, Zhang Y. A computerized method for automatic detection of schizophrenia using EEG signals. IEEE Trans Neural Syst Rehabil. 2020;28:2390–2400. doi: 10.1109/TNSRE.2020.3022715.
22. Sun J, Cao R, Zhou M, Hussain W, Wang B, Xue J, et al. A hybrid deep neural network for classification of schizophrenia using EEG Data. Sci Rep. 2021;11:1–16.
23. Vasey B, Nagendran M, Campbell B, Clifton DA, Collins GS, Denaxas S, et al. Reporting guideline for the early-stage clinical evaluation of decision support systems driven by artificial intelligence: DECIDE-AI. Nat Med. 2022;28:924–33. doi: 10.1038/s41591-022-01772-9.
24. Mechelli A, Vieira S. From models to tools: clinical translation of machine learning studies in psychosis. NPJ Schizophr. 2020;6:4. doi: 10.1038/s41537-020-0094-8.
25. Schnack HG. Improving individual predictions: machine learning approaches for detecting and attacking heterogeneity in schizophrenia (and other psychiatric diseases). Schizophr Res. 2019;214:34–42. doi: 10.1016/j.schres.2017.10.023.
26. Hawkins DM. The problem of overfitting. J Chem Inf Comput Sci. 2004;44:1–12. doi: 10.1021/ci0342472.
27. Maswanganyi C, Tu C, Pius O, Du S. Overview of artifacts detection and elimination methods for BCI using EEG. In: 2018 IEEE 3rd International Conference on Image, Vision and Computing (ICIVC). IEEE; 2018. pp. 832–836.
28. Lu J, Behbood V, Hao P, Zuo H, Xue S, Zhang G. Transfer learning using computational intelligence: a survey. Knowl Based Syst. 2015;80:14–23.
29. Alzubaidi L, Zhang J, Humaidi AJ, Al-Dujaili A, Duan Y, Al-Shamma O, et al. Review of deep learning: concepts, CNN architectures, challenges, applications, future directions. J Big Data. 2021;8:1–74. doi: 10.1186/s40537-021-00444-8.
30. Li RC, Asch SM, Shah NH. Developing a delivery science for artificial intelligence in healthcare. NPJ Digit Med. 2020;3.1:1–3. doi: 10.1038/s41746-020-00318-y.
31. Yu K-H, Beam AL, Kohane IS. Artificial intelligence in healthcare. Nat Biomed Eng. 2018;2:719–31. doi: 10.1038/s41551-018-0305-z.
32. Sermesant M, Delingette H, Cochet H, Jaïs P, Ayache N. Applications of artificial intelligence in cardiovascular imaging. Nat Rev Cardiol. 2021;18:600–9. doi: 10.1038/s41569-021-00527-2.
33. Myszczynska MA, Ojamies PN, Lacoste A, Neil D, Saffari A, Mead R, et al. Applications of machine learning to diagnosis and treatment of neurodegenerative diseases. Nat Rev Neurol. 2020;16:440–56. doi: 10.1038/s41582-020-0377-8.
34. Yue X, Wang Z, Huang J, Parthasarathy S, Moosavinasab S, Huang Y, et al. Graph embedding on biomedical networks: methods, applications and evaluations. Bioinformatics. 2020;36:1241–51. doi: 10.1093/bioinformatics/btz718.
35. Wu Z, Pan S, Chen F, Long G, Zhang C, Yu PS. A comprehensive survey on graph neural networks. IEEE Trans neural Netw Learn Syst. 2020;32:4–24. doi: 10.1109/TNNLS.2020.2978386.
36. Gu X, Cao Z, Jolfaei A, Xu P, Wu D, Jung TP, et al. EEG-based brain-computer interfaces (BCIs): A survey of recent studies on signal sensing technologies and computational intelligence approaches and their applications. IEEE/ACM Trans Comput Biol Bioinform. 2021;18:1645–66. doi: 10.1109/TCBB.2021.3052811.
37. Portillo-Lara R, Tahirbegi B, Chapman CAR, Goding JA, Green RA. Mind the gap: State-of-the-art technologies and applications for EEG-based brain–computer interfaces. APL Bioeng. 2021;5:031507. doi: 10.1063/5.0047237.
38. Obeid I, Picone J. The temple university hospital EEG data corpus. Front Neurosci. 2016;10:196. doi: 10.3389/fnins.2016.00196.
39. Gorbachevskaya NN, Borisov SB. EEG data of healthy adolescents and adolescents with symptoms of schizophrenia. http://brain.bio.msu.ru/eeg_schizophrenia.htm.
40. Pedroni A, Bahreini A, Langer N. Automagic: standardized preprocessing of big EEG data. NeuroImage. 2019;200:460–73. doi: 10.1016/j.neuroimage.2019.06.046.
41. Gemein L, Schirrmeister RT, Chrabszcz P, Wilson D, Ball T. Machine-learning-based diagnostics of EEG pathology. Neuroimage. 2020;220:117021. doi: 10.1016/j.neuroimage.2020.117021.
42. Newson JJ, Thiagarajan TC. EEG frequency bands in psychiatric disorders: a review of resting state studies. Front Hum Neurosci. 2019;12:521. doi: 10.3389/fnhum.2018.00521.
43. Lawhern VJ, Solon AJ, Waytowich NR, Gordon SM, Hung CP, Lance BJ. EEGNet: a compact convolutional neural network for EEG-based brain–computer interfaces. J neural Eng. 2018;15:056013. doi: 10.1088/1741-2552/aace8c.
44. Harati A, Lopez S, Obeid I, Picone J, Jacobson M, Tobochnik S. The TUH EEG CORPUS: A big data resource for automated EEG interpretation. In: 2014 IEEE Signal Processing in Medicine and Biology Symposium (SPMB), Philadelphia, PA, USA: 2014. pp. 1–5. doi: 10.1109/SPMB.2014.7002953.
45. Veličković P, Cucurull G, Casanova A, Romero A, Liò P, Bengio Y. Graph Attention Networks. 2018. https://arxiv.org/abs/1710.10903.
46. Tikka SK, Singh BK, Nizamie SH, Garg S, Mandal S, Thakur K, et al. Artificial intelligence-based classification of schizophrenia: A high density electroencephalographic and support vector machine study. Indian J Psychiatry. 2020;62:273–82. doi: 10.4103/psychiatry.IndianJPsychiatry_91_20.
47. Austin PC, Steyerberg EW. Interpreting the concordance statistic of a logistic regression model: relation to the variance and odds ratio of a continuous explanatory variable. BMC Med Res Methodol. 2012;12:82. doi: 10.1186/1471-2288-12-82.
48. Liaw A, Wiener M. Classification and regression by randomForest. R N. 2002;2:18–22.
49. XGBoost Documentation — xgboost 1.6.2 documentation. https://xgboost.readthedocs.io/en/stable/.
50. Kalantidis Y, et al. Hard negative mixing for contrastive learning. Adv Neural Inf Process Syst. 2020;33:21798–809.
51. Wang C, Li Y, Tsuboshita Y, Sakurai T, Goto T, Yamaguchi H, et al. A high-generalizability machine learning framework for predicting the progression of Alzheimer’s disease using limited data. NPJ Digit Med. 2022;5.1:1–10. doi: 10.1038/s41746-022-00577-x.
52. Kaminska A, Eisermann M, Plouin P. Child EEG (and maturation). Handb Clin Neurol. 2019;160:125–42. doi: 10.1016/B978-0-444-64032-1.00008-4.
53. Sponheim SR, Clementz BA, Iacono WG, Beiser M. Clinical and biological concomitants of resting state EEG power abnormalities in schizophrenia. Biol Psychiatry. 2000;48:1088–97. doi: 10.1016/s0006-3223(00)00907-0.
54. Alper K, Gunther W, Prichep LS, John ER, Brodie J. Correlation of qEEG with PET in schizophrenia. Neuropsychobiology. 1998;38:50–56. doi: 10.1159/000026516.
55. Steriade M, Gloor P, Llinas RR, Lopes da Silva FH, Mesulam MM. Report of IFCN committee on basic mechanisms: Basic mechanisms of cerebral rhythmic activities. Electroencephalogr Clin Neurophysiol. 1990;76:481–508. doi: 10.1016/0013-4694(90)90001-z.
56. Orban P, Dansereau C, Desbois L, Mongeau-Pérusse V, Giguère CÉ, Nguyen H, et al. Multisite generalizability of schizophrenia diagnosis classification based on functional brain connectivity. Schizophr Res. 2018;192:167–71. doi: 10.1016/j.schres.2017.05.027.

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supplementary Materials^{(2.2MB, docx)}

Data Availability Statement

[CR1] 1.Saha S, Chant D, Welham J, McGrath J. A systematic review of the prevalence of schizophrenia. PLoS Med. 2005;2:e141. doi: 10.1371/journal.pmed.0020141. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR2] 2.Jin H, Mosweu I. The societal cost of schizophrenia: a systematic review. PharmacoEconomics. 2017;35:25–42. doi: 10.1007/s40273-016-0444-6. [DOI] [PubMed] [Google Scholar]

[CR3] 3.Chong HY, Teoh SL, Wu DB, Kotirum S, Chiou CF, Chaiyakunapruk N. Global economic burden of schizophrenia: a systematic review. Neuropsychiatr Dis Treat. 2016;12:357–73. doi: 10.2147/NDT.S96649. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR4] 4.Qureshi O, Endale T, Ryan G, Miguel-Esponda G, Iyer SN, Eaton J, et al. Barriers and drivers to service delivery in global mental health projects. Int J Mental Health Syst. 2021;15:1–13. doi: 10.1186/s13033-020-00427-x. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR5] 5.Ke P, Xiong D, Li J, Pan Z, Zhou J, Li S, et al. An integrated machine learning framework for a discriminative analysis of schizophrenia using multi-biological data. Sci Rep. 2021;11:1–11. doi: 10.1038/s41598-021-94007-9. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR6] 6.Sadeghi D, Shoeibi A, Ghassemi N, Moridian P, Khadem A, Alizadehsani R, et al. An overview of artificial intelligence techniques for diagnosis of Schizophrenia based on magnetic resonance imaging modalities: Methods, challenges, and future works. Comput Biol Med. 2022;146:105554. doi: 10.1016/j.compbiomed.2022.105554. [DOI] [PubMed] [Google Scholar]

[CR7] 7.Fernandes BS, Karmakar C, Tamouza R, Tran T, Yearwood J, Hamdani N, et al. Precision psychiatry with immunological and cognitive biomarkers: a multi-domain prediction for the diagnosis of bipolar disorder or schizophrenia using machine learning. Transl Psychiatry. 2020;10:1–13. doi: 10.1038/s41398-020-0836-4. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR8] 8.Shim M, Hwang H-J, Kim D-W, Lee S-H, Im C-H. Machine-learning-based diagnosis of schizophrenia using combined sensor-level and source-level EEG features. Schizophr Res. 2016;176:314–9. doi: 10.1016/j.schres.2016.05.007. [DOI] [PubMed] [Google Scholar]

[CR9] 9.Schizophrenia. A survey of artificial intelligence techniques applied to detection and classification. Int J Environ Res Public Health. 2021;18:6099. doi: 10.3390/ijerph18116099. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR10] 10.Barros C, Silva CA, Pinheiro AP. Advanced EEG-based learning approaches to predict schizophrenia: Promises and pitfalls. Artif Intell Med. 2021;114:102039. doi: 10.1016/j.artmed.2021.102039. [DOI] [PubMed] [Google Scholar]

[CR11] 11.Sakia RM. The Box‐Cox transformation technique: a review. J R Stat Soc Ser D (Statistician) 1992;41:169–78. [Google Scholar]

[CR12] 12.Saab K, Dunnmon J, Ré C, Rubin D, Lee-Messer C. Weak supervision as an efficient approach for automated seizure detection in electroencephalography. Npj Digit Med. 2020;3:1–12. doi: 10.1038/s41746-020-0264-0. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR13] 13.Tait L, Tamagnini F, Stothart G, Barvas E, Monaldini C, Frusciante R, et al. EEG microstate complexity for aiding early diagnosis of Alzheimer’s disease. Sci Rep. 2020;10:17627. doi: 10.1038/s41598-020-74790-7. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR14] 14.de Aguiar Neto FS, Rosa JLG. Depression biomarkers using non-invasive EEG: a review. Neurosci Biobehav Rev. 2019;105:83–93. doi: 10.1016/j.neubiorev.2019.07.021. [DOI] [PubMed] [Google Scholar]

[CR15] 15.Newson JJ, Thiagarajan TC. EEG frequency bands in psychiatric disorders: a review of resting state studies. Front Hum Neurosci. 2019;12:521. doi: 10.3389/fnhum.2018.00521. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR16] 16.Bose T, Sivakumar SD, Kesavamurthy B. Identification of schizophrenia using EEG alpha band power during hyperventilation and post-hyperventilation. J Med Biol Eng. 2016;36:901–11. [Google Scholar]

[CR17] 17.Jeong JW, Wendimagegn TW, Chang E, Chun Y, Park JH, Kim HJ, et al. Classifying schizotypy using an audiovisual emotion perception test and scalp electroencephalography. Front Hum Neurosci. 2017;11:450. doi: 10.3389/fnhum.2017.00450. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR18] 18.Chu W-L, Huang M-W, Jian B-L, Cheng K-S. Analysis of EEG entropy during visual evocation of emotion in schizophrenia. Ann Gen Psychiatry. 2017;16:1–9. doi: 10.1186/s12991-017-0157-z. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR19] 19.Alimardani F, Boostani R. DB-FFR: a modified feature selection algorithm to improve discrimination rate between bipolar mood disorder (BMD) and schizophrenic patients. Iran J Sci Technol Trans Electr Eng. 2018;42:251–60. [Google Scholar]

[CR20] 20.Phang, C-R, Ting, C-M, Samdin, SB, Ombao, H. Classification of EEG-based effective brain connectivity in schizophrenia using deep neural networks. In: 2019 9th International IEEE/EMBS Conference on Neural Engineering (NER). IEEE; 2019. pp. 401–406.

[CR21] 21.Siuly S, Khare SK, Bajaj V, Wang H, Zhang Y. A computerized method for automatic detection of schizophrenia using EEG signals. IEEE Trans Neural Syst Rehabil. 2020;28:2390–2400. doi: 10.1109/TNSRE.2020.3022715. [DOI] [PubMed] [Google Scholar]

[CR22] 22.Sun J, Cao R, Zhou M, Hussain W, Wang B, Xue J, et al. A hybrid deep neural network for classification of schizophrenia using EEG Data. Sci Rep. 2021;11:1–16. [Google Scholar]

[CR23] 23.Vasey B, Nagendran M, Campbell B, Clifton DA, Collins GS, Denaxas S, et al. Reporting guideline for the early-stage clinical evaluation of decision support systems driven by artificial intelligence: DECIDE-AI. Nat Med. 2022;28:924–33. doi: 10.1038/s41591-022-01772-9. [DOI] [PubMed] [Google Scholar]

[CR24] 24.Mechelli A, Vieira S. From models to tools: clinical translation of machine learning studies in psychosis. NPJ Schizophr. 2020;6:4. doi: 10.1038/s41537-020-0094-8.

[CR25] 25.Schnack HG. Improving individual predictions: machine learning approaches for detecting and attacking heterogeneity in schizophrenia (and other psychiatric diseases). Schizophr Res. 2019;214:34–42. doi: 10.1016/j.schres.2017.10.023.

[CR26] 26.Hawkins DM. The problem of overfitting. J Chem Inf Comput Sci. 2004;44:1–12. doi: 10.1021/ci0342472.

[CR27] 27.Maswanganyi C, Tu C, Pius O, Du S. Overview of artifacts detection and elimination methods for BCI using EEG. In: 2018 IEEE 3rd International Conference on Image, Vision and Computing (ICIVC). IEEE; 2018. pp. 832–836.

[CR28] 28.Lu J, Behbood V, Hao P, Zuo H, Xue S, Zhang G. Transfer learning using computational intelligence: a survey. Knowl Based Syst. 2015;80:14–23.

[CR29] 29.Alzubaidi L, Zhang J, Humaidi AJ, Al-Dujaili A, Duan Y, Al-Shamma O, et al. Review of deep learning: concepts, CNN architectures, challenges, applications, future directions. J Big Data. 2021;8:1–74. doi: 10.1186/s40537-021-00444-8.

[CR30] 30.Li RC, Asch SM, Shah NH. Developing a delivery science for artificial intelligence in healthcare. NPJ Digit Med. 2020;3.1:1–3. doi: 10.1038/s41746-020-00318-y.

[CR31] 31.Yu K-H, Beam AL, Kohane IS. Artificial intelligence in healthcare. Nat Biomed Eng. 2018;2:719–31. doi: 10.1038/s41551-018-0305-z.

[CR32] 32.Sermesant M, Delingette H, Cochet H, Jaïs P, Ayache N. Applications of artificial intelligence in cardiovascular imaging. Nat Rev Cardiol. 2021;18:600–9. doi: 10.1038/s41569-021-00527-2.

[CR33] 33.Myszczynska MA, Ojamies PN, Lacoste A, Neil D, Saffari A, Mead R, et al. Applications of machine learning to diagnosis and treatment of neurodegenerative diseases. Nat Rev Neurol. 2020;16:440–56. doi: 10.1038/s41582-020-0377-8.

[CR34] 34.Yue X, Wang Z, Huang J, Parthasarathy S, Moosavinasab S, Huang Y, et al. Graph embedding on biomedical networks: methods, applications and evaluations. Bioinformatics. 2020;36:1241–51. doi: 10.1093/bioinformatics/btz718.

[CR35] 35.Wu Z, Pan S, Chen F, Long G, Zhang C, Yu PS. A comprehensive survey on graph neural networks. IEEE Trans Neural Netw Learn Syst. 2020;32:4–24. doi: 10.1109/TNNLS.2020.2978386.

[CR36] 36.Gu X, Cao Z, Jolfaei A, Xu P, Wu D, Jung TP, et al. EEG-based brain-computer interfaces (BCIs): A survey of recent studies on signal sensing technologies and computational intelligence approaches and their applications. IEEE/ACM Trans Comput Biol Bioinform. 2021;18:1645–66. doi: 10.1109/TCBB.2021.3052811.

[CR37] 37.Portillo-Lara R, Tahirbegi B, Chapman CAR, Goding JA, Green RA. Mind the gap: State-of-the-art technologies and applications for EEG-based brain–computer interfaces. APL Bioeng. 2021;5:031507. doi: 10.1063/5.0047237.

[CR38] 38.Obeid I, Picone J. The temple university hospital EEG data corpus. Front Neurosci. 2016;10:196. doi: 10.3389/fnins.2016.00196.

[CR39] 39.Gorbachevskaya NN, Borisov SB. EEG data of healthy adolescents and adolescents with symptoms of schizophrenia. http://brain.bio.msu.ru/eeg_schizophrenia.htm.

[CR40] 40.Pedroni A, Bahreini A, Langer N. Automagic: standardized preprocessing of big EEG data. NeuroImage. 2019;200:460–73. doi: 10.1016/j.neuroimage.2019.06.046.

[CR41] 41.Gemein L, Schirrmeister RT, Chrabszcz P, Wilson D, Ball T. Machine-learning-based diagnostics of EEG pathology. NeuroImage. 2020;220:117021. doi: 10.1016/j.neuroimage.2020.117021.

[CR42] 42.Newson JJ, Thiagarajan TC. EEG frequency bands in psychiatric disorders: a review of resting state studies. Front Hum Neurosci. 2019;12:521. doi: 10.3389/fnhum.2018.00521.

[CR43] 43.Lawhern VJ, Solon AJ, Waytowich NR, Gordon SM, Hung CP, Lance BJ. EEGNet: a compact convolutional neural network for EEG-based brain–computer interfaces. J Neural Eng. 2018;15:056013. doi: 10.1088/1741-2552/aace8c.

[CR44] 44.Harati A, Lopez S, Obeid I, Picone J, Jacobson M, Tobochnik S. The TUH EEG CORPUS: A big data resource for automated EEG interpretation. In: 2014 IEEE Signal Processing in Medicine and Biology Symposium (SPMB), Philadelphia, PA, USA: 2014. pp. 1–5. doi: 10.1109/SPMB.2014.7002953.

[CR45] 45.Veličković P, Cucurull G, Casanova A, Romero A, Liò P, Bengio Y. Graph Attention Networks. 2018. https://arxiv.org/abs/1710.10903.

[CR46] 46.Tikka SK, Singh BK, Nizamie SH, Garg S, Mandal S, Thakur K, et al. Artificial intelligence-based classification of schizophrenia: A high density electroencephalographic and support vector machine study. Indian J Psychiatry. 2020;62:273–82. doi: 10.4103/psychiatry.IndianJPsychiatry_91_20.

[CR47] 47.Austin PC, Steyerberg EW. Interpreting the concordance statistic of a logistic regression model: relation to the variance and odds ratio of a continuous explanatory variable. BMC Med Res Methodol. 2012;12:82. doi: 10.1186/1471-2288-12-82.

[CR48] 48.Liaw A, Wiener M. Classification and regression by randomForest. R News. 2002;2:18–22.

[CR49] 49.XGBoost Documentation — xgboost 1.6.2 documentation. https://xgboost.readthedocs.io/en/stable/.

[CR50] 50.Kalantidis Y, et al. Hard negative mixing for contrastive learning. Adv Neural Inf Process Syst. 2020;33:21798–809.

[CR51] 51.Wang C, Li Y, Tsuboshita Y, Sakurai T, Goto T, Yamaguchi H, et al. A high-generalizability machine learning framework for predicting the progression of Alzheimer’s disease using limited data. NPJ Digit Med. 2022;5.1:1–10. doi: 10.1038/s41746-022-00577-x.

[CR52] 52.Kaminska A, Eisermann M, Plouin P. Child EEG (and maturation). Handb Clin Neurol. 2019;160:125–42. doi: 10.1016/B978-0-444-64032-1.00008-4.

[CR53] 53.Sponheim SR, Clementz BA, Iacono WG, Beiser M. Clinical and biological concomitants of resting state EEG power abnormalities in schizophrenia. Biol Psychiatry. 2000;48:1088–97. doi: 10.1016/s0006-3223(00)00907-0.

[CR54] 54.Alper K, Gunther W, Prichep LS, John ER, Brodie J. Correlation of qEEG with PET in schizophrenia. Neuropsychobiology. 1998;38:50–56. doi: 10.1159/000026516.

[CR55] 55.Steriade M, Gloor P, Llinas RR, Lopes da Silva FH, Mesulam MM. Report of IFCN committee on basic mechanisms: Basic mechanisms of cerebral rhythmic activities. Electroencephalogr Clin Neurophysiol. 1990;76:481–508. doi: 10.1016/0013-4694(90)90001-z.

[CR56] 56.Orban P, Dansereau C, Desbois L, Mongeau-Pérusse V, Giguère CÉ, Nguyen H, et al. Multisite generalizability of schizophrenia diagnosis classification based on functional brain connectivity. Schizophr Res. 2018;192:167–71. doi: 10.1016/j.schres.2017.05.027.
