APL Bioengineering. 2025 Sep 22;9(3):036118. doi: 10.1063/5.0250559

Machine learning-enabled detection of electrophysiological signatures in iPSC-derived models of schizophrenia and bipolar disorder

Kai Cheng 1, Autumn Williams 1, Anannya Kshirsagar 2, Sai Kulkarni 2, Rakesh Karmacharya 3,4,5,6,7, Deok-Ho Kim 1,8, Sridevi V Sarma 1,9,10,11, Annie Kathuria 1,8,11,a)
PMCID: PMC12456967  PMID: 40995147

Abstract

Neuropsychiatric disorders such as schizophrenia (SCZ) and bipolar disorder (BD) remain challenging to diagnose due to the absence of objective biomarkers, with current assessments relying largely on subjective clinical evaluations. In this study, we present a computational analysis pipeline designed to identify disease-specific electrophysiological signatures from multi-electrode array (MEA) recordings of patient-derived cerebral organoids (COs) and two-dimensional cortical interneuron cultures (2DNs). Using a Support Vector Machine classifier optimized for high-dimensional data, we achieved 95.8% classification accuracy in distinguishing SCZ from control samples in 2DNs under both baseline and post-electrical-stimulation (PES) conditions with the extracted electrophysiological signatures. In COs, classification accuracy improved from 83.3% at baseline to 91.6% following PES, enabling robust separation of control, SCZ, and BD cohorts. Key discriminative features included channel-specific measures of network activity, with PES significantly enhancing classification performance, particularly for BD. These results underscore the potential of MEA-based functional phenotyping, coupled with machine learning, to uncover reliable, stimulation-sensitive electrophysiological biomarkers, offering a path toward more objective diagnosis and personalized treatment strategies for neuropsychiatric disorders.

I. INTRODUCTION

Neuropsychiatric disorders, such as schizophrenia (SCZ) and bipolar disorder (BD), pose a significant global health burden, affecting millions of individuals worldwide.1 Precise prevalence estimates of these disorders are impossible to obtain due to clinical and methodological factors, such as the complexity of neuropsychiatric diagnosis, the overlap of clinical symptomatology with other disorders, and varying methods for determining diagnoses. Given these complexities, SCZ, BD, and other psychotic disorders are often combined in prevalence estimation studies.2 The diagnosis and treatment of SCZ and BD are based only on clinical symptomatology. Current treatments are only partly effective, and there are no biomarkers to aid in diagnosis, to guide treatment decisions, or to monitor treatment response.3

Traditional postmortem studies have been instrumental in advancing our understanding of the neuropathology underlying SCZ and BD. These studies have provided critical insights into the cellular and molecular abnormalities associated with these conditions, forming the foundation for subsequent hypotheses and experimental approaches. One of the most consistently replicated findings from postmortem studies in SCZ is evidence of GABAergic deficits in the prefrontal cortex, suggesting a decrease in the activity of cortical interneurons.3 This has been supported by observations of reduced levels of glutamate decarboxylase 67 (GAD67), a key enzyme in GABA synthesis, in the prefrontal cortex of SCZ patients.4 Additionally, postmortem studies have revealed synaptic abnormalities, with reductions in postsynaptic elements observed in cortical tissue. In BD, postmortem studies have identified reduced glial populations in the prefrontal cortex, highlighting dysregulated synaptic protein levels, mitochondrial function, and immune response genes. These studies have also implicated abnormalities in neuronal and calcium signaling pathways.

While postmortem studies provide valuable insights suggesting critical dysfunctions within neural circuits and cellular pathways that correlate with neuropsychiatric symptoms, the static nature of postmortem data and its susceptibility to postmortem degradation limit the temporal resolution needed to capture disease progression or neurodevelopmental disruptions that unfold over time.

To overcome these limitations, patient-derived cerebral organoids (COs) and induced pluripotent stem cell (iPSC) neural cultures now serve as dynamic, genetically personalized models for studying neuropsychiatric disorders in real time. Derived from patient somatic cells, these models retain the individual's genetic background, providing a personalized disease model that can reflect patient-specific risk variants or mutations and enabling the examination of patient-specific genetic variants and their impact on neurodevelopment and network formation. This is particularly valuable in neuropsychiatric research, where high genetic diversity complicates standardized disease models. COs, which recapitulate aspects of human brain architecture and connectivity, provide a platform to investigate neurodevelopmental abnormalities thought to be integral to SCZ and BD. Furthermore, they allow for controlled environmental manipulations, such as electrical or chemical stimulation, to probe disease-specific responses, offering insights into neural dynamics that are inaccessible through postmortem approaches.

To address the critical need for objective biomarkers in neuropsychiatry, this study applies computational models to electrophysiological data from BD and SCZ patient-derived COs and cortical interneurons, including co-cultured excitatory and inhibitory neurons.5 We use data from our previously published studies,3,6,7 in which patient-derived COs and two-dimensional cortical interneuron cultures (2DNs) were recorded with multi-electrode arrays (MEAs). From these data, we have developed a digital analysis pipeline (DAP) that builds upon the foundation established by electroencephalography (EEG) analyses.8–14 This approach aims to uncover distinct electrophysiological signatures associated with SCZ and BD, offering a physiologically relevant and comprehensive assessment of neural network dynamics. The application of MEA in patient-derived COs enables comprehensive analysis of neural network dynamics and extraction of biomarkers that can serve as objective diagnostic indicators, complementing insights gleaned from traditional EEG techniques.15 By bridging the gap between advanced in vitro models, cutting-edge electrophysiological analysis, and machine learning-driven feature extraction, this study presents a promising approach to addressing the critical need for validated biomarkers in the field of neuropsychiatry.

The DAP designed in this study investigates the electrophysiological properties of COs and 2DNs derived from patients with SCZ and BD (Fig. 1). Utilizing MEA and stimulus–response dynamic network modeling (SRDNM), we analyzed neural recordings to identify influential nodes in the neural network. An extensive feature map of the sink index dynamics was computed for each channel and screened by a feature selection algorithm16 to identify the features most significant to cohort classification. These features were then used to train a Support Vector Machine (SVM) classifier to distinguish different patient-derived organoids from healthy controls [Figs. 2(a) and 2(c)].
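
The channel-wise feature map described above can be illustrated with a minimal sketch. It assumes the sink index has already been computed as one time series per channel (the sink index formula itself is not reproduced here), and it uses synthetic random values in place of real recordings. The function names `channel_features` and `feature_map`, the key format such as `median_ch12`, and the choice of lag-1 autocorrelation are illustrative assumptions, not the authors' implementation. Cross-channel features mentioned in the text (covariance between channels, singular values) are omitted for brevity.

```python
import random
import statistics


def channel_features(series):
    """Summary statistics of one channel's sink-index time series."""
    n = len(series)
    mean = statistics.fmean(series)
    # Population central moments (divide by n).
    m2 = sum((x - mean) ** 2 for x in series) / n
    m3 = sum((x - mean) ** 3 for x in series) / n
    m4 = sum((x - mean) ** 4 for x in series) / n
    # Lag-1 autocorrelation (biased estimator): series against itself shifted by one step.
    lag1 = sum((series[i] - mean) * (series[i + 1] - mean) for i in range(n - 1)) / (n * m2)
    return {
        "mean": mean,
        "median": statistics.median(series),
        "range": max(series) - min(series),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
        "autocorr_lag1": lag1,
    }


def feature_map(sink_index):
    """Flatten per-channel features into one dict keyed like 'median_ch12'."""
    features = {}
    for ch, series in enumerate(sink_index, start=1):
        for name, value in channel_features(series).items():
            features[f"{name}_ch{ch}"] = value
    return features


# Synthetic stand-in (assumption): 16 channels, 200 time windows each.
rng = random.Random(0)
sink_index = [[rng.random() for _ in range(200)] for _ in range(16)]
fmap = feature_map(sink_index)
print(len(fmap))  # 16 channels x 6 statistics = 96 features
```

Each sample (well) then contributes one such feature vector to the selection and classification steps.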

FIG. 1.

The workflow of the proposed analysis pipeline was adapted from EEG analysis12 to uncover distinct electrophysiological signatures associated with schizophrenia (SCZ) and bipolar disorder (BD). The process begins with the generation of iPSCs from patient cohorts, including both SCZ and BD patients, as well as healthy controls. These iPSCs are differentiated into cerebral organoids (COs) and cortical interneurons, which serve as in vitro models for studying neuropsychiatric disorders. The COs and cortical interneurons in monolayer culture (2DNs) are then subjected to electrophysiological analysis using a MED64-Presto (Alpha MED Scientific Inc., Osaka, Japan) setup to capture dynamic neural activity. The raw MEA recordings from these models are processed through a series of preprocessing steps, including noise filtering, down-sampling, spike collation, and rate coding, to identify specific patterns of neural firing and connectivity. The refined data are used to develop stimulus–response dynamic network models, visualized as heatmaps and connectivity sink matrices. A feature map computed from the sink matrices is passed through the minimum redundancy maximum relevance (MRMR) selection algorithm.16 The selected features, considered as distinct electrophysiological signatures associated with SCZ and BD, facilitate a machine learning-based classification of patient cohorts, enabling the identification of unique neural signatures and network dynamics that differentiate SCZ and BD from each other and from healthy controls. This integrated approach leverages iPSC-derived in vitro models and advanced computational modeling to provide insights into the pathophysiology of these disorders.
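
As an illustration of the rate-coding step named in the caption, the following sketch bins spike timestamps from one channel into fixed-width windows and converts the counts to firing rates. The function name `rate_code`, the 0.25 s bin width, and the example spike times are hypothetical; the caption does not specify the binning parameters used in the study.

```python
import math


def rate_code(spike_times, duration_s, bin_s):
    """Count spikes per fixed-width bin and convert counts to rates in Hz."""
    n_bins = int(round(duration_s / bin_s))
    counts = [0] * n_bins
    for t in spike_times:
        idx = math.floor(t / bin_s)
        # Ignore spikes that fall outside the recording window.
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return [c / bin_s for c in counts]


# Hypothetical spike times (seconds) on one channel during a 1 s recording.
spikes = [0.05, 0.12, 0.31, 0.33, 0.38, 0.95]
rates = rate_code(spikes, duration_s=1.0, bin_s=0.25)
print(rates)  # [8.0, 12.0, 0.0, 4.0]
```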

FIG. 2.

Classification of control vs schizophrenia (SCZ) in two-dimensional neuron (2DN) cultures using sink index features from MEA recordings. (a) 3D scatterplot illustrating the separation of control and SCZ 2DNs under baseline conditions using an SVM classifier at an accuracy of 0.938. Sink index features, calculated by Eq. (2), were selected for cohort prediction by MRMR. Significant sink index features identified include covariance of channels 9 and 11, and the median of channel 12. (b) Confusion matrix depicting classification performance of the 2DNs SVM in the baseline condition. (c) 3D scatterplot illustrating the separation of control and SCZ 2DNs under PES condition using an SVM classifier at an accuracy of 0.958. Sink index features were selected for cohort prediction by MRMR. Significant sink index features identified include the autocorrelation of channel 1, the range of channel 9, and the kurtosis of channel 14. (d) Confusion matrix depicting classification performance of the 2DNs SVM model in the PES condition. (e) Linear Discriminant Analysis (LDA) projection of the selected feature space for 2DNs under baseline conditions, reduced through PCA to capture >95% variance. The projection highlights the limited separability between control (blue circles) and SCZ (orange squares) cohorts, as shown by overlapping density distributions in the histogram of LDA component 1. (f) LDA projection of 2DNs under PES conditions, revealing significantly improved class separability. The histogram shows clear shifts in mean values of LDA component 1 for control and SCZ cohorts, with minimal overlap, reflecting enhanced discrimination following electrical stimulation.

II. RESULTS

This section reports the results of feature extraction from previously published MEA data.3,6,7 We present cohort classification and evaluation of electrophysiological characteristics by applying the DAP methodology. The analysis was performed on electrophysiological data from patient-derived 2DNs and COs. Data were collected under both baseline and post-electrical-stimulation (PES) conditions using an MEA system (Alpha MED Scientific Inc., Osaka, Japan). Each well in the MEA plate has a 4 × 4 microelectrode array, and each electrode corresponds to a channel, indicated by a subscript number from 1 to 16 in the channel-wise sink index features. The experimental cultures, generated and maintained following standardized protocols,17,18 were derived from iPSCs of individuals with SCZ, BD, and healthy controls. Detailed protocols for culture maintenance, differentiation efficiency, and conditions used for recording and stimulation are elaborated in Sec. V.

A. Classification results and electrophysiological insights

In 2DN cultures, the baseline condition highlighted significant distinctions between control and SCZ cohorts. Figures 2(a) and 2(c) show scatterplots in a three-dimensional feature space with the SVM decision boundary for 2DNs separating the healthy control cohort (n = 24 for baseline, n = 12 for PES condition, as blue dots) vs the SCZ cohort (n = 24 for baseline, n = 12 for PES condition, as orange squares) under baseline [Fig. 2(a)] and PES conditions [Fig. 2(c)]. The scatterplots illustrate the separation of control and SCZ neurons based on significant features of the sink index, indicating that features related to channel-wise sink index dynamics can distinguish healthy and SCZ populations with high accuracy.

In the control vs SCZ baseline [Fig. 2(a)], three channel-wise sink index features were identified as critical to cohort separation: the median of channel 12 (r = 0.86) and the covariance of channels 9 (r = 0.42) and 11 (r = 0.45), achieving a classification accuracy of 0.938 in the validation test (supplementary material Fig. S1A). In the PES condition [Fig. 2(c)], sink index features identified include the autocorrelation of channel 1 (r = 0.78), range of channel 9 (r = 1.22), and kurtosis of channel 14 (r = 11.4), achieving a classification accuracy of 0.958 in the validation test (supplementary material Fig. S1B). The respective confusion matrices [Figs. 2(b) and 2(d)] provide a visual representation of classification performance. Only three instances among the 48 samples in the baseline condition and one instance among the 24 samples in the PES condition were misclassified, indicating a robust separation between control and SCZ classes under both paradigms. These results suggest that the SCZ neurons exhibit distinct sink-related electrophysiological features that are accentuated during active stimulation, providing a dynamic measure of disease-related alterations.
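
The reported accuracies follow directly from the misclassification counts in the confusion matrices. A short check, using only the counts stated above (the helper name `accuracy` is illustrative):

```python
def accuracy(n_total, n_misclassified):
    """Fraction of samples assigned to the correct cohort."""
    return (n_total - n_misclassified) / n_total


baseline = accuracy(48, 3)  # 2DNs, baseline: 3 of 48 misclassified
pes = accuracy(24, 1)       # 2DNs, PES: 1 of 24 misclassified
print(f"{baseline:.4f} {pes:.4f}")  # 0.9375 0.9583
```

These values correspond to the 0.938 and 0.958 quoted for the baseline and PES conditions, respectively.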

Similarly, for CO cultures, Figs. 3(a) and 3(c) show 3D scatterplots displaying SVM decision boundaries for the three cohorts: control (n = 24 for baseline, n = 8 for PES condition, as blue dots), SCZ (n = 24 for baseline, n = 8 for PES condition, as orange squares), and BD (n = 24 for baseline, n = 8 for PES condition, as yellow diamonds) in COs under the baseline [Fig. 3(a)] and PES conditions [Fig. 3(c)], respectively.

FIG. 3.

Classification of control, SCZ, and BD cohorts in COs using sink index features from MEA recordings. (a) 3D scatterplot illustrating the separation of control, SCZ, and BD COs under baseline conditions using an SVM classifier at an accuracy of 0.833. Sink index features, calculated by Eq. (2), were selected for cohort prediction by MRMR. Significant sink index features identified include the range of channel 2, the mean of channel 13, and the all-channel-minimum singular value. (b) Confusion matrix depicts classification performance of the COs SVM in the baseline condition. (c) 3D scatterplot illustrating the separation of control, SCZ, and BD COs under PES conditions using an SVM classifier at an accuracy of 0.916. Sink index features were selected for cohort prediction by MRMR. Significant sink index features identified include the mean of channel 5, autocorrelation of channel 11, and skewness of channel 11. (d) Confusion matrix depicting classification performance of the COs SVM model in the PES condition. (e) LDA projection of PCA-reduced feature space (>95% variance) for COs under baseline conditions. The histogram with the mean line of control (blue circles), SCZ (orange squares), and BD (yellow diamonds) cohorts shows significant overlap in both LDA components, as seen in the density plots. Baseline conditions result in less distinct class boundaries. (f) LDA projection of PCA-reduced feature space (>95% variance) for COs under PES conditions, where the separability among control, SCZ, and BD cohorts is markedly improved. Histograms with mean lines demonstrate reduced overlap in both LDA components, particularly for BD samples, suggesting that electrical stimulus enhances the detection of distinct electrophysiological features for classification.

In the baseline condition [Fig. 3(a)], the MRMR feature selection algorithm identified the sink index features significant to classification as the range of channel 2 (r = 1.28), mean of channel 13 (r = 0.64), and the minimum singular value across all channels (r = 2.21) (supplementary material Fig. S1C). These features were able to separate control, SCZ, and BD COs with an accuracy of 0.833, despite an imbalance in cohort sizes. In the PES condition [Fig. 3(c)], selected sink index features were the mean of channel 5 (r = 0.27), autocorrelation of channel 11 (r = 1.13), and skewness of channel 11 (r = 8.46), with a significant improvement in classification accuracy to 0.916 (supplementary material Fig. S1D). The respective confusion matrices [Figs. 3(b) and 3(d)] demonstrate the classification performance. These results show that most misclassified instances under baseline conditions were distributed in the BD class. However, classification accuracy improved significantly in the PES condition, as the BD cohort was densely clustered in the identified feature space; only two control samples among the 24 instances were misclassified as belonging to a disease cohort. Hence, while we could not clearly distinguish BD cohorts from control cohorts under baseline conditions, we were able to distinguish between all cohort pairs (Control vs SCZ, Control vs BD, and SCZ vs BD) using analyses of MEA recordings from the PES condition.
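
The idea behind MRMR selection can be sketched as a greedy loop: at each step, pick the feature whose relevance to the cohort label is high and whose redundancy with already-selected features is low. The sketch below is a simplification under stated assumptions. It uses absolute Pearson correlation as a proxy for both relevance and redundancy (MRMR is commonly formulated with mutual information), and the feature names and values are a hand-built toy dataset, not data from the study.

```python
def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


def mrmr(columns, labels, k):
    """Greedy selection: maximise relevance minus mean redundancy."""
    relevance = {name: abs(pearson(col, labels)) for name, col in columns.items()}
    selected = []
    while len(selected) < k:
        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(
                abs(pearson(columns[name], columns[s])) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy

        best = max((n for n in columns if n not in selected), key=score)
        selected.append(best)
    return selected


# Toy data (assumption): 8 samples, two cohorts.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "strong": [1.0, 2.0, 1.0, 2.0, 5.0, 6.0, 5.0, 6.0],       # highly relevant
    "strong_copy": [0.8, 1.8, 1.2, 2.2, 4.8, 5.8, 5.2, 6.2],  # near-duplicate of "strong"
    "moderate": [0, 0, 4, 4, 2, 2, 6, 6],                     # less relevant, less redundant
    "noise": [0, 1, 0, 1, 0, 1, 0, 1],                        # unrelated to the label
}
print(mrmr(columns, labels, k=2))  # ['strong', 'moderate']
```

The near-duplicate feature is skipped at the second step even though it is individually more relevant than `moderate`, which is the behavior that distinguishes MRMR from ranking features by relevance alone.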

The classification results appear to reflect the overall performance of the SVM classifiers across all samples (Figs. 2 and 3). However, it is important to clarify that these results represent the average performance across the ten folds used in cross-validation. The overall accuracies of 95.8% for distinguishing SCZ from controls in 2DNs under PES conditions and 91.6% in COs under PES conditions were averaged over the 10 validation tests. Averaging across folds reduces the risk that the reported accuracies are inflated by overfitting to a single train/test split and better reflects the generalization capability of the model.
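
The fold-averaging procedure can be sketched as follows. To keep the example dependency-free, a nearest-centroid classifier stands in for the SVM; this substitution, the stratified fold assignment, and the synthetic two-cohort data are assumptions made for illustration and do not reproduce the study's classifier or recordings.

```python
import random
import statistics


def stratified_folds(labels, k, seed=0):
    """Split sample indices into k folds, keeping class proportions similar."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    position = 0
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for i in idx:
            folds[position % k].append(i)
            position += 1
    return folds


def nearest_centroid_predict(train_X, train_y, x):
    """Stand-in classifier (NOT an SVM): assign x to the closest class mean."""
    best_cls, best_dist = None, float("inf")
    for cls in sorted(set(train_y)):
        rows = [r for r, y in zip(train_X, train_y) if y == cls]
        centroid = [statistics.fmean(col) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_cls, best_dist = cls, dist
    return best_cls


def cross_validate(X, y, k=10):
    """Return per-fold accuracies; each fold is held out exactly once."""
    accuracies = []
    for test_idx in stratified_folds(y, k):
        held_out = set(test_idx)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(y)) if i not in held_out]
        correct = sum(
            nearest_centroid_predict(train_X, train_y, X[i]) == y[i] for i in test_idx
        )
        accuracies.append(correct / len(test_idx))
    return accuracies


# Synthetic data (assumption): 24 samples per cohort, 3 selected features each.
rng = random.Random(1)
y = [0] * 24 + [1] * 24
X = [[rng.gauss(5.0 * label, 1.0) for _ in range(3)] for label in y]
fold_acc = cross_validate(X, y, k=10)
mean_acc = statistics.fmean(fold_acc)
print(len(fold_acc), round(mean_acc, 3))
```

With 48 samples and ten folds, the folds hold four or five samples each, so a single misclassification changes a fold's accuracy by 20% to 25%; the mean over folds is therefore the more stable summary.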

B. Implications of stimulation protocols

Electrical stimulation, employed to elicit dynamic neural responses, demonstrated its utility in unveiling latent electrophysiological features across both 2DN and CO models. Figures 2(e), 2(f), 3(e), and 3(f) demonstrate the influence of electrical stimulation on improving the separability of cohorts, thus improving classification accuracy. Under baseline conditions, the Linear Discriminant Analysis (LDA) projections of the selected feature space (reduced through PCA to capture >95% variance) in Figs. 2(e) and 3(e) show significant overlap between cohort distributions, particularly in 2D neurons for control (mean = 1.27) and SCZ (mean = −1.27) and in organoids for BD (mean = [0.55, 0.03]). This overlap reflects a lack of strong distinguishing features in the electrophysiological data under resting conditions. In contrast, Figs. 2(f) and 3(f) illustrate that PES markedly enhances separability, as shown by the greater distances between cohort means and reduced overlap in density distributions. For 2D neurons, the distance between the control (mean = 5.02) and SCZ (mean = −4.97) cohort means increased substantially. Similarly, for organoids, PES resulted in clearer distinctions, with cohort means shifting to control (mean = [−1.45, 2.67]), SCZ (mean = [−0.54, −4.44]), and BD (mean = [2.00, 1.76]).
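
The change in separation can be made concrete by computing distances between the cohort means quoted above. This is only a rough indicator, since it ignores the within-cohort spread that also determines overlap; the variable names are illustrative.

```python
import math
from itertools import combinations

# 2DNs: cohort means on LDA component 1, as reported in the text.
gap_baseline = abs(1.27 - (-1.27))
gap_pes = abs(5.02 - (-4.97))
print(round(gap_baseline, 2), round(gap_pes, 2))  # 2.54 9.99

# COs under PES: cohort means on LDA components 1 and 2, as reported in the text.
co_pes_means = {
    "control": (-1.45, 2.67),
    "SCZ": (-0.54, -4.44),
    "BD": (2.00, 1.76),
}
pairwise = {
    (a, b): math.dist(co_pes_means[a], co_pes_means[b])
    for a, b in combinations(co_pes_means, 2)
}
for pair, d in pairwise.items():
    print(pair, round(d, 2))
```

For the 2DNs, the gap between cohort means grows roughly fourfold after stimulation (2.54 to 9.99). For the COs under PES, the control and BD means are the closest pair (about 3.57), while control vs SCZ (about 7.17) and SCZ vs BD (about 6.70) are farther apart.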

To further validate the discriminative power of the key features identified by our DAP, we compared the values of these features across all replicates in each cohort, both before and after electrical stimulation. For 2DNs, the median sink index of channel 12 significantly differed between control (0.78 ± 0.09) and SCZ (0.45 ± 0.08) cohorts under baseline conditions (p < 0.001, two-tailed t-test). Following electrical stimulation, the autocorrelation of channel 1 demonstrated even greater separation between control (0.54 ± 0.12) and SCZ (0.14 ± 0.08) cohorts (p < 0.0001, two-tailed t-test). For COs, the mean of channel 13 under baseline conditions showed significant differences between control (0.64 ± 0.05), SCZ (0.48 ± 0.06), and BD (0.57 ± 0.08) cohorts (one-way ANOVA, F(2,69) = 49.3, p < 0.0001). Post hoc Tukey tests confirmed significant differences between all pairwise comparisons (p < 0.01). Following electrical stimulation, the skewness of channel 11 demonstrated even clearer separation among control (1.21 ± 0.42), SCZ (6.53 ± 1.15), and BD (3.42 ± 0.75) cohorts (one-way ANOVA, F(2,21) = 97.6, p < 0.0001), with post hoc tests confirming significant differences between all groups (p < 0.001).
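
The degrees of freedom reported for the ANOVA tests are consistent with the cohort sizes: three cohorts of 24 samples give F(2, 69) at baseline, and three cohorts of 8 samples give F(2, 21) under PES. The sketch below shows how a one-way ANOVA F statistic and its degrees of freedom are computed; the function name and the small numeric groups are illustrative, not study data.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of sample groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of samples around their own group mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy example: group means 2, 3, and 6 with equal within-group spread.
f_stat, df_b, df_w = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f_stat, 3), df_b, df_w)  # 13.0 2 6
```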

To benchmark the performance of our DAP against prevailing clinical practice, we compared its classification accuracy to published reliability data for structured clinical interviews (SCID-5) and expert consensus diagnosis. For the binary 2DN task under PES, our pipeline achieved a mean accuracy of 0.944 ± 0.022, surpassing the ∼0.80 concordance typically reported for SCID-based SCZ diagnosis. For the more challenging three-class classification using COs, the PES paradigm yielded an accuracy of 0.904, substantially outperforming the κ < 0.60 values commonly reported for clinical differentiation between SCZ and BD.

To reconcile the unexpectedly high classification accuracies in the confusion matrices with the modest visual separation observed in the 2D LDA plots [Figs. 2(e) and 3(e)], we emphasize that the SVM classifier operates in the full, kernel-transformed feature space, where non-linear decision boundaries can exploit subtle inter-feature relationships that are lost when the data are linearly projected for visualization. Specifically, the three most informative sink index features (e.g., median channel 12, covariance channels 9–11) span a three-dimensional manifold that the SVM separates almost perfectly, whereas LDA compresses these dimensions into two orthogonal axes, inevitably discarding variance that is critical for cohort discrimination. Kernel-matrix inspection confirmed that samples from the SCZ group become nearly linearly separable only after this transformation. Moreover, the post-electrical-stimulation (PES) recordings provide an additional axis of variance—dynamic network responsiveness—that further enlarges inter-cohort distances, explaining the jump in accuracy from 83.3% (baseline COs) to 91.6% (PES) despite only modest changes in the 2D LDA overlap. Hence, the apparent mismatch between classification performance and LDA visualization reflects methodological differences, not biological inconsistencies.
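
A toy example can illustrate why a non-linear decision boundary may separate cohorts that overlap in a linear projection. Here an explicit quadratic feature map stands in for the implicit transformation performed by a kernel; the one-dimensional data, the helper `best_threshold_accuracy`, and the choice of a quadratic map are assumptions for illustration only and are unrelated to the study's recordings or kernel.

```python
def best_threshold_accuracy(values, labels):
    """Best accuracy of any single-threshold rule on a 1D feature (either polarity)."""
    cuts = sorted(set(values))
    thresholds = [cuts[0] - 1.0] + [(a + b) / 2 for a, b in zip(cuts, cuts[1:])]
    best = 0.0
    for t in thresholds:
        hits = sum((v > t) == bool(y) for v, y in zip(values, labels))
        acc = hits / len(values)
        best = max(best, acc, 1.0 - acc)  # 1 - acc is the rule with flipped polarity
    return best


# Class 0 sits near the origin; class 1 lies on both sides of it.
x = [-3.0, -2.5, -1.0, -0.5, 0.5, 1.0, 2.5, 3.0]
y = [1, 1, 0, 0, 0, 0, 1, 1]

linear_acc = best_threshold_accuracy(x, y)                      # threshold on x itself
mapped_acc = best_threshold_accuracy([v * v for v in x], y)     # threshold on x squared
print(linear_acc, mapped_acc)  # 0.75 1.0
```

No single cut on the raw axis classifies more than six of the eight points correctly, whereas one cut on the squared feature separates the classes completely.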

To conclude, these results demonstrate that for the 2DN and CO cohorts, an SVM classifier can effectively distinguish healthy control organoids and neurons from patient-derived ones, using biomarkers extracted from sink index features of MEA data. More importantly, disease-associated differences were significantly more pronounced in the selected feature space under the PES condition for both 2DNs and COs. Thus, we were able to distinguish more clearly among all cohorts, especially in COs. These findings reinforce the hypothesis that dynamic states, rather than resting conditions, reveal more pronounced network dysfunctions associated with neuropsychiatric disorders. The clear separation of classes in the scatter plots and the high classification accuracy depicted in the confusion matrices underscore the potential of this DAP for identification of electrophysiological signatures associated with neuropsychiatric disorders.

III. DISCUSSION

This study highlights the potential of using electrophysiological data derived from iPSC-based models, including 2DNs and COs, for the classification of neuropsychiatric disorders such as SCZ and BD. By developing a DAP that integrates MEA recordings with SRDNM and machine learning-based classification, we have identified distinct electrophysiological signatures that can effectively classify SCZ and BD cohorts with high accuracy. These findings have significant implications for advancing the field toward the development of objective biomarkers for psychiatric conditions, which could complement existing diagnoses and lead to more personalized treatment strategies.

While the source–sink analysis method was originally developed for seizure localization,14 its underlying principles make it particularly valuable for studying neuropsychiatric disorders. Both epilepsy and conditions like SCZ and BD involve fundamental disruptions in neural network dynamics, albeit manifesting differently. The sink index quantifies directional influence within neural networks by identifying nodes that primarily receive signals, effectively capturing how information flows through neural circuits. This approach aligns with the emerging network-based understanding of neuropsychiatric disorders, where pathology manifests not just in individual cells but in altered connectivity patterns and disrupted excitation/inhibition balance. The mathematical framework of source–sink analysis captures both local and global aspects of neural dynamics, allowing detection of subtle network abnormalities that might be missed by conventional analyses focused solely on firing rates or simple correlations. Our preliminary analyses comparing various network metrics, including graph theoretical measures, spectral properties, and cross-correlation features,19–21 demonstrated that sink index-derived features yielded the highest discriminative power for distinguishing between control and patient-derived cultures, supporting our hypothesis that network-level information flow is fundamentally altered in these conditions.

The application of MEA in patient-derived COs and 2DNs offers a relevant and controlled environment for studying electrophysiological correlates of neuropsychiatric disorders. Clinical research22,23 has identified significant barriers to the effective diagnosis and treatment of neuropsychiatric disorders, including the subjective nature of current diagnostic methods and the lack of reliable biomarkers. Our DAP provides ways to improve biomarker accuracy by utilizing a quantifiable and objective method for assessing neural activity. Moreover, it manages and interprets complex, high-dimensional, time-varying data using advanced machine learning techniques,24 facilitating the extraction of meaningful features that distinguish SCZ and BD. Since traditional animal models have limitations in replicating human brain physiology,15 studies of human COs can bridge the gap between in vitro studies and clinical applications.

The application of electrical stimulation to neural cultures provides a valuable paradigm for assessing network dynamics under perturbed conditions. In SCZ and BD, emerging evidence suggests that while baseline neural activity may show subtle abnormalities, more pronounced deficits emerge when networks are challenged to respond to external inputs. This parallels clinical observations where patients often show normal function in some contexts but demonstrate impaired information processing during cognitively demanding tasks. Electrical stimulation protocols in vitro can reveal latent network dysfunctions that may correspond to the information processing deficits observed in patients during cognitive tasks or sensory processing. Converging clinical evidence shows that the most robust abnormalities in SCZ and BD emerge when neural circuits are challenged by external input. For example, attenuation of auditory mismatch negativity and other evoked potentials is one of the best-replicated endophenotypes in SCZ,25 while transcranial-magnetic-stimulation (TMS) studies reveal blunted long-term-potentiation/-depression-like plasticity in both disorders.26,27 At the cellular level, postmortem and iPSC studies document GABAergic interneuron deficits and excitation–inhibition imbalance in SCZ,25 and dysregulated Ca2+-signaling cascades in BD.7 Controlled electrical stimulation in vitro is conceptually analogous to task-based fMRI or event-related EEG. It perturbs the network so that latent circuit pathophysiology becomes measurable. Previous studies using EEG and MEG have demonstrated abnormal neural responses to sensory stimuli in both SCZ and BD patients, particularly in evoked potentials and neural oscillations.28 By applying electrical stimulation to our in vitro models, we aimed to capture analogous differences in the dynamic response properties of patient-derived neural networks, potentially providing more sensitive measures of disease-related alterations than static baseline recordings alone.

The ability to distinguish between healthy controls and patients under both baseline and PES conditions suggests that the identified electrophysiological features related to the sink index dynamics reflect disease-specific alterations in neural network activity. The improved classification accuracy observed under PES conditions suggests the underlying neural circuit dynamics are more robustly characterized when cultures are actively responding to stimuli, emphasizing the importance of dynamic neural responses in characterizing neuropsychiatric disorders. We attribute the enhanced separability under the PES condition to the dynamic activation of neural networks, which unveils latent disease-specific features that are otherwise less pronounced. This insight suggests that disorders like SCZ and BD may involve distinct neural network dysfunctions that are more pronounced during conditions requiring active network processing, providing a potential avenue for targeted therapeutic interventions. By incorporating electrical stimulation, this methodology offers a robust framework for distinguishing between control and diseased cohorts, including complex cases like BD, which may exhibit subtle alterations in resting-state activity. Furthermore, the robust performance of the SVM classifier across tenfold cross-validation (supplementary material Fig. S2) underscores the reliability of the extracted features. Notably, averaging the classification accuracy across folds mitigates concerns of overfitting, supporting the generalizability of the model.

The study scope was constrained by the number of patient-derived iPSC lines utilized. This was primarily due to the high cost of generating patient iPSC-derived COs. We acknowledge that the limited number of samples used in this study may not fully capture the genetic and phenotypic diversity of the broader population of individuals with SCZ and BD, potentially affecting the generalizability of our findings. However, as the field grows over the next few years and generating patient iPSC-derived COs becomes more cost-efficient, future studies will include larger and more diverse cohorts of patient-derived iPSC lines, encompassing a wider range of genetic backgrounds and clinical phenotypes. This would help ensure that the identified electrophysiological signatures are broadly applicable and reliable as biomarkers across different patient populations.

Additionally, variability in iPSC-derived cell lines and COs (such as differences in differentiation efficiency, maturation state, and cellular composition) could introduce variability in the electrophysiological data, potentially confounding the classification results.29 However, as described in our previous publications,3,6,7 we collected data from multiple replicates of the COs and 2DNs to mitigate this variability. It is also important to note that while animal models have traditionally been used to study neuropsychiatric disorders, they represent only disease analogs of SCZ and BD, offering limited insights compared to the human-specific cellular models we utilized. These iPSC-derived models provide a window into human pathophysiology that animal models cannot fully replicate, particularly in capturing the complex genetic and phenotypic heterogeneity of human neuropsychiatric conditions.30 In addition, we used standardized differentiation protocols and more homogeneous cell populations to reduce variability in the generated organoids and neuronal cultures. Incorporating advanced bioinformatics techniques to account for batch effects and inter-line variability further refined classification accuracy and improved the robustness of the identified biomarkers.

IV. CONCLUSION

This study demonstrates that patient-derived COs and 2DNs exhibit distinct electrophysiological signatures associated with SCZ and BD. By integrating MEA recordings, sink index analysis, and machine learning, we achieved high classification accuracy, particularly under PES conditions. These dynamic responses unmasked disease-relevant network dysfunctions—most notably excitation–inhibition imbalance—supporting their potential as objective biomarkers. While larger, more diverse cohorts are needed for clinical translation, our approach offers a scalable, biologically grounded framework for advancing precision diagnostics and therapeutic screening in neuropsychiatric disorders.

V. METHODS

This study employs the SRDNM approach and the sink index14 to identify features that can serve as biomarkers for distinguishing healthy controls from patients with SCZ or BD. Both methods use data derived from MEA recordings to uncover distinct neural activity patterns associated with psychiatric conditions.

iPSC lines used in previous studies3,6,7 were derived from a well-characterized cohort of patients with SCZ or BD and healthy controls. These iPSCs were reprogrammed from subjects' fibroblast cells (with approval from the Massachusetts General Hospital and the McLean Hospital Institutional Review Board) using either modified mRNA or transient transfection with retroviruses, validated via standard protocols, and then differentiated into COs and cortical interneurons (Secs. V A and V B), providing a genetically faithful model that retains the unique genetic background of each donor (donor details are given in Refs. 3, 6, and 7). Fibroblasts were obtained through skin punch biopsies from patients with BD and matched healthy control individuals, with informed consent obtained from all participants. Patients were screened by an experienced team using the Structured Clinical Interview for DSM Disorders (SCID) and detailed clinical histories, while healthy controls were selected based on the absence of psychiatric diagnoses or family history of SCZ or BD. This approach allows for an examination of patient-specific lines and their impact on neurodevelopment and network function, which is particularly relevant for disorders with high genetic heterogeneity like SCZ and BD.

A. Generation and maintenance of cerebral organoids

1. iPSC lines' genomic integrity

All iPSC lines used in this study underwent genomic integrity testing prior to differentiation into COs and cortical interneurons. KaryoStat™ analysis (Thermo Fisher Scientific) was performed at passage numbers used for differentiation (P25–P30) to confirm normal chromosomal structure and number. Additionally, copy number variation (CNV) analysis was conducted using single nucleotide polymorphism (SNP) arrays to detect any subkaryotypic abnormalities. All lines used in this study maintained normal karyotypes and did not exhibit pathogenic CNVs. Mycoplasma testing was regularly performed to ensure cultures remained contamination-free throughout the experimental period. This has been reported in Ref. 8.

2. Cerebral organoid differentiation

In our previous studies,6 iPSCs were cultured in NutriStem hPSC XF Medium (Sartorius) on plates coated with Geltrex (Gibco). For the formation of organoids, iPSCs were transferred to U-bottom plates at high density to form embryoid bodies (EBs, 15–20 k cells per EB) and maintained in EB formation media with 5 mM Y-27632 for 5 days, with media changes every other day. The EBs were then resuspended in STEMdiff Cerebral Organoid Induction Media (STEMCELL Technologies, catalog 08570) for 2 days before embedding in Matrigel (Corning CLS354234) droplets on day 7.

Following embedding, the organoids were transferred to orbital shakers and maintained in STEMdiff Cerebral Organoid Maturation Media (STEMCELL Technologies, catalog 08571). To promote neural differentiation, 10 ng/ml brain-derived neurotrophic factor (BDNF) was added to the media starting from day 30, with media changes every 3–4 days. The organoids were cultured at 37 °C with 5% CO2, and they reached a stable size and showed differentiation markers for several neuronal subtypes over a period of 6–9 months.

3. Organoid integrity validation

To validate the integrity and cellular composition of the COs used in this study, immunohistochemical characterization was performed at multiple timepoints during development (supplementary material Fig. S3). Organoids were stained for neural progenitor markers (Nestin), astrocyte markers (GFAP), upper layer markers (CUX1), and GABAergic neuron markers (GAD65). Quantification of cell populations demonstrated appropriate distribution of these markers across all organoid lines, confirming the presence of properly organized neural progenitors and differentiated neurons. RT-qPCR analysis further validated the expression of these markers at the transcriptional level (supplementary material Fig. S4). Notably, no significant differences in the proportions of major cell types were observed between control and patient-derived organoids, indicating that the electrophysiological differences detected were not attributable to gross differences in cellular composition.

B. Differentiation and maintenance of two-dimensional cortical neurons

Two-dimensional cortical neurons were generated as described in our previously published study.3 Human iPSCs were first cultured on Geltrex (Thermo Fisher Scientific, A1413202) and maintained in NutriStem media (Stemgent; 01-0005)18 until 100% confluency, at which point they were differentiated into neural progenitor cells (NPCs) by culturing in N2/B27 medium, which consists of 50% N2 medium [485 ml Neurobasal medium (Life Technologies, 21103049), 5 ml N-2 supplement (Gibco, 17502001), 5 ml GlutaMAX (Thermo Fisher Scientific, 35050061), and 5 ml penicillin–streptomycin (Gibco, 15140122)] and 50% B-27 medium [10 ml B-27 supplement (Gibco, 17504044), 480 ml Dulbecco's modified Eagle medium (Sigma-Aldrich, D6421), 5 ml GlutaMAX (Thermo Fisher Scientific, 35050061), and 5 ml penicillin–streptomycin (Gibco, 15140122)]. The medium was supplemented with 10 μM SB431542 (Sigma, S4317), 2 μM XAV939 (Sigma-Aldrich; X3004), and 1 μM dorsomorphin (Sigma-Aldrich; P5499) for 7 days, with the medium refreshed daily. After neural induction, the cells were split at a 1:1 ratio and transferred onto a Geltrex-coated (Thermo Fisher Scientific, A1413202) substrate on day 8. These NPCs were maintained in N2/B27 without SMAD inhibitors until passaged upon reaching confluency, followed by the addition of 1.5 μM purmorphamine (Sigma, SML0868) during days 10–20 to drive forebrain lineage specification. NPCs were subsequently transferred to plates coated with 10 μg/ml poly-L-ornithine (Sigma, P3655) and 10 μg/ml laminin (Sigma, L2020) and maintained in BrainPhys neuronal media (StemCell Technologies; 05790) supplemented with B-27 (Gibco, 17504044) and, for interneurons, 10 μM DAPT (Sigma-Aldrich, D5942) to enhance neuronal maturation starting on day 21. The medium was replaced daily until day 29, after which it was refreshed twice a week until maturation was achieved by days 90–120.

C. Functional analysis and electrophysiological stimulation

The MED64-Presto MEA system (Alpha MED Scientific Inc., Osaka, Japan) was used to assess the functional activity of 9-month-old organoids and co-cultured iPSC-derived neurons at day 90. Transcriptomic analysis3,6,7 was performed previously to confirm that these cultures were at a mature developmental stage, with a range of cell types expressing layer-specific markers, which supports the presence of functional neural networks. In our published studies,3,6,7 cultures were plated on 24-well MEA plates with 16 electrodes per well. Organoids were attached to the same type of MEA plates pre-coated with poly-L-ornithine and laminin and cultured for an additional 3 months with regular media changes. Baseline spontaneous activity was recorded for 1-min periods before the application of 10 electrical pulses at 0.8 V, followed by a 1-min recording to assess stimulated responses using MEA symphony software (Alpha MED Scientific Inc., Osaka, Japan). Spontaneous firing rate was calculated as the average of a 200-ms sliding-window rate vector over the 60-s pre-stimulus recording and reported as the mean ± SEM across active electrodes (≥5 Hz).

Electrical stimulation was applied with the integrated stimulator of the MED64-Presto MEA. Parameter sweeps performed beforehand established settings that reliably evoked network activity while remaining well below electrochemical safety limits. The final protocol—used for both 2DNs and COs —comprised ten charge-balanced, square-wave biphasic pulses (negative phase first) delivered over 50 s. Each pulse had an amplitude of 0.8 V pp, a phase width of 200 μs, a 100 μs inter-phase gap, and a 5 s inter-pulse interval. For 2DNs, stimuli were injected through the central electrode (row 2, column 2; channel 6 of the 4 × 4 array), which consistently exhibited high connectivity and acceptable impedance; electrodes exhibiting damage or impedance outside 30–50 kΩ at 1 kHz were not used. In COs, the stimulating electrode was chosen individually from the four corner positions (channels 1, 4, 13, and 16) to ensure good contact with the organoid edge and to mimic an afferent input rather than directly activating the core. Under these conditions the measured current was 16–27 μA per phase, corresponding to 3.2–5.4 nC phase−1, well below the 30 nC phase−1 safety threshold. A 3 ms blanking window suppressed stimulation artifacts, after which post-electrical-stimulation (PES) activity was recorded for 60 s; analyses began 500 ms after the last pulse to exclude residual artifacts while capturing the early network response.
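
The charge-per-phase figures quoted above follow from multiplying the measured current by the phase width. The short Python check below reproduces that arithmetic using only the values stated in the text; the function and constant names are illustrative and are not part of the published pipeline.

```python
def charge_per_phase_nc(current_ua: float, phase_width_us: float) -> float:
    """Charge delivered in one phase, in nanocoulombs.

    1 uA x 1 us = 1 pC, so the product is divided by 1000 to obtain nC.
    """
    return current_ua * phase_width_us / 1000.0


PHASE_WIDTH_US = 200.0   # phase width stated in the stimulation protocol
SAFETY_LIMIT_NC = 30.0   # per-phase safety threshold stated in the text

low_nc = charge_per_phase_nc(16.0, PHASE_WIDTH_US)    # lower measured current
high_nc = charge_per_phase_nc(27.0, PHASE_WIDTH_US)   # upper measured current
within_limit = high_nc < SAFETY_LIMIT_NC

print(f"{low_nc:.1f} to {high_nc:.1f} nC per phase, below limit: {within_limit}")
```

With 16 and 27 μA at a 200 μs phase width, this gives 3.2 and 5.4 nC per phase, matching the range reported above.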

Prior to feature extraction and classification, we validated that the recorded signals represented genuine neuronal activity through multiple approaches. First, baseline firing properties were characterized across all samples to ensure viable neuronal activity. As shown in Fig. 3, control COs displayed average inter-spike intervals (ISIs) of approximately 0.5 s, with burst rates of approximately 10 bursts/min and 6–7 spikes per burst (supplementary material Fig. S5). The baseline spontaneous spike frequency was approximately 70 Hz for control organoids, 60–70 Hz for SCZ organoids, and 70 Hz for BD organoids, with no statistically significant differences between groups under baseline conditions.3,6,7

Characteristic differences emerged when comparing other firing parameters. SCZ organoids exhibited significantly higher average ISI (0.7 s, p = 0.0003) compared to controls, indicating altered network timing. BD organoids showed a significantly decreased average number of spikes per burst (10 spikes compared to 32 in controls, p < 0.0001). Both SCZ and BD organoids demonstrated significantly increased mean ISI within the network (65 ms for SCZ and 32 ms for BD compared to 7 ms for controls, p < 0.0001).

Upon electrical stimulation, control organoids showed a robust increase in spike frequency from approximately 70 to 105 Hz (p < 0.001), while SCZ and BD organoids failed to show significant increases in firing rates. Similarly, following application of 30 mM KCl for depolarization-induced firing, control organoids exhibited a significant increase in spike frequency from approximately 70 to 82 Hz (p < 0.01), whereas SCZ and BD organoids showed minimal or decreased responses.3,6,7

To pharmacologically verify that the recorded signals originated from neuronal action potentials rather than artifacts, we applied 1 μM tetrodotoxin (TTX, Sigma, T8024), a selective voltage-gated sodium channel blocker. Application of TTX consistently abolished >95% of detected spikes across all culture types, confirming their neuronal origin.3,6,7 These electrophysiological characterizations demonstrate that while disease-specific organoids develop spontaneous electrical activity, they exhibit distinct functional deficits in their response to stimulation, consistent with the molecular and structural abnormalities identified in our transcriptomic analysis.

Furthermore, to ensure that low-amplitude signals were not misclassified as spikes, we implemented a rigorous spike detection algorithm with an adaptive threshold set at 5 standard deviations above the RMS noise level for each electrode. This conservative threshold minimized false positives while still capturing genuine neuronal activity. Only recordings with signal-to-noise ratios exceeding 5:1 and stable baseline activity for at least 5 min were included in our analysis, ensuring that the dataset used for model training represented viable neuronal activity.
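
As an illustration of this kind of thresholding, the Python sketch below marks one spike per supra-threshold period at the position of the peak voltage. It rests on several assumptions that the text does not specify: the noise level is estimated as the RMS of the whole mean-subtracted trace, the threshold is taken as 5 times that estimate, and only positive-going deflections are detected. The function name `detect_spikes` is hypothetical.

```python
import numpy as np


def detect_spikes(trace, fs, k=5.0):
    """Return spike times in seconds, one per supra-threshold period."""
    x = np.asarray(trace, dtype=float)
    x = x - x.mean()                          # remove any DC offset
    noise_rms = np.sqrt(np.mean(x ** 2))      # assumption: RMS of the whole trace
    above = x > k * noise_rms                 # samples exceeding the threshold

    # Start (inclusive) and end (exclusive) of each run of supra-threshold samples.
    edges = np.diff(np.concatenate(([0], above.astype(int), [0])))
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)

    # The spike is placed at the peak voltage inside each run.
    peaks = [s + int(np.argmax(x[s:e])) for s, e in zip(starts, ends)]
    return np.array(peaks, dtype=float) / fs


# Synthetic example: 1 s of unit-variance noise with three large deflections.
fs = 20_000
rng = np.random.default_rng(0)
trace = rng.standard_normal(fs)
for idx in (2_000, 10_000, 15_000):
    trace[idx] += 50.0
spike_times = detect_spikes(trace, fs)
```

In practice a robust noise estimate (for example one computed on spike-free segments) may be preferable, since large spikes inflate a whole-trace RMS.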

D. Electrophysiological data acquisition and preprocessing

We analyzed extracellularly recorded spontaneous activity of 2DNs and COs derived from patients and healthy individuals. Electrophysiological recordings were previously performed using the MED64-Presto System (Alpha MED Scientific Inc., Osaka, Japan) with a sampling frequency of 20 kHz.

A consistent recording media formulation was used across all experiments to ensure standardized conditions. The recording media consisted of BrainPhys basal medium (STEMCELL Technologies, catalog #05790) supplemented with 2% B-27 supplement (Gibco, catalog #17504044), 1% N2 supplement (Gibco, catalog #17502048), 20 ng/ml brain-derived neurotrophic factor (BDNF) (PeproTech, catalog #450-02), 20 ng/ml glial cell line–derived neurotrophic factor (GDNF) (PeproTech, catalog #450-10), 1 mM glutamine (Gibco, catalog #25030081), and 0.5 mM dibutyryl cyclic-AMP (Sigma, catalog #D0627). This media formulation was selected based on optimization experiments demonstrating its ability to maintain stable electrophysiological activity during extended recording sessions.

Prior to recording, cultures were adapted to the recording media through a 50% media exchange 24 h before the experiment, followed by a complete exchange to fresh recording media 1 h before the start of data acquisition. All recordings were performed at a controlled temperature of 37 ± 0.5 °C, maintained using a heating system integrated with the MEA recording setup (Alpha MED Scientific Inc., Osaka, Japan). The pH was maintained at 7.3–7.4 through continuous perfusion with a mixture of 5% CO2 and 95% air at a flow rate of 1.5 l/min. Osmolarity of the recording media was adjusted to 305–315 mOsm/kg. For standardization across experiments, all pharmacological agents (TTX and KCl) were diluted in the same recording media formulation and applied to cultures following baseline recordings. Solutions were preheated to 37 °C before application to avoid temperature-induced artifacts in neural activity.

Data preprocessing involved several steps to ensure high-quality spike train data for analysis. First, a bandpass filter was applied to the raw electrophysiological data, confining the signal to a frequency range of 0.5–3000 Hz. This was implemented using a fourth-order Butterworth filter, whose flat passband frequency response minimizes distortion of the relevant neural signals. The 0.5–3000 Hz range was chosen to encompass the full spectrum of neurophysiologically relevant signals in extracellular recordings.

Subsequent to the bandpass filtering, a notch filter centered at 60 Hz with a 2 Hz stopband was applied to eliminate power line interference, a common source of noise in electrophysiological recordings. This notch filtering was repeated at harmonics of 60 Hz (i.e., 120 and 180 Hz) to further reduce residual power line noise.
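
The two filtering stages can be sketched with SciPy as follows. This is not the authors' MATLAB implementation. Zero-phase (forward-backward) filtering, the second-order-sections form, and the use of the same 2 Hz stopband at each harmonic are assumptions made for the sketch, and `preprocess` and `amplitude_at` are hypothetical helper names. Note also that `butter(4, ...)` with a band-pass specification yields an eighth-order filter overall, which is one possible reading of "fourth-order".

```python
import numpy as np
from scipy import signal

FS = 20_000  # sampling frequency stated in the text (Hz)


def preprocess(raw, fs=FS):
    """Band-pass 0.5-3000 Hz, then notch out 60 Hz and its first two harmonics."""
    # Second-order sections keep the very low 0.5 Hz corner numerically stable.
    sos = signal.butter(4, [0.5, 3000.0], btype="bandpass", fs=fs, output="sos")
    x = signal.sosfiltfilt(sos, raw)          # zero-phase filtering (assumption)
    for f0 in (60.0, 120.0, 180.0):
        # Q = centre frequency / bandwidth; a 2 Hz stopband is assumed throughout.
        b, a = signal.iirnotch(f0, f0 / 2.0, fs=fs)
        x = signal.filtfilt(b, a, x)
    return x


def amplitude_at(x, freq, fs=FS):
    """Amplitude of the sinusoidal component of x at `freq` (Hz)."""
    t = np.arange(len(x)) / fs
    return 2.0 * np.abs(np.mean(x * np.exp(-2j * np.pi * freq * t)))


# Test signal: 60 Hz interference plus a 300 Hz component that should survive.
t = np.arange(4 * FS) / FS
raw = np.sin(2 * np.pi * 60 * t) + np.sin(2 * np.pi * 300 * t)
filtered = preprocess(raw)
middle = slice(FS, 3 * FS)                    # ignore filter edge transients
residual_60 = amplitude_at(filtered[middle], 60.0)
kept_300 = amplitude_at(filtered[middle], 300.0)
```

On this synthetic signal the 60 Hz component is strongly attenuated while the 300 Hz component passes essentially unchanged.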

Spike times were identified by adding the index of the peak voltage within each spiking period, converted to time using the sampling frequency, to the start time of that period. These spike times were used to populate a binarized time series with 1 at each spike time and 0 at all other points. The resultant spike train data were then downsampled to 1 kHz. Downsampling balanced data resolution against computational load, keeping the temporal resolution adequate for capturing the dynamics of neural firing while making data processing more efficient.

In cases of spike collocation, where multiple spikes occurred within a short-time window, the spike marker was incremented rather than overwritten, ensuring that all spiking events were accurately represented in the time series. Finally, the spike train was rate-coded by calculating the average spike count over a sliding window of 200 ms, effectively smoothing the data to emphasize the underlying neural firing rates and reduce the impact of transient noise. This rate coding allowed for the characterization of neural activity patterns in terms of firing rates, which are crucial for understanding network dynamics.
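
The binning and rate-coding steps can be sketched in Python as follows. For brevity the sketch bins spike times directly at 1 kHz rather than building the 20 kHz series and downsampling it, which is an assumption about an equivalent shortcut. The centred moving average and the name `rate_code` are also illustrative choices.

```python
import numpy as np


def rate_code(spike_times_s, duration_s, fs_out=1000, window_ms=200):
    """Bin spikes at fs_out and smooth them with a sliding-window average."""
    n_bins = int(round(duration_s * fs_out))
    counts = np.zeros(n_bins, dtype=int)
    idx = np.floor(np.asarray(spike_times_s, dtype=float) * fs_out).astype(int)
    idx = idx[(idx >= 0) & (idx < n_bins)]
    # np.add.at accumulates repeated indices, so collocated spikes increment
    # the bin instead of overwriting it.
    np.add.at(counts, idx, 1)

    window = int(round(window_ms * fs_out / 1000))
    kernel = np.ones(window) / window          # average spike count per bin
    rate = np.convolve(counts, kernel, mode="same")
    return counts, rate


# Two spikes fall into the same 1 ms bin; a third occurs 400 ms later.
counts, rate = rate_code([0.1000, 0.1004, 0.5000], duration_s=1.0)
```

Multiplying `rate` by `fs_out` converts the average count per 1 ms bin into spikes per second.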

E. Stimulus–response dynamic network modeling

Stimulus–response dynamic network models (SRDNMs) were developed in MATLAB [The MathWorks Inc. (2022), Version: 23.2 R2023b, Natick, Massachusetts] to capture the stimulus–response relationships31 in the neural networks of CO and 2DN cultures. The model notation includes time $t = 1, 2, 3, \ldots$ in milliseconds, $n(t) \in \mathbb{R}^{L}$ as an L-dimensional vector of neural firing rate measurements, $u(t) \in \mathbb{R}^{K}$ as a K-dimensional vector of electrical stimulation inputs, and $w(t) \in \mathbb{R}^{L}$ as an L-dimensional vector of Gaussian white noise.12

The state evolution equation is given by

$$n(t+1) = A\,n(t) + B\,u(t) + w(t), \qquad (1)$$

where $A \in \mathbb{R}^{L \times L}$ is the state transition matrix that captures how current neural activity affects future activity and $B \in \mathbb{R}^{L \times K}$ captures the influence of stimulation.

The matrix A in the SRDNMs represents the connectivity of the network, where each element $A_{ij}$ signifies the influence of node j on node i. The A matrix thus encapsulates the dynamic interactions among nodes in the network, with rows representing the influence received by a node and columns representing the influence exerted by a node.14
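
The text does not state how A and B were estimated. One common choice for a linear model of this form is ordinary least squares on one-step-ahead predictions, sketched below in Python under that assumption. The function name `fit_srdnm` and the simulated system are hypothetical and serve only to show that the regression recovers known matrices.

```python
import numpy as np


def fit_srdnm(n, u):
    """Least-squares estimate of A and B in n(t+1) = A n(t) + B u(t) + w(t).

    n has shape (T, L) and u has shape (T, K), with one row per time step.
    """
    n = np.asarray(n, dtype=float)
    u = np.asarray(u, dtype=float)
    X = np.hstack([n[:-1], u[:-1]])            # regressors [n(t), u(t)]
    Y = n[1:]                                  # targets n(t+1)
    theta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    L = n.shape[1]
    return theta[:L].T, theta[L:].T            # A is (L, L), B is (L, K)


# Simulate a small hypothetical system and recover its parameters.
rng = np.random.default_rng(0)
A_true = np.array([[0.9, 0.0, 0.0],
                   [0.2, -0.5, 0.1],
                   [0.0, 0.1, 0.2]])
B_true = np.array([[1.0], [0.5], [1.0]])
T = 2000
u = rng.standard_normal((T, 1))
n = np.zeros((T, 3))
for t in range(T - 1):
    noise = 0.01 * rng.standard_normal(3)      # Gaussian white noise w(t)
    n[t + 1] = A_true @ n[t] + B_true @ u[t] + noise
A_hat, B_hat = fit_srdnm(n, u)
```

With the convention of Eq. (1), row i of `A_hat` holds the influence received by channel i and column j the influence exerted by channel j.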

F. Extracting sink index features map from SRDNMs

After estimating SRDNMs from the data, sink index features were extracted from the state transition matrix A in Eq. (1) to characterize the network properties of CO and 2DN cultures. Sources are nodes that strongly influence other nodes without themselves receiving strong influence, whereas sinks are nodes that receive strong influence from other nodes but do not themselves strongly influence other nodes. The ideal sink is defined as a node that receives maximal influence from all other nodes in the network but does not affect their future activity, which means that its column in the A matrix is nearly all zeros while the entries in its row have large absolute values.

For each channel, the sums of the absolute values across its row and across its column of the state transition matrix A quantify the influence on and from that channel, respectively, and thus its sink characteristics. Channels were ranked by row sum, with the highest sum (most influenced) receiving the highest rank, N, and these row ranks were normalized by the number of channels. Similarly, channels were ranked by column sum, with the highest sum (most influential) receiving the highest rank, and these column ranks were also normalized.

The sink index $\mathrm{sink}_i$ ($i \in \{1, 2, \ldots, 16\}$) measures the distance between channel i and the ideal sink, which is defined as a channel whose normalized row rank is equal to 1 and normalized column rank is equal to $1/N$, and is computed as

$$\mathrm{sink}_i = \sqrt{2} - \left\| (r_i, c_i) - \left(1, \tfrac{1}{N}\right) \right\|_2, \qquad (2)$$

where $r_i$ is the row rank of channel i and $c_i$ is the column rank of channel i in terms of influence from and to the rest of the network and N is the number of MEA channels. The larger the sink index, the more likely the channel is a sink.14
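
A minimal Python sketch of the ranking and distance computation is given below. It assumes that the diagonal of A is included in the row and column sums and that ties are broken by channel order, neither of which is specified in the text. The function name `sink_index` and the 4-channel matrix are illustrative.

```python
import numpy as np


def sink_index(A):
    """Sink index of every channel from a state transition matrix A."""
    A = np.abs(np.asarray(A, dtype=float))
    N = A.shape[0]
    row_sum = A.sum(axis=1)    # influence received by each channel
    col_sum = A.sum(axis=0)    # influence exerted by each channel
    # Rank 1 = smallest sum, rank N = largest sum; then normalise by N.
    r = (np.argsort(np.argsort(row_sum)) + 1) / N
    c = (np.argsort(np.argsort(col_sum)) + 1) / N
    # Distance to the ideal sink at (1, 1/N), subtracted from sqrt(2).
    distance = np.sqrt((r - 1.0) ** 2 + (c - 1.0 / N) ** 2)
    return np.sqrt(2.0) - distance


# Hypothetical 4-channel example: channel 0 receives strong influence (large
# row entries) and exerts none (zero column), so it is the ideal sink.
A_example = np.array([[0.0, 0.9, 0.8, 0.7],
                      [0.0, 0.1, 0.2, 0.1],
                      [0.0, 0.3, 0.1, 0.2],
                      [0.0, 0.1, 0.1, 0.3]])
sink = sink_index(A_example)
```

In this example channel 0 attains the maximum possible value, $\sqrt{2}$, because its normalized ranks coincide with those of the ideal sink.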

The sink index was used to construct a 2D sink representation for each network, and a sink index feature map (supplementary material Table 1) comprising 271 statistical features of the sink index was computed and fed to a feature selection algorithm. Fourteen statistical features of the sink index were calculated for each channel, yielding 224 channel-wise sink index features (14 × 16) in the feature map, each labeled with a subscript from 1 to 16 identifying the channel from which it was calculated. The other 47 sink index features were derived from sink indices across all 16 channels.

G. Feature selection using MRMR

To handle the high-dimensional feature space resulting from the SRDNM, the minimum redundancy maximum relevance (MRMR) feature selection framework16 was employed. MRMR optimizes feature selection by maximizing the relevance of features to the classification target while minimizing redundancy among selected features. This step was crucial in reducing the dimensionality of the dataset, ensuring that only the most informative features were retained for classification. The MRMR framework ranks features such that their mutual information with the class labels is maximized, while the mutual information between potential features is kept to a minimum. This dual optimization ensures that the selected features are not only relevant but also provide complementary information, which is critical for enhancing the performance of the classification algorithm.
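
To make the relevance and redundancy trade-off concrete, the Python sketch below implements a simple greedy MRMR variant. It is not the implementation used in the study: the equal-frequency discretization into three bins, the difference criterion (relevance minus mean redundancy), and all function names are assumptions chosen for illustration.

```python
import numpy as np


def mutual_information(x, y):
    """Mutual information (in nats) between two discrete 1D vectors."""
    _, xi = np.unique(np.asarray(x), return_inverse=True)
    _, yi = np.unique(np.asarray(y), return_inverse=True)
    joint = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(joint, (xi, yi), 1)
    joint /= joint.sum()
    expected = joint.sum(axis=1, keepdims=True) @ joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / expected[nz])))


def discretize(X, n_bins=3):
    """Equal-frequency binning of every feature column."""
    X = np.asarray(X, dtype=float)
    out = np.empty(X.shape, dtype=int)
    quantiles = np.linspace(0, 1, n_bins + 1)[1:-1]
    for j in range(X.shape[1]):
        edges = np.quantile(X[:, j], quantiles)
        out[:, j] = np.searchsorted(edges, X[:, j], side="right")
    return out


def mrmr(X, y, k):
    """Greedily pick k features maximising relevance minus mean redundancy."""
    Xd = discretize(X)
    n_feat = Xd.shape[1]
    relevance = [mutual_information(Xd[:, j], y) for j in range(n_feat)]
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        best, best_score = None, -np.inf
        for j in range(n_feat):
            if j in selected:
                continue
            redundancy = np.mean(
                [mutual_information(Xd[:, j], Xd[:, s]) for s in selected])
            score = relevance[j] - redundancy
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected


# Synthetic three-class data: feature 1 duplicates feature 0, feature 2 carries
# complementary class information, and feature 3 is pure noise.
rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2], 50)
f0 = 2.0 * (y == 0) + 0.3 * rng.standard_normal(y.size)
f2 = 2.0 * (y == 2) + 0.3 * rng.standard_normal(y.size)
X = np.column_stack([f0, f0.copy(), f2, rng.standard_normal(y.size)])
selected = mrmr(X, y, k=2)
```

The duplicated feature is never chosen alongside its original, because its redundancy with the already selected feature cancels its relevance.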

H. Cohort classification and validation

To classify the CO data, the selected features from the MRMR framework were used as input to multiple classifiers, and we compared the performance of several commonly used machine learning algorithms, including Random Forest, k-Nearest Neighbors (k-NN), and Logistic Regression, before selecting the Support Vector Machine (SVM). Each classifier was evaluated based on key performance metrics such as accuracy, precision, recall, and F1-score. These metrics were calculated using a tenfold cross-validation strategy to ensure robustness and minimize the risk of overfitting. The cross-validation loss observed during training of the SVMs is presented in supplementary material Fig. S2, corresponding to the classification results shown in Figs. 2 and 3.

To prevent information leakage between model selection and performance estimation, we replaced the single-level tenfold cross-validation with a fully nested design. In each outer fold (k = 10), one subject-wise partition was held out for testing while the remaining nine partitions were passed to an inner threefold loop that simultaneously (i) executed minimum redundancy maximum relevance feature selection and (ii) optimized the hyperparameters of every candidate algorithm (Support Vector Machine, Random Forest, k-Nearest Neighbor, and Logistic Regression) by Bayesian minimization of cross-entropy loss. The algorithm–feature–parameter combination that achieved the lowest inner-loop loss was then retrained on the complete inner-loop training set and evaluated exactly once on the held-out outer fold. All performance metrics quoted in the study are averaged over the ten outer folds. 95% confidence intervals were computed with bias-corrected accelerated bootstrapping. This nested procedure guarantees that no information from test folds influences either model choice or hyperparameter tuning.

We conducted a systematic comparison of multiple machine learning algorithms using identical feature sets and validation procedures. Four classifiers were evaluated: Support Vector Machine (SVM), Random Forest (RF), k-Nearest Neighbors (k-NN), and Logistic Regression (LR). For each classifier, we optimized hyperparameters using Bayesian optimization with tenfold cross-validation on the training set. For 2DN classification under baseline conditions, SVM achieved the highest accuracy (93.8%), followed by Random Forest (87.5%), k-NN (83.3%), and Logistic Regression (79.2%). The precision, recall, and F1-scores showed similar patterns, with SVM consistently outperforming other classifiers. Specifically, SVM achieved an F1-score of 0.936, compared to 0.874 for RF, 0.830 for k-NN, and 0.789 for LR. For CO classification under baseline conditions (three-class problem), the performance advantage of SVM was even more pronounced. SVM achieved an accuracy of 83.3%, substantially higher than RF (70.8%), k-NN (66.7%), and LR (62.5%). The weighted F1-scores were 0.833, 0.704, 0.662, and 0.617, respectively. Under PES conditions, SVM maintained its performance advantage, with accuracies of 95.8% for 2DN classification and 91.6% for CO classification, consistently outperforming other algorithms by margins of 8–15 percentage points in accuracy and F1-score.

We attribute SVM's superior performance to several factors. First, its ability to find optimal hyperplanes in high-dimensional feature spaces makes it well suited for the complex electrophysiological data in our study. Second, the kernel trick allows SVM to capture non-linear relationships between features without explicitly transforming the data, providing additional flexibility compared to inherently linear models like Logistic Regression. Third, SVM's margin-based classification approach provides robustness against overfitting, which is particularly important given our relatively limited sample size.

Among the classifiers tested, SVM consistently outperformed the others, particularly in terms of its ability to handle the high-dimensional feature space generated by the SRDNM analysis. SVM's effectiveness in finding the optimal hyperplane that maximizes the margin between different classes made it particularly well suited for this study, where the classes represent different patient-derived organoids, including SCZ, BD, and healthy controls.

I. Hyperparameter tuning and optimization

A Bayesian optimization framework with cross-validation was employed to tune the hyperparameters of the SVM, including the kernel type, such as linear, polynomial, or radial basis function, and the regularization parameter C. Bayesian optimization was chosen due to its efficiency in navigating hyperparameter space by focusing on the most promising regions, thus reducing the number of iterations required compared to grid search or random search.

The kernel type determines the decision boundary shape in the feature space, while the regularization parameter controls the trade-off between maximizing the margin and minimizing classification errors. By optimizing these hyperparameters, we aimed to enhance the classifier's generalization ability and reduce the likelihood of overfitting. The Bayesian optimization process involved evaluating the model's performance across a range of hyperparameter values, guided by minimizing the tenfold cross-validation loss.

J. Classification performance and validation

The final SVM model, trained with optimized hyperparameters, was evaluated on the testing set to assess its classification performance. Confusion matrices and performance metrics, including accuracy, precision, recall, and F1-score, were calculated to provide a comprehensive assessment of the classifier's ability to correctly distinguish between SCZ, BD, and control samples under both baseline and PES conditions.

The confusion matrices presented in Figs. 2(b), 2(d), 3(b), and 3(d) represent the aggregated classification results across all test sets from the outer cross-validation folds. For 2DNs under baseline conditions [Fig. 2(b)], the confusion matrix includes all 48 samples (24 control and 24 SCZ), with each sample appearing exactly once as a test instance across the 10 outer folds. Similarly, for 2DNs under PES conditions [Fig. 2(d)], the confusion matrix includes all 24 samples (12 control and 12 SCZ). For COs under baseline conditions [Fig. 3(b)], the confusion matrix includes all 72 samples (24 control, 24 SCZ, and 24 BD), and for COs under PES conditions [Fig. 3(d)], all 24 samples (8 control, 8 SCZ, and 8 BD) are represented.

To generate these aggregate confusion matrices, we recorded the true and predicted labels for each sample when it appeared in a test set during cross-validation. This approach provides a comprehensive evaluation of model performance across the entire dataset while maintaining strict separation between training and testing data for each prediction. The accuracy values reported alongside these confusion matrices represent the proportion of correctly classified instances across all test folds, equivalent to the sum of diagonal elements divided by the total number of samples in the confusion matrix.
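
The aggregation and the accuracy definition above can be sketched in Python as follows. The labels in the example are illustrative counts for a 24-sample, two-class problem and are not the entries of any published figure. The function names are hypothetical.

```python
import numpy as np


def aggregate_confusion(y_true, y_pred, n_classes):
    """Confusion matrix pooled over all held-out predictions.

    Rows index the true class and columns the predicted class.
    """
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm


def summarize(cm):
    """Accuracy plus per-class precision, recall, and F1-score."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    accuracy = tp.sum() / cm.sum()        # diagonal sum over total samples
    precision = tp / cm.sum(axis=0)       # assumes every class is predicted
    recall = tp / cm.sum(axis=1)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


# Illustrative labels only: 12 samples per class with one misclassification.
y_true = [0] * 12 + [1] * 12
y_pred = [0] * 12 + [0] + [1] * 11
cm = aggregate_confusion(y_true, y_pred, n_classes=2)
accuracy, precision, recall, f1 = summarize(cm)
```

Here 23 of 24 samples lie on the diagonal, so the accuracy is 23/24, or about 95.8%.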

K. Dataset

All MEA datasets utilized in this study were derived from our previously published research3,6,7 which featured COs and 2DNs. Each clinical group comprised four independent iPSC lines, one line per donor. From every line we generated six cortical-interneuron monolayer cultures (2DNs) and six COs, yielding 48 baseline 2DN recordings and 72 baseline CO recordings. An additional subset of these cultures (12 2DNs and 24 COs, evenly distributed across donors) underwent the post-electrical-stimulation (PES) protocol. All cross-validation partitions were performed at the donor level so that no replicate from a given individual appeared simultaneously in training and test sets.
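
Donor-level partitioning can be sketched in Python as shown below. The donor identifiers, the number of folds, and the function name are placeholders, and the sketch does not stratify folds by clinical group. It shows only the constraint that every replicate from one donor falls on the same side of each split.

```python
import numpy as np


def donor_level_folds(donor_ids, n_folds, seed=0):
    """Return (train_idx, test_idx) pairs with whole donors kept together."""
    donor_ids = np.asarray(donor_ids)
    donors = np.unique(donor_ids)
    if n_folds > len(donors):
        raise ValueError("n_folds cannot exceed the number of donors")
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(donors)
    # Assign donors to folds in round-robin order after shuffling.
    fold_of = {d: i % n_folds for i, d in enumerate(shuffled)}
    fold_ids = np.array([fold_of[d] for d in donor_ids])
    return [(np.flatnonzero(fold_ids != k), np.flatnonzero(fold_ids == k))
            for k in range(n_folds)]


# Baseline 2DN layout from the text: 2 groups x 4 donors x 6 cultures = 48.
donor_ids = np.repeat([f"donor_{i}" for i in range(8)], 6)
folds = donor_level_folds(donor_ids, n_folds=4)
```

Because folds are built from donors rather than individual cultures, the number of folds is bounded by the number of donors in the dataset.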

We placed one culture per well on the MEA plates. We recorded the spontaneous activity of n = 48 2DNs and n = 72 COs in their mature stage of development [90 days in vitro (DIV) for 2DNs and 9 months for COs]. Specifically, the 2DN dataset includes n = 24 control samples recorded at baseline, n = 12 control samples recorded post-stimulation, n = 24 SCZ samples recorded at baseline, and n = 12 SCZ samples recorded post-stimulation. For COs, recordings include n = 24 samples each of control, SCZ, and BD cohorts at baseline, as well as n = 8 samples per cohort recorded after electrical stimulation.

SUPPLEMENTARY MATERIAL

See the supplementary material for the following: ranked features with confidence selected by the minimum redundancy maximum relevance (MRMR) algorithm (Fig. S1); cross-validation loss observed during the training of the support vector machine classifier (Fig. S2); immunohistochemical characterization performed at multiple timepoints during development (Fig. S3); single nuclear RNA sequencing analysis (Fig. S4); firing characteristics of cerebral organoids and 2D neurons at baseline (Fig. S5); differential expression of GABAergic genes (SCZ/Control) (Fig. S6); and the sink index features map used in the study (Table 1).

ACKNOWLEDGMENTS

This work was supported by NIH grants R01MH113858 and K08MH086846 (to R.K.), and R01NS133965 (to D.-H.K.).

NOMENCLATURE

BDNF: Brain-derived neurotrophic factor
BD: Bipolar disorder
CALB: Calbindin
CO: Cerebral organoid
DAP: Digital analysis pipeline
DNA: Deoxyribonucleic acid
EB: Embryoid body
E–I: Excitatory–inhibitory (balance)
FOXG1: Forkhead Box G1
GABA: Gamma-aminobutyric acid
GAD1/GAD67: Glutamate decarboxylase 1/67
GDNF: Glial cell line–derived neurotrophic factor
GWAS: Genome-wide association study
iPSC: Induced pluripotent stem cell
ISI: Inter-spike interval
LDA: Linear discriminant analysis
MAP2: Microtubule-associated protein 2
MEA: Multi-electrode array
MRMR: Minimum redundancy maximum relevance
NLGN2: Neuroligin 2
NPC: Neural progenitor cell
OST: Oxidative stress
PCA: Principal component analysis
PES: Post-electrical stimulation
PFC: Prefrontal cortex
qPCR: Quantitative polymerase chain reaction
RNA: Ribonucleic acid
SCZ: Schizophrenia
SNP: Single nucleotide polymorphism
SRDNM: Stimulus–response dynamic network modeling
SST: Somatostatin
SVM: Support Vector Machine
SYT1/2: Synaptotagmin 1/2
TMS: Transcranial magnetic stimulation
TTX: Tetrodotoxin
2DN: Two-dimensional neuron (cortical interneurons in monolayer culture)

Note: This paper is part of the Special Topic on Bioengineering of the Brain.

AUTHOR DECLARATIONS

Conflict of Interest

The authors have no conflicts to disclose.

Ethics Approval

Ethics approval is not required.

Author Contributions

Kai Cheng: Formal analysis (lead); Investigation (equal); Methodology (equal); Software (equal); Validation (equal); Visualization (lead); Writing – original draft (lead); Writing – review & editing (equal). Autumn Williams: Formal analysis (equal); Methodology (equal); Software (equal); Validation (equal); Writing – review & editing (supporting). Anannya Kshirsagar: Writing – original draft (equal). Sai Kulkarni: Writing – original draft (supporting). Rakesh Karmacharya: Data curation (equal); Resources (lead); Writing – review & editing (supporting). Deok-Ho Kim: Data curation (supporting); Resources (supporting); Writing – review & editing (supporting). Sridevi V. Sarma: Investigation (supporting); Methodology (equal); Project administration (equal); Supervision (equal); Writing – review & editing (supporting). Annie Kathuria: Conceptualization (equal); Data curation (equal); Investigation (equal); Project administration (equal); Supervision (equal); Writing – review & editing (equal).

DATA AVAILABILITY

The data that support the findings of this study are openly available on GitHub at https://github.com/ckhdd/RateCoded_MEA_Analysis (Ref. 32).

References

  • 1. Bray N. J. and O'Donovan M. C., “The genetics of neuropsychiatric disorders,” Brain Neurosci. Adv. 2 (published online 2018). 10.1177/2398212818799271
  • 2. See https://www.nimh.nih.gov/health/statistics/schizophrenia for “Schizophrenia,” National Institute of Mental Health (NIMH).
  • 3. Kathuria A., Lopez-Lengowski K., Watmuff B., McPhie D., Cohen B. M., and Karmacharya R., “Synaptic deficits in iPSC-derived cortical interneurons in schizophrenia are mediated by NLGN2 and rescued by N-acetylcysteine,” Transl. Psychiatry 9(1), 321 (2019). 10.1038/s41398-019-0660-x
  • 4. Hughes H., Brady L. J., and Schoonover K. E., “GABAergic dysfunction in postmortem dorsolateral prefrontal cortex: Implications for cognitive deficits in schizophrenia and affective disorders,” Front. Cell. Neurosci. 18, 1440834 (2024). 10.3389/fncel.2024.1440834
  • 5. Kathuria A., Lopez-Lengowski K., Watmuff B., and Karmacharya R., “Comparative transcriptomic analysis of cerebral organoids and cortical neuron cultures derived from human induced pluripotent stem cells,” Stem Cells Dev. 29(21), 1370–1381 (2020). 10.1089/scd.2020.0069
  • 6. Kathuria A., Lopez-Lengowski K., Jagtap S. S., McPhie D., Perlis R. H., Cohen B. M., and Karmacharya R., “Transcriptomic landscape and functional characterization of induced pluripotent stem cell–derived cerebral organoids in schizophrenia,” JAMA Psychiatry 77(7), 745–754 (2020). 10.1001/jamapsychiatry.2020.0196
  • 7. Kathuria A., Lopez-Lengowski K., Vater M., McPhie D., Cohen B. M., and Karmacharya R., “Transcriptome analysis and functional characterization of cerebral organoids in bipolar disorder,” Genome Med. 12(1), 34 (2020). 10.1186/s13073-020-00733-6
  • 8. Passaro A. P. and Stice S. L., “Electrophysiological analysis of brain organoids: Current approaches and advancements,” Front. Neurosci. 14, 622137 (2021). 10.3389/fnins.2020.622137
  • 9. Trujillo C. A., Gao R., Negraes P. D., Gu J., Buchanan J., Preissl S., Wang A., Wu W., Haddad G. G., Chaim I. A., Domissy A., Vandenberghe M., Devor A., Yeo G. W., Voytek B., and Muotri A. R., “Complex oscillatory waves emerging from cortical organoids model early human brain network development,” Cell Stem Cell 25(4), 558–569.e7 (2019). 10.1016/j.stem.2019.08.002
  • 10. Li A., Huynh C., Fitzgerald Z., Cajigas I., Brusko D., Jagid J., Claudio A. O., Kanner A. M., Hopp J., Chen S., Haagensen J., Johnson E., Anderson W., Crone N., Inati S., Zaghloul K. A., Bulacio J., Gonzalez-Martinez J., and Sarma S. V., “Neural fragility as an EEG marker of the seizure onset zone,” Nat. Neurosci. 24(10), 1465–1474 (2021). 10.1038/s41593-021-00901-w
  • 11. Smith R. J., Hays M. A., Kamali G., Coogan C., Crone N. E., Kang J. Y., and Sarma S. V., “Stimulating native seizures with neural resonance: A new approach to localize the seizure onset zone,” Brain 145(11), 3886–3900 (2022). 10.1093/brain/awac214
  • 12. Li A., Gunnarsdottir K. M., Inati S., Zaghloul K., Gale J., Bulacio J., Martinez-Gonzalez J., and Sarma S. V., “Linear time-varying model characterizes invasive EEG signals generated from complex epileptic networks,” in Conference Proceedings—IEEE Engineering in Medicine and Biology Society (EMBC) 2017, Jeju, Korea (South) (IEEE, 2017), pp. 2802–2805. 10.1109/EMBC.2017.8037439
  • 13. McCullagh P., Generalized Linear Models (Chapman and Hall/CRC, 2001).
  • 14. Gunnarsdottir K. M., Li A., Smith R. J., Kang J.-Y., Korzeniewska A., Crone N. E., Rouse A. G., Cheng J. J., Kinsman M. J., Landazuri P., Uysal U., Ulloa C. M., Cameron N., Cajigas I., Jagid J., Kanner A., Elarjani T., Bicchi M. M., Inati S., Zaghloul K. A., Boerwinkle V. L., Wyckoff S., Barot N., Gonzalez-Martinez J., and Sarma S. V., “Source-sink connectivity: A novel interictal EEG marker for seizure localization,” Brain 145(11), 3901–3915 (2022). 10.1093/brain/awac300
  • 15. Eichmüller O. L. and Knoblich J. A., “Human cerebral organoids—A new tool for clinical neurology research,” Nat. Rev. Neurol. 18(11), 661–680 (2022). 10.1038/s41582-022-00723-9
  • 16. Ding C. and Peng H., “Minimum redundancy feature selection from microarray gene expression data,” J. Bioinf. Comput. Biol. 3(2), 185–205 (2005). 10.1142/S0219720005001004
  • 17. Lancaster M. A. and Knoblich J. A., “Generation of cerebral organoids from human pluripotent stem cells,” Nat. Protoc. 9(10), 2329–2340 (2014). 10.1038/nprot.2014.158
  • 18. Shi Y., Kirwan P., and Livesey F. J., “Directed differentiation of human pluripotent stem cells to cerebral cortex neurons and neural networks,” Nat. Protoc. 7(10), 1836–1846 (2012). 10.1038/nprot.2012.116
  • 19. Zhang Y., Zhang S., Ide J. S., Hu S., Zhornitsky S., Wang W., Dong G., Tang X., and Li C.-S. R., “Dynamic network dysfunction in cocaine dependence: Graph theoretical metrics and stop signal reaction time,” NeuroImage Clin. 18, 793–801 (2018). 10.1016/j.nicl.2018.03.016
  • 20. Xia M. and He Y., “Magnetic resonance imaging and graph theoretical analysis of complex brain networks in neuropsychiatric disorders,” Brain Connect. 1(5), 349–365 (2011). 10.1089/brain.2011.0062
  • 21. Bernhardt B. C., Bonilha L., and Gross D. W., “Network analysis for a network disorder: The emerging role of graph theory in the study of epilepsy,” Epilepsy Behav. 50, 162–170 (2015). 10.1016/j.yebeh.2015.06.005
  • 22. Taslim S., Shadmani S., Saleem A. R., Kumar A., Brahma F., Blank N., Bashir M. A., Ansari D., Kumari K., Tanveer M., Varrassi G., Kumar S., and Raj A., “Neuropsychiatric disorders: Bridging the gap between neurology and psychiatry,” Cureus 16(1), e51655 (2024). 10.7759/cureus.51655
  • 23. Bahn S., Noll R., Barnes A., Schwarz E., and Guest P. C., “Challenges of introducing new biomarker products for neuropsychiatric disorders into the market,” Int. Rev. Neurobiol. 101, 299–327 (2011). 10.1016/B978-0-12-387718-5.00012-2
  • 24. Iyortsuun N. K., Kim S.-H., Jhon M., Yang H.-J., and Pant S., “A review of machine learning and deep learning approaches on mental health diagnosis,” Healthcare 11(3), 285 (2023). 10.3390/healthcare11030285
  • 25. Kim H. K., Blumberger D. M., and Daskalakis Z. J., “Neurophysiological biomarkers in schizophrenia-P50, mismatch negativity, and TMS-EMG and TMS-EEG,” Front. Psychiatry 11, 795 (2020). 10.3389/fpsyt.2020.00795
  • 26. Mehta U. M., Thanki M. V., Padmanabhan J., Pascual-Leone A., and Keshavan M. S., “Motor cortical plasticity in schizophrenia: A meta-analysis of transcranial magnetic stimulation—Electromyography studies,” Schizophr. Res. 207, 37–47 (2019). 10.1016/j.schres.2018.10.027
  • 27. Jannati A., Oberman L. M., Rotenberg A., and Pascual-Leone A., “Assessing the mechanisms of brain plasticity by transcranial magnetic stimulation,” Neuropsychopharmacology 48(1), 191–208 (2023). 10.1038/s41386-022-01453-8
  • 28. Uhlhaas P. J. and Singer W., “Abnormal neural oscillations and synchrony in schizophrenia,” Nat. Rev. Neurosci. 11(2), 100–113 (2010). 10.1038/nrn2774
  • 29. Volpato V. and Webber C., “Addressing variability in iPSC-derived models of human disease: Guidelines to promote reproducibility,” Dis. Models Mech. 13(1), dmm042317 (2020). 10.1242/dmm.042317
  • 30. Monteggia L. M., Heimer H., and Nestler E. J., “Meeting report: Can we make animal models of human mental illness?” Biol. Psychiatry 84(7), 542–545 (2018). 10.1016/j.biopsych.2018.02.010
  • 31. Beauchene C., Zurn C. A., Ehrens D., Duff I., Duan W., Caterina M., Guan Y., and Sarma S. V., “Steering toward normative wide-dynamic-range neuron activity in nerve-injured rats with closed-loop peripheral nerve stimulation,” Neuromodulation 26(3), 552–562 (2023). 10.1016/j.neurom.2022.09.011
  • 32. See https://github.com/ckhdd/RateCoded_MEA_Analysis for “MEA.”

Articles from APL Bioengineering are provided here courtesy of American Institute of Physics
