Abstract
We propose a robust deep learning framework to simultaneously detect and localize seizure activity from multichannel scalp EEG. Our model, called DeepSOZ, consists of a transformer encoder to generate global and channel-wise encodings. The global branch is combined with an LSTM for temporal seizure detection. In parallel, we employ attention-weighted multi-instance pooling of channel-wise encodings to predict the seizure onset zone. DeepSOZ is trained in a supervised fashion and generates high-resolution predictions at the level of individual one-second windows (temporal) and individual EEG channels (spatial). We validate DeepSOZ via bootstrapped nested cross-validation on a large dataset of 120 patients curated from the Temple University Hospital corpus. As compared to baseline approaches, DeepSOZ provides robust overall performance in our multi-task learning setup. We also evaluate the intra-seizure and intra-patient consistency of DeepSOZ as a first step to establishing its trustworthiness for integration into the clinical workflow for epilepsy.
Keywords: Epilepsy, EEG, Multi-instance learning, Trustworthy AI
1. Introduction
Epilepsy is a debilitating neurological disorder characterized by spontaneous and recurring seizures [17]. Roughly 30% of epilepsy patients are drug resistant, meaning they do not positively respond to anti-seizure medications. In such cases, the best alternative treatment is to identify and surgically resect the brain region responsible for triggering the seizures, i.e., the seizure onset zone (SOZ). Scalp electroencephalography (EEG) is the first and foremost modality used to monitor epileptic activity. However, seizure detection and SOZ localization from scalp EEG are based on expert visual inspection, which is time consuming and heavily prone to the subjective biases of the clinicians [8].
Computer-aided tools for scalp EEG almost exclusively focus on the task of (temporal) seizure detection. Early works approached the problem via feature engineering and explored spectral [24,25], entropy-based [9], and graph-theoretic [1] features for the task. In general, these methods extract features from short time windows and use a machine learning classifier to discriminate between window-wise seizure and baseline activity [1,25]. More recently, deep learning models have shown promise in extracting generalizable information from noisy and heterogeneous datasets. Deep learning applications to EEG include convolutional neural networks (CNNs) [4,12,22,23], graph convolutional networks (GCNs) [21], and a combination of attention-based feature extraction [10] and recurrent layers to capture evolving dynamics [4,15,20]. Transformers have also been used for seizure detection, both in combination with CNNs [14] and directly on the EEG signals and their derived features [11,18]. While these methods have greatly advanced the problem of seizure detection, they provide little information about the SOZ, which is ultimately the more important clinical question.
A few works have explored the difficult task of localizing the SOZ via post hoc evaluations of deep networks trained for seizure detection. For example, the authors of [7,16] perform a cross-channel connectivity analysis of the learned representations to determine the SOZ. In contrast, the method of [2] identifies the SOZ by dropping out nodes of the trained GCN until the seizure detection performance degrades below a threshold. Finally, the SZTrack model of [6] jointly detects and tracks the spatio-temporal seizure spread by aggregating channel-wise detectors; the predictions of this model are seen to correlate with the SOZ. While valuable, the post hoc nature of these unsupervised analyses means that the results may not generalize to unseen patients. The first supervised approach for SOZ localization was proposed by [3] and uses probabilistic graphical models for simultaneous detection and localization. The more recent SZLoc model [5] proposes an end-to-end deep architecture for SOZ localization along with a set of novel loss functions to weakly supervise the localization task from coarse inexact labels. While these two methods represent seminal contributions to the field, they are difficult to train and only report the localization performance on short (i.e., < 2 min) EEG recordings around the time of seizure onset.
In this paper, we present DeepSOZ, a robust model for joint seizure detection and SOZ localization from multichannel scalp EEG. Our model consists of a spatial transformer encoder to combine cross-channel information and LSTM layers to capture dynamic activity for window-wise seizure detection. In parallel, we use a novel attention-weighted multi-instance pooling to supervise seizure-level SOZ localization at the single channel resolution. We curate a large evaluation dataset from the publicly available TUH seizure corpus by creating SOZ labels from the clinician notes for each patient. We perform extensive window-level, seizure-level, and patient-level evaluations of our model. Additionally, we analyze the consistency of predictions across seizure occurrences, which has not previously been reported for SOZ localization. Quantifying the error variance is the first step in establishing trust in DeepSOZ for clinical translation.
2. Methodology
Figure 1 illustrates our DeepSOZ architecture. The inputs to DeepSOZ are multichannel EEG data for a single seizure recording segmented into one-second windows. The outputs are a temporal sequence of predicted seizure versus baseline activity (detection) and a channel-wise posterior distribution for the SOZ (localization). Formally, let $\mathbf{x}_t^c \in \mathbb{R}^d$ denote the EEG data for channel $c$ and time window $t$. Clinical EEG is recorded in the 10–20 system, which consists of 19 channels distributed across the scalp. For training, let $y_t \in \{0,1\}$ denote the seizure versus baseline activity label for time window $t$, and let $\mathbf{z} \in \{0,1\}^{19}$ be a vector representing the clinician annotated SOZ. Below, we describe each component of DeepSOZ, along with our training and validation strategy.
Fig. 1.
Schematic of our DeepSOZ model. Left: Transformer encoder that uses positional encoding and self attention to generate hidden representations. Top Right: Bidirectional LSTM for seizure detection. Bottom Right: Attention weighted multi-instance pooling for SOZ localization.
2.1. The DeepSOZ Model Architecture
Spatial Transformer Encoder:
For each time window $t$, the multichannel EEG data $\mathbf{x}_t \in \mathbb{R}^{19 \times d}$ is passed to a transformer encoder consisting of multi-head attention (MHA) layers to generate both global and channel-wise encodings. Since the spatial orientation of these channels is crucial for tracking seizure activity, we add a positional embedding generated by a trainable linear layer $\mathbf{P}$, resulting in the modified input $\tilde{\mathbf{x}}_t^c = \mathbf{x}_t^c + \mathbf{P}\,\mathbb{1}[c]$, where $\mathbb{1}[c]$ is the indicator function for a one-hot encoding at element $c$.
The hidden representations $\mathbf{h}_t$ are computed by the transformer encoder from the modified multichannel input $\tilde{\mathbf{x}}_t$ as follows:

$$\bar{\mathbf{h}}_t = \mathrm{LN}\left(\tilde{\mathbf{x}}_t + \mathrm{MHA}(\tilde{\mathbf{x}}_t)\right), \qquad \mathbf{h}_t = \mathrm{LN}\left(\bar{\mathbf{h}}_t + \mathrm{FF}(\bar{\mathbf{h}}_t)\right) \tag{1}$$

where $\mathrm{LN}(\cdot)$ denotes layer normalization, and $\mathrm{FF}(\cdot)$ represents a learned two-layer feed-forward network with ReLU activation.
The $\mathrm{MHA}$ operation uses parallel self attentions to map the input data into a set of projections, as guided by the other channels in the montage. Formally, let $i$ index the attention head. The attention weights $\mathbf{A}^i$ capture global (1) and cross-channel (19) similarities via the key matrix $\mathbf{K}^i$ and query matrix $\mathbf{Q}^i$ as follows:

$$\mathbf{A}^i = \varsigma\left(\frac{\left(\tilde{\mathbf{x}}_t \mathbf{Q}^i\right)\left(\tilde{\mathbf{x}}_t \mathbf{K}^i\right)^T}{\sqrt{d}}\right) \tag{2}$$

where $\varsigma(\cdot)$ represents the row-wise softmax function, and $d$ is our model dimension. The attention $\mathbf{A}^i$ is multiplied by the value projection $\tilde{\mathbf{x}}_t \mathbf{V}^i$ to generate the output for head $i$. These outputs are concatenated and fed into a linear layer to produce $\mathrm{MHA}(\tilde{\mathbf{x}}_t)$. Finally, these MHA outputs are passed into a two-layer feed-forward neural network with ReLU activation, post residual connections and layer normalization, to generate the hidden encoding $\mathbf{h}_t$.
The matrices $\{\mathbf{Q}^i, \mathbf{K}^i, \mathbf{V}^i\}$ are trained parameters of the encoder. For simplicity, we set the model dimension $d$ to be the same as our input ($d = 200$ in this work, given one-second windows sampled at 200 Hz), and we specify 8 attention heads in the MHA operation.
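For concreteness, a minimal NumPy sketch of one self-attention head over the 20 tokens (1 global + 19 channel encodings) is shown below. The weight values, the head dimension of 25 (i.e., $d/8$ for 8 heads), and the random inputs are purely illustrative; this is not the trained model.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable row-wise softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_head(x, Wq, Wk, Wv):
    """One scaled dot-product self-attention head, as in Eq. (2).

    x          : (n_tokens, d) modified input for one time window,
                 here n_tokens = 20 (1 global token + 19 channels).
    Wq, Wk, Wv : (d, d_head) query / key / value projection matrices.
    Returns the head output (n_tokens, d_head) and attention weights A.
    """
    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    A = softmax(Q @ K.T / np.sqrt(x.shape[1]))  # token-to-token similarities
    return A @ V, A

rng = np.random.default_rng(0)
d, d_head = 200, 25                      # d = 200 samples per 1-s window at 200 Hz
x = rng.standard_normal((20, d))         # toy input for a single window
Wq, Wk, Wv = (rng.standard_normal((d, d_head)) * 0.05 for _ in range(3))
out, A = self_attention_head(x, Wq, Wk, Wv)
```

In the full MHA operation, the 8 head outputs would be concatenated and passed through a linear layer, followed by the residual connections, layer normalization, and feed-forward network of Eq. (1).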
LSTM for Temporal Seizure Detection:
We use a bidirectional LSTM to capture evolving patterns in the global encodings of the one-second EEG windows, i.e., $\{\mathbf{h}_t^0\}_{t=1}^{T}$. We use a single LSTM layer with 100 hidden units to process the global encodings and capture both long-term and short-term dependencies. The output of the LSTM is passed into a linear layer, followed by a softmax function, to generate window-wise predictions $\hat{y}_t$. Here, $\hat{y}_t$ represents the posterior probability of seizure versus baseline activity at window $t$.
Attention-Weighted Multi-Instance Pooling for SOZ Localization:
We treat the localization task as a multi-instance learning problem to predict a channel-wise posterior distribution $\hat{\mathbf{z}}$ for the SOZ vector $\mathbf{z}$ by computing a weighted average of the hidden representations from the transformer. We first map the channel-wise encodings $\mathbf{h}_t^c$ to scalars $s_t^c = \mathbf{w}^T \mathbf{h}_t^c$ using the same linear layer $\mathbf{w}$ across channels. We use the predicted seizure probability $\hat{y}_t$ as our attention to compute the final SOZ prediction as follows:

$$\hat{z}^c = \sigma\left(\frac{\sum_t \hat{y}_t\, s_t^c}{\sum_t \hat{y}_t}\right) \tag{3}$$

where $\sigma(\cdot)$ is the sigmoid function. The final patient-level predictions are obtained by averaging $\hat{\mathbf{z}}$ across all seizure recordings for that patient.
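The pooling step above can be sketched in a few lines of NumPy. The encodings, weights, and seizure probabilities below are random placeholders; only the pooling arithmetic of Eq. (3) is being illustrated, and the small epsilon guarding against division by zero is our addition.

```python
import numpy as np

def soz_pooling(h, w, y_hat):
    """Attention-weighted multi-instance pooling for SOZ localization (Eq. 3).

    h     : (T, 19, d) channel-wise transformer encodings
    w     : (d,)       shared linear layer mapping each encoding to a scalar
    y_hat : (T,)       predicted seizure probabilities, used as attention
    Returns a (19,) channel-wise SOZ posterior.
    """
    s = h @ w                                            # (T, 19) channel scalars
    pooled = (y_hat[:, None] * s).sum(0) / (y_hat.sum() + 1e-8)
    return 1.0 / (1.0 + np.exp(-pooled))                 # sigmoid

rng = np.random.default_rng(1)
T, d = 600, 200                        # one 10-min recording of 1-s windows
h = rng.standard_normal((T, 19, d))
w = rng.standard_normal(d) * 0.05
y_hat = rng.random(T)
z_hat = soz_pooling(h, w, y_hat)       # (19,) posterior over channels
```

Because the weights $\hat{y}_t$ concentrate on windows the detector believes are ictal, the pooled representation is dominated by seizure activity rather than baseline EEG.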
2.2. Loss Function and Model Training
We train DeepSOZ in two stages. First, the transformer and LSTM layers are trained for window-wise seizure detection using a weighted cross entropy loss:

$$\mathcal{L}_{det} = -\sum_t \left[\lambda\, y_t \log \hat{y}_t + (1 - y_t) \log(1 - \hat{y}_t)\right] \tag{4}$$
where the weight $\lambda$ is fixed based on the ratio of non-seizure to seizure activity in the dataset. DeepSOZ is then finetuned for SOZ localization. To avoid catastrophic forgetting of the detection task, we freeze the LSTM layers and provide a weak supervision for detection via the loss function:

$$\mathcal{L} = -\sum_c \left[z^c \log \hat{z}^c + (1 - z^c) \log(1 - \hat{z}^c)\right] + \gamma \left\|\hat{\mathbf{z}}\right\|_1 + \mu\, \mathcal{L}_{det} \tag{5}$$

where the term $\gamma \|\hat{\mathbf{z}}\|_1$ penalizes the L1 norm to encourage sparsity in the predicted SOZ posteriors $\hat{\mathbf{z}}$.
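The two loss components can be sketched as follows. This is an illustrative NumPy implementation under our stated assumptions: the class weight `lam`, sparsity weight `gamma`, and the epsilon terms are placeholders, not values from the paper.

```python
import numpy as np

def detection_loss(y, y_hat, lam, eps=1e-8):
    """Weighted cross entropy for window-wise detection (Eq. 4).
    lam upweights the rarer seizure class."""
    return -np.sum(lam * y * np.log(y_hat + eps)
                   + (1 - y) * np.log(1 - y_hat + eps))

def localization_loss(z, z_hat, gamma, eps=1e-8):
    """Channel-wise BCE plus an L1 sparsity penalty on the SOZ posterior.
    The exact composition of Eq. (5) is an assumption; gamma is illustrative."""
    bce = -np.sum(z * np.log(z_hat + eps) + (1 - z) * np.log(1 - z_hat + eps))
    return bce + gamma * np.abs(z_hat).sum()

# Toy example: two windows for detection, two channels for localization.
det = detection_loss(np.array([0, 1]), np.array([0.2, 0.9]), lam=2.0)
loc = localization_loss(np.array([1, 0]), np.array([0.8, 0.1]), gamma=0.1)
```

During finetuning, the total objective would combine `localization_loss` with a down-weighted `detection_loss` so that the frozen detection pathway continues to receive a consistent training signal.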
2.3. Model Validation
We evaluate DeepSOZ using bootstrapped 5-fold nested cross-validation. Within each training fold, we select the learning rate and seizure detection threshold through a grid search with a fixed dropout of 0.15. We use PyTorch v1.9.0 with Adam [13] for training with a batch size of one patient; early stopping is implemented using a validation set drawn from the training data. We re-sample the original 5-fold split three times and report the results across all 15 models.¹
Seizure Detection:
At the window level, we report sensitivity, specificity, and area under the receiver operating characteristic curve (AU-ROC). At the seizure level, we adopt the strategy of [4] and select a detection threshold that ensures no more than 2 min of false positive detections per hour in the validation dataset. To eliminate spurious spikes, we smooth the output predictions using a 30 s window and count only the contiguous intervals beyond the calibrated detection threshold as seizure predictions. Following the standard of [4], we do not penalize post-ictal predictions. We report the false positive rate (FPR, in minutes of false detections per hour), the sensitivity, and the latency (in seconds) of seizure detection.
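The seizure-level post-processing can be sketched as below. The moving-average smoothing follows the 30 s window stated in the text; the exact interval-extraction and latency conventions are our assumptions for illustration.

```python
import numpy as np

def postprocess(y_hat, threshold, smooth_win=30):
    """Smooth per-second posteriors and extract contiguous seizure intervals.

    y_hat: per-second seizure probabilities; smooth_win: 30 s moving average,
    as described in the text. Interval-extraction details are an assumption.
    Returns [(start, stop), ...] half-open intervals in seconds.
    """
    kernel = np.ones(smooth_win) / smooth_win
    smooth = np.convolve(y_hat, kernel, mode="same")
    above = smooth > threshold
    # rising/falling edges of the thresholded signal, zero-padded at both ends
    edges = np.flatnonzero(np.diff(np.concatenate(([0], above.astype(int), [0]))))
    return list(zip(edges[::2], edges[1::2]))

def latency(intervals, onset):
    """Seconds from true onset to the first overlapping detection
    (negative values indicate detection before the annotated onset)."""
    starts = [s for s, e in intervals if e > onset]
    return starts[0] - onset if starts else None
```

For example, a recording with sustained high posteriors between seconds 100 and 160 yields a single detected interval near the annotated onset, from which the latency is computed directly.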
SOZ Localization:
By construction, DeepSOZ processes each seizure recording separately to find the SOZ. Patient-level SOZ predictions are obtained by averaging across all seizure recordings for that patient. The SOZ is correctly localized if the maximum channel-wise probability lies in the neighborhood determined by the clinician. We quantify the prediction variance at the seizure level by generating Monte Carlo samples during test via active dropout. At the patient level, we compute the prediction variance across all seizures for that patient.
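The patient-level aggregation and scoring rule described above can be sketched as follows. Representing the clinician-defined neighborhood as a set of channel indices is our assumption for illustration.

```python
import numpy as np

def patient_soz(seizure_posteriors, neighborhood):
    """Aggregate seizure-level SOZ posteriors for one patient.

    seizure_posteriors : list of (19,) channel-wise posteriors, one per seizure
    neighborhood       : set of channel indices in the clinician-defined SOZ
    Returns the predicted channel, whether it falls in the neighborhood,
    and the channel-wise prediction variance across seizures.
    """
    mean_post = np.mean(seizure_posteriors, axis=0)    # patient-level posterior
    predicted = int(np.argmax(mean_post))              # max channel-wise probability
    correct = predicted in neighborhood
    variance = np.var(seizure_posteriors, axis=0)      # patient-level variance
    return predicted, correct, variance

# Toy patient with two seizures that both implicate channel 3.
p1 = np.zeros(19); p1[3] = 0.9
p2 = np.zeros(19); p2[3] = 0.7; p2[10] = 0.4
pred, correct, var = patient_soz([p1, p2], neighborhood={2, 3, 4})
```

At the seizure level, the same variance computation would instead be applied across Monte Carlo dropout samples of a single recording.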
Baseline Comparisons:
We compare the performance of DeepSOZ with one model ablation and four state-of-the-art methods from the literature. Our ablation replaces the attention-weighted multi-instance pooling in DeepSOZ with a standard max-pool operation within the predicted seizure window (DeepSOZ-max). Our baselines consist of the CNN-BLSTM model for seizure detection developed by [4], the SZTrack model proposed by [6], which uses a convolutional-recurrent architecture for each channel, the SZLoc model by [5], consisting of CNN-transformer-LSTM layers, and the Temporal Graph Convolutional Network (TGCN) developed by [2]. SZTrack and SZLoc are trained and evaluated for localization via the approach published by the authors, which uses only 45 s of data around the onset time. We modify the TGCN slightly to extract channel-wise predictions for the localization task but evaluate it on the full 10-minute recordings, like DeepSOZ. Finally, we note that the CNN-BLSTM can only be used for seizure detection, and SZLoc is only trained for SOZ localization.
3. Experimental Results
Data and Preprocessing:
We validate DeepSOZ on 642 EEG recordings from 120 adult epilepsy patients in the publicly available Temple University Hospital (TUH) corpus [19] with a well characterized unifocal seizure onset. We use the clinical notes to localize the SOZ to a subset of the 19 EEG channels. Table 1 describes the seizure characteristics across patients in our curated subset.
Table 1.
Description of patient demographics in the curated TUH dataset.
| curated TUH dataset | |
|---|---|
| Number of patients | 120 |
| Male/Female | 55/65 |
| Average age | 55.2±16.6 |
| Min/Max age | 19/91 |
| Seizures per patient | 14.7±25.2 |
| Min/Max seizures per patient | 1/152 |
| Average EEG duration per patient | 79.8±135 min |
| Average seizure duration | 88.0±123.5 s |
| Min/Max seizure duration | 7.5/1121 s |
| Temporal/Extra-temporal Onset | 72/48 |
| Right/Left onset zone | 59/61 |
Following [4], we re-sample the raw EEG to 200 Hz for uniformity, filter the signals between 1.6–30 Hz, and clip them at two standard deviations from the mean to remove high intensity artifacts. All signals are normalized to have zero mean and unit variance. We standardize the input lengths by cropping the signals to 10 min around the seizure interval, while ensuring that the onset times are uniformly distributed within this period. We segment the EEG into one-second non-overlapping windows to obtain the model inputs.
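A NumPy sketch of the final preprocessing steps is given below. For brevity, the resampling and 1.6–30 Hz band-pass filtering described in the text are omitted; only the clipping, normalization, and windowing are illustrated, and the epsilon guards are our addition.

```python
import numpy as np

def preprocess(eeg, fs=200, clip_sd=2.0):
    """Clip, z-score, and segment a (19, n_samples) recording into 1-s windows.

    Resampling and band-pass filtering (as in the text) are omitted here.
    Returns an array of shape (T, 19, fs): one-second non-overlapping windows.
    """
    mu = eeg.mean(axis=1, keepdims=True)
    sd = eeg.std(axis=1, keepdims=True) + 1e-8
    eeg = np.clip(eeg, mu - clip_sd * sd, mu + clip_sd * sd)  # remove artifacts
    # re-normalize each channel to zero mean and unit variance after clipping
    eeg = (eeg - eeg.mean(axis=1, keepdims=True)) / (eeg.std(axis=1, keepdims=True) + 1e-8)
    n_win = eeg.shape[1] // fs
    return eeg[:, :n_win * fs].reshape(19, n_win, fs).transpose(1, 0, 2)

rng = np.random.default_rng(0)
eeg = rng.standard_normal((19, 5 * 200 + 37))   # toy 19-channel recording
windows = preprocess(eeg)                        # (5, 19, 200)
```

Each returned window corresponds to one model input, i.e., one token per channel for the spatial transformer encoder.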
Seizure Detection Performance:
Table 2 reports the seizure detection performance averaged over the 15 bootstrapped testing folds. At the window level, both aggregation strategies for DeepSOZ (weighted posterior and max pooling) perform similarly and achieve higher AU-ROC values than the other baselines. The TGCN and CNN-BLSTM baselines achieve notably worse AU-ROC values, establishing the power of a transformer encoder in extracting more meaningful features. SZTrack is trained using the published strategy in [6] and fails to detect seizures effectively. The differences in AU-ROC between DeepSOZ and the TGCN, SZTrack, and CNN-BLSTM are statistically significant per a DeLong's test. At the seizure level, DeepSOZ achieves a good balance between sensitivity (0.81) and FPR (0.44 min/h). The negative latency of 18 s, i.e., detection slightly before the annotated onset, contributes to the slightly elevated FPR. The TGCN and SZTrack have a high sensitivity, which comes at the cost of a much higher FPR, while the CNN-BLSTM has a low detection sensitivity but a comparable FPR.
Table 2.
Temporal seizure detection performance on the TUH dataset. Window-level metrics are calculated for each one-second window. Seizure-level metrics are aggregated over the duration of the seizure after post-processing.
| Model | Window-Level | Seizure-Level | ||||
|---|---|---|---|---|---|---|
| AU-ROC | Sensitivity | Specificity | FPR | Sensitivity | Latency | |
| DeepSOZ | .901±.027 | .679±.100 | .890±.030 | .44±.23 | .808±.106 | −18.45±15.67 |
| DeepSOZ-max | .907±.032 | .676±.079 | .909±.029 | .288±.153 | .700±.105 | −15.39±9.91 |
| TGCN [2] | .887±.032 | .711±.148 | .835±.085 | .808±.591 | .869±.085 | −36.01±29.49 |
| SZTrack [6] | .5202±.045 | .464±.300 | .535±.303 | 2.06±.844 | .799±.135 | −50.5±71.5 |
| CNN-BLSTM [4] | .876±.044 | .664±.135 | .876±.055 | .351±.45 | .42±.281 | 28.89±54.88 |
SOZ Localization Performance:
Table 3 summarizes the SOZ localization performance across models. DeepSOZ performs the best at both patient and seizure levels. In contrast, the SZTrack and TGCN baselines are confident in their predictions but more often incorrect, once again highlighting the value of a transformer encoder. While the SZLoc model performs the best of the baselines, we note that both it and SZTrack have an unfair advantage of being trained and evaluated on 45 s EEG recordings around the seizure onset time. In contrast, DeepSOZ processes full 10-minute recordings for both tasks.
Table 3.
SOZ localization metrics. The seizure-level results are calculated independently on all seizure recordings. Patient-level results are aggregated over multiple seizures of each patient. The number of model parameters is also given.
| Model | Seizure-Level | Patient-Level | # Params | ||
|---|---|---|---|---|---|
| Accuracy | Uncertainty | Accuracy | Uncertainty | ||
| DeepSOZ | .731±.061 | .009±.001 | .744±.058 | .142±.013 | 510K |
| DeepSOZ-max | .513±.154 | .0±.0 | .411±.076 | .023±.007 | 510K |
| TGCN [2] | .479±.07 | .0±.0 | .486±.123 | .153±.015 | 1.16M |
| SZTrack [6] | .454±.065 | .003±.001 | .450±.142 | .017±.007 | 19K |
| SZLoc [5] | .682±.094 | .008±.001 | .740±.056 | .074±.008 | 491K |
Figure 2 aggregates the final predictions of DeepSOZ across the 120 patients into quadrants. As seen, DeepSOZ is adept at differentiating right- and left-hemisphere onsets but struggles to differentiate anterior and posterior SOZs. We hypothesize that this trend is due to the skew towards temporal epilepsy patients in the TUH dataset. A similar trend can be observed in the finer lobe-wise predictions. Figure 3 illustrates sample DeepSOZ outputs for two patients in the testing fold. As seen, DeepSOZ accurately detects the seizure interval in all cases but has two false positive detections for Patient 1. Nonetheless, DeepSOZ correctly localizes the seizure to the left frontal area. The localization for Patient 2 is more varied, which correlates with the patient notes that specify a right-posterior onset with epileptogenic activity quickly spreading to the left hemisphere. Overall, DeepSOZ is more uncertain about this patient.
Fig. 2.
Confusion matrices between the max channel-wise posterior and the true SOZ. Left: Quadrant-based aggregation (L: Left, R: Right, Ant: Anterior, Post: Posterior). Right: Functional region-based aggregation (F: Frontal, FC: Frontocentral, FT: Frontotemporal, T: Temporal, C: Central, P: Parietal, O: Occipital).
Fig. 3.
Visualization for two testing patients. Top: Temporal seizure detection. Blue lines correspond to the DeepSOZ prediction; horizontal orange lines denote the seizure detection threshold from training; shaded region is the ground-truth seizure interval. Bottom: Predicted SOZ for the above seizure projected onto a topological scalp plot. Side: Patient-level SOZ with ground-truth below.
4. Conclusion
We have introduced DeepSOZ for joint seizure detection and SOZ localization from scalp EEG. DeepSOZ leverages a self-attention mechanism to generate informative global and channel-wise latent representations that strategically fuse multi-channel information. The subsequent recurrent layers and attention-weighted pooling allow DeepSOZ to generalize across a heterogeneous cohort. We validate DeepSOZ on data from 120 epilepsy patients and report improved detection and localization performance over numerous baselines. Finally, we quantify the prediction uncertainty as a first step towards building trust in the model.
Supplementary Material
Acknowledgements.
This work was supported by National Institutes of Health grants R01 EB029977 (PI Caffo), R01 HD108790 (PI Venkataraman), and R21 CA263804 (PI Venkataraman).
Footnotes
Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/978-3-031-43993-3_18.
Our code and data can be accessed at https://github.com/deeksha-ms/DeepSOZ.git.
References
- 1.Akbarian B, et al. : A framework for seizure detection using effective connectivity, graph theory, and multi-level modular network. Biomed. Sig. Process. Control 59, 101878 (2020) [Google Scholar]
- 2.Covert IC, et al. : Temporal graph convolutional networks for automatic seizure detection. In: Machine Learning for Healthcare, pp. 160–180. PMLR; (2019) [Google Scholar]
- 3.Craley J, Johnson E, Jouny C, Venkataraman A: Automated noninvasive seizure detection and localization using switching markov models and convolutional neural networks. In: Shen D., et al. (eds.) MICCAI 2019. LNCS, vol. 11767, pp. 253–261. Springer, Cham: (2019). 10.1007/978-3-030-32251-928 [DOI] [Google Scholar]
- 4.Craley J, et al. : Automated inter-patient seizure detection using multichannel convolutional and recurrent neural networks. Biomed. Sig. Process. Control 64, 102360 (2021) [Google Scholar]
- 5.Craley J, et al. : SZLoc: a multi-resolution architecture for automated epileptic seizure localization from scalp EEG. In: Medical Imaging with Deep Learning; (2022) [Google Scholar]
- 6.Craley J, et al. : Automated seizure activity tracking and onset zone localization from scalp EEG using deep neural networks. PLoS One 17(2), e0264537 (2022) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7.Dissanayake T, et al. : Geometric deep learning for subject independent epileptic seizure prediction using scalp EEG signals. IEEE J. Biomed. Health Inf. 26(2), 527–538 (2021) [DOI] [PubMed] [Google Scholar]
- 8.van Donselaar CA, et al. : Value of the electroencephalogram in adult patients with untreated idiopathic first seizures. Arch. Neurol. 49(3), 231–237 (1992) [DOI] [PubMed] [Google Scholar]
- 9.Güler NF, et al. : Recurrent neural networks employing Lyapunov exponents for EEG signals classification. Expert Syst. Appl. 29(3), 506–514 (2005) [Google Scholar]
- 10.He J, et al. : Spatial-temporal seizure detection with graph attention network and bi-directional LSTM architecture. Biomed. Sig. Process. Control 78, 103908 (2022) [Google Scholar]
- 11.Hussein R, et al. : Multi-channel vision transformer for epileptic seizure prediction. Biomedicines 10(7), 1551 (2022) [DOI] [PMC free article] [PubMed] [Google Scholar]
- 12.Khan H, Marcuse L, Fields M, Swann K, Yener B: Focal onset seizure prediction using convolutional networks. IEEE Trans. Biomed. Eng. 65(9), 2109–2118 (2017) [DOI] [PubMed] [Google Scholar]
- 13.Kingma DP, et al. : Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014) [Google Scholar]
- 14.Li C, et al. : EEG-based seizure prediction via transformer guided CNN. Measurement 203, 111948 (2022) [Google Scholar]
- 15.Liang W, Pei H, Cai Q, Wang Y: Scalp EEG epileptogenic zone recognition and localization based on long-term recurrent convolutional network. Neurocomputing 396, 569–576 (2020) [Google Scholar]
- 16.Mansouri A, et al. : Online EEG seizure detection and localization. Algorithms 12(9), 176 (2019) [Google Scholar]
- 17.Miller J, et al. : Epilepsy. Hoboken; (2014) [Google Scholar]
- 18.Pedoeem J, Bar Yosef G, Abittan S, Keene S: TABS: transformer based seizure detection. In: Obeid I, Picone J, Selesnick I (eds.) Biomedical Sensing and Analysis. Springer, Cham: (2022). 10.1007/978-3-030-99383-24 [DOI] [Google Scholar]
- 19.Shah V, et al. : The temple university hospital seizure detection corpus. Front. Neuroinform. 12, 83 (2018). https://isip.piconepress.com/projects/tuheeg/html/downloads.shtml [DOI] [PMC free article] [PubMed] [Google Scholar]
- 20.Vidyaratne L, Glandon A, Alam M, Iftekharuddin KM: Deep recurrent neural network for seizure detection. In: 2016 International Joint Conference on Neural Networks (IJCNN), pp. 1202–1207. IEEE; (2016) [Google Scholar]
- 21.Wagh N, et al. : EEG-GCNN: augmenting electroencephalogram-based neurological disease diagnosis using a domain-guided graph convolutional neural network. In: Machine Learning for Health, pp. 367–378. PMLR; (2020) [Google Scholar]
- 22.Wei Z, Zou J, Zhang J, Xu J: Automatic epileptic EEG detection using convolutional neural network with improvements in time-domain. Biomed. Sig. Process. Control 53, 101551 (2019) [Google Scholar]
- 23.Yuan Y, Xun G, Jia K, Zhang A: A multi-view deep learning framework for EEG seizure detection. IEEE J. Biomed. Health Inf. 23(1), 83–94 (2018) [DOI] [PubMed] [Google Scholar]
- 24.Zandi AS, et al. : Automated real-time epileptic seizure detection in scalp EEG recordings using an algorithm based on wavelet packet transform. IEEE Trans. Biomed. Eng. 57(7), 1639–1651 (2010) [DOI] [PubMed] [Google Scholar]
- 25.Zhang Y, et al. : Integration of 24 feature types to accurately detect and predict seizures using scalp EEG signals. Sensors 18(5), 1372 (2018) [DOI] [PMC free article] [PubMed] [Google Scholar]