Author manuscript; available in PMC: 2019 Oct 3.
Published in final edited form as: Conf Proc IEEE Eng Med Biol Soc. 2018 Jul;2018:3148–3151. doi: 10.1109/EMBC.2018.8512930

EEG Classification via Convolutional Neural Network-Based Interictal Epileptiform Event Detection

John Thomas 1, Luca Comoretto 2, Jing Jin 1, Justin Dauwels 1, Sydney S Cash 3, M Brandon Westover 3
PMCID: PMC6775768  NIHMSID: NIHMS1052410  PMID: 30441062

Abstract

Diagnosis of epilepsy based on visual inspection of electroencephalogram (EEG) abnormalities is an inefficient, time-consuming, and expert-centered process. Moreover, diagnosis based on ictal epileptiform events is challenging because ictal patterns are infrequent. Consequently, the development of an automated, fast, and reliable epileptic EEG diagnostic system is essential. Interictal epileptiform discharges (IEDs) are recurring patterns that are highly suggestive of epilepsy. In this paper, we propose an epileptic EEG classification system based on IED detection. The proposed system comprises three modules: pre-processing, waveform-level classification, and EEG-level classification. We employ a Convolutional Neural Network (CNN) for waveform-level classification and a Support Vector Machine (SVM) for EEG-level classification. We evaluated the proposed system on a dataset of 156 EEGs recorded at Massachusetts General Hospital (MGH), Boston. The system achieved a mean 4-fold classification accuracy of 83.86% for classifying EEGs with and without IEDs.

I. INTRODUCTION

Epilepsy refers to a group of chronic brain disorders characterized by unpredictable, recurrent seizures. It is the fourth most common neurological disorder according to the Epilepsy Foundation [1]. Approximately 65 million people worldwide are affected by epilepsy [2]. An electroencephalogram (EEG) is a recording of the electrical signals produced by the brain, obtained with electrodes placed on the scalp. Currently, epilepsy is widely diagnosed by monitoring the occurrence of seizures. However, seizure events are infrequent [3], so the diagnosis tends to be time-consuming. Interictal epileptiform discharges (IEDs) that appear in the EEG are suggestive of epilepsy, and they are considerably more frequent than seizures. IEDs are further categorized into sharp waves, spikes, spike-wave complexes, and polyspike-wave complexes [4]. Our research primarily focuses on spikes and spike-wave complexes. Spikes are classic biomarkers that play an important role in the diagnosis of epilepsy, the prediction of seizure recurrence, and the prescription of an appropriate treatment. In current clinical practice, epileptiform spikes are visually identified by neurologists. This process is manual, tedious, and time-consuming. Moreover, the inter-rater agreement between neurologists on EEG interpretation is low [5]. Spike detection is a difficult task because spikes exhibit large morphological variation across patients, and there is no standard definition of a spike [6]. Consequently, the reliability of the diagnosis depends heavily on the experience and expertise of the neurologist. Several methods have been proposed in the literature for reliable automated spike detection [7]. None of these methods is universally accepted, as they have not been validated on a sizable dataset. An automated spike detection system would be highly beneficial for the clinical assessment and treatment of epilepsy. Often, neurologists are not concerned with the exact number of spikes identified when diagnosing epilepsy. Consequently, EEG-level classification (whether the EEG is epileptic or non-epileptic) is more important than detecting individual spikes.

We propose an automated EEG classification system to detect epileptic EEGs based on IEDs. The system comprises three major modules: pre-processing, waveform-level classification, and EEG-level classification. The pre-processing module performs the necessary filtering and normalization and configures the EEG montage. The waveform-level classification module detects the various IED patterns in the EEG signal. The EEG-level classification module uses the output of the waveform-level classification to identify epileptic EEGs. We employ a Convolutional Neural Network (CNN) [8] for waveform-level classification and a Support Vector Machine (SVM) [9] with a Gaussian kernel for EEG-level classification. CNN-based systems have shown superior performance for detecting patterns in EEG in applications such as Brain-Computer Interfaces (BCIs) [10] and spike detection systems [11]. Deep learning techniques are generally data-hungry and have been shown to achieve superior performance when trained and evaluated on a sizable dataset. SVM classifiers have been shown to perform well among traditional classifiers. Considering the small input dataset size for EEG-level classification, we employ an SVM classifier at that level. We achieved a mean 4-fold classification accuracy of 83.86% for identifying EEGs with and without IEDs.

In Section II, we elaborate on the patient data and the various methods applied in the study. In Section III, we illustrate and discuss the results achieved for the proposed EEG classification system. In Section IV, we provide conclusions and ideas for future research.

II. METHODS

A. Scalp EEG

In this study, we analyzed 30-minute EEG recordings of 156 subjects (93 epileptic patients with annotated IEDs and 63 spike-free EEGs). The data were recorded with the International 10–20 electrode system at Massachusetts General Hospital (MGH), Boston. The data were down-sampled to 128 Hz after applying an anti-aliasing FIR low-pass filter. We applied two further filters: a 60 Hz IIR notch filter to remove the electrical interference, and a 1 Hz high-pass filter to remove the baseline fluctuations. The Common Average Referential (CAR) montage is employed. We analyzed the system with a set of 18,164 spikes that were cross-annotated by two neurologists. Each spike was extracted as a 500-millisecond waveform (64 samples). The duplicates of the annotated spikes from the adjacent channels were removed before extracting the background waveforms. The background waveforms were also extracted as 500-millisecond waveforms, with an overlap of 75% between consecutive waveforms.
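The waveform sizes quoted above follow directly from the sampling rate: 500 ms at 128 Hz is 64 samples, and a 75% overlap corresponds to advancing the window by 16 samples. The following is a minimal sketch of such a sliding-window extraction for a single channel; the function name `sliding_windows` and the synthetic input are illustrative assumptions, not the authors' code.

```python
FS = 128                 # sampling rate after down-sampling (Hz), from the text
WIN = int(0.5 * FS)      # 500 ms window -> 64 samples
STEP = WIN // 4          # 75% overlap -> advance by 16 samples

def sliding_windows(signal, win=WIN, step=STEP):
    """Return every full-length window of `signal`, taken every `step` samples."""
    return [signal[i:i + win] for i in range(0, len(signal) - win + 1, step)]

# Hypothetical stand-in for one second of one EEG channel.
one_second = list(range(FS))
windows = sliding_windows(one_second)
print(len(windows), len(windows[0]))
```

One second of data yields five overlapping windows under these settings, which illustrates why background waveforms vastly outnumber annotated spikes.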

B. EEG Classification system

We propose a robust, fast, and efficient EEG classification system with three modules: pre-processing, waveform-level classification, and EEG-level classification. The block diagram of the proposed system is illustrated in Fig. 1.

Fig. 1.

The basic structure of the proposed EEG classification system.

C. Pre-processing

The pre-processing module converts the data from the EEG recording system into the EEG classification system input format. This module performs the following tasks:

  1. Down-sampling the data to 128 Hz.

  2. Filtering to remove the power-line interference and baseline drifts.

  3. Configuring the montage (here we apply CAR montage).
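As an illustration of the third task only, a common average reference can be sketched as subtracting, at every time point, the mean over all channels. The function name and the toy data below are assumptions made for illustration; down-sampling and filtering would normally be delegated to a signal-processing library and are not shown.

```python
def common_average_reference(channels):
    """Re-reference EEG to the common average.

    channels: list of channels, each a list of samples of equal length.
    Returns a new list in which the across-channel mean has been
    subtracted from every channel at each time point.
    """
    n_channels = len(channels)
    n_samples = len(channels[0])
    means = [sum(ch[t] for ch in channels) / n_channels for t in range(n_samples)]
    return [[ch[t] - means[t] for t in range(n_samples)] for ch in channels]

# Toy example: 3 channels, 2 time points (hypothetical values).
eeg = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
car = common_average_reference(eeg)
print(car)
```

After re-referencing, the channel values at each time point sum to zero, which is the defining property of this montage.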

D. Waveform-level classification

The waveform-level classification module was trained to detect epileptiform transients in the EEG. Here the classification problem is to identify a waveform as a spike or a background (the non-spike EEG waveforms are categorized as backgrounds). The classification problem is heavily class-imbalanced. The typical ratio between the positive class (spikes) and the negative class (backgrounds) is 1:1250. We employ a Convolutional Neural Network (CNN)-based system for waveform-level classification [8].

The CNN system was developed using TensorFlow 1.2.1 [12] with a Tesla K40 graphics processing unit (GPU) on Ubuntu 16.04. The parameters of the implemented CNN are detailed in Table I. We employ several techniques to optimize the network and to prevent overfitting, namely dropout, batch processing, balanced training, and nested cross-validation to determine the stopping criteria. In dropout, the weights of a certain percentage of the fully connected layer are dropped at each iteration to prevent overfitting. We applied balanced training (the spike-to-background ratio is maintained at 1:1) to prevent overfitting to the negative class due to the class imbalance. In batch processing, the ratio is also maintained at 1:1, but the background set is replaced after a particular number of iterations. This training scheme yields better sensitivity and precision for detecting IEDs. The hyperparameters and the training termination criteria of the CNN are determined by applying a nested cross-validation on the training set, in which the classifier training and validation sets contain 80% and 20% of the data, respectively. We also optimized the training process by improving the quality of the background set applied for training.
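The balanced batch scheme can be sketched as follows: all spikes are kept, an equally sized subset of backgrounds is drawn, and that subset is redrawn for each new round of training iterations. This is a simplified sketch; the function name, the notion of a "round", and the toy data are assumptions and do not reproduce the authors' training code.

```python
import random

def balanced_training_sets(spikes, backgrounds, n_rounds, seed=0):
    """Yield (spikes, background_subset) pairs with a 1:1 class ratio.

    A fresh background subset is sampled for every round, so that over
    many rounds the network sees a wide range of background waveforms.
    """
    rng = random.Random(seed)
    for _ in range(n_rounds):
        yield spikes, rng.sample(backgrounds, len(spikes))

# Toy data with the 1:1250 imbalance mentioned in the text (hypothetical items).
spikes = ["s0", "s1", "s2", "s3"]
backgrounds = ["b%d" % i for i in range(5000)]
rounds = list(balanced_training_sets(spikes, backgrounds, n_rounds=3))
print([len(bg) for _, bg in rounds])
```

Sampling without replacement within a round avoids duplicated backgrounds in a training set, while resampling between rounds exposes the classifier to more of the negative class.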

TABLE I.

The different parameters of the implemented CNN network

Parameter | Value
Number of convolutional layers | 1
Number of convolutional filters | 32
Dimension of convolutional filters | 1×5
Number of pooling layers | 1
Number of fully connected layers | 1
Number of hidden layer neurons | 3000
Activation function | ReLU (Rectified Linear Unit)
Pooling block size | 2×2
Dropout probability | 0.5
Optimizer | Adam
Learning rate | 10^-4
Training termination criteria | Until the validation error saturates
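To get a feel for the size of the network in Table I, the layer shapes and parameter counts can be worked out by hand. The sketch below does this under several assumptions that the table does not state: a single-channel 64-sample input, unpadded ("valid") convolution with stride 1, non-overlapping pooling of width 2 along the time axis, and a 2-unit output layer. Different padding or pooling conventions would change the numbers.

```python
def cnn_shape_summary(n_samples=64, n_filters=32, kernel=5, pool=2,
                      hidden=3000, n_classes=2):
    """Return layer sizes and parameter counts under the stated assumptions."""
    conv_len = n_samples - kernel + 1      # 'valid' convolution, stride 1 (assumed)
    pooled_len = conv_len // pool          # pooling of width 2 along time (assumed)
    flat = pooled_len * n_filters          # flattened input to the dense layer
    params = {
        "conv": kernel * n_filters + n_filters,        # weights + biases
        "hidden": flat * hidden + hidden,
        "output": hidden * n_classes + n_classes,      # 2-unit output (assumed)
    }
    return conv_len, pooled_len, flat, params

conv_len, pooled_len, flat, params = cnn_shape_summary()
total_params = sum(params.values())
print(conv_len, pooled_len, flat, params, total_params)
```

Under these assumptions almost all parameters sit in the fully connected layer, which is consistent with applying dropout there.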

E. EEG-level classification

The EEG-level classification system predicts whether the EEG contains IEDs by analyzing the output of the waveform-level classification. We implemented this module by means of a Support Vector Machine (SVM) [9] with a Gaussian kernel. The input feature vector for the SVM is extracted from the output of the CNN-based spike detection system. The SVM was implemented with scikit-learn [13] on Ubuntu 16.04. The parameters of the SVM were optimized by performing a nested cross-validation on the training set.
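For reference, the Gaussian (radial basis function) kernel measures the similarity of two feature vectors as exp(-gamma * ||x - y||^2). A minimal sketch follows; the value of `gamma` and the example vectors are placeholders, since the paper only states that the SVM parameters were tuned by nested cross-validation.

```python
import math

def gaussian_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

# Identical vectors give a similarity of 1; distant vectors approach 0.
same = gaussian_kernel([0.2, 0.1], [0.2, 0.1])
apart = gaussian_kernel([1.0, 0.0], [0.0, 0.0])
print(same, apart)
```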

F. Approach

We divide the MGH epileptic dataset (93 patients) randomly into four folds while keeping the distribution of EEGs by spike frequency similar across folds. The fold details are shown in Table II. We applied two folds of epileptic EEGs for the waveform-level CNN training, one fold for the EEG-level SVM training, and one fold for evaluating the trained system. The spike-free EEGs are applied only for the EEG-level classification training and testing. We employed 32 spike-free EEGs for training and the remaining 31 for testing. We have 6 different combinations of folds (A-F) for waveform-level classification and 12 combinations of folds for EEG-level classification (A1, A2, B1, B2, … , F2). The output of the CNN-based system was mapped into [0,1] with a softmax function, i.e., for each 500-millisecond input, the CNN produces an output in [0,1], with a higher value indicating a more spike-like waveform. The CNN system converts the 19-channel EEG file into 19 channels of CNN output values. We also implemented time-instant detection, i.e., the 19-channel CNN outputs from a single time instant were combined to generate a single output for that time instant. The maximum value of the 19-channel output was selected as the combined time-instant output, resulting in a single value in [0,1] for each 500-millisecond EEG interval. Next, 20 features were extracted from the time-instant CNN outputs. The CNN output range [0,1] is divided into 20 equal ranges of width 0.05: [0,0.05), [0.05,0.1), … , [0.95,1]. Each feature is defined as the fraction of time-instant CNN outputs that falls within a particular range. These 20 features are evaluated based on their ability to discriminate between EEGs with and without IEDs. The most salient features are applied as the input for the EEG-level classification SVM.
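The feature extraction described above can be sketched in two steps: take the maximum CNN output across channels at each time instant, then compute the fraction of those maxima that falls in each of the 20 bins. The function names and the toy values below (3 channels instead of 19) are illustrative assumptions.

```python
def time_instant_outputs(channel_outputs):
    """Per-time-instant maximum across channels.

    channel_outputs: list of channels, each a list of CNN outputs in [0, 1].
    """
    return [max(values) for values in zip(*channel_outputs)]

def histogram_features(outputs, n_bins=20):
    """Fraction of outputs in each of n_bins equal-width bins of [0, 1].

    The last bin is closed on the right, so an output of exactly 1 is
    counted in [0.95, 1].
    """
    counts = [0] * n_bins
    for p in outputs:
        counts[min(int(p * n_bins), n_bins - 1)] += 1
    return [c / len(outputs) for c in counts]

# Toy CNN outputs: 3 channels x 4 time instants (hypothetical values).
cnn_out = [
    [0.02, 0.10, 0.30, 0.47],
    [0.01, 0.93, 0.20, 0.12],
    [0.00, 0.40, 1.00, 0.33],
]
combined = time_instant_outputs(cnn_out)
features = histogram_features(combined)
print(combined)
print(features)
```

Because each feature is a fraction rather than a count, the 20 values sum to one regardless of recording length, which keeps EEGs of different durations comparable.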

TABLE II.

Data division fold details

Fold number | Number of epileptic EEGs | Number of annotated spikes
Fold 1 | 24 | 4812
Fold 2 | 23 | 4573
Fold 3 | 23 | 4065
Fold 4 | 23 | 4714
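The 6 waveform-level and 12 EEG-level fold combinations follow from choosing 2 of the 4 folds for CNN training and then letting each of the two remaining folds serve once for SVM training and once for testing. A short enumeration sketch is given below; the variable names are illustrative.

```python
from itertools import combinations

folds = [1, 2, 3, 4]
schedule = []
for label, cnn_train in zip("ABCDEF", combinations(folds, 2)):
    rest = [f for f in folds if f not in cnn_train]
    # Each remaining fold is used once for SVM training and once for testing.
    schedule.append((label + "1", cnn_train, rest[0], rest[1]))
    schedule.append((label + "2", cnn_train, rest[1], rest[0]))

for name, cnn_train, svm_train, test in schedule:
    print(name, "CNN folds", cnn_train, "SVM fold", svm_train, "test fold", test)
```

With this ordering, combination A1 trains the CNN on Folds 1 and 2, trains the SVM on Fold 3, and tests on Fold 4.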

III. RESULTS

The waveform-level classification CNN results are presented in Table III. The CNN system achieved a mean Area-Under-Curve (AUC) value of 0.935 and a mean precision of 0.55 at a sensitivity of 80%. The sensitivity-precision curves for the different fold combinations are presented in Fig. 2. The spike detection studies available in the literature employ different datasets and report different evaluation measures, which makes a direct comparison with previous studies difficult. The performance of our CNN spike detector is superior to most of the previous studies in the literature [7], [14], [15]. Moreover, we have evaluated the system on a sizable database of spikes, whereas most of the studies in the literature were performed on a smaller set of spikes.

TABLE III.

The results of waveform-level classification with CNN for the different combinations of folds.

Fold combination | CNN training set | Testing set | AUC
A | Folds 1,2 | Folds 3,4 | 0.920
B | Folds 1,3 | Folds 2,4 | 0.962
C | Folds 1,4 | Folds 2,3 | 0.868
D | Folds 2,3 | Folds 1,4 | 0.972
E | Folds 2,4 | Folds 1,3 | 0.929
F | Folds 3,4 | Folds 1,2 | 0.960

Mean AUC: 0.935
Mean precision for sensitivity of 80%: 0.55

Fig. 2.

The sensitivity-precision curves for the different combinations of the waveform-level classification.
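A point on a sensitivity-precision curve, such as the precision at 80% sensitivity, can be obtained by lowering the detection threshold until the target fraction of true spikes is detected and then measuring precision at that threshold. The sketch below assumes untied scores and uses made-up scores and labels; it is not the evaluation code used in the study.

```python
def precision_at_sensitivity(scores, labels, target=0.8):
    """Return (precision, threshold) at the first threshold reaching `target` sensitivity.

    scores: classifier outputs; labels: 1 for spike, 0 for background.
    Returns None if there are no positives.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        return None
    tp = fp = 0
    # Walk down the ranked list, as if lowering the detection threshold.
    for score, label in sorted(zip(scores, labels), reverse=True):
        if label == 1:
            tp += 1
        else:
            fp += 1
        if tp / n_pos >= target:
            return tp / (tp + fp), score
    return None

# Hypothetical scores for 10 waveforms, 5 of which are spikes.
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0]
precision, threshold = precision_at_sensitivity(scores, labels)
print(precision, threshold)
```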

We extracted 20 features from the output of the CNN, i.e., the fraction of CNN outputs in 20 different intervals of equal size. We selected the best features based on p-values: a two-sample t-test was performed on each feature to check its ability to discriminate between EEGs with and without IEDs. The fractions of CNN outputs in the following intervals were selected: [0.45,0.5), [0.55,0.6), [0.6,0.65), [0.65,0.7), [0.7,0.75), [0.75,0.8), [0.8,0.85), and [0.9,0.95). This 8-dimensional feature vector is applied to the EEG-level classification SVM. We set the SVM output threshold to 0.4 for identifying EEGs containing IEDs. The results of the EEG-level classification are presented in Table IV. In Fig. 3 we show the scatter plot of the two most salient features, i.e., the fractions of CNN outputs in [0.45,0.5) and [0.9,0.95), for combination A1. The plot indicates that the two classes, EEGs with and without IEDs, are reasonably separable by the current system. We achieved a mean EEG-level classification accuracy of 83.86% over the 12 different combinations of folds. Most of the studies in the literature are limited to epileptic spike detection; classification of EEGs based on interictal epileptiform activity is a relatively novel concept. Altunay et al. presented a similar study in [16], applying a linear prediction filter to classify EEGs based on IEDs. That system is reported to have achieved a sensitivity of 95%, but a direct comparison cannot be made because the study was conducted with intracranial EEG data of 5 patients.
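The feature screening step can be illustrated with a two-sample t statistic computed per feature. The sketch below uses Welch's unequal-variance form and ranks by the magnitude of t rather than by p-value; both choices are assumptions made for a dependency-free illustration, since the paper does not specify the variant of the test. The sample values are invented.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's two-sample t statistic for samples a and b."""
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error

# Hypothetical values of one feature for EEGs without and with IEDs.
without_ieds = [1.0, 2.0, 3.0]
with_ieds = [4.0, 5.0, 6.0]
t_stat = welch_t(without_ieds, with_ieds)
print(t_stat)
```

A larger magnitude of t indicates a feature whose mean differs more strongly between the two groups relative to its variability, which is the property used to retain the most salient histogram bins.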

TABLE IV.

The results for EEG-level classification based on the 8-dimensional feature vector.

Fold combination | SVM training set | Testing set | AUC | TP | FP
A1 | Fold 3 | Fold 4 | 0.901 | 23 | 5
A2 | Fold 4 | Fold 3 | 0.833 | 16 | 5
B1 | Fold 2 | Fold 4 | 0.903 | 23 | 3
B2 | Fold 4 | Fold 2 | 0.863 | 18 | 3
C1 | Fold 2 | Fold 3 | 0.812 | 19 | 7
C2 | Fold 3 | Fold 2 | 0.784 | 15 | 6
D1 | Fold 1 | Fold 4 | 0.969 | 22 | 4
D2 | Fold 4 | Fold 1 | 0.885 | 18 | 2
E1 | Fold 1 | Fold 3 | 0.849 | 17 | 5
E2 | Fold 3 | Fold 1 | 0.852 | 20 | 5
F1 | Fold 1 | Fold 2 | 0.873 | 17 | 3
F2 | Fold 2 | Fold 1 | 0.925 | 19 | 3

Mean AUC: 0.870
Mean accuracy: 83.86%
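The TP and FP columns can be turned into per-combination sensitivity and specificity once the test set sizes are fixed. The sketch below assumes that TP and FP are counts of EEGs, that the positives are the epileptic EEGs of the test fold (Table II), and that the same 31 spike-free test EEGs described in Section II-F serve as negatives in every combination; these are reading assumptions, not statements from the paper.

```python
FOLD_SIZE = {1: 24, 2: 23, 3: 23, 4: 23}   # epileptic EEGs per fold (Table II)
N_SPIKE_FREE_TEST = 31                      # spike-free test EEGs (Section II-F)

def eeg_level_rates(tp, fp, test_fold):
    """Return (sensitivity, specificity) for one fold combination."""
    sensitivity = tp / FOLD_SIZE[test_fold]
    specificity = (N_SPIKE_FREE_TEST - fp) / N_SPIKE_FREE_TEST
    return sensitivity, specificity

# Row A1 of Table IV: TP = 23, FP = 5, tested on Fold 4.
sens, spec = eeg_level_rates(tp=23, fp=5, test_fold=4)
print(sens, spec)
```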

Fig. 3.

The scatter plot of the two most salient features, i.e., the fractions of CNN outputs in the intervals [0.45,0.5) and [0.9,0.95) (fold combination A1), for EEG-level classification. The plot shows that the two classes are reasonably separable.

IV. CONCLUSIONS

In this study, we present preliminary results for an automated EEG classification system developed to classify EEGs with and without IEDs. It comprises three main modules: pre-processing, waveform-level classification (CNN), and EEG-level classification (SVM). Our system achieved a mean 4-fold cross-validation accuracy of 83.86% when tested on a sizable database of 156 subjects. This system may help neurologists diagnose epilepsy more efficiently. In future work, we intend to incorporate artifact rejection in order to reduce false detections. We also aim to modify the pre-processing module so that it adapts to different montages and EEG recording equipment.

References

  • [1]. Epilepsy Foundation of America. About Epilepsy: The Basics. http://www.epilepsy.com/learn/about-epilepsy-basics.
  • [2]. Thurman DJ, Beghi E, Begley CE, Berg AT, Buchhalter JR, Ding D, et al. Standards for epidemiologic studies and surveillance of epilepsy. Epilepsia. 2011;52(s7):2–26.
  • [3]. Moran N, Poole K, Bell G, Solomon J, Kendall S, McCarthy M, et al. Epilepsy in the United Kingdom: seizure frequency and severity, anti-epileptic drug utilization and impact on life in 1652 people with epilepsy. Seizure-European Journal of Epilepsy. 2004;13(6):425–433.
  • [4]. Deuschl G, Eisen A, et al. Recommendations for the practice of clinical neurophysiology: guidelines of the International Federation of Clinical Neurophysiology. 1999.
  • [5]. Halford JJ, Schalkoff RJ, Zhou J, Benbadis SR, Tatum WO, Turner RP, et al. Standardized database development for EEG epileptiform transient detection: EEGnet scoring system and machine learning analysis. Journal of Neuroscience Methods. 2013;212(2):308–316.
  • [6]. Khouma O, Ndiaye ML, Farsi SM, Montois JJ, Diop I, Diouf B. Comparative methods of spike detection in epilepsy. In: Science and Information Conference (SAI), 2015. IEEE; 2015. p. 749–755.
  • [7]. Halford JJ. Computerized epileptiform transient detection in the scalp electroencephalogram: Obstacles to progress and the example of computerized ECG interpretation. Clinical Neurophysiology. 2009;120(11):1909–1915.
  • [8]. Krizhevsky A, Sutskever I, Hinton GE. Imagenet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems; 2012. p. 1097–1105.
  • [9]. Hearst MA, Dumais ST, Osuna E, Platt J, Scholkopf B. Support vector machines. IEEE Intelligent Systems and their Applications. 1998;13(4):18–28.
  • [10]. Thomas J, Maszczyk T, Sinha N, Kluge T, Dauwels J. Deep learning-based classification for brain-computer interfaces. In: Systems, Man, and Cybernetics (SMC), 2017 IEEE International Conference on. IEEE; 2017. p. 234–239.
  • [11]. Thomas J, Sinha N, Shaju N, Maszczyk T, Jin J, Cash SS, et al. P290 Convolutional Neural Network-based Interictal Epileptiform Discharge Detection, 26th Annual Computational Neuroscience Meeting (CNS*2017): Part 3. BMC Neuroscience. 2017 Aug;18(1):60. Available from: 10.1186/s12868-017-0372-1.
  • [12]. Abadi M, Agarwal A, Barham P, Brevdo E, Chen Z, Citro C, et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems; 2015. Software available from tensorflow.org. Available from: https://www.tensorflow.org/.
  • [13]. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research. 2011;12:2825–2830.
  • [14]. Wilson SB, Emerson R. Spike detection: a review and comparison of algorithms. Clinical Neurophysiology. 2002;113(12):1873–1881.
  • [15]. de Moraes FD, Callegari DA. Automated Detection of Interictal Spikes in EEG: A literature review. Pontifícia Universidade Católica do Rio Grande do Sul Av: Ipiranga; 2014.
  • [16]. Altunay S, Telatar Z, Erogul O. Epileptic EEG detection using the linear prediction error energy. Expert Systems with Applications. 2010;37(8):5661–5665.
