Heliyon. 2020 Dec 18;6(12):e05694. doi: 10.1016/j.heliyon.2020.e05694

Optimisation of deep neural networks for identification of epileptic abnormalities from electroencephalogram signals

Wattanapong Kurdthongmee
PMCID: PMC7753124  PMID: 33364484

Abstract

An electroencephalogram (EEG) measures and records the electrical activity of the brain. It provides valuable information that can be used to identify epileptic abnormalities. However, the visual identification of such abnormalities from EEG signals by expert neurologists is time consuming. Therefore, several researchers have proposed using deep neural networks (DNNs) to automate the identification of these abnormalities. Their studies have examined the use of different numbers of layers, different numbers of parameters, and various operation types arranged in different architectures. This paper presents the shallowest 11-layer DNN architecture capable of classifying three classes of EEG signals: normal, preictal, and seizure. When the proposed architecture was applied to the standard University of Bonn EEG signal dataset, it achieved accuracy, specificity, and sensitivity values of 99.43%, 99.57%, and 99.10%, respectively. It not only performed better than state-of-the-art DNN architectures, but also had fewer layers and fewer parameters, allowing it to identify epileptic abnormalities more quickly. Experiments were also conducted in which the length of the EEG signals was reduced to 65% (2,662 samples with a period of 15.26 s), which in turn reduced the total number of parameters of the proposed architecture to a value comparable to the smallest state-of-the-art architecture and decreased the lag time for identification. Even in these experiments, it produced equal performance measures, with the execution time reduced to only 69% of that required for the full-length EEG signals.

Keywords: Computer science, Epileptic abnormalities identification, Deep neural networks, EEG signals



1. Introduction

Epilepsy is a central nervous system (neurological) disorder in which brain activity becomes abnormal, causing seizures or periods of unusual behaviour, sensations, and sometimes loss of awareness that can lead to serious physical injuries to the patient. It is classified as a severe disease that affects 50 million people worldwide, 85% of whom reside in developing countries, with 2.4 million new cases occurring every year at a global level [1]. It is therefore an important topic of biomedical research. Epilepsy is characterised by unprovoked seizures due to the involvement of the central nervous system, with the normal neuronal network abruptly turning into a hyper-excited network in an unpredictable manner. These occurrences may vary from once a year to several times a day. The ability to detect the occurrence of seizures would make it possible to improve therapeutic treatment, which would enhance the quality of life of epileptic patients [2]. Electroencephalogram (EEG) signals, which record details of the electrical activity of the brain and can be utilised to examine brain functions, are commonly used to identify epilepsy. Traditionally, the inspection of EEG signals to identify epileptic abnormalities is carried out visually by expert neurologists. However, this procedure is tedious and prone to human error. Hence, automating the identification of epileptic seizures from EEG signals is an important problem. The problem essentially involves the extraction of distinguishing features from EEG signals for seizure detection.

Deep neural networks (DNNs) have been successfully used for object classification and detection. Originally, the application of a DNN to a two-dimensional domain was proposed for the recognition of handwritten zip codes from their images [3]. Based on the impressive results in terms of classification accuracy, it was extended to one-dimensional domains, including human activity recognition from gyroscope/accelerometer sensors attached to the body [4], natural language processing (NLP) [5], and speech applications [6]. These also showed impressive detection levels. A thorough survey of the applications of DNNs to one-dimensional problem domains was published by Kiranyaz et al. [7].

A DNN uses several layers of repetitive operations to extract the low-level attributes of an object and change the size of its features. By definition, a feature is the result of applying an operation to the input data (the EEG signals in this case) in the first layer, or to the feature from the preceding layer in subsequent layers. Common DNN operations include convolution, pooling, and fully connected operations. This paper only provides brief explanations of the basic functions of these operations; greater detail was provided by Yamashita et al. [8]. Convolution is an operation that extracts a set of features from an input feature by applying a kernel in the form of a sliding window over the input feature. Depending on the values of the convolutional kernel, specific patterns (e.g., horizontal and vertical edges) can be extracted from the input feature. The main function of the pooling operation is to reduce the size of an input feature, which makes the computation faster, reduces the memory size, and protects the DNN model from the effect of overfitting, which occurs if a model learns the noise and details in the training set to such an extent that it negatively impacts the model's performance on new data [2]. Two common types of pooling operations are maximum (max) and average pooling, which operate by selecting the maximum or average value from the predefined members of an input feature, respectively. The flatten operation is responsible for converting an input feature into a one-dimensional array. Finally, the fully connected or dense operation is a regular neural network operation that takes an input feature, computes the class scores, and outputs a one-dimensional array with a size equal to the number of classes.
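As a minimal illustration (not the paper's model), these four operations can be combined in Keras as follows:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    # Convolution: slide a kernel of width 5 over the signal to extract
    # 8 feature maps (one per filter).
    layers.Conv1D(filters=8, kernel_size=5, activation='relu',
                  input_shape=(4096, 1)),
    # Max pooling: halve the feature length by keeping the maximum of
    # every pair of neighbouring values.
    layers.MaxPooling1D(pool_size=2),
    # Flatten: convert the (length, channels) feature into a 1-D array.
    layers.Flatten(),
    # Dense: compute one score per class (normal, preictal, seizure).
    layers.Dense(3, activation='softmax'),
])
model.summary()
```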

Different approaches have been utilised to automatically identify epileptic abnormalities, normally by recognising three different classes of EEG signals: the normal, preictal, and seizure classes. These approaches have achieved different levels of accuracy, specificity, and sensitivity, with a majority exceeding 95% on these performance measures. The approaches proposed before 2018 can be classified into the following groups: neural networks with enhancement techniques to improve their performance measures [9,10,11,12], support vector machine (SVM) methods [13,14,15,16], wavelets [9,17,18], and others, e.g., the Gaussian mixture model [19], decision tree [20], and random forest methods [21]. These approaches share a common requirement: most of the features within the EEG signals need to be identified by a domain expert in order to reduce the complexity of the data and make patterns more visible to the learning algorithms. Since 2018, DNNs have been employed in this research domain. The dominant feature of a DNN is that it learns high-level features from the data in an incremental manner, which eliminates the need for domain expertise and complex feature-extraction tasks. DNN-based epileptic abnormality identification was pioneered by Acharya et al. [22] with their 14-layer DNN architecture. In terms of the operators used, the architecture relies on one-dimensional convolution, max pooling, flatten, and dense operations, and it achieved an accuracy of 88.7%, a sensitivity of 95%, and a specificity of 90%. An improvement in accuracy was reported by Ullah et al. [23] using a 14-layer architecture, with two data augmentation schemes introduced in the DNN learning stage to increase the size of the learning dataset; an accuracy of 99.1% was attained, without the specificity and sensitivity being reported. Later, feature scaling was introduced to a DNN to improve the performance measures [24], achieving an accuracy of 97.56%, a sensitivity of 98.17%, and a specificity of 94.93%. The effectiveness of a stacking-ensemble DNN approach for epileptic seizure detection was then studied by Akyol [25], attaining an average accuracy of 97.17% and an average sensitivity of 93.11%. The most recent publication, by Abiyev et al. [2], reported the best achievements, with an accuracy of 98.67%, a sensitivity of 97.67%, and a specificity of 98.83%.

Reports on DNN-based epileptic abnormality identification commonly present their findings in the form of proposed DNN architectures and improved performance measures, and they rely on the standard EEG signal dataset provided by the University of Bonn, Germany. Only the publication of Ullah et al. [23] focused on optimising their DNN architecture to minimise the memory requirement and execution time, claiming that the architecture was suitable for real-time clinical settings. Such minimisations ease the burden on neurologists and can assist patients by alerting them before a seizure occurs.

This study examined a shallower DNN architecture with a smaller total number of parameters and a shorter execution time, which was experimentally shown to be more effective than the state-of-the-art method in terms of the performance measures. The contributions of this study include a technique that uses a shallow DNN architecture to directly process a raw EEG dataset while improving the performance measures of the classifier. A systematic approach to designing the DNN architecture is also provided, in place of the trial-and-error approach used in previous publications. We also present a DNN architecture that works well in the detection of epileptic seizures with minimal feature extraction. In addition, the main differences between our study and previously proposed ones concern the number of DNN layers in the architecture, its total number of parameters, and the length of the EEG signal used for detection. This strengthens the understanding that, in the EEG epileptic abnormality identification domain, deeper DNN architectures and longer EEG signals do not always produce superior prediction performance measures; the research community can instead seek better shallow architectures. The remainder of this paper is organised as follows. Section 2 discusses the materials and methods in detail. This is followed by the presentation of the experimental results and discussion in Section 3. Finally, the paper is concluded in Section 4.

2. Materials and methods

Before providing the details in this section and the rest of this paper, the terms architecture and classifier are defined. The term architecture refers to the DNN architecture, which consists of the convolution, pooling, and dense layers. The classifier is a DNN inferencing program that uses the model produced by training the architecture to identify or classify the EEG signals into three different classes: normal, preictal, and seizure. Finally, the prediction performance measures, or simply the performance measures, in all of the experiments are the accuracy, sensitivity, and specificity.

2.1. Dataset

This research used the same standard dataset as previous publications [2,22]. The University of Bonn dataset was collected, preprocessed, and provided to the research community by Andrzejak et al. [26]. It is available for free download at http://epilepsy.uni-freiburg.de/database. It has the following characteristics.

  • EEG signals were selected from continuous multichannel EEG signal recordings after removing artefacts due to muscle activity and eye movement via visual examination.

  • The EEG signals were obtained from five subjects in each of three data classes (normal, preictal, and seizure), with these details:
    • Normal class: EEG signals of 100 cases from five healthy subjects,
    • Preictal class: EEG signals of 100 cases from five epileptic patients,
    • Seizure class: EEG signals of 100 cases from the patients in the preictal class, obtained as they experienced an epileptic seizure during the signal-capturing process.
  • Each class of the dataset contains 100 EEG signals with a record duration of 23.6 s, which is equivalent to 4096 samples.

  • The total number of EEG signals in the dataset is 300.

Based on these characteristics, it can be observed that all the classes within the dataset are balanced; that is, they have equal numbers of cases with equal lengths of EEG signals. In the preprocessing stage of the experiment, all the individual EEG signals of the dataset were normalised to have a mean of 0 and a standard deviation of 1.00. In each experiment, the normalised dataset was then randomly partitioned into training and testing datasets with a ratio of 90:10. Thus, the training dataset had a total of 270 EEG signals (90% of 300), which is a fairly small number. A 10-fold cross-validation approach was used in the experiment to prevent bias from possibly selecting a group of neighbouring EEG signals, which were likely to have similar properties, for the DNN training. In detail, the EEG signals within the training dataset were randomly divided into 10 equal bins. In each iteration, nine of the ten bins were used for training the model, while the remaining bin was used for validating its accuracy. This process was repeated 10 times by rotating the training and validation bins, and the performance measures were obtained by averaging the results of the 10 runs.
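A minimal sketch of this preprocessing and cross-validation setup, using placeholder data in place of the Bonn recordings (the array names are illustrative, not from the paper):

```python
import numpy as np
from sklearn.model_selection import train_test_split, StratifiedKFold

# Placeholder arrays standing in for the 300 Bonn EEG signals and labels.
signals = np.random.randn(300, 4096)   # 300 signals x 4096 samples
labels = np.repeat([0, 1, 2], 100)     # 0 = normal, 1 = preictal, 2 = seizure

# Normalise each EEG signal to zero mean and unit standard deviation.
signals = (signals - signals.mean(axis=1, keepdims=True)) \
          / signals.std(axis=1, keepdims=True)

# Random 90:10 split into training and testing datasets.
x_train, x_test, y_train, y_test = train_test_split(
    signals, labels, test_size=0.10, stratify=labels, random_state=0)

# 10-fold cross-validation over the training set: in each iteration,
# nine folds train the model and the remaining fold validates it.
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for train_idx, val_idx in kfold.split(x_train, y_train):
    x_tr, x_val = x_train[train_idx], x_train[val_idx]
    y_tr, y_val = y_train[train_idx], y_train[val_idx]
    # ... build, train, and validate a model here ...
```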

2.2. Method

In this research, several one-dimensional architectures were designed with the following main goals:

  • lowering the total number of layers compared to the previously proposed architectures,

  • relying on fewer convolution layers while still increasing the recognition rate of the architecture, and

  • minimising the total number of parameters in order to reduce the memory usage.

All of these architectures were designed with the final goal of obtaining better performance measures. Figure 1 illustrates the flowchart we followed to design our proposed DNN architecture. All the processes within the rounded rectangle were automated by a self-developed Python script. The architectures of Acharya et al. [22] and Abiyev et al. [2] were used as the starting points and benchmark architectures. In terms of performance measures, the Abiyev architecture is the state of the art, while the Acharya architecture has the smallest total number of parameters. Architecturally, they consist of 14 and 16 layers, respectively, of convolution, max pooling, and dense operations. In detail, the Acharya architecture (see Table 1) has five convolution layers, each of which supplies its output feature to a max pooling layer that reduces the feature size. The Abiyev architecture (see Table 2) consists of four double-convolution blocks, each applying two successive convolution operations to its input feature. The output of each double-convolution block, with the exception of the final one, is connected to a max pooling layer. The last three layers of both architectures are dense layers. The Abiyev architecture also introduces a dropout layer with the aim of minimising the impact of overfitting.

Figure 1. Flowchart illustrating how our proposed DNN architecture was designed.

Table 1.

Details of the Acharya architecture.

Layer  Operator     Feature Size  Filters  Kernel  Parameters
1      Conv1D       4092          4        6       28
2      Max Pooling  2046          4        -       0
3      Conv1D       2042          4        5       84
4      Max Pooling  1021          4        -       0
5      Conv1D       1018          10       4       170
6      Max Pooling  509           10       -       0
7      Conv1D       506           10       4       410
8      Max Pooling  253           10       -       0
9      Conv1D       250           15       4       615
10     Max Pooling  125           15       -       0
11     Flatten      1875          -        -       0
12     Dense        50            -        -       93800
13     Dense        20            -        -       1020
14     Dense        3             -        -       63
Total                                              96190
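For concreteness, the following is a sketch reconstructing the Acharya architecture of Table 1 in Keras. The feature sizes in the table imply an input length of 4097 samples and a pool size of 2; the activation functions are assumptions, as Table 1 does not list them.

```python
from tensorflow.keras import layers, models

acharya = models.Sequential([
    layers.Conv1D(4, 6, activation='relu', input_shape=(4097, 1)),  # 28 params
    layers.MaxPooling1D(2),
    layers.Conv1D(4, 5, activation='relu'),     # 84 params
    layers.MaxPooling1D(2),
    layers.Conv1D(10, 4, activation='relu'),    # 170 params
    layers.MaxPooling1D(2),
    layers.Conv1D(10, 4, activation='relu'),    # 410 params
    layers.MaxPooling1D(2),
    layers.Conv1D(15, 4, activation='relu'),    # 615 params
    layers.MaxPooling1D(2),
    layers.Flatten(),                           # 125 x 15 = 1875 features
    layers.Dense(50, activation='relu'),        # 93,800 params
    layers.Dense(20, activation='relu'),        # 1,020 params
    layers.Dense(3, activation='softmax'),      # 63 params
])
acharya.summary()   # total: 96,190 trainable parameters, as in Table 1
```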

Table 2.

Details of the Abiyev architecture.

Layer  Operator                   Feature Size  Filters  Kernel  Parameters
1      Conv1D                     4095          32       3       128
2      Conv1D                     4093          32       3       3104
3      Max Pooling                1364          32       -       0
4      Conv1D                     1362          64       3       6208
5      Conv1D                     1360          64       3       12352
6      Max Pooling                453           64       -       0
7      Conv1D                     451           128      3       24704
8      Conv1D                     449           128      3       49280
9      Max Pooling                150           128      -       0
10     Conv1D                     148           256      3       98560
11     Conv1D                     146           256      3       196864
12     Global Average Pooling1D   256           -        -       0
13     Dropout                    256           -        -       0
14     Dense                      32            -        -       8224
15     Dense                      64            -        -       2112
16     Dense                      3             -        -       195
Total                                                            401731

To attain the second goal, the proposed architecture follows that of Acharya in relying on an alternating arrangement of convolution and max pooling layers. This helps reduce the number of parameters both overall and within the individual layers of the architecture, as can clearly be seen in Table 1. In the experiments, the number of pairs of convolution and max pooling layers was varied, with a maximum of five pairs, which is equivalent to the Acharya architecture.

From Tables 1 and 2, it can be observed that the kernel sizes of the Acharya architecture are assigned to the convolution layers in decreasing order, while those of the Abiyev architecture are kept constant for all convolution layers. In the experiments conducted in this study, the kernel sizes were assigned to the convolution layers in increasing order, in decreasing order, and as a constant. Additionally, the stride parameter of the max pooling layers was also varied. The performance measures of the trained architectures under these parameter variations were recorded and thoroughly examined.

The Google Cloud Platform (GCP) was used because the experiments involved many variations in the architecture's parameters (i.e., the number of pairs of convolution and max pooling layers, the kernel sizes, and the strides), which could take an extremely long time to compute. The GCP configuration consisted of 8 vCPUs, 30 GB of storage, and one NVIDIA Tesla P4 GPU, running the Linux operating system. Keras, a deep-learning library that runs on top of TensorFlow, was utilised for training and testing the models, and the Python scikit-learn package provided the 10-fold cross-validation function. The model training was subdivided into pre-training and main-training stages. Both stages used a fixed batch size of three, where the batch size is the number of EEG signals used for each training update of the architecture. In contrast, different provisions were made for the number of epochs, where an epoch is one iteration of applying the full training dataset to the architecture: the pre-training and main-training stages used 10 and 150 epochs, respectively. In this way, it was possible to quickly eliminate non-candidate architectures, i.e., those whose performance measures were poorer than those of the Abiyev architecture, which was treated as the state-of-the-art benchmark. The average times taken to perform the pre-training and main-training stages on the selected platform were 5 and 45 min, respectively. The candidate architectures were then examined in detail by training for the full 150 epochs. The batch size of three and the full 150 epochs were used in imitation of the experiments of Acharya et al. [22]. The common training hyperparameters used in all the experiments were the categorical cross-entropy loss function, the Adam optimiser, and the accuracy metric; a sketch of this configuration follows.
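The sketch below assumes `model` is one of the candidate Keras architectures, the class labels are one-hot encoded (as categorical cross-entropy requires), and the data arrays are those from the cross-validation sketch in Section 2.1.

```python
# Common training configuration: categorical cross-entropy loss,
# Adam optimiser, and accuracy metric.
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

# Pre-training screen: 10 epochs with a batch size of 3; candidates that
# fall short of the Abiyev benchmark are discarded at this stage.
model.fit(x_tr, y_tr, validation_data=(x_val, y_val),
          batch_size=3, epochs=10)

# Main training of the surviving candidates: 150 epochs, batch size 3.
model.fit(x_tr, y_tr, validation_data=(x_val, y_val),
          batch_size=3, epochs=150)
```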

In addition to the performance measures, this study also considered the time required for prediction with the proposed architecture. Reducing the number of layers and the total number of parameters of a DNN architecture speeds up prediction; another way to accelerate it is to process shorter EEG signals. Previously proposed architectures conducted training and prediction on full-length EEG signals of 4096 samples, equivalent to a duration of 23.6 s, meaning that each classification was computed from the previous 23.6 s of EEG signal. A further experiment was therefore used to study the relationships between the length of the EEG signals, the execution time, and the performance measures of the proposed architecture. In this experiment, the length of the EEG signals was varied between 10% and 100% in steps of 5%, and all the performance measures were recorded. The training was repeated, with the full 150 epochs, only for the best of the candidate architectures obtained in the previous experiment.
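The length-variation experiment can be sketched as follows. The `build_model` helper is hypothetical (it would construct the winning architecture for a given input length), and truncating each signal to its first n samples is an assumption, as the text does not specify which portion of the signal was kept.

```python
import time

for pct in range(10, 105, 5):                 # 10% .. 100% in steps of 5%
    n = int(4096 * pct / 100)                 # e.g. 65% -> 2662 samples
    model = build_model(input_length=n)       # hypothetical helper; retrain from scratch
    model.fit(x_train[:, :n, None], y_train_onehot,
              batch_size=3, epochs=150, verbose=0)
    start = time.perf_counter()
    model.predict(x_test[:, :n, None], verbose=0)
    print(f'{pct}%: {time.perf_counter() - start:.3f} s')
```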

Finally, experiments were also performed to study the execution time of the proposed architecture in comparison with the Acharya and Abiyev architectures. These were also carried out on the previously discussed platform.

3. Results and discussion

After the first experiment, several candidate architectures were produced with better performance measures than the Abiyev and Acharya benchmark architectures; these are listed in Table 3. Each architecture name within the table is encoded in the following format, reading from the left:

  1. Brackets denote a convolution layer, with its main parameters given inside: the first number is the filter size (equivalent to the number of channels of the output feature), and the second is the kernel size applied to the input feature.
  2. Each convolution layer is followed by a max pooling layer, whose stride parameter is given between the underscore characters.
  3. The flatten layer is denoted by F.
  4. The last number (or, in some cases, the last two numbers) following the flatten layer gives the size(s) of the dense layer(s).

Table 3.

Comparison of candidate architectures and benchmark architectures in terms of total number of parameters and performance measures.

Architecture Parameters Accuracy Specificity Sensitivity
(16_3)_3_(32_4)_3_(64_5)_3_(96_6)_3_F_16_32 123795 99.43% 99.57% 99.10%
(32_5)_3_(64_5)_3_(96_5)_3_(128_3)_3_F_32_64 281347 98.54% 97.57% 98.62%
(32_2)_4_(64_4)_4_(96_6)_4_(128_8)_4_F_32_64 203427 98.54% 97.56% 98.57%
(32_3)_3_(64_4)_3_(96_5)_3_(128_6)_3_F_32_64 312003 98.19% 97.31% 98.07%
Abiyev 401731 98.67% 98.83% 97.67%
Acharya 96190 88.7% 90% 95%

For example, the first architecture listed in Table 3, (16_3)_3_(32_4)_3_(64_5)_3_(96_6)_3_F_16_32, has the following characteristics:

  1. A convolution layer with a kernel size of 3 and 16 filters,
  2. A max pooling layer with a stride of 3,
  3. A convolution layer with a kernel size of 4 and 32 filters,
  4. A max pooling layer with a stride of 3,
  5. A convolution layer with a kernel size of 5 and 64 filters,
  6. A max pooling layer with a stride of 3,
  7. A convolution layer with a kernel size of 6 and 96 filters,
  8. A max pooling layer with a stride of 3,
  9. A flatten layer,
  10. A dense layer with 16 nodes, and
  11. A dense layer with 32 nodes.

A final dense layer with three nodes, one per output class, completes the classifier (see Table 6).
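The decoded description maps directly onto code. Below is a minimal Keras sketch of this winning architecture, assuming a 4096-sample input and ReLU/softmax activations (the name encoding does not specify activations); its summary reports 123,795 trainable parameters, matching Table 3.

```python
from tensorflow.keras import layers, models

winner = models.Sequential([
    layers.Conv1D(16, 3, activation='relu', input_shape=(4096, 1)),
    layers.MaxPooling1D(3),
    layers.Conv1D(32, 4, activation='relu'),
    layers.MaxPooling1D(3),
    layers.Conv1D(64, 5, activation='relu'),
    layers.MaxPooling1D(3),
    layers.Conv1D(96, 6, activation='relu'),
    layers.MaxPooling1D(3),
    layers.Flatten(),
    layers.Dense(16, activation='relu'),
    layers.Dense(32, activation='relu'),
    layers.Dense(3, activation='softmax'),   # one node per class
])
winner.summary()   # 123,795 trainable parameters, matching Table 3
```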

The experimental results show that the (16_3)_3_(32_4)_3_(64_5)_3_(96_6)_3_F_16_32 architecture is the best, i.e., the winning architecture. A comparison of the previously proposed DNN architectures and their performance measures with the best architecture from this study is given in Table 4. Figure 2 compares the identification accuracies of all the candidate and benchmark architectures in the form of box plots over the 10 experiment repeats. The figure clearly shows that the winning architecture achieves the highest accuracy of all the architectures, both those from our experiments and those previously proposed. It not only produces the best median accuracy, which is very close to 100%, but also the narrowest spread, and it has fewer outliers than the Abiyev and Acharya architectures.

Table 4.

Comparison of previously proposed DNN architectures and their performance measures.

Researchers  Architecture  Accuracy  Specificity  Sensitivity
Acharya et al. [22] 14-layer 88.7% 90% 95%
Ullah et al. [23] 14-layer 99.1% NA NA
Thara et al. [24] NA 97.56% 94.93% 98.17%
Abiyev et al. [2] 16-layer 98.67% 98.83% 97.67%
Akyol [25] Stacking Ensemble DNN 97.17% 97.17% 93.11%
This study (winning architecture) 11-layer 99.43% 99.57% 99.10%

Figure 2. Comparison of identification accuracies of all candidate and benchmark architectures.

Its superiority was determined by the fact that all of its performance measures were better than those of the state-of-the-art architecture, while it was also shallower, with 11 layers and a total of 123,795 parameters. In terms of the total number of parameters, it was comparable to the Acharya architecture and had approximately a third of the parameters of the Abiyev architecture. The other three candidate architectures also showed improvements over the Abiyev architecture, although their performance measures were lower than those of the winning architecture, and the total numbers of parameters of the third and fourth architectures were of the same order as that of the Abiyev architecture.

A multi-class classification analysis was also carried out to compare the benchmark architectures with our winning architecture, using the normalised confusion matrices obtained from the experiments. Figure 3 compares the normalised confusion matrices of the two benchmark architectures and our winning architecture. In terms of true positives (TP), our winning architecture clearly achieves overall performance comparable to the Abiyev architecture, and is slightly superior to it for the seizure class. The TP performance of both the Abiyev and our winning architectures improves significantly over the Acharya architecture. Table 5 provides the comparative analysis in terms of precision, recall, specificity, and F1-score on a class-by-class basis, with the best value(s) for each measure highlighted; most of the highlighted values belong to the Abiyev and our winning architectures. Both analyses further confirm that, although our winning architecture is shallower and has a smaller total number of parameters, its overall multi-class classification performance is comparable to that of a more complex state-of-the-art architecture.
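For clarity, the per-class measures in Table 5 can be derived from a confusion matrix as follows; this is a generic sketch, not the authors' script.

```python
import numpy as np

def per_class_metrics(cm, names=('normal', 'preictal', 'seizure')):
    """Per-class precision, recall (sensitivity), specificity, and F1-score
    from a 3x3 confusion matrix (rows = true class, columns = predicted)."""
    cm = np.asarray(cm, dtype=float)
    for k, name in enumerate(names):
        tp = cm[k, k]                   # class k predicted as k
        fp = cm[:, k].sum() - tp        # other classes predicted as k
        fn = cm[k, :].sum() - tp        # class k predicted as others
        tn = cm.sum() - tp - fp - fn    # everything else
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        specificity = tn / (tn + fp)
        f1 = 2 * precision * recall / (precision + recall)
        print(f'{name}: precision={precision:.2f} recall={recall:.2f} '
              f'specificity={specificity:.2f} F1={f1:.2f}')
```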

Figure 3. Comparison of the confusion matrices resulting from applying the Abiyev, Acharya, and our winning architectures to classify the EEG signals into three classes.

Table 5.

Comparative multi-class classification performance of benchmarked and proposed architectures.

Class     Precision (Abiyev/Acharya/Ours)  Recall (Abiyev/Acharya/Ours)  Specificity (Abiyev/Acharya/Ours)  F1-score (Abiyev/Acharya/Ours)
Normal    0.96 / 1.00 / 0.98               0.96 / 0.95 / 1.00            0.97 / 1.00 / 0.99                 0.96 / 0.98 / 0.99
Preictal  0.97 / 1.00 / 1.00               0.93 / 1.00 / 1.00            0.99 / 1.00 / 1.00                 0.95 / 1.00 / 1.00
Seizure   0.91 / 0.95 / 1.00               0.97 / 1.00 / 0.98            0.97 / 0.98 / 1.00                 0.94 / 0.97 / 0.99

Figure 4 shows the results of a further study of the performance measures of the winning architecture, along with its execution times, for EEG signal lengths varied between 10% and 100% in increments of 5%. The length percentage of the EEG signals had a linear relationship with the total number of parameters of the architecture. It should be noted that the architecture had to be retrained after every change in the length percentage. Increasing the length percentage of the EEG signals only affected the number of parameters in the first dense layer (the 10th layer) of the winning architecture; the parameters of all the remaining layers were unaffected. As expected, smaller length percentages took less time to process, which was confirmed by the execution times shown as a red line graph in Figure 4. Because exact execution times are machine/platform dependent, the values are presented relative to the execution time for the full-length EEG signals; the execution times also appeared to have a linear relationship with the length percentage. Considering the execution times and performance measures together, the winning architecture produced performance measures comparable to those with 100% of the signals when length percentages of 65, 70, 75, 80, and 85% were used, with execution times of 69, 73, 75, 77, and 81%, respectively, of the time required for the full-length signals. Table 6 summarises the parameters of the winning architecture for length percentages of 65 to 80%; the corresponding totals of 97,683, 100,755, 105,363, and 108,435 parameters represent increases of approximately 2, 5, 10, and 13%, respectively, over the total number of parameters of the Acharya architecture.

Figure 4. Comparison of performance measures in terms of accuracy, specificity, and sensitivity with variation in the length of the EEG signals for the winning architecture.

Table 6.

Parameters of the winning architecture for different percentages of the EEG signal length.

Operator     65%: Output / Param   70%: Output / Param   75%: Output / Param   80%: Output / Param
Conv1D       (2661, 16) / 64       (2865, 16) / 64       (3070, 16) / 64       (3275, 16) / 64
Max Pooling  (887, 16) / 0         (955, 16) / 0         (1023, 16) / 0        (1092, 16) / 0
Conv1D       (884, 32) / 2080      (952, 32) / 2080      (1020, 32) / 2080     (1089, 32) / 2080
Max Pooling  (295, 32) / 0         (317, 32) / 0         (340, 32) / 0         (363, 32) / 0
Conv1D       (291, 64) / 10304     (104, 64) / 10304     (336, 64) / 10304     (359, 64) / 10304
Max Pooling  (97, 64) / 0          (99, 64) / 0          (112, 64) / 0         (120, 64) / 0
Conv1D       (92, 96) / 36960      (97, 96) / 36960      (107, 96) / 36960     (115, 96) / 36960
Max Pooling  (31, 96) / 0          (33, 96) / 0          (36, 96) / 0          (38, 96) / 0
Flatten      2976 / 0              3168 / 0              3456 / 0              3648 / 0
Dense        16 / 47632            16 / 50704            16 / 55312            16 / 58384
Dense        32 / 544              32 / 544              32 / 544              32 / 544
Dense        3 / 99                3 / 99                3 / 99                3 / 99
Total        97,683                100,755               105,363               108,435
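For reference, the totals above can be approximated analytically. The sketch below counts the parameters of the winning architecture for a given input length, assuming 'valid' (no-padding) convolution and pooling throughout; this reproduces the full-length total of 123,795 parameters from Table 3 exactly, although the intermediate feature sizes in Table 6 suggest a slightly different pooling-padding convention, so per-length totals may differ marginally.

```python
def total_params(n):
    """Parameter count of the winning architecture for an n-sample input."""
    params, channels = 0, 1
    for filters, kernel in [(16, 3), (32, 4), (64, 5), (96, 6)]:
        n = n - kernel + 1                        # Conv1D, 'valid' padding
        params += (kernel * channels + 1) * filters
        channels = filters
        n = (n - 3) // 3 + 1                      # MaxPooling1D, pool/stride 3
    params += (n * channels + 1) * 16             # Dense(16) after Flatten
    params += (16 + 1) * 32 + (32 + 1) * 3        # Dense(32) and Dense(3)
    return params

print(total_params(4096))   # -> 123795, the full-length total from Table 3
```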

Finally, Table 7 presents the average execution times, along with their standard deviations, for all the benchmark and proposed architectures, measured on the GCP platform detailed in Section 2. The results indicate that even when the winning architecture used 100% of the EEG signal length, it outperformed both benchmark architectures. When it used lower length percentages, it required approximately half the execution time of the Abiyev architecture. These experiments confirm the effectiveness of the proposed architecture, which relies on lowering the number of layers, the total number of parameters, and the length of the EEG signals; the improvement in execution time was accompanied by better performance measures.

Table 7.

Comparative execution times of benchmark and proposed architectures.

Architecture                  Average Execution Time (ms)  Standard Deviation (ms)
Acharya                       8.68                         0.75
Abiyev                        11.65                        0.57
Proposed (65% signal length)  5.23                         0.11
Proposed (70%)                5.54                         0.12
Proposed (75%)                5.69                         0.12
Proposed (80%)                5.84                         0.13
Proposed (85%)                6.14                         0.13
Proposed (100%)               7.59                         0.16

To confirm the applicability of our proposed architecture in terms of its stability when processing long runs of EEG signals, we performed an additional experiment to identify epileptic abnormalities from a continuous long-run EEG signal. The EEG signals from the dataset detailed in Section 2.1 were annotated and concatenated to create a single long signal, which was then used as the input to our winning-architecture-based classifier. The result was positive: the classifier successfully identified the abnormalities with performance measures comparable to those of the previous experiments, and it ran continuously without any execution failures.

4. Conclusion

The identification of epileptic abnormalities from EEG signals, which can be classified as normal, preictal, and seizure signals, is an important goal in the biomedical research field. DNNs have been applied to automate the identification process and spare expert neurologists the tedious visual inspection. The state-of-the-art DNN architectures for epileptic abnormality identification claim high performance measures in terms of accuracy, specificity, and sensitivity on the standard University of Bonn dataset. However, their common weakness is the requirement for several convolution, pooling, and dense processing layers, along with a high total number of parameters, which affects the execution time and memory demands on the computer platform. In the current work, a DNN architecture with fewer layers and a lower total number of parameters was tested. The standard EEG signal dataset was used for the DNN training and classification in order to ensure that the performance measures were comparable to those of previous publications. The 10-fold cross-validation approach, with 10 iterations per experiment, was employed to compensate for the small size of the dataset relative to typical DNN applications. All the training parameters were taken from the previously proposed architectures in order to make the results comparable. The best architecture from our experiments was shown to provide superior performance measures with a significant improvement in execution time. Additionally, further experiments trained the winning architecture with various percentage lengths of the EEG signals; these lengths showed a linear relationship with the total number of parameters of the architectures. Several superior variants with a lower total number of parameters were found, all of which had shorter execution times than the original winning architecture while preserving the performance measures. In summary, the proposed winning architecture has only 11 layers and achieved accuracy, specificity, and sensitivity values of 99.43%, 99.57%, and 99.10%, respectively. When the EEG signals were reduced to 65, 70, 75, 80, and 85% of their full length, the architecture only required 69, 73, 75, 77, and 81%, respectively, of the average execution time of the winning architecture on full-length signals. This was approximately half of the execution time of the Abiyev architecture, meaning that the lag time for identification was reduced by a factor of two. The total numbers of parameters for these versions of the winning architecture were approximately 2, 5, 10, and 13% greater, respectively, than that of the Acharya architecture, which had the lowest total number of parameters. The findings of this study could be used as guidelines to select an appropriate hardware platform to implement the automatic DNN-based identification of epileptic abnormalities from EEG signals.

Declarations

Author contribution statement

W. Kurdthongmee: Conceived and designed the analysis; Analyzed and interpreted the data; Contributed analysis tools or data; Wrote the paper.

Funding statement

This research was supported by Digital Economy and Society Development Funds of Thailand (GT1 017/63).

Data availability statement

Data will be made available on request.

Declaration of interests statement

The authors declare no conflict of interest.

Additional information

No additional information is available for this paper.

References

1. Acharya U.R., Vinitha Sree S., Swapna G., Martis R.J., Suri J.S. Automated EEG analysis of epilepsy: a review. Knowl. Base Syst. 2013;45:147–165. http://www.sciencedirect.com/science/article/pii/S0950705113000798
2. Abiyev R., Arslan M., Bush J.I., Sekeroglu B., Ilhan A. Identification of epileptic EEG signals using convolutional neural networks. Appl. Sci. 2019;10(12):4089.
3. LeCun Y., Bengio Y., Hinton G. Deep learning. Nature. 2015;521(7553):436–444. doi: 10.1038/nature14539.
4. Bayat A., Pomplun M., Tran D.A. A study on human activity recognition using accelerometer data from smartphones. Procedia Computer Science. 2014;34:450–457. http://www.sciencedirect.com/science/article/pii/S1877050914008643 (The 9th International Conference on Future Networks and Communications (FNC'14)/The 11th International Conference on Mobile Systems and Pervasive Computing (MobiSPC'14)/Affiliated Workshops).
5. Otter D.W., Medina J.R., Kalita J.K. A survey of the usages of deep learning for natural language processing. IEEE Transactions on Neural Networks and Learning Systems. 2020.
6. Alam M., Samad M.D., Vidyaratne L., Glandon A., Iftekharuddin K.M. Survey on deep neural networks in speech and vision systems. Neurocomputing. 2020.
7. Kiranyaz S., Avci O., Abdeljaber O., Ince T., Gabbouj M., Inman D.J. 1D convolutional neural networks and applications: a survey. arXiv preprint arXiv:1905.03554. 2019.
8. Yamashita R., Nishio M., Do R., Togashi K. Convolutional neural networks: an overview and application in radiology. Insights into Imaging. 2018;9.
9. Ghosh-Dastidar S., Adeli H. Improved spiking neural networks for EEG classification and epilepsy and seizure detection. Integrated Comput. Aided Eng. 2007;14:187–212.
10. Ghosh-Dastidar S., Adeli H. A new supervised learning algorithm for multiple spiking neural networks with application in epilepsy and seizure detection. Neural Networks. 2009;22(10):1419–1431. doi: 10.1016/j.neunet.2009.04.003.
11. Ghosh-Dastidar S., Adeli H., Dadmehr N. Mixed-band wavelet-chaos-neural network methodology for epilepsy and epileptic seizure detection. IEEE Trans. Biomed. Eng. 2007;54(9):1545–1551. doi: 10.1109/TBME.2007.891945.
12. Guo L., Rivero D., Dorado J., Munteanu C.R., Pazos A. Automatic feature extraction using genetic programming: an application to epileptic EEG classification. Expert Syst. Appl. 2011;38(8):10425–10436.
13. Faust O., Acharya U.R., Min L.C., Sputh B.H.C. Automatic identification of epileptic and background EEG signals using frequency domain parameters. Int. J. Neural Syst. 2010;20(2):159–176. doi: 10.1142/S0129065710002334.
14. Acharya U.R., Sree S.V., Suri J.S. Automatic detection of epileptic EEG signals using higher order cumulant features. Int. J. Neural Syst. 2011;21(5):403–414. doi: 10.1142/S0129065711002912.
15. Bhattacharyya A., Pachori R.B., Upadhyay A., Acharya U.R. Tunable-Q wavelet transform based multiscale entropy measure for automated classification of epileptic EEG signals. Appl. Sci. 2017;7(4):385.
16. Sharma M., Pachori R.B., Acharya U.R. A new approach to characterize epileptic seizures using analytic time-frequency flexible wavelet transform and fractal dimension. Pattern Recogn. Lett. 2017;94:172–179.
17. Acharya U.R., Sree S.V., Alvin A.P.C., Suri J.S. Use of principal component analysis for automatic classification of epileptic EEG activities in wavelet framework. Expert Syst. Appl. 2012;39(10):9072–9078.
18. Sharma M., Bhurane A.A., Acharya U.R. MMSFL-OWFB: a novel class of orthogonal wavelet filters for epileptic seizure detection. Knowl. Base Syst. 2018;160:265–277.
19. Chua K.C., Chandran V., Acharya U.R., Lim C.M. Application of higher order spectra to identify epileptic EEG. J. Med. Syst. 2011;35(6):1563–1571. doi: 10.1007/s10916-010-9433-z.
20. Martis R.J., Acharya U.R., Tan J.H., Petznick A., Yanti R., Chua C.K., Ng E.K., Tong L. Application of empirical mode decomposition (EMD) for automated detection of epilepsy using EEG signals. Int. J. Neural Syst. 2012;22(6):1250027. doi: 10.1142/S012906571250027X.
21. Bhattacharyya A., Pachori R.B. A multivariate approach for patient-specific EEG seizure detection using empirical wavelet transform. IEEE Trans. Biomed. Eng. 2017;64(9):2003–2015. doi: 10.1109/TBME.2017.2650259.
22. Acharya U.R., Oh S.L., Hagiwara Y., Tan J.H., Adeli H. Deep convolutional neural network for the automated detection and diagnosis of seizure using EEG signals. Comput. Biol. Med. 2018;100:270–278. doi: 10.1016/j.compbiomed.2017.09.017.
23. Ullah I., Hussain M., Aboalsamh H. An automated system for epilepsy detection using EEG brain signals based on deep learning approach. Expert Syst. Appl. 2018;107:61–71.
24. Thara D., PremaSudha B., Xiong F. Auto-detection of epileptic seizure events using deep neural network with different feature scaling techniques. Pattern Recogn. Lett. 2019;128:544–550.
25. Akyol K. Stacking ensemble based deep neural networks modeling for effective epileptic seizure detection. Expert Syst. Appl. 2020;148:113239.
26. Andrzejak R.G., Lehnertz K., Mormann F., Rieke C., David P., Elger C.E. Indications of nonlinear deterministic and finite-dimensional structures in time series of brain electrical activity: dependence on recording region and brain state. Phys. Rev. E. 2001;64:061907. doi: 10.1103/PhysRevE.64.061907.


