Cognitive Neurodynamics. 2020 Oct 12;15(3):405–423. doi: 10.1007/s11571-020-09641-2

An automatic EEG-based sleep staging system with introducing NAoSP and NAoGP as new metrics for sleep staging systems

Mesut Melek 1, Negin Manshouri 2, Temel Kayikcioglu 2
PMCID: PMC8131449  PMID: 34040668

Abstract

Different biological signals are recorded in sleep labs during sleep for the diagnosis and treatment of human sleep problems. Classification of sleep stages with electroencephalography (EEG) is preferred to other biological signals because of its advantages, such as providing clinical information, cost-effectiveness, comfort, and ease of use. The evaluation of EEG signals recorded during sleep by clinicians is a tiring, time-consuming, and error-prone process. It is therefore clinically necessary to determine sleep stages with software-supported systems. As in all classification problems, the accuracy rate is used to compare the performance of studies in this domain, but this metric is reliable only when the classes contain equal numbers of observations. Since the sleep stages are not equally represented, the accuracy rate alone is insufficient for evaluating such systems. For this reason, Cohen’s kappa coefficient and, in recent years, even the sensitivity of NREM1 have been used to compare the performance of these systems; still, none of them examines a system from all dimensions. Therefore, in this study, two new metrics based on the polygon area metric, called the normalized area of the sensitivity polygon and the normalized area of the general polygon, are proposed for the performance evaluation of sleep staging systems. In addition, a new sleep staging system is introduced using applications offered by MATLAB. The existing systems in the literature were examined with the proposed metrics, and the best systems were compared with the proposed sleep staging system. According to the results, the proposed system outperforms the most advanced machine learning methods. The single-channel method introduced here, assessed with the proposed metrics, can be used for robust and reliable sleep stage classification in all the dimensions required for real-time applications.

Electronic supplementary material

The online version of this article (10.1007/s11571-020-09641-2) contains supplementary material, which is available to authorized users.

Keywords: EEG, Sleep stage classification, PAM, Sensitivity polygon, General polygon

Introduction

The classification of sleep stages is very important both in the diagnosis and treatment of sleep problems and in research, such as child behavior analysis. Experts in this field divide biological signals such as EEG, electromyography (EMG), and electrooculography (EOG) into 30-s slices (epochs) and then examine the slices one by one to determine the sleep stages. Recording a large number of biological signals can degrade signal quality because of the comfort restrictions it imposes on the patient (Zhu et al. 2014). For this reason, the use of EEG, and more specifically of a single EEG channel, has many advantages such as low cost, patient comfort, ease of electrode placement, and low computational burden. Therefore, the classification of sleep stages with a single EEG channel has attracted the attention of numerous researchers.

According to the R&K standard (Rechtschaffen and Kales 1968), sleep is divided into two general stages: REM (rapid eye movement) and NREM (non-rapid eye movement). The NREM stage is divided into four sub-stages: NREM1 and NREM2 (light sleep) and NREM3 and NREM4 (deep sleep). REM sleep is also called paradoxical sleep. In a healthy person, each of these stages takes 15–20 min, and a full cycle takes 75–90 min. After a cycle is completed, the stages from NREM1 through REM repeat (Sharma et al. 2019a).

The NREM1 stage is the transition phase from wakefulness to sleep and may last 1–5 min. In this stage, the amplitude of the EEG signal is small, and it contains theta and alpha waves. During the NREM2 stage, sleep spindles and K-complexes appear in the EEG signal, with frequencies slightly lower than those of alpha waves. In the NREM3 stage, delta waves constitute more than 20% and less than 50% of each 30-s epoch. The frequency of the delta wave is below 2 Hz, and its amplitude is above 75 µV. Night terrors and sleepwalking occur in this stage (Sharma et al. 2019a). The NREM4 stage is similar to the NREM3 stage, but delta waves constitute more than 50% of each epoch of this stage. Upon waking up from this stage, a sense of disorientation is very common (Sharma et al. 2019a). The NREM3 and NREM4 stages are called deep sleep. The REM stage has an EEG signal with small amplitude and mixed frequencies. In this stage, rapid eye movements appear, and the EMG signal of the chin muscles is very small. Vertex waves are rare in this stage, but sawtooth waves can be observed. The EEG signal of the REM stage is similar to that of the NREM1 and wake stages; the chin EMG signal is used to distinguish the REM and NREM1 stages (Shephard 1991). Wakefulness is generally divided into two states. When awake with eyes open, the EEG signal has low amplitude and high frequency; in eyes-closed wakefulness, the alpha wave appears predominantly in the EEG signal (Shephard 1991).

As previously mentioned, the common clinical method for classifying sleep stages is visual evaluation by experts, which consists of directly inspecting the EEG signals and comparing the waveforms with known patterns according to the R&K standard. Classifying signals in this way is very tiring and time-consuming, and the margin of error is high (Boashash and Ouelha 2016; Akben and Alkan 2016). Moreover, the results of visual evaluation depend on the experience of the expert, and the accuracy rate is below 90% (Penzel and Conradt 2000). In addition, the slowness of this method poses serious problems for research in this domain (Hassan and Bhuiyan 2017, 2016), and some clinical treatments may require the patient's sleep stages to be examined very quickly (Hsu et al. 2013; Lajnef 2015). The software-supported classification of sleep stages accelerates this process and contributes significantly to its accuracy (Li et al. 2016).

The literature presents numerous studies on sleep stage identification (Ronzhina et al. 2012; Liang et al. 2012) and on sleep disorder identification using EEG (Dhok et al. 2020) and electrocardiogram (ECG) signals (Sharma et al. 2018b, 2019b). Ronzhina et al. classified sleep stages with an artificial neural network classifier using a single EEG channel and a method based on power spectral density (Ronzhina et al. 2012). In another study, features were extracted from the EEG signal using the Renyi entropy and the stages were classified (Liang et al. 2012). In Sharma et al. (2018a), a new single-channel EEG-based sleep-stage identification system was developed, using a three-band time–frequency localized wavelet filter bank for feature extraction and a support vector machine (SVM) for classification; the authors tested the proposed method on a large dataset and achieved a 91.5% accuracy rate in the six-stage case. Moreover, Kayikcioglu et al. proposed a feature extraction and classification method based on the auto-regressive (AR) model and the partial least squares (PLS) algorithm to classify the sleep and wake stages (Kayikcioglu et al. 2015).

The open-access PhysioNet (Goldberger et al. 2000) database is frequently used to compare the methods proposed for the classification of sleep stages. One of the most frequently used datasets in this database is the Sleep-EDF dataset (Kemp et al. 2000), where EDF stands for European Data Format. More information about this dataset is given in the Materials and methods section. Not only researchers using this dataset, but sleep-scoring researchers in general, use the accuracy rate and Cohen’s kappa coefficient (Cohen 1960) to compare their proposed systems with existing ones. Since the NREM1 stage is more difficult to classify than the other stages, its sensitivity has in recent years also been used as a comparison metric for sleep staging systems. The study that obtained the highest accuracy rate on the Pz-Oz channel of the sleep-EDF dataset was conducted by Ghimatgar et al. (2019), who reached a 94.55% accuracy rate. However, in the confusion matrix obtained in that study with the leave-one-out cross-validation (LOO-CV) strategy, the NREM1 sensitivity values were around 11%; in the testing phase, the NREM1 sensitivity for the five-stage case was reported as 10.93%, and no sensitivities were given for the six-stage case. Moreover, the highest NREM1 sensitivity, around 68%, was obtained in a study based on deep learning (Mousavi et al. 2019), for which the accuracy rate derived from the system's confusion matrix was around 90%. Among the studies conducted with machine learning methods, the highest NREM1 sensitivity in the six-stage case was 42.05%, achieved by Hassan and Bhuiyan (2017), whose accuracy rate was 88.07%.

Generally, several metrics are available to evaluate the performance of a classification stage, such as the accuracy rate, sensitivity, specificity, Cohen’s kappa coefficient, and F-measure. As mentioned, in sleep staging systems the accuracy rate, Cohen’s kappa coefficient, and NREM1 sensitivity are among the most widely used metrics. However, the superiority of one system over another cannot be established with these metrics alone, because a system may be successful in one metric while failing in another. The study by Ghimatgar et al. (2019) is an example: although the accuracy rate in the five-stage case was reported as 95.40%, the NREM1 sensitivity was only 10.93%. In Hassan and Bhuiyan (2017), the accuracy rate was 88.07%, whereas the NREM1 sensitivity was 42.05%. In addition, systems often sacrifice the sensitivity of other stages to increase the sensitivity of a particular stage, and this trade-off is not always apparent. Therefore, a new parameter is needed to evaluate sleep staging systems from all dimensions. In Aydemir (2020), a polygon area metric (PAM) combining six different parameters into a single value was proposed for evaluating classifier performance; with it, the performance of a classifier can be evaluated with a single metric without the need to compare various metrics. The stability and validity of that metric were tested on seven datasets with the k-nearest neighbor (k-NN), support vector machine (SVM), and linear discriminant analysis (LDA) classifiers.

In this paper, we propose two new metrics based on PAM for the evaluation of sleep staging systems. One of the metrics evaluates the overall sensitivity of the system, and the other evaluates the system from all dimensions. In this way, the sensitivity of a system to all sleep stages can be measured with a single metric, and the superiority of one system over another can be expressed with a single value. In addition, a new sleep staging system requiring no code writing is proposed, built with two applications offered by MATLAB (2019b): feature extraction, evaluation, and selection were performed with the Diagnostic Feature Designer app, and classification with the Classification Learner app. According to the proposed general metric, this system is superior to existing machine learning systems.

The rest of this article is organized as follows: “Materials and methods” section introduces the materials and method. Data acquisition, feature extraction, and classifications are also described in this section. “Results” section presents the results. Finally, in “Discussion” and “Conclusion” sections, the discussion and conclusions are respectively given.

Materials and methods

Data set

To compare the proposed system with existing systems, the sleep-EDF dataset (Kemp et al. 2000) in the PhysioNet (Goldberger et al. 2000) database was employed in this study. This dataset consists of recordings from eight Caucasian males and females aged 21–35 years: four healthy subjects (marked SC, recorded in 1989) and four subjects with mild difficulty falling asleep (marked ST, recorded in 1994). The subjects did not use any medication. EEG recordings were obtained from the Fpz-Cz and Pz-Oz channels with a sampling frequency of 100 Hz; the dataset also includes a horizontal EOG channel. According to the R&K standard, the signals in this dataset were divided into 30-s epochs and classified into six sleep stages by a clinician, and each subject's hypnogram is provided with the dataset. Table 1 shows the number of epochs obtained from the eight individuals per stage; the columns named S1, S2, S3, S4, R, and W correspond to NREM1, NREM2, NREM3, NREM4, REM, and wake, respectively. The numbers of training and test epochs also appear in this table. Following the studies that used this dataset, half of the epochs of each stage were allocated for training and half for testing.

Table 1.

Epoch distributions of sleep stages in the sleep-EDF dataset (total, training, and testing set)

S1 S2 S3 S4 R W
Total 604 3621 672 627 1609 8031
Train 302 1810 336 313 804 4015
Test 302 1811 336 314 805 4016

In studies classifying sleep stages, five classification cases are generally taken into account (Table 2). In the six-stage sleep classification case, each of the six stages of sleep is considered as a separate class, while in the five-stage sleep classification case, NREM3 and NREM4 classes are combined to express a single class. By combining NREM1 and NREM2, a four-stage sleep classification case is obtained. If all NREM stages are considered in a single class, a three-stage sleep classification case emerges with REM and wake. Finally, in the two-stage sleep classification case, all sleep stages are classified against the wake stage.

Table 2.

Sleep stages classification cases

Classification cases Sleep stages
6 S1 S2 S3 S4 R W
5 S1 S2 S3 + S4 R W
4 S1 + S2 S3 + S4 R W
3 S1 + S2 + S3 + S4 R W
2 S1 + S2 + S3 + S4 + R W

Pre-processing

A third-order Butterworth band-pass filter with a 0.1–45 Hz passband was applied to the 30-s epochs. The MATLAB filtfilt command was used to obtain zero phase shift. Then, normalization was performed to increase the consistency of the EEG signal; for this purpose, the z-score normalization method was applied. The raw and processed versions of a 30-s epoch belonging to the wake stage are displayed in Fig. 1. It can be seen that the amplitude of the processed signal is normalized and that its frequency components lie within the filter band.
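For readers working outside MATLAB, a minimal sketch of this pre-processing step is given below in Python with SciPy. The sampling rate (100 Hz), filter order (3), band edges (0.1–45 Hz), zero-phase filtering, and z-score normalization follow the description above; the function name and use of scipy.signal are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 100.0  # sampling frequency of the sleep-EDF recordings (Hz)

def preprocess_epoch(epoch, fs=FS, band=(0.1, 45.0), order=3):
    """Band-pass filter a 30-s EEG epoch with zero phase shift, then z-score it."""
    nyq = fs / 2.0
    b, a = butter(order, [band[0] / nyq, band[1] / nyq], btype="bandpass")
    filtered = filtfilt(b, a, epoch)  # zero-phase filtering, analogous to MATLAB filtfilt
    return (filtered - filtered.mean()) / filtered.std()  # z-score normalization

# Usage on a dummy 30-s epoch (3000 samples at 100 Hz)
epoch = np.random.randn(3000)
clean = preprocess_epoch(epoch)
```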

Fig. 1.

Fig. 1

An example of the wake stage’s EEG epoch before and after pre-processing

Feature extraction and selection

As mentioned earlier, in this study, the feature extraction process was performed with the Diagnostic Feature Designer application of MATLAB software. This application uses a multi-functional graphical interface, which allows features to be obtained from the signal in both the time and the frequency domain and evaluates the effectiveness of the extracted features with histograms. This MATLAB application provides quick and easy feature extraction, selection, and testing, without the need to write codes. In addition, it evaluates the contribution of the features to accuracy performance with different metrics and ranks the features based on their importance.

In this study, 13 time-domain features and 10 frequency-domain features were extracted from each epoch. The features extracted from the time domain were the standard deviation, root mean square (RMS), crest factor, impulse factor, clearance factor, signal-to-noise ratio (SNR), peak value, signal-to-noise and distortion ratio (SINAD), kurtosis, shape factor, skewness, mean, and total harmonic distortion (THD). The calculation of these features is given in Table 3, where $n$ denotes the number of samples (epoch length), $x_i$ the signal samples, $\bar{x}$ the signal mean, $\sigma$ the standard deviation, $\mathrm{pv}$ the peak value, and $H_k$ the $k$-th harmonic of the fundamental frequency.

Table 3.

Time-domain extracted features by using the Diagnostic feature designer app

$\text{Mean} = \frac{1}{n}\sum_{i=1}^{n} x_i$  |  $\text{Peak value} = \mathrm{pv} = \max_i |x_i|$
$\text{RMS} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2}$  |  $\text{Impulse factor} = \dfrac{\mathrm{pv}}{\frac{1}{n}\sum_{i=1}^{n} |x_i|}$
$\text{Standard deviation} = \sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2}$  |  $\text{Clearance factor} = \dfrac{\mathrm{pv}}{\left(\frac{1}{n}\sum_{i=1}^{n} \sqrt{|x_i|}\right)^2}$
$\text{Skewness} = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^3}{(n-1)\,\sigma^3}$  |  $\text{SNR} = \dfrac{P_{\text{signal}}}{P_{\text{noise}}}$
$\text{Kurtosis} = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^4}{(n-1)\,\sigma^4}$  |  $\text{SINAD} = \dfrac{P_{\text{signal}} + P_{\text{distortion}} + P_{\text{noise}}}{P_{\text{noise}} + P_{\text{distortion}}}$
$\text{Shape factor} = \dfrac{\sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2}}{\frac{1}{n}\sum_{i=1}^{n} |x_i|}$  |  $\text{THD} = \dfrac{\sqrt{H_2^2 + H_3^2 + H_4^2 + H_5^2}}{H_1}$
$\text{Crest factor} = \dfrac{\mathrm{pv}}{\sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2}}$
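As an illustration, a few of the Table 3 features could be computed as in the sketch below (Python with NumPy/SciPy is assumed here; the exact definitions used internally by the Diagnostic Feature Designer app, e.g. n versus n-1 normalization for skewness and kurtosis, may differ slightly).

```python
import numpy as np
from scipy.stats import kurtosis, skew

def time_domain_features(x):
    """Compute a subset of the Table 3 time-domain features for one epoch."""
    x = np.asarray(x, dtype=float)
    rms = np.sqrt(np.mean(x ** 2))
    pv = np.max(np.abs(x))          # peak value
    mean_abs = np.mean(np.abs(x))
    return {
        "mean": np.mean(x),
        "std": np.std(x),
        "rms": rms,
        "peak_value": pv,
        "crest_factor": pv / rms,
        "impulse_factor": pv / mean_abs,
        "clearance_factor": pv / np.mean(np.sqrt(np.abs(x))) ** 2,
        "shape_factor": rms / mean_abs,
        "skewness": skew(x),
        "kurtosis": kurtosis(x, fisher=False),  # non-excess (Pearson) kurtosis
    }
```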

The Welch method was employed for the frequency domain. Using a Hanning window, the Welch method computes the power spectrum of the signal in the 0–45 Hz frequency range. The window length was set to 100 samples with 75% overlap. Taking into account the frequency band ranges of the EEG signal, the average power was calculated in the frequency ranges listed in Table 4, yielding 10 features.

Table 4.

Frequency band ranges used in feature extraction

Order Band Frequency (Hz)
1 Theta 1 0.27–2.16
2 Theta 2 2.16–4.14
3 Delta 1 4.14–6.03
4 Delta 2 6.03–8.01
5 Alpha 8.01–12.06
6 Beta 1 12.06–20.07
7 Beta 2 20.07–29.97
8 Total 0.27–29.97
9 Theta 0.27–4.14
10 Delta 4.14–8.01
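A rough sketch of this band-power computation is given below, assuming Python with scipy.signal.welch. The 100-sample Hanning window, 75% overlap, and 100 Hz sampling rate come from the text, and the band edges from Table 4; the dictionary layout and function name are illustrative.

```python
import numpy as np
from scipy.signal import welch

# Band edges (Hz) taken from Table 4
BANDS = {
    "theta1": (0.27, 2.16), "theta2": (2.16, 4.14),
    "delta1": (4.14, 6.03), "delta2": (6.03, 8.01),
    "alpha": (8.01, 12.06), "beta1": (12.06, 20.07),
    "beta2": (20.07, 29.97), "total": (0.27, 29.97),
    "theta": (0.27, 4.14), "delta": (4.14, 8.01),
}

def band_powers(epoch, fs=100.0):
    """Average power in each Table 4 band, estimated with Welch's method."""
    f, pxx = welch(epoch, fs=fs, window="hann",
                   nperseg=100, noverlap=75)  # 100-sample Hanning window, 75% overlap
    return {name: pxx[(f >= lo) & (f < hi)].mean()
            for name, (lo, hi) in BANDS.items()}
```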

Also, using the Diagnostic Feature Designer app, the features obtained from the time and frequency domains were evaluated with the Kruskal–Wallis and analysis of variance (ANOVA) tests. Both methods rank the features from the most significant to the least significant.

Classification

The second application used in this study was the Classification Learner app. This application enables interactive inspection of the features and allows selecting features, choosing validation strategies, selecting the training algorithm, and evaluating its performance. The application includes supervised machine learning algorithms such as decision trees, discriminant analysis, support vector machines, logistic regression, nearest neighbors, naive Bayes, and ensemble classifiers, so several algorithms can be trained and validated for binary or multi-class problems. Moreover, after training multiple algorithms, it allows one to compare validation errors side by side and then select the best algorithm.

Polygon area metric

In Aydemir (2020), the polygon area metric (PAM) was proposed to evaluate classifier performance easily with a single value. PAM summarizes six metrics (the classifier's accuracy rate, sensitivity, specificity, area under the curve (AUC), Jaccard index (JI), and F-measure) as a single value. The working principle of PAM is illustrated in Fig. 2. Each radius of a regular hexagon with a diameter of 2 units (i.e., a radius of 1 unit) is allocated to one metric. The value of each metric, which lies in [0, 1], is marked as a point on the corresponding radius, and the area of the resulting hexagon is calculated. This area is normalized to [0, 1] by dividing it by the area of the regular hexagon.

Fig. 2.

Fig. 2

The polygon created by six system metrics in a regular hexagon (Aydemir 2020)

Inspired by this metric, we propose two polygons for sleep staging systems, called the sensitivity polygon and the general polygon. From these polygons, two new metrics are derived for evaluating sleep staging systems.

Normalized area of sensitivity polygon (NAoSP)

The sensitivity polygon is formed by the sensitivities of the sleep stages. In the six-stage case, the reference polygon is a regular hexagon; in the five-stage, four-stage, and three-stage cases, it is a pentagon, a square, and a triangle, respectively. NAoSP is obtained by dividing the area of the polygon formed by the system's sensitivity values by the area of the regular polygon. In this way, not only the sensitivity of a sleep staging system to the NREM1 stage but its sensitivity to all stages can be compared with a single value. Note that NAoSP cannot be calculated for the two-stage case.

Normalized area of general polygon (NAoGP)

The sleep stages' sensitivities, the accuracy rate, and Cohen's kappa coefficient constitute the general polygon. In the six-stage case, the reference polygon is a regular octagon; for the five-stage, four-stage, three-stage, and two-stage cases, it is a heptagon, a hexagon, a pentagon, and a square, respectively. The NAoGP is calculated by dividing the area of the polygon formed by these metrics by the area of the regular polygon. Unlike NAoSP, this metric can be calculated for all cases. In this way, sleep staging systems can be compared in all dimensions with a single value instead of only the accuracy rate or Cohen's kappa coefficient. Pseudo-code for calculating these metrics is presented below.

  • X = {X₁, X₂, …, X_N}: sequence of metrics such as kappa, ACC, and the stages' sensitivities.

  • Irregular polygon area = 0.5 · sin(2π/N) · (X₁X₂ + X₂X₃ + ⋯ + X_N X₁).

  • Regular polygon area = 0.5 · N · sin(2π/N).

  • NAoSP or NAoGP = irregular polygon area / regular polygon area.
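The pseudo-code translates directly into a few lines of Python, as in the sketch below; the ordering of the metrics around the polygon follows the pseudo-code (kappa, ACC, then the stage sensitivities for NAoGP; the sensitivities only for NAoSP). With the values from Table 9 for 400 learners, this sketch reproduces the reported NAoSP = 0.5134 and NAoGP = 0.5832.

```python
import math

def normalized_polygon_area(metrics):
    """NAoSP/NAoGP: area of the polygon spanned by the metric values placed on
    the radii of a unit-radius regular N-gon, divided by the regular N-gon's area."""
    n = len(metrics)
    if n < 3:
        raise ValueError("at least three metrics are needed to form a polygon")
    irregular = 0.5 * math.sin(2 * math.pi / n) * sum(
        metrics[i] * metrics[(i + 1) % n] for i in range(n))
    regular = 0.5 * n * math.sin(2 * math.pi / n)
    return irregular / regular

# Values taken from Table 9 (Bagged Tree classifier, 400 learners)
sens = [0.3029, 0.9199, 0.5744, 0.8213, 0.8272, 0.9930]            # S1, S2, S3, S4, R, W
print(round(normalized_polygon_area(sens), 4))                     # NAoSP -> 0.5134
print(round(normalized_polygon_area([0.8495, 0.9044] + sens), 4))  # NAoGP -> 0.5832
```

Note that the computed area depends on the order in which the metrics are placed on the radii, so the same ordering should be used when comparing systems.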

Strategies in the evaluation of sleep stage classification systems

Evaluation of the sleep staging systems is generally performed based on two popular strategies: K-fold cross-validation (K-FCV) strategy and holdout strategy.

K-fold cross-validation (K-FCV) strategy

K-FCV is one of the most commonly adopted criteria for evaluating a model's performance and choosing a hypothesis. Its advantage over a simple training/testing split is that all the available data are used repeatedly to build the learning machine, which reduces the risk of an unlucky split. In this method, the data are randomly divided into K equal subsets; each subset is used once as the test set, while the remaining K-1 subsets are used for training (Kayikcioglu et al. 2015). A common problem is determining the number of folds into which the data should be divided.

Holdout strategy

The holdout strategy is widely used in sleep staging studies (Zhu et al. 2014; Hassan and Bhuiyan 2017; Ghimatgar et al. 2019) to evaluate the performance of the systems. In this strategy, the dataset is divided into a training set and a testing set; the classifier is trained with the training set, and the resulting model is tested on the test set. In general, researchers shuffle the dataset several times to create different training and testing sets in order to show the stability of the system, and the averages of all metrics are then reported.

Results

Classifier selection

After a total of 23 features were extracted with the Diagnostic Feature Designer app, they were transferred to the Classification Learner app to determine the best classifier. All classifiers in the application were tested with the 50-fold cross-validation strategy, in which the data are divided into 50 non-overlapping parts. In each iteration, 49 parts are used for training and one part for testing; every part is used for testing exactly once, and the accuracy rate is calculated. In this step, all classifier parameters were left at their defaults.
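Outside MATLAB, an analogous check could be run with scikit-learn, as in the sketch below. The BaggingClassifier over decision trees merely stands in for MATLAB's Bagged Tree, the random arrays are placeholders for the real feature matrix and stage labels, and the 50 folds mirror the strategy described above.

```python
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import KFold, cross_val_score

# Placeholder data: rows are epochs, columns are the 23 extracted features
X = np.random.randn(500, 23)
y = np.random.randint(0, 6, size=500)   # six sleep-stage labels

cv = KFold(n_splits=50, shuffle=True, random_state=0)               # 50-fold CV, as in the study
clf = BaggingClassifier(DecisionTreeClassifier(), n_estimators=30)  # stand-in for "Bagged Tree"
scores = cross_val_score(clf, X, y, cv=cv)
print(scores.mean())  # mean accuracy over the 50 folds
```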

As mentioned previously, the EEG in the sleep-EDF dataset was recorded from two different channels. Some previous studies report higher results for the Pz-Oz channel than for the Fpz-Cz channel (Mousavi et al. 2019; Hassan and Bhuiyan 2017), whereas others report the reverse (Ghimatgar et al. 2019; Seifpour et al. 2018). Therefore, both channels were considered in the first step of this study. The accuracy rates and training times of the classifiers are given in Table 5 for both channels. The experiment was run on a laptop with an Intel Pentium® 5 processor at 2.67 GHz and 5 GB of RAM.

Table 5.

ACC and training time of the classifiers with the 50-fold cross-validation strategy; the highest value in each channel is indicated in boldface

Classifier Methods Pz-Oz Fpz-Cz
ACC Training time (s) ACC Training time (s)
Decision trees Fine tree 86.5 24.57 84.1 30.01
Medium tree 83.3 15.61 81.2 30.58
Coarse tree 74.5 11.92 70.1 51.37
Discriminant analysis Linear discriminant 74.1 12.49 73.7 9.96
Quadratic discriminant 51.7 9.01 53.3 10.13
Naive Bayes Gaussian naive Bayes 38.3 15.52 43.9 18.49
Kernel naive Bayes 73.8 427.55 64.2 574.86
Nearest neighbors Fine KNN 74.5 21.43 75.2 37.64
Medium KNN 78.4 18.75 77.3 22.90
Coarse KNN 74.9 21.44 72.6 30.39
Cosine KNN 77.5 20.49 76.3 71.96
Cubic KNN 77.4 623.69 76.9 708.7
Weighted KNN 78.9 20.74 78.3 37.69
Ensemble Boosted tree 85.2 415.07 82.9 557.02
Bagged tree 88.9 237.74 88.2 320.8
Subspace discriminant 69.1 62.36 68.1 135.23
Subspace KNN 83.8 292.51 76.7 470.2
Rusboosted 77.4 159.21 74.5 423.4
Support vector machines Linear SVM – – – –
Quadratic SVM – – – –
Cubic SVM – – – –
Coarse Gaussian SVM 75.5 589.62 75.5 657.94
Medium Gaussian SVM 84.5 484.78 84.2 504.19
Fine Gaussian SVM 85.2 899.12 – –

As seen in Table 5, for most classifiers the ACC of the Pz-Oz channel is better than that of the Fpz-Cz channel; therefore, only the Pz-Oz channel was used in the remainder of the study. Evidently, the "Bagged Tree" classifier from the "Ensemble Classifier" category offers the best trade-off between accuracy rate and training time, so this classifier was chosen for the rest of the study. The default number of learners in this classifier is 30. In addition, training of the classifiers whose training time exceeded 15 min was stopped (hence the missing entries in Table 5), because in the classification of sleep stages classifiers are required to be fast, especially for real-time sleep scoring (Ghimatgar et al. 2019).

Feature set evaluation

Considering the rankings produced by the Kruskal–Wallis and ANOVA tests, a simple procedure was followed to determine the feature set for the classification phase. First, using only the best-ranked feature, the six-stage case was classified with the Bagged Tree classifier. The second-best feature was then added, and the dataset was classified with two features. In this way, accuracy rates were calculated by adding one feature at a time to the selected set. These accuracy rates, together with the features ranked by the statistical tests from the most significant to the least significant, are presented in Table 6. In this step, the default number of learners of the classifier was used.

Table 6.

Evaluation and ranking of the extracted features with the Kruskal–Wallis test and ANOVA; the highest value in each method is indicated in boldface

Order Kruskal–Wallis ACC ANOVA ACC
1 Beta 2 60.7 SNR 41.3
2 Theta 2 81.7 Crest factor 48.9
3 Theta 84.8 SINAD 52.5
4 Theta 84.8 Delta 1 64.4
5 Delta 1 86.2 Theta 2 72.0
6 Beta 1 88.0 Delta 74.2
7 All bands 87.9 Beta 2 86.2
8 Standard deviation 88.1 Impulse factor 86.4
9 Delta 88.1 Clearance factor 85.7
10 RMS 88.1 Skewness 86.0
11 Crest factor 88.6 Delta 2 86.5
12 Impulse factor 88.6 Alpha 86.5
13 Clearance factor 88.5 Standard deviation 87.4
14 SNR 88.0 RMS 88.0
15 Peak value 88.0 Beta 1 88.9
16 SINAD 88.3 Peak value 88.9
17 Delta 2 88.4 Shape factor 88.8
18 Kurtosis 88.5 Mean 89.6
19 Alpha 89.0 Kurtosis 89.5
20 Shape factor 88.9 Theta 1 89.1
21 Skewness 88.9 Theta 89.3
22 Mean 89.2 All bands 89.0
23 THD 88.9 THD 88.8

As can be seen in this table, based on the feature ranking given by the Kruskal–Wallis test, the Bagged Tree classifier classified the six stages with an accuracy rate of 60.7%, using only the most significant feature (Beta 2). When the second most significant feature was added, the accuracy rate increased to 81.7%. As further features were added, the highest accuracy rate was obtained with 22 features. The confusion matrix of this experiment, with an 89.2% accuracy rate, is given in Table 7.

Table 7.

Confusion matrix of classification using the first 22 features based on the Kruskal–Wallis test (ACC = 89.2); the number of correctly predicted epochs and sensitivity of each stage are indicated in boldface

S1 S2 S3 S4 R W Sen
S1 177 114 2 3 144 164 29.3
S2 30 3292 113 14 105 67 90.9
S3 5 207 359 88 2 11 53.4
S4 3 18 102 500 0 4 79.7
R 46 213 0 1 1282 67 79.7
W 50 24 5 3 30 7919 98.6

According to the ANOVA ranking, the highest accuracy rate, around 89.6%, was obtained using the first 18 features. Thus, for the remainder of the study, the first 18 features in the order given by ANOVA were selected as the best feature set. The confusion matrix of this experiment is given in Table 8.

Table 8.

Confusion matrix of classification using the first 18 features based on ANOVA (ACC = 89.6); the number of correctly predicted epochs and sensitivity of each stage are indicated in boldface

S1 S2 S3 S4 R W Sen
S1 183 95 3 1 158 164 30.3
S2 20 3305 105 18 102 71 91.3
S3 5 216 370 68 2 11 55.1
S4 2 19 98 501 2 5 79.9
R 51 186 0 1 1306 65 81.2
W 47 27 3 0 29 7925 98.7

It is worth mentioning that all confusion matrices of this step, for both the Kruskal–Wallis and ANOVA rankings, are provided in the electronic supplementary material.
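The incremental feature-addition procedure described above is easy to reproduce outside the MATLAB apps. The sketch below is an outline under stated assumptions: it takes a pre-computed feature ranking (e.g., from a Kruskal–Wallis or ANOVA test), uses a generic bagged-tree classifier with the default 30 learners as a stand-in for MATLAB's Bagged Tree, and scores each feature subset with ordinary cross-validation (5 folds here, not the study's exact protocol).

```python
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

def incremental_feature_accuracy(X, y, ranked_idx, cv=5):
    """Accuracy with the top-1, top-2, ..., top-k ranked features (Table 6 style)."""
    accuracies = []
    for k in range(1, len(ranked_idx) + 1):
        clf = BaggingClassifier(DecisionTreeClassifier(), n_estimators=30)
        acc = cross_val_score(clf, X[:, ranked_idx[:k]], y, cv=cv).mean()
        accuracies.append(acc)
    return accuracies  # choose the k with the highest accuracy (22 / 18 features in this study)
```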

Evaluation of the proposed system

Both evaluation strategies used for sleep stage classification systems, K-FCV and holdout, were applied. Details of these evaluations and the corresponding results are given below.

System evaluation by K-fold cross-validation strategy

In this strategy, the number of folds was set to 50. Using the first 18 features based on ANOVA, five different numbers of learners were tested for the Bagged Tree classifier, namely 100, 200, 300, 400, and 500. The results appear in Table 9. As seen in Table 9, the highest NAoSP and NAoGP values are obtained when the number of learners is set to 400. The confusion matrix of this procedure is given in Table 10.

Table 9.

Results of the 50-fold cross-validation strategy with different numbers of learners in Bagged Tree classifier; the highest value in each metric is indicated in boldface

Metrics\number of learner 100 200 300 400 500
Kappa 0.8480 0.8489 0.8475 0.8495 0.8493
ACC 0.9035 0.9041 0.9033 0.9044 0.9044
Sen. S1 0.2980 0.2864 0.2864 0.3029 0.2897
Sen. S2 0.9229 0.9218 0.9204 0.9199 0.9201
Sen. S3 0.5669 0.5714 0.5550 0.5744 0.5684
Sen. S4 0.8086 0.8070 0.8197 0.8213 0.8197
Sen. R 0.8272 0.8346 0.8297 0.8272 0.8346
Sen. W 0.9912 0.9920 0.9924 0.9930 0.9922
NAoSP 0.5068 0.5062 0.5029 0.5134 0.5092
NAoGP 0.5777 0.5777 0.5748 0.5832 0.5801
Table 10.

Confusion matrix of the 50-fold cross-validation strategy with 400 learners in Bagged Tree classifier (ACC = 0.9044); the number of correctly predicted epochs and sensitivity of each stage are indicated in boldface

S1 S2 S3 S4 R W Sen
S1 186 105 1 0 147 165 0.3029
S2 19 3331 98 13 93 67 0.9199
S3 3 191 386 77 2 13 0.5744
S4 2 19 82 515 2 7 0.8213
R 36 175 0 1 1331 66 0.8272
W 21 12 2 1 20 7975 0.9930

System evaluation by holdout strategy

Studies that used the sleep-EDF dataset generally employed half of the epochs of each sleep stage as the training set and the other half as the testing set. To compare the results of the proposed system with those in the literature, the same procedure was adopted here; the number of epochs in the training and testing sets for each stage is given in Table 1. For system evaluation with the holdout strategy, the Bagged Tree classifier with 200 learners was trained using the 50-fold cross-validation strategy, and the metrics were calculated on the test set. The results of this process, repeated ten times, are given in Table 11 for the six-stage case; the corresponding results for the five-stage, four-stage, three-stage, and two-stage cases are given in the Appendix in Tables 16, 17, 18, and 19, respectively. The average values of all metrics are reported in the last row of each table.
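Before turning to Table 11, a compact sketch of this repeated-holdout protocol is given below. Scikit-learn utilities replace the MATLAB apps, the stratified 50/50 split and ten repetitions follow the text, and the per-stage sensitivity is obtained as the per-class recall; the NAoSP/NAoGP values could then be computed from these averages with the polygon function sketched earlier.

```python
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, cohen_kappa_score, recall_score

def repeated_holdout(X, y, n_repeats=10, seed=0):
    """Average kappa, ACC, and per-stage sensitivity over repeated 50/50 splits."""
    rows = []
    for r in range(n_repeats):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.5, stratify=y, random_state=seed + r)
        clf = BaggingClassifier(DecisionTreeClassifier(), n_estimators=200).fit(X_tr, y_tr)
        y_hat = clf.predict(X_te)
        sens = recall_score(y_te, y_hat, average=None)  # per-stage sensitivity
        rows.append([cohen_kappa_score(y_te, y_hat),
                     accuracy_score(y_te, y_hat), *sens])
    return np.mean(rows, axis=0)  # row of averages, as in the last line of Table 11
```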

Table 11.

The results of ten repetitions for the six-stage classification case; average of ten repetitions for each metric is indicated in boldface 

Iteration Kappa ACC Sen. S1 Sen. S2 Sen. S3 Sen. S4 Sen. R Sen. W NAoSP NAoGP
1 0.8356 0.8957 0.2350 0.9193 0.5386 0.7961 0.8161 0.9882 0.4715 0.5477
2 0.8401 0.8988 0.2615 0.9249 0.5029 0.8025 0.8211 0.9912 0.4738 0.5508
3 0.8347 0.8955 0.2251 0.9243 0.5208 0.7866 0.8074 0.9905 0.4595 0.5387
4 0.8383 0.8975 0.2483 0.9149 0.5416 0.8535 0.8037 0.9905 0.4855 0.5591
5 0.8391 0.8982 0.2218 0.9199 0.5625 0.8025 0.8223 0.9900 0.4777 0.5538
6 0.8398 0.8986 0.2615 0.9188 0.5029 0.8184 0.8211 0.9922 0.4767 0.5530
7 0.8344 0.8954 0.2649 0.9182 0.5208 0.7898 0.7962 0.9920 0.4690 0.5454
8 0.8288 0.8916 0.2450 0.9204 0.5386 0.7770 0.7776 0.9885 0.4592 0.5362
9 0.8434 0.9008 0.2814 0.9210 0.5208 0.8439 0.8223 0.9902 0.4942 0.5669
10 0.8457 0.9020 0.2913 0.9304 0.5952 0.8184 0.7987 0.9880 0.5071 0.5770
Avg. 0.8380 0.8974 0.2536 0.9212 0.5345 0.8089 0.8086 0.9901 0.4774 0.5529
Table 16.

The results of ten repetitions for the five-stage classification case; average of ten repetitions for each metric is indicated in boldface

Iteration Kappa ACC Sen. S1 Sen. S2 Sen. S3 + S4 Sen. R Sen. W NAoSP NAoGP
1 0.8546 0.9080 0.2450 0.9144 0.8169 0.8173 0.9880 0.5377 0.6127
2 0.8543 0.9082 0.2516 0.9133 0.8092 0.8074 0.9915 0.5344 0.6106
3 0.8536 0.9079 0.2152 0.9226 0.8076 0.8049 0.9902 0.5208 0.6010
4 0.8571 0.9098 0.2748 0.9055 0.8307 0.8149 0.9912 0.5516 0.6236
5 0.8601 0.9119 0.2417 0.9171 0.8369 0.8248 0.9895 0.5470 0.6216
6 0.8583 0.9107 0.2384 0.9182 0.8092 0.8248 0.9915 0.5367 0.6139
7 0.8530 0.9075 0.2615 0.9077 0.8138 0.8037 0.9920 0.5374 0.6122
8 0.8479 0.9041 0.2450 0.9144 0.8107 0.7788 0.9892 0.5219 0.5992
9 0.8584 0.9107 0.2847 0.9144 0.8138 0.8136 0.9912 0.5511 0.6236
10 0.8577 0.9102 0.2582 0.9177 0.8476 0.7950 0.9890 0.5461 0.6199
Avg. 0.8555 0.9089 0.2517 0.9146 0.8197 0.8086 0.9904 0.5385 0.6139
Table 17.

The results of ten repetitions for the four-stage classification case; average of ten repetitions for each metric is indicated in boldface

Iteration Kappa ACC Sen. S1 + S2 Sen. S3 + S4 Sen. R Sen. W NAoSP NAoGP
1 0.8596 0.9133 0.8703 0.8167 0.7565 0.9830 0.7320 0.7496
2 0.8581 0.9124 0.8646 0.8123 0.7540 0.9855 0.7275 0.7459
3 0.8521 0.9088 0.8570 0.8107 0.7354 0.9868 0.7156 0.7352
4 0.8514 0.9080 0.8580 0.8307 0.7428 0.9800 0.7247 0.7408
5 0.8589 0.9131 0.8608 0.8123 0.7639 0.9868 0.7308 0.7486
6 0.8616 0.9146 0.8674 0.7969 0.7701 0.9875 0.7305 0.7497
7 0.8561 0.9112 0.8551 0.8184 0.7527 0.9875 0.7260 0.7440
8 0.8506 0.9079 0.8717 0.7923 0.7192 0.9835 0.7063 0.7280
9 0.8549 0.9104 0.8613 0.8123 0.7552 0.9833 0.7257 0.7431
10 0.8561 0.9111 0.8698 0.8400 0.7242 0.9818 0.7260 0.7438
Avg. 0.8560 0.9111 0.8637 0.8143 0.7475 0.9846 0.7246 0.7429
Table 18.

The results of ten repetitions for the three-stage classification case; average of ten repetitions for each metric is indicated in boldface

Iteration Kappa ACC Sen. NREM Sen. R Sen. W NAoSP NAoGP
1 0.8887 0.9364 0.9207 0.7664 0.9813 0.7871 0.8048
2 0.8923 0.9385 0.9250 0.7515 0.9853 0.7824 0.8041
3 0.8888 0.9367 0.9218 0.7416 0.9860 0.7746 0.7974
4 0.8946 0.9398 0.9283 0.7639 0.9830 0.7909 0.8106
5 0.8889 0.9365 0.9160 0.7639 0.9853 0.7850 0.8037
6 0.8917 0.9382 0.9207 0.7602 0.9860 0.7858 0.8059
7 0.8932 0.9390 0.9218 0.7677 0.9853 0.7907 0.8097
8 0.8846 0.9342 0.9247 0.7304 0.9815 0.7667 0.7902
9 0.8928 0.9388 0.9225 0.7726 0.9833 0.7932 0.8109
10 0.8854 0.9345 0.9283 0.7279 0.9803 0.7664 0.7905
Avg. 0.8901 0.9373 0.9230 0.7547 0.9838 0.7823 0.8028
Table 19.

The results of ten repetitions for the two-stage classification case; average of ten repetitions for each metric is indicated in boldface

Iteration Kappa ACC Sen. sleep Sen. W NAoGP
1 0.9483 0.9742 0.9680 0.9798 0.9362
2 0.9480 0.9741 0.9663 0.9810 0.9358
3 0.9388 0.9695 0.9613 0.9768 0.9246
4 0.9420 0.9711 0.9674 0.9743 0.9287
5 0.9422 0.9712 0.9621 0.9793 0.9286
6 0.9443 0.9723 0.9624 0.9810 0.9311
7 0.9435 0.9719 0.9607 0.9818 0.9301
8 0.9430 0.9716 0.9677 0.9750 0.9299
9 0.9430 0.9716 0.9638 0.9785 0.9297
10 0.9367 0.9684 0.9663 0.9703 0.9224
Avg 0.9430 0.9716 0.9646 0.9778 0.9297

The proposed hybrid system

The classification of the NREM1 stage is a common problem of all sleep scoring systems and a research challenge in this domain (Hassan and Bhuiyan 2016). From a neurophysiological point of view, NREM1 is a transition phase: since it is a mixture of wakefulness and sleep, its neuronal oscillations are similar to those of wakefulness. In REM, the cortex shows gamma waves of 40–60 Hz, as in the wake stage (Horne 2013). Because of these similarities, NREM1 is misclassified as wake or REM both by computerized sleep scoring methods and by human expert scorers (Hassan and Bhuiyan 2016).

Looking at the six-stage confusion matrix of the system proposed in this study, it can be seen that most of the misclassified NREM1 epochs are labeled as wake or REM. A similar problem exists between NREM2 and NREM3: many NREM2 epochs are misclassified as NREM3 and vice versa. To reduce the impact of these problems, in addition to the six-stage classification system, called the mother system, we trained two binary systems on the training set: NREM1 vs. wake (minor1) and NREM2 vs. NREM3 (minor2). Epochs labeled NREM1 or wake by the mother system are re-classified with the minor1 system, and epochs labeled NREM2 or NREM3 are re-classified with the minor2 system (see the sketch after Fig. 3). The flowchart of this hybrid system is shown in Fig. 3.

Fig. 3.

Fig. 3

Flowchart of the proposed mother system and hybrid system
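The routing logic of Fig. 3 can be written in a few lines. The sketch below assumes fitted classifier objects exposing a scikit-learn-style predict method and the integer stage coding given in the comments; both are illustrative choices rather than details from the paper, which used MATLAB's Bagged Tree models.

```python
import numpy as np

# Stage codes assumed here: 0=S1, 1=S2, 2=S3, 3=S4, 4=REM, 5=wake
def hybrid_predict(X, mother, minor1, minor2):
    """Re-score ambiguous epochs with the binary 'minor' models (Fig. 3)."""
    y = mother.predict(X)

    mask1 = np.isin(y, [0, 5])                 # epochs labelled NREM1 or wake
    if mask1.any():
        y[mask1] = minor1.predict(X[mask1])    # minor1: NREM1 vs. wake

    mask2 = np.isin(y, [1, 2])                 # epochs labelled NREM2 or NREM3
    if mask2.any():
        y[mask2] = minor2.predict(X[mask2])    # minor2: NREM2 vs. NREM3
    return y
```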

For the six-stage case, the results of the hybrid system are given in Table 12, and Fig. 4 compares the results of the mother and hybrid systems. In the hybrid system, the NREM1 sensitivity increased from 25.36% to 32.48%, and the NREM3 and NREM2 sensitivities increased by about 8% and 1%, respectively, with only a slight decrease in the wake sensitivity. When the NAoSP and NAoGP of both systems are compared, the hybrid system is clearly superior: there is roughly a 4% increase both in the overall sensitivity (NAoSP) and in the overall evaluation (NAoGP) of the system.

Table 12.

The results of ten repetitions for the six-stage sleep classification case obtained by the proposed hybrid system; average of ten repetitions for each metric is indicated in boldface

Iteration Kappa ACC Sen. S1 Sen. S2 Sen. S3 Sen. S4 Sen. R Sen. W NAoSP NAoGP
1 0.8475 0.9030 0.2682 0.9337 0.6250 0.7961 0.8086 0.9875 0.5065 0.5773
2 0.8496 0.9046 0.3013 0.9348 0.5833 0.8025 0.8074 0.9907 0.5069 0.5782
3 0.8413 0.8975 0.3344 0.9160 0.6398 0.8089 0.7888 0.9843 0.5256 0.5885
4 0.8482 0.9034 0.3079 0.9320 0.5982 0.8535 0.7900 0.9875 0.5189 0.5864
5 0.8518 0.9059 0.2913 0.9353 0.6250 0.8025 0.8136 0.9890 0.5174 0.5868
6 0.8538 0.9073 0.3211 0.9359 0.5892 0.8184 0.8111 0.9912 0.5201 0.5893
7 0.8488 0.9037 0.3576 0.9348 0.6428 0.7961 0.7950 0.9828 0.5354 0.5982
8 0.8456 0.9004 0.3708 0.9287 0.6339 0.8152 0.7689 0.9853 0.5333 0.5954
9 0.8545 0.9075 0.3609 0.9320 0.5982 0.8439 0.8000 0.9900 0.5372 0.6018
10 0.8552 0.9079 0.3344 0.9425 0.6547 0.8184 0.7888 0.9875 0.5371 0.6022
Avg. 0.8496 0.9041 0.3248 0.9325 0.6190 0.8155 0.7972 0.9875 0.5238 0.5904

Fig. 4.

Fig. 4

Comparison of the mother and hybrid systems' results for the six-stage case

For the five-stage case, we also trained two minor binary systems: NREM1 vs. wake (minor1) and NREM2 vs. (NREM3 + NREM4) (minor2). Epochs labeled NREM1 or wake by the mother system were re-classified with the minor1 system, and epochs labeled NREM2 or (NREM3 + NREM4) were re-classified with the minor2 system. The comparison between the mother system and the hybrid system is given in Fig. 5: there is a 3% increase in NAoSP and a more than 2% increase in NAoGP. The results of this experiment, repeated ten times, are presented in Table 20.

Fig. 5.

Fig. 5

Comparison of the mother and hybrid systems' results for the five-stage case

Table 20.

The results of ten repetitions for the five-stage sleep classification case obtained by the proposed hybrid system; average of ten repetitions for each metric is indicated in boldface

Iteration Kappa ACC Sen. S1 Sen. S2 Sen. S3 + S4 Sen. R Sen. W NAoSP NAoGP
1 0.8640 0.9138 0.2715 0.9304 0.8553 0.8037 0.9863 0.5593 0.6312
2 0.8672 0.9161 0.3211 0.9298 0.8507 0.7937 0.9897 0.5737 0.6425
3 0.8654 0.9152 0.2781 0.9409 0.8430 0.7888 0.9885 0.5549 0.6288
4 0.8682 0.9165 0.3112 0.9215 0.8769 0.8049 0.9885 0.5808 0.6479
5 0.8714 0.9187 0.3013 0.9331 0.8692 0.8136 0.9877 0.5801 0.6488
6 0.8683 0.9169 0.2880 0.9331 0.8384 0.8099 0.9910 0.5637 0.6362
7 0.8612 0.9124 0.3079 0.9210 0.8507 0.7900 0.9885 0.5649 0.6340
8 0.8554 0.9087 0.2682 0.9265 0.8446 0.7652 0.9880 0.5396 0.6142
9 0.8684 0.9167 0.3509 0.9282 0.8523 0.7913 0.9897 0.5843 0.6503
10 0.8692 0.9173 0.3046 0.9353 0.8876 0.7826 0.9870 0.5766 0.6453
Avg. 0.8659 0.9153 0.3003 0.9300 0.8569 0.7944 0.9885 0.5678 0.6380

For the four-stage case, we trained a single binary system, (NREM1 + NREM2) vs. REM. The comparison of the results is illustrated in Fig. 6; the effect of the hybrid system is small for this experiment. The results of this stage, repeated ten times, are shown in Table 21.

Fig. 6.

Fig. 6

Comparison of the mother and hybrid systems' results for the four-stage case

Table 21.

The results of ten repetitions for the four-stage sleep classification case obtained by the proposed hybrid system; average of ten repetitions for each metric is indicated in boldface

Iteration Kappa ACC Sen. S1 + S2 Sen. S3 + S4 Sen. R Sen. W NAoSP NAoGP
1 0.8599 0.9135 0.8665 0.8169 0.7677 0.9830 0.7354 0.7520
2 0.8630 0.9154 0.8712 0.8123 0.7652 0.9855 0.7355 0.7536
3 0.8563 0.9115 0.8655 0.8107 0.7378 0.9868 0.7205 0.7404
4 0.8561 0.9109 0.8641 0.8307 0.7540 0.9800 0.7325 0.7482
5 0.8605 0.9140 0.8613 0.8123 0.7714 0.9868 0.7343 0.7517
6 0.8646 0.9165 0.8707 0.7969 0.7788 0.9875 0.7359 0.7547
7 0.8601 0.9137 0.8613 0.8184 0.7602 0.9875 0.7321 0.7500
8 0.8513 0.9083 0.8698 0.7923 0.7279 0.9835 0.7093 0.7304
9 0.8606 0.9140 0.8679 0.8123 0.7714 0.9833 0.7359 0.7527
10 0.8613 0.9142 0.8726 0.8400 0.7465 0.9818 0.7375 0.7540
Avg. 0.8594 0.9133 0.8672 0.8143 0.7581 0.9846 0.7309 0.7488

For the three-stage case, we also trained a single binary system, NREM vs. REM. The comparison of the two proposed systems is given in Fig. 7; there is a 2% increase both in the overall sensitivity (NAoSP) and in the overall evaluation (NAoGP) of the hybrid system. The results of this stage, repeated ten times, are given in Table 22.

Fig. 7.

Fig. 7

Comparison of the mother and hybrid systems' results for the three-stage case

Table 22.

The results of ten repetitions for the three-stage sleep classification case obtained by the proposed hybrid system; average of ten repetitions for each metric is indicated in boldface

Iteration Kappa ACC Sen. NREM Sen. R Sen. W NAoSP NAoGP
1 0.8905 0.9372 0.9109 0.8074 0.9813 0.8072 0.8180
2 0.8966 0.9407 0.9174 0.7987 0.9853 0.8079 0.8220
3 0.8940 0.9394 0.9142 0.7937 0.9860 0.8032 0.8177
4 0.9000 0.9427 0.9232 0.8086 0.9830 0.8164 0.8290
5 0.8963 0.9405 0.9058 0.8360 0.9853 0.8245 0.8318
6 0.8978 0.9414 0.9109 0.8236 0.9860 0.8202 0.8301
7 0.8955 0.9402 0.9163 0.7975 0.9853 0.8065 0.8205
8 0.8869 0.9352 0.9153 0.7726 0.9815 0.7880 0.8043
9 0.8950 0.9398 0.9145 0.8099 0.9833 0.8121 0.8236
10 0.8936 0.9390 0.9236 0.7863 0.9803 0.8008 0.8159
Avg. 0.8947 0.9397 0.9153 0.8035 0.9838 0.8087 0.8213

The sensitivity polygons and general polygons of the six-stage, five-stage, four-stage, and three-stage cases of the hybrid systems are depicted in Figs. 8 and 9, respectively. The general polygon formed for the two-stage case was obtained from the proposed mother system.

Fig. 8.

Fig. 8

Sensitivity polygons of the hybrid systems, a six-stage case, b five-stage case, c four-stage case, d three-stage case

Fig. 9.

Fig. 9

General polygons of the hybrid systems for the six-stage (a), five-stage (b), four-stage (c), three-stage case (d), and the general polygon of the mother system for the two-stage case (e)

Discussion

To evaluate the state-of-the-art methods in the literature and compare them with the proposed systems, we calculated the NAoSP and NAoGP metrics from the confusion matrices given in those studies. In many studies, confusion matrices are reported only for the six-stage and five-stage cases; therefore, it is not possible to calculate NAoSP and NAoGP for the other cases.

As mentioned before, some studies used the K-FCV strategy to evaluate their systems. For recent studies based on this strategy, the NAoSP and NAoGP metrics were calculated from the confusion matrices given for the six-stage case; the results are presented in Table 13. Evidently, Sharma et al. (2019a) has the highest values in all metrics except the wake and NREM2 sensitivities. The proposed system has higher NAoSP and NAoGP values than Sharma et al. (2017) and Ghimatgar et al. (2019), and its wake sensitivity is higher than that of all the other systems. As mentioned previously, at this step we used the 50-fold cross-validation strategy with 400 learners in the Bagged Tree classifier.

Table 13.

The average metrics of the proposed systems and the other state-of-the-art methods based on the K-fold cross-validation strategy for the six-stage sleep classification case; the highest value in each metric is indicated in boldface

Approach Kappa ACC Sen. S1 Sen. S2 Sen. S3 Sen. S4 Sen. R Sen. W NAoSP NAoGP
Sharma et al. (2019a) 0.8680 0.9149 0.6492 0.8835 0.6507 0.8771 0.8401 0.9851 0.6594 0.6942
Sharma et al. (2017) 0.8429 0.9002 0.1887 0.9251 0.5238 0.8325 0.8365 0.9923 0.4682 0.5420
Ghimatgar et al. (2019) 0.7601 0.8473 0.1092 0.8657 0.1654 0.6602 0.8178 0.9723 0.2981 0.3853
Proposed system 0.8495 0.9044 0.3029 0.9199 0.5744 0.8213 0.8272 0.9930 0.5134 0.5832

For recent studies based on the holdout strategy, Table 14 was prepared for the six-stage case; for comparison, the metrics calculated for the proposed mother and hybrid systems are also given in this table. The highest NAoSP and NAoGP values, 0.7196 and 0.7368, respectively, were calculated for Mousavi et al. (2019), a study based on deep learning methods. Among the studies implemented with machine learning methods, however, our proposed hybrid system has the highest NAoGP value (59.04%), about 1% higher than the NAoGP of Hassan and Bhuiyan (2017) (58.28%), which is the highest value in the literature review. This means that, viewed from all dimensions, the proposed system is more successful than the existing machine learning systems. Our system's ACC, Cohen's kappa coefficient, and NREM2 sensitivity are also higher than those of the other studies. In addition, the wake sensitivity of the proposed mother system is the highest (99.01%) among the existing machine learning and deep learning systems.

Table 14.

The average metrics of the proposed systems and the other state-of-the-art methods based on the holdout strategy for the six-stage sleep classification case; the highest value in each metric in machine learning algorithms is indicated in boldface

Method Approach Kappa ACC Sen. S1 Sen. S2 Sen. S3 Sen. S4 Sen. R Sen. W NAoSP NAoGP
Deep learning algorithms Mousavi et al. (2019) 0.8624 0.9096 0.6788 0.8292 0.8303 0.8306 0.9577 0.9662 0.7196 0.7368
Machine learning algorithms Seifpour et al. (2018) 0.8207 0.8859 0.1920 0.9088 0.4107 0.7579 0.8422 0.9860 0.4195 0.4973
Hassan and Bhuiyan (2016) 0.7967 0.8688 0.4735 0.8901 0.7827 0.3885 0.7838 0.9505 0.4846 0.5331
Hassan et al. (2015a) 0.7638 0.8501 0.2450 0.8387 0.4434 0.7929 0.6906 0.9709 0.4235 0.4697
Hassan et al. (2015b) 0.6889 0.8034 0.2152 0.8155 0.3541 0.6146 0.5652 0.9419 0.3191 0.3636
Hassan and Subasi (2017) 0.8360 0.8962 0.3741 0.9287 0.3452 0.8407 0.8223 0.9858 0.4842 0.5462
Hassan and Bhuiyan (2017) 0.7767 0.8543 0.4205 0.7951 0.8660 0.4808 0.8049 0.9515 0.5002 0.5389
Hassan and Bhuiyan (2017) 0.8434 0.9000 0.4072 0.9077 0.4821 0.8184 0.8335 0.9880 0.5289 0.5828
Hassan and Bhuiyan (2016) 0.8434 0.9004 0.3874 0.9177 0.5059 0.7770 0.8360 0.9865 0.5213 0.5772
Hassan and Bhuiyan (2016) 0.8209 0.8863 0.3907 0.9044 0.4255 0.8216 0.7850 0.9793 0.4947 0.5475
Proposed mother system 0.8380 0.8974 0.2536 0.9212 0.5345 0.8089 0.8086 0.9901 0.4774 0.5529
Proposed hybrid system 0.8496 0.9041 0.3248 0.9325 0.6190 0.8155 0.7972 0.9875 0.5238 0.5904

The metrics calculated for the five-stage case from the confusion matrices are given in Table 15. As can be seen, the study by Hassan and Bhuiyan (2016) has the highest NAoSP and NAoGP values.

Table 15.

The average metrics of the proposed systems and the other state-of-the-art methods based on the holdout strategy for the five-stage sleep classification case; the highest value in each metric is indicated in boldface

Approach Kappa ACC Sen. S1 Sen. S2 Sen. S3 + S4 Sen. R Sen. W NAoSP NAoGP
Hassan and Bhuiyan (2016) 0.8616 0.9125 0.3741 0.9144 0.8107 0.8211 0.9868 0.6002 0.6470
Hassan and Bhuiyan (2017) 0.8639 0.9136 0.3973 0.9022 0.8230 0.8298 0.9888 0.6130 0.6578
Hassan and Bhuiyan (2016) 0.8438 0.9011 0.3973 0.8956 0.7984 0.7813 0.9820 0.5866 0.6286
Hassan and Bhuiyan (2016) 0.8550 0.9069 0.4701 0.9237 0.9000 0.8086 0.9528 0.6548 0.6838
Hassan et al. (2015a) 0.7794 0.8604 0.2748 0.8293 0.7692 0.6956 0.9659 0.4944 0.5329
Hassan et al. (2015b) 0.7169 0.8203 0.2284 0.8083 0.6738 0.6186 0.9339 0.4155 0.4507
Hassan and Subasi (2017) 0.8542 0.9079 0.3874 0.9061 0.7892 0.8136 0.9858 0.5923 0.6384
Proposed mother system 0.8555 0.9089 0.2517 0.9146 0.8197 0.8086 0.9904 0.5385 0.6139
Proposed hybrid system 0.8659 0.9153 0.3003 0.9300 0.8569 0.7944 0.9885 0.5678 0.6380

The main aim of this article is to introduce two new metrics for the evaluation of sleep staging systems. With the NAoSP metric, the sensitivity of a system to all sleep stages can be measured with a single value, and the NAoGP metric was introduced to compare systems from all dimensions. In this way, the gap in the comparison of these systems is filled by expressing many metrics with a single one. The NAoGP metric can also incorporate other metrics used in classification, such as the F-measure, so that sleep staging systems can be evaluated even more comprehensively. The proposed metrics can also be tested on sleep stage classification systems that use the optimal wavelets suggested in Sharma et al. (2020).

In addition, in this study, a new sleep staging system requiring no code writing was proposed, built with two applications offered by MATLAB (2019b). The use of these applications imposes some restrictions but eliminates coding errors such as programmer mistakes. The advantage of the proposed sleep scoring system is that it achieves a higher ACC and Cohen's kappa coefficient than previously published machine learning methods and is more successful than them according to the new NAoGP metric.

Conclusion

Evaluation of sleep stage classification systems is generally based on the accuracy rate. However, evaluating a system based solely on the accuracy rate is not the same as evaluating it from all dimensions. For this purpose, Cohen's kappa coefficient and, in recent years, even the NREM1 sensitivity have been used as evaluation metrics. Nevertheless, a system may be more successful in one metric but show poor performance in another. Therefore, in this study, two new metrics for sleep staging systems were proposed, and the existing systems in the literature were evaluated accordingly. A new method was also introduced for the automatic classification of sleep stages. This method, based on a single EEG channel, was tested on the sleep-EDF dataset from the PhysioNet database. Two applications offered by MATLAB (2019b) were employed for feature extraction and classification after the pre-processing of the signals. Based on the NAoGP value, the proposed hybrid system in the six-stage case is more successful than the existing machine learning systems. Consequently, the proposed single-channel method can be adopted for robust and reliable sleep stage classification in all the dimensions required for real-time applications.

Compared to systems based on code writing, one of the problems of working with these applications is that the method parameters cannot be changed freely, which can negatively affect the results. For example, although the Diagnostic Feature Designer application is easier to use than writing code, it does not allow arbitrary features to be obtained in the frequency domain. Also, when using the K-fold cross-validation strategy in the Classification Learner app, the maximum number of folds is 50, so the leave-one-out cross-validation strategy cannot be used. Therefore, developing the proposed sleep staging system with different feature extraction and classification methods, testing it on different and larger datasets, and comparing it with existing systems using the NAoSP and NAoGP metrics are planned as future studies.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Appendix

See Tables 16, 17, 18, 19, 20, 21 and 22.

Footnotes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Mesut Melek, Email: masoud.maleki1361@yahoo.com, Email: mesutmelek@gumushane.edu.tr.

Negin Manshouri, Email: negin.manshouri@ktu.edu.tr.

Temel Kayikcioglu, Email: tkayikci@ktu.edu.tr.

References

  1. Akben SB, Alkan A. Visual interpretation of biomedical time series using Parzen window-based density-amplitude domain transformation. PLoS ONE. 2016. doi: 10.1371/journal.pone.0163569.
  2. Aydemir O. A new performance evaluation metric for classifiers: polygon area metric. J Classif. 2020. doi: 10.1007/s00357-020-09362-5.
  3. Boashash B, Ouelha S. Automatic signal abnormality detection using time-frequency features and machine learning: a newborn EEG seizure case study. Knowl Based Syst. 2016;106:38–50. doi: 10.1016/j.knosys.2016.05.027.
  4. Cohen J. A coefficient of agreement for nominal scales. Educ Psychol Meas. 1960;20(1):37–46. doi: 10.1177/001316446002000104.
  5. Dhok S, Pimpalkhute V, Chandurkar A, Bhurane AA, Sharma M, Acharya UR. Automated phase classification in cyclic alternating patterns in sleep stages using Wigner–Ville distribution based features. Comput Biol Med. 2020;119:103691. doi: 10.1016/j.compbiomed.2020.103691.
  6. Ghimatgar H, Kazemi K, Helfroush MS, Aarabi A. An automatic single-channel EEG-based sleep stage scoring method based on hidden Markov model. J Neurosci Methods. 2019;324:108320. doi: 10.1016/j.jneumeth.2019.108320.
  7. Goldberger AL, et al. PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals. Circulation. 2000. doi: 10.1161/01.cir.101.23.e215.
  8. Hassan AR, Bhuiyan MIH. Computer-aided sleep staging using complete ensemble empirical mode decomposition with adaptive noise and bootstrap aggregating. Biomed Signal Process Control. 2016;24:1–10. doi: 10.1016/j.bspc.2015.09.002.
  9. Hassan AR, Bhuiyan MIH. A decision support system for automatic sleep staging from EEG signals using tunable Q-factor wavelet transform and spectral features. J Neurosci Methods. 2016;271:107–118. doi: 10.1016/j.jneumeth.2016.07.012.
  10. Hassan AR, Bhuiyan MIH. Automatic sleep scoring using statistical features in the EMD domain and ensemble methods. Biocybern Biomed Eng. 2016;36(1):248–255. doi: 10.1016/j.bbe.2015.11.001.
  11. Hassan AR, Bhuiyan MIH. An automated method for sleep staging from EEG signals using normal inverse Gaussian parameters and adaptive boosting. Neurocomputing. 2017;219:76–87. doi: 10.1016/j.neucom.2016.09.011.
  12. Hassan AR, Bhuiyan MIH. Automated identification of sleep states from EEG signals by means of ensemble empirical mode decomposition and random under sampling boosting. Comput Methods Programs Biomed. 2017;140:201–210. doi: 10.1016/j.cmpb.2016.12.015.
  13. Hassan AR, Subasi A. A decision support system for automated identification of sleep stages from single-channel EEG signals. Knowl Based Syst. 2017;128:115–124. doi: 10.1016/j.knosys.2017.05.005.
  14. Hassan AR, Bashar SK, Bhuiyan MIH (2015) On the classification of sleep states by means of statistical and spectral features from single channel electroencephalogram. In: 2015 international conference on advances in computing, communications and informatics, ICACCI 2015, pp 2238–2243. doi: 10.1109/ICACCI.2015.7275950.
  15. Hassan AR, Bashar SK, Bhuiyan MIH (2016) Automatic classification of sleep stages from single-channel electroencephalogram. In: 12th IEEE international conference electronics, energy, environment, communication, computer, control (E3-C3), INDICON 2015. doi: 10.1109/INDICON.2015.7443756.
  16. Horne J. Why REM sleep? Clues beyond the laboratory in a more challenging world. Biol Psychol. 2013;92(2):152–168. doi: 10.1016/j.biopsycho.2012.10.010.
  17. Hsu YL, Yang YT, Wang JS, Hsu CY. Automatic sleep stage recurrent neural classifier using energy features of EEG signals. Neurocomputing. 2013;104:105–114. doi: 10.1016/j.neucom.2012.11.003.
  18. Kayikcioglu T, Maleki M, Eroglu K. Fast and accurate PLS-based classification of EEG sleep using single channel data. Expert Syst Appl. 2015;42(21):7825–7830. doi: 10.1016/j.eswa.2015.06.010.
  19. Kemp B, Zwinderman AH, Tuk B, Kamphuisen HAC, Oberyé JJL. Analysis of a sleep-dependent neuronal feedback loop: the slow-wave microcontinuity of the EEG. IEEE Trans Biomed Eng. 2000;47(9):1185–1194. doi: 10.1109/10.867928.
  20. Lajnef T, et al. Learning machines and sleeping brains: automatic sleep stage classification using decision-tree multi-class support vector machines. J Neurosci Methods. 2015;250:94–105. doi: 10.1016/j.jneumeth.2015.01.022.
  21. Li Y, Luo ML, Li K. A multiwavelet-based time-varying model identification approach for time-frequency analysis of EEG signals. Neurocomputing. 2016;193:106–114. doi: 10.1016/j.neucom.2016.01.062.
  22. Liang SF, Kuo CE, Hu YH, Pan YH, Wang YH. Automatic stage scoring of single-channel sleep EEG by using multiscale entropy and autoregressive models. IEEE Trans Instrum Meas. 2012;61(6):1649–1657. doi: 10.1109/TIM.2012.2187242.
  23. Mousavi Z, Yousefi Rezaii T, Sheykhivand S, Farzamnia A, Razavi SN. Deep convolutional neural network for classification of sleep stages from single-channel EEG signals. J Neurosci Methods. 2019;324:108312. doi: 10.1016/j.jneumeth.2019.108312.
  24. Penzel T, Conradt R. Computer based sleep recording and analysis. Sleep Med Rev. 2000;4(2):131–148. doi: 10.1053/smrv.1999.0087.
  25. Rechtschaffen A, Kales A. A manual for standardized terminology, techniques and scoring system for sleep stages in human subjects. Brain Inf Serv. 1968;32:332–337.
  26. Ronzhina M, Janoušek O, Kolářová J, Nováková M, Honzík P, Provazník I. Sleep scoring using artificial neural networks. Sleep Med Rev. 2012;16(3):251–263. doi: 10.1016/j.smrv.2011.06.003.
  27. Seifpour S, Niknazar H, Mikaeili M, Nasrabadi AM. A new automatic sleep staging system based on statistical behavior of local extrema using single channel EEG signal. Expert Syst Appl. 2018;104:277–293. doi: 10.1016/j.eswa.2018.03.020.
  28. Sharma R, Pachori RB, Upadhyay A. Automatic sleep stages classification based on iterative filtering of electroencephalogram signals. Neural Comput Appl. 2017;28(10):2959–2978. doi: 10.1007/s00521-017-2919-6.
  29. Sharma M, Goyal D, Achuth PV, Acharya UR. An accurate sleep stages classification system using a new class of optimally time-frequency localized three-band wavelet filter bank. Comput Biol Med. 2018;98:58–75. doi: 10.1016/j.compbiomed.2018.04.025.
  30. Sharma M, Agarwal S, Acharya UR. Application of an optimal class of antisymmetric wavelet filter banks for obstructive sleep apnea diagnosis using ECG signals. Comput Biol Med. 2018;100:100–113. doi: 10.1016/j.compbiomed.2018.06.011.
  31. Sharma M, Patel S, Choudhary S, Acharya UR. Automated detection of sleep stages using energy-localized orthogonal wavelet filter banks. Arab J Sci Eng. 2019;45(4):2531–2544. doi: 10.1007/s13369-019-04197-8.
  32. Sharma M, Raval M, Acharya UR. A new approach to identify obstructive sleep apnea using an optimal orthogonal wavelet filter bank with ECG signals. Inform Med Unlock. 2019;16:100170. doi: 10.1016/j.imu.2019.100170.
  33. Sharma M, Patel S, Acharya UR. Automated detection of abnormal EEG signals using localized wavelet filter banks. Pattern Recognit Lett. 2020;133:188–194. doi: 10.1016/j.patrec.2020.03.009.
  34. Shephard J. Atlas of sleep medicine. New York: Futur. Publ. Co.; 1991.
  35. Zhu G, Li Y, Wen PP. Analysis and classification of sleep stages based on difference visibility graphs from a single-channel EEG signal. IEEE J Biomed Health Inform. 2014;18(6):1813–1821. doi: 10.1109/JBHI.2014.2303991.
