Author manuscript; available in PMC: 2024 Oct 5.
Published in final edited form as: IEEE Trans Neural Netw Learn Syst. 2023 Oct 5;34(10):6983–7003. doi: 10.1109/TNNLS.2022.3145365

TABLE IV.

A summary of contributions describing the application of RNNs to sleep stage classification.

Author | Model architecture | Dataset (footnotes a, b) | Performance
Supratak et al. 2017[21] • The authors proposed a hierarchical structure called DeepSleepNet with CNN and RNN parts.
• Each signal chunk was first fed into a CNN part composed of two parallel CNN-1Ds with different filter sizes, and the two outputs were then concatenated as a chunk-level feature.
• All chunk features were connected as temporal sequences and fed into a 2-layer bidirectional LSTM with a residual connection.
MASS and Sleep-EDF For MASS, the accuracy and macro F1-score were 86.2% and 0.817; for Sleep-EDF, the accuracy and macro F1-score were 82.0% and 0.769.
Dong et al. 2017[38] • In each signal chunk, engineered features were extracted.
• Each feature vector was first fed into two dense layers. The outputs of all chunks in each recording were then connected into a feature sequence.
• The feature sequence was fed into an LSTM as an input vector.
MASS The accuracy and macro F1-score were 85.92% and 0.805.
Phan et al. 2018[23] • The authors first calculated the log-power spectral coefficients in each chunk as model input.
• A DNN model was used as a filter bank for feature extraction and dimension reduction.
• A two-layer bidirectional GRU with an attention layer was then applied on the top of the DNN part. The attention vector was fed to a softmax layer for final output.
• After the model training, the softmax layer was replaced by an SVM for final decision-making.
Sleep-EDF Expanded When only the in-bed period was considered, the accuracy and macro F1-score were 79.1% and 0.698; when the periods before and after sleep were also included, they were 82.5% and 0.72.
Michielli et al. 2018[81] • Each 30s signal chunk was divided into 30 smaller chunks with 1s length, and 55 features were extracted in each chunk.
• The authors designed a two-level LSTM-based classifier. After feature selection, the feature sequences were fed into the first-level classifier, which distinguished four classes, with stages N1 and REM merged into a single class.
• The samples assigned to the merged N1/REM class were then fed to a second, binary classifier.
Sleep-EDF The accuracy was 86.7%.
Phan et al. 2019[14] • The authors proposed a hierarchical structure called SeqSleepNet with filter bank layers and two levels of RNNs.
• Raw data had three channels (EEG, EOG, and EMG). In each chunk, a power spectrum image was computed for each channel, and the filter bank layers were applied to these images.
• Within each chunk, the time-step features were connected as a sequence and fed into the first-level RNN (a bidirectional GRU) with an attention layer, yielding a chunk-level feature.
• The chunk-level features were then connected as a sequence across chunks and fed into the second-level bidirectional GRU.
MASS The accuracy and macro F1-score were 87.1% and 0.833.
Phan et al. 2019[31] • The authors proposed a transfer learning strategy.
• The authors used either the CNN part from DeepSleepNet or the RNN part from SeqSleepNet to extract chunk-level features in each chunk.
• The chunk-level features were connected as feature sequences and fed into a 2-layer bidirectional LSTM with residual connection.
The model was trained on MASS and fine-tuned on Sleep-EDF-SC, Sleep-EDF-ST, Surrey-cEEGGrid, and Surrey-PSG. The accuracy and macro F1-score obtained with transfer learning outperformed direct training on all four datasets.
Phan et al. 2019[82] The authors proposed a Fusion model, which was composed of DeepSleepNet and SeqSleepNet with slight modifications. MASS The accuracy and macro F1-score were 88.0% and 0.843.
Mousavi et al. 2019[83] • The authors proposed a Seq2seq model called SleepEEGNet.
• The signal was first fed into CNN layers to extract the feature sequence, which was used as input to an encoder.
• Both the encoder and decoder were constructed with bidirectional LSTMs and an attention mechanism.
Sleep-EDF Expanded (version 1 and 2) The accuracy and macro F1-score were 84.26% and 0.7966 for version 1; 80.03% and 0.7355 for version 2.
a MASS: Montreal Archive of Sleep Studies[84]; Sleep-EDF: a database in European Data Format[15]; Sleep-EDF Expanded: an expanded version of Sleep-EDF[15]; Sleep-EDF-SC: the Sleep Cassette subset of the Sleep-EDF Expanded dataset; Sleep-EDF-ST: the Sleep Telemetry subset of the Sleep-EDF Expanded dataset; Surrey-cEEGGrid and Surrey-PSG: collected at the University of Surrey using behind-the-ear electrodes and PSG electrodes, respectively[85].

b Sleep stages are commonly categorized into five classes: rapid eye movement (REM) sleep, three non-REM stages corresponding to increasing depths of sleep (N1–N3), and wakefulness.
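Several of the tabulated models (Phan et al. 2018 and 2019) place an attention layer on top of a bidirectional GRU to pool the per-time-step hidden states into a single attention vector. A minimal sketch of this pooling step, in plain Python with hand-supplied relevance scores (in the actual models the scores come from a small trainable layer; the values here are purely illustrative):

```python
import math

def attention_pool(hidden_states, scores):
    """Combine a sequence of RNN hidden vectors into one attention vector.

    hidden_states: list of equal-length feature vectors (one per time step)
    scores: one unnormalized relevance score per time step (in practice
    produced by a trainable layer; here they are supplied directly)
    """
    # Softmax over the scores gives one weight per time step.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The attention vector is the weighted average of the hidden states.
    dim = len(hidden_states[0])
    return [sum(w * h[d] for w, h in zip(weights, hidden_states))
            for d in range(dim)]

# Toy example: three 2-D hidden states, the middle time step scored highest,
# so it dominates the pooled vector.
h = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
vec = attention_pool(h, scores=[0.1, 2.0, 0.1])
```

The weighted-average form is what lets the classifier emphasize the most informative portion of an epoch instead of only the final RNN state.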
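The two-level design of Michielli et al. (a four-class first stage with N1 and REM merged, followed by a binary N1-versus-REM stage) is a cascade of classifiers. A hedged sketch of the routing logic only, with trivial threshold rules standing in for the LSTM-based classifiers of the original work:

```python
def two_level_classify(epoch_features, first_level, second_level):
    """Two-level cascade in the spirit of Michielli et al.'s design.

    first_level: maps epoch features to 'W', 'N3', or the merged 'N1/REM'
    second_level: binary classifier separating 'N1' from 'REM'
    Both are stand-in callables; in the original work each level is an
    LSTM-based network operating on selected features.
    """
    labels = []
    for feats in epoch_features:
        label = first_level(feats)
        if label == "N1/REM":
            # Only epochs in the merged class reach the second classifier.
            label = second_level(feats)
        labels.append(label)
    return labels

# Toy stand-ins: threshold rules on a single scalar feature.
first = lambda f: "W" if f > 0.8 else ("N1/REM" if f > 0.4 else "N3")
second = lambda f: "REM" if f > 0.6 else "N1"
out = two_level_classify([0.9, 0.7, 0.5, 0.1], first, second)
# out == ["W", "REM", "N1", "N3"]
```

Splitting the hardest distinction (N1 vs. REM) into its own classifier lets each level solve an easier problem than a single five-class model would face.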