Table A1. Overview of CNN-based approaches to EEG analysis: tasks, datasets, architectures, training setups, and reported performance.
№ | Tasks | Dataset | CNN | Learning Type | Steps | Structure | Optimization | Activation Function | Loss Function | Evaluation Metrics | Framework | Ref.
---|---|---|---|---|---|---|---|---|---|---|---|---
1 | Classification: sleep stage annotation | PhysioNet Sleep-EDF dataset | SleepEEGNet | Supervised | Decomposition of the data into frequency components and subsequent classification | 2D CNN and BiRNN | RMSProp optimizer | ReLU | MFE (mean false error) | k-fold cross-validation; overall accuracy, precision, recall (sensitivity), specificity, and F1-score | Python 3.7–3.10, TensorFlow 2.8 | [256]
2 | Emotion recognition | DEAP dataset | EEG-based emotion recognition using a 2D CNN | Supervised | Decomposition of the data into frequency components and subsequent classification | 2D CNN | Particle swarm optimization | LeakyReLU; output: Softmax | Cross-entropy | 85% | Python 3.7–3.10 | [214]
3 | Motor imagery signal classification | BCI Competition IV 2a (BCI-IV2a), High Gamma Dataset (HGD) | MBEEGSE | Supervised | MBEEGSE architecture divided into three branches, each consisting of an EEGNet and an SE block (SE block sketched after the table) | EEGNet and Squeeze-and-Excitation (SE) block | Adam optimizer | Softmax | Cross-entropy | 70% | Keras 3.0.4, Python 3.6–3.9 | [202]
4 | Motor imagery EEG decoding | BCI Competition 2008 IV 2a dataset, High Gamma Dataset (HGD) | TS-SEFFNet | Supervised | First, the deep temporal convolution block (DT-Conv block) is applied. Second, the multi-spectral convolution block (MS-Conv block) runs in parallel using multilevel wavelet convolutions. Finally, the SE-Feature-Fusion block fuses the deep-temporal and multi-spectral features into compact pooled feature maps that recalibrate the feature responses across channels | DT-Conv block, MS-Conv block, SE-Feature-Fusion block | Optimization steps of the proposed TS-SEFFNet method | Softmax | Custom loss function | 93.25% | Torch 1.4, Python 3.8 | [217]
5 | Sleep stage annotation | PhysioNet Challenge dataset | Self-supervised learning (SSL) | Unsupervised | The first step is a sampling process in which examples are extracted from the time series S (the EEG recording). The second step is a learning process in which the sampled examples are used to train the feature extractor end-to-end | Relative positioning (RP), temporal shuffling (TS), contrastive predictive coding (CPC) | Adam optimizer | Rectified linear unit (ReLU) | Cross-entropy | 72.3% | Torch 1.4, Python 3.9 | [218]
6 | Prediction: EEG imaginary speech recognition | Kara One database | - | Supervised | A CNN containing two convolutional layers with 64 and 128 filters connected to a dense layer with 64 neurons, applied to the input signal spectrum of a 0.25 s window (sketched after the table) | 2D CNN | Adam optimizer | Linear | Categorical cross-entropy | 37% | - | [257]
7 | EEG speech recognition | Custom dataset (not available) | - | Supervised | ResNet18/50/101 combined with two layers of gated recurrent units (GRU); the ResNet18 outputs are fed to a recurrent neural network containing 1024 hidden GRU units | CNN and RNN | Adam optimizer | Softmax | - | 85% | - | [258]
8 | EEG speech recognition | Custom dataset (not available) | - | Supervised | The architecture combines a pre-trained VGG Net CNN with a target CNN: the pre-trained VGG Net extracts global features learned for general image classification, while the target CNN uses the pre-trained VGG Net model for efficient and accurate categorization of the EEG signals | Deep residual encoder-based VGG Net CNN | - | Softmax | Softmax cross-entropy | 95% | - | [259]
9 | Seizure prediction | CHB-MIT and Kaggle | - | Supervised | A hybrid network that combines the complementary strengths of a CNN and a Transformer. The CNN extracts local information and contains two 3 × 3 convolutions with stride 1 and another 3 × 3 convolution with stride 2 to reduce the size of the input features; each convolutional layer is followed by a GELU activation and a batch normalization (BN) layer. The model then has two stages for extracting multiscale features from the EEG spectrum, each consisting of a set of Transformer blocks that capture long-term dependencies | CNN and Transformer | Adam | Softmax | Cross-entropy | 95% | Torch 1.4, Python 3.8 | [260]
10 | Predicting human intention-behavior | BCI Competition IV dataset 2b | - | Supervised | The multi-scale CNN model has seven layers: one input layer, two convolutional layers, one max-pooling layer, one multi-scale layer, one fully connected layer, and one softmax output layer. The input layer is fed with a 40 × 32 × 3 time-frequency image obtained by preprocessing the EEG signals with the STFT (sketched after the table) | Multi-scale CNN model | - | Linear | Cross-entropy | 73.9% | Python 3.8, Keras 3 | [261]
11 | Artifact removal: EEG artifact detection and correction | Custom dataset (not available) | - | Unsupervised | A modification of a feed-forward neural network that uses weight sharing and exhibits translation invariance. Learning in the CNN follows the same principle as in a traditional feed-forward neural network: the error from the output layer is back-propagated through the network, and the network weights are updated in proportion to the error gradient | CNN | Adam | - | Cross-entropy | - | Python 3.9, Keras 3 | [262]
12 | Removal of muscle artifacts from EEG | EEGdenoiseNet | - | Supervised | The CNN for myogenic artifact reduction contains seven similar blocks. In each of the first six blocks, two 1D convolution layers with small 1 × 3 kernels, stride 1, and a ReLU activation function are followed by a 1D average-pooling layer with pool size two. In the seventh block, two 1D convolution layers are followed by a flatten layer. The network gradually reduces the EEG signal sampling rate through the 1D average-pooling layers (sketched after the table) | CNN | RMSprop | ReLU | Mean squared error (MSE) | - | Python 3.10, TensorFlow 2.8 | [263]
13 | Denoising EEG signals from artifacts | EEGdenoiseNet | MultiResUNet3+ | Supervised | MultiResUNet3+ consists of full-scale skip connections that aggregate connections between encoders and decoders as well as intra-connections between the decoder sub-networks. Instead of directly combining the encoder and decoder features, the encoder features pass through several convolutional levels with residual connections | CNN, encoders | Adam | ReLU | Mean squared error (MSE) | - | - | [152]
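
For illustration, the Squeeze-and-Excitation (SE) block used by MBEEGSE (row 3) and TS-SEFFNet (row 4) can be sketched in Keras as below. This is a minimal sketch of the generic SE mechanism, assuming a 2D feature map and an illustrative reduction ratio; it is not the exact configuration reported in [202] or [217].

```python
import tensorflow as tf
from tensorflow.keras import layers

def se_block(x, reduction=8):
    """Squeeze-and-Excitation block: reweight the channels of a feature map.

    The reduction ratio is an illustrative assumption, not the value from [202]/[217].
    """
    channels = x.shape[-1]
    # Squeeze: global average pooling collapses the temporal/spatial axes.
    s = layers.GlobalAveragePooling2D()(x)
    # Excitation: a bottleneck MLP produces per-channel weights in (0, 1).
    s = layers.Dense(channels // reduction, activation="relu")(s)
    s = layers.Dense(channels, activation="sigmoid")(s)
    # Scale: broadcast the weights back over the feature map.
    s = layers.Reshape((1, 1, channels))(s)
    return layers.Multiply()([x, s])

# Example: recalibrate a hypothetical EEG feature map of shape (time, electrodes, filters).
inp = layers.Input(shape=(128, 22, 16))
model = tf.keras.Model(inp, se_block(inp))
```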
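
The imagined-speech CNN of row 6 [257] (two convolutional layers with 64 and 128 filters feeding a 64-neuron dense layer) might look roughly as follows in Keras. Kernel sizes, pooling, input shape, and the softmax head are assumptions for illustration; the row reports a linear activation, the Adam optimizer, and categorical cross-entropy.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_speech_cnn(input_shape=(64, 64, 1), n_classes=11):
    """Sketch of a 2D CNN with two conv layers (64 and 128 filters) and a
    64-neuron dense layer; input_shape and n_classes are placeholders for the
    spectrum of a 0.25 s EEG window and the prompt vocabulary size."""
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(64, (3, 3), padding="same", activation="linear"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(128, (3, 3), padding="same", activation="linear"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(64, activation="linear"),
        layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```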
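
The seven-layer multi-scale CNN of row 10 [261] (input, two convolutional layers, max-pooling, a multi-scale layer, a fully connected layer, and a softmax output over 40 × 32 × 3 STFT images) can be approximated as in the sketch below. Filter counts, kernel sizes, and hidden activations are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_multiscale_cnn(n_classes=2):
    """Sketch of a multi-scale CNN for 40 x 32 x 3 time-frequency images.
    The multi-scale layer is modeled as parallel convolutions with different
    kernel sizes concatenated along the channel axis."""
    inp = layers.Input(shape=(40, 32, 3))                  # STFT image
    x = layers.Conv2D(16, (3, 3), padding="same", activation="relu")(inp)
    x = layers.Conv2D(32, (3, 3), padding="same", activation="relu")(x)
    x = layers.MaxPooling2D((2, 2))(x)
    # Multi-scale layer: parallel 1x1, 3x3, and 5x5 convolutions.
    branches = [layers.Conv2D(32, (k, k), padding="same", activation="relu")(x)
                for k in (1, 3, 5)]
    x = layers.Concatenate()(branches)
    x = layers.Flatten()(x)
    x = layers.Dense(128, activation="relu")(x)            # fully connected layer
    out = layers.Dense(n_classes, activation="softmax")(x) # softmax output layer
    return Model(inp, out)
```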
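
Finally, the 1D-CNN muscle-artifact remover of row 12 [263] is described precisely enough to sketch: six blocks of two 1 × 3 convolutions (stride 1, ReLU) followed by average pooling with pool size two, then a seventh block of two convolutions and a flatten layer, trained with RMSprop and an MSE loss. The segment length, filter count, and final dense reconstruction layer below are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_denoise_cnn(n_samples=512, n_filters=64):
    """Sketch of a seven-block 1D CNN for EEG muscle-artifact removal.
    n_samples and n_filters are illustrative; the output dense layer that
    reconstructs the clean segment is an added assumption."""
    model = models.Sequential()
    model.add(layers.Input(shape=(n_samples, 1)))          # noisy EEG segment
    for _ in range(6):
        model.add(layers.Conv1D(n_filters, 3, strides=1, padding="same",
                                activation="relu"))
        model.add(layers.Conv1D(n_filters, 3, strides=1, padding="same",
                                activation="relu"))
        # Average pooling halves the temporal resolution of the signal.
        model.add(layers.AveragePooling1D(pool_size=2))
    # Seventh block: two convolutions followed by a flatten layer.
    model.add(layers.Conv1D(n_filters, 3, strides=1, padding="same",
                            activation="relu"))
    model.add(layers.Conv1D(n_filters, 3, strides=1, padding="same",
                            activation="relu"))
    model.add(layers.Flatten())
    model.add(layers.Dense(n_samples))                     # reconstructed clean EEG
    model.compile(optimizer="rmsprop", loss="mse")
    return model
```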