Table 4.
Details of music emotion recognition algorithms based on deep learning methods.
| Method | Dataset | Features | Classifier/regressor | Performance |
|---|---|---|---|---|
| Keelawat et al. (2019) | 12 recruited participants listened to 16 songs selected from MIDI | Segmented EEG | CNN | Accuracies of 78.36% and 83.67% in binary classification of arousal and valence, respectively. |
| Er et al. (2021) | Nine recruited participants listened to 16 audio tracks | Power spectrogram | Pretrained VGG16 | Accuracy of 73.28% in quaternary classification. |
| Thammasan et al. (2016a) | 15 recruited participants listened to 16 songs selected from MIDI | HFD, PSD, discrete wavelet transform | Deep belief networks | Accuracy of 81.98% in binary classification of arousal and valence. |
| Rahman et al. (2020) | 24 recruited participants listened to 12 songs | DFA, approximate entropy, fuzzy entropy, Shannon entropy, permutation entropy, Hjorth parameters, Hurst exponent | Neural network | Three emotion scales (depressing vs. exciting, sad vs. happy, and irritating vs. soothing). |
| Liu et al. (2022) | 15 recruited participants listened to 13 music excerpts | Power spectrogram | Xception | Accuracy of 76.84% in HVHA vs. LVLA classification. |
| Luo et al. (2022) | DEAP | PSD | RBF-SVM, LSTM | SAM scores of 6.17 (high) and 4.76 (low) on the continuous valence scale, close to the 6.98 and 4.36 evaluated in the music database. |
| Hsu et al. (2018) | IADS | Segmented EEG | Neural network | MSE of 1.865 on the 2D continuous SAM scale. |
| Sheykhivand et al. (2020) | 16 recruited participants listened to 10 music excerpts | Segmented EEG | CNN, LSTM | Accuracy of 76.84% in HVHA vs. LVLA classification. |
| Li and Zheng (2021) | 21 recruited participants listened to 15 music excerpts | Segmented EEG | Stacked sparse auto-encoder | Accuracies of 59.5% and 66.8% in binary classification of arousal and valence, respectively. |
| Salama et al. (2018) | DEAP | Segmented EEG | 3D CNN | Accuracies of 88.49% and 87.44% in binary classification of arousal and valence, respectively. |
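Several entries above feed PSD or power-spectrogram features computed from segmented EEG into their classifiers. A minimal numpy sketch of window-wise band-power extraction follows; the window length, band edges, sampling rate, and synthetic signal are illustrative assumptions, not values taken from the cited papers:

```python
import numpy as np

def band_power(segment, fs, band):
    """Mean power of a 1-D EEG segment within band = (lo, hi) Hz."""
    freqs = np.fft.rfftfreq(len(segment), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(segment)) ** 2 / (fs * len(segment))
    mask = (freqs >= band[0]) & (freqs < band[1])
    return psd[mask].mean()

def psd_features(eeg, fs=128, win_s=2.0, bands=((4, 8), (8, 13), (13, 30))):
    """Split each channel of eeg (channels x samples) into fixed windows
    and return a (windows x channels*bands) feature matrix."""
    win = int(win_s * fs)
    n_win = eeg.shape[1] // win
    feats = []
    for w in range(n_win):
        seg = eeg[:, w * win:(w + 1) * win]
        # theta/alpha/beta power per channel, flattened into one row
        feats.append([band_power(ch, fs, b) for ch in seg for b in bands])
    return np.asarray(feats)

# Synthetic 4-channel "EEG", 10 s at 128 Hz, dominated by a 10 Hz alpha rhythm.
rng = np.random.default_rng(0)
t = np.arange(10 * 128) / 128
eeg = np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal((4, t.size))
X = psd_features(eeg)
print(X.shape)  # (5, 12): 5 windows x (4 channels * 3 bands)
```

The resulting feature matrix is what a downstream classifier or regressor (SVM, LSTM, neural network) would consume; methods using the raw power spectrogram instead pass the full time-frequency image to a CNN such as VGG16 or Xception.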