2025 May 2;15:15411. doi: 10.1038/s41598-025-96588-1

Table 8.

Comparison of the architectures and accuracy based on the datasets.

| Dataset | Architecture | Accuracy | TRILL-Distilled | FRILL | BRILLsson | Citation |
|---|---|---|---|---|---|---|
| MUSAN | RNN classifier using a wake-up sensor (WUS) | Less than 3% no-trigger rate and less than 1% duty cycle | 98.5 | 98.5 | 93.0 | 15 |
| MUSAN | ULP RNN | <3% NTR (no-trigger rate) | | | | 16 |
| MUSAN | A large model is trained for robustness under loud noise, then compressed into a small network using distillation | 96.4% at 20 dB; 91.1% at 0 dB; 96.4% at 20 dB (UrbanSound8K) | - | - | - | 17 |
| AudioSet | DNN, RNN, CNN, LSTM | Given the low computing power and memory of TinyML devices, decision trees are used instead of neural networks | | | | 18 |
| AudioSet | Audio spectrogram transformer; pre-trained transformer | 98.11% | | | | 19 |
| ESC-50 | Embedded system with SparkFun MicroMod RP2040 processor, MicroMod machine learning board, and HC-SR04 ultrasonic sensor | 24.44% accuracy; 39% loss | 87.9 | 86.4 | 85.0 | 20 |
| ESC-50 | CNN with augmentation techniques: standard signal augmentation, short signal augmentation, super signal augmentation, time scale modification, short spectrum augmentation, and super spectrum augmentation | 96.82% on bird sounds; 90.51% on cat sounds | | | | 21 |
| CREMA-D | DNN, CNN, SVM | The model classifies emotion by confidence level: above 0.98 is anger, between 0.55 and 0.98 is about to be angry, below 0.55 is not anger | 70.2 | 70.9 | 85.0 | 22 |
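
The confidence-threshold rule reported for the CREMA-D entry (citation 22) can be illustrated with a minimal sketch. The function name `classify_anger`, its parameter names, and the handling of the exact boundary values 0.55 and 0.98 (placed here in the middle band) are assumptions made for illustration; the table gives only the thresholds, not an implementation.

```python
def classify_anger(confidence: float,
                   anger_threshold: float = 0.98,
                   warning_threshold: float = 0.55) -> str:
    """Map a model confidence score in [0, 1] to one of three labels.

    Thresholds come from the table; treating the boundary values
    0.55 and 0.98 as part of the middle band is an assumption.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    if confidence > anger_threshold:
        return "anger"
    if confidence >= warning_threshold:
        return "about to be angry"
    return "not anger"


if __name__ == "__main__":
    # Hypothetical confidence scores, not results from the cited work.
    for score in (0.99, 0.70, 0.30):
        print(score, classify_anger(score))
```

Keeping the two thresholds as parameters makes it easy to test how sensitive the three-way labelling is to the chosen cut-offs.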