Table 8.
Comparison of architectures and reported accuracy across datasets.
Dataset | Architecture | Accuracy | TRILL-Distilled | FRILL | BRILLsson | Citation |
---|---|---|---|---|---|---|
MUSAN | RNN classifier using wake-up sensor (WUS) | Less than 3% no-trigger rate and less than 1% duty cycle | 98.5 | 98.5 | 93.0 | 15 |
MUSAN | ULP RNN | <3% NTR (no trigger rate) | – | – | – | 16 |
MUSAN | A large model is first trained for noise robustness under loud noise, then compressed into a small network using distillation. | 96.4% at 20 dB; 91.1% at 0 dB; 96.4% at 20 dB (UrbanSound8K) | – | – | – | 17 |
AudioSet | DNN, RNN, CNN, LSTM | Because TinyML devices have limited computing power and memory, decision trees are used instead of neural networks. | – | – | – | 18 |
AudioSet | Audio spectrogram transformer; pre-trained transformer | 98.11% | – | – | – | 19 |
ESC-50 | Embedded system with SparkFun MicroMod RP2040 processor, MicroMod machine learning board, and HC-SR04 ultrasonic sensor | 24.44% accuracy, 39% loss | 87.9 | 86.4 | 85.0 | 20 |
ESC-50 | CNN with augmentation techniques: standard signal augmentation, short signal augmentation, super signal augmentation, time scale modification, short spectrum augmentation, and super spectrum augmentation | 96.82% accuracy on bird sounds; 90.51% accuracy on cat sounds | – | – | – | 21 |
CREMA-D | DNN, CNN, SVM | Detects anger from the confidence level: above 0.98 is anger, between 0.55 and 0.98 is about to be angry, and below 0.55 is not anger. | 70.2 | 70.9 | 85.0 | 22 |
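
The three-way confidence rule in the CREMA-D row (citation 22) can be illustrated with a minimal sketch. The function name `classify_anger` and the label strings are hypothetical, and the table does not say how the exact boundary values 0.55 and 0.98 are handled; this sketch assumes both fall into the middle band.

```python
def classify_anger(confidence: float) -> str:
    """Map a model confidence score in [0, 1] to one of three anger labels.

    Assumption: scores of exactly 0.55 and 0.98 belong to the middle band,
    since the source only gives ">0.98", "between 0.55 and 0.98", "<0.55".
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    if confidence > 0.98:
        return "anger"
    if confidence >= 0.55:
        return "about to be angry"
    return "not anger"


if __name__ == "__main__":
    # Example scores chosen for illustration only; they are not from the cited work.
    for score in (0.99, 0.70, 0.30):
        print(score, classify_anger(score))
```

A single function with ordered comparisons keeps the three bands mutually exclusive, so every valid score maps to exactly one label.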