2025 May 2;15:15411. doi: 10.1038/s41598-025-96588-1

Table 8.

Comparison of the architectures and accuracy based on the datasets.

Dataset	Architecture	Accuracy	TRILL-Distilled	FRILL	BRILLsson	Citation
MUSAN	RNN classifier using a wake-up sensor (WUS)	Less than 3% no-trigger rate and less than 1% dirty cycle	98.5	98.5	93.0	¹⁵
MUSAN	ULP RNN	<3% NTR (no trigger rate)	–	–	–	¹⁶
MUSAN	A large model is first trained to be robust to loud noise, then compressed into a small network using distillation.	96.4% at 20 dB; 91.1% at 0 dB; 96.4% at 20 dB (UrbanSound8K)	–	–	–	¹⁷
AudioSet	DNN, RNN, CNN, LSTM	Because TinyML devices have limited computing power and memory, decision trees are used instead of neural networks.	–	–	–	¹⁸
AudioSet	Audio Spectrogram Transformer (pre-trained transformer)	98.11%	–	–	–	¹⁹
ESC-50	Embedded system with a SparkFun MicroMod RP2040 processor, MicroMod machine learning board, and HC-SR04 ultrasonic sensor.	24.44% accuracy; 39% loss	87.9	86.4	85.0	²⁰
ESC-50	CNN using augmentation techniques: standard signal augmentation, short signal augmentation, super signal augmentation, time scale modification, short spectrum augmentation, and super spectrum augmentation.	96.82% accuracy on bird sounds; 90.51% accuracy on cat sounds	–	–	–	²¹
CREMA-D	DNN, CNN, SVM	The model detects anger from a confidence score: above 0.98 is anger, between 0.55 and 0.98 is about to be angry, and below 0.55 is not anger.	70.2	70.9	85.0	²²
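
The confidence thresholds in the last table row can be read as a simple three-way rule. The sketch below is only an illustration of that rule, not the cited authors' implementation: the function name `classify_anger` and the label strings are hypothetical, and treating the boundary values 0.55 and 0.98 as part of the middle band is an assumption, since the table does not say how exact boundary values are handled.

```python
def classify_anger(confidence: float) -> str:
    """Map a model confidence score in [0, 1] to one of three anger labels.

    Thresholds (0.98 and 0.55) come from the table row; the inclusive
    handling of the boundary values is an assumption of this sketch.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    if confidence > 0.98:
        return "anger"
    if confidence >= 0.55:
        # Assumed: exactly 0.55 and exactly 0.98 fall in the middle band.
        return "about to be angry"
    return "not anger"


if __name__ == "__main__":
    # Illustrative scores only; these are not results from the paper.
    for score in (0.99, 0.70, 0.30):
        print(score, classify_anger(score))
```

Keeping the rule in one small function makes the thresholds easy to adjust if a different operating point is needed.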