2025 May 2;15:15411. doi: 10.1038/s41598-025-96588-1

Table 9.

Comparison of TinyML-based voice assistant models.

| Model | Dataset | Accuracy (%) | Latency (ms) | Energy consumption |
|---|---|---|---|---|
| Transformer-based model | Common Voice dataset (multilingual) | 92.0 | 80–150 | High |
| Hybrid CNN-RNN | CHiME Speech Dataset (noisy environments) | 89.5 | 100–250 | Moderate |
| DNN | Gesture Dataset | 99.0 | 50–100 | High |
| CNN | UrbanSound8K | 94.0 | 30–70 | Moderate |
| RNN | AudioSet | 95.0 | 100–200 | High |
| Decision Tree | ESC-50 | 90.0 | 10–30 | Low |
| SVM | VoxForge dataset (multilingual speech) | 91.5 | 90–180 | Moderate |