TABLE XXII.
Impact of Runtime Optimizations vs. No Optimizations
Application | Dataset | Method | Accuracy | Latency/MACs | Flash |
---|---|---|---|---|---|
Human Activity Recognition | Custom* | CNN-TFLM [189] | 85% | 58 ms | 275 kB |
Human Activity Recognition | Custom* | CNN-Cube.AI [189] | 85% | 14 ms | 192 kB |
Audio Keyword Spotting | Speech Commands* | CNN-TFLM [189] | - | 380 ms | 288 kB |
Audio Keyword Spotting | Speech Commands* | CNN-Cube.AI [189] | - | 373 ms | 247 kB |
Image Recognition | ImageNet | MCUNet MbNetv2 [121] | 60.3%–68.5% | 68M–126M | 1 MB–2 MB |
Image Recognition | ImageNet | MCUNetV2 MbNetv2 [121] | 64.9%–71.8% | 119M–256M | 1 MB–2 MB |
Image Recognition | Pascal VOC | MbNetv2+CMSIS [121] | mAP: 31.6% | 34M | OOS |
Image Recognition | Pascal VOC | MCUNet MbNetv2 [121] | mAP: 51.4% | 168M | < 2 MB |
Image Recognition | Pascal VOC | MCUNetV2 MbNetv2 [121] | mAP: 64.6% | 172M | < 1 MB |
Image Recognition | CIFAR-10@ | CNN [164] | 80.3% | 456 ms | < 1 MB |
Image Recognition | CIFAR-10@ | CNN-CMSIS [164] | 80.3% | 99 ms | < 1 MB |
(*) Device: STM32 Cortex-M4.
(@) Device: STM32 Cortex-M7. OOS: overflowed SRAM.
Within each dataset class, the later-listed method uses optimization techniques superior to those of the method it is compared against.
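
To make the latency and flash differences in the table easier to compare, the following minimal sketch computes the speedup and flash savings for the rows that report latency in milliseconds. The numbers are copied from the table; the row labels, the function names, and the treatment of the first-listed method as the baseline are assumptions made for illustration only.

```python
# Values copied from the table; each tuple is (first-listed method, second-listed method).
# The labels and the "baseline vs. optimized" reading are illustrative assumptions.
ROWS = {
    "HAR, TFLM -> Cube.AI": {"latency_ms": (58, 14), "flash_kb": (275, 192)},
    "KWS, TFLM -> Cube.AI": {"latency_ms": (380, 373), "flash_kb": (288, 247)},
    # Flash is reported only as "< 1 MB" for CIFAR-10, so no flash comparison is possible.
    "CIFAR-10, CNN -> CNN-CMSIS": {"latency_ms": (456, 99)},
}


def speedup(baseline, optimized):
    """Return how many times faster the optimized runtime is."""
    return baseline / optimized


def reduction_pct(baseline, optimized):
    """Return the percentage saved relative to the baseline."""
    return 100.0 * (baseline - optimized) / baseline


def summarize(rows=ROWS):
    """Compute the latency speedup and, where available, the flash savings per row."""
    summary = {}
    for name, metrics in rows.items():
        entry = {"speedup": round(speedup(*metrics["latency_ms"]), 2)}
        if "flash_kb" in metrics:
            entry["flash_saved_pct"] = round(reduction_pct(*metrics["flash_kb"]), 1)
        summary[name] = entry
    return summary


if __name__ == "__main__":
    for name, stats in summarize().items():
        print(name, stats)
```

Under these assumptions, the human activity recognition and CIFAR-10 rows show latency gains of roughly 4x to 5x, while the keyword spotting row shows almost no latency change and a modest flash reduction.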