Table 3.
Comparison of different models on the VisDrone validation set.
| Method | Backbone | Resolution | mAP@.5 (%) | mAP@.75 (%) | mAP@.5:.95 (%) |
|---|---|---|---|---|---|
| RetinaNet [23] | ResNet-50 | 2400×2400 | 44.9 | 27.1 | 26.2 |
| ClusDet [25] | ResNet-50 | 1000×600 | 50.6 | 24.4 | 26.7 |
| ClusDet | ResNeXt-101 | 1000×600 | **53.2** | 26.4 | 28.4 |
| DMNet [24] | ResNet-50 | 1000×600 | 47.6 | 28.9 | 28.2 |
| DMNet | ResNeXt-101 | 1000×600 | 49.3 | 30.6 | 29.4 |
| GLSAN [26] | ResNet-50 | 1000×600 | 51.5 | 22.9 | 25.8 |
| HRDNet [29] | ResNet-50 + ResNet-101 | 2666×1600 | 49.3 | 28.2 | 28.3 |
| QueryDet [2] | ResNet-50 | 2400×2400 | 48.1 | 28.8 | 28.3 |
| GFL V1 [4] | ResNet-18 | 1333×800 | 50.0 | 27.8 | 28.4 |
| GFL V1 (CEASC) [28] | ResNet-18 | 1333×800 | 50.7 | 28.4 | 28.7 |
| Cascade [27] | ResNet-50 | – | 47.1 | 29.3 | 28.8 |
| DFPN [3] | Modified CSP v5-M | 768×768 | 50.9 | 30.5 | 30.3 |
| YOLOv8 | CSPDarkNet | 640×640 | 37.6 | – | 22.1 |
| FocusDet (ours) | STCF-EANet | 768×768 | 48.7 | **35.6** | **30.4** |
Bold marks the best result in each column.
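
The best entry per metric can be checked directly from the numbers in Table 3. The following is a minimal sketch, not part of the original work: the `ROWS` list is a hand transcription of the table, and the names `ROWS`, `METRICS`, and `best_per_metric` are hypothetical helpers introduced only for illustration. Missing entries (shown as "–" in the table) are stored as `None` and skipped.

```python
# Rows transcribed by hand from Table 3:
# (method, backbone, mAP@.5, mAP@.75, mAP@.5:.95), all scores in percent.
# None marks an entry the table leaves blank ("–").
ROWS = [
    ("RetinaNet", "ResNet-50", 44.9, 27.1, 26.2),
    ("ClusDet", "ResNet-50", 50.6, 24.4, 26.7),
    ("ClusDet", "ResNeXt-101", 53.2, 26.4, 28.4),
    ("DMNet", "ResNet-50", 47.6, 28.9, 28.2),
    ("DMNet", "ResNeXt-101", 49.3, 30.6, 29.4),
    ("GLSAN", "ResNet-50", 51.5, 22.9, 25.8),
    ("HRDNet", "ResNet-50 + ResNet-101", 49.3, 28.2, 28.3),
    ("QueryDet", "ResNet-50", 48.1, 28.8, 28.3),
    ("GFL V1", "ResNet-18", 50.0, 27.8, 28.4),
    ("GFL V1 (CEASC)", "ResNet-18", 50.7, 28.4, 28.7),
    ("Cascade", "ResNet-50", 47.1, 29.3, 28.8),
    ("DFPN", "Modified CSP v5-M", 50.9, 30.5, 30.3),
    ("YOLOv8", "CSPDarkNet", 37.6, None, 22.1),
    ("FocusDet (ours)", "STCF-EANet", 48.7, 35.6, 30.4),
]

# Column index of each metric inside a row tuple (hypothetical layout).
METRICS = {"mAP@.5": 2, "mAP@.75": 3, "mAP@.5:.95": 4}


def best_per_metric(rows=ROWS):
    """Return {metric: (method, backbone, score)} for the top score per column."""
    best = {}
    for name, idx in METRICS.items():
        # Ignore rows where the table reports no value for this metric.
        scored = [r for r in rows if r[idx] is not None]
        top = max(scored, key=lambda r: r[idx])
        best[name] = (top[0], top[1], top[idx])
    return best


if __name__ == "__main__":
    for metric, (method, backbone, score) in best_per_metric().items():
        print(f"{metric}: {method} ({backbone}) = {score}")
```

On the transcribed numbers, ClusDet with ResNeXt-101 leads on mAP@.5, while FocusDet leads on mAP@.75 and mAP@.5:.95. The input resolutions differ across rows, so this check only ranks the reported scores and does not control for resolution.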