Table 4.
Comparison of different backbones.
| Method | CNN backbone | Swin Transformer | MW-Swin Transformer | mAP | AP50 | AP75 |
|---|---|---|---|---|---|---|
| Faster R-CNN | ✔ | | | 0.301 | 0.709 | 0.215 |
| | | ✔ | | 0.397 (9.6%↑) | 0.881 (17.2%↑) | 0.276 (6.1%↑) |
| | | | ✔ | 0.417 (2%↑) | 0.893 (1.2%↑) | 0.315 (1.2%↑) |
| Mask R-CNN | ✔ | | | 0.345 | 0.774 | 0.237 |
| | | ✔ | | 0.426 (8.1%↑) | 0.914 (14%↑) | 0.318 (8.1%↑) |
| | | | ✔ | 0.433 (0.7%↑) | 0.909 (0.5%↓) | 0.344 (2.6%↑) |
| CenterNet | ✔ | | | 0.414 | 0.876 | 0.318 |
| | | ✔ | | 0.436 (2.2%↑) | 0.913 (3.7%↑) | 0.372 (5.4%↑) |
| | | | ✔ | 0.448 (1.2%↑) | 0.912 (0.1%↑) | 0.365 (0.7%↓) |
| WheatFormer | ✔ | | | 0.368 | 0.825 | 0.250 |
| | | ✔ | | 0.452 (8.4%↑) | 0.886 (6.1%↑) | 0.402 (15.2%↑) |
| | | | ✔ | 0.459 (0.7%↑) | 0.918 (3.2%↑) | 0.384 (1.8%↓) |
Bold values are the results of our experimental method.
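The symbol "↑" indicates an increase relative to the previous backbone (the row above for the same method), "↓" indicates a decrease relative to the previous backbone, and "✔" marks the backbone used in the model. The values in parentheses are absolute differences between scores, expressed in percentage points.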
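As a rough illustration of how the parenthesized changes relate to the scores, the sketch below recomputes a few of them from the table. It assumes each change is the absolute difference between a score and the score in the row above, expressed in percentage points. The helper name `delta_points` is hypothetical and is not taken from the paper.

```python
def delta_points(current: float, previous: float) -> str:
    """Format the change between two scores as percentage points with an arrow.

    Assumption: the change is the absolute difference (current - previous) * 100,
    rounded to one decimal place, not a relative percentage.
    """
    diff = round((current - previous) * 100, 1)
    arrow = "↑" if diff >= 0 else "↓"
    # The 'g' format drops a trailing ".0", so 14.0 is printed as "14".
    return f"{abs(diff):g}%{arrow}"


# Faster R-CNN, mAP: CNN backbone -> Swin Transformer
print(delta_points(0.397, 0.301))  # 9.6%↑
# Mask R-CNN, AP50: Swin Transformer -> MW-Swin Transformer
print(delta_points(0.909, 0.914))  # 0.5%↓
```

Reading the changes as percentage points rather than relative percentages matters when comparing rows: a move from 0.301 to 0.397 is 9.6 points, which would be roughly a 32% relative gain.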