2023 Sep 29;9:121. doi: 10.1038/s41378-023-00580-6

Table 2.

Test of different tricks (combinations) for performance enhancement on the optimal basic network (YOLOv5-x)

| Step | Objective | Tricks (combinations) | mAP@0.5 | Included in the BALFilter Reader |
|---|---|---|---|---|
| 1 | Improving the intersection between the prediction and annotation bounding boxes | GIoU loss | 92.1% | No |
| | | DIoU loss | 91.3% | No |
| | | CIoU loss | 93.0% | Yes |
| 2 | Data augmentation to improve the generalization of the basic network | CIoU loss & image flip | 93.7% | No |
| | | CIoU loss & image flip & mosaic | 95.0% | No |
| | | CIoU loss & image flip & mixup | 92.0% | No |
| | | CIoU loss & image flip & mosaic & HSV augmentation | 95.7% | Yes |
| 3 | Improved recognition of poorly distinguishable objects | CIoU loss & image flip & mosaic & HSV augmentation & focal loss | 94.7% | No |
| 4 | Deduplication of multiple bounding boxes | CIoU loss & image flip & mosaic & HSV augmentation & default-NMS | 95.7% | Yes |
| | | CIoU loss & image flip & mosaic & HSV augmentation & soft-NMS | 95.7% | No |
| | | CIoU loss & image flip & mosaic & HSV augmentation & weighted-NMS | 95.7% | No |
| 5 | Overall performance improvement via multiple rounds of inference | CIoU loss & image flip & mosaic & HSV augmentation & default-NMS & TTA (NMS) | 96.2% | Yes |

The bold part is the combination of selected tricks for the best performance (highest mAP@0.5), which is also the focus of this work
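
For readers unfamiliar with the bounding-box loss selected in step 1, the following is a minimal sketch of the CIoU loss as it is commonly defined (IoU term, normalized center-distance term, and aspect-ratio term). It is an illustration only, not the authors' implementation: the function name `ciou_loss`, the `(x1, y1, x2, y2)` box format, and the `eps` stabilizer are assumptions made for this example.

```python
import math


def ciou_loss(box_p, box_g, eps=1e-9):
    """Illustrative CIoU loss for two boxes given as (x1, y1, x2, y2).

    Hypothetical helper for explanation; not taken from the paper's code.
    """
    px1, py1, px2, py2 = box_p
    gx1, gy1, gx2, gy2 = box_g

    # Widths and heights of the predicted and ground-truth boxes.
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union.
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    union = pw * ph + gw * gh - inter
    iou = inter / (union + eps)

    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Squared distance between the two box centers.
    rho2 = ((px1 + px2) - (gx1 + gx2)) ** 2 / 4 + ((py1 + py2) - (gy1 + gy2)) ** 2 / 4

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(gw / (gh + eps)) - math.atan(pw / (ph + eps))) ** 2
    alpha = v / (1 - iou + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v
```

Under this definition, dropping the aspect-ratio term `alpha * v` gives the DIoU loss, so the sketch also indicates how two of the step 1 candidates differ. Identical boxes give a loss close to 0, while non-overlapping boxes give a loss above 1 because the center-distance penalty is added to `1 - IoU`.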