Table 2.
Comparison of some classical defect detection models. mAP denotes the mean average precision of each model on the PASCAL VOC 2007 and PASCAL VOC 2012 datasets.
| Model | Improvement | Limitation | mAP (VOC 2007, VOC 2012) |
|---|---|---|---|
| R-CNN | Introduced region proposals (RP) combined with a CNN; mAP greatly improved. | The training procedure is tedious, and training is slow. | 58.50% 62.00% |
| Fast R-CNN | mAP is improved, and training time is shortened. | Cannot meet real-time requirements or perform end-to-end detection. | 70.00% 66.00% |
| Faster R-CNN | Both accuracy and speed are improved, and end-to-end detection is enabled. | Real-time detection cannot be achieved, and the computational cost is high. | 73.20% 70.40% |
| YOLOv4 | Large residual blocks, SPP, and PANet are introduced, and the Mish activation function is used. | Small target detection is poor, and the recall rate is relatively low. | 85.46% 86.68% |
| YOLOv5 | Adaptive anchor calculation, the Focus structure, and the CSP structure are introduced, and the GIoU loss is adopted. | Small target recognition is unstable, and a large amount of training data is required. | 89.10% 92.68% |
| SSD | Multiscale detection, default anchor boxes, and data augmentation. | Default box sizes must be set manually, and small target recognition is poor. | 76.90% 77.91% |
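
Because the mAP column mixes results from two benchmark years, it may help to recall how average precision (AP) is conventionally computed on PASCAL VOC. The following is a minimal sketch, not the evaluation code used by any of the cited models. It assumes that detections have already been matched to ground-truth boxes (typically at IoU ≥ 0.5), and all function names are hypothetical. VOC 2007 uses 11-point interpolated AP, while the later challenges (2010 onward, including 2012) use the area under the monotonically decreasing precision envelope; mAP is then the mean of the per-class AP values.

```python
def precision_recall(scores, is_true_positive, num_ground_truth):
    """Cumulative precision and recall for one class.

    scores: confidence of each detection.
    is_true_positive: whether each detection matched a ground-truth box
        (the matching step is assumed to be done beforehand).
    num_ground_truth: number of ground-truth boxes of this class.
    """
    ranked = sorted(zip(scores, is_true_positive), key=lambda d: -d[0])
    precision, recall = [], []
    tp = fp = 0
    for _, hit in ranked:
        if hit:
            tp += 1
        else:
            fp += 1
        precision.append(tp / (tp + fp))
        recall.append(tp / num_ground_truth)
    return precision, recall


def ap_11_point(precision, recall):
    """11-point interpolated AP (VOC 2007 convention)."""
    total = 0.0
    for i in range(11):
        threshold = i / 10
        # Best precision achieved at any recall >= threshold.
        candidates = [p for p, r in zip(precision, recall) if r >= threshold]
        total += max(candidates, default=0.0)
    return total / 11


def ap_all_point(precision, recall):
    """Area under the precision envelope (VOC 2010 and later convention)."""
    r = [0.0] + list(recall) + [1.0]
    p = [0.0] + list(precision) + [0.0]
    # Make precision non-increasing as recall grows.
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    return sum((r[i + 1] - r[i]) * p[i + 1] for i in range(len(r) - 1))


def mean_average_precision(per_class_ap):
    """mAP: the mean of per-class AP values (dict of class name to AP)."""
    return sum(per_class_ap.values()) / len(per_class_ap)


# Toy usage: three detections of one class, two ground-truth boxes.
prec, rec = precision_recall([0.9, 0.8, 0.7], [True, False, True], 2)
ap_2007_style = ap_11_point(prec, rec)
ap_2012_style = ap_all_point(prec, rec)
```

The two conventions generally give different values for the same detections, so the two figures reported per model are best compared within a column rather than across columns.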