2025 Nov 25;15:41940. doi: 10.1038/s41598-025-25755-1

Table 2.

Cross-validation results for seven deep learning models on the kidney ultrasound classification task. For each model, the table reports the average accuracy (± standard deviation) across the five folds and the best validation accuracy achieved in any single fold. YOLO11x-cls achieves the highest average accuracy and the highest best-fold validation accuracy, indicating strong generalization across different patient subsets. The standard deviation across folds reflects each model’s stability when trained on different subsets of the data.

Model         Average accuracy across folds (%)   Best validation accuracy (%)
InceptionV3   [inline graphic]                    94.21
EfficientNet  [inline graphic]                    72.08
VGG16         [inline graphic]                    88.75
ResNet34      [inline graphic]                    89.20
ResNet50      [inline graphic]                    90.08
YOLOv8x-cls   [inline graphic]                    95.44
YOLO11x-cls   90 ± 5.9                            95.9
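
The two reported columns can be derived from per-fold validation accuracies as in the minimal sketch below. The function name `summarize_folds` and the fold values are hypothetical and are not taken from the paper. The sketch also assumes the sample standard deviation (n − 1 denominator); the table does not state which estimator the authors used.

```python
from statistics import mean, stdev


def summarize_folds(fold_accuracies):
    """Return (mean, sample standard deviation, best) of per-fold accuracies in %."""
    if len(fold_accuracies) < 2:
        raise ValueError("need at least two folds to compute a standard deviation")
    # stdev() uses the n - 1 denominator (assumption: the paper may use n instead).
    return mean(fold_accuracies), stdev(fold_accuracies), max(fold_accuracies)


# Hypothetical per-fold validation accuracies (%), for illustration only.
example_folds = [70.0, 75.0, 80.0, 85.0, 90.0]

avg, sd, best = summarize_folds(example_folds)
print(f"{avg:.1f} ± {sd:.1f} (best fold: {best:.1f})")
```

A model with a high best-fold accuracy but a large standard deviation performs unevenly across data splits, which is why the table reports both columns.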