Diagnostics. 2023 May 18;13(10):1793. doi: 10.3390/diagnostics13101793

Table 4.

Performance of the different deep learning models on the 80-pixel sub-database testing set. The best results per metric are shown in bold. For the ensemble learning models, WA stands for weighted averaging, UA for unweighted averaging, and MV for majority voting; the 3 or 5 at the end of an ensemble model's name indicates whether the top 3 or top 5 base models were combined. All metrics are reported in %.

| Model          | Accuracy  | AUC       | Precision | Recall    | Specificity | F1-Score  |
|----------------|-----------|-----------|-----------|-----------|-------------|-----------|
| MobileNet      | 95.82     | 95.73     | 94.90     | 95.15     | 96.30       | 95.02     |
| MobileNetV2    | 95.29     | 94.87     | 96.36     | 92.26     | 97.48       | 94.27     |
| EfficientNetB0 | 96.47     | 96.46     | 95.26     | 96.39     | 96.53       | 95.82     |
| EfficientNetB1 | 96.50     | 96.41     | 95.83     | 95.83     | 96.99       | 95.83     |
| DenseNet121    | 96.61     | 96.30     | 97.42     | 94.41     | 98.20       | 95.89     |
| DenseNet169    | 96.67     | 96.70     | 95.26     | 96.88     | 96.52       | 96.07     |
| InceptionV3    | 94.56     | 94.55     | 92.71     | 94.47     | 94.63       | 93.58     |
| Xception       | 95.48     | 95.40     | 94.34     | 94.92     | 95.88       | 94.63     |
| Ensemble-WA3   | 97.56     | 97.51     | 96.97     | 97.22     | 97.80       | 97.09     |
| Ensemble-WA5   | 97.69     | 97.59     | **97.54** | 96.95     | **98.23**   | 97.24     |
| Ensemble-UA3   | 97.59     | 97.57     | 96.80     | **97.47** | 97.67       | 97.13     |
| Ensemble-UA5   | **97.72** | **97.65** | 97.39     | 97.18     | 98.12       | **97.28** |
| Ensemble-MV3   | 97.49     | 97.47     | 96.66     | 97.38     | 97.57       | 97.02     |
| Ensemble-MV5   | 97.66     | 97.59     | 97.32     | 97.10     | 98.07       | 97.21     |
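The three ensemble rules named in the caption (weighted averaging, unweighted averaging, and majority voting) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the softmax probability arrays and the validation-accuracy-style weights are hypothetical values chosen only to show the mechanics of combining base-model predictions.

```python
import numpy as np

# Hypothetical softmax outputs of three base models on four test samples
# (binary classification; last axis = class probabilities).
# Shape: (n_models, n_samples, n_classes). Values are illustrative only.
probs = np.array([
    [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]],
    [[0.8, 0.2], [0.4, 0.6], [0.4, 0.6], [0.1, 0.9]],
    [[0.7, 0.3], [0.3, 0.7], [0.7, 0.3], [0.4, 0.6]],
])

# Unweighted averaging (UA): mean of the predicted probabilities
# across models, then argmax over classes.
ua_pred = probs.mean(axis=0).argmax(axis=1)

# Weighted averaging (WA): each model's probabilities are scaled by a
# weight (here: assumed validation accuracies, normalized to sum to 1).
w = np.array([0.967, 0.966, 0.965])
w = w / w.sum()
wa_pred = np.tensordot(w, probs, axes=1).argmax(axis=1)

# Majority voting (MV): each model casts one hard vote per sample;
# the most frequent class wins (argmax breaks ties toward class 0).
votes = probs.argmax(axis=2)                     # (n_models, n_samples)
mv_pred = np.array([np.bincount(v, minlength=2).argmax() for v in votes.T])

print(ua_pred, wa_pred, mv_pred)
```

With these toy inputs all three rules agree, but on harder samples soft averaging (UA/WA) can overturn a narrow hard-vote majority, which is one reason averaging ensembles often edge out plain voting.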