Table 1.
Models | Accuracy (95%CI) | Precision (95%CI) | Recall (95%CI) | F1 score (95%CI) |
---|---|---|---|---|
Ensemble DenseNet [40] | 86.2% (85.4–88.7%) | 87.2% (85.9–87.7%) | 85.3% (84.4–86.2%) | 86.2% (85.1–86.9%) |
ResNet [24] | 83.3% (81.8–84.6%) | 84.2% (83.0–85.4%) | 82.6% (81.1–83.0%) | 83.4% (82.0–84.2%) |
EfficientNet-B4 | 84.5% (82.2–85.6%) | 83.9% (82.8–84.5%) | 85.2% (84.1–86.3%) | 84.5% (83.4–85.4%) |
Two-stage framework | 87.3% (86.0–88.4%) | 86.8% (86.3–88.2%) | 88.5% (83.3–88.9%) | 87.6% (84.3–88.5%) |
U-Net with multitask model | 89.4% (88.2–91.2%) | 90.3% (88.1–92.0%) | 88.0% (87.4–90.8%) | 89.1% (87.7–91.4%) |
Multi-task without pretrain | 92.5% (90.3–93.1%) | 91.4% (89.9–93.0%) | 93.3% (91.9–94.0%) | 92.3% (90.9–93.5%) |
Multi-task with regression | 92.2% (90.7–93.6%) | 91.8% (89.3–92.8%) | 92.9% (90.0–93.5%) | 92.3% (89.6–93.1%) |
Proposed method | 94.3% (91.4–95.0%) | 93.8% (90.7–94.3%) | 94.6% (92.1–95.2%) | 94.2% (91.4–94.7%) |
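
As a consistency check on the point estimates in Table 1, the F1 score can be recomputed as the harmonic mean of precision and recall. The sketch below assumes that each reported F1 was derived from the reported precision and recall point estimates; if the metrics were instead averaged over folds or classes, small deviations would be expected. The names `rows`, `f1_score`, and `f1_gaps` are illustrative and do not come from the original work.

```python
# Point estimates (precision %, recall %, reported F1 %) copied from Table 1.
rows = {
    "Ensemble DenseNet": (87.2, 85.3, 86.2),
    "ResNet": (84.2, 82.6, 83.4),
    "EfficientNet-B4": (83.9, 85.2, 84.5),
    "Two-stage framework": (86.8, 88.5, 87.6),
    "U-Net with multitask model": (90.3, 88.0, 89.1),
    "Multi-task without pretrain": (91.4, 93.3, 92.3),
    "Multi-task with regression": (91.8, 92.9, 92.3),
    "Proposed method": (93.8, 94.6, 94.2),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)


def f1_gaps(table):
    """Return {model: recomputed F1 minus reported F1}, in percentage points."""
    return {name: f1_score(p, r) - f1 for name, (p, r, f1) in table.items()}


if __name__ == "__main__":
    # Assumption: a gap well below the 0.1-point rounding of the table
    # suggests the reported F1 matches the harmonic-mean definition.
    for name, gap in f1_gaps(rows).items():
        print(f"{name}: recomputed F1 differs from reported by {gap:+.2f} points")
```

Under this assumption, every row agrees with its reported F1 to within the table's rounding precision of one decimal place.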