Table 2.
Model | Accuracy (95% CI) | Precision (95% CI) | Recall (95% CI) | F1 score (95% CI) |
---|---|---|---|---|
Ensemble DenseNet [40] | 83.4% (80.9–84.1%) | 81.3% (79.6–83.0%) | 83.9% (82.1–84.4%) | 83.2% (81.5–84.0%) |
ResNet [24] | 81.0% (79.5–83.0%) | 78.6% (77.9–80.4%) | 81.5% (80.1–82.4%) | 80.8% (79.5–81.9%) |
EfficientNet-B4 | 82.8% (81.7–83.6%) | 83.9% (82.0–84.7%) | 82.1% (81.5–83.9%) | 82.5% (81.6–84.1%) |
Two-stage framework | 85.6% (84.1–85.9%) | 86.0% (84.4–86.7%) | 83.2% (83.0–84.5%) | 83.9% (83.3–85.0%) |
U-Net with multitask model | 85.9% (84.3–86.7%) | 85.0% (83.9–86.2%) | 86.7% (84.9–87.0%) | 86.3% (84.6–86.8%) |
Multi-task without pretraining | 87.2% (86.4–88.6%) | 85.0% (83.8–86.2%) | 87.9% (86.1–88.5%) | 87.2% (85.5–87.9%) |
Multi-task with regression | 89.1% (87.0–91.1%) | 90.3% (88.7–90.9%) | 88.0% (87.6–89.8%) | 88.6% (87.9–90.1%) |
Proposed method | 90.8% (88.6–93.3%) | 90.3% (89.0–92.6%) | 92.4% (90.1–94.2%) | 91.9% (89.8–93.8%) |
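
Table 2 reports each metric as a point estimate with a 95% confidence interval, but the table itself does not say how the intervals were obtained. As an illustration only, the sketch below shows one common way to produce numbers of this form: a percentile bootstrap over the test set. It assumes binary labels (1 = positive), and the function names `binary_metrics` and `bootstrap_ci` are hypothetical; it is not the authors' evaluation code.

```python
import random


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators in small resamples.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def bootstrap_ci(y_true, y_pred, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap interval for each metric.

    Returns {metric: (point_estimate, lower, upper)}.
    Assumption: test samples are independent, so resampling them
    with replacement is a reasonable model of sampling variability.
    """
    rng = random.Random(seed)
    n = len(y_true)
    point = binary_metrics(y_true, y_pred)
    samples = {name: [] for name in point}
    for _ in range(n_resamples):
        # Draw n test indices with replacement and re-score.
        idx = [rng.randrange(n) for _ in range(n)]
        scores = binary_metrics([y_true[i] for i in idx],
                                [y_pred[i] for i in idx])
        for name, value in scores.items():
            samples[name].append(value)
    result = {}
    for name, values in samples.items():
        values.sort()
        lower = values[int((alpha / 2) * (n_resamples - 1))]
        upper = values[int((1 - alpha / 2) * (n_resamples - 1))]
        result[name] = (point[name], lower, upper)
    return result


if __name__ == "__main__":
    # Toy labels for illustration; not data from the study.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
    for name, (est, lo, hi) in bootstrap_ci(y_true, y_pred).items():
        print(f"{name}: {est:.1%} ({lo:.1%}-{hi:.1%})")
```

For a multi-class task, the per-class precision, recall, and F1 would need to be averaged (for example, macro-averaged) before resampling; in that case the reported F1 is not necessarily the harmonic mean of the reported precision and recall.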