Tomography. 2024 Nov 28;10(12):1915–1929. doi: 10.3390/tomography10120139

Table 2.

Performance comparison of proposed method and baseline models for distal ulna maturity grading.

| Model | Accuracy (95% CI) | Precision (95% CI) | Recall (95% CI) | F1 score (95% CI) |
| --- | --- | --- | --- | --- |
| Ensemble DenseNet [40] | 83.4% (80.9–84.1%) | 81.3% (79.6–83.0%) | 83.9% (82.1–84.4%) | 83.2% (81.5–84.0%) |
| ResNet [24] | 81.0% (79.5–83.0%) | 78.6% (77.9–80.4%) | 81.5% (80.1–82.4%) | 80.8% (79.5–81.9%) |
| Efficient-Net B4 | 82.8% (81.7–83.6%) | 83.9% (82.0–84.7%) | 82.1% (81.5–83.9%) | 82.5% (81.6–84.1%) |
| Two-stage framework | 85.6% (84.1–85.9%) | 86.0% (84.4–86.7%) | 83.2% (83.0–84.5%) | 83.9% (83.3–85.0%) |
| U-Net with multitask model | 85.9% (84.3–86.7%) | 85.0% (83.9–86.2%) | 86.7% (84.9–87.0%) | 86.3% (84.6–86.8%) |
| Multi-task without pretrain | 87.2% (86.4–88.6%) | 85.0% (83.8–86.2%) | 87.9% (86.1–88.5%) | 87.2% (85.5–87.9%) |
| Multi-task with regression | 89.1% (87.0–91.1%) | 90.3% (88.7–90.9%) | 88.0% (87.6–89.8%) | 88.6% (87.9–90.1%) |
| Proposed method | 90.8% (88.6–93.3%) | 90.3% (89.0–92.6%) | 92.4% (90.1–94.2%) | 91.9% (89.8–93.8%) |
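
The table does not state how the multi-class metrics were averaged or how the 95% confidence intervals were obtained. As a minimal sketch only, the code below assumes macro averaging over maturity grades and a percentile bootstrap over test cases. The function names `macro_metrics` and `bootstrap_ci`, the integer grade labels, and the resampling settings are all hypothetical and are not taken from the paper.

```python
import random
from typing import List, Sequence, Tuple


def macro_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float, float, float]:
    """Return (accuracy, macro precision, macro recall, macro F1).

    Assumption: metrics are macro-averaged over all grades seen in either list.
    """
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    k = len(labels)
    return accuracy, sum(precisions) / k, sum(recalls) / k, sum(f1s) / k


def bootstrap_ci(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    n_boot: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> List[Tuple[float, float]]:
    """Percentile bootstrap intervals for the four metrics, in the same order.

    Assumption: test cases are resampled with replacement, n_boot times.
    """
    rng = random.Random(seed)
    n = len(y_true)
    samples = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        samples.append(macro_metrics([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    intervals = []
    for m in range(4):
        values = sorted(s[m] for s in samples)
        lower = values[int((alpha / 2) * (n_boot - 1))]
        upper = values[int((1 - alpha / 2) * (n_boot - 1))]
        intervals.append((lower, upper))
    return intervals


if __name__ == "__main__":
    # Toy labels for illustration only; these are not data from the study.
    truth = [0, 0, 1, 1, 2, 2]
    preds = [0, 1, 1, 1, 2, 0]
    names = ["accuracy", "precision", "recall", "f1"]
    points = macro_metrics(truth, preds)
    for name, point, (lo, hi) in zip(names, points, bootstrap_ci(truth, preds)):
        print(f"{name}: {point:.3f} ({lo:.3f}-{hi:.3f})")
```

Under macro averaging, the F1 score is the mean of per-grade F1 values, so it need not equal the harmonic mean of the reported precision and recall. That would be consistent with rows in the table where the F1 score does not sit between the two in the expected way, although the averaging scheme actually used is not given here.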