2025 Sep 29;86(5):624–654. [Article in Korean] doi: 10.3348/jksr.2025.0058

Table 4. Comparison of Performance Metrics of AI Models for KL Grading.

Study AUC Kappa Accuracy Sensitivity Specificity Test Set
Tiulpin et al. (47) 0.93 External (OAI)
Liu et al. (48) 0.83 0.78 0.95 Internal
Norman et al. (49) KL0, 1 = 71% KL0, 1 = 86% Internal
KL2 = 69% KL2 = 84%
KL3 = 86% KL3 = 98%
KL4 = 85% KL4 = 99%
Nguyen et al. (50) 0.79 53% External (MOST)
Tiulpin & Saarakkala (51) 0.82 0.67 External (MOST)
Kim et al. (52) KL0, 1 = 0.80 Internal
KL2 = 0.69
KL3 = 0.89
KL4 = 0.95
Thomas et al. (53) 0.86 0.66 Internal (OAI)
Brejnebøl et al. (54) 0.84 0.88 KL0 = 79% KL0 = 100% KL0 = 76% External
KL1 = 73% KL1 = 27% KL1 = 98%
KL2 = 83% KL2 = 75% KL2 = 85%
KL3 = 92% KL3 = 94% KL3 = 91%
KL4 = 96% KL4 = 86% KL4 = 100%

AUC = area under the curve, KL = Kellgren-Lawrence, MOST = Multicenter Osteoarthritis Study, OAI = Osteoarthritis Initiative