2022 Mar 5;4:16. doi: 10.1186/s42836-022-00118-7

Table 2.

 A summary of reviewed studies on knee osteoarthritis diagnosis and knee arthroplasty prediction

| Author (Year) | Journal | Prediction outcome | AI/ML algorithm(s) | Statistical performance | Strengths | Weaknesses | Clinical significance of study |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Norman (2019) [18] | Journal of Digital Imaging | OA severity (KL grade) | DenseNet neural network architectures | Sensitivity & specificity: 84% & 86% (KL grades 0–1), 70% & 84% (KL grade 2), 69% & 97% (KL grade 3), 86% & 99% (KL grade 4). | Comparable sensitivity and specificity to manual KL grading and previous automatic systems employing different AI/ML algorithms. | Training, validation and testing sets were selected from the same dataset. Misclassifications of KL grading typically occurred when there was hardware in the knee. | Provides additional data supporting the potential of AI in automatic assessment of OA radiological severity. |
| Tiulpin (2018) [19] | Scientific Reports | OA severity (KL grade) | Deep Siamese CNN architecture | Average multi-class accuracy: 66.71%. AUC: 0.93. Kappa coefficient (agreement with expert annotations on test dataset): 0.83 (excellent). MSE value: 0.48. | Different datasets used for initial training and testing. | Validation and testing sets were selected from the same dataset. | The provision of probability distributions for each KL grade prediction may assist clinicians in choosing KL grade in ambiguous cases. |
| Heisinger (2020) [13] | Journal of Clinical Medicine | Need for TKA | Artificial neural networks (ANNs) with linear, radial basis function and three-layer perceptron neural network architectures | Total percentage of correctly predicted knees: 80%. Positive predictive value: 84%. Negative predictive value: 73%. Sensitivity: 41%. Specificity: 30%. | First study to consider longitudinal change in symptomology (pain, function, quality of life) and radiographic structural change in a 4-year period prior to TKA. | Training and testing sets were selected from the same dataset. | Future externally validated algorithms that can predict TKA need in advance using routinely available patient data could be highly useful for decisions for referral and triage in a primary care setting. |
| Leung (2020) [15] | Radiology | Need for TKA | Multitask deep learning model (ResNet34) trained with transfer learning | AUC: 0.87. Sensitivity: 83%. Specificity: 77%. | First study to directly predict TKA from knee radiographs using deep learning model. | Limited data size (radiographs from 728 individuals in total) / Training and testing sets were selected from the same dataset. | TKA prediction models solely based on radiological data have limited clinical utility, although they may serve as a reference for future ML studies. |
| El-Galaly (2020) [12] | Clinical Orthopaedics and Related Research | Need for early revision TKA | LASSO regression, random forest classifier, gradient boosting model, neural network | AUCs: 0.57–0.60. | First study to predict early revision TKA (≤ 2 years of primary TKA) using preoperative patient data from arthroplasty registries / Temporal external validation was conducted (testing set selected from a separate hold-out year not included in training set). | Training and testing sets were selected from the same dataset. | Results from this study suggest that future models predicting early revision TKA may benefit from including more pre-operative information or predicting revision over a longer follow-up duration. |
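
For readers less familiar with the measures in the Statistical performance column, the following minimal Python sketch shows how sensitivity, specificity, positive and negative predictive value, and overall accuracy are conventionally derived from a binary confusion matrix. The `ConfusionCounts` class, the `binary_metrics` function and the example counts are hypothetical and are not drawn from any of the reviewed studies.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (hypothetical container for illustration)."""
    tp: int  # true positives: condition present, predicted present
    fp: int  # false positives: condition absent, predicted present
    tn: int  # true negatives: condition absent, predicted absent
    fn: int  # false negatives: condition present, predicted absent


def binary_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Return the standard binary classification metrics as fractions in [0, 1]."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # share of true cases detected
        "specificity": c.tn / (c.tn + c.fp),  # share of non-cases correctly excluded
        "ppv": c.tp / (c.tp + c.fp),          # positive predictive value
        "npv": c.tn / (c.tn + c.fn),          # negative predictive value
        "accuracy": (c.tp + c.tn) / total,    # share of all predictions that are correct
    }


# Made-up counts chosen only to illustrate the arithmetic (assumption, not study data).
example = ConfusionCounts(tp=40, fp=10, tn=45, fn=5)
metrics = binary_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Predictive values depend on how common the outcome is in the test set, whereas sensitivity and specificity do not, which is one reason figures from studies with different cohorts are hard to compare directly.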