Abstract
Purpose
To predict parameters associated with patellar instability from magnetic resonance imaging (MRI) measurements using a machine learning model and to quantify the relative importance of radiographic risk factors that are associated with the presence of instability.
Methods
Patients with a confirmed clinical diagnosis of patellar instability and age- and sex-matched controls without patellofemoral pathology were identified retrospectively. Multiple measurements to describe patella alta, malalignment, and trochlear dysplasia were performed on knee MRI scans. Univariate and multivariable logistic regressions were used to identify MRI measurements associated with patellar instability. Machine learning models were developed and evaluated for accuracy, discrimination, and calibration in predicting patellar instability. Shapley additive explanations (SHAP) were used to evaluate global and local variable importance.
Results
A total of 256 patients were included in this study (128 with patellar instability and 128 controls, 63% female sex). Multivariable logistic regression found significant associations between diagnosis of patellar instability and lower patellotrochlear index (OR, 1.39 [95% CI, 1.15-1.69]; P < .001), greater Insall-Salvati ratio (OR, 1.65 [95% CI, 1.37-2.02]; P < .001), greater tibial tubercle–trochlear groove (TT-TG) distance (OR, 1.12 [95% CI, 1.06-1.19]; P < .001), and lower trochlear depth (OR, 1.42 [95% CI, 1.09-1.87]; P = .009). The random forest model had the highest performance among machine learning models, with an area under the receiver operating characteristic curve of 0.85. In this model, the variables with the greatest importance were Insall-Salvati ratio, TT-TG distance, and trochlear depth.
Conclusions
The final model was able to reliably predict MRI-based parameters associated with patellar instability. Insall-Salvati ratio, TT-TG distance, and trochlear depth were the most important risk factors both in the machine learning models and using conventional statistical analysis.
Clinical Relevance
This model has the potential to improve the diagnostic accuracy of patellar instability from MRI scans. The explanations provided by the model could enable clinicians to personalize care and understand the factors driving patellar instability in individual patients.
Patellar instability, clinically defined as the subluxation or dislocation of the patella from the trochlear groove, is common among adolescents and female individuals.1 Several anatomic risk factors have been identified as contributors to patellar instability, including trochlear dysplasia, patella alta, and malalignment.2 The multifaceted etiology of the condition presents challenges when making treatment decisions because these morphologies can vary significantly between individuals.3 Additionally, current understanding of the additive effects of these morphologies is limited, and numerous imaging measurements have been developed for the identification and evaluation of each of these anatomic risk factors.4 However, no single anatomic parameter is considered the gold standard, underscoring the complexity of effectively diagnosing and managing patellar instability.
One of the primary risk factors for patellar instability is trochlear dysplasia, which is characterized by a flattening of the trochlear groove and subsequent loss of the osteochondral restraint of the patella.5 Several quantitative imaging measurements for trochlear dysplasia have been developed, including the sulcus angle, lateral trochlear inclination, and trochlear depth.6 However, these measurements and associated optimal cutoff values have been shown to vary based on factors such as sex, measurement technique, and landmarks.7 Another risk factor for patellar instability is patella alta, in which the position of the patella within the patellofemoral joint is more proximal than normal, leading to a greater range of motion in which the patella is not engaged within the trochlear groove.8 Multiple measurements have been developed to measure patella alta, including the Insall-Salvati (IS) ratio, modified Insall-Salvati (m-IS) ratio, Caton-Deschamps index (CDI), and Blackburne-Peel index (BPI).9 However, there is currently no consensus on which of these measurements is best for the diagnosis of patella alta given that each measurement as well as imaging modality has unique strengths and limitations.8 Malalignment has also been shown to be a risk factor for patellar instability, and several measurements have been used to quantify this risk factor. 
Malalignment is primarily measured using the tibial tubercle–trochlear groove (TT-TG) distance.10 A TT-TG distance greater than 20 mm on computed tomography (CT) imaging is sometimes used as an indication for additional surgery, such as medializing tibial tubercle osteotomy.11 Although measurement of the TT-TG distance has been shown to have high interobserver reliability, the TT-TG distance can vary based on factors such as imaging modality, knee flexion, and patient size.11,12 To reduce variability with knee flexion, the tibial tubercle–posterior cruciate ligament (TT-PCL) distance has been described, whereas additional measurements of anteroposterior (AP) TT-TG distance or sagittal TT-TG distance have been used as measurements of sagittal malalignment.13, 14, 15
The purposes of this study were to predict parameters associated with patellar instability from magnetic resonance imaging (MRI) measurements using a machine learning model and to quantify the relative importance of radiographic risk factors that are associated with the presence of instability. We hypothesized that the predictive model would be able to accurately and reliably predict the diagnosis of patellar instability and provide insight into the relative contributions of different morphologic characteristics to patellar instability.
Methods
Study Design
This was a retrospective case-control study conducted using data from the electronic medical record system at an academic tertiary care institution. The study was approved by the Institutional Review Board and followed the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis + Artificial Intelligence (TRIPOD+AI) guidelines.16, 17, 18
Study Population
The study population included patients with patellar instability, as well as age- and sex-matched control patients, who were identified through the electronic medical record system. The inclusion criteria for the patellar instability group were a diagnosis of patellar instability between January 2010 and January 2020, age between 18 and 40 years, and availability of a knee MRI scan. Patellar instability was defined as the occurrence of at least 1 documented patellar dislocation. The exclusion criteria were medical history of knee fracture or ligament injury and prior surgery. Patients with poor-quality MRI scans or with osteoarthritis that limited the ability to perform measurements were excluded as well. Age- and sex-matched controls were selected from a dataset of patients who received knee MRI studies with a diagnosis of a meniscal tear. Eligibility criteria for the control group were availability of a knee MRI scan and no history of patellofemoral disorders, fracture, or surgery of the knee.
MRI Measurements
MRI measurements were conducted using Visage Imaging (Richmond, Australia) and performed by 2 graduate students (A.S., I.W.) in bioengineering and medicine who were trained to perform the measurements (Fig 1). Descriptions of each measurement technique are included in Appendix 1.
Fig 1.
The sulcus angle was measured as the angle between the lines from the deepest point of the trochlear groove to the most prominent points of the lateral and medial condyles.19 Lateral trochlear inclination was measured as the angle between the lines tangent to the posterior femoral condyles and the lateral trochlear facet. Trochlear depth was measured as the average of the distances from the posterior condylar line to the medial and lateral condyles minus the distance from the posterior condylar line to the trochlear groove. The Insall-Salvati (IS) ratio was calculated as the length of the patellar tendon (B) divided by the patellar length (A). The Blackburne-Peel index was calculated as the ratio between the distance from the distal end of the patellar articular surface to the tibial plateau line (B) and the patellar articular length (A). The Caton-Deschamps index was defined as the ratio between the distance between the distal end of the patellar articular surface and the most anterosuperior point of the tibial plateau (B) and the length of the patellar articular surface (A). The modified IS ratio was calculated as the ratio between the distance between the distal end of the patellar articular surface and the tibial tuberosity (B) and the length of the patellar articular surface (A). The patellotrochlear index was calculated as the ratio between the length of trochlear cartilage overlapping patellar cartilage (B) and the patellar cartilage length (A). The TT-TG distance (red arrows) was measured as the distance between the most prominent point of the tibial tubercle and the deepest point in the trochlear groove on a line parallel to the posterior condylar axis. The AP TT-TG distance (red arrows) was measured as the anteroposterior distance between those 2 points. The tibial tubercle–posterior cruciate ligament distance was defined as the distance between the midpoint of the tibial tubercle and the medial border of the posterior cruciate ligament. (AP, anteroposterior; TTTG, tibial tubercle–trochlear groove.)
Outcome
The diagnosis of patellar instability was established through chart review and confirmation of a history of at least 1 documented episode of patellar dislocation.
Statistics
Sample size was determined using the recommendation of having at least 10 events per variable established by Peduzzi et al.29 Given the inclusion of 128 patients with patellar instability, up to 12 variables could be incorporated into the regression and prediction models.
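The events-per-variable reasoning above can be written as a short worked calculation. The symbols E (number of events) and k_max (maximum number of candidate predictors) are introduced here for illustration only and do not appear in the original analysis.

```latex
k_{\max} = \left\lfloor \frac{E}{10} \right\rfloor
         = \left\lfloor \frac{128}{10} \right\rfloor
         = 12
```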
Descriptive statistics for patient characteristics and MRI measurements were reported using proportions, means, and standard deviations (SDs). Normality was assessed using the Shapiro-Wilk test. Values were compared between symptomatic and control patients using independent-sample t tests and the Pearson χ2 test. Univariate and multivariable binary logistic regression was used to measure associations between MRI measurements and diagnosis of patellar instability.
Variables were selected for multivariable regression based on univariate analysis, literature, and clinical relevance. Multicollinearity between predictors was assessed using pair-wise Spearman correlations and the variance inflation factor, which measures how much the variance of a regression coefficient is inflated owing to multicollinearity with other predictors in the model. Variables with a variance inflation factor below 4 were included for regression. Measurements with lower average values in the symptomatic group (patellotrochlear index [PTI], trochlear depth, trochlear angle, and lateral trochlear inclination) were presented with inverse odds ratios (ORs). MRI metrics derived from ratios (IS ratio, m-IS ratio, CDI, BPI, and PTI) were scaled by 10 for logistic regression analysis so that a 0.1 increase in the values of these measurements would be equivalent to a 1-unit increase in the logistic regression analysis.
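The collinearity threshold and the rescaling described above can be summarized with standard identities. This is a sketch in notation introduced here, not the authors' own formulation: R_j^2 denotes the coefficient of determination obtained when predictor j is regressed on the remaining predictors, and β denotes the log-odds coefficient per 1-unit increase in an unscaled ratio. The reading of "inverse OR" as the reciprocal of the OR is an assumption.

```latex
\begin{aligned}
\mathrm{VIF}_j &= \frac{1}{1 - R_j^2},
  &\qquad \mathrm{VIF}_j < 4 &\iff R_j^2 < 0.75 \\
\mathrm{OR}_{\text{per }0.1} &= e^{\beta/10} = \left(e^{\beta}\right)^{1/10},
  &\qquad \mathrm{OR}_{\text{inverse}} &= e^{-\beta} = \frac{1}{\mathrm{OR}}
\end{aligned}
```

Under these identities, an OR reported for a ratio scaled by 10 describes the change in odds per 0.1 increase in the original ratio, and an inverse OR above 1 indicates that lower values of the measurement are associated with instability.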
Model Development
Although regression analysis can be considered a rudimentary form of machine learning, the primary purpose of the regression analysis was to measure the independent associations between the risk factors and patellar instability. To directly assess the predictive utility of these factors, a set of machine learning models using various methods was trained and tested on the dataset.
Variables were selected for use in machine learning models using recursive feature elimination with random forests. The machine learning models used in this study included logistic regression, decision trees, and random forest. Logistic regression estimates the probability of a binary outcome by fitting the input data to a logistic function. Decision tree classifiers are a form of supervised machine learning that work by recursively splitting the input data into subsets based on feature values to make predictions. Random forest classifiers build an ensemble of decision trees with randomized subsets of input data and features and use the average output of the individual trees to make predictions. The optimal hyperparameters of the models were determined by performing a grid search.
The dataset consisted of an equal number of cases and controls. It was randomly divided into a training set containing 70% of the data and a testing set containing the remaining 30%. The training set was used to perform 10-fold cross validation for each of the models. In this process, the training set was divided into 10 subsets, and the model was trained 10 times using a different subset of the data as the validation set in each iteration. The best model was then trained on the entire training set and evaluated on the testing set.
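The partitioning scheme described above can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions, not the study's code: the function names, the seed, and the unstratified shuffle are hypothetical, and the software actually used for splitting is not reported here.

```python
import random


def split_train_test(n_patients, test_fraction=0.30, seed=0):
    """Shuffle patient indices and split them into training and testing lists."""
    indices = list(range(n_patients))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary, hypothetical choice
    n_test = round(n_patients * test_fraction)
    return indices[n_test:], indices[:n_test]


def k_fold_splits(train_indices, k=10):
    """Yield (fit, validation) index lists so each fold is held out exactly once."""
    folds = [train_indices[i::k] for i in range(k)]
    for held_out, validation in enumerate(folds):
        fit = [idx for j, fold in enumerate(folds) if j != held_out for idx in fold]
        yield fit, validation


train_idx, test_idx = split_train_test(256)
print(len(train_idx), len(test_idx))  # 179 77

for fit_idx, val_idx in k_fold_splits(train_idx):
    # A model would be fitted on fit_idx and scored on val_idx at this point.
    assert not set(fit_idx) & set(val_idx)
```

With 256 patients, a 30% hold-out rounds to 77 test patients and 179 training patients, which matches the set sizes reported for the cross-validation table and the test-set figures.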
Model Performance
During cross validation, models were evaluated using accuracy, area under the receiver operating characteristic curve (AUC), F1 score, precision, sensitivity, and specificity. Accuracy was defined as the number of correct predictions made by the model divided by the total number of predictions. The AUC measures the ability of the model to distinguish between symptomatic and control patients; it was calculated as the area under the curve obtained by plotting the true-positive rate against the false-positive rate at various classification thresholds. The AUC was defined as poor if between 0.6 and 0.7, fair if between 0.7 and 0.8, good if between 0.8 and 0.9, and excellent if above 0.9.30 Precision was defined as the proportion of true positives among all positive predictions made by the model. Sensitivity (recall) was defined as the proportion of actual positive cases that the model classified as positive. Specificity was defined as the proportion of actual negative cases that the model classified as negative. The F1 score was calculated as the harmonic mean of precision and sensitivity.
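The metric definitions above can be made concrete with a small standard-library sketch. The function names and the toy labels and scores are hypothetical and are not study data. The AUC is computed here with the rank-based formulation (the probability that a randomly chosen case receives a higher score than a randomly chosen control, with ties counted as one half), which equals the area under the empirical receiver operating characteristic curve.

```python
def classification_metrics(y_true, y_pred):
    """Confusion-matrix metrics for binary labels (1 = instability, 0 = control).

    Assumes at least one actual positive, one actual negative,
    and one positive prediction, so no denominator is zero.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


def auc(y_true, scores):
    """Rank-based AUC: fraction of case-control pairs ranked correctly."""
    cases = [s for t, s in zip(y_true, scores) if t == 1]
    controls = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


# Toy example (invented values): 4 cases and 4 controls, threshold 0.5.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
predictions = [1 if s >= 0.5 else 0 for s in scores]

print(classification_metrics(labels, predictions))
print(auc(labels, scores))  # 0.8125
```

In the toy example, 13 of the 16 case-control pairs are ranked correctly, giving an AUC of 0.8125, which would fall in the "good" band defined above.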
The final model was evaluated using receiver operating characteristic curve analysis, calibration, and the Brier score. The calibration curve was developed by splitting the test set into bins and plotting the predicted probabilities against the observed proportions of positive cases. The slope and intercept of the curve were measured to evaluate how well the predicted probabilities matched the true outcomes. The Brier score was calculated as the mean squared difference between the observed outcomes and the predicted probabilities.
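The Brier score and the binning step behind a calibration curve can be sketched as follows. This is an illustration under assumptions rather than the study's implementation: the function names, the equal-width binning, and the number of bins are hypothetical choices, and the null model is taken to be one that always predicts the observed prevalence.

```python
def brier_score(y_true, probs):
    """Mean squared difference between outcomes (0/1) and predicted probabilities."""
    return sum((p - y) ** 2 for y, p in zip(y_true, probs)) / len(y_true)


def null_brier_score(y_true):
    """Brier score of a reference model that always predicts the observed prevalence."""
    prevalence = sum(y_true) / len(y_true)
    return brier_score(y_true, [prevalence] * len(y_true))


def calibration_bins(y_true, probs, n_bins=5):
    """Return (mean predicted probability, observed event proportion) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, probs):
        # Equal-width bins on [0, 1]; a probability of exactly 1.0 goes in the last bin.
        bins[min(int(p * n_bins), n_bins - 1)].append((y, p))
    return [
        (sum(p for _, p in b) / len(b), sum(y for y, _ in b) / len(b))
        for b in bins
        if b
    ]


# Hypothetical outcome vector: 77 patients, 35 of them cases (invented split).
hypothetical_outcomes = [1] * 35 + [0] * 42
print(round(null_brier_score(hypothetical_outcomes), 3))  # 0.248
```

For a prevalence p, the null-model Brier score reduces to p(1 − p), so it approaches 0.25 for a balanced sample; a fitted model is informative to the extent that its Brier score falls below this reference value.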
Model Explanations
To provide insight into the factors influencing model outputs, both global explanations of overall model behavior and local explanations for decisions on individual patients were calculated using Shapley additive explanations (SHAP). SHAP values estimate the impact of each feature on an individual prediction by calculating the average marginal contribution of the feature across all possible feature subsets.31 A SHAP summary plot was developed to display the local explanations for all patients in the test set. Each dot on the plot represented an individual patient, and the features were ranked using the mean absolute SHAP value of each feature.
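The averaging of marginal contributions over feature subsets can be illustrated with an exact, brute-force Shapley computation. This is a sketch only: the toy model, its contribution values, and the function names are invented for illustration, and the study does not state which SHAP implementation was used. Enumerating every subset is feasible only for a small number of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values: weighted average marginal contribution over all subsets."""
    n = len(features)
    result = {}
    for feature in features:
        others = [f for f in features if f != feature]
        total = 0.0
        for size in range(n):
            # Weight of a subset of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_feature = value_fn(set(subset) | {feature})
                without_feature = value_fn(set(subset))
                total += weight * (with_feature - without_feature)
        result[feature] = total
    return result


def toy_model(present):
    """Hypothetical predicted probability given which features are 'known'."""
    probability = 0.50  # invented baseline
    if "IS ratio" in present:
        probability += 0.20
    if "TT-TG distance" in present:
        probability += 0.10
    if "IS ratio" in present and "TT-TG distance" in present:
        probability += 0.04  # invented interaction, shared equally by both features
    return probability


features = ["IS ratio", "TT-TG distance", "trochlear depth"]
for name, value in shapley_values(toy_model, features).items():
    print(f"{name}: {value:+.2f}")
```

In this toy case the values sum to the difference between the full prediction (0.84) and the baseline (0.50), which is the additivity property that allows per-feature contributions to be displayed for an individual patient.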
Results
Patient Characteristics
A total of 128 patients with patellar instability (mean age, 27.50 years [SD, 9.87 years]; 63% female sex) and 128 age- and sex-matched controls (mean age, 27.59 years [SD, 9.50 years]; 63% female sex) were included in this study. Patient characteristics are summarized in Table 1. Among the 256 patients included in the study, 37% were men and 63% were women (Table 2). Patients in the patellar instability group had greater values for several MRI metrics, including TT-TG distance (P < .001), AP TT-TG distance (P < .001), and IS ratio (P < .001). Patients with patellar instability had lower lateral trochlear inclination (P < .001), trochlear depth (P < .001), PTI (P < .001), and m-IS ratio (P = .010) relative to controls. No significant difference was found between cases and controls for age, sex, laterality, cartilage sulcus angle, trochlear angle, CDI, TT-PCL distance, and BPI.
Table 1.
Patient Characteristics
| Characteristic | Control (n = 128) | Patellar Instability (n = 128) | P Value∗ |
|---|---|---|---|
| Age, yr | 27.59 (9.50) | 27.50 (9.87) | >.900 |
| Sex, n (%) | | | >.900 |
| Female | 81 (63) | 81 (63) | |
| Male | 47 (37) | 47 (37) | |
| Knee, n (%) | | | .6 |
| Left | 69 (54) | 73 (57) | |
| Right | 59 (46) | 55 (43) | |
| Patellar height | | | |
| Insall-Salvati ratio | 1.17 (0.17) | 1.35 (0.23) | <.001† |
| Blackburne-Peel index | 0.92 (0.16) | 0.91 (0.18) | .7 |
| Modified Insall-Salvati ratio | 1.84 (0.25) | 1.76 (0.24) | .010† |
| Caton-Deschamps index ratio | 1.10 (0.18) | 1.12 (0.20) | .3 |
| Patellotrochlear index | 0.45 (0.15) | 0.36 (0.21) | <.001† |
| Malalignment | | | |
| TT-TG distance, mm | 12.24 (4.51) | 16.32 (6.07) | <.001† |
| AP TT-TG distance, mm | 1.34 (5.93) | 4.24 (6.40) | <.001† |
| TT-PCL distance, mm | 23.69 (4.79) | 23.64 (5.31) | >.900 |
| Trochlear dysplasia | | | |
| Sulcus angle, ° | 149.81 (8.33) | 153.32 (30.77) | .2 |
| Lateral trochlear inclination, ° | 16.06 (4.85) | 13.00 (6.36) | <.001† |
| Cartilaginous trochlear depth, mm | 4.11 (1.43) | 2.93 (1.73) | <.001† |
| Trochlear length angle, ° | 64.93 (6.66) | 63.38 (5.95) | .051 |
NOTE. Data are presented as mean (standard deviation) unless otherwise indicated.
AP, anteroposterior; TT-PCL, tibial tubercle–posterior cruciate ligament; TT-TG, tibial tubercle–trochlear groove.
∗Pearson χ2 test or Welch 2-sample t test.
†Statistically significant.
Table 2.
MRI Measurement Values of Cohort Aggregated by Sex
| Characteristic | Female (n = 162) | Male (n = 94) | P Value∗ |
|---|---|---|---|
| Age, yr | 28.41 (10.09) | 26.06 (8.74) | .052 |
| Patellar height | | | |
| Insall-Salvati ratio | 1.27 (0.22) | 1.23 (0.23) | .2 |
| Blackburne-Peel index | 0.91 (0.16) | 0.91 (0.18) | >.900 |
| Modified Insall-Salvati ratio | 1.81 (0.24) | 1.78 (0.26) | .3 |
| Caton-Deschamps index ratio | 1.12 (0.19) | 1.10 (0.19) | .5 |
| Patellotrochlear index | 0.38 (0.20) | 0.43 (0.17) | .043 |
| Malalignment | | | |
| TT-TG distance, mm | 13.96 (5.75) | 14.83 (5.63) | .2 |
| AP TT-TG distance, mm | 3.48 (6.18) | 1.60 (6.44) | .023 |
| TT-PCL distance, mm | 23.01 (4.40) | 24.79 (5.85) | .011 |
| Trochlear dysplasia | | | |
| Sulcus angle, ° | 151.19 (27.16) | 152.22 (10.90) | .7 |
| Lateral trochlear inclination, ° | 14.04 (5.97) | 15.37 (5.56) | .075 |
| Cartilaginous trochlear depth, mm | 3.29 (1.65) | 3.92 (1.69) | .004 |
| Trochlear length angle, ° | 64.53 (7.14) | 63.51 (4.65) | .2 |
NOTE. Data are presented as mean (standard deviation).
AP, anteroposterior; MRI, magnetic resonance imaging; TT-PCL, tibial tubercle–posterior cruciate ligament; TT-TG, tibial tubercle–trochlear groove.
∗Welch 2-sample t test or Pearson χ2 test.
Binary Logistic Regression
Univariate logistic regressions found that several variables had significant associations with diagnosis of patellar instability (Table 3). MRI metrics found to have significant associations with diagnosis of patellar instability at the P < .05 level were smaller trochlear depth, greater IS ratio, smaller PTI, greater TT-TG distance, lower m-IS ratio, smaller lateral trochlear inclination, and greater AP TT-TG distance.
Table 3.
Univariate Logistic Regression of MRI Characteristics Associated With Diagnosis of Patellar Instability (N = 256)
| Characteristic | N | OR | 95% CI | P Value |
|---|---|---|---|---|
| Patellar height | | | | |
| Insall-Salvati ratio (scaled 10×) | 256 | 1.55 | 1.35-1.80 | <.001∗ |
| Blackburne-Peel index (scaled 10×) | 256 | 0.98 | 0.84-1.13 | .74 |
| Caton-Deschamps ratio (scaled 10×) | 256 | 1.07 | 0.94-1.22 | .33 |
| Patellotrochlear index (scaled 10×, inverse) | 256 | 1.33 | 1.16-1.55 | <.001∗ |
| Modified Insall-Salvati ratio (scaled 10×) | 256 | 0.87 | 0.78-0.97 | .011∗ |
| Malalignment | | | | |
| TT-TG distance, mm | 256 | 1.16 | 1.10-1.23 | <.001∗ |
| AP TT-TG distance, mm | 256 | 1.08 | 1.04-1.13 | <.001∗ |
| TT-PCL distance, mm | 256 | 1.00 | 0.95-1.05 | .94 |
| Trochlear dysplasia | | | | |
| Cartilaginous trochlear depth (inverse), mm | 256 | 1.63 | 1.37-1.98 | <.001∗ |
| Trochlear angle (inverse), ° | 256 | 1.04 | 1.00-1.09 | .055 |
| Sulcus angle, ° | 256 | 1.01 | 1.00-1.03 | .28 |
| Lateral trochlear inclination (inverse), ° | 256 | 1.10 | 1.05-1.15 | <.001∗ |
AP, anteroposterior; CI, confidence interval; MRI, magnetic resonance imaging; OR, odds ratio; TT-PCL, tibial tubercle–posterior cruciate ligament; TT-TG, tibial tubercle–trochlear groove.
∗Statistically significant.
On multivariable logistic regression, diagnosis of patellar instability was found to have independent positive associations with greater TT-TG distance (OR, 1.12 [95% confidence interval (CI), 1.06-1.19]; P < .001) and IS ratio (OR, 1.65 [95% CI, 1.37-2.02]; P < .001) (Table 4). Trochlear depth (inverse OR, 1.42 [95% CI, 1.09-1.87]; P = .009), m-IS ratio (OR, 0.69 [95% CI, 0.57-0.87]; P < .001), and PTI (inverse OR, 1.39 [95% CI, 1.15-1.69]; P < .001) were found to have independent negative associations with diagnosis of patellar instability.
Table 4.
Multivariable Logistic Regression of Features Associated With Diagnosis of Patellar Instability (N = 256)
| Characteristic | OR | 95% CI | P Value |
|---|---|---|---|
| Patellar height | | | |
| Modified Insall-Salvati ratio (scaled 10×) | 0.69 | 0.57-0.81 | <.001∗ |
| Patellotrochlear index (scaled 10×, inverse) | 1.39 | 1.15-1.69 | <.001∗ |
| Insall-Salvati ratio (scaled 10×) | 1.65 | 1.37-2.02 | <.001∗ |
| Malalignment | | | |
| TT-TG distance, mm | 1.12 | 1.06-1.19 | <.001∗ |
| AP TT-TG distance, mm | 1.01 | 0.95-1.07 | .8 |
| Trochlear dysplasia | | | |
| Cartilaginous trochlear depth (inverse), mm | 1.42 | 1.09-1.87 | .009∗ |
| Trochlear length angle (inverse), ° | 1.03 | 0.98-1.10 | .2 |
| Lateral trochlear inclination (inverse), ° | 1.04 | 0.96-1.12 | .4 |
AP, anteroposterior; CI, confidence interval; OR, odds ratio; TT-TG, tibial tubercle–trochlear groove.
∗Statistically significant.
Model Development
Variables selected for inclusion in machine learning models after recursive feature elimination were IS ratio, TT-TG distance, PTI, m-IS ratio, trochlear depth, sulcus angle, lateral trochlear inclination, and AP TT-TG distance. During cross validation (Table 5), the AUCs of the models ranged from 0.65 for the decision tree model to 0.85 for the random forest model. The accuracy of the models ranged from 0.64 for the decision tree model to 0.76 for the random forest model. The F1 score ranged from 0.63 for the decision tree model to 0.74 for the logistic regression and random forest models.
Table 5.
Ten-Fold Cross Validation of Model on Training Set (n = 179)
| Model | Accuracy | AUC | F1 Score | Precision | Sensitivity | Specificity | Hyperparameters |
|---|---|---|---|---|---|---|---|
| Logistic regression | 0.747 (0.680-0.814) | 0.822 (0.745-0.900) | 0.741 (0.665-0.816) | 0.766 (0.673-0.860) | 0.733 (0.635-0.830) | 0.76 (0.644-0.877) | |
| Decision tree | 0.644 (0.554-0.733) | 0.652 (0.567-0.737) | 0.626 (0.526-0.727) | 0.665 (0.550-0.780) | 0.606 (0.490-0.723) | 0.678 (0.546-0.810) | criterion = "entropy", max_depth = 10, max_features = "log2", max_leaf_nodes = 77 |
| Random forest | 0.755 (0.690-0.819) | 0.849 (0.793-0.906) | 0.738 (0.653-0.822) | 0.764 (0.676-0.851) | 0.751 (0.603-0.900) | 0.761 (0.667-0.855) | criterion = "gini", max_depth = 6, max_features = "sqrt", n_estimators = 200 |
AUC, area under receiver operating characteristic curve.
The random forest model, which had the highest AUC and accuracy, was chosen for evaluation on the test set. On evaluation in the test set, the accuracy was 78% and the AUC was 0.85 (Fig 2). The slope of the calibration curve was 1.18, and the intercept was –0.1 (Fig 3). The Brier score was calculated to be 0.16, compared with a null-model Brier score of 0.248.
Fig 2.
Receiver operating characteristic curve for final model (random forest) on testing set (n = 77).
Fig 3.
Calibration plot of final model (random forest) on testing set (n = 77).
Model Explanations
A summary plot of individual predictions for all patients in the test set is displayed in Figure 4. The most important variables when making individual predictions in the test set were IS ratio, TT-TG distance, and trochlear depth.
Fig 4.
Shapley additive explanations (SHAP) values summary plot for magnetic resonance imaging measurements influencing probability of patellar instability. Each dot represents a data point, with blue indicating lower values and pink indicating higher values. Features to the left (negative SHAP values) decrease the prediction, whereas features to the right (positive SHAP values) increase it. This visualization helps identify which features most influence the model’s predictions. (AP, anteroposterior; TTTG, tibial tubercle–trochlear groove.)
An example of a decision with variable importance for an individual patient with patellar instability is shown in Figure 5. On the basis of the patient’s MRI metrics, the model predicted that the patient had an 82.9% chance of having patellar instability. The largest contributing factors were the IS ratio of 1.634 and TT-TG distance of 17.9 mm, which both increased the predicted probability of the patient having patellar instability. The trochlear depth of 3.05 mm, AP TT-TG distance of 9.1 mm, and lateral trochlear inclination of 13.2° caused small increases in the patient’s probability of patellar instability. The patient’s m-IS ratio of 2.051 and sulcus angle of 145.9° caused small decreases in the predicted probability of the model. The patient’s PTI of 0.384 did not impact the prediction of the model.
Fig 5.
Example of variable importance for individual patient prediction with patellar instability using Shapley additive explanations. The chart shows the impact of each variable on the predicted outcome for this patient, with positive values indicating a higher likelihood of patellar instability and negative values indicating a lower likelihood of patellar instability. (AP, anteroposterior; TTTG, tibial tubercle–trochlear groove.)
An example of a control patient is shown in Figure 6. The values for IS ratio, trochlear depth, AP TT-TG distance, TT-TG distance, PTI, and lateral trochlear inclination all decreased the predicted probability of the patient having a patellar dislocation. The values for PTI and m-IS ratio caused a slight increase in the predicted probability.
Fig 6.
Example of variable importance for individual patient prediction without patellar instability using Shapley additive explanations. The chart shows the impact of each variable on the outcome for this patient, with positive values indicating a higher likelihood of patellar instability and negative values indicating a lower likelihood of patellar instability. (AP, anteroposterior; TTTG, tibial tubercle–trochlear groove.)
Discussion
In this study, we found a high level of performance of machine learning models in identifying patellar instability from MRI measurements as measured by discrimination, calibration, and the Brier score. These performance metrics indicate the model’s ability to accurately differentiate between positive and negative cases, as well as its reliability in predicting the likelihood of patellar dislocations. Furthermore, the model provides explanations for individual decisions that outline the contributions of different morphologic characteristics. This model has the potential to improve the diagnostic accuracy of patellar instability from MRI scans. The explanations provided by the model could enable clinicians to personalize care and understand the factors driving patellar instability in individual patients.
In this study, the performance of logistic regression, decision tree classifier, and random forest classifier models was compared. The random forest classifier outperformed logistic regression by a small margin, and the decision tree yielded the lowest performance on all metrics. The superior performance of the random forest model over the decision tree may be attributed to its incorporation of an ensemble of decision trees, which reduces overfitting.32 The difference in performance between logistic regression and other machine learning models has been shown to vary in the literature based on the nature of the dataset and clinical question being addressed.33 Oosterhoff et al.34 found that novel machine learning methods performed similarly to logistic regression when evaluated on a collection of 9 datasets from previous orthopaedic trauma studies. Further evaluation of various models on larger datasets could help elucidate the optimal modeling technique for this purpose.
According to Shapley summary plots generated from the final model, the most influential measurements were IS ratio, TT-TG distance, and trochlear depth. Other measurements found to be useful in the model were sulcus angle, m-IS ratio, AP TT-TG distance, PTI, and lateral trochlear inclination. Among these features, IS ratio, TT-TG distance, trochlear depth, m-IS ratio, and PTI were found to be independently significant in logistic regression analysis. In both conventional statistical analysis and machine learning model development, IS ratio, TT-TG distance, and trochlear depth were the most important measurements of patellar height, malalignment, and trochlear dysplasia, respectively. The concordance between machine learning models and conventional statistical analysis supports the consideration of these measurements in the evaluation of patellar instability.
IS ratio was identified as the most important MRI measurement in both machine learning models and conventional statistical analysis. The importance of the IS ratio was previously evaluated by Askenberger et al.,35 who found that the average IS ratio was significantly higher among 103 patients with lateral patellar dislocations than among 69 controls. These findings also align with the results reported by Ling et al.,36 who developed a multivariable model to predict recurrent dislocations within a cohort of 291 patients who already experienced first-time dislocations. Ling et al. found that the IS ratio was highly predictive of recurrent patellar dislocations with an integer of 69.5, but the CDI was not a significant predictor. Additionally, the IS ratio has been found to have the highest reliability of all the patella alta measurements. Verhulst et al.37 performed measurements of IS ratio, CDI, m-IS ratio, BPI, and PTI in 48 patients with patellar instability and found that the IS ratio had the highest reliability both between and within observers. Among the measurements, the m-IS ratio was found to have the lowest inter-rater and intrarater reliabilities, which may explain our variable findings related to m-IS values in this study.
TT-TG distance was the second most important variable in the model and the only metric of malalignment that was significant in logistic regression. Heidenreich et al.38 conducted a study among 87 patients with first-time patellar dislocation and found that the TT-TG distance was significantly predictive of recurrent instability, with an OR of 8.9. Brady et al.39 reviewed 239 knees with patellofemoral instability or anterior cruciate ligament ruptures and found that the TT-TG distance was more predictive and reliable than the TT-PCL distance when assessing knees for patellofemoral instability. Although a TT-TG distance measurement greater than 20 mm has been described as pathologic on CT imaging, our measurements in both groups in this study were found to be under 20 mm. However, MRI has been shown to underestimate this measurement by 3.4 mm, and our study shows the utility of MRI-based TT-TG distance measurements to identify knees with patellar instability while showing a lower value than the traditionally described 20-mm cutoff associated with pathology.
Trochlear depth was identified as the third most important measurement overall in the machine learning model. Among measurements of trochlear dysplasia, it was the only one that was significant in multivariable logistic regression. Tanaka et al.7 studied the use of various trochlear dysplasia metrics in evaluating patellar instability among a set of 238 knee MRI scans. The authors found that among multiple trochlear dysplasia measurements including sulcus angle, lateral inclination, and trochlear depth, trochlear depth had the greatest level of discrimination between knees with patellar instability and controls. Ferlic et al.40 found that linear measurements of trochlear dysplasia were more reliable than angular measurements in a study of 66 cases with and without patellar instability.
The use of machine learning techniques has the potential to help surgeons customize care for patients with patellar instability. By identifying pertinent MRI measurements and quantifying the relative influence of different morphologic risk factors, models may be able to help evaluate the need for concurrent surgical procedures such as tibial tubercle osteotomy and trochleoplasty during patellar stabilization procedures.
Limitations
The work presented in this study has several limitations. The data were collected retrospectively, which creates the risk of selection bias. Diagnosis of patellar instability was determined by reviewing patient charts, and no physical examination was performed to confirm the diagnosis. The dataset was collected entirely from a single institution, and external validation of the model on diverse populations and datasets is needed to evaluate the generalizability of the model. Additionally, the cohort was constructed through age and sex matching, which may have reduced the generalizability and variability of the training set used for the machine learning models. The size of the dataset was also limited, and a larger dataset may have allowed for improved performance and better differentiation between the efficacies of the various models. Although the risk factors included in the model are clinically and biomechanically relevant, there are additional factors that could be included in future models, such as femoral anatomy and rotational parameters. This study primarily used MRI metrics, which allowed for the consideration of both bony and soft tissue. Use of metrics from multiple imaging modalities may have led to multicollinearity between measurements, so CT and radiographic measurements were not incorporated.
Conclusions
The final model was able to reliably predict MRI-based parameters associated with patellar instability. IS ratio, TT-TG distance, and trochlear depth were the most important risk factors both in the machine learning models and using conventional statistical analysis.
Disclosures
The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: M.J.T. reports that financial support was provided by Arthroscopy Association of North America and reports a consulting or advisory relationship with DePuy Synthes, Arthrex, and Vericel. All other authors (V.N., A.S., I.W., K.V., O.M., A.B.) declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Appendix 1. Measurement Technique Description
Cartilaginous measurements of trochlear morphology were conducted on axial views as described by Tanaka et al.7 and included trochlear depth, sulcus angle, and lateral trochlear inclination. The intraclass correlation coefficient values for intrarater and inter-rater reliability were calculated to range from 0.83 (95% confidence interval [CI], 0.45-0.98) to 0.95 (95% CI, 0.83-0.99) and from 0.74 (95% CI, 0.42-0.92) to 0.89 (95% CI, 0.72-0.97), respectively. The sulcus angle was measured using cartilaginous landmarks and was calculated as the angle between the lines from the deepest point of the trochlear groove to the most prominent points of the lateral and medial condyles.19 Lateral trochlear inclination was measured as the angle between the lines tangent to the posterior femoral condyles and the lateral trochlear facet.20 Cartilaginous trochlear depth was measured as the average of the distances from the posterior condylar line to the medial and lateral condyles minus the distance from the posterior condylar line to the trochlear groove.21
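The trochlear depth and sulcus angle definitions above reduce to simple plane geometry once landmarks are marked on an axial slice. The following is a minimal sketch rather than the measurement software used in this study; the function names, the (x, y) landmark format in millimeters, and the example coordinates are all hypothetical.

```python
import math


def distance_to_line(point, line_a, line_b):
    """Perpendicular distance from `point` to the line through line_a and line_b."""
    px, py = point
    ax, ay = line_a
    bx, by = line_b
    cross = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
    return abs(cross) / math.hypot(bx - ax, by - ay)


def trochlear_depth(medial_peak, lateral_peak, groove, post_a, post_b):
    """Mean condylar height above the posterior condylar line minus groove height.

    post_a and post_b are two points defining the posterior condylar line.
    """
    d_medial = distance_to_line(medial_peak, post_a, post_b)
    d_lateral = distance_to_line(lateral_peak, post_a, post_b)
    d_groove = distance_to_line(groove, post_a, post_b)
    return (d_medial + d_lateral) / 2 - d_groove


def sulcus_angle(medial_peak, lateral_peak, groove):
    """Angle in degrees at the groove between the lines to the two condylar peaks."""
    v1 = (medial_peak[0] - groove[0], medial_peak[1] - groove[1])
    v2 = (lateral_peak[0] - groove[0], lateral_peak[1] - groove[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cos_angle = dot / (math.hypot(*v1) * math.hypot(*v2))
    # Clamp to guard against floating-point drift just outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


# Hypothetical landmarks (mm); the posterior condylar line lies on the x-axis.
post_a, post_b = (0.0, 0.0), (60.0, 0.0)
medial_peak, lateral_peak, groove = (15.0, 60.0), (45.0, 66.0), (30.0, 57.0)

depth = trochlear_depth(medial_peak, lateral_peak, groove, post_a, post_b)
angle = sulcus_angle(medial_peak, lateral_peak, groove)
print(f"trochlear depth = {depth:.1f} mm, sulcus angle = {angle:.1f} degrees")
```

Working from point-to-line distances keeps the depth calculation independent of how the slice is rotated in the image, since every height is taken relative to the posterior condylar line.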
Patellar height was measured on sagittal views using the Insall-Salvati (IS) ratio, Blackburne-Peel index, Caton-Deschamps index, modified IS ratio, and patellotrochlear index. The IS ratio was calculated as the ratio of the length of the patellar tendon to the patellar length.22 The Blackburne-Peel index was calculated as the ratio of the distance from the distal end of the patellar articular surface to the tibial plateau line to the patellar articular length.23 The Caton-Deschamps index was defined as the ratio of the distance between the distal end of the patellar articular surface and the anterosuperior tibial plateau to the length of the patellar articular surface.24 The modified IS ratio was calculated as the ratio of the distance from the distal patellar articular surface to the tibial tuberosity to the length of the patellar articular surface.25 The patellotrochlear index was calculated as the ratio of the length of trochlear cartilage overlapping patellar cartilage to the patellar cartilage length.26
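Each patellar height index above is a ratio of two sagittal lengths, so the set can be computed together from a single record of measurements. The sketch below is illustrative only: the class, field names, and example values are hypothetical, and the Blackburne-Peel index is written in its conventional form (distance to the tibial plateau line divided by patellar articular length).

```python
from dataclasses import dataclass


@dataclass
class SagittalLengths:
    """Hypothetical container for sagittal MRI lengths, all in millimeters."""
    patellar_tendon: float
    patellar_length: float
    articular_length: float           # patellar articular surface length
    articular_to_plateau_line: float  # distal articular surface to tibial plateau line
    articular_to_plateau: float       # distal articular surface to anterosuperior tibial plateau
    articular_to_tuberosity: float    # distal articular surface to tibial tuberosity
    trochlear_overlap: float          # trochlear cartilage overlapping patellar cartilage
    patellar_cartilage_length: float


def patellar_height_indices(m: SagittalLengths) -> dict:
    """Return the five patellar height indices as plain ratios."""
    return {
        "IS": m.patellar_tendon / m.patellar_length,
        "BPI": m.articular_to_plateau_line / m.articular_length,
        "CDI": m.articular_to_plateau / m.articular_length,
        "m-IS": m.articular_to_tuberosity / m.articular_length,
        "PTI": m.trochlear_overlap / m.patellar_cartilage_length,
    }


# Hypothetical measurements for one knee.
knee = SagittalLengths(
    patellar_tendon=48.0,
    patellar_length=40.0,
    articular_length=30.0,
    articular_to_plateau_line=24.0,
    articular_to_plateau=33.0,
    articular_to_tuberosity=54.0,
    trochlear_overlap=9.0,
    patellar_cartilage_length=30.0,
)

for name, value in patellar_height_indices(knee).items():
    print(f"{name}: {value:.2f}")
```

Note that a higher-riding patella raises the IS ratio, Blackburne-Peel index, Caton-Deschamps index, and modified IS ratio but lowers the patellotrochlear index, which is consistent with the direction of the associations reported in the abstract.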
Measurements of tibial tubercle–trochlear groove (TT-TG) distance, anteroposterior (AP) TT-TG distance, and tibial tubercle–posterior cruciate ligament distance were performed to evaluate malalignment as previously described.15 The TT-TG distance was the distance between the most prominent point of the tibial tubercle and the deepest point of the trochlear groove, taken along a line parallel to the posterior condylar axis.27 The AP TT-TG distance was measured as the anteroposterior distance between those 2 points.15 The tibial tubercle–posterior cruciate ligament distance was defined as the distance between the midpoint of the tibial tubercle and the medial border of the posterior cruciate ligament.28
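The TT-TG and AP TT-TG distances can be viewed as the two components of the same vector, from the trochlear groove point to the tibial tubercle point, resolved against the posterior condylar axis. The sketch below assumes that both landmarks have been superimposed into one axial coordinate frame and that the anteroposterior direction is perpendicular to the posterior condylar axis; that second assumption, the function name, and the coordinates are illustrative and are not taken from the cited technique papers.

```python
import math


def tt_tg_components(tubercle, groove, post_a, post_b):
    """Resolve the groove-to-tubercle vector against the posterior condylar axis.

    Returns (tt_tg, ap_tt_tg) in the units of the inputs:
      tt_tg    - magnitude of the component parallel to the axis
      ap_tt_tg - magnitude of the component perpendicular to the axis
                 (assumed here to be the anteroposterior direction)
    Sign conventions (medial/lateral, anterior/posterior) are dropped.
    """
    ux, uy = post_b[0] - post_a[0], post_b[1] - post_a[1]
    norm = math.hypot(ux, uy)
    ux, uy = ux / norm, uy / norm  # unit vector along the posterior condylar axis

    dx, dy = tubercle[0] - groove[0], tubercle[1] - groove[1]
    parallel = dx * ux + dy * uy
    perpendicular = -dx * uy + dy * ux
    return abs(parallel), abs(perpendicular)


# Hypothetical superimposed landmarks (mm); the axis is deliberately not horizontal.
post_a, post_b = (0.0, 0.0), (3.0, 4.0)
groove = (0.0, 0.0)
tubercle = (2.0, 11.0)

tt_tg, ap_tt_tg = tt_tg_components(tubercle, groove, post_a, post_b)
print(f"TT-TG = {tt_tg:.1f} mm, AP TT-TG = {ap_tt_tg:.1f} mm")
```

Projecting onto the axis, rather than taking the straight-line distance between the two points, is what makes the measurement insensitive to rotation of the limb within the scanner.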
References
- 1. Koh J.L., Stewart C. Patellar instability. Clin Sports Med. 2014;33:461–476. doi: 10.1016/j.csm.2014.03.011.
- 2. Huntington L.S., Webster K.E., Devitt B.M., Scanlon J.P., Feller J.A. Factors associated with an increased risk of recurrence after a first-time patellar dislocation: A systematic review and meta-analysis. Am J Sports Med. 2020;48:2552–2562. doi: 10.1177/0363546519888467.
- 3. Sanchis-Alfonso V. How to deal with chronic patellar instability: What does the literature tell us? Sports Health Multidiscip Approach. 2016;8:86–90. doi: 10.1177/1941738115604156.
- 4. White A.E., Otlans P.T., Horan D.P., et al. Radiologic measurements in the assessment of patellar instability: A systematic review and meta-analysis. Orthop J Sports Med. 2021;9. doi: 10.1177/2325967121993179.
- 5. Bollier M., Fulkerson J.P. The role of trochlear dysplasia in patellofemoral instability. J Am Acad Orthop Surg. 2011;19:8–16. doi: 10.5435/00124635-201101000-00002.
- 6. Paiva M., Blønd L., Hölmich P., et al. Quality assessment of radiological measurements of trochlear dysplasia; a literature review. Knee Surg Sports Traumatol Arthrosc. 2018;26:746–755. doi: 10.1007/s00167-017-4520-z.
- 7. Tanaka M.J., Sodhi A., Wadhavkar I., et al. Redefining trochlear dysplasia: Normal thresholds vary by measurement technique, landmarks, and sex. Am J Sports Med. 2023;51:1202–1210. doi: 10.1177/03635465231158099.
- 8. Yılmaz B., Ozdemir G., Sirin E., Cicek E.D., Anıl B.S., Bulbun G. Evaluation of patella alta using MRI measurements in adolescents. Indian J Radiol Imaging. 2017;27:181–186. doi: 10.4103/ijri.IJRI_222_16.
- 9. Biedert R.M., Tscholl P.M. Patella alta: A comprehensive review of current knowledge. Am J Orthop (Belle Mead NJ). 2017;46:290–300.
- 10. Danielsen O., Poulsen T.A., Eysturoy N.H., Mortensen E.S., Hölmich P., Barfod K.W. Trochlea dysplasia, increased TT-TG distance and patella alta are risk factors for developing first-time and recurrent patella dislocation: A systematic review. Knee Surg Sports Traumatol Arthrosc. 2023;31:3806–3846. doi: 10.1007/s00167-022-07255-1.
- 11. Camp C.L., Stuart M.J., Krych A.J., et al. CT and MRI measurements of tibial tubercle–trochlear groove distances are not equivalent in patients with patellar instability. Am J Sports Med. 2013;41:1835–1840. doi: 10.1177/0363546513484895.
- 12. Dietrich T.J., Betz M., Pfirrmann C.W.A., Koch P.P., Fucentese S.F. End-stage extension of the knee and its influence on tibial tuberosity-trochlear groove distance (TTTG) in asymptomatic volunteers. Knee Surg Sports Traumatol Arthrosc. 2014;22:214–218. doi: 10.1007/s00167-012-2357-z.
- 13. Heidenreich M.J., Camp C.L., Dahm D.L., Stuart M.J., Levy B.A., Krych A.J. The contribution of the tibial tubercle to patellar instability: Analysis of tibial tubercle–trochlear groove (TT-TG) and tibial tubercle–posterior cruciate ligament (TT-PCL) distances. Knee Surg Sports Traumatol Arthrosc. 2017;25:2347–2351. doi: 10.1007/s00167-015-3715-4.
- 14. Lansdown D.A., Christian D., Madden B., et al. The sagittal tibial tubercle–trochlear groove distance as a measurement of sagittal imbalance in patients with symptomatic patellofemoral chondral lesions. Cartilage. 2021;13(suppl):449S–455S. doi: 10.1177/1947603519900802.
- 15. Tanaka M.J., D’Amore T., Elias J.J., Thawait G., Demehri S., Cosgarea A.J. Anteroposterior distance between the tibial tuberosity and trochlear groove in patients with patellar instability. Knee. 2019;26:1278–1285. doi: 10.1016/j.knee.2019.08.011.
- 16. Collins G.S., Reitsma J.B., Altman D.G., Moons K.G.M. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): The TRIPOD statement. BMJ. 2015;350:g7594. doi: 10.1136/bmj.g7594.
- 17. Moons K.G.M., Altman D.G., Reitsma J.B., et al. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): Explanation and elaboration. Ann Intern Med. 2015;162:W1–W73. doi: 10.7326/M14-0698.
- 18. Collins G.S., Moons K.G.M., Dhiman P., et al. TRIPOD+AI statement: Updated guidance for reporting clinical prediction models that use regression or machine learning methods. BMJ. 2024;385. doi: 10.1136/bmj-2023-078378.
- 19. Davies A.P., Costa M.L., Shepstone L., Glasgow M.M., Donell S. The sulcus angle and malalignment of the extensor mechanism of the knee. J Bone Joint Surg Br. 2000;82:1162–1166. doi: 10.1302/0301-620x.82b8.10833.
- 20. Carrillon Y., Abidi H., Dejour D., Fantino O., Moyen B., Tran-Minh V.A. Patellar instability: Assessment on MR images by measuring the lateral trochlear inclination—Initial experience. Radiology. 2000;216:582–585. doi: 10.1148/radiology.216.2.r00au07582.
- 21. Pfirrmann C.W., Zanetti M., Romero J., Hodler J. Femoral trochlear dysplasia: MR findings. Radiology. 2000;216:858–864. doi: 10.1148/radiology.216.3.r00se38858.
- 22. Insall J., Salvati E. Patella position in the normal knee joint. Radiology. 1971;101:101–104. doi: 10.1148/101.1.101.
- 23. Blackburne J.S., Peel T.E. A new method of measuring patellar height. J Bone Joint Surg Br. 1977;59:241–242. doi: 10.1302/0301-620X.59B2.873986.
- 24. Caton J., Deschamps G., Chambat P., Lerat J.L., Dejour H. [Patella infera. Apropos of 128 cases]. Rev Chir Orthop Reparatrice Appar Mot. 1982;68:317–325. [in French]
- 25. Grelsamer R.P., Meadows S. The modified Insall-Salvati ratio for assessment of patellar height. Clin Orthop Relat Res. 1992;282:170–176.
- 26. Biedert R.M., Albrecht S. The patellotrochlear index: A new index for assessing patellar height. Knee Surg Sports Traumatol Arthrosc. 2006;14:707–712. doi: 10.1007/s00167-005-0015-4.
- 27. Dejour H., Walch G., Nove-Josserand L., Guier C. Factors of patellar instability: An anatomic radiographic study. Knee Surg Sports Traumatol Arthrosc. 1994;2:19–26. doi: 10.1007/BF01552649.
- 28. Seitlinger G., Scheurecker G., Högler R., Labey L., Innocenti B., Hofmann S. Tibial tubercle-posterior cruciate ligament distance: A new measurement to define the position of the tibial tubercle in patients with patellar dislocation. Am J Sports Med. 2012;40:1119–1125. doi: 10.1177/0363546512438762.
- 29. Peduzzi P., Concato J., Kemper E., Holford T.R., Feinstein A.R. A simulation study of the number of events per variable in logistic regression analysis. J Clin Epidemiol. 1996;49:1373–1379. doi: 10.1016/s0895-4356(96)00236-3.
- 30. Corbacıoğlu Ş.K., Aksel G. Receiver operating characteristic curve analysis in diagnostic accuracy studies: A guide to interpreting the area under the curve value. Turk J Emerg Med. 2023;23:195–198. doi: 10.4103/tjem.tjem_182_23.
- 31. Lundberg S., Lee S.I. A unified approach to interpreting model predictions. arXiv. 2017.
- 32. Ramkumar P.N., Luu B.C., Haeberle H.S., Karnuta J.M., Nwachukwu B.U., Williams R.J. Sports medicine and artificial intelligence: A primer. Am J Sports Med. 2022;50:1166–1174. doi: 10.1177/03635465211008648.
- 33. Rajula H.S.R., Verlato G., Manchia M., Antonucci N., Fanos V. Comparison of conventional statistical methods with machine learning in medicine: Diagnosis, drug development, and treatment. Medicina (Kaunas). 2020;56:455. doi: 10.3390/medicina56090455.
- 34. Oosterhoff J.H.F., Gravesteijn B.Y., Karhade A.V., et al. Feasibility of machine learning and logistic regression algorithms to predict outcome in orthopaedic trauma surgery. J Bone Joint Surg Am. 2022;104:544–551. doi: 10.2106/JBJS.21.00341.
- 35. Askenberger M., Janarv P.M., Finnbogason T., Arendt E.A. Morphology and anatomic patellar instability risk factors in first-time traumatic lateral patellar dislocations: A prospective magnetic resonance imaging study in skeletally immature children. Am J Sports Med. 2017;45:50–58. doi: 10.1177/0363546516663498.
- 36. Ling D.I., Brady J.M., Arendt E., et al. Development of a multivariable model based on individual risk factors for recurrent lateral patellar dislocation. J Bone Joint Surg Am. 2021;103:586–592. doi: 10.2106/JBJS.20.00020.
- 37. Verhulst F.V., Van Sambeeck J.D.P., Olthuis G.S., Van Der Ree J., Koëter S. Patellar height measurements: Insall-Salvati ratio is most reliable method. Knee Surg Sports Traumatol Arthrosc. 2020;28:869–875. doi: 10.1007/s00167-019-05531-1.
- 38. Heidenreich M.J., Sanders T.L., Hevesi M., et al. Individualizing the tibial tubercle to trochlear groove distance to patient specific anatomy improves sensitivity for recurrent instability. Knee Surg Sports Traumatol Arthrosc. 2018;26:2858–2864. doi: 10.1007/s00167-017-4752-y.
- 39. Brady J.M., Sullivan J.P., Nguyen J., et al. The tibial tubercle–to–trochlear groove distance is reliable in the setting of trochlear dysplasia, and superior to the tibial tubercle–to–posterior cruciate ligament distance when evaluating coronal malalignment in patellofemoral instability. Arthroscopy. 2017;33:2026–2034. doi: 10.1016/j.arthro.2017.06.020.
- 40. Ferlic P.W., Runer A., Seeber C., Thöni M., Spicher A., Liebensteiner M.C. Linear anterior-posterior computed tomography parameters used to quantify trochlear dysplasia are more reliable than angular measurements. Arthroscopy. 2021;37:1204–1211. doi: 10.1016/j.arthro.2020.11.032.






