Abstract
Background
Load-bearing structural degradation is crucial in knee osteoarthritis (KOA) progression, yet few prediction models use load-bearing tissue radiomics to predict radiographic (structural) KOA incident.
Purpose
We aim to develop and test a Load-Bearing Tissue plus Clinical variable Radiomic Model (LBTC-RM) to predict radiographic KOA incidents.
Study design
Risk prediction study.
Methods
A total of 700 knees without radiographic KOA at baseline were included from the Osteoarthritis Initiative cohort. We selected 2164 knee MRIs acquired during the 4-year follow-up. LBTC-RM, which integrated MRI features of the meniscus, femur, tibia, and femorotibial cartilage with clinical variables, was developed in the total development cohort (n = 1082, 542 cases vs. 540 controls) using a neural network algorithm. The final predictive model was tested in the total test cohort (n = 1082, 534 cases vs. 548 controls), which integrated data from five visits: baseline (n = 353, 162 cases vs. 191 controls), 3 years prior to KOA (n = 46, 27 cases vs. 19 controls), 2 years prior to KOA (n = 143, 66 cases vs. 77 controls), 1 year prior to KOA (n = 220, 115 cases vs. 105 controls), and at KOA incident (n = 320, 164 cases vs. 156 controls).
Results
In the total test cohort, LBTC-RM predicted KOA incident with an AUC (95 % CI) of 0.85 (0.82–0.87). With LBTC-RM aid, the performance of resident physicians in KOA prediction improved, with specificity, sensitivity, and accuracy increasing from 50 %, 60 %, and 55 % to 72 %, 73 %, and 72 %, respectively. The LBTC-RM output indicated an increased KOA risk (OR: 20.6, 95 % CI: 13.8–30.6, p < .001). Radiomic scores of load-bearing tissue raised KOA risk (ORs: 1.02–1.9) from 4 years prior to KOA, whereas the 3-dimensional feature score of the medial meniscus decreased the OR (0.99) of KOA incident at the visit when KOA was confirmed. The 2-dimensional feature score of the medial meniscus increased the ORs (1.1–1.2) of KOA symptom score from 2 years prior to KOA.
Conclusions
We provided radiomic features of load-bearing tissue to improve KOA risk level assessment and incident prediction. The model has potential clinical applicability in predicting KOA incidents early, enabling physicians to identify high-risk patients before significant radiographic evidence appears. This can facilitate timely interventions and personalized management strategies, improving patient outcomes.
The Translational Potential of this Article
This study presents a novel approach integrating longitudinal MRI-based radiomics and clinical variables to predict knee osteoarthritis (KOA) incidence using machine learning. By leveraging deep learning for auto-segmentation and machine learning for predictive modeling, this research provides a more interpretable and clinically applicable method for early KOA detection. The introduction of a Radiomics Score System enhances the potential for radiomics as a virtual image-based biopsy tool, facilitating non-invasive, personalized risk assessment for KOA patients. The findings support the translation of advanced imaging and AI-driven predictive models into clinical practice, aiding early diagnosis, personalized treatment planning, and risk stratification for KOA progression. This model has the potential to be integrated into routine musculoskeletal imaging workflows, optimizing early intervention strategies and resource allocation for high-risk populations. Future validation across diverse cohorts will further enhance its clinical utility and generalizability.
Keywords: Cartilage defect, Knee, MRI, Neural network, Osteoarthritis, Radiomics
1. Introduction
Osteoarthritis (OA) affects 7 % of the global population, impacting over 500 million individuals [1]. The prevalence of OA is rising due to aging societies and the obesity epidemic, leading to a growing economic burden [2]. Among weight-bearing joints, knee osteoarthritis (KOA) is the most common form, causing significant disability in older adults [3]. Current treatments for OA have not kept up with advancements in managing other musculoskeletal and chronic non-communicable diseases [4]. However, implementing public health interventions focused on OA prevention, reducing inappropriate care, and providing cost-effective treatments can alleviate the burden of OA and enhance the quality of life for millions [3].
Many previous predictive models for KOA have largely relied on basic demographic and clinical variables such as age, gender, body mass index (BMI), and medical history [5,6]. However, risk prediction models based purely on demographic data may not capture the heterogeneous nature of KOA progression, which can vary significantly between individuals depending on lifestyle, genetic predisposition, and joint biomechanics [7]. Furthermore, such models tend to perform poorly in predicting the onset of KOA in asymptomatic patients who do not yet display clinical symptoms but have underlying joint degeneration [8]. This lack of sensitivity can delay the opportunity for early interventions, such as lifestyle modifications or preventive treatments, that could mitigate the progression of the disease.
To overcome these limitations, our model integrates MRI-based radiomic features from load-bearing tissues such as the femur, tibia, and meniscus, with traditional clinical variables to enhance predictive performance. Radiomic features extracted from MRI images offer a detailed quantitative analysis of tissue characteristics, such as texture, shape, and intensity, which can reveal early structural alterations that precede clinical symptoms [9–14]. These imaging biomarkers are particularly sensitive to changes in cartilage and meniscus health, which are often the first tissues affected in the early stages of KOA. When clinical variables such as age, BMI, and history of knee pain are combined with these radiomic features, the model can precisely predict the incidence of KOA [13,14]. By incorporating both structural and patient-specific data, the radiomic model addresses the shortcomings of previous models that relied solely on clinical risk factors.
Magnetic resonance imaging (MRI) has revolutionized the understanding of three-dimensional OA structural pathology, revealing pathologies undetectable by X-rays [15]. While radiomic techniques have been applied to KOA MRI data, their potential in predicting radiographic KOA incidents remains underexplored. In the Osteoarthritis Initiative (OAI) cohort, baseline MRI data predicted radiographic KOA incidents with an area under the receiver operating characteristic curve (AUC) of 0.70–0.75 [9,14]. Another study using cartilage T2 texture maps demonstrated 71 %–76 % accuracy in predicting incident symptomatic KOA [16]. Additionally, a novel transport-based morphometry algorithm achieved an AUC of 0.87 in predicting incident symptomatic KOA [17]. However, no predictive model fully integrates load-bearing tissues, including bone, meniscus, and cartilage, representing a crucial research gap.
Computer-aided diagnosis and radiomics are increasingly used in predicting patient outcomes and treatment response [18]. Some radiomic tools are already clinically useful and have U.S. Food and Drug Administration clearance [19], and guidelines on data acquisition and analysis are available [20–23]. Our study focused on developing a precise predictive model for KOA incidents in the OAI cohort, utilizing clinical data and MRI radiomics of load-bearing tissues (femur, tibia, meniscus, and femorotibial cartilage).
2. Materials and methods
2.1. Data sources
Data for our study were sourced from the OAI, a multicenter cohort study aiming to identify OA biomarkers (ClinicalTrials.gov identifier: NCT00080171). The study utilized the pivotal OAI MR Imaging analyses incidental OA (POMA_inc.OA) cohort, a nested case–control analysis of knees without radiographic KOA (Kellgren and Lawrence grading, KLG < 2) at baseline but at increased risk of developing KOA [24]. Detailed participant selection, inclusion, and exclusion criteria are provided in the Supplemental Methods (pp 2–3) and Fig. S1. The datasets used in our analysis are accessible at https://nda.nih.gov/oai/.
2.2. POMA_inc.OA cohort study inclusion and exclusion criteria
Participants in the POMA_inc.OA [24] cohort did not have symptomatic KOA at baseline but possessed characteristics that increased their risk of developing radiographic osteoarthritis (ROA). The flow chart of the POMA_inc.OA cohort can be found in Fig. S1. Incident ROA is defined as radiographic evidence of KOA (KLG ≥2) during the study. Eligibility criteria for the POMA_inc.OA cohort include the following risk factors: (1) Knee symptoms in a native knee in the past 12 months, (2) Overweight based on gender- and age-specific cut-points for weight, (3) History of knee injury that caused difficulty walking for at least a week, (4) History of any knee surgery, including meniscal or ligamentous repairs, and unilateral total knee replacement for OA, (5) Family history of OA with total knee replacement in a parent or sibling, (6) Presence of Heberden's nodes, self-reported bony enlargements (“knobby fingers”) in 1+ distal interphalangeal (DIP) joints in both hands, (7) Repetitive knee bending due to daily activities requiring frequent climbing, stooping, squatting, or kneeling, and (8) Age between 70 and 79, where only one additional risk factor is required for eligibility.
Exclusion criteria for the cohort include: (1) Rheumatoid arthritis (RA) or other inflammatory arthritis with a history of RA-specific medications, (2) Severe joint space narrowing (OARSI grade 3 or bone-on-bone) in both knees at baseline or unilateral TKR with severe narrowing in the remaining knee, (3) Bilateral total knee replacements or plans for bilateral knee replacement within 3 years, (4) Inability to undergo a 3.0 T MRI due to contraindications or size limitations, (5) Positive pregnancy test, (6) Inability to provide a blood sample, including cases involving bilateral mastectomy or dialysis shunts, (7) Reliance on ambulatory aids (other than a straight cane) for more than 50 % of ambulation, (8) Co-morbid conditions impeding 4-year study participation, (9) Likely relocation from the clinic area within 3 years, (10) Current participation in a double-blind randomized controlled trial, and (11) Refusal to sign informed consent.
2.3. Outcomes
The main clinical outcome measured was the KOA incident, defined as the first appearance of KOA findings (KLG ≥2) during the study. Case knees developed KOA, while control knees did not progress to KOA. Further participant follow-up and clinical examination details are in the Supplemental Methods (p 3).
2.4. Patient follow-up
The OAI cohort selected MRI scans at one-year intervals to detect early and subtle structural changes in joint tissues, such as cartilage and meniscus, which are critical for monitoring KOA progression over time [25,26]. This interval ensures sufficient temporal resolution to capture disease evolution while maintaining practicality for large-scale longitudinal studies. After enrollment, participants underwent routine clinical examination, knee MRI, and knee joint radiography every 12 months for 4 years. ROA was determined by knee joint radiography.
2.5. Clinical examination
The POMA_inc.OA cohort used the following clinical examinations: (1) Knee pain severity scale. (2) Participant global assessment. (3) Western Ontario and McMaster Universities Arthritis Index (WOMAC). (4) Knee Outcomes in Osteoarthritis Survey (KOOS). (5) Limitation in activity due to knee pain. (6) General health and functional status. (7) Walking ability and endurance. (8) Upper leg strength.
2.6. Radiography
For radiographic assessment, knee fixed-flexion, posterior-anterior weight-bearing radiographs were obtained at baseline and during all annual follow-up visits. A Plexiglas positioning frame (SynaFlexer; BioClinica, Newark, CA) was used, with knees flexed to 5–15° and feet internally rotated 10°. All radiographs were centrally evaluated to determine the KLG, with ROA defined as a KLG of 2 or higher [24].
2.7. MRI protocol and assessment
We used baseline sagittal 3-dimensional double echo steady-state with selective water excitation (SAG-3D-DESS-WE) MRI data from the POMA_inc.OA cohort, acquired with Siemens Trio 3.0 T scanners (Magnetom Trio, Siemens Healthcare, Erlangen, Germany), employing near anisotropic voxels (0.7 mm slice thickness × 0.37 mm × 0.46 mm) for high sagittal spatial resolution (10.5 min acquisition time) [27]. Further details of the MRI protocol can be found in Table S1. MRI assessment was performed by two experienced musculoskeletal radiologists (Frank W. Roemer and Ali Guermazi) using MOAKS criteria, showing substantial to high inter- and intra-observer reliabilities (0.61–1.0) [24].
2.8. MRI preprocessing
For automated MRI segmentation, we utilized convolutional neural networks (CNNs) [28] to develop a method capable of segmenting six anatomical structures: the femur, tibia, femoral and tibial cartilages [29] (Fig. S2), and the medial and lateral menisci [30]. The segmentation of the femur, tibia, and femoral and tibial cartilage was performed using the approach described by Ambellan et al., 2019 [29], while the segmentation of the medial and lateral menisci was based on the method proposed by Tack et al., 2018 [30]. Volumes of interest (VOIs) were defined for each knee (n = 20 knees) and encompassed the femur, tibia, femorotibial cartilages, and meniscus. These VOIs were manually adjusted by two independent authors (including S.F.L. and T.Y.C., with 6 years of experience in orthopedics), who were unaware of the clinical outcome data. ITK-SNAP version 3.8.0 software (www.itksnap.org) was used for visualization and adjustments. The agreement between manual adjustment and automated segmentation was assessed using the Dice Similarity Coefficient (DSC) (Table S2).
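The Dice Similarity Coefficient used for this agreement check can be illustrated with a minimal Python sketch. It assumes each segmentation is available as a flattened sequence of 0/1 voxel labels; the function name `dice_coefficient` and the toy masks are hypothetical and are not taken from the study, which reports computing DSCs in Matlab.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks.

    Each mask is a flat sequence of 0/1 voxel labels of equal length
    (an assumed, simplified representation of a 3D segmentation).
    """
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    size_a = sum(1 for a in mask_a if a)
    size_b = sum(1 for b in mask_b if b)
    if size_a + size_b == 0:
        return 1.0  # both masks empty: treated here as perfect agreement
    return 2.0 * overlap / (size_a + size_b)


# Toy example: automated vs. manually adjusted labels for eight voxels.
automated = [0, 1, 1, 1, 0, 0, 1, 0]
adjusted = [0, 1, 1, 0, 0, 1, 1, 0]
print(dice_coefficient(automated, adjusted))  # 3 shared voxels, 4 + 4 labeled -> 0.75
```

A DSC of 1 means the two masks coincide exactly, and 0 means they share no voxels.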
2.9. Three-dimensional radiomic feature extraction
Variability in scanner settings can affect radiomic feature consistency [31]. The MRI scans in the OAI database were performed using a standardized MRI protocol across all sites, which minimizes variability and ensures consistency in data acquisition [32].
Raw 3D MR images and participant-specific three-dimensional maps of VOIs (each knee MRI has six VOIs: femur (red), tibia (blue), femoral cartilage (green), tibial cartilage (yellow), lateral meniscus (purple), and medial meniscus (baby blue)) were matched (Fig. S2) and imported into Matlab R2021a (version 9.10.0). The Standardized Environment for Radiomics Analysis (SERA) package is a Matlab-based framework developed at Johns Hopkins University that calculates radiomic features (https://www.bccrc.ca/dept/io-programs/qurit/software/sera/) based on guidelines from the Image Biomarker Standardization Initiative (IBSI) [20]. We extracted 487 IBSI-standardized features from each 3D segmentation (femur, tibia, femoral cartilage, tibial cartilage, lateral meniscus, and medial meniscus), including 79 first-order features (morphology, statistical, histogram and intensity-histogram features), 272 higher-order 2D features (second order radiomic features), and 136 3D features (third order radiomic features). Values of extracted features for the development cohort were standardized to z scores; feature values of the total test cohort were then standardized to z scores using the mean and standard deviation values derived from the total development cohort.
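The standardization step can be sketched as follows: the mean and standard deviation are estimated on the development cohort only and then reused for the test cohort, so that no test information leaks into the scaling. This is a minimal Python illustration with made-up values for a single feature; the helper names are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not say which estimator was used.

```python
import statistics


def fit_standardizer(development_values):
    """Estimate the mean and SD of one feature from the development cohort only."""
    mean = statistics.fmean(development_values)
    sd = statistics.stdev(development_values)  # sample SD (n - 1); an assumption
    return mean, sd


def standardize(values, mean, sd):
    """Convert raw feature values to z scores with the supplied mean and SD."""
    return [(v - mean) / sd for v in values]


# Made-up values for a single radiomic feature.
development = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
test = [5.0, 9.0]

mean, sd = fit_standardizer(development)
development_z = standardize(development, mean, sd)
test_z = standardize(test, mean, sd)  # reuses the development-cohort statistics
print(test_z)
```

Because the test cohort is scaled with development-cohort statistics, its z scores need not have mean 0 or unit variance.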
2.10. Radiomic features selection
The radiomic feature selection included two parts (Figs. S3–4). The first part was single-structure radiomic feature selection. In this part, least absolute shrinkage and selection operator (LASSO) logistic regression, repeated 1,000,000 times, was applied to select the features with nonzero coefficients among the 487 radiomic features in each VOI. In total, 67 femur features, 33 femoral cartilage features, 15 tibia features, 8 tibial cartilage features, 19 medial meniscal features, and 11 lateral meniscal features remained (Fig. S3). The second part was multi-structure radiomic feature selection. To further streamline the radiomic features, LASSO logistic regression was applied to the total of 153 selected features of the six VOIs. By optimizing the tuning parameter in the LASSO logistic regression and avoiding overfitting effects when analyzing the selected features of the six VOIs, 115 features with nonzero coefficients were selected as load-bearing tissue features and 110 features were selected as load-bearing tissue plus clinical features (Fig. S4). The names of the radiomic features are presented in Table S3.
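To illustrate why LASSO logistic regression acts as a feature selector, the sketch below fits an L1-penalized logistic regression by proximal gradient descent on a toy data set. It is not the authors' SAS implementation: the solver, the function names, the penalty `lam`, the step size `lr`, and the four-sample data set are all assumptions chosen so that the behavior is easy to follow. The soft-thresholding step sets the coefficient of the uninformative feature to exactly zero, and features with nonzero coefficients are the ones retained.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def soft_threshold(value, threshold):
    """Shrink a coefficient toward zero; small values become exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_logistic(X, y, lam, lr=0.5, n_iter=2000):
    """L1-penalized logistic regression via proximal gradient descent (a sketch)."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        probs = [sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b) for row in X]
        residuals = [pi - yi for pi, yi in zip(probs, y)]
        grad_b = sum(residuals) / n
        grad_w = [sum(r * row[j] for r, row in zip(residuals, X)) / n for j in range(p)]
        b -= lr * grad_b  # the intercept is not penalized
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b


# Toy data: feature 0 determines the outcome, feature 1 is unrelated to it.
X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y = [1, 1, 0, 0]

weights, intercept = lasso_logistic(X, y, lam=0.1)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print(selected)  # indices of features with nonzero coefficients
```

Larger values of `lam` shrink more coefficients to zero, which is the role of the tuning parameter mentioned above.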
2.11. Neural network model development and test
Deep learning predictive models, such as recurrent neural networks (RNNs), can automatically extract features from MRI data [33], but their complexity often increases the risk of overfitting, especially with small medical imaging datasets [34]. Overfitting leads to models that perform well on training data but generalize poorly to new data. In contrast, simpler machine learning models require more manual feature engineering but are more interpretable and less prone to overfitting [35]. For clinical applications, where interpretability is crucial, this trade-off is important. Our model balances complexity and interpretability, integrating radiomic and clinical data using machine learning for robust predictive performance [36].
We established a neural network model for ROA incident prediction in the total development cohort, based on the radiomic features retained by LASSO logistic regression. The fully connected shallow neural network algorithm had one hidden layer containing TanH functions between the input features and the outcome layer. For each single-structure radiomic model (femur, tibia, femoral cartilage, tibial cartilage, lateral meniscus, and medial meniscus), a supervised learning strategy with a neural network algorithm was developed using the corresponding features. The single-structure radiomic models were merged into a load-bearing tissue radiomic model (bone-meniscus-cartilage) using the same neural network framework based on load-bearing tissue radiomic features. Clinical model factors significantly related to ROA incident were selected using LASSO, including knee injury, WOMAC stiffness score, WOMAC disability score, and total WOMAC score. The predictive performance and weight of clinical factors can be found in Table S7. The load-bearing tissue plus clinical features were selected by LASSO. All models were evaluated on the test cohorts using 10-fold cross-validation repeated 15,000 times. To prevent overfitting, feature selection was performed using recursive feature elimination [37], while L2 regularization and dropout [38] were applied to reduce model complexity and improve generalization. In addition, hyperparameters were optimized through grid search to balance bias and variance. An accessible introduction to developing neural network algorithms can be found in previous research [35].
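As a rough illustration of how one repetition of 10-fold cross-validation partitions a cohort, the Python sketch below splits sample indices into ten nearly equal folds and uses each fold once for validation. The function name `k_fold_indices`, the `seed` argument, and the shuffling scheme are assumptions for illustration only; the cohort size of 1082 is taken from the text, and repeating the split with different seeds would give repeated cross-validation.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, validation_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # hypothetical seed, for reproducibility
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# One repetition of 10-fold cross-validation on a cohort of 1082 knee MRIs.
splits = list(k_fold_indices(n_samples=1082, k=10))
for train, validation in splits[:2]:
    print(len(train), len(validation))
```

Every sample appears in exactly one validation fold, so each observation is used for validation once per repetition.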
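In this work, the activation function was TanH, which transforms values to lie between −1 and 1. TanH is a centered and rescaled version of the logistic function and is defined as:

$$\tanh(x) = \frac{e^{2x} - 1}{e^{2x} + 1}$$

where $x$ is a linear combination of radiomic features.

The outcome, $y$, takes on values 0 (non-ROA) and 1 (ROA incident). Let $j = 1, \dots, k$ index the outcome categories (here $k = 2$, the two values of $y$), with category $k$ as the reference. Under the parameterization of the multinomial distribution using the canonical log odds parameter $\theta_{ij}$ for observation $i$ and category $j$, the loglikelihood is:

$$\log L = \sum_{i} \left[ \sum_{j<k} I(y_i = j)\,\theta_{ij} - \log\left(1 + \sum_{j<k} e^{\theta_{ij}}\right) \right]$$

where $I(y_i = j)$ is the indicator function of the event that $y_i = j$. The neural model is such that each $\theta_{ij}$ is a linear combination of the uppermost hidden layer nodes and the set of parameters that correspond to $j$, plus an intercept type parameter. In this way, the prediction formula of the probability that $y_i = j$, $p_{ij}$, is:

$$p_{ij} = \frac{e^{\theta_{ij}}}{1 + \sum_{l<k} e^{\theta_{il}}}$$

for $j < k$, and:

$$p_{ik} = \frac{1}{1 + \sum_{l<k} e^{\theta_{il}}}$$

For testing purposes, one can find the $\theta_{ij}$ parameters from the prediction probabilities using the relation $\theta_{ij} = \log(p_{ij}/p_{ik})$. For the statistics for outcome, the null model is the one where $p_{ij} = \hat{p}_j$ for all $i$, where $\hat{p}_j$ is the sample proportion of observations where $y = j$.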
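The forward pass of a single-hidden-layer TanH network with a binary outcome, and the recovery of the log odds from a predicted probability, can be sketched in Python as follows. The weights, biases, and input values are hypothetical placeholders rather than fitted parameters from the study, and the helper names are illustrative. For two outcome categories the probability formula reduces to the familiar logistic form.

```python
import math


def tanh_activation(x):
    """TanH in its rescaled-logistic form (e^{2x} - 1) / (e^{2x} + 1)."""
    e2x = math.exp(2.0 * x)
    return (e2x - 1.0) / (e2x + 1.0)


def predict_probability(features, hidden_weights, hidden_biases, out_weights, out_bias):
    """Forward pass: inputs -> one TanH hidden layer -> log odds -> probability."""
    hidden = [
        tanh_activation(sum(w * x for w, x in zip(weights, features)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    # The log odds are a linear combination of the hidden nodes plus an intercept.
    theta = sum(w * h for w, h in zip(out_weights, hidden)) + out_bias
    return math.exp(theta) / (1.0 + math.exp(theta))


def log_odds(p):
    """Recover theta from a predicted probability: theta = log(p / (1 - p))."""
    return math.log(p / (1.0 - p))


# Hypothetical standardized features and parameters, for illustration only.
features = [0.5, -1.0]
hidden_weights = [[1.0, 0.0], [0.0, 1.0]]
hidden_biases = [0.0, 0.0]
out_weights = [2.0, -1.0]
out_bias = 0.1

p = predict_probability(features, hidden_weights, hidden_biases, out_weights, out_bias)
print(p, log_odds(p))
```

For inputs of large magnitude, `math.tanh` is the numerically safer choice; the explicit exponential form is used here only to mirror the definition in the text.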
2.12. Double echo steady-state signal intensity map of knee MRI
We generated baseline double echo steady-state (DESS) signal intensity maps (mean pixel value) for load-bearing tissues using 3D Slicer software (version 5.0.3) [14]. This approach facilitated the detection of MRI changes in participants with KOA incidents using the SAG-3D-DESS-WE sequence.
2.13. Development and test data sets
Data sets for development and testing were sourced from the POMA_inc.OA cohort study, which initially included 710 knees with 2214 knee MRIs. After excluding 10 knees (50 MRIs) with non-conforming images, a set of 700 knees with 2164 knee MRIs remained. Each visit's knee MRIs were divided into development and test cohorts in a 1:1 ratio. The total development cohort comprised 1082 knee MRIs (542 cases vs. 540 controls), while the total test cohort (n = 1082, 534 cases vs. 548 controls) integrated data from five visits: baseline (test cohort 1, n = 353, 162 cases vs. 191 controls), 3 years prior to KOA (P-3) (test cohort 2, n = 46, 27 cases vs. 19 controls), 2 years prior (P-2) (test cohort 3, n = 143, 66 cases vs. 77 controls), 1 year prior (P-1) (test cohort 4, n = 220, 115 cases vs. 105 controls), and at KOA incident (P0) (test cohort 5, n = 320, 164 cases vs. 156 controls). Knee MRIs with non-conforming images were excluded from each test cohort (Fig. 1A).
Fig. 1.
Workflow of the study. (A) A total of 700 eligible knees without ROA at baseline (development cohort/test cohort 1 = 1/1) were selected from the POMA_inc.OA cohort (ClinicalTrials.gov identifier: NCT00080171). The MRIs of baseline, P-3, P-2, P-1, and P0 were included as test cohort 1 (n = 353), test cohort 2 (n = 46), test cohort 3 (n = 143), test cohort 4 (n = 220), and test cohort 5 (n = 320), respectively. (B) Knee MRIs were automatically segmented for feature extraction. After feature evaluation, six sets of load-bearing structural signatures (femur, femoral cartilage, tibia, tibial cartilage, medial meniscus, and lateral meniscus features) were selected and further used to develop the LBTC-RM (integrated MRI radiomics and clinical variables). The performance of LBTC-RM in predicting ROA incident (i.e., ROA incident vs. non-ROA incident) was tested in the test cohorts. POMA_inc.OA cohort: Pivotal OAI MR Imaging Analyses incidental OA cohort, P-3: 3 years prior to ROA, P-2: 2 years prior to ROA, P-1: 1 year prior to ROA, P0: ROA incident, SAG-3D-DESS-WE: SAGittal 3-Dimensional Double Echo Steady-State with selective Water Excitation, CNNs: Convolutional Neural Networks, LASSO: Least Absolute Shrinkage and Selection Operator, FE-RM: FEmur Radiomic Model, FC-RM: Femoral Cartilage Radiomic Model, TI-RM: TIbia Radiomic Model, TC-RM: Tibial Cartilage Radiomic Model, MM-RM: Medial Meniscal Radiomic Model, LM-RM: Lateral Meniscal Radiomic Model, LBTC-RM: Load-Bearing Tissue plus Clinical variable Radiomic Model, OAI: OsteoArthritis Initiative, ROI: Region Of Interest, MRI: Magnetic Resonance Imaging, ROA: Radiographic OsteoArthritis, 2-D: 2-Dimensional, 3-D: 3-Dimensional, AUC: Area Under the receiver operating characteristic Curve, AI: Artificial Intelligence, Test cohort 1: baseline, Test cohort 2: 3 years prior to KOA, Test cohort 3: 2 years prior, Test cohort 4: 1 year prior, Test cohort 5: KOA incident, Total test cohort: the combined dataset from Test cohorts 1 to 5.
2.14. Three-dimensional radiomic feature analysis
We extracted standardized radiomic features from 2164 knee SAG-3D-DESS-WE MRIs in the POMA_inc.OA cohort using IBSI standards [20], resulting in 2922 features per knee MRI (487 features for each of the six structures) through SERA [39]. Employing LASSO logistic regression, nonzero coefficient features for each structure were identified, leading to the development of specific radiomic models for femur, femoral cartilage, tibia, tibial cartilage, medial meniscus, and lateral meniscus (FE-RM, FC-RM, TI-RM, TC-RM, MM-RM and LM-RM, respectively). Further refinement using LASSO regression produced the load-bearing tissue radiomic model (LBT-RM) with 115 nonzero coefficient features. Integration with clinical variables resulted in the load-bearing tissue plus clinical variable radiomic model (LBTC-RM) with 110 features. Additional models, including MRI osteoarthritis knee score (MOAKS) image biomarker models and a clinical model, were constructed for comparison. All models utilized a fully connected shallow neural network algorithm with a single hidden layer. For detailed information on feature selection and model development, refer to the Supplemental Methods (pp 3–5).
2.15. Reader tests
Five resident physicians (G.W.Z., Z.J.W., Y.W., X.L., and S.J.W.), with 1–4 years of clinical experience, predicted KOA incidents using knee MRI and clinical variables. They assessed scenarios with and without LBTC-RM assistance, which provided KOA incident probabilities and true classification thresholds.
2.16. Evaluation of model performance
We evaluated the accuracy of the ROA incident prediction models using receiver operating characteristic (ROC) analysis. Evaluation metrics, including the Area Under receiver operating characteristic Curve (AUC), sensitivity, specificity, accuracy, and kappa value were computed.
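These evaluation metrics can be illustrated with a short Python sketch. It computes the AUC through its rank interpretation (the probability that a randomly chosen case receives a higher score than a randomly chosen control, with ties counted as one half) and derives sensitivity, specificity, accuracy, and Cohen's kappa from a confusion matrix. The labels, scores, the 0.5 threshold, and the function names are made up for illustration, and treating kappa as agreement between predicted and true labels is an assumption; the study itself reports using R with the pROC package.

```python
def auc_score(labels, scores):
    """AUC as the fraction of case-control pairs ranked correctly (ties count 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for c in cases:
        for d in controls:
            if c > d:
                wins += 1.0
            elif c == d:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def classification_metrics(labels, scores, threshold=0.5):
    """Sensitivity, specificity, accuracy, and Cohen's kappa at a given threshold."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    pairs = list(zip(labels, predicted))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    n = len(labels)
    accuracy = (tp + tn) / n
    # Chance agreement expected from the marginal totals alone.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": accuracy,
        "kappa": (accuracy - expected) / (1.0 - expected),
    }


# Made-up outcomes (1 = ROA incident) and model probabilities for eight knees.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

auc = auc_score(labels, scores)
metrics = classification_metrics(labels, scores)
print(auc, metrics)
```

In this toy example 13 of the 16 case-control pairs are ranked correctly, giving an AUC of 0.8125, while the threshold-based metrics all equal 0.75 and kappa equals 0.5.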
2.17. Score system of predictive model features
Utilizing LASSO logistic regression feature weights, we devised the scoring system for LBTC-RM (Table S11). Initially, we established the Load-bearing tissue Radiomic plus Clinical variable score (LRC_score), comprising subgroups LR_score and C_score. LR_score further branched into Load-bearing tissue First order radiomic score (LF_score), Load-bearing tissue Second order (2-dimensional features) radiomic score (LS_score), and Load-bearing tissue Third order (3-dimensional features) radiomic score (LT_score). These subtypes (LF_score, LS_score, LT_score) were categorized into six knee structures (femur, femoral cartilage, tibia, tibial cartilage, lateral meniscus, and medial meniscus), each with distinct radiomic scores. Standardization of all scores in the total test cohort was performed using t score.
2.18. Statistical analysis
We employed the generalized estimating equation (GEE) method to assess the risk of KOA incident, accounting for the correlation of repeated measures within subjects over time. The GEE model is particularly suitable for longitudinal data, as it addresses intra-subject correlation and provides robust population-averaged estimates. This approach allowed us to evaluate the association between risk factors and KOA incidence while accounting for time-dependent changes. Radiomic feature extraction and Dice similarity coefficient (DSC) calculation were conducted using Matlab R2021a (version 9.10.0), while LASSO logistic regression and neural network modeling were performed using Statistical Analysis System (SAS, version 9.4). Model performance in predicting KOA incident was evaluated using various metrics, including AUC, sensitivity, specificity, accuracy, and kappa coefficient, separately in the total development and test cohorts. To compare case and control groups in the test cohorts at different visits, we used appropriate tests (unpaired t-test for means, χ2 test for proportions, and Mann–Whitney test for medians). The significance of differences in AUCs was determined using the DeLong test, with p < .05 considered statistically significant. All statistical analyses were carried out using R (version 4.1.1) and the “pROC” package (version 1.18.0).
3. Results
3.1. Demographic characteristics
Table S4 displays baseline demographics for development cohort 1 and test cohort 1, with 67 % females in both. Whites or Caucasians constituted 83 % (289/347) in development cohort 1 and 81 % (285/353) in test cohort 1. Incidence of KOA in contralateral knees was 39 % (135/347) and 35 % (125/353) for development cohort 1 and test cohort 1, respectively. No significant differences existed in baseline demographics between the cohorts (all p > .05, Table S4).
Baseline demographics showed minimal differences between case and control knees in both cohorts, except for BMI (p < .05) in both cohorts and race, knee injury, and WOMAC scores in test cohort 1 (Table 1).
Table 1.
Baseline characteristics of the participants in development cohort 1 and test cohort 1, stratified by Case and Control Groups.
Characteristic | Development cohort 1 (n = 347): Control (n = 158) | Development cohort 1: Case (n = 189) | Development cohort 1: p value | Test cohort 1 (n = 353): Control (n = 191) | Test cohort 1: Case (n = 162) | Test cohort 1: p value |
---|---|---|---|---|---|---|
Age (year)ᵃ | 60 ± 8 | 60 ± 8 | 0.96 | 60 ± 9 | 60 ± 9 | 0.87 |
Femaleᵇ | 110 (70 %) | 124 (66 %) | 0.43 | 126 (66 %) | 112 (69 %) | 0.53 |
Raceᶜ |  |  | 0.79 |  |  | 0.013 |
White | 132 (83 %) | 157 (83 %) |  | 162 (85 %) | 123 (76 %) |  |
Black | 22 (14 %) | 27 (14 %) |  | 24 (12 %) | 33 (20 %) |  |
Asian | 1 (1 %) | 2 (1 %) |  | 1 (1 %) | 4 (3 %) |  |
Other | 3 (2 %) | 3 (2 %) |  | 4 (2 %) | 2 (1 %) |  |
Contralateral knee KLGᶜ |  |  | 0.28 |  |  | 0.32 |
KLG = 0 | 23 (15 %) | 39 (21 %) |  | 40 (21 %) | 24 (15 %) |  |
KLG = 1 | 68 (43 %) | 82 (43 %) |  | 89 (47 %) | 75 (46 %) |  |
KLG = 2+ | 67 (42 %) | 68 (36 %) |  | 62 (32 %) | 63 (39 %) |  |
BMI (kg/m²)ᵃ | 27 ± 4 | 29 ± 5 | 0.003 | 28 ± 4 | 29 ± 4 | 0.02 |
Knee injuryᵇ | 29 (18 %) | 41 (22 %) | 0.44 | 31 (16 %) | 46 (28 %) | 0.006 |
Knee surgeryᵇ | 7 (4 %) | 9 (5 %) | 0.88 | 14 (7 %) | 8 (5 %) | 0.36 |
NSAIDs useᵇ | 32 (20 %) | 48 (25 %) | 0.26 | 43 (23 %) | 42 (26 %) | 0.46 |
CES-Dᶜ | 4 (1, 8) | 4 (1, 8) | 0.86 | 4 (2, 8) | 4 (2, 8) | 0.98 |
PASEᶜ | 151 (108, 212) | 164 (114, 225) | 0.46 | 156 (104, 218) | 173 (102, 226) | 0.56 |
WOMAC pain scoreᶜ | 1 (0, 3) | 1 (0, 3) | 0.80 | 0 (0, 3) | 1 (0, 4) | 0.016 |
WOMAC stiffness scoreᶜ | 1 (0, 2) | 1 (0, 2) | 0.13 | 1 (0, 2) | 2 (0, 3) | 0.006 |
WOMAC disability scoreᶜ | 1 (0, 7) | 2 (0, 11) | 0.29 | 1 (0, 7) | 3 (0, 15) | 0.004 |
Total WOMAC scoreᶜ | 3 (0, 12) | 4 (0, 15) | 0.25 | 3 (0, 12) | 6 (1, 20) | 0.002 |
Data are represented as mean ± SD, n (%), or median (IQR).
KLG: Kellgren Lawrence Grade, BMI: Body Mass Index, NSAIDs: NonSteroidal AntiInflammatory Drugs, CES-D: Center for Epidemiologic Studies Depression scale, PASE: Physical Activity Scale for the Elderly, WOMAC: Western Ontario and McMaster Universities Arthritis Index, IQR: InterQuartile Range, SD: Standard Deviation.
ᵃ Unpaired t-tests are used for differences between means.
ᵇ χ2 tests are used for differences between proportions.
ᶜ Mann–Whitney tests are used for differences between medians.
3.2. Automatic segmentation reliability and features selection
We applied CNN-based automated segmentation to our dataset, which exhibited good agreement with manual segmentations (DSC > 0.80, Table S2). The process of feature selection through LASSO regression is described in Figs. S3–4, and the selected features are shown in Table S3. The performance of the final predictive model was validated by 10-fold cross-validation. The maximum and median AUCs of the final predictive model for predicting KOA incidence were 0.93 and 0.81, respectively (Fig. S5).
3.3. Performance of single-structure radiomic models: comparison with single-structure MOAKS models
In the baseline assessment, the KOA group (Fig. S6B) displayed heightened DESS signal intensity and heterogeneity across knee structures compared to the control group (Fig. S6A).
For predicting KOA incident, single-structure radiomic models (FE-RM, FC-RM, TI-RM, TC-RM, MM-RM, and LM-RM) achieved AUCs ranging from 0.55 to 0.77 in the test cohorts (Figs. S7A–F). These models demonstrated acceptable sensitivity (59 %–70 %), specificity (47 %–74 %), and accuracy (48 %–67 %) across knee structures (Table S5). Kappa coefficients for these models ranged from 0.04 to 0.43 in the test cohorts (Figs. S8A–F).
In the test cohorts, single-structure radiomic models generally outperformed the corresponding MOAKS models (FE-RM vs. FE-MM, FC-RM vs. FC-MM, TI-RM vs. TI-MM, TC-RM vs. TC-MM, LM-RM vs. LM-MM, and MM-RM vs. MM-MM) in terms of AUC, sensitivity, and accuracy (Table S6). However, MOAKS models exhibited better specificity in the total test cohort (p < .05) (Table S6).
3.4. Performance of LBTC-RM: comparison with single-structure radiomic models, LBT-RM, and clinical model
The baseline DESS signal intensity and heterogeneity in load-bearing tissues were significantly higher in the KOA group than in the control group, particularly in key regions such as cartilage and meniscus. These differences are visually apparent in Fig. 2: the 2-D and 3-D signal intensity maps of the control group (Fig. 2B and C) show lower and more uniform intensities than those of the KOA group (Fig. 2E and F), in which increased signal heterogeneity reflects early structural changes associated with KOA progression.
Fig. 2.
Load-bearing structural feature maps and prediction performance of LBT-RM, LBT-MM, CM, and LBTC-RM in test cohorts. 2-D and 3-D DESS signal intensity maps (mean pixel value) of load-bearing tissues are displayed for the control (B–C) and ROA (E–F) groups. These maps highlight key differences in signal intensities, particularly in regions such as cartilage and meniscus, where increased intensity in the ROA group correlates with structural damage associated with KOA progression. AUCs for predicting ROA incident are shown for LBT-RM, LBT-MM, CM, and LBTC-RM in test cohort 1 (G), test cohort 2 (H), test cohort 3 (I), test cohort 4 (J), test cohort 5 (K), and the total test cohort (L). AUC: Area Under the receiver operating characteristic Curve, LBT-RM: Load-Bearing Tissue Radiomic Model, LBT-MM: Load-Bearing Tissue MOAKS Model, CM: Clinical Model, LBTC-RM: Load-Bearing Tissue plus Clinical variable Radiomic Model, MOAKS: Magnetic resonance imaging OsteoArthritis Knee Score, ROA: Radiographic OsteoArthritis, 2-D: 2-Dimensional, 3-D: 3-Dimensional, DESS: Double Echo Steady-State, Test cohort 1: baseline, Test cohort 2: 3 years prior to KOA, Test cohort 3: 2 years prior, Test cohort 4: 1 year prior, Test cohort 5: KOA incident, Total test cohort: the combined dataset from Test cohorts 1 to 5.
In the test cohorts, LBT-RM exhibited AUC values of 0.82–0.85, outperforming the load-bearing tissue MOAKS model (LBT-MM), which had AUCs of 0.59–0.65 (Fig. 2G–L and Table 2). LBT-RM demonstrated sensitivity of 67–77 %, specificity of 71–84 %, and accuracy of 71–76 % across test cohorts (Table 2), with kappa coefficients ranging from 0.42 to 0.53 (Figs. S9A–F).
The clinical model had AUCs of 0.60–0.68 in the test cohorts, with sensitivity of 51–58 %, specificity of 55–68 %, and accuracy of 56–61 % (Table 2). LBTC-RM achieved AUCs of 0.83–0.86, sensitivity of 70–77 %, specificity of 74–84 %, and accuracy of 76–77 % (Table 2). LBTC-RM and LBT-RM demonstrated superior predictive performance in AUC, sensitivity, specificity, accuracy, and kappa coefficient compared to single-structure radiomic models, LBT-MM, the clinical model, and LBTC-MM in the test cohorts (all p < .05; Table S6). The AUC difference between LBTC-RM and LBT-RM was not statistically significant in any test cohort (Table S6). These results indicate that integrating radiomic features with clinical data in LBTC-RM offers better predictive performance than the MOAKS-based and clinical-variable-only models.
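For readers less familiar with the AUC values compared above, the sketch below computes an AUC as the probability that a randomly chosen case receives a higher model score than a randomly chosen control, with ties counted as one half. The scores are hypothetical, and the study may have computed its AUCs with a different but equivalent routine.

```python
def auc_from_scores(case_scores, control_scores):
    """AUC as the fraction of case-control pairs ranked correctly (ties = 0.5)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical predicted probabilities of KOA incident.
cases = [0.9, 0.8, 0.55, 0.4]
controls = [0.7, 0.5, 0.3, 0.2]
auc = auc_from_scores(cases, controls)
print(f"AUC = {auc:.4f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls.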
Table 2.
Prediction performance of load-bearing tissue models and clinical model in the test cohorts.
Model performance | Test cohort 1 | Test cohort 2 | Test cohort 3 | Test cohort 4 | Test cohort 5 | Total test cohort |
---|---|---|---|---|---|---|
LBT-RM | ||||||
AUC | 0.84 (0.80–0.88) | 0.82 (0.66–0.92) | 0.82 (0.75–0.88) | 0.85 (0.79–0.89) | 0.82 (0.77–0.86) | 0.83 (0.81–0.86) |
Sensitivity | 77 % (124/162) | 70 % (19/27) | 67 % (44/66) | 71 % (82/115) | 76 % (125/164) | 74 % (394/534) |
Specificity | 76 % (145/191) | 84 % (16/19) | 75 % (58/77) | 80 % (84/105) | 71 % (111/156) | 76 % (414/548) |
Accuracy | 76 % (269/353) | 76 % (35/46) | 71 % (102/143) | 75 % (166/220) | 74 % (236/320) | 75 % (808/1082) |
LBT-MM | ||||||
AUC | 0.65 (0.59–0.70) | 0.59 (0.42–0.74) | 0.62 (0.52–0.71) | 0.62 (0.55–0.70) | 0.64 (0.57–0.70) | 0.64 (0.60–0.67) |
Sensitivity | 57 % (93/162) | 48 % (13/27) | 55 % (36/66) | 57 % (65/115) | 63 % (104/164) | 58 % (311/534) |
Specificity | 76 % (145/191) | 79 % (15/19) | 83 % (64/77) | 75 % (79/105) | 73 % (114/156) | 76 % (417/548) |
Accuracy | 67 % (238/353) | 61 % (28/46) | 70 % (100/143) | 65 % (144/220) | 68 % (218/320) | 67 % (728/1082) |
Clinical model | ||||||
AUC | 0.61 (0.55–0.67) | 0.68 (0.51–0.81) | 0.61 (0.52–0.70) | 0.60 (0.52–0.67) | 0.61 (0.54–0.67) | 0.61 (0.58–0.64) |
Sensitivity | 56 % (90/162) | 56 % (15/27) | 58 % (38/66) | 51 % (59/115) | 57 % (93/164) | 55 % (295/534) |
Specificity | 59 % (112/191) | 68 % (13/19) | 55 % (42/77) | 61 % (64/105) | 56 % (88/156) | 58 % (319/548) |
Accuracy | 57 % (202/353) | 61 % (28/46) | 56 % (80/143) | 56 % (123/220) | 57 % (181/320) | 57 % (614/1082) |
LBTC-RM | ||||||
AUC | 0.86 (0.81–0.89) | 0.83 (0.67–0.92) | 0.84 (0.76–0.89) | 0.85 (0.80–0.90) | 0.84 (0.79–0.88) | 0.85 (0.82–0.87) |
Sensitivity | 75 % (121/162) | 70 % (19/27) | 73 % (48/66) | 74 % (85/115) | 77 % (127/164) | 75 % (400/534) |
Specificity | 77 % (147/191) | 84 % (16/19) | 78 % (60/77) | 81 % (85/105) | 74 % (116/156) | 77 % (424/548) |
Accuracy | 76 % (268/353) | 76 % (35/46) | 76 % (108/143) | 77 % (170/220) | 76 % (243/320) | 76 % (824/1082) |
LBTC-MM | ||||||
AUC | 0.64 (0.58–0.69) | 0.67 (0.50–0.81) | 0.64 (0.54–0.73) | 0.67 (0.59–0.74) | 0.65 (0.58–0.71) | 0.65 (0.61–0.68) |
Sensitivity | 68 % (110/162) | 70 % (19/27) | 74 % (49/66) | 68 % (78/115) | 71 % (116/164) | 70 % (372/534) |
Specificity | 64 % (122/191) | 68 % (13/19) | 65 % (50/77) | 72 % (76/105) | 63 % (98/156) | 66 % (359/548) |
Accuracy | 66 % (232/353) | 70 % (32/46) | 69 % (99/143) | 70 % (154/220) | 67 % (214/320) | 68 % (731/1082) |
AUCs are represented as mean (95 % CI). Sensitivity, specificity, and accuracy are represented as percentages (numerator/denominator for percentages).
Sensitivity = TP/(TP + FN), specificity = TN/(TN + FP), accuracy = (TP + TN)/(TP + FP + TN + FN).
AUC: Area Under the receiver operating characteristic Curve, CI: Confidence Interval, LBT-RM: Load-Bearing Tissue Radiomic Model, LBT-MM: Load-Bearing Tissue MOAKS Model, LBTC-RM: Load-Bearing Tissue plus Clinical variable Radiomic Model, LBTC-MM: Load-Bearing Tissue plus Clinical variable MOAKS Model, MOAKS: Magnetic resonance imaging OsteoArthritis Knee Score, TP: True Positive, TN: True Negative, FP: False Positive, FN: False Negative.
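The sensitivity, specificity, and accuracy definitions can be checked against the counts in Table 2. The sketch below uses the LBTC-RM total test cohort entries (400/534, 424/548, 824/1082). The function name is illustrative, and the false-negative and false-positive counts are derived by subtraction from the table's denominators.

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from confusion-table counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# LBTC-RM, total test cohort (Table 2): FN = 534 - 400, FP = 548 - 424.
metrics = classification_metrics(tp=400, fn=134, tn=424, fp=124)
percentages = {name: round(value * 100) for name, value in metrics.items()}
print(percentages)
```

The rounded values reproduce the 75 %, 77 %, and 76 % reported in the table.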
3.5. Association of risk factors with KOA incident
Table S7 displays the odds ratios (ORs) for clinical model features predicting KOA incident in the POMA_inc.OA cohort. Adjusted for baseline factors, knee injury, baseline WOMAC stiffness, disability, and total WOMAC scores increased the OR of KOA incident, with adjusted ORs [95 % confidence interval (CI)] of 1.7 (1.1–2.4), 1.1 (1.02–1.3), 1.03 (1.01–1.05), and 1.02 (1.003–1.03), respectively.
Table S8 presents the ORs of predictive model outputs for KOA incident in the POMA_inc.OA cohort. Adjusted for baseline factors, LBTC-RM output was associated with the highest risk, with an adjusted OR (95 % CI) of 20.6 (13.8–30.6). LBT-RM output showed an adjusted OR (95 % CI) of 19.7 (13.3–29.1) for KOA incident. For single-structure radiomic model outputs, adjusted ORs for KOA incident ranged from 2.3 to 4.5. Similarly, adjusted ORs for KOA incident in single-structure MOAKS model outputs ranged from 1.1 to 3.9. The clinical model, LBT-MM, and LBTC-MM outputs had adjusted ORs (95 % CI) of 1.8 (1.3–2.5), 4.8 (3.4–6.7), and 5.0 (3.5–6.9), respectively.
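The sketch below illustrates how an odds ratio and its 95 % CI relate to a regression coefficient and its standard error. It assumes a logistic-type model with a Wald interval that is symmetric on the log scale, which this section does not state explicitly. Both function names are hypothetical, and the recovered coefficient is only an approximation derived from the rounded values reported for LBTC-RM.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.96):
    """Odds ratio and Wald confidence limits from a log-odds coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def beta_and_se_from_or(or_value, ci_low, ci_high, z=1.96):
    """Approximate coefficient and standard error implied by a reported OR and CI."""
    beta = math.log(or_value)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
    return beta, se


# Reported LBTC-RM output: OR 20.6 (95 % CI 13.8-30.6).
beta, se = beta_and_se_from_or(20.6, 13.8, 30.6)
or_value, low, high = odds_ratio_with_ci(beta, se)
print(f"beta = {beta:.2f}, SE = {se:.2f}, OR = {or_value:.1f} ({low:.1f}-{high:.1f})")
```

The reconstructed interval differs slightly from the published one because the published limits are rounded.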
3.6. Performance of LBTC-RM-supported physicians: comparison with unsupported physicians
In the total test cohort, five resident physicians used knee MRIs and clinical data to predict KOA incident (Fig. 3A). LBTC-RM achieved balanced sensitivity and specificity at a classification probability threshold of 48 % (Fig. S10A). With LBTC-RM assistance, the physicians' mean sensitivity, specificity, and accuracy increased from 50 % to 72 %, from 60 % to 73 %, and from 55 % to 72 %, respectively, in the total test cohort (Fig. 3B–C, Figs. S10B–K, and Table S10). These gains correspond to relative increases of 44 %, 22 %, and 31 %, respectively (all p < .05) (Table S10).
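The 44 %, 22 %, and 31 % figures are relative increases rather than percentage-point differences, as the short check below shows. The helper name is illustrative, and the inputs are the rounded means quoted in the paragraph above.

```python
def relative_increase(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Mean physician performance without and with LBTC-RM support (in %).
reported = {
    "sensitivity": (50, 72),
    "specificity": (60, 73),
    "accuracy": (55, 72),
}
changes = {
    name: round(relative_increase(before, after))
    for name, (before, after) in reported.items()
}
print(changes)
```

In absolute terms, the same gains are 22, 13, and 17 percentage points.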
Fig. 3.
Performance of resident physicians and models in predicting the ROA incident in the total test cohort. (A) A schematic workflow illustrates the assessment of ROA incident by resident physicians and the support provided by LBTC-RM, which assisted physicians by providing the ROA incident probability for each participant. (B) The AUC of LBTC-RM for predicting ROA incident was compared with that of LBTC-MM in the total test cohort, and the average performance of all physicians in predicting ROA incident is displayed both without (blue dot) and with (red dot) the support of LBTC-RM. The performance of LBTC-RM was superior to that of LBTC-MM in predicting ROA incident. When LBTC-RM assistance was provided, both the sensitivity and specificity of the physicians increased (indicated by the black arrow). (C) The individual performance of physicians in the total test cohort is represented by open shapes (without LBTC-RM support) and filled shapes (with LBTC-RM support). The integration of LBTC-RM into the prognostic loop significantly enhanced the prediction performance of physicians, as shown by the dashed connection lines. AUC: Area Under the receiver operating characteristic Curve, LBTC-RM: Load-Bearing Tissue plus Clinical variable Radiomic Model, LBTC-MM: Load-Bearing Tissue plus Clinical variable MOAKS Model, MOAKS: Magnetic resonance imaging OsteoArthritis Knee Score, ROA: Radiographic OsteoArthritis, Total test cohort: the combined dataset from Test cohorts 1 to 5.
3.7. Association between radiomic scores, KOA, and knee symptoms
Over the 4-year follow-up, LRC_score, LR_score, C_score, LF_score, and LT_score decreased, whereas LS_score increased. Significant differences between case and control groups were noted in LRC_score, LR_score, and LS_score at all visits (p < .05) (Table S12). Adjusted for baseline factors, LRC_score, LR_score, and LS_score showed a positive bidirectional causality with KOA at all visits. LF_score of femoral cartilage had a positive bidirectional causality after P-2, and tibial cartilage and medial meniscus showed it after P-1 (Table S13). LS_score, at various time points, had a positive bidirectional causality with KOA, starting at P-4 for femur and medial meniscus, P-3 for tibial cartilage, P-1 for femoral cartilage, and P0 for tibial cartilage (Table S13). However, LT_score of medial meniscus showed a negative bidirectional causality at P0 (Tables S12–13). Finally, LS_score of medial meniscus exhibited a positive bidirectional causality with C_score after P-2. LS_score of tibial cartilage had a positive bidirectional causality with C_score, while LS_score of femoral cartilage showed a negative bidirectional causality only at P-2. Additionally, LR_score, LS_score of femur, and LT_score of femoral cartilage showed a positive bidirectional causality with C_score only at P0 (Table S14).
4. Discussion
In this multicenter observational study, we analyzed 2164 knee MRIs from 700 knees over a 4-year follow-up to develop a shallow neural network model that predicts KOA incident from knee MRIs. The final LBTC-RM combines radiomics from load-bearing structures of the knee with clinical variables, achieving AUCs exceeding 0.80 in the different test cohorts. LBTC-RM support significantly improved resident physicians' predictive performance, with relative increases in mean sensitivity, specificity, and accuracy ranging from 30 % to 64 %, 17 % to 33 %, and 23 % to 46 % in the test cohorts, respectively. Furthermore, the LBTC-RM output was associated with an increased risk of KOA incident (OR of 20.6), and the final model holds promise for assessing the risk level of pre-KOA participants.
To our knowledge, this is the first study to evaluate complete knee load-bearing tissue radiomics to predict KOA incident. While MOAKS semi-quantitative grading systems demonstrate predictive value for KOA incident, their complexity and reliance on different MRI techniques present challenges [40,41]. In the OAI cohort, an integrated MOAKS model combining bone marrow lesion, cartilage lesion, meniscal lesion, and clinical variables achieved an AUC of 0.84 in predicting KOA incident [40]. In our study, LBTC-RM outperformed LBTC-MM, achieving AUCs of up to 0.86, with relative AUC increases ranging from 24 % to 34 %. Moreover, LBT-RM and LBTC-RM surpassed single-structure radiomic models, with AUC increases ranging from 8 % to 51 %.
In individuals without KOA, baseline structural damage on MRI was associated with cartilage volume loss, even with KOA incident [42]. In the OAI study, high baseline MOAKS grades for femur and tibia bone marrow lesion (OR: 1.38) and meniscal extrusion (OR: 1.72) were associated with an increased risk of KOA incident [43]. In our study, predictive model outputs were associated with KOA incident. Integration of load-bearing tissue features and clinical variables increased the OR of KOA incident. The highest OR among the predictive models was 20.6 from LBTC-RM output, whereas LBTC-MM had an OR of 5.0.
Early detection, crucial for intervention, has been demonstrated in rheumatoid arthritis (RA) [44]. Although RA and KOA do not share identical early intervention strategies, early diagnosis has demonstrated broader value in musculoskeletal diseases overall. Defining a drug as a disease-modifying osteoarthritis drug (DMOAD) is challenging with minimal radiographic findings [15]. MRI, showing promise as an early knee OA biomarker, detects structural changes earlier than radiographs [45]. While MRI offers superior sensitivity in detecting early structural changes in KOA, its use is limited by high cost, limited accessibility, and the time required for imaging [3], which reduces its feasibility for population-wide screening or routine follow-up. However, MRI screening can provide substantial benefits for high-risk groups, such as individuals with a family history of KOA, those with early joint pain, and athletes involved in high-impact activities [46]. For these populations, MRI screening, combined with our predictive model, could facilitate earlier interventions and personalized treatment strategies, preventing severe structural damage [47]. Additionally, MRI screening may be particularly useful in cases where X-rays fail to reveal early joint changes [48]. However, radiologist-dependent MRI has limitations, including low sensitivity (0–48 %) for early cartilage lesion detection [49]. In our study, which assessed publicly available OAI MOAKS data scored by experienced radiologists, LBTC-RM (70–77 % sensitivity) changed sensitivity by −1 % to 10 % relative to LBTC-MM (68–74 % sensitivity) in the test cohorts.
Radiomics, originally applied in oncology, has now emerged as a powerful tool for diagnosing and predicting KOA [50]. In a previous study, radiomic features extracted from MRI scans of knee cartilage in a cohort of 148 patients (72 with KOA and 76 without) were analyzed using machine learning models. While the final model demonstrated strong diagnostic performance with an AUC of 0.983, the small sample size limited its generalizability to broader populations [11]. Our earlier work built upon this by integrating baseline MRI radiomic features from knee joint tissues, including cartilage and meniscus, with neural networks. This approach also achieved high accuracy in KOA incidence prediction, with an AUC of 0.931 [13]. However, the use of only baseline data restricted the model's ability to account for dynamic changes in tissue over time. To address this limitation, our current study incorporates longitudinal MRI radiomics data from load-bearing tissues, such as the femur, tibia, cartilage, and meniscus, over a four-year period. This longitudinal approach captures temporal changes in tissue health, leading to highly accurate and clinically relevant predictions of KOA progression. The model's AUC consistently surpassed 0.8 at multiple time points, highlighting its robustness and practical utility in early KOA diagnosis and progression monitoring.
Radiomics also holds promise as a virtual image-based biopsy tool for patients [51]. In our study, we devised a scoring system for load-bearing tissue radiomics and its subtypes, alongside OA symptom features, to quantify the risk level of participants with pre-radiographic KOA. The LRC_score, LR_score, and LS_score increased the ORs of radiographic KOA incident from four years prior, while the C_score increased the ORs of KOA incident from one year before KOA. Furthermore, the LS_score of the femur and tibial cartilage increased the ORs of KOA incident at four visits, whereas the LT_score of the medial meniscus decreased the ORs of KOA incident upon KOA confirmation. Lastly, the LS_score of the medial meniscus increased the ORs of C_score (mainly representing the risk level of OA symptoms) from two years prior to KOA.
The combination of deep learning for auto-segmentation and machine learning for predictive modeling leverages the strengths of both approaches. Deep learning automates precise segmentation of medical images, allowing for the accurate extraction of key structural features, while machine learning integrates clinical data, adding interpretability and improving predictive accuracy [52]. This hybrid approach enables more personalized disease progression models. Previous studies have focused solely on image-based features, without incorporating clinical data [53]. In contrast, our model combines MRI radiomic features with clinical variables, improving sensitivity, especially in the early detection of KOA. Additionally, longitudinal MRI data has proven effective in tracking changes in muscle composition and adiposity [54]. Similarly, our approach utilizes four-year longitudinal MRI data, extracting radiomic features from load-bearing tissues, further refining KOA predictions. By integrating clinical data alongside radiomic features, our model enhances predictive power, offering more comprehensive insights into disease progression than image-only models.
The prognostic outlook for KOA incident is challenging for clinicians, relying heavily on clinical reasoning, which can be influenced by practitioners' scientific knowledge and biases. To enhance decision-making and improve outcomes, a precise predictive model independent of clinicians' knowledge and tailored to patients' preferences is invaluable [55]. In the OAI cohort, LBT-MM showed sensitivity, specificity, and accuracy for predicting KOA incident of 48 %–63 %, 73 %–83 %, and 61 %–70 % in the test cohorts, respectively. Notably, our resident physicians' sensitivity, specificity, and accuracy ranged from 45 % to 54 %, 58 % to 66 %, and 52 % to 57 % in the test cohorts, respectively. With LBTC-RM support, however, these values improved to 70 %–74 %, 69 %–78 %, and 70 %–76 %, respectively.
The clinical relevance of LBTC-RM lies in its ability to tailor treatment strategies based on a patient's individual risk profile. By integrating MRI-based radiomic features with clinical variables, the model helps clinicians create personalized treatment plans that address the unique progression of the disease in each patient [56]. For instance, high-risk individuals can be enrolled in more intensive physical rehabilitation programs or prescribed early pharmacological treatments, such as non-steroidal anti-inflammatory drugs (NSAIDs) or intra-articular corticosteroid injections, which have been shown to alleviate pain and slow the progression of KOA [57]. Additionally, this personalized approach reduces the reliance on a “one-size-fits-all” strategy, ensuring that interventions are targeted and more effective.
This study has several potential limitations. Firstly, the MRI sequence used in our research is not a standard clinical MRI sequence. Future studies will incorporate multiple MRI sequences to enhance the accuracy and applicability of the model. Secondly, due to the lack of precise MRI segmentation of muscle, fat, ligament, and synovial tissue, radiomic features of these soft tissues were not integrated into our predictive model. Further research will focus on developing automatic segmentation techniques for these soft tissues to improve the model's comprehensiveness. Thirdly, this study used only MRI data. While MRI is suitable for high-risk or ambiguous cases, X-rays are more practical for general screening. Future work should explore hybrid models combining both modalities to balance diagnostic accuracy and cost-effectiveness. Fourthly, the 4-year follow-up period restricts the model's assessment of long-term KOA progression. Future research should collect data beyond 4 years to evaluate the model's predictive power over extended periods, providing insights into long-term disease progression and management strategies. Lastly, to establish the generalizability of our radiomic model for predicting KOA incident, it needs to be further tested and validated in independent populations. For external validation involving datasets from different scanners or protocols, variability in radiomic features remains a concern. In such cases, ComBat harmonization, a well-established technique for reducing batch effects across imaging datasets, can be applied to address this issue and improve feature reproducibility.
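To make the idea of batch-effect correction concrete, the sketch below aligns one radiomic feature's per-scanner mean and standard deviation to the pooled values. This is a deliberately simplified location-scale adjustment and not ComBat itself: it omits ComBat's empirical Bayes shrinkage and does not protect biological covariates such as case/control status. The function name, feature values, and scanner labels are hypothetical.

```python
from statistics import mean, pstdev


def location_scale_harmonize(values, batches):
    """Shift and rescale each batch of one feature to the pooled mean and SD.

    Simplified illustration only; not the full ComBat procedure.
    """
    pooled_mean = mean(values)
    pooled_sd = pstdev(values)
    harmonized = list(values)
    for batch in set(batches):
        idx = [i for i, b in enumerate(batches) if b == batch]
        batch_values = [values[i] for i in idx]
        batch_mean = mean(batch_values)
        batch_sd = pstdev(batch_values)
        for i in idx:
            # Standardize within the batch, then map onto the pooled scale.
            z = (values[i] - batch_mean) / batch_sd if batch_sd > 0 else 0.0
            harmonized[i] = pooled_mean + z * pooled_sd
    return harmonized


# Hypothetical feature values acquired on two scanners.
feature = [10.0, 12.0, 14.0, 20.0, 24.0, 28.0]
scanner = ["A", "A", "A", "B", "B", "B"]
harmonized = location_scale_harmonize(feature, scanner)
print([round(v, 2) for v in harmonized])
```

After adjustment both scanners share the same mean and spread for this feature. A real harmonization step would also need to keep genuine case-versus-control differences from being removed.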
We developed a machine learning model to predict KOA incident using radiomic features extracted from longitudinal knee MRIs of the femur, tibia, meniscus, and femorotibial cartilage. This model shows promise in assisting MRI readers in assessing risk level and prognosticating KOA incident, but its clinical utility needs to be further evaluated in future studies.
Funding
This study was supported by the Postdoctoral Fund of Hebei Medical University (Grant number: 322109) and the Postdoctoral Fund of Hebei Province (Grant number: B2023003035), the National Natural Science Foundation of China (Grant number: 82350003, 92049201), the key development projects of the Sichuan Provincial Science and Technology Plan (Grant No. 2024YFFK0298), the Primary Health Development Research Center of Sichuan Province Program (Grant number: SWFZ24-Z-11), the Chengdu Medical Research Project (Grant number: 2024169, 2024327, 2024487), and the Key R&D Project of Chengdu Science and Technology Bureau (Grant number: 2024-YF05-00119-SN).
Declaration of competing interest
None.
Acknowledgment
We would like to acknowledge the dedication and commitment of the OAI study participants. The OAI is a public-private partnership comprised of five contracts (N01-AR-2-2258; N01-AR-2-2259; N01-AR-2-2260; N01-AR-2-2261; N01-AR-2-2262) funded by the NIH and conducted by the OAI Study Investigators. Private funding partners include Merck Research Laboratories, Novartis Pharmaceuticals Corporation, GlaxoSmithKline, and Pfizer, Inc. Private sector funding for the OAI is managed by the Foundation for the NIH. This manuscript was prepared using an OAI public use data set (in addition to data obtained within NIH/NIAMS funded ancillary grants) and does not necessarily reflect the opinions or views of the OAI investigators, the NIH, or the private funding partners. We used ChatGPT for language editing in this study.
Footnotes
Supplementary data to this article can be found online at https://doi.org/10.1016/j.jot.2025.01.007.
Contributor Information
Ting Wang, Email: wangting1101smu@163.com.
Shengfa Li, Email: lisheng-fa@qq.com.
Yingze Zhang, Email: yzling_liu@163.com.
References
- 1. Yang G., Wang J., Liu Y., Lu H., He L., Ma C., et al. Burden of knee osteoarthritis in 204 countries and territories, 1990-2019: results from the global burden of disease study 2019. Arthritis Care Res. 2023 doi: 10.1002/acr.25158.
- 2. Wenham C.Y., Conaghan P.G. New horizons in osteoarthritis. Age Ageing. 2013;42(3):272–278. doi: 10.1093/ageing/aft043.
- 3. Hunter D.J., Bierma-Zeinstra S. Osteoarthritis. Lancet (London, England) 2019;393(10182):1745–1759. doi: 10.1016/S0140-6736(19)30417-9.
- 4. Hunter D.J., March L., Chew M. Osteoarthritis in 2020 and beyond: a Lancet commission. Lancet (London, England) 2020;396(10264):1711–1712. doi: 10.1016/S0140-6736(20)32230-3.
- 5. Wang L., Lu H., Chen H., Jin S., Wang M., Shang S. Development of a model for predicting the 4-year risk of symptomatic knee osteoarthritis in China: a longitudinal cohort study. Arthritis Res Ther. 2021;23(1):65. doi: 10.1186/s13075-021-02447-5.
- 6. Paz-González R., Balboa-Barreiro V., Lourido L., Calamia V., Fernandez-Puente P., Oreiro N., et al. Prognostic model to predict the incidence of radiographic knee osteoarthritis. Ann Rheum Dis. 2024;83(5):661–668. doi: 10.1136/ard-2023-225090.
- 7. Silverwood V., Blagojevic-Bucknall M., Jinks C., Jordan J.L., Protheroe J., Jordan K.P. Current evidence on risk factors for knee osteoarthritis in older adults: a systematic review and meta-analysis. Osteoarthritis Cartilage. 2015;23(4):507–515. doi: 10.1016/j.joca.2014.11.019.
- 8. Lo G.H., Strayhorn M.T., Driban J.B., Price L.L., Eaton C.B., McAlindon T.E. Subjective crepitus as a risk factor for incident symptomatic knee osteoarthritis: data from the osteoarthritis initiative. Arthritis Care Res. 2018;70(1):53–60. doi: 10.1002/acr.23246.
- 9. Yu K., Ying J., Zhao T., Lei L., Zhong L., Hu J., et al. Prediction model for knee osteoarthritis using magnetic resonance-based radiomic features from the infrapatellar fat pad: data from the osteoarthritis initiative. Quant Imag Med Surg. 2023;13(1):352–369. doi: 10.21037/qims-22-368.
- 10. Hirvasniemi J., Klein S., Bierma-Zeinstra S., Vernooij M.W., Schiphof D., Oei E.H.G. A machine learning approach to distinguish between knees without and with osteoarthritis using MRI-based radiomic features from tibial bone. Eur Radiol. 2021;31(11):8513–8521. doi: 10.1007/s00330-021-07951-5.
- 11. Cui T., Liu R., Jing Y., Fu J., Chen J. Development of machine learning models aiming at knee osteoarthritis diagnosing: an MRI radiomics analysis. J Orthop Surg Res. 2023;18(1):375. doi: 10.1186/s13018-023-03837-y.
- 12. Angelone F., Ciliberti F.K., Tobia G.P., Jónsson H., Ponsiglione A.M., Gislason M.K., et al. Innovative diagnostic approaches for predicting knee cartilage degeneration in osteoarthritis patients: a radiomics-based study. Inf Syst Front. 2024.
- 13. Li S., Cao P., Li J., Chen T., Luo P., Ruan G., et al. Integrating radiomics and neural networks for knee osteoarthritis incidence prediction. Arthritis Rheumatol. 2024;76(9):1377–1386. doi: 10.1002/art.42915.
- 14. Li J., Fu S., Gong Z., Zhu Z., Zeng D., Cao P., et al. MRI-Based texture analysis of infrapatellar fat pad to predict knee osteoarthritis incidence. Radiology. 2022;304(3):611–621. doi: 10.1148/radiol.212009.
- 15. Mahmoudian A., Lohmander L.S., Mobasheri A., Englund M., Luyten F.P. Early-stage symptomatic osteoarthritis of the knee - time for action. Nat Rev Rheumatol. 2021;17(10):621–632. doi: 10.1038/s41584-021-00673-4.
- 16. Urish K.L., Keffalas M.G., Durkin J.R., Miller D.J., Chu C.R., Mosher T.J. T2 texture index of cartilage can predict early symptomatic OA progression: data from the osteoarthritis initiative. Osteoarthritis Cartilage. 2013;21(10):1550–1557. doi: 10.1016/j.joca.2013.06.007.
- 17. Kundu S., Ashinsky B.G., Bouhrara M., Dam E.B., Demehri S., Shifat E.R.M., et al. Enabling early detection of osteoarthritis from presymptomatic cartilage texture maps via transport-based learning. Proc Natl Acad Sci U S A. 2020;117(40):24709–24719. doi: 10.1073/pnas.1917405117.
- 18. Clark G.M. Prognostic factors versus predictive factors: examples from a clinical trial of erlotinib. Mol Oncol. 2008;1(4):406–412. doi: 10.1016/j.molonc.2007.12.001.
- 19. Zhu S., Gilbert M., Chetty I., Siddiqui F. The 2021 landscape of FDA-approved artificial intelligence/machine learning-enabled medical devices: an analysis of the characteristics and intended use. Int J Med Inf. 2022;165 doi: 10.1016/j.ijmedinf.2022.104828.
- 20. Zwanenburg A., Vallières M., Abdalah M.A., Aerts H., Andrearczyk V., Apte A., et al. The image biomarker standardization initiative: standardized quantitative radiomics for high-throughput image-based phenotyping. Radiology. 2020;295(2):328–338. doi: 10.1148/radiol.2020191145.
- 21. Fournier L., Costaridou L., Bidaut L., Michoux N., Lecouvet F.E., de Geus-Oei L.F., et al. Incorporating radiomics into clinical trials: expert consensus endorsed by the European Society of Radiology on considerations for data-driven compared to biologically driven quantitative biomarkers. Eur Radiol. 2021;31(8):6001–6012. doi: 10.1007/s00330-020-07598-8.
- 22. McShane L.M., Cavenagh M.M., Lively T.G., Eberhard D.A., Bigbee W.L., Williams P.M., et al. Criteria for the use of omics-based predictors in clinical trials: explanation and elaboration. BMC Med. 2013;11:220. doi: 10.1186/1741-7015-11-220.
- 23. Huang E.P., O’Connor J.P.B., McShane L.M., Giger M.L., Lambin P., Kinahan P.E., et al. Criteria for the translation of radiomics into clinically useful tests. Nat Rev Clin Oncol. 2023;20(2):69–82. doi: 10.1038/s41571-022-00707-0.
- 24. Roemer F.W., Kwoh C.K., Hannon M.J., Hunter D.J., Eckstein F., Fujii T., et al. What comes first? Multitissue involvement leading to radiographic osteoarthritis: magnetic resonance imaging-based trajectory analysis over four years in the osteoarthritis initiative. Arthritis Rheumatol. 2015;67(8):2085–2096. doi: 10.1002/art.39176.
- 25. Eckstein F., Collins J.E., Nevitt M.C., Lynch J.A., Kraus V.B., Katz J.N., et al. Brief report: cartilage thickness change as an imaging biomarker of knee osteoarthritis progression: data from the foundation for the national institutes of health osteoarthritis biomarkers consortium. Arthritis Rheumatol. 2015;67(12):3184–3189. doi: 10.1002/art.39324.
- 26. Hunter D.J., Zhang Y.Q., Niu J.B., Tu X., Amin S., Clancy M., et al. The association of meniscal pathologic changes with cartilage loss in symptomatic knee osteoarthritis. Arthritis Rheum. 2006;54(3):795–801. doi: 10.1002/art.21724.
- 27. Peterfy C.G., Schneider E., Nevitt M. The osteoarthritis initiative: report on the design rationale for the magnetic resonance imaging protocol for the knee. Osteoarthritis Cartilage. 2008;16(12):1433–1441. doi: 10.1016/j.joca.2008.06.016.
- 28. Tack A., Ambellan F., Zachow S. Towards novel osteoarthritis biomarkers: multi-criteria evaluation of 46,996 segmented knee MRI data from the Osteoarthritis Initiative. PLoS One. 2021;16(10) doi: 10.1371/journal.pone.0258855.
- 29. Ambellan F., Tack A., Ehlke M., Zachow S. Automated segmentation of knee bone and cartilage combining statistical shape knowledge and convolutional neural networks: data from the Osteoarthritis Initiative. Med Image Anal. 2019;52:109–118. doi: 10.1016/j.media.2018.11.009.
- 30. Tack A., Mukhopadhyay A., Zachow S. Knee menisci segmentation using convolutional neural networks: data from the Osteoarthritis Initiative. Osteoarthritis Cartilage. 2018;26(5):680–688. doi: 10.1016/j.joca.2018.02.907.
- 31. Zhao B., Tan Y., Tsai W.Y., Qi J., Xie C., Lu L., et al. Reproducibility of radiomics for deciphering tumor phenotype with imaging. Sci Rep. 2016;6 doi: 10.1038/srep23428.
- 32. Eckstein F., Kwoh C.K., Link T.M. Imaging research results from the osteoarthritis initiative (OAI): a review and lessons learned 10 years after start of enrolment. Ann Rheum Dis. 2014;73(7):1289–1300. doi: 10.1136/annrheumdis-2014-205310.
- 33. Litjens G., Kooi T., Bejnordi B.E., Setio A.A.A., Ciompi F., Ghafoorian M., et al. A survey on deep learning in medical image analysis. Med Image Anal. 2017;42:60–88. doi: 10.1016/j.media.2017.07.005.
- 34. Shen D., Wu G., Suk H.I. Deep learning in medical image analysis. Annu Rev Biomed Eng. 2017;19:221–248. doi: 10.1146/annurev-bioeng-071516-044442.
- 35. Hastie T., Tibshirani R., Friedman J.H. The elements of statistical learning: data mining, inference, and prediction. vol. 2. Springer; 2009.
- 36. Fritz B., Yi P.H., Kijowski R., Fritz J. Radiomics and deep learning for disease detection in musculoskeletal radiology: an overview of novel MRI- and CT-based approaches. Invest Radiol. 2023;58(1):3–13. doi: 10.1097/RLI.0000000000000907.
- 37.Alhamzawi R., Ali H.T.M. The Bayesian adaptive lasso regression. Math Biosci. 2018;303:75–82. doi: 10.1016/j.mbs.2018.06.004. [DOI] [PubMed] [Google Scholar]
- 38.Lemay A., Hoebel K., Bridge C.P., Befano B., De Sanjosé S., Egemen D., et al. Improving the repeatability of deep learning models with Monte Carlo dropout. NPJ digital medicine. 2022;5(1):174. doi: 10.1038/s41746-022-00709-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 39.Ashrafinia S. The Johns Hopkins University; 2019. Quantitative nuclear medicine imaging using advanced image reconstruction and radiomics. [Google Scholar]
- 40.Sharma L., Hochberg M., Nevitt M., Guermazi A., Roemer F., Crema M.D., et al. Knee tissue lesions and prediction of incident knee osteoarthritis over 7 years in a cohort of persons at higher risk. Osteoarthritis Cartilage. 2017;25(7):1068–1075. doi: 10.1016/j.joca.2017.02.788. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 41.Eckstein F., Guermazi A., Gold G., Duryea J., Hellio Le Graverand M.P., Wirth W., et al. Imaging of cartilage and bone: promises and pitfalls in clinical trials of osteoarthritis. Osteoarthritis Cartilage. 2014;22(10):1516–1532. doi: 10.1016/j.joca.2014.06.023. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 42.Teichtahl A.J., Cicuttini F.M., Abram F., Wang Y., Pelletier J.P., Dodin P., et al. Meniscal extrusion and bone marrow lesions are associated with incident and progressive knee osteoarthritis. Osteoarthritis Cartilage. 2017;25(7):1076–1083. doi: 10.1016/j.joca.2017.02.792. [DOI] [PubMed] [Google Scholar]
- 43.Sharma L., Chmiel J.S., Almagor O., Dunlop D., Guermazi A., Bathon J.M., et al. Significance of preradiographic magnetic resonance imaging lesions in persons at increased risk of knee osteoarthritis. Arthritis Rheumatol. 2014;66(7):1811–1819. doi: 10.1002/art.38611. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 44.Burgers L.E., Raza K., van der Helm-van Mil A.H. Window of opportunity in rheumatoid arthritis - definitions and supporting evidence: from old to new perspectives. RMD Open. 2019;5(1) doi: 10.1136/rmdopen-2018-000870. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 45.Roemer F.W., Kwoh C.K., Hayashi D., Felson D.T., Guermazi A. The role of radiography and MRI for eligibility assessment in DMOAD trials of knee OA. Nat Rev Rheumatol. 2018;14(6):372–380. doi: 10.1038/s41584-018-0010-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 46.Pedoia V., Lee J., Norman B., Link T.M., Majumdar S. Diagnosing osteoarthritis from T(2) maps using deep learning: an analysis of the entire Osteoarthritis Initiative baseline cohort. Osteoarthritis Cartilage. 2019;27(7):1002–1010. doi: 10.1016/j.joca.2019.02.800. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47.Oei E.H.G., Hirvasniemi J., van Zadelhoff T.A., van der Heijden R.A. Osteoarthritis year in review 2021: imaging. Osteoarthritis Cartilage. 2022;30(2):226–236. doi: 10.1016/j.joca.2021.11.012. [DOI] [PubMed] [Google Scholar]
- 48.Guermazi A., Hayashi D., Roemer F.W., Felson D.T. Osteoarthritis: a review of strengths and weaknesses of different imaging options. Rheum Dis Clin N Am. 2013;39(3):567–591. doi: 10.1016/j.rdc.2013.02.001. [DOI] [PubMed] [Google Scholar]
- 49.Quatman C.E., Hettrich C.M., Schmitt L.C., Spindler K.P. The clinical utility and diagnostic performance of magnetic resonance imaging for identification of early and advanced knee osteoarthritis: a systematic review. Am J Sports Med. 2011;39(7):1557–1568. doi: 10.1177/0363546511407612. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 50.Jiang T., Lau S.H., Zhang J., Chan L.C., Wang W., Chan P.K., et al. Radiomics signature of osteoarthritis: current status and perspective. J Orthopaed. Translat. 2024;45:100–106. doi: 10.1016/j.jot.2023.10.003. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 51.Gill R.R. Virtual image-based biopsy of lung metastases: the promise of radiomics. Acad Radiol. 2023;30(1):47–48. doi: 10.1016/j.acra.2022.10.030. [DOI] [PubMed] [Google Scholar]
- 52.Hagiwara A., Fujita S., Kurokawa R., Andica C., Kamagata K., Aoki S. Multiparametric MRI: from simultaneous rapid acquisition methods and analysis techniques using scoring, machine learning, radiomics, and deep learning to the generation of novel metrics. Invest Radiol. 2023;58(8):548–560. doi: 10.1097/RLI.0000000000000962. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 53.Calivà F., Namiri N.K., Dubreuil M., Pedoia V., Ozhinsky E., Majumdar S. Studying osteoarthritis with artificial intelligence applied to magnetic resonance imaging. Nat Rev Rheumatol. 2022;18(2):112–121. doi: 10.1038/s41584-021-00719-7. [DOI] [PubMed] [Google Scholar]
- 54.Moradi K., Mohajer B., Guermazi A., Kwoh C.K., Bingham C.O., Mohammadi S., et al. Cachexia in preclinical rheumatoid arthritis: longitudinal observational study of thigh magnetic resonance imaging from osteoarthritis initiative cohort. J Cachexia Sarcopenia Muscle. 2024;15(5):1823–1833. doi: 10.1002/jcsm.13533. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 55.Bullock G.S., Hughes T., Sergeant J.C., Callaghan M.J., Riley R., Collins G. Methods matter: clinical prediction models will benefit sports medicine practice, but only if they are properly developed and validated. Br J Sports Med. 2021;55(23):1319–1321. doi: 10.1136/bjsports-2021-104329. [DOI] [PubMed] [Google Scholar]
- 56.Emery C.A., Whittaker J.L., Mahmoudian A., Lohmander L.S., Roos E.M., Bennell K.L., et al. Establishing outcome measures in early knee osteoarthritis. Nat Rev Rheumatol. 2019;15(7):438–448. doi: 10.1038/s41584-019-0237-3. [DOI] [PubMed] [Google Scholar]
- 57.Bannuru R.R., Osani M.C., Vaysbrot E.E., Arden N.K., Bennell K., Bierma-Zeinstra S.M.A., et al. OARSI guidelines for the non-surgical management of knee, hip, and polyarticular osteoarthritis. Osteoarthritis Cartilage. 2019;27(11):1578–1589. doi: 10.1016/j.joca.2019.06.011. [DOI] [PubMed] [Google Scholar]