Author manuscript; available in PMC: 2024 Nov 1.
Published in final edited form as: Curr Rheumatol Rep. 2023 Aug 10;25(11):213–225. doi: 10.1007/s11926-023-01114-9

Machine learning approaches to the prediction of osteoarthritis phenotypes and outcomes

Liubov Arbeeva 1, Mary C Minnig 2, Katherine A Yates 1,3, Amanda E Nelson 1,2,3
PMCID: PMC10592147  NIHMSID: NIHMS1923930  PMID: 37561315

Abstract

Purpose of review:

Osteoarthritis (OA) is a complex heterogeneous disease with no effective treatments. Artificial intelligence (AI) and its subfield machine learning (ML) can be applied to data from different sources to 1) assist clinicians and patients in decision making, based on machine-learned evidence and 2) improve our understanding of pathophysiology and mechanisms underlying OA, providing new insights into disease management and prevention. The purpose of this review is to improve the ability of clinicians and OA researchers to understand the strengths and limitations of AI/ML methods in applications to OA research.

Recent findings:

AI/ML can assist clinicians by predicting OA incidence and progression and by supporting tailored, personalized treatment. These methods make it possible to use multidimensional, multi-source data to understand the nature of OA, identify distinct OA phenotypes, and discover biomarkers.

Summary:

We described the recent implementations of AI/ML in OA research and highlighted potential future directions and associated challenges.

Keywords: osteoarthritis, machine learning, artificial intelligence, precision medicine

INTRODUCTION

Osteoarthritis (OA) is a heterogeneous and debilitating disease that resulted in $140 billion in medical costs and $164 billion in lost wages in 2013 in the United States alone (1). Current American College of Rheumatology (ACR) guidelines recommend treatment with physical, psychosocial, and mind-body approaches and strongly encourage regular, ongoing exercise and weight loss (2). Pharmacologic recommendations include pain control with non-steroidal anti-inflammatory drugs (NSAIDs), acetaminophen, duloxetine, and intraarticular joint injections (2). To date, there is no approved therapy that effectively stops the progression of OA (3), and the effect sizes of the recommended interventions for symptom reduction are small. The heterogeneity of the structural and clinical features of OA has complicated the development of novel therapies to halt or postpone progression. For this reason, there has been renewed focus on developing tools, specifically artificial intelligence (AI) and machine learning (ML) algorithms, to define subgroups of OA that correspond to patients who may respond better to targeted interventions or benefit from enrollment in clinical trials. In this paper, we discuss several aspects of AI/ML applications with examples, including phenotyping; prediction of OA occurrence, symptomatic and/or structural progression, and total joint replacement (TJR); biomarker discovery for progression; and finally, applications in the context of precision medicine. Our goal is to enhance the ability of clinicians and researchers to understand the strengths and limitations of these methods in application to OA research.

METHODS

A PubMed literature search was performed from 2/1/2020 to 2/1/2023 using terms including osteoarthritis, machine learning, phenotype/phenotyping, prediction, and precision medicine. As our focus was on prediction, phenotypes, and outcomes in OA, articles focused solely on OA diagnosis, image scoring (e.g., to determine semi-quantitative grades), histologic grading, genome-wide association studies (GWAS, in isolation), or surgical outcomes/satisfaction were excluded. Additionally, articles had to be available in English, include human subjects (at least in part), not be reviews or editorials, and not exclusively use traditional methods (e.g., latent class analysis or regression). The search yielded 23 manuscripts for initial full text review by two authors (LA and AEN). Following initial review, another three papers were excluded (one described surgical outcomes (4), and two automated image assessments (5, 6)), leaving 20 papers for the final literature synthesis.

RESULTS

Across the 20 manuscripts included in this literature review, we identified four main themes: phenotyping (4 articles); prediction of progression, comprising pain or structural progression (6 articles and 1 protocol paper), biomarker discovery (2 articles), and prediction of TJR (3 articles); prediction of incident disease or features (3 articles); and precision medicine (1 article).

1. Phenotypes

Four articles sought to identify subgroups, or phenotypes, of OA. Two of these used data from the Osteoarthritis Initiative (OAI) (7), a large publicly available dataset including individuals aged 45 and older with or at risk of knee OA. Although findings from the OAI are not necessarily generalizable to other populations, the availability of these data makes it a common choice for this type of research. Work from our group used data from more than 3000 individuals in the OAI, applying biclustering (simultaneously clustering knees and clinical features to account for their interaction) (8, 9) to identify subgroups of both knees and 86 baseline features. We found six such subgroups, two of which had poorer prognosis than the overall cohort, with higher frequency of TJR and more structural progression. Although this analysis was not externally validated, it identified subgroups of knees that differ in prognosis and that may respond to specific interventions based on baseline features (10). Demanse et al. also used data from the OAI, including 157 baseline variables, to apply two unsupervised methods (deep embedded clustering [DEC] and a combination of hierarchical clustering with multiple factor analysis [MFAC]) to identify phenotypes (11). Both methods identified a cluster with high body mass index (BMI), high comorbidity burden, and low physical activity; a younger group that was more physically active; and an older, more limited group. This study also lacked external validation and focused on baseline features in this specific cohort, and most subgroups had similar structural progression. The authors highlight the importance of utilizing more than one assessment for subjective features like pain (11). Additional papers, which will be discussed later in this review, have utilized longitudinal data and/or trajectory-based models for this purpose (10, 12).
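To make the idea of biclustering more concrete, the following minimal sketch computes the mean squared residue, the coherence score at the core of the Cheng and Church algorithm named in the Table. It is an illustration only: the function name and the toy matrix are hypothetical, no OAI data are used, and the search procedure that adds or removes rows and columns is omitted.

```python
def mean_squared_residue(matrix, rows, cols):
    """Cheng-Church mean squared residue of the submatrix matrix[rows][cols].

    A score of 0 means every selected row follows the same pattern across the
    selected columns, up to an additive shift (a perfectly coherent bicluster).
    """
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(sub), len(sub[0])
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_rows * n_cols)


# Hypothetical toy data: 4 knees (rows) by 4 standardized features (columns).
toy = [
    [1.0, 2.0, 3.0, 9.0],
    [2.0, 3.0, 4.0, 1.0],
    [3.0, 4.0, 5.0, 7.0],
    [8.0, 1.0, 6.0, 2.0],
]
coherent = mean_squared_residue(toy, rows=[0, 1, 2], cols=[0, 1, 2])
whole = mean_squared_residue(toy, rows=[0, 1, 2, 3], cols=[0, 1, 2, 3])
print(coherent, whole)
```

In this toy example the first three knees share one pattern across the first three features, so that submatrix scores lower than the full matrix. A biclustering search looks for such low-residue subsets of knees and features simultaneously.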

Several groups have utilized methods for multi-modal data, such as clinical features and molecular phenotyping methods on joint fluids and tissues, to identify potential phenotypes (13, 14). Steinberg et al. sought to assess how gene expression profiles in OA joint tissues relate to phenotypes by comparing low- and high-grade OA cartilage samples from weight-bearing joints (majority knees) from 113 patients undergoing TJR (15). Each participant contributed matched low- and high-grade cartilage samples, and most also provided synovial lining samples. Two clusters were identified in synovium, and two independent clusters in the low-grade cartilage samples. These clusters were characterized by differences in inflammation, extracellular matrix, and cell adhesion pathways, but with distinct molecular changes by affected tissue. Interestingly, no subgroups were identified in the high-grade cartilage, suggesting less molecular variation at this stage of disease (15). Trajerova et al. utilized a convenience sample of patients (n=119) with clinical and radiographic knee OA who were undergoing arthrocentesis for effusion and then received intra-articular steroid and/or NSAID therapy. Clinical outcome was defined as improved, unchanged, or worsened based on longitudinal self-report data at 3–6 months (12). Based on synovial fluid-derived immune cell patterns (percentages of lymphocytes, monocyte-macrophage lineage cells, and HLA-DR+ CD8+ T-lymphocytes), the authors identified and described four clusters, representing knee OA immune-phenotypes, which were further associated with three clinical outcome trajectories. This was a small exploratory study whose results warrant further investigation with respect to their utility for tailored treatment for OA based on molecular subtypes.

2. Prediction of progression

The prediction of OA is complex and may include prediction of further structural progression, prediction of future pain and/or function-related scores, and identification of individuals at higher risk of TJR. Therefore, OA prediction studies vary greatly in their specific outcomes, data types, study populations, length of follow-up, and statistical methodology. We identified seven articles that sought to predict pain or structural progression, two that focused on biomarker discovery, and three that aimed to predict TJR. Most studies used only baseline information, which included radiographs, MRI, and demographic and clinical features. Two studies used genetic and epigenetic data, and one study used ultrasound (US) data. In most papers, the researchers integrated multiple types of data, for example, imaging and clinical, or genetic and clinical.

a. Pain or structural progression

Widera et al. sought to identify rapid progressors for enrollment in randomized clinical trials (RCTs) to improve efficiency and outcomes from these expensive studies (16, 17). They tested 6 ML algorithms (Table) for prediction of four main progression categories (defined as none, pain, structural, and pain plus structural, based on reference (18)) utilizing Cohort Hip and Cohort Knee (CHECK; n=1002) and OAI (n=3465) data (clinical, X-ray, and TJR outcomes). The model-based selection was compared to conventional criteria for RCT inclusion (e.g., ACR clinical criteria, Kellgren-Lawrence Grade [KLG], and Western Ontario and McMaster Universities Arthritis Index [WOMAC]). They found that prediction of progression was enhanced using the algorithmic approach versus standard criteria (from ~20% up to ~45%), with the use of class probabilities showing better results than strict class labels. This approach needs to be externally validated and tested in a prospective RCT. The rigorous computational framework developed by the authors is a step forward in the development of more efficient screening tools that can lead to higher quality RCTs.
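The difference between selecting trial candidates by strict class labels and by class probabilities can be sketched as follows. This is a hedged illustration, not the authors' pipeline: the function names, the ranking by one minus the probability of non-progression, and all probabilities are hypothetical.

```python
CLASSES = ("none", "pain", "structural", "pain+structural")


def select_by_label(probs):
    """Keep candidates whose single most likely class is a progression class."""
    selected = []
    for pid, p in probs.items():
        best = max(CLASSES, key=lambda c: p[c])
        if best != "none":
            selected.append(pid)
    return selected


def select_by_probability(probs, n_slots):
    """Rank candidates by predicted probability of any progression and keep
    the top n_slots (assumed ranking criterion: 1 - P(none))."""
    ranked = sorted(probs, key=lambda pid: 1.0 - probs[pid]["none"], reverse=True)
    return ranked[:n_slots]


# Hypothetical predicted class probabilities for four screened candidates.
predicted = {
    "A": {"none": 0.40, "pain": 0.30, "structural": 0.20, "pain+structural": 0.10},
    "B": {"none": 0.30, "pain": 0.35, "structural": 0.20, "pain+structural": 0.15},
    "C": {"none": 0.80, "pain": 0.10, "structural": 0.05, "pain+structural": 0.05},
    "D": {"none": 0.10, "pain": 0.20, "structural": 0.30, "pain+structural": 0.40},
}
by_label = select_by_label(predicted)
by_probability = select_by_probability(predicted, n_slots=3)
print(by_label, by_probability)
```

Candidate A is labeled a non-progressor because "none" is the single most likely class, yet the summed probability of any progression is 0.60. A probability-based ranking retains this information, which a strict label discards.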

Table.

Summary of articles, goals, and approaches included in this review

Author (year) | Goal | Data source | Data type | Tools/approach | Validation | Drawbacks | Implications for future work
Nelson (2022) | Phenotype | OAI (n=3000) | OAI features (n=86) | Biclustering (Cheng and Church algorithm) | None | Lack of validation, not generalizable, small n, baseline features only | Could identify groups that may respond to targeted interventions
Demanse (2023) | Phenotype | OAI (n=4674) | OAI features (n=157) | Deep Embedded Clustering; Multiple Factor Analysis with Hierarchical clustering | None | Lack of validation, not generalizable, small n, baseline features only | Lack of clear differences in progression by phenotype, importance of using multiple assessments
Steinberg (2021) | Phenotype | Patients undergoing TJR (n=113) | RNA seq from cartilage and synovium, clinical characteristics from EHRs | Consensus clustering followed by differential gene expression analysis; Multi-Omics Factor Analysis (MOFA); Nearest shrunken centroid classification | External validation | One time sampling of synovium and cartilage at time of TJR | Could identify groups that may respond to targeted interventions
Trajerova (2022) | Phenotype | 119 patients from outpatient center | Immune cells from synovial fluid, longitudinal self-reported pain | Patient similarity network; Gephi tool used for visualization; FlowSOM clustering | Internal validation | One time evaluation of synovial fluid; follow-up was self-reported; no external validation | Supports theory that immune cells play a role in KOA; may suggest future therapeutic targets
Widera (2020) | Progression | CHECK (n=1002), OAI (n=3465) | Clinical, X-ray, JSW, TJR data | 6 ML algorithms (LR, multinomial LR, kNN, SVC with linear kernel, SVC with RBF kernel, RF) | Internal cross validation | Includes a lot of non-progressors, unclear how to apply to rolling enrollment | Model-based selection outperforms the conventional inclusion criteria, reducing by 20–25% the number of patients who show no progression; improve RCT efficiency
Bonakdari (2022) (Biomed) | Progression | OAI (n=3395) | Baseline MRI, age, BMI; X-ray | RF, M5Rules, M5P, MLP, ANFIS | External validation | Variable selection needs to be further validated; baseline data only | Reliable and generalizable ML model predicts global and regional cartilage loss based on bone curvature at one year
Schiratti (2021) | Progression | OAI (n=3268) | MRI, WOMAC | Multiple Instance Learning with gated attention mechanism | Used n=300 images for human benchmark test | OAI data only, without external validation on independent cohort | Improves interpretability by highlighting relevant regions in MRI; can support RCTs at screening phase
Guan (2020) | Progression | OAI (n=2300) | X-rays and demographic | Traditional ML models (RF, LR, and artificial neural networks), DL (two deep CNNs connected in a cascaded fashion), and combined | Internal validation | OAI data only, without external validation on independent cohort | Fully automated models were developed; easy to implement in future studies, code is available
Guan (2022) | Progression | OAI (6567 knees) | Demographic, clinical, KLG and images | Traditional models (RF, LR, artificial neural network); DL models (CNN); combination | Internal validation | No external validation, did not take into account specific treatment regimes that can affect severity of pain | Feasibility of DL for pain prediction; potentially useful in clinical practice; could identify those at risk for rapid progression
Yoo (2022) | Progression | Clinical data warehouse: 2151 knees to predict progression rate, 2492 knees to predict TJR | Clinical, KLG | LR, RF, XGB | Internal cross validation | Retrospective, limited SS which can induce overfitting, no external validation | Data easy to obtain, cost-effective; approach can use the data from other sources
Dunn (2023) | Progression biomarker discovery | OAI-FNIH (n=554); JoCoOA (n=128); OAI (n=56) | Epigenetic biomarkers and clinical | Elastic net/regression | Internal (70/30); 2 independent external | Mismatched validation cohorts, did not consider MJOA, unclear mechanism | Small number of methylation features provided time-stable biomarkers of OA progression
Bonakdari (2022) BMC | Progression biomarker discovery | OAI (n=901 Whites) | SNPs and mtDNA haplogroups | ML algorithms (SVM, kNN, RF, DT, ELM, self-adaptive ELM, individual and hybrid) | Rigorous internal and external | Only Caucasians, some variants were rare | Can be used early in OA process and guide clinicians to improve long term outcomes
Tiulpin (2022) | Predict TJR | Cohort in South-Eastern Norway (n=557) | Ultrasound and x-rays; clinical | LR with regularization (Ridge penalty) with 5 different multivariate models | Nested cross-validation | Self-reported OA (not necessarily in the knee); lack of diversity, no reliability assessment for US or KLG, few progressors, no pain assessments | US was comparable to x-rays in predicting outcome of TJR; US provided added value to x-ray when used in combination
Jamshidi (2021) | Predict TJR | OAI | Clinical, demographic; MRI | CoxPH model; DeepSurv; Linear/neural Multitask LR; Neural linear multi-task logistic LR; Random survival forest model; Linear/kernel SVM | Internal validation | No external validation; MRIs are not routinely obtained in clinical evaluation of OA | Model predicts with high accuracy need and timing of TJR
Leung (2020) | Predict TJR | OAI cohort (n=728): matching TJR patients; controls (did not undergo a TJR) | X-ray | Single-task and multitask learning DL models (ResNet34) with and without transfer learning | Nested cross-validation | Binary outcome where time-to-event is preferable; small sample size; baseline X-rays only; OA data (majority in subsample were White), not generalizable; did not identify the parameters extracted from radiographs | TJR outcome and KL grade predictions can be obtained from pretrained models or by training new models with nested cross-validation
Hirvasniemi (2022) | Predict incident OA | PROOF RCT (n=242 women) | X-ray, MRI, clinical | XGB classifier, LR, MLP, Gradient boosting machine, ResNet18, Gaussian Naïve Bayesian, LDA, Ensemble classifiers | Training set (30 knees) and test set (423 knees) were released to all groups; external validation is "N/A" due to challenge design | Obese women only (might not work for other age groups and men); RCT data; limited data to train models; correlation between 2 knees from same person not addressed (potentially introduced bias) | Benchmark for predicting incident symptomatic radiographic knee OA
Pierson (2021) | Estimate severity of pain | OAI (n=4172) | X-ray and KOOS | CNN | Training (25049 radiographs from 2877 people); independent validation (11320 radiographs from 1295 people) | No external validation; unable to evaluate which features of the x-rays the algorithm is using; variance in pain accounted for remains low | Measuring disease severity with ALG-P rather than KLG doubled potential eligibility for TJR for Black patients; importance of a diverse dataset
Joseph (2022) | Predict incident OA | OAI (n=1,044) | MRI and clinical | Three XGB models (decision-tree based): with and without imaging | Internal cross-validation | No external testing, imbalanced case and control groups | Smaller, clinically feasible model had slightly lower AUC than large model (but with improved feasibility)
Jiang (2021) | Precision medicine | IDEA trial (n=399) | Clinical, X-ray, IL-6 | 24 ML models in 5 categories (penalized linear regression, DT, dynamic treatment regime, SVM, and Bayesian) and compared to zero-order models (fixed treatment assignment) | None | Missing outcome and covariate data | Identify subgroups that may have an improved response to specific interventions in the spirit of precision medicine

Abbreviations: ANFIS, adaptive neuro-fuzzy inference system; BMI, Body Mass Index; CHECK, Cohort Hip and Cohort Knee; CNN, Convolutional Neural Network; CoxPH, Cox Proportional Hazards Model; DL, Deep Learning; DeepSurv, Deep feed-forward Neural Network; DT, Decision Tree; EHR, Electronic Health Record; ELM, Extreme Learning Machine; FNIH, Foundation for the NIH Osteoarthritis Biomarkers Consortium; IDEA, Intensive Diet and Exercise for Arthritis; IL-6, Interleukin-6; JoCoOA, Johnston County Osteoarthritis Project; JSW, Joint Space Width; KLG, Kellgren-Lawrence Grade; kNN, k-nearest neighbors; KOA, Knee Osteoarthritis; KOOS, Knee Injury and Osteoarthritis Outcome Score; LDA, Linear Discriminant Analysis; LR, Logistic Regression; mtDNA, mitochondrial deoxyribonucleic acid; M5P, tree-based piecewise linear model; ML, Machine Learning; MLP, multilayer perceptron; MRI, Magnetic resonance imaging; OAI, Osteoarthritis Initiative; PROOF, Prevention of Knee Osteoarthritis in Overweight Females; RBF, Radial Basis Function Kernel; RCT, Randomized Clinical Trial; ResNet(X), convolutional neural network that is (x) layers deep; RF, Random Forest; SNP, Single Nucleotide Polymorphism; SVC, Support Vector Classifier; SVM, Support Vector Machines; TJR, Total Joint Replacement; US, Ultrasound data; WOMAC, Western Ontario and McMaster Universities Arthritis Index; XGB, eXtreme Gradient Boosting

The inclusion of patients with advanced OA characterized by severe cartilage loss, in whom it is difficult to stop or slow degenerative progression, represents an additional challenge in RCT design because it affects statistical power. To address this, Bonakdari and colleagues (19) used OAI data (n=3395) to identify bone curvature features that could predict cartilage volume loss over one year. The features were obtained from MRI via an automated system (20) that defines mean bone surface curvature for eight regions (lateral and medial trochlea, lateral and medial central condyle, lateral and medial posterior condyle, and lateral and medial tibial plateau). Results identified the medial condyle as the most frequent region of cartilage volume loss. A gender-based model using five bone curvature regions at baseline to predict global and regional cartilage loss at 12 months provided R ≥ 0.79 for the OAI testing cohort and R ≥ 0.78 for the validation cohort. Results from this study are strengthened by the use of an external validation cohort (21). While an adaptive neuro-fuzzy inference system was found to be the best ML algorithm (of five algorithms tested, Table), its real-world use may be limited by the need for ten input variables, which increases its computational expense.

Schiratti and colleagues were the first to use a weakly supervised learning technique called multiple instance learning (22) to predict OA progression over a year, using knee MRI (23). The model was developed using 9280 MRIs from 3268 OAI participants. First, the gated attention mechanism was used to “understand” which information from all images was most important. This information was quantified using attention scores computed for each slice of the input image. These scores, along with clinical variables, were used in the second classification sub-model, which aggregated this information into progression probabilities, although classification performance was still modest (23). To improve interpretability of results, the authors used GradCam heatmaps (24) to highlight the most relevant regions on knee MRI identified by the models.
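The slice-level attention scores described above can be illustrated with a minimal sketch of gated attention pooling in its generic multiple instance learning form: each slice embedding receives a score from a tanh branch multiplied elementwise by a sigmoid gate, and a softmax over slices turns the scores into weights. The sketch is not the authors' network. The function names, the two-dimensional embeddings, and the parameter values are hypothetical, and in practice the parameters V, U, and w are learned during training.

```python
import math


def matvec(m, v):
    """Multiply matrix m (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def gated_attention_pool(slices, V, U, w):
    """Return per-slice attention weights and the attention-weighted embedding."""
    scores = []
    for h in slices:
        tanh_part = [math.tanh(x) for x in matvec(V, h)]
        gate_part = [1.0 / (1.0 + math.exp(-x)) for x in matvec(U, h)]
        scores.append(sum(wi * t * g for wi, t, g in zip(w, tanh_part, gate_part)))
    # Softmax over slices, shifted by the maximum score for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    attention = [e / total for e in exps]
    dim = len(slices[0])
    pooled = [sum(a * h[d] for a, h in zip(attention, slices)) for d in range(dim)]
    return attention, pooled


# Hypothetical embeddings for three MRI slices, with fixed (untrained) parameters.
slice_embeddings = [[0.2, -0.1], [1.5, 0.7], [-0.3, 0.4]]
V = [[0.5, -0.2], [0.1, 0.8]]
U = [[0.3, 0.3], [-0.4, 0.6]]
w = [1.0, -0.5]
attention, pooled = gated_attention_pool(slice_embeddings, V, U, w)
print(attention, pooled)
```

The weights sum to one across slices, so they can be read as the relative contribution of each slice to the pooled representation that a downstream classifier would receive.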

Guan et al. compared diagnostic performance for predicting progression of radiographic joint space loss (defined as a 0.7 mm decrease in medial joint space width at 48 months) among traditional ML models (e.g., RF, logistic regression, and artificial neural networks) and a deep learning (DL) model in 4447 knees from 2300 OAI participants (25). In the DL model, two deep CNNs were connected in a cascaded fashion (the first to narrow down the range of information from images, the second to determine the likelihood of progression from cropped images), creating an efficient fully automated process. The models automatically learned a representative subset of imaging features associated with OA progression, which differs from (and is potentially advantageous over) obtaining quantitative parameters based on a priori knowledge. In addition, combined traditional and DL models were developed and demonstrated improved performance. This study was limited by the lack of healthy knees included in model training and testing phases, but these promising results indicate that a joint training approach (e.g., combining demographic and radiographic risk factors with DL analysis of baseline knee radiographs) may be useful. This group used a similar approach to predict pain progression (a change in WOMAC score of at least 9 points) over 48 months, also utilizing OAI data (2097 knees without and 2103 knees with pain progression for training; 150/150 knees from the same cohort for validation) (26). Again, the combined model was found to have the best diagnostic performance (Area Under Curve [AUC]=0.81) for predicting pain progression (compared to <0.7 for traditional models); AUCs were significantly higher for knees with KLG 2–4 than for those with KLG 0–1, although the analysis did not take into account individual treatment that could affect pain severity. If externally validated, such fully automated models may be useful in future RCTs and clinical practice.
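Because AUC is the performance measure reported by most studies in this review, a minimal sketch of its rank-based interpretation may be helpful: the AUC is the probability that a randomly chosen progressor receives a higher predicted score than a randomly chosen non-progressor, with ties counted as one half. The function name, scores, and labels below are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve computed from all progressor/non-progressor pairs."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted progression scores for six knees (1 = progressed).
scores = [0.90, 0.80, 0.35, 0.60, 0.20, 0.10]
labels = [1, 1, 1, 0, 0, 0]
example_auc = auc(scores, labels)
print(round(example_auc, 3))
```

Here eight of the nine progressor/non-progressor pairs are ordered correctly, giving an AUC of about 0.89. An AUC of 0.5 corresponds to chance-level ranking.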

Most of the studies we have discussed so far have used data from the OAI. Data from OA cohorts, including the OAI, are usually accurately labelled, allowing sophisticated ML algorithms to be used to assess disease progression. However, as discussed in our previous review (27), the use of retrospective studies to develop prediction algorithms makes it difficult to implement them in real-world data. In addition, as already noted, the results from OA cohort studies cannot be fully generalized. Clinical data warehouses (CDW) are another source of data (usually from electronic health records [EHR]) to train prediction models. Balancing model complexity and performance, as well as the cost-benefit of identified predictors, presents an additional challenge and area for research. Yoo et al. sought to compare prediction of OA progression in a CDW dataset using logistic regression (LR) and two ML algorithms (Random Forest and Extreme gradient boost [XGB]) (28). Prediction variables included demographics, occupational factors, comorbidities, surgeries, and KLG. Knees with an initial KLG of 0–3 were eligible for inclusion, and rapid OA progression was defined as an increase of two KL grades within 7 years (n=2151 knees). A binary surgery outcome was defined as surgical versus non-surgical intervention (n=2492 knees). ML algorithms, specifically XGB with all variables used, showed better performance than LR. Age, BMI, and bone mineral density were found to be significant contributors to the prediction, although the AUC for progression was still low (<0.7). The results were not externally validated and are not generalizable to other populations or other CDW data.
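An outcome definition of this kind can be turned into a labeling rule for EHR-style visit records, as in the minimal sketch below. It rests on stated assumptions: the increase is read as at least two KL grades relative to baseline, the visit format and function name are hypothetical, and details such as missing grades or knees that undergo surgery are ignored.

```python
def is_rapid_progressor(visits, min_increase=2, window_years=7.0):
    """Label one knee from (years_since_baseline, KLG) pairs sorted by time.

    Assumption: rapid progression means KLG rose by at least `min_increase`
    grades at any visit within `window_years` of the baseline visit.
    """
    baseline_time, baseline_klg = visits[0]
    if not 0 <= baseline_klg <= 3:
        raise ValueError("knee not eligible: baseline KLG must be 0-3")
    for time, klg in visits[1:]:
        within_window = (time - baseline_time) <= window_years
        if within_window and (klg - baseline_klg) >= min_increase:
            return True
    return False


# Hypothetical knees: each list holds (years since baseline, KLG) pairs.
fast = is_rapid_progressor([(0.0, 1), (3.0, 2), (6.0, 3)])
late = is_rapid_progressor([(0.0, 1), (8.0, 3)])
stable = is_rapid_progressor([(0.0, 2), (5.0, 2)])
print(fast, late, stable)
```

The second knee gains two grades only after the 7-year window, so under this reading it is not counted as a rapid progressor.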

b. Biomarker discovery for progression

Two papers used AI/ML methods to identify novel biomarkers of OA progression. First, we contributed to an analysis of progression (defined as pain-only, radiographic-only, or both versus none) using data from three independent cohorts (29): the subset of the OAI participating in the Foundation for the NIH Osteoarthritis Biomarkers Consortium (n=554) in a 70/30 split for discovery, and for validation, participants from the Johnston County OA Project (n=128) and an independent sample from the OAI (n=56). Elastic net regularized generalized LR models were employed to relate differences in DNA methylation patterns to the different progression outcomes. Clinical features alone performed poorly for discrimination (AUC 0.54–0.68), while models using methylation data performed robustly with or without clinical data, particularly for discriminating pain-only and pain plus radiographic progressors from non-progressors. A parsimonious set of 13 CpG sites identified in initial model development had similar discriminative capability both in the original test set and in two independent test sets (radiographic-only vs non-progressors, AUC 0.88–0.89). Although the progression definitions varied somewhat among the three cohorts, and potential multiple joint OA was not considered, the study employed both internal and external validation and utilized clear and clinically relevant progression definitions. A relatively small number of epigenetic modifications (13 CpG features) may provide time-stable (not requiring time integrated concentrations) biomarkers of OA progression.
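For readers less familiar with elastic net regularization, its general form for a binary outcome (e.g., progressor versus non-progressor) is shown below. This is the standard textbook objective, given as a sketch rather than the exact specification used in the study; the symbols (y_i for the outcome, x_i for the vector of methylation and clinical features, λ for the overall penalty strength, and α for the mix between the lasso and ridge penalties) are notation introduced here.

```latex
\hat{\beta}_0, \hat{\beta}
  = \arg\min_{\beta_0,\,\beta}
    \left\{
      -\frac{1}{n} \sum_{i=1}^{n}
        \Big[ y_i \,(\beta_0 + x_i^{\top}\beta)
              - \log\!\big(1 + e^{\beta_0 + x_i^{\top}\beta}\big) \Big]
      + \lambda \Big[ \alpha \,\lVert \beta \rVert_1
              + \frac{1-\alpha}{2} \,\lVert \beta \rVert_2^2 \Big]
    \right\}
```

The lasso term (weighted by α) sets many coefficients exactly to zero, which is how a penalized model can reduce a large number of candidate CpG sites to a small set, while the ridge term stabilizes the estimates when features are correlated.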

Bonakdari et al. aimed to determine whether genetic factors, including a set of 8 single nucleotide polymorphisms previously reported in robust GWAS and mitochondrial DNA haplogroups, in combination with age and BMI (major OA risk factors), could predict structural knee OA progression (30). Data from 901 OAI participants (276 structural progressors and 625 non-progressors) were used for model development and internal validation experiments. Importantly, the results were externally validated using data from the community-based Tasmanian Older Adult Cohort Study (31). The authors applied several ML algorithms, both individual and hybrid (Table), and also evaluated more parsimonious models. The two best models had excellent internal accuracy (>95%) and good validation accuracy (>85%) in the external independent cohort. Other genetic variants and other omics data can be used in future studies with expansion to more diverse populations.

c. Prediction of joint replacement

Multiple research groups have developed predictive models for the progression of OA to TJR, including Yoo et al., discussed above (28). Two groups recently approached this question using data from the OAI (32, 33). Leung et al. evaluated the performance of a multitask DL model in predicting TJR outcomes and KLG (32). A transfer learning model was found to have the best performance and was noted to have focused on regions of the X-ray near the joint space, but more detailed parameters could not be extracted. In contrast to the work by Yoo and Leung, which used TJR as a binary outcome, Jamshidi et al. evaluated the time to TJR as a continuous outcome (33). Using a regression shrinkage method, the authors identified the 10 most important characteristics among more than one thousand, which were further assessed for their prognostic power in seven ML algorithms (Table). The most accurate predictive model identified the presence of bone marrow lesions on MRI, KLG, and knee symptoms as the main contributors to the progression of OA to TJR. Tiulpin et al. investigated the utility of US in predicting OA progression to TJR (34). Using nested leave-one-out cross-validation in a population-based prospective cohort in Southeastern Norway with self-reported knee, hip, and/or hand OA, this group found that US, when combined with radiographic data, provides additional value in predicting progression to TJR (although this outcome was rare in the cohort).
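The structure of nested leave-one-out cross-validation can be sketched as follows: the inner loop chooses the penalty using only the outer training data, and the outer loop estimates performance on the single held-out observation. To stay self-contained, the sketch replaces ridge-penalized logistic regression with a deliberately simple stand-in, a ridge-shrunken mean (ridge regression on a constant feature). All names and data are hypothetical.

```python
def fit_shrunken_mean(y_train, penalty):
    """Ridge regression on a constant feature: the mean shrunk toward zero."""
    return sum(y_train) / (len(y_train) + penalty)


def loo_error(y, penalty):
    """Leave-one-out mean squared error for a given penalty."""
    errors = []
    for i in range(len(y)):
        train = y[:i] + y[i + 1:]
        prediction = fit_shrunken_mean(train, penalty)
        errors.append((y[i] - prediction) ** 2)
    return sum(errors) / len(errors)


def nested_loo(y, penalties):
    """Outer loop estimates error; inner loop selects the penalty."""
    outer_errors, chosen = [], []
    for i in range(len(y)):
        outer_train = y[:i] + y[i + 1:]
        # Penalty selection never sees the held-out observation y[i].
        best = min(penalties, key=lambda lam: loo_error(outer_train, lam))
        prediction = fit_shrunken_mean(outer_train, best)
        outer_errors.append((y[i] - prediction) ** 2)
        chosen.append(best)
    return sum(outer_errors) / len(outer_errors), chosen


# Hypothetical outcome values for six participants.
outcomes = [1.0, 0.0, 1.0, 1.0, 0.0, 1.0]
penalty_grid = [0.0, 0.5, 2.0]
nested_error, chosen_penalties = nested_loo(outcomes, penalty_grid)
print(nested_error, chosen_penalties)
```

Keeping penalty selection inside the outer training set is what prevents the tuning step from leaking information about the held-out observation into the performance estimate.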

3. Prediction of incident OA or OA features

The molecular and microstructural changes of OA start years before the onset of clinical symptoms (e.g., pain and decreased joint function) and radiographic change, and therefore before OA diagnosis, leaving ample opportunities for early prediction and treatment. A common problem in prediction is the difficulty in comparing algorithms across studies due to different populations, sample characteristics, and methodologies. There is a need for validation and benchmarking data sets and competitions to evaluate the performance of different models on the same data. Hirvasniemi et al. (35) organized the first such challenge on the prediction of knee OA incidence, to objectively compare different methods for the prediction of incident symptomatic radiographic knee OA (by combined ACR criteria) within 78 months. The data were from a subset of RCT participants (242 overweight/obese women without symptomatic radiographic knee OA at baseline, with baseline X-ray and MRI images of 453 knees and follow-up data at 2.5 to 6.5 years (36)). Seven groups used the same data but applied different models (Table) for multiple submissions (n=23 overall). Importantly, the ground truth outcomes, i.e., which knees developed incident symptomatic radiographic knee OA, were not provided to the participants. AUCs and balanced accuracy (BACC) were used to assess model performance and compare submissions from different groups. Most of the submissions used DL for extracting information from the images. The results (AUC 0.50–0.64; BACC 0.48–0.59) suggest that DL models pre-trained on a related task and an ensemble of diverse models could achieve higher (although still modest) performance for predicting incident knee OA. Although it is unclear how the submitted models might generalize to other age groups and to men, this challenge established a benchmark for predicting incident symptomatic radiographic knee OA and outlined limitations that can potentially be addressed in the future to solve this complicated problem.
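Balanced accuracy is the mean of sensitivity and specificity, which makes it more informative than plain accuracy when incident OA is rare. The minimal sketch below illustrates this with hypothetical labels and predictions; the function names are not taken from the challenge.

```python
def balanced_accuracy(labels, predictions):
    """Mean of sensitivity and specificity for binary labels (1 = incident OA)."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2.0


def accuracy(labels, predictions):
    """Fraction of knees classified correctly."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


# Hypothetical example: 2 of 10 knees develop OA; the model predicts "no OA" for all.
truth = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_negative = [0] * 10
plain = accuracy(truth, always_negative)
balanced = balanced_accuracy(truth, always_negative)
print(plain, balanced)
```

A model that never predicts incident OA reaches 80% accuracy in this example but only 0.5 balanced accuracy, the chance level. This is a useful reference point when reading BACC values in the 0.48–0.59 range.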

OA is well known to disproportionately affect underserved populations, with people of color experiencing higher levels of pain than their White counterparts even with the same structural severity (37, 38). Pierson et al. used a CNN to derive an OA severity measure, called algorithmic pain prediction (ALG-P), that predicts pain scores from knee X-ray data in 4,172 OAI participants (39). This study was unique in that it was designed specifically to quantify and address health disparities. ALG-P was used as a predictor in a linear model, with the Knee Injury and Osteoarthritis Outcome Score as the dependent variable, adjusted for race and socioeconomic status. The coefficient for this adjustment, which statistically represents the gap in pain between groups when adjusting for severity, was used as an OA disparity measure. The authors confirmed that controlling for KLG did not fully account for the higher pain levels experienced by Black patients. ALG-P better accounted for the observed disparities, and assessing severity via ALG-P, rather than KLG, nearly doubled the potential eligibility for TJR for Black patients. These results are in line with known disparities in that non-White patients experience longer time to TJR, and undergo fewer TJR, than White patients (33, 40, 41). The findings need to be validated in independent and more diverse populations, and improved methods to understand the underlying imaging features are needed.
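The disparity measure described above can be written schematically as a coefficient in a linear model. The notation below is introduced here for illustration and is a simplified sketch rather than the exact specification used by the authors: S_i is the severity measure for knee i (KLG or ALG-P) and G_i is an indicator of group membership (for example, race or a socioeconomic category).

```latex
\mathrm{KOOS}_i = \beta_0 + \beta_S \, S_i + \beta_G \, G_i + \varepsilon_i
```

Under this sketch, the coefficient β_G is the pain gap between groups that remains after adjusting for severity. If β_G is closer to zero when S_i is ALG-P than when S_i is KLG, then ALG-P accounts for more of the observed disparity.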

Joseph et al. used an XGB ensemble learning algorithm to predict incident radiographic OA over 8 years of follow-up (42). OAI data were retrospectively analyzed, with n=1044 individuals ultimately included in this study. Incident radiographic OA was defined as right knees that developed KLG of 2–4 over 8 years (n=183); right knees that remained at KLG 0–1 over 8 years (n=861) were not considered to have developed OA. The authors sought to balance feasibility (e.g., using readily available clinical variables) and model performance. Based on existing knowledge, the authors selected 112 clinically relevant variables, including MRI, demographics, muscle strength, and symptoms. All predictors were used in the initial model, the top 10 features in a second reduced model, and all but the imaging predictors (the most expensive to obtain) from the reduced model were used in model 3. The authors found that the inclusion of both T2 relaxation times and Whole-Organ Magnetic Resonance Imaging Score (WORMS) values improved prediction performance (AUC=0.77) compared to models without such imaging predictors (AUC=0.67), indicating the importance of using imaging biomarkers. While the initial model with 112 predictors had a slightly higher AUC (0.79) than the final, third model, the authors suggest that the ease of use of the 10-predictor model makes it a more useful option in the clinical setting. Although this study lacked an external testing group and cases and controls were not matched, it developed a clinically feasible ML model that can predict incident radiographic OA with reasonable accuracy.

4. Precision Medicine

Few studies to date have explored precision medicine in OA. While there is a lack of effective pharmacologic interventions for OA, there are beneficial non-pharmacologic interventions, such as weight loss and exercise, that have been studied in numerous RCTs. Importantly, not all participants benefit from such interventions, and an understanding of predictors of response and non-response is needed. Our group applied precision medicine approaches to data from the Intensive Diet and Exercise in Arthritis RCT (43), which randomized people with excess body weight and symptomatic knee OA to either exercise alone, diet alone, or the combination of exercise plus diet. In the 399 participants with data for this analysis, 24 ML models (in 5 categories: penalized linear regression, decision trees, dynamic treatment regime, Support Vector Machines, and Bayesian) were compared to zero-order models (fixed treatment assignment) using estimated value functions for multiple outcomes (44). Precision medicine models outperformed fixed treatment assignment, particularly for the outcomes of weight loss and Interleukin-6, where optimal subgroups were identified for specific treatment assignments. Although limited by missing data and a lack of external validation, this work shows that aspects of heterogeneity in treatment response can be identified from baseline data and thus optimal therapeutic strategies can be targeted to the patient groups most likely to benefit.
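
The comparison between a tailored treatment rule and fixed ("zero-order") assignment can be sketched with a simple value estimator. The example below is a minimal illustration under stated assumptions: it uses a normalized inverse-probability-weighted estimator, a common choice for randomized data, with a known randomization probability; the records, the `high_bmi` feature, and the rule names are hypothetical, and the estimators actually used in (44) are not reproduced here.

```python
def estimated_value(records, rule, propensity):
    """Normalized inverse-probability-weighted estimate of the mean
    outcome if everyone were treated according to `rule`.

    records    : list of (features, treatment_received, outcome)
    rule       : function mapping features -> recommended treatment
    propensity : function (features, treatment) -> P(treatment | features),
                 known by design in a randomized trial
    """
    num = 0.0
    den = 0.0
    for features, treatment, outcome in records:
        if rule(features) == treatment:  # participant followed the rule
            w = 1.0 / propensity(features, treatment)
            num += w * outcome
            den += w
    if den == 0.0:
        raise ValueError("no participant received the recommended treatment")
    return num / den


# Hypothetical two-arm trial with 1:1 randomization; higher outcome = better.
records = [
    ({"high_bmi": True}, "diet", 8.0),
    ({"high_bmi": True}, "diet", 6.0),
    ({"high_bmi": True}, "exercise", 2.0),
    ({"high_bmi": True}, "exercise", 4.0),
    ({"high_bmi": False}, "diet", 1.0),
    ({"high_bmi": False}, "diet", 3.0),
    ({"high_bmi": False}, "exercise", 5.0),
    ({"high_bmi": False}, "exercise", 7.0),
]


def propensity(features, treatment):
    return 0.5  # assumed 1:1 randomization


def all_diet(features):
    return "diet"


def all_exercise(features):
    return "exercise"


def tailored(features):
    return "diet" if features["high_bmi"] else "exercise"


values = {
    "all_diet": estimated_value(records, all_diet, propensity),
    "all_exercise": estimated_value(records, all_exercise, propensity),
    "tailored": estimated_value(records, tailored, propensity),
}
```

In this constructed example both fixed assignments have an estimated value of 4.5, whereas the tailored rule reaches 6.5, because each subgroup is matched to the arm in which it happened to do better. In real data, the rule would be learned on one part of the data and its value estimated on another to avoid optimistic bias.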

DISCUSSION:

In this review, we have outlined the recent implementations of ML algorithms in areas such as phenotyping, prediction of progression/TJR, and precision medicine in OA. This is a rapidly developing area in OA and medicine in general, so this work provides a snapshot of the current environment and is not definitive. In the following discussion, we will focus on future directions and other tasks where ML could be used to improve OA outcomes and the challenges associated with implementation of ML algorithms in clinical settings.

Despite extensive research efforts, we still do not have disease-modifying drugs for OA, and existing interventions provide only modest benefit. In this review, we have discussed how ML algorithms could improve the efficiency of RCTs (16), potentially reducing time and cost for such endeavors. Although not discussed in our review, AI/ML methods can also be used to repurpose known drugs for OA treatment, significantly reducing research and development time and cost, by using population-based, network-based, or transcriptomic-based approaches, for which the reader is referred to references (45–47).

There are many avenues that will help clinicians and researchers better manage OA progression and develop new treatment and prevention strategies, including (but not limited to) the use of new diverse data sources, the integration of different data types/sources using advanced AI techniques, and rigorous validation of models. Optimizing the use of EHR data in combination with AI/ML methods is an important future direction in medicine. EHRs and other administrative data often include longitudinal observations that allow the study of outcomes and can potentially be linked to omics data. The growing availability and falling cost of genetic data, together with rapidly expanding technologies in the omics era, point to another future direction.

The use of multi-source data (e.g., omics, images, genetic, clinical), which can be effectively analyzed using AI/ML methods, is likely to continue to provide new insights into disease development and management. Many studies in this review integrated images with clinical data (10, 16, 19, 25, 26, 33, 34, 42) to great benefit. Further integration of these data with data from multi-omics could improve our understanding of pathophysiology and mechanisms underlying OA to give new insights into management. Of particular interest are studies that could identify previously unknown and potentially modifiable risk factors, such as metabolite targets for new drug development to address disease progression in metabolic OA subtypes, or targeted CpG sites for tailored methylation-modifying therapeutics.

As we and others continue to note, AI/ML methodologies are prone to the same biases, errors, and assumptions that affect more traditional statistical methods (27). Most studies included in this review used internal validation strategies (Table), but most also lacked external validation. Differences in underlying populations and data sources, together with this lack of external validation (particularly in diverse and real-world data), threaten generalizability. This highlights the need for diverse large open datasets, including population-based cohorts and real clinical data, for use in benchmarking and validating AI/ML algorithms for broader use. Challenges here include data storage and sharing, data safety and privacy, data harmonization, and control of data quality (especially missing values). For example, EHR data originally designed for billing purposes may introduce unintended biases (48) and lack other important information (e.g., social determinants). Omics data may not be collected from all individuals, leading to incomplete data sets that have the potential to bias outcomes. Longitudinal data from EHRs, as well as from RCTs and cohort studies, are also subject to non-random, informative attrition (e.g., loss of access, study eligibility, death). Widera et al. proposed several solutions to adjust for imbalance, such as up/down sampling and cost sensitive learning (16), but it is not clear how to apply them to real-world data. Therefore, continuous development of ML approaches to address these and other relevant issues is needed to optimize the fair and equitable use of complex information from administrative data, RCTs, clinical observational data, and cohort studies in OA and other conditions.
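
For readers unfamiliar with the two imbalance adjustments named above, the following sketch shows generic versions of each: random up-sampling of the minority class and inverse-frequency class weights as one simple form of cost-sensitive learning. The labels, counts, and function names are hypothetical, and this is not the specific procedure of Widera et al. (16).

```python
import random
from collections import Counter


def class_weights(labels):
    """Inverse-frequency weights, n_samples / (n_classes * class_count),
    so that errors on rare classes cost more during training."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * m) for c, m in counts.items()}


def upsample(rows, labels, seed=0):
    """Randomly resample each minority class with replacement until
    every class is as large as the majority class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for c, m in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == c]
        for _ in range(target - m):
            out_rows.append(rng.choice(pool))
            out_labels.append(c)
    return out_rows, out_labels


# Hypothetical training fold: 8 non-progressors, 2 progressors.
labels = ["non-progressor"] * 8 + ["progressor"] * 2
rows = [{"id": i} for i in range(len(labels))]

weights = class_weights(labels)
balanced_rows, balanced_labels = upsample(rows, labels)
balanced_counts = Counter(balanced_labels)
```

With these counts, progressors receive a weight of 2.5 and non-progressors 0.625, and up-sampling yields 8 records per class. Either adjustment should be applied to the training data only; resampling before splitting the data would leak duplicated records into the validation set.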

In addition to external validation of a given model in one or more independent data sets, direct comparison of different AI/ML methods using the same data is essential for good scientific practice and future clinical applications (49). Like RCTs, which represent a gold standard for the assessment of new drugs or interventions, biomedical challenges (35) can be used to compare performance of different algorithms and are a huge step towards reproducibility of results. The design of such challenges is as important as the design of RCTs and is an area of continued development.

Conclusions:

OA is a common, complex, heterogeneous, serious disease with no disease-modifying therapy short of joint replacement surgery. Studies of individual aspects of disease, single joint tissues, or small cohorts are limited in the insight they can provide. AI/ML methods allow the incorporation not only of larger datasets, but also of diverse data of many types, scales, and origins, high-dimensional data on small sample sizes, and combinations of these, providing an avenue to novel insights and hopefully patient-level breakthroughs in the not-too-distant future.

Funding and competing interests

Funding for this work was provided in part by NIH/NIAMS K24AR081368 and P30AR07250. The funders had no role in the writing or submission of the manuscript. Dr. Nelson also reports funding outside this work from NIH/NIAMS and the Rheumatology Research Foundation; she has received honoraria from Osteoarthritis and Cartilage and Nestle Health. The other authors report no competing interests.

Abbreviations

ACR: American College of Rheumatology
AI: Artificial Intelligence
AUC: Area Under Curve
BMI: Body Mass Index
CDW: Clinical data warehouses
CNN: Convolutional Neural Network
DEC: Deep Embedded Clustering
DL: Deep Learning
EHR: Electronic Health Record
GWAS: genome-wide association studies
KLG: Kellgren-Lawrence Grade
LR: Logistic Regression
MFAC: clustering with multiple factor analysis
ML: Machine Learning
MRI: Magnetic resonance imaging
NSAID: non-steroidal anti-inflammatory drugs
OA: Osteoarthritis
OAI: Osteoarthritis Initiative
RCT: Randomized Clinical Trial
RF: Random Forest
TJR: Total Joint Replacement
US: Ultrasound data
WOMAC: Western Ontario and McMaster Universities Arthritis Index
XGB: eXtreme Gradient Boosting

Footnotes

Human and animal rights statement: All reported studies/experiments with human or animal subjects performed by the authors have been previously published and complied with all applicable ethical standards (including the Helsinki declaration and its amendments, institutional/national research committee standards, and international/national/institutional guidelines).

REFERENCES

  • 1. Murphy LB, Cisternas MG, Pasta DJ, Helmick CG, Yelin EH. Medical Expenditures and Earnings Losses Among US Adults With Arthritis in 2013. Arthritis Care Res (Hoboken). 2018;70(6):869–76.
  • 2. Kolasinski SL, Neogi T, Hochberg MC, Oatis C, Guyatt G, Block J, et al. 2019 American College of Rheumatology/Arthritis Foundation Guideline for the Management of Osteoarthritis of the Hand, Hip, and Knee. Arthritis Rheumatol. 2020;72(2):220–33.
  • 3. Grässel S, Muschter D. Recent advances in the treatment of osteoarthritis. F1000Res. 2020 May 4;9:F1000 Faculty Rev-325.
  • 4. Loos NL, Hoogendam L, Souer JS, Slijper HP, Andrinopoulou ER, Coppieters MW, et al. Machine Learning Can be Used to Predict Function but Not Pain After Surgery for Thumb Carpometacarpal Osteoarthritis. Clin Orthop Relat Res. 2022;480(7):1271–84.
  • 5. Bowes MA, Kacena K, Alabas OA, Brett AD, Dube B, Bodick N, et al. Machine-learning, MRI bone shape and important clinical outcomes in osteoarthritis: data from the Osteoarthritis Initiative. Ann Rheum Dis. 2021;80(4):502–8.
  • 6. Chaudhari AS, Stevens KJ, Wood JP, Chakraborty AK, Gibbons EK, Fang Z, et al. Utility of deep learning super-resolution in the context of osteoarthritis MRI biomarkers. J Magn Reson Imaging. 2020;51(3):768–79.
  • 7. Lester G. The Osteoarthritis Initiative: A NIH Public-Private Partnership. HSS J. 2012;8(1):62–3.
  • 8. Chen G, Sullivan PF, Kosorok MR. Biclustering with heterogeneous variance. Proc Natl Acad Sci U S A. 2013;110(30):12253–8.
  • 9. Cheng Y, Church GM. Biclustering of expression data. Proc Int Conf Intell Syst Mol Biol. 2000;8:93–103.
  • 10. Nelson AE, Keefe TH, Schwartz TA, Callahan LF, Loeser RF, Golightly YM, et al. Biclustering reveals potential knee OA phenotypes in exploratory analyses: Data from the Osteoarthritis Initiative. PLoS One. 2022;17(5):e0266964.
  • 11. Demanse D, Saxer F, Lustenberger P, Tanko LB, Nikolaus P, Rasin I, et al. Unsupervised machine-learning algorithms for the identification of clinical phenotypes in the osteoarthritis initiative database. Semin Arthritis Rheum. 2023;58:152140.
  • 12. Trajerova M, Kriegova E, Mikulkova Z, Savara J, Kudelka M, Gallo J. Knee osteoarthritis phenotypes based on synovial fluid immune cells correlate with clinical outcome trajectories. Osteoarthritis Cartilage. 2022;30(12):1583–92.
  • 13. Deveza LA, Nelson AE, Loeser RF. Phenotypes of osteoarthritis: current state and future implications. Clin Exp Rheumatol. 2019;37 Suppl 120(5):64–72.
  • 14. Mobasheri A, van Spil WE, Budd E, Uzieliene I, Bernotiene E, Bay-Jensen AC, et al. Molecular taxonomy of osteoarthritis for patient stratification, disease management and drug development: biochemical markers associated with emerging clinical phenotypes and molecular endotypes. Curr Opin Rheumatol. 2019;31(1):80–9.
  • *15. Steinberg J, Southam L, Fontalis A, Clark MJ, Jayasuriya RL, Swift D, et al. Linking chondrocyte and synovial transcriptional profile to clinical phenotype in osteoarthritis. Ann Rheum Dis. 2021;80(8):1070–4. Used machine learning to assess gene expression profiles with results supporting the theory that osteoarthritis is a continuum with less variation at later stages of disease; greater heterogeneity early in disease suggests an opportunity for tailored treatment.
  • *16. Widera P, Welsing PMJ, Ladel C, Loughlin J, Lafeber F, Petit Dop F, et al. Multi-classifier prediction of knee osteoarthritis progression from incomplete imbalanced longitudinal data. Sci Rep. 2020;10(1):8427. Rigorous statistical framework using advanced statistical techniques to account for class imbalance and incomplete data. Used categorical rather than binary definition of the outcome, KOA progression.
  • 17. van Helvoort EM, van Spil WE, Jansen MP, Welsing PMJ, Kloppenburg M, Loef M, et al. Cohort profile: The Applied Public-Private Research enabling OsteoArthritis Clinical Headway (IMI-APPROACH) study: a 2-year, European, cohort study to describe, validate and predict phenotypes of osteoarthritis using clinical, imaging and biochemical markers. BMJ Open. 2020;10(7):e035101.
  • 18. Kraus VB, Collins JE, Hargrove D, Losina E, Nevitt M, Katz JN, et al. Predictive validity of biochemical biomarkers in knee osteoarthritis: data from the FNIH OA Biomarkers Consortium. Ann Rheum Dis. 2017;76(1):186–95.
  • 19. Bonakdari H, Pelletier JP, Abram F, Martel-Pelletier J. A Machine Learning Model to Predict Knee Osteoarthritis Cartilage Volume Changes over Time Using Baseline Bone Curvature. Biomedicines. 2022;10(6).
  • 20. Raynauld JP, Pelletier JP, Delorme P, Dodin P, Abram F, Martel-Pelletier J. Bone curvature changes can predict the impact of treatment on cartilage volume loss in knee osteoarthritis: data from a 2-year clinical trial. Rheumatology (Oxford). 2017;56(6):989–98.
  • 21. Raynauld JP, Martel-Pelletier J, Bias P, Laufer S, Haraoui B, Choquette D, et al. Protective effects of licofelone, a 5-lipoxygenase and cyclo-oxygenase inhibitor, versus naproxen on cartilage loss in knee osteoarthritis: a first multicentre clinical trial using quantitative MRI. Ann Rheum Dis. 2009;68(6):938–47.
  • 22. Ilse M, Tomczak J, Welling M. Attention-based deep multiple instance learning. International conference on machine learning; 2018: PMLR; 2018. p. 2127–36.
  • *23. Schiratti JB, Dubois R, Herent P, Cahane D, Dachary J, Clozel T, et al. A deep learning method for predicting knee osteoarthritis radiographic progression from MRI. Arthritis Res Ther. 2021;23(1):262. Developed a weakly supervised deep learning algorithm to predict OA progression over a short time frame; encouraging results suggest that such algorithms can feasibly be integrated into the screening phase of clinical trials and improve how inclusion criteria are determined.
  • 24. Selvaraju RR, Cogswell M, Das A, Vedantam R, Parikh D, Batra D. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. International Journal of Computer Vision. 2020;128(2):336–59.
  • 25. Guan B, Liu F, Haj-Mirzaian A, Demehri S, Samsonov A, Neogi T, et al. Deep learning risk assessment models for predicting progression of radiographic medial joint space loss over a 48-MONTH follow-up period. Osteoarthritis Cartilage. 2020;28(4):428–37.
  • 26. Guan B, Liu F, Mizaian AH, Demehri S, Samsonov A, Guermazi A, et al. Deep learning approach to predict pain progression in knee osteoarthritis. Skeletal Radiol. 2022;51(2):363–73.
  • **27. Nelson AE, Arbeeva L. Narrative Review of Machine Learning in Rheumatic and Musculoskeletal Diseases for Clinicians and Researchers: Biases, Goals, and Future Directions. J Rheumatol. 2022;49(11):1191–200. Review of machine learning in rheumatic and musculoskeletal diseases beyond osteoarthritis, providing extensive discussion around potential biases and limitations.
  • 28. Yoo HJ, Jeong HW, Kim SW, Kim M, Lee JI, Lee YS. Prediction of progression rate and fate of osteoarthritis: Comparison of machine learning algorithms. J Orthop Res. 2023;41(3):583–90.
  • *29. Dunn CM, Sturdy C, Velasco C, Schlupp L, Prinz E, Izda V, et al. Peripheral Blood DNA Methylation-Based Machine Learning Models for Prediction of Knee Osteoarthritis Progression: Biologic Specimens and Data From the Osteoarthritis Initiative and Johnston County Osteoarthritis Project. Arthritis Rheumatol. 2023;75(1):28–40. Use of fully independent data for external validation and investigation of potentially novel epigenetic biomarkers for useful clinical progression definitions are strengths of this work.
  • 30. Bonakdari H, Pelletier JP, Blanco FJ, Rego-Pérez I, Durán-Sotuela A, Aitken D, et al. Single nucleotide polymorphism genes and mitochondrial DNA haplogroups as biomarkers for early prediction of knee osteoarthritis structural progressors: use of supervised machine learning classifiers. BMC Med. 2022;20(1):316.
  • 31. Dore D, Martens A, Quinn S, Ding C, Winzenberg T, Zhai G, et al. Bone marrow lesions predict site-specific cartilage defect development and volume loss: a prospective study in older adults. Arthritis Res Ther. 2010;12(6):R222.
  • 32. Leung K, Zhang B, Tan J, Shen Y, Geras KJ, Babb JS, et al. Prediction of Total Knee Replacement and Diagnosis of Osteoarthritis by Using Deep Learning on Knee Radiographs: Data from the Osteoarthritis Initiative. Radiology. 2020;296(3):584–93.
  • 33. Jamshidi A, Pelletier JP, Labbe A, Abram F, Martel-Pelletier J, Droit A. Machine Learning-Based Individualized Survival Prediction Model for Total Knee Replacement in Osteoarthritis: Data From the Osteoarthritis Initiative. Arthritis Care Res (Hoboken). 2021;73(10):1518–27.
  • 34. Tiulpin A, Saarakkala S, Mathiessen A, Hammer HB, Furnes O, Nordsletten L, et al. Predicting total knee arthroplasty from ultrasonography using machine learning. Osteoarthr Cartil Open. 2022;4(4):100319.
  • **35. Hirvasniemi J, Runhaar J, van der Heijden RA, Zokaeinikoo M, Yang M, Li X, et al. The KNee OsteoArthritis Prediction (KNOAP2020) challenge: An image analysis challenge to predict incident symptomatic radiographic knee osteoarthritis from MRI and X-ray images. Osteoarthritis Cartilage. 2023;31(1):115–25. The first biomedical challenge on the prediction of incident symptomatic radiographic knee OA, a step towards unbiased comparison between different models, robust validation and clinical translation of AI/ML algorithms.
  • 36. Runhaar J, van Middelkoop M, Reijman M, Willemsen S, Oei EH, Vroegindeweij D, et al. Prevention of knee osteoarthritis in overweight females: the first preventive randomized controlled trial in osteoarthritis. Am J Med. 2015;128(8):888–95 e4.
  • 37. Allen KD, Helmick CG, Schwartz TA, DeVellis RF, Renner JB, Jordan JM. Racial differences in self-reported pain and function among individuals with radiographic hip and knee osteoarthritis: the Johnston County Osteoarthritis Project. Osteoarthritis Cartilage. 2009;17(9):1132–6.
  • 38. Vaughn IA, Terry EL, Bartley EJ, Schaefer N, Fillingim RB. Racial-Ethnic Differences in Osteoarthritis Pain and Disability: A Meta-Analysis. J Pain. 2019;20(6):629–44.
  • **39. Pierson E, Cutler DM, Leskovec J, Mullainathan S, Obermeyer Z. An algorithmic approach to reducing unexplained pain disparities in underserved populations. Nat Med. 2021;27(1):136–40. An example of implementation of an AI algorithm for predicting the severity of OA symptoms based on objective image data rather than subjective self-report and/or radiologist assessment. If externally validated, it can be used as a decision aid for TJR referral, as it can potentially mitigate bias in pain assessment in disadvantaged social groups and reduce health disparities in pain management and medical decisions.
  • 40. Blum MA, Ibrahim SA. Race/ethnicity and use of elective joint replacement in the management of end-stage knee/hip osteoarthritis: a review of the literature. Clin Geriatr Med. 2012;28(3):521–32.
  • 41. Singh JA, Lu X, Rosenthal GE, Ibrahim S, Cram P. Racial disparities in knee and hip total joint arthroplasty: an 18-year analysis of national Medicare data. Ann Rheum Dis. 2014;73(12):2107–15.
  • 42. Joseph GB, McCulloch CE, Nevitt MC, Link TM, Sohn JH. Machine learning to predict incident radiographic knee osteoarthritis over 8 Years using combined MR imaging features, demographics, and clinical factors: data from the Osteoarthritis Initiative. Osteoarthritis Cartilage. 2022;30(2):270–9.
  • 43. Messier SP, Mihalko SL, Legault C, Miller GD, Nicklas BJ, DeVita P, et al. Effects of intensive diet and exercise on knee joint loads, inflammation, and clinical outcomes among overweight and obese adults with knee osteoarthritis: the IDEA randomized clinical trial. JAMA. 2013;310(12):1263–73.
  • **44. Jiang X, Nelson AE, Cleveland RJ, Beavers DP, Schwartz TA, Arbeeva L, et al. Precision Medicine Approach to Develop and Internally Validate Optimal Exercise and Weight-Loss Treatments for Overweight and Obese Adults With Knee Osteoarthritis: Data From a Single-Center Randomized Trial. Arthritis Care Res (Hoboken). 2021;73(5):693–701. This is among the first studies to apply precision medicine methodology to interventions in OA, and uses data from an existing, high quality RCT, finding potential subgroups where benefit could be increased by optimal assignment based on baseline features.
  • 45. Chen B, Butte AJ. Leveraging big data to transform target selection and drug discovery. Clin Pharmacol Ther. 2016;99(3):285–97.
  • 46. Hodos RA, Kidd BA, Shameer K, Readhead BP, Dudley JT. In silico methods for drug repurposing and pharmacology. Wiley Interdiscip Rev Syst Biol Med. 2016;8(3):186–210.
  • 47. Jang IJ. Artificial intelligence in drug development: clinical pharmacologist perspective. Transl Clin Pharmacol. 2019;27(3):87–8.
  • 48. Gianfrancesco MA, Tamang S, Yazdany J, Schmajuk G. Potential Biases in Machine Learning Algorithms Using Electronic Health Record Data. JAMA Intern Med. 2018;178(11):1544–7.
  • 49. Reinke A, Tizabi MD, Eisenmann M, Maier-Hein L. Common Pitfalls and Recommendations for Grand Challenges in Medical Artificial Intelligence. Eur Urol Focus. 2021;7(4):710–2.
