Abstract
Purpose
To develop a multimodal machine learning–based pipeline to predict patient-specific risk of dislocation following primary total hip arthroplasty (THA).
Materials and Methods
This study retrospectively evaluated 17 073 patients who underwent primary THA between 1998 and 2018. A test set of 1718 patients was held out. A hybrid network of EfficientNet-B4 and Swin-B transformer was developed to classify patients according to 5-year dislocation outcomes from preoperative anteroposterior pelvic radiographs and clinical characteristics (demographics, comorbidities, and surgical characteristics). The most informative imaging features extracted by this model were selected and concatenated with clinical features. These combined features were then used to train a multimodal survival XGBoost model to predict the individualized hazard of dislocation within 5 years. The C index was used to evaluate the multimodal survival model on the test set and to compare it with a clinical-only model trained without imaging features. Shapley additive explanation values were used for model explanation.
Results
The study sample had a median age of 65 years (IQR: 18 years; 52.1% [8889] women) with a 5-year dislocation incidence of 2%. On the holdout test set, the clinical-only model achieved a C index of 0.64 (95% CI: 0.60, 0.68). The addition of imaging features boosted multimodal model performance to a C index of 0.74 (95% CI: 0.69, 0.78; P = .02).
Conclusion
Due to its discrimination ability and explainability, this risk calculator has the potential to be a powerful tool for dislocation risk stratification and THA planning.
Keywords: Conventional Radiography, Surgery, Skeletal-Appendicular, Hip, Outcomes Analysis, Supervised Learning, Convolutional Neural Network (CNN), Gradient Boosting Machines (GBM)
Supplemental material is available for this article.
© RSNA, 2022
Summary
A multimodal machine learning–based risk calculator using a single preoperative anteroposterior hip radiograph coupled with demographic, comorbidity, and surgical information predicted patient-specific dislocation risk following primary total hip arthroplasty.
Key Points
■ Imaging features from a single preoperative anteroposterior hip radiograph were the most influential factors in predicting patient-specific risk of dislocation, even beyond well-known and well-characterized clinical risk factors for dislocation.
■ Imaging features and clinical characteristics had a synergistic effect in predicting dislocation risk, creating a robust patient-specific calculator (C index, 0.74; P = .02).
■ The proposed calculator enables surgeons to determine changes in individual patient risk after adjusting three surgical factors within their control (surgical approach, acetabular liner type, and femoral head size).
Introduction
Total hip arthroplasty (THA) is the second most commonly performed elective surgery in the United States (behind total knee arthroplasty), and its volume is projected to increase threefold by the year 2040 (1). While THA is typically quite successful, complications can occur. Implant dislocation is the most common complication, with a lifetime incidence of 4%–5%, resulting in emergency department visits for reduction under sedation. Oftentimes, patients become “chronic dislocators,” requiring multiple reductions by a health care professional and possibly revision surgery (2). In fact, implant dislocation is the leading cause of revision THA in the United States (3).
There are several risk factors identified for THA dislocation, which can be categorized broadly into preoperative nonmodifiable factors such as demographics (eg, age, sex, body mass index) and specific comorbidities (eg, neurologic disease, spinal disease, spine surgery, indication for THA). However, there are also well-described intraoperative choices within a surgeon’s control that can be used to mitigate the risk of dislocation, including the surgical approach and implant options, such as acetabular liner type and femoral head size (4–8). Postoperative risk can be mitigated mainly through revision surgery, which is indicated only after the patient experiences dislocation (9). Hence, interventions should be targeted at intraoperative variables.
Recently, our group developed a patient-specific THA dislocation risk calculator based on a rigorously characterized group of patients (nearly 30 000) by using follow-up data from our total joint registry capturing every dislocation, regardless of whether it takes place in our health care system (10). The calculator incorporates all the aforementioned preoperative nonmodifiable factors and modifiable surgical decisions in a single multivariable model of clinical data. The nomograms from the calculator enable surgeons to identify patient baseline risk and determine how this risk can be modified by surgical decisions. A deficit of the calculator was modest discrimination, with a C index of 0.62–0.65. This is not unexpected for an outcome like dislocation that is multifactorial in nature. Nevertheless, we endeavored to improve the calculator by incorporating deep learning evaluation of a single preoperative anteroposterior (AP) pelvic radiograph, which is the standard of care in preoperative THA workup.
The aims of this study were to develop a deep learning classifier model trained to extract imaging features from a preoperative AP pelvic radiograph and to evaluate if the fusion of these features with a previously developed clinical patient-specific risk prediction calculator could improve model performance.
Materials and Methods
A pipeline of machine learning algorithms (image classification, feature extraction, and survival analysis) was designed to predict dislocation risk when given a patient’s latest preoperative AP radiograph and clinical characteristics (Fig 1). Additionally, a graphical user interface was developed to facilitate clinical use, while ensuring model explainability.
Figure 1:

Flow diagram of study recruitment and pipeline schematics. AP = anteroposterior, THA = total hip arthroplasty.
Study Sample Selection and Splitting
This retrospective cohort study of a total joint arthroplasty registry was Health Insurance Portability and Accountability Act compliant and approved by the institutional review board with waiver of informed consent. Patients who underwent primary THA between 1998 and 2018 were evaluated, yielding 21 978 patients whose charts were reviewed to extract demographics, relevant comorbidities, and surgical characteristics, as detailed in Table 1. The primary outcome was postoperative dislocation within 5 years, as dislocation becomes less common as time from surgery increases (11). Patients who did not have digitized preoperative AP pelvic or hip radiographs (n = 4905) in our previously established THA radiograph registry were excluded, yielding a final cohort of 17 073 THA procedures. A representative test set of 10% of these patients with the same distribution of outcomes and surgery dates was randomly selected. The remainder of the dataset was separated into 10 folds at the patient level, while ensuring all folds contained approximately the same number of patients with dislocation. Tenfold cross-validation was used for hyperparameter optimization, with evaluation of the best-performing fold on the holdout test set.
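The patient-level fold assignment described above can be illustrated with a minimal sketch. It is not the study code: the function name `stratified_folds`, the round-robin dealing strategy, and the random seed are assumptions made for illustration, and the additional balancing of surgery dates used for the test set is not reproduced.

```python
import random
from collections import defaultdict


def stratified_folds(patient_outcomes, n_folds=10, seed=0):
    """Assign each patient to exactly one fold, spreading outcomes evenly.

    patient_outcomes: dict mapping patient_id -> 1 (dislocation) or 0.
    Returns a dict mapping patient_id -> fold index in [0, n_folds).
    """
    rng = random.Random(seed)
    by_outcome = defaultdict(list)
    for patient_id, outcome in patient_outcomes.items():
        by_outcome[outcome].append(patient_id)

    fold_of = {}
    offset = 0
    for outcome in sorted(by_outcome):
        ids = sorted(by_outcome[outcome])
        rng.shuffle(ids)
        # Deal patients round-robin so every fold receives a near-equal
        # share of this outcome class; the offset keeps fold sizes balanced.
        for i, patient_id in enumerate(ids):
            fold_of[patient_id] = (i + offset) % n_folds
        offset += len(ids)
    return fold_of


# Toy cohort: 1000 patients, 20 of whom have the outcome (about 2%).
outcomes = {pid: int(pid < 20) for pid in range(1000)}
folds = stratified_folds(outcomes)
events_per_fold = [
    sum(outcomes[p] for p in outcomes if folds[p] == k) for k in range(10)
]
```

Because the assignment is keyed by patient rather than by image, all radiographs of one patient stay in the same fold, which avoids leakage between training and validation data.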
Table 1:
Encoding of Clinical Variables for Model Training: 21 Total Features
Image Classifier
Patients with definite 5-year ground truth labels (ie, those sustaining dislocation within 5 years or who were followed up for 5 years and did not have dislocation) were used to train a multimodal binary classifier for predicting 5-year dislocation. Preoperative AP images were de-identified and their pixel values were clipped between the 2.5th and 97.5th percentile and were passed to a previously trained object detection model to detect the hip joint coordinates and side of the body shown in the image (12). The resulting bounding box for the joint of interest was dilated by 25% in the lateral and superior borders and was used to crop the images. Cropped images were padded to square, resized to 512 × 512 pixels, and standardized to have values between 0 and 1.
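As a rough illustration of the intensity handling described above, the following sketch clips pixel values to the 2.5th and 97.5th percentiles, pads the image to a square, and rescales it to the range 0 to 1. It operates on nested Python lists for self-containment; the helper names, the linear-interpolation percentile, and the choice to pad with the image minimum are assumptions, and the hip detection, cropping, and resizing steps are omitted.

```python
def percentile(sorted_values, pct):
    """Linearly interpolated percentile of an already sorted list."""
    pos = (len(sorted_values) - 1) * pct / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)


def clip_to_percentiles(image, lo_pct=2.5, hi_pct=97.5):
    """Clip every pixel to the [lo_pct, hi_pct] percentile range of the image."""
    flat = sorted(v for row in image for v in row)
    lo, hi = percentile(flat, lo_pct), percentile(flat, hi_pct)
    return [[min(max(v, lo), hi) for v in row] for row in image]


def pad_to_square(image, fill=None):
    """Center the image on a square canvas (fill defaults to the image minimum)."""
    height, width = len(image), len(image[0])
    if fill is None:
        fill = min(min(row) for row in image)  # assumption: pad with darkest value
    side = max(height, width)
    top, left = (side - height) // 2, (side - width) // 2
    canvas = [[fill] * side for _ in range(side)]
    for r in range(height):
        for c in range(width):
            canvas[top + r][left + c] = image[r][c]
    return canvas


def scale_to_unit(image):
    """Min-max scale pixel values to the range 0 to 1."""
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # guard against a constant image
    return [[(v - lo) / span for v in row] for row in image]


def preprocess(image):
    return scale_to_unit(pad_to_square(clip_to_percentiles(image)))
```

Clipping before rescaling keeps a few extreme pixels (for example, burned-in markers) from compressing the useful intensity range.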
A hybrid network was developed that used the output of the first five blocks of an EfficientNetB4 (pretrained on ImageNet) to serve as the patch-embedding layer of a Swin-B vision transformer (randomly initialized) (Fig 2) (13,14). Both models were adapted from PyTorch image models (timm) implementation (15). Clinical variables were encoded (Table 1) and concatenated to the imaging features extracted by the transformer. This model was trained using FocalLoss as loss function and LAMB as optimizer (16,17). To account for imbalance in the training data given the rarity of dislocation as a complication, the minority class was oversampled by a factor of 15.
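To make the two imbalance-handling steps concrete, the sketch below shows a per-example binary focal loss in the form given by Lin et al (16) and a simple index-level oversampler. The values `gamma=2.0` and `alpha=0.25` are the defaults suggested in that reference, not settings reported for this study, and interpreting "a factor of 15" as repeating each minority example 15 times is an assumption.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss for a single example.

    p: predicted probability of the positive class (dislocation), 0 < p < 1.
    y: true label, 1 or 0.
    gamma down-weights easy examples; alpha balances the two classes.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def oversample_minority(labels, factor=15):
    """Return sample indices with each positive (minority) index repeated."""
    indices = []
    for i, label in enumerate(labels):
        indices.extend([i] * (factor if label == 1 else 1))
    return indices


# A confidently correct prediction contributes far less than a wrong one.
easy = binary_focal_loss(0.9, 1)
hard = binary_focal_loss(0.1, 1)
```

With `gamma=0` and `alpha=0.5` the expression reduces to half of the ordinary cross-entropy, which is a convenient sanity check.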
Figure 2:
Proposed architecture for the deep learning multimodal classifier.
Although our primary interest was focused on patients’ most recent preoperative AP pelvic and hip radiographs, using only the latest radiographs would limit the training data. To address this issue, all preoperative AP pelvic and hip radiographs were used for training, but the loss value for each image was weighted according to its temporal timestamp (TT), as follows:
[Equation not recoverable from source: TT expressed as a function of d]
where d is the duration between image acquisition and surgery in days. Also, the TT for each image was broadcast and stacked as the second channel to that image (Fig 2). Using TT as described above ensured a larger contribution of recent images to model learning (through loss weighting) and demonstrated the importance of image-to-surgery interval on the features learned (through image concatenation). During training, images were augmented by applying rotation (±15°), zooming (±0.1), and horizontal flipping, using the Medical Open Network for Artificial Intelligence (ie, MONAI) package (18). Each batch contained 128 images. Learning rate was increased from 0.0001 to 0.001 during the first 50 epochs and gradually decreased for a total of 100 epochs (19). Exponential moving weight averaging with a decay factor of 0.99 and weight decay of 0.0005 were used to regularize the training. Training was carried out using the PyTorch framework (v1.10.0; pytorch.org) on four NVIDIA A100 accelerators (20).
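The learning rate schedule and weight averaging described above could look roughly like the following sketch. Only the start value (0.0001), the peak (0.001), the 50-epoch warm-up, the 100-epoch total, and the decay factor (0.99) come from the text; the linear warm-up shape, the cosine decay, and the final value `lr_end` are assumptions.

```python
import math


def learning_rate(epoch, lr_start=1e-4, lr_peak=1e-3, lr_end=1e-5,
                  warmup_epochs=50, total_epochs=100):
    """Learning rate at a given epoch: linear warm-up, then cosine decay."""
    if epoch < warmup_epochs:
        frac = epoch / warmup_epochs
        return lr_start + (lr_peak - lr_start) * frac
    frac = min((epoch - warmup_epochs) / (total_epochs - warmup_epochs), 1.0)
    # Cosine factor runs from 1 (at the peak) down to 0 (at the last epoch).
    return lr_end + 0.5 * (lr_peak - lr_end) * (1.0 + math.cos(math.pi * frac))


def ema_update(ema_weights, weights, decay=0.99):
    """One exponential-moving-average step over a flat list of weights."""
    return [decay * e + (1.0 - decay) * w for e, w in zip(ema_weights, weights)]


schedule = [learning_rate(epoch) for epoch in range(101)]
```

In a PyTorch training loop the same shape is usually obtained from a built-in scheduler; the pure-Python version here is only meant to show the arithmetic.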
Although the outputs of the developed deep learning model naively could be regarded as dislocation risks, this would mean excluding from the training data the patients who had undergone less than 5 years of follow-up, leading to potential bias through changing the data distribution. Therefore, we used this model as an image feature selector and trained a final survival model on all patients’ data, regardless of follow-up length.
Image Feature Selection and Survival Analysis
The trained classifier was applied to the radiographs in all patients, including those with indefinite outcome (ie, having undergone less than 5 years of follow-up). An output vector of the Swin transformer features (length = 1024) was saved for each image (Fig 2). A first extreme gradient boosting machine model (XGBoost) was then used to select the top 10 imaging features that had the most overall gain of performance on the training set across cross-validation folds (21). These 10 features were then extracted for all patients and concatenated with 21 encoded clinical features (encoding details in Table 1). This set of 31 features was used to train a second XGBoost survival model to predict the risk of dislocation for each patient, serving as the final calculator. Survival embeddings were also applied to the risk outputs of all XGBoost models to calculate 5-year hazard, hereafter called risk (22).
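The selection and concatenation steps can be sketched as follows. The sketch assumes that per-fold gain values have already been read from the first XGBoost model and that "overall gain" means the sum across folds; both the aggregation rule and the function names are illustrative assumptions.

```python
from collections import defaultdict


def top_features_by_gain(fold_gains, k=10):
    """Rank imaging features by gain summed over cross-validation folds.

    fold_gains: list of dicts, one per fold, mapping feature index -> gain.
    Returns the k feature indices with the largest total gain.
    """
    total = defaultdict(float)
    for gains in fold_gains:
        for feature, gain in gains.items():
            total[feature] += gain
    # Sort by descending gain; break ties by index for reproducibility.
    ranked = sorted(total, key=lambda f: (-total[f], f))
    return ranked[:k]


def build_multimodal_row(image_vector, selected, clinical_vector):
    """Concatenate the selected imaging features with the clinical features."""
    return [image_vector[i] for i in selected] + list(clinical_vector)


# Toy example: a 1024-long image vector, 10 selected indices, 21 clinical values.
image_vector = [float(i) for i in range(1024)]
selected = list(range(10))
clinical_vector = [0.0] * 21
row = build_multimodal_row(image_vector, selected, clinical_vector)
```

The resulting row has 31 entries, matching the feature count used to train the second survival model.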
For model development, features were extracted from all available images for patients in the training set, but only the latest preoperative images for patients in the holdout test set were used for evaluation purposes (as is the real use case of the model). For all survival models, hyperparameters were optimized during cross-validation, as detailed in Table E1 (supplement).
Statistical Analysis
Categorical variables are presented as numbers with percentages in parentheses and continuous variables as medians with IQRs in parentheses or as means ± SDs. To compare patient characteristics in the training and test sets, χ2 test (for categorical variables) and t test (for continuous variables) were used. Area under the receiver operating characteristic curve (AUC) was used for classifier model selection and performance evaluation. Captum package (v0.4.0; ai.facebook.com) was used to visualize the 10 selected imaging features, and each neuron’s global integrated gradient map was generated (23,24). The performance of the multimodal survival model was reported and compared with a naive XGBoost survival model, which was trained with only clinical data, without including imaging features (hereafter called the clinical-only model). Harrell C index, a generalization of AUC to survival data, was used to report survival model performance (25). The C index shows the percentage of patient pairs in which the risk of the patient with a shorter time to event is higher than that of the other patient in that pair. To gain an intuition behind survival model predictions, Shapley additive explanation (SHAP) values were calculated that show the contribution of each variable in the final output (26). Details about SHAP and chart interpretations are provided in Appendix E1 (supplement).
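The pairwise definition of the C index given above can be written out directly. The function below is a minimal sketch, not the survcomp implementation used in the study: it skips pairs with tied times and counts tied risks as half-concordant, which are common conventions but are assumptions here.

```python
def harrell_c_index(times, events, risks):
    """Harrell C index for right-censored data.

    times:  follow-up time for each patient.
    events: 1 if dislocation was observed at that time, 0 if censored.
    risks:  model output; higher means higher predicted risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # the earlier member of a pair must have an observed event
        for j in range(n):
            if i != j and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Three patients: events at 1 and 3 years, one censored at 4 years.
c_perfect = harrell_c_index([1, 3, 4], [1, 1, 0], [0.9, 0.5, 0.1])
c_reversed = harrell_c_index([1, 3, 4], [1, 1, 0], [0.1, 0.5, 0.9])
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of comparable pairs.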
All metrics were reported on the latest preoperative radiograph in patients in the test set and calculated using the SciPy package (v1.7.0; https://www.python.org) (27). CIs of the C indexes for survival models were calculated by using the survcomp package (v3.14) and compared using the compareC (v1.3.1) package in R language (R Foundation for Statistical Computing) (28). P values less than .05 were considered significant.
Results
Study Sample Characteristics
The final study sample contained 17 073 THA procedures (Fig 1) in 8184 men (48%) and 8889 women (52%) with a median age of 65 years (IQR: 18 years) and mean follow-up time of 4.3 years ± 4.0. The median number of images for each patient was four (IQR: four images). The incidence of 5-year dislocation was 2% (355 of 17 073). Table 2 summarizes patient characteristics in the training and test sets. Table E2 (supplement) shows the composition of modifiable surgical factors in patients with and without dislocations.
Table 2:
Patient Characteristics in Training and Internal Test Sets
Image Classifier
A total of 22 724 images from 129 different acquisition devices were used to train and validate the image classifier. This model achieved a mean AUC of 0.73 ± 0.02 across the validation folds (Table E3 [supplement]). The classifier was able to differentiate patients with implant dislocation from those who did not have dislocation within 5 years, with an AUC of 0.77 (95% CI: 0.74, 0.81) on the test set. The sensitivity of this model was 69% (25 of 36), with a specificity of 68% (469 of 685), a negative predictive value of 98% (469 of 480), and a positive predictive value of 10% (25 of 241).
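The four reported proportions are mutually consistent with a single confusion matrix, which the short worked example below reconstructs. The false-negative and false-positive counts are not quoted in the text; they are derived here from the reported denominators.

```python
# Counts taken from the reported fractions.
tp = 25          # dislocations correctly flagged (25 of 36)
fn = 36 - tp     # dislocations missed (derived)
tn = 469         # non-dislocations correctly cleared (469 of 685)
fp = 685 - tn    # non-dislocations flagged (derived)

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
npv = tn / (tn + fn)   # negative predictive value
ppv = tp / (tp + fp)   # positive predictive value

for name, value in [("sensitivity", sensitivity), ("specificity", specificity),
                    ("NPV", npv), ("PPV", ppv)]:
    print(f"{name}: {value:.1%}")
```

The derived denominators (480 predicted negatives and 241 predicted positives) match those reported, and the low positive predictive value follows from the rarity of the outcome rather than from poor sensitivity.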
Image Feature Selection and Survival Analysis
The clinical-only survival model achieved a C index of 0.64 (95% CI: 0.60, 0.68) on the holdout test. In comparison, the multimodal survival model, which leveraged both clinical and imaging features, achieved a C index of 0.74 (95% CI: 0.69, 0.78; P = .02). The SHAP summary plot of the multimodal XGBoost model and clinical-only model are presented in Figure 3 and Figure E1 (supplement), respectively. Four of the top five and 10 of the top 13 features that influenced the multimodal survival model’s predictions were imaging features. The integrated gradient maps of the 10 selected features are shown in Figure 4, demonstrating a focus of the model on the area of maximum load in the hip joint articulation and the acetabular teardrop.
Figure 3:
Shapley additive explanation (SHAP) summary plot of the final survival model. The chart color is determined by the feature value for the feature given on the left axis. For example, a posterior surgical approach (red) is associated with higher SHAP values or, in other words, higher risk of dislocation.
Figure 4:
Global integrated gradient attributions of the selected features.
The developed graphical user interface for the multimodal calculator starts by taking all the demographic information, relevant comorbidity variables, and latest preoperative AP pelvic radiograph for a patient. It then produces a matrix of all possible outputs of patient-specific risk on the basis of the 18 combinations of modifiable surgical variables (two femoral head component sizes, three acetabular liner types, and three surgical approaches). This matrix output enables a surgeon to see the range of risk possibilities for a specific patient and the degree to which that risk can be modified by various surgical strategies. Figure 5 demonstrates this procedure for two patients from the test set with known outcomes, highlighting patient-specific variability of risk and how the outputs may be visualized. For example, the patient in Figure 5B has a low risk at baseline, and the model matrix demonstrates essentially no impact of surgical decisions. By contrast, the patient in Figure 5C has a higher risk, and the model matrix shows that risk can be as high as 8% versus as low as 3% based on surgical decisions. Figures 5D and 5E show patient-specific SHAP plots, highlighting the relative contribution of various factors to model prediction.
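The matrix of 18 outputs can be thought of as one model call per combination of the three modifiable factors. The sketch below illustrates this enumeration only: the level labels are placeholders (the text gives the counts, and the third liner type and third approach are not named here), and `toy_predict_risk` is a hypothetical stand-in with arbitrary numbers, not the trained survival model or study results.

```python
from itertools import product

# Placeholder level names; only the number of levels is taken from the text.
HEAD_SIZES = ("smaller_head", "larger_head")
LINERS = ("standard", "dual_mobility", "third_liner_type")
APPROACHES = ("posterior", "direct_anterior", "third_approach")


def risk_matrix(patient_features, predict_risk):
    """Evaluate predict_risk once for every surgical combination."""
    matrix = {}
    for head, liner, approach in product(HEAD_SIZES, LINERS, APPROACHES):
        features = dict(patient_features, head_size=head, liner=liner,
                        approach=approach)
        matrix[(head, liner, approach)] = predict_risk(features)
    return matrix


def toy_predict_risk(features):
    """Hypothetical stand-in for the model; the numbers are arbitrary."""
    risk = 0.03
    if features["approach"] == "posterior":
        risk += 0.02
    if features["liner"] == "dual_mobility":
        risk -= 0.01
    return risk


patient = {"age": 68, "bmi": 26.9}
matrix = risk_matrix(patient, toy_predict_risk)
lowest = min(matrix, key=matrix.get)
```

Holding the nonmodifiable inputs fixed and varying only the surgical factors is what lets the output be read as the portion of risk under the surgeon's control.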
Figure 5:
(A) Graphical user interface of the proposed risk calculator. (B) Risk matrix for a 52-year-old woman with a body mass index (BMI) of 32.8 and no notable past medical diseases or surgeries (colors indicate the risk heatmap). (C) Risk matrix for a 68-year-old man with a BMI of 26.9 and minor spinal disease (colors indicate the risk heatmap). (D) The waterfall Shapley additive explanation (SHAP) plot for the patient presented in C with a combination of standard acetabular liner, femoral head component smaller than 32 mm, and posterior approach. (E) The waterfall SHAP plot for the patient presented in C with a combination of dual-mobility acetabular liner, femoral head component larger than 36 mm, and direct anterior approach.
Discussion
Implant dislocation is the most common complication of THA and causes serious morbidity. There are surgical choices that can reduce the risk of dislocation. A patient-specific risk prediction tool with dynamic output based on surgical decisions has remained elusive and has been a primary barrier to progress on this large-scale problem. In this study, a machine learning–based pipeline is introduced to evaluate risk of dislocation on the basis of a single preoperative radiograph in combination with clinical and surgical characteristics of patients undergoing primary THA. It showed that a fusion of imaging and clinical features was synergistic and produced optimal performance (C index, 0.74; P = .02). The multimodal model provides individualized risk and a matrix of outputs based on surgical decisions for consideration by a surgeon preoperatively.
As dislocation is a time-to-event outcome, a single classifier would not satisfy the needs of risk prediction; therefore, survival analysis methods were used in our study. Survival analysis can be considered a type of regression task, meaning that the model, regardless of type, will output a risk or a proxy measure of it that is inversely correlated with patient time-to-event duration (29). More importantly, survival analysis can handle censored patients (those who had undergone less than 5 years of follow-up), whereas these patients must be removed when training classifiers. If two patients have dislocation in 1 year (patient A) and 3 years (patient B), a binary classifier would assign the same label to both; however, a survival model will assign a higher risk value to the patient who had dislocation in 1 year. A third patient (patient C) who was followed up for 4 years but did not have dislocation should be excluded from the classifier training pool because of the lack of a definite 5-year outcome. A survival model, however, should assign a lower risk to this patient, compared with patient B, as patient C definitely had no dislocation by 3 years (the time-to-event for patient B).
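The difference between the two framings for patients A, B, and C can be made explicit with a small sketch. The function name and the tuple representation are illustrative assumptions; the 5-year horizon and the three patients come from the text.

```python
def five_year_label(time_years, dislocated, horizon=5.0):
    """Binary label for the classifier, or None when no definite label exists.

    time_years: time to dislocation, or length of follow-up if none occurred.
    dislocated: True if a dislocation was observed at time_years.
    """
    if dislocated and time_years <= horizon:
        return 1      # dislocation within the horizon
    if time_years >= horizon:
        return 0      # observed through the horizon without dislocation
    return None       # censored before the horizon: excluded from the classifier


# (time in years, dislocation observed)
patients = {"A": (1.0, True), "B": (3.0, True), "C": (4.0, False)}

classifier_labels = {name: five_year_label(t, e) for name, (t, e) in patients.items()}
# The survival model keeps every patient as a (time, event) pair.
survival_targets = {name: (t, int(e)) for name, (t, e) in patients.items()}
```

Patients A and B receive the same classifier label although their times differ, and patient C drops out of the classifier pool, whereas all three remain usable as survival targets.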
Our team has previously worked on a logistic regression-based risk calculator with a C index of 62% that uses demographic, past medical, and surgical variables to predict the dislocation risk of patients (10). That clinical calculator highlighted wide patient-specific dislocation risks (0.3%–45%), as well as the importance of modifiable surgical variables. To build upon this work, a multimodal survival model was developed that leverages both imaging and clinical data. A preoperative AP radiograph was used as the imaging input, given that it is standardized, low cost, and universally obtained prior to THA. Furthermore, this image has the potential for deriving risk scores for other THA complications, like periprosthetic femoral fracture, which could ultimately inform a more comprehensive THA risk prediction and personalized surgical planning tool.
The developed multimodal calculator outperformed the previously published logistic regression–based model by a wide margin in terms of C index. It also performed better than the clinical-only gradient boosting machine survival model developed in this study, demonstrating that the observed gain in performance was not attributable to architectural differences. Also, the SHAP plots highlight the importance of imaging features in predicting dislocation risk. There were differences between the SHAP values in the clinical-only model and the final multimodal model. For example, the effect size of patient sex and spinal surgery was different in the two models. A possible explanation for this difference is that variables like sex or past surgeries might have imaging signals that are detected in a proxy fashion by the model.
The superior predictive performance of imaging features compared with most clinical features in the multimodal model may provide new insight into the root causes of dislocation risk. The integrated gradient maps of the 10 most important imaging features in the model showed the most focus on regions of anatomic interest, including the joint line at the acetabular sourcil (the most common area for joint space narrowing indicating a THA) and the acetabular teardrop. These are discrete anatomic landmarks, and they enable conjecture into how the model may be determining subtleties of risk. Perhaps the model is gaining insight into the severity of arthritis, leg length, and hip joint offset and how these parameters compare in proportion to other landmarks, though extensive efforts are needed to gain insight into how a network reaches a decision.
There have been numerous studies using convolutional neural networks on musculoskeletal images, showing the applicability of these architectures in undertaking tasks like classification and segmentation by learning fine image details (30,31). More recently, transformers have achieved state-of-the-art results on public datasets (32). However, their application in medical image research is limited by the intuition that these models outperform convolutional neural networks only when there is an abundance of training data (33). To tackle this problem, convolutional vision transformers use several convolutional layers for patching and tokenizing the images, while Swin transformers use hierarchical resizing to get information from different scales of the image (34,35). In our design, convolutional layers were used in the same way as in convolutional vision transformers and were combined with Swin transformers to boost model performance on our moderate-sized dataset. Additionally, previous studies have shown that adding clinical data to machine learning models improves final model performance, so we used a multimodal model to better estimate patient dislocation risk (36,37).
The results of this study should be interpreted considering some limitations. First, patients from a single institution were evaluated, emphasizing the need for external testing using data from other institutions (38). It should be noted that our total joint registry includes data from a large population and captures every dislocation event for patients, regardless of whether it occurs in our health system. Although images were acquired over 20 years with more than 120 different devices, extensive multi-institutional studies are required to evaluate the algorithm’s generalizability. Second, as this was a retrospective study, the pipeline should be validated in prospective studies. Third, the model accounts for preoperative characteristics and imaging to determine how risk can be managed intraoperatively with surgical decisions. However, there are other elements of surgical technique that were prohibitively difficult to account for, such as implant positioning and soft-tissue management. Finally, rare outcomes, especially those with multifactorial determination like THA dislocation, limit the performance of survival models. In particular, acetabular liner diversity was low among patients with dislocation in our study, which needs further exploration in future multicenter studies.
In summary, a multimodal calculator for predicting dislocation was introduced by combining preoperative characteristics and radiographic features of patients undergoing primary THA. This study highlights the superiority of imaging features compared with clinical variables and the synergy between these modes of patient evaluation. This tool enables patient-specific dislocation risk prediction with an acceptable C index, and more importantly, shows the degree to which this risk is modified by decisions within a surgeon’s control. Ultimately, we envision this will underlie a powerful tool for personalized surgery to address the most common complication (dislocation) of THA surgery. Furthermore, the high performance of predicting dislocation on the basis of a single preoperative radiograph, and the resultant integrated gradient maps, provide new insights for orthopedic surgeons into the possible causes of dislocation.
Supported in part by the Mayo Foundation Presidential Fund and the National Institutes of Health (NIH) (grant nos. R01AR73147 and P30AR76312).
Disclosures of conflicts of interest: B.K. No relevant relationships. P.R. No relevant relationships. H.M.K. National Institutes of Health (NIH) grants. D.R.L. No relevant relationships. Q.J.J. No relevant relationships. S.F. No relevant relationships. W.K.K. NIH funds for research paid to institution; author provides statistical analyses to Data Safety Monitoring Boards; mutual funds or similar. B.J.E. Chair of Research Committee for Society for Imaging Informatics in Medicine; consultant to the editor for Radiology: Artificial Intelligence. R.J.S. Royalties or licenses from Zimmer Biomet, OrthAlign, and Link Orthopedics; consulting fees from Think Surgical and OrthAlign; patent planned, issued, or pending with Zimmer Biomet; leadership or fiduciary role with the American Association of Hip and Knee Surgeons, Muller Foundation, and the Academic Network of Conservational Hip Outcomes Research; receipt of equipment, materials, drugs, medical writing, gifts, or other services from Springer. M.J.T. No relevant relationships. C.C.W. No relevant relationships.
Abbreviations:
- AP = anteroposterior
- AUC = area under the receiver operating characteristic curve
- SHAP = Shapley additive explanation
- THA = total hip arthroplasty
- TT = temporal timestamp
References
- 1. Singh JA, Yu S, Chen L, Cleveland JD. Rates of total joint replacement in the United States: future projections to 2020-2040 using the national inpatient sample. J Rheumatol 2019;46(9):1134–1140.
- 2. Blom AW, Rogers M, Taylor AH, Pattison G, Whitehouse S, Bannister GC. Dislocation following total hip replacement: the Avon Orthopaedic Centre experience. Ann R Coll Surg Engl 2008;90(8):658–662.
- 3. Levine BR, Springer BD, Golladay GJ. Highlights of the 2019 American Joint Replacement Registry Annual Report. Arthroplast Today 2020;6(4):998–1000.
- 4. Rowan FE, Benjamin B, Pietrak JR, Haddad FS. Prevention of dislocation after total hip arthroplasty. J Arthroplasty 2018;33(5):1316–1324.
- 5. Esposito CI, Gladnick BP, Lee YY, et al. Cup position alone does not predict risk of dislocation after hip arthroplasty. J Arthroplasty 2015;30(1):109–113.
- 6. Wagner ER, Kamath AF, Fruth KM, Harmsen WS, Berry DJ. Effect of body mass index on complications and reoperations after total hip arthroplasty. J Bone Joint Surg Am 2016;98(3):169–179.
- 7. Malkani AL, Ong KL, Lau E, Kurtz SM, Justice BJ, Manley MT. Early- and late-term dislocation risk after primary hip arthroplasty in the Medicare population. J Arthroplasty 2010;25(6 Suppl):21–25.
- 8. Byström S, Espehaug B, Furnes O, Havelin LI; Norwegian Arthroplasty Register. Femoral head size is a risk factor for total hip luxation: a study of 42,987 primary hip arthroplasties from the Norwegian Arthroplasty Register. Acta Orthop Scand 2003;74(5):514–524.
- 9. Barnsley L, Barnsley L, Page R. Are hip precautions necessary post total hip arthroplasty? A systematic review. Geriatr Orthop Surg Rehabil 2015;6(3):230–235.
- 10. Wyles CC, Maradit-Kremers H, Larson DR, et al. Creation of a total hip arthroplasty patient-specific dislocation risk calculator. J Bone Joint Surg Am 2022;104(12):1068–1080.
- 11. Berry DJ, von Knoch M, Schleck CD, Harmsen WS. Effect of femoral head diameter and operative approach on risk of dislocation after primary total hip arthroplasty. J Bone Joint Surg Am 2005;87(11):2456–2463.
- 12. Rouzrokh P, Khosravi B, Johnson QJ, et al. Applying deep learning to establish a total hip arthroplasty radiography registry: a stepwise approach. J Bone Joint Surg Am 2022. doi: 10.2106/JBJS.21.01229. Published online July 21, 2022.
- 13. Tan M, Le Q. EfficientNet: rethinking model scaling for convolutional neural networks. In: Chaudhuri K, Salakhutdinov R, eds. Proceedings of the 36th International Conference on Machine Learning. PMLR; 09–15 Jun 2019; 6105–6114. https://proceedings.mlr.press/v97/tan19a.html. Accessed April 2, 2022.
- 14. Liu Z, Lin Y, Cao Y, et al. Swin transformer: hierarchical vision transformer using shifted windows. Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021; 10012–10022. https://openaccess.thecvf.com/content/ICCV2021/html/Liu_Swin_Transformer_Hierarchical_Vision_Transformer_Using_Shifted_Windows_ICCV_2021_paper.html. Accessed February 9, 2022.
- 15. Wightman R, Soare A, Arora A, et al. rwightman/pytorch-image-models: TPU VM Trained Weight release w/PyTorch XLA. Published March 18, 2022. Accessed April 2, 2022.
- 16. Lin TY, Goyal P, Girshick R, He K, Dollar P. Focal loss for dense object detection. In: 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, October 22–29, 2017. Piscataway, NJ: IEEE; 2017; 2999–3007.
- 17. You Y, Li J, Reddi S, et al. Large batch optimization for deep learning: training BERT in 76 minutes. arXiv 1904.00962 [preprint] https://arxiv.org/abs/1904.00962. Posted April 1, 2019. Accessed April 2, 2022.
- 18. The MONAI Consortium. Project MONAI. Published December 15, 2020. Accessed April 2, 2022.
- 19. Smith LN. A disciplined approach to neural network hyper-parameters: Part 1 – learning rate, batch size, momentum, and weight decay. arXiv 1803.09820 [preprint] https://arxiv.org/abs/1803.09820. Posted March 26, 2018. Accessed April 2, 2022.
- 20. Paszke A, Gross S, Massa F, et al. PyTorch: An imperative style, high-performance deep learning library. Adv Neural Inf Process Syst 2019;32. https://proceedings.neurips.cc/paper/2019/hash/bdbca288fee7f92f2bfa9f7012727740-Abstract.html. Accessed February 11, 2022.
- 21. Chen T, Guestrin C. XGBoost: A scalable tree boosting system. arXiv 1603.02754 [preprint] https://arxiv.org/abs/1603.02754. Posted March 9, 2016. Accessed April 2, 2022.
- 22. Vieira D, Gimenez G, Marmerola G, Estima V. XGBoost survival embeddings. 2021. doi: 10.5281/zenodo.6326018. Published February 9, 2021. Accessed April 2, 2022.
- 23. Mudrakarta PK, Taly A, Sundararajan M, Dhamdhere K. Did the model understand the question? arXiv 1805.05492 [preprint] https://arxiv.org/abs/1805.05492. Posted May 14, 2018. Accessed April 2, 2022.
- 24. captum: Model interpretability and understanding for PyTorch. Github. https://github.com/pytorch/captum. Accessed May 27, 2022.
- 25. Harrell FE Jr, Lee KL, Mark DB. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med 1996;15(4):361–387.
- 26. Lundberg SM, Nair B, Vavilala MS, et al. Explainable machine-learning predictions for the prevention of hypoxaemia during surgery. Nat Biomed Eng 2018;2(10):749–760.
- 27. Virtanen P , Gommers R , Oliphant TE , et al . SciPy 1.0: fundamental algorithms for scientific computing in Python . Nat Methods 2020. ; 17 ( 3 ): 261 – 272 . [Published correction appears in Nat Methods 2020;17(3):352.] [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28. Schröder MS , Culhane AC , Quackenbush J , Haibe-Kains B . survcomp: an R/Bioconductor package for performance assessment and comparison of survival models . Bioinformatics 2011. ; 27 ( 22 ): 3206 – 3208 . [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29. Crowson CS , Larson DR , Devick KL , et al . Living with survival analysis in orthopedics . J Arthroplasty 2021. ; 36 ( 10 ): 3358 – 3361 . [DOI] [PMC free article] [PubMed] [Google Scholar]
- 30. Rouzrokh P , Wyles CC , Philbrick KA , et al . A deep learning tool for automated radiographic measurement of acetabular component inclination and version after total hip arthroplasty . J Arthroplasty 2021. ; 36 ( 7 ): 2510 – 2517.e6 . [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31. Rouzrokh P , Ramazanian T , Wyles CC , et al . Deep learning artificial intelligence model for assessment of hip dislocation risk following primary total hip arthroplasty from postoperative radiographs . J Arthroplasty 2021. ; 36 ( 6 ): 2197 – 2203.e3 . [DOI] [PMC free article] [PubMed] [Google Scholar]
- 32. Khan S , Naseer M , Hayat M , Zamir SW , Khan FS , Shah M . Transformers in vision: a survey . ACM Comput Surv 2022. ; 54 ( 10s ): 1 – 41 . [Google Scholar]
- 33. Dosovitskiy A , Beyer L , Kolesnikov A , et al . An image is worth 16x16 words: transformers for image recognition at scale . arXiv 2010.11929 [preprint] https://arxiv.org/abs/2010.11929. Posted October 22, 2020. Accessed April 2, 2022 . [Google Scholar]
- 34. Liu Z , Lin Y , Cao Y , et al . Swin transformer: hierarchical vision transformer using shifted windows . arXiv 2103.14030 [preprint] https://arxiv.org/abs/2103.14030. Posted March 25, 2021. Accessed April 2, 2022 . [Google Scholar]
- 35. Wu H , Xiao B , Codella N , et al . CvT: Introducing convolutions to vision transformers . arXiv 2103.15808 [preprint] https://arxiv.org/abs/2103.15808. Posted March 29, 2021. Accessed April 2, 2022 . [Google Scholar]
- 36. Gao R , Tang Y , Khan MS , et al . Cancer risk estimation combining lung screening CT with clinical data elements . Radiol Artif Intell 2021. ; 3 ( 6 ): e210032 . [DOI] [PMC free article] [PubMed] [Google Scholar]
- 37. Al-Waisy AS , Abed Mohammed M , Al-Fahdawi S , et al . COVID-DeepNet: Hybrid multimodal deep learning system for improving COVID-19 pneumonia detection in chest X-ray images . Comput Mater Continua 2021. ; 67 ( 2 ): 2409 – 2429 . [Google Scholar]
- 38. Yu AC , Mohajer B , Eng J . External validation of deep learning algorithms for radiologic diagnosis: a systematic review . Radiol Artif Intell 2022. ; 4 ( 3 ): e210064 . [DOI] [PMC free article] [PubMed] [Google Scholar]








