Abstract
Background
Postoperative blood transfusion remains a significant concern following total knee arthroplasty. Clinical prediction models can facilitate early identification of patients at risk, enabling targeted blood management to reduce unnecessary transfusions and related complications. However, the predictive performance, methodological quality, and clinical applicability of these models remain uncertain. Therefore, we systematically reviewed existing models for predicting postoperative transfusion in total knee arthroplasty.
Methods
Ten English and Chinese databases were comprehensively searched from database inception to February 2025 to identify relevant studies. Two reviewers independently extracted data based on the Checklist for Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modelling Studies (CHARMS). The risk of bias and applicability of each study were evaluated using the Prediction model Risk Of Bias Assessment Tool (PROBAST). The areas under the curve (AUCs) of the included models were pooled using a random-effects meta-analysis. A leave-one-out sensitivity analysis and an exploratory subgroup meta-analysis by modelling approach were also conducted to explore the sources of heterogeneity. All statistical analyses were performed in Stata 17.0.
Results
Twelve studies involving eighteen models were included in this review. All studies developed their prediction models using logistic regression or machine learning methods. The most commonly used predictors were preoperative hemoglobin, age, body mass index, surgery duration, and the use of tranexamic acid. The pooled AUC for the six internally validated models was 0.83 (95% CI: 0.74–0.92), indicating relatively high discrimination. Sensitivity analysis did not materially change the estimates, and the subgroup meta-analysis showed that the modelling approach alone could not explain the heterogeneity (p = 0.406). However, all models were considered to have a high risk of bias, mainly owing to unsuitable study design and poor reporting within the analysis domain.
Conclusions
Although the included models demonstrated moderate to excellent discrimination for predicting postoperative transfusion after total knee arthroplasty, all studies were judged to be at high risk of bias according to PROBAST because of methodological shortcomings and inadequate external validation. Future research should focus on improving methodological quality and performing multicenter external validation to ensure clinical applicability.
Clinical trial number
Not applicable.
Supplementary Information
The online version contains supplementary material available at 10.1186/s12891-025-09164-z.
Keywords: Blood transfusion, Total knee arthroplasty, Clinical prediction model, Systematic reviews, Meta-Analyses
Introduction
Total knee arthroplasty (TKA) is widely recognized as an effective treatment for end-stage knee osteoarthritis (KOA). With the aging population and the rising prevalence of KOA, the volume of TKA in various countries has increased significantly [1–4]. This surgery effectively alleviates pain, restores joint function, and enhances patients’ quality of life [5]. Despite the high success rate of TKA, substantial blood loss and the subsequent need for postoperative blood transfusion remain major concerns [6]. Reported transfusion rates after TKA vary from 3.2% to 18.1% [7]. Although blood transfusion can effectively restore blood volume and save lives, blood products are a limited resource. Errors in clinical judgment can result in blood overuse, leading to unnecessary wastage of blood products and associated complications. Several studies have shown that transfusion is associated with increased risks of complications, including thromboembolic events and surgical site infections, which can prolong hospital stays, raise hospitalization costs, and increase mortality risk [8–10]. To avoid unnecessary transfusion, patient blood management (PBM) strategies have been implemented, such as preoperative erythropoietin administration, preoperative iron supplementation, and the application of tranexamic acid (TXA), which have shown certain benefits for patients [11, 12]. Therefore, accurately assessing transfusion risk and implementing a targeted PBM strategy are crucial for improving patient outcomes and conserving the blood resources of healthcare institutions.
Clinical prediction models can help clinicians recognize high-risk patients and guide diagnostic and therapeutic strategy selection to reduce adverse outcomes [13]. In recent years, several prediction models have been developed in diverse participant groups and with diverse predictors to identify the need for blood transfusion after TKA. Such models allow high-risk patients to be identified before surgery, facilitating appropriate preventive measures to decrease the rate of postoperative transfusion in patients undergoing TKA. Nonetheless, because of their varied designs and populations, the quality and effectiveness of these models have not been comprehensively assessed, which limits their widespread implementation.
To our knowledge, no systematic review has comprehensively evaluated the predictive performance and clinical applicability of these models. Therefore, this systematic review and meta-analysis aimed to identify and summarize existing prediction models for transfusion after TKA, critically appraise their methodological quality, and compare their predictive performance. The findings are intended to inform the refinement of future models and support more individualized clinical decision-making.
Methods
Study design
This systematic review was reported in accordance with the Transparent Reporting of Multivariable Prediction Models for Individual Prognosis or Diagnosis: Checklist for Systematic Reviews and Meta-analyses (TRIPOD-SRMA) statement [14] and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [15] (see Additional file 1). The study protocol was registered in PROSPERO (registration number CRD42024523294).
Data sources and search strategy
A thorough literature search was performed across ten databases, including both English and Chinese databases. Chinese databases comprised the China National Knowledge Infrastructure (CNKI), Wanfang, China Science and Technology Journal Database (VIP), and the China Biomedical Literature Service System (SinoMed). English databases included PubMed, Web of Science, Cochrane Library, CINAHL, Embase, and Scopus. The search covered the period from the inception of each database to February 19, 2025. Additionally, reference lists of relevant reviews and included studies were examined to identify further eligible studies.
Boolean operators (“OR” and “AND”) were applied to combine MeSH terms with free-text keywords. The search strategy framework proposed by Geersing et al. [16] was also used. The following keywords were used: “Arthroplasty, Replacement, Knee,” “Blood Transfusion,” “risk model,” “risk factor,” “risk assessment,” “predictive model,” “nomogram,” “scoring system,” “stratification,” “ROC curve,” “discrimination,” “c-statistics,” “calibration,” “algorithm,” “multivariable,” “indices,” and “AUC.” Detailed search strategies for each database are provided in Additional file 2.
Inclusion and exclusion criteria
Cohort and case-control studies that developed or validated at least one prediction model were included in this review. As recommended by the Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modelling Studies (CHARMS) checklist [17], the PICOTS framework was applied to define the inclusion criteria as follows:
P (Population): Patients aged ≥ 18 years who underwent total knee arthroplasty, including unilateral TKA, staged bilateral TKA, and simultaneous bilateral TKA.
I (Intervention model): Any developed and published prediction model that predicts the risk of blood transfusion following TKA and includes at least two predictors.
C (Comparator): None.
O (Outcome): The primary outcome of interest was the occurrence of blood transfusion after TKA surgery.
T (Timing): Any time point during the postoperative period.
S (Setting): Prediction models applied in ambulatory or inpatient wards.
The following types of studies were excluded from the review: (1) studies that were not peer-reviewed; (2) studies that did not establish a prediction model; (3) studies published in languages other than English or Chinese; and (4) studies with unavailable full text or duplicate publications.
Study selection and screening
Two reviewers independently identified relevant articles according to the following steps. First, duplicate records were removed using EndNote (version X9.3.3, Clarivate, Philadelphia, Pennsylvania, USA). Titles and abstracts from the preliminary search were then screened using the Rayyan platform (https://new.rayyan.ai/), a free, web-based document management platform for systematic reviews that can also detect and remove duplicates. Subsequently, the two reviewers independently assessed the full texts of the selected articles to exclude those deemed irrelevant. Any disagreements regarding inclusion were resolved through discussion with a third reviewer.
Data extraction and synthesis
Two reviewers independently extracted data based on the Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modelling Studies (CHARMS) checklist [17]. Data were extracted from the included studies and organized into two main sections: (1) general characteristics: first author, year of publication, country, study design, data source, study setting, participant characteristics, outcome (transfusion indications), outcome measurement (timing of prediction), sample size, and number of events; and (2) prediction model information: handling of missing data and continuous variables, method of predictor selection, final predictors, modelling methods, model validation (internal or external), model performance, including discrimination (area under the curve or C-index) and calibration (calibration plot or Hosmer-Lemeshow test), and format of model presentation. Any inconsistencies were resolved through discussion with a third author. When essential information was missing or unclear, the corresponding authors of the original studies were contacted via email to obtain additional information. The extracted data were synthesized narratively, and the key information was tabulated using Microsoft Excel (version 2021, Microsoft Corporation). In addition, the frequency of final predictors was displayed in a bar chart.
Quality assessment
Each included study was independently assessed by two authors using the Prediction Model Risk Of Bias Assessment Tool (PROBAST) [18] checklist, which evaluates both the risk of bias and applicability concerns in diagnostic or prognostic prediction model studies. The tool comprises 20 signaling questions divided into four domains: participants, predictors, outcome, and analysis. Each question was answered as “yes,” “probably yes,” “no,” “probably no,” or “no information” [19]. Each domain was rated as having a low, high, or unclear risk of bias. A domain was considered to have a high risk of bias if any signaling question within that domain was answered as “no” or “probably no.” Any disagreements between authors were resolved through discussion until a consensus was reached.
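The rating rule described above can be sketched in a few lines of Python. The review states only the rule for a high-risk domain; the rules for "low" and "unclear" and the overall judgement follow the usual PROBAST convention and are assumptions here, as are the function names `rate_domain` and `rate_overall`.

```python
from typing import Iterable, Mapping

NEGATIVE = {"no", "probably no"}
POSITIVE = {"yes", "probably yes"}


def rate_domain(answers: Iterable[str]) -> str:
    """Rate one domain from the answers to its signaling questions."""
    cleaned = [a.strip().lower() for a in answers]
    if any(a in NEGATIVE for a in cleaned):
        return "high"      # rule stated in the review
    if all(a in POSITIVE for a in cleaned):
        return "low"       # assumed: every answer is (probably) yes
    return "unclear"       # assumed: some "no information", nothing negative


def rate_overall(domains: Mapping[str, str]) -> str:
    """Combine the four domain ratings into an overall risk-of-bias rating."""
    ratings = list(domains.values())
    if "high" in ratings:
        return "high"
    if all(r == "low" for r in ratings):
        return "low"
    return "unclear"


# Domain ratings of Hu 2020 [30] as listed in Table 4.
hu_2020 = {
    "participants": "high",
    "predictors": "low",
    "outcome": "unclear",
    "analysis": "high",
}
print(rate_overall(hu_2020))
```

Under this rule a single high-risk domain is enough to make the overall rating high, which is why every study in this review ends up in the high-risk category.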
Statistical analysis
A meta-analysis was performed using Stata software (version 17, Stata Corporation, College Station, Texas, USA). Given the clinical and methodological differences across studies, the discrimination of the included models was pooled using a random-effects model rather than a fixed-effects model [20]. The 95% confidence interval was calculated using restricted maximum likelihood (REML) estimation and the Hartung-Knapp-Sidik-Jonkman (HKSJ) method [20]. Statistical heterogeneity across studies was evaluated using Cochran’s Q test. Additionally, the I² statistic was used to quantify heterogeneity, with values of 25%, 50%, and 75% indicating low, moderate, and high heterogeneity, respectively [21]. Publication bias was not quantified because the Cochrane Collaboration guidelines do not recommend funnel-plot asymmetry tests (e.g., Egger’s test) when a meta-analysis includes fewer than 10 studies [22]. As a sensitivity analysis, the leave-one-out method was used to assess the robustness of the result; this method re-estimates the summary effect after excluding each study in turn to identify influential studies [23]. Additionally, given the limited number of studies, only an exploratory subgroup meta-analysis by modelling method was conducted.
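The pooling steps described above can be illustrated with a minimal Python sketch. It is not the Stata procedure used in this review: it swaps in the closed-form DerSimonian-Laird estimator for REML, back-calculates standard errors from the reported 95% CIs under a normal approximation, and pools on the untransformed AUC scale. The function name `random_effects_pool` and the `t_crit` argument are hypothetical, and the output is not expected to reproduce the reported interval exactly.

```python
import math
from typing import Dict, Sequence


def random_effects_pool(
    estimates: Sequence[float],
    std_errors: Sequence[float],
    t_crit: float,
) -> Dict[str, float]:
    """Random-effects pooling (DerSimonian-Laird tau^2, HKSJ interval).

    t_crit is the two-sided 95% critical value of Student's t with k - 1
    degrees of freedom (about 2.571 for k = 6); it is passed in so that the
    sketch needs only the standard library.
    """
    k = len(estimates)
    if k < 2 or k != len(std_errors):
        raise ValueError("need at least two studies with matching inputs")
    variances = [se ** 2 for se in std_errors]

    # Inverse-variance (fixed-effect) weights and Cochran's Q.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    mu_fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w
    q = sum(wi * (yi - mu_fixed) ** 2 for wi, yi in zip(w, estimates))
    df = k - 1

    # Between-study variance and I^2 (in percent).
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)
    i2 = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0

    # Random-effects weights and pooled estimate.
    w_re = [1.0 / (v + tau2) for v in variances]
    sum_w_re = sum(w_re)
    pooled = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum_w_re

    # Hartung-Knapp-Sidik-Jonkman variance with a t-based interval.
    var_hksj = sum(
        wi * (yi - pooled) ** 2 for wi, yi in zip(w_re, estimates)
    ) / (df * sum_w_re)
    half_width = t_crit * math.sqrt(var_hksj)
    return {
        "pooled": pooled,
        "ci_low": pooled - half_width,
        "ci_high": pooled + half_width,
        "q": q,
        "tau2": tau2,
        "i2": i2,
    }


# AUC, lower and upper 95% CI of the six internally validated models (Table 3).
AUCS = [
    (0.97, 0.921, 1.0),     # Faure 2024
    (0.90, 0.87, 0.93),     # Kolin 2023
    (0.842, 0.820, 0.856),  # Jo 2020
    (0.824, 0.740, 0.909),  # Liu 2024
    (0.797, 0.794, 0.800),  # Mohammed 2022 (GBM)
    (0.652, 0.612, 0.691),  # Li 2024
]
estimates = [auc for auc, _, _ in AUCS]
# Assumption: SE recovered from the CI width, SE = (upper - lower) / (2 * 1.96).
std_errors = [(hi - lo) / (2 * 1.96) for _, lo, hi in AUCS]
result = random_effects_pool(estimates, std_errors, t_crit=2.571)
print({key: round(value, 3) for key, value in result.items()})
```

Because the between-study variance dominates the within-study variances here, the random-effects weights are close to equal, so even a very large study such as Mohammed 2022 does not dominate the pooled estimate.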
Results
Study selection
Figure 1 shows the literature selection process of this systematic review. A total of 5,896 records were retrieved initially. After removal of 2,603 duplicates, 3,293 titles and abstracts were screened against the predefined inclusion and exclusion criteria, and 50 full-text articles were retrieved for further evaluation. Of these, 21 studies were excluded because they did not develop prediction models or focused only on risk factors, 10 because of unmatched study populations, four because of inconsistent results, two because they were duplicate publications, and one because it was not peer-reviewed. Ultimately, 12 studies reporting 18 prediction models were included in this review.
Fig. 1.
Flowchart of literature search and selection
General characteristics of included studies
Table 1 presents the general characteristics of the included studies, which were published from 2012 to 2024. The 12 studies were performed in China, the United States, the United Kingdom, Brazil, South Korea, and France. All included studies had a retrospective design. Regarding the type of TKA, six studies [24–29] focused on unilateral TKA, five [30–34] focused on both unilateral and bilateral TKA, and only Li et al. [35] focused on simultaneous bilateral TKA. Regarding outcome definitions, eight studies [24, 26–31, 35] defined transfusion based on postoperative hemoglobin (Hb) levels. Mohammed et al. [33] defined the outcome as a blood transfusion requirement according to the International Classification of Diseases, 9th Revision (ICD-9) procedure codes. Liu et al. [32] defined the outcome by calculating total blood loss ratios according to the Gross formula [36], and the remaining two studies [25, 34] did not report a clear definition. As for the prediction window, four studies reported precise time points, namely within 14 days postoperatively [30, 31, 35] and within 48 h [29], while the remaining studies did not report specifics. The number of participants in the included studies ranged from 234 to 636,062, and the reported rate of blood transfusion ranged from 3.2% to 37.47%.
Table 1.
Overview of the general characteristics of included studies
| First author | Country | Study design | Data source | Participants | Outcome definition | Timing of prediction | Event fraction |
|---|---|---|---|---|---|---|---|
| Ahmed 2012 [24] | United Kingdom | Retrospective cohort study | Regional arthroplasty database of a hospital | Patients undergoing primary unilateral TKA | Postoperative Hb < 8.5 g/dL | Postoperatively | 227/2281 (10%) |
| Noticewala 2012 [25] | America | Retrospective cohort study | Hip and Knee Replacement center of a hospital | Patients undergoing primary unilateral TKA | - | Any point in the postoperative course before discharge | 71/644 (11%) |
| Cavazos 2023 [26] | America | Retrospective cohort study | Trauma center of a hospital | Patients undergoing primary unilateral TKA | Postoperative Hb < 7 g/dL | Postoperatively | 67/2093 (3.2%) |
| Faure 2024 [27] | France | Retrospective cohort study | Hospital electronic database of a hospital | Patients (age > 18) undergoing unilateral TKA | Postoperative Hb < 7 g/dL | Postoperatively | 100/774 (12.9%) |
| Kolin 2023 [28] | America | Retrospective cohort study | Inpatient setting of a hospital | Patients undergoing primary single-stage TKA | Postoperative Hb < 7 g/dL | Postoperatively | 543/14,188 (4%) |
| Mozella 2021 [29] | Brazil | Retrospective cohort study | National Institute of Traumatology and Orthopedics (INTO) | Patients (age 30–83) undergoing unilateral TKA | Postoperative Hb < 7 g/dL | Within 48 h postoperatively | 79/234 (33.7%) |
| Hu 2020 [30] | China | Retrospective cohort study | Inpatient orthopedic unit of a hospital | Patients undergoing unilateral or bilateral TKA | Postoperative Hb < 7 g/dL or Hb < 8 g/dL with symptoms of anemia | Within 14 days postoperatively | Development: 391/5402 (7.2%); validation: 148/1116 (13.3%) |
| Jo 2020 [31] | South Korea | Retrospective cohort study | Electronic medical recording system and clinical data warehouse system of an institution | Patients undergoing unilateral TKA, staged bilateral TKA, or simultaneous bilateral TKA | Postoperative Hb < 7 g/dL | Within 14 days postoperatively | Development: 108/1686 (7.2%); validation: 7/400 (1.8%) |
| Liu 2024 [32] | China | Retrospective cohort study | Hospital electronic database of a hospital | Patients undergoing TKA | TBL/EBV ratio > 20% | Postoperatively | 73/329 (22.2%) |
| Mohammed 2022 [33] | America | Retrospective cohort study | Inpatient data of the National Inpatient Sample (NIS) | Patients undergoing TKA | ICD-9 procedure codes for blood transfusion | Postoperatively | 73,020/636,062 (11.41%) |
| Chen 2021 a [34] | China | Retrospective cohort study | Hospital electronic medical records of a hospital | Patients (age > 18) undergoing elective TKA | - | Postoperatively | 107/634 (16.9%) |
| Li 2024 a [35] | China | Retrospective cohort study | Hospital’s electronic medical database of a hospital | Patients undergoing simultaneous bilateral TKA | Postoperative Hb < 7 g/dL or Hb < 8 g/dL with symptoms of anemia | Within 14 days postoperatively | 323/862 (37.47%) |
TKA Total knee arthroplasty, Hb Hemoglobin
a Study was published in Chinese
Characteristics of prediction models information
Table 2 summarizes the model information of the included studies. Regarding the modelling method, over half of the studies [24, 25, 28–30, 32, 35] developed models using logistic regression (LR), two [27, 31] used the gradient boosting machine (GBM) method, and one [26] used a message passing neural network (MPNN). The remaining two studies [33, 34] used both LR and machine learning methods to develop and compare models. As for candidate predictors, all studies selected multi-dimensional predictors, including demographic and laboratory ones, and the number of candidate predictors ranged from 7 to 43. As for the handling of continuous variables, ten studies did not transform continuous variables, while two studies [24, 33] converted continuous variables into categorical ones. Regarding the handling of missing data, two studies [28, 33] applied imputation methods, one study [27] applied the K-Nearest Neighbors (KNN) algorithm, one [32] excluded records with missing data directly, and the remaining eight studies [24–26, 29–31, 34, 35] did not report their approach. Univariate analysis was frequently used to select initial predictors [24–26, 29, 30, 32, 35], whereas one study [28] selected predictors based on expert opinion. Multivariate logistic regression was commonly used to determine the final predictors, with one study [24] using forward selection, two [29, 32] using stepwise selection, one [33] using backward selection, and one [30] using a combination of the Least Absolute Shrinkage and Selection Operator (LASSO) algorithm and multivariate analysis. Furthermore, among the studies using machine learning, two [27, 31] used Recursive Feature Elimination (RFE) to select the final predictors, two [33, 34] used variable importance analysis, and one [26] used the MPNN algorithm.
Table 2.
Overview of the prediction model information and performance of the included studies
| Study | Modelling methods | Candidate predictors (N) | Handling of continuous variables | Handling of missing data | Predictor selection in modelling (Prior/Final) | Final predictors | Model validation | Model presentation |
|---|---|---|---|---|---|---|---|---|
| Ahmed 2012 [24] | LR | 9 | Categorical variables | - | Univariate analysis / Forward conditional | 3: age, weight, and preoperative Hb | - | Equation |
| Noticewala 2012 [25] | LR | 31 | Continuous variables | - | Univariate analysis / Multiple LR | 4: age, anemia, preoperative Hb, and surgery time | External: Temporal | Equation |
| Cavazos 2023 [26] | MPNN | 13 | Continuous variables | - | Univariate analysis / MPNN | 11: preoperative Hb, preoperative creatinine level, surgery time, simultaneous bilateral surgeries, TXA use, ASA score, preoperative ALB, ethanol use, preoperative anticoagulation use, age, and surgery type | Internal: Split sample (7:3) | - |
| Faure 2024 [27] | GBM | 25 | Continuous variables | KNN | RFE | 5: age, BMI, TXA use, preoperative Hb, and platelet count | Internal: Split sample (7:3) | Online tool |
| Kolin 2023 [28] | LR | 8 | Continuous variables | Means and Bag Imputation | Expert opinions / Multiple LR | 8: age, sex, BMI, preoperative Hb, TXA use, ASA, surgery time, and drain use | Internal: Split sample (7:3) | Equation |
| Mozella 2021 [29] | LR | 7 | Continuous variables | - | Univariate analysis / Stepwise forward | 2: preoperative Hb and intraoperative ischemia time | - | Equation |
| Hu 2020 [30] | LR | 17 | Continuous variables | - | Univariate analysis / LASSO and Multiple LR | 5: age, BMI, surgery type, CHD, and preoperative Hb | External: Temporal | Nomogram |
| Jo 2020 [31] | GBM | 43 | Continuous variables | - | RFE | 6: TXA use, surgery type, platelet count, age, body weight, and preoperative Hb | Internal: Tenfold cross; External: Geographical | Web-based risk-assessment system |
| Liu 2024 [32] | LR | 24 | Continuous variables | Delete | Univariate LR / Stepwise forward | 4: TXA use, preoperative ESR, HCT, and ALB | Internal: Split sample (7:3), bootstrap | Nomogram |
| Mohammed 2022 [33] | LR, GBM, RF, ANN | 37 | Categorical variables | SRMI | Variable importance analysis and backward selection (LR model) | 15: admission month, year of admission, patient location, deficiency anemia, median household income for the patient, age, race, sex, health insurance, fluid and electrolyte disorders, hypertension, obesity, diabetes (uncomplicated), control/ownership of hospital, and hospital bed size | Internal: Split sample (5:2:3) | - |
| Chen 2021 a [34] | LR, SVM, RF, XGBoost | 16 | Continuous variables | - | Variable importance analysis | 5: preoperative Hb, age, surgery time, BMI, and surgery type | Internal: Split sample (8:2), fivefold cross | - |
| Li 2024 a [35] | LR | 14 | Continuous variables | - | Univariate analysis / Multiple LR | 4: preoperative Hb, age, BMI, and disease duration | Internal: Bootstrap | Nomogram |
CHD Coronary heart disease, Hb Hemoglobin, ESR Erythrocyte sedimentation rate, HCT Hematocrit, ALB Albumin, BMI Body mass index, TXA Tranexamic acid, ASA American Society of Anesthesiologists, SRMI Sequential regression multiple imputation, KNN K-nearest neighbors, RFE Recursive feature elimination, LR Logistic regression, LASSO Least absolute shrinkage and selection operator, GBM Gradient boosting machine, RF Random forest, SVM Support vector machine, XGBoost eXtreme gradient boosting, ANN Artificial neural network, MPNN Message passing neural network
-: not reported
Characteristics of included predictors
The number of final predictors in the models varied between 2 and 15 (Table 2), and their frequency of occurrence is shown in Fig. 2. The most frequent predictors were preoperative hemoglobin and age, each appearing in 10 studies (83.3%). Other commonly identified predictors were the use of TXA, BMI, type of surgery, and duration of surgery. Six predictors (sex, preoperative platelet count, preoperative albumin level, body weight, preoperative anemia, and ASA score) each appeared in two studies. The remaining 11 predictors were each used in only one study.
Fig. 2.
Frequency of predictors used in the development models
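The tally behind this frequency summary can be reproduced from the final predictors listed in Table 2. The sketch below is illustrative only: the harmonized predictor labels (for example, treating "weight" and "body weight", or "anemia" and "deficiency anemia", as the same predictor) are assumptions made for this example and not necessarily the grouping used for Fig. 2.

```python
from collections import Counter

# Final predictors per study, transcribed from Table 2 with harmonized labels.
FINAL_PREDICTORS = {
    "Ahmed 2012": ["age", "body weight", "preoperative Hb"],
    "Noticewala 2012": ["age", "preoperative anemia", "preoperative Hb",
                        "surgery time"],
    "Cavazos 2023": ["preoperative Hb", "creatinine", "surgery time",
                     "simultaneous bilateral surgery", "TXA use", "ASA",
                     "albumin", "ethanol use", "anticoagulation use", "age",
                     "surgery type"],
    "Faure 2024": ["age", "BMI", "TXA use", "preoperative Hb",
                   "platelet count"],
    "Kolin 2023": ["age", "sex", "BMI", "preoperative Hb", "TXA use", "ASA",
                   "surgery time", "drain use"],
    "Mozella 2021": ["preoperative Hb", "ischemia time"],
    "Hu 2020": ["age", "BMI", "surgery type", "CHD", "preoperative Hb"],
    "Jo 2020": ["TXA use", "surgery type", "platelet count", "age",
                "body weight", "preoperative Hb"],
    "Liu 2024": ["TXA use", "ESR", "HCT", "albumin"],
    "Mohammed 2022": ["admission month", "admission year", "patient location",
                      "preoperative anemia", "household income", "age", "race",
                      "sex", "health insurance",
                      "fluid and electrolyte disorders", "hypertension",
                      "obesity", "diabetes", "hospital control/ownership",
                      "hospital bed size"],
    "Chen 2021": ["preoperative Hb", "age", "surgery time", "BMI",
                  "surgery type"],
    "Li 2024": ["preoperative Hb", "age", "BMI", "disease duration"],
}

# Count each predictor once per study, then express it as a share of studies.
counts = Counter(
    predictor
    for predictors in FINAL_PREDICTORS.values()
    for predictor in set(predictors)
)
n_studies = len(FINAL_PREDICTORS)
for predictor, n in counts.most_common(6):
    print(f"{predictor}: {n} studies ({100 * n / n_studies:.1f}%)")
```

With this grouping, preoperative Hb and age each appear in 10 of the 12 studies, matching the 83.3% reported above.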
Characteristics of model validation
Table 2 displays the model validation of the included studies. Seven studies [26–28, 32–35] conducted only internal validation (58.3%), two studies [25, 30] conducted only external validation, only the study by Jo et al. [31] conducted both internal and external validation, and two studies [24, 29] did not report any validation after model development. Regarding the method of internal validation, six studies [26–28, 32–34] used random sample splitting, and two [31, 35] used only ten-fold cross-validation and bootstrap resampling, respectively. Of the six studies that used random sample splitting, two [32, 34] additionally applied bootstrap resampling and five-fold cross-validation, respectively. Of the three externally validated studies, two [25, 30] used temporal validation and one [31] used geographical validation.
Characteristics of model presentation
As for the presentation of included models (Table 2), four studies [24, 25, 28, 29] presented models as equations, three [30, 32, 35] presented models as a nomogram, and two [27, 31] presented models as an online calculator. The remaining studies did not report the presentation format.
Characteristics of model performance
As shown in Table 3, the discrimination of the included models was commonly reported using the AUC, while two studies [25, 29] did not report model discrimination. All models that underwent internal validation demonstrated moderate to good discrimination, with reported AUC values ranging from 0.652 [35] to 0.97 [27]. Among the models that underwent external validation, those of Hu et al. [30] and Jo et al. [31] showed good discrimination, with reported AUCs of 0.839 and 0.880, respectively. Calibration results were reported in five studies [24, 30, 32, 33, 35]. Three studies [30, 32, 35] used calibration curves, which suggested good calibration. Ahmed et al. [24] evaluated calibration using the Hosmer-Lemeshow (HL) test, which suggested good calibration in the development set (HL P-values > 0.1), while Mohammed et al. [33] used Brier scores to assess the calibration of their four models in the test set, with Brier scores ranging from 0.088 to 0.095. Other indices, including the Youden index, sensitivity, specificity, negative predictive value, positive predictive value, accuracy, and F1 score, were also used to report model performance. In addition, three studies [30, 32, 35] used decision curve analysis (DCA) to assess the clinical benefit of the prediction models. Specifically, Hu et al. [30] showed high net benefits over threshold ranges of 0.2–0.94 and 0.1–0.62 in the training and external validation sets, respectively. The other two models [32, 35] also showed good clinical utility.
Table 3.
Summary of prediction models performance of the included studies
| Study | Discrimination measure | Calibration measure | Other indexes |
|---|---|---|---|
| Ahmed 2012 [24] | AUC = 0.74 (95% CI: 0.7–0.775) | Hosmer-Lemeshow test: χ² = 9.36, P = 0.313 | Optimal cutoff level = 0.1; Se = 71%; Sp = 71%; NPV = 96%; PPV = 21% |
| Noticewala 2012 [25] | - | - | Se external = 90%; Sp external = 52.5% |
| Cavazos 2023 [26] | AUC internal = 0.894 | - | Accuracy train = 97.2%; Accuracy internal = 95.8% |
| Faure 2024 [27] | AUC internal = 0.97 (95% CI: 0.921–1) | - | Youden index internal = 0.8; Se internal = 94.4%; Sp internal = 85.4%; Accuracy internal = 89.9% |
| Kolin 2023 [28] | AUC internal = 0.90 (95% CI: 0.87–0.93) | - | Youden index internal = 0.6; Se internal = 78%; Sp internal = 87%; Accuracy internal = 97% |
| Mozella 2021 [29] | - | - | - |
| Hu 2020 [30] | AUC train = 0.884 (95% CI: 0.865–0.903); AUC external = 0.839 (95% CI: 0.773–0.905) | Calibration curve train: high consistency; Calibration curve external: good agreement | DCA train: better benefit (0.2–0.94); DCA external: better benefit (0.1–0.62) |
| Jo 2020 [31] | AUC internal = 0.842 (95% CI: 0.820–0.856); AUC external = 0.880 (95% CI: 0.844–0.910) | - | Youden index internal = 0.0687; Se internal = 89.8%; Sp internal = 74.8% |
| Liu 2024 [32] | AUC train = 0.855 (95% CI: 0.800–0.910); AUC internal = 0.824 (95% CI: 0.740–0.909) | Calibration curve train: high consistency; Calibration curve internal: good agreement | DCA: showed that the nomogram would provide a high net benefit |
| Mohammed 2022 [33] | LR: AUC test = 0.707 (95% CI: 0.704–0.711); GBM: AUC test = 0.797 (95% CI: 0.794–0.800); RF: AUC test = 0.783 (95% CI: 0.780–0.787); ANN: AUC test = 0.812 (95% CI: 0.805–0.820) | LR: Brier score test = 0.095; GBM: Brier score test = 0.091; RF: Brier score test = 0.094; ANN: Brier score test = 0.088 | - |
| Chen 2021 a [34] | LR: AUC internal = 0.816; SVM: AUC internal = 0.864; RF: AUC internal = 0.773; XGBoost: AUC internal = 0.888 | - | LR: Se internal = 88.9%, Sp internal = 50%, Accuracy internal = 81.6%, F1 internal = 0.897; SVM: Se internal = 100%, Sp internal = 72.7%, Accuracy internal = 86.4%, F1 internal = 0.972; RF: Se internal = 91.2%, Sp internal = 100%, Accuracy internal = 92.0%, F1 internal = 0.954; XGBoost: Se internal = 91.2%, Sp internal = 100%, Accuracy internal = 88.8%, F1 internal = 0.954 |
| Li 2024 a [35] | AUC internal = 0.652 (95% CI: 0.612–0.691) | Calibration curve internal: good agreement | Se internal = 52.01%; Sp internal = 85.34%; Accuracy internal = 72.85%; PPV internal = 68.02%; DCA: showed that the validation cohort had good potential for clinical utility |
AUC Area under the receiver operating characteristic curve, train Training set, test Test set, internal Internal validation, external External validation, Se Sensitivity, Sp Specificity, NPV Negative predictive value, PPV Positive predictive value, DCA Decision curve analysis, -: not reported
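Several of the indices in Table 3 are simple functions of sensitivity, specificity, and the event fraction, so they can be cross-checked by hand. The following sketch uses the standard definitions (Youden index = Se + Sp − 1; PPV and NPV from Bayes' rule; Brier score as the mean squared difference between predicted probability and observed outcome). The function names are hypothetical, and the Brier example uses made-up predictions because no individual-level data are available in this review.

```python
from typing import Sequence, Tuple


def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden index J = Se + Sp - 1."""
    return sensitivity + specificity - 1.0


def predictive_values(
    sensitivity: float, specificity: float, prevalence: float
) -> Tuple[float, float]:
    """Return (PPV, NPV) implied by Se, Sp and the event fraction."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


def brier_score(probabilities: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between predicted risk and the 0/1 outcome."""
    if len(probabilities) != len(outcomes) or not probabilities:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)


# Faure 2024 [27]: Se = 94.4%, Sp = 85.4% (Table 3).
print(round(youden_index(0.944, 0.854), 3))

# Ahmed 2012 [24]: Se = Sp = 71%, event fraction 227/2281 (Tables 1 and 3).
ppv, npv = predictive_values(0.71, 0.71, 227 / 2281)
print(round(ppv, 2), round(npv, 2))

# Illustrative (made-up) predictions for three patients.
print(round(brier_score([0.1, 0.8, 0.3], [0, 1, 0]), 4))
```

For Faure 2024 the computed Youden index (0.798) agrees with the tabulated 0.8, and for Ahmed 2012 the implied PPV and NPV (0.21 and 0.96) agree with the reported 21% and 96%.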
Risk of bias and applicability in prediction models
Table 4 and Fig. 3 summarize the risk of bias (ROB) and applicability of the included studies, and detailed results are presented in Additional file 3. Overall, all studies were assessed as having a high risk of bias.
Table 4.
PROBAST results of the included studies
| Study | ROB: Participants | ROB: Predictors | ROB: Outcome | ROB: Analysis | Applicability: Participants | Applicability: Predictors | Applicability: Outcome | Overall ROB | Overall applicability |
|---|---|---|---|---|---|---|---|---|---|
| Ahmed 2012 [24] | - | ? | ? | - | + | + | + | - | + |
| Noticewala 2012 [25] | - | ? | ? | - | + | + | ? | - | ? |
| Cavazos 2023 [26] | - | ? | ? | - | + | + | + | - | + |
| Faure 2024 [27] | - | ? | ? | - | + | + | + | - | + |
| Kolin 2023 [28] | - | ? | ? | - | + | + | + | - | + |
| Mozella 2021 [29] | - | ? | ? | - | + | + | + | - | + |
| Hu 2020 [30] | - | + | ? | - | + | + | + | - | + |
| Jo 2020 [31] | - | ? | ? | - | + | + | + | - | + |
| Liu 2024 [32] | - | ? | ? | - | + | + | + | - | + |
| Mohammed 2022 [33] | - | ? | ? | - | + | + | ? | - | ? |
| Chen 2021a [34] | - | ? | ? | - | + | + | ? | - | ? |
| Li 2024 a [35] | - | ? | ? | - | + | + | + | - | + |
PROBAST Prediction model Risk Of Bias Assessment Tool, ROB Risk of bias
+: low ROB/low concern regarding applicability, -: high ROB/high concern regarding applicability, ?: unclear ROB/unclear concern regarding applicability
Fig. 3.
Graphical summary of the PROBAST ratings as percentages of the included studies. A, Risk of bias (ROB). B, Applicability
In the participant domain, all studies had a high risk of bias because of their retrospective design. In the predictor domain, only the study by Hu et al. [30] was considered to have a low risk of bias because it reported quality control measures to minimize bias, while the remaining studies had an unclear risk of bias. In the outcome domain, three studies [25, 33, 34] had an unclear risk of bias due to unclear outcome definitions, and none of the studies reported blinded assessment of outcomes.
In the analysis domain, all studies were deemed to have a high risk of bias. Four studies [25, 26, 31, 32] had insufficient sample sizes and did not meet the recommended standard of more than 20 events per variable (EPV) [19]. Two studies [24, 33] converted all continuous variables into categorical ones without explanation. One study [26] did not report information about excluded participants. Eight studies [24–26, 29–31, 34, 35] did not report how missing data were handled, and one study [32] handled missing data inappropriately. Seven studies [24–26, 29, 30, 32, 35] selected predictors based on univariate analysis. Two studies [25, 29] did not report model discrimination, and seven [25–29, 31, 34] did not report how model calibration was assessed. Eight studies did not account for model underfitting or overfitting when evaluating performance: four [24, 25, 29, 30] did not conduct internal validation, while four [15, 25, 26, 30] used only random sample splitting for internal validation. Only two studies [28, 32] considered complexities in the data, using variance inflation factor (VIF) analysis to check for collinearity among predictors; the remaining studies did not report how such complexities were handled. Six studies [24, 25, 28, 30, 32, 35] reported the coefficients of the predictors in their regression models, consistent with the results of the multivariate analyses.
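The EPV criterion cited above is simple arithmetic. As we understand the PROBAST guidance, it is the number of participants in the smaller outcome group (usually those with the event, here transfused patients) divided by the number of candidate predictor parameters considered during model development. The following is a minimal sketch under that assumption; the function names and the cohort figures are hypothetical and are not taken from any included study.

```python
def events_per_variable(n_events: int, n_non_events: int, n_candidate_params: int) -> float:
    """Return EPV: smaller outcome group divided by candidate predictor parameters."""
    if n_candidate_params <= 0:
        raise ValueError("need at least one candidate predictor parameter")
    return min(n_events, n_non_events) / n_candidate_params


def meets_epv_threshold(epv: float, threshold: float = 20.0) -> bool:
    """Check EPV against a threshold (20 is the value quoted in the text)."""
    return epv >= threshold


# Hypothetical cohort: 1,000 TKA patients, 90 transfused, 12 candidate parameters.
epv = events_per_variable(n_events=90, n_non_events=910, n_candidate_params=12)
print(f"EPV = {epv:.1f}, adequate = {meets_epv_threshold(epv)}")
```

In this hypothetical cohort the EPV is 7.5, well below 20, which illustrates how a low transfusion rate combined with many candidate predictors can fail the criterion even when the total sample is fairly large.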
Regarding applicability, nine studies were rated as low concern overall, and three [25, 33, 34] as unclear. In the participant and predictor domains, all studies were rated as low concern. In the outcome domain, three studies [25, 33, 34] did not clearly define the predicted outcome.
Meta-analysis results
Following the screening process, only three studies [25, 30, 31] were found to have been validated externally. Consequently, the meta-analysis was restricted to models that were validated internally. Among them, two studies [26, 34] did not report the AUC or its 95% confidence interval (CI). Mohammed et al. [33] reported that the GBM-based prediction model showed better calibration and discrimination than the alternative models; therefore, the AUC of the GBM model was extracted for the meta-analysis. Ultimately, six studies [27, 28, 31–33, 35] met the inclusion criteria. A random-effects model yielded a pooled AUC of 0.83 (95% CI: 0.74–0.92) (Fig. 4). The I² value was 97.3% (p < 0.001), indicating substantial heterogeneity across the studies. Sensitivity analysis was conducted using the “metaninf” module in Stata 17. When each study was omitted in turn, the pooled estimates did not change materially (Fig. 5), indicating that the results were robust. In the subgroup meta-analysis, the pooled AUC of the GBM models was higher than that of the LR models (0.87 vs. 0.79). However, the difference between subgroups was not statistically significant (p = 0.406), and heterogeneity within each subgroup remained significant (Fig. 6).
Fig. 4.
Forest plot of the random effects meta-analysis of pooled AUC estimates for 6 models
Fig. 5.
Sensitivity analyses were conducted using a leave-one-out method
Fig. 6.
Forest plots for subgroup analysis of logistic regression models (LR) versus machine learning models (GBM)
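The pooling, heterogeneity statistic, and leave-one-out analysis reported in this section can be illustrated with a short sketch. It uses the DerSimonian-Laird estimator on the raw AUC scale as a simple stand-in and is not the Stata procedure used in this review; the abbreviation list mentions REML and HKSJ, which suggests the actual analysis differed, and pooling on the logit scale is often recommended for AUC. The AUC values and confidence limits below are hypothetical, and standard errors are back-calculated from the 95% CI under a normal approximation.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Approximate standard error back-calculated from a 95% confidence interval."""
    return (upper - lower) / (2 * z)


def pool_random_effects(estimates, ses):
    """DerSimonian-Laird random-effects pooling with Cochran's Q and I^2."""
    k = len(estimates)
    w = [1.0 / se ** 2 for se in ses]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)  # between-study variance
    i2 = 100.0 * max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    return {
        "pooled": pooled,
        "ci_low": pooled - 1.96 * se_pooled,
        "ci_high": pooled + 1.96 * se_pooled,
        "tau2": tau2,
        "i2": i2,
    }


def leave_one_out(estimates, ses):
    """Re-pool after omitting each study in turn; returns one pooled value per omission."""
    results = []
    for i in range(len(estimates)):
        rest_y = estimates[:i] + estimates[i + 1:]
        rest_se = ses[:i] + ses[i + 1:]
        results.append(pool_random_effects(rest_y, rest_se)["pooled"])
    return results


# Hypothetical (AUC, CI lower, CI upper) triples, NOT the values from the included studies.
studies = [
    (0.70, 0.66, 0.74),
    (0.78, 0.75, 0.81),
    (0.84, 0.80, 0.88),
    (0.88, 0.86, 0.90),
    (0.91, 0.89, 0.93),
    (0.95, 0.93, 0.97),
]
aucs = [s[0] for s in studies]
ses = [se_from_ci(s[1], s[2]) for s in studies]

summary = pool_random_effects(aucs, ses)
print(f"Pooled AUC {summary['pooled']:.3f} "
      f"({summary['ci_low']:.3f} to {summary['ci_high']:.3f}), I2 = {summary['i2']:.1f}%")
for omitted, pooled in enumerate(leave_one_out(aucs, ses), start=1):
    print(f"Omitting study {omitted}: pooled AUC {pooled:.3f}")
```

With narrow study-level intervals and widely spread point estimates, as in this hypothetical input, I² is very high, which mirrors the pattern reported above: precise individual studies that disagree with each other.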
Discussion
This systematic review assessed 18 prediction models across 12 studies from various countries and regions. The predictors most commonly used across the models were preoperative hemoglobin (Hb), age, BMI, surgery duration, and the use of tranexamic acid (TXA). These models exhibited moderate to excellent discrimination, with AUC values ranging from 0.652 [35] to 1.0 [27]. Nevertheless, all models were deemed to have a high risk of bias, which limits their reliability and clinical applicability. Therefore, future studies with improved methodological quality are needed to advance this field.
Model performance is primarily evaluated using discrimination and calibration [38]. In this review, AUC was the most frequently used metric for assessing model performance, with values approaching 1 indicating better discrimination. The AUC values for the included models ranged from 0.652 to 0.97, with 13 models reporting AUC > 0.75, suggesting that these models can discriminate well between patients who will and will not require blood transfusion following TKA. However, only four included studies [24, 30, 33, 35] reported calibration results. Calibration reflects the agreement between predicted and observed probabilities [20]. The absence of calibration assessment is a major limitation, as a model with good discrimination can still produce inaccurate risk predictions, potentially misleading clinical decision-making [39]. Therefore, future studies should report model performance completely to support decision-making and ensure more reliable clinical application.
Model validation, both internal and external, is essential for model development and implementation [40]. In this review, most studies lacked external validation, while two [24, 29] lacked both internal and external validation. External validation evaluates model performance in an independent dataset and is a definitive test of generalizability and clinical utility [41]. The lack of external validation hinders a realistic assessment of model performance on independent datasets [14]. Furthermore, almost all of the seven studies with internal validation used random sample splitting, which does not fully guard against overfitting because training and validation occur within the same dataset [42]. Moreover, the majority of the included studies derived patient data from a single center. Multicenter validation could improve the external validity of the models [43].
Decision curve analysis (DCA) is a visual tool that shows the clinical net benefit of a prediction model across a range of threshold probabilities [44]. In our review, three studies [28, 30, 33] used DCA to assess model applicability and demonstrated clinical usefulness. Therefore, future research should focus on strengthening external validation, especially in multicenter studies, and on using DCA to evaluate clinical benefit and thereby enhance clinical applicability.
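Two of the quantities discussed above can be computed directly from predicted risks and observed outcomes. The sketch below shows the observed-to-expected (O:E) ratio, a simple summary of calibration-in-the-large, and the net benefit at a single threshold probability, which is the quantity a decision curve plots across thresholds. The ten patients, their predicted risks, and the 25% threshold are hypothetical and chosen only for illustration.

```python
def observed_expected_ratio(y_true, y_prob):
    """O:E ratio: observed events divided by the sum of predicted risks (ideal value 1)."""
    return sum(y_true) / sum(y_prob)


def net_benefit(y_true, y_prob, threshold):
    """Net benefit of intervening on patients whose predicted risk is >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the 'treat everyone' reference strategy."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical outcomes (1 = transfused) and predicted transfusion risks.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_prob = [0.8, 0.6, 0.3, 0.4, 0.2, 0.2, 0.1, 0.1, 0.1, 0.2]

threshold = 0.25
print(f"O:E ratio = {observed_expected_ratio(y_true, y_prob):.2f}")
print(f"Net benefit (model) = {net_benefit(y_true, y_prob, threshold):.3f}")
print(f"Net benefit (treat all) = {net_benefit_treat_all(y_true, threshold):.3f}")
print("Net benefit (treat none) = 0.000")
```

A model is considered clinically useful at a given threshold when its net benefit exceeds both reference strategies (treat all and treat none); a full decision curve repeats this calculation over the range of thresholds that clinicians would consider reasonable.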
A meta-analysis was performed on six internally validated models, resulting in a pooled AUC of 0.83 (95% CI: 0.74–0.92), indicating relatively high discrimination. However, the I² value was 97.3% (p < 0.001), suggesting substantial heterogeneity across the studies. In the exploratory subgroup analysis (GBM vs. LR), the between-subgroup test was not significant (p = 0.406) and heterogeneity within each subgroup remained significant, indicating that the modelling approach is unlikely to be the primary source of heterogeneity. Instead, the heterogeneity could be attributed to other factors, such as variations in sample characteristics and outcome definition (transfusion indication) [45]. Regarding sample characteristics, the data in the included studies came from different countries, and this diversity of sample sources may contribute to heterogeneity. The type of TKA also differed. The risk of postoperative blood transfusion varies across surgical types because of differences in operative duration and blood loss; for example, patients who undergo simultaneous bilateral TKA typically have higher transfusion requirements [46]. Moreover, there are no objective criteria for transfusion indication in TKA. Physicians make transfusion decisions based on their clinical experience or the standards of their institution, which can affect the TKA transfusion rate [47]. This variation may lead to bias in model predictions and contribute to the observed heterogeneity. Further subgroup analyses were not feasible because there were too few studies in each subgroup. Therefore, future studies should standardize their methodological details to facilitate more comprehensive meta-analyses and subgroup comparisons.
According to PROBAST, all included studies had a high risk of bias, primarily due to issues in the study design and analysis domains. All included studies employed retrospective designs, which are prone to information bias and recall bias. In the analysis domain, common issues included inadequate sample size, unclear handling of missing data, and inappropriate handling of predictors. Adequate sample size is critical to minimize overfitting and enhance model reliability [19]. As PROBAST suggests, an events-per-variable (EPV) ratio of at least 20 can help reduce overfitting and bias [19]. However, some studies [25, 26, 31, 32] failed to satisfy this criterion because of low transfusion rates or the use of numerous candidate predictors. In terms of missing data, many studies did not clearly report their handling methods or used inadequate approaches. For example, Liu et al. [32] directly deleted records with missing data, which could introduce sampling bias and affect the generalizability of prediction models [19]. Regarding predictor handling, two studies [24, 33] converted continuous variables into categorical ones, which could discard information and weaken the association between predictors and outcomes [48]. Furthermore, some studies used univariable analysis for predictor selection, which could lead to the omission of important variables and the introduction of bias [19]. To mitigate bias, Kolin et al. [28] combined expert opinion with multiple analyses, while two studies [27, 31] used recursive feature elimination (RFE) to select predictors. RFE can automate predictor selection and may help limit overfitting [49]. Overall, improvements in model development methods are necessary to address these methodological problems.
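The information loss from categorizing a continuous predictor, noted above, can be seen in a small worked example. The sketch below computes the AUC as the proportion of (transfused, non-transfused) patient pairs that a score ranks correctly, first using preoperative hemoglobin as a continuous predictor and then using a dichotomized version. The ten hemoglobin values, the outcomes, and the 12 g/dL cut-off are hypothetical and serve only to illustrate the mechanism.

```python
def auc(scores, labels):
    """AUC as the share of event/non-event pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical preoperative hemoglobin (g/dL) and transfusion outcome (1 = transfused).
hemoglobin = [9.5, 10.8, 11.6, 12.4, 11.9, 12.8, 13.1, 13.5, 14.0, 14.6]
transfused = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# Lower hemoglobin means higher risk, so the continuous risk score is the negative value.
continuous_score = [-hb for hb in hemoglobin]

# Dichotomized predictor: 1 if hemoglobin is below an illustrative 12 g/dL cut-off.
dichotomized_score = [1 if hb < 12.0 else 0 for hb in hemoglobin]

auc_continuous = auc(continuous_score, transfused)
auc_dichotomized = auc(dichotomized_score, transfused)
print(f"AUC, continuous hemoglobin:   {auc_continuous:.3f}")
print(f"AUC, dichotomized hemoglobin: {auc_dichotomized:.3f}")
```

In this toy data the continuous predictor ranks 23 of 24 pairs correctly, whereas the dichotomized version creates many ties within each category and ranks the equivalent of 19 of 24 pairs correctly, so discrimination falls even though the underlying measurements are identical.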
The existing prediction models also have important clinical implications. The most frequent predictors in these models were all readily available clinical data, such as preoperative hemoglobin (Hb), age, BMI, surgery duration, and the use of tranexamic acid (TXA). These predictors could be considered in future models for predicting blood transfusion needs after TKA. Moreover, the format in which a model is presented is an important consideration when implementing prediction models in clinical settings [50]. In this review, models were presented in various formats, including model equations, nomograms, and online calculators. Nomograms were commonly used for logistic regression models, while online calculators were typically used for machine learning models. Visual nomograms and online calculators convert complex statistical methods and machine learning algorithms into simple forms, facilitating efficient prediction of blood transfusion risk. However, many of them did not provide complete model equations or machine learning code, which limits external validation and model optimization [51]. Therefore, future research should consider the intended user and clinical setting when presenting prediction models and ensure that complete model equations and algorithms are provided to allow for external validation and optimization.
Strengths and limitations
This review has several strengths. We systematically reviewed the existing clinical prediction models for postoperative blood transfusion after TKA using an extensive literature search, which provides comprehensive evidence in this field. Additionally, we conducted this review following the PRISMA [15] statement and the TRIPOD-SRMA [14] guideline, which helps ensure complete and transparent reporting.
This review also has some limitations. First, only studies published in English or Chinese were included, which may have led to the omission of relevant studies published in other languages. Second, because of the limited number of included studies, only six models that underwent internal validation were included in the meta-analysis. This prevented quantification of publication bias and further exploration of the sources of heterogeneity among studies. Finally, significant heterogeneity was still observed across studies, which could stem from differences in participant populations and transfusion indications. Therefore, the clinical conclusions drawn from this review should be interpreted with caution.
Implications for clinical practice
Prediction models can help healthcare providers identify patients at high risk of requiring blood transfusion after TKA and guide the implementation of blood management strategies, including preoperative iron supplementation [52] and tranexamic acid use [53], which can reduce transfusion rates and associated complications. Future research should consider several key areas. First, the predictors summarized in this review are readily available in clinical practice and could be prioritized in the development of future clinical risk prediction models to enhance predictive accuracy. Second, given that the current prediction models had a high risk of bias, researchers are advised to improve the methodological quality of studies following the PROBAST guidance, for example by using objective outcome definitions, handling missing data with appropriate methods, avoiding categorization of continuous variables, and using bootstrapping or cross-validation instead of random sample splits for internal validation. Finally, most models included in this review were derived from single-center retrospective data and most lacked sufficient external validation, which may affect their generalizability and applicability. Thus, factors such as patient populations and institutional conditions should be considered when implementing these models. Future research should adopt prospective study designs and prioritize multicenter external validation to improve the reliability and clinical utility of the models.
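The recommendation to prefer bootstrapping over a single random split can be made concrete with a sketch of bootstrap optimism correction. The idea is to refit the model in each bootstrap sample, compare its performance in that sample with its performance in the original data, and subtract the average difference (the optimism) from the apparent AUC. To keep the example self-contained, the "model" here is a deliberately simple linear score whose weights are the differences between class means; it stands in for a real fitting procedure such as logistic regression and is not the method of any included study. The sixteen patients and their two standardized predictors are hypothetical.

```python
import random


def auc(scores, labels):
    """AUC as the share of event/non-event pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit(rows, labels):
    """Toy model: one weight per predictor, equal to the difference in class means."""
    def class_mean(j, cls):
        vals = [r[j] for r, y in zip(rows, labels) if y == cls]
        return sum(vals) / len(vals)
    return [class_mean(j, 1) - class_mean(j, 0) for j in range(len(rows[0]))]


def predict(weights, rows):
    """Linear score for each patient."""
    return [sum(w * x for w, x in zip(weights, r)) for r in rows]


def optimism_corrected_auc(rows, labels, n_boot=200, seed=0):
    """Return (apparent AUC, optimism-corrected AUC) using bootstrap resampling."""
    rng = random.Random(seed)
    n = len(rows)
    apparent = auc(predict(fit(rows, labels), rows), labels)
    optimism = []
    while len(optimism) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        b_rows = [rows[i] for i in idx]
        b_labels = [labels[i] for i in idx]
        if len(set(b_labels)) < 2:
            continue  # redraw if a bootstrap sample contains only one outcome class
        weights = fit(b_rows, b_labels)            # refit in the bootstrap sample
        boot_auc = auc(predict(weights, b_rows), b_labels)
        orig_auc = auc(predict(weights, rows), labels)  # test on the original data
        optimism.append(boot_auc - orig_auc)
    return apparent, apparent - sum(optimism) / n_boot


# Hypothetical standardized predictors (two per patient) and outcomes (1 = transfused).
rows = [
    (1.2, 0.5), (0.8, -0.2), (0.3, 1.1), (-0.1, 0.4), (1.5, 0.9), (0.2, -0.6),
    (-0.9, -0.3), (-0.4, 0.2), (0.1, -0.8), (-1.2, 0.6), (0.5, -1.0),
    (-0.7, -0.5), (-0.2, 0.9), (-1.4, -0.1), (0.6, 0.1), (-0.3, -1.2),
]
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

apparent, corrected = optimism_corrected_auc(rows, labels)
print(f"Apparent AUC: {apparent:.3f}")
print(f"Optimism-corrected AUC: {corrected:.3f}")
```

Unlike a random split, this approach uses every patient for both model development and validation, which matters most in the small, low-event-rate samples typical of the studies reviewed here.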
Conclusions
This systematic review and meta-analysis summarized 12 studies with 18 clinical prediction models for transfusion risk after TKA. The models showed moderate to excellent predictive performance, and the pooled estimate for the internally validated models (AUC = 0.83) indicated relatively high discrimination. However, all studies were assessed as having a high risk of bias with PROBAST because of methodological weaknesses, including a lack of rigorous study design and inadequate external validation, which limits their widespread and reliable clinical implementation. Future research should focus on enhancing existing models or developing new ones with rigorous methodology and multicenter external validation, providing more reliable evidence for the application of transfusion risk prediction in TKA patients.
Supplementary Information
Acknowledgements
Not applicable.
Abbreviations
- TKA: Total knee arthroplasty
- KOA: Knee osteoarthritis
- TXA: Tranexamic acid
- AUC: Area under curve
- CHARMS: Checklist for Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modelling Studies
- PROBAST: Prediction model Risk Of Bias Assessment Tool
- REML: Restricted maximum likelihood
- HKSJ: Hartung-Knapp-Sidik-Jonkman
- LR: Logistic regression
- MPNN: Message passing neural network
- GBM: Gradient boosting machine
- RF: Random forest
- ANN: Artificial neural network
- XGBoost: eXtreme gradient boosting
- SVM: Support vector machine
- KNN: K-nearest neighbors
- RFE: Recursive feature elimination
- LASSO: Least absolute shrinkage and selection operator
- SRMI: Sequential regression multiple imputation
- DCA: Decision curve analysis
- EPV: Events per variable
- NPV: Negative predictive value
- PPV: Positive predictive value
- VIF: Variance inflation factor
Authors’ contributions
JingWen Chen: Writing – original draft, Methodology, Formal analysis, Data curation, Visualization, Conceptualization. Xiaoping Zhong: Writing – review & editing, Methodology, Data curation. Yaojie Zhai: Software, Data curation. Cuixian Zhao: Formal analysis, Validation. Jingjing Lan: Visualization. Zhenlan Xia: Writing – review & editing, Methodology, Conceptualization. Liping Chen: Supervision, Project administration, Conceptualization. All authors have read and approved the final manuscript.
Funding
This research did not receive any specific grant from any departmental funding.
Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Footnotes
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Jingwen Chen and Xiaoping Zhong made equal contributions to this manuscript.
Contributor Information
Liping Chen, Email: clp202306@163.com.
Zhenlan Xia, Email: 421170624@qq.com.
References
- 1. Ackerman IN, Soh S-E, de Steiger R. Actual versus forecast burden of primary hip and knee replacement surgery in Australia: analysis of data from the Australian orthopaedic association national joint replacement registry. J Clin Med. 2022;11:1883. doi:10.3390/jcm11071883.
- 2. Rupp M, Lau E, Kurtz SM, Alt V. Projections of primary TKA and THA in Germany from 2016 through 2040. Clin Orthop Relat Res. 2020;478:1622–33. doi:10.1097/CORR.0000000000001214.
- 3. Singh JA, Yu S, Chen L, Cleveland JD. Rates of total joint replacement in the United States: future projections to 2020–2040 using the National inpatient sample. J Rheumatol. 2019;46:1134–40. doi:10.3899/jrheum.170990.
- 4. Sun W, Yuwen P, Yang X, Chen W, Zhang Y. Changes in epidemiological characteristics of knee arthroplasty in eastern, Northern and central China between 2011 and 2020. J Orthop Surg Res. 2023;18:104. doi:10.1186/s13018-023-03600-3.
- 5. Palazzuolo M, Antoniadis A, Mahlouly J, Wegrzyn J. Total knee arthroplasty improves the quality-adjusted life years in patients who exceeded their estimated life expectancy. Int Orthop. 2021;45:635–41. doi:10.1007/s00264-020-04917-y.
- 6. Zhang H, Mu X, Zhang Z, Lin J, Jin J, Qian W, et al. Clinical and hematological factors affecting perioperative blood loss following total knee arthroplasty: a new clinical prediction model. Chin Med J Engl. 2025;138:868–70. doi:10.1097/CM9.0000000000003519.
- 7. Kimball CC, Nichols CI, Vose JG. Blood transfusion trends in primary and revision total joint arthroplasty: recent declines are not shared equally. J Am Acad Orthop Surg. 2019;27:E920–7. doi:10.5435/JAAOS-D-18-00205.
- 8. Acuña AJ, Grits D, Samuel LT, Emara AK, Kamath AF. Perioperative blood transfusions are associated with a higher incidence of thromboembolic events after TKA: an analysis of 333,463 TKAs. Clin Orthop Relat Res. 2021;479:589–600. doi:10.1097/CORR.0000000000001513.
- 9. Everhart JS, Sojka JH, Mayerson JL, Glassman AH, Scharschmidt TJ. Perioperative allogeneic red blood-cell transfusion associated with surgical site infection after total hip and knee arthroplasty. J Bone Joint Surg Am. 2018;100:288–94. doi:10.2106/JBJS.17.00237.
- 10. Wang Q, Lee RLT, Hunter S, Chan SW-C. The effectiveness of internet-based telerehabilitation among patients after total joint arthroplasty: an integrative review. Int J Nurs Stud. 2021;115:103845. doi:10.1016/j.ijnurstu.2020.103845.
- 11. Irving A, McQuilten ZK. Does patient blood management represent good value for money? Best Pract Res Clin Anaesthesiol. 2023;37:511–8. doi:10.1016/j.bpa.2023.11.004.
- 12. Palmer AJR, Gagné S, Fergusson DA, Murphy MF, Grammatopoulos G. Blood management for elective orthopaedic surgery. J Bone Joint Surg Am. 2020;102:1552–64. doi:10.2106/JBJS.19.01417.
- 13. Collins GS, Reitsma JB, Altman DG, Moons KGM. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. BMJ. 2015. doi:10.1136/bmj.g7594.
- 14. Snell KIE, Levis B, Damen JAA, Dhiman P, Debray TPA, Hooft L, et al. Transparent reporting of multivariable prediction models for individual prognosis or diagnosis: checklist for systematic reviews and meta-analyses (TRIPOD-SRMA). BMJ. 2023. doi:10.1136/bmj-2022-073538.
- 15. Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ. 2021;372. doi:10.1136/bmj.n71.
- 16. Geersing G-J, Bouwmeester W, Zuithoff P, Spijker R, Leeflang M, Moons K. Search filters for finding prognostic and diagnostic prediction studies in medline to enhance systematic reviews. PLoS One. 2012;7:e32844. doi:10.1371/journal.pone.0032844.
- 17. Moons KGM, de Groot JAH, Bouwmeester W, Vergouwe Y, Mallett S, Altman DG, et al. Critical appraisal and data extraction for systematic reviews of prediction modelling studies: the CHARMS checklist. PLoS Med. 2014;11:e1001744. doi:10.1371/journal.pmed.1001744.
- 18. Wolff RF, Moons KGM, Riley RD, Whiting PF, Westwood M, Collins GS, et al. PROBAST: A tool to assess the risk of bias and applicability of prediction model studies. Ann Intern Med. 2019;170:51–8. doi:10.7326/M18-1376.
- 19. Moons KGM, Wolff RF, Riley RD, Whiting PF, Westwood M, Collins GS, et al. PROBAST: A tool to assess risk of bias and applicability of prediction model studies: explanation and elaboration. Ann Intern Med. 2019;170:W1–33. doi:10.7326/M18-1377.
- 20. Debray TPA, Damen JAAG, Snell KIE, Ensor J, Hooft L, Reitsma JB, et al. A guide to systematic review and meta-analysis of prediction model performance. BMJ. 2017;356:i6460. doi:10.1136/bmj.i6460.
- 21. Higgins JPT, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analyses. BMJ. 2003;327:557–60.
- 22. Chapter 13. Assessing risk of bias due to missing evidence in a meta-analysis | Cochrane. https://www.cochrane.org/authors/handbooks-and-manuals/handbook/current/chapter-13. Accessed 16 Aug 2025.
- 23. Wallace BC, Schmid CH, Lau J, Trikalinos TA. Meta-analyst: software for meta-analysis of binary, continuous and diagnostic data. BMC Med Res Methodol. 2009;9:80. doi:10.1186/1471-2288-9-80.
- 24. Ahmed I, Chan JKK, Jenkins P, Brenkel I, Walmsley P. Estimating the transfusion risk following total knee arthroplasty. Orthopedics. 2012;35:e1465–1471. doi:10.3928/01477447-20120919-13.
- 25. Noticewala MS, Nyce JD, Wang W, Geller JA, Macaulay W. Predicting need for allogeneic transfusion after total knee arthroplasty. J Arthroplasty. 2012;27:961–7. doi:10.1016/j.arth.2011.10.008.
- 26. Cavazos DR, Sayeed Z, Court T, Chen C, Little BE, Darwiche HF. Predicting factors for blood transfusion in primary total knee arthroplasty using a machine learning method. J Am Acad Orthop Surg. 2023;31:e845–58. doi:10.5435/JAAOS-D-23-00063.
- 27. Faure N, Knecht S, Tran P, Tamine L, Orban J-C, Bronsard N, et al. Prediction of transfusion risk after total knee arthroplasty: use of a machine learning algorithm. Orthop Traumatology: Surg Res. 2024;103985. doi:10.1016/j.otsr.2024.103985.
- 28. Kolin DA, Lyman S, Della Valle AG, Ast MP, Landy DC, Chalmers BP. Predicting postoperative anemia and blood transfusion following total knee arthroplasty. J Arthroplasty. 2023;38:1262–e12662. doi:10.1016/j.arth.2023.01.018.
- 29. Mozella A, de P, Cobra HA, de Duarte AB. Predictive factors for blood transfusion after total knee arthroplasty. Rev Bras Ortop. 2021;56:463–9. doi:10.1055/s-0040-1715511.
- 30. Hu C, Wang Y, Shen R, Liu C, Sun K, Ye L, et al. Development and validation of a nomogram to predict perioperative blood transfusion in patients undergoing total knee arthroplasty. BMC Musculoskelet Disord. 2020;21:315. doi:10.1186/s12891-020-03328-9.
- 31. Jo C, Ko S, Shin WC, Han H-S, Lee MC, Ko T, et al. Transfusion after total knee arthroplasty can be predicted using the machine learning algorithm. Knee Surg Sports Traumatol Arthrosc. 2020;28:1757–64. doi:10.1007/s00167-019-05602-3.
- 32. Liu Y, Ai J, Teng X, Huang Z, Wu H, Zhang Z, et al. Risk factor analysis and establishment of a nomogram model to predict blood loss during total knee arthroplasty. BMC Musculoskelet Disord. 2024;25:1–12. doi:10.1186/s12891-024-07570-3.
- 33. Mohammed H, Huang Y, Memtsoudis S, Parks M, Huang Y, Ma Y. Utilization of machine learning methods for predicting surgical outcomes after total knee arthroplasty. PLoS One. 2022;17:e0263897. doi:10.1371/journal.pone.0263897.
- 34. Chen C, He D, Liang J, He Z. Predicting the possibility of blood transfusion after total knee arthroplasty based on machine learning algorithm. Zhongguo Zuzhi Gongcheng Yanjiu. 2021;25:5792. [In Chinese].
- 35. Li X, Lu X, Shi L, Xu K, Yu T, Zhang Y. Establishment of a predictive model for blood transfusion within 14d after simultaneous bilateral total knee arthroplasty. J Qingdao Univ (Medical Sciences). 2024;60:693–6. [In Chinese].
- 36. Gross JB. Estimating allowable blood loss: corrected for dilution. Anesthesiology. 1983;58:277–80. doi:10.1097/00000542-198303000-00016.
- 37. Hu Y, Lu H, Ren L, Yang M, Shen M, Huang J, et al. Prediction models for perineal lacerations during childbirth: a systematic review and critical appraisal. Int J Nurs Stud. 2023;145:104546. doi:10.1016/j.ijnurstu.2023.104546.
- 38. Steyerberg EW, Vickers AJ, Cook NR, Gerds T, Gonen M, Obuchowski N, et al. Assessing the performance of prediction models: a framework for some traditional and novel measures. Epidemiol (Cambridge Mass). 2010;21:128. doi:10.1097/EDE.0b013e3181c30fb2.
- 39. Van Calster B, McLernon DJ, van Smeden M, Wynants L, Steyerberg EW, Bossuyt P, et al. Calibration: the Achilles heel of predictive analytics. BMC Med. 2019;17:230. doi:10.1186/s12916-019-1466-7.
- 40. Moons KGM, Altman DG, Reitsma JB, Ioannidis JPA, Macaskill P, Steyerberg EW, et al. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): explanation and elaboration. Ann Intern Med. 2015;162:W1–73. doi:10.7326/M14-0698.
- 41. Riley RD, Ensor J, Snell KIE, Debray TPA, Altman DG, Moons KGM, et al. External validation of clinical prediction models using big datasets from e-health records or IPD meta-analysis: opportunities and challenges. BMJ. 2016;353:i3140. doi:10.1136/bmj.i3140.
- 42. Steyerberg EW, Harrell FE, Borsboom GJ, Eijkemans MJ, Vergouwe Y, Habbema JD. Internal validation of predictive models: efficiency of some procedures for logistic regression analysis. J Clin Epidemiol. 2001;54:774–81. doi:10.1016/s0895-4356(01)00341-9.
- 43. Ho SY, Phua K, Wong L, Bin Goh WW. Extensions of the external validation for checking learned model interpretability and generalizability. Patterns. 2020;1:100129. doi:10.1016/j.patter.2020.100129.
- 44. Kerr KF, Brown MD, Zhu K, Janes H. Assessing the clinical impact of risk prediction models with decision curves: guidance for correct interpretation and appropriate use. JCO. 2016;34:2534–40. doi:10.1200/JCO.2015.65.5654.
- 45. Debray TPA, Vergouwe Y, Koffijberg H, Nieboer D, Steyerberg EW, Moons KGM. A new framework to enhance the interpretation of external validation studies of clinical prediction models. J Clin Epidemiol. 2015;68:279–89. doi:10.1016/j.jclinepi.2014.06.018.
- 46. Chalmers BP, Mishu M, Chiu Y-F, Cushner FD, Sculco PK, Boettner F, et al. Simultaneous bilateral primary total knee arthroplasty with TXA and restrictive transfusion protocols: still a 1 in 5 risk of allogeneic transfusion. J Arthroplasty. 2021;36:1318–21. doi:10.1016/j.arth.2020.10.042.
- 47. Frisch NB, Wessell NM, Charters MA, Yu S, Jeffries JJ, Silverton CD. Predictors and complications of blood transfusion in total hip and knee arthroplasty. J Arthroplasty. 2014;29(9 Suppl):189–92. doi:10.1016/j.arth.2014.03.048.
- 48. Altman DG, Royston P. The cost of dichotomising continuous variables. BMJ. 2006;332:1080. doi:10.1136/bmj.332.7549.1080.
- 49. Pudjihartono N, Fadason T, Kempa-Liehr AW, O’Sullivan JM. A review of feature selection methods for machine Learning-Based disease risk prediction. Front Bioinform. 2022;2:927312. doi:10.3389/fbinf.2022.927312.
- 50. Bonnett LJ, Snell KIE, Collins GS, Riley RD. Guide to presenting clinical prediction models for use in clinical settings. BMJ. 2019;365. doi:10.1136/bmj.l737.
- 51. Kong L-N, Yang L, Lyu Q, Liu D-X, Yang J. Risk prediction models for frailty in older adults: a systematic review and critical appraisal. Int J Nurs Stud. 2025;167:105068. doi:10.1016/j.ijnurstu.2025.105068.
- 52. Park Y-B, Kim K-I, Lee H-J, Yoo J-H, Kim J-H. High-dose intravenous iron supplementation during hospitalization improves hemoglobin level and transfusion rate following total knee or hip arthroplasty: a systematic review and meta-analysis. J Arthroplasty. 2024. doi:10.1016/j.arth.2024.11.058.
- 53. Chen JY, Chin PL, Moo IH, Pang HN, Tay DKJ, Chia S-L, et al. Intravenous versus intra-articular tranexamic acid in total knee arthroplasty: a double-blinded randomised controlled noninferiority trial. Knee. 2016;23:152–6. doi:10.1016/j.knee.2015.09.004.