BMC Musculoskeletal Disorders. 2025 Oct 1;26:892. doi: 10.1186/s12891-025-09164-z

Clinical prediction models for postoperative blood transfusion after total knee arthroplasty: a systematic review and meta-analysis

Jingwen Chen 1,#, Xiaoping Zhong 1,#, Yaojie Zhai 1, Cuixian Zhao 1, Jingjing Lan 2, Liping Chen 3, Zhenlan Xia 2
PMCID: PMC12487561  PMID: 41034914

Abstract

Background

Postoperative blood transfusion remains a significant concern following total knee arthroplasty. Clinical prediction models can facilitate early identification of patients at risk, enabling targeted blood management to reduce unnecessary transfusions and related complications. However, the predictive performance, methodological quality, and clinical applicability of these models remain uncertain. Therefore, we systematically reviewed existing models for predicting postoperative transfusion in total knee arthroplasty.

Methods

Ten English and Chinese databases were comprehensively searched from database inception to February 2025 to identify relevant studies. Two reviewers independently extracted data based on the checklist for Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modelling Studies (CHARMS). The risk of bias and applicability of each study were evaluated using the Prediction model Risk Of Bias Assessment Tool (PROBAST). The extracted AUCs of the included models were pooled using a random-effects meta-analysis. A leave-one-out sensitivity analysis and an exploratory subgroup meta-analysis by modelling approach were also conducted to explore the sources of heterogeneity. All statistical analyses were performed in Stata 17.0.

Results

Twelve studies involving eighteen models were included in this review. All studies developed their prediction models using logistic regression or machine learning methods. The most commonly used predictors were preoperative hemoglobin, age, body mass index, surgery duration, and the use of tranexamic acid. The pooled AUC for the six internally validated models was 0.83 (95% CI: 0.74–0.92), demonstrating relatively high predictive discrimination. Sensitivity analysis did not materially change the estimates, and the subgroup meta-analysis showed that the modelling approach alone could not explain the heterogeneity (p = 0.406). However, all models were considered to have a high risk of bias, mainly owing to unsuitable study design and poor reporting within the analysis domain.

Conclusions

Despite the included studies demonstrating moderate to excellent discrimination for predicting postoperative transfusion after total knee arthroplasty, all studies were considered as having a high risk of bias following the PROBAST due to some methodological shortcomings and inadequate external validation. Future research should focus on improving methodological quality and performing multicenter external validation to ensure clinical applicability.

Clinical trial number

Not applicable.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12891-025-09164-z.

Keywords: Blood transfusion, Total knee arthroplasty, Clinical prediction model, Systematic reviews, Meta-Analyses

Introduction

Total knee arthroplasty (TKA) is widely recognized as an effective treatment for end-stage knee osteoarthritis (KOA). With the aging population and the rising prevalence of KOA, the volume of TKA in various countries has significantly increased [1–4]. This surgery effectively alleviates pain, restores joint function, and enhances patients’ quality of life [5]. Despite the high success rate of TKA, the significant blood loss and the subsequent need for postoperative blood transfusion continue to be major concerns [6]. Reported transfusion rates after TKA vary from 3.2% to 18.1% [7]. Although blood transfusion can effectively restore blood volume and save lives, blood products are a limited resource. Errors in clinical judgment can result in blood overuse, leading to unnecessary wastage of blood products and associated complications. Several studies have shown that transfusion is associated with increased risks of complications, including thromboembolic events and surgical site infections, which can prolong hospital stays, raise hospitalization costs, and increase mortality risk [8–10]. To avoid unnecessary transfusion, patient blood management (PBM) strategies have been implemented, such as preoperative erythropoietin administration, preoperative iron supplementation, and the application of tranexamic acid (TXA), which have shown certain benefits for patients [11, 12]. Therefore, accurately assessing transfusion risk and implementing a targeted PBM strategy is crucial for improving patient outcomes and conserving the blood resources of healthcare institutions.

Clinical prediction models can help clinicians recognize high-risk patients and guide diagnostic and therapeutic strategy selection to reduce adverse outcomes [13]. In recent years, several predictive models have been developed based on diverse participant groups and predictors to identify the requirements of blood transfusion after TKA effectively. These clinical prediction models can help clinicians identify high-risk patients before surgery, facilitating the implementation of appropriate preventive measures to decrease the rate of postoperative transfusion in patients who underwent TKA. Nonetheless, due to the varied designs and populations, the quality and effectiveness of these models have not been comprehensively assessed, which limits their widespread implementation.

To our knowledge, no systematic review has comprehensively evaluated the predictive performance and clinical applicability of these models. Therefore, this systematic review and meta-analysis aimed to identify and summarize existing prediction models for transfusion after TKA, critically appraise their methodological quality, and compare their predictive performance. The findings are intended to inform the refinement of future models and support more individualized clinical decision-making.

Methods

Study design

This systematic review is reported in accordance with the Transparent Reporting of Multivariable Prediction Models for Individual Prognosis or Diagnosis: Checklist for Systematic Reviews and Meta-analyses (TRIPOD-SRMA) statement [14] and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [15] (see Additional file 1). The study protocol was registered in PROSPERO (registration number CRD42024523294).

Data sources and search strategy

A thorough literature search was performed across ten databases, including both English and Chinese databases. Chinese databases comprised the China National Knowledge Infrastructure (CNKI), Wanfang, China Science and Technology Journal Database (VIP), and the China Biomedical Literature Service System (SinoMed). English databases included PubMed, Web of Science, Cochrane Library, CINAHL, Embase, and Scopus. The search covered the period from the inception of each database to February 19, 2025. Additionally, reference lists of relevant reviews and included studies were examined to identify further eligible studies.

Boolean operators (“OR” and “AND”) were applied to combine MeSH terms with free-text keywords. The search strategy framework proposed by Geersing et al. [16] was also used. The following keywords were used: “Arthroplasty, Replacement, Knee,” “Blood Transfusion,” “risk model,” “risk factor,” “risk assessment,” “predictive model,” “nomogram,” “scoring system,” “stratification,” “ROC curve,” “discrimination,” “c-statistics,” “calibration,” “algorithm,” “multivariable,” “indices,” and “AUC.” Detailed search strategies for each database are provided in Additional file 2.

Inclusion and exclusion criteria

Cohort and case-control studies that developed or validated at least one prediction model were included in this review. As recommended by the Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modelling Studies (CHARMS) checklist [17], the PICOTS framework was applied to define the inclusion criteria as follows:

  • P (Population): Patients aged ≥ 18 years who underwent total knee arthroplasty, including unilateral TKA, staged bilateral TKA, and simultaneous bilateral TKA.

  • I (Intervention model): Any developed and published prediction models that predict the risk of blood transfusion following TKA, including at least two predictors.

  • C (Comparator): None.

  • O (Outcome): The primary outcome of interest was the occurrence of blood transfusion after TKA surgery.

  • T (Timing): Any time point during the postoperative period.

  • S (Setting): Prediction models applied in ambulatory or inpatient wards.

The following types of studies were excluded from the review: (1) studies that were not peer-reviewed; (2) studies that failed to establish a prediction model; (3) studies published in languages other than English or Chinese; (4) studies with unavailable full text or duplicate publications.

Study selection and screening

Two reviewers independently identified relevant articles according to the following steps. First, duplicate studies were removed using EndNote (version X9.3.3, Clarivate, Philadelphia, Pennsylvania, USA). Titles and abstracts from the preliminary search were then screened using the Rayyan platform (https://new.rayyan.ai/), a free, web-based document management platform for systematic reviews that can also detect and remove duplicates. Subsequently, two reviewers independently assessed the full texts of the selected articles to exclude those deemed irrelevant. Any disagreements regarding inclusion were resolved through discussion with a third reviewer.

Data extraction and synthesis

Two reviewers independently extracted data based on the checklist for critical appraisal and data extraction for systematic reviews of prediction modelling studies (CHARMS) [17]. Data were extracted from the included studies and organized into two main sections: (1) general characteristics: first author, year of publication, country, study design, data source, study setting, participant characteristics, outcome (transfusion indications), outcome measurement (timing of prediction), sample size, and number of events; (2) prediction model information: handling of missing data and continuous variables, method of predictor selection, final predictors, modelling methods, model validation (internal or external), model performance including discrimination (area under the curve or C-index) and calibration (calibration plot or Hosmer-Lemeshow test), and format of model presentation. Any inconsistencies were addressed through discussion with a third author. When essential information was missing or unclear, the corresponding authors of the original studies were contacted via email to obtain additional information. The extracted data were synthesized narratively, and the key information was tabulated in Microsoft Excel (version 2021, Microsoft Corporation). Moreover, the frequency of final predictors was presented in a bar diagram.

Quality assessment

Each included study was independently assessed by two authors using the Prediction Model Risk Of Bias Assessment Tool (PROBAST) [18] checklist, which evaluates both the risk of bias and applicability concerns in diagnostic or prognostic prediction model studies. The tool comprises 20 signaling questions divided into four domains: participants, predictors, outcome, and analysis. Each item was answered with one of the following responses: “yes,” “probably yes,” “no,” “probably no,” or “no information” [19]. Each domain is rated as having a low, high, or unclear risk of bias. A domain was considered to have a high risk of bias if any signaling question within that domain was answered as “no” or “probably no.” Any disagreements between authors were resolved through discussion until a consensus was reached.
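The domain-rating rule described above can be sketched in a few lines of Python. Only the "high" rule is stated in the text; the "low" and "unclear" branches and the overall roll-up follow the usual reading of the PROBAST guidance and are assumptions here, as are the function names.

```python
from typing import Iterable

NEGATIVE = {"no", "probably no"}
POSITIVE = {"yes", "probably yes"}


def rate_domain(answers: Iterable[str]) -> str:
    """Rate one PROBAST domain from its signaling-question answers."""
    normalized = [a.strip().lower() for a in answers]
    if any(a in NEGATIVE for a in normalized):
        return "high"      # rule stated in the text
    if all(a in POSITIVE for a in normalized):
        return "low"       # assumption: every answer is favourable
    return "unclear"       # assumption: some "no information", nothing negative


def rate_overall(domain_ratings: Iterable[str]) -> str:
    """Combine domain ratings into an overall judgement (assumed roll-up)."""
    ratings = list(domain_ratings)
    if "high" in ratings:
        return "high"
    if all(r == "low" for r in ratings):
        return "low"
    return "unclear"


# Domain ratings for Hu 2020 as shown in Table 4:
# participants high, predictors low, outcome unclear, analysis high.
print(rate_overall(["high", "low", "unclear", "high"]))
```

A single high-risk domain is enough to make the overall judgement high, which is why every study in this review ends up rated high once the participants domain is rated high.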

Statistical analysis

A meta-analysis was performed using Stata software (version 17, Stata Corporation, College Station, Texas, USA). Given clinical and methodological differences across studies, the discrimination of the included models was pooled using a random-effects model instead of a fixed-effects model [20]. The 95% confidence interval was calculated using restricted maximum likelihood (REML) estimation and the Hartung-Knapp-Sidik-Jonkman (HKSJ) method [20]. Statistical heterogeneity across studies was evaluated using Cochran’s Q test. Additionally, the I² statistic was used to quantify statistical heterogeneity, with 25%, 50%, and 75% indicating low, moderate, and high heterogeneity, respectively [21]. Publication bias was not quantified because the Cochrane Collaboration guidelines do not recommend funnel-plot asymmetry tests (e.g., Egger’s test) in meta-analyses with fewer than 10 studies [22]. As a sensitivity analysis, the leave-one-out method was used to assess the robustness of the result by re-estimating the summary effect after excluding each study in turn, which identifies influential studies [23]. Additionally, given the limited number of studies, only an exploratory subgroup meta-analysis by modelling method was conducted.
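For reference, the quantities named in this paragraph can be written out as follows. These are the standard textbook forms, not formulas reproduced from the paper; the notation is assumed here: \hat\theta_i is the AUC of model i, s_i its standard error, k the number of pooled models, and \hat\tau^2 the between-study variance (estimated by REML, which is iterative and has no closed form).

```latex
% Fixed-effect weights and Cochran's Q
w_i = \frac{1}{s_i^2}, \qquad
\hat\theta_{FE} = \frac{\sum_{i=1}^{k} w_i \hat\theta_i}{\sum_{i=1}^{k} w_i}, \qquad
Q = \sum_{i=1}^{k} w_i \left(\hat\theta_i - \hat\theta_{FE}\right)^2

% I^2 statistic
I^2 = \max\!\left(0,\ \frac{Q - (k-1)}{Q}\right) \times 100\%

% Random-effects pooled estimate
w_i^{*} = \frac{1}{s_i^2 + \hat\tau^2}, \qquad
\hat\mu = \frac{\sum_{i=1}^{k} w_i^{*} \hat\theta_i}{\sum_{i=1}^{k} w_i^{*}}

% Hartung-Knapp-Sidik-Jonkman variance and 95% confidence interval
\widehat{\mathrm{Var}}_{HK}(\hat\mu) =
  \frac{\sum_{i=1}^{k} w_i^{*} \left(\hat\theta_i - \hat\mu\right)^2}
       {(k-1) \sum_{i=1}^{k} w_i^{*}}, \qquad
\hat\mu \pm t_{k-1,\,0.975} \sqrt{\widehat{\mathrm{Var}}_{HK}(\hat\mu)}
```

Because the HKSJ interval uses a t quantile with k − 1 degrees of freedom, it can differ noticeably from a normal-based interval when only a handful of models are pooled.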

Results

Study selection

Figure 1 shows the literature selection process of this systematic review. A total of 5,896 studies were retrieved initially. After removing 2,603 duplicates, 3,293 titles and abstracts were screened according to the predefined inclusion and exclusion criteria, and 50 full-text studies were retrieved for further evaluation. Of these, 21 studies were excluded for not developing prediction models or only focusing on risk factors. Moreover, 10 studies were excluded due to unmatched study populations, four due to inconsistent results, two because they were duplicate publications, and one because it was not peer-reviewed. Ultimately, 12 studies reporting 18 prediction models were included in this review.
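The counts in this paragraph can be checked with simple arithmetic. The short sketch below uses only the numbers stated above; the dictionary keys are shorthand labels chosen for illustration.

```python
retrieved = 5896
duplicates = 2603
screened = retrieved - duplicates  # titles and abstracts screened

full_text_assessed = 50
full_text_exclusions = {
    "no prediction model / risk factors only": 21,
    "unmatched study population": 10,
    "inconsistent results": 4,
    "duplicate publication": 2,
    "not peer-reviewed": 1,
}
included = full_text_assessed - sum(full_text_exclusions.values())

print(f"screened: {screened}, included: {included}")
```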

Fig. 1. Flowchart of literature search and selection

General characteristics of included studies

Table 1 shows the general characteristics of the included studies, published from 2012 to 2024. The 12 studies were performed in China, the United States, the United Kingdom, Brazil, South Korea, and France. All included studies had a retrospective design. Regarding the type of TKA, six studies [24–29] focused on unilateral TKA, five [30–34] focused on both unilateral and bilateral TKA, and only Li et al. [35] focused on simultaneous bilateral TKA. Regarding outcome definitions, eight studies [24, 26–31, 35] defined transfusion based on postoperative hemoglobin (Hb) levels. Mohammed et al. [33] defined outcomes using blood transfusion requirements according to International Classification of Diseases, 9th Revision (ICD-9) procedure codes. Liu et al. [32] defined outcomes by calculating total blood loss ratios according to the Gross formula [36], and the remaining two studies [25, 34] did not report a clear definition. As for the prediction windows, four studies reported precise time points, including within 14 days postoperatively [30, 31, 35] and within 48 h [29], while the remaining studies did not report specifics. The number of participants in the included studies varied from 234 to 636,062, and the reported rate of blood transfusion ranged from 3.2% to 37.47%.

Table 1. Overview of the general characteristics of included studies

| First author | Country | Study design | Data source | Participants | Outcome definition | Timing of prediction | Event fraction |
|---|---|---|---|---|---|---|---|
| Ahmed 2012 [24] | United Kingdom | Retrospective cohort study | Regional arthroplasty database of a hospital | Patients underwent primary unilateral TKA | Postoperative Hb < 8.5 g/dL | Postoperatively | 227/2281 (10%) |
| Noticewala 2012 [25] | America | Retrospective cohort study | Hip and Knee Replacement center of a hospital | Patients underwent primary unilateral TKA | - | Any point in the postoperative course before discharge | 71/644 (11%) |
| Cavazos 2023 [26] | America | Retrospective cohort study | Trauma center of a hospital | Patients underwent primary unilateral TKA | Postoperative Hb < 7 g/dL | Postoperatively | 67/2093 (3.2%) |
| Faure 2024 [27] | France | Retrospective cohort study | Hospital electronic database of a hospital | Patients (age > 18) underwent unilateral TKA | Postoperative Hb < 7 g/dL | Postoperatively | 100/774 (12.9%) |
| Kolin 2023 [28] | America | Retrospective cohort study | Inpatient setting of a hospital | Patients underwent primary single-stage TKA | Postoperative Hb < 7 g/dL | Postoperatively | 543/14,188 (4%) |
| Mozella 2021 [29] | Brazil | Retrospective cohort study | National Institute of Traumatology and Orthopedics (INTO) | Patients (age 30–83) underwent unilateral TKA | Postoperative Hb < 7 g/dL | Within 48 h postoperatively | 79/234 (33.7%) |
| Hu 2020 [30] | China | Retrospective cohort study | Inpatient orthopedic unit of a hospital | Patients underwent unilateral or bilateral TKA | Postoperative Hb < 7 g/dL or Hb < 8 g/dL with symptoms of anemia | Within 14 days postoperatively | Development: 391/5402 (7.2%); validation: 148/1116 (13.3%) |
| Jo 2020 [31] | South Korea | Retrospective cohort study | Electronic medical recording system and clinical data warehouse system of an institution | Patients underwent unilateral TKA, staged bilateral TKA, and simultaneous bilateral TKA | Postoperative Hb < 7 g/dL | Within 14 days postoperatively | Development: 108/1686 (7.2%); validation: 7/400 (1.8%) |
| Liu 2024 [32] | China | Retrospective cohort study | Hospital electronic database of a hospital | Patients underwent TKA | TBL/EBV ratio > 20% | Postoperatively | 73/329 (22.2%) |
| Mohammed 2022 [33] | America | Retrospective cohort study | Inpatient data of National Inpatient Sample (NIS) | Patients underwent TKA | ICD-9 surgery codes for blood transfusion | Postoperatively | 73,020/636,062 (11.41%) |
| Chen 2021 a [34] | China | Retrospective cohort study | Hospital electronic medical records of a hospital | Patients (age > 18) underwent elective TKA | - | Postoperatively | 107/634 (16.9%) |
| Li 2024 a [35] | China | Retrospective cohort study | Hospital’s electronic medical database of a hospital | Patients underwent simultaneous bilateral TKA | Postoperative Hb < 7 g/dL or Hb < 8 g/dL with symptoms of anemia | Within 14 days postoperatively | 323/862 (37.47%) |

TKA Total knee arthroplasty, Hb Hemoglobin

a Study was published in Chinese

Characteristics of prediction models information

Table 2 summarizes the model information of the included studies. Regarding the modelling method, over half of the studies [24, 25, 28–30, 32, 35] developed models using logistic regression (LR), two [27, 31] used the gradient boosting machine (GBM) method, and one [26] used the message passing neural network (MPNN) method. The remaining two studies [33, 34] used both LR and machine learning methods to develop and compare models. As for candidate predictors, all studies selected multi-dimensional predictors, including demographic and laboratory ones, and the number of candidate predictors ranged between 7 and 43. As for the handling of continuous variables, ten studies did not transform continuous variables, while two studies [24, 33] converted continuous variables into categorical ones. Regarding the handling of missing data, two studies [28, 33] applied imputation methods, one study [27] applied the K-Nearest Neighbors (KNN) algorithm, one [32] excluded missing data directly, and the remaining eight studies [24–26, 29–31, 34, 35] did not report their handling approach. Univariate analysis was frequently used to select initial predictors [24–26, 29, 30, 32, 35]; one study [28] selected predictors based on expert opinion. Multivariate logistic regression was commonly used to determine the final predictors, with one study [24] using forward selection, two [29, 32] using stepwise selection, one [33] using backward selection, and one [30] using a combination of the Least Absolute Shrinkage and Selection Operator (LASSO) algorithm and multivariate analysis. Furthermore, among studies using machine learning, two studies [27, 31] used Recursive Feature Elimination (RFE) algorithms to select the final predictors, two [33, 34] used variable importance analysis, and one [26] used the MPNN algorithm.

Table 2. Overview of the prediction model information and performance of the included studies

| Study | Modelling methods | Candidate predictors (N) | Handling of continuous variables | Handling of missing data | Predictor selection in modelling (prior/final) | Final predictors | Model validation | Model presentation |
|---|---|---|---|---|---|---|---|---|
| Ahmed 2012 [24] | LR | 9 | Categorical variables | - | Univariate analysis / Forward conditional | 3: age, weight, and preoperative Hb | - | Equation |
| Noticewala 2012 [25] | LR | 31 | Continuous variables | - | Univariate analysis / Multiple LR | 4: age, anemia, preoperative Hb, and surgery time | External: temporal | Equation |
| Cavazos 2023 [26] | MPNN | 13 | Continuous variables | - | Univariate analysis / MPNN | 11: preoperative Hb, preoperative creatinine level, surgery time, simultaneous bilateral surgeries, TXA use, ASA score, preoperative ALB, ethanol use, preoperative anticoagulation use, age, and surgery type | Internal: split sample (7:3) | - |
| Faure 2024 [27] | GBM | 25 | Continuous variables | KNN | RFE | 5: age, BMI, TXA use, preoperative Hb, and platelet count | Internal: split sample (7:3) | Online tool |
| Kolin 2023 [28] | LR | 8 | Continuous variables | Means and bag imputation | Expert opinions / Multiple LR | 8: age, sex, BMI, preoperative Hb, TXA use, ASA, surgery time, and drain use | Internal: split sample (7:3) | Equation |
| Mozella 2021 [29] | LR | 7 | Continuous variables | - | Univariate analysis / Stepwise forward | 2: preoperative Hb and intraoperative ischemia time | - | Equation |
| Hu 2020 [30] | LR | 17 | Continuous variables | - | Univariate analysis / LASSO and multiple LR | 5: age, BMI, surgery type, CHD, and preoperative Hb | External: temporal | Nomogram |
| Jo 2020 [31] | GBM | 43 | Continuous variables | - | RFE | 6: TXA use, surgery type, platelet count, age, body weight, and preoperative Hb | Internal: tenfold cross; External: geographical | Web-based risk-assessment system |
| Liu 2024 [32] | LR | 24 | Continuous variables | Delete | Univariate LR / Stepwise forward | 4: TXA use, preoperative ESR, HCT, and ALB | Internal: split sample (7:3), bootstrap | Nomogram |
| Mohammed 2022 [33] | LR, GBM, RF, ANN | 37 | Categorical variables | SRMI | Variable importance analysis and backward selection (LR model) | 15: admission month, year of admission, patient location, deficiency anemia, median household income for the patient, age, race, sex, health insurance, fluid and electrolyte disorders, hypertension, obesity, diabetes (uncomplicated), control/ownership of hospital, hospital bed size | Internal: split sample (5:2:3) | - |
| Chen 2021 a [34] | LR, SVM, RF, XGBoost | 16 | Continuous variables | - | Variable importance analysis | 5: preoperative Hb, age, surgery time, BMI, and surgery type | Internal: split sample (8:2), fivefold cross | - |
| Li 2024 a [35] | LR | 14 | Continuous variables | - | Univariate analysis / Multiple LR | 4: preoperative Hb, age, BMI, and disease duration | Internal: bootstrap | Nomogram |

CHD Coronary heart disease, Hb Hemoglobin, ESR Erythrocyte sedimentation rate, HCT Hematocrit, ALB Albumin, SRMI Sequential regression multiple imputation, KNN K-nearest neighbors, RFE Recursive feature elimination, LR Logistic regression, LASSO Least absolute shrinkage and selection operator, GBM Gradient boosting machine, RF Random forest, SVM Support vector machine, XGBoost eXtreme gradient boosting, ANN Artificial neural network, MPNN Message passing neural network

-: not reported

Characteristics of included predictors

The number of final predictors in the models varied between 2 and 15 (Table 2), and their frequency of occurrence is shown in Fig. 2. The most frequent predictors were preoperative hemoglobin and age, each appearing in 10 studies (83.3%). Other commonly identified predictors were the use of TXA, BMI, type of surgery, and duration of surgery. Six predictors, namely sex, preoperative platelet count, preoperative albumin levels, body weight, preoperative anemia, and ASA, each appeared in two studies. The remaining 11 predictors were only used in one study.
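A tally like the one behind Fig. 2 can be sketched by counting how many studies list each final predictor in Table 2. The predictor labels below are harmonized by hand for illustration (for example, "surgery time" is counted as "surgery duration" and "deficiency anemia" as "preoperative anemia"), which is an assumption about how the review grouped them.

```python
from collections import Counter

# Final predictors per study, transcribed from Table 2 with hand-harmonized labels.
FINAL_PREDICTORS = {
    "Ahmed 2012": ["age", "body weight", "preoperative Hb"],
    "Noticewala 2012": ["age", "preoperative anemia", "preoperative Hb", "surgery duration"],
    "Cavazos 2023": ["preoperative Hb", "preoperative creatinine", "surgery duration",
                     "simultaneous bilateral surgery", "TXA use", "ASA",
                     "preoperative albumin", "ethanol use",
                     "preoperative anticoagulation use", "age", "surgery type"],
    "Faure 2024": ["age", "BMI", "TXA use", "preoperative Hb", "platelet count"],
    "Kolin 2023": ["age", "sex", "BMI", "preoperative Hb", "TXA use", "ASA",
                   "surgery duration", "drain use"],
    "Mozella 2021": ["preoperative Hb", "intraoperative ischemia time"],
    "Hu 2020": ["age", "BMI", "surgery type", "CHD", "preoperative Hb"],
    "Jo 2020": ["TXA use", "surgery type", "platelet count", "age", "body weight",
                "preoperative Hb"],
    "Liu 2024": ["TXA use", "preoperative ESR", "HCT", "preoperative albumin"],
    "Mohammed 2022": ["admission month", "year of admission", "patient location",
                      "preoperative anemia", "median household income", "age", "race",
                      "sex", "health insurance", "fluid and electrolyte disorders",
                      "hypertension", "obesity", "diabetes (uncomplicated)",
                      "control/ownership of hospital", "hospital bed size"],
    "Chen 2021": ["preoperative Hb", "age", "surgery duration", "BMI", "surgery type"],
    "Li 2024": ["preoperative Hb", "age", "BMI", "disease duration"],
}

# Count each predictor once per study.
counts = Counter(p for preds in FINAL_PREDICTORS.values() for p in set(preds))
n_studies = len(FINAL_PREDICTORS)

for name, n in counts.most_common(6):
    print(f"{name}: {n}/{n_studies} studies ({100 * n / n_studies:.1f}%)")
```

Counts for less frequent predictors depend heavily on how labels are harmonized, so only the most frequent entries are printed here.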

Fig. 2. Frequency of predictors used in the development models

Characteristics of model validation

Table 2 displays the model validation of the included studies. Seven studies [26–28, 32–35] conducted internal validation only (58.3%). Two studies [25, 30] underwent external validation only, Jo et al. [31] alone used both internal and external validation, and two [24, 29] did not report any validation after model development. Regarding the method of internal validation, six studies [26–28, 32–34] used random sample splitting, and two [31, 35] used only ten-fold cross-validation and bootstrap resampling, respectively. Of the six studies that used random sample splitting, two [32, 34] additionally applied bootstrap resampling and five-fold cross-validation, respectively. Of the three externally validated studies, two [25, 30] used temporal validation and one [31] used geographical validation.

Characteristics of model presentation

As for the presentation of included models (Table 2), four studies [24, 25, 28, 29] presented models as equations, three [30, 32, 35] presented models as a nomogram, and two [27, 31] presented models as an online calculator. The remaining studies did not report the presentation format.

Characteristics of model performance

As shown in Table 3, the discrimination of the included studies was commonly reported using the AUC, while two studies [25, 29] did not report model discrimination. For the models that underwent internal validation, all demonstrated moderate to good discrimination, with reported AUC values ranging from 0.652 [35] to 0.97 [27]. Among the models that underwent external validation, Hu et al. [30] and Jo et al. [31] showed good discrimination, with reported AUCs of 0.839 and 0.880, respectively. Calibration results were reported in five studies [24, 30, 32, 33, 35]. Three studies [30, 32, 35] used calibration curves, which suggested good calibration. Ahmed et al. [24] evaluated calibration through the Hosmer-Lemeshow (HL) test, suggesting good calibration in the development set (HL P-values > 0.1), while Mohammed et al. [33] used Brier scores to assess the calibration of their four models in the test set, with Brier scores ranging from 0.088 to 0.095. Other indices, including the Youden index, sensitivity, specificity, negative predictive value, positive predictive value, accuracy, and F1 score, were also used to report model performance. In addition, three studies [30, 32, 35] used decision curve analysis (DCA) to assess the clinical benefit of the prediction models. Specifically, Hu et al. [30] showed high net benefits with ranges of 0.2–0.94 and 0.1–0.62 in the training and external validation sets, respectively. The other two models [32, 35] also presented good clinical utility.
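Several of the indices listed above are linked by standard formulas, which makes the values in Table 3 easy to sanity-check. The sketch below uses the textbook definitions of the Youden index and of the predictive values; the paper does not state how the original studies computed them, so treating the reported figures as derived this way is an assumption.

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden's J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) from sensitivity, specificity, and outcome prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Faure 2024 (Table 3): Se = 94.4%, Sp = 85.4%.
j_faure = youden_index(0.944, 0.854)

# Ahmed 2012 (Tables 1 and 3): Se = Sp = 71%, 227 transfusions among 2281 patients.
ppv_ahmed, npv_ahmed = predictive_values(0.71, 0.71, 227 / 2281)

print(f"Youden index: {j_faure:.3f}")
print(f"PPV: {ppv_ahmed:.2f}, NPV: {npv_ahmed:.2f}")
```

The low PPV alongside a high NPV for Ahmed 2012 follows directly from the roughly 10% transfusion rate: at low prevalence, even a moderately specific rule produces many false positives relative to true positives.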

Table 3. Summary of prediction model performance of the included studies

| Study | Discrimination measure | Calibration measure | Other indexes |
|---|---|---|---|
| Ahmed 2012 [24] | AUC = 0.74 (95% CI: 0.7–0.775) | Hosmer-Lemeshow test: χ2 = 9.36, P = 0.313 | Optimal cutoff level = 0.1; Se = 71%; Sp = 71%; NPV = 96%; PPV = 21% |
| Noticewala 2012 [25] | - | - | Se external = 90%; Sp external = 52.5% |
| Cavazos 2023 [26] | AUC internal = 0.894 | - | Accuracy train = 97.2%; Accuracy internal = 95.8% |
| Faure 2024 [27] | AUC internal = 0.97 (95% CI: 0.921–1) | - | Youden index internal = 0.8; Se internal = 94.4%; Sp internal = 85.4%; Accuracy internal = 89.9% |
| Kolin 2023 [28] | AUC internal = 0.90 (95% CI: 0.87–0.93) | - | Youden index internal = 0.6; Se internal = 78%; Sp internal = 87%; Accuracy internal = 97% |
| Mozella 2021 [29] | - | - | - |
| Hu 2020 [30] | AUC train = 0.884 (95% CI: 0.865–0.903); AUC external = 0.839 (95% CI: 0.773–0.905) | Calibration curve train: high consistency; Calibration curve external: good agreement | DCA train: better benefit (0.2–0.94); DCA external: better benefit (0.1–0.62) |
| Jo 2020 [31] | AUC internal = 0.842 (95% CI: 0.820–0.856); AUC external = 0.880 (95% CI: 0.844–0.910) | - | Youden index internal = 0.0687; Se internal = 89.8%; Sp internal = 74.8% |
| Liu 2024 [32] | AUC train = 0.855 (95% CI: 0.800–0.910); AUC internal = 0.824 (95% CI: 0.740–0.909) | Calibration curves train: high consistency; Calibration curves internal: good agreement | DCA: showed that the nomogram would provide a high net benefit |
| Mohammed 2022 [33] | LR: AUC test = 0.707 (95% CI: 0.704–0.711); GBM: AUC test = 0.797 (95% CI: 0.794–0.800); RF: AUC test = 0.783 (95% CI: 0.780–0.787); ANN: AUC test = 0.812 (95% CI: 0.805–0.820) | LR: Brier score test = 0.095; GBM: Brier score test = 0.091; RF: Brier score test = 0.094; ANN: Brier score test = 0.088 | - |
| Chen 2021 a [34] | LR: AUC internal = 0.816; SVM: AUC internal = 0.864; RF: AUC internal = 0.773; XGBoost: AUC internal = 0.888 | - | LR: Se internal = 88.9%; Sp internal = 50%; Accuracy internal = 81.6%; F1 internal = 0.897. SVM: Se internal = 100%; Sp internal = 72.7%; Accuracy internal = 86.4%; F1 internal = 0.972. RF: Se internal = 91.2%; Sp internal = 100%; Accuracy internal = 92.0%; F1 internal = 0.954. XGBoost: Se internal = 91.2%; Sp internal = 100%; Accuracy internal = 88.8%; F1 internal = 0.954 |
| Li 2024 a [35] | AUC internal = 0.652 (95% CI: 0.612–0.691) | Calibration curve internal: good agreement | Se internal = 52.01%; Sp internal = 85.34%; Accuracy internal = 72.85%; PPV internal = 68.02%; DCA: showed that validation cohort had good potential for clinical utility |

AUC Area under the receiver operating characteristic curve, Train Training set, Test Test set, Internal Internal validation, External External validation, Se Sensitivity, Sp Specificity, NPV Negative predictive value, PPV Positive predictive value, DCA Decision curve analysis

Risk of bias and applicability in prediction models

Table 4 and Fig. 3 summarize the risk of bias (ROB) and applicability of the included studies, and detailed results are presented in Additional file 3. Overall, all studies were assessed as having a high risk of bias.

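Table 4. PROBAST results of the included studies

| Study | ROB: Participants | ROB: Predictors | ROB: Outcome | ROB: Analysis | Applicability: Participants | Applicability: Predictors | Applicability: Outcome | Overall: ROB | Overall: Applicability |
|---|---|---|---|---|---|---|---|---|---|
| Ahmed 2012 [24] | - | ? | ? | - | + | + | + | - | + |
| Noticewala 2012 [25] | - | ? | ? | - | + | + | ? | - | ? |
| Cavazos 2023 [26] | - | ? | ? | - | + | + | + | - | + |
| Faure 2024 [27] | - | ? | ? | - | + | + | + | - | + |
| Kolin 2023 [28] | - | ? | ? | - | + | + | + | - | + |
| Mozella 2021 [29] | - | ? | ? | - | + | + | + | - | + |
| Hu 2020 [30] | - | + | ? | - | + | + | + | - | + |
| Jo 2020 [31] | - | ? | ? | - | + | + | + | - | + |
| Liu 2024 [32] | - | ? | ? | - | + | + | + | - | + |
| Mohammed 2022 [33] | - | ? | ? | - | + | + | ? | - | ? |
| Chen 2021 a [34] | - | ? | ? | - | + | + | ? | - | ? |
| Li 2024 a [35] | - | ? | ? | - | + | + | + | - | + |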

PROBAST Prediction model Risk Of Bias Assessment Tool, ROB Risk of bias

+: low ROB/low concern regarding applicability, -: high ROB/high concern regarding applicability, ?: unclear ROB/unclear concern regarding applicability

Fig. 3. Graphical summary presenting the percentage of risk prediction studies. A, Risk of bias (ROB). B, Risk of Applicability

As for the participant domain, all studies had a high risk of bias because of their retrospective study design. Within the predictor domain, only Hu et al. [30] was considered to have a low risk of bias because it reported quality control measures to minimize bias, while the remaining studies had an unclear risk of bias. In the outcome domain, three studies [25, 33, 34] had an unclear risk of bias due to unclear outcome definitions. None of the studies reported blinded assessment of outcomes.

As for the analysis domain, all studies were deemed to have a high risk of bias. Four studies [25, 26, 31, 32] had insufficient sample sizes, which did not meet the recommended standard of more than 20 ‘events per variable’ (EPV) [19]. Two studies [24, 33] transformed all continuous variables into categorical ones without any explanation. One study [26] did not report information about excluded participants. Eight studies [24–26, 29–31, 34, 35] did not report how missing data were handled, and one study [32] handled missing data inappropriately. Seven studies [24–26, 29, 30, 32, 35] selected predictors based on univariate analysis. Two studies [25, 29] did not report model discrimination, and seven [25–29, 31, 34] did not report how model calibration was assessed. Eight studies did not account for model underfitting or overfitting when evaluating model performance. Of these, four studies [24, 25, 29, 30] did not conduct internal validation, while four studies [15, 25, 26, 30] used only randomly split samples for internal validation. Only two studies [28, 32] considered collinearity among predictors and used variance inflation factor (VIF) analysis to address it. The remaining studies did not report on collinearity. Six studies [24, 25, 28, 30, 32, 35] reported the coefficients of the predictors in their regression models, consistent with the results of the multivariate analyses.
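The EPV criterion mentioned above is a simple ratio. The sketch below computes it for the four studies flagged for insufficient sample size, using the event counts from Table 1 and the numbers of candidate predictors from Table 2. Dividing by the number of candidate predictors (rather than final predictors or model parameters) is an assumption about how the criterion was applied in this review.

```python
EPV_THRESHOLD = 20  # recommended minimum cited in the text

# (study, number of events, number of candidate predictors), from Tables 1 and 2.
FLAGGED_STUDIES = [
    ("Noticewala 2012", 71, 31),
    ("Cavazos 2023", 67, 13),
    ("Jo 2020", 108, 43),
    ("Liu 2024", 73, 24),
]


def events_per_variable(events: int, n_candidate_predictors: int) -> float:
    """Events per variable: outcome events divided by candidate predictors."""
    if n_candidate_predictors <= 0:
        raise ValueError("number of candidate predictors must be positive")
    return events / n_candidate_predictors


epv = {study: events_per_variable(e, p) for study, e, p in FLAGGED_STUDIES}

for study, value in epv.items():
    status = "below" if value < EPV_THRESHOLD else "meets"
    print(f"{study}: EPV = {value:.1f} ({status} threshold of {EPV_THRESHOLD})")
```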

Regarding the applicability assessment, nine studies had low risk, and three [25, 33, 34] had unclear risk. Regarding the participant domain and the predictors domain, all studies were classified as low-risk. In the outcome domain, three studies [25, 33, 34] did not clearly define the predicted outcomes.

Meta-analysis results

Following the screening process, only three studies [25, 30, 31] were validated externally. Consequently, a meta-analysis was conducted on the models that were validated internally. Among them, two studies [26, 34] did not report the AUC or the 95% confidence interval (CI). Mohammed et al. [33] reported that the GBM-based prediction model demonstrated superior calibration and discrimination compared with alternative models. Therefore, the AUC of the GBM model was extracted for the meta-analysis. Ultimately, six studies [27, 28, 31–33, 35] were identified as meeting the inclusion criteria. The pooled AUC was calculated by applying a random-effects model, yielding a pooled result of 0.83 (95% CI: 0.74–0.92) (Fig. 4). The I² value was 97.3% (p < 0.001), indicating significant heterogeneity across the studies. Sensitivity analysis was conducted using the “metaninf” module in Stata 17. When each study was omitted in turn, the pooled estimates did not change significantly (Fig. 5), indicating that the results were robust. The results of the subgroup meta-analysis showed that the pooled AUC of the GBM models was higher than that of the LR models (0.87 vs. 0.79). However, there was no significant difference in model performance between them (p = 0.406), and heterogeneity within each subgroup remained significant (Fig. 6).
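The pooling and leave-one-out steps described above can be sketched as follows. This is a minimal illustration only: it uses the DerSimonian-Laird estimator on the untransformed AUC scale as a simple stand-in, which is not necessarily the estimator used in Stata for this review, and the AUC values and confidence intervals in the example are hypothetical rather than the extracted study data. Standard errors are back-calculated from 95% CIs under a normal approximation.

```python
import math


def se_from_ci(lower: float, upper: float, z: float = 1.96) -> float:
    """Approximate a standard error from a 95% confidence interval."""
    return (upper - lower) / (2 * z)


def pool_random_effects(estimates, ses):
    """DerSimonian-Laird pooling; returns (pooled, ci_low, ci_high, tau2)."""
    k = len(estimates)
    w = [1.0 / se ** 2 for se in ses]
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # Between-study variance, truncated at zero; undefined for a single study.
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]
    pooled = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    return pooled, pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled, tau2


def leave_one_out(estimates, ses):
    """Re-pool after omitting each study in turn; returns pooled estimates."""
    return [
        pool_random_effects(estimates[:i] + estimates[i + 1:],
                            ses[:i] + ses[i + 1:])[0]
        for i in range(len(estimates))
    ]


# Hypothetical AUCs with 95% CIs (illustrative values only).
aucs = [0.72, 0.81, 0.90]
cis = [(0.68, 0.76), (0.75, 0.87), (0.87, 0.93)]
ses = [se_from_ci(lo, hi) for lo, hi in cis]
pooled, low, high, tau2 = pool_random_effects(aucs, ses)
print(f"pooled AUC = {pooled:.3f} (95% CI {low:.3f} to {high:.3f})")
print("leave-one-out:", [round(p, 3) for p in leave_one_out(aucs, ses)])
```

If the leave-one-out estimates stay close to the full pooled value, no single study dominates the result, which is the sense in which the pooled AUC is described as robust.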

Fig. 4. Forest plot of the random-effects meta-analysis of pooled AUC estimates for 6 models

Fig. 5. Sensitivity analyses were conducted using a leave-one-out method

Fig. 6. Forest plots for subgroup analysis of logistic regression models (LR) versus machine learning models (GBM)

Discussion

This systematic review assessed 18 predictive models across 12 studies, drawn from various countries and regions. The common predictors used across the models included preoperative hemoglobin (Hb), age, BMI, surgery duration, and the use of tranexamic acid (TXA). These models exhibited moderate to excellent discrimination, with AUC results ranging from 0.652 [35] to 1.0 [27]. Nevertheless, all models were deemed to have a high risk of bias, which limits their reliability and clinical applicability. Therefore, future studies with improved methodological quality are needed to advance this field.

Model performance is primarily evaluated using discrimination and calibration [38]. In this review, AUC was the most frequently used metric for assessing model performance, with values approaching 1 indicating better predictive ability. The AUC values for the included models ranged from 0.652 to 0.97, with 13 models reporting AUC > 0.75, suggesting that these models discriminate well between patients who do and do not receive a blood transfusion following TKA. However, only four included studies [24, 30, 33, 35] reported calibration results. Calibration reflects the agreement between predicted probabilities and observed outcome frequencies [20]. The absence of calibration assessment is a major limitation, as a model with good discrimination can still produce inaccurate risk predictions, potentially leading to misleading clinical decision-making [39]. Therefore, future studies should include complete testing of model performance to assist decision-making and ensure more reliable clinical applications. Model validation, both internal and external, is essential for model development and implementation [40]. In this review, most studies lacked external validation, while two [24, 29] lacked both internal and external validation. External validation evaluates model performance in an independent dataset, which is a definitive test of generalizability and clinical utility [41]. The lack of external validation hinders a realistic assessment of model performance on independent datasets [14]. Furthermore, of the seven studies with internal validation, almost all used a random sample split, which could result in model overfitting because training and validation occur within the same dataset [42]. Moreover, the majority of the included studies derived patient data from a single center. Multicenter validation could improve the external validity of the models [43]. Decision curve analysis (DCA) is a visual tool that can reveal the clinical net benefit of prediction models across various threshold probabilities [44]. In our review, three studies [28, 30, 33] used DCA to assess model applicability and demonstrated clinical utility. Therefore, future research should focus on strengthening external validation, especially in multicenter studies, and on using DCA to evaluate clinical benefit and thereby enhance clinical applicability.
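To make the notion of net benefit in DCA concrete, the sketch below computes it at a single threshold probability using the standard definition, true positives per patient minus false positives per patient weighted by the odds of the threshold. The function names and the ten-patient dataset are hypothetical and serve only as an illustration; a full decision curve would repeat the calculation over a range of thresholds and compare the model with the "treat all" and "treat none" (net benefit of zero) strategies.

```python
def net_benefit(y_true, y_prob, threshold: float) -> float:
    """Net benefit of acting on patients whose predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold: float) -> float:
    """Net benefit of the reference strategy that treats every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical outcomes (1 = transfused) and predicted risks for 10 patients.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.6, 0.7, 0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.4]
for pt in (0.1, 0.3, 0.5):
    print(pt, round(net_benefit(y_true, y_prob, pt), 3),
          round(net_benefit_treat_all(y_true, pt), 3))
```

A model is considered clinically useful at a given threshold when its net benefit exceeds both reference strategies at that threshold.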

A meta-analysis was performed on six internally validated models, resulting in a pooled AUC of 0.83 (95% CI: 0.74–0.92), indicating relatively high predictive performance. However, the I² value was 97.3% (p < 0.001), suggesting significant heterogeneity among the studies. In the exploratory subgroup analysis (GBM vs. LR), the between-subgroup test was not significant (p = 0.406), and the heterogeneity within each subgroup remained significant, indicating that the modelling method was unlikely to be the primary source of heterogeneity. Instead, the heterogeneity could be attributed to other factors, such as variations in sample characteristics and outcome definition (transfusion indication) [45]. Regarding sample characteristics, the data included in the studies were derived from different countries, and this diversity of sample sources may lead to heterogeneity in results. There were also differences in TKA type. The risk of postoperative blood transfusion varies across surgical types because of differences in operative duration and blood loss. Patients who undergo simultaneous bilateral TKA typically exhibit higher transfusion requirements [46]. Moreover, there are no objective criteria for transfusion indication in TKA. Physicians make diverse transfusion decisions based on their clinical experience or the standards of their institution, which can affect the TKA transfusion rate [47]. This may lead to model prediction bias and contribute to the observed heterogeneity. However, other subgroup analyses were not feasible because there were too few studies in each subgroup. Therefore, future studies should aim to standardize their methodological details to facilitate more comprehensive meta-analyses and subgroup comparisons.
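For readers less familiar with the I² statistic quoted above, the following minimal sketch shows how it is derived from Cochran's Q under inverse-variance weighting, using the usual definition I² = max(0, (Q − df)/Q) × 100. The function names and input values are hypothetical and do not reproduce the data of this review.

```python
def cochran_q(estimates, standard_errors) -> float:
    """Cochran's Q: weighted squared deviations from the fixed-effect mean."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    return sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))


def i_squared(estimates, standard_errors) -> float:
    """I² in percent: share of total variation attributed to heterogeneity."""
    q = cochran_q(estimates, standard_errors)
    df = len(estimates) - 1
    if q <= 0.0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical example: two precise studies with clearly different AUCs.
print(round(i_squared([0.70, 0.90], [0.01, 0.01]), 1))
```

Because Q grows with the precision of the individual estimates, a set of studies with narrow confidence intervals and differing AUCs can yield a very high I² even when the absolute differences between AUCs look modest.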

According to PROBAST, all included studies were assessed as having a high risk of bias, primarily because of issues in the study design and analysis domains. All included studies employed retrospective designs, which could introduce information bias and recall bias. In the analysis domain, common issues included inadequate sample size, unclear handling of missing data, and inappropriate handling of predictors. Adequate sample size is critical to minimize overfitting and enhance model reliability [19]. As PROBAST suggests, an events-per-variable (EPV) ratio of at least 20 can help reduce overfitting and bias [19]. However, some studies [25, 26, 31, 32] failed to satisfy this criterion because of low transfusion rates or the use of numerous candidate predictors. In terms of handling missing data, many studies did not report their methods clearly or used inadequate approaches. For example, Liu et al. [32] directly deleted records with missing data, which could introduce sampling bias and affect the generalizability of prediction models [19]. Regarding predictor handling, two studies [24, 33] transformed continuous variables into categorical ones, which could result in loss of information and weaken the association between predictors and outcomes [48]. Furthermore, some studies used univariable analysis for predictor selection, which could lead to the loss of relevant variables and the introduction of bias [19]. To mitigate bias, Kolin et al. [28] used a combination of expert opinion and multiple analyses, while two studies [27, 31] chose the Recursive Feature Elimination (RFE) method to select predictors. The RFE method can automatically select optimal predictors and help avoid model overfitting [49]. Overall, improvements in model development methods are necessary to address these methodological problems.

The existing prediction models also have several important clinical implications. The most frequent predictors included in these models were all easily available clinical data, such as preoperative hemoglobin (Hb), age, BMI, surgery duration, and the use of tranexamic acid (TXA). These predictors could be considered in future models for predicting blood transfusion needs after TKA. Moreover, the appropriate format of model presentation is an important consideration when implementing prediction models in clinical settings [50]. In this review, various formats were used to present prediction models, including model equations, nomograms, and online calculators. Nomograms were commonly used for logistic regression models, while online calculators were typically used for machine learning models. Visual nomograms and online calculators convert complex statistical methods and machine learning algorithms into simple forms, facilitating efficient prediction of blood transfusion risk. However, many of the studies did not provide complete model equations or machine learning code, which limits external validation and model optimization [51]. Therefore, future research should consider the specific user and clinical setting when presenting prediction models and ensure that complete model equations and algorithms are provided to allow for external validation and optimization.

Strengths and limitations

This review has several strengths. We systematically reviewed the existing clinical prediction models for postoperative blood transfusion after TKA using an extensive literature search, which can provide comprehensive evidence in this field. Additionally, we conducted this review following the PRISMA [15] statement and the TRIPOD-SRMA [14] guideline, which can ensure complete and transparent reporting.

There are also some limitations in this review. Firstly, only studies published in English or Chinese were included, which may have led to the omission of relevant studies published in other languages. Secondly, owing to the limited number of included studies, only six models that underwent internal validation were included in the meta-analysis. This limitation prevented further quantification of publication bias and further exploration of the sources of heterogeneity among studies. Finally, significant heterogeneity was still observed across studies, which could stem from differences in participant populations and transfusion indications. Therefore, the clinical conclusions drawn from this review should be interpreted with caution.

Implications for clinical practice

Prediction models can help healthcare providers identify patients at high risk of requiring blood transfusion after TKA and guide the implementation of blood management strategies, including preoperative iron supplementation [52] and tranexamic acid use [53], which can reduce transfusion rates and associated complications. Future research should address several key areas. First, the predictors summarized in this review are readily available in clinical practice and could be prioritized in the development of future clinical risk prediction models to enhance predictive accuracy. Second, given that the current prediction models had a high risk of bias, researchers are recommended to improve the methodological quality of studies following the PROBAST guidelines, such as using objective outcome definitions, utilizing appropriate methods to handle missing data, avoiding categorizing continuous variables, and using bootstrapping or cross-validation instead of random sample splits to perform internal validation. Finally, most models included in this review were derived from single-center retrospective data and most lacked sufficient external validation, which may affect the generalizability and applicability of the models. Thus, factors such as patient populations and institutional conditions should be considered when implementing them. Future research should adopt prospective study designs and pay attention to multicenter external validation to improve the reliability and clinical utility of the models.

Conclusions

This systematic review and meta-analysis summarized 12 studies with 18 clinical prediction models for transfusion risk after TKA. Each of these models showed a moderate to excellent predictive performance, and their overall performance (AUC = 0.83) also demonstrated some discriminative ability. However, all studies were assessed to have a high risk of bias with the PROBAST because of some methodological weaknesses, including a lack of rigorous study design or inadequate external validation, limiting their widespread or reliable clinical implementation. Future research should focus on enhancing existing models or developing new ones with rigorous methodology and multicenter external validation, providing more reliable evidence for the application of transfusion risk prediction in TKA patients.

Supplementary Information

Supplementary Material 1. (19.5KB, docx)
Supplementary Material 2. (33.5KB, docx)
Supplementary Material 3. (36.5KB, docx)

Acknowledgements

Not applicable.

Abbreviations

TKA: Total knee arthroplasty
KOA: Knee osteoarthritis
TXA: Tranexamic acid
AUC: Area under curve
CHARMS: Checklist for Critical Appraisal and Data Extraction for Systematic Reviews of Prediction Modelling Studies
PROBAST: Prediction model Risk Of Bias Assessment Tool
REML: Restricted maximum likelihood
HKSJ: Hartung-Knapp-Sidik-Jonkman
LR: Logistic regression
MPNN: Message passing neural network
GBM: Gradient boosting machine
RF: Random forest
ANN: Artificial neural network
XG Boost: eXtreme gradient boosting
SVM: Support vector machine
KNN: K-nearest neighbors
RFE: Recursive feature elimination
LASSO: Least absolute shrinkage and selection operator
SRMI: Sequential regression multiple imputation
DCA: Decision curve analysis
EPV: Events per variable
NPV: Negative predictive value
PPV: Positive predictive value
VIF: Variance inflation factor

Authors’ contributions

Jingwen Chen: Writing – original draft, Methodology, Formal analysis, Data curation, Visualization, Conceptualization. Xiaoping Zhong: Writing – review & editing, Methodology, Data curation. Yaojie Zhai: Software, Data curation. Cuixian Zhao: Formal analysis, Validation. Jingjing Lan: Visualization. Zhenlan Xia: Writing – review & editing, Methodology, Conceptualization. Liping Chen: Supervision, Project administration, Conceptualization. All authors have read and approved the final manuscript.

Funding

This research did not receive any specific grant from any departmental funding.

Data availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Jingwen Chen and Xiaoping Zhong made equal contributions to this manuscript.

Contributor Information

Liping Chen, Email: clp202306@163.com.

Zhenlan Xia, Email: 421170624@qq.com.

References

  • 1. Ackerman IN, Soh S-E, de Steiger R. Actual versus forecast burden of primary hip and knee replacement surgery in Australia: analysis of data from the Australian orthopaedic association national joint replacement registry. J Clin Med. 2022;11:1883. 10.3390/jcm11071883.
  • 2. Rupp M, Lau E, Kurtz SM, Alt V. Projections of primary TKA and THA in Germany from 2016 through 2040. Clin Orthop Relat Res. 2020;478:1622–33. 10.1097/CORR.0000000000001214.
  • 3. Singh JA, Yu S, Chen L, Cleveland JD. Rates of total joint replacement in the United States: future projections to 2020–2040 using the National inpatient sample. J Rheumatol. 2019;46:1134–40. 10.3899/jrheum.170990.
  • 4. Sun W, Yuwen P, Yang X, Chen W, Zhang Y. Changes in epidemiological characteristics of knee arthroplasty in eastern, Northern and central China between 2011 and 2020. J Orthop Surg Res. 2023;18:104. 10.1186/s13018-023-03600-3.
  • 5. Palazzuolo M, Antoniadis A, Mahlouly J, Wegrzyn J. Total knee arthroplasty improves the quality-adjusted life years in patients who exceeded their estimated life expectancy. Int Orthop. 2021;45:635–41. 10.1007/s00264-020-04917-y.
  • 6. Zhang H, Mu X, Zhang Z, Lin J, Jin J, Qian W, et al. Clinical and hematological factors affecting perioperative blood loss following total knee arthroplasty: a new clinical prediction model. Chin Med J Engl. 2025;138:868–70. 10.1097/CM9.0000000000003519.
  • 7. Kimball CC, Nichols CI, Vose JG. Blood transfusion trends in primary and revision total joint arthroplasty: recent declines are not shared equally. J Am Acad Orthop Surg. 2019;27:E920–7. 10.5435/JAAOS-D-18-00205.
  • 8. Acuña AJ, Grits D, Samuel LT, Emara AK, Kamath AF. Perioperative blood transfusions are associated with a higher incidence of thromboembolic events after TKA: an analysis of 333,463 TKAs. Clin Orthop Relat Res. 2021;479:589–600. 10.1097/CORR.0000000000001513.
  • 9. Everhart JS, Sojka JH, Mayerson JL, Glassman AH, Scharschmidt TJ. Perioperative allogeneic red blood-cell transfusion associated with surgical site infection after total hip and knee arthroplasty. J Bone Joint Surg Am. 2018;100:288–94. 10.2106/JBJS.17.00237.
  • 10. Wang Q, Lee RLT, Hunter S, Chan SW-C. The effectiveness of internet-based telerehabilitation among patients after total joint arthroplasty: an integrative review. Int J Nurs Stud. 2021;115:103845. 10.1016/j.ijnurstu.2020.103845.
  • 11. Irving A, McQuilten ZK. Does patient blood management represent good value for money? Best Pract Res Clin Anaesthesiol. 2023;37:511–8. 10.1016/j.bpa.2023.11.004.
  • 12. Palmer AJR, Gagné S, Fergusson DA, Murphy MF, Grammatopoulos G. Blood management for elective orthopaedic surgery. J Bone Joint Surg Am. 2020;102:1552–64. 10.2106/JBJS.19.01417.
  • 13. Collins GS, Reitsma JB, Altman DG, Moons KGM. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. BMJ. 2015. 10.1136/bmj.g7594.
  • 14. Snell KIE, Levis B, Damen JAA, Dhiman P, Debray TPA, Hooft L, et al. Transparent reporting of multivariable prediction models for individual prognosis or diagnosis: checklist for systematic reviews and meta-analyses (TRIPOD-SRMA). BMJ. 2023. 10.1136/bmj-2022-073538.
  • 15. Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ. 2021;372. 10.1136/bmj.n71.
  • 16. Geersing G-J, Bouwmeester W, Zuithoff P, Spijker R, Leeflang M, Moons K. Search filters for finding prognostic and diagnostic prediction studies in medline to enhance systematic reviews. PLoS One. 2012;7:e32844. 10.1371/journal.pone.0032844.
  • 17. Moons KGM, de Groot JAH, Bouwmeester W, Vergouwe Y, Mallett S, Altman DG, et al. Critical appraisal and data extraction for systematic reviews of prediction modelling studies: the CHARMS checklist. PLoS Med. 2014;11:e1001744. 10.1371/journal.pmed.1001744.
  • 18. Wolff RF, Moons KGM, Riley RD, Whiting PF, Westwood M, Collins GS, et al. PROBAST: A tool to assess the risk of bias and applicability of prediction model studies. Ann Intern Med. 2019;170:51–8. 10.7326/M18-1376.
  • 19. Moons KGM, Wolff RF, Riley RD, Whiting PF, Westwood M, Collins GS, et al. PROBAST: A tool to assess risk of bias and applicability of prediction model studies: explanation and elaboration. Ann Intern Med. 2019;170:W1–33. 10.7326/M18-1377.
  • 20. Debray TPA, Damen JAAG, Snell KIE, Ensor J, Hooft L, Reitsma JB, et al. A guide to systematic review and meta-analysis of prediction model performance. BMJ. 2017;356:i6460. 10.1136/bmj.i6460.
  • 21. Higgins JPT, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analyses. BMJ. 2003;327:557–60.
  • 22. Chapter 13. Assessing risk of bias due to missing evidence in a meta-analysis | Cochrane. https://www.cochrane.org/authors/handbooks-and-manuals/handbook/current/chapter-13. Accessed 16 Aug 2025.
  • 23. Wallace BC, Schmid CH, Lau J, Trikalinos TA. Meta-analyst: software for meta-analysis of binary, continuous and diagnostic data. BMC Med Res Methodol. 2009;9:80. 10.1186/1471-2288-9-80.
  • 24. Ahmed I, Chan JKK, Jenkins P, Brenkel I, Walmsley P. Estimating the transfusion risk following total knee arthroplasty. Orthopedics. 2012;35:e1465–1471. 10.3928/01477447-20120919-13.
  • 25. Noticewala MS, Nyce JD, Wang W, Geller JA, Macaulay W. Predicting need for allogeneic transfusion after total knee arthroplasty. J Arthroplasty. 2012;27:961–7. 10.1016/j.arth.2011.10.008.
  • 26. Cavazos DR, Sayeed Z, Court T, Chen C, Little BE, Darwiche HF. Predicting factors for blood transfusion in primary total knee arthroplasty using a machine learning method. J Am Acad Orthop Surg. 2023;31:e845–58. 10.5435/JAAOS-D-23-00063.
  • 27. Faure N, Knecht S, Tran P, Tamine L, Orban J-C, Bronsard N, et al. Prediction of transfusion risk after total knee arthroplasty: use of a machine learning algorithm. Orthop Traumatology: Surg Res. 2024;103985. 10.1016/j.otsr.2024.103985.
  • 28. Kolin DA, Lyman S, Della Valle AG, Ast MP, Landy DC, Chalmers BP. Predicting postoperative anemia and blood transfusion following total knee arthroplasty. J Arthroplasty. 2023;38:1262–e12662. 10.1016/j.arth.2023.01.018.
  • 29. Mozella A, de P, Cobra HA, de Duarte AB. Predictive factors for blood transfusion after total knee arthroplasty. Rev Bras Ortop. 2021;56:463–9. 10.1055/s-0040-1715511.
  • 30. Hu C, Wang Y, Shen R, Liu C, Sun K, Ye L, et al. Development and validation of a nomogram to predict perioperative blood transfusion in patients undergoing total knee arthroplasty. BMC Musculoskelet Disord. 2020;21:315. 10.1186/s12891-020-03328-9.
  • 31. Jo C, Ko S, Shin WC, Han H-S, Lee MC, Ko T, et al. Transfusion after total knee arthroplasty can be predicted using the machine learning algorithm. Knee Surg Sports Traumatol Arthrosc. 2020;28:1757–64. 10.1007/s00167-019-05602-3.
  • 32. Liu Y, Ai J, Teng X, Huang Z, Wu H, Zhang Z, et al. Risk factor analysis and establishment of a nomogram model to predict blood loss during total knee arthroplasty. BMC Musculoskelet Disord. 2024;25:1–12. 10.1186/s12891-024-07570-3.
  • 33. Mohammed H, Huang Y, Memtsoudis S, Parks M, Huang Y, Ma Y. Utilization of machine learning methods for predicting surgical outcomes after total knee arthroplasty. PLoS One. 2022;17:e0263897. 10.1371/journal.pone.0263897.
  • 34. Chen C, He D, Liang J, He Z. Predicting the possibility of blood transfusion after total knee arthroplasty based on machine learning algorithm. Zhongguo Zuzhi Gongcheng Yanjiu. 2021;25:5792. [In Chinese].
  • 35. Li X, Lu X, Shi L, Xu K, Yu T, Zhang Y. Establishment of a predictive model for blood transfusion within 14d after simultaneous bilateral total knee arthroplasty. J Qingdao Univ (Medical Sciences). 2024;60:693–6. [In Chinese].
  • 36. Gross JB. Estimating allowable blood loss: corrected for dilution. Anesthesiology. 1983;58:277–80. 10.1097/00000542-198303000-00016.
  • 37. Hu Y, Lu H, Ren L, Yang M, Shen M, Huang J, et al. Prediction models for perineal lacerations during childbirth: a systematic review and critical appraisal. Int J Nurs Stud. 2023;145:104546. 10.1016/j.ijnurstu.2023.104546.
  • 38. Steyerberg EW, Vickers AJ, Cook NR, Gerds T, Gonen M, Obuchowski N, et al. Assessing the performance of prediction models: a framework for some traditional and novel measures. Epidemiol (Cambridge Mass). 2010;21:128. 10.1097/EDE.0b013e3181c30fb2.
  • 39. Van Calster B, McLernon DJ, van Smeden M, Wynants L, Steyerberg EW, Bossuyt P, et al. Calibration: the Achilles heel of predictive analytics. BMC Med. 2019;17:230. 10.1186/s12916-019-1466-7.
  • 40. Moons KGM, Altman DG, Reitsma JB, Ioannidis JPA, Macaskill P, Steyerberg EW, et al. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): explanation and elaboration. Ann Intern Med. 2015;162:W1–73. 10.7326/M14-0698.
  • 41. Riley RD, Ensor J, Snell KIE, Debray TPA, Altman DG, Moons KGM, et al. External validation of clinical prediction models using big datasets from e-health records or IPD meta-analysis: opportunities and challenges. BMJ. 2016;353:i3140. 10.1136/bmj.i3140.
  • 42. Steyerberg EW, Harrell FE, Borsboom GJ, Eijkemans MJ, Vergouwe Y, Habbema JD. Internal validation of predictive models: efficiency of some procedures for logistic regression analysis. J Clin Epidemiol. 2001;54:774–81. 10.1016/s0895-4356(01)00341-9.
  • 43. Ho SY, Phua K, Wong L, Bin Goh WW. Extensions of the external validation for checking learned model interpretability and generalizability. Patterns. 2020;1:100129. 10.1016/j.patter.2020.100129.
  • 44. Kerr KF, Brown MD, Zhu K, Janes H. Assessing the clinical impact of risk prediction models with decision curves: guidance for correct interpretation and appropriate use. JCO. 2016;34:2534–40. 10.1200/JCO.2015.65.5654.
  • 45. Debray TPA, Vergouwe Y, Koffijberg H, Nieboer D, Steyerberg EW, Moons KGM. A new framework to enhance the interpretation of external validation studies of clinical prediction models. J Clin Epidemiol. 2015;68:279–89. 10.1016/j.jclinepi.2014.06.018.
  • 46. Chalmers BP, Mishu M, Chiu Y-F, Cushner FD, Sculco PK, Boettner F, et al. Simultaneous bilateral primary total knee arthroplasty with TXA and restrictive transfusion protocols: still a 1 in 5 risk of allogeneic transfusion. J Arthroplasty. 2021;36:1318–21. 10.1016/j.arth.2020.10.042.
  • 47. Frisch NB, Wessell NM, Charters MA, Yu S, Jeffries JJ, Silverton CD. Predictors and complications of blood transfusion in total hip and knee arthroplasty. J Arthroplasty. 2014;29(9 Suppl):189–92. 10.1016/j.arth.2014.03.048.
  • 48. Altman DG, Royston P. The cost of dichotomising continuous variables. BMJ. 2006;332:1080. 10.1136/bmj.332.7549.1080.
  • 49. Pudjihartono N, Fadason T, Kempa-Liehr AW, O’Sullivan JM. A review of feature selection methods for machine Learning-Based disease risk prediction. Front Bioinform. 2022;2:927312. 10.3389/fbinf.2022.927312.
  • 50. Bonnett LJ, Snell KIE, Collins GS, Riley RD. Guide to presenting clinical prediction models for use in clinical settings. BMJ. 2019;365. 10.1136/bmj.l737.
  • 51. Kong L-N, Yang L, Lyu Q, Liu D-X, Yang J. Risk prediction models for frailty in older adults: a systematic review and critical appraisal. Int J Nurs Stud. 2025;167:105068. 10.1016/j.ijnurstu.2025.105068.
  • 52. Park Y-B, Kim K-I, Lee H-J, Yoo J-H, Kim J-H. High-dose intravenous iron supplementation during hospitalization improves hemoglobin level and transfusion rate following total knee or hip arthroplasty: a systematic review and meta-analysis. J Arthroplasty. 2024. 10.1016/j.arth.2024.11.058.
  • 53. Chen JY, Chin PL, Moo IH, Pang HN, Tay DKJ, Chia S-L, et al. Intravenous versus intra-articular tranexamic acid in total knee arthroplasty: a double-blinded randomised controlled noninferiority trial. Knee. 2016;23:152–6. 10.1016/j.knee.2015.09.004.


Articles from BMC Musculoskeletal Disorders are provided here courtesy of BMC
