Skip to main content
BMC Musculoskeletal Disorders logoLink to BMC Musculoskeletal Disorders
. 2025 Dec 11;27:31. doi: 10.1186/s12891-025-09380-7

Postoperative serum albumin drop predicts complications in primary total knee arthroplasty: a retrospective cohort study using XGBoost algorithm

Lintuo Huang 1,✉,#, Xiuli Lin 1,#, Lidan Zheng 1, Yueying Zhu 1,
PMCID: PMC12801642  PMID: 41382082

Abstract

Background

Perioperative complications that occur after TKA are still challenging to manage. While preoperative hypoalbuminemia has been proved to be a risk-factor indicator, the role of ΔAlb is yet to be determined.

Methods

In this retrospective study with a cohort analysis, an XGBoost machine learning model was trained with data collected from 758 TKA patients (2018–2022), to identify predictors that aid in the prediction of joint infections after TKA. The predictors to be entered into the model include 28 variables such as ΔAlb (preoperative values - postoperative values)/preoperative values \* 100%, peak C-reactive protein (CRP) level on postoperative day 2, and erythrocyte sedimentation rate (ESR) dynamics. Validation criteria include AUC-ROC, integrated calibration index (IC), and decision curve analysis (DCA).

Results

The XGBoost model provided better predictive accuracy (AUC = 0.947, 95% CI, 0.923 - 0.968), performing better than the logistic regression model (AUC = 0.752) and other ensemble methods (random forest, AUC = 0.835; LightGBM, AUC = 0.905). For the variables, the model that had the highest clinical relevance was a percentage decrease of Alb > 15%, which independently raised the risk of developing complications fivefold (OR, 5.8, P < 0.001), with progressively increasing hazard ratios for a percentage decrease below the threshold. Calibration reliability was high (ICI, 0.101), and the model provided informative net benefit within the range of clinical decision-making (10% - 60%). An important interaction existed between the variables: patients with percentage decreases of Alb > 15% and CRP > 60 mg/L were at 3.2 times higher risk of venous thromboembolism. Complications occurred in 9.63% of patients, of which 52.1% were venous thromboembolisms.

Conclusions

ΔAlb is an excellent dynamic marker useful in the determination of post-TKA complications. Using ΔAlb values and monitoring the level of CRP provides clinicians with precision risk estimation superior to that presented by current models encompassing patient comorbidity. Based on the presented XGBoost model, clinicians are in possession of actionable risk thresholds useful in interventions such as replenishing intravenous albumin during situations wherein ΔAlb surpasses 15%, in an attempt to potentially preventing over 60% of avoidable complications in high-risk cohorts.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12891-025-09380-7.

Keywords: Knee arthroplasty, Albumin kinetics, Postoperative complications, Machine learning, Predictive modeling, Biomarkers


Total knee arthroplasty (TKA) has gained acceptance as an effective method in the treatment of serious cases of knee disorders [1]– [2]. TKA surgery ensures economically viable functional outcomes; however, the potential postoperational complications are one of the most serious current medical concerns among patients who undergo TKA procedures, particularly among geriatric patients. Additionally, the development of post-TKA complications has already had serious repercussions on prolonged hospital stays and corresponding health expenses [3]. Optimization and detection of pre- and post-TKA patient status and risk factors would be greatly beneficial.

From previous research, it has been found that one of the frequent complications in the perioperative period after TKA is hypoalbuminaemia, which can result in various poor outcomes such as surgical site infection, poor wound healing, wound dehiscence, deep vein thrombosis, and moderate to severe pain [4]. Additionally, hypoproteinaemia causes prolonged hospital stay and delays recovery among patients, as well as affecting hospital readmission rates [5]. Thus, there is need to investigate the potential causes and warning signals of postoperative hypoalbuminaemia among patients who are operated on with TKA to be able to take preventive measures to counteract the consequences caused by the above complications.

Some previous works have found that preoperative hypoalbuminemia, a proven nutritional marker, is one among the most usual conditions during TKA. The result of hypoalbuminemia shows prolonged hospital stay, deteriorates recovery process in patients, and even results in the increase in re-hospital rates. As such, hypoproteinaemia is an essential predictive marker among postoperative dysfunctioning and complications within patients. But preoperative hypoproteinaemia fails to denote surgical trauma among the operations. A fall in postoperative albumin (ΔAlb), one among the surgical stress markers, relates to the risk among postoperative complications within orthopaedic patients. The predictive values occur already in 6–24 h postoperatively; in fact, it has greater predictive values compared to other markers among significant acute postoperative inflammation (CRPs) already in use within current medical settings (e.g. C-Reactive Protein), taking 2–3 days postoperatively to attain its postoperative peak [6]. Concluding that ΔAlb among abdominal surgery has significant predictive values among postoperative complications was shown by Labuga et al. [7]. As per Qi’s study, rapid fall in ΔAlb Alb denoted an independent predictive marker among delirium among TJA [8]. If ΔAlb Alb has potential to be proven as the newer and more efficient predictive marker among TKA complications has to be discovered.

In current years, there has been an increasing recognition about the importance of machine learning (ML), which has made it one among the most widely adopted models to solve the problem of classification [9]. The efficiency and versatility of machine learning make it possible to increase event prediction accuracy and efficiently manage non-linear relations that are critical during the process of recognizing complex data patterns. The Extreme Gradient Boost (XGBoost) algorithm has found extensive applications in statistics, data mining tasks in machine learning and artificial intelligence due to its requirement of low input data and low computational requirements [10]– [11]. In situations wherein more significant data related to clinicians are more complex, the models made with XGBoost might be superior to others during the development process of classifying models and predicting models.

Data models built through use of clinical data and certain algorithms allow physicians to assess key risk factors and proportions thereof as well as to make appropriate assessments and further use them to direct subsequent medical treatments. As such, our study was designed with the intention to build the XGBoost model ΔAlb and other key clinical factors to predict the development of postoperative complications in patients undergoing TKA procedures and further to compare the efficacy of the XGBoost model with that of other prediction models built by logistic regression analysis. The rationale behind our study is that postoperative ΔAlb values with other factors such as inflammation and other comorbid conditions can be built to form an improved prediction baseline in TKA patients if such factors are analyzed utilizing the XGBoost algorithm. The reason behind choosing the XGBoost model was due to its previously proven efficacy in the analysis of complex clinical data sets with non-linear factors potentially overlooked by use of statistics [911].

Materials and methods

Patients

The study was carried out after approval by the Institutional Review Board with the aid of the administrative database among patients within the large integrated health care delivery system. The data was derived from the outpatient and inpatient data components with the use of an unique identifiers assigned to each patient in the population with inpatient care. A total of 758 patients who received unilateral TKA between October 2018 and October 2022 had imaging confirmation of end-stage osteoarthritis with body mass index (BMI) values greater than 40 kg/m2, who received either receptor surgery or bilateral arthroplasty, who received anticoagulation therapy, who had either tumors/pathological fractures, hemorrhagic disorders, cirrhosis, other disorders that modulate the expression of Alb within sera respectively.

Perioperative management and data collection

The TKAs had all been carried out by an expert surgical team and involved the use of a cemented posterior-stabilizing prosthesis (Stryker, Kalamazoo, MI, USA; or Wright, Arlington, TX, USA) in all the subjects. The surgical approaches entailed making a midline skin incision with subsequent bursa incision on the medial parapatellar bursa with subsequent measured resections. One of the perioperative fluids that greatly influenced the research was the use of balanced crystals/gelations solutions depending on the Expert Consensus on Perioperative Fluid Therapy on surgical patients [12]. Fluid management to lessen post-operative blood disorders and haemorrhaging was in accordance with the Chinese Expert Consensus on Accelerated Recovery fromOrthopaedic Surgery- Perioperative Blood Management [13]. The data taken as baseline was inclusive of the identity numbers, age, gender, smoking history, height, weight, BMI values, haematological index test values, past histories of concomitant chronic diseases (hypertension, diabetes, and others), initial first lower limbs X-rays taken during admission to hospital with subsequent computed tomography scanning, anaesthesiologist’s (ASA) grading on the patient classification due to various factors entailing possible blood losses during surgery with subsequent blood tests (CRP values, Albumin values, haemoglobin values among others).

The key post-operative outcome metric was the proportion of patients who had at least one serious complication (Clavien-Dindo Classification, CDC) grade 2 and above within 2 weeks. The surgeons’ expertise was rated by CDC. The surveillance list of potential complications was transfusion event, venous thromboembolism event, death, surgical site bleeding, surgical site infection, mechanical event, and arthrofibrosis.

“Postoperative complications had been actively tracked and recorded during the hospital stay of the patient and had been noted after carrying out a complete search of the electronic medical records that include data on physician and nurse entries and other documents such as blood test results and medications administered to patients. The surgical re-interventions due to complications (also termed ‘revision surgery’) had also been particularly recorded. The postoperative time duration allotted to track these complications was the initial two weeks after surgery. Each and every complication has been ranked according to the ‘Clavien-Dindo Classification (CDC)’ by two separate orthopaedic surgeons; and has been reconciled with the opinion of the third most senior one if there are any disagreements among them.”The patients had also been classified in terms of ‘Body Mass Index (BMI), pre-operative ‘Albumin’, ‘Haemoglobin’, change in Albumin (‘ΔAlb’), ‘pre-operative time’, ‘surgical time’, ‘Total Blood Loss’, ‘peak values within post-operative day 2 - ‘ESR (Erythrocyte Sed rate)’, ‘peak values within post-operative day 2 - ‘CRP (reactive protein)’. In accordance with ‘Worldwide Body Mass Index Classification (WHO Criteria), the ‘Body Mass Index’ has been classified as ‘underweight’ if it measures less than 18.5 kg/m2 ‘Normal’ if measures 18.5 kg/m2 to 25 kg/m2 ‘Overweight’ if measures more than 25 kg/m2; ‘Albumin’ has been tested one after the other and then again classified to form two groups- ‘hypoalbuminaemia’ if measures less than 35 g/L ‘normoalbuminaemia’ if measures more than 35 g/L; ‘Haemoglobin’ has been tested one after the other and classified to form ‘Male’ & ‘Female’ diagnostic criteria; ‘Hb measures less than 120 g/L ‘in Male patients; ‘in Female patients measures less than 110 g/L is termed ‘anaemia.

Initial univariable screening

All candidate predictor variables (demographics, comorbidities, laboratory parameters, and operative variables listed in Table 1) The data was first analyzed using univariable logistic regression. Variables found to be statistically significant with postoperative complications with alpha set to 0.1 (p-value < 0.1) were further proceeded to build models. This was to ensure that no significant variable was left out during initial screening. The analysis was carried out using the scipy.stats package in Python (v1.10.0) Tables 2 and 3.

Table 1.

Baseline social demographic and clinical characteristics

All patients Patients with complication Non-complication patients P
Number of patients (n, %) 758 73 685
Gender (n, %) 0.639
 Male 247(32.54%) 22(30.14%) 225(32.85%)
 Female 511(67.46%) 51(69.86%) 460(67.15%)
Surgery side (n, %) 0.412
Left 360(47.5%) 38(52.1%) 322(47.01%)
Right 398(52.5%) 35(47.9%) 363(52.99%)
BMI (mean ± SD) 0.034*
Underweight 40(5.28%) 2(2.74%) 38(5.55%)
Normal 340(44.85%) 26(35.62%) 314(45.84%)
Overweight 378(49.87%) 45(61.64%) 333(48.61%)
Age (mean ± SD) 69.65 ± 6.92 69.79 ± 7.04 69.64±6.91 0.856
ASA score (n, %) 0.490
I 42(5.54%) 4(5.48%) 38(5.55%)
II 654(86. 28%) 61(83.56%) 593(86.57%)
III 62(8.18%) 8(10.96%) 54(7.88%)
Length of stay (days) 7.24 ± 1.41 8.41 ± 1.46 7.47 ± 1.52 < 0.001*
Preoperative time (days) 1.78 ± 0.85 1.74 ± 0.83 1.70 ± 0.76 0.659
Duration of surgery (min) 77.13 ± 10.39 76.53 ± 9.92 77.19 ± 10.44 0.606
Hypertension (n, %) 471(62.14%) 46(63.01%) 425(62.04%) 0.871
Diabetes (n, %) 199(26.25%) 28(38.36%) 171(24.96%) 0.013*
Heart disease (n, %) 48(6.33%) 4(5.48%) 44(6.42%) 0.753
Renal insufficiency (n, %) 28(3.69%) 3(4.11%) 25(3.65%) 0.843
Osteoporosis (n, %) 45(5.94%) 4(5.48%) 41(5.99%) 0.862
Tourniquet (n, %) 425(56.07%) 48(65.75%) 377(49.73&) 0.080
Table 2.

Summary of Biochemical and Hematological Parameters in Experimental Groups

Laboratory tests Yes (n = 73) No (n = 685) P-value
Preoperative Hb(g/L) 133.21 ± 10.71 131.21 ± 11.75 0.164
Preoperative Alb (mg/L) 39.30 ± 2.67 39.29 ± 2.51 0.338
Preoperative ESR (mg/L) 7.04 ± 4.88 7.44 ± 4.89 0.862
Preoperative CRP (mg/L) 2.41 ± 2.22 2.20 ± 1.77 0.226
Total blood loss (mL) 477.29 ± 295.47 541.82 ± 323.20 0.103
ΔAlb (%) 17.14 ± 5.79 13.05 ± 5.87 < 0.001*
Peak ESR on POD2 (mg/L) 41.38 ± 17.33 44.34 ± 17.33 0.217
Peak CRP on POD2 (mg/L) 61.57 ± 39.27 53.39 ± 27.69 0.022*
WBC (10⁹/L) 7.24 ± 2.89 7.07 ± 2.64 0.617
Prothrombin time (s) 13.19 ± 0.80 12.97 ± 0.95 0.062
D-dimer (µg/mL) 0.50 ± 0.36 0.50 ± 0.51 0.934
Table 3.

Classification and Incidence of Postoperative Complications in Study Groups

Adverse Events TKA, n (%)
Total 73(9.63%)
Transfusion 4(0.53%)
Venous thromboembolism 38(5.01%)
Death 1(0.13%)
Surgical site bleeding 11(1.45%)
Surgical site infection 9(1.19%)
Mechanical complication 2(0.26%)
Arthrofibrosis 8(1.05%)

LASSO regularization for final variable selection

To overcome the limitation of univariable screening in terms of methodology [12], the variables that had come up in the initial screening (p < 0.1) were proceeded with LASSO (Least Absolute Shrinkage and Selection Operator) regression analysis. The process aims to reduce the risk of overfitting by shrinking the coefficients towards zero.

Implementation details:

Algorithm L1-regularized logistic regression with LassoCV in scikit-learn.

Hyperparameter tuning: The strength of the regularization (lambda) was tuned via 10.

Lambda search range (log-scale): 1e− 5 to 1e2 (.

Selection criterion: Lambda to minimize binomial deviance.

Final inclusion rule:

The variables with non-zero coefficients at optimal lambda will be included in the solution

The problem can be

The objective function is:

\min − (1/N) \sum [y_i \log(p_i) + (1-y_i) \log(1 -.

In the function `log_loss`, `p_i = 1/(1 + exp(-(beta_0 + x_i^T. \beta)))`, `sum|beta_j|.

Validation: Coficiente paths to assess the validation of selection are shown in Fig. 1.

Fig. 1.

Fig. 1

Decision Curve Analysis (DCA) of XGBoost ModelClinical utility assessment comparing net benefits across intervention thresholds (0-0.9 probability range). The XGBoost model (red) exceeds "Treat All" (blue) and "Treat None" (gray) strategies in net benefit throughout the clinically relevant interval (10%-70% thresholds), demonstrating its value in guiding perioperative decisions to optimize resource allocation and outcomes

Predictive model development

Algorithm training

Three ensemble learning algorithms were trained using variables selected through LASSO regularization (Sect. 2.3.2): XGBoost (Extreme Gradient Boosting), version 1.7.3; Random Forest, implemented via scikit-learn v1.2.0; LightGBM (Light Gradient Boosting Machine), version 3.3.5;

Hyperparameter optimization was conducted using Bayesian optimization with 50 iterations. Key tuned parameters included: Tree depth (range: 3–15); Learning rate (range: 0.01–0.3); Number of estimators (range: 100–1000); Subsampling ratio (range: 0.6–1.0.6.0).

Validation framework

Model validation adhered to the following protocol: Data partitioning: 80% training/20% testing; Stratified sampling: Preserved complication ratio across sets; Repetition: 5 independent runs with different random seeds; Performance reporting: Metrics averaged over all runs; Final models were selected based on optimal AUC-ROC in cross-validation. Testing set included 411 patients (54.2%), derived from stratified random sampling.

Calibration assessment

Model calibration was quantitatively evaluated using: Calibration curves:; Binned predicted probabilities vs. observed event frequencies; Generated using Python’s prob_calibration package; Number of bins: 10.

Integrated Calibration Index (ICI): \.

where = predicted probability for instance i, \ = observed frequency in bin containing i.

Interpretation criteria: ICI < 0.05: Excellent calibration; 0.05 ≤ ICI < 0.10: Good calibration; ICI ≥ 0.10: Poor calibration.

Visualizations included: Global calibration curves for all models (Fig. 2);Model-specific curves with ICI annotations (Fig. 3) Reference to perfect calibration line (slope = 1).

Fig. 2.

Fig. 2

LASSO Regression Coefficient DistributionFeature selection results showing standardized coefficients from L1-penalized regression. Bars represent selected predictors (non-zero coefficients), with length indicating effect magnitude and direction (red: risk increase, blue: reduction). Top 3 predictors: postoperative ESR (β=-0.723, P<0.001), postoperative CRP (β=0.692, P<0.001), and ΔAlb (β=0.537, P<0.001)

Fig. 3.

Fig. 3

Receiver Operating Characteristic (ROC) CurvesComparative discrimination performance of XGBoost (AUC=0.947), LightGBM (AUC=0.905), and Random Forest (AUC=0.835). XGBoost exhibits significantly superior classification accuracy, with optimal threshold at probability=0.24 (sensitivity=84.0%, specificity=91.7%, Youden index maximization). Gray shading indicates 95% confidence intervals

Clinical utility evaluation (Decision curve Analysis)

Decision Curve Analysis (DCA) was employed to quantify the clinical utility of the XGBoost model by comparing net benefits across different intervention thresholds [1]. The analysis was implemented using the decision_curve function from Python’s scikit-survival package (v0.19.0).

Key parameters: Threshold probability range: 0.1 to 0.7 (10%−70%).

Comparison strategies: (a) Treat All patients (maximum sensitivity); (b) Treat None patients (maximum specificity); (c) XGBoost model predictions.

Net benefit calculation:

Net Benefit = (True Positives/N) - (False Positives/N) × (p_t/(1 - p_t)).

where p_t = threshold probability, N = total sample size.

Interpretation protocol: Higher net benefit indicates superior clinical value; The optimal strategy has the highest curve within the clinical-relevant range (Fig. 4); Clinical significance threshold: Minimum 2% net benefit improvement over default strategies (Figs. 5 and 6.

Fig. 4.

Fig. 4

Calibration Curves for Three Machine Learning ModelsProbability calibration plots for XGBoost (ICI=0.101), Random Forest (ICI=0.120), and LightGBM (ICI=0.190) models. Dashed "Perfect" line represents ideal calibration. Closer curve proximity to this line indicates superior performance. XGBoost demonstrates optimal calibration with the lowest Integrated Calibration Index (ICI), signifying its superior reliability in predicting complication probabilities

Fig. 5.

Fig. 5

SHAP Feature Importance RankingPredictor contributions ranked by mean absolute SHAP values. ΔAlb (mean|SHAP|=0.9724) emerges as the most influential feature, followed by postoperative ESR (0.8989) and CRP (0.7927). Error bars represent standard deviations across validation folds, confirming stability of these core predictors' dominance

Fig. 6.

Fig. 6

SHAP Summary Plot for Key PredictorsVisualization of feature impact on model output:ΔAlb: High values (red) → positive SHAP values → increased complication riskESR postop: High values (red) → negative SHAP values → risk reductionCRP postop: High values (red) → strong positive SHAP values → elevated riskEach point represents an individual patient; color intensity reflects feature magnitude

Statistical analyses

Conventional statistical testing

The normality of continuous variables was tested with the Shapiro-Wilk normality test (α = 0.05), with post-hoc tests conducted with independent samples t-test if normally distributed and with the non-parametric Mann-Whitney U test if variables are non-normally distributed. For categorical variables, χ² tests with Yates’ continuity correction (or with Fisher’s exact test if any cell contained values below 5) yielded odds ratios (OR) with 95% confidence intervals. To explore correlation measures between variables, Pearson correlation coefficients (r) were employed if variables are fully continuous or Spearman rank correlation coefficients (ρ) if one variable was ordinal (or non-parametric). In initial univariable tests, FDR adjustment according to Benjamini-Hochberg was added to ensure FDR < 5% with multiple testing. Coefficients with |β|>0.2 had p < 0.01 in Wald tests after selection.

Machine learning performance metrics

Performance analysis in terms of model discrimination employed area under the receiver operating characteristic curve (AUC-ROC), with confidence intervals calculated by DeLong’s method; the best probability threshold was defined to optimize Youden’s index. Calibration analysis utilized the Integrated Calibration Index (ICI), defined as ICI = (1/n)Σ|p_i - ô_i|; the values p_i are predicted probabilities, and ô_i are observed frequencies in calibration bins. Calibration plots also employed measures such as calibration slope and calibration intercept obtained from logistic regression analysis. The analysis of clinical utility employed decision curve analysis (DCA), with net benefit defined as follows: Net Benefit = (TP/N) - (FP/N) × [p_t/(1 - p_t)], within the range 10%−70% threshold probability values.

Feature selection validation

The stability of LASSO results was systematically tested with 1,000 bootstraps to calculate the probability of selecting features and the variability (expressed as the standard deviations of standardized β values) of corresponding coefficients. Biologically consistent results from machine learning models were tested with multimodal validation techniques, that is, SHAP (SHapley Additive exPlanations) dependence plots were tested with Spearman rank correlation tests among predictors (e.g., ΔAlbumin values) and corresponding histopathology results (e.g., edematous Changes). The joint effect of biomarkers was analyzed with multiplicative terms added to logistic regression models with likelihood ratio tests.

Software and implementation

Each traditional statistical analysis was conducted in R version 4.2.1 with the aid of packages stats, rms, pROC, and dcurves. The part that involves machine learning was carried out in Python 3.10.12 with version-controlled packages scikit-learn 1.2.0 for modeling tasks, xgboost 1.7.3 for ensemble modeling, shap 0.44.0 for explanation purposes, and scikit-survival 0.19.0 to plot clinical utility curves. Reproducibility was made possible by the use of randomly generated seeds within the codes (random_state = 2023 in the case of Python scripts and set.seed(2023) in R scripts). The complete codes with parameter settings.

Patient characteristics

The retrospective cohort included 758 primary TKA patients, with 73 (9.63%) developing complications within 30 days. Significant group differences emerged in three domains: inflammatory-metabolic profiles showed complication cases had higher mean ΔAlb (17.14% vs. 13.05%, p < 0.001), with ΔAlb > 15% increasing risk 140%, and elevated peak CRP (61.57 vs. 53.39 mg/L, p = 0.022) but paradoxically lower ESR (41.38 vs. 44.34 mg/L, p = 0.217). Body composition revealed overweight patients (BMI ≥ 25 kg/m²) were overrepresented in complications (61.64% vs. 48.61%, p = 0.034), as were diabetics (38.36% vs. 24.96%, p = 0.013). Resource utilization showed longer hospital stays in complication cases (8.41 vs. 7.47 days, p < 0.001), correlating with CRP elevation (r = 0.41, p = 0.003) and ΔAlb magnitude (r = 0.38, p = 0.008). Venous thromboembolism predominated (52.05% of complications), strongly associated with CRP > 60 mg/L + ΔAlb > 15% synergy (3.2-fold risk increase). Non-significant factors included age (69.79 ± 7.04 vs. 69.64 ± 6.91 years, p = 0.856), sex distribution (69.86% vs. 67.15% female, p = 0.639), and surgical laterality (52.10% vs. 47.01% left-side procedures, p = 0.412).

Feature selection outcomes

LASSO regularization (λ = 0.015) selected 17 predictors from 28 candidate variables. Key findings included: (1) Postoperative inflammation duality where CRP elevation increased risk (β=+0.692; OR = 2.00, 95%CI:1.54–2.58 per 10 mg/L increase), while paradoxical ESR reduction suggested compensatory mechanisms (β=−0.723); (2) Albumin dynamics dominated prediction (ΔAlb β=+0.537) with threshold-dependent effects—minimal risk at < 1.5 units versus exponential rise beyond this threshold; (3) Conventional predictors were displaced—diabetes retained marginal significance (β=+0.088, OR = 1.09) while ASA classification, age, and cardiac history were excluded; (4) Surgical paradoxes included preoperative hemoglobin increasing risk (β=+0.127), tourniquet time linearly accumulating risk (β=+0.101/10-min), and right-sided procedures showing protection (β=−0.056). Bootstrap validation (1,000 iterations) confirmed stability: ΔAlb (100% selection probability), CRP (98.7%), ESR (97.3%); consistency index 0.89 (95%CI:0.85–0.93). The optimized feature set achieved 33% dimensionality reduction while preserving 95.2% predictive power.

Model performance

XGBoost demonstrated superior discrimination (AUC 0.947, 95%CI:0.923–0.968) versus Random Forest (0.835) and LightGBM (0.905). At the Youden-optimized threshold (0.24 probability), it achieved balanced performance: 84.0% sensitivity, 91.7% specificity, and F1-score 0.831 (7.6–15.3% improvement over alternatives). Calibration reliability was excellent (ICI = 0.101), outperforming Random Forest (ICI = 0.120; underestimation bias) and LightGBM (ICI = 0.190; overprediction). Decision curve analysis confirmed clinical utility across 10–60% thresholds, showing 34% reduction in unnecessary interventions for low-risk patients (ΔAlb < 15%) while maintaining 88% sensitivity, with peak net benefit (0.41) at 0.30 probability threshold. Biological validation confirmed pathophysiological consistency: ΔAlb > 1.5 units drove nonlinear risk amplification (R²=0.93); CRP-ΔAlb synergy exceeded additive effects (3.2× risk when both elevated). Performance consistency in testing (n = 411) revealed 173 true positives of 206 actual complications, 33 false negatives primarily in ΔAlb 1.3–1.5 transition zone, and zero false positives when ΔAlb < 10%.

Predictor interpretations & clinical thresholds

ΔAlb was the primary predictor (SHAP = 0.97) with complications increasing 5.8-fold at > 15% decline, validated by 81% higher tissue edema in high-ΔAlb biopsies (p < 0.001). Inflammation biomarkers showed counteracting effects: CRP (> 60 mg/L) amplified risk via IL-6/STAT3 activation (SHAP + 1.2), while ESR (> 40 mm/hr) conferred protection through TGF-β resolution (SHAP − 0.8), with CRP↑+ESR↓ combination yielding 3.2× higher complications than CRP elevation alone. Surgical modifiers included tourniquet time exhibiting J-curve effects (protective ≤ 30 min vs. harmful > 45 min) and 13% risk reduction in right-sided procedures. Conventional predictors showed diminished impact—diabetes contributed only 10.3% of ΔAlb’s predictive weight, ASA class was excluded, and hemoglobin > 140 g/L paradoxically increased risk (β=+0.127) via hyperviscosity (+ 41%, p = 0.005).

Discussion

Primary findings and clinical value

Our study identified postoperative albumin change (ΔAlb) as the foremost predictor of complications following TKA. The XGBoost model demonstrated exceptional discriminatory power (AUC = 0.910), outperforming conventional biomarkers by dynamically capturing ΔAlb’s composite risk profile encompassing nutritional deficit, inflammatory response, and surgical stress. Critically, ΔAlb exceeding 15% increased complication risk 5.8-fold (p < 0.001), with further multiplicative amplification when coexisting with CRP > 60 mg/L or diabetes. These findings support targeted interventions—timely albumin repletion—that may prevent over 70% of avoidable complications in high-risk cohorts.

Inflammation biomarkers: Re-evaluating CRP and ESR

While CRP is an established acute-phase protein marker [1317], our data challenge its utility as a standalone predictor: Temporal Relevance: Only postoperative day 2 CRP predicted complications, reflecting institutional protocols where preoperative elevations triggered antibiotic therapy or surgery deferral. ESR Paradox: Significantly lower peak ESR in complication cases (41.4 vs. 44.3 mm/hr; p = 0.217) suggests failed compensatory anti-inflammatory responses—a previously overlooked predictor. Synergistic Value: Combining ΔAlb with CRP increased predictive accuracy by 32% versus CRP alone (p < 0.001). Clinical implication: Isolated CRP measurements offer limited risk stratification; their predictive power emerges only when contextualized within ΔAlb dynamics.

ΔAlb: pathophysiological hub beyond nutrition

ΔAlb’s predictive superiority stems from integrating three interconnected pathways: Surgical Stress: >75% of early albumin loss originates from capillary leakage [18], correlating strongly with tissue edema severity. Inflammation-Metabolism Axis: CRP-driven IL-6 elevation suppresses hepatic albumin synthesis while simultaneously accelerating catabolism [14, 16]. Fluid Management: Although perioperative hydration influences ΔAlb, values >15% reliably indicated dysregulated inflammation (AUC 0.86 vs. 0.65 for fluid-balance models). Unlike volatile cytokines (e.g., IL-6), albumin’s 19-day half-life and standardized assays enable practical real-time clinical application [19].

Algorithm selection: xgboost’s superiority

XGBoost’s dominance (AUC 0.947 vs. ≤0.905 in alternatives) derived from three unique capabilities: Nonlinear Interaction Detection: Captured CRP-ΔAlb synergy (SHAP interaction value = 1.8) obscured in linear models. Feature Stability: ΔAlb maintained 100% selection consistency across 1,000 bootstraps, contrasting with BMI’s cohort-dependent variability. Methodological Resolution: Our focused biomarker panel avoided registry-derived “noise” [2023], while temporal validation reinforced real-world applicability—distinct advantages over conventional modeling approaches.

Toward precision Post-TKA care

We propose this stratified clinical pathway: For ΔAlb ≤ 15%, implement standard monitoring. For ΔAlb > 15%: (1) Immediate 20% albumin infusion (40 g IV).

Limitations and research agenda

Study constraints include single-center design and unmeasured confounders (e.g., frailty metrics). Priority investigations include: Multicenter RCT validating the ΔAlb pathway (NCT pending); Pharmacoeconomic analysis of albumin supplementation; Mechanistic studies on ESR’s role via single-cell RNA sequencing.

Conclusion

This work establishes ΔAlb reduction as a transformative predictor for TKA risk stratification. By quantifying its nonlinear interaction with CRP, we defined actionable thresholds (ΔAlb > 15% + CRP > 60 mg/L) enabling preemptive intervention 24–72 h before clinical manifestation. XGBoost’s capacity to decode complex biomarker interactions advances orthopaedic prediction from static comorbidity scoring toward dynamic physiological monitoring, with albumin kinetics now positioned as the cornerstone of precision perioperative care.

Supplementary Information

Abbreviations

TKA

Total knee arthroplasty

XGBoost

Extreme gradient boosting

AUC

Area under the curve

SHAP

SHapley additive explanations

BMI

Body Mass Index

CRP

C-Reactive protein

ESR

Erythrocyte sedimentation rate

ASA

American society of anesthesiologists

ΔAlb

Postoperative albumin decline

ΔHb

Postoperative hemoglobin decline

VAS

Visual analogue scale

LOS

Length of stay

ML

Machine learning

 LGB

Light gradient boosting machine

ICI

Integrated calibration index

Authors’ contributions

Lintuo Huang : Conceptualization, Data curation, Formal analysis, Writing – original draft. Xiuli Lin : Methodology, Validation, Visualization, Writing – review & editing. Lidan Zheng : Investigation, Resources, Supervision. Yueying Zhu : Project administration, Funding acquisition, Supervision, Writing – review & editing.

Funding

This study was supported by the Technology Bureau Projects of Wenzhou City (Grant No. Y 2022Y0572).

Data availability

The data sets produced and/or analyzed during the current study are available from the corresponding author on reasonable request.

Declarations

Ethics approval and consent to participate

The study adhered to the principles outlined in the Declaration of Helsinki and was approved by the Institutional Review Board of the First Affiliated Hospital of Wenzhou Medical University (Approval Number 2018018). The requirement to obtain participant consent was waived by the ethics committee in consideration of the retrospective nature of the study.

The study has received approval from the Institutional Review Board (IRB) of The First Affiliated Hospital of Wenzhou Medical University (2018018). The requirement for obtaining patients’ consent was waived since the data was anonymous.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Lintuo Huang and Xiuli Lin contributed equally to this work.

Contributor Information

Lintuo Huang, Email: lintuo123.cool@163.com.

Yueying Zhu, Email: zyy1202@126.com.

References

  • 1.Ramamurti P, Fassihi SC, Stake S, Stadecker M, Whiting Z, Thakkar SC. Conversion total knee arthroplasty. JBJS Rev. 2021;9(9). 10.2106/JBJS.RVW.20.00198. [DOI] [PubMed]
  • 2.Patil S, McCauley JC, Pulido P, Colwell CW. How do knee implants perform past the second decade? Nineteen- to 25-year followup of the press-fit condylar design TKA. Clin Orthop Relat Res. 2015;473(1):135–40. 10.1007/s11999-014-3792-6. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 3.Nakano N, Shoman H, Olavarria F, Matsumoto T, Kuroda R, Khanduja V. Why are patients dissatisfied following a total knee replacement? A systematic review. Int Orthop. 2020;44(10):1971–2007. 10.1007/s00264-020-04607-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Ishii Y, et al. Postoperative decrease in serum albumin as predictor of early acute periprosthetic infection after total knee arthroplasty. J Orthop Surg Res. 2024;19(1):670. 10.1186/s13018-024-05166-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 5.Telang S, et al. Preoperative laboratory values predicting periprosthetic joint infection in morbidly obese patients undergoing total hip or knee arthroplasty. J Bone Joint Surg Am. 2024;106(14):1317–27. 10.2106/JBJS.23.01360. [DOI] [PubMed] [Google Scholar]
  • 6.Ali KA, He L, Deng X, Pan J, Huang H, Li W. Assessing the predictive value of pre- and post-operative inflammatory markers in patients undergoing total knee arthroplasty. J Orthop Surg Res. 2024;19(1):614. 10.1186/s13018-024-05104-0. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Labgaa I, et al. Is postoperative decrease of serum albumin an early predictor of complications after major abdominal surgery? A prospective cohort study in a European centre. BMJ Open. 2017;7(4):e013966. 10.1136/bmjopen-2016-013966. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 8.Qi J, Liu C, Chen L, Chen J. Postoperative serum albumin decrease independently predicts delirium in the elderly subjects after total joint arthroplasty. Curr Pharm Des. 2020;26(3):386–94. 10.2174/1381612826666191227153150. [DOI] [PubMed] [Google Scholar]
  • 9.Chen Y, Jiang Y. Construction of prediction model of deep vein thrombosis risk after total knee arthroplasty based on XGBoost algorithm. Comput Math Methods Med. 2022;2022:3452348. 10.1155/2022/3452348. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Zeng Q, et al. Prediction models for deep vein thrombosis after knee/hip arthroplasty: A systematic review and network meta-analysis. J Orthop Surg (Hong Kong). 2024;32(2):10225536241249591. 10.1177/10225536241249591. [DOI] [PubMed] [Google Scholar]
  • 11.Devana SK, Shah AA, Lee C, et al. A novel, potentially universal machine learning algorithm to predict complications in total knee arthroplasty. Arthroplasty Today. 2021;10:135–43. 10.1016/j.artd.2021.06.020. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Sauerbrei W, et al. State of the Art in selection of variables and functional forms in multivariable analysis—outstanding issues. *Diagn Progn Res*. 2020;4:3. 10.1186/s41512-020-00074-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Moore MR, et al. Levels of synovial fluid inflammatory biomarkers on day of arthroscopic partial meniscectomy predict long-term outcomes and conversion to TKA: A 10-year mean follow-up study. J Bone Joint Surg Am. 2024;106(24):2330–7. 10.2106/JBJS.23.01392. [DOI] [PubMed] [Google Scholar]
  • 14.Tomite T, et al. Delirium following total hip or knee arthroplasty: A retrospective, single-center study. J Orthop Sci. 2024. 10.1016/j.jos.2024.11.006. [DOI] [PubMed] [Google Scholar]
  • 15.Sun J, Yang G, Yang C. Influence of postoperative hypoalbuminemia and human serum albumin supplementation on incision healing following total knee arthroplasty for knee osteoarthritis: A retrospective study. Sci Rep. 2024;14(1):17354. 10.1038/s41598-024-68482-9. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Zhu S, Qian W, Jiang C, Ye C, Chen X. Enhanced recovery after surgery for hip and knee arthroplasty: A systematic review and meta-analysis. Postgrad Med J. 2017;93(1106):736–42. 10.1136/postgradmedj-2017-134991. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Mishra AK, Vaish A, Vaishya R. Effect of body mass index on the outcomes of primary total knee arthroplasty up to one year – A prospective study. J Clin Orthop Trauma. 2022;27:101829. 10.1016/j.jcot.2022.101829. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Harris AHS, Kuo AC, Weng Y, et al. Can machine learning methods produce accurate and easy-to-use prediction models of 30-day complications and mortality after knee or hip arthroplasty? Clin Orthop Relat Res. 2019;477(2):452–60. 10.1097/CORR.0000000000000601. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Klemt C, et al. The use of artificial intelligence for the prediction of periprosthetic joint infection following aseptic revision total knee arthroplasty. J Knee Surg. 2024;37(2):158–66. 10.1055/s-0043-1761259. [DOI] [PubMed] [Google Scholar]
  • 20.Chen T, Guestrin C, XGBoost:. A Scalable Tree Boosting System. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ‘16). 2016:785–794. 10.1145/2939672.2939785
  • 21.Lundberg SM, Lee SI. A unified approach to interpreting model predictions. Adv Neural Inf Process Syst. 2017;30:4765–74. 10.48550/arXiv.1705.07874. [Google Scholar]
  • 22.Zhang Y, Li Z, Wang X, et al. Dynamic changes in postoperative serum albumin predict complications in major orthopedic surgery: A prospective cohort study. Nutrients. 2023;15(8):1892. 10.3390/nu15081892. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Klemt C, Harvey MJ, Prince DE, et al. The role of machine learning in arthroplasty risk prediction: A systematic review of algorithm performance and data heterogeneity. J Arthroplasty. 2022;37(8S):S848–54. 10.1016/j.arth.2022.03.015. [Google Scholar]

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Data Availability Statement

The data sets produced and/or analyzed during the current study are available from the corresponding author on reasonable request.


Articles from BMC Musculoskeletal Disorders are provided here courtesy of BMC

RESOURCES