Abstract
This study aimed to identify the optimal prediction method and key preoperative variables for red blood cell (RBC) transfusion risk in patients undergoing mitral valve surgery. We conducted a retrospective study involving 1477 patients from eight large tertiary hospitals in China who underwent mitral valve surgery with cardiopulmonary bypass. From thirty collected preoperative variables, the Max-Relevance and Min-Redundancy (mRMR) method was used for feature selection, and various machine learning models were evaluated. Of the 1477 patients, 862 received RBC transfusions. The mRMR method identified ten significant preoperative variables. The LightGBM model demonstrated superior performance, achieving an area under the curve (AUC) of 0.935 in the training set and 0.734 in the validation set, with 74.2% accuracy in a prospective dataset. SHAP analysis revealed the ten most influential variables were hematocrit, RBC count, weight, body mass index, fibrinogen, hemoglobin, height, age, left ventricular dilation, and sex. In conclusion, LightGBM was identified as the optimal model for predicting RBC transfusion needs. The model’s high accuracy can assist clinicians in anticipating transfusions and improving blood management decisions.
Keywords: Mitral valve, Machine learning, LightGBM, Red blood cell transfusion, SHapley Additive exPlanations
Subject terms: Cardiology, Machine learning
Introduction
Internationally, valvular heart disease has been the leading cause of cardiovascular morbidity and mortality1, and its incidence is on the rise globally. Mitral valve disease accounted for 15% of all fatalities attributed to valvular heart disease2. Mitral valve surgery, by repairing or replacing the insufficient mitral valve, can significantly reduce mortality rates in patients with severe mitral valve disease3. However, during heart valve surgery, patients might experience severe coagulation disorders and blood loss, necessitating the prompt administration of red blood cell (RBC) transfusions to maintain blood volume and oxygen delivery capacity4,5.
In clinical practice, clinicians often base RBC transfusion decisions primarily on the patient’s hemoglobin level6 and symptoms of anemia because clear and convenient guidelines are lacking. However, other perioperative indicators, such as physical signs and other laboratory indicators, should not be ignored. Furthermore, global blood donation rates have declined significantly. In the United States, donation rates have dropped from 88‰ in 1992 to 55‰ today, a trend that began before COVID-19 and has been worsened by pandemic restrictions, reduced mobile drives, and economic pressures7. Similar trends in many other countries8 highlight the urgent need for predictive models to optimize transfusion management and ensure the efficient allocation of scarce blood resources.
Numerous predictors and prediction models have been put forth in prior research to evaluate transfusion risk in cardiac surgery patients9–11. For instance, Nguyen et al.10 found that the duration of cardiopulmonary bypass (CPB) was a significant predictor of the need for RBC transfusions during cardiac surgery. However, relying on intraoperative and postoperative variables such as CPB duration limits the clinical utility of these predictors, because strategies for reducing transfusion risk and preparing blood products are typically determined before surgery. Therefore, it is critical to identify reliable preoperative variables that can predict the need for perioperative RBC transfusions.
“Machine learning (ML)”, an umbrella term encompassing the construction of predictive models using extensive data or discerning clusters of information within datasets12, has demonstrated superior performance when handling large and complex datasets compared to conventional statistical approaches. Its application enabled accurate prediction of RBC transfusion risks following liver transplantation and knee replacement surgeries13,14.
Therefore, we hypothesized that by collecting preoperative patient data and identifying key variables that influence intraoperative and postoperative RBC transfusions, a machine learning approach could be developed to predict which patients may require intraoperative and postoperative RBC transfusions. This would prompt clinicians to prepare RBC transfusion management strategies in advance.
Methods
Study subjects
This study adhered to the TRIPOD + AI (Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis + Artificial Intelligence) statement to ensure comprehensive and transparent reporting. We conducted a large-scale, multicenter retrospective cohort study across eight tertiary-level hospitals in China. Patients who underwent mitral valve surgery between February 2016 and December 2018 were screened for eligibility. Inclusion criteria were: (1) age 18–75 years; (2) undergoing mitral valvuloplasty, mitral bioprosthetic valve replacement, or isolated mitral mechanical valve replacement. Exclusion criteria comprised: (1) concomitant tricuspid valvuloplasty; (2) concurrent aortic valve replacement or coronary artery bypass grafting; (3) secondary or emergency surgery; (4) inability to provide required clinical data.
Over the study period, a total of 1477 mitral valve procedures met eligibility and were performed by the core surgical teams at each center. Case distribution was as follows: Second Xiangya Hospital (384 cases), Third Xiangya Hospital (112 cases), Beijing Aerospace General Hospital (92 cases), Qilu Hospital of Shandong University (192 cases), Fuwai Hospital (238 cases), Zhejiang Provincial People’s Hospital (114 cases), The Affiliated Hospital of Southwest Medical University (177 cases), and Xiamen Cardiovascular Hospital (176 cases). Each core team’s cumulative surgical volume during these 3 years ranged from 92 to 384 cases (median, 177 cases). All procedures were carried out by the highest-volume surgical teams at their respective centers; each team performed more than 100 cardiac operations annually. All operating surgeons possessed at least 10 years of dedicated experience in valve surgery. To minimize inter-operator variability and ensure data quality, every center strictly followed a standardized surgical protocol and quality-control workflow.
Because this investigation was observational, the requirement for written informed consent was waived. The study protocol received ethical approval from the Clinical Laboratory Ethics Committee of the Third Xiangya Hospital of Central South University (Approval No. 2019-S008). A flowchart delineating patient selection and grouping is presented in Fig. 1.
Fig. 1.
The modeling procedure and the study’s flow chart. (a) All variables, including demographic information, clinical characteristics, laboratory test items, and diagnosis, were extracted from the electronic medical record systems of the eight centers, as illustrated in this figure. In total, thirty preoperative variables were analyzed, and 10 of them were screened. (b) The study’s flow chart.
Surgical procedures
All mitral valve surgeries were performed via median sternotomy. Cardiopulmonary bypass was conducted under mild hypothermia, and intraoperative transesophageal echocardiography confirmed satisfactory valve function before weaning from bypass. A standardized perioperative protocol was applied, including postoperative anticoagulation and a restrictive transfusion strategy.
Cohorts construction and data collection
Based on the specified criteria, 1477 patients were enrolled in the study cohort. The transfusion group included patients who received RBC transfusions from the time of mitral valve surgery until their discharge or in-hospital demise. The non-transfusion group comprised patients who did not receive any RBC transfusion from the beginning of mitral valve surgery until their discharge or in-hospital demise. Further validation of the proposed model was achieved by prospectively collecting data from the Second Xiangya Hospital and the Third Xiangya Hospital of Central South University on 35 patients who underwent mitral valve surgery between March 2022 and December 2023.
At each medical center, all the data for the variables were extracted from the electronic health record (EHR) system and the paper-based medical record system. Authors (YJW, JYZ, XJM, and others) were granted access to the medical record system of their respective affiliated institutions to retrieve data. Available preoperative variables known or suspected to be associated with the necessity for RBC transfusion during mitral valve surgery were identified through consultations with medical experts and literature reviews. For some variables that required multiple measurements, the data were evaluated using the value closest to the initiation of the surgical procedure. The preoperative data collection encompassed demographic variables such as sex, age, height, weight, body mass index (BMI), and blood group; therapeutic characteristics including left ventricular dilation, atrial fibrillation, iron supplements, and drugs for anemia; past medical history including cerebrovascular disease, hypertension, anemia, and diabetes mellitus; information on laboratory indicators including white blood cell (WBC) count, red blood cell (RBC) count, hemoglobin (Hb) level, hematocrit (Hct) level, platelet (PLT) count, creatinine level, total protein (TP), albumin, globulin, alanine aminotransferase (ALT), aspartate aminotransferase (AST), fibrinogen (FIB), prothrombin time (PT), international normalized ratio (INR), left ventricular ejection fraction (LVEF), and type of surgery.
In the subsequent analysis, we referred to these variables using their abbreviations for brevity and clarity. For example, red blood cell (RBC) count was referred to as RBC and hemoglobin (Hb) level was referred to as hemoglobin.
Among the 30 variables, height data were unavailable from one hospital. Therefore, before modeling, we used the K-nearest neighbors (KNN) strategy to impute the missing data. The data gathered by the multiple medical centers were standardized. The three surgery types (isolated mitral valve mechanical replacement, mitral bioprosthetic valve replacement, and mitral valvuloplasty) were encoded as an ordinal variable.
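As an illustration of the KNN imputation step, the following is a minimal sketch rather than the pipeline used in the study. It assumes that the other columns are already standardized and fully observed, that distance is Euclidean, and that a missing value is replaced by the mean of its k nearest complete neighbors. The function name `knn_impute`, the column names, and the toy records are hypothetical and are not study data.

```python
import math

def knn_impute(rows, target, k=3):
    """Fill missing `target` values (None) with the mean of the k nearest complete rows.

    Distance is Euclidean over all other columns, which are assumed to be
    standardized and fully observed. The input rows are not modified.
    """
    donors = [r for r in rows if r[target] is not None]
    filled = []
    for r in rows:
        new_row = dict(r)
        if r[target] is None:
            others = [c for c in r if c != target]
            nearest = sorted(
                donors,
                key=lambda d: math.dist([r[c] for c in others], [d[c] for c in others]),
            )[:k]
            new_row[target] = sum(d[target] for d in nearest) / len(nearest)
        filled.append(new_row)
    return filled

# Hypothetical toy records: weight and age as z-scores, height in cm.
patients = [
    {"weight_z": -1.0, "age_z": 0.2, "height_cm": 152.0},
    {"weight_z": -0.8, "age_z": 0.1, "height_cm": 155.0},
    {"weight_z": 1.2, "age_z": -0.5, "height_cm": 172.0},
    {"weight_z": 1.0, "age_z": -0.4, "height_cm": 170.0},
    {"weight_z": -0.9, "age_z": 0.0, "height_cm": None},
]
imputed = knn_impute(patients, "height_cm", k=2)
print(imputed[-1]["height_cm"])
```

The choice of k and of the distance metric affects the imputed values; neither is reported in the text, so both are placeholders here.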
Construction of prediction models
The max-relevance and min-redundancy (mRMR) algorithm15 was used to identify critical variables among the 30 preoperative variables. mRMR is a filter-type feature selection method that finds the features most relevant to the target outcome while minimizing redundancy among the selected features16. The algorithm first selects the most pertinent feature, namely the one with the highest mutual information with the target. It then progressively adds features according to their relevance to the target and their correlation with the features already selected, keeping redundancy within the feature set minimal. The resulting features are highly representative, strongly correlated with the target, and largely independent of one another. Moreover, mRMR evaluates features without constructing a model, so it is unaffected by the choice of modeling method and reduces the risk of overfitting17. Consequently, the mRMR algorithm was suitable for the subsequent model construction and evaluation.
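The greedy selection described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the study's implementation: features are assumed to be discretized, relevance and redundancy are both measured by mutual information, and the score is the difference between relevance and mean redundancy (the text does not say which mRMR variant was used). The feature names and toy values are hypothetical.

```python
import math
from collections import Counter

def mutual_information(x, y):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * math.log((c / n) / ((px[a] / n) * (py[b] / n)))
        for (a, b), c in pxy.items()
    )

def mrmr(features, target, n_select):
    """Greedy mRMR: maximize relevance minus mean redundancy with chosen features."""
    relevance = {name: mutual_information(col, target) for name, col in features.items()}
    selected = [max(relevance, key=relevance.get)]  # most relevant feature first
    while len(selected) < n_select:
        def score(name):
            redundancy = sum(
                mutual_information(features[name], features[s]) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy
        candidates = [f for f in features if f not in selected]
        selected.append(max(candidates, key=score))
    return selected

# Hypothetical binary toy data (not study data); hb_low duplicates hct_low.
transfused = [0, 0, 0, 0, 1, 1, 1, 1]
toy_features = {
    "hct_low": [0, 0, 0, 0, 1, 1, 1, 0],
    "hb_low": [0, 0, 0, 0, 1, 1, 1, 0],
    "female": [0, 0, 1, 0, 1, 1, 0, 1],
}
chosen = mrmr(toy_features, transfused, n_select=2)
print(chosen)
```

In this toy example the duplicated feature is passed over in favor of a less relevant but non-redundant one, which is the behavior the redundancy term is meant to produce.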
The research employed typical machine learning models to construct predictive models for categorical outcomes. These models included light gradient boosting machine (LightGBM), eXtreme Gradient Boosting (XGBoost), adaptive boosting (AdaBoost), multilayer perceptron (MLP), support vector machine (SVM), complement naive bayes (CNB), and logistic regression (LR).
Each model had its benefits and drawbacks in diverse dataset environments. There was no absolute advantage or disadvantage. After developing and validating the above models through the fivefold cross-validation technique, the area under the receiver operating characteristic curve (AUROC) for each model was subsequently computed on both the training and validation datasets. In addition, the F1-score, Kappa value, accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and cutoff value were evaluated to identify the optimal model for forecasting the necessity for intraoperative and postoperative RBC transfusion during mitral valve surgery. Moreover, the Brier score and net clinical benefit were used to evaluate and compare model performances. The Brier score is an indicator of the accuracy of the predicted probability. The lower the score, the better the model18. The net clinical benefit shows the trade-offs between the benefits and harms of different threshold choices, providing insights into the practical application of the model in a clinical setting.
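For reference, the threshold-based metrics, the Brier score, and the net benefit named above can be computed as in the sketch below. It uses the standard textbook definitions; the function names, the cutoff, and the toy labels and probabilities are hypothetical and do not reproduce any value reported in this study.

```python
def confusion(y_true, y_prob, cutoff):
    """Return (tp, fp, tn, fn) when predicting positive if probability >= cutoff."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= cutoff else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def threshold_metrics(y_true, y_prob, cutoff):
    """Accuracy, sensitivity, specificity, PPV, NPV, F1, and Cohen's kappa.

    Assumes both classes are present and both classes are predicted at least once.
    """
    tp, fp, tn, fn = confusion(y_true, y_prob, cutoff)
    n = tp + fp + tn + fn
    accuracy = (tp + tn) / n
    # Agreement expected by chance, used by Cohen's kappa.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    return {
        "accuracy": accuracy,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "f1": 2 * tp / (2 * tp + fp + fn),
        "kappa": (accuracy - p_chance) / (1 - p_chance),
    }

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def net_benefit(y_true, y_prob, threshold):
    """Decision-curve net benefit at a given threshold probability."""
    tp, fp, _, _ = confusion(y_true, y_prob, threshold)
    n = len(y_true)
    return tp / n - (fp / n) * threshold / (1 - threshold)

# Hypothetical toy predictions (not study data).
labels = [1, 1, 1, 0, 0, 1, 0, 0]
probs = [0.9, 0.8, 0.6, 0.7, 0.3, 0.4, 0.2, 0.1]
metrics = threshold_metrics(labels, probs, cutoff=0.5)
brier = brier_score(labels, probs)
benefit = net_benefit(labels, probs, threshold=0.5)
print(metrics, brier, benefit)
```

Plotting `net_benefit` over a range of thresholds gives a decision curve; the threshold weights each false positive by the odds threshold / (1 - threshold).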
The predictions of the optimal model were elucidated using the SHapley Additive exPlanation (SHAP) method after comparing model performances. By involving cooperative game theory19, the SHAP package transformed the unprocessed nonlinear ML outcomes into the sum of the attributed effects of all variables. This process quantified the contribution of each feature to the final prediction of the model. It approximated the risk associated with obtaining the relevant outcome for each patient. Thus, the transformation outcomes of SHAP provided a straightforward means of discerning the impact of preoperative variables on the final result.
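To make the additive attribution concrete, the sketch below computes exact Shapley values for a small toy scoring function by averaging marginal contributions over all feature orderings. It is an assumption-laden illustration: `toy_risk` is a hypothetical rule-based score, not the LightGBM model, and a single baseline patient stands in for the background data. The SHAP package uses more efficient algorithms for tree models; this brute-force version only shows the property that the attributions sum to the difference between the prediction and the baseline prediction.

```python
from itertools import permutations

def shapley_values(model, x, baseline):
    """Exact Shapley values of `model` at `x` relative to a single baseline.

    Features are switched from their baseline value to their value in `x`
    one at a time, and marginal effects are averaged over all orderings.
    """
    names = list(x)
    phi = dict.fromkeys(names, 0.0)
    orders = list(permutations(names))
    for order in orders:
        current = dict(baseline)
        previous = model(current)
        for name in order:
            current[name] = x[name]
            updated = model(current)
            phi[name] += (updated - previous) / len(orders)
            previous = updated
    return phi

def toy_risk(f):
    """Hypothetical nonlinear risk score with an interaction term."""
    score = 0.5
    if f["hct"] < 36:
        score += 0.2
    if f["weight"] < 50:
        score += 0.1
    if f["hct"] < 36 and f["weight"] < 50:
        score += 0.05
    return score

# Illustrative inputs; the cutoffs inside toy_risk are invented.
patient = {"hct": 33.9, "weight": 47, "age": 60}
reference = {"hct": 40.6, "weight": 59, "age": 53}
phi = shapley_values(toy_risk, patient, reference)
print(phi, sum(phi.values()), toy_risk(patient) - toy_risk(reference))
```

The interaction bonus is split equally between hematocrit and weight, and age receives zero because the toy score never uses it.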
Statistical analyses
The median with interquartile ranges was applied to represent continuous variables, while percentages were used to describe categorical variables. The t-test or Mann–Whitney U test (depending on the situation) was utilized to compare continuous variables between the transfusion and non-transfusion groups. The chi-square test was employed to analyze differences in the categorical variables. Statistical significance was determined as p < 0.05 using two-tailed p values. Predictive models were constructed with the feature sets of the critical variables selected by the mRMR algorithm. By comparing the prediction models, the optimal prediction model was identified. The average area under the receiver operating characteristic (ROC) curve (AUC) from the fivefold cross-validation served as an assessment metric to compare the accuracy of the prediction models. A higher AUC value indicated the superior predictive capability of the model. Python was utilized for all statistical analyses of the baseline characteristics.
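The averaging of AUC over five folds can be sketched as below. This is a simplified illustration, not the study's code: the "model" is a one-feature rule that only learns, on each training split, whether higher values go with the positive class, and the folds are plain strided splits (the text does not state whether stratification or shuffling was used). The AUC is computed from its rank interpretation, the probability that a randomly chosen positive case scores higher than a randomly chosen negative case. All data values are hypothetical.

```python
from statistics import mean

def auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def fit_direction(xs, ys):
    """Toy 'training': +1 if positives have the higher mean feature value, else -1."""
    pos_mean = mean(x for x, y in zip(xs, ys) if y == 1)
    neg_mean = mean(x for x, y in zip(xs, ys) if y == 0)
    return 1.0 if pos_mean >= neg_mean else -1.0

def cross_validated_auc(xs, ys, k=5):
    """Mean held-out AUC over k strided folds; each fold must contain both classes."""
    n = len(xs)
    fold_aucs = []
    for fold in range(k):
        held_out = list(range(fold, n, k))
        held = set(held_out)
        train = [i for i in range(n) if i not in held]
        sign = fit_direction([xs[i] for i in train], [ys[i] for i in train])
        fold_aucs.append(
            auc([ys[i] for i in held_out], [sign * xs[i] for i in held_out])
        )
    return mean(fold_aucs)

# Hypothetical hematocrit-like values; label 1 = transfused (not study data).
outcome = [0, 1] * 10
hct = [44, 38, 42, 36, 45, 41, 43, 35, 35, 39,
       46, 37, 41, 34, 44, 45, 43, 36, 42, 38]
cv_auc = cross_validated_auc(hct, outcome, k=5)
print(cv_auc)
```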
Results
Characteristics of the study cohort
This investigation comprised a cohort of 1477 patients who underwent mitral valve surgery at eight medical facilities, as shown in Fig. 1. The patients had a median weight of 59.0 kg; 60.87% were female, and the median age was 53 years. Among all enrolled patients, 862 received RBC transfusions during or after mitral valve surgery, while 615 did not receive any RBC transfusions during this period. The preoperative characteristics of the whole cohort and of the non-transfused and transfused groups are detailed in Table 1. Notably, mechanical mitral valve replacement was performed on most patients (73.3%).
Table 1.
Preoperative information.
| Preoperative variables | Missing | Subtype | All (n = 1477) | Intraoperative and postoperative blood transfusion: No | Intraoperative and postoperative blood transfusion: Yes | p |
|---|---|---|---|---|---|---|
| Sex, n (%) | 0 | Female | 899 (60.867) | 315 (51.220) | 584 (67.749) | < 0.001 |
| | | Male | 578 (39.133) | 300 (48.780) | 278 (32.251) | |
| Blood type, n (%) | 0 | O | 517 (35.003) | 221 (35.935) | 296 (34.339) | 0.250 |
| | | A | 492 (33.311) | 190 (30.894) | 302 (35.035) | |
| | | B | 347 (23.494) | 146 (23.740) | 201 (23.318) | |
| | | AB | 121 (8.192) | 58 (9.431) | 63 (7.309) | |
| Atrial fibrillation, n (%) | | 1 | 758 (51.320) | 316 (51.382) | 442 (51.276) | 0.968 |
| Types of surgery, n (%) | | 1 | 235 (15.911) | 80 (13.008) | 155 (17.981) | < 0.001 |
| | | 2 | 1082 (73.257) | 429 (69.756) | 653 (75.754) | |
| | | 3 | 160 (10.833) | 106 (17.236) | 54 (6.265) | |
| Cerebrovascular disease, n (%) | | 1 | 99 (6.703) | 41 (6.667) | 58 (6.729) | 0.963 |
| Drug therapy for anemia, n (%) | | 1 | 5 (0.339) | 0 (0.000) | 5 (0.580) | 0.143 |
| Anemia, n (%) | | 1 | 3 (0.203) | 0 (0.000) | 3 (0.348) | 0.059 |
| Diabetes, n (%) | | 1 | 55 (3.724) | 27 (4.390) | 28 (3.248) | 0.253 |
| Hypertension, n (%) | | 1 | 229 (15.504) | 109 (17.724) | 120 (13.921) | 0.047 |
| Left ventricular enlargement, n (%) | | 1 | 647 (43.805) | 236 (38.374) | 411 (47.680) | < 0.001 |
| Iron supplementation, n (%) | 587 | 0 | 886 (99.551) | 432 (99.769) | 454 (99.344) | 0.343 |
| | | 1 | 4 (0.449) | 1 (0.231) | 3 (0.656) | |
| Age (year), median [Q1, Q3] | 0 | | 53.000 [47.000, 61.000] | 51.800 [45.000, 58.000] | 54.000 [48.000, 62.000] | < 0.001 |
| Height (cm), median [Q1, Q3] | 0 | | 160.000 [155.000, 167.000] | 162.000 [156.000, 169.000] | 158.000 [154.000, 165.000] | < 0.001 |
| Weight (kg), median [Q1, Q3] | 0 | | 59.000 [52.000, 67.000] | 61.000 [55.000, 70.000] | 57.000 [50.000, 65.000] | < 0.001 |
| BMI, median [Q1, Q3] | 0 | | 22.863 [20.808, 25.059] | 23.323 [21.359, 25.778] | 22.507 [20.429, 24.610] | < 0.001 |
| Red blood cell (10^12/L), median [Q1, Q3] | 0 | | 4.470 [4.090, 4.890] | 4.720 [4.360, 5.060] | 4.310 [3.930, 4.730] | < 0.001 |
| White blood cell (10^9/L), median [Q1, Q3] | 0 | | 6.120 [5.040, 7.530] | 6.140 [5.150, 7.470] | 6.120 [4.960, 7.570] | 0.566 |
| Hemoglobin (g/L), median [Q1, Q3] | 0 | | 131.000 [118.000, 144.000] | 139.000 [127.000, 150.000] | 126.000 [112.000, 139.000] | < 0.001 |
| Hematocrit (%), median [Q1, Q3] | 0 | | 40.600 [37.000, 44.000] | 42.500 [40.000, 45.400] | 39.100 [35.300, 42.500] | < 0.001 |
| Platelet (10^9/L), median [Q1, Q3] | 0 | | 194.000 [155.000, 241.000] | 196.000 [159.000, 239.000] | 193.000 [153.000, 243.000] | 0.480 |
| CR (μmol/L), median [Q1, Q3] | 0 | | 71.900 [60.800, 85.000] | 75.700 [64.700, 88.990] | 69.000 [58.800, 82.500] | < 0.001 |
| TP (g/L), median [Q1, Q3] | 0 | | 68.100 [63.800, 72.700] | 68.400 [64.400, 72.900] | 67.600 [63.400, 72.500] | 0.034 |
| ALB (g/L), median [Q1, Q3] | 0 | | 39.800 [37.200, 42.600] | 40.400 [38.000, 43.000] | 39.300 [36.600, 42.300] | < 0.001 |
| GLO (g/L), median [Q1, Q3] | 0 | | 28.000 [25.200, 31.450] | 27.780 [24.700, 31.200] | 28.200 [25.500, 31.700] | 0.042 |
| ALT (IU/L), median [Q1, Q3] | 0 | | 19.600 [13.000, 31.000] | 20.300 [14.000, 32.900] | 19.000 [12.800, 29.000] | 0.004 |
| AST (IU/L), median [Q1, Q3] | 0 | | 22.700 [18.000, 29.300] | 22.700 [18.000, 29.000] | 22.700 [18.300, 29.500] | 0.463 |
| PT (s), median [Q1, Q3] | 0 | | 13.100 [12.000, 14.400] | 13.200 [12.300, 14.300] | 13.100 [11.800, 14.400] | 0.062 |
| INR, median [Q1, Q3] | 0 | | 1.060 [1.000, 1.180] | 1.060 [1.000, 1.170] | 1.070 [1.000, 1.200] | 0.128 |
| FIB (g/L), median [Q1, Q3] | 0 | | 2.937 [2.470, 3.480] | 2.890 [2.460, 3.326] | 2.961 [2.500, 3.580] | 0.009 |
| LVEF, median [Q1, Q3] | 0 | | 62.000 [57.000, 67.000] | 62.000 [57.000, 67.000] | 62.000 [57.000, 67.000] | 0.451 |
BMI body mass index, CR creatinine, TP total protein, ALB albumin, GLO globulin, ALT alanine aminotransferase, AST aspartate aminotransferase, PT prothrombin time, INR international normalized ratio, FIB fibrinogen, LVEF left ventricular ejection fraction.
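As a worked example of the categorical comparisons in Table 1, the sketch below applies a Pearson chi-square test to the sex-by-transfusion counts from the table. It assumes no continuity correction; the text does not say whether one was applied, so the statistic may differ from the one the authors obtained. The function name is hypothetical. For one degree of freedom, the p value follows from the complementary error function.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and the p value for 1 degree of freedom.
    """
    observed = ((a, b), (c, d))
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Counts from Table 1: rows are female and male, columns are not transfused and transfused.
chi2_sex, p_sex = chi_square_2x2(315, 584, 300, 278)
print(chi2_sex, p_sex)
```

The resulting p value falls below 0.001, in line with the entry reported for sex in Table 1.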
Key variables and model performances
Using the mRMR algorithm, ten significant preoperative variables were identified from the data: sex, height, weight, body mass index (BMI), age, left ventricular dilation, red blood cell (RBC), hemoglobin (Hb), hematocrit (Hct), and fibrinogen (FIB). Female patients and those with a lower BMI, older age, left ventricular dilation, lower Hb, lower RBC count, lower Hct, or higher FIB exhibited a higher propensity to undergo RBC transfusions (Table 1).
The models’ performance metrics, including accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), F1-score, and Kappa value, were assessed and compared using training and validation data, as displayed in Tables 2 and 3. The analysis determined that the LightGBM model’s AUC and accuracy in the training set were 0.935 (95% CI 0.927–0.942) and 0.863, and the AUC in the validation set was 0.734 (95% CI 0.697–0.771) (Fig. 2a,b). LightGBM outperformed the other five machine learning models and conventional logistic regression when considering aggregate metrics of AUC score (Fig. 2c), reliability (Fig. 2d), and net clinical benefit (Fig. 2e).
Table 2.
Performance of machine learning models in the training set.
| Model | AUC (SD) | Cutoff (SD) | Accuracy (SD) | Sensitivity (SD) | Specificity (SD) | PPV (SD) | NPV (SD) | F1 score (SD) | Kappa (SD) |
|---|---|---|---|---|---|---|---|---|---|
| XGBoost | 0.769 (0.005) | 0.594 (0.031) | 0.688 (0.013) | 0.636 (0.055) | 0.762 (0.044) | 0.788 (0.016) | 0.603 (0.025) | 0.702 (0.027) | 0.382 (0.018) |
| LR | 0.734 (0.008) | 0.579 (0.018) | 0.676 (0.009) | 0.678 (0.039) | 0.676 (0.038) | 0.744 (0.012) | 0.602 (0.014) | 0.708 (0.019) | 0.346 (0.011) |
| LightGBM | 0.935 (0.004) | 0.543 (0.026) | 0.863 (0.004) | 0.875 (0.021) | 0.850 (0.030) | 0.890 (0.021) | 0.829 (0.023) | 0.881 (0.004) | 0.719 (0.008) |
| AdaBoost | 0.762 (0.005) | 0.514 (0.006) | 0.688 (0.012) | 0.663 (0.053) | 0.725 (0.046) | 0.771 (0.019) | 0.609 (0.024) | 0.711 (0.025) | 0.376 (0.016) |
| MLP | 0.694 (0.013) | 0.604 (0.007) | 0.634 (0.005) | 0.548 (0.054) | 0.753 (0.071) | 0.758 (0.029) | 0.546 (0.008) | 0.634 (0.025) | 0.285 (0.016) |
| SVM | 0.723 (0.006) | 0.593 (0.035) | 0.671 (0.007) | 0.681 (0.053) | 0.661 (0.058) | 0.736 (0.023) | 0.601 (0.029) | 0.706 (0.018) | 0.335 (0.008) |
| CNB | 0.703 (0.006) | 0.613 (0.049) | 0.642 (0.012) | 0.561 (0.053) | 0.756 (0.049) | 0.762 (0.024) | 0.555 (0.016) | 0.644 (0.029) | 0.300 (0.013) |
PPV positive predictive value, NPV negative predictive value, XGBoost eXtreme gradient boosting, LR logistic regression, LightGBM light gradient boosting machine, AdaBoost adaptive boosting, MLP multi-layer perceptron, SVM support vector machine, CNB complement naive Bayes.
Table 3.
Performance of machine learning models in the validation set.
| Model | AUC (SD) | Cutoff (SD) | Accuracy (SD) | Sensitivity (SD) | Specificity (SD) | PPV (SD) | NPV (SD) | F1 score (SD) | Kappa (SD) |
|---|---|---|---|---|---|---|---|---|---|
| XGBoost | 0.724 (0.012) | 0.594 (0.031) | 0.650 (0.015) | 0.610 (0.055) | 0.754 (0.058) | 0.768 (0.047) | 0.559 (0.038) | 0.677 (0.028) | 0.311 (0.020) |
| LR | 0.726 (0.020) | 0.579 (0.018) | 0.662 (0.016) | 0.654 (0.047) | 0.703 (0.050) | 0.736 (0.042) | 0.579 (0.029) | 0.692 (0.042) | 0.316 (0.033) |
| LightGBM | 0.734 (0.019) | 0.543 (0.026) | 0.671 (0.019) | 0.631 (0.067) | 0.736 (0.045) | 0.728 (0.028) | 0.595 (0.039) | 0.674 (0.039) | 0.322 (0.038) |
| AdaBoost | 0.721 (0.010) | 0.514 (0.006) | 0.664 (0.019) | 0.604 (0.050) | 0.759 (0.041) | 0.756 (0.036) | 0.574 (0.041) | 0.670 (0.031) | 0.327 (0.042) |
| MLP | 0.690 (0.020) | 0.604 (0.007) | 0.614 (0.010) | 0.554 (0.039) | 0.752 (0.069) | 0.757 (0.062) | 0.522 (0.026) | 0.638 (0.029) | 0.255 (0.031) |
| SVM | 0.711 (0.015) | 0.593 (0.035) | 0.648 (0.005) | 0.594 (0.085) | 0.734 (0.076) | 0.723 (0.031) | 0.562 (0.023) | 0.650 (0.062) | 0.285 (0.014) |
| CNB | 0.700 (0.014) | 0.613 (0.049) | 0.626 (0.008) | 0.635 (0.071) | 0.679 (0.072) | 0.762 (0.041) | 0.531 (0.023) | 0.690 (0.051) | 0.273 (0.014) |
PPV positive predictive value, NPV negative predictive value, XGBoost eXtreme gradient boosting, LR logistic regression, LightGBM light gradient boosting machine, AdaBoost adaptive boosting, MLP multi-layer perceptron, SVM support vector machine, CNB complement naive Bayes.
Fig. 2.
Comparison of multiple models. Panel (a) displays the area under the receiver operating characteristic curve (AUC) for multiple models in the training set. The ROC curve shows the trade-off between sensitivity (true positive rate) and 1 − specificity (false positive rate) at various threshold settings. A higher AUC value corresponds to a better overall performance of the model, as it reflects the model’s ability to correctly distinguish between positive and negative outcomes across all thresholds. Panel (b) presents the AUCs of multiple models in the validation set. Panel (c) shows the forest plot of the AUC scores for multiple models, providing a visual representation of the performance metrics. (d) Calibration plots of multiple models. A well-calibrated model will have a calibration curve close to the diagonal representing perfect calibration. The values in brackets indicate the Brier score of each model along with its 95% confidence interval. The Brier score assesses the accuracy of predicted probabilities; a lower score denotes better model performance. (e) The clinical decision curve of multiple models. This subplot helps to evaluate the models’ utility in clinical decision-making. The clinical decision curves show the trade-offs between the benefits and harms of different threshold choices, providing insights into the practical application of the model in a clinical setting.
Based on these results, the LightGBM model performed best. To further assess its predictive performance, a random test set comprising N = 221 cases (15.00% of the total sample) was selected, and the remaining samples were used as the training set for fivefold cross-validation. The LightGBM model’s final AUCs in the new training, validation, and test sets were 0.991 (Fig. 3a), 0.713 (Fig. 3b), and 0.722 (Fig. 3c), respectively. Additionally, we examined the relationship between the LightGBM model’s ROC curve and the number of training samples, which showed that the ROC curve was stable (Fig. 3d). The calibration plot (Fig. 3e) indicated that the LightGBM model was highly reliable. Furthermore, to illustrate the utility of the model in clinical decision-making, we present clinical decision curves (Fig. 3f).
Fig. 3.
Performance and calibration of the LightGBM model. (a) AUC of the LightGBM model in the training set. (b) AUC of the LightGBM model in the validation set. (c) AUC of the LightGBM model in the test set. (d) The relationship between the ROC curve of the LightGBM model and the training samples. This subplot shows the ROC curve of the LightGBM model and how it relates to the number of training samples. A stable ROC curve across various sample sizes indicates the robustness of the model. (e) Calibration plot of the LightGBM model, assessing whether the predicted probabilities align with the actual outcomes. (f) Clinical decision curve of the LightGBM model, demonstrating its practical application in clinical settings by evaluating the net benefit at various threshold probabilities.
Prospective experiments
Prospective data were collected for validation from an overall sample of 35 patients. Among these patients, 51% (n = 18) received RBC transfusions during or after mitral valve surgery. The model achieved an accuracy level of 74.2% on the prospective data set. Four patients had RBC transfusions despite the model predicting the opposite outcome. Referring to the surgical records, three of these patients experienced intraoperative hemorrhage (> 500 mL). Among the five non-transfused patients expected to need transfusion by the model, one received a plasma transfusion, and two received RBC transfusions before the operation.
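The counts reported above allow the prospective confusion matrix to be reconstructed, as in the short sketch below. It assumes that the four transfused patients predicted as non-transfused and the five non-transfused patients predicted as transfused are the only misclassifications. The sensitivity and specificity it prints are derived under that assumption and are not values reported in the text.

```python
# Counts taken from the prospective cohort description.
n_total = 35
n_transfused = 18
false_negatives = 4  # transfused, but predicted not to need transfusion
false_positives = 5  # not transfused, but predicted to need transfusion

# Derived quantities (assumes the errors listed above are exhaustive).
n_not_transfused = n_total - n_transfused
true_positives = n_transfused - false_negatives
true_negatives = n_not_transfused - false_positives

accuracy = (true_positives + true_negatives) / n_total
sensitivity = true_positives / n_transfused
specificity = true_negatives / n_not_transfused

print(f"accuracy={accuracy:.3f} sensitivity={sensitivity:.3f} specificity={specificity:.3f}")
```

Under this assumption, 26 of the 35 patients are classified correctly, which agrees with the reported accuracy of about 74%.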
Explainable analysis of the model
The SHAP package was used to analyze the LightGBM model on the entire cohort and to illustrate the influence of each variable on the prediction of RBC transfusion (Fig. 4a). The ten preoperative variables in the LightGBM predictive model are listed in order of their importance for predicting RBC transfusion outcomes. These variables are hematocrit, RBC, weight, BMI, fibrinogen, hemoglobin, height, age, left ventricular dilation, and sex, with hematocrit being the most important factor (Fig. 4b). To gain deeper insight into the impact of each variable, we used force plots to visualize the probabilities of RBC transfusion for two patients (Fig. 4c,d). For example, the predicted likelihood of RBC transfusion for one patient was 0.809 (Fig. 4c), influenced by specific factors including weight (47 kg), height (150 cm), BMI (20.8), RBC count (3.72 × 10^12/L), and hematocrit (33.9%). These findings suggested a high likelihood of the patient needing an RBC transfusion.
Fig. 4.
Interpretability of the model. Panel (a) presents a SHAP (Shapley additive explanations) summary plot of the top 10 variables of the LightGBM model. This image displays information from the whole cohort, and each dot represents an individual patient. The color relates to the values of the variable, with blue representing lower values while red represents the higher values. The SHAP value, which means the expected likelihood of transfusions during and after mitral valve surgery, is plotted on the graph’s horizontal axis. The magnitude of a variable’s impact on an outcome is proportional to its absolute value in the horizontal coordinate; a more significant value suggests a more substantial level of impact. For instance, a lower transfusion likelihood is associated with a higher hemoglobin value. (b) Ranking of each variable’s contribution to the outcome, with hematocrit identified as the most influential factor. (c) and (d) Force plots illustrating how each variable contributes to the outcome. Red indicates a positive contribution, while blue indicates a negative contribution.
Discussion
In this study, we developed a machine learning model using preoperative variables to predict the need for red blood cell (RBC) transfusions during or after mitral valve surgery. Our model, which is based on the LightGBM algorithm, achieved the highest area under the receiver operating characteristic curve (AUC) among the compared models, with values of 0.935 on the training dataset and 0.734 on the validation dataset. Using the mRMR method, we identified ten key preoperative variables, including RBC, hemoglobin, and hematocrit, that play a crucial role in predicting transfusion requirements. Importantly, the model was further validated on a prospective dataset, which reinforces its potential clinical applicability.
Perioperative RBC transfusions are common in cardiac surgery20 and have been associated with adverse outcomes, including infections and increased short-term and long-term mortality21–23. By incorporating preoperative data, our model enables early identification of high-risk patients before surgery, thereby allowing clinicians to implement targeted strategies (such as early use of erythropoietin, intravenous iron supplementation, or antifibrinolytics) to reduce the need for transfusions. For patients identified as low risk, unnecessary pretransfusion testing can be avoided, which not only reduces costs but also minimizes patient burden.
Our study also examined the impact of individual predictors. Hematocrit, for example, emerged as the most influential factor, altering the predicted probability of transfusion by an average of 12 percentage points (Fig. 4b). Similarly, hemoglobin levels also significantly influenced transfusion probability (Fig. 5a), as both parameters are primary determinants in clinical transfusion decisions24. While restrictive transfusion thresholds (e.g., hemoglobin 7–8 g/dL) have been shown to be safe in certain settings25–27, postoperative anemia remains strongly linked to adverse outcomes in high-risk cardiac surgeries28, suggesting that transfusion thresholds need to be carefully considered for patients undergoing cardiac surgery. One example was the consideration of sex-based variations in transfusion thresholds29. This highlights the need for individualized transfusion thresholds and proactive management of preoperative anemia as key components of effective patient blood management (PBM)30.
Fig. 5.
SHAP dependence contribution plots. (a) SHAP dependence contribution plot of hemoglobin (Hb). The horizontal axis represents preoperative hemoglobin levels, and the vertical axis shows the SHAP value, indicating the impact of hemoglobin levels on the likelihood of transfusion. (b) SHAP dependence contribution plot of fibrinogen (FIB). Higher SHAP values for FIB correspond to an increased risk of transfusion, highlighting the variable’s significant role in the model’s predictions.
Additionally, our analysis revealed that a lower body mass index (BMI) is associated with a higher risk of transfusion (Fig. 4a). Previous research has linked low BMI with hypofibrinogenemia and an increased risk of bleeding31. Although our results did not show a simple linear relationship between preoperative fibrinogen levels and transfusion probability, higher fibrinogen levels seemed to increase the risk (Fig. 5b), potentially due to an underlying inflammatory state32,33. We also found that left ventricular dilation, an often overlooked variable, may contribute to increased transfusion needs by potentially reducing cardiac output and oxygen delivery during surgery. Our focused analysis on mitral valve surgery allowed us to uncover the significance of left ventricular dilation in this specific context, which may have been diluted in studies encompassing various types of cardiac surgery.
Regarding the machine learning approach, the LightGBM algorithm outperformed other models by effectively capturing complex, nonlinear relationships among variables. To enhance the interpretability of our model, we employed the SHAP method, which provides a clear visualization of how each variable contributes to the prediction. It is important to note that while machine learning models can identify strong associations, they do not prove causality.
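As an illustration of what a SHAP value represents, the sketch below computes exact Shapley values by enumerating feature coalitions for a single prediction. It is a brute-force calculation for a hand-written, hypothetical risk function (`toy_risk`) with made-up coefficients and a single reference patient as baseline; it is not the LightGBM model from this study, and SHAP libraries use much faster tree-specific algorithms in practice.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one prediction.

    Features outside a coalition are set to their baseline value.
    """
    names = list(x)
    n = len(names)

    def value(coalition):
        mixed = {f: (x[f] if f in coalition else baseline[f]) for f in names}
        return predict(mixed)

    phi = {}
    for f in names:
        others = [g for g in names if g != f]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_risk(p):
    """Hypothetical transfusion risk score (illustrative coefficients only)."""
    risk = 0.5
    risk += 0.02 * (40 - p["hct"])   # lower hematocrit raises risk
    risk += 0.01 * (24 - p["bmi"])   # lower BMI raises risk
    if p["hct"] < 35 and p["bmi"] < 20:
        risk += 0.10                 # nonlinear interaction term
    return risk


baseline = {"hct": 40, "bmi": 24}    # assumed reference patient
patient = {"hct": 32, "bmi": 19}     # assumed patient of interest
phi = shapley_values(toy_risk, patient, baseline)
print(phi)
```

The contributions sum to the difference between the patient's prediction and the baseline prediction, and the interaction term is shared between the two features. Averaging the absolute SHAP value of a feature across all patients gives a global importance measure of the kind used to rank predictors.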
Several limitations of this study should be acknowledged. First, as a retrospective cohort study, our analysis may be subject to selection bias due to the exclusion of patients with incomplete data. Although we mitigated this by using multicenter data, variations in data quality and clinical practices may still affect the results. Second, our study was conducted in hospitals that adhere to a restrictive transfusion strategy34 (transfusion initiated when hemoglobin is below 7 g/dL), which may limit the generalizability of our findings to settings with different protocols. Third, the heterogeneity of multicenter data and a relatively small sample size may have contributed to differences between the training and validation sets. Future studies should incorporate larger, more diverse datasets and consider weighted models to further improve prediction accuracy. Finally, additional transfusion-related variables—such as more detailed laboratory data and the use of antifibrinolytic therapy—could further enhance the predictive performance of the model, and these factors should be explored in future research.
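The gap between training and validation performance can be quantified directly from predicted probabilities. The sketch below computes AUC as the probability that a randomly chosen transfused patient receives a higher score than a randomly chosen non-transfused patient, with ties counted as one half. The labels and scores are made-up values, not study data, and the function name `roc_auc` is an assumption for illustration.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical predictions (1 = transfused).
train_auc = roc_auc([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1])
valid_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.2])
gap = train_auc - valid_auc
print(train_auc, valid_auc, gap)
```

A large positive gap of this kind is a common sign of overfitting or of a distribution shift between the training and validation cohorts.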
Conclusions
This study evaluated the ability of a machine learning model to predict the risk of red blood cell transfusion in patients undergoing mitral valve surgery. In addition, we identified key variables associated with the risk of needing red blood cell transfusion during and after surgery. Our next step is to develop a clinical scoring system for mitral valve surgery based on our research, which can automatically acquire data and provide accurate predictions, offering clinicians a more practical tool.
Author contributions
Yuhan Wang: Writing (initial draft and final manuscript), formal analysis, investigation, and graphing. Leping Liu: Writing (initial draft and final manuscript), formal analysis, and graphing. Kexin Fan: Writing (initial draft), data collation and curation. Yongjun Wang: Data collation. Jiyan Zhang: Data collation and supervision. Xianjun Ma: Data collation and supervision. Yuanshuai Huang: Data collation and supervision. Xinhua Wang: Data collation and supervision. Bingyu Chen: Data collation and supervision. Jinsong Zhang: Supervision, conceptualization, and revision. Rong Gui: Supervision, conceptualization, and revision. All authors read and approved the final draft.
Funding
This study was supported by the Wisdom Accumulation and Talent Cultivation Project of the Third Xiangya Hospital of Central South University (No. BJ202101).
Data availability
Given the Ethics Committee’s refusal to approve the data’s inclusion in a public database, the original datasets mentioned in this article cannot be freely accessed. Email the first author (Yuhan Wang) at yhan0617@163.com with any requests for access to the datasets.
Declarations
Competing interests
The authors declare no competing interests.
Ethics approval and consent to participate
The research containing data from human subjects was reviewed and approved by the Institutional Review Board of the Third Xiangya Hospital of Central South University (No. 2019-S008). The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). No identifiable patient data were collected. Because the model was constructed using retrospective data, our institutional review board waived written informed consent. Informed consent was obtained from patients for the prospective validation of the model.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Yuhan Wang, Leping Liu and Kexin Fan contributed equally to this work and therefore share first authorship.
Contributor Information
Jinsong Zhang, Email: 1445098592@qq.com.
Rong Gui, Email: guirong@csu.edu.cn.
References
- 1. Coffey, S. et al. Global epidemiology of valvular heart disease. Nat. Rev. Cardiol. 18, 853–864. 10.1038/s41569-021-00570-z (2021).
- 2. Aluru, J. S., Barsouk, A., Saginala, K., Rawla, P. & Barsouk, A. Valvular heart disease epidemiology. Med. Sci. 10, 32. 10.3390/medsci10020032 (2022).
- 3. Gammie, J. S. et al. Isolated mitral valve surgery: The society of thoracic surgeons adult cardiac surgery database analysis. Ann. Thorac. Surg. 106, 716–727. 10.1016/j.athoracsur.2018.03.086 (2018).
- 4. Jakobsen, C., Larsen, J. B., Fuglsang, J. & Hvas, A.-M. Mechanical heart valves, pregnancy, and bleeding: A systematic review and meta-analysis. Semin. Thromb. Hemost. 49, 542–552. 10.1055/s-0042-1756707 (2023).
- 5. Kim, H. J. et al. Perioperative red blood cell transfusion is associated with adverse cardiovascular outcomes in heart valve surgery. Anesth. Analg. 137, 153–161. 10.1213/ANE.0000000000006245 (2023).
- 6. Carson, J. L. et al. Red blood cell transfusion: 2023 AABB international guidelines. JAMA 330, 1892. 10.1001/jama.2023.12914 (2023).
- 7. Tran, M., Niu, C. & Kelley, W. Why are we donating less? Transfusion 64, 1154–1160. 10.1111/trf.17861 (2024).
- 8. Barnes, L. S. et al. AABB global transfusion forum, COVID-19 and the impact on blood availability and transfusion practices in low- and middle-income countries. Transfusion 62, 336–345. 10.1111/trf.16798 (2022).
- 9. Doshi, K. A., Shastry, S. & Pai, V. B. Transfusion requirement prediction score for patients undergoing cardiac surgery: An experience from a tertiary care set-up from south India. Transfus. Med. 31, 243–249. 10.1111/tme.12774 (2021).
- 10. Nguyen, Q., Meng, E., Berube, J., Bergstrom, R. & Lam, W. Preoperative anemia and transfusion in cardiac surgery: A single-centre retrospective study. J. Cardiothorac. Surg. 16, 109. 10.1186/s13019-021-01493-z (2021).
- 11. Lee, S. M. et al. Development and validation of a prediction model for need for massive transfusion during surgery using intraoperative hemodynamic monitoring data. JAMA Netw. Open 5, e2246637. 10.1001/jamanetworkopen.2022.46637 (2022).
- 12. Greener, J. G., Kandathil, S. M., Moffat, L. & Jones, D. T. A guide to machine learning for biologists. Nat. Rev. Mol. Cell Biol. 23, 40–55. 10.1038/s41580-021-00407-0 (2022).
- 13. Jo, C. et al. Transfusion after total knee arthroplasty can be predicted using the machine learning algorithm. Knee Surg. Sports Traumatol. Arthrosc. 28, 1757–1764. 10.1007/s00167-019-05602-3 (2020).
- 14. Liu, L.-P. et al. Machine learning for the prediction of red blood cell transfusion in patients during or after liver transplantation surgery. Front. Med. 8, 632210. 10.3389/fmed.2021.632210 (2021).
- 15. Peng, H., Long, F. & Ding, C. Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. IEEE Trans. Pattern Anal. Mach. Intell. 27, 1226–1238. 10.1109/TPAMI.2005.159 (2005).
- 16. Bugata, P. & Drotar, P. On some aspects of minimum redundancy maximum relevance feature selection. Sci. China Inf. Sci. 63, 112103. 10.1007/s11432-019-2633-y (2020).
- 17. Ma, X.-H., Chen, Z.-G. & Liu, J.-M. Wavelength selection method for near-infrared spectroscopy based on max-relevance min-redundancy. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 310, 123933. 10.1016/j.saa.2024.123933 (2024).
- 18. Rufibach, K. Use of brier score to assess binary predictions. J. Clin. Epidemiol. 63, 938–939. 10.1016/j.jclinepi.2009.11.009 (2010).
- 19. Rodríguez-Pérez, R. & Bajorath, J. Interpretation of compound activity predictions from complex machine learning models using local approximations and shapley values. J. Med. Chem. 63, 8761–8777. 10.1021/acs.jmedchem.9b01101 (2020).
- 20. Ming, Y. et al. Large volume acute normovolemic hemodilution in patients undergoing cardiac surgery with intermediate-high risk of transfusion: A randomized controlled trial. J. Clin. Anesth. 87, 111082. 10.1016/j.jclinane.2023.111082 (2023).
- 21. Ming, Y. et al. Transfusion of red blood cells, fresh frozen plasma, or platelets is associated with mortality and infection after cardiac surgery in a dose-dependent manner. Anesth. Analg. 130, 488–497. 10.1213/ANE.0000000000004528 (2020).
- 22. Li, Y. et al. Prognostic association between perioperative red blood cell transfusion and postoperative cardiac surgery outcomes. Front. Cardiovasc. Med. 8, 730492. 10.3389/fcvm.2021.730492 (2021).
- 23. Tang, M. et al. Fewer transfusions are still more: Red blood cell transfusions affect long-term mortality in cardiac surgery. Eur. J. Cardio-Thorac. Surg. 63, ezad101. 10.1093/ejcts/ezad101 (2023).
- 24. Tomic Mahecic, T., Dünser, M. & Meier, J. RBC transfusion triggers: Is there anything new? Transfus. Med. Hemother. 47, 361–369. 10.1159/000511229 (2020).
- 25. Carson, J. L. et al. Transfusion thresholds for guiding red blood cell transfusion. Cochrane Database Syst. Rev. 10.1002/14651858.CD002042.pub5 (2021).
- 26. Konda, S. R. et al. Transfusion thresholds can be safely lowered in the hip fracture patient: A consecutive series of 1,496 patients. J. Am. Acad. Orthop. Surg. 31, 349–356. 10.5435/JAAOS-D-22-00582 (2023).
- 27. Teutsch, B. et al. Potential benefits of restrictive transfusion in upper gastrointestinal bleeding: A systematic review and meta-analysis of randomised controlled trials. Sci. Rep. 13, 17301. 10.1038/s41598-023-44271-8 (2023).
- 28. Kougias, P., Sharath, S., Mi, Z., Biswas, K. & Mills, J. L. Effect of postoperative permissive anemia and cardiovascular risk status on outcomes after major general and vascular surgery operative interventions. Ann. Surg. 270, 602–611. 10.1097/SLA.0000000000003525 (2019).
- 29. Min, Y. et al. Blood transfusion in cardiac surgeries: Toward a personalized protocol. Am. J. Surg. 227, 237–238. 10.1016/j.amjsurg.2023.07.035 (2024).
- 30. Althoff, F. C. et al. Multimodal patient blood management program based on a three-pillar strategy: A systematic review and meta-analysis. Ann. Surg. 269, 794–804. 10.1097/SLA.0000000000003095 (2019).
- 31. Tanaka, K. A. et al. Impact of preoperative hematocrit, body mass index, and red cell mass on allogeneic blood product usage in adult cardiac surgical patients: Report from a statewide quality initiative. J. Cardiothorac. Vasc. Anesth. 37, 214–220. 10.1053/j.jvca.2022.03.034 (2023).
- 32. Charbonneau, H., Pasquie, M. & Mayeur, N. Preoperative plasma fibrinogen level and transfusion in cardiac surgery: A biphasic correlation. Interact. Cardiovasc. Thorac. Surg. 31, 622–625. 10.1093/icvts/ivaa153 (2020).
- 33. Mion, S. et al. U-shaped relationship between pre-operative plasma fibrinogen levels and severe peri-operative bleeding in cardiac surgery. Eur. J. Anaesthesiol. 37, 889–897. 10.1097/EJA.0000000000001246 (2020).
- 34. National Health Commission of the People’s Republic of China. Guideline for Perioperative Patient Blood Management (WS/T 796-2022) (2022). http://www.nhc.gov.cn/fzs/s7852d/202202/1484f47dcf824ee8bae3ad7b6809e603/files/e06866408d504b64a2df649cd1ff4b44.pdf. Accessed 17 July 2024.