Interpretable machine learning model for predicting anastomotic leak after esophageal cancer surgery via LightGBM

Xiaodong Yang; Fulin Dou; Guoshuo Tang; Ruipu Xiu; Xiaogang Zhao

doi:10.1186/s12885-025-14387-3

2025 Jun 1;25:976. doi: 10.1186/s12885-025-14387-3

PMCID: PMC12128519 PMID: 40452009

Abstract

Background

Postoperative anastomotic leakage (AL) is a severe complication following esophageal cancer surgery that often leads to a poor prognosis. This study aimed to develop an interpretable machine learning (ML) model to predict AL occurrence and identify associated risk factors.

Methods

A retrospective case‒control study analyzed clinical and laboratory data from esophageal cancer patients obtained via a case management system. Nine machine learning (ML) models were compared to identify the best-performing model and its optimal feature set. The selected LightGBM-based model underwent internal cross-validation and external validation. Performance was evaluated via metrics such as ROC, DCA, and PR curves. To enhance interpretability, the SHapley Additive exPlanations (SHAP) method was applied for feature analysis.

Results

Data from a total of 406 esophageal cancer patients were collected, and the LightGBM-based model showed the best performance. The model included the following features: lesion length, McKeown surgery, gastrointestinal decompression drainage (GID) volume on postoperative day 1, and prealbumin difference. SHAP dependence plots were created for each variable to understand their impact on the outcome. The model achieved an AUC of 0.956 (95% CI: 0.934–0.978).

Conclusion

This study successfully developed an interpretable ML model based on LightGBM to predict postoperative AL in patients with esophageal cancer.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12885-025-14387-3.

Keywords: Esophageal cancer, Anastomotic leak, LightGBM, Interpretable machine learning, Postoperative complication prediction

Introduction

Esophageal cancer is a common malignant tumor of the digestive system [1], and surgical resection is the primary treatment [2]. However, AL is a major postoperative complication that negatively affects patient prognosis, increasing mortality, hospital stay duration, and healthcare costs [3]. Reducing and preventing AL after surgery is crucial for improving treatment outcomes and minimizing adverse consequences [4, 5].

Traditional methods for predicting ALs rely on clinical experience and a limited set of indicators [6, 7]. However, due to the influence of multiple factors, these predictions are often inaccurate. With the increasing volume of medical data and advancements in computational power, machine learning (ML) techniques have been widely applied to predict surgical complications [8–10].

Despite the strengths of ML models, they are limited by their lack of interpretability [11]. Several studies have ranked the importance of various features in predicting AL in postoperative esophageal cancer patients using ML models, but few have provided individual-level explanations [10]. To address this limitation, this study employs the SHapley Additive exPlanations (SHAP) method to interpret the ML models, uncover the importance of different features, and develop an interpretable model that accurately predicts the occurrence of AL after esophageal cancer surgery [12]. The model is specifically developed for thoracic surgeons to assist in clinical decision-making.

Method

Data collection

This study utilized a retrospective case‒control design. A total of 406 patients who were diagnosed with esophageal cancer and who underwent curative resection at the Department of Thoracic Surgery, Shandong University Second Hospital, between March 1, 2021 and April 30, 2024, were consecutively enrolled. Both clinical and laboratory data were collected for each patient. All the data were collected by clinicians, and any discrepancies were reviewed and discussed by the team to resolve any issues.

Inclusion and exclusion criteria

The inclusion criteria were as follows: (1) age ≥ 18 years with a pathologically confirmed diagnosis of esophageal cancer; (2) no distant metastasis, with curative resection performed; (3) good overall health, with no significant cardiopulmonary dysfunction; and (4) gastric tube reconstruction. Patients with more than 25% of their clinical or laboratory data missing were excluded.

Surgery

All curative esophageal cancer surgeries were performed by experienced surgeons, each having performed at least 20 cases annually. The choice of surgical approach—whether Ivor-Lewis, McKeown, or Sweet—was based on the tumor location and the surgeon’s preference.

Observation indicators

We collected 192 features from patients’ medical records in four groups. General information: sex, age, body mass index (BMI), smoking history, and medical history (e.g., hypertension, diabetes, coronary heart disease). Laboratory results: preoperative and postoperative (within 24 h) laboratory tests, including complete blood count (CBC), liver and kidney function tests, and other relevant examinations. Surgical information: surgical duration, intraoperative blood loss, surgical approach, and drainage volume within the first 3 days after surgery. Pathology results: histological type of the tumor, number of lymph nodes removed, and Ki67-positive rate. The data were collected by two doctors, and 10% of the records were randomly selected by the team leader for comparison and verification to evaluate the accuracy of data collection.

Diagnostic criteria for AL

The diagnosis of AL was based on the definition proposed by the Esophagectomy Complications Consensus Group, in conjunction with the relevant literature [13]. A comprehensive assessment was performed via CT, contrast imaging, and clinical symptoms to identify the presence of ALs [14, 15]. Neck leakage: Redness and swelling around the anastomotic site, induration or abscess formation, and air or saliva-like gas leakage upon coughing. Thoracic leak: Fever, elevated inflammatory markers, and purulent drainage from mediastinal or thoracic drainage tubes. Contrast imaging can show extravasation of contrast agents, whereas CT imaging may reveal fluid in the mediastinum or thoracic cavity, with pus obtained upon aspiration. On the basis of the diagnostic criteria outlined above, the diagnosis of AL for each patient was determined through group discussion.

Statistical methods

This study followed the TRIPOD (Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis) guidelines (Supplementary Material 1) [16]. Statistical analysis was performed via R software (version 4.1.3).

Sample size calculation

To estimate the sample size, the pmsampsize package (version 1.1.3) was used [17]. Based on previous studies, the model’s C-index was approximately 0.8, considering around 8 parameters [6]. The occurrence rate of AL was assumed to be between 15% and 30%, with an expected incidence of 20% [18–20]. The required minimum sample size was calculated to be 355 cases.

Given the multiple preoperative and postoperative test results and the potential for missing data due to the study’s retrospective design, a 10% data loss rate was anticipated. The final calculated sample size was 394 cases. To ensure adequate statistical power, the study aimed to enrol 400 patients.
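The inflation step above is simple arithmetic and can be sketched as follows. This is an illustrative check only, not the authors' code: the function name is hypothetical, and rounding to the nearest whole case is an assumption chosen because it reproduces the reported 394 (rounding up would give 395).

```python
def inflate_for_data_loss(min_n: int, loss_rate: float) -> int:
    """Inflate a minimum sample size for an anticipated fraction of unusable records.

    Assumption: the result is rounded to the nearest whole case.
    """
    if not 0 <= loss_rate < 1:
        raise ValueError("loss_rate must be in [0, 1)")
    return round(min_n / (1 - loss_rate))


# 355 cases required by the sample size calculation, 10% anticipated data loss.
target_n = inflate_for_data_loss(355, 0.10)
print(target_n)
```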

Model development and comparison

In this study, nine ML algorithms were selected for analysis: decision tree (DT), random forest (RF), extreme gradient boosting (XGBoost), elastic net (ENET), support vector machine (SVM), multilayer perceptron (MLP), logistic regression (LR), light gradient boosting machine (LightGBM), and k-nearest neighbors (KNN). To optimize the prediction models, grid search combined with manual fine-tuning was used to determine the final hyperparameters for each algorithm. Hyperparameter optimization was performed using the R ‘tune’ package. First, we set the hyperparameter ranges for each model based on its characteristics. Next, we used 5-fold cross-validation to evaluate the candidate parameter combinations. Finally, the select_best function identified the parameter set with the highest AUC.
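The selection logic described above (score each candidate on five folds, keep the one with the highest AUC) can be sketched in plain Python. This is a minimal illustration, not the R ‘tune’ workflow used in the study: the helper names, the unstratified contiguous folds, and the per-fold AUC numbers are all assumptions made for the example.

```python
from statistics import mean


def auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def kfold_indices(n, k=5):
    """Split range(n) into k contiguous folds of near-equal size (not stratified)."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def pick_best(cv_results):
    """Return the candidate whose mean cross-validated AUC is highest."""
    return max(cv_results, key=lambda name: mean(cv_results[name]))


# Hypothetical per-fold AUCs for two candidate settings (illustrative numbers only).
cv_results = {
    "num_leaves=15": [0.91, 0.93, 0.90, 0.94, 0.92],
    "num_leaves=31": [0.89, 0.95, 0.88, 0.93, 0.90],
}
print(pick_best(cv_results))
print([len(fold) for fold in kfold_indices(356, k=5)])
```

With an outcome rate near 15%, stratified folds would usually be preferable so that every fold contains AL cases; the contiguous split here is kept only for brevity.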

The SHAP method was used to rank input feature importance and interpret the results of the predictive models [12]. Using SHAP, we assessed and ranked variable importance for each model, selecting the top 10 most influential variables from each method for further analysis. The best ML method and optimal feature set were determined based on performance indicators, including the area under the curve (AUC), sensitivity, and specificity, to construct the final predictive model.

Feature selection safeguards

During data partitioning, we used temporal isolation to divide the dataset into a training set (356 samples) and a validation set (45 samples). Feature selection adhered to strict chronological independence principles. Model development was confined to the training data to avoid data leakage. Feature stability was assessed through five-fold cross-validation and SHAP value ranking consistency checks. Features were selected if they exhibited ≥ 80% ranking consistency within the top 10 across validation iterations. Dimensionality was controlled using the ‘pmsampsize’ package, which limited the number of features to 8 via parameter constraints to prevent model overfitting.
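The stability rule above (keep a feature if it ranks in the top 10 in at least 80% of validation iterations) can be sketched as follows. The function name and the toy rankings are hypothetical; in the study the rankings would come from SHAP importance in each cross-validation fold.

```python
def stable_features(rankings, top_k=10, min_share=0.8):
    """Return features that appear in the top_k of at least min_share of the rankings.

    rankings: one list per validation iteration, ordered from most to least important.
    """
    counts = {}
    for ranked in rankings:
        for name in set(ranked[:top_k]):
            counts[name] = counts.get(name, 0) + 1
    return sorted(
        name for name, count in counts.items()
        if count / len(rankings) >= min_share
    )


# Toy example with five iterations and top_k=2 (feature names are placeholders).
toy_rankings = [
    ["gid_pod1", "mckeown", "lesion_length"],
    ["mckeown", "gid_pod1", "lesion_length"],
    ["gid_pod1", "lesion_length", "mckeown"],
    ["gid_pod1", "mckeown", "lesion_length"],
    ["mckeown", "gid_pod1", "lesion_length"],
]
print(stable_features(toy_rankings, top_k=2))
```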

Model evaluation

The clinical utility of the final model was assessed using the following metrics: the receiver operating characteristic (ROC) curve, decision curve analysis (DCA), and precision‒recall curve (P-R). The NUn score, a widely used predictive model for AL, was also considered. This score is based on postoperative C-reactive protein (CRP), white blood cell count (WCC), and albumin levels: NUn score = 11.3894 + 0.005 (CRP) + 0.186 (WCC) − 0.174 (albumin) [21]. The predictive performance of the NUn model was compared with that of our ML model to evaluate the effectiveness of both approaches.
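The NUn formula quoted above translates directly into a small function. This is a sketch for illustration: the units (CRP in mg/L, WCC in ×10^9/L, albumin in g/L) are an assumption not stated in this section, and the input values in the example are invented.

```python
def nun_score(crp: float, wcc: float, albumin: float) -> float:
    """NUn score = 11.3894 + 0.005 (CRP) + 0.186 (WCC) - 0.174 (albumin).

    Assumed units: CRP in mg/L, WCC in x10^9/L, albumin in g/L.
    """
    return 11.3894 + 0.005 * crp + 0.186 * wcc - 0.174 * albumin


# Invented example values, not patient data.
print(round(nun_score(crp=100, wcc=10, albumin=30), 4))
```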

Considering the potential impact of our center’s annual surgical volume and the study’s time span on the results, we decided to use the full sample as the training set and performed 5-fold cross-validation to validate and evaluate the final model.

To assess the model’s applicability to external data, we collected additional data from patients who underwent esophagectomy for esophageal cancer in our department between May 1, 2024 and February 28, 2025. The inclusion and exclusion criteria were identical to those used in the training set. We gathered data on the incidence of AL and other variables required by the final model. The ROC curve was plotted based on the predictions from the validation dataset to evaluate the model’s performance.

Model interpretation and application

This study used the SHAP method to assess the impact of various variables on the outcome variable, AL. SHAP provides both global and local explanations. The global explanation assigns consistent and accurate importance values to each feature, revealing how each input variable contributes to the model’s prediction of AL, which allows for a comprehensive understanding of how features influence decision-making across all cases. In contrast, the local explanation offers insights into individual predictions, demonstrating how specific features affect the model’s output for a given patient based on their unique data inputs.
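To make the local explanation concrete, the sketch below computes exact Shapley values for a four-feature toy model by enumerating all feature subsets. It is an illustration under stated assumptions, not the study's pipeline: the risk function and its coefficients are invented, a single reference patient stands in for the background data, and the SHAP library itself (which uses faster tree-specific algorithms) is not called.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x relative to a single baseline input."""
    n = len(x)

    def value(subset):
        # Features in `subset` take the patient's values; the rest take baseline values.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_risk(z):
    """Invented risk function; coefficients are placeholders, not fitted values."""
    gid_ml, mckeown, lesion_cm, prealb_diff = z
    low_drainage = 0.10 if gid_ml < 20 else 0.0
    return (0.05 + low_drainage + 0.15 * mckeown + 0.02 * lesion_cm
            + 0.01 * mckeown * lesion_cm + 0.005 * prealb_diff)


patient = [10, 1, 4.3, 12.6]     # GID POD1 (mL), McKeown (0/1), lesion (cm), prealbumin diff
reference = [80, 0, 3.2, 5.5]    # hypothetical reference patient
phi = shapley_values(toy_risk, patient, reference)
print([round(p, 4) for p in phi])
```

The contributions sum to the difference between the patient's prediction and the reference prediction, which is the additivity property that lets a single prediction be decomposed feature by feature.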

To facilitate model application, we developed a web application using the Shiny framework. Users can input feature values from the final model, and the system will return the predicted probability of AL, accompanied by a bar chart showing the importance of each feature.

Handling missing data

Missing values were imputed via multiple imputation [22]. For categorical data, the most frequent imputed value was used, while for continuous data, the mean imputed value was applied.
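The pooling rule described above (most frequent imputed value for categorical variables, mean imputed value for continuous ones) can be sketched for a single missing cell. The function name and the example draws are hypothetical, and the sketch assumes the imputed values for that cell have already been produced by the imputation procedure.

```python
from statistics import mean, mode


def pool_imputed_cell(imputed_values, categorical):
    """Collapse several imputed draws for one missing cell into a single value.

    Categorical: most frequent draw. Continuous: mean of the draws.
    """
    return mode(imputed_values) if categorical else mean(imputed_values)


# Hypothetical draws from five imputed datasets for one patient.
print(pool_imputed_cell([1, 0, 1, 1, 0], categorical=True))
print(pool_imputed_cell([20.0, 22.0, 24.0, 21.0, 23.0], categorical=False))
```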

Results

Patient characteristics

A total of 406 patients were initially enrolled in the study. After 50 patients with more than 25% of their test data missing were excluded, 356 patients remained for the analysis. Among them, 55 patients (15.4%) were diagnosed with AL.

Baseline characteristics

Table 1 summarizes the baseline clinical, demographic, and perioperative test results for patients with and without AL. No statistically significant differences were observed between the two groups for parameters such as sex, age, smoking history, alcohol consumption, or hypertension (P > 0.05). However, significant differences were noted in variables such as lesion length, laparoscopic surgery, surgical approach, red cell volume, postoperative sputum culture, GID volume on POD 1/2/3, and preoperative red blood cell count (P < 0.05). Detailed statistical information for all variables is available in Table S1.

Table 1.

Comparison of selected clinical and laboratory characteristics and outcomes according to the occurrence of anastomotic leakage. Qualitative data are presented as frequency (percentage) and were compared with chi-square tests. Quantitative data are presented as mean (standard deviation) and were compared with independent-samples t-tests. * denotes 0.01 ≤ p < 0.05, ** denotes p < 0.01. AL: anastomotic leakage; RBC: red blood cell count; Hb: hemoglobin; GID: gastrointestinal decompression drainage; POD: postoperative day; BMI: body mass index

		AL group (n = 55)	Non-AL group (n = 301)	P
Gender	Male	48 (87.27%)	241 (80.07%)
	Female	7 (12.73%)	60 (19.93%)	0.280
Age, year		65.44 (7.2)	64.82 (7.96)	0.640
BMI, kg/m²		23.41 (3.04)	23.14 (3.51)	0.340
Lesion length, cm		3.56 (1.4)	3.15 (1.27)	0.016*
Esophageal Dilation	yes	11 (20%)	26 (8.64%)
	no	44 (80%)	275 (91.36%)	0.020*
Laparoscopic Surgery	yes	46 (83.64%)	178 (59.14%)
	no	9 (16.36%)	123 (40.86%)	<0.001**
Surgery Type	Sweet	9 (16.36%)	124 (41.2%)
	Ivor-Lewis	1 (1.82%)	22 (7.31%)
	McKeown	45 (81.82%)	155 (51.5%)	<0.001**
	no	36 (65.45%)	245 (81.4%)	0.010*
GID volume on POD 1, ml		53.07 (70.87)	85.68 (103.87)	0.002**
GID volume on POD 2, ml		120.82 (183.09)	196.71 (183.04)	<0.001**
GID volume on POD 3, ml		127.09 (170.68)	202.61 (186.73)	<0.001**
Preoperative Glutamate Dehydrogenase, U/L		3.16 (2.49)	4.11 (4.78)	0.035*
Preoperative RBC, ×10^12/L		4.41 (0.44)	4.21 (0.52)	0.004**
Preoperative Hb Concentration, g/L		136.23 (14.86)	130 (18.16)	0.014*
Preoperative Hematocrit, %		41.75 (4.02)	38.85 (5.14)	<0.001**
Postoperative Magnesium, mmol/L		0.83 (0.1)	0.8 (0.11)	0.044*
Postoperative Fucosidase, U/L		20.38 (4.6)	18.28 (5.11)	0.003**
Postoperative Conjugated Bilirubin, µmol/L		2.27 (1.33)	1.88 (1.27)	0.047*
Postoperative RBC, ×10^12/L		4.19 (0.53)	4.04 (0.55)	0.047*
Postoperative Hb Concentration, g/L		130.96 (17.3)	125.61 (17.44)	0.038*
Postoperative Hematocrit, %		39.4 (4.84)	37.87 (4.86)	0.033*
Globulin difference, g/L		2.84 (4.13)	1.06 (4.49)	0.008**
Albumin/Globulin Ratio difference		0.01 (0.28)	0.14 (0.34)	0.002**
Prealbumin difference, mg/dL		6.97 (4.96)	5.34 (4.39)	0.031*
Fucosidase difference, U/L		-3.94 (4)	-2.51 (4.14)	0.030*
Magnesium difference, mmol/L		0.06 (0.1)	0.09 (0.12)	0.027*


Model development and comparison

In this study, all variables were included to develop nine ML models for predicting the likelihood of AL. The top 10 most important features for each model, ranked by SHAP values, are shown in Figures S1a–i.

As the number of features was progressively reduced based on their importance, noticeable changes were observed in the area under the curve (AUC) values across the nine models. Among them, the LightGBM model demonstrated the best overall performance, with superior evaluation metrics, stable predictive efficacy, and the highest predictive capability, as shown in Fig. 1.

Fig. 1. Performance metrics of nine machine learning (ML) algorithms under varying feature count conditions. (a) AUC, (b) Sensitivity, (c) Specificity, (d) F1 score, (e) Youden’s index (J), and (f) PPV of nine machine learning algorithms under different feature count conditions. Abbreviations: AUC: Area under the ROC curve, SENS: sensitivity, SPEC: specificity, F1: F1 score, J: Youden’s index, PPV: Positive predictive value, DT: Decision Tree, RF: Random Forest, XGBoost: Extreme Gradient Boosting, ENET: Elastic Net, SVM: Support Vector Machine, MLP: Multilayer Perceptron, LR: Logistic Regression, LightGBM: Light Gradient Boosting Machine, KNN: K-Nearest Neighbors

Thus, LightGBM outperformed the other models in predicting AL. Figure 1 also illustrates the performance of all models with varying feature sets, detailing metrics such as sensitivity, specificity, positive predictive value (PPV), accuracy, Youden’s index, and F1 score.

These findings highlight the robustness and predictive superiority of the LightGBM model in identifying patients at risk for AL.

Final model selection

The final model was selected during the feature reduction process of the LightGBM model, as shown in Fig. 1 and Table S2. The 4-feature model achieved a higher AUC than the 8-feature model (ΔAUC = 0.25, p = 0.051), although the difference did not reach significance at the 0.05 level. Based on the sample size estimation, the maximum number of feature variables was set at 8.

Considering both clinical applicability and model performance, the 4-feature LightGBM model was selected as the final predictive model. This model included GID volume on POD 1, McKeown surgery, lesion length, and prealbumin difference.

The model demonstrated excellent predictive performance, with an AUC of 0.956 (95% CI: 0.934–0.978), a sensitivity of 0.900, a specificity of 0.855, a positive predictive value (PPV) of 0.971, a negative predictive value (NPV) of 0.610, a Youden index of 0.755, and an F1 score of 0.934.
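The derived metrics above can be cross-checked from the reported sensitivity, specificity, and PPV. The sketch below is an arithmetic check only; it assumes that PPV plays the role of precision and sensitivity the role of recall in the F1 score, and the function names are illustrative.

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden's J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1


def f1_score(precision: float, recall: float) -> float:
    """F1 = harmonic mean of precision (PPV) and recall (sensitivity)."""
    return 2 * precision * recall / (precision + recall)


# Values reported for the final 4-feature LightGBM model.
print(round(youden_index(0.900, 0.855), 3))
print(round(f1_score(precision=0.971, recall=0.900), 3))
```

Both results agree with the reported Youden index of 0.755 and F1 score of 0.934.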

Model evaluation and comparison

To validate the sample size and model robustness, 5-fold cross-validation was performed. The results showed an AUC of 0.955 (95% CI: 0.949–0.969), as illustrated in Fig. 2a. A total of 45 patients were included in the external dataset, 5 of whom (15.2%) experienced AL. The specific parameters for these variables are detailed in Table S3. The ROC curve for the validation dataset, shown in Fig. 2a, demonstrated an AUC of 0.756.

To compare the predictive performance of our model with a classical model, the ROC curve for the NUn model was generated, and the AUC was 0.531, as shown in Fig. 2a.

Additionally, the P-R curve for the ML model is shown in Fig. 2b, with a P-R AUC of 0.833. The DCA curve in Fig. 2c indicates that the model has good clinical utility.
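For readers unfamiliar with DCA, the quantity plotted on a decision curve is the net benefit at each threshold probability. The sketch below uses the standard net-benefit definition with invented labels and probabilities; it is not the authors' code, and the function names are hypothetical.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of treating patients whose predicted probability >= threshold.

    NB = TP/n - FP/n * threshold / (1 - threshold)
    """
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Invented labels (1 = AL) and predicted probabilities, for illustration only.
toy_labels = [1, 0, 0, 1, 0]
toy_probs = [0.9, 0.2, 0.6, 0.7, 0.1]
print(net_benefit(toy_labels, toy_probs, threshold=0.5))
print(net_benefit_treat_all(toy_labels, threshold=0.5))
```

A model is considered clinically useful over the range of thresholds where its net benefit exceeds both the treat-all strategy and the treat-none strategy (net benefit of zero).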

Model interpretation

To ensure the clinical utility of the predictive model, SHAP was used to calculate each variable’s contribution to the predictions, providing interpretability for the final model. As shown in Fig. 3a, the SHAP summary plot illustrates the model’s functionality. The features are ranked in descending order based on their mean SHAP values, reflecting their contributions to the predictions.

SHAP dependence plots (Fig. 3b) and partial dependence plots (PDPs, Figure S2) together revealed distinct risk patterns for AL across clinical features. SHAP values (with positive values indicating elevated risk) were consistent with PDP trajectories for all major predictors. For GID volume on POD 1, SHAP values exceeded zero when drainage volumes fell below 20 mL (Fig. 3b), matching the risk thresholds identified in PDP analysis (Figure S2a). McKeown surgery showed consistently positive SHAP values, with PDP analysis confirming a higher AL risk compared to non-McKeown surgery (Figure S2b). Lesion length greater than 3.2 cm showed progressively increasing SHAP values (Fig. 3b), corresponding to incremental risk increases shown in PDP analysis (Figure S2c). Prealbumin difference greater than 12 mg/dL resulted in SHAP values greater than zero (Fig. 3b), matching the risk stabilization pattern observed in PDP analysis beyond this threshold (Figure S2d).

Local interpretability analysis was conducted to explore how the model generated predictions for specific cases based on personalized data inputs. Figure 3c presents an example of a patient who developed AL after esophagectomy. This patient underwent McKeown surgery and had a GID volume on POD 1 of 10 mL, a lesion length of 4.3 cm, and a prealbumin difference of 12.6 mg/dL. According to the model, these factors collectively shifted the prediction towards the “AL” category.

Clinical application

The final predictive model was integrated into a web-based application to enhance its usability in clinical settings (https://yxd369152.shinyapps.io/app_AL_eso/). As shown in Fig. 3d, the application allows users to input the actual values of the four key feature variables. Based on these inputs, the application automatically predicts the probability of AL for each individual.

Discussion

This study involved 356 esophageal cancer patients and 192 clinical feature variables, including patient characteristics, surgical details, postoperative complications, and perioperative laboratory results. Nine ML models were compared to assess their ability to predict AL after surgery.

Although several studies have focused on predicting AL, its heterogeneity in clinical practice makes it challenging to apply individual clinical factors effectively [6, 7, 10]. This study incorporated a large set of preoperative and postoperative laboratory results, and calculated the changes in test values before and after surgery. These changes were then used as input features for the ML analysis.

Among the nine ML models, LightGBM demonstrated the best predictive performance and had a high threshold probability for feature reduction [23]. Multiple studies have confirmed that LightGBM exhibits excellent predictive performance in the medical field [24, 25].

Our study identified four risk factors for AL: GID volume on POD 1, prealbumin difference, McKeown surgery, and lesion length.

The use of GID after esophageal cancer surgery remains controversial [26]. Some argue that it reduces tension at the anastomosis site, lowering the risk of AL and promoting recovery [27]. However, other studies suggest that it does not significantly affect AL occurrence and may increase the risk of complications, such as pneumonia. Consequently, early removal of the decompression tube is often recommended [28].

Our analysis indicated that GID volumes below 20 mL on POD 1 were significantly associated with an increased risk of AL (p < 0.05). Mechanistically, drainage volumes below 20 mL may indicate GID tube dysfunction (e.g., obstruction or malposition). This impaired drainage capacity can lead to fluid accumulation, which increases anastomotic tension and promotes AL development. Cross-validation results (Figure S1) further confirmed these findings, showing that low GID volumes consistently ranked among the top 10 predictors across machine learning models. We recommend promptly assessing GID tube functionality (e.g., tube flushing or radiographic evaluation) when drainage volumes fall below 20 mL, as early intervention may reduce anastomotic tension and the risk of subsequent AL [27].

Preoperative serum albumin (ALB) levels are commonly used to assess nutritional status [29]. Previous studies have confirmed that low postoperative ALB levels are an independent risk factor for postoperative AL [30].

Our study included the difference between preoperative and postoperative prealbumin values in the analysis. The results showed that a greater prealbumin difference was associated with a greater risk of AL. When the prealbumin difference exceeds 12 mg/dL, the risk of AL increases significantly. Prealbumin is a more sensitive indicator of nutritional status than albumin. Clinicians should focus on protein supplementation for patients with normal albumin levels but a noticeable decrease in prealbumin levels to reduce the risk of AL.

Research shows that the overall AL rate in thoracic anastomosis is 12.3%, compared to 34.1% in cervical anastomosis [18]. The McKeown procedure places the anastomosis in the neck, which is farther from the blood supply, making it more prone to AL [19]. Our study also found that the McKeown approach is an independent risk factor for AL. Other studies have indicated that larger tumor volumes are linked to surgical complications [31]. Our findings suggest that longer lesion length is also a high-risk factor for AL. This highlights the importance of carefully selecting the surgical approach and monitoring the patient’s postoperative nutritional status and GID to minimize AL risk.

The external validation results, shown in Fig. 2a, report an AUC of 0.756, indicating moderate predictive performance. The decreased performance compared to the training dataset is likely attributable to the smaller sample size. Despite this limitation, the results highlight the critical role of the four key variables in predicting AL.

The NUn score is a well-established model for predicting postoperative AL in esophageal cancer, with some studies supporting its predictive ability [32–34]. In this study, we assessed the NUn model using POD1 data, which yielded an AUC value of 0.531, indicating poor predictive performance. In comparison, our newly developed model demonstrated superior predictive ability for postoperative AL in esophageal cancer and shows significant clinical potential. The poor performance of the NUn model in our dataset may stem from our focus on postoperative test results within the first 24 h, while the NUn model relies on daily postoperative data [21, 33]. This approach is challenging to implement in clinical practice and may lead to causal confusion, limiting its clinical applicability.

Our model, developed using the LightGBM algorithm, showed strong early predictive ability for postoperative AL in esophageal cancer. Its reliability and effectiveness were validated through internal cross-validation, DCA curves, and P-R curves. Additionally, a web application based on the Shiny framework was developed (https://yxd369152.shinyapps.io/app_AL_eso/), enabling clinicians to easily access and use the model.

Limitations

This study has several limitations that should be considered. First, the single-center retrospective design limits the generalizability of the results, although external validation in a cohort (n = 45) showed moderate predictive value (AUC = 0.756). The small validation sample size may reduce the accuracy of the model’s discrimination. Second, although standardized 24-hour drainage measurements were recorded daily at 06:00, improved quality control measures are recommended, such as using calibrated instruments, recording at 6-hour intervals, and implementing data audit mechanisms. Third, the retrospective design prevented objective assessments, such as radiographic catheter evaluation or pressure monitoring, which may have caused classification bias in low-drainage (< 20 mL) cases. Finally, although the four predictors (GID volume on POD 1, McKeown surgery, lesion length, and prealbumin difference) are biologically plausible, multicenter prospective studies are needed to confirm their clinical utility.

Conclusion

By extracting clinical laboratory data from the medical records system, we developed an interpretable ML model for predicting AL. The final LightGBM model showed excellent predictive ability for AL during internal validation. Our study revealed that McKeown surgery, lesion length, GID volume on POD 1 and prealbumin differences are high-risk factors for AL.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1 (PDF, 1,003.5 KB)

Acknowledgements

Not applicable.

Abbreviations

AL: Anastomotic Leak
ML: Machine Learning
LightGBM: Light Gradient Boosting Machine
SHAP: SHapley Additive exPlanations
AUC: Area Under the ROC Curve
ROC: Receiver Operating Characteristic
DCA: Decision Curve Analysis
PR: Precision-Recall
TRIPOD: Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis
GID: Gastrointestinal Decompression Drainage
POD: Postoperative Day
CBC: Complete Blood Count
BMI: Body Mass Index
CRP: C-Reactive Protein
WCC: White Cell Count
DT: Decision Tree
RF: Random Forest
XGBoost: Extreme Gradient Boosting
ENET: Elastic Net
SVM: Support Vector Machine
MLP: Multilayer Perceptron
LR: Logistic Regression
KNN: K-Nearest Neighbors
CI: Confidence Interval
PPV: Positive Predictive Value
NPV: Negative Predictive Value
NUn: Noble and Underwood (Score)
CT: Computed Tomography

Author contributions

X.Y. designed the study, analyzed the data, and wrote the original manuscript. F.D. performed result interpretation and prepared all figures. R.X. collected clinical data, conducted statistical analyses, and processed the datasets. G.T. contributed to data collection and analysis. X.Z. designed the study framework and revised the manuscript critically. All authors reviewed and approved the final version of the manuscript. X.Z. is the corresponding author responsible for correspondence and manuscript integrity.

Funding

No funding was received.

Data availability

The datasets generated and analyzed during the current study are not publicly available due to patient privacy concerns but are available from the corresponding author on reasonable request.

Declarations

Ethics approval and consent to participate

This study was conducted in accordance with the Declaration of Helsinki and approved by the Ethics Committee of the Second Hospital of Shandong University (Approval No. KYLL2024788). Informed consent was waived due to the retrospective nature of the study.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Clinical trial number

Not applicable.

Footnotes

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

1. Morgan E, Soerjomataram I, Rumgay H, Coleman HG, Thrift AP, Vignat J, Laversanne M, Ferlay J, Arnold M. The global landscape of esophageal squamous cell carcinoma and esophageal adenocarcinoma incidence and mortality in 2020 and projections to 2040: new estimates from GLOBOCAN 2020. Gastroenterology. 2022:649–e658642.
2. Watanabe M, Otake R, Kozuki R, Toihata T, Takahashi K, Okamura A, Imamura Y. Recent progress in multidisciplinary treatment for patients with esophageal cancer. Surg Today. 2020:12–20.
3. Guan X, Liu C, Zhou T, Ma Z, Zhang C, Wang B, Yao Y, Fan X, Li Z, Zhang Y. Survival and prognostic factors of patients with esophageal fistula in advanced esophageal squamous cell carcinoma. Biosci Rep. 2020.
4. Verstegen MHP, Bouwense SAW, van Workum F, ten Broek R, Siersema PD, Rovers M, Rosman C. Management of intrathoracic and cervical anastomotic leakage after esophagectomy for esophageal cancer: a systematic review. World J Emerg Surg. 2019.
5. Grantham JP, Hii A, Shenfine J. Preoperative risk modelling for oesophagectomy: A systematic review. World J Gastrointest Surg. 2023:450–70.
6. van Kooten RT, Bahadoer RR, ter, Buurkes de Vries B, Wouters MWJM, Tollenaar RAEM, Hartgrink HH, Putter H, Dikken JL. Conventional regression analysis and machine learning in prediction of anastomotic leakage and pulmonary complications after esophagogastric cancer surgery. J Surg Oncol. 2022:490–501.
7. Griffiths E. Predictors of anastomotic leak and conduit necrosis after oesophagectomy: results from the oesophago-gastric anastomosis audit (OGAA). Eur J Surg Oncol. 2024.
8. Hu J, Xu J, Li M, Jiang Z, Mao J, Feng L, Miao K, Li H, Chen J, Bai Z, et al. Identification and validation of an explainable prediction model of acute kidney injury with prognostic implications in critically ill children: a prospective multicenter cohort study. EClinicalMedicine. 2024;68:102409.
9. Bataille B, de Selle J, Moussot P-E, Marty P, Silva S, Cocquet P. Machine learning methods to improve bedside fluid responsiveness prediction in severe sepsis or septic shock: an observational study. Br J Anaesth. 2021:826–34.
10. Zhao Z, Cheng X, Sun X, Ma S, Feng H, Zhao L. Prediction model of anastomotic leakage among esophageal Cancer patients after receiving an esophagectomy: machine learning approach. JMIR Med Inf. 2021:e27110.
11. Collaris D, van Wijk JJ. StrategyAtlas: strategy analysis for machine learning interpretability. IEEE Trans Vis Comput Graph. 2023;29(6):2996–3008.
12. Lundberg SM, Nair B, Vavilala MS, Horibe M, Eisses MJ, Adams T, Liston DE, Low KW, Newman SF, Kim J. Explainable machine-learning predictions for the prevention of hypoxaemia during surgery. Nat Biomedical Eng. 2018;2(10):749–60.
13. Low DE, Alderson D, Cecconello I, Chang AC, Darling GE, D’Journo XB, Griffin SM, Hölscher AH, Hofstetter WL, Jobe BA, et al. International consensus on standardization of data collection for complications associated with esophagectomy. Ann Surg. 2015:286–94.
14. Moon SW, Kim JJ, Cho DG, Park JK. Early detection of complications: anastomotic leakage. J Thorac Disease. 2019:S805–11.
15. Fabbi M, Hagens ERC, van Berge Henegouwen MI, Gisbertz SS. Anastomotic leakage after esophagectomy for esophageal cancer: definitions, diagnostics, and treatment. Dis Esophagus. 2020.
16. Collins GS, Reitsma JB, Altman DG, Moons KGM. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD). Ann Intern Med. 2015:735–6.
17. Riley RD, Ensor J, Snell KIE, Harrell FE, Martin GP, Reitsma JB, Moons KGM, Collins G, van Smeden M. Calculating the sample size required for developing a clinical prediction model. BMJ. 2020:m441.
18. van Workum F, Verstegen MHP, Klarenbeek BR, Bouwense SAW, van Berge Henegouwen MI, Daams F, Gisbertz SS, Hannink G, Haveman JW, Heisterkamp J, et al. Intrathoracic vs cervical anastomosis after totally or hybrid minimally invasive esophagectomy for esophageal Cancer. JAMA Surg. 2021:601.
19. Gooszen JAH, Goense L, Gisbertz SS, Ruurda JP, van Hillegersberg R, van Berge Henegouwen MI. Intrathoracic versus cervical anastomosis and predictors of anastomotic leakage after oesophagectomy for cancer. Br J Surg. 2018:552–60.
20. Mboumi IW, Reddy S, Lidor AO. Complications after esophagectomy. Surg Clin North Am. 2019:501–10.
21. Noble F, Curtis N, Harris S, Kelly JJ, Bailey IS, Byrne JP, Underwood TJ. Risk assessment using a novel score to predict anastomotic leak and major complications after oesophageal resection. J Gastrointest Surg. 2012;16(6):1083–95.
22. Zhang Q, Yuan KH, Wang L. Asymptotic bias of normal-distribution‐based maximum likelihood estimates of moderation effects with data missing at random. Br J Math Stat Psychol. 2019:334–54.
23. Meng Q. LightGBM: a highly efficient gradient boosting decision tree. In: Neural Information Processing Systems: 2017. 2017.
24. Yang XH, Liao HJ, Yu Sun P, Ma J, Wang B, He Y, Xue LG, Su LM, Wang BJ. MCD-LightGBM system for intelligent analyzing heterogeneous clinical drug therapeutic effects. IEEE J Biomed Health Inf. 2024, Pp.
25. Yang X, Wuchty S, Liang Z, Ji L, Wang B, Zhu J, Zhang Z, Dong Y. Multi-modal features-based human-herpesvirus protein-protein interaction prediction by using LightGBM. Brief Bioinform. 2024;25(2).
26. Jiang D, Liu XB, Xing WQ, Chen PN, Feng SK, Liu JX, Sun HB. Impact of nasogastric decompression on gastric tube size after McKeown minimally invasive esophagectomy: a retrospective controlled cohort study. J Gastrointest Surg. 2022;26(12):2585–7.
27. Lu Y, Ren Z. Clinical application of Gastrointestinal decompression in anastomotic fistula after McKeown esophagectomy for esophageal cancer. Med (Baltim). 2022;101(29):e29831.
28. Hayashi M, Kawakubo H, Shoji Y, Mayanagi S, Nakamura R, Suda K, Wada N, Takeuchi H, Kitagawa Y. Analysis of the effect of early versus conventional nasogastric tube removal on postoperative complications after transthoracic esophagectomy: A Single-Center, randomized controlled trial. World J Surg. 2019;43(2):580–9.
29. Chen XF, Lin JP, Zhou H, Kang BZ, Nayak R, Gao L, Jiang SS, Wang F. The relationship between the collagen score at the anastomotic site of esophageal squamous cell carcinoma and anastomotic leakage. J Thorac Dis. 2024;16(7):4515–24.
30. Wang YJ, Xie XF, He YQ, Bao T, He XD, Li KK, Guo W. Impact of perioperative decreased serum albumin level on anastomotic leakage in esophageal squamous cell carcinoma patients treated with neoadjuvant chemotherapy followed by minimally invasive esophagectomy. BMC Cancer. 2023;23(1):1212.
31. Shiomi A, Ito M, Maeda K, Kinugasa Y, Ota M, Yamaue H, Shiozawa M, Horie H, Kuriu Y, Saito N. Effects of a diverting stoma on symptomatic anastomotic leakage after low anterior resection for rectal cancer: a propensity score matching analysis of 1,014 consecutive patients. J Am Coll Surg. 2015;220(2):186–94.
32. Van Daele E, Vanommeslaeghe H, Decostere F, Beckers Perletti L, Beel E, Van Nieuwenhove Y, Ceelen W, Pattyn P. Systemic inflammatory response and the noble and Underwood (NUn) score as early predictors of anastomotic leakage after esophageal reconstructive surgery. J Clin Med. 2024;13(3).
33. Bundred J, Hollis AC, Hodson J, Hallissey MT, Whiting JL, Griffiths EA. Validation of the NUn score as a predictor of anastomotic leak and major complications after esophagectomy. Dis Esophagus. 2019.
34. Paireder M, Jomrich G, Asari R, Kristo I, Gleiss A, Preusser M, Schoppmann SF. External validation of the NUn score for predicting anastomotic leakage after oesophageal resection. Sci Rep. 2017;7(1):9725.

Associated Data

Supplementary Materials

Supplementary Material 1 (PDF, 1,003.5 KB)

Data Availability Statement

The datasets generated and analyzed during the current study are not publicly available due to patient privacy concerns but are available from the corresponding author on reasonable request.
