Differential Performance of Machine Learning Models in Prediction of Procedure Specific Outcomes

Kevin A Chen; Matthew E Berginski; Chirag S Desai; Jose G Guillem; Jonathan Stem; Shawn M Gomez Eng; Muneera R Kapadia

doi:10.1007/s11605-022-05332-x

Author manuscript; available in PMC: 2023 Aug 1.

Published in final edited form as: J Gastrointest Surg. 2022 May 4;26(8):1732–1742. doi: 10.1007/s11605-022-05332-x


PMCID: PMC9444966 NIHMSID: NIHMS1813819 PMID: 35508684

Abstract

Background

Procedure-specific complications can have devastating consequences. Machine learning-based tools have the potential to outperform traditional statistical modeling in predicting their risk and guiding decision-making. We sought to develop and compare deep neural network (NN) models, a type of machine learning, to logistic regression (LR) for predicting anastomotic leak after colectomy, bile leak after hepatectomy, and pancreatic fistula after pancreaticoduodenectomy (PD).

Methods

The colectomy, hepatectomy, and PD National Surgical Quality Improvement Program (NSQIP) databases were analyzed. Each dataset was split into training, validation, and testing sets in a 60/20/20 ratio, with 5-fold cross-validation. Models were created using NN and LR for each outcome. Models were evaluated primarily with area under the receiver operating characteristic curve (AUROC).

Results

197,488 patients were included for colectomy, 25,403 for hepatectomy, and 23,233 for PD. For anastomotic leak, AUROC for NN was 0.676 (95% CI 0.666–0.687), compared with 0.633 (95% CI 0.620–0.647) for LR. For bile leak, AUROC for NN was 0.750 (95% CI 0.739–0.761), compared with 0.722 (95% CI 0.698–0.746) for LR. For pancreatic fistula, AUROC for NN was 0.746 (95% CI 0.733–0.760), compared with 0.713 (95% CI 0.703–0.723) for LR. Variables related to intra-operative information, such as surgical approach, biliary reconstruction, and pancreatic gland texture, were highly important for model predictions.

Discussion

Machine learning showed a marginal advantage over traditional statistical techniques in predicting procedure-specific outcomes. However, models that included intra-operative information performed better than those that did not, suggesting that NSQIP procedure-targeted datasets may be strengthened by including relevant intra-operative information.

Keywords: anastomotic leak, pancreatic fistula, hepatectomy, artificial intelligence, machine learning

Introduction

Procedure-specific complications can have devastating consequences. For example, anastomotic leak after colectomy is associated with increased morbidity, length of stay, re-admissions, and mortality, as well as local recurrence and cancer-specific mortality for oncologic surgeries.^1–3 Predictive models can help estimate a patient’s specific risk for post-operative complications, guide peri-operative decision-making such as ostomy placement or early drain removal, and perform risk adjustment for comparing post-operative outcomes.

Prior predictive models, such as the American College of Surgeons (ACS) Surgical Risk Calculator, provide accurate estimates of overall mortality and morbidity.⁴ However, this model, and others which are based on the National Surgical Quality Improvement Program (NSQIP) dataset, fall short in their ability to predict procedure-specific outcomes.^5–7

Machine learning, a branch of artificial intelligence (AI), uses computer algorithms that identify patterns within data without explicit instructions and has the potential to identify subtle, non-linear patterns. Machine learning has been successfully applied to the prediction of post-operative outcomes, but previous projects have focused on broader outcomes, such as overall morbidity and mortality, rather than procedure-specific ones.^8,9 We hypothesized that machine learning could be helpful in the prediction of procedure-specific outcomes. This study sought to develop machine learning models for predicting three procedure-specific outcomes: anastomotic leak following colectomy, bile leak following hepatectomy, and pancreatic fistula following pancreaticoduodenectomy (PD). We also sought to compare the machine learning models with logistic regression.

Materials and Methods

Data Source

We used the colectomy, hepatectomy, and pancreatectomy procedure-targeted datasets from the ACS National Surgical Quality Improvement Program (NSQIP) database. All available years for colectomy (2012–2019), hepatectomy (2014–2019), and pancreatectomy (2014–2019) were included. Patients missing primary outcome data were excluded. Patients undergoing colectomy who underwent concurrent ostomy placement were also excluded. From the pancreatectomy dataset, patients undergoing procedures other than PD were excluded. This study was determined to be exempt from institutional review board approval.

Outcomes

For each procedure type, we sought to predict a procedure-specific outcome: anastomotic leak for colectomy, bile leak for hepatectomy, and pancreatic fistula for PD. Anastomotic leak included leaks requiring treatment with antibiotics, percutaneous drainage, or reoperation. Bile leak included leaks requiring percutaneous drainage or reoperation. Pancreatic fistula included grade B or C fistulas for 2018–2019 (fistula grading was implemented in NSQIP in 2018). For 2014–2017, clinically relevant pancreatic fistulas were defined according to methods described by Kantor et al.^6,10

Predictive Models

Each dataset was split into training, validation, and testing sets in 60%, 20%, and 20% ratios, respectively, using randomly selected data from all years. The training set was used for model development, the validation set was used for model adjustment and to monitor overfitting, and the test set was reserved for evaluation of model performance after completion of development. Cross-validation was used to create 5 different train/test splits to verify model consistency. We selected a deep neural network (NN) as our machine learning approach, as it has previously been shown to outperform tree-based methods (such as random forest) in prediction of post-operative outcomes from the NSQIP database.^8,9,11 This deep learning approach uses layers of functions, each containing model weights, to transform input data into output data representing predictions.¹² Dropout (random removal of functions within layers) and early stopping (stopping training when validation set accuracy decreases) were used to reduce overfitting.¹³ Logistic regression (LR) models were also created for comparison. LR was implemented with no regularization and no variable elimination techniques to approximate a standard implementation. Models were implemented in Python (version 3.9) using the pandas,^14,15 scikit-learn,¹⁶ and Keras¹⁷ libraries.
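The 60/20/20 split with five cross-validation splits described above could be sketched as follows. This is a minimal standard-library illustration, not the authors' code: the function name `five_fold_splits`, the fold-rotation scheme (one fold for testing, the next for validation, the remaining three for training), and the seed are all assumptions made for the example.

```python
import random


def five_fold_splits(n_rows, n_folds=5, seed=0):
    """Return n_folds (train, val, test) index triples in a 60/20/20 ratio.

    Hypothetical helper: rows are shuffled once, dealt into n_folds folds,
    and each fold takes one turn as the test set.
    """
    rng = random.Random(seed)
    indices = list(range(n_rows))
    rng.shuffle(indices)
    # Deal the shuffled indices into folds of (nearly) equal size.
    folds = [indices[i::n_folds] for i in range(n_folds)]

    splits = []
    for i in range(n_folds):
        val_fold = (i + 1) % n_folds
        test = folds[i]
        val = folds[val_fold]
        # The remaining three folds form the training set.
        train = [idx for j, fold in enumerate(folds)
                 if j not in (i, val_fold) for idx in fold]
        splits.append((train, val, test))
    return splits


# Example with the colectomy cohort size reported in the Results.
colectomy_splits = five_fold_splits(197_488)
for train, val, test in colectomy_splits:
    print(len(train), len(val), len(test))
```

Rotating folds keeps every patient in exactly one test set across the five splits, which is one common way to reconcile a fixed 60/20/20 ratio with 5-fold cross-validation; repeated random reshuffles would be an equally plausible reading of the text.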

Input data included all available peri-operative variables within the core NSQIP database and procedure-targeted variables that would be known prior to the occurrence of the outcome of interest (Tables 1 and 2 and Supplementary Table 1). Missing values in the datasets were handled by imputation, a standard data pre-processing step. Missing categorical values were imputed as “unknown” and missing continuous values as the median.^9,13,18 Further details are available in the Supplementary Appendix and code is available at https://github.com/gomezlab/nsqip_procedurespecific.
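The imputation rule described above (median for continuous variables, an explicit “unknown” category for categorical ones) can be illustrated with a short sketch. The helper name `impute`, the column names, and the toy rows are hypothetical and chosen only for illustration; the authors' repository should be consulted for the actual pre-processing.

```python
from statistics import median


def impute(rows, continuous, categorical):
    """Return copies of rows with missing values (None) filled in.

    Continuous columns receive the column median of the observed values;
    categorical columns receive the literal category "unknown".
    """
    medians = {
        col: median(r[col] for r in rows if r[col] is not None)
        for col in continuous
    }
    imputed = []
    for r in rows:
        filled = dict(r)  # copy so the input rows are left untouched
        for col in continuous:
            if filled[col] is None:
                filled[col] = medians[col]
        for col in categorical:
            if filled[col] is None:
                filled[col] = "unknown"
        imputed.append(filled)
    return imputed


# Toy data: column names are illustrative, not NSQIP field names.
rows = [
    {"albumin": 3.8, "gland_texture": "soft"},
    {"albumin": None, "gland_texture": "hard"},
    {"albumin": 4.2, "gland_texture": None},
    {"albumin": 3.5, "gland_texture": "soft"},
]
clean = impute(rows, continuous=["albumin"], categorical=["gland_texture"])
print(clean)
```

A common precaution, not stated in the text, is to compute the medians on the training set only and reuse them for the validation and test sets so that no information leaks across splits.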

Table 1.

Key Input Variables by Procedure

		Colectomy	Pancreatectomy	Hepatectomy
Age, mean (SD)		62.0 (14.9)	63.4 (12.8)	59.2 (13.7)
Sex, n (%)	Female	96357 (53.0)	19583 (49.8)	12681 (50.0)
	Male	85485 (47.0)	19711 (50.2)	12656 (50.0)
	Non-binary	3 (0.0)	0 (0.0)	0 (0.0)
Race, n (%)	White	133433 (73.4)	29199 (74.3)	16084 (63.5)
	Black or African American	16916 (9.3)	3327 (8.5)	2059 (8.1)
	Asian	5571 (3.1)	1629 (4.1)	1717 (6.8)
	American Indian or Alaska Native	776 (0.4)	116 (0.3)	95 (0.4)
	Native Hawaiian or Pacific Islander	412 (0.2)	70 (0.2)	63 (0.2)
	Unknown	24737 (13.6)	4953 (12.6)	5319 (21.0)
Hispanic ethnicity n (%)	Yes	9055 (5.6)	1977 (5.6)	1378 (6.7)
BMI, mean (SD)		28.7 (6.7)	27.9 (6.1)	28.5 (6.3)
ASA Classification	1	4204 (2.3)	260 (0.7)	350 (1.4)
	2	77345 (42.5)	9354 (23.8)	6310 (24.9)
	3	87662 (48.2)	27070 (68.9)	16800 (66.3)
	4	11835 (6.5)	2552 (6.5)	1824 (7.2)
	5	613 (0.3)	24 (0.1)	11 (0.0)
	Unknown	186 (0.1)	34 (0.1)	42 (0.2)
Functional Status	Independent	176926 (97.7)	38924 (99.2)	25115 (99.3)
	Partially Dependent	3553 (2.0)	293 (0.7)	158 (0.6)
	Totally Dependent	679 (0.4)	25 (0.1)	15 (0.1)
Dyspnea	At rest	741 (0.4)	60 (0.2)	57 (0.2)
	With moderate exertion	11434 (6.3)	2058 (5.2)	1337 (5.3)
	No	169670 (93.3)	37176 (94.6)	23943 (94.5)
Diabetes	Requiring insulin	9118 (5.0)	4839 (12.3)	1555 (6.1)
	Not requiring insulin	18943 (10.4)	5293 (13.5)	2938 (11.6)
	No diabetes	153784 (84.6)	29162 (74.2)	20844 (82.3)
Hypertension		87817 (48.3)	20445 (52.0)	11589 (45.7)
Heart failure		1949 (1.1)	157 (0.4)	93 (0.4)
Ascites		996 (0.5)	114 (0.3)	131 (0.5)
COPD		9016 (5.0)	1586 (4.0)	900 (3.6)
Renal failure		697 (0.4)	30 (0.1)	22 (0.1)
Dialysis		1598 (0.9)	156 (0.4)	81 (0.3)
Chronic steroid use		13313 (7.3)	1220 (3.1)	817 (3.2)
Smoking		28987 (15.9)	6703 (17.1)	3851 (15.2)
Bleeding disorder		6593 (3.6)	1206 (3.1)	842 (3.3)
Weight loss (>10%)		7279 (4.0)	4659 (11.9)	975 (3.8)
Pre-operative transfusion		4213 (2.3)	319 (0.8)	147 (0.6)
Wound classification	Clean	1823 (1.0)	2621 (6.7)	3647 (14.4)
	Clean/contaminated	139733 (76.8)	31308 (79.7)	20032 (79.1)
	Contaminated	22625 (12.4)	4317 (11.0)	1129 (4.5)
	Dirty/Infected	17664 (9.7)	1048 (2.7)	529 (2.1)
Transfer status	Not transferred	172906 (95.1)	38211 (97.3)	24881 (98.2)
	From acute care hospital	3632 (2.0)	781 (2.0)	277 (1.1)
	From nursing home	1502 (0.8)	75 (0.2)	38 (0.2)
	From outside ED	3148 (1.7)	169 (0.4)	110 (0.4)
	From other	544 (0.3)	51 (0.1)	26 (0.1)
Sodium, mean (SD)		139.1 (3.1)	139.0 (3.1)	139.3 (2.8)
Blood urea nitrogen, mean (SD)		15.5 (9.5)	15.6 (7.4)	15.1 (6.9)
Creatinine, mean (SD)		1.0 (0.7)	0.9 (0.5)	0.9 (0.5)
Albumin, mean (SD)		3.8 (0.6)	3.9 (0.6)	4.0 (0.5)
White blood cell count, mean (SD)		7.9 (3.6)	7.3 (2.8)	6.9 (3.1)
Hematocrit, mean (SD)		38.3 (5.9)	38.3 (5.2)	39.4 (5.0)
Platelet count, mean (SD)		268.0 (95.3)	250.0 (91.6)	236.2 (90.8)
Operative time, mean (SD)		173.0 (88.2)	371.9 (128.5)	239.9 (121.7)


Data are n (%) unless otherwise specified. BMI = Body Mass Index. ASA = American Society of Anesthesiologists. COPD = Chronic obstructive pulmonary disease. PATOS = present at time of surgery

Table 2.

Procedure-Targeted Variables for Colectomy, Hepatectomy, and Pancreatectomy

	Colectomy
CPT, n (%)	Colectomy	28472 (15.7)
	Colectomy with coloproctostomy	14051 (7.7)
	Colectomy with abdominal and transanal approach	312 (0.2)
	Colectomy with ileocolostomy	23458 (12.9)
	Laparoscopic colectomy	48250 (26.5)
	Laparoscopic colectomy with ileocolostomy	33206 (18.3)
	Laparoscopic colectomy with coloproctostomy	34096 (18.8)
Indication, n (%)	Acute diverticulitis	11348 (5.8)
	Bleeding	1244 (0.6)
	Chronic diverticular disease	30920 (15.7)
	Colon cancer	75478 (38.4)
	Colon cancer w/ obstruction	8433 (4.3)
	Crohn’s Disease	11641 (5.9)
	Enterocolitis (e.g. C. Difficile)	395 (0.2)
	Non-malignant polyp	18981 (9.7)
	Other	31764 (16.1)
	Ulcerative colitis	846 (0.4)
	Volvulus	5609 (2.9)
Emergent indication, n (%)	Not emergent	178150 (90.4)
	Bleeding	1121 (0.6)
	Obstruction	6904 (3.5)
	Other	2256 (1.1)
	Perforation	6072 (3.1)
	Toxic colitis	948 (0.5)
Pre-operative steroid use, n (%)		10459 (5.4)
Mechanical bowel prep, n (%)		109434 (63.9)
Antibiotic bowel prep, n (%)		81762 (47.1)
Pre-operative chemotherapy, n (%)		7485 (3.8)
Approach, n (%)	Open (planned)	55977 (28.4)
	Laparoscopic	61348 (31.2)
	Laparoscopic w/ open assist	46797 (23.8)
	Laparoscopic w/ unplanned conversion to open	13803 (7.0)
	Robotic	11531 (5.9)
	Robotic w/ open assist	6283 (3.2)
	Robotic w/ unplanned conversion to open	969 (0.5)
	Other	127 (0.1)
	Hepatectomy
CPT code, n (%)	Hepatectomy, partial lobectomy	17073 (67.4)
	Hepatectomy, trisegmentectomy	2050 (8.1)
	Hepatectomy, total left lobectomy	2274 (9.0)
	Hepatectomy, total right lobectomy	3940 (15.6)
Indication, n (%)	Colorectal metastasis	8403 (33.1)
	Other metastasis	1503 (6.0)
	Hepatocellular carcinoma	4575 (18.0)
	Cholangiocarcinoma	2233 (8.8)
	Hepatic adenoma	1005 (4.0)
	Hemangioma	802 (3.2)
	Hepatic cyst	722 (2.8)
	Gallbladder cancer	655 (2.6)
	Focal nodular hyperplasia	474 (1.9)
	Biliary cyst	416 (1.6)
	Hepatic abscess	190 (0.7)
	Other	4425 (17.4)
Biliary stent placed, n (%)	Yes, endoscopic	948 (3.8)
	Yes, percutaneous	216 (0.9)
	Yes, other/unknown	102 (0.4)
	No	23943 (95.0)
	Unknown	194 (0.8)
Drain placed, n (%)		11229 (44.3)
Neo-adjuvant systemic chemotherapy, n (%)		6566 (25.8)
Portal vein embolization, n (%)		877 (3.5)
Pre-operative intra-arterial infusion, n (%)		222 (0.9)
Pre-operative ablation, n (%)		169 (0.7)
Viral hepatitis, n (%)	Hepatitis B	1124 (4.9)
	Hepatitis B and C	133 (0.6)
	Hepatitis C	1670 (7.3)
	None	19677 (86.4)
	Other	158 (0.7)
Approach, n (%)	MIS	5777 (22.8)
	MIS w/ conversion	999 (3.9)
	Open (planned)	18616 (73.3)
Liver texture, n (%)	Cirrhotic	2461 (9.7)
	Congested	468 (1.8)
	Fatty	3229 (12.7)
	Fibrosis	256 (1.0)
	Normal	7030 (27.7)
	Unknown	11959 (47.1)
Number of concurrent partial resections, n (%)	0	12688 (50.7)
	1	6822 (27.3)
	2	3011 (12.0)
	3 or more	2439 (9.8)
	Pancreatectomy
CPT, n (%)	Pancreaticoduodenectomy	14679 (63.2)
	Pylorus-sparing pancreaticoduodenectomy	8554 (36.8)
Indication, n (%)	Pancreatic adenocarcinoma	12931 (55.7)
	Ampullary/duodenal adenocarcinoma	3627 (15.6)
	Biliary adenocarcinoma	1761 (7.6)
	Neuroendocrine tumor	1247 (5.5)
	Benign neoplasm of pancreas	945 (4.1)
	Cystic lesion	1101 (4.7)
	Chronic pancreatitis	865 (3.7)
	Other	756 (3.3)
Jaundice, n (%)		10102 (43.8)
Pre-operative biliary stent, n (%)	Endoscopic stent	10950 (49.1)
	No stent at time of surgery	10229 (45.9)
	Percutaneous stent	696 (3.1)
	Stent of other or unknown type	405 (1.8)
Pre-operative chemotherapy, n (%)		4857 (21.0)
Pre-operative radiation therapy, n (%)		1863 (8.1)
Approach, n (%)	Minimally invasive (MIS)	1863 (8.1)
	Open (planned)	21172 (91.1)
Pancreatic duct size, n (%)	3–6 mm	9780 (42.1)
	<3 mm	5748 (24.7)
	>6 mm	3031 (13.0)
	Unknown	4674 (20.1)
Pancreas gland texture, n (%)	Hard	7517 (32.4)
	Intermediate	2117 (9.1)
	Soft	8143 (35.0)
	Unknown	5456 (23.5)
Type of reconstruction, n (%)	Not performed	739 (3.3)
	Pancreaticogastrostomy	511 (2.3)
	Pancreaticojejunal duct-to-mucosal	19499 (86.0)
	Pancreaticojejunal invagination	1915 (8.4)
Drains placed, n (%)	Yes	20649 (89.0)
Vascular resection, n (%)	Not performed	18950 (82.4)
	Artery	435 (1.9)
	Vein	2860 (12.4)
	Vein and artery	766 (3.3)
Drain amylase (POD1), mean (SD)		3475.8 (10299.8)
Incision type, n (%)	Subcostal type	1916 (8.2)
	Upper midline	9179 (39.5)
	Other	177 (0.8)
	Unknown	11961 (51.5)
Gastrojejunostomy, n (%)	Antecolic	3832 (16.5)
	Retrocolic	1611 (6.9)
	Not performed	192 (0.8)
	Unknown	17598 (75.7)
Drain location, n (%)	Biliary anastomosis	157 (0.7)
	Pancreatic & Biliary Anastomosis	3946 (17.0)
	Pancreatic anastomosis	964 (4.1)
	Pancreatic parenchyma	119 (0.5)
	Type(s) cannot be determined	536 (2.3)
	Unknown	17511 (75.4)
Drain system type, n (%)	Closed	10599 (45.6)
	Closed and Open	122 (0.5)
	Open	96 (0.4)
	Unknown	12416 (53.4)
Wound protector, n (%)	Yes	4131 (17.8)
	No	11334 (48.8)
	Unknown	7768 (33.4)
Pre-incision antibiotic, n (%)	1st generation cephalosporin	5302 (22.8)
	2nd or 3rd generation cephalosporin	4493 (19.3)
	Broad spectrum	6125 (26.4)
	Other	552 (2.4)
	Unknown	6761 (29.1)


Evaluation

Models were evaluated primarily with area under the receiver operating characteristic curve (AUROC). The receiver operating characteristic curve plots the true positive rate against the false positive rate, and the AUROC summarizes the model’s ability to distinguish positive cases from negative cases. AUROC ranges from 0.5 (random guessing) to 1 (perfect classification). AUROCs were compared between models using the DeLong test, with significance set at p < 0.05.¹⁹ The area under the precision-recall curve (AUPRC), which assesses a model’s ability to identify all positive cases without identifying false positives, was also calculated for each model. A random classifier will have an AUPRC equal to the rate of the positive class (e.g., the rate of anastomotic leak) and a perfect classifier will have an AUPRC of 1.0. The relative importance of procedure-specific input variables was estimated using Shapley additive explanations (SHAP) for NN models and odds ratios for LR models.²⁰
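To make the two metrics concrete, the sketch below computes AUROC as the probability that a randomly chosen positive case is scored above a randomly chosen negative case, and AUPRC as step-wise average precision. It is an illustrative standard-library implementation on toy data, assumed for this example; it is not the evaluation code used in the study, and the function names are hypothetical.

```python
def auroc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve.

    Precision is accumulated at the rank of each positive case after
    sorting by descending score (tied scores keep their input order).
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    n_pos = sum(labels)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / n_pos


# Toy example: 3 positives among 8 cases.
labels = [0, 0, 1, 0, 1, 0, 0, 1]
scores = [0.1, 0.2, 0.8, 0.5, 0.4, 0.35, 0.05, 0.9]

print("AUROC:", auroc(labels, scores))
print("AUPRC:", average_precision(labels, scores))
print("AUPRC of a random classifier (prevalence):", sum(labels) / len(labels))
```

The pairwise AUROC loop is quadratic in the number of cases and is meant only to show the definition; for cohorts of the size analyzed here a rank-based implementation would be preferable.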

Results

Colectomy

The colectomy dataset included 257,913 patients. After application of exclusion criteria, 197,488 patients remained. 6,012 (3.05%) patients experienced an anastomotic leak. After splitting, 118,493 patients were included in the training group, 39,497 patients were included in the validation group, and 39,498 patients were included in the test group. Further input variable characteristics for all groups are described in Table 1. On the test set, NN obtained an AUROC of 0.676 (95% CI 0.666–0.687) and an AUPRC of 0.104 (95% CI 0.092–0.115). LR obtained an AUROC of 0.633 (95% CI 0.620–0.647) and an AUPRC of 0.056 (95% CI 0.051–0.061). Receiver operating characteristic and precision-recall curves for anastomotic leak are shown in Figures 1a and 2a. Comparison using the DeLong test showed a significant difference between the AUROCs of NN and LR with p < 0.001. Of the variables within the procedure-targeted dataset, approach, mechanical bowel prep, and antibiotic bowel prep contributed most to the NN model output, compared with chemotherapy, pre-operative steroid use, and antibiotic bowel prep for the LR model (Table 4).

Figure 1. Receiver Operating Characteristic Curves for Procedure-Specific Outcomes. NN = neural network, LR = logistic regression, a = Anastomotic leak, b = Bile leak, c = Pancreatic fistula

Figure 2. Precision-Recall Curves for Procedure-Specific Outcomes. NN = neural network, LR = logistic regression, a = Anastomotic leak, b = Bile leak, c = Pancreatic fistula

Table 4.

Relative Importance of Input Variables Compared between Neural Network and Logistic Regression Using SHAP Values and Odds Ratios

Anastomotic leak
NN variable	SHAP value	LR variable	Odds ratio*
Approach	0.016	Chemotherapy	1.32
Mechanical bowel prep	0.016	Steroid use	1.25
Antibiotic bowel prep	0.014	Antibiotic bowel prep	0.81
Emergent indication	0.011	Mechanical bowel prep	0.86
Steroid use	0.010	Approach	1.14
Chemotherapy	0.009	Emergent indication	0.94
Indication	0.009	Indication	1.01

Bile leak
NN variable	SHAP value	LR variable	Odds ratio*
Use of drain	0.034	Biliary reconstruction	1.88
Biliary reconstruction	0.029	Pringle maneuver	1.42
Approach	0.017	Approach	1.37
Biliary stent	0.016	Neoadjuvant chemo-embolization	1.37
Pringle maneuver	0.015	Use of drain	1.37
# of concurrent resections	0.011	Neoadjuvant chemo-infusion	0.73
Concurrent ablation	0.010	Biliary stent	1.22
Viral hepatitis	0.009	Neoadjuvant ablation	1.19
Neoadjuvant therapy	0.009	Neoadjuvant chemotherapy	1.17
Neoadjuvant chemo-embolization	0.008	Viral hepatitis	1.13

Pancreatic fistula
NN variable	SHAP value	LR variable	Odds ratio*
Gland texture	0.039	Drains placed	1.27
Indication	0.036	Gland texture	1.25
Drain amylase (POD1)	0.027	Chemotherapy	0.89
Reconstruction	0.010	Reconstruction	1.09
Duct size	0.008	Indication	0.92
Vascular resection	0.006	Radiation therapy	0.93
Biliary stent	0.006	Vascular resection	0.94
Jaundice	0.006	Duct size	0.94
Radiation therapy	0.006	Antibiotic	0.96
Chemotherapy	0.005	Jaundice	0.97

* Odds ratios are sorted by distance from 1 (the null value).
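The ordering rule in the Table 4 footnote, sorting odds ratios by their distance from the null value of 1, can be reproduced with a few lines. The sketch uses the anastomotic leak LR odds ratios reported in Table 4; the function name `rank_by_distance_from_null` is a hypothetical helper introduced for this example.

```python
# LR odds ratios for anastomotic leak, as reported in Table 4.
odds_ratios = {
    "Approach": 1.14,
    "Antibiotic bowel prep": 0.81,
    "Chemotherapy": 1.32,
    "Emergent indication": 0.94,
    "Indication": 1.01,
    "Mechanical bowel prep": 0.86,
    "Steroid use": 1.25,
}


def rank_by_distance_from_null(ratios, null_value=1.0):
    """Order variable names by |odds ratio - null|, largest first."""
    return sorted(
        ratios,
        key=lambda name: abs(ratios[name] - null_value),
        reverse=True,
    )


ranking = rank_by_distance_from_null(odds_ratios)
for name in ranking:
    print(f"{name}: OR {odds_ratios[name]:.2f}")
```

Because odds ratios are multiplicative, the absolute value of the log odds ratio is a symmetric alternative ranking key; the footnote describes plain distance from 1, so that is what the sketch uses.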

Hepatectomy

The hepatectomy dataset included 25,595 patients. After application of exclusion criteria, 25,403 patients remained. 966 (3.8%) patients experienced a bile leak. After splitting, 15,242 patients were included in the training group, 5,080 patients were included in the validation group, and 5,081 patients were included in the test group. On the test set, NN obtained an AUROC of 0.750 (95% CI 0.739–0.761) and an AUPRC of 0.134 (95% CI 0.115–0.153). LR obtained an AUROC of 0.722 (95% CI 0.698–0.746) and an AUPRC of 0.114 (95% CI 0.090–0.139). Receiver operating characteristic and precision-recall curves for bile leak are shown in Figures 1b and 2b. Comparison using the DeLong test showed a significant difference between the AUROCs of NN and LR with p = 0.003. Of the variables within the procedure-targeted dataset, placement of drain intra-operatively, biliary reconstruction, surgical approach, biliary stent placement, use of Pringle maneuver, and number of concurrent resections contributed most to the NN model, compared with biliary reconstruction, Pringle maneuver, surgical approach, neoadjuvant chemo-embolization, placement of drain, and neoadjuvant chemo-infusion for the LR model (Table 4).

Pancreaticoduodenectomy

The PD dataset included 23,437 patients. After application of exclusion criteria, 23,233 patients remained. 3,346 (14.4%) patients experienced a pancreatic fistula. After splitting, 13,940 patients were included in the training group, 4,647 patients were included in the validation group, and 4,646 patients were included in the test group. On the test set, NN obtained an AUROC of 0.746 (95% CI 0.733–0.760) and an AUPRC of 0.346 (95% CI 0.327–0.365). LR obtained an AUROC of 0.713 (95% CI 0.703–0.723) and an AUPRC of 0.294 (95% CI 0.281–0.307). Receiver operating characteristic and precision-recall curves for pancreatic fistula are shown in Figures 1c and 2c. Comparison using the DeLong test showed a significant difference between the AUROCs of NN and LR with p < 0.001. Of the variables within the procedure-targeted dataset, pancreatic gland texture, indication, drain amylase on post-operative day 1, type of reconstruction, and duct size contributed most to the NN model output, compared with placement of drain intra-operatively, gland texture, pre-operative chemotherapy, type of reconstruction, and indication for the LR model (Table 4).
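Because a random classifier's AUPRC equals the outcome rate, the reported AUPRC values are easier to compare across procedures when expressed relative to that baseline. The sketch below does this arithmetic with the event counts, cohort sizes, and AUPRC point estimates reported in this section. It assumes the overall cohort rate approximates the test-set rate, and the derived ratios are illustrative rather than results reported by the authors; the helper name `auprc_lift` is hypothetical.

```python
# Event counts, cohort sizes, and AUPRC point estimates as reported in the Results.
reported = {
    "anastomotic leak": {"events": 6012, "n": 197488, "auprc_nn": 0.104, "auprc_lr": 0.056},
    "bile leak": {"events": 966, "n": 25403, "auprc_nn": 0.134, "auprc_lr": 0.114},
    "pancreatic fistula": {"events": 3346, "n": 23233, "auprc_nn": 0.346, "auprc_lr": 0.294},
}


def auprc_lift(row):
    """Express each model's AUPRC as a multiple of the random-classifier baseline."""
    baseline = row["events"] / row["n"]  # outcome rate = AUPRC of random guessing
    return {
        "baseline": baseline,
        "nn_lift": row["auprc_nn"] / baseline,
        "lr_lift": row["auprc_lr"] / baseline,
    }


lifts = {outcome: auprc_lift(row) for outcome, row in reported.items()}
for outcome, lift in lifts.items():
    print(
        f"{outcome}: baseline {lift['baseline']:.3f}, "
        f"NN {lift['nn_lift']:.1f}x, LR {lift['lr_lift']:.1f}x"
    )
```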

Discussion

This study developed and compared machine learning and logistic regression models which predict procedure-specific complications after colectomy, hepatectomy, and PD. Overall, the NN showed marginal improvement over LR in terms of predictive accuracy. There was a marked difference between models’ predictive ability for various outcomes, with anastomotic leak after colectomy less accurately predicted compared with bile leak after hepatectomy and pancreatic fistula after PD for both the NN and LR approaches. Evaluation of variable importance using SHAP values and odds ratios showed that both models emphasized intra-operative variables as risk factors. Notably, the colectomy procedure-targeted dataset includes much less intra-operative information compared with hepatectomy and PD.

While machine learning applied to the entire NSQIP dataset predicts general outcomes with high accuracy (AUROC 0.88–0.95) and significantly outperforms the ACS risk calculator,^4,8 machine learning for predicting procedure-specific complications in the current project does not show as clear an advantage over LR. For anastomotic leak, previous models developed using LR and the NSQIP dataset obtained AUROCs of 0.65–0.66, similar to our machine learning models, although those models significantly outperform the ACS Surgical Risk Calculator (AUROC 0.58).^5,21,22 Models developed using LR on single-institution and regional datasets, which also incorporate more intra-operative information, have obtained higher AUROCs (0.73–0.82).^7,23 LR models created for bile leak and pancreatic fistula from non-NSQIP datasets achieved AUROCs of 0.65–0.79, similar to the results for our models.^24–30 One previous study applied machine learning methods to predict pancreatic fistula in a smaller, single-institution dataset of 1769 patients and obtained an AUROC of 0.74, also similar to our model.³¹

A particularly interesting finding from this study is that certain outcomes, in particular anastomotic leak after colectomy, are much more difficult to predict from the NSQIP dataset than bile leak and pancreatic fistula. This is likely because the NSQIP dataset includes far fewer intra-operative variables for colectomy than for hepatectomy and pancreatectomy. Tellingly, models for anastomotic leak based on non-NSQIP datasets that include relevant intra-operative information, such as number of staple fires, occurrence of intra-operative adverse events, and need for intra-operative transfusion, have improved accuracy (AUROC 0.73–0.82) that is more similar to our results for hepatectomy and PD.^7,23 This aligns with a body of literature showing a strong link between intra-operative performance and post-operative outcomes, indicating that the incorporation of intra-operative information is key to predicting procedure-specific outcomes.^31–34

This comparison has some limitations. First, use of NSQIP as training data introduces selection bias because only hospitals participating in the NSQIP program are included. In addition, predictions are limited to 30-day outcomes. For some variables, data may be missing because of the clinical scenario, and for those variables the assumptions made by imputation techniques may not be valid. The completeness of pancreatectomy variables has also improved over time, making earlier years less useful for model training. Second, this study is not an exhaustive analysis of every procedure-specific complication in NSQIP. Rather, it analyzes the abdominal surgical procedures with the most robust procedure-targeted datasets. Finally, while direct comparison of the absolute values of SHAP and odds ratios is not valid, their use for relative importance can provide insights into model decision-making.

Conclusion

In conclusion, our results show that machine learning has a marginal advantage over traditional statistical techniques in predicting procedure-specific outcomes based on the NSQIP dataset. However, models which include intra-operative variables performed better compared with those that did not, suggesting that NSQIP procedure-targeted datasets may be strengthened by the collection of relevant intra-operative information. The application of machine learning to datasets which include multi-modal data, such as real-time electronic health record information and assessments of intra-operative surgeon performance, represents a target of future research.

Supplementary Material

Supplementary Table 1


Table 3.

Area Under the Receiver Operating Characteristic and Precision-Recall Curves for Neural Network and Logistic Regression Models

	AUROC Mean	AUROC 95% CI	AUPRC Mean	AUPRC 95% CI
Anastomotic Leak - NN	0.68	0.67–0.69	0.10	0.09–0.12
Anastomotic Leak - LR	0.63	0.62–0.65	0.06	0.05–0.06
Bile Leak - NN	0.75	0.74–0.76	0.13	0.12–0.15
Bile Leak - LR	0.72	0.70–0.75	0.11	0.10–0.14
Pancreatic Fistula - NN	0.75	0.73–0.76	0.35	0.33–0.37
Pancreatic Fistula - LR	0.71	0.70–0.72	0.29	0.28–0.30


Acknowledgments

Grant support: This work was supported by funding from the National Institutes of Health (Program in Translational Medicine T32-CA244125 to UNC/KAC).

Footnotes

Conflicts of interest: None declared for each author

References

1.Midura EF, Hanseman D, Davis BR, et al. Risk factors and consequences of anastomotic leak after colectomy: A national analysis. In: Diseases of the Colon and Rectum. Vol 58. Lippincott Williams and Wilkins; 2015:333–338. doi: 10.1097/DCR.0000000000000249 [DOI] [PubMed] [Google Scholar]
2.Mirnezami A, Mirnezami R, Chandrakumaran K, Sasapu K, Sagar P, Finan P. Increased local recurrence and reduced survival from colorectal cancer following anastomotic leak: Systematic review and meta-analysis. Ann Surg. 2011;253(5):890–899. doi: 10.1097/SLA.0b013e3182128929 [DOI] [PubMed] [Google Scholar]
3.Romagnoni A, Jégou S, Van Steen K, et al. Comparative performances of machine learning methods for classifying Crohn Disease patients using genome-wide genotyping data. Sci Reports 2019 91. 2019;9(1):1–18. doi: 10.1038/s41598-019-46649-z [DOI] [PMC free article] [PubMed] [Google Scholar]
4.KY B, Y L, JL P, et al. Development and evaluation of the universal ACS NSQIP surgical risk calculator: a decision aid and informed consent tool for patients and surgeons. J Am Coll Surg. 2013;217(5). doi: 10.1016/J.JAMCOLLSURG.2013.07.385 [DOI] [PMC free article] [PubMed] [Google Scholar]
5.McKenna NP, Bews KA, Cima RR, Crowson CS, Habermann EB. Development of a Risk Score to Predict Anastomotic Leak After Left-Sided Colectomy: Which Patients Warrant Diversion? J Gastrointest Surg. 2020;24(1):132–143. doi: 10.1007/s11605-019-04293-y [DOI] [PMC free article] [PubMed] [Google Scholar]
6.Kantor O, Talamonti MS, Pitt HA, et al. Using the NSQIP Pancreatic Demonstration Project to Derive a Modified Fistula Risk Score for Preoperative Risk Stratification in Patients Undergoing Pancreaticoduodenectomy. J Am Coll Surg. 2017;224(5):816–825. doi: 10.1016/j.jamcollsurg.2017.01.054 [DOI] [PubMed] [Google Scholar]
7.T S, L C, AI K, et al. Validation of an online risk calculator for the prediction of anastomotic leak after colon cancer surgery and preliminary exploration of artificial intelligence-based analytics. Tech Coloproctol. 2017;21(11):869–877. doi: 10.1007/S10151-017-1701-1 [DOI] [PubMed] [Google Scholar]
8.Bertsimas D, Dunn J, Velmahos GC, Kaafarani HMA. Surgical Risk Is Not Linear: Derivation and Validation of a Novel, User-friendly, and Machine-learning-based Predictive OpTimal Trees in Emergency Surgery Risk (POTTER) Calculator. Ann Surg. 2018;268(4):574–583. doi: 10.1097/SLA.0000000000002956 [DOI] [PubMed] [Google Scholar]
9. Varadarajan KM, Muratoglu OK, Malchau H, et al. Assessing the utility of deep neural networks in predicting postoperative surgical complications: a retrospective study. Lancet Digit Health. 2021;3:471–485. doi: 10.1016/S2589-7500(21)00084-4
10. C B, G M, C D, et al. The 2016 update of the International Study Group (ISGPS) definition and grading of postoperative pancreatic fistula: 11 Years After. Surgery. 2017;161(3):584–591. doi: 10.1016/J.SURG.2016.11.014
11. Merath K, Hyer JM, Mehta R, et al. Use of Machine Learning for Prediction of Patient Risk of Postoperative Complications After Liver, Pancreatic, and Colorectal Surgery. J Gastrointest Surg. 2019;24(8):1843–1851. doi: 10.1007/S11605-019-04338-2
12. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521(7553):436–444. doi: 10.1038/nature14539
13. Géron A. Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems. O’Reilly Media; 2019.
14. McKinney W. Data Structures for Statistical Computing in Python. Proc 9th Python Sci Conf. Published online 2010:56–61. doi: 10.25080/MAJORA-92BF1922-00A
15. The pandas development team. pandas-dev/pandas: Pandas. Published online February 2020. doi: 10.5281/zenodo.3509134
16. Pedregosa F, Varoquaux G, Gramfort A, et al. Scikit-learn: Machine Learning in Python. J Mach Learn Res. 2011;12:2825–2830.
17. Chollet F, et al. Keras. Published online 2015. https://github.com/fchollet/keras
18. Nudel J, Bishara AM, de Geus SWL, et al. Development and validation of machine learning models to predict gastrointestinal leak and venous thromboembolism after weight loss surgery: an analysis of the MBSAQIP database. Surg Endosc. Published online 2020. doi: 10.1007/s00464-020-07378-x
19. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the Areas under Two or More Correlated Receiver Operating Characteristic Curves: A Nonparametric Approach. Biometrics. 1988;44(3):837. doi: 10.2307/2531595
20. Lundberg SM, Lee S-I. A Unified Approach to Interpreting Model Predictions. Accessed October 21, 2021. https://github.com/slundberg/shap
21. T S, M L, ML T, MJ L, A H, JW M. A simple web-based risk calculator (www.anastomoticleak.com) is superior to the surgeon’s estimate of anastomotic leak after colon cancer resection. Tech Coloproctol. 2017;21(1):35–41. doi: 10.1007/S10151-016-1567-7
22. Rencuzogullari A, Benlice C, Valente M, Abbas MA, Remzi FH, Gorgun E. Predictors of anastomotic leak in elderly patients after colectomy: nomogram-based assessment from the American College of Surgeons National Surgical Quality Program Procedure-Targeted Cohort. Dis Colon Rectum. 2017;60(5):527–536. doi: 10.1097/DCR.0000000000000789
23. Rojas-Machado SA, Romero-Simó M, Arroyo A, Rojas-Machado A, López J, Calpena R. Prediction of anastomotic leak in colorectal cancer surgery based on a new prognostic index PROCOLE (prognostic colorectal leakage) developed from the meta-analysis of observational studies of risk factors. Int J Colorectal Dis. 2015;31(2):197–210. doi: 10.1007/S00384-015-2422-4
24. K M, D F, E V, et al. External Validation and Optimization of the French Association of Hepatopancreatobiliary Surgery and Transplantation’s Score to Predict Severe Postoperative Biliary Leakage after Open or Laparoscopic Liver Resection. J Am Coll Surg. 2018;226(6):1137–1146. doi: 10.1016/J.JAMCOLLSURG.2018.03.024
25. Yokoo H, Miyata H, Konno H, et al. Models predicting the risks of six life-threatening morbidities and bile leakage in 14,970 hepatectomy patients registered in the National Clinical Database of Japan. Medicine (Baltimore). 2016;95(49):e5466. doi: 10.1097/MD.0000000000005466
26. Shinde RS, Acharya R, Chaudhari VA, et al. External validation and comparison of the original, alternative and updated-alternative fistula risk scores for the prediction of postoperative pancreatic fistula after pancreatoduodenectomy. Pancreatology. 2020;20(4):751–756. doi: 10.1016/j.pan.2020.04.006
27. Lao M, Zhang X, Guo C, et al. External validation of alternative fistula risk score (a-FRS) for predicting pancreatic fistula after pancreatoduodenectomy. HPB Off J Int Hepato Pancreato Biliary Assoc. 2020;22(1):58–66. doi: 10.1016/j.hpb.2019.05.007
28. Huang X-T, Huang C-S, Liu C, et al. Development and validation of a new nomogram for predicting clinically relevant postoperative pancreatic fistula after pancreatoduodenectomy. World J Surg. 2021;45(1):261–269. doi: 10.1007/s00268-020-05773-y
29. Mungroop TH, van Rijssen LB, van Klaveren D, et al. Alternative Fistula Risk Score for Pancreatoduodenectomy (a-FRS): Design and International External Validation. Ann Surg. 2019;269(5):937–943. doi: 10.1097/SLA.0000000000002620
30. Tabchouri N, Bouquot M, Hermand H, et al. A novel pancreatic fistula risk score including preoperative radiation therapy in pancreatic cancer patients. J Gastrointest Surg. 2021;25(4):991–1000. doi: 10.1007/s11605-020-04600-y
31. Han IW, Cho K, Ryu Y, et al. Risk prediction platform for pancreatic fistula after pancreatoduodenectomy using artificial intelligence. World J Gastroenterol. 2020;26(30):4453–4464. doi: 10.3748/wjg.v26.i30.4453
32. Birkmeyer JD, Finks JF, O’Reilly A, et al. Surgical skill and complication rates after bariatric surgery. N Engl J Med. 2013;369(15):1434–1442. doi: 10.1056/NEJMsa1300625
33. CP S, OA V, AM C, JD B, JB D. Video Ratings of Surgical Skill and Late Outcomes of Bariatric Surgery. JAMA Surg. 2016;151(6). doi: 10.1001/JAMASURG.2016.0428
34. AB C, S L, JH N, Y L, AJ H. Machine learning analyses of automated performance metrics during granular sub-stitch phases predict surgeon experience. Surgery. 2021;169(5):1245–1249. doi: 10.1016/J.SURG.2020.09.020

Associated Data

Supplementary Materials

Supplementary Table 1

NIHMS1813819-supplement-Supplementary_Table_1.docx (13.8 KB, docx)
