Leveraging Decision Curve Analysis to Improve Clinical Application of Surgical Risk Calculators

Esmaeel Reza Dadashzadeh; Patrick Bou-Samra; Lauren V Huckaby; Giacomo Nebbia; Robert M Handzel; Patrick R Varley; Shandong Wu; Allan Tsung

doi:10.1016/j.jss.2020.11.059

Author manuscript; available in PMC: 2022 Apr 8.

Published in final edited form as: J Surg Res. 2021 Jan 5;261:58–66. doi: 10.1016/j.jss.2020.11.059

PMCID: PMC8991373 NIHMSID: NIHMS1793576 PMID: 33418322

Abstract

Background:

Surgical risk calculators (SRCs) have been developed for estimation of postoperative complications but do not directly inform decision-making. Decision curve analysis (DCA) is a method for evaluating prediction models, measuring their utility in guiding decisions. We aimed to analyze the utility of SRCs to guide both preoperative and postoperative management of patients undergoing hepatopancreaticobiliary surgery by using DCA.

Methods:

A single-institution, retrospective review of patients undergoing hepatopancreaticobiliary operations between 2015 and 2017 was performed. Estimation of postoperative complications was conducted using the American College of Surgeons SRC [ACS-SRC] and the Predictive OpTimal Trees in Emergency Surgery Risk (POTTER) calculator; risks were compared with observed outcomes. DCA was used to model optimal patient selection for risk prevention strategies and to compare the relative performance of the ACS-SRC and POTTER calculators.

Results:

A total of 994 patients were included in the analysis. C-statistics for the ACS-SRC prediction of 12 postoperative complications ranged from 0.546 to 0.782. DCA revealed that an ACS-SRC–guided readmission prevention intervention, when compared with an all-or-none approach, yielded a superior net benefit for patients with estimated risk between 5% and 20%. Comparison of SRCs for venous thromboembolism intervention demonstrated superiority of the ACS-SRC for intervention thresholds between 2% and 4%, with the POTTER calculator performing better between 4% and 8% estimated risk.

Conclusions:

SRCs can be used not only to predict complication risk but also to guide risk prevention strategies. This methodology should be incorporated into external validations of future risk calculators and can be applied for institution-specific quality improvement initiatives to improve patient outcomes.

Keywords: Surgical risk calculators, Decision curve analysis, Risk prediction, Postoperative complications, Net benefit

Introduction

Postoperative complications, which occur in up to 30% of general surgery procedures, may lead to prolonged hospital stay, delays in adjuvant treatment, poor quality of life, and postoperative mortality.¹ In addition, adverse events represent a significant burden to the health care system; thus, it is critical to accurately predict these complications and seek to reduce them.² Many groups have attempted to refine risk estimation through the creation of surgical risk calculators (SRCs).^3–5 As an example, the American College of Surgeons (ACS) SRC uses patient demographics and clinical characteristics to generate estimated risk percentages for 12 complications including cardiovascular events, surgical site infections, and death.⁴ Although patient-specific risk estimation is a valuable tool for informed consent discussion, clinical application of these data to guide intervention in situations where estimated risk is high has thus far been limited.

Despite their wealth of information, risk calculators have several limitations. The ACS-SRC, for example, was developed using a heterogeneous cohort of patients and may yield variable accuracy in specific patient populations.⁴ In addition, calculators may only provide accurate risk prediction within a certain range and thus may oversimplify overall risk.⁶ With the advent of new calculators, it has also become increasingly difficult to establish the relative performance of one calculator over another. Finally, translation of risk calculator data to effect clinical practice is lacking. To address this latter point, Vickers et al. developed decision curve analysis (DCA) as a statistical approach using existing risk calculation data to evaluate the clinical consequences of a targeted intervention to improve overall patient outcomes.^6–8 This represents a novel approach to inform implementation of risk prevention measures.

The aim of this study was to use validated SRCs, including the ACS-SRC and the Predictive OpTimal Trees in Emergency Surgery Risk (POTTER) calculator,³ to 1) compare predicted risk with observed outcomes, 2) use DCA to estimate the potential impact of risk-driven interventions on patient outcomes, and 3) compare the overall net benefit of risk-driven interventions from two different calculators. To explore this, we focused exclusively on patients undergoing hepatopancreaticobiliary (HPB) surgery, as these are complex operations associated with relatively high morbidity and mortality. We hypothesized that utilization of DCA in interpretation of risk estimation could provide objective data to guide quality improvement initiatives and optimize patient outcomes in HPB surgery with the potential for application to other surgical patient populations.

Methods

Derivation of the study cohort

A retrospective review of our institutional HPB database was performed for all patients who underwent pancreaticoduodenectomy (Whipple procedure), distal pancreatectomy, or hepatectomy for any indication between January 1, 2015 and June 30, 2017. Both open and laparoscopic cases were included. The Current Procedural Terminology codes used to categorize patients by procedure type are listed (Appendix Table A.1). Baseline characteristics were obtained from our institutional ACS National Surgical Quality Improvement Program data set. Observed 30-day outcomes were obtained from ACS National Surgical Quality Improvement Program data and from comprehensive review of the electronic health record. This study was approved by the Institutional Review Board at the University of Pittsburgh. Because this was a retrospective study, informed consent was waived for participation in the study.

External validation of the risk calculator

Baseline characteristics were entered into the ACS web-based SRC between August 20, 2017 and August 28, 2017 to obtain estimated risk percentages of the 12 specified 30-day complication rates. Observed and predicted risk percentages were compared using discrimination, calibration, and the Brier score. Discrimination was measured using the C-statistic, or the area under the receiver operating characteristic curve (AUC).⁹ Calibration was measured using the Hosmer–Lemeshow test, which divides the data set into deciles based on predicted values and compares the observed response rates to the expected rates, with significant differences indicating lack of fit.¹⁰ Overall predictive accuracy was measured using the raw Brier score, which is the average gap (mean squared difference) between forecast probabilities and actual outcomes. This incorporates a model’s discriminative ability and calibration, with a score of 0 for a perfect model and a score of p*(1-p), where p is the a priori probability of the outcome, for a noninformative model.^11,12
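
As a minimal sketch of two of the metrics described above, the following Python functions compute a raw Brier score and a pairwise C-statistic. The function names and the ten-observation toy data set are hypothetical and are not taken from the study; the Hosmer–Lemeshow test is omitted because it requires a chi-square distribution routine.

```python
def brier_score(outcomes, risks):
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((r - y) ** 2 for y, r in zip(outcomes, risks)) / len(outcomes)


def c_statistic(outcomes, risks):
    """Share of (event, non-event) pairs in which the event case received
    the higher predicted risk; tied pairs count as one half."""
    events = [r for y, r in zip(outcomes, risks) if y == 1]
    non_events = [r for y, r in zip(outcomes, risks) if y == 0]
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Hypothetical toy data (1 = complication observed); not patient data.
outcomes = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
risks = [0.05, 0.10, 0.30, 0.20, 0.25, 0.05, 0.15, 0.10, 0.10, 0.20]

# A noninformative model predicts the prevalence p for everyone,
# which gives a Brier score of p * (1 - p).
prevalence = sum(outcomes) / len(outcomes)
noninformative = brier_score(outcomes, [prevalence] * len(outcomes))

print(brier_score(outcomes, risks), noninformative, c_statistic(outcomes, risks))
```

In this toy example the model's Brier score falls just below the noninformative reference, which illustrates why the raw score is best read against p*(1-p) rather than in isolation.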

Decision curve analysis

Decision curves were constructed using the open source “rmda” package (version 1.4) in R (version 3.4.1; R Development Core Team, Vienna, Austria).¹³ A detailed explanation on the manual calculation of a decision curve is provided (Appendix). In addition, to understand the relative frequencies of risk estimates, we developed a custom score distribution script in Python using standard plotting libraries (version 3.4; Python Software Foundation, Wilmington, DE). No data manipulation was performed, and the visualizations are easily reproducible.

Decision curve analysis for selection of risk intervention and comparison of risk calculators

After our external validation of the ACS-SRC in our patient cohort, we constructed a DCA curve to model readmission. We also performed an external validation for the POTTER calculator in our patient cohort to compare C-statistics. As an example, we then plotted DCA curves for venous thromboembolism (VTE) for both the ACS-SRC and POTTER calculators, as compared with treat-all and treat-none.

Statistical analysis

Continuous data are shown as mean ± standard deviation, and categorical data are shown as number (percent). Student’s t-test was used to compare continuous variables. Statistical analyses were performed using R Studio (version 3.4.1; R Foundation for Statistical Computing, Vienna, Austria).

Case scenario explanation of decision curve analysis

To illustrate the concepts of DCA, we offer the following scenario. City Y has performed poorly in tornado preparedness and is considering the adoption of a new weather prediction model, model X, to predict a tornado and thereby guide the activation of their city-wide siren system. Currently, sirens only activate when a tornado is already reported on the ground. Model X inputs a series of meteorological variables and outputs an estimated risk percentage, from 0% to 100%, of a tornado touching down in the next several hours. City Y officials want to determine if the expected net benefit of implementing model X is superior to no intervention (status quo), or a liberal intervention for any day with inclement weather strategy.

To begin, city Y officials perform an external validation of model X to assess its reproducibility and geographic transportability. Weather data from the past 100 d are input into model X, and the estimated risk percentages of a tornado touching down are recorded. When comparing these expected values to actual observed tornado events, metrics of model X performance such as AUC and calibration can be recorded. Although important, these metrics alone do not inform on the utility of adopting model X; therefore, city Y officials wish to apply some form of decision analytics. They choose DCA because their same external validation data set is sufficient to perform DCA.

DCA provides a method for exploring risk-based intervention (in this example, siren activation) that, unlike other forms of decision analysis, does not require utility scores for all possible outcomes of the decision tree.^8,14,15 Instead, DCA approaches the problem in terms of the threshold probability (Pt) above which the decision maker would deem the expected value of intervening to be greater than that of not intervening. In this formulation, the ratio (1 − Pt)/Pt represents the harm of a false-negative result relative to that of a false-positive result (e.g., a threshold probability of 10% signifies that the harm of a single false negative is nine times the harm of a false positive). In our example, a false positive indicates unnecessary siren activation, whereas a false negative represents a failure of siren activation on a day of a tornado touching down. In other words, a false positive signifies waste of time and resources, while a false negative indicates a missed opportunity for improved safety and care. The effectiveness of an intervention and its cost (monetary, risk to those involved, etc.) are inherently accounted for in the decision maker’s selection of the threshold probability. An intervention that approaches 100% efficacy with minimal cost and risk lends itself to a low threshold probability for triggering the intervention.
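
The link between the threshold probability and the relative harms can be written out as a short worked derivation. The symbols H_FN and H_FP (harm of one false negative and of one false positive) are introduced here only for illustration and do not appear in the original text. At the threshold, the decision maker is indifferent between intervening and not intervening, so the two expected harms are equal:

```latex
\[
P_t \cdot H_{FN} = (1 - P_t) \cdot H_{FP}
\quad\Longrightarrow\quad
\frac{H_{FN}}{H_{FP}} = \frac{1 - P_t}{P_t}
\]
% Worked values:
%   P_t = 0.10  ->  0.90 / 0.10 = 9     (a missed event is 9 times as harmful as a false alarm)
%   P_t = 0.50  ->  0.50 / 0.50 = 1     (the two errors are weighted equally)
%   P_t = 0.75  ->  0.25 / 0.75 = 1/3   (a false alarm is 3 times as harmful as a missed event)
```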

DCA derives the net benefit (y-axis) of intervening when the estimated risk percentage from model X is greater than or equal to city Y’s specified threshold probability (x-axis). The unit of net benefit is analogous to net true positives. As a reference, a net benefit of 0.1 is equivalent to a net of 10 true-positive siren activations per 100 d without an increase in the number of false-positive activations. At the decision maker’s selected threshold probability, the model or strategy with the highest net benefit would be the best to implement. At a given Pt, net benefit is calculated by subtracting the proportion of all days with false-positive siren activations (simply count the number of days without an observed tornado where model X’s estimated risk percentage was ≥ the given Pt) from the proportion with true-positive activations, weighting by the relative harm of a false-positive and a false-negative result in accordance with the following formula:

$$\text{Net benefit} = \frac{\text{true-positive count}}{n} - \frac{\text{false-positive count}}{n} \times \frac{P_t}{1 - P_t}$$
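
The net benefit formula can be sketched in a few lines of Python. This is an illustrative implementation only, not the rmda code used in the study; the function name and the ten-day tornado record are hypothetical.

```python
def net_benefit(outcomes, risks, pt):
    """Net benefit of intervening whenever predicted risk >= pt (0 <= pt < 1)."""
    n = len(outcomes)
    # True positives: intervention triggered and the event occurred.
    tp = sum(1 for y, r in zip(outcomes, risks) if r >= pt and y == 1)
    # False positives: intervention triggered but no event occurred.
    fp = sum(1 for y, r in zip(outcomes, risks) if r >= pt and y == 0)
    return tp / n - (fp / n) * (pt / (1 - pt))


# Hypothetical 10-day record for model X (1 = tornado observed).
tornado = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
risk_x = [0.05, 0.10, 0.30, 0.20, 0.25, 0.05, 0.15, 0.10, 0.10, 0.20]

pt = 0.20
nb_model = net_benefit(tornado, risk_x, pt)
# "Treat all" is equivalent to a model that always reports 100% risk.
nb_all = net_benefit(tornado, [1.0] * len(tornado), pt)
# "Treat none" never intervenes, so its net benefit is zero by definition.
nb_none = 0.0

print(nb_model, nb_all, nb_none)
```

Repeating the calculation over a grid of threshold probabilities and plotting the results gives the decision curve for each strategy.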

By allowing the threshold probability to vary, DCA can show graphically the net benefit obtained by using model X for the decision to activate sirens. In the absence of other predictive models, DCA compares the net benefit of using model X against the net benefits of two opposing strategies: treat-none and treat-all days with the intervention. For the former, the net benefit of intervening is zero, which is constant whatever the threshold probability. In the latter, when intervention is implemented for all days, (true-positive count/n) is city Y tornado prevalence (π) and (false-positive count/n) is 1 − π, resulting in a net benefit function defined by π − (1 − π) × P_t/(1 − P_t) that ranges from π down to negative infinity.
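
A short derivation, added here for illustration, shows where the treat-all curve crosses the treat-none line. Setting the treat-all net benefit to zero and solving for the threshold probability gives:

```latex
\[
\mathrm{NB}_{\text{all}}(P_t) = \pi - (1 - \pi)\,\frac{P_t}{1 - P_t}
\]
\[
\mathrm{NB}_{\text{all}}(P_t) = 0
\;\Longleftrightarrow\;
\pi\,(1 - P_t) = (1 - \pi)\,P_t
\;\Longleftrightarrow\;
P_t = \pi
\]
% At P_t = 0 the treat-all net benefit equals the prevalence \pi.
% For P_t > \pi it is negative, and it tends to negative infinity as P_t approaches 1.
```

In other words, intervening on every day is only preferable to never intervening when the chosen threshold probability is below the event prevalence.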

DCA’s utility becomes more apparent when multiple models are compared. Model W claims to be superior to model X because of its higher AUC. City Y officials have agreed on a threshold probability of 50% for siren activations (i.e., they are willing to accept a false alarm rate of 50%). As such, the model with the highest net benefit at a Pt of 50% would be their selection. City Z is also looking to implement a model; however, they have more fortified commercial and residential architecture and are willing to accept a higher threshold probability of 75% for siren activation. A higher model AUC does not guarantee a higher net benefit across all threshold probabilities; therefore, they would also look at the DCA comparing model X and model W before making their selection.

Results

Baseline characteristics of the study cohort

Baseline characteristics of patients undergoing one of three selected HPB procedures are shown (Table 1). Of the 994 patients, 306 (30.8%) underwent a Whipple procedure, 127 (12.8%) underwent distal pancreatectomy, and 561 (56.4%) underwent hepatectomy. Approximately one-third (36.3%) of patients undergoing a Whipple procedure were <65 y old, whereas 48% of patients undergoing distal pancreatectomy and 54.9% of patients undergoing a hepatectomy were <65 y. The majority of patients in the Whipple (97.1%) and distal pancreatectomy (95.3%) groups had a cancer diagnosis, whereas only 58.5% of those undergoing hepatectomy had a cancer diagnosis.

Table 1 –

Baseline characteristics of patients undergoing hepatopancreaticobiliary surgery, stratified by procedure type.

	Whipple n = 306	Distal pancreatectomy n = 127	Hepatectomy n = 561
Age (y)–no. (%)
<65	111 (36.3)	61 (48.0)	308 (54.9)
65–74	117 (38.2)	45 (35.4)	155 (27.6)
75–84	71 (23.2)	19 (15.0)	87 (15.5)
>85	7 (2.3)	2 (1.6)	11 (2.0)
Sex–no. (%)
Female	145 (47.4)	65 (51.2)	276 (49.2)
Male	161 (52.6)	62 (48.8)	285 (50.8)
Body mass index (kg/m²)–mean ± SD	27.0 ± 5.9	29.6 ± 6.9	28.6 ± 6.0
Functional status
Independent	304 (99.3)	127 (100.0)	559 (99.6)
Partially dependent	2 (0.7)	0	2 (0.4)
Emergent–no. (%)	1 (0.3)	0	1 (0.2)
ASA class
1	0	1 (0.8)	3 (0.5)
2	35 (11.4)	19 (15.0)	97 (17.3)
3	247 (80.7)	98 (77.2)	436 (77.7)
4	24 (7.8)	9 (7.1)	25 (4.5)
Ventilator dependent–no. (%)	0	0	0
Cancer–no. (%)	297 (97.1)	121 (95.3)	328 (58.5)
Sepsis–no. (%)
None	301 (98.4)	127 (100.0)	555 (98.9)
SIRS	3 (1.0)	0	2 (0.4)
Sepsis	2 (0.7)	0	4 (0.7)
Diabetes–no. (%)
No	223 (72.9)	92 (72.4)	459 (81.8)
Insulin	47 (15.4)	11 (8.7)	40 (7.1)
Noninsulin	36 (11.8)	24 (18.9)	62 (11.1)
Hypertension–no. (%)	182 (59.5)	71 (55.9)	276 (49.2)
Congestive heart failure–no. (%)	0	0	1 (0.2)
Dyspnea–no. (%)
No	295 (96.4)	0 (0.0)	552 (98.4)
At rest	1 (0.3)	4 (3.1)	1 (0.2)
Moderate exertion	10 (3.3)	123 (96.9)	8 (1.4)
Smoking–no. (%)	72 (23.5)	33 (26.0)	106 (18.9)
COPD–no. (%)	15 (4.9)	33 (26.0)	106 (18.9)
Dialysis–no. (%)	1 (0.3)	0	1 (0.2)
Acute renal failure–no. (%)	0	0	0
Ascites–no. (%)	0	0	0
Chronic steroid use–no. (%)	5 (1.6)	3 (2.4)	9 (1.6)

ASA = American Society of Anesthesiologists; COPD = chronic obstructive pulmonary disease; SIRS = systemic inflammatory response syndrome; SD = standard deviation.

Comparison of predicted and observed 30-day outcomes

Predicted (based on the ACS-SRC) and observed 30-day rates of 12 complications specified by the ACS-SRC are shown (Table 2). Poor calibration was observed for “any complication” and “pneumonia” with Hosmer–Lemeshow test P-values of 0.04 and 0.01, respectively. All other complications displayed a P-value >0.05, indicating satisfactory calibration. In addition, we compared predicted and observed outcomes individually for Whipple procedure (Appendix Table A.2), distal pancreatectomy (Appendix Table A.3), and hepatectomy (Appendix Table A.4). Among all patients, the highest C-statistic was noted for discharge to a facility (0.782, Fig. 1A) with the lowest being for surgical site infection (0.546, Fig. 1B). Receiver operating characteristic curves for readmission (C-statistic 0.611, Fig. 1C) and VTE (C-statistic 0.674, Fig. 1D) are also shown. Analysis of the Brier scores (with a lower score reflecting superior accuracy) demonstrates the highest accuracy for cardiac complications (0.002), death (0.009), and renal failure (0.013).

Table 2 –

Predicted and observed 30-day outcomes for patients undergoing hepatopancreaticobiliary surgery.

	Predicted %±SD	Observed No. (%)	C-statistic	Brier score	Hosmer–Lemeshow P-value
Serious complication	21.8 ± 7.5	190 (19.1)	0.606	0.152	0.37
Any complication	25.3 ± 9.0	220 (22.1)	0.599	0.171	0.04
Pneumonia	4.1 ± 2.3	27 (2.7)	0.656	0.026	0.01
Cardiac complication	1.8 ± 1.5	2 (0.2)	0.571	0.002	0.55
Surgical site infection	13.8 ± 6.4	132 (13.3)	0.546	0.117	0.32
Urinary tract infection	3.2 ± 1.5	19 (1.9)	0.706	0.019	0.26
Venous thromboembolism	3.2 ± 1.2	25 (2.5)	0.674	0.024	0.07
Renal failure	1.9 ± 1.5	13 (1.3)	0.658	0.013	0.47
Readmission	13.7 ± 3.6	182 (18.3)	0.611	0.148	0.14
Return to the operating room	3.9 ± 1.8	36 (3.6)	0.591	0.035	0.08
Death	2.2 ± 2.5	9 (1.0)	0.604	0.009	0.78
Discharge to a nursing or rehabilitation facility	9.3 ± 9.3	86 (8.7)	0.782	0.069	0.39
Predicted length of stay (d)	7.9 ± 2.5	6.6 ± 5.3

SD = standard deviation.

Risk prediction was obtained from the American College of Surgeons Surgical Risk Calculator. P-value for length of stay was <0.001.

Fig. 1 – Receiver operating characteristic curves generated from comparison of predicted (from the American College of Surgeons Surgical Risk Calculator) and observed 30-day complication rates for surgical site infection (A), readmission (B), venous thromboembolic event (C), and discharge to a nursing facility (D) in patients undergoing hepatopancreaticobiliary surgery.

Decision curve analysis for risk intervention

We modeled the application of DCA for intervention for readmission (Table 3). The threshold probability at which an intervention would be implemented is shown in addition to the corresponding net benefit (for treating only those patients with an ACS-SRC risk above the threshold versus treating all patients) and the adjusted benefit of using the ACS-SRC risk to determine intervention. In addition, we demonstrate the number of patients who would be spared an unnecessary intervention at each threshold. As an example, if a surgeon selects an ACS-SRC–predicted readmission risk of 10% at which to intervene, unnecessary intervention (i.e., the intervention is implemented but the patient would not have experienced the complication) would be avoided in 26 of 100 patients.

Table 3 –

Application of decision curve analysis for readmission intervention, based on data obtained from the American College of Surgeons (ACS) Surgical Risk Calculator (SRC).

P_t (%)	Patient counts			Net benefit		Advantage of ACS-SRC
P_t (%)	Total	TPC	FPC	Treatment based on ACS-SRC	All patients treated	Relative benefit using ACS-SRC	Number of interventions avoided per 100 patients
0	994	182	812	0.183	0.183	0	0
5	990	182	808	0.141	0.140	0.001	1.480
6	980	182	798	0.134	0.131	0.003	4.359
7	948	181	767	0.130	0.122	0.008	11.185
8	934	181	753	0.124	0.112	0.012	13.364
9	881	176	705	0.121	0.102	0.018	18.527
10	819	171	648	0.121	0.092	0.029	25.693
11	758	161	597	0.115	0.082	0.033	26.639
12	655	150	505	0.124	0.072	0.052	38.258
13	572	136	436	0.124	0.061	0.063	42.049
14	514	122	392	0.113	0.050	0.063	38.754
15	426	100	326	0.100	0.039	0.061	34.429
20	24	5	19	0.010	−0.021	0.032	12.617

FPC = false-positive count; P_t = threshold probability; TPC = true-positive count.
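
Two columns of Table 3 can be approximately recomputed from values reported in the text, as a minimal sketch. The "all patients treated" column depends only on the observed readmission prevalence (182 of 994), and the "interventions avoided" column appears consistent with the standard net-reduction formula, in which the relative benefit is divided by the threshold odds and scaled to 100 patients. The function names are hypothetical, and because the table was presumably built from unrounded values, recomputation from its rounded entries gives small differences.

```python
N_PATIENTS = 994    # cohort size reported in the study
N_READMITTED = 182  # observed 30-day readmissions (Table 2)


def treat_all_net_benefit(pt, prevalence=N_READMITTED / N_PATIENTS):
    """Net benefit when every patient receives the intervention."""
    return prevalence - (1 - prevalence) * pt / (1 - pt)


def interventions_avoided_per_100(relative_benefit, pt):
    """Net reduction in unnecessary interventions per 100 patients,
    given the net-benefit advantage of the model over treat-all."""
    return relative_benefit / (pt / (1 - pt)) * 100


for pt_percent in (5, 10, 15, 20):
    pt = pt_percent / 100
    print(pt_percent, round(treat_all_net_benefit(pt), 3))

# Rounded relative benefit at a 13% threshold, taken from Table 3.
print(interventions_avoided_per_100(0.063, 0.13))
```

The treat-all values become negative once the threshold exceeds the 18.3% readmission prevalence, which matches the negative entry in the 20% row.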

We demonstrate the DCA for readmission, with the net benefit on the y-axis and the threshold probability on the x-axis (Fig. 2). The solid line demonstrates the DCA if all patients were to receive the readmission intervention, and the dashed line represents no patients receiving the intervention. The dash-dot line shows the decision curve using the SRC-predicted risk. Therefore, using the SRC approach to guide intervention results in superior net benefit between threshold probabilities of 5% and 20%.

Decision curve analysis for comparison of risk calculators

Next, we sought to use DCA to compare risk-based VTE intervention using two different risk calculators (ACS-SRC and POTTER). The receiver operating characteristic curves for VTE for both ACS-SRC and POTTER are shown (Fig. 3A), with a C-statistic of 0.574 for POTTER and 0.674 for ACS-SRC. As shown in Figure 3B, ACS-SRC–guided intervention yielded a greater net benefit at lower threshold probabilities (2%-4%), and the POTTER results were superior between threshold probabilities of 4% and 8%. In addition, we demonstrate the score distribution of each calculator at different threshold probabilities (Fig. 3C), which displays the overall distribution of risk percentage frequency obtained from the calculators at or below each probability in our external validation data set.

Fig. 3 – Receiver operating characteristic curves demonstrating predicted 30-day venous thromboembolism (VTE) risk as estimated by the American College of Surgeons Surgical Risk Calculator (ACS-SRC) and the Predictive OpTimal Trees in Emergency Surgery Risk (POTTER) calculator (A). Decision curve analysis using both the ACS-SRC and POTTER calculators for VTE risk intervention (B). Decision curve analysis with intervals representing the frequency of reported risk estimation at or below each probability (C).

Discussion

SRCs have been devised with the intent of guiding informed consent discussions through estimation of specific risks. The clinical application of these results in the postoperative period, however, has thus far been limited. Building on prior work by others using DCA, we demonstrate the utility of DCA in the understanding of implementation of risk-reducing approaches based on estimated risk in a population of almost 1000 patients undergoing high-risk HPB surgery. Our results highlight that the model with the largest AUC may not be the optimal model for clinical decisions. DCA also allows for the selection of the optimal SRC depending on the goals of a quality improvement program and thresholds for intervention. This approach has broad applications to a variety of surgical specialties and can serve to guide quality improvement initiatives to improve surgical outcomes.

To the best of our knowledge, this is the largest published external validation of the ACS-SRC in a population of patients undergoing HPB surgery with almost 1000 patients. Our results demonstrate satisfactory overall estimation of the 12 complications with results comparable with other external validation studies in various surgical populations. Similar to other studies, we found variations in model performance between different complications.^16–18 In addition, our Brier scores were comparable with those reported in an exploration of the ACS-SRC in a smaller study of patients undergoing HPB procedures¹⁹ and in patients undergoing Whipple.²⁰ We do believe that it is important to incorporate multiple tests of predictive accuracy in the external validation of SRCs, as expounded by others and as we have carried out here.²¹ Because interpretation of model performance remains subjective and should be interpreted in the context of different practice environments, it is particularly important to factor in the SRC calibration, accuracy, and discrimination results before applying SRC results to clinical practice or research endeavors.

One key advantage of DCA is the ability to select a threshold probability. DCA can be applied in individual practice settings, with surgeons determining their own threshold for intervention based on their available risk interventions and perceived benefits, and is also applicable on a larger scale for institution-wide quality improvement initiatives. For example, to reduce postoperative VTE, an institution may elect to institute aggressive risk management strategies (e.g., early ambulation, mechanical compression, outpatient pharmacologic prophylaxis) in patients with an estimated VTE risk over 3% based on ACS-SRC risk.^22,23 Thus, in a preoperative office setting, providers can obtain these risk estimations and use these results not only to counsel patients but also to identify potential barriers to a smooth postoperative course. This allows for preoperative interventions (e.g., prehabilitation, which may facilitate early postoperative ambulation) and allows for sufficient planning to institute any postoperative interventions (e.g., patient and caretaker education about enoxaparin injections).^24,25 This approach, however, does necessitate understanding of the risks/benefits and resources available for each intervention and thus must be tailored for each institution. DCA allows for flexibility in threshold probability and provides a range in which a given calculator remains reliable and thus is adaptable to changes in goal metrics.

Certainly, DCA is not the only approach developed for modeling the impacts of decision-making from a clinical, resource, and financial perspective. Cost-effectiveness analysis and decision tree analysis, for example, represent valuable and more robust methodologies, although they may require additional data such as billing records and utility scores which may not be readily available.^26–28 DCA, on the other hand, has the advantage of being less cumbersome, using existing clinical data from a validation data set.^6,8,29

In this study, we used the ACS-SRC calculator as it is the most widely used and well-validated SRC.^4,5,30 Despite this, these results may not be applicable to all patient populations, which, together with the rise of big data approaches, has driven the development of a multitude of SRCs for specific patient populations. We compared the ACS-SRC with the recently published and machine learning–driven POTTER calculator to explore the relative utility of each of the calculators in VTE risk.³ Interestingly, despite a higher C-statistic, the ACS-SRC demonstrated a superior net benefit at a more limited range of threshold probabilities (2%-4%), whereas the POTTER calculator performed better at a higher range (4%-8%). This finding highlights the value of direct comparison of these risk estimation approaches with selection of a risk prediction method based on specific goals. For example, although POTTER was developed for emergency general surgery operations, its results may be more applicable if an institution has selected a higher threshold probability (e.g., 4%-8%) on which to intervene to reduce VTE risk.

This study has several limitations. We chose to focus on patients undergoing HPB surgery given the high-risk nature of these operations and the importance of risk mitigation among this patient population. We believe that this methodology can be extended to other surgical or nonsurgical patient populations, but it is possible that it may not carry the same value in lower-risk populations. In addition, we acknowledge that this study, like many decision modeling studies, is theoretical in nature. A large, prospective study comparing DCA-guided interventions to other guided risk intervention approaches would be particularly informative. Finally, for some of the complications, we observed a low number of events; thus, external validation of the calculator and subsequent DCA may be limited.

With the rapid advent of novel risk calculators, it is imperative that future external validations include not only standard metrics of model performance, but also data in support of their potential role for clinical implementation.³¹ We demonstrate the application of SRC data to guide interventions for risk reduction, using DCA to inform intervention implementation for high-risk patients. This method represents a key step in leveraging SRC data to guide clinical decision-making. Future work will focus on implementation of this strategy in a prospective study testing SRC-guided intervention measures. We believe that this work can be easily extrapolated to other disease types, using DCA and other risk calculators to maximize patient outcomes for oncologic and nononcologic diseases.

Disclosure

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Footnotes

Meeting presentation: This was presented at the American College of Surgeons 104th Annual Clinical Congress, Scientific Forum, Boston, MA, October 2018.

Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.jss.2020.11.059.



