Author manuscript; available in PMC: 2020 Jul 1.
Published in final edited form as: Med Decis Making. 2019 May 20;39(5):491–492. doi: 10.1177/0272989X19849436

The Importance of Uncertainty and Opt In vs. Opt Out: Best Practices for Decision Curve Analysis1

Kathleen F. Kerr, Tracey L. Marsh, Holly Janes
PMCID: PMC6786944  NIHMSID: NIHMS1527393  PMID: 31104561

Decision Curve Analysis (DCA) [1] is a breakthrough methodology, and we are happy to see its growing adoption. DCA evaluates a risk model when its intended application is to choose or forego an intervention based on a patient's estimated risk of having an unwanted outcome without the intervention. The interesting survey of the recent literature by Capogrosso and Vickers [2] shows DCA applied to cancer, cardiovascular disease, and other areas. In the context of this survey, Capogrosso and Vickers present a list of items representing their view of best practices for DCA. This list is a good start towards a checklist for practitioners, and we endorse many of its items. In particular, we concur that the clinical decision to be influenced by the risk model is fundamental and should be identified when reporting a DCA [2, 3]. We also agree that investigators should select and present a range of risk thresholds for the DCA that is appropriate for that clinical decision [2, 3].

The risk model performance metric at the heart of DCA is Net Benefit. It is well-known that there is an optimistic bias when a risk model is evaluated on the same data that were used to build and/or fit the model unless special methods are employed. This issue pertains to estimating Net Benefit just as it pertains to estimating any other risk model performance metric. We concur with Capogrosso and Vickers that best practice is to use methods that account for this bias [2].

In our opinion, two key items are missing from the list in the article by Capogrosso and Vickers: including measures of uncertainty such as confidence intervals, and choosing the type of Decision Curve that is most relevant for the application.

First, investigators should summarize the uncertainty in DCA results [4, 5]. The standard statistical tool for quantifying uncertainty is the confidence interval. Decision Curves are estimated using a sample of data from a target population (such as a patient population) in order to infer risk model performance in that population. A large sample size enables reliable and precise inference to the target population, reflected in narrow confidence intervals. With a small sample size, spurious results can arise by chance. In this situation, wide confidence intervals communicate the larger degree of uncertainty.
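To illustrate, a percentile bootstrap is one standard way to obtain a pointwise confidence interval for an estimated Net Benefit at a given risk threshold. The sketch below is our own minimal illustration, not code from any DCA package; the arrays `y` (0/1 outcomes) and `risk` (model-estimated risks) are hypothetical inputs, and resampling patients with replacement mimics repeated sampling from the target population:

```python
import numpy as np

def empirical_net_benefit(y, risk, threshold):
    """Empirical (opt-in) Net Benefit at a single risk threshold.

    y    -- 0/1 array of observed outcomes
    risk -- array of model-estimated risks, same length as y
    """
    n = len(y)
    treat = risk >= threshold            # patients the model opts in
    tp = np.sum(treat & (y == 1))        # events correctly opted in
    fp = np.sum(treat & (y == 0))        # non-events opted in by mistake
    w = threshold / (1.0 - threshold)    # odds weight for false positives
    return tp / n - (fp / n) * w

def bootstrap_ci(y, risk, threshold, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for Net Benefit."""
    rng = np.random.default_rng(seed)
    n = len(y)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample patients with replacement
        stats.append(empirical_net_benefit(y[idx], risk[idx], threshold))
    return np.quantile(stats, [alpha / 2, 1 - alpha / 2])
```

With a small sample, the resulting interval will be wide, which is exactly the point: the width communicates how much the estimated curve could move under resampling.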

While there is largely consensus on the importance of quantifying uncertainty in quantitative scientific biomedical research, there appears to be some disagreement on this point for DCA [4, 6]. We suspect this disagreement relates to another issue. It has been proposed that an individual can use DCA results together with his/her personal risk threshold to decide whether to choose the intervention, forego the intervention, or use the risk model to decide [1, 3]. If one accepts this proposal, one can then argue that the individual should choose the option that appears superior based on point estimates of Net Benefit, regardless of statistical significance. Under this proposal, measures of uncertainty such as confidence intervals do not affect an individual's decision, arguably rendering them irrelevant.

Our view is that DCA is not appropriately used for such individual decision-making. The components of Net Benefit are the fraction of the target population that will go on to have the unwanted clinical event, and the true and false positive rates for the risk model at the risk threshold. All of these quantities are population quantities, and so too is Net Benefit.
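To make these population quantities concrete, here is a minimal sketch (variable names are our own) of the standard opt-in Net Benefit formula from [1], which weights false positives by the odds of the risk threshold:

```python
def net_benefit(tpr, fpr, prevalence, threshold):
    """Opt-in Net Benefit of a risk model at a given risk threshold.

    tpr, fpr   -- true and false positive rates of the model at `threshold`
    prevalence -- fraction of the population that will have the event
    threshold  -- risk threshold p_t; false positives are weighted
                  by the odds p_t / (1 - p_t)
    """
    weight = threshold / (1.0 - threshold)
    return tpr * prevalence - fpr * (1.0 - prevalence) * weight

# Example: 20% prevalence, 10% threshold, model with TPR 0.8 and FPR 0.3
nb = net_benefit(0.8, 0.3, 0.20, 0.10)  # ≈ 0.133
```

Every input is a property of the target population, not of any individual patient, which is why we regard Net Benefit itself as a population quantity.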

We view DCA as useful for researchers and policymakers to evaluate the population impact of intervention policies. For researchers, confidence intervals inform whether a risk model should be abandoned, advanced to the next stage of research, or whether more evidence is needed to decide [4]. Policymakers can use DCA to evaluate and compare intervention policies, including those that intervene on everyone (“treat-all”), those that do not intervene on anyone (“treat-none”), and those that use a risk model to decide on intervention. Treatment policies impact public health, so changes in policy require careful consideration. In contemplating a change from a default policy (treat-all or treat-none) to a policy based on estimated risk, policymakers should know the strength of the evidence in favor of the policy change; that is, they need some quantification of uncertainty such as confidence intervals. Importantly, advocating confidence intervals does not prescribe how policymakers must use them in every instance. In particular, we acknowledge that policymakers might sometimes decide to change policy despite statistically insignificant evidence favoring the new policy. However, it does not follow that leaving the strength or weakness of the evidence unsummarized is preferable, or even acceptable, when risk models are evaluated.
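The two default policies are easy to express in the same terms, since treat-all corresponds to fixed true and false positive rates of 1 and treat-none to rates of 0; a minimal sketch (function names are ours) of their Net Benefit:

```python
def nb_treat_all(prevalence, threshold):
    """Net Benefit of intervening on everyone (TPR = FPR = 1)."""
    w = threshold / (1.0 - threshold)
    return prevalence - (1.0 - prevalence) * w

def nb_treat_none(prevalence, threshold):
    """Net Benefit of intervening on no one (TPR = FPR = 0)."""
    return 0.0

# At a 10% threshold with 20% prevalence, treat-all has positive Net
# Benefit, so a risk model must beat treat-all, not merely treat-none,
# to justify a policy change in that setting.
```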

Confidence intervals have heightened importance when current policy is treat-none. Risk model development depends on the existence of data on outcomes absent intervention. Any switch away from the treat-none policy could preclude future opportunities to develop better risk models. Therefore, policymakers should be wary of switching out of a treat-none policy when the evidence favoring a risk model is weak.

Our proposal that Decision Curves should be published with confidence intervals can be viewed as an alternative to the proposal that Decision Curves should be smoothed [2]. To our knowledge, the statistical properties of smoothed estimates of Net Benefit have not been investigated, so we think it is premature to recommend them. Confidence intervals around a bumpy Decision Curve prevent over-interpretation of bumps in the curve. In contrast, smoothing a bumpy Decision Curve might make results appear more definitive than they really are, which could invite, rather than prevent, over-interpretation of DCA results.

The second item we propose for a list of DCA best practices refers to the type of Decision Curves investigators present. The common type of DCA evaluates risk models for identifying high-risk patients who should be recommended the intervention. These "opt-in" Decision Curves assess risk-based assignment to the intervention relative to the treat-none policy [7]. "Opt-out" Decision Curves are better suited when current policy is treat-all, and the potential use of a risk model is to opt low-risk individuals out of the intervention. In our opinion, a list of DCA best practices should advocate choosing the type of Decision Curve that is most appropriate for the application – depending on whether current practice is treat-none or treat-all. This proposed item is in the spirit of Capogrosso and Vickers's overarching goal of thoughtful and conscientious application of DCA.
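As a sketch of the distinction, the opt-out setting reverses the roles of the two error types: the benefit unit is correctly withholding the intervention from a patient who would not have the event, and events missed by opting out are weighted by the inverse odds of the threshold. The formulation below is our own hedged illustration of this decision-theoretic weighting (see [7] for the full treatment), not a definitive implementation:

```python
def net_benefit_opt_out(tnr, fnr, prevalence, threshold):
    """Net Benefit of a risk-based opt-out policy, relative to treat-all.

    tnr, fnr   -- true and false negative rates of the model at `threshold`
    prevalence -- fraction of the population that will have the event
    threshold  -- risk threshold p_t; missed events are weighted by the
                  inverse odds (1 - p_t) / p_t
    """
    w = (1.0 - threshold) / threshold
    return tnr * (1.0 - prevalence) - fnr * prevalence * w
```

Plotting this quantity across thresholds, rather than the opt-in version, keeps the comparison anchored to the policy actually in place (treat-all) when the question is which low-risk patients can safely forego intervention.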

Footnotes

1. Financial support for this work was provided in part by grants from the National Institutes of Health (5R01HL085757-11 and R01CA152089). The funding agreement ensured the authors’ independence in writing and publishing this commentary.

Contributor Information

Kathleen F. Kerr, Department of Biostatistics, Box 357232, University of Washington, 206-543-2507, Fax 206-543-3286.

Tracey L. Marsh, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, M2-B500, Seattle WA 98109, 206-667-7460, Fax 206-667-4378.

Holly Janes, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, M2-C200, Seattle WA 98109, 206-667-6353, Fax 206-667-4378.

REFERENCES

  • [1].Vickers AJ, Elkin EB. Decision curve analysis: a novel method for evaluating prediction models. Med Decis Making. 2006; 26(6):565–74.
  • [2].Capogrosso P, Vickers A. A systematic review of the literature demonstrates some errors in the use of decision curve analysis but generally correct interpretation of findings. Med Decis Making. 2019; To appear.
  • [3].Van Calster B, Wynants L, Verbeek JFM, Verbakel JY, Christodoulou E, Vickers AJ, et al. Reporting and Interpreting Decision Curve Analysis: A Guide for Investigators. Eur Urol. 2018.
  • [4].Vickers AJ, Cronin AM, Elkin EB, Gonen M. Extensions to decision curve analysis, a novel method for evaluating diagnostic tests, prediction models and molecular markers. BMC Med Inform Decis Mak. 2008; 8:53.
  • [5].Kerr KF, Brown MD, Zhu K, Janes H. Assessing the Clinical Impact of Risk Prediction Models With Decision Curves: Guidance for Correct Interpretation and Appropriate Use. J Clin Oncol. 2016; 34(21):2534–40.
  • [6].Baker SG, Cook NR, Vickers A, Kramer BS. Using relative utility curves to evaluate risk prediction. J R Stat Soc Ser A Stat Soc. 2009; 172(4):729–48.
  • [7].Kerr KF, Brown MD, Marsh TL, Janes H. Assessing the Clinical Impact of Risk Models for Opting Out of Treatment. Med Decis Making. 2019; 39(2):86–90.
