NPJ Digital Medicine
Editorial. 2021 Aug 10;4:119. doi: 10.1038/s41746-021-00495-4

Beyond performance metrics: modeling outcomes and cost for clinical machine learning

James A Diao 1, Leia Wedlund 1, Joseph Kvedar 1,2
PMCID: PMC8355228  PMID: 34376781

Subject terms: Health care economics, Computational science, Outcomes research


Advances in medical machine learning are expected to help personalize care, improve outcomes, and reduce wasteful spending. In quantifying potential benefits, it is important to account for constraints arising from clinical workflows. Practice variation is known to influence the accuracy and generalizability of predictive models [1,2], but its effects on cost-effectiveness and utilization are less well described. A simulation-based approach by Mišić and colleagues [3] goes beyond simple performance metrics to evaluate how process variables may influence the impact and financial feasibility of clinical prediction algorithms.

Mišić et al.’s study builds on previous work that developed equations for predicting unplanned readmissions [4]. Readmission rate is a publicly reported metric that commands significant attention in quality improvement and cost management initiatives. As part of these efforts, prediction equations are used to stratify readmission risk and allocate limited interventions to the patients who need them most. Still, many important questions remain unanswered: how many interventions are applied, how many readmissions are prevented, and how much spending is averted?

Mišić and colleagues provide answers by simulating patient flow through a hypothetical clinical workflow. Under their model, each patient is assigned a risk score by each of four algorithms (LACE, HOSPITAL, and two locally designed equations) on each day a prediction is available. On each day that a provider is available, the eight highest-risk patients are “treated” with a hypothetical intervention that has a 10% chance of preventing readmission. The authors applied this model to a dataset of 19,331 post-operative surgical patients from the UCLA Ronald Reagan Medical Center, including 969 (5.0%) who were later readmitted. Because the described workflow is speculative, it cannot be validated with data. Instead, the design and parameters were chosen to reflect typical staffing and time constraints.
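
To make the mechanics concrete, the following is a minimal Python sketch of a simulated patient-flow loop of this kind. The data layout (a single per-day risk score per patient, a ground-truth readmission flag) and all names are our own illustrative assumptions, not the authors’ implementation; the paper ranks patients using four separate algorithms under richer workflow constraints.

    import random

    def simulate_workflow(patients, horizon_days, daily_capacity=8,
                          p_prevent=0.10, seed=0):
        # Minimal sketch, assuming each patient is a dict with hypothetical
        # keys: "id", "risk_scores" (mapping day -> score, defined only for
        # days a prediction is available), and "will_readmit" (ground truth).
        rng = random.Random(seed)
        treated, prevented = set(), 0
        for day in range(horizon_days):
            # Patients with a score available today who have not yet been
            # treated, ranked from highest to lowest risk.
            eligible = [p for p in patients
                        if day in p["risk_scores"] and p["id"] not in treated]
            eligible.sort(key=lambda p: p["risk_scores"][day], reverse=True)
            # Treat the highest-risk patients up to the daily capacity; each
            # intervention averts a true readmission with probability p_prevent.
            for p in eligible[:daily_capacity]:
                treated.add(p["id"])
                if p["will_readmit"] and rng.random() < p_prevent:
                    prevented += 1
        return {"interventions": len(treated),
                "readmissions_prevented": prevented}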

Using these simulation conditions, Mišić and colleagues compute utilization and volume metrics for the Ronald Reagan Medical Center, including interventions conducted, readmissions anticipated, and readmissions expected to be prevented. To compute net savings, the authors summed the cost tied to the highest-cost ICD-10 code for each “prevented” readmission and subtracted expected labor costs. By toggling simulation parameters, the authors also show how differences in accuracy, prediction timing, and provider availability translate into differences in outcomes. For example, algorithms that rely on length of stay cannot assign risk scores before the day of discharge, potentially constraining opportunities to intervene. The authors also find several parameter settings where costs outweigh savings, consistent with earlier studies showing that interventions for preventing readmission are not always cost-effective [5].
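
The underlying accounting reduces to a balance of averted costs against labor. The sketch below is a hedged illustration under assumed inputs: the function name, the list of per-readmission costs, and the linear labor-cost model are ours, not values or structures taken from the paper.

    def net_savings(prevented_costs, interventions, labor_cost_per_intervention):
        # prevented_costs: assumed list of dollar amounts, one per "prevented"
        # readmission (e.g., the cost tied to its highest-cost ICD-10 code).
        # Labor is modeled, as a simplifying assumption, as a fixed cost per
        # intervention delivered.
        savings = sum(prevented_costs)
        labor = interventions * labor_cost_per_intervention
        return savings - labor

    # Toggling parameters shows when costs outweigh savings: for example,
    # net_savings([12_000.0, 8_500.0], interventions=400,
    #             labor_cost_per_intervention=60.0) returns -3_500.0.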

The simulation approach relies on a broad set of simplifying assumptions and therefore has several limitations. In particular, the assumption of fixed, limited availability (e.g., one nurse practitioner providing readmission interventions for a 520-bed hospital) may be overly stringent, or may overlook the need for additional funding to support effective programs. Assumptions for intervention timing may also be inexact, as strategies for preventing hospital readmission increasingly comprise multiple components administered before, during, and after discharge [6]. Finally, the evaluated algorithms do not account for many important drivers of readmission, such as language and cultural barriers, mental illness, and poverty. Allocating interventions based on clinical risk alone may not represent the most common or effective strategy for reducing rehospitalization. Together, these considerations indicate the need to validate simulation results against real-world data and recalibrate assumptions where necessary. Beyond validation, future work should extend the model to provide estimates of uncertainty and to evaluate health- and equity-based outcomes.

Ultimately, the proposed simulation model provides estimates for utilization and financial feasibility in the setting of a specific clinical workflow. Preventing rehospitalization is only one application; others include prevention of sepsis [7] or acute kidney injury [8]. While not a substitute for randomized trials [9], simulation modeling can provide initial answers and insights for all stakeholders involved, including researchers developing prediction algorithms, administrators optimizing clinical workflows, executives evaluating business models, and regulators seeking to understand performance in context.

For decades, medical practice has proved impervious to algorithmic reinvention [10]. One contributor is imperfect communication of a clear value proposition centered on outcomes, costs, and metrics that matter. Performance metrics like sensitivity and specificity are only part of the puzzle. A simulation modeling approach may help contextualize and complement traditional accuracy metrics to strengthen the case for new prediction models.

Author contributions

Initial draft by J.A.D. Critical revisions by L.W. and J.K. All authors approved the final draft.

Competing interests

J.K. is the Editor-in-Chief of npj Digital Medicine. J.A.D. and L.W. declare no competing interests.

References

1. Agniel D, Kohane IS, Weber GM. Biases in electronic health record data due to processes within the healthcare system: retrospective observational study. BMJ. 2018;361:k1479. doi:10.1136/bmj.k1479.
2. Beaulieu-Jones BK, et al. Machine learning for patient risk stratification: standing on, or looking over, the shoulders of clinicians? NPJ Digit. Med. 2021;4:62. doi:10.1038/s41746-021-00426-3.
3. Mišić VV, Rajaram K, Gabel E. A simulation-based evaluation of machine learning models for clinical decision support: application and analysis using hospital readmission. NPJ Digit. Med. 2021;4:98. doi:10.1038/s41746-021-00468-7.
4. Mišić VV, Gabel E, Hofer I, Rajaram K, Mahajan A. Machine learning prediction of postoperative emergency department hospital readmission. Anesthesiology. 2020;132:968–980. doi:10.1097/ALN.0000000000003140.
5. Nuckols TK, et al. Economic evaluation of quality improvement interventions designed to prevent hospital readmission: a systematic review and meta-analysis. JAMA Intern. Med. 2017;177:975–985. doi:10.1001/jamainternmed.2017.1136.
6. Kripalani S, Theobald CN, Anctil B, Vasilevskis EE. Reducing hospital readmission rates: current strategies and future directions. Annu. Rev. Med. 2014;65:471–485. doi:10.1146/annurev-med-022613-090415.
7. Reyna M, et al. Early prediction of sepsis from clinical data: the PhysioNet/Computing in Cardiology Challenge 2019. 2019 Computing in Cardiology Conference (CinC); 2019. doi:10.22489/cinc.2019.412.
8. Flechet M, et al. Machine learning versus physicians’ prediction of acute kidney injury in critically ill adults: a prospective evaluation of the AKIpredictor. Crit. Care. 2019;23:1–10. doi:10.1186/s13054-019-2563-x.
9. Nikolova-Simons M, et al. A randomized trial examining the effect of predictive analytics and tailored interventions on the cost of care. NPJ Digit. Med. 2021;4:92. doi:10.1038/s41746-021-00449-w.
10. Schwartz WB, Patil RS, Szolovits P. Artificial intelligence in medicine. Where do we stand? N. Engl. J. Med. 1987;316:685–688. doi:10.1056/NEJM198703123161109.
