Author manuscript; available in PMC: 2021 Sep 1.
Published in final edited form as: J Pain Symptom Manage. 2020 Jun 26;60(3):e33–e36. doi: 10.1016/j.jpainsymman.2020.06.018

Sensitivity and Specificity of a Machine Learning Algorithm to Identify Goals-of-care Documentation for Adults With Congenital Heart Disease at the End of Life

Jill M Steiner 1,2, Christina Morse 3, Robert Y Lee 4,5, J Randall Curtis 6,7, Ruth A Engelberg 8,9
PMCID: PMC7484168  NIHMSID: NIHMS1618240  PMID: 32599151

To the Editor:

Goals-of-care discussions are an important part of high-quality care for patients with serious illness. These discussions are most useful if they are documented in the electronic health record (EHR) in a way that is easily accessible. This is especially important in the inpatient setting, where team-based care is common and long-term clinician-patient relationships are the exception. Patients with adult congenital heart disease (ACHD) are a rapidly growing group of young patients whose prognoses are changing quickly with advances in medical technology.1 They face health-care challenges throughout their lives, including the possibility of sudden illness, multiple organ failure, or early death,2 and very few have been formally exposed to palliative care and advance care planning (ACP).3

Where and how documentation of goals-of-care discussions is stored in the EHR vary across institutions and individual clinicians, posing challenges to identification and use. Natural language processing and machine learning methods (NLP/ML) may provide efficient and effective alternative approaches to manual abstraction for identifying goals-of-care documentation in the EHR.4,5 NLP/ML techniques have been used to evaluate documentation in the EHR, including unstructured symptoms, diagnoses,6 and, more recently, goals-of-care discussions.4,7 In this study, we describe the prevalence of EHR documentation of goals-of-care discussions near the end of life for patients with ACHD and evaluate the performance of an NLP/ML algorithm to identify this documentation.

Methods

We retrospectively examined EHR notes for a cohort of decedents (2010–2016) with moderate or complex ACHD.1 Trained clinicians abstracted documentation of goals-of-care discussions in the last six months of life using a standardized data abstraction form. Evidence of goals-of-care discussions included patient and family values or wishes; care choices or code status; prognosis in the context of decision-making; or ACP. Notes reviewed included admission notes, code status notes, progress notes, and discharge summaries created by the primary team (cardiology or cardiothoracic surgery), palliative care, spiritual care, or social work.

We previously developed an NLP/ML algorithm8 that uses mathematical models based on the statistical probability of whether documentation is present to identify goals-of-care EHR documentation. It is guided by the frequency of single-word “tokens” in text, trained and validated across a variety of adult medical specialties and settings. When previously compared with manual abstraction, the algorithm achieved a sensitivity of 74.2% and a specificity of 96.6% for overall per-note evaluation, and 96.3% and 99%, respectively, for inpatient documents.8 ACHD patient data were not included in algorithm development.
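As a rough illustration of what scoring a note by single-word token frequency can look like, the sketch below counts unigrams and passes a weighted sum through a logistic function. It is not the study's algorithm: the weights, bias, threshold, and function names are all hypothetical, invented for this example, and the letter does not describe the actual model or its learned parameters.

```python
import math
import re
from collections import Counter

# Hypothetical unigram weights, invented for illustration only. The study's
# model and its learned parameters are not described in this letter.
TOY_WEIGHTS = {
    "goals": 1.2,
    "wishes": 0.9,
    "prognosis": 0.8,
    "dnr": 1.5,
    "hospice": 1.1,
    "echocardiogram": -0.6,
}
TOY_BIAS = -2.0


def unigram_counts(text: str) -> Counter:
    """Lowercase the note and count single-word tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def note_probability(text: str) -> float:
    """Weighted token counts passed through a sigmoid to give a value in (0, 1)."""
    counts = unigram_counts(text)
    z = TOY_BIAS + sum(TOY_WEIGHTS.get(tok, 0.0) * n for tok, n in counts.items())
    return 1.0 / (1.0 + math.exp(-z))


def flag_note(text: str, threshold: float = 0.5) -> bool:
    """Flag a note when its score reaches the (assumed) decision threshold."""
    return note_probability(text) >= threshold


goc_note = "Discussed goals of care and prognosis with family; patient wishes DNR."
routine_note = "Echocardiogram shows stable ventricular function."
print(flag_note(goc_note), flag_note(routine_note))
```

A model of this kind can only reward vocabulary it saw during training, so notes written in an unfamiliar style may score low even when they describe a goals-of-care discussion.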

We used descriptive statistics to characterize the prevalence of documentation identified by manual abstraction. We determined the sensitivity and specificity of the algorithm’s ability to correctly identify documentation of goals-of-care discussions compared with manual abstraction in this population. This was examined on a per-patient basis (whether any note was correctly identified for a patient, i.e., "has the patient had a documented goals-of-care discussion?") and on a per-note basis (ability of the algorithm to correctly identify all positive notes rather than "any/none" for a patient), to allow us to understand if the timing or type of note differentially affected algorithm performance.
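The two levels of analysis can be sketched as follows: per-note labels are compared directly, and the same labels are collapsed to one "any/none" flag per patient before the second comparison. The data structures, function names, and the toy labels are assumptions made up for this example and are not study data.

```python
from collections import defaultdict


def any_positive_by_patient(note_labels):
    """Collapse {(patient_id, note_id): bool} to {patient_id: bool} ("any/none")."""
    flags = defaultdict(bool)
    for (patient_id, _note_id), positive in note_labels.items():
        flags[patient_id] = flags[patient_id] or positive
    return dict(flags)


def confusion_counts(reference, predicted):
    """Return (tp, fp, fn, tn), treating `reference` as the gold standard."""
    tp = fp = fn = tn = 0
    for key, truth in reference.items():
        guess = predicted.get(key, False)
        if truth and guess:
            tp += 1
        elif truth:
            fn += 1
        elif guess:
            fp += 1
        else:
            tn += 1
    return tp, fp, fn, tn


# Made-up toy labels: three patients, five notes.
manual = {("A", 1): True, ("A", 2): True, ("B", 1): False,
          ("C", 1): True, ("C", 2): False}
algorithm = {("A", 1): True, ("A", 2): False, ("B", 1): False,
             ("C", 1): False, ("C", 2): False}

per_note = confusion_counts(manual, algorithm)
per_patient = confusion_counts(any_positive_by_patient(manual),
                               any_positive_by_patient(algorithm))
print(per_note, per_patient)
```

In the toy data, patient A counts as a per-patient hit even though one of the two positive notes was missed, which is why per-patient sensitivity can be much higher than per-note sensitivity.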

Results

We studied 62 decedents with moderate (40%) or complex (60%) ACHD. Mean age at death was 48 ± 18 years, 66% were male, and 74% had heart failure. Palliative care consultations were documented for 19 patients (31%), two-thirds of which occurred in the last six months of life.

By manual abstraction, we identified 155 notes for 23 patients (37%) containing documentation of goals-of-care discussions. For 9 patients (39% of those with documentation of a goals-of-care discussion), the first documentation occurred during the final hospitalization before death. Palliative care consultations produced 41 of the 155 notes containing goals-of-care discussion (26%).

The NLP/ML algorithm identified 30 notes for 18 patients (29% of the cohort) containing documentation for the same time period (Figure 1). For 6 patients (33%), the first documentation occurred during the final hospitalization before death. The algorithm found, on average, fewer notes per patient than manual review (2 vs. 7 notes). Unlike manual abstraction, the algorithm did not identify any notes written by the primary team (cardiology or cardiothoracic surgery). The algorithm identified 9 of the 41 palliative care consultation notes documenting goals-of-care discussion (22%).

Figure 1.

Diagram of documentation examined in this study and natural language processing and machine learning algorithm performance.

In per-patient analysis, manual and NLP/ML findings were concordant for 53 patients (85%): 16 patients were identified by both approaches as containing documentation of goals-of-care discussion, and 37 were identified as without documentation. Two patients were incorrectly classified by the algorithm as containing documentation. Therefore, the sensitivity of the algorithm to correctly identify whether a patient had any documentation of a goals-of-care discussion was 70% (16/23; 95% confidence interval [CI], 0.51–0.88) and specificity 95% (37/39; 95% CI, 0.88–1.02).
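The per-patient figures can be reproduced from the counts above: 16 of the 23 patients with manually identified documentation were flagged, and 37 of the remaining 39 patients (62 minus 23) were correctly left unflagged. The letter does not state how the intervals were calculated; the sketch below assumes a normal-approximation (Wald) interval, which matches the reported bounds and also explains an upper limit above 1. The function name is hypothetical.

```python
import math


def proportion_with_wald_ci(successes, total, z=1.96):
    """Return (proportion, lower, upper) using the normal approximation.

    The Wald interval is an assumption here; it is not truncated to [0, 1],
    so the upper bound can exceed 1 when the proportion is close to 1.
    """
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, p - half_width, p + half_width


# Counts taken from the per-patient analysis in the text.
sensitivity = proportion_with_wald_ci(16, 23)
specificity = proportion_with_wald_ci(37, 39)

for name, (p, lo, hi) in (("sensitivity", sensitivity), ("specificity", specificity)):
    print(f"{name}: {p:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

An interval that stays within 0 and 1, such as the Wilson score interval, would be an alternative choice for proportions this close to the boundary.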

In per-note analysis, findings were concordant for 144 notes (50%): Documentation was present for 20 notes (7%) and absent for 124 notes (43%), based on 289 total notes. The algorithm incorrectly identified 10 notes for which manual review did not reveal any documentation. These were a mix of admission notes, discharge notes, and palliative or spiritual care notes. It was not apparent on manual review why these were selected by the algorithm. Therefore, sensitivity in the per-note analysis was 13% (20/155; 95% CI, 0.08–0.18), and specificity 93% (124/134; 95% CI, 0.88–0.97).
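The per-note counts can be checked for internal consistency by rebuilding the two-by-two table from four reported totals: 289 notes reviewed, 155 positive on manual review, 30 flagged by the algorithm, and 20 flagged by both. This is a minimal sketch; the function name and dictionary layout are assumptions for illustration.

```python
def two_by_two(total, reference_pos, predicted_pos, true_pos):
    """Derive the remaining cells of a 2x2 table from its margins and one cell."""
    false_neg = reference_pos - true_pos   # positive on manual review, missed
    false_pos = predicted_pos - true_pos   # flagged, but negative on manual review
    true_neg = total - reference_pos - false_pos
    return {"tp": true_pos, "fp": false_pos, "fn": false_neg, "tn": true_neg}


# Totals taken from the per-note analysis in the text.
table = two_by_two(total=289, reference_pos=155, predicted_pos=30, true_pos=20)

sensitivity = table["tp"] / (table["tp"] + table["fn"])
specificity = table["tn"] / (table["tn"] + table["fp"])
concordance = (table["tp"] + table["tn"]) / 289

print(table)
print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f} "
      f"concordance={concordance:.2f}")
```

The derived cells agree with the text: 10 notes flagged without manual confirmation, 124 notes negative by both approaches, and 144 concordant notes overall.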

Comment

Within the ACHD community, there has been growing interest in identifying and responding to the palliative care needs of our patients. Prior studies show that patients want to know about their future health and are interested in discussing future care, yet exposure to ACP remains limited.3,9 Low prevalence of documented goals-of-care discussions is in line with prior studies9,10 and identifies this as an area of opportunity in ACHD care.

The ability of clinicians to find documentation of a goals-of-care discussion is arguably as important as the occurrence of these discussions. NLP and ML have shown promise in their application to EHR data, yet the details of training and environment can produce a wide range of results. Our algorithm’s performance in this study was not nearly as robust as previously demonstrated, particularly its per-note sensitivity (13%), despite having been trained using data from the same EHR and medical system. The structure of cardiology and cardiothoracic team notes is likely to differ from that of the notes used in training, so these notes were likely assigned a lower probability in the algorithm’s assessment. Furthermore, our results suggest that even palliative care specialists’ notes may vary across populations. These findings emphasize the importance of ensuring that NLP/ML algorithms developed in one population or setting are tested and validated in others.

There are other important limitations to this study. As we only evaluated notes in the last six months of life, documentation of prior discussions would not have been captured. In addition, this study took place within a single academic health-care system with a relatively small patient sample and may not generalize to other settings.

In conclusion, we found a low prevalence of EHR-documented goals-of-care discussions and palliative care consultations in the last six months of life for decedent patients with ACHD. An NLP/ML algorithm designed to identify goals-of-care discussion documentation performed moderately to poorly in this patient population, likely related to differences in algorithm training. These findings provide guidance for diversifying algorithm training, such as exposure to more note types, patient populations, and different types of clinicians. This is particularly important in settings where patient or clinician characteristics may be unique, such as in ACHD. These findings also suggest that both the conduct and documentation of goals-of-care discussions may be important areas for further research and quality improvement.

Disclosures and Acknowledgments

J. M. S. reports grant funding from the National Heart, Lung, and Blood Institute (NHLBI; T32 HL 125195). J. R. C. and C. M. have no disclosures to report. R. Y. L. reports grant funding from the NHLBI (K12 HL137940 and F32 HL142211). R. A. E. reports grant funding from the Cambia Health Foundation. The authors would like to acknowledge James N. Kirkpatrick, MD, James Sibley, BS, Erin K. Kross, MD, and William B. Lober, MD, MS for their contributions to this study.

Contributor Information

Jill M. Steiner, Division of Cardiology, University of Washington, Seattle, Washington, USA; Cambia Palliative Care Center of Excellence, University of Washington, Seattle, Washington, USA.

Christina Morse, Internal Medicine Residency, School of Medicine, University of Washington, Seattle, Washington, USA.

Robert Y. Lee, Cambia Palliative Care Center of Excellence, University of Washington, Seattle, Washington, USA; Division of Pulmonary, Critical Care & Sleep Medicine, University of Washington, Seattle, Washington, USA.

J. Randall Curtis, Cambia Palliative Care Center of Excellence, University of Washington, Seattle, Washington, USA; Division of Pulmonary, Critical Care & Sleep Medicine, University of Washington, Seattle, Washington, USA.

Ruth A. Engelberg, Cambia Palliative Care Center of Excellence, University of Washington, Seattle, Washington, USA; Division of Pulmonary, Critical Care & Sleep Medicine, University of Washington, Seattle, Washington, USA.

References

  • 1. Warnes CA, Williams RG, Bashore TM, et al. ACC/AHA 2008 guidelines for the management of adults with congenital heart disease. J Am Coll Cardiol 2008;52:e143–e263.
  • 2. Steiner JM, Kovacs AH. Adults with congenital heart disease – facing morbidities and uncertain early mortality. Prog Pediatr Cardiol 2018;48:75–81.
  • 3. Tobler D, Greutmann M, Colman JM, Greutmann-Yantiri M, Librach SL, Kovacs AH. Knowledge of and preference for advance care planning by adults with congenital heart disease. Am J Cardiol 2012;109:1797–1800.
  • 4. Lindvall C, Lilley EJ, Zupanc SN, et al. Natural language processing to assess end-of-life quality indicators in cancer patients receiving palliative surgery. J Palliat Med 2019;22:183–187.
  • 5. Udelsman BV, Moseley ET, Sudore RL, Keating NL, Lindvall C. Deep natural language processing identifies variation in care preference documentation. J Pain Symptom Manage 2020;59:1186–1194.e3.
  • 6. Wagholikar KB, Fischer CM, Goodson A, et al. Extraction of ejection fraction from echocardiography notes for constructing a cohort of patients having Heart Failure with Reduced Ejection Fraction (HFrEF). J Med Syst 2018;42:1–12.
  • 7. Lilley EJ, Lindvall C, Lillemoe KD, Tulsky JA, Wiener DC, Cooper Z. Measuring processes of care in palliative surgery: a novel approach using natural language processing. Ann Surg 2018;267:823–825.
  • 8. Lee RY, Lober W, Sibley J, Kross EK, Engelberg RA, Curtis JR. Identifying goals-of-care conversations in the electronic health record using machine learning and natural language processing [Internet]. 2019. Available from https://www.atsjournals.org/doi/pdf/10.1164/ajrccm-conference.2019.199.1_MeetingAbstracts.A1089 Accessed May 10, 2020.
  • 9. Tobler D, Greutmann M, Colman JM, Greutmann-Yantiri M, Librach LS, Kovacs AH. End-of-life care in hospitalized adults with complex congenital heart disease: care delayed, care denied. Palliat Med 2012;26:72–79.
  • 10. Steiner JM, Kirkpatrick JN, Heckbert SR, et al. Hospital resource utilization and presence of advance directives at the end of life for adults with congenital heart disease. Congenit Heart Dis 2018;13:721–727.
