Abstract
Background:
Leveraging “big data” as a means of informing cost-effective care holds potential in triaging high-risk heart failure (HF) patients for interventions within hospitals seeking to reduce 30-day readmissions.
Objective:
Explore providers’ beliefs and perceptions about using an electronic health record (EHR)-based tool that uses unstructured clinical notes to risk-stratify high-risk heart failure patients.
Methods:
Six providers from an inpatient HF clinic within an urban safety net hospital were recruited to participate in a semistructured focus group. A facilitator led a discussion on the feasibility and value of using an EHR tool driven by unstructured clinical notes to help identify high-risk patients. Data collected from transcripts were analyzed using a thematic analysis that facilitated drawing conclusions clustered around categories and themes.
Results:
From six categories emerged two themes: (1) challenges of finding valid and accurate results, and (2) strategies used to overcome these challenges. Although employing a tool that uses electronic medical record (EMR) unstructured text as the benchmark by which to identify high-risk patients is efficient, choosing appropriate benchmark groups could be challenging given the multiple causes of readmission. Strategies to mitigate these challenges include establishing clear selection criteria to guide benchmark group composition, and quality outcome goals for the hospital.
Conclusion:
Prior to implementing into practice an innovative EMR-based case-finder driven by unstructured clinical notes, providers are advised to do the following: (1) define patient quality outcome goals, (2) establish criteria by which to guide benchmark selection, and (3) verify the tool’s validity and reliability. Achieving consensus on these issues would be necessary for this innovative EHR-based tool to effectively improve clinical decision-making and in turn, decrease readmissions for high-risk patients.
Keywords: Automatic Data Processing, Heart Failure, Information Systems, Risk Adjustment
Introduction
Big data analytics is emerging as a promising strategy by which to improve care, save lives, and lower costs.1 Utilizing data from electronic medical records to build predictive models can facilitate identifying patients who would benefit from preventive care.2 Predictive modelling using structured data such as the International Classification of Diseases, Ninth Revision (ICD-9) codes within administrative claims data has been used to predict heart failure (HF) readmissions or inpatient mortality with adequate to excellent discriminative validity.3–5 Adding socioeconomic factors3 or laboratory values4,5 to these models improves the validity even further, demonstrating the value of incorporating data not necessarily available in administrative claims. In addition to using these structured sources of data within administrative claims, using unstructured sources of data within the clinical notes of electronic medical records improves models’ predictive power compared to the use of ICD-9 codes alone.6 Natural language processing (NLP) is a technology that makes it possible to leverage unstructured clinical notes and improve precision and efficiency,7,8 hence serving as an effective approach to data mine clinical decision support data within electronic medical records (EMRs).9
Limitations still exist for advanced data processing. The resources required to operate, understand, and maintain10 these systems make this option currently seem cost prohibitive. Furthermore, NLP’s precision is conditional upon the completeness of algorithms, which can still erroneously omit key phrases if the algorithm does not contain sufficient lexicon to capture particular phrases or symptoms.11 These challenges provide an opportunity to explore alternative innovative risk-stratification approaches that can be integrated into the EMR workflow of clinical settings.
The goal of our project was to explore the value and feasibility of an automated EMR-based case-finder that identifies high-risk patients based upon the “clinical similarity” of their unstructured notes to a group of patients selected to serve as a benchmark. While previous predictive tools calculate patients’ risk based upon a group of specific risk factors chosen a priori, this predictive tool calculates patients’ risk based upon the clinical similarity of an individual’s clinical notes to the entirety of clinical notes contained within the benchmark group of patients. This approach removes the need to use predetermined lists of risk factors or lexicons for NLP. Instead, this approach uses the data included within the unstructured clinical notes of the benchmark group to conceptually create a “search term” by which to identify and rank all other patients within the EMR. Although this approach has been successfully implemented in outpatient settings, little is known about how well it would operate as a tool to assist in decreasing 30-day readmissions within an inpatient HF clinic.
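To make the benchmark-as-search-term idea concrete, the following is a minimal sketch only. The way the tool actually measures “clinical similarity” is not described here, so the sketch assumes pooled word counts and cosine similarity as a stand-in; the function names, the notes, and the patient identifiers are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(note: str) -> list[str]:
    """Lowercase a note and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", note.lower())


def term_counts(notes: list[str]) -> Counter:
    """Pool the word counts of one or more notes into a single vector."""
    counts = Counter()
    for note in notes:
        counts.update(tokenize(note))
    return counts


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(benchmark_notes, candidates):
    """Rank candidate patients by similarity of their notes to the benchmark.

    candidates maps a patient id to that patient's list of notes.
    ASSUMPTION: cosine similarity on raw word counts; the real tool may differ.
    """
    benchmark_vec = term_counts(benchmark_notes)
    scored = [
        (pid, cosine_similarity(term_counts(notes), benchmark_vec))
        for pid, notes in candidates.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical notes, invented for illustration only.
benchmark = [
    "shortness of breath and edema, missed diuretic doses",
    "readmitted with dyspnea and volume overload, nonadherent to diuretic",
]
patients = {
    "A": ["dyspnea and edema, has not taken diuretic for a week"],
    "B": ["ankle sprain after fall, no cardiac complaints"],
}
ranking = rank_by_similarity(benchmark, patients)
```

In this toy example, patient A shares vocabulary with the benchmark notes and therefore ranks above patient B, whose note has no words in common with them. No risk factors or lexicon are specified in advance; the benchmark notes alone define what is searched for.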
Safety net hospitals are more adversely affected by financial penalties associated with pay-for-performance compared to the average hospital12 partially due to the costs of treating populations with relatively lower socioeconomic status and health status.13,14 Therefore, implementing a user-friendly automated case-finder interfaced with the EMR could effectively help urban safety net hospitals triage high-risk patients into interventions designed to reduce readmissions. Yet, because nonclinical factors such as homelessness, drug abuse, and socioeconomic status could create challenges in selecting appropriate benchmark groups, more research is needed to determine if and how these challenges could be overcome prior to implementing such a tool in practice. The objective of this study is to explore providers’ beliefs and perceptions about using an EHR-based tool that uses unstructured clinical notes to risk-stratify high-risk HF patients.
Methods
Focus Groups
Our primary goal was to arrive at a consensus regarding which types of high-risk HF patients would be appropriate to include in the benchmark groups that would ultimately be used to systematically triage clinically similar patients into predischarge interventions to reduce likelihood of readmission. Our secondary goal was to explore providers’ opinions and beliefs regarding the use of unstructured EMR data-mining technology to assist in clinical care decisions, specifically triaging high-risk patients. Given the utility of focus groups in capturing dynamic interactions15 as well as their strength in providing major insights into attitudes, beliefs, and opinions,16 we chose to conduct a semistructured focus group interview, moderated by an experienced discussion leader. We thought this data collection approach would be most effective because of our focused technical questions about unstructured text and risk stratification, and the diverse perspectives offered by a variety of provider types.
We strategically recruited key opinion leaders who regularly used EMR data as part of their job responsibilities of patient care and quality care reporting within the HF clinic. After reaching out to approximately 20 eligible participants, we successfully recruited 2 cardiologists, 2 pharmacists, and 2 nurse practitioners from the inpatient cardiology unit of an urban safety net hospital. This relatively small focus group optimized interaction and discussions about the specific topic of EMR-based data mining used for clinical decision-making. Including physicians, nurse practitioners, and pharmacists increased the heterogeneity of the group and, hence, the likelihood of obtaining multiple perspectives.17
Prior to commencing the 90-minute focus group session, participants were informed of the study objective and provided written consent for study participation. The focus group moderator provided a brief overview of the risk stratification tool’s operational capabilities, the rationale for using the tool, and the potential for using this tool within an inpatient setting to triage high-risk HF patients to predischarge interventions. Following the introduction, clinical pharmacists presented four case studies, after which they asked the participants: “Should these types of patients be included or excluded in a benchmark group, which would then in turn be used to triage clinically similar patients into a beneficial intervention in order to reduce readmissions? Why or why not?” Because the tool operates by using a benchmark group of patients, we intentionally presented case studies instead of risk factors in order to focus our discussion on types of patients. Presenting controversial cases (Table 1) was useful to promote discussion about the nuances of how both clinical and socioeconomic risk factors interact to determine readmission risk. The focus group interview was digitally recorded, and recordings were subsequently transcribed in order to produce the verbatim data used for the qualitative analysis.
Table 1.
Case Studies Used as Examples in the Focus Group
CASE | DESCRIPTION |
---|---|
1. Heart failure (HF) with preserved ejection fraction with valvular heart disease | Patient is a 56-year-old African American male who was admitted for increased shortness of breath and the inability to lie flat. His ejection fraction on admission was 55% with no mention of diastolic function in the notes. Based upon the clinical notes and echocardiogram, it was determined that his severe valvular heart disease was contributing to his HF symptoms. Over a two-year period, the patient was readmitted to the hospital 9 times, 5 of which were within 30 days, 4 of which were determined to be due to HF. |
2. Heart failure with preserved ejection fraction with admissions unrelated to heart failure | A patient with grade 3 diastolic dysfunction admitted over 30 times over a course of 2 years. This patient received an HF discharge diagnosis on each admission although the reason for admission was sickle-cell anemia. This patient was also on chemotherapy. |
3. Heart failure due to renal dysfunction | Patient with renal disease admitted for volume management. The echocardiogram showed a normal ejection fraction, and there was no documentation of diastolic dysfunction. Despite these findings, the patient received an HF diagnosis at discharge. |
4. Heart Failure with reduced ejection fraction with frequent readmissions due to social factors | A patient with a documented left ventricular ejection fraction < 40% and who had three 30-day readmissions due to HF over a course of 2 years. Each readmission was related to HF. This patient was either discharged to another facility or left against medical advice during each admission. |
Case Studies
Four cases were pulled from an original pool of 225 patients who were discharged with HF between 2011 and 2012 and who had a 30-day all-cause readmission. These four cases had one or more of the following nine issues identified by the investigators as key factors to consider when selecting a sentinel patient cohort: (1) HF with preserved ejection fraction; (2) concomitant renal disease; (3) history of HF but admission for something unrelated to HF; (4) concomitant substance abuse; (5) other disease states with a high risk for readmission, such as sickle-cell anemia; (6) multiple admissions with low 30-day readmission rates (e.g., 7 admissions within 1 year; only one 30-day readmission); (7) leaving against medical advice; (8) discharge to hospice or other facilities; and (9) receipt of chemotherapy. The four cases are summarized in Table 1.
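As an illustration of how a pool of patients with a 30-day all-cause readmission might be assembled from admission records, the sketch below flags any stay followed by another admission within 30 days of discharge. It is an assumption-laden simplification: the record layout, the function name, and the dates are hypothetical, and formal measure definitions (for example, the handling of planned readmissions or transfers) are not modeled.

```python
from datetime import date


def flag_30_day_readmissions(stays, window_days=30):
    """Return, per patient, the discharge dates of stays followed by a readmission.

    stays: list of (patient_id, admit_date, discharge_date) tuples.
    A stay is flagged if the same patient's next admission begins within
    window_days of its discharge date, for any cause.
    """
    by_patient = {}
    for pid, admit, discharge in stays:
        by_patient.setdefault(pid, []).append((admit, discharge))

    flagged = {}
    for pid, patient_stays in by_patient.items():
        patient_stays.sort()  # chronological order by admission date
        for (_admit, discharge), (next_admit, _) in zip(patient_stays, patient_stays[1:]):
            if 0 <= (next_admit - discharge).days <= window_days:
                flagged.setdefault(pid, []).append(discharge)
    return flagged


# Hypothetical stays, invented for illustration only.
stays = [
    ("P1", date(2011, 3, 1), date(2011, 3, 6)),
    ("P1", date(2011, 3, 20), date(2011, 3, 25)),  # 14 days after prior discharge
    ("P1", date(2011, 9, 1), date(2011, 9, 4)),    # more than 30 days later
    ("P2", date(2012, 1, 10), date(2012, 1, 15)),  # single stay, no readmission
]
readmitted = flag_30_day_readmissions(stays)
```

Here only the first stay of patient P1 is flagged, because the next admission begins 14 days after discharge.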
Qualitative Analysis
Since our primary goal was to gain a broader understanding of providers’ opinions, we used the Framework Method,18 a type of thematic analysis whose goal is to identify commonalities and differences in the data as well as draw descriptive and explanatory conclusions clustered around themes.19 To achieve this, we first coded the data in order to develop categories and then created a matrix of quotes to help identify the emerging themes. To complete the coding, two members of the study team independently coded the transcripts by following an “open coding” approach: comparing data within and across the transcript, and continually asking questions until different categories could be successfully identified.20 Based on these categories, three study team members arrived at a consensus on an analytic framework composed of six broad categories. This framework was then used as a guide to index transcripts in order to sort all key quotes into the six respective categories, after which key quotes were used to populate a six-by-six matrix that was used as a tool to identify additional trends and patterns within the data.19 At the conclusion of this process, two broad themes emerged from the six categories within the analytic framework: (1) challenges affecting the feasibility and value of automated case finders, and (2) strategies proposed by experts to overcome those challenges (Table 2). Analytic tools included Microsoft Excel, and the study was reviewed and approved by the University of Missouri–Kansas City (UMKC) Institutional Review Board.
Table 2.
Final Analytic Framework with Emerging Themes
EMERGING THEMES | ORIGINAL CATEGORIES FROM WHICH THEME EMERGED |
---|---|
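Challenges affecting the feasibility and value of automated case finders | (1) Patients who presented unique barriers to treatment (“special populations”); (2) inconsistent documentation within medical records; (3) the need to identify the overarching goal of patient outcomes |
Strategies proposed by experts to overcome those challenges | (1) Use of inclusion and exclusion criteria to build appropriate benchmark groups; (2) piloting the tool on a benchmark based upon Centers for Medicare & Medicaid Services (CMS) standards; (3) applying the tool to identify gaps in continuity of care |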
Results
The “challenges” theme emerged from three of the six categories contained within the analytic framework: (1) patients who presented unique barriers to treatment, (2) inconsistent documentation within medical records, and (3) needing to identify the overarching goal of patient outcomes. The first category involved the existence of “special populations”: subsets of the patient population that presented unique barriers to treatment. Those populations included homeless patients, patients with drug abuse problems, nonadherent patients, or chemotherapy patients. Though accurately considered at risk for HF, these patient subpopulations likely required additional noncardiology interventions in order to reduce readmissions. Given that we presented this tool within the context of using it to assist in triaging high-risk patients into interventions to prevent readmissions, the issue of special populations was significant because including these types of patients in a benchmark group could result in benchmark groups that may not represent “typical” high-risk patients. Consequently, including “false positives” in the benchmark could decrease the effectiveness of identifying a population ideal for an intervention.
The second category involved the inconsistency of documentation or communication of clinical knowledge between hospital personnel, specifically between providers and medical coders. Although the medical coders’ adjudication process assists in generating a narrowed-down list of high-risk HF patients for the clinicians, some patients may be unintentionally omitted due to inconsistent clinical information within patients’ EMRs. Even if data mining clinical notes, as opposed to ICD-9 codes, could mitigate this issue and identify additional patients for the list, the presence of inconsistent data increases the likelihood of an automated tool misclassifying a patient, regardless of the tool’s sophistication level.
The third category involved the necessity of providers having well-defined goals prior to implementing a tool in the daily workflow that helps systematically identify high-risk patients.
Simply identifying high-risk HF patients would not necessarily help reduce readmissions, especially since the root causes of these readmissions are so diverse. Furthermore, patients seen at this urban safety net hospital may also be struggling with drug abuse and homelessness in addition to HF. Consequently, triaging all high-risk patients into the same intervention will not necessarily be cost-effective, for many may need more social service support instead of clinical support in order to reduce readmissions.
The “strategies” theme emerged from three of the six categories contained within the analytic framework: (1) use of inclusion and exclusion criteria to build appropriate benchmark groups, (2) the importance of piloting this tool on a benchmark based upon Centers for Medicare & Medicaid Services (CMS) standards, and (3) applying this tool to identify gaps in continuity of care. The first category revolved around the importance of using selection criteria to guide the composition of the benchmark group, especially if the tool is used to target patients for specific interventions. Without consistent criteria, patients selected by the tool may be skewed toward a group that would not necessarily benefit from an intervention designed to reduce HF-specific readmissions, and hence would not be an efficient use of provider resources. For example, a patient with severe valve disease should likely be excluded from a benchmark, because this cause of readmission is relatively rare compared to causes such as medication nonadherence. On the other hand, a patient who is being discharged to a long-term care facility should likely be included in the benchmark since postacute care transitions are associated with medication errors and increased risk of readmission.
Similarly, including in the benchmark group patients who either died from HF or experienced multiple readmissions over a short period would be valuable.
The second category revolved around ensuring the validity of the tool. Multiple participants suggested using criteria from an already validated model to determine inclusion in the benchmark group. For example, building a benchmark group of patients based upon those considered high risk by the Acute Decompensated Heart Failure National Registry (ADHERE) prediction model could ensure that the reference population was truly high risk, based upon a previously validated model.
The third category revolved around the potential of using this tool to improve continuity of care in other ways than decreasing readmission. For example, some expressed a concern about patients hospitalized for HF who clearly had a history of HF but only had contact with a primary care physician with no evidence of a cardiac consult prior to admission. By building a benchmark composed of hospitalized patients who missed a cardiac consult, this tool could be valuable in systematically identifying high-risk patients for intervention prior to a readmission occurring.
Discussion
Recommendations: Challenges of Finding Valid and Accurate Results
Prior to embarking upon a risk-stratification initiative with a tool that uses benchmark groups as a strategy to identify at-risk patients, a hospital needs to clearly identify quality improvement goals and to design appropriate interventions that will reach those goals. For example, urban safety net hospitals seeking to reduce HF readmissions could supplement standard of care with ancillary social work services for homeless patients with drug-abuse problems. In contrast, a suburban hospital with a more affluent patient population may achieve a greater return on investment by focusing on supplementing its standard of care with patient education on medication adherence instead of social services. Once providers achieve consensus on quality goals, achieving consensus on inclusion and exclusion criteria for benchmark groups becomes more pragmatic, with fewer assumptions and less uncertainty. In addition to identifying goals, an organization should strive for some baseline data entry standards in order to improve consistency and interpretation across providers that share the same EMR system. The providers, for example, should not need to worry about whether the stratified list provided by coders is incomplete due to data inconsistencies.
Recommendations: Strategies Aimed at Resolving These Challenges
Once a hospital has clearly identified its goals and has a data sharing plan in place, implementing a tool such as this could be extremely valuable, assuming particular strategies are adopted to mitigate the potential challenges. The first strategy would be arriving at a consensus about the actual types of patients to be included in the benchmark. The elegance of this tool is the ability of providers to hand-select the patients from the EMR system to include in the benchmark, in essence creating a “metasearch term” by which to data mine the entire EMR system to pull all patients who have clinical notes similar to the patients in that benchmark. Furthermore, the ability of this tool to subsequently rank this patient list in order of similarity implies that the first patient on the list is the “most similar” to the benchmark patient group. Therefore, by design, this tool incorporates “clinical intuitiveness” by allowing the end user great control in selecting the baseline patient group.
The second strategy would be creating benchmarks composed of patients considered high risk based upon another validated model. This approach would allow the use of a previously validated prediction tool to select the benchmark, after which the similar patients could be prospectively followed to determine actual readmission rates. Future studies therefore need to be conducted to demonstrate the validity of this tool in identifying high-risk patients.
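One simple way such a prospective check could be summarized is to compare the observed readmission rate among the patients ranked most similar to the benchmark against the rate among the remaining patients. The sketch below is only illustrative: the cutoff `k`, the function names, and the patient identifiers are hypothetical, and a real validation study would require an appropriate design and statistical testing.

```python
def readmission_rate(patient_ids, readmitted):
    """Share of the given patients who were observed to be readmitted."""
    if not patient_ids:
        return 0.0
    return sum(1 for pid in patient_ids if pid in readmitted) / len(patient_ids)


def compare_top_k(ranked_ids, readmitted, k):
    """Observed readmission rates for the top-k ranked patients and for the rest."""
    top, rest = ranked_ids[:k], ranked_ids[k:]
    return readmission_rate(top, readmitted), readmission_rate(rest, readmitted)


# Hypothetical ranking (most similar first) and follow-up outcomes.
ranked_ids = ["P7", "P2", "P9", "P4", "P1", "P5"]
observed_readmissions = {"P7", "P9", "P5"}
top_rate, rest_rate = compare_top_k(ranked_ids, observed_readmissions, k=3)
```

In this invented example, two of the three top-ranked patients were readmitted compared with one of the remaining three. A tool with useful discriminative validity would be expected to show a higher rate in the top-ranked group.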
The third strategy would be using this tool to select benchmarks of high-risk patients based upon factors other than readmission risk. For example, a hospital concerned about the continuity of care could build a benchmark comprising all patients with HF readmissions who missed a cardiac consult. This tool’s approach would allow the end user to use any patterns of interacting social and clinical factors unique to this group of patients as a search term to identify all other similar patients in the EMR system.
Strengths and Limitations
Using a semistructured focus group approach, as opposed to one-on-one interviews or observing clinicians within their work environment, leveraged our ability to encourage interaction within a relatively small sample of interdisciplinary providers. Although findings from a small sample may not necessarily be representative of the entire HF clinic, we are confident that our thematic analysis approach was appropriate, given the narrow focus of the discussion and the relatively homogeneous sample. Furthermore, approaching the discussion as an exploratory, hypothesis-seeking exercise led to additional conversations that may not have occurred with individual interviews. Future work beyond a single focus group is needed to inform the design of automated case-finding tools in EMRs.
Conclusions
Despite some concerns about data validity, heterogeneity of target groups, or absence of quality goals, the HF clinical providers practicing within this urban safety net hospital found value in the concept of this tool. Prior to implementing an EMR-based case finder in the workflow of clinical practice, providers need to establish clear patient quality-outcome goals, establish validity in the tool, and agree upon criteria by which to guide benchmark selection. Adopting these strategies in conjunction with this specific tool has the potential to innovate risk stratification by implementing a user-friendly tool directly in daily workflow.
Acknowledgments
We would like to acknowledge the health informatics department at Truman Medical Center and the staff at Quire Data, Inc. for their assistance in this research. This research was funded by a Trailblazer Award through the University of Kansas Medical Center via the NIH National Center for Advancing Translational Sciences (NCATS; grant # UL1TR000001).
References
- 1. Raghupathi W, Raghupathi V. Big data analytics in healthcare: promise and potential. Health information science and systems. 2014;2:3. doi: 10.1186/2047-2501-2-3.
- 2. IBM. IBM big data platform for healthcare. IBM; 2012.
- 3. Amarasingham R, Moore BJ, Tabak YP, et al. An automated model to identify heart failure patients at risk for 30-day readmission or death using electronic medical record data. Medical care. 2010 Nov;48(11):981–988. doi: 10.1097/MLR.0b013e3181ef60d9.
- 4. Tabak YP, Johannes RS, Silber JH. Using automated clinical data for risk adjustment: development and validation of six disease-specific mortality predictive models for pay-for-performance. Medical care. 2007 Aug;45(8):789–805. doi: 10.1097/MLR.0b013e31803d3b41.
- 5. van Walraven C, Wong J, Forster AJ. LACE+ index: extension of a validated index to predict early death or urgent readmission after hospital discharge using administrative data. Open medicine: a peer-reviewed, independent, open-access journal. 2012;6(3):e80–90.
- 6. Zeng QT, Goryachev S, Weiss S, Sordo M, Murphy SN, Lazarus R. Extracting principal diagnosis, co-morbidity and smoking status for asthma research: evaluation of a natural language processing system. BMC medical informatics and decision making. 2006;6:30. doi: 10.1186/1472-6947-6-30.
- 7. Cheng LT, Zheng J, Savova GK, Erickson BJ. Discerning tumor status from unstructured MRI reports--completeness of information in existing reports and utility of automated natural language processing. Journal of digital imaging. 2010 Apr;23(2):119–132. doi: 10.1007/s10278-009-9215-7.
- 8. Dalan D. Clinical data mining and research in the allergy office. Current opinion in allergy and clinical immunology. 2010 Jun;10(3):171–177. doi: 10.1097/ACI.0b013e328337bce6.
- 9. Demner-Fushman D, Chapman WW, McDonald CJ. What can natural language processing do for clinical decision support? Journal of biomedical informatics. 2009 Oct;42(5):760–772. doi: 10.1016/j.jbi.2009.08.007.
- 10. Koh HC, Tan G. Data mining applications in healthcare. Journal of healthcare information management : JHIM. 2005 Spring;19(2):64–72.
- 11. Forbush TB, Gundlapalli AV, Palmer MN, et al. “Sitting on pins and needles”: characterization of symptom descriptions in clinical notes. AMIA Joint Summits on Translational Science proceedings AMIA Summit on Translational Science. 2013;2013:67–71.
- 12. Bhalla R, Kalkut G. Could Medicare readmission policy exacerbate health care system inequity? Annals of internal medicine. 2010 Jan 19;152(2):114–117. doi: 10.7326/0003-4819-152-2-201001190-00185.
- 13. Fonarow GC, Peterson ED. Heart failure performance measures and outcomes: real or illusory gains. JAMA. 2009 Aug 19;302(7):792–794. doi: 10.1001/jama.2009.1180.
- 14. Wang CJ, Conroy KN, Zuckerman B. Payment reform for safety-net institutions--improving quality and outcomes. The New England journal of medicine. 2009 Nov 5;361(19):1821–1823. doi: 10.1056/NEJMp0907656.
- 15. Kitzinger J. Introducing focus groups. In: Mays N, Pope C, editors. Qualitative Research in Health Care. B. M. J. Publishing Group; London: 1996. pp. 36–45.
- 16. Carey M. The group effect in focus groups: planning, implementing, and interpreting focus group research. In: Morse J, editor. Critical issues in qualitative research methods. Sage; London: 1994. pp. 225–241.
- 17. McLafferty I. Focus group interviews as a data collecting strategy. J Adv Nurs. 2004 Oct;48(2):187–94. doi: 10.1111/j.1365-2648.2004.03186.x.
- 18. Ritchie J, Lewis J. Qualitative research practice: a guide for social science students and researchers. London: Sage; 2003.
- 19. Gale NK, Heath G, Cameron E, Rashid S, Redwood S. Using the framework method for the analysis of qualitative data in multi-disciplinary health research. BMC Med Res Methodol. 2013 Sep 18;13:117. doi: 10.1186/1471-2288-13-117.
- 20. Corbin JM, Strauss AL. Basics of qualitative research: Techniques and procedures for developing grounded theory. Sage Publications, Inc; 2008.