Abstract
Health information exchange (HIE) provides an essential enhancement to electronic health records (EHRs), allowing information to follow patients across provider organizations. There is also an opportunity to improve public health surveillance, quality measurement, and research through secondary use of HIE data, but data quality presents potential barriers. Our objective was to validate the secondary use of HIE data for two emergency department (ED) quality measures: identification of frequent ED users and early (72-hour) ED returns. We compared concordance of various demographic and encounter data from an HIE for four hospitals to data provided by the hospitals from their EHRs over a two-year period, and then compared measurement of our two quality measures using both HIE and EHR data. We found that, following data cleaning, there was no significant difference in the total counts for frequent ED users or early ED returns for any of the four hospitals (p<0.001).
Introduction
Health information exchange (HIE) is an important complement to electronic health records (EHRs), providing much needed outside clinical information to providers at the bedside.1 Although comprehensive HIE at a state or national level is far from being fully realized, it is increasingly identified as a key element of our nation’s approach to providing 21st century healthcare.2–4 The Health Information Technology for Economic and Clinical Health (HITECH) portion of the American Recovery and Reinvestment Act (ARRA) has promoted the “meaningful use” (MU) of EHRs through three stages, each of which defines standards and provides incentives to compliant hospitals and providers. Whereas stage 1, meaningful use, focused on the technical ability to get information into a shareable electronic form, stages 2 and 3 focus on actual clinical use and driving improved outcomes with a progressively increased focus on HIE.5
HIE is not without challenges, facing questionable financial sustainability, adoption, and usage in many settings.3,4,6–8 That said, HIE use has continued to increase,6 and several recent studies show that the primary clinical use case, in which clinicians review the records of individual patients at the bedside, can help prevent admissions, decrease testing, and ultimately save money.7,9,10 However, studies showing that quality and safety are actually improved by HIE are still needed.3,4,11,12
As HIE expands, and we begin sharing data more broadly, health data is increasingly aggregated into ever larger nodes as we work toward a vision of national HIE. Through this progression, HIE data will increasingly serve as a data source for secondary use cases including care coordination,13 population management,14 public health surveillance,15 broad community-wide quality measurement,16 and research.17 Ultimately these secondary uses may be a strong driving force for improving the safety and quality of patient care.
Because of the context-dependent nature of HIE data quality, meaning data may be of sufficiently high quality for one use case but not for another, HIE data that are “fit for use” in the primary clinical use case may not be “fit for use” in secondary use cases.18 As these secondary uses begin to test the limits of HIE data, the importance of ensuring HIE data quality should not be underestimated.
Background
Currently, EHRs are a common source of electronic data for quality measurement (much of which is mandated by state and federal reporting requirements), for research, and for other secondary uses.5,15,19,20 Because of this, the need for data quality analysis in the setting of EHRs has been recently described, and various methods have been explored for generating and rating data from EHRs to ensure a level of data quality.18,21,22 For HIE, data quality assessment may be even more challenging, since aggregation of multiple data sources into an HIE multiplies the potential for data quality issues.
HIE networks are often initially formed to support the primary clinical use case, without much consideration for secondary use cases at their inception. HIEs often contain both clinical data from ancillary systems (e.g., lab results, diagnostic reports, clinical notes) and registration data from admission, discharge and transfer (ADT) registration systems (e.g. visit dates, patient demographics and other identifiers). For primary clinical use, these data are incorporated from multiple sites so that an individual clinician can view data belonging to an individual patient from multiple provider organizations. Initial HIE implementations may involve minimal validation, cleaning, or transformation. Testing may focus on interfaces between systems, and on assuring the proper display of clinical data at the presentation layer for individual patient data. Because HIE networks are usually designed around the primary clinical use case, data transformation to leverage the deep semantics at each site, mapping to appropriate terminologies to translate the meaning of data across sites, and detailed data quality and validation may be neglected. Once we begin to apply scenarios for the secondary use of HIE data, data quality issues are often discovered.23,24
In this analysis, we describe approaches to data cleaning, measuring data concordance before and after data cleaning, and validation of the secondary use of HIE data for two specific quality measures: identification of frequent emergency department (ED) users and early (72-hour) ED returns.
Methods
Setting
Healthix is a regional health information organization (RHIO) providing HIE to the New York metropolitan area and Long Island.25 Healthix formed in 2012 through the merger of the Long Island Patient Information Exchange (LIPIX) and the New York Clinical Information Exchange (NYCLIX).26 Since then, the Brooklyn Health Information Exchange (BHIX) has also merged with Healthix,27 and as of January 2014 Healthix had 9.2 million unique patients, more than 6,500 users performing more than 10,000 searches per month, and 107 participating organizations with 383 facilities comprising 29,946 acute and extended care beds.
To date, Healthix has enabled HIE primarily through the use of standard HL7 2.x messages,28 written to a common specification, to send data from source systems at multiple provider organizations to edge servers at each site built with a common data structure. Registration information from each site’s ADT system, including patient demographics and master patient index (MPI) functions, is housed in a centralized location.
As part of each site’s implementation of HIE with Healthix, end-to-end interface testing was conducted. Artificial data were entered into test patient records in the site’s source systems, and then followed downstream through the HIE interface, into the edge server, and finally to the display level in the HIE’s portal viewer. At each step, the data were checked for validity against the source system, which consisted of making sure the interfaces were sending data properly and that the information was being properly displayed for clinicians. While this sort of testing is a standard aspect of HIE implementation, it may not detect problems that do not affect data display in the primary use case. For example, missing discharge date/time stamps might not be detected if all that is being tested is the ability to display laboratory and radiology results, but if the data are later employed for secondary use in measuring frequent ED users and early (72-hour) ED returns, then these missing data become much more important. As another example, free text diagnosis data might display properly during testing, but if analyses that require ICD-9 codes are attempted later, they may fail.
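A simple field-completeness profile is one way to surface the kind of gap described above before the data are put to secondary use. The following is a minimal sketch only, not the testing procedure used at Healthix; the record layout and the field names (`visit_id`, `admit_dt`, `discharge_dt`) are hypothetical.

```python
from typing import Iterable, Mapping, Optional


def missing_rate(records: Iterable[Mapping[str, Optional[str]]], field: str) -> float:
    """Return the fraction of records whose `field` is absent, None, or blank."""
    records = list(records)
    if not records:
        return 0.0
    missing = sum(1 for r in records if not (r.get(field) or "").strip())
    return missing / len(records)


# Hypothetical ADT-derived encounter rows (illustrative values, not study data).
encounters = [
    {"visit_id": "a1", "admit_dt": "2010-01-05T08:30", "discharge_dt": "2010-01-05T13:10"},
    {"visit_id": "a2", "admit_dt": "2010-01-06T22:15", "discharge_dt": ""},
    {"visit_id": "a3", "admit_dt": "2010-01-07T02:00", "discharge_dt": None},
    {"visit_id": "a4", "admit_dt": "2010-01-07T09:45", "discharge_dt": "2010-01-07T11:00"},
]

for field in ("admit_dt", "discharge_dt"):
    print(field, missing_rate(encounters, field))
```

A profile like this does not depend on how the data are displayed, so it can flag missing time stamps that display-oriented interface testing would pass.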
Data
EHR and HIE data elements from four hospitals were obtained for all ED visits from 3/1/09 to 2/28/11, including visit number (denotes a unique encounter), medical record number (MRN – denotes a unique patient), admission and discharge date and time, date of birth, and gender. All data were de-identified in accordance with HIPAA prior to analysis by the research team, and the protocol was reviewed by the Mount Sinai Program for the Protection of Human Subjects and given a not human research determination.
Analysis
HIE and EHR data for each site were merged on hashed visit numbers to evaluate concordance of unique encounters, and on hashed MRNs to evaluate concordance of unique patients. In this case, concordance is defined as having a unique match. Next, age, gender and the time-stamps for admissions/discharges were also tested for concordance. In order to adjust for differences between the EHR and HIE data for these latter four parameters, various data cleaning rules were systematically applied. These data cleaning techniques were derived from observation of the major areas of discrepancy noticed when concordance between the two data sets was tested, and through expert evaluation of workflow issues that were likely to have caused these discrepancies at each site. The three data cleaning techniques used were as follows:
Age was considered to match if the dates of birth in the HIE and EHR data differed by one year or less.
Gender was considered to match if it was specified in either the EHR or HIE, and recorded as the same or “unknown” in the other system.
Admit and discharge times were considered to match if the difference in date/time was less than a site-specific window of 6 to 24 hours. The window was chosen for each site by extending the data cleaning factor in 6-hour increments and taking the smallest multiple of 6 hours at which the concordance of encounters first exceeded 98%. This particular data cleaning technique was necessary because differences in clinical and registration staff workflows likely led to small but frequent discrepancies between the two data systems in which admit and discharge times were entered. For instance, when a clinician discharged an ED patient, a date/time stamp was immediately entered in the EHR, but the registration staff member might have waited until the end of his or her shift to remove the patient from the ADT system, causing the ADT date/time stamp to lag behind by a small number of hours.
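The three matching rules can be expressed as small predicates. This is an illustrative sketch rather than the code used in the analysis. It assumes that "one year" can be approximated as 366 days, that gender is stored as a string with the literal value "unknown" when unspecified, and that two "unknown" values do not count as a match because gender is then specified in neither system. The function names are hypothetical.

```python
from datetime import date, datetime, timedelta


def age_matches(dob_ehr: date, dob_hie: date) -> bool:
    """Rule 1: dates of birth differ by one year or less (assumed here to be 366 days)."""
    return abs((dob_ehr - dob_hie).days) <= 366


def gender_matches(g_ehr: str, g_hie: str) -> bool:
    """Rule 2: same value, or specified in one system and 'unknown' in the other."""
    a, b = g_ehr.strip().lower(), g_hie.strip().lower()
    if a == "unknown" and b == "unknown":
        return False  # assumption: gender must be specified in at least one system
    return a == b or "unknown" in (a, b)


def time_matches(t_ehr: datetime, t_hie: datetime, window_hours: int) -> bool:
    """Rule 3: time stamps differ by less than the site-specific window (6 to 24 hours)."""
    return abs(t_ehr - t_hie) < timedelta(hours=window_hours)


# Example: an ADT discharge stamp that lags the EHR stamp by five hours
# matches under a 6-hour window.
ehr_discharge = datetime(2010, 1, 1, 12, 0)
adt_discharge = datetime(2010, 1, 1, 17, 0)
print(time_matches(ehr_discharge, adt_discharge, window_hours=6))
```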
The last part of the data analysis measured frequent ED users (patients with ≥ 4 visits in 30 days) and early (72-hour) ED returns (patients who return for a second ED visit within 72 hours of being discharged). For each hospital, the counts for each of these quality measures were then compared between the HIE and EHR datasets using the chi-square test.
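The two quality measures can be sketched as follows. The definitions leave some details open, so this sketch makes explicit assumptions: "4 visits in 30 days" is read as four or more admissions inside any rolling 30-day window, early returns are counted per return visit rather than per patient, and a visit with no recorded discharge time cannot anchor a 72-hour return. The tuple layout `(patient_id, admit, discharge)` and the function names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def frequent_users(visits, min_visits=4, window=timedelta(days=30)):
    """Return the set of patient IDs with >= min_visits admissions in any rolling window."""
    admits = defaultdict(list)
    for patient_id, admit, _discharge in visits:
        admits[patient_id].append(admit)
    flagged = set()
    for patient_id, times in admits.items():
        times.sort()
        for i in range(len(times) - min_visits + 1):
            if times[i + min_visits - 1] - times[i] <= window:
                flagged.add(patient_id)
                break
    return flagged


def early_returns(visits, limit=timedelta(hours=72)):
    """Count visits that begin within `limit` of the same patient's previous discharge."""
    by_patient = defaultdict(list)
    for patient_id, admit, discharge in visits:
        by_patient[patient_id].append((admit, discharge))
    count = 0
    for stays in by_patient.values():
        stays.sort(key=lambda s: s[0])  # order each patient's visits by admit time
        for (_, prev_discharge), (next_admit, _) in zip(stays, stays[1:]):
            if prev_discharge is None:
                continue  # missing discharge time: the return cannot be computed
            if timedelta(0) <= next_admit - prev_discharge <= limit:
                count += 1
    return count


# Hypothetical visits (illustrative values, not study data).
sample = [
    ("p1", datetime(2010, 1, 1, 10), datetime(2010, 1, 1, 14)),
    ("p1", datetime(2010, 1, 3, 9), datetime(2010, 1, 3, 12)),
    ("p1", datetime(2010, 1, 15, 9), datetime(2010, 1, 15, 12)),
    ("p1", datetime(2010, 1, 28, 9), datetime(2010, 1, 28, 12)),
    ("p2", datetime(2010, 2, 1, 8), None),
    ("p2", datetime(2010, 2, 2, 8), datetime(2010, 2, 2, 10)),
]
print(frequent_users(sample), early_returns(sample))
```

Running the same two functions over the EHR extract and the HIE extract for a site yields the paired counts that are then compared.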
Results
Adjusted values (following data cleaning) were not significantly different between site-specific HIE and EHR datasets (Table 1). There was a high degree of concordance for unique encounters and unique patients between HIE and EHR data sets, so no data cleaning was employed for these two elements.
Table 1.
Data Element | Site 1 Unadj % Matched | Site 1 Adj % Matched | Site 2 Unadj % Matched | Site 2 Adj % Matched | Site 3 Unadj % Matched | Site 3 Adj % Matched | Site 4 Unadj % Matched | Site 4 Adj % Matched |
---|---|---|---|---|---|---|---|---|
Unique Encounters (Visit #) | 99.45 | N/A | 99.27 | N/A | 99.25 | N/A | 99.98 | N/A |
Unique Patients (MRN) | 99.32 | N/A | 99.31 | N/A | 99.31 | N/A | 99.84 | N/A |
Age | 99.71 | 100 | 97.61 | 100 | 97.69 | 100 | 97.94 | 100 |
Gender | 99.25 | 100 | 99.16 | 99.91 | 99.16 | 99.91 | 99.61 | 99.63 |
Admit Date/Time | 0.13 | 99.53 | 5.00 | 99.99 | 2.76 | 99.99 | 53.86 | 100 |
Discharge Date/Time | 2.47 | 98.05 | 49.27 | 99.89 | 47.42 | 99.86 | 94.42 | 99.96 |
The unadjusted match rate for age and gender ranged from 97.61% to 99.71% across sites (std. dev. 0.87%), and once the adjustment criteria were applied, all match rates increased and were greater than 99.6%. The admit and discharge date/time did not match well in the unadjusted data (range 0.13% to 94.42%, std. dev. 34.6%). Data cleaning adjustment allowed for a match in greater than 99.5% of admissions and 98% of discharges. The lowest unadjusted discharge date/time match rate was at Site 1, but only a six-hour data cleaning time frame was needed to bring the match percentage over 98%. Sites 2–4 required a 24-hour adjustment of the time frame, and after adjustment each had match rates greater than 99.85%.
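The site-specific time frame (six hours at Site 1, 24 hours at Sites 2–4) follows from a simple search over multiples of 6 hours. The sketch below illustrates that search under stated assumptions: `pairs` is a hypothetical list of (EHR time stamp, HIE time stamp) tuples for matched encounters, and the function names are not from the original analysis.

```python
from datetime import datetime, timedelta


def match_rate(pairs, window_hours):
    """Fraction of (ehr_time, hie_time) pairs that differ by less than the window."""
    if not pairs:
        raise ValueError("at least one pair of time stamps is required")
    limit = timedelta(hours=window_hours)
    hits = sum(1 for ehr, hie in pairs if abs(ehr - hie) < limit)
    return hits / len(pairs)


def choose_window(pairs, step=6, max_hours=24, target=0.98):
    """Return the smallest multiple of `step` hours whose match rate exceeds `target`.

    Returns None if no window up to `max_hours` reaches the target.
    """
    for hours in range(step, max_hours + 1, step):
        if match_rate(pairs, hours) > target:
            return hours
    return None


# Hypothetical site where half of the HIE stamps lag the EHR by 7 hours.
base = datetime(2010, 1, 1, 12, 0)
lags = [1] * 50 + [7] * 50
pairs = [(base, base + timedelta(hours=h)) for h in lags]
print(choose_window(pairs))
```

Stopping at the first window that clears the threshold keeps the tolerance as tight as the site's workflow allows, which limits the chance of matching time stamps that belong to genuinely different events.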
When we measured the number of frequent ED users (patients with ≥ 4 ED visits in 30 days) and early (72-hour) ED returns, we found a high degree of concordance between HIE and EHR data sets (Table 2). All four sites have EHR counts that do not differ significantly from the HIE counts (p-value < 0.001).
Table 2.
Measure | Site 1 EHR Count | Site 1 HIE Count | Site 2 EHR Count | Site 2 HIE Count | Site 3 EHR Count | Site 3 HIE Count | Site 4 EHR Count | Site 4 HIE Count |
---|---|---|---|---|---|---|---|---|
Frequent Users | 1,204 | 1,221 | 1,060 | 1,035 | 1,746 | 1,708 | 936 | 924 |
72 Hour Returns | 8,299 | 8,456 | 7,237 | 7,093 | 12,243 | 12,045 | 5,476 | 5,431 |
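
To give a sense of scale for the counts in Table 2, the relative difference between the HIE and EHR counts can be computed directly. This is descriptive arithmetic on the published counts only, not a reproduction of the chi-square test; the function name is hypothetical.

```python
# Counts copied from Table 2 as (EHR count, HIE count) per site.
FREQUENT_USERS = {
    "Site 1": (1204, 1221),
    "Site 2": (1060, 1035),
    "Site 3": (1746, 1708),
    "Site 4": (936, 924),
}
RETURNS_72H = {
    "Site 1": (8299, 8456),
    "Site 2": (7237, 7093),
    "Site 3": (12243, 12045),
    "Site 4": (5476, 5431),
}


def relative_difference(ehr: int, hie: int) -> float:
    """Signed difference of the HIE count relative to the EHR count, in percent."""
    return 100.0 * (hie - ehr) / ehr


for label, table in (("Frequent users", FREQUENT_USERS), ("72-hour returns", RETURNS_72H)):
    for site, (ehr, hie) in table.items():
        print(f"{label:16s} {site}: {relative_difference(ehr, hie):+.2f}%")
```

For both measures, the HIE count at every site falls within about 2.5% of the corresponding EHR count.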
Discussion
In preparation for a project measuring frequent ED users and early (72-hour) emergency department (ED) returns across an HIE, this analysis was performed to validate the use of HIE data by comparing it to the electronic health record (EHR) data, since EHRs are often the source of data for these ED quality measures, and much of the encounter data in the HIE comes from ADT source systems. Our analysis shows that through some simple data cleaning and transformation, the level of concordance between EHR and HIE data sources for multiple data elements across four separate hospitals was very high. Furthermore, when we compared the performance of our two specific ED quality measures we found no statistical difference between EHR and HIE data.
There are several likely reasons why the demographic and encounter data were not perfectly concordant between the data sets, and why the counts for frequent ED users and early (72-hour) ED returns were not identical. First, the date and time stamps in the EHR are generally captured by clinicians directly as part of their EHR workflow, whereas the date and time stamps in the HIE were generally captured by registration staff in the sites’ ADT source systems as part of their registration workflow. Differences in the timing of these two workflows likely led to some of the discrepancies. Second, age and gender data are generally captured by registration staff in ADT source systems, and then flow to both the sites’ EHRs and to the HIE. It is possible that subsequent changes were occasionally made directly in the EHRs and not in the ADT systems, causing some of these discrepancies. Third, it is possible that some of the HL7 messages from various sites did not reach the HIE intact because of occasional interface downtimes or other malfunctions, causing a small number of HL7 messages to be either dropped or altered. This may help explain why three of the four hospitals had lower counts for frequent ED users and early ED returns in the HIE data than in the EHR data. Some of the HIE cases were missing a discharge date/time, and for those we were unable to calculate a 72-hour return, so the counts would be lower. Regardless, the discrepancies between the HIE and EHR datasets were minimal in most cases, and in other cases could be addressed by simple data cleaning approaches.
This study has several important limitations. First, these analyses were performed on only a small subset of sites participating in Healthix (four out of more than 50 hospitals with EDs), and without further analysis of more sites, there is no way to determine if similar data quality issues would be encountered at the other sites, or if the data cleaning techniques employed here would suffice. Also, the ED has traditionally been the focus of HIE interventions, but future studies should investigate data quality issues for inpatient and ambulatory domains. Second, some of the data quality issues here might be unique to this geographic region, though similar problems are likely to exist more broadly. Further analyses in other settings would need to be performed to make this determination. Finally, the analyses performed here were limited to HIE taking place using standard HL7 2.x interfaces. There is currently much work being done using newer XML-based continuity of care document approaches in HL7 Version 3 in RHIOs28 and Integrating the Healthcare Enterprise protocols to exchange data between separate HIE networks.29 There will likely be new and different data quality issues and data cleaning requirements that arise when these newer data transport standards are employed.
Secondary use of data gathered in electronic health records is increasing, raising the need for standardized data quality assessment. Without this, the validity of secondary uses that leverage electronic data may be questionable. Some of these secondary uses include quality measurement, chronic disease management and care coordination, population management, public health surveillance and observational comparative effectiveness research.
Health information exchange presents an appealing data source for these secondary uses of electronic data, and has the distinct advantage over EHR data in that it includes data from multiple provider organizations in a region. Patients often visit more than one provider organization, causing their healthcare data to become fragmented.30–32 The use of HIE data for these secondary purposes therefore may more accurately reflect the manner in which patients interact with the healthcare system when compared to individual EHRs as a data source. However, the need for standardized data quality assessment and data cleaning is even greater when HIE serves as the data source because data from multiple sites are aggregated in an HIE, compounding data quality issues evident in individual EHRs. Further work should be done to determine if standardized data assessment and data cleaning techniques in the setting of an HIE can be developed, and if they differ from similar work that is being done at the level of the EHR.
Acknowledgments
Jason Shapiro was supported in part by the Agency for Healthcare Research and Quality (Grant No. 5R01HS021261). The content of this article is solely the responsibility of the authors and does not necessarily represent the official views of the Agency for Healthcare Research and Quality.
References
- 1. State Health Information Exchange Cooperative Agreement Program. 2011. Accessed March 4, 2014, at http://www.healthit.gov/policy-researchers-implementers/state-health-information-exchange.
- 2. Williams C, Mostashari F, Mertz K, Hogin E, Atwal P. From the Office of the National Coordinator: the strategy for advancing the exchange of health information. Health Aff (Millwood) 2012;31:527–36. doi: 10.1377/hlthaff.2011.1314.
- 3. Vest JR. More than just a question of technology: factors related to hospitals’ adoption and implementation of health information exchange. International journal of medical informatics. 2010;79:797–806. doi: 10.1016/j.ijmedinf.2010.09.003.
- 4. Genes N, Shapiro J, Vaidya S, Kuperman G. Adoption of health information exchange by emergency physicians at three urban academic medical centers. Applied clinical informatics. 2011;2:263–9. doi: 10.4338/ACI-2011-02-CR-0010.
- 5. Meaningful Use Definition & Objectives. Available at http://www.healthit.gov/providers-professionals/meaningful-use-definition-objectives.
- 6. Adler-Milstein J, Bates DW, Jha AK. Operational health information exchanges show substantial growth, but long-term funding remains a concern. Health Aff (Millwood) 2013;32:1486–92. doi: 10.1377/hlthaff.2013.0124.
- 7. Vest JR. Health information exchange: national and international approaches. Advances in health care management. 2012;12:3–24. doi: 10.1108/s1474-8231(2012)0000012005.
- 8. Kuperman GJ, McGowan JJ. Potential unintended consequences of health information exchange. J Gen Intern Med. 2013;28:1663–6. doi: 10.1007/s11606-012-2313-0.
- 9. Wilcox AB, Shen S, Dorr DA, Hripcsak G, Heermann L, Narus SP. Improving access to longitudinal patient health information within an emergency department. Applied clinical informatics. 2012;3:290–300. doi: 10.4338/ACI-2011-03-RA-0019.
- 10. Frisse ME, Johnson KB, Nian H, et al. The financial impact of health information exchange on emergency department care. Journal of the American Medical Informatics Association : JAMIA. 2012;19:328–33. doi: 10.1136/amiajnl-2011-000394.
- 11. Altman R, Shapiro JS, Moore T, Kuperman GJ. Notifications of hospital events to outpatient clinicians using health information exchange: a post-implementation survey. Informatics in primary care. 2012;20:249–55. doi: 10.14236/jhi.v20i4.14.
- 12. Rudin R, Volk L, Simon S, Bates D. What Affects Clinicians’ Usage of Health Information Exchange? Applied clinical informatics. 2011;2:250–62. doi: 10.4338/ACI-2011-03-RA-0021.
- 13. Moore T, Shapiro JS, Doles L, et al. Event detection: a clinical notification service on a health information exchange platform. AMIA Annual Symposium Proceedings. 2012;2012:635–42.
- 14. Shapiro JS, Mostashari F, Hripcsak G, Soulakis N, Kuperman G. Using Health Information Exchange to Improve Public Health. American Journal of Public Health. 2011;101:616–23. doi: 10.2105/AJPH.2008.158980.
- 15. Shapiro JS, Genes N, Kuperman G, Chason K, Clinical Advisory Committee H1N1 Working Group NYCIE. Richardson LD. Health information exchange, biosurveillance efforts, and emergency department crowding during the spring 2009 H1N1 outbreak in New York City. Ann Emerg Med. 2010;55:274–9. doi: 10.1016/j.annemergmed.2009.11.026.
- 16. Shapiro JS, Johnson SA, Angiollilo J, Fleischman W, Onyile A, Kuperman G. Health information exchange improves identification of frequent emergency department users. Health Aff (Millwood) 2013;32:2193–8. doi: 10.1377/hlthaff.2013.0167.
- 17. Weiner MG, Embi PJ. Toward Reuse of Clinical Data for Research and Quality Improvement: The End of the Beginning? Annals of Internal Medicine. 2009;151:359–60. doi: 10.7326/0003-4819-151-5-200909010-00141.
- 18. DQC White Paper Draft 1: A consensus-based data quality reporting framework for observational healthcare data. Data Quality Collaborative. http://repository.academyhealth.org/dqc/12013.
- 19. Frieling W. Beyond ‘meaningful use’. Regional health information exchanges just as important to healthcare IT. Modern healthcare. 2009;39:22.
- 20. Wright A, Feblowitz J, Samal L, McCoy AB, Sittig DF. The Medicare Electronic Health Record Incentive Program: provider performance on core and menu measures. Health Serv Res. 2014;49:325–46. doi: 10.1111/1475-6773.12134.
- 21. Weiskopf NG, Weng C. Methods and dimensions of electronic health record data quality assessment: enabling reuse for clinical research. Journal of the American Medical Informatics Association. 2013;20:144–51. doi: 10.1136/amiajnl-2011-000681.
- 22. Kahn MG, Raebel MA, Glanz JM, Riedlinger K, Steiner JF. A Pragmatic Framework for Single-site and Multisite Data Quality Assessment in Electronic Health Record-based Clinical Research. Medical Care. 2012;50:S21–S9. doi: 10.1097/MLR.0b013e318257dd67.
- 23. Shapiro JS OA, Genes N, DiMaggio C, Kuperman G, Richardson LD. Validating health information exchange data for quality measurement. Ann Emerg Med. 2013;62:S94.
- 24. Shapiro JS OO, DiMaggio C, Kuperman G. Validating health information exchange data for quality measurement. In: AMIA, editor. Annu Symp Proc. Washington, D.C.: 2013.
- 25. Healthix. 2013. Accessed February 16, 2014, at https://services.lipixportal.org/HealthixPortal.
- 26. Volpe S. LIPIX + NYCLIX Merge To form: Healthix. EHR PHR Patient Portals with Meaningful Use = Patient Centered Medical Home (PCMH). http://ehrphrpatientportal.blogspot.com/2011/11/lipix-nyclix-merge-to-form-healthix.html. 2011.
- 27. Becker AaVd. Healthix, Inc. and the Brooklyn Health Information Exchange (BHIX) announce plans to merge. https://services.lipixportal.org/Content/resources/RHIOMergerAnnouncment061113.pdf. 2013.
- 28. Health Level Seven International. 2014. Accessed March 6, 2014, at http://www.hl7.org/implement/standards/product_brief.cfm?product_id=185.
- 29. Witting KaJM. Health Information Exchange: Enabling Document Sharing Using IHE Profiles. 2012.
- 30. Bourgeois F OK, Mandl K. Patients treated at multiple acute health care facilities: quantifying information fragmentation. Arch Intern Med. 2010:1989–95. doi: 10.1001/archinternmed.2010.439.
- 31. Finnell J OJ, Grannis S. All health care is not local: an evaluation of the distribution of emergency department care delivered in Indiana. AMIA Annual Symposium Proceedings. Washington, DC: 2011.
- 32. Grinspan ZM AE, Banerjee S, Kern L, Kaushal R, Shapiro JS. Potential value of health information exchange for people with epilepsy: crossover patterns and missing clinical data. In: AMIA, editor. AMIA Annual Symposium Proceedings. Washington, D.C.: 2013. pp. 527–36.