Hospital administrative databases are largely derived from discharge abstracts prepared from patient records by trained chart abstraction personnel and are intended primarily for billing and demographic use. Unlike clinical data derived from dedicated research databases, which are often limited in scope, hospital administrative data are generally captured for all patients, often at a large number of hospitals, making these data an appealing source for observational research (1). What many researchers and consumers of the literature fail to appreciate, however, is that research is a secondary use of these data, and they may lose sight of the inherent limitations of these otherwise rich data sources. In this issue of AnnalsATS, Garland and colleagues (pp. 229–235) sought to better understand the accuracy of an administrative hospital database in Manitoba, Canada. In doing so, the authors provide an illustrative example of the perils inherent to research based solely on administrative datasets, which lack the granularity of alternative data sources, such as the electronic health record (EHR) (2). We contend that these findings are likely generalizable elsewhere and provide an important cautionary tale.
In their study, the authors assessed the quality of administrative data in the Canadian Discharge Abstract Database, a database mandated by the Canadian Institute for Health Information. It uses International Classification of Diseases, Tenth Revision, Canadian coding standards (ICD-10-CA) diagnoses and Canadian Classification of Health Interventions procedure codes. To assess its validity, the documented use of three life support modalities was compared with that recorded in a reference database, the Winnipeg ICU database, a prospectively collected database built from daily, manual chart reviews by abstracters, all of whom were trained critical care nurses (3). Data elements collected by the Winnipeg ICU database include daily assessments pertaining to the use of invasive mechanical ventilation, vasoactive medications, and renal replacement therapy, along with a wide range of other important variables. As the accuracy with which hospital administrative datasets capture these extensively studied life support modalities has not been well validated, they are an important focus for improved validation. Interestingly, the authors found that the administrative database was essentially incapable of accurately identifying vasoactive medication infusions, but somewhat better at identifying invasive mechanical ventilation and renal replacement therapy.
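To make this kind of comparison concrete, the following is a minimal sketch of how a binary administrative flag (for example, "invasive mechanical ventilation coded during this admission") could be scored against a reference standard. It is not the authors' analysis: the function name `validate_flag`, the `Accuracy` container, and the toy data are all hypothetical, and it assumes one administrative value and one reference value per admission, already linked in the same order.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Accuracy:
    sensitivity: float
    specificity: float
    ppv: float
    npv: float


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN rather than raising when a cell of the 2x2 table is empty.
    return numerator / denominator if denominator else float("nan")


def validate_flag(admin: Sequence[bool], reference: Sequence[bool]) -> Accuracy:
    """Score an administrative flag against a reference standard.

    Both sequences hold one value per admission, in the same order.
    """
    if len(admin) != len(reference):
        raise ValueError("admin and reference must have the same length")

    pairs = [(bool(a), bool(r)) for a, r in zip(admin, reference)]
    tp = sum(1 for a, r in pairs if a and r)          # coded and truly used
    fp = sum(1 for a, r in pairs if a and not r)      # coded but not used
    fn = sum(1 for a, r in pairs if not a and r)      # used but not coded
    tn = sum(1 for a, r in pairs if not a and not r)  # neither coded nor used

    return Accuracy(
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
        ppv=_ratio(tp, tp + fp),
        npv=_ratio(tn, tn + fn),
    )


# Toy data for eight hypothetical admissions (illustrative only).
admin_flag = [True, False, False, False, True, False, False, False]
reference_flag = [True, True, True, False, False, False, False, False]
result = validate_flag(admin_flag, reference_flag)
```

In this toy example the administrative flag misses two of the three admissions in which the modality was truly used, so sensitivity is low even though specificity remains high, a pattern loosely analogous to the one described for vasoactive medications.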
These findings highlight important concerns that have been raised about administrative data elsewhere. These data have been shown to be frequently inaccurate because of miscalculations at an administrative processing level (4), clinician misclassification or underreporting (5, 6), or limitations of the codes themselves (7), all resulting in failure to capture the true clinical picture. Variability among providers, hospital systems, and countries can translate to significant inconsistency in administrative data, which may, in turn, lead to incorrect conclusions (8).
Considering these and other concerns, we propose an alternative approach. Administrative data are collected for every patient on every admission, as are data in the EHR. Improved integration of EHR-derived data with administrative data, paired with a robust understanding of the constraints inherent in both, has the potential to produce a rich fund of observational data and improve patient care on a large scale. The Canadian Discharge Abstract Database may be an opportune setting for this endeavor, as it is centrally organized and includes nationwide participation. This integration could occur largely with existing infrastructure, particularly if buy-in is achieved from key stakeholders and commercial EHR vendors (9). Integrated EHR and administrative databases would allow for more useful and generalizable information that is not exclusively reliant on diagnostic codes or on trained personnel to extract information.
The value of such integration is particularly evident in the example of vasoactive infusion administration. As Garland and colleagues highlight, the use of vasoactive agents is challenging to identify in administrative data, as are many other important data elements for observational research studies involving critically ill patients. The Canadian Discharge Abstract Database lacks a specific Canadian Classification of Health Interventions procedure code for the use of vasoactive agents, leading the authors to propose using a surrogate marker; in this case, the diagnostic codes for shock. This surrogate marker clearly falls well short of expectations, even failing to identify the use of norepinephrine, a first-line therapy in septic shock (10). In this example, integration with the EHR could not only better identify the administration of vasoactive medications but also supply highly granular and useful data, including duration and dosage of medications, all of which have been shown to be readily available in an EHR (11).
When using large administrative databases for observational research, it is critical to understand the inherent limitations of the underlying data. Ongoing assessments of underlying data quality will improve our ability to use and understand these data sources and to formulate appropriate hypotheses for them. Consumers of data and observational research derived exclusively from administrative data need to be appropriately critical. Moving forward, we need to foster better integration of data from the EHR into administrative data in the hope of significantly improving data quality.
Footnotes
Support was received from National Institutes of Health-National Institute of General Medical Sciences (NIH-NIGMS)-5T32GM108554–06 training grant (C.S.B.) and National Institutes of Health-National Center for Advancing Translational Sciences (NIH-NCATS)-1KL2TR002245 career development grant (R.E.F.).
Author disclosures are available with the text of this article at www.atsjournals.org.
References
- 1. Garland A, Gershengorn HB, Marrie RA, Reider N, Wilcox ME. A practical, global perspective on using administrative data to conduct intensive care unit research. Ann Am Thorac Soc. 2015;12:1373–1386. doi: 10.1513/AnnalsATS.201503-136FR.
- 2. Garland A, Marrie RA, Wunsch H, Yogendran M, Chateau D. Accuracy of administrative hospital data to identify use of life support modalities: a Canadian study. Ann Am Thorac Soc. 2020;17:229–235. doi: 10.1513/AnnalsATS.201902-106OC.
- 3. Fransoo R, Yogendran M, Olafson K, Ramsey C, McGowan KL, Garland A. Constructing episodes of inpatient care: data infrastructure for population-based research. BMC Med Res Methodol. 2012;12:133. doi: 10.1186/1471-2288-12-133.
- 4. Gotfried J, Bernstein M, Ehrlich AC, Friedenberg FK. Administrative database research overestimates the rate of interval colon cancer. J Clin Gastroenterol. 2015;49:483–490. doi: 10.1097/MCG.0000000000000183.
- 5. George J, Newman JM, Ramanathan D, Klika AK, Higuera CA, Barsoum WK. Administrative databases can yield false conclusions-an example of obesity in total joint arthroplasty. J Arthroplasty. 2017;32:S86–S90. doi: 10.1016/j.arth.2017.01.052.
- 6. Grosse SD, Boulet SL, Amendah DD, Oyeku SO. Administrative data sets and health services research on hemoglobinopathies: a review of the literature. Am J Prev Med. 2010;38:S557–S567. doi: 10.1016/j.amepre.2009.12.015.
- 7. Walraven CV. A comparison of methods to correct for misclassification bias from administrative database diagnostic codes. Int J Epidemiol. 2018;47:605–616. doi: 10.1093/ije/dyx253.
- 8. Cooke CR, Iwashyna TJ. Using existing data to address important clinical questions in critical care. Crit Care Med. 2013;41:886–896. doi: 10.1097/CCM.0b013e31827bfc3c.
- 9. Freundlich RE, Wanderer JP, Ehrenfeld JM. Building big datasets: do not forget the EMR. Anesth Analg. 2017;124:1367. doi: 10.1213/ANE.0000000000001809.
- 10. Rhodes A, Evans LE, Alhazzani W, Levy MM, Antonelli M, Ferrer R, et al. Surviving sepsis campaign: international guidelines for management of sepsis and septic shock: 2016. Intensive Care Med. 2017;43:304–377. doi: 10.1007/s00134-017-4683-6.
- 11. Huerta LE, Wanderer JP, Ehrenfeld JM, Freundlich RE, Rice TW, Semler MW; SMART Investigators and the Pragmatic Critical Care Research Group. Validation of a sequential organ failure assessment score using electronic health record data. J Med Syst. 2018;42:199. doi: 10.1007/s10916-018-1060-0.