Author manuscript; available in PMC: 2018 Jul 1.
Published in final edited form as: Circ Cardiovasc Qual Outcomes. 2017 Jul;10(7):e003846. doi: 10.1161/CIRCOUTCOMES.117.003846

With Great Power Comes Great Responsibility: “Big Data” Research from the National Inpatient Sample

Rohan Khera 1, Harlan M Krumholz 2,3,4,5
PMCID: PMC5728376  NIHMSID: NIHMS884870  PMID: 28705865

The use of large administrative databases is transforming clinical cardiovascular research. These sources of “big data” allow the study of practices and outcomes across a spectrum of health systems, providing real-world evidence. However, the design of these databases has peculiarities that require specialized expertise and distinct analytic practices for appropriate interpretation. We discuss these issues in the context of the National Inpatient Sample (NIS), one such dataset used in healthcare research. Compiled annually by the Agency for Healthcare Research and Quality (AHRQ) since 1988, it comprises a large number of inpatient discharges from U.S. community hospitals regardless of payer (~8 million/year), with each observation representing a unique hospitalization.1 Several features of its design and data content are essential to consider when conducting studies with it.

The NIS includes information on patient demographics, administrative codes for primary and secondary diagnoses, procedures, survival to discharge, disposition, hospital charges, and length of stay.1 The NIS can be used to examine the utilization of hospital health services, practice variation, cost, and the impact of health policy interventions in the inpatient setting.1 The data are easily accessible, inexpensive, and can be analyzed using ubiquitous statistical programs. Consequently, research publications from the NIS data have grown rapidly in recent years (Figure 1). Nevertheless, researchers, scientific journals, and their readers may not yet be familiar with the nuances of this complex dataset, and may therefore find it challenging to determine whether the data are interpreted correctly.

Figure 1. Calendar-year trends in publications from the National Inpatient Sample.

The number of peer-reviewed publications from the National Inpatient Sample (NIS) has increased rapidly in recent years. Data from other HCUP datasets are presented for comparison: KID (Kids’ Inpatient Database) and NEDS (Nationwide Emergency Department Sample). Source: HCUP Publications. Healthcare Cost and Utilization Project (HCUP). Agency for Healthcare Research and Quality, Rockville, MD. www.hcup-us.ahrq.gov/reports/pubsearch/pubsearch.jsp.

While not an exhaustive list, we discuss four instances highlighting issues related to this widely used database that should be considered when using it as a scientist, evaluating it as a reviewer, or understanding it as a consumer of scientific studies. We believe that these issues are pervasive in the literature and have identified several studies with similar problems. We have used a few representative examples to illustrate these issues but do not believe it is appropriate to call out particular authors or papers. Nevertheless, we shared the specific studies discussed here with the Editors to have our conclusions verified.

(1) Dynamic sample design

The NIS is constructed using a complex sampling design, and obtaining national estimates requires accounting for clustering at hospitals, for stratification of the sampled data, and for changes in sampling over time.2,3 During 1988–2011, the NIS was constructed annually by including 100% of the discharges from 20% of U.S. hospitals. It was redesigned in 2012 as a 20% national patient-level sample, with non-representative sampling across hospitals.2 Accounting for these changes is essential for an accurate study design. For example, a study using NIS 2003–2012 compared calendar-year trends in rates of an invasive cardiovascular procedure between hospitals with and without a second, more complex operative procedure. This comparison is appropriate within the 2003–2011 data but not with the 2012 data.2 Because the NIS captures only a non-representative fraction of a hospital's discharges after 2011, hospital volumes of either procedure cannot be determined for this period.
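To make concrete what accounting for stratification and hospital-level clustering involves, the following is a minimal sketch of a design-based weighted total with a standard error that treats hospitals as clusters within strata (a with-replacement approximation). It is an illustration under stated assumptions, not the AHRQ-recommended procedure: the record keys `stratum`, `hospital`, `weight`, and `y` are hypothetical stand-ins for the corresponding NIS fields, and the toy records are invented for the example. In practice, the survey-specific commands of a statistical package would be used.

```python
from collections import defaultdict
from math import sqrt


def weighted_total_with_se(records):
    """Return (estimate, standard_error) for a weighted total.

    Each record is a dict with hypothetical keys:
      "stratum"  - sampling stratum label
      "hospital" - cluster (primary sampling unit) label
      "weight"   - discharge-level sampling weight
      "y"        - 1 if the hospitalization has the event of interest, else 0
    """
    # Sum weighted values within each hospital, grouped by stratum.
    # Hospitals with no events still enter with a total of 0.0, so they
    # keep contributing to the variance calculation.
    cluster_totals = defaultdict(lambda: defaultdict(float))
    for rec in records:
        cluster_totals[rec["stratum"]][rec["hospital"]] += rec["weight"] * rec["y"]

    estimate = 0.0
    variance = 0.0
    for stratum, hospitals in cluster_totals.items():
        totals = list(hospitals.values())
        n = len(totals)
        if n < 2:
            raise ValueError(
                f"stratum {stratum!r} has a single hospital; variance is undefined"
            )
        mean = sum(totals) / n
        estimate += sum(totals)
        # Between-hospital variability within the stratum.
        variance += n / (n - 1) * sum((t - mean) ** 2 for t in totals)
    return estimate, sqrt(variance)


# Toy data: two strata, two hospitals each (values are illustrative only).
example = [
    {"stratum": "A", "hospital": "h1", "weight": 5, "y": 1},
    {"stratum": "A", "hospital": "h1", "weight": 5, "y": 0},
    {"stratum": "A", "hospital": "h2", "weight": 5, "y": 1},
    {"stratum": "A", "hospital": "h2", "weight": 5, "y": 1},
    {"stratum": "B", "hospital": "h3", "weight": 4, "y": 0},
    {"stratum": "B", "hospital": "h3", "weight": 4, "y": 1},
    {"stratum": "B", "hospital": "h4", "weight": 4, "y": 0},
    {"stratum": "B", "hospital": "h4", "weight": 4, "y": 0},
]
total, se = weighted_total_with_se(example)
```

Ignoring the hospital grouping and treating every discharge as independent would typically understate the standard error, which is one reason the design variables need to be carried through the analysis.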

(2) Inpatient hospitalization record

The NIS does not identify individual patients, and recurrent hospitalizations appear as distinct observations.3 Further, it does not capture outpatient encounters or observation-only stays, and conditions and procedures occurring across multiple healthcare settings may be underrepresented.1,3 This may be an important consideration in interpreting a study performed in NIS 2001–2011 that reported very low utilization of a routine diagnostic imaging modality that is performed in both inpatient and outpatient settings, and found that hospitalizations without this procedure had higher mortality rates than those in which it was performed. The latter analysis incorrectly assumes that the NIS captures all healthcare records of individual patients, and does not account for other settings where the diagnostic test may have been performed during the same illness episode, either in an outpatient encounter directly preceding the hospitalization or in a recent prior hospitalization. Further, the analysis may be confounded by illness severity. In addition, patients undergoing multiple procedures may not have the procedure code for this simple diagnostic test included in the record, owing to either limited additional reimbursement value or limited space on a claim record.

(3) Volume assessments

Similar to the limited ability to perform hospital-level volume assessment since 2012, the data structure does not allow volume estimates for certain subgroups. First, U.S. states are not a part of the sampling framework of the NIS, and therefore, sampled discharges from a given state are not representative of all discharges from that state.4 A state contributes hospitalizations based on how representative its hospitals and patient population are of the national landscape. Hence, unless a state’s hospital characteristics (ownership, urban/rural location, teaching status, and bedsize) and patient features (diagnosis-related groups), which are components of the NIS sampling methodology, are nationally representative, state-level samples are not representative of the state’s discharges. For example, a study that assesses state-level rates of a specific procedure performed for an acute cardiovascular condition before and after changes in public-reporting regulations in that state, as compared with other states in the NIS, may be biased by the sampling in the respective states. State-to-state comparisons assume representative samples, and are better conducted using databases that have this property. Second, analysis of provider-level volumes is particularly challenging. Studies evaluating volume-outcome associations for procedures performed by individual providers are also not appropriate, since the provider code field in the NIS does not link to a specific procedure and is not reported uniformly across hospitals and states, referring to individual physicians at some hospitals and to physician groups at others.5
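The restriction on hospital-level volumes described in sections (1) and (3) can be encoded as an explicit guard in analysis code, so that a volume calculation fails loudly instead of silently mixing pre- and post-redesign years. The sketch below is only illustrative: the record keys `year`, `hospital`, and `procedures`, the procedure code `"X1"`, and the function name are hypothetical and do not correspond to actual NIS field names.

```python
from collections import Counter

# Per the text, from 2012 onward the NIS is a 20% patient-level sample,
# so a hospital's sampled discharges no longer reflect its full volume.
REDESIGN_YEAR = 2012


def hospital_procedure_volumes(records, procedure_code):
    """Count hospitalizations with `procedure_code` per (year, hospital).

    Raises ValueError if any record comes from a post-redesign year,
    where hospital-level volumes cannot be determined from the sample.
    """
    volumes = Counter()
    for rec in records:
        if rec["year"] >= REDESIGN_YEAR:
            raise ValueError(
                f"year {rec['year']}: hospital volumes are not available "
                f"from {REDESIGN_YEAR} onward"
            )
        if procedure_code in rec["procedures"]:
            volumes[(rec["year"], rec["hospital"])] += 1
    return volumes


# Toy records (illustrative only).
pre_redesign = [
    {"year": 2010, "hospital": "h1", "procedures": {"X1"}},
    {"year": 2010, "hospital": "h1", "procedures": {"X1", "X2"}},
    {"year": 2011, "hospital": "h2", "procedures": {"X2"}},
]
volumes = hospital_procedure_volumes(pre_redesign, "X1")
```

A study spanning the redesign would need to restrict any volume-based comparison to the earlier years or choose a design that does not depend on hospital volume.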

(4) Administrative codes

A final consideration is the identification of disease conditions or procedures based on the descriptive labels of their codes without formal validation. Claim codes that do not directly affect reimbursement may be prone to variation in coding practices. As an example, a study conducted using NIS 1993–2007 found that rates of pulmonary artery hypertension (PAH) hospitalizations declined abruptly during the study period.6 The authors, however, appropriately investigated this trend in other datasets, and inferred that this did not represent a true demographic trend, but was likely due to a recommendation to limit the use of the PAH-specific claim code as a default for all pulmonary hypertension-related hospitalizations during this period. Similarly, using codes that lack a subgroup-specific reimbursement value to identify specific diagnostic subgroups, such as ST-elevation myocardial infarction among all acute myocardial infarction, heart failure with preserved ejection fraction among all heart failure, and in-hospital cardiac arrest, may also be inaccurate and introduce noise or bias. In addition to the primary diagnosis code, secondary diagnoses should also be interpreted with caution, particularly for identifying events that may have occurred during a hospitalization. Since the NIS does not have present-on-admission flags accompanying its secondary diagnosis codes and does not allow longitudinal assessment of patients, most secondary codes may not be sufficiently reliable in distinguishing complications from comorbid conditions. A rigorous literature review for prior validation studies is warranted before conducting such an investigation.

Given the complexity and ever-evolving data structure of the NIS, the AHRQ recommends a careful review of its publicly available documentation.7 This includes details on year-specific data structure,7 statistical best practices,3 and analytic tools.8 In addition, the AHRQ offers ‘HCUPnet’,9 a publicly accessible, web-based portal that provides national estimates for individual administrative diagnosis/procedure codes, which can help with appropriately vetting proposed methodological strategies. Further, it may be prudent for investigators to clarify additional questions directly with the AHRQ, rather than relying solely on the methodology of published studies in the literature.

Finally, we believe that a simple checklist, like the one we propose in Figure 2, may help prevent common errors early in the study-design phase, and improve the validity and generalizability of studies using the NIS. Further, to communicate that best practices are followed, specific information should be highlighted in manuscripts:

[A] Data source: (i) The years of NIS data included, and (ii) if the NIS data structure changed during the study period, how these changes are germane to the study question and how they are addressed.

[B] Research design: It is clearly stated that (i) captured encounters represent hospitalization records, and not distinct patients, (ii) validated administrative codes are used to identify diseases/procedures, or the lack of validation is acknowledged as a study limitation, (iii) outcome assessment is limited to the in-hospital setting, and post-discharge outcomes are not inferred, and (iv) secondary diagnosis codes are not used to infer complications, since these may represent comorbid conditions, unless they are specific for in-hospital events or present-on-admission codes are used.

[C] Data analysis: The study clearly (i) accounts for the survey design of the NIS and its components (clustering, stratification, and weighting), (ii) reports the software program as well as the survey-specific commands used to generate national estimates, and (iii) states how trend analyses are modified to account for changes in data structure.

[D] Data interpretation: The study clearly states that (i) the estimates for disease conditions and/or procedures from the NIS only represent their occurrence in an inpatient setting, and do not account for outpatient occurrences, (ii) an assessment of possible confounding through appropriate statistical models and necessary sensitivity/subgroup analyses was performed, and (iii) the findings of the study were not sensitive to interpreting complications as comorbidities, or vice versa, given the challenges in differentiating the two in administrative data.

In the accompanying publication, Ziaeian and colleagues follow such a checklist in reporting the findings from their study.10 In the future, studies will also need to be clear about how they handled the transition from ICD-9 to ICD-10.

Figure 2. Proposed study-design checklist for studies published using the National Inpatient Sample.

The fields marked with an asterisk (*) may be included as a checklist in published studies.

In summary, with increasing access to and use of the NIS in clinical investigations, there is a potential for errors based on an inadequate understanding of the database design and how it has changed over time. The clinical research community and scientific journals have a responsibility to vet research ideas and to ensure that study results are interpreted in a manner consistent with the design of this otherwise powerful dataset. As more large, complex existing datasets become available, understanding their particular features and limitations will become increasingly important.

Acknowledgments

Funding Sources: Dr. Khera is supported by the National Heart, Lung, and Blood Institute (5T32HL125247-02) and the National Center for Advancing Translational Sciences (UL1TR001105) of the National Institutes of Health.

Footnotes

Disclosures: None.

References
