Journal of General Internal Medicine. 2021 Jan 8;36(3):683–690. doi: 10.1007/s11606-020-06211-4

Accuracy of Administrative Database Algorithms for Hospitalized Pneumonia in Adults: a Systematic Review

Vicente F Corrales-Medina 1,2,3, Carl van Walraven 1,2
PMCID: PMC7947096  PMID: 33420557

Abstract

Background

Administrative data algorithms (ADAs) to identify pneumonia cases are commonly used in the analysis of pneumonia burden, trends, etiology, processes of care, outcomes, health care utilization, cost, and response to preventative and therapeutic interventions. However, without a good understanding of the validity of ADAs for pneumonia case identification, an adequate appreciation of this literature is difficult. We systematically reviewed the quality and accuracy of published ADAs to identify adult hospitalized pneumonia cases.

Methods

We reviewed the Medline, EMBase, and Cochrane Central databases through May 2020. All studies describing ADAs for adult hospitalized pneumonia and at least one accuracy statistic were included. Investigators independently extracted information about the sampling frame, reference standard, ADA composition, and ADA accuracy.

Results

Thirteen studies involving 24 ADAs were analyzed. Compliance with a 38-item study-quality assessment tool ranged from 17 to 29 (median, 23; interquartile range [IQR], 20 to 25). Study setting, design, and ADA composition varied extensively. Inclusion criteria of most studies selected for high-risk populations and/or increased pneumonia likelihood. Reference standards with explicit criteria (clinical, laboratory, and/or radiographic) were used in only 4 ADAs. Only 2 ADAs were validated (one internally and one externally). ADA positive predictive values ranged from 35.0 to 96.5% (median, 84.8%; IQR, 65.3 to 89.1%). However, these values are exaggerated for an unselected patient population because pneumonia prevalence in the study cohorts was very high (median, 66%; IQR, 46 to 86%). ADA sensitivities ranged from 31.3 to 97.8% (median, 65.1%; IQR, 52.5 to 72.4%).

Discussion

ADAs for identification of adult pneumonia hospitalizations are highly heterogeneous, poorly validated, and at risk for misclassification bias. Greater standardization in reporting ADA accuracy is required in studies using pneumonia ADA for case identification so that results can be properly interpreted.

Electronic supplementary material

The online version of this article (10.1007/s11606-020-06211-4) contains supplementary material, which is available to authorized users.

KEY WORDS: pneumonia, clinical coding, data accuracy, systematic reviews

INTRODUCTION

Pneumonia is a leading cause of morbidity and mortality worldwide. In Canada and the USA, pneumonia is the 6th and 8th most commonly cited cause of death,1,2 respectively, and the most common infection requiring hospitalization.3–6 In 2013, the annual cost of hospital treatment of pneumonia in the USA approached $9.5 billion.7

Given the public health importance of pneumonia, research is required regarding its burden, trends, etiology, processes of care, outcomes, health care utilization, and cost along with the effect of preventative or therapeutic interventions on these parameters. To accomplish such wide-ranging analyses, researchers and policy-makers often rely on health administrative database algorithms (ADAs) to identify pneumonia cases for their studies.8–10 ADAs are combinations of diagnostic/procedural codes along with patient/hospital demographic information that are used to identify cases of specific diseases using health administrative data. The use of ADAs is common in medical research in general, and in pneumonia research in particular.11 However, most pneumonia analyses using ADAs, even when published in the most influential medical literature and aimed at guiding policy-making and clinical practice, do not address the accuracy of ADAs for pneumonia case identification.12–15 This is important because it is only with a good understanding of the validity of ADAs for pneumonia case identification that the end-users of this literature (i.e., physicians, policy-makers, researchers) can fully appreciate the relevance of these analyses for their practice, policy-making, or research-planning. To our knowledge, no systematic review of the literature has been done regarding the accuracy of ADAs for pneumonia. Thus, the objective of this study was to systematically identify and evaluate the quality of all published ADAs for adult hospitalized pneumonia and assess the overall performance of ADAs for identification of cases with this condition.

METHODS

Data Sources and Search Strategy

With the assistance of a health information specialist, we developed search strategies (Appendix 1) to interrogate three literature databases (Medline, 1946 to June 09, 2020; EMBase, 1947 to June 09, 2020; Cochrane Central, May, 2020) for studies that measured the accuracy of ADAs for the identification of adults assessed in the emergency room or admitted to hospital with pneumonia. Key search terms included “administrative data”, “claims data”, “coding”, “billing data”, “international classification of diseases”, “ICD”, “registry”, and “medical records,” among others. We did not set search restrictions on publication date, language, or completion status. The references of all selected studies were reviewed to identify additional relevant literature that might have been initially missed.

Study Selection and Data Abstraction

To be included in our review, studies had to be published in peer-reviewed journals and convey in their title or abstract that the accuracy of a pneumonia ADA (or a broader infectious diagnosis that could include pneumonia [e.g., “sepsis”]) was measured. The authors reviewed the relevance of all citations returned by our search strategy in duplicate by independently reviewing their titles and abstracts. Discrepancies were resolved by consensus.

Relevant articles were retrieved and reviewed in full-text also in duplicate and independently by the authors to determine final inclusion. To be included, studies had to describe at least one ADA for adult pneumonia requiring hospitalization and its accuracy. The description of the pneumonia ADA had to be detailed enough that it could be replicated by other researchers. We limited our analysis to adults because this is the most commonly studied demographic and because pediatric pneumonias can be very distinct clinically. We focused on cases requiring hospitalization because such cases are the most common pneumonia type studied with ADAs and their severity, presentation, and outcomes are generally distinct from cases assessed and treated in the community. We excluded ADAs that focused on etiology-typified pneumonias (i.e., pneumococcal pneumonia, atypical pneumonia, etc.) because the microbiological cause of most pneumonias is not usually determined with ADAs. We also excluded studies that focused upon hospital-acquired pneumonia because this disease is distinct from pneumonias acquired in the community by virtue of well-defined risk factors, etiologies, management, and outcomes.16 Finally, studies had to report at least one of the following accuracy statistics: positive predictive value (PPV), negative predictive value (NPV), sensitivity, specificity, or positive/negative likelihood ratio.

We used a structured abstraction form (available from researchers upon request) to independently abstract relevant data from the included studies. When more than one eligible pneumonia ADA was presented in a study, all were included in our assessment. Discrepancies were resolved by consensus.

Analysis

The methodological and reporting quality of the studies was assessed using criteria from Benchimol et al.17 To simplify reporting, we summed for each study the number of criteria that were met (out of a maximum of 38). For each study, we extracted the total sample size (defined as the number of people to which the reference standard was applied) and the number of patients classified as having true pneumonia using the study’s reference standard to estimate the study pneumonia prevalence. For each ADA, we abstracted the number of patients to which the ADA was applied, the number of these patients meeting criteria for the ADA, and the number of patients categorized as true pneumonia cases as per the study reference standard. With this information, we constructed 2 × 2 contingency tables to calculate (when possible) the ADA’s sensitivity, specificity, PPV, NPV, positive likelihood ratio (LR+), and negative likelihood ratio (LR−). Prevalence measurements were rounded to the nearest whole percentage whereas accuracy measurements were rounded to one decimal place.
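As a worked illustration of the 2 × 2 calculation described above, the following Python sketch derives the six accuracy statistics from cell counts. The names `Accuracy` and `accuracy_from_2x2` are illustrative assumptions and not part of the review. The sketch also assumes that a cell that was not sampled (for example, algorithm-negative patients in a study that only reviewed algorithm-positive charts) is passed as `None`, so that statistics depending on it are reported as not estimable rather than computed from a zero. The example counts are those shown for Ahmed et al. and Wiese et al. in Table 2.

```python
from dataclasses import dataclass
from typing import Optional


def _add(a: Optional[int], b: Optional[int]) -> Optional[int]:
    """Sum two cell counts; None (cell not sampled) propagates."""
    return None if a is None or b is None else a + b


def _ratio(num: Optional[float], den: Optional[float]) -> Optional[float]:
    """Divide, returning None when an input is missing or the denominator is zero."""
    if num is None or den is None or den == 0:
        return None
    return num / den


@dataclass
class Accuracy:
    sensitivity: Optional[float]
    specificity: Optional[float]
    ppv: Optional[float]
    npv: Optional[float]
    lr_pos: Optional[float]
    lr_neg: Optional[float]


def accuracy_from_2x2(tp: int, fp: int,
                      fn: Optional[int] = None,
                      tn: Optional[int] = None) -> Accuracy:
    """tp/fp: algorithm-positive with/without pneumonia;
    fn/tn: algorithm-negative with/without pneumonia (None if not sampled)."""
    sens = _ratio(tp, _add(tp, fn))
    spec = _ratio(tn, _add(tn, fp))
    ppv = _ratio(tp, _add(tp, fp))
    npv = _ratio(tn, _add(tn, fn))
    # LR+ = sensitivity / (1 - specificity); LR- = (1 - sensitivity) / specificity
    lr_pos = _ratio(sens, None if spec is None else 1 - spec)
    lr_neg = _ratio(None if sens is None else 1 - sens, spec)
    return Accuracy(sens, spec, ppv, npv, lr_pos, lr_neg)


# Counts for Ahmed et al. as shown in Table 2.
ahmed = accuracy_from_2x2(tp=94, fp=26, fn=28, tn=1295)
print(f"{ahmed.sensitivity:.1%} {ahmed.specificity:.1%} "
      f"{ahmed.ppv:.1%} {ahmed.npv:.1%}")  # 77.0% 98.0% 78.3% 97.9%

# Wiese et al. reviewed algorithm-positive cases only, so only PPV is estimable.
wiese = accuracy_from_2x2(tp=328, fp=12)
print(f"{wiese.ppv:.1%}", wiese.sensitivity)  # 96.5% None
```

Likelihood ratios computed this way use unrounded sensitivity and specificity, so they may differ slightly from values derived from rounded percentages.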

RESULTS

After removal of duplicates, 896 citations were screened for study relevance (Fig. 1). Of these, 64 were selected for full-text review. Cross-referencing identified 4 additional articles of interest. In total, 13 studies were included in our final review and analysis.8,10,18–28

Figure 1. Flow chart of literature search and study selection.

Study Description

Studies varied extensively in their design and characteristics (Table 1). Seven studies (54%) were completed in the USA,8,18,19,22,23,26,27 4 (31%) in Europe,10,20,21,25 and 1 each in Australia24 and Canada.28

Table 1. Sampling Frame, Sample, and Reference Standards of Studies

| Study (year) | Country | Sampling frame: scope | Sampling frame: time period | Sampling frame: inclusion criteria | Sample: selection | Sample: N | Sample: pneumonia prevalence* | Reference standard: explicit criteria | Reference standard: application | Reference standard: blinded to code | Reference standard: multiple reviewers | Benchimol score (max. 38) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Wiese (2018)18 | USA | State-wide health care network | 2008–2012 | Age > 50 years, Medicaid eligible and hospitalization coded with pneumonia | Random | 340 | 96% | Yes | Retrospective | No | No | 20 |
| Ahmed (2014)19 | USA | County-wide health care network | 2010 | Admission to ICU and high risk of ARDS | All | 1443 | 8% | No | Prospective | Yes | Yes | 26 |
| Holland-Bill (2014)20 | Denmark | Single center | 2006–2010 | Coded with cancer and hospitalization coded with pneumonia | Random | 93 | 89% | Yes | Retrospective | Yes | No | 22 |
| Meropol (2012)21 | UK | Nation-wide database | 1985–2006 | Hospitalization within 30 days of an outpatient visit coded with pneumonia | Random | 59 | 86% | No | Retrospective | No | No | 17 |
| Yu (2011)22 | USA | State-wide health care network | 1997–2005 | Hospitalization coded with pneumonia | All | 3991 | 60% | No | Retrospective | No | No | 27 |
| Grijalva (2008)23 | USA | State-wide health care network | 1995–2004 | Medicaid eligible, coded with rheumatoid arthritis and hospitalization coded with pneumonia | Random | 161 | 84% | No | Retrospective | Yes | Yes | 20 |
| Skull (2008)24 | Australia | Two hospitals | 2000–2002 | Age ≥ 65 years, hospitalization coded with pneumonia (cases) or not (controls) | All | 3343 | 46% | No | Retrospective | Yes | No | 29 |
| Gedeborg (2007)25 | Sweden | Single center | 1994–1999 | Admission to ICU | All | 7615 | 5% | No | Prospective | Yes | No | 24 |
| van de Garde (2007)10 | Netherlands | Seven hospitals | 2000–2004 | Enrolled in pneumonia clinical trial | All | 293 | 100% | Yes | Prospective | Yes | No | 21 |
| Schneeweiss (2007)26 | USA | State-wide health care network | 2001–2004 | Veteran, hospitalization for ≥ 3 days, hospitalization coded with pneumonia | All | 23 | 70% | Yes | Retrospective | No | No | 25 |
| Aronsky (2005)8 | USA | Single center | 1999–2000 | Admission to ED | All | 10,748 | 2% | No | Retrospective | Yes | Yes | 25 |
| Whittle (1997)27 | USA | Single center | 1989–1990 | Hospitalization coded with pneumonia | Random | 144 | 65% | No | Retrospective | Yes | Yes | 23 |
| Marrie (1987)28 | Canada | Single center | 1984 | Hospitalization with clinical diagnosis of pneumonia or coded with pneumonia | All | 159 | 66% | No | Prospective | Yes | No | 20 |

USA refers to United States. ICU refers to intensive care unit. ARDS refers to acute respiratory distress syndrome. *As per the application of the reference standard. Benchimol score: number of positive items out of the 38 quality items proposed by Benchimol et al.17 (for details of the performance of each study in individual quality items, please see Appendix 2). Validation sample for this study.

Five studies18,21–23,26 involved state- or nation-wide hospital systems whereas the rest involved oligo- or mono-institutional systems. Six studies10,18–20,24,26 included cases that were solely from 2000 onwards whereas the rest included cases that partially8,21–23 or fully25,27,28 pre-dated that time. The inclusion criteria of 9 studies10,18,20–24,26–28 (70%) resulted in populations with a higher risk of pneumonia by virtue of increased age,18,24 admission to the intensive care unit,19,25 high risk of acute respiratory distress syndrome,19 being coded with cancer20 or rheumatoid arthritis,23 being coded with pneumonia in the hospitalization of interest22 or in a preceding outpatient visit,21 or having hospitalizations of ≥ 3-day duration.26 In 5 of these studies,18,20,21,23,26 the inclusion criteria included the actual ADA whose accuracy was to be measured. Overall, only 1 study included a population that was not limited to patients with a pneumonia ADA or with higher risk for pneumonia.8

Study samples were either all-inclusive or randomly generated and ranged in size from 23 to 10,748 (mean 2186, SD 950; median, 293). In the study having the most generalizable inclusion criteria (Aronsky et al.8), pneumonia prevalence was 2%; in the other studies, it ranged from 5 to 100%. Overall, pneumonia prevalence in the 13 studies analyzed ranged from 2 to 100% (median, 66%; IQR, 46 to 86%).

Most of the reference standards used to define pneumonia cases relied upon physician notation of a clinical diagnosis or suspicion of pneumonia in the medical records. Only 4 studies (31%) used reference standards with explicit objective criteria (clinical, laboratory, and/or radiographic).10,18,20,26 Nine studies (69%) stated that the application of their reference standards was blinded8,10,19,20,23–25,27,28; however, the inclusion criteria in 4 of these studies20,23,27,28 required that patients be coded for pneumonia or have a recorded clinical diagnosis of this condition, making effective blinding difficult.

Compliance with the 38 quality items proposed by Benchimol et al.17 varied from 17 to 29 (mean, 23; SD 9; median, 23) (Appendix 2).

Characteristics and Accuracy of Pneumonia ADAs

The 13 studies included in the review measured the accuracy of 24 pneumonia ADAs (Appendix 3). Sixteen ADAs (67%) used the International Classification of Diseases (ICD)-9 coding system,8,10,18,19,22,23,25–28 4 (17%) used ICD-10,20,24,25 3 (13%) used the Diagnosis-Related Group (DRG) system,8,27 and 1 (4%) used the Read coding system (Table 2).21 Twenty-three ADAs8,10,18–20,22–28 (96%) used codes assigned at the hospitalization of interest whereas the remaining ADA21 used codes from a community information system. ADA construction varied extensively, even among ADAs that used the same coding system (Appendix 3). For example, while most ICD-9-based ADAs used the codes 480.x-486.x for pneumonia identification, there was large variation among them with regard to criteria for the positioning of these codes in discharge diagnoses lists (e.g., any position vs. primary position only), the inclusion of or combination with other codes that could also represent pneumonia, and the addition of other criteria beyond the coding system of interest (e.g., radiographic criteria, minimum hospitalization length). Only 1 ADA (based on ICD-9 coding) was validated internally.19 Only 1 ADA (based on DRG coding) was validated externally (although the studies used distinct sampling frameworks and reference standards).8,27
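To make concrete how two ADAs built on the same code range can still differ, here is a minimal Python sketch of a hypothetical ICD-9 rule. It is not the algorithm of any reviewed study: the function names `is_pneumonia_code` and `flags_pneumonia`, the `position` parameter, and the assumption that a discharge record is an ordered list of code strings written with a decimal point and with the primary diagnosis first are all illustrative. It covers only the 480.x-486.x range and the any-position versus primary-position choice described above.

```python
from typing import Sequence


def is_pneumonia_code(code: str) -> bool:
    """True when the three-digit ICD-9 category of `code` lies in 480-486.

    Assumes codes are written with a decimal point (e.g. "482.9" or "486").
    """
    category = code.strip().split(".")[0]
    return category.isdigit() and len(category) == 3 and 480 <= int(category) <= 486


def flags_pneumonia(discharge_codes: Sequence[str], position: str = "any") -> bool:
    """Apply the hypothetical rule to one hospitalization.

    position="primary": only the first-listed diagnosis is considered.
    position="any":     a qualifying code anywhere in the list is enough.
    """
    if position == "primary":
        return bool(discharge_codes) and is_pneumonia_code(discharge_codes[0])
    if position == "any":
        return any(is_pneumonia_code(c) for c in discharge_codes)
    raise ValueError("position must be 'any' or 'primary'")


# A record with a non-pneumonia primary code and pneumonia listed second.
record = ["428.0", "486"]
print(flags_pneumonia(record, position="any"))      # True
print(flags_pneumonia(record, position="primary"))  # False
```

The same hospitalization is a case under one rule and a non-case under the other, which is why accuracy statistics reported for one ADA cannot be assumed to carry over to a variant of it.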

Table 2. Accuracy of Algorithms Reviewed

| Study* | Code system | Validation | Algorithm +, pneumonia + (n) | Algorithm +, pneumonia − (n) | Algorithm −, pneumonia + (n) | Algorithm −, pneumonia − (n) | Sensitivity | Specificity | PPV | NPV | LR+ | LR− | Probability of pneumonia, algorithm + | Probability of pneumonia, algorithm − |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Wiese18 | ICD-9-CM | No | 328 | 12 | 0 | 0 | - | - | 96.5% | - | - | - | - | - |
| Ahmed19 | ICD-9-CM | Yes | 94 | 26 | 28 | 1295 | 77.0% | 98.0% | 78.3% | 97.9% | 38.5 | 0.2 | 49.7% | 0.5% |
| Holland-Bill20 | ICD-10-CM | No | 83 | 10 | 0 | 0 | - | - | 89.2% | - | - | - | - | - |
| Meropol21 | Read | No | 51 | 8 | 0 | 0 | - | - | 86.4% | - | - | - | - | - |
| Yu22 | ICD-9-CM (a) | No | 325 | 33 | 189 | 407 | 63.2% | 92.5% | 90.8% | 68.3% | 8.4 | 0.4 | 17.7% | 1.0% |
| | ICD-9-CM (b) | No | 1214 | 150 | 651 | 870 | 65.1% | 85.3% | 89.0% | 57.2% | 4.4 | 0.4 | 10.1% | 1.0% |
| Grijalva23 | ICD-9-CM (a) | No | 135 | 26 | 0 | 0 | - | - | 83.9% | - | - | - | - | - |
| | ICD-9-CM (b) | No | 103 | 5 | 0 | 0 | - | - | 95.4% | - | - | - | - | - |
| | ICD-9-CM (c) | No | 32 | 21 | 0 | 0 | - | - | 60.4% | - | - | - | - | - |
| Skull24 | ICD-10-CM | No | 1509 | 644 | 34 | 1156 | 97.8% | 64.2% | 70.1% | 97.1% | 2.7 | 0.03 | 6.5% | 0.1% |
| Gedeborg25 | ICD-9-CM (a) | No | 89 | 165 | 96 | 3831 | 48.1% | 95.9% | 35.0% | 97.6% | 11.7 | 0.5 | 23.1% | 1.3% |
| | ICD-9-CM (b) | No | 40 | 62 | 88 | 3991 | 31.3% | 98.5% | 39.2% | 97.8% | 20.9 | 0.7 | 34.9% | 1.8% |
| | ICD-10-CM (c) | No | 116 | 135 | 105 | 3078 | 52.5% | 95.8% | 46.2% | 96.7% | 12.5 | 0.5 | 24.3% | 1.3% |
| | ICD-10-CM (d) | No | 71 | 45 | 115 | 3203 | 38.2% | 98.6% | 61.0% | 96.5% | 27.3 | 0.6 | 41.2% | 1.5% |
| van de Garde10 | ICD-9-CM | No | 212 | 0 | 81 | 0 | 72.4% | - | - | - | - | - | - | - |
| Schneeweiss26 | ICD-9-CM | No | 16 | 7 | 0 | 0 | - | - | 69.6% | - | - | - | - | - |
| Aronsky8 | ICD-9-CM (a) | No | 109 | 20 | 90 | 10,529 | 54.8% | 99.8% | 84.5% | 99.2% | 274.0 | 0.5 | 87.5% | 1.3% |
| | ICD-9-CM (b) | No | 136 | 23 | 63 | 10,526 | 68.3% | 99.8% | 85.5% | 99.4% | 341.5 | 0.3 | 89.8% | 0.8% |
| | ICD-9-CM (c) | No | 139 | 25 | 60 | 10,524 | 69.8% | 99.8% | 84.8% | 99.4% | 349.0 | 0.3 | 89.9% | 0.8% |
| | DRG (d) | No | 89 | 13 | 110 | 10,536 | 44.7% | 99.9% | 87.3% | 99.0% | 447.0 | 0.6 | 92.0% | 1.5% |
| | DRG (e) | Yes§ | 124 | 21 | 75 | 10,528 | 62.3% | 99.8% | 85.5% | 99.3% | 311.5 | 0.4 | 88.9% | 1.0% |
| Whittle27 | ICD-9-CM (a) | No | 79 | 7 | 15 | 43 | 84.0% | 86.0% | 91.9% | 74.1% | 6 | 0.2 | 13.3% | 0.5% |
| | DRG (b) | No | 70 | 5 | 24 | 45 | 74.5% | 90.0% | 93.3% | 65.2% | 7.5 | 0.3 | 16.1% | 0.8% |
| Marrie28 | ICD-9-CM | No | 73 | 54 | 32 | 0 | 69.5% | - | 57.5% | - | - | - | - | - |

*For a detailed description of the algorithms, please refer to Appendix 3. Letters in parentheses, if present, correspond to the same letters in Appendix 3. The probability of pneumonia columns assume a true prevalence of pneumonia in the population of 2.5%. For Ahmed, the data shown correspond to the internal validation analysis of that study. §This algorithm is considered an external validation of the DRG-based algorithm of Whittle’s study.

Pneumonia ADA accuracy varied extensively (Table 2). PPV was the most common accuracy statistic, being measured in 23 (96%) of the 24 ADAs. Values ranged from 35.0 to 96.5% (median 84.8%, IQR 65.3–89.1%). NPV was measured in 15 (63%) of the ADAs, with values ranging from 65.2 to 99.4% (median 98.0%, IQR 90.0–99.8%). Sensitivity was measured in 17 (71%) ADAs with values ranging from 31.3 to 97.8% (median 65.1%, IQR 52.5–72.4). Specificity was measured in 15 ADAs with values ranging from 64.5 to 99.8% (median 98.0%, IQR 91.2–99.8). The LR+ and LR− of the ADAs could be calculated in these latter studies and we used them to estimate the pneumonia probability in patients of a hypothetical population with a true prevalence of pneumonia of 2% (the prevalence measured in the study by Aronsky et al.,8 the most generalizable sampling frame of the included studies). In patients who were ADA positive, pneumonia probabilities ranged from 4.3 to 90.1% (median 19.3%; IQR 8.2–64.4%). In contrast, in patients who were pneumonia ADA negative, pneumonia was essentially excluded with disease probabilities ranging from 0 to 1.4% (median 0.8%; IQR 0.4–1.0%).
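The conversion from a likelihood ratio to a pneumonia probability described above is the standard odds form of Bayes' theorem: pre-test odds multiplied by the LR give post-test odds. The Python sketch below is illustrative only; the function name `post_test_probability` is an assumption, and the example uses the LR+ and LR− tabulated for Ahmed et al. together with the 2.5% prevalence stated in the footnote of Table 2.

```python
def post_test_probability(prevalence: float, likelihood_ratio: float) -> float:
    """Probability of disease after applying a test result with the given LR.

    prevalence:        pre-test probability, strictly between 0 and 1
    likelihood_ratio:  LR+ for a positive result, LR- for a negative result
    """
    if not 0 < prevalence < 1:
        raise ValueError("prevalence must be strictly between 0 and 1")
    if likelihood_ratio < 0:
        raise ValueError("likelihood_ratio must be non-negative")
    pre_test_odds = prevalence / (1 - prevalence)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Ahmed et al. (Table 2): LR+ = 38.5, LR- = 0.2, assumed prevalence 2.5%.
print(f"{post_test_probability(0.025, 38.5):.1%}")  # 49.7%
print(f"{post_test_probability(0.025, 0.2):.1%}")   # 0.5%
```

With a low pre-test probability, even a large LR+ leaves roughly half of algorithm-positive patients without pneumonia in this example, whereas a negative result makes pneumonia very unlikely.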

DISCUSSION

Our systematic review identified 24 published pneumonia ADAs. Despite having the same goal (i.e., to identify adults admitted to hospital with pneumonia), these ADAs varied extensively in their construction, the patient population to which they were applied, the prevalence of pneumonia in these populations, the rigor of the reference standards used to define pneumonia cases, their overall methodological quality, and their accuracy measurements. Of the 24 ADAs identified, only 1 was internally validated19 and only 1 was externally validated.8,27

Our systematic review highlights important points that researchers, health professionals, and policy-makers should consider when studying pneumonia using ADAs and/or assessing literature that uses such a strategy. First, the PPV and NPV should be used cautiously to assess the performance of these ADAs for the identification of pneumonia cases. PPV and NPV values are highly dependent on disease prevalence; therefore, the reported values are only generalizable to populations having similar disease prevalence.29,30 Many of the studies in our review included populations with clinical characteristics and/or sampling frames that notably increased pneumonia prevalence (Table 1). Thus, the PPVs and/or NPVs for the ADAs reported in these studies are not generalizable to less selected populations having a lower prevalence of pneumonia. In fact, only one study (Aronsky et al.)8 in our review involved a population that could be considered representative of unselected adults seeking acute medical care in the emergency department. To avoid biased statistics from varying disease prevalence, one should use sensitivity, specificity, and their corresponding LRs to evaluate ADA performance.29,30 Unfortunately, these statistics are much less commonly reported (Table 2). Second, the median sensitivity from the 17 ADAs in which it was measured was 65.1%, suggesting that analyses using ADAs likely miss a substantial proportion of pneumonia cases. Third, in the absence of prospectively applied explicit criteria to determine pneumonia status (used in only 4 of the studies included in this review),10,19,25,28 studies commonly used manual chart review to determine whether or not physicians noted a clinical impression of pneumonia as the reference standard. This strategy undermines the reliability and reproducibility of these studies because the suspicion and diagnosis of pneumonia in clinical settings frequently rely on a subjective judgment of how likely it is that nonspecific symptoms and clinical, laboratory, and/or imaging findings, together or in isolation, are explained by a pneumonic process. Moreover, clinicians may be under administrative pressure to document a diagnosis even when the degree of certainty for such a diagnosis is suboptimal. Fourth, ADAs frequently involved criteria that were extraneous to the diagnostic codes of interest (Table 1), mostly to increase the probability of pneumonia. If other researchers use the cited diagnostic codes (Appendix 3) in isolation (i.e., without the other additional criteria), the originally reported accuracy statistics will not apply. Finally, only 1 pneumonia ADA8 was an external validation of a previously published pneumonia ADA.27 Therefore, the generalizability of the other 23 ADAs is uncertain. The performance of most predictive models (which ADAs essentially are) deteriorates when they are applied to an external population. Such performance deterioration usually occurs when ADA developers overfit their original model to maximize its accuracy. Hence, external validation of ADAs is key to truly understanding their accuracy and validity for external use.

Despite the above limitations of ADAs for pneumonia, administrative data remain a valuable resource for timely and practical analyses of this condition at large scale. However, to improve the interpretation of the validity and applicability of these analyses, several strategies should be applied. The most robust but also most complex strategy would be the development of a contemporary, accurate, and reproducible pneumonia ADA that allows for the calculation of precise ADA likelihood ratios and that can then be used for the estimation of pneumonia probability in any other population in which pneumonia prevalence can be measured. Developing such an ADA should include unselected populations of hospitalized patients, involve multiple centers in broad jurisdictions, use explicit reference standard criteria to determine pneumonia status, and be designed to accurately measure pneumonia prevalence in the study sampling frame. The resulting pneumonia ADA and its accuracy must then be validated in external populations. Ideally, this “reference” ADA should return a probability of pneumonia instead of a dichotomous outcome (pneumonia present/not present), as this would also allow the use of statistical methods such as bootstrap imputation to account for misclassification bias in future studies using such an ADA.31–33 Clearly, the development of such an ADA would be logistically challenging and consume large amounts of time and resources. An alternative and more practical approach is for all pneumonia ADA studies to, at a minimum, describe their ADA in enough detail that it can be replicated, disclose whether a reference standard was used, describe that reference standard, and report the pneumonia prevalence and ADA accuracy measured in a representative sample of their study population. Preferably, reference standards in such investigations should be based on objective clinical, radiographic, and/or laboratory criteria. End-users of these reports can then use this information to more appropriately interpret the results of these analyses.
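To illustrate why a probability-returning ADA is useful, the sketch below shows a simplified version of the bootstrap imputation idea mentioned above, applied to estimating pneumonia prevalence. It is an assumption-laden illustration and not the procedure of the cited methods papers: the function name `bootstrap_imputed_prevalence`, the use of the median with 2.5th and 97.5th percentiles as the summary, and the toy probabilities are all hypothetical. In each bootstrap sample, patients are resampled with replacement and each patient's pneumonia status is imputed as a random draw from that patient's ADA-predicted probability.

```python
import random
import statistics
from typing import List, Sequence, Tuple


def bootstrap_imputed_prevalence(
    probabilities: Sequence[float], n_boot: int = 1000, seed: int = 0
) -> Tuple[float, float, float]:
    """Return (median, 2.5th percentile, 97.5th percentile) of imputed prevalence."""
    if not probabilities:
        raise ValueError("probabilities must not be empty")
    rng = random.Random(seed)
    n = len(probabilities)
    estimates: List[float] = []
    for _ in range(n_boot):
        # Bootstrap step: resample patients with replacement.
        sample = rng.choices(probabilities, k=n)
        # Imputation step: status is a Bernoulli draw from each probability.
        cases = sum(1 for p in sample if rng.random() < p)
        estimates.append(cases / n)
    estimates.sort()
    lower = estimates[int(0.025 * n_boot)]
    upper = estimates[min(n_boot - 1, int(0.975 * n_boot))]
    return statistics.median(estimates), lower, upper


# Toy cohort: 50 patients with probability 0.9 and 950 with probability 0.05.
toy = [0.9] * 50 + [0.05] * 950
median, lower, upper = bootstrap_imputed_prevalence(toy, n_boot=500)
print(f"{median:.3f} ({lower:.3f} to {upper:.3f})")
```

In this toy cohort the expected prevalence implied by the probabilities is 9.25% (92.5 expected cases among 1000 patients), whereas dichotomizing at a probability of 0.5 would count only the 50 high-probability patients (5.0%). The imputed estimate recovers the low-probability cases that a dichotomous rule misclassifies.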

Limitations of our review include its restriction to articles published in peer-reviewed journals, articles whose objectives (as stated in their title or abstract) included the assessment of ADAs’ accuracy, and articles that explicitly reported at least one accuracy metric of interest. These restrictions may have caused us to miss relevant information outside the peer-reviewed literature or articles that addressed the accuracy of ADAs only tangentially. Also, as administrative database studies are not always accurately indexed in the literature, relevant articles may not have been captured by our literature search. Information about the mode of data extraction (manual vs. automated) for each ADA was not consistently reported in the reviewed papers and we could not assess the potential impact of this variable on ADA accuracy (for example, manual extraction could have introduced further bias or error from the person or people carrying out the data extraction). Most of the ADAs in our review were based on the ICD (9th and 10th versions) diagnosis coding system. Therefore, whether ADAs based on other coding systems show similar accuracy to ICD-based ADAs remains unclear. Also, our review did not account for differences in the source of administrative data (local hospital administration, governmental offices, insurance claims, etc.), which could also affect the performance of individual ADAs. Given the remarkable heterogeneity in the reviewed literature (in ADA design, the coding systems used, the use and design of gold standards, and the accuracy metrics reported, among others), we could not perform any meaningful meta-analysis of the data. Nonetheless, such heterogeneity is a principal finding of our review and it stresses the need for more standardized reporting of the accuracy of ADAs in all analyses that use such a strategy for pneumonia case identification.

CONCLUSION

Published studies of ADAs for identification of hospitalized pneumonia cases vary widely in their study setting, study design, ADA composition, use and nature of reference standards, and overall methodological quality. Most of the ADA accuracy statistics derived from these studies were obtained from analyses of populations selected for their high likelihood of pneumonia, and therefore their applicability to less selected populations is questionable. Moving forward, greater standardization in the reporting of ADA accuracy is required in studies that use pneumonia ADAs for case identification so that the end-users of this literature can properly interpret the results of these analyses.

Authors’ Contribution

V.F.C-M and C.v.W equally participated in the study design, data gathering, performance of analyses, interpretation of results, and writing of the manuscript. Both authors approved the final version of the manuscript and take responsibility for its integrity and accuracy.

Funding

This study was supported by a grant from the Department of Medicine at The Ottawa Hospital in Ontario, Canada. The funding source had no role in the design, execution, or reporting of our work.

Compliance with Ethical Standards

Conflict of Interest

The authors declare that they do not have a conflict of interest.

References

  • 1.Centers for Disease Control and Prevention. Leading causes of death. Available at: https://www.cdc.gov/nchs/fastats/leading-causes-of-death.htm. Accessed February 1, 2020.
  • 2.Statistics Canada. Deaths and causes of death, 2015. Available at: https://www150.statcan.gc.ca/n1/daily-quotidien/180223/dq180223c-eng.htm. Accessed February 1, 2020.
  • 3.Prina E, Ranzani OT, Torres A. Community-acquired pneumonia. Lancet. 2015;386(9998):1097–108. doi: 10.1016/S0140-6736(15)60733-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Raut M, Schein J, Mody S, Grant R, Benson C, Olson W. Estimating the economic impact of a half-day reduction in length of hospital stay among patients with community-acquired pneumonia in the US. Curr Med Res Opin. 2009;25(9):2151–7. doi: 10.1185/03007990903102743. [DOI] [PubMed] [Google Scholar]
  • 5.File TM, Jr, Marrie TJ. Burden of community-acquired pneumonia in North American adults. Postgrad Med. 2010;122(2):130–41. doi: 10.3810/pgm.2010.03.2130. [DOI] [PubMed] [Google Scholar]
  • 6.Public Health Agency of Canada. Respiratory Disease in Canada. Available at: http://publications.gc.ca/collections/Collection/H39-593-2001E.pdf. Accessed February 1, 2020.
  • 7.American Thoracic Society. Top 20 Pneumonia Facts—2018. Available at: https://www.thoracic.org/patients/patient-resources/resources/top-pneumonia-facts.pdf. Accessed February 1, 2020.
  • 8.Aronsky D, Haug PJ, Lagor C, Dean NC. Accuracy of administrative data for identifying patients with pneumonia. Am J Med Qual. 2005;20(6):319–28. doi: 10.1177/1062860605280358. [DOI] [PubMed] [Google Scholar]
  • 9.Drahos J, Vanwormer JJ, Greenlee RT, Landgren O, Koshiol J. Accuracy of ICD-9-CM codes in identifying infections of pneumonia and herpes simplex virus in administrative data. Ann Epidemiol. 2013;23(5):291–3. doi: 10.1016/j.annepidem.2013.02.005. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.van de Garde EM, Oosterheert JJ, Bonten M, Kaplan RC, Leufkens HG. International classification of diseases codes showed modest sensitivity for detecting community-acquired pneumonia. J Clin Epidemiol. 2007;60(8):834–8. doi: 10.1016/j.jclinepi.2006.10.018. [DOI] [PubMed] [Google Scholar]
  • 11.van Walraven C, Bennett C, Forster AJ. Administrative database research infrequently used validated diagnostic or procedural codes. J Clin Epidemiol. 2011;64(10):1054–9. doi: 10.1016/j.jclinepi.2011.01.001. [DOI] [PubMed] [Google Scholar]
  • 12.Wadhera RK, Joynt Maddox KE, Wasfy JH, Haneuse S, Shen C, Yeh RW. Association of the Hospital Readmissions Reduction Program With Mortality Among Medicare Beneficiaries Hospitalized for Heart Failure, Acute Myocardial Infarction, and Pneumonia. JAMA. 2018;320(24):2542–52. doi: 10.1001/jama.2018.19232. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Griffin MR, Zhu Y, Moore MR, Whitney CG, Grijalva CG. U.S. hospitalizations for pneumonia after a decade of pneumococcal vaccination. N Eng J Med. 2013;369(2):155–63. doi: 10.1056/NEJMoa1209165. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 14.Herzig SJ, Howell MD, Ngo LH, Marcantonio ER. Acid-suppressive medication use and the risk for hospital-acquired pneumonia. JAMA. 2009;301(20):2120–8. doi: 10.1001/jama.2009.722. [DOI] [PubMed] [Google Scholar]
  • 15.Fry AM, Shay DK, Holman RC, Curns AT, Anderson LJ. Trends in hospitalizations for pneumonia among persons aged 65 years or older in the United States, 1988-2002. JAMA. 2005;294(21):2712–9. doi: 10.1001/jama.294.21.2712. [DOI] [PubMed] [Google Scholar]
  • 16.Kalil AC, Metersky ML, Klompas M, et al. Management of adults with hospital-acquired and ventilator-associated pneumonia: 2016 clinical practice guidelines by the Infectious Diseases Society of America and the American Thoracic Society. Clinl Infect Dis. 2016;63(5):e61–e111. doi: 10.1093/cid/ciw353. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Benchimol EI, Manuel DG, To T, Griffiths AM, Rabeneck L, Guttmann A. Development and use of reporting guidelines for assessing the quality of validation studies of health administrative data. J Clin Epidemiol. 2011;64(8):821–9. doi: 10.1016/j.jclinepi.2010.10.006.
  • 18.Wiese AD, Griffin MR, Stein CM, et al. Validation of discharge diagnosis codes to identify serious infections among middle age and older adults. BMJ Open. 2018;8(6):e020857. doi: 10.1136/bmjopen-2017-020857.
  • 19.Ahmed A, Thongprayoon C, Pickering BW, et al. Towards prevention of acute syndromes: electronic identification of at-risk patients during hospital admission. Appl Clin Inform. 2014;5(1):58–72. doi: 10.4338/ACI-2013-07-RA-0045.
  • 20.Holland-Bill L, Xu H, Sorensen HT, et al. Positive predictive value of primary inpatient discharge diagnoses of infection among cancer patients in the Danish National Registry of Patients. Ann Epidemiol. 2014;24(8):593–7. doi: 10.1016/j.annepidem.2014.05.011.
  • 21.Meropol SB, Metlay JP. Accuracy of pneumonia hospital admissions in a primary care electronic medical record database. Pharmacoepidemiol Drug Saf. 2012;21(6):659–65. doi: 10.1002/pds.3207.
  • 22.Yu O, Nelson JC, Bounds L, Jackson LA. Classification algorithms to improve the accuracy of identifying patients hospitalized with community-acquired pneumonia using administrative data. Epidemiol Infect. 2011;139(9):1296–306. doi: 10.1017/S0950268810002529.
  • 23.Grijalva CG, Chung CP, Stein CM, et al. Computerized definitions showed high positive predictive values for identifying hospitalizations for congestive heart failure and selected infections in Medicaid enrollees with rheumatoid arthritis. Pharmacoepidemiol Drug Saf. 2008;17(9):890–5. doi: 10.1002/pds.1625.
  • 24.Skull SA, Andrews RM, Byrnes GB, et al. ICD-10 codes are a valid tool for identification of pneumonia in hospitalized patients aged > or = 65 years. Epidemiol Infect. 2008;136(2):232–40. doi: 10.1017/S0950268807008564.
  • 25.Gedeborg R, Furebring M, Michaelsson K. Diagnosis-dependent misclassification of infections using administrative data variably affected incidence and mortality estimates in ICU patients. J Clin Epidemiol. 2007;60(2):155–62. doi: 10.1016/j.jclinepi.2006.05.013.
  • 26.Schneeweiss S, Robicsek A, Scranton R, Zuckerman D, Solomon DH. Veteran’s affairs hospital discharge databases coded serious bacterial infections accurately. J Clin Epidemiol. 2007;60(4):397–409. doi: 10.1016/j.jclinepi.2006.07.011.
  • 27.Whittle J, Fine MJ, Joyce DZ, et al. Community-acquired pneumonia: can it be defined with claims data? Am J Med Qual. 1997;12(4):187–93. doi: 10.1177/0885713X9701200404.
  • 28.Marrie TJ, Durant H, Sealy E. Pneumonia--the quality of medical records data. Med Care. 1987;25(1):20–4. doi: 10.1097/00005650-198701000-00003.
  • 29.Jaeschke R, Guyatt GH, Sackett DL. Users’ guides to the medical literature. III. How to use an article about a diagnostic test. B. What are the results and will they help me in caring for my patients? The Evidence-Based Medicine Working Group. JAMA. 1994;271:703–7. doi: 10.1001/jama.1994.03510330081039.
  • 30.Jaeschke R, Guyatt G, Sackett DL. Users’ guides to the medical literature. III. How to use an article about a diagnostic test. A. Are the results of the study valid? Evidence-Based Medicine Working Group. JAMA. 1994;271:389–91. doi: 10.1001/jama.1994.03510290071040.
  • 31.van Walraven C. Improved Correction of Misclassification Bias With Bootstrap Imputation. Med Care. 2018;56(7):e39–e45. doi: 10.1097/MLR.0000000000000787.
  • 32.van Walraven C. A comparison of methods to correct for misclassification bias from administrative database diagnostic codes. Int J Epidemiol. 2018;47(2):605–16. doi: 10.1093/ije/dyx253.
  • 33.van Walraven C. Bootstrap imputation with a disease probability model minimized bias from misclassification due to administrative database codes. J Clin Epidemiol. 2017;84:114–20. doi: 10.1016/j.jclinepi.2017.01.007.

Associated Data

Supplementary Materials

ESM 1 (DOCX, 59.2 KB)

ESM 2 (DOC, 65.5 KB)


Articles from Journal of General Internal Medicine are provided here courtesy of Society of General Internal Medicine