Author manuscript; available in PMC: 2017 Oct 1.
Published in final edited form as: Ann Epidemiol. 2016 Aug 20;26(10):717–722.e1. doi: 10.1016/j.annepidem.2016.08.002

Sepsis surveillance from administrative data in the absence of a perfect verification

S Reza Jafarzadeh a,*, Benjamin S Thomas a,b, Jeff Gill c,d, Victoria J Fraser a, Jonas Marschall a,e, David K Warren a
PMCID: PMC5086291  NIHMSID: NIHMS811639  PMID: 27600804

Abstract

Purpose

Past studies of sepsis epidemiology did not address misclassification bias due to imperfect verification of sepsis detection methods to estimate the true prevalence.

Methods

We examined 273 126 hospitalizations from 2008–2012 at a tertiary-care center to develop surveillance-aimed sepsis detection criteria, based on the presence of the sepsis explicit International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes (995.92 or 785.52), blood culture orders, and antibiotics administration. We used Bayesian multinomial latent class models to estimate the true prevalence of sepsis, while adjusting for the imperfect sensitivity and specificity and the conditional dependence among the individual criteria.

Results

The apparent annual prevalence of sepsis hospitalizations based on explicit ICD-9-CM codes was 1.5%, 1.4%, 1.6%, 2.2%, and 2.5% for the years 2008 to 2012, respectively. Bayesian posterior estimates for the true prevalence of sepsis suggested that it remained stable from 2008, 19.2% (95% credible interval [CI]: 17.9%, 22.9%), to 2012, 17.8% (95% CI: 16.8%, 20.2%). The sensitivity of sepsis-explicit codes, however, increased from 7.6% (95% CI: 6.4%, 8.4%) in 2008 to 13.8% (95% CI: 12.2%, 14.9%) in 2012.

Conclusions

The true prevalence of sepsis remained high but stable, despite an increase in the sensitivity of sepsis-explicit codes in administrative data.

Keywords: Bayesian estimation, No reference standard, Prevalence, Sensitivity, Sepsis, Specificity, Surveillance

Introduction

Sepsis is a major public health problem and is one of the leading causes of death in the United States [1]. The high morbidity of sepsis results in $20.3 billion in annual hospital costs in the United States [2], in addition to the potential costs associated with permanent organ damage, long-term cognitive impairment, and functional disability [3]. The Agency for Healthcare Research and Quality (AHRQ) reported that sepsis was involved in 2.8% of all hospitalizations in 2011 [2].

Sepsis was defined in 1991 by a consensus conference of the American College of Chest Physicians (ACCP) and the Society of Critical Care Medicine (SCCM) as a syndrome of dysregulated inflammatory response to severe infection [4]. The consensus group recognized (and reaffirmed in 2001) the host response, called the systemic inflammatory response syndrome (SIRS), as a result of suspected or confirmed infection, for the definition, as opposed to the presence of a specific infection [4]. The most recent revision (i.e. Sepsis-3) to the consensus definition defines sepsis as a life-threatening organ dysfunction as a result of a dysregulated response to an infectious insult [5]. The diverse causes and clinical manifestations of sepsis, such as pneumonia or urinary tract infection accompanied by organ dysfunction or shock, have created difficulty for surveillance and assessment of quality of care.

Several multicenter studies and national reports in the literature that relied on administrative billing data suggested that the incidence of sepsis has been increasing by about 10% annually [6–14]. Similarly, a recent 5-year study at our tertiary-care center reported a 9.7% annual percent change in hospitalizations with a discharge diagnosis of sepsis [15]. However, a complementary study at our institution did not find an increase in sepsis incidence when we used patient-level data to adjust for the coinciding improvement in the clinical diagnosis of sepsis, its documentation, and administrative coding of sepsis during the same period [16]. These studies demonstrated a lack of a temporal trend in the apparent prevalence (or incidence) of sepsis; however, there has not been an attempt to estimate the true prevalence of sepsis by adjusting for the misclassification bias due to the imperfect accuracy of current sepsis detection using administrative data.

In this study, we developed criteria, referred to as surveillance-aimed sepsis detection (SASD) criteria, to estimate the true prevalence of sepsis from administrative data. In specifying the criteria, we considered some fundamental concepts of a surveillance system such as simplicity of implementation, accuracy (diagnostic sensitivity and specificity), precision (repeatability and reproducibility), timeliness (quick implementation), utility (flexibility and extensibility of methods to evolving settings and conditions), and value (low- or no-cost compared to accrued value) [17]. In devising SASD, we intended the criteria to be applied to aggregate-level data for surveillance purposes, rather than in a clinical setting for an individual patient. Unlike some published studies [14,18–24], we did not assume that our criteria or any other reference or validation method has perfect accuracy. We adapted appropriate analytical techniques to adjust for the misclassification bias due to imperfect verification and to estimate the true prevalence of sepsis, while we coherently incorporated all uncertainties regarding the unknown quantities in our inference [25,26]. Finally, we illustrated the use of methods for surveillance using an imperfect diagnostic criterion and provided open-source program code that can be readily adapted for surveillance of conditions of interest using administrative data or electronic health records.

Methods

Study setting and population

The study population included all inpatient stays for patients, who were 18 years of age or older, admitted to Barnes-Jewish Hospital (BJH), an academic tertiary-care center affiliated with Washington University School of Medicine in St. Louis, Missouri, between January 1, 2008 and December 31, 2012. Administrative data and electronic health records containing clinical, pharmacy, and laboratory data for BJH were available from the BJC HealthCare’s Center for Clinical Excellence and the Center for Biomedical Informatics, a joint partnership between Washington University and BJC HealthCare. The study was approved by the Human Research Protection Office of Washington University School of Medicine, with a waiver of written informed consent.

Description of data

Data included patient information and discharge diagnosis of sepsis of any etiology, identified by the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) diagnosis codes, 995.92 or 785.52, as per the third revision (i.e. Sepsis-3) of the sepsis consensus definition [5]. Other data obtained included all blood cultures performed as well as antibiotic administration during the course of hospitalization. Receipt of antibiotics was considered negative for antibiotic administration routes that are inconsistent with sepsis treatment such as topical ophthalmologic administration, oral rinse, or other topical antibiotics use.

Bayesian inference

We used Bayesian latent class models to estimate the true prevalence of sepsis and the diagnostic sensitivity and specificity of our SASD criteria based on a cross-sectional sampling design. This analytical technique allows estimation of true prevalence, despite being unobserved, from observed data that are subject to misclassification [27]. For an overview of Bayesian methodology, see Christensen et al [28] or Gill [29]. Briefly, Bayesian inference about an unknown quantity (i.e. parameter) such as prevalence or sensitivity starts by specifying a probability distribution, referred to as a prior, on the parameter of interest. The prior is often elicited from expert knowledge or past research, in which case it is referred to as an informative prior, or defined to be diffuse (i.e. less informative) or non-informative, which contains no information (i.e., every possible value of the parameter is equally likely). This prior information is then combined (i.e. updated) with the observed data to obtain a posterior distribution of the parameter, a process based on Bayes' theorem. The posterior distribution can be summarized with point estimates and probability intervals (i.e. quantiles) of the parameter of interest. There are several features of the Bayesian approach that are suited to prevalence estimation in the absence of a perfect verification. The Bayesian approach formally incorporates all uncertainties, for example from expert opinion, or certainties, for example from past research, regarding an unknown parameter through prior specification. Priors also allow parameters to be estimable even when there are not enough degrees of freedom, without the need to put additional constraints on the parameters [30]. Finally, the Bayesian framework directly provides probability intervals and does not need to rely on large-sample approximation [28,29].
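As a minimal illustration of the prior-to-posterior update described above, the following Python sketch uses the conjugate beta-binomial case. It assumes a perfect test, which this study explicitly does not, so the result is a posterior for the apparent (not the true) prevalence of sepsis-explicit codes in 2012. The function names are hypothetical; the counts are the 2012 figures reported in the Results.

```python
def beta_binomial_update(a_prior, b_prior, positives, total):
    """Return the parameters of the Beta posterior after observing binomial data."""
    a_post = a_prior + positives
    b_post = b_prior + (total - positives)
    return a_post, b_post


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Non-informative Beta(1, 1) prior; 1422 coded hospitalizations out of 56 168 in 2012.
# Assumption for illustration only: the codes are treated as a perfect test here.
a_post, b_post = beta_binomial_update(1.0, 1.0, 1422, 56168)
posterior_mean = beta_mean(a_post, b_post)
print(f"Posterior: Beta({a_post:.0f}, {b_post:.0f}); mean = {posterior_mean:.4f}")
```

The latent class model used in this study replaces the single binomial likelihood with a multinomial likelihood over all criteria, so the posterior no longer has a closed form and is obtained by simulation.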

Model for data

We specified a multinomial model, described by Branscum et al [31], for the cross-classified results of the three criteria in the SASD criteria: sepsis-explicit discharge codes, an order for blood cultures, and antibiotic administration during the course of hospitalization. The multinomial sampling distribution is commonly used to model the frequencies corresponding to cross-classified dichotomous diagnostic test outcomes [26,27,31–35]. For the SASD criteria, the data vector consists of frequencies corresponding to the combination of outcomes for the three criteria, i.e. (+++, ++−, +−+, …, −−+, −−−), where (+++) is the number of patients with all criteria present and so forth. We followed the model parameterization of Dendukuri and Joseph [36] to allow conditional dependence (or correlations) between the results of each criterion [34,37]. Specifically, we allowed antibiotic administration and sepsis-explicit coding to be dependent criteria, conditional on true unknown sepsis status, and the order for blood cultures criterion to be independent. Briefly, the model assumes that the observed frequencies in the cross-classified table of SASD criteria results are a realization of data from a multinomial distribution with the corresponding probabilities that are functions of the true sepsis prevalence, sensitivities, specificities, and the conditional covariances between the sensitivities and specificities of the SASD criteria [32]. We emphasize that the diagnostic sensitivity and specificity of the blood cultures order to identify sepsis from administrative data are considered here, and these quantities should not be confused with the analytic sensitivity and specificity of the culture method [25]. Bayesian computations were performed in JAGS [38] version 4.0.1 through the rjags [39] library in R [40] version 3.2.2, and the JAGS codes, adapted from Branscum et al [31], are provided in the Appendix.
All inferences were based on 250 000 iterations thinned from 500 000 after a burn-in of 200 000 iterations. Lack of convergence was assessed using several numerical and graphical diagnostics including Geweke’s statistic, Heidelberger and Welch’s statistic, and the Gelman-Rubin statistic using two chains with distinct initial values, in addition to trace-plots available in R’s coda [41] library.
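The multinomial cell probabilities described above can be sketched in Python as follows. This mirrors the expressions in the Appendix JAGS code (T1 = sepsis-explicit codes, T2 = antibiotics administration, T3 = blood culture order) and is not the code used for the analysis. The function name is hypothetical, the inputs are the 2012 posterior medians from Table 3, and the conditional covariances are set to zero purely as an assumption, because their posterior estimates are not reported.

```python
def cell_probabilities(prev, se1, sp1, se2, sp2, se3, sp3, cov_dp=0.0, cov_dn=0.0):
    """Multinomial cell probabilities for (T1, T2, T3), in the Appendix order.

    T1 and T2 are conditionally dependent through cov_dp (septic) and
    cov_dn (non-septic); T3 is conditionally independent of both.
    """
    # Joint probabilities of (T1, T2) among septic hospitalizations
    septic = {
        (1, 1): se1 * se2 + cov_dp,
        (1, 0): se1 * (1 - se2) - cov_dp,
        (0, 1): (1 - se1) * se2 - cov_dp,
        (0, 0): (1 - se1) * (1 - se2) + cov_dp,
    }
    # Joint probabilities of (T1, T2) among non-septic hospitalizations
    non_septic = {
        (1, 1): (1 - sp1) * (1 - sp2) + cov_dn,
        (1, 0): (1 - sp1) * sp2 - cov_dn,
        (0, 1): sp1 * (1 - sp2) - cov_dn,
        (0, 0): sp1 * sp2 + cov_dn,
    }
    # Cell order p[1]..p[8] as listed in the Appendix comments
    order = [(1, 1, 1), (1, 0, 1), (1, 1, 0), (1, 0, 0),
             (0, 1, 1), (0, 0, 1), (0, 1, 0), (0, 0, 0)]
    probs = []
    for t1, t2, t3 in order:
        p3_septic = se3 if t3 else 1 - se3
        p3_non_septic = (1 - sp3) if t3 else sp3
        probs.append(prev * p3_septic * septic[(t1, t2)]
                     + (1 - prev) * p3_non_septic * non_septic[(t1, t2)])
    return probs


# 2012 posterior medians from Table 3; zero covariances are an illustrative assumption
p = cell_probabilities(prev=0.178, se1=0.138, sp1=0.999,
                       se2=0.978, sp2=0.529, se3=0.957, sp3=0.948)
coded_positive = sum(p[:4])  # marginal probability that T1 is positive
print(f"sum of cells = {sum(p):.6f}, P(T1+) = {coded_positive:.4f}")
```

With these inputs the implied probability of a positive sepsis-explicit code is about 2.5%, in line with the apparent prevalence observed for 2012.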

Priors

We specified beta probability distributions on true sepsis prevalence, sensitivities and specificities of the SASD criteria. We followed Suess et al [42] to construct informative beta priors. To incorporate current knowledge, it only makes sense to have experts think in terms of original data rather than in terms of the parameters of a probability distribution. Experts are often capable of asserting their best estimate/guess of the most likely value for a quantity, based on similar or previous data, and also a value that the truth is unlikely to be above (or below). Alternatively, these two quantities could be derived from past research or chosen to be non-informative. Suess et al [42] provided the exact derivation, which describes how these two inputs are considered as the mode and the 5th or 95th percentile of the corresponding elicited beta distribution. For example, we assumed that the sensitivity of the sepsis-explicit ICD-9-CM codes is most likely around 10% (for example, Iwashyna et al [43] and Whittaker et al [22] reported 9.3% and 20.5% for sepsis, respectively), and we were 95% certain that the sensitivity will not exceed 30%. These two quantities correspond to the Beta(2.56, 15.03) distribution, which has a mean of 0.15 and a variance of 0.01 [42]. Finally, we followed Dendukuri and Joseph [36] in specifying priors on the conditional covariances from uniform distributions that satisfy the possible range of the covariances.
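The elicited prior in the example above can be checked numerically. The Python sketch below is an illustration only (the study itself used the betaExpert function listed in the Appendix); it takes the Beta parameters from the Appendix and recovers the mode, mean, variance, and the probability mass below 0.30. The helper names are hypothetical, and the cumulative probability is approximated with Simpson's rule rather than a library routine.

```python
import math


def beta_pdf(x, a, b):
    """Density of Beta(a, b); returns 0 at the boundaries (valid for a, b > 1)."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def beta_cdf(x, a, b, steps=20000):
    """Approximate P(X <= x) by composite Simpson's rule (steps must be even)."""
    h = x / steps
    total = beta_pdf(0.0, a, b) + beta_pdf(x, a, b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * beta_pdf(i * h, a, b)
    return total * h / 3


# Prior on the sensitivity of sepsis-explicit codes, parameters as in the Appendix
a, b = 2.55936, 15.03424
mode = (a - 1) / (a + b - 2)
mean = a / (a + b)
variance = a * b / ((a + b) ** 2 * (a + b + 1))
mass_below_030 = beta_cdf(0.30, a, b)
print(f"mode={mode:.3f} mean={mean:.3f} variance={variance:.4f} "
      f"P(Se < 0.30)={mass_below_030:.3f}")
```

The mode is 0.10 by construction, and the mean (about 0.145) and variance (about 0.007) round to the values quoted in the text.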

Two additional sets of priors were considered for the sensitivity analysis (Table 1). The priors in sensitivity analysis 1 were constructed similarly to the priors in the primary analysis, but were specified to be either substantially more diffuse (i.e. less informative) or non-informative. The priors in sensitivity analysis 2 were informative and elicited directly from estimates of the previous year, except for 2008, where the priors were identical to those in sensitivity analysis 1. The parameters of beta priors constructed from percentiles were computed using the prevalence [44] library for R software.

Table 1.

Priors for the parameters of the multinomial model for surveillance-aimed sepsis detection criteria.

| Criterion | Parameter | Description; Primary Prior | Alternative Prior for Sensitivity Analysis 1 | Alternative Prior for Sensitivity Analysis 2 |
|---|---|---|---|---|
| True sepsis prevalence | Prev | Mode = 0.05, 95% sure that mode < 0.25; Beta(1.71, 14.48) | Non-informative (b); Beta(1, 1) | Identical to sensitivity analysis 1 for 2008, and elicited from previous year afterwards |
| Sepsis explicit codes (a) | Se | Mode = 0.10, 95% sure that mode < 0.30; Beta(2.56, 15.03) | Non-informative (b); Beta(1, 1) | Same as above |
| Sepsis explicit codes (a) | Sp | Mode = 0.95, 95% sure that mode > 0.80; Beta(21.20, 2.06) | Mode = 0.95, 95% sure that mode > 0.50; Beta(5.38, 1.49) | Same as above |
| Blood culture order | Se | Mode = 0.90, 95% sure that mode > 0.70; Beta(15.03, 2.56) | Mode = 0.95, 95% sure that mode > 0.50; Beta(5.38, 1.49) | Same as above |
| Blood culture order | Sp | Mode = 0.90, 95% sure that mode > 0.70; Beta(15.03, 2.56) | Mode = 0.95, 95% sure that mode > 0.50; Beta(5.38, 1.49) | Same as above |
| Antibiotics administration | Se | Mode = 0.95, 95% sure that mode > 0.80; Beta(21.20, 2.06) | Mode = 0.95, 95% sure that mode > 0.50; Beta(5.38, 1.49) | Same as above |
| Antibiotics administration | Sp | Mode = 0.25, 95% sure that mode < 0.50; Beta(3.88, 9.63) | Non-informative (b); Beta(1, 1) | Same as above |

Prev = prevalence; Se = sensitivity; Sp = specificity.

(a) International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) diagnosis codes 995.92 or 785.52.

(b) Every possible value of the parameter is equally likely.

Results

We examined a total of 273 126 hospitalizations. The apparent prevalence of sepsis hospitalizations based on explicit ICD-9-CM codes was 1.5% (808/53 291), 1.4% (783/54 293), 1.6% (888/55 090), 2.2% (1182/54 284), and 2.5% (1422/56 168) from 2008–2012, respectively. Table 2 presents cross-classified results of the SASD criteria for the study period. Estimates of the true prevalence, sensitivities, and specificities of the SASD criteria are presented in Table 3. The results suggested that the true prevalence of sepsis remained relatively stable from 2008, 19.2% (95% credible interval [CI]: 17.9%, 22.9%), to 2012, 17.8% (95% CI: 16.8%, 20.2%). The sensitivity of sepsis explicit codes, however, increased from 7.6% (95% CI: 6.4%, 8.4%) in 2008 to 13.8% (95% CI: 12.2%, 14.9%) in 2012, whereas the specificity of the sepsis explicit codes was almost perfect (i.e. 100%) during the same period (Table 3). The specificity of the antibiotic administration criterion was low, but slightly improved during the study period (Table 3). This is expected because, in addition to sepsis, antibiotics are administered for many other infectious conditions.
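The apparent prevalence figures above follow directly from the reported counts. As a small Python check (variable names are illustrative), each year's coded hospitalizations are divided by that year's total:

```python
# Hospitalizations with a sepsis-explicit code and total hospitalizations, by year
coded = {2008: 808, 2009: 783, 2010: 888, 2011: 1182, 2012: 1422}
hospitalizations = {2008: 53291, 2009: 54293, 2010: 55090, 2011: 54284, 2012: 56168}

# Apparent prevalence in percent, rounded to one decimal place as in the text
apparent_pct = {year: round(100 * coded[year] / hospitalizations[year], 1)
                for year in coded}
total_hospitalizations = sum(hospitalizations.values())

for year, pct in apparent_pct.items():
    print(f"{year}: {pct}%")
print(f"Total hospitalizations: {total_hospitalizations}")
```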

Table 2.

Cross-classified results for surveillance-aimed sepsis detection criteria.

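| Year | Sepsis Explicit Codes (a) | Blood Culture Order | Antibiotics Administration | Frequency (%) |
|---|---|---|---|---|
| 2008 | | | | 53 291 |
| | + | + | + | 754 (1.41) |
| | + | + | − | 4 (0.01) |
| | + | − | + | 44 (0.08) |
| | + | − | − | 6 (0.01) |
| | − | + | + | 10 649 (19.98) |
| | − | + | − | 1650 (3.10) |
| | − | − | + | 22 399 (42.03) |
| | − | − | − | 17 782 (33.37) |
| 2009 | | | | 54 293 |
| | + | + | + | 736 (1.36) |
| | + | + | − | 8 (0.01) |
| | + | − | + | 38 (0.07) |
| | + | − | − | 1 (0.002) |
| | − | + | + | 10 191 (18.77) |
| | − | + | − | 1470 (2.71) |
| | − | − | + | 21 845 (40.24) |
| | − | − | − | 20 004 (36.84) |
| 2010 | | | | 55 090 |
| | + | + | + | 832 (1.51) |
| | + | + | − | 2 (0.004) |
| | + | − | + | 51 (0.09) |
| | + | − | − | 3 (0.01) |
| | − | + | + | 9855 (17.89) |
| | − | + | − | 1375 (2.50) |
| | − | − | + | 21 205 (38.49) |
| | − | − | − | 21 767 (39.51) |
| 2011 | | | | 54 284 |
| | + | + | + | 1093 (2.01) |
| | + | + | − | 4 (0.01) |
| | + | − | + | 78 (0.14) |
| | + | − | − | 7 (0.01) |
| | − | + | + | 9557 (17.61) |
| | − | + | − | 1409 (2.60) |
| | − | − | + | 20 407 (37.59) |
| | − | − | − | 21 729 (40.03) |
| 2012 | | | | 56 168 |
| | + | + | + | 1320 (2.35) |
| | + | + | − | 13 (0.02) |
| | + | − | + | 87 (0.15) |
| | + | − | − | 2 (0.004) |
| | − | + | + | 9187 (16.36) |
| | − | + | − | 1473 (2.62) |
| | − | − | + | 20 911 (37.23) |
| | − | − | − | 23 175 (41.26) |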

+ = criterion is present; − = criterion is absent.

(a) International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) diagnosis codes 995.92 or 785.52.

Table 3.

Estimates of true prevalence and accuracy of surveillance-aimed sepsis detection criteria.

| Parameter | Year | Posterior Median (95% CI) Using Primary Priors | Posterior Median (95% CI) Using Priors in Sensitivity Analysis 1 | Posterior Median (95% CI) Using Priors in Sensitivity Analysis 2 |
|---|---|---|---|---|
| True sepsis prevalence | 2008 | 0.192 (0.179, 0.229) | 0.193 (0.178, 0.242) | 0.193 (0.178, 0.242) |
| | 2009 | 0.187 (0.176, 0.214) | 0.188 (0.175, 0.226) | 0.182 (0.176, 0.190) |
| | 2010 | 0.185 (0.174, 0.210) | 0.184 (0.172, 0.217) | 0.183 (0.175, 0.192) |
| | 2011 | 0.189 (0.176, 0.215) | 0.188 (0.175, 0.223) | 0.195 (0.185, 0.205) |
| | 2012 | 0.178 (0.168, 0.202) | 0.180 (0.167, 0.214) | 0.186 (0.177, 0.195) |
| Sepsis explicit codes (a), Se | 2008 | 0.076 (0.064, 0.084) | 0.076 (0.061, 0.084) | 0.076 (0.061, 0.084) |
| | 2009 | 0.076 (0.066, 0.083) | 0.075 (0.062, 0.083) | 0.078 (0.073, 0.083) |
| | 2010 | 0.085 (0.074, 0.092) | 0.085 (0.072, 0.093) | 0.083 (0.078, 0.088) |
| | 2011 | 0.112 (0.098, 0.121) | 0.112 (0.095, 0.121) | 0.102 (0.096, 0.108) |
| | 2012 | 0.138 (0.122, 0.149) | 0.138 (0.116, 0.149) | 0.128 (0.122, 0.136) |
| Sepsis explicit codes (a), Sp | 2008 | 0.999 (0.999, 1.000) | 0.999 (0.999, 1.000) | 0.999 (0.999, 1.000) |
| | 2009 | 1.000 (0.999, 1.000) | 1.000 (0.999, 1.000) | 0.999 (0.999, 1.000) |
| | 2010 | 1.000 (0.999, 1.000) | 1.000 (0.999, 1.000) | 1.000 (0.999, 1.000) |
| | 2011 | 0.999 (0.998, 1.000) | 0.999 (0.998, 1.000) | 1.000 (0.999, 1.000) |
| | 2012 | 0.999 (0.998, 1.000) | 0.999 (0.998, 1.000) | 0.999 (0.999, 1.000) |
| Blood culture order criterion, Se | 2008 | 0.960 (0.935, 0.989) | 0.960 (0.935, 0.994) | 0.960 (0.935, 0.994) |
| | 2009 | 0.965 (0.941, 0.990) | 0.963 (0.940, 0.994) | 0.961 (0.945, 0.978) |
| | 2010 | 0.959 (0.934, 0.989) | 0.957 (0.932, 0.993) | 0.961 (0.941, 0.979) |
| | 2011 | 0.954 (0.927, 0.988) | 0.952 (0.925, 0.993) | 0.954 (0.932, 0.975) |
| | 2012 | 0.957 (0.933, 0.989) | 0.953 (0.930, 0.992) | 0.950 (0.933, 0.970) |
| Blood culture order criterion, Sp | 2008 | 0.925 (0.914, 0.966) | 0.926 (0.913, 0.983) | 0.926 (0.913, 0.983) |
| | 2009 | 0.940 (0.931, 0.971) | 0.941 (0.930, 0.986) | 0.934 (0.929, 0.941) |
| | 2010 | 0.949 (0.940, 0.977) | 0.947 (0.939, 0.985) | 0.946 (0.940, 0.955) |
| | 2011 | 0.947 (0.938, 0.977) | 0.947 (0.938, 0.987) | 0.955 (0.945, 0.964) |
| | 2012 | 0.948 (0.939, 0.975) | 0.949 (0.939, 0.988) | 0.955 (0.945, 0.964) |
| Antibiotics administration criterion, Se | 2008 | 0.978 (0.911, 0.999) | 0.977 (0.891, 0.999) | 0.977 (0.891, 0.999) |
| | 2009 | 0.978 (0.921, 0.998) | 0.977 (0.896, 0.999) | 0.992 (0.975, 0.999) |
| | 2010 | 0.979 (0.921, 0.999) | 0.983 (0.908, 1.000) | 0.984 (0.963, 0.998) |
| | 2011 | 0.978 (0.918, 0.999) | 0.980 (0.901, 0.999) | 0.961 (0.941, 0.985) |
| | 2012 | 0.978 (0.918, 0.998) | 0.976 (0.894, 0.998) | 0.960 (0.939, 0.985) |
| Antibiotics administration criterion, Sp | 2008 | 0.446 (0.440, 0.452) | 0.446 (0.440, 0.452) | 0.446 (0.440, 0.452) |
| | 2009 | 0.481 (0.476, 0.487) | 0.482 (0.476, 0.487) | 0.480 (0.475, 0.485) |
| | 2010 | 0.511 (0.505, 0.516) | 0.511 (0.505, 0.517) | 0.509 (0.504, 0.514) |
| | 2011 | 0.520 (0.514, 0.526) | 0.520 (0.514, 0.527) | 0.520 (0.515, 0.526) |
| | 2012 | 0.529 (0.523, 0.535) | 0.530 (0.523, 0.536) | 0.530 (0.525, 0.535) |

CI = Credible interval; Se = sensitivity; Sp = specificity.

(a) International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) diagnosis codes 995.92 or 785.52.

Discussion

Our surveillance-aimed criteria estimated the true prevalence of sepsis to be about 18%, which remained stable during the study period at our institution. This study follows the results of two previous studies at our institution that suggested an uptrend in the apparent prevalence of hospitalizations with a discharge diagnosis code for sepsis [15,16]. Our findings are similar to those from Iwashyna et al [43], who reported an apparent prevalence of sepsis of 13.5% based on an alternative algorithm, referred to as the Angus implementation [45], in administrative data with 50.3% sensitivity and 96.3% specificity. Using Rogan and Gladen’s formula to estimate true prevalence from apparent prevalence [46], we estimated the true prevalence of sepsis in the population of Iwashyna et al [43] to be about 21%, which is similar to our study population despite the different methodology. Our findings suggest that, despite the stable prevalence of sepsis, the sensitivity of explicit coding in administrative data almost doubled to about 14% during the 5-year study period, but still remained very low. These findings are consistent with, for example, Iwashyna et al [43] among others [47], who reported the sensitivity of sepsis-explicit codes for sepsis (ICD-9-CM: 995.92 or 785.52) to be 9.3% (95% confidence interval: 0%, 19.3%), based on a medical records chart review of a sample of hospitalizations.
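The Rogan and Gladen adjustment mentioned above can be reproduced with a few lines of Python. The estimator is (apparent prevalence + specificity − 1) / (sensitivity + specificity − 1); the function name and the truncation to the interval [0, 1] are choices made for this sketch, and the inputs are the figures quoted from Iwashyna et al [43].

```python
def rogan_gladen(apparent_prev, sensitivity, specificity):
    """Estimate true prevalence from apparent prevalence and test accuracy."""
    denominator = sensitivity + specificity - 1.0
    if denominator <= 0:
        raise ValueError("Sensitivity + specificity must exceed 1 for this estimator.")
    estimate = (apparent_prev + specificity - 1.0) / denominator
    # Truncate to the valid range for a proportion
    return min(1.0, max(0.0, estimate))


# Apparent prevalence 13.5%, sensitivity 50.3%, specificity 96.3% (Iwashyna et al)
true_prev = rogan_gladen(0.135, 0.503, 0.963)
print(f"Estimated true prevalence: {true_prev:.3f}")
```

With these inputs the estimate is about 0.21, matching the 21% stated in the text. Unlike the Bayesian latent class model, this point estimate treats the sensitivity and specificity as known without error.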

The estimates for the sensitivity of ICD-9-CM explicit codes for sepsis in our study are critical because several past studies that used large multi-center or national datasets to describe the epidemiology of sepsis did not adjust for the misclassification bias, imperfect accuracy of the verification method, and inaccuracy in ICD-9-CM codes for sepsis [6–14,23,24,48]. Consequently, these studies provided a severely biased estimate of sepsis trends over time. The findings are also important because the AHRQ’s estimate [2] of $20.3 billion for annual sepsis care aggregate hospital costs does not account for approximately 85% of true sepsis hospitalizations that are missed in administrative data, based upon our study and those reported by Iwashyna et al [43].

Our aggregate-level prevalence study could not consider all of the factors that were associated with receiving a discharge diagnosis of sepsis for an individual patient. However, in a complementary study [16], we found that admission to the intensive care unit (ICU) and the frequency of blood culture ordering during the course of hospitalization were associated with receiving a discharge diagnosis of sepsis. This is consistent with findings from other studies that suggested a higher sensitivity of sepsis-explicit codes in ICU hospitalizations [47]. Additionally, we previously quantified the changes in the probability of receiving a discharge diagnosis of sepsis for an individual patient, as a proxy for measuring the coinciding improvement in the clinical diagnosis of sepsis, its documentation in electronic health records, and its medical coding in administrative billing data [16]. Another limitation of our study is that it occurred in a single academic center. However, our modeling approach is very flexible and can readily be adapted to different settings by modifying the specified priors whenever appropriate, for example a different time period in which the accuracy of each individual criterion changes, community hospitals with a lower probability of sepsis-explicit coding, or surgical patients with a higher probability of receiving antibiotics.

Our analytical approach is distinct from previous studies that required sepsis-explicit codes along with blood culture orders, a positive blood culture, antibiotics use, vasopressor use, or other variables to create a pseudo-gold standard [10,14,23,24,48,49]. This method of combining several individual criteria is referred to as serial interpretation in the diagnostic testing literature; it improves diagnostic specificity at the expense of reducing diagnostic sensitivity and consequently misses even more true sepsis cases [33]. These approaches that result in improved specificity are not suitable for surveillance purposes, given that the specificity of sepsis-explicit codes is almost perfect (i.e. 100%) as suggested by our results and those provided by Iwashyna et al [43] among others. Moreover, these pseudo-gold standards remain subject to varying degrees of misclassification bias that result in severe underestimation of sepsis prevalence [26,50]. Instead, we modeled the three imperfect criteria simultaneously such that each contributed information to the estimation of true sepsis prevalence without the need to create a hypothetical perfect reference standard.
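The trade-off of serial interpretation can be illustrated with a short Python sketch. It assumes, for simplicity only, that the criteria are conditionally independent given true sepsis status, which differs from the model in this study (codes and antibiotics were allowed to be dependent). Under that assumption a case is positive only if every criterion is positive, so the combined sensitivity is the product of the sensitivities and the combined false-positive rate is the product of the false-positive rates. The function name is hypothetical; the inputs are the 2012 posterior medians from Table 3.

```python
import math


def serial_interpretation(sensitivities, specificities):
    """Combined accuracy when all criteria must be positive (assumes independence)."""
    combined_se = math.prod(sensitivities)
    combined_sp = 1.0 - math.prod(1.0 - sp for sp in specificities)
    return combined_se, combined_sp


# 2012 posterior medians: sepsis-explicit codes, blood culture order, antibiotics
se = [0.138, 0.957, 0.978]
sp = [0.999, 0.948, 0.529]
serial_se, serial_sp = serial_interpretation(se, sp)
print(f"Serial sensitivity = {serial_se:.3f}, serial specificity = {serial_sp:.5f}")
```

The combined sensitivity falls below that of the least sensitive criterion, while the specificity gain over the codes alone is negligible because their specificity is already close to 100%.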

Sepsis remains a critical public health concern. Our attempt to estimate the true prevalence of sepsis is important because it allows for comparing the changes in true prevalence over time or between different hospitals for surveillance purposes. Further, the methods are algorithm-independent and can be applied to different settings or conditions of interest.

Acknowledgments

This work was supported by the Prevention Epicenters Program from the Centers for Disease Control and Prevention (CDC) [Grants U54 CK000162 and U54 CK000172] and the Washington University Institute of Clinical and Translational Sciences from the National Center for Advancing Translational Sciences (NCATS) [Grant UL1 TR000448].

Abbreviations

ACCP

American College of Chest Physicians

AHRQ

Agency for Healthcare Research and Quality

BJH

Barnes-Jewish Hospital

CI

credible interval

ICD-9-CM

International Classification of Diseases, Ninth Revision, Clinical Modification

SASD

surveillance-aimed sepsis detection

SCCM

Society of Critical Care Medicine

SIRS

systemic inflammatory response syndrome

Appendix

Program codes to estimate the true prevalence and accuracy of surveillance-aimed sepsis detection criteria.

# JAGS 4.0.1; http://mcmc-jags.sourceforge.net/
# Adapted from Branscum et al. 2005; DOI:10.1016/j.prevetmed.2004.12.005
# T1: Sepsis explicit codes
# T2: Antibiotics administration
# T3: Blood culture order
# p[1]: T1+, T2+, T3+
# p[2]: T1+, T2−, T3+
# p[3]: T1+, T2+, T3−
# p[4]: T1+, T2−, T3−
# p[5]: T1−, T2+, T3+
# p[6]: T1−, T2−, T3+
# p[7]: T1−, T2+, T3−
# p[8]: T1−, T2−, T3−

model {
      x[1:8] ~ dmulti(p[1:8], n)
      p[1] <- prev*Se3*(Se1*Se2+covDp) + (1-prev)*(1-Sp3)*((1-Sp1)*(1-Sp2)+covDn)
      p[2] <- prev*Se3*(Se1*(1-Se2)-covDp) + (1-prev)*(1-Sp3)*((1-Sp1)*Sp2-covDn)
      p[3] <- prev*(1-Se3)*(Se1*Se2+covDp) + (1-prev)*Sp3*((1-Sp1)*(1-Sp2)+covDn)
      p[4] <- prev*(1-Se3)*(Se1*(1-Se2)-covDp) + (1-prev)*Sp3*((1-Sp1)*Sp2-covDn)
      p[5] <- prev*Se3*((1-Se1)*Se2-covDp) + (1-prev)*(1-Sp3)*(Sp1*(1-Sp2)-covDn)
      p[6] <- prev*Se3*((1-Se1)*(1-Se2)+covDp) + (1-prev)*(1-Sp3)*(Sp1*Sp2+covDn)
      p[7] <- prev*(1-Se3)*((1-Se1)*Se2-covDp) + (1-prev)*Sp3*(Sp1*(1-Sp2)-covDn)
      p[8] <- prev*(1-Se3)*((1-Se1)*(1-Se2)+covDp) + (1-prev)*Sp3*(Sp1*Sp2+covDn)

      ls <- (Se1-1)*(1-Se2)
      us <- min(Se1,Se2) - Se1*Se2
      lc <- (Sp1-1)*(1-Sp2)
      uc <- min(Sp1,Sp2) - Sp1*Sp2
      rhoD <- covDp / sqrt(Se1*(1-Se1)*Se2*(1-Se2))
      rhoDc <- covDn / sqrt(Sp1*(1-Sp1)*Sp2*(1-Sp2))

      prev ~ dbeta(1.709702, 14.48435)
      Se1 ~ dbeta(2.55936, 15.03424)
      Sp1 ~ dbeta(21.20184, 2.063255)
      Se2 ~ dbeta(21.20184, 2.063255)
      Sp2 ~ dbeta(3.876141, 9.628424)
      Se3 ~ dbeta(15.03422, 2.559357)
      Sp3 ~ dbeta(15.03422, 2.559357)

      covDn ~ dunif(lc, uc)
      covDp ~ dunif(ls, us)
}

# R 3.2.2; https://www.r-project.org/
# ‘prevalence’ package; https://cran.r-project.org/web/packages/prevalence/index.html
library(prevalence)

# Prior for prev
betaExpert(best = 0.05, upper = 0.25)
# Prior for Se1
betaExpert(best = 0.10, upper = 0.30)
# Prior for Sp1
betaExpert(best = 0.95, lower = 0.80)
# Prior for Se2
betaExpert(best = 0.95, lower = 0.80)
# Prior for Sp2
betaExpert(best = 0.25, upper = 0.50)
# Prior for Se3
betaExpert(best = 0.90, lower = 0.70)
# Prior for Sp3
betaExpert(best = 0.90, lower = 0.70)

# Data for 2012
x <- c(1320, 13, 87, 2, 9187, 1473, 20911, 23175)
n <- sum(1320, 13, 87, 2, 9187, 1473, 20911, 23175)


References

  • 1.Heron M. Deaths: leading causes for 2010. Natl Vital Stat Rep Cent Dis Control Prev Natl Cent Health Stat Natl Vital Stat Syst. 2013;62:1–96. [PubMed] [Google Scholar]
  • 2.Torio CM, Andrews RM. Healthc. Cost Util. Proj. HCUP Stat. Briefs, Rockville (MD): Agency for Health Care Policy and Research (US); 2013. National Inpatient Hospital Costs: The Most Expensive Conditions by Payer, 2011: Statistical Brief #160. [PubMed] [Google Scholar]
  • 3.Iwashyna TJ, Ely EW, Smith DM, Langa KM. Long-term cognitive impairment and functional disability among survivors of severe sepsis. JAMA. 2010;304:1787–1794. doi: 10.1001/jama.2010.1553. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Levy MM, Fink MP, Marshall JC, Abraham E, Angus D, Cook D, et al. 2001 SCCM/ESICM/ACCP/ATS/SIS International Sepsis Definitions Conference. Intensive Care Med. 2003;29:530–538. doi: 10.1007/s00134-003-1662-x. [DOI] [PubMed] [Google Scholar]
  • 5.Singer M, Deutschman CS, Seymour CW, Shankar-Hari M, Annane D, Bauer M, et al. The Third International Consensus Definitions for Sepsis and Septic Shock (Sepsis-3) JAMA. 2016;315:801–810. doi: 10.1001/jama.2016.0287. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Martin GS, Mannino DM, Eaton S, Moss M. The epidemiology of sepsis in the United States from 1979 through 2000. N Engl J Med. 2003;348:1546–1554. doi: 10.1056/NEJMoa022139. [DOI] [PubMed] [Google Scholar]
  • 7.Dombrovskiy VY, Martin AA, Sunderram J, Paz HL. Rapid increase in hospitalization and mortality rates for severe sepsis in the United States: a trend analysis from 1993 to 2003. Crit Care Med. 2007;35:1244–1250. doi: 10.1097/01.CCM.0000261890.41311.E9. [DOI] [PubMed] [Google Scholar]
  • 8.Bateman BT, Schmidt U, Berman MF, Bittner EA. Temporal trends in the epidemiology of severe postoperative sepsis after elective surgery: a large, nationwide sample. Anesthesiology. 2010;112:917–925. doi: 10.1097/ALN.0b013e3181cea3d0. [DOI] [PubMed] [Google Scholar]
  • 9.Kumar G, Kumar N, Taneja A, Kaleekal T, Tarima S, McGinley E, et al. Nationwide trends of severe sepsis in the 21st century (2000–2007) Chest. 2011;140:1223–1231. doi: 10.1378/chest.11-0352. [DOI] [PubMed] [Google Scholar]
  • 10.Lagu T, Rothberg MB, Shieh M-S, Pekow PS, Steingrub JS, Lindenauer PK. Hospitalizations, costs, and outcomes of severe sepsis in the United States 2003 to 2007. Crit Care Med. 2012;40:754–761. doi: 10.1097/CCM.0b013e318232db65. [DOI] [PubMed] [Google Scholar]
  • 11.Gaieski DF, Edwards JM, Kallan MJ, Carr BG. Benchmarking the incidence and mortality of severe sepsis in the United States. Crit Care Med. 2013;41:1167–1174. doi: 10.1097/CCM.0b013e31827c09f8. [DOI] [PubMed] [Google Scholar]
  • 12.Sutton JP, Friedman B. Healthc. Cost Util. Proj. HCUP Stat. Briefs, Rockville (MD): Agency for Health Care Policy and Research (US); 2013. Trends in Septicemia Hospitalizations and Readmissions in Selected HCUP States, 2005 and 2010: Statistical Brief #161. [PubMed] [Google Scholar]
  • 13. Stoller J, Halpin L, Weis M, Aplin B, Qu W, Georgescu C, et al. Epidemiology of severe sepsis: 2008–2012. J Crit Care. 2016;31:58–62. doi: 10.1016/j.jcrc.2015.09.034.
  • 14. Gohil SK, Cao C, Phelan M, Tjoa T, Rhee C, Platt R, et al. Impact of policies on the rise in sepsis incidence, 2000–2010. Clin Infect Dis Off Publ Infect Dis Soc Am. 2016;62:695–703. doi: 10.1093/cid/civ1019.
  • 15. Thomas BS, Jafarzadeh SR, Warren DK, McCormick S, Fraser VJ, Marschall J. Temporal trends in the systemic inflammatory response syndrome, sepsis, and medical coding of sepsis. BMC Anesthesiol. 2015;15:169. doi: 10.1186/s12871-015-0148-z.
  • 16. Jafarzadeh SR, Thomas BS, Marschall J, Fraser VJ, Gill J, Warren DK. Quantifying the improvement in sepsis diagnosis, documentation, and coding: the marginal causal effect of year of hospitalization on sepsis diagnosis. Ann Epidemiol. 2016;26:66–70. doi: 10.1016/j.annepidem.2015.10.008.
  • 17. Thurmond MC. Conceptual foundations for infectious disease surveillance. J Vet Diagn Investig Off Publ Am Assoc Vet Lab Diagn Inc. 2003;15:501–514. doi: 10.1177/104063870301500601.
  • 18. Madsen KM, Schønheyder HC, Kristensen B, Nielsen GL, Sørensen HT. Can hospital discharge diagnosis be used for surveillance of bacteremia? A data quality study of a Danish hospital discharge registry. Infect Control Hosp Epidemiol. 1998;19:175–180.
  • 19. Gedeborg R, Furebring M, Michaëlsson K. Diagnosis-dependent misclassification of infections using administrative data variably affected incidence and mortality estimates in ICU patients. J Clin Epidemiol. 2007;60:155–162. doi: 10.1016/j.jclinepi.2006.05.013.
  • 20. Grijalva CG, Chung CP, Stein CM, Gideon PS, Dyer SM, Mitchel EF, et al. Computerized definitions showed high positive predictive values for identifying hospitalizations for congestive heart failure and selected infections in Medicaid enrollees with rheumatoid arthritis. Pharmacoepidemiol Drug Saf. 2008;17:890–895. doi: 10.1002/pds.1625.
  • 21. Cevasco M, Borzecki AM, Chen Q, Zrelak PA, Shin M, Romano PS, et al. Positive predictive value of the AHRQ Patient Safety Indicator “Postoperative Sepsis”: implications for practice and policy. J Am Coll Surg. 2011;212:954–961. doi: 10.1016/j.jamcollsurg.2010.11.013.
  • 22. Whittaker S-A, Mikkelsen ME, Gaieski DF, Koshy S, Kean C, Fuchs BD. Severe sepsis cohorts derived from claims-based strategies appear to be biased toward a more severely ill patient population. Crit Care Med. 2013;41:945–953. doi: 10.1097/CCM.0b013e31827466f1.
  • 23. Rhee C, Murphy MV, Li L, Platt R, Klompas M. Centers for Disease Control and Prevention Epicenters Program. Comparison of trends in sepsis incidence and coding using administrative claims versus objective clinical data. Clin Infect Dis Off Publ Infect Dis Soc Am. 2015;60:88–95. doi: 10.1093/cid/ciu750.
  • 24. Rhee C, Kadri S, Huang SS, Murphy MV, Li L, Platt R, et al. Objective sepsis surveillance using electronic clinical data. Infect Control Hosp Epidemiol. 2015:1–9. doi: 10.1017/ice.2015.264.
  • 25. Greiner M, Gardner IA. Epidemiologic issues in the validation of veterinary diagnostic tests. Prev Vet Med. 2000;45:3–22. doi: 10.1016/s0167-5877(00)00114-8.
  • 26. Enøe C, Georgiadis MP, Johnson WO. Estimation of sensitivity and specificity of diagnostic tests and disease prevalence when the true disease state is unknown. Prev Vet Med. 2000;45:61–81. doi: 10.1016/s0167-5877(00)00117-3.
  • 27. Branscum AJ, Gardner IA, Johnson WO. Bayesian modeling of animal- and herd-level prevalences. Prev Vet Med. 2004;66:101–112. doi: 10.1016/j.prevetmed.2004.09.009.
  • 28. Christensen R, Johnson WO, Branscum AJ, Hanson TE. Bayesian Ideas and Data Analysis: an Introduction for Scientists and Statisticians. 1st ed. CRC Press; 2010.
  • 29. Gill J. Bayesian Methods: A Social and Behavioral Sciences Approach. 3rd ed. Boca Raton, Florida: Chapman and Hall/CRC; 2014.
  • 30. Jones G, Johnson WO, Hanson TE, Christensen R. Identifiability of models for multiple diagnostic testing in the absence of a gold standard. Biometrics. 2010;66:855–863. doi: 10.1111/j.1541-0420.2009.01330.x.
  • 31. Branscum AJ, Gardner IA, Johnson WO. Estimation of diagnostic-test sensitivity and specificity through Bayesian modeling. Prev Vet Med. 2005;68:145–163. doi: 10.1016/j.prevetmed.2004.12.005.
  • 32. Georgiadis MP, Johnson WO, Gardner IA, Singh R. Correlation-adjusted estimation of sensitivity and specificity of two diagnostic tests. J R Stat Soc Ser C Appl Stat. 2003;52:63–76.
  • 33. Su C-L, Gardner IA, Johnson WO. Diagnostic test accuracy and prevalence inferences based on joint and sequential testing with finite population sampling. Stat Med. 2004;23:2237–2255. doi: 10.1002/sim.1809.
  • 34. Johnson WO, Gardner IA, Metoyer CN, Branscum AJ. On the interpretation of test sensitivity in the two-test two-population problem: assumptions matter. Prev Vet Med. 2009;91:116–121. doi: 10.1016/j.prevetmed.2009.06.006.
  • 35. Jafarzadeh SR, Warren DK, Nickel KB, Wallace AE, Fraser VJ, Olsen MA. Bayesian estimation of the accuracy of ICD-9-CM- and CPT-4-based algorithms to identify cholecystectomy procedures in administrative data without a reference standard. Pharmacoepidemiol Drug Saf. 2016;25:263–268. doi: 10.1002/pds.3870.
  • 36. Dendukuri N, Joseph L. Bayesian approaches to modeling the conditional dependence between multiple diagnostic tests. Biometrics. 2001;57:158–167. doi: 10.1111/j.0006-341x.2001.00158.x.
  • 37. Gardner IA, Stryhn H, Lind P, Collins MT. Conditional dependence between tests affects the diagnosis and surveillance of animal diseases. Prev Vet Med. 2000;45:107–122. doi: 10.1016/s0167-5877(00)00119-7.
  • 38. Plummer M. JAGS version 4.0.0 user manual. Lyon, France: International Agency for Research on Cancer; 2015.
  • 39. Plummer M. rjags: Bayesian graphical models using MCMC. R package version 4-4. 2015. http://cran.r-project.org/package=rjags.
  • 40. R Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2015.
  • 41. Plummer M, Best N, Cowles K, Vines K. CODA: convergence diagnosis and output analysis for MCMC. R News. 2006;6:7–11.
  • 42. Suess EA, Gardner IA, Johnson WO. Hierarchical Bayesian model for prevalence inferences and determination of a country’s status for an animal pathogen. Prev Vet Med. 2002;55:155–171. doi: 10.1016/s0167-5877(02)00092-2.
  • 43. Iwashyna TJ, Odden A, Rohde J, Bonham C, Kuhn L, Malani P, et al. Identifying patients with severe sepsis using administrative claims: patient-level validation of the Angus implementation of the international consensus conference definition of severe sepsis. Med Care. 2014;52:e39–e43. doi: 10.1097/MLR.0b013e318268ac86.
  • 44. Devleesschauwer B, Torgerson PR, Charlier J, Levecke B, Praet N, Roelandt S, et al. prevalence: tools for prevalence assessment studies. R package version 0.4.4. 2015. http://cran.r-project.org/package=prevalence.
  • 45. Angus DC, Linde-Zwirble WT, Lidicker J, Clermont G, Carcillo J, Pinsky MR. Epidemiology of severe sepsis in the United States: analysis of incidence, outcome, and associated costs of care. Crit Care Med. 2001;29:1303–1310. doi: 10.1097/00003246-200107000-00002.
  • 46. Rogan WJ, Gladen B. Estimating prevalence from the results of a screening test. Am J Epidemiol. 1978;107:71–76. doi: 10.1093/oxfordjournals.aje.a112510.
  • 47. Jolley RJ, Sawka KJ, Yergens DW, Quan H, Jetté N, Doig CJ. Validity of administrative data in recording sepsis: a systematic review. Crit Care Lond Engl. 2015;19:139. doi: 10.1186/s13054-015-0847-3.
  • 48. Rhee C, Murphy MV, Li L, Platt R, Klompas M. Centers for Disease Control and Prevention Epicenters Program. Improving documentation and coding for acute organ dysfunction biases estimates of changing sepsis severity and burden: a retrospective study. Crit Care Lond Engl. 2015;19:338. doi: 10.1186/s13054-015-1048-9.
  • 49. Lagu T, Lindenauer PK, Rothberg MB, Nathanson BH, Pekow PS, Steingrub JS, et al. Development and validation of a model that uses enhanced administrative data to predict mortality in patients with sepsis. Crit Care Med. 2011;39:2425–2430. doi: 10.1097/CCM.0b013e31822572e3.
  • 50. Johnson WO, Gastwirth JL, Pearson LM. Screening without a “gold standard”: the Hui-Walter paradigm revisited. Am J Epidemiol. 2001;153:921–924. doi: 10.1093/aje/153.9.921.
