Abstract
We compared sepsis “time zero” determinations and CMS SEP-1 pass rates among 3 abstractors in 3 hospitals. Abstractors agreed on time zero in 29/80 (36%) cases. Perceived pass rates ranged from 9/80 (11%) to 19/80 (23%) cases. Variability in time zero determinations and perceived pass rates limits SEP-1’s utility for measuring quality.
Keywords: SEP-1, Sepsis, Severe Sepsis, Quality Measures, Centers for Medicare and Medicaid Services
INTRODUCTION
In October 2015, the Centers for Medicare and Medicaid Services (CMS) implemented the “SEP-1” sepsis core measure requiring U.S. hospitals to report compliance with 3 and 6-hour bundles of care for patients with severe sepsis or septic shock.1 Hospitals are now devoting substantial resources to measuring and improving SEP-1 adherence, which requires all bundle components be met to “pass”.2,3
SEP-1 bundle adherence is measured relative to sepsis “time zero”, defined as the first point at which there is documentation of suspected or confirmed infection, 2 or more systemic inflammatory response syndrome (SIRS) criteria, and 1 or more organ dysfunction criteria, all within a 6-hour window.1 Time zero is also triggered if a clinician explicitly documents severe sepsis or septic shock. Given this complex definition, different abstractors may reach different conclusions about time zero, which in turn could lead to different impressions of whether cases passed or failed SEP-1.4,5
We compared time zero determinations and SEP-1 pass rates among hospital abstractors and clinicians in three U.S. hospitals. We also examined clinical factors associated with lower agreement rates.
METHODS
We randomly selected 80 SEP-1 cases discharged between July 1 and December 31, 2016 at three U.S. tertiary care hospitals (Brigham and Women’s Hospital in Boston, MA; Barnes-Jewish Hospital in St. Louis, MO; and Duke University Hospital in Durham, NC). Each case was reviewed by the official hospital SEP-1 abstractor and by two clinicians at each hospital (either internal medicine physicians or clinical pharmacists) for all SEP-1 components, including time zero (Table 1) and whether cases passed. Abstractors were blinded to one another’s determinations. To encourage standardization, clinician reviewers underwent one hour of training on SEP-1 abstraction by the lead investigator (C.R.), using the CMS specification in place during the study period.1 CMS exclusion criteria were applied prior to selecting cases for review (i.e., transfer from an outside hospital, severe sepsis criteria not met on chart review, goals of care limitations, and antibiotic administration more than 24 hours before time zero).1
Table 1.
| All 3 of the following within a 6-hour windowa: |
|---|
| Documentation of suspected or confirmed infection |
| ≥2 systemic inflammatory response syndrome criteria |
| ≥1 organ dysfunction criterionb |
a Time zero is the time at which the last sign of severe sepsis (documentation of suspected infection, ≥2 systemic inflammatory response syndrome criteria, and organ dysfunction) within that 6-hour window is noted. Alternatively, severe sepsis criteria are met if there is provider documentation of suspected or confirmed severe sepsis or septic shock.
b Excludes organ dysfunction explicitly documented as chronic.
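To make the time zero rule concrete, the logic in Table 1 can be sketched in code. This is an illustrative simplification, not the CMS abstraction tool: the hypothetical `find_time_zero` function takes candidate timestamps for each criterion and returns the time of the last criterion in the earliest qualifying 6-hour window. The provider-documentation shortcut and the chronic organ dysfunction exclusion are omitted.

```python
from datetime import datetime, timedelta
from itertools import product

WINDOW = timedelta(hours=6)

def find_time_zero(infection_times, sirs_met_times, organ_dysfunction_times):
    """Each argument is a list of candidate timestamps: documentation of
    suspected/confirmed infection, the moment >=2 SIRS criteria were
    present, and new organ dysfunction, respectively. Time zero is the
    time of the LAST of the three criteria, taken over the earliest
    combination in which all three fall within a 6-hour window."""
    candidates = [
        max(trio)  # the last sign of severe sepsis within the window
        for trio in product(infection_times, sirs_met_times, organ_dysfunction_times)
        if max(trio) - min(trio) <= WINDOW
    ]
    return min(candidates) if candidates else None
```

For example, infection documented at 10:00, SIRS criteria met at 09:30, and organ dysfunction at 11:00 yield a time zero of 11:00; if the three criteria never co-occur within six hours, no time zero exists.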
We compared agreement on time zero and SEP-1 pass versus fail determinations among the three abstractors at each site. Time zero was considered concordant between abstractors if determinations were within ±1 minute of one another. We also conducted sensitivity analyses allowing for agreement if time zero determinations were within one hour or three hours of one another.
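As a minimal illustration of this concordance definition (an assumed implementation, not the study’s actual analysis code), agreement within a tolerance reduces to checking the spread of the abstractors’ determinations:

```python
from datetime import datetime, timedelta

def concordant(time_zeros, tolerance=timedelta(minutes=1)):
    # All abstractors agree if the earliest and latest time zero
    # determinations differ by no more than the tolerance.
    return max(time_zeros) - min(time_zeros) <= tolerance
```

Widening `tolerance` to `timedelta(hours=1)` or `timedelta(hours=3)` reproduces the sensitivity analyses.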
We calculated interobserver variability on whether cases passed or failed SEP-1 using the Fleiss kappa statistic (κ) for 3-abstractor comparisons and Cohen’s kappa for 2-abstractor comparisons.6 We considered κ greater than 0.75 to be strong agreement, 0.40–0.75 to be moderate agreement, and less than 0.40 to be poor agreement.7
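For reference, Fleiss’ κ can be computed directly from per-case category counts. The sketch below is a generic implementation (the study itself used SAS and an online calculator8), shown with a hypothetical pass/fail example:

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for N subjects each rated by the same number of
    raters. `ratings` is a list of per-subject category counts, e.g.
    [3, 0] means all 3 abstractors judged the case a "pass"."""
    n_subjects = len(ratings)
    n_raters = sum(ratings[0])
    n_total = n_subjects * n_raters

    # Observed agreement: mean per-subject pairwise agreement.
    p_bar = sum(
        (sum(c * c for c in counts) - n_raters) / (n_raters * (n_raters - 1))
        for counts in ratings
    ) / n_subjects

    # Chance agreement from the marginal category proportions.
    n_categories = len(ratings[0])
    p_e = sum(
        (sum(counts[j] for counts in ratings) / n_total) ** 2
        for j in range(n_categories)
    )
    return (p_bar - p_e) / (1 - p_e)
```

With three abstractors, perfect agreement on every case gives κ = 1.0, while mixed pass/fail judgments (e.g., two cases of unanimity plus two split 2-to-1) pull κ toward zero.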
We conducted a multivariate analysis to identify factors associated with disagreement on time zero. Covariates included age >65, sex, hospital length-of-stay >7 days, sepsis time zero occurring after hospital admission (per the hospital abstractor), and which organ dysfunction criteria triggered time zero per the hospital abstractor (hypotension, lactate >2.0 mmol/L, provider documentation of severe sepsis/septic shock, or other organ dysfunction).
Analyses were performed using SAS version 9.3 (SAS Institute, Cary, NC) and an online software package for interrater reliability calculations.8 The study was approved by the Institutional Review Boards at Harvard Pilgrim Health Care Institute, Partners Healthcare, Washington University School of Medicine, and Duke University Health System.
RESULTS
Of the 80 study cases, all three abstractors agreed on time zero in 29 (36.3%). Agreement rates by hospital are shown in Table 2. Among the 51 cases with a discrepancy, the median time zero difference between clinician abstractors and hospital abstractors was 40 minutes (IQR 0–70 minutes, range 0 minutes to 11.6 days). Agreement on time zero was better but still marginal when the window for concordance was expanded to one hour (47/80 cases, 58.8%) or three hours (54/80 cases, 67.5%).
Table 2.
| | Overall (N=80) | Hospital 1 (N=29) | Hospital 2 (N=21) | Hospital 3 (N=30) |
|---|---|---|---|---|
| Agreement (All 3 Abstractors) | | | | |
| Exact (±1 minute) | 29/80 (36.3%) | 15/29 (51.7%) | 8/21 (38.1%) | 6/30 (20.0%) |
| ±1 hour | 47/80 (58.8%) | 20/29 (69.0%) | 12/21 (57.1%) | 15/30 (50.0%) |
| ±3 hours | 54/80 (67.5%) | 24/29 (82.8%) | 12/21 (57.1%) | 18/30 (60.0%) |
| Median Difference in Time Zero for Clinician vs Hospital Abstractors (IQR)a | 40 minutes (0–70) | 41 minutes (0–139) | 13 minutes (1–568) | 49 minutes (0–210) |
| Agreement for Sepsis Occurring in Emergency Departmentb (within ±1 minute) | 25/55 (45.5%) | 13/23 (56.5%) | 7/11 (63.6%) | 5/21 (23.8%) |
| Agreement for Sepsis Occurring After Admissionb (within ±1 minute) | 4/25 (16.0%) | 2/6 (33.3%) | 1/10 (10.0%) | 1/9 (11.1%) |
| Interrater Reliability for Overall SEP-1 Pass vs Fail (Fleiss κ)c | 0.39 | 0.39 | 0.24 | 0.37 |
Data for each hospital are presented in no specific order.
a Represents the median difference in time zero determined by both clinician abstractors compared to each hospital’s official quality abstractor. Only cases where there was at least one disagreement were included in the analysis.
b The timing of sepsis onset was determined using the official hospital abstractor’s time zero.
c Interrater reliability calculations included all three abstractors at each hospital.
Hospital abstractors identified time zero as occurring in the emergency department or on the day of admission in 55 of 80 cases (68.8%). Agreement among the three abstractors was better in these cases (25/55 cases, 45.5%) than in cases where time zero occurred after hospital admission (4/25 cases, 16.0%; p=0.01). On multivariate analysis, hospital onset of sepsis was independently associated with at least one abstractor disagreeing on time zero (OR 8.2, 95% CI 1.6–40.7, p=0.01), whereas age, sex, length-of-stay >7 days, and organ dysfunction criteria were not.
Overall, hospital abstractors identified 19/80 cases (23.8%) as passing SEP-1. Among the clinician abstractors, 9 (11.3%) cases passed when using the abstractor with the strictest assessments at each hospital; when using the highest pass rates per clinician abstractor at each hospital, 15 (18.8%) cases passed. Interrater reliability among the three abstractors for determining SEP-1 compliance was poor (Fleiss κ 0.39).
When agreement was defined as at least one clinician abstractor identifying the same time zero as the hospital abstractor, agreement increased to 56/80 cases (70.0%), and interrater reliability for determining SEP-1 compliance was better but still only moderate (Cohen κ 0.67). When examining agreement only between the two clinician abstractors at each hospital, agreement on time zero occurred in 34/80 cases (42.5%) and interrater reliability for SEP-1 compliance was poor (Cohen κ 0.28).
DISCUSSION
We found poor agreement between abstractors for identifying sepsis time zero and whether or not cases passed the CMS SEP-1 measure. Agreement improved to only moderate when requiring just one of two clinician abstractors to agree with a hospital’s official abstractor. Sepsis onset after hospital admission was associated with lower agreement rates compared to sepsis present-on-admission.
The SEP-1 measure relies on determining sepsis time zero to calculate 3 and 6-hour bundle compliance rates, but there are several potential sources of error and subjectivity. Abstractors need to assess many different parts of the chart (e.g., vital signs, laboratory tests, clinical notes, and medication administration records) to determine time zero and overall SEP-1 compliance. Abstractors must exercise judgment to decide whether clinicians suspect infection, whether organ dysfunction is present, and whether organ dysfunction is new or chronic. Reviewers may also need to review dozens of progress notes, including multiple versions of the same note that have been copied and pasted, to find the first documentation of suspected infection, particularly when sepsis occurs after hospital admission.
More broadly, sepsis is an elusive entity to define and identify. There is no gold standard for sepsis and even expert clinicians using common definitions often disagree on whether sepsis is present or absent.9,10
Our study has several limitations. Clinicians may be less adept at abstracting data for quality measures compared to trained hospital abstractors. We focused on agreement for sepsis time zero and overall SEP-1 pass rates, but variability in abstracting individual bundle components could also contribute to disagreements in perceived pass rates. Our study was conducted in academic hospitals and may not be generalizable to community hospitals, where sepsis cases may differ in their level of complexity. Finally, the CMS specification for SEP-1 continues to change over time and this study cannot evaluate the impact of recent changes on interrater reliability.
In conclusion, there is significant variability between different abstractors in determining severe sepsis time zero and SEP-1 compliance rates. These findings underscore the importance of ensuring adequate standardization of quality measures, especially complex ones like SEP-1, that require substantial judgment for implementation.
ACKNOWLEDGEMENTS
Financial Support: This work was funded by the Prevention Epicenters Program of the Centers for Disease Control and Prevention (grant U54CK000484). C.R. received support from the Agency for Healthcare Research and Quality (grant K08HS025008). The content is solely the responsibility of the authors and does not necessarily represent the official views of the Centers for Disease Control and Prevention or the Agency for Healthcare Research and Quality.
Footnotes
Potential conflicts of interest: None of the authors have any conflicts to disclose.
REFERENCES:
- 1. Centers for Medicare and Medicaid Services. QualityNet - Inpatient Hospitals Specifications Manual. https://www.qualitynet.org. Accessed March 19, 2018.
- 2. Venkatesh AK, Slesinger T, Whittle J, et al. Preliminary Performance on the New CMS Sepsis-1 National Quality Measure: Early Insights From the Emergency Quality Network (E-QUAL). Ann. Emerg. Med. 2018;71:10–15 e11.
- 3. Barbash IJ, Rak KJ, Kuza CC, Kahn JM. Hospital Perceptions of Medicare’s Sepsis Quality Reporting Initiative. J. Hosp. Med. 2017;12:963–968.
- 4. Klompas M, Rhee C. The CMS Sepsis Mandate: Right Disease, Wrong Measure. Ann. Intern. Med. 2016;165:517–518.
- 5. Aaronson EL, Filbin MR, Brown DF, Tobin K, Mort EA. New Mandated Centers for Medicare and Medicaid Services Requirements for Sepsis Reporting: Caution from the Field. J. Emerg. Med. 2017;52:109–116.
- 6. McHugh ML. Interrater reliability: the kappa statistic. Biochem. Med. (Zagreb) 2012;22:276–282.
- 7. Stevens JP, Kachniarz B, Wright SB, et al. When policy gets it right: variability in U.S. hospitals’ diagnosis of ventilator-associated pneumonia. Crit. Care Med. 2014;42:497–503.
- 8. ReCal3: Reliability for 3+ coders. http://dfreelon.org/utils/recalfront/recal3/. Accessed March 19, 2018.
- 9. Rhee C, Kadri SS, Danner RL, et al. Diagnosing sepsis is subjective and highly variable: a survey of intensivists using case vignettes. Crit. Care 2016;20:89.
- 10. Angus DC, Seymour CW, Coopersmith CM, et al. A Framework for the Development and Interpretation of Different Sepsis Definitions and Clinical Criteria. Crit. Care Med. 2016;44:e113–121.