Abstract
More than a year after the first domestic COVID-19 cases, the United States does not have national standards for COVID-19 surveillance data analysis and public reporting. This has led to dramatic variations in surveillance practices among public health agencies, which analyze and present newly confirmed cases by a wide variety of dates.
The choice of which date to use should be guided by a balance between interpretability and epidemiological relevance. Report date is easily interpretable, generally representative of outbreak trends, and available in surveillance data sets. These features make it a preferred date for public reporting and visualization of surveillance data, although it is not appropriate for epidemiological analyses of outbreak dynamics. Symptom onset date is better suited for such analyses because of its clinical and epidemiological relevance. However, using symptom onset for public reporting of new confirmed cases can cause confusion because reporting lags result in an artificial decline in recent cases.
We hope this discussion is a starting point toward a more standardized approach to date-based surveillance. Such standardization could improve public comprehension, policymaking, and outbreak response. (Am J Public Health. 2021;111(12):2127–2132. https://doi.org/10.2105/AJPH.2021.306520)
The COVID-19 pandemic has placed pressure on public health agencies to produce and report surveillance data at an unprecedented speed and granularity.1 While the Centers for Disease Control and Prevention publishes guidelines for collecting COVID-19 surveillance information, there is limited information on standard practices for analysis and public reporting of date-based surveillance data.2,3 This has led to dramatic variations in reporting practices among health departments. For example, 30% of health departments use report date for visualizing new COVID-19 cases in epidemic (epi) curves, 22% use test date, 12% use symptom onset date, 16% display multiple dates such as report date and symptom onset date, and 20% do not define what dates are used or do not show epi curves.
The choices that health departments make regarding date-based analysis and reporting of COVID-19 cases have important consequences for public comprehension and trust, policymaking, and outbreak response. For example, until July 2020, all epi curves included in the Georgia Department of Public Health (GDPH) Daily COVID-19 Status Reports showed new cases by symptom onset date. Although this approach was in keeping with standard epidemiological practice,4 it resulted in an apparent downward trend in recent cases because of incomplete reporting of cases whose symptoms started recently.5 This led to public confusion and incorrect conclusions about Georgia’s progress in reducing infections in the early months of the epidemic.6,7 Discrepancies in dates used across different reporting platforms have caused further confusion.6
This article discusses considerations for reporting and analysis of date-based surveillance data, using a longitudinal COVID-19 surveillance data set from GDPH as an example. This data set included 862 153 confirmed COVID-19 cases as of April 11, 2021.5 We hope this discussion will contribute to the development of more standardized approaches for analysis and presentation of date-based COVID-19 surveillance data across jurisdictions. Such standardization could improve public comprehension, policymaking, outbreak response, and communication.
WHICH DATES TO USE FOR SURVEILLANCE AND REPORTING
Confirmed COVID-19 cases follow a basic timeline from infection date to symptom onset date (for symptomatic individuals), test date, and, finally, report date. Figure 1 presents this timeline, along with median lags between each date from GDPH COVID-19 surveillance data and published reports. These dates are commonly used for epidemiological analysis and public reporting of routine COVID-19 surveillance data. The strengths and weaknesses of each are discussed in the following paragraphs.
Report Date
Report date (i.e., the date a confirmed case was reported to the health department) is generally available in surveillance data sets. For example, it is available for more than 99% of confirmed COVID-19 cases in Georgia. Report date is easily interpretable for decision-makers and the lay public because there is no need to account for reporting lags (Figure 2a). In addition, report date is usually representative of outbreak trends, especially when a running average is presented. These features make it a preferred choice for public reporting and visualization of new confirmed COVID-19 cases via epi curves, maps, and tables. This is the date that is used by the World Health Organization, Centers for Disease Control and Prevention, and many state and local health departments.8,9 Report date is also a suitable choice for summary metrics (e.g., 14-day community transmission rates) where reporting lags associated with other dates could otherwise complicate interpretation.
However, report date is the least epidemiologically relevant date because it is not directly related to the occurrence of disease in individuals or patterns of disease in populations. Epidemiological analyses of outbreak dynamics (e.g., contact tracing, investigations of transmission chains, cluster or hotspot analyses, estimation of the effective reproductive number) require precise characterization of temporal relationships between cases. Using report date for such analyses could introduce systematic or random error.10 Variations in reporting practices can also introduce artificial trends in surveillance data, such as a weekend drop or spikes from resolved reporting backlogs.11
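As a rough illustration of how a running average dampens reporting artifacts such as a weekend drop, the following Python sketch computes a trailing 7-day mean of daily counts by report date. The counts, the start date, and the function name `running_average` are hypothetical and are not drawn from GDPH data; the 7-day window is an assumption, chosen because it spans one full weekly reporting cycle.

```python
from datetime import date, timedelta


def running_average(daily_counts, window=7):
    """Trailing mean over `window` days; None until a full window is available."""
    averages = []
    for i in range(len(daily_counts)):
        if i + 1 < window:
            averages.append(None)
        else:
            window_values = daily_counts[i + 1 - window : i + 1]
            averages.append(sum(window_values) / window)
    return averages


# Hypothetical counts by report date: two weeks with lower weekend reporting.
start = date(2021, 3, 1)  # a Monday
counts = [100, 100, 100, 100, 100, 30, 30] * 2
smoothed = running_average(counts)

for offset, (raw, avg) in enumerate(zip(counts, smoothed)):
    day = start + timedelta(days=offset)
    print(day.isoformat(), raw, "n/a" if avg is None else round(avg, 1))
```

In this constructed example the raw series swings between 100 and 30 cases, while every full 7-day window averages 80, so the weekend dip no longer appears as a trend.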
Test Date
Test date (i.e., the day a positive molecular test was first performed for a confirmed case) may be less biased by the previously mentioned variations in reporting practices and is more clinically relevant. Test date is available in electronic laboratory records, so it is timelier than dates requiring patient interviews. More than 98% of confirmed cases in Georgia include test date (Figure 1). For these reasons, some health departments report confirmed COVID-19 cases by test date. However, positive COVID-19 cases in Georgia have a median lag of 3 days between test date and report date, and median lags were as long as 5 days during parts of the outbreak. These lags may cause confusion when test date is used for public presentation of surveillance data because incomplete reporting results in an artificial decline in recent cases. Test date is not a recommended date for most analyses of transmission dynamics because it may not directly align with the clinical course of infections or patterns of disease in populations. In addition, many testing centers have irregular operating dates, and health care utilization often changes depending on the day of the week.12 This could introduce artificial “weekend effects” or other irregularities in epi curves that use test date (Figure 2b).
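A median lag of this kind could be computed from a line list roughly as follows. This is a minimal sketch: the records and the function name `median_lag_days` are hypothetical, and it assumes that cases missing either date are simply excluded from the calculation.

```python
from datetime import date
from statistics import median

# Hypothetical line list: (test_date, report_date) for each confirmed case.
cases = [
    (date(2021, 3, 1), date(2021, 3, 3)),
    (date(2021, 3, 1), date(2021, 3, 4)),
    (date(2021, 3, 2), date(2021, 3, 5)),
    (date(2021, 3, 2), date(2021, 3, 7)),
    (date(2021, 3, 3), None),  # report date missing: excluded below
]


def median_lag_days(records):
    """Median number of days from test date to report date, or None if no usable records."""
    lags = [(report - test).days for test, report in records if test and report]
    return median(lags) if lags else None


print(median_lag_days(cases))
```

Recomputing the median over successive time windows, rather than over the whole data set, would show whether lags lengthened during particular periods of the outbreak.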
Symptom Onset Date
Symptom onset date (i.e., the date an individual first began experiencing COVID-19‒like symptoms before a positive diagnostic test) is usually reportable on COVID-19 case report forms or collected during case interviews.13 Because symptomatic individuals are most infectious at or just before their symptoms begin, symptom onset date approximates the time period when individuals were most likely to infect close contacts.14 This makes it an appropriate date for epidemiological analyses related to transmission dynamics. Because symptoms generally appear a median of 5 days after infection, symptom onset date can be used to estimate infection date for contact tracing, outbreak response, or other epidemiological analyses of infection dynamics when the date of infection is unknown.15,16
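The back-calculation described above can be sketched in a few lines of Python. The function name `estimate_infection_date` is hypothetical, and applying a single 5-day offset to every case is a simplifying assumption: it ignores the variation in incubation periods between individuals.

```python
from datetime import date, timedelta

MEDIAN_INCUBATION_DAYS = 5  # median cited in the text; individual values vary


def estimate_infection_date(symptom_onset, incubation_days=MEDIAN_INCUBATION_DAYS):
    """Point estimate of infection date: symptom onset minus an assumed incubation period."""
    if symptom_onset is None:
        return None  # asymptomatic case or onset date not recorded
    return symptom_onset - timedelta(days=incubation_days)


print(estimate_infection_date(date(2021, 3, 10)))  # 2021-03-05
```

An analysis that needs to reflect uncertainty could instead draw the offset from an incubation period distribution, at the cost of added complexity.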
Symptom onset date is not ideal for public reporting because the inherent lag from onset to report creates an artificial downward trend in recent cases (Figure 2c). Symptom onset date also must be collected via case report forms or contact-tracing interviews, which can cause further delays in reporting. The public and decision-makers may misinterpret reporting lags as true downward trends. This can negatively affect public trust and potentially lead to misinformed policy decisions.6,7 Unlike in previous pandemics, such as those caused by HIV and severe acute respiratory syndrome coronavirus 1 (SARS-CoV-1), a high proportion of SARS-CoV-2 infections are asymptomatic.17 Reporting all COVID-19 cases by symptom onset date obscures this fact and may cause confusion. In addition, public consumption of COVID-19 epidemiological data is dramatically higher than in previous pandemics.1,18 This makes it more imperative to present data clearly and avoid potential ambiguities from dates with long reporting delays.
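The artificial decline can be reproduced with a small deterministic example. In the sketch below, true incidence is held constant at 100 cases per onset day, yet the counts visible on the last day fall toward zero for recent onset dates because those cases have not yet been reported. The lag distribution and the function name `observed_counts_by_onset` are hypothetical and chosen only for illustration.

```python
def observed_counts_by_onset(true_daily_cases, lag_distribution, n_days):
    """Cases per onset day that have been reported by the end of the last day.

    lag_distribution maps onset-to-report lag (in days) to the fraction of
    cases reported with that lag.
    """
    observed = []
    for onset_day in range(n_days):
        days_elapsed = (n_days - 1) - onset_day
        reported_fraction = sum(
            share for lag, share in lag_distribution.items() if lag <= days_elapsed
        )
        observed.append(true_daily_cases * reported_fraction)
    return observed


# Hypothetical lags: a quarter of cases reported 2, 4, 6, and 8 days after onset.
lag_distribution = {2: 0.25, 4: 0.25, 6: 0.25, 8: 0.25}
counts = observed_counts_by_onset(100, lag_distribution, n_days=10)
print(counts)
```

The two oldest onset days show the full 100 cases, and each more recent pair of days shows fewer, ending at zero, even though the underlying incidence never changed.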
Limitations of using symptom onset date include underreporting, potential recall bias, and inapplicability to asymptomatic or presymptomatic individuals. These limitations can reduce reliability and leave symptom onset date missing for a large share of records in surveillance data sets. In Georgia, for example, symptom onset date is available for only 55% of confirmed COVID-19 cases. Despite these limitations, various methods can be used to impute symptom onset date and assess bias. GDPH and other health departments derive symptom onset date for missing observations by using a decision tree of other available dates, and other approaches include predictive regression models and multiple imputation.5,19
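A decision tree over other available dates might look something like the following Python sketch. It is not GDPH's actual rule set, which is not described here: the order of fallbacks, the function name `derive_onset_date`, and the 2-day onset-to-test offset are all assumptions made for illustration. Only the 3-day test-to-report offset echoes the median lag quoted above for Georgia.

```python
from datetime import date, timedelta

ONSET_TO_TEST_DAYS = 2   # hypothetical placeholder, not a GDPH value
TEST_TO_REPORT_DAYS = 3  # median test-to-report lag quoted in the text


def derive_onset_date(onset=None, test=None, report=None):
    """Return (onset_date, source), falling back through the other dates in order."""
    if onset is not None:
        return onset, "reported"
    if test is not None:
        return test - timedelta(days=ONSET_TO_TEST_DAYS), "from_test_date"
    if report is not None:
        total_offset = ONSET_TO_TEST_DAYS + TEST_TO_REPORT_DAYS
        return report - timedelta(days=total_offset), "from_report_date"
    return None, "missing"


print(derive_onset_date(test=date(2021, 3, 10)))
print(derive_onset_date(report=date(2021, 3, 10)))
```

Returning the source alongside the date keeps derived values distinguishable from reported ones, which helps when assessing how much imputation influences an analysis.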
Infection Date
Infection date (i.e., when an individual was exposed to and infected with SARS-CoV-2) is the most epidemiologically relevant date because of its direct relationship to disease patterns in populations. When available, it is an ideal date for contact tracing,14 outbreak investigations,20 and spatiotemporal analyses of transmission dynamics.21,22 However, individuals often do not know the date they were infected, and it is not collected as part of most COVID-19 surveillance systems. This reduces its applicability to analyses that use routine surveillance data. It can be estimated by using symptom onset date, but many estimation methods homogenize substantial heterogeneities in incubation periods.23,24 Long lags between infection and report dates also complicate the interpretation of epi curves that use infection date (Figure 2d). In Georgia, for example, most cases are reported more than 10 days after infection (Figure 1). Therefore, infection date is not recommended for use in public presentation of surveillance data.
PUBLIC HEALTH IMPLICATIONS
More than a year after the first domestic COVID-19 cases, the United States still does not have national standards for jurisdiction-level analysis and reporting of date-based COVID-19 surveillance data. A more standardized approach could increase public confidence and enable better harmonization of indicators and reporting methods across jurisdictions.25 This is increasingly important in the current context when strengthened surveillance and reporting are required to inform reopening decisions, maintain public confidence, and rapidly detect and respond to recurrent outbreaks.
The choice of which date to use for public reporting or epidemiological analysis should be guided by a balance between interpretability and epidemiological relevance. In general, report date is preferable for use in public reporting, while symptom onset date is the best choice for many epidemiological analyses of transmission dynamics when infection date is not available. Some health department dashboards, such as that of GDPH, now give users an option to view epi curves by date of report and by date of symptom onset. This may reduce public confusion, but additional efforts are needed to identify best practices for reporting and visualization of surveillance data.5,18
ACKNOWLEDGMENTS
This work was supported in part by the Robert W. Woodruff Foundation through a grant to the Emory COVID-19 Response Collaborative. Research reported in this publication was supported by the National Institute of Allergy and Infectious Diseases of the National Institutes of Health under award T32AI138952.
We would like to thank Laura Edison and Cherie Drenzek of GDPH, Allison Chamberlain and Hannah Cooper of Emory University Rollins School of Public Health, and the Emory COVID-19 Response Collaborative for leading the research partnership between GDPH and Emory University. We are grateful to Katherine Yih of Harvard Medical School for her insightful comments on the draft article. We extend a special thanks to the staff of GDPH for their tireless efforts in collecting, analyzing, and reporting COVID-19 surveillance data.
Note. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. The data for this project were supplied by the GDPH. The contents herein are those of the authors and do not necessarily represent the official views of, nor an endorsement by, the GDPH.
CONFLICTS OF INTEREST
The authors have no potential conflicts of interest to declare.
HUMAN PARTICIPANT PROTECTION
De-identified case data were analyzed with the purpose of informing the GDPH’s pandemic response. GDPH therefore determined that this analysis was exempt from GDPH institutional review board review and approval.
Footnotes
See also Lau et al., p. 2085.
REFERENCES
- 1. Brownson RC, Burke TA, Colditz GA, Samet JM. Reimagining public health in the aftermath of a pandemic. Am J Public Health. 2020;110(11):1605–1610. doi: 10.2105/AJPH.2020.305861.
- 2. Centers for Disease Control and Prevention. 2021. https://www.cdc.gov/coronavirus/2019-ncov/php/reporting-pui.html
- 3. Pearce N, Vandenbroucke JP, VanderWeele TJ, Greenland S. Accurate statistics on COVID-19 are essential for policy guidance and decisions. Am J Public Health. 2020;110(7):949–951. doi: 10.2105/AJPH.2020.305708.
- 4. Centers for Disease Control and Prevention. 2015. https://www.cdc.gov/foodsafety/outbreaks/investigating-outbreaks/epi-curves.html
- 5. Georgia Department of Public Health. Georgia Department of Public Health daily status report. 2021. https://dph.georgia.gov/covid-19-daily-status-report
- 6. Mariano W. Atlanta Journal‒Constitution. April 23, 2020. https://www.ajc.com/news/state–regional-govt–politics/confused-and-scared-georgians-frustrated-over-shifting-virus-data/k9oUbZDE3z6iyouWQBF7gJ
- 7. Newkirk M. Bloomberg. July 17, 2020. https://www.bloomberg.com/news/articles/2020-07-17/georgia-massaged-virus-data-to-reopen-then-voided-mask-orders
- 8. Centers for Disease Control and Prevention. 2021. https://www.cdc.gov/coronavirus/2019-ncov/cases-updates/cases-in-us.html
- 9. Surveillance strategies for COVID-19 human infection. 2020.
- 10. Rasmussen SA, Goodman RA. The CDC Field Epidemiology Manual. New York, NY: Oxford University Press; 2019.
- 11. Cohen J. Forbes. April 27, 2020. https://www.forbes.com/sites/joshuacohen/2020/04/27/transient-drops-in-reported-new-coronavirus-cases-sunday-effect/?sh=3cc9f115811a
- 12. Marcilio I, Hajat S, Gouveia N. Forecasting daily emergency department visits using calendar variables and ambient temperature readings. Acad Emerg Med. 2013;20(8):769–777. doi: 10.1111/acem.12182.
- 13. Centers for Disease Control and Prevention. 2020. https://www.cdc.gov/coronavirus/2019-ncov/php/contact-tracing/contact-tracing-plan/investigating-covid-19-case.html
- 14. Cheng H-Y, Jian S-W, Liu D-P, et al. Contact tracing assessment of COVID-19 transmission dynamics in Taiwan and risk at different exposure periods before and after symptom onset. JAMA Intern Med. 2020;180(9):1156–1163. doi: 10.1001/jamainternmed.2020.2020.
- 15. Lauer SA, Grantz KH, Bi Q, et al. The incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases: estimation and application. Ann Intern Med. 2020;172(9):577–582. doi: 10.7326/M20-0504.
- 16. Pan A, Liu L, Wang C, et al. Association of public health interventions with the epidemiology of the COVID-19 outbreak in Wuhan, China. JAMA. 2020;323(19):1915–1923. doi: 10.1001/jama.2020.6130.
- 17. Petersen E, Koopmans M, Go U, et al. Comparing SARS-CoV-2 with SARS-CoV and influenza pandemics. Lancet Infect Dis. 2020;20(9):e238–e244. doi: 10.1016/S1473-3099(20)30484-9.
- 18. Chiolero A, Anker D. Data are not enough to reimagine public health. Am J Public Health. 2020;110(11):1614. doi: 10.2105/AJPH.2020.305907.
- 19. Nguyen CD, Carlin JB, Lee KJ. Model checking in multiple imputation: an overview and case study. Emerg Themes Epidemiol. 2017;14(1):8. doi: 10.1186/s12982-017-0062-6.
- 20. Li Q, Guan X, Wu P, et al. Early transmission dynamics in Wuhan, China, of novel coronavirus-infected pneumonia. N Engl J Med. 2020;382(13):1199–1207. doi: 10.1056/NEJMoa2001316.
- 21. Pei S, Kandula S, Shaman J. Differential effects of intervention timing on COVID-19 spread in the United States. Sci Adv. 2020;6(49):eabd6370. doi: 10.1126/sciadv.abd6370.
- 22. Lau MSY, Grenfell B, Thomas M, Bryan M, Nelson K, Lopman B. Characterizing superspreading events and age-specific infectiousness of SARS-CoV-2 transmission in Georgia, USA. Proc Natl Acad Sci U S A. 2020;117(36):22430–22435. doi: 10.1073/pnas.2011802117.
- 23. Fareed N, Swoboda CM, Chen S, Potter E, Wu DTY, Sieck CJ. US COVID-19 state government public dashboards: an expert review. Appl Clin Inform. 2021;12(2):208–221. doi: 10.1055/s-0041-1723989.
- 24. McAloon CG, Wall P, Griffin J, et al. Estimation of the serial interval and proportion of pre-symptomatic transmission events of COVID-19 in Ireland using contact tracing data. BMC Public Health. 2021;21(1):805. doi: 10.1186/s12889-021-10868-9.
- 25. Tarantola D, Dasgupta N. COVID-19 surveillance data: a primer for epidemiology and data science. Am J Public Health. 2021;111(4):614–619. doi: 10.2105/AJPH.2020.306088.