Highlights
- The serial interval of novel coronavirus (COVID-19) infections was estimated from a total of 28 infector-infectee pairs.
- The median serial interval is shorter than the median incubation period, suggesting a substantial proportion of pre-symptomatic transmission.
- A short serial interval makes it difficult to trace contacts due to the rapid turnover of case generations.
Keywords: Coronavirus, Outbreak, Illness onset, Generation time, Statistical model, Epidemiology, Viruses
Abstract
Objective
To estimate the serial interval of novel coronavirus (COVID-19) from information on 28 infector-infectee pairs.
Methods
We collected dates of illness onset for primary cases (infectors) and secondary cases (infectees) from published research articles and case investigation reports. We subjectively ranked the credibility of the data and performed analyses on both the full dataset (n = 28) and a subset of pairs with the highest certainty in reporting (n = 18). In addition, we adjusted for right truncation of the data, as the epidemic is still in its growth phase.
Results
Accounting for right truncation and analyzing all pairs, we estimated the median serial interval at 4.0 days (95% credible interval [CrI]: 3.1, 4.9). Limiting our data to only the most certain pairs, the median serial interval was estimated at 4.6 days (95% CrI: 3.5, 5.9).
Conclusions
The serial interval of COVID-19 is close to or shorter than its median incubation period. This suggests that a substantial proportion of secondary transmission may occur prior to illness onset. The COVID-19 serial interval is also shorter than the serial interval of severe acute respiratory syndrome (SARS), indicating that calculations made using the SARS serial interval may introduce bias.
Introduction
The epidemic of novel coronavirus (COVID-19) infections that began in China in late 2019 has grown rapidly, and cases have been reported worldwide. An empirical estimate of the serial interval, defined as the time from illness onset in a primary case (infector) to illness onset in a secondary case (infectee), is needed to understand the turnover of case generations and the transmissibility of the disease (Fine, 2003). Estimates of the serial interval can only be obtained by linking dates of onset for infector-infectee pairs, and these links are not easily established. A recently published epidemiological study used contact tracing data from cases reported in Hubei Province early in the epidemic to estimate the mean serial interval at 7.5 days (Li et al., 2020), which is consistent with the 8.4-day mean serial interval reported for severe acute respiratory syndrome (SARS) from Singaporean household contact data (Lipsitch et al., 2003). However, that estimate was based on only six infector-infectee pairs, so both its mean and its variance may be affected by sampling bias. To further assess the serial interval of COVID-19 infections, we compiled a dataset of 28 publicly shared infector-infectee pairs and calculated the serial interval from these data.
Materials and methods
We scanned publicly available information published in research articles and quoted from official reports of outbreak investigations to obtain our dataset. The date of illness onset was defined as the date on which a symptom relevant to COVID-19 infection appeared, as determined by the reporting governmental body. We subjectively ranked the credibility of the ascertained pairs as “certain” or “probable”: the former was used for pairs whose linkage and dates of illness onset were clearly defined in an academic article, and the latter for pairs whose linkage and dates of illness onset were clearly defined but quoted from outbreak investigation reports. Pairs of cases that could not be scientifically linked were classified as “uncertain” and removed from our analysis, and dates of illness onset were verified in light of an erratum report on presymptomatic transmission in Germany (Rothe et al., 2020). Estimates were obtained for certain and probable pairs combined (n = 28) as well as for the certain pairs alone (n = 18).
The interval censored data were handled in units of days. We employed a Bayesian approach with doubly interval censored likelihood to obtain estimates of the serial interval (Reich et al., 2009):
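$$
L(\Theta \mid D) = \prod_{i} \int_{E_{L,i}}^{E_{R,i}} \int_{S_{L,i}}^{S_{R,i}} g(e)\, f(s - e)\, \mathrm{d}s\, \mathrm{d}e \tag{1}
$$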
where i represents the identity of each pair, (E_{L,i}, E_{R,i}) is the interval for symptom onset of the infector, and (S_{L,i}, S_{R,i}) is the interval for symptom onset of the infectee. Intervals are allowed for each illness onset event so that the serial interval can be inferred even when exact dates of illness onset are not available for an infector-infectee pair. Here, g(.) is the probability density function (p.d.f.) of exposure (here, the infector's illness onset), assumed to follow a uniform distribution because there is no indication of time-dependence in the frequency of illness onset among primary cases. f(.) is the p.d.f. of the serial interval, which we modeled in turn with three candidate distributions: lognormal, gamma, and Weibull. We sampled the posterior distributions using CmdStan version 2.22.1 (Stan Development Team, 2014) (http://github.com/aakhmetz/nCoVSerialInteval2020).
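As a rough illustration of how a doubly interval-censored likelihood term can be evaluated, the sketch below computes the contribution of a single pair under a lognormal serial interval. It is not the authors' Stan implementation: the function names, the midpoint-rule integration, the one-day onset windows, and the parameter values (median 4.0 days, log-scale SD 0.57) are all assumptions chosen for illustration.

```python
import math


def lognormal_cdf(x, mu, sigma):
    """CDF of a lognormal distribution; zero for non-positive x."""
    if x <= 0:
        return 0.0
    return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))


def pair_likelihood(e_left, e_right, s_left, s_right, mu, sigma, steps=2000):
    """Likelihood term for one pair with interval-censored onsets.

    The infector's onset e is uniform on [e_left, e_right] (density g).
    The inner integral of f(s - e) over [s_left, s_right] is a CDF
    difference; the outer integral over e uses a midpoint rule.
    """
    width = e_right - e_left
    g = 1.0 / width                      # uniform density of infector onset
    h = width / steps
    total = 0.0
    for k in range(steps):
        e = e_left + (k + 0.5) * h
        inner = lognormal_cdf(s_right - e, mu, sigma) - lognormal_cdf(s_left - e, mu, sigma)
        total += g * inner * h
    return total


# Hypothetical parameter values, for illustration only.
mu, sigma = math.log(4.0), 0.57
near = pair_likelihood(0, 1, 4, 5, mu, sigma)    # infectee onset about 4 days later
far = pair_likelihood(0, 1, 20, 21, mu, sigma)   # infectee onset about 20 days later
print(near, far)
```

A pair whose onsets are roughly one median serial interval apart contributes a much larger term than a pair separated by 20 days, which is what drives the fitted parameters toward the observed gaps.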
As the epidemic will continue to grow beyond our data collection cutoff point of 12 February 2020, it is possible that the naïve likelihood (1) underestimates the serial interval as sampling during the early stage of the epidemic preferentially excludes infector-infectee pairs with longer serial intervals. We adjusted for this selection bias—called right truncation—in our model. The alternative p.d.f. that accounts for right truncation during the exponential growth phase of the epidemic is written as:
$$
f'(s - e, e) = \frac{f(s - e)}{\displaystyle\int_{0}^{T - e} \frac{r \exp(-ru)}{1 - \exp(-ru)}\, F(T - e - u)\, \mathrm{d}u} \tag{2}
$$
where r is the exponential growth rate estimated at 0.14 (Jung et al., 2020), T is the latest time of observation (12 February 2020), and F(.) is the cumulative distribution function of f(.). We account for the exponential growth rate of cases because recently infected individuals are more likely to be sampled during the exponential growth phase of an epidemic. The widely applicable information criterion (WAIC) was used to compare distributions, and the model with the minimal WAIC value was selected as the best-fit model for each set of estimates with and without right truncation.
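The selection effect behind right truncation can be illustrated with a small simulation. This is a sketch under stated assumptions rather than the fitting procedure used in the study: infector onsets are drawn with density proportional to exp(rt) over a hypothetical 60-day window, serial intervals are lognormal with an assumed median of 4.0 days and log-scale SD of 0.57, and only pairs whose infectee onset falls before the cutoff T are retained.

```python
import math
import random


def simulate_observed_mean(r=0.14, T=60.0, mu=math.log(4.0), sigma=0.57,
                           n=50000, seed=1):
    """Mean serial interval among pairs observed before the cutoff T."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n):
        # Inverse-CDF draw of the infector onset time, density ~ exp(r * t) on [0, T].
        u = rng.random()
        e = math.log(1.0 + u * (math.exp(r * T) - 1.0)) / r
        s = rng.lognormvariate(mu, sigma)   # serial interval of this pair
        if e + s <= T:                      # infectee onset observed by the cutoff
            kept.append(s)
    return sum(kept) / len(kept)


# Mean of the assumed (untruncated) lognormal serial interval distribution.
true_mean = math.exp(math.log(4.0) + 0.57 ** 2 / 2.0)
observed_mean = simulate_observed_mean()
print(round(true_mean, 2), round(observed_mean, 2))
```

Because most infectors have onset close to the cutoff during exponential growth, pairs with long serial intervals are disproportionately unobserved, and the naive sample mean falls below the true mean.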
Results
We were able to obtain data on 28 infector-infectee pairs (see Supplementary Table). Of these, 12 pairs were family clusters. Accounting for right truncation and analyzing all pairs, the model using the lognormal distribution was selected as the best-fit model (WAIC = 224.0), while no significant differences from other models were identified. The median serial interval was estimated at 4.0 days (95% credible interval [CrI]: 3.1, 4.9) while the mean and standard deviation (SD) of the serial interval were estimated at 4.7 days (95% CrI: 3.7, 6.0) and 2.9 days (95% CrI: 1.9, 4.9), respectively. Without truncation, the model using the lognormal distribution was also the best-fit model (WAIC = 128.0) with the median serial interval estimated at 3.9 days (95% CrI: 3.1, 4.8).
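As a quick plausibility check on the reported summary statistics, the sketch below recovers lognormal parameters from the median (4.0 days) and mean (4.7 days) and computes the implied standard deviation. The variable names are illustrative, and the calculation treats the posterior point estimates as if they came from a single lognormal distribution, which is only approximately true.

```python
import math

median, mean = 4.0, 4.7   # point estimates quoted in the text (days)

# For a lognormal: median = exp(mu), mean = exp(mu + sigma^2 / 2).
mu = math.log(median)
sigma = math.sqrt(2.0 * math.log(mean / median))

# Implied standard deviation: mean * sqrt(exp(sigma^2) - 1).
implied_sd = mean * math.sqrt(math.exp(sigma ** 2) - 1.0)
print(round(mu, 3), round(sigma, 3), round(implied_sd, 2))
```

The implied standard deviation is about 2.9 days, in line with the estimate reported above.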
Limiting our dataset to only certain observations, the median serial interval of the best-fit Weibull distribution model was estimated at 4.6 days (95% CrI: 3.5, 5.9) with a mean and SD of 4.8 days (95% CrI: 3.8, 6.1) and 2.3 days (95% CrI: 1.6, 3.5), respectively. Without truncation, the best-fit model used the lognormal distribution and estimated the median serial interval at 4.1 days (95% CrI: 3.2, 5.0). Figure 1 shows the best-fit distributions overlaid with a published distribution of the SARS serial interval (Lipsitch et al., 2003).
Discussion
Our estimate of the median serial interval as 4.0 days indicates that COVID-19 infection leads to rapid cycles of transmission from one generation of cases to the next. The shorter serial interval compared to SARS implies that contact tracing methods must compete against the rapid replacement of case generations, and the number of contacts may soon exceed what available healthcare and public health workers are able to handle. The difference between these distributions suggests that using serial interval estimates from SARS data will result in overestimation of the COVID-19 basic reproduction number.
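The direction of this bias can be shown with a back-of-the-envelope calculation. The sketch below uses the linear approximation R = 1 + rT, which is a standard textbook relation for exponentially distributed generation times and is not a formula taken from this article. It also assumes that the growth rate of 0.14 is expressed per day and that the mean serial interval can stand in for the mean generation time.

```python
def reproduction_number(growth_rate, mean_interval):
    """Approximate R from the growth rate and the mean generation time.

    Uses R = 1 + r * T, an approximation that assumes exponentially
    distributed generation times (an assumption made for this sketch).
    """
    return 1.0 + growth_rate * mean_interval


r = 0.14                                   # growth rate quoted in the text
r0_covid = reproduction_number(r, 4.7)     # mean COVID-19 serial interval (days)
r0_sars = reproduction_number(r, 8.4)      # mean SARS serial interval (days)
print(round(r0_covid, 2), round(r0_sars, 2))
```

For the same observed growth rate, plugging in the longer SARS interval yields a larger reproduction number, which is the overestimation described above.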
More importantly, the estimated median serial interval is shorter than the preliminary estimates of the mean incubation period (approximately 5 days) (Li et al., 2020; Linton et al., 2020). As illustrated in Figure 2, when the serial interval is shorter than the incubation period, pre-symptomatic transmission is likely to have taken place and may even occur more frequently than symptomatic transmission. A substantial proportion of secondary transmission occurring before illness onset indicates that many transmissions cannot be prevented solely through isolation of symptomatic cases, as by the time contacts are traced they may have already become infectious themselves and generated secondary cases (Fraser et al., 2004). It is possible that serial intervals were shortened due to case isolation, especially outside of China, but even the subset of data in China alone can indicate that the mean is shorter than 7.5 days.
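The reasoning can be made explicit with a short worked relation, offered as a sketch. The symbols G (generation time, from infection of the infector to infection of the infectee) and I_1, I_2 (incubation periods of the infector and infectee) are notation introduced here for illustration. The calculation uses only point estimates quoted in this article (mean serial interval 4.7 days, mean incubation period about 5 days), assumes both incubation periods share the same mean, and ignores uncertainty.

$$
S = G + I_2 - I_1, \qquad \mathbb{E}[S] = \mathbb{E}[G] \quad \text{when } \mathbb{E}[I_1] = \mathbb{E}[I_2],
$$

$$
\mathbb{E}[G - I_1] = \mathbb{E}[S] - \mathbb{E}[I_1] \approx 4.7 - 5 = -0.3 \text{ days}.
$$

Here G - I_1 is the time from the infector's illness onset to the transmission event, so a value near or below zero means that, on average, transmission happens around or before the infector's onset.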
Correct ascertainment of dates of illness onset is critical to the calculation of the serial interval. Considering the overall mild nature of the infection (Nishiura et al., 2020), it is possible that different reporting jurisdictions have different criteria for determining what qualifies as illness onset for COVID-19 cases, which is a potential bias we are unable to account for. However, the present study addresses the issue of data quality of the reported pairs in two ways. First, our data include the updated information from a recent report of pre-symptomatic transmission in Germany (Rothe et al., 2020), where it was later found that the primary case was already symptomatic while in contact with persons who later became infected (see the Supplementary Material in Rothe et al., 2020). Second, classifying the credibility of the data and comparing analyses that included and excluded less certain (but nonetheless highly probable) pairs allowed us to determine that our results using all pairs (and therefore a greater sample size) did not differ significantly from the results using only the most credible data.
In conclusion, we have estimated the median serial interval of COVID-19 at 4.0 days, which is close to or shorter than the disease’s median incubation period, indicating that rapid cycles of transmission and substantial pre-symptomatic transmission are occurring. Thus, containment via case isolation alone is likely to be very challenging.
Conflict of interest
The authors declare no conflicts of interest.
Funding source
H.N. received funding support from the Japan Agency for Medical Research and Development [grant number: JP18fk0108050], the Japan Society for the Promotion of Science (JSPS) Grants-in-Aid for Scientific Research (KAKENHI) [grant numbers: 17H04701, 17H05808, 18H04895, and 19H01074], and the Japan Science and Technology Agency (JST) Core Research for Evolutional Science and Technology (CREST) program [grant number: JPMJCR1413]. N.M.L. received a graduate study scholarship from the Ministry of Education, Culture, Sports, Science and Technology, Japan. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Ethical approval
This study was based on publicly available data and did not require ethical approval.
Footnotes
Supplementary material related to this article can be found, in the online version, at https://doi.org/10.1016/j.ijid.2020.02.060.
References
- Fine P.E. The interval between successive cases of an infectious disease. Am J Epidemiol. 2003;158:1039–1047. doi: 10.1093/aje/kwg251.
- Fraser C., Riley S., Anderson R.M., Ferguson N.M. Factors that make an infectious disease outbreak controllable. Proc Natl Acad Sci U S A. 2004;101(16):6146–6151. doi: 10.1073/pnas.0307506101.
- Jung S., Akhmetzhanov A.R., Hayashi K., Linton N.M., Yang Y., Yuan B. Real-time estimation of the risk of death from novel coronavirus (COVID-19) infection: inference using exported cases. J Clin Med. 2020;9(2). doi: 10.3390/jcm9020523.
- Li Q., Guan X., Wu P., Wang X., Zhou L., Tong Y. Early transmission dynamics in Wuhan, China, of novel coronavirus-infected pneumonia. N Engl J Med. 2020; in press. doi: 10.1056/NEJMoa2001316.
- Linton N.M., Kobayashi T., Yang Y., Hayashi K., Andrei A.R., Jung S. Incubation period and other epidemiological characteristics of 2019 novel coronavirus infections with right truncation: a statistical analysis of publicly available case data. J Clin Med. 2020;9(2). doi: 10.3390/jcm9020538.
- Lipsitch M., Cohen T., Cooper B., Robins J.M., Ma S., James L. Transmission dynamics and control of severe acute respiratory syndrome. Science. 2003;300(5627):1966–1970. doi: 10.1126/science.1086616.
- Nishiura H., Kobayashi T., Yang Y., Hayashi K., Miyama T., Kinoshita R. The rate of underascertainment of novel coronavirus (2019-nCoV) infection: estimation using Japanese passengers data on evacuation flights. J Clin Med. 2020;9(2). doi: 10.3390/jcm9020419.
- Reich N.G., Lessler J., Cummings D.A., Brookmeyer R. Estimating incubation period distributions with coarse data. Stat Med. 2009;28:2769–2784. doi: 10.1002/sim.3659.
- Rothe C., Schunk M., Sothmann P., Bretzel G., Froeschl G., Wallrauch C. Transmission of 2019-nCoV infection from an asymptomatic contact in Germany. N Engl J Med. 2020;382(10):970–971. doi: 10.1056/NEJMc2001468.
- Stan Development Team. Stan Modeling Language Users Guide and Reference Manual. 2014. http://mc-stan.org/manual.html