Abstract
To better control the SARS-CoV-2 pandemic, it is essential to quantify the impact of control measures and the fraction of infected individuals that are detected. To this end we developed a deterministic transmission model based on the renewal equation and fitted the model to daily case and death data in the first few months of 2020 in 79 countries and states, representing 4.2 billion individuals. Based on a region-specific infection fatality ratio, we inferred the time-varying probability of case detection and the time-varying decline in transmissibility. As a validation, the predicted total number of infected was close to that found in serosurveys; more importantly, the inferred probability of detection strongly correlated with the number of daily tests per inhabitant, with 50 % detection achieved with 0.003 daily tests per inhabitant. Most of the decline in transmission was explained by the reductions in transmissibility (social distancing), which avoided 10 million deaths in the regions studied over the first four months of 2020. In contrast, symptom-based testing and isolation of positive cases was not an efficient way to control the spread of the disease, as a large part of transmission happens before symptoms and only a small fraction of infected individuals was typically detected. The latter is explained by the limited number of tests available, and the fact that increasing test capacity often increases the probability of detection less than proportionally. Together these results suggest that little control can be achieved by symptom-based testing and isolation alone.
Keywords: SARS-CoV-2, Epidemiology, Mathematical model, Test and isolate, Detection probability
1. Introduction
The coronavirus SARS-CoV-2 originated in November-December 2019 (Rambaut, 2020), appeared as a cluster of cases of pneumonia of unknown etiology in the city of Wuhan (Hubei province, China) in December 2019-January 2020, and subsequently spread across the world in 2020. The rapid doubling time associated with a basic reproductive number R0 of 2–3 (Li et al., 2020a; Kucharski et al., 2020; Riou and Althaus, 2020), together with the fact that an estimated ∼50 % of transmission is presymptomatic (Ferretti et al., 2020; Casey et al., 2020), makes it difficult to control. A substantial proportion of infected individuals need to be hospitalised: 1–18% with increasing age in China, 4% overall in France (Salje et al., 2020; Verity et al., 2020; Wu et al., 2020a). The infection fatality ratio (IFR) is around 1%, and much higher in the elderly (Salje et al., 2020; Verity et al., 2020; Wu et al., 2020a; Hauser et al., 2020).
By early March 2020, many regions of the world had imposed strong social distancing measures to reduce transmission and contain the spread of SARS-CoV-2. These social distancing measures were varied and included school closures, business closures, partial or full lockdowns, stay-at-home orders, the prohibition of gatherings, curfews, etc. These measures resulted in the stabilisation or the inversion of the epidemic curve in many countries (Flaxman et al., 2020). This was accompanied by an increase in the capacity to PCR-test potentially infected individuals.
To improve the control of the epidemic, it is necessary to understand the transmission dynamics during the period of unrestricted growth in the first few months of 2020, and the impact of the subsequent reduction in transmission owing to (i) the depletion of susceptible individuals, (ii) the social distancing measures implemented, (iii) testing and isolation of cases. We develop a dynamical epidemiological model that describes the transmission dynamics with a discrete-time renewal equation. Thanks to published estimates of the IFR, our model predicts the daily number of all cases, the fraction of detected cases, and the daily number of deaths over the course of the epidemic, and can thus be readily fitted to data from 79 countries, states and provinces. Within each of these regions, we infer the time-varying probability of detection and the time-varying transmissibility. We focus on the first few months of the epidemic (up to May 8th 2020), when transmissibility can be assumed to decrease, the probability of detection to increase, and the IFR to be approximately constant. We deduce the impact of detection and case isolation on transmission dynamics. We call “detection” the fact that an infected individual tests positive and is counted as a case, which is also called “ascertainment”. The model is validated by the small difference between the predicted attack rate and that found in serological surveys. Finally, we show that the capacity to detect SARS-CoV-2 infections is strongly related to the number of tests performed per inhabitant daily, develop a novel model that relates the number of tests to the probability of detection, and verify the model predictions. These results will serve to better understand and control transmission dynamics.
2. Results
We model the dynamics of SARS-CoV-2 transmission for 79 geographical zones (countries, USA states, Canadian provinces and the Hubei province in China; hereafter “regions”) with a discrete-time renewal equation that describes how individuals are infected each day by transmission from previously infected individuals (Methods). Our model is akin to an existing model that predicts the daily number of deaths (Flaxman et al., 2020). The adapted renewal equation we use predicts in a deterministic way the daily numbers of infected, cases recorded, and deaths, given temporal profiles of transmissibility and case detection (Fig. 1).
Fig. 1.
Schematic of the model.
Infected individuals may die with a constant probability called the infection fatality ratio (IFR). We fix both the IFR and the distribution of time to death to values previously estimated from data from mainland China (Verity et al., 2020). The inference of the number of infected and hence the probability of detection crucially relies on the IFR, which links the daily deaths with the past number of infected individuals. The IFR is difficult to estimate because case detection is biased towards more severe cases. Early estimates relied on settings where tests were exhaustive such as repatriation flights or the Diamond Princess cruise boat (Salje et al., 2020; Verity et al., 2020; Wu et al., 2020b). We use one of the published estimates of age-dependent IFR ((12); similar to other estimates, Supplementary Fig. 1) to compute a region-specific IFR that takes into account the regional age distribution. This region-specific IFR ranges from 0.3–0.4 % (Bangladesh, Egypt, Pakistan, Philippines, South Africa) to 1.2–1.4 % (Germany, Italy, Portugal, Spain), and is typically around 1% in the regions examined (median 0.94 %). This region-specific IFR does not take into account differences in health care capacity that could introduce additional variability (Walker et al., 2020). The IFR and the distribution of the time from infection to death allow us to project back in time the number of infected individuals.
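The region-specific IFR described above can be read as a population-weighted average of age-specific IFRs. The sketch below illustrates that calculation only; the age bands, age shares and IFR values are hypothetical placeholders, not the published estimates used in the study.

```python
def region_ifr(age_shares, ifr_by_age):
    """Population-weighted average of age-specific IFRs."""
    if len(age_shares) != len(ifr_by_age):
        raise ValueError("one IFR value is needed per age band")
    total = sum(age_shares)  # normalise in case shares do not sum to 1
    return sum(s * f for s, f in zip(age_shares, ifr_by_age)) / total


# Hypothetical age bands (0-19, 20-39, 40-59, 60-79, 80+) and IFR values,
# chosen only to illustrate the calculation.
ifr_by_age = [0.0001, 0.0005, 0.003, 0.03, 0.10]
young_region = [0.45, 0.32, 0.16, 0.06, 0.01]
old_region = [0.19, 0.24, 0.28, 0.22, 0.07]

ifr_young = region_ifr(young_region, ifr_by_age)
ifr_old = region_ifr(old_region, ifr_by_age)
```

With the same age-specific values, a region with an older population ends up with a higher overall IFR, which is the mechanism behind the regional variation reported in the text.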
We fit jointly the number of cases and deaths. This strategy has two advantages. While the number of deaths may be small, the number of cases is typically much larger and less subject to stochastic fluctuations. Furthermore, cases give an early signal of potential changes in transmissibility, as infected individuals may be detected as soon as symptoms occur, about a week after infection, while death occurs about three to four weeks after infection on average. The number of recorded cases, however, depends on the intensity of testing and the testing strategy. We account for changes in intensity of testing by modelling and inferring a time-varying probability of case detection. We can thus interpret the number of cases recorded jointly with the number of deaths. Case detection is assumed to happen a few days after symptom onset (2.2 days on average), as inferred from (Lauer et al., 2020). The imperfect case detection results from a variety of factors, including asymptomatic or paucisymptomatic infected individuals, limited testing capacity, or false negatives. Case detection is assumed to be followed by perfect isolation. Isolation reduces the pool of infected individuals who contribute to transmission (Methods).
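A minimal sketch of a deterministic discrete-time renewal model of this kind is given below. It is not the implementation used in the study: the function names, the seeding of the epidemic and the delay distributions are assumptions made for illustration, and the truncation of transmission by isolation of detected cases is left out for brevity.

```python
def lagged_sum(series, dist, t):
    """Sum of series[t-1-k] * dist[k]; dist[k] is the weight of a (k+1)-day delay."""
    return sum(series[t - 1 - k] * p for k, p in enumerate(dist) if t - 1 - k >= 0)


def simulate(r_t, c_t, gen, to_detect, to_death, ifr, pop, seed=10.0):
    """Daily infections, detected cases and deaths (illustrative only).

    r_t, c_t  : daily transmissibility and probability of detection
    gen       : generation-time distribution
    to_detect : distribution of the delay from infection to detection
    to_death  : distribution of the delay from infection to death
    """
    infected, cases, deaths = [seed], [0.0], [0.0]
    susceptible = pop - seed
    for t in range(1, len(r_t)):
        # New infections: transmissibility x susceptible fraction x past infections.
        new_inf = r_t[t] * (susceptible / pop) * lagged_sum(infected, gen, t)
        new_inf = min(new_inf, susceptible)
        # Cases and deaths are delayed fractions of past infections.
        cases.append(c_t[t] * lagged_sum(infected, to_detect, t))
        deaths.append(ifr * lagged_sum(infected, to_death, t))
        susceptible -= new_inf
        infected.append(new_inf)
    return infected, cases, deaths


# Toy run with one-day delays so the output is easy to check by hand.
infected, cases, deaths = simulate(
    r_t=[2.0] * 5, c_t=[0.5] * 5, gen=[1.0], to_detect=[1.0],
    to_death=[1.0], ifr=0.01, pop=1e12,
)
```

Because cases and deaths are both computed from the same series of past infections, a change in transmissibility shows up in the case series before it shows up in the death series whenever the detection delay is shorter than the delay to death.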
From the renewal equation framework predicting the daily number of cases and deaths, we infer the time-varying transmission rate and the time-varying detection probability in the 79 studied regions. The chosen regions are those where the daily death incidence had reached 10 deaths at least once as of 23rd April 2020 according to the Johns Hopkins Coronavirus Resource Center database. We fit the model by maximum likelihood to the case and death count data, assuming the data points each day are drawn from a negative binomial distribution with mean given by the model prediction, and with an inferred dispersion parameter. The maximum-likelihood model generally fitted the data very closely (Fig. 2 for a few example regions, Supplementary Fig. 2 for all regions).
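The likelihood described above can be sketched as follows, assuming the common mean and dispersion parametrisation of the negative binomial (variance equal to mean + mean²/k). The exact parametrisation used in the study is not stated in this section, so this choice and the function names are assumptions.

```python
import math


def nb_logpmf(y, mean, k):
    """Log-probability of count y under a negative binomial with the given
    mean (> 0) and dispersion k (> 0); variance = mean + mean**2 / k."""
    return (math.lgamma(y + k) - math.lgamma(k) - math.lgamma(y + 1)
            + k * math.log(k / (k + mean))
            + y * math.log(mean / (k + mean)))


def log_likelihood(observed, predicted, k):
    """Sum of daily log-probabilities; predicted values are the model means."""
    return sum(nb_logpmf(y, mu, k) for y, mu in zip(observed, predicted))


# Joint fit: the case and death series each contribute one such term, e.g.
# total = log_likelihood(cases_obs, cases_pred, k_cases) \
#       + log_likelihood(deaths_obs, deaths_pred, k_deaths)
```

Maximising this quantity over the model parameters and the dispersion is what "fit by maximum likelihood" refers to; any general-purpose optimiser could be used for that step.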
Fig. 2.
Example trajectories of daily cases (green) and deaths (blue) incidence on a log10 scale. The crosses show the data, the points the fitted maximum likelihood model and the shaded regions the 95 % confidence intervals. We focused on the model with a smooth decline in transmissibility. The red points are the unobserved daily infections predicted by the model. The inset shows the inferred relative transmissibility (with respect to initial transmissibility) in red and the probability of detection in black.
We validate our projections by comparing the inferred total attack rates (the proportion of individuals in the population that have ever been infected at a given date) with the number of infected individuals in twelve regions where the number of infected at a certain time is known by systematic survey on a representative sample (Supplementary Table 1). The attack rate is given by the result of seroprevalence surveys, where a seropositive individual is assumed to have been infected no later than 13 days in the past, corresponding to the median time to seroconversion (Long et al., 2020). Note that in one case (Austria) we use results from a systematic PCR test survey. In that one case a positive individual is assumed to have been infected in the interval from 20 to 4 days ago (Kucirka et al., 2020). The attack rate predicted by our analysis was generally close to that in the data (Fig. 3), with no systematic bias. Countries above the identity line have more positive individuals in reality than predicted by the model. For these countries the true IFR is lower than the one assumed: given the realised number of deaths, the country actually had more infected individuals than what the model predicts. This pattern could also be caused by death under-reporting. For example, for India there was a notable discrepancy whereby seroprevalence is at 0.7 % against the predicted 0.1 %. Conversely, countries below the identity line have a higher IFR than the one assumed. Deviations of the true IFR from that assumed in the model bias the estimated absolute value of the detection probability, but not the temporal trends in the detection probability. The good agreement between predicted and true seroprevalence is reassuring, but we note that countries with available seroprevalence surveys may also be the countries where COVID-19 deaths are best reported.
Fig. 3.
Comparison of the total number of infected (attack rate) found in systematic serological test surveys with that predicted by our model. The segments are 95 % confidence intervals (for the data, binomial confidence intervals; for the model, estimated from the MCMC sample). Binomial confidence intervals for the data do not take into account the uncertainty on the representativity of the sample and could thus be underestimates. We used the model with the smooth sigmoid reduction in transmission; the model with the sharp transition gave very similar results.
To study the change in transmission following social distancing measures, one could infer the effects of different types of measures such as business, school, bar and restaurant closures, banning large gatherings, lockdowns, etc. However, these measures and their implementations are very varied across regions, and multiple measures are often implemented simultaneously and may be accompanied by undocumented behavioural changes, complicating the inference of effects of individual measures (Flaxman et al., 2020). Instead, we estimate a region-specific reduction in transmissibility. We test two functional forms for the decline in transmissibility: (i) a sharp reduction in transmissibility at the date of social distancing, (ii) a smooth sigmoid reduction in transmissibility. For the first functional form, we considered the date of the national lockdown (65/79 regions). If there was no national lockdown, we chose the date of regional lockdowns (5 regions: Algeria, Brazil, Indonesia, Oklahoma, Russia) or the date of a variety of distancing measures without strict lockdown (9 regions: British Columbia, Canada, Chile, Dominican Republic, Egypt, Iran, Ontario, Sweden, Turkey). When comparing the fit of the two functional forms with the Akaike Information Criterion (AIC), the smooth reduction in transmissibility fitted the data better (an AIC difference greater than 4) in 50 regions out of 79. In these cases the reduction in transmissibility predated the date of social distancing by 5–20 days (Supplementary Fig. 3). In the 29 other regions, both functional forms were similar (Supplementary Fig. 4).
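The two functional forms and the AIC comparison can be sketched as below. The parametrisation of the sigmoid (midpoint and width) and the parameter counts are assumptions made for illustration; only the AIC formula and the threshold of 4 come from standard practice and the text respectively.

```python
import math


def step_profile(t, r0, r_end, t_lock):
    """Sharp reduction: r0 before the distancing date, r_end from that date on."""
    return r0 if t < t_lock else r_end


def sigmoid_profile(t, r0, r_end, t_mid, width):
    """Smooth decline from r0 to r_end, centred on day t_mid."""
    return r_end + (r0 - r_end) / (1.0 + math.exp((t - t_mid) / width))


def aic(log_lik, n_params):
    """Akaike Information Criterion: lower is better."""
    return 2 * n_params - 2 * log_lik


def smooth_preferred(ll_sharp, k_sharp, ll_smooth, k_smooth, threshold=4.0):
    """True when the smooth model beats the sharp one by more than `threshold`."""
    return aic(ll_sharp, k_sharp) - aic(ll_smooth, k_smooth) > threshold
```

In the sharp form the date is fixed by the documented measures, whereas in the smooth form the timing is inferred, which is why the latter can reveal a decline that predates the official date.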
In most regions, we find a strong reduction in transmissibility accompanied by an increase in detection capacity. The basic reproduction number decreased from 3.7 on average across countries at the first date when 5 daily cases were reached, to 0.98 as of 8th of May (Fig. 4B). There is substantial variation in the inferred initial transmissibility across regions. The mean probability of detection increased from 4% to 29 % over the same period (Fig. 4C). The transmissibility remained above 1 (the threshold above which the epidemic expands in the absence of other measures) in several regions as of 8th May, including Minnesota, Brazil, Mexico, Pakistan, South Africa (Fig. 5A). The type of social distancing measure (national lockdown, regional lockdown, distancing) did not affect the final transmissibility (linear model for the final transmissibility as a function of the distancing measure; p = 0.46). The probability of detection as of 8th May was below 50 % for 67 out of 79 regions (Fig. 4B). The model predicted an attack rate of infection across regions of 0.1 % (India) to 15 % (New Jersey, USA).
Fig. 4.
Panel A shows the map of the regions considered in this study, colored by geographic areas (Europe + Russia, North Africa/Middle East, Asia, South Africa, Central-South America, North America). The USA are represented by 33 states. China is represented by the Hubei province. Canada is represented by three provinces (Quebec, Ontario, British Columbia) and Canada as a whole. Panels B, C show the inferred transmissibility and the probability of detection as a function of time for all regions. The overall mean is a thick black line. The early blue trajectory is that of the Hubei province in China.
Fig. 5.
Inferred transmissibility Rt (panel A) and probability of detection ct (panel B) as of May 8th, for each region. The point is the maximum likelihood estimate and the segment shows the 95 % confidence intervals.
2.1. Factors contributing to the reduction in transmission
The effective reproduction number on the 8th of May, which includes the impacts of detection, isolation and immunity, may be written as the product of the initial basic reproduction number and three factors that all reduce transmission (Material and Methods). The reduction in overall transmission depends on (i) the depletion of the pool of susceptibles, (ii) reduced transmissibility impacting the basic reproduction number owing to what we generically call “social distancing”, (iii) testing and case isolation. We found that the factor contributing most to reduced transmission is the reduced transmissibility (Fig. 6B).
Fig. 6.
Impact of control measures and immunity on transmission dynamics. Panel A illustrates how social distancing and case isolation reduce transmission of the disease. The basic reproduction number is given by the area under the curve. A reduction in transmissibility uniformly reduces Rt (blue curve and area), while detection and case isolation truncate the serial interval (red curve). Panel B represents the distribution of the reduction in transmission caused by social distancing (blue), detection and isolation (red) and immunity of already infected individuals (green) across the 79 regions. Panel C represents the log10 number of deaths averted by social distancing between the beginning of the epidemic and the 8th May 2020.
The reduction in transmission owing to population immunity depends on the total number of individuals ever infected (the attack rate) over the initial number of susceptible individuals, assumed to be the total population size of the region. The attack rate was smaller than 2% in 47 regions out of 79. The reduction in the number of susceptible individuals that could lead to herd immunity was thus very small in most regions, assuming that all individuals are initially susceptible. The second factor is estimated from the inferred sigmoid curve for the transmissibility Rt. The third factor is estimated assuming that case detection is followed by strict isolation, such that a detected case stops transmitting and the generation time is effectively truncated (Fig. 6A). This assumption is compatible with evidence that generation times are shortened by case isolation (Bi et al., 2020; Ali et al., 2020). With our set of parameters, detection and isolation prevent on average only 46 % of the transmission of a detected individual, so that the reduction owing to detection and isolation is approximately proportional to the probability of detection. The resulting reduction in transmission caused by detection and isolation is typically small (even under the conservative assumption that all detected individuals are perfectly isolated) because a small fraction of infected individuals is detected, and because individuals are detected a few days after symptoms, when about half of the transmission has already occurred (Ferretti et al., 2020; Casey et al., 2020).
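As an illustration of this decomposition, the sketch below multiplies an initial reproduction number by three factors. The specific form of each factor (one minus the attack rate for immunity, the ratio of current to initial transmissibility for distancing, and one minus 0.46 times the probability of detection for isolation) is an assumption made for this sketch based on the description above, not the exact expression from Material and Methods. The inputs are the cross-region averages quoted in the text, used only as plausible magnitudes.

```python
def effective_r(r0, attack_rate, r_t, c_t, averted_if_detected=0.46):
    """Effective reproduction number as a product of three reducing factors.

    averted_if_detected: share of a detected individual's transmission that
    isolation prevents (46 % in the text).
    """
    immunity = 1.0 - attack_rate           # depletion of susceptibles
    distancing = r_t / r0                  # reduced transmissibility
    isolation = 1.0 - averted_if_detected * c_t  # detection and isolation
    r_eff = r0 * immunity * distancing * isolation
    return r_eff, (immunity, distancing, isolation)


# Illustrative magnitudes: R0 = 3.7, final transmissibility 0.98,
# detection probability 29 %, attack rate 2 %.
r_eff, (f_immunity, f_distancing, f_isolation) = effective_r(3.7, 0.02, 0.98, 0.29)
```

With these inputs the distancing factor is by far the smallest of the three, in line with the statement that reduced transmissibility contributes most to the overall reduction.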
To estimate the number of deaths averted by social distancing from the beginning of the epidemic to May 8th, we simulated the epidemic in the absence of social distancing measures, i.e. when transmissibility remains constant at its inferred initial value. The difference between the simulated number of deaths and the true reported number of deaths is the number of deaths averted. The reductions in transmissibility from the beginning of the epidemic to May 8th avoided in total across these regions 9.8 million deaths (95 % CI 5.8–13.1 million), and of the order of ten thousand to one million deaths per country. Countries which averted the largest number of deaths were Brazil (694,000 [412,000–1,070,000]), Mexico (586,000 [49,100–603,000]), France (616,000 [563,000–667,000]), Germany (716,000 [548,000–827,000]), Italy (804,000 [795,000–805,000]), Spain (532,000 [522,000–535,000]), United Kingdom (564,000 [551,000–567,000]). A previous study of 11 European countries reported figures similar to ours (500,000–700,000 deaths avoided in the five aforementioned European countries (11)).
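The counterfactual calculation can be sketched with a toy renewal simulation. Everything below is hypothetical (population size, delay distributions, transmissibility profiles), and the "observed" death toll is itself simulated here as a stand-in for the reported deaths used in the study.

```python
def total_deaths(r_t, gen, to_death, ifr, pop, seed=10.0):
    """Cumulative deaths from a deterministic renewal model (illustrative)."""
    infected = [seed]
    susceptible = pop - seed
    deaths = 0.0
    for t in range(1, len(r_t)):
        force = sum(infected[t - 1 - k] * w
                    for k, w in enumerate(gen) if t - 1 - k >= 0)
        new_inf = min(r_t[t] * susceptible / pop * force, susceptible)
        deaths += ifr * sum(infected[t - 1 - k] * p
                            for k, p in enumerate(to_death) if t - 1 - k >= 0)
        susceptible -= new_inf
        infected.append(new_inf)
    return deaths


gen = [0.2, 0.5, 0.3]            # hypothetical generation-time distribution
to_death = [0.0] * 14 + [1.0]    # hypothetical: death exactly 15 days after infection
pop, ifr = 1e7, 0.01

# Counterfactual: transmissibility stays at its initial value.
d_counterfactual = total_deaths([3.0] * 60, gen, to_death, ifr, pop)
# Stand-in for reality: transmissibility drops after day 20.
d_observed = total_deaths([3.0] * 20 + [0.8] * 40, gen, to_death, ifr, pop)

averted = d_counterfactual - d_observed
```

The counterfactual is bounded above by the IFR times the population, which is why averted deaths saturate once the unmitigated epidemic would have infected most of the population within the study window.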
Lastly, we found that the inferred sigmoid-shaped transmissibility correlated with some mobility indicators, and most highly with the presence of individuals in transit stations, both in Europe and in the USA, and with a regression coefficient close to 1 (i.e., a given reduction in mobility corresponds to an equivalent reduction in transmission) (Supplementary text; Supplementary Table 2).
2.2. Relationship between probability of detection and intensity of testing
We last relate the time-varying probability of detection to the intensity of testing. First, we correlate the probability of detection (weekly average May 2nd-May 8th) with the number of tests performed per inhabitant across regions. We do so for 62 regions where test data were available. We weight the probability of detection by the inverse of the width of the confidence interval to give more weight to more certain estimates. There was a strong correlation between probability of detection and number of tests per inhabitant (regression coefficient β = 161 per daily test per inhabitant, 95 % CI [87.1–233]) (Fig. 7A). Bootstrap confidence intervals were similar to those based on assuming normality of the coefficient [51.2–226]. The correlation was also significant in the subset of 29 American states (β = 233, p = 0.0028), the subset of 29 regions reaching deaths at least one day, which had better data to estimate the probability of detection (β = 167, ), and the subset of 44 regions with less than 0.001 daily tests per inhabitant (β = 197, ). However, there was no correlation across the 15 European countries (β = -50, ).
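The weighted regression described above can be sketched with a closed-form weighted least-squares fit, where each region's weight is the inverse of the width of its confidence interval. The function name and the data below are hypothetical; the synthetic points are placed exactly on a line so that the recovered slope can be checked.

```python
def weighted_slope(x, y, ci_width):
    """Weighted least-squares line y = intercept + slope * x.

    Each point is weighted by 1 / ci_width, so that more certain estimates
    of the probability of detection count for more.
    """
    w = [1.0 / cw for cw in ci_width]
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return slope, ybar - slope * xbar


# Synthetic example: daily tests per inhabitant and detection probability.
tests_per_capita = [0.0005, 0.001, 0.002, 0.003]
detection = [0.02 + 161 * x for x in tests_per_capita]
ci_widths = [0.10, 0.20, 0.15, 0.40]

beta, intercept = weighted_slope(tests_per_capita, detection, ci_widths)
```

A slope expressed "per daily test per inhabitant" is large in absolute value simply because the regressor is a small number (of the order of one test per thousand inhabitants per day).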
Fig. 7.
Relationship between probability of detection and number of tests. Panel A represents the final probability of detection as a function of number of daily tests per capita (over the 7 days preceding 8th May) for the 62 regions with available test data. The five Asian countries (blue) with small number of tests and small probability of detection are Bangladesh, India, Indonesia, Pakistan, Philippines. Panel B shows stacked distributions of the disease score for positive (red) and negative (blue) individuals. The fraction of positive individuals increases with the score. The number of tests performed is the area to the right of the threshold (vertical line). Panel C shows the predicted root-function relationship between proportion of detected and daily tests for the 22 regions with sufficient available test data. Panel D shows the proportion of detected as a function of daily tests and the number of testable infected presenting for a test, for New York state (one of the high-prevalence states where the proportion detected declines with the number of infected as predicted at high prevalence).
Second, to examine further how the changing number of tests affects the probability of detection within a region and across time, we formulated a simple model of testing (Fig. 7B). The goal of this model is to relate within a region the number of tests conducted on a given day (called $T_t$) with the inferred probability of detection on that day ($c_t$). We assume that in the period when the incidence of infections is much higher than the number of tests, the decision to test individuals for SARS-CoV-2 is made on the basis of a set of symptoms and risk exposure defining a score. SARS-CoV-2 infected and uninfected individuals present two distinct distributions of this score, such that the probability that the individual is truly infected with SARS-CoV-2 increases with this score. Tests are prioritised on individuals with the highest score. This model thus reproduces the fact that the fraction of positive tests increases when tests are limited compared to the number of infected individuals. For simplicity, we additionally assume that the score in infected and uninfected individuals follows exponential distributions with two distinct rates. Under this model, the probability of detection is given by the solution of:
$$c_t N_t^{+} + c_t^{\kappa} N^{-} = T_t$$
(Material and Methods). In this equation, the variable $T_t$ is the total number of tests conducted at day $t$. $N_t^{+}$ and $N^{-}$ are the numbers of SARS-CoV-2 infected and non-infected individuals who could potentially be tested at day $t$ if the number of tests available allows. More specifically, $N_t^{+}$ is the time-delayed number of infected individuals given by $N_t^{+} = \sum_{\tau} I_{t-\tau} d_{\tau}$, where $I_{t-\tau}$ is the number of individuals infected on day $t-\tau$ and $d_{\tau}$ is the probability that an individual is detected $\tau$ days after infection (when it is detected), while $N^{-}$ is assumed to be constant over the considered period. The first term $c_t N_t^{+}$ is the number of positive tests and the second term, $c_t^{\kappa} N^{-}$, the number of negative tests. The fraction of positive tests is therefore $c_t N_t^{+} / T_t$. The parameter $\kappa$ describes the distribution of the symptom score in infected individuals relative to that in uninfected individuals. The number of negative tests is smaller when $\kappa$ is large, i.e. when the score discriminates better between uninfected and infected individuals.
There is no closed-form expression for the general solution $c_t$, but when the fraction of positive tests is small (the distribution of the score is dominated by negative individuals), the probability of detection is approximately a root function of the number of tests:
$$c_t \approx \left( T_t / N^{-} \right)^{1/\kappa}$$
The probability of detection should thus generally increase sublinearly with the number of tests since $\kappa \geq 1$, and at best, should be proportional to the number of tests (when $\kappa = 1$). This is because tests are prioritised on individuals that are more likely to be infected; as the number of tests increases, the probability of positivity decreases. We also predict that in general, when the number of infected is large such that the fraction of positive tests is not small, the probability of detection decreases with the number of infected individuals. Indeed, keeping the number of tests constant, the probability of detection decreases when the number of infected increases (Material and Methods).
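The testing model can be made concrete under its stated assumptions (exponentially distributed scores, tests given to the highest scores). If a fraction c of infected individuals lies above the score threshold, then the fraction of uninfected individuals above the same threshold is c raised to a power equal to the ratio of the two exponential rates. The sketch below solves the resulting balance between tests available and individuals above the threshold by bisection. All names (`n_infected`, `n_uninfected`, `kappa`) and numerical values are hypothetical and chosen for this illustration.

```python
def detection_probability(tests, n_infected, n_uninfected, kappa, tol=1e-12):
    """Solve c * n_infected + c**kappa * n_uninfected = tests for c in [0, 1].

    The left-hand side increases with c, so bisection is sufficient.
    """
    if tests >= n_infected + n_uninfected:
        return 1.0  # enough tests for everyone who could be tested
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid * n_infected + mid ** kappa * n_uninfected < tests:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def root_approximation(tests, n_uninfected, kappa):
    """Approximation valid when positive tests are a small share of all tests."""
    return min(1.0, (tests / n_uninfected) ** (1.0 / kappa))


c_exact = detection_probability(1000, 100, 1e5, kappa=2.0)
c_approx = root_approximation(1000, 1e5, kappa=2.0)
# Same number of tests, more testable infected: detection probability drops.
c_high_prev = detection_probability(1000, 5000, 1e5, kappa=2.0)
```

In this toy setting the approximation is close to the exact solution while positives are rare, and the exact solution falls as the number of testable infected individuals rises, which mirrors the two predictions stated above.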
Both predictions were verified in data (Fig. 7). We inferred for each region the best-fitting pair of parameters to relate the inferred probability of detection to the number of tests , using both the approximated and the general model. With the approximated model, we found that for most regions, often implying a linear or sublinear relationship as predicted (Fig. 7C, Supplementary Fig. 6). The full model where the probability of detection decreased with the number of testable infected individuals was a better fit only when the attack rate was high, for example in New York state (Fig. 7D, Supplementary Fig. 6).
3. Discussion
We developed a discrete-time renewal equation model to describe the dynamics of SARS-CoV-2 infections. We fitted this model to the daily cases and deaths in a large number of countries and states (together representing 4.2 billion individuals), with the following results:
(i) Transmissibility declined in all 79 regions examined. The best-fit decline in transmissibility was often smooth, with the decline in transmissibility predating the date of the lockdown. This could be due to non-pharmaceutical interventions implemented before the full lockdown or other behavioural changes. However, the decline in transmissibility as of May 8th was not enough to contain the epidemic in a number of regions.
(ii) The probability of case detection increased, was on average 29 % across regions as of May 8th, and was very rarely above 50 %.
(iii) Epidemic control was achieved mainly through reductions in transmissibility brought about by social distancing. Case detection and isolation had a limited impact (Fig. 6B), even under the conservative assumption that case detection is followed by perfect isolation. Only a small proportion of cases are detected and about half of the transmission happens before symptom onset. We emphasise that in this period most testing was based on symptoms and not on past contacts with infected individuals. The build-up of immunity in infected individuals also had a very limited impact because the fraction of individuals infected remained small in all regions. Social distancing in the regions considered avoided almost 10 million deaths from the beginning of the epidemic to May 8th.
(iv) The inferred probability of detection correlated with the number of tests per capita across regions. However, increasing the number of tests does not proportionally increase the probability of detection. This is explained by the fact that tests are prioritised on individuals most likely to be infected. This study proposes a simple model to describe how test prioritisation impacts the probability of detection. This model implies that the probability of detection is an implicit function of the number of testable infected individuals and the number of tests available. In the regions examined, the limiting factor was the number of tests available and only in rare cases did the probability of detection decrease with the number of infected. The simplified model where the probability of detection is a root function of the number of tests could be a useful null model to adjust the number of cases by the number of tests conducted, when estimating transmission parameters from case time series. The root function only has one additional parameter and is in our opinion better motivated than a linear function when tests are not conducted at random.
Our model and inference rely on several assumptions detailed in the following paragraphs.
First of all, we describe transmission dynamics within a simplified model that does not take into account age structure or household structure. These forms of structure may be weak enough that they can be neglected when describing the overall epidemic trajectory (Pellis et al., 2020).
Second, to infer jointly the time-varying transmissibility and probability of detection within a dynamical model, we assumed the temporal change took sigmoid functional forms. This differs from other approaches which estimate daily transmissibility as the incidence at a given day divided by past incidence weighted by the distribution of the generation time (Gostic et al., 2020; Cori et al., 2013). These alternative approaches are more flexible in that they can infer any pattern of time-varying transmissibility. However, they cannot account exactly for the delay in case reporting, and can be very sensitive to noise in the data (Gostic et al., 2020). Fitting a dynamical model with imposed functional forms for transmissibility and probability of detection reduces the sensitivity of inference to noise in the data.
Third, and most importantly, inference relies on daily deaths and cases. Deaths are assumed to be perfectly reported. Cases are assumed to be partially reported with a time-varying detection probability. The inferred absolute value of the probability of detection strongly relies on the assumed IFR of around 1% on average (tuned to the specific age structure of each region considered). The approach was broadly validated in a number of regions where systematic test or seroprevalence surveys were conducted (Fig. 3). However, it is possible that in some of the other regions examined the number of deaths was greatly under-reported, in which case the true number of infected would be much higher than predicted, and the true probability of case detection much smaller. Death under-reporting might explain some of the very high inferred probabilities of detection (Fig. 4B). For example, in India, the only lower-middle-income country for which we found a systematic seroprevalence survey, the attack rate was 0.7 % while our model predicted 0.1 %: the two observations can be reconciled with substantial death under-reporting. Death under-reporting should not affect the temporal trends in transmissibility or probability of detection, provided that it is constant in time. Other emerging seroprevalence surveys will give more information on the IFR (or death under-reporting) across regions, but it is notable that the early estimate of IFR in mainland China (Verity et al., 2020) already allows good predictions (Fig. 3).
Lastly, our framework does not take into account the possibility that the IFR (or apparent IFR) changes in time. Such temporal variation in IFR could be caused by overwhelmed health systems (increasing IFR), increase in death reporting (increasing apparent IFR), better social distancing in at-risk groups (decreasing IFR), or better clinical care (decreasing IFR). We did not conduct a sensitivity analysis with the same imposed time-varying IFR in all regions, because it is not clear what would be on average the result of these conflicting effects. We did not attempt to infer a time-varying IFR because it may have caused identifiability problems, since we already infer time-varying transmissibility and case detection probability. Our study spans a relatively short period (from introduction to 8th May 2020) which ensures a limited variation in IFR. For example, over the period considered, the reduction in IFR allowed by better clinical care is around 25 % in IFR given hospitalisation in France (Lefrancq et al., 2020). The overall IFR was inferred to have declined by 15 % in the UK (Knock et al., 2020). These effects remain small compared to the inferred changes in transmissibility and probability of detection.
We note that in spite of all these shortcomings and potential sources of variability between regions, we were still able to recover a robust positive correlation between inferred probability of detection and test capacity (Fig. 7A).
Our method has several advantages. The discrete-time renewal equation makes the minimal assumption that the transmissibility of an infected individual depends on the age of infection. It allows arbitrary distributions of the generation time, and arbitrary delays between infection and case detection, and between infection and death. The distributions of these delays determine the dynamics of the changes in number of cases and deaths following a change in transmissibility. Parameters can be inferred using multiple time series, improving the precision of inference. The daily cases, although dependent on the number of tests available, give an earlier signal of changes in transmissibility than the daily deaths, and suffer less from stochastic effects. The method allows different transmissibility for detected cases (here assumed to be zero, i.e. perfect isolation after detection). This is particularly relevant for accurate inference of transmissibility, as non-pharmaceutical interventions shorten the serial interval (Fig. 6A) (Ali et al., 2020). Lastly, the framework quantifies the immunity acquired by infected individuals.
The probability of detection as a function of time in different countries was computed by different means in another study (Russell et al., 2020). Their statistical approach was based on estimating the case fatality ratio (CFR) adjusted for the delay between infection and deaths, and comparing with the baseline infection fatality ratio estimated in other studies that account for under-reporting (assumed to be 1.4 % in their case). Their statistical method allows inferring arbitrary temporal variations in the probability of detection. However, it does not explicitly model the dynamics of transmission. It is unclear how the changing age-of-infection structure of the population upon reductions in transmission will affect the relationship between daily number of deaths and past number of cases, hence the inferred probability of detection, in their approach.
We found that tests detected only a small proportion of cases. Furthermore, increasing the number of tests does not proportionally increase the proportion of detected individuals. As a consequence of the typically small probability of detection, together with the fact that a large part of transmission has already occurred by the time of the test result, tests followed by isolation of positive cases had very little impact on transmission, and were not sufficient by themselves to control an epidemic with a basic reproductive number of 3 or more. We assumed that individuals isolate only at the date of the test result (a few days after symptom onset). Assuming that symptomatic individuals isolate at symptom onset would not substantially change our quantitative results. Our model assumes an exponential distribution of the score underpinning the decision to refer an individual to a SARS-CoV-2 test or not. This distribution could be linked to more precise data if individuals tested in priority are those most likely to be infected, and if the decision to test is based on a defined set of variables describing symptoms or risk exposure. For example, a logistic regression of infection status vs. symptoms (as in Menni et al., 2020) would define a score for each individual based on a linear combination of these symptoms. The probability of infection would increase with this score, and the right tail of the distribution of this score (including the individuals most likely to be infected) could resemble an exponential distribution.
Our model is primarily concerned with the first few months of the epidemic, when in most regions contact tracing could no longer be practically implemented. More widespread contact tracing could improve the relationship between probability of detection and number of tests, as contacts of positive cases may have a 5–10 % chance of being positive, up to 10–15 % for household contacts (Bi et al., 2020; Jing et al., 2020; Li et al., 2020b). Furthermore, contacts could self-isolate before the onset of symptoms (Ferretti et al., 2020; Hellewell et al., 2020). For these two reasons, contact tracing and testing is a more efficient way to control the epidemic than symptom-based testing. Thus, if the capacity to trace contacts is limited, the epidemic may be out of control as soon as the daily incidence is too large to trace a good fraction of contacts. This argues for the use of digital contact tracing apps and/or rapid implementation of additional social distancing measures when incidence increases (Ferretti et al., 2020).
Lastly, the inferred time-varying transmissibility correlated with mobility indicators, as found elsewhere (Miller et al., 2020; Nouvellet et al., 2021), and especially with mobility in transit stations. The mobility in transit stations could be a general indicator of economic and social activity resulting in more transmission. Public transport could also be a common context of transmission. In support of our finding, individual use of public transport in Maryland was strongly associated with SARS-CoV-2 positivity (Clipman et al., 2020).
In conclusion, we developed a framework to estimate time-varying transmissibility and probability of detection from daily cases and deaths in a large number of countries and regions. In the first few months of 2020, control of the epidemic was achieved mostly by reductions in transmissibility, which avoided 10 million deaths in these 79 regions (representing more than half of the world’s population), while case detection and isolation had a comparatively much smaller effect.
4. Methods
4.1. Deterministic transmission dynamics
To model transmission dynamics, we use a discretised version of the renewal equation (e.g. Flaxman et al. (2020)). We follow the dynamics of the number of individuals infected at day t who were infected days ago, and have not yet been detected and isolated, called . The transmission dynamics are given by the system of recurrence equations:
| (1a) |
| (1b) |
The first equation represents transmission to new susceptible individuals giving rise to infected individuals with age of infection 0. The parameter reflects transmissibility, and is the basic reproduction number (in the absence of interventions, and when the population is fully susceptible, i.e. ). The factor is the fraction of transmission that occurs at age of infection , where . Thus represents the distribution of the generation time of the virus. The infectiousness profile of the virus is linked with the generation time distribution through . Transmission is reduced by a factor by population immunity, where is the initial number of susceptible individuals in the region, assumed to be the total population size. The variable is the total number of individuals already infected and assumed to be fully immune at time . The instantaneous reproduction number that accounts for population immunity but not for case isolation is .
The Eq. (1b) represents the dynamics of individuals infected in the past. Individuals infected days ago are now of age of infection , provided they were not detected and isolated. An infected individual is detected with time-varying probability and the probability that an individual is detected at age (when it is detected) is given by , with . An individual who is detected is removed from the pool of individuals that contribute to further transmission of the disease. The total number of cases detected at day is thus:
| (2) |
And the number of detected individuals who were infected days ago changes as:
| (3a) |
| (3b) |
The total number of infected individuals, be they undetected or detected, that we may call , follows the equations:
| (4a) |
| (4b) |
The fact that incidence (in the Eq. (4a)) only depends on undetected cases emerges from the assumption that detected individuals do not transmit.
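As an illustration, the recurrences described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation used for the analysis: the names `beta`, `w`, `c`, `p` and `s0` are illustrative, the detection probability `p` is held constant (the model above lets it vary in time), and an individual is assumed to still transmit on the day it is detected and to be isolated from the next day onwards.

```python
def simulate(beta, w, c, p, s0, seed, n_days):
    """Discrete-time renewal model with detection and isolation (sketch).

    beta : function t -> transmissibility at day t (assumed name)
    w    : generation-time distribution by age of infection (sums to 1)
    c    : distribution of age at detection, given detection (sums to 1)
    p    : probability that an infected individual is ever detected (constant here)
    s0   : initial number of susceptible individuals
    """
    n_ages = len(w)
    # Daily hazard of detection at age a, given no detection before age a.
    hazard, undetected_frac = [], 1.0
    for a in range(n_ages):
        h = p * c[a] / undetected_frac if undetected_frac > 1e-12 else 0.0
        hazard.append(min(1.0, h))
        undetected_frac -= p * c[a]

    u = [0.0] * n_ages      # undetected infected, indexed by age of infection
    u[0] = seed
    cumulative = seed       # all infections so far, detected or not
    incidence, cases = [seed], [0.0]
    for t in range(1, n_days):
        susceptible_frac = max(0.0, 1.0 - cumulative / s0)
        # Only undetected individuals contribute to transmission.
        new_inf = beta(t) * susceptible_frac * sum(w[a] * u[a] for a in range(n_ages))
        detected = [u[a] * hazard[a] for a in range(n_ages)]
        # Ageing: undetected survivors move up one age class; the oldest drops out.
        u = [new_inf] + [u[a] - detected[a] for a in range(n_ages - 1)]
        cumulative += new_inf
        incidence.append(new_inf)
        cases.append(sum(detected))
    return incidence, cases
```

With `p = 0` the sketch reduces to the plain renewal equation; with `p = 1` and all detection at age 0, each infected individual transmits only the share `w[0]` of its potential.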
While in the absence of testing and isolation, the infectiousness profile is given by (with ), detection and isolation truncate the infectiousness profile at the time of detection (Fig. 6A) with probability where is the time of detection. In other words, the effective infectiousness profile is the mixture distribution:
| (5) |
where is an indicator variable equal to 1 when , and 0 otherwise.
4.2. Probability of dying and time to death given infection
The probability that an infected individual dies is the infection fatality ratio (IFR) denoted d, assumed to be constant over time. The probability of dying days after infection, given that one dies, is given by . The mean number of deceased individuals at day t is then given by:
| (6) |
As death typically occurs at a time when the infected individual does not transmit any longer, and the IFR is small (of the order of 1%), we neglect the impact of death on transmission.
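The expected number of deaths is a convolution of past incidence with the delay from infection to death, scaled by the IFR. A minimal Python sketch follows; the function and argument names are illustrative and the delay distribution is assumed to be already discretised by day.

```python
def expected_deaths(incidence, ifr, death_delay):
    """Expected daily deaths: IFR times incidence convolved with the delay distribution.

    incidence   : new infections per day (detected or not)
    ifr         : infection fatality ratio, assumed constant in time
    death_delay : death_delay[a] = probability of dying a days after infection,
                  given that death occurs
    """
    deaths = []
    for t in range(len(incidence)):
        total = 0.0
        for a, prob in enumerate(death_delay):
            if a <= t:
                total += prob * incidence[t - a]
        deaths.append(ifr * total)
    return deaths
```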
4.3. Effects of detection and isolation, change in transmissibility and immunity on transmission
We call “effective reproduction number” the instantaneous reproduction number taking into account immunity and case isolation. It is given by (see also (Grassly et al., 2020)):
| (7a) |
For example, an individual detected at day 0 only infects individuals on average. Equation (7a) can be rewritten as:
| (7b) |
Thus, the effective reproduction number on the 8th of May (), including the impacts of detection and isolation and immunity may be written as the product of the initial basic reproduction number, times three factors that all reduce transmission:
| (8) |
With , , and
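One way to make this decomposition concrete is sketched below in Python. It is an illustration under simplifying assumptions rather than a transcription of Eqs. (7) and (8): the detection probability `p` is treated as constant over an individual's infectious period, a detected individual is assumed to transmit up to and including its day of detection, and all names are hypothetical.

```python
def isolation_factor(w, c, p):
    """Share of transmission retained when a fraction p of infections is detected.

    w : generation-time distribution by age of infection (sums to 1)
    c : distribution of age at detection, given detection (sums to 1)
    """
    retained, cum_w = 0.0, 0.0
    for a in range(len(w)):
        cum_w += w[a]              # transmission completed up to and including age a
        retained += c[a] * cum_w   # individuals detected at age a keep only that share
    return (1.0 - p) + p * retained


def effective_r(r0, transmissibility_ratio, susceptible_frac, w, c, p):
    """Basic reproduction number times the three reducing factors."""
    return r0 * transmissibility_ratio * susceptible_frac * isolation_factor(w, c, p)
```

Under these assumptions, detection only matters through the product of `p` and the share of transmission that would have occurred after detection, which is why late detection of a small fraction of cases changes the effective reproduction number very little.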
4.4. Parameter estimates
We fix the distributions of the generation time , the distribution of time from infection to death , the distribution from infection to detection , and the infection fatality ratio d to values estimated previously (Table 1 ).
Table 1.
Summary of model parameters.
| Parameter | Symbol | Value | Reference |
|---|---|---|---|
| Distribution of generation time | | Log-normal(1.77, 0.588); mean 7 days, SD 4.5 days | (Wu et al., 2020a; Bi et al., 2020; Ma et al., 2020) |
| Distribution of time from infection to symptom onset | – | Log-normal(1.518, 0.472) | (Lauer et al., 2020) |
| Distribution of time from infection to death | | Calculated by convolving the distributions from infection to onset and from onset to death. The latter is Gamma(5, 0.25) | (Wu et al., 2020a) |
| Distribution of time from infection to detection | | Calculated by convolving the distributions from infection to onset and from onset to detection | (Lauer et al., 2020) |
| Infection fatality ratio | d | Depends on the age structure of the country, around 1% on average | (Verity et al., 2020) |
| Probability of detection | | Inferred | – |
| Transmissibility | | Inferred | – |
4.4.1. Generation time
We assume the generation time is lognormally distributed with mean 7 days and standard deviation 4.5 days (Wu et al., 2020a) (Supplementary Fig. 7). This is the generation time when the infected individual is not tested. A positive test is assumed to be followed by perfect isolation of the infected individual and interruption of transmission. Self-isolation of positive individuals effectively truncates the distribution of generation time, reducing its mean below 7 days (Fig. 6A). Two factors make estimation of this generation time difficult. First, the generation time (the time from an infection to another infection) is often approximated by the serial interval (the time between symptom onset in an infector and symptom onset in the infectee). These two quantities have the same mean, but the variance of the generation time should in general be smaller than that of the serial interval (Britton and Scalia Tomba, 2019). Second, measuring the serial interval requires identifying infectees and their infectors. The fact that the infector needs to be identified could bias the serial interval towards lower values. For example, in a large study in Shenzhen, China, the serial interval had mean 6.3 days overall and 8.1 days if the infector was isolated more than two days after infection (Bi et al., 2020). Thus, in settings where most infections are undocumented, the typical serial interval (and generation time) may be longer than that estimated in other work (e.g. mean 5 days in Ferretti et al., 2020), motivating the mean of 7 days chosen here.
Note that the chosen serial interval distribution affects the absolute value of the basic reproduction number, but does not affect either the inferred temporal trend in basic reproduction number or the absolute value of the probability of detection.
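The log-normal parameters implied by a given mean and standard deviation follow from moment matching. The short Python sketch below shows the conversion for the mean of 7 days and standard deviation of 4.5 days assumed here; the function name is illustrative.

```python
import math


def lognormal_params(mean, sd):
    """Return (mu, sigma) of a log-normal distribution with the given mean and SD."""
    sigma2 = math.log(1.0 + (sd / mean) ** 2)
    mu = math.log(mean) - sigma2 / 2.0
    return mu, math.sqrt(sigma2)


# Generation time with mean 7 days and SD 4.5 days: roughly mu = 1.77, sigma = 0.59.
mu, sigma = lognormal_params(7.0, 4.5)
```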
4.4.2. Time from infection to detection
The time from symptom onset to case detection was inferred from published data on 150 cases from various countries (Lauer et al., 2020). We used the time between the midpoint date of symptom onset and the midpoint date of case detection. We excluded 31 cases for which the date of case detection was not available or there was very large uncertainty on the date of symptom onset. We inferred that the time from symptom onset to detection was gamma-distributed with mean 2.2 days [95 % CI 1.6–3.2] and SD 2.7 days [2.0–3.8] (shape 0.69 [0.55–0.82] and scale 3.2 [2.5–4.5]). The fit of a Weibull distribution was comparable to that of the gamma (Supplementary Fig. 8).
The distribution of time from infection to detection was computed from the convolution of the distribution of time from infection to symptom onset (Lauer et al., 2020), and our inferred distribution of time from symptom onset to case detection, assuming independence of the two times. The distribution of time from infection to symptom onset has mean 5–6 days (Supplementary Fig. 9).
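For distributions already discretised by day, the convolution of two independent delays can be written directly. The following Python sketch is illustrative; the function name and the toy inputs used to exercise it are not taken from the analysis.

```python
def convolve_pmf(first, second):
    """Distribution of the sum of two independent delays given as daily probabilities.

    first[i]  : probability that the first delay lasts i days
    second[j] : probability that the second delay lasts j days
    """
    out = [0.0] * (len(first) + len(second) - 1)
    for i, p_i in enumerate(first):
        for j, p_j in enumerate(second):
            out[i + j] += p_i * p_j
    return out
```

For example, `convolve_pmf(incubation, onset_to_detection)` would give the distribution of the time from infection to detection, with both inputs being hypothetical lists of daily probabilities.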
4.4.3. Time from infection to death
The distribution of the time from infection to death was estimated using data from 41 patients in Wuhan analysed elsewhere (Wu et al., 2020b). The time from symptom onset to death was gamma-distributed with a mean of 20 days and a standard deviation of 10 days. This estimate is close to that of other studies (24 deceased cases from mainland China, mean and SD of time from onset to death 18.8 / 8.5 days (Verity et al., 2020); 34 deceased cases from mainland China, mean and SD 20.2 / 11.6 days (Linton et al., 2020)).
4.4.4. Discretisation of the distributions
We discretised the distributions of time to events as follows, explained for the example of time from infection to detection. We assumed the probability that detection happens i days after infection is given by:
where is the cumulative distribution function for the random variable in continuous time describing the time from infection to detection.
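A common way to discretise a continuous delay is to take differences of its cumulative distribution function at integer days. The Python sketch below uses this convention, assigning to day i the probability mass between i and i + 1; this choice of interval is an assumption made for illustration and may differ from the exact formula used here. The example uses the log-normal incubation period parameters of Table 1.

```python
import math


def lognormal_cdf(x, mu, sigma):
    """Cumulative distribution function of a log-normal distribution."""
    if x <= 0:
        return 0.0
    return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))


def discretise(cdf, n_days):
    """Daily probabilities from a continuous CDF (assumed convention: [i, i + 1))."""
    probs = [cdf(i + 1) - cdf(i) for i in range(n_days)]
    total = sum(probs)
    return [q / total for q in probs]   # renormalise after truncation at n_days


# Time from infection to symptom onset, Log-normal(1.518, 0.472), over 30 days.
incubation = discretise(lambda x: lognormal_cdf(x, 1.518, 0.472), 30)
```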
4.4.5. Infection fatality ratio
For each region studied, we computed an overall infection fatality ratio that takes into account the age pyramid of the country. To this end, we used the infection fatality ratio (IFR) estimated in nine age classes, [0,9], [10,19], etc., [80+] in mainland China (Verity et al., 2020). Other estimates similarly stratified by age, for mainland China and for France, are very similar (Supplementary Fig. 1). The IFR climbs from close to 0% in 0–39 years old, up to 5–10% in individuals aged 80 years old or more.
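The overall IFR of a region is then the average of the age-specific IFRs weighted by the age pyramid. A minimal Python sketch is given below; the function name is illustrative and the numbers in the usage note are placeholders, not the estimates of Verity et al. (2020).

```python
def overall_ifr(age_counts, ifr_by_age):
    """Population-weighted infection fatality ratio.

    age_counts : number of individuals in each age class (0-9, 10-19, ..., 80+)
    ifr_by_age : IFR in each age class, in the same order
    """
    total = sum(age_counts)
    return sum(n * f for n, f in zip(age_counts, ifr_by_age)) / total
```

For instance, a hypothetical population split equally between a class with IFR 0 and a class with IFR 2 % has an overall IFR of 1 %. This weighting assumes that all age classes are infected in proportion to their size.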
4.5. Likelihood method
To fit the model and infer transmission and case detection parameters, we use data on the number of confirmed cases over time and the number of deaths over time in 79 states and countries from different public sources detailed below. We include all states and countries that had a daily incidence of 10 deaths or more at least once as of 23rd April. As we want to estimate the impact of sudden social distancing measures in an essentially uncontrolled epidemic, we exclude South Korea and Japan from the analysis. In these two countries, SARS-CoV-2 was introduced earlier and strong control measures including social distancing and contact tracing were immediately put in place.
The likelihood of the model is defined as in other epidemiological studies (Salje et al., 2020; Flaxman et al., 2020). Simulating the deterministic model gives the expected number of detected cases and deaths at time t as a function of model parameters. We assume that the probability to observe a certain number of cases (resp. deaths) in the data at day t is the density of a negative binomial distribution with mean given by the theoretical predictions for cases (resp. deaths), and dispersion parameters that we infer. The likelihood for cases (resp. deaths) is the product of these probabilities over all days. The overall likelihood is the product of the likelihood of cases and the likelihood of deaths. For the number of deaths, we include the period from the first day to the last day when at least 1 death and 5 cases were recorded. For the number of cases, we include the period from the first day to the last day when at least 5 cases were recorded. We offset the simulation time such that the date when 5 deaths are reached in the simulations becomes the date when 5 deaths are reached in the data. At this date, the number of infected is large enough that the underlying dynamics should largely be deterministic.
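The likelihood described above can be sketched in Python as follows. The negative binomial is parameterised here by its mean and a dispersion (size) parameter, so that the variance is mean + mean²/size; this parameterisation and the function names are assumptions made for illustration.

```python
import math


def nb_logpmf(x, mean, size):
    """Log-probability of count x under a negative binomial with given mean and size."""
    return (math.lgamma(x + size) - math.lgamma(size) - math.lgamma(x + 1)
            + size * math.log(size / (size + mean))
            + x * math.log(mean / (size + mean)))


def log_likelihood(obs_cases, exp_cases, obs_deaths, exp_deaths,
                   size_cases, size_deaths):
    """Sum of the log-likelihoods of the daily cases and of the daily deaths.

    exp_cases and exp_deaths are the model predictions (must be > 0) for the
    same days as the observations.
    """
    ll = sum(nb_logpmf(x, m, size_cases) for x, m in zip(obs_cases, exp_cases))
    ll += sum(nb_logpmf(x, m, size_deaths) for x, m in zip(obs_deaths, exp_deaths))
    return ll
```

Working on the log scale turns the products over days and over the two time series into sums, which avoids numerical underflow.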
We mainly estimate the time-changing transmissibility and the time-changing probability of detection .
For the time-changing transmissibility, we fit two functional forms. First we assume that is a step function with a sharp transition from a high pre-control value to a low post-control value, at a fixed date corresponding to the date of implementation of the control measure:
| (9a) |
| (9b) |
For the sharp change in transmissibility, we infer the two values and . Furthermore, to investigate the possibility that transmissibility changed in a more gradual way, we assume is a smooth declining sigmoid function:
| (10) |
Where is the basic reproductive number before social distancing measures, is the basic reproductive number after social distancing, is the steepness of the logistic curve, and is the time when the basic reproductive number is intermediate between and . The step function is a special case of the logistic when is large and . For the smooth change in transmissibility, we infer the two values and , the steepness and the time .
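The two functional forms can be written as below. This Python sketch uses one plausible parameterisation of the declining logistic curve; the argument names (`r_pre`, `r_post`, `k`, `t_mid`, `t_control`) are illustrative, and the exact form of Eqs. (9) and (10) may differ.

```python
import math


def r_step(t, r_pre, r_post, t_control):
    """Sharp transition at the date of implementation of the control measure."""
    return r_pre if t < t_control else r_post


def r_logistic(t, r_pre, r_post, k, t_mid):
    """Smooth decline from r_pre to r_post with steepness k and midpoint t_mid."""
    z = k * (t - t_mid)
    if z > 700.0:               # avoid overflow in exp for very steep curves
        return r_post
    return r_post + (r_pre - r_post) / (1.0 + math.exp(z))
```

With a large `k` and `t_mid` equal to `t_control`, `r_logistic` approaches `r_step`, which is the sense in which the step function is a special case of the logistic.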
For the time-changing detection probability, we assume an increasing logistic function:
| (11) |
We infer the four parameters , , and . Note that we constrain the parameter , the initial probability of detection, to be small, in [0.0001, 0.001]. We fit three models: (i) a model based on death data only with the step function of transmissibility, (ii) a model based on death and case data with the step transmissibility function; and (iii) a model based on death and case data with the smooth transmissibility function. These three models are fitted by maximum likelihood. We first find an optimal likelihood value by 50 iterations of the Nelder-Mead algorithm starting from different initial parameters. We then run a Markov chain Monte Carlo (MCMC) sampling of the likelihood function with bounded parameters. We let the chain run for steps and record the parameter values from to steps. This sample is used both for maximum likelihood parameters (if a better parameter set is found than with the Nelder-Mead algorithm) and for confidence intervals.
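The sampling step can be illustrated with a basic random-walk Metropolis algorithm on a bounded parameter space. The Python sketch below is a generic textbook sampler, not the sampler used for the analysis; it omits the preliminary Nelder-Mead optimisation, and all names and tuning values are illustrative.

```python
import math
import random


def metropolis(log_lik, start, lower, upper, step, n_steps, burn_in, seed=1):
    """Random-walk Metropolis sampling of a likelihood with bounded parameters.

    Returns the best parameter set visited and the post-burn-in sample.
    """
    rng = random.Random(seed)
    current, current_ll = list(start), log_lik(start)
    best, best_ll, sample = list(start), current_ll, []
    for i in range(n_steps):
        proposal = [x + rng.gauss(0.0, s) for x, s in zip(current, step)]
        inside = all(lo <= x <= hi for x, lo, hi in zip(proposal, lower, upper))
        if inside:              # proposals outside the bounds are rejected
            prop_ll = log_lik(proposal)
            if rng.random() < math.exp(min(0.0, prop_ll - current_ll)):
                current, current_ll = proposal, prop_ll
                if current_ll > best_ll:
                    best, best_ll = list(current), current_ll
        if i >= burn_in:
            sample.append(list(current))
    return best, sample
```

The returned `best` can replace the optimiser's estimate when it has a higher likelihood, and quantiles of `sample` give interval estimates for each parameter.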
4.6. Validation with seroprevalence and systematic surveys
For the attack rate in serosurveys, a seropositive individual is assumed to have been infected no later than 13 days in the past, corresponding to the median time to seroconversion (Long et al., 2020). When we used results from a systematic PCR test survey in Austria, a positive individual was assumed to have been infected in the interval from 20 to 4 days ago (Kucirka et al., 2020). We used the mid-point between the first and last date of the serosurvey as the date of the survey.
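The predicted attack rate to be compared with a serosurvey can be computed as in the following Python sketch, which counts infections that occurred at least 13 days before the survey date. The function and argument names are illustrative.

```python
def predicted_attack_rate(incidence, survey_day, population, lag_days=13):
    """Fraction of the population infected no later than lag_days before the survey.

    incidence  : predicted new infections per day, indexed from day 0
    survey_day : index of the survey date (mid-point of the survey period)
    """
    last_day = survey_day - lag_days
    if last_day < 0:
        return 0.0
    return sum(incidence[: last_day + 1]) / population
```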
4.7. Symptom-based test model
4.7.1. The model
We relate the fraction of infected individuals detected to the number of daily RT-PCR tests performed and the incidence of infection. Each day, the testable individuals are composed of two populations:
- SARS-CoV-2 infected individuals. The number of such individuals is time-varying and is denoted by where is the probability that an individual is detected at age of infection (given that it is detected) (Fig. 7A).
- Non-SARS-CoV-2 infected individuals. The number of such individuals is assumed to be constant and is denoted by . We acknowledge that a more complete model would allow for this number to vary in time, for example to account for seasonal infections by respiratory diseases like influenza or seasonal coronavirus that may contribute to the pool of testable individuals.
We assume that contexts in which we apply our model are characterized by a number of tests smaller than the number of testable individuals, where is the number of tests available at time t. Thus the tests are prioritised on the subset of individuals most likely to be infected by SARS-CoV-2. Testable individuals are characterised by a score such that the probability of SARS-CoV-2 infection increases with the score. Given the limited number of tests available each day, a threshold score is defined and tests are performed only for patients above this score. Formally, denoting by and the distribution of the score s in infected and uninfected individuals, the (time-varying) threshold score is the solution of:
| (12) |
In the absence of detailed information on the choice of individuals to test in different regions at different stages of the pandemic, we further assume for simplicity that the scores are distributed exponentially. We set the rate of the exponential distribution to 1 without loss of generality, and we denote the rate of :
| (13a) |
| (13b) |
The fact that guarantees that the probability that an individual is positive increases with the score. Plugging the distributions of Eq. (13) in the implicit formula to define the threshold score (Eq. (12)) yields:
| (14) |
The probability of detection , defined as the ratio between positive tests results and the number of testable infected individuals , is proportional to the area of the distribution above the threshold :
| (15) |
Replacing with Eq. (15) in Eq. (14), we find that is the solution of:
| (16) |
This solution generally defines an implicit function of the number of testable infected at day t, and the number of available tests . In Eq. (16), is the number of positive tests, is the number of negative tests, and the fraction of positive tests is therefore . The number of negative tests is smaller when is large, i.e. when the score discriminates better between uninfected and infected individuals. We can simplify this general solution in two ways. First, in the limit when the fraction of positive tests is small (the first term in Eq. (16) is much smaller than the second), the probability of detection is:
| (17a) |
That is, the probability of detection increases as a root function of the normalised number of tests. In general, when the fraction of positive tests is not small, the solution of Eq. (16) decreases with . When the number of testable infected individuals is small, the solution (17a) can be better approximated by:
| (17b) |
In this approximation the probability of detection decreases linearly with the number of infected .
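The test model can be explored numerically. The Python sketch below is a reconstruction under stated assumptions, not the code used for the analysis: scores of uninfected individuals are exponential with rate 1, scores of infected individuals are exponential with rate `lam` (between 0 and 1), and the probability of detection is taken equal to the share of infected individuals above the threshold. Under these assumptions the probability of detection p solves `n_infected * p + n_uninfected * p ** (1 / lam) = n_tests`. The linear approximation is a first-order expansion in the number of infected derived from this equation and may differ in form from Eq. (17b).

```python
def detection_probability(n_infected, n_uninfected, n_tests, lam, tol=1e-12):
    """Solve I*p + U*p**(1/lam) = T for p by bisection (positive + negative tests)."""
    if n_tests >= n_infected + n_uninfected:
        return 1.0              # enough tests for every testable individual
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        used = n_infected * mid + n_uninfected * mid ** (1.0 / lam)
        if used < n_tests:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def root_approximation(n_uninfected, n_tests, lam):
    """Limit of a small fraction of positive tests: p = (T / U) ** lam."""
    return (n_tests / n_uninfected) ** lam


def linear_approximation(n_infected, n_uninfected, n_tests, lam):
    """First-order correction: p decreases linearly with the number of infected."""
    p0 = root_approximation(n_uninfected, n_tests, lam)
    return p0 * (1.0 - lam * n_infected * p0 / n_tests)
```

Because `lam` is smaller than 1, doubling the number of tests multiplies the probability of detection by less than two in the root approximation, which is the less-than-proportional return on testing discussed in the main text.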
4.7.2. Parameter inference for the test model
We verify the model predictions using the inferred probability of detection together with data on the daily number of tests , and the number of testable infected individuals inferred from the dynamical model in different regions. We used the nls (nonlinear least-squares) method from the stats package in the software R (R Core Team, 2018).
First, we use the general solution of Eq. (16). This solution is a non-linear function with parameters and . We infer the parameters and by minimizing the mean square error between the inferred and the prediction. In most cases (except, notably, New York and New Jersey states), the coefficient of determination was as good with the simplified model in which is approximated as a root function of only (Eq. (17a)) (Supplementary Fig. 6). The improvement in fit brought by the general solution was greater when the attack rate was larger, as predicted by the model (Supplementary Fig. 6).
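For the root-function simplification, the two parameters can also be estimated by ordinary least squares on the log scale. The Python sketch below is a simple substitute for the non-linear least-squares fit described above, not a reproduction of it: it assumes the form p = (T / U) ** lam, so that log p is linear in log T, and all names are illustrative.

```python
import math


def fit_root_function(tests, p_detect):
    """Estimate (lam, U) in p = (T / U) ** lam by linear regression of log p on log T."""
    x = [math.log(t) for t in tests]
    y = [math.log(p) for p in p_detect]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    # log p = lam * log T - lam * log U, hence U = exp(-intercept / slope).
    return slope, math.exp(-intercept / slope)
```

Fitting on the log scale weights relative rather than absolute errors, so the estimates can differ from those of a least-squares fit on the original scale when the data are noisy.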
Data sources
Epidemiological data
For France, we used data from OpenCOVID19 available at https://github.com/opencovid19-fr/data. This website curates data from Agence nationale de santé publique, the French governmental public health agency.
For Italy, we used data from the Civil Protection Department (Dipartimento della Protezione Civile), available at https://github.com/pcm-dpc/COVID-19. This data includes daily cases and deaths, and daily number of tests.
For other European countries, we used data from the European Center for Disease Control (ECDC) available at https://opendata.ecdc.europa.eu/covid19/casedistribution/
For American states, we used data from the COVID Tracking Project that compiles data from American official sources, available at https://covidtracking.com/api/v1/states/daily.csv. This data includes daily cases and deaths, and daily number of tests.
For other countries, we used data from the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University (JHU), available at https://github.com/CSSEGISandData/COVID-19
Daily number of tests data for regions other than Italy and American states were compiled from Our World in Data at https://covid.ourworldindata.org/data/owid-covid-data.csv
We considered test data only for regions for which the number of tests was strictly greater than the number of cases recorded for at least 80 % of the days. The reported number of tests is sometimes exactly equal to the number of cases that day, indicating that negative tests are not reported. Since we do not know whether negative tests are not reported at all or are reported at a later date (as sometimes suggested by a peak in the number of reported tests a few days later), we exclude these datapoints and exclude regions where this artefact is often observed.
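This filtering rule can be expressed as in the Python sketch below. The function name and return format are illustrative, and days on which the number of tests is lower than the number of cases are dropped together with days on which the two are equal, which is an assumption.

```python
def usable_test_data(tests, cases, min_fraction=0.8):
    """Return (day, tests) pairs to keep, or [] if the region fails the 80 % rule.

    tests, cases : daily number of tests and of recorded cases for one region
    """
    paired = [(day, t, c) for day, (t, c) in enumerate(zip(tests, cases))]
    # Keep only days on which tests strictly exceed cases.
    valid = [(day, t) for day, t, c in paired if t > c]
    if not paired or len(valid) / len(paired) < min_fraction:
        return []
    return valid
```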
Age structure data
We collected data on the number of individuals in age categories 0–9, 10–19, …, 80+, for different states and countries, from the following sources:
Mobility data
We used Google mobility data available at https://www.google.com/covid19/mobility/
Data availability
The code and data are available on https://github.com/FrancoisBlanquart/covid_model
CRediT authorship contribution statement
Antoine Belloir: Formal analysis, Investigation, Visualization, Writing - review & editing. François Blanquart: Conceptualization, Formal analysis, Investigation, Methodology, Supervision, Visualization, Data curation, Writing - original draft, Writing - review & editing.
Declaration of Competing Interest
None.
Acknowledgements
We thank Florence Débarre and Chris Wymant for helpful comments. We thank the many people involved in the collection and curation of the epidemiological data that we use. F.B. was supported by a Momentum grant from the CNRS. A.B was supported by a scholarship from Ecole Polytechnique. We are grateful to the INRA MIGALE bioinformatics facility (MIGALE, INRA, 2018. Migale bioinformatics Facility, doi: 10.15454/1.5572390655343293E12) for providing computational resources.
Footnotes
Supplementary material related to this article can be found, in the online version, at doi:https://doi.org/10.1016/j.epidem.2021.100445.
References
- Rambaut A. 2020. Phylodynamic Analysis| 176 Genomes| 6 Mar 2020. Virol Httpvirological Orgtphylodynamic-Anal-176-genomes-6-Mar-2020356 Accessed; p. 15. [Google Scholar]
- Li Q., Guan X., Wu P., Wang X., Zhou L., Tong Y. Early transmission dynamics in Wuhan, China, of novel Coronavirus–Infected pneumonia. N. Engl. J. Med. 2020;382(13):1199–1207. doi: 10.1056/NEJMoa2001316. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kucharski A.J., Russell T.W., Diamond C., Liu Y., Edmunds J., Funk S. Early dynamics of transmission and control of COVID-19: a mathematical modelling study. Lancet Infect. Dis. 2020;20(5):553–558. doi: 10.1016/S1473-3099(20)30144-4. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Riou J., Althaus C.L. Pattern of early human-to-human transmission of Wuhan 2019 novel coronavirus (2019-nCoV), December 2019 to January 2020. Eurosurveillance. 2020;25(4) doi: 10.2807/1560-7917.ES.2020.25.4.2000058. https://www.eurosurveillance.org/content/10.2807/1560-7917.ES.2020.25.4.2000058 Jan 30 [cited 2020 Jul 28];. Available from: [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ferretti L., Wymant C., Kendall M., Zhao L., Nurtay A., Abeler-Dörner L. Quantifying SARS-CoV-2 transmission suggests epidemic control with digital contact tracing. Science. 2020;368(6491) doi: 10.1126/science.abb6936. https://science.sciencemag.org/content/368/6491/eabb6936 May 8 [cited 2020 Jun 24];. Available from: [DOI] [PMC free article] [PubMed] [Google Scholar]
- Casey M., Griffin J., McAloon C.G., Byrne A.W., Madden J.M., McEvoy D. Pre-symptomatic transmission of SARS-CoV-2 infection: a secondary analysis using published data. medRxiv. 2020 doi: 10.1136/bmjopen-2020-041240. Jun 11;2020.05.08.20094870. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Salje H., Kiem C.T., Lefrancq N., Courtejoie N., Bosetti P., Paireau J. Estimating the burden of SARS-CoV-2 in France. Science. 2020 doi: 10.1126/science.abc3517. https://science.sciencemag.org/content/early/2020/05/12/science.abc3517 May 13 [cited 2020 Jun 24]; Available from: [DOI] [PMC free article] [PubMed] [Google Scholar]
- Verity R., Okell L.C., Dorigatti I., Winskill P., Whittaker C., Imai N. Estimates of the severity of coronavirus disease 2019: a model-based analysis. Lancet Infect. Dis. 2020 doi: 10.1016/S1473-3099(20)30243-7. https://www.thelancet.com/journals/laninf/article/PIIS1473-3099(20)30243-7/abstract Mar 30 [cited 2020 Apr 7];0(0). Available from: [DOI] [PMC free article] [PubMed] [Google Scholar]
- Wu J.T., Leung K., Bushman M., Kishore N., Niehus R., de Salazar P.M. Estimating clinical severity of COVID-19 from the transmission dynamics in Wuhan, China. Nat. Med. 2020;26(4):506–510. doi: 10.1038/s41591-020-0822-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hauser A., Counotte M.J., Margossian C.C., Konstantinoudis G., Low N., Althaus C.L. Estimation of SARS-CoV-2 mortality during the early stages of an epidemic: a modeling study in Hubei, China, and six regions in Europe. medRxiv. 2020 doi: 10.1371/journal.pmed.1003189. Jul 12;2020.03.04.20031104. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Flaxman S., Mishra S., Gandy A., Unwin H.J.T., Mellan T.A., Coupland H. Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe. Nature. 2020:1–8. doi: 10.1038/s41586-020-2405-7. [DOI] [PubMed] [Google Scholar]
- Wu J.T., Leung K., Bushman M., Kishore N., Niehus R., de Salazar P.M. Estimating clinical severity of COVID-19 from the transmission dynamics in Wuhan, China. Nat. Med. 2020:1–5. doi: 10.1038/s41591-020-0822-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Walker P.G.T., Whittaker C., Watson O.J., Baguelin M., Winskill P., Hamlet A. The impact of COVID-19 and strategies for mitigation and suppression in low- and middle-income countries. Science. 2020;369(6502):413–422. doi: 10.1126/science.abc0035. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lauer S.A., Grantz K.H., Bi Q., Jones F.K., Zheng Q., Meredith H.R. The incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases: estimation and application. Ann. Intern. Med. 2020;172(9):577–582. doi: 10.7326/M20-0504. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Long Q.-X., Liu B.-Z., Deng H.-J., Wu G.-C., Deng K., Chen Y.-K. Antibody responses to SARS-CoV-2 in patients with COVID-19. Nat. Med. 2020;26(6):845–848. doi: 10.1038/s41591-020-0897-1. [DOI] [PubMed] [Google Scholar]
- Kucirka L.M., Lauer S.A., Laeyendecker O., Boon D., Lessler J. Variation in false-negative rate of reverse transcriptase polymerase chain reaction–based SARS-CoV-2 tests by time since exposure. Ann. Intern. Med. 2020 May 13 [cited 2020 Jun 24]. doi: 10.7326/M20-1495.
- Bi Q., Wu Y., Mei S., Ye C., Zou X., Zhang Z. Epidemiology and transmission of COVID-19 in 391 cases and 1286 of their close contacts in Shenzhen, China: a retrospective cohort study. Lancet Infect. Dis. 2020 Apr 27 [cited 2020 Jun 24]. doi: 10.1016/S1473-3099(20)30287-5. Available from: http://www.sciencedirect.com/science/article/pii/S1473309920302875
- Ali S.T., Wang L., Lau E.H.Y., Xu X.-K., Du Z., Wu Y. Serial interval of SARS-CoV-2 was shortened over time by nonpharmaceutical interventions. Science. 2020 Jul 21 [cited 2020 Jul 31]. doi: 10.1126/science.abc9004. Available from: https://science.sciencemag.org/content/early/2020/07/20/science.abc9004
- Miller A.C., Foti N.J., Lewnard J.A., Jewell N.P., Guestrin C., Fox E.B. Mobility trends provide a leading indicator of changes in SARS-CoV-2 transmission. medRxiv. 2020 May 11;2020.05.07.20094441.
- Pellis L., Cauchemez S., Ferguson N.M., Fraser C. Systematic selection between age and household structure for models aimed at emerging epidemic predictions. Nat. Commun. 2020;11(1):906. doi: 10.1038/s41467-019-14229-4.
- Gostic K.M., McGough L., Baskerville E., Abbott S., Joshi K., Tedijanto C. Practical considerations for measuring the effective reproductive number, Rt. medRxiv. 2020 Jun 23;2020.06.18.20134858. doi: 10.1371/journal.pcbi.1008409.
- Cori A., Ferguson N.M., Fraser C., Cauchemez S. A new framework and software to estimate time-varying reproduction numbers during epidemics. Am. J. Epidemiol. 2013;178(9):1505–1512. doi: 10.1093/aje/kwt133.
- Lefrancq N., Paireau J., Hozé N., Courtejoie N., Yazdanpanah Y., Bouadma L. 2020. Evolution of Outcomes for Patients Hospitalized During the First SARS-CoV-2 Pandemic Wave in France.
- Knock E., Whittles L., Lees J., Perez Guzman P., Verity R., Fitzjohn R. 2020. Report 41: The 2020 SARS-CoV-2 Epidemic in England: Key Epidemiological Drivers and Impact of Interventions.
- Russell T.W., Golding N., Hellewell J., Abbott S., Pearson C.A.B., van Zandvoort K. 2020. Reconstructing the Global Dynamics of Unreported COVID-19 Cases and Infections. CMMID Repository. [cited 2020 Jul 31]. Available from: https://cmmid.github.io/topics/covid19/Under-Reporting.html
- Menni C., Valdes A.M., Freidin M.B., Sudre C.H., Nguyen L.H., Drew D.A. Real-time tracking of self-reported symptoms to predict potential COVID-19. Nat. Med. 2020;26(7):1037–1040. doi: 10.1038/s41591-020-0916-2.
- Jing Q.-L., Liu M.-J., Yuan J., Zhang Z.-B., Zhang Z.-B., Dean N.E. Household secondary attack rate of COVID-19 and associated determinants. medRxiv. 2020 Apr 15;2020.04.11.20056010. doi: 10.1016/S1473-3099(20)30471-0.
- Li W., Zhang B., Lu J., Liu S., Chang Z., Peng C. Characteristics of household transmission of COVID-19. Clin. Infect. Dis. 2020 [cited 2020 Jul 31]. doi: 10.1093/cid/ciaa450. Available from: https://academic.oup.com/cid/article/doi/10.1093/cid/ciaa450/5821281
- Hellewell J., Abbott S., Gimma A., Bosse N.I., Jarvis C.I., Russell T.W. Feasibility of controlling COVID-19 outbreaks by isolation of cases and contacts. Lancet Glob. Health. 2020;8(4):e488–96. doi: 10.1016/S2214-109X(20)30074-7.
- Nouvellet P., Bhatia S., Cori A., Ainslie K., Baguelin M., Bhatt S. 2021. Report 26: Reduction in Mobility and COVID-19 Transmission.
- Clipman S.J., Wesolowski A.P., Gibson D.G., Agarwal S., Lambrou A.S., Kirk G.D. Rapid real-time tracking of non-pharmaceutical interventions and their association with SARS-CoV-2 positivity: the COVID-19 pandemic pulse study. Clin. Infect. Dis. 2020 [cited 2020 Sep 5]. doi: 10.1093/cid/ciaa1313. Available from: https://academic.oup.com/cid/advance-article/doi/10.1093/cid/ciaa1313/5900759
- Grassly N.C., Pons-Salort M., Parker E.P.K., White P.J., Ferguson N.M., Ainslie K. Comparison of molecular testing strategies for COVID-19 control: a mathematical modelling study. Lancet Infect. Dis. 2020 Aug 18 [cited 2020 Aug 27]. doi: 10.1016/S1473-3099(20)30630-7. Available from: http://www.sciencedirect.com/science/article/pii/S1473309920306307
- Ma S., Zhang J., Zeng M., Yun Q., Guo W., Zheng Y. Epidemiological parameters of coronavirus disease 2019: a pooled analysis of publicly reported individual data of 1155 cases from seven countries. medRxiv. 2020;2020.03.21.20040329.
- Britton T., Scalia Tomba G. Estimation in emerging epidemics: biases and remedies. J. R. Soc. Interface. 2019;16(150) [cited 2020 Jul 21]. doi: 10.1098/rsif.2018.0670. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6364646/
- Linton N.M., Kobayashi T., Yang Y., Hayashi K., Akhmetzhanov A.R., Jung S. Incubation period and other epidemiological characteristics of 2019 novel coronavirus infections with right truncation: a statistical analysis of publicly available case data. J. Clin. Med. 2020;9(2):538. doi: 10.3390/jcm9020538.
- R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing; Vienna, Austria: 2018.
Data Availability Statement
The code and data are available at https://github.com/FrancoisBlanquart/covid_model







