Journal of Applied Statistics. 2024 Jun 18;51(16):3366–3385. doi: 10.1080/02664763.2024.2351467

Identifying waves of COVID-19 mortality using skew normal curves

Kamal Rai a, Patrick E Brown a,b
PMCID: PMC11610353  PMID: 39628858

Abstract

We propose a model for multiple waves of an epidemic that decomposes the health outcome of interest into the sum of scaled skew normal curves. When applied to daily COVID-19 mortality in six regions (Japan, Italy, Belgium, Ontario, Texas, and Peru), this model provides three notable results. First, when fit to data from early 2020 to May 31, 2022, the estimated skew normal curves substantially overlap with the dates of COVID-19 waves in Ontario and Belgium, as determined by their respective health authorities. Second, the asymmetry of the skew normal curves changes over time – they progress from increasing more quickly to decreasing more quickly, indicating changes in the relative speed that daily COVID-19 mortality rises and falls over time. Third, most regions have day-of-the-week effects, which suggests that day-of-the-week effects should be included when modeling daily COVID-19 mortality. We conclude by discussing limitations and possible extensions of this model and its results, including commenting on its applicability to potential future COVID-19 waves.

Keywords: Daily mortality, day-of-the-week effect, skew normal distribution, Bayesian inference, COVID-19

1. Introduction

The COVID-19 outbreak was officially recognized as a pandemic on March 11, 2020 [8]. In many regions, the initial outbreak was associated with rapidly rising daily cases that strained health care systems. A large variety of statistical models aimed to predict quantities of interest, such as when COVID-19 health outcomes (cases, hospitalizations, and mortality) would peak, while others studied the accuracy of COVID-19 tests, or the prevalence of COVID-19 in particular regions. In this paper, we propose a model with interpretable parameters that can characterize key aspects of COVID-19 waves, such as when they peak, their degree of asymmetry, and how they evolve over time.

While an exhaustive review of COVID-19 modeling is beyond the scope of this paper, we give a sense of the variety of approaches used. The interested reader can consult Wynants et al. [42] and Gnanvi et al. [12] for more in-depth reviews. We first consider the Centers for Disease Control and Prevention (CDC) ensemble model [5], which incorporated seventy-six models as of October 7, 2022 [7]. Each model forecasts one or more of cases, hospitalizations, and deaths. The ensemble contains twenty-three Susceptible-Exposed-Infectious-Removed (SEIR) models, ten ensemble models, eight Susceptible-Infectious-Removed (SIR) models, six time series models, five deep learning models, and two regression models. These models take different approaches to incorporating public health interventions. Twenty-seven models assume interventions will remain constant during the forecasting period while seventeen assume they will continue in the future. Fourteen models make various assumptions regarding social distancing and mobility, and ten make no assumptions. The variety of models and assumptions in this ensemble demonstrates the breadth of COVID-19 modeling during the pandemic. However, note that the quantities of interest are very similar across these models.

For examples of quantities of interest, we turn to SIR and SEIR models. SIR models incorporate compartments for the susceptible S(t), infectious I(t) and recovered R(t) populations over time t, while SEIR models incorporate an additional compartment for the exposed E(t). Tang et al. [39] contains a more detailed introduction to SIR and SEIR models. One quantity of interest is the effective reproductive number, the average number of secondary cases attributable to an infected individual at time t, where a value below one suggests the infected population is decreasing (at t). For example, Peng et al. [24] propose a seven-compartment model that additionally considers insusceptible and quarantined populations. Using data from January 20, 2020 to February 9, 2020, they estimate that the effective reproductive number is below one by the end of January for Beijing and Shanghai, and by the first half of February for Wuhan. SIR and SEIR models can also estimate when COVID-19 waves peak. For example, Hao et al. [14] propose a (different) seven-compartment model that is fit to data from January 1, 2020 to February 29, 2020 in Wuhan. They find that daily infections peak on February 2, and that the effective reproductive number is below one from February 2 to March 8.

Daily COVID-19 mortality often occurs in rising and falling periods that do not happen at regular intervals. In addition, these periods have varying duration and levels of asymmetry (the relative rate at which daily mortality rises and falls). Our goal is to propose a parametric model that captures these characteristics in the multi-wave setting, and provides interpretable parameters that help quantify them. We therefore extend the single skew normal curve model of Slater et al. [36] by proposing the sum of skew normal (SoSN) model, which is described in more detail in Section 2.2. It uses a negative binomial response distribution whose intensity includes the sum of skew normal curves, and provides a decomposition of daily mortality into the terms of that sum. We apply it to daily COVID-19 mortality data, such that the (asymmetric) skew normal curves provide information on when COVID-19 waves peak, and on the rate at which they increase and decrease through the shape parameter of the curves. This interpretable parameter quantifies the asymmetry of each curve. One contribution of the SoSN model is that it provides the evolution of the shape parameter over time, which can help us understand how patterns of COVID-19 mortality have changed over time.

The SoSN model includes multiple asymmetric curves so that it does not assume periodicity but can capture the irregular periods of asymmetric growth and decline in daily COVID-19 mortality. It only requires daily COVID-19 mortality data for each region and can be specified with any number of skew normal curves, making it easily applicable across regions. Slater et al. [36] find day-of-the-week effects in some regions, and so we also include them in the SoSN model. We fit four regions with the same day-of-the-week effect across the study period, and two regions with different day-of-the-week effects across the two halves of the study period. In these two regions, we additionally examine if the day-of-the-week effect changes between the two periods. The SoSN model can provide forecasts in multi-wave settings, and we examine its forecasts in periods of increasing, peaking, and decreasing daily COVID-19 mortality in Section 4. Finally, note that while one model in the CDC ensemble model uses skew normal curves, it independently fits curves to different time periods [40]. Our model fits a sum of skew normal curves to the entire study period. This is a major difference, as our model uses the full data set to provide a data-driven decomposition of COVID-19 mortality over time.

This paper contains four additional sections. Section 2 describes the SoSN model in detail. In Section 3, we examine the results when fitting the model to six regions (Japan, Italy, Belgium, Ontario, Texas, and Peru) using COVID-19 mortality data from early 2020 to May 31, 2022. In Section 4, we fit the SoSN model to data from early 2020 to November 12, 2020 and examine short-term forecasting results. Section 5 summarizes our results and discusses limitations and extensions of the SoSN model.

2. Methods

This section introduces the publicly available data sources used in this paper, and proposes a model where the intensity of the negative binomial response distribution equals the sum of scaled skew normal curves, plus a term to capture the much lower level of COVID-19 mortality that occurs between COVID-19 waves. We fit this model to daily COVID-19 mortality in two different time periods. We use an eight curve model for the entire study period (early 2020 to May 31, 2022) that we call the ‘full model’ and a three curve model for a shorter period (early 2020 to November 12, 2020) that we call the ‘forecasting model.’ We also describe the priors used in the SoSN model, and discuss the choice of using three and eight curves in each model in subsequent sections.

2.1. COVID-19 mortality data

We consider daily COVID-19 mortality in six regions – Japan, Italy, Belgium, Ontario, Texas and Peru – from the start of the pandemic to May 31, 2022. The data for Japan, Italy, and Peru is from the World Health Organization (WHO) Coronavirus (COVID-19) Dashboard, and the data for Texas is from the CDC [3,41]. The Belgium data is from Sciensano, and the Ontario data is from the Public Health Infobase, made available by the Public Health Agency of Canada [27,34]. We remove obvious outliers in Italy, Ontario, Japan, Texas, and Peru, such as those in and around holiday periods. Also, note that different regions attribute deaths to COVID-19 slightly differently. For instance, Belgian health authorities suggest attributing deaths clinically suspected of being due to COVID-19, while Italian health authorities suggest a narrower definition based on laboratory tests [4]. Karanikolos and McKee [18] contains a table of the definitions for COVID-19 deaths used across a variety of countries. Although the definitions used across regions are somewhat different, this is the (daily) data that is made available by health authorities, and is therefore appropriate to use in this model.

Figure 1 shows daily COVID-19 mortality by day-of-the-week (DOW) across these six regions, and illustrates three of its key features. The first is that some regions have observable DOW effects. For example, the observed mortality on Monday in Italy is consistently below the observed mortality on other days (in Italy). In Texas, the DOW effect has little visible variation in the first half, but more in the second half. One possible reason for this is reporting changes, such as the ones that occurred in Peru, where officials revised death counts substantially higher halfway through the pandemic [9]. Second, many regions experience COVID-19 mortality in visible waves. We subsequently refer to periods of visibly rising and then falling daily mortality as ‘observed waves,’ as a qualitative description of this aspect of the observed data. We call the terms of the skew normal sum ‘curves’ to distinguish them from observed waves, and use 'designated wave' to refer to waves as identified by others. Third, observed waves need not be symmetric. For example, the first observed wave appears to rise much more quickly than it declines in most regions. Subsequent observed waves are not as visibly asymmetric.

Figure 1.

Observed daily COVID-19 mortality in Japan, Italy, Belgium, Ontario, Texas and Peru (left to right) by day of the week from early 2020 to May 31, 2022.

2.2. The sum of skew normal model

We propose a model that captures three features of COVID-19 mortality data: (1) day-of-the-week effects, (2) the pattern of increasing and decreasing mortality, and (3) the asymmetry between the increasing and decreasing parts of that pattern. To achieve this, we model daily COVID-19 mortality with a negative binomial distribution whose mean on day t in region j, λtj, is the sum of scaled skew normal curves (plus an additional term for mortality that occurs between observed waves). We chose a negative binomial response because it is well suited to the overdispersed count data often seen in infectious disease applications [20]. The SoSN model for region j and any positive integer N is,

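\[ Y_{tj} \sim \text{NegBin}(\lambda_{tj}, \eta_j), \tag{1} \]
\[ \lambda_{tj} = E_j \, \beta_{\text{DOW}(t)j} \left[ \sum_{i=1}^{N} C_{ij} \, f(t \mid A_{ij}, B_{ij}, K_{ij}) + D_j \right], \tag{2} \]
\[ f(t \mid A, B, K) = \frac{2}{B} \, \phi\!\left( \frac{t - A}{B} \right) \Phi\!\left( K \, \frac{t - A}{B} \right), \tag{3} \]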

where ϕ is the standard Gaussian density and Φ is its distribution function. In this model, Ytj is the observed COVID-19 mortality on day t in region j, βDOW(t)j is the DOW effect on day t in region j, and Ej is the age standardized mortality for region j relative to Italy on March 29, 2020, which is near the date when Italy's first COVID-19 wave peaked. The inclusion of Ej makes the results of the scaling parameter Cij more comparable across regions with different demographics, such as Japan and Peru. We describe the scaling parameter in more detail below. Also, note that the number of skew normal curves in the sum, N, is fixed before estimation.
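As a minimal sketch of Equations (2) and (3), the following Python code evaluates the skew normal density and the resulting intensity for one region on one day. The function names and all numeric inputs are hypothetical and chosen only for illustration; they are not estimates from this paper, and this is not the Stan implementation used by the authors.

```python
import math


def std_normal_pdf(z):
    """Standard Gaussian density, phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z):
    """Standard Gaussian distribution function, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def skew_normal_pdf(t, A, B, K):
    """Equation (3): skew normal density with location A, scale B, shape K."""
    z = (t - A) / B
    return 2.0 / B * std_normal_pdf(z) * std_normal_cdf(K * z)


def intensity(t, E, beta_dow, curves, D):
    """Equation (2): mean of the negative binomial on day t for one region.

    curves is a list of (C, A, B, K) tuples, one per skew normal curve.
    """
    curve_sum = sum(C * skew_normal_pdf(t, A, B, K) for C, A, B, K in curves)
    return E * beta_dow * (curve_sum + D)


# Hypothetical inputs: two curves, a Thursday (beta = 1), and a small spark term.
example_curves = [(1.5, 90.0, 14.0, 3.0), (0.8, 300.0, 20.0, -1.0)]
example_lambda = intensity(t=100, E=20000.0, beta_dow=1.0,
                           curves=example_curves, D=1.0 / 23000.0)
```

Setting K = 0 in `skew_normal_pdf` recovers the Gaussian density with mean A and standard deviation B, which is a convenient sanity check on the implementation.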

The negative binomial distribution has mean λtj and variance λtj + λtj²/ηj, where ηj is a measure of over-dispersion. Recalling that the negative binomial distribution can be written as a Poisson-Gamma mixture, this parameterization corresponds to a Gamma distribution with shape and rate parameters both equal to ηj [15], mean 1, variance 1/ηj and coefficient of variation 1/√ηj. In Section 2.3, we place a prior on this coefficient of variation.
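The mean and variance of this parameterization can be checked numerically. The sketch below writes the probability mass function in terms of a mean λ and dispersion η (the form that arises from a Poisson whose rate is λ times a Gamma(η, η) variable) and sums it over a long grid of counts. The function names and the example values λ = 5 and η = 2 are assumptions made for illustration.

```python
import math


def neg_binomial_logpmf(y, lam, eta):
    """Log pmf of a negative binomial with mean lam and dispersion eta."""
    return (math.lgamma(y + eta) - math.lgamma(eta) - math.lgamma(y + 1)
            + eta * math.log(eta / (eta + lam))
            + y * math.log(lam / (eta + lam)))


def numeric_moments(lam, eta, y_max=2000):
    """Total probability, mean and variance by direct summation up to y_max."""
    probs = [math.exp(neg_binomial_logpmf(y, lam, eta)) for y in range(y_max + 1)]
    total = sum(probs)
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return total, mean, var


total, mean, var = numeric_moments(lam=5.0, eta=2.0)
# The variance should be close to lam + lam**2 / eta = 5 + 25 / 2 = 17.5.
```

Smaller values of η give more over-dispersion, and the distribution approaches a Poisson as η grows large.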

The mean λtj has two components – scaled skew normal densities that we call skew normal curves, and a ‘spark’ term, Dj. The skew normal curves capture mortality during observed waves, while the latent (non-observed) parameter Dj captures COVID-19 deaths that occur outside of observed waves. The ith skew normal curve in region j is governed by a skew normal density with location parameter Aij, scale parameter Bij, and shape parameter Kij. The severity parameter Cij is a scaling parameter relative to expected deaths Ej, with larger values indicating more COVID-19 mortality, and Kij captures the asymmetry of the ith curve in region j. When Kij=0, the skew normal curve reduces to the symmetric (Gaussian) case.

The SoSN model is implemented in Stan, via the R interface RStan [38]. Stan is a probabilistic programming language that provides a BUGS-like interface and uses automatic differentiation so that its users do not have to calculate any derivatives needed for inference [2]. It uses a version of Hamiltonian Monte Carlo (HMC), or more specifically a version of the No-U-Turn Sampler (NUTS) presented in Hoffman and Gelman [16], that does not require any hand-tuning of algorithmic parameters. The HMC algorithm uses the gradient of the log posterior distribution for efficient exploration of the posterior, and Stan is a state-of-the-art software package for Bayesian inference [21]. More details on HMC are available in Betancourt [1], and more details on the NUTS algorithm are available in Hoffman and Gelman [16].

2.3. Prior distributions

We consider two versions of the SoSN model in Section 3. Recall that the full model is fit to data from early 2020 to May 31, 2022 with N = 8 skew normal curves. Let μ = (April 1 2020, August 1 2020, December 1 2020, April 1 2021, August 1 2021, December 1 2021, March 1 2022, June 1 2022), and βdj be the day-of-the-week effect on day d in region j for d = Monday, …, Sunday. We set Thursday as the baseline day and fix βThursday,j equal to 1. With i indexing the N skew normal curves, the priors for the full model (with N = 8 and i = 1, …, 8) are,

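\[ A_{ij} \sim N(\mu_i, 8), \quad B_{ij} \sim \text{Gamma}(7, 0.5), \quad C_{ij} \sim \text{Gamma}(0.4, 0.25), \quad K_{ij} \sim N(0, 2), \]
\[ 1/\sqrt{\eta_j} \sim \text{Exp}(10), \quad D_j \sim \text{Exp}(23{,}000), \quad \beta_{dj} \sim \text{Gamma}(1, 1), \; d \neq \text{Thursday}. \]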

The forecasting model is fit to data from early 2020 to November 12, 2020 with N = 3 and the same priors as above (for i = 1, 2, 3). These priors are motivated by Slater et al. [36], which considers the first two COVID-19 waves in the early months of the pandemic. They are informative for some parameters. For instance, the priors on the location parameters encourage an ordering on the skew normal curves (which helps prevent re-labeling). We discuss the individual priors in more detail subsequently. Figure 2 shows the scaled skew normal densities (skew normal curves) and cumulative distribution functions obtained by taking ten samples from the prior distributions of the parameters of the scaled skew normal density. As illustrated below, they cover a range of possible skew normal curves.

Figure 2.

The scaled skew normal densities (left) and their cumulative distribution function (right), from ten samples from the prior distribution of the skew normal distribution.
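A prior draw of this kind can be sketched with the Python standard library. The code below is an illustration under stated assumptions: days are counted from a hypothetical origin of January 1, 2020, the Gamma priors are taken in the shape and rate parameterization, the normal priors are taken with the second argument as a standard deviation, and only the first curve's prior mean is used. It is not the code that produced Figure 2.

```python
import math
import random
from datetime import date

ORIGIN = date(2020, 1, 1)  # hypothetical day-zero, not stated in the paper
PRIOR_MEAN_DATES = [date(2020, 4, 1), date(2020, 8, 1), date(2020, 12, 1),
                    date(2021, 4, 1), date(2021, 8, 1), date(2021, 12, 1),
                    date(2022, 3, 1), date(2022, 6, 1)]
mu_days = [(d - ORIGIN).days for d in PRIOR_MEAN_DATES]


def skew_normal_pdf(t, A, B, K):
    """Skew normal density with location A, scale B and shape K."""
    z = (t - A) / B
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(K * z / math.sqrt(2.0)))
    return 2.0 / B * pdf * cdf


def sample_curve_prior(mu_day, rng):
    """One draw of (A, B, C, K) from the priors of Section 2.3."""
    A = rng.gauss(mu_day, 8.0)               # location, standard deviation 8
    B = rng.gammavariate(7.0, 1.0 / 0.5)     # shape 7, rate 0.5 (scale 2)
    C = rng.gammavariate(0.4, 1.0 / 0.25)    # shape 0.4, rate 0.25 (scale 4)
    K = rng.gauss(0.0, 2.0)                  # shape parameter, sd 2
    return A, B, C, K


rng = random.Random(2024)  # fixed seed so the sketch is reproducible
days = range(0, 300)
prior_curves = []
for _ in range(10):
    A, B, C, K = sample_curve_prior(mu_days[0], rng)
    prior_curves.append([C * skew_normal_pdf(t, A, B, K) for t in days])
```

Note that `random.gammavariate` takes a shape and a scale, so each rate from the prior specification is inverted before being passed in.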

The normal prior on the location parameters Aij is somewhat informative. It has roughly equally spaced means with a standard deviation of eight. This prior encourages an ordering on the Aij, so that the model does not simply relabel the skew normal curves. We incorporated subject matter expertise into the selection of some priors. For example, the prior on the scale parameter Bij is mildly informative, as COVID-19 waves tend to have a duration of two to four months. The 2.5% and 97.5% percentiles of its Gamma prior are 5.63 and 26.12, respectively, which covers those durations but is also somewhat broader. The severity parameter Cij is a scaling factor relative to expected deaths Ej. Its Gamma prior has 2.5% and 97.5% percentiles of 0.00 and 8.90, respectively, and is less informative as we allow for the possibility of severe outbreaks. The shape parameter Kij has a normal prior with a mean of zero and standard deviation of two. This is more general than the Gamma prior in Slater et al. [36], which places all prior mass on Kij>0. It is symmetric about zero, and therefore neutral about the possible direction of asymmetry in the skew normal curves. The 2.5% and 97.5% percentiles of this prior are -3.92 and 3.92, which correspond to fairly asymmetric skew normal curves.
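The quoted percentiles for the scale and shape priors can be reproduced with a short calculation. The sketch below assumes the Gamma prior on the scale parameter is in the shape and rate parameterization; because its shape is an integer, the distribution function has a closed form (the Erlang form) and the quantile can be found by bisection. The helper names are hypothetical.

```python
import math
from statistics import NormalDist


def erlang_cdf(x, shape, rate):
    """CDF of a Gamma(shape, rate) distribution with integer shape."""
    if x <= 0.0:
        return 0.0
    rx = rate * x
    term, partial_sum = 1.0, 1.0
    for k in range(1, shape):
        term *= rx / k          # (rx)^k / k!
        partial_sum += term
    return 1.0 - math.exp(-rx) * partial_sum


def erlang_quantile(p, shape, rate, lo=0.0, hi=500.0):
    """Quantile by bisection on the monotone CDF."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if erlang_cdf(mid, shape, rate) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Scale parameter B ~ Gamma(7, 0.5): compare with the quoted 5.63 and 26.12.
b_lo = erlang_quantile(0.025, 7, 0.5)
b_hi = erlang_quantile(0.975, 7, 0.5)

# Shape parameter K ~ N(0, sd = 2): compare with the quoted -3.92 and 3.92.
k_prior = NormalDist(mu=0.0, sigma=2.0)
k_lo, k_hi = k_prior.inv_cdf(0.025), k_prior.inv_cdf(0.975)
```

The severity prior has a non-integer shape (0.4), so its percentiles would need a general Gamma quantile routine rather than the Erlang shortcut used here.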

The (multiplicative) DOW effect βdj is non-negative. It is given a Gamma prior with a shape and rate parameter of one, such that its mean and coefficient of variation are one. The (transformed) over-dispersion parameter 1/√ηj is given the same exponential prior as Slater et al. [36]. Recalling that 1/√ηj is the coefficient of variation in the Poisson-Gamma mixture representation of our parameterization of the negative binomial distribution (described in Section 2.2), this exponential prior with a mean of 0.1 is moderately informative. Finally, Dj is a latent (unobserved) parameter that captures deaths in region j that occur outside of skew normal curves. Since Dj is scaled by Ej, we expect it to be small. The age standardized mortality in Japan is EJapan = 22,962, such that EJapan/23,000 is approximately one. We therefore give Dj an informative exponential prior with rate 23,000, so that its mean is approximately the age adjusted mortality in Japan outside of observed waves. Large values of a similar magnitude, such as 10,000, are also reasonable choices for this prior rate parameter.

3. Results

We begin by examining the results obtained when the full model is fit to six regions – Japan, Italy, Belgium, Ontario, Texas, and Peru – as shown in Figure 3. The two regions in the middle row, Belgium and Ontario, are additionally annotated with the dates of COVID-19 waves as designated by Public Health Ontario and Sciensano [32,35], and when the Alpha, Delta, and Omicron variants of concern (VoC) were dominant [19,28,29,33]. Note that these dates are typically reported weekly or bi-weekly. We will first compare the estimated skew normal curves in Belgium and Ontario to the designated waves, then examine the estimated skew normal curves in these six regions.

Figure 3.

Observed COVID-19 mortality (gray points), posterior mean and 95% credible interval (light blue lines and shaded regions), skew normal curves (assorted colors) and their 95% credible regions (shaded gray regions) in Japan, Italy, Belgium, Ontario, Texas, and Peru (left to right) from early 2020 to May 31, 2022. The plots for Belgium and Ontario are annotated with the dates of six waves, as determined by their respective health authorities, and with the dates that the Alpha, Delta, and Omicron variants were dominant.

3.1. Decomposition

The SoSN model decomposes daily COVID-19 mortality into a sum of skew normal curves. Figure 3 shows this decomposition in terms of deaths per million per day. In Belgium, the midpoints of the six designated waves are roughly aligned with the posterior modes of six (out of eight) of the skew normal curves. The two remaining curves are in the inter-wave periods of late summer 2020 and 2021, and have much lower peaks. This demonstrates that the SoSN model can adjust to periods with different levels (high and low) of daily deaths per million (DPM). Regarding variants, the third designated wave is associated with the Alpha variant, the fourth is associated with the Delta variant, and the fifth and sixth are associated with the Omicron variant. Note that in Belgium, the BA.1 variant became dominant at the end of December 2021, the BA.2 sub-variant became dominant at the end of February 2022, and the BA.5 sub-variant became dominant in mid-June 2022 [19].

In Ontario, the midpoints of the six designated waves roughly align with the posterior peaks of five (out of eight) skew normal curves. The three remaining curves have much lower peaks and occur in the summers of 2020 and 2021. While not designated as inter-wave periods, Ontario also has lower daily DPM in these two summer periods. The Alpha variant is associated with the third designated wave, the Delta variant with the fourth designated wave (which occurs in the summer and has much lower daily DPM than other designated waves), and the Omicron variant with the fifth and sixth designated waves. Also, note that in Ontario, the B.1.1.529 sub-variant became dominant in mid-December 2021, and the BA.2 sub-variant became dominant in mid-March 2022 [30,31].

Recall from Section 2.2 that N is fixed prior to estimation. We chose N=3 for the forecasting model (discussed in Section 4) that is fit to data from early 2020 to November 12, 2020, and N = 8 for the full model that is fit to data from early 2020 to May 31, 2022. Our choice of N is motivated by the number of designated COVID-19 waves in Belgium and Ontario over those time periods – Belgium has two designated waves and one inter-wave period between early 2020 and November 2020, and six designated waves and two inter-wave periods between early 2020 and May 2022. The pattern of daily COVID-19 mortality in Ontario is similar. We discuss possible extensions to the SoSN model, where N is not fixed, in Section 5.

We make two additional observations. The peaks of the skew normal curves roughly coincide with the peaks in the observed data across all six regions, and the skew normal curves achieve this under various levels of asymmetry. Thus, Figure 3 illustrates that the SoSN model achieves the second and third goals listed in Section 2.2. However, a key difference between skew normal curves and designated waves is that curves can overlap. For example, they do so at various times between September 2021 and April 2022 in Italy, Texas, and Peru. Also, the intensity typically has higher uncertainty during periods with overlapping curves than during periods with non-overlapping curves.

3.2. Model parameters

We present parameter estimates for the full model in Figure 4. The Supplemental Materials contain additional summary tables and traceplots of these parameters. Figure 4 shows estimates for the skew normal and DOW parameters, as well as the severity parameter C. For C, Japan has consistently low estimates across its eight curves, indicating that it had low COVID-19 mortality relative to expectations. In contrast, the first, second, and fourth curves in Peru have the highest (mean) values of C, so it experienced the most COVID-19 mortality relative to expectations. This may be related to reporting adjustments, since Peruvian officials revised their official COVID-19 mortality numbers almost three times higher in mid-2021 [9]. The sixth curve in Japan has the lowest estimated (mean) C, and the curve is barely visible in Figure 3. In the case of Japan, the SoSN model demonstrates an ability to estimate curves with very low peaks (in daily DPM).

Figure 4.

The posterior distributions (mean and 95% credible intervals) of the severity parameter (C), skew normal location (A), scale (B), and shape (K) parameters, and day-of-the-week effects (βdj) estimated by the full model in the indicated region.

Next, we examine the parameter estimates of the skew normal densities. As these parameters are associated with (and govern) the skew normal curves in the SoSN model, we will also refer to them as the parameters of the skew normal curves. The estimates are shown in Figure 4. The (mean) location parameter for Italy's first curve is earlier than those of other regions. For the third curve, the location parameters for Belgium and Italy are earlier than those of other regions. The differences between regions become smaller after the fourth curve. Since curves can overlap, the scale parameter is more difficult to interpret. We simply note the highest and lowest estimates. The third curve in Italy has the largest (mean) scale parameter. It occurs during a ten month period of high daily DPM (October 2020 to July 2021) and declines very slowly while overlapping with its fourth curve. The sixth curve in Japan has the smallest (mean) scale parameter, and occurs during a period of very low daily DPM in the second half of 2021.

While the DOW model in Slater et al. [36] only uses one skew normal curve, the SoSN model has N curves and therefore provides the evolution of K over time. Recall that K>0 indicates a skew normal curve that increases more steeply than it decreases, and K<0 the opposite. The evolution of K is consistent across regions. The first curve has positive K, with the mean being (often very) positive and the lower bound of the 95% credible interval being above zero in all six regions. The mean and upper bound of the 95% credible interval of K progress to being below zero by the last curve. Estimation results for the peak of each curve are similar to the results for the location parameter. Italy's first curve peaks earlier than those of other regions (on March 29, 2020), while its second curve peaks later than those of other regions (on September 24, 2020). The third curve peaks earliest in Peru (on November 3, 2020) and latest in Japan (on January 28, 2021). The differences between regions become smaller after the fourth curve. Full estimation results for the peak of each curve are given in the Supplemental Materials.
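The link between the sign of K and the relative speed of rise and fall can be illustrated numerically. The sketch below locates the peak of a skew normal curve on a fine grid and measures how long the curve takes to climb from half its peak height to the peak, and to fall back to half height. The function names, the grid resolution, and the parameter values (A = 100, B = 10, K = 3, 0, -3) are assumptions for illustration only.

```python
import math


def skew_normal_curve(t, A, B, K):
    """Skew normal density with location A, scale B and shape K."""
    z = (t - A) / B
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(K * z / math.sqrt(2.0)))
    return 2.0 / B * pdf * cdf


def peak_and_half_widths(A, B, K, points_per_scale=1000):
    """Return (peak time, rise time, fall time) measured at half the peak height."""
    n = 6 * points_per_scale  # grid spans A - 6B to A + 6B
    grid = [A + B * i / points_per_scale for i in range(-n, n + 1)]
    values = [skew_normal_curve(t, A, B, K) for t in grid]
    i_peak = max(range(len(values)), key=values.__getitem__)
    half = values[i_peak] / 2.0
    above = [i for i, v in enumerate(values) if v >= half]
    rise = grid[i_peak] - grid[above[0]]    # half height up to the peak
    fall = grid[above[-1]] - grid[i_peak]   # peak back down to half height
    return grid[i_peak], rise, fall


peak_pos, rise_pos, fall_pos = peak_and_half_widths(100.0, 10.0, 3.0)
peak_sym, rise_sym, fall_sym = peak_and_half_widths(100.0, 10.0, 0.0)
peak_neg, rise_neg, fall_neg = peak_and_half_widths(100.0, 10.0, -3.0)
```

With K > 0 the rise is shorter than the fall and the peak lies after the location parameter A; with K < 0 both relations reverse. This is also why the peak date of a curve is reported separately from its location parameter.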

Finally, the bottom row of Figure 4 shows the estimated DOW effects for Japan, Italy, Belgium, Ontario, Texas, and Peru. When fitting a single-wave DOW model to COVID-19 mortality data from Brazil, Canada, Spain, and the United States from the first half of 2020, Slater et al. [36] found smaller effects around weekends than on other days of the week. We therefore chose Thursday as the baseline day for the DOW effect. In this paper, four regions – Japan, Italy, Belgium, and Ontario – are fit with one DOW effect for the entire study period. The DOW effect in Belgium is the most similar throughout the week, and is most varied in Ontario. The DOW effect in all four of these regions decreases around weekends. The two remaining regions, Texas and Peru, are split into two periods with different DOW effects. The second period begins on July 1, 2021 in Texas, and on June 1, 2021 in Peru. Texas has large differences between periods, as only its second period has substantially lower relative risks for Sunday and Monday. The differences between periods are smaller for Peru, with slightly higher means and uncertainty in the second period.

4. Forecasting

While our main focus is on examining the decomposition of daily COVID-19 mortality into the sum of skew normal curves, we consider forecasting in this section. The forecasting model fits the SoSN model to data from early 2020 to November 12, 2020 with N = 3 skew normal curves, then forecasts daily COVID-19 mortality fourteen and twenty-eight days ahead. As Figure 5 shows, it forecasts that daily DPM peaks in Belgium in mid-November and in Italy in early December. It also forecasts that daily DPM is increasing in Ontario and Texas, and decreasing in Peru. In addition, Table 1 shows the percentage of days that observed mortality is contained in the 80% credible interval of the posterior predictive distribution by region. When forecasting two weeks ahead, consider two of the regions with coverage well above 80%, Italy and Peru. The increased uncertainty in the forecasts for Peru is likely because its first and third curves overlap during the forecasting period. For Italy, the third curve is forecasting a potential peak. This contrasts with the forecasts for Belgium, where daily mortality has already peaked, and suggests that forecasting during a potential peak (Italy) is more difficult for the SoSN model than forecasting once daily DPM has peaked (Belgium).

Figure 5.

Observed COVID-19 mortality in training data (gray points), observed COVID-19 mortality not in training data (red points), mean and 95% credible intervals of the posterior predictive distribution (light blue line and shaded regions), and skew normal curves (assorted colors) and their 95% credible regions (gray shaded regions) in Japan, Italy, Belgium, Ontario, Texas, and Peru (left to right) from early 2020 to December 10, 2020.

Table 1.

The percentage of days that observed mortality is contained in the 80% credible region of the posterior predictive distribution when forecasting 2 weeks (14 days) and 4 weeks (28 days) ahead.

Region    Two weeks   Four weeks
Japan     78.57%      42.86%
Italy     100.00%     100.00%
Belgium   78.57%      89.29%
Ontario   71.43%      60.71%
Texas     85.71%      78.57%
Peru      92.86%      96.43%
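The coverage percentages in Table 1 are counts of days divided by the length of the forecast window, so 78.57% over two weeks corresponds to 11 of 14 days. As a rough sketch of how such a figure could be computed from posterior predictive draws, the code below takes an equal-tailed 80% interval for each day and counts the days on which the observation falls inside it. The function names and the toy draws are hypothetical and are not the paper's data.

```python
def quantile(sorted_values, p):
    """Quantile of pre-sorted values by linear interpolation between order statistics."""
    position = p * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1.0 - fraction) + sorted_values[upper] * fraction


def interval_coverage(draws_by_day, observed, level=0.80):
    """Percentage of days whose observation lies in the equal-tailed interval."""
    tail = (1.0 - level) / 2.0
    covered = 0
    for draws, obs in zip(draws_by_day, observed):
        ordered = sorted(draws)
        low, high = quantile(ordered, tail), quantile(ordered, 1.0 - tail)
        if low <= obs <= high:
            covered += 1
    return 100.0 * covered / len(observed)


# Toy example: 14 forecast days, each with predictive draws 0, 1, ..., 100,
# so the 80% interval is roughly 10 to 90 on every day.
toy_draws = [list(range(101)) for _ in range(14)]
toy_observed = [50] * 11 + [95] * 3   # 11 days inside the interval, 3 outside
toy_coverage = interval_coverage(toy_draws, toy_observed)
```

With only 14 or 28 days in a window, each day moves the coverage by about 7.1 or 3.6 percentage points, so small differences between regions in Table 1 reflect one or two days.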

Of the three regions with increasing daily DPM (Japan, Ontario, and Texas), Texas has a more notable increase than Ontario or Japan before the forecasting period begins. In Japan, daily DPM increases very little until after the forecasting period begins. Comparing the results for the six regions shows that the model can forecast daily DPM two weeks ahead during rising, peaking, and declining periods, that forecasting performance declines as we progress to four weeks ahead, and that the model has the most difficulty when forecasting in the early stages of a period of rising daily DPM.

We also report the standardized root mean squared log error (sRMSLE) to facilitate comparisons to the sRMSLE results reported by Slater et al. [36], where we note that the single skew normal curve model has difficulty forecasting in regions with multiple observed waves (such as Spain). Let Pj(t) and Oj(t) be the predicted and observed mortality on day t in region j, where the predicted mortality Pj(t) is the mean value of the posterior predictive distribution on day t in region j. The sRMSLE is,

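\[ \text{sRMSLE}(t) = \sqrt{ \operatorname{mean}_j \left( \frac{ \log\left[ P_j(t) / O_j(t) \right]^2 }{ \log\left[ O_j(t) \right] } \right) } \]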
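The following sketch computes the sRMSLE for a single forecast day under one reading of the formula: the squared log ratio of predicted to observed mortality is divided by the log of observed mortality, averaged over regions, and then square-rooted. That reading is an assumption, as are the function name and the example values. The calculation also requires observed counts above one, since the log of the observed count appears in the denominator.

```python
import math


def srmsle(predicted, observed):
    """Standardized root mean squared log error across regions for one day.

    predicted and observed are equal-length sequences, one entry per region.
    Each observed value must exceed 1 so that log(observed) is positive.
    """
    if len(predicted) != len(observed) or not observed:
        raise ValueError("predicted and observed must be non-empty and equal length")
    terms = []
    for p, o in zip(predicted, observed):
        if o <= 1:
            raise ValueError("observed mortality must exceed 1")
        terms.append(math.log(p / o) ** 2 / math.log(o))
    return math.sqrt(sum(terms) / len(terms))


# Hypothetical predicted and observed deaths in three regions on one day.
example_error = srmsle(predicted=[12.0, 48.0, 105.0], observed=[10.0, 50.0, 100.0])
```

Dividing by the log of the observed count down-weights regions with high mortality, so that large regions do not dominate the average.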

Figure 6 shows the sRMSLE over twenty-eight days for two models – the forecasting model, and the forecasting model with the DOW effect removed. We make two observations. First, the forecasting model experiences a smaller increase on November 16, 2020 than the model with no DOW effect. This suggests that including DOW effects can improve forecasting performance over short time periods, particularly around weekends. Second, the sRMSLE increases throughout the second half of the forecasting period. As mentioned above, this is likely due to the increased uncertainty in forecasts of regions entering a period of rising or potentially peaking cases. For more forecasting comparisons involving the single wave model, please see Slater et al. [36].

Figure 6.

The standardized root mean squared log error for Japan, Italy, Belgium, Ontario, Texas, and Peru over twenty-eight days for two models – the forecasting model, and the forecasting model with the day-of-the-week effect removed.

5. Discussion

This paper introduces the sum of skew normal (SoSN) curves model, which decomposes daily COVID-19 mortality into the sum of skew normal curves. We fit a model with three curves (the forecasting model) to data from early 2020 to November 12, 2020, and eight curves (the full model) to data from early 2020 to May 31, 2022. The only input data is daily COVID-19 mortality, which allows the model to be easily applied to different regions. We show that the skew normal curves substantially overlap with the COVID-19 waves in Belgium and Ontario, which are identified by Sciensano and Public Health Ontario, respectively. They also broadly align with the introduction of new COVID-19 variants. The shape parameter quantifies how asymmetric each skew normal curve is, and a novel contribution of the SoSN model is that it provides the evolution of the shape parameter over time. Across the six regions considered (Japan, Italy, Belgium, Ontario, Texas, and Peru), the shape parameter is positive for the first curve and progresses to being negative by the last curve. A positive shape parameter indicates that the skew normal curve increases faster than it decreases, and our results show that this characterizes curves in the early part of the pandemic, but not its later phases. Regarding day-of-the-week effects, of the four regions we fit with one set of day-of-the-week effects, Ontario's are the most varied and Belgium's the least varied, with Sunday and Monday generally having the lowest values. We fit two regions – Texas and Peru – with two sets of day-of-the-week effects: one for the first half of the study period, and one for the second half. For these two regions, we find that Texas has substantially lower effects on Sunday and Monday, but only in the second period. Finally, we show that the SoSN model can forecast when observed mortality is approaching a near-term peak, when we applied the forecasting model to Italy and Belgium.

The SoSN model uses informative priors for some parameters. In Section 2.3, we describe how some priors, such as those on the location and scale parameters of the skew normal curves, are more informative while other priors, such as the severity parameter prior, are less informative. One task in developing the SoSN model was to determine appropriate priors. We describe how we chose the priors currently being used in Section 2.3, and demonstrate that the SoSN model can estimate quantities of interest when using these priors. Furthermore, it has similar results to some models in the CDC ensemble model, as can be seen (for example) by comparing Table 1 of Section 4 to Table S8 in the Supplemental Materials of Cramer et al. [6]. Similar to the constituent models of that ensemble, the SoSN model has different forecasting performance across regions and time periods, and exhibits declining forecasting performance over longer forecasting periods. These results suggest that the chosen priors, while informative, are reasonable for this application.

The SoSN model has some limitations. A key limitation is that the number of skew normal curves in the model is fixed before estimation. We used the same number of curves across regions for simplicity, but when choosing the number of curves the relevant scientific question is how many new waves have started. One way to address this is to use reversible-jump MCMC (RJ-MCMC) to infer the number of skew normal curves. However, RJ-MCMC can be challenging to implement as it requires users to specify functions that relate parameter spaces of different dimensions. The specification of these functions is an ongoing area of research, with some recent papers being Gagnon [10] and Gagnon and Doucet [11]. While Stan uses a NUTS sampler that requires no user tuning and is readily applicable to new regions, an RJ-MCMC implementation may require substantial user or computational effort when applied to new regions. Therefore, development of an RJ-MCMC implementation is beyond the scope of this paper.

Another approach to selecting the number of curves is to use model selection criteria, such as the deviance information criterion (DIC) or the widely applicable information criterion (WAIC). However, the DIC is not based on the posterior (predictive) distribution and WAIC can lead to overfitting when performing a large number of comparisons [25,37]. Given the large number of comparisons needed to select a subset of skew normal curves for a model with only eight curves, model selection for the SoSN model may be difficult to interpret, as well as computationally demanding if applied to a large number of regions. Moreover, the SoSN model has some ability to estimate curves with very small peaks (in daily deaths per million), such as when it was applied to COVID-19 mortality data from Japan. This suggests that any extension relating to determining the number of curves to include in the model should carefully weigh potential benefits against the corresponding increase in complexity, both computational and for the user. While the current Stan implementation does not estimate the number of curves, it has the important advantage of being readily applicable to new regions. A second limitation relates to the interpretation of the estimates from the SoSN model. It is not motivated by biological processes, but rather by characteristics of daily COVID-19 mortality. It provides useful information to practitioners, as described in the beginning of this section. However, the extent to which the skew normal curves reflect biological processes versus simply acting as basis functions varies based on the time and region. We discuss interpretation in more detail in the next paragraph. Finally, another extension one could consider is a hierarchical model over smaller regions (cities or counties) that amalgamate into larger regions (provinces or countries). Currently, regions are fit independently, and this extension can help us better understand the heterogeneity in COVID-19 mortality within larger regions.
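For readers unfamiliar with WAIC, the sketch below shows one common way to compute it from pointwise log-likelihood values evaluated at posterior draws. It is an illustration only and is not part of the SoSN code. The function name `waic`, the input layout, and the toy numbers are assumptions made for this example.

```python
import math


def waic(log_lik):
    """WAIC on the deviance scale from pointwise log-likelihoods.

    log_lik[s][i] is the log-likelihood of observation i under posterior
    draw s. At least two draws are needed for the variance term.
    """
    n_draws = len(log_lik)
    n_obs = len(log_lik[0])
    lppd = 0.0    # log pointwise predictive density
    p_waic = 0.0  # effective number of parameters (variance form)
    for i in range(n_obs):
        column = [log_lik[s][i] for s in range(n_draws)]
        m = max(column)
        # Log of the posterior-mean likelihood, computed stably (log-sum-exp).
        lppd += m + math.log(sum(math.exp(v - m) for v in column) / n_draws)
        mean = sum(column) / n_draws
        # Sample variance of the log-likelihood across draws.
        p_waic += sum((v - mean) ** 2 for v in column) / (n_draws - 1)
    return -2.0 * (lppd - p_waic)


# Toy input: three posterior draws, four observations (made-up values).
toy_log_lik = [
    [-1.2, -0.8, -2.1, -1.5],
    [-1.0, -0.9, -2.4, -1.4],
    [-1.3, -0.7, -2.0, -1.6],
]
print(waic(toy_log_lik))
```

Lower values indicate better estimated out-of-sample fit. Choosing a subset of eight candidate curves would mean comparing up to 2^8 - 1 = 255 non-empty subsets per region, which illustrates why such comparisons become numerous.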

We next consider the types of inferences that can be made from the results in Section 3. Italy's second curve peaks later than the second curves of other regions, which may be related to schools being re-opened more slowly in Italy after their first observed wave [22]. As another example, the third skew normal curve peaks in Ontario on January 6, 2021, around two weeks after a province-wide shutdown that began on December 26, 2020 [13]. However, drawing conclusions about the effect of public health measures requires comparing against what would have occurred if they had been implemented differently (or not at all), as people voluntarily change their behavior in response to higher case and mortality counts [17,43]. Since voluntary responses may not be consistent across regions and time, these comparisons should be done on a case-by-case basis. One may also want to draw conclusions regarding the asymmetry of the skew normal curves. For instance, among the regions we considered, the first curve in Ontario increases the most steeply. One possible explanation comes from the analysis of cremation data by Postill et al. [26], who find that less than half of the excess deaths from March 2020 to May 2020 were attributed to COVID-19. Moriarty et al. [23] note that excess deaths at home increased by 8% in February 2020 and 24% in March 2020 in Ontario, and with cremations typically occurring within weeks of death [26], this increase occurs before Ontario's first skew normal curve begins. Therefore, the steepness of the first curve in Ontario may be attributable to under-reporting of COVID-19 mortality prior to April 2020. However, for the same reason as above, this analysis must also be done on a case-by-case basis.

In conclusion, the SoSN model is a data-driven model that can decompose a daily health outcome observed over time, such as daily COVID-19 mortality, into a sum of skew normal curves. Our results show that the SoSN model can capture the asymmetry that characterizes observed COVID-19 waves, and that it can provide insight into how patterns in daily mortality change over time. For example, in the six regions we considered, the skew normal curves increase less steeply after the first year of the pandemic. The SoSN model can forecast when daily mortality is peaking in the multi-wave setting, which is useful for practitioners. Our results also support the inclusion of day-of-the-week effects when modeling daily COVID-19 mortality, as they are present around weekends for most of the regions we consider. Finally, while this paper only considers data up to May 2022, the SoSN model can be applied to longer time periods that may include subsequent COVID-19 waves. This can be done by simply increasing the number of skew normal curves in the sum. The model can then be used to forecast quantities of interest, such as when possible future COVID-19 waves will peak. Moreover, since the SoSN model only requires daily counts of the health outcome it is modeling, it can easily be applied to COVID-19 case or hospitalization data, as well as any other infectious disease data of interest.

Funding Statement

This research was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).

Disclosure statement

No potential conflict of interest was reported by the author(s).

Data availability statement

The data sources are detailed in Section 2.1 and the code used to produce this paper is available at: https://github.com/Kamal-R/SoSN-Model.

References

  • 1.Betancourt M., A conceptual introduction to Hamiltonian Monte Carlo (2018).
  • 2.Carpenter B., Gelman A., Hoffman M.D., Lee D., Goodrich B., Betancourt M., Brubaker M., Guo J., Li P., and Riddell A., Stan: A probabilistic programming language, J. Stat. Softw. 76 (2017), pp. 1–32. doi: 10.18637/jss.v076.i01
  • 3.Centers for Disease Control and Prevention, COVID-19 Response, United States COVID-19 cases and deaths by state over time – archived (version date: October 19, 2022) (2022). Available at https://data.cdc.gov/Case-Surveillance/United-States-COVID-19-Cases-and-Deaths-by-State-o/9mfq-cb36.
  • 4.Corrao G., Rea F., and Blangiardo G.C., Lessons from COVID-19 mortality data across countries, J. Hypertens. 39 (2021), pp. 856–860. doi: 10.1097/HJH.0000000000002833
  • 5.Cramer E.Y., Huang Y., Wang Y., Ray E.L., Cornell M., Bracher J., Brennen A., Rivadeneira A.J.C., Gerding A., House K., Jayawardena D., Kanji A.H., Khandelwal A., Le K., Mody V., Mody V., Niemi J., Stark A., Shah A., Wattanchit N., Zorn M.W., Reich N.G., Gneiting T., Mühlemann A., Gu Y., Chen Y., Chintanippu K., Jivane V., Khurana A., Kumar A., Lakhani A., Mehrotra P., Pasumarty S., Shrivastav M., You J., Bannur N., Deva A., Jain S., Kulkarni M., Merugu S., Raval A., Shingi S., Tiwari A., White J., Adiga A., Hurt B., Lewis B., Marathe M., Peddireddy A.S., Porebski P., Venkatramanan S., Wang L., Dahan M., Fox S., Gaither K., Lachmann M., Meyers L.A., Scott J.G., Tec M., Woody S., Srivastava A., Xu T., Cegan J.C., Dettwiller I.D., England W.P., Farthing M.W., George G.E., Hunter R.H., Lafferty B., Linkov I., Mayo M.L., Parno M.D., Rowland M.A., Trump B.D., Chen S., Faraone S.V., Hess J., Morley C.P., Salekin A., Wang D., Zhang-James Y., Baer T.M., Corsetti S.M., Eisenberg M.C., Falb K., Huang Y., Martin E.T., McCauley E., Myers R.L., Schwarz T., Gibson G.C., Sheldon D., Gao L., Ma Y., Wu D., Yu R., Jin X., Wang Y.-X., Yan X., Chen Y., Guo L., Zhao Y., Chen J., Gu Q., Wang L., Xu P., Zhang W., Zou D., Chattopadhyay I., Huang Y., Lu G., Pfeiffer R., Sumner T., Wang D., Wang L., Zhang S., Zou Z., Biegel H., Lega J., Hussain F., Khan Z., Van Bussel F., McConnell S., Guertin S.L., Hulme-Lowe C., Nagraj V.P., Turner S.D., Bejar B., Choirat C., Flahault A., Krymova E., Lee G., Manetti E., Namigai K., Obozinski G., Sun T., Thanou D., Ban X., Shi Y., Walraven R., Hong Q.-J., van de Walle A., Ben-Nun M., Riley S., Riley P., Turtle J., Cao D., Galasso J., Cho J.H., Jo A., DesRoches D., Forli P., Hamory B., Koyluoglu U., Kyriakides C., Leis H., Milliken J., Moloney M., Morgan J., Nirgudkar N., Ozcan G., Piwonka N., Ravi M., Schrader C., Shakhnovich E., Siegel D., Spatz R., Stiefeling C., Wilkinson B., Wong A., Cavany S., España G., Moore S., Oidtman R., Perkins A., Ivy J.S., Mayorga M.E., Mele J., Rosenstrom E.T., Swann J.L., Kraus A., Kraus D., Bian J., Cao W., Gao Z., Ferres J.L., Li C., Liu T.-Y., Xie X., Zhang S., Zheng S., Chinazzi M., Vespignani A., Xiong X., Davis J.T., Mu K., Piontti A.P. y., Baek J., Farias V., Georgescu A., Levi R., Sinha D., Wilde J., Zheng A., Lami O.S., Bennouna A., Ndong D.N., Perakis G., Singhvi D., Spantidakis I., Thayaparan L., Tsiourvas A., Weisberg S., Jadbabaie A., Sarker A., Shah D., Celi L.A., Penna N.D., Sundar S., Berlin A., Gandhi P.D., McAndrew T., Piriya M., Chen Y., Hlavacek W., Lin Y.T., Mallela A., Miller E., Neumann J., Posner R., Wolfinger R., Castro L., Fairchild G., Michaud I., Osthus D., Wolffram D., Karlen D., Panaggio M.J., Kinsey M., Mullany L.C., Rainwater-Lovett K., Shin L., Tallaksen K., Wilson S., Brenner M., Coram M., Edwards J.K., Joshi K., Klein E., Hulse J.D., Grantz K.H., Hill A.L., Kaminsky K., Kaminsky J., Keegan L.T., Lauer S.A., Lee E.C., Lemaitre J.C., Lessler J., Meredith H.R., Perez-Saez J., Shah S., Smith C.P., Truelove S.A., Wills J., Gardner L., Marshall M., Nixon K., Burant J.C., Budzinski J., Chiang W.-H., Mohler G., Gao J., Glass L., Qian C., Romberg J., Sharma R., Spaeder J., Sun J., Xiao C., Gao L., Gu Z., Kim M., Li X., Wang Y., Wang G., Wang L., Yu S., Jain C., Bhatia S., Nouvellet P., Barber R., Gaikedu E., Hay S., Lim S., Murray C., Pigott D., Reiner R.C., Baccam P., and US COVID-19 forecast hub consortium, The United States COVID-19 forecast hub dataset, Sci. Data 9 (2022), ARTICLE ID 462. doi: 10.1038/s41597-022-01517-w
  • 6.Cramer E.Y., Ray E.L., Lopez V.K., Bracher J., Brennen A., Rivadeneira A.J.C., Gerding A., Gneiting T., House K.H., Huang Y., Jayawardena D., Kanji A.H., Khandelwal A., Le K., Mühlemann A., Niemi J., Shah A., Stark A., Wang Y., Wattanachit N., Zorn M.W., Gu Y., Jain S., Bannur N., Deva A., Kulkarni M., Merugu S., Raval A., Shingi S., Tiwari A., White J., Abernethy N.F., Woody S., Dahan M., Fox S., Gaither K., Lachmann M., Meyers L.A., Scott J.G., Tec M., Srivastava A., George G.E., Cegan J.C., Dettwiller I.D., England W.P., Farthing M.W., Hunter R.H., Lafferty B., Linkov I., Mayo M.L., Parno M.D., Rowland M.A., Trump B.D., Zhang-James Y., Chen S., Faraone S.V., Hess J., Morley C.P., Salekin A., Wang D., Corsetti S.M., Baer T.M., Eisenberg M.C., Falb K., Huang Y., Martin E.T., McCauley E., Myers R.L., Schwarz T., Sheldon D., Gibson G.C., Yu R., Gao L., Ma Y., Wu D., Yan X., Jin X., Wang Y.-X., Chen Y., Guo L., Zhao Y., Gu Q., Chen J., Wang L., Xu P., Zhang W., Zou D., Biegel H., Lega J., McConnell S., Nagraj V.P., Guertin S.L., Hulme-Lowe C., Turner S.D., Shi Y., Ban X., Walraven R., Hong Q.-J., Kong S., van de Walle A., Turtle J.A., Ben-Nun M., Riley S., Riley P., Koyluoglu U., DesRoches D., Forli P., Hamory B., Kyriakides C., Leis H., Milliken J., Moloney M., Morgan J., Nirgudkar N., Ozcan G., Piwonka N., Ravi M., Schrader C., Shakhnovich E., Siegel D., Spatz R., Stiefeling C., Wilkinson B., Wong A., Cavany S., España G., Moore S., Oidtman R., Perkins A., Kraus D., Kraus A., Gao Z., Bian J., Cao W., Ferres J.L., Li C., Liu T.-Y., Xie X., Zhang S., Zheng S., Vespignani A., Chinazzi M., Davis J.T., Mu K., Piontti A.P. y, Xiong X., Zheng A., Baek J., Farias V., Georgescu A., Levi R., Sinha D., Wilde J., Perakis G., Bennouna M.A., Nze-Ndong D., Singhvi D., Spantidakis I., Thayaparan L., Tsiourvas A., Sarker A., Jadbabaie A., Shah D., Penna N.D., Celi L.A., Sundar S., Wolfinger R., Osthus D., Castro L., Fairchild G., Michaud I., Karlen D., Kinsey M., Mullany L.C., Rainwater-Lovett K., Shin L., Tallaksen K., Wilson S., Lee E.C., Dent J., Grantz K.H., Hill A.L., Kaminsky J., Kaminsky K., Keegan L.T., Lauer S.A., Lemaitre J.C., Lessler J., Meredith H.R., Perez-Saez J., Shah S., Smith C.P., Truelove S.A., Wills J., Marshall M., Gardner L., Nixon K., Burant J.C., Wang L., Gao L., Gu Z., Kim M., Li X., Wang G., Wang Y., Yu S., Reiner R.C., Barber R., Gakidou E., Hay S.I., Lim S., Murray C., Pigott D., Gurung H.L., Baccam P., Stage S.A., Suchoski B.T., Prakash B.A., Adhikari B., Cui J., Rodríguez A., Tabassum A., Xie J., Keskinocak P., Asplund J., Baxter A., Oruc B.E., Serban N., Arik S.O., Dusenberry M., Epshteyn A., Kanal E., Le L.T., Li C.-L., Pfister T., Sava D., Sinha R., Tsai T., Yoder N., Yoon J., Zhang L., Abbott S., Bosse N.I., Funk S., Hellewell J., Meakin S.R., Sherratt K., Zhou M., Kalantari R., Yamana T.K., Pei S., Shaman J., Li M.L., Bertsimas D., Lami O.S., Soni S., Bouardi H.T., Ayer T., Adee M., Chhatwal J., Dalgic O.O., Ladd M.A., Linas B.P., Mueller P., Xiao J., Wang Y., Wang Q., Xie S., Zeng D., Green A., Bien J., Brooks L., Hu A.J., Jahja M., McDonald D., Narasimhan B., Politsch C., Rajanala S., Rumack A., Simon N., Tibshirani R.J., Tibshirani R., Ventura V., Wasserman L., O'Dea E.B., Drake J.M., Pagano R., Tran Q.T., Ho L.S.T., Huynh H., Walker J.W., Slayton R.B., Johansson M.A., Biggerstaff M. and Reich N.G., Evaluation of individual and ensemble probabilistic forecasts of COVID-19 mortality in the United States, Proc. Natl. Acad. Sci. 119 (2022), ARTICLE ID e2113561119. doi: 10.1073/pnas.2113561119
  • 7.Cramer E.Y., Wang S.Y., Reich N.G., Hannan A., Niemi J., Ray E., House K., Huang Y.D., Stark A., Walraven R. aniruddhadiga, S. Xie, D. Karlen, M. L. Li, rjpagano, Y. Gu, zyt9lsb, A. Gerding, X. X, L. Castro, mzorn 58, F. T. Xu, stevemcconnell, G. Gibson, leyouz, M. Le, S. Horstman, H. Biegel, and EpiDeep, Reichlab/covid19-forecast-hub: release for Zenodo 20220227 (2022).
  • 8.Cucinotta D. and Vanelli M., WHO declares COVID-19 a pandemic, Acta Biomed. Atenei Parmensis 91 (2020), pp. 157–160. doi: 10.23750/abm.v91i1.9397
  • 9.Dyer O., COVID-19: Peru's official death toll triples to become world's highest, BMJ 373 (2021), pp. 1. doi: 10.1136/bmj.n1442
  • 10.Gagnon P., Informed reversible jump algorithms, Electron. J. Stat. 15 (2021), pp. 3951–3995. doi: 10.1214/21-EJS1877
  • 11.Gagnon P. and Doucet A., Nonreversible jump algorithms for Bayesian nested model selection, J. Comput. Graph. Stat. 30 (2021), pp. 312–323. doi: 10.1080/10618600.2020.1826955
  • 12.Gnanvi J.E., Salako K.V., Kotanmi G.B., and Glèlè Kakaï R., On the reliability of predictions on COVID-19 dynamics: A systematic and critical review of modelling techniques, Infect. Dis. Model. 6 (2021), pp. 258–272. doi: 10.1016/j.idm.2020.12.008
  • 13.Government of Ontario, Ontario announces provincewide shutdown to stop spread of COVID-19 and save lives [Press release] (2020). Available at https://news.ontario.ca/en/release/59790/ontario-announces-provincewide-shutdown-to-stop-spread-of-covid-19-and-save-lives.
  • 14.Hao X., Cheng S., Wu D., Wu T., Lin X., and Wang C., Reconstruction of the full transmission dynamics of COVID-19 in Wuhan, Nature 584 (2020), pp. 420–424. doi: 10.1038/s41586-020-2554-8
  • 15.Hilbe J.M., Negative Binomial Regression, 2nd ed., Cambridge: Cambridge University Press, 2011. doi: 10.1017/CBO9780511973420.009
  • 16.Hoffman M.D. and Gelman A., The No-U-Turn sampler: Adaptively setting path lengths in Hamiltonian Monte Carlo, J. Mach. Learn. Res. 15 (2014), pp. 1593–1623.
  • 17.Jamison J.C., Bundy D., Jamison D.T., Spitz J., and Verguet S., Comparing the impact on COVID-19 mortality of self-imposed behavior change and of government regulations across 13 countries, Health Serv. Res. 56 (2021), pp. 874–884. doi: 10.1111/1475-6773.13688
  • 18.Karanikolos M. and McKee M., How comparable is COVID-19 mortality across countries? Eurohealth 26 (2020), pp. 45–50. Available at https://iris.who.int/bitstream/handle/10665/336295/Eurohealth-26-2-45-50-eng.pdf.
  • 19.Laenen L., Cuypers L., Baele G., Dellicour S., Maes P., Andre E., Janssen R., Nevejan L., Wenseleers T., Slechten B., Weyenbergh J.V., Keyaerts E., Raymenants J., Potter B., Wollants E., Ranst M.V., and The Belgian Sequencing Consortium, Genomic surveillance report: Update for Belgium, 04/10/2022 (2022). Available at https://www.uzleuven.be/nl/media/24d1a4c9-e148-4e1b-a15e-b9247a2cbaa4/genomic_surveillance_update_221004.pdf.
  • 20.Lloyd-Smith J.O., Maximum likelihood estimation of the negative binomial dispersion parameter for highly overdispersed data, with applications to infectious diseases, PLoS One 2 (2007), ARTICLE ID e180. doi: 10.1371/journal.pone.0000180
  • 21.Margossian C.C., Zhang Y., and Gillespie W.R., Flexible and efficient Bayesian pharmacometrics modeling using Stan and Torsten, part I, CPT Pharm. Syst. Pharmacol. 11 (2022), pp. 1151–1169. doi: 10.1002/psp4.12812
  • 22.Marziano V., Guzzetta G., Rondinone B.M., Boccuni F., Riccardo F., Bella A., Poletti P., Trentini F., Pezzotti P., Brusaferro S., Rezza G., Iavicoli S., Ajelli M., and Merler S., Retrospective analysis of the Italian exit strategy from COVID-19 lockdown, Proc. Natl. Acad. Sci. 118 (2021), ARTICLE ID e2019617118. doi: 10.1073/pnas.2019617118
  • 23.Moriarty T.J., Boczula A.E., Thind E.K., Loreto N., and McElhaney J.E., Excess all-cause mortality during the COVID-19 epidemic in Canada (2021). Available at https://rsc-src.ca/sites/default/files/EM%20PB_EN.pdf.
  • 24.Peng L., Yang W., Zhang D., Zhuge C., and Hong L., Epidemic analysis of COVID-19 in China by dynamical modeling (2020).
  • 25.Piironen J. and Vehtari A., Comparison of Bayesian predictive methods for model selection, Stat. Comput. 27 (2017), pp. 711–735. doi: 10.1007/s11222-016-9649-y
  • 26.Postill G., Murray R., Wilton A.S., Wells R.A., Sirbu R., Daley M.J., Barrett K., Born K., Jüni P., Maltsev A., Sander B., Schull M., and Rosella L.C., Excess mortality in Ontario during the COVID-19 pandemic, Sci. Br. Ont. COVID-19 Sci. Advisory Table 2 (2021), pp. 1–6. doi: 10.47326/ocsat.2021.02.29.1.0
  • 27.Public Health Agency of Canada, Public Health Infobase – Data on COVID-19 in Canada (2022). Available at https://open.canada.ca/data/en/dataset/261c32ab-4cfd-4f81-9dea-7b64065690dc.
  • 28.Public Health Ontario, SARS-CoV-2 Whole Genome Sequencing in Ontario, July 28, 2021 (2021). Available at https://www.publichealthontario.ca/-/media/Documents/nCoV/Archives/Genome/2021/07/SARS-CoV-2-WGS-report-2021-07-30.pdf.
  • 29.Public Health Ontario, SARS-CoV-2 Whole Genome Sequencing in Ontario, January 18, 2021 (2021). Available at https://www.publichealthontario.ca/-/media/Documents/nCoV/Archives/Genome/2022/01/SARS-CoV-2-WGS-report-2022-01-21.pdf.
  • 30.Public Health Ontario, SARS-CoV-2 Whole Genome Sequencing in Ontario, January 18, 2022 (2022). Available at https://www.publichealthontario.ca/-/media/Documents/nCoV/Archives/Genome/2022/01/SARS-CoV-2-WGS-report-2022-01-21.pdf.
  • 31.Public Health Ontario, SARS-CoV-2 Whole Genome Sequencing in Ontario, April 5, 2022 (2022). Available at https://www.publichealthontario.ca/-/media/Documents/nCoV/Archives/Genome/2022/04/SARS-CoV-2-WGS-report-2022-04-08.pdf?rev=daafb78a0663438e94b69c482ec58beb&sc_lang=en.
  • 32.Public Health Ontario, COVID-19 in Ontario: Focus on October 2, 2022 to October 8, 2022 (Week 40) (2022). Available at https://www.publichealthontario.ca/-/media/Documents/nCoV/epi/2022/10/weekly-epi-summary-covid-ontario-oct-14.pdf.
  • 33.Science Advisory and Modelling Consensus Tables, Update on COVID-19 Projections (2021). Available at https://covid19-sciencetable.ca/wp-content/uploads/2021/04/Update-on-COVID-19-Projections_2021.04.01_English-2.pdf.
  • 34.Sciensano, COVID-19 (2022). Available at https://data.gov.be/en/dataset/1030d556bc6489a9d1e85994e25d6bd01d53ce6b.
  • 35.Sciensano, COVID-19 surveillance frequently asked questions (2022). Available at https://covid-19.sciensano.be/sites/default/files/Covid19/COVID-19_FAQ_ENG_final.pdf.
  • 36.Slater J.J., Brown P.E., and Rosenthal J.S., Forecasting subnational COVID-19 mortality using a day-of-the-week adjusted Bayesian hierarchical model, Stat 10 (2021), ARTICLE ID e328. doi: 10.1002/sta4.328
  • 37.Spiegelhalter D.J., Best N.G., Carlin B.P., and Linde A., The deviance information criterion: 12 years on, J. R. Stat. Soc. Ser. B Stat. Methodol. 76 (2014), pp. 485–493. doi: 10.1111/rssb.12062
  • 38.Stan Development Team, RStan: the R interface to Stan (2022). R package version 2.21.7. Available at http://mc-stan.org/.
  • 39.Tang L., Zhou Y., Wang L., Purkayastha S., Zhang L., He J., Wang F., and Song P.X.-K., A review of multi-compartment infectious disease models, Int. Stat. Rev. 88 (2020), pp. 462–513. doi: 10.1111/insr.12402
  • 40.Walraven R., COVID-19 Model (2022). Available at http://rwalraven.com/COVID19/Model.
  • 41.World Health Organization, WHO coronavirus (COVID-19) dashboard (2020). Available at https://covid19.who.int/.
  • 42.Wynants L., Van Calster B., Collins G.S., Riley R.D., Heinze G., Schuit E., Albu E., Arshi B., Bellou V., Bonten M.M.J., Dahly D.L., Damen J.A., Debray T.P.A., de Jong V.M.T., De Vos M., Dhiman P., Ensor J., Gao S., Haller M.C., Harhay M.O., Henckaerts L., Heus P., Hoogland J., Hudda M., Jenniskens K., Kammer M., Kreuzberger N., Lohmann A., Levis B., Luijken K., Ma J., Martin G.P., McLernon D.J., Navarro C.L.A., Reitsma J.B., Sergeant J.C., Shi C., Skoetz N., Smits L.J.M., Snell K.I.E., Sperrin M., Spijker R., Steyerberg E.W., Takada T., Tzoulaki I., van Kuijk S.M.J., van Bussel B.C.T., van der Horst I.C.C., Reeve K., van Royen F.S., Verbakel J.Y., Wallisch C., Wilkinson J., Wolff R., Hooft L., Moons K.G.M., and van Smeden M., Prediction models for diagnosis and prognosis of COVID-19: systematic review and critical appraisal, BMJ 369 (2020), pp. 1–22. doi: 10.1136/bmj.m1328
  • 43.Yan Y., Malik A.A., Bayham J., Fenichel E.P., Couzens C., and Omer S.B., Measuring voluntary and policy-induced social distancing behavior during the COVID-19 pandemic, Proc. Natl. Acad. Sci. 118 (2021), ARTICLE ID e2008814118. doi: 10.1073/pnas.2008814118

Articles from Journal of Applied Statistics are provided here courtesy of Taylor & Francis