Abstract
Stochastic transmission dynamic models are needed to quantify the uncertainty in estimates and predictions during outbreaks of infectious diseases. We previously developed a calibration method for stochastic epidemic compartmental models, called Multiple Shooting for Stochastic Systems (MSS), and demonstrated its competitive performance against a number of existing state-of-the-art calibration methods. The existing MSS method, however, lacks a mechanism against filter degeneracy, a phenomenon that results in parameter posterior distributions that are weighted heavily around a single value. As such, when filter degeneracy occurs, the posterior distributions will not yield reliable credible or prediction intervals for parameter estimates and predictions. In this work, we extend the MSS method by evaluating and incorporating two resampling techniques to detect and resolve filter degeneracy. Using simulation experiments, we demonstrate that the extended MSS method produces credible and prediction intervals with the desired coverage when estimating key epidemic parameters (e.g. mean duration of infectiousness and R0) and when making short- and long-term predictions (e.g. one- and three-week forecasts, timing and number of cases at the epidemic peak, and final epidemic size). Applying the extended MSS approach to a humidity-based stochastic compartmental influenza model, we were able to accurately predict influenza-like illness activity reported by the U.S. Centers for Disease Control and Prevention in 10 regions, as well as city-level influenza activity using real-time, city-specific Google search query data from 119 U.S. cities between 2003 and 2014.
Keywords: Calibration, outbreak, mathematical model, real time, influenza
1. Introduction
The accurate prediction of epidemic behavior and the estimation of key epidemic parameters (e.g. expected number of secondary cases) are essential to inform control of emerging infectious threats. Stochastic transmission dynamic models are especially useful to characterize the uncertainty in estimates and predictions early in outbreaks given the importance of chance events when the number of infectious individuals is small. While a wide variety of methods have been developed to calibrate deterministic models,1–3 the calibration of stochastic models has received less attention.
We previously developed a calibration method for stochastic epidemic compartmental models called Multiple Shooting for Stochastic Systems (MSS),4 and demonstrated its competitive performance against several existing state-of-the-art calibration methods, including a likelihood approximation with an assumption of independent Poisson observations,5 a particle filtering method,6 and an ensemble Kalman filter method.6 The existing MSS method, however, lacks a mechanism to guard against filter degeneracy, a phenomenon that results in parameter posterior distributions that are weighted heavily around a single value. As such, when filter degeneracy occurs, these posterior distributions will not yield reliable credible and prediction intervals for parameter estimates and predictions.
In this work, we extend the MSS method by evaluating and incorporating two resampling techniques to detect and resolve filter degeneracy. In simulation experiments, we apply these two algorithms for resolving filter degeneracy and show that both are effective and that the extended MSS method outperforms the existing MSS method in terms of coverage of the prediction intervals. To evaluate the performance of the extended MSS method in describing real-world influenza activity, we apply this method to the influenza-like illness (ILI) surveillance data provided by the U.S. Centers for Disease Control and Prevention (CDC) between 2003 and 2016 as well as to city-level Google Flu Trends (GFT) influenza data between 2003 and 2014 to retrospectively predict seasonal flu activity in 119 U.S. cities during this period.
2. Method
2.1. Calibration
We assume that epidemiological observations y1,..., yn can be obtained at time points t1,..., tn. For example, the observations y1,..., yn could be the weekly number of newly diagnosed cases. The observation times t1,..., tn do not need to be equidistant. Our goal is to use the observations y1,..., yn to estimate the parameters that govern disease transmission during an ongoing epidemic, such as the duration that individuals remain infectious or the number of secondary cases attributable to a single infectious case (i.e. reproduction number). To this end, we let the probability distribution πi(θ) denote our belief about the true value of parameter vector θ having obtained observations y1,..., yi. We will use each new observation yi to update this prior distribution according to
πi(θ | y1,…, yi) ∝ πi−1(θ | y1,…, yi−1) × ℓ(yi | y1,…, yi−1; θ)  (1)

[Posterior at time ti] = [Prior at time ti] × [probability to observe yi], where ℓ(yi | y1,…, yi−1; θ) is the probability to observe the current observation yi given the epidemic history y1,…, yi−1. Note that the posterior at time ti−1, πi−1, is also the prior distribution at time ti. Since no observation is gathered prior to i = 1, we set π0(θ | y0) = π0(θ), which describes our prior knowledge about the epidemic parameters at the onset of the epidemic.
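In code, one pass of this recursion over a finite sample of parameter vectors is an element-wise multiply-and-renormalize. The sketch below uses hypothetical likelihood values for three sampled parameter vectors; it illustrates the update, not the paper's implementation.

```python
import numpy as np

def update_posterior(prior_weights, likelihoods):
    """One step of the recursion in equation (1): multiply each sampled
    parameter vector's prior weight by the likelihood of the newest
    observation, then renormalize."""
    posterior = prior_weights * likelihoods
    total = posterior.sum()
    if total == 0:
        raise ValueError("all likelihoods are zero; widen the prior sample")
    return posterior / total

# Toy example: three sampled parameter vectors with equal prior weights and
# hypothetical likelihoods l(y_i | y_1..y_{i-1}; theta_m).
weights = np.full(3, 1.0 / 3.0)
weights = update_posterior(weights, np.array([0.8, 0.1, 0.1]))
```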
In the following subsections, we will describe how to characterize and approximate the probability ℓ(yi | y1,…, yi−1; θ) using stochastic compartmental epidemic models. By knowing ℓ(•), we can then propagate the distribution πi through the course of the epidemic using equation (1).
2.2. MSS method
As we show below, calculating the exact value of the function ℓ(•) is computationally prohibitive. To cope with this challenge, we use the MSS method to approximate ℓ(•), as it has proven to be computationally efficient and numerically accurate for this purpose. The MSS method has been developed in a systems biology context7–9 and has been extended to calibrate stochastic compartmental models of epidemics.4 It has demonstrated competitive performance against a number of existing calibration methods in extensive and thorough simulation experiments.4 We briefly describe the MSS method below and refer the reader to previous publications for additional details.4
Given accumulated observations y1,…, yi, we use Π(• | y1, y2,…, yi) to denote the belief state at time ti. A belief state is a probability distribution over all possible epidemic states, which returns the probability that the epidemic is in a particular state when y1,…, yi are observed. The epidemic state at time ti is denoted by vi and is a vector of integers representing the number of individuals in each epidemic compartment (e.g. the number of susceptible, infected, or recovered individuals). We use v0 to denote the initial epidemic state (i = 0) and Π(v0) to denote our belief about the epidemic state at time t0.
The probability function ℓ(yi | y1, y2,…, yi−1; θ) in equation (1) can now be calculated by conditioning on the epidemic state at time ti−1, denoted by vi−1, and the epidemic state vi at time ti:

ℓ(yi | y1, y2,…, yi−1; θ) = Σ_{vi ∈ Ωi} Σ_{vi−1 ∈ Ωi−1} P(yi | vi; θ) × p(vi | vi−1; θ) × Π(vi−1 | y1, y2,…, yi−1)  (2)

[Probability for current observation conditioned on history] = [Sum over all possible current states] [Sum over all possible previous states] [Probability for observation conditioned on specific current state] × [Transition probability to move from specific previous to specific current state] × [Belief state probability to be in specific previous state given history]. In equation (2), Ωi is the support of the belief state at time ti, P is the observation probability mapping the state vi to the observation yi, which can also incorporate any additional uncertainty in the data collection such as reporting errors, and p is the transition probability to move from state vi−1 at time ti−1 to state vi at time ti.
To use equation (2), we need to calculate the probability functions P(•) and p(•), and the belief state Π(•) given observations y1,…, yi. The MSS method employs a linear noise approximation (LNA) to approximate the transition probability p(•) in equation (2).4,8,9 The LNA assumes that the probability distribution of vi | vi−1 can be approximated by a normal distribution N(xi, Σi), where xi is the solution of the ordinary differential equation (ODE) representation of the epidemic during the interval [ti−1, ti]
dx(t, vi−1; θ)/dt = Γ Λ(x(t, vi−1; θ)),  x(ti−1, vi−1; θ) = vi−1  (3)

In the ODE system (3), the vector x(t, vi−1; θ) is the epidemic state of the ODE model at time t given the initial state vi−1, the vector Λ(x(t, vi−1; θ)) denotes the rates of the instantaneous changes in the epidemic when at state x(t, vi−1; θ), and the matrix Γ describes the instantaneous changes of the epidemic state caused by each transition event.
In using a normal distribution N(xi, Σi) to approximate the probability distribution of vi | vi−1, the covariance matrix Σi will be the solution of the following ODE system on the interval [ti−1, ti]10,11

dΣ(t)/dt = J(t) Σ(t) + Σ(t) J(t)T + D(t)  (4)

In the ODE system (4), J(t) is the Jacobian of Γ Λ(x(t, vi−1; θ)) with respect to the state, and D is a K × K matrix with the (i, j) entry equal to Σ_k Γik Λk(x(t, vi−1; θ)) Γjk, where K is the number of compartments in the epidemic model. The initialization of the covariance for each interval will be discussed in the next subsections.
We next describe two approaches for calculating the belief state Π in equation (2). The first approach (which we refer to as MSSa) seeks to reduce the computational complexity of equation (2) by representing the belief state as a step function that takes the value 1 for the most probable state (denoted by v̂i) and 0 elsewhere. This effectively lets us drop the second summation in equation (2). For each interval, a new state estimate is determined as follows: given a previous state estimate v̂i−1 at time ti−1, the probability of observing yi when the epidemic moves to state vi at the end of the interval [ti−1, ti] is equal to P(yi | vi; θ) p(vi | v̂i−1; θ). We now define the state at time ti as our state estimate that leads to the highest probability of observing yi
v̂i = argmax_{vi ∈ Ωi} P(yi | vi; θ) p(vi | v̂i−1; θ)  (5)

[State estimate] = [Maximum over possible states] of [Probability for observation conditioned on specific current state] × [Transition probability to move from previous state estimate to specific current state]. At the beginning of each period, the ODE system to propagate the epidemic state to the next period is initialized with the state estimate v̂i. Since the MSSa approach assumes a point distribution for the belief state Π, the LNA method to characterize the distribution of the epidemic state in the next period is initialized with zero variance (see Zimmer et al.4 for details).
The second approach for calculating the belief state Π, which we refer to as MSSb, assumes that the observation probability is Gaussian and models the relation between the observation, yi, and the state as yi = h(vi) + ϵi, where h(•) is the observation function mapping the state of the epidemic vi to the observation yi, and ϵi is a normally distributed error term with zero mean vector 0obs and an observation error covariance Σobs (see equation (15) for an example in the context of a SIRS model). In this case, the belief state Π will follow a Gaussian distribution.12–14 That is, Π(vi | y1,…, yi) = N(v̂i, Σ̂i), where the state estimate v̂i and the covariance estimate Σ̂i are calculated as follows12–14
v̂i = xi + Ki (yi − h(xi)),  Σ̂i = (I − Ki Hi) Σi  (6)

In the equations above, Ri is the observation variance, h(xi) is the predicted outcome, and Ki is the so-called Kalman gain

Ki = Σi HiT (Hi Σi HiT + Ri)−1

where •T stands for matrix transpose, •−1 for the matrix inverse, and Hi is the Jacobian of the observation function h(•) evaluated at xi. In this case, the ODE system (equation (3)) is initialized with the state estimate v̂i and the LNA system (equation (4)) is initialized with the covariance estimate Σ̂i.
If the measurement error ϵi follows a Gaussian distribution, MSSb is expected to outperform MSSa because MSSb accounts for the Gaussian error when updating estimates of the epidemic state and its variance.
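As a sketch, the MSSb update step has the familiar Kalman-filter form. The version below uses a linearized observation operator H and illustrative matrices as stand-ins; it is not the authors' implementation.

```python
import numpy as np

def mssb_update(x_pred, cov_pred, y_obs, H, R):
    """Kalman-style update sketch for MSSb (cf. equation (6)): combine the
    LNA-predicted state (x_pred, cov_pred) with the newest observation.
    H is a linear(ized) observation operator and R the observation
    covariance; both are illustrative stand-ins here."""
    S = H @ cov_pred @ H.T + R                 # innovation covariance
    K = cov_pred @ H.T @ np.linalg.inv(S)      # Kalman gain
    x_upd = x_pred + K @ (y_obs - H @ x_pred)  # updated state estimate
    cov_upd = (np.eye(len(x_pred)) - K @ H) @ cov_pred
    return x_upd, cov_upd

# Toy example: two compartments, only the first one is observed.
x_upd, cov_upd = mssb_update(np.zeros(2), np.eye(2),
                             np.array([2.0]), np.array([[1.0, 0.0]]),
                             np.array([[1.0]]))
```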
2.3. Resolving filter degeneracy
To initialize the recursion (1), we sample (with replacement) M parameter vectors θ1, θ2,…, θM from a prior parameter distribution π0, and M initial state vectors v0,1, v0,2,…, v0,M from a prior state distribution Π0. We consider the initial states as an additional dimension of the parameter vector and denote θ̃m = (θm, v0,m), m = 1,…, M. With any new observation yi, we calculate the probability for each vector θ̃m according to the recursion (1). This allows us to assign a weight, wm, to each sampled parameter vector θ̃m.
Since we are using a finite set of samples from the prior distributions π0 and Π0, it is possible that after a few observations, the recursion (1) results in probability weights that are significant for only a few parameter vectors and are several orders of magnitude lower for the rest. In an extreme scenario, only one parameter vector is assigned a non-trivial weight and all others are assigned weights that are almost zero. This phenomenon is commonly referred to as filter degeneracy.15–17
Filter degeneracy can cause serious problems when calculating credible or prediction intervals for model parameters or when attempting to use the model to make predictions since such degeneracy may result in a posterior distribution that is essentially a point distribution. We note that if we have unlimited computational resources, filter degeneracy can be avoided since we could theoretically test the entire space of the parameters to accurately calculate the posterior distributions. However, as we are always bound by computation and time limitations, we need to use a sample of parameter vectors to estimate the posteriors. Next, we describe two approaches to detect and resolve filter degeneracy when the recursion (1) is used to calculate the likelihood of accumulated observations during an epidemic.
2.3.1. Detecting and resolving filter degeneracy using effective sample size
Detection of filter degeneracy is relatively simple and can be achieved by measuring the relative contribution of each sampled parameter vector. To this end, we first normalize weights using17
w̃m = wm / Σ_{j=1}^{M} wj  (7)
and then calculate the effective sample size Neff as a measure of (non-)degeneracy
Neff = 1 / Σ_{m=1}^{M} w̃m²  (8)
Before any observation is obtained (i.e. i = 0), we assign equal weights to sampled parameter vectors, and hence, Neff is equal to M. In the other extreme, if all the weight is on one particle, Neff is equal to 1. Therefore, one way of detecting degeneracy is to determine a threshold value NC with 1 < NC < M below which we trigger the algorithm described next to resolve filter degeneracy. The calculation of Neff is not time-consuming, and thus, one can easily check if filter degeneracy has occurred in each iteration. Choosing a higher NC implies a stricter mechanism against degeneracy but also imposes a larger computational cost. We found that NC = M/2 works well in our study.
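Equations (7) and (8), together with the NC = M/2 trigger, amount to only a few lines of code; a minimal sketch:

```python
import numpy as np

def effective_sample_size(weights):
    """Normalize the weights (equation (7)) and return N_eff (equation (8))."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return 1.0 / np.sum(w ** 2)

M = 1000
n_eff_uniform = effective_sample_size(np.ones(M))      # equal weights -> M
degenerate = np.zeros(M)
degenerate[0] = 1.0
n_eff_degenerate = effective_sample_size(degenerate)   # one particle -> 1
# Resampling is triggered when N_eff drops below N_C = M / 2.
trigger = n_eff_degenerate < M / 2
```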
To resolve filter degeneracy once it is detected, we re-sample with replacement M̃ parameter vectors from θ̃1,…, θ̃M according to the current weights w1,…, wM. Using these samples and the kernel density estimation defined in the equation below, we construct a density estimate f̂ of the parameter distribution
f̂(θ) = (1 / (M̃ det(H))) Σ_{m=1}^{M̃} K(H−1 (θ − θ̃m))  (9)

[Estimated density] = [Normalization by number of points and bandwidth] [Sum over all points] [Kernel probability for each point]. In equation (9), det(H) is the determinant of matrix H, the (multivariate) kernel K(•) is a (multivariate) probability density (e.g. Gaussian), and H is the bandwidth matrix which controls the smoothness of the density estimate f̂.
In the one-dimensional case, the density estimate (9) reduces to f̂(θ) = (1 / (M̃ H)) Σ_{m=1}^{M̃} K((θ − θ̃m)/H), where K can be a Gaussian kernel K(u) = (1/√(2π)) exp(−u²/2). Small values of H (the so-called bandwidth or smoothing parameter) place higher probability around each sample and lower probability elsewhere, whereas high values of H lead to a smoother density function. See Supporting Information “Technical details” for details and our choice of K(•) and H.
The re-sample size M̃ can be larger or smaller than M, but since the procedure to correct filter degeneracy is computationally inexpensive, one would rather choose a larger M̃ to improve the quality of the density estimation. For the results presented here, we chose M̃ as in Zimmer et al.4
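A one-dimensional sketch of the resampling step: draw parameter values with replacement according to the current weights, then perturb each draw with a Gaussian kernel of bandwidth H. The values below are toys; the paper uses Mathematica's KernelMixtureDistribution in several dimensions.

```python
import numpy as np

rng = np.random.default_rng(0)

def kde_resample(samples, weights, bandwidth, n_out):
    """Resolve degeneracy: resample with replacement according to the
    current weights, then jitter each draw with a Gaussian kernel
    (one-dimensional version of equation (9))."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    picks = rng.choice(samples, size=n_out, p=w, replace=True)
    return picks + bandwidth * rng.standard_normal(n_out)

# Degenerate sample: nearly all weight sits on one parameter value.
theta = np.array([2.0, 5.0, 9.0])
w = np.array([0.98, 0.01, 0.01])
new_theta = kde_resample(theta, w, bandwidth=0.1, n_out=1000)
```

The jitter restores diversity: the new sample concentrates near the high-weight value but is no longer a set of exact duplicates.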
Next, M new parameter vectors θ̃1new,…, θ̃Mnew are sampled from the density function f̂. As no state estimates and weights have yet been calculated for these new parameter vectors, we calculate the new weights with the following

wmnew = Π_{j=1}^{i} ℓ(yj | y1,…, yj−1; θmnew)  (10)

[New weight] = [Product over all previous time intervals] [Probability for observation given its history], with i being the current iteration, θmnew the parameter vector component of θ̃mnew, and the initial state estimate determined by the state vector component of θ̃mnew. Whenever a distinction is needed, we refer to this approach by adding “-Neff”, e.g. MSSb-Neff for the MSSb version with the effective-sample-size-based approach against filter degeneracy.
2.3.2. Detecting and resolving filter degeneracy using incremental mixture importance sampling
The Incremental Mixture Importance Sampling (IMIS) algorithm is an alternative way to detect and resolve filter degeneracy. It is an iterative algorithm that can be applied in each iteration of the filter. This section is a brief review of the algorithm described in Raftery and Bao.16 The IMIS algorithm checks each of its iterations for degeneracy and suggests new samples until filter degeneracy is resolved. The IMIS algorithm is initialized (k = 0) by normalizing the weights of the parameter sample (equation (7)). The kth iteration of IMIS (k ∈ {1, 2,…}) checks whether the samples have degenerated by calculating the expected fraction of unique points after re-sampling:

(1/Bk) Σ_{m=1}^{Bk} (1 − (1 − w̃m)^Bk)
where Bk = M + k * B and B is the number of re-samples in each iteration. If the expected number of unique points is too small
(1/Bk) Σ_{m=1}^{Bk} (1 − (1 − w̃m)^Bk) < 1 − e−1 ≈ 0.632  (11)
then the IMIS algorithm is carried out for an additional iteration.
In each iteration k, the IMIS algorithm samples B new inputs (with hopefully higher weights) until the expected fraction of unique points passes the desired threshold (equation (11)). The details can be found in the Supporting Information “IMIS details” or in the original publication.16
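The degeneracy check itself can be sketched in a few lines; the expected-unique-points formula below is the standard form from Raftery and Bao, and the weight vectors are toy examples.

```python
import numpy as np

def expected_unique_fraction(weights, n_resample):
    """Expected fraction of unique points when n_resample draws are taken
    with replacement according to the normalized weights. This is the
    quantity IMIS monitors; the formula follows Raftery and Bao."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return np.sum(1.0 - (1.0 - w) ** n_resample) / n_resample

M = 100
# Uniform weights: close to the theoretical maximum 1 - 1/e ~ 0.632.
q_uniform = expected_unique_fraction(np.ones(M), M)
# Degenerate weights: nearly every resampled point is a duplicate.
w = np.full(M, 1e-9)
w[0] = 1.0
q_degenerate = expected_unique_fraction(w, M)
```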
Once the termination criterion is satisfied, M inputs are re-sampled with replacement from the accumulated parameter vectors, together with their corresponding state estimates, according to their corresponding weights. We refer to this approach as MSSb-IMIS.
Figure 1 shows the pseudo code of our MSS extension, which is able to resolve filter degeneracy, and Figure 2 lists the free parameters of the methods studied here.
Figure 1.

An algorithm for real-time calibration of stochastic compartmental epidemic models including mechanisms against filter degeneracy.
IMIS: Incremental Mixture Importance Sampling.
Figure 2.

List of free parameters of the MSS method and the two algorithms to resolve filter degeneracy.
MSS: Multiple Shooting for Stochastic Systems; ILI: influenza-like illness; GFT: Google Flu Trends; IMIS: Incremental Mixture Importance Sampling.
2.4. Quantifying uncertainty in parameter estimates and epidemic predictions
We use credible intervals to characterize the uncertainty in parameter estimates. The 100(1 − κ)% credible interval for the parameter vector θ is defined as the interval between the κ/2- and (1 − κ/2)-quantiles of the probability distribution πi(θ | y1,…, yi). Accordingly, the 100(1 − κ)% prediction interval for the desired prediction target Z (e.g. the number of cases diagnosed in the next week or the time of the peak of the epidemic) is calculated as the interval between the κ/2- and (1 − κ/2)-quantiles of the probability distribution P(Z | y1,…, yi) with
P(Z | y1,…, yi) = ∫_Θ Σ_{vi ∈ Ωi} P(Z | vi; θ) Π(vi | y1,…, yi; θ) πi(θ | y1,…, yi) dθ  (12)

[Probability to observe target conditioned on history] = [Integration over parameter space] [Sum over state space] [Probability to observe target conditioned on specific state and specific parameter] [Probability for specific state conditioned on history and parameter] [Probability for specific parameter conditioned on history]. Here Θ is the parameter space and P(Z | vi; θ) is the probability to observe Z given the epidemic state vi and the parameter vector θ. Due to the Markovian property of compartmental models, predictions Z are independent of the observation history given vi and θ; the model can be fully described by knowing the current state vi and the parameter values θ.
This probability can be calculated with the help of a simulation model. We evaluate equation (12) by obtaining new realizations of parameter vectors according to their current weights and then carrying out simulations initialized with the state estimate v̂i until the time of the target Z. For this study, we set .
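Concretely, the Monte Carlo evaluation of equation (12) can be sketched as follows; the `simulate` callable here is a hypothetical stand-in for a forward run of the stochastic model.

```python
import numpy as np

rng = np.random.default_rng(1)

def prediction_interval(param_samples, weights, simulate, n_sims=1000, kappa=0.10):
    """Monte Carlo evaluation of equation (12): draw parameter vectors
    according to their current weights, run the forward model for each
    draw, and report the central 100(1 - kappa)% interval of the target."""
    p = np.asarray(weights, dtype=float)
    p = p / p.sum()
    idx = rng.choice(len(param_samples), size=n_sims, p=p)
    z = np.array([simulate(param_samples[i]) for i in idx])
    return np.quantile(z, [kappa / 2.0, 1.0 - kappa / 2.0])

# Hypothetical stand-in for a forward simulation: Z ~ Normal(theta, 5).
thetas = np.array([100.0, 120.0, 140.0])
weights = np.array([0.2, 0.5, 0.3])
lo, hi = prediction_interval(thetas, weights,
                             lambda t: t + 5.0 * rng.standard_normal())
```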
2.5. Humidity-based SIRS model
To evaluate the effectiveness of the approach proposed above to resolve filter degeneracy, we designed a simulation study using a humidity-based SIRS (Susceptible → Infected → Recovered → Susceptible) model. We later use this model to predict influenza activities in cities in the United States using GFT data.
The humidity-based SIRS model divides the population into three disjoint subgroups.6,19 Susceptible members may be infected at a rate proportional to the prevalence of infectious individuals in the population and will immediately become infectious to others. Infectious individuals recover at a constant rate and develop temporary immunity against the circulating pathogen. Once the immunity wanes, the individual becomes susceptible to infection. These transitions are represented as
S + I → 2I at rate β(t) S I / Npop
I → R at rate I/γ
R → S at rate R/α  (13)

where γ is the mean infectious period, α is the average duration of immunity, and β(t) is the transmission rate defined as β(t) = R0(t)/γ. Here R0(t) = R0min + (R0max − R0min) exp(−180 q(t)) with a maximal and minimal daily reproductive number R0max and R0min, and the function q(t) describes the absolute humidity over time6 (see Supporting Information “Interpolation of humidity data” for details on how to interpolate the city-level humidity data for the United States).
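For illustration, the humidity forcing of the transmission rate can be sketched as below. The exponential forcing form and the decay constant a = 180 follow the humidity-driven SIRS literature cited above (Yang et al.); the parameter values and the toy humidity profile are assumptions for this sketch, not values from the paper.

```python
import numpy as np

def transmission_rate(t, q, R0_max=2.2, R0_min=1.2, gamma_days=3.0, a=180.0):
    """beta(t) = R0(t) / gamma with humidity-forced
    R0(t) = R0_min + (R0_max - R0_min) * exp(-a * q(t)).
    All numeric values here are illustrative."""
    R0_t = R0_min + (R0_max - R0_min) * np.exp(-a * q(t))
    return R0_t / gamma_days

# Toy absolute-humidity profile (kg/kg): dry in winter, humid in summer.
q = lambda t: 0.004 + 0.014 * np.sin(np.pi * t / 365.0) ** 2
beta_winter = transmission_rate(0.0, q)    # low humidity -> higher R0
beta_summer = transmission_rate(182.0, q)  # high humidity -> lower R0
```

Low winter humidity pushes R0(t) toward R0max, reproducing the seasonal rise in transmissibility that drives the model's winter epidemics.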
The model can also be written in ODE form as

dS/dt = R/α − β(t) S I / Npop
dI/dt = β(t) S I / Npop − I/γ
dR/dt = I/γ − R/α  (14)
We assume that we can observe the number of new infections during each week, which corresponds to the number of population members that move from the compartment S to I in each week. We also assume that our population size is constant, and hence, we can replace R with R = Npop − S − I in ODE system (14) and eliminate the third equation.
To use the weekly number of cases to calibrate this SIRS model, we need to specify the observation function P(•) (equation (2)). We note that this time-series does not directly correspond to any of the model states and instead represents the flow from S to I. If there were no recoveries in the time interval [ti−1, ti], we could calculate this flow by taking the difference between the number of infected people at times ti and ti−1, where I(ti) denotes the number of people in compartment I at time point ti. But if there are recoveries, we need to add them to this difference. We therefore define the observation function h(•) as

h(vi) = I(ti) − I(ti−1) + r[ti−1, ti]  (15)

where r[ti−1, ti] denotes the number of recoveries within the time interval [ti−1, ti] (see Supporting Information “Technical details”).
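Putting equations (14) and (15) together, the weekly observation can be computed from a simulated trajectory as the change in I over the week plus the recoveries during that week. A minimal deterministic sketch (Euler integration, illustrative parameter values, constant β standing in for the humidity-forced β(t)):

```python
import numpy as np

def simulate_weekly_cases(S0, I0, N, beta, gamma_days, alpha_days, weeks, dt=0.1):
    """Deterministic SIRS dynamics (cf. equation (14), with R = N - S - I)
    integrated with Euler steps. The weekly observation follows
    equation (15): change in I over the week plus recoveries that week."""
    S, I = float(S0), float(I0)
    steps_per_week = int(round(7.0 / dt))
    cases = []
    for _ in range(weeks):
        I_start, recoveries = I, 0.0
        for _ in range(steps_per_week):
            R = N - S - I
            new_inf = beta * S * I / N * dt     # flow S -> I
            new_rec = I / gamma_days * dt       # flow I -> R
            new_sus = R / alpha_days * dt       # flow R -> S
            S += new_sus - new_inf
            I += new_inf - new_rec
            recoveries += new_rec
        cases.append(I - I_start + recoveries)  # h(.) of equation (15)
    return np.array(cases)

# Illustrative values: R0 = beta * gamma = 1.5, 90% initially susceptible.
weekly = simulate_weekly_cases(S0=9000, I0=10, N=10000, beta=0.5,
                               gamma_days=3.0, alpha_days=365.0, weeks=30)
```

Note that h(•) reduces exactly to the cumulative new infections over the week, which is why it is the right target for weekly case counts.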
2.6. Simulation study
To evaluate the ability of the MSS approach to produce reliable credible and prediction intervals for epidemic parameters and predictions, we used simulated trajectories generated by the humidity-based SIRS model described above (Figure 3). In these simulation experiments, the true parameter values and the corresponding epidemic trajectories are known, and hence, we can evaluate whether the intervals characterized by our proposed approach cover the true values of parameters and prediction targets.
Figure 3.

Fifty stochastic trajectories produced by simulating the humidity-based SIRS model.
To generate the simulated trajectories shown in Figure 3, we used the Gillespie algorithm20 (implemented in COPASI21) with epidemic parameters and initial states drawn from uniform distributions. We used the New York humidity profile for this simulation experiment.
To ensure that we only use trajectories that display reasonable behavior, we included only simulated trajectories with an attack rate of at least 10% (to avoid premature die-out) and a peak week no earlier than epidemic week 49 (consistent with real-world influenza epidemics). We let the weekly number of cases be observed with some measurement error that follows a normal distribution with mean zero and standard deviation 100. We assume that this distribution is also known to the calibration algorithm, which allows us to employ the MSSb method. As expected, and confirmed by our numerical experiments, in the presence of such measurement error MSSb outperforms MSSa, and hence, we only report the results from the MSSb method. The re-sampling kernel density estimate (equation (9)) is calculated using the Mathematica22 function KernelMixtureDistribution with a Gaussian kernel, a maximum of 5 mixture kernels, and bandwidths of 100, 0.1, 0.1, 0.025, 1000, and 10 that correspond to the model parameters and the initial states, respectively. These bandwidths are determined based on the range over which each model parameter may vary.
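The trajectory-generation step can be reproduced with a few lines of Gillespie simulation. The paper used the COPASI implementation; the standalone sketch below uses a small population and illustrative rates for speed.

```python
import numpy as np

rng = np.random.default_rng(42)

def gillespie_sirs(S, I, N, beta, gamma_days, alpha_days, t_max):
    """Stochastic SIRS sample path via the Gillespie algorithm.
    Events: infection S->I at rate beta*S*I/N, recovery I->R at rate
    I/gamma, waning immunity R->S at rate R/alpha."""
    t, times, infected = 0.0, [0.0], [I]
    while t < t_max:
        R = N - S - I
        rates = np.array([beta * S * I / N, I / gamma_days, R / alpha_days])
        total = rates.sum()
        if total == 0.0:          # no event can occur; epidemic is frozen
            break
        t += rng.exponential(1.0 / total)
        event = rng.choice(3, p=rates / total)
        if event == 0:
            S -= 1
            I += 1                # infection
        elif event == 1:
            I -= 1                # recovery (R is tracked implicitly)
        else:
            S += 1                # waning immunity
        times.append(t)
        infected.append(I)
    return np.array(times), np.array(infected)

times, infected = gillespie_sirs(S=900, I=5, N=1000, beta=0.5,
                                 gamma_days=3.0, alpha_days=365.0, t_max=120.0)
```

Because event times are exponentially distributed and transitions occur one individual at a time, repeated runs from the same initial condition produce the kind of trajectory-to-trajectory variability shown in Figure 3.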
3. Results
We first present the results of the simulation experiment described above. We then employ the MSS method to predict regional-level ILI activities in the United States using CDC ILI surveillance data, and city-level influenza activities using GFT data.
3.1. Simulation study
Figure 4 displays how the effective sample size Neff and the expected fraction of unique points (equation (11)) change as the model is re-calibrated using the latest observations accumulated during epidemics. Neff approaching 1, or the expected fraction of unique points approaching 0, implies that the probability weight of one of the parameter vectors is approaching one and all the others zero, signaling the occurrence of filter degeneracy (blue curve). Figure 4 demonstrates that filter degeneracy clearly occurs in the old MSS version and that both methods described here to rectify filter degeneracy are effective (red and green curves).
Figure 4.

Effectiveness of the proposed mechanisms to address filter degeneracy: For the original MSS method (represented by the blue curve), the effective sample size (Neff) and the expected fraction of unique points approach their degenerate limits (one and zero, respectively) as the model is recalibrated based on new epidemic observations. This suggests the occurrence of filter degeneracy. This phenomenon is avoided in both new versions of the MSS method (represented by the red and green curves). Solid lines are created by calculating the average of Neff and the expected fraction of unique points over 50 simulated trajectories, and shaded areas represent the 90% confidence region.
MSS: Multiple Shooting for Stochastic Systems; IMIS: Incremental Mixture Importance Sampling.
Figure 5 compares the predictions made by the MSSb-Neff method (red line) with the true value of several epidemic targets (black crosses) for two simulated trajectories. These key epidemic targets include infection prevalence, number of cases at the epidemic peak (number of cases in the worst week), the number of new cases over the next week, and the final epidemic size. These two simulated trajectories present different realistic epidemic behaviors; Trajectory 1 spreads faster, peaks sooner, and results in more infections compared to Trajectory 2. The MSSb method is able to capture the behaviors of all 50 simulated trajectories (Figure 3) and to quickly distinguish these trajectories from each other (Figure 6 and Supporting Information Figures A5 to A8).
Figure 5.

Predicting epidemic targets in two simulated trajectories. Each row represents one simulated epidemic. Black crosses represent the true value of the prediction targets observed in the simulated trajectory and red lines represent the mean posterior estimate proposed by the MSSb-Neff method using the epidemic observations until the week displayed on the x-axis. The red region shows the 90% prediction intervals.
Figure 6.

Coverage of the 90% prediction intervals and RE produced by the MSS versions for several prediction targets at different times relative to peak for the simulated trajectories shown in Figure 3. (a) Proportion of 90% prediction intervals that included the true value of the prediction target. If this fraction (which is a random variable itself) falls in the region between the two dotted lines, it suggests that the estimated prediction intervals have the appropriate level of coverage. The dotted lines are calculated as the 95% percentile interval of a binomial distribution with the probability 0.9 and sample size 50 (number of simulated trajectories in Figure 3) and then divided by the sample size 50. (b) RE for the mean of the posterior distribution for predicting “Infection prevalence,” “Number of cases at the epidemic peak,” “Number of new cases over the next week,” and “Final epidemic size.”
MSS: Multiple Shooting for Stochastic Systems; RE: relative error.
Figure 6(a) displays the proportion of 90% prediction intervals characterized by all three MSS versions for the simulated trajectories shown in Figure 3 that include the true value of the prediction target. While these simulated trajectories present markedly different epidemic behaviors, the prediction intervals produced by the new MSS versions have the expected level of coverage (Figure 6(a)). This figure also demonstrates that the old version of MSS, without a proper mechanism against filter degeneracy, clearly fails to produce the desired coverage for infection prevalence or, to a smaller extent, final epidemic size. Figure 6(b) shows that the point predictions produced by all MSS versions have low relative error with respect to the true values of the prediction targets. This is expected, as degenerate posteriors can still result in accurate point predictions, as seen in the old MSS method that lacked a mechanism to resolve filter degeneracy.
The MSS method can also be used to estimate key epidemic parameters such as the humidity-driven quantity R0(t), mean duration of immunity (α), mean duration of infectiousness (γ), and the number of susceptibles at the beginning of the flu season. Figure 7 compares the estimates provided by our method with the true values of these parameters for two simulated trajectories. The MSS method reached the desired coverage for the parameter estimates (Figure 8(a)) with small relative error (Figure 8(b)). We note that the width of the credible intervals is influenced by the sensitivity of the epidemic dynamics to the parameter. Tighter intervals suggest greater sensitivity, as small changes in these parameters are expected to lead to substantial changes in the system’s dynamics. On the other hand, less sensitive parameters, such as mean duration of immunity, can take a larger range of values without strongly influencing the dynamics, and hence, can be less precisely determined (i.e. larger width for their credible intervals). We also clearly see that the old version of MSS, which was not equipped with mechanisms against filter degeneracy, fails to produce the desired level of coverage for credible or prediction intervals (Figures 6(a) and 8(a)). Results for further targets can be found in Supporting Information Figure A9.
Figure 7.

Estimating epidemic parameters of two simulated trajectories. Each row represents one simulated epidemic. Black crosses represent the true value of the parameter in the simulated trajectory and red lines represent the mean posterior estimate proposed by the MSSb-Neff method using the cumulative epidemic observations up to the epidemic week displayed on the x-axis. The red region shows the 90% credible intervals.
Figure 8.

Coverage of the 90% credible intervals and RE produced by the MSS versions for estimates of model parameters at different times relative to peak for simulated trajectories shown in Figure 3. (a) Proportion of 90% credible intervals that included the true value of the prediction target. If this fraction (which is a random variable itself) falls in the region between the two dotted lines, it suggests that the estimated credible intervals have the appropriate level of coverage. The dotted lines are calculated as the 95% percentile interval of a binomial distribution with the probability 0.9 and sample size 50 (number of simulated trajectories in Figure 3) and then divided by the sample size 50. (b) RE for the mean of the posterior distribution for estimating “Duration of Immunity,” “Duration of Infectiousness,” number of “Initial Infected,” and “R0(t).”
MSS: Multiple Shooting for Stochastic Systems; RE: relative error.
3.1.1. Sensitivity analyses
For the analyses presented above, we assumed that the weekly number of cases is observed with some measurement error that follows a normal distribution with mean zero and a standard deviation that remains constant over time. Supporting Information Figure A4 shows that the performance of the methods studied here to address filter degeneracy is not affected if the standard deviation of the measurement error is assumed to be proportional to the number of cases observed weekly. We also note that another possible approach to address filter degeneracy is to simply increase the number of parameter samples M. While increasing M from 1000 to 10,000 improved the coverage of some calibration targets (Supporting Information Figure A1), it does not properly address the issue of filter degeneracy (Supporting Information Figure A3) and also leads to a significant increase in computation time (Supporting Information Figure A2). Finally, in our simulation experiments, while both methods we studied resolved filter degeneracy, the IMIS-based method showed a slight performance gain. However, its computational time was almost 10-fold that of the approach based on effective sample size. Therefore, we only used the latter method to analyze the CDC ILI and GFT influenza data (see below).
3.2. Prediction and parameter estimation using CDC ILI data
Next, we used the CDC ILI data23 to evaluate the ability of our method to produce reliable forecasts. This data set comprises 143 ILI time-series from 10 U.S. Department of Health & Human Services (HHS) regions between 2003 and 2016, including data from the first and second waves of the 2009 pandemic. Consistent with previous studies,24 we modeled ILI as a single disease. This is a simplifying assumption, but it has been shown to be sufficient to accurately describe ILI activity.
For the results presented below, the re-sampling kernel density estimate (equation (8)) is calculated using the Mathematica function KernelMixtureDistribution with a Gaussian kernel and a maximum of 5 mixture kernels. The bandwidths are 100, 0.1, 0.1, and 0.025 for the model parameters, 1000 for the initial size of the susceptible population, and the median of the current samples for the remaining parameter.
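As a rough Python analogue of this Mathematica step, one can resample from a Gaussian kernel density estimate by jittering randomly chosen samples with a fixed per-parameter bandwidth (this sketch ignores the cap on the number of mixture kernels, and the bandwidth values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

def kde_resample(samples, bandwidths, n):
    """Draw n new parameter vectors from a Gaussian KDE built on the
    current samples, with one fixed bandwidth per parameter dimension."""
    samples = np.asarray(samples, dtype=float)       # shape (M, d)
    bandwidths = np.asarray(bandwidths, dtype=float)  # shape (d,)
    # pick kernel centers uniformly among the current samples
    centers = samples[rng.integers(0, len(samples), size=n)]
    # add Gaussian jitter scaled by the per-dimension bandwidth
    return centers + rng.normal(0.0, 1.0, size=centers.shape) * bandwidths
```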
3.2.1. Humidity data
Following the approach described in Text S1 of Yang et al.,6 we independently constructed a humidity dataset from the original Phase 2 of the North American Land Data Assimilation System (NLDAS-2) dataset.25 These data were derived through the National Centers for Environmental Prediction North American Regional Reanalysis. The hourly data are available on a 0.125° grid from 1979 to the present. We extracted the specific humidity data for 119 cities and then averaged the hourly data to develop a daily climatology for each city from 1979 to 2016. At the regional and national levels, humidity is averaged over the cities belonging to the respective area.
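The averaging step can be sketched as follows, assuming the hourly series for one city has been arranged as one row per year with leap days removed (the array layout and function name are our own):

```python
import numpy as np

def daily_climatology(hourly, hours_per_day=24):
    """Average hourly specific humidity into daily means, then average the
    same calendar day across years to obtain a daily climatology.

    `hourly` has shape (n_years, 365 * 24); leap days are assumed dropped."""
    n_years = hourly.shape[0]
    # collapse hours into daily means: (years, 365)
    daily = hourly.reshape(n_years, -1, hours_per_day).mean(axis=2)
    # average each calendar day across years: (365,)
    return daily.mean(axis=0)
```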
3.2.2. Accuracy and reliability of predictions using CDC ILI data
Figure 9 displays one-week forecasts for the 10 HHS health regions. As shown in this figure, our method can produce reliable predictions for outbreaks with different characteristics: large or small epidemic size (Region 10 vs. Region 1), pronounced or less pronounced peaks (Region 4 vs. Region 2). The 2009 influenza pandemic presents distinct behavior compared with seasonal epidemics; in 2009, most regions experienced a minor peak followed by a major one (Figure 10). As evident in Figure 10, the method proposed here can describe such behavior and provide reliable one-week predictions for the 2009 influenza pandemic.
Figure 9.

One-week predictions of ILI and the corresponding prediction intervals for the 2013 season. The x-axes represent epidemic weeks and the y-axes show the percent ILI cases amongst total patient visits during each week. Prediction for a given epidemic week is carried out with the knowledge accumulated up to the past week. Black crosses represent the observed values, the red line the mean posterior, the dark red area the 50% prediction region, and the light red area the 90% prediction region. The underlying U.S. map’s source is Codomo.26
Figure 10.

One-week predictions of ILI and the corresponding prediction intervals for the 2009 pandemic. The x-axes represent epidemic weeks and the y-axes show the percent ILI cases amongst total patient visits during each week. Prediction for a given epidemic week is carried out with the knowledge accumulated up to the past week. Black crosses represent the observed values, the red line the mean posterior, the dark red area the 50% prediction region, and the light red area the 90% prediction region. The underlying U.S. map’s source is Codomo.26
While Figures 9 and 10 display predictions for specific seasons, similar accuracy is obtained for the other seasons (Supporting Information Figure A12). Our analysis suggests that the prediction intervals displayed in Figures 9 and 10 and those in other seasons exhibit the desired level of coverage (Supporting Information Figure A10).
3.3. Prediction and parameter estimation using GFT data
Next, we used GFT data27 to evaluate the ability of our method to produce reliable city-level influenza forecasts and epidemic parameter estimates. Internet search queries have gained attention as a surveillance source for monitoring influenza activity since 2009.28,29 GFT data have demonstrated competitive performance in matching influenza-like illness (ILI) data in various countries and regions such as the United States,30 South Korea,31 Southern China,32 Canada,33 Latin America,34 and New Zealand.35
3.3.1. Weekly influenza-positive cases
GFT previously published municipal-level estimates of weekly ILI incidence per 100,000 people based on internet search query activity.29 The U.S. CDC publishes influenza metrics through the U.S. World Health Organization Collaborating Laboratories System and the National Respiratory and Enteric Virus Surveillance System (WHO/NREVSS). WHO/NREVSS records data on the total number of respiratory specimens tested and the number positive for influenza viruses in clinical laboratories on a national, regional, and census division level.36
By multiplying the GFT ILI incidence data by the CDC data on the proportion of ILI cases who were test-positive for the same epidemiological week, we constructed a dataset of ILI+ for 121 cities from 2003 to 2014. This approach is consistent with previous studies to construct ILI+ time-series that estimate the municipal influenza infections per 100,000 patient visits.19 We selected cities within the continental U.S. with complete observations for all epidemic weeks between 2003 and 2014 leading to a total of 1168 time-series for analysis.
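The construction of the ILI+ metric is a simple element-wise product of the GFT ILI rate and the CDC test-positive proportion for the same week; a minimal sketch (argument names are illustrative):

```python
def ili_plus(gft_ili_rate, specimens_tested, specimens_positive):
    """ILI+ = GFT ILI incidence x proportion of specimens testing positive
    for influenza in the same epidemiological week.

    Weeks with no tested specimens are assigned 0.0."""
    return [ili * (pos / tested) if tested else 0.0
            for ili, tested, pos
            in zip(gft_ili_rate, specimens_tested, specimens_positive)]
```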
3.3.2. Accuracy and reliability of predictions using GFT data
To fit the humidity-based SIRS model to the ILI+ time-series, we assumed the same parameter prior distributions as used in Yang et al.6 with one exception: if the number of accumulated cases passed 50,000 (signaling a more severe epidemic), we changed the prior distribution of α to U[50,3650] (instead of using U[365,3650]). We found that this change was important for fitting models to these relatively severe epidemics. This highlights the important effect that the selection of prior distributions might have on the overall fit of the model. Prior distributions should ideally be determined based on domain expert knowledge. In the absence of such information, the effect of each parameter on the model behavior should be investigated visually to characterize appropriate prior distributions.
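The prior switch described above can be expressed as a small sampling helper (a sketch; the 50,000-case threshold and the uniform bounds follow the text, but the function itself is illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)

def alpha_prior(accumulated_cases, n, severe_threshold=50_000):
    """Sample n values of alpha from U[50, 3650] when the accumulated case
    count signals a severe epidemic, and from U[365, 3650] otherwise."""
    low = 50 if accumulated_cases > severe_threshold else 365
    return rng.uniform(low, 3650, size=n)
```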
Figure 11 displays one-week forecasts for a select number of U.S. cities. As shown in this figure, our method can produce reliable predictions for outbreaks with different characteristics: large or small epidemic size (New Orleans vs. Miami), early or late peak timing (Los Angeles vs. Miami), and single or double peaks (San Francisco vs. New York). There seems to be a slight overshooting of the peaks in some of the cities (such as Houston and Chicago) which might be due to effects not covered in a relatively simple SIRS model such as behavioral changes over the course of the epidemics, school vacations, or vaccinations. The one-week predictions for other U.S. cities are provided in the Supporting Information Figure A13.
Figure 11.

One-week predictions of ILI+ and the corresponding prediction intervals for the 2013 influenza season. The x-axes represent epidemic weeks and the y-axes show the number of new influenza cases arising during each week. Prediction for a given epidemic week is carried out with the knowledge accumulated up to the past week. Black crosses represent the observed values, the red line the mean posterior, the dark red area the 50% prediction region, and the light red area the 90% prediction region. The underlying U.S. map’s source is Codomo.26
The 2009 influenza pandemic presents distinct behavior compared with seasonal epidemics; in 2009, most cities experienced a minor peak followed by a major one (Figure 12). A third peak was also observed in several cities (e.g. New Orleans). As evident in Figure 12, the method proposed here can describe such behavior and provide reliable one-week predictions for the 2009 influenza pandemic.
Figure 12.

One-week predictions of ILI+ and the corresponding prediction intervals for the 2009 influenza pandemic. The x-axes represent epidemic weeks and the y-axes show the number of new influenza cases arising during each week. Prediction for a given epidemic week is carried out with the knowledge accumulated up to the past week. Black crosses represent the observed values, the red line the mean posterior, the dark red area the 50% prediction region, and the light red area the 90% prediction region. The underlying U.S. map’s source is Codomo.26
While Figures 11 and 12 display predictions for major cities, similar accuracy is obtained for smaller cities (Supporting Information Figure A13). Our analysis suggests that the prediction intervals displayed in Figures 11 and 12 and those in smaller cities exhibit the desired level of coverage (Supporting Information Figure A11).
Figure 13 displays the distribution of R0 estimates calculated for the cities included in our study for the seasons 2003/04 to 2013/14. Adopting the same approach used by Yang et al.,37 we calculate R0 by evaluating R0(t) of the humidity-based SIRS model in the week of the maximal effective R, where the effective R at time t is defined as R0(t)·s(t), with s(t) the proportion of the population that is susceptible at time t. The R0 estimates shown in Figure 13 are consistent with the estimates reported in past studies.37–40
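This extraction amounts to evaluating R0(t) at the week that maximizes R_eff(t) = R0(t)·s(t); a minimal sketch:

```python
import numpy as np

def r0_estimate(r0_t, susceptible_frac):
    """Return R0(t) evaluated in the week where the effective reproductive
    number R_eff(t) = R0(t) * s(t) is maximal, with s(t) the susceptible
    proportion of the population."""
    r0_t = np.asarray(r0_t)
    r_eff = r0_t * np.asarray(susceptible_frac)
    return r0_t[np.argmax(r_eff)]
```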
Figure 13.

Distribution of R0 estimates calculated for the included cities between the seasons 2003/04 and 2013/14. Here, w1 and w2 stand for the first and second pandemic waves of 2009, and w1w2 represents the full pandemic period, which includes both waves.
4. Discussion
The accurate prediction of epidemic behavior and the estimation of epidemic parameters are challenged by the partial observability of, and the intrinsic randomness in, the process governing disease spread. Characterizing the uncertainty in these predictions and parameter estimates is essential for making meaningful inferences about the true parameter values and for drawing conclusions about epidemic behavior under different control strategies.
In this work, we extended an existing calibration method, MSS, and equipped it with a mechanism to produce reliable credible and prediction intervals for epidemic parameter estimates and predictions of epidemic behavior. The MSS method is a calibration approach for stochastic compartmental epidemic models with performance competitive with several existing state-of-the-art calibration methods, including a particle filter6 and an ensemble Kalman filter.6 The original MSS method was not capable of handling filter degeneracy (Figure 4), a phenomenon where the particles or ensemble members of a filter have very small likelihoods that vary over orders of magnitude, even for the best members. Filter degeneracy can result in degenerate posterior distributions and hinder the ability to properly characterize credible and prediction intervals for parameter estimates and predictions.
Using simulation experiments on stylized models of influenza outbreaks (where parameter values were known with certainty), we demonstrated that the extended MSS method can effectively detect and resolve the occurrence of filter degeneracy (Figure 4) and, in contrast to the previous version, produce prediction and credible intervals with the desired coverage level (Figures 6(a) and 8(a)) for key epidemic parameters such as the duration of infectiousness and the reproductive number. We note that while we use influenza as an illustrative example here, the proposed method can be readily applied to other stochastic compartmental models.
Applying the extended MSS approach, equipped with a method to address filter degeneracy, to a humidity-based stochastic compartmental influenza model, we found that we could accurately predict CDC ILI data for the 10 U.S. regions and at the national level, as well as city-level influenza activity using real-time, city-specific Google search query data from 119 U.S. cities between 2003 and 2014 (Figure 11). The proposed method was robust to the distinct patterns observed over this period, such as the variable peak times and magnitudes of these seasonal epidemics. The extended MSS method performs well even for exceptional seasons, such as the 2009 influenza pandemic and seasons with multiple peaks (Figure 12). Analyzing the coverage of the prediction intervals shows that they cover the observed number of cases at the expected rate (Supporting Information Figure A11).
Figures 11 and 12 indicate that the humidity-based SIRS model tends to overestimate the peak of the epidemic in several cities. Such overestimation was not observed when predicting the peak of simulated epidemics (Figure 5 and Supporting Information Figures A5 and A8). This suggests that our simplified humidity-based SIRS model does not accurately capture all complex characteristics of influenza epidemics, such as changes in contact rates or health-seeking behavior of population members through the course of epidemics.
All versions of the MSS method rely on the LNA to capture the correlations between epidemic compartments throughout the outbreak. The LNA has been successfully applied to describe intrinsic stochastic fluctuations in various systems4,8,9 and has a strong theoretical basis,41 which suggests that the LNA accurately approximates the dynamic behavior of large populations, especially when fluctuations are small compared with the steady state of the system. As demonstrated by our results, the LNA also performs well when the intervals over which its projections are made are sufficiently short (here, the epidemic observations were accrued weekly). We also note that, if needed, the accuracy of the LNA can be improved by using higher-order correction terms.42
In this paper, we focused on two methods for resolving filter degeneracy, both of which proved effective in calibrating our humidity-based SIRS model against simulated data. We then used one method for resolving filter degeneracy to accurately determine uncertainty for GFT ILI data. We, however, note that there exist other methods to address filter degeneracy that we did not study here.6,17
While we used the ILI+ metric to evaluate the performance of the MSS method, we recognize that this metric serves only as a proxy for influenza incidence.43,44 As ILI+ is proportional to GFT ILI rates, any observational error may be exacerbated by abnormal web-search activity (e.g. increased media coverage often leads to higher levels of web-search activity, and in consequence, higher GFT ILI rates). This further emphasizes the importance of conventional surveillance systems for accurate monitoring of influenza activity.43
Acknowledgements
The authors wish to thank an anonymous reviewer of a previous manuscript for pointing out the formulas for state updating in case of Gaussian noise. The authors also thank A Kunkel for her advice on the R code to fetch the humidity data.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship and/or publication of this article: CZ was funded by the German Research Foundation (DFG) and National Institutes of Health (NIH) grant U54GM088558. SIL and TC were supported by NIH U54GM088558. RY was supported by grant 1K01AI119603 from the National Institute of Allergy and Infectious Diseases. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute of Allergy and Infectious Diseases or the National Institute of General Medical Sciences.
Footnotes
Declaration of conflicting interests
The authors declare that there is no conflict of interest. Dr Christoph Zimmer changed affiliation during the peer review process from the Yale School of Public Health to the Bosch Center for Artificial Intelligence (BCAI); this change does not constitute a conflict of interest.
Supplemental Material
Supplemental material is available for this article online.
References
- 1. Alkema L, Raftery AE and Clark SJ. Probabilistic projections of HIV prevalence using Bayesian melding. Ann Appl Stat 2007; 1: 229–248.
- 2. Elderd BD, Dukic VM and Dwyer G. Uncertainty in predictions of disease spread and public health responses to bioterrorism and emerging diseases. Proc Natl Acad Sci 2006; 103: 15693–15697.
- 3. Birrell PJ, Ketsetzis G, Gay NJ, et al. Bayesian modeling to unmask and predict influenza A/H1N1pdm dynamics in London. Proc Natl Acad Sci 2011; 108: 18238–18243.
- 4. Zimmer C, Yaesoubi R and Cohen T. A likelihood approach for real-time calibration of stochastic compartmental epidemic models. PLOS Comput Biol 2017, 10.1371/journal.pcbi.1005257.
- 5. Riley S, Fraser C, Donnelly CA, et al. Transmission dynamics of the etiological agent of SARS in Hong Kong: impact of public health interventions. Science 2003; 300: 1961–1966.
- 6. Yang W, Karspeck A and Shaman J. Comparison of filtering methods for the modeling and retrospective forecasting of influenza epidemics. PLOS Comput Biol 2014; 10: e1003583.
- 7. Zimmer C and Sahle S. Parameter estimation for stochastic models of biochemical reactions. J Comput Sci Syst Biol 2012; 6: 11–21.
- 8. Zimmer C and Sahle S. Deterministic inference for stochastic systems using multiple shooting and a linear noise approximation for the transition probabilities. IET Syst Biol 2015; 9: 181–192.
- 9. Zimmer C. Reconstructing the hidden states in time course data of stochastic models. Math Biosci 2015; 269: 117–129.
- 10. Thomas P, Matuschek H and Grima R. Intrinsic noise analyzer: a software package for the exploration of stochastic biochemical kinetics using the system size expansion. PLOS One 2012; 7: e38518.
- 11. Van Kampen NG. Stochastic processes in physics and chemistry. Vol. 1. Amsterdam, The Netherlands: Elsevier, 1992.
- 12. Finkenstadt B, Woodcock DJ, Komorowski M, et al. Quantifying intrinsic and extrinsic noise in gene transcription using the linear noise approximation: an application to single cell data. Ann Appl Stat 2013; 7: 1960–1982.
- 13. Leander J, Lundh T and Jirstrand M. Stochastic differential equations as a tool to regularize the parameter estimation problem for continuous time dynamical systems given discrete time measurements. Math Biosci 2014; 251: 54–62.
- 14. Fearnhead P, Giagos V and Sherlock C. Inference for reaction networks using the linear noise approximation. Biometrics 2014; 70: 457–466.
- 15. Doucet A, Godsill S and Andrieu C. On sequential Monte Carlo sampling methods for Bayesian filtering. Stat Comput 2000; 10: 197–208.
- 16. Raftery AE and Bao L. Estimating and projecting trends in HIV/AIDS generalized epidemics using incremental mixture importance sampling. Biometrics 2010; 66: 1162–1173.
- 17. Arulampalam MS, Maskell S, Gordon N, et al. A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. IEEE Trans Signal Process 2002; 50: 174–188.
- 18. Silverman BW. Density estimation for statistics and data analysis. In: Monographs on statistics and applied probability. London: Chapman and Hall, 1986.
- 19. Shaman J, Karspeck A, Yang W, et al. Real-time influenza forecasts during the 2012–2013 season. Nat Commun 2013; 4: 2837.
- 20. Gillespie DT. A general method for numerically simulating the stochastic time evolution of coupled chemical reactions. J Comput Phys 1976; 22: 403–434.
- 21. Hoops S, Sahle S, Gauges R, et al. COPASI – a complex pathway simulator. Bioinformatics 2006; 22: 3067–3074.
- 22. Mathematica, Version 10.4. Champaign, IL: Wolfram Research, Inc., 2016.
- 23. Centers for Disease Control and Prevention. Flu activity and surveillance, www.cdc.gov/flu/weekly/fluactivitysurv.htm (accessed 1 October 2018).
- 24. Hickmann KS, Fairchild G, Priedhorsky R, et al. Forecasting the 2013–2014 influenza season using Wikipedia. PLOS Comput Biol 2015, 10.1371/journal.pcbi.1004239.
- 25. Xia Y, et al. NLDAS primary forcing data L4 hourly 0.125 × 0.125 degree V002, NCEP/EMC (2009). Greenbelt, MD: Goddard Earth Sciences Data and Information Services Center, 2016.
- 26. Codomo D. U.S. map. Wikipedia, 2015, https://commons.wikimedia.org/wiki/File:PNGedUSoutline.png.
- 27. Data source: Google Flu Trends, http://www.google.org/flutrends (accessed 1 October 2018).
- 28. Carneiro HA and Mylonakis E. Google Trends: a web-based tool for real-time surveillance of disease outbreaks. Clin Infect Dis 2009; 49: 1557–1564.
- 29. Ginsberg J, Mohebbi MH, Patel RS, et al. Detecting influenza epidemics using search engine query data. Nature 2009; 457: 1012–1014.
- 30. Preis T and Moat HS. Adaptive nowcasting of influenza outbreaks using Google searches. R Soc Open Sci 2014, 10.1098/rsos.140095.
- 31. Cho S, Sohn CH, Jo MW, et al. Correlation between national influenza surveillance data and Google Trends in South Korea. PLOS One 2013, 10.1371/journal.pone.0081422.
- 32. Kang M, Zhong H, He J, et al. Using Google Trends for influenza surveillance in South China. PLOS One 2013, 10.1371/journal.pone.0055205.
- 33. Martin LJ, Lee BE and Yasui Y. Google Flu Trends in Canada: a comparison of digital disease surveillance data with physician consultations and respiratory virus surveillance data, 2010–2014. Epidemiol Infect 2016; 144: 325–332.
- 34. Pollett S, Boscardin WJ, Azziz-Baumgartner E, et al. Evaluating Google Flu Trends in Latin America: important lessons for the next phase of digital disease detection. Clin Infect Dis 2017; 64: 34–41.
- 35. Wilson N, Mason K, Tobias M, et al. Interpreting “Google Flu Trends” data for pandemic H1N1 influenza: the New Zealand experience. Euro Surveill 2009; 14: pii=19386, 10.2807/ese.14.44.19386-en.
- 36. Centers for Disease Control and Prevention. Overview of influenza surveillance in the United States. Washington, DC: Centers for Disease Control and Prevention, 2017.
- 37. Yang W, Lipsitch M and Shaman J. Inference of seasonal and pandemic influenza transmission dynamics. Proc Natl Acad Sci 2015; 112: 2723–2728.
- 38. Boëlle PY, Ansart S, Cori A, et al. Transmission parameters of the A/H1N1 (2009) influenza virus pandemic: a review. Influenza Other Respir Viruses 2011; 5: 306–316.
- 39. Fraser C, Donnelly CA, Cauchemez S, et al. Pandemic potential of a strain of influenza A (H1N1): early findings. Science 2009; 324: 1557–1561.
- 40. Yang Y, Sugimoto JD, Halloran ME, et al. The transmissibility and control of pandemic influenza A (H1N1) virus. Science 2009; 326: 729–733.
- 41. Grima R. An effective rate equation approach to reaction kinetics in small volumes: theory and application to biochemical reactions in nonequilibrium steady-state conditions. J Chem Phys 2010; 133: 035101.
- 42. Thomas P, Matuschek H and Grima R. Intrinsic noise analyzer: a software package for the exploration of stochastic biochemical kinetics using the system size expansion. PLOS One 2012; 7: e38518.
- 43. Olson DR, Konty KJ, Paladini M, et al. Reassessing Google Flu Trends data for detection of seasonal and pandemic influenza: a comparative epidemiological study at three geographic scales. PLOS Comput Biol 2013, 10.1371/journal.pcbi.1003256.
- 44. Cook S, Conrad C, Fowlkes AL, et al. Assessing Google Flu Trends performance in the United States during the 2009 influenza virus A (H1N1) pandemic. PLOS One 2011, 10.1371/journal.pone.0023610.