Abstract
Process-based models have been used to simulate and forecast a number of nonlinear dynamical systems, including influenza and other infectious diseases. In this work, we evaluate the effects of model initial condition error and stochastic fluctuation on forecast accuracy in a compartmental model of influenza transmission. These two types of errors are found to have qualitatively similar growth patterns during model integration, indicating that dynamic error growth, regardless of source, is a dominant component of forecast inaccuracy. We therefore examine the nonlinear growth of model initial error and compute the fastest growing directions using singular vector analysis. Using this information, we generate perturbations in an ensemble forecast system of influenza to obtain a better-calibrated ensemble spread. In retrospective forecasts of historical outbreaks for 95 US cities from 2003 to 2014, this approach improves short-term forecasts of incidence over the next one to four weeks.
Author summary
Mathematical models are now used to forecast infectious disease incidence at the population scale. By better understanding how errors in prediction systems are introduced, grow and impact the predictability of infectious disease, forecast accuracy could be improved. Here we explore the growth pattern of errors introduced from two major sources, model initial conditions and stochastic fluctuation, in a simple compartmental model describing influenza transmission. We find that model initial error typically undergoes faster growth due to nonlinear amplification during model evolution. Adopting techniques used in numerical weather prediction, we leverage this growth of uncertainty and modify an ensemble forecast system to generate optimal perturbations along the fastest growing direction of initial error. This perturbation procedure increases ensemble spread, which better captures observations with large uncertainties. In retrospective forecasts for 95 US cities during the 2003 through 2014 flu seasons, this procedure leads to a substantial improvement of short-term forecast quality.
Introduction
Influenza imposes a tremendous toll on global public health due to its recurrent worldwide spread and associated heavy morbidity and mortality burden [1]. To better prepare for and mitigate future outbreaks, accurate forecasts of influenza transmission are needed. Over the last few years, a number of forecasting systems have been developed and operationalized in the hopes of informing real-time policy-making during an influenza outbreak [2–11]. Although forecast skill has advanced significantly, the predictability of nonlinear influenza transmission dynamics is limited by the errors in model forecast systems [12]. These errors derive from three major sources: errors in model initial conditions, stochasticity in model dynamics, and model misspecification. To further improve influenza forecast accuracy, a better understanding of these errors and their impact on forecast uncertainty is needed. In this work, we focus on the first two error sources (i.e., initial condition error and stochasticity) and do not investigate model misspecification.
While prediction uncertainty and error growth in weather and climate forecasting have been well studied [13–23], few works have examined this phenomenon in forecast models of infectious disease. In this work, we perform an analysis of prediction uncertainty and error growth in a compartmental model of influenza transmission. We compare growth patterns of errors derived from both initial condition error and stochastic fluctuation during different stages of an influenza outbreak. We find these error sources have similar effects on influenza incidence predictability; however, initial error leads to a faster increase in ensemble spread and therefore appears more responsible for the degradation of predictability. We then derive the linear propagator of the transmission model and calculate the unstable direction of initial error growth using singular vector analysis [14–17]. The flow-dependent singular vectors obtained can then be used to generate optimal perturbations during the ensemble forecast of influenza, an adaptation of methods used in operational numerical weather prediction [21–23]. We optimize this perturbation procedure in a model-data assimilation forecast framework and validate it using historical outbreaks from 95 cities in the United States from 2003 to 2014. Compared with the baseline method without optimal perturbations, the properly perturbed system substantially improves short-term forecast quality around and after the peak of an outbreak, when observed incidence levels are most uncertain. This procedure of diagnosing and optimally perturbing ensemble forecasts of influenza can be applied to ensemble forecast systems for other infectious diseases.
Materials and methods
Data
We combine Google Flu Trends (GFT) data and concurrent laboratory-confirmed influenza positivity rates to generate observational estimates of influenza incidence. Using internet search query activity, GFT provided real-time estimates of weekly influenza-like illness (ILI) per 100,000 people seeking medical treatment for major cities in the United States during 2003–2015 [24]. ILI is a medical diagnosis of possible influenza or other illness defined by symptoms of a fever above 37.8 °C plus cough and/or sore throat. These symptoms are not exclusively caused by influenza, as other respiratory viruses (e.g., respiratory syncytial virus and rhinovirus) may produce similar symptoms. Therefore, to capture a more specific signal of influenza infection incidence, we multiply weekly municipal GFT ILI by the percentage of laboratory-confirmed influenza infections among patients presenting with ILI, compiled regionally through the National Respiratory and Enteric Virus Surveillance System and US-based World Health Organization collaborating laboratories [25]. This combined metric, termed ILI+, better tracks influenza incidence and thus provides a more specific target for inference and forecast [5,25]. Excluding the pandemic seasons of 2008–2009 and 2009–2010, locations without absolute humidity data, and seasons with incomplete observations, we used 790 ILI+ outbreak time series from 95 cities in the US during the 2003–2004 through 2013–2014 seasons in this study.
Humidity-driven SIRS model
A parsimonious SIRS (susceptible-infected-recovered-susceptible) model forced by absolute humidity (AH) conditions is used to simulate influenza activity. This SIRS model with environmental forcing, previously validated against historical outbreaks in the United States [26,27], provides a concise mathematical description of influenza transmission dynamics. Within an assumed uniformly mixed population, transmission proceeds according to the following equations:
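(1) $\frac{dS}{dt} = \frac{N - S - I}{L} - \frac{\beta(t) I S}{N}$
(2) $\frac{dI}{dt} = \frac{\beta(t) I S}{N} - \frac{I}{D}$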
where N, S and I are the total, susceptible and infected populations, respectively; β(t) is the contact rate at time t; D is the mean infectious period; and L is the average duration of immunity. As population size is constant, the recovered population is N − S − I. The contact rate β(t) is modulated by local AH conditions via
(3) $R_0(t) = \beta(t) D = e^{a\,q(t) + b} + R_{0\min}$
Here, R0(t) is the basic reproductive number (the expected number of secondary infections generated by a single infection in a fully susceptible population), and q(t) is specific humidity (a measure of AH). The coefficients in the exponential term are estimated from laboratory experiments on influenza virus survival: a = −180 and b = log(R0max − R0min), where R0max and R0min are the maximum and minimum basic reproductive numbers [27]. Local AH conditions, i.e., daily climatological humidity data averaged from 1979 through 2002, are derived from the North American Land Data Assimilation System [28].
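The deterministic model can be sketched in a few lines. The block below is a minimal illustration, assuming the standard SIRS form dS/dt = (N − S − I)/L − β(t)IS/N and dI/dt = β(t)IS/N − I/D together with R0(t) = β(t)D = exp(a·q(t) + b) + R0min. The function names, the initial state and the sinusoidal humidity series are hypothetical stand-ins (not the NLDAS climatology), and the classical fourth-order Runge-Kutta step holds β fixed within each day, which is a simplification.

```python
import numpy as np

def basic_reproductive_number(q, r0max, r0min, a=-180.0):
    """Humidity forcing: R0 = exp(a*q + b) + R0min, with b = log(R0max - R0min)."""
    b = np.log(r0max - r0min)
    return np.exp(a * q + b) + r0min

def sirs_rhs(state, beta, N, D, L):
    """Time derivatives (dS/dt, dI/dt) for state = (S, I)."""
    S, I = state
    infection = beta * I * S / N           # flow S -> I
    dS = (N - S - I) / L - infection       # waning immunity refills S
    dI = infection - I / D                 # recovery drains I
    return np.array([dS, dI])

def rk4_step(state, beta, N, D, L, dt=1.0):
    """One classical fourth-order Runge-Kutta step (beta held fixed over dt)."""
    k1 = sirs_rhs(state, beta, N, D, L)
    k2 = sirs_rhs(state + 0.5 * dt * k1, beta, N, D, L)
    k3 = sirs_rhs(state + 0.5 * dt * k2, beta, N, D, L)
    k4 = sirs_rhs(state + dt * k3, beta, N, D, L)
    return state + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

def simulate(S0, I0, q_daily, r0max, r0min, D, L, N=1e5):
    """Integrate day by day; returns one (S, I) row per day, including day 0."""
    state = np.array([S0, I0], dtype=float)
    path = [state]
    for q in q_daily:
        beta = basic_reproductive_number(q, r0max, r0min) / D
        state = rk4_step(state, beta, N, D, L)
        path.append(state)
    return np.array(path)

# Hypothetical inputs: a made-up seasonal humidity curve and initial state.
days = np.arange(280)
q_demo = 0.008 - 0.005 * np.cos(2 * np.pi * (days - 110) / 365.0)
path = simulate(S0=70000, I0=50, q_daily=q_demo,
                r0max=3.0, r0min=1.0, D=4.0, L=4 * 365.0)
```

At q = 0 the forcing returns R0max, and as q grows it decays toward R0min, which is the behavior the coefficients a and b are meant to encode.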
The SIRS model can be integrated forward in time either deterministically or stochastically. When inspecting the growth of initial error, the model was run deterministically using a fourth-order Runge-Kutta stepping scheme. A stochastic version was used to examine the impact of stochastic fluctuation. There exist a number of approaches for introducing stochasticity into model dynamics [29–35]. Here we used an event-driven approach that interprets the transitions between individuals’ states as Markov chains [31]. In particular, the rate for each type of transition event, defined in Eqs 1 and 2 (e.g., susceptible to infected, infected to recovered, and recovered to susceptible), in a short time step δt = 1 was perturbed through multiplication with a Gamma distributed parameter (mean 1 and standard deviation σp). Mathematically, the model equations are modified to
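(4) $\frac{dS}{dt} = \gamma_{R \to S} \frac{N - S - I}{L} - \gamma_{S \to I} \frac{\beta(t) I S}{N}$
(5) $\frac{dI}{dt} = \gamma_{S \to I} \frac{\beta(t) I S}{N} - \gamma_{I \to R} \frac{I}{D}$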
where γS→I, γI→R and γR→S represent the stochastic forcing on the transition events from susceptible to infected, infected to recovered, and recovered to susceptible, respectively. The exact number of individuals transitioning from one state to another during a time step δt = 1 was generated from a Poisson distribution with the mean value set equal to the value in the deterministic process. This approach has been widely used to model the stochastic dynamics of infectious disease [30–35].
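One way to read this event-driven scheme is sketched below: each transition rate is multiplied by a Gamma draw with mean 1 and standard deviation σp, and the realized number of transitions in one day is a Poisson draw around that perturbed rate. The function names are hypothetical, and capping each count at the size of its source compartment is an added safeguard that the text does not describe.

```python
import numpy as np

def gamma_multiplier(rng, sigma_p):
    """Gamma draw with mean 1 and standard deviation sigma_p."""
    shape = 1.0 / sigma_p**2      # mean = shape * scale = 1
    scale = sigma_p**2            # variance = shape * scale**2 = sigma_p**2
    return rng.gamma(shape, scale)

def stochastic_step(S, I, beta, N, D, L, sigma_p, rng):
    """Advance (S, I) by one day; also returns the number of new infections."""
    R = N - S - I
    g_si, g_ir, g_rs = (gamma_multiplier(rng, sigma_p) for _ in range(3))
    # Poisson counts centered on the perturbed deterministic flows.
    n_si = rng.poisson(g_si * beta * I * S / N)
    n_ir = rng.poisson(g_ir * I / D)
    n_rs = rng.poisson(g_rs * R / L)
    # Assumption: a compartment cannot lose more individuals than it holds.
    n_si, n_ir, n_rs = min(n_si, S), min(n_ir, I), min(n_rs, R)
    return S - n_si + n_rs, I + n_si - n_ir, n_si

# Hypothetical usage with made-up parameter values.
rng = np.random.default_rng(0)
S, I = 60000, 100
weekly_incidence = 0
for _ in range(7):
    S, I, new_inf = stochastic_step(S, I, beta=0.5, N=100000, D=4.0,
                                    L=1460.0, sigma_p=0.1, rng=rng)
    weekly_incidence += new_inf
```

Summing the daily new infections over seven steps gives a weekly incidence comparable to the observation used elsewhere in the paper.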
In all model simulations, the total population was set as N = 1 × 10⁵ uniformly. Because ILI+ (i.e., influenza infection per 100,000 patient visits) is reported as a rate, not a magnitude, the total population size, N, is arbitrary. To generate synthetic outbreaks, initial conditions (S,I) and epidemiological parameters (R0max, R0min, D,L) were drawn randomly using a Latin hypercube sampling strategy [36] from the following ranges: 3,000 ≤ S ≤ 8,000, 0 ≤ I ≤ 1,000, 1.3 ≤ R0max ≤ 4, 0.8 ≤ R0min ≤ 1.3, 2 days ≤ D ≤ 7 days, 1 year ≤ L ≤ 10 years. [In the two-dimensional case, Latin hypercube sampling generates n samples in two steps: 1) divide the state-space into n × n uniform squares and 2) select sample positions such that there is only one sample in each row and each column. High-dimensional Latin hypercube sampling is a generalization of this process.] The humidity-driven SIRS model was integrated from October 1st for 40 consecutive weeks to generate synthetic outbreaks. Weekly observations of local influenza incidence are the number of new infections, O_t, which are calculated during model integration. To mimic real-world observational error, random Gaussian noise with mean 0 and observation error variance at week t was added to the simulated weekly incidence.
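The row-and-column rule described in the bracketed note generalizes directly to more dimensions. The following is a minimal sketch, not the authors' code: `latin_hypercube` is a hypothetical helper, the bounds are the ranges as printed above (with L converted to days), and each dimension is split into n equal strata that are shuffled independently.

```python
import numpy as np

def latin_hypercube(n, bounds, rng):
    """Draw n samples so that every dimension has exactly one sample per stratum."""
    d = len(bounds)
    # Independent random ordering of the n strata in each dimension.
    strata = np.column_stack([rng.permutation(n) for _ in range(d)])
    # Uniform jitter inside each stratum, rescaled to the unit cube.
    unit = (strata + rng.random((n, d))) / n
    lo, hi = np.array(bounds, dtype=float).T
    return lo + unit * (hi - lo)

# Ranges as printed in the text: S, I, R0max, R0min, D (days), L (days).
bounds = [(3000, 8000), (0, 1000), (1.3, 4.0), (0.8, 1.3), (2, 7), (365, 3650)]
rng = np.random.default_rng(42)
samples = latin_hypercube(10, bounds, rng)   # shape (10, 6)
```

Compared with independent uniform draws, this guarantees that each parameter's range is covered evenly even for a small number of samples.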
The ensemble adjustment Kalman filter
Data assimilation methods were used to infer unobserved variables and parameters in the humidity-driven SIRS model from observations. Specifically, we employed a sequential ensemble filtering algorithm called the Ensemble Adjustment Kalman Filter (EAKF) [37] to iteratively optimize the distribution of variables (S,I) and parameters (R0max, R0min, D,L) with each successive observation. While the EAKF is optimal for linear systems, it also exhibits satisfactory performance in practice for weakly nonlinear dynamical models such as the SIRS model we study here. To date, the EAKF has been used for the inference and forecast of a number of infectious diseases, such as influenza [4–6,38–40], West Nile Virus [41–42], dengue [43], respiratory syncytial virus [44], Ebola [45] and antibiotic-resistant pathogens [46].
To represent the state-space distribution, the EAKF maintains an ensemble of system state vectors acting as samples from the distribution. The EAKF assumes that both the prior distribution and likelihood are Gaussian and can be fully characterized by the first two moments, i.e., mean and covariance. Unobserved variables and parameters are updated through their covariability with the observed state variable, which can be computed directly from the ensemble. In the EAKF, the variables and parameters are updated deterministically so that higher moments of the prior distribution are preserved in the posterior [37].
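The update described here can be illustrated for a single scalar observation. The sketch below follows the usual textbook form of the EAKF (Gaussian product of prior and likelihood, a deterministic shift and contraction of the observed variable, then linear regression onto the unobserved quantities); it is an assumption that this matches the implementation used in the paper, and the function and variable names are hypothetical.

```python
import numpy as np

def eakf_update(ensemble, obs_ens, obs, obs_var):
    """EAKF update for one scalar observation.

    ensemble : (n_vars, n_members) variables and parameters
    obs_ens  : (n_members,) model-predicted observation for each member
    obs      : observed value; obs_var : observation error variance
    """
    prior_mean = obs_ens.mean()
    prior_var = obs_ens.var(ddof=1)
    # Gaussian prior times Gaussian likelihood.
    post_var = 1.0 / (1.0 / prior_var + 1.0 / obs_var)
    post_mean = post_var * (prior_mean / prior_var + obs / obs_var)
    # Deterministic adjustment: shift the mean, shrink the anomalies.
    new_obs_ens = post_mean + np.sqrt(post_var / prior_var) * (obs_ens - prior_mean)
    increments = new_obs_ens - obs_ens
    # Regress the increments onto each variable via ensemble covariance.
    anomalies = ensemble - ensemble.mean(axis=1, keepdims=True)
    cov = anomalies @ (obs_ens - prior_mean) / (obs_ens.size - 1)
    return ensemble + np.outer(cov / prior_var, increments), new_obs_ens

# Hypothetical usage: 300 members, two unobserved quantities.
rng = np.random.default_rng(3)
ens = rng.normal([[60000.0], [2.0]], [[2000.0], [0.3]], size=(2, 300))
pred_obs = 0.01 * ens[0] + rng.normal(0.0, 20.0, size=300)
post_ens, post_obs = eakf_update(ens, pred_obs, obs=650.0, obs_var=400.0)
```

Because the adjustment is a deterministic linear map of the prior anomalies rather than a random redraw, features of the prior ensemble beyond its mean and variance carry over to the posterior.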
The SIRS model-EAKF system can simulate the behavior of realistic epidemic curves due to the iterative adjustment of the system state by the EAKF. In S1 Text, we fit historical outbreaks from New York, Denver, Los Angeles and Houston for the 2010–2011 to 2013–2014 seasons. In general, the posterior estimate captures the ILI+ curves in these outbreaks (see S1 Text, Fig A).
Results
Analytical and numerical investigation of error growth
Roles of model initial error and stochastic forcing
The predictability of a dynamical system can be measured by the variance of an ensemble of perturbed trajectories [13]. For n model trajectories perturbed at time t, we denote $f_i(t,k)$ (i = 1,⋯,n) as the observation of the ith trajectory after time k. The ensemble spread is defined as
(6) $\sigma^2(t,k) = \frac{1}{n-1} \sum_{i=1}^{n} \left[ f_i(t,k) - \bar{f}(t,k) \right]^2$
where $\bar{f}(t,k)$ is the ensemble mean over all trajectories, i.e., the mean of $f_i(t,k)$ (i = 1,⋯,n).
The humidity-driven SIRS model was perturbed in two different ways. For the first, at time t we perturbed the initial condition of variables (St,It) through multiplication with scaling parameters (ε1,ε2), where both ε1 and ε2 were generated from a Gaussian distribution with mean 1 and variance $\sigma_p^2$. For each synthetic outbreak and each day of perturbation, we generated n = 100 perturbed trajectories and tracked the evolution of the ensemble spread for time k. For the second, at each perturbation time t, we simulated n = 100 realizations of the stochastic model (Eqs 4 and 5) using a Gamma distribution with the same variance $\sigma_p^2$, starting from the same initial condition (St,It). Note that the first perturbation method produces errors in initial conditions and integrates the model deterministically; the second perturbation method integrates the model from the same initial condition but introduces errors through continuous stochastic forcing of model dynamics. Because the above two perturbation methods operate in different ways, it is challenging to design a completely controlled, fair comparison. Here, we impose perturbations with the same variance in order to control the strength of the initial condition perturbation and the intensity of stochastic forcing.
We generated 1,000 synthetic outbreaks using Latin hypercube sampling of initial conditions and parameters, with transmission rate forced by daily absolute humidity for New York City, and then imposed perturbations on these trajectories each day from 10 weeks (70 days) prior to the peak until 6 weeks (42 days) after. We measured the log-transformed ensemble spread log(σ²(t,k)) averaged over all trajectories for 6 weeks (42 days) following the perturbation. In Fig 1A and 1B, we show the evolution of ensemble spread after perturbations with σp = 10% at different times with respect to the outbreak peak.
In general, the growth of uncertainty introduced by stochastic forcing and by initial error exhibits qualitatively similar patterns (Fig 1). This finding indicates that the impact of stochastic fluctuation is largely manifested by the nonlinear growth of the error it introduces into the model. The stochasticity-induced uncertainty is not static, but propagates following the nonlinear model dynamics, just as the introduced initial error propagates dynamically. This implies that, in generating variability within an ensemble of model trajectories used for influenza forecast, using a stochastic model is qualitatively similar in effect to perturbing initial conditions. The two differ in magnitude, however: perturbations of initial conditions (Fig 1B) result in a larger ensemble spread than stochastic fluctuations, which appear to partially damp dynamic error growth (Fig 1A). The impact of these errors depends heavily on both the perturbation time and forecast horizon. Errors introduced before the peak amplify exponentially during the early phase of outbreaks, whereas perturbations after the peak generally remain stable. Perturbations with σp = 5% and 15% were also tested (see S1 Text, Figs B-C), but no significant change in the results was observed. Further, we performed the same analysis as in Fig 1 for three other cities with different climate conditions: Denver, Los Angeles and Houston (see S1 Text, Figs D-G). The error growth patterns were robust across these different regions of the US.
Around the peak of an outbreak, a forecast with a large ensemble spread may still have utility because the forecast target also increases. To account for the increased target, we use another measure of predictability, potential prediction utility (PPU) [47,48], to quantify the forecast uncertainty relative to the target. PPU for a prediction made at time t with a forecast length k is expressed as
(7) $\mathrm{PPU}(t,k) = 1 - \frac{\sigma(t,k)}{\bar{f}(t,k)}$
Recall that $\sigma(t,k)$ and $\bar{f}(t,k)$ are the ensemble standard deviation and ensemble mean. The term $\sigma(t,k)/\bar{f}(t,k)$ measures the “noise-to-signal” ratio. PPU can vary from one to zero, with a value of one indicating a perfect prediction. In Fig 1C and 1D, the evolution of PPU after perturbation is compared between stochastic fluctuation and initial error. PPU for stochastic forcing remains almost constant at around 0.9, indicating a stable relative uncertainty with respect to the true signal for all perturbations. PPU for initial error, however, has more complex features. Generally, PPU rapidly drops below 0.85 at 7 days after the perturbation, and then continues to decrease at a rate that depends on t, the time of perturbation. In Fig 1D, we observe two blue areas with extremely low PPU. The one in the upper-left corner is attributed to the large ensemble spread $\sigma(t,k)$ produced during the exponential growth of epidemics, while the one in the upper-right corner is due to low signal at the end of outbreaks. For days -20 to 0, the large signal, $\bar{f}(t,k)$, near the peak leads to increased PPU values. The same pattern was also observed in experiments for other cities and perturbations with σp = 5% and 15%.
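Both predictability measures reduce to simple ensemble statistics. The sketch below assumes the sample variance (division by n − 1) for the spread and reads PPU as one minus the ratio of ensemble standard deviation to ensemble mean; both conventions, and the function names, are assumptions made for illustration.

```python
import numpy as np

def ensemble_spread(forecasts):
    """Variance of the ensemble of forecast values at one lead time."""
    return np.var(forecasts, ddof=1)

def potential_prediction_utility(forecasts):
    """One minus the noise-to-signal ratio (std / mean) of the ensemble."""
    return 1.0 - np.std(forecasts, ddof=1) / np.mean(forecasts)

# Hypothetical three-member ensemble of weekly incidence.
demo = np.array([90.0, 100.0, 110.0])
spread = ensemble_spread(demo)             # sample variance of the three values
ppu = potential_prediction_utility(demo)   # 1 - std/mean
```

Because PPU divides by the ensemble mean, the same absolute spread counts for less near the peak, where incidence is high, than in the tails of an outbreak.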
From the above analyses we conclude that the predictability loss in the SIRS model due to initial error is more pronounced than that from stochastic fluctuation, which is in agreement with findings from climate models [13]. In the next section, we examine the rate and direction of initial error growth.
Nonlinear growth of initial error
For this parsimonious two-dimensional ordinary-differential-equation model of influenza transmission, we employ singular vector analysis to estimate the speed and direction of initial error expansion. This method has been applied with great success in numerical weather prediction [49–51].
For the humidity-driven SIRS model (Eqs 1 and 2), we assume that model parameters R0max, R0min, L and D do not change and define the variable vector x = (S,I)ᵀ. We then write Eqs 1 and 2 in the form
(8) $\frac{d\mathbf{x}}{dt} = A(\mathbf{x})$
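Here A(x) is the function describing the nonlinear evolution of the variable vector x. We examine how small perturbations evolve following these nonlinear dynamics. Instantaneous error growth for a small perturbation, δx = (δS,δI)ᵀ, at time t is given by the linear system
(9) $\frac{d\,\delta\mathbf{x}}{dt} = \mathbf{J}(t)\,\delta\mathbf{x}$
where $\mathbf{J}(t) = \partial A / \partial \mathbf{x}$ is the Jacobian of the system at time t:
(10) $\mathbf{J}(t) = \begin{pmatrix} -\frac{1}{L} - \frac{\beta(t) I}{N} & -\frac{1}{L} - \frac{\beta(t) S}{N} \\ \frac{\beta(t) I}{N} & \frac{\beta(t) S}{N} - \frac{1}{D} \end{pmatrix} = \frac{1}{D} \begin{pmatrix} -\frac{1}{L'} - I' & -\frac{1}{L'} - S' \\ I' & S' - 1 \end{pmatrix}$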
In the last expression, S′ = R0(t)S(t)/N, I′ = R0(t)I(t)/N, L′ = L/D. Recall that R0(t) = β(t)D and note that the last matrix in Eq 10 is non-dimensional. Epidemiologically, S′ is the rescaled effective reproductive number, i.e., the average number of infections caused by one infection in D days in a population with S(t) susceptible people; I′ is the rescaled force of infection, i.e., the hazard (or rate) of a susceptible individual acquiring influenza in D days.
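In a population of size N = 10⁵, the typical error (or uncertainty) in S is of order O(10³), whereas for I it is usually of order O(10²). To give the two errors approximately equal weight we normalize the absolute errors δS and δI by their typical uncertainties η(S) and η(I): $\delta\tilde{\mathbf{x}} = W\,\delta\mathbf{x}$ with W = diag(1/η(S), 1/η(I)). For the new variable $\delta\tilde{\mathbf{x}}$, the error growth equation becomes
(11) $\frac{d\,\delta\tilde{\mathbf{x}}}{dt} = \tilde{\mathbf{J}}(t)\,\delta\tilde{\mathbf{x}}$
where, after defining ν = η(S)/η(I) and writing $\mathbf{J}(t)$ for the Jacobian in Eq 10,
(12) $\tilde{\mathbf{J}}(t) = W \mathbf{J}(t) W^{-1} = \frac{1}{D} \begin{pmatrix} -\frac{1}{L'} - I' & -\frac{1}{\nu}\left(\frac{1}{L'} + S'\right) \\ \nu I' & S' - 1 \end{pmatrix}$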
The direction that has the fastest instantaneous error growth rate at time t is the one that maximizes the quantity
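(13) $\frac{1}{\|\delta\tilde{\mathbf{x}}\|^2} \frac{d\|\delta\tilde{\mathbf{x}}\|^2}{dt}$
The norm ‖x‖² is defined as ‖x‖² = xᵀx. In Eq 13, the numerator quantifies the instantaneous growth rate of $\|\delta\tilde{\mathbf{x}}\|^2$ (the square of the Euclidean length of the normalized perturbation $\delta\tilde{\mathbf{x}}$). The denominator normalizes this growth rate by $\|\delta\tilde{\mathbf{x}}\|^2$. Therefore, Eq 13 represents the relative instantaneous growth rate of a perturbation $\delta\tilde{\mathbf{x}}$. If we consider unit perturbations with $\|\delta\tilde{\mathbf{x}}\|^2 = 1$, the growth rate is solely determined by $d\|\delta\tilde{\mathbf{x}}\|^2/dt$.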
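Because, by Eq 11, with $\tilde{\mathbf{J}}$ the normalized Jacobian of Eq 12 and $\delta\tilde{\mathbf{x}}$ the normalized perturbation,
(14) $\frac{d\|\delta\tilde{\mathbf{x}}\|^2}{dt} = \delta\tilde{\mathbf{x}}^T \left( \tilde{\mathbf{J}} + \tilde{\mathbf{J}}^T \right) \delta\tilde{\mathbf{x}}$
the direction e1 that grows the fastest is the solution of the eigenvalue problem
(15) $\left( \tilde{\mathbf{J}} + \tilde{\mathbf{J}}^T \right) \mathbf{e} = \lambda \mathbf{e}$
The largest eigenvalue (the fastest growth rate) λ1 may be found analytically:
(16) $\lambda_1 = \frac{1}{D} \left[ S' - I' - 1 - \frac{1}{L'} + \sqrt{ \left( S' + I' - 1 + \frac{1}{L'} \right)^2 + \left( \nu I' - \frac{1}{\nu} \left( S' + \frac{1}{L'} \right) \right)^2 } \right]$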
The principal eigenvector e1 is called the singular vector of the system [48]. It is an approximation of the local Lyapunov vector [52–54]. Note that the singular vector is different from the principal eigenvector of the Jacobian itself. The impact of each variable or parameter on the (non-dimensional) error growth rate Dλ1 can be calculated from Eq 16. Since L ∈ [1,10] years ≫ D ∈ [2,7] days, we will omit the term 1/L′ = D/L hereafter.
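The analytical growth rate can be cross-checked numerically. The sketch below is one reading of the derivation, stated as an assumption: the normalized Jacobian is W J W⁻¹ with W = diag(1/η(S), 1/η(I)), the quantity in Eq 13 is maximized by the leading eigenvector of that matrix plus its transpose, and the closed form for λ1 is the larger root of the resulting 2 × 2 symmetric eigenvalue problem. All function names are hypothetical.

```python
import numpy as np

def normalized_jacobian(S_p, I_p, D, L_p, nu):
    """W J W^-1 in terms of S' = R0*S/N, I' = R0*I/N, L' = L/D, nu = eta(S)/eta(I)."""
    return np.array([[-(1.0 / L_p + I_p), -(1.0 / L_p + S_p) / nu],
                     [nu * I_p, S_p - 1.0]]) / D

def lambda1_closed_form(S_p, I_p, D, L_p, nu):
    """Largest eigenvalue of (J~ + J~^T), written out analytically."""
    half_trace = S_p - I_p - 1.0 - 1.0 / L_p
    root = np.hypot(S_p + I_p - 1.0 + 1.0 / L_p,
                    nu * I_p - (S_p + 1.0 / L_p) / nu)
    return (half_trace + root) / D

def leading_direction(S_p, I_p, D, L_p, nu):
    """Numerical lambda_1, singular vector e1, and its angle in degrees (-90, 90]."""
    Jt = normalized_jacobian(S_p, I_p, D, L_p, nu)
    eigvals, eigvecs = np.linalg.eigh(Jt + Jt.T)   # ascending eigenvalues
    e1 = eigvecs[:, -1]
    if e1[0] < 0:                                  # e1 and -e1 are equivalent
        e1 = -e1
    theta1 = np.degrees(np.arctan2(e1[1], e1[0]))
    return eigvals[-1], e1, theta1

# Hypothetical state: S' = 1.2, I' = 0.01, D = 4 days, L = 4 years, nu = 10.
lam_cf = lambda1_closed_form(1.2, 0.01, D=4.0, L_p=365.0, nu=10.0)
lam_num, e1, theta1 = leading_direction(1.2, 0.01, D=4.0, L_p=365.0, nu=10.0)
```

With 1/L′ dropped, the closed form vanishes at S′ = 1 and I′ = 1/ν², which is a convenient sanity check on any implementation.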
To validate Eq 16, we calculated the maximal error growth rate numerically and then compared it with the theoretical value. At each day t after the beginning of an outbreak, we imposed an ensemble of perturbations on x along different directions in the (S,I) plane: δx = (cos(2kπ)η(S), sin(2kπ)η(I))ᵀ (k = 1/360,⋯, 1; η(S) = 10³, η(I) = 10²), i.e., perturbations of unit length in the normalized space. Both the unperturbed and perturbed trajectories were evolved forward for δt = 0.1. We then calculated the error at t + δt and the maximal error growth rate among all perturbations according to Eq 13. In Fig 2, we compare the numerically calculated maximal error growth rate r(t) with that predicted by Eq 16 for the SIRS models with or without humidity forcing. In both cases, the maximal error growth rate is well predicted by Eq 16. Further, according to the overlaid epidemic curves, error growth is most pronounced at the early stage of outbreaks, indicating that model dynamics are more sensitive to the errors introduced early in the season. We repeated this analysis for 1,000 synthetic outbreaks, and display the distribution of discrepancy between theoretical and simulated error growth rate in S1 Text, Fig H. Results indicate a satisfactory performance from the theoretical prediction of Eq 16.
To identify realistic combinations of (S′,I′), we generated 1,000 synthetic outbreaks using the SIRS model forced by humidity conditions for New York City starting from October 1st. The distribution of S′ and I′ in the (S′,I′) plane, calculated from these synthetic outbreaks over 280 days (40 weeks), is shown in Fig 3A. We display the contour of Dλ1 as a function of S′ and I′ in Fig 3B (η(S) = 10³, η(I) = 10²). The area contained by the black dashed line marks the region of (S′,I′) in Fig 3A with probability of occurrence higher than 10⁻⁵. In this feasible region, Dλ1 is quite sensitive to S′ but less sensitive to I′ such that the error growth rate depends primarily on the size of the susceptible population. Epidemiologically, this indicates that the uncertainty of future, predicted incidence is more strongly linked to the proportion of susceptible people in the population than to the proportion of infected individuals. For each particular outbreak, we can draw its trajectory in the S′ − I′ plane and observe how the growth rate changes over time (see the red trajectory in Fig 3B for an example).
The fastest error growth direction can be estimated by the eigenvector e1 = (e11,e12)ᵀ corresponding to λ1. We quantify the direction of e1 by θ1 = arctan (e12/e11) (in degrees from −90° to 90°), and show its contour in Fig 3C. In the middle of the feasible region is a singular point where the symmetric matrix of the eigenvalue problem in Eq 15, multiplied by D, degenerates to diag(−0.02, 0). In fact, the singular point is the vertex of the parabola of Dλ1 = 0 defined by Eq 16 (Fig 3B). At this point, we have e1 = (0,1)ᵀ where e12/e11 diverges. An epidemic could reach this singular point. This would lead to the divergence of θ1 around this point but would not affect the epidemic process described by Eqs 1 and 2.
During the epidemic process marked by the red curve in Fig 3C, θ1 first changes from approximately −40° to −90°, and then from 90° to 0°. Note that e1 and −e1 (the opposite of e1) are both eigenvectors. Thus, the directions between −90° and 0° are equivalent to their opposite directions between 90° and 180°. In this sense, the fastest error growth direction evolves continuously from 140° to 0°. Recall that e11 and e12 represent the projections of e1 on S′ and I′, respectively. This implies that, in the normalized space, the error growth direction gradually moves to align with I′ (e12 > e11) at the early phase and then turns to S′ (e11 > e12) in the end.
Fig 3 provides a simplified picture to interpret the impact of parameters on error growth. According to Eq 16, the second eigenvalue λ2 is always negative. Therefore, errors along the direction of the eigenvector corresponding to λ2 will always contract, and the only concern is for error growth along e1. The growth rate and direction of these errors are described in Fig 3B and 3C. Varying D changes the time scale of error growth; changing R0 modifies the position of (S′,I′) in the (S′,I′) plane by a given scaling parameter. The evolution of error growth for an outbreak can be tracked in a trajectory in the (S′,I′) plane, as plotted in Fig 3B and 3C.
As the error growth in the dynamical model is intrinsically nonlinear, it may deviate from the linear approximation characterized by the Jacobian (Eq 10). By using a linearized system to study error growth, we assume that the linear approximation generally captures the behavior of the full nonlinear system within a certain time interval. To verify this assumption, it is important to quantify the deviation of the linear approximation from the full nonlinear system. In Fig 4, we compare the error growth in the nonlinear system with approximations at four different phases of an outbreak (t = 5, 10, 15, and 20 weeks). At each time point t, we added an ensemble of errors δx = (cos(2kπ)η(S), sin(2kπ)η(I))ᵀ (k = 1/360,⋯, 1; η(S) = 10³, η(I) = 10²) (equivalently, perturbations of unit length in the normalized space) to the variables and bred the errors for 7 days. We display the largest error after δt, and compare it with two approximations: 1) a linear extrapolation of the initial growth, and 2) an exponential growth governed by λ1. Here λ1 is the largest eigenvalue obtained from the linearized system (Eq 16). As shown in Fig 4, the exponential approximation provides a good agreement with the full nonlinear growth at the early stage, indicating that the error will grow exponentially with a rate λ1. The linear approximation, however, is only valid for small δt and tends to underestimate the error growth after 2 days, especially before the outbreak peak. The largest eigenvalue λ1, although obtained from a linearized system, can reliably quantify the speed of nonlinear error growth between two successive observations.
Applications in conjunction with the EAKF
The above analyses are performed on the assumption that model parameters and variables are known. In an operational forecast, unobserved parameters and variables can be estimated using data assimilation techniques. In this work, we use the ensemble mean of parameters and variables obtained using the EAKF to calculate the Jacobian. Error normalization denominators η(S) and η(I) are set as the 95th percentile of ensemble member distance to the ensemble mean so that most errors fall within the unit circle. Outliers are not considered due to their large variation. In order to inspect the estimation bias in error growth rate λ1 and direction e1, we ran the SIRS-EAKF system with n = 300 ensemble members for 1,000 synthetic outbreaks for which the actual λ1 and e1 can be calculated, and computed the estimated $\hat{\lambda}_1$ and $\hat{\mathbf{e}}_1$ in 40 consecutive weeks. In Fig 5A, we display the distribution of the estimation bias in error growth rate, Δλ, grouped by the predicted lead to peak ranging from -10 weeks to 6 weeks (a negative predicted lead indicates the peak is predicted to occur in the future; a positive lead indicates the peak is predicted to have already passed). The boxes and whiskers indicate the interquartile range and the minimal and maximal values. In general, Δλ is distributed around 0 within a small range, suggesting that the error growth rate λ1 can be well estimated. The bias in e1 is quantified by the angular deviation θ from $\hat{\mathbf{e}}_1$ to e1 (in degrees from 0° to 90°). The distributions of θ are shown in Fig 5B. The estimation bias θ is low for the majority of cases. As a result, the estimated $\hat{\mathbf{e}}_1$ generally has a large projection on the actual e1.
Retrospective forecast of historical influenza outbreaks
Optimal perturbation for ensemble forecasts
As in numerical weather and climate prediction, information on error growth can be harnessed to improve the forecast quality of the model-data assimilation system. In principle, perturbations along the fastest error growth direction, termed optimal perturbations [4], are imposed when the ensemble spread needs to be enlarged to account for uncertainty in targets. Specifically, for each ensemble member, we scale the component of its normalized deviation $\delta\tilde{\mathbf{x}}$ along the estimated direction $\hat{\mathbf{e}}_1$ by a factor k, and use the adjusted variables to project the model ensemble into the future to make forecasts. Model parameters are not adjusted. The deviations δS and δI are obtained from the difference between the ensemble member and ensemble mean. If k > 1, the perturbation expands the distribution of variables along $\hat{\mathbf{e}}_1$ in the normalized space. Since the variability of incidence and dynamical error growth rate changes over time, we assign different perturbation intensities at different predicted lead to peak.
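The perturbation step can be sketched as a projection in the normalized space. In the block below, stretching the component along the estimated direction by k is written as adding (k − 1) times the projection back onto each member's deviation; that formula, the helper names, and the use of the 95th percentile of absolute deviations as normalizers are illustrative assumptions rather than the authors' code.

```python
import numpy as np

def normalizers(ensemble):
    """eta(S), eta(I): 95th percentile of absolute distance to the ensemble mean."""
    return np.percentile(np.abs(ensemble - ensemble.mean(axis=0)), 95, axis=0)

def perturb_along(ensemble, e1_hat, eta, k):
    """Scale the component of each member's normalized deviation along e1_hat by k.

    ensemble : (n_members, 2) array of (S, I); parameters are left untouched.
    """
    mean = ensemble.mean(axis=0)
    dev = (ensemble - mean) / eta               # normalized deviations
    unit = e1_hat / np.linalg.norm(e1_hat)
    proj = dev @ unit                           # component along e1_hat
    dev_new = dev + (k - 1.0) * np.outer(proj, unit)
    return mean + dev_new * eta                 # back to (S, I) units

# Hypothetical ensemble of 300 members and a made-up direction estimate.
rng = np.random.default_rng(7)
ens = rng.normal([60000.0, 500.0], [1000.0, 100.0], size=(300, 2))
eta = normalizers(ens)
stretched = perturb_along(ens, e1_hat=np.array([1.0, 1.0]), eta=eta, k=2.0)
```

The ensemble mean and the component orthogonal to the chosen direction are left unchanged, so only the spread along the fastest-growing direction is altered.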
To determine the perturbation intensity k needed for each predicted lead, we optimized k to improve the forecast quality of near-term predictions, here meaning the forecast of incidence in the next one to four weeks ahead. The quality of probabilistic forecasts can be measured using a reliability plot [55]. We divide the forecast range into 14 categories: [0, 1 × 10³),⋯,[1.2 × 10⁴, 1.3 × 10⁴), [1.3 × 10⁴, ∞) (infections per 10⁵ people). For a large number of forecasts, we can calculate the probability of falling into each category Ppred(i), averaged over the full ensemble distribution of multiple forecasts, as well as the actual observed frequency of occurrence in each category Poccur(i). The 14 points (Ppred(i),Poccur(i)) form the reliability plot. A perfect probabilistic forecast satisfies Ppred(i) = Poccur(i) for 1 ≤ i ≤ 14. In the reliability plot, this means all 14 points fall on the diagonal line y = x. Here, we use the deviation of the points from the diagonal line ∑i|Ppred(i) − Poccur(i)| to quantify the forecast quality. Our objective is to minimize the average deviation for lead times of one to four weeks over predictions from -8 to 6 weeks relative to the predicted peak.
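The deviation measure can be computed by pooling ensemble members and observations into the 14 categories. The sketch below is one interpretation of the description, assuming Ppred(i) is the fraction of all ensemble members (across forecasts) in category i and Poccur(i) is the fraction of observations in that category; the names are hypothetical.

```python
import numpy as np

# Interior category boundaries: 1,000, 2,000, ..., 13,000 (infections per 100,000).
INNER_EDGES = np.arange(1000, 14000, 1000)
N_BINS = 14

def category(values):
    """Map incidence values to category indices 0..13 (last bin is open-ended)."""
    return np.digitize(np.asarray(values, dtype=float), INNER_EDGES)

def reliability_deviation(forecast_ensembles, observations):
    """Sum over categories of |P_pred(i) - P_occur(i)|.

    forecast_ensembles : (n_forecasts, n_members) predicted incidence
    observations       : (n_forecasts,) observed incidence
    """
    members = category(forecast_ensembles).ravel()
    p_pred = np.bincount(members, minlength=N_BINS) / members.size
    observed = category(observations)
    p_occur = np.bincount(observed, minlength=N_BINS) / observed.size
    return np.abs(p_pred - p_occur).sum(), p_pred, p_occur

# Tiny hypothetical example: two forecasts with two members each.
demo_forecasts = np.array([[500.0, 1500.0], [500.0, 500.0]])
demo_obs = np.array([500.0, 1500.0])
deviation, p_pred, p_occur = reliability_deviation(demo_forecasts, demo_obs)
```

In this toy case three of four members fall in the first category while only half of the observations do, so the two categories each contribute 0.25 to the deviation.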
We optimized the perturbation intensity using simulated annealing [56] (see details in S1 Text, Fig I). To give a fair evaluation of the perturbation procedure, half of historical outbreaks in 95 US cities during the 2003–2004 through 2013–2014 seasons (excluding the 2008–2009 and 2009–2010 pandemic seasons) were used in the optimization, and the other half were used in out-of-sample validation. The historical outbreaks selected for training and validation are reported in S1 Text (Table A). (The Matlab code and data for retrospective forecast are provided in S1 Code.) To understand the baseline behavior of the SIRS-EAKF system, we display the reliability plots for 1- to 4-week prediction in S1 Text, grouped by the predicted lead to peak. In general, reliability plots have a greater deviation from the diagonal line at predicted leads between 0 and 6 weeks (Figs J-M).
In Fig 6A, the reduction of deviation in the reliability plot is shown for different predicted leads. The deviation in the reliability plot (y-axis) is averaged over 4 targets, i.e., 1- to 4-week predictions. Breakdowns for each target are shown in S1 Text (Figs N-O). Improvement is most pronounced around and after the peak. The inset shows the optimized perturbation intensity k. According to the optimization, perturbations have roughly three phases. 1) From -8 to -5 weeks: errors grow slowly at the early stage of an outbreak, so the ensemble spread needs to be expanded (k > 1). However, since the targets remain low without much variation, this expansion should not be too large. 2) From -4 to -1 week: errors can expand exponentially during the rapid growth of an outbreak. The dynamical expansion alone is enough to generate ensemble spread, and no additional expansion is needed (k ≈ 1). 3) From week 0 onward: the error growth rate becomes lower after the peak, where targets drop quickly from high to low values. A strong expansion is needed to supplement the ensemble spread and capture the large variation in targets.
To validate the perturbation procedure, we ran retrospective forecasts for the rest of the historical outbreaks using the optimized perturbation intensity. Weekly forecasts of incidence during the next one to four weeks were generated. In Fig 6B, we compare the average deviation in the reliability plot for these 4 targets between the baseline (without perturbation) and the perturbed system. Forecasts are improved as in the training data set (Fig 6A), indicating that the optimized perturbation intensity is not over-fitted to the training outbreaks.
We also used the “log score” to assess the forecast accuracy. For each forecast target, the n = 300 ensemble trajectories are grouped into 14 bins as defined before. The fraction of trajectories falling in each bin i is the corresponding predicted weight $w_i$. If the observed target falls in bin h, the log score for a given forecast is defined as the natural logarithm of the weight in bin h: $\ln(w_h)$. If the log score is below -10, we use the floor value of -10. Similar score measures have been used in the US Centers for Disease Control and Prevention's real-time influenza forecast challenge [2,3]. (In S1 Data, we provide the forecast results for the baseline and perturbed EAKF in the format of the influenza forecast challenge.) In Fig 7, we compare the log scores of 1- to 4-week forecasts grouped by predicted lead. Comparison of the log scores obtained from the baseline and perturbed SIRS-EAKF forecasts indicates that the perturbation procedure improves short-term forecast accuracy for historical outbreaks, particularly for forecasts generated near and after the peak, i.e., after -1 week. This improvement, observed for both training and out-of-sample seasons, substantially enhances the forecast quality near the peak, where the prediction task is the most challenging. In S1 Text, we report the 5th, 25th, 50th, 75th and 95th percentiles of log scores at each predicted lead to peak for 1- to 4-week predictions (Table B). In general, the perturbation procedure dramatically improves the 5th percentile scores (i.e., bad predictions) at predicted leads between 0 and 6 weeks.
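The log score described above can be sketched in a few lines of Python. This is an illustrative implementation, not the code in S1 Code: the function name `log_score` and the evenly spaced `bin_edges` are assumptions, since the actual 14 bins are defined earlier in the paper. Values outside the outermost edges are assigned to the first or last bin, which is also an assumption of this sketch.

```python
import math

import numpy as np


def log_score(ensemble, observed, bin_edges, floor=-10.0):
    """Return ln(w_h), floored at `floor`, for one forecast target.

    ensemble  : 1-D array of ensemble predictions for the target
    observed  : observed value of the target
    bin_edges : increasing array of n_bins + 1 edges (hypothetical here)
    """
    ensemble = np.asarray(ensemble, dtype=float)
    edges = np.asarray(bin_edges, dtype=float)
    n_bins = len(edges) - 1

    def to_bin(values):
        # Index of the bin [edges[i], edges[i+1]) containing each value;
        # out-of-range values are clipped into the end bins.
        idx = np.searchsorted(edges, values, side="right") - 1
        return np.clip(idx, 0, n_bins - 1)

    # Predicted weight w_i = fraction of trajectories falling in bin i.
    weights = np.bincount(to_bin(ensemble), minlength=n_bins) / ensemble.size
    w_h = weights[to_bin(observed)]
    if w_h <= 0.0:
        return floor  # ln(0) is undefined, so use the floor value
    return max(math.log(w_h), floor)


# Hypothetical example: 14 unit-width bins and a 300-member ensemble.
edges = np.linspace(0.0, 14.0, 15)
members = np.concatenate([np.full(150, 2.5), np.full(150, 3.5)])
score_hit = log_score(members, 2.7, edges)    # half the members share the bin
score_miss = log_score(members, 10.2, edges)  # no member in the observed bin
```

With n = 300 members, the smallest nonzero weight is 1/300, whose logarithm is about -5.7, so in this sketch the floor of -10 only takes effect when no trajectory falls in the observed bin.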
Discussion
In this work, we show that within a humidity-driven compartmental model used for influenza forecast, the error introduced from initial conditions grows faster than error derived from stochastic fluctuations when these errors are of roughly the same magnitude. For other infectious diseases with lower incidence, however, stochastic effects may play a more crucial role determining the predictability of model dynamics [29–35,57].
In the application of optimal perturbations presented here, we make use of the nonlinear growth of initial error to expand the ensemble spread. This procedure is demonstrated to be effective in enhancing short-term forecast quality by inflating the distribution of ensemble members along the fastest error growth direction. As a consequence, the efficiency of each ensemble member is improved because the perturbed ensemble can explore a larger region of state space. This implies that, for a given level of forecast accuracy, forecast systems with perturbations would require fewer ensemble members. For high-dimensional forecast systems that involve large numbers of localities, such as the system developed in Ref. [6], it should be possible to generate a similar perturbation procedure that reduces ensemble size and thus computational burden.
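To make the idea of inflating the ensemble along one direction concrete, the sketch below stretches the ensemble anomalies along a given unit vector by a factor k while leaving the ensemble mean and the orthogonal components unchanged. This is one simple way to realize such an inflation and is offered as an assumption-laden illustration; the function name `inflate_along` and the example direction are hypothetical, and the exact perturbation scheme used in this study is the one described in the Methods and S1 Code.

```python
import numpy as np


def inflate_along(ensemble, direction, k):
    """Scale ensemble anomalies along `direction` by the factor k.

    ensemble  : array of shape (n_members, n_state)
    direction : array of shape (n_state,), e.g. a fastest-growing direction
    k         : perturbation intensity (k > 1 expands, k = 1 does nothing)
    """
    ensemble = np.asarray(ensemble, dtype=float)
    unit = np.asarray(direction, dtype=float)
    unit = unit / np.linalg.norm(unit)
    anomalies = ensemble - ensemble.mean(axis=0)
    # Component of each member's anomaly along the chosen direction.
    projection = anomalies @ unit
    # Adding (k - 1) times that component multiplies it by k overall.
    return ensemble + (k - 1.0) * np.outer(projection, unit)


# Hypothetical 300-member ensemble with a 3-dimensional state.
rng = np.random.default_rng(0)
ens = rng.normal(size=(300, 3))
direction = np.array([1.0, 1.0, 0.0])
inflated = inflate_along(ens, direction, k=2.0)
```

Because only one direction is stretched, the added spread is concentrated where errors are expected to grow fastest rather than being distributed uniformly over all state variables.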
The mechanistic epidemic model employed here is mis-specified, i.e., it does not represent the full complexity of influenza transmission as it occurs in the real world. For a mis-specified model, initial conditions must be well constrained or error growth will likely deteriorate long-term predictions. If too large, such initial condition error in a mis-specified model will produce unrealistic trajectories that are outside the scope of the real world. (Forecasts generated using a better-specified model also require well-constrained initial conditions; however, the issue of improper initial conditions is more problematic for more grossly mis-specified models, as data assimilation becomes less effective due to the increasing model flaws.) Data assimilation, such as with the EAKF, is a means of partially handling the effects of both model mis-specification and state space error; however, data assimilation methods do not address dynamical error growth. In a recent related work [58], we explored initial condition error growth using a numerical technique, the breeding method, and proposed a method to counteract unrealistic error growth in the SIRS model. We diagnosed the error structure between unobserved variables and observations using the breeding method, and then examined the deviation of the prediction from observations to further constrain the system using that error growth structure [58]. This error correction procedure does not necessarily reduce the spread of ensemble trajectories or variable/parameter distributions, but does in effect calibrate unrealistic trajectories toward realistic regions in the state space under the assumption that the SIRS model can describe the transmission dynamics reasonably well.
Both optimal perturbation and error correction make use of error growth in the dynamical model; however, the two approaches employ different techniques and perceive the role of error growth from opposite perspectives. First, optimal perturbation examines the linearized system in a short time period and uses an analytical singular vector analysis to find the fastest error growth direction; whereas in error correction, the error structure is diagnosed using a numerical method–the breeding method–which fully preserves the nonlinear dynamics. Second, in optimal perturbation, the error growth is beneficial to short-term forecast because it increases the spread of prediction; however, in error correction, the error growth is detrimental for unrealistic trajectories so that it should be counteracted to calibrate those trajectories toward reasonable regions in the state space. The latter error correction improves the forecast of seasonal targets, e.g., peak week, peak intensity and attack rate. A systematic comparison between optimal perturbation and error correction is needed; however, this task is nontrivial and goes beyond the scope of this study.
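The singular vector step mentioned above can be illustrated generically. For a linearized system whose short-time propagator is a matrix M (so that an initial error e evolves to M e), the initial direction with the largest growth in the Euclidean norm is the leading right singular vector of M, and the growth factor is the largest singular value. The sketch below uses a first-order propagator M = I + τJ built from a made-up 2×2 Jacobian J; the matrix values, τ and the function name are assumptions for illustration and are not derived from the SIRS model used in this study.

```python
import numpy as np


def leading_singular_vector(propagator):
    """Return (v1, s1): the fastest-growing initial direction and its growth.

    v1 is the right singular vector of `propagator` associated with the
    largest singular value s1, so ||propagator @ v1|| equals s1.
    """
    _, s, vt = np.linalg.svd(propagator)
    return vt[0], s[0]


# Hypothetical Jacobian of a linearized two-variable system.
jacobian = np.array([[-0.1, 2.0],
                     [0.0, -0.2]])
tau = 1.0
# First-order approximation of the propagator over the interval tau.
propagator = np.eye(2) + tau * jacobian

v1, growth = leading_singular_vector(propagator)
eigenvalues = np.linalg.eigvals(propagator)
```

In this toy case both eigenvalues of the propagator are below 1, yet the leading singular value exceeds 1: a non-normal linearized system can amplify errors over a short window even when every eigenmode decays, which is why singular vectors rather than eigenvectors are used to find the fastest-growing direction.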
The approach presented here does not address model mis-specification but instead uses singular vector analysis to develop optimal perturbations of the ensemble that improve forecast accuracy. The findings indicate that, even for prediction using a simple SIRS model, forecast accuracy can be heavily impacted by factors such as system initialization, ensemble spread, model nonlinearity and error structure. Our challenge going forward is to design operational forecasting systems that optimize and balance all these factors.
Acknowledgments
We acknowledge Sasikiran Kandula for his help in preparing the data.
Data Availability
All relevant data are within the manuscript and its Supporting Information files.
Funding Statement
JS was funded by US NIH (https://www.nih.gov/) grants GM110748 and ES009089. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
1. World Health Organization. Influenza (seasonal). Fact Sheet No. 211. 2009. Available from: http://www.who.int/mediacentre/factsheets/fs211/en/index.html
2. Biggerstaff M, Alper D, Dredze M, Fox S, Fung IC, Hickmann KS, et al. Results from the Centers for Disease Control and Prevention’s Predict the 2013–2014 Influenza Season Challenge. BMC Infect Dis. 2016;16: 357. doi:10.1186/s12879-016-1669-x
3. Biggerstaff M, Johansson M, Alper D, Brooks LC, Chakraborty P, Farrow DC, et al. Results from the second year of a collaborative effort to forecast influenza seasons in the United States. Epidemics. 2018; Forthcoming.
4. Shaman J, Karspeck A. Forecasting seasonal outbreaks of influenza. Proc Natl Acad Sci U S A. 2012;109: 20425–20430. doi:10.1073/pnas.1208772109
5. Shaman J, Karspeck A, Yang W, Tamerius J, Lipsitch M. Real-time influenza forecasts during the 2012–2013 season. Nat Comm. 2013;4: 2837.
6. Pei S, Kandula S, Yang W, Shaman J. Forecasting the spatial transmission of influenza in the United States. Proc Natl Acad Sci U S A. 2018;115: 2752–2757. doi:10.1073/pnas.1708856115
7. Tizzoni M, Bajardi P, Poletto C, Ramasco JJ, Balcan D, Goncalves B, et al. Real-time numerical forecast of global epidemic spreading: case study of 2009 A/H1N1pdm. BMC Med. 2012;10: 165. doi:10.1186/1741-7015-10-165
8. Farrow DC, Brooks LC, Hyun S, Tibshirani RJ, Burke DS, Rosenfeld R. A human judgment approach to epidemiological forecasting. PLoS Comput Biol. 2017;13: e1005248. doi:10.1371/journal.pcbi.1005248
9. Du X, King AA, Woods RJ, Pascual M. Evolution-informed forecasting of seasonal influenza A (H3N2). Sci Transl Med. 2017;9: eaan5325. doi:10.1126/scitranslmed.aan5325
10. Osthus D, Hickmann KS, Caragea PC, Higdon D, Del Valle SY. Forecasting seasonal influenza with a state-space SIR model. Ann Appl Stat. 2017;11: 202–224. doi:10.1214/16-AOAS1000
11. Ray EL, Reich NG. Prediction of infectious disease epidemics via weighted density ensembles. PLoS Comput Biol. 2018;14: e1005910. doi:10.1371/journal.pcbi.1005910
12. Lighthill J. The recently recognized failure of predictability in Newtonian dynamics. Proc R Soc A. 1986;407: 35–60.
13. Karspeck AR, Kaplan A, Cane MA. Predictability loss in an intermediate ENSO model due to initial error and atmospheric noise. J Climate. 2006;19: 3572–3588.
14. Palmer TN. Predicting uncertainty in forecasts of weather and climate. Rep Prog Phys. 2000;63: 71–116.
15. Palmer TN, Shutts GJ, Hagedorn R, Doblas-Reyes FJ, Jung T, Leutbecher M. Representing model uncertainty in weather and climate prediction. Annu Rev Earth Planet Sci. 2005;33: 163–93.
16. Xue Y, Cane MA, Zebiak SE. Predictability of a coupled model of ENSO using singular vector analysis. Part I: Optimal growth in seasonal background and ENSO cycles. Mon Weather Rev. 1997;125: 2043–2056.
17. Xue Y, Cane MA, Zebiak SE, Palmer TN. Predictability of a coupled model of ENSO using singular vector analysis. Part II: Optimal growth and forecast skill. Mon Weather Rev. 1997;125: 2057–2073.
18. Hawkins E, Sutton R. Decadal predictability of the Atlantic Ocean in a coupled GCM: Forecast skill and optimal perturbations using linear inverse modeling. J Climate. 2009;22: 3960–3978.
19. Tziperman E, Zanna L, Penland C. Nonnormal thermohaline circulation dynamics in a coupled ocean–atmosphere GCM. J Phys Oceanogr. 2008;38: 588–604.
20. Buizza R, Palmer TN. The singular-vector structure of the atmospheric global circulation. J Atmospheric Sci. 1995;52: 1434–1456.
21. Molteni F, Buizza R, Palmer TN, Petroliagis T. The ECMWF ensemble prediction system: Methodology and validation. Q J R Meteorol Soc. 1996;122: 73–119.
22. Toth Z, Kalnay E. Ensemble forecasting at NMC: The generation of perturbations. B Am Meteorol Soc. 1993;74: 2317–2330.
23. Toth Z, Kalnay E. Ensemble forecasting at NCEP and the breeding method. Mon Weather Rev. 1997;125: 3297–3319.
24. Ginsberg J, Mohebbi MH, Patel RS, Brammer L, Smolinski MS, Brilliant L. Detecting influenza epidemics using search engine query data. Nature. 2009;457: 1012–1014. doi:10.1038/nature07634
25. Goldstein E, Viboud C, Charu V, Lipsitch M. Improving the estimation of influenza-related mortality over a seasonal baseline. Epidemiology. 2012;23: 829–838. doi:10.1097/EDE.0b013e31826c2dda
26. Shaman J, Kohn M. Absolute humidity modulates influenza survival, transmission, and seasonality. Proc Natl Acad Sci U S A. 2009;106: 3243–3248. doi:10.1073/pnas.0806852106
27. Shaman J, Pitzer VE, Viboud C, Grenfell BT, Lipsitch M. Absolute humidity and the seasonal onset of influenza in the continental United States. PLoS Biol. 2010;8: e1000316. doi:10.1371/journal.pbio.1000316
28. Cosgrove BA, Lohmann D, Mitchell KE, Houser PR, Wood EF, Schaake JC, et al. Real-time and retrospective forcing in the North American Land Data Assimilation System (NLDAS) project. J Geophys Res. 2003;108: 8842.
29. Andersson H, Britton T. Stochastic epidemic models and their statistical analysis. New York: Springer Science & Business Media; 2012.
30. Bretó C, He D, Ionides EL, King AA. Time series analysis via mechanistic models. Ann Appl Stat. 2009;3: 319–348.
31. He D, Ionides EL, King AA. Plug-and-play inference for disease dynamics: measles in large and small populations as a case study. J R Soc Interface. 2010;7: 271–283. doi:10.1098/rsif.2009.0151
32. Bjørnstad ON, Grenfell BT. Noisy clockwork: time series analysis of population fluctuations in animals. Science. 2001;293: 638–643. doi:10.1126/science.1062226
33. Finkenstädt BF, Grenfell BT. Time series modelling of childhood diseases: a dynamical systems approach. J Royal Stat Soc C. 2000;49: 187–205.
34. Bjørnstad ON, Finkenstädt BF, Grenfell BT. Dynamics of measles epidemics: estimating scaling of transmission rates using a time series SIR model. Ecol Monogr. 2002;72: 169–184.
35. Grenfell BT, Bjørnstad ON, Finkenstädt BF. Dynamics of measles epidemics: scaling noise, determinism, and predictability with the TSIR model. Ecol Monogr. 2002;72: 185–202.
36. McKay MD, Beckman RJ, Conover WJ. Comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics. 1979;21: 239–245.
37. Anderson JL. An ensemble adjustment Kalman filter for data assimilation. Mon Weather Rev. 2001;129: 2884–2903.
38. Yang W, Lipsitch M, Shaman J. Inference of seasonal and pandemic influenza transmission dynamics. Proc Natl Acad Sci U S A. 2015;112: 2723–2728. doi:10.1073/pnas.1415012112
39. Yang W, Karspeck A, Shaman J. Comparison of filtering methods for the modeling and retrospective forecasting of influenza epidemics. PLoS Comput Biol. 2014;10: e1003583. doi:10.1371/journal.pcbi.1003583
40. Kandula S, Yamana T, Pei S, Yang W, Morita H, Shaman J. Evaluation of mechanistic and statistical methods in forecasting influenza-like illness. J R Soc Interface. 2018;15: 20180174. doi:10.1098/rsif.2018.0174
41. DeFelice NB, Little E, Campbell SR, Shaman J. Ensemble forecast of human West Nile virus cases and mosquito infection rates. Nat Comm. 2017;8: 14592.
42. DeFelice NB, Schneider ZD, Little E, Barker C, Caillouet KA, et al. Use of temperature to improve West Nile virus forecasts. PLoS Comput Biol. 2018;14: e1006047. doi:10.1371/journal.pcbi.1006047
43. Yamana TK, Kandula S, Shaman J. Superensemble forecasts of dengue outbreaks. J R Soc Interface. 2016;13: 20160410. doi:10.1098/rsif.2016.0410
44. Reis J, Shaman J. Retrospective parameter estimation and forecast of respiratory syncytial virus in the United States. PLoS Comput Biol. 2016;12: e1005133. doi:10.1371/journal.pcbi.1005133
45. Yang W, Zhang W, Kargbo D, Yang R, Chen Y, Chen Z, et al. Transmission network of the 2014–2015 Ebola epidemic in Sierra Leone. J R Soc Interface. 2015;12: 20150536. doi:10.1098/rsif.2015.0536
46. Pei S, Morone F, Liljeros F, Makse HA, Shaman J. Inference and control of the nosocomial transmission of Methicillin-resistant Staphylococcus aureus. eLife. 2019;7: e40977.
47. Kleeman R. Measuring dynamical prediction utility using relative entropy. J Atmos Sci. 2002;59: 2057–2072.
48. Kleeman R, Moore AM. A new method for determining the reliability of dynamical ENSO predictions. Mon Weather Rev. 1999;127: 694–705.
49. Palmer TN, Gelaro R, Barkmeijer J, Buizza R. Singular vectors, metrics, and adaptive observations. J Atmos Sci. 1998;55: 633–653.
50. Buizza R, Tribbia J, Molteni F, Palmer T. Computation of optimal unstable structures for a numerical weather prediction model. Tellus A. 1993;45: 388–407.
51. Hamill TM, Snyder C, Morss RE. A comparison of probabilistic forecasts from bred, singular-vector, and perturbed observation ensembles. Mon Weather Rev. 2000;128: 1835–1851.
52. Nicolis C. Dynamics of model error: Some generic features. J Atmos Sci. 2003;60: 2208–2218.
53. Benettin G, Galgani L, Giorgilli A, Strelcyn JM. Lyapunov characteristic exponents for smooth dynamical systems and for Hamiltonian systems; a method for computing all of them. Part 1: Theory. Meccanica. 1980;15: 9–20.
54. Benettin G, Galgani L, Giorgilli A, Strelcyn JM. Lyapunov characteristic exponents for smooth dynamical systems and for Hamiltonian systems; a method for computing all of them. Part 2: Numerical application. Meccanica. 1980;15: 21–30.
55. The International Research Institute for Climate and Society. Descriptions of the IRI Climate Forecast Verification Scores. Available from: https://iri.columbia.edu/wp-content/uploads/2013/07/scoredescriptions.pdf
56. Kirkpatrick S, Gelatt CD, Vecchi MP. Optimization by simulated annealing. Science. 1983;220: 671–680. doi:10.1126/science.220.4598.671
57. Ellner SP, Turchin P. When can noise induce chaos and why does it matter: a critique. Oikos. 2005;111: 620–631.
58. Pei S, Shaman J. Counteracting structural errors in ensemble forecast of influenza outbreaks. Nat Comm. 2017;8: 925.