Abstract
The 2009 H1N1 influenza pandemic provides a unique opportunity for detailed examination of the spatial dynamics of an emerging pathogen. In the US, the pandemic was characterized by substantial geographical heterogeneity: the 2009 spring wave was limited mainly to northeastern cities while the larger fall wave affected the whole country. Here we use finely resolved spatial and temporal influenza disease data based on electronic medical claims to explore the spread of the fall pandemic wave across 271 US cities and associated suburban areas. We document a clear spatial pattern in the timing of onset of the fall wave, starting in southeastern cities and spreading outwards over a period of three months. We use mechanistic models to tease apart the external factors associated with the timing of the fall wave arrival: differential seeding events linked to demographic factors, school opening dates, absolute humidity, prior immunity from the spring wave, spatial diffusion, and their interactions. Although the onset of the fall wave was correlated with school openings as previously reported, models including spatial spread alone resulted in better fit. The best model had a combination of the two. Absolute humidity or prior exposure during the spring wave did not improve the fit and population size only played a weak role. In conclusion, the protracted spread of pandemic influenza in fall 2009 in the US was dominated by short-distance spatial spread partially catalysed by school openings rather than long-distance transmission events. This is in contrast to the rapid hierarchical transmission patterns previously described for seasonal influenza. The findings underline the critical role that school-age children play in facilitating the geographic spread of pandemic influenza and highlight the need for further information on the movement and mixing patterns of this age group.
Author Summary
The determinants of influenza spatial spread are not fully understood, in part due to the insufficient geographic resolution of incidence data. We address this using a fine-grained private sector electronic health database of insurance claims data from health encounters in the US during 2009. We used physician diagnosis codes to generate a dataset of the weekly number of office visits with diagnosed influenza-like illness for 271 US locations. Applying statistical and mathematical models to these disease data, we find that the main autumn wave of the 2009 pandemic in the US was remarkably spatially structured. Its onset in the South Eastern US precipitated a slow radial spread that took 3 months to diffuse across the country. These patterns were replicated by models that included short-distance spatial transmission between nearby locations and increased transmission rates when school was in session. Our results contrast with previous modelling studies that indicated that environmental factors, population sizes, and long-distance transmission events (air traffic) are major determinants in disease spread. We conclude that the 2009 pandemic autumn wave spread slowly because transmissibility of the influenza virus was relatively low and children (who travel long distances far less than adults) were the predominant sources of infection.
Introduction
Understanding the spatio-temporal spread of infectious disease is important both for design of control strategies and to deepen fundamental knowledge about the interaction between epidemic dynamics and spatial mixing of the host population. Dynamic models and statistical analyses have provided key insights into the spread of a number of acute, directly transmitted infections of humans, including measles, rotavirus, dengue, pertussis, and seasonal and pandemic influenza [1], [2], [3], [4], [5], [6], [7], [8], [9], [10]. A unifying feature of these analyses is the interaction of coupling between populations (often expressed in terms of ‘gravity’ or ‘radiation’ models for hierarchical spatial spread, [1], [2], [3], [5], [11], [12]) and demographic or environmental factors modulating transmission, in particular the seasonal aggregation of children in schools [13], [14], [15], [16], [17], or seasonal variation in humidity [18], [19].
Previous efforts have sought to forecast the likely spatial spread of pandemic influenza with model simulations accounting for intricate host demography and mixing data [10], [20]. However, a lack of finely resolved epidemiological data complicates validation and testing of such models. Analysis of long-term influenza-related mortality time series has highlighted the role of daily work commute as a driver of the regional spread of seasonal influenza in the US [3]. While mortality records were useful to explore the spatial transmission of the devastating 1918 pandemic in the US and UK [5], such data typically lack power to investigate disease patterns in small geographical areas or during more recent and milder seasons. However, increased disease surveillance and data availability in the context of the 2009 A/H1N1pdm09 pandemic provides a unique opportunity to explore the spatial spread of influenza in more detail, identify further data gaps, and validate existing models and theory. Here we used a rich dataset of influenza-like-illness records compiled from electronic medical claims and covering about 50% of outpatient physician visits in 2009 across the US to study influenza spread with an unprecedented level of detail. These electronic claims data have only recently been used for public health purposes, in particular to investigate the reduction in diarrhoea outpatient visits associated with Rotavirus vaccine introduction [21].
The 2009 pandemic spread rapidly across the world, soon after the putative emergence of the pandemic virus in Mexico [22]. The earliest laboratory-confirmed cases of pandemic influenza infection were reported in April 2009 in the South Western US. Subsequently, some cities, such as New York, Boston and Milwaukee, experienced intense community transmission in spring and summer [23], [24]. For most of the country, however, there was no widespread outbreak until the autumn of 2009, when most pandemic-related deaths occurred [23]. Recent work has suggested that the start of the school fall term was associated with the onset of the fall pandemic wave in different US states [15], while reactive school closure in the spring reduced influenza transmission in Hong Kong and Mexico [25], [26]. Another candidate driver of pandemic spread is low absolute humidity, which according to experimental and epidemiological studies may favour the transmission of influenza [18], [19], [27].
To determine the relative contributions of population movements, demographics, school openings, prior immunity, and environmental factors to pandemic spread, we fit a series of mechanistic models to our highly resolved US influenza surveillance datasets [28]. To track pandemic activity, we compile weekly epidemic indicators of the number of influenza-like illness (ILI) patients stratified by zip code, providing disease information in 271 administrative areas, covering more than 90% of the US population in the 48 contiguous states (Figure 1, top panel). We focus on the dynamics of the fall wave of the 2009 H1N1pdm09 pandemic, as all sites experienced a clearly defined pandemic onset between July and November 2009.
Results
Our analysis begins with a simple descriptive analysis of observed spatial autumn 2009 pandemic patterns and correlations with putative drivers. Armed with these empirical results, we construct a series of mechanistic epidemiological models to determine the importance of different processes for pandemic spread.
Descriptive analysis
Our spatial analysis relies on the estimation of pandemic onset dates, which are based on the date when ILI incidence exceeded a seasonal threshold during summer-autumn 2009 [29], [30] (as most onset dates occurred in autumn, we refer to this pandemic wave as the “autumn wave” for the sake of simplicity; see methods for details). We disregard receipt of pandemic influenza vaccine as nearly all doses were administered after the onset of the autumn wave [31].
Onset dates range from 26th July to 1st November 2009 in the 271 locations, with a clear spatial patterning starting in the South East US and spreading in all directions within around three months (Figure 1 and Supplementary Movie). Visually, the hub of the South Eastern spread is in Alabama or Georgia, and Dothan, Alabama had the earliest onset in these states (see also Figure S1). We correlate estimates of onset dates with four different putative drivers of spatial transmission: date of school term start [15], great circle distance from Dothan, distance on the nearest neighbour network from Dothan (see Figure 1b), and absolute humidity indicators (considering both raw values and anomalies in days 7–10 prior to pandemic onset, as in past work [18], [19]). Autumn pandemic onset is highly correlated with distance metrics and school starts (correlation coefficients 0.35–0.72, P<0.0001; Figure 2a–c) and moderately correlated with absolute humidity (coefficient −0.63, P<0.0001 for raw AH, and 0.22, P = 0.001 for anomalies; Figure 2d–e). Outliers in this correlation analysis may indicate a second important seeding event in California; hence correlations with distance are even stronger if restricted to the Eastern US (Figure 2 red points, correlation coefficient 0.91 and 0.92, P<0.0001). As AH in each location is generally decreasing through the autumn, the correlation between onset and AH at onset must be treated with some caution. However, there is more signal here than can be explained just by the general temporal trend in AH: the correlation coefficient (−0.63) is stronger than that obtained from 10,000 random permutations of onset dates between locations.
Partial correlations were computed for each combination of predictors (Table 1). For the residuals from regression with geographic or network distance, weak but significant correlations were found with absolute humidity (coefficient −0.26 and −0.27, p<10^−4) and schools (0.16, p = 0.02). For the residuals from school openings and both humidity measures, any of the other variables gave a typically moderate to high correlation (range of coefficients, 0.16–0.90, P<0.02). This finding suggests that a purely spatial process may dominate in explaining the timing of the autumn wave, perhaps modulated by environmental and school-related factors. However, analysis of a more mechanistic epidemiological model is required to distinguish the relative contributions and interactions of these and other potential drivers.
Table 1. Partial correlations of putative factors affecting the onset of influenza autumn 2009 pandemic wave in 176 Eastern US locations.
Second variable | Geographic Distance | Network Distance | School Opening | Absolute Humidity | Anomalous Humidity |
Geographic Distance | - | 0.04 (p = 0.30) | 0.48 (p<10^−4) | 0.47 (p<10^−4) | 0.90 (p<10^−4) |
Network Distance | 0.09 (p = 0.12) | - | 0.49 (p<10^−4) | 0.49 (p<10^−4) | 0.91 (p<10^−4) |
School Opening | 0.16 (p = 0.02) | 0.15 (p = 0.03) | - | 0.51 (p<10^−4) | 0.78 (p<10^−4) |
Absolute Humidity | −0.27 (p<10^−4) | −0.26 (p<10^−3) | −0.57 (p<10^−4) | - | −0.85 (p<10^−4) |
Anomalous Humidity | 0.04 (p = 0.30) | 0.08 (p = 0.15) | 0.16 (p = 0.02) | 0.48 (p<10^−4) | - |
For each of the five variables in the first row (geographic distance, network distance, school opening time, absolute humidity, humidity anomalies), residuals are computed from a linear regression of the influenza onset timings against that variable, for locations in the East of the US. This table gives the correlation between these residuals and a second variable, listed in the first column.
For the residuals from regression with geographic or network distance (first two columns), weak correlation is found with absolute humidity (p<10^−4) and schools (p = 0.02). For the residuals from school openings and both humidity measures (last three columns), any of the other variables give a significant correlation (p = 0.02 for one combination and p<10^−4 for the other 11).
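The residual-based correlations described in the legend can be illustrated with a minimal sketch. It assumes ordinary least squares with a single predictor and a Pearson correlation; the function names and the synthetic data are hypothetical and are not taken from the study.

```python
import random
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def residuals(y, x):
    """Residuals of the least-squares line y ~ a + b*x."""
    mx, my = mean(x), mean(y)
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    intercept = my - slope * mx
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

def residual_correlation(onset, first, second):
    """Regress onset on `first`, then correlate the residuals with `second`."""
    return pearson(residuals(onset, first), second)

# Synthetic illustration only (not the study data): onset depends on both
# distance and school opening, plus noise.
rng = random.Random(0)
distance = [rng.uniform(0, 2000) for _ in range(176)]
school = [rng.uniform(0, 30) for _ in range(176)]
onset = [0.03 * d + 0.5 * s + rng.gauss(0, 3) for d, s in zip(distance, school)]

r_school_given_distance = residual_correlation(onset, distance, school)
r_distance_given_distance = residual_correlation(onset, distance, distance)
```

By construction, the residuals are uncorrelated with the variable they were regressed on, so any remaining correlation with a second variable reflects signal that the first variable does not explain.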
Mechanistic epidemiological models
We build a simple spatial model for the spread of influenza, inspired by previous work on the 1918 pandemic [5] (see methods for full details). Briefly, treating each of 271 locations in the US as the statistical units, a maximum likelihood approach is used to fit the observed pandemic onset dates. The parametric model of the force of infection, the rate of outbreak initiation for each location, includes the contribution of both local and long-distance transmission. The outbreak in each location can be sparked by transmission from another nearby location: this contribution to the force of infection is modelled using a power law kernel driven by population size and distance (hereafter referred to as the gravity model) [1], [3], [5], [12]. Alternative spatial kernels based on different model formulations or distance metrics were also explored, including Gaussian kernel and grid distance (methods). Further, we introduced a normalization parameter that quantifies how connectivity may depend on the number and size of neighbouring populations, following [5], akin to the difference between density-dependent and density-independent transmission [32]. In addition to short-range disease transmission, a term was included to account for the background probability of an outbreak spark (hereafter referred to as external seeding), which could be seeded by imported infections from distant locations (domestically or internationally) or even a low-level persistent local chain of infection that survived the summer. Both external seeding and local transmission were also allowed to depend on whether or not schools were in session and also to scale according to population sizes to some power. The force of infection was also allowed to be modulated by previous immunity to pandemic A/H1N1pdm09 (as measured by indicators of the intensity of the spring 2009 pandemic wave) and absolute humidity. 
We evaluate every possible combination of these factors to explain pandemic onset dates using the corrected Akaike information criterion (AICc).
Figure 3 shows the relative importance of each factor and their interactions. Irrespective of the spatial kernel used, introducing terms for schools and/or spatial spread results in a significant improvement of fit over models exclusively considering external seeding. Models including spatial spread alone result in better fit than models purely driven by schools terms. For both the gravity and Gaussian models, the normalisation described above consistently improves the fit, whereas it is less important for the grid models. Of the spatial kernels tested, the gravity model offers the best fit (AICc more than 30 lower than other kernels, see Table S1). In line with a strong distance effect evidenced in the correlation analysis, the distance exponent of the best-fit gravity model was high (2.6, See Table S2) while the population size exponent was low (0.27).
Overall, the best model includes terms for spatial transmission, the effect of schools on spatial transmission, and a baseline external seeding rate (see Table 2). The external seeding rate was small but not zero (Table S2), and only appears to have played any significant role in a small number of locations (Figure S1). Neither absolute humidity nor prior exposure to influenza during the spring wave were part of the best model. Taken together, these results support a scenario where autumn pandemic onset is determined by a local mode of spatial spread, which is density-independent and enhanced by schools being in session.
Table 2. Parsimony of model fits to the autumn 2009 pandemic onset timing: best AICc for each model category.
Model Category | Log Likelihood | Number of parameters | AICc | ΔAICc |
EXT | −1106.65 | 2 | 2217.35 | 879.39 |
EXT+AH | −1094.77 | 3 | 2195.63 | 857.67 |
EXT+SCH | −860.77 | 3 | 1727.64 | 389.68 |
EXT+SCH+AH | −859.12 | 4 | 1726.40 | 388.44 |
EXT+SP | −675.06 | 5 | 1360.34 | 22.38 |
EXT+SP+AH | −675.04 | 6 | 1362.04 | 24.08 |
EXT+SP+SCH | −662.82 | 6 | 1337.96 | 0 |
EXT+SP+SCH+AH | −662.16 | 7 | 1338.75 | 0.79 |
Each row corresponds to the best fit of the most parsimonious model in a given category. The categories are the eight that include EXT (external seeding) plus all possible combinations of AH (absolute humidity), SCH (effect of schools) and SP (spatial transmission). The row in bold (EXT+SP+SCH, ΔAICc = 0) indicates the most parsimonious model, as determined by AICc, and ΔAICc gives the difference from the AICc of this model. For each of the categories including SP, the most parsimonious model used the gravity kernel. Table S1 gives an extended version of this table with the different spatial kernels tested separately.
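As an illustration of how the AICc column relates to the log likelihood and parameter count, the sketch below applies the standard small-sample correction, AICc = 2k − 2 ln L + 2k(k+1)/(n − k − 1). It assumes the sample size n is the 271 locations, which the table does not state explicitly, and it uses only a subset of the rows; the variable names are hypothetical.

```python
def aicc(log_lik, k, n):
    """Corrected Akaike information criterion for k parameters and n observations."""
    aic = 2 * k - 2 * log_lik
    return aic + 2 * k * (k + 1) / (n - k - 1)

N_LOCATIONS = 271  # assumed sample size: one onset date per location

# (log likelihood, number of parameters) for a few rows of Table 2
rows = {
    "EXT": (-1106.65, 2),
    "EXT+SCH": (-860.77, 3),
    "EXT+SP": (-675.06, 5),
    "EXT+SP+SCH": (-662.82, 6),
    "EXT+SP+SCH+AH": (-662.16, 7),
}

scores = {name: aicc(ll, k, N_LOCATIONS) for name, (ll, k) in rows.items()}
best = min(scores, key=scores.get)  # lowest AICc is the most parsimonious
delta = {name: score - scores[best] for name, score in scores.items()}

for name in rows:
    print(f"{name:15s} AICc = {scores[name]:8.2f}  dAICc = {delta[name]:7.2f}")
```

Under these assumptions the computed values agree with the tabulated AICc for these rows to within rounding of the log likelihoods.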
One-step ahead predictions (Figure S2) and full simulations (Figure S3) confirmed that the best model could broadly reproduce the observed spatial dynamics of autumn pandemic wave onsets. However the model predicts later onsets than those observed in California's Central Valley, again suggesting multiple seeding events (Figure S1). Simulating the effect of setting schools to be permanently closed, the general spatial structure of the wave was similar to observed, but spread was substantially slower (Figure S3).
Discussion
To our knowledge, this is the most detailed analysis to date of the local and regional dynamics of influenza pandemic spread, made possible by the availability of rich electronic disease datasets maintained in the private sector. Our analysis shows that the spread of the A/H1N1pdm09 pandemic during autumn 2009 in the US was highly spatially structured, with a clear wave originating in the South Eastern region of the country and slowly spreading outwards over a 3-month period. Variation in school openings alone cannot explain the observed fine-grained variations in pandemic onset across the US, but school opening does exert a significant effect on the spread of the pandemic, consistent with past research [15], [25], [26], [33].
It is remarkable that the main 2009 pandemic wave, set in an era of intense air traffic and regional ground transportation, showed such a short-range mode of spread – so local that observed outbreak patterns conflicted with the usual model of rapid transmission between distant major cities followed by spread to less populated areas [3], [10], [20]. This intriguing picture of mainly local spread could be due to a combination of two factors: the relatively low transmissibility of the 2009 A/H1N1pdm09 virus [34] and the importance of children in pandemic spread [35]. In turn, both of these factors could be consequences of a strong build-up of anti H1N1pdm09 immunity in older cohorts due to earlier exposure to related viruses [25], [36]. The global transportation network likely played a significant role in the initial spread of the pandemic virus in spring 2009 [37], [38], [39].
Analysis of long-term mortality data indicates that the regional spread of seasonal influenza is driven by longer-range commuter-driven movements of adults with strong dependence on population sizes of recipient and donor locations [3]. Previous models of pandemic spread make similar assumptions [10], [20]; however, our analysis of detailed local morbidity data suggests that the travel patterns of school-age children may be a major factor explaining the spread of the 2009 autumn pandemic wave in the US. While intuitively one might expect that movements of children are typically shorter-range and revolve around home and school, limited information exists on contact rates in this age group. The 2009 experience underlines the urgency for improved understanding of the dynamics of epidemiologically-relevant spatial and social mixing in children.
The relatively modest transmissibility of the A/H1N1pdm09 virus, with an effective reproduction ratio estimated at around 1.5 [34], might also explain why long range travel was a lesser determinant of the spread of the pandemic. With a low reproduction ratio, occasional long-range imports of infection may die out after a small number of generations of transmission, and hence simply fail to “take”. In contrast, a large outbreak in a proximate community will result in repeated infection challenges, and inevitably a successful chain of infection will commence. Intriguingly, the effective reproduction number of seasonal influenza is typically lower than that of the A/H1N1pdm09 virus, and hence we would expect an even more localized and slower spread for seasonal outbreaks than for the autumn 2009 pandemic. Unfortunately, no epidemiological data at a comparable level of spatial detail is available for comparison. Further, as hypothesized in earlier work, the transmission patterns of seasonal influenza epidemics may not be predictive of pandemic patterns, due to differences in outbreak timing and age distribution of infection [35], [40], [41], [42]. Understanding the relative contribution of virus transmissibility, seasonality, and mean age at infection on the spatial dynamics of influenza is an interesting area for future work.
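The argument that isolated imports often fail to "take" at a low reproduction ratio can be made concrete with a simple branching-process sketch. This is an illustrative assumption rather than part of the fitted model: each case is taken to infect a Poisson-distributed number of others with mean R, so the probability q that a single introduction dies out is the smallest solution of q = exp(R(q − 1)). The function names are hypothetical.

```python
import math

def extinction_probability(r, tol=1e-12, max_iter=10_000):
    """Smallest root of q = exp(r * (q - 1)), found by fixed-point iteration from 0."""
    q = 0.0
    for _ in range(max_iter):
        q_next = math.exp(r * (q - 1.0))
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q

def prob_no_takeoff(r, imports):
    """Probability that every one of `imports` independent introductions dies out."""
    return extinction_probability(r) ** imports

q_single = extinction_probability(1.5)     # one long-range import, R = 1.5
q_repeated = prob_no_takeoff(1.5, 10)      # ten challenges from a nearby outbreak
print(f"single import dies out with probability {q_single:.3f}")
print(f"ten imports all die out with probability {q_repeated:.5f}")
```

Under this assumption a single import at R = 1.5 dies out a little over 40% of the time, whereas repeated challenges from a neighbouring outbreak make a successful chain of infection very likely.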
Another surprising feature of the 2009 autumn pandemic wave was the late arrival in large northeastern cities – regardless of whether these cities had experienced an early summer wave. For instance, the five boroughs of New York City suffered a major spring pandemic outbreak and were particularly late in experiencing an autumn outbreak, in contrast to the less densely populated cities upstate. This implies that chains of influenza transmission from the spring wave did not persist over the summer in most places, consistent with phylogeographic analysis of A/H1N1pdm09 viruses suggesting reintroduction of a single dominant viral lineage in the autumn in the US [24]. This phylogenetic pattern is not repeated in all countries, for example in Scotland [43], and synthesising the evolutionary and epidemiological observations for influenza spatial transmission is proving challenging [44]. It would therefore be interesting to test whether other countries also experienced slow and localized pandemic transmission and how that correlates with the corresponding observed evolutionary patterns.
Although our study is the first to investigate influenza spread at such a high level of spatial resolution and over such a broad geographic area, our findings may be compared with those of an earlier study of the 1918 pandemic in the US, England and Wales by Eggo et al. [5], made using a similar modelling framework. Interestingly, both studies suggest that transmission should be normalised by a weighted sum of all populations, meaning that transmission is nearly density-independent. However the fitted spatial kernels differ between the studies: in the study of the 1918 datasets [5] spatial transmission scales approximately as distance to the power −1, whereas in this 2009 study the distance exponent is around −2.6, implying much sharper localised transmission. Differences in spatial resolution of the data available and transmissibility of the 1918 and 2009 pandemic virus may explain these conflicting results. Clearly, more high resolution analyses of influenza disease spread are needed, in a variety of geographic settings, before the spatial transmission of pandemic influenza can be accurately predicted.
Several caveats are worth noting in our study. Here we have developed time series data of influenza-like illness coded by physicians (as a proxy of H1N1pdm09 activity), but these patients were not generally laboratory-confirmed. The contribution of other respiratory pathogens to influenza-like illness diagnoses is likely small in all age groups given the unusual timing of the pandemic outbreak during autumn when other important respiratory viruses – especially respiratory syncytial virus – are typically not epidemic. Further, because of its unusual timing, the onset of the pandemic was relatively easy to identify at a fine spatial and temporal resolution, given low background of respiratory diagnoses.
Our analysis was limited to the 2009 pandemic autumn wave period, and it would be interesting to model the spread of seasonal influenza epidemics at the scale studied here, although outbreak onset dates would be more difficult to identify. Additionally, we did not explore the spatial patterning of the 2009 spring wave, because its presence in the US was much more sporadic than the autumn wave and onset dates could be obscured by changes in health-seeking behaviour as people become aware of the new pandemic. Further, even though the 271 locations in our study represent 90% of the US population, we had to exclude cities with less than 200,000 inhabitants due to demographic noise. Most of these cities are located in the central US, a less well-connected and potentially interesting region that was not considered here. In addition, we used school term data at the state level [15], rather than at the county or city-level, given that detailed school data are not publicly available in most states. Further, our models do not integrate any age information, although analysis of age-stratified disease incidence time series revealed very similar patterns of pandemic onsets in the 271 US locations (not shown). Finally, we did not consider radiation models of spatial flux [11]: these are unlikely to add significantly to the present analysis as the picture of sequential outbreak onset is so clear already, and a normalisation factor has been included in the gravity formulation to account for the varying population density across the US. However it would be interesting to test the relative performance of radiation and gravity models on a finer grained influenza data set, particularly if a matching resolution of school data were available.
Overall, our results are robust: they do not depend on the exact formulation of the spatial model nor the definition of epidemiologic indicators. Our conclusions highlight the role of local transmission in the spread of the major autumn 2009 pandemic wave. This work highlights the importance of testing model predictions against detailed empirical disease data and suggests that fine-scale transmission models should take these results into account for simulation of future pandemic outbreaks. As ever, a synthesis of models, demographic, viral sequence data, environmental and movement data with multiple incidence data sets collected by different means would offer a particularly powerful way forward to understand infectious disease dynamics and improve preparedness for outbreaks of novel respiratory pathogens.
Methods
Data source
Weekly time series of outpatient visits for influenza-like-illness and total visits were compiled from the visit-level database of CMS-1500 medical claims data maintained by IMS Health, which captures a convenience sample of about half of all physician visits in the US. We first developed and employed a case definition of influenza-like illness (ILI) as any mention of a diagnostic code for influenza (ICD 487x-488x) OR [fever and (sore throat or cough) (ICD 780.6 and (462 or 786.2))] OR febrile viral illness (ICD 079.99). Most of the cases were coded as ICD9 = 079.99 rather than the influenza-specific codes 487–488. This probably reflects the fact that only a few doctors' offices used rapid testing for H1N1pdm09 influenza, on advice from the CDC and WHO to allow the laboratories to focus supplies and effort on severe cases only [45].
We extracted the weekly number of visits that met the ILI definition and also total number of all visits, stratified by 3-digit zip code of the physician's office. The IMS database covered 906 three-digit zip codes in the continental US during the 2009 pandemic period. The resulting syndromic case definition was validated against CDC's ILI surveillance system at the national and HHS regional level for seasonal and pandemic seasons; furthermore the ILI time series displayed known geographical heterogeneities, in particular large early summer waves in Northern cities like New York City but an absence of such patterns in upstate New York and the South [46].
Standardization of ILI data
The three-digit zip codes were aggregated into 449 “sectional center facility” (SCF) areas as defined by the United States Postal Service, to make geographically meaningful population divisions. The case definition was sensitive enough to yield a large number of weekly cases in most SCFs year-round; however, both coverage and reporting rate may vary by location and time. To generate stable time series, we used the ratio of ILI to total number of visits. We restricted the analysis to the continental US, to SCFs with populations of more than 200,000, and to SCFs with more than 250 ILI cases reported in 2009. This reduced the total number of SCFs available for analysis to 271 but still accounted for over 90% of the US population. These SCFs are shown in Figure 1 (top panel), and we refer to these as “locations” in the main text.
Geographic data
Population numbers were determined from US Census 2000 data, and zip codes were weighted by population size to determine SCF centres. The eastern US was defined as HHS regions 1–5 [47]. The neighbour network (also called ‘grid model’) was constructed by joining each location to its four nearest neighbours and allowing all links to be reciprocal. The median school opening date was used for each state, and methods for collecting these data are given in Chao et al. 2010 [15]. The absolute humidity data were daily 2 m above-ground specific humidity conditions compiled from the North America Land Data Assimilation System (NLDAS) project, 1999–2009 [48]. For each SCF, we calculated the AH values and AH anomalies in days 7–10 prior to pandemic onset, where daily AH anomalies are defined as observed AH minus average AH for the same day of the year during 1999–2008.
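The neighbour network construction can be sketched as follows. This is a minimal illustration, assuming great circle distances from the haversine formula and a brute-force nearest-neighbour search; the function names and the toy coordinates are hypothetical. The breadth-first search shows one way to obtain a step distance on the network from a chosen source location.

```python
import math
from collections import deque

EARTH_RADIUS_KM = 6371.0

def great_circle_km(a, b):
    """Haversine distance between two (latitude, longitude) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))

def neighbour_network(coords, k=4):
    """Join each location to its k nearest neighbours; every link is reciprocal."""
    n = len(coords)
    links = {i: set() for i in range(n)}
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: great_circle_km(coords[i], coords[j]))
        for j in others[:k]:
            links[i].add(j)
            links[j].add(i)  # reciprocal link
    return links

def network_distance(links, source):
    """Number of steps from `source` to every reachable location (breadth-first search)."""
    steps = {source: 0}
    queue = deque([source])
    while queue:
        i = queue.popleft()
        for j in links[i]:
            if j not in steps:
                steps[j] = steps[i] + 1
                queue.append(j)
    return steps

# Toy example: six hypothetical locations along one parallel, two neighbours each.
toy_coords = [(30.0, -90.0 + lon) for lon in range(6)]
toy_links = neighbour_network(toy_coords, k=2)
toy_steps = network_distance(toy_links, source=0)
```

Because links are made reciprocal, some locations end up with more than k neighbours, which is why the step distance can differ from a simple ranking by geographic distance.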
Definition of influenza pandemic onset
Weeks of national “low ILI activity” between 2001 and 2009 for the aggregated US are defined as those in which the ILI ratio is below 0.6%, and a sinusoid is fitted to these weeks to determine the phase (similar to methods used for mortality data [29], [30]). Using this phase and the same set of weeks, an amplitude plus a quartic function of time is fitted to the ILI ratio for each SCF to give an approximate seasonal baseline. As the pandemic does not necessarily respect the usual annual timing of influenza, we define a conservative pandemic threshold as the maximum of the sinusoidal model baseline during 2009 plus a small additional buffer (0.2%). Using absolute numbers for 2009, a binomial test (with exact probabilities) is used to determine if the observed number of visits is significantly (p<0.01) above threshold in each week. If there are at least three consecutive weeks that are significantly above threshold, then the first such week is considered to be the week of pandemic onset. To interpolate to a slightly greater degree of resolution, we estimate the number of ILI visits and total visits in the half week before onset using the geometric mean, and the binomial test is again used to determine if the fall wave start time should be moved back by half a week. For the fall wave of 2009, the calculated pandemic onset timings are not sensitive to these methods of calculation, as the epidemic upswing in each location is so sharply defined.
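The weekly threshold test and the three-consecutive-weeks rule can be sketched as below. This is a simplified illustration: the threshold ratio is taken as given (the sinusoidal baseline fit is not reproduced), the half-week interpolation step is omitted, and the function names and weekly counts are hypothetical.

```python
import math

def binom_upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p), with each term computed in log space."""
    if k <= 0:
        return 1.0
    total = 0.0
    for x in range(k, n + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
                   + x * math.log(p) + (n - x) * math.log1p(-p))
        total += math.exp(log_pmf)
    return min(1.0, total)

def onset_week(ili, total, threshold, alpha=0.01, run=3):
    """Index of the first week that starts `run` consecutive significant weeks, else None."""
    above = [binom_upper_tail(c, n, threshold) < alpha for c, n in zip(ili, total)]
    for week in range(len(above) - run + 1):
        if all(above[week:week + run]):
            return week
    return None

# Hypothetical weekly counts for one location: ILI visits and total visits.
ili_visits = [20, 22, 45, 25, 50, 60, 80, 120]
total_visits = [2000] * len(ili_visits)
threshold_ratio = 0.012  # assumed baseline maximum plus buffer, as a proportion

week = onset_week(ili_visits, total_visits, threshold_ratio)
print("onset week index:", week)
```

In this toy series the isolated spike in the third week does not count as an onset, because it is not followed by two further significant weeks.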
Transmission model
We use a maximum likelihood approach based on a simple mechanistic model. Following Eggo et al. [5], the probability that the fall wave starts at time $t$ for a given location (indexed by $i$) is determined by $\lambda_i(t)$, the force of infection at location $i$ at time $t$. The symbols used in this section are labels adopted for this description.
For the gravity model, the force of infection depends on the following explanatory variables: $d_{ij}$ is the great circle distance between locations $i$ and $j$, and for each location $i$: $H_i$ is the absolute humidity, $S_i$ is the intensity of the spring wave (percentage of total pandemic excess ILI cases that were reported during the spring), $N_i$ is the population size normalised by average population size, and $I_i(t)$ is an indicator function of time which is 1 when schools are open and 0 otherwise. $\Omega(t)$ is the set of indices of currently infected locations.
The estimated parameters are as follows. $\alpha_S$ and $\alpha_H$ are the effects of the spring wave and humidity in modulating the full transmission rate. The $\beta$ parameters are all transmission rate factors: $\beta_0$ is the background rate of infection (including external seeding from domestic and international locations and local chains of transmission surviving over summer), $\beta_s$ is a boost due to schools being in, $\beta_d$ is the spatial transmission coefficient and $\beta_{ds}$ is the boost to spatial transmission due to schools being in. $\mu_0$, $\mu$ and $\nu$ are all exponents on population size, representing the effect of population size on the background rate of infection, and of the recipient population and the donor population in spatial transmission, respectively. The distance exponent in the spatial kernel of the gravity model is $\gamma$. The spatial transmission term is a fraction: the numerator is the sum over infected locations (weighted by distance and population size to some powers), and the denominator is the same sum but over all locations. The denominator is raised to the power $\epsilon$, so $\epsilon = 0$ corresponds to no normalisation (fully density-dependent) and $\epsilon = 1$ corresponds to full normalisation (fully density-independent).
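The verbal definitions above can be collected into displayed form. The following is a sketch under stated assumptions rather than the authors' exact expressions: all symbol names are assumed, as are the discrete-time survival form of the onset probability and the multiplicative $(1 + \cdot)$ form of the spring-wave, humidity and school terms (chosen so that setting a parameter to zero switches its effect off). Here $\lambda_i(t)$ is the force of infection on location $i$ at time step $t$; $S_i$, $H_i$, $N_i$ and $I_i(t)$ are the spring-wave intensity, absolute humidity, normalised population size and school-open indicator; $d_{ij}$ is the great circle distance; $\Omega(t)$ is the set of currently infected locations; $\alpha_S$, $\alpha_H$ modulate the full rate; $\beta_0$, $\beta_s$, $\beta_d$, $\beta_{ds}$ are the background rate, its school boost, the spatial coefficient and its school boost; $\mu_0$, $\mu$, $\nu$ are population exponents; $\gamma$ is the distance exponent; and $\epsilon$ is the normalisation power.

```latex
% Probability that location i has its fall-wave onset at time step t:
% no onset at any earlier step, then onset at step t (assumed form).
P_i(t) = \left[1 - e^{-\lambda_i(t)}\right]
         \exp\!\left(-\sum_{s<t} \lambda_i(s)\right)

% Gravity-model force of infection (assumed functional form).
\lambda_i(t) = (1 + \alpha_S S_i)\,(1 + \alpha_H H_i)
  \left[
    \beta_0 \bigl(1 + \beta_s I_i(t)\bigr) N_i^{\mu_0}
    + \beta_d \bigl(1 + \beta_{ds} I_i(t)\bigr) N_i^{\mu}\,
      \frac{\sum_{j \in \Omega(t)} N_j^{\nu}\, d_{ij}^{-\gamma}}
           {\left(\sum_{j \neq i} N_j^{\nu}\, d_{ij}^{-\gamma}\right)^{\epsilon}}
  \right]
```

In this form, $\epsilon = 0$ leaves the infected-location sum unnormalised and $\epsilon = 1$ divides it by the same sum over all locations, matching the density-dependent and density-independent limits described in the text.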
For the Gaussian model, the expression is the same except for the form of the dependence on distance: the parameter $\rho$ gives the distance scaling in the Gaussian, and the distance exponent $\gamma$ of the gravity model is no longer used.
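As an illustration of the change in distance dependence, one common Gaussian parameterisation is shown below next to the power-law weight of the gravity model. The symbols $\rho$ (distance scale), $\gamma$ (gravity distance exponent) and $d_{ij}$ (great circle distance between locations $i$ and $j$), and the factor of 2 in the exponent, are assumptions; the authors' exact scaling convention may differ.

```latex
% Weight given to the distance between locations i and j
k_{\mathrm{Gaussian}}(d_{ij}) = \exp\!\left(-\frac{d_{ij}^{2}}{2\rho^{2}}\right)
\qquad \text{in place of} \qquad
k_{\mathrm{gravity}}(d_{ij}) = d_{ij}^{-\gamma}
```

The Gaussian weight falls off much faster than a power law at distances beyond a few multiples of $\rho$, so it suppresses long-distance coupling more strongly.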
The grid model is slightly different: instead of an explicit dependence on geographic distance, we use the sets of locations one step ($G_i^{(1)}$) or two steps ($G_i^{(2)}$) away from location $i$, as defined by the constructed grid.
There is no distance parameter, but the spatial transmission parameters are extended from two to four to account for transmission to locations one and two steps away (a transmission coefficient and a school boost for each step distance): $\beta_{d1}$, $\beta_{ds1}$, $\beta_{d2}$ and $\beta_{ds2}$.
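The grid-based spatial term can be sketched as follows. This Python sketch is a hypothetical illustration, not the authors' code: the function name `grid_spatial_term`, all argument names, the multiplicative school boost and the choice to normalise each step distance separately are assumptions. Neighbour sets are passed as dictionaries mapping a location index to the indices one or two grid steps away.

```python
def grid_spatial_term(i, infected, one_step, two_step, pop, school_open,
                      beta_d1, beta_ds1, beta_d2, beta_ds2, mu, nu, eps):
    """Spatial force of infection on location i from grid neighbours.

    infected    : set of indices of currently infected locations
    one_step    : dict, location -> list of locations one grid step away
    two_step    : dict, location -> list of locations two grid steps away
    pop         : dict, location -> normalised population size
    school_open : 1 if schools are open at this time step, else 0
    mu, nu      : exponents on recipient and donor population size
    eps         : normalisation power (0 = none, 1 = full)
    """
    def ring(neighbours, beta_d, beta_ds):
        members = neighbours.get(i, [])
        # Numerator: donor-population weight of the infected neighbours.
        num = sum(pop[j] ** nu for j in members if j in infected)
        # Denominator: the same sum over all neighbours at this step distance.
        den = sum(pop[j] ** nu for j in members)
        if den == 0:
            return 0.0  # no neighbours at this step distance
        boost = 1.0 + beta_ds * school_open
        return beta_d * boost * num / den ** eps

    return pop[i] ** mu * (ring(one_step, beta_d1, beta_ds1)
                           + ring(two_step, beta_d2, beta_ds2))
```

Setting `beta_d2` and `beta_ds2` to zero reduces the sketch to nearest-neighbour transmission only, which mirrors how simpler models are obtained by switching parameters off.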
Simpler models can be made from all these spatial kernels by “turning off” combinations of parameters, i.e. setting them to zero. This makes 400 gravity, 400 Gaussian and 1936 grid models. For each of the 2736 models, the likelihood was maximised using the Nelder-Mead simplex algorithm as implemented in the GNU Scientific Library (GSL) in C. Simulations were done in C using the ranlux algorithm of Lüscher at the maximum luxury level, as provided by GSL. Convergence was assessed by likelihood profiles (Figure S4).
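To make the fitting step concrete, the sketch below maximises an onset-time likelihood with a simplex search. It is a toy illustration under stated assumptions, not the procedure used for the results: the fits reported here used the GSL implementation in C, whereas this sketch swaps in a hand-written, simplified Nelder-Mead (reflection, expansion, inside contraction, shrink) in pure Python. The model is reduced to a background rate with a multiplicative school boost, and the names `neg_log_lik`, `nelder_mead`, `onsets` and `school_start` are hypothetical.

```python
import math


def neg_log_lik(theta, onsets, school_start):
    """Negative log-likelihood of onset time steps under a two-level hazard.

    theta = (a, b): background rate exp(a); while schools are open the rate
    is exp(a) * (1 + exp(b)). Time steps are numbered from 1.
    """
    a, b = theta
    lam_closed = math.exp(a)
    lam_open = lam_closed * (1.0 + math.exp(b))
    total = 0.0
    for t in onsets:
        # Escape infection at every step before t ...
        for s in range(1, t):
            total -= lam_open if s >= school_start else lam_closed
        # ... then have the wave start at step t.
        lam_t = lam_open if t >= school_start else lam_closed
        total += math.log(-math.expm1(-lam_t))
    return -total


def nelder_mead(f, x0, step=0.5, xtol=1e-6, max_iter=5000):
    """Minimise f with a simplified Nelder-Mead simplex search."""
    n = len(x0)
    simplex = [list(x0)] + [
        [x + (step if k == d else 0.0) for k, x in enumerate(x0)]
        for d in range(n)
    ]
    values = [f(v) for v in simplex]
    for _ in range(max_iter):
        order = sorted(range(n + 1), key=values.__getitem__)
        simplex = [simplex[k] for k in order]
        values = [values[k] for k in order]
        best, worst = simplex[0], simplex[-1]
        # Stop once the simplex has collapsed onto its best vertex.
        if max(abs(v[d] - best[d]) for v in simplex for d in range(n)) < xtol:
            break
        centroid = [sum(v[d] for v in simplex[:-1]) / n for d in range(n)]

        def point(c):
            # Point on the line through the centroid, away from the worst vertex.
            return [centroid[d] + c * (centroid[d] - worst[d]) for d in range(n)]

        reflected = point(1.0)
        f_ref = f(reflected)
        if f_ref < values[0]:
            expanded = point(2.0)
            f_exp = f(expanded)
            if f_exp < f_ref:
                simplex[-1], values[-1] = expanded, f_exp
            else:
                simplex[-1], values[-1] = reflected, f_ref
        elif f_ref < values[-2]:
            simplex[-1], values[-1] = reflected, f_ref
        else:
            contracted = point(-0.5)
            f_con = f(contracted)
            if f_con < values[-1]:
                simplex[-1], values[-1] = contracted, f_con
            else:
                # Shrink every vertex halfway towards the best one.
                simplex = [best] + [
                    [best[d] + 0.5 * (v[d] - best[d]) for d in range(n)]
                    for v in simplex[1:]
                ]
                values = [values[0]] + [f(v) for v in simplex[1:]]
    k = min(range(n + 1), key=values.__getitem__)
    return simplex[k], values[k]
```

A fit is obtained with `nelder_mead(lambda th: neg_log_lik(th, onsets, school_start), [-1.0, 0.0])`. Working on the log scale keeps both rates positive without explicit constraints; a full model would replace the two-level hazard with the complete force of infection and repeat the fit for each parameter combination.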
Supporting Information
Acknowledgments
The weekly disease time series were kindly compiled by IMS Health for research purposes, under a collaborative agreement.
Funding Statement
This study was supported by the RAPIDD program of the Science and Technology Directorate, Department of Homeland Security (to JRG, LS, JS, BTG), the in-house influenza program of the Fogarty International Center, National Institutes of Health, and the MIDAS program of the National Institute of General Medical Sciences, NIH (grant U01-GM070749, DLC). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1. Cliff AD, Haggett P, Smallman-Raynor M (1998) Deciphering global epidemics: analytical approaches to the disease records of world cities, 1888–1912. Cambridge: Cambridge University Press.
- 2. Grenfell BT, Bjornstad ON, Kappey J (2001) Travelling waves and spatial hierarchies in measles epidemics. Nature 414: 716–723.
- 3. Viboud C, Bjornstad ON, Smith DL, Simonsen L, Miller MA, et al. (2006) Synchrony, waves, and spatial hierarchies in the spread of influenza. Science 312: 447–451.
- 4. Chowell G, Bettencourt LM, Johnson N, Alonso WJ, Viboud C (2008) The 1918–1919 influenza pandemic in England and Wales: spatial patterns in transmissibility and mortality impact. Proc Biol Sci 275: 501–509.
- 5. Eggo RM, Cauchemez S, Ferguson NM (2011) Spatial dynamics of the 1918 influenza pandemic in England, Wales and the United States. J R Soc Interface 8: 233–243.
- 6. Pitzer VE, Viboud C, Simonsen L, Steiner C, Panozzo CA, et al. (2009) Demographic variability, vaccination, and the spatiotemporal dynamics of rotavirus epidemics. Science 325: 290–294.
- 7. Cummings DAT, Irizarry RA, Huang NE, Endy TP, Nisalak A, et al. (2004) Travelling waves in the occurrence of dengue haemorrhagic fever in Thailand. Nature 427: 344–347.
- 8. Keeling MJ, Rohani P (2008) Modeling infectious diseases in humans and animals. Princeton: Princeton University Press.
- 9. Rvachev LA, Longini IM (1985) Model for the Global Spread of Influenza. Math Biosci 75: 3–22.
- 10. Ferguson NM, Cummings DA, Fraser C, Cajka JC, Cooley PC, et al. (2006) Strategies for mitigating an influenza pandemic. Nature 442: 448–452.
- 11. Simini F, Gonzalez MC, Maritan A, Barabasi AL (2012) A universal model for mobility and migration patterns. Nature 484: 96–100.
- 12. Xia Y, Bjornstad ON, Grenfell BT (2004) Measles metapopulation dynamics: a gravity model for epidemiological coupling and dynamics. Am Nat 164: 267–281.
- 13. Cauchemez S, Ferguson NM, Wachtel C, Tegnell A, Saour G, et al. (2009) Closure of schools during an influenza pandemic. Lancet Infect Dis 9: 473–481.
- 14. Cauchemez S, Valleron AJ, Boelle PY, Flahault A, Ferguson NM (2008) Estimating the impact of school closure on influenza transmission from Sentinel data. Nature 452: 750–754.
- 15. Chao DL, Halloran ME, Longini IM Jr (2010) School opening dates predict pandemic influenza A(H1N1) outbreaks in the United States. J Infect Dis 202: 877–880.
- 16. Bjornstad O, Finkenstadt B, Grenfell B (2002) Dynamics of measles epidemics: Estimating scaling of transmission rates using a Time series SIR model. Ecol Monogr 72: 169–184.
- 17. Grenfell BT, Bjornstad ON, Finkenstadt BF (2002) Dynamics of measles epidemics: scaling noise, determinism, and predictability with the TSIR model. Ecol Monogr 72: 185–202.
- 18. Shaman J, Goldstein E, Lipsitch M (2011) Absolute humidity and pandemic versus epidemic influenza. Am J Epidemiol 173: 127–135.
- 19. Shaman J, Pitzer VE, Viboud C, Grenfell BT, Lipsitch M (2010) Absolute humidity and the seasonal onset of influenza in the continental United States. PLoS Biol 8: e1000316.
- 20. Germann TC, Kadau K, Longini IM Jr, Macken CA (2006) Mitigation strategies for pandemic influenza in the United States. Proc Natl Acad Sci U S A 103: 5935–5940.
- 21. Cortese MM, Tate JE, Simonsen L, Edelman L, Parashar UD (2010) Reduction in gastroenteritis in United States children and correlation with early rotavirus vaccine uptake from national medical claims databases. Pediatr Infect Dis J 29: 489–494.
- 22. Fraser C, Donnelly CA, Cauchemez S, Hanage WP, Van Kerkhove MD, et al. (2009) Pandemic potential of a strain of influenza A (H1N1): early findings. Science 324: 1557–1561.
- 23. CDC (2005) CDC Influenza activity in the US. http://www.cdc.gov/flu/ (accessed Oct 12, 2005)
- 24. Nelson MI, Tan Y, Ghedin E, Wentworth DE, St George K, et al. (2011) Phylogeography of the spring and fall waves of the H1N1/09 pandemic influenza virus in the United States. J Virol 85: 828–834.
- 25. Chowell G, Echevarria-Zuno S, Viboud C, Simonsen L, Tamerius J, et al. (2011) Characterizing the Epidemiology of the 2009 Influenza A/H1N1 Pandemic in Mexico. PLoS Med 8: e1000436.
- 26. Wu JT, Cowling BJ, Lau EH, Ip DK, Ho LM, et al. (2010) School closure and mitigation of pandemic (H1N1) 2009, Hong Kong. Emerg Infect Dis 16: 538–541.
- 27. Shaman J, Kohn M (2009) Absolute humidity modulates influenza survival, transmission, and seasonality. Proc Natl Acad Sci U S A 106: 3243–3248.
- 28. Surveillance Data Incorporated (2012) http://www.sdihealth.com/
- 29. Serfling R (1963) Methods for current statistical analysis of excess pneumonia-influenza deaths. Public Health Rep 78: 494–506.
- 30. Viboud C, Grais RF, Lafont BA, Miller MA, Simonsen L (2005) Multinational impact of the 1968 Hong Kong influenza pandemic: evidence for a smoldering pandemic. J Infect Dis 192: 233–248.
- 31. Borse RH, Shrestha SS, Fiore AE, Atkins CY, Singleton JA, et al. (2013) Effects of vaccine program against pandemic influenza A(H1N1) virus, United States, 2009–2010. Emerg Infect Dis 19: 439–448.
- 32. McCallum H, Barlow N, Hone J (2001) How should pathogen transmission be modelled? Trends Ecol Evol 16: 295–300.
- 33. Eames KT, Tilston NL, Brooks-Pollock E, Edmunds WJ (2012) Measured dynamic social contact patterns explain the spread of H1N1v influenza. PLoS Comput Biol 8: e1002425.
- 34. Boelle PY, Ansart S, Cori A, Valleron AJ (2011) Transmission parameters of the A/H1N1 (2009) influenza virus pandemic: a review. Influenza Other Respi Viruses 5: 306–316.
- 35. Hardelid P, Andrews NJ, Hoschler K, Stanford E, Baguelin M, et al. (2010) Assessment of baseline age-specific antibody prevalence and incidence of infection to novel influenza A/H1N1 2009. Health Technol Assess 14: 115–192.
- 36. Hancock K, Veguilla V, Lu X, Zhong W, Butler EN, et al. (2009) Cross-reactive antibody responses to the 2009 pandemic H1N1 influenza virus. N Engl J Med 361: 1945–1952.
- 37. Lemey P, Suchard M, Rambaut A (2009) Reconstructing the initial global spread of a human influenza pandemic: A Bayesian spatial-temporal model for the global spread of H1N1pdm. PLoS Curr 1: RRN1031.
- 38. Balcan D, Hu H, Goncalves B, Bajardi P, Poletto C, et al. (2009) Seasonal transmission potential and activity peaks of the new influenza A(H1N1): a Monte Carlo likelihood analysis based on human mobility. BMC Med 7: 45.
- 39. Hosseini P, Sokolow SH, Vandegrift KJ, Kilpatrick AM, Daszak P (2010) Predictive power of air travel and socio-economic data for early pandemic spread. PLoS One 5: e12763.
- 40. Simonsen L, Clarke MJ, Schonberger LB, Arden NH, Cox NJ, et al. (1998) Pandemic versus epidemic influenza mortality: a pattern of changing age distribution. J Infect Dis 178: 53–60.
- 41. Olson DR, Simonsen L, Edelson PJ, Morse SS (2005) Epidemiological evidence of an early wave of the 1918 influenza pandemic in New York City. Proc Natl Acad Sci U S A 102: 11059–11063.
- 42. Andreasen V, Viboud C, Simonsen L (2008) Epidemiologic characterization of the 1918 influenza pandemic summer wave in Copenhagen: implications for pandemic control strategies. J Infect Dis 197: 270–278.
- 43. Baillie GJ, Galiano M, Agapow PM, Myers R, Chiam R, et al. (2012) Evolutionary Dynamics of Local Pandemic H1N1/2009 Influenza Virus Lineages Revealed by Whole-Genome Analysis. J Virol 86: 11–18.
- 44. Viboud C, Nelson MI, Tan Y, Holmes EC (2013) Contrasting the epidemiological and evolutionary dynamics of influenza spatial transmission. Philos Trans R Soc Lond B Biol Sci 368: 20120199.
- 45. 2009 pandemic influenza A (H1N1) virus infections - Chicago, Illinois, April–July 2009. MMWR Morb Mortal Wkly Rep 58: 913–918.
- 46. Charu V, Viboud C, Ballesteros S, Gog J, Grenfell B, et al. (2013) Validation of a high-volume electronic surveillance system to monitor seasonal and pandemic influenza activity in the US. Submitted.
- 47. HHS (2013) Regions Map. http://www.hhs.gov/about/regionmap.html
- 48. Cosgrove BA, Lohmann D, Mitchell KE, Houser PR, Wood EF, et al. (2003) Real-time and retrospective forcing in the North American Land Data Assimilation System (NLDAS) project. J Geophys Res 108(D22): 8842.