EPJ Data Sci. 2018 Jun 11;7(1):16. doi: 10.1140/epjds/s13688-018-0144-x

Inferences about spatiotemporal variation in dengue virus transmission are sensitive to assumptions about human mobility: a case study using geolocated tweets from Lahore, Pakistan

Moritz U G Kraemer 1,2,3, D Bisanzio 4,5, R C Reiner 6, R Zakar 7, J B Hawkins 1,2, C C Freifeld 2,8, D L Smith 6,9, S I Hay 6, J S Brownstein 1,2, T Alex Perkins 10
PMCID: PMC6404370  PMID: 30854281

Abstract

Billions of users of mobile phones, social media platforms, and other technologies generate an increasingly large volume of data that has the potential to be leveraged towards solving public health challenges. These and other big data resources tend to be most successful in epidemiological applications when utilized within an appropriate conceptual framework. Here, we demonstrate the importance of assumptions about host mobility in a framework for dynamic modeling of infectious disease spread among districts within a large urban area. Our analysis focused on spatial and temporal variation in the transmission of dengue virus (DENV) during a series of large seasonal epidemics in Lahore, Pakistan during 2011–2014. Similar to many directly transmitted diseases, DENV transmission occurs primarily where people spend time during daytime hours, given that DENV is transmitted by a day-biting mosquito. We inferred spatiotemporal variation in DENV transmission under five different assumptions about mobility patterns among ten districts of Lahore: no movement among districts, movement following patterns of geo-located tweets, movement proportional to district population size, and movement following the commonly used gravity and radiation models. Overall, we found that inferences about spatiotemporal variation in DENV transmission were highly sensitive to this range of assumptions about intra-urban human mobility patterns, although the three assumptions that allowed for a modest degree of intra-urban mobility all performed similarly in key respects. Differing inferences about transmission patterns based on our analysis are significant from an epidemiological perspective, as they have different implications for where control efforts should be targeted and whether conditions for transmission became more or less favorable over time.

Electronic Supplementary Material

The online version of this article (10.1140/epjds/s13688-018-0144-x) contains supplementary material.

Keywords: Big data, Disease dynamics, Geo-located tweets, Gravity model, Human mobility, Radiation model, Spatiotemporal analysis, Twitter data

Introduction

The spread and transmission dynamics of human infectious diseases are shaped extensively by human behavior [18]. Pathogen transmission depends on human contact patterns and tends to accelerate in highly connected areas with high population size and frequent travel [23]. Relevant population interactions between areas can be the result of daily commuting to the commercial center of a city and back [32, 50], visiting relatives or friends [27], religious or cultural activities [15], or many other reasons. Generally, urban travel is characterized by extensive daily activity, as work activities do not typically take place at the same places where people live [19]. Dynamic human movement patterns in cities can be inferred using a variety of data sources such as census data, mobile phone data, or social media data [28]. Passive data collection from social media platforms now offers timely, high-resolution estimates of spatiotemporal patterns of human mobility [4, 5, 28, 32]. All of these movement types have the potential to shape infectious disease transmission dynamics, potentially in different ways depending on the mode of transmission (e.g., by direct contact or through a mosquito vector).

Specifically in the context of urban transmission, the importance of spatial heterogeneity in drivers of transmission is well-documented [40, 47, 54]. Some districts of a city may have a considerably higher likelihood of infection as a result of, for example, higher mosquito densities (e.g., malaria [53], dengue [56]), yet each district contributes to transmission in any other district, not just within its own boundaries, due largely to human travel [11, 31, 34]. Such considerations can become highly relevant when human mobility is high, as observed in large urban areas [12], and can critically inform how resources to control and eliminate disease should be allocated [9, 11]. Understanding this interaction between human mobility, spatial variation in drivers of transmission, and control measures is important for identifying where control measures will be most impactful in controlling the spread and limiting the burden of dengue [48], chikungunya [14, 49], yellow fever [21, 22], and Zika [30], all of which continue to cause large urban outbreaks. An important feature that these viruses have in common is that they are all transmitted by Aedes mosquitoes, which are active during daytime hours when human mobility is high [54].

In this study, we examined a series of seasonal dengue epidemics in an urban setting that occurred between 2011 and 2014 in Lahore, Pakistan; no major epidemic had been recorded before that date [24]. Dengue virus (DENV) is a flavivirus transmitted between humans primarily by the Aedes aegypti mosquito [51]. The burden of dengue is enormous and has increased substantially in recent decades [6]. The distribution of Ae. aegypti is now larger than it has ever been known to be [25], and the viruses it transmits have been expanding too as a result [29, 33], leading to expanding ranges or changes in the epidemiology of Zika, chikungunya, and yellow fever.

To enhance our understanding of urban transmission dynamics of infectious diseases and to evaluate the importance of assumptions about the spatial configuration of cities, we here use human mobility models and estimates derived from the social network platform Twitter to compare inferences about spatiotemporal variation in transmission patterns and to determine how sensitive these inferences are to different assumptions about patterns of intra-urban human mobility. Little sensitivity of these inferences would suggest that analyses could proceed with business-as-usual assumptions, whereas strong sensitivity would point to a need for more careful consideration of human mobility data within analyses of infectious disease dynamics, even at the granularity of intra-urban scales.

Material & methods

Epidemiological data: We obtained individual dengue case data from Lahore, Pakistan, and aggregated them to the town level (an administrative subdivision of the city, n = 10) on a weekly basis from January 1, 2011 to December 31, 2014. We refer to the number of dengue cases reported in town i in week t as $I_{i,t}$. Data were provided by the Health Department, Pakistan, and were processed from their original line list form. In total, 35,348 confirmed and suspected cases were recorded in Lahore. Roughly 18,020 of those occurred during the 2011 epidemic alone. The geo-positioning procedure is described in detail in Kraemer et al. [26].

Human mobility data and models: To quantify human mobility patterns, we used openly available data from Twitter through its API. Our database consists of tweets made in Lahore from January 1, 2011 through June 30, 2015. Specifically, the tweets were gathered by querying the free streaming API for a bounding box of [−180, 180] longitude and [−90, 90] latitude, so that all tweets with geographic coordinates matched. The results are limited by Twitter to 1% of total tweet volume. We then filtered the database to only include tweets sent within the city of Lahore, Pakistan. The penetration of Twitter users with geo-located information amounts to about 1% of the total population in the study period, similar to previous estimates [28]. Other information included the user’s unique ID. We associated each user with a town of residence according to the town from which they sent the most tweets during night hours, defined as 9pm–7am.
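The residence assignment described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the `(user_id, town, timestamp)` tuple layout and the function names are assumptions made for the example, and the tie-breaking rule is arbitrary because the text does not specify one.

```python
from collections import Counter, defaultdict
from datetime import datetime


def is_night(ts: datetime) -> bool:
    """Night hours as defined in the text: 9pm to 7am."""
    return ts.hour >= 21 or ts.hour < 7


def assign_home_towns(tweets):
    """Map each user to the town with the most night-time tweets.

    tweets: iterable of (user_id, town, timestamp) tuples (hypothetical layout).
    Users with no night-time tweets are left out of the result.
    """
    night_counts = defaultdict(Counter)
    for user_id, town, ts in tweets:
        if is_night(ts):
            night_counts[user_id][town] += 1
    # most_common(1) picks the town with the largest count; ties fall back to
    # first-seen order, which is an arbitrary choice made for this sketch.
    return {user: counts.most_common(1)[0][0]
            for user, counts in night_counts.items()}
```

Daytime tweets are ignored when assigning residence, so a user who tweets mostly from a workplace town during the day is still assigned to the town where their night-time activity is concentrated.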

To use the tweets to summarize mobility patterns of residents of the 10 different towns, we computed a single matrix H that contained the proportion of tweets made in town j by residents of town i, where i and j refer to the row and column of H, respectively. Thus, the rows of H sum to 1, and the columns of H sum to values somewhat less than or greater than 1. Due to the somewhat limited number of tweets available from users in a given town during a given time period and because there was no obvious seasonality in the data, we did not make use of temporally disaggregated Twitter data in our transmission model.
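As a rough sketch of how such a matrix could be assembled, the following Python function counts tweets by (home town, tweet town) and normalizes each row. The integer town indices and the function name are assumptions for illustration only.

```python
import numpy as np


def mobility_matrix(home_idx, tweet_idx, n_towns):
    """Row-normalized matrix H, with H[i, j] the proportion of tweets
    by residents of town i that were made in town j.

    home_idx[k], tweet_idx[k]: integer town indices for tweet k (hypothetical encoding).
    """
    home_idx = np.asarray(home_idx)
    tweet_idx = np.asarray(tweet_idx)
    counts = np.zeros((n_towns, n_towns))
    # Unbuffered accumulation so repeated (i, j) pairs are all counted.
    np.add.at(counts, (home_idx, tweet_idx), 1)
    row_totals = counts.sum(axis=1, keepdims=True)
    if np.any(row_totals == 0):
        raise ValueError("every town needs at least one tweet by a resident")
    return counts / row_totals
```

Because only the rows are normalized, the column sums are unconstrained, which is why they can fall somewhat below or above 1.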

In addition to the H matrix based on tweets, we constructed four alternative H matrices that span a wide range of assumptions about human mobility commonly used in infectious disease modeling. At one extreme, we constructed an H matrix following the ideal free assumption that movements between all locations occur proportional to population size. At the opposite extreme, we constructed an H matrix consistent with an assumption of no movement between towns. Just as the H matrix based on tweets represents an intermediate assumption between these two extremes, we formulated two additional H matrices based on commonly used models of human mobility: the gravity model [60] and the radiation model [50]. We applied these models to data about the distance between town centroids and town population sizes. This produced values of fluxes between i and j but did not produce an estimate of the magnitude of time spent in i by residents of i. To work around this gap in the predictions of these models, we used the diagonal of the tweet-based H matrix as the diagonal for these two H matrices. For the off-diagonal elements, we normalized the fluxes out of i predicted by the gravity and radiation models and multiplied those terms by $1 - H_{i,i}$. Numerical values of all five H matrices are provided in Tables S1–S5 (Additional file 1).
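The construction of the alternative matrices can be illustrated as follows. This is a sketch under stated assumptions: the text does not give the functional form or parameters of the gravity model, so the form $N_i N_j / d_{ij}^{\gamma}$ with $\gamma = 2$ is a placeholder, and the radiation flux uses the standard parameter-free expression with an arbitrary overall scale (which cancels on normalization). All function names are hypothetical.

```python
import numpy as np


def no_movement(n_towns):
    """Residents spend all of their time in their own town."""
    return np.eye(n_towns)


def ideal_free(pop):
    """Every resident's time is split in proportion to town population size."""
    pop = np.asarray(pop, dtype=float)
    return np.tile(pop / pop.sum(), (len(pop), 1))


def gravity_flux(pop, dist, gamma=2.0):
    """Assumed gravity form N_i * N_j / d_ij**gamma (gamma is a placeholder)."""
    pop = np.asarray(pop, dtype=float)
    dist = np.asarray(dist, dtype=float)
    with np.errstate(divide="ignore"):
        flux = np.outer(pop, pop) / dist**gamma
    np.fill_diagonal(flux, 0.0)  # the models give no within-town estimate
    return flux


def radiation_flux(pop, dist):
    """Radiation flux up to a constant factor; s_ij is the population within
    distance d_ij of town i, excluding towns i and j themselves."""
    pop = np.asarray(pop, dtype=float)
    dist = np.asarray(dist, dtype=float)
    n = len(pop)
    flux = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            s = pop[dist[i] <= dist[i, j]].sum() - pop[i] - pop[j]
            flux[i, j] = pop[i] * pop[j] / ((pop[i] + s) * (pop[i] + pop[j] + s))
    return flux


def h_from_flux(flux, diag):
    """Normalize outgoing fluxes per row, scale them by 1 - H[i, i], and
    place the tweet-based diagonal on the result."""
    diag = np.asarray(diag, dtype=float)
    off = flux / flux.sum(axis=1, keepdims=True)
    h = off * (1.0 - diag)[:, None]
    np.fill_diagonal(h, diag)
    return h
```

By construction, each row of the resulting matrix sums to 1 and its diagonal matches the tweet-based diagonal, so the gravity and radiation variants differ from the Twitter variant only in how time away from home is distributed.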

Mobility-based transformation of incidence data: Our analysis is premised on the distinction between the location where an individual resides and the locations where she or he spends time. DENV is transmitted by the urban-adapted mosquito Ae. aegypti, which engages in the majority of its blood feeding activity during daytime hours [1]. Because this means that transmission is expected to occur mainly where people spend time during the day rather than where they reside [55], we transformed the ten residence-based incidence time series $I_{i,t}$ to ten mobility-based incidence time series $\tilde{I}_{i,t}$. The latter contains the incidence of cases acquired in town i in week t under a given assumption about mobility patterns defined by H and is calculated according to $\tilde{I}_{i,t} = \sum_j H_{j,i} I_{j,t}$. We examined a total of five different interpretations of $\tilde{I}_{i,t}$ corresponding to the five different assumptions about human mobility patterns quantified by five different H matrices, as described in the previous section.
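In matrix form, the transformation is a product of the transposed H matrix with the residence-based incidence array. A minimal sketch, assuming incidence is stored with one row per town and one column per week (the array layout and function name are assumptions):

```python
import numpy as np


def mobility_transform(H, I):
    """Mobility-based incidence: I_tilde[i, t] = sum_j H[j, i] * I[j, t].

    H: (n_towns, n_towns) row-stochastic mobility matrix.
    I: (n_towns, n_weeks) residence-based incidence.
    """
    return np.asarray(H, dtype=float).T @ np.asarray(I, dtype=float)
```

Because the rows of H sum to 1, the transformation only redistributes cases among towns: the city-wide total in each week is unchanged, and fractional case counts can appear in individual towns.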

Transmission model: We used a spatial TSIR framework to model the dynamics of $\tilde{I}_{i,t}$ in the ten towns of Lahore during 2011–2014. Consistent with the assumptions about mobility used to define $\tilde{I}_{i,t}$, we defined the effective population size of town i during daytime hours as $\tilde{N}_i = \sum_j H_{j,i} N_j$. We are not aware of any significant DENV transmission activity in Lahore prior to 2011, so we assumed that the effective number of susceptible individuals in town i during daytime hours was $\tilde{S}_{i,1} = \tilde{N}_i$ during the first week of January, 2011. Thenceforth, the susceptible population was depleted as new cases arose according to $\tilde{S}_{i,t} = \tilde{S}_{i,t-1} - \tilde{I}_{i,t}/\rho$, where ρ is the probability that a person infected with DENV was reported to the Health Department. Although there is a great deal of variability in ρ due to variation in rates of symptomatic disease and health-seeking behavior in different populations, we adopted a value of ρ = 0.18 based on a recent meta-analysis [10]. This parameter accounts for the fact that many DENV infections are mild or asymptomatic, which is important when tracking the susceptible population because individuals exposed to DENV become immune thereafter regardless of the extent to which they experience symptoms. One complication that we did not account for due to a lack of data is that there are four distinct DENV serotypes, with long-lasting immunity being specific only to the serotype(s) to which one has been exposed. There is, however, a short-term period of cross-immunity that is protective against all serotypes following exposure to only a single serotype, with the duration of this period (maximum-likelihood estimate: 1.88 y, 95% confidence interval: 0.88–4.31 y [43]) being similar to the timescale of our data set as a whole.
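The bookkeeping for the effective population and the susceptible pool can be sketched as below. The array layout (towns in rows, weeks in columns) and the function name are assumptions; the recursion and the default reporting probability of 0.18 follow the text.

```python
import numpy as np


def susceptible_series(H, N, I_tilde, rho=0.18):
    """Effective daytime population and susceptible depletion.

    N_tilde[i] = sum_j H[j, i] * N[j]
    S[i, 0]    = N_tilde[i]                      (first week of the series)
    S[i, t]    = S[i, t-1] - I_tilde[i, t] / rho
    """
    H = np.asarray(H, dtype=float)
    N = np.asarray(N, dtype=float)
    I_tilde = np.asarray(I_tilde, dtype=float)
    N_tilde = H.T @ N
    S = np.empty(I_tilde.shape)
    S[:, 0] = N_tilde
    for t in range(1, I_tilde.shape[1]):
        # Dividing reported cases by rho converts them to estimated infections.
        S[:, t] = S[:, t - 1] - I_tilde[:, t] / rho
    return N_tilde, S
```

With ρ = 0.18, each reported case removes roughly 5.6 people from the susceptible pool, reflecting the unreported infections that also confer immunity.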

Following the standard form of TSIR models, we assumed that new cases among people spending time in town i were acquired on week t according to

$$\tilde{I}_{i,t} = \beta_i(t)\,\frac{\tilde{I}'_{i,t}}{\tilde{N}_i}\,\tilde{S}'_{i,t}, \tag{1}$$

where $\beta_i(t)$ is the transmission coefficient in town i at time t. The prime notation in $\tilde{I}'_{i,t}$ and $\tilde{S}'_{i,t}$ denotes the numbers of infected and susceptible people in the “generation” prior to t. The obligatory role of a mosquito in the transmission of DENV from one person to another is associated with a relatively long generation interval compared to directly transmitted pathogens. Whereas most TSIR models treat consecutive time steps as distinct generations, we obviated the need to temporally aggregate the data to such a large extent by calculating

$$\tilde{I}'_{i,t} = \sum_{n=1}^{5} \omega_n \tilde{I}_{i,t-n}, \tag{2}$$

where

$$\omega_n = \frac{1}{F(35)} \left( \frac{1}{7} \int_{7(n-1)}^{7n} F(\tau + 7) - F(\tau) \, d\tau \right) \tag{3}$$

is the probability that a case in week t is attributable to a case that occurred in week $t - n$, as defined by a generation interval with distribution function F [38]. We adopted a distribution function estimated by Siraj et al. [52] at a temperature of 30°C (the average daily temperature in Lahore during 2011–2014), which resulted in values of $\omega_n$ of $4.8 \times 10^{-4}$, 0.168, 0.440, 0.267, and 0.125 for $n = 1, \ldots, 5$, respectively. $\tilde{S}'_{i,t}$ was calculated similarly.
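The weighted lag in Eq. (2) can be illustrated with the five weights reported above. The function name, the array layout, and the choice to mark the first five weeks as missing (because they lack a full lag history) are assumptions of this sketch.

```python
import numpy as np

# Generation-interval weights for lags of 1 to 5 weeks, as reported in the text.
OMEGA = np.array([4.8e-4, 0.168, 0.440, 0.267, 0.125])


def lagged_generation(X, omega=OMEGA):
    """Previous-generation quantity: X'[:, t] = sum_n omega[n-1] * X[:, t-n].

    X: (n_towns, n_weeks). Weeks without a full lag history are set to NaN.
    """
    X = np.asarray(X, dtype=float)
    n_lags = len(omega)
    out = np.full(X.shape, np.nan)
    for t in range(n_lags, X.shape[1]):
        # Reverse the window so its columns run t-1, t-2, ..., t-n_lags.
        out[:, t] = X[:, t - n_lags:t][:, ::-1] @ omega
    return out
```

The same function can be applied to the susceptible series, since the text states that the primed susceptible quantity was calculated similarly. The weights show that a case is most likely attributable to cases three weeks earlier and almost never to cases in the immediately preceding week.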

Model fitting: A primary advantage of the TSIR framework is that it allows for a model to be fitted to incidence data using regression techniques, which are easier to implement than alternative approaches to fitting dynamic models to time series data. To do so, we took the natural log of Eq. (1) and rearranged to obtain the regression equation

$$\ln(\tilde{I}_{i,t}) - \ln(\tilde{S}'_{i,t}) + \ln(\tilde{N}_i) = \ln(\beta_i(t)) + \ln(\tilde{I}'_{i,t}) + \epsilon_t, \tag{4}$$

where

$$\ln(\beta_i(t)) = s_{\text{secular},i}(t) + s_{\text{seasonal},i}\left(t - 52 \lfloor t/52 \rfloor\right) \tag{5}$$

and each $\epsilon_t$ is an independent and identically distributed normal random variable. We posed $\ln(\beta_i(t))$ as a shape-constrained additive model (SCAM) and estimated parameters describing its two components using the scam function in the scam package [41] in R [42]. To prevent data points near the beginning and end of the time series from leading to unreasonably large values of $\beta_i(t)$ when extrapolating beyond those data points, we constrained $s_{\text{secular},i}$ to be a concave function. We modeled $s_{\text{seasonal},i}$ as a cyclic cubic spline to ensure that its values at the beginning and end of the year were equal up to their second derivative. Under all mobility assumptions other than ideal free, we estimated separate town-specific functions for each of the two components of $\ln(\beta_i(t))$. Under the ideal free mobility assumption, we estimated only a single $\ln(\beta(t))$ that applied to all towns due to the fact that the mobility-transformed data were strictly proportional to each other under this assumption.
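To make the regression structure of Eqs. (4) and (5) concrete, the following Python sketch fits $\ln(\beta_i(t))$ for a single town by ordinary least squares. It is deliberately not the SCAM approach used in the analysis: a quadratic trend stands in for the concave secular spline (without enforcing concavity) and Fourier harmonics with a 52-week period stand in for the cyclic cubic spline. All function names and the number of harmonics are assumptions.

```python
import numpy as np


def design_matrix(t, n_harmonics=2, period=52):
    """Intercept, quadratic trend, and periodic harmonics (a stand-in for
    the secular and seasonal smooth terms)."""
    t = np.asarray(t, dtype=float)
    tc = (t - t.mean()) / t.std()  # centered and scaled week index
    cols = [np.ones_like(t), tc, tc**2]
    for k in range(1, n_harmonics + 1):
        angle = 2 * np.pi * k * t / period
        cols += [np.sin(angle), np.cos(angle)]
    return np.column_stack(cols)


def fit_log_beta(I, I_prime, S_prime, N, t, n_harmonics=2):
    """Least-squares estimate of ln(beta(t)) for one town.

    Moving ln(I') to the left-hand side of Eq. (4) treats it as an offset
    with coefficient fixed at 1, leaving ln(beta(t)) as the mean function.
    """
    I, I_prime, S_prime = (np.asarray(a, dtype=float) for a in (I, I_prime, S_prime))
    ok = (I > 0) & (I_prime > 0) & (S_prime > 0)  # logs need positive values
    y = np.log(I[ok]) - np.log(S_prime[ok]) + np.log(N) - np.log(I_prime[ok])
    X = design_matrix(t, n_harmonics)
    coef, *_ = np.linalg.lstsq(X[ok], y, rcond=None)
    return X @ coef  # fitted ln(beta) for every week in t
```

Weeks with zero incidence are dropped here because their logarithm is undefined; how such weeks were handled in the original analysis is not stated in the text.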

Results

Human mobility data: Tweet-derived movement estimates showed relatively low movement outside the town of residence, with the mean proportion of time spent within one’s town of residence being 91.2% (range: 84.0–96.8%). The town from which the largest proportion of non-resident tweets was made was Gulberg Town (1.7%), and the fewest were made in Wagha Town (0.16%). Although there was substantial day-to-day variation in Twitter activity across towns (Fig. 1), the extent to which that variation was driven by a set of deterministic factors or sampling noise was not apparent. Based on the limited sample of tweets available and their incomplete coverage over the study period, we used the time-averaged proportion of tweets by residents of each town made in every other town in the epidemiological analysis.

Figure 1.

Relative proportion of tweets made in the town indicated in the panel label by residents of every other town. Different colors represent different home locations of residents.

Mobility-based transformation of incidence data: Applying the five mobility matrices to dengue incidence time series stratified by town of residence, we obtained notably different time series of the towns in which the cases were acquired. Under the assumption of no movement outside one’s town of residence, the residence-based and mobility-based time series were identical. Under the assumption that mobility follows Twitter, gravity, or radiation movement patterns, the mobility-based time series was mostly similar to the residence-based time series (Fig. 2), although redistribution from high-incidence towns to low-incidence towns was visually apparent (Fig. 3). This redistribution was attributable to the partially homogenizing effect of inter-town mobility. Under the assumption that mobility follows an ideal-free distribution, the mobility-based time series was stratified proportional to town population size (Fig. 2), resulting in time series that followed identical dynamics (Fig. 3). Whereas the distribution of incidence across towns was temporally constant under the ideal-free assumption, there was substantially more temporal variation in the distribution of incidence across towns under the Twitter, gravity, radiation, and no-movement assumptions (Fig. 2).

Figure 2.

Time series stratified by towns in which cases were assumed to be acquired (colors) under five different assumptions about mobility among towns (rows)

Figure 3.

Time series stratified by towns in which cases were assumed to be acquired (rows) under five different assumptions about mobility among towns (line type)

Model fitting: Best-fit models to all five mobility-based time series explained a relatively high proportion of variation in incidence, with the coefficient of determination, $R^2$, ranging from 0.519 to 0.685 (Fig. 4). In general, the time series data were explained similarly well by each of the models that performed a mobility transformation; i.e., Twitter ($R^2 = 0.678$), ideal free ($R^2 = 0.662$), gravity ($R^2 = 0.685$), and radiation ($R^2 = 0.662$). The data were explained less well by the model that assumed no movement ($R^2 = 0.519$). Rather than an indication of the inadequacy of the no movement assumption, we interpreted this lower $R^2$ value as a consequence of the fact that $\tilde{I}_{i,t}$ in Eq. (1) is integer-valued under the no movement assumption and continuous under the other mobility assumptions. Because our model’s generation-interval adjustment in Eq. (2) results in $\tilde{I}'_{i,t}$ being continuous, the mobility assumptions associated with continuous values of $\tilde{I}_{i,t}$ have an inherent advantage in fitting the data, especially for $\tilde{I}_{i,t} < 1$ (Fig. 4). As a result, comparison of $R^2$ values calculated in reference to the mobility-based time series to which the models were fitted indicates no clear distinction among the mobility assumptions and their appropriateness for modeling the mobility-based time series.

Figure 4.

Relationships between predicted (x-axis) and observed (y-axis) log incidence based on models fitted to five different mobility-based incidence time series (panels). The coefficient of determination, $R^2$, associated with each best-fit model is indicated in each panel. Values of observed incidence vary across panels due to the effect of different assumptions about mobility used to transform the residence-based time series to mobility-based time series. For example, log incidence under the assumption of no movement never falls below 0, because there were no fractional cases observed in the raw data. Fractional incidence did occur in the other time series due to each person’s incidence of disease being partitioned across the towns proportional to assumed mobility patterns. Under the ideal free assumption, the diagonal sets of points are a result of incidence in a given week varying across towns only in proportion to their different population sizes.

Another way that we examined model fit was based on how well each best-fit model matched the original time series once a model’s one-step-ahead predictions of $\tilde{I}_{i,t}$ were transformed back to predictions of $I_{j,t}$ using the H matrix under which a given model was fitted. Under the no movement assumption, $\tilde{I}_{i,t}$ and $I_{j,t}$ were, by definition, the same. These predictions were generally consistent with the data, although there were instances in Gulberg Town, Shalimar Town, and Wagha Town in which model predictions far exceeded the data during certain time periods (Fig. 5). This was likely due to the seasonal component of $\beta_i(t)$ being influenced too heavily by years with larger outbreaks. Under the Twitter, gravity, and radiation assumptions, predictions of $I_{j,t}$ were often similar to or nearly as good as predictions based on the model fitted under the no movement assumption (Fig. 5). These models performed less well in the towns with the lowest incidence (i.e., Aziz Bhatti Town and Wagha Town) due to those models’ predictions of a greater degree of imported incidence from towns with high transmission than actually occurred (Fig. 5). In terms of ability to predict $I_{j,t}$, the model fitted under the ideal free assumption performed the worst by far. In the towns with the greatest incidence per capita, this model predicted either too few cases overall (Samanabad Town) or incidence patterns that were not as peaked as was observed locally (Ravi Town) (Fig. 5). In all other towns, the model fitted under the ideal free assumption overpredicted incidence, sometimes by several hundred cases in a single week (e.g., Iqbal Town, Nishtar Town) (Fig. 5).

Figure 5.

Predicted values of residence-based incidence in each town (rows) using the best-fit model under each of five different assumptions about mobility (colors). Observed values of residence-based incidence in each town are shown with black dots, and bands show 95% confidence intervals on model predictions

Transmission model inferences: Inferences of $\beta_i(t)$ under different mobility assumptions varied widely. Variation in $\beta_i(t)$ across towns was maximized under the assumption of no movement, with patterns ranging from nearly flat in Cantonment and Aziz Bhatti Town to a single seasonal peak in Shalimar Town and Wagha Town to multiple annual peaks of different heights across years in the other towns (Fig. 6, left). Under the no movement assumption, confidence intervals for $\beta_i(t)$ were unreasonably high in Shalimar Town and Wagha Town (Fig. 6, left). By design, inferences of $\beta_i(t)$ under the ideal free assumption were identical across towns and displayed a pattern of two seasonal peaks, with the one in the third quarter being larger (Fig. 6, right). This same general pattern was apparent in the inferences of $\beta_i(t)$ under the Twitter mobility assumption, but there was clear variability across towns, with differences in the heights of the peaks, their timing, and other aspects of their shape (Fig. 6, center). Inferences of $\beta_i(t)$ under the gravity and radiation mobility assumptions were similar to those under the Twitter assumption, with gravity being extremely similar (Fig. 7, left) and radiation being similar in most towns but having considerably larger peaks in Aziz Bhatti Town and Wagha Town (Fig. 7, right).

Figure 6.

Temporal variation in $\beta_i(t)$ for different towns (rows) under three different mobility assumptions (columns): no movement (left), Twitter (center), and ideal free (right). The black line shows the mean of the best-fit model and blue bands show standard error around the mean. The dashed red line indicates where $\beta_i(t) = 1$.

Figure 7.

Temporal variation in $\beta_i(t)$ for different towns (rows) under three different mobility assumptions (columns): gravity model (left), Twitter (center), and radiation model (right). The black line shows the mean of the best-fit model and blue bands show standard error around the mean. The dashed red line indicates where $\beta_i(t) = 1$.

A general tendency for variation in $\beta_i(t)$ across towns under different assumptions about mobility was reinforced by examining geometric means of $\beta_i(t)$ over time. Under the ideal free mobility assumption, the geometric mean of $\beta_i(t)$ decreased every year (Fig. 8). In contrast, the geometric mean of $\beta_i(t)$ was greater in 2014 than in 2011 in approximately half the towns under the Twitter, gravity, and radiation mobility assumptions, with differences across models in terms of which towns experienced those increases (Fig. 8). The degree of inter-annual variation in the geometric mean of $\beta_i(t)$ was greatest under the no movement assumption, moderate under the Twitter, gravity, and radiation assumptions, and least under the ideal free assumption (Fig. 8). Under the Twitter, gravity, and radiation assumptions, the geometric mean of $\beta_i(t)$ across all years was highest in Data Gunj Bakhsh Town, which was also highest in both absolute and per capita terms for both residence-based and mobility-based incidence (Fig. 9). Otherwise, there was little correspondence between the geometric mean of $\beta_i(t)$ and either absolute or per capita incidence of the other nine towns under the Twitter, gravity, and radiation mobility assumptions. There was also no clear correspondence between $\beta_i(t)$ and incidence under the no movement assumption, and both the geometric mean of $\beta_i(t)$ and per capita incidence were equal across towns under the ideal free assumption, as expected (Fig. 9).
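For reference, an annual geometric mean of a weekly transmission coefficient can be computed by averaging on the log scale and exponentiating. The sketch below assumes 52-week years and drops any incomplete trailing year; the function name and these conventions are assumptions rather than details taken from the analysis.

```python
import numpy as np


def geometric_mean_by_year(beta, weeks_per_year=52):
    """Geometric mean of weekly beta values within each complete year."""
    beta = np.asarray(beta, dtype=float)
    n_years = len(beta) // weeks_per_year
    blocks = beta[: n_years * weeks_per_year].reshape(n_years, weeks_per_year)
    # exp(mean(log x)) is the geometric mean; it requires beta > 0.
    return np.exp(np.log(blocks).mean(axis=1))
```

The geometric mean is a natural summary here because the model is fitted to $\ln(\beta_i(t))$, so averaging on the log scale keeps a few very large weekly values from dominating the annual figure.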

Figure 8.

Geometric means of $\beta_i(t)$ stratified by year (x-axis), town (rows), and mobility assumption (columns).

Figure 9.

Measures of transmission and incidence across towns (colors) under different mobility assumptions (columns) aggregated over the entire 2011–2014 time period. The upper left y-axis was cut off to permit viewing of the majority of values; the geometric mean of $\beta_i(t)$ for Shalimar Town is 17.49.

Discussion

Urban areas exhibit spatial heterogeneity in numerous factors that are relevant to infectious disease transmission, which can contribute to spatial variation in transmission [2] and interact with temporal drivers of transmission [44]. Our work contributes to understanding of infectious disease dynamics in urban settings by highlighting the important role that human mobility plays in relating observed patterns of disease incidence to inferred patterns of disease transmission. On the one hand, our results show that assuming that human mobility is well mixed at the scale of a city (ideal free assumption) fails to capture underlying spatial heterogeneity in transmission and can lead to incorrect conclusions about secular trends in transmission across years. In some ways, this behavior is not surprising, but a systematic review of the literature on mathematical modeling of mosquito-borne disease transmission showed that this assumption is extremely prevalent across this field [45]. On the other hand, assuming that different districts of a city are isolated (no movement assumption) may lead to exaggerated and biologically unrealistic inferences about transmission patterns. Whether it be Twitter or other data streams [3], incorporating realistic patterns of human mobility among districts of a city may help strike an appropriate balance between the tendencies of these two extreme assumptions. Regardless of whether such data streams provide a “correct” picture of mobility, it is encouraging that our results showed that Twitter, gravity, and radiation assumptions all resulted in similar epidemiological inferences.

The results of our analysis are a useful case study for any infectious disease, but they have particular importance for dengue. First, dengue mitigation strategies tend to be spatially reactive to reported incidence [59]. Other than the town with the highest per capita incidence, we did not identify a strong correspondence between towns with the highest incidence and those with the highest inferred transmission coefficients (similar to recent findings for malaria [11]), which suggests that reactive control deployed to areas with the highest incidence may not necessarily have as much impact on reducing transmission as more optimized strategies might. Second, due to adverse outcomes associated with vaccinating individuals with no prior DENV exposure with the only currently licensed dengue vaccine [20], it is recommended that this vaccine only be used in areas with high transmission intensity [17]. Although consideration is given to subnational variation in transmission intensity (see https://mrcdata.dide.ic.ac.uk/_dengue/dengue.php), our results indicate that this issue may also warrant attention at intra-urban scales. Third, the increasing trend in the transmission coefficient that we observed under the Twitter mobility assumption serves as a reminder that a decreasing trend in incidence may not be indicative of a decreasing trend in factors underlying transmission. Instead, it appears that incidence has probably decreased due to an increase in herd immunity and that conditions remain ripe for transmission, which could result in a large epidemic once a sufficient number of susceptibles build up from births and waning heterotypic immunity.

Although our Twitter mobility assumption may be an improvement over some of the other assumptions that we examined, there are a number of other considerations about intra-urban human mobility that are likely to affect DENV transmission. We have limited understanding of how representative Twitter patterns are compared to actual movements of people and see them more as an approximation to relative movements between towns. At the same time, Twitter data have the advantage of being widely accessible for many urban areas worldwide, whereas alternative models for intra-urban human mobility tend to have been developed around specific settings (e.g., [37]). In addition, there may be interactions between disease symptoms, infectiousness, and mobility in DENV-infected people [13, 39] that complicate the assumption that tweets by presumably healthy people are a suitable approximation of mobility patterns of people involved in transmission [58]. Higher order descriptions of movement may also be necessary to accurately capture transmission dynamics, as social network structure has been shown to affect transmission dynamics in urban environments [46, 54, 57]. There is also the perennial question of what spatial scale is satisfactory for modeling infectious disease dynamics [36, 40]. Examinations of intra-urban DENV transmission patterns in Bangkok, Thailand suggest that there can be strong spatial heterogeneities of relevance to transmission dynamics at scales as small as hundreds of meters [47, 49].

Overall, our results showed that the inferred degree of variation in a spatially and temporally variable transmission coefficient was sensitive to the five different assumptions about intra-urban mobility that we considered. This approach extends previous applications of the TSIR model that estimated either seasonally varying transmission coefficients according to pre-defined functions (e.g., [16]) or completely independent values of the transmission coefficient at each time step (e.g., [35]) by blending the same underlying conceptual approach with a powerful new regression technique [41] and applying it in a spatial context. Although our analysis does not identify which specific factors underlie the variation in the transmission coefficient that we uncovered, there are many well-known candidates that could be incorporated into future analyses [7, 8, 25, 52]. Either way, we expect that our results about the sensitivity of transmission inferences to assumptions about intra-urban mobility would still apply. More generally, we hope that this case study will serve as a guiding example to the growing number of data scientists engaging in analyses of infectious disease dynamics. The increasing availability of data from Twitter and other Internet-based streams provides an exciting opportunity for extracting new understanding from time series of infectious disease incidence, if used within an appropriate conceptual framework.

Acknowledgments

We are thankful to Dr. Irfan Ahmed, the World Health Organization (WHO), Punjab Office and Punjab Health Department for providing dengue case data.

Availability of data and materials

Data and code are available upon request.

Authors’ contributions

MUGK, TAP designed the experiment. RCR, DLS advised on methods. SIH, JH, CCF, JSB, DB, RZ provided and processed data. MUGK, TAP performed the analyses and wrote the paper. All authors provided feedback and contributed to revisions. All authors read and approved the final manuscript.

Funding

MUGK is supported by The Branco Weiss Fellowship – Society in Science, administered by the ETH Zurich, and acknowledges funding from a Training Grant from the National Institute of Child Health and Human Development (T32HD040128) and the National Library of Medicine of the National Institutes of Health (R01LM010812, R01LM011965). RCR, DLS, and TAP received support from a grant from the Bill and Melinda Gates Foundation (OPP 1110495 to DLS). TAP received support from a grant from the National Institutes of Health/National Institute of Allergy and Infectious Diseases (1P01AI098670-01A1) and a DARPA Young Faculty Award (D16AP00114). JSB and JBH acknowledge support by the National Institutes of Health (NIH 5R01LMO011965-03).

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Moritz U. G. Kraemer, Email: kramer.moritz@gmail.com

T. Alex Perkins, Email: taperkins@nd.edu

References

  • 1. Akram W, Hafeez F, Ullah UN, Kim YK, Hussain A, Lee JJ. Seasonal distribution and species composition of daytime biting mosquitoes. Entomol Res. 2009;39:107–113. doi: 10.1111/j.1748-5967.2009.00204.x.
  • 2. Alexander L, Jiang S, Murga M, González MC. Origin-destination trips by purpose and time of day inferred from mobile phone data. Transp Res, Part C, Emerg Technol. 2015;58:240–250. doi: 10.1016/j.trc.2015.02.018.
  • 3. Althouse BM, Scarpino SV, Meyers LA, Ayers JW, Bargsten M, Baumbach J, et al. Enhancing disease surveillance with novel data streams: challenges and opportunities. EPJ Data Sci. 2015;4:17. doi: 10.1140/epjds/s13688-015-0054-0.
  • 4. Bassolas A, Lenormand M, Tugores A, Gonçalves B, Ramasco JJ. Touristic site attractiveness seen through Twitter. EPJ Data Sci. 2016;5:12. doi: 10.1140/epjds/s13688-016-0073-5.
  • 5. Beiró MG, Panisson A, Tizzoni M, Cattuto C. Predicting human mobility through the assimilation of social media traces into mobility models. EPJ Data Sci. 2016;5:30. doi: 10.1140/epjds/s13688-016-0092-2.
  • 6. Bhatt S, Gething PW, Brady OJ, Messina JP, Farlow AW, Moyes CL, et al. The global distribution and burden of dengue. Nature. 2013;496:504–507. doi: 10.1038/nature12060.
  • 7. Brady OJ, Golding N, Pigott DM, Kraemer MU, Messina JP, Reiner RC, et al. Global temperature constraints on Aedes aegypti and Ae. albopictus persistence and competence for dengue virus transmission. Parasites Vectors. 2014;7:338. doi: 10.1186/1756-3305-7-338.
  • 8. Brady OJ, Johansson MA, Guerra CA, Bhatt S, Golding N, Pigott DM, et al. Modelling adult Aedes aegypti and Aedes albopictus survival at different temperatures in laboratory and field settings. Parasites Vectors. 2013;6:351. doi: 10.1186/1756-3305-6-351.
  • 9. Chowell G, Nishiura H. Transmission dynamics and control of Ebola virus disease (EVD): a review. BMC Med. 2014;12:196. doi: 10.1186/s12916-014-0196-0.
  • 10. Clapham HE, Cummings DAT, Johansson MA. Immune status alters the probability of apparent illness due to dengue virus infection: evidence from a pooled analysis across multiple cohort and cluster studies. PLoS Negl Trop Dis. 2017;11:e0005926. doi: 10.1371/journal.pntd.0005926.
  • 11. Cohen JM, Le Menach A, Pothin E, Eisele TP, Gething PW, Eckhoff PA, et al. Mapping multiple components of malaria risk for improved targeting of elimination interventions. Malar J. 2017;16:459. doi: 10.1186/s12936-017-2106-3.
  • 12. Çolak S, Lima A, Gonzalez MC. Understanding congested travel in urban areas. Nat Commun. 2016;7:10793. doi: 10.1038/ncomms10793.
  • 13. Duong V, Lambrechts L, Paul RE, Ly S, Lay RS, Long KC, et al. Asymptomatic humans transmit dengue virus to mosquitoes. Proc Natl Acad Sci USA. 2015;112:14688–14693. doi: 10.1073/pnas.1508114112.
  • 14. Faria NR, Lourenco J, Marques de Cerqueira E, Maia de Lima M, Pybus O, Alcantara CJ. Epidemiology of chikungunya virus in Bahia, Brazil, 2014–2015. PLoS Curr Outbreaks. 2015. doi: 10.1371/currents.outbreaks.c97507e3e48efb946401755d468c28b2.
  • 15. Finger F, Genolet T, Mari L, Constantin G, Magny D, Magloire N. Mobile phone data highlights the role of mass gatherings in the spreading of cholera outbreaks. Proc Natl Acad Sci USA. 2016;113:6421–6426. doi: 10.1073/pnas.1522305113.
  • 16. Finkenstädt BF, Grenfell BT. Time series modelling of childhood diseases: a dynamical systems approach. Appl Stat. 2000;49:187–205.
  • 17. Flasche S, Jit M, Rodríguez-Barraquer I, Coudeville L, Recker M, Koelle K, et al. The long-term safety, public health impact, and cost-effectiveness of routine vaccination with a recombinant, live-attenuated dengue vaccine (Dengvaxia): a model comparison study. PLoS Med. 2016;13:e1002181. doi: 10.1371/journal.pmed.1002181.
  • 18. Funk S, Salathé M, Jansen VAA. Modelling the influence of human behaviour on the spread of infectious diseases: a review. J R Soc Interface. 2010;7:1247–1256. doi: 10.1098/rsif.2010.0142.
  • 19. González MC, Hidalgo CA, Barabási A-L. Understanding individual human mobility patterns. Nature. 2008;453:779–782. doi: 10.1038/nature06958.
  • 20. Hadinegoro SR, Arredondo-García JL, Capeding MR, Deseda C, Chotpitayasunondh T, Dietze R, et al. Efficacy and long-term safety of a dengue vaccine in regions of endemic disease. N Engl J Med. 2015;373:1195–1206. doi: 10.1056/NEJMoa1506223.
  • 21. Johansson MA, Vasconcelos PFC, Staples JE. The whole iceberg: estimating the incidence of yellow fever virus infection from the number of severe cases. Trans R Soc Trop Med Hyg. 2014;108:482–487. doi: 10.1093/trstmh/tru092.
  • 22. Kraemer MUG, Faria NR, Reiner RC, Jr, Golding N, Nikolay B, Stasse S, et al. Spread of yellow fever virus outbreak in Angola and the Democratic Republic of the Congo 2015–16: a modelling study. Lancet Infect Dis. 2017;17:330–338. doi: 10.1016/S1473-3099(16)30513-8.
  • 23. Kraemer MUG, Hay SI, Pigott DM, Smith DL, Wint GRW, Golding N. Progress and challenges in infectious disease cartography. Trends Parasitol. 2016;32:19–29. doi: 10.1016/j.pt.2015.09.006.
  • 24. Kraemer MUG, Perkins TA, Cummings DAT, Zakar R, Hay SI, Smith DL, et al. Big city, small world: density, contact rates, and transmission of dengue across Pakistan. J R Soc Interface. 2015;12:20150468. doi: 10.1098/rsif.2015.0468.
  • 25. Kraemer MUG, Sinka ME, Duda KA, Mylne A, Shearer FM, Barker CM, et al. The global distribution of the arbovirus vectors Aedes aegypti and Ae. albopictus. eLife. 2015;4:e08347. doi: 10.7554/eLife.08347.
  • 26. Kraemer MUG, Sinka ME, Duda KA, Mylne A, Shearer FM, Brady OJ, et al. The global compendium of Aedes aegypti and Ae. albopictus occurrence. Sci Data. 2015;2:150035. doi: 10.1038/sdata.2015.35.
  • 27. Laniado D, Volkovich Y, Scellato S, Mascolo C, Kaltenbrunner A. The impact of geographic distance on online social interactions. Inf Syst Front. 2017.
  • 28. Lenormand M, Picornell M, Cantú-Ros OG, Tugores A, Louail T, Herranz R, et al. Cross-checking different sources of mobility information. PLoS ONE. 2014;9:e105184. doi: 10.1371/journal.pone.0105184.
  • 29. Leta S, Jibat T, De Clercq EM, Amenu K, Kraemer MUG, Revie CW. Global risk mapping for major diseases transmitted by Aedes aegypti and Aedes albopictus. Int J Infect Dis. 2018;67:25–35. doi: 10.1016/j.ijid.2017.11.026.
  • 30. Lourenco J, De Lima MM, Faria NR, Walker A, Kraemer MUG, Villabona-Arenas CJ, et al. Epidemiological and ecological determinants of Zika virus transmission in an urban setting. eLife. 2017;6:e29820. doi: 10.7554/eLife.29820.
  • 31. Mahmud AS, Metcalf CJE, Grenfell BT. Comparative dynamics, seasonality in transmission, and predictability of childhood infections in Mexico. Epidemiol Infect. 2017;145:607–625. doi: 10.1017/S0950268816002673.
  • 32. McNeill G, Bright J, Hale SA. Estimating local commuting patterns from geolocated Twitter data. EPJ Data Sci. 2017;6:24. doi: 10.1140/epjds/s13688-017-0120-x.
  • 33. Messina JP, Brady OJ, Pigott DM, Golding N, Kraemer MUG, Scott TW, et al. The many projected futures of dengue. Nat Rev Microbiol. 2015;13:230–239. doi: 10.1038/nrmicro3430.
  • 34. Metcalf CJE, Bjornstad ON, Ferrari MJ, Klepac P, Bharti N, Lopez-Gatell H, et al. The epidemiology of rubella in Mexico: seasonality, stochasticity and regional variation. Epidemiol Infect. 2011;139:1029–1038. doi: 10.1017/S0950268810002165.
  • 35. Metcalf CJE, Bjornstad ON, Grenfell BT, Andreasen V. Seasonality and comparative dynamics of six childhood infections in pre-vaccination Copenhagen. Proc R Soc Lond B, Biol Sci. 2009;276:4111–4118. doi: 10.1098/rspb.2009.1058.
  • 36. Mills HL, Riley S. The spatial resolution of epidemic peaks. PLoS Comput Biol. 2014;10:e1003561. doi: 10.1371/journal.pcbi.1003561.
  • 37. Perkins TA, Garcia AJ, Paz-Soldan VA, Stoddard ST, Reiner RC, Vazquez-Prokopec G, et al. Theory and data for simulating fine-scale human movement in an urban environment. J R Soc Interface. 2014;11:20140642. doi: 10.1098/rsif.2014.0642.
  • 38. Perkins TA, Metcalf CJE, Grenfell BT, Tatem AJ. Estimating drivers of autochthonous transmission of chikungunya virus in its invasion of the Americas. PLoS Curr. 2015. doi: 10.1371/currents.outbreaks.a4c7b6ac10e0420b1788c9767946d1fc.
  • 39. Perkins TA, Paz-Soldan VA, Stoddard ST, Morrison AC, Forshey BM, Long KC, et al. Calling in sick: impacts of fever on intra-urban human mobility. Proc R Soc Lond B, Biol Sci. 2016;283:20160390. doi: 10.1098/rspb.2016.0390.
  • 40. Perkins TA, Scott TW, Le Menach A, Smith DL. Heterogeneity, mixing, and the spatial scales of mosquito-borne pathogen transmission. PLoS Comput Biol. 2013;9:e1003327. doi: 10.1371/journal.pcbi.1003327.
  • 41. Pya N, Wood SN. Shape constrained additive models. Stat Comput. 2015;25:543–559. doi: 10.1007/s11222-013-9448-7.
  • 42. R Core Team. R: a language and environment for statistical computing. Vienna: R Foundation for Statistical Computing; 2016.
  • 43. Reich NG, Shrestha S, King AA, Rohani P, Lessler J, Kalayanarooj S, et al. Interactions between serotypes of dengue highlight epidemiological impact of cross-immunity. J R Soc Interface. 2013;10:20130414. doi: 10.1098/rsif.2013.0414.
  • 44. Reiner RC, King AA, Emch M, Yunus M, Faruque ASG, Pascual M. Highly localized sensitivity to climate forcing drives endemic cholera in a megacity. Proc Natl Acad Sci USA. 2012;109:2033–2036. doi: 10.1073/pnas.1108438109.
  • 45. Reiner RC, Perkins TA, Barker CM, Niu T, Chaves LF, Ellis AM, et al. A systematic review of mathematical models of mosquito-borne pathogen transmission: 1970–2010. J R Soc Interface. 2013;10:20120921. doi: 10.1098/rsif.2012.0921.
  • 46. Reiner RC, Stoddard ST, Scott TW. Socially structured human movement shapes dengue transmission despite the diffusive effect of mosquito dispersal. Epidemics. 2014;6:30–36. doi: 10.1016/j.epidem.2013.12.003.
  • 47. Salje H, Lessler J, Berry IM, Melendrez MC, Endy T, Kalayanarooj S, et al. Dengue diversity across spatial and temporal scales: local structure and the effect of host population size. Science. 2017;355:1302–1306. doi: 10.1126/science.aaj9384.
  • 48. Salje H, Lessler J, Endy TP, Curriero FC, Gibbons RV, Nisalak A, et al. Revealing the microscale spatial signature of dengue transmission and immunity in an urban population. Proc Natl Acad Sci USA. 2012;109:9535–9538. doi: 10.1073/pnas.1120621109.
  • 49. Salje H, Lessler J, Kumar K, Azman AS, Rahman MW, Rahman M. How social structures, space, and behaviors shape the spread of infectious diseases using chikungunya as a case study. Proc Natl Acad Sci USA. 2016;113:13420–13425. doi: 10.1073/pnas.1611391113.
  • 50. Simini F, González MC, Maritan A, Barabási A-L. A universal model for mobility and migration patterns. Nature. 2012;484:96–100. doi: 10.1038/nature10856.
  • 51. Simmons CP, Farrar JJ, Chau NVV, Wills B. Dengue. N Engl J Med. 2012;366:1423–1432. doi: 10.1056/NEJMra1110265.
  • 52. Siraj AS, Oidtman RJ, Huber JH, Kraemer MUG, Brady J, Johansson MA, et al. Temperature modulates dengue virus epidemic growth rates through its effects on reproduction numbers and generation intervals. PLoS Negl Trop Dis. 2017;11:e0005797. doi: 10.1371/journal.pntd.0005797.
  • 53. Stevenson JC, Pinchoff J, Muleba M, Lupiya J, Chilusu H, Mwelwa I, et al. Spatio-temporal heterogeneity of malaria vectors in northern Zambia: implications for vector control. Parasites Vectors. 2016;9:510. doi: 10.1186/s13071-016-1786-9.
  • 54. Stoddard ST, Forshey BM, Morrison AC, Paz-Soldan VA, Vazquez-Prokopec GM, Astete H, et al. House-to-house human movement drives dengue virus transmission. Proc Natl Acad Sci USA. 2013;110:994–999. doi: 10.1073/pnas.1213349110.
  • 55. Stoddard ST, Morrison AC, Vazquez-Prokopec GM, Paz Soldan V, Kochel TJ, Kitron U, et al. The role of human movement in the transmission of vector-borne pathogens. PLoS Negl Trop Dis. 2009;3:e481. doi: 10.1371/journal.pntd.0000481.
  • 56. Vanlerberghe V, Gómez-Dantés H, Vazquez-Prokopec GM, Alexander N, Manrique-Saide P, Coelho G, et al. Changing paradigms in Aedes control: considering the spatial heterogeneity of dengue transmission. Pan Am J Public Health. 2017;41:1–6. doi: 10.26633/RPSP.2017.16.
  • 57. Vazquez-Prokopec GM, Bisanzio D, Stoddard ST, Paz-Soldan V, Morrison AC, Elder JP, et al. Using GPS technology to quantify human mobility, dynamic contacts and infectious disease dynamics in a resource-poor urban environment. PLoS ONE. 2013;8:e58802. doi: 10.1371/journal.pone.0058802.
  • 58. Wesolowski A, Buckee CO, Engø-Monsen K, Metcalf CJE. Connecting mobility to infectious diseases: the promise and limits of mobile phone data. J Infect Dis. 2016;214:S414–S420. doi: 10.1093/infdis/jiw273.
  • 59. World Health Organization (WHO). Dengue: guidelines for diagnosis, treatment, prevention, and control. Geneva: World Health Organization; 2009.
  • 60. Zipf GK. The P1P2D hypothesis: on the intercity movement of persons. Am Sociol Rev. 1946;11:677–686. doi: 10.2307/2087063.

Articles from Epj Data Science are provided here courtesy of Springer
