eLife. 2023 Apr 4;12:e80466. doi: 10.7554/eLife.80466

Disentangling the rhythms of human activity in the built environment for airborne transmission risk: An analysis of large-scale mobility data

Zachary Susswein 1, Eva C Rest 1, Shweta Bansal 1,
Editors: Niel Hens2, Diane M Harper3
PMCID: PMC10118388  PMID: 37014055

Abstract

Background:

Since the outset of the COVID-19 pandemic, substantial public attention has focused on the role of seasonality in impacting transmission. Misconceptions have relied on seasonal mediation of respiratory diseases driven solely by environmental variables. However, seasonality is expected to be driven by host social behavior, particularly in highly susceptible populations. A key gap in understanding the role of social behavior in respiratory disease seasonality is our incomplete understanding of the seasonality of indoor human activity.

Methods:

We leverage a novel data stream on human mobility to characterize activity in indoor versus outdoor environments in the United States. We use an observational mobile app-based location dataset encompassing over 5 million locations nationally. We classify locations as primarily indoor (e.g. stores, offices) or outdoor (e.g. playgrounds, farmers markets), disentangling location-specific visits into indoor and outdoor, to arrive at a fine-scale measure of indoor to outdoor human activity across time and space.

Results:

We find the proportion of indoor to outdoor activity during a baseline year is seasonal, peaking in winter months. The measure displays a latitudinal gradient with stronger seasonality at northern latitudes and an additional summer peak in southern latitudes. We statistically fit this baseline indoor-outdoor activity measure to inform the incorporation of this complex empirical pattern into infectious disease dynamic models. However, we find that the disruption of the COVID-19 pandemic caused these patterns to shift significantly from baseline and the empirical patterns are necessary to predict spatiotemporal heterogeneity in disease dynamics.

Conclusions:

Our work empirically characterizes, for the first time, the seasonality of human social behavior at a large scale with a high spatiotemporal resolution and provides a parsimonious parameterization of seasonal behavior that can be included in infectious disease dynamics models. We provide critical evidence and methods necessary to inform the public health of seasonal and pandemic respiratory pathogens and improve our understanding of the relationship between the physical environment and infection risk in the context of global change.

Funding:

Research reported in this publication was supported by the National Institute of General Medical Sciences of the National Institutes of Health under award number R01GM123007.

Research organism: Human

Introduction

The seasonality of infectious diseases is a widespread and familiar phenomenon. Although a number of potential mechanisms driving seasonality in directly transmitted infectious diseases have been proposed, the causal process behind seasonality is still largely an open question (Martinez, 2018; Altizer et al., 2006; Grassly and Fraser, 2006). In the case of the influenza virus, seasonal changes in humidity have been identified as a potential mechanism, with drier winter months enhancing transmission (Shaman and Kohn, 2009; Shaman et al., 2010; Dalziel et al., 2018); similar patterns have been observed for respiratory syncytial virus and hand foot and mouth disease (Baker et al., 2019; Onozuka and Hashizume, 2011). However, humidity is but one of many mechanisms contributing to seasonality in infectious disease transmission. Seasonal changes in temperature, human mixing patterns, and the immune landscape, among other factors, are thought to contribute to transmission dynamics (Metcalf et al., 2009; Mossong et al., 2008; Kronfeld-Schor et al., 2021; Bakker et al., 2021; Altizer et al., 2006). The relative importance of these disparate mechanisms varies across directly-transmitted pathogens and is still largely unexplained (Martinez, 2018; Grassly and Fraser, 2006). The influence of seasonal host behavior on respiratory disease seasonality remains particularly understudied (Fisman, 2012; Kronfeld-Schor et al., 2021) except for a few notable examples (Bharti et al., 2011; Few et al., 2013; Kummer et al., 2022).

For respiratory pathogens spread via the aerosol transmission route, in particular, seasonality may be mediated by multiple behaviorally-driven mechanisms. Aerosol transmission, a significant mode of transmission for a number of respiratory pathogens including tuberculosis, measles, and influenza (Tellier et al., 2019), has become increasingly acknowledged during the COVID-19 pandemic (Greenhalgh et al., 2021; Wang et al., 2021; Jayaweera et al., 2020; Klompas et al., 2020; Morawska and Milton, 2020). The role of aerosols in respiratory disease transmission allows for transmission outside of the traditional 6 ft. radius and 5 min duration for the droplet mode and implicates human mixing in indoor locations with poor ventilation as being a high risk for transmission, regardless of the intensity of the social contact. While more is known about the spatiotemporal variation in environmental factors such as temperature and humidity in the indoor environment (e.g. Nguyen and Dockery, 2016) and about the impact these factors have on airborne pathogen transmission (e.g. Robey and Fierce, 2022; Yang and Marr, 2011), limited information is available on rates of human indoor activity and how this varies geographically and seasonally. In the United States, most studies quantifying indoor and outdoor time are conducted in the context of air pollutants, suffer from small study sizes, lack spatiotemporal resolution, and are outdated. The most cited estimates originate from the 1980s-90s and estimate that Americans spend upwards of 90% of their time indoors (Ott, 1988); more recent data agree with these estimates (Klepeis et al., 2001; Spalt et al., 2016). While it is well understood that seasonal differences and latitude likely affect time spent indoors, little is known of the spatiotemporal variation in indoor activity beyond this one monolithic estimate, vastly limiting our ability to comprehensively characterize the seasonality of airborne disease exposure risk.

Because our understanding of the drivers of seasonality for respiratory diseases has been limited, the modeling of seasonally-varying infectious disease dynamics has been traditionally done using environmental data-driven or phenomenological approaches. Environmental data-driven approaches incorporate seasonality into epidemiological models through environmental correlates of seasonality, such as solar exposure or outdoor temperature (Bakker et al., 2021; Baker et al., 2019; Coletti et al., 2018). This approach to seasonal dynamics controls for interseasonal variation in transmission dynamics and measures the strength of correlations between proposed metrics and seasonal variation in force of infection – although the observed relationship is rarely causally relevant for respiratory disease transmission. In contrast, phenomenological models such as seasonal forcing approaches modulate transmissibility over time without specifying a particular mechanism for this modulation (Keeling et al., 2001; Altizer et al., 2006). By applying well-understood functions (such as sine functions), seasonal forcing allows for flexible specification and quantification of dynamics, such as periodicity or oscillation damping, and indirectly captures seasonal variation in nonenvironmental factors such as school mixing. A significant remaining gap in seasonal infectious disease modeling is thus the ability to empirically incorporate spatiotemporal variation in behavioral mechanisms driving seasonality of disease exposure and transmission.

Thus, despite the role of the indoor built environment in exposure to the airborne transmission route, seasonal variation in indoor human mixing has not yet been systematically characterized nor integrated into mathematical models of seasonal respiratory pathogens. To address this gap, we construct a novel metric quantifying the relative propensity for human mixing to be indoors at a fine spatiotemporal scale across the United States. We derive this metric using anonymized mobile GPS panel data of visits of over 45 million mobile devices to approximately 5 million public locations across the United States. We find a systematic latitudinal gradient, with indoor activity patterns in the northern and southern United States following distinct temporal trends at baseline. However, we find that the COVID-19 pandemic disrupted this structure. Lastly, we fit simple parametric models to incorporate these seasonal activity dynamics into models of infectious disease transmission when indoor activity is expected to be at baseline. Our work provides the evidence and methods necessary to inform the epidemiology of seasonal and pandemic respiratory pathogens and improve our understanding of the relationship between the physical environment and infection risk in light of global change.

Methods

Data source

We use the SafeGraph Weekly Patterns data, which provides foot traffic at public locations (‘points of interest’, hereafter referred to as POIs) across the United States based on the usage of mobile apps with GPS (Safegraph, 2021a). The data are from 2018–2020, and 4.6 million POIs are sampled in all years of our study. The data are anonymized by applying noise and omitting data associated with a single mobile device, and are provided at the weekly temporal scale. Data are sampled from over 45 million smartphone devices (of approximately 275–290 million smartphone devices in the United States during 2018–2021; Statista Digital Market Outlook, 2022) and do not include devices that are out of service, are powered off, or have opted out of location services.

This is secondary data analysis, so no informed consent or consent to publish was necessary. Ethical review for this study (STUDY00003041) was sought from the Institutional Review Board at Georgetown University and was approved on October 14, 2020.

Defining indoor activity seasonality

Safegraph POIs are locations where consumers can spend money and/or time and include schools, hospitals, parks, grocery stores, restaurants, etc., but do not include home locations. (In Figure 1—figure supplement 1, we show that time at home does not display significant seasonal variation). Each POI is assigned a six-digit North American Industry Classification System (NAICS) code in the SafeGraph Core Places dataset to classify each location into a business category. We classify each six-digit NAICS code (363 unique codes in total) as primarily indoor (e.g. schools, hospitals, grocery stores) or primarily outdoor (e.g. parks, cemeteries, zoos). We classify some locations as unclear if the location is a potentially mixed indoor and outdoor setting (e.g. gas stations with convenience stores, automobile dealerships). Approximately 90% of POIs were classified as indoors, 6.5% were classified as outdoors, and 3.5% were classified as unclear. In Figure 1—figure supplement 2, we illustrate the robustness of our metric to the classification of unclear locations.

We define $\tilde{\sigma}_{it}$ (Equation 1) as the propensity for visits to be to indoor locations relative to outdoor locations. We aggregate raw visit counts (a visit is counted when a device is present at a non-home POI for longer than one minute) across all indoor POIs and across all outdoor POIs in a given week ($t$) at the US county level ($i$). Visit counts are normalized by the maximum weekly visit count for indoor or outdoor locations, respectively, in each county during the year 2019 (in Figure 1—figure supplement 3, we show that the maximum visit count is comparable in 2018 and 2019).

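$$\tilde{\sigma}_{it} = \frac{N^{\text{indoor}}_{it} \,/\, \max_t\{N^{\text{indoor}}_{it}\}}{N^{\text{outdoor}}_{it} \,/\, \max_t\{N^{\text{outdoor}}_{it}\}} \tag{1}$$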

This metric is then mean-centered to arrive at a relative measure of indoor activity seasonality, $\sigma_{it}$, which is comparable across all counties:

$$\sigma_{it} = \frac{\tilde{\sigma}_{it}}{\mu_{\tilde{\sigma}}} \tag{2}$$

We note that $\mu_{\tilde{\sigma}}$ is not spatially structured (see Figure 1—figure supplement 4).
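Equations 1 and 2 can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function name, the toy visit counts, and the choice to take the mean over all supplied weeks are assumptions made here for the example.

```python
from statistics import mean

def indoor_seasonality(indoor, outdoor, reference_weeks):
    """Return (sigma_tilde, sigma) for one county.

    indoor, outdoor: weekly visit counts of equal length.
    reference_weeks: indices of the weeks that supply the normalizing
    maxima (the 2019 weeks in the text).
    """
    max_in = max(indoor[t] for t in reference_weeks)
    max_out = max(outdoor[t] for t in reference_weeks)
    # Equation 1: normalized indoor visits over normalized outdoor visits.
    sigma_tilde = [(n_in / max_in) / (n_out / max_out)
                   for n_in, n_out in zip(indoor, outdoor)]
    # Equation 2: divide by the county's own mean so that 1.0 means "average".
    # Assumption: the mean is taken over all weeks passed in.
    mu = mean(sigma_tilde)
    sigma = [s / mu for s in sigma_tilde]
    return sigma_tilde, sigma

# Hypothetical county with four weeks of data (two winter-like, two summer-like).
indoor = [900, 1000, 800, 700]
outdoor = [50, 40, 100, 80]
sigma_tilde, sigma = indoor_seasonality(indoor, outdoor, reference_weeks=range(4))
```

In this toy example the second week has many indoor and few outdoor visits, so its value is above 1, while the third week falls below 1. Weeks with zero outdoor visits would need separate handling, which this sketch does not attempt.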

As a data cleaning step, we use spatial imputation for any county-weeks where sample sizes are small. For location-weeks in which the total visit count is less than 100, we impute the indoor activity seasonality using an average of σ in the neighboring locations (where neighbors are defined based on shared county borders). This affects 0.6% of all county-weeks and a total of 79 (out of 3143) counties.
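The imputation rule can be sketched as follows for a single week. The dictionary layout, the function name, and the decision to average only over neighbors that themselves meet the sample-size threshold are assumptions for illustration; the text specifies only the threshold of 100 visits and the shared-border definition of neighbors.

```python
from statistics import mean

def impute_low_sample(sigma, total_visits, neighbors, min_visits=100):
    """Replace sigma in low-sample counties with the mean over bordering counties.

    sigma, total_visits: dict mapping county -> value for one week.
    neighbors: dict mapping county -> list of counties sharing a border.
    """
    imputed = dict(sigma)
    for county, visits in total_visits.items():
        if visits < min_visits:
            # Assumption: only neighbors with adequate sample size act as donors.
            donors = [sigma[n] for n in neighbors.get(county, [])
                      if total_visits.get(n, 0) >= min_visits]
            if donors:
                imputed[county] = mean(donors)
    return imputed

# Hypothetical week: county C has only 40 visits and borders A and B.
week_sigma = {"A": 1.2, "B": 0.8, "C": 5.0}
week_visits = {"A": 500, "B": 300, "C": 40}
borders = {"C": ["A", "B"]}
cleaned = impute_low_sample(week_sigma, week_visits, borders)
```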

Time series clustering analysis

To characterize groups of US counties with similar indoor activity dynamics, we use a complex networks-based time series clustering approach. We first calculate the pairwise similarity between the z-normalized indoor activity time series for each pair of counties, $i$ and $j$, using the Pearson correlation coefficient ($\rho_{ij}$). We then represent the pairwise time series similarities as a weighted network in which nodes are US counties and edges connect pairs of counties whose $\rho_{ij}$ is in the top 10% of all correlations, so that edges represent strong time series similarity (in Figure 2—figure supplement 1, we show the robustness of our clustering results to this choice of correlation threshold).
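The network construction can be sketched with the standard library alone. The function names and the five synthetic series are hypothetical. Because the Pearson correlation is invariant to z-normalization, the sketch computes it on the raw series.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def similarity_network(series, top_fraction=0.10):
    """Return {(i, j): rho} for the pairs in the top fraction of correlations."""
    rhos = {(i, j): pearson(series[i], series[j])
            for i, j in combinations(sorted(series), 2)}
    ranked = sorted(rhos.values(), reverse=True)
    keep = max(1, int(len(ranked) * top_fraction))
    threshold = ranked[keep - 1]
    return {pair: rho for pair, rho in rhos.items() if rho >= threshold}

# Five synthetic "counties": n1 and n2 share a winter peak, the rest differ.
weeks = range(52)
w = 2 * math.pi / 52
series = {
    "n1": [1 + 0.20 * math.cos(w * t) for t in weeks],
    "n2": [1 + 0.25 * math.cos(w * t) for t in weeks],
    "s1": [1 + 0.10 * math.cos(2 * w * t) for t in weeks],
    "s2": [1 + 0.10 * math.sin(w * t) for t in weeks],
    "s3": [1 - 0.20 * math.cos(w * t) for t in weeks],
}
edges = similarity_network(series)
```

With ten candidate pairs, the 10% threshold keeps a single edge, which links the two series that peak together. The resulting weighted edge list is the kind of input a community detection routine would take.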

We then cluster the time series similarity network using community structure detection. This method effectively clusters nodes (counties) into groups of nodes that are more connected within than between. The resulting clustering thus represents a regionalization of the United States in which regions consist of counties that have more similar indoor activity dynamics to each other than to other regions. One benefit of the network-based community detection approach over other clustering methods is that community detection does not require user specification of the number of clusters (regions, in this case); instead, the number of clusters emerges organically from the data connectivity (Aggarwal and Reddy, 2013). For community detection, we apply the Louvain method (Blondel et al., 2008), a multiscale method in which modularity is first optimized using a greedy local algorithm, to the similarity network with edge weights (i.e. time series correlations), using an igraph implementation in Python (Louvain-igraph, 2018).

We performed a robustness assessment of the community structure using a set of 25 ‘bootstrap networks,’ $B_i$. For each bootstrap network, the edge weight (i.e. the time series correlation) of each edge of the network was perturbed by $\epsilon \sim N(0, 0.05)$. The community structure algorithm was performed on each bootstrap network. A consensus value was then calculated as the sum of the normalized mutual information between the community structure partition of the bootstrap network $B_i$ and all other bootstrap networks. The partition with the largest consensus value was defined as the robust community structure partition.
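The consensus step can be sketched as below. This is an illustration under stated assumptions: the partitions are taken as given (in practice each would come from running community detection on a perturbed network), the normalized mutual information uses the arithmetic-mean normalization, and 0.05 is treated as a standard deviation. The text specifies none of these details, and all function names are hypothetical.

```python
import math
import random
from collections import Counter

def perturb(edges, rng, scale=0.05):
    """Add Gaussian noise to each edge weight (scale read as a standard deviation)."""
    return {pair: weight + rng.gauss(0.0, scale) for pair, weight in edges.items()}

def nmi(a, b):
    """Normalized mutual information between two labelings of the same nodes."""
    n = len(a)
    ca, cb, cab = Counter(a), Counter(b), Counter(zip(a, b))
    mi = sum(c / n * math.log(c * n / (ca[x] * cb[y])) for (x, y), c in cab.items())
    ha = -sum(c / n * math.log(c / n) for c in ca.values())
    hb = -sum(c / n * math.log(c / n) for c in cb.values())
    if ha == 0 and hb == 0:
        return 1.0  # both labelings put every node in one group
    return 2 * mi / (ha + hb)

def consensus_partition(partitions):
    """Pick the partition whose summed NMI with all other partitions is largest."""
    scores = [sum(nmi(p, q) for j, q in enumerate(partitions) if j != i)
              for i, p in enumerate(partitions)]
    best = max(range(len(partitions)), key=scores.__getitem__)
    return partitions[best], scores

# Hypothetical cluster labels for six counties from four bootstrap networks.
partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],  # same grouping as the first, with labels swapped
    [0, 0, 1, 1, 1, 1],
    [0, 1, 0, 1, 0, 1],
]
robust, scores = consensus_partition(partitions)
noisy = perturb({("a", "b"): 0.9}, random.Random(0))
```

Because normalized mutual information ignores label names, the first two partitions count as identical, and one of them is selected as the consensus.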

Given some known limitations to the time series correlation network-based approach to clustering (Hoffmann et al., 2020), we validated our network-based clustering results with another common clustering method. In particular, we used hierarchical clustering with Ward linkage and Euclidean distance on z-normalized indoor activity time series, implemented using scipy in Python. (We note that Euclidean distance on z-normalized time series is equivalent to Pearson’s correlation; Berthold and Höppner, 2016.) The results of this comparison are summarized in Figure 2—figure supplement 5.
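The equivalence between Euclidean distance and Pearson's correlation can be made explicit with a short derivation. It assumes each series of length $n$ is z-normalized using the population standard deviation, so that $\sum_t x_t = 0$ and $\sum_t x_t^2 = n$; this normalization convention is an assumption, as the text does not state it.

$$\lVert x - y \rVert^2 = \sum_t x_t^2 + \sum_t y_t^2 - 2 \sum_t x_t y_t = n + n - 2 n \rho_{xy} = 2n\,(1 - \rho_{xy})$$

Euclidean distance is therefore a monotonically decreasing function of the correlation, and the two measures rank pairs of counties identically. With the sample standard deviation, the factor $n$ becomes $n - 1$.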

Disruptions to indoor activity due to pandemic response

We investigate the COVID-19 pandemic’s impact on indoor activity seasonality by comparing pre-pandemic mobility patterns in 2018 and 2019 with mobility patterns during the COVID-19 pandemic in 2020. We compared the proportion of indoor visits at the county level, $\sigma_{it}$, across 2018, 2019, and 2020 to examine changes in indoor activity seasonality during the COVID-19 pandemic. We also examined total activity, aggregating visits to all indoor, outdoor, and unclear POIs by week and mean-centering them for each US county during the COVID-19 pandemic in 2020.

Incorporating indoor activity into infectious disease models

We seek to illustrate the impact of incorporating seasonality into an infectious disease model using a phenomenological model versus empirical data. To achieve this, we parameterize a simple compartmental disease model with a seasonality term, using either our empirically-derived indoor activity seasonality metric or an analytical phenomenological model of seasonality fit to this metric.

Phenomenological model of seasonality

We first fit our empirically-derived indoor activity seasonality metric using a time-varying non-linear model. We specify the time-varying effect as a sinusoidal function, as is commonly done to incorporate seasonality into infectious disease models phenomenologically. The indoor activity seasonality, $\sigma_{it}$, for cluster $i$ at week $t$ is specified as $\sigma_{it} = 1 + \alpha_i \sin(\omega_i t + \phi_i)$, where $\alpha_i$ is the sine wave amplitude, $\omega_i$ is the frequency, and $\phi_i$ is the phase. We fit a model for locations in the northern cluster separately from those in the southern cluster, as identified above. We fit the parameters for this model using nlme, a standard package in R for fitting Gaussian nonlinear models.
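As a simplified illustration of fitting this functional form, the sketch below fixes the frequency at one cycle per 52 weeks and estimates the amplitude and phase by ordinary least squares. It is not the nlme fit described above, which also estimates the frequency. It relies on the identity $\alpha \sin(\omega t + \phi) = a \sin(\omega t) + b \cos(\omega t)$ with $a = \alpha \cos\phi$ and $b = \alpha \sin\phi$, which makes the problem linear. The function name and the synthetic data are hypothetical.

```python
import math

def fit_annual_sinusoid(sigma, period=52.0):
    """Least-squares fit of sigma_t = 1 + alpha * sin(omega * t + phi), omega fixed.

    Returns (alpha, phi). Fixing omega = 2*pi/period is a simplifying assumption.
    """
    omega = 2 * math.pi / period
    s = [math.sin(omega * t) for t in range(len(sigma))]
    c = [math.cos(omega * t) for t in range(len(sigma))]
    y = [v - 1.0 for v in sigma]
    # Normal equations for y = a*sin + b*cos.
    sss = sum(v * v for v in s)
    scc = sum(v * v for v in c)
    ssc = sum(u * v for u, v in zip(s, c))
    ssy = sum(u * v for u, v in zip(s, y))
    scy = sum(u * v for u, v in zip(c, y))
    det = sss * scc - ssc * ssc
    a = (ssy * scc - scy * ssc) / det
    b = (scy * sss - ssy * ssc) / det
    return math.hypot(a, b), math.atan2(b, a)

# Synthetic two-year series with known amplitude 0.2 and phase 1.0.
omega = 2 * math.pi / 52
synthetic = [1 + 0.2 * math.sin(omega * t + 1.0) for t in range(104)]
alpha_hat, phi_hat = fit_annual_sinusoid(synthetic)
```

On noiseless synthetic data the fit recovers the generating amplitude and phase. A single sinusoid of this kind cannot represent a second peak within the year.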

Disease model

We model infectious disease dynamics through a simple SIR model of disease spread:

$$\frac{dS}{dt} = -\beta_0 \beta(t) S I$$
$$\frac{dI}{dt} = \beta_0 \beta(t) S I - \gamma I$$
$$\frac{dR}{dt} = \gamma I$$

We incorporate alternative seasonality terms to consider the impact of heterogeneity in indoor seasonality on disease dynamics. For the northern and southern clusters separately, we define modeled seasonality as $\beta(t) = 1 + \alpha \sin(\omega t + \phi)$, with the fitted parameters for each cluster (Figure 4—figure supplement 1 and Figure 4—figure supplement 2). We also consider two exemplar locations for empirical estimates of seasonality, where $\beta(t) = \sigma_t$ after rolling window smoothing: Cook County as an example county from the northern cluster, and Maricopa County as an example county from the southern cluster. We also compare against a null expectation where $\beta(t) = 1$ (all seasonality functions are illustrated in Figure 4—figure supplement 3). We assume that $\beta_0 = 0.0025$ and $\gamma = 2$ (on a weekly time scale).
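The forced SIR model can be sketched with a simple forward-Euler integration. The values $\beta_0 = 0.0025$ and $\gamma = 2$ per week come from the text; everything else is an assumption chosen for illustration, including the population of 1000 with one initial infection, the step size, and the forcing parameters ($\alpha = 0.2$, a 52-week period, $\phi = \pi/2$), which are not the fitted values from the paper.

```python
import math

def simulate_sir(beta_t, beta0=0.0025, gamma=2.0, s0=999.0, i0=1.0,
                 weeks=104, dt=0.01):
    """Forward-Euler SIR with transmission rate beta0 * beta_t(t).

    Returns the weekly infected counts and the final (S, I, R) state.
    s0 and i0 are hypothetical; the source does not state a population size.
    """
    s, i, r = s0, i0, 0.0
    steps_per_week = round(1 / dt)
    infected = [i]
    for step in range(weeks * steps_per_week):
        t = step * dt
        new_infections = beta0 * beta_t(t) * s * i * dt
        new_recoveries = gamma * i * dt
        s, i, r = s - new_infections, i + new_infections - new_recoveries, r + new_recoveries
        if (step + 1) % steps_per_week == 0:
            infected.append(i)
    return infected, (s, i, r)

def seasonal(t, alpha=0.2, omega=2 * math.pi / 52, phi=math.pi / 2):
    """Illustrative sinusoidal forcing; parameter values are not from the paper."""
    return 1 + alpha * math.sin(omega * t + phi)

flat, flat_final = simulate_sir(lambda t: 1.0)   # null expectation, beta(t) = 1
forced, forced_final = simulate_sir(seasonal)    # sinusoidal forcing
```

Under these assumptions the forcing starts near its maximum, so the forced epidemic grows faster and reaches a higher peak than the unforced one. An empirical series such as a smoothed σ could be passed in the same way by wrapping it in a function of $t$.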

Results

Based on anonymized location data from mobile devices, we construct a novel metric that measures the relative propensity for human activity to be indoors at a fine geographic (US county) and temporal (weekly) scale. Activity is measured as the number of visits to unique physical, public (non-residential) locations across the United States. Locations are classified as indoors if they are enclosed environments (i.e. buildings and transportation services). We characterize the systematic spatiotemporal structure in this metric of indoor activity seasonality with a time series clustering analysis. We also characterize the shift that occurred in the baseline patterns of indoor activity seasonality during the COVID-19 pandemic. We note that this seasonal variation in the propensity of human activity to be indoors differs from the variation in overall rates of contact or mobility, which does not appear to be highly seasonal (Figure 1—figure supplement 1, Klein et al., 2022). Lastly, we fit non-linear models to the indoor activity metric at baseline, comparing the ability of a simple model to capture seasonal variation in transmission risk.

Quantifying empirical dynamics in indoor activity

The indoor activity seasonality metric, σ, captures the relative frequency of visits to indoor versus outdoor locations within an area. The components of σ capture the degree to which indoor and outdoor locations are occupied; when σ=1, a given county is at its county-specific average propensity (over time) for indoor activity relative to outdoor. When σ<1, activity within the county is more frequently outdoor and less frequently indoor than average, while σ>1 indicates that activity is more frequently indoor and less frequently outdoor than average. Thus, a σ of 1.2 indicates that the county’s activity is 20% more indoor than average, and a σ of 0.80 indicates that the county’s activity is 20% less indoor than average (additional details in methods).

Through this metric, we measure the relative propensity for human activity to be indoors for every community (i.e. US county) across time (at a weekly timescale), finding significant heterogeneity between counties (Figure 1A). The representative examples of Cook County, Illinois (home of the city of Chicago in the northern US) and Maricopa County, Arizona (home of the city of Phoenix in the southwestern US) highlight systematic spatial and temporal heterogeneity in indoor activity dynamics. In Cook County, indoor activity varies over time, at its peak in the winter, with the relative odds of an indoor visit well above average. During the summer, σ in Cook County reaches its trough, with activity systematically more outdoors on average. On the other hand, the variation of σ across time in Maricopa County is characterized by a smaller winter peak in indoor activity, and an additional peak in the summer (i.e. July and August); this peak occurs concurrently with the trough in Cook County. Unlike in Cook County, σ in Maricopa County is lowest in the spring and fall. These representative counties illustrate the systematic within-county variation in indoor activity over time, as well as the between-county variation in temporal trends as represented in Figure 1B for all US communities.

Figure 1. Spatio-temporal heterogeneity in indoor activity seasonality.

(A) Case studies to highlight varying trends in indoor activity seasonality during 2018 and 2019: King County and Suffolk County (in the northern United States) have high indoor activity in the winter months and a trough in indoor activity in the summer months. Miami-Dade and Maricopa County (in the southern United States) see moderate indoor activity in the winter and may have an additional peak in indoor activity during the summer. We apply a rolling window mean for visualization purposes. (B) A heatmap of the indoor activity seasonality metric for all US counties by week for 2018 and 2019. Counties are ordered by latitude. We see significant spatiotemporal heterogeneity with distinct trends in the summer versus winter seasons.

Figure 1—figure supplement 1. Other measures of mobility are not highly seasonal.

Top: Using the Safegraph Weekly Patterns dataset (https://docs.safegraph.com/docs/weekly-patterns), we show total (all non-home locations) visitor counts for a random sample of 310 counties (10% of all US counties). Overall mobility does not appear to be highly seasonal. Bottom: Using the Safegraph Social Distancing Metrics dataset (https://docs.safegraph.com/docs/social-distancing-metrics), we show time spent at home for a random sample of 310 counties (10% of all US counties). While home locations are not included in our indoor activity metric, time spent at home does not appear to be highly seasonal.
Figure 1—figure supplement 2. We demonstrate the effect of the ‘unclear’ locations on the indoor activity seasonality.

In the left panel, we show the difference in σ if all ‘unclear’ locations are classified as indoor. In the right panel, we show the difference in σ if all ‘unclear’ locations are classified as outdoor.
Figure 1—figure supplement 3. We show that the maximum number of visits used in the definition of the σ metric is highly comparable in 2018 and 2019.

Figure 1—figure supplement 4. The mean proportion of indoor/outdoor activity ($\mu_{\tilde{\sigma}}$) in 2018 displays no latitudinal gradient and is relatively homogeneous across counties; outliers of mean ≥ 2.5 are removed.

To identify systematic geographic structure, we cluster the heterogeneous time series of county-level, weekly indoor activity. We find three geographic clusters corresponding to groups of locations that experience similar indoor activity dynamics (Figure 2). Two of these clusters split most of the country into a northern cluster and a southern cluster. Among the communities in the northern cluster, activity is more commonly outdoor over the summer months, trending toward indoor during fall, with a peak in the winter months, as observed in Cook County. Comparatively, the southern cluster has a larger winter peak (i.e. between December and February) and a smaller summer peak (i.e. between July and August); most summer peaks are less extreme than that of Maricopa County (shown). We hypothesize that these two clusters are consistent with climate zones. While there is a moderate association between indoor activity seasonality and environmental variables such as temperature and humidity (Figure 2—figure supplement 2), we expect that the northern and southern indoor activity clusters will be more consistent with climate zones defined for the construction of the indoor built environment and find that there is indeed substantial consistency between the two (Figure 2—figure supplement 3). The third cluster differs substantially: it is geographically discontiguous and its two annual peaks occur during the spring (close to April) and fall (close to November) seasons. Thus, the counties in this cluster have outdoor activity more frequently than average during both the winter and the summer. The counties in this cluster correspond to locations that are hubs for winter or other tourism, which we speculate is driving their unique dynamics (Figure 2—figure supplement 4).

Figure 2. Using a time series clustering approach on the indoor activity time series for each US county, we identify groups of counties that experience similar trends in indoor activity.

Locations in the northern cluster (light blue) follow a single peak pattern with the highest indoor activity occurring every winter. Locations in the southern cluster (dark blue) experience two peaks in indoor activity each year, one in the winter and a second, smaller one in the summer. The third cluster also experiences two peaks not matching environmental conditions, but potentially corresponding to winter or other tourism areas. We apply a rolling window mean to the time series for visualization purposes.

Figure 2—figure supplement 1. We illustrate the impact of the correlation threshold on the clustering results (without post-processing).

For each panel, we list the percentile for time series correlations used as the threshold, the corresponding correlation value (ρ), and the normalized mutual information between each partition and the partition with the 90th percentile threshold (corresponding to the partition presented in Figure 2).
Figure 2—figure supplement 2. Using data on temperature and rainfall from NOAA’s North American Regional Reanalysis (Mesinger et al., 2006), we find that indoor activity (σ) is moderately anticorrelated with both temperature and humidity.

Temperature and humidity are strongly correlated in all three clusters (Pearson’s ρ ≈ 0.87). Across the three clusters, indoor activity is moderately anticorrelated with temperature (ρ ≈ -0.52). Likewise, indoor activity is moderately anticorrelated with humidity (ρ ≈ -0.45).
Figure 2—figure supplement 3. Comparison of indoor activity clusters to climate clusters.

(A) The IECC climate zones are based on temperature, humidity, and rainfall in each county and govern the type of building material and amount of ventilation required in a building (International Code Council, 2015). (B) The consistency between the two primary clusters of indoor activity identified by our analysis and the IECC climate zones. Treating the IECC climate zones as ‘ground truth,’ we quantify the ability of our indoor activity clusters to predict the IECC climate zones. We achieve this by collapsing the partitions into two clusters each (the tourism cluster is grouped with the northern cluster in the indoor activity clustering; and IECC climate zones 1/2/3 are grouped into one cluster and zones 4/5/6/7 into another cluster). Our indoor activity clusters have a 0.72 F1-score, with a precision of 0.92 and a recall score of 0.59 with the IECC zones.
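As a consistency check on the reported scores, and assuming the standard definition of the F1-score as the harmonic mean of precision $P$ and recall $R$:

$$F_1 = \frac{2PR}{P + R} = \frac{2 \times 0.92 \times 0.59}{0.92 + 0.59} = \frac{1.0856}{1.51} \approx 0.72$$

This agrees with the reported F1-score of 0.72.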
Figure 2—figure supplement 4. The third indoor activity cluster displays some correlation with areas of increased tourism, including US ski areas in western and northeastern states, potentially contributing to off-season activity increases.

Most areas in the cluster are either in a ski area or neighbor a ski area, with some parts of Hawaii and Florida being clear outliers of this pattern and suggesting other types of tourism lead to similar behavioral seasonality.
Figure 2—figure supplement 5. We show the results of time series clustering based on a hierarchical clustering method using Ward linkage and Euclidean distance, implemented using scipy.cluster in Python.

This partition has high similarity to the network-based clustering algorithm results that we illustrate in Figure 2: normalized mutual information = 0.56 with 89% of counties matching on cluster identity.

Characterizing pandemic disruption to baseline indoor activity seasonality

In addition to the description of indoor activity seasonality at baseline, we examine the impact of a large-scale disruption – the COVID-19 pandemic – on these patterns. We compare indoor activity seasonality during the COVID-19 pandemic in 2020 to the baseline patterns of 2018 and 2019. We find that the temporal trends in indoor activity are less geographically structured in 2020 than those of previous years (see Figure 3—figure supplement 2 for a characterization of the time series patterns). We find that indoor activity deviated from pre-pandemic trends beyond interannual deviations (Figure 3—figure supplement 1). We focus on four case studies to highlight the varying impacts on indoor activity of the pandemic disruption (Figure 3). In all four communities, 2020 indoor activity trends shift from 2018 and 2019 patterns, with Maricopa County (home of the city of Phoenix, Arizona) showing the least perturbation relative to prior years. We also find that in early 2020, when there was substantial social distancing in the United States (e.g. school closures, remote work), activity was more likely to be outdoors than in prior years, independent of changes in overall activity levels. With our case studies, we highlight that social distancing policies can have different impacts on airborne exposure risk in different locations: while some locations, such as Travis County (home of Austin, Texas), shifted activities outdoors during this period, reducing their overall risk further, other locations, such as Charleston County (home of Charleston, South Carolina) increased indoor activity above the seasonal average during this period, potentially diminishing the effect of reducing overall mobility. The trends in Charleston are representative of those in the southeastern United States during the spring of 2020 (Figure 3—figure supplement 1). 
By the end of 2020 (and the first winter wave of SARS-CoV-2), many parts of the country were shifting activity more outdoors than seasonally expected (Figure 3—figure supplement 1).

Figure 3. Indoor activity during the COVID-19 pandemic was shifted: We compare indoor activity trends in the baseline years of 2018 and 2019 to the pandemic year 2020 in four case study locations.

We find that most locations saw a shift in their indoor activity patterns, while others (such as Maricopa County) did not. We also find that while overall activity was diminished uniformly during the spring of 2020, indoor activity decreased in some locations (Travis County, Texas and Baltimore County, Maryland) and increased in others (Charleston County, South Carolina). We apply a three-week rolling window mean to the time series for visualization purposes.

Figure 3—figure supplement 1. Deviations in 2020 indoor activity from baseline.

Top: Euclidean distance between indoor activity time series in corresponding years for each county, averaged over all counties. The 2020 time series show a higher deviation from each of the baseline years than the two baseline years do from each other. Bottom: We illustrate the mean difference in indoor activity at baseline (defined as the average of 2018 and 2019) and 2020 for two time periods: (a) Week 10 to Week 20 in spring 2020 during the initial lockdown period for COVID-19. (b) Week 44 to Week 52 in winter 2020 during the first winter surge of COVID-19. Positive mean differences suggest more outdoor activity in 2020 than at baseline and negative mean differences suggest more indoor activity in 2020 than at baseline.

Figure 3—figure supplement 2. Indoor seasonality clusters during 2020.

(A) Indoor seasonality during 2020 can be clustered into four groups, although clusters are more geographically fragmented than in previous years. (B) Time series for 2020 indoor seasonality clusters display heterogeneous trends that were not apparent in previous years, with some clusters more variable than others.

Implications for modeling seasonal disease dynamics

We use this finely-grained spatiotemporal information on indoor activity to incorporate airborne exposure risk seasonality into compartmental models of disease dynamics using common, coarser seasonal forcing approaches. To investigate the impact of heterogeneity in σ on the estimation of seasonal forcing for infectious disease models, we fit a sinusoidal model to the time series of indoor activity for each of the primary clusters (Figure 4A). We note that because σ is defined as deviation from baseline indoor activity, the sinusoidal parameters (amplitude, frequency, phase) should be interpreted as a measure of seasonality in indoor activity, relative to each location’s baseline. We find that the parameters of seasonality vary across clusters: the amplitude is higher, and the phase is lower in the northern cluster compared to the southern cluster, indicating a difference in the variability of indoor and outdoor activity seasonality in each cluster (Figure 4—figure supplement 1). While the fits are comparable for both clusters (Figure 4—figure supplement 2), the sinusoidal model does not capture the second peak of indoor activity during the summer months in the southern cluster. These differences in best fit indicate that sinusoidal models may have an overly restrictive functional form, limiting the accuracy of the approximation, and may underestimate the impacts of seasonality on transmission, obscuring systematic differences between regions. Furthermore, differences in seasonal activity of the observed magnitude can have important implications for disease modeling; applying region-level and county-level forcing to a simple disease model alters incidence patterns (Figure 4B). Although region-level seasonality changes incidence timing and peak size relative to a non-seasonal model, it does not fully capture the changes produced by county-level seasonality. 
These differences indicate that while coarser geographic approximations of seasonality can be appropriate, these approximations can also oversimplify, reducing the accuracy of disease models. Additionally, while simple models of baseline indoor activity can capture seasonality in exposure risk, disruptions such as pandemics can alter this baseline structure and increase heterogeneity.
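
To make the sinusoidal approximation concrete, here is a minimal sketch of a least-squares sine fit and its root mean square error. It is an illustration under stated assumptions rather than the fitting procedure used in this study: the period is held fixed at 52 weeks (the text above also estimates frequency), which turns the fit into a linear problem, the input series are synthetic, and the names `fit_sinusoid`, `solve_linear`, and `rmse` are hypothetical.

```python
import math


def solve_linear(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - known) / aug[r][r]
    return solution


def fit_sinusoid(values, period=52.0):
    """Fit values[t] ~ offset + amplitude * sin(2*pi*t/period + phase).

    With the period fixed (an assumption of this sketch), the model is linear
    in (offset, a, b) where a*sin(wt) + b*cos(wt) = amplitude*sin(wt + phase).
    """
    omega = 2.0 * math.pi / period
    rows = [(1.0, math.sin(omega * t), math.cos(omega * t)) for t in range(len(values))]
    # Normal equations: (X^T X) coef = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(3)]
    offset, a, b = solve_linear(xtx, xty)
    return offset, math.hypot(a, b), math.atan2(b, a)


def rmse(values, offset, amplitude, phase, period=52.0):
    """Root mean square error of the fitted sine curve against the data."""
    omega = 2.0 * math.pi / period
    residuals = [y - (offset + amplitude * math.sin(omega * t + phase))
                 for t, y in enumerate(values)]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


# Synthetic two-year weekly series (assumed shapes, not data from the study).
omega = 2.0 * math.pi / 52.0
north = [0.10 * math.cos(omega * t) for t in range(104)]  # single winter peak
south = [0.05 * math.cos(omega * t) + 0.04 * math.cos(2 * omega * t)
         for t in range(104)]  # winter peak plus a secondary summer peak

north_offset, north_amplitude, north_phase = fit_sinusoid(north)
south_offset, south_amplitude, south_phase = fit_sinusoid(south)
north_rmse = rmse(north, north_offset, north_amplitude, north_phase)
south_rmse = rmse(south, south_offset, south_amplitude, south_phase)
```

In this toy setting the single-peaked series is recovered almost exactly, while the series with a secondary summer peak leaves a residual that a single sine wave cannot absorb, which mirrors the limitation described above for the southern cluster.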

Figure 4. Incorporating seasonality in epidemiological models.

(A) Sine curves fit to the 2018 and 2019 time series data (analogous to seasonal forcing model components) fit the northern cluster better than the southern cluster, with a markedly poorer fit for the southern cluster’s second summer peak. (B) Regional seasonal forcing models display variation in patterns of disease incidence omitted by a non-seasonal model, but even region-level seasonal forcing does not fully capture within-cluster county-level variation.

Figure 4—figure supplement 1. Parameters of the sinusoidal model fits.

Top: Inferred parameters for the sinusoidal model fits of the indoor activity data for the northern and southern clusters show a similar frequency, but a greater amplitude and shorter phase in the northern cluster. Values displayed are mean parameter estimates. Standard errors for all parameters are smaller than 5e−3 and thus are not displayed. Bottom: We show the estimated parameters of the sine curve fits to the northern and southern clusters as well as the difference between the parameter estimates. The period is in units of time (weeks). The amplitude matches the units of σ. The phase is in units of time (weeks).

Figure 4—figure supplement 2. Model performance as measured by the root mean square error of the sine curve fit to the cluster averaged over counties within the cluster.

The period between March and September is highlighted in light gray to emphasize the summer months.

Figure 4—figure supplement 3. The seasonal forcing functions (β(t)) we used in the epidemiological model.

The non-seasonal model (gray) shows no variation in transmission risk over time. We model northern seasonality via a sinusoidal model fit to the northern indoor activity data (light blue solid) and via the empirically-measured indoor seasonality from a county in the northern cluster (Cook County, light blue dotted). We model southern seasonality via a sinusoidal model fit to the southern indoor activity data (dark blue solid) and via the empirically-measured indoor seasonality from a county in the southern cluster (Maricopa County, dark blue dotted).
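
As a rough illustration of how a forcing function of this kind enters a compartmental model, the sketch below integrates a simple SIR model with and without sinusoidal forcing. It is not the model specification used in this study: the form β(t) = β0 · (1 + forcing(t)), the forward-Euler integration, and every parameter value (`beta0`, `gamma`, the amplitude, the initial infected fraction) are assumptions chosen only for demonstration.

```python
import math


def no_forcing(day):
    """Non-seasonal model: transmission rate does not vary over time."""
    return 0.0


def sinusoidal_forcing(amplitude, phase, period_days=364.0):
    """Return a function giving the relative seasonal deviation on a given day."""
    omega = 2.0 * math.pi / period_days
    return lambda day: amplitude * math.sin(omega * day + phase)


def simulate_sir(forcing, beta0=0.3, gamma=0.2, initial_infected=1e-4,
                 days=364, dt=0.1):
    """Forward-Euler SIR with beta(t) = beta0 * (1 + forcing(t)).

    Returns daily incidence as a fraction of the population. All parameter
    defaults are hypothetical values for illustration only.
    """
    susceptible = 1.0 - initial_infected
    infected = initial_infected
    steps_per_day = round(1.0 / dt)
    daily_incidence = []
    for day in range(days):
        new_today = 0.0
        for step in range(steps_per_day):
            t = day + step * dt
            beta = beta0 * (1.0 + forcing(t))
            new_infections = beta * susceptible * infected * dt
            recoveries = gamma * infected * dt
            susceptible -= new_infections
            infected += new_infections - recoveries
            new_today += new_infections
        daily_incidence.append(new_today)
    return daily_incidence


# Phase pi/2 places the forcing maximum at day 0 (an assumed winter start).
flat = simulate_sir(no_forcing)
north = simulate_sir(sinusoidal_forcing(amplitude=0.2, phase=math.pi / 2))
south = simulate_sir(sinusoidal_forcing(amplitude=0.1, phase=math.pi / 2))

peak_day_flat = flat.index(max(flat))
peak_day_north = north.index(max(north))
```

An empirically measured county-level series could be substituted for the sine curve by passing a function that looks up the observed weekly value for each day, which is one way to compare region-level and county-level forcing in the same model.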

Discussion

The seasonality of influenza, SARS-CoV-2, and other respiratory pathogens depends not only on environmental variables but also on the social behavior of hosts. In settings with little prior immunity – such as a pandemic – host social behavior (generating contacts during which transmission may occur) primarily drives heterogeneity in disease dynamics, and seasonality is dwarfed by susceptibility (Baker et al., 2020). In settings with higher rates of immunity, contact remains critically important, and seasonal changes in contacts (both direct and indirect) can contribute to the movement of Rt above and below 1, producing noticeable changes in incidence. Although environmental variables play a role in the seasonality of respiratory pathogens, the role of host social behavior in pathogen seasonality remains poorly understood, owing in part to limited knowledge of indoor versus outdoor social interactions and of interactions between behavior and the environment. In this study, we propose a fine-grain measure of indoor activity seasonality across time and space. This metric is a relative quantity of behavior, comparable across locations, and thus intended to be a measure of seasonality beyond a baseline. We determine that indoor activity seasonality displays significant spatiotemporal heterogeneity and that this variability is highly geographically structured. We also find that while indoor activity seasonality may be highly predictable under baseline conditions, disruptions such as the COVID-19 pandemic can alter these patterns. Finally, we provide an illustration of how our findings can be incorporated into classical infectious disease models using parsimonious models of exposure seasonality.

The indoor activity seasonality that we quantify may reflect heterogeneity in transmission risk via a number of mechanisms including those affecting host contact, susceptibility, or transmissibility. Increased indoor activity may indicate longer-duration airborne contact (e.g. co-location without direct interaction) between susceptible and infected individuals, elevating respiratory transmission risk. Increased indoor density may also suggest increased droplet contact (e.g. a conversation in close proximity), under homogeneous mixing. Additionally, indoor activity may suggest increased susceptibility as poor ventilation, increased pollutants, reduced solar exposure, and low humidity of the indoor environment have been shown to weaken immune response (Moriyama et al., 2020). Finally, increased indoor activity may indicate an increase in transmissibility due to higher exposure as low humidity caused by climate control (heating, ventilation, and cooling, HVAC) in indoor environments has been shown to increase viral survival and HVAC re-circulation has been shown to increase viral dispersion (Lu et al., 2020; Liao et al., 2005). While our new measure does not disentangle these component mechanisms, it represents an integrated seasonality in exposure risk due to all of these factors and can help lead us to a more complete understanding of the heterogeneity and seasonality in disease dynamics and outcomes.

We find that spatiotemporal heterogeneity in the indoor activity metric can be decomposed into two large geographically-contiguous groups in the northern and southern United States representing distinct temporal dynamics in indoor activity. These groups closely correspond to built environment climate zones, potentially explaining this systematic variability. We note, however, that while these clusters overlap with climate classifications, this correspondence does not suggest that environmental variables such as temperature and humidity should be used to represent behavioral heterogeneity. Climatic factors within these climate zones may be related to, but not necessarily correlated with, the seasonality of human mixing within these zones. Additionally, even in the case that environmental factor variability drives behavioral variability, it would be critical to capture the effect of behavior on disease directly so as to not obscure any direct effects of climatic factors on disease.

We illustrate how to incorporate seasonality in exposure risk into future models of disease dynamics using a simple phenomenological model. We use this traditional model of infectious disease dynamics to evaluate the implications of the spatial coarseness of seasonal forcing. Our results suggest that the substantial local heterogeneity in the dynamics of indoor activity across time and space could be large enough to alter seasonality in infectious disease dynamics. While our work does not consider observed transmission patterns, we suggest that researchers carefully consider the spatial scale on which they model seasonality in theoretical models, commonly used for scenario analysis and model-based intervention design (e.g. Borchering et al., 2021). We additionally highlight that the use of simple or complex functional forms of seasonality requires statistical fits to baseline data and, in the case of disruptions, these fitted models may no longer be appropriate. Although indoor activity is moderately anticorrelated with temperature and humidity (Figure 1), weather-derived covariates cannot completely reflect the impacts of human movement (though they may have some statistical power). We show that patterns of human mobility changed substantially during the COVID-19 pandemic, potentially contributing to changes in infectious disease seasonality.

Recent work during the COVID-19 pandemic demonstrates the impact of reduced occupancy in indoor locations and increasing outdoor activity on the likelihood of disease transmission. In particular, behavioral interventions or nudges that reduce occupancy are more impactful than reducing overall mobility as they reduce visitor density and the likelihood of density-dependent airborne transmission (Chang et al., 2021). Similarly, the availability of outdoor areas in urban settings, such as public parks, has been demonstrated to reduce case rates when population mobility becomes less restricted (Johnson et al., 2021). Our results suggest that such public health strategies should be implemented in a targeted manner, informed by real-time data, and with clear communication of the goals. We found that notable changes occurred in indoor activity seasonality at the start of the COVID-19 pandemic, despite relatively consistent patterns during the spring season in prior years. Designing a behavioral strategy and measuring its effectiveness without real-time data could thus be misleading. Our finding of two distinct geographic clusters of indoor activity suggests the need for geographical targeting of strategies to reduce indoor transmission risk. While northern latitudes might benefit from decreased indoor occupancy and increased outdoor activity in Northern Hemisphere winters, southern latitudes should be additionally targeted for such interventions in the summer months. Lastly, our findings highlight the need to communicate the goals of behavioral interventions clearly. While all communities reduced overall activity during the early days of the COVID-19 pandemic, some increased indoor activity during this time, potentially diminishing the positive effects of the social distancing policies put into place. A public health education campaign to clarify the role of indoor interactions in transmission risk may have ameliorated this.

Our study leverages a novel data stream made available to researchers due to the COVID-19 pandemic. Similar datasets are available globally, part of a $12 billion location intelligence industry (Keegan and Ng, 2021). Such novel data streams offer many opportunities to address long-unanswered questions in infectious disease and climate change behavior dynamics, but these data must be interpreted carefully. Safegraph’s mobile-app-based location data does not include data on individuals less than 16 years of age (Safegraph, 2021b). While we may expect that children under 12 may be accompanied by adults who may be represented in the dataset, our metric likely does not capture the activity dynamics of older children (children 12–15 make up 5% of the US population). For those included in the Safegraph database, representation is dependent on smartphone usage and a number of business processes not transparent to users of the data; thus, we expect that there is geographic variation in the representativeness of the data. Smartphone ownership has increased in recent years, with 85% of US adults reporting smartphone ownership; however, smartphone usage does vary significantly by age, with only 61% of adults over 65 reporting smartphone use (Pew Research Center, 2021). Additionally, data show that location sharing among mobile users is not significantly biased by age, gender, race/ethnicity, income, or education (with 40–65% of all demographic groups participating in location sharing) (Zickuhr and Smith, 2011). Based on an analysis done by Safegraph, the panel is representative of race, educational attainment, and income (Fox, 2019). On the other hand, a recent independent analysis shows that older and non-white individuals are less likely to be captured in the panel for POI-specific analyses (Coston et al., 2021). It is important to note that both studies are associative in nature as the devices in the panel are fully anonymized, so no device-level demographic data exists. 
Continued work to understand the sampling biases of such datasets will be needed so that improved bias correction approaches can be developed (Coston et al., 2021). Additionally, we limit our scope in this study to consider only the number of visits and do not incorporate information about visit duration. The dataset counts all visits of 1 minute or longer. For disease transmission, there may be a threshold duration required for an interaction between an infected and susceptible individual for infection to be propagated. These thresholds are not well-understood for all respiratory diseases, but evidence that SARS-CoV-2 transmission can occur with brief encounters has emerged (Pringle et al., 2020). While the Safegraph dataset does provide median dwell times for POIs, the likely significant heterogeneity in the distribution of dwell times remains unknown and is difficult to capture in an aggregated manner.

Our metric and analysis also focus on the US county scale to reflect the finest scale generally used for infectious disease modeling as well as public health decision-making. This choice is likely to ignore some within-county heterogeneity and means that our metric does not represent the experience of all groups, particularly by socioeconomic status. For example, low-income and racially marginalized communities have systematically less access to outdoor, natural spaces and spend more time indoors due to structural inequities including lack of paid leave (Spalt et al., 2016; Nesbitt et al., 2019; Sefcik et al., 2019). Such socioeconomic disparities have been further exacerbated during the pandemic, which potentially affects our indoor activity estimates during 2020. Thus, our estimate of a county’s indoor transmission risk may represent an underestimate of the risk experienced by individuals in these communities. We commit to continued work to better characterize the transmission risk experienced by vulnerable populations. Lastly, we acknowledge that data modeling work that can influence public health policy decisions, particularly during an ongoing crisis, must be done with care to prevent misconceptions from having adverse effects on risk perception and policies (Carlson et al., 2020). We thus strongly note that while our measure of indoor behavioral seasonality provides a potential driver of respiratory disease seasonality, it remains one among many complex factors which integrate to predict the transmission potential of an ongoing epidemic or pandemic (Susswein et al., 2021). Thus, we cannot rely on behavioral seasonality to diminish transmission naturally, and pandemic intervention strategies should not be planned around behavioral seasonality while population susceptibility remains high in so many locations.

Ongoing global change events highlight the importance of this work, as it informs how widespread disruptions may shift patterns of indoor activity, potentially altering traditional infectious disease seasonality. Climate change events will continue to cause significant disruption to normal behavior patterns; mechanistic understanding of infectious disease seasonality and real-time data collection will be crucial components of future disease control efforts. While other global change events may impact indoor activity in different ways than the COVID-19 pandemic, a rigorous understanding of the impact of host behavior on infectious disease allows policymakers and emergency preparedness experts to effectively address future disruptions.

Acknowledgements

Research reported in this publication was supported by the National Institute of General Medical Sciences of the National Institutes of Health under award number R01GM123007. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. We gratefully acknowledge the data sharing by Safegraph which made this study possible. We thank Alexes Merritt for her data processing efforts.

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Contributor Information

Shweta Bansal, Email: shweta.bansal@georgetown.edu.

Niel Hens, Hasselt University, Belgium.

Diane M Harper, University of Michigan, United States.

Funding Information

This paper was supported by the following grant:

  • National Institute of General Medical Sciences R01GM123007 to Zachary Susswein, Eva C Rest, Shweta Bansal.

Additional information

Competing interests

is currently employed at the Rockefeller Foundation as a Data Analyst. The author has no other competing interests to declare.

No competing interests declared.

Author contributions

Software, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing – original draft, Writing – review and editing.

Data curation, Validation, Investigation, Writing – review and editing.

Conceptualization, Resources, Data curation, Software, Formal analysis, Supervision, Funding acquisition, Validation, Investigation, Visualization, Methodology, Writing – original draft, Project administration, Writing – review and editing.

Ethics

Human subjects: Ethical review for this study was sought from the Institutional Review Board at Georgetown University and the study was approved on October 14, 2020 (STUDY00003041). This is secondary data analysis, so no informed consent or consent to publish was necessary.

Additional files

MDAR checklist

Data availability

We make available on Github the data and code needed to reproduce all figures and analyses in this manuscript: https://github.com/bansallab/indoor_outdoor (copy archived at swh:1:rev:d8a2ffc49f46a22c45814bd1dfcd1b054f2a4a27). The dataset we provide is of the metric used in all our analyses and figures ("indoor activity"). This dataset can be regenerated using the Safegraph Weekly Patterns datasets found at https://docs.safegraph.com/docs/weekly-patterns and code in the Github repository. The Safegraph Weekly Patterns was made freely available to academics at a uniquely granular level in response to the COVID-19 pandemic. Safegraph's business model involves selling these datasets to other corporations and, as a result, any data access agreement with the company forbids sharing of the raw data. The company does, however, make its data freely available to academics (for non-commercial use) through an institutional university subscription to Dewey or an individual data use agreement with Safegraph.

References

  1. Aggarwal CC, Reddy CK. Data Clustering: Algorithms and Applications. 1st edition. Chapman & Hall/CRC; 2013. [DOI] [Google Scholar]
  2. Altizer S, Dobson A, Hosseini P, Hudson P, Pascual M, Rohani P. Seasonality and the dynamics of infectious diseases. Ecology Letters. 2006;9:467–484. doi: 10.1111/j.1461-0248.2005.00879.x. [DOI] [PubMed] [Google Scholar]
  3. Baker RE, Mahmud AS, Wagner CE, Yang W, Pitzer VE, Viboud C, Vecchi GA, Metcalf CJE, Grenfell BT. Epidemic dynamics of respiratory syncytial virus in current and future climates. Nature Communications. 2019;10:5512. doi: 10.1038/s41467-019-13562-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  4. Baker RE, Yang W, Vecchi GA, Metcalf CJE, Grenfell BT. Susceptible supply limits the role of climate in the early SARS-CoV-2 pandemic. Science. 2020;369:315–319. doi: 10.1126/science.abc2535. [DOI] [PMC free article] [PubMed] [Google Scholar]
  5. Bakker KM, Eisenberg MC, Woods R, Martinez ME. Exploring the seasonal drivers of varicella zoster virus transmission and reactivation. American Journal of Epidemiology. 2021;190:1814–1820. doi: 10.1093/aje/kwab073. [DOI] [PMC free article] [PubMed] [Google Scholar]
  6. Berthold MR, Höppner F. On clustering time series using Euclidean distance and Pearson correlation. arXiv. 2016 https://arxiv.org/abs/1601.02213
  7. Bharti N, Tatem AJ, Ferrari MJ, Grais RF, Djibo A, Grenfell BT. Explaining seasonal fluctuations of measles in Niger using nighttime lights imagery. Science. 2011;334:1424–1427. doi: 10.1126/science.1210554. [DOI] [PMC free article] [PubMed] [Google Scholar]
  8. Blondel VD, Guillaume J-L, Lambiotte R, Lefebvre E. Fast unfolding of communities in large networks. Journal of Statistical Mechanics. 2008;2008:10008. doi: 10.1088/1742-5468/2008/10/P10008. [DOI] [Google Scholar]
  9. Borchering RK, Viboud C, Howerton E, Smith CP, Truelove S, Runge MC, Reich NG, Contamin L, Levander J, Salerno J, van Panhuis W, Kinsey M, Tallaksen K, Obrecht RF, Asher L, Costello C, Kelbaugh M, Wilson S, Shin L, Gallagher ME, Mullany LC, Rainwater-Lovett K, Lemaitre JC, Dent J, Grantz KH, Kaminsky J, Lauer SA, Lee EC, Meredith HR, Perez-Saez J, Keegan LT, Karlen D, Chinazzi M, Davis JT, Mu K, Xiong X, Pastore y Piontti A, Vespignani A, Srivastava A, Porebski P, Venkatramanan S, Adiga A, Lewis B, Klahn B, Outten J, Schlitt J, Corbett P, Telionis PA, Wang L, Peddireddy AS, Hurt B, Chen J, Vullikanti A, Marathe M, Healy JM, Slayton RB, Biggerstaff M, Johansson MA, Shea K, Lessler J. Modeling of future COVID-19 cases, hospitalizations, and deaths, by vaccination rates and nonpharmaceutical intervention scenarios — United States, April–September 2021. MMWR Morbidity and Mortality Weekly Report. 2021;70:719–724. doi: 10.15585/mmwr.mm7019e3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  10. Carlson CJ, Gomez AC, Bansal S, Ryan SJ. Misconceptions about weather and seasonality must not misguide COVID-19 response. Nature Communications. 2020;11:4312. doi: 10.1038/s41467-020-18150-z. [DOI] [PMC free article] [PubMed] [Google Scholar]
  11. Chang S, Pierson E, Koh PW, Gerardin J, Redbird B, Grusky D, Leskovec J. Mobility network models of COVID-19 explain inequities and inform reopening. Nature. 2021;589:82–87. doi: 10.1038/s41586-020-2923-3. [DOI] [PubMed] [Google Scholar]
  12. Coletti P, Poletto C, Turbelin C, Blanchon T, Colizza V. Shifting patterns of seasonal influenza epidemics. Scientific Reports. 2018;8:12786. doi: 10.1038/s41598-018-30949-x. [DOI] [PMC free article] [PubMed] [Google Scholar]
  13. Coston A, Guha N, Ouyang D, Lu L, Chouldechova A, Ho DE. Leveraging administrative data for bias audits: Assessing disparate coverage with mobility data for COVID-19 policy. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency; 2021. [DOI] [Google Scholar]
  14. Dalziel BD, Kissler S, Gog JR, Viboud C, Bjørnstad ON, Metcalf CJE, Grenfell BT. Urbanization and humidity shape the intensity of influenza epidemics in U.S. cities. Science. 2018;362:75–79. doi: 10.1126/science.aat6030. [DOI] [PMC free article] [PubMed] [Google Scholar]
  15. Few R, Lake I, Hunter PR, Tran PG. Seasonality, disease and behavior: using multiple methods to explore socio-environmental health risks in the Mekong Delta. Social Science & Medicine. 2013;80:1–9. doi: 10.1016/j.socscimed.2012.12.027. [DOI] [PubMed] [Google Scholar]
  16. Fisman D. Seasonality of viral infections: mechanisms and unknowns. Clinical Microbiology and Infection. 2012;18:946–954. doi: 10.1111/j.1469-0691.2012.03968.x. [DOI] [PubMed] [Google Scholar]
  17. Fox R. “What about bias in your dataset?”: Quantifying sampling bias in SafeGraph Patterns. 2019. [February 17, 2022]. https://colab.research.google.com/drive/1u15afRytJMsizySFqA2EPlXSh3KTmNTQ
  18. Grassly NC, Fraser C. Seasonal infectious disease epidemiology. Proceedings. Biological Sciences. 2006;273:2541–2550. doi: 10.1098/rspb.2006.3604. [DOI] [PMC free article] [PubMed] [Google Scholar]
  19. Greenhalgh T, Jimenez JL, Prather KA, Tufekci Z, Fisman D, Schooley R. Ten scientific reasons in support of airborne transmission of SARS-CoV-2. Lancet. 2021;397:1603–1605. doi: 10.1016/S0140-6736(21)00869-2. [DOI] [PMC free article] [PubMed] [Google Scholar]
  20. Hoffmann T, Peel L, Lambiotte R, Jones NS. Community detection in networks without observing edges. Science Advances. 2020;6:eaav1478. doi: 10.1126/sciadv.aav1478. [DOI] [PMC free article] [PubMed] [Google Scholar]
  21. International Code Council. 2012 International Energy Conservation Code. ICC; 2015. [Google Scholar]
  22. Jayaweera M, Perera H, Gunawardana B, Manatunge J. Transmission of COVID-19 virus by droplets and aerosols: a critical review on the unresolved dichotomy. Environmental Research. 2020;188:109819. doi: 10.1016/j.envres.2020.109819. [DOI] [PMC free article] [PubMed] [Google Scholar]
  23. Johnson TF, Hordley LA, Greenwell MP, Evans LC. Associations between COVID-19 transmission rates, park use, and landscape structure. The Science of the Total Environment. 2021;789:148123. doi: 10.1016/j.scitotenv.2021.148123. [DOI] [PMC free article] [PubMed] [Google Scholar]
  24. Keegan J, Ng A. There’s a Multibillion-Dollar Market for Your Phone’s Location Data, The Markup. 2021. [March 20, 2023]. https://themarkup.org/privacy/2021/09/30/theres-a-multibillion-dollar-market-for-your-phones-location-data
  25. Keeling MJ, Rohani P, Grenfell BT. Seasonally forced disease dynamics explored as switching between attractors. Physica D. 2001;148:317–335. doi: 10.1016/S0167-2789(00)00187-1. [DOI] [Google Scholar]
  26. Klein B, LaRock T, McCabe S, Torres L, Friedland L, Kos M, Privitera F, Lake B, Kraemer MU, Brownstein JS. Characterizing collective physical distancing in the US during the first nine months of the COVID-19 pandemic. arXiv. 2022 doi: 10.1371/journal.pdig.0000430. https://arxiv.org/abs/2212.08873 [DOI] [PMC free article] [PubMed]
  27. Klepeis NE, Nelson WC, Ott WR, Robinson JP, Tsang AM, Switzer P, Behar JV, Hern SC, Engelmann WH. The National Human Activity Pattern Survey (NHAPS): a resource for assessing exposure to environmental pollutants. Journal of Exposure Analysis and Environmental Epidemiology. 2001;11:231–252. doi: 10.1038/sj.jea.7500165. [DOI] [PubMed] [Google Scholar]
  28. Klompas M, Baker MA, Rhee C. Airborne transmission of SARS-CoV-2: theoretical considerations and available evidence. JAMA. 2020;324:441–442. doi: 10.1001/jama.2020.12458. [DOI] [PubMed] [Google Scholar]
  29. Kronfeld-Schor N, Stevenson TJ, Nickbakhsh S, Schernhammer ES, Dopico XC, Dayan T, Martinez M, Helm B. Drivers of infectious disease seasonality: potential implications for COVID-19. Journal of Biological Rhythms. 2021;36:35–54. doi: 10.1177/0748730420987322. [DOI] [PMC free article] [PubMed] [Google Scholar]
  30. Kummer AG, Zhang J, Litvinova M, Vespignani A, Yu H, Ajelli M. Measuring the seasonality of human contact patterns and its implications for the spread of respiratory infectious diseases. medRxiv. 2022 doi: 10.1101/2022.02.22.22271357. [DOI]
  31. Liao C-M, Chang C-F, Liang H-M. A probabilistic transmission dynamic model to assess indoor airborne infection risks. Risk Analysis. 2005;25:1097–1107. doi: 10.1111/j.1539-6924.2005.00663.x. [DOI] [PubMed] [Google Scholar]
  32. Louvain-igraph Vincent Traag. 2018. [February 17, 2022]. https://louvain-igraph.readthedocs.io/en/latest/reference.html
  33. Lu J, Gu J, Li K, Xu C, Su W, Lai Z, Zhou D, Yu C, Xu B, Yang Z. COVID-19 outbreak associated with air conditioning in restaurant, Guangzhou, China, 2020. Emerging Infectious Diseases. 2020;26:1628–1631. doi: 10.3201/eid2607.200764. [DOI] [PMC free article] [PubMed] [Google Scholar]
  34. Martinez ME. The calendar of epidemics: seasonal cycles of infectious diseases. PLOS Pathogens. 2018;14:e1007327. doi: 10.1371/journal.ppat.1007327. [DOI] [PMC free article] [PubMed] [Google Scholar]
  35. Mesinger F, DiMego G, Kalnay E, Mitchell K, Shafran PC, Ebisuzaki W, Jović D, Woollen J, Rogers E, Berbery EH, Ek MB, Fan Y, Grumbine R, Higgins W, Li H, Lin Y, Manikin G, Parrish D, Shi W. North American regional reanalysis. Bulletin of the American Meteorological Society. 2006;87:343–360. doi: 10.1175/BAMS-87-3-343. [DOI] [Google Scholar]
  36. Metcalf CJE, Bjørnstad ON, Grenfell BT, Andreasen V. Seasonality and comparative dynamics of six childhood infections in pre-vaccination Copenhagen. Proceedings. Biological Sciences. 2009;276:4111–4118. doi: 10.1098/rspb.2009.1058. [DOI] [PMC free article] [PubMed] [Google Scholar]
  37. Morawska L, Milton DK. It is time to address airborne transmission of Coronavirus Disease 2019 (COVID-19). Clinical Infectious Diseases. 2020;71:2311–2313. doi: 10.1093/cid/ciaa939. [DOI] [PMC free article] [PubMed] [Google Scholar]
  38. Moriyama M, Hugentobler WJ, Iwasaki A. Seasonality of respiratory viral infections. Annual Review of Virology. 2020;7:83–101. doi: 10.1146/annurev-virology-012420-022445. [DOI] [PubMed] [Google Scholar]
  39. Mossong J, Hens N, Jit M, Beutels P, Auranen K, Mikolajczyk R, Massari M, Salmaso S, Tomba GS, Wallinga J, Heijne J, Sadkowska-Todys M, Rosinska M, Edmunds WJ. Social contacts and mixing patterns relevant to the spread of infectious diseases. PLOS Medicine. 2008;5:e74. doi: 10.1371/journal.pmed.0050074. [DOI] [PMC free article] [PubMed] [Google Scholar]
  40. Nesbitt L, Meitner MJ, Girling C, Sheppard SR, Lu Y. Who has access to urban vegetation? A spatial analysis of distributional green equity in 10 US cities. Landscape and Urban Planning. 2019;181:51–79. doi: 10.1016/j.landurbplan.2018.08.007. [DOI] [Google Scholar]
  41. Nguyen JL, Dockery DW. Daily indoor-to-outdoor temperature and humidity relationships: a sample across seasons and diverse climatic regions. International Journal of Biometeorology. 2016;60:221–229. doi: 10.1007/s00484-015-1019-5. [DOI] [PMC free article] [PubMed] [Google Scholar]
  42. Onozuka D, Hashizume M. The influence of temperature and humidity on the incidence of hand, foot, and mouth disease in Japan. The Science of the Total Environment. 2011;410–411:119–125. doi: 10.1016/j.scitotenv.2011.09.055. [DOI] [PubMed] [Google Scholar]
  43. Ott WR. Human Activity Patterns: A Review of the Literature for Estimating Time Spent Indoors, Outdoors, and in Transit. US Environmental Protection Agency; 1988. [Google Scholar]
  44. Pew Resesarch Center Mobile Fact Sheet. 2021. [February 17, 2022]. https://www.pewresearch.org/internet/fact-sheet/mobile
  45. Pringle JC, Leikauskas J, Ransom-Kelley S, Webster B, Santos S, Fox H, Marcoux S, Kelso P, Kwit N. COVID-19 in a correctional facility employee following multiple brief exposures to persons with COVID-19 - Vermont, July-August 2020. MMWR Morbidity and Mortality Weekly Report. 2020;69:1569–1570. doi: 10.15585/mmwr.mm6943e1. [DOI] [PMC free article] [PubMed] [Google Scholar]
  46. Robey AJ, Fierce L. Sensitivity of airborne transmission of enveloped viruses to seasonal variation in indoor relative humidity. International Communications in Heat and Mass Transfer. 2022;130:105747. doi: 10.1016/j.icheatmasstransfer.2021.105747. [DOI] [Google Scholar]
  47. Safegraph Safegraph Patterns. 2021a. [February 14, 2022]. https://safegraph.com
  48. Safegraph Privacy Policy. 2021b. [February 17, 2022]. https://www.safegraph.com/privacy-policy
  49. Sefcik JS, Kondo MC, Klusaritz H, Sarantschin E, Solomon S, Roepke A, South EC, Jacoby SF. Perceptions of nature and access to green space in four urban neighborhoods. International Journal of Environmental Research and Public Health. 2019;16:2313. doi: 10.3390/ijerph16132313. [DOI] [PMC free article] [PubMed] [Google Scholar]
  50. Shaman J, Kohn M. Absolute humidity modulates influenza survival, transmission, and seasonality. PNAS. 2009;106:3243–3248. doi: 10.1073/pnas.0806852106. [DOI] [PMC free article] [PubMed] [Google Scholar]
  51. Shaman J, Pitzer VE, Viboud C, Grenfell BT, Lipsitch M. Absolute humidity and the seasonal onset of influenza in the continental United States. PLOS Biology. 2010;8:e1000316. doi: 10.1371/journal.pbio.1000316. [DOI] [PMC free article] [PubMed] [Google Scholar]
  52. Spalt EW, Curl CL, Allen RW, Cohen M, Adar SD, Stukovsky KH, Avol E, Castro-Diehl C, Nunn C, Mancera-Cuevas K, Kaufman JD. Time-location patterns of a diverse population of older adults: the Multi-Ethnic Study of Atherosclerosis and Air Pollution (MESA Air) Journal of Exposure Science & Environmental Epidemiology. 2016;26:349–355. doi: 10.1038/jes.2015.29. [DOI] [PMC free article] [PubMed] [Google Scholar]
  53. Statista Digital Market Outlook Individuals of any age who own at least one smartphone and use the smartphone(s) at least once per month. 2022. [February 17, 2022]. https://www.statista.com/statistics/201182/forecast-of-smartphone-users-in-the-us
  54. Susswein Z, Valdano E, Brett T, Rohani P, Colizza V, Bansal S. Ignoring spatial heterogeneity in drivers of SARS-Cov-2 transmission in the US will impede sustained elimination. medRxiv. 2021 doi: 10.1101/2021.08.09.21261807. [DOI]
  55. Tellier R, Li Y, Cowling BJ, Tang JW. Recognition of aerosol transmission of infectious agents: a commentary. BMC Infectious Diseases. 2019;19:101. doi: 10.1186/s12879-019-3707-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
  56. Wang CC, Prather KA, Sznitman J, Jimenez JL, Lakdawala SS, Tufekci Z, Marr LC. Airborne transmission of respiratory viruses. Science. 2021;373:eabd9149. doi: 10.1126/science.abd9149. [DOI] [PMC free article] [PubMed] [Google Scholar]
  57. Yang W, Marr LC. Dynamics of airborne influenza A viruses indoors and dependence on humidity. PLOS ONE. 2011;6:e21481. doi: 10.1371/journal.pone.0021481. [DOI] [PMC free article] [PubMed] [Google Scholar]
  58. Zickuhr K, Smith A. 28% of american adults use mobile and social location-based services. 2011. [September 6, 2011]. https://www.pewresearch.org/internet/2011/09/06/28-of-american-adults-use-mobile-and-social-location-based-services/

Editor's evaluation

Niel Hens

This is a valuable study characterizing seasonal deviations in indoor activity at the county level in the United States, with relevance to respiratory disease transmission. The strength of evidence is solid. The study and its results are of potential interest to those constructing more evidence-based infectious disease transmission models.

Decision letter

Editor: Niel Hens
Reviewed by: Guillaume Béraud

Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.

Decision letter after peer review:

Thank you for submitting your article "Disentangling the rhythms of human activity in the built environment for airborne transmission risk: a large-scale analysis of mobility data" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Diane Harper as the Senior Editor. The following individual involved in the review of your submission has agreed to reveal their identity: Guillaume Béraud (Reviewer #3).

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Essential revisions:

1) One of the major issues is related to the definition of "indoor". It should be stated from the beginning what "indoor" locations are about, in particular, that home is not part of it, which is an important point to understand the results (and the discussion). Did the authors conduct any sensitivity analysis related to these choices?

2) Compare trends with other proxies for seasonality.

3) Provide a better justification of the clustering approach used (choice of methodology) and conduct the necessary sensitivity analyses (including choice of threshold).

4) Compare model fits to real COVID-19 data and/or ILI (because of the difficulty in dealing with NPIs for COVID-19), or step back from the claims that the 'σ' metric generates better model fits given that you've shown that it yields different model fits, but not necessarily better ones.

5) Provide formal justifications for several claims throughout the Results section (e.g. a measurement of geographic heterogeneity in trends, and differences in pre-pandemic vs. peri-pandemic patterns).

In summary, this work has potential but requires essential revisions, particularly by clarifying some of the language used and by including more detailed and quantitative analyses.

Reviewer #1 (Recommendations for the authors):

Overall, I found this study to be well-conducted and impactful. There are a number of areas where it could be improved, particularly by clarifying some of the language and by including more detailed and quantitative analyses. In the following comments, I'll try to provide specific suggestions for where the authors might focus their efforts.

Specific comments:

10: Perhaps "impacting" rather than "suppressing"? The rest of the manuscript makes clear that the authors interpret seasonality as a force that can both enhance and suppress transmission, so it would be worth using consistent language here.

54-56 ("While more is known about… rates of indoor activity.") I had trouble parsing this sentence. What do you mean by spatio-temporal variation in the indoor environment? At what scale? Are you talking about the environment itself, or the variation in people's experience of it? What kinds of rates of indoor activity are unknown, and why does this matter? It seems to me that this sentence is identifying the key gap that this study aims to fill, so it would be worth making this more precise.

90: What sort of global change do you mean? As it stands, the term is too vague to be meaningful here.

Figure 1: The authors could consider grouping Figure 1B by latitude rather than alphabetically; this might reveal some interesting patterns that more clearly support their findings of different seasonal patterns in the north vs. the south.

109: systematic how? This paragraph and Figure 1A seem to discuss just two counties, but a "systematic" difference suggests to me that there's some kind of variation that's observed repeatedly across instances (counties). It would be good to discuss here what exactly is systematic about this variation.

Figure 2: Did the data include Alaska and Hawaii and/or the territories? I would imagine these states might also have substantially different seasonal trends relative to the lower 48 and might give some indication of what sorts of seasonal trends we might expect outside the US.

140: Heterogeneous how? Can you provide some measurement of the degree of heterogeneity?

142: ("in most locations indoor activity deviated from pre-pandemic trends") – again, as measured how? In what fraction of locations? To what degree did the trends deviate?

146: ("activity was more likely to be outdoor than in prior years") – how much more likely? How widespread was this change?

158-159: It would be worth reporting the changes in amplitude and phase here, with appropriate units.

161: A poorer fit as measured how?

170: "accuracy of disease models" – I appreciate that the authors have shown that using different seasonal forcing terms as inputs can yield different epidemic curves, but I don't think they've made a formal assessment of accuracy here, which would require comparing model fits to disease transmission data. The evidence presented so far does not make clear to me the degree of detail needed in the seasonal forcing term to accurately characterize disease transmission trends, as this will depend on the disease dynamics themselves and the temporal and geographic scale of the model.

234: Perhaps "southern latitudes should be targeted for such interventions in summer months as well"? It seems that southern latitudes still have a substantial winter peak, it's just that they also have a summer peak.

278: Again, it would be worth specifying what global change events the authors have in mind here.

Another general point for the Discussion: how should we interpret differences in amplitude across locations? Since σ is a measurement of the percent change in baseline activity, the indoor activity in a location with a high baseline but low σ might still be higher than the indoor activity in a location with a low baseline but a large σ. To what extent can we use σ to compare indoor activity across locations in the US? Or can we only use it to compare variations in indoor activity within counties? Would it be worth including some analysis of the baseline indoor activity across the US, since σ is really operating on this baseline?

288: Make explicit that you'll be referring to this as a POI.

294: What kind of spatial imputation did you do? Why?

Figures S7: It feels odd to me to have amplitude, frequency, and phase all plotted on the same vertical axis despite them all having different units. Perhaps a table would be better?

Reviewer #2 (Recommendations for the authors):

I am confident that a revision of the issues in the public review would improve the quality of the paper and allow it to exploit the full potential of this work.

I believe that it is crucial to repeat the analysis taking into account the nature of the correlation matrix, so either adjusting the null hypothesis on modularity optimization or by using a different community detection algorithm.

Reviewer #3 (Recommendations for the authors):

Overall, it is an excellent paper and very well written. However, there are some issues I'd like to be considered:

The major issue is related to the definition of "indoor". It should be stated from the beginning what "indoor" locations are, in particular that home is not one of them, which is an important point for understanding the results (and the discussion). At the moment, this is only defined in the Methods (Line 300), at the end of the article.

Line 43-41: Maybe authors could extend a few references on seasonality causes. But it is not mandatory.

Figure 1B: Why are counties ordered alphabetically? This ordering conveys little information, whereas ordering by latitude, for example, could reveal some patterns.

Lines 103-106: the authors should define more precisely what the average is (county-level? season? …).

Cluster: Maybe the clustering methods could be explained more extensively in the appendix.

Figure 3: I found it difficult to observe a shift in indoor activities, according to season.

Line 232-233: Isn't it the contrary? Increase of indoor activities in winter for northern regions?

Finally, a discussion on the difference between relationship and causality could be useful to distinguish human behavior seasonality and infectious diseases seasonality.

eLife. 2023 Apr 4;12:e80466. doi: 10.7554/eLife.80466.sa2

Author response


Essential revisions:

1) One of the major issues is related to the definition of "indoor". It should be stated from the beginning what "indoor" locations are about, in particular, that home is not part of it, which is an important point to understand the results (and the discussion). Did the authors conduct any sensitivity analysis related to these choices?

We have now added text to the Results to define indoor, and added more details to the Methods section to provide further clarity on our classification of locations. Additionally,

– We have now added a sensitivity analysis to the Supplement (Figure S11) that shows that the locations that could not be classified as indoor or outdoor do not make a significant impact on the measure of indoor activity seasonality.

– We have added a figure (Figure S1) that shows that time spent at home does not appear to be highly seasonal.

2) Compare trends with other proxies for seasonality.

We have now added a supplementary figure (Figure S3) to compare trends in indoor activity seasonality with those in environmental variables such as temperature and humidity.

3) Provide a better justification of the clustering approach used (choice of methodology) and conduct the necessary sensitivity analyses (including choice of threshold).

We have now added significant details to the Methods section on the clustering approach, and added sensitivity analysis and methodological comparison to the Supplement (Figures S13, S14).
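The role of the threshold in this kind of analysis can be illustrated with a small, dependency-free sketch. It is not the authors' pipeline: the decision letter refers to modularity optimization and the reference list cites louvain-igraph, whereas the sketch below swaps in plain connected components on a thresholded correlation network as a much simpler stand-in. The county names, the synthetic time series, and the function `threshold_groups` are all hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def threshold_groups(series, threshold):
    """Link two counties when their correlation exceeds `threshold`,
    then return the connected components of that network.

    Stand-in for community detection: NOT modularity optimization.
    """
    parent = {name: name for name in series}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for u, v in combinations(series, 2):
        if pearson(series[u], series[v]) > threshold:
            parent[find(u)] = find(v)

    groups = {}
    for name in series:
        groups.setdefault(find(name), set()).add(name)
    return sorted(groups.values(), key=lambda g: sorted(g))


# Synthetic weekly series: "north" has one annual cycle, "south" adds a second harmonic.
omega = 2 * math.pi / 52
weeks = range(52)
series = {
    "north_1": [math.cos(omega * t) for t in weeks],
    "north_2": [0.9 * math.cos(omega * t) + 0.1 * math.sin(omega * t) for t in weeks],
    "south_1": [0.5 * math.cos(omega * t) + math.cos(2 * omega * t) for t in weeks],
    "south_2": [0.4 * math.cos(omega * t) + math.cos(2 * omega * t)
                + 0.1 * math.sin(omega * t) for t in weeks],
}

# Sensitivity to the threshold: too low merges everything, too high isolates every county.
for threshold in (0.3, 0.9, 0.999):
    print(threshold, threshold_groups(series, threshold))
```

In this toy setting an intermediate threshold recovers the two synthetic groups, while the extremes merge or fragment them, which is the kind of dependence a threshold sensitivity analysis is meant to expose.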

4) Compare model fits to real COVID-19 data and/or ILI (because of the difficulty in dealing with NPIs for COVID-19), or step back from the claims that the 'σ' metric generates better model fits given that you've shown that it yields different model fits, but not necessarily better ones.

We appreciate this point, and agree that future work will have to consider how indoor activity seasonality affects our ability to capture observed transmission trends. However, such work would additionally need careful characterization of other seasonal factors hypothesized to drive transmission (including environmental and other behavioral factors), and is beyond the scope of our work. Instead, in Figure 4, we aim to (a) provide the infectious disease modeling community with empirically-inferred parameters for a simple sinusoidal model which is commonly used in infectious disease models to capture transmission seasonality; and (b) demonstrate the implications of ignoring geographic heterogeneity in transmission seasonality in theoretical models of disease dynamics, which are commonly used for scenario analysis and model-based intervention design. As we demonstrate, transmission seasonality described by such sinusoidal models, even when they are empirically characterized as in our case, can lead to meaningfully different epidemic dynamics when transmission seasonality varies from the assumptions. We have added text in the Results and Discussion sections to clarify these points.

5) Provide formal justifications for several claims throughout the Results section (e.g. a measurement of geographic heterogeneity in trends, and differences in pre-pandemic vs. peri-pandemic patterns).

We have added additional clarifying text or statistical analyses to respond to these reviewer points.

Reviewer #1 (Recommendations for the authors):

Overall, I found this study to be well-conducted and impactful. There are a number of areas where it could be improved, particularly by clarifying some of the language and by including more detailed and quantitative analyses. In the following comments, I'll try to provide specific suggestions for where the authors might focus their efforts.

We thank the reviewer for their positive feedback on our work.

Regarding the geographic scope and generalizability of findings:

– We acknowledge that our study is limited to the US because of the data available to us. However, similar data can be obtained by researchers globally and our metric and its insights will be generalizable to these datasets. We have now added a sentence (with a reference) to our Discussion section.

– We acknowledge that our analysis is limited to those mobile users who share location data. We already include in our Discussion section an acknowledgement that smartphone usage varies by age. Additionally, research shows that location sharing among mobile users is not significantly biased by age, gender, race/ethnicity, income or education within the United States (with 40-65% of all demographic groups participating in location sharing). We have added this information and a reference to our Discussion.

While it’s important to acknowledge these biases, we believe that the benefits of leveraging these data to provide insights into built environment social behavior dynamics outweigh these limitations.

Specific comments:

10: Perhaps "impacting" rather than "suppressing"? The rest of the manuscript makes clear that the authors interpret seasonality as a force that can both enhance and suppress transmission, so it would be worth using consistent language here.

We have edited the Background section of the Abstract based on this suggestion.

54-56 ("While more is known about… rates of indoor activity.") I had trouble parsing this sentence. What do you mean by spatio-temporal variation in the indoor environment? At what scale? Are you talking about the environment itself, or the variation in people's experience of it? What kinds of rates of indoor activity are unknown, and why does this matter? It seems to me that this sentence is identifying the key gap that this study aims to fill, so it would be worth making this more precise.

Thank you for this suggestion. We have edited this sentence in the Introduction.

90: What sort of global change do you mean? As it stands, the term is too vague to be meaningful here.

Here, we refer to the field of global change, which focuses on the planetary-scale biological changes occurring due to anthropogenic impacts.

Figure 1: The authors could consider grouping Figure 1B by latitude rather than alphabetically; this might reveal some interesting patterns that more clearly support their findings of different seasonal patterns in the north vs. the south.

We have re-designed Figure 1 to order the heatmap in panel B according to latitude. Additionally, we have added more data to panel A to provide more intuition.

109: systematic how? This paragraph and Figure 1A seem to discuss just two counties, but a "systematic" difference suggests to me that there's some kind of variation that's observed repeatedly across instances (counties). It would be good to discuss here what exactly is systematic about this variation.

Thank you for this suggestion. We have edited this sentence in the Results to not include the word “systematic”.

Figure 2: Did the data include Alaska and Hawaii and/or the territories? I would imagine these states might also have substantially different seasonal trends relative to the lower 48 and might give some indication of what sorts of seasonal trends we might expect outside the US.

We had previously limited our analysis to the continental US, but have now added Alaska and Hawaii to the results in Figures 1 and 2, as well as the supplementary figure maps. Alaska clusters with the Northern Cluster in Figure 2, while Hawaii’s islands demonstrate more dynamic indoor activity patterns with some islands grouping into the tourism cluster.

140: Heterogeneous how? Can you provide some measurement of the degree of heterogeneity?

Thank you for this point. We have edited the sentence to be more precise:

“We find that the temporal trends in indoor activity are less geographically structured in 2020 than those of previous years (see Supplementary Figure S7 for a characterization of the time series patterns).”

142: ("in most locations indoor activity deviated from pre-pandemic trends") – again, as measured how? In what fraction of locations? To what degree did the trends deviate?

146: ("activity was more likely to be outdoor than in prior years") – how much more likely? How widespread was this change?

These are excellent points. We have edited this section of the manuscript for clarity. Additionally, we now (a) report statistics to quantify the deviation in the 2020 indoor activity in the manuscript, pointing to new supplementary table S1; (b) quantify the locations in which activity shifted indoors vs outdoors during two periods of the COVID-19 pandemic in 2020 (new Figure S6).

158-159: It would be worth reporting the changes in amplitude and phase here, with appropriate units.

We have updated Supplementary Figure S9 (previously Figure S7), adding the differences between clusters in the inferred parameters and including the units.

161: A poorer fit as measured how?

We thank the reviewer for this helpful question. We have added a figure to the supplement (Figure S10) that shows model performance (in terms of root mean square error). We have also added this statement to the relevant Results section:

“While the fits are comparable for both clusters (Supplementary Figure S10B), the sinusoidal model does not capture the second peak of indoor activity during the summer months in the southern cluster.”
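The following sketch shows one way a single-harmonic sinusoid could be fit to a weekly series and scored by root mean square error. It uses synthetic data, not the paper's indoor activity data, and the functions `fit_sinusoid` and `rmse`, the 52-week period, and all amplitudes are assumptions chosen for illustration. The closed-form projection used here is only valid because the samples are evenly spaced and cover exactly one period.

```python
import math

WEEKS_PER_YEAR = 52  # assumed sampling: one value per week over one full year


def fit_sinusoid(y):
    """Least-squares fit of y[t] ~ baseline + A * sin(omega * t + phase),
    with the period fixed to len(y).

    With evenly spaced samples over exactly one period, the sine and cosine
    regressors are orthogonal, so the coefficients reduce to projections.
    """
    n = len(y)
    omega = 2 * math.pi / n
    baseline = sum(y) / n
    a = 2 / n * sum(v * math.sin(omega * t) for t, v in enumerate(y))
    b = 2 / n * sum(v * math.cos(omega * t) for t, v in enumerate(y))
    amplitude = math.hypot(a, b)
    phase = math.atan2(b, a)  # a*sin(x) + b*cos(x) == A*sin(x + phase)
    fitted = [baseline + amplitude * math.sin(omega * t + phase) for t in range(n)]
    return baseline, amplitude, phase, fitted


def rmse(observed, fitted):
    """Root mean square error between two equal-length series."""
    return math.sqrt(sum((o - f) ** 2 for o, f in zip(observed, fitted)) / len(observed))


omega = 2 * math.pi / WEEKS_PER_YEAR
weeks = range(WEEKS_PER_YEAR)

# Synthetic series with a single winter peak (week 0).
single_peak = [1.0 + 0.15 * math.cos(omega * t) for t in weeks]
# Synthetic series with an additional, smaller summer peak (second harmonic).
double_peak = [1.0 + 0.08 * math.cos(omega * t) + 0.05 * math.cos(2 * omega * t)
               for t in weeks]

_, amp_single, _, fit_single = fit_sinusoid(single_peak)
_, amp_double, _, fit_double = fit_sinusoid(double_peak)
rmse_single = rmse(single_peak, fit_single)
rmse_double = rmse(double_peak, fit_double)

print("single peak: amplitude", amp_single, "RMSE", rmse_single)
print("double peak: amplitude", amp_double, "RMSE", rmse_double)
```

In the synthetic double-peak series, the residual of the single-harmonic fit is exactly the second harmonic, which is why a model of this form cannot represent a secondary summer peak however well its amplitude and phase are estimated.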

170: "accuracy of disease models" – I appreciate that the authors have shown that using different seasonal forcing terms as inputs can yield different epidemic curves, but I don't think they've made a formal assessment of accuracy here, which would require comparing model fits to disease transmission data. The evidence presented so far does not make clear to me the degree of detail needed in the seasonal forcing term to accurately characterize disease transmission trends, as this will depend on the disease dynamics themselves and the temporal and geographic scale of the model.

We appreciate this point by the reviewer, and agree that future work will have to consider how indoor activity seasonality affects our ability to capture observed transmission trends. However, such work would additionally need careful characterization of other seasonal factors hypothesized to drive transmission (including environmental and other behavioral factors), and is beyond the scope of our work. Instead, in Figure 4 we aim to (a) provide the infectious disease modeling community with empirically-inferred parameters for a simple sinusoidal model which is commonly used in infectious disease models to capture transmission seasonality; and (b) demonstrate the implications of ignoring geographic heterogeneity in transmission seasonality in theoretical models of disease dynamics, which are commonly used for scenario analysis and model-based intervention design. As we demonstrate, transmission seasonality described by such sinusoidal models, even when they are empirically characterized as in our case, can lead to meaningfully different epidemic dynamics when transmission seasonality varies from the assumptions.

We have added text to our Discussion paragraph (starting with “We illustrate how to incorporate seasonality…”) to clarify these points.
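To make the point about differing epidemic dynamics concrete, here is a minimal sketch of a seasonally forced SIR model. The SIR structure, the Euler integration, and every parameter value (`beta0`, `gamma`, the amplitudes, the phase, the initial infected fraction) are assumptions for illustration rather than the model or estimates used in the manuscript.

```python
import math


def simulate_forced_sir(amplitude, phase, beta0=0.4, gamma=0.2,
                        days=365, dt=0.1, i0=1e-4):
    """Euler integration of an SIR model with sinusoidal transmission:
    beta(t) = beta0 * (1 + amplitude * sin(2*pi*t/365 + phase)).

    Returns the peak infected fraction, the day of the peak, and the final (S, I, R).
    All defaults are hypothetical values, not fitted parameters.
    """
    s, i, r = 1.0 - i0, i0, 0.0
    peak_size, peak_day = i, 0.0
    for step in range(int(days / dt)):
        t = step * dt
        beta = beta0 * (1 + amplitude * math.sin(2 * math.pi * t / 365 + phase))
        new_infections = beta * s * i * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        if i > peak_size:
            peak_size, peak_day = i, t + dt
    return peak_size, peak_day, (s, i, r)


# Same baseline transmission, two hypothetical seasonal amplitudes.
# A phase of pi/2 places the transmission maximum at t = 0.
peak_strong, day_strong, final_strong = simulate_forced_sir(amplitude=0.20, phase=math.pi / 2)
peak_weak, day_weak, final_weak = simulate_forced_sir(amplitude=0.05, phase=math.pi / 2)

print("strong seasonality: peak", peak_strong, "on day", day_strong)
print("weak seasonality:   peak", peak_weak, "on day", day_weak)
```

Swapping in a different amplitude or phase for a given location changes the simulated peak, which is the sense in which a mis-specified seasonal forcing term can alter the output of scenario analyses.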

234: Perhaps "southern latitudes should be targeted for such interventions in summer months as well"? It seems that southern latitudes still have a substantial winter peak, it's just that they also have a summer peak.

Thank you for this point. We agree, and have edited the sentence to clarify this:

“…, southern latitudes should be additionally targeted for such interventions in the summer months.”

278: Again, it would be worth specifying what global change events the authors have in mind here.

Please see our previous response. We have edited some of the mentions of “global change” to “climate change” in this paragraph for additional clarity.

Another general point for the Discussion: how should we interpret differences in amplitude across locations? Since σ is a measurement of the percent change in baseline activity, the indoor activity in a location with a high baseline but low σ might still be higher than the indoor activity in a location with a low baseline but a large σ. To what extent can we use σ to compare indoor activity across locations in the US? Or can we only use it to compare variations in indoor activity within counties? Would it be worth including some analysis of the baseline indoor activity across the US, since σ is really operating on this baseline?

Yes, this is correct. Our definition of indoor activity seasonality (σ) is intended to be a relative measure. Our focus is on seasonality, which measures deviations from a baseline, even if the baseline may be different for different locations. We have now added a sentence to the Discussion to highlight this point (fourth paragraph) and to the Results subsection titled “Implications for modeling seasonal disease dynamics”.
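A small numerical sketch may help show why a relative measure cannot be used to rank absolute indoor activity across locations. It assumes, for illustration only, that the relative measure is a raw indoor share divided by that location's own average over time; the function `relative_activity` and both series are hypothetical.

```python
def relative_activity(raw_series):
    """Scale a raw indoor-activity series by its own time average.

    Assumed form of a relative seasonality measure, for illustration only.
    """
    baseline = sum(raw_series) / len(raw_series)
    return [value / baseline for value in raw_series]


# Hypothetical raw indoor shares (fraction of visits that are indoor).
high_baseline = [0.88, 0.90, 0.92, 0.90]  # high level, weak seasonality
low_baseline = [0.50, 0.60, 0.70, 0.60]   # low level, strong seasonality

sigma_high = relative_activity(high_baseline)
sigma_low = relative_activity(low_baseline)

# The low-baseline county has the larger relative swing...
print("max relative:", max(sigma_high), max(sigma_low))
# ...but the high-baseline county still has more indoor activity in absolute terms.
print("max raw:     ", max(high_baseline), max(low_baseline))
```

Under this assumed definition, comparisons of σ across counties speak to the strength and timing of seasonality, while comparisons of absolute indoor activity would also require the baselines.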

288: Make explicit that you'll be referring to this as a POI.

Thanks. We’ve edited the sentence to clarify this.

294: What kind of spatial imputation did you do? Why?

We’ve added the following sentence to the methods:

“For location-weeks in which the total visit count is less than 100, we impute the indoor activity seasonality using an average of σ in the neighboring locations (where neighbors are defined based on shared county borders). This affects 0.6% of all county-weeks and a total of 79 (out of 3143) counties.”
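The imputation rule quoted above can be sketched as follows. The threshold of 100 visits and the use of border-sharing neighbors come from the quoted sentence; the data layout (dictionaries keyed by county and week), the function name `impute_sigma`, and the choice to average only over neighbors that themselves meet the threshold are assumptions.

```python
from statistics import mean

MIN_VISITS = 100  # threshold quoted in the text


def impute_sigma(sigma, visits, neighbors, min_visits=MIN_VISITS):
    """Return a copy of `sigma` in which low-count county-weeks are replaced
    by the mean of sigma over bordering counties in the same week.

    sigma, visits: dicts keyed by (county, week).
    neighbors: dict mapping each county to the counties it shares a border with.
    """
    imputed = dict(sigma)
    for (county, week), count in visits.items():
        if count >= min_visits:
            continue
        # Assumption: only neighbors with enough visits of their own contribute.
        donor_values = [
            sigma[(nb, week)]
            for nb in neighbors.get(county, [])
            if visits.get((nb, week), 0) >= min_visits and (nb, week) in sigma
        ]
        if donor_values:
            imputed[(county, week)] = mean(donor_values)
    return imputed


# Toy example with three hypothetical counties; county B has too few visits in week 1.
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
visits = {("A", 1): 500, ("B", 1): 40, ("C", 1): 300}
sigma = {("A", 1): 1.10, ("B", 1): 0.20, ("C", 1): 0.90}

result = impute_sigma(sigma, visits, neighbors)
print(result[("B", 1)])
```

County-weeks with no qualifying neighbor are left unchanged in this sketch; how such cases are handled in the actual analysis is not stated in the quoted sentence.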

Figures S7: It feels odd to me to have amplitude, frequency, and phase all plotted on the same vertical axis despite them all having different units. Perhaps a table would be better?

We have now added a table to the updated figure (Figure S9), and added the units and quantified the differences in the caption of this figure.

Reviewer #3 (Recommendations for the authors):

Overall, it is an excellent paper and very well written. However, there are some issues I'd like to be considered:

The major issue is related to the definition of "indoor". It should be stated from the beginning what "indoor" locations are, in particular that home is not one of them, which is an important point for understanding the results (and the discussion). At the moment, this is only defined in the Methods (Line 300), at the end of the article.

Thank you for this point. We have now added text to the Results to define indoor, and added more details to the Methods section. Additionally, we have now added a sensitivity analysis to the Supplement (Figure S11) that shows that the locations that could not be classified as indoor or outdoor do not make a significant impact on the measure of indoor activity seasonality. Also, we have added a figure (Figure S1) that shows that time spent at home does not appear to be highly seasonal.

Line 43-41: Maybe authors could extend a few references on seasonality causes. But it is not mandatory.

Our original list of references covers a diverse set of factors discussed in the literature, including environmental and social ones; we have therefore not added further references.

Figure 1B: Why are counties ordered alphabetically? This ordering conveys little information, whereas ordering by latitude, for example, could reveal some patterns.

We have re-designed Figure 1 to order the heatmap in panel B according to latitude. Additionally, we have added more data to panel A to provide more intuition.

Lines 103-106: the authors should define more precisely what the average is (county-level? season? …).

We have added a clarification to that sentence to specify that it’s a county-specific average over time.

Cluster: Maybe the clustering methods could be explained more extensively in the appendix.

We have added more details on the clustering methods in the Methods and Supplement (Figure S13). Additionally, we have now added a comparison to other clustering methods to demonstrate the robustness of our results (Supplementary Figure S14).

Figure 3: I found it difficult to observe a shift in indoor activities, according to season.

We have now provided in the Supplement (Table S1 and Figure S6) additional summary statistics to characterize the deviation in indoor activity seasonality during 2020 from earlier years.

Line 232-233: Isn't it the contrary? Increase of indoor activities in winter for northern regions?

We’re not sure we follow this point, so please feel free to elaborate. This sentence is about the design of public health intervention strategies in which transmission risk may be lowered by decreasing indoor activity and/or increasing outdoor activity during winters in northern areas (where indoor activity is naturally high during this period).

Finally, a discussion on the difference between relationship and causality could be useful to distinguish human behavior seasonality and infectious diseases seasonality.

We agree with the reviewer on this important point. We have made an effort to make this point in our Discussion section in a few ways:

– In the second paragraph, we discuss the many underlying mechanisms via which indoor activity seasonality may reflect seasonality of infectious disease. Some of these mechanisms are directly causal (e.g. high indoor activity leads to increased contact which leads to increased transmission of airborne diseases), while others are indirectly causal or simply correlative (e.g. indoor activity suggests increased transmissibility due to poor ventilation).

– In the third paragraph of the Discussion section, we make a case for considering behavior seasonality and environmental seasonality in disease models as each factor has the potential to affect disease dynamics independently or via an interaction.

Associated Data

    Supplementary Materials

    MDAR checklist

    Data Availability Statement

    We make available on GitHub the data and code needed to reproduce all figures and analyses in this manuscript: https://github.com/bansallab/indoor_outdoor (copy archived at swh:1:rev:d8a2ffc49f46a22c45814bd1dfcd1b054f2a4a27). The dataset we provide is of the metric used in all our analyses and figures ("indoor activity"). This dataset can be regenerated using the Safegraph Weekly Patterns datasets found at https://docs.safegraph.com/docs/weekly-patterns and code in the GitHub repository. The Safegraph Weekly Patterns dataset was made freely available to academics at a uniquely granular level in response to the COVID-19 pandemic. Safegraph's business model involves selling these datasets to other corporations and, as a result, any data access agreement with the company forbids sharing of the raw data. The company does, however, make its data freely available to academics (for non-commercial use) through an institutional university subscription to Dewey or an individual data use agreement with Safegraph.


    Articles from eLife are provided here courtesy of eLife Sciences Publications, Ltd
