Estimating intervention effects on infectious disease control: The effect of community mobility reduction on Coronavirus spread

Andrew Giffin; Wenlong Gong; Suman Majumder; Ana G Rappold; Brian J Reich; Shu Yang

doi:10.1016/j.spasta.2022.100711

Spat Stat. 2022 Oct 21;52:100711. doi: 10.1016/j.spasta.2022.100711

PMCID: PMC9584839 PMID: 36284923

Abstract

Understanding the effects of interventions, such as restrictions on community and large group gatherings, is critical to controlling the spread of COVID-19. Susceptible–Infectious–Recovered (SIR) models are traditionally used to forecast the infection rates but do not provide insights into the causal effects of interventions. We propose a spatiotemporal model that estimates the causal effect of changes in community mobility (intervention) on infection rates. Using an approximation to the SIR model and incorporating spatiotemporal dependence, the proposed model estimates a direct and indirect (spillover) effect of intervention. Under an interference and treatment ignorability assumption, this model is able to estimate causal intervention effects, and additionally allows for spatial interference between locations. Reductions in community mobility were measured by cell phone movement data. The results suggest that the reductions in mobility decrease Coronavirus cases 4 to 7 weeks after the intervention.

Keywords: Causal inference, COVID-19, Spatiotemporal modeling, SIR model

1. Introduction

Since the Coronavirus exploded into a global pandemic, much research has been done to understand its epidemiological characteristics and quantify the effectiveness of various interventions to mitigate disease spread (Li et al., 2020, Livingston and Bucher, 2020, Dandekar and Barbastathis, 2020, Dehning et al., 2020, Cowling et al., 2020, Prem et al., 2020, Carroll and Prentice, 2021, Chen et al., 2020, Lee et al., 2021, Lyu and Wehby, 2020, Nouvellet et al., 2021, Rashed and Hirata, 2021). The traditional model for the progression of an infectious disease uses a set of differential equations to decompose the population at risk into the number of susceptible (S), infected (I), or recovered (R) individuals and provide time-dependent trajectories of these compartments. Potential effects of interventions such as school closure, work from home, social distancing, etc. can be incorporated in the SIR model through the transitions between compartments. For example, there has been substantial effort put forth to combine machine learning and/or deep learning with the SIR model to predict the trend of the spread (Dandekar and Barbastathis, 2020, Punn et al., 2020, Magdon-Ismail, 2020, Lyu et al., 2020, Kounchev et al., 2020). Similar efforts have attempted to adopt Bayesian approaches to provide uncertainty quantification of predictions (Dehning et al., 2020, Mbuvha and Marwala, 2020, Bradley, 2020).

The SIR model and the above variants do not account for the spatial distribution of the number of susceptible, infected, or recovered people (Kermack and McKendrick, 1927). A number of attempts have been made to modify the SIR model to incorporate this spatial distribution. One strategy considers diffusion of the infected population by modeling the individual agents (Reluga, 2004, Burger et al., 2009). Another branch of the literature considers Brownian motion-like movement for subgroups of people belonging to each compartment (Chinviriyasit and Chinviriyasit, 2010, Hilker et al., 2007, Robinson et al., 2012, Wang et al., 2010), while yet another group uses cross-diffusion, an assumption under which the susceptible move away from an increasing gradient of the infected, to bring a spatial distribution into the SIR model (Berres and Ruiz-Baier, 2011, Milner and Zhao, 2008, Capasso and Di Liddo, 1994). Models that assume that patches of population have different infection parameters and are connected by travel or migration have also been proposed (Arino and Van den Driessche, 2003, Hyman and LaForce, 2003, Sattenspiel et al., 1995, Sattenspiel and Herring, 2003, Lee et al., 2012, Lee and Castillo-Chavez, 2015, Lee and Jung, 2015).

Our method is based on a spatial SIR model that uses an “infectious radius” to allow for spread across locations (Paeng and Lee, 2017). This model is discrete in time, with space partitioned into distinct areas (e.g., counties). The number of individuals in each compartment and area evolves over time using the same mechanics as the standard SIR model. A distance metric is defined between counties, so that new infections in county $j$ can depend on the number of infected in nearby areas.

Our statistical contribution is using the SIR model as a stepping stone towards a spatiotemporal statistical model that permits causal inference without including differential-equation solvers. The result is a model with a very different flavor than the standard SIR model. While the goal of an SIR model is often to forecast the severity of a disease over time, the model presented here is designed to tease out causal effects of interventions.

Integrating causal inference into this spatial framework presents some challenges. Existing SIR models are generally used for prediction and forecasting, but not for ascertaining causal effects of interventions. Causal methods generally require adjusting for confounders that are related to both intervention and response, which allows for identification of the causal intervention effect as opposed to the observed association between intervention and response. However, if interventions at one location affect the response at other locations – a phenomenon known as interference – then the standard causal framework which adjusts only for local confounding is rendered ineffective. With a virus that spreads across space, clearly some degree of interference is involved, and a causal analysis must take this into account. This is typically done by placing assumptions on the form of interference. Commonly used assumptions on interference include “partial interference” in which the population is divided into isolated clusters, as well as “network interference” in which interference is allowed along a pre-specified network, and “spatial interference” in which interference is tempered by the distance between locations (Reich et al., 2020, Halloran and Struchiner, 1991, Sobel, 2006, Giffin et al., 2020). This paper uses a parsimonious assumption on interference, which allows for interference from neighboring counties only. This can be thought of as a simple version of both the spatial and network interference assumptions. Intervention effects are then divided between direct (within county) interventions and indirect (between county) interventions.

The framework that we propose approximates the standard SIR model in a way that allows for estimation of the causal effects of interventions. In particular, we use the fact that infection rates are likely smooth across space to obtain an approximation to the rate of infection across space and time. This approximation is a function of the intervention variables, covariates, and a spatiotemporal model, and allows us to fit the observed data and obtain intervention effect estimates.

The remainder of the paper proceeds as follows. The motivating data are described in Section 2. Section 3 introduces the statistical method and corresponding theoretical properties. The method is then evaluated using a simulation study in Section 4, where we show that the approximate model can recover causal effects from data generated from the SIR model. Section 5 applies the proposed model to estimate the effect of decreases in mobility on Coronavirus cases, finding that decreases in mobility cause a decrease in local Coronavirus cases 4–7 weeks later. Section 6 concludes.

2. Data description

We estimate the effect of mobility on the number of observed Coronavirus cases. There are three primary types of data for analysis: intervention (mobility reduction), response (Coronavirus cases) and covariates. All variables are defined temporally by the week ( $t$ ) and spatially by the county ( $j$ ). Data from March 6, 2020 through October 8, 2020 for 3,137 counties or county equivalents are used in this study.

The response data $Y_{j} (t)$ are new, recorded cases of Coronavirus in a given county/week. These data are taken from the publicly available Johns Hopkins University Coronavirus Resource Center, which aggregates Coronavirus case counts and provides a daily, cumulative count of cases in each county (Dong et al., 2020).

The intervention data are publicly available measures of mobility – as measured by Google devices – which have been shown to have strong associations with Coronavirus case data (Google LLC, 2020, Chang et al., 2021, Yilmazkuday, 2020). We use an aggregate mobility measure that includes the categories: “retail/recreation”, “grocery/pharmacy”, “transit station”, “workplace”, and “residential” mobility. The first four categories measure the number of visits to such locations in comparison to baseline; the “residential mobility” category is a measure of the length of stay in comparison to baseline. These data are given as percentage change from a baseline level which was taken over January 3, 2020-February 6, 2020. Intuitively, in the months since the pandemic began, “retail/recreation”, “grocery/pharmacy”, “transit station” and “workplace” mobility have decreased substantially from the baseline and “residential” mobility (i.e., amount of time spent at home) has increased. The metric that we will use as intervention $A_{j} (t)$ for a given county/week is the mean of percentage change from baseline of “retail/recreation”, “grocery/pharmacy”, “transit station”, “workplace”, and negative “residential” mobility. We note that the mean over these categories is a somewhat crude measure of mobility; however, it is used so as to reduce variability coming from any single category.

Because of privacy concerns, county/days with too few users are not provided by Google. When a subset of the categories is provided for a given county/day, the mean of the available categories is taken. When none of the categories are provided for a given county/day, we impute the missing value from any available first- and second-degree neighbors (with values weighted by 1 divided by the distance between centroids, where the available weights are first standardized so that they sum to 1).
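The aggregation and imputation rules described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the use of NaN to mark a withheld category, and the example numbers are all assumptions made here.

```python
import numpy as np

def aggregate_mobility(retail, grocery, transit, workplace, residential):
    """Mean percentage change over the available categories, with the
    residential category sign-flipped. NaN marks a withheld category (assumption)."""
    cats = np.array([retail, grocery, transit, workplace, -residential], dtype=float)
    if np.isnan(cats).all():
        return np.nan                      # nothing reported: impute from neighbors
    return float(np.nanmean(cats))

def impute_from_neighbors(values, distances):
    """Inverse-distance weighted mean of the available neighbor values."""
    values = np.asarray(values, dtype=float)
    distances = np.asarray(distances, dtype=float)
    ok = ~np.isnan(values)                 # keep only neighbors with a value
    if not ok.any():
        return np.nan
    w = 1.0 / distances[ok]
    w /= w.sum()                           # standardize weights to sum to 1
    return float(w @ values[ok])

# Hypothetical county/day values (percentage change from baseline).
full = aggregate_mobility(-40, -10, -50, -30, 15)
partial = aggregate_mobility(-40, -10, np.nan, -30, 15)
# Three hypothetical neighbors at centroid distances 10, 20 and 30; the second is missing.
imputed = impute_from_neighbors([-10, np.nan, -30], [10, 20, 30])
print(full, partial, imputed)
```

In the imputation example the two available neighbors receive standardized weights 0.75 and 0.25, so the closer county dominates the imputed value.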

In addition we include a number of static and time-varying covariates which are potentially related to virus spread or mobility. For the static covariates, we rely heavily on a curated dataset compiled by Wu et al. 2020, which examines the relationship between air pollution and COVID-19 (Wu et al., 2020a, Wu et al., 2020b). Variables from this dataset include poverty rate, population density, median household income, and total population, originally taken from the 2016 American Community Survey, as well as data on the number of hospital beds. Since this and other works have shown that PM_2.5 (particulate matter smaller than 2.5 micrometers) may interact significantly with COVID-19, we included the county level PM_2.5 estimates from 2016 from this dataset (Alexeeff et al., 2015, Chudnovsky et al., 2012). To this dataset we added a number of other variables. The number of ICU beds in 2020 was compiled by Environmental Systems Research Institute (ESRI) (ESRI, 2020). Additionally, from the 2018 American Community Survey (ACS) we included median age and the number of foreign born residents (US Census Bureau, 2016). From the Bureau of Labor Statistics we included the number of people employed in healthcare and social services in September 2019 (US Bureau of Labor Statistics, 2019). We also included time-varying county-level daily meteorological data, as this could inform levels of mobility and spread of disease. We use daily mean temperature (degrees Celsius), as well as relative humidity and mean dew point (degrees Celsius) from NOAA’s Global Surface Summary of the Day using the “GSODR” R package (Sparks et al., 2017). Then we obtain county-level data by predicting at the mean county latitude/longitude coordinates from the 2010 Census using thin plate spline regression from the “fields” R package (Nychka et al., 2014).

3. Main methodology

3.1. Notation

Let $Y_{j} (t)$ be the number of new cases of Coronavirus reported during week $t \in \{1, \dots, T\}$ and county $j \in \{1, \dots, J\}$ , where $T = 31$ and $J = 3{,}137$ . For county $j$ , denote $N_{j}$ as the population size and $X_{j} (t)$ as a vector of covariates for week $t$ . The direct intervention variable, $A_{j} (t)$ , is the mobility percentage change from baseline for county $j$ and week $t$ . In addition to the direct intervention, we allow for an indirect (spillover) intervention ${\tilde{A}}_{j} (t)$ . This variable captures the interventions received by neighboring counties. Including this term allows for interference between neighbors, since the response at county $j$ can now be impacted by the interventions in surrounding counties. The spatial configuration of the counties is summarized by their adjacency, with $c_{j j} = 0$ , $c_{j k} = 1$ if counties $j$ and $k$ are adjacent and $c_{j k} = 0$ otherwise. ${\tilde{A}}_{j} (t)$ is defined as the mean direct intervention over the $m_{j} = \sum_{k = 1}^{J} c_{j k}$ adjacent regions: ${\tilde{A}}_{j} (t) = \sum_{k = 1}^{J} c_{j k} A_{k} (t) / m_{j}$ . Similarly, the mean of the adjacent covariate values is ${\tilde{X}}_{j} (t) = \sum_{k = 1}^{J} c_{j k} X_{k} (t) / m_{j}$ .
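As a concrete illustration of the spillover variable, the sketch below averages each county's neighbors using a binary adjacency matrix. The four-county chain and the mobility values are hypothetical, and `neighbor_mean` is a name introduced here for illustration.

```python
import numpy as np

def neighbor_mean(C, A):
    """Average of the neighbors' values: sum_k C[j, k] * A[k] / m[j]."""
    C = np.asarray(C, dtype=float)
    A = np.asarray(A, dtype=float)
    m = C.sum(axis=1)                      # m_j: number of neighbors of county j
    if np.any(m == 0):
        raise ValueError("every county needs at least one neighbor")
    total = C @ A                          # sums the neighbors' values
    return total / m if A.ndim == 1 else total / m[:, None]

# Hypothetical chain of four counties: 0 - 1 - 2 - 3.
C = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
A = np.array([-10.0, -20.0, -30.0, -40.0])  # mobility % change in one week
A_tilde = neighbor_mean(C, A)
print(A_tilde)
```

Because $c_{j j} = 0$, a county's own value never enters its spillover term; the same function applies to a covariate matrix with one row per county.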

3.2. Conceptual SIR model

To motivate the proposed statistical framework, we begin by describing the discrete-time, spatial SIR model proposed in Paeng and Lee (2017). Let $S_{j} (t)$ , $I_{j} (t)$ and $R_{j} (t)$ be the number of susceptible, infected and recovered individuals in county $j$ at time $t$ . These three states evolve over time but always satisfy the constraint that $N_{j} = S_{j} (t) + I_{j} (t) + R_{j} (t)$ . For each county $j$ the state evolution is similar to the classical SIR model (Kermack and McKendrick, 1927):

S_{j} (t + 1) - S_{j} (t) = - λ_{j} (t),

I_{j} (t + 1) - I_{j} (t) = λ_{j} (t) - γ I_{j} (t),

R_{j} (t + 1) - R_{j} (t) = γ I_{j} (t),

(1)

where $γ > 0$ is the recovery rate and $λ_{j} (t) = β \frac{S_{j} (t)}{N_{j}} \{\sum_{k = 1}^{J} W_{j k} I_{k} (t)\}$ is the rate of new infections in county $j$ and week $t$ . Here $W_{j k}$ is proportional to the contact rate between individuals in region $j$ and individuals in region $k$ .

We expand on this by allowing the infection rate $β_{j} (t) > 0$ to vary with region and time:

λ_{j} (t) = β_{j} (t) \frac{S_{j} (t)}{N_{j}} \{\sum_{k = 1}^{J} W_{j k} I_{k} (t)\} .

(2)

This allows us to model spatiotemporal variation in $β_{j} (t)$ as a function of the covariates $X_{j} (t)$ and direct and indirect intervention variables $A_{j} (t)$ and ${\tilde{A}}_{j} (t)$ . In the absence of other information on connectivity, we simply set $W_{j j} = (1 - ϕ)$ and $W_{j k} = ϕ \cdot c_{j k} / m_{j}$ for $ϕ \in [0, 1]$ so that large $ϕ$ leads to strong spatial dependence and vice versa.
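One update of the state equations (1) with the infection rate (2) and the weights $W_{j k}$ defined above can be sketched as below. The three-county adjacency, the initial states and the value of $β_{j} (t)$ are hypothetical; only $ϕ = 0.4$ and $γ = 0.1$ echo values used later in the simulation study.

```python
import numpy as np

def contact_weights(C, phi):
    """W[j, j] = 1 - phi and W[j, k] = phi * c_jk / m_j, so each row sums to 1."""
    C = np.asarray(C, dtype=float)
    m = C.sum(axis=1)
    return (1.0 - phi) * np.eye(len(C)) + phi * C / m[:, None]

def sir_step(S, I, R, N, beta, gamma, W):
    """One discrete-time update of the spatial SIR states."""
    lam = beta * S / N * (W @ I)           # lambda_j(t): new infections in county j
    S_next = S - lam
    I_next = I + lam - gamma * I
    R_next = R + gamma * I
    return S_next, I_next, R_next, lam

# Hypothetical three-county chain with made-up initial conditions.
C = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
W = contact_weights(C, phi=0.4)
N = np.full(3, 100_000.0)
I = np.array([100.0, 50.0, 10.0])
R = np.zeros(3)
S = N - I
beta = np.full(3, 0.3)                     # illustrative infection rate
S1, I1, R1, lam = sir_step(S, I, R, N, beta, gamma=0.1, W=W)
print(lam)
```

The update conserves the population in each county, since the $λ_{j} (t)$ and $γ I_{j} (t)$ terms cancel when the three equations are summed.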

A major difficulty in using model (1) is that we do not observe the latent states $S_{j} (t)$ , $I_{j} (t)$ or $R_{j} (t)$ . We learn about these latent processes via the reported number of new cases $Y_{j} (t)$ . Further complicating the statistical model is that there may be under-reporting and a lag time between an increase in the true and reported infection rate due to the latency of the disease. Because of these issues, we link $Y_{j} (t)$ and $λ_{j} (t)$ in the SIR model by the over-dispersed Poisson distribution

Y_{j} (t) | λ_{j} (t), g_{j} (t) \overset{indep}{\sim} Poisson [p exp {g_{j} (t)} λ_{j} (t - l)],

(3)

where $p \in (0, 1)$ accounts for under-reporting, $g_{j} (t) \overset{iid}{\sim} Normal (0, τ^{2})$ captures over-dispersion, and $l \in \{0, 1, 2, \dots\}$ is the reporting lag. Setting $l > 0$ allows for cases to be observed after the actual infection, as is common for Coronavirus cases.
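The observation model (3) can be simulated in a few lines once a matrix of infection rates is available. In this sketch the rates, $p$, $τ$ and $l$ are placeholder values, the function name is hypothetical, and weeks without a lagged rate are left as NaN by choice.

```python
import numpy as np

def simulate_reports(lam, p, tau, lag, rng):
    """Draw reported cases from (3). lam has shape (J, T); the first `lag`
    weeks have no lagged rate available and are returned as NaN."""
    J, T = lam.shape
    g = rng.normal(0.0, tau, size=(J, T))  # over-dispersion terms g_j(t)
    Y = np.full((J, T), np.nan)
    mean = p * np.exp(g[:, lag:]) * lam[:, :T - lag]
    Y[:, lag:] = rng.poisson(mean)
    return Y

rng = np.random.default_rng(0)
lam = np.full((3, 6), 200.0)               # placeholder infection rates
Y = simulate_reports(lam, p=0.5, tau=0.3, lag=2, rng=rng)
print(Y)
```

Multiplying the Poisson mean by $exp \{g_{j} (t)\}$ inflates the variance of the counts relative to a plain Poisson model, which is the sense in which (3) is over-dispersed.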

3.3. Gaussian approximation

The SIR model in Section 3.2 is an elegant way to model and forecast the spread of a disease through a population. However, the solution to the difference equations is complex, which hinders statistical estimation of the model parameters. Previously, methods have been proposed to mimic the mechanistic model to provide realistic forecasts, e.g., Buckingham-Jeffery et al. (2018). Here, we propose an approximation to the SIR model to allow for estimation of intervention effects on local infection rates.

The key insight is that the rate of new infections in (2) can be re-expressed as

λ_{j} (t) = β_{j} (t) exp {θ_{j} (t)} + v_{j} (t),

(4)

exp {θ_{j} (t)} = S_{j} (t) I_{j} (t) / N_{j},

(5)

v_{j} (t) = β_{j} (t) ϕ \frac{S_{j} (t)}{N_{j}} \sum_{k = 1}^{J} \frac{c_{j k}}{m_{j}} {I_{k} (t) - I_{j} (t)} .

(6)

Both $θ_{j} (t)$ and $v_{j} (t)$ in (4) are latent spatial and temporal processes due to the unobserved $S_{j} (t)$ and $I_{j} (t)$ , which we approximate by spatiotemporal models.
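For readers who want the intermediate step, the decomposition in (4) to (6) follows by substituting $W_{j j} = (1 - ϕ)$ and $W_{j k} = ϕ \cdot c_{j k} / m_{j}$ into (2) and using $\sum_{k = 1}^{J} c_{j k} / m_{j} = 1$. The following is a worked restatement of that algebra:

```latex
\begin{aligned}
\lambda_j(t)
  &= \beta_j(t)\,\frac{S_j(t)}{N_j}
     \Big\{(1-\phi)\,I_j(t) + \phi\sum_{k=1}^{J}\frac{c_{jk}}{m_j}\,I_k(t)\Big\} \\
  &= \beta_j(t)\,\frac{S_j(t)}{N_j}
     \Big[I_j(t) + \phi\sum_{k=1}^{J}\frac{c_{jk}}{m_j}\,\{I_k(t)-I_j(t)\}\Big] \\
  &= \beta_j(t)\,\frac{S_j(t)\,I_j(t)}{N_j}
     + \beta_j(t)\,\phi\,\frac{S_j(t)}{N_j}\sum_{k=1}^{J}\frac{c_{jk}}{m_j}\,\{I_k(t)-I_j(t)\} \\
  &= \beta_j(t)\exp\{\theta_j(t)\} + v_j(t).
\end{aligned}
```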

We model $θ_{j} (t)$ as a separable spatiotemporal conditional autoregressive model (Stein, 2005). Denote $θ (t) = \{θ_{1} (t), \dots, θ_{J} (t)\}$ and $θ = \{θ {(1)}^{⊤}, \dots, θ {(T)}^{⊤}\}^{⊤}$ . Then the spatiotemporal conditional autoregressive (STCAR) model for $θ$ is multivariate normal with mean zero and covariance $σ^{2} Ω (ρ_{t}) \otimes Σ (ρ_{s})$ . Spatial dependence is governed by $Σ (ρ_{s}) = {(M - ρ_{s} C)}^{- 1}$ , where $M$ is diagonal with diagonal elements $m_{j}$ and the $(j, k)$ element of $C$ is 1 if sites $j$ and $k$ are adjacent and zero otherwise. If $θ (t) \sim Normal \{0, σ^{2} Σ (ρ_{s})\}$ , then $θ_{j} (t) \mid \{θ_{k} (t) : k \neq j\} \sim Normal (ρ_{s} \sum_{k \sim j} θ_{k} (t) / m_{j}, σ^{2} / m_{j})$ , so that $ρ_{s}$ determines the strength of spatial dependence and $σ$ is the scale parameter. We denote this model as $θ (t) \sim CAR (σ, ρ_{s})$ . Similarly, temporal dependence is governed by $Ω (ρ_{t})$ , which has the same form as $Σ (ρ_{s})$ except that it uses a temporal adjacency structure in which times $t - 1$ and $t + 1$ are the neighbors of time $t$ . We refer to this as the $STCAR (σ, ρ_{s}, ρ_{t})$ model.
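To make the covariance structure concrete, the sketch below builds the spatial and temporal CAR covariances and combines them with a Kronecker product, with the temporal factor first so that it matches the stacking of $θ$ by week. The three-county adjacency and parameter values are hypothetical, and forming the full covariance is only practical for small $J$ and $T$.

```python
import numpy as np

def car_covariance(C, rho):
    """(M - rho * C)^{-1} with M = diag(m_j); positive definite for 0 <= rho < 1."""
    C = np.asarray(C, dtype=float)
    M = np.diag(C.sum(axis=1))
    return np.linalg.inv(M - rho * C)

def path_adjacency(T):
    """Temporal adjacency: week t has neighbors t - 1 and t + 1."""
    C = np.zeros((T, T))
    i = np.arange(T - 1)
    C[i, i + 1] = 1.0
    C[i + 1, i] = 1.0
    return C

def stcar_covariance(C_space, T, sigma, rho_s, rho_t):
    """sigma^2 * (temporal covariance) kron (spatial covariance)."""
    Sigma_s = car_covariance(C_space, rho_s)
    Omega_t = car_covariance(path_adjacency(T), rho_t)
    return sigma**2 * np.kron(Omega_t, Sigma_s)

# Hypothetical three-county chain observed for four weeks.
C_space = np.array([[0, 1, 0],
                    [1, 0, 1],
                    [0, 1, 0]])
cov = stcar_covariance(C_space, T=4, sigma=1.0, rho_s=0.9, rho_t=0.5)
rng = np.random.default_rng(1)
theta = np.linalg.cholesky(cov) @ rng.standard_normal(cov.shape[0])
theta = theta.reshape(4, 3)                # rows are weeks, columns are counties
print(theta.round(2))
```

The diagonal of the spatial precision matrix $M - ρ_{s} C$ equals $m_{j}$, which is why the conditional variance of $θ_{j} (t)$ given its neighbors is $σ^{2} / m_{j}$.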

The second term in (4), $v_{j} (t)$ , sums over the $m_{j}$ regions adjacent to region $j$ , and is a function of the local differences, $I_{k} (t) - I_{j} (t)$ . Assuming the number of infected individuals varies smoothly across space, these local differences should be small. Therefore, one approximation we consider is simply setting $v_{j} (t) = 0$ for all $j$ and $t$ . For cases where these terms cannot be removed, we note that because they are functions of local differences they should be less spatially correlated than the spatial process itself, and thus a second approximation is $log \{v_{j} (t)\} \overset{iid}{\sim} Normal (μ_{v}, σ_{v}^{2})$ . This form implicitly forces the $v_{j} (t)$ term in (7) to be positive. This approximation is justified because this term will be close to zero for spatially-correlated processes, as can be seen in (6). Approximating $v_{j} (t)$ as Gaussian is another option, but this would require additional constraints to ensure $λ_{j} (t) > 0$ . Modeling $v_{j} (t)$ as log-normal solves this issue and provides accurate results as shown in Section 4.

A crucial point is that neither our model for $θ_{j} (t)$ nor our model for $v_{j} (t)$ attempts to retain the mechanistic properties of $I_{j} (t)$ and $S_{j} (t)$ . Therefore, this model would not provide reliable forecasts of future disease spread. However, forecasting is not our objective. Our aim is to provide a computationally feasible method that uses observed data to estimate effects of intervention variables on local infection rates. The following section verifies that the approximation indeed has a causal interpretation under standard assumptions from causal inference, and proposes an estimation procedure for these effects.

For the final component, modeling the spatiotemporal infection rate $β_{j} (t)$ , we regress $log {β_{j} (t)}$ on covariates $X_{j} (t)$ and ${\tilde{X}}_{j} (t)$ as well as mobility variables $A_{j} (t)$ and ${\tilde{A}}_{j} (t)$ as

η_{j} (t) = log {β_{j} (t)} = α_{0} + X_{j} {(t)}^{⊤} α_{1} + {\tilde{X}}_{j} {(t)}^{⊤} α_{2} + A_{j} (t) δ_{1} + {\tilde{A}}_{j} (t) δ_{2}

where $δ_{1}$ and $δ_{2}$ quantify the direct and indirect (spillover) causal effects of mobility on infection rate. The covariate vectors $X_{j} (t)$ and ${\tilde{X}}_{j} (t)$ can include the original covariates, covariates derived as propensity scores (Section 3.4), or both.

Since the reporting rate $p$ in (3) cannot be identified separately from the intercept terms $α_{0}$ and $μ_{v}$ , we fit the final model

Y_{j} (t) | g_{j} (t), θ, {\tilde{v}}_{j} (t) \overset{indep}{\sim} Poisson [exp {g_{j} (t) + η_{j} (t - l) + θ_{j} (t - l)} + exp {{\tilde{v}}_{j} (t)}]

(7)

η_{j} (t) = α_{0} + X_{j} {(t)}^{⊤} α_{1} + {\tilde{X}}_{j} {(t)}^{⊤} α_{2} + A_{j} (t) δ_{1} + {\tilde{A}}_{j} (t) δ_{2}

(8)

g_{j} (t) \overset{iid}{\sim} Normal (0, τ^{2})

θ \sim STCAR (σ, ρ_{s}, ρ_{t})

{\tilde{v}}_{j} (t) \overset{iid}{\sim} Normal (μ_{v}, σ_{\tilde{v}}^{2}) .

The final term in (7) combines the overdispersion term $g_{j} (t)$ and the nugget term $v_{j} (t)$ . When the dimension of the covariates is large, we propose dimension reduction for $log \{β_{j} (t)\}$ using the generalized propensity score (Imbens, 2000) in Section 3.4. To complete the Bayesian model, we specify noninformative independent Normal $(0, 10^{2})$ priors for $α_{0}$ , $δ_{1}$ , $δ_{2}$ , $μ_{v}$ and the elements of $α_{1}$ and $α_{2}$ , and noninformative priors for covariance parameters $σ^{2}, τ^{2}, σ_{\tilde{v}}^{2} \sim InvGamma (0.1, 0.1)$ and $ρ_{s}, ρ_{t} \sim Uniform (0, 1)$ .

3.4. The potential outcomes framework

In this section, we provide a causal interpretation of $δ_{1}$ and $δ_{2}$ under the potential outcomes framework (Rubin, 1974). To ease discussion, let $Y_{j} (t)$ be the total number of cases, rather than reported number (i.e., $p = 1$ ). We use the potential outcomes framework to define the causal effect of mobility on infection rate. We use overbar to denote all history. Let ${\bar{a}}_{j} (t) = {(a_{j} (1), \dots, a_{j} (t))}^{⊤}$ be the trajectory of mobility level at region $j$ through time $t$ . Let $\bar{a} (t) = ({\bar{a}}_{1} (t), \dots, {\bar{a}}_{J} (t))$ be the trajectory of mobility level for all regions through time $t$ . Define $Y_{j}^{\bar{a} (t)} (t)$ to be the (possibly counterfactual) number of new cases in region $j$ at time $t$ had all the regions controlled the mobility level at $\bar{a} (t)$ through time $t$ .

Assume the potential outcome model for the SIR model with $θ_{j} (t) = log \{S_{j} (t) I_{j} (t) / N_{j}\}$ and $v_{j} (t) = β_{j} (t) ϕ \frac{S_{j} (t)}{N_{j}} \sum_{k = 1}^{J} \frac{c_{j k}}{m_{j}} \{I_{k} (t) - I_{j} (t)\}$ as in (5), (6), and

Y_{j}^{\bar{a} (t)} (t) \sim Poisson {λ_{j}^{\bar{a} (t)} (t)},

λ_{j}^{\bar{a} (t)} (t) = β_{j}^{\bar{a} (t)} (t) exp {θ_{j} (t)} + v_{j} (t),

(9)

log {β_{j}^{\bar{a} (t)} (t)} = a_{j} (t) δ_{1} + {\tilde{a}}_{j} (t) δ_{2} + h {\bar{X} (t), \bar{S} (t - 1), \bar{I} (t - 1)} .

Consider two regimes $\bar{a} (t)$ and ${\bar{a}}^{'} (t)$ , where all components are the same except for $a_{j} (t) = a_{j}^{'} (t) + 1$ , and $v_{j} (t) = 0$ . Then under model (9),

\frac{E {Y_{j}^{\bar{a} (t)} (t) ∣ \bar{X} (t), \bar{S} (t - 1), \bar{I} (t - 1)}}{E {Y_{j}^{{\bar{a}}^{'} (t)} (t) ∣ \bar{X} (t), \bar{S} (t - 1), \bar{I} (t - 1)}} = exp (δ_{1}),

where $\bar{X} (t)$ is the full history of covariates and intervention through time $t$ , and $\bar{S} (t - 1)$ and $\bar{I} (t - 1)$ are the full history of susceptible and infected, excluding time $t$ . (Because $S_{j} (t - 1) + I_{j} (t - 1) + R_{j} (t - 1) = N_{j}$ , observing $\bar{S} (t - 1)$ and $\bar{I} (t - 1)$ implies $\bar{R} (t - 1)$ .) Then $exp (δ_{1})$ is the risk ratio of new cases associated with increasing the mobility level by 1 unit locally (direct effect). If $v_{j} (t)$ is nonzero but small in comparison to $β_{j}^{\bar{a} (t)} (t) exp \{θ_{j} (t)\}$ , this should hold as a good approximation.
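The step from model (9) to the ratio $exp (δ_{1})$ can be written out explicitly. The sketch below assumes $v_{j} (t) = 0$ and uses the fact that the two regimes share the same history, so that $h$, $θ_{j} (t)$ and the neighbor average ${\tilde{a}}_{j} (t)$ (which excludes county $j$ because $c_{j j} = 0$) are identical in the numerator and denominator. The shorthand $\mid \cdot$ for the conditioning set is introduced here for brevity.

```latex
\begin{aligned}
E\{Y_j^{\bar{a}(t)}(t) \mid \cdot\}
  &= \exp\{a_j(t)\,\delta_1 + \tilde{a}_j(t)\,\delta_2 + h\}\,\exp\{\theta_j(t)\}, \\
\frac{E\{Y_j^{\bar{a}(t)}(t) \mid \cdot\}}{E\{Y_j^{\bar{a}'(t)}(t) \mid \cdot\}}
  &= \exp\big[\{a_j(t) - a_j'(t)\}\,\delta_1\big] = \exp(\delta_1).
\end{aligned}
```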

Similarly, if we consider two regimes $\bar{a} (t)$ and ${\bar{a}}^{'} (t)$ where all components are the same except for ${\tilde{a}}_{j} (t) = {\tilde{a}}_{j}^{'} (t) + 1$ , and $v_{j} (t) = 0$ , then

\frac{E {Y_{j}^{\bar{a} (t)} (t) ∣ \bar{X} (t), \bar{S} (t - 1), \bar{I} (t - 1)}}{E {Y_{j}^{{\bar{a}}^{'} (t)} (t) ∣ \bar{X} (t), \bar{S} (t - 1), \bar{I} (t - 1)}} = exp (δ_{2})

encodes the risk ratio of new cases by increasing mobility level by 1 unit non-locally (indirect effect).

For the effect from neighbor $k$ , consider two regimes $\bar{a} (t)$ and ${\bar{a}}^{'} (t)$ , where all components are the same, except for $a_{k} (t) = a_{k}^{'} (t) + 1$ , and $v_{j} (t) = 0$ . Then

\frac{E {Y_{j}^{\bar{a} (t)} (t) ∣ \bar{X} (t), \bar{S} (t - 1), \bar{I} (t - 1)}}{E {Y_{j}^{{\bar{a}}^{'} (t)} (t) ∣ \bar{X} (t), \bar{S} (t - 1), \bar{I} (t - 1)}} = exp (δ_{2} c_{j k} / m_{j}),

which encodes the indirect effect of area $k$ on area $j$ .
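A small numeric illustration of this per-neighbor effect follows. The value $δ_{2} = 0.2$ is the true value used later in the simulation study and $m_{j} = 4$ corresponds to an interior cell of a rook-adjacency grid; the function name is hypothetical.

```python
import math

def neighbor_risk_ratio(delta2, c_jk, m_j):
    """Risk ratio in county j from a one-unit mobility increase in county k."""
    return math.exp(delta2 * c_jk / m_j)

delta2, m_j = 0.2, 4
one_neighbor = neighbor_risk_ratio(delta2, 1, m_j)   # adjacent county
non_neighbor = neighbor_risk_ratio(delta2, 0, m_j)   # non-adjacent county
all_neighbors = one_neighbor ** m_j                  # every neighbor raised by one unit
print(round(one_neighbor, 4), non_neighbor, round(all_neighbors, 4))
```

Raising all $m_{j}$ neighbors by one unit raises ${\tilde{a}}_{j} (t)$ by one unit, so the per-neighbor ratios multiply back to $exp (δ_{2})$, while a non-adjacent county has ratio 1 under the interference assumption.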

To link the potential variables and the observed variables and identify the direct and indirect effects, we invoke the following assumptions.

Assumption 1 Interference —

Given the past information through $t$ , the intervention $A (t)$ affects the potential infection rate at time $t$ and region $j$ only through $A_{j} (t)$ and ${\tilde{A}}_{j} (t)$ .

Assumption 2 Causal Consistency —

$Y_{j} (t) = Y_{j}^{\bar{A} (t)} (t), \forall t, j$ .

Assumption 3 Ignorability of Intervention Process Variables —

$A_{j} (t), {\tilde{A}}_{j} (t) ⫫ Y_{j}^{\bar{a} (t)} (t) ∣ {\bar{X} (t), \bar{S} (t - 1), \bar{I} (t - 1)}, \forall t, j$ .

We further define a generalized propensity score for the direct and indirect effects

{\bar{e}}_{j t} {κ_{1}, κ_{2}; \bar{X} (t), \bar{S} (t - 1), \bar{I} (t - 1)} = f {A_{j} (t) = κ_{1}, {\tilde{A}}_{j} (t) = κ_{2} ∣ \bar{X} (t), \bar{S} (t - 1), \bar{I} (t - 1)} .

In practice, we often assume that the direct and indirect interventions are independent, and can write this score as the two separate components

e_{j t} {κ; \bar{X} (t), \bar{S} (t - 1), \bar{I} (t - 1)} = f {A_{j} (t) = κ ∣ \bar{X} (t), \bar{S} (t - 1), \bar{I} (t - 1)},

{\tilde{e}}_{j t} {κ; \bar{X} (t), \bar{S} (t - 1), \bar{I} (t - 1)} = \tilde{f} {{\tilde{A}}_{j} (t) = κ ∣ \bar{X} (t), \bar{S} (t - 1), \bar{I} (t - 1)},

denoted by $e_{j t} {κ; \bar{X} (t), \bar{S} (t - 1), \bar{I} (t - 1)}$ and ${\tilde{e}}_{j t} {κ; \bar{X} (t), \bar{S} (t - 1), \bar{I} (t - 1)}$ , respectively.

Lastly we require a positivity assumption that ensures all regions can take any plausible mobility level. This is formulated based on the propensity score:

Assumption 4 Positivity —

For all $\bar{X} (t)$ , $\bar{S} (t - 1)$ , $\bar{I} (t - 1)$ with $f {\bar{X} (t), \bar{S} (t - 1), \bar{I} (t - 1)} > 0$ , ${\bar{e}}_{j t} {κ_{1}, κ_{2}; \bar{X} (t), \bar{S} (t - 1), \bar{I} (t - 1)} > 0, \forall t, κ_{1}, κ_{2}$ where $f (\cdot)$ denotes the generic probability function; i.e., it is a probability density function for a continuous variable and a probability mass function for a discrete variable.

Under Assumptions 2 and 3, the induced model from (9) for $Y_{j} (t)$ is $Y_{j} (t) \sim Poisson \{λ_{j} (t)\}$ . One can fit the induced model with the observed data to estimate the causal parameters. Because $\bar{X}$ , $\bar{I}$ , and $\bar{S}$ are high-dimensional, fitting the above model directly may become a daunting task. We consider the generalized propensity score approach for dimension reduction. Under Assumption 3, following Imbens’ generalized propensity score approach, we can show that

A_{j} (t), {\tilde{A}}_{j} (t) ⫫ {\bar{X} (t), \bar{S} (t - 1), \bar{I} (t - 1)} ∣ {\bar{e}}_{j t} {κ_{1}, κ_{2}; \bar{X} (t), \bar{S} (t - 1), \bar{I} (t - 1)}, \forall κ_{1}, κ_{2} .

(Balancing)

In practice, we model the two generalized propensity scores using linear regressions of $A_{j} (t)$ and ${\tilde{A}}_{j} (t)$ onto covariates and past interventions and responses (Giffin et al., 2020). Then, their distributions for a given $j$ and $t$ (i.e., their generalized propensity score values) can be completely summarized with a single sufficient statistic: the estimated conditional mean (i.e., the fitted value from the regression). For this reason, henceforth $e_{j} (t)$ and ${\tilde{e}}_{j} (t)$ (without the index for $κ$ ) will refer to these sufficient conditional means, for which the result (Balancing) applies. This shows that the generalized propensity score serves as a balancing score; i.e., at the same level of the propensity score, the distribution of the history of confounding variables is the same across different intervention levels.

Allowing $X$ and $\tilde{X}$ to include propensity score components, the induced model for $Y_{j} (t)$ given the intervention and generalized propensity score is

E {Y_{j} (t) ∣ \bar{A} (t), \bar{e} (t), \bar{Y} (t - 1)} = exp {A_{j} (t) δ_{1} + {\tilde{A}}_{j} (t) δ_{2} + α_{1} + X_{j} {(t)}^{⊤} α_{2} + {\tilde{X}}_{j} {(t)}^{⊤} α_{3}} \cdot exp {θ_{j} (t)} + v_{j} (t) .

The derivation is provided in the Supporting Information. Therefore, we can fit the model $Y_{j} (t) \sim Poisson \{λ_{j}^{e} (t)\}$ , where

λ_{j}^{e} (t) = β_{j}^{e} (t) exp {θ_{j} (t)} + v_{j} (t),

(10)

β_{j}^{e} (t) = exp {A_{j} (t) δ_{1} + {\tilde{A}}_{j} (t) δ_{2} + α_{1} + X_{j} {(t)}^{⊤} α_{2} + {\tilde{X}}_{j} {(t)}^{⊤} α_{3}} .

To model the propensity of $A_{j} (t)$ given the past history, we require some structural assumption to reduce the dimension of the historical variables. We posit a model for $A_{j} (t)$ adjusting for its direct causes $A_{j} (t - 1)$ , $A_{j} (t - 2)$ , $X_{j} (t)$ , $X_{j} (t - 1)$ and $Y_{j} (t - 1)$ to get $e_{j} (t)$ . ${\tilde{A}}_{j} (t)$ can be modeled similarly with the neighbor-averaged variables ${\tilde{A}}_{j} (t - 2)$ , ${\tilde{X}}_{j} (t)$ , ${\tilde{X}}_{j} (t - 1)$ and ${\tilde{Y}}_{j} (t - 1)$ . In practice, simply taking the average over neighbors of $e_{j} (t)$ gives similar estimates for ${\tilde{e}}_{j} (t)$ .
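A least squares version of this propensity model can be sketched as below. It is an illustration under stated assumptions rather than the authors' implementation: a single scalar covariate, one regression pooled over all counties and weeks, a hypothetical ring adjacency for the neighbor average, and synthetic data.

```python
import numpy as np

def propensity_fitted(A, X, Y):
    """Fitted values from regressing A_j(t) on A_j(t-1), A_j(t-2), X_j(t),
    X_j(t-1) and Y_j(t-1). Inputs have shape (J, T); weeks 0 and 1 are NaN."""
    J, T = A.shape
    t = np.arange(2, T)                    # weeks with two lags available
    design = np.column_stack([
        np.ones(J * len(t)),
        A[:, t - 1].ravel(), A[:, t - 2].ravel(),
        X[:, t].ravel(), X[:, t - 1].ravel(),
        Y[:, t - 1].ravel(),
    ])
    coef, *_ = np.linalg.lstsq(design, A[:, t].ravel(), rcond=None)
    e = np.full((J, T), np.nan)
    e[:, 2:] = (design @ coef).reshape(J, len(t))
    return e, coef

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
J, T = 20, 12
X = rng.standard_normal((J, T))
Y = rng.poisson(50, size=(J, T)).astype(float)
A = np.zeros((J, T))
A[:, :2] = rng.standard_normal((J, 2))
for s in range(2, T):
    A[:, s] = (0.5 * A[:, s - 1] + 0.2 * A[:, s - 2] + 0.3 * X[:, s]
               + 0.1 * rng.standard_normal(J))
e, coef = propensity_fitted(A, X, Y)

# Neighbor-averaged score on a hypothetical ring of counties.
C = np.zeros((J, J))
idx = np.arange(J)
C[idx, (idx + 1) % J] = 1.0
C[idx, (idx - 1) % J] = 1.0
e_tilde = (C @ e) / C.sum(axis=1)[:, None]
print(coef.round(2))
```

The first two weeks have no fitted value because the regression needs two lags, which is one reason the outcome model is later fit only from a later starting week.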

4. Simulation study

In this section we conduct a simulation study to evaluate the statistical properties of the causal effects that stem from the approximate SIR models. Data are generated from the full SIR model in Section 3.2 and analyzed using the approximated spatial model in Section 3.3. The objectives are to determine when the approximations give unbiased point estimation and valid interval estimation, and to compare different approximations using these criteria. The code required to run the simulation study is provided on the Harvard Dataverse (Giffin, 2022).

4.1. Methods

We generate data on a 15 × 15 square grid of $J = 225$ regions for $T = 30$ time steps. Rook neighbors are considered adjacent. Each region has population $N_{j} = 100{,}000$ and initial states are generated as $I_{j} (1) = 100 exp \{U_{j}\}$ , $R_{j} (1) = 0$ and $S_{j} (1) = N_{j} - I_{j} (1)$ , where $(U_{1}, \dots, U_{J}) \sim CAR (1, ρ_{s})$ . A confounding variable $X_{j} (t)$ is generated from the STCAR $(1, ρ_{s}, ρ_{t})$ distribution and the intervention variable is generated as $A_{j} (t) = ρ_{x} X_{j} (t) + \sqrt{1 - ρ_{x}^{2}} E_{j} (t)$ where $E_{j} (t)$ is also generated from the STCAR $(1, ρ_{s}, ρ_{t})$ distribution. The latent states for times $t \in \{2, \dots, T\}$ are generated from the full mechanistic model given by (1) and (2) with recovery rate $γ = 0.1$ and infection rate

log {β_{j} (t)} = α_{0} + X_{j} (t) α_{1} + {\tilde{X}}_{j} (t) α_{2} + A_{j} (t) δ_{1} + {\tilde{A}}_{j} (t) δ_{2}

for $α_{0} = - 3$ , $α_{1} = 0.5$ , $α_{2} = 0.3$ , $δ_{1} = 0.5$ and $δ_{2} = 0.2$ . The data are then generated as $Y_{j} (t) \sim Poisson {p λ_{j} (t - l)}$ where the reporting rate is $p = 0.5$ and the lag is $l = 2$ .
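The data-generating process just described can be sketched end to end as follows. This is a reconstruction for illustration, not the code released with the paper: the helper names are hypothetical, the STCAR fields are drawn with a separable Cholesky shortcut, and the `np.minimum` guards (which keep the initial infected below the population and new infections below the remaining susceptibles) are additions made here.

```python
import numpy as np

def rook_adjacency(n):
    """Adjacency for an n x n grid; cells sharing an edge are neighbors."""
    C = np.zeros((n * n, n * n))
    for r in range(n):
        for c in range(n):
            j = r * n + c
            if r + 1 < n:
                C[j, j + n] = C[j + n, j] = 1.0
            if c + 1 < n:
                C[j, j + 1] = C[j + 1, j] = 1.0
    return C

def path_adjacency(T):
    C = np.zeros((T, T))
    i = np.arange(T - 1)
    C[i, i + 1] = C[i + 1, i] = 1.0
    return C

def car_chol(C, rho):
    """Cholesky factor of the CAR covariance (M - rho * C)^{-1}."""
    M = np.diag(C.sum(axis=1))
    return np.linalg.cholesky(np.linalg.inv(M - rho * C))

def simulate(n=15, T=30, rho_s=0.9, rho_t=0.5, rho_x=0.5, phi=0.4,
             gamma=0.1, p=0.5, lag=2, seed=0):
    rng = np.random.default_rng(seed)
    C = rook_adjacency(n)
    J, m = n * n, C.sum(axis=1)
    Ls, Lt = car_chol(C, rho_s), car_chol(path_adjacency(T), rho_t)
    # Ls Z Lt' has covariance (temporal) kron (spatial); result has shape (J, T).
    stcar = lambda: Ls @ rng.standard_normal((J, T)) @ Lt.T
    nbr = lambda Z: (C @ Z) / m[:, None]   # neighbor averages
    X, E = stcar(), stcar()
    A = rho_x * X + np.sqrt(1.0 - rho_x**2) * E
    log_beta = -3.0 + 0.5 * X + 0.3 * nbr(X) + 0.5 * A + 0.2 * nbr(A)
    W = (1.0 - phi) * np.eye(J) + phi * C / m[:, None]
    N = 100_000.0
    I = np.minimum(100.0 * np.exp(Ls @ rng.standard_normal(J)), N)  # guard (added)
    S, R = N - I, np.zeros(J)
    lam = np.zeros((J, T))
    for t in range(T):
        rate = np.exp(log_beta[:, t]) * S / N * (W @ I)
        lam[:, t] = np.minimum(rate, S)    # guard (added): cannot infect more than S
        S, I, R = S - lam[:, t], I + lam[:, t] - gamma * I, R + gamma * I
    Y = np.full((J, T), np.nan)
    Y[:, lag:] = rng.poisson(p * lam[:, :T - lag])
    return {"A": A, "X": X, "Y": Y, "lam": lam, "S": S, "I": I, "R": R, "N": N}

sim = simulate()                           # base scenario
print(sim["Y"].shape, np.nanmax(sim["Y"]))
```

Changing `rho_s`, `rho_t`, `rho_x` or `phi` in the call gives the other simulation scenarios; the first `lag` weeks of `Y` are NaN because no lagged infection rate exists for them.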

We compare six simulation scenarios defined by the spatial ( $ρ_{s}$ ) and temporal ( $ρ_{t}$ ) dependence of the intervention variable $A_{j} (t)$ , the strength of confounding $(ρ_{x})$ and the strength of spatial dependence ( $ϕ$ ) in the SIR model:

1. Base model: $ρ_{s} = 0.9$ , $ρ_{t} = 0.5$ , $ρ_{x} = 0.5$ and $ϕ = 0.4$
2. Strong spatial dependence in $A_{j} (t)$ : $ρ_{s} = 0.99$ , $ρ_{t} = 0.5$ , $ρ_{x} = 0.5$ and $ϕ = 0.4$
3. Strong temporal dependence in $A_{j} (t)$ : $ρ_{s} = 0.3$ , $ρ_{t} = 0.9$ , $ρ_{x} = 0.5$ and $ϕ = 0.4$
4. Strong spatiotemporal dependence in $A_{j} (t)$ : $ρ_{s} = 0.9$ , $ρ_{t} = 0.9$ , $ρ_{x} = 0.5$ and $ϕ = 0.4$
5. Strong confounding: $ρ_{s} = 0.9$ , $ρ_{t} = 0.5$ , $ρ_{x} = 0.9$ and $ϕ = 0.4$
6. Weak SIR spatial dependence: $ρ_{s} = 0.9$ , $ρ_{t} = 0.5$ , $ρ_{x} = 0.5$ and $ϕ = 0.2$

For each scenario we generate 100 datasets.

The propensity score $e_{j} (t)$ is the fitted value from a least squares regression of $A_{j} (t)$ onto $A_{j} (t - 1)$ , $A_{j} (t - 2)$ , $X_{j} (t)$ , $X_{j} (t - 1)$ and $Y_{j} (t - 1)$ . We then model $\log \{β_{j} (t)\}$ as a linear combination of $A_{j} (t)$ , ${\tilde{A}}_{j} (t)$ , $X_{j} (t)$ , ${\tilde{X}}_{j} (t)$ , $e_{j} (t)$ , ${\tilde{e}}_{j} (t)$ , $e_{j} {(t)}^{2}$ , ${\tilde{e}}_{j} {(t)}^{2}$ and $e_{j} (t) {\tilde{e}}_{j} (t)$ . For each simulated dataset we fit the full model in (7) with the propensity scores included as covariates (“Full”), and compare this model to the full model but with $v_{j} (t) = 0$ (“No nugget”), the full model but without the propensity scores, $e_{j} (t) = {\tilde{e}}_{j} (t) = 0$ (“No PS”), and the full model but without spatial dependence, $ρ_{s} = 0$ (“Non-spatial”). For the non-spatial model we also set $v_{j} (t) = 0$ because the MCMC algorithm did not converge with these terms included. The models are fit using responses from time points $t \in \{5, \dots, n_{t}\}$ to accommodate lagged predictors. We fit each model using MCMC with 10,000 iterations, discarding the first 2,000 samples as burn-in, to approximate the posterior median and 90% interval of the direct $(δ_{1})$ and indirect $(δ_{2})$ effects.
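The propensity-score step described above can be sketched with ordinary least squares in NumPy. This is an illustration only, not the released code: the function names are hypothetical, the inclusion of an intercept is an assumption (the text does not say whether one is used), and the synthetic inputs are placeholders for the simulated $A$ , $X$ and $Y$ arrays.

```python
import numpy as np

def fit_propensity(a, x, y):
    """Least-squares fitted values of A_j(t) given A_j(t-1), A_j(t-2), X_j(t), X_j(t-1), Y_j(t-1).

    a, x, y have shape (J, T). The result has shape (J, T - 2) and corresponds
    to the 0-based time indices 2, ..., T - 1, the first times with two lags available.
    """
    target = a[:, 2:].ravel()
    design = np.column_stack([
        np.ones(target.size),   # intercept (assumption, see lead-in)
        a[:, 1:-1].ravel(),     # A_j(t-1)
        a[:, :-2].ravel(),      # A_j(t-2)
        x[:, 2:].ravel(),       # X_j(t)
        x[:, 1:-1].ravel(),     # X_j(t-1)
        y[:, 1:-1].ravel(),     # Y_j(t-1)
    ])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return (design @ coef).reshape(a.shape[0], -1)

def outcome_ps_terms(e, e_tilde):
    """Propensity-score terms entering log beta_j(t): e, e~, e^2, e~^2 and e * e~."""
    return np.stack([e, e_tilde, e**2, e_tilde**2, e * e_tilde], axis=-1)

rng = np.random.default_rng(0)
J, T = 225, 30
x = rng.standard_normal((J, T))
a = 0.5 * x + rng.standard_normal((J, T))
y = rng.poisson(20.0, size=(J, T)).astype(float)
e = fit_propensity(a, x, y)
print(e.shape)  # (225, 28)
```

Pooling all regions and time points into one regression mirrors the single least squares fit described in the text; a region-specific fit would require far more data per region.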

4.2. Results

The results are given in Table 1. Broadly speaking the Full model and the No Nugget model perform well. Giving similar precision and coverage results under all scenarios, they generally outperform the other models in terms of coverage and bias. Moreover, their coverage rates remain reasonably close to their 90% target, although they appear to have a small, positive bias. The Non-spatial model performs worse than the Full and No Nugget models in both precision and coverage, but it remains competitive. The No PS model performs the worst, with extremely poor performance for the strong temporal/spatiotemporal dependence scenarios. The four models seem to have the most difficulty estimating effects when there is strong temporal/spatiotemporal dependence in $A_{j} (t)$ or strong confounding. All models appear to estimate the direct effects better than the indirect effects. Fig. 1 provides the sampling distribution and power of the direct and indirect effect estimators. These plots show that for the simulation study the estimates have the correct sign with probability near one and have reasonable power in most cases. Therefore, in these synthetic cases the algorithm performs well.
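The summaries reported in Table 1 and Fig. 1 (bias, interval coverage and the share of intervals excluding zero) can be computed from per-dataset posterior summaries roughly as follows. This is a sketch under assumptions: the function name is hypothetical, and the Monte Carlo standard-error formulas (sample standard deviation over $\sqrt{n}$ for bias, binomial for coverage) are standard choices rather than formulas confirmed by the text.

```python
import numpy as np

def summarize_replicates(est, lower, upper, truth):
    """Bias (x100), interval coverage (%) and power (%) across simulated datasets."""
    est, lower, upper = map(np.asarray, (est, lower, upper))
    n = est.size
    err = est - truth
    covered = (lower <= truth) & (truth <= upper)
    excludes_zero = (lower > 0) | (upper < 0)
    cov = covered.mean()
    return {
        "bias_x100": 100 * err.mean(),
        "bias_se_x100": 100 * err.std(ddof=1) / np.sqrt(n),
        "coverage_pct": 100 * cov,
        "coverage_se_pct": 100 * np.sqrt(cov * (1 - cov) / n),  # binomial standard error
        "power_pct": 100 * excludes_zero.mean(),
    }

rng = np.random.default_rng(1)
truth = 0.5                                      # true direct effect in Section 4.1
est = truth + 0.03 * rng.standard_normal(100)    # hypothetical posterior medians
half_width = 1.645 * 0.03                        # hypothetical 90% interval half-width
print(summarize_replicates(est, est - half_width, est + half_width, truth))
```

With 100 datasets and coverage near 90%, the binomial standard error is about 3 percentage points, so differences of a few points between methods in Table 1 are within Monte Carlo noise.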

Table 1.

Spatiotemporal model simulation study results. The six data-generating scenarios are given in Section 4.1. The two recommended models, the full model (“Full”) and the model without a nugget term (“No nugget”), are given in bold and are compared to the model without the generalized propensity score (“No PS”) and the model without spatial dependence (“Non-spatial”) in terms of bias and empirical coverage of 90% posterior intervals, separately for the direct ( $δ_{1}$ ) and indirect $(δ_{2})$ effects. Bias is multiplied by 100 and standard errors are given in subscripts.

Scenario	Method	Direct effect		Indirect effect
		Bias	90% Cov.	Bias	90% Cov.
1. Base Model	Full	0.25_0.25	90₃	0.36_0.45	93₃
	No nugget	0.12_0.24	94₂	0.36_0.46	94₂
	No PS	2.18_0.25	74₄	2.19_0.53	79₄
	Non-spatial	0.52_0.25	96₂	−0.41_0.64	80₄

2. Strong spatial dependence in $A_{j} (t)$	Full	0.79_0.24	92₃	0.25_0.44	91₃
	No nugget	0.60_0.23	91₃	0.09_0.43	92₃
	No PS	2.87_0.23	65₅	1.64_0.54	78₄
	Non-spatial	1.59_0.26	99₁	−1.02_0.71	73₄

3. Strong temporal dependence in $A_{j} (t)$	Full	1.16_0.47	95₂	0.62_0.96	93₃
	No nugget	1.06_0.47	94₂	0.73_0.94	93₃
	No PS	12.71_0.45	2₁	14.28_1.01	20₄
	Non-spatial	1.78_0.53	97₂	−0.06_1.29	83₄

4. Strong spatiotemporal dependence in $A_{j} (t)$	Full	2.15_0.53	86₄	3.63_1.04	86₃
	No nugget	2.18_0.53	88₃	3.43_1.02	86₃
	No PS	13.59_0.51	4₂	17.37_1.26	22₄
	Non-spatial	4.44_0.62	94₂	2.15_1.63	72₅

5. Strong confounding	Full	0.97_0.46	92₃	−0.86_1.01	91₃
	No nugget	0.77_0.46	94₂	−0.90_0.98	92₃
	No PS	2.99_0.50	78₄	1.44_1.09	82₄
	Non-spatial	1.33_0.49	95₂	−0.31_1.37	78₄

6. Weak SIR spatial dependence	Full	0.28_0.25	92₃	0.67_0.48	96₂
	No nugget	0.22_0.25	93₃	0.55_0.48	96₂
	No PS	2.53_0.26	74₄	1.91_0.56	82₄
	Non-spatial	0.64_0.27	97₂	−0.30_0.67	80₄


Fig. 1 — **Simulation results for the full model**. The boxplots represent the sampling distribution of the posterior estimator of the direct ( $δ_{1}$ ) and indirect ( $δ_{2}$ ) effects for the full model. The horizontal lines are the true values and the values above the boxplots are the percentage of datasets for which the posterior 90% interval excludes zero.

Table 2 compares the model fit in terms of the log mean response $l_{j} (t) = \log [E \{Y_{j} (t)\}]$ , by giving the correlation (across space and time) between the true and estimated $l_{j} (t)$ and the coverage of credible intervals. The Full model and the models without nugget or propensity score fit equally well, while the non-spatial model shows some degradation in fit. However, as expected, the differences in fit are much less dramatic than the differences in bias and coverage for the direct and indirect effects $δ_{1}$ and $δ_{2}$ .

Table 2.

Simulation study results for model fit. The data-generating scenarios are given in Section 4.1. We compare the full model (“Full”), the model without a nugget term (“No nugget”), the model without the generalized propensity score (“No PS”) and the model without spatial dependence (“Non-spatial”). Estimation of the log mean $l_{j} (t) = \log [E \{Y_{j} (t)\}]$ is measured using the correlation between the true and estimated $l_{j} (t)$ and empirical coverage of 90% intervals. All values are multiplied by 100 and standard errors are given in subscripts.

Scenario	Correlation				Coverage
	Full	No nugget	No PS	Non-spatial	Full	No nugget	No PS	Non-spatial
1	93.5_0.1	93.5_0.1	93.5_0.1	88.4_0.1	94.4_0.1	94.4_0.1	94.4_0.1	93.0_0.1
2	96.4_0.1	96.4_0.1	96.4_0.1	92.9_0.2	94.5_0.1	94.5_0.1	94.5_0.1	93.0_0.2
3	94.9_0.1	95.0_0.1	94.9_0.1	91.0_0.2	94.5_0.1	94.5_0.1	94.5_0.1	92.7_0.2
4	96.9_0.1	97.0_0.1	96.9_0.1	93.8_0.3	93.9_0.1	94.2_0.1	94.1_0.1	92.5_0.3
5	94.5_0.1	94.5_0.1	94.5_0.1	90.4_0.1	94.3_0.1	94.3_0.1	94.3_0.1	93.0_0.1
6	92.0_0.1	91.9_0.1	92.0_0.1	87.9_0.1	93.5_0.1	93.5_0.1	93.5_0.1	93.1_0.1


5. Estimating the causal effect of community mobility reduction on Coronavirus spread

We implement four models on the real data: a Full model, a No Nugget model, a No PS model, and a Non-spatial model. The models are fit using responses from time points $t \in \{8, \dots, 31\}$ (April 24, 2020 through October 8, 2020) to accommodate lagged predictors. The direct propensity score is estimated with a least squares regression of $A_{j} (t)$ onto the previous intervention $A_{j} (t - 1)$ , the local covariates $X_{j} (t)$ and $X_{j} (t - 1)$ , the number of weeks since the first case in county $j$ (entering both linearly and quadratically), time $t$ (entering both linearly and quadratically), a time and intervention interaction $t \cdot A_{j} (t - 1)$ , the baseline level of mobility $A_{j} (1)$ , and $\log \{Y_{j} (t - 1)\} - \log (N_{j})$ . The propensity score $e_{j} (t)$ is set to the fitted values of the resulting model. The propensity of the spillover intervention is estimated as the local average of its neighboring direct scores: ${\tilde{e}}_{j} (t) = \sum_{k \sim j} e_{k} (t) / m_{j}$ . The direct and indirect propensity scores are added to $X$ and $\tilde{X}$ , respectively, as in (8), along with $l$ -lagged interventions and covariates. Each model is implemented with lags of $l = 0, \dots, 7$ . The No Nugget, No PS, and Non-spatial models were run for 100,000 MCMC iterations, with a burn-in tuning period of 20,000 iterations. The Full model required more iterations to converge, so these values were increased to 200,000 and 40,000, respectively. The 200,000-iteration Full model took approximately 30 hours to run using 4 cores of a Dell R7425 Dual Processor AMD Epyc 2.2 GHz cluster, with 512 GB RAM and running 64-bit Ubuntu Linux version 18.04. Finally, the code and data required to run the primary analysis are provided on the Harvard Dataverse (Giffin, 2022).
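The neighbor average ${\tilde{e}}_{j} (t) = \sum_{k \sim j} e_{k} (t) / m_{j}$ is a row-normalized adjacency product. The sketch below illustrates it on a small rook-adjacency grid like the one in Section 4.1; for the county analysis the adjacency matrix would instead come from county boundaries. Function names are hypothetical, and the code assumes every region has at least one neighbor.

```python
import numpy as np

def rook_adjacency(n_rows, n_cols):
    """Binary adjacency matrix for a regular grid; cells sharing an edge are neighbors."""
    idx = np.arange(n_rows * n_cols).reshape(n_rows, n_cols)
    w = np.zeros((idx.size, idx.size), dtype=int)
    w[idx[:, :-1].ravel(), idx[:, 1:].ravel()] = 1   # east-west pairs
    w[idx[:-1, :].ravel(), idx[1:, :].ravel()] = 1   # north-south pairs
    return w + w.T                                   # symmetrize

def neighbor_average(values, w):
    """Average of `values` (shape (J, T)) over each region's neighbors, per time point."""
    m = w.sum(axis=1)                  # m_j, the number of neighbors of region j
    return (w @ values) / m[:, None]   # assumes m_j > 0 for every region

w = rook_adjacency(2, 2)
e = np.array([[1.0], [2.0], [3.0], [4.0]])   # toy direct scores e_j(t) for one time point
print(neighbor_average(e, w).ravel())         # [2.5 2.5 2.5 2.5]
```

The same function applies to any symmetric binary adjacency matrix, so it can also produce ${\tilde{A}}_{j} (t)$ and ${\tilde{X}}_{j} (t)$ from their local counterparts.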

5.1. Effect estimation

Table 3 gives the estimated posteriors for the direct and spillover effects for the four different models across lags $l = 0, \dots, 7$ , and Fig. 2 shows these results graphically for the Full model. Rather than present the estimated values for $δ_{1}$ and $δ_{2}$ , the estimates shown are $100 \cdot \{\exp (50 \cdot δ_{i}) - 1\}$ , $i = 1, 2$ , and correspond to the expected percentage increase in cases from an increase in mobility 50% above baseline. Because it is not clear that the observed data can identify the correct lag, we provide model results for all lags, and speculate that lags of 3–5 weeks are most appropriate to see effects of changes in mobility. The Centers for Disease Control and Prevention guidelines suggest that COVID symptoms appear 2–14 days after virus exposure (Centers for Disease Control and Prevention, 2020). Given this, 3–5 weeks would allow the virus to spread through several generations of people, which would be enough to see changes in county-level case counts.
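As a quick check of this transformation, the sketch below converts a coefficient on the $δ$ scale to the percentage scale of Table 3 and back. The function names are hypothetical; the input 0.004 is the rounded coefficient for $A_{j} (t)$ reported in Table 4 for the No Nugget, lag 5 model, and 22.7 is the corresponding direct-effect entry in Table 3.

```python
import math

def pct_change_in_cases(delta, mobility_shift=50.0):
    """Expected percentage change in cases for a given shift in mobility above baseline."""
    return 100.0 * (math.exp(mobility_shift * delta) - 1.0)

def delta_from_pct(pct, mobility_shift=50.0):
    """Inverse map: recover delta from a reported percentage change."""
    return math.log1p(pct / 100.0) / mobility_shift

print(round(pct_change_in_cases(0.004), 1))   # 22.1
print(round(delta_from_pct(22.7), 4))         # 0.0041
```

The two tables agree up to the rounding of the coefficient: 0.004 maps to about 22.1%, and the reported 22.7% maps back to about 0.0041.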

Table 3.

Posterior estimates for direct and indirect treatment effects for lags 0–7 weeks for all models. Estimates correspond to the expected percentage increase in cases from an increase in mobility 50% above baseline ( $100 \cdot \{\exp (50 \cdot δ) - 1\}$ ). 95% credible intervals are shown in parentheses, and estimates significant at the 95% level are denoted with $*$ .

Model	Lag (weeks)	Direct effect		Indirect effect
Full	0	2.0	(−9.0, 14.1)	−12.9	(−35.1, 17.0)
	1	1.0	(−9.5, 12.6)	−28.0	(−45.7, −4.6)*
	2	4.9	(−5.8, 16.8)	−9.4	(−31.8, 22.1)
	3	3.5	(−7.8, 16.4)	−9.6	(−32.5, 20.0)
	4	15.7	(2.8, 29.9)*	−7.6	(−33.1, 28.3)
	5	24.8	(11.0, 40.2)*	0.3	(−25.3, 34.6)
	6	25.7	(12.3, 40.7)*	−9.9	(−33.2, 21.0)
	7	24.3	(11.0, 39.1)*	24.2	(−7.5, 66.5)

No Nugget	0	−0.8	(−11.6, 11.7)	−21.7	(−41.1, 4.1)
	1	−1.5	(−11.8, 9.8)	−28.5	(−46.4, −4.6)*
	2	3.6	(−7.2, 15.6)	−15.4	(−35.7, 12.1)
	3	0.3	(−10.4, 12.3)	−19.3	(−38.5, 6.2)
	4	12.3	(−0.3, 28.0)	−13.7	(−35.5, 15.4)
	5	22.7	(9.1, 37.8)*	3.1	(−22.8, 37.7)
	6	22.2	(8.7, 37.2)*	−18.0	(−38.8, 11.4)
	7	22.8	(9.2, 37.7)*	2.3	(−22.9, 35.7)

No PS	0	−2.7	(−8.0, 2.8)	2.9	(−9.4, 17.1)
	1	2.1	(−3.1, 7.7)	8.7	(−5.0, 24.1)
	2	6.6	(0.9, 12.9)*	20.1	(5.1, 39.2)*
	3	11.9	(5.6, 18.7)*	32.5	(12.3, 57.3)*
	4	15.6	(9.2, 22.3)*	32.7	(15.1, 59.5)*
	5	15.3	(9.6, 21.6)*	27.6	(10.4, 48.4)*
	6	12.4	(6.7, 18.4)*	20.3	(5.6, 37.4)*
	7	10.8	(5.2, 16.7)*	21.0	(6.3, 37.2)*

Non-spatial	0	−2.6	(−17.0, 13.8)	7.7	(−12.5, 32.0)
	1	−5.9	(−19.8, 10.1)	−46.4	(−57.1, −34.5)*
	2	−0.9	(−14.9, 15.5)	−25.7	(−39.3, −9.4)*
	3	−4.3	(−17.7, 11.5)	−37.2	(−48.3, −23.5)*
	4	9.8	(−6.1, 28.8)	−21.6	(−35.2, −4.2)*
	5	25.1	(7.3, 45.5)*	35.8	(13.3, 62.4)*
	6	19.3	(3.0, 38.0)*	−10.5	(−24.3, 6.2)
	7	24.2	(6.8, 44.5)*	6.1	(−11.1, 26.5)


The key trend evident in these results is that the direct effect appears to be positive and significantly different from zero for higher lags. The Full model gives positive significant estimates on $δ_{1}$ for lags 4–7 weeks; the No Nugget for lags 5–7 weeks; the No PS model for lags 2–7 weeks; and the Non-spatial model for lags 5–7. The size of this effect differs between models/lags, but the Full model for lags 5–7 predicts that a mobility level 50 percent above baseline will increase Coronavirus cases by between 11 and 41 percent. This trend is demonstrated in the grouping of posteriors above zero in the top sub-figure of Fig. 2. This provides evidence that a decrease in local mobility should decrease local Coronavirus cases roughly 4–7 weeks after the change in mobility.

The indirect effect estimates are more variable with wider credible intervals, as illustrated in Fig. 2. For the Full model, a lag of 7 gives a positive but non-significant estimate for the indirect effect, although the indirect effect posteriors for other lags seem centered at zero, with the exception of lag 1 which is significant and negative. For lag 1 the No Nugget model also shows this counter-intuitive result. The No PS model shows a number of positive and significant indirect effects, and the Non-spatial model paradoxically shows several significant indirect effects — some positive and some negative. It is not entirely clear why there is more variability in the indirect effects or why we see these counter-intuitive results. Some of these could simply be spurious significance resulting from estimating many lags. Another possibility is that our measure of connectivity does not precisely measure the levels of contact between different counties. This could explain the wide intervals, as the connectivity is over-estimated between some counties and under-estimated between others. In any event, the indirect effect story appears less clear cut than that of the direct effect.

Table 4 gives parameter estimates from the No Nugget/Lag 5 model. In addition to the intervention effects and the propensity score adjustment, the covariate coefficients tell an interesting story about their association with Coronavirus case counts. For example, expected Coronavirus case counts decrease with median age, population density, county population, and relative humidity. Conversely, Coronavirus case counts tend to increase with the poverty rate, PM$_{2.5}$, number of hospital beds, number of ICU beds, and proportion of healthcare-employed residents.

Table 4.

Posterior estimates for coefficients in No Nugget, lag 5 model. 95% credible intervals are shown in parentheses, and estimates significant at the 95% level are denoted with $*$ . Coefficients are not transformed as in Table 3.

Variable	Estimate (95% interval)
Intercept	−7.43 (−7.68, −7.18)*
$A_{j} (t)$	0.004 (0.002, 0.006)*
${\tilde{A}}_{j} (t)$	0.0003 (−0.0055, 0.006)
$e_{j} (t)$	−0.0011 (0.0007, 0.0054)*
${\tilde{e}}_{j} (t)$	0.006 (0.001, 0.013)*
Mean temperature	−0.008 (−0.016, 0.000)*
Relative Humidity	−0.009 (−0.011, −0.006)*
Dew point	0.045 (0.033, 0.057)*
PM$_{2.5}$	0.082 (0.066, 0.097)*
Poverty rate	1.07 (0.88, 1.26)*
Population density	−0.035 (−0.043, −0.026)*
Median household income (thousands)	−0.002 (−0.003, −0.001)*
Population (millions)	−0.048 (−0.066, −0.030)*
# hospital beds	5.95 (3.64, 8.29)*
# ICU beds	24.01 (6.99, 40.94)*
Median age	−0.021 (−0.023, −0.018)*
Proportion of foreign-born residents	4.16 (3.97, 4.35)*
Proportion of health-employed	0.82 (0.59, 1.04)*
$ρ_{s}$	0.9949 (0.9945, 0.9950)*
$ρ_{t}$	0.452 (0.441, 0.463)*


6. Discussion

This analysis uses a spatiotemporal model to provide insights into how mobility affects the spread of Coronavirus. The analysis uses a propensity score framework to obtain causal estimates of this effect, while allowing for potential interference between nearby counties. We show via simulation study that the causal estimates from our computationally efficient approximate model have good statistical properties for data generated from a mechanistic process. In particular, the data suggest that decreases in mobility appear to cause a statistically significant decrease in the number of local Coronavirus cases after roughly 5 weeks. This is noteworthy, as such a long lag might be overlooked in a naive analysis of the relationship between mobility and Coronavirus.

The core of our contribution is a novel method that allows for the recovery of causal treatment effect estimates from an SIR model, a class of models which is primarily used for prediction and forecasting rather than causal inference. This theoretical gap is bridged using the potential outcomes framework and propensity scores. Because many infectious disease models are predictive in nature, this provides an important new inferential tool for researchers.

There are several noteworthy limitations of this analysis. The intervention data are somewhat crude measures of mobility. Taken from a small minority of smartphone users, they provide a limited view into the distribution of mobility among residents for a given county and week. Moreover, the five relevant measures of mobility provided by Google (which we average over) may have different intervention effects. We also do not have data on other interventions to reduce the spread of Coronavirus, such as wearing masks, state-mandated closures of different types of businesses, social distancing, washing hands, etc. These are all plausibly correlated with mobility, so it is possible that the estimated intervention effect of mobility incorporates the causal effect of these measures too. The measure of connectivity that we use, adjacency between counties, is also somewhat crude. For example, while the first viral hotspot in the United States is thought to have been Seattle, the virus quickly traveled to New York by air. A more comprehensive measure of connectivity and travel between regions may be helpful to incorporate into the model. The response data also have limitations. Because the availability of Coronavirus tests in the United States has varied over time and space, the number of new positive test results in a particular county/week might partly reflect the availability of tests. One solution to this would be to use COVID-19-related hospitalizations or deaths as a response. However, these are necessarily less common, providing fewer response data, and they would add to the already substantial lag between the moment of infection and positive test. In addition, as we make the “no unmeasured confounders” assumption, it is possible that there are key confounders that have not been included. Because of this, we include a number of possible confounders, but more can always be done in this regard. Lastly, correlation between variables was not a major issue, as shown in Figure B.1.

As the pandemic continues, many of these data issues are being addressed: surveys are now being done to ascertain levels of mask-usage and other social distancing practices. Publicly available data on the availability of Coronavirus tests by location are emerging. Further, as the number of cases continues to grow, using hospitalizations or deaths as a response becomes more feasible.

Disclaimer

The views expressed in this manuscript are those of the individual authors and do not necessarily reflect the views and policies of the U.S. Environmental Protection Agency. Mention of trade names or commercial products does not constitute endorsement or recommendation for use.

Acknowledgments

The authors thank Sukanya Bhattacharyya, Can Cui, Matt Miller, Parker Trostle, Laura Wendelberger, Stephen Xu and Zun Yin (North Carolina State University), Yawen Guan (University of Nebraska) and Pulong Ma (Duke University) for help compiling the data.

Funding

This work was partially supported by NIH, United States of America grant R01ES031651-01.

Footnotes

^{Appendix A}

Supplementary material related to this article can be found online at https://doi.org/10.1016/j.spasta.2022.100711, which contains a derivation of the conditional expectation of $Y_{j} (t)$ , as well as the correlation among predictors.


References

Alexeeff S.E., et al. Consequences of kriging and land use regression for PM2.5 predictions in epidemiologic analyses: insights into spatial variability using high-resolution satellite data. J. Expo. Sci. Environ. Epidemiol. 2015;25(2):138–144. doi: 10.1038/jes.2014.40.
Arino J., Van den Driessche P. A multi-city epidemic model. Math. Popul. Stud. 2003;10(3):175–193.
Berres S., Ruiz-Baier R. A fully adaptive numerical approximation for a two-dimensional epidemic model with nonlinear cross-diffusion. Nonlinear Anal. Real World Appl. 2011;12(5):2888–2903.
Bradley J.R. 2020. Joint spatio-temporal analysis of multiple response types using the hierarchical generalized transformation model with application to coronavirus disease 2019 and social distancing. arXiv Preprint (Stat.ME:2002.09983).
Buckingham-Jeffery E., Isham V., House T. Gaussian process approximations for fast inference from infectious disease data. Math. Biosci. 2018;301:111–120. doi: 10.1016/j.mbs.2018.02.003.
Burger R., et al. Modelling the spatial-temporal evolution of the 2009 A/H1N1 influenza pandemic in Chile. Math. Biosci. Eng. 2009;13:1–17. doi: 10.3934/mbe.2016.13.43.
Capasso V., Di Liddo A. Asymptotic behaviour of reaction-diffusion systems in population and epidemic models. J. Math. Biol. 1994;32(5):453–463. doi: 10.1007/BF00160168.
Carroll R., Prentice C.R. Using spatial and temporal modeling to visualize the effects of US state issued stay at home orders on COVID-19. Sci. Rep. 2021;11(1):1–7. doi: 10.1038/s41598-021-93433-z.
Centers for Disease Control and Prevention. 2020. Symptoms of Coronavirus. https://www.cdc.gov/coronavirus/2019-ncov/symptoms-testing/symptoms.html.
Chang S., et al. Mobility network models of COVID-19 explain inequities and inform reopening. Nature. 2021;(589):82–87. doi: 10.1038/s41586-020-2923-3.
Chen H., He J., Song W., Wang L., Wang J., Chen Y. Modeling and interpreting the COVID-19 intervention strategy of China: A human mobility view. PLoS One. 2020;15(11). doi: 10.1371/journal.pone.0242761.
Chinviriyasit S., Chinviriyasit W. Numerical modelling of an SIR epidemic model with diffusion. Appl. Math. Comput. 2010;216(2):395–409.
Chudnovsky A.A., et al. Prediction of daily fine particulate matter concentrations using aerosol optical depth retrievals from the Geostationary Operational Environmental Satellite (GOES). J. Air Waste Manage. Assoc. 2012;62(9):1022–1031. doi: 10.1080/10962247.2012.695321.
Cowling B.J., et al. Impact assessment of non-pharmaceutical interventions against coronavirus disease 2019 and influenza in Hong Kong: an observational study. Lancet Public Health. 2020;5(5):279–288. doi: 10.1016/S2468-2667(20)30090-6.
Dandekar R., Barbastathis G. Cold Spring Harbor Laboratory Press; 2020. Quantifying the Effect of Quarantine Control in Covid-19 Infectious Spread Using Machine Learning. MedRxiv.
Dehning J., et al. 2020. Inferring COVID-19 spreading rates and potential change points for case number forecasts. MedRxiv.
Dong E., Du H., Gardner L. An interactive web-based dashboard to track COVID-19 in real time. Lancet Infect. Dis. 2020;20(5):533–534. doi: 10.1016/S1473-3099(20)30120-1.
ESRI. 2020. Definitive Healthcare: USA Hospital Beds. https://coronavirus-resources.esri.com/datasets/definitivehc::definitive-healthcare-usa-hospital-beds.
Giffin A. Harvard Dataverse; 2022. Data and Code for Estimating Intervention Effects on Infectious Disease Control: the Effect of Community Mobility Reduction on Coronavirus Spread.
Giffin A., et al. 2020. Generalized propensity score approach to causal inference with spatial interference. arXiv Preprint: 2007.00106.
Google LLC. 2020. Google COVID-19 Community Mobility Reports. https://www.google.com/covid19/mobility/. Accessed October 2020.
Halloran M.E., Struchiner C.J. Study designs for dependent happenings. Epidemiology. 1991;2(5):331–338. doi: 10.1097/00001648-199109000-00004.
Hilker F.M., et al. A diffusive SI model with Allee effect and application to FIV. Math. Biosci. 2007;206(1):61–80. doi: 10.1016/j.mbs.2005.10.003.
Hyman J.M., LaForce T. Bioterrorism: Mathematical Modeling Applications in Homeland Security. SIAM; 2003. Modeling the spread of influenza among cities; pp. 211–236.
Imbens G.W. The role of the propensity score in estimating dose-response functions. Biometrika. 2000;87(3):706–710.
Kermack W.O., McKendrick A.G. A contribution to the mathematical theory of epidemics. Proc. R. Soc. Lond. Ser. A Contain. Pap. Math. Phys. Charact. 1927;115(772):700–721.
Kounchev O., Simeonov G., Kuncheva Z. 2020. The TVBG-SEIR spline model for analysis of COVID-19 spread, and a Tool for prediction scenarios. arXiv preprint arXiv:2004.11338.
Lee S., Castillo-Chavez C. The role of residence times in two-patch dengue transmission dynamics and optimal strategies. J. Theor. Biol. 2015;374:152–164. doi: 10.1016/j.jtbi.2015.03.005.
Lee J., Jung E. A spatial-temporal transmission model and early intervention policies of 2009 A/H1N1 influenza in South Korea. J. Theor. Biol. 2015;380:60–73. doi: 10.1016/j.jtbi.2015.05.008.
Lee W.D., Qian M., Schwanen T. The association between socioeconomic status and mobility reductions in the early stage of England’s COVID-19 epidemic. Health Place. 2021;69:102563. doi: 10.1016/j.healthplace.2021.102563.
Lee J.M., et al. The effect of public health interventions on the spread of influenza among cities. J. Theor. Biol. 2012;293:131–142. doi: 10.1016/j.jtbi.2011.10.008.
Li Q., et al. Early Transmission Dynamics in Wuhan, China, of Novel Coronavirus – Infected Pneumonia. N. Engl. J. Med. 2020;382(13):1199–1207. doi: 10.1056/NEJMoa2001316. PMID: 31995857.
Livingston E., Bucher K. Coronavirus Disease 2019 (COVID-19) in Italy. J. Am. Med. Assoc. 2020;323(14):1335. doi: 10.1001/jama.2020.4344.
Lyu W., Wehby G.L. Community use of face masks and COVID-19: Evidence from A natural experiment of state mandates in the US: Study examines impact on COVID-19 growth rates associated with state government mandates requiring face mask use in public. Health Affairs. 2020;39(8):1419–1425. doi: 10.1377/hlthaff.2020.00818.
Lyu H., et al. 2020. COVID-19 Time-series Prediction by Joint Dictionary Learning and Online NMF. arXiv preprint arXiv:2004.09112.
Magdon-Ismail M. Cold Spring Harbor Laboratory Press; 2020. Machine Learning the Phenomenology of COVID-19 From Early Infection Dynamics. MedRxiv.
Mbuvha R., Marwala T. Cold Spring Harbor Laboratory Press; 2020. Bayesian Inference of COVID-19 Spreading Rates in South Africa. MedRxiv.
Milner F.A., Zhao R. SIR model with directed spatial diffusion. Math. Popul. Stud. 2008;15(3):160–181.
Nouvellet P., Bhatia S., Cori A., Ainslie K.E., Baguelin M., Bhatt S., Boonyasiri A., Brazeau N.F., Cattarino L., Cooper L.V., et al. Reduction in mobility and COVID-19 transmission. Nature Commun. 2021;12(1):1–9. doi: 10.1038/s41467-021-21358-2.
Nychka D., Furrer R., Sain S. 2014. Fields: Tools for Spatial Data. R Package Version 7.1. Vol. 10. Accessed Online.
Paeng S.-H., Lee J. Continuous and discrete SIR-models with spatial distributions. J. Math. Biol. 2017;74(7):1709–1727. doi: 10.1007/s00285-016-1071-8.
Prem K., et al. The effect of control strategies to reduce social mixing on outcomes of the COVID-19 epidemic in Wuhan, China: a modelling study. Lancet Public Health. 2020;5(5):261–270. doi: 10.1016/S2468-2667(20)30073-6.
Punn N.S., Sonbhadra S.K., Agarwal S. Cold Spring Harbor Laboratory Press; 2020. COVID-19 Epidemic Analysis Using Machine Learning and Deep Learning Algorithms. MedRxiv.
Rashed E.A., Hirata A. One-year lesson: Machine learning prediction of COVID-19 positive cases with meteorological data and mobility estimate in Japan. Int. J. Environ. Res. Public Health. 2021;18(11):5736. doi: 10.3390/ijerph18115736.
Reich B.J., et al. 2020. A review of spatial causal inference methods for environmental and epidemiological applications. arXiv Preprint: 2007.02714.
Reluga T. A two-phase epidemic driven by diffusion. J. Theor. Biol. 2004;229(2):249–261. doi: 10.1016/j.jtbi.2004.03.018.
Robinson M., Stilianakis N.I., Drossinos Y. Spatial dynamics of airborne infectious diseases. J. Theor. Biol. 2012;297:116–126. doi: 10.1016/j.jtbi.2011.12.015.
Rubin D.B. Estimating causal effects of treatments in randomized and nonrandomized studies. J. Educ. Psychol. 1974;66(5):688.
Sattenspiel L., Dietz K., et al. A structured epidemic model incorporating geographic mobility among regions. Math. Biosci. 1995;128(1):71–92. doi: 10.1016/0025-5564(94)00068-b.
Sattenspiel L., Herring D.A. Simulating the effect of quarantine on the spread of the 1918–19 flu in central Canada. Bull. Math. Biol. 2003;65(1):1–26. doi: 10.1006/bulm.2002.0317.
Sobel M.E. What do randomized studies of housing mobility demonstrate? Causal inference in the face of interference. J. Amer. Statist. Assoc. 2006;101(476):1398–1407.
Sparks A., Hengl T., Nelson A. GSODR: Global Summary Daily Weather Data in R. J. Open Source Softw. 2017;2(10):177.
Stein M.L. Space–time covariance functions. J. Amer. Statist. Assoc. 2005;100(469):310–321.
US Bureau of Labor Statistics. 2019. Health Care and Social Assistance: NAICS 62. https://www.bls.gov/iag/tgs/iag62.htm.
US Census Bureau. 2016. American Community Survey Public Use Microdata Samples. https://www.census.gov/newsroom/press-releases/2016/cb16-tps144.html.
Wang Y., Wang J., Zhang L. Cross diffusion-induced pattern in an SI model. Appl. Math. Comput. 2010;217(5):1965–1970.
Wu X., Nethery R.C., Sabath M., Braun D., Dominici F. Air pollution and COVID-19 mortality in the United States: Strengths and limitations of an ecological regression analysis. Sci. Adv. 2020;6(45):eabd4049. doi: 10.1126/sciadv.abd4049.
Wu X., et al. Cold Spring Harbor Laboratory Press; 2020. Exposure to Air Pollution and COVID-19 Mortality in the United States: A Nationwide Cross-Sectional Study.
Yilmazkuday H. Stay-at-home works to fight against COVID-19: international evidence from google mobility data. J. Hum. Behav. Soc. Environ. 2020. doi: 10.1080/10911359.2020.1845903.

Associated Data

Supplementary Materials

MMC S1

mmc1.pdf (206.5 KB, pdf)
