Abstract
Randomized controlled trials (RCTs) of interventions intended to modify health behaviors may be influenced by neighborhood effects, which can impede unbiased estimation of intervention effects. Examining an RCT designed to increase colorectal cancer (CRC) screening (N=5,628), we found statistically significant neighborhood effects: average CRC test use among neighboring study participants was significantly and positively associated with individual patients’ CRC test use. This potentially important spatially-varying covariate has not previously been considered in an RCT. Our results suggest that future RCTs of health behavior interventions should assess potential social interactions between participants, which may cause intervention arm contamination and may bias effect size estimation.
Keywords: Neighborhood, cancer screening, peer relationships, spatial autocorrelation, randomized controlled trial, diffusion
INTRODUCTION
Recent years have witnessed considerable growth in research on the impact of neighborhood social and physical environments on health behaviors and outcomes. However, in 2004, Oakes noted the many methodological challenges to the extant research toolkit for causal determination of neighborhood effects on health (Oakes, 2004b). Namely, he claimed that identifying an independent neighborhood effect on a health outcome was impossible given current methodologies (i.e., multilevel modeling of observational data). While the significance of Oakes’s critique has been debated (Diez Roux, 2004, Subramanian, 2004, Oakes, 2004a), the field has generally responded favorably with a more cautious approach to making causal claims about neighborhood effects. At the same time, while there is great interest in the design and testing of randomized controlled trials (RCTs) aimed at modifying health behaviors, scant attention has been paid to understanding how Oakes’s arguments pertain to causal inference in the context of RCTs. RCTs are considered the “gold standard” methodology for generating causal inference. Therefore, it is crucial to study the role that neighborhood effects might have on the results obtained from RCTs.
RCTs that test behavioral interventions, hereafter behavioral RCTs, differ from other types of RCTs (such as RCTs to test new drugs) in part due to the unique set of factors influencing human choices that are often outside the control (or measurement) of the RCT itself. Social dynamics that influence behaviors and that often occur within residential neighborhood contexts represent one of these key, frequently unmeasured, confounders. A recent article by Manski (2013) highlights the significant challenges social dynamics present for the estimation of intervention effects in behavioral RCTs. At the same time in the epidemiological literature, VanderWeele and Tchetgen (VanderWeele et al., 2012, Tchetgen and VanderWeele, 2012) elucidate the challenges to causal inference in the presence of subject-to-subject interference. However, little is known about the likelihood and/or scope of social influences within behavioral RCTs.
We examine the case of a behavioral RCT designed to increase colorectal cancer (CRC) screening in order to further elucidate how social influences may bias behavioral RCT outcomes. While the role of social influence on cancer screening is not completely understood, significant prior research indicates spatial variation in screening behaviors (Doubeni et al., 2012, Lian et al., 2008, Mobley et al., 2010, Shariff-Marco et al., 2013, Vogt et al., 2014) that could, in part, be caused by interactions among neighbors. Further, the CRC intervention we examine is inherently spatial in nature because it uses mailed invitations (i.e., targeted to patients’ residences) to deliver the intervention. Our contributions are two-fold: (1) we lay out an analysis framework for assessing situations in which social influences may be biasing behavioral RCT results and (2) we provide effect size estimates for the neighborhood effects occurring in our CRC screening behavioral RCT along with a discussion of how estimated neighborhood effects should be interpreted.
Our work continues the conversation begun by Oakes (2004b). As such, we assess the more recent literature, across multiple disciplines, regarding the identification of neighborhood effects. This literature has much to offer health researchers interested in how neighborhoods affect health and should, therefore, be considered in the design and implementation of future behavioral RCTs. Our work also contributes to a growing body of social science research that seeks to understand the causes and consequences of geographic “spillover” effects (Anselin and Bera, 1998, Baicker, 2005, Pereira and Roca-Sagales, 2003) as well as the epidemiological literature examining the related concept of interference (Tchetgen and VanderWeele, 2012, VanderWeele et al., 2012).
Spatial Dependence in Randomized Controlled Trials
Spatial dependence in health behaviors is not commonly assessed in the design, conduct, or evaluation of behavioral RCTs. If spatial dependence is present and unaccounted for in estimations of intervention effects, conventional standard error estimates and hypothesis tests based on the standard errors are not accurate. Moreover, depending on the underlying mechanisms that cause the spatial dependence, point estimates of intervention effects may also be biased. As a result, inference and policy recommendations arising from behavioral RCTs may be misleading, have weak support or, in extreme cases, be completely inaccurate (Anselin and Bera, 1998, Manski, 2013). Therefore, it is critical to understand the mechanisms that generate spatial dependence.
Methods developed in the fields of spatial econometrics and regional science focus on identification of mechanisms leading to spatial dependence. In these fields, mechanisms that may cause spatial dependence are divided into three categories of neighborhood effects: correlated, exogenous, and endogenous (Manski, 1993).
Correlated effects refer to the neighborhood effects that result because individuals self-select into neighborhoods, often sorting along demographic characteristics as a result of homophily, shared preferences for neighborhood amenities, and economic constraints (Tiebout, 1956). For example, if lower-socioeconomic status (SES) individuals are less likely to receive CRC screening and also sort into the same neighborhoods, spatial dependence in CRC screening may be an artifact of the correlation between income and screening.
Exogenous effects refer to the influence of shared neighborhood exposures or institutions. As one example, the promotion of CRC screening may vary in emphasis and outreach methods across different neighborhood clinics, in which case spatial dependence in CRC screening may be attributable to the particular clinic a patient attends.
Endogenous effects refer to a relationship between an individual’s behavior and the behavior of his or her neighbors as a result of social interaction and social influence (Manski, 1993). For the case of CRC screening, an individual may be more likely to undergo screening if she hears about friends and neighbors also undergoing regular screening.
Implications of Spatial Dependence in Behavioral RCTs
Correctly attributing spatial dependence to exogenous, correlated, and endogenous effects is important because the analytic and intervention implications for RCTs vary depending upon the neighborhood effect mechanisms. Table 1 presents a summary of the implications that result when neighborhood effects exist but are unaccounted for in analysis of behavioral RCTs. In many cases, traditional approaches such as adjusting for neighborhood-level sociodemographic characteristics, including neighborhood fixed effects, multilevel modeling (i.e., neighborhood random effects), or incorporating spatial dependence in the model’s error structure are sufficient to account for spatial dependence. However, there are situations where these approaches are insufficient, as we describe below. If spatial dependence is a result of correlated or exogenous effects that are independent of treatment assignment, it may be considered a nuisance parameter and simply adjusted for in analyses. The case in which correlated or exogenous effects are correlated with treatment assignment represents a failure to randomize across neighborhoods. This is of course serious, with consequences similar to other scenarios in which randomization fails.
Table 1.
Implications for Accurate Model Specification in the Presence of Various Types of Neighborhood Effects and Analytic Solutions
| Type of Neighborhood Effect | Implication for Intervention Effect Size Estimates | Solutions that will “Fix” the Misspecification Problem | ||
|---|---|---|---|---|
| Biased Effect size Estimate | Inaccurate Standard Error Estimate | Multilevel Models, Neighborhood Fixed Effects, or Adjusted Standard Errors | Add Variables to the Model to Measure the Neighborhood Effects | |
| Correlated Effects or Exogenous Effects that are independent of treatment assignment | X | X | X | |
| Correlated Effects or Exogenous Effects that are correlated with treatment assignment | X | X | X | |
| Endogenous Effects | X | X | X | |
Endogenous effects generated by direct social interaction also have very significant implications. In the context of a behavioral RCT, not accounting for endogenous effects may result in biased intervention effect estimates through contamination of treatment and control groups (Manski, 2013). For example, if “treated” individuals influence the behavior of untreated or treated neighbors, this may augment intervention effectiveness. Failure to account for endogenous effects results in an “omitted variable” problem: endogenous effects confound the relationship between treatment and the outcome targeted by the RCT. Additional covariates that measure the endogenous effects must be added to the model to fully and accurately estimate treatment effect sizes (Greene, 1981, Hill et al., 2011).
Manski (2013) points out that traditional analysis of RCTs assumes “individualistic” treatment response, which is not the case in the presence of endogenous effects. In the presence of endogenous effects, models are needed to accurately evaluate total intervention effects, which would include both direct (“individualistic”) and indirect (neighbor-to-neighbor or peer-to-peer) effects (Ioannides, 2012, Durlauf and Ioannides, 2010).
To our knowledge, no studies have examined spatial dependence and neighborhood effects in the context of a behavioral RCT where participants lived close to each other. At the same time, despite a robust literature documenting spatial and geographic differences (Doubeni et al., 2012, Lian et al., 2008, Mobley et al., 2010, Shariff-Marco et al., 2013, Vogt et al., 2014), the mechanisms driving spatial variations in cancer screening behavior are poorly understood. Therefore, in a study of a geographically-based RCT designed to increase colorectal cancer (CRC) screening, we applied spatial econometric methods to test for spatial dependence and the existence of correlated, exogenous and endogenous neighborhood effects.
METHODS
Sample
We conducted secondary analyses of data from patients in a randomized, comparative effectiveness trial (2011–2012) conducted in the John Peter Smith (JPS) urban safety-net healthcare system. JPS consists of 12 community primary-care clinics and a tertiary-care hospital providing services to residents of Fort Worth and Tarrant County, Texas. The study and eligibility criteria are described in detail elsewhere (Gupta et al., 2013). The study included men and women, ages 54–64 years, with a recent health system visit (any visit within 8 months before randomization), no CRC history, and who were uninsured but enrolled in a County medical assistance program for the uninsured. The medical assistance program facilitates access to primary and specialty care, including surgery and cancer care, on a sliding pay scale. Patients were excluded if they resided in jail, were homeless, or were up-to-date with CRC screening (defined as having a fecal occult blood test within 1 year, sigmoidoscopy or barium enema within 5 years, or colonoscopy within 8 years; 8 rather than 10 years was used given availability of health system data).
Procedures
Eligible patients were randomly assigned to one of three conditions: 1) Usual Care, 2) Organized fecal immunochemical test (FIT), or 3) Organized Colonoscopy. Usual Care included opportunistic, clinic visit-based offers to complete screening with fecal occult blood test, colonoscopy, barium enema, or sigmoidoscopy at the discretion of health care providers. Opportunistic visit-based screening offers could be received at any time by usual care or intervention participants. Intervention participants received mailed invitations to complete no-cost CRC screening. Patients randomized to Organized Colonoscopy were invited to schedule free screening by calling an included telephone number. Patients randomized to Organized FIT were mailed a free one-sample FIT test with instructions and a postage-paid return envelope for the kit with their invitation letter. All materials were written in both English and Spanish. All invitees received two automated phone reminders, and non-respondents received up to two live telephone reminders to promote participation. Institutional Review Boards at JPS and UT Southwestern Medical Center approved the study, with a waiver of informed consent.
For this analysis, we excluded n=342 patients whose addresses could not be geocoded (e.g., P.O. Box), who resided outside Tarrant County, or who did not have an identified primary care clinic or physician. Those excluded were more likely to be female, White, speak English, be in the Usual Care study arm, and not to have received CRC testing (p<.05).
Measures
Outcome
The outcome was completion of any modality of CRC testing (e.g. FIT or other stool blood test, sigmoidoscopy, colonoscopy) within one year of randomization. Therefore, the dependent variable is binary—0 if the participant did not receive any testing and 1 otherwise.
Covariates
We analyzed several patient covariates related to CRC screening based on previous research (Cokkinides et al., 2003, Diaz et al., 2008, McQueen et al., 2006), including age, sex, race/ethnicity (non-Hispanic White, non-Hispanic Black, Hispanic, and other), and primary language spoken at home (English, Spanish). We also considered block group poverty rate and percent graduating from college as neighborhood demographic covariates. These variables were calculated from the American Community Survey (2006–2010) data and were included in the model to account for correlated neighborhood effects. We measured three health system covariates as they are likely sources of exogenous neighborhood effects in CRC test use (Rossi et al., 2005, Pruitt et al., 2014). We coded the patient’s primary care clinic and calculated travel time for each patient to their primary care clinic and to the central hospital (see Pruitt et al. for more details on travel time calculation).
Analysis
First we tested the assumption of spatial independence. We calculated Moran’s I, a global measure of spatial autocorrelation, to test whether CRC test use was spatially correlated within the ½ mile surrounding the patient’s home.
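A minimal sketch of the statistic may help readers unfamiliar with it. The Python below is illustrative only and is not the MATLAB toolbox routine used in this study: the function names, the dense distance-band weights, and the permutation-based pseudo p-value (swapped in here in place of the toolbox's test) are all assumptions made for the example, and coordinates are assumed to be projected so that distances are in miles.

```python
import numpy as np

def distance_band_weights(coords, radius):
    """Binary weights: 1 if two participants live within `radius`, else 0.

    Hypothetical helper; builds a dense n x n matrix, so it is only
    practical for small illustrative samples.
    """
    coords = np.asarray(coords, dtype=float)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    W = (d <= radius).astype(float)
    np.fill_diagonal(W, 0.0)          # a participant is not their own neighbor
    return W

def morans_i(y, W):
    """Global Moran's I for outcome vector y and weights matrix W."""
    y = np.asarray(y, dtype=float)
    W = np.asarray(W, dtype=float)
    z = y - y.mean()                  # deviations from the sample mean
    s0 = W.sum()                      # sum of all weights
    return (len(y) / s0) * (z @ W @ z) / (z @ z)

def permutation_p_value(y, W, n_perm=999, seed=0):
    """One-sided pseudo p-value for positive spatial autocorrelation."""
    rng = np.random.default_rng(seed)
    observed = morans_i(y, W)
    sims = np.array([morans_i(rng.permutation(y), W) for _ in range(n_perm)])
    p = (1 + np.sum(sims >= observed)) / (n_perm + 1)
    return observed, p
```

With outcomes coded 0/1 and a half-mile band, `morans_i(y, distance_band_weights(coords, 0.5))` is close to zero when test use is spatially independent and positive when neighbors tend to share the same outcome.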
Next, we estimated a series of models to examine the impact of adding covariates to control for each of the different types of neighborhood effects. To model the binary variable representing CRC test use, we estimated multivariable probit models. Letting y* denote the latent variable, propensity of CRC test use, the model was specified as:
$$ y^{*} = \rho W y^{*} + T\alpha + X\beta + C\gamma + H\eta + \varepsilon \qquad (1) $$
Equation (1) implies that we assumed a linear relationship between y* and treatment assignment (T), the individual patient covariates (X), correlated effects as measured by the neighborhood demographic covariates (C), exogenous effects as measured by the health system covariates (H), and endogenous effects as measured by the CRC test use of neighbors (Wy*). Here α, β, γ, η, and ρ denote the corresponding coefficients and ε is the error term.
The spatial weights matrix, W, defines neighbors; and Wy* is the weighted-average propensity of CRC test use among each patient’s neighbors. This term is a measure of the influence of nearby patients on the individual patient. We defined nearby as the 6 observations nearest the target observation and weighted them by the inverse of the distance between the target and neighbor observation. All other non-neighboring observations were assigned a weight of zero. The weight matrix was row-standardized so that all weights for a given observation summed to 1 (LeSage and Pace, 2009). Neighbors living in the same apartment complex were weighted as living 1 foot from one another. When more than 6 study participants lived in the same apartment complex (as was the case for 78 out of our 5,628 study observations), the nearest 6 neighbors were randomly chosen from the group of neighbors.
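The construction described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the MATLAB code used in the study: the function name is hypothetical, coordinates are assumed to be projected in miles, a dense distance matrix is used for brevity, and ties among co-located participants are broken by index order rather than by the random draw described in the text.

```python
import numpy as np

FOOT_IN_MILES = 1.0 / 5280.0   # same-complex neighbors are treated as 1 foot apart

def knn_inverse_distance_weights(coords, k=6, min_dist=FOOT_IN_MILES):
    """Row-standardized inverse-distance weights on the k nearest neighbors.

    Hypothetical helper; requires more than k observations.
    """
    coords = np.asarray(coords, dtype=float)
    n = coords.shape[0]
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)                 # exclude self from the neighbor set
    W = np.zeros((n, n))
    for i in range(n):
        nearest = np.argsort(d[i], kind="stable")[:k]
        # floor the distance so co-located participants get a finite weight
        W[i, nearest] = 1.0 / np.maximum(d[i, nearest], min_dist)
    W /= W.sum(axis=1, keepdims=True)           # each row sums to 1
    return W
```

Given such a `W` and a vector `y` of outcomes, `W @ y` is the inverse-distance-weighted average outcome among each participant's nearest neighbors.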
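In probit models, we assume that the error term in y* follows a standard normal distribution with cumulative distribution function Φ, so that the probability of CRC test use (y = 1) is:
$$ \Pr(y = 1) = \Phi\left(\rho W y^{*} + T\alpha + X\beta + C\gamma + H\eta\right) \qquad (2) $$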
First, we followed the estimation approaches most commonly applied to the estimation of effects in a behavioral RCT. We fitted two probit models. Model 1 includes only the intervention effect, assuming that ρ = η = γ = β =0. Model 2 adds the patient, neighborhood (correlated effects), and health system (exogenous effects) covariates, thus assuming that only ρ =0.
Finally, we included endogenous effects by removing the restriction on ρ. Hereafter, we will refer to this model as the “spatial lag model.” We fitted two spatial lag models. Model 3 includes only the intervention and endogenous effects; and Model 4 adds the patient, neighborhood, and health system covariates. Model 4 is the most complete model, which includes all three types of neighborhood effects. To estimate the spatial lag model, Equation (1) was transformed to account for the endogeneity of y*:
$$ y^{*} = (I - \rho W)^{-1}\left(T\alpha + X\beta + C\gamma + H\eta + \varepsilon\right) \qquad (3) $$
and the likelihood function was maximized utilizing Bayesian estimation with diffuse priors (LeSage and Pace, 2009).
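To make the transformation concrete, the following Python sketch simulates latent propensities by solving the linear system implied by the spatial lag specification and then dichotomizes them. It is a hedged illustration only: the function name, the stacking of all regressors (treatment, patient, neighborhood, and health system covariates) into a single matrix `X` with coefficient vector `beta`, and the toy ring-shaped neighbor structure are assumptions, and no estimation (Bayesian or otherwise) is attempted.

```python
import numpy as np

def simulate_spatial_probit(W, X, beta, rho, seed=0):
    """Draw latent propensities and binary outcomes from a spatial lag probit.

    Solves (I - rho * W) y_star = X @ beta + eps, i.e. the reduced form
    y_star = (I - rho * W)^(-1) (X @ beta + eps), with eps ~ N(0, 1).
    """
    rng = np.random.default_rng(seed)
    n = W.shape[0]
    eps = rng.standard_normal(n)
    A = np.eye(n) - rho * W
    y_star = np.linalg.solve(A, X @ beta + eps)   # avoids forming the inverse explicitly
    y = (y_star > 0).astype(int)                  # observed outcome: 1 if propensity is positive
    return y_star, y, eps

if __name__ == "__main__":
    # Toy example: 6 participants on a ring, each with two equally weighted neighbors.
    n = 6
    W = np.zeros((n, n))
    for i in range(n):
        W[i, (i - 1) % n] = 0.5
        W[i, (i + 1) % n] = 0.5
    X = np.column_stack([np.ones(n), np.arange(n) % 2])   # intercept and a toy treatment flag
    y_star, y, _ = simulate_spatial_probit(W, X, np.array([-1.0, 1.0]), rho=0.3)
    print(y_star.round(3), y)
```

Because each participant's propensity appears on both sides of the structural equation, solving the system for all participants at once is what distinguishes this specification from an ordinary probit.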
We conducted several sensitivity analyses as follows. An implicit assumption in the spatial lag models is that the spatial weights matrix accurately depicts the structure of social relationships among study participants (Partridge et al., 2012). To test the sensitivity of our results, we also estimated the final spatial lag model (Model 4) using different assumptions to construct the spatial weights matrix: (1) 12 nearest neighbors weighted by inverse distance and (2) all neighbors within 0.25 mile of the patient. Next, we added fixed effects for primary care providers (PCPs; n=185) to Model 4. PCP effects were not included in the primary analyses due to concerns about over-fitting. Finally, we conducted sensitivity tests of spatial decay because we hypothesized that if neighbors’ CRC test use behavior influences an individual’s test use, then more distant neighbors would have less influence. We tested this hypothesis by estimating additional spatial lag models (using Model 4), each incorporating an increasingly distant set of neighbors in groups of 6 (i.e., nearest neighbors 1–6; 7–12; 13–18; and 19–24).
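One way to picture the spatial decay analysis is as a set of weights matrices, each restricted to a band of neighbor ranks. The Python sketch below is an assumption-laden illustration rather than the study's MATLAB code: the function name is hypothetical, coordinates are assumed to be in miles, and inverse-distance weighting within each band is assumed because the text does not state how the banded neighbors were weighted.

```python
import numpy as np

def rank_band_weights(coords, first=7, last=12, min_dist=1.0 / 5280.0):
    """Row-standardized inverse-distance weights on the `first`-th through
    `last`-th nearest neighbors (ranks are 1-based; `last` must be < n).
    """
    coords = np.asarray(coords, dtype=float)
    n = coords.shape[0]
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)                      # self sorts last and is never selected
    order = np.argsort(d, axis=1, kind="stable")     # neighbors sorted nearest-first, per row
    W = np.zeros((n, n))
    for i in range(n):
        band = order[i, first - 1:last]              # e.g. ranks 7..12
        W[i, band] = 1.0 / np.maximum(d[i, band], min_dist)
    return W / W.sum(axis=1, keepdims=True)
```

Calling the function with `(first, last)` set to `(1, 6)`, `(7, 12)`, `(13, 18)`, and `(19, 24)` yields four matrices, one per band, each of which could be substituted for W in the spatial lag model.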
Data were merged and managed in SAS 9.3 (SAS Institute, Cary NC). Models 1 and 2 were estimated in STATA 13.0 (StataCorp. 2013. Stata Statistical Software: Release 13. College Station, TX: StataCorp LP.) Neighbor matrices were computed with MATLAB 7.9 (The MathWorks, Inc., Natick, MA). Moran’s I test and spatial lag models were estimated with LeSage’s publicly available Spatial Econometrics Toolbox for MATLAB (LeSage, 2010).
RESULTS
Overall, the sample included 5,628 patients of whom 21.2% (n=1,194) completed CRC testing during the one-year study period. Test use varied by group assignment as reported elsewhere (Gupta et al., 2013) with the highest uptake in the organized FIT group (42.2%) compared to organized colonoscopy (25.5%) and usual care groups (12.3%). Test use varied by race/ethnicity and primary language but did not differ by patient age, sex, travel time, or block group poverty or education. Patient characteristics by test use are described in Table 2.
Table 2.
Patient characteristics by colorectal cancer (CRC) test use during 12 month study period (n=5,628)
| Characteristic | No CRC test use: N/Mean | No CRC test use: %/(SE) | CRC test use: N/Mean | CRC test use: %/(SE) | p-value (Chi2/t-test) |
|---|---|---|---|---|---|
| Group Assignment | |||||
| Organized Colonoscopy | 336 | 0.75 | 115 | 0.25 | < 0.001 |
| Organized FIT | 851 | 0.58 | 622 | 0.42 | |
| Usual Care | 3247 | 0.88 | 457 | 0.12 | |
| Language | |||||
| Spanish | 241 | 0.25 | 717 | 0.75 | 0.001 |
| English | 3717 | 0.80 | 953 | 0.20 | |
| Race / Ethnicity | |||||
| Hispanic | 1256 | 0.76 | 392 | 0.24 | <0.001 |
| Black | 1021 | 0.76 | 322 | 0.24 | |
| Other | 1878 | 0.83 | 390 | 0.17 | |
| White | 279 | 0.76 | 90 | 0.24 | |
| Age | 59.02 | (2.89) | 59.18 | (2.89) | 0.097 |
| Sex | |||||
| Male | 2823 | 0.64 | 1611 | 0.36 | 0.202 |
| Female | 784 | 0.66 | 410 | 0.34 | |
| Travel to medical home in minutes | 11.1 | (6.53) | 11.14 | (6.41) | 0.849 |
| Travel to endoscopy clinic in minutes | 14.78 | (6.74) | 15.11 | (6.67) | 0.132 |
| Percent below poverty in BG | 20.86 | (17.33) | 20.02 | (16.00) | 0.133 |
| Percent graduated college in BG | 16.87 | (14.57) | 17.24 | (14.41) | 0.424 |
| Total | 4434 | 0.79 | 1194 | 0.21 | |
CRC=colorectal cancer; FIT= fecal immunochemical test; BG=block group
Study participants resided in close proximity to each other. Nearly all (92.5%) had at least one neighbor participating in the study within a .25 mile radius from their home. Only 17 (<1%) patients did not have at least one neighbor within 1 mile of their home. On average, participants had 6.3, 18.3, and 59.0 neighbors within .25, .50, and 1 mile radius of their home, respectively. Many participants resided in an apartment complex (19.5%, n=1,095). Apartment dwellers had a mean of 9.83 neighbors within 0.25 miles compared to 5.40 neighbors for non-apartment dwellers. The maximum number of neighbors was large. Participants had up to 57 neighbors in a .25 mile radius of their home and up to 184 within a 1 mile radius. This is particularly noteworthy given that the Tarrant County study area is large, consisting of 897 square miles. Figure 1 presents a map demonstrating the density of study participants across the County. The spatial weight matrices included the nearest 6 neighbors, weighted by inverse distance. On average, the nearest 6 neighbors lived 0.23 miles away (range 0–2.41).
Figure 1. Participant density (n=5,628) by block group and locations of primary care clinics and hospital across Tarrant County.
Note: graphic shows patients and clinics in Tarrant County; multiple clinics can be located at the same address.
Moran’s I calculated on the raw test use variable was .017 (p=.029). We thus rejected the null hypothesis that there was no spatial dependence in CRC test use in this sample.
Estimation results for the multivariate models are reported in Table 3. Model 1 indicated that both intervention arms were significantly and positively associated with CRC test use, confirming findings of the primary outcome study (Gupta et al., 2013). These effects remained significant and positive after including individual, exogenous, and correlated effects (i.e. patient, neighborhood, and health system covariates) in Model 2. In Model 2 “other” race (vs. Whites) was negatively associated (p<.05) with CRC test use.
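Because Model 1 contains only the arm indicators, its probit coefficients map directly onto predicted probabilities of test use through the standard normal distribution function. The short Python sketch below (names are illustrative; coefficients are copied from Table 3) performs that conversion. The results should land close to the uptake by arm reported above (roughly 12%, 26%, and 42% for usual care, organized colonoscopy, and organized FIT), which offers a quick check on how to read the coefficients.

```python
from math import erf, sqrt

def std_normal_cdf(x):
    """Standard normal cumulative distribution function, Phi(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

# Model 1 estimates as reported in Table 3 (usual care is the reference arm).
CONSTANT = -1.158
ARM_COEF = {
    "Usual Care": 0.0,
    "Organized Colonoscopy": 0.499,
    "Organized FIT": 0.962,
}

def predicted_test_use(arm):
    """Predicted probability of CRC test use for a given study arm."""
    return std_normal_cdf(CONSTANT + ARM_COEF[arm])

for arm in ARM_COEF:
    print(f"{arm}: {predicted_test_use(arm):.3f}")
```

The same conversion does not carry over unchanged to Models 2 through 4, where predicted probabilities also depend on covariate values and, in the spatial lag models, on neighbors' propensities.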
Table 3.
Coefficients and standard errors from nonspatial base models and spatial lag models of colorectal cancer test use (n=5,628)
| Variable | Non-spatial: Model 1 | Non-spatial: Model 2 | Spatial lag: Model 3 | Spatial lag: Model 4 |
|---|---|---|---|---|
| Constant | −1.158 *** (0.026) | −1.732 *** (0.430) | −1.132 *** (0.029) | −1.729 *** (0.432) |
| Group Assignment | ||||
| Organized Colonoscopy (vs. Usual Care) | 0.499 *** (0.069) | 0.527 *** (0.070) | 0.493 *** (0.075) | 0.529 *** (0.073) |
| Organized FIT (vs. Usual Care) | 0.962 *** (0.042) | 0.992 *** (0.043) | 0.965 *** (0.043) | 0.998 *** (0.037) |
| Language | ||||
| Spanish (vs. English) | - | 0.086 (0.073) | - | 0.091 (0.075) |
| Race / Ethnicity | ||||
| Black (vs. White) | - | 0.118 (0.091) | - | 0.125 * (0.088) |
| Hispanic (vs. White) | - | 0.034 (0.098) | - | 0.033 (0.094) |
| Other (vs. White) | - | −0.196 ** (0.085) | - | −0.194 *** (0.079) |
| Age | - | 0.010 (0.007) | - | 0.010 (0.007) |
| Sex | ||||
| Male (vs. Female) | - | 0.040 (0.042) | - | 0.040 (0.041) |
| Travel to medical home in minutes | - | −0.001 (0.004) | - | −0.001 (0.004) |
| Travel to endoscopy clinic in minutes | - | 0.000 (0.004) | - | 0.000 (0.004) |
| Percent below poverty in BG | - | −0.102 (0.150) | - | −0.123 (0.156) |
| Percent graduated college in BG | - | 0.248 (0.164) | - | 0.235 * (0.160) |
| Clinic Fixed Effects | - | [Data not shown] | - | [Data not shown] |
| Spatial lag coefficient (ρ) | - | - | 0.033 ** (0.020) | 0.029 ** (0.018) |
* p<0.10; ** p<0.05; *** p<0.01
Due to space considerations, clinic fixed effects (n=19) from Models 2 and 4 are not displayed but are available from the corresponding author upon request.
Next, spatial lag models that allow for inclusion of each type of neighborhood effect were examined. Comparing spatial lag models 3 and 4 to nonspatial Models 1 and 2, intervention arm effects remained strongly and significantly associated with CRC testing. Intervention arm effect size is not statistically distinguishable (the confidence intervals almost entirely overlap) between the different model specifications.
The coefficient estimate for ρ (or “rho”) is statistically significant in Models 3 and 4 (Table 3). The average CRC test use of geographically close neighbors is positively associated with individual test use before and after controlling for covariates (Model 3: ρ = 0.033, p<.01; Model 4: ρ = 0.029, p<.02). In Model 4, patient factors are statistically significant (p<.10) as follows: older patients and Blacks (vs. Whites) were more likely while “others” (vs. Whites) were less likely to be tested. Statistically significant (p<.10) correlated effects included block group education level, which was positively associated with testing (Table 3, Model 4). Exogenous effects including many of the clinic fixed effects were significantly associated with testing [data not shown]. In other words, compared to the largest (referent) clinic, CRC test use was higher in some clinics (n=11) and lower in others (n=8).
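For a rough sense of scale, and as a back-of-envelope reading rather than an analysis reported here, the spatial multiplier implied by ρ can be worked out on the latent propensity scale. Assuming a row-standardized W in which every row sums to one and |ρ| < 1, with ι a vector of ones and θ standing for any one coefficient (both symbols are introduced only for this illustration):

$$ (I - \rho W)^{-1}\iota = \sum_{k=0}^{\infty} \rho^{k} W^{k}\iota = \frac{1}{1-\rho}\,\iota, \qquad \text{so the average total effect is } \frac{\theta}{1-\rho}. $$

$$ \rho = 0.029: \qquad \frac{1}{1 - 0.029} \approx 1.030 $$

Under these assumptions, neighbor-to-neighbor feedback would add about 3% to a direct effect on the latent scale. Effects on the probability scale are nonlinear in a probit model and would need to be computed separately.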
Sensitivity analyses of spatial decay (Table 4) indicated that spatial dependence was generally attenuated over increasingly distant sets of neighbors. Notably, the spatial effect was significant for the nearest 6 neighbors (ρ = 0.029, p=.020) and was no longer significant for the nearest 19–24 neighbors (ρ = 0.019, p=.133). The nearest 6 and nearest 19–24 neighbors lived an average of 0.23 miles (range: 0–2.41) and 0.74 miles (range: 0–6.69) away, respectively. Sensitivity analysis using different specifications of the spatial weight matrix did not substantively alter effect size or significance of rho in Models 3 or 4 [data not shown]. Addition of PCP fixed effects to Model 4 did not substantively alter intervention effects; and rho was modestly attenuated but remained significant after inclusion of PCP fixed effects in Model 4 [data not shown].
Table 4.
Spatial decay models demonstrating distance between patient and neighbors and spatial lag coefficient (Rho/ ρ) effect in colorectal cancer test use over sets of 6 increasingly more distant neighbors
| # Nearest Neighbors | Distance Between Patient and Neighbors in Miles: Average | Distance Between Patient and Neighbors in Miles: Range | Spatial Lag: coefficient (ρ) | Spatial Lag: SE | Spatial Lag: p value |
|---|---|---|---|---|---|
| 1 thru 6 | 0.23 | 0–2.41 | 0.029 | 0.018 | 0.020 |
| 7 thru 12 | 0.45 | 0–5.74 | 0.022 | 0.016 | 0.003 |
| 13 thru 18 | 0.60 | 0–6.17 | 0.027 | 0.019 | 0.017 |
| 19 thru 24 | 0.74 | 0–6.69 | 0.019 | 0.015 | 0.133 |
Models control for intervention group, language, race/ethnicity, age, sex, travel time to clinic and endoscopy facility, neighborhood poverty, neighborhood education, and clinic fixed effects. SE=standard error
DISCUSSION
We extended the traditional framework for analyzing RCT data of a health behavior intervention by incorporating contributions from the economics and regional science “neighborhood effects” literatures. Our spatial econometric approach highlights some novel conceptual and analytic opportunities for those seeking to better understand how neighborhoods might influence health behavior. In future studies using these methods, the availability of additional data on participant social interactions will facilitate a more robust identification of the causal mechanisms driving neighborhood influence on health.
First, our approach provides a conceptual framework for considering how neighborhoods might influence health behaviors (i.e. correlated, exogenous, and endogenous effects). Our framework is notable in that it provides direct implications for how neighborhood effects should be measured and incorporated into an analytic framework. While there are numerous conceptual models for how neighborhoods might influence health (Robert, 1999, Krieger, 2001, Diez Roux, 2003, Macintyre and Ellaway, 2003), there are few analytic frameworks able to robustly test the implications of the conceptual models. For example, methods such as multilevel modeling cannot elucidate potential mechanisms leading to spatial variation. The neighborhood effects literature presented here—specifically exogenous, correlated, and endogenous effects—helps to fill this gap. Future work should consider how this characterization of neighborhood effects may be extended to the case of clustered-RCTs. The “clusters” may be considered similar to neighborhoods as discussed here. Exogenous effects may exist within clusters to the extent that individuals within the same cluster experience shared institutional, social or environmental exposures. Furthermore, endogenous effects may be heightened if social interactions occur within clusters and they may produce significant contamination if social interactions occur across clusters.
Second, for the case of behavioral RCTs, we illustrated the need to analyze outcome variables in terms of their spatial configurations. Spatial dependence is a symptom of possible endogenous effects, which if unaccounted for may cause bias in intervention effect size estimates. In our case, we found little evidence of bias (i.e. effect size estimates across models are not statistically distinguishable) despite statistically significant endogenous effects; but this may not always be the case. There may be many scenarios in which failure to account for spatial dependence may result in greater bias, for example, when dependence is stronger, which may be the case in more frequent, observable, or inherently social behaviors, such as eating or smoking.
Third, we demonstrated a potentially important spatially-varying covariate that has not previously been considered in RCTs: the CRC test use of neighbors was a significant and positive correlate of individual test use. This was independent of measured exogenous and correlated effects as well as patient covariates. This novel covariate has implications for both cancer screening interventions and for RCTs more generally.
Our study provides novel evidence (the positive and significant rho) of social influence as a possible mechanism driving spatial dependence in CRC screening behaviors. Social influences are acknowledged and incorporated as an important causal predictor of health behaviors across health behavior theories and models (e.g., Social Cognitive Theory, Social Learning Theory, Interdependence Theory, and the behavioral ecological and social ecological models) (Glanz et al., 2002, DiClemente et al., 2002, Kelley and Thibaut, 1978). However, the empirical evidence showing a causal role of social influence via direct social interaction in health behaviors (the most common interpretation applied to a statistically significant rho in the neighborhood effects literature) is limited (e.g., Christakis and Fowler, 2007, Eisenberg and Quinn, 2006) and heavily debated (e.g., Lyons, 2011, Cohen-Cole and Fletcher, 2008a, Cohen-Cole and Fletcher, 2008b, Noel and Nyhan, 2011, Sainsbury, 2008, Shalizi and Thomas, 2011). Nevertheless, other recent studies, using random assignment of peers, have identified a direct causal link between individual and peer physical activity (Carrell et al., 2011).
Are the Endogenous Effects Causal?
While our results are suggestive, we cannot infer causality of social interactions and influence between neighbors in our RCT. In our study, we assumed that the strength of social interactions was based on geographic proximity and the causal implications of our estimates rely on the accuracy of this assumption (Corrado and Fingleton, 2012, Gibbons and Overman, 2012, Partridge et al., 2012). Spatial decay in the estimated endogenous effects lends credibility to our assumption, but we are limited in our ability to define the true nature of social relationships among study participants.
Moreover, neighborhood sorting by individual preferences, which are related to the likelihood of CRC screening test use, may be confounding the estimated endogenous effects. For example, individuals may sort into neighborhoods where others share similar health behavior norms (unmeasured confounders). We controlled for multiple exogenous and correlated neighborhood and clinic characteristics to address this possibility, but shared attitudes or norms may be unrelated to these factors. For example, residents may choose to live in neighborhoods near other members of their religious affiliation and some religious affiliations may be intrinsically biased for or against invasive medical procedures like colonoscopy.
Future research is needed to confirm the presence of endogenous effects in the context of a behavioral RCT. This task is complicated by the sorting of individuals both geographically (through choice of residential neighborhood) and socially (through choice of social peers). To robustly identify endogenous effects, an RCT in the context of a social observatory or other setting wherein there is near-complete characterization of participant neighborhoods and social networks is needed. While data linkage to existing RCT data is an appealing approach, prospective measures of social interactions are needed to identify causality. Furthermore, secondary datasets characterizing both geographic and social dimensions are rare and those allowing for the identification of social peer effects are even more unusual (Bramoulle et al., 2009, Cohen-Cole and Fletcher, 2008b, Cohen-Cole and Fletcher, 2008a, Dietz, 2002, Durlauf, 2004, Durlauf and Ioannides, 2010, Ioannides and Topa, 2010, Manski, 1993).
Significance of Endogenous Effects for Cancer Screening Research and RCTs
If the relationship between neighbors’ CRC test use and individual test use is a result of true social interaction in which the behavior of one individual influences the behavior of others, there are significant implications for cancer screening research and for behavioral RCTs more generally. There are particular implications for cancer screening intervention development and delivery. If social influence and interactions are a causal mechanism driving neighborhood variations in cancer screening, they are potentially a very powerful and modifiable intervention target. Future intervention research should explore approaches to maximizing and/or modifying the social interactions that result in increased cancer screening uptake.
There are also implications for calculating RCT effect size and RCT intervention arm contamination. If intervention effects are multiplied as each treated individual in the RCT treatment arm indirectly extends the treatment to his/her neighbors (or peers), this “social multiplier” effect may result in treatment arm contamination, which could bias intervention results. It also means that the true intervention effect can only be captured by considering the spillover effects beyond the treatment arm itself. In our case, the total intervention effects are greater than the marginal direct effects of the intervention shown in Table 3. Total effects of the intervention are increased as follows: for every patient undergoing CRC testing as a result of the intervention, via social influence, his or her neighbors are now slightly more likely to be tested (see LeSage et al. (2011) for detail about calculating the total and direct effects from spatial probit estimates). In our study the intervention effect sizes increase for FOBT (from 0.2525 direct to 0.2606 total) and colonoscopy (from 0.1339 direct to 0.1383 total) (estimates based on Model 4).
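The gap between the direct and total effect sizes reported above can be made explicit with simple arithmetic. The following is a minimal sketch, not the estimation procedure itself: it only takes the four Model 4 values quoted in the text and computes the implied indirect (spillover) component as total minus direct. The function name `spillover_summary` and the dictionary layout are illustrative assumptions.

```python
# Direct and total marginal effects exactly as quoted in the text (Model 4).
REPORTED_EFFECTS = {
    "FOBT": {"direct": 0.2525, "total": 0.2606},
    "colonoscopy": {"direct": 0.1339, "total": 0.1383},
}


def spillover_summary(direct, total):
    """Return the implied indirect effect and its size relative to the direct effect.

    Hypothetical helper for illustration only: it treats the indirect
    (spillover) effect as the simple difference total - direct.
    """
    indirect = total - direct
    return {
        "indirect": round(indirect, 4),
        "pct_increase": round(100 * indirect / direct, 1),
    }


for arm, effects in REPORTED_EFFECTS.items():
    summary = spillover_summary(effects["direct"], effects["total"])
    print(
        f"{arm}: indirect effect = {summary['indirect']:.4f}, "
        f"total exceeds direct by {summary['pct_increase']:.1f}%"
    )
```

Under these assumptions the implied spillover is roughly 0.008 for FOBT and 0.004 for colonoscopy, that is, the total effect exceeds the direct effect by about 3% in each arm, which is consistent with the description of neighbors being "slightly more likely" to be tested.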
Are Endogenous Effects Common in RCTs?
Considering the external validity of our results, the plausibility of direct social influence (causal endogenous effects) in the context of a behavioral RCT likely depends on features of trial design that may facilitate or hinder patient interactions. Such features might include participant geographic proximity, sharing of the same physician or attending the same clinic, neighborhood characteristics (e.g. social cohesion or strength of social norms), and characteristics of the health behavior itself, such as how likely it is that others will observe or gain knowledge of the behavior. Length of follow-up is another likely influential feature. RCTs of health behavior interventions typically randomize and ascertain individuals’ outcomes after a significant follow-up period of weeks, months, or even years. During this time, participants’ exposures to their neighborhood environments as well as other RCT participants are ongoing; longer follow-up periods provide more opportunities to interact, share experiences, and observe each other’s behaviors.
Our results may represent a situation where endogenous effects play a more minor role than is likely in other behavioral RCTs. For example, as a pragmatic trial, informed consent was not required in our study, and participants were unaware they were participating in an RCT. Had participants been actively consented and enrolled, it is possible they would have been more likely to discuss trial participation or the target behavior with social and neighborhood peers. Additionally, our behavioral RCT had very strong intervention effects. Understanding the role of endogenous effects is arguably more critical when direct intervention effects are more modest.
Strengths and Limitations
Our study faces several limitations. We examined patients in an urban safety-net health system living in a single Texas county with identified primary care providers and clinics, and further restricted our analysis to those with addresses that could be geocoded; thus the external validity of our findings is unknown. We were unable to measure provider- or clinic-level factors such as attitudes or institutional systems that may contribute to observed endogenous effects. However, our results are robust to adjustments for provider- and clinic-level fixed effects. We did not have data on the exact nature of social interactions among study participants. Therefore our results suggest that endogenous effects may be present, but we cannot claim identification without qualification. A clear understanding of social networks of RCT participants, and the nature of social interactions (e.g. who is a leader, a follower, the number and type of social connections, etc.) would be necessary to more fully understand the degree to which intervention effects are under- or over-estimated when social interactions are unaccounted for (Manski, 2013). However, our tests for spatial decay of the endogenous effects reinforce the notion that proximity plays an important role in social interactions.
Our study provided an exemplary case for the application of spatial econometric methods to data from an RCT of a health behavior intervention. Treatment arms were randomly assigned without regard to geography and the study was conducted within a single, large health care system in Tarrant County, a defined, geographically discrete area that is densely populated and urban. These design features facilitated geographic proximity and density among neighbors, including participants in different treatment arms. Further, our one-year prospective study period allowed adequate time in which social interactions could occur.
Conclusions
To our knowledge, we provide the first evidence that average CRC test use among neighbors is significantly and positively associated with individual CRC test use. Our results support a role for social influence in explaining regional “spillover” effects in cancer screening behaviors documented by Vogt et al. (2014). Future research should further characterize, quantify, and test the causality of direct social influence (endogenous effects) as a potential mechanism driving spatial dependence in health behaviors. If such influence is real, future cancer screening interventions might leverage it, for example by engaging peers in screening invitation. Many health behavior RCTs enroll participants who are close neighbors. We recommend that researchers conducting RCTs of health behavior interventions assess and adjust for spatial dependence and social interaction between participants in the design of the RCT as appropriate, because such interaction may cause treatment arm contamination.
Highlights.
We assess neighborhood effects in a behavioral randomized controlled trial (RCT)
We characterize neighborhood effects per economics and sociology conceptual models
Average behavior of neighbors was associated with individual behavior
Social interactions between participants could cause contamination in a RCT
Implications for RCT design and analysis in light of neighborhood effects are discussed
Acknowledgments
This work was supported by Cancer Prevention Research Institute of Texas (CPRIT) grants PP100039 and R1208.
Footnotes
Spatial dependence is when the outcome of some random variable at a particular location depends on the outcomes of that same random variable at nearby locations.
References
- ANSELIN L, BERA AK. Spatial dependence in linear regression models with an introduction to spatial econometrics. Statistics Textbooks and Monographs. 1998;155:237–90.
- BAICKER K. The spillover effects of state spending. Journal of Public Economics. 2005;89:529–544.
- BRAMOULLE Y, DJEBBARI H, FORTIN B. Identification of peer effects through social networks. Journal of Econometrics. 2009;150:41–55.
- CARRELL SE, HOEKSTRA M, WEST JE. Is poor fitness contagious?: Evidence from randomly assigned friends. Journal of Public Economics. 2011;95:657–663.
- CHRISTAKIS NA, FOWLER JH. The Spread of Obesity in a Large Social Network over 32 Years. New England Journal of Medicine. 2007;357:370–379. doi: 10.1056/NEJMsa066082.
- COHEN-COLE E, FLETCHER JM. Detecting implausible social network effects in acne, height, and headaches: longitudinal analysis. BMJ. 2008a;337:a2533. doi: 10.1136/bmj.a2533.
- COHEN-COLE E, FLETCHER JM. Is obesity contagious? Social networks vs. environmental factors in the obesity epidemic. J Health Econ. 2008b;27:1382–7. doi: 10.1016/j.jhealeco.2008.04.005.
- COKKINIDES VE, CHAO A, SMITH RA, VERNON SW, THUN MJ. Correlates of underutilization of colorectal cancer screening among U.S. adults, age 50 years and older. Prev Med. 2003;36:85–91. doi: 10.1006/pmed.2002.1127.
- CORRADO L, FINGLETON B. Where is the economics in spatial econometrics? Journal of Regional Science. 2012;52:210–239.
- DIAZ JA, ROBERTS MB, GOLDMAN RE, WEITZEN S, EATON CB. Effect of language on colorectal cancer screening among Latinos and non-Latinos. Cancer Epidemiol Biomarkers Prev. 2008;17:2169–73. doi: 10.1158/1055-9965.EPI-07-2692.
- DICLEMENTE RJ, CROSBY RA, KEGLER MC, editors. Emerging theories in health promotion practice and research. San Francisco, CA: Jossey-Bass; 2002.
- DIETZ RD. The estimation of neighborhood effects in the social sciences: An interdisciplinary approach. Social Science Research. 2002;31:539–575.
- DIEZ ROUX AV. Residential environments and cardiovascular risk. J Urban Health. 2003;80:569–89. doi: 10.1093/jurban/jtg065.
- DIEZ ROUX AV. Estimating neighborhood health effects: the challenges of causal inference in a complex world. Soc Sci Med. 2004;58:1953–1960. doi: 10.1016/S0277-9536(03)00414-3.
- DOUBENI CA, JAMBAULIKAR GD, FOUAYZI H, ROBINSON SB, GUNTER MJ, FIELD TS, ROBLIN DW, FLETCHER RH. Neighborhood socioeconomic status and use of colonoscopy in an insured population--a retrospective cohort study. PLoS One. 2012;7:e36392. doi: 10.1371/journal.pone.0036392.
- DURLAUF SN. Neighborhood effects. Handbook of regional and urban economics. 2004;4:2173–242.
- DURLAUF SN, IOANNIDES YM. Social interactions. Annual Review of Economics. 2010;2:451–478.
- EISENBERG D, QUINN BC. Estimating the Effect of Smoking Cessation on Weight Gain: An Instrumental Variable Approach. Health Services Research. 2006;41:2255–2266. doi: 10.1111/j.1475-6773.2006.00594.x.
- GIBBONS S, OVERMAN HG. Mostly pointless spatial econometrics? Journal of Regional Science. 2012;52:172–191.
- GLANZ K, RIMER BK, LEWIS FM, editors. Health behavior and health education. Theory, research, and practice. San Francisco, CA: Jossey-Bass; 2002.
- GREENE WH. Sample Selection Bias as a Specification Error: A Comment. Econometrica. 1981;49:795–798.
- GUPTA S, HALM EA, ROCKEY DC, HAMMONS M, KOCH M, CARTER E, LV, TONG L, AHN C, KASHNER M, ARGENBRIGHT K, TIRO JA, GENG Z, PRUITT SL, SKINNER CS. Comparative effectiveness of fecal immunochemical test outreach, colonoscopy outreach, and usual care for boosting colorectal cancer screening among the underserved: a randomized trial. JAMA Internal Medicine. 2013;173:1725–32. doi: 10.1001/jamainternmed.2013.9294.
- HILL CR, GRIFFITHS WE, LIM GC. Principles of Econometrics. John Wiley & Sons; 2011.
- IOANNIDES Y. From Neighborhoods to Nations: The Economics of Social Interactions. Princeton University Press; 2012.
- IOANNIDES YM, TOPA G. Neighborhood Effects: Accomplishments and looking beyond them. Journal of Regional Science. 2010;50:343–362.
- KELLEY HH, THIBAUT JW. Interpersonal relations: A theory of interdependence. New York: Wiley-Interscience; 1978.
- KRIEGER N. Theories for social epidemiology in the 21st century: an ecosocial perspective. Int J Epidemiol. 2001;30:668–677. doi: 10.1093/ije/30.4.668.
- LESAGE J, PACE RK. Introduction to Spatial Econometrics. CRC Press/Taylor & Francis Group; 2009.
- LESAGE JP, KELLEY PACE R, LAM N, CAMPANELLA R, LIU X. New Orleans business recovery in the aftermath of Hurricane Katrina. Journal of the Royal Statistical Society: Series A (Statistics in Society). 2011;174:1007–1027.
- LIAN M, SCHOOTMAN M, YUN S. Geographic variation and effect of area-level poverty rate on colorectal cancer screening. BMC Public Health. 2008;8:358. doi: 10.1186/1471-2458-8-358.
- LYONS R. The spread of evidence-poor medicine via flawed social-network analysis. Statistics, Politics, and Policy. 2011:2.
- MACINTYRE S, ELLAWAY A. Neighborhoods and health: An overview. In: Kawachi I, Berkman L, editors. Neighborhoods and health. Oxford University Press; 2003. pp. 45–64.
- MANSKI CF. Identification of endogenous social effects: The reflection problem. The Review of Economic Studies. 1993;60:531–42.
- MANSKI CF. Identification of treatment response with social interactions. The Econometrics Journal. 2013;16:S1–S23.
- MCQUEEN A, VERNON SW, MEISSNER HI, KLABUNDE CN, RAKOWSKI W. Are there gender differences in colorectal cancer test use prevalence and correlates? Cancer Epidemiol Biomarkers Prev. 2006;15:782–791. doi: 10.1158/1055-9965.EPI-05-0629.
- MOBLEY LR, KUO TM, URATO M, SUBRAMANIAN S. Community contextual predictors of endoscopic colorectal cancer screening in the USA: spatial multilevel regression analysis. Int J Health Geogr. 2010;9:44. doi: 10.1186/1476-072X-9-44.
- NOEL H, NYHAN B. The “unfriending” problem: The consequences of homophily in friendship retention for causal estimates of social influence. Social Networks. 2011;33:211–218.
- OAKES JM. Causal inference and the relevance of social epidemiology. Soc Sci Med. 2004a;58:1969–1971. doi: 10.1016/j.socscimed.2003.05.001.
- OAKES JM. The (mis)estimation of neighborhood effects: causal inference for a practicable social epidemiology. Soc Sci Med. 2004b;58:1929–1952. doi: 10.1016/j.socscimed.2003.08.004.
- PARTRIDGE MD, BOARNET M, BRAKMAN S, OTTAVIANO G. Introduction: Whither Spatial Econometrics? Journal of Regional Science. 2012;52:167–171.
- PEREIRA AMO, ROCA-SAGALES O. Spillover effects of public capital formation: evidence from the Spanish regions. Journal of Urban Economics. 2003;53:238–256.
- PRUITT SL, LEONARD T, ZHANG S, SCHOOTMAN M, HALM EA, GUPTA S. Physicians, clinics, and neighborhoods: Multiple levels of influence on colorectal cancer screening. Cancer Epidemiol Biomarkers Prev. 2014;23:1356–55. doi: 10.1158/1055-9965.EPI-13-1130.
- ROBERT SA. Neighborhood socioeconomic context and adult health. The mediating role of individual health behaviors and psychosocial factors. Ann N Y Acad Sci. 1999;896:465–468. doi: 10.1111/j.1749-6632.1999.tb08171.x.
- ROSSI PG, FEDERICI A, BARTOLOZZI F, SARCHI S, BORGIA P, GUASTICCHI G. Understanding non-compliance to colorectal cancer screening: a case control study, nested in a randomised trial [ISRCTN83029072]. BMC Public Health. 2005;5:135. doi: 10.1186/1471-2458-5-139.
- SAINSBURY P. Commentary: Understanding social network analysis. BMJ. 2008:337. doi: 10.1136/bmj.a1957.
- SHALIZI CR, THOMAS AC. Homophily and contagion are generically confounded in observational social network studies. Sociological Methods & Research. 2011;40:211–239. doi: 10.1177/0049124111404820.
- SHARIFF-MARCO S, BREEN N, STINCHCOMB DG, KLABUNDE CN. Multilevel predictors of colorectal cancer screening use in California. Am J Manag Care. 2013;19:205–16.
- SUBRAMANIAN SV. The relevance of multilevel statistical methods for identifying causal neighborhood effects - Commentary. Social Science & Medicine. 2004;58:1961–1967. doi: 10.1016/S0277-9536(03)00415-5.
- TCHETGEN EJ, VANDERWEELE TJ. On causal inference in the presence of interference. Stat Methods Med Res. 2012;21:55–75. doi: 10.1177/0962280210386779.
- TIEBOUT C. A pure theory of local expenditures. Journal of Political Economy. 1956;64:416–24.
- VANDERWEELE TJ, VANDENBROUCKE JP, TCHETGEN EJ, ROBINS JM. A mapping between interactions and interference: implications for vaccine trials. Epidemiology. 2012;23:285–92. doi: 10.1097/EDE.0b013e318245c4ac.
- VOGT V, SIEGEL M, SUNDMACHER L. Examining regional variation in the use of cancer screening in Germany. Soc Sci Med. 2014;110:74–80. doi: 10.1016/j.socscimed.2014.03.033.

