Abstract
Recent advances in computing technologies have enabled the development of low-cost, compact weather and air quality monitors. The U.S. federally funded Array of Things (AoT) project has deployed more than 140 such sensor nodes throughout the City of Chicago. This paper combines a year’s worth of AoT sensor data with household data collected from 450 elderly Chicagoans in order to explore the feasibility of using previously unavailable data on local environmental conditions to improve traditional neighborhood research. Specifically, we pilot the use of AoT sensor data to overcome limitations in research linking air pollution to poor physical and mental health and find support for recent findings that exposure to pollutants contributes to both respiratory and dementia-related diseases. We expect that this support will become even stronger as sensing technologies continue to improve and more AoT nodes come online, enabling additional applications to social science research where environmental context matters.
Keywords: Sensors, GIS, Environmental Exposure, Pollution, Air Quality, Health, Neighborhood
Introduction
The Internet of Things (IoT) and recent advances in low-cost sensor technologies have made it possible to monitor environmental conditions at a local or neighborhood scale with considerably higher resolution than in the past (Benedict, Wayland, & Hagler, 2017). The potential applications of these new sources of micro-environmental data are myriad, including city planning, improved weather forecasting, or conducting basic research on a host of topics, perhaps the most obvious being the impact of the local environment on health. The goal of this paper is to pilot the application of these new sources of “big” data to more standard approaches in the social sciences in order to better understand their potential to improve current studies and, ultimately, open up entirely new fields of research.
Our social science data comes from the Chicago Health and Activity in Real-Time (CHART) project, which has been collecting household, Ecological Momentary Assessment (EMA¹), and GPS tracking data from 450 elderly Chicagoans across 10 diverse city neighborhoods in order to assess the impact of daily activity spaces and social support networks on the health of older adults (see http://www.norc.org/Research/Projects/Pages/chicago-health-and-activity-in-real‐time.aspx). Specifically, CHART is collecting data in three waves using both an in-person survey and five EMA surveys per day over the course of a week in each wave. The project is also using the GPS feature of provided smartphones to track respondents as they go about their daily lives and provide a measure of their weekly “activity spaces.” Although there is a vast literature on community determinants of health and even a small literature on using mobile devices to study health outcomes (York Cornwell & Cagney, 2017), we believe that CHART is the first study to combine EMA and continuous location tracking methodologies with a traditional household survey.
Our study is also breaking new ground by exploiting sensor data made available through the recent deployment of dozens of environmental and air quality nodes across the City of Chicago, Illinois via the Array of Things (AoT) project (Catlett et al., 2017). This paper uses GIS and spatial modelling approaches to demonstrate how such sensor-derived data can be used to facilitate the analysis of associations between environmental conditions and health outcomes. Specifically, we will explore the feasibility of combining data from CHART surveys and AoT air quality sensors and conducting preliminary analyses with the combined dataset, with the ultimate goal of replicating findings from the extant research literature but using this new source of micro-environmental data:
Hypothesis 1: Exposures to higher levels of air pollution in the immediate environment will be associated with chronic respiratory disease.
Hypothesis 2: Exposures to higher levels of air pollution in the immediate environment will be associated with worse overall physical and mental health.
Background and Literature Review
Funded by the U.S. federal National Science Foundation, the Array of Things project (NSF 1532133) is a collaborative effort among leading scientists, universities, and local government to collect real-time data for research and public planning purposes. To date, the project has installed more than 140 sensor nodes in a variety of Chicago neighborhoods in order to assess micro-environmental conditions such as weather, air quality, noise levels, and both human and vehicle traffic flow (Catlett et al., 2017; see also https://arrayofthings.github.io/). We estimate that more than a third of CHART respondents live within 1 km of an AoT sensor node and over 80% live within 2 km of a node, allowing us to add environmental contextual data to much of the survey, EMA, and GPS data that has been collected by the project to date. Working with the AoT project team, we first accessed and downloaded two months (July 2018 and January 2019) of sensor data and conducted preliminary analysis in order to understand the data structure and availability. Further, we developed environmental metrics using raster values for a variety of weather (temperature, humidity) and air quality (PM2.5², PM10, O3, H2S³, NO2⁴) data and merged those measures with health variables from the CHART survey (asthma, chronic obstructive pulmonary disease (COPD), overall health) as part of a small pilot study presented at the American Association for Public Opinion Research (AAPOR) 2019 annual meeting (English et al., 2019). Here, we expand this pilot in order to fully demonstrate the feasibility of integrating networked sensor data with traditional social science research by using AoT data in combination with CHART survey data to test whether respondent respiratory and overall health are correlated with variations in air quality across Chicago neighborhoods.
Indeed, a growing body of research highlights the health impacts of both long- and short-term exposure to a variety of airborne pollutants, leading the World Health Organization to issue air quality guidelines in 2005 (https://www.who.int/airpollution/publications/aqg2005/en/). The earliest studies focused on the link between air pollution and both cardiovascular and respiratory diseases (e.g. Dominici et al., 2006; Miller et al., 2007; Keet, Keller, & Peng, 2017), while more recent studies have established a connection between exposure to particulate matter and changes to brain structure (e.g. Chen et al., 2015; Casanova et al., 2016). Most recently, the neurotoxicity of fine particulate matter has been implicated in the growing epidemic of cognitive decline and related illnesses such as Alzheimer’s disease (Kilian & Kitazawa, 2018). In fact, a systematic review of the most rigorous research in this area concludes that “almost all reported an adverse association between at least one pollutant and one dementia-related outcome” (Power et al., 2016), while an even more recent study found evidence for “the synergistic effects of neighborhood social stressors and environmental hazards on residents’ health,” suggesting that the urban poor may be most susceptible to the cognitive effects of exposure to outdoor air pollution (Ailshire, Karraker, & Clarke, 2017).
However, all of the preceding research suffers from a fundamental methodological weakness: the inability to measure an individual’s actual exposure to pollutants. While some gases and finer particles may disperse relatively evenly on scales of neighborhoods under ideal circumstances, the spatial distribution of pollution in urban areas is likely to be highly uneven due to the complex air flows created by the “built environment” such as high-rise buildings, networks of roads, and variable vehicle traffic patterns (Sabatino, Buccolieri, & Kumar, 2018). Although the U.S. Environmental Protection Agency (EPA) maintains a small number of air quality sensors in cities, researchers typically must estimate local or neighborhood levels of pollution through a variety of modelling techniques such as the EPA’s Downscaler Model (https://www.epa.gov/air-research/downscaler-model-predicting-daily-air-pollution) or that developed by USC’s Spatial Sciences Institute for Los Angeles County (https://spatial.usc.edu/custom-geospatial-datasets/). However, the accuracy of such models will degrade at smaller scales such as neighborhoods, particularly for solid pollutants that do not stay airborne for long, suggesting that such hazards “will be measured more accurately in areas where there is a denser network of monitors than in areas with a sparse network” (Ailshire, Karraker, & Clarke, 2017).
In this paper we aim to use the network of AoT sensors deployed across Chicago neighborhoods to more accurately measure individual exposure to pollutants, which in turn potentially “increases power to detect a true health effect of air pollution” (Power et al., 2016). Our approach is to combine AoT sensor data with survey data collected by the CHART study, which is based on a neighborhood probability sample of 450 adults age 65 and older in ten purposively-selected Chicago neighborhoods (York Cornwell & Cagney, 2019). Individual neighborhoods were chosen to ensure geographic and socio-demographic variation across the city of Chicago, which is highly segregated by race/ethnicity and socioeconomic status. Respondents were selected via a systematic random sample of addresses within each identified neighborhood of interest. CHART began with an in-person interview, followed by one week of smartphone-based observation of respondents’ activity spaces. The in-person interview included a baseline questionnaire capturing respondents’ social networks, demographic characteristics, and both physical and mental health status. Our analytic sample for this study includes 346 of these respondents who had complete data on sociodemographic characteristics in the baseline survey. We describe the associated AoT sensor data in more detail below.
Assessing Exposure to Pollution
Power et al. (2016) report that the majority of studies linking air quality to dementia focused on airborne particulate matter (PM), although some researchers looked at ozone (O3), nitrogen dioxide (NO2), sulfur dioxide (SO2), carbon monoxide (CO), or some combination thereof. Power et al. (2016) also note that “there was significant variation in the spatial resolution of assigned air pollution exposures,” with estimates often being based on an individual’s address, city of residence, county, postcode, or census tract. As mentioned above, however, all of these previous studies had to make inferences about a subject’s exposure to pollutants given the paucity of monitors, sometimes just assigning averages from city or regional EPA monitors or using statistical models to predict exposure. For example, Ailshire, Karraker, & Clarke (2017) used an average of PM2.5 values from monitoring sites within a 60km radius of the respondent’s census tract centroid. This geographic source of measurement error is addressed to a large degree in our study by utilizing data from the AoT sensors scattered throughout the City of Chicago. Indeed, Power et al. (2016) observe that studies with higher levels of variation in measured PM2.5 were more likely to find a significant impact on cognitive function, concluding that “measurement error may possibly explain some heterogeneity observed in cognitive effect estimates for different pollutants and from different studies, geographic regions, and time periods.”
Data and Analytic Methods
AoT sensors have been installed at intersections and other locations with access to electricity in the City of Chicago, where a set of parameters describing weather and air quality conditions was recorded every second of each day from July 2018 to July 2019 (see Table 1). Figure 1 shows the coverage of AoT nodes relative to the home locations of CHART respondents in the City of Chicago, with more than 80% living within 2 kilometers of a node.
Table 1.
Descriptions of AoT sensor parameters (July 2018 - July 2019)
Parameter | Unit | Lower Range Limit | Upper Range Limit | Number of nodes with any measure | Number of nodes with sufficient and valid time series measure | Annual min measured value | Annual mean measured value | Annual max measured value |
---|---|---|---|---|---|---|---|---|
Temperature | Fahrenheit | – | – | 93 | 35 | 49.205 | 55.147 | 61.688 |
Humidity | Relative (%) | 0 | 100 | 93 | 36 | 61.578 | 76.931 | 90.187 |
PM10 | µg/m³ | 0 | – | 12 | 12 | 0.000 | 0.083 | 1.001 |
PM2.5 | µg/m³ | 0 | – | 12 | 12 | 0.000 | 0.073 | 0.871 |
CO | ppm | 0 | 1000 | 55 | 55 | 0.122 | 0.830 | 4.726 |
O3 | ppm | 0 | 20 | 57 | 53 | 0.000 | 0.034 | 0.097 |
NO2 | ppm | 0 | 20 | 57 | 56 | 0.000 | 0.013 | 0.036 |
SO2 | ppm | 0 | 20 | 57 | 57 | 0.068 | 0.469 | 2.066 |
H2S | ppm | 0 | 50 | 56 | 56 | 0.006 | 0.124 | 0.427 |
– indicates no minimum or maximum sensor limit
Figure 1.
Overlap between AoT nodes and CHART neighborhoods, Chicago, US. Sensor locations are shown as black dots with 1- and 2-kilometer circular buffers.
We suspect that a few sensors may not have been functioning properly some of that time, which may introduce bias in our findings. We identified and removed potentially unreliable sensor measurements based on previous studies that indicate the range (minimum and maximum) of observed instantaneous air pollutant concentrations at various locations in major US cities (Stern, 1968; Jing et al., 2014). Specifically, we excluded the AoT data points where temperature exceeded the highest or lowest recorded values in Chicago and data points where relative humidity level was less than zero percent or greater than 100 percent. Some excluded data points reported air pollutant concentrations above the maximum level of instantaneous concentrations found in the literature (Stern, 1968). In particular, data points where O3 > 1 ppm, H2S > 1 ppm, SO2 > 5 ppm, NO2 > 1 ppm and CO > 100 ppm were considered extremely rare and were removed from the dataset for subsequent modeling.
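The range checks described above can be sketched as a simple filter. This is only an illustration: the parameter names and the sample readings are hypothetical, and the temperature limits are left as an argument because the text does not state the record values that were used.

```python
# Upper plausibility limits (ppm) quoted in the paragraph above.
GAS_MAX_PPM = {"o3": 1.0, "h2s": 1.0, "so2": 5.0, "no2": 1.0, "co": 100.0}


def is_plausible(parameter, value, temp_range=None):
    """Return True if a single reading passes the range checks.

    parameter  -- hypothetical lower-case parameter name, e.g. "o3"
    temp_range -- (low, high) record temperatures; not given in the text,
                  so the caller has to supply them
    """
    if parameter in GAS_MAX_PPM:
        return value <= GAS_MAX_PPM[parameter]
    if parameter == "humidity":
        return 0.0 <= value <= 100.0
    if parameter == "temperature" and temp_range is not None:
        low, high = temp_range
        return low <= value <= high
    return True  # parameters without a stated limit are kept


# Made-up readings, for illustration only.
readings = [("o3", 0.034), ("o3", 1.7), ("co", 250.0),
            ("humidity", 104.2), ("so2", 0.47)]
kept = [r for r in readings if is_plausible(*r)]
print(kept)
```

Keeping the limits in one dictionary makes it easy to see which thresholds came from the literature and to rerun the analysis with different cut-offs.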
We derived annual average measures of temperature, humidity, PM2.5, PM10, O3, CO, SO2, H2S, and NO2 by first calculating an unweighted average of monthly measures for each AoT node that recorded observations for that parameter. Monthly values were then aggregated to create a node-specific annual average. Because data collection was uneven across time, long-term estimates are subject to greater uncertainty than short-term estimates. We therefore applied selection criteria to identify the subset of sensor nodes with sufficient temporal coverage to assess the annual average. Temperature and humidity both exhibit considerable temporal variability; therefore, nodes with fewer than 10 months of weather measurements were not included in the calculation of annual average temperature and humidity. We used inverse distance weighting (IDW) to interpolate annual average values from node locations to a raster of 200m-by-200m grid cells across the study area (Burrough & McDonnell, 1998; Pebesma, 2004). Finally, the mean of the raster values within a 250m radius of each respondent’s home address was calculated and assigned to that respondent as the measure of environmental exposure.
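The interpolation and buffering steps can be illustrated with a minimal, standalone sketch. It is not the software used in the study: the distance exponent of 2, the grid alignment, the use of cell centres to decide buffer membership, and the example coordinates are all assumptions made for illustration.

```python
import math


def idw(x, y, nodes, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    nodes -- list of (node_x, node_y, annual_mean) in projected metres
    power -- distance exponent; 2 is an assumed value, not one reported
             in the text
    """
    num = den = 0.0
    for nx, ny, value in nodes:
        d = math.hypot(x - nx, y - ny)
        if d == 0.0:
            return value  # exactly on a node: use its value directly
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


def home_exposure(home_x, home_y, nodes, cell=200.0, radius=250.0):
    """Mean of interpolated cell-centre values within `radius` of a home."""
    steps = int(radius // cell) + 1
    values = []
    for i in range(-steps, steps + 1):
        for j in range(-steps, steps + 1):
            # Centre of a grid cell on a grid aligned to the origin (assumed).
            cx = (math.floor(home_x / cell) + i + 0.5) * cell
            cy = (math.floor(home_y / cell) + j + 0.5) * cell
            if math.hypot(cx - home_x, cy - home_y) <= radius:
                values.append(idw(cx, cy, nodes))
    return sum(values) / len(values)


# Two hypothetical nodes 1 km apart with made-up annual means.
nodes = [(0.0, 0.0, 10.0), (1000.0, 0.0, 20.0)]
midpoint = idw(500.0, 0.0, nodes)
exposure = home_exposure(500.0, 0.0, nodes)
print(midpoint, exposure)
```

Because IDW is a weighted average of the node values, every interpolated cell, and therefore every respondent-level exposure, lies between the smallest and largest node averages.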
As noted above, the CHART sample was randomly drawn from a frame of addresses in 10 diverse neighborhoods. A team of field interviewers then screened each address for having a resident aged 65+ who could also complete the informed consent process (as a measure of cognitive capacity) and would agree to carry a project-provided smartphone to collect GPS and EMA data. Recruitment ended after 450 completed interviews, which contained a series of self-reported health questions. We then used R version 3.6.1 to statistically assess associations between a subset of these health variables and the environmental conditions around each respondent’s home address. Note that all air quality variables were log2-transformed and centered to be at the same numeric scale as other independent variables and to improve the ease of model interpretation. Multicollinearity among air quality variables was also tested by calculating the Pearson product moment correlation, and highly correlated variables (|r| ≥ 0.70) were removed from subsequent analysis. Specifically, we found high correlations between PM10 and PM2.5 (r = 1); PM10, PM2.5 and CO (r = 0.71); PM10, PM2.5 and O3 (r = −0.70); CO and O3 (r = 0.78); and NO2 and H2S (r = −0.76). Because O3 and NO2 are “criteria air pollutants” that can cause asthma and lung damage (https://www.epa.gov/criteria-air-pollutants), we excluded PM10, PM2.5, CO and H2S since they are highly correlated with O3 and NO2.
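The transformation and collinearity screen can be sketched in a few lines. The analysis itself was run in R; the Python below is a hedged illustration with made-up exposure values and hypothetical column names. It flags pairs by the absolute value of r, which is an interpretation based on the negative correlations listed above.

```python
import math


def log2_center(values):
    """Log2-transform strictly positive values and centre them at zero."""
    logs = [math.log2(v) for v in values]
    mean = sum(logs) / len(logs)
    return [v - mean for v in logs]


def pearson(a, b):
    """Pearson product-moment correlation of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def correlated_pairs(columns, threshold=0.70):
    """List variable pairs whose |r| reaches the threshold."""
    names = sorted(columns)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(columns[a], columns[b])
            if abs(r) >= threshold:
                flagged.append((a, b, round(r, 2)))
    return flagged


# Made-up exposures for four respondents (illustration only).
raw = {"pm10": [1.0, 2.0, 4.0, 8.0],
       "pm25": [0.5, 1.0, 2.0, 4.0],
       "no2": [4.0, 1.0, 8.0, 2.0]}
transformed = {name: log2_center(vals) for name, vals in raw.items()}
flagged = correlated_pairs(transformed)
print(flagged)
```

After the log2 transform, a one-unit change in a predictor corresponds to a doubling of the pollutant concentration, which is what makes the odds ratios in the tables interpretable as per-doubling effects.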
Pollution and respiratory health
In the CHART household survey, respondents were asked whether or not they have ever been diagnosed with any respiratory disease, including emphysema, asthma, chronic bronchitis, or chronic obstructive pulmonary disease. Out of 343 elderly respondents, 11.7% said they had been diagnosed with respiratory disease. For these binary respiratory health outcomes, we fit four logistic regression models and examined the adequacy of the models by performing a Hosmer-Lemeshow test. Likelihood ratio tests and the comparisons of AIC and BIC were employed to assess whether the addition of environmental or control variables improves the model’s goodness-of-fit.
We began by examining the effect of air quality and weather conditions on respiratory health in a separate logistic model. Individual-level socioeconomic, demographic and lifestyle characteristics--including gender, age, race, ethnicity, education, employment, whether travel is typically by car, smoking history and length of residency in the neighborhood--were considered exclusively in the second logistic model. Our third model combines both environmental and confounding variables. In the last model, interactions between length of residency and air quality were added to the third model in order to distinguish between shorter- and longer-term exposures to air pollution in a given neighborhood.
Pollution and overall health
The CHART household survey asked respondents to self-rate their physical and mental health on a 5-point scale (poor, fair, good, very good and excellent), with 29.2% and 13.3% indicating that they had fair-to-poor physical and mental health, respectively. To account for the ordinal character of the physical and mental health outcome variables, we used an ordered logistic model with the Lipsitz goodness-of-fit test. Physical and social activity levels were also added as potential confounders in the models of physical and mental health status, respectively. The CHART baseline survey asked respondents about the type and amount of physical activity involved in their daily life on a scale of 1 to 4. Specifically, respondents were asked how often they take part in three types of sports or activities: vigorous (such as running or jogging), moderate (such as gardening or walking at a moderate pace), and mild (such as vacuuming, laundry, or home repairs). We summed the self-evaluated values from these questions to create an indicator of each respondent’s physical activity level. In addition, the CHART baseline survey asked, on a scale of 1 to 6, how frequently in the past 6 months respondents did volunteer work, attended meetings of any organized group, got together socially with friends or relatives, or attended religious services. Similarly, we summed the values from these questions to build a composite measure that reflects a respondent’s typical level of social activity.
We began by fitting an ordered logistic model under the proportional odds (PO) assumption, taking into account environmental, sociodemographic and lifestyle behavioral factors. The PO assumption states that each predictor’s effect on the odds of being at or beyond any given level of the response variable is the same at every cutoff. As such, the slope estimate of a PO model summarizes each independent variable’s relationship to the outcome variable across all cutoff values. We investigated the plausibility of the PO assumption for each independent variable. Where the assumption is violated, a partial proportional odds (PPO) model that relaxes it can be fit instead. In a PPO model, the effects of independent variables are allowed to vary across splits, while the effects of independent variables found to be stable are held constant as in the PO model.
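To make the PO assumption concrete, the sketch below computes the probability of being at or above each cutoff on a five-level outcome and shows that a one-unit change in a predictor multiplies the odds by the same factor at every cutoff. The cutpoints and the slope are hypothetical values chosen for illustration, and the parameterization (log-odds of being at or above a level) is an assumption about how the model is written.

```python
import math


def exceedance_probs(cutpoints, beta, x):
    """P(Y >= level) at each cutoff under proportional odds.

    The log-odds at cutoff j are cutpoints[j] + sum(beta_k * x_k), so the
    same slopes apply at every cutoff; only the intercept changes.
    """
    eta = sum(b * v for b, v in zip(beta, x))
    return [1.0 / (1.0 + math.exp(-(a + eta))) for a in cutpoints]


def category_probs(cutpoints, beta, x):
    """Probability of each outcome level, from lowest to highest."""
    p = [1.0] + exceedance_probs(cutpoints, beta, x) + [0.0]
    return [p[j] - p[j + 1] for j in range(len(p) - 1)]


def odds(p):
    return p / (1.0 - p)


# Hypothetical values: four decreasing cutpoints for a 5-level outcome
# and a single predictor with a negative slope.
cutpoints = [3.0, 1.0, -0.5, -2.0]
beta = [-0.75]

base = exceedance_probs(cutpoints, beta, [0.0])
shifted = exceedance_probs(cutpoints, beta, [1.0])
ratios = [odds(s) / odds(b) for s, b in zip(shifted, base)]
print(ratios)  # every entry equals exp(-0.75) up to rounding
print(category_probs(cutpoints, beta, [1.0]))
```

A PPO model would replace the shared `beta` with cutoff-specific slopes for the predictors that violate the assumption, so the entries of `ratios` would no longer be identical for those predictors.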
Results of Statistical Analyses
Figure 2 shows that there is considerable variation in temperature, humidity and air pollutants across space, as might be expected given the well-known but little-studied effect of Lake Michigan on Chicago weather (Changnon & Jones, 1972). For example, the eastern parts of the City of Chicago along Lake Michigan were exposed to both higher temperatures, perhaps due to the warming effect of the Lake in colder weather, and higher levels of relative humidity, perhaps due to proximity to a large body of water. Higher CO concentrations were predominantly observed in the western area and along the northeastern lakeside. In contrast, elevated concentrations of O3, H2S and SO2 were observed in neighborhoods where population density is highest and in and around the city center. Concentrations of PM10 and PM2.5 were highest in the industrial southwestern part of the Chicago area, while the majority of the western border was exposed to higher levels of NO2.
Figure 2.
Weather measures and air pollutant concentrations interpolated from AoT sensors, Chicago, US. Annual average values (July 2018 to July 2019) are shown at each monitor location.
Table 2 shows the relationship between environmental conditions and respiratory health. Since the Hosmer-Lemeshow test indicates a lack of fit for Model 1 (p = 0.047), we only discuss results of Models 2, 3 and 4, which consider confounding factors. AIC, BIC, and the likelihood ratio test between Models 2 and 3 (p = 0.393) suggest that the addition of environmental exposure variables does not significantly improve model fit. Model 3 also suggests that none of the air pollutant variables are significantly associated with respiratory diseases. Interestingly, after adjustment for both covariates and the interaction effects between air pollution exposure and length of residency, elevated NO2 concentrations appear to be associated with a higher likelihood of having respiratory disease, but only for people who had lived in the neighborhood for 25–50 years (Table 2, Model 4, p = 0.04). The significant NO2 interaction term implies that negative effects of NO2 on respiratory health are evident only for long-term rather than short-term residents. None of the models demonstrate a significant association between sociodemographic factors and respiratory disease (Table 2, Models 2, 3, and 4). In contrast, all three models agree that those who ever smoked regularly had, as expected, much higher odds (131.7% higher, p = 0.006; 132.5%, p = 0.007; and 148.4%, p = 0.005) of suffering from emphysema, COPD, or other related respiratory diseases (Table 2, Models 2, 3, and 4).
Table 2.
Logistic regression: environmental exposure and respiratory health
Model 1: environmental exposure variables only. Model 2: control variables only. Model 3: environmental exposure and control variables. Model 4: environmental exposure variables, control variables and interaction effects.
Independent Variable | Model 1 Odds Ratio | Model 1 P Value | Model 2 Odds Ratio | Model 2 P Value | Model 3 Odds Ratio | Model 3 P Value | Model 4 Odds Ratio | Model 4 P Value |
---|---|---|---|---|---|---|---|---|
(Intercept) | 0.209 | 0.000*** | 0.240 | 0.037* | 0.231 | 0.036* | 0.205 | 0.028* |
O3 | 0.660 | 0.108 | | | 0.602 | 0.079 | 0.523 | 0.134 |
NO2 | 0.753 | 0.555 | | | 0.789 | 0.642 | 0.325 | 0.106 |
SO2 | 0.942 | 0.895 | | | 1.200 | 0.714 | 1.524 | 0.560 |
Temperature | 1.096 | 0.524 | | | 1.079 | 0.634 | 1.143 | 0.429 |
Humidity | 1.049 | 0.407 | | | 1.061 | 0.341 | 1.038 | 0.561 |
Male | | | 1.099 | 0.759 | 1.076 | 0.817 | 1.064 | 0.850 |
Age > 75¹ | | | 0.679 | 0.205 | 0.622 | 0.130 | 0.622 | 0.142 |
Highest education level: Bachelor's degree² | | | 1.709 | 0.242 | 1.704 | 0.255 | 1.782 | 0.233 |
Highest education level: Graduate or professional degree² | | | 0.559 | 0.261 | 0.567 | 0.289 | 0.586 | 0.327 |
White³ | | | 0.623 | 0.395 | 0.496 | 0.217 | 0.469 | 0.193 |
Black⁴ | | | 1.035 | 0.954 | 0.904 | 0.869 | 0.882 | 0.841 |
Hispanic⁵ | | | 0.583 | 0.352 | 0.599 | 0.386 | 0.537 | 0.306 |
Currently employed⁶ | | | 0.850 | 0.670 | 0.923 | 0.836 | 0.912 | 0.815 |
Travel typically by car⁷ | | | 1.016 | 0.962 | 1.135 | 0.715 | 1.255 | 0.533 |
Smoke cigarettes, cigars or a pipe regularly⁸ | | | 2.317 | 0.006** | 2.325 | 0.007** | 2.484 | 0.005** |
Have lived at this neighborhood for 25-50 years⁹ | | | 0.793 | 0.485 | 0.901 | 0.763 | 0.853 | 0.666 |
Have lived at this neighborhood for more than 50 years⁹ | | | 0.844 | 0.705 | 0.938 | 0.889 | 0.862 | 0.770 |
O3 : Have lived at this neighborhood for 25-50 years⁹ | | | | | | | 2.728 | 0.159 |
NO2 : Have lived at this neighborhood for 25-50 years⁹ | | | | | | | 16.842 | 0.040* |
SO2 : Have lived at this neighborhood for 25-50 years⁹ | | | | | | | 0.599 | 0.624 |
O3 : Have lived at this neighborhood for more than 50 years⁹ | | | | | | | 1.366 | 0.760 |
NO2 : Have lived at this neighborhood for more than 50 years⁹ | | | | | | | 9.546 | 0.254 |
SO2 : Have lived at this neighborhood for more than 50 years⁹ | | | | | | | 1.025 | 0.987 |
Hosmer-Lemeshow test | Statistic = 14.190 | p = 0.047* | Statistic = 3.822 | p = 0.873 | Statistic = 9.925 | p = 0.2703 | Statistic = 2.328 | p = 0.969 |
Residual Deviance | 315.78 | | 302.18 | | 296.99 | | 290.74 | |
AIC | 327.776 | | 328.18 | | 332.986 | | 338.740 | |
BIC | 350.802 | | 378.069 | | 402.066 | | 430.845 | |
Analytical Sample Size = 343
* P < 0.05
** P < 0.01
*** P < 0.001.
Odds ratio greater than 1 represents negative effects on respiratory health, odds ratio smaller than 1 represents positive effects.
¹ reference category: Age <= 75
² reference category: High school graduate and below
³ reference category: Non-White
⁴ reference category: Non-Black
⁵ reference category: Non-Hispanic
⁶ reference category: Currently unemployed
⁷ reference category: Travel typically by walking or taking transit such as cab, Uber/Lyft, bus and train
⁸ reference category: Do not ever smoke cigarettes, cigars or a pipe regularly
⁹ reference category: Have lived at this neighborhood for less than 25 years.
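The fit statistics in Table 2 can be cross-checked from the residual deviances alone. The sketch below is a worked calculation, not the authors' code: the parameter counts (6, 13, 18 and 24, including the intercept) are inferred from the rows of the table, and the chi-square tail probability is computed with a standard recurrence so that only the Python standard library is needed.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable (integer df >= 1)."""
    half = x / 2.0
    if df % 2:                      # odd df: start from the df = 1 tail
        q, k = math.erfc(math.sqrt(half)), 1
    else:                           # even df: start from the df = 2 tail
        q, k = math.exp(-half), 2
    while k < df:
        # Q(k + 2) = Q(k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q


n = 343                                          # analytical sample size
deviance = {1: 315.78, 2: 302.18, 3: 296.99, 4: 290.74}
n_params = {1: 6, 2: 13, 3: 18, 4: 24}           # inferred from Table 2 rows

# For a binary outcome the residual deviance equals -2 * log-likelihood.
aic = {m: deviance[m] + 2 * n_params[m] for m in deviance}
bic = {m: deviance[m] + n_params[m] * math.log(n) for m in deviance}

# Likelihood ratio test: Model 3 adds five environmental terms to Model 2.
lr_stat = deviance[2] - deviance[3]
lr_df = n_params[3] - n_params[2]
lr_p = chi2_sf(lr_stat, lr_df)

print(aic, bic)
print(round(lr_stat, 2), lr_df, round(lr_p, 3))
```

Under these assumptions the computed AIC and BIC agree with the tabulated values to within rounding, and the likelihood ratio test between Models 2 and 3 gives p of about 0.393, matching the value reported in the text.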
In the case of modeling physical and mental health outcomes, the proportional odds assumption (that odds ratios are homogeneous across all cut-points) was satisfied. Lipsitz tests indicate that the fitted ordered logit models describe the data well (Table 3, p = 0.971 for physical health and p = 0.065 for mental health). Among all air pollutants, only NO2 exhibits a statistically significant negative association with physical health (Table 3, p = 0.045). A 2-fold increase in NO2 exposure would decrease the odds of having better physical health by 52.9%. We also found that respondents who have graduate or professional degrees are more likely to have good physical health, compared to those who earned a high school diploma or below (p = 0.019). As expected, people who smoked regularly had a lower likelihood of having good physical health (Table 3, p = 0.027). Although no environmental or sociodemographic factors were significantly associated with mental health, we did find that people who are more socially active tend to have better mental health (Table 3, p = 0.013).
Table 3.
Ordered logit model: environmental exposure and health outcomes
Independent Variable | Physical Health Odds Ratio | Physical Health P Value | Mental Health Odds Ratio | Mental Health P Value |
---|---|---|---|---|
(Intercept):1 | 45.087 | 0.000*** | 26.832 | 0.000*** |
(Intercept):2 | 4.067 | 0.010** | 3.971 | 0.008** |
(Intercept):3 | 0.665 | 0.448 | 0.528 | 0.214 |
(Intercept):4 | 0.130 | 0.000*** | 0.158 | 0.000*** |
O3 | 0.978 | 0.911 | 1.265 | 0.237 |
NO2 | 0.471 | 0.045* | 1.155 | 0.707 |
SO2 | 0.897 | 0.754 | 0.858 | 0.660 |
Temperature | 0.968 | 0.748 | 0.986 | 0.890 |
Humidity | 1.080 | 0.073 | 0.991 | 0.840 |
Male | 1.326 | 0.185 | 1.055 | 0.801 |
Age > 75¹ | 0.762 | 0.193 | 0.816 | 0.332 |
Highest education level: Bachelor's degree² | 0.824 | 0.568 | 1.149 | 0.682 |
Highest education level: Graduate or professional degree² | 2.094 | 0.019* | 1.358 | 0.341 |
White³ | 1.650 | 0.160 | 1.231 | 0.558 |
Black⁴ | 1.041 | 0.919 | 0.932 | 0.858 |
Hispanic⁵ | 0.696 | 0.325 | 0.605 | 0.174 |
Currently employed⁶ | 1.170 | 0.547 | 0.675 | 0.127 |
Travel typically by car⁷ | 1.059 | 0.805 | 1.093 | 0.701 |
Smoke cigarettes, cigars or a pipe regularly⁸ | 0.633 | 0.027* | 0.770 | 0.207 |
Have lived at this neighborhood for 25-50 years⁹ | 0.664 | 0.085 | 1.119 | 0.633 |
Have lived at this neighborhood for more than 50 years⁹ | 1.402 | 0.278 | 1.105 | 0.749 |
Physical activity level | 0.959 | 0.252 | | |
Social activity level | | | 1.055 | 0.013* |
Lipsitz Goodness of Fit Test | Statistic = 2.8144 | P = 0.971 | Statistic = 16.1 | P = 0.065 |
Analytical Sample Size | 341 | | 338 | |
* P < 0.05
** P < 0.01
*** P < 0.001.
Odds ratio greater than 1 represents positive effects on physical/mental health, odds ratio smaller than 1 represents negative effects.
¹ reference category: Age <= 75
² reference category: High school graduate and below
³ reference category: Non-White
⁴ reference category: Non-Black
⁵ reference category: Non-Hispanic
⁶ reference category: Currently unemployed
⁷ reference category: Travel typically by walking or taking transit such as cab, Uber/Lyft, bus and train
⁸ reference category: Do not ever smoke cigarettes, cigars or a pipe regularly
⁹ reference category: Have lived at this neighborhood for less than 25 years.
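The percentage statements in the text follow directly from the tabulated odds ratios. The short sketch below shows the conversion and, because the air quality variables are log2-transformed, how an odds ratio per doubling could be rescaled to other fold changes. The function names are hypothetical, and the 10% example is an illustration rather than a result reported in the paper.

```python
import math


def pct_change_in_odds(odds_ratio):
    """Percentage change in the odds implied by an odds ratio."""
    return (odds_ratio - 1.0) * 100.0


def odds_multiplier(or_per_doubling, fold_change):
    """Odds multiplier for an arbitrary fold change in a log2-scaled
    predictor, given the odds ratio per doubling."""
    return or_per_doubling ** math.log2(fold_change)


smoking_model2 = pct_change_in_odds(2.317)   # Table 2, Model 2
no2_physical = pct_change_in_odds(0.471)     # Table 3, physical health

print(round(smoking_model2, 1))   # 131.7, i.e. 131.7% higher odds
print(round(no2_physical, 1))     # -52.9, i.e. 52.9% lower odds

# Illustration only: multiplier implied by a 10% increase in NO2.
print(odds_multiplier(0.471, 1.10))
```

The first two results reproduce the 131.7% and 52.9% figures quoted in the text from the odds ratios of 2.317 and 0.471.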
Discussion of Results
Overall, our two analyses provide tentative support for previous research linking air pollution to poor health, although the correlation between different kinds of pollutants makes it difficult to identify the exact pollutant responsible. The lack of significant findings on some air pollutants may be due in part to a lack of power given the small number of nodes collecting air quality data and to the unevenness in both the spatial and temporal coverage of sensor data. In terms of temporal unevenness, not all sensor nodes have sufficient data across all months during the study period. This required us to use an annual average rather than more fine-grained temporal units, which may bias our estimates of the annual average towards certain months or seasons and thus may not necessarily be representative. In terms of spatial unevenness, the current set of nodes is not dispersed uniformly across all neighborhoods in Chicago. The local trends are subject to greater uncertainty than the regional trends because of the relatively small number of sensor nodes in some CHART neighborhoods, particularly nodes that have sensors for measuring PM2.5 and PM10 concentrations. Data quality and quantity for such particulate matter, which is suspected of causing a wide range of diseases, should improve over time as more nodes come online over the next few years. Furthermore, although AoT represents much better coverage than previous research, we still expect there to be some variation in exposures below the 1–2km resolution currently possible with AoT data due to neighborhood or block differences in weather patterns, physical structures, and traffic flow.
As noted above, the accuracy of the AoT air quality data may vary as well. For example, each node includes a board that hosts seven low-cost, experimental printed electrochemical gas sensors to measure five gas-phase species (O3, NO2, CO, SO2 and H2S). Because these boards rely on low-cost sensors, “characterizing the sensor responses is critical to understanding the applicability of the AoT for urban air quality issues” (Potosnak et al., 2018). As a first step, the AoT team installed several nodes with these boards at an EPA air quality monitoring site that has Federal Reference Method monitors for O3, NO2, and SO2. After collecting data for seven months, results show a strong correlation between the AoT and EPA values for O3 and NO2 (r² > 0.65). However, SO2 levels generally remained below the detection level of the AoT sensor, while evaluation of the CO and PM sensors was still ongoing as of January 2020 (Potosnak et al., 2018; Potosnak et al., 2019). The lower reliability of the necessarily less expensive air quality sensors used by the AoT, then, introduces an unknown amount of measurement error into our findings. Moving forward, it may be possible to use both the AoT and EPA air quality data together via a modeling-based approach for enhanced analytical utility.⁵
The CHART survey data themselves are also based on self-reports and do not distinguish between poor mental health and the dementia-related diseases that are the primary subject of previous research. Indeed, the recruitment process itself was designed to screen out those suffering from the later stages of cognitive decline. We are currently exploring the possibility of using survey response-time data over all three waves of data collection to more directly assess cognitive function in our elderly respondents. Our health questions also do not surface pre-existing conditions that might have predisposed respondents to health problems later in life, potentially leading to an overestimate of the association between air quality and health, nor does the survey distinguish between overall respiratory health and those respiratory conditions that are more likely to be caused by exposure to air pollution such as COPD, perhaps leading to an underestimate.
Finally, our analysis shares some limitations common to previous research in this area. One is the lack of a strong theoretical foundation for including certain pollutants over others when studying specific health effects, especially given the high correlation among air quality variables. Although we have included control variables that represent individual sociodemographic and lifestyle characteristics, there are other individual- or area-level factors that we may need to take into account, such as income, poverty, and access to healthcare infrastructure. Given that both the AoT and CHART data are hierarchically structured and potentially spatially clustered, it is worthwhile to test the applicability of spatial hierarchical models that account for spatial dependency among observations and control for potential confounding factors at multiple levels. Although local air quality sensors open the door to studying the effects of short-term exposures to pollution in the natural environment, an area of research not feasible with previous sensing technologies (Benedict, Wayland, & Hagler, 2017), another limitation is the absence of historical air quality data that would be needed to measure exposure to pollutants over the life course. The bulk of the literature on the health effects of pollution similarly lacks the required longitudinal data to measure long-term exposure, a serious limitation noted by Power et al. (2016) in their systematic review of the literature:
Dementia marks the end of a protracted period of pre-clinical accumulation of pathology and cognitive decline. Thus, the most relevant exposure period may be years to decades prior to dementia onset....Most of the studies in our review generated estimates of exposure averaged over one year prior to or concurrent with the year of outcome assessment….Therefore, almost all of the studies in this review implicitly assumed that current exposure levels are adequate surrogates for past exposure levels. This can be a strong assumption, especially over longer time intervals between the key exposure window and the health endpoint and in studies without consideration of residential mobility.
Historical data are of course available on a regional level, but rare is the study that is able to associate those data with specific individuals as they move from place to place over a period of years or decades (Khan et al., 2019). Moving forward, the Array of Things will be able to provide longitudinal data on actual exposures to pollution when used in conjunction with longitudinal studies such as CHART.
Conclusion and Next Steps
Although this paper explored the feasibility of using networked sensor data for social science research, the implications reach far beyond studying the impact of neighborhood air quality on chronic health conditions. We feel that much of the current value of the Array of Things is in studying real-time phenomena such as bouts of asthma or hospitalizations due to other illnesses that can be triggered by short-term exposure to poor air quality or inclement weather (Silva et al., 2018). For example, the CHART project will use this study to model the fluctuations in emotional and physical health being measured by EMA that might be due to environmental factors such as temperature, humidity, and ambient noise (York Cornwell, Cagney, & Hawkley, 2019). We also expect that local weather conditions, which have been shown to affect outdoor activity levels across both days (Aspvik et al., 2018) and seasons (Tucker & Gilliland, 2007), will be important predictors of variations in the activity spaces of elderly Chicagoans, a key outcome of the CHART study. The Array of Things allows us, for the first time, to measure these variables at the neighborhood scale needed to isolate their localized effect.
Beyond the CHART project, however, our findings suggest that recent advances in sensing and computing technologies exemplified by the AoT have the potential to inform the study of any behavioral or social phenomenon where environmental context matters. Moving forward, our next steps for refining and expanding this approach include measuring activity spaces using GPS tracking data collected by the CHART project in order to assess the value of local measures of weather, quantifying household exposure to air pollution at fine scales by using ancillary information and considering the spatial autocorrelation among air pollutant observations, and examining how the relationship between air pollution exposure and health status might vary when exposure is estimated at different spatial scales: (1) at the respondent’s home address, (2) in the respondent’s home neighborhood, and (3) within the respondent’s daily activity space. Characterizing the space within which people move during the course of their day-to-day activities may provide new insights into environmental exposure and associated health effects.
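The three spatial scales listed above can be sketched in a few lines. The example uses inverse distance weighting (IDW) only because it is compact; it is not necessarily the interpolation method used in this paper, and it ignores the spatial autocorrelation modeling mentioned above. All coordinates (planar, in km), node values, and function names are hypothetical.

```python
from math import hypot


def idw(point, nodes, power=2.0):
    """Inverse-distance-weighted estimate at `point` from (x, y, value) nodes."""
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in nodes:
        distance = hypot(point[0] - x, point[1] - y)
        if distance == 0.0:
            return value  # the point coincides with a sensor node
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def mean_exposure(points, nodes):
    """Average IDW estimate over a set of locations."""
    return sum(idw(p, nodes) for p in points) / len(points)


# Hypothetical sensor nodes: (x_km, y_km, annual mean PM2.5 in ug/m3).
nodes = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]

home = (0.5, 0.0)                                     # (1) home address
neighborhood = [(0.0, 0.0), (1.0, 0.0)]               # (2) sample points in home neighborhood
gps_track = [(0.5, 0.0), (1.0, 0.0), (2.0, 0.0)]      # (3) daily activity space

home_exposure = idw(home, nodes)
neighborhood_exposure = mean_exposure(neighborhood, nodes)
activity_exposure = mean_exposure(gps_track, nodes)
print(home_exposure, neighborhood_exposure, activity_exposure)
```

In this toy case the three estimates differ for the same respondent, which is the kind of scale dependence the proposed comparison is meant to examine. A fuller treatment would weight GPS points by time spent at each location.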
Funding Acknowledgement
This work was funded by the National Institutes of Health (NIH) under grant #5R01AG050605, by the National Science Foundation (NSF) under grant #1532133, and by an NORC Labs award to the primary authors. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of NIH or NSF.
Footnotes
Ecological Momentary Assessment (EMA), sometimes called the Experience Sampling Method (ESM), began in the 1980s at the University of Chicago as a method of collecting survey data in real time. Its use has grown rapidly in the last decade with the advent of smartphone technology, which has greatly simplified the process and reduced costs.
The U.S. Environmental Protection Agency (EPA) recently reduced the National Ambient Air Quality Standard for average annual exposure to PM2.5 from 15 μg/m3 to 12 μg/m3. The daily (24-hour) PM2.5 standard remains at 35 μg/m3; the 150 μg/m3 daily standard applies to PM10 (see https://www3.epa.gov/region1/airquality/pm-aq-standards.html). However, the World Health Organization (WHO) sets lower PM2.5 limits of 10 μg/m3 annually and 25 μg/m3 daily (see https://www.who.int/airpollution/publications/aqg2005/en/).
Hydrogen sulfide is a toxic gas even in small quantities, affecting respiratory health in particular, and can be deadly in large doses. The U.S. Occupational Safety and Health Administration (OSHA) currently recommends limiting exposure to 10 ppm on average over 8 hours or 20 ppm for 15 minutes. However, the American Conference of Governmental Industrial Hygienists (ACGIH), a broadly recognized authority on the health effects of toxic gases, recommends H2S exposures of only 1 ppm and 5 ppm respectively (see https://ohsonline.com/articles/2011/09/01/monitoring-h2s-to-meet-new-exposure-standards.aspx).
Nitrogen dioxide is one of a class of pollutants known as nitrogen oxides that are formed by the combustion of fossil fuels and have been associated with both short- and long-term respiratory problems, particularly asthma and pulmonary disease (see https://www.ncbi.nlm.nih.gov/books/NBK138707/). The harmful effects of NO2 are largely due to the formation of nitric acid when it comes into contact with moist tissues. The U.S. EPA sets a 1-hour limit of 100 ppb and an annual mean limit of 53 ppb (see https://www.epa.gov/no2-pollution/setting-and-reviewing-standards-control-no2-pollution#standards).
Concurrently, the AoT team is part of a new project funded by the National Science Foundation (NSF 1935984) to design a new version of the nodes for deployment in late 2020 or early 2021 along with new capabilities for developing, testing, and deploying intelligent measurement services using AI hardware within the nodes. As part of this project the gas sensors are being re-evaluated, and there are several candidates that will provide much higher quality data for these gases of interest as well as new gases such as methane (CH4) and carbon dioxide (CO2).
Data Availability
Due to confidentiality requirements, limited data from the CHART project are available upon request by contacting PI Kathleen Cagney (kacagney@uchicago.edu). Full data will be made available to authorized users at the end of the project through the National Archive of Computerized Data on Aging (NACDA; https://www.icpsr.umich.edu/icpsrweb/NACDA/), following a full disclosure analysis conducted by the research team in collaboration with NACDA to ensure compliance with all federal requirements and IRB confidentiality protection standards.
Data from the Array of Things are publicly available either as a bulk download (https://aot-file-browser.plenar.io/) or more selectively through an API (https://api.arrayofthings.org/). Questions about the AoT data should be directed to PI Charlie Catlett (catlett@anl.gov).
Software Information
Scripts used for analysis in SAS 9.4 and R (version 3.6.1) are available by contacting NORC Senior Research Methodologist Ned English (english-ned@norc.org).
Contributor Information
Ned English, NORC at the University of Chicago.
Chang Zhao, NORC at the University of Chicago.
Kevin L. Brown, NORC at the University of Chicago.
Charlie Catlett, Argonne National Laboratory.
Kathleen Cagney, University of Chicago.
References
- Ailshire J, Karraker A, & Clarke P (2017). Neighborhood psychosocial stressors, air pollution, and cognitive function among older U.S. adults. Social Science & Medicine, 172, 56–63. DOI: 10.1016/j.socscimed.2016.11.019.
- Aspvik NP, Viken H, Ingebrigtsen JE, Zisko N, Mehus I, Wisloff U, & Stensvold D (2018). Do weather changes influence physical activity level among older adults?—The Generation 100 study. PLoS One, 13(7), e0199463. DOI: 10.1371/journal.pone.0199463.
- Benedict K, Wayland R, & Hagler G (2017). Characterizing air quality in a rapidly changing world. EM Magazine, November. Retrieved from the EPA website at https://www.epa.gov/sites/production/files/2017-11/documents/wayland_with_citation.pdf.
- Burrough PA & McDonnell RA (1998). Principles of Geographical Information Systems. Oxford University Press, Oxford, 333pp.
- Casanova R, Wang X, Reyes J, Akita Y, Serre ML, Vizuete W, Chui HC, Driscoll I, Resnick SM, Espeland MA, & Chen J (2016). A voxel-based morphometry study reveals local brain structural alterations associated with ambient fine particles in older women. Frontiers in Human Neuroscience, 10(368). DOI: 10.3389/fnhum.2016.00495.
- Catlett CE, Beckman PH, Sankaran R, & Galvin KK (2017). Array of Things: A scientific research instrument in the public way: Platform design and early lessons learned. Proceedings of the 2nd International Workshop on Science of Smart City Operations and Platforms Engineering, 26–33. ACM.
- Changnon SA & Jones DMA (1972). Review of influences of the Great Lakes on weather. Water Resources Research, 8(2), 360–371.
- Chen J, Wang X, Wellenius GA, Serre ML, Driscoll I, Casanova R, McArdle JJ, Manson JE, Chui HC, & Espeland MA (2015). Ambient air pollution and neurotoxicity on brain structure: Evidence from Women’s Health Initiative Memory Study. Annals of Neurology, 78(3), 466–476. DOI: 10.1002/ana.2446
- Dominici F, Peng RD, Bell ML, Luu P, McDermott A, Zeger SL, & Samet JM (2006). Fine particulate air pollution and hospital admission for cardiovascular and respiratory diseases. Journal of the American Medical Association, 295(10), 1127–1134. DOI: 10.1001/jama.295.10.1127.
- English N, Brown KL, Curtis B, Zhao C, Cagney KA, & Catlett CE (2019). Linking extant social and environmental data at multiple scales to surveys: Activity space. Presentation at the 74th Annual Conference of the American Association of Public Opinion Research (AAPOR), Toronto, Canada.
- Jing P, Lu ZF, Xing J, Streets DG, Tan Q, O’Brien T, & Kamberos J (2014). Response of the summertime ground-level ozone trend in the Chicago area to emission controls and temperature changes, 2005–2013. Atmospheric Environment, 99, 630–640.
- Khan A, Plana-Ripoll O, Antonsen S, Brandt J, Geels C, Landecker H, Sullivan PF, Pedersen PD, & Rzhetsky A (2019). Environmental pollution is associated with increased risk of psychiatric disorders in the US and Denmark. PLoS Biol, 17(8), e3000353, 1–28. DOI: 10.1371/journal.pbio.3000353.
- Miller KA, Siscovick DS, Sheppard L, Shepherd K, Sullivan JH, Anderson GL, & Kaufman JD (2007). Long-term exposure to air pollution and incidence of cardiovascular events in women. New England Journal of Medicine, 356, 447–458. DOI: 10.1056/NEJMoa054409.
- Pebesma EJ (2004). Multivariable geostatistics in S: the gstat package. Computers & Geosciences, 30, 683–691.
- Potosnak MJ, Banerjee P, Sankaran R, Kotamarthi VR, Jacob RL, Beckman PH, & Catlett C (2018). Array of Things: Characterizing low-cost air quality sensors for a city-wide instrument. Presentation at the Fall Meeting of the American Geophysical Union (AGU), Washington, DC.
- Potosnak MJ, Banerjee P, Berkelhammer MB, Sankaran R, Kotamarthi VR, Jacob RL, Beckman PH, Shahkarami S, Horton DE, Montgomery A, & Catlett C (2019). Array of Things: A high-density, urban deployment of low-cost air quality sensors. Presentation at the Fall Meeting of the American Geophysical Union (AGU), Washington, DC.
- Power MC, Adar SD, Yanosky JD, & Weuve J (2016). Exposure to air pollution as a potential contributor to cognitive function, cognitive decline, brain imaging, and dementia: a systematic review of epidemiologic research. NeuroToxicology, 56, 235–253. DOI: 10.1016/j.neuro.2016.06.004.
- Silva MP, Sharma A, Budhathoki M, Jain R, & Catlett CE (2018). Neighborhood scale heat mitigation strategies using Array of Things (AoT) data in Chicago. Proceedings of the American Geophysical Union (AGU) Fall Meeting, Washington, DC, December.
- Stern AC (Ed.) (1968). Air Pollution, 2nd edition. Academic Press, New York.
- Tucker P & Gilliland J (2007). The effect of season and weather on physical activity: A systematic review. Public Health, 121(12), 909–922. DOI: 10.1016/j.puhe.2007.04.009.
- York Cornwell E, Cagney KA, & Hawkley L (2019). Inequalities in social and physical contexts of older adults’ activity spaces: Early results from the CHART study. Annual Meeting of the Population Association of America (PAA), Austin, TX, April.
- York Cornwell E & Cagney KA (2017). Aging in activity space: Results from smartphone-based GPS-tracking of urban seniors. Journals of Gerontology: Social Sciences, 72, 864–875.