EXPOSURE MODEL DEVELOPMENT AND EVALUATION
In this study, we developed new high-resolution exposure models to predict within-city spatial variations in outdoor UFP number concentrations, mean UFP size, and BC mass concentrations for Canada’s two largest cities. This analysis improves on our earlier models12,13 by increasing the spatial coverage of the monitoring campaign, increasing the total monitoring time, extending the monitoring period over an entire year, randomly sampling all days of the week and most times of day, and incorporating information from digital aerial images into model predictions using CNNs. The increased spatial and temporal coverage of this monitoring campaign compared to our previous effort likely resulted in a more representative sample of the within-city spatial variations of annual average outdoor air pollution. Model R2 values cannot be directly compared across studies, but the R2 values of our LUR, CNN, and combined models fell within the range of published R2 values from other studies that developed LUR models, machine learning models, or CNN models trained on images of the urban environment.12,13,22,43,53–55 The magnitude of the bias observed in our models was similar to the bias reported by other studies as well.12,43,53,55 A further improvement over our earlier models was the development of models for mean UFP size, which generally predicted smaller mean UFP sizes in areas of elevated UFP number concentrations, which is consistent with other research56 and our understanding of particle growth (i.e. fresh emissions consisting of high concentrations of very small particles).57,58 The development of mean UFP size models is important because particle size may play a role in UFP toxicity59–61 and has the potential to confound associations between outdoor UFP number concentrations and adverse health outcomes, as discussed previously. Collectively, this investigation produced a number of interesting results.
First, we observed high within-city spatial variations in outdoor UFP number concentrations, mean UFP size, and BC concentrations during mobile monitoring, all of which were monitored and modeled on the same spatial and temporal scales following the same methods. This finding is consistent with those from other studies that reported outdoor UFP and BC concentrations as having much greater within-city spatial variations than outdoor PM2.5 concentrations.22,62–64 The high-resolution models we developed explained more than half of the observed spatial variation in outdoor UFP and BC concentrations in the test sets. Our predicted UFP number concentrations were less strongly correlated with other air pollutants (PM2.5 r = 0.10, BC r = 0.38) than other models recently developed for Southern California (PM2.5 r = 0.28, BC r = 0.64).22 City-specific models performed better than multicity models trained on pooled data, which is consistent with the documented difficulty of transferring models between study areas.65–67 The CNN models performed somewhat worse than the LUR models but took advantage of an alternative data stream (i.e., images instead of GIS data), and it is possible that the CNN models learned complex associations that were not present in GIS data alone. Combined models performed better than any LUR or CNN models on their own, though with only a modest increase in R2 compared to the LUR models. This is consistent with the results from a similar study43 and suggests that CNNs trained on images can be useful for predicting within-city spatial variations in outdoor air pollution, especially when combined with LUR models. Nonetheless, CNNs can learn unintended associations between image features and underlying structures in the data, which can affect generalizability.68–70 For example, our CNNs appeared to be somewhat sensitive to the time of year images were captured, which likely introduced measurement error into our estimates. Nevertheless, our results suggest that CNN models are useful in capturing spatial information on environmental exposures, and this may be particularly useful in places lacking large, curated databases of land use and traffic information, as recently demonstrated in Bucaramanga, Colombia.43
Second, for each pollutant, the LUR and CNN models generated similar prediction surfaces, yet there were several interesting differences. For instance, the LUR model prediction surfaces were generally smoother than those from CNN models. This difference was due in part to the inclusion of latitude and longitude in the LUR models. Latitude and longitude vary incrementally throughout the study area and smoothed the LUR predictions, whereas each CNN-generated prediction was based solely on digital images that covered up to 280 m × 280 m of the earth’s surface (i.e., the CNN prediction for a given point was naïve to any information beyond the edge of the image centered on that point). UFP and BC concentrations can vary greatly over very short distances,4,62,63 and it is possible LURs may have over-smoothed the spatial variations in certain areas. In other areas, however, the CNN may have resulted in under-smoothing, as it is naïve to information beyond the limits of the images. Combined model prediction surfaces appeared to integrate the smoothness of the LUR surfaces with the sharp gradients of the CNN surfaces, which may be a useful compromise between the two approaches. Furthermore, mapping the difference between LUR and CNN model predictions highlighted interesting contrasts. For example, on major highways, the Montreal LUR model consistently predicted higher UFP number concentrations than the Montreal CNN model, whereas in Toronto, it was the opposite. In general, combining LUR and CNN predictions resulted in only a modest increase in overall model performance compared to the LUR models alone, but the combined models may help generate more robust predictions throughout the modeling areas by taking advantage of information from both land use data and digital images. In our previous study, conducted in Colombia, spatial variations in model errors were lower for CNN models than for LUR models.43
A strength of our exposure modeling study was the large scope of the monitoring campaign, and mobile monitoring was an efficient approach to maximize spatial coverage, though a limitation was the relatively low monitoring time per road segment compared to stationary monitoring.55,71 On average, road segments were visited on 10 different days for a total of roughly 60 seconds of monitoring. Although researchers have successfully developed models based on similar levels of monitoring,12,13,71,72 longer monitoring times would likely provide more stable estimates of annual average ambient UFP and BC levels. In addition, mobile monitoring was conducted using internal combustion engine vehicles driving on roads and major highways, which likely resulted in our measured values of air pollution being higher than the air pollution values immediately outside residences. However, it is important to note that low air pollution levels were detected in many areas, and thus our use of gasoline vehicles did not prevent us from capturing a wide range of pollutant concentrations across each city. Nevertheless, future monitoring campaigns could follow the approach recently used by Blanco and colleagues to address this limitation (i.e., building in stopping locations along each route).73
Another strength of this study was the incorporation of information from digital images to improve predictions. However, the application of CNNs can be challenging. First, CNNs require a large amount of training data and may not be applicable for smaller monitoring campaigns. Second, CNNs do not have the easily interpretable coefficients of linear regression models; however, we explored several approaches to verify that CNN models responded in a logical manner, and results were generally consistent with our current understanding of UFP sources (e.g., adding a road to an image of green space generally increased model predictions). Third, quality control of digital images is an extremely important and potentially resource-intensive step74,75 when training CNN models. For example, past applications of CNNs have erroneously learned structural flaws in the data when training models on images from multiple databases.76,77 Using R to download Google Maps satellite images was an efficient approach to compile a high-quality database of digital images, but we could not control the exact timing of image capture. This led to some images being from different seasons during the year-long campaign and likely had a small impact on CNN model predictions. Future studies should consider allocating resources to establishing high-quality databases of digital images for CNN model training and possibly developing methods to take advantage of seasonal differences in digital images to generate robust estimates of spatial variations in air pollution.
EPIDEMIOLOGICAL ANALYSIS
In the cohort study portion of this project, we followed a large population of adults in Montreal and Toronto and found consistent positive associations between long-term exposure to outdoor UFP number concentrations and both nonaccidental and cause-specific mortality. These associations were independent of other outdoor air pollutants, including PM2.5 and oxidant gases. The associations persisted when using different exposure models and, importantly, when adjusting for UFP size (i.e., mean UFP diameter); this adjustment has not been made in previous studies, and omitting it could result in an underestimation of health risks. Results for BC were largely null, although small positive associations were observed for some mortality outcomes.
As noted earlier, few cohort studies have examined the relationship between long-term exposure to outdoor UFP number concentrations and mortality. A study in California reported a positive association between outdoor UFPs and ischemic heart disease mortality,23 but exposures were estimated at a spatial resolution of approximately 4 km, which is too coarse to capture fine-scale spatial variations that may affect health. Similarly, Pond and colleagues16 reported positive associations between UFPs and nonaccidental and cause-specific mortality, but results were sensitive to the inclusion of PM2.5 in models. In addition, that study aggregated UFP exposures to the census tract level, which likely contributed substantially to exposure measurement error for UFPs (more so than for PM2.5), making it difficult to directly compare its results to those using high-resolution exposure information. More recently, a cohort study in the Netherlands reported positive associations between outdoor UFP number concentrations and mortality using high-resolution estimates of spatial variations in long-term average outdoor UFP concentrations.15 Although that study did not adjust for UFP size, the observed associations (nonaccidental HR = 1.045, 95% CI: 1.037, 1.056; respiratory HR = 1.083, 95% CI: 1.049, 1.123; rescaled to match 10,000 particles/cm3 increment used in this study) were similar in magnitude to those observed in the present study (when expressed on the same scale) when we excluded UFP size from our models (nonaccidental HR = 1.034, 95% CI: 1.024, 1.043; respiratory HR = 1.092, 95% CI: 1.061, 1.123). Other studies of long-term exposures to UFPs and mortality were not identified, but studies in Denmark have compared the health risks of total outdoor UFP number concentrations as well as traffic-related UFP number concentrations.78,79
Specifically, in these studies, traffic-related UFPs were more strongly associated with type 2 diabetes incidence,80 with weaker associations observed for incident myocardial infarctions.81 More generally, these previous observations and our results with respect to confounding by UFP size highlight the fact that we should not treat all UFP number concentrations as though they reflect a single type of exposure. This relates more broadly to the issue of confounding of the association between version of treatment and outcome when studying exposures with multiple versions of treatment, which is likely an underappreciated source of bias in air pollution epidemiology for pollutants that have historically been treated as a single entity (e.g., UFP number concentrations or PM2.5 mass concentrations) but in reality represent a complex mixture of component parts with varying levels of toxicity.24 In the context of our results, it is clear that for outdoor UFP number concentrations, we need to consider UFP size because the size distribution of UFPs varies across the range of outdoor UFP number concentrations and UFP size is independently associated with mortality. Importantly, our results suggest that failure to consider UFP size in the analysis may result in an underestimation of the health effects of outdoor UFP number concentrations. In our analysis, HRs were approximately four times smaller for cardiovascular mortality, and the association disappeared altogether for cerebrovascular mortality, when UFP size was excluded from the models. As such, UFP size should be considered in epidemiological analyses to avoid potential bias in health risk estimates for UFP number concentrations.
This is consistent with a recent review by Kittelson and colleagues that recommended using UFP number, mass, and surface area (i.e., size) to properly characterize UFP exposures.57 In the same manner that earlier toxicological research investigated the varying health effects of different number concentrations by holding mass concentration constant,82,83 future investigations into the health effects of UFP number concentrations should control for potential variations in UFP size.
In general, our results for UFPs were robust to sensitivity analyses both with and without backcasting and with and without mobility weighting, although mobility-weighted results were attenuated. This is likely attributable to mobility weighting being conducted at the neighborhood level as opposed to the individual level, and thus the mobility-weighting process likely contributed to exposure measurement error at the individual level. In addition, the results of epidemiological analyses for UFPs conducted separately for the LUR and CNN models generally suggested stronger associations for CNN model estimates. The reasons for this are unclear but may be related to the issue of spatial smoothing whereby the CNN model exposures are more local in nature; in contrast, LUR models exhibit more spatial smoothing owing to the nature of the variables included in the model. If local-level UFP levels are most relevant to health, this could explain the stronger associations observed for the CNN models. Different exposure models can introduce varying degrees of spatial and temporal errors into exposure estimates that can lead to unpredictable bias in estimated HRs. Training a single CNN model requires intensive computing resources, which makes a formal uncertainty analysis impractical. Nonetheless, although we observed some variation in estimated associations when using different exposure models, the direction and magnitudes of associations were relatively consistent. Moreover, measurement error for UFP size could result in residual confounding by UFP size. As this error is expected to be nondifferential, it suggests that true HRs for UFP number concentrations could be larger than those reported earlier (because confounding by UFP size resulted in an underestimation of the health effects of UFP number concentrations).
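The direction of the biases described above can be illustrated with a small simulation. The sketch below is not the survival analysis used in this study: it substitutes a simple linear outcome for a Cox model, and the negative correlation between number concentration and size (rho = -0.6), the unit effect sizes, and the error standard deviation are hypothetical values chosen only for illustration. Under those assumptions, omitting size underestimates the coefficient for number concentration, and adjusting for a noisily measured size removes only part of that bias.

```python
import random


def fit_one_predictor(x, y):
    """Least-squares slope of y on x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (c - my) for a, c in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def fit_two_predictors(x, z, y):
    """Least-squares coefficients of y on x and z (intercept removed by centering)."""
    n = len(y)
    mx, mz, my = sum(x) / n, sum(z) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    szz = sum((b - mz) ** 2 for b in z)
    sxz = sum((a - mx) * (b - mz) for a, b in zip(x, z))
    sxy = sum((a - mx) * (c - my) for a, c in zip(x, y))
    szy = sum((b - mz) * (c - my) for b, c in zip(z, y))
    det = sxx * szz - sxz ** 2
    return (szz * sxy - sxz * szy) / det, (sxx * szy - sxz * sxy) / det


def simulate_size_confounding(n=20000, rho=-0.6, size_error_sd=1.0, seed=1):
    """Return the estimated coefficient for number concentration under three models.

    All parameter values are hypothetical. The true coefficient is 1.0 for
    both (standardized) number concentration and (standardized) size.
    """
    rng = random.Random(seed)
    number = [rng.gauss(0, 1) for _ in range(n)]
    # Size is negatively correlated with number concentration.
    size = [rho * v + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1) for v in number]
    outcome = [v + s + rng.gauss(0, 1) for v, s in zip(number, size)]
    # Nondifferential (classical) error in the measured size.
    noisy_size = [s + rng.gauss(0, size_error_sd) for s in size]
    return {
        "size_omitted": fit_one_predictor(number, outcome),
        "noisy_size_adjusted": fit_two_predictors(number, noisy_size, outcome)[0],
        "true_size_adjusted": fit_two_predictors(number, size, outcome)[0],
    }


if __name__ == "__main__":
    for model, coefficient in simulate_size_confounding().items():
        print(f"{model}: {coefficient:.3f}")
```

In this toy setting the three estimates are ordered from most to least biased, which mirrors the argument that better measurement of UFP size would be expected to move estimates for UFP number concentrations further from the null.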
Another intriguing finding from our analysis is that larger particles in the UFP size range were more strongly associated with mortality in the main epidemiological analysis (i.e., while controlling for exposure to UFP number concentration, other pollutants, and relevant confounders). UFP size ranged from approximately 18 to 50 nm in our study, and the probability of lung deposition is similar across this range.84,85 Freshly emitted UFPs can rapidly grow in size as gaseous vapors condense into liquids and as particles aggregate together (i.e., nucleation and accumulation modes).2,4,64 During this process, UFPs interact with the outdoor environment, and this atmospheric aging can enhance the toxicity of the particles.86 Alternatively, the observed pattern of larger UFPs being more harmful may be explained by differences in particle composition across the UFP size distribution or the propensity for particles of various sizes to reach the systemic circulation once deposited in the lung.2,71,87 Future studies should continue to explore the independent health effects of UFP size as well as composition to further elucidate this relationship.
When not adjusting for exposure to other air pollutants, we observed positive associations between outdoor BC concentrations and nonaccidental, cardiovascular, cardiometabolic, ischemic heart disease, cerebrovascular, and respiratory mortality. These associations weakened or disappeared when UFPs, PM2.5, and Ox were included in the model, with only nonaccidental, cardiovascular, and cardiometabolic mortality continuing to be associated with outdoor BC concentrations. Previous studies have observed positive associations between outdoor BC concentrations and mortality,88–90 although most of these studies did not examine high-resolution within-city spatial variations in outdoor BC. For example, Gan and colleagues91 reported a positive association between outdoor BC concentrations and coronary heart disease mortality in Vancouver (HR = 1.037, 95% CI: 1.018, 1.055; rescaled to match 500 ng/m3 increment used in this study). This association is slightly larger than the positive association we observed between BC and cardiovascular mortality in Montreal and Toronto (HR = 1.015, 95% CI: 1.004, 1.025). Similarly, a weak association between BC and nonaccidental mortality was reported in the Dutch Environmental Longitudinal Study (HR = 1.021, 95% CI: 0.996, 1.047; rescaled to 500 ng/m3)92 and was similar in magnitude to what we observed in Montreal and Toronto (HR = 1.009, 95% CI: 1.004, 1.015). A third study, in Oakland, California, used mobile monitoring and high-resolution exposure models to examine the association between long-term exposure to BC and cardiovascular mortality but did not observe clear evidence of a positive association.93 Conversely, two studies in Denmark94,95 reported positive associations between high-resolution estimates of outdoor BC concentrations and all-cause and cardiovascular mortality.
However, similar to our results, estimates in one of these studies95 were sensitive to adjustment for NO2, and BC was not associated with mortality when NO2 was included in the model. Our single-pollutant model results were similar to an analysis of several European cohorts that observed associations between long-term BC exposure and nonaccidental, cardiovascular, cardiometabolic, ischemic heart disease, cerebrovascular, and respiratory mortality in single-pollutant models.89 As with our results, these associations were attenuated when co-pollutants were included in the models. Lastly, an analysis of a French cohort found that associations between BC exposure and mortality were stronger when using longer exposure windows (e.g., 20-year average instead of 3-year average); however, the analysis did not consider exposure to NO2 or O3.90
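Several hazard ratios quoted above were rescaled to a common exposure increment before being compared. A minimal sketch of that kind of conversion follows, assuming a log-linear exposure-response relationship (as in a standard Cox model), under which the log hazard ratio scales in proportion to the increment. The function names and the numbers in the usage example are hypothetical and are not taken from any of the cited studies.

```python
import math


def rescale_hazard_ratio(hr, original_increment, target_increment):
    """Re-express a hazard ratio per `original_increment` as one per `target_increment`.

    Assumes the log hazard is linear in exposure, so that
    HR_target = HR_original ** (target_increment / original_increment).
    Both increments must be in the same units.
    """
    if hr <= 0 or original_increment <= 0 or target_increment <= 0:
        raise ValueError("hazard ratio and increments must be positive")
    return math.exp(math.log(hr) * target_increment / original_increment)


def rescale_estimate(hr, lower, upper, original_increment, target_increment):
    """Rescale a point estimate and its confidence limits together."""
    return tuple(
        rescale_hazard_ratio(value, original_increment, target_increment)
        for value in (hr, lower, upper)
    )


if __name__ == "__main__":
    # Hypothetical estimate reported per 1,000 ng/m3, re-expressed per 500 ng/m3.
    hr, lower, upper = rescale_estimate(1.10, 1.05, 1.15, 1000, 500)
    print(f"HR = {hr:.3f} (95% CI: {lower:.3f}, {upper:.3f})")
```

Because the same exponent is applied to the point estimate and both confidence limits, the rescaled interval keeps the same coverage on the log scale.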
More generally, heterogeneity in epidemiological results for outdoor BC concentrations could be attributed to numerous factors, including geographic differences in the composition of the BC mixture, which may vary depending on sources (e.g., predominantly gasoline vs. diesel vehicle emissions), differences in the spatial scale of exposure assessment (e.g., high spatial resolution vs. regional estimates), or residual confounding. Future cohort studies examining the health effects of traffic-related air pollution should consider measuring outdoor BC concentrations, UFP number concentrations, and UFP size (in addition to PM2.5 and Ox) because, to our knowledge, this study is the only epidemiological investigation to date to consider all three of these metrics in the analysis.
Our epidemiological analyses had several notable strengths, including a large population-based cohort, adjustment for mean UFP size, high-resolution exposure models based on year-long monitoring campaigns, updating exposures for residential mobility both within and between cities, extensive sensitivity analyses to various modeling approaches, and detailed evaluation of concentration-response relationships. It is important, however, to note several limitations. First, while we backcasted exposure estimates for use in epidemiological analyses, it is not possible to evaluate the validity of our estimates back in time owing to the absence of historical monitoring data suitable for estimating annual average outdoor UFP and BC concentrations. Therefore, as in all epidemiological studies, our results likely were subject to exposure measurement error, including contributions from both Berkson-type (e.g., cohort members in the same six-digit postal code received the same exposure estimate) and classical-type measurement error (e.g., measured values aggregated to each road segment are imperfect estimates of true long-term outdoor concentrations). In addition, the exposure models used in this study were developed using on-road measurements, and the absolute values of our estimated exposures may be elevated compared to true long-term outdoor concentrations. However, it is important to note that our monitoring campaign captured a wide range of outdoor UFP and BC concentrations across each city, and thus our monitoring approach did not prevent us from identifying low-exposure areas. Importantly, UFPs and BC were measured concurrently, were modeled using the same methods on the same spatial and temporal scales, and were weakly correlated. 
Although we cannot rule out that one may have been measured more precisely than the other, we do not believe that the differences (in the amount of measurement error between these two pollutants) are a likely explanation for our results or differences in the strengths of mortality associations for UFPs and BC.
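The distinction between the two error types mentioned above can be made concrete with a short simulation. This is a simplified sketch under stated assumptions rather than a reproduction of our analysis: it uses a linear outcome instead of a survival model, and the sample size, error variances, and true slope of 1.0 are hypothetical. With classical error (observed = true + noise), the estimated slope is attenuated toward zero; with Berkson error (true = assigned + noise), the slope remains approximately unbiased but is estimated less precisely.

```python
import random


def ols_slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate_error_types(n=20000, beta=1.0, error_sd=1.0, seed=0):
    """Return (classical_slope, berkson_slope) for a true slope of `beta`.

    All parameter values are hypothetical and chosen for illustration.
    """
    rng = random.Random(seed)

    # Classical error: the measurement scatters around the true exposure.
    true_exposure = [rng.gauss(0, 1) for _ in range(n)]
    outcome = [beta * t + rng.gauss(0, 1) for t in true_exposure]
    observed = [t + rng.gauss(0, error_sd) for t in true_exposure]
    classical_slope = ols_slope(observed, outcome)

    # Berkson error: the true exposure scatters around a shared assigned value
    # (e.g., everyone in one postal code receives the same estimate).
    assigned = [rng.gauss(0, 1) for _ in range(n)]
    true_exposure_b = [a + rng.gauss(0, error_sd) for a in assigned]
    outcome_b = [beta * t + rng.gauss(0, 1) for t in true_exposure_b]
    berkson_slope = ols_slope(assigned, outcome_b)

    return classical_slope, berkson_slope


if __name__ == "__main__":
    classical, berkson = simulate_error_types()
    print(f"classical-error slope: {classical:.3f}")
    print(f"Berkson-error slope:   {berkson:.3f}")
```

With equal variances for the true exposure and the error, the classical-error slope is expected to be roughly half the true value, which is the familiar attenuation factor var(true) / (var(true) + var(error)).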
Another limitation was the use of mean UFP size as measured by the Testo DiSCmini and Naneos Partector 2 instruments as opposed to more sophisticated methods that provide measurements across the entire particle size distribution. However, the method employed by these instruments has been evaluated against gold standard scanning mobility particle sizer measurements for UFP size and performed very well in these comparisons.32 In general, while mean UFP size as measured by these handheld monitors may have limitations, our results suggest that it is important to consider this parameter to reduce potential confounding bias in health risk estimates for UFP number concentrations. As noted previously, improved measures of UFP size could reduce residual confounding and further strengthen associations for UFP number concentrations.
Finally, although our cohort lacked information on individual-level behaviors and characteristics such as smoking or body mass index, we do not view this as a limitation, as these personal-level factors are unlikely to confound associations for outdoor concentrations because they do not affect annual average outdoor pollution levels. The directed acyclic graphs in the current study illustrate that these personal-level factors are not confounders of associations for outdoor concentrations. Likewise, other studies of outdoor air pollution have found that adjusting for such risk factors did not affect risk estimates.20,48,96 Thus, we think it is unlikely that our results were confounded by personal-level factors, such as smoking or body mass index. The issue of confounding at the personal versus concentration level is addressed in detail by Weisskopf and Webster97 along with an examination of the trade-offs of personal versus proxy exposure measures in environmental epidemiology.