Environmental Health Perspectives. 2011 Mar 31;119(8):1123–1129. doi: 10.1289/ehp.1002976

Creating National Air Pollution Models for Population Exposure Assessment in Canada

Perry Hystad 1, Eleanor Setton 2, Alejandro Cervantes 3, Karla Poplawski 4, Steeve Deschenes 2, Michael Brauer 4, Aaron van Donkelaar 5, Lok Lamsal 5, Randall Martin 5,6, Michael Jerrett 7, Paul Demers 4,8
PMCID: PMC3237350  PMID: 21454147

Abstract

Background: Population exposure assessment methods that capture local-scale pollutant variability are needed for large-scale epidemiological studies and surveillance, policy, and regulatory purposes. Currently, such exposure methods are limited.

Methods: We created 2006 national pollutant models for fine particulate matter [PM with aerodynamic diameter ≤ 2.5 μm (PM2.5)], nitrogen dioxide (NO2), benzene, ethylbenzene, and 1,3-butadiene from routinely collected fixed-site monitoring data in Canada. In multiple regression models, we incorporated satellite estimates and geographic predictor variables to capture background and regional pollutant variation and used deterministic gradients to capture local-scale variation. The national NO2 and benzene models were evaluated with independent measurements from previous land use regression models that were developed in seven Canadian cities. National models were applied to census block-face points, each of which represents the location of approximately 89 individuals, to produce estimates of population exposure.

Results: The national NO2 model explained 73% of the variability in fixed-site monitor concentrations, PM2.5 46%, benzene 62%, ethylbenzene 67%, and 1,3-butadiene 68%. The NO2 model predicted, on average, 43% of the within-city variability in the independent NO2 data compared with 18% when using inverse distance weighting of fixed-site monitoring data. Benzene models performed poorly in predicting within-city benzene variability. Based on our national models, we estimated Canadian ambient annual average population-weighted exposures (in micrograms per cubic meter) of 8.39 for PM2.5, 23.37 for NO2, 1.04 for benzene, 0.63 for ethylbenzene, and 0.09 for 1,3-butadiene.

Conclusions: The national pollutant models created here improve exposure assessment compared with traditional monitor-based approaches by capturing both regional and local-scale pollution variation. Applying national models to routinely collected population location data can extend land use modeling techniques to population exposure assessment and to informing surveillance, policy, and regulation.

Keywords: air pollution, Canada, fixed-site monitors, gradients, land use regression, population exposure assessment, satellite data


Predicting air pollution concentrations at resolutions capable of capturing local-scale pollutant gradients over large geographical areas is becoming increasingly important in multicity and national health studies; in population exposure assessment; and in support of policy, surveillance, and regulatory initiatives. Currently, fixed-site government monitors are the foundation of these activities; however, because of siting criteria, such monitors may fail to fully capture local-scale pollutant variability. In addition, the number of monitors and their spatial distribution may be limited, as is the case in Canada. At present, few methodologies are available that adequately capture local-scale pollutant variability at a national scale when monitor density, distribution, or siting is suboptimal.

A number of approaches may be used to model air pollution over large areas, including interpolation of fixed-site government monitoring data, dispersion modeling, satellite remote sensing, land use regression (LUR), and proximity and deterministic methods. Each approach, however, has inherent limitations that restrict its use for producing local-scale pollution estimates. Interpolation of fixed-site air pollution monitoring data has typically been used to predict pollution concentrations across large areas (Beelen et al. 2009), with recent interest directed towards kriging methods and spatial smoothing with geographic covariates (Beelen et al. 2009; Hart et al. 2009; Yanosky et al. 2008). Fixed-site monitors may not capture entire populations, and measurements typically represent regional and between-city pollution differences due to monitor siting criteria, which prevent monitors from being placed in proximity to major roads and other pollution sources. Dispersion models also exist for large geographical areas and have been incorporated into regulatory and epidemiological studies of air pollution (Cyrys et al. 2005; Nafstad et al. 2003). Importantly, the resolutions of pollutant estimates from dispersion models over large geographical areas are typically restricted, for example, to 1 or 3 km2 (Jerrett et al. 2005). Satellite remote sensing is a new methodology available to predict air pollution concentrations over large geographic areas, and a number of studies have evaluated different remotely sensed concentrations of fine particulate matter [PM with aerodynamic diameter ≤ 2.5 μm (PM2.5)] (e.g., van Donkelaar et al. 2010) and gaseous pollutants (Martin 2008) and found moderate to good associations with ground-level monitoring data. Currently, the resolution of satellite data limits their use to representing regional pollution concentrations, but indicators of local air pollution may be used in concert to improve the spatial resolution of predictions (Liu et al. 2009). LUR approaches have been used extensively to predict within-city pollutant concentrations of nitrogen dioxide (NO2) and PM2.5 (for review, see Hoek et al. 2008), but to a lesser extent for volatile organic compounds (VOCs). However, the approach is well suited to modeling pollutants that exhibit significant spatial variation, especially traffic-related VOCs (Atari and Luginaah 2009; Mukerjee et al. 2009; Smith et al. 2006; Su et al. 2010; Wheeler et al. 2008). The city-by-city approach in which LUR models are created is costly, and integration and interpretation across multiple city models is difficult. Simple proximity and deterministic approaches have also been widely used as surrogates for exposure to vehicle and industrial sources, specifically in epidemiological studies; yet, such measures in isolation are often poor surrogates for exposure. To date, few population exposure assessments have incorporated multiple sources of data, specifically satellite pollutant estimates, LUR modeling of geographic characteristics, and information on proximity and pollution gradients, to estimate local-scale air pollution concentrations at a national scale.

Here we report a modeling initiative to produce 2006 national PM2.5, NO2, benzene, ethylbenzene, and 1,3-butadiene models for Canada that capture local-scale pollutant variability and apply these models to routinely collected population location data to calculate population exposures. This research is part of Carex Canada, a national surveillance initiative designed to estimate the number of Canadians potentially exposed to known or suspected environmental and occupational carcinogens (Carex Canada 2011). This research adds to the literature on air pollution modeling and exposure assessment by creating national LUR models from fixed-site monitoring data; incorporating various predictor data sets and methods to capture the different scales of pollution sources; and extending LUR modeling techniques to population exposure assessment and to informing surveillance, policy, and regulation.

Materials and Methods

Pollutant modeling approach. Models were developed in two stages using different predictor variables and methodology to capture background, regional, and local-scale pollution variation. First, for each National Air Pollution Surveillance (NAPS) fixed-site monitoring station, we derived satellite-based estimates (PM2.5 and NO2 only) and geographic variables (e.g., road length, population density, proximity to large emitters) using ArcGIS (version 9.3; ESRI, Redlands, CA, USA). We used forward stepwise regression to develop LUR models and retained variables that corresponded to hypothesized effect directions; we maximized the sums of squares explained by Akaike’s information criterion. Spatial autocorrelation was also evaluated using the Moran’s I statistic in ArcGIS. We sought to develop parsimonious models rather than traditional predictive models that maximize prediction but make interpretation of individual variable contributions difficult. Only variables significant at the p < 0.05 level were included in the final models. As expected, NAPS monitoring locations in Canada did not display sufficient variability to estimate model coefficients for important local-scale parameters, such as proximity to major roadways, because of monitor siting. Thus, local-scale predictors were underpowered in the LUR modeling approach.

In the second stage, we conducted comprehensive literature reviews to identify deterministic factors to represent local-scale gradients in pollutant concentrations associated with specific sources (i.e., highways, major roads, gas stations). For each pollutant, we identified concentrations near these selected sources in relation to local background levels and developed deterministic multipliers with distance decay rates (together referred to as gradients in this paper) to apply to the background and regional concentrations predicted by our LUR models. All statistical analyses were conducted using SAS (version 9.1; SAS Institute Inc., Cary, NC, USA).

Air quality data. Annual average concentrations of PM2.5 (177 monitoring stations), NO2 (134 monitors), and benzene, ethylbenzene, and 1,3-butadiene (53 monitors) were calculated using data from unique NAPS monitoring sites that were operating during 2006 (see Figure 1). Continuous monitoring data from a given monitor were included if at least 50% of hourly observations were available for a 24-hr period and at least 50% of days were available in a month. Monthly averages from filter-based PM2.5 measurements required a minimum of three of five valid measurements per month. Annual averages for 2006 were not calculated for individual monitors unless there were at least 6 months of complete data with one valid month per quarter.
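
The completeness rules above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' processing code: the function names are hypothetical, "one valid month per quarter" is read here as at least one valid month in each of the four quarters, and the annual value is assumed to be the mean of the valid monthly means.

```python
def valid_day(hourly_values, min_fraction=0.5):
    """A day is kept if at least half of its 24 hourly slots hold a value."""
    present = sum(1 for v in hourly_values if v is not None)
    return present / 24 >= min_fraction


def valid_month(n_valid_days, days_in_month, min_fraction=0.5):
    """A month is kept if at least half of its days are valid."""
    return n_valid_days / days_in_month >= min_fraction


def annual_mean(monthly_means, min_months=6):
    """monthly_means maps month (1-12) to a monthly mean, or None if invalid.

    Returns None unless there are at least `min_months` valid months and
    every quarter contains at least one valid month (our interpretation).
    """
    valid = {m: v for m, v in monthly_means.items() if v is not None}
    quarters = {(m - 1) // 3 for m in valid}
    if len(valid) < min_months or len(quarters) < 4:
        return None
    # Assumption: the annual average is the mean of valid monthly means.
    return sum(valid.values()) / len(valid)
```

A monitor with six valid months that all fall in the first half of the year would be rejected under this reading, because the third and fourth quarters are empty.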

Figure 1. Location of NAPS monitors that were used to create national PM2.5, NO2, benzene, ethylbenzene, and 1,3-butadiene models.

NAPS includes different monitor types for PM2.5, including tapered element oscillating microbalances (TEOMs), dichotomous partisol samplers (Thermo Fisher Scientific Inc., Waltham, MA, USA), and beta-attenuation mass monitors (Met One Instruments Inc., Grants Pass, OR, USA). Multiple monitors are often present at one location, and our comparative analysis found differences in levels measured by TEOMs, which are known to underpredict PM2.5 because of nitrate evaporation (Dann T, personal communication). We therefore selected other monitor types when they were available at the same location. Those stations with only TEOMs available were adjusted based on yearly calibration between collocated dichotomous and TEOM monitors during 2006 [n = 14, dichotomous = 1.640 + 1.089 × (TEOM), R2 = 0.89, p < 0.001]. NO2, benzene, ethylbenzene, and 1,3-butadiene were measured using standard methods (NAPS 2004).
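
As a worked illustration of the monitor selection and TEOM adjustment described above, the sketch below applies the reported calibration (dichotomous = 1.640 + 1.089 × TEOM). The function names and the dictionary layout are hypothetical, and averaging when several non-TEOM monitors are present is an assumption; the text does not say how such cases were handled.

```python
# Calibration reported in the text for collocated monitors in 2006:
# dichotomous = 1.640 + 1.089 * TEOM (n = 14, R2 = 0.89)
TEOM_INTERCEPT = 1.640
TEOM_SLOPE = 1.089


def adjust_teom(teom_pm25):
    """Convert a TEOM annual mean (ug/m3) to a dichotomous-equivalent value."""
    return TEOM_INTERCEPT + TEOM_SLOPE * teom_pm25


def station_pm25(annual_means):
    """annual_means maps monitor type (e.g. "TEOM") to an annual mean.

    Non-TEOM monitors are preferred when available; otherwise the TEOM
    value is adjusted with the calibration above.
    """
    non_teom = [v for k, v in annual_means.items() if k != "TEOM"]
    if non_teom:
        # Assumption: average the non-TEOM monitors if more than one exists.
        return sum(non_teom) / len(non_teom)
    return adjust_teom(annual_means["TEOM"])
```

For example, a TEOM-only station with an annual mean of 10.0 µg/m3 would be adjusted to 1.640 + 1.089 × 10.0 = 12.53 µg/m3.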

Predictor variables. PM2.5 and NO2 satellite data. Canada-wide concentrations of PM2.5 and NO2 were estimated using satellite atmospheric composition data combined with local, coincident scaling factors from a chemical transport model [Goddard Earth Observing System (GEOS)-Chem 2011]. Ground-level PM2.5 estimates were derived from aerosol optical depth data from the Terra satellite [National Aeronautics and Space Administration (NASA) 2011b], in combination with output from GEOS-Chem simulations to estimate the relationship between aerosol optical depth over the atmospheric column and ground-level PM2.5 (van Donkelaar et al. 2010). Ground-level NO2 concentrations were estimated from tropospheric NO2 columns retrieved from the ozone monitoring instrument on the Aura satellite (NASA 2011a); GEOS-Chem was also used to calculate the relationship between the NO2 column and ground-level concentration (Lamsal et al. 2008). Both PM2.5 and NO2 were estimated at a 0.1 × 0.1° resolution (~ 10 × 10 km). Estimates for PM2.5 were calculated from 2001–2006 data to ensure sufficient observations. For NO2 estimates, we used data from 2005 and 2006, because ozone monitoring instrument measurements began in late 2004.

Geographic data. We modeled regional pollutant variation using geographic predictor variables potentially relevant to pollutant sources, emissions, and dispersion. To capture varying spatial influences of predictors, all variables were calculated for circular buffer distances ranging from 50 m to 50 km. Classes of variables included population density derived from census block-face points (Statistics Canada 2006); 1-km land use classifications (Global Land Cover Characterization 2008); high-resolution (30 m) land-use classifications (DMTI Spatial Inc., Markham, Ontario, Canada); sources of large industrial emissions from the Canadian National Pollutant Release Inventory (NPRI; Environment Canada 2010); small point source locations extracted from the Dun and Bradstreet (D&B) Selectory database of businesses (Hoovers, Austin, TX, USA) in Canada; length of and distance to specific road classifications using the DMTI Spatial road network, such as freeway, highway, major road, and minor road (DMTI Spatial Inc.); length and density of railroads; elevation; and meteorological variables (precipitation and temperature). Any geographic variables with > 30% zero values—those with no predictive features in proximity to a monitor—were recoded as binary (i.e., present/absent). In total, 10 variable classes and 270 buffer-specific variables were explored in the LUR models.
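
The recoding rule for sparse variables can be illustrated as follows. This is a sketch under the assumption that each buffer-specific variable is held as a simple list with one value per monitor; the function name is hypothetical.

```python
def recode_sparse(values, zero_threshold=0.30):
    """Recode a predictor as present/absent when too many monitors have zero.

    values: one value per monitor for a single buffer-specific variable.
    If more than `zero_threshold` of the values are zero, return a 0/1
    indicator; otherwise return the values unchanged.
    """
    zero_fraction = sum(1 for v in values if v == 0) / len(values)
    if zero_fraction > zero_threshold:
        return [1 if v != 0 else 0 for v in values]
    return list(values)
```

A variable such as NPRI emissions within 10 km, which is zero for many monitors, would become the "NPRI emissions (present)" style of indicator under this rule.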

Deterministic gradients. Gradients were developed with a focus on mobile sources and gas stations. We conducted a comprehensive literature review of published studies to identify the distance from sources at which pollutant concentrations typically return to background levels, and an expected ratio of near-source pollutant levels compared with background pollutant levels for each source and pollutant. We searched PubMed (2010), Web of Science (Thomson Reuters 2010), and Google Scholar (2010) using a range of keywords to identify studies with measurements of pollutant gradients. Studies varied widely in terms of location, date, methods, duration of measures, number of samples, and definition of near source and background. We developed linear gradients using the steepest portion of the exponential decay curves typically found in the literature, as the tails of the decay functions were very sensitive to local parameters. Gradients were also selected to represent Canadian conditions. Table 1 summarizes the gradients developed for Canada and applied to the LUR models.

Table 1. PM2.5, NO2, benzene, ethylbenzene, and 1,3-butadiene gradients determined from the literature and incorporated with national LUR model predictions.

| Substance | Source | Increase at source | Gradient distance (m) |
|---|---|---|---|
| PM2.5 | Highway | 1.25 (a) | 75 (b) |
| PM2.5 | Major road | 1.1 (a) | 75 (b) |
| NO2 | Highway | 1.65 (a) | 300 (c) |
| NO2 | Major road | 1.2 (a) | 100 (c) |
| Benzene | Gas station | 6.5 (d) | 100 (d) |
| Benzene | Highway/major road | 3.25 (e) | 50 (f) |
| Benzene | Local road | 1.5 (e) | 50 (f) |
| Ethylbenzene | Highway | 3.7 (g) | 300 (h) |
| Ethylbenzene | Major road | 2.2 (g) | 300 (h) |
| Ethylbenzene | Local road | 1.4 (g) | 300 (h) |
| 1,3-Butadiene | Highway | 4 (i) | 75 (i) |

(a) Smargiassi et al. (2005). (b) Beckerman et al. (2008), Hitchins et al. (2000), Roorda-Knape et al. (1998), Tiitta et al. (2002). (c) Beckerman et al. (2008), Gilbert et al. (2003, 2007), Roorda-Knape et al. (1998), Su et al. (2009). (d) Karakitsios et al. (2007). (e) Hellén et al. (2006), Parra et al. (2009), Thorsson and Eliasson (2006), Vardoulakis et al. (2002). (f) Beckerman et al. (2008), Thorsson and Eliasson (2006), Venkatram et al. (2009). (g) Parra et al. (2009), Roukos et al. (2009), Wang and Zhao (2008). (h) Wang and Zhao (2008). (i) Venkatram et al. (2009).

To identify the distance of each NAPS monitor from the nearest highway, major road, local urban road, and gas station, we used DMTI road network data and D&B commercial data for point sources. If a monitor was close enough to one of these features for the source to influence pollutant levels, we modified the corresponding LUR model results (not including point source industrial variables) to account for the deterministic gradients. For example, based on our review of the literature, we assumed that NO2 concentrations at the side of a highway would be 1.65 times higher than LUR-based background concentrations but consistent with background levels 300 m from the highway; this assumption resulted in a distance decay rate of 0.33% per meter that was applied to the model to estimate NO2 levels within the 300-m gradient buffer.
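
The NO2 highway example can be written as a short sketch. It assumes the gradient is a multiplier that falls linearly from the Table 1 "increase at source" value at the roadside to 1.0 at the gradient distance, so that the excess above background shrinks by 1/300 (about 0.33%) per meter for the NO2 highway case. The function and dictionary names are hypothetical, and the sketch does not address how overlapping gradients from several sources were combined.

```python
# (increase at source, gradient distance in m), copied from Table 1.
GRADIENTS = {
    ("PM2.5", "highway"): (1.25, 75.0),
    ("PM2.5", "major_road"): (1.1, 75.0),
    ("NO2", "highway"): (1.65, 300.0),
    ("NO2", "major_road"): (1.2, 100.0),
    ("benzene", "gas_station"): (6.5, 100.0),
    ("benzene", "highway_or_major_road"): (3.25, 50.0),
    ("benzene", "local_road"): (1.5, 50.0),
    ("ethylbenzene", "highway"): (3.7, 300.0),
    ("ethylbenzene", "major_road"): (2.2, 300.0),
    ("ethylbenzene", "local_road"): (1.4, 300.0),
    ("1,3-butadiene", "highway"): (4.0, 75.0),
}


def gradient_multiplier(distance_m, increase_at_source, gradient_distance_m):
    """Linear decay from `increase_at_source` at 0 m to 1.0 at the limit."""
    if distance_m >= gradient_distance_m:
        return 1.0
    remaining = 1.0 - max(distance_m, 0.0) / gradient_distance_m
    return 1.0 + (increase_at_source - 1.0) * remaining


def apply_gradient(lur_concentration, pollutant, source, distance_m):
    """Scale an LUR background estimate by the gradient for one source."""
    increase, limit = GRADIENTS[(pollutant, source)]
    return lur_concentration * gradient_multiplier(distance_m, increase, limit)
```

Under these assumptions, a location 150 m from a highway would receive an NO2 multiplier of 1 + 0.65 × 0.5 = 1.325.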

Model evaluation. We used three approaches for model evaluation. Due to the small number of NAPS monitoring stations for PM2.5, NO2, benzene, ethylbenzene, and 1,3-butadiene, we did not leave out a percentage for independent postmodel evaluation, because we wanted to capture the greatest range of model predictors possible. Therefore, we first evaluated all LUR models using a bootstrap approach to determine the sensitivity of model prediction and parameter estimates to monitor sampling. Random selection of monitors was conducted, with replacement, and variable coefficients and model R2 values were recorded from the new full sample. This was repeated for 10,000 iterations to estimate the 95% confidence interval (CI) for overall model prediction and individual variable coefficients. Next, we conducted a leave-one-out analysis where each LUR model was repeatedly parameterized on n – 1 data points and then used to predict the excluded monitor measurement. The mean differences between the predicted and measured values were used to estimate model error.
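
The bootstrap and leave-one-out procedures can be sketched as follows. To stay self-contained, the sketch uses a single-predictor least-squares fit in place of the multivariable LUR models, so it illustrates the resampling logic only; all function names are hypothetical, and the percentile method for the 95% CI is an assumption.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def r_squared(xs, ys):
    """Share of variance in ys explained by the fitted line."""
    a, b = fit_line(xs, ys)
    my = sum(ys) / len(ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def bootstrap_r2_ci(xs, ys, iterations=10000, seed=0):
    """Percentile 95% CI of R2 from resampling monitors with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    r2s = []
    for _ in range(iterations):
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2 or len(set(by)) < 2:
            continue  # skip degenerate resamples with no variation
        r2s.append(r_squared(bx, by))
    r2s.sort()
    lo = r2s[int(0.025 * len(r2s))]
    hi = r2s[max(int(0.975 * len(r2s)) - 1, 0)]
    return lo, hi


def leave_one_out_errors(xs, ys):
    """Refit on n - 1 monitors and return predicted minus measured values."""
    errors = []
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        errors.append((a + b * xs[i]) - ys[i])
    return errors
```

The mean of the values returned by `leave_one_out_errors` corresponds to the mean error reported for each model, and their standard deviation to its spread.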

Finally, we evaluated the NO2 and benzene LUR models, with and without gradients, against independent data (35–196 monitoring sites per city) previously collected for LUR models in seven Canadian cities (for a full description of data collection and modeling see Allen et al. 2010; Atari and Luginaah 2009; Crouse et al. 2009; Henderson et al. 2007; Jerrett et al. 2007; Su et al. 2010). Briefly, in each city, monitoring took place over a 2-week period; data from fixed-site monitors, monitoring during yearly average concentration periods, or multiple measurement periods were used to estimate yearly averages [see Supplemental Material, Table 1 (doi:10.1289/ehp.1002976) for the city-specific data used for model evaluation]. These pollution measurements were collected at much higher spatial densities than were NAPS and from monitors that were located to specifically capture spatial pollutant gradients. Consequently, these data were reasonable for use as a gold standard to determine how well the two national NO2 and benzene models (the LUR models and the LUR models with gradients) predicted within-city variation. In addition, we compared the city-specific data with estimates based on inverse distance weighting (IDW) of annual average NO2 and benzene concentrations measured at NAPS monitors (with and without deterministic gradients). Because of NAPS monitor density in Canada, kriging could not be applied.
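
For reference, the IDW comparison estimate can be sketched as below. The text does not state the distance exponent or the number of neighboring monitors used, so the power of 2 and the use of all monitors are assumptions, and the function name is hypothetical.

```python
import math


def idw(target_xy, monitors, power=2.0):
    """Inverse distance weighted estimate at target_xy.

    monitors: list of ((x, y), concentration) in projected coordinates.
    """
    num = 0.0
    den = 0.0
    for (x, y), conc in monitors:
        d = math.hypot(x - target_xy[0], y - target_xy[1])
        if d == 0:
            return conc  # target coincides with a monitor
        w = 1.0 / d ** power
        num += w * conc
        den += w
    return num / den
```

Because IDW only blends monitor values, its estimate at any location lies between the lowest and highest monitored concentrations, which is one reason it cannot represent near-source peaks without the added gradients.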

Population exposure assessment. The national pollutant models were applied to each of the 478,831 Statistics Canada street block-face centroid locations in 2006 to estimate population exposures. First, we applied the LUR models to each block point to derive a unique predicted pollutant concentration for each point, representing the average exposure level for a mean of 89 individuals (SD ± 158). We used a GIS to identify the distance of each block centroid to the nearest highway, major road, local urban road, and gas station and adjusted the corresponding LUR model estimate when the street block point was located within an associated gradient. We then estimated population-weighted exposures to PM2.5, NO2, benzene, ethylbenzene, and 1,3-butadiene in the Canadian population as a whole, and we estimated uncertainty using the 95% confidence limits for LUR model predictions. Because there was insufficient information in the literature to examine uncertainty for specific gradients, we selected ± 50% for all gradients (values shown in Table 1).
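
The population-weighted exposure is a weighted mean of block-point concentrations, with each block point weighted by the number of residents it represents. A minimal sketch, with a hypothetical function name and made-up inputs in the example:

```python
def population_weighted_mean(concentrations, populations):
    """Weighted mean of block-point concentrations by resident count."""
    total_population = sum(populations)
    weighted_sum = sum(c * p for c, p in zip(concentrations, populations))
    return weighted_sum / total_population
```

For example, two block points with concentrations of 10 and 20 and populations of 100 and 300 give (10 × 100 + 20 × 300) / 400 = 17.5.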

Results

National LUR model results. Table 2 summarizes the national LUR model results. The PM2.5 model predicted 46% of PM2.5 variation and was dominated by satellite predictions, which alone explained 41% of PM2.5 variation. The NO2 model predicted 73% of NO2 variation and length of all roads within 10 km was the dominant predictor, explaining 55% of NO2 variation. This variable was only moderately correlated (r = 0.56) to NO2 predictions from satellite data, which further explained 4% of NO2 variation in the final model. The models for benzene, ethylbenzene, and 1,3-butadiene had similar predictive results, explaining 62, 67, and 68% of pollutant variability, respectively. Data from one monitor were removed as an outlier from the benzene and ethylbenzene models (St. John Baptiste, located in Montreal east city) and from the 1,3-butadiene model (Sarnia, located in southern Ontario near the Detroit–Windsor border), which were associated with the highest pollutant concentration for each substance.

Table 2. National LUR model results for PM2.5, NO2, benzene, ethylbenzene, and 1,3-butadiene.

| Variable | Distance (a) | Value | SE | p-Value |
|---|---|---|---|---|
| PM2.5 model (R2 = 0.46, RMSE = 1.529) | | | | |
| Intercept | | 2.802 | 0.497 | < 0.0001 |
| Satellite PM2.5 (µg/m3) | | 2.392 | 0.263 | < 0.0001 |
| NPRI emissions (tonnes) | 5 km | 1.63e–3 | 5.95e–4 | 0.007 |
| Industrial land use (m2) | 1 km | 1.03e–6 | 4.18e–7 | 0.014 |
| NO2 model (R2 = 0.73, RMSE = 5.470) | | | | |
| Intercept | | 13.179 | 1.374 | < 0.0001 |
| Satellite NO2 (ppb) | | 1.4903 | 0.355 | < 0.0001 |
| Industrial land use (m2) | 2 km | 3.21e–6 | 5.73e–7 | < 0.0001 |
| Road length (m) | 10 km | 7.42e–6 | 9.04e–7 | < 0.0001 |
| Summer rainfall (mm) | | –0.010 | 0.002 | < 0.0001 |
| Benzene model (b) (R2 = 0.62, RMSE = 0.298) | | | | |
| Intercept | | 0.346 | 0.069 | < 0.001 |
| Major road length (m) | 10 km | 1.18e–6 | 2.56e–7 | < 0.001 |
| NPRI emissions (present) | 10 km | 0.526 | 0.089 | < 0.001 |
| Ethylbenzene model (c) (R2 = 0.67, RMSE = 0.193) | | | | |
| Intercept | | 0.152 | 0.039 | < 0.001 |
| Population (count) | 10 km | 6.74e–7 | 7.25e–8 | < 0.001 |
| NPRI emissions (present) | 2 km | 0.272 | 0.071 | < 0.001 |
| 1,3-Butadiene model (d) (R2 = 0.68, RMSE = 0.034) | | | | |
| Intercept | | 0.011 | 0.009 | 0.208 |
| Road length (m) | 750 m | 3.89e–6 | 7.93e–7 | < 0.001 |
| Highway (present) | 500 m | 0.041 | 0.012 | 0.002 |
| Commercial land use (m2) | 10 km | 1.60e–9 | 5.97e–10 | 0.010 |

Satellite PM2.5 and NO2 are satellite-derived estimates of PM2.5 and NO2. Land use is the area of specific land-use types (industrial, commercial) within the associated buffer distance. Road length refers to the length of different road classifications (all, major, highways) within the associated buffer distance. Summer rainfall refers to the amount of rainfall recorded from May to September from the nearest meteorological station. NPRI emissions refer to the amount of annual emissions of the model substance released from industries that reported to the NPRI. NPRI emissions (present) refers to the presence of NPRI facilities that have released a model substance into the air. Population (count) refers to the number of individuals who resided within the associated buffer distance. (a) Radius of circular buffers used to derive variables. (b) One outlier removed with benzene concentration of 3.55 µg/m3. (c) One outlier removed with ethylbenzene concentration of 2.57 µg/m3. (d) One outlier removed with 1,3-butadiene concentration of 0.82 µg/m3.
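
To show how the Table 2 coefficients combine into a prediction, the sketch below evaluates the NO2 LUR model as a plain linear equation. The coefficient values are copied from Table 2; the variable names and the input values in the example are hypothetical, and the result is in the same units as the NAPS NO2 measurements used to fit the model.

```python
# NO2 LUR coefficients from Table 2.
NO2_INTERCEPT = 13.179
NO2_SATELLITE = 1.4903        # per ppb of satellite-derived NO2
NO2_INDUSTRIAL_2KM = 3.21e-6  # per m2 of industrial land use within 2 km
NO2_ROADS_10KM = 7.42e-6      # per m of road length within 10 km
NO2_SUMMER_RAIN = -0.010      # per mm of May-September rainfall


def predict_no2(satellite_no2_ppb, industrial_area_m2_2km,
                road_length_m_10km, summer_rain_mm):
    """Background/regional NO2 estimate before any gradient is applied."""
    return (NO2_INTERCEPT
            + NO2_SATELLITE * satellite_no2_ppb
            + NO2_INDUSTRIAL_2KM * industrial_area_m2_2km
            + NO2_ROADS_10KM * road_length_m_10km
            + NO2_SUMMER_RAIN * summer_rain_mm)
```

With illustrative inputs of 2.0 ppb satellite NO2, 1,000,000 m2 of industrial land, 1,000,000 m of roads, and 300 mm of summer rain, the estimate is 13.179 + 2.9806 + 3.21 + 7.42 − 3.0 = 23.7896.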

Spatial autocorrelation of national LUR models. Spatial autocorrelation of the LUR model residuals was examined using Moran’s I in ArcGIS. Spatial autocorrelation was present in the PM2.5 LUR model residuals (Moran’s I = 0.33, p < 0.001), indicating that a moderate amount of spatial autocorrelation remained that was not explained by the PM2.5 model predictors. Clustering of positive residuals (model underpredicting by an average of 2.57 µg/m3) occurred in the rural interior of British Columbia. An indicator variable for British Columbia substantially reduced the spatial autocorrelation (Moran’s I = 0.03, p = 0.04). Sensitivity analysis using a summer-only PM2.5 model indicated no spatial autocorrelation (Moran’s I = 0.04, p = 0.01), supporting our hypothesis of woodburning as the primary source of model underprediction in this region. No significant spatial autocorrelation existed in LUR model residuals for NO2 (Moran’s I = 0.03, p = 0.44), benzene (Moran’s I = –0.20, p = 0.13), ethylbenzene (Moran’s I = –0.00, p = 0.87), and 1,3-butadiene (Moran’s I = 0.09, p = 0.32).
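
For readers unfamiliar with the statistic, global Moran's I for a set of residuals can be computed as sketched below. The analysis above used ArcGIS; this sketch is not a reproduction of that tool. It assumes inverse-distance weights between all pairs of monitors at distinct locations, omits the significance test, and uses a hypothetical function name.

```python
import math


def morans_i(points, residuals):
    """Global Moran's I with inverse-distance weights.

    points: list of (x, y) monitor coordinates (all distinct).
    residuals: model residual at each monitor, in the same order.
    """
    n = len(residuals)
    mean = sum(residuals) / n
    z = [r - mean for r in residuals]
    weighted_cross = 0.0
    weight_sum = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            w = 1.0 / math.dist(points[i], points[j])
            weight_sum += w
            weighted_cross += w * z[i] * z[j]
    return (n / weight_sum) * (weighted_cross / sum(v * v for v in z))
```

Values well above the expected value under no autocorrelation, −1/(n − 1), indicate that similar residuals cluster in space, as reported for the PM2.5 model in the interior of British Columbia.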

Incorporating gradients with national LUR models. Deterministic gradients were added to LUR models, because we could not estimate the effects of local-scale pollution sources from NAPS data alone. Figure 2A illustrates the final PM2.5 model (LUR plus gradients) for Canada as a whole and for southern Ontario and the city of Toronto. Figure 2B illustrates the final national NO2 model (LUR plus gradients) for Canada as a whole and for southwestern British Columbia and the city of Vancouver. These maps illustrate the spatial resolution of the final national pollutant models; however, for population exposure assessment, the LUR model results and deterministic gradients were applied to street block point locations, as shown in Figure 3, which illustrates the final national benzene model (LUR plus gradients) calculated at the block point level.

Figure 2. National annual average models for PM2.5, highlighting southern Ontario and the city of Toronto (A), and for NO2, highlighting southwestern British Columbia and the city of Vancouver (B), that incorporate satellite-derived pollutant estimates, geographic land use variables, and deterministic gradients. The seven cities shown in (B) represent locations of independent monitoring data used to evaluate the national NO2 and benzene models.

Figure 3. National benzene LUR model plus gradients (illustrating the city of Toronto) calculated for each street block point in Canada (n = 478,831).

Evaluation of national pollutant models. Evaluation of LUR models using bootstrap and leave-one-out analyses. The bootstrap distributions of all model coefficients for each pollutant were normally distributed. The NO2 model was the least sensitive to monitor selection, with a bootstrap R2 95% CI of 65–81%. Models for PM2.5, benzene, ethylbenzene, and 1,3-butadiene were more sensitive to monitor selection, with R2 95% CIs of 33–59%, 44–80%, 49–85%, and 53–82%, respectively. Variable coefficients for industrial NPRI proximity variables were extremely sensitive to monitor selection. The leave-one-out analyses indicated no significant bias in any LUR model, as demonstrated by the mean ± SD error: 1.07e–3 ± 5.61 for NO2; –6.35e–3 ± 1.59 for PM2.5; –0.04 ± 0.32 for benzene; –0.01 ± 0.04 for 1,3-butadiene; and –0.04 ± 0.22 for ethylbenzene.

Evaluation of NO2 and benzene models using city-specific data. On average, the national NO2 LUR plus gradient model predicted 43% of the within-city NO2 variation (based on the city-specific data evaluation) compared with 22% predicted based on IDW of NAPS monitors plus gradients (Table 3). National LUR, LUR plus gradients, IDW, and IDW plus gradients models overpredicted the city-specific NO2 measurements, with average city-specific intercepts of 4.56, 7.45, 8.51, and 11.56 µg/m3, respectively. City-specific scatter plots of measured and modeled NO2 concentrations are illustrated in Supplemental Material, Figure 1 (doi:10.1289/ehp.1002976).

Table 3. Evaluation of national NO2 and benzene models, as well as IDW estimates from fixed-site monitors, against independent city-specific measurement data. Values are R2 (RMSE).

| Substance / city | n (a) | LUR (b) | LUR + G (c) | IDW (d) | IDW + G (e) |
|---|---|---|---|---|---|
| NO2 | | | | | |
| Edmonton | 50 | 0.60 (3.67) | 0.41 (4.59) | 0.10 (5.52) | 0.01 (5.92) |
| Montreal | 135 | 0.41 (4.28) | 0.48 (4.04) | 0.31 (4.63) | 0.41 (4.29) |
| Sarnia | 34 | 0.42 (4.21) | 0.49 (4.04) | 0.12 (5.15) | 0.19 (5.12) |
| Toronto | 196 | 0.18 (7.69) | 0.36 (6.78) | 0.13 (7.93) | 0.32 (6.99) |
| Victoria | 40 | 0.19 (3.95) | 0.37 (3.70) | 0.23 (3.86) | 0.26 (3.98) |
| Vancouver | 114 | 0.31 (6.41) | 0.42 (5.93) | 0.31 (6.43) | 0.36 (6.24) |
| Winnipeg | 49 | 0.54 (3.65) | 0.51 (3.86) | 0.08 (5.17) | 0.02 (5.43) |
| Average | 618 | 0.39 (4.84) | 0.43 (4.71) | 0.18 (5.53) | 0.22 (5.42) |
| Benzene | | | | | |
| Montreal (f) | 131 | 0.33 (0.24) | 0.26 (0.25) | 0.11 (0.28) | 0.05 (0.29) |
| Sarnia | 37 | 0.02 (0.57) | 0.04 (0.56) | 0.00 (0.57) | 0.03 (0.56) |
| Toronto (g) | 44 | 0.03 (0.19) | 0.22 (0.17) | 0.00 (0.19) | 0.34 (0.16) |
| Winnipeg | 94 | 0.08 (0.25) | 0.10 (0.25) | 0.00 (0.26) | 0.01 (0.26) |
| Average | 306 | 0.12 (0.31) | 0.16 (0.31) | 0.03 (0.33) | 0.11 (0.32) |

(a) Number of within-city measurement locations. (b) National LUR model. (c) National LUR model plus gradients (G). (d) IDW interpolation of NAPS fixed-site monitoring data. (e) IDW interpolation of NAPS fixed-site monitoring data plus gradients. (f) Four outliers removed with highest city concentrations (> 2 µg/m3). (g) One outlier removed with highest city concentration (4.10 µg/m3).

For benzene, all modeling methods performed poorly in explaining within-city benzene variation. The LUR plus gradients model explained, on average, only 16% of within-city variability in benzene concentrations compared with 11% based on IDW plus gradients (Table 3). In the evaluation using the Montreal city-specific benzene concentrations, four outliers were removed (all concentrations > 2 µg/m3), and one outlier (4.10 µg/m3) was removed in the Toronto evaluation. Benzene models also overpredicted city-specific concentrations, based on city-specific intercepts of modeled versus measured concentrations [see Supplemental Material Figure 2 (doi:10.1289/ehp.1002976)]. Sarnia, a high-density industrial community with 46 NPRI emitters, had poor NO2 and benzene model evaluations.

Canadian population exposure assessment. The final LUR models and gradients were applied to all 478,831 street block-face centroid locations to conduct population exposure assessments. Estimated mean (95% CI) population exposures (micrograms per cubic meter) to ambient PM2.5, NO2, benzene, ethylbenzene, and 1,3-butadiene in Canada based on the LUR models were 8.10 (5.84–10.43), 22.40 (13.14–33.51), 0.94 (0.57–1.31), 0.38 (0.25–0.52), and 0.086 (0.035–0.138), respectively. Estimates for the same pollutants based on the national LUR plus gradients models were 8.39 (6.00–11.13), 23.37 (14.01–35.73), 1.04 (0.59–1.49), 0.63 (0.35–1.10), and 0.089 (0.036–0.146), respectively. Wide ranges of exposure levels were estimated in Canada for all substances; see Supplemental Material, Figure 3 (doi:10.1289/ehp.1002976) for population exposure distributions.

Discussion

We created national pollutant models from fixed-site monitoring data that incorporate satellite, geographic, and deterministic components and demonstrated that these models can improve exposure assessment over large geographic areas compared with approaches based solely on interpolation of fixed-site monitoring data. We also demonstrated how these models can be used for population exposure assessment.

The national LUR models explained 73% of pollution variation in NAPS measurements for NO2, and lesser degrees for PM2.5 (46%), benzene (62%), ethylbenzene (67%), and 1,3-butadiene (68%). The NO2 and PM2.5 models were least sensitive to monitor selection, whereas models for VOCs were more sensitive—likely because of the smaller number of monitors on which LUR estimates were based (n = 53). The predictive performance of the PM2.5 model [R2 = 0.46, root mean square error (RMSE) = 1.53 µg/m3] was consistent with other large-scale modeling studies based on different monitoring methodologies and data inputs (Beelen et al. 2009; Hart et al. 2009; Liao et al. 2006; Ross et al. 2007).

The national LUR models generally captured regional patterns in pollutant concentrations, corresponding to NAPS monitor siting criteria, but were less effective at identifying small-scale geographic predictor variables. For example, only 35 NAPS monitors were located within 500 m of a major road and only 7 monitors were within 500 m of a major industrial emission source. Such small sample sizes greatly reduce the power of the models to capture these specific pollutant sources. Some city-specific LUR methods have used location-allocation methods to more fully represent the true spatial variation in pollution levels and to capture the range of predictor variables (Jerrett et al. 2005). Models based on fixed-site monitor data may therefore need additional approaches to represent local-scale pollutant variability not captured by fixed-site monitors. This was indeed the case with the Canadian NAPS network, but larger regulatory networks, such as those in the United States, may better represent the range of predictor variables needed to build local-scale LUR models.

To address the lack of local-scale geographic variability in the NAPS data, we incorporated deterministic gradients based on proximity to specific sources (i.e., vehicles and gas stations). The final NO2 LUR plus gradient model improved prediction of within-city pollutant variation considerably compared with the LUR model alone and interpolation methods. On average, the final model predicted 43% of within-city NO2 variation compared with 18% using IDW. Both the national benzene model and IDW predicted within-city benzene poorly, which may be due to the small number of NAPS monitors on which the model was based, the relatively small variation in within-city benzene levels, or the inability of gradients to capture local benzene concentrations. Similar to the NO2 model, the evaluation of the benzene model with Sarnia data was poor, reflecting the difficulty in capturing unique high-density industrial conditions in a national model.
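The IDW comparison above refers to inverse distance weighting of fixed-site monitor concentrations. As a rough illustration only, a basic IDW estimator might look like the sketch below. The function name, the planar coordinates in metres, and the distance exponent of 2 are assumptions made for the example; the IDW settings used in the study are not given in this section.

```python
import math


def idw_estimate(target, monitors, power=2.0):
    """Inverse-distance-weighted concentration at `target` (x, y).

    `monitors` is a list of ((x, y), concentration) pairs. The exponent
    `power` = 2 is an illustrative assumption, not a study parameter.
    """
    if not monitors:
        raise ValueError("at least one monitor is required")
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), conc in monitors:
        dist = math.hypot(target[0] - x, target[1] - y)
        if dist == 0.0:
            # Target coincides with a monitor: use its value directly
            return conc
        weight = 1.0 / dist ** power
        weighted_sum += weight * conc
        weight_total += weight
    return weighted_sum / weight_total


# Two hypothetical monitors 2 km apart; the target lies midway between them
monitors = [((0.0, 0.0), 10.0), ((2000.0, 0.0), 20.0)]
midpoint = idw_estimate((1000.0, 0.0), monitors)
```

Because the estimate is a smooth average of monitor values, it cannot reproduce concentrations higher or lower than those measured, which helps explain why interpolation captures little within-city variation when monitors are sparse.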

Gradients were based on literature reviews. Clear limitations were the lack of methodological consistency among published data on pollutant-level increases near specific sources and on the distance required for pollutant levels to return to background. To improve the reliability of the gradients, we used linear functions to represent the decreases in pollutant levels found in the initial portions of the exponential decay curves reported in the literature. The methodology used here could be augmented as new gradients become available or with other modeled data.
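The linear gradient idea can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: the function names, the peak increment, and the reach distance are hypothetical, and the gradient is assumed here to be added to the LUR prediction, which this section does not confirm.

```python
def linear_gradient(distance_m, peak_increment, reach_m):
    """Concentration increment above background at `distance_m` from a source.

    The increment equals `peak_increment` at the source and declines
    linearly to zero at `reach_m`. Both values are hypothetical inputs.
    """
    if reach_m <= 0:
        raise ValueError("reach_m must be positive")
    if distance_m < 0:
        raise ValueError("distance_m must be non-negative")
    if distance_m >= reach_m:
        return 0.0  # beyond the gradient's reach: background only
    return peak_increment * (1.0 - distance_m / reach_m)


def lur_plus_gradient(lur_prediction, distance_m, peak_increment, reach_m):
    """Add the local gradient to a regional LUR prediction (assumed additive)."""
    return lur_prediction + linear_gradient(distance_m, peak_increment, reach_m)


# Hypothetical example: 10 µg/m3 LUR estimate, 50 m from a road whose
# gradient starts at +4 µg/m3 and reaches background at 200 m
estimate = lur_plus_gradient(10.0, 50.0, 4.0, 200.0)
```

A linear form needs only two parameters per source type, which may make it easier to populate from inconsistent literature values than a full exponential decay curve.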

Population exposure assessment was conducted using the national models and census street block-face points. The population-weighted average exposures to PM2.5, NO2, benzene, ethylbenzene, and 1,3-butadiene were 8.39, 23.37, 1.04, 0.63, and 0.089 µg/m3, respectively. The uncertainty of the population exposure estimates was driven primarily by LUR model uncertainty. Although the results of the national LUR models are similar to city-specific LUR models in their predictive capacity and error, we are unaware of any LUR models that have been applied to estimate exposure uncertainty. Although these exposures are low compared with other developed countries, exposures in particular locations in Canada are relatively high. For example, the 90th percentiles of exposures (micrograms per cubic meter) are 9.78 for PM2.5, 34.81 for NO2, 1.61 for benzene, 1.01 for ethylbenzene, and 0.14 for 1,3-butadiene. The ability of the national models to capture local-scale pollutant variability allows for more realistic exposure assessments and assessments that can potentially identify high-risk populations. Future work will refine approaches for using the national models to calculate population exposure assessments, incorporate socioeconomic information from the census to examine environmental injustice issues, and integrate national models into a risk-assessment framework that incorporates exposures from other sources and microenvironments.
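A population-weighted average and a population-weighted percentile over block-face points can be sketched as below. The function names and the three example points are hypothetical, and the percentile definition (the lowest concentration at or below which the stated share of the population falls) is one common convention assumed here, not necessarily the one used in the study.

```python
def population_weighted_mean(concentrations, populations):
    """Mean concentration weighted by the population at each point."""
    total_pop = sum(populations)
    if total_pop <= 0:
        raise ValueError("total population must be positive")
    weighted = sum(c * p for c, p in zip(concentrations, populations))
    return weighted / total_pop


def population_weighted_percentile(concentrations, populations, fraction):
    """Lowest concentration at or below which `fraction` of people fall."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    total_pop = sum(populations)
    if total_pop <= 0:
        raise ValueError("total population must be positive")
    threshold = fraction * total_pop
    cumulative = 0.0
    # Walk the points from lowest to highest modeled concentration
    for conc, pop in sorted(zip(concentrations, populations)):
        cumulative += pop
        if cumulative >= threshold:
            return conc
    return max(concentrations)  # guard against floating-point shortfall


# Three hypothetical block-face points (µg/m3, residents)
conc = [5.0, 10.0, 20.0]
pop = [100, 50, 50]
mean_exposure = population_weighted_mean(conc, pop)
p90_exposure = population_weighted_percentile(conc, pop, 0.9)
```

Weighting by population rather than by area is what lets densely populated block-face points dominate the national averages reported above.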

This study faced a number of challenges and limitations to creating national pollutant models from fixed-site monitors and applying these models to estimate Canadian population exposures. First, the NAPS monitors in Canada are centered in large metropolitan areas, and modeled relationships will therefore be weighted toward these areas. This is appropriate for population exposure assessment, because these locations represent the majority of Canadians, but in rural areas the models could be adjusted or a background concentration could be used. This is particularly relevant to the benzene, ethylbenzene, and 1,3-butadiene models, which were based on data from monitors located almost exclusively in large urban areas or sited near large industrial sources. Second, we had limited data on pollutant sources and source strengths such as traffic volumes. In addition, we did not model emissions from woodburning stoves and forest fires, which may have caused us to underpredict PM2.5 concentrations in the interior of British Columbia. Third, parsimonious LUR models were created, because the specificity of model variables may be important for informing surveillance and regulation. This, however, leads to models that do not capture the complex interactions between geographic characteristics and pollutant sources, and even the simplest LUR predictors (e.g., major roads or NPRI facilities within 10 km) capture complex mixes of geographic characteristics and pollutant sources. Fourth, we compared model estimates with city-specific measurements for NO2 and benzene collected in different years and using a variety of methodologies. Nevertheless, these measurements represent the best data on within-city pollutant variability available. Fifth, applying LUR model results to approximately half a million block points is currently extremely computationally and time intensive. Finally, the geographic accuracy of street block centroids may introduce errors into the gradient portions of the models and therefore the exposure assessment, particularly between rural and urban areas. These errors, however, are likely spatially random within rural and urban areas across Canada.

Conclusions

National exposure models were required by Carex Canada to produce population exposure assessments that captured both between-city and within-city pollution variability. We created national PM2.5, NO2, benzene, ethylbenzene, and 1,3-butadiene models from fixed-site monitoring data and found that a combination of data sources and methods to capture background, regional, and local-scale pollution variation improved exposure assessment over traditional IDW interpolation approaches. The national pollutant models were applied to street block-face points, representing the locations of the Canadian population, to determine population exposure estimates. Estimates of average population exposure levels in Canada are PM2.5 8.39, NO2 23.37, benzene 1.04, ethylbenzene 0.63, and 1,3-butadiene 0.09 (µg/m3). The modeling approach developed here uses readily available data and could be reproduced over time, for example, every 5 years with the Canadian census. This would provide updated population exposure assessments and a long-term surveillance capacity for monitoring trends in population exposures, for identifying potential susceptible populations and geographic locations with elevated exposures, and for evaluating the impacts of policies and regulatory changes on exposure levels.

Supplemental Material

(80 KB) PDF

Acknowledgments

We thank R. Allen for providing the Edmonton and Winnipeg monitoring data; D. Crouse, M. Goldberg, and N. Ross for the Montreal data; and O. Atari for the Sarnia data.

Footnotes

The research was supported by a grant from the Canadian Partnership against Cancer. Health Canada also provided support for the development of the satellite-derived pollution estimates.

The authors declare they have no actual or potential competing financial interests.

References

  1. Allen RW, Amram O, Wheeler A, Brauer M. The transferability of NO and NO2 land use regression models between cities and pollutants. Atmos Environ. 2010;45(2):369–378.
  2. Atari DO, Luginaah IN. Assessing the distribution of volatile organic compounds using land use regression in Sarnia, “Chemical Valley,” Ontario, Canada. Environ Health. 2009;8:16. doi: 10.1186/1476-069X-8-16 [Online 16 April 2009].
  3. Beckerman B, Jerrett M, Brook JR, Verma DK, Arain MA, Finkelstein MM. Correlation of nitrogen dioxide with other traffic pollutants near a major expressway. Atmos Environ. 2008;42(2):275–290.
  4. Beelen R, Hoek G, Pebesma E, Vienneau D, de Hoogh K, Briggs DJ. Mapping of background air pollution at a fine spatial scale across the European Union. Sci Total Environ. 2009;407(6):1852–1867. doi: 10.1016/j.scitotenv.2008.11.048.
  5. Carex Canada. Surveillance of Environmental and Occupational Exposures for Cancer Prevention. Carcinogen Database. 2011. Available: http://www.carexcanada.ca/en/our_research [accessed 15 January 2011]
  6. Crouse DL, Goldberg MS, Ross NA. A prediction-based approach to modeling temporal and spatial variability of traffic-related air pollution in Montreal, Canada. Atmos Environ. 2009;43(32):5075–5084.
  7. Cyrys J, Hochadel M, Gehring U, Hoek G, Diegmann V, Brunekreef B, et al. GIS-based estimation of exposure to particulate matter and NO2 in an urban area: stochastic versus dispersion modeling. Environ Health Perspect. 2005;113:987. doi: 10.1289/ehp.7662.
  8. Environment Canada. National Pollutant Release Inventory. Tracking Pollution in Canada. 2010. Available: http://www.ec.gc.ca/inrp-npri/default.asp?lang=en [accessed 17 January 2011]
  9. GEOS-Chem. GEOS–CHEM Model. 2011. Available: http://acmg.seas.harvard.edu/geos/ [accessed 15 January 2011]
  10. Gilbert NL, Goldberg MS, Brook JR, Jerrett M. The influence of highway traffic on ambient nitrogen dioxide concentrations beyond the immediate vicinity of highways. Atmos Environ. 2007;41(12):2670–2673.
  11. Gilbert NL, Woodhouse S, Stieb DM, Brook JR. Ambient nitrogen dioxide and distance from a major highway. Sci Total Environ. 2003;312(1–3):43–46. doi: 10.1016/S0048-9697(03)00228-6.
  12. Global Land Cover Characterization. Global Land Cover Characterization Background. 2008. Available: http://edc2.usgs.gov/glcc/background.php [accessed 3 October 2010]
  13. Google Scholar. Advanced Scholar Search Homepage. 2010. Available: http://scholar.google.com/advanced_scholar_search [accessed 24 November 2010]
  14. Hart JE, Yanosky JD, Puett RC, Ryan L, Dockery DW, Smith TJ, et al. Spatial modeling of PM10 and NO2 in the continental United States, 1985–2000. Environ Health Perspect. 2009;117:1690–1696. doi: 10.1289/ehp.0900840.
  15. Hellén H, Hakola H, Pirjola L, Laurila T, Pystynen KH. Ambient air concentrations, source profiles, and source apportionment of 71 different C2−C10 volatile organic compounds in urban and residential areas of Finland. Environ Sci Technol. 2006;40(1):103–108. doi: 10.1021/es051659d.
  16. Henderson SB, Beckerman B, Jerrett M, Brauer M. Application of land use regression to estimate long-term concentrations of traffic-related nitrogen oxides and fine particulate matter. Environ Sci Technol. 2007;41(7):2422–2428. doi: 10.1021/es0606780.
  17. Hitchins J, Morawska L, Wolff R, Gilbert D. Concentrations of submicrometre particles from vehicle emissions near a major road. Atmos Environ. 2000;34(1):51–59.
  18. Hoek G, Beelen R, de Hoogh K, Vienneau D, Gulliver J, Fischer P, et al. A review of land-use regression models to assess spatial variation of outdoor air pollution. Atmos Environ. 2008;42(33):7561–7578.
  19. Jerrett M, Arain M, Kanaroglou P, Beckerman B, Crouse D, Gilbert N, et al. Modeling the intraurban variability of ambient traffic pollution in Toronto, Canada. J Toxicol Environ Health A. 2007;70(3–4):200–212. doi: 10.1080/15287390600883018.
  20. Jerrett M, Arain A, Kanaroglou P, Beckerman B, Potoglou D, Sahsuvaroglu T, et al. A review and evaluation of intraurban air pollution exposure models. J Expo Anal Environ Epidemiol. 2005;15(2):185–204. doi: 10.1038/sj.jea.7500388.
  21. Karakitsios SP, Delis VK, Kassomenos PA, Pilidis GA. Contribution to ambient benzene concentrations in the vicinity of petrol stations: estimation of the associated health risk. Atmos Environ. 2007;41(9):1889–1902.
  22. Lamsal L, Martin R, van Donkelaar A, Steinbacher M, Celarier E, Bucsela E, et al. Ground-level nitrogen dioxide concentrations inferred from the satellite-borne ozone monitoring instrument. J Geophys Res. 2008;113:D16308. doi: 10.1029/2007JD009235 [Online 28 August 2008].
  23. Liao D, Peuquet DJ, Duan Y, Whitsel EA, Dou J, Smith RL, et al. GIS approaches for the estimation of residential-level ambient PM concentrations. Environ Health Perspect. 2006;114:1374–1380. doi: 10.1289/ehp.9169.
  24. Liu Y, Paciorek CJ, Koutrakis P. Estimating regional spatial and temporal variability of PM2.5 concentrations using satellite data, meteorology, and land use information. Environ Health Perspect. 2009;117:886–892. doi: 10.1289/ehp.0800123.
  25. Martin RV. Satellite remote sensing of surface air quality. Atmos Environ. 2008;42(34):7823–7843.
  26. Mukerjee S, Smith LA, Johnson MM, Neas LM, Stallings CA. Spatial analysis and land use regression of VOCs and NO2 from school-based urban air monitoring in Detroit/Dearborn, USA. Sci Total Environ. 2009;407(16):4642–4651. doi: 10.1016/j.scitotenv.2009.04.030.
  27. Nafstad P, Haheim LL, Oftedal B, Gram F, Holme I, Hjermann I, et al. Lung cancer and air pollution: a 27 year follow up of 16,209 Norwegian men. Thorax. 2003;58(12):1071–1076. doi: 10.1136/thorax.58.12.1071.
  28. NAPS (National Air Pollution Surveillance Network). NAPS Quality Assurance and Quality Control Guidelines. Report No. AAQD 2004-1. 2004. Available: http://www.etc-cte.ec.gc.ca/publications/naps/NAPSQAQC.pdf [accessed 15 January 2011]
  29. NASA (National Aeronautics and Space Administration). Aura Satellite. 2011a. Available: http://aura.gsfc.nasa.gov/ [accessed 23 May 2011]
  30. NASA (National Aeronautics and Space Administration). Terra Satellite. 2011b. Available: http://terra.nasa.gov/ [accessed 3 May 2011]
  31. Parra M, Elustondo D, Bermejo R, Santamaria J. Ambient air levels of volatile organic compounds (VOC) and nitrogen dioxide (NO2) in a medium size city in northern Spain. Sci Total Environ. 2009;407(3):999–1009. doi: 10.1016/j.scitotenv.2008.10.032.
  32. PubMed Central. PubMed Central Homepage. 2010. Available: http://www.pubmedcentral.nih.gov/ [accessed 15 November 2010]
  33. Roorda-Knape MC, Janssen NAH, De Hartog JJ, Van Vliet PHN, Harssema H, Brunekreef B. Air pollution from traffic in city districts near major motorways. Atmos Environ. 1998;32(11):1921–1930.
  34. Ross Z, Jerrett M, Ito K, Tempalski B, Thurston GD. A land use regression for predicting fine particulate matter concentrations in the New York City region. Atmos Environ. 2007;41(11):2255–2269.
  35. Roukos J, Riffault V, Locoge N, Plaisance H. VOC in an urban and industrial harbor on the French North Sea coast during two contrasted meteorological situations. Environ Pollut. 2009;157(11):3001–3009. doi: 10.1016/j.envpol.2009.05.059.
  36. Smargiassi A, Baldwin M, Pilger C, Dugandzic R, Brauer M. Small-scale spatial variability of particle concentrations and traffic levels in Montreal: a pilot study. Sci Total Environ. 2005;338(3):243–251. doi: 10.1016/j.scitotenv.2004.07.013.
  37. Smith L, Mukerjee S, Gonzales M, Stallings C, Neas L, Norris G, et al. Use of GIS and ancillary variables to predict volatile organic compound and nitrogen dioxide levels at unmonitored locations. Atmos Environ. 2006;40(20):3773–3787.
  38. Statistics Canada. Block-face representative points. 2006. Available: http://www12.statcan.ca/census-recensement/2006/ref/dict/geo040a-eng.cfm [accessed 3 October 2011]
  39. Su J, Jerrett M, Beckerman B, Verma D, Altaf Arain M, Kanaroglou P, et al. A land use regression model for predicting ambient volatile organic compound concentrations in Toronto, Canada. Atmos Environ. 2010;44(29):3529–3537.
  40. Su JG, Jerrett M, Beckerman B, Wilhelm M, Ghosh JK, Ritz B. Predicting traffic-related air pollution in Los Angeles using a distance decay regression selection strategy. Environ Res. 2009;109(6):657–670. doi: 10.1016/j.envres.2009.06.001.
  41. Thomson Reuters. ISI Web of Knowledge Central Homepage. 2010. Available: http://apps.isiknowledge.com/WOS_GeneralSearch_input.do?highlighted_tab=WOS&product=WOS&last_prod=WOS&search_mode=GeneralSearch&SID=1AjaopJLKe4Mb31h29I [accessed 9 November 2010]
  42. Thorsson S, Eliasson I. Passive and active sampling of benzene in different urban environments in Gothenburg, Sweden. Water Air Soil Pollut. 2006;173(1):39–56.
  43. Tiitta P, Raunemaa T, Tissari J, Yli-Tuomi T, Leskinen A, Kukkonen J, et al. Measurements and modeling of PM2.5 concentrations near a major road in Kuopio, Finland. Atmos Environ. 2002;36(25):4057–4068.
  44. van Donkelaar A, Martin RV, Brauer M, Kahn R, Levy R, Verduzco C, et al. Global estimates of ambient fine particulate matter concentrations from satellite-based aerosol optical depth: development and application. Environ Health Perspect. 2010;118:847–855. doi: 10.1289/ehp.0901623.
  45. Vardoulakis S, Gonzalez-Flesca N, Fisher B. Assessment of traffic-related air pollution in two street canyons in Paris: implications for exposure studies. Atmos Environ. 2002;36(6):1025–1039.
  46. Venkatram A, Isakov V, Seila R, Baldauf R. Modeling the impacts of traffic emissions on air toxics concentrations near roadways. Atmos Environ. 2009;43(20):3191–3199.
  47. Wang P, Zhao W. Assessment of ambient volatile organic compounds (VOCs) near major roads in urban Nanjing, China. Atmos Res. 2008;89(3):289–297.
  48. Wheeler AJ, Smith-Doiron M, Xu X, Gilbert NL, Brook JR. Intra-urban variability of air pollution in Windsor, Ontario-measurement and modeling for human exposure assessment. Environ Res. 2008;106(1):7–16. doi: 10.1016/j.envres.2007.09.004.
  49. Yanosky JD, Paciorek CJ, Schwartz J, Laden F, Puett R, Suh HH. Spatio-temporal modeling of chronic PM10 exposure for the Nurses’ Health Study. Atmos Environ. 2008;42(18):4047–4062. doi: 10.1016/j.atmosenv.2008.01.044.

Articles from Environmental Health Perspectives are provided here courtesy of National Institute of Environmental Health Sciences