Assessing NO2 Concentration and Model Uncertainty with High Spatiotemporal Resolution across the Contiguous United States Using Ensemble Model Averaging

Qian Di; Heresh Amini; Liuhua Shi; Itai Kloog; Rachel Silvern; James Kelly; M Benjamin Sabath; Christine Choirat; Petros Koutrakis; Alexei Lyapustin; Yujie Wang; Loretta J Mickley; Joel Schwartz

doi:10.1021/acs.est.9b03358

Author manuscript; available in PMC: 2020 Mar 11.

Published in final edited form as: Environ Sci Technol. 2020 Jan 14;54(3):1372–1384. doi: 10.1021/acs.est.9b03358

PMCID: PMC7065654 NIHMSID: NIHMS1568557 PMID: 31851499

Abstract

NO₂ is a combustion byproduct that has been associated with multiple adverse health outcomes. To assess NO₂ level with high accuracy, we propose an ensemble model to integrate multiple machine learning algorithms, including neural network, random forest, and gradient boosting, with a variety of predictor variables, including chemical transport models. This NO₂ model covers the entire contiguous U.S. with daily predictions on 1-km-level grid cells from 2000 to 2016. The ensemble produced a cross-validated R² of 0.788 overall, a spatial R² of 0.844, and a temporal R² of 0.729. The relationship between daily monitored and predicted NO₂ is almost linear. We also estimated the associated monthly uncertainty level for the predictions and address-specific NO₂ levels. This NO₂ estimation has a very high spatiotemporal resolution and allows the examination of health effects of NO₂ in unmonitored areas. We found the highest NO₂ levels along highways and in cities. We also observed that nationwide NO₂ levels declined in early years and stagnated after 2007, in contrast to the trend at monitoring sites in urban areas, where the decline continued. Our research indicates that integrating different predictor variables and fitting algorithms can achieve an improved air pollution modeling framework.

Keywords: NO₂, Ensemble Model, Machine Learning, Neural Network, Gradient Boosting, Random Forest

1. Introduction

NO₂, or nitrogen dioxide, is a gaseous air pollutant that can affect the respiratory system¹ by increasing susceptibility to respiratory infections², exacerbating asthma symptoms³, and decreasing pulmonary function⁴. In addition to respiratory symptoms, evidence is mounting on the association of NO₂ with low birth weight⁵, cardiovascular diseases⁶, hospital admission, and mortality. For some health outcomes, the evidence is moderate⁷. Besides its direct health impacts, NO₂ can mediate the formation of secondary organic aerosol from biogenic (e.g., terpenes) and anthropogenic (e.g., aromatics from vehicle exhaust) sources via reactions with organic gases and by influencing oxidant abundance^{8–10}. It similarly drives reactions that produce the surface pollutant ozone.

NO₂ is an oxidative gas that reacts with other chemicals in the atmosphere. Mobile emissions are the major source of NO₂ in the United States¹¹, although power plants and other large fossil fuel combustors are also important and create local hotspots of NO₂. The resulting distribution of NO₂ is heterogeneous. NO₂ modeling, therefore, needs to capture small-scale variations, which can be challenging.

NO₂ concentrations also vary considerably from day to day because of the short atmospheric lifetime of NO₂. Chemical sources and sinks, the height of the planetary boundary layer, wind speed, and wind direction all influence concentrations at any location, so concentrations can change substantially between days even when emissions are similar. Hence, accurate modeling must also capture these temporal patterns.

Many existing NO₂ models were based on land-cover regression. Land-cover terms are proxies for traffic emissions and related to NO₂ concentrations indirectly. A typical NO₂ model is based on a land-cover regression with such quantities as road length, population density, tree canopy coverage, impervious surface, elevation, distance to coast¹², traffic flow¹³, traffic intensity¹⁴, land-cover type^{13, 15, 16}, road types¹⁷, building density¹⁸, and urban density¹⁹, as predictor variables.

Some of these land-cover regressions incorporated satellite measurements as well. NO₂ column density from OMI (Ozone Monitoring Instrument) has been widely used in NO₂ modeling^{15, 17, 20–25} because of its relatively fine spatial resolution (13 km × 24 km) and continuous operation since 2004. SCIAMACHY (SCanning Imaging Absorption spectroMeter for Atmospheric CHartographY) and GOME-2A (Global Ozone Monitoring Experiment-2A) also provided NO₂ column density but were used less often^{24, 26–28}, since GOME-2A and SCIAMACHY have much coarser resolutions (80 km × 40 km and 60 km × 30 km, respectively), daily coverage for GOME-2A was not available after 2012 due to a change in viewing configuration, and SCIAMACHY stopped data collection after 2012. NO₂ retrievals from satellite measurements are column concentrations. To obtain surface-level NO₂, existing studies used scaling factors (i.e., the vertical distribution of NO₂) from chemical transport models to derive the relationships between satellite retrievals and surface-level NO₂ concentrations^{21, 29}. Chemical transport models can also directly simulate surface-level NO₂^{15, 30, 31}, in addition to providing scaling factors.

In terms of fitting algorithms, most previous studies have used simple regression with some variable selection process, or more advanced regression methods, such as geographically weighted regression³². Most recently, several studies estimated NO₂ concentration using machine learning approaches. Gardner et al. used multilayer neural networks to model hourly NO₂ in Central London, which outperformed a regression model³³. Kukkonen et al. also found that a neural network outperformed a regression model when estimating NO₂ levels in central Helsinki, Finland³⁴. Yeganeh et al. employed an adaptive neuro-fuzzy inference system, a kind of artificial neural network, to estimate monthly mean NO₂ levels in a selected area in Australia, with model performance superior to that of a multiple regression model³⁵. Other machine learning algorithms were also utilized to model NO₂ in other regions, including Hong Kong, where a support vector machine predicted hourly NO₂³⁶, and urban Hungary, where a forecast model used a neural network and a support vector machine³⁷.

After reviewing existing NO₂ models, we found two major areas for improvement. First, no existing study achieved high spatial resolution, high temporal resolution, and large spatial coverage at the same time. NO₂ models with fine spatial and/or temporal resolution were often constrained to a small study area, usually at the city level^{13, 19, 31, 33, 38–43}, while studies extending over a larger area had either relatively low temporal resolution^{16, 18, 22, 30} (e.g., national models only available at the annual level) or low spatial resolution²³. Second, most existing studies relied on a single model and a single fitting algorithm to estimate NO₂, even though recent studies suggest that a hybrid model is better at integrating monitoring data, land-cover regression, remote sensing data, and dispersion data⁴⁴ and could potentially improve model performance²³.

Therefore, in this study we integrated multiple types of predictor variables and multiple types of machine learners into an ensemble model to estimate NO₂ with high spatial resolution (1 km × 1 km), high temporal resolution (daily), and large spatial coverage (the contiguous United States) from 2000 to 2016. We further added a land cover regression with meteorology to estimate within-grid variation. The ensemble model integrated neural network, random forest, and gradient boosting algorithms into a unified framework based on a generalized additive model for ensemble averaging. For predictor variables, we used satellite-based NO₂ measurements, an extensive number of land-cover variables, meteorological variables, simulation results from multiple chemical transport models, and some predictor variables not used by previous studies. We validated our model using 10-fold cross-validation and predicted daily NO₂ levels for every 1 km × 1 km grid cell in the entire contiguous United States from 2000 to 2016. We also quantified the uncertainty level by estimating the monthly standard deviation of the difference between daily monitored NO₂ and predicted NO₂, for the same 1 km × 1 km grid cells. This high resolution daily NO₂ estimation, along with predicted uncertainty, will help epidemiologists to better assess both long-term and short-term exposures for studies of large cohorts with residents in locations far from or without monitors.

2. Data

2.1. Study Area and NO₂ Measurements

Our study area is the contiguous United States, including 48 states and Washington, D.C. The contiguous United States has several NO₂ monitoring networks included in the Air Quality System (AQS) from the Environmental Protection Agency (EPA), encompassing 912 monitoring sites. Monitoring sites are not evenly distributed in the study area, with more monitoring sites in populous regions and urban areas. Mountainous regions and some remote border areas have almost no monitoring sites. We extracted or calculated 1-hour daily maximum NO₂ concentrations, the NO₂ metric used for EPA regulation, from these monitoring sites. We use the term “daily NO₂” to stand for 1-hour daily maximum NO₂ in this paper, unless specified otherwise. The study period is from January 1, 2000, to December 31, 2016, a total of 6,210 days. Not all monitoring sites were operating during the entire study period. Missing data within monitoring sites were excluded from the subsequent model training process.
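The daily metric and the length of the study period can be illustrated with a minimal sketch. The function name `daily_max_1h`, the example readings, and the rule of skipping missing hours are assumptions made here for illustration; the text does not specify how incomplete days were handled.

```python
from datetime import date


def daily_max_1h(hourly_ppb):
    """Return the 1-hour daily maximum from one day's hourly readings.

    Missing hours are passed as None and skipped (an assumption); a day
    with no valid readings returns None so it can be excluded later.
    """
    valid = [v for v in hourly_ppb if v is not None]
    return max(valid) if valid else None


# Length of the study period, counting both endpoints.
study_days = (date(2016, 12, 31) - date(2000, 1, 1)).days + 1

# Hypothetical hourly readings (ppb) for one site-day, with two gaps.
example_day = [12.0, 10.5, None, 9.8, 15.2, 31.4, 28.0, None, 18.3]
daily_no2 = daily_max_1h(example_day)
```

Here `study_days` reproduces the 6,210 days stated above (17 years including five leap days).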

Like other air pollutants, the distribution of NO₂ demonstrates some degree of spatial and temporal autocorrelation. NO₂ measurements from nearby monitoring sites are more correlated than those from sites far apart; NO₂ measurements from neighboring days are more correlated than measurements distant in time. Using autocorrelation can improve model fit, and we incorporated spatially and temporally lagged NO₂ measurements. Spatially lagged terms were calculated as inverse distance weighted NO₂ measurements at other locations, as well as their one-day, three-day and five-day lagged moving average values.
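A minimal sketch of the two kinds of lagged terms follows. It assumes planar site coordinates in kilometers, a distance exponent of 2, and a moving-average window that ends on the day before the target day; none of these choices is stated in the text, and the function names are hypothetical.

```python
import math


def idw_spatial_lag(target_xy, neighbors, power=2.0):
    """Inverse-distance-weighted mean of NO2 measured at other sites.

    neighbors: list of ((x_km, y_km), no2_ppb) for sites other than the target.
    power: distance exponent (assumed; not given in the text).
    """
    num = 0.0
    den = 0.0
    for (x, y), no2 in neighbors:
        d = math.hypot(x - target_xy[0], y - target_xy[1])
        if d == 0.0:
            continue  # skip co-located sites so the target never predicts itself
        w = 1.0 / d ** power
        num += w * no2
        den += w
    return num / den if den > 0 else None


def lagged_moving_average(series, day, window):
    """Mean of the `window` days before index `day` (excluding `day`), skipping None."""
    start = max(0, day - window)
    vals = [v for v in series[start:day] if v is not None]
    return sum(vals) / len(vals) if vals else None
```

With two neighbors at 1 km (20 ppb) and 2 km (40 ppb), the weights are 1 and 0.25, so the spatial lag is 30 / 1.25 = 24 ppb; the nearer site dominates.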

2.2. Meteorological Data

Reanalysis data sets rely on data sourced from land-surface monitors, ships, aircraft, satellites, radiosondes, pibals, and other sources. The National Oceanic and Atmospheric Administration (NOAA) feeds these data into a data assimilation system and provides gridded atmospheric fields. Compared with meteorological measurements from monitoring sites, reanalysis data provide almost continuous spatial and temporal coverage, often with no or few missing values. We used daily values of 16 meteorological variables (Section 1, Supporting Information) at a spatial resolution of approximately 32 km.

2.3. NO₂ Column Density and Chemical Transport Model Simulations

We used NO₂ column density from the OMI instrument aboard the Aura satellite. The OMI NO₂ data product is available every day at 13 km × 24 km grid cells. OMI NO₂ retrievals are column measurements. To relate OMI NO₂ retrievals to surface-level NO₂, we used chemical transport models to simulate scaling factors.

A chemical transport model (CTM) simulates the chemistry, transport, and deposition of air pollutants in discrete three-dimensional grid cells, based on surface-level emission inventories and meteorological fields. The models capture the relevant atmospheric photochemical reactions, including the secondary formation of air pollutants. We used the vertical distribution of NO₂ from two different CTMs, the global-scale GEOS-Chem (http://acmg.seas.harvard.edu/geos/) and the regional-scale Community Multi-scale Air Quality Model (CMAQ, https://www.epa.gov/cmaq), and calculated scaling factors as the percentage of surface-level NO₂ contributing to the total NO₂ column density. We then related the satellite-retrieved NO₂ column to surface-level NO₂, as in previous studies^{29, 45, 46}. In addition, we used surface-level NO₂ estimates from the CTMs as a predictor variable in NO₂ modeling. Details of both CTMs can be found elsewhere^{47–49}. The spatial resolution of GEOS-Chem output was 0.5° × 0.625°; the spatial resolution of CMAQ output was 12 km for all years, except for a 36-km resolution for the Western U.S. in the early years. Neither GEOS-Chem nor CMAQ was calibrated or tuned with NO₂ monitoring data.
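The scaling-factor idea can be sketched as follows. This is an illustration under stated assumptions rather than the exact procedure: the function names are hypothetical, and we assume the simulated surface-layer amount and the simulated total column are expressed in the same units, so that their ratio is a dimensionless fraction.

```python
def surface_scaling_factor(ctm_surface, ctm_column):
    """Fraction of the simulated NO2 column attributable to the surface layer.

    Both inputs are assumed to be in the same units (e.g., a partial
    column and a total column in molecules per square centimeter).
    """
    if ctm_column <= 0:
        return None  # undefined when the simulated column is empty
    return ctm_surface / ctm_column


def scaled_surface_no2(satellite_column, ctm_surface, ctm_column):
    """Scale a satellite-retrieved column by the CTM-derived surface fraction."""
    factor = surface_scaling_factor(ctm_surface, ctm_column)
    return None if factor is None else satellite_column * factor
```

For example, if a CTM places 40% of the column at the surface, a retrieved column of 4.0e15 would be scaled to a surface contribution of about 1.6e15 in the same units. The scaled quantity serves as a predictor, not as a final surface concentration.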

2.4. Land-cover Variables

A large percentage of surface NO₂ concentrations stems from local traffic emissions, which are sensitive to land-cover patterns⁵⁰ and can be approximated by land-cover terms. Hence, land-cover variables are among the most important predictor variables in NO₂ modeling. Land-cover variables have been used in nationwide NO₂ models^{12, 51}, as well as some regional or neighborhood models^{38, 41}. Following previous studies⁵², this study included seven categories of land-cover variables: land-cover terms (water bodies, developed areas, barren land, forest, shrubland, herbaceous land, planted/cultivated land, wetlands, impervious surface, and tree canopy), truck traffic (truck traffic volume, truck route density, and shortest distance to truck route), road density (road density for primary roads, secondary roads, and all roads, respectively), restaurant density, elevation (minimum elevation, maximum elevation, mean elevation, median elevation, standard deviation of elevation, and breakline emphasis), normalized difference vegetation index, and nighttime light. Details are listed in the Supporting Information (Section 2).

We also prepared selected land-cover variables at three resolutions: 100 m × 100 m, 1 km × 1 km, and 10 km × 10 km. OMI column NO₂ has a resolution of 13 km × 24 km; the horizontal resolutions of GEOS-Chem, CMAQ, and the reanalysis data sets are similar to or coarser than that of OMI. The kilometer-level variables capture local emissions, especially from traffic, and emissions from neighboring areas, and the 10-km variables capture more of the overall pattern of urban emissions. We incorporated 1-km- and 10-km-level land-cover variables to fit the three machine learning models. The 100-m-level land-cover variables were used to fit the local models of address-specific deviations from the 1 km grid cell.

2.5. Other Ancillary Variables

The retrieval algorithm of satellite-based NO₂ is affected by aerosol, surface reflectance⁵³/surface albedo, and cloud contamination⁵⁴, although the agreement of satellite-based NO₂ with in situ measurements is usually good⁵⁵. To correct possible errors in the NO₂ retrieval, we further added the following variables to our model. (1) Variables related to aerosol concentration and aerosol type, including simulated elemental carbon, organic carbon, sulfate, nitrate, aerosol mass from both GEOS-Chem and CMAQ; sulfate aerosol, hydrophilic black carbon, hydrophobic black carbon, hydrophilic organic carbon, and hydrophobic organic carbon from MERRA-2⁵⁶; and absorbing aerosol index in the ultraviolet and visible ranges (OMAERUVd, OMAEROe) from OMI^{57, 58}. (2) Cloud coverage, including cloud area fractions at low, medium, and high altitudes from the NCEP/NCAR reanalysis data set⁵⁹. (3) Surface albedo from the NCEP/NCAR reanalysis data set⁵⁹ and surface reflectance from MODIS (MOD09A1)⁶⁰.

OMI retrievals have many missing values. We also acquired NO₂ column simulations from Copernicus Atmosphere Monitoring Service (CAMS), another reanalysis data set⁶¹. The CAMS reanalysis data for NO₂ rely on observations from multiple satellites, without observations from NO₂ monitoring sites, combined with state-of-the-art computer models. CAMS NO₂ columns have a spatial resolution of 0.125° × 0.125°, similar to that of OMI NO₂ retrievals and with no missing values, providing additional information where OMI NO₂ retrievals are missing.

3. Methods

3.1. Overview

Our NO₂ model was based on an ensemble model that took estimates from three independent machine learning algorithms. We first fit neural network, random forest, and gradient boosting algorithms with all input predictor variables and monitored NO₂ as the dependent variable. Then, a geographically weighted generalized additive model combined the NO₂ estimates from the three algorithms and produced a final NO₂ estimate. NO₂ concentrations demonstrate some degree of temporal and spatial autocorrelation. To leverage this autocorrelation, we took the above NO₂ estimates, calculated their spatially and temporally lagged values, and used these as additional input predictor variables to refit the three machine learning algorithms and the ensemble model (Figure S1). In this two-step modeling framework, each step combines a neural network, random forest, gradient boosting, and a generalized additive model into an ensemble model.

We applied 10-fold cross-validation in choosing the model hyperparameters to avoid overfitting. We also used 10-fold cross-validation to evaluate the final model performance. We randomly divided all monitoring sites into 10 splits. We trained the model with 90% of the monitoring sites and predicted NO₂ at the remaining 10% of monitoring sites; then we repeated the process for the other 9 splits. We aggregated the cross-validated NO₂ predictions from the 10 splits, compared them with the corresponding monitored NO₂ values, and calculated total R², temporal R², spatial R², root mean square error (RMSE), and other metrics for model performance. The definitions of total R², temporal R², spatial R², and RMSE are based on previous literature⁶². It is worth mentioning that spatial R² is calculated by regressing annual-averaged monitored NO₂ against the predicted value, so spatial R² evaluates model performance for long-term averages.
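The site-level split and the basic metrics can be sketched as below. This is a simplified illustration: the function names are hypothetical, the R² is computed as a squared Pearson correlation (equivalent to the R² of a simple linear regression), and `spatial_r_squared` averages over whatever records it is given, so one year of held-out records should be passed to obtain an annual-average comparison. The exact metric definitions follow the cited literature and are not reproduced here.

```python
import random
from collections import defaultdict


def assign_site_folds(site_ids, n_folds=10, seed=0):
    """Randomly assign each monitoring site (not each record) to one fold."""
    sites = sorted(set(site_ids))
    random.Random(seed).shuffle(sites)
    return {site: i % n_folds for i, site in enumerate(sites)}


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mo = sum(observed) / n
    mp = sum(predicted) / n
    sxy = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    sxx = sum((o - mo) ** 2 for o in observed)
    syy = sum((p - mp) ** 2 for p in predicted)
    return sxy ** 2 / (sxx * syy)


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return (sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n) ** 0.5


def spatial_r_squared(records):
    """R² between site-level means of monitored and predicted NO2.

    records: iterable of (site_id, observed, predicted) from held-out folds.
    """
    by_site = defaultdict(list)
    for site, obs, pred in records:
        by_site[site].append((obs, pred))
    mean_obs = [sum(o for o, _ in v) / len(v) for v in by_site.values()]
    mean_pred = [sum(p for _, p in v) / len(v) for v in by_site.values()]
    return r_squared(mean_obs, mean_pred)
```

Splitting by site rather than by record keeps all days from a held-out monitor out of training, so the metrics reflect prediction at locations the model has not seen.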

3.2. Three Machine Learning Algorithms

Previous studies have used neural network, random forest⁶³, and other machine learning algorithms to estimate surface-level NO₂^{17, 23, 33, 34}. In these studies, land-cover variables, satellite measurements, and other predictors were input variables of the machine learning algorithm; monitored NO₂ was the dependent variable. We used neural network, random forest, and gradient boosting algorithms to estimate monitored NO₂ separately, with all predictors as input variables. Hyperparameters of the machine learning algorithms, such as the number of hidden layers and the number of neurons for a neural network and the learning rate for gradient boosting, were determined by a grid search with embedded cross-validation (Table S1). To improve efficiency, we standardized all input variables by $x_{\mathrm{standard}} = \frac{x - \mathrm{mean}(x)}{\mathrm{sd}(x)}$ and took the logarithm of the monitored NO₂. We also used imputation to fill in missing values of predictor variables before model training and model prediction (Section 3, Supporting Information).
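The two preprocessing transforms can be sketched as follows. The use of the sample standard deviation and the natural logarithm are assumptions made for this illustration, as is the requirement that monitored values be strictly positive; the text does not specify these details.

```python
import math
import statistics


def standardize(values):
    """Return (x - mean(x)) / sd(x) for each x, using the sample SD (assumed)."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def log_transform(no2_ppb):
    """Natural log of monitored NO2; values must be strictly positive."""
    return [math.log(v) for v in no2_ppb]


def back_transform(log_predictions):
    """Map predictions made on the log scale back to ppb."""
    return [math.exp(v) for v in log_predictions]
```

Standardizing puts predictors with very different units on a comparable scale, which mainly helps the neural network converge; tree-based learners are largely insensitive to it.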

3.3. Ensemble Model

To blend NO₂ estimations from the three machine learning algorithms, we used a generalized additive model with penalized spline on both location and NO₂ estimation to account for geographic weights:

$$\widehat{NO_{2}} = f_{1}(\mathrm{Location}_{i}, {\widehat{NO_{2,nn}}}_{ij}) + f_{2}(\mathrm{Location}_{i}, {\widehat{NO_{2,rf}}}_{ij}) + f_{3}(\mathrm{Location}_{i}, {\widehat{NO_{2,gb}}}_{ij})$$

where f₁ denotes a thin plate spline for an interaction between location i and the NO₂ estimation from the neural network at location i and on day j (${\widehat{NO_{2,nn}}}_{ij}$); f₂ with ${\widehat{NO_{2,rf}}}_{ij}$ and f₃ with ${\widehat{NO_{2,gb}}}_{ij}$ stand for similar quantities, but from random forest and gradient boosting at location i and on day j, respectively. By employing this generalized additive model, we allowed the contribution of each algorithm to the final NO₂ estimate to potentially depend on the NO₂ concentration (i.e., non-linear response) and vary in different locations (geographically weighted regression).
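For intuition, it may help to compare this with a conventional fixed-weight ensemble. As an illustrative special case (the weights $w_{1}, w_{2}, w_{3}$ are symbols introduced here, not quantities estimated in this study), suppose each smooth were linear in the base prediction and constant over space:

$$f_{k}(\mathrm{Location}_{i}, x) = w_{k}\,x \quad (k = 1, 2, 3) \;\Longrightarrow\; \widehat{NO_{2}} = w_{1}\,{\widehat{NO_{2,nn}}}_{ij} + w_{2}\,{\widehat{NO_{2,rf}}}_{ij} + w_{3}\,{\widehat{NO_{2,gb}}}_{ij}$$

The generalized additive model relaxes both restrictions: the effective weight of each learner can change with the predicted concentration and with location.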

To model address-level deviations from the grid-cell estimate, we took the daily residuals at each monitor and modeled them as a function of local land cover within 100 m and meteorology, using a random forest. Downscaling predictors included NLCD land-cover, truck traffic, traffic volume, elevation, and road density. We also included air temperature, humidity, wind speed, and planetary boundary layer height.

3.4. Model Prediction

We predicted daily NO₂ at 1 km × 1 km grid cells in the study area with the trained model. In total there are over 11 million grid cells in the entire study area. The trained model here included trained neural networks, random forests, gradient boosting models, and generalized additive models in both steps. Model prediction repeated the same process as model training: we obtained NO₂ predictions from the three learning algorithms, fed them into the ensemble model to calculate an NO₂ estimate, calculated spatially and temporally averaged NO₂ estimates, used these averages as additional predictors, and repeated the above process (Figure S1).

The address-specific exposure can be used to assign better exposure in studies where addresses or geocodes are available. To illustrate this while avoiding confidentiality issues, we produced final NO₂ estimates on a 100-m grid in the greater Boston metropolitan area. We calculated the residual of the NO₂ model (monitored NO₂ minus predicted NO₂) and used downscaling predictor variables to estimate the residual in a random forest. After training the random forest model, we prepared those downscaling variables and predicted residuals in each 100 m × 100 m grid cell.
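One compact way to write the downscaling step is given below. This is a sketch of the additive combination implied by the residual definition above; the symbols $r_{ij}$ and $\hat{r}_{ij}$ and the resolution superscripts are introduced here for illustration, and we assume the predicted residual is added back on the same scale on which it was computed:

$$r_{ij} = NO_{2,ij}^{\mathrm{monitored}} - {\widehat{NO_{2}}}_{ij}^{\,1\,\mathrm{km}}, \qquad {\widehat{NO_{2}}}_{ij}^{\,100\,\mathrm{m}} = {\widehat{NO_{2}}}_{ij}^{\,1\,\mathrm{km}} + \hat{r}_{ij}^{\,100\,\mathrm{m}}$$

Here $\hat{r}_{ij}^{\,100\,\mathrm{m}}$ is the random forest prediction of the residual from the 100-m downscaling predictors.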

3.5. Uncertainty Estimation

We also estimated the uncertainty in the NO₂ predictions. We used the following generalized additive model to estimate the monthly uncertainty of NO₂ estimation:

$$sd(\Delta_{NO_{2},ij}) = f_{1}(\mathrm{Location}_{i}) + f_{2}(\mathrm{Location}_{i}, {\widehat{NO_{2}}}_{ij}) + f_{3}(\text{elevation}) + f_{4}(\text{elevation s.d.}) + f_{5}(\text{truck traffic}) + f_{6}(\text{traffic volume}) + f_{7}(\text{humidity}) + f_{8}(\text{tree canopy}) + f_{9}(\text{NDVI}) + f_{10}(\text{urban}) + \mathrm{Year} + e_{ij}$$

where $sd(\Delta_{NO_{2},ij})$ is the standard deviation of the difference between monitored daily NO₂ and estimated daily NO₂ at location i and month j; f₁ is a penalized spline for location i; f₂ is a thin plate spline for an interaction between location and monthly averaged predicted NO₂ at location i and month j; f₃ through f₁₀ are splines on elevation, standard deviation of elevation, truck traffic, traffic volume, humidity, tree canopy, NDVI, and urban areas, respectively. The error term is e_ij.
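The dependent variable of this model, the monthly standard deviation of the daily differences at each monitor, can be computed as in the sketch below. The record layout and the function name are hypothetical, and the choice of the sample standard deviation (with months of fewer than two days dropped) is an assumption.

```python
import statistics
from collections import defaultdict


def monthly_residual_sd(records):
    """Standard deviation of (monitored - predicted) per site and month.

    records: iterable of (site_id, year, month, monitored_ppb, predicted_ppb).
    Returns {(site_id, year, month): sd}. Groups with fewer than two days
    are omitted because a sample SD is undefined for them.
    """
    diffs = defaultdict(list)
    for site, year, month, monitored, predicted in records:
        diffs[(site, year, month)].append(monitored - predicted)
    return {key: statistics.stdev(v) for key, v in diffs.items() if len(v) >= 2}
```

These site-month values are what the generalized additive model then relates to location, predicted NO₂, and the land-cover and meteorological terms, so that an uncertainty level can be predicted in grid cells without monitors.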

4. Results

The mean cross-validated R² was 0.79 for daily NO₂. The two-step modeling framework improved model performance, with total R² increasing from 0.77 in Step 1 to 0.79 in Step 2 (Table S2). The spatial R², which we defined as the R² between annual averaged monitored NO₂ and estimated NO₂, varied between 0.78 and 0.86 by year, with a mean spatial R² of 0.84, indicating good model performance at the annual level (Table 1). The average RMSE was 7.15 ppb overall (4.51 ppb spatially and 5.57 ppb temporally). The ensemble model outperformed the three base learners (R², neural network: 0.763, random forest: 0.787, and gradient boosting: 0.752). Temporally, model performance remained stable but was weaker in the earliest and most recent years. Among the three machine learners, the random forest outperformed the neural network and gradient boosting. Overall, ensemble averaging further improved model performance compared to the best single learner, although only modestly. Figure 1 presents the maps of uncertainty level, with better model performance in California (except the south) and the Northeastern United States. Performance was worse in mountainous regions, such as the Rocky and Appalachian Mountains, where monitoring sites are sparse.

Table 1.

Cross-validated Model Performance

	Ensemble Model						Neural Network	Random Forest	Gradient Boosting
Year	R²	RMSE (ppb)	Spatial R²	Temporal R²	Bias (ppb)	Slope	R²	R²	R²
2000	0.692	10.175	0.804	0.602	1.330	0.962	0.668	0.693	0.677
2001	0.762	8.440	0.827	0.709	0.705	0.984	0.721	0.760	0.741
2002	0.780	7.872	0.824	0.734	0.464	0.993	0.745	0.774	0.751
2003	0.801	7.289	0.845	0.751	0.317	0.995	0.789	0.799	0.770
2004	0.782	7.249	0.833	0.734	0.374	0.985	0.754	0.781	0.755
2005	0.767	7.443	0.816	0.730	0.683	0.971	0.748	0.764	0.737
2006	0.771	7.305	0.820	0.735	0.610	0.979	0.750	0.769	0.738
2007	0.782	6.997	0.840	0.730	0.488	0.982	0.759	0.778	0.747
2008	0.785	6.964	0.799	0.764	0.323	0.984	0.744	0.787	0.753
2009	0.804	6.267	0.859	0.764	−0.157	1.000	0.775	0.803	0.765
2010	0.789	6.377	0.829	0.763	0.065	0.993	0.769	0.786	0.749
2011	0.797	6.284	0.846	0.755	−0.090	0.998	0.777	0.798	0.756
2012	0.777	6.263	0.832	0.738	0.029	0.996	0.754	0.772	0.738
2013	0.792	5.999	0.835	0.762	−0.165	1.000	0.755	0.796	0.761
2014	0.787	6.113	0.819	0.761	−0.031	0.997	0.767	0.785	0.756
2015	0.779	6.227	0.817	0.755	−0.059	1.001	0.749	0.775	0.742
2016	0.749	6.459	0.780	0.724	0.334	0.968	0.733	0.749	0.722
Total	0.788	7.146	0.844	0.729	0.233	0.990	0.763	0.787	0.752

Note: the definitions of spatial and temporal R² were based on a previous study⁶². For bias and slope, we regressed daily predicted NO₂ at monitors against daily monitored NO₂ in a linear regression model to obtain slope and bias (the intercept).
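The slope and bias columns correspond to an ordinary least squares fit of predicted on monitored values, which can be sketched as follows (the function name is hypothetical):

```python
def slope_and_bias(monitored, predicted):
    """OLS fit of predicted = bias + slope * monitored; returns (slope, bias)."""
    n = len(monitored)
    mx = sum(monitored) / n
    my = sum(predicted) / n
    sxx = sum((x - mx) ** 2 for x in monitored)
    sxy = sum((x - mx) * (y - my) for x, y in zip(monitored, predicted))
    slope = sxy / sxx
    return slope, my - slope * mx
```

A slope near 1 together with a bias near 0 ppb indicates that predictions track monitored values without systematic scaling or offset.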

Figure 1. The left column shows the cross-validated R² at each monitoring site; the right column shows the monthly mean standard deviations (SD) of the differences between daily monitored NO₂ and daily predicted NO₂, averaged over each 1 km × 1 km grid cell for the entire study period. Spring is March to May; summer is June to August; autumn is September to November; winter is December, January, and February.

Although the ensemble model only had a limited impact on daily R², it improved the linearity of the relationship between monitored NO₂ and predicted NO₂. The neural network underestimated NO₂ at high concentrations, while the random forest overestimated at the high end. The overestimation at the high end was even more serious for gradient boosting. The ensemble model showed good linearity up to 150 ppb, an extremely high daily concentration seldom seen in the contiguous United States (Figure 2). At the annual level, the linearity between monitored and predicted NO₂ was even better, with linearity at concentrations up to 55 ppb, a very high annual average that only 0.2% of monitoring data reached (Figure S2). Both Figure 2 and Figure S2 indicated that our model estimated NO₂ accurately at common pollution levels in the contiguous United States, with slight underestimation at extremely high (and also rare) concentration levels.

Figure 2. We compared monitored NO₂ and predicted NO₂ from the ensemble model and the three machine learners, respectively, with a spline on monitored NO₂ in a generalized additive model. Dashed lines stand for 95% confidence intervals, which are very narrow here because of the large sample size.

The distribution of NO₂ exhibits clear spatial clustering, with high concentrations clustering around urban areas, especially major cities, and along highways. From the 2000 national maps, we can clearly identify several NO₂ hotspots, such as Seattle, Los Angeles, Phoenix, Salt Lake City, Denver, Albuquerque, Chicago, Indianapolis, Louisville, New York, and Philadelphia (Figure 3). This clustering pattern of NO₂ is clearer in the downscaled prediction of the greater Boston metropolitan area, which uses a 100-m resolution grid to illustrate the address-specific model (Figure S3). We can clearly identify the central urban area with generally high concentrations, and lower concentrations in rural areas and over greenspaces and waterbodies.

Figure 3. The panels show daily NO₂ estimates for 1 km × 1 km grid cells, averaged annually and for four seasons. Here “daily NO₂” means 1-hour daily maximum NO₂. Rows show the four seasons, defined in Figure 1.

NO₂ concentrations fell substantially in the U.S. during the study period, with the annual level in 2016 about 50% of the 2000 level, but the decline stagnated after 2007. The nationwide NO₂ level in 2016 was almost identical (100.08%) to that of 2007 (Figure 4). Restricting the analysis to predictions at monitoring sites, we observed a different pattern, with a long-term decline that continued steadily after 2007, consistent with the trend reported in a previous GEOS-Chem model study⁴⁹. The average predicted NO₂ level at the monitoring sites in 2016 was only 71.62% of the 2007 level (Figure 4).

Figure 4. We calculated the daily NO₂ for all 1 km × 1 km grid cells in the contiguous U.S., and plotted the daily average over the entire study period (blue line), as well as the one-year moving average (orange line). For comparison, we also plotted the one-year moving average of NO₂ level at just the monitoring sites (black line). To visualize the relative changes after 2007, we show the time series of the annual averaged changes relative to the 2007 NO₂ levels (upper right panel). “Daily NO₂” means 1-hour daily maximum NO₂.

We also report the relative importance of different variables from the three machine learning algorithms (Table S3). Specific approaches to assess variable importance are described in the footnote of Table S3. Spatially lagged NO₂ and its 1-day-lagged values were both important predictor variables. Multiple land-cover variables, such as impervious surface, developed land, road density, and traffic volume of truck routes, also ranked as important predictors. The explanatory power of CMAQ-simulated NO₂ and of elemental carbon, which derives from sources similar to those of NO₂, was also high. The standard deviation of elevation, maximum elevation, nighttime light, and traffic volume of trucks, variables seldom used in previous studies, also demonstrated important explanatory power.

5. Discussion

In this paper, we present an ensemble model that incorporates neural network, random forest, and gradient boosting to estimate daily NO₂ across the contiguous United States. Performance of the ensemble model was excellent, with cross-validated mean R² of 0.79, mean spatial R² of 0.84, RMSE of 7.15 ppb, and spatial RMSE of 4.51 ppb. Our model used various types of predictors (satellite remote sensing, chemical transport models, multiple land-cover terms) that are not often combined in such models, and combined the results of three different machine learning algorithms in an ensemble. We predicted daily NO₂ at every 1 km × 1 km grid cell in the contiguous United States, which should be useful for epidemiology and health impact assessment that require small area estimates (e.g., over census tracts or ZIP Codes). The ability to predict well outside of major urban areas is an important feature of this model, with good performance in rural areas as well. A key addition is the modeling of the standard deviation of exposure error for each month of each year in each grid cell. This will enable researchers to incorporate the measurement errors in epidemiological studies⁶⁴.

This study exhibits several advantages over existing studies. First, our modeling framework incorporated multiple machine learning algorithms and assembled them in an innovative way. These complementary machine learning algorithms improved model performance, especially at high concentrations. In contrast to many ensemble methods, which give fixed weights to each machine learner, our approach lets the weights vary spatially and by NO₂ concentration. This modeling framework, with several independent algorithms estimating NO₂ individually and a generalized additive model combining them, can be extended to additional fitting algorithms and is applicable to modeling other air pollutants. For example, several existing studies on NO₂ modeling used a support vector machine, which could become another base learner in future ensemble models. Second, this study achieved high spatiotemporal resolution, with 1-km-level and potentially address-specific predictions available every day. Most existing studies estimated NO₂ at the annual level, which would not be appropriate for studying pregnancy outcomes or acute effects. In addition, previous studies exhibited a tradeoff between resolution and study area: NO₂ models with large spatial coverage (e.g., nationwide models) generally had to compromise on spatial resolution, temporal resolution, or both. Our study, using multiple land-cover variables as spatial predictors and meteorological variables and CTM simulations as temporal predictors, achieved fine spatial and temporal resolutions for the entire contiguous United States. Third, we developed a sophisticated model to fill in the missing values. Previous studies that estimated annual NO₂ could simply exclude missing values, whereas daily estimation of NO₂ requires filling in missing values before training the model. Moreover, annual average estimates can be biased if the data are not missing at random, a situation our method avoids. Some studies used values from the nearest locations to fill in missing values⁶⁵, but we have argued in a related study on PM_2.5 modeling that this strategy can be problematic, especially when the number of missing values is large and missing values are spatially clustered. Our method of filling in missing values requires a separate prediction model for each variable with missing data, which is computationally intensive, but growing computational capacity makes the process less formidable. We leveraged computational power from the Harvard Odyssey Supercomputer.
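As a rough illustration of fitting a separate prediction model for a variable with missing data, the sketch below fills gaps in one predictor from a second, complete predictor using ordinary least squares. It is a minimal sketch under stated assumptions, not the procedure documented in the Supporting Information: the single-helper linear model, the function names `fit_line` and `fill_missing`, and the toy values are all hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fill_missing(target, helper):
    """Replace None entries in `target` with predictions from `helper`.

    The line is fitted only on rows where `target` is observed.
    """
    observed = [(h, t) for h, t in zip(helper, target) if t is not None]
    slope, intercept = fit_line([h for h, _ in observed],
                                [t for _, t in observed])
    return [t if t is not None else slope * h + intercept
            for h, t in zip(helper, target)]


# Hypothetical values: `helper` is complete, `target` has one gap.
helper = [1.0, 2.0, 3.0, 4.0, 5.0]
target = [2.0, 4.0, None, 8.0, 10.0]
filled = fill_missing(target, helper)
print(filled)
```

Unlike nearest-neighbor filling, a model of this kind does not depend on an observed value being close by, which is the situation the text describes when missing values are spatially clustered.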

We have additionally used the standard deviation of elevation, maximum elevation, restaurant density, nighttime light, and truck route as predictor variables, which, to the best of our knowledge, have seldom been used in previous studies. Results on variable importance indicate that these variables are important for NO₂ estimation. Truck exhaust is the largest source of NO_x in the United States¹¹, and is responsible for a large portion of NO_x emissions in other countries as well⁶⁶. Indeed, NO_x emissions from trucks are many times higher than those from typical passenger cars⁶⁷. Thus, it is reasonable to separate truck emissions from generic traffic emissions. Elevation is a predictor variable widely used in NO₂ estimation. Our study suggests that elevation variation, rather than elevation itself, is the more important predictor. This is physically plausible: topography, as well as stable tropospheric structure in the winter due to temperature inversion, affects dispersion of air pollutants. For a similar reason, breakline emphasis of elevation was an important variable, which again demonstrates that elevation variation matters in air pollution modeling. Nighttime light corresponds to the level of urbanization, energy consumption, and overall economic activity⁶⁸⁻⁷⁰, and thus is related to pollution emissions. Previous studies have used nighttime light in PM_2.5 modeling⁷¹. Nighttime light is available globally over multiple years and could serve as an important predictor variable for NO₂ modeling in other countries. In contrast, other variables we used here are not always available. Cooking is a major source of air pollution, especially in cities⁷². Thus, restaurants are an important source, creating local hotspots of NO₂. Incorporating restaurants as a predictor variable can improve model performance at finer scales, especially in cities.

The three machine learning algorithms gave the highest weights to different predictor variables. Spatially lagged terms of monitored NO₂ (i.e., nearby monitors) play important roles in all three algorithms. Gradient boosting depends predominantly on these lagged terms; random forest relies primarily on these terms plus additional land-cover variables and CTM simulations, while the neural network relies primarily on land-cover variables. The relative importance of land-cover variables also varies. Our results indicate that the contribution or importance of predictor variables depends on the fitting algorithm. Similarly, for the ensemble model, the contribution of the three individual machine learning algorithms varied by concentration and location. Based on these results, we conclude that the performance of different fitting algorithms and the contribution of different predictor variables are context-dependent. In other words, it is difficult to foresee which variable(s) are most informative and which fitting algorithm is most appropriate for an air pollution model without actually running the model. Answers to both questions depend on the research topic, time period, and study area. Some previous studies compared the performance of machine learning algorithms with a statistical model³⁴,⁷³, or compared the performance of different model specifications³⁰,⁴⁵. Our study suggests that it would be more useful to propose a framework integrating multiple predictor variables and estimations from different fitting algorithms, as we did in this study. We also conclude that the specific structure of the ensemble model depends on the practical interest. In our study, the ensemble model combined daily NO₂ estimates to improve model performance at the daily level, and total R² improved accordingly (Table 1); however, model performance at the annual level was not optimized at the same time, so the spatial R² of the ensemble model was slightly lower than that of random forest (Table S4). To optimize spatial R², a separate ensemble model would be required to combine NO₂ estimates at the annual level.
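The idea that each base learner's contribution can vary with concentration can be sketched with a much simpler stand-in for the generalized additive model used in this study. The example below bins samples by the mean of the base predictions and, within each bin, weights each learner by its inverse mean squared error. This is a minimal sketch, not the authors' method: the inverse-error weighting, the 10-ppb bin width, the function names, and the toy predictions for three learners are all assumptions introduced for illustration.

```python
from collections import defaultdict


def bin_of(preds, width):
    """Concentration bin index, based on the mean of the base predictions."""
    return int((sum(preds) / len(preds)) // width)


def fit_bin_weights(base_preds, observed, width=10.0, eps=1e-6):
    """Per bin, weight each learner in proportion to its inverse MSE."""
    sq_err = defaultdict(list)  # bin index -> per-sample squared errors
    for preds, obs in zip(base_preds, observed):
        sq_err[bin_of(preds, width)].append([(p - obs) ** 2 for p in preds])
    weights = {}
    for b, rows in sq_err.items():
        mse = [sum(col) / len(rows) for col in zip(*rows)]
        inv = [1.0 / (m + eps) for m in mse]
        weights[b] = [v / sum(inv) for v in inv]
    return weights


def ensemble_predict(preds, weights, width=10.0):
    """Blend base predictions; use equal weights for bins never seen in fitting."""
    equal = [1.0 / len(preds)] * len(preds)
    w = weights.get(bin_of(preds, width), equal)
    return sum(wi * pi for wi, pi in zip(w, preds))


# Hypothetical predictions (ppb) from three learners at four monitor-days.
# Learner 0 is accurate at low NO2; learner 1 is accurate at high NO2.
base_preds = [(5.0, 8.0, 7.0), (6.0, 9.0, 8.0),
              (30.0, 41.0, 35.0), (32.0, 39.0, 35.0)]
observed = [5.0, 6.0, 40.0, 40.0]

weights = fit_bin_weights(base_preds, observed)
blended = ensemble_predict((31.0, 40.0, 35.0), weights)
print(weights)
print(blended)
```

In this toy data the low-concentration bin leans almost entirely on learner 0 and the high-concentration bin on learner 1, which mirrors in miniature the observation that no single algorithm is best everywhere. A smooth combiner such as a generalized additive model avoids the hard bin edges that this sketch introduces.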

We found that satellite-derived NO₂ column measurements are not as important as other predictor variables, contributing less than 1% of the prediction in the neural network, random forest, and gradient boosting methods. This result contrasts with PM_2.5 modeling, where satellite-derived aerosol optical depth (AOD) is an important predictor variable. There are several reasons. First, the NO₂ column measurement from OMI is much coarser (13 km × 24 km) than AOD from MODIS (the finest resolution of MODIS AOD is 1 km × 1 km). Coarse satellite-based NO₂ measurements average out heterogeneous NO₂ levels within each cell⁷⁴. This is especially an issue when modeling NO₂, an air pollutant that comes primarily from local traffic emissions and for which fine-scale measurement is essential. Second, the sensitivity of OMI and any satellite-based measurement of NO₂ increases with altitude, such that the measurement is least sensitive at the surface due to scattering of radiation at the surface and through the atmosphere⁷⁵. Third, CTM outputs already accounted for the temporal variations, and the satellite-derived NO₂ was less important as a result.
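The averaging effect of a coarse footprint can be seen in a toy calculation. The sketch below averages a hypothetical 1-km transect, with a single roadside hotspot, into two 12-km cells. The transect values, the 12-cell block size, and the function name `block_average` are assumptions chosen for illustration and are not taken from OMI data.

```python
def block_average(values, block):
    """Average consecutive groups of `block` fine cells into one coarse cell."""
    if len(values) % block != 0:
        raise ValueError("length of values must be a multiple of block")
    return [sum(values[i:i + block]) / block
            for i in range(0, len(values), block)]


# Hypothetical 24-km transect at 1-km spacing (ppb): uniform background
# of 5 ppb with one 40-ppb hotspot next to a highway.
fine = [5.0] * 24
fine[10] = 40.0

coarse = block_average(fine, 12)
print(max(fine))   # peak seen at 1-km resolution
print(coarse)      # the hotspot is diluted within its coarse cell
```

The coarse cell containing the hotspot reports a value only a few ppb above background, so a predictor at that resolution carries little information about the roadside peak that a 1-km model needs to resolve.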

In the long term, the spatial distribution of surface NO₂ contrasts with that of PM_2.5 and ozone (Figure S4). High NO₂ levels cluster along highways and in cities. Traffic emissions are also an important source of primary PM_2.5, as well as emitting precursor gases that form PM_2.5 in the atmosphere. However, PM_2.5 has a longer atmospheric lifetime than NO₂ and can be transported further; it also has more widespread sources of importance, such as the biosphere or aqueous-phase production in clouds. Thus, for example, the entire Southeastern United States experiences high PM_2.5 concentrations in the summer, while NO₂ is more locally enhanced. The spatial distribution of ozone also exhibits different patterns from NO₂, with high concentrations occurring over rural areas surrounding or downwind of urban areas, and in mountainous regions. The distinct patterns of NO₂, PM_2.5, and ozone at the national level suggest that a nationwide environmental epidemiological study could separate and identify the adverse health effect of each pollutant.

In terms of temporal trend, we found a discrepancy between the nationwide average trend and the average trend across the monitoring sites. We observed a steadily decreasing trend in the averaged NO₂ level at monitoring sites, but at the national level (i.e., the NO₂ level averaged over every 1-km grid cell in the contiguous U.S.) NO₂ declined from 2000 to 2007 and stagnated after 2007. For example, from 2007 to 2008, the site-averaged NO₂ level at monitoring sites dropped from 21.9 ppb to 21.1 ppb, a decrease of about 3.5%, but our ensemble model predicted that the nationwide averaged NO₂ rebounded from 8.1 ppb to 9.7 ppb over the same period, an increase of nearly 20%.
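The relative changes quoted above can be recomputed directly from the stated concentrations. The short sketch below does so; the function name `percent_change` is hypothetical. Because the concentrations in the text are rounded to one decimal place, the recomputed percentages can differ slightly from the approximate figures given there.

```python
def percent_change(old, new):
    """Relative change from `old` to `new`, in percent."""
    return (new - old) / old * 100.0


# Concentrations in ppb as stated in the text for 2007 and 2008.
site_change = percent_change(21.9, 21.1)    # monitoring-site average
nation_change = percent_change(8.1, 9.7)    # nationwide 1-km grid average

print(f"monitoring sites: {site_change:.2f}%")
print(f"nationwide:       {nation_change:.2f}%")
```

The two results have opposite signs, which is the discrepancy between the site-based and nationwide trends that the paragraph describes.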

The steady 2000–2016 decrease of NO₂ concentrations predicted at the monitoring sites is consistent with observations and with the National Emission Inventory (NEI) of the U.S. Environmental Protection Agency (EPA), underscoring the success of clean air regulations⁴⁹. However, the discrepancy between the predicted site-based trend and the nationwide trend suggests a different pattern of NO₂ pollution in less urban areas, where monitor coverage is scant. Whether this is due to a lower rate of replacement of more polluting vehicles, increased wood combustion, increased influence of background NO_x, or widespread reduction in anthropogenic VOC emissions in urban areas deserves further attention, particularly as rural NO_x pollution may affect production of secondary organic particles and ozone.

Our model has some limitations. There are still differences between predicted and observed NO₂ values, and we modeled only outdoor concentrations, not personal exposure. However, a recent study pointed out that ambient exposure has an advantage over personal exposure in epidemiological studies in that it is much less correlated with individual-level confounders⁷⁶. The model also depends on the existing monitoring network, and so was unable to take advantage of local intensive monitoring campaigns, which are often used in land cover regressions. On the other hand, the model covers many years, whereas land cover regression struggles to represent the influence of changing emissions over time using largely static land-cover terms. Despite these limitations, the daily NO₂ concentrations at high spatial resolution provided by our ensemble model promise to improve estimates of both long- and short-term exposures for epidemiological studies of large cohorts of U.S. residents, even those living far from monitors.

Supplementary Material

Supporting Information

NIHMS1568557-supplement-Supporting_Information.pdf (820.8 KB, pdf)

Acknowledgement

This publication was made possible by U.S. EPA grant numbers RD-834798, RD-835872, and 83587201; HEI grant 4953-RFA14-3/16-4. Its contents are solely the responsibility of the grantee and do not necessarily represent the official views of the U.S. Environmental Protection Agency (EPA). The views expressed in this article are those of the authors and do not necessarily represent the views or policies of the U.S. EPA. Further, the U.S. EPA does not endorse the purchase of any commercial products or services mentioned in the publication. Research described in this article was also conducted under contract to the Health Effects Institute (HEI), an organization jointly funded by the U.S. EPA (Assistance Award No. CR-83467701) and certain motor vehicle and engine manufacturers. The contents of this article do not necessarily reflect the views of HEI or its sponsors, nor do they necessarily reflect the views and policies of the EPA or motor vehicle and engine manufacturers. The computations in this paper were run on the Odyssey cluster supported by the FAS Division of Science, Research Computing Group at Harvard University.

Footnotes

Supporting Information

The Supporting Information is available free of charge on the ACS Publications website:

Detailed list of meteorological variables, details of land-cover variables used in the analysis, details of dealing with missing values, parameters tuned for base learners, model performance from Step 1 and Step 2, contribution of predictor variables, model performance comparison between the ensemble model and base learners, flowchart of ensemble modeling, linearity between monitored NO₂ and predicted NO₂ at the annual level, downscaled NO₂ levels in the Greater Boston area, and long-term averages of PM_2.5, NO₂, and ozone.

References

1. Kagawa J, Evaluation of biological significance of nitrogen oxides exposure. The Tokai journal of experimental and clinical medicine 1985, 10, (4), 348–353.
2. Chauhan A; Krishna M; Frew A; Holgate S, Exposure to nitrogen dioxide (NO2) and respiratory disease risk. Reviews on environmental health 1998, 13, (1–2), 73–90.
3. Weinmayr G; Romeo E; De Sario M; Weiland SK; Forastiere F, Short-term effects of PM10 and NO2 on respiratory health among children with asthma or asthma-like symptoms: a systematic review and meta-analysis. Environmental health perspectives 2009, 118, (4), 449–457.
4. Speizer FE; Ferris B Jr; Bishop YM; Spengler J, Respiratory disease rates and pulmonary function in children associated with NO2 exposure. American Review of Respiratory Disease 1980, 121, (1), 3–10.
5. Brauer M; Lencar C; Tamburic L; Koehoorn M; Demers P; Karr C, A cohort study of traffic-related air pollution impacts on birth outcomes. Environmental health perspectives 2008, 116, (5), 680.
6. Chiusolo M; Cadum E; Stafoggia M; Galassi C; Berti G; Faustini A; Bisanti L; Vigotti MA; Dessì MP; Cernigliaro A, Short-term effects of nitrogen dioxide on mortality and susceptibility factors in 10 Italian cities: the EpiAir study. Environmental health perspectives 2011, 119, (9), 1233.
7. Latza U; Gerdes S; Baur X, Effects of nitrogen dioxide on human health: systematic review of experimental and epidemiological studies conducted between 2002 and 2006. International journal of hygiene and environmental health 2009, 212, (3), 271–287.
8. Zhao Y; Saleh R; Saliba G; Presto AA; Gordon TD; Drozd GT; Goldstein AH; Donahue NM; Robinson AL, Reducing secondary organic aerosol formation from gasoline vehicle exhaust. Proceedings of the National Academy of Sciences 2017, 114, 6984–6989.
9. Xu L; Guo H; Boyd CM; Klein M; Bougiatioti A; Cerully KM; Hite JR; Isaacman-VanWertz G; Kreisberg NM; Knote C; Olson K; Koss A; Goldstein AH; Hering SV; de Gouw J; Baumann K; Lee S-H; Nenes A; Weber RJ; Ng NL, Effects of anthropogenic emissions on aerosol formation from isoprene and monoterpenes in the southeastern United States. Proceedings of the National Academy of Sciences 2015, 112, 37–42.
10. Pye HOT; D’Ambro EL; Lee BH; Schobesberger S; Takeuchi M; Zhao Y; Lopez-Hilfiker F; Liu J; Shilling JE; Xing J; Mathur R; Middlebrook AM; Liao J; Welti A; Graus M; Warneke C; de Gouw JA; Holloway JS; Ryerson TB; Pollack IB; Thornton JA, Anthropogenic enhancements to production of highly oxygenated molecules from autoxidation. Proceedings of the National Academy of Sciences 2019, 116, 6641–6646.
11. Preble C; Harley R; Kirchstetter T In N2O and NO2 Emissions from Heavy-Duty Diesel Trucks with Advanced Emission Controls, AGU Fall Meeting Abstracts, 2014; 2014.
12. Novotny EV; Bechle MJ; Millet DB; Marshall JD, National satellite-based land-use regression: NO2 in the United States. Environmental science & technology 2011, 45, (10), 4407–4414.
13. Kim Y; Guldmann J-M, Land-use regression panel models of NO2 concentrations in Seoul, Korea. Atmospheric Environment 2015, 107, 364–373.
14. Kashima S; Yorifuji T; Sawada N; Nakaya T; Eboshida A, Comparison of land use regression models for NO2 based on routine and campaign monitoring data from an urban area of Japan. Science of The Total Environment 2018, 631, 1029–1037.
15. De Hoogh K; Chen J; Gulliver J; Hoffmann B; Hertel O; Ketzel M; Bauwelinck M; van Donkelaar A; Hvidtfeldt UA; Katsouyanni K, Spatial PM2.5, NO2, O3 and BC models for Western Europe–Evaluation of spatiotemporal stability. Environment international 2018, 120, 81–92.
16. Kim S-Y; Song I, National-scale exposure prediction for long-term concentrations of particulate matter and nitrogen dioxide in South Korea. Environmental pollution 2017, 226, 21–29.
17. Araki S; Shima M; Yamamoto K, Spatiotemporal land use random forest model for estimating metropolitan NO2 exposure in Japan. Science of The Total Environment 2018, 634, 1269–1277.
18. Eeftens M; Meier R; Schindler C; Aguilera I; Phuleria H; Ineichen A; Davey M; Ducret-Stich R; Keidel D; Probst-Hensch N, Development of land use regression models for nitrogen dioxide, ultrafine particles, lung deposited surface area, and four other markers of particulate matter pollution in the Swiss SAPALDIA regions. Environmental Health 2016, 15, (1), 53.
19. Gulliver J; de Hoogh K; Hoek G; Vienneau D; Fecht D; Hansell A, Back-extrapolated and year-specific NO2 land use regression models for Great Britain-Do they yield different exposure assessment? Environment international 2016, 92, 202–209.
20. Daneshvar MRM; Abadi NH, Spatial and temporal variation of nitrogen dioxide measurement in the Middle East within 2005–2014. Modeling Earth Systems and Environment 2017, 3, (1), 20.
21. Lamsal L; Martin R; Parrish DD; Krotkov NA, Scaling relationship for NO2 pollution and urban population size: a satellite perspective. Environmental science & technology 2013, 47, (14), 7855–7861.
22. Xu H; Bechle MJ; Wang M; Szpiro AA; Vedal S; Bai Y; Marshall JD, National PM2.5 and NO2 exposure models for China based on land use regression, satellite measurements, and universal kriging. Science of The Total Environment 2019, 655, 423–433.
23. Zhan Y; Luo Y; Deng X; Zhang K; Zhang M; Grieneisen ML; Di B, Satellite-Based Estimates of Daily NO2 Exposure in China Using Hybrid Random Forest and Spatiotemporal Kriging Model. Environmental science & technology 2018, 52, (7), 4180–4189.
24. Bechle MJ; Millet DB; Marshall JD, Remote sensing of exposure to NO2: Satellite versus ground-based measurement in a large urban area. Atmospheric Environment 2013, 69, 345–353.
25. Lee HJ; Koutrakis P, Daily ambient NO2 concentration predictions using satellite ozone monitoring instrument NO2 data and land use regression. Environmental science & technology 2014, 48, (4), 2305–2311.
26. Boersma K; Jacob DJ; Trainic M; Rudich Y; DeSmedt I; Dirksen R; Eskes H, Validation of urban NO2 concentrations and their diurnal and seasonal variations observed from the SCIAMACHY and OMI sensors using in situ surface measurements in Israeli cities. Atmospheric Chemistry and Physics 2009, 9, (12), 3867–3879.
27. Richter A; Burrows JP; Nüß H; Granier C; Niemeier U, Increase in tropospheric nitrogen dioxide over China observed from space. Nature 2005, 437, (7055), 129.
28. Anand JS; Monks PS, Estimating daily surface NO2 concentrations from satellite data–a case study over Hong Kong using land use regression models. Atmospheric Chemistry and Physics 2017, 17, (13), 8211–8230.
29. Lamsal L; Martin R; Van Donkelaar A; Steinbacher M; Celarier E; Bucsela E; Dunlea E; Pinto J, Ground‐level nitrogen dioxide concentrations inferred from the satellite‐borne Ozone Monitoring Instrument. Journal of Geophysical Research: Atmospheres 2008, 113, (D16).
30. de Hoogh K; Gulliver J; van Donkelaar A; Martin RV; Marshall JD; Bechle MJ; Cesaroni G; Pradas MC; Dedele A; Eeftens M, Development of West-European PM2.5 and NO2 land use regression models incorporating satellite-derived and chemical transport modelling data. Environmental research 2016, 151, 1–10.
31. Hanigan IC; Williamson GJ; Knibbs LD; Horsley J; Rolfe MI; Cope M; Barnett AG; Cowie CT; Heyworth JS; Serre ML, Blending multiple nitrogen dioxide data sources for neighborhood estimates of long-term exposure for health research. Environmental science & technology 2017, 51, (21), 12473–12480.
32. Song W; Jia H; Li Z; Tang D; Wang C, Detecting urban land-use configuration effects on NO2 and NO variations using geographically weighted land use regression. Atmospheric Environment 2019, 197, 166–176.
33. Gardner M; Dorling S, Neural network modelling and prediction of hourly NOx and NO2 concentrations in urban air in London. Atmospheric Environment 1999, 33, (5), 709–719.
34. Kukkonen J; Partanen L; Karppinen A; Ruuskanen J; Junninen H; Kolehmainen M; Niska H; Dorling S; Chatterton T; Foxall R, Extensive evaluation of neural network models for the prediction of NO2 and PM10 concentrations, compared with a deterministic modelling system and measurements in central Helsinki. Atmospheric Environment 2003, 37, (32), 4539–4550.
35. Yeganeh B; Hewson MG; Clifford S; Tavassoli A; Knibbs LD; Morawska L, Estimating the spatiotemporal variation of NO2 concentration using an adaptive neuro-fuzzy inference system. Environmental Modelling & Software 2018, 100, 222–235.
36. Lu W; Wang W; Leung AY; Lo S-M; Yuen RK; Xu Z; Fan H In Air pollutant parameter forecasting using support vector machines, Neural Networks, 2002. IJCNN’02. Proceedings of the 2002 International Joint Conference on, 2002; IEEE: 2002; pp 630–635.
37. Juhos I; Makra L; Tóth B, Forecasting of traffic origin NO and NO2 concentrations by Support Vector Machines and neural networks using Principal Component Analysis. Simulation Modelling Practice and Theory 2008, 16, (9), 1488–1502.
38. Mavko ME; Tang B; George LA, A sub-neighborhood scale land use regression model for predicting NO2. Science of the Total Environment 2008, 398, (1–3), 68–75.
39. Huang Y-K; Luvsan M-E; Gombojav E; Ochir C; Bulgan J; Chan C-C, Land use patterns and SO2 and NO2 pollution in Ulaanbaatar, Mongolia. Environmental research 2013, 124, 1–6.
40. Liu C; Henderson BH; Wang D; Yang X; Peng Z. r., A land use regression application into assessing spatial variation of intra-urban fine particulate matter (PM2.5) and nitrogen dioxide (NO2) concentrations in City of Shanghai, China. Science of The Total Environment 2016, 565, 607–615.
41. Johnson M; MacNeill M; Grgicak-Mannion A; Nethery E; Xu X; Dales R; Rasmussen P; Wheeler A, Development of temporally refined land-use regression models predicting daily household-level air pollution in a panel study of lung function among asthmatic children. Journal of Exposure Science and Environmental Epidemiology 2013, 23, (3), 259.
42. Liu W; Li X; Chen Z; Zeng G; León T; Liang J; Huang G; Gao Z; Jiao S; He X, Land use regression models coupled with meteorology to model spatial and temporal variability of NO2 and PM10 in Changsha, China. Atmospheric Environment 2015, 116, 272–280.
43. Rahman MM; Yeganeh B; Clifford S; Knibbs LD; Morawska L, Development of a land use regression model for daily NO2 and NOx concentrations in the Brisbane metropolitan area, Australia. Environmental Modelling & Software 2017, 95, 168–179.
44. He B; Heal M; Reis S, Land-use regression modelling of intra-urban air pollution variation in China: current status and future needs. Atmosphere 2018, 9, (4), 134.
45. Vienneau D; De Hoogh K; Bechle MJ; Beelen R; Van Donkelaar A; Martin RV; Millet DB; Hoek G; Marshall JD, Western European land use regression incorporating satellite-and ground-based measurements of NO2 and PM10. Environmental science & technology 2013, 47, (23), 13555–13564.
46. Geddes JA; Martin RV; Boys BL; van Donkelaar A, Long-term trends worldwide in ambient NO2 concentrations inferred from satellite observations. Environmental health perspectives 2015, 124, (3), 281–289.
47. Fisher JA; Jacob DJ; Travis KR; Kim PS; Marais EA; Chan Miller C; Yu K; Zhu L; Yantosca RM; Sulprizio MP, Organic nitrate chemistry and its implications for nitrogen budgets in an isoprene-and monoterpene-rich atmosphere: constraints from aircraft (SEAC4RS) and ground-based (SOAS) observations in the Southeast US. Atmospheric chemistry and physics 2016, 16, (9), 5969–5991.
48. Kelly JT; Jang CJ; Timin B; Gantt B; Reff A; Zhu Y; Long S; Hanna A, A system for developing and projecting PM2.5 spatial fields to correspond to just meeting National Ambient Air Quality Standards. Atmospheric Environment 2018.
49. Silvern RF; Jacob DJ; Mickley LJ; Sulprizio MP; Travis KR; Marais EA; Cohen RC; Laughner JL; Choi S; Joiner J; Lamsal LN, Using satellite observations of tropospheric NO2 columns to infer long-term trends in US NOx emissions: the importance of accounting for the free tropospheric NO2 background. Atmos. Chem. Phys. Discuss 2019, 2019, 1–26.
50. Zheng S; Zhou X; Singh RP; Wu Y; Ye Y; Wu C, The spatiotemporal distribution of air pollutants and their relationship with land-use patterns in Hangzhou city, China. Atmosphere 2017, 8, (6), 110.
51. Knibbs LD; Hewson MG; Bechle MJ; Marshall JD; Barnett AG, A national satellite-based land-use regression model for air pollution exposure assessment in Australia. Environmental research 2014, 135, 204–211.
52. Di Q; Amini H; Shi L; Kloog I; Silvern R; Kelly J; Sabath MB; Choirat C; Koutrakis P; Lyapustin A, An ensemble-based model of PM2.5 concentration across the contiguous United States with high spatiotemporal resolution. Environment international 2019, 130, 104909.
53. Lin J-T; Martin R; Boersma K; Sneep M; Stammes P; Spurr R; Wang P; Van Roozendael M; Clémer K; Irie H, Retrieving tropospheric nitrogen dioxide from the Ozone Monitoring Instrument: effects of aerosols, surface reflectance anisotropy, and vertical profile of nitrogen dioxide. Atmospheric Chemistry and Physics 2014, 14, (3), 1441–1461.
54. Boersma K; Eskes H; Brinksma E, Error analysis for tropospheric NO2 retrieval from space. Journal of Geophysical Research: Atmospheres 2004, 109, (D4).
55. Bucsela E; Perring A; Cohen R; Boersma K; Celarier E; Gleason J; Wenig M; Bertram T; Wooldridge P; Dirksen R, Comparison of tropospheric NO2 from in situ aircraft measurements with near‐real‐time and standard product data from OMI. Journal of Geophysical Research: Atmospheres 2008, 113, (D16).
56. Modeling G, MERRA-2 inst3_3d_aer_Nv: 3d,3-Hourly,Instantaneous,Model-Level,Assimilation,Aerosol Mixing Ratio V5.12.4. In NASA Goddard Earth Sciences Data and Information Services Center: 2015.
57. Herman J; Bhartia P; Torres O; Hsu C; Seftor C; Celarier E, Global distribution of UV-absorbing aerosols from Nimbus 7/TOMS data. J. Geophys. Res 1997, 102, (16), 911–16.
58. Torres O; Bhartia P; Herman J; Ahmad Z; Gleason J, Derivation of aerosol properties from satellite measurements of backscattered ultraviolet radiation: Theoretical basis. Journal of Geophysical Research: Atmospheres (1984–2012) 1998, 103, (D14), 17099–17110.
59. Kalnay E; Kanamitsu M; Kistler R; Collins W; Deaven D; Gandin L; Iredell M; Saha S; White G; Woollen J; Zhu Y; Leetmaa A; Reynolds R; Chelliah M; Ebisuzaki W; Higgins W; Janowiak J; Mo KC; Ropelewski C; Wang J; Jenne R; Joseph D, The NCEP/NCAR 40-Year Reanalysis Project. Bulletin of the American Meteorological Society 1996, 77, 437–471.
60. Vermote E, MOD09A1 MODIS/Terra Surface Reflectance 8-Day L3 Global 500m SIN Grid V006 In DAAC, N. E. L. P., Ed. 2015.
61. Inness A; Ades M; Agusti-Panareda A; Barré J; Benedictow A; Blechschmidt A-M; Dominguez JJ; Engelen R; Eskes H; Flemming J; Huijnen V; Jones L; Kipling Z; Massart S; Parrington M; Peuch V-H; Razinger M; Remy S; Schulz M; Suttie M, The CAMS reanalysis of atmospheric composition. Atmospheric Chemistry and Physics Discussions 2018, 1–55.
62. Kloog I; Koutrakis P; Coull BA; Lee HJ; Schwartz J, Assessing temporally and spatially resolved PM2.5 exposures for epidemiological studies using satellite aerosol optical depth measurements. Atmospheric Environment 2011, 45, 6267–6275.
63. Zhu Y; Zhan Y; Wang B; Li Z; Qin Y; Zhang K, Spatiotemporally mapping of the relationship between NO2 pollution and urbanization for a megacity in Southwest China during 2005–2016. Chemosphere 2018.
64. Spiegelman D, Evaluating Public Health Interventions: 4. The Nurses’ Health Study and Methods for Eliminating Bias Attributable to Measurement Error and Misclassification. American Journal of Public Health 2016, 106, (9), 1563–1566.
65. Kharol S; Martin R; Philip S; Boys B; Lamsal L; Jerrett M; Brauer M; Crouse D; McLinden C; Burnett R, Assessment of the magnitude and recent trends in satellite-derived ground-level nitrogen dioxide over North America. Atmospheric Environment 2015, 118, 236–245.
66. Velders GJ; Geilenkirchen GP; de Lange R, Higher than expected NOx emission from trucks may affect attainability of NO2 limit values in the Netherlands. Atmospheric environment 2011, 45, (18), 3025–3033.
67. Soltic P; Weilenmann M, NO2/NO emissions of gasoline passenger cars and light-duty trucks with Euro-2 emission standard. Atmospheric Environment 2003, 37, (37), 5207–5216.
68. Doll CN; Muller J-P; Morley JG, Mapping regional economic activity from nighttime light satellite imagery. Ecological Economics 2006, 57, (1), 75–92.
69. Shi K; Yu B; Hu Y; Huang C; Chen Y; Huang Y; Chen Z; Wu J, Modeling and mapping total freight traffic in China using NPP-VIIRS nighttime light composite data. GIScience & Remote Sensing 2015, 52, (3), 274–289.
70. Zhang Q; Seto KC, Mapping urbanization dynamics at regional and global scales using multi-temporal DMSP/OLS nighttime light data. Remote Sensing of Environment 2011, 115, (9), 2320–2329.
71. Wang J; Aegerter C; Xu X; Szykman JJ, Potential application of VIIRS Day/Night Band for monitoring nighttime surface PM2.5 air quality from space. Atmospheric Environment 2016, 124, 55–63.
72. Zhao Y; Zhao B, Emissions of air pollutants from Chinese cooking: A literature review. Building Simulation 2018, 11, (5), 977–995.
73. Meng X; Chen L; Cai J; Zou B; Wu C-F; Fu Q; Zhang Y; Liu Y; Kan H, A land use regression model for estimating the NO2 concentration in Shanghai, China. Environmental research 2015, 137, 308–315.
74. Celarier E; Brinksma E; Gleason J; Veefkind J; Cede A; Herman J; Ionov D; Goutail F; Pommereau JP; Lambert JC, Validation of Ozone Monitoring Instrument nitrogen dioxide columns. Journal of Geophysical Research: Atmospheres 2008, 113, (D15).
75. Martin RV, An improved retrieval of tropospheric nitrogen dioxide from GOME. Journal of Geophysical Research 2002, 107, (D20).
76. Weisskopf MG; Webster TF, Trade-offs of Personal Versus More Proxy Exposure Measures in Environmental Epidemiology. Epidemiology (Cambridge, Mass.) 2017, 28, (5), 635–643.

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supporting Information

NIHMS1568557-supplement-Supporting_Information.pdf^{(820.8KB, pdf)}
