Abstract
Efficient prediction of the air quality response to emission changes is a prerequisite for an integrated assessment system in developing effective control policies. Yet representing the nonlinear response of air quality to emissions controls with accuracy remains a major barrier in air quality-related decision-making. Here we demonstrate a novel method that combines deep-learning approaches with chemical indicators of pollutant formation to quickly estimate the coefficients of air quality response functions using ambient concentrations of 18 chemical indicators simulated with a comprehensive atmospheric chemical transport model (CTM). By requiring only two CTM simulations for model application, the new method significantly enhances the computational efficiency compared to existing methods that achieve lower accuracy despite requiring 20+ CTM simulations (the benchmark statistical model). Our results demonstrate the utility of deep-learning approaches for capturing the nonlinearity of atmospheric chemistry and physics and the prospects of the new method to inform effective policymaking in other environment systems.
Air pollution is a global concern due to its harmful effects on human health1, climate2, agriculture and ecosystem health3, and visibility4. Ambient PM2.5 (particulate matter with aerodynamic diameter less than 2.5 μm) and ozone (O3) are among the highest risk factors for global premature mortality1,5, with PM2.5 pollution estimated to have contributed to 2.9 million deaths globally in 2017 and O3 pollution to nearly a half million deaths6. A central challenge in effectively controlling the sources of ambient PM2.5 and O3 is that the dominant contributors to these pollutants are emitted precursors such as sulfur dioxide (SO2), nitrogen oxides (NOx), ammonia (NH3), and volatile organic compounds (VOCs)7 that undergo chemical transformations in the atmosphere. The chemical reactions that lead to O3 and PM2.5 formation involve highly nonlinear processes across multiple phases that vary significantly with meteorological conditions and precursor levels. Despite their complexity, these chemical pathways ultimately dictate the strong nonlinear responses of PM2.5 and O3 to precursor emission changes8–12 and must be accurately modeled.
Comprehensive chemical transport models (CTMs) implemented with the most recent knowledge of atmospheric science are the preferred tools for simulating the chemical and physical processes occurring in the atmosphere13. Numerical experiments such as simulating air quality under conditions of reduced precursor emission levels relative to a baseline case (i.e., “brute force” method) can be conducted to investigate the response of air quality to emission changes14. The sensitivity of air pollutant concentrations to emission sources can also be explored with advanced techniques such as the decoupled direct method (DDM)15, higher-order DDM16, and adjoint sensitivity analysis17. Contributions of emissions to ambient concentrations can be estimated using ozone source apportionment technology18, particulate matter source apportioning technology19, the integrated source apportionment method20–21, and source-oriented models22. These methods are practical for quantifying the relative contributions of emission sources to air pollution and the sensitivity of air pollution to limited changes in emissions23; however, they are computationally expensive and do not address prediction of air quality responses to emission changes for the wide range of possible scenarios of interest to policymakers.
Efficient and accurate prediction of air pollutant responses to emission changes is a key component of the integrated assessment systems commonly used by policymakers to quickly achieve multiple objectives. Integrated assessment models for air pollution control quantify the influence of future policies on air pollution levels using process parameterizations and are used to analyze the benefits and costs of emission controls in designing efficient strategies to attain air quality goals24–28. The Air Benefit and Cost and Attainment Assessment System (ABaCAS) is an integrated assessment system that connects air pollution emission control with health benefit and cost estimation29. In ABaCAS, the response of pollutant concentrations to emission changes is predicted in real time with a response surface model (RSM) developed from many CTM simulations using advanced statistical interpolation techniques30–31. Recently, a series of innovations have improved the representation of nonlinear interactions among precursors from sources in multiple regions in extended versions of the original RSM (i.e., E-RSMs)32–34. To ensure model accuracy, the development of RSM and E-RSM requires many control scenarios to be simulated with a CTM, which imposes a heavy computational burden that limits the adaptability and broad application of RSMs. To partially address this issue, an RSM based on polynomial functions (pf-RSM) was recently developed using prior knowledge from earlier RSM studies to reduce the number of CTM simulations required for RSM development by 60%35. However, implementation of the pf-RSM still requires at least 20 CTM simulations, and such computational cost remains a significant barrier to the broad adoption of RSM technology.
In the pf-RSM, polynomial functions were fitted individually for each spatial grid cell and therefore did not consider the moderate degree of spatial correlation that is common among air pollutants. Also, the functions were fitted solely based on simulated O3 and PM2.5 concentrations without considering the concentrations of related chemical species. Many species are influenced by common atmospheric processes and reactions and are highly correlated in the atmosphere. Moreover, concentrations of secondary pollutants, such as O3 and PM2.5, may largely be determined by the ambient levels of their precursors. Previous studies suggest that certain combinations of related chemical species can be used as indicators for O3 and PM2.5 chemistry36–37. Studies have also shown that the response of O3 and PM2.5 to changes in precursor levels can be identified from changes in concentrations of related species38, as illustrated with Empirical Kinetic Modeling Approach (EKMA) diagrams of the response in O3 and PM2.5 concentrations to changes in NOx and VOC concentrations39–40. Such relationships imply that nonlinearity in the O3 and PM2.5 response to precursor emission changes can be quantified using combinations of ambient concentrations of certain species (hereafter indicators), and that the indicator-pollutant relationships are independent of location or time.
Despite the potential predictive value of the chemical indicators, previous RSMs have been directly fit to O3 and PM2.5 concentrations, because collinearity associated with the moderately-correlated indicators cannot be resolved with statistical regression models. In contrast, neural network algorithms are well suited to address collinearity issues and have been used in recent air quality prediction studies41–42. Moreover, convolutional neural networks (CNNs) can potentially enhance predictive capability by preserving important spatial features of pollutants through the network. Although previous studies have used neural networks to forecast air quality under varying meteorological conditions and develop concentration fields for retrospective periods, deep learning methods have not been applied to comprehensively address air quality prediction under varying emission levels, which is of central importance to policymakers.
In this study, we present a novel method called the Deep-learning-based response surface model (DeepRSM) to characterize the response of O3 and PM2.5 concentrations to the full range of emission changes using a deep CNN with carefully designed architecture and training method. The training and test data for the DeepRSM model are based on brute-force simulations with the Community Multiscale Air Quality (CMAQ) CTM (Table S1) on domains that cover China (noted as CN27) and three polluted regions within China (i.e., Northern China Plain, NCP; Fen-Wei Plain, FWP; and Chuan-Yu region, CYR) (Figure S1). The DeepRSM based on the trained CNN can reliably estimate the responsiveness of O3 and PM2.5 concentrations to emission changes for any domain and time period in real time using only ambient concentrations of related chemical species from two simulations (i.e., baseline and fully-controlled emission scenarios). To demonstrate the performance of DeepRSM, we evaluated DeepRSM predictions against CTM results in a series of experiments with different types and numbers of training datasets, as summarized in Table S2. DeepRSM predictions are also compared with those of the existing RSM method, which serves as the benchmark case in this study.
Methods
(1). CTM Configuration
The pf-RSM and DeepRSM were developed using CTM simulations with the CMAQ model (version 5.2; www.epa.gov/cmaq). Baseline concentrations and the responses of PM2.5 and O3 to emission controls were simulated for a matrix of 40 emission control scenarios (Table S1) as part of our previous pf-RSM development35. The four modeling domains are shown in Figure S1. Simulations for the CN27 domain used a horizontal resolution of 27 km by 27 km, and simulations for the three nested domains (i.e., NCP, FWP and CYR) used a finer resolution of 9 km by 9 km. Modeling was performed for January, April, July and October in 2017 to represent winter, spring, summer and fall, respectively. O3 concentrations were analyzed based on afternoon averages (12:00 pm to 6:00 pm local time), and PM2.5 concentrations were based on daily or monthly averages.
The emission data were developed by Tsinghua University based on a bottom-up method with high spatial and temporal resolution. Meteorological fields were based on simulations with the Weather Research and Forecasting (WRF, version 3.7) model. The configurations of the WRF and CMAQ models matched those of our previous study43–44. The performance of the CMAQ model for predicting O3 and PM2.5 concentrations was thoroughly evaluated using ambient measurements43–44 and shown to be acceptable based on recommended benchmarks for comparisons with ground-based observations.
(2). pf-RSM Configuration
Our previous study suggested that nonlinear response of O3 and PM2.5 concentrations to precursor emission controls can be represented by a set of polynomial functions (i.e., pf-RSM)35. The structure of the polynomial function is expressed as follows:
(E1) $$\Delta Conc = \sum_{i} X_i \cdot (E_{NO_x})^{a_i} \cdot (E_{SO_2})^{b_i} \cdot (E_{NH_3})^{c_i} \cdot (E_{VOCs})^{d_i}$$
where ΔConc is the response of the O3 and PM2.5 concentrations (i.e., change relative to the baseline concentration) calculated from a polynomial function of four variables (ENOx, ESO2, ENH3, EVOCs); ENOx, ESO2, ENH3, and EVOCs are the ratios of emission changes relative to baseline emissions for NOx, SO2, NH3, and VOCs, respectively; and ai, bi, ci, and di represent the nonnegative integer powers of ENOx, ESO2, ENH3, and EVOCs, respectively. Xi (the coefficient of term i) is determined by fitting the polynomial function for each spatial grid cell in the pf-RSM using 20 to 40 CTM simulations. The 14 terms used to represent the PM2.5 and O3 responses to emission controls were determined previously in designing the pf-RSM and are shown in Figure 1.
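As a minimal sketch of how a response function of the form E1 could be evaluated for one grid cell, the Python snippet below computes the term values and their weighted sum. The term table `EXAMPLE_TERMS`, the coefficient values, and the function names are hypothetical and chosen only for illustration; the actual 14 terms are those shown in Figure 1 and are not reproduced here.

```python
from math import prod

# Hypothetical term table (NOT the 14 pf-RSM terms from Figure 1).
# Each tuple holds the powers (a_i, b_i, c_i, d_i) applied to
# (E_NOx, E_SO2, E_NH3, E_VOCs).
EXAMPLE_TERMS = [
    (1, 0, 0, 0),  # linear NOx
    (0, 1, 0, 0),  # linear SO2
    (0, 0, 1, 0),  # linear NH3
    (0, 0, 0, 1),  # linear VOCs
    (2, 0, 0, 0),  # second-order NOx
    (1, 0, 0, 1),  # NOx-VOC interaction
]


def term_values(e_nox, e_so2, e_nh3, e_vocs, terms=EXAMPLE_TERMS):
    """Value of each polynomial term for one emission scenario.

    Arguments are emission-change ratios relative to baseline
    (e.g., -0.5 is assumed here to mean a 50% reduction).
    """
    factors = (e_nox, e_so2, e_nh3, e_vocs)
    return [prod(f ** p for f, p in zip(factors, powers)) for powers in terms]


def delta_conc(coefficients, e_nox, e_so2, e_nh3, e_vocs, terms=EXAMPLE_TERMS):
    """Concentration response: sum over terms of X_i times the term value."""
    values = term_values(e_nox, e_so2, e_nh3, e_vocs, terms)
    if len(coefficients) != len(values):
        raise ValueError("need exactly one coefficient per term")
    return sum(x * v for x, v in zip(coefficients, values))


# Illustrative coefficients for a single grid cell (made-up numbers).
coeffs = [1.0, 2.0, 0.5, -1.0, 0.25, -0.5]
response = delta_conc(coeffs, e_nox=-0.5, e_so2=-0.2, e_nh3=0.0, e_vocs=0.4)
```

With no emission change (all ratios equal to zero), every term vanishes and the sketch returns a zero response, which matches the definition of ΔConc as a change relative to the baseline concentration.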
(3). Neural Network Training
The CNN was selected as the neural network in this study because of its advantages in analyzing image data45–46, and the similarity of spatial distributions of ambient pollution concentrations to image data. Also, CNNs are relatively good at representing complex nonlinear behavior compared with other machine learning methods, and are therefore suitable for representing the O3 and PM2.5 response functions.
Dataset.
We collected pollutant concentrations from CTM modeling for 480 days (four domains × four months × 30 days per month) for 40 emission control scenarios plus the baseline- and clean-conditions simulations (42 simulations overall; see Table S1). We conducted numerical experiments to test DeepRSM performance on each of the four spatial domains. To evaluate the temporal transfer capabilities of the DeepRSM (-TT experiments), we used the first 25 days in each month as the training dataset and the last five days in each month as the test dataset. To evaluate the domain transfer capabilities of the DeepRSM (-DT experiments), we used all 360 days from the three domains that were not being tested as the training dataset and the last five days in all months (20 days in total) from the domain being evaluated as the test dataset. For -DT experiments with fine-tuning, we included in the training dataset an additional 5 or 20 days that were randomly selected from the first 25 days in each month from the domain being evaluated.
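The splits described above can be sketched in a few lines of Python. The tuple layout `(domain, month, day)`, the function names, and the seed are illustrative assumptions. The fine-tuning helper draws its days from the pooled first-25-day samples of all four months, which is one possible reading of the text.

```python
import random

DOMAINS = ("CN27", "NCP", "FWP", "CYR")
MONTHS = ("Jan", "Apr", "Jul", "Oct")
DAYS_PER_MONTH = 30
TRAIN_DAYS = 25  # first 25 days of each month; the last five are held out


def all_days(domains=DOMAINS):
    """Every simulated day as a (domain, month, day) tuple."""
    return [(d, m, day)
            for d in domains
            for m in MONTHS
            for day in range(1, DAYS_PER_MONTH + 1)]


def temporal_transfer_split(domain):
    """-TT: train on days 1-25 and test on days 26-30 of the same domain."""
    days = all_days((domain,))
    train = [s for s in days if s[2] <= TRAIN_DAYS]
    test = [s for s in days if s[2] > TRAIN_DAYS]
    return train, test


def domain_transfer_split(test_domain):
    """-DT: train on all days of the other three domains."""
    others = tuple(d for d in DOMAINS if d != test_domain)
    train = all_days(others)
    test = [s for s in all_days((test_domain,)) if s[2] > TRAIN_DAYS]
    return train, test


def fine_tune_days(test_domain, n_days, seed=0):
    """Extra days for fine-tuning, drawn from the test domain's days 1-25.

    Assumption: n_days is the total across all months, not a per-month count.
    """
    pool = [s for s in all_days((test_domain,)) if s[2] <= TRAIN_DAYS]
    return random.Random(seed).sample(pool, n_days)
```

Under these assumptions a -TT experiment uses 100 training days and 20 test days per domain, and a -DT experiment uses 360 training days and 20 test days.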
More training data could improve the CNN model, but the computational cost of the numerical air quality model is too high to generate abundant training data. Data augmentation has been shown to improve CNN performance on low-level tasks (i.e., tasks in which the output value at each location depends only on input values that are spatially close to that location)45–46, which is also the case for the response of atmospheric concentrations to emissions studied here. Therefore, we augmented the data by randomly cropping the indicator maps to a size of 96 to improve the CNN performance.
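The random-crop augmentation can be illustrated with a short Python sketch. It assumes that the maps are stored as nested lists indexed `[channel][row][column]` and that "size of 96" means a square window of 96 × 96 grid cells; both are assumptions, as is the function name.

```python
import random


def random_crop(maps, size=96, rng=random):
    """Cut the same randomly placed square window out of every channel.

    maps: nested list indexed [channel][row][column] (assumed layout).
    Returns the cropped maps and the (top, left) offset, so that the
    concentration labels can be cropped with the identical window.
    """
    height, width = len(maps[0]), len(maps[0][0])
    if height < size or width < size:
        raise ValueError("map is smaller than the crop window")
    top = rng.randint(0, height - size)    # randint includes both endpoints
    left = rng.randint(0, width - size)
    cropped = [[row[left:left + size] for row in channel[top:top + size]]
               for channel in maps]
    return cropped, (top, left)


# Toy input: 2 channels on a 100 x 120 grid, each cell tagged with its index.
maps = [[[(c, i, j) for j in range(120)] for i in range(100)] for c in range(2)]
crop, (top, left) = random_crop(maps, size=96, rng=random.Random(0))
```

Using one offset for all channels keeps the indicator fields spatially aligned with each other and with the labels.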
DeepRSM training strategy.
Since the relative change in pollutant concentration is the metric often used by policymakers, we adopt an objective function that measures the relative loss between predicted and simulated concentrations:
(E2) |
where ŷ and y denote the DeepRSM-predicted and CTM-simulated pollutant concentrations, respectively. The variable N denotes the number of samples, and H, W, and C denote the height, width, and number of channels of y, with i ∈ [0,H], j ∈ [0,W], c ∈ [0,C]. All model hyper-parameters were chosen using holdout validation datasets. The objective function is optimized using Adam47 with β1 = 0.9, β2 = 0.999 and a mini-batch size of 32. The learning rate starts at 0.0002 and decays linearly to zero by the end of training. To reduce the risk of over-fitting, we applied L2 weight regularization on all trainable parameters during training and fine-tuning. For each simulated day, one group of indicators (i.e., the concatenated baseline and clean-condition indicators) corresponds to one group of coefficients in the polynomial response function. However, 40 concentration labels are available that correspond to the 40 emission control scenarios simulated with CMAQ. To achieve computational efficiency with the deep CNN, we calculate the average of the objective function over all emission control scenarios in one day, and then backpropagate the gradients of the average loss to update our model and complete one epoch. The DeepRSM and DeepRSM+ models are trained for 5000 epochs in -TT and -DT experiments and are fine-tuned for another 1000 epochs in fine-tuning experiments.
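The following Python sketch illustrates two parts of the training strategy: a relative loss averaged over the emission control scenarios of one day, and a learning rate that decays linearly from 0.0002 to zero. The loss is assumed here to be the mean absolute relative error; the exact norm used in the objective function may differ. The function names and the `eps` guard against division by zero are illustrative additions.

```python
def relative_loss(predicted, simulated, eps=1e-12):
    """Mean absolute relative error over flattened concentration values.

    Assumption: |y_hat - y| / |y| averaged over all values; eps only
    guards against division by zero and is not taken from the source.
    """
    total = sum(abs(p - s) / max(abs(s), eps)
                for p, s in zip(predicted, simulated))
    return total / len(simulated)


def scenario_averaged_loss(predictions_by_scenario, labels_by_scenario):
    """Average the loss over all emission control scenarios of one day,
    so that a single gradient step can cover every scenario."""
    losses = [relative_loss(p, y)
              for p, y in zip(predictions_by_scenario, labels_by_scenario)]
    return sum(losses) / len(losses)


def linear_decay_lr(step, total_steps, base_lr=2e-4):
    """Learning rate that falls linearly from base_lr to zero."""
    return base_lr * max(0.0, 1.0 - step / total_steps)


# Two toy scenarios with two grid-cell values each.
loss = scenario_averaged_loss([[1.1, 1.8], [2.0, 4.0]],
                              [[1.0, 2.0], [2.0, 4.0]])
```

In an actual training loop the averaged loss would be differentiated by the deep-learning framework; the sketch only shows the arithmetic.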
Evaluation metric.
Validation of the model performance is critical48. For consistency with the performance evaluation of the benchmark model30–31, 34, the performance of the DeepRSM was evaluated using two statistical indices, namely the MeanNE and the 95th MaxNE, which are also commonly used in evaluating the performance of atmospheric numerical modeling49. They are calculated as follows:
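(E3) $$MeanNE = \frac{1}{N} \sum_{i=1}^{N} \frac{|M_i - O_i|}{O_i}$$
(E4) $$95th\ MaxNE = P_{95}\left( \frac{|M_i - O_i|}{O_i} \right)$$
where Mi and Oi are the DeepRSM-predicted and CMAQ-simulated values of the ith data point in the series, P95 denotes the 95th percentile over all records, and N represents the number of records (i.e., the number of datasets multiplied by the number of grid cells).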
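A small Python sketch of the two evaluation metrics is given below. It assumes that the normalized error is |M − O| / |O| and that the 95th MaxNE is the 95th percentile of the normalized errors. The nearest-rank percentile convention and the function names are assumptions.

```python
def normalized_errors(predicted, simulated):
    """Per-record normalized error |M_i - O_i| / |O_i|."""
    return [abs(m - o) / abs(o) for m, o in zip(predicted, simulated)]


def mean_ne(predicted, simulated):
    """Mean normalized error over all records."""
    errors = normalized_errors(predicted, simulated)
    return sum(errors) / len(errors)


def max_ne_95th(predicted, simulated):
    """95th percentile of the normalized errors (nearest-rank convention)."""
    errors = sorted(normalized_errors(predicted, simulated))
    rank = -(-95 * len(errors) // 100)  # integer ceil(0.95 * n)
    return errors[max(rank, 1) - 1]


# Toy records: predictions exceed the simulated value by 1% to 20%.
simulated = [100.0] * 20
predicted = [100.0 + k for k in range(1, 21)]
print(mean_ne(predicted, simulated), max_ne_95th(predicted, simulated))
```

Taking the 95th percentile instead of the absolute maximum makes the metric less sensitive to a handful of grid cells with very small simulated concentrations.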
Results
(1). The principle of the DeepRSM
The basic principle of the DeepRSM is that the coefficients in the response functions for PM2.5 and O3 from the pf-RSM can be accurately estimated from indicator species rather than by fitting results of CTM simulations based on random samples of emission scenarios. This design eliminates the need for a large number of computationally expensive CTM simulations as in the previous pf-RSM approach. To deploy the DeepRSM, only two CTM simulations are required: one for baseline emission levels and one for “clean” emission levels, where all anthropogenic PM2.5 and O3 precursor emissions are fully controlled.
The key design elements of the DeepRSM are the selection of the O3 and PM2.5 response indicators (i.e., concentrations of relevant chemical species under baseline and clean conditions) and the architecture of the CNN. To ensure the efficiency of the DeepRSM, we selected 18 chemical indicators that are relatively important to O3 and PM2.5 formation from the 130+ chemical species that are simulated in the CMAQ model. The indicators are either products or reactants in chemical reactions involving O3 or PM2.5 and are represented in all major CTMs. The pf-RSM model predicts strong correlations between the coefficients of the 14 terms in the PM2.5 and O3 response functions and the changes in indicator concentrations between the baseline and clean emission simulations. These correlations are consistent with current knowledge in atmospheric chemistry. For example, the coefficient of the linear term for NOx emissions in the O3 response function exhibits the strongest positive correlation with H2O2 concentrations (r = 0.8) but negative correlation with concentrations of the nitrogen species (r = −0.3 ~ −0.6) (Figure 1a). These relationships reflect the behavior that NOx emission control tends to reduce O3 when H2O2 is high and NOx is low (NOx-limited regime50) but increase O3 when NOx is high and H2O2 is low (VOC-limited regime).
The strong correlations between indicators and response function coefficients in Figure 1 imply that valuable information for predicting the response functions for PM2.5 and O3 is contained in the indicators. However, extracting this information is challenging because the coefficient of each term is positively or negatively correlated with multiple indicators. For example, the PM2.5 components (SO4, NO3, NH4, and SOC) are highly correlated with the majority of coefficients in both the O3 and PM2.5 response functions (Figure 1). Such collinearity among the chemical indicators motivates use of neural network technology, which has advantages over traditional statistical regression in resolving complex relationships.
Deep neural networks have led to a series of breakthroughs in a wide range of fields due to their powerful expressive ability to approximate complex nonlinear functions51–52. A deep CNN with residual connection53 is employed here for four reasons. First, deep neural networks can efficiently solve highly nonlinear regression problems and are therefore potentially suitable for resolving the collinearity among chemical indicators. Second, CNNs can effectively use spatial relationships among nearby chemical indicators that may contribute to local pollutant concentrations. Third, CNNs with the convolutional kernel applied over space can well represent the common atmospheric processes and reactions occurring across the domain. Finally, residual connection is indispensable for modern deep CNN models, and a deep network is needed to provide high accuracy in modeling the complex processes that influence atmospheric chemistry.
The architecture of the DeepRSM model is illustrated in Figure 2. We use spatial concentration fields of 18 chemical indicators under baseline and clean conditions to represent the predictive features of the system. We concatenate the indicator fields for both scenarios before feeding them into our DeepRSM model. The first convolutional layer of the DeepRSM model transforms the 36 input channels of indicator maps into 128 channels of feature maps. This layer is followed by eight residual blocks and one convolutional layer through which the number of channels is maintained at 128 to increase the expressiveness of the network. The last convolutional layer transforms the number of channels from 128 to 14, which represent the coefficients in the standard polynomial function based on prior knowledge from pf-RSM development. O3 and PM2.5 concentrations are calculated as the inner product of the coefficients in the last layer and the corresponding response function terms based on the specific emission control scenario. We use a LeakyReLU54 as the nonlinear activation function because it preserves negative gradients and performs well in low-level regression tasks. Our results suggest that the DeepRSM (trained with the CN27 dataset as one example) can well reproduce the spatial and seasonal variations in the coefficients of the PM2.5 and O3 response functions, with results similar to those of the pf-RSM (Figure S2).
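To make the channel bookkeeping and the final inner product concrete, the Python sketch below lists the input and output channels of each stage described above and combines per-cell coefficient maps with the term values of one emission scenario. It is not an implementation of the network itself: no convolutions are computed, and the function names and the nested-list layout `[term][row][column]` are illustrative assumptions.

```python
def layer_channel_plan(n_indicators=18, width=128, n_residual_blocks=8,
                       n_terms=14):
    """(name, in_channels, out_channels) for each stage of the network."""
    plan = [("input conv", 2 * n_indicators, width)]  # baseline + clean maps
    plan += [(f"residual block {k}", width, width)
             for k in range(1, n_residual_blocks + 1)]
    plan.append(("conv", width, width))
    plan.append(("output conv", width, n_terms))       # one map per coefficient
    return plan


def response_map(coefficient_maps, term_values):
    """Inner product of coefficient maps and term values in every grid cell.

    coefficient_maps: nested list indexed [term][row][column].
    term_values: one value per polynomial term for a given scenario.
    """
    n_terms = len(term_values)
    height, width = len(coefficient_maps[0]), len(coefficient_maps[0][0])
    return [[sum(coefficient_maps[t][i][j] * term_values[t]
                 for t in range(n_terms))
             for j in range(width)]
            for i in range(height)]


# Toy example with 2 terms on a 2 x 2 grid.
coeffs = [[[1.0, 2.0], [3.0, 4.0]],
          [[0.5, 0.5], [0.5, 0.5]]]
print(response_map(coeffs, [2.0, -4.0]))
```

Because the network predicts coefficients and not concentrations, a single forward pass yields maps that can be reused for any emission scenario by changing only `term_values`.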
Although the polynomial function in the pf-RSM was carefully designed in our previous study, uncertainty still exists in the functional form of air quality responses to precursor emission changes. Therefore, in addition to the DeepRSM based on the 14 terms of the pf-RSM response function, we developed the DeepRSM+ model that augments the polynomial function with 50 additional implicit terms to reduce the approximation error. The additional terms are automatically learned from the emission control factor vector using a compensated polynomial term model (CPT Model in Figure 2) and are not associated with an analytical functional form. The CPT Model uses three fully connected layers of width 128 to learn the nonlinear transformation from the emission control factor vector to the values of the additional 50 terms. The total number of terms in the augmented polynomial function is 64, which equals the number of coefficient maps and channels in the last convolutional layer of the DeepRSM+ model.
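The idea of the compensated polynomial terms can be sketched in plain Python as a small fully connected network that maps the four emission control factors to 50 extra term values, which are then appended to the 14 explicit terms. The layer sizes 4 → 128 → 128 → 50, the LeakyReLU slope of 0.2, the weight initialization, and all names are assumptions made for illustration. The weights below are random and untrained.

```python
import random


def leaky_relu(x, slope=0.2):
    # The slope value is an assumption; the text does not state it.
    return x if x > 0 else slope * x


def init_layer(n_in, n_out, rng):
    """Random (untrained) weights and zero biases for one dense layer."""
    scale = (1.0 / n_in) ** 0.5
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(layer, inputs, activate=True):
    weights, biases = layer
    outputs = [sum(w * x for w, x in zip(row, inputs)) + b
               for row, b in zip(weights, biases)]
    return [leaky_relu(v) for v in outputs] if activate else outputs


class CompensatedTerms:
    """Hypothetical stand-in for the CPT Model: factors -> 50 implicit terms."""

    def __init__(self, n_factors=4, width=128, n_extra=50, seed=0):
        rng = random.Random(seed)
        self.layers = [init_layer(n_factors, width, rng),
                       init_layer(width, width, rng),
                       init_layer(width, n_extra, rng)]

    def __call__(self, factors):
        hidden = list(factors)
        last = len(self.layers) - 1
        for k, layer in enumerate(self.layers):
            hidden = dense(layer, hidden, activate=(k < last))
        return hidden


def augmented_terms(explicit_terms, factors, cpt):
    """14 explicit polynomial terms followed by the learned implicit terms."""
    return list(explicit_terms) + cpt(factors)


cpt = CompensatedTerms()
terms = augmented_terms([0.0] * 14, [1.0, 0.5, 1.0, 0.8], cpt)
print(len(terms))
```

The 64 resulting term values line up one-to-one with the 64 coefficient maps produced by the last convolutional layer of the DeepRSM+ model.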
(2). The DeepRSM is effective across time periods and spatial domains
A key advantage of the DeepRSM is that the trained deep CNN is generally transferable across time periods (i.e., temporal transfer, TT) and spatial regions (i.e., domain transfer, DT). To examine the temporal transfer capabilities, we trained the DeepRSM model using data from the first 25 days in each of four months and applied it to predict concentration responses in the last five days of each month on the same domain (i.e., -TT experiment in Table S2 for PM2.5 and O3, Figure 3 for PM2.5, and Figure S3 for O3). Evaluation of the DeepRSM predictions against CTM results demonstrates good performance, with mean normalized error (MeanNE) less than 5% and 95th maximal NE (95th MaxNE) less than 10%. The performance of the DeepRSM based on two CTM simulations in the -TT experiment is significantly better than that of the pf-RSM, which is based on fitting with 20 CTM scenarios, and demonstrates the transferability of the DeepRSM to time periods not included in the training data.
To examine the transferability of the DeepRSM to different spatial domains, the air quality response predicted in one domain was evaluated based on the DeepRSM model trained with data from the other three domains (i.e., -DT experiment in Table S2, Figure 3, and Figure S3). The -DT experiment is a greater test for the DeepRSM than the -TT experiment, because differences in air quality simulated for different regions and grid resolutions are much larger than for air quality simulated for different days for the same region and resolution. Despite the greater challenge, the DeepRSM performance is only slightly degraded in the -DT experiment compared to the -TT experiment. The DeepRSM exhibits similar or slightly better performance than pf-RSM in the -DT experiment in all domains except for CN27.
Predicting concentrations on the CN27 domain is relatively challenging using the DeepRSM based on training data from the three smaller domains that do not fully encompass the CN27 domain. However, the DeepRSM performance can be readily improved as necessary using a fine-tuning procedure in which the model is dynamically updated using very little additional training data. To demonstrate the performance improvement, we fine-tuned the DeepRSM models trained in the -DT experiments using an additional 5 or 20 days of data from the test domain and a relatively small number of epochs (i.e., -DTF5 and -DTF20 cases in Figure 3 and Table S2). The fine-tuning method is especially effective for reducing prediction bias for the CN27 domain.
DeepRSM predictions of the daily variation in air quality response were also evaluated for the -DT experiment in which no data for the test domain were used in training (Figure S4 for PM2.5 and Figure S5 for O3). The results indicate that the daily variations in air quality response predicted by the DeepRSM are similar to those simulated with CMAQ across all four months and domains. Moreover, the spatial distributions of air quality responses are also consistent with CMAQ simulations, as shown in Figure S6–S9 for PM2.5 and Figure S10–S13 for O3. The results of the -TT and -DT experiments demonstrate that the DeepRSM can efficiently and reliably capture variations in PM2.5 and O3 response across space and time.
To further examine the ability of the DeepRSM to predict the nonlinear response of air quality to emission changes, we generated PM2.5 and O3 isopleths for DeepRSM predictions in the -DT experiment for simultaneous changes in emissions of two precursors (Figure S14–S15): PM2.5 response to NOx and VOC emissions (Figure S14a), PM2.5 response to SO2 and NH3 emissions (Figure S14b), O3 response to NOx and VOC emissions (Figure S15a), and O3 response to SO2 and NH3 emissions (Figure S15b). We included 25 colored dots in the isopleths that correspond to CMAQ predictions that were not used in model training for comparison with the DeepRSM predictions. We also compared isopleths based on pf-RSM predictions with those based on the DeepRSM. These comparisons indicate that the DeepRSM generally captures the nonlinear response of O3 and PM2.5 to precursor emission changes across seasons. For instance, the DeepRSM predicts that O3 chemistry is strongly VOC-limited in January and NOx-limited in July and that the PM2.5 response to NOx and VOC emission changes has a similar, but weaker, dependence on oxidant abundance as O3. The DeepRSM results also suggest that the effectiveness of NOx and NH3 emission controls for PM2.5 reduction increases with increasing control (from 1 to 0 in Figure S14). The concentration responses predicted by the DeepRSM generally agree well with those simulated by CMAQ, and the DeepRSM isopleths are consistent with the pf-RSM isopleths, despite use of only two CTM simulations by the DeepRSM.
As mentioned above, the performance of the DeepRSM can be further improved by optimizing the polynomial structure using the DeepRSM+ model. The DeepRSM+ model adopts 50 additional terms that are learned from the emission control factor vector using the CPT model to reduce the approximation error of the polynomial function. In all experiments, the DeepRSM+ model with optimized polynomial structure based on fine-tuning with an additional 20 simulation days (i.e., -PolyF20 experiment in Table S2, Figure 3, and Figure S3) exhibits the best performance, with MeanNE < 5% and 95th MaxNE < 10% across all months and domains. The value of the compensation terms is also evident in the isopleth comparison displayed in Figure 4. The compensation terms adjust the DeepRSM toward the CTM simulation results, particularly along edges of the isopleths where emission control factors are close to 0 (fully-controlled) or 2 (doubled). These conditions are relatively hard to resolve using the DeepRSM model based on the 14-term polynomial function alone.
(3). Interpretability of the DeepRSM for prediction of air quality response
The success of the DeepRSM implies that information from only two states (i.e., baseline and fully-controlled scenarios) is needed to fit the curved concentration surface in four-dimensional space (i.e., emission changes of NOx, SO2, NH3 and VOCs) using the trained deep CNN. Concentrations throughout the four-dimensional space cannot be predicted accurately using only PM2.5 or O3 concentrations from the two states; however, rich information for the prediction of PM2.5 and O3 is contained in the states in the form of the chemical indicators. Therefore, the DeepRSM predictions are based not only on PM2.5 and O3 concentrations at two points, but on two vectors that include the full suite of chemical indicators in addition to PM2.5 and O3 concentrations. The set of indicators contains sufficient information to represent the key atmospheric chemical and physical processes independent of spatial location or time period. The DeepRSM represents the atmospheric processes by linking the coefficients of the air quality response functions and the indicators in an efficient way, as follows.
If we consider a single grid cell in a CTM as a box model, the concentration change over time can be written as follows:
(E5) $$\frac{d[P]}{dt} = \sum_{i} f_i\left(k_i, [I_s]\right)$$
where [P] is the concentration of the air pollutant (i.e., PM2.5 or O3); fi is the numerical function of process i (e.g., transport, chemistry, deposition) that contributes to the pollutant concentration; ki is related to geographic (e.g., land cover) and meteorological variables (e.g., temperature, solar radiation, wind) but independent of concentrations; and [Is] is the concentration of reactant s in a bi- or trimolecular reaction. The ambient concentrations of the gaseous precursors for O3 and PM2.5 (i.e., [Ip], where p = NOx, SO2, NH3, and VOCs) are approximately proportional to their emissions (Ep), as follows:
(E6) $$[I_p] \propto E_p$$
Although the forms of the fi terms differ significantly for different processes, they can all be approximated with polynomial functions. Using the precursor emissions as independent variables, E5 can be represented as a polynomial function of precursor emissions, as follows:
(E7) |
where gj is the jth term in the polynomial function of precursor emissions.
The average concentration of P over an integration period can be estimated based on E7 according to the following:
(E8) |
Equation E8 is of the same form as the polynomial function used in the pf-RSM. Therefore, the accuracy of the pf-RSM suggests that the coefficient of each term is roughly constant and unrelated to the variation of Ep, but still related to the constant ki and the concentration of reactant [Is]. Thus, we can conclude that the coefficient of each term is determined only by the concentration of reactants and the geographic or meteorological factors. Since the coefficient of each term is constant in the response function and does not change with emissions, the concentration of reactants can be determined from a single baseline-emission simulation to develop the response functions. Considering the challenges in representing the geographic and meteorological factors, we additionally use the concentration of reactants under clean conditions (fully-controlled scenario) to further represent such influence. More importantly, the difference in concentrations from the two scenarios (baseline and fully-controlled) can be used to indicate the influence of the controllable fraction of the total emissions, since some emissions cannot be readily controlled (e.g., biogenic sources and regional emissions from outside the target area).
To promote interpretability of the machine-learning results, we examined the relative contribution of each indicator to the coefficients in the PM2.5 response function (Figure 5). In general, the wide range of the contribution of each indicator to the coefficients demonstrates the advantage of machine learning for feature extraction from the raw 18 indicators. The coefficient for the linear NH3 emission term (Term 2) is strongly determined by the indicators HNO3, nitrate (NO3), ammonium (NH4), and PM2.5. The coefficient for the linear SO2 emission term (Term 5) is strongly determined by the indicators OH, sulfate (SO4), and ammonium (NH4). For high order NOx emission terms (Terms 8, 12, 13, 14), the coefficients are most influenced by indicators associated with complex free radical oxidation reactions. These relationships are consistent with known mechanisms of atmospheric chemistry and indicate that the DeepRSM based on deep learning is scientifically reasonable in addition to performing with high accuracy and efficiency.
Our study is the first to apply deep-learning technology to predicting the air quality response to emission changes, linking CNN and RSM technologies through a carefully selected set of chemical indicators and a novel model design. Compared with previous methods such as the original RSM, the DeepRSM developed in this study significantly improves real-time prediction of air quality across the full range of policy-relevant control strategies.
Because the DeepRSM links the coefficients of the PM2.5 and O3 response functions to chemical indicators independently of time and space, it can be applied to any study period or domain. The good performance of the CNN for days (-TT experiments) and spatial domains (-DT experiments) not represented in the training data supports this use. Compared with traditional regression methods (e.g., the pf-RSM benchmark case), the DeepRSM is more efficient and more accurate, and can therefore be used for real-time air quality response prediction in integrated assessment systems to inform long-term air quality management. It can also be applied to daily air quality forecasting, informing emergency actions that protect public health through a combination of short-term pollutant controls.
The main scientific implication of our study is that the ambient concentrations of the chemical indicators are key factors in determining the nonlinear response of air quality to emission changes. This finding does not imply that other factors are unimportant: factors such as meteorology and geographic characteristics are likely captured implicitly by the CNN through the change in indicator concentrations between the clean and baseline conditions. This study also suggests that, for systems that can be represented deterministically (e.g., atmospheric air pollution), the full pathway can be interpreted using information from the initial and final states alone. However, training networks to adequately represent such systems remains a major challenge, requiring full knowledge of the relevant factors (indicators) and ample training data.
Acknowledgements
This work was supported in part by the National Key R&D Program of China (2018YFC0213805), the National Natural Science Foundation of China (21625701, 41907190, 51861135102, 71722003, 71974108 and 71690244), and the MSRA Star Track Program. Shuxiao Wang acknowledges the support from the Tencent Foundation through the XPLORER PRIZE. This work was completed on the “Explorer 100” cluster system of Tsinghua National Laboratory for Information Science and Technology. The views expressed in this manuscript are those of the authors alone and do not necessarily reflect the views and policies of the U.S. Environmental Protection Agency.
Footnotes
Supporting Information
Training and testing dataset; Statistics of pf-RSM and DeepRSM performance; 14 term coefficients of the PM2.5 and O3 response function; Daily variation, spatial distribution, and the isopleths of PM2.5 and O3 response.
Data and code availability
The original data and code used in this study are available upon request from the corresponding authors.
Competing interests
The authors declare no competing financial interests.
References
- (1).Cohen AJ; Brauer M; Burnett R; Anderson HR; Frostad J; Estep K; Balakrishnan K; Brunekreef B; Dandona L; Dandona R; Feigin V Estimates and 25-year trends of the global burden of disease attributable to ambient air pollution: an analysis of data from the Global Burden of Diseases Study 2015. The Lancet 2017, 389(10082), 1907–1918.
- (2).Myhre G, Shindell D, Bréon F-M, Collins W, Fuglestvedt J, Huang J, Koch D, Lamarque J-F, Lee D, Mendoza B, Nakajima T, Robock A, Stephens G, Takemura T and Zhang H, 2013: Anthropogenic and Natural Radiative Forcing. In: Climate Change 2013: The Physical Science Basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change [Stocker TF, Qin D, Plattner G-K, Tignor M, Allen SK, Boschung J, Nauels A, Xia Y, Bex V and Midgley PM (eds.)]. Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA, 2013, 659–740.
- (3).Fuhrer J; Val Martin M; Mills G; Heald CL; Harmens H; Hayes F; Sharps K; Bender J; Ashmore MR Current and future ozone risks to global terrestrial biodiversity and ecosystem processes. Ecology and evolution, 2016, 6(24), 8785–8799.
- (4).Friedlander SK Smoke, dust and haze: Fundamentals of aerosol behavior. New York, Wiley-Interscience, 1977. 333pp.
- (5).Forouzanfar MH; Alexander L; Anderson HR; Bachman VF; Biryukov S; Brauer M; Burnett R; Casey D; Coates MM; Cohen A; Delwiche K Global, regional, and national comparative risk assessment of 79 behavioural, environmental and occupational, and metabolic risks or clusters of risks in 188 countries, 1990–2013: a systematic analysis for the Global Burden of Disease Study 2013. The Lancet 2015, 386 (10010), 2287–2323.
- (6).HEI, 2019. Health Effects Institute. 2019. State of Global Air 2019. Available: www.stateofglobalair.org [accessed 08/22/2019].
- (7).Seinfeld JH; Pandis SN Atmospheric chemistry and physics: from air pollution to climate change. John Wiley & Sons, 2012
- (8).West JJ; Ansari AS; Pandis SN Marginal PM2.5: nonlinear aerosol mass response to sulfate reductions in the Eastern United States. Journal of the Air & Waste Management Association, 1999, 49(12), 1415–1424.
- (9).Hakami A; Odman MT; Russell AG Nonlinearity in atmospheric response: A direct sensitivity analysis approach. Journal of Geophysical Research: Atmospheres, 2004, 109(D15), 10.1029/2003JD004502.
- (10).Cohan DS; Hakami A; Hu Y; Russell AG Nonlinear response of ozone to emissions: source apportionment and sensitivity analysis. Environ. Sci. Technol 2005, 39, 6739–6748.
- (11).Pun BK; Seigneur C; Bailey EM; Gautney LL; Douglas SG; Haney JL; Kumar N Response of atmospheric particulate matter to changes in precursor emissions: a comparison of three air quality models. Environmental science & technology, 2007, 42(3), 831–837.
- (12).Megaritis AG; Fountoukis C; Charalampidis PE; Pilinis C and Pandis SN Response of fine particulate matter concentrations to changes of emissions and temperature in Europe. Atmospheric Chemistry and Physics, 2013, 13(6), 3423–3443.
- (13).Brasseur GP; Jacob DJ Modeling of Atmospheric Chemistry. Cambridge University Press, 2017.
- (14).Sandu A; Carmichael GR; Potra FA Sensitivity analysis for atmospheric chemistry models via automatic differentiation. Atmospheric Environment, 1997, 31(3), 475–489.
- (15).Napelenok SL; Cohan DS; Hu Y; Russell AG Decoupled direct 3D sensitivity analysis for particulate matter (DDM-3D/PM). Atmospheric Environment, 2006, 40(32), 6112–6121.
- (16).Hakami A; Odman MT; Russell AG. High-order, direct sensitivity analysis of multidimensional air quality models. Environmental Science & Technology, 2003, 37(11), 2442–2452.
- (17).Sandu A; Daescu DN; Carmichael GR; Chai T Adjoint sensitivity analysis of regional air quality models. Journal of Computational Physics, 2005, 204(1), 222–252.
- (18).Dunker AM; Yarwood G; Ortmann JP; Wilson GM Comparison of source apportionment and source sensitivity of ozone in a three-dimensional air quality model, Environmental Science and Technology, 2002, 36, 2953–2964.
- (19).Wagstrom KM; Pandis SN; Yarwood G; Wilson GM; Morris RE Development and application of a computationally efficient particulate matter apportionment algorithm in a three-dimensional chemical transport model. Atmospheric Environment, 2008, 42, 5650–5659.
- (20).Kwok RHF; Napelenok SL; Baker KR Implementation and evaluation of PM2.5 source contribution analysis in a photochemical model, Atmos. Environ, 2013, 80, 398–407.
- (21).Kwok RHF; Baker KR; Napelenok SL; Tonnesen GS Photochemical grid model implementation and application of VOC, NOx, and O3 source apportionment, Geosci. Model Dev, 2015, 8, 99–114.
- (22).Zhang H; DeNero SP; Joe DK; Lee H-H; Chen S-H; Michalakes J; and Kleeman MJ Development of a source oriented version of the WRF/Chem model and its application to the California regional PM10 / PM2.5 air quality study, Atmos. Chem. Phys, 2014, 14, 485–503, 10.5194/acp-14-485-2014.
- (23).Koo B; Wilson GM; Morris RE; Dunker AM; and Yarwood G Comparison of Source Apportionment and Sensitivity Analysis in a Particulate Matter Air Quality Model Environ. Sci. Technol 2009, 43, 17, 6669–6675. https://pubs.acs.org/doi/abs/10.1021/es9008129
- (24).Schöpp W; Amann M; Cofala J; Heyes C; Klimont Z Integrated assessment of European air pollution emission control strategies. Environmental Modelling & Software, 1998, 14(1), 1–9.
- (25).Reis S; Nitter S; Friedrich R Innovative approaches in integrated assessment modelling of European air pollution control strategies–implications of dealing with multi-pollutant multi-effect problems. Environmental Modelling & Software, 2005, 20(12), 1524–1531.
- (26).Amann M; Bertok I; Borken-Kleefeld J; Cofala J; Heyes C; Höglund-Isaksson L; Klimont Z; Nguyen B; Posch M; Rafaj P and Sandler R Cost-effective control of air quality and greenhouse gases in Europe: Modeling and policy applications. Environmental Modelling & Software, 2011, 26(12), 1489–1501.
- (27).Wild O; Fiore AM; Shindell DT; Doherty RM; Collins WJ; Dentener FJ; Schultz MG; Gong S; MacKenzie IA; Zeng G; Hess P; Duncan BN; Bergmann DJ; Szopa S; Jonson JE; Keating TJ; and Zuber A Modelling future changes in surface ozone: a parameterized approach, Atmos. Chem. Phys, 2012, 12, 2037–2054, 10.5194/acp-12-2037-2012.
- (28).Turnock ST; Wild O; Dentener FJ; Davila Y; Emmons LK; Flemming J; Folberth GA; Henze DK; Jonson JE; Keating TJ; Kengo S; Lin M; Lund M; Tilmes S; and O’Connor FM The impact of future emission policies on tropospheric ozone using a parameterised approach, Atmos. Chem. Phys, 2018, 18, 8953–8978, 10.5194/acp-18-8953-2018.
- (29).Xing J; Wang S; Jang C; Zhu Y; Zhao B; Ding D; Wang J; Zhao L; Xie H; Hao J ABaCAS: an overview of the air pollution control cost-benefit and attainment assessment system and its application in China. The Magazine for Environmental Managers - Air & Waste Management Association, 2017, April.
- (30).Xing J; Wang SX; Jang C; Zhu Y; Hao JM Nonlinear response of ozone to precursor emission changes in China: a modeling study using response surface methodology. Atmos. Chem. Phys, 2011, 11, 5027–5044.
- (31).Wang SX; Xing J; Jang C; Zhu Y; Fu JS; Hao J Impact assessment of ammonia emissions on inorganic aerosols in east China using response surface modeling technique. Environ. Sci. Technol, 2011, 45, 9293–9300.
- (32).Zhao B; Wang SX; Xing J; Fu K; Fu JS; Jang C; Zhu Y; Dong XY; Gao Y; Wu WJ; Wang JD Assessing the nonlinear response of fine particles to precursor emissions: development and application of an extended response surface modeling technique v1.0. Geosci. Model Dev, 2015, 8, 115–128.
- (33).Zhao B; Wu W; Wang S; Xing J; Chang X; Liou KN; Jiang JH; Gu Y; Jang C; Fu JS and Zhu Y A modeling study of the nonlinear response of fine particles to air pollutant emissions in the Beijing–Tianjin–Hebei region. Atmospheric Chemistry and Physics, 2017, 17(19), 12031–12050.
- (34).Xing J; Wang S; Zhao B; Wu W; Ding D; Jang C; Zhu Y; Chang X; Wang J; Zhang F; Hao J Quantifying Nonlinear Multiregional Contributions to Ozone and Fine Particles Using an Updated Response Surface Modeling Technique. Environmental science & technology, 2017, 51(20), 11788–11798.
- (35).Xing J; Ding D; Wang S; Zhao B; Jang C; Wu W; Zhang F; Zhu Y; Hao J Quantification of the enhanced effectiveness of NOx control from simultaneous reductions of VOC and NH3 for reducing air pollution in the Beijing–Tianjin–Hebei region, China, Atmos. Chem. Phys, 2018, 18, 7799–7814, 10.5194/acp-18-7799-2018.
- (36).Zhang Y; Wen XY; Wang K; Vijayaraghavan K and Jacobson MZ. Probing into regional O3 and particulate matter pollution in the United States: 2. An examination of formation mechanisms through a process analysis technique and sensitivity study. Journal of Geophysical Research: Atmospheres, 2009, 114(D22), 10.1029/2009JD011900.
- (37).Liu XH; Zhang Y; Xing J; Zhang Q; Wang K; Streets DG; Jang C; Wang WX; Hao JM Understanding of regional air pollution over China using CMAQ, part II. Process analysis and sensitivity of ozone and particulate matter to precursor emissions. Atmospheric Environment, 2010, 44(30), 3719–3727.
- (38).Xing J; Ding D; Wang S; Dong Z; Kelly JT; Jang C; Zhu Y; Hao J Development and application of observable response indicators for design of an effective ozone and fine particle pollution control strategy in China, Atmospheric Chemistry and Physics, 2019, 19(21), 13627–13646.
- (39).Gipson GL; Freas WP; Kelly RF; Meyer EL Guideline for use of city-specific EKMA in preparing ozone SIPs. EPA-450/4–80-027, US Environmental Protection Agency, Research Triangle Park, North Carolina, USA, 1981.
- (40).Womack CC; McDuffie EE; Edwards PM; Bares R; de Gouw JA; Docherty KS; Dubé WP; Fibiger DL; Franchin A; Gilman JB; Goldberger L; Lee BH; Lin JC; Long R; Middlebrook AM; Millet DB; Moravek A; Murphy JG; Quinn PK; Riedel TP; Roberts JM; Thornton JA; Valin LC; Veres PR; Whitehill AR; Wild RJ; Warneke C; Yuan B; Baasandorj M; and Brown SS An odd oxygen framework for wintertime ammonium nitrate aerosol pollution in urban areas: NOx and VOC control as mitigation strategies. Geophysical Research Letters, 2019, 46(9), 4971–4979.
- (41).Cabaneros SMS; Calautit JK; Hughes BR A review of artificial neural network models for ambient air pollution prediction. Environmental Modelling & Software, 2019, 119, 285–304.
- (42).Di Q; Kloog I; Koutrakis P; Lyapustin A; Wang Y; Schwartz J Assessing PM2.5 exposures with high spatiotemporal resolution across the continental United States. Environmental science & technology, 2016, 50(9), 4712–4721.
- (43).Ding D; Xing J; Wang S; Liu K; Hao J Estimated Contributions of Emissions Controls, Meteorological Factors, Population Growth, and Changes in Baseline Mortality to Reductions in Ambient PM2.5 and PM2.5-Related Mortality in China, 2013–2017. Environ Health Perspect. 2019, 127(6):67009.
- (44).Ding D; Xing J; Wang S; Chang X; Hao J Impacts of emissions and meteorological changes on China’s ozone pollution in the warm seasons of 2013 and 2017. Frontiers of Environmental Science & Engineering, 2019, 13(5), 76.
- (45).Dong C; Loy CC; He K; and Tang X Image super-resolution using deep convolutional networks[J]. IEEE transactions on pattern analysis and machine intelligence, 38(2): 295–307, 2015.
- (46).Zhang K; Zuo W; Chen Y; Meng D; and Zhang L Beyond a gaussian denoiser: Residual learning of deep cnn for image denoising[J]. IEEE Transactions on Image Processing, 26(7): 3142–3155, 2017.
- (47).Kingma DP; Ba J: Adam: A method for stochastic optimization[J]. International Conference of Learning Representation. 2014.
- (48).Humphrey GB; Maier HR; Wu W; Mount NJ; Dandy GC; Abrahart RJ; and Dawson CW “Improved Validation Framework and R-Package for Artificial Neural Network Models.” Environmental Modelling & Software 92:82–106, 2017.
- (49).EPA US. Guidance on the Use of Models and Other Analyses for Demonstrating Attainment of Air Quality Goals for Ozone, PM2.5, and Regional Haze. U S EPA, Research Triangle Park, NC 27711: Office of Air and Radiation, Office of Air Quality Planning and Standards, 2007.
- (50).Finlayson-Pitts BJ; Pitts JN Jr Chemistry of the upper and lower atmosphere: theory, experiments, and applications. Elsevier, 1999.
- (51).Csáji BC: Approximation with artificial neural networks[J]. Faculty of Sciences, Eötvös Loránd University, Hungary, 2001, 24(48): 7.
- (52).Lu Z; Pu H; Wang F; and Wang L The expressive power of neural networks: A view from the width[C]// Advances in neural information processing systems. 2017: 6231–6239.
- (53).He K; Zhang X; Ren S; and Sun J Deep residual learning for image recognition[C]//Proceedings of the IEEE conference on computer vision and pattern recognition 2016: 770–778.
- (54).Xu B; Wang N; Chen T; and Li M Empirical evaluation of rectified activations in convolutional network[J]. arXiv preprint arXiv:1505.00853, 2015.