. 2026 Jan 21;29(2):114770. doi: 10.1016/j.isci.2026.114770

Evaluating deep learning time series models for PM2.5 forecasting across diverse horizons

Ling Zeng 1,5, Runan Dong 1,2, Meng Yuan 1,2, Linhai Jing 3,∗∗, Shoutao Jiao 4
PMCID: PMC12907896  PMID: 41704768

Summary

Air pollution, particularly PM2.5, poses a major health challenge in urban areas such as Chengdu, China, where basin topography and intense emission sources exacerbate pollutant concentrations. This study evaluates four deep-learning time series algorithms—LSTM, CNN-LSTM, Transformer, and Transformer-LSTM—for PM2.5 forecasting, comparing a univariate configuration with four multivariate configurations incorporating auxiliary pollutants (CO, NO2, O3, SO2) and meteorological factors (temperature, pressure, precipitation, wind speed). Using two years of daily data (November 2022–October 2024), models are assessed across monthly, seasonal, half-year, and annual horizons with complete and incomplete seasonal datasets. Results demonstrate that Transformer-LSTM yields superior performance, with higher R2 and lower MAE% and RMSE%, especially when augmented by meteorological factors rather than auxiliary pollutants, and that complete seasonal training improves performance, while gaps exceeding three months between training and prediction periods reduce reliability due to evolving PM2.5 dynamics. These findings underscore the value of meteorological integration, data-driven modeling, seasonal data completeness, and timely prediction for pollution-control policymaking in Chengdu.

Subject areas: Atmospheric science, Atmospheric chemistry, Atmosphere modelling, Environmental science, Environmental health, Pollution, Machine learning

Graphical abstract


Highlights

  • Transformer-LSTM delivers the best PM2.5 prediction across horizons in Chengdu

  • Meteorology, especially temperature, outweighs auxiliary pollutants

  • Complete seasonal data enhances prediction performance

  • Three-month gaps between training and prediction reduce reliability



Introduction

Air pollution, particularly PM2.5 (particles ≤2.5 μm), poses a pressing global urban challenge due to its deep lung penetration and links to severe respiratory and cardiovascular diseases.1,2 The World Health Organization's interim target deems annual concentrations above 35 μg/m3 hazardous, raising growing public health concerns. Chengdu, a major city in southwestern China’s Sichuan Basin, faces severe PM2.5 pollution. The basin’s topography fosters low wind speeds, temperature inversions, and high humidity, trapping pollutants and worsening air quality.3,4 Dense population, industrial emissions, shallow mixing layers, and winter thermal inversions drive frequent pollution episodes.5,6,7 Nitrates, a key secondary aerosol, account for nearly half of Chengdu’s PM2.5.8,9

Recent predictive PM2.5 models have evolved from traditional statistical methods and machine learning to deep learning architectures.10,11,12,13,14,15,16,17 Traditional statistical models, such as regression models and autoregressive integrated moving average (ARIMA), provide interpretable results but fail to capture the complex temporal dependencies inherent in PM2.5 time-series data.18,19 Machine learning methods, such as random forests and support vector machines, boost accuracy but lack scalability and generalization across diverse datasets.20,21,22 Outperforming these, deep learning-based methods, such as convolutional neural networks (CNNs), long short-term memory (LSTM) networks, and Transformers, excel at nonlinear modeling and large datasets.23 CNNs excel at capturing PM2.5 spatial features but struggle with temporal dependencies and high computational cost.24 LSTMs suit sequential data and time-series forecasting, yet struggle with vanishing gradients over long sequences.25 Transformers utilize attention mechanisms to effectively handle long-range dependencies, but are computationally demanding.26 Hybrid models, such as CNN-LSTM and Transformer-LSTM, combine these strengths to enhance spatial-temporal modeling.27,28,29 Transformer-LSTM, though underexplored for PM2.5 prediction, integrates the Transformer’s self-attention with the LSTM’s sequential capabilities, improving short- and long-term trend capture.26,30 Its practical applications remain limited, warranting further research.
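The hybrid design described above—a Transformer encoder feeding an LSTM—can be sketched in PyTorch. The layer sizes, head count, and 7-day window below are illustrative assumptions for exposition, not the paper's reported hyperparameters.

```python
import torch
import torch.nn as nn

class TransformerLSTM(nn.Module):
    """Sketch: Transformer encoder (long-range attention) feeding an LSTM (sequential dynamics)."""

    def __init__(self, n_features, d_model=32, nhead=4, num_layers=2, hidden=32):
        super().__init__()
        self.proj = nn.Linear(n_features, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead,
                                           dim_feedforward=64, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.lstm = nn.LSTM(d_model, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                 # x: (batch, window, n_features)
        z = self.encoder(self.proj(x))    # self-attention over the input window
        out, _ = self.lstm(z)             # sequential refinement of attended features
        return self.head(out[:, -1, :])   # next-day PM2.5 from the last time step

model = TransformerLSTM(n_features=5)     # e.g., PM2.5 + four meteorological factors
y = model(torch.randn(8, 7, 5))           # eight hypothetical 7-day windows
```

The ordering (attention first, recurrence second) lets the LSTM smooth the attended representation over time; the reverse ordering is also used in the literature.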

In addition to predictive modeling techniques, researchers have also explored different input frameworks, including univariate (PM2.5-only) and multivariate (incorporating auxiliary meteorological factors) approaches. Numerous studies have shown that incorporating auxiliary meteorological variables—such as temperature, humidity, and wind speed—can improve prediction accuracy for PM2.5.31,32 However, the incorporation of co-pollutants such as CO, NO2, SO2, and O3 as auxiliary variables remains underexplored, despite their demonstrated correlations with PM2.5. Furthermore, few studies have systematically compared the performance of univariate (PM2.5-only) and multivariate models within the same deep learning architecture. This gap limits a comprehensive understanding of how auxiliary co-pollutants contribute to predictive accuracy and under what conditions they offer the greatest benefit.33

An underexplored aspect in PM2.5 prediction is how model performance varies across different forecasting horizons when using training datasets of varying completeness and duration. Many existing studies focus on predicting PM2.5 over specific time spans without systematically examining how the length and composition of training data affect predictions for diverse forecasting periods, such as one month, one season, half a year, or a full year. Short-term forecasts (e.g., monthly or seasonal predictions) are vital for real-time air quality monitoring and emergency response, while long-term forecasts (e.g., half-yearly or annual predictions) are critical for environmental policy planning and assessment.34 Assessing the sensitivity of predictive models to different forecasting horizons, training data completeness, and the temporal gap between training and prediction periods is essential to support applications ranging from immediate air quality management to long-term environmental strategies.

This study aims to address these gaps by: (1) evaluating the performance of Transformer-LSTM against other deep learning models (LSTM, CNN-LSTM, and Transformer), (2) systematically investigating the sensitivity of these models to varying forecasting horizons using training datasets of different completeness and duration, and (3) comparing univariate (PM2.5-only) with multivariate models within the same deep learning architectures, incorporating auxiliary pollutants (O3, NO2, SO2, and CO) to investigate their predictive effect on PM2.5, while also evaluating the predictive contributions of meteorological factors (temperature, pressure, precipitation, wind speed). Using real-world data from Chengdu, this study reveals the predictive role of auxiliary pollutants and meteorological variables, the efficacy of hybrid architectures, and the impact of forecasting horizon, training data completeness, and duration on PM2.5 prediction accuracy, aiding air quality prediction and policy.

Results

This study evaluates four deep-learning methods—LSTM, CNN-LSTM, Transformer, and Transformer-LSTM—for predicting PM2.5 trends in Chengdu. Each method is tested in univariate (PM2.5 only) and multivariate configurations, incorporating auxiliary variables in four multivariate configurations: (1) CO and NO2, (2) CO only, (3) O3 and SO2, and (4) meteorological factors (temperature, pressure, precipitation, wind speed), as detailed in Section “configurations of auxiliary variables.” This yields 20 models (four methods × five configurations) assessed across multiple forecasting horizons. Time-sensitivity analyses are divided into two categories: Category 1 uses complete yearly four-season data to predict the next year’s long-term (full-year), mid-term (half-year), seasonal, and monthly trends; Category 2 uses incomplete seasonal data to predict remaining months within the same year (see Section “setups of time-sensitivity analyses”).

Model performance is evaluated using MAE%, RMSE%, and R2 (see Section “model evaluation metrics”), which normalize errors relative to average PM2.5 values for consistent comparisons across models and scenarios.
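The normalized metrics can be computed directly; the sketch below assumes MAE% and RMSE% are the raw errors divided by the mean observed PM2.5, as stated above.

```python
import numpy as np

def pm25_metrics(obs, pred):
    """MAE% and RMSE% normalized by mean observed PM2.5; standard R^2."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    mae = np.mean(np.abs(obs - pred))
    rmse = np.sqrt(np.mean((obs - pred) ** 2))
    ss_res = np.sum((obs - pred) ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return {"MAE%": 100 * mae / obs.mean(),
            "RMSE%": 100 * rmse / obs.mean(),
            "R2": 1 - ss_res / ss_tot}

# Toy daily PM2.5 values (ug/m3), purely illustrative
m = pm25_metrics([40, 60, 80, 100], [42, 58, 85, 95])
```

Normalizing by the mean observation makes errors comparable across months with very different pollution levels; note that R2 can turn negative when predictions fit worse than the observed mean, as reported in some later cases.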

Configurations of auxiliary variables

To evaluate the contribution of auxiliary variables to PM2.5 forecasting, pollutants (CO, NO2, O3, and SO2) and meteorological factors (temperature, pressure, precipitation, and wind speed) were organized into four configurations. These configurations assess the predictive impact of each group and compare their performance against univariate (PM2.5-only) models, as detailed below.

Pollutant configuration 1: CO and NO2

This configuration includes CO and NO2, selected for their high correlations with PM2.5 (0.788 and 0.727, respectively). The objective is to determine whether these strongly correlated pollutants enhance PM2.5 prediction accuracy compared to univariate models and to evaluate their contribution to forecasting.

Pollutant configuration 2: CO only

CO and NO2 in pollutant configuration 1 exhibit strong intercorrelation (0.747), which may introduce covariance and reduce model stability. This configuration uses only CO, which has the highest correlation with PM2.5 (0.788), to isolate its predictive impact. The goal is to compare its performance against Configuration 1 to assess whether covariance between CO and NO2 affects prediction accuracy.

Pollutant configuration 3: O3 and SO2

This configuration includes O3 and SO2, with moderate correlations to PM2.5 (0.294 and 0.205, respectively). The aim is to evaluate whether these less correlated pollutants improve PM2.5 prediction over univariate models and to assess their contribution to forecasting accuracy.

Meteorological configuration: temperature, pressure, precipitation, and wind speed

This configuration combines four meteorological factors—temperature, pressure, precipitation, and wind speed—with correlations to PM2.5 ranging from 0.257 to 0.491. The objective is to evaluate their collective predictive impact on PM2.5 forecasting and compare their performance against pollutant configurations and univariate models.
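The five input configurations above can be expressed as column sets over a daily feature table; the column names here are hypothetical labels for the Chengdu series, not the paper's actual variable names.

```python
import pandas as pd

# Hypothetical column names for the daily Chengdu series.
CONFIGS = {
    "univariate": [],                                   # PM2.5 only
    "co_no2":     ["CO", "NO2"],                        # pollutant configuration 1
    "co_only":    ["CO"],                               # pollutant configuration 2
    "o3_so2":     ["O3", "SO2"],                        # pollutant configuration 3
    "meteo":      ["temperature", "pressure",
                   "precipitation", "wind_speed"],      # meteorological configuration
}

def build_inputs(df, config):
    """Model inputs: the PM2.5 target plus the configuration's auxiliary columns."""
    return df[["PM2.5"] + CONFIGS[config]]

cols = ["PM2.5", "CO", "NO2", "O3", "SO2",
        "temperature", "pressure", "precipitation", "wind_speed"]
df = pd.DataFrame({c: [1.0] for c in cols})   # one-row placeholder frame
```

Keeping the configurations as data rather than code makes the 4 methods × 5 configurations grid a simple double loop.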

Setups of time-sensitivity analyses

Time-sensitivity analyses were structured into two main categories: predictions using complete yearly four-season data and predictions using incomplete seasonal data. Each category was further divided based on forecasting horizons, as outlined below.

Category 1: Predictions using complete yearly four-season data

This category evaluates model performance using a full year of four-season data (November 2022 to October 2023) for training, with predictions tested on the following year (November 2023 to October 2024) across various horizons. This setup captures complete seasonal patterns, enabling robust accuracy analysis over diverse time frames (Table 1).

Table 1.

Subcategories and cases for Category 1

Subcategory Case Description Time Period
Category 1-1:
Long-term predictions
Full year forecast Forecast for an entire year November 2023 – October 2024
Category 1–2:
Mid-term half-year predictions
Case 1 Predict winter and spring November 2023 – April 2024
Case 2 Predict summer and autumn May 2024 – October 2024
Category 1–3:
Short-to-medium-term seasonal predictions
Case 1 Predict winter November 2023 – January 2024
Case 2 Predict spring February 2024 – April 2024
Case 3 Predict summer May 2024 – July 2024
Case 4 Predict autumn August 2024 – October 2024
Category 1–4:
Short-term monthly predictions
Case 1 Predict November 2023 November 2023
Case 2 Predict December 2023 December 2023
Case 3 Predict January 2024 January 2024
Case 4 Predict February 2024 February 2024
Case 5 Predict March 2024 March 2024
Case 6 Predict April 2024 April 2024
Case 7 Predict May 2024 May 2024
Case 8 Predict June 2024 June 2024
Case 9 Predict July 2024 July 2024
Case 10 Predict August 2024 August 2024
Case 11 Predict September 2024 September 2024
Case 12 Predict October 2024 October 2024
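The Category 1 splits in Table 1 are pure date-range selections and can be sketched with pandas label slicing on a daily index; only a few of the cases are listed here for brevity, and the dictionary keys are hypothetical names.

```python
import pandas as pd

# Category 1: train on the complete 2022-2023 cycle, predict a horizon
# in the following year. A subset of the Table 1 cases:
TRAIN = ("2022-11-01", "2023-10-31")
HORIZONS = {
    "full_year":  ("2023-11-01", "2024-10-31"),   # Category 1-1
    "first_half": ("2023-11-01", "2024-04-30"),   # Category 1-2, Case 1
    "winter":     ("2023-11-01", "2024-01-31"),   # Category 1-3, Case 1
    "nov_2023":   ("2023-11-01", "2023-11-30"),   # Category 1-4, Case 1
}

def split(df, horizon):
    """df has a daily DatetimeIndex; .loc label slicing is inclusive on both ends."""
    start, end = HORIZONS[horizon]
    return df.loc[TRAIN[0]:TRAIN[1]], df.loc[start:end]

idx = pd.date_range("2022-11-01", "2024-10-31", freq="D")
df = pd.DataFrame({"PM2.5": range(len(idx))}, index=idx)   # placeholder values
train, test = split(df, "winter")   # 365 training days, 92 winter days
```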

Category 2: Predictions using incomplete yearly season data

This category tests models trained on partial data from the November 2023 to October 2024 cycle to predict the remaining months within the same cycle, maintaining a full-year span. It assesses forecasting efficacy with limited seasonal input, focusing on later-year trends (Table 2).

Table 2.

Subcategories and cases for Category 2 PM2.5 forecasting

Subcategory Case Description Training Period Prediction Period
Category 2-1:
Seasonal gap predictions
Case 1 Train in winter and spring to predict summer and autumn. November 2023 to April 2024 May 2024 to October 2024
Case 2 Train on winter, spring, and summer to predict autumn. November 2023 to July 2024 August 2024 to October 2024
Category 2-2:
Short-term predictions with missing months
Case 1 Train on the first 10 months to predict the next 2 months. November 2023 to August 2024 September 2024 to October 2024
Case 2 Train on the first 11 months to predict the final month. November 2023 to September 2024 October 2024

Results of models

This study evaluates 460 models derived from four deep-learning methods (LSTM, Transformer, CNN-LSTM, Transformer-LSTM), five prediction configurations (one univariate and four multivariate configurations), and 23 forecasting horizons from time-sensitivity analyses. Predictive performance is summarized through trends (Figures S1–S23) and metrics (Tables S1–S6) in the supplemental information, using MAE%, RMSE%, and R2.

Prediction trends

Figures S1–S23 align with the forecasting horizons outlined in Section “setups of time-sensitivity analyses,” each with four subplots: (a) LSTM, (b) Transformer, (c) CNN-LSTM, and (d) Transformer-LSTM. Subplots display observed PM2.5 (black solid line), alongside predictions, distinguished by line styles and colors: orange solid for univariate, blue dashed for the CO + NO2 multivariate configuration, purple dashed for the CO-only configuration, green dashed for the SO2+O3 configuration, and red dashed for the four meteorological factor configuration (temperature, pressure, precipitation, wind speed). Figures S1–S19 (Category 1) are based on complete yearly four-season data, while Figures S20–S23 (Category 2) utilize incomplete seasonal data.

  • Figure S1: Long-term forecast (next year).

  • Figures S2 and S3: Mid-term forecasts (first and second half-years).

  • Figures S4–S7: Short-to-medium-term seasonal forecasts (Winter, Spring, Summer, Autumn).

  • Figures S8–S19: Short-term monthly forecasts (Nov 2023–Oct 2024).

  • Figures S20 and S21: Seasonal gap predictions (initial seasons to rest).

  • Figures S22 and S23: Short-term forecasts with missing months (final 1–2 months).

For Category 1 (S1 to S19), Figures S1, S2, and S4 show the best fit to observed PM2.5 trends, followed by S8 to S10 with strong alignment, particularly when meteorological factors are included. Other figures struggle with significant fluctuations. In Category 2 (S20 to S23), accuracy declines overall, but Figures S21 and S22, particularly with the Transformer-LSTM model incorporating meteorological data, better capture trends. Models trained on complete yearly data, enhanced by meteorological insights, consistently surpass those relying on incomplete seasonal data in predictive accuracy.

Performance metrics

The full performance metrics across all cases are listed in Tables S1–S6. Tables S1–S4 cover Category 1 (complete yearly four-season data), while Tables S5 and S6 address Category 2 (incomplete yearly data).

In Category 1: Table S1 (long-term prediction) exhibits the highest overall performance. The multivariate Transformer-LSTM model, incorporating four meteorological factors, achieves the greatest accuracy, as evidenced by the highest R2 values and the lowest error metrics. Closely following, Case 1 in Table S2 (mid-term prediction for the first half-year) demonstrates robust performance, with the multivariate Transformer-LSTM model leveraging meteorological factors performing particularly well. For seasonal predictions in Table S3, Case 1 (winter) yields the best results, with the multivariate Transformer-LSTM model utilizing meteorological factors achieving superior performance, likely attributable to stronger temporal correlations with prior data. In Table S4 (monthly predictions), performance declines across Cases 3 (January), 2 (December), and 1 (November). The multivariate Transformer-LSTM model with meteorological factors leads in Cases 3 and 2, but is less effective compared to the winter season results in Table S3.

In Category 2: Tables S5 and S6 exhibit diminished performance relative to Category 1, marked by lower R2 values and higher RMSE and MAE metrics. In Table S5, Case 2 demonstrates relatively stable performance compared to the other cases within the table, while in Table S6, Case 1 shows similar relative stability. Moreover, the multivariate Transformer-LSTM incorporating meteorological factors outperforms other models in both tables, yielding slightly higher R2 and reduced error metrics compared to its univariate counterpart or models with alternative auxiliary variables. Although incomplete data constrain overall accuracy, the inclusion of meteorological factors enhances model resilience.

Discussion

Sensitivity for the time gap between training and the prediction period

The analysis of sensitivity to the time gap between training and prediction periods is relevant only when using complete training data (Category 1, Tables S1–S4), as incomplete data predictions (Category 2, Tables S5 and S6) do not consider this gap. Analysis of Tables S1–S4 reveals the following.

  • Predictions immediately following the training period exhibited relatively high and stable R2 values. This includes long-term predictions (Table S1 Case 1, immediately adjacent to the training period), mid-term predictions for the first half-year (Table S2, Case 1, immediately adjacent to the training period), and short-to-medium-term predictions for the first quarter (Table S3, Case 1, immediately adjacent to the training period), and short-term monthly predictions for the first three months (Table S4, Cases 1–3, immediately adjacent to the training period).

  • As the time gap between the training and prediction periods increased, model fit metrics, such as R2, declined significantly. For example, mid-term predictions for the second half-year (Table S2, Case 2) showed a marked decrease in R2. In some cases, R2 values even turned negative, as observed in short-to-medium-term predictions for the second, third, and fourth quarters (Table S3 Cases 2–4) and in short-term monthly predictions for months beyond the first three (Table S4 Cases 4–12). Notably, this decline becomes particularly pronounced when the temporal gap exceeds three months (one-quarter), indicating a critical threshold for maintaining predictive accuracy.

The decline in performance indicated above, associated with increasing time gaps between training and prediction periods, may be attributed to the difficulty in capturing complete temporal patterns, as training data becomes less representative of evolving PM2.5 dynamics, such as seasonal shifts or new pollution sources, which are inadequately reflected beyond a three-month interval.

Impact of completeness of training data

Category 1 (complete four-season training data) outperforms Category 2 (incomplete yearly data), as evidenced by Tables S1–S4 vs. Tables S5 and S6 and Figures S1–S19 vs. Figures S20–S23. Category 1 excels, especially for predictions immediately following training. This indicates that complete seasonal data enhances forecast accuracy, while incomplete data in Category 2 fails to capture seasonal patterns, lowering performance.

Impact of training proportion in incomplete seasonal data

Tables S5 and S6, based on incomplete seasonal data from November 2023 to October 2024, reveal that a higher training data proportion generally boosts performance. Table S6, Case 1 (10 months training, the next 2 months prediction) outperforms Table S5, Case 2 (9 months training, 3 months prediction), followed by Table S5, Case 1 (6 months training, 6 months prediction). However, this trend breaks when predicting just one month: Table S6, Case 2 (11 months training, the next one-month prediction) shows reduced R2, suggesting excessive training data relative to the short one-month prediction horizon can lead to overfitting and yield diminishing returns.

Challenges in single-month predictions immediately after the training period

Predicting a single month right after the training period proves challenging, as seen in Table S6, Case 2 (11 months of incomplete seasonal data) and Table S4, Case 1 (complete yearly seasonal data). Section “impact of training proportion in incomplete seasonal data” notes that increasing training data proportion typically enhances prediction accuracy. However, this trend reverses in Table S6, Case 2, where 11 months of training data for predicting the final month (Oct 2024) yields poorer performance. Similarly, Table S4, Case 1 (predicting Nov 2023) underperforms compared to Cases 2 and 3 (predicting Dec 2023 and Jan 2024, with a one-month gap).

This reduced accuracy may stem from:

  • ① Limited time window: Predicting immediately after training leaves the model with a narrow time frame, hindering its ability to capture subtle PM2.5 fluctuations just beyond the data.

  • ② Insufficient time to understand trends: A small gap (e.g., one to two months, as in Table S4, Cases 2–3) allows the model to better identify longer-term trends, enhancing accuracy.

Comparative performance of univariate, pollutant-based, and meteorology-based prediction

We evaluated 92 groups of univariate, pollutant-based, and meteorology-based models across 23 forecasting horizons using four deep-learning algorithms. Results indicate that only the meteorological factors configuration (temperature, pressure, precipitation, wind speed) consistently enhanced prediction accuracy over univariate models, yielding higher R2 and lower MAE% and RMSE%. In contrast, the three pollutant-based configurations (CO + NO2, CO-only, O3+SO2) showed no significant accuracy improvements over univariate models, though rare exceptions occurred with specific algorithms.

Specifically, in the first four subsections of the “discussion,” we evaluated stable cases across two categories: Category 1 includes long-term one-year predictions (Table S1, Case 1), mid-term first half-year predictions (Table S2, Case 1), short-to-medium-term first-quarter predictions (Table S3, Case 1), and short-term monthly predictions for the first three months (Table S4, Cases 1–3); Category 2 covers predictions using 9 months of training for 3 months (Table S5, Case 2) and 10 months for 2 months (Table S6, Case 1), which demonstrated relative stability but lower accuracy compared to Category 1.

Analysis of these stable cases indicates that CO-only predictions consistently showed slightly lower accuracy than CO + NO2, except in short-term monthly predictions (Table S4, Cases 1–3), where their performance was comparable with no consistent advantage for either. The O3+SO2 configuration yielded inconsistent results across all cases, showing no reliable pattern of improvement.

Additionally, across these stable cases, the four deep-learning architectures demonstrated consistent and significant improvements when incorporating meteorological features relative to their univariate counterparts.

Specifically, for LSTM, meteorological factors yielded modest but consistent gains, with ΔR2 ranging from 0.08 to 0.12 in Category 1 long-term (Table S1, Case 1) and mid-term first half-year (Table S2, Case 1) forecasts (e.g., RMSE% reduction of 12–18%), reflecting its foundational sequential modeling but limited capacity to leverage causal dispersion effects such as wind speed and precipitation without advanced feature extraction.

CNN-LSTM demonstrated the most pronounced relative improvements among the architectures other than Transformer-LSTM, particularly in Category 2 incomplete data scenarios (e.g., ΔR2 up to 0.15 and 20–25% RMSE% reductions in 9-month training cases). This suggests that the convolutional layers effectively capture local meteorological patterns (e.g., temperature inversions), amplifying the hybrid’s sensitivity to external features and occasionally surpassing Transformer-LSTM’s absolute gains in short-term predictions.

The Transformer architecture benefited moderately from meteorological inputs (ΔR2 of 0.10–0.14; MAE% reductions of 15–22%), with stronger performance in Category 1 seasonal forecasts due to its attention mechanism prioritizing long-range dependencies influenced by pressure and temperature. However, these gains were less resilient in Category 2 compared to hybrids, highlighting potential attention dilution with incomplete seasonal data.

In contrast to these architectures, the Transformer-LSTM hybrid maintained the highest absolute improvements (ΔR2 > 0.15; error reductions >25% across metrics), as previously noted, but the relative uplifts in CNN-LSTM underscore the value of tailored feature integration for specific data completeness levels. Overall, while pollutant-based configurations offered negligible benefits, meteorological factors universally enhanced model efficacy, with the magnitude varying by architecture: hybrids such as CNN-LSTM and Transformer-LSTM showed the greatest potential for scalable air quality forecasting.

These findings align with El Mghouchi et al. (2024), who similarly observed that pollutant-based auxiliary variables provide limited predictive value, whereas meteorological factors substantially improve PM2.5 forecasting accuracy.35,36 This disparity may arise from the following factors: ① Pollutant-based variables such as CO and NO2 exhibit high correlations with PM2.5 (exceeding 0.7), likely reflecting shared sources rather than direct causality, while the correlations of O3 and SO2 with PM2.5 (ranging from 0.2 to 0.3) are lower, suggesting weaker source commonality. ② Table 3 shows that CO and NO2 are highly correlated, suggesting multicollinearity, but CO + NO2 slightly outperforms CO-only in most cases, indicating NO2 provides some unique information. The deep-learning models’ robustness likely mitigates the impact of multicollinearity, maintaining model stability and accuracy. ③ Meteorological factors (temperature, pressure, precipitation, wind speed), despite moderate correlations (0.2–0.4), exert a causal influence on PM2.5 concentrations, actively contributing to their dilution or exacerbation through mechanisms such as dispersion or atmospheric stability.

Table 3.

Distance correlation matrix

CO NO2 O3 SO2 PM2.5 T P RH Prec WS
CO 1.000 0.747 0.251 0.275 0.788 0.382 0.273 0.154 0.229 0.393
NO2 0.747 1.000 0.283 0.242 0.727 0.455 0.404 0.100 0.350 0.471
O3 0.251 0.283 1.000 0.446 0.294 0.734 0.638 0.544 0.096 0.206
SO2 0.275 0.242 0.446 1.000 0.205 0.277 0.197 0.339 0.114 0.086
PM2.5 0.788 0.727 0.294 0.205 1.000 0.491 0.331 0.127 0.257 0.340
T 0.382 0.455 0.734 0.277 0.491 1.000 0.811 0.195 0.256 0.290
P 0.273 0.404 0.638 0.197 0.331 0.811 1.000 0.148 0.290 0.323
RH 0.154 0.100 0.544 0.339 0.127 0.195 0.148 1.000 0.305 0.154
Prec 0.229 0.350 0.096 0.114 0.257 0.256 0.290 0.305 1.000 0.360
WS 0.393 0.471 0.206 0.086 0.340 0.290 0.323 0.154 0.360 1.000

T: temperature; P: pressure; RH: relative humidity; Prec: precipitation; WS: wind speed.
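The entries in Table 3 are distance correlations, which capture nonlinear as well as linear dependence. A minimal sketch of Szekely's estimator (the naive V-statistic form; published values may come from a different implementation, such as the `dcor` package) is:

```python
import numpy as np

def distance_correlation(x, y):
    """Naive V-statistic estimator of Szekely's distance correlation for 1-D samples."""
    x, y = np.asarray(x, float), np.asarray(y, float)

    def centered(v):
        d = np.abs(v[:, None] - v[None, :])                    # pairwise distances
        return d - d.mean(0) - d.mean(1)[:, None] + d.mean()   # double-centering

    A, B = centered(x), centered(y)
    dcov2 = (A * B).mean()                                     # squared distance covariance
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    return 0.0 if denom == 0 else float(np.sqrt(max(dcov2, 0.0) / denom))

x = np.arange(10.0)
r = distance_correlation(x, 2 * x)   # 1.0 for an exact linear relation
```

Unlike Pearson correlation, distance correlation is always in [0, 1] and is zero only under independence, which is why Table 3 contains no negative entries.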

Performance analysis of deep-learning algorithms

This analysis evaluates the performance of deep-learning algorithms (LSTM, CNN-LSTM, Transformer, and Transformer-LSTM) for PM2.5 forecasting using complete (Category 1) and incomplete (Category 2) training data. For Category 1, because cases with a training-to-prediction gap exceeding three months were shown to have limited utility in Section “sensitivity for the time gap between training and the prediction period,” the analysis focuses on predictions immediately following the training period: long-term (Table S1), mid-term first six months (Table S2, Case 1), short-to-medium-term first quarter (Table S3, Case 1), and short-term first three months (Table S4, Cases 1–3). For Category 2, stable cases include Table S6, Case 1 (10 months training, 2 months prediction) and Table S5, Case 2 (9 months training, 3 months prediction), excluding anomalous cases (Table S6, Case 2; Table S5, Case 1). Performance is assessed using R2, MAE%, and RMSE%.

Transformer-LSTM, in both univariate and multivariate configurations, emerged as the most reliable algorithm across the eight cases spanning both categories, consistently demonstrating superior robustness and accuracy across all forecasting horizons. In the six cases of Category 1, it achieved higher prediction accuracy, while in the two cases of Category 2, it maintained reliability despite challenging conditions. In contrast, Transformer and CNN-LSTM followed with variable performance, struggling in certain instances, whereas LSTM consistently lagged, often producing negative outputs. This highlights Transformer-LSTM’s resilience, particularly in scenarios where other models, notably LSTM, underperformed.

Insights into attention weights of Transformer-LSTM

In the multivariate configurations, we analyzed the Transformer-LSTM models, identified as the optimal deep learning algorithm among the four, by examining the attention weights allocated to different auxiliary variables, as derived from attention weight data extracted during the training process of our models.

In the meteorological combination, the attention weights revealed the following order of importance: temperature with the highest weight of approximately 0.2654, followed by pressure at 0.247, wind speed at 0.2055, and precipitation with the lowest weight of 0.0853. This ranking suggests that temperature plays the most dominant role in capturing the dynamic seasonal and diurnal influences on PM2.5, while pressure and wind speed contribute moderately to atmospheric stability and dispersion effects, and precipitation has the least influence, likely due to its episodic nature.

In the O3 and SO2 combination, the attention weights indicated that SO2 received the highest weight of 0.42176, followed by O3 at 0.31576, implying that SO2 has a stronger association with PM2.5 dynamics, potentially reflecting its role in local pollution sources, whereas O3’s contribution is notable but secondary, possibly linked to secondary aerosol formation. These attention weight distributions highlight the varying significance of input features across different configurations, with temperature and SO2 emerging as key drivers based on their respective weight rankings.

For the CO and NO2 combination, CO received the highest estimated weight of 0.4724, followed by NO2 with an estimated weight of 0.2546. This prioritization of CO over NO2 is consistent with its stronger statistical association with PM2.5, while NO2’s secondary weight likely reflects its direct involvement in nitrate formation from traffic emissions. Additionally, to analyze attention weights across training time steps, we take one year of training data with a 7-day time window as an example, yielding 359 time sequences. Among these, the weights for approximately the first 60 sequences exhibited significant fluctuations, whereas the weights for the subsequent roughly 300 sequences remained relatively uniform, with minimal fluctuations and overall higher values. The initial instability in attention weights, likely due to model warm-up, seasonal transitions, and limited historical context, underscores the importance of a stabilization period to adapt to evolving PM2.5 patterns. The subsequent stability and higher weights, reflecting convergence and recognition of seasonal cycles, highlight the model’s reliance on complete temporal data for robust performance.
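The windowing arithmetic (one year of daily data, 7-day window, 359 sequences) and the post-warm-up averaging of per-feature attention weights can be sketched as follows; the 60-sequence warm-up cutoff comes from the observation above, and the attention array values are hypothetical.

```python
import numpy as np

def sliding_windows(series, window=7):
    """Overlapping input windows: 365 daily values -> 365 - 7 + 1 = 359 sequences."""
    return np.stack([series[i:i + window]
                     for i in range(len(series) - window + 1)])

def feature_importance(attn, warmup=60):
    """Average per-feature attention weights, skipping the unstable warm-up sequences."""
    return attn[warmup:].mean(axis=0)

X = sliding_windows(np.arange(365.0))   # shape (359, 7)
```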

Multi-sites analysis and verification

To improve the generalizability of our findings from the previous six subsections, we performed an extended validation of the optimal Transformer-LSTM model incorporating meteorological factors. We trained the model using complete four-season data to predict air quality for the following periods: one year (November 2023–October 2024), the first half-year (November 2023–April 2024), and the first quarter (November 2023–January 2024). This validation was conducted in two heavily polluted Chinese cities, Urumqi and Hangzhou, which frequently rank among the top 10% of Chinese cities for PM2.5 pollution. The results show stable model performance across both cities, with R2 values ranging from 0.539 to 0.627, MAE% from 28.8% to 36%, and RMSE% from 46.8% to 56.3% (see Table S7).

This multi-site validation highlights the meteorological-based Transformer-LSTM model’s potential for air quality management in heavily polluted cities, with stable yearly, half-yearly, and quarterly forecasts ensuring timely predictions.

Key findings

The key findings are as follows.

  • (1) Transformer-LSTM Outperforms: Transformer-LSTM surpassed LSTM, CNN-LSTM, and Transformer, with higher R2 and lower MAE% and RMSE% in univariate and multivariate setups, excelling in monthly, seasonal, and annual PM2.5 forecasts due to its self-attention and sequential modeling.

  • (2) Meteorological Factors Boost Accuracy: Meteorological factors (temperature, pressure, precipitation, wind speed) significantly enhanced prediction accuracy over univariate models, reflecting their causal role in PM2.5 dynamics.

  • (3) Pollutant Configurations Underperform: Pollutant-based configurations (CO+NO2, CO-only, O3+SO2) showed minimal predictive improvement, indicating shared emission sources rather than direct causality.

  • (4) Complete Data Enhances Performance: Models trained on complete four-season data (Category 1) outperformed those with incomplete seasonal data (Category 2), highlighting the need for full seasonal patterns to ensure robust forecasting.

  • (5) Temporal Gap Sensitivity: Prediction accuracy declined with temporal gaps exceeding three months between training and prediction periods, as training data became less representative of evolving PM2.5 dynamics, such as seasonal shifts or new pollution sources.

  • (6) Training Proportion Impact: For incomplete datasets, a higher training data proportion generally improved performance, but single-month forecasts exhibited diminishing returns, likely due to limited temporal windows for capturing trends.

Policy implications

The Transformer-LSTM model’s consistent stability across different forecasting horizons underscores its reliability as a tool for policymakers in cities like Chengdu to effectively predict and manage air pollution. There is a need for policies that incorporate real-time meteorological data into air quality monitoring systems, enabling more accurate and timely responses. Moreover, training models with complete seasonal datasets and avoiding temporal gaps longer than three months is crucial to ensure forecasting accuracy. Policymakers should prioritize the use of comprehensive and up-to-date datasets in air quality prediction systems to better capture the evolving dynamics of PM2.5 concentrations. Such data-driven approaches can support evidence-based strategies, including targeted pollution control measures and urban planning initiatives aimed at reducing PM2.5 levels.

Limitations of this study

The study’s reliance on Chengdu data limits generalizability, as PM2.5 dynamics vary across regions with different topographies, emission profiles, and climates, such as coastal cities with strong sea breezes, arid regions with dust contributions, or rural areas with biomass burning. The dataset, spanning only two years (November 2022–October 2024), restricts the model’s ability to capture long-term PM2.5 trends influenced by decadal climatic shifts or policy changes.

Future research should: (1) validate the Transformer-LSTM model across diverse regions, including provinces, cities, and countries with varied environmental conditions (e.g., coastal Shanghai, arid Lanzhou, or tropical Hainan), to ensure robust generalizability and (2) incorporate longer datasets, spanning a decade or more, to enhance the model’s ability to predict long-term PM2.5 trends and account for multi-year variations in emissions and climate.

Resource availability

Lead contact

Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Ling Zeng (zengling18@cdut.edu.cn).

Materials availability

This study did not generate new unique materials; material availability is therefore not applicable.

Data and code availability

Acknowledgments

This study was supported by the National Science and Technology Major Project for Deep Earth (No. 2025ZD1008103) and the Deep Earth Probe and Mineral Resources Exploration - National Science and Technology Major Project (No. 2024ZD1001200). We also thank Bin Hu for his support in revising and validating this article.

Author contributions

Ling Zeng: writing – original draft, writing – review and editing, validation, methodology, and conceptualization. Runan Dong: resources, data curation, formal analysis, and visualization. Meng Yuan: visualization, formal analysis, and investigation. Linhai Jing: formal analysis, validation, fund acquisition, investigation, and supervision. Shoutao Jiao: investigation.

Declaration of interests

The authors declare no competing interests.

STAR★Methods

Key resources table

| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
| --- | --- | --- |
| Deposited data | | |
| Data is available online | Environmental Meteorological Data Service Platform | http://eia-data.com/ or https://doi.org/10.5281/zenodo.18229490 |
| Software and algorithms | | |
| MATLAB R2024a | MathWorks | https://www.mathworks.com/products/new_products/release2024a.html |
| Deep Learning Toolbox (for LSTM, CNN-LSTM, Transformer, and Transformer-LSTM models) | MathWorks | https://www.mathworks.com/products/deep-learning.html |

Experimental model and study participant details

This study does not involve experimental models or study participants typical in the life sciences.

Method details

Data collection

The study area is located in Chengdu, Sichuan Province (Figure 1), encompassing five air quality monitoring stations and two meteorological monitoring stations that provide air quality and meteorological data, respectively. The five air quality monitoring stations are Shilidian (1432A), Shahepu (1434A), Renmin Park (1437A), Dashi West (2880A), and Sanwayao (1433A); datasets for NO2, CO, SO2, O3, and PM2.5 were collected from these stations from November 1, 2022 to October 31, 2024. The two meteorological stations, Wenjiang and Shuangliu, provided datasets of temperature (T), pressure (P), relative humidity (RH), precipitation (Prec), and wind speed (WS) over the same period. Average values across the five air quality stations and across the two meteorological stations were used to represent the overall air quality and meteorological conditions of the study area, respectively.

Figure 1. The location of the study area

The top left shows a map of China, scale bars: 1 cm = 1700 km. The bottom left shows a map of Chengdu, scale bars: 1 cm = 60 km, and the right side shows a map of Chengdu’s main urban area, scale bars: 1 cm = 1850 m.

Data preprocessing and exploratory analyses

Data cleaning

Preprocessing mainly consisted of outlier removal and data standardization. Outliers were screened using box-and-whisker plots, and no actual outliers were detected.37,38 Next, the datasets were standardized to eliminate the impact of differing dimensions, using the min-max scaling method:

X_{i,\mathrm{standardized}} = \frac{X_i - X_{\min}}{X_{\max} - X_{\min}} (Equation 1)

The entire preprocessing process was implemented in MATLAB. Additionally, there were no missing data in the datasets, so no missing-value treatment was required.
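As a minimal sketch of the min-max scaling in Equation 1 (illustrative Python; the actual preprocessing was done in MATLAB, and the function name is hypothetical):

```python
def min_max_scale(values):
    """Min-max scaling (Equation 1): map each value into [0, 1]."""
    x_min, x_max = min(values), max(values)
    span = x_max - x_min
    if span == 0:
        raise ValueError("a constant series cannot be scaled")
    return [(x - x_min) / span for x in values]

# Example with illustrative PM2.5 values
scaled = min_max_scale([20.0, 45.0, 70.0])
# -> [0.0, 0.5, 1.0]
```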

Distance correlation analysis

Distance correlation (dCor), introduced by Székely et al.,39 is a robust statistical method for measuring both linear and nonlinear dependencies between two random variables X and Y. Unlike Pearson’s correlation, which captures only linear relationships, distance correlation leverages Euclidean distance matrices to detect any form of dependence, with a value of zero indicating independence (assuming finite first moments). The pairwise distance matrices are first double-centered:

A_{ij} = \|x_i - x_j\| - \bar{a}_{i\cdot} - \bar{a}_{\cdot j} + \bar{a}_{\cdot\cdot}

B_{ij} = \|y_i - y_j\| - \bar{b}_{i\cdot} - \bar{b}_{\cdot j} + \bar{b}_{\cdot\cdot}

where \bar{a}_{i\cdot}, \bar{a}_{\cdot j}, and \bar{a}_{\cdot\cdot} denote the row, column, and grand means of the distance matrix (and likewise for b). The squared distance covariance is the mean product of the centered matrices:

dCov^2(X,Y) = \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} A_{ij} B_{ij}

Normalizing then yields the squared distance correlation coefficient:

R^2(X,Y) = \frac{dCov^2(X,Y)}{\sqrt{dCov^2(X,X)\, dCov^2(Y,Y)}} (Equation 2)

Based on the distance correlation analysis, relative humidity (RH) was excluded as a potential auxiliary variable for PM2.5 prediction due to its low correlation with PM2.5 (0.127), falling even below 0.2, indicating minimal predictive value. The remaining eight variables—CO (0.788), NO2 (0.727), T (0.491), WS (0.340), P (0.331), O3 (0.294), Prec (0.257), and SO2 (0.205)—were retained as potential predictors due to their stronger correlations with PM2.5. However, significant intercorrelations were observed, notably CO-NO2 (0.747) and T-P (0.811), alongside O3-T (0.734) and O3-P (0.638), with a moderate correlation between NO2-T (0.455), suggesting shared emission sources or meteorological influences that may introduce redundancy. This distance correlation analysis serves as a preliminary screening to identify promising predictors, with subsequent analysis planned to rigorously confirm each variable’s predictive significance for accurate PM2.5 prediction.
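For illustration, the distance correlation defined above can be computed for 1-D series with a short pure-Python sketch (the study's computation was done in MATLAB; the function names here are hypothetical):

```python
def dist_corr(x, y):
    """Sample distance correlation of two 1-D series (Szekely et al., 2007)."""
    n = len(x)

    def centered(v):
        # Pairwise distances, double-centered by row, column, and grand means
        d = [[abs(v[i] - v[j]) for j in range(n)] for i in range(n)]
        row = [sum(r) / n for r in d]
        col = [sum(d[i][j] for i in range(n)) / n for j in range(n)]
        grand = sum(row) / n
        return [[d[i][j] - row[i] - col[j] + grand for j in range(n)]
                for i in range(n)]

    def dcov2(M, N):
        # Squared distance covariance: mean elementwise product (Equation 2 numerator)
        return sum(M[i][j] * N[i][j] for i in range(n) for j in range(n)) / n ** 2

    A, B = centered(x), centered(y)
    vxy, vxx, vyy = dcov2(A, B), dcov2(A, A), dcov2(B, B)
    return 0.0 if vxx * vyy == 0 else (vxy / (vxx * vyy) ** 0.5) ** 0.5

# A perfectly linear pair has distance correlation 1
r = dist_corr([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 6.0, 8.0, 10.0])
```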

Descriptive statistics

We statistically analyzed five pollutant concentrations (CO, NO2, O3, SO2, PM2.5) and four meteorological factors (temperature [T], pressure [P], precipitation [Prec], and wind speed [WS]) across two study periods in Chengdu: November 1, 2022–October 31, 2023, and November 1, 2023–October 31, 2024. Mean values represent average concentrations, variances quantify data spread, and the coefficient of variation (CV, standard deviation/mean) measures relative variability, enabling comparisons across pollutants and meteorological variables.
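These descriptive statistics can be sketched in a few lines (illustrative Python; the study's analyses were run in MATLAB, and whether sample or population variance was used is not stated, so the population form is assumed here):

```python
def describe(series):
    """Mean, population variance, and coefficient of variation (std/mean)."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series) / n  # population variance (divides by n)
    cv = var ** 0.5 / mean                          # CV = standard deviation / mean
    return mean, var, cv

mean, var, cv = describe([2.0, 4.0, 6.0])
```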

The table below summarizes the results. Mean concentrations of NO2 and PM2.5 decreased, while CO, O3, and SO2 increased from the first to the second period. Variances increased for CO and PM2.5, indicating greater fluctuations, but decreased for NO2, O3, and SO2, suggesting more stable concentrations. CVs rose for CO, NO2, and PM2.5, reflecting higher relative variability, while CVs for O3 and SO2 fell, indicating greater consistency. Meteorological factors showed minimal changes with limited impact on air quality trends.

Statistical analyses of pollutant concentrations and meteorological factors in Chengdu

Period 1: 2022.11.01–2023.10.31; Period 2: 2023.11.01–2024.10.31.

| Variable | Mean, Period 1 | Mean, Period 2 | Variance, Period 1 | Variance, Period 2 | CV, Period 1 | CV, Period 2 |
| --- | --- | --- | --- | --- | --- | --- |
| CO | 0.614027 | 0.633115 | 0.026585 | 0.038631 | 0.265541 | 0.310445 |
| NO2 | 30.85479 | 28.33005 | 174.7649 | 163.2194 | 0.428454 | 0.450961 |
| PM2.5 | 39.44822 | 36.54536 | 708.0643 | 836.58 | 0.674542 | 0.791446 |
| O3 | 95.74904 | 102.1415 | 2599.497 | 2486.815 | 0.532489 | 0.488224 |
| SO2 | 3.116712 | 3.380328 | 1.115242 | 0.915557 | 0.338835 | 0.283064 |
| T | 18.1417 | 18.53802 | 56.23795 | 65.0431 | 0.413368 | 0.435048 |
| P | 950.9589 | 950.6199 | 56.89198 | 62.96936 | 0.007932 | 0.008348 |
| Prec | 1.472852 | 1.46744 | 23.34876 | 19.3945 | 3.280747 | 3.00109 |
| WS | 1.708518 | 1.703272 | 0.346479 | 0.332324 | 0.344523 | 0.338452 |
Descriptive seasonal trends

The trends of five pollutants and four meteorological factors are described in the following Figure 2, covering November 2022 to October 2024 in Chengdu. PM2.5, CO, and NO2 exhibit clear seasonal trends, with concentrations peaking in winter and decreasing in summer, likely due to heating activities, temperature inversions, and regional pollution events. Conversely, O3 shows a distinct pattern, with higher concentrations in summer and lower in winter, driven by increased photochemical activity during warmer months. SO2 trends are less pronounced, lacking a clear seasonal pattern, possibly due to more consistent emission sources.

Figure 2. The trends of pollutants (NO2, CO, PM2.5, O3, and SO2) and meteorological factors (temperature, pressure, precipitation, and wind speed) in Chengdu from November 2022 to October 2024

(A) PM2.5 concentrations.

(B) CO concentrations.

(C) NO2 concentrations.

(D) O3 concentrations.

(E) SO2 concentrations.

(F) Temperature.

(G) Pressure.

(H) Precipitation.

(I) Wind speed.

Among meteorological factors, pressure closely follows temperature trends with slight seasonal fluctuations. Precipitation displays anomalously high values during a small portion of summer, likely aiding pollutant dispersion. Wind speed is slightly higher on average in spring and autumn, but its seasonal fluctuations are not pronounced, with occasional anomalously high values. These patterns highlight the influence of seasonal and meteorological factors on air quality dynamics.

Time series analyses methods

Long short-term memory (LSTM)

Long Short-Term Memory (LSTM), a specialized recurrent neural network (RNN), addresses traditional RNNs’ limitations in modeling long-term dependencies in sequential data.25 Its core strength lies in a memory cell that stores and selectively updates information over extended periods, ideal for time series tasks like PM2.5 prediction.40 Three gates regulate this cell: the forget gate discards irrelevant past data, the input gate incorporates new relevant information, and the output gate controls what advances to the next step. The LSTM architecture is depicted in Figure 3. LSTM effectively captures both short- and long-term dependencies in time series data. However, it struggles with extremely long-term dependencies and remains sensitive to data fluctuations.

Figure 3. The architecture of LSTM

This diagram illustrates the internal architecture and data flow, highlighting the three gating mechanisms (forget, input, output) that regulate the update of the cell state (Ct) and the generation of the hidden state (Ht).
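The three gating equations described above can be sketched for a scalar-state cell (illustrative Python, not the study's MATLAB Deep Learning Toolbox implementation; the weight values below are placeholders):

```python
import math

def lstm_step(x, h_prev, c_prev, w):
    """One LSTM step for scalar input/state, showing the three gates.

    w maps each gate to an (input weight, recurrent weight, bias) triple.
    """
    sig = lambda z: 1.0 / (1.0 + math.exp(-z))
    f = sig(w["f"][0] * x + w["f"][1] * h_prev + w["f"][2])       # forget gate
    i = sig(w["i"][0] * x + w["i"][1] * h_prev + w["i"][2])       # input gate
    o = sig(w["o"][0] * x + w["o"][1] * h_prev + w["o"][2])       # output gate
    g = math.tanh(w["g"][0] * x + w["g"][1] * h_prev + w["g"][2]) # candidate
    c = f * c_prev + i * g   # cell state: keep part of the past, add new info
    h = o * math.tanh(c)     # hidden state passed to the next step
    return h, c

# Placeholder weights, one warm-up step on a scaled PM2.5 input
w = {k: (0.5, 0.5, 0.0) for k in ("f", "i", "o", "g")}
h, c = lstm_step(1.0, 0.0, 0.0, w)
```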

CNN-LSTM

Convolutional Neural Networks (CNNs) excel at extracting local features, typically in image processing, but their convolutional operations—using filters to detect trends, peaks, and variations—also apply to time series. Pooling layers then reduce dimensionality, boosting efficiency.

The CNN-LSTM hybrid integrates CNN’s local pattern extraction with LSTM’s long-term dependency modeling for time series analysis.41 Its architecture (Figure 4) features a CNN applying 1D convolution to capture short-term trends and joint features from inputs (e.g., CO, SO2, NO2, O3), followed by an LSTM that processes these features to model temporal dependencies. This hybrid model enhances PM2.5 predictive accuracy by utilizing CNN to extract meaningful local features from the time series, thereby reducing input complexity, and allowing LSTM to effectively capture temporal dependencies.24

Figure 4. The architecture of CNN-LSTM

The model sequentially integrates a convolutional neural network (CNN) and a long short-term memory (LSTM) network for temporal pattern capture, with a fully connected layer for final output generation.
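The local-feature extraction performed by the CNN stage amounts to sliding a filter along the series; a minimal sketch (illustrative Python with a hypothetical helper name; the study used the MATLAB Deep Learning Toolbox):

```python
def conv1d(series, kernel):
    """Valid-mode 1-D convolution (cross-correlation), as a CNN layer
    applies a filter along a pollutant time series to expose local trends."""
    k = len(kernel)
    return [sum(series[i + j] * kernel[j] for j in range(k))
            for i in range(len(series) - k + 1)]

# A difference filter [-1, 1] highlights day-to-day PM2.5 changes
# (input values are illustrative only)
deltas = conv1d([30.0, 35.0, 50.0, 40.0], [-1.0, 1.0])
# -> [5.0, 15.0, -10.0]
```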

Transformer

The Transformer, introduced by Vaswani et al. (2017), is a deep learning architecture that transforms sequence modeling by replacing sequential processing (as in RNNs and LSTMs) with a self-attention mechanism.30 This enables parallel input processing, enhancing training efficiency and capturing long-term dependencies effectively. Self-attention weights input elements based on their mutual relationships, excelling at contextual analysis in time series data. Its architecture (Figure 5) comprises encoder and decoder blocks with multi-head self-attention and feedforward layers, using positional encoding to maintain sequence order.

Figure 5. The architecture of the Transformer

The diagram illustrates the flow of sequential data through positional encoding, multi-head attention mechanisms, and feedforward networks with residual connections and layer normalization.

Transformers outperform LSTMs in tasks like air quality, financial, and energy forecasting, with superior long-range dependency modeling and faster training. However, they face higher computational complexity and reduced efficiency with very long sequences due to attention dilution.
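The self-attention step at the heart of the Transformer can be sketched as scaled dot-product attention (illustrative pure Python: a single head with no learned projections, not the study's MATLAB implementation):

```python
import math

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(Q[0])
    out, weights = [], []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        m = max(scores)                      # subtract max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        w = [e / z for e in exps]            # attention weights sum to 1
        weights.append(w)
        out.append([sum(w[j] * V[j][t] for j in range(len(V)))
                    for t in range(len(V[0]))])
    return out, weights

out, w = attention([[1.0, 0.0]],
                   [[1.0, 0.0], [0.0, 1.0]],
                   [[10.0, 0.0], [0.0, 10.0]])
```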

Transformer-LSTM

The Transformer-LSTM model integrates the Transformer’s parallel processing and self-attention with LSTM’s sequential, memory-based modeling. In this hybrid structure, the Transformer extracts key temporal features via self-attention, feeding them into the LSTM layer to refine time series forecasting by capturing sequential dependencies and long-term memory. Illustrated in Figure 6, this architecture shines in air pollution forecasting, stock price prediction, and meteorological analysis, utilizing strong feature extraction and sequence retention. However, it inherits drawbacks from both models: high computational complexity, challenges in hyperparameter tuning, and a risk of overfitting on small datasets.

Figure 6. The architecture of the Transformer-LSTM

The model features an encoder-decoder structure that processes input via embedding, multi-head attention mechanisms, and feedforward layers, with the decoder incorporating an LSTM for enhanced sequential output generation.

Quantification and statistical analysis

Model performance is assessed using Mean Absolute Error (MAE), Root-Mean-Square Error (RMSE), R-squared (R2), and their percentage-based variants, MAE% and RMSE%. MAE measures the average magnitude of prediction errors, RMSE emphasizes larger errors, and R2 quantifies the proportion of variance in actual PM2.5 values explained by the model. MAE% and RMSE% normalize errors by the mean of actual values, enabling consistent comparisons across forecasting horizons with varying PM2.5 scales.

Mean absolute error (MAE) and MAE%

MAE calculates the average absolute difference between actual and predicted values, offering a straightforward, outlier-insensitive error metric.42 It is defined as:

\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i| (Equation 3)

where y_i represents the actual values, \hat{y}_i the predicted values, and n the total number of observations. Lower MAE reflects predictions closer to actual values, though it weighs all errors equally.

MAE% normalizes MAE by the mean of actual values, expressed as percentages, to account for varying PM2.5 scales across scenarios. It is defined as:

\mathrm{MAE\%} = 100 \times \mathrm{MAE} / \bar{y} (Equation 4)

where \bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i is the mean of the actual values.

Root mean square error (RMSE) and RMSE%

RMSE measures prediction error by averaging squared differences between actual and predicted values, then taking the square root. It penalizes larger errors more than MAE, making it ideal when big deviations matter.42 It is defined as

\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2} (Equation 5)

where y_i, \hat{y}_i, and n are as defined above.

RMSE% normalizes RMSE by the mean of actual values, expressed as percentages, to account for varying PM2.5 scales across scenarios. It is defined as:

\mathrm{RMSE\%} = 100 \times \mathrm{RMSE} / \bar{y} (Equation 6)

where \bar{y} is the mean of the actual values, as defined above.

R-squared (R2)

R2 describes the proportion of variance in the dependent variable that is explained by the independent variables,43 providing a unitless measure of how well the regression model explains the variability of the data. It is defined as:

R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} (Equation 7)

where y_i, \hat{y}_i, and n are as defined above, and \bar{y} is the mean of the actual values. R2 typically ranges from 0 to 1: R2 = 1 indicates that the model explains all the variance, while R2 = 0 suggests no improvement over predicting the mean. Negative R2 can occur if the model underperforms the mean. Higher R2 values signify better fit.
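The five metrics in Equations 3 through 7 can be computed together; a minimal sketch (illustrative Python; the study's evaluation was run in MATLAB, and the function name is hypothetical):

```python
def metrics(y_true, y_pred):
    """MAE, RMSE, R2, and the percentage variants MAE% and RMSE%
    (Equations 3-7)."""
    n = len(y_true)
    mean = sum(y_true) / n
    mae = sum(abs(a - p) for a, p in zip(y_true, y_pred)) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(y_true, y_pred))
    rmse = (ss_res / n) ** 0.5
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    r2 = 1 - ss_res / ss_tot
    # Percentage variants normalize by the mean of the actual values
    return {"MAE": mae, "RMSE": rmse, "R2": r2,
            "MAE%": 100 * mae / mean, "RMSE%": 100 * rmse / mean}

# Illustrative values only
m = metrics([10.0, 20.0, 30.0], [12.0, 18.0, 30.0])
```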

Published: January 21, 2026

Footnotes

Supplemental information can be found online at https://doi.org/10.1016/j.isci.2026.114770.

Contributor Information

Ling Zeng, Email: zengling18@cdut.edu.cn.

Linhai Jing, Email: 2024010071@cugb.edu.cn.

Supplemental information

Document S1. Figures S1–S23 and Tables S1–S7
mmc1.pdf (5.5MB, pdf)

References

  1. Wang C., Tu Y., Yu Z., Lu R. PM2.5 and cardiovascular diseases in the elderly: an overview. Int. J. Environ. Res. Public Health. 2015;12:8187–8197. doi: 10.3390/ijerph120708187.
  2. Hayes R.B., Lim C., Zhang Y., Cromar K., Shao Y., Reynolds H.R., Silverman D.T., Jones R.R., Park Y., Jerrett M., et al. PM2.5 air pollution and cause-specific cardiovascular disease mortality. Int. J. Epidemiol. 2020;49:25–35. doi: 10.1093/ije/dyz114.
  3. Shi G., Yang F., Zhang L., Zhao T., Hu J. Impact of atmospheric circulation and meteorological parameters on wintertime atmospheric extinction in Chengdu and Chongqing of Southwest China during 2001–2016. Aerosol Air Qual. Res. 2019;19:1538–1554. doi: 10.4209/aaqr.2018.09.0336.
  4. Du X.X., Shi G.M., Zhao T.L., Yang F.M., Zheng X.B., Zhang Y.J., Tan Q.W. Contribution of secondary particles to wintertime PM2.5 during 2015–2018 in a major urban area of the Sichuan Basin, Southwest China. Earth Space Sci. 2020;7. doi: 10.1029/2020EA001194.
  5. Tao J., Gao J., Zhang L., Zhang R., Che H., Zhang Z., Lin Z., Jing J., Cao J., Hsu S.C. PM2.5 pollution in a megacity of southwest China: Source apportionment and implication. Atmos. Chem. Phys. 2014;14:8679–8699. doi: 10.5194/acp-14-8679-2014.
  6. Liao T., Wang S., Ai J., Gui K., Duan B., Zhao Q., Zhang X., Jiang W., Sun Y. Heavy pollution episodes, transport pathways and potential sources of PM2.5 during the winter of 2013 in Chengdu (China). Sci. Total Environ. 2017;584–585:1056–1065. doi: 10.1016/j.scitotenv.2017.01.160.
  7. Qiao X., Guo H., Wang P., Tang Y., Ying Q., Zhao X., Deng W., Zhang H. Fine particulate matter and ozone pollution in the 18 cities of Sichuan Basin, southwestern China: Model performance and characteristics. Aerosol Air Qual. Res. 2019;19:2308–2319. doi: 10.4209/aaqr.2019.05.0235.
  8. Tian M., Liu Y., Yang F., Zhang L., Peng C., Chen Y., Shi G., Wang H., Luo B., Jiang C., et al. Increasing importance of nitrate formation for heavy aerosol pollution in two megacities in Sichuan Basin, Southwest China. Environ. Pollut. 2019;250:898–905. doi: 10.1016/j.envpol.2019.04.098.
  9. Song T., Feng M., Song D., Liu S., Tan Q., Wang Y., Luo Y., Chen X., Yang F. Comparative analysis of secondary organic aerosol formation during PM2.5 pollution and complex pollution of PM2.5 and O3 in Chengdu, China. Atmosphere. 2022;13:1834. doi: 10.3390/atmos13111834.
  10. Chi Y., Wu Y., Wang K., Ren Y., Ye H., Yang S., Lin G. Quantification of uncertainty in short-term tropospheric column density risks for a wide range of carbon monoxide. J. Environ. Manage. 2024;370. doi: 10.1016/j.jenvman.2024.122725.
  11. Zhou S., Wang W., Zhu L., Qiao Q., Kang Y. Deep-learning architecture for PM2.5 concentration prediction: A review. Environ. Sci. Ecotechnol. 2024;21. doi: 10.1016/j.ese.2024.100400.
  12. Gaikwad S., Kumar B., Yadav P.P., Ambulkar R., Govardhan G., Kulkarni S.H., Kumar R., Chate D.M., Nigam N., Rao S.A., Ghude S.D. Harnessing deep learning for forecasting fire-burning locations and unveiling PM2.5 emissions. Model. Earth Syst. Environ. 2024;10:927–941. doi: 10.1007/s40808-023-01831-1.
  13. Zhang L., Lin J., Qiu R., Hu X., Zhang H., Chen Q., Tan H., Lin D., Wang J. Trend analysis and forecast of PM2.5 in Fuzhou, China using the ARIMA model. Ecol. Indic. 2018;95:702–710. doi: 10.1016/j.ecolind.2018.08.032.
  14. Bhatti U.A., Yan Y., Zhou M., Ali S., Hussain A., Qingsong H., Yu Z., Yuan L. Time series analysis and forecasting of air pollution particulate matter (PM2.5): an SARIMA and factor analysis approach. IEEE Access. 2021;9:41019–41031. doi: 10.1109/ACCESS.2021.3060744.
  15. Lai X., Li H., Pan Y. A combined model based on feature selection and support vector machine for PM2.5 prediction. J. Intell. Fuzzy Syst. 2021;40:10099–10113. doi: 10.3233/JIFS-202812.
  16. Wang P., Zhang H., Qin Z., Zhang G. A novel hybrid-Garch model based on ARIMA and SVM for PM2.5 concentrations forecasting. Atmos. Pollut. Res. 2017;8:850–860. doi: 10.1016/j.apr.2017.01.003.
  17. Zamani Joharestani M., Cao C., Ni X., Bashir B., Talebiesfandarani S. PM2.5 prediction based on random forest, XGBoost, and deep learning using multisource remote sensing data. Atmosphere. 2019;10:373. doi: 10.3390/atmos10070373.
  18. Jian L., Zhao Y., Zhu Y.P., Zhang M.B., Bertolatti D. An application of ARIMA model to predict submicron particle concentrations from meteorological factors at a busy roadside in Hangzhou, China. Sci. Total Environ. 2012;426:336–345. doi: 10.1016/j.scitotenv.2012.03.025.
  19. Badicu A., Suciu G., Balanescu M., Dobrea M., Birdici A., Orza O., Pasat A. PM2.5 concentration forecasting using ARIMA algorithm. In: 2020 IEEE 91st Vehicular Technology Conference (VTC2020-Spring). IEEE; 2020:1–5.
  20. Babu S., Thomas B. A survey on air pollutant PM2.5 prediction using random forest model. Environ. Health Eng. Manag. 2023;10:157–163. doi: 10.34172/EHEM.2023.18.
  21. Hu X., Belle J.H., Meng X., Wildani A., Waller L.A., Strickland M.J., Liu Y. Estimating PM2.5 concentrations in the conterminous United States using the random forest approach. Environ. Sci. Technol. 2017;51:6936–6944. doi: 10.1021/acs.est.7b01210.
  22. Dong Y., Wang H., Zhang L., Zhang K. An improved model for PM2.5 inference based on support vector machine. In: 2016 17th IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD). IEEE; 2016:27–31.
  23. Cui B., Liu M., Li S., Jin Z., Zeng Y., Lin X. Deep learning methods for atmospheric PM2.5 prediction: A comparative study of transformer and CNN-LSTM-attention. Atmos. Pollut. Res. 2023;14. doi: 10.1016/j.apr.2023.101833.
  24. Qin D., Yu J., Zou G., Yong R., Zhao Q., Zhang B. A novel combined prediction scheme based on CNN and LSTM for urban PM2.5 concentration. IEEE Access. 2019;7:20050–20059. doi: 10.1109/ACCESS.2019.2897028.
  25. Hochreiter S., Schmidhuber J. Long short-term memory. Neural Comput. 1997;9:1735–1780. doi: 10.1162/neco.1997.9.8.1735.
  26. Tong W., Limperis J., Hamza-Lup F., Xu Y., Li L. Robust Transformer-based model for spatiotemporal PM2.5 prediction in California. Earth Sci. Inform. 2024;17:315–328. doi: 10.1007/s12145-023-01138-w.
  27. Huang C.J., Kuo P.H. A deep CNN-LSTM model for particulate matter (PM2.5) forecasting in smart cities. Sensors. 2018;18:2220. doi: 10.3390/s18072220.
  28. Bai X., Zhang N., Cao X., Chen W. Prediction of PM2.5 concentration based on a CNN-LSTM neural network algorithm. PeerJ. 2024;12. doi: 10.7717/peerj.17811.
  29. Dong J., Zhang Y., Hu J. Short-term air quality prediction based on EMD-transformer-BiLSTM. Sci. Rep. 2024;14. doi: 10.1038/s41598-024-67626-1.
  30. Vaswani A., Shazeer N., Parmar N., Uszkoreit J., Jones L., Gomez A.N., Kaiser Ł., Polosukhin I. Attention is all you need. arXiv. 2017. Preprint. doi: 10.48550/arXiv.1706.03762.
  31. Tao Q., Liu F., Li Y., Sidorov D. Air pollution forecasting using a deep learning model based on 1D convnets and bidirectional GRU. IEEE Access. 2019;7:76690–76698. doi: 10.1109/ACCESS.2019.2921578.
  32. Wen C., Liu S., Yao X., Peng L., Li X., Hu Y., Chi T. A novel spatiotemporal convolutional long short-term neural network for air pollution prediction. Sci. Total Environ. 2019;654:1091–1099. doi: 10.1016/j.scitotenv.2018.11.086.
  33. Ma J., Ding Y., Cheng J.C.P., Jiang F., Gan V.J.L., Xu Z. A Lag-FLSTM deep learning network based on Bayesian Optimization for multi-sequential-variant PM2.5 prediction. Sustain. Cities Soc. 2020;60. doi: 10.1016/j.scs.2020.102237.
  34. Huang G., Li X., Zhang B., Ren J. PM2.5 concentration forecasting at surface monitoring sites using GRU neural network based on empirical mode decomposition. Sci. Total Environ. 2021;768. doi: 10.1016/j.scitotenv.2020.144516.
  35. El Mghouchi Y., Udristioiu M.T., Yildizhan H. Multivariable air-quality prediction and modelling via hybrid machine learning: a case study for Craiova, Romania. Sensors. 2024;24:1532. doi: 10.3390/s24051532.
  36. Lee Y. Meteorological factors associated with elevated levels of daily PM2.5 concentrations in Seoul, South Korea, in 2019. Int. J. High Sch. Res. 2022;4:6. doi: 10.36838/v4i6.21.
  37. Grubbs F.E. Procedures for detecting outlying observations in samples. Technometrics. 1969;11:1–21. doi: 10.1080/00401706.1969.10490657.
  38. Sharma V. A study on data scaling methods for machine learning. Int. J. Global Acad. Sci. Res. 2022;1:31–42. doi: 10.55938/ijgasr.v1i1.4.
  39. Székely G.J., Rizzo M.L., Bakirov N.K. Measuring and testing dependence by correlation of distances. Ann. Statist. 2007;35:2769–2794. doi: 10.1214/009053607000000505.
  40. Graves A. Long short-term memory. In: Supervised Sequence Labelling with Recurrent Neural Networks. Springer; 2012:37–45.
  41. Shi X., Chen Z., Wang H., Yeung D.Y., Wong W.K., Woo W.C. Convolutional LSTM network: A machine learning approach for precipitation nowcasting. arXiv. 2015. Preprint. doi: 10.48550/arXiv.1506.04214.
  42. Chai T., Draxler R.R. Root mean square error (RMSE) or mean absolute error (MAE)? Arguments against avoiding RMSE in the literature. Geosci. Model Dev. 2014;7:1247–1250. doi: 10.5194/gmd-7-1247-2014.
  43. Nagelkerke N.J. A note on a general definition of the coefficient of determination. Biometrika. 1991;78:691–692. doi: 10.1093/biomet/78.3.691.
