Heliyon. 2024 Aug 8;10(16):e35933. doi: 10.1016/j.heliyon.2024.e35933

Advancing sub-seasonal to seasonal multi-model ensemble precipitation prediction in East Asia: Deep learning-based post-processing for improved accuracy

Uran Chung a, Jinyoung Rhee b,c, Miae Kim a, Soo-Jin Sohn a
PMCID: PMC11385763  PMID: 39258194

Abstract

The growing interest in Subseasonal to Seasonal (S2S) prediction data across different industries underscores its potential for understanding weather patterns, extreme conditions, and important sectors such as agriculture and energy management. However, concerns about its accuracy have been raised, and improving the precision of rainfall predictions in S2S forecasts remains challenging. This study enhanced S2S prediction skills for precipitation amount and occurrence over the East Asian region by employing deep learning-based post-processing techniques. We used a modified U-Net architecture that wraps all of its convolutional layers with TimeDistributed layers as the deep learning model. For the training datasets, the precipitation prediction data of six S2S climate models and their multi-model ensemble (MME) were constructed, and daily precipitation occurrence was obtained from three threshold values: 0 % of daily precipitation for no-rain events, <33 % for light rain, and >67 % for heavy rain. For the precipitation amount prediction skills of the six climate models, deep learning-based post-processing outperformed post-processing using multiple linear regression (MLR) at lead times of weeks 2–4. The prediction accuracy of precipitation occurrence with MLR-based post-processing did not significantly improve, whereas deep learning-based post-processing enhanced the prediction accuracy at all lead times, demonstrating its superiority over MLR. Overall, deep learning-based post-processing improved the accuracy of both precipitation amount and occurrence forecasts of the individual climate models.

Keywords: S2S prediction, Post-processing, Precipitation forecast, Multi-model ensemble, U-Net

Highlights

  • Several industries require accurate climate forecasting on the S2S time scale.

  • Post-processing enhances the precipitation prediction on the S2S scale.

  • Deep learning-based post-processing improves precipitation prediction.

  • We improved the prediction accuracy of S2S precipitation at 2–4-week lead times.

1. Introduction

Agriculture plays a crucial role in the economies of many Asian countries. However, recent extreme weather conditions and disasters, such as floods and droughts, have caused a decline in agricultural production. These disasters have had a significant impact on Asian countries due to their heavy reliance on the agricultural industry. Additionally, the effects have been felt globally, leading to a surge in global grain prices [[1], [2], [3]]. Severe damage can be prevented by providing timely information on agricultural water, such as irrigation. Making well-informed decisions on farming activities, including timing of planting, transplanting, and second planting in double cropping, can greatly benefit farmers and the economy. This is because agricultural water is one of the most critical factors affecting agricultural productivity [4]. The amount of water available for agriculture can be predicted based on expected precipitation. Climate data such as rainfall and temperature over 2 weeks to 2 months (known as sub-seasonal to seasonal timescales) can be particularly valuable, as they align with the decision-making timeline for farming activities. For instance, it is essential to have enough water for agriculture within 2–3 weeks of planting or transplanting. Inadequate water during this period can negatively impact crop growth and agricultural productivity.

Predicting precipitation has long been a major challenge in climate science; even with dynamical models, the inherent difficulty has made it one of the “real holes in climate science” [5]. Fortunately, advances in artificial intelligence appear promising for resolving complex and nonlinear problems, and deep learning is a potential method for overcoming the difficulty of precipitation prediction. Lecun et al. [6] reported that deep learning can be applied to resolve various problems related to geoscience; the rich growth in data and computing resources has yielded innovative results. In particular, advances in computer vision technology and deep learning have facilitated weather forecasting by predicting precipitation based on the shape and density of clouds detected by radar [7,8]. Nonlinear operators, such as convolutional neural networks, have shown great performance in computer vision tasks. These models can also be used for accurate precipitation predictions in numerical weather prediction models.

The development of high-performance computing systems, open-source libraries, and big data technology has made more memory-intensive deep learning models available. Among these, convolutional neural network (CNN) models have demonstrated better results than feedforward neural networks in parameterizing tropical turbulence. This is significant because general climate models have historically struggled to provide accurate results in this area [9]. Shi et al. [10] applied the convolution technique to general Long Short-Term Memory (LSTM) to construct precipitation prediction models that use Convolutional LSTM (ConvLSTM) and radar images, resulting in superior performance compared with other techniques. Shi et al. [11] proposed a trajectory-gated recurrent unit model to dynamically learn location-deformation structure, which the authors applied to learn different precipitation patterns for each region and improve the accuracy of precipitation prediction.

Recent trends in precipitation prediction research have combined various data sources, such as radar, satellite, and numerical forecast data, with various deep learning models, such as ConvLSTM, U-Net, and computer vision technology [[12], [13], [14]]. In particular, U-Net, a CNN-based model consisting of an encoder and a decoder, has been applied in cloud segmentation, precipitation distribution prediction, and sea ice area segmentation using climate observation data [15,16]. RainNet, a CNN-based rainfall prediction model that combines U-Net and SegNet, has also been used to predict short-term precipitation intensity from radar data [12,13]. SegNet [17] is a neural network architecture based on VGGNet [18] that consists of an encoder and a decoder, similar to U-Net, but differs in how information is transferred between the encoder and the decoder. Mooers et al. [14] combined deep convolutional neural networks with U-Net to convert a geopotential height of 500 hPa into cubes, minimizing spherical distortions and improving precipitation predictions. Moreover, recent efforts have been made to improve the predictability of climate dynamics models by combining physical information with neural networks to address the challenges in modeling the interactions between the atmosphere and the ocean [19].

Previous deep-learning studies have largely concentrated on enhancing precipitation predictions for short-term and seasonal periods. However, current forecasting systems continue to struggle with accurately predicting precipitation two to six weeks ahead. There have been limited studies applying deep learning to precipitation prediction on the Subseasonal to Seasonal (S2S) timescale because of the complexities arising from increased nonlinearity over longer timescales [20]. Recently, research applying deep learning to improve prediction skill on the S2S timescale has increased. Weyn et al. [21] used a CNN in a Deep Learning Weather Prediction (DLWP) model to predict the total water vapor from two to six weeks ahead, providing tropical cyclone information through mid-latitude weather system simulations, while Specq et al. [22] proposed a statistical–dynamical post-processing scheme for probabilistic weekly precipitation re-forecasts for weeks 1–4 of the austral summer. Kim et al. [23] showed that a deep learning bias correction method significantly reduced multi-model forecast errors in the Madden-Julian Oscillation on a sub-seasonal timescale of up to four weeks.

Although deep learning research for improving S2S prediction is increasing, few studies have directly used deep learning as a post-processing method for predicting S2S [23]. More research is needed to improve precipitation predictions at the S2S timescale, especially by using deep learning to process climate dynamical model results. Traditional statistical post-processing methods for time series data characterized by seasonality, periodicity, and irregularity present challenges because they assume that the data properties are random or inconsequential, which may not be well-suited for such data. Therefore, we developed a deep learning-based post-processing model for predicting the amount and occurrence of precipitation at the S2S timescale. We used U-Net as a training model, which has recently become widely used in the field of meteorology, and modified the model's architecture to consider the temporal continuity of time-series data. The performance of the proposed model was evaluated by comparing it with that of multiple linear regression (MLR), a traditional statistical method.

2. Data and methods

2.1. S2S data

This study used six S2S forecasting systems: those of the European Centre for Medium-Range Weather Forecasts (ECMWF) [20,24,25], the Global Seasonal Forecast System version 5 of the Korea Meteorological Administration (KMA) [26,27], Environment and Climate Change Canada (ECCC) [28], the National Centers for Environmental Prediction (NCEP) [29], the UK Met Office (UKMO) [30], and the China Meteorological Administration (CMA) (Table 1) [31,32].

Table 1.

Information on the sub-seasonal to seasonal (S2S) predictions of the six climate models used in this study.

S2S climate forecasting system | Global Producing Center | Hindcast ensemble size | Hindcast period | Forecasting frequency | Forecasting time range (d)
Model     KMA   | Seoul      | 3  | 1991–2010 | 4/month (1, 9, 17, 25) | 60
          UKMO  | Exeter     | 7  | 1993–2015 | 4/month (1, 9, 17, 25) | 60
          CMA   | Beijing    | 4  | 1994–2014 | Daily                  | 60
          ECMWF | ECMWF      | 11 | 1998–2017 | 2/week (Mon, Thu)      | 46
          NCEP  | Washington | 4  | 1999–2010 | Daily                  | 44
          ECCC  | Montreal   | 4  | 1998–2014 | Weekly (Thu)           | 32
Ensemble  MME   | –          | –  | 1995–2014 | Weekly                 | 30

CMA: China Meteorological Administration; ECCC: Environment and Climate Change Canada; ECMWF: European Centre for Medium-Range Weather Forecasts; KMA: Korea Meteorological Administration; MME: Multi-Model Ensemble; NCEP: National Centers for Environmental Prediction; UKMO: UK Met Office.

The temporal resolution of the S2S climate variables was daily, except for the maximum air temperature at 2 m (TMAX) and minimum air temperature at 2 m (TMIN), for which 6-hourly data were averaged to obtain daily values. Data from all the climate models had a spatial resolution of 1.5° × 1.5°, covering the globe with 121 latitudinal and 240 longitudinal grids. The domain of interest was East Asia, as shown in Fig. 1. The number of ensemble members, hindcast length, production frequency, and time range of the hindcast datasets differed among the models.

Fig. 1. Spatial domain of interest in this study.

2.1.1. S2S and multi-model ensemble (MME) precipitation data for training

The S2S data from the six climate models were combined to construct their MME by averaging them over a common hindcast period from 1995 to 2014. The ensemble members of each model consisted of the results of the control and perturbation runs; these were averaged so that the single-model ensemble mean could be used as the input data. Since the prediction frequencies and periods (time ranges) of the six climate models were heterogeneous, a common frequency and period were needed for the MME. The prediction period of the MME was set to 30 days, and its prediction frequency was the same as that of the ECMWF and ECCC (in the hindcasts produced in 2018). Based on the prediction period of the ECMWF and ECCC beginning on January 4, 2018, for example, prediction data from the CMA and NCEP were easily combined with the ECMWF and ECCC data because they have daily frequencies (Table 1). However, for the KMA and UKMO, whose hindcasts began on January 1, 2018, we excluded the first 3 days of their data, as illustrated in the sketch below.
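To make the alignment step concrete, the following is a minimal sketch (not the authors' code) of how heterogeneous hindcasts might be truncated to a common 30-day window and averaged into an MME; the array layout, function name, and offsets passed in are illustrative assumptions.

```python
import numpy as np

def align_and_average(model_runs, common_lead_days=30, skip_first_days=None):
    """Build a simple multi-model ensemble (MME) mean from heterogeneous hindcasts.

    model_runs : dict of {model_name: array of shape (n_init, n_lead, n_lat, n_lon)},
        where each array is already the single-model ensemble mean (control and
        perturbed members averaged) over the common hindcast period 1995-2014.
    skip_first_days : dict of {model_name: int}, number of leading days to drop so
        that all models start on the same initial date (e.g., 3 for KMA/UKMO,
        whose hindcasts start on Jan 1 rather than Jan 4).
    """
    skip_first_days = skip_first_days or {}
    aligned = []
    for name, arr in model_runs.items():
        offset = skip_first_days.get(name, 0)
        # Drop the leading days and truncate to the common 30-day prediction period.
        aligned.append(arr[:, offset:offset + common_lead_days, :, :])
    # The MME is the unweighted average across models.
    return np.mean(np.stack(aligned, axis=0), axis=0)
```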

Three threshold values were applied to the precipitation data to define precipitation occurrence (Table 2) [33]. Days with a value above the threshold were assigned a value of 1, indicating the occurrence of precipitation, and 0 otherwise. The input variables of the training model were the maximum temperature, minimum temperature, total precipitation, 2 m air temperature, mean sea level pressure, U-component of wind at 50, 850, and 200 hPa, V-component of wind at 850 and 200 hPa, geopotential height at 200 and 500 hPa, vertical velocity at 500 hPa, sea surface temperature, top net thermal radiation, and specific humidity at 850 hPa, which are common variables of the six climate models and account for precipitation as well as wind speed and direction [25]. The target (dependent) variable of the deep learning model was either the daily precipitation amount or its occurrence. All S2S prediction data were standardized by removing the mean and scaling to unit standard deviation and then fed into the learning model as five-dimensional arrays: initial date, lead time, latitude, longitude, and climate variable.
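As a rough illustration of this preprocessing, the sketch below standardizes a five-dimensional input array of the kind described above; the grid and variable counts are placeholders rather than the values used in the study.

```python
import numpy as np

def standardize(x):
    """Remove the mean and scale to unit standard deviation, per climate variable.
    x has shape (initial_date, lead_time, latitude, longitude, variable)."""
    mean = x.mean(axis=(0, 1, 2, 3), keepdims=True)
    std = x.std(axis=(0, 1, 2, 3), keepdims=True)
    return (x - mean) / (std + 1e-8)  # small epsilon guards against zero variance

# Placeholder dimensions: 80 initial dates, 30 lead days, a 24 x 40 sub-grid,
# and 16 common input variables (the exact grid size is an assumption).
inputs = standardize(np.random.rand(80, 30, 24, 40, 16).astype("float32"))
```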

Table 2.

Classification of rainfall intensity.

Threshold value | Rain rate (mm/d) | Status
0 %             | <0.1             | No Rain
>0 to ≤33 %     | ≥0.1 and <10     | Light Rain
>67 %           | ≥50              | Heavy Rain
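For illustration, a minimal sketch of how daily rain rates could be turned into the three binary occurrence labels of Table 2 follows; the handling of rates between 10 and 50 mm/d is not spelled out in the text, so the bounds below simply follow the table and are otherwise an assumption.

```python
import numpy as np

def occurrence_masks(rain_mm_per_day):
    """Binary occurrence labels (1 = event, 0 = no event) for the three Table 2
    categories: no rain (< 0.1 mm/d), light rain (0.1-10 mm/d), heavy rain (>= 50 mm/d)."""
    rain = np.asarray(rain_mm_per_day)
    no_rain = (rain < 0.1).astype(np.int8)
    light_rain = ((rain >= 0.1) & (rain < 10.0)).astype(np.int8)
    heavy_rain = (rain >= 50.0).astype(np.int8)
    return no_rain, light_rain, heavy_rain
```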

2.1.2. Reference precipitation data for training

Daily precipitation data from the ECMWF Reanalysis version 5 (ERA5) were used as the reference [34]. Compared with ERA-Interim, ERA5 shows some improvements in the representation of rainfall variability over most tropical and localized mid-latitude regions [35,36]. However, ERA5 still slightly overestimates precipitation in the summer season and has errors in high mountain regions such as the Andes and Himalayas. Therefore, we focused on assessing and enhancing the capability of the post-processing model to improve precipitation predictions from the climate models rather than merely relying on the quality of the reference data. All settings, such as the period and spatial domains, were the same as those of the S2S prediction data.

2.2. Application of post-processing models

2.2.1. MLR model

The MLR model was compared with the U-Net model to examine the improvement in the predictions of precipitation amount and occurrence. MLR is a statistical technique that uses multiple explanatory variables to predict the outcome of a response variable by modeling the linear relationship between the explanatory (independent) variables and the response (dependent) variable. The MLR was trained to minimize the mean square error (MSE), which measures the difference between observations and predictions, and its accuracy in this study was assessed using the coefficient of determination (R2). MLR can be expressed using Eq. (1); detailed explanations have been reported in the literature [37,38].

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} \qquad (1)$$

where $y_i$ is the dependent (predicted) variable; $x_{i1}, x_{i2}, \ldots, x_{ip}$ are the independent variables; $\beta_0$ is the y-intercept, i.e., the value of $y$ when all independent variables are equal to zero; and $\beta_1, \beta_2, \ldots, \beta_p$ are the estimated regression coefficients, each representing the change in $y_i$ per one-unit change in its respective independent variable.
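As a concrete but hypothetical illustration, the sketch below fits an ordinary least-squares regression, which minimizes the MSE, and reports R2; the sample data and the way the predictors are flattened are placeholders rather than the study's actual setup.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Placeholder training data: rows are (initial date, lead time, grid cell) samples,
# columns are 16 standardized predictor variables; y stands in for ERA5 precipitation.
rng = np.random.default_rng(0)
X_train = rng.random((5000, 16))
y_train = rng.random(5000)

mlr = LinearRegression().fit(X_train, y_train)  # ordinary least squares minimizes MSE
print("R2:", r2_score(y_train, mlr.predict(X_train)))
print("beta_0 (intercept):", mlr.intercept_)
print("beta_1..beta_p:", mlr.coef_)
```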

2.2.2. U-net model

In this study, the U-Net model, which is widely used in deep learning research in the climate domain, was selected to develop a deep learning-based post-processing technique. The U-Net model classifies and highlights certain parts of an image at the pixel level, making the objects or boundaries in the image clear. U-Net has a U-shaped symmetrical form consisting of a contracting path network for obtaining the overall feature information of the input images and an expanding path network for accurate localization [21,39]. We modified U-Net by adding a TimeDistributed wrapper to all convolutional layers of the architecture to incorporate the characteristics of the time-series data (Fig. 2). The TimeDistributed wrapper applies the same Conv2D weights to each time step of the input sequence, treating the time axis independently (TensorFlow v2.12.0), and simplifies the code, which can reduce training time; however, using the TimeDistributed wrapper did not by itself improve the accuracy of model learning or prediction (e.g., MSE or RMSE). As another modification, average pooling was employed for downsampling to retain more information than the max pooling commonly used in the original U-Net. The kernel size (or filter) is a parameter for extracting image features and is generally defined as a square matrix, such as (2 × 2), (3 × 3), or (4 × 4). In this study, only the kernel size of the first layer was set to (2 × 2), while the kernel sizes of the remaining layers were (3 × 3). The numbers of feature maps generated at the convolutional layers by applying these kernels were 32, 64, and 128.

Fig. 2. The U-Net architecture [39] wrapped with TimeDistributed layers (indicated by the purple dashed line) to incorporate the characteristics of time-series data. The kernel size was (2 × 2) in the first layer and (3 × 3) thereafter; the numbers of kernels were set to 32, 64, and 128 in the model.
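The following is a minimal Keras sketch of such a TimeDistributed U-Net, not the authors' implementation: every Conv2D is wrapped in TimeDistributed so the same convolution weights are applied at each lead-time step, average pooling is used for downsampling, the first layer uses a (2 × 2) kernel and the remaining layers (3 × 3) kernels, and the feature-map counts are 32, 64, and 128. The network depth, input dimensions, and output activation are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def td_conv(x, filters, kernel):
    """A TimeDistributed Conv2D block: the same 2D convolution weights are
    applied independently to every lead-time step of the sequence."""
    return layers.TimeDistributed(
        layers.Conv2D(filters, kernel, padding="same", activation="relu"))(x)

def build_time_distributed_unet(n_steps=30, n_lat=24, n_lon=40, n_vars=16):
    # Illustrative input dimensions; the actual grid/variable counts are assumptions.
    inputs = layers.Input(shape=(n_steps, n_lat, n_lon, n_vars))

    # Contracting path: (2 x 2) kernel in the first layer, (3 x 3) afterwards,
    # with average pooling used for downsampling instead of max pooling.
    c1 = td_conv(inputs, 32, (2, 2))
    p1 = layers.TimeDistributed(layers.AveragePooling2D((2, 2)))(c1)
    c2 = td_conv(p1, 64, (3, 3))
    p2 = layers.TimeDistributed(layers.AveragePooling2D((2, 2)))(c2)

    # Bottleneck.
    c3 = td_conv(p2, 128, (3, 3))

    # Expanding path with skip connections, as in the original U-Net.
    u2 = layers.TimeDistributed(layers.UpSampling2D((2, 2)))(c3)
    u2 = layers.Concatenate()([u2, c2])
    c4 = td_conv(u2, 64, (3, 3))
    u1 = layers.TimeDistributed(layers.UpSampling2D((2, 2)))(c4)
    u1 = layers.Concatenate()([u1, c1])
    c5 = td_conv(u1, 32, (3, 3))

    # One output channel per grid cell: precipitation amount (linear output);
    # a sigmoid activation would be the natural choice for the occurrence target.
    outputs = layers.TimeDistributed(layers.Conv2D(1, (1, 1)))(c5)
    return Model(inputs, outputs)

model = build_time_distributed_unet()
```

With the 24 × 40 placeholder grid, two pooling steps keep the spatial sizes divisible by two, so the upsampled feature maps align with the skip connections.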

2.2.3. Model configurations

The training and testing periods varied based on the hindcast period of each climate model or the MME. We used 80 % of the entire hindcast period for training the model and the remaining 20 % for testing model performance. For training, we used 4-fold cross-validation by randomly dividing the training data into four folds, of which three were used to fit the model and the remaining one served as validation data for tuning the model parameters. The MSE was used as the loss function for model optimization during training. Cosine similarity was also used as an evaluation metric [40]; it measures the degree of reproduction or similarity (i.e., how well the model can mimic the input patterns or how well the model predictions correlate with the labels). The Adam optimizer was used to update the model parameters while minimizing the loss function because it is currently known to perform best [41,42]. The learning rate was set to 0.005, and the number of iterations (epochs) was set to 50. The batch size, that is, the number of instances used in one training iteration, was set to 80.
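A sketch of this training configuration, reusing the hypothetical builder from the previous snippet together with placeholder arrays, might look as follows; rebuilding the model inside each fold is an assumption about how the cross-validation was run.

```python
import numpy as np
import tensorflow as tf
from sklearn.model_selection import KFold

# Placeholder data with the same layout as in the standardization example above.
x = np.random.rand(20, 30, 24, 40, 16).astype("float32")
y = np.random.rand(20, 30, 24, 40, 1).astype("float32")

kfold = KFold(n_splits=4, shuffle=True, random_state=0)  # random 4-fold split
for train_idx, val_idx in kfold.split(x):
    model = build_time_distributed_unet()  # defined in the previous sketch
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=0.005),
        loss="mse",                                     # loss function for optimization
        metrics=[tf.keras.metrics.CosineSimilarity()])  # similarity to the labels
    model.fit(x[train_idx], y[train_idx],
              validation_data=(x[val_idx], y[val_idx]),
              epochs=50, batch_size=80)
```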

2.3. Assessment skill metrics

To evaluate the accuracy and variability of the spatial agreement of daily precipitation predictions for the test period from 2011 to 2014 based on individual climate models and MME, we calculated the anomalous spatial pattern correlation coefficient (PCC) [9] for lead times of 1–4 weeks (Eq. (2)).

$$\mathrm{PCC}_i = \frac{\sum_{j=1}^{N} \Delta x_{i,j}\,\Delta y_{i,j}}{\sqrt{\sum_{j=1}^{N} \Delta x_{i,j}^{2}}\,\sqrt{\sum_{j=1}^{N} \Delta y_{i,j}^{2}}} \qquad (2)$$

where $i$ and $j$ denote a target week and a grid point, respectively, $\Delta x_{i,j}$ and $\Delta y_{i,j}$ are the predicted and observed precipitation anomalies in week $i$ at grid point $j$, and $N$ is the total number of grid points in the study area.

We also calculated the anomaly correlation coefficient (ACC), focusing on temporal correlation to determine how well the model predictions capture temporal changes in the anomalies (Eq. (3)).

$$\mathrm{ACC} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^{2}}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^{2}}} \qquad (3)$$

where $x_i$ is the $i$-th S2S precipitation value, $y_i$ is the $i$-th observed precipitation value, $\bar{x}$ and $\bar{y}$ are the respective means of the S2S and observed precipitation, and $n$ is the total number of data points.
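A small NumPy sketch of both metrics, under the assumption that the anomalies have already been computed against the relevant climatology, is shown below.

```python
import numpy as np

def pattern_correlation(pred_anom, obs_anom):
    """Anomalous spatial pattern correlation (PCC, Eq. 2) for one target week.
    pred_anom, obs_anom: 2-D arrays (lat, lon) of precipitation anomalies."""
    dx, dy = pred_anom.ravel(), obs_anom.ravel()
    return np.sum(dx * dy) / np.sqrt(np.sum(dx**2) * np.sum(dy**2))

def anomaly_correlation(pred_series, obs_series):
    """Temporal anomaly correlation coefficient (ACC, Eq. 3): the Pearson
    correlation between predicted and observed precipitation over time."""
    x, y = np.asarray(pred_series), np.asarray(obs_series)
    xm, ym = x - x.mean(), y - y.mean()
    return np.sum(xm * ym) / np.sqrt(np.sum(xm**2) * np.sum(ym**2))
```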

The accuracy of precipitation occurrence predictions was assessed using a contingency table, a commonly used binary variable verification method for precipitation forecasting (Table 3) [43]. Contingency tables were also used to calculate the critical success index (CSI), Gilbert skill score (GSS), probability of detection (POD), and false alarm ratio (FAR).

Table 3.

Prediction contingency table.

                          Prediction
                   Yes                 No                        Total
Observation  Yes   TP (Hits)           FN (Misses)               TP + FN
             No    FP (False alarms)   TN (Correct rejections)   FP + TN
             Total TP + FP             FN + TN                   TP + FN + FP + TN

True positive (TP): the number of observed events that the model correctly predicted as events.

False negative (FN): the number of observed events that the model incorrectly predicted as non-events (misses).

False positive (FP): the number of non-events that the model incorrectly predicted as events (false alarms).

True negative (TN): the number of non-events that the model correctly predicted as non-events.

CSI was used to assess both the occurrence of precipitation and false alarms while ignoring no-rain events; it is defined as the ratio of hits to the total number of events that were observed or predicted, excluding true negatives (Table 3 and Eq. (4)). CSI ranges from 0 to 1, with higher values indicating a better forecast.

$$\mathrm{CSI} = \frac{TP}{TP + FN + FP} \qquad (4)$$

GSS, also known as the equitable threat score (ETS), is designed to be more equitable by penalizing the score for random hits; it adjusts the CSI to account for hits expected by chance (Eq. (5)). It ranges from −1 to 1, with values closer to 1 indicating a better forecast.

$$\mathrm{GSS} = \frac{TP - E}{TP + FP + FN - E}, \qquad E = \frac{(TP + FP)(TP + FN)}{N} \qquad (5)$$

where E is the expected number of hits due to random chance, and N is the total number of observations.

POD does not consider false alarms but focuses on how many of the observed precipitation events were detected; it is defined as the ratio of the number of correct forecasts to the number of samples in which the phenomenon occurred (Eq. (6)). Similar to the CSI, higher POD values indicate better prediction performance.

$$\mathrm{POD} = \frac{TP}{TP + FN} \qquad (6)$$

FAR is a measure of false alarms, which are situations in which precipitation is forecasted but no rainfall occurs. It is defined as the ratio of the number of false alarms to the total number of predicted rainfall events (Eq. (7)) [[44], [45], [46]]. A lower FAR indicates fewer false alarms.

$$\mathrm{FAR} = \frac{FP}{TP + FP} \qquad (7)$$

In this study, the improvement ratio (called skill score) was calculated to explain the relative changes in each evaluation index (Eq. (8)) [47]. For example, the skill score based on the evaluation index value can be calculated using the PCC values of the calibrated S2S (trained using the deep learning model) and the original S2S (not trained).

$$\mathrm{Skill\;score} = \frac{\mathrm{Index}_{\mathrm{prediction}} - \mathrm{Index}_{\mathrm{reference}}}{\mathrm{Index}_{\mathrm{reference}}} \qquad (8)$$

where $\mathrm{Index}_{\mathrm{prediction}}$ is the evaluation index value (e.g., PCC or ACC after training) of the post-processed daily precipitation anomaly or occurrence, and $\mathrm{Index}_{\mathrm{reference}}$ is the index value of the raw S2S forecast for the daily precipitation anomaly or occurrence.
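To summarize Eqs. (4)–(8) in one place, here is a compact sketch of the four contingency-table scores and the skill score; the inputs are assumed to be binary occurrence fields for a single threshold, and no guard against empty categories is included.

```python
import numpy as np

def contingency_scores(obs, pred):
    """CSI, GSS (ETS), POD, and FAR from binary occurrence fields (Eqs. 4-7).
    obs, pred: arrays of 0/1 labels for one rainfall-intensity threshold."""
    obs, pred = np.asarray(obs, bool), np.asarray(pred, bool)
    tp = np.sum(obs & pred)          # hits
    fn = np.sum(obs & ~pred)         # misses
    fp = np.sum(~obs & pred)         # false alarms
    n = obs.size                     # total number of observations
    e = (tp + fp) * (tp + fn) / n    # hits expected by random chance
    csi = tp / (tp + fn + fp)
    gss = (tp - e) / (tp + fp + fn - e)
    pod = tp / (tp + fn)
    far = fp / (tp + fp)
    return csi, gss, pod, far

def skill_score(index_prediction, index_reference):
    """Relative improvement of an evaluation index after post-processing (Eq. 8)."""
    return (index_prediction - index_reference) / index_reference
```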

3. Results and discussion

3.1. Training performance of the post-processing models

Tables A1 and A2 present the training results for the MLR model (Appendix A). The regression coefficient β3 (for the independent input variable PREC) is high when predicting the target variable of precipitation amount, except for CMA and ECCC (Table A1). Likewise, precipitation occurrence was an important independent variable for predicting the target variable of precipitation occurrence, as indicated by its high regression coefficient β17 (Table A2). In other words, the MLR model relied most strongly on the independent variable of the same type as its target, whether precipitation amount or precipitation occurrence.

In Fig. A1, the U-Net learning model's loss function and cosine similarity results are displayed for daily precipitation anomalies and occurrence predictions during the training period from 1995 to 2010. As depicted in Fig. A1a, the mean squared error (MSE) values for the precipitation anomaly sharply decreased after epoch 10, stabilizing at around 1 for both training and validation datasets. However, for precipitation occurrence, the training and validation losses fluctuated notably, particularly until epoch 15, and then maintained a consistent disparity between the two sets. In Fig. A1b, the cosine similarity for precipitation anomaly remained relatively constant at about 0.2 after epoch 5, whereas for precipitation occurrence, it consistently exceeded that of precipitation anomaly. This can be attributed, in part, to the fact that binary classification, like precipitation occurrence (rain or no rain), is more likely to yield high cosine similarity scores as the predicted values and labels either match or do not, unlike precipitation amount, which encompasses a range of continuous values.

3.2. Evaluation of the models

3.2.1. Prediction skill for precipitation amount

Fig. 3 compares the ACC and PCC skills of daily precipitation predictions before and after post-processing with the MLR and U-Net models for the six climate models and the MME during the test period. Fig. 3a shows the ACC of S2S precipitation over the forecast period, which corresponds to the forecast range of the individual climate models. With the exception of CMA, the ACC of daily precipitation improved after applying U-Net and MLR post-processing. Notably, the ACC of daily precipitation for NCEP remained higher than that of the other individual climate models after 14 days. Overall, the ACC at each prediction time improved through the application of the post-processing models. Fig. 3b compares the PCC skills of daily precipitation predictions for 1–4-week lead times. At the 1-week lead time, the PCC values were not significantly improved by either post-processing model for the six climate models and the MME, and they differed little between the U-Net and MLR models; only the PCCs of ECMWF, UKMO, and the MME were higher with the U-Net model than with the MLR model. From the two- to four-week lead times, the PCCs with the U-Net model were higher than those with the MLR model and the Reference for the six climate models and the MME (in general, PCC_U-Net > PCC_MLR > PCC_Reference), indicating the superior performance of the U-Net model for post-processing. When comparing the PCCs of the six climate models with those of the MME, the PCCs of the MME were comparable to or even better than those of some climate models.

Fig. 3. a) Comparisons of the daily anomalous correlation coefficients (ACC) of S2S daily precipitation predictions of the six climate models and MME, post-processed with the U-Net model (solid color lines) or with the MLR model (dashed color lines), and before post-processing (Reference, increasing gray color) during the prediction period. b) Comparisons of the anomalous spatial pattern correlation coefficients (PCC) of S2S daily precipitation predictions of the six climate models (ECMWF is E, ECCC is M, NCEP is N, KMA is K, UKMO is U, CMA is C) and MME, post-processed with the U-Net model (solid black circles and stars, respectively) or with the MLR model (solid and hatched gray bars, respectively), and before post-processing (Reference, solid gray lines) from weeks 1–4.

Skill scores were calculated to better understand the changes in ACC and PCC before and after applying the post-processing models (Eq. (8) and Fig. 4). Fig. 4a shows the ACC skill scores, and Fig. 4b shows the range of ACC skill scores of the six individual climate models and the MME with the U-Net (hatched boxplot) and MLR (empty boxplot) models at each prediction time. Over the full prediction period, the ACC skill scores of both the U-Net and MLR improved, except at lead times of 1–2 days (Fig. 4a). After day 7, the range of the MLR ACC skill scores no longer improved, remaining between 0 and 0.5 (Fig. 4b). In contrast, the range of the U-Net ACC skill scores continued to increase, and after day 21, most of them were ≥1 (Fig. 4b). Fig. 4c shows the PCC skill scores. At the 1-week lead time, the PCC skill scores of the MLR and U-Net models were not high, indicating that the skill did not significantly improve after applying post-processing. However, the skill scores increased for the 2–4-week lead times. The prediction skill for CMA improved significantly at all lead times, and ECCC showed significant improvement at the 4-week lead time. The skill scores of the U-Net model for the MME and the climate models were generally much higher than those of the MLR model, indicating that post-processing based on the U-Net model yielded better results than the conventional method. This implies that the U-Net model, based on neural networks, better captures and handles the complex nonlinear relationships between the response and predictor variables and learns patterns well compared with the MLR model, which assumes linear relationships. One benefit of the U-Net model is its efficient matrix-based extraction of input feature information. As a result, the U-Net model effectively captures the input precipitation field, leading to superior performance compared with the MLR-based post-correction.

Fig. 4. a) Comparisons of daily ACC skill scores of six climate models (dashed lines for MLR and solid color lines for U-Net) and MME (dashed black line for MLR and solid black line for U-Net) S2S precipitation predictions. b) Comparisons of the range of ACC skill scores of MLR (empty boxplot) and U-Net (hatched gray boxplot) applied to S2S daily precipitation of six climate models and MME. c) Comparisons of weekly PCC skill scores of six climate models (white solid bars for MLR and solid symbols for U-Net) and MME S2S precipitation predictions (hatch bars and solid star symbol for U-Net).

In addition, the U-Net-based post-processing reduces the error compared with conventional methods by training the spatiotemporal predictive distribution of S2S precipitation, as output directly from the climate models, to reproduce the distribution of the observed data. This is clear from the ACC skill scores in Fig. 4b, where the skill score of the deep learning-based post-processing exceeds 0.5 after 14 days and remains above 1.5 thereafter. The ACCs were also calculated for each grid point for weeks 2 and 3 over the entire test period (Fig. 5). The spatial distribution of the ACCs in S2S without post-processing (left panels) was improved when the U-Net (center panels) or the MLR (right panels) was applied. In week 2, the spatial distribution of the ACC based on the post-processing of the six climate models and the MME was clearly improved, which was already reflected in the PCCs in Fig. 3b. Even for the 3-week ACCs, grid points with low ACCs improved after post-processing, especially for NCEP and the MME. In other words, high ACC values are displayed in red, with the intensity of the red color increasing after post-processing and the red color being more widely distributed for the U-Net than for the MLR.

Fig. 5. Comparisons of the spatial distribution of ACCs by calculating ACC for each grid for weeks 2 and 3 during overall test periods. a) ECMWF, b) KMA, c) NCEP, d) CMA, e) ECCC, f) UKMO, and g) MME.

3.2.2. Prediction skill for precipitation occurrence

FAR, POD, CSI, and GSS for the three types of rainfall intensity, calculated from the contingency table (Table 3), were used to evaluate precipitation occurrence (Eqs. (4), (5), (6), (7)). In Fig. 6, for each of FAR, POD, CSI, and GSS, the panels arranged vertically show, in order, the results before post-processing, with the U-Net model, and with the MLR model, while the panels arranged horizontally correspond to the three rainfall intensity types (No-Rain, Light-Rain, and Heavy-Rain). In all panels, the index value increases toward blue and decreases toward yellow. For FAR, a lower index (closer to yellow) indicates a better prediction, whereas for POD, CSI, and GSS, a higher index (closer to blue) indicates a better prediction.

Fig. 6. Comparison of the weekly (a) false alarm ratio (FAR), (b) probability of detection (POD), (c) critical success index (CSI), and (d) Gilbert skill score (GSS) for the three thresholds of precipitation occurrence of six climate models with post-processing based on U-Net and MLR, three thresholds applied to MME, and three thresholds applied before post-processing.

In Fig. 6a, the FAR patterns of the Reference and the MLR for No-Rain were similar, showing lower FARs as the lead time increased. However, the U-Net generally showed a high FAR, especially for the MME, which showed a high FAR over the entire lead time, leading to false positives (false alarms, closer to blue). The No-Rain case implies that the post-processing models did not reduce the false positives of the Reference. As the rainfall intensity increased (light rain and heavy rain), a lower FAR was observed, nearing 0; both the U-Net and MLR models displayed the same pattern. In Fig. 6b, the POD patterns of the Reference, MLR, and U-Net for No-Rain were similar to the FAR (with the exception of ECCC), and the POD increased as the rainfall intensity increased. Although the FAR and POD of the Reference and the post-processing models increased with rainfall intensity and lead time, it is difficult to determine from these two indices alone whether post-processing drove the high FAR and POD. However, based on the evaluation results for the CSI and GSS, it is apparent that post-processing can help improve the prediction of precipitation occurrence. In Fig. 6c and d, the CSI and GSS of the U-Net for No-Rain were higher than those of the Reference; the CSI and GSS of the MLR were also higher than those of the Reference but lower than those of the U-Net. As the rainfall intensity increased, high CSI and GSS values were observed, and both indices improved more with the U-Net than with the MLR.

The selection of hyperparameters and their values can significantly impact the training outcome of a neural network model. While this study did not delve into optimizing batch sizes, epochs, learning rates, and network depths, these factors could improve the learning performance of deep learning models if adjusted precisely. There are various methods to post-process the prediction data of the climate model. Since the methodology used to construct the training data can influence the learning outcomes, the learning performance of deep learning models can be enhanced through an appropriate post-processing technique tailored to the characteristics of the input data.

4. Conclusions

The MME application of the six models and the use of the modified U-Net model for post-processing improved the weekly prediction skills for precipitation amounts. We applied post-processing using a deep learning model that modified the U-Net model by integrating spatial information from the input data into its convolutional layers and accounting for temporal continuity by adding TimeDistributed wrappers. The results were compared to those obtained using the traditional statistical method of MLR.

When using the MME technique, the prediction of precipitation amount was improved and further enhanced through deep learning-based post-processing. However, the MME did not improve the prediction of precipitation occurrence (in the case of FAR and POD for No-Rain). This is likely due to the way precipitation occurrence was derived for the MME: the precipitation of the climate models was averaged first, and the presence or absence of precipitation was then determined with a 0.1 mm/d threshold. Nevertheless, the MME did show improvements in CSI and GSS not only for No-Rain but also for Light-Rain and Heavy-Rain. In conclusion, prediction improvement can be achieved through the MME and post-processing when it rains, even with very little rainfall, but improving predictions for cases in which there is no rain remains a challenge. Moreover, precipitation occurrence predictions that were not improved by the MME technique before post-processing were enhanced by deep learning-based post-processing.

The use of deep learning-based post-processing significantly improved prediction accuracy compared with conventional methods, particularly for the dynamical climate models. By optimizing the neural network structure and parameters, such as hyperparameters and layer depths, the accuracy of predicting daily precipitation anomalies and occurrence 3–4 weeks in advance can be improved further. This could broaden the applications of S2S prediction data in different fields. Improved S2S precipitation data serve as crucial and valuable information for decision-making across various fields. Additionally, these enhanced data improve the input for application models, thereby influencing their prediction outcomes. Therefore, rather than using the raw S2S precipitation forecast information from climate models, it is important to enhance its predictability with post-processing techniques. In this context, the deep learning-based post-processing developed in this study is expected not only to improve S2S precipitation predictions but also to enhance the accuracy of application models and decision-making processes.

Ethical declaration

Informed consent was not required for this study, and no reason for ethical vetting was found because the study is based on a review of previously published material.

Funding

This research was funded by the Korea Meteorological Administration Research and Development Program “APEC Climate Center for Climate Information Services” under Grant (KMA2013-07510).

Additional information

The data inventory of the APEC Climate Center is based on the S2S database; the dataset was sourced from the Sub-seasonal to Seasonal (S2S) Prediction Project Database archive (https://apps.ecmwf.int/datasets/data/s2s/).

Data and code availability

Data will be made available on request.

CRediT authorship contribution statement

Uran Chung: Writing – review & editing, Writing – original draft, Visualization, Validation, Methodology, Formal analysis, Conceptualization. Jinyoung Rhee: Writing – review & editing, Project administration, Conceptualization. Miae Kim: Writing – review & editing. Soo-Jin Sohn: Writing – review & editing.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This research was funded by the Korea Meteorological Administration Research and Development Program “APEC Climate Center for Climate Information Services” under Grant (KMA2013-07510). This study used a data inventory constructed by the APEC Climate Center from the S2S database. We are grateful to the researchers who continuously collected and updated climate prediction data in the S2S database. We also thank the two editors for their English editing.

Footnotes

Appendix A

Supplementary data to this article can be found online at https://doi.org/10.1016/j.heliyon.2024.e35933.

Appendix A. Supplementary data

The following is the Supplementary data to this article:

Multimedia component 1
mmc1.docx (248.1KB, docx)

References

  • 1.Food and Agriculture Organization of the United Nations (FAO). The Impact of Disasters and Crises on Agriculture and Food Security: 2021. FAO; 2021. https://books.google.co.kr/books?id=3-YlEAAAQBAJ [Google Scholar]
  • 2.World Meteorological Organization State of the climate in Asia. 2021. https://public.wmo.int/en/our-mandate/climate/wmo-statement-state-of-global-climate/asia 1–40.
  • 3.Majaw B. Taylor & Francis; 2020. Climate Change in South Asia: Politics, Policies and the SAARC.https://books.google.co.kr/books?id=pgfoDwAAQBAJ [Google Scholar]
  • 4.Morita . In: Water Productivity and Food Security. 3. Kumar M.D., editor. vol. 3. Elsevier; 2021. Chapter 8 - measure for raising crop water productivity in south Asia and sub-saharan africa,”; pp. 157–196. (Current Directions in Water Scarcity Research). [DOI] [Google Scholar]
  • 5.Schiermeier Q. The real holes in climate science. Nature. 2010;463:284–287. doi: 10.1038/463284a. [DOI] [PubMed] [Google Scholar]
  • 6.Lecun Y., Bottou L., Bengio Y., Haffner P. Gradient-based learning applied to document recognition. Proc. IEEE. 1998;86(11):2278–2324. doi: 10.1109/5.726791. [DOI] [Google Scholar]
  • 7.Tran Q.-K., Song S. Computer vision in precipitation nowcasting: applying image quality assessment metrics for training deep neural networks. Atmosphere. 2019;10(5) doi: 10.3390/atmos10050244. [DOI] [Google Scholar]
  • 8.Zeiler M.D., Fergus R. In: Computer Vision -- ECCV 2014. Fleet D., Pajdla T., Schiele B., Tuytelaars T., editors. Springer International Publishing; Cham: 2014. Visualizing and understanding convolutional networks; pp. 818–833. [DOI] [Google Scholar]
  • 9.Wang Y., Ren H.L., Zhou F., et al. Multi-model ensemble sub-seasonal forecasting of precipitation over the maritime continent in boreal summer. Atmosphere. 2020;11(5):157–172. doi: 10.3390/atmos11050515. [DOI] [Google Scholar]
  • 10.Shi X., Chen Z., Wang H., Yeung D.-Y., Wong W., Woo W. Convolutional LSTM network: a machine learning approach for precipitation nowcasting. arXiv. 1506.0421. 2015 doi: 10.48550/arXiv.1506.04214. [DOI] [Google Scholar]
  • 11.Shi X., Gao Z., Lausen L., et al. Deep learning for precipitation nowcasting: a benchmark and a new model. Adv. Neural Inf. Process. Syst. 2017 (Nips) 2017:5618–5628. [Google Scholar]
  • 12.Agrawal S., Barrington L., Bromberg C., Burge J., Gazen C., Hickey J. Machine learning for precipitation nowcasting from radar images,”. 2019 doi: 10.48550/arXiv.1912.12132. ArXiv. 1912.1213. [DOI] [Google Scholar]
  • 13.Ayzel G., Scheffer T., Heistermann M. RainNet v1.0: a convolutional neural network for radar-based precipitation nowcasting. Geosci. Model Dev. (GMD) 2020;13(6):2631–2644. doi: 10.5194/gmd-13-2631-2020. [DOI] [Google Scholar]
  • 14.Mooers G., Rritchard M., Beucler T., et al. Assessing the potential of deep learning for emulating cloud superparameterization in climate models with real-geography boundary conditions. J. Adv. Model. Earth Syst. 2021;13(5):1–26. doi: 10.1029/2020MS002385. [DOI] [Google Scholar]
  • 15.Jeppesen J.H., Jacobsen R.H., Inceoglu F., Toftegaard T.S. A cloud detection algorithm for satellite imagery based on deep learning. Remote Sens. Environ. 2019;229:247–259. doi: 10.1016/j.rse.2019.03.039. [DOI] [Google Scholar]
  • 16.Radhakrishnan K., Scott K.A., Clausi D.A. Sea ice concentration estimation: using passive microwave and SAR data with a U-Net and curriculum learning. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021;14:5339–5351. doi: 10.1109/JSTARS.2021.3076109. [DOI] [Google Scholar]
  • 17.Badrinarayanan V., Kendall A., Cipolla R. SegNet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017;39(12):2481–2495. doi: 10.1109/TPAMI.2016.2644615. [DOI] [PubMed] [Google Scholar]
  • 18.Simonyan K., Zisserman A. Very deep convolutional networks for large-scale image recognition. 2015 doi: 10.48550/arXiv.1409.1556. arXiv :1409.1556. [DOI] [Google Scholar]
  • 19.Bihlo A., Popovych R.O. Physics-informed neural networks for the shallow-water equations on the sphere. J. Comput. Phys. 2022;456 doi: 10.1016/j.jcp.2022.111024. [DOI] [Google Scholar]
  • 20.Vitart F., Robertson A.W. The sub-seasonal to seasonal prediction project (S2S) and the prediction of extreme events. Clim. Atmos. Sci. 2018;1(1):3–2018. doi: 10.1038/s41612-018-0013-0. [DOI] [Google Scholar]
  • 21.Weyn J.A., Durran D.R., Caruana R., Cresswell-Clay N. Sub-seasonal forecasting with a large ensemble of deep-learning weather prediction models. J. Adv. Model. Earth Syst. 2021;13:7. doi: 10.1029/2021MS002502. [DOI] [Google Scholar]
  • 22.Specq D., Batté L. Improving subseasonal precipitation forecasts through a statistical–dynamical approach : application to the southwest tropical Pacific. Clim. Dynam. 2020;55(7):1913–1927. doi: 10.1007/s00382-020-05355-7. [DOI] [Google Scholar]
  • 23.Kim H., Ham Y.G., Joo Y.S., Son S.W. Deep learning for bias correction of MJO prediction. Nat. Commun. 2021;12(1):3087. doi: 10.1038/s41467-021-23406-3. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.de Andrade F.M., Coelho C.A.S., Cavalcanti I.F.A. Global precipitation hindcast quality assessment of the Subseasonal to Seasonal (S2S) prediction project models. Clim. Dynam. 2019;52(9–10):5451–5475. doi: 10.1007/s00382-018-4457-z. [DOI] [Google Scholar]
  • 25.IFS Documentation CY40R1 (4) ECMWF. 2014. ECMWF, IFS documentation CY40R1 - Part IV: physical processes. [DOI] [Google Scholar]
  • 26.Maclachlan C., Arribas A., Peterson K.A., et al. Global seasonal forecast system version 5 (GloSea5): a high-resolution seasonal forecast system. Q. J. R. Meteorol. Soc. 2015;141(689):1072–1084. doi: 10.1002/qj.2396. [DOI] [Google Scholar]
  • 27.Scaife A.A., Arribas A., Blockley E.W., et al. Skillful long-range prediction of European and North American winters. Geophys. Res. Lett. 2014;41(7):2514–2519. doi: 10.1002/2014GL059637. [DOI] [Google Scholar]
  • 28.Lin H., Gagnon N., Beauregard S., et al. GEPS-based monthly prediction at the Canadian Meteorological Centre. Mon. Weather Rev. 2016;144(12):4867–4883. doi: 10.1175/MWR-D-16-0138.1. [DOI] [Google Scholar]
  • 29.Saha S., Moorthi S., Wu X., et al. The NCEP climate forecast system version 2. J. Clim. 2014;27(6):2185–2208. doi: 10.1175/JCLI-D-12-00823.1. [DOI] [Google Scholar]
  • 30.Senior C.A., Andrews T., Burton C., et al. Idealized climate change simulations with a high-resolution physical model: HadGEM3-GC2. J. Adv. Model. Earth Syst. 2016;8(2):813–830. doi: 10.1002/2015MS000614. Jun. 2016. [DOI] [Google Scholar]
  • 31.Wu T., Song L., Li W., et al. An overview of BCC climate system model development and application for climate change studies. J. Meteorol. Res. 2014;28(1):34–56. doi: 10.1007/s13351-014-3041-7. [DOI] [Google Scholar]
  • 32.Wu T., Yi R., Lu Y., et al. BCC-CSM2-HR: a high-resolution version of the Beijing climate center climate system model. Geosci. Model Dev. (GMD) 2021;14(5):2977–3006. doi: 10.5194/gmd-14-2977-2021. [DOI] [Google Scholar]
  • 33.Mukherjee S., Ballav S., Soni S., Kumar K., Kumar De U. Investigation of dominant modes of monsoon ISO in the northwest and eastern Himalayan region. Theor. Appl. Climatol. 2016;125(3):489–498. doi: 10.1007/s00704-015-1512-0. [DOI] [Google Scholar]
  • 34.Hersbach H., Bell B., Berrisford P., et al. vol. 159. ECMWF Newsletter; 2019. pp. 17–24. (Global Reanalysis: Goodbye ERA-Interim, Hello ERA5). [DOI] [Google Scholar]
  • 35.Hersbach H., Bell B., Berrisford P., et al. The ERA5 global reanalysis. Q. J. R. Meteorol. Soc. 2020;146(730):1999–2049. doi: 10.1002/qj.3803. [DOI] [Google Scholar]
  • 36.Nogueira M. Inter-comparison of ERA-5, ERA-interim and GPCP rainfall over the last 40 years: process-based analysis of systematic and random differences. J. Hydrol. 2020;583 doi: 10.1016/j.jhydrol.2020.124632. [DOI] [Google Scholar]
  • 37.Krzywinski M., Altman N. Multiple linear regression. Nat. Methods. 2015;12(12):1103–1104. doi: 10.1038/nmeth.3665. [DOI] [PubMed] [Google Scholar]
  • 38.Uyanık G.K., Güler N. A study on multiple linear regression analysis. Procedia - Soc. Behav. Sci. 2013;106:234–240. doi: 10.1016/j.sbspro.2013.12.027. [DOI] [Google Scholar]
  • 39.Ronneberger O., Fischer P., Brox T. 2015. U-net: Convolutional Networks for Biomedical Image Segmentation. arXiv 1505.0459. [DOI] [Google Scholar]
  • 40.He S., Li X., DelSole T., Ravikumar P., Banerjee A. 2020. Sub-Seasonal Climate Forecasting via Machine Learning: Challenges, Analysis, and Advances, arXiv :2006.0797. [DOI] [Google Scholar]
  • 41.Kingma D.P., Ba J. 2017. Adam: a Method for Stochastic Optimization. arXiv 1412.6980. [DOI] [Google Scholar]
  • 42.Mehmood F., Ahmad S., Whangbo T.K. An efficient optimization technique for training deep neural networks. Mathematics. 2023;11(6) doi: 10.3390/math11061360. [DOI] [Google Scholar]
  • 43.Wilks D.S. In: Statistical Methods in the Atmospheric Sciences. fourth ed.), fourth ed. Wilks D.S., editor. Elsevier; 2019. Chapter 9 - forecast verification; pp. 369–483. [DOI] [Google Scholar]
  • 44.Sadeghi M., Nguyen P., Hsu K., Sorooshian S. Improving near real-time precipitation estimation using a U-Net convolutional neural network and geographical information. Environ. Model. Softw. 2020;134 doi: 10.1016/j.envsoft.2020.104856. [DOI] [Google Scholar]
  • 45.Schaefer J.T. The critical success index as an indicator of warning skill. Weather Forecast. 1990;5(4):570–575. doi: 10.1175/1520-0434(1990)005<0570:TCSIAA>2.0.CO;2. [DOI] [Google Scholar]
  • 46.Shivam K., Tzou J.-C., Wu S.-C. Multi-step short-term wind speed prediction using a residual dilated causal convolutional network with nonlinear attention. Energies. 2020;13:1772. doi: 10.3390/en13071772. [DOI] [Google Scholar]
  • 47.Liu Y., Bogaardt L., Attema J., Hazeleger W. Extended range arctic sea ice forecast with convolutional long-short term memory networks. Mon. Weather Rev. 2021;149(6):1673–1693. doi: 10.1175/MWR-D-20-0113.1. [DOI] [Google Scholar]
