. 2021 Feb 20;296:126425. doi: 10.1016/j.jclepro.2021.126425

Real-time CO2 emissions estimation in Spain and application to the COVID-19 pandemic

Luis FS Merchante a,b, Delia Clar a, Alberto Carnicero a,b, Francisco J Lopez-Valdes a,b, Jesús R Jimenez-Octavio a,b
PMCID: PMC8743041  PMID: 35035123

Abstract

CO2 emissions are one of the major contributors to global warming. The variety of emission sources and the nature of CO2 hinder estimating its concentration in real time and, therefore, adopting flexible policies that contribute to its control and, ultimately, reduce its effects. Spain is not exempt from this challenge: CO2 emissions are published only at the end of the year and as an aggregated value for the whole country, without recognising the existing differences between the regions (the so-called Autonomous Communities, AACC). The recent COVID-19 pandemic is a clear example of the need for accurate and fast estimation methods, so that policies can be tailored to the current status and not to a past one. This paper provides a method to estimate monthly emissions of CO2 for each AACC in Spain based on data that are published monthly by the relevant administrations. The paper discusses the approximations needed in the development of the method, predicts the drop in emissions due to the reduced industrial activity during the pandemic in Spain and provides an estimation of future emissions under three recovery scenarios after the pandemic.

Keywords: Forecast, COVID-19, CO2, Emissions

MSC: 00–02, 62–07

1. Introduction

Measuring greenhouse gas emissions has been one of the main concerns of many governments over the last decades. Although methane has a higher global warming potential, carbon dioxide (CO2) currently ranks first in affecting global warming due to its abundance in the atmosphere. In addition, CO2 is the primary greenhouse gas (GHG) originating from human activities (United States Environmental Protection Agency, 2019). The Environmental Protection Agency of the United States and the most recent report from the Mitigation of Climate Change working group of the Intergovernmental Panel on Climate Change (IPCC) of the United Nations identified that the vast majority of anthropogenic carbon dioxide emissions come from the combustion of fossil fuels (principally coal, oil, and natural gas), with additional contributions coming from deforestation, changes in land use, soil erosion and agriculture (including livestock) (United States Environmental Protection Agency, 2019; Mitigation of Climate Change working group of the Intergovernmental Panel on Climate Change, 2018).

Since 1970, CO2 emissions have increased by about 90%, with emissions from fossil fuel combustion and industrial processes contributing about 78% of the total GHG emissions increase from 1970 to 2011. The variety of sources and the non-anthropogenic CO2 that is naturally present in the atmosphere make it difficult to estimate its concentration in real time (Le Quéré et al., 2020).

Consequently, CO2 emission values are normally released in an aggregated manner at the end of each year. When the instantaneous concentration of CO2 emissions is required, estimations must be made using proxy data that are available in near real time. These estimations are usually based on satellite images (Doll et al., 2000; Meng et al., 2014; Ghosh et al.; Shi et al., 2016; Nassar et al., 2017), but other approaches use proxy variables such as the fractional change in activity levels for each sector (Le Quéré et al., 2020) or other socio-economic variables (Begum et al., 2015; Hong et al., 2018). In this regard, a great variety of short-term and long-term forecasting techniques have been used to estimate GHG emissions. A useful review can be found in Table 1 of Hong et al. (2018). Most of those techniques are based on Evolutionary Algorithms (Karabulut et al., 2008; Mousavi et al., 2014; Fang et al., 2018), although Artificial Neural Networks are also very popular in this domain (Behrang et al., 2011; Kankal et al., 2011; Ardakani and Ardehali, 2014; Guo et al., 2018; Heydari et al., 2019). Fewer references can be found on Support Vector Machines (Sun and Liu, 2016; Saleh et al., 2016; Ahmadi et al., 2019) or Regressions (Köne and Büke, 2010; Azadeh et al., 2017; Hosseini et al., 2019). Few studies have tested other Machine Learning techniques, especially those based on ensemble methods (Dietterich, 2000) such as Random Forest (Wei et al., 2018), AdaBoost (Zhou et al., 2019) or voting of Multi-layer Perceptron classifiers (Khan and Awasthi, 2019).

Table 1.

Main features of regression techniques used for forecasting emissions.

Technique | Description | Main features
Linear Regressor | Fits a linear model that minimizes the mean squared error | Provides a simple accuracy baseline
K-Nearest Neighbors Regressor | The value is predicted by local interpolation of the nearest data in the k-neighbourhood | Simple, non-parametric, robust to noisy data
Decision Trees Regressor | Non-parametric method that learns decision rules, resulting in local linear regressions | Non-parametric, interpretable
Random Forest Regressor | Ensemble of Decision Trees to improve generalizability | More resistant to overfitting, very stable
Gradient Boosting Regressor | Ensemble of weak decision tree models. New predictors are fitted to the mistakes made by previous predictors | Reduces bias and variance
Epsilon Support Vector Regressor | Fits a hyper-plane to the data transformed by an RBF kernel. Some error, controlled by epsilon, is tolerated | Computational complexity does not depend on the input data dimensionality
Kernel Ridge Regressor | Combines ordinary Least Squares with an L2 penalty on the coefficients and the kernel trick | Efficient non-linear fitting

An alternative to proxy variables is to use time series of CO2 emissions with Grey system theory and ARIMA models (Lin et al., 2011; Pao et al., 2012; Lotfalipour et al., 2013; García-Martos et al., 2013; Yuan et al., 2016; Pao and Tsai, 2011). Unfortunately, methods that estimate CO2 emissions based on trends in the historical time series are not reliable when extraordinary events, such as the COVID-19 pandemic, occur.

Indeed, a few months ago, the outbreak of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2 or COVID-19) and the global health crisis announced by the World Health Organization on March 11th, 2020 shattered many of these estimations. COVID-19 has spread around the planet and many governments have imposed lock-downs with different restrictions that severely affect mobility and industrial and economic activity (Bert et al., 2020; IHS Markit, 2020; Ozili & Arun, 0000; Bartik et al., 2020; del Rio-Chanona et al., 2020). These restrictions met the objective of reducing the spread of the virus, but they also had a significant impact in other areas, such as the environment, where emissions decreased to levels not seen in recent years. Thus, this unexpected scenario revealed the limitations of the aforementioned methods based on historical data to estimate CO2 emissions in real time.

This study focuses on the development of a method to estimate CO2 emission levels in Spain in near real time. According to data from the Emission Database for Global Atmospheric Research (Joint Research Centre, 2020) and the International Energy Agency (International Energy Agency, 2020), the historical trend of CO2 emissions for Spain is far from the figures of the top CO2 emitting countries (Shirmohammadi et al., 2020). Spain ranks 24th in contribution to global CO2 emissions, producing 0.7% of the global share (Worldometer, 2020). The main goal of this work is to determine whether proxy variables related to energy consumption can be considered sufficiently robust metrics for near real-time CO2 emission estimations. If the model works, it can be used to predict the effect of future policies on the generation of pollutants. To show the robustness of the model, it will be applied to modeling the CO2 emissions of the Autonomous Communities (AACC) of Spain and to forecasting these emissions under three potential scenarios of economic activity recovery after COVID-19.

2. Data and methodology

2.1. Data

Publicly available data for each Spanish AACC in the period 2011–2018 were retrieved from several sources (see Table 2). Feature selection techniques were applied, so that the CO2 emissions model for each AACC could use different predictors and thus help to reveal the nature of CO2 emissions within each one. The variables that were explored to model the CO2 emissions in the different Autonomous Communities are listed in Table 2.

Table 2.

Variables used to train the model.

Variable | Frequency | Absent AACC | Source
CO2 global anthropogenic | monthly | Ceuta | MITECO (1)
CO2 non renewable energy generation | monthly | Ceuta | REE (2)
Meteorology (average temperature and precipitations) | monthly | | AEMET (3)
GDP | annual | | INE (4)
GDP chained index | quarterly | Ceuta and Melilla | AIREF (5)
Population | annual | | INE (4)
GDP per capita (computed as GDP divided by Population) | annual | |
Number of accidents with victims on intercity roads | monthly | Melilla | DGT (6)
Number of victims (deaths, severe and minor) | monthly | Melilla | DGT (6)
Energy demand | monthly | | REE (2)
Number of passengers in urban bus trips | monthly | Baleares, Cantabria, La Rioja, Ceuta and Melilla | INE (4)
Number of real estate operations | quarterly | | MITMA (7)
Number of mortgages | monthly | | INE (4)
Overnight stays in hotel | monthly | | INE (4)
Retail sales index | monthly | | INE (4)
Services sector activity indicator | monthly | Ceuta and Melilla | INE (4)
Industry sector activity indicator | monthly | Ceuta and Melilla | INE (4)
Registered workers to Social Security System | monthly | | SS (8)
Consumption of petrol products for transportation | monthly | | CORE (9)
Consumption of petrol products for home heating and industry | monthly | | CORE (9)
Consumption of petrol products | monthly | | CORE (9)
Public administration and other resident sectors credits | monthly | | BE (10)
Public administration and other resident sectors deposits | monthly | | BE (10)
Deaths | monthly | | INE (4)
1 Spanish Ministry of Environment.
2 Red Eléctrica Española.
3 State Meteorological Agency.
4 National Statistics Institute.
5 Independent Authority of Fiscal Responsibility.
6 Directorate General for Traffic and Department of Security of Basque Government.
7 Ministry of Transports and Mobility.
8 Social Security System.
9 Strategic Oil Products Reserves Public Corporation.
10 Central Bank of Spain.

2.2. Apportioning yearly reported variables into monthly figures

The most recent information about CO2 emissions was retrieved from reports from the Ministry for Ecological Transition and Demographic Challenge (Ministerio para la Transición Ecológica y el Reto Demográfico, 2020), where equivalent emissions for AACC are provided from 1990 to 2018. It should be noted that the volume of CO2 is provided only annually. Since the model to be developed in this study seeks to predict the emissions on a monthly basis (to be able to detect the impact of rapid events such as the COVID-19 crisis), this variable needed to be transformed into monthly figures before being able to include it in the model.

To this end, an approach based on monthly reported energy indicators was followed. These indicators are published by Red Eléctrica Española (REE, the only high-voltage electricity transmission operator in Spain) (Red Eléctrica Española, 2020). REE reports the monthly emissions produced by non-renewable energy generation in Spain and the energy produced monthly by every AACC, disaggregated by energy type. The approximation is that the monthly distribution of the emissions from non-renewable energy generation also applies to the monthly distribution of global anthropogenic CO2. This hypothesis is supported by the very similar behaviour of both annual time series, as depicted in Fig. 1. If their annual values are closely correlated, it is reasonable to expect the same behaviour in their monthly distributions.

Fig. 1.

kTmCO2eq anthropogenic vs TmCO2eq from non-renewable generation (data standardized).

The approach followed to apportion the yearly amount of CO2 emissions into monthly figures is detailed in Algorithm 1.

Algorithm 1

Approach to apportion yearly reported emissions of CO2 in monthly figures

Image 1
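The content of Algorithm 1 is not reproduced in this text. As a rough illustration of the approximation stated above (the monthly distribution of emissions from non-renewable generation is assumed to apply to the total anthropogenic emissions), a minimal Python sketch could look as follows. The function name, the data layout and all numbers are hypothetical and are not taken from the paper.

```python
def apportion_yearly_total(yearly_total, monthly_proxy):
    """Split a yearly total into 12 monthly values proportionally to a proxy.

    yearly_total  : annual anthropogenic emissions (e.g. kt CO2-eq)
    monthly_proxy : 12 monthly emissions from non-renewable generation
    """
    if len(monthly_proxy) != 12:
        raise ValueError("expected 12 monthly proxy values")
    proxy_sum = sum(monthly_proxy)
    if proxy_sum <= 0:
        raise ValueError("proxy values must sum to a positive number")
    # Each month receives the share of the year that the proxy assigns to it.
    return [yearly_total * value / proxy_sum for value in monthly_proxy]


# Made-up numbers, for illustration only.
proxy = [9.0, 8.0, 7.0, 6.0, 5.0, 5.0, 6.0, 6.0, 5.0, 6.0, 8.0, 9.0]
monthly = apportion_yearly_total(1600.0, proxy)
```

By construction, the monthly values add up to the reported yearly total, so the apportioning only redistributes the annual figure and never changes it.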

2.3. Model construction, training and validation

In this study, it was considered that regression techniques and, more precisely, regression meta-algorithms based on ensembles could provide useful models without sacrificing accuracy. Several of those regression techniques were tested and compared by means of the R2 coefficient, thanks to recent implementations of Machine Learning libraries that allowed the authors to train numerous modeling techniques with little effort (Pedregosa et al., 2011; Pytorch, 2020; TensorFlow, 2020). The closer the R2 value is to one, the more accurate the prediction is. In this context, negative R2 values mean that the average of the data provides a better fit to the outcomes than the predicted values.
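To make the interpretation of R2 concrete, the following minimal Python sketch computes the coefficient as 1 - SS_res / SS_tot for three sets of predictions. The data are made up for illustration; the function is a plain re-implementation of the standard definition, not the code used in the study.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


observed = [1.0, 2.0, 3.0, 4.0]
good_fit = [1.1, 1.9, 3.2, 3.8]   # close to the observations
mean_fit = [2.5, 2.5, 2.5, 2.5]   # always predicts the mean of the data
poor_fit = [4.0, 3.0, 2.0, 1.0]   # worse than predicting the mean

r2_good = r2_score(observed, good_fit)
r2_mean = r2_score(observed, mean_fit)
r2_poor = r2_score(observed, poor_fit)
```

Predicting the mean yields an R2 of zero, so any model that does worse than the mean obtains a negative value. This is why R2 has no lower bound and why large negative values appear in the results tables.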

Before training, data were standardized (removing the mean and scaling to unit variance), as many regression techniques assume a Gaussian distribution of the attributes. This process is not required for regression algorithms that do not make this Gaussian assumption (e.g. Decision Trees), but it is mandatory for several others. The regression techniques used in this study were: K-Nearest Neighbors Regressor, Decision Trees, Random Forest and Gradient Boosting Regressors, Epsilon-Support Vector Regression, Linear and Kernel Ridge Regressors. Table 1 shows the main features of each technique. Modelling was combined with feature selection (by means of Sequential Forward Floating Selection (Pudil et al., 1994)) and hyper-parameter searching techniques to find the best fit. All the models were trained with data from years 2011–2016 and validated with data from 2017 to 2018. The process described above is summarized in Algorithm 2. It is only after training and validation that the model can be used to predict the emissions in 2019 and 2020.

Algorithm 2

Method followed for modelling CO2 emissions

Image 2
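The content of Algorithm 2 is not reproduced in this text. Two of the steps described above, standardization and the split by year into training (2011–2016) and validation (2017–2018) data, can be sketched in Python as follows. The records and helper names are hypothetical, and estimating the mean and standard deviation on the training years only is an assumption of this sketch; the text does not state how the scaling statistics were obtained.

```python
from statistics import mean, pstdev


def fit_standardizer(values):
    """Return (mean, std) estimated from the given values."""
    return mean(values), pstdev(values)


def standardize(values, mu, sigma):
    """Remove the mean and scale to unit variance."""
    return [(v - mu) / sigma for v in values]


# Hypothetical (year, predictor value) records; the numbers are made up.
records = [(2011, 10.0), (2012, 12.0), (2013, 14.0), (2014, 16.0),
           (2015, 18.0), (2016, 20.0), (2017, 22.0), (2018, 24.0)]

train = [v for year, v in records if 2011 <= year <= 2016]
valid = [v for year, v in records if 2017 <= year <= 2018]

# Assumption: scaling statistics come from the training years only,
# so no information from the validation years leaks into training.
mu, sigma = fit_standardizer(train)
train_std = standardize(train, mu, sigma)
valid_std = standardize(valid, mu, sigma)
```

Splitting by year rather than at random keeps the validation data strictly later than the training data, which mirrors how the model is later used for forecasting.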

2.4. Scenarios used to predict future CO2 emissions

Finally, the model obtained was used to predict the change in emission levels in three hypothetical scenarios that could occur during the recovery after the COVID-19 pandemic.

At the time of writing this document, the National Statistics Institute of Spain had estimated a −21.5% fall in Gross Domestic Product (GDP) for the 2nd quarter of 2020 (flash estimate) (Instituto Nacional de Estadística, 2020). Based on this estimate, this study hypothesized that, due to the pandemic, the GDP in Spain would remain at this level until the end of 2020. Under this assumption, three potential recovery scenarios were simulated, as follows:

  • Scenario 1, V-shape recovery: economic activity will get back to the level of January 2020 in a linear fashion by January 2022.

  • Scenario 2, Slow V-shape recovery: similar to the previous one, but full recovery will be reached by January 2023.

  • Scenario 3, U-shape recovery: the economic activity in January 2022 would be similar to that of January 2021, rising back to the values of January 2020 by January 2023.
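The three scenarios above can be written as monthly paths of economic activity relative to January 2020. The Python sketch below is one possible reading of them: it assumes that activity stays at 1 − 0.215 of the January 2020 level until January 2021, and that every recovery is linear between its start and end months. The function and variable names are hypothetical.

```python
DEPRESSED = 1.0 - 0.215   # activity relative to January 2020 (= 1.0)
N_MONTHS = 25             # January 2021 (index 0) to January 2023 (index 24)


def recovery_path(start_month, end_month, low=DEPRESSED, high=1.0):
    """Flat at `low` until start_month, linear rise to `high` at end_month."""
    path = []
    for m in range(N_MONTHS):
        if m <= start_month:
            path.append(low)
        elif m >= end_month:
            path.append(high)
        else:
            frac = (m - start_month) / (end_month - start_month)
            path.append(low + frac * (high - low))
    return path


scenarios = {
    "1_v_shape": recovery_path(0, 12),       # recovered by January 2022
    "2_slow_v_shape": recovery_path(0, 24),  # recovered by January 2023
    "3_u_shape": recovery_path(12, 24),      # flat in 2021, recovered by January 2023
}
```

Paths of this kind would then be used to build the hypothetical predictor values (for example GDP-related variables) that are fed to the trained models.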

3. Results

3.1. Model selection and results validation

As mentioned above, all the algorithms from Table 1 were trained and tested using 50 repetitions of a 5-fold cross-validation process. Each cross-validated training returned five R2 values, whose mean and standard deviation were averaged over the 50 repetitions. Table 3 shows the average and standard deviation values of the R2 coefficient. The best values were achieved by the Random Forest and Gradient Boosting Regressors. Eventually, a Gradient Boosting Regressor (Friedman, 1999, 2000) was chosen because its implementation allowed us to obtain a measure of uncertainty using lower and upper prediction intervals (Scikit-learn, 2020). As mentioned before, this technique combined with feature selection allowed the authors to identify the most relevant factors that explained CO2 emissions within each AACC. These variables are shown in Table 4.
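The evaluation protocol described above (repeated k-fold cross-validation, with the per-repetition mean and standard deviation of R2 averaged over the repetitions) can be sketched in plain Python. To keep the example self-contained, a one-predictor least-squares line stands in for the regressors of Table 1, and the data are synthetic; both are assumptions of this sketch, as is the use of the population standard deviation.

```python
import random
from statistics import mean, pstdev


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x (one predictor)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def r2_score(y_true, y_pred):
    my = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - my) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def repeated_kfold_r2(xs, ys, n_splits=5, n_repeats=50, seed=0):
    """Average, over the repetitions, the mean and std of the per-fold R2."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    rep_means, rep_stds = [], []
    for _ in range(n_repeats):
        rng.shuffle(indices)  # a new random partition for every repetition
        folds = [indices[i::n_splits] for i in range(n_splits)]
        scores = []
        for test_idx in folds:
            held_out = set(test_idx)
            train_idx = [i for i in indices if i not in held_out]
            a, b = fit_line([xs[i] for i in train_idx],
                            [ys[i] for i in train_idx])
            preds = [a + b * xs[i] for i in test_idx]
            scores.append(r2_score([ys[i] for i in test_idx], preds))
        rep_means.append(mean(scores))
        rep_stds.append(pstdev(scores))
    return mean(rep_means), mean(rep_stds)


# Synthetic data, for illustration only: a noisy linear relationship.
data_rng = random.Random(1)
xs = [float(i) for i in range(60)]
ys = [2.0 * x + data_rng.gauss(0.0, 3.0) for x in xs]
avg_r2, avg_std = repeated_kfold_r2(xs, ys)
```

Repeating the partition many times reduces the dependence of the reported score on one particular split, which matters when only a few years of monthly data are available.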

Table 3.

Average and standard deviation values of the R2 coefficient for the different models tested.


AACC | linear avg | linear std | knn avg | knn std | decisionTrees avg | decisionTrees std | randomForest avg | randomForest std | gradientBoosting avg | gradientBoosting std | svr avg | svr std | krr avg | krr std
Andalucia | −1.04 | 1.46 | 0.17 | 0.19 | 0.12 | 0.34 | 0.60 | 0.13 | 0.51 | 0.20 | −0.20 | 0.23 | −26.24 | 10.22
Aragon | −31.26 | 60.04 | −0.33 | 0.79 | −0.63 | 1.10 | 0.01 | 0.45 | 0.02 | 0.39 | −0.25 | 0.25 | −15.71 | 14.45
Cantabria | −1.63 | 1.77 | −0.28 | 0.38 | −1.00 | 1.12 | −0.18 | 0.22 | −0.37 | 0.39 | −0.13 | 0.12 | −26.94 | 12.50
Castilla la Mancha | −2.87 | 4.92 | −0.04 | 0.32 | −1.17 | 1.19 | −0.02 | 0.17 | −0.04 | 0.28 | −0.24 | 0.28 | −21.16 | 14.05
Castilla y Leon | −2.23 | 3.18 | 0.07 | 0.44 | −1.84 | 2.08 | −0.28 | 0.71 | −0.63 | 1.02 | −0.17 | 0.20 | −7.04 | 9.21
Cataluña | −1.85 | 3.15 | 0.14 | 0.30 | −0.44 | 0.82 | 0.14 | 0.31 | 0.11 | 0.35 | −0.20 | 0.17 | −54.26 | 21.47
Pais Vasco | −1.03 | 1.82 | −0.00 | 0.38 | −0.46 | 0.69 | 0.26 | 0.17 | 0.11 | 0.10 | −0.12 | 0.10 | −16.25 | 6.54
Principado de Asturias | −2.14 | 3.28 | 0.18 | 0.31 | −0.72 | 1.10 | 0.24 | 0.28 | 0.21 | 0.49 | −0.11 | 0.12 | −10.31 | 5.72
Comunidad de Madrid | −6.86 | 6.12 | −0.07 | 0.41 | −0.40 | 0.43 | 0.10 | 0.21 | 0.15 | 0.31 | −0.14 | 0.16 | −90.52 | 44.94
Comunidad de Navarra | −3.09 | 5.39 | −0.03 | 0.30 | −1.00 | 0.56 | −0.06 | 0.85 | −0.13 | 0.75 | −0.08 | 0.09 | −18.38 | 12.18
Comunidad Valenciana | −4.19 | 7.32 | 0.00 | 0.47 | −0.62 | 0.66 | 0.02 | 0.27 | 0.04 | 0.39 | −0.07 | 0.09 | −29.39 | 18.45
Extremadura | −1.42 | 1.32 | −0.22 | 0.83 | −1.49 | 1.31 | −0.42 | 0.53 | −0.77 | 0.71 | −0.10 | 0.12 | −15.91 | 8.55
Galicia | −2.59 | 5.34 | 0.24 | 0.24 | −0.03 | 0.24 | 0.42 | 0.12 | 0.39 | 0.14 | −0.13 | 0.16 | −10.49 | 2.18
Islas Baleares | −3.03 | 5.38 | 0.45 | 0.47 | 0.28 | 0.31 | 0.58 | 0.25 | 0.49 | 0.31 | −0.31 | 0.40 | −36.95 | 4.01
Islas Canarias | −8.70 | 13.21 | 0.15 | 0.28 | −0.49 | 0.51 | 0.25 | 0.22 | 0.08 | 0.36 | −0.14 | 0.25 | −66.52 | 16.03
La Rioja | −11.90 | 15.68 | −0.04 | 0.29 | −0.77 | 0.40 | 0.05 | 0.32 | −0.14 | 0.47 | −0.08 | 0.10 | −1.39 | 0.48
Region de Murcia | −3.79 | 6.53 | −0.18 | 0.47 | −0.33 | 0.63 | 0.19 | 0.25 | 0.18 | 0.23 | −0.14 | 0.14 | −21.17 | 13.26

Table 4.

Relevant variables per model.

AACC | var1 | var2 | var3 | var4 | var5 | var6 | var7 | var8 | var9 | var10 | var11
Andalucia | year | gbp | gbp chained | population | gbp p.c (1) | deposits | tCO2eq (2)
Aragon | gbp | credits | tCO2eq
Cantabria | gbp | population | gbp p.c | tCO2eq
Castilla la Mancha | year | gbp | population | tCO2eq
Castilla y Leon | year | month | gbp | gbp chained | population | gbp p.c | trx_inmob (3) | services idx (4) | credits | deposits | tCO2eq
Cataluña | gbp | population | tCO2eq
Pais Vasco | gbp | gbp chained | population | gbp p.c | trx_inmob | tCO2eq
Principado de Asturias | gbp | gbp p.c | tCO2eq
Comunidad de Madrid | year | population | gbp p.c | tCO2eq
Comunidad de Navarra | year | gbp p.c | tCO2eq
Comunidad Valenciana | year | gbp | tCO2eq
Extremadura | year | population | gbp p.c | tCO2eq
Galicia | population | mortgages | tCO2eq
Islas Baleares | year | gbp | population | tCO2eq
Islas Canarias | year | gbp | tCO2eq
La Rioja | gbp | population | gbp p.c | indexretail | tCO2eq
Region de Murcia | year | gbp | population | gbp p.c | tCO2eq
1 gbp p.c stands for GDP per capita.
2 tCO2eq stands for tCO2eq from non-renewable electric generation.
3 trx_inmob stands for number of real estate operations.
4 services idx stands for index of activity of the service sector.

As shown in Table 4, the models are of different complexity, as intended with the proposed feature selection methodology. The results also show that emissions from non-renewable electric generation are possibly the best predictor of CO2 emissions, because this variable (tCO2eq) was selected by all models.

The accuracy of the model can be quantified by means of the R2 coefficient, as shown in Table 5.

Table 5.

R-squared values per model (from best to worst accuracy).

Community | Accuracy (R2)
Galicia | 0.929011
Castilla la Mancha | 0.902679
Cataluña | 0.858921
Andalucía | 0.798672
Principado de Asturias | 0.796600
Aragón | 0.772803
Castilla y Leon | 0.753828
Islas Baleares | 0.749662
Islas Canarias | 0.735272
Comunidad de Madrid | 0.684010
Comunidad de Navarra | 0.631332
La Rioja | 0.559210
Extremadura | 0.459273
Comunidad Valenciana | 0.394295
Pais Vasco | 0.211449
Región de Murcia | −0.218722
Cantabria | −2.792710

3.2. Predicted CO2 emissions in each AACC in Spain in 2019 and 2020

Given the accuracy obtained in the validation, the model can be used to forecast CO2 values for 2019 and 2020. These estimations are shown in Figs. 2 and 3. These figures show validation data from 2017 to 2018 and forecast data for 2019 and 2020. The red lines are the values predicted by the models and the shaded regions are bounded by the lower and upper limits representing the 10th and 90th percentiles.
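Percentile bounds of this kind are commonly obtained by fitting one model per quantile with the quantile (pinball) loss. Whether the authors followed exactly this route is an assumption here; the text only states that the chosen implementation provides lower and upper prediction intervals. The minimal Python sketch below, with made-up observations, shows why minimizing this loss at levels 0.10 and 0.90 yields lower and upper bounds.

```python
def pinball_loss(y_true, y_pred, alpha):
    """Mean quantile (pinball) loss for a quantile level alpha in (0, 1)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        diff = t - p
        # Under-predictions cost alpha, over-predictions cost (1 - alpha).
        total += alpha * diff if diff >= 0 else (alpha - 1.0) * diff
    return total / len(y_true)


def best_constant(y_true, alpha):
    """Constant prediction, among the observed values, with the lowest loss."""
    candidates = sorted(set(y_true))
    return min(candidates,
               key=lambda c: pinball_loss(y_true, [c] * len(y_true), alpha))


# Made-up observations 0..10, for illustration only.
observations = [float(v) for v in range(11)]
lower = best_constant(observations, 0.10)   # low quantile of the data
upper = best_constant(observations, 0.90)   # high quantile of the data
```

With a level of 0.10 the loss penalizes over-prediction more heavily, which pushes the fitted value towards the lower tail of the data; a level of 0.90 does the opposite.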

Fig. 2.

Actual data (kTmCO2eq) along with predicted data for 2017 and 2018 and forecast values for 2019 and 2020 for each AACC model.

Fig. 3.

Aggregated actual data (kTmCO2eq) and predicted data for Spain.

CO2 emission values estimated for 2019 and 2020 cannot be validated since no official figures have been published to date. However, the model predicts a decrease in emissions during 2019, which is consistent with the fact that 58.6% of the electricity generated in Spain in 2019 did not emit CO2 because it came from renewable sources. The prediction also shows a decrease in emissions in the first half of 2020 that can possibly be linked to the COVID-19 pandemic.

3.3. Required pre-analysis for scenarios forecasting

The good fit of the model is strongly associated with the predictor quantifying the emissions from non-renewable electric generation. Thus, the estimation of the anthropogenic CO2 emitted in future scenarios requires computing the predictors based on the assumptions taken for each scenario, including estimating the amounts of non-renewable electric generation in these scenarios. To overcome this difficulty, a new set of models to predict the amount of CO2 emitted from non-renewable energies in each AACC was developed using the same techniques as those described in the Data and methodology section.

To validate the accuracy of these new models per AACC, the model used to predict the emissions of CO2 was run again using the predicted CO2 from non-renewable energy generation per AACC instead of the available reported data. As could be expected, the predictions were less precise than those obtained with the first set of models. Table 6 shows the variables that entered each model. More variables needed to be considered to reach reasonable results. Figs. 4 and 5 show the estimations obtained.

Table 6.

Relevant variables per model in the estimation of emitted CO2 on electric non-renewable generation.

AACC | var1 | var2 | var3 | var4 | var5 | var6 | var7 | var8 | var9 | var10
Andalucia | month | trx_inmob | ss affils | petrol_mov | petrol_indstr | deposits
Aragon | month | gbp | population | ss affils | credits
Cantabria | gbp | gbp chained | population | trx_inmob | indexretail | ss affils | petrol_indstr | credits
Castilla la Mancha | month | prec | population | energy_demand | ss affils | credits
Castilla y Leon | year | gbp | gbp chained | gbp p.c | services idx | ss affils | credits
Cataluña | month | population | credits | deposits
Pais Vasco | month | gbp | gbp chained | population | gbp p.c | hotelnights | ss affils
Principado de Asturias | month | gbp | indexindustry | petrol_indstr
Comunidad de Madrid | month | gbp | population | mortgages | credits
Comunidad de Navarra | year | month | gbp chained | indexretail | services idx | ss affils | petrol_mov | petrol_indstr | credits | deaths
Comunidad Valenciana | year | month | population | services idx
Extremadura | year | month | gbp | gbp chained | population | hotelnights | credits
Galicia | month | gbp | population | ss affils | deposits
Islas Baleares | gbp | population | energy_demand | trx_inmob | credits | deposits
Islas Canarias | year | month | gbp | population | gbp p.c | trx_inmob | credits
La Rioja | year | month | population | mortgages | services idx | petrol_indstr | credits
Region de Murcia | year | month | tmed | gbp chained | population | gbp p.c | indexindustry | petrol_mov

Fig. 4.

Actual data (TmCO2eq) on electric non-renewable generation along with predicted data for 2017 and 2018 and forecast values for 2019 and 2020 for each AACC model.

Fig. 5.

Aggregated actual data (TmCO2eq) on electric non-renewable generation and predicted data for Spain.

Even if this new set of models is not extremely accurate, it was considered that it can provide a reasonable approximation to the monthly trends of CO2 emissions for every AACC.

3.4. Scenarios

After the validation explained in the previous subsection, the trained model was used to predict the amount of CO2 (kTmCO2eq) that will be emitted to the atmosphere in each of the three scenarios described above until January 2023. Results are displayed in Fig. 6. Values from January 2019 until the moment of writing this document are colored in grey (estimations produced with actual values of the predictors). From then until January 2023, the values have also been produced by the above method, using hypothetical values of the predictors according to the different scenarios. Scenarios 1, 2 and 3 are colored in blue, orange and green, respectively.

Fig. 6.

Aggregated estimated data (kTmCO2eq) on anthropogenic emissions for every scenario.

The scenarios described in Section 2.4 (Scenarios used to predict future CO2 emissions) differ in the hypothesized recovery timeline of the GDP. The starting point is the second quarter of 2020, when the GDP was estimated to have fallen by 21.5% with respect to the same quarter of 2019. Scenario 1 is the most optimistic of the three, with a V-shape recovery; this scenario would reach by January 2022 the same levels of economic activity as those of January 2020. Scenarios 2 and 3 predict slower V-shaped and U-shaped recoveries, respectively, reaching the economic activity values observed in January 2020 three years later, that is, in January 2023.

From the emissions point of view, the best scenario would be scenario 3 (green curve, slow U-shaped recovery), since the integral of its curve accumulates the least amount of kTmCO2eq. Unfortunately, this scenario is also the worst from an economic perspective, as it assumes 12 months of recession at its lowest values and it is not until 2022 that the economy begins its recovery. In contrast, the desired scenario for the economic recovery of Spain would be scenario 1 (blue curve), which is also the one with the largest amount of CO2 emitted.
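The comparison by "integral of the curve" amounts to accumulating the monthly emissions of each scenario over the forecast horizon. A minimal Python sketch follows; the three short series are made-up numbers chosen only to illustrate the computation and do not correspond to the values in Fig. 6.

```python
def cumulative_emissions(monthly_values):
    """Approximate the integral of a monthly emissions curve by summing it."""
    return sum(monthly_values)


# Made-up monthly emissions (kt CO2-eq) for three hypothetical scenarios.
curves = {
    "scenario_1": [20.0, 22.0, 24.0, 26.0],
    "scenario_2": [20.0, 21.0, 22.0, 23.0],
    "scenario_3": [20.0, 20.0, 21.0, 22.0],
}
totals = {name: cumulative_emissions(values) for name, values in curves.items()}
lowest = min(totals, key=totals.get)
```

In this toy example the slowest recovery accumulates the least emissions, which is the same qualitative ordering described above.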

4. Discussion

The COVID-19 crisis has caused a decrease in pollutant emissions in many places in the world. It would be expected that, once the crisis is over, emissions will return to their original levels. However, it can be argued that this situation can also lead to substantial shifts in energy efficiency and to the development of alternative, cleaner energy sources (Le Quéré et al., 2020; Peters et al., 2011). In the case of COVID-19, for instance, it has been observed that the precautions taken to avoid infection caused a decrease in the use of public transportation associated with an increase in the use of new means of clean personal mobility such as e-scooters or e-bikes. It should be noted that restrictions to contain the propagation of COVID-19 change very rapidly depending on the growth of the infection, even on a weekly or monthly basis. This means that human and industrial activity may change dramatically in a very short period of time, making it more difficult to estimate CO2 emissions using traditional methods. This is why the approach shown in this work, even if it is only an approximation, can assist policy makers in predicting the impact of these restrictive policies on emissions.

Since AACC in Spain can implement policies to control CO2 emissions in their territory, this paper used data retrieved at the AACC level so that the models developed here could be used by the regional authorities to design custom emission regulation policies optimized for their individual characteristics. This approach resulted in training 19 different prediction models. From a modeling perspective, this is not a big issue, but having a different model for each AACC complicated drawing conclusions about their similarities and differences regarding CO2 emissions. Focusing on the variables that entered the anthropogenic CO2 and non-renewable CO2 estimation models, as shown in Tables 4 and 6, it was expected that the levels of CO2 coming from non-renewable generation would be the key predictor for the 19 anthropogenic models. However, the relevance of GDP-related variables was also significant and somewhat less predictable. It seems that, for those AACC with a quite predictable emission model of non-renewable CO2, GDP and time-of-the-year variables are enough to get an accurate prediction. For those communities with less predictable emissions, several extra variables related to financial products, real estate transactions or service sector levels were required in addition. It should also be noted that the estimation of non-renewable generation emissions was more challenging, as shown by the increased number of variables required per model. Variables related to energy demand, the number of mortgages or affiliations to the social security system then become more relevant. The information coded in those two tables could be of great value when designing emission reduction plans per AACC.

Two major difficulties were found: apportioning the yearly amount of CO2 emissions into monthly figures and the lack of mobility data available in Spain. To cope with the first one, a model to estimate these monthly figures was also developed. Unfortunately, despite the evident influence of traffic-related emissions on global CO2 emissions, the availability of information about mobility patterns per month and per AACC in Spain was very limited and did not allow us to establish any sensible method of estimating them. Initially, data related to the number of yearly crashes and victims were used as proxy variables for mobility, as there is literature pointing to the relationship between exposure and injuries (Segui-Gomez et al., 2011). However, even though these data are available up to 2018, they did not end up contributing to the significance of the models. Some local administrations report actual figures of private and public transportation use; unfortunately, the number of reporting cities is still too low to include this predictor in the models.

As mentioned above, one of the most significant predictors used in the model of anthropogenic CO2 emissions was the CO2 emissions produced in the generation of non-renewable electric energy. Recent publications describe energy generation as the second source of CO2 emissions in Spain (Ministry of Environment, 2020). However, generation values per AACC are not available and therefore published values could not be directly used as variables in the prediction models. To overcome this difficulty, we followed the approach described in the Data and methodology section, which required two hypotheses. The first one considers that the same distribution of energy per AACC applies to the CO2 emissions from the generation process. The second one considers that the monthly distribution of non-renewable emissions per AACC also applies to the monthly distribution of anthropogenic emissions. The first hypothesis seems quite reasonable. The second one is only supported by the similarity of the annual time series, which are the only actual data available. Even if this approach might present certain limitations, in the absence of other more detailed sources of data, it is the only way of estimating these values.

Although it is not possible to validate these assumptions, the model predicts the decrease of emissions due to new policies on polluting energy production during 2019 and also due to COVID-19 restrictions during 2020. These results were therefore consistent with reality. The strength of the approach is that, contrary to the total level of emissions, which is only reported on a yearly basis, the exact amount of CO2 produced in generation is reported monthly by the relevant authority in Spain and can therefore be used to predict the total amount of monthly CO2 emissions, which was the goal of this paper.

Last, one of the main contributions of this study is that the developed method allows economic recovery and anthropogenic emissions variables to be quantified at the same time. The manuscript includes three hypothesized scenarios to show the feasibility of the method, but the method can be applied to other scenarios with different recovery lengths or more complex hypotheses affecting the predictors, in order to find an optimal balance between economic growth and keeping CO2 emissions at an affordable level, an issue that has already been identified in previous literature (Linares and Romero, 2000; Guerra et al., 2016; Lopez-Pena et al., 2012). It should be noted that the three scenarios studied were designed based on the estimate of the GDP reduction in Spain provided by the National Statistics Institute of Spain (Instituto Nacional de Estadística, 2020). The scenarios could be updated for other values of GDP change, and the method developed here would serve as a tool to forecast interactions between economic recovery and CO2 emissions. For the proposed GDP reduction value, scenario 2 (orange curve in Fig. 6) might be a good compromise between a reasonable recovery of economic activity and controlled CO2 emissions. However, when the economic survival of an entire country is at stake, economic recovery is likely to be the priority.
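As a rough illustration of how scenarios with different recovery lengths could be generated for a predictor, the sketch below builds monthly paths that start from a reduced level and return to the baseline. The linear recovery shape, the scenario names, the 10% drop and the recovery lengths are all assumptions made for this example; they are not the scenarios or the GDP figure used in the paper.

```python
from typing import Dict, List


def recovery_path(
    baseline: float,
    drop_fraction: float,
    recovery_months: int,
    horizon_months: int,
) -> List[float]:
    """Monthly predictor levels that start at baseline * (1 - drop_fraction)
    and recover linearly to the baseline after `recovery_months` months."""
    if recovery_months <= 0:
        raise ValueError("recovery_months must be positive")
    path = []
    for month in range(horizon_months):
        progress = min(month / recovery_months, 1.0)  # 0.0 at the start, 1.0 once recovered
        path.append(baseline * (1.0 - drop_fraction * (1.0 - progress)))
    return path


# Hypothetical scenario definitions: recovery length in months.
scenario_lengths = {"fast": 6, "medium": 12, "slow": 24}

# Placeholder inputs: baseline index of 100 and a 10% initial drop.
scenarios: Dict[str, List[float]] = {
    name: recovery_path(baseline=100.0, drop_fraction=0.10,
                        recovery_months=months, horizon_months=24)
    for name, months in scenario_lengths.items()
}
```

Each path could then be supplied to a fitted emissions model as the future values of the corresponding predictor, so that the resulting emission curves can be compared across scenarios.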

5. Conclusion

This study has proposed a method that estimates CO2 emissions in near real time (monthly) for each AACC in Spain, based on proxy variables related to energy production and consumption. The method is flexible enough that each AACC can forecast the short-term effect of implementing diverse energy policies on CO2 emission levels. The method has been shown to be capable of capturing the effects of changes in regulation and of sudden events such as the COVID-19 pandemic. The model can also be used to produce a set of economic recovery scenarios after the pandemic, in which the participating variables can be estimated, to benchmark the effects of different energy policies on CO2 emissions. The advantage of a data-centric policy definition is that, if the hypotheses about the predictors turn out to be wrong, the scenarios can be updated with actual values whenever possible and policies can be dynamically improved.

Future work should include increasing the sophistication of the modeling techniques, for example by introducing Recurrent Neural Networks, which might better capture the correlation of the predictors and their evolution in time. In addition, including mobility data at the AACC level and over a number of years could contribute substantially to improving the predictive capabilities of the model.

CRediT authorship contribution statement

Luis F.S. Merchante: Conceptualization, Funding acquisition, Data curation, Supervision, Formal analysis. Delia Clar: Data curation. Alberto Carnicero: Conceptualization, Supervision, Formal analysis. Francisco J. Lopez-Valdes: Funding acquisition, Supervision, Formal analysis. Jesús R. Jimenez-Octavio: Conceptualization, Funding acquisition, Supervision, Formal analysis.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This research is part of the project MovilidAD uRbana post-covID19 (PP2020_06), funded by Universidad Pontificia Comillas. The authors are thankful for the support to carry out this research. The content of this manuscript is solely the responsibility of the authors and does not necessarily reflect the position of the funding institution.

Handling editor: Dr Sandra Caeiro

Footnotes

1. Autonomous Communities are a level of political and administrative division similar, to a greater or lesser extent, to the French Departments or the Länder in Germany.

References

  1. Ahmadi M.H., Dehghani Madvar M., Sadeghzadeh M., Rezaei M.H., Herrera M., Shamshirband S. Current status investigation and predicting carbon dioxide emission in Latin American countries by connectionist models. Energies. 2019;12:1916.
  2. Ardakani F., Ardehali M. Long-term electrical energy consumption forecasting for developing and developed economies based on different optimized models and historical data types. Energy. 2014;65:452–461.
  3. Azadeh A., Jafari-Marandi R., Abdollahi M., Roudi E. A novel benchmark methodology for estimating industrial electricity demand considering unsteady socio-economic conditions. Int. J. Bus. Perform. Manag. 2017;18:196–215.
  4. Bartik A.W., Bertrand M., Cullen Z., Glaeser E.L., Luca M., Stanton C. The impact of COVID-19 on small business outcomes and expectations. 2020;117:17656–17666. doi: 10.1073/pnas.2006991117.
  5. Begum R.A., Sohag K., Abdullah S.M.S., Jaafar M. CO2 emissions, energy consumption, economic and population growth in Malaysia. Renew. Sustain. Energy Rev. 2015;41:594–601.
  6. Behrang M., Assareh E., Assari M., Ghanbarzadeh A. Using bees algorithm and artificial neural network to forecast world carbon dioxide emission. Energy Sources, Part A Recovery, Util. Environ. Eff. 2011;33:1747–1759.
  7. Bert J., Schellong D., Hagenmaier M., Hornstein D., Wegscheider A.K., Palme T. 2020. How COVID-19 Will Shape Urban Mobility. https://image-src.bcg.com/Images/BCG-How-COVID-19-Will-Shape-Urban-Mobility-Jun-2020_tcm9-251727.pdf Accessed: 2020-08-30.
  8. del Rio-Chanona R.M., Mealy P., Pichler A., Lafond F., Farmer D. 2020. Supply and Demand Shocks in the COVID-19 Pandemic: an Industry and Occupation Perspective. arXiv preprint arXiv:2004.06759.
  9. Dietterich T.G. Ensemble methods in machine learning. In: International Workshop on Multiple Classifier Systems. Springer; 2000. pp. 1–15.
  10. Doll C.H., Muller J.-P., Elvidge C.D. Night-time imagery as a tool for global mapping of socioeconomic parameters and greenhouse gas emissions. 2000;29:157–162. doi: 10.1579/0044-7447-29.3.157.
  11. Fang D., Zhang X., Yu Q., Jin T.C., Tian L. A novel method for carbon dioxide emission forecasting based on improved Gaussian processes regression. J. Clean. Prod. 2018;173:143–150.
  12. Friedman J.H. 1999. Stochastic Gradient Boosting.
  13. Friedman J.H. 2000. Greedy Function Approximation: a Gradient Boosting Machine.
  14. García-Martos C., Rodríguez J., Sánchez M.J. Modelling and forecasting fossil fuels, CO2 and electricity prices and their volatilities. 2013;101:363–375.
  15. Ghosh T., Elvidge C.D., Sutton P.C., Baugh K.E., Ziskin D., Tuttle B.T. Creating a global grid of distributed fossil fuel CO2 emissions from nighttime satellite imagery. 3:1895–1913. doi: 10.3390/en3121895.
  16. Guerra O.J., Tejada D.A., Reklaitis G.V. An optimization framework for the integrated planning of generation and transmission expansion in interconnected power systems. Appl. Energy. 2016;170:1–21. doi: 10.1016/j.apenergy.2016.02.014.
  17. Guo D., Chen H., Long R. Can China fulfill its commitment to reducing carbon dioxide emissions in the Paris Agreement? Analysis based on a back-propagation neural network. Environ. Sci. Pollut. Control Ser. 2018;25:27451–27462. doi: 10.1007/s11356-018-2762-z.
  18. Heydari A., Garcia D.A., Keynia F., Bisegna F., De Santoli L. Renewable energies generation and carbon dioxide emission forecasting in microgrids and national grids using GRNN-GWO methodology. Energy Procedia. 2019;159:154–159.
  19. Hong T., Jeong K., Koo C. An optimized gene expression programming model for forecasting the national CO2 emissions in 2030 using the metaheuristic algorithms. 2018;228:808–820. doi: 10.1016/j.apenergy.2018.06.106.
  20. Hosseini S.M., Saifoddin A., Shirmohammadi R., Aslani A. Forecasting of CO2 emissions in Iran based on time series and regression analysis. Energy Rep. 2019;5:619–631.
  21. IHS Markit. 2020. COVID-19: the Future Mobility Delusion. https://www.autonews.com/sponsored/covid-19-future-mobility-delusion Accessed: 2020-08-30.
  22. Instituto Nacional de Estadística. 2020. National Accounts, Press Section. https://www.ine.es/en/prensa/pib_prensa_en.htm Accessed: 2020-08-30.
  23. International Energy Agency. 2020. CO2 Emissions from Fuel Combustion. https://www.iea.org/reports/co2-emissions-from-fuel-combustion-overview Accessed: 2020-08-30.
  24. Joint Research Centre. 2020. Emissions Database for Global Atmospheric Research. https://data.jrc.ec.europa.eu/collection/edgar Accessed: 2020-08-30.
  25. Kankal M., Akpınar A., Kömürcü M.İ., Özşahin T.Ş. Modeling and forecasting of Turkey’s energy consumption using socio-economic and demographic variables. Appl. Energy. 2011;88:1927–1939.
  26. Karabulut K., Alkan A., Yilmaz A.S. Long term energy consumption forecasting using genetic programming. Math. Comput. Appl. 2008;13:71–80.
  27. Khan M.J.U.R., Awasthi A. Machine learning model development for predicting road transport GHG emissions in Canada. WSB Journal of Business and Finance. 2019;53:55–72.
  28. Köne A.Ç., Büke T. Forecasting of CO2 emissions from fuel combustion using trend analysis. Renew. Sustain. Energy Rev. 2010;14:2906–2915.
  29. Le Quéré C., Jackson R.B., Jones M.W., Smith A.J., Abernethy S., Andrew R.M., De-Gol A.J., Willis D.R., Shan Y., Canadell J.G., et al. Temporary reduction in daily global CO2 emissions during the COVID-19 forced confinement. Nat. Clim. Change. 2020:1–7.
  30. Lin C.-S., Liou F.-M., Huang C.-P. Grey forecasting model for CO2 emissions. A Taiwan study. 2011;88:3816–3820. doi: 10.1016/j.apenergy.2011.05.013.
  31. Linares P., Romero C. A multiple criteria decision making approach for electricity planning in Spain: economic versus environmental objectives. J. Oper. Res. Soc. 2000;51:736–743. doi: 10.1057/palgrave.jors.2600944.
  32. Lopez-Pena A., Perez-Arriaga I., Linares P. Renewables vs. energy efficiency: the cost of carbon emissions reduction in Spain. Energy Pol. 2012;50:659–668. doi: 10.1016/j.enpol.2012.08.006.
  33. Lotfalipour M.R., Falahi M.A., Bastam M. Prediction of CO2 emissions in Iran using grey and ARIMA models. Int. J. Energy Econ. Pol. 2013;3:229.
  34. Meng L., Graus W., Worrell E., Huang B. Estimating CO2 (carbon dioxide) emissions at urban scales by DMSP/OLS (Defense Meteorological Satellite Program’s Operational Linescan System) nighttime light imagery: methodological challenges and a case study for China. 2014;71:468–478. doi: 10.1016/j.energy.2014.04.103.
  35. Ministerio para la Transición Ecológica y el Reto Demográfico. 2020. Emisiones del gei por comunidades autónomas a partir del inventario español - serie 1990-2018. https://www.miteco.gob.es/es/calidad-y-evaluacion-ambiental/temas/sistema-espanol-de-inventario-sei-/emisionesgeiporccaaserie1990-2018_tcm30-508282.pdf Accessed: 2020-08-30.
  36. Ministry of Environment. 2020. Plan Nacional integrado de Energía y Clima (PNIEC) 2021-2030. Technical Report.
  37. Mitigation of Climate Change working group of the Intergovernmental Panel on Climate Change. 2018. Drivers, Trends and Mitigation. https://www.ipcc.ch/site/assets/uploads/2018/02/ipcc_wg3_ar5_chapter5.pdf Accessed: 2020-08-30.
  38. Mousavi S.M., Mostafavi E.S., Hosseinpour F. Gene expression programming as a basis for new generation of electricity demand prediction models. Comput. Ind. Eng. 2014;74:120–128.
  39. Nassar R., Hill T.G., McLinden C.A., Wunch D., Jones D.B.A., Crisp D. Quantifying CO2 emissions from individual power plants from space. 2017;44.
  40. Ozili P.K., Arun T. Spillover of COVID-19: Impact on the Global Economy. doi: 10.2139/ssrn.3562570.
  41. Pao H.-T., Tsai C.-M. Modeling and forecasting the CO2 emissions, energy consumption, and economic growth in Brazil. 2011;36:2450–2458. doi: 10.1016/j.energy.2011.01.032.
  42. Pao H.-T., Fu H.-C., Tseng C.-L. Forecasting of CO2 emissions, energy consumption and economic growth in China using an improved grey model. 2012;40:400–409. doi: 10.1016/j.energy.2012.01.037.
  43. Pedregosa F., Varoquaux G., Gramfort A., Michel V., Thirion B., Grisel O., Blondel M., Prettenhofer P., Weiss R., Dubourg V., Vanderplas J., Passos A., Cournapeau D., Brucher M., Perrot M., Duchesnay E. Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 2011;12:2825–2830.
  44. Peters G.P., Minx J.C., Weber C.L., Edenhofer O. Growth in emission transfers via international trade from 1990 to 2008. Proc. Natl. Acad. Sci. Unit. States Am. 2011;108:8903–8908. doi: 10.1073/pnas.1006388108.
  45. Pudil P., Novovičová J., Kittler J. Floating search methods in feature selection. 1994;15:1119–1125. doi: 10.1016/0167-8655(94)90127-9.
  46. Pytorch. 2020. Pytorch, from Research to Production. https://pytorch.org/ Accessed: 2020-08-30.
  47. Red Eléctrica Española. 2020. Redata Api from Red Eléctrica Española. https://www.ree.es/en/apidatos Accessed: 2020-08-30.
  48. Saleh C., Dzakiyullah N.R., Nugroho J.B. Carbon dioxide emission prediction using support vector machine. IOP Conf. Ser. Mater. Sci. Eng. 2016;114:012148.
  49. Scikit-learn. 2020. Gradient Boosting Regressor Implementation from Scikit-Learn Library. https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html Accessed: 2020-08-30.
  50. Segui-Gomez M., Lopez-Valdes F.J., Guillen-Grima F., Smyth E., Llorca J., de Irala J. Exposure to traffic and risk of hospitalization due to injuries. Risk Anal. 2011;31:466–474. doi: 10.1111/j.1539-6924.2010.01509.x.
  51. Shi K., Chen Y., Yu B., Xu T., Chen Z., Liu R., Li L., Wu J. Modeling spatiotemporal CO2 (carbon dioxide) emission dynamics in China from DMSP-OLS nighttime stable light data using panel data analysis. 2016;168:523–533. doi: 10.1016/j.apenergy.2015.11.055.
  52. Shirmohammadi R., Aslani A., Ghasempour R. Challenges of carbon capture technologies deployment in developing countries. Sustainable Energy Technologies and Assessments. 2020;42:100837.
  53. Sun W., Liu M. Prediction and analysis of the three major industries and residential consumption CO2 emissions based on least squares support vector machine in China. J. Clean. Prod. 2016;122:144–153.
  54. TensorFlow. 2020. Tensorflow, the Deep Learning Library for Python. https://www.tensorflow.org/ Accessed: 2020-08-30.
  55. United States Environmental Protection Agency. 2019. Global Greenhouse Gas Emissions Data. https://www.epa.gov/ghgemissions/global-greenhouse-gas-emissions-data Accessed: 2020-08-30.
  56. Wei S., Yuwei W., Chongchong Z. Forecasting CO2 emissions in Hebei, China, through moth-flame optimization based on the random forest and extreme learning machine. Environ. Sci. Pollut. Control Ser. 2018;25:28985–28997. doi: 10.1007/s11356-018-2738-z.
  57. Worldometer. 2020. CO2 Emissions. https://www.worldometers.info/co2-emissions/ Accessed: 2020-08-30.
  58. Yuan C., Liu S., Fang Z. Comparison of China’s primary energy consumption forecasting by using ARIMA (the autoregressive integrated moving average) model and GM(1,1) model. 2016;100:384–390. doi: 10.1016/j.energy.2016.02.001.
  59. Zhou J., Xu X., Li W., Guang F., Yu X., Jin B. Forecasting CO2 emissions in China’s construction industry based on the weighted AdaBoost-ENN model and scenario analysis. J. Energy. 2019;2019. 12 pages. doi: 10.1155/2019/8275491.

Articles from Journal of Cleaner Production are provided here courtesy of Elsevier
