Ann N Y Acad Sci. 2024 Oct 30;1541(1):230–242. doi: 10.1111/nyas.15243

Autoencoder‐based flow‐analogue probabilistic reconstruction of heat waves from pressure fields

Jorge Pérez‐Aracil 1, Cosmin M Marina 1, Eduardo Zorita 2, David Barriopedro 3, Pablo Zaninelli 3, Matteo Giuliani 4, Andrea Castelletti 4, Pedro A Gutiérrez 5, Sancho Salcedo‐Sanz 1
PMCID: PMC11580787  PMID: 39476218

Abstract

This paper presents a novel hybrid approach for the probabilistic reconstruction of meteorological fields based on the combined use of the analogue method (AM) and deep autoencoders (AEs). The AE–AM algorithm trains a deep AE in the predictor fields, which the encoder filters towards a compressed space of reduced dimensionality. The AM is then applied in this latent space to find similar situations (analogues) in the historical record, from which the target field can be reconstructed. The AE–AM is compared to the classical AM, in which flow analogues are explicitly searched in the fully resolved field of the predictor, which may contain useless information for the reconstruction. We evaluate the performance of these two approaches in reconstructing the daily maximum temperature (target) from sea‐level pressure fields (predictor) recorded during eight major European heat waves of the 1950–2010 period. We show that the proposed AE–AM approach outperforms the standard AM algorithm in reconstructing the magnitude and spatial pattern of the considered heat wave events. The improvement ranges from 7% to 22% in skill score, depending on the heat wave analyzed, demonstrating the potential added value of the hybrid method.

Keywords: analogue method, autoencoders, field reconstruction, heat waves



INTRODUCTION

Heat waves 1 , 2 , 3 and, in the last decades, mega‐heat waves 4 , 5 are extreme atmospheric events with the potential to have profound social impacts and to pose significant risks to people, 6 particularly the elderly. 7 , 8 Consequently, analyzing, predicting, and attributing heat waves are prominent topics in atmospheric extreme event research. 9 , 10 , 11 Researchers are interested in understanding both the natural causes of these events, such as circulation patterns, 12 , 13 , 14 and also the anthropogenic contribution to their frequency of occurrence or intensity. 15

The literature has addressed the reconstruction and analysis of heat waves using various methodologies, including flow‐analogue approaches, 16 , 17 , 18 , 19 , 20 which are often employed to assess the contribution of thermodynamic changes to the observed magnitude of exceptional heat wave events. The analogue method (AM) is a classical statistical downscaling technique widely applied to field reconstruction problems. The primary objective of the AM is to predict local meteorological variables based on large‐scale, often synoptic, predictors. However, it can also be applied to reconstruct large‐scale fields from point observations. 21 Researchers have extensively utilized the AM in numerous meteorology‐ and climatology‐related issues, including the comparison of meteorological datasets such as reanalysis products, 22 precipitation prediction and downscaling, 22 , 23 , 24 , 25 solar radiation and wind speed forecasting, 26 , 27 , 28 , 29 , 30 visibility and fog formation prediction, 31 air quality forecasting, 32 and temperature fields reconstruction, 33 as well as extreme event prediction 34 , 35 and attribution problems. 36

At its core, the AM is based on the K‐nearest neighbor (KNN) algorithm, 37 founded on the hypothesis that two similar large‐scale states of the atmosphere lead to similar local effects. 21 More specifically, two atmospheric states are considered analogues when they exhibit resemblance in terms of a similarity criterion and objective variables. Hence, the AM involves searching for specific past situations in a meteorological archive that exhibit properties similar to the target situation based on chosen predictors or variables.

Although the AM has remained largely unchanged since its inception, there have been attempts to enhance its performance for specific applications. For example, Delle Monache et al. 38 proposed an ensemble of analogues, which can be used to estimate the probability distribution of the future state of the atmosphere. The AM‐based ensemble method is typically derived from large datasets such as numerical weather models. 39 , 40 Machine learning has been responsible for the most recent advancements in analogue modeling and the analysis of extreme events. 41 For instance, Horton et al. 42 proposed optimizing AM parameters with a genetic algorithm to improve precipitation prediction in the Alps. Other works have proposed methods for reducing the dimensionality of the data. 43 For example, Grooms and co‐workers 44 , 45 have proposed a novel use of AM with variational autoencoders (VAEs) to create an optimal interpolation ensemble in a data assimilation problem. This methodology involves training an autoencoder (AE) 46 , 47 on the available data to generate a latent space (encoder part of the AE) from which an ensemble is constructed. The decoder part of the AE is then used to create an analogue‐based ensemble of reconstructions with improved performance compared to previous approaches, such as the classical AM. Miloshevich et al. 48 have recently applied a method based on VAEs, convolutional networks, and AM to obtain the probabilities of prolonged heat waves in France and Scandinavia based on a multivariable predictor dataset of temperature, soil moisture, and atmospheric circulation fields.

This paper introduces a novel machine learning‐based hybrid approach for analyzing extreme events. It combines deep AEs and the AM to deliver a probabilistic temperature reconstruction during heat waves from flow analogues. The idea is that the latent space of an AE is a compact representation of the data so that the AM can find better analogues by working on this optimized space of the predictor rather than on spatially resolved fields, which involves equal treatment of all grid points without discrimination of the relevant information. Our AE–AM approach uses the AE's encoder to obtain a filtered version of the predictor field and the AM to find similar configurations of the predictor in the historical record. We evaluate this approach during eight heat waves that affected Europe, showing that the proposed AE–AM algorithm improves the performance of the reconstruction of all heat wave events in terms of different error metrics.

The rest of the paper is structured as follows: The second section describes the methods applied in this work, mainly the classical AM and the proposed AE–AM approach. The third section describes the data available and the methodology followed for this study. The fourth section presents and discusses the results obtained in the probabilistic reconstruction of eight significant European heat waves. The last section closes the paper by giving some conclusions and remarks on the research carried out.

METHODS

This section first introduces the AM and its application to the reconstruction of heat waves, then describes the fundamental concepts of AEs, and finally presents the novel hybrid AE–AM approach, which combines deep AEs with the AM.

The analogue method

As previously stated, the AM is based on the principle that two similar atmosphere states lead to similar local effects. 49 More specifically, two states of the atmosphere are considered analogues when there is a resemblance between them in terms of an analogy criterion and objective variables. Thus, the AM 50 searches for a certain number of past situations in a meteorological archive with similar properties (i.e., predictors) to those of the target situation to reconstruct. That way, the AM reconstructs the expected conditions of the target in the presence of the predictor (e.g., the temperature patterns that are compatible with a given atmospheric circulation).

Let $\mathcal{F}(\mathcal{X},t_i)$ be a given spatiotemporal field of an atmospheric variable describing the characteristics of the atmospheric circulation (usually the spatial pattern of sea level pressure (SLP) or geopotential height), where $\mathcal{X}$ stands for a two‐dimensional grid or matrix of measurement points where $\mathcal{F}$ is defined, occurring at $t_i$, in which $1 \le i \le M$, where $M$ is the maximum number of events considered. Let $F(X,t)$ be a field of an atmospheric variable defining a given event of interest (e.g., maximum temperature (Tmax) in the case of a heat wave), defined on a subgrid $X \subset \mathcal{X}$ and at a specific time $t$. An example of the search region (grid $\mathcal{X}$) and reconstruction region (subgrid $X$) is shown in Figure 1 for the heat wave of France 2003. As can be seen, the predictor field is defined on the larger grid $\mathcal{X}$, whereas the reconstructed field of the heat wave (Tmax) is defined on the smaller grid $X$.

FIGURE 1.

Example of search and reconstruction regions (larger predictor grid $\mathcal{X}$ and smaller target subgrid $X$, respectively) defined to analyze the heat wave of France 2003 with the AM algorithm.

The AM (classic approach) first obtains the time steps of the historical record for which the predictor field on the search grid $\mathcal{X}$ was similar to that observed at the time $t$ of the event $F(X,t)$. For that purpose, the KNN algorithm is usually employed. The algorithm calculates the distance between the predictor field of the targeted event and those of the $M$ events considered. Then, the $K$ closest events are retained, with $K<M$. This process considers all evaluation periods of the database, except those coming from the event year (i.e., in our example of France 2003, that would correspond to all summer days of the analyzed period, except for 2003). In other words, the synoptic situation accompanying the specific event of interest is used as a reference, and similar situations are sought in the dataset. Thus, let $\mathcal{F}(\mathcal{X},t_1),\ldots,\mathcal{F}(\mathcal{X},t_K)$ be the $K$ most similar situations (closest neighbors, or flow analogues) of the predictor field $\mathcal{F}(\mathcal{X},t)$ at time $t$. A possible reconstruction of the target variable defining the event at that time is given by

$$\hat{F}(X)=\frac{1}{K}\sum_{j=1}^{K}F(X,t_j), \tag{1}$$

in which $\hat{F}(X)$ is the reconstructed field on the target region $X$, and $F(X,t_j)$ is the target field associated with the predictor analogue $\mathcal{F}(\mathcal{X},t_j)$, with $1 \le j \le K$. The reconstruction error for the considered event can be measured as the deviation between the reconstruction from the $K$ analogues and the original field $F(X,t)$, usually as a mean absolute error (MAE) in the region of interest ($X$):

$$e=\frac{1}{|X|}\sum_{X}\left|F(X,t)-\hat{F}(X)\right|, \tag{2}$$

where $|X|$ stands for the number of nodes/measurement points in $X$. Other error metrics, such as the mean square error, can be employed instead. To obtain a probabilistic reconstruction, $N$ uniformly random extractions can be drawn from the $K$ similar situations.
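The steps of the classical AM described above can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic data, not the authors' implementation: the function names, the Euclidean distance as similarity criterion, and the reading of the probabilistic step (each extraction picks one of the K analogues uniformly at random) are assumptions made here for the example.

```python
import numpy as np


def find_analogues(predictor_archive, predictor_event, k):
    """Indices of the k archive fields closest to the event's predictor field.

    Euclidean distance over all grid points is an assumed similarity criterion.
    """
    flat = predictor_archive.reshape(len(predictor_archive), -1)
    dist = np.linalg.norm(flat - predictor_event.ravel(), axis=1)
    return np.argsort(dist)[:k]


def reconstruct(target_archive, analogue_idx):
    """Equation (1): average of the target fields of the selected analogues."""
    return target_archive[analogue_idx].mean(axis=0)


def mae(target_field, reconstruction):
    """Equation (2): mean absolute error over the target grid."""
    return np.abs(target_field - reconstruction).mean()


def random_extractions(target_archive, analogue_idx, n_draws, rng):
    """Assumed probabilistic step: each draw uses the target field of one
    analogue chosen uniformly at random among the K candidates."""
    picks = rng.choice(analogue_idx, size=n_draws, replace=True)
    return target_archive[picks]


# Synthetic data: the event is a perturbed copy of archive day 7, only so that
# the nearest neighbour is predictable. In the paper the event year is excluded.
rng = np.random.default_rng(0)
M, K, N = 200, 20, 1000
slp_archive = rng.normal(size=(M, 20, 30))    # predictor fields, larger grid
tmax_archive = rng.normal(size=(M, 5, 8))     # target fields, smaller subgrid
slp_event = slp_archive[7] + 0.01 * rng.normal(size=(20, 30))
tmax_event = tmax_archive[7] + 0.01 * rng.normal(size=(5, 8))

idx = find_analogues(slp_archive, slp_event, K)
tmax_hat = reconstruct(tmax_archive, idx)
error = mae(tmax_event, tmax_hat)
draws = random_extractions(tmax_archive, idx, N, rng)
print(idx[:5], round(float(error), 3), draws.shape)
```

With real data, `slp_archive` and `tmax_archive` would hold the daily reanalysis fields of the candidate days, with the days of the event year removed before the search.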

Autoencoders

An AE is a deep neural network approach for dimensionality reduction of data. 51 , 52 It comprises two parts (Figure 2): the encoder and the decoder. The encoder maps the input to an intermediate representation, the latent space, which can be understood as a compact and meaningful representation of the data. The decoder then maps this representation back to reconstruct the input data, as shown in Figure 2.

FIGURE 2.

AE model. It reconstructs the input predictor field $\mathcal{F}(\mathcal{X},t)$ by extracting meaningful features represented in the latent space $Z$. The output is the reconstructed field.

In the classical AM approach summarized above, the predictor field $\mathcal{F}(\mathcal{X},t_i)$ is directly used to obtain similar situations in the dataset to the target situation $\mathcal{F}(\mathcal{X},t)$. We propose to use an AE to transform the predictor field $\mathcal{F}$ to its latent space $Z$ and search for analogues directly in the reduced space $Z$ instead of in the input field $\mathcal{F}(\mathcal{X},t_i)$ employed in the classical AM. Figure 3 outlines the proposed approach. The idea behind this strategy is to funnel the input information through an optimized space of reduced dimensionality, which contains the relevant information of the predictor. Let $V[\cdot]$ be an AE, trained over data from the field $\mathcal{F}(\mathcal{X},t)$, in which we consider only the encoder part, so $V[\mathcal{F}(\mathcal{X},t_i)] \rightarrow Z(\mathcal{X}_Z,t_i)$, where $\mathcal{X}_Z$ denotes the latent coordinates and the latent dimension $|\mathcal{X}_Z|$ is usually smaller than $|\mathcal{X}|$, the dimension of the predictor grid $\mathcal{X}$. Then, we apply the AM approach using the features of the latent space to find the KNN, $[Z(\mathcal{X}_Z,t_1),\ldots,Z(\mathcal{X}_Z,t_K)]$, to the latent representation of the targeted state, $Z(\mathcal{X}_Z,t)$. Note that these latent situations are associated with their corresponding input fields $[\mathcal{F}(\mathcal{X},t_1),\ldots,\mathcal{F}(\mathcal{X},t_K)]$, and hence with their target fields, so we can reconstruct the extreme event of interest by simply applying Equations (1) and (2).
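The only change with respect to the classical AM is the space in which distances are computed. The sketch below illustrates this on synthetic data. The encoder here is a hypothetical stand-in (a fixed random linear map named `make_linear_encoder`), used only so that the example runs without a deep learning library; in the AE–AM it would be the encoder part of the trained AE.

```python
import numpy as np


def knn_indices(archive, query, k):
    """Indices of the k rows of `archive` closest (Euclidean) to `query`."""
    dist = np.linalg.norm(archive - query, axis=1)
    return np.argsort(dist)[:k]


def make_linear_encoder(n_inputs, n_latent, rng):
    """Hypothetical stand-in for a trained encoder V[.]: a fixed random
    linear map. A real AE encoder would be a trained neural network."""
    weights = rng.normal(size=(n_inputs, n_latent)) / np.sqrt(n_inputs)
    return lambda fields: fields.reshape(len(fields), -1) @ weights


rng = np.random.default_rng(1)
M, K = 300, 20
slp_archive = rng.normal(size=(M, 20, 30))    # predictor fields, larger grid
tmax_archive = rng.normal(size=(M, 5, 8))     # target fields, smaller subgrid
# Event predictor: perturbed copy of day 42 so the nearest analogue is known.
slp_event = slp_archive[42] + 0.01 * rng.normal(size=(20, 30))

encode = make_linear_encoder(20 * 30, 400, rng)
z_archive = encode(slp_archive)               # (M, 400) latent vectors
z_event = encode(slp_event[None])[0]          # (400,) latent vector of the event

# Analogue search in the latent space; the indices point back to the same days
# in the target archive, so Equation (1) applies unchanged.
analogue_idx = knn_indices(z_archive, z_event, K)
tmax_hat = tmax_archive[analogue_idx].mean(axis=0)
print(analogue_idx[:5], tmax_hat.shape)
```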

FIGURE 3.

AE–AM model. The predictor field of the event under study, which occurs at time $t$, feeds the trained encoder to obtain its latent representation $Z(\mathcal{X}_Z,t)$. A pool of latent representations, $Z(\mathcal{X}_Z,t_i)$, is obtained by encoding the different predictor fields $\mathcal{F}(\mathcal{X},t_i)$, hence obtaining one latent representation $Z$ per time step. The AM is then applied to search for the KNN among the $M$ possible candidates. Then, $N$ uniformly random extractions are drawn to construct the reconstructed target field, $\hat{F}(X)$.

DATA AND CASE STUDIES

We have considered ERA5 reanalysis data 53 from 1950 to 2021. Daily fields of SLP and Tmax have been obtained on different latitude–longitude grids with 2° resolution: the larger grid $\mathcal{X}$ for SLP (the predictor) and the subgrid $X$ for Tmax (the predictand) (see the notation in the section “The analogue method”).

Regarding the heat wave definition, we have employed a widely used index in the literature 54 : a heat wave occurs when Tmax exceeds a given threshold for three or more consecutive days. The threshold is the daily 90th percentile of Tmax in the reference period (1981–2010), computed over a 31‐day window. Therefore, the threshold for a given calendar day $d$ is $d_{th}=P_{90}(A_d)$, with

$$A_d=\bigcup_{y=1981}^{2010}\;\bigcup_{i=d-15}^{d+15}T_{y,i}, \tag{3}$$

where $\bigcup$ denotes the union of sets and $T_{y,i}$ stands for the daily Tmax of day $i$ in year $y$.
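The threshold and the three-day persistence rule can be sketched as follows. This is an illustrative sketch on toy data, not the authors' code: the function names, the dictionary layout (year to daily series), the 0-based day index, and the truncation of the window at the ends of the series are assumptions made for the example.

```python
import numpy as np


def p90_threshold(tmax_by_year, ref_years, day, half_window=15):
    """90th percentile of Tmax pooled over the reference years and a 31-day
    window centred on calendar day `day` (0-based index), as in Equation (3).

    Window days falling outside the series are skipped (a simplification).
    """
    pooled = []
    for year in ref_years:
        series = tmax_by_year[year]
        lo = max(day - half_window, 0)
        hi = min(day + half_window, len(series) - 1)
        pooled.extend(series[lo:hi + 1])
    return np.percentile(pooled, 90)


def heat_wave_days(series, thresholds, min_length=3):
    """Boolean mask of days in runs of at least `min_length` consecutive days
    with Tmax strictly above the daily threshold."""
    hot = np.asarray(series) > np.asarray(thresholds)
    mask = np.zeros(len(hot), dtype=bool)
    start = None
    for i, flag in enumerate(np.append(hot, False)):  # sentinel closes last run
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_length:
                mask[start:i] = True
            start = None
    return mask


# Toy reference climatology: every year has Tmax equal to the day index.
ref_years = range(1981, 2011)
tmax_by_year = {year: np.arange(365, dtype=float) for year in ref_years}
threshold_day_100 = p90_threshold(tmax_by_year, ref_years, day=100)

# Toy series: one 3-day hot spell (kept) and one 2-day hot spell (discarded).
series = [30, 36, 37, 38, 30, 36, 36, 30]
mask = heat_wave_days(series, thresholds=[35] * len(series))
print(threshold_day_100, mask.tolist())
```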

Following this methodology, we have considered eight European heat waves since 1950 (see Table 1 for their detailed description). Figure 4 shows these events, corresponding to some of the most severe European heat waves since at least 1950. 54 These heat waves also affected different regions of Europe. Moreover, they displayed different durations (from approximately 1 week to almost 1 month) and timings of occurrence within the high summer season (from early July to the second half of August). Table 1 also lists the average maximum temperature over the target grid region and the spatial domains used for the predictor (SLP) and target (Tmax) fields of each event, which vary from case to case depending on the region and spatial extent of the heat wave.

TABLE 1.

Summary information of the heat waves considered.

| Heat wave | Duration | Average maximum (°C) | Pressure grid ($\mathcal{X}$) | Temperature grid ($X$) |
|---|---|---|---|---|
| France 2003 | Aug. 01–Aug. 19 | 28.3582 | 32° N–70° N, 28° W–30° E | 42° N–50° N, 6° W–8° E |
| Spain 1995 | Jul. 16–Jul. 24 | 30.5132 | 32° N–70° N, 28° W–30° E | 34° N–42° N, 10° W–4° E |
| Greece 1987 | Jul. 18–Jul. 27 | 30.1892 | 28° N–66° N, 8° W–50° E | 34° N–44° N, 18° E–32° E |
| Germany 2006 | Jul. 09–Jul. 31 | 26.1944 | 24° N–72° N, 28° W–30° E | 44° N–54° N, 4° W–16° E |
| Poland 1994 | Jul. 21–Aug. 11 | 28.7656 | 32° N–70° N, 18° W–40° E | 48° N–56° N, 14° E–26° E |
| Balkans 2007 | Aug. 15–Aug. 28 | 29.2392 | 32° N–70° N, 8° W–50° E | 40° N–52° N, 18° E–42° E |
| Russia 2010 | Jul. 16–Aug. 19 | 32.6340 | 32° N–70° N, 22° E–80° E | 38° N–60° N, 40° E–60° E |
| Russia 1954 | Jul. 01–Jul. 12 | 28.8298 | 32° N–70° N, 8° W–50° E | 44° N–60° N, 28° E–48° E |

FIGURE 4.


European heat waves: France 2003, Spain 1995, Greece 1987, Germany 2006, Poland 1994, Balkans 2007, Russia 2010, and Russia 1954. Blue and red lines represent the daily evolution of the Tmax average on the grid point region for the specified dates and the P90 for the corresponding calendar days, respectively. Shading highlights the heat wave duration at that grid point.

This work shows and compares the capacity of the AM and AE–AM algorithms to reconstruct the spatial patterns and intensity of the heat waves considered. For every heat wave, five different re‐trainings of the AE are run, each with K=20 analogues and N=1000 realizations, to retrieve a probabilistic AE–AM reconstruction of the heat wave event. Although other configurations may lead to different results, the one used in this work appears to be robust in our experiments. We launch the same number of realizations with the AM, which allows for a direct comparison between the AM and AE–AM methods. The improvement of the AE–AM over the AM is quantified with the skill score (SS), defined as

$$SS=\left(1-\frac{\mathrm{MAE}_{\mathrm{AE\text{-}AM}}}{\mathrm{MAE}_{\mathrm{AM}}}\right)\times 100. \tag{4}$$
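As a quick check of the sign convention, Equation (4) can be written as a small helper. The function name and the input values below are illustrative only and are not taken from the paper's tables: a positive SS means the AE–AM has a lower MAE than the AM, and SS is zero when both errors are equal.

```python
def skill_score(mae_ae_am, mae_am):
    """Equation (4): percentage reduction in MAE of the AE-AM relative to the AM."""
    if mae_am <= 0:
        raise ValueError("mae_am must be positive")
    return (1.0 - mae_ae_am / mae_am) * 100.0


# Illustrative values only: a lower AE-AM error gives a positive skill score.
print(skill_score(3.9, 5.0))
print(skill_score(5.0, 5.0))
```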

A sensitivity test based on different sizes of the AE latent space has been carried out to assess the influence of this hyperparameter on the AE training (see the section “Further analysis and discussion”). After this preliminary analysis, the latent space size was set to 400 features. For reference, the SLP grid $\mathcal{X}$ (20×30 nodes at 2° resolution) has 600 nodes. Figures S1 and S2 illustrate the specific AE architecture considered (see the general one in Figure 2). Although the AE model comprises an encoder and a decoder part, for the AE–AM method the decoder part is only used to train the AE (see the section “Autoencoders”). The loss function is defined as the mean squared error of the pixel‐wise difference between the input image and the decoded image; hence, the closer the decoded image is to the original one, the better the codification of the latent space. As a consequence, the input layer of the encoder and the output of the last convolutional layer of the decoder have a spatial dimension of 20×30, with a resolution of 2° in latitude and longitude (as the grid $\mathcal{X}$). The first three two‐dimensional convolutional layers of the encoder have 32 filters: the first two keep the data size, while the third one reduces it with strides of (2,3). The next three layers have 16 filters, with the third one reducing the data size with strides of (2,2). This means that the original dimensions 20×30 (latitude and longitude) are reduced by factors of four and six, respectively. After the convolutional layers, the features are flattened and go through a dense layer to the latent space, or layer Z (see the AE general architecture in Figure 2). The decoder performs the inverse process but with a shallower architecture. The transposed convolutional layers have 32 and 16 filters, reversing the encoder reductions, and are followed by a convolutional layer.
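The size reductions quoted above can be verified with simple shape arithmetic. The sketch below assumes "same" padding, so that each convolution outputs ceil(input / stride) points per dimension; the padding scheme and kernel sizes are not stated in the text, and the helper name is hypothetical.

```python
import math


def conv_output_shape(shape, strides):
    """Spatial output size of a strided convolution, assuming 'same' padding
    (output = ceil(input / stride)). The padding scheme is an assumption."""
    return tuple(math.ceil(n / s) for n, s in zip(shape, strides))


# (filters, strides) for the six encoder convolutions described in the text.
encoder_layers = [
    (32, (1, 1)), (32, (1, 1)), (32, (2, 3)),
    (16, (1, 1)), (16, (1, 1)), (16, (2, 2)),
]

shape = (20, 30)  # latitude x longitude nodes of the SLP grid
for filters, strides in encoder_layers:
    shape = conv_output_shape(shape, strides)
    print(f"filters={filters:2d} strides={strides} -> {shape}")

# Number of values entering the dense layer after flattening.
flattened = shape[0] * shape[1] * encoder_layers[-1][0]
print("flattened features:", flattened)
```

Under these assumptions the 20×30 input is reduced to 5×5 (factors of four and six), and the flattened feature map passed to the dense layer has 5 × 5 × 16 values.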

RESULTS

In this section, we compare the performance of the AM and the proposed hybrid AE–AM in reconstructing the observed Tmax during the eight major European heat waves considered (Table 1), using the associated SLP field as a predictor. The performance of the algorithms is assessed separately for each heat wave event, considering the entire period of ERA5 reanalysis.

Comparison of heat waves probabilistic reconstructions with the AE–AM versus AM

First, we compare the performance of the AM and AE–AM approaches in the reconstruction of Tmax distributions. In the AM (AE–AM), flow analogues of the original (encoded) SLP field are searched in the historical record, and their associated Tmax fields are employed to reconstruct the target (see the section “The analogue method”). This procedure is applied to each day of the heat wave event separately. In this section, the reconstructions $\hat{F}(X)$ averaged in time over the heat wave event are shown. We conducted these experiments for different latent space dimensions, with the following numbers of features: 8, 64, 128, 256, 400, 600, 700, and 800. They include latent spaces that are smaller than, equal to, and larger than the input space (600).

Figures 5 and 6 show the Tmax distributions ($\hat{F}(X)$; see the section “The analogue method”), obtained for each heat wave using the AM as well as different latent space dimensions of the AE–AM. Compared to the target (i.e., the observed value, red line), the performance of the AE–AM is better than that of the AM for all the heat wave events considered. This improvement is observed for all the latent space dimensions, except for the smallest one (dimension 8), which does not seem to include enough information to yield a competitive reconstruction of the heat wave intensity. In the rest of the cases, the AE–AM distribution is closer to the target than the AM, indicating an AI‐enhanced reconstruction.

FIGURE 5.


Comparison of the reconstruction Tmax distributions obtained by the AE–AM using different latent dimensions and the classic AM methods, in the heat waves of France 2003, Spain 1995, Greece 1987, and Germany 2006.

FIGURE 6.


Comparison of the reconstruction Tmax distributions obtained by the AE–AM using different latent dimensions and the classic AM methods, in the heat waves of Poland 1994, Balkans 2007, Russia 2010, and Russia 1954.

This improvement is further quantified in Tables S1–S4, which summarize the performance of each method, including the reconstructed Tmax distribution (mean and standard deviation) and the difference between observed and reconstructed values of Tmax and SLP. The two approaches are also compared by using the SS of the AE–AM over the AM method. All these metrics are averaged over the domain of the target (X in Table 1), which corresponds to the region with the largest Tmax. The joint assessment of heat wave events shows that the AE–AM with a latent space dimension of 400 often yields one of the best results. However, larger reductions (e.g., dimensions of 256 and 128 nodes) also display similar performance (always better than the standard AM). The SS improvement of the Tmax reconstructions obtained by the AE–AM with respect to AM is larger than 10%, except for the latent space of dimension 8, as mentioned above.

The differences in the reconstructed Tmax depend more on the heat wave analyzed than on the latent space dimension (the effect of the latter is typically less than 1 °C). Note that both approaches can reconstruct warmer‐than‐average conditions (heat waves), indicating that the predictor (atmospheric circulation) is a significant driver of these extreme events. 17 , 19 Moreover, both approaches tend to underestimate the magnitude of the observed heat waves (by between 3 °C and 5 °C, depending on the case). This underestimation is common in AM applications, which can be partially explained by (1) a biased selection of the heat wave events towards the regions and intervals of maximum severity; (2) the fact that some of the analyzed heat waves were record breaking over their respective regions of occurrence (i.e., with few historical analogues of the given severity); and (3) the lack of consideration of other amplification factors that could have contributed to exacerbating the magnitude of the heat wave event (e.g., land‐atmosphere coupling; see, e.g., Barriopedro et al. 2 and references therein). Despite this, our results show a clear advantage of preprocessing the data with the AE before applying the AM. This indicates that the AE can condense important information in the latent space that the AM may better exploit.

Further analysis of the performance of the proposed AE–AM versus the standard AM can be carried out by trying to reconstruct nonextreme periods. For this experiment, we have chosen two periods without heat waves, that is, where the daily Tmax is below the P90 (July 1 to July 15 of 1996 in Spain and of 2004 in France; see Figure 7), and we have applied the AE–AM and AM to reconstruct them. Figure S3 shows that the performance of the Tmax reconstructions in these cases is much better than during extreme events. In both cases, the AE–AM reconstruction is slightly better than the AM one, particularly for the period in France. However, the differences between the AE–AM and the AM approaches tend to be smaller than for extreme periods. Therefore, the AE–AM approach also yields a reasonable reconstruction (at least as good as that of the AM) for nonextreme conditions.

FIGURE 7.


Periods selected for experiments reconstructing no heat wave days. (A) No heat wave period I selected: July 01‐July 15, 2004 (summer of 2004 in France). (B) No heat wave period II selected: July 01‐July 15, 1996 (summer of 1996 in Spain).

Note that, for the reconstruction of the target (Tmax), we have only considered the large‐scale atmospheric circulation (SLP patterns), disregarding other potential drivers for heat waves. Including additional inputs or information channels for the reconstruction problem could lead to better results, reducing the differences with the target. However, this could also generate problems derived from including different input variables in the AM, such as the definition of distances in the multivariate space, the choice of historical analogues, or the relevance of the variables for heat wave reconstruction. Some of these issues are amenable to deep‐learning techniques (see, e.g., Miloshevich et al. 48 ).

Further analysis and discussion

In this subsection, we further discuss other specific aspects and configurations of the proposed approach and its comparison with alternative ways of data dimensionality reduction. Assuming that an optimal dimension reduction of the predictor field can improve the search for good analogues and the reconstruction of the target, the question is whether we can obtain similar results with simpler linear dimensionality reduction methods, such as principal component analysis (PCA). To clarify this question, we carried out a first analysis where we have replaced the first stage of preprocessing (carried out by the AE) with a PCA. Then, the analogue search was applied to the components obtained by the PCA. Figure S4 shows that the reconstruction of Tmax with the hybrid PCA–AM is worse than with the AE–AM and the AM reconstruction, at least for the analyzed heat waves. Thus, the search for analogues in the PCA space does not yield satisfactory results in the reconstruction of heat waves, and it does not outperform the hybrid AE–AM. This indicates that not all dimensionality reduction schemes work fine when hybridized with the AM.
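The PCA–AM baseline described above can be sketched as follows: the predictor fields are projected onto their leading principal components and the analogue search is run on the resulting scores. This is a minimal sketch on synthetic data, not the authors' code; the function names and the number of retained components are arbitrary choices for the example, since the text does not specify them.

```python
import numpy as np


def pca_fit(fields, n_components):
    """Fit PCA by SVD on flattened, mean-centred fields.

    Returns the mean vector and the leading principal directions (rows)."""
    flat = fields.reshape(len(fields), -1)
    mean = flat.mean(axis=0)
    _, _, vt = np.linalg.svd(flat - mean, full_matrices=False)
    return mean, vt[:n_components]


def pca_transform(fields, mean, components):
    """Project fields onto the principal directions (PC scores)."""
    flat = fields.reshape(len(fields), -1)
    return (flat - mean) @ components.T


rng = np.random.default_rng(2)
M, K, n_components = 150, 20, 10       # n_components is an arbitrary choice
slp_archive = rng.normal(size=(M, 20, 30))
# Event predictor: perturbed copy of day 5 so the nearest analogue is known.
slp_event = slp_archive[5] + 0.01 * rng.normal(size=(20, 30))

mean, components = pca_fit(slp_archive, n_components)
pcs_archive = pca_transform(slp_archive, mean, components)
pcs_event = pca_transform(slp_event[None], mean, components)[0]

# Analogue search on the PC scores instead of the full field.
dist = np.linalg.norm(pcs_archive - pcs_event, axis=1)
analogue_idx = np.argsort(dist)[:K]
print(pcs_archive.shape, analogue_idx[:5])
```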

Another interesting point to study is the effect of using alternative atmospheric circulation variables (e.g., 500 hPa geopotential height) as input predictors for the heat waves in the AE–AM. Tables S5 and S6 show the performance of the AE–AM in the reconstruction of the considered heat waves using 500 hPa geopotential height (with a latent space dimension of 400). As can be seen, the AE–AM approach with geopotential height as the predictor performs worse than the case using SLP as the input variable. Moreover, although geopotential height has been used as a predictor of precipitation and heat waves, 17 the use of the SLP variable is more extensive in the literature. 21 The reason is that, generally, warmer temperatures raise the geopotential heights, since the density of the atmospheric column is affected by the mean temperature. This rise does not imply a different atmospheric circulation configuration, but the column expansion affects the magnitude of the anomalies and, hence, the selection of analogues. By contrast, the SLP patterns are unaffected by the mean temperature, which makes SLP a more stable variable to apply in the AM. This point has been discussed in the literature in the context of downscaling. 55 , 56 In the particular case of AM‐based attribution of extreme events, some studies employ SLP. 57 Still, approaches have also been proposed to address these issues with the geopotential height variable, for example, by removing the first‐order thermodynamic effect of global warming on geopotential height rise. 17 , 58

An important benefit of the AM (and AE–AM) is the provision of spatiotemporal information (the evolution of the event and its spatial pattern), which raises the question of the performance of AE–AM on local scales. As an example, we illustrate the spatial reconstructions of Tmax obtained for the heat wave of France 2003, the first mega‐heat wave of the 21st century 59 , 60 with SLP as the input variable. Figure 8 shows the target to reconstruct, and Figures 9 and 10 show the best and median Tmax reconstruction over the targeted domain, respectively, as inferred from the AE–AM and AM.

FIGURE 8.


Target to reconstruct (heat wave of France 2003). The values are in degrees Celsius.

FIGURE 9.


Comparison of different reconstructions with the best solution by the AE–AM and the best solution by the standard AM algorithm for the heat wave of France 2003. (A) AE‐AM reconstruction (best solution of the distribution). (B) Difference between the target and the reconstruction with the best AE‐AM solution. (C) Standard AM reconstruction (best solution of the distribution). (D) Difference between the target and standard AM best solution. (E) Difference between the best AE‐AM and best standard AM solutions. The values are in degrees Celsius.

FIGURE 10.


Comparison of different reconstructions with the median solution by the AE–AM and the median solution by the standard AM algorithm, for the heat wave of France 2003. (A) AE‐AM reconstruction (median solution of the distribution). (B) Difference between the target and the reconstruction with the median AE‐AM solution. (C) Standard AM reconstruction (median solution of the distribution). (D) Difference between the target and standard AM median solution. (E) Difference between the reconstructions with the median AE‐AM and median standard AM solutions. The values are in degrees Celsius.

The results show notable differences between the best Tmax reconstruction obtained by the AE–AM and the standard AM (Figure 9). The superiority of the AE–AM is particularly notable in the central and southwestern areas of the reconstructed grid X, where the AE–AM reconstruction is closer to observations than the AM one.

AE–AM also displays an overall better performance than AM for the median reconstruction (Figure 10). In this case, the largest improvements of the AE–AM over the AM imply a local reduction of the reconstruction error of more than 2 °C (e.g., in the northern part of the domain), and the standard AM algorithm only performs better than the AE–AM reconstruction in very few zones.

To understand the differences in the Tmax reconstruction performed by AM and AE–AM, we also analyze the analogue days selected by both methods. For simplicity, the analysis will focus again on the France 2003 heat wave. As mentioned above, the reconstruction is made by randomly extracting N times the K possible analogues. Note that the same analogue day can be employed to reconstruct different days of the event. Considering the total duration of the event (19 days; see duration in Table 1), the AM method used 306 different days as analogues to make the reconstructions, while the AE–AM method selected 287 different days. Many of these days (188) are shared by the two methods. However, AM employed 118 days for its reconstruction that AE–AM did not, and AE–AM includes 99 days that are not analogues in the AM. We will call these disjoint days.
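The counts of shared and disjoint days follow from simple set operations, which the sketch below reproduces. The day identifiers are synthetic integers chosen only so that the set sizes match the numbers quoted above; with real data they would be the calendar dates of the selected analogues.

```python
# Synthetic day identifiers chosen only to reproduce the counts quoted above.
am_days = set(range(0, 306))         # 306 distinct analogue days used by the AM
ae_am_days = set(range(118, 405))    # 287 distinct analogue days used by the AE-AM

shared = am_days & ae_am_days        # days selected by both methods
am_only = am_days - ae_am_days       # disjoint days of the AM
ae_am_only = ae_am_days - am_days    # disjoint days of the AE-AM

print(len(shared), len(am_only), len(ae_am_only))
```

The two disjoint sets are the ones used for the reconstructions compared in Figure 11.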

Figure 11 compares the reconstruction performed only with the disjoint days of each method. The analogues selected by the AE–AM perform better than those selected by the AM regarding the magnitude and spatial pattern of the targeted Tmax field. This is confirmed by the difference between both reconstructions (see Figure 11C): the disjoint‐analogue reconstruction of AE–AM reduces the bias of the AM one in all grid points. This shows that the AE–AM can identify better analogues that result in a better reconstruction of the predictand, suggesting that the AE filters out irrelevant information in the predictor field and increases the signal‐to‐noise ratio for the reconstruction.

FIGURE 11.

Comparison of reconstructions by AM and AE–AM algorithms, using uniquely selected days, for the heat wave of France 2003. (A) Mean AE‐AM unique days reconstruction. (B) Mean AM unique days reconstruction. (C) Difference between the reconstructions with the AE‐AM unique days and AM unique days. The values are in degrees Celsius.

Finally, Figure S6 displays the temporal distribution of the disjoint days in both methods. Although both methods tend to select analogues from the summer period (June to August), the AM also picks days in April, May, September, and October, which contrasts with the AE–AM preference for July and August. This comparison of monthly distributions is important, since the France 2003 heat wave occurred around the warmest time of the year, specifically in August. Seasonality is more marked in the latent space (which encodes the most important information of the input field) than in the original pressure field. When the disjoint analogue days are analyzed by year of occurrence, the AE–AM analogues are more uniformly distributed than those of the AM method, which leads to a higher frequency of recent (and warmer) analogues (the 2010s and 2020s). All this suggests that the AE–AM may learn relevant aspects of the target (seasonality, trends, etc.) that may not be present in the predictor fields.

CONCLUSIONS AND FINAL REMARKS

In this paper, we have proposed a novel hybrid approach for a probabilistic reconstruction of meteorological fields during extreme events, based on autoencoders and the analogue method (AE–AM). This novel algorithm uses a deep AE trained with the predictor fields (herein sea‐level pressure, although other input variables may be considered), which are mapped into the reduced latent space. Then, the AM is directly applied to the states of this latent space to find similar situations in the historical record. These analogue days are finally employed to reconstruct the targeted field.
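The steps summarized above can be sketched on synthetic data as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the trained encoder is replaced by a fixed random linear projection, Euclidean distance is assumed for the analogue search, and all names (`encode`, `k_nearest`, `K`, `N`) and array sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: 500 historical days, predictor (SLP-like) fields with
# 400 grid points, target (Tmax-like) fields with 100 grid points.
n_days, n_pred, n_target, latent_dim = 500, 400, 100, 16
slp_hist = rng.normal(size=(n_days, n_pred))
tmax_hist = rng.normal(size=(n_days, n_target))
slp_event = rng.normal(size=(n_pred,))   # predictor field of one event day

# ASSUMPTION: placeholder encoder (fixed random linear map). In the AE-AM this
# role is played by the encoder of a trained deep autoencoder.
W = rng.normal(size=(n_pred, latent_dim)) / np.sqrt(n_pred)

def encode(x):
    """Map predictor field(s) to the reduced latent space."""
    return x @ W

def k_nearest(query, pool, k):
    """Indices of the k rows of pool closest to query (Euclidean distance)."""
    dist = np.linalg.norm(pool - query, axis=1)
    return np.argsort(dist)[:k]

K, N = 20, 1000
analogues = k_nearest(encode(slp_event), encode(slp_hist), K)

# Draw one of the K analogues at random, N times, to build a distribution of
# reconstructed target fields, then summarize it with the median.
draws = rng.choice(analogues, size=N, replace=True)
recon_distribution = tmax_hist[draws]                 # shape (N, n_target)
recon_median = np.median(recon_distribution, axis=0)  # shape (n_target,)
print(recon_distribution.shape, recon_median.shape)
```

Replacing `encode` with the identity function gives the classical AM in this sketch, since the analogue search then runs on the fully resolved predictor field.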

The performance of our hybrid approach has been compared with that of the classical AM in a problem of maximum temperature (Tmax) reconstruction during major European heat waves. Eight European heat waves of the 1954–2010 period have been considered and reconstructed from SLP fields with both AM and AE–AM using the ERA5 reanalysis data. In all the cases analyzed, the AE–AM approach has shown better performance than AM in reconstructing the daily maximum temperature observed during these heat waves. A comprehensive analysis of the proposed AE–AM approach has been carried out, including comparisons with alternative configurations of the algorithm and other hybrid approaches for dimensionality reduction (e.g., PCA–AM). Different input variables (500 hPa geopotential instead of SLP) have also been considered, without reporting qualitative differences in the results. All these experiments confirm a robust behavior of the AE–AM, with an overall improved performance with respect to the standard AM.

The proposed AE–AM algorithm can be directly applied to other problems where the usefulness of the AM has been demonstrated, such as extreme event attribution. 61 Attribution is usually defined as the process of evaluating the relative contributions of multiple causal factors to a change or event, with an assignment of statistical confidence. 62 , 63 When applied to specific individual events, this is known as extreme attribution, which represents a key aspect in understanding climate change risks, as they are often associated with the occurrence of extreme events. 64 , 65 The idea behind extreme attribution is to address climate change influences in a particular extreme event by comparing a class of events that is similar to the observed one in two worlds with different levels of anthropogenic influences. These two worlds can come from different observational periods (i.e., a new/recent and old/past world) or climate simulations (i.e., a factual and counterfactual world with and without anthropogenic forcings).

The AM has been employed for extreme event attribution since it can reconstruct the expected intensity of an event in two different climates, given the observed atmospheric conditions that caused the event (i.e., how the extreme event could have been in the past under the same forcing). As the reconstruction of the target field is needed to perform a formal attribution with the AM, applying the AE–AM for extreme attribution would be straightforward. Future improvements include considering multiple input channels (a multivariate approach). Extended to the attribution question, this would allow numerous conditioned attributions that consider the influence of the various factors contributing to the event. In addition, future work will investigate the physical interpretation of the latent space. 66 , 67 This would help to understand the role of the different input variables and the regions of each variable that most influence the reconstruction of the field. Moreover, it may lead to a causal analysis of the different extreme events, in which the variables causing the extreme event are analyzed over different periods.
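The two-world comparison described above can be pictured with a small synthetic experiment. This is a hedged sketch, not a formal attribution procedure: the data are random, a 2-degree shift is imposed artificially on the "recent" period, the split year (1980) is arbitrary, and the analogue search is run directly on the predictor for brevity (a latent-space search would change only the distance computation). The helper `analogue_mean` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

n_days, n_pred, n_target, K = 600, 50, 50, 20
years = np.repeat(np.arange(1950, 2010), 10)   # 60 years x 10 days each
predictor = rng.normal(size=(n_days, n_pred))

# ASSUMPTION: synthetic target with an imposed 2-degree shift after 1980.
recent = years >= 1980
target = rng.normal(size=(n_days, n_target)) + 2.0 * recent[:, None]

event_predictor = rng.normal(size=(n_pred,))   # circulation of the event day

def analogue_mean(mask):
    """Mean target over the K best analogues found within the masked period."""
    idx = np.flatnonzero(mask)
    dist = np.linalg.norm(predictor[idx] - event_predictor, axis=1)
    nearest = idx[np.argsort(dist)[:K]]
    return target[nearest].mean()

past_intensity = analogue_mean(~recent)      # "old/past world"
recent_intensity = analogue_mean(recent)     # "new/recent world"
print(round(recent_intensity - past_intensity, 2))
```

The printed difference recovers roughly the imposed shift, which is the quantity an analogue-based attribution would compare between the two worlds for the same atmospheric conditions.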

AUTHOR CONTRIBUTIONS

J.P.A.: conceptualization, software, tests and results, and paper writing and editing; C.M.M.: conceptualization, software, data treatment, tests and results, and paper writing and editing; E.Z.: conceptualization, data treatment, and paper editing; D.B.: conceptualization and paper editing; P.Z.: conceptualization and paper editing; M.G.: conceptualization and paper editing; A.C.: conceptualization and paper editing; P.A.G.: conceptualization and paper editing; S.S.S.: conceptualization, tests and results, and paper writing and editing.

CONFLICT OF INTEREST STATEMENT

The authors declare no potential conflict of interest.

PEER REVIEW

The peer review history for this article is available at: https://publons.com/publon/10.1111/nyas.15243

Supporting information

Figure S1. AE Encoder architecture.

Table S1. Average AE‐AM results for different dimensions of the latent space (heat waves of France 2003 and Spain 1995).

Figure S2. AE Decoder architecture.

Table S2. Average AE‐AM results for different dimensions of the latent space (heat waves of Greece 1987 and Germany 2006).

Table S3. Average AE‐AM results for different dimensions of the latent space (heat waves of Poland 1994 and Balkans 2007).

Table S4. Average AE‐AM results for different dimensions of the latent space (heat waves of Russia 2010 and Russia 1954).

Figure S3. Comparison of AE‐AM and AM applied to a case of no extreme event (no heat wave)

Figure S4. Comparison of the reconstruction Tmax distributions obtained by PCA–AM for different number of components (64, 256 and 600), the AE‐AM and the classic AM methods, in the heat waves of France 2003, Spain 1995, Greece 1987, Germany 2006, Poland 1994, Balkans 2007, Russia 2010 and Russia 1954

Table S5. Average results over‐runs for different heat waves (I) with 500hPa geopotential.

Table S6. Average results over‐runs for different heat waves (II) with 500hPa geopotential.

Figure S5. Comparison of reconstructions by AM and AE‐AM algorithms, using unique selected days, for the heat wave of France 2003.

Figure S6. Comparison of the distribution of unique selected days by AM and AE‐AM algorithms for the heat wave of France 2003

NYAS-1541-230-s001.pdf (2.3MB, pdf)

ACKNOWLEDGMENTS

This research has been partially supported by the European Union, through the H2020 Project “CLIMATE INTELLIGENCE Extreme events detection, attribution and adaptation design using machine learning (CLINT)”, Ref: 101003876‐CLINT. This work has also been partially supported by “Agencia Española de Investigación (España)” (grant references: PID2020‐115454GB‐C21 and PID2020‐115454GB‐C22/AEI/10.13039/501100011033).

Pérez‐Aracil, J., Marina, C. M., Zorita, E., Barriopedro, D., Zaninelli, P., Giuliani, M., Castelletti, A., Gutiérrez, P. A., & Salcedo‐Sanz, S. (2024). Autoencoder‐based flow‐analogue probabilistic reconstruction of heat waves from pressure fields. Ann NY Acad Sci., 1541, 230–242. 10.1111/nyas.15243

DATA AVAILABILITY STATEMENT

The developed code (Python), its corresponding documentation/description and the data used are fully available at the links below:

Code: https://github.com/GheodeAI/va_am

Documentation: https://va-am.readthedocs.io/en/latest/

Data (ERA5 Reanalysis): https://cds.climate.copernicus.eu/

REFERENCES

  • 1. Chapman, S., Watkins, N. W., & Stainforth, D. A. (2019). Warming trends in summer heatwaves. Geophysical Research Letters, 46(3), 1634–1640.
  • 2. Barriopedro, D., García‐Herrera, R., Ordóñez, C., Miralles, D., & Salcedo‐Sanz, S. (2023). Heat waves: Physical understanding and scientific challenges. Reviews of Geophysics, 61, e2022RG000780.
  • 3. Zhou, Y., Gu, S., Yang, H., Li, Y., Zhao, Y., Li, Y., & Yang, Q. (2024). Spatiotemporal variation in heatwaves and elderly population exposure across China. Science of The Total Environment, 917, 170245.
  • 4. Bador, M., Terray, L., Boe, J., Somot, S., Alias, A., Gibelin, A.‐L., & Dubuisson, B. (2017). Future summer mega‐heatwave and record‐breaking temperatures in a warmer France climate. Environmental Research Letters, 12(7), 074025.
  • 5. Sánchez‐Benítez, A., García‐Herrera, R., Barriopedro, D., Sousa, P. M., & Trigo, R. M. (2018). June 2017: The earliest European summer mega‐heatwave of reanalysis period. Geophysical Research Letters, 45(4), 1955–1962.
  • 6. Amengual, A., Homar, V., Romero, R., Brooks, H. E., Ramis, C., Gordaliza, M., & Alonso, S. (2014). Projections of heat waves with high impact on human health in Europe. Global and Planetary Change, 119, 71–84.
  • 7. Díaz, J., Jordán, A., García, R., López, C., Alberdi, J., Hernández, E., & Otero, A. (2002). Heat waves in Madrid 1986–1997: Effects on the health of the elderly. International Archives of Occupational and Environmental Health, 75(3), 163–170.
  • 8. Díaz, J., Garcia, R., De Castro, F. V., Hernández, E., López, C., & Otero, A. (2002). Effects of extremely hot days on people older than 65 years in Seville (Spain) from 1986 to 1997. International Journal of Biometeorology, 46(3), 145–149.
  • 9. Wang, Z., Jiang, Y., Wan, H., Yan, J., & Zhang, X. (2017). Detection and attribution of changes in extreme temperatures at regional scale. Journal of Climate, 30(17), 7035–7047.
  • 10. Suli, S., Barriopedro, D., García‐Herrera, R., & Rusticucci, M. (2023). Regionalisation of heat waves in southern South America. Weather and Climate Extremes, 40, 100569.
  • 11. Wang, J., & Yan, Z. (2021). Rapid rises in the magnitude and risk of extreme regional heat wave events in China. Weather and Climate Extremes, 34, 100379.
  • 12. Shi, J., Cui, L., Ma, Y., Du, H., & Wen, K. (2018). Trends in temperature extremes and their association with circulation patterns in China during 1961–2015. Atmospheric Research, 212, 259–272.
  • 13. Ma, F., Yuan, X., & Li, H. (2022). Characteristics and circulation patterns for wet and dry compound day‐night heat waves in mid‐eastern China. Global and Planetary Change, 213, 103839.
  • 14. Serrano‐Notivoli, R., Lemus‐Canovas, M., Barrao, S., Sarricolea, P., Meseguer‐Ruiz, O., & Tejedor, E. (2022). Heat and cold waves in mainland Spain: Origins, characteristics, and trends. Weather and Climate Extremes, 37, 100471.
  • 15. Zwiers, F. W., Zhang, X., & Feng, Y. (2011). Anthropogenic influence on long return period daily temperature extremes at regional scales. Journal of Climate, 24(3), 881–892.
  • 16. Cattiaux, J., & Yiou, P. (2013). US heat waves of spring and summer 2012 from the flow‐analogue perspective. Bulletin of the American Meteorological Society, 94(9), S10–S13.
  • 17. Jézéquel, A., Yiou, P., & Radanovics, S. (2018). Role of circulation in European heatwaves using flow analogues. Climate Dynamics, 50(3‐4), 1145–1159.
  • 18. Xu, P., Wang, L., Huang, P., & Chen, W. (2021). Disentangling dynamical and thermodynamical contributions to the record‐breaking heatwave over central Europe in June 2019. Atmospheric Research, 252, 105446.
  • 19. Faranda, D., Bourdin, S., Ginesta, M., Krouma, M., Noyelle, R., Pons, F., Yiou, P., & Messori, G. (2022). A climate‐change attribution retrospective of some impactful weather extremes of 2021. Weather and Climate Dynamics, 3(4), 1311–1340.
  • 20. Zhang, M., Yang, X., Cleverly, J., Huete, A., Zhang, H., & Yu, Q. (2022). Heat wave tracker: A multi‐method, multi‐source heat wave measurement toolkit based on Google Earth engine. Environmental Modelling & Software, 147, 105255.
  • 21. Zorita, E., & Von Storch, H. (1999). The analog method as a simple statistical downscaling technique: Comparison with more complicated methods. Journal of Climate, 12(8), 2474–2489.
  • 22. Horton, P. (2022). Analogue methods and ERA5: Benefits and pitfalls. International Journal of Climatology, 42(7), 4078–4096.
  • 23. Wetterhall, F., Halldin, S., & Xu, C.‐y. (2005). Statistical precipitation downscaling in central Sweden with the analogue method. Journal of Hydrology, 306(1‐4), 174–190.
  • 24. Horton, P., Jaboyedoff, M., Metzger, R., Obled, C., & Marty, R. (2012). Spatial relationship between the atmospheric circulation and the precipitation measured in the western Swiss Alps by means of the analogue method. Natural Hazards and Earth System Sciences, 12(3), 777–784.
  • 25. Daoud, A. B., Sauquet, E., Bontron, G., Obled, C., & Lang, M. (2016). Daily quantitative precipitation forecasts based on the analogue method: Improvements and application to a French large river basin. Atmospheric Research, 169, 147–159.
  • 26. Alessandrini, S., Delle Monache, L., Sperati, S., & Cervone, G. (2015). An analog ensemble for short‐term probabilistic solar power forecast. Applied Energy, 157, 95–110.
  • 27. Alessandrini, S., Delle Monache, L., Sperati, S., & Nissen, J. (2015). A novel application of an analog ensemble for short‐term wind power forecasting. Renewable Energy, 76, 768–781.
  • 28. Vanvyve, E., Delle Monache, L., Monaghan, A. J., & Pinto, J. O. (2015). Wind resource estimates with an analog ensemble approach. Renewable Energy, 74, 761–773.
  • 29. Zhang, X., Li, Y., Lu, S., Hamann, H. F., Hodge, B.‐M., & Lehman, B. (2018). A solar time based analog ensemble method for regional solar power forecasting. IEEE Transactions on Sustainable Energy, 10(1), 268–279.
  • 30. Alessandrini, S., Sperati, S., & Delle Monache, L. (2019). Improving the analog ensemble wind speed forecasts for rare events. Monthly Weather Review, 147(7), 2677–2692.
  • 31. Alaoui, B., Bari, D., Bergot, T., & Ghabbar, Y. (2022). Analog ensemble forecasting system for low‐visibility conditions over the main airports of Morocco. Atmosphere, 13(10), 1704.
  • 32. Delle Monache, L., Alessandrini, S., Djalalova, I., Wilczak, J., Knievel, J. C., & Kumar, R. (2020). Improving air quality predictions over the United States with an analog ensemble. Weather and Forecasting, 35(5), 2145–2162.
  • 33. Salcedo‐Sanz, S., García‐Herrera, R., Camacho‐Gómez, C., Alexandre, E., Carro‐Calvo, L., & Jaume‐Santero, F. (2019). Near‐optimal selection of representative measuring points for robust temperature field reconstruction with the CRO‐SL and analogue methods. Global and Planetary Change, 178, 15–34.
  • 34. Alessandrini, S., Delle Monache, L., Rozoff, C. M., & Lewis, W. E. (2018). Probabilistic prediction of tropical cyclone intensity with an analog ensemble. Monthly Weather Review, 146(6), 1723–1744.
  • 35. Jia, L., Ren, F., Ding, C., Jia, Z., Wang, M., Chen, Y., & Feng, T. (2022). Improvement of the ensemble methods in the dynamical–statistical–analog ensemble forecast model for landfalling typhoon precipitation. Journal of the Meteorological Society of Japan. Ser. II, 100(3), 575–592.
  • 36. Ren, L., Zhou, T., & Zhang, W. (2020). Attribution of the record‐breaking heat event over Northeast Asia in summer 2018: The role of circulation. Environmental Research Letters, 15(5), 054018.
  • 37. Hand, D. J. (2007). Principles of data mining. Drug Safety, 30, 621–622.
  • 38. Delle Monache, L., Eckel, F. A., Rife, D. L., Nagarajan, B., & Searight, K. (2013). Probabilistic weather prediction with an analog ensemble. Monthly Weather Review, 141(10), 3498–3516.
  • 39. Eckel, F. A., & Delle Monache, L. (2016). A hybrid NWP–analog ensemble. Monthly Weather Review, 144(3), 897–911.
  • 40. Katal, A., Leroyer, S., Zou, J., Nikiema, O., Albettar, M., Belair, S., & Wang, L. L. (2023). Outdoor heat stress assessment using an integrated multi‐scale numerical weather prediction system: A case study of a heatwave in Montreal. Science of The Total Environment, 865, 161276.
  • 41. Salcedo‐Sanz, S., Pérez‐Aracil, J., Ascenso, G., Del Ser, J., Casillas‐Pérez, D., Kadow, C., Fister, D., Barriopedro, D., García‐Herrera, R., Giuliani, M., & Castelletti, A. (2024). Analysis, characterization, prediction, and attribution of extreme atmospheric events with machine learning and deep learning techniques: A review. Theoretical and Applied Climatology, 155, 1–44.
  • 42. Horton, P., Jaboyedoff, M., & Obled, C. (2018). Using genetic algorithms to optimize the analogue method for precipitation prediction in the Swiss Alps. Journal of Hydrology, 556, 1220–1231.
  • 43. Hinton, G. E., & Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786), 504–507.
  • 44. Yang, L. M., & Grooms, I. (2021). Machine learning techniques to construct patched analog ensembles for data assimilation. Journal of Computational Physics, 443, 110532.
  • 45. Grooms, I. (2021). Analog ensemble data assimilation and a method for constructing analogs with variational autoencoders. Quarterly Journal of the Royal Meteorological Society, 147(734), 139–149.
  • 46. Kingma, D. P., & Welling, M. (2019). An introduction to variational autoencoders. Foundations and Trends in Machine Learning, 12(4), 307–392.
  • 47. Klampanos, I. A., Davvetas, A., Andronopoulos, S., Pappas, C., Ikonomopoulos, A., & Karkaletsis, V. (2018). Autoencoder‐driven weather clustering for source estimation during nuclear events. Environmental Modelling & Software, 102, 84–93.
  • 48. Miloshevich, G., Lucente, D., Yiou, P., & Bouchet, F. (2023). Extreme heatwave sampling and prediction with analog Markov chain and comparisons with deep learning. arXiv. 10.48550/arXiv.2307.09060
  • 49. Lorenz, E. N. (1969). Atmospheric predictability as revealed by naturally occurring analogues. Journal of Atmospheric Sciences, 26(4), 636–646.
  • 50. Likas, A., Vlassis, N., & Verbeek, J. J. (2003). The global k‐means clustering algorithm. Pattern Recognition, 36(2), 451–461.
  • 51. Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT Press.
  • 52. Pinaya, W. H. L., Vieira, S., Garcia‐Dias, R., & Mechelli, A. (2020). Autoencoders. In Machine learning (pp. 193–208). Elsevier.
  • 53. Hersbach, H., Bell, B., Berrisford, P., Hirahara, S., Horányi, A., Muñoz‐Sabater, J., Nicolas, J., Peubey, C., Radu, R., Schepers, D., & Simmons, A. (2020). The ERA5 global reanalysis. Quarterly Journal of the Royal Meteorological Society, 146(730), 1999–2049.
  • 54. Russo, S., Sillmann, J., & Fischer, E. M. (2015). Top ten European heatwaves since 1950 and their occurrence in the coming decades. Environmental Research Letters, 10(12), 124003.
  • 55. Burkhardt, U. (1999). Alpine precipitation in a tripled CO2‐climate. Tellus A: Dynamic Meteorology and Oceanography, 51(2), 289–303.
  • 56. Wilby, R. L., Charles, S. P., Zorita, E., Timbal, B., Whetton, P., & Mearns, L. O. (2004). Guidelines for use of climate scenarios developed from statistical downscaling methods. Supporting material of the Intergovernmental Panel on Climate Change, available from the DDC of IPCC TGCIA, 27.
  • 57. Yiou, P., Jézéquel, A., Naveau, P., Otto, F. E., Vautard, R., & Vrac, M. (2017). A statistical framework for conditional extreme event attribution. Advances in Statistical Climatology, Meteorology and Oceanography, 3(1), 17–31.
  • 58. Barriopedro, D., Sousa, P., Trigo, R., García‐Herrera, R., & Ramos, A. (2020). The exceptional Iberian heatwave of summer 2018. Bulletin of the American Meteorological Society, 101(1), S29–S34.
  • 59. García‐Herrera, R., Díaz, J., Trigo, R. M., Luterbacher, J., & Fischer, E. M. (2010). A review of the European summer heat wave of 2003. Critical Reviews in Environmental Science and Technology, 40(4), 267–306.
  • 60. Beniston, M., & Diaz, H. F. (2004). The 2003 heat wave as an example of summers in a greenhouse climate? Observations and climate model simulations for Basel, Switzerland. Global and Planetary Change, 44(1‐4), 73–81.
  • 61. Lloyd, E. A., & Shepherd, T. G. (2020). Environmental catastrophes, climate change, and attribution. Annals of the New York Academy of Sciences, 1469(1), 105–124.
  • 62. Stott, P. A., Christidis, N., Otto, F. E., Sun, Y., Vanderlinden, J.‐P., van Oldenborgh, G. J., Vautard, R., von Storch, H., Walton, P., Yiou, P., & Zwiers, F. W. (2016). Attribution of extreme weather and climate‐related events. Wiley Interdisciplinary Reviews: Climate Change, 7(1), 23–41.
  • 63. Hegerl, G. C., Zwiers, F. W., Braconnot, P., Gillett, N. P., Luo, Y., Orsini, J. A. M., Nicholls, N., Penner, J. E., Stott, P. A., Allen, M., et al. (2007). Understanding and attributing climate change. In Solomon, S., Qin D., Manning M., Chen Z., Marquis M., Averyt KB, Tignor M., & Miller H. L. (Eds.), Contribution of working group I to the fourth assessment report of the Intergovernmental Panel on Climate Change (IPCC). Cambridge University Press.
  • 64. Hulme, M. (2014). Attributing weather extremes to ‘climate change’: A review. Progress in Physical Geography, 38(4), 499–511.
  • 65. Villa, D. L., Schostek, T., Govertsen, K., & Macmillan, M. (2023). A stochastic model of future extreme temperature events for infrastructure analysis. Environmental Modelling & Software, 163, 105663.
  • 66. Fukami, K., & Taira, K. (2023). Grasping extreme aerodynamics on a low‐dimensional manifold. Nature Communications, 14(1), 6480.
  • 67. Smith, L., Fukami, K., Sedky, G., Jones, A., & Taira, K. (2024). A cyclic perspective on transient gust encounters through the lens of persistent homology. Journal of Fluid Mechanics, 980, A18.


Articles from Annals of the New York Academy of Sciences are provided here courtesy of Wiley
