. 2020 Oct 3;896:1–84. doi: 10.1016/j.physrep.2020.09.005

Statistical physics approaches to the complex Earth system

Jingfang Fan a,b,⁎,1, Jun Meng c,a,, Josef Ludescher a,⁎⁎, Xiaosong Chen b,⁎⁎, Yosef Ashkenazy d,⁎⁎, Jürgen Kurths a,e,f,, Shlomo Havlin g,⁎⁎, Hans Joachim Schellnhuber a,h,
PMCID: PMC7532523  PMID: 33041465

Abstract

Global warming, extreme climate events, earthquakes and their accompanying socioeconomic disasters pose significant risks to humanity. Yet due to the nonlinear feedbacks, multiple interactions and complex structures of the Earth system, the understanding and, in particular, the prediction of such disruptive events represent formidable challenges to both scientific and policy communities. During the past years, the emergence and evolution of Earth system science has attracted much attention and produced new concepts and frameworks. In particular, novel statistical physics and complex-network-based techniques have been developed and implemented to substantially advance our knowledge of the Earth system, including climate extreme events, earthquakes and geological relief features, leading to markedly improved predictive performance. We present here a comprehensive review of the recent scientific progress in applying combined statistical physics and complex systems science approaches, such as critical phenomena, network theory, percolation, tipping points analysis and entropy, to complex Earth systems. Notably, these integrating tools and approaches provide new insights and perspectives for understanding the dynamics of the Earth system. The overall aim of this review is to offer readers the knowledge of how statistical physics concepts and theories can be usefully applied in the field of Earth system science.

Keywords: Statistical physics, Complex Earth systems, Complex network, Climate change, Earthquake

1. Introduction

1.1. The Earth as a complex system

The Earth behaves as an integrated system comprised of geosphere, atmosphere, hydrosphere, cryosphere as well as biosphere components, with nonlinear interactions and feedback loops between and within them [1]. These components can also be regarded as self-regulating systems in their own right, and can be further broken down into more specialized subsystems. Nevertheless, the growing understanding of the multi-component interactions between physical, chemical, biological and human processes suggests that one should bring different disciplines together and consider the Earth system as a whole. Such studies and results initiated the emergence of a new ‘science of the Earth’–Earth System Science (ESS) [2]. The ESS framework has already demonstrated its potential as a powerful tool for exploring how the Earth operates, dynamically and structurally, as a complex system.

ESS emerged in the early to mid-20th century and has developed rapidly during the last decades. Its historical evolution can be briefly outlined in four phases [2]: (i) Precursors and beginnings (pre-1970s). The systemic nature of Earth was mainly described and emphasized by some conceptualizations, such as Vernadsky’s biosphere concept that life has a strong influence on the physical and chemical properties of Earth [3], and Lovelock’s Gaia hypothesis that Earth is a synergistic and self-regulating, complex system [4]. These conceptualizations play vital roles in the contemporary understanding of the Earth system. In particular, the International Geophysical Year (IGY) 1957–1958 promoted the development of international science and led to the emergence of two contemporary paradigms, modern climatology and plate tectonics [5], [6]. (ii) Founding a new science (1980s). In the 1980s, a series of workshops and conference reports called for the newly emerging recognition of Earth as an integrated entity: the Earth system. In particular, the Bretherton diagram developed by the National Aeronautics and Space Administration (NASA) [7] was the first systems-dynamics representation of the Earth system to couple the physical climate system and the biogeochemical cycle. The Brundtland report, published by the World Commission on Environment and Development in 1987 [8], developed guiding principles for sustainable development and recognized the importance of environmental problems for the Earth system. Some international organizations were also established in this stage; for example, the World Climate Research Programme (WCRP) aims to determine the predictability of the climate and the effects of human activities on the climate. The International Geosphere–Biosphere Programme (IGBP) was launched in 1987 in order to coordinate international research on global-scale and regional-scale interactions between Earth’s biological, chemical and physical processes and their interactions with human systems. The Intergovernmental Panel on Climate Change (IPCC) was created in 1988 to provide policymakers with scientific assessments on climate change, its implications and potential future risks, as well as to put forward adaptation and mitigation options. (iii) Going global (1990s–2015). Due to the great research efforts of international programmes such as the IGBP, and the widespread use of the Bretherton diagram, ESS then developed rapidly, from the ‘new science of the Earth’ movement to a global one. In particular, the International Human Dimensions Programme (IHDP) on Global Environmental Change was founded with the aim to frame, develop and integrate social science research on global change. (iv) Contemporary ESS (beyond 2015). In the 21st century, the ESS framework was well established and initiated a new programme, Future Earth, which integrated several international organizations and programmes, such as the IGBP and IHDP. Future Earth’s mission is to accelerate transformations to global sustainability through research and innovation. Following Ref. [1], we highlight in Fig. 1 the key organizations, publications, campaigns and events that characterize the evolution of ESS.

Fig. 1. Timeline illustration of the development of Earth System Science (ESS). The figure shows the key organizations, publications, campaigns, events and concepts that have helped to define and develop the ESS.

Reprinted figure from Ref. [2].

In one of the pioneering publications in the evolution of ESS, Hans Joachim Schellnhuber introduced and developed a fundamental concept for ESS: the dynamic, co-evolutionary relationship between nature and human factors at the planetary scale [9]. He proposed that the Earth system E can be conceptually represented by the following mathematical form:

$E = (N, H)$, (1)

where $N = (a, b, c, \ldots)$ and $H = (A, S)$. This model suggests that the overall Earth system contains two main components: N stands for the ecosphere and H is interpreted as the human drivers. N is composed of an alphabet of intricately interacting planetary sub-spheres a (atmosphere), b (biosphere), c (cryosphere), etc. The human factor H is much more subtle: A denotes the ‘physical’ sub-component, while the ‘metaphysical’ sub-component S reflects the emergence of a ‘global subject’. This work provided the conceptual framework for fully integrating human dynamics into the Earth system and built a unified understanding of the Earth.

There exist numerous tools and approaches that support the evolutionary development of ESS. It is worth noting that they can be integrated into three interrelated foci: observations, modeling and computer simulations, and assessments and syntheses [2]. Earth observations usually refer to instrumental data, for example data collected by meteorological stations or by polar-orbiting and geostationary satellites. However, such datasets only extend, at best, about one to two centuries into the past. Climate proxies fill this gap and extend our reach back thousands or even millions of years. There are multiple proxy records, including coral records [10], marine sediments [11], stalagmite time series [12], tree rings [13] as well as the Vostok [14] and EPICA [15] ice core records, that help us to better understand the past Earth system. Mathematical models are currently key methods for understanding and projecting the climate and Earth systems. These models range from conceptual climate models, such as Energy Balance Models (EBMs) [16], to more complex Earth models, the General Circulation Models (GCMs) [17]. In addition, integrated assessment models (IAMs) were designed to take human dynamics into account as an integral component and aim to understand how human development and societal choices affect each other and the natural world, including climate change [18], [19]. Based on the GCMs, the Earth system Models of Intermediate Complexity (EMICs) were developed to investigate the Earth system on long timescales or at reduced computational cost [20]. These models provide knowledge integration to explore the dynamic properties and basic mechanisms of the Earth system across multiple space and time scales. The assessments and syntheses have also played crucial roles in building new scientific knowledge, linking the scientific and political communities, and facilitating new research fields based on the feedback from the political sector.

Some new concepts and theories have indeed arisen from the evolution of ESS, including the emerging concepts of sustainability [21], the Anthropocene [22], tipping elements [23] and the planetary boundaries framework [24], as well as planetary thresholds and state shifts [25]. Taken together, they create powerful ways to better understand and project the future trajectory of the Earth system.

1.2. Why Statistical Physics?

Yet, despite the rapid development of ESS, the nature and statistical properties of extreme climate events and earthquakes remain elusive and debated, and furthermore, the existence of early warning signals of these phenomena is still a major open question. As a consequence, in this review, we will present several novel approaches based on or stemming from statistical physics that could enhance our understanding of how the Earth system could evolve. In particular, the interdisciplinary perspective combining statistical physics and the Earth system leads to an improvement of the prediction skill for high-impact disruptive events within the climate and earthquake systems.

Statistical physics is a branch of physics that draws heavily on the laws of probability and the statistics of many interacting components. It can describe a wide variety of systems with an inherently stochastic nature, aiming to predict and explain measurable properties and behaviors of macroscopic systems. It has been applied to many problems in fields including physics, biology, chemistry, engineering and also the social sciences. Note that statistical physics does not focus on the dynamics of every individual particle but on the macroscopic behavior of a large number of particles. The basic theory and ideas of statistical physics are described in many textbooks, such as [26], [27], [28].

Sethna motivated the relationship between statistical physics and complex systems in his book [29] as follows: “Many systems in nature are far too complex to analyze directly. Solving for the motion of all the atoms in a block of ice – or the boulders in an earthquake fault, or the nodes on the Internet – is simply infeasible. Despite this, such systems often show simple, striking behavior. We use statistical mechanics to explain the simple behavior of complex systems”. A complex system is usually defined as a system composed of many components which interact with each other. Regarding the large variety of components and interrelations, the Earth system can thus be interpreted as an evolving complex system. The concepts and methods of statistical physics can infiltrate into ESS; in particular, (i) critical phenomena are analogous to tipping elements: a system close to a phase transition or tipping point can collapse and undergo a breakdown [30]. Critical phenomena and transitions exist widely in the Earth system [31], such as in atmospheric precipitation [32] and the percolation phase transition in sea ice [33]. (ii) The concept of fractals was used to describe the Earth’s relief, shape, coastlines and islands [34]. (iii) The earthquake process is regarded as a complex spatio-temporal phenomenon, and has been viewed within the self-organized criticality (SOC) paradigm [35]. Moreover, seismic activity exhibits scaling properties in both the temporal and spatial dimensions [36].

Complex network theory provides a powerful tool to study the structure, dynamics and function of complex systems [37], [38], [39], [40]. Meanwhile, statistical physics is a fundamental framework that has brought theoretical insights for understanding many properties of complex networks. From an applied perspective, statistical physics has led to the definition of null models for real-world networks that reproduce global and local features [41]. Complex network theory is an emerging multidisciplinary discipline that has been applied to many fields including mathematics, physics, biology, computer science, sociology, epidemiology and others [42]. Ideas from network science have also been successfully applied to the climate system, revealing interesting mechanisms underlying its functions and leading to the emergence of a new concept, the Climate Network (CN) [43]. The nodes in a CN represent the available, geographically localized time series, in particular on regular latitude–longitude grids, while the level of similarity and causality between the nonlinear climate records of different grid points represents the CN’s links [44], [45]. The development of this data-based approach is providing radically new ways to investigate the patterns and the dynamics of climate variability [46].

In this review, we will introduce how to apply statistical physics methods in Earth system science. In particular, we focus largely on the surface components of the Earth system, including the climate system, the Earth’s relief as well as the earthquake system, but not on the planetary interior.

1.3. Outline of the report

Apart from this introductory preamble, the report is organized into three further sections.

In the next Section, we start by offering the overall methodology that will accompany the rest of our discussion. Section 2 contains several topics that are essential but rather formal, including the CN approach, percolation theory, tipping points analysis, entropy theory and complexity. The reader will find there the attempt to define an overall theoretical framework encompassing the different situations and systems that will be later extensively treated.

Section 3 gives a comprehensive review of applications, especially those that were studied in climate systems, the Earth geometric surface relief and earthquake systems.

Finally, Section 4 presents our conclusive remarks and perspective ideas.

2. Methodology

2.1. Climate networks

For more than two decades, the complex network paradigm has demonstrated its great potential as a versatile tool for exploring dynamical and structural properties of complex systems, from a wide variety of disciplines in physics, biology, social science, economics, and many other fields [37], [40], [47], [48], [49], [50], [51], [52]. In the context of network theory, a complex network is a graph with non-trivial topological features which do not occur in simple networks such as regular lattices (Fig. 2a) or random Erdős–Rényi graphs (Fig. 2b). One of the novelties of complex network theory is that it can relate the topological characteristics to the function and dynamics of the system. Complex network theory has been successfully applied to many real world systems and revealed fundamental mechanisms underlying their functions. It has been also found that many real world networks are scale-free networks (Fig. 2c) [38], [53].

Fig. 2. Characteristics of the basic explored model networks. (a) A regular 2D square lattice is the most degree-homogeneous network. (b) An Erdős–Rényi (ER) network has a Poisson degree distribution and its degree heterogeneity is determined by the average degree $\langle k \rangle$. (c) A scale-free (SF) network has a power-law degree distribution, yielding large degree heterogeneity. The degree distributions are shown for (d) the 2D square lattice, (e) the ER network and (f) the SF network.

In recent years, the ideas of network theory have also been implemented in climate sciences to construct CNs [43], [44], [54]. In CNs the geographical locations (or grid points) are regarded as the nodes of the networks, and the level of similarity (causality) between the records (time series) of two grid points represents the links and their strength. Various climate data such as temperature, pressure, wind and precipitation can be used to construct a network. The climate network approach allows one to study the interrelationships between different locations on the globe and thus represents the global behavior of the climate system, e.g., how energy and matter are transferred from one location to another. These networks have been used successfully to analyze, model, understand, and even predict various climate phenomena [44], [45], [46], [55], [56], [57], [58], [59], [60], [61], [62], [63], [64], [65], [66], [67], [68], [69], [70]. A visualization of a climate network is shown in Fig. 3 as an example.

Fig. 3. Visualization of a climate network built from surface air temperature, taking into account the spatial embedding of vertices on the Earth’s surface.

Figure from Ref. [71].

We first provide an introduction of definitions, notations and basic quantities used to describe the topology of a network, and then present an overview on how to construct a climate network.

2.1.1. Network characteristics

In this Section we review basic characteristics and metrics used to describe and analyze networks, most of them come from graph theory [72]. Graph theory is a large field containing many branches but we present only a small fraction of those results here, focusing on the ones most relevant to the study of the complex Earth system. A network, also called a graph in the mathematical literature, is a collection of vertices connected by edges. Vertices and edges are also called nodes and links in computer science, sites and bonds in physics, and actors and ties in sociology [40]. Throughout this review we will denote the number of vertices by N and the number of edges by M, which is also a common notation in the mathematical and physics literature.

The adjacency matrix

Most of the networks we will study in this review have at most a single edge between any pair of vertices, i.e., self-edges or self-loops are not allowed. A simple and useful representation of such a network is the adjacency matrix A, with elements $A_{ij}$ such that

$A_{ij} = \begin{cases} 1, & \text{if there is an edge between vertices } i \text{ and } j,\\ 0, & \text{otherwise}. \end{cases}$ (2)

We should notice that the diagonal matrix elements are all zero in the adjacency matrix. In some situations it is useful to represent edges as having a strength, weight, or value to them, usually a real number. Such weighted or valued networks can be represented by giving the elements of the adjacency matrix values equal to the weights of the corresponding connections. A directed network or directed graph, also called a digraph for short, is a network in which each edge has a direction, pointing from one vertex to another. Such edges are themselves called directed edges. For this case, the adjacency matrix is usually not symmetric, and becomes

$A_{ij} = \begin{cases} 1, & \text{if there is an edge from } j \text{ to } i,\\ 0, & \text{otherwise}. \end{cases}$ (3)

Degree

The degree of a vertex in a network is the number of edges connected to it. We denote the degree of vertex i by $k_i$. For an undirected network of N vertices the degree can be written in terms of the adjacency matrix (Eq. (2)) as

$k_i = \sum_{j=1}^{N} A_{ij}$. (4)

The number of edges M is equal to the sum of the degrees of all the vertices divided by 2, so

$M = \frac{1}{2}\sum_{i=1}^{N} k_i = \frac{1}{2}\sum_{ij} A_{ij}$. (5)

The mean degree of a node in an undirected graph is $\langle k \rangle = \frac{1}{N}\sum_{i=1}^{N} k_i = \frac{2M}{N}$.

The concept of degree is more complicated in directed networks. In a directed network each vertex or node has two degrees. The in-degree $k_i^{\mathrm{in}}$ is the number of in-going edges connected to a node and the out-degree $k_j^{\mathrm{out}}$ is the number of out-going edges. From the adjacency matrix of a directed network, Eq. (3), they can be written as

$k_i^{\mathrm{in}} = \sum_{j=1}^{N} A_{ij}, \qquad k_j^{\mathrm{out}} = \sum_{i=1}^{N} A_{ij}$. (6)

Bear in mind that the number of edges in a directed network is equal to the total number of in-going ends of edges at all vertices, or equivalently to the total number of out-going ends of edges.
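As an illustration of these definitions (not part of the original formalism beyond Eqs. (2)–(6)), the following minimal Python sketch builds a small, hypothetical directed network as an adjacency matrix and computes its in- and out-degrees; the node labels and edge pattern are arbitrary.

```python
# A minimal sketch: representing a small directed network by its adjacency
# matrix and computing in- and out-degrees with NumPy.
import numpy as np

# Hypothetical 4-node directed network; A[i, j] = 1 if there is an edge from j to i,
# following the convention of Eq. (3).
A = np.array([[0, 1, 0, 1],
              [0, 0, 1, 0],
              [1, 0, 0, 0],
              [0, 1, 1, 0]])

k_in = A.sum(axis=1)   # in-degree of node i: sum over j of A[i, j], Eq. (6)
k_out = A.sum(axis=0)  # out-degree of node j: sum over i of A[i, j], Eq. (6)
M = A.sum()            # total number of directed edges

print("in-degrees :", k_in)
print("out-degrees:", k_out)
print("edges M    :", M)
```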

Degree distributions

One of the most fundamental properties of a network that can be measured directly is the degree distribution, or the fraction P(k) of nodes having k connections (degree k). A well-known result for the Erdős–Rényi [73] network is that its degree distribution follows a Poissonian,

$P(k) = e^{-z}\,\frac{z^{k}}{k!}$, (7)

where $z = \langle k \rangle$ is the average degree. As shown in Fig. 2e, despite the fact that the positions of the edges are random, a typical random graph is rather homogeneous, with most of the nodes having approximately the same number of edges.

However, direct measurements of the degree distribution for real networks, such as the Internet [74], the WWW [38], email networks [75], metabolic networks [76], airline networks [77], neuronal networks [78], and many more, show that the Poisson law does not apply. Instead, they exhibit an approximate power-law degree distribution

$P(k) = C k^{-\lambda}$, (8)

where C is a normalization factor. The constant λ is known as the exponent of the power law. Values in the range $2 \le \lambda \le 3$ are typical, although values slightly outside this range are possible and are observed occasionally [79]. Networks with power-law degree distributions are called scale-free networks. The simplest strategy to determine the scale-free property is to look at a histogram of the degree distribution on a log–log plot, as we did in Fig. 2f, to see if we have a straight line. Several models have been proposed for the evolution of scale-free networks, each of which may lead to a different ensemble. The first proposal was the preferential attachment model of Barabási and Albert, which is known as the Barabási–Albert model [38]. Several variants of this model have been suggested, see, e.g., Ref. [80].

Clustering coefficient

The extent to which nodes cluster together on very short scales in a network is measured by the clustering coefficient. The definition of clustering is related to the number of triangles in the network. The clustering is high if two nodes sharing a neighbor have a high probability of being connected to each other. The most common way of defining the clustering coefficient is:

$C = \frac{3 \times (\text{number of triangles})}{\text{number of connected triples}}$. (9)

Here a “connected triple” means three nodes uvw with edges (u,v) and (v,w). The factor of three in the numerator arises because each triangle is counted three times when the connected triples in the network are counted.

We can also get a local clustering coefficient Ci for a single vertex by defining

$C_i = \frac{\text{number of triangles connected to vertex } i}{\text{number of triples centered on vertex } i}$. (10)

That is, to calculate $C_i$, we go through all distinct pairs of vertices that are neighbors of i, count the number of such pairs that are connected to each other, and divide by the total number of pairs. $C_i$ represents the average probability that a pair of i’s friends are friends of one another. For vertices with degree 0 or 1, for which both numerator and denominator are zero, we set $C_i = 0$. The clustering coefficient for the whole network [37] is then the average $C = \frac{1}{N}\sum_i C_i$. In both cases, the clustering lies in the range $0 \le C \le 1$.
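As a hedged illustration of Eqs. (9) and (10), the following Python sketch (assuming the networkx library is available) computes the triangle-based global clustering coefficient and the average of the local coefficients for a small random graph; the graph parameters are arbitrary.

```python
# A minimal sketch checking the triangle-based clustering definitions on an ER graph.
import networkx as nx

G = nx.erdos_renyi_graph(n=200, p=0.05, seed=1)

# Global clustering via Eq. (9): 3 x (number of triangles) / (number of connected triples).
C_global = nx.transitivity(G)

# Average of the local coefficients C_i, Eq. (10); nodes with degree 0 or 1 count as C_i = 0.
C_avg = nx.average_clustering(G, count_zeros=True)

print("transitivity (Eq. 9):       %.3f" % C_global)
print("mean local clustering C:    %.3f" % C_avg)
print("expectation for ER, p=0.05: 0.050")  # for ER graphs the clustering is ~p
```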

Subgraphs

A graph $G_1$ consisting of a set $P_1$ of nodes and a set $E_1$ of edges is a subgraph of a graph $G = \{P, E\}$ if all nodes in $P_1$ are also nodes of P and all edges in $E_1$ are also edges of E. The simplest examples of subgraphs are cycles, trees, and complete subgraphs [81]. A cycle of order k is a closed loop of k edges such that every two consecutive edges, and only those, have a common node. That is, graphically a triangle is a cycle of order 3, while a rectangle is a cycle of order 4. The average degree of a cycle is equal to 2, since every node has two edges. A tree, as shown in Fig. 4a, is a connected, undirected network that contains no closed loops. Here, by “connected” we mean that every node in the network is reachable from every other via some path through the network. A river network is an example of a naturally occurring tree with directed links [82]. The most important property of trees is that, since they have no closed loops, there is exactly one path between any pair of nodes. Thus a tree of order k always contains exactly $k-1$ edges. Complete subgraphs of order k, also called k-cliques, contain k nodes and all the possible $k(k-1)/2$ edges—in other words, they are completely connected [see Fig. 4b].

Fig. 4. Illustration of the concept of subgraphs. Examples of (a) a tree network, (b) a fully connected (complete) network, (c) overlapping k-clique communities at k = 4.

(c): Reprinted figure from Ref. [84].

Let us consider the ER model, in which we start from N isolated nodes and then connect every pair of nodes with a probability p. Most generally, we can ask whether there is a critical probability that marks the appearance of arbitrary subgraphs consisting of k nodes and l edges. A few important special cases that directly answer this question [81] are: (i) the threshold probability of having a tree of order k is $p_c(N) = c N^{-k/(k-1)}$; (ii) the threshold probability of having a cycle of order k is $p_c(N) = c N^{-1}$; (iii) the critical probability of having a complete subgraph of order k is $p_c(N) = c N^{-2/(k-1)}$.

Generally, a network is always composed of many separate subgraphs or components, i.e., groups of nodes connected internally, but disconnected from others. In each such component there exists a path between any two nodes, but there is no path between nodes in different components. A network component with size proportional to that of the entire network, N, is called a giant component.

Similar to the notion of a component, the concept of community is also one of the most fundamental properties of complex networks [83]. The nodes of the network might be joined together into tightly connected groups, between which there are still links but fewer connections. A single node of the network may belong to more than one community, and most real networks are made of highly overlapping communities of nodes [84]. Since community structures are quite common in real networks, community finding and detection are thus of great importance for better understanding the function of a network. An example of overlapping k-clique communities is shown in Fig. 4c. A good review on community detection in graphs can be found in Ref. [85].

Network models

Erdős–Rényi model: It can be interpreted through the following two well-studied graph ensembles: $G_{N,M}$, the ensemble of all graphs having N vertices and M edges, and $G_{N,p}$, the ensemble consisting of graphs with N vertices where each possible edge is realized/connected with probability p. These two families, initially studied by Erdős and Rényi [73], are known to be similar if $M = \binom{N}{2} p$. They are referred to as the Erdős–Rényi (ER) model. These descriptions are quite similar to the microcanonical and canonical ensembles studied in statistical physics [86]. The ER model has traditionally been the dominant subject of study in the field of random graphs [81], with a Poissonian degree distribution, see Eq. (7).

Barabási–Albert model: It is based on two simple assumptions regarding network evolution [38]. (i) Growth: Starting with a small number $m_0$ of nodes, at every time step a new node with $m \le m_0$ edges is added, linking the new node to m different nodes already present in the system. (ii) Preferential attachment: This is the heart of the model. When choosing the nodes to which the new node will be connected, the probability Π that the new node will be connected to node i depends on the degree $k_i$ of node i, such that $\Pi(k_i) = k_i / \sum_j k_j$. After t time steps this process results in a network with $N = t + m_0$ nodes and $mt$ edges. Theoretical and numerical results show that this network model evolves into a power-law degree distribution, see Eq. (8).

Watts–Strogatz model:

In 1998, Watts and Strogatz [37] proposed a one-parameter model that interpolates between a regular lattice and a random graph. The details of the model are the following: (i) Start with order: Start with a ring lattice with N nodes in which every node is connected to its closest k neighbors. In order to have a sparse but connected network at all times, consider $N \gg k \gg \ln(N) \gg 1$. (ii) Randomize: Randomly rewire each edge of the lattice with probability p such that self-connections and duplicate edges are excluded. This process introduces $pNk/2$ long-range edges. The resulting network properties of this model are the small-world effect and a high clustering coefficient. Specifically, a small-world network refers to a network where the characteristic path length L grows proportionally to the logarithm of the number of nodes, $L \propto \log N$ [37].
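The three model networks above can be generated with standard library routines; the following sketch (assuming networkx, with arbitrary parameter choices) compares their basic degree and clustering statistics.

```python
# A minimal sketch generating the ER, BA and WS model networks of Section 2.1.1.
import networkx as nx
import numpy as np

N = 1000
er = nx.erdos_renyi_graph(N, p=4.0 / N, seed=0)      # Poisson degree distribution, <k> ~ 4
ba = nx.barabasi_albert_graph(N, m=2, seed=0)        # preferential attachment, power-law tail
ws = nx.watts_strogatz_graph(N, k=4, p=0.1, seed=0)  # small world: high clustering, short paths

for name, G in [("ER", er), ("BA", ba), ("WS", ws)]:
    degrees = np.array([d for _, d in G.degree()])
    print("%s: <k>=%.2f  k_max=%d  clustering=%.3f"
          % (name, degrees.mean(), degrees.max(), nx.average_clustering(G)))
```

The large maximum degree of the BA network relative to ER and WS reflects the heavy-tailed degree distribution of Eq. (8).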

2.1.2. Pearson correlation climate network

Our climate system is made up of an enormous number of nonlinear sub-systems with mutual nonlinear interactions and feedback loops active on a wide range of temporal and spatial scales. Therefore, modeling the Earth climate system from the point of view of complex networks can clearly provide critical insights into the underlying dynamics of the evolving climate system. As discussed above, in CNs, geographical regions of Earth are regarded as nodes, and the bivariate statistical analysis of similarities between pairs of climatological time series represents the links.

In general, there are five steps for CN construction and analysis [46]; the procedure is displayed in Fig. 5, adapted from [87]. Step (i): Nodes. The nodes in CNs are usually defined as locations in a longitude–latitude spatial grid at various resolutions. (ii): Climatological variable. We select the suitable climatological time series to be analyzed, e.g., surface air temperature, sea surface temperature, precipitation, wind, etc. Some pre-processing is also often needed. For example, to avoid the strong effect of seasonality, we usually subtract the mean seasonal cycle and divide by the seasonal standard deviation for each grid point time series. (iii): Edges. In this step we compute the statistical similarity that quantifies the interdependencies between pairs of time series. The strength of each edge is correlation based. There are many measures for quantifying the interdependencies of time series; here we distinguish the types of CNs by correlation measure. For example, a Pearson correlation climate network uses the Pearson correlation to quantify the cross-correlations between time series, an event synchronization climate network uses the event synchronization method, and a mutual information climate network is based on mutual information. (iv): Construction. In this step we construct the CN, which typically involves some thresholding criterion to select only statistically significant edges. Considering, for example, that the significant links are those with weights $W_{ij}$ above a given threshold $W_c$, the adjacency matrix is determined as

$A_{ij} = H(W_{ij} - W_c)$, (11)

where H is the Heaviside step function. We then investigate the obtained network by using various network characteristics. Finally, in step (v): Climatological interpretation, the results of this analysis are interpreted in terms of dynamical processes of the climate system (e.g., atmospheric circulation, ocean currents, planetary waves, etc.).
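As a minimal illustration of step (iv), the sketch below applies the thresholding of Eq. (11) to a purely synthetic, symmetric weight matrix; the matrix and the 95th-percentile threshold are illustrative assumptions, not values used in the studies reviewed here.

```python
# A minimal sketch of step (iv): thresholding a similarity (weight) matrix W to obtain
# the climate-network adjacency matrix of Eq. (11). W here is random, purely for illustration.
import numpy as np

rng = np.random.default_rng(0)
n_nodes = 50
W = rng.random((n_nodes, n_nodes))        # hypothetical link weights W_ij
W = (W + W.T) / 2                         # make the similarity symmetric
np.fill_diagonal(W, 0.0)

W_c = np.quantile(W[np.triu_indices(n_nodes, 1)], 0.95)   # keep the top 5% of links
A = (W > W_c).astype(int)                 # A_ij = H(W_ij - W_c)

print("threshold W_c = %.3f, link density = %.3f"
      % (W_c, A.sum() / (n_nodes * (n_nodes - 1))))
```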

Fig. 5. Methodology used to construct a climate network from climatic time series.

Reprinted figure from Ref. [87].

In the following, we will demonstrate how to construct a Pearson correlation climate network. Suppose that the climatic variable is the daily surface air temperature, either from observations, proxy reconstructions, reanalyses, or simulations, gathered at static measurement stations, or provided at grid cells. At each node i of the network, we calculate the daily atmospheric temperature anomalies $T_i(t)$ (actual temperature value minus the climatological average, divided by the climatological standard deviation) for each calendar day.

To obtain the time evolution of the weight of the link between a pair of nodes i and j, we follow [66] and compute, for each time window (such as a month or a year) t over the whole time span, the time-delayed cross-correlation function defined as

$C_{i,j}^{(t)}(\tau) = \frac{\left\langle T_i^{(t)}(t)\, T_j^{(t)}(t-\tau) \right\rangle - \left\langle T_i^{(t)}(t) \right\rangle \left\langle T_j^{(t)}(t-\tau) \right\rangle}{\sqrt{\left\langle \left( T_i^{(t)}(t) - \left\langle T_i^{(t)}(t) \right\rangle \right)^2 \right\rangle} \cdot \sqrt{\left\langle \left( T_j^{(t)}(t-\tau) - \left\langle T_j^{(t)}(t-\tau) \right\rangle \right)^2 \right\rangle}}$, (12)

and

$C_{i,j}^{(t)}(-\tau) = \frac{\left\langle T_i^{(t)}(t-\tau)\, T_j^{(t)}(t) \right\rangle - \left\langle T_i^{(t)}(t-\tau) \right\rangle \left\langle T_j^{(t)}(t) \right\rangle}{\sqrt{\left\langle \left( T_i^{(t)}(t-\tau) - \left\langle T_i^{(t)}(t-\tau) \right\rangle \right)^2 \right\rangle} \cdot \sqrt{\left\langle \left( T_j^{(t)}(t) - \left\langle T_j^{(t)}(t) \right\rangle \right)^2 \right\rangle}}$, (13)

where the brackets denote an average over the past T days, according to

$\langle f(t) \rangle = \frac{1}{T}\sum_{a=1}^{T} f(t-a)$. (14)

Here, τ is the time lag, spanning the range $[0, \tau_{\max}]$ days. A reliable estimate of the background noise level, i.e., the appropriate value of $\tau_{\max}$, was discussed in [88], [89]. Based on the correlation functions, Eqs. (12), (13), a weighted and directed link between nodes i and j was defined in Refs. [59], [61], [64]. This is done by identifying the value of the highest peak (or lowest valley) of the cross-correlation function and denoting the corresponding time lag of this peak (valley) as $\theta_{i,j}$. The sign of $\theta_{i,j}$ indicates the direction of each link; i.e., when the time lag is positive ($\theta_{i,j} > 0$), the direction of the link is from j to i [64]. The positive and negative links and their weights are determined via $C_{i,j}(\tau)$, and are defined as

$W_{i,j}^{+} = \frac{\max(C_{i,j}(\tau)) - \mathrm{mean}(C_{i,j}(\tau))}{\mathrm{std}(C_{i,j}(\tau))}$, (15)

and

$W_{i,j}^{-} = \frac{\min(C_{i,j}(\tau)) - \mathrm{mean}(C_{i,j}(\tau))}{\mathrm{std}(C_{i,j}(\tau))}$, (16)

where “max” and “min” are the maximum and minimum values of the cross-correlation function, and “mean” and “std” are its mean and standard deviation. Typical time series and their cross-correlation functions are shown in Fig. 6. For demonstration, two nodes are selected: node i is located in the Southwest Atlantic and node j is on the South American continent (Fig. 6a). The near-surface daily air temperature anomalies are shown in Fig. 6b for the time period from 2014 to 2018. The cross-correlation function between the time series is shown in Fig. 6c, where the absolute value of the maximum of the cross-correlation function is much larger than the noise level and than the absolute value of the minimum. Since $\theta_{i,j} = 2 > 0$, the direction of the link is from j to i. According to Eq. (15), we calculate its weight to be $W_{i,j}^{+} = 5.71$. This value represents 5.71 standard deviations above the noise level, i.e., highly significant.

Fig. 6. Typical weighted and directed link in a Pearson correlation climate network. (a) Node i is located in the Southwest Atlantic and node j is on the South American continent. (b) The near-surface daily air temperature anomalies for the period 2014–2018. (c) The cross-correlation function between the time series shown in (b). The direction of this link is from j to i with weight $W_{i,j}^{+} = 5.71$.

Finally, networks can be constructed by establishing links between pairs of nodes with weights larger than some significant threshold. The advantage of this method is that it overcomes the problem of strong auto-correlation values in the data [88].
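The link definition of Eqs. (12)–(16) can be illustrated with the following Python sketch. It uses synthetic anomaly series rather than reanalysis data, considers only positive lags, and uses a plain Pearson estimator; the 2-day lag and all parameter values are illustrative assumptions.

```python
# A minimal sketch of the lagged cross-correlation link weight of Eqs. (12)-(16),
# applied to synthetic anomaly series.
import numpy as np

rng = np.random.default_rng(42)
T = 365
noise_i = rng.normal(size=T + 20)
T_j = rng.normal(size=T + 20)
# node i "receives" the signal of node j with a 2-day delay plus noise
T_i = 0.7 * np.roll(T_j, 2) + 0.7 * noise_i

def lagged_corr(x, y, tau_max=20):
    """Pearson correlation of x(t) with y(t - tau) for tau = 0..tau_max."""
    corrs = []
    for tau in range(tau_max + 1):
        xs, ys = x[tau_max:], y[tau_max - tau:len(y) - tau]
        corrs.append(np.corrcoef(xs, ys)[0, 1])
    return np.array(corrs)

C = lagged_corr(T_i, T_j)                # C_{i,j}(tau) for positive lags, cf. Eq. (12)
theta = int(np.argmax(np.abs(C)))        # lag of the dominant peak
W_plus = (C.max() - C.mean()) / C.std()  # Eq. (15): peak height relative to the noise

print("theta_ij = %d days, W+ = %.2f" % (theta, W_plus))
```

With these synthetic parameters the recovered lag is 2 days and the weight is several standard deviations above the noise level, mimicking the significant link of Fig. 6.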

2.1.3. Event synchronization climate network

The event synchronization method provides an alternative way to construct a network from climate observations: the event synchronization CN. It was originally proposed by Quian Quiroga et al. [90], who measured synchronization and inferred time delays between signals in neuroscience. Event synchronization is based on the relative timing of events, such as rainfall, in a time series, where an event is defined, e.g., by the crossing of a threshold or by local maxima or minima. For instance, an extreme rainfall event can be defined as a day whose precipitation is above the 99th percentile of all days. Event synchronization is especially appropriate for studying extreme events. The degree of synchronization is obtained from the number of quasi-simultaneous events, and the delay is calculated from the precedence of events in one time series (signal) with respect to the other.

Event synchronization starts by constructing two event series from two discrete univariate time series, X and Y. An event l that occurs in X at time $t_l^x$ is considered to be synchronized with an event m that occurs in Y at time $t_m^y$ within a time lag $\pm\tau_{lm}^{xy}$, if $0 \le |t_l^x - t_m^y| < \tau_{lm}^{xy}$, where

$\tau_{lm}^{xy} = \frac{1}{2}\min\left\{ t_{l+1}^{x} - t_{l}^{x},\; t_{l}^{x} - t_{l-1}^{x},\; t_{m+1}^{y} - t_{m}^{y},\; t_{m}^{y} - t_{m-1}^{y} \right\}$. (17)

Here, $l = 1, 2, \ldots, e_x$ and $m = 1, 2, \ldots, e_y$, where $e_x$ and $e_y$ are the numbers of events in X and Y, respectively. The number of times an event appears in X shortly after it occurs in Y is counted as:

$c(x|y) = \sum_{l=1}^{e_x}\sum_{m=1}^{e_y} J_{lm}^{xy}$ (18)

with

$J_{lm}^{xy} = \begin{cases} 1, & \text{if } 0 < t_l^x - t_m^y \le \tau_{lm}^{xy},\\ \frac{1}{2}, & \text{if } t_l^x = t_m^y,\\ 0, & \text{else}, \end{cases}$ (19)

and analogously to Eq. (18) we can define $c(y|x)$. Finally, the symmetrical and anti-symmetrical combinations

$Q^{xy} = \frac{c(y|x) + c(x|y)}{\sqrt{e_x e_y}}, \qquad q^{xy} = \frac{c(y|x) - c(x|y)}{\sqrt{e_x e_y}}$, (20)

are used to measure the synchronization of the events and their delay behavior, respectively. They are normalized such that $0 \le Q^{xy} \le 1$ and $-1 \le q^{xy} \le 1$. It should be noted that if several extreme events are very close in one record, then only the first one is considered. In particular, $Q^{xy} = 1$ if and only if the events of the signals are fully synchronized. In addition, if the events in X always precede those in Y, then we get $q^{xy} = 1$.

The event synchronization method has been found useful for the study of electroencephalogram signals [90], neurophysiological signals [91], and the patterns of extreme rainfall events [92], [93], [94], [95], [96].
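The following Python sketch illustrates Eqs. (17)–(20) for two short, hypothetical lists of event days; for simplicity it skips the first and last event of each series (where the local time lag of Eq. (17) is undefined), so it is an approximation of the full method rather than a reference implementation.

```python
# A minimal sketch of the event-synchronization measure of Eqs. (17)-(20).
import numpy as np

def event_sync(tx, ty):
    """Return (Q, q) for sorted event times tx, ty; boundary events are skipped."""
    tx, ty = np.asarray(tx, float), np.asarray(ty, float)
    def count(ta, tb):               # c(a|b): events in a shortly after events in b
        c = 0.0
        for l in range(1, len(ta) - 1):
            for m in range(1, len(tb) - 1):
                tau = min(ta[l + 1] - ta[l], ta[l] - ta[l - 1],
                          tb[m + 1] - tb[m], tb[m] - tb[m - 1]) / 2.0   # Eq. (17)
                dt = ta[l] - tb[m]
                if dt == 0:
                    c += 0.5                                            # Eq. (19)
                elif 0 < dt <= tau:
                    c += 1.0
        return c
    norm = np.sqrt(len(tx) * len(ty))
    Q = (count(ty, tx) + count(tx, ty)) / norm                          # Eq. (20)
    q = (count(ty, tx) - count(tx, ty)) / norm
    return Q, q

tx = [10, 25, 40, 61, 80]            # hypothetical event days at node X
ty = [11, 26, 42, 60, 83]            # hypothetical event days at node Y
print("Q = %.2f, q = %.2f" % event_sync(tx, ty))
```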

2.1.4. Mutual information climate network

The mutual information method is also one of the common tools for quantifying the interdependencies between time series and for constructing climate networks. The mutual information of two random variables is a measure of the mutual dependence between the two variables. The concept of mutual information is intricately linked to that of entropy of a random variable [97]. Let X and Y be a pair of random variables with values over the space X×Y. If their joint distribution is p(x,y), and their associated marginal probability distribution functions are p(x) and p(y), then the mutual information is defined as

$MI = \sum_{x}\sum_{y} p(x,y)\log\frac{p(x,y)}{p(x)\,p(y)}$. (21)

States with zero probability of occurrence are ignored.

Intuitively, MI is a measure of the inherent dependence expressed in the joint distribution of X and Y relative to the joint distribution under the assumption of independence. Mutual information, therefore, measures dependence, including nonlinear dependence; in fact, $MI = 0$ if and only if X and Y are independent random variables, i.e., $p(x,y) = p(x)p(y)$. Moreover, mutual information is non-negative, symmetric and can also be computed with a time lag [98]. After we obtain the mutual information coefficient, Eq. (21), for each pair of nodes, we can construct a mutual information climate network. It has been shown that the mutual information CN can well capture the westward propagation of sea surface temperature (SST) anomalies that occur in the Atlantic multidecadal oscillation [99].
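A simple way to estimate Eq. (21) from data is binning; the sketch below uses a 2D histogram estimator with an arbitrary number of bins and synthetic series, and is only meant to illustrate that MI detects nonlinear dependence that a linear correlation might miss.

```python
# A minimal sketch estimating the mutual information of Eq. (21) via a 2D histogram.
import numpy as np

def mutual_information(x, y, bins=16):
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()                         # joint distribution p(x, y)
    px = pxy.sum(axis=1, keepdims=True)      # marginal p(x)
    py = pxy.sum(axis=0, keepdims=True)      # marginal p(y)
    mask = pxy > 0                           # states with zero probability are ignored
    return float(np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])))

rng = np.random.default_rng(1)
x = rng.normal(size=5000)
y_indep = rng.normal(size=5000)
y_dep = np.sin(2 * x) + 0.3 * rng.normal(size=5000)   # nonlinear dependence on x

print("MI(independent) = %.3f nats" % mutual_information(x, y_indep))
print("MI(dependent)   = %.3f nats" % mutual_information(x, y_dep))
```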

To summarize, in this section we have described different network characteristics and presented various linear and nonlinear tools of time series analysis, which can be used to construct, define and characterize CNs. The linear and nonlinear methods include the Pearson correlation, event synchronization, and information-theoretic measures such as entropy and mutual information. There are also other powerful tools, such as spectral analysis, empirical orthogonal function analysis and symbolic ordinal analysis, that can be used to construct CNs [46].

2.2. Percolation theory

Percolation was originally used to describe the movement and filtering of fluids through porous materials [100]. The idea of percolation was also considered by Flory and Stockmayer to describe polymerization and the sol–gel transition [101]. In recent years, it has become a cornerstone of the theory of spatial stochastic processes, with broad applications to diverse problems in fields such as statistical physics [102], [103], [104], [105], phase transitions [106], [107], materials [108], [109], epidemiology [110], [111], [112], [113], networks [30], [38], [40], [114], [115], [116], [117], [118], [119], [120], [121], colloids [122], [123], semiconductors [124], traffic [125], turbulence [126], as well as Earth systems [33], [65], [67], [127], [128], [129], [130]. Percolation theory also plays a pivotal role in studying the structure, robustness and function of complex systems [131].

It is well known that in lattices and other ordered networks [106], [107], [132], for dimensions greater than one, a percolation phase transition occurs. The percolation process is a simple model in which the nodes (sites) or edges (bonds) are occupied with some probability p and unoccupied with probability q = 1 − p. Take a regular lattice as an example (see Fig. 2a). A system is regarded as percolating if there is a path from one side to the opposite one, passing only through occupied bonds and sites. When such a path exists, the component or cluster of sites that spans the lattice is called the spanning cluster or the infinite percolation cluster. At low concentration p, the sites are either isolated or form small clusters of nearest-neighbor sites. Two sites belong to the same cluster if they are connected by a path of nearest-neighbor sites. At large p values, on the other hand, at least one path between opposite sides exists (see Fig. 7). The percolation phase transition occurs at some critical density $p_c$ that depends on the type and dimensionality of the lattice.

Fig. 7. Bond percolation clusters on a 512 × 512 2D square lattice for p = (a) 0.3, (b) 0.5 and (c) 0.55, respectively. Different cluster sizes have different colors. The color of the larger clusters depends on their size and varies from blue (small clusters) to red (infinite cluster). The percolation threshold is $p_c = 0.5$.

For many complex networks, the notion of a side does not exist. However, the ideas and tools of percolation theory can still be applied [51]. The main difference is that the condition for percolation is no longer the existence of a spanning cluster, but rather the existence of a cluster containing O(N) nodes. If such a component exists, we call it a giant component, as discussed in Section 2.1. Indeed, the condition of the existence of a giant component applies also to lattice networks, and therefore it can be used as a more general condition than the spanning property.

There exist two types of percolation: site percolation, where the sites are occupied with probability p and empty with probability 1 − p, and bond percolation, where it is the bonds that are occupied with probability p; above $p_c$ the occupied bonds form a giant component of connected sites. In this review, we will focus on the critical phenomena of percolation near the percolation threshold $p_c$, where for the first time a giant component is formed or where the system collapses [133]. These aspects are called critical phenomena, and the fundamental theory to describe them is the scaling theory of phase transitions.

2.2.1. Phase transition

The concept of phase transition is usually used to describe transitions between solid, liquid, and gaseous states of matter for thermodynamic physical systems, where an ordered phase (e.g., solid) changes into a disordered phase (e.g., liquid) at some critical temperature Tc [26]. Another classical example of a phase transition is the magnetic phase transition, explained by the Ising model, where a spontaneous magnetization m>0 appears, at low temperature without any external field. While increasing temperature, the spontaneous magnetization decreases and vanishes at Tc.

Percolation is indeed a simple geometrical phase transition, where the percolation threshold $p_c$ distinguishes a phase of finite clusters (disordered phase) from a phase containing an infinite cluster (ordered phase). The occupation probability p of sites or bonds plays the same role as the temperature in the thermal phase transition. They are usually called control parameters in statistical physics.

Percolation order parameter

The percolation order parameter $P_\infty$ is the relative size of the infinite cluster, defined as the fraction of the sites belonging to the infinite cluster. For $p < p_c$, there exist only finite clusters, and thus $P_\infty = 0$. For $p > p_c$, however, $P_\infty$ is analogous to the magnetization below $T_c$ and behaves near $p_c$ as a power law

$P_\infty \sim (p - p_c)^{\beta}$. (22)

It describes the order in the percolation system and is therefore called the order parameter. We show in Fig. 8 how $P_\infty$ behaves as a function of p, for both bond and site percolation on the 2D square lattice, as well as for site percolation on the ER network. It is worth noting that we adopt p as the control parameter in Fig. 8, which results in $P_\infty$ being a monotonically increasing function. When p is gradually decreased (from 1), this can be regarded as removal of (or attack on) nodes or links, and the system collapses after removing a fraction $1 - p_c$. Thus, $1 - p_c$ can be a measure of the resilience of the system, i.e., of how close it is to collapse.
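The behavior shown in Fig. 8c can be reproduced qualitatively with a few lines of code; the following sketch (assuming networkx, with an arbitrary system size) performs site percolation on an ER network with average degree 4 and prints the relative size of the largest component for several occupation probabilities p.

```python
# A minimal sketch of site percolation on an ER network with <k> = 4 (p_c = 1/<k> = 0.25):
# keep each node with probability p and measure the relative size of the largest component.
import networkx as nx
import numpy as np

N, k_mean = 5000, 4.0
G = nx.erdos_renyi_graph(N, k_mean / N, seed=0)
rng = np.random.default_rng(0)

for p in [0.1, 0.2, 0.25, 0.3, 0.5, 0.8]:
    keep = [n for n in G.nodes if rng.random() < p]     # occupy each site with probability p
    H = G.subgraph(keep)
    giant = max(nx.connected_components(H), key=len) if H.number_of_nodes() else set()
    print("p = %.2f   P_inf = %.3f" % (p, len(giant) / N))
```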

Fig. 8. Percolation phase transitions in lattice and network systems. The percolation order parameter $P_\infty$ as a function of the occupation probability p for (a) 2D bond percolation with $p_c = 0.5$, (b) 2D site percolation with $p_c \approx 0.593$ and (c) an ER network with $p_c = 0.25$. Here the average degree is $\langle k \rangle = 4$ for the ER network.

Cluster size distribution

The finite components distribution near criticality follows the scaling form [106], [107]

$n_s \sim s^{-\tau} e^{-s/s_\xi}$, (23)

where s is the component size and $n_s$ represents the number of components of size s. At the percolation threshold, $s_\xi \sim |p - p_c|^{-1/\sigma}$ diverges, and the tail of the distribution behaves as a power law; σ is a critical exponent.

Average cluster size

For any given site, the probability that it belongs to a cluster of size s is $s\, n_s$. Let us define $\rho(p)$ as the probability that any given site is part of a finite cluster,

$\rho(p) = \sum_{s=1}^{\infty} s\, n_s(p) = M_1(p)$, (24)

which is the first moment of the cluster size distribution. Hence, for any given site of any given finite cluster, the probability ws(p) that the cluster is of size s is

$w_s(p) = \frac{s\, n_s(p)}{\rho(p)}$, (25)

with $\sum_{s=1}^{\infty} w_s(p) = 1$. For any given site of any given finite cluster, the average size $\langle s(p) \rangle$ of the cluster is

$\langle s(p) \rangle = \sum_{s=1}^{\infty} s\, w_s(p) = \frac{1}{\rho(p)}\sum_{s=1}^{\infty} s^2 n_s(p) = \frac{M_2(p)}{M_1(p)}$. (26)

Note that the average size $\langle s(p) \rangle$, excluding the infinite cluster, diverges near the critical point as

$\langle s \rangle \sim |p - p_c|^{-\gamma}, \qquad p \to p_c$. (27)

Correlation length

The correlation function g(r) is the probability that a site at position r from an occupied site in a finite cluster belongs to the same cluster.

Typically, for large $r \equiv |\mathbf{r}|$, there is an exponential cutoff, i.e., $g(r) \sim e^{-r/\xi}$, governed by the correlation length ξ. The correlation length ξ is defined as

$\xi^2 = \frac{\sum_{\mathbf{r}} r^2 g(\mathbf{r})}{\sum_{\mathbf{r}} g(\mathbf{r})}$. (28)

ξ measures the mean distance between two sites on the same finite cluster. When p approaches pc, ξ increases as

$\xi \sim |p - p_c|^{-\nu}$. (29)

The quantities $P_\infty$ and $\langle s \rangle$ are analogous to the magnetization m and the susceptibility χ in magnetic systems. β, γ and ν are called the critical exponents and describe the critical behavior of the percolation phase transition.

2.2.2. Structural properties

Next, we will briefly introduce some fundamental measurements that are used to characterize the structural properties of a percolation cluster. For more details, the readers can refer to Ref. [106].

Fractal dimension

The fractal concept was introduced into the physical world by Mandelbrot [134] and applied to percolation by Stanley [135] to describe the cluster shapes at the percolation threshold pc. The infinite cluster is self-similar on all length scales, and can be regarded as a fractal. The fractal dimension df is defined as how, on average, the mass M of the cluster within a sphere of radius r from a site of the cluster changes with r,

$M(r) \sim r^{d_f}$. (30)

For length scales smaller than the correlation length ξ, a fractal structure exists. For length scales larger than ξ, the cluster becomes homogeneous. This can be summarized as

$M(r) \sim \begin{cases} r^{d_f}, & r \ll \xi,\\ r^{d}, & r \gg \xi. \end{cases}$ (31)

The relative size of the giant component, $P_\infty$, can also be expressed as

$P_\infty \sim \frac{r^{d_f}}{r^{d}} = \frac{\xi^{d_f}}{\xi^{d}}, \qquad r < \xi$. (32)

The fractal dimension $d_f$ of percolation clusters can be related to the critical exponents β and ν. Substituting Eqs. (22), (29) into Eq. (32) yields $d_f = d - \beta/\nu$.

Fractal analysis has also been applied in the study of complex networks [136]. Generally, there are two basic methods to calculate the fractal dimension of a given system, i.e., the box counting method [137] or the cluster growing method [138]. (i) The box counting method: Let $N_B$ be the number of boxes of linear size $l_B$ that are needed to cover the object; the fractal dimension $d_f$ is then given by $N_B \sim l_B^{-d_f}$. Equivalently, the average number of nodes $M_B(l_B)$ within a box of size $l_B$ is

$M_B(l_B) \sim l_B^{d_f}$. (33)

The fractal dimension $d_f$ can be obtained by a power-law fit. (ii) The cluster growing method: Similar to Eq. (30), the dimension $d_f$ can be calculated by

$M_C \sim \ell^{d_f}$, (34)

where $M_C$ is the average mass of the cluster, defined as the average number of nodes within a distance $\ell$ of a randomly chosen node of the cluster. Note that the above methods are difficult to apply directly to networks, since networks are generally not embedded in space. In order to measure the fractal dimension of networks one usually combines them with the concept of renormalization [136], i.e., for each box size $l_B$, boxes are placed until the network is covered. Then each box is replaced by a node (renormalization), and the renormalized nodes are connected if there is at least one link between the un-renormalized boxes. This procedure is repeated, and the number of boxes $N_B$ scales with $l_B$ as $N_B \sim l_B^{-d_f}$.
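The box counting method of Eq. (33) is illustrated below on a deterministic Sierpinski carpet, whose fractal dimension is known analytically (log 8/log 3 ≈ 1.893); the pattern is only a stand-in, and for percolation one would pass the largest cluster at $p_c$ instead.

```python
# A minimal sketch of the box-counting method, Eq. (33), applied to a binary 2D pattern.
import numpy as np

def box_count(img, box_sizes):
    counts = []
    for lb in box_sizes:
        n = 0
        for i in range(0, img.shape[0], lb):
            for j in range(0, img.shape[1], lb):
                if img[i:i + lb, j:j + lb].any():   # a box is counted if it covers any occupied site
                    n += 1
        counts.append(n)
    return np.array(counts)

# build a deterministic Sierpinski carpet of size 3^5 (exact d_f = log 8 / log 3)
carpet = np.ones((1, 1), dtype=bool)
for _ in range(5):
    carpet = np.block([[carpet, carpet, carpet],
                       [carpet, np.zeros_like(carpet), carpet],
                       [carpet, carpet, carpet]])

sizes = np.array([1, 3, 9, 27, 81])
counts = box_count(carpet, sizes)
d_f = -np.polyfit(np.log(sizes), np.log(counts), 1)[0]   # slope of log N_B vs log l_B
print("estimated d_f = %.3f (exact: %.3f)" % (d_f, np.log(8) / np.log(3)))
```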

Graph dimensions $d_{\min}$ and $d_\ell$

The fractal dimension $d_{\min}$ has been used to describe the structural properties of the shortest path between two arbitrary sites in the same cluster. Let $\ell$ be the length of this path, often called the “chemical distance” [139]; it scales with r as

$\ell \sim r^{d_{\min}}$, (35)

and the number of sites within $\ell$ is

$M(\ell) \sim \ell^{d_\ell}$, (36)

where $d_\ell$ is often called the “graph dimension” or “chemical” dimension. Combining Eqs. (30), (35), (36), we obtain the relation between $d_{\min}$, $d_\ell$ and $d_f$: $d_\ell = d_f / d_{\min}$.

Besides the fractal dimension exponents $d_{\min}$, $d_\ell$ and $d_f$ discussed above, there are also other exponents, such as the backbone exponent and the red-bonds exponent, that describe the fractal dimensions of the substructures composing percolation clusters [26], [107].

2.2.3. Scaling theory

The scaling hypotheses of phase transitions were developed by Kenneth G. Wilson during the last century and honored by the 1982 Nobel prize. The scaling theory of percolation clusters relates the critical exponents of the percolation transition to the cluster size distribution, ns(p). According to the scaling hypotheses, it is possible to state the following relation for ns(p),

$n_s(p) \sim s^{-\tau} f\!\left( |p - p_c|^{1/\sigma} s \right)$, (37)

where τ is the Fisher exponent [102] and f(x) is a scaling function of the form $f(x) \sim \exp(-x)$, which rapidly decays to zero.

The correlation length ξ is defined as the root mean square distance between occupied sites on the same finite cluster for all clusters, see Eq. (28). For clusters with s sites, the root mean square distance between all pairs of sites on each cluster, is

$R_s^2 = \frac{2}{s(s-1)}\sum_{i=1}^{s}\sum_{j=1}^{i}\overline{\left| \mathbf{r}_i - \mathbf{r}_j \right|^2}$, (38)

and thus,

$\xi^2 = \frac{\sum_{s=1}^{\infty} R_s^2\, s^2 n_s}{\sum_{s=1}^{\infty} s^2 n_s}$. (39)

Close to the percolation threshold $p_c$, $R_s \sim s^{1/d_f}$, and we obtain from the above equation

$\xi^2 \sim \frac{\sum_{s=1}^{\infty} s^{2/d_f + 2 - \tau} f\!\left( |p - p_c|^{1/\sigma} s \right)}{\sum_{s=1}^{\infty} s^{2 - \tau} f\!\left( |p - p_c|^{1/\sigma} s \right)}$. (40)

To calculate the sums in Eq. (40), we transform them into integrals, and obtain

$\xi^2 \sim |p - p_c|^{-2/(d_f \sigma)}$, (41)

which yields the scaling relations between ν, σ and τ:

$\nu = \frac{1}{d_f \sigma} = \frac{\tau - 1}{d \sigma}$, (42)

where d is the system’s dimension.

Consider the kth moment of the cluster size distribution

$M_k = \sum_{s=1}^{\infty} s^{k} n_s(p) \sim \sum_{s=1}^{\infty} s^{k - \tau} f\!\left( \frac{s}{\xi^{d_f}} \right)$, (43)

which scales in the critical region as,

$M_k \sim \xi^{d_f (k - \tau + 1)} \sim |p - p_c|^{(\tau - 1 - k)/\sigma}$. (44)

Next we consider the percolation order parameter $P_\infty$, which behaves as [106]

$P_\infty = 1 - \frac{1}{p}\sum_{s} s\, n_s \sim \sum_{s} s\, \left( n_s(p_c) - n_s(p) \right) \sim (p - p_c)^{(\tau - 2)/\sigma}$. (45)

Combining with Eq. (22), we have,

$\beta = \frac{\tau - 2}{\sigma}$. (46)

Similarly, for k = 2 we obtain the relation $M_2 \sim |p - p_c|^{(\tau - 3)/\sigma}$, yielding

$\gamma = \frac{3 - \tau}{\sigma}$. (47)

Thus, the critical exponents are not independent of each other but satisfy two sets of scaling and hyperscaling relations. The scaling relations can be easily expressed by Eqs. (42), (46), (47).

2.2.4. Universal gap scaling

An interesting property of percolation is universality, which is a fundamental principle of behavior at or near a phase transition critical point. As a result, the critical exponents depend only on the dimensionality of the system, and are independent of the microscopic interaction details of the system. The behavior of a system is characterized by a set of critical exponents, as discussed in the last section. If two systems have the same values of critical exponents, they belong to the same universality class. The universality property is a general feature of phase transitions. A phase transition is also characterized by scaling functions that govern the finite-size behaviors [140]. The concept of finite-size scaling provides a versatile tool to study the percolation transition.

Usually, a lattice or network is expected to undergo a continuous percolation phase transition during a random occupation or failure process [81]. Explosive, hybrid and genuinely discontinuous percolation, arising in processes that are not random but involve competitive link addition, have attracted much attention in recent years [116], [141], [142], [143], [144], [145], [146], [147], [148], [149], [150], [151], [152]. The explosive percolation model is described in Fig. 9.

Fig. 9. Schematic of the explosive percolation model. (A) For each step two vertices are chosen randomly and connected by an edge. (B) At each time step of the Product Rule (PR) process, two edges, e1 and e2, compete for addition. The selected edge is the one minimizing the product of the sizes of the components it merges. Here e1 is accepted and e2 is rejected. (C) Typical evolution of an ER, Bohman Frieze (BF), and PR process on a system of size N = 512,000.

Figure from Ref. [116].

Instead of performing a finite-size scaling analysis at the critical phase transition point, one can consider the finite-size scaling of the percolation properties and the critical scaling of the size of the largest gap in the order parameter. For instance, Fan et al. [153] developed a new universal gap scaling theory and proposed six gap critical exponents to describe the universality of percolation phase transitions. This theoretical framework can be applied to both continuous and discontinuous percolation in various lattice and network models by using finite-size scaling functions.

Gap exponents

Starting with an empty lattice, or a network system with N isolated nodes, bonds or links are added one by one, randomly or by competitive link-addition processes. The control parameter is denoted as r and represents the link density $r = T/M$, with M being the maximal number of edges. The order parameter is the size of the largest cluster, given by the largest connected component in the entire system. During the evolution of the system, we record the size of the largest cluster S(T) at time step T and calculate its largest one-step gap Δ,

$\Delta \equiv \frac{1}{N}\max_{T}\left[ S(T+1) - S(T) \right]$. (48)

The step with the largest jump defines $T_c$, and the corresponding transition point is $r_c = T_c/M$. Moreover, the percolation strength is defined as the size of the largest connected component at $T_c$, i.e., $S_c = S(T_c)$.
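The gap observables Δ, $T_c$, $r_c$ and $S_c$ of Eq. (48) can be measured with a simple union-find simulation; the sketch below uses random (ER-like) link addition with M = N added edges, which is an illustrative choice rather than the exact protocol of Ref. [153].

```python
# A minimal sketch measuring the largest one-step gap of Eq. (48) during random link
# addition, tracking the largest cluster S(T) with a union-find structure.
import numpy as np

def largest_gap(N, M, seed=0):
    rng = np.random.default_rng(seed)
    parent = np.arange(N)
    size = np.ones(N, dtype=int)

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    S, delta, T_c = 1, 0, 0
    for T in range(1, M + 1):
        a, b = rng.integers(0, N, size=2)
        ra, rb = find(a), find(b)
        if ra != rb:                        # merge the two clusters
            if size[ra] < size[rb]:
                ra, rb = rb, ra
            parent[rb] = ra
            size[ra] += size[rb]
        S_new = max(S, size[ra])
        if S_new - S > delta:               # record the largest one-step jump
            delta, T_c = S_new - S, T
        S = S_new
    return delta / N, T_c / M, S / N

d, r_c, s_final = largest_gap(N=100000, M=100000)
print("Delta = %.4f, r_c = %.3f (ER expectation ~0.5), final S/N = %.3f" % (d, r_c, s_final))
```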

Critical phenomena such as percolation exhibit scale-free behaviors that are quantified by scaling relations, see the discussion in the previous sections. Similarly, the averages $\bar{\Delta}$, $\bar{r}_c$ and $\bar{S}_c$ are anticipated to exhibit the following power-law relations [153] as a function of the linear system size L:

$\bar{\Delta}(L) \sim L^{-\beta_1}$, (49)
$\bar{r}_c(L) - r_c(\infty) \sim L^{-1/\nu_1}$, (50)
$\bar{S}_c(L) \sim L^{d_{f_1}}$, (51)

where $\beta_1$, $\nu_1$ and $d_{f_1}$ are three critical exponents describing the universality class of the percolation, and $r_c(\infty)$ is the percolation threshold in the thermodynamic limit, $L \to \infty$. Their corresponding fluctuations, $\delta\Delta^{(i)} = \Delta^{(i)}(L) - \bar{\Delta}(L)$, $\delta r_c^{(i)} = r_c^{(i)}(L) - \bar{r}_c(L)$ and $\delta S_c^{(i)} = S_c^{(i)}(L) - \bar{S}_c(L)$, are also investigated,

$\chi_\Delta = \sqrt{\frac{1}{D}\sum_{i=1}^{D}\left[ \delta\Delta^{(i)} \right]^2}$, (52)
$\chi_{r_c} = \sqrt{\frac{1}{D}\sum_{i=1}^{D}\left[ \delta r_c^{(i)} \right]^2}$, (53)
$\chi_{S_c} = \sqrt{\frac{1}{D}\sum_{i=1}^{D}\left[ \delta S_c^{(i)} \right]^2}$, (54)

where D is the number of independent realizations. We expect that $\chi_\Delta$, $\chi_{r_c}$ and $\chi_{S_c}$ decay algebraically with L with the following scaling relations

$\chi_\Delta \sim L^{-\beta_2}$, (55)
$\chi_{r_c} \sim L^{-1/\nu_2}$, (56)
$\chi_{S_c} \sim L^{d_{f_2}}$, (57)

where $\beta_2$, $\nu_2$ and $d_{f_2}$ constitute another set of critical exponents. The universality class of the percolation is characterized by these six gap critical exponents. Note that they are closely related to the standard percolation exponents and are not fully independent of each other. The relationships between the gap exponents and the standard percolation critical exponents are derived as [153]

$\beta_1 = \beta_2 = \beta/\nu, \qquad \nu_1 = \nu_2 = \nu, \qquad d_{f_1} = d_{f_2} = d_f.$ (58)

In particular, the value of $\beta_1$ indicates the order of the percolation transition: $\beta_1 = 0$ indicates that the percolation is a first-order transition, whereas models with $0 < \beta_1 < 1$ are continuous [120], [147]. We present the numerical results for bond percolation on a 2D square lattice in Fig. 10, where we obtain $\beta_1 = \beta_2 \approx 0.104$, $1/\nu_1 = 1/\nu_2 \approx 1/\nu$, $r_c(\infty) \approx 0.5$, and $d_{f_1} = d_{f_2} \approx 1.895$. All results are in agreement with the known theoretical values [103].
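To make the measurement step concrete, the following Python sketch records the largest one-step gap Δ of Eq. (48), together with the corresponding T_c and the largest-cluster size just before the jump, for a single realization of random bond percolation on an L×L square lattice using a union–find structure. It is only an illustration of how the gap quantities can be extracted; estimating the gap exponents of Eqs. (49)–(57) would additionally require averaging over many realizations and several system sizes. The lattice size and random seed are arbitrary choices.

```python
import numpy as np

def largest_gap_bond_percolation(L, seed=0):
    """One realization of bond percolation on an L x L square lattice
    (periodic boundaries). Bonds are added in random order; the largest
    one-step gap of the largest-cluster size, Eq. (48), is recorded
    together with its position T_c and the cluster size just before it."""
    rng = np.random.default_rng(seed)
    N = L * L
    parent = np.arange(N)
    size = np.ones(N, dtype=np.int64)

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # all nearest-neighbour bonds (right and down, periodic), random order
    bonds = [(x * L + y, x * L + (y + 1) % L) for x in range(L) for y in range(L)]
    bonds += [(x * L + y, ((x + 1) % L) * L + y) for x in range(L) for y in range(L)]
    rng.shuffle(bonds)
    M = len(bonds)                      # maximal number of edges

    s_max, delta, T_c, S_c = 1, 0.0, 0, 1
    for T, (i, j) in enumerate(bonds, start=1):
        ri, rj = find(i), find(j)
        if ri != rj:                    # merge two distinct clusters
            if size[ri] < size[rj]:
                ri, rj = rj, ri
            parent[rj] = ri
            size[ri] += size[rj]
            if size[ri] > s_max:
                jump = (size[ri] - s_max) / N
                if jump > delta:        # largest one-step gap so far
                    delta, T_c, S_c = jump, T, s_max
                s_max = size[ri]
    return delta, T_c / M, S_c / N      # Delta, r_c, S_c per site

print(largest_gap_bond_percolation(128))
```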

Fig. 10.

Gap critical exponents for bond percolation on 2D square lattice systems. (a) and (b): $\bar{\Delta}(L)$ and $\chi_{\Delta}$ as functions of L. (c) and (d): log–log plots of the percolation threshold $\bar{r}_c(L)$ and of $\chi_{r_c}$ versus L. (e) and (f): the power-law relations of $\bar{S}_c(L)$ and $\chi_{S_c}$ with L. The inset in (c) shows the log–log plot of $r_c(\infty) - \bar{r}_c(L)$ as a function of L.

Reprinted figure from Ref. [153].

2.3. Tipping points analysis

The notion of a tipping point, “little things can make a big difference”, was first popularized by Malcolm Gladwell in his book [154]. It means that, at a particular moment in time, a small change can have large and long-term consequences for a complex system. Tipping points are usually associated with bifurcations [155]. A tipping point is defined as “the moment of critical mass, the threshold, the boiling point”. Many complex systems experience sudden shifts in behavior, often referred to as tipping points or critical transitions, ranging from climate [23], [156], [157], [158], [159], [160], [161], ecosystems [158], [162], [163], [164], [165], [166], [167], to social science [168], [169], financial markets [170], [171], medicine [172], [173], [174] and even macroeconomic agent-based models [175].

Complex systems can shift abruptly from one state to another at tipping points, which may imply a growing threat and risk of abrupt and irreversible changes [161]. It is thus of great practical importance to understand the theoretical mechanisms and to predict tipping phenomena. Although predicting such critical points before they occur is notoriously difficult, theory proposes the existence of generic early-warning signals (EWS) that may indicate, for a wide class of systems, whether a critical threshold is approaching [176], [177], [178]. EWS are currently among the most powerful tools for predicting critical transitions. The tipping point analysis technique provides a vital tool to anticipate, detect and predict tipping points in complex dynamical systems. The methodology usually combines several ways of monitoring memory in a time series, including the dynamically derived lag-1 autocorrelation [179], the power-law scaling exponent of detrended fluctuation analysis (DFA) [180], [181], and power-spectrum-based analysis [182].

2.3.1. Basic concepts

Defining tipping points: Following Ref. [177], a tipping point is the critical point at which the future state of the system is qualitatively changed. A single control parameter ρ is identified, for which there exists a threshold ρc from which a small perturbation (δρ>0) leads to a qualitative change (Fˆ) in a crucial feature of the system (F) after some time (T>0). The actual change (ΔF) is defined as:

$|\Delta F| = \left| F(\rho \geq \rho_c + \delta\rho \mid T) - F(\rho_c \mid T) \right| \geq \hat{F} > 0.$ (59)

In this definition, the critical threshold ρc is considered as the tipping point, beyond which a qualitative change occurs. Note that such changes may occur immediately or even much later after the cause.

Types of tipping points: The theoretical mechanisms behind tipping phenomena in a complex system can be effectively divided into three distinct categories: bifurcation-induced, noise-induced and rate-dependent tipping [183].

Bifurcation-induced tipping means that a small change in the forcing ρ past a critical threshold causes a large, nonlinear change in the system state, thus meeting the tipping point definition in Eq. (59). Consider a conceptualized open system [184],

$\frac{dF}{dt} = f(F, \rho(t)),$ (60)

where ρ(t) is in general a time-varying input. In the case that ρ is constant, we refer to Eq. (60) as the parameterized system with parameter ρ and to its stable solution as the quasi-static attractor. If ρ(t) passes through a bifurcation point of the system where a quasi-static attractor loses stability, it is intuitively clear that the system may ‘tip’ directly as a result of varying that parameter. We present two examples of bifurcation-induced tipping points in Fig. 11, where the system’s states are described by

$\frac{dF}{dt} = \rho + F - F^3,$ (61)

and

$\frac{dF}{dt} = \rho F - F^3,$ (62)

for Fig. 11a and b, respectively. In general, as a system approaches a bifurcation-induced tipping point, its current state becomes unstable and the system shifts to an alternative attractor.
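As a minimal illustration of bifurcation-induced tipping, the hedged Python sketch below integrates Eq. (61) with a forward Euler scheme while the control parameter ρ is ramped slowly upwards. The fixed points of Eq. (61) satisfy ρ = F³ − F, so the lower stable branch disappears in a fold bifurcation at ρ = 2/(3√3) ≈ 0.385, where the trajectory jumps abruptly to the upper branch. All numerical values (ramp range, step size, initial condition) are illustrative assumptions.

```python
import numpy as np

def simulate_fold_tipping(rho_start=-1.0, rho_end=1.0, n_steps=200_000, dt=0.01):
    """Euler integration of dF/dt = rho + F - F**3 (Eq. (61)) while the
    control parameter rho is ramped slowly through the fold bifurcation
    at rho = 2/(3*sqrt(3)) ~ 0.385, where the lower branch disappears."""
    rho_values = np.linspace(rho_start, rho_end, n_steps)
    F = -1.0                      # start near the lower stable branch
    trajectory = np.empty(n_steps)
    for k, rho in enumerate(rho_values):
        F += dt * (rho + F - F**3)
        trajectory[k] = F
    return rho_values, trajectory

rho, F = simulate_fold_tipping()
# F stays on the lower branch and then jumps to the upper branch (~ +1.3)
# once rho exceeds ~ 0.385
print(F[0], F[-1])
```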

Fig. 11.

Different sources of cascading tipping point types. (a) and (b) Bifurcation-induced tipping, (c) noise-induced transitions, (d) reversible tipping points. Solid lines are stable steady states; dashed lines are unstable steady states.

Noise-induced tipping refers to transitions between existing stable states of a complex system driven by noisy fluctuations (Fig. 11c); such transitions can also be regarded as tipping points [185], although they do not meet the definition of forced changes in Eq. (59). Noise-induced tipping means that noisy fluctuations cause the system to depart from the neighborhood of a quasi-static attractor. For example, Pikovsky and Kurths studied coherence resonance in a noise-driven excitable FitzHugh–Nagumo system and uncovered that the effect of coherence resonance is explained by the different noise dependencies of the activation and excursion times [186]. The abrupt warming events during the last ice age, known as Dansgaard–Oeschger events, provide a real-world example of noise-induced tipping [185]. In addition, Sutera [187] studied noise-induced tipping points in a simple global zero-dimensional energy balance model with ice-albedo and greenhouse feedbacks [188]. The results indicate a characteristic time of 100,000 years for random transitions between ‘warm’ and ‘cold’ climate states, which matches the observed data very well. The noise-induced tipping point approach has also been used successfully for modeling changes in other climate models and phenomena [189]. In contrast to approaching bifurcations, it was found that noise-induced transitions are fundamentally unpredictable and show none of the EWS [177].

Rate-dependent tipping occurs when the system fails to track a continuously changing quasi-static attractor. To better understand this phenomenon, Ashwin et al. introduced a linear model with a tipping radius and discussed three examples where rate-dependent tipping appears [183]. Given a system for $F \in \mathbb{R}^n$ whose parameter ρ has a quasi-static equilibrium $\tilde{F}(\rho)$ with a tipping radius R > 0, then for some initial $F_0$ with $|F_0 - \tilde{F}(\rho)| < R$, the evolution of F with time is given by

$\frac{dF}{dt} = M\,\left(F - \tilde{F}(\rho)\right) \quad \text{for } |F - \tilde{F}(\rho)| < R,$ (63)

where M is a fixed stable linear operator (i.e., $|e^{Mt}| \to 0$ as $t \to \infty$) and ρ(t) is a time-varying parameter that represents the input to the subsystem. The tipping radius may be related to the boundary of the basin of attraction of the nonlinear problem. This model shows only rate-dependent tipping, because M is fixed, there is no bifurcation in the system and no noise is present. In particular, the model can be generalized to include M and R that vary with ρ(t). Eq. (63) can be solved with the initial condition $F(0) = F_0$ to give

$F(t) = e^{Mt}F_0 - \int_{0}^{t} e^{M(t-s)}\,M\,\tilde{F}(\rho(s))\,ds.$ (64)

Note that the rate-dependent tipping was also observed in the zero-dimensional global energy balance model [188].
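The rate-dependent mechanism can be illustrated with a hedged scalar sketch of Eq. (63) in which M = −1 and the quasi-static equilibrium $\tilde{F}$ drifts at a constant rate. Since the asymptotic tracking error of this linear system equals the drift rate, the trajectory leaves the tipping radius R only when the parameter changes faster than a critical rate, even though no bifurcation and no noise are present. The specific numbers below are illustrative and not taken from Ref. [183].

```python
import numpy as np

def rate_dependent_tipping(ramp_rate, R=0.5, dt=1e-3, t_max=10.0):
    """Scalar version of Eq. (63) with M = -1 and a moving quasi-static
    equilibrium F_tilde(t) = ramp_rate * t. Returns True if the tracking
    error |F - F_tilde| ever exceeds the tipping radius R."""
    n = int(t_max / dt)
    F = 0.0
    for k in range(n):
        F_tilde = ramp_rate * k * dt
        F += dt * (-(F - F_tilde))          # dF/dt = M (F - F_tilde), M = -1
        if abs(F - F_tilde) >= R:
            return True                     # rate-induced tipping
    return False

# slow ramps are tracked, fast ramps exceed the tipping radius
for r in (0.1, 0.4, 0.6, 1.0):
    print(r, rate_dependent_tipping(r))
```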

Besides the above three tipping types, there is potentially another type, a reversible tipping point. In Fig. 11d, we show an example of a reversible tipping point, in which a mono-stable system exhibits non-linear but reversible change [159].

2.3.2. Tipping elements in the Earth’s climate system

In Ref. [23] Lenton et al. introduced the term tipping elements to describe large-scale components of the Earth system that may pass a tipping point, and assessed where their tipping points lie. They emphasized that the human activities may play a vital role to push the components of the Earth system past their tipping points, resulting in profound impacts on our social and natural systems. They also defined the subset of policy-relevant tipping elements having the following conditions: (i) The first condition is described in Eq. (59); (ii) the second one is related to the “political time horizon” TP; (iii) the third one is called “ethical time horizon” TE; (iv) the last one is that a qualitative change should correspondingly be defined in terms of impacts. According to these four aforementioned conditions, we review here some policy-relevant potential tipping elements in the climate system, as shown in Fig. 12. For each tipping element, we consider its (1) feature of the system F (see Eq. (59)) or direction of change, (2) control parameter(s) ρ, (3) critical value(s) ρc, (4) transition timescale T as well as (5) key impacts.

Fig. 12.

Some of the main tipping elements in the Earth’s climate system.

Figure from Ref. [161].

Arctic sea ice. For both summer and winter Arctic sea ice, the area coverage is declining and the ice has thinned significantly over a large area [190]. In half of the IPCC projections with coupled ocean–atmosphere GCMs, the Arctic becomes ice-free in September during this century [191], at a polar temperature of about 9° [192]. The decline in the areal extent can be regarded as the feature of the system, denoted by F; the local air temperature ΔTair and the ocean heat transport can be regarded as control parameters; a numerical value of the critical threshold is still lacking in the literature; the transition timescale is about T ≈ 10 years; the decline of the Arctic sea ice can amplify warming and cause ecosystem changes.

Greenland ice sheet. It has been reported that the Greenland ice sheet is melting at an accelerating rate [193]; if it passes a particular threshold, it could add an additional 7 meters to sea level over thousands of years. The decrease of the ice volume can be regarded as F, the feature of the system; the local air temperature ΔTair is the control parameter; the critical value is ρc ≈ 3°; the transition timescale T can reach about 300 years.

West Antarctic ice sheet. According to the IPCC report [193], the Amundsen Sea embayment of West Antarctica might have already passed a tipping point, i.e., the ‘grounding line’ (where ice, ocean and bedrock meet) is retreating irreversibly. This could also destabilize the rest of the West Antarctic ice sheet [194]. It has been found, using paleoclimate data, that such a widespread collapse of the West Antarctic ice sheet occurred repeatedly in the past. Similar to the Greenland ice sheet, the decrease of the ice volume can be regarded as F; the local air temperature ΔTair is the control parameter; the critical value is ρc ≈ 5–8°; the transition timescale T can reach 300 years. The collapse of the West Antarctic ice sheet may lead to about 5 meters of sea-level rise on a timescale of centuries to millennia [23].

Atlantic circulation. The Atlantic meridional overturning circulation (AMOC) is one of Earth’s major ocean circulation systems, redistributing heat and affecting the climate. Research has provided evidence for a weakening of the AMOC by about 3 ± 1 sverdrups (around 15 per cent) since the mid-twentieth century [160]. The AMOC is also considered as one of the main tipping elements of the Earth system [23], [156]. The overturning of the Atlantic circulation can be regarded as F; the additional North Atlantic freshwater input is the control parameter; the critical value is ρc ≈ 0.1–0.5 Sv; the transition timescale T can reach 100 years. A slowdown of the AMOC is associated with a southward shift of the tropical rainfall belt, through its influence on the Intertropical Convergence Zone, and with a warming of the Southern Ocean and Antarctica [17].

El Niño–Southern Oscillation (ENSO). ENSO, the interannual fluctuation between anomalous warm and cold conditions in the tropical Pacific, is one of the most influential coupled ocean–atmosphere climate phenomena on Earth [195], [196], [197]. It has been reported that extreme El Niño events are projected to likely increase in frequency in the 21st century [198]. The tipping point behavior of ENSO in a warming world was discussed in Ref. [199]. The amplitude of ENSO is regarded as F; the zonal mean thermocline depth, thermocline sharpness in the east equatorial Pacific, and the strength of the annual cycle are the control parameters; the transition timescale T can reach 100 years. A stronger El Niño usually causes more extreme events (e.g., floods, droughts, or severe storms), which have serious consequences for economies, societies, agriculture and ecosystems.

Indian summer monsoon. The Indian summer monsoon rainfall (ISMR) has a decisive influence on India’s agricultural output and economy. The monsoon season (from June to September) can bring drought and food shortages or severe flooding, depending on how much rain falls. The land-to-ocean pressure gradient drives the monsoon circulation, which is related to the moisture-advection feedback [200]. The ISMR has shown a declining trend since 1953 [201]. It has been reported that, under some plausible decadal-scale scenarios of land use, greenhouse gas and aerosol forcing, the Indian summer monsoon switches between two metastable regimes corresponding to the “active” and “weak” monsoon phases [23]. Thus, the Indian summer monsoon can also be regarded as a tipping element. In particular, the ISMR is regarded as F; the planetary albedo over India is the control parameter; the critical value is ρc ≈ 0.5; and the transition timescale T is about 1 year.

Amazon rainforest. Tropical forests play a vital role in the global carbon cycle [202] and are the home of more than half of the known species worldwide [203]. Deforestation and climate change are destabilizing the tropical forests, with annual deforestation rates of around 0.5% since the 1990s and a strong increase in recent years [204]. An empirical finding suggests that the observed tropical forest fragmentation is close to the critical point on three continents, the Americas, Africa and Asia–Australia [205]. In particular, we focus on the Amazon rainforest, since it is the world’s largest rainforest and is home to one in ten known species [161]. If forests are close to tipping points, Amazon dieback could release about 90 gigatonnes of carbon dioxide. Finding the tipping point of the Amazon rainforest is thus essential for us to stay within the emissions budget. Here the rainforest cover in the Amazon is regarded as F; the precipitation and the dry season length are the control parameters; the critical value is ρc ≈ 1,100 mm/year; its transition timescale T is about 50 years.

Besides the seven above-mentioned tipping elements, there are also some further potential policy-relevant tipping elements in the climate system, such as the Sahara/Sahel and West African monsoon, the boreal forest, Antarctic bottom water, tundra, permafrost, marine methane hydrates, ocean anoxia and Arctic ozone. Interested readers can find them in Ref. [23]. Strong evidence that tipping points are under way has mounted in the past decade, and domino effects have also been proposed [161].

2.3.3. Early warning of tipping points

As discussed above, many complex dynamical systems, in particular climate systems, can have tipping points and thus carry risks of unwanted collapse. Although predicting such tipping points before they are reached is a big challenge, the existence of generic EWS provides useful indicators for anticipating such critical transitions [176], [178]. Hence, if an early warning of a tipping point can be identified, it could help broader society, scientists and policymakers to take early actions and reduce the damages related to a system collapse. Thus, numerous studies have been dedicated to detecting and predicting these critical transitions, often making use of EWS. In this review, we will highlight the tipping point analysis techniques that are used to anticipate, detect and forecast critical transitions in a dynamical system.

Critical slowing down near tipping points

The “critical slowing down” phenomenon has been suggested as an indicator of whether a dynamical system is getting close to a critical threshold [206]. This happens, for instance, at the fold bifurcation (Fig. 11), often associated with tipping points. Critical slowing down means that the rate at which a system recovers from small perturbations becomes slow, and this slowness can be inferred from rising “memory” in the small fluctuations of the state of the system. For example, Fig. 13 demonstrates that critical slowing down can serve as an indicator that the system has lost resilience and may be tipped more easily into an alternative state (as reflected by the lag-1 autocorrelation) [178].

Fig. 13.

Critical slowing down as an indicator for foreseeing tipping points.

Figure from Ref. [178].

In order to illustrate the relation between the critical slowing down phenomenon, increased autocorrelation and increased variance, here we show a simple example that reveals the underlying mechanism. We consider a simple autoregressive model of first order,

$x_{n+1} - \bar{x} = e^{-\lambda\Delta t}\,(x_n - \bar{x}) + \sigma\varepsilon_n, \qquad y_{n+1} = e^{-\lambda\Delta t}\,y_n + \sigma\varepsilon_n,$ (65)

where $\bar{x}$ is the stable equilibrium of the model, Δt is the time interval between successive observations and λ is the recovery rate. Here $y_n$ is the deviation of the state variable x from the equilibrium, $\varepsilon_n$ is a random number drawn from a standard normal distribution and σ is the standard deviation of the noise. If λ and Δt are independent of $y_n$, Eq. (65) can be written as:

$y_{n+1} = \alpha y_n + \sigma\varepsilon_n.$ (66)

The lag-1 autocorrelation $\alpha \equiv e^{-\lambda\Delta t}$ is zero for white noise and close to one for red noise. The expectation value of a first-order autoregressive process $y_{n+1} = c + \alpha y_n + \sigma\varepsilon_n$ is

$E(y_{n+1}) = E(c) + \alpha E(y_n) + E(\sigma\varepsilon_n) \;\Rightarrow\; \mu = c + \alpha\mu + 0 \;\Rightarrow\; \mu = \frac{c}{1-\alpha}.$ (67)

Let us consider c = 0; then the expectation value equals zero and the variance becomes

$\mathrm{Var}(y_{n+1}) = E(y_n^2) - \mu^2 = \frac{\sigma^2}{1-\alpha^2}.$ (68)

The return speed to equilibrium decreases when the system is close to the critical point. This implies that λ tends to zero and α approaches one; according to Eq. (68), the variance then tends to infinity. In summary, critical slowing down leads to an increase in the lag-1 autocorrelation (α) and in the variance of the resulting pattern of fluctuations.
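The mechanism of Eqs. (65)–(68) can be checked with a short, hedged Python sketch that simulates the AR(1) process of Eq. (66) for decreasing recovery rates λ: as λ → 0 (critical slowing down), both the empirical lag-1 autocorrelation and the variance grow, consistent with Eq. (68). The noise level, time step and series length are arbitrary illustrative choices.

```python
import numpy as np

def ar1_statistics(lam, dt=1.0, sigma=0.1, n=100_000, seed=1):
    """Simulate y_{n+1} = alpha * y_n + sigma * eps_n (Eq. (66)) with
    alpha = exp(-lam * dt) and return the empirical lag-1
    autocorrelation and the variance."""
    rng = np.random.default_rng(seed)
    alpha = np.exp(-lam * dt)
    y = np.zeros(n)
    for k in range(n - 1):
        y[k + 1] = alpha * y[k] + sigma * rng.standard_normal()
    acf1 = np.corrcoef(y[:-1], y[1:])[0, 1]
    return acf1, y.var()

# as the recovery rate lambda -> 0 (critical slowing down), both the
# lag-1 autocorrelation and the variance increase, cf. Eq. (68)
for lam in (1.0, 0.5, 0.1, 0.01):
    print(lam, ar1_statistics(lam))
```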

Slowing down causes the intrinsic rates of change to decrease, and thus the state of the system becomes more like its past state, i.e., the autocorrelation increases. The resulting increase in “memory” can be measured in a variety of ways from the frequency spectrum of the system.

Autocorrelation function. The lag-1 autocorrelation function (ACF(1)) indicator is a simple way to provide an EWS for an impending tipping event. For instance, Held and Kleinen have shown that the autocorrelation increases in the vicinity of a bifurcation in a model of the thermohaline circulation [179]; Dakos et al. found that the autocorrelation increases before eight well-known climate transitions in the past data [157].

The correlation coefficient between two values of a time series is called the ACF. For example, the ACF of the autoregressive time series $y_n$ of Eq. (66) is given by $\mathrm{Corr}(y_n, y_{n-s})$, $s = 1, 2, \ldots$, where the time gap s is called the lag. In particular, the lag-1 autocorrelation is the correlation between values that are one time step apart. More generally, the lag-s autocorrelation is the correlation between values that are s time steps apart.

The ACF scaling exponent γ describes the power-law decay of the autocorrelation function with increasing lag s [207]. Denoting by C(s) the autocorrelation of the time series at lag s, the scaling is defined as

$C(s) \sim s^{-\gamma},$ (69)

for long-range correlations. Notably, for short-range correlated records, C(s) decays exponentially and only ACF(1) is indicative of a data variability close to a tipping point.

Detrended fluctuation analysis. Slowing down causes an increase in memory, which can also be measured using detrended fluctuation analysis (DFA) [180]. See [180] for the detailed method. DFA is often used to detect long-range correlations or the persistence of diverse time series including DNA sequences [208], heart rate [209], [210], [211], earthquakes [212], and also climate records [213]. If the time series is long-range correlated, the fluctuation function, F(n), increases according to a power-law relation:

$F(n) \sim n^{\alpha},$ (70)

where n is the window size and α the DFA scaling exponent. Here, $F(n) = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(X_t - Y_t^{z}\right)^{2}}$, where $X_t = \sum_{i=1}^{t}(x_i - \langle x \rangle)$ is the cumulative sum, or “profile”, of the time series $x_i$, $Y_t^{z}$ is the fitting polynomial and z stands for the (polynomial) order of the DFA. The DFA exponent α is calculated as the slope of a linear fit to the log–log graph of F(n) vs. n. It has been found that the DFA exponent in the temporal range $10 \leq n \leq 100$ is sensitive to changes in a dynamical system, similarly to ACF(1) [181].
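A compact, hedged DFA implementation might look as follows; it assumes non-overlapping windows, first-order (z = 1) polynomial detrending, and window sizes in the range 10 ≤ n ≤ 100 mentioned above. For uncorrelated white noise the estimated exponent is close to α ≈ 0.5, while values approaching 1 indicate growing memory.

```python
import numpy as np

def dfa_exponent(x, windows=None, order=1):
    """Detrended fluctuation analysis of a 1-D series x: returns the DFA
    scaling exponent alpha from F(n) ~ n**alpha (Eq. (70))."""
    x = np.asarray(x, dtype=float)
    profile = np.cumsum(x - x.mean())            # X_t in the text
    if windows is None:
        windows = np.unique(np.logspace(1, 2, 15).astype(int))  # 10 <= n <= 100
    F = []
    for n in windows:
        n_seg = len(profile) // n
        segs = profile[:n_seg * n].reshape(n_seg, n)
        t = np.arange(n)
        f2 = []
        for seg in segs:
            coeff = np.polyfit(t, seg, order)    # polynomial trend Y_t
            trend = np.polyval(coeff, t)
            f2.append(np.mean((seg - trend) ** 2))
        F.append(np.sqrt(np.mean(f2)))
    alpha = np.polyfit(np.log(windows), np.log(F), 1)[0]
    return alpha

# white noise gives alpha ~ 0.5; values approaching 1 indicate growing memory
print(dfa_exponent(np.random.default_rng(0).standard_normal(10_000)))
```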

Power spectrum. Recently, Prettyman et al. introduced a novel scaling indicator based on the decay rate of the power spectrum (PS) [214]. PS analysis partitions the amount of variation in a time series into different frequencies. When a system is close to a critical transition, it tends to show spectral reddening, i.e., higher variation at low frequencies [215]. The PS scaling exponent β is calculated by estimating the slope of the power spectrum S(f) of the data, from the scaling relationship

$S(f) \sim f^{-\beta}.$ (71)

Analytically, the three scaling exponents obey the linear relationships $\alpha = \frac{1+\beta}{2} = 1 - \frac{\gamma}{2}$. It has been reported that the PS indicator is a useful technique which behaves similarly to the related ACF(1) and DFA indicators. In addition, it shows signs of providing an EWS for a real geophysical system, tropical cyclones, where the ACF(1) indicator fails [214].
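A hedged sketch of the PS indicator is given below: it estimates β from a periodogram by a linear fit in log–log space, following Eq. (71). The frequency range of the fit and the test signals (white noise, β ≈ 0, and a random walk, β ≈ 2 at low frequencies) are illustrative choices, not part of the method of Ref. [214].

```python
import numpy as np

def ps_exponent(x, f_min=None, f_max=None):
    """Estimate the power-spectrum scaling exponent beta from
    S(f) ~ f**(-beta) (Eq. (71)) using a periodogram and a linear fit
    in log-log space."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    spectrum = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0)
    mask = freqs > 0
    if f_min is not None:
        mask &= freqs >= f_min
    if f_max is not None:
        mask &= freqs <= f_max
    slope = np.polyfit(np.log(freqs[mask]), np.log(spectrum[mask]), 1)[0]
    return -slope            # beta

rng = np.random.default_rng(0)
white = rng.standard_normal(2**14)
red = np.cumsum(white)       # random walk: beta close to 2 at low frequencies
print(ps_exponent(white, f_max=0.1), ps_exponent(red, f_max=0.1))
```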

Besides the slower recovery from perturbations, increased autocorrelation and memory [see Eq. (67)], increased variance [measured as standard deviation, see Eq. (68)] is another possible indicator of a critical slowing down as a critical transition is approached. For example, Carpenter and Brock have shown that the variance increases in the vicinity of a bifurcation in a lake model [216].

Flickering before transitions. Another notable EWS is a system’s back-and-forth oscillation between two stable states in the vicinity of a critical transition. This oscillation has been called flickering and was observed in models of lake eutrophication [217] and trophic cascades [218].

Skewness and kurtosis. In addition to the aforementioned EWS before a catastrophic bifurcation, two further precursors, observed when approaching a critical transition and thus suggested as EWS, are changes in the skewness and the kurtosis of the distribution of states [219], [220], [221]. Skewness indicates asymmetry in the distribution, with a negative skew indicating a right-sided concentration and a positive skew indicating the opposite. Skewness is the standardized third moment around the mean of a distribution,

$\gamma = \frac{\frac{1}{n}\sum_{i=1}^{n}\left(y_i-\mu\right)^{3}}{\left[\frac{1}{n}\sum_{i=1}^{n}\left(y_i-\mu\right)^{2}\right]^{3/2}}.$ (72)

Similarly, kurtosis is a measure of the “peakedness” of the distribution—with positive kurtosis indicating a peak higher than the one of a normal distribution and negative kurtosis indicating a lower peak. Kurtosis is the standardized fourth moment around the mean of a distribution estimated as,

$\kappa = \frac{\frac{1}{n}\sum_{t=1}^{n}\left(z_t-\mu\right)^{4}}{\left(\frac{1}{n}\sum_{t=1}^{n}\left(z_t-\mu\right)^{2}\right)^{2}}.$ (73)
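Both moment-based indicators of Eqs. (72) and (73) are straightforward to estimate, as in the minimal sketch below. Note that with the definition of Eq. (73) a Gaussian distribution has κ ≈ 3, so “positive/negative kurtosis” in the text refers to values above or below this Gaussian reference.

```python
import numpy as np

def skewness(y):
    """Standardized third moment around the mean, Eq. (72)."""
    y = np.asarray(y, dtype=float)
    d = y - y.mean()
    return np.mean(d**3) / np.mean(d**2)**1.5

def kurtosis(y):
    """Standardized fourth moment around the mean, Eq. (73)."""
    y = np.asarray(y, dtype=float)
    d = y - y.mean()
    return np.mean(d**4) / np.mean(d**2)**2

rng = np.random.default_rng(0)
sample = rng.standard_normal(100_000)
print(skewness(sample))   # ~ 0 for a symmetric distribution
print(kurtosis(sample))   # ~ 3 for a Gaussian
```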

2.3.4. Precursors of transitions in real systems

In the last section, we highlighted the theoretical background of the EWS that may occur in non-equilibrium dynamics before critical transitions. Nonetheless, detecting EWS in real complex systems, such as climate, social and ecological systems, remains a major challenge. Developing reliable predictive systems based on these generic properties can strengthen our capacity to anticipate systemic failure and guide the design of more resilient systems.

We will briefly review emerging precursors of transitions in different real systems, including climate, ecosystems, medicine and finance. An abrupt climate change occurs when the Earth system is forced to cross a threshold to a new climate state [222]. Large, abrupt, and widespread climate changes include the Carboniferous Rainforest Collapse [223], Younger Dryas [224], greenhouse–icehouse transition [225], Dansgaard–Oeschger events, Heinrich events and Paleocene–Eocene Thermal Maximum [226]. All these rapid climate changes could be explained as critical transitions [176]. For instance, a significant increase in autocorrelation is regarded as a precursor of eight well-known climate transitions [157]. A flickering phenomenon was found to precede the abrupt end of the Younger Dryas cold period, which can be also regarded as a precursor [227].

In ecology, tipping point analysis has also become a major focus of research. For example, an EWS was found in the destabilization of exploited fish stocks, where harvesting leads to increased fluctuations in fish populations [228]. The emergence of EWS has been demonstrated in lake and marine systems [229], [230]. In physiology, abrupt transitions were found in epileptic seizures and asthma attacks [174]; for example, flickering may occur before epileptic seizures. In finance, market dynamics may contain information indicating an abrupt change: for example, increased trade volatility may occur before a main shock [231]. For other precursors of transitions in real systems, the reader is referred to Ref. [176].

2.4. Entropy and complexity

2.4.1. Introduction

Entropy is an important concept arising from statistical mechanics. It is a characteristic that describes the state of a system composed of smaller components, and it has been used as a general measure of complexity, with widespread applications. In classical thermodynamics, developed by Rudolf Clausius in the early 1850s, entropy is related to the loss of energy during an irreversible process. The thermodynamic entropy S is derived from the heat flow δQ at a fixed temperature T,

$S = \frac{\delta Q}{T}.$ (74)

Note that the entropy of a system is defined only if it is in thermodynamic equilibrium. The statistical mechanics definition of entropy was developed by Ludwig Boltzmann in the 1870s by analyzing the statistical behavior of the microscopic components of a system,

$S = k_B \ln\Omega.$ (75)

Here S is the entropy of the macrostate, $k_B$ is Boltzmann’s constant, and Ω stands for the total number of possible microstates that yield the macrostate. In terms of information theory, Claude Shannon developed the very general concept of information entropy, a fundamental cornerstone of the field, to describe an analogous loss of information [232]. It is a measure of the amount of information that is missing before reception. The information entropy is defined as

$S = -k_S \sum_i p_i \log p_i,$ (76)

where $k_S = 1/\log(2)$ (so that the entropy is measured in bits) and $p_i$ is the probability of state i. In quantum statistical mechanics, the concept of entropy was developed by John von Neumann and is generally referred to as the “von Neumann entropy”. For a quantum-mechanical system described by a density matrix ρ, the von Neumann entropy is

$S = -k_B\,\mathrm{Tr}(\rho\log\rho),$ (77)

where Tr denotes the trace operator.

There exist also many other types of entropy, such as the Gibbs, residual, approximate, Kolmogorov–Sinai, sample and multiscale entropies. Entropy has proven useful in many real-world systems, including the analysis of DNA sequences [233], cosmology and astrophysics [234], [235], [236], economics [237], [238], and climate systems [239], [240], [241]. Each definition of entropy can give better results for some systems but fail for others. We will discuss in the following that entropy has three related interpretations [29], as a measure of (i) irreversible changes, (ii) disorder and (iii) uncertainty.

2.4.2. Entropy as irreversibility

The idea of irreversibility is central to the understanding of entropy; it arises in thermodynamics, where a process that cannot be reversed is called irreversible. It has been reported that all complex dynamic natural processes are irreversible [242]. For an isolated system undergoing an irreversible process, the entropy never decreases. This is known as the second law of thermodynamics.

In order to better understand the behavior of entropy in an irreversible process, we consider the Carnot heat engine and the corresponding Carnot cycle. As shown in Fig. 14a, the engine consists of a piston with external pressure P and two heat baths, one at a hot temperature $T_H$ and the other at a cold temperature $T_C$, with some material (a monatomic ideal gas) inside the piston. During one Carnot cycle, heat $Q_H$ flows out of the hot bath and heat $Q_C$ flows into the cold bath, resulting in a net work $W = Q_H - Q_C$. Carnot imagined a fully reversible heat engine, and the Carnot cycle moves the piston in and out in the following four steps (see the Carnot cycle P–V diagram in Fig. 14b). (1) (i→ii): The gas is connected to the hot bath, and the piston moves outward at varying pressure and fixed temperature $T_H$; in this step, heat $Q_H$ flows into the piston. (2) (ii→iii): The piston expands further at varying pressure P, disconnected from both baths. (3) (iii→iv): The expanded gas is then connected to the cold bath and compressed at the fixed temperature $T_C$; in this step, the heat $Q_C$ flows out. (4) (iv→i): The piston is compressed and returns to the original state; in this step, there is no heat transfer.

Fig. 14.

Entropy as Irreversibility. a. Prototype of Heat Engine. b. Carnot Cycle P–V Diagram.

The net work W done by the piston is the area enclosed by the P–V loop in Fig. 14b,

$W = \oint_{\mathrm{cycle}} P\,dV = \text{area inside the } P\text{–}V \text{ loop}.$ (78)

According to the ideal gas law,

$PV = N k_B T,$ (79)

where N is the number of particles. The total energy of the ideal gas inside the piston is given by the equipartition theorem,

$E = \tfrac{3}{2} N k_B T = \tfrac{3}{2} P V,$ (80)

where E stands for the kinetic energy. During the first step, we get

$Q_H = E_{ii} - E_i + W_{i,ii} = \int_{i}^{ii} P\,dV = \int_{i}^{ii} \frac{N k_B T_H}{V}\,dV = N k_B T_H \log\frac{V_{ii}}{V_i}.$ (81)

Similarly, for the third step, we have

$Q_C = N k_B T_C \log\frac{V_{iii}}{V_{iv}}.$ (82)

For the other two steps there is no heat flow; using $dE = -P\,dV = -\frac{N k_B T}{V}\,dV = \frac{3}{2}N k_B\,dT$, we have

$\int_{ii}^{iii}\frac{dV}{V} = \log\frac{V_{iii}}{V_{ii}} = -\int_{ii}^{iii}\frac{3}{2}\frac{dT}{T} = -\frac{3}{2}\log\frac{T_C}{T_H} = \frac{3}{2}\log\frac{T_H}{T_C}.$ (83)

The above equation gives $\frac{V_{iii}}{V_{ii}} = \left(\frac{T_H}{T_C}\right)^{3/2}$ and, analogously, $\frac{V_{iv}}{V_{i}} = \left(\frac{T_H}{T_C}\right)^{3/2}$. Thus $\frac{V_{iii}}{V_{ii}} = \frac{V_{iv}}{V_{i}}$ and hence

$\frac{V_{iii}}{V_{iv}} = \frac{V_{ii}}{V_{i}}.$ (84)

Substituting Eq. (84) into the heat flows, Eqs. (81), (82), we obtain Carnot’s fundamental result,

$\frac{Q_H}{T_H} = \frac{Q_C}{T_C}.$ (85)

Let us define the heat flow ΔE=Q at a fixed temperature T. Then the entropy change ΔS is

$\Delta S = \frac{Q}{T}.$ (86)

Note that the above is equivalent to Eq. (74). This means that for a reversible Carnot engine, the entropy flow out of the hot bath, $Q_H/T_H$, equals the entropy flow delivered by the piston into the cold bath, $Q_C/T_C$, i.e., no entropy is created or destroyed. However, for any irreversible real engine, the entropy will increase.
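As a quick numerical sanity check of Eqs. (81)–(85), the hedged Python sketch below evaluates the heat flows of a monatomic ideal gas along the two isothermal branches of a Carnot cycle, with the adiabatic volume ratios of Eqs. (83)–(84) enforced, and verifies that $Q_H/T_H = Q_C/T_C$ and that the efficiency equals the Carnot limit $1 - T_C/T_H$. All parameter values are arbitrary illustrations.

```python
import numpy as np

k_B = 1.380649e-23     # Boltzmann constant [J/K]
N = 1e22               # number of gas particles (illustrative)
T_H, T_C = 500.0, 300.0
V_i, V_ii = 1.0e-3, 3.0e-3          # volumes before/after isothermal expansion [m^3]

# adiabatic branches of a monatomic ideal gas: V * T**(3/2) is constant
V_iii = V_ii * (T_H / T_C) ** 1.5   # Eq. (83)
V_iv = V_i * (T_H / T_C) ** 1.5     # so that V_iii / V_iv = V_ii / V_i, Eq. (84)

Q_H = N * k_B * T_H * np.log(V_ii / V_i)     # Eq. (81)
Q_C = N * k_B * T_C * np.log(V_iii / V_iv)   # Eq. (82)

print(Q_H / T_H, Q_C / T_C)   # equal: no net entropy is produced, Eq. (85)
print("efficiency:", 1.0 - Q_C / Q_H, "Carnot limit:", 1.0 - T_C / T_H)
```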

2.4.3. Entropy as disorder

Traditionally, entropy has also been interpreted as a measure of the disorder or randomness of a system. In thermodynamics, the entropy of mixing is the increase of entropy when two or more different types of particles are mixed without chemical reactions. Consider N molecules of an ideal gas separated in a vessel consisting of two equal volumes V. The total unmixed entropy $S_{un}$ is [29]

$S_{un} = 2 k_B \log\frac{V^{N/2}}{(N/2)!}.$ (87)

Here we assume that the two separated ideal gases have the same masses and the same total energy. Let the partition between the gases be removed and they are allowed to mix. Then the mixed entropy Sm becomes

$S_{m} = 2 k_B \log\frac{(2V)^{N/2}}{(N/2)!}.$ (88)

According to Eqs. (87), (88), we obtain the increased entropy,

$\Delta S = S_m - S_{un} = N k_B\left[\log(2V) - \log V\right] = N k_B \log 2.$ (89)

The above equation means that there is a gain of $k_B\log 2$ in entropy per molecule during the mixing process. Since the temperatures and pressures are equal, removing the partition of the vessel does not involve any heat transfer; nevertheless, the mixing of the gases (e.g., by diffusion) always increases the entropy, and the mixing is spontaneous. The diffusion of initially separated gases increases the randomness of the molecular distribution. Therefore, it is appropriate to relate entropy to complexity (disorder).

2.4.4. Entropy as uncertainty

The third interpretation of entropy is as a measure of the uncertainty, or ignorance, about a complex system. This interpretation is more general and is strongly related to information and memory. In this interpretation, the entropy is not an intrinsic property of the system but represents our knowledge about it.

Next, we will focus on the non-equilibrium entropy and the information entropy to illustrate how entropy can be interpreted as uncertainty. The second law of thermodynamics states that the entropy of an isolated system never decreases, and remains unchanged if and only if all processes are reversible. What we are interested in is how to define the entropy for non-equilibrium systems. Generally, for both non-equilibrium and equilibrium systems, we use a probability distribution ρ to define an ensemble of states. Given a probability distribution over a discrete set of states, the entropy is

$S_{\text{discrete}} = -k_B\left\langle\log p_i\right\rangle = -k_B\sum_i p_i\log p_i,$ (90)

where $p_i$ is the probability of state i. In the case of continuous distributions, the entropy becomes

$S_{\text{continuum}} = -k_B\left\langle\log\rho\right\rangle = -k_B\int\rho\log\rho.$ (91)

We consider a microcanonical ensemble, in which $\rho_{\text{equil}} = \frac{1}{\Omega(E)\,\delta E}$; the non-equilibrium entropy then recovers the equilibrium entropy (Eq. (75)) up to a constant shift,

$S_{\text{micro}} = -k_B\left\langle\log\rho_{\text{equil}}\right\rangle = k_B\log\left(\Omega(E)\,\delta E\right) = k_B\log\left(\Omega(E)\right) + k_B\log(\delta E).$ (92)

For quantum systems, the non-equilibrium entropy can be described by the density matrix ρ, as in Eq. (77).

For non-thermodynamic systems, the temperature variable does not exist. Therefore we do not need Boltzmann’s constant $k_B$, but rather the constant $k_S = 1/\log(2)$. Then Eq. (76) becomes

$S_S = -\sum_i p_i\log_2 p_i,$ (93)

here the entropy is measured in bits. This is called information entropy, which was introduced by Shannon [232]. Information entropy represents an ensemble of possible messages or images, and has important implications in communication technologies (message passing) and computer science (data compression). Note that the non-equilibrium Shannon entropy satisfies the following properties: (1) it is maximum for equal probabilities; (2) it is unaffected by extra states of zero probability; and (3) it changes for conditional probabilities. For more details, we refer the reader to Ref. [29].
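A minimal sketch of Eq. (93), illustrating properties (1) and (2) listed above, is given below; the example distributions are arbitrary.

```python
import numpy as np

def shannon_entropy_bits(p):
    """Shannon entropy of a discrete distribution in bits, Eq. (93);
    states with zero probability are ignored (property (2))."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

print(shannon_entropy_bits([0.5, 0.5]))        # 1 bit, maximal for two states
print(shannon_entropy_bits([0.5, 0.5, 0.0]))   # unchanged by a zero-probability state
print(shannon_entropy_bits([0.9, 0.1]))        # ~0.47 bits, less uncertain
```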

2.4.5. Approximate entropy, sample entropy and system sample entropy

In the following, we will highlight several types of entropy, including Approximate Entropy (ApEn), Sample entropy (SampEn) and System Sample Entropy (SysSampEn), which have been developed to quantify the complexity in non-linear time-series.

ApEn

The ApEn was introduced by Pincus in 1991 [243], motivated by the exact regularity statistics, the Kolmogorov–Sinai (KS) entropy [244] and the K2 entropy [245]. It is used as a quantification of regularity in time-series data, in particular for relatively short and noisy datasets. ApEn has a wide range of applications in areas from medical data, such as heart rate [243], to financial systems [246], psychology [247] and human factors engineering [248].

Definition of ApEn: Given a time series of data $u(1), u(2), \ldots, u(N)$, i.e., N raw data points equidistant in time, form the sequence of vectors x(1) through x(N−m+1) defined by $x(i) = [u(i), \ldots, u(i+m-1)]$. Here, m is an integer representing the length of the compared runs of data. Define the embedding distance d[x(i), x(j)] [249] between the vectors x(i) and x(j) as the maximum difference of their respective scalar components. Then use $x(1), x(2), \ldots, x(N-m+1)$ to construct, for each $i \leq N-m+1$, $C_i^m(r) = (\text{number of } j \leq N-m+1 \text{ such that } d[x(i),x(j)] \leq r)/(N-m+1)$. Here,

$d[x(i),x(j)] = \max_{k=1,2,\ldots,m}\left(|u(i+k-1) - u(j+k-1)|\right),$ (94)

where r is a positive real number which specifies a filtering level. Based on the Cim(r), the ApEn is defined as

$\Phi^{m}(r) = (N-m+1)^{-1}\sum_{i=1}^{N-m+1}\log C_i^{m}(r).$ (95)

For fixed m and r, the ApEn is then defined as

$\mathrm{ApEn}(m,r) = \lim_{N\to\infty}\left[\Phi^{m}(r) - \Phi^{m+1}(r)\right].$ (96)

In practice, given N data points, the statistic is implemented as

$\mathrm{ApEn}(m,r,N) = \Phi^{m}(r) - \Phi^{m+1}(r).$ (97)

One should note that a similar entropy, called Eckmann–Ruelle (E-R) entropy [244], has been defined as,

$\text{E-R entropy} = \lim_{r\to 0}\,\lim_{m\to\infty}\,\lim_{N\to\infty}\left[\Phi^{m}(r) - \Phi^{m+1}(r)\right].$ (98)

Despite their algorithms being very similar, ApEn(m,r) is not intended to be an approximate value of the ER entropy. Compared to the KS, ER and K2 entropy, ApEn(m,r) has the following advantages [250]: (i) Less affected by noise; (ii) It is robust to outliers; (iii) Lower computational demand, with good confidence; (iv) ApEn(m,r) is finite for stochastic, noisy deterministic and composite processes; (v) Increasing ApEn(m,r) corresponds to intuitively increasing process complexity. ApEn(m,r) has been applied to classify the electroencephalogram (EEG) in psychiatric diseases, such as schizophrenia [251], epilepsy [252], and addiction [253].
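The following hedged Python sketch is a direct, non-optimized implementation of Eqs. (94)–(97). Setting the tolerance as a fraction r of the record’s standard deviation is a common convention adopted here rather than part of the original definition; self-matches are included, as in the ApEn construction.

```python
import numpy as np

def approximate_entropy(u, m=2, r=0.2):
    """ApEn(m, r, N) following Eqs. (94)-(97); r is interpreted as a
    fraction of the standard deviation of the record."""
    u = np.asarray(u, dtype=float)
    N = len(u)
    tol = r * np.std(u)

    def phi(m):
        # template vectors x(i) = [u(i), ..., u(i+m-1)]
        x = np.array([u[i:i + m] for i in range(N - m + 1)])
        # Chebyshev (maximum) distance between all pairs of templates, Eq. (94)
        d = np.max(np.abs(x[:, None, :] - x[None, :, :]), axis=2)
        C = np.mean(d <= tol, axis=1)          # C_i^m(r), self-match included
        return np.mean(np.log(C))              # Phi^m(r), Eq. (95)

    return phi(m) - phi(m + 1)                  # Eq. (97)

rng = np.random.default_rng(0)
print(approximate_entropy(rng.standard_normal(500)))        # noisy record: high ApEn
print(approximate_entropy(np.sin(0.1 * np.arange(500))))    # regular record: low ApEn
```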

SampEn

Motivated by the concept of ApEn, Richman and Moorman developed the sample entropy (SampEn) to assess the complexity of physiological time-series signals and to diagnose diseased states [254]. SampEn is a modification of ApEn with three advantages: (i) SampEn agrees much better than the ApEn statistics with the theory for random numbers having known probabilistic character over a broad range of operating conditions; (ii) it maintains relative consistency; and (iii) it shows only a residual bias for very short record lengths.

Definition of SampEn: Similar to the definition of ApEn, given a time series of N points, $\{u(j): 1 \leq j \leq N\}$ forms the $N-m+1$ vectors $x_m(i)$ for $\{i \mid 1 \leq i \leq N-m+1\}$, where $x_m(i) = \{u(i+k): 0 \leq k \leq m-1\}$ is the vector of m data points from u(i) to u(i+m−1). The distance between two vectors is $d[x_m(i), x_m(j)] = \max\{|u(i+k) - u(j+k)|: 0 \leq k \leq m-1\}$, the maximum difference of their corresponding scalar components. Let $B_i$ be the number of vectors $x_m(j)$ within r of $x_m(i)$ and let $A_i$ be the number of vectors $x_{m+1}(j)$ within r of $x_{m+1}(i)$, where self-matches (j = i) are excluded; A and B denote the corresponding totals. Here r represents the tolerance for accepting matches; it is convenient to set the tolerance to $r \times \mathrm{SD}$, where SD is the standard deviation of the dataset. Then the sample entropy is defined as:

$\mathrm{SampEn} = -\log\frac{A}{B}.$ (99)

It is clear that A will always be smaller than or equal to B, so SampEn is always either zero or positive. A smaller value of SampEn indicates more self-similarity in the dataset or less noise [254]. Note that the parameters N, m and r must be fixed for each calculation.
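A hedged sketch implementation of Eq. (99) is shown below; for simplicity it counts close template pairs over all available templates of each length rather than restricting both lengths to exactly the same number of templates, and the tolerance is again a fraction of the record’s standard deviation.

```python
import numpy as np

def sample_entropy(u, m=2, r=0.2):
    """SampEn(m, r, N) following Eq. (99): self-matches are excluded and
    r is taken as a fraction of the standard deviation of the record."""
    u = np.asarray(u, dtype=float)
    N = len(u)
    tol = r * np.std(u)

    def count_matches(m):
        x = np.array([u[i:i + m] for i in range(N - m)])   # templates of length m
        d = np.max(np.abs(x[:, None, :] - x[None, :, :]), axis=2)
        # count ordered pairs (i != j) with distance below the tolerance
        return np.sum(d <= tol) - len(x)                   # remove self-matches

    B = count_matches(m)
    A = count_matches(m + 1)
    return -np.log(A / B)

rng = np.random.default_rng(0)
print(sample_entropy(rng.standard_normal(500)))         # higher: irregular record
print(sample_entropy(np.sin(0.1 * np.arange(500))))     # lower: self-similar record
```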

SysSampEn

One limitation of the aforementioned entropies is that they can only be applied to univariate or bivariate time series (cross-ApEn) [254]. For a complex system with multivariate time series and spatio-temporal structure, Meng et al. proposed the so-called SysSampEn and applied it to study the climate system. Based on the SysSampEn, they could measure the complexity (disorder) of a system composed of temperature anomaly time series and forecast the magnitude of an El Niño event with a prediction horizon of 1 year and high accuracy [241].

Definition of SysSampEn: Given N interdependent time series xα(t)(α=1,2,,N;t=1,2,,l) of length l composing the system:

  • 1.

    We select sub-records k of length m < l, starting at every qth data point, i.e., starting at $t = k\times q+1 = 0\times q+1, 1\times q+1, 2\times q+1, \ldots$, as long as $k\times q+m \leq l$. Thus a specific sub-sequence is denoted as $X_\alpha^k(m,q) = \{x_\alpha(k\times q+1), x_\alpha(k\times q+2), \ldots, x_\alpha(k\times q+m)\}$. Then we select n sub-sequences from each time series and construct a set of $N\times n$ template vectors from the system, i.e., $\Theta(m,q,n) = \{X_\alpha^k(m,q): 0 \leq k \leq n-1,\ 1 \leq \alpha \leq N\}$. We assume that two vectors are close (similar) if their Euclidean distance $d(X_\alpha^i(m,q), X_\beta^j(m,q)) < \gamma\times\max\{\sigma_\alpha,\sigma_\beta\}$ (if α = β, then i ≠ j), where $\sigma_\alpha$ and $\sigma_\beta$ are the standard deviations of the time series $x_\alpha(t)$ and $x_\beta(t)$, respectively. γ defines the similarity criterion and is a nonzero constant.

  • 2.

    To examine the probability that two time series which are close at m data points will still be close at the next p data points, we construct analogously another set $\Theta(m+p,q,n)$ by selecting sub-records of length m+p. To make the number of template vectors of length m equal to that of length m+p, we choose $n \leq \frac{l-m-p}{q}+1$. In order to reduce the number of free parameters and save computation time, we take p = q, so that $n \leq \frac{l-m}{p}$. We assume that two template vectors from the set $\Theta(m+p,q,n)$ are close if $d(X_\alpha^i(m+p,q), X_\beta^j(m+p,q)) < \gamma\times\max\{\sigma_\alpha,\sigma_\beta\}$ (if α = β, then i ≠ j).

  • 3.
    The SysSampEn of the system is defined as
    $\mathrm{SysSampEn}(N, m, p, l_{\mathrm{eff}}, \gamma) = -\log\frac{A}{B},$ (100)
    where A is the number of close vector pairs from the set $\Theta(m+p,q,n)$, B is the number of close vector pairs from the set $\Theta(m,q,n)$, and $l_{\mathrm{eff}}(n) = n\,p + m$ is the number of data points (days) used in the calculation of the SysSampEn.

Note that, when N = 1, p = 1 and $l_{\mathrm{eff}} = l$, the SysSampEn is equivalent to the classical SampEn [254]. Appropriate parameter values have to be identified, since only certain combinations of values can be used to estimate a system’s complexity with considerable accuracy. In order to achieve this, Meng et al. introduced two tests [241]: a spatial asynchrony test and a temporal disorder test. The SysSampEn was calculated for the climate system composed of the near-surface air or sea surface temperature anomaly time series in the Niño 3.4 region. It was found that a strong positive correlation exists between the El Niño magnitudes and the SysSampEn values of the previous calendar year (Fig. 15A,B). Fig. 15C demonstrates the calculation of the accuracy level, which allows choosing effective parameter combinations.
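A hedged sketch of the SysSampEn calculation defined above is given below. It follows the construction of the template sets Θ(m, q, n) and Θ(m+p, q, n) with p = q, but it makes simplifying choices (brute-force pair counting, n = ⌊(l−m)/p⌋) and uses purely synthetic data with arbitrary parameters; it is not the implementation of Ref. [241].

```python
import numpy as np
from itertools import combinations

def sys_sample_entropy(X, m, p, gamma):
    """Hedged sketch of the SysSampEn of Eq. (100) for a system X of shape
    (N, l): N interdependent time series of length l. Sub-records start
    every q = p data points; template vectors of lengths m and m + p are
    compared with the Euclidean distance and the similarity criterion
    gamma * max(sigma_alpha, sigma_beta)."""
    X = np.asarray(X, dtype=float)
    N, l = X.shape
    q = p                                   # p = q, as in the text
    n = (l - m) // p                        # number of sub-records per series
    sigma = X.std(axis=1)

    def count_close(length):
        # template vectors of the given length, labelled by their series alpha
        templates = [(alpha, X[alpha, k * q:k * q + length])
                     for alpha in range(N) for k in range(n)]
        count = 0
        for (a, va), (b, vb) in combinations(templates, 2):
            if np.linalg.norm(va - vb) < gamma * max(sigma[a], sigma[b]):
                count += 1
        return count

    B = count_close(m)                      # close pairs in Theta(m, q, n)
    A = count_close(m + p)                  # close pairs in Theta(m + p, q, n)
    return -np.log(A / B)

# illustrative use: 10 noisy series of length 365 (all parameters are arbitrary)
rng = np.random.default_rng(0)
X = rng.standard_normal((10, 365))
print(sys_sample_entropy(X, m=30, p=5, gamma=8.0))
```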

Fig. 15.

Correlation between SysSampEn and El Niño magnitude.

Reprinted figure from Ref. [241].

In addition, to reveal the mathematical meaning of the SysSampEn, Fig. 16 shows the logistic map as an example of applying the SysSampEn to estimate the system complexity and compare it with the Lyapunov exponents. It is found that higher (lower) values of the SysSampEn are strongly associated with higher (lower) Lyapunov exponents, which indicates that the SysSampEn can well capture the complexity of the system.

Fig. 16.

A logistic map example showing that the SysSampEn gives an accurate estimation of the system complexity, i.e., higher values of the SysSampEn indicate higher complexity of the system. For each system, 10 time series are generated from the logistic equation $x_{n+1} = \mu x_n(1-x_n)$ with the same value of $\mu \in [2.5, 4]$ but different initial conditions $x_0 \in [0.0001, 0.0002]$.

Reprinted figure from Ref. [241].

3. Applications

3.1. Climate system

The Earth’s climate system is a highly complex and interactive system. It was defined by the Global Atmospheric Research Programme (GARP) of the World Meteorological Organization (WMO) in 1975 as being composed of five major components: the atmosphere, the hydrosphere, the cryosphere, the land surface and the biosphere. In 1992, the United Nations’ Framework Convention on Climate Change (FCCC) defined the climate system as ‘the totality of the atmosphere, hydrosphere, biosphere and geosphere and their interactions’. Fig. 17 shows a schematic representation of the most important components of the climate system and their potential changes. The atmosphere is the gaseous envelope surrounding the solid planet and provides oxygen to most animal life at the Earth’s surface. It is the most unstable and rapidly changing component. Currently, the atmosphere is made up primarily of nitrogen (78.1%) and oxygen (20.9%). There are also a number of trace gases that are of great concern for the future of the planet, including carbon dioxide (CO2), methane (CH4) and ozone (O3). They are referred to as greenhouse gases (GHGs) and, along with water vapor, provide the Earth with the greenhouse effect; their effect is to keep our Earth warm, but not too warm. The hydrosphere is the water on the planet, including oceans, seas, rivers, lakes and underground water. The cryosphere is the portion of the Earth’s surface where water is in solid form, consisting of land ice (including ice shelves and glaciers), snow and sea ice. It strongly impacts the climate system through its high albedo, and variations in the volume of the ice sheets are regarded as one of the major factors for sea level rise. The land surface, i.e., the “solid” Earth, determines the distribution of continents, ocean basins and mountain ranges. It influences the transformation of short-wave to long-wave radiation, the reflectivity of the Earth’s surface, the reservoir of dust, and the transfer of momentum and energy at the land surface. The biosphere is the total sum of all living things on the planet, including the organic cover of the land masses (vegetation, soil) and marine organisms. It plays a vital role in the exchange of carbon between the different reservoirs, such as the concentration of CO2 in the atmosphere, as well as in the balances of other gases [255]. There are numerous interactions between the components through the exchange of mass, heat and momentum. For instance, the ocean–atmosphere interaction is a strongly coupled system exchanging, among others, water vapor and heat through evaporation. Each climate system component operates on a rather broad range of characteristic temporal and spatial scales. Note that the climate system itself is often considered as part of the broader Earth system [2].

Fig. 17.

Schematic view of the most important components and associated processes of the climate system on a global scale.

Figure is from the IPCC’s Fourth Assessment Report (AR4) [255].

© 2020 IPCC. Reproduced with permission.

A fundamental factor in climate science analyses is the climate data, which provide information on changes in the Earth’s ecosystems and supply decision makers with a reliable knowledge base on the climate crisis and its impacts. In general, there are two data collection methods to gather quantitative and qualitative climate data (also briefly discussed in Section 1.1): observations (instrumental, proxies and reanalysis) and climate model simulations. Observational data of the climate system are based on direct measurements and on remote sensing from satellites, radar and other platforms. The first meteorological stations were established in Europe and North America. Global-scale observations in the instrumental era began in the mid-19th century for temperature and other climatic variables. The coverage and quality of the observational network and the technology supporting the collection and storage of data have evolved rapidly from 1950 onwards. Paleoclimate reconstructions extend some proxy records back hundreds to millions of years, including coral records [10], [256] and tree-ring records [13] for the last few millennia, as well as stalagmite δ18O time series [12], Central-Mediterranean sediments [11] and ice-core records [257].

Climate studies require data of consistent spatial resolution and accuracy over long time intervals. To satisfy this goal, some numerical weather prediction centers have started to produce so-called reanalyses, which are obtained by sophisticated data assimilation techniques that efficiently combine observational and numerical data. For example, the European Centre for Medium-Range Weather Forecasts (ECMWF) [258] provides the following reanalysis datasets: ERA5, ERA-Interim, CERA-SAT, CERA-20C, ERA-20CM, ERA-40, ERA-20C and so on. The NCEP-NCAR Reanalysis I/II, the 20th Century Reanalysis and the North American Regional Reanalysis (NARR) are produced in collaboration by the U.S. National Centers for Environmental Prediction (NCEP) and the National Center for Atmospheric Research (NCAR) [259]. The JRA-25 and JRA-55 reanalyses are produced by the Japan Meteorological Agency (JMA) [260], [261]. Note that most climatic variables or fields in these reanalysis data agree very well with observed data, such as the geopotential fields over the continents of the Northern Hemisphere. However, substantial differences persist in some fields over the Southern Hemisphere or in those that are poorly observed [262], [263], [264], [265], [266]. More details are summarized in Ref. [267].

Climate models are considered the primary tools available for investigating the response of the climate system to various forcings, for making climate predictions on seasonal to decadal time scales and for making projections of the future climate over the coming century and beyond. Climate models can be defined as mathematical representations of the climate system based on physical, chemical and biological principles. The models used in climate research range from simple energy balance models to complex Earth System Models (ESMs), which can be slow and costly to use, even on high-performance computers, and whose results can only be approximations [17]. Here, we provide a brief overview of the basic types of climate models: (i) Energy balance models are a highly simplified representation of the dynamics of the climate system. They are zero- or one-dimensional models describing the surface temperature as a function of the energy balance of the Earth, where the number of dimensions, from zero to three, refers to the number of independent space variables used in the model domain. For instance, the evolution of the global mean surface temperature $\bar{T}_s$ can be expressed as:

$C\frac{d\bar{T}_s}{dt} = R_i - R_o, \qquad R_i = \mu Q_0\left\{1-\alpha(\bar{T}_s)\right\}, \qquad R_o = \sigma\, m(\bar{T}_s)\,\bar{T}_s^{\,4}.$ (101)

Here $R_i$ and $R_o$ are the absorbed shortwave radiation and the outgoing longwave radiation, C is the heat capacity of the Earth system in units of J m⁻² K⁻¹, $Q_0$ is the mean incoming solar radiation, α is the planetary albedo, m is the transmissivity of the atmosphere (a number less than 1 that represents the greenhouse effect of the Earth’s atmosphere), $\sigma \approx 5.67\times 10^{-8}$ W m⁻² K⁻⁴ is the Stefan–Boltzmann constant, and μ is an insolation parameter equal to unity for present-day conditions. (ii) Atmosphere–Ocean General Circulation Models (AOGCMs) are the most comprehensive standard climate models assessed in the IPCC Fourth Assessment Report (AR4). They are used to understand the dynamics of the physical components of the climate system (atmosphere, ocean, land and sea ice) and to make projections based on future GHG and aerosol forcing. (iii) Earth System Models (ESMs) are the current state-of-the-art models; they expand on AOGCMs by including representations of various biogeochemical cycles, such as those involved in the carbon cycle, the sulfur cycle, or ozone. These models provide the most comprehensive tools available for simulating the past and future response of the climate system to external forcing, in which biogeochemical feedbacks play an important role. (iv) Regional Climate Models (RCMs) are limited-area models with representations of climate processes comparable to those in the atmospheric and land surface components of AOGCMs, though typically run without an interactive ocean and sea ice. RCMs are often used to dynamically ‘downscale’ global model simulations for a particular geographical region in order to provide more detailed information. In addition, based on the GCMs, models of intermediate complexity were developed to investigate the Earth system on long timescales or at a reduced computational cost [20]. A special application is to achieve reliable predictions of extreme events on a sub-seasonal scale.
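As an illustration of the simplest model class, the hedged Python sketch below integrates the zero-dimensional energy balance model of Eq. (101) with a forward Euler scheme, using a constant albedo and transmissivity. All parameter values are illustrative assumptions, not calibrated values from the literature; with these choices the model relaxes to an equilibrium temperature of roughly 288 K.

```python
import numpy as np

# Zero-dimensional energy balance model, Eq. (101), with illustrative parameters.
sigma = 5.67e-8          # Stefan-Boltzmann constant [W m^-2 K^-4]
Q0 = 342.0               # mean incoming solar radiation [W m^-2] (assumed)
C = 1.0e8                # effective heat capacity [J m^-2 K^-1] (assumed)
mu = 1.0                 # insolation parameter (present day)
alpha = 0.30             # planetary albedo, here taken constant (assumed)
m_trans = 0.61           # transmissivity representing the greenhouse effect (assumed)

def step(T, dt=86400.0):
    """One Euler step of C dT/dt = R_i - R_o."""
    R_i = mu * Q0 * (1.0 - alpha)
    R_o = sigma * m_trans * T**4
    return T + dt * (R_i - R_o) / C

T = 255.0                                 # initial temperature [K]
for _ in range(365 * 50):                 # ~50 years of daily Euler steps
    T = step(T)
print("equilibrium temperature [K]:", T)  # ~288 K with these illustrative values
```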

Taken together, climate models provide a comprehensive view of the variability and long-term changes in the climate systems. However, assessing and forecasting climate extreme events, such as El Niño events and extreme rainfall, still poses challenges. The difficulties are mainly due to the intrinsically rare and disruptive natural conditions. They are strongly influenced by weather patterns, modes of variability, thermodynamic processes and various feedbacks. In the following, we will provide an in-depth and detailed description of how the statistical physics methodologies introduced in Section 2 can be applied to investigate and (or) predict various climate phenomena, such as El Niño–Southern Oscillation (ENSO), Indian Summer Monsoon (ISM), Extreme Rainfall, Atmospheric Circulation and Atlantic Meridional Overturning Circulation (AMOC).

3.1.1. El Niño–Southern Oscillation

One of the most important climate phenomena on year-to-year time scales is the El Niño Southern Oscillation (ENSO). As mentioned in Section 2.3, ENSO can be regarded as a tipping element of the Earth system. El Niño is the warm phase of the ENSO and is associated with a band of warm ocean water that develops in the central and east-central equatorial Pacific, sometimes called El Niño basin [44]. The cool phase of ENSO is called La Niña, with SSTs in the eastern Pacific below average. ENSO significantly impacts Earth’s ecosystems and human societies, by influencing temperatures and precipitation around the globe, including the Americas, India and surrounding tropical continents [268], [269], [270]. These remote effects are known as teleconnections [271].

Definition of El Niño: In this review, we use the Oceanic Niño Index (ONI) to define and distinguish El Niño and La Niña events. The ONI is one of the primary indices used to monitor ENSO. It is calculated as the 3-month running mean of SST anomalies (based on centered 30-year base periods updated every 5 years) in an area of the east-central equatorial Pacific Ocean called the Niño 3.4 region (5°S to 5°N; 170°W to 120°W); see the Niño regions presented in Fig. 18. If the ONI is > +0.5°C for at least five consecutive months, the event is defined as an El Niño; whereas, when the ONI is < −0.5°C for at least five consecutive months, the corresponding event is regarded as a La Niña. The ONI is used for the operational definition of the ENSO phase by NOAA. There are also some other indices to measure El Niño events, such as the Niño 1+2, 3, 3.4, 4 and Trans-Niño indices. Besides the SST, ENSO also influences other climate variables such as atmospheric pressure gradients, upper ocean currents and the thermocline depth [272].

Fig. 18.

Niño regions. The Niño 3.4 region (5°S to 5°N; 170°W to 120°W); the Niño 4 region (5°S to 5°N; 160°E to 150°W); the Niño 3 region (5°S to 5°N; 150°W to 90°W); the Niño 1+2 region (10°S to 0°; 90°W to 80°W).

Figure is from NOAA.

ENSO transition mechanism in brief: In the following, we describe the physical mechanism of ENSO. Bjerknes first postulated that the ocean–atmosphere interaction is essential for ENSO; this coupling is now referred to as the Bjerknes feedback [273]. This positive feedback is understood to be a prominent mechanism necessary for the development of ENSO. Consider an initial SST anomaly in the central/eastern tropical Pacific. This anomaly reduces the east–west SST gradient and hence weakens the Walker circulation [274], resulting in a westerly wind anomaly. The westerly wind anomaly, in turn, drives a change in the ocean circulation (normally, easterly winds shoal the equatorial thermocline and induce upwelling in the eastern Pacific, keeping the east cool) that further reinforces the SST anomaly. As a result of this positive feedback, the tropical Pacific reaches a warm state, i.e., El Niño. Tziperman and Yu pointed out that the westerly wind bursts, while partially stochastic, seem to be an inherent part of the large-scale deterministic ENSO dynamics [275].

Besides the positive Bjerknes feedback, the oscillatory nature of ENSO requires a negative feedback to turn the coupled ocean–atmosphere system from a warm (El Niño) phase to a cold (La Niña) phase. Different negative feedbacks have been proposed since the 1980s, associated with the delayed oscillator, the western Pacific oscillator, the recharge–discharge oscillator, and the advective–reflective oscillator. The delayed oscillator is a mechanism for the oscillatory nature of ENSO, which was proposed by McCreary based on the Rossby wave reflection at the ocean western boundary [276]. Suarez and Schopf introduced the delayed oscillator to explain the oscillatory feature of ENSO, emphasizing the delayed effects of oceanic wave reflection at the ocean western boundary [277]. Cane and Zebiak developed the first coupled ocean–atmosphere model and used it to predict ENSO [278], [279]. The conceptual delayed oscillator model is represented as follows:

$\frac{dT}{dt} = AT - BT(t-\eta) - \varepsilon T^{3},$ (102)

where T represents the SST anomaly in the equatorial eastern Pacific, A,B,η and ε are constants. The first term on the right hand side of Eq. (102) stands for the Bjerknes positive feedback between the ocean and the atmosphere. The second term represents the delayed negative feedback of wave reflection at the ocean western boundary. The last term is a cubic damping term. It has been found that the delayed oscillator overlooks the role of the ocean–atmosphere interaction in the western Pacific and also assumes that wave reflection at the ocean eastern boundary is not important [280].
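The oscillatory behavior produced by the delayed negative feedback can be illustrated with the hedged Python sketch below, which integrates Eq. (102) with a simple Euler scheme and a history buffer for the delayed term. The parameter values are illustrative choices that produce self-sustained relaxation oscillations between warm and cold states (in nondimensional units); they are not values taken from Ref. [277].

```python
import numpy as np

# Euler integration of the delayed oscillator, Eq. (102):
# dT/dt = A*T - B*T(t - eta) - eps*T**3 (illustrative parameter values)
A, B, eta, eps = 1.0, 1.6, 6.0, 1.0   # growth rate, delayed feedback, delay, damping
dt, t_max = 0.01, 200.0
n, lag = int(t_max / dt), int(eta / dt)

T = np.zeros(n)
T[:lag + 1] = 0.1                      # constant initial history
for k in range(lag, n - 1):
    dT = A * T[k] - B * T[k - lag] - eps * T[k] ** 3
    T[k + 1] = T[k] + dt * dT

# T oscillates between warm (El Nino-like) and cold (La Nina-like) states
print(T.min(), T.max())
```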

Since the delayed oscillator does not consider the western-Pacific anomaly pattern, Weisberg and Wang [281] developed a western-Pacific oscillator model for ENSO. This model stresses that the equatorial wind anomalies in the far western Pacific play an important role in the evolution of ENSO. The western-Pacific oscillator model is represented as:

$\frac{dT}{dt} = a\tau_1 + b_2\tau_2(t-\delta) - \varepsilon T^{3},$ (103)
$\frac{dh}{dt} = -c\tau_1(t-\lambda) - R_h h,$ (104)
$\frac{d\tau_1}{dt} = dT - R_{\tau_1}\tau_1,$ (105)
$\frac{d\tau_1}{dt} = dT - R_{\tau_1}\tau_1,$ (106)
$\frac{d\tau_2}{dt} = -eh - R_{\tau_2}\tau_2,$ (107)

where T is the SST anomaly in the equatorial eastern Pacific, h is the thermocline-depth anomaly in the off-equatorial western Pacific, and τ1 and τ2 are the equatorial zonal wind-stress anomalies in the central Pacific and the western Pacific, respectively. All model parameters are constants.

It has been found that the rise and fall of the sea level over the western Pacific Ocean are related to ENSO [282]. Based on these ideas, Jin [283], [284] developed a recharge–discharge oscillator model for ENSO, which is represented by:

\frac{dT}{dt} = CT + Dh - \varepsilon T^{3}, \qquad (108)
\frac{dh}{dt} = -ET - R_{h}h, \qquad (109)

where T is the SST anomaly in the equatorial eastern Pacific and h is the thermocline-depth anomaly in the equatorial western Pacific. The model parameters C, D, ε, E and Rh are constants. In particular, Eqs. (108), (109) represent the discharge and recharge of the tropical Pacific Ocean heat content.
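The recharge–discharge dynamics of Eqs. (108), (109) can be illustrated with a few lines of code; the parameter values below are again illustrative assumptions (chosen so that the linear part is a weakly growing oscillation that the cubic term saturates into a limit cycle), not values fitted to observations.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative constants for dT/dt = C*T + D*h - eps*T**3, dh/dt = -E*T - Rh*h.
C, D, eps, E, Rh = 0.2, 1.0, 1.0, 1.0, 0.1

def recharge_oscillator(t, y):
    T, h = y
    return [C * T + D * h - eps * T**3, -E * T - Rh * h]

sol = solve_ivp(recharge_oscillator, (0.0, 100.0), [0.5, 0.0], max_step=0.1)
T, h = sol.y
# T leads h by roughly a quarter cycle, the hallmark of recharge-discharge.
print("T range:", round(float(T.min()), 2), "to", round(float(T.max()), 2))
```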

Picaut et al. [285] proposed a conceptual advective–reflective oscillator model for ENSO. They found that the positive feedback results from ocean zonal currents that advect the western-Pacific warm pool (WPWP) towards the east. There are three negative feedbacks that tend to push the WPWP back to the western Pacific, including: (i) the anomalous zonal current associated with wave reflection at the ocean western boundary, (ii) the anomalous zonal current associated with wave reflection at the ocean eastern boundary, and (iii) the mean zonal current converging at the WPWP’s eastern edge. Unlike the other three oscillator models, the advective–reflective oscillator model does not come with a set of simple heuristic equations. Instead, Picaut et al. used a linear ocean numerical model forced by wind anomalies, which were associated with the zonal currents of the first baroclinic Kelvin and first meridional Rossby waves [195].

Moreover, based on the dynamics and thermodynamics of Zebiak and Cane’s coupled model, as well as on all previous ENSO oscillator models, Wang [286] developed a unified ENSO oscillator ocean–atmosphere model, with the hypothesis that ENSO may be a multi-mechanism phenomenon in which the relative importance of the different mechanisms may depend on time. The unified ENSO oscillator model is formulated as:

\frac{dT}{dt} = a\tau_{1} - b_{1}\tau_{1}(t-\eta) + b_{2}\tau_{2}(t-\delta) - b_{3}\tau_{1}(t-\mu) - \varepsilon T^{3}, \qquad (110)
\frac{dh}{dt} = c\tau_{1}(t-\lambda) - R_{h}h, \qquad (111)
\frac{d\tau_{1}}{dt} = dT - R_{\tau_{1}}\tau_{1}, \qquad (112)
\frac{d\tau_{2}}{dt} = eh - R_{\tau_{2}}\tau_{2}, \qquad (113)

where T, h, τ1 and τ2 are variables that stand for the SST anomalies in the equatorial eastern Pacific, the thermocline-depth anomalies in the off-equatorial western Pacific, the zonal wind-stress anomalies in the equatorial central Pacific and the zonal wind-stress anomalies in the equatorial western Pacific, respectively. For a given set of parameters, Eqs. (110)–(113) can oscillate on interannual time scales.

El Niño forecasting:

Since the 1990s, both dynamical and statistical models have been used to model and forecast El Niño events, providing society with the opportunity to prepare for associated hazards such as heavy rains, floods and droughts [287], [288]. Although about 17 dynamical and 8 statistical models at CPC/IRI currently provide routine ENSO forecasts, high prediction skill is generally limited to about six months ahead, see the description in Fig. 19. The reason is the presence of the so-called “spring predictability barrier” (SPB), where errors are greatly amplified due to the coupled feedbacks in the equatorial ocean–atmosphere system [289], [290]. After spring, the predictive skill of the models improves considerably.

Fig. 19.


Spring Predictability Barrier. The El Niño forecasting skill of models, based on February–October observations, for predicting the November–January average value in the Niño 3.4 region (see Fig. 18). Results are the average correlation coefficient over 20 models for 2002–2011.

Figure is from NOAA.

In general, poor forecast skill can be attributed to two categories of causes: imperfections in the forecast system (including errors in the model parameterization of sub-grid-scale motions, the relative scarcity of data points, and errors in the measurements themselves), and fundamental limits of predictability [272]. The fundamental limits usually stem from the complex and chaotic nature of the system.

To overcome the limits of the SPB in El Niño forecasting, several CN-based approaches were developed. For example, by analyzing the CN, Ludescher et al. found that a large-scale cooperative mode, linking the El Niño region and the rest of the ocean, exists about one calendar year before the warming event [291]. This mode is probably due to the emergence of El Niño acting as an autonomous component in the CN [61]. All nodes used in the CN are shown in Fig. 20a, where the 14 grid points in the El Niño basin [61] are marked in red and the 193 grid points outside this domain are marked in blue. The data are the daily surface air temperature (SAT) from the NCEP/NCAR Reanalysis I. As described in Section 2, the first step is to remove the seasonal trend by subtracting the climatological average for each calendar day. Then, the links between the El Niño basin and the outside nodes (see Fig. 20a) are considered. The strength Wi,j(t) of each link is computed according to Eq. (15). The mean strength S(t) of the dynamical links in the CN is obtained by averaging over all individual link strengths. Fig. 20b shows the time evolution of S(t) for the learning (top) and forecasting (bottom) phases. They found that before an El Niño event, S(t) tends to increase. An alarm is given when S(t) crosses a threshold Θ=2.82 from below (see Fig. 20b). The prediction accuracy, assessed through a Receiver Operating Characteristic (ROC)-type analysis (explained later), is much better than that of 12-month forecasts based on climate models. Based on these findings regarding the temporal evolution of the CN, Ludescher et al. successfully predicted the onset of the 2014–2016 El Niño event [292].
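A schematic of this alarm scheme is sketched below: given a precomputed array of link strengths between basin and non-basin nodes at each time step, it averages them into S(t) and issues an alarm whenever S(t) crosses the threshold from below. The random input, the function names and the threshold handling are our own simplifications of the published algorithm.

```python
import numpy as np

def mean_link_strength(W):
    """W: array of shape (time, n_basin, n_outside) holding link strengths
    between El Nino-basin nodes and outside nodes.  Returns S(t)."""
    return W.mean(axis=(1, 2))

def alarms(S, theta=2.82):
    """Indices where S(t) crosses the threshold theta from below."""
    above = S > theta
    return np.where(~above[:-1] & above[1:])[0] + 1

# Toy demonstration with synthetic link strengths (14 basin x 193 outside nodes).
rng = np.random.default_rng(1)
W = 2.6 + 0.5 * rng.standard_normal((500, 14, 193))
S = mean_link_strength(W)
S[300:320] += 0.4      # mimic the pre-El Nino strengthening of the mean link weight
print("alarm time steps:", alarms(S))
```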

Fig. 20.


Climate Network and El Niño forecasting algorithm. a. The nodes in CN. Each node inside the El Niño basin (solid red symbols) is linked to each node outside the basin (open blue symbols). The red rectangle denotes the Niño 3.4 region. b. The red curve is the average link weight S(t) of the CN, the horizontal line indicates the decision threshold Θ=2.82, and the blue areas show the El Niño events. When S(t) crosses the threshold from below, an alarm is given that an El Niño event will start in the following calendar year.

Reprinted figure from Ref. [291].

Motivated by this work, Meng et al. developed a multidisciplinary approach combining climate science, network theory, and percolation theory to predict the onset of El Niño [65]. Their method can forecast El Niño events 1 year ahead, with a high prediction accuracy of 70% and a low false alarm rate of only 4%. In contrast to Ref. [291], they considered a global CN with initially 726 isolated nodes. Links are added one by one according to their strength, i.e., the link with the highest weight is added first, and links are then selected in order of decreasing weight. During the evolution of the CN, they found that the structure of the network changes drastically approximately one year ahead of El Niño events, and thus suggested using the largest change in the size of the largest cluster (percolation cluster), Δ [see Eq. (48)], as a percolation-based precursor to forecast El Niño events. In addition, based on a finite-size scaling analysis, they found that the percolation process is discontinuous.
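The core of this percolation precursor is tracking the largest cluster while links are inserted in order of decreasing weight; a minimal union-find sketch of that bookkeeping is given below. The weighted-edge input is synthetic, and Δ is taken here as the largest single-step jump of the biggest cluster, which is our reading of Eq. (48); details such as node selection and significance testing are omitted.

```python
import numpy as np

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return self.size[ra]
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        return self.size[ra]

def largest_jump(n_nodes, weighted_edges):
    """Add edges by decreasing weight; return the largest one-step increase
    of the biggest cluster (the percolation-based precursor Delta)."""
    uf, giant, delta = UnionFind(n_nodes), 1, 0
    for _, i, j in sorted(weighted_edges, reverse=True):
        giant_new = max(giant, uf.union(i, j))
        delta = max(delta, giant_new - giant)
        giant = giant_new
    return delta

rng = np.random.default_rng(2)
edges = [(rng.random(), i, j) for i in range(50) for j in range(i + 1, 50)]
print("Delta =", largest_jump(50, edges))
```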

To forecast both the onset and the magnitude of El Niño events, another sophisticated network-based approach was developed by Meng et al. [66]. They proposed a new forecasting index based on CN links representing the similarity of low-frequency temporal temperature anomaly variations between different sites in the Niño 3.4 region. They found that significant upward trends in the index forecast the onset of El Niño approximately 1 year ahead, and the highest peak since the end of the last El Niño forecasts the magnitude of the following event. Their index was also tested on several datasets, including ERA-Interim, NCEP Reanalysis I, PCMDI-AMIP 1.1.3 and ERSST.v5. They defined the degree of coherence/disorder of the Niño 3.4 region as,

C(t) = \frac{2}{N(N-1)} \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} C_{i,j}^{(t)}(\theta), \qquad (114)

where N is the number of nodes in the Niño 3.4 region and C_{i,j}^{(t)}(\theta) stands for the maximum value of the time-delayed cross-correlation function, see Eqs. (12), (13). The forecasting index (FI) was proposed as a function of time (months),

FI(t) = \frac{1}{m+1} \sum_{a=0}^{m} \ln C(t-a) - \ln C(t), \qquad (115)

where a=0 indicates that the average of ln(C) includes the current month, while m is the total number of months preceding t since Jan. 1981. We show the time evolution of the FI and ONI in Fig. 21 by using the ERA-Interim dataset. Based on the temporal evolution of the FI, they successfully predicted the onset of the 2018–2019 El Niño event, see the purple star above the purple arrow in Fig. 21b.
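Under the (assumed) simplification that the maximal lagged cross-correlations C_{i,j}^{(t)}(θ) are replaced by zero-lag correlations over a trailing window, Eqs. (114), (115) can be sketched as follows; the window length, node count and random data are illustrative only.

```python
import numpy as np

def coherence(anom_window):
    """Eq. (114)-style mean pairwise correlation of the nodes in a window.
    anom_window: array (window_length, N) of temperature anomalies."""
    corr = np.corrcoef(anom_window.T)          # N x N correlation matrix
    iu = np.triu_indices_from(corr, k=1)
    return np.abs(corr[iu]).mean()

def forecasting_index(C_series):
    """Eq. (115): mean of ln C over all months up to t (inclusive) minus ln C(t)."""
    logC = np.log(C_series)
    return logC.mean() - logC[-1]

# Toy monthly series for N = 22 Nino 3.4 nodes, 120 months, 12-month windows.
rng = np.random.default_rng(3)
data = rng.standard_normal((120, 22))
C = np.array([coherence(data[t - 12:t]) for t in range(12, 120)])
FI = [forecasting_index(C[: k + 1]) for k in range(len(C))]
print("last FI value:", round(FI[-1], 3))
```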

Fig. 21.


The El Niño forecasting algorithm. a. The red curve is the forecasting index defined in Eq. (115), and the blue line stands for the ONI. The horizontal black dashed line indicates FI=0, and the blue shades under the ONI curve indicate the El Niño periods. True positive forecasts are marked by green arrows, while false alarms are marked by black dashed arrows. The purple arrow is the alarm for the 2018–2019 El Niño event. b. A detailed view of FI(t) and the ONI since May 2012.

Reprinted figure from Ref. [66].

Note that one can use the peak of the FI to predict the magnitude of the following El Niño event. This peak value is determined over the period from the end of the last event to the start of a new event, which means that one has to wait until the new event begins. To further improve the forecasting skill, in particular the magnitude forecasting with a long lead time, Meng et al. [241] developed the SysSampEn method, see more details in Section 2.4.5. It is defined in Eq. (100) and shown in Fig. 15a. The correlation between the magnitude of the El Niño events and the SysSampEn can reach r=0.99 [see Fig. 15b]. They further used this correlation to forecast the magnitude of an El Niño with a prediction horizon of 1 year and high accuracy (i.e., a Root Mean Square Error of 0.23 °C for the average of the individual dataset forecasts). In particular, for the 2018–2019 El Niño event, they forecasted a weak El Niño with a magnitude of 1.11 ± 0.23 °C, compared to the observed value of 0.9 °C.

Recently, Feng et al. combined machine learning techniques and climate networks for El Niño prediction, in particular for the occurrence of El Niño events and the development of the NINO 3.4 index over time [293]. In Ref. [294], Petersik and Dijkstra applied Gaussian density neural network and quantile regression neural network ensembles to predict ENSO. Classical machine learning models usually predict only a point value, e.g., the ONI. In contrast, their models directly forecast a probability distribution for the ONI. In particular, they found that their machine learning models have a high correlation skill (r=0.5) for long lead times (12 months ahead) for an evaluation period between 1963 and 2017. In Ref. [295], Ham, Kim and Luo used transfer learning to train a convolutional neural network on historical simulations and reanalysis data, and developed heat map analyses. They found that a statistical forecast model employing a deep-learning approach produces skillful ENSO forecasts for lead times of up to one and a half years. A review of ENSO prediction based on modern machine learning and artificial neural networks is given in Ref. [296].

ENSO remote impacts: teleconnections

As introduced above, the ENSO phenomenon originates in the tropical Pacific Ocean. However, its effects are felt over a large part of the Earth. These remote effects are usually called teleconnections, emphasizing that changing climatic conditions in one place can affect areas far from the source [271], [297]. A teleconnection is usually indicated by the correlation between the values observed at two separate locations or regions. It is related to a pattern of variability, associated with atmospheric wave propagation, the presence of ocean currents, etc. For example, the teleconnections associated with the North Atlantic Oscillation (NAO) and ENSO are prime examples of long-range correlations in the atmosphere. Because of these global teleconnections, El Niño leads to higher precipitation in the central Pacific, dry conditions over Indonesia and Northern Australia, and much drier and warmer conditions in Mozambique, while the western USA tends to be wetter. Fig. 22 shows weather patterns statistically associated with El Niño conditions during the boreal winter months December–January–February. The regions that have been shown with some degree of reliability to be affected by warm ENSO phases are highlighted, for example, warm-vs.-cold and wet-vs.-dry anomalies. The effects of the cold ENSO phase, La Niña, occur in approximately the same regions, but with the opposite sign.

Fig. 22.


El Niño remote impacts: teleconnections, during the boreal winter months December–January–February. Regions with statistically reliable relation of precipitation and surface air temperature to a warm ENSO episode.

Figure is reproduced with permission from NOAA.

The remote influences of ENSO are produced mainly in the atmosphere via large-scale atmospheric waves, such as Rossby waves. Rossby waves are fundamental for the understanding of weather systems in the atmosphere and the large-scale circulation in the ocean. They depend fundamentally upon the variation of the Coriolis parameter with latitude. In the so-called β-plane approximation, the Coriolis parameter varies linearly with latitude, with

\beta = \frac{2\Omega}{R}\cos\phi, \qquad (116)

where ϕ is the latitude, Ω is the angular speed of the Earth’s rotation, and R is the mean radius of the Earth. The wavelength of atmospheric Rossby waves is about 6000 km. The atmospheric wave dynamics are akin to those observed in the ocean for ENSO, in that they involve a transfer of atmospheric mass by anomalous winds, which in turn affects the pressure and wind fields elsewhere.

In order to assess whether a region has an ENSO connection, Fan et al. applied the Pearson-correlation CN approach to investigate the global remote impacts of El Niño and La Niña [64]. They refer to the Niño 3.4 region as the El Niño Basin (ENB). The CNs were constructed by using only directed links from the ENB to regions outside the ENB (“in”-links), based on the global daily SAT fields of the NCEP/NCAR Reanalysis and ERA-Interim datasets. The link strengths are computed according to Eqs. (12), (13), (15). They identified the highest peak of the absolute value of the cross-correlation function C^y_{i,j}(τ) and denoted the corresponding time lag as θ^y_{i,j}. The adjacency matrix of a CN then becomes

A_{i,j}^{y} = (1-\delta_{i,j})\, H(\theta_{i,j}^{y}), \qquad (117)

where H(x) is the Heaviside step function, for which H(x ≥ 0) = 1 and H(x < 0) = 0. The “in” and “out” degrees of each node are defined as I_i^y = \sum_j A_{j,i}^y and O_i^y = \sum_j A_{i,j}^y, respectively, quantifying the number of links into and out of a node. The total “in” weights for each node outside the ENB are defined as

IN(C)_{i}^{y} = \sum_{j\in \mathrm{ENB}} A_{j,i}^{y}\, C_{j,i}^{y}(\theta), \qquad IN(W)_{i}^{y} = \sum_{j\in \mathrm{ENB}} A_{j,i}^{y}\, W_{j,i}^{y}. \qquad (118)

The values of IN(C)_i^y and IN(W)_i^y reflect the impacts of the ENB. Specifically, if there are no “in” links for a node, both the “in” degree and the “in” weights are zero, indicating no impact of the ENB. Based on the ONI, the time is divided into El Niño, La Niña and normal years. The “in”-weighted degree fields for El Niño and La Niña are calculated by averaging over events of the same type,

IN(C)_{i} = \frac{\sum_{y\in EY(LY)} IN(C)_{i}^{y}}{S}, \qquad IN(W)_{i} = \frac{\sum_{y\in EY(LY)} IN(W)_{i}^{y}}{S}, \qquad (119)

where S = \sum_{y\in EY(LY)} I_{i}^{y}, and “EY” and “LY” refer to the years in which El Niño and La Niña events start. The number of nodes with nonzero in-degree (i.e., nodes that have in-links) in year y is denoted N^y, and the average in-weight per node is

C^{y} = \frac{\sum_{i\notin \mathrm{ENB}} \sum_{j\in \mathrm{ENB}} A_{j,i}^{y}\, C_{j,i}^{y}(\theta)}{N^{y}}. \qquad (120)

Fig. 23(a–d) shows that the regions affected by El Niño and La Niña are characterized by relatively high in-weights (using C and W). For comparison, mean winter (Dec–Feb) temperature anomalies during El Niño and La Niña years are shown in Fig. 23(e–f). A quantitative analysis of the area that is affected/unaffected during El Niño and La Niña years is shown in Fig. 24, where El Niño and La Niña years are emphasized by the red and blue shading, respectively. The network analysis reveals strongly (high C^y) localized (low N^y) impacts of ENSO, i.e., the stronger in-weighted activities are found to be concentrated in rather localized areas, whereas a large fraction of the globe is not influenced by the events (a low N^y means that fewer nodes are influenced).
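A compact sketch of Eqs. (117), (118) is given below: given per-year lag and peak cross-correlation matrices between ENB and non-ENB nodes, it builds the directed adjacency with the Heaviside condition on the lag and accumulates the “in” degrees and “in” weights of the outside nodes. The array shapes and random inputs are assumptions for illustration only.

```python
import numpy as np

def in_degree_and_weights(theta, corr):
    """theta, corr: arrays of shape (n_enb, n_out) with the time lag and the
    peak cross-correlation of each directed ENB -> outside pair for one year.
    Returns the 'in' degree and 'in' weight of every outside node (Eqs. 117-118)."""
    A = (theta >= 0).astype(float)               # Heaviside step on the lag
    in_degree = A.sum(axis=0)                    # I_i^y
    in_weight = (A * np.abs(corr)).sum(axis=0)   # IN(C)_i^y
    return in_degree, in_weight

# Toy year: 14 ENB nodes, 712 outside nodes.
rng = np.random.default_rng(4)
theta = rng.integers(-150, 150, size=(14, 712))
corr = rng.uniform(0.0, 1.0, size=(14, 712))
deg, wgt = in_degree_and_weights(theta, corr)
n_influenced = (deg > 0).sum()                   # N^y: nodes with at least one in-link
print("influenced nodes:", n_influenced,
      " mean in-weight:", round(float(wgt[deg > 0].mean()), 2))
```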

Fig. 23.


El Niño remote impacts: Climate Networks. (a), (c) “In”-weight maps (using C and W) for El Niño events. (b),(d), “In”-weight maps (using C and W) for La Niña events. (e), (f) Mean winter (Dec–Feb) temperature anomalies during El Niño and La Niña.

Reprinted figure from Ref. [64].

Fig. 24.


Strongly localized impacts of ENSO. (a) The ONI as a function of time. (b) The evolution of the number of nodes that have in-links with time. (c) The evolution of the average in-weights per node with time. El Niño and La Niña periods are respectively emphasized by the red and blue shading.

Reprinted figure from Ref. [64].

Diversity of El Niño

It has been reported that there are different types of El Niño events, such as the canonical eastern Pacific (EP) and the Modoki central Pacific (CP) types [298], [299]. They are classified based on the spatial patterns of the SST anomaly. For instance, the strongest SST anomalies associated with EP events are located off the coast of South America, whereas for CP events the strongest SST anomalies are located near the International Date Line. In Ref. [300], Wiedermann et al. proposed an index based on evolving CNs to objectively discriminate between different types of El Niño and La Niña episodes. In particular, they found that their network index displays a sharp peak during EP events, whereas during CP events it remains close to the values observed during normal years. Based on the CN analysis, Lu et al. developed a new approach to estimate the impacts of El Niño events in advance, as well as to predict the type of an El Niño event [301].

3.1.2. Indian summer monsoon

The Indian Summer Monsoon (ISM) is the most prominent among the world’s monsoon systems and has a decisive influence on India’s agricultural output and economy. The ISM delivers more than 70% of the country’s annual rainfall. The prediction of the ISM timing (onset and withdrawal dates) as well as of the ISM rainfall is a vital issue for the Indian subcontinent and strongly affects the gross domestic product of the country, up to 22% of which is determined by agriculture [302]. A slight deviation in the timing, manifested as a delayed (or early) arrival, may lead to drastic droughts (or floods). Also, swings in the amount of rainfall, even variations of only 10%, can cause severe flooding or drought, causing damage to infrastructure and loss of crops and livelihoods [303]. The onset of the ISM takes place abruptly, and the long lead-time operational forecasts for the onset, withdrawal and rainfall amount have shown little skill in recent decades.

ISM onset and withdrawal forecasting

Several statistical climate models have been developed for forecasting the ISM onset and withdrawal dates on different timescales, from short-range (up to 4 days) and medium-range (4–10 days) [304], [305] to extended-range (10–30 days) and long-range (more than 30 days) forecasts [306]. In particular, the Indian Meteorological Department (IMD) provides a forecast of the monsoon onset 21 days in advance, with an accuracy of ±4 days. Note that most of the ISM forecasting models are based on input values including SST, mean sea level pressure, tropospheric moisture, moist static energy, outgoing longwave radiation, wind fields, etc. [307], [308], [309], [310], [311]. However, there are still impending challenges for the existing forecasting methods: (i) false monsoon onsets, mostly related to non-monsoonal atmospheric circulation systems [312]; and (ii) the limitations in predicting a withdrawal date earlier than 1st September.

To overcome these challenges, Stolbova et al. proposed a novel tipping-elements approach to accurately forecast the ISM onset and withdrawal dates [95]. This approach relies on the combination of two methodologies: climate network analysis and nonlinear dynamics. First, the analysis of the network of extreme rainfall events allows one to identify the teleconnection between the most active hubs, the Eastern Ghats (EG) and North Pakistan (NP). Second, the application of nonlinear dynamics, through the analysis of fluctuations, allows one to detect critical transitions, which are used to define the tipping elements of the ISM. The EG and NP were then chosen as optimal observation locations, or reference points (RPs), to predict the ISM onset and withdrawal. It was discovered that there is a tipping point between these RPs for the SAT and the relative humidity, which yields a very successful prediction scheme.

The ISM CN is focused on the monsoon region (62.5–97.5°E, 5.0–40.0°N) with a spatial resolution of 2.5 degrees, which results in 15×15=225 grid points, see Fig. 25. The near-surface air temperature (T) at 1000 hPa, the relative humidity (rh) at 1000 hPa, and the wind fields at 700 hPa were used. The variance σ² of T and rh for each grid point is first calculated as [95]

\sigma^{2}(x,d,w,y) = \left\langle \left( x_{t(y)-d-k} - \bar{x}_{t(y)-d-k} \right)^{2} \right\rangle_{w} = \sum_{k=1}^{w} \frac{1}{w}\left( x_{t(y)-d-k} - \sum_{i=1}^{w} \frac{x_{t(y)-d-i}}{w} \right)^{2}, \qquad (121)

where x(t) stands for a time series, w is the length of the time window, d is the number of days before the onset of the ISM, y is a given year, and t(y) is the onset date for the given year y. The time average value of σ2 is simply expressed as

\bar{\sigma}^{2}(x,d,w) = \left\langle \sigma^{2}(x,d,w,y) \right\rangle_{y} = \frac{1}{Y}\sum_{y=1}^{Y} \sigma^{2}(x,d,w,y), \qquad (122)

where Y stands for the total number of years. The results are shown in Fig. 25. It has been confirmed that the EG and NP regions experience the highest growth of σ̄² for both T and rh when approaching the monsoon onset date. It has been further discovered that the mean T (and rh) time series of the two RPs intersect twice during the year over the period from 1951 to 2014. These two intersections coincide with the mean onset and withdrawal dates as determined by the IMD, with a standard deviation of ±5 days for the EG region, see Fig. 26. This allows one to derive a prediction scheme for forecasting the onset and withdrawal of the ISM in the EG, based on the trends of T and rh in the RPs.
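The pre-monsoon variance of Eq. (121) is essentially a trailing-window variance evaluated d days before each year's onset date; a short sketch is given below, where the daily series, window length and onset dates are synthetic stand-ins rather than reanalysis data.

```python
import numpy as np

def pre_onset_variance(x, onset_index, d, w):
    """Eq. (121): variance of x over the w days ending d days before the onset."""
    window = x[onset_index - d - w: onset_index - d]
    return np.var(window)          # population variance, 1/w normalization

def mean_pre_onset_variance(series_by_year, onsets, d, w=7):
    """Eq. (122): average of the yearly variances over all years."""
    return np.mean([pre_onset_variance(x, t0, d, w)
                    for x, t0 in zip(series_by_year, onsets)])

# Synthetic example: 20 years of daily 'temperature' whose fluctuations grow
# towards a nominal onset around day 152 (~June 1st).
rng = np.random.default_rng(5)
years = [rng.standard_normal(365) * np.linspace(0.5, 1.5, 365) for _ in range(20)]
onsets = [152] * 20
for d in (21, 7, 1):
    print(f"d = {d:2d} days before onset:",
          round(mean_pre_onset_variance(years, onsets, d), 3))
```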

Fig. 25.


Pre-monsoon growth of the variance of fluctuations σ̄² (defined in Eq. (121)) of the weekly mean values of the near-surface air temperature T, (a) d = 21 days, (b) d = 7 days, and (c) d = 1 day before the ISM onset in the Eastern Ghats (EG). Composites are for the period 1958–2001 and were calculated from the ERA40 reanalysis dataset; 700 hPa winds are indicated by the blue lines. The two boxes mark the RPs: North Pakistan, NP (blue) and EG (pink). (e) Growth of the variance of fluctuations in NP (blue), EG (pink), and averaged over the Indian subcontinent (IS) when approaching the onset date of the monsoon.

Figure from Ref. [95].

Fig. 26.


Prediction of the onset date (OD) and withdrawal date (WD) of the ISM: case study 2012. Left column: prediction of the OD; right column: prediction of the WD in the Eastern Ghats (EG). (a), (c): air temperature at 1000 hPa; (b), (d): relative humidity at 1000 hPa. Time series from the reference points: 14-year mean (black) and 2012 values for North Pakistan (NP) (blue) and the EG (red). Gray lines show the time series from NP and EG for the training period of 14 years. The saturation temperature Tsat (a) and saturation humidity rhsat (c) are marked by horizontal black solid lines (Tsat = Tonset; Tonset and rhsat are calculated as the intersection of the mean time series of the EG and NP for the training period), and the day of saturation dsat (when the temperature in the EG in 2012 reaches Tsat) by vertical dark blue lines. The orange lines indicate the trends of the mean time series in NP and EG for the training period, the light blue lines the trends for 2012. Black solid lines indicate the mean values of the OD (⟨OD⟩) and WD (⟨WD⟩) for the training period. Vertical dotted gray lines correspond to the predicted onset ODp and withdrawal WDp dates, while vertical solid gray lines are the actual onset and withdrawal dates for 2012.

Figure from Ref. [95].

Next, a linear trend estimation for the two RPs is performed and compared with the trends of the mean time series of the training period (1951–1965). The slopes of the trends for the RPs provide an estimate of an early, normal, or late monsoon arrival: a greater-than-average slope of T leads to an earlier-than-usual onset, and vice versa. Trends of rh in the RPs, compared with the average trends of the training period, add to the predictability of the onset: higher-than-average values of the relative humidity lead to a late onset, and vice versa.

It has also been found that the ISM onset coincides with the date when T in the EG and in NP become equal (see Fig. 26a). Therefore, for the forecasting of the onset, one should predict when T for the EG will abruptly decrease and intersect T for NP. Finally, the prediction scheme of the withdrawal is based on the symmetry of T changes in NP during the year. The withdrawal is estimated as the intersection of the projected T decrease in NP and the T in the EG during the monsoon season (see Fig. 26c).

The prediction is regarded as successful if the time difference between the predicted date and the real one is within ±7 days for the onset and within ±10 days for the withdrawal [95]. The proposed scheme using T results in 74% successful predictions of the onset when made on day 125 of the year (April 10). The prediction scheme for the withdrawal succeeds in 84% of the years when made on day 205 (July 25). Based on this prediction scheme, they have successfully predicted both the onset and withdrawal in the EG region since 2016.9

ISM rainfall forecasting

The ISM is one of the most prominent monsoon systems. It affects the Indian subcontinent, where it is one of the oldest and most anticipated weather phenomena. Several theories have been proposed to explain the underlying mechanism of the monsoon. For example, the sea-breeze theory: because of the differences in the specific heat capacity of land and water, continents heat up faster than seas. Consequently, the air above coastal land heats up faster than the air above the sea. This creates areas of low air pressure above coastal land compared with the pressure over the sea, causing moisture-laden winds to flow from the sea onto the land. On reaching land, these winds rise because of the geographical relief, cool adiabatically, and produce orographic rain. However, the understanding and predictability of the ISM are still evolving.

The All India Rainfall Index (AIRI) is the total amount of ISM June-to-September rainfall averaged over the entire Indian subcontinent. It is the IMD’s primary indicator for monitoring the ISM rainfall (ISMR) [313]. An accurate long-term prediction of the ISMR is crucial for taking timely actions and mitigation activities. Over the last few decades, great efforts have been undertaken to understand the basic physics of the monsoon and to improve the prediction skill for the AIRI [314], [315], [316], [317], [318], [319], [320]. However, both statistical and dynamical methods face the challenge of relationships with the predictors that are unstable over time, as observed during recent years due to a weakened coupling between the boundary forcing and the Indian monsoon [321], [322]. In particular, the forecasting skill (cross-correlation) of the official operational forecasts made by the IMD statistical models is −0.12 for 1989–2012, the skill of the five ENSEMBLE models’ multi-model ensemble (MME) is 0.09, and that of the four APCC/CliPAS models’ MME is 0.24 after 1989. It was discovered that the recent failure is largely due to the models’ inability to capture new predictability sources emerging during the recent global warming [303].

Traditionally, the April–May SST anomaly in the Niño 3 or Niño 3.4 region was used as a predictor for the ISMR. This is because the ISMR variations are primarily driven by the EP-ENSO through tropical teleconnections [314], [315]. This inverse relationship between the AIRI and the EP-ENSO, however, has weakened in recent decades [270]. It has been found that the CP-ENSO also has distinct teleconnections and effects on the ISMR [324]. Based on this physical mechanism, Wang et al. [303] proposed four physical–empirical AIRI predictors: an EP-ENSO predictor (EPT), a CP-ENSO predictor (CPT), a mega-ENSO predictor (PSH) and an Anomalous Asian Low predictor (NAT). The EPT is defined as the May-minus-March east–west SST dipole tendency: DSST (20°S–5°N, 150°E–170°E) minus DSST (10°S–10°N, 110°W–80°W); the CPT is defined as the May-minus-April north–south SST dipole tendency: DSST (10°S–25°S, 170°E–160°W) minus DSST (5°N–20°N, 180°–150°W); the PSH is defined as the April–May mean SLP averaged over (40°S–10°S, 160°W–90°W) and (10°N–30°N, 180°–130°W); the NAT is defined as the May-minus-March SLP averaged over (45°N–60°N, 95°E–125°E). Here, DSST denotes the SST difference between May and March for the EPT, and between May and April for the CPT. Together, these predictors produce an independent forecast skill of 0.51 for 1989–2012 and a 92-year retrospective forecast skill of 0.64 for 1921–2012. However, the forecasting lead-time of this model is quite short, i.e., the forecast starts from May (one-month lead-time) or June (IMD’s operational model).

In order to achieve the goal of a long-term and reliable prediction of the ISM rainfall, Fan et al. constructed a series of dynamical and physical CNs based on the global near-surface air temperature field [323]. It was uncovered that some characteristics of the directed and weighted CNs can serve as efficient long-term predictors for ISM rainfall forecasting. The developed prediction method produces a forecast skill of 0.5 with a 5-month lead-time, by using the previous calendar year’s data. This skill is comparable to that of the current state-of-the-art models, which, however, have a much shorter (i.e., within one month) lead-time. The new network-based prediction approach is described as follows.

CN construction: Based on the NCEP/NCAR reanalysis daily SAT anomaly data, a CN is constructed for each calendar year from 1948 to the present. In this work, 726 nodes are selected such that the globe is covered approximately homogeneously [61]. The link strength and direction are determined based on a similarity measure between the temperature anomaly time series of the nodes [see Eqs. (12)–(16)]. A pair of nodes is defined to be connected only if their link is within the top 5% of positive strengths, which corresponds to a statistical significance above the 95% confidence level.

Network Predictors Mining: To mine the dynamical network predictors for the AIRI, the cross-correlation between the observed AIRI and the time series of the network in-degree K_i^y [see Eq. (6)] of each node i on the globe is calculated. Because the forecasting skill for the AIRI has become very poor in the IMD operational forecasts and other models since 1989 [303], the record is divided into two consecutive periods: (i) 1950–1988, the training period, and (ii) 1989–2018, the retrospective forecast period. For the training period, the correlation coefficients are shown in Fig. 27a. It was found that the in-degree index of some nodes captures well the behavior of the AIRI. In particular, the node with the maximal correlation r (in the region marked by a yellow box in Fig. 27a) is located in the South Atlantic Ocean [(30°S, 30°W)]. Fig. 27b shows that the value of this maximal correlation is r=0.49, with a Student’s t-test significance of p<0.01. Therefore, the in-degree index of this key node is regarded as the optimized network predictor for the AIRI.

Fig. 27.


Schematic of ISM rainfall forecasting based on climate networks. (a) The correlation coefficients between observed AIRI and in-degree time series during the training period for all nodes. (b) Scatter plots of the observed AIRI versus the optimized network predictor (i.e., the in-degree time series of the key node in the yellow box) in the training period. (c) Scatter plots of the observed AIRI versus the predicted AIRI during the forecast period 1989–2018. (d) Teleconnection path between the key node in South Atlantic and Indian subcontinent. Black × in panel (a) represent the regions with correlations significant at the 95% confidence level (Student’s t-test). The recent five years are highlighted in red in panel (c). The colors and gray arrows depict the magnitudes and directions of the 850 hPa winds on 12th August 2014 during the Indian summer monsoon season in panel (d).

Reprinted figure from Ref. [323].

AIRI Forecasting: Next, the forecasting capability of the network predictor during the period 1989–2018 is tested. To obtain the forecasted value of the AIRI for a specific year, the forecasted year’s predictor value (i.e., the previous year’s network in-degree of the key node) is substituted into the least-squares linear regression equation. This equation used for prediction is derived using only past information. The independent forecast skill for the period 1989–2018 then becomes 0.50, which is significantly higher than the IMD operational forecast skill [see Fig. 27c]. This result indicates the good accuracy of the network-based predictor with the long lead-time of 5 months. Finally, based on 2018’s data, the AIRI for 2019 was forecasted as 949.87 mm, a departure of about 7% from the long-term average of 887.5 mm.10 The actual value of the AIRI for 2019 was 968.3 mm.

In the following, a potential physical mechanism is discussed. From a meteorological perspective, the teleconnection between the key node’s location in the mid-latitudes of the South-West Atlantic Ocean and the Indian subcontinent is no coincidence. The orography of the eastern coast of the South American continent is prone to the formation of an anticyclone at mid-latitudes due to the interaction with the mid-latitude Westerlies (winds that blow from the west at the surface level between 30° and 60°S). Moreover, there is the influence of the South polar jet stream, which appears within the upper-air Westerlies. The acceleration/deceleration of the South polar jet stream induces areas of low/high pressure, respectively, which is linked to the formation of cyclones and anticyclones. Because the South polar jet stream strengthens and weakens seasonally, and from year to year, it synchronizes with the variability in the strength of the circulation centered around the location of the key node. One of the possible data-based teleconnection paths is shown in Fig. 27d. Hence, the key node location acts as a nexus of the South Atlantic circulation, the South polar jet stream and the main atmospheric circulation patterns over the Indian Ocean. This explains why a significant correlation between the network characteristics of the key node and the AIRI is observed.

Further analyses reveal that the physical mechanisms are related to (i) the network-delayed ENSO teleconnection, i.e., the in-degree index is locally minimized in the calendar year preceding an El Niño onset; and (ii) the ENSO–Monsoon teleconnection [270], [325], [326], i.e., the AIRI anomaly and the Niño 3.4 SST anomaly index are negatively correlated. The spatial correlation maps of the Niño 3.4 SST anomaly index with both the SST and the rainfall index for all nodes are presented in Fig. 28. These results indicate that the Niño 3.4 SST anomaly index is significantly anti-correlated with the SST in the equatorial Indian Ocean and also heavily influences the rainfall pattern over the Indian land region.

Fig. 28.


Physical mechanisms of ISM rainfall forecasting. Correlation between the seasonal mean (Jun–Sep) Niño 3.4 SST and (a) SSTs (b) rainfall at each grid point. Statistically significant (95% confidence level) correlations are stippled.

Reprinted figure from Ref. [323].

Moreover, the network-based prediction scheme can also be successfully applied to forecast the rainfall in Indian homogeneous regions, including two specific regions [323]: (1) the Central India Rainfall Index; and (2) the East & Northeast India Rainfall Index.

In addition, there is solid evidence that climate change affects the forecast skill. The considerable warming trend in the last 70 years – the 850 hPa temperature in the key node area has risen by 2.5 °C – has substantially impacted the key node region. An increase of the in-degree index and a change in the structure of the network itself are observed. Concurrently with these changes, the prediction skill of this forecasting method for the ISM rainfall amount is improving substantially [323].

3.1.3. Extreme rainfall

Extreme precipitation events have produced more rain and have become more common since the 1950s in many regions of the world. Heavy precipitation poses threats to our society. The most immediate impact is the threat of flooding, a risk that is heightened in urban and agricultural areas. In addition to flooding, extreme precipitation also increases the risk of landslides. Excessive precipitation can also degrade water quality, harming human health and ecosystems. Thus it is crucial to predict extreme rainfall events and to establish an early warning system for them. However, the analysis of spatial patterns of co-variability of such events is challenging, because traditional techniques based on principal component analysis of the covariance matrix only capture the first two statistical moments of the data distribution and are not suitable for analyzing the behavior in the tails of the respective distributions [327].

Extreme Rainfall over India

To overcome these challenges, Malik et al. applied the CN approach to analyze the spatial and temporal patterns of extreme rainfall during the ISM [92], [328]. This approach is based on the event synchronization method (see Section 2.1.3). In Ref. [328], the dataset is a daily gridded precipitation set for 1961–2004, developed as part of the project Asian Precipitation Highly Resolved Observational Data Integration Towards the Evaluation of Water Resources.11 They extracted the data for the South Asian region at a 0.5 degree resolution. For calculating the event synchronization, two different thresholds, α = 94th percentile and α = 90th percentile, were chosen to obtain the extreme event time series, referring to very heavy and heavy rainfall events, respectively. Next, they computed the strength of synchronization Qij and the delay behavior qij between each pair of grid sites i and j [see Eq. (20)]. After normalization, 0 ≤ Qij ≤ 1 and −1 ≤ qij ≤ 1, where Qij = 1 means complete synchronization and Qij = 0 stands for the absence of synchronization, while qij = 1 means that events at i always precede events at j. The CN is then constructed by selecting only pairs of sites that show strong correlations, and the time lags between the events are used to define the direction of the links.
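For readers unfamiliar with Section 2.1.3, the sketch below gives a stripped-down version of event synchronization in the spirit of Quian Quiroga et al.: two event-time lists are compared, an event pair is counted as synchronized when its time separation is smaller than a dynamical delay, and the symmetric strength Q and the asymmetry q are normalized as in the text. Tie-breaking rules, a maximum-delay cutoff and the significance test of the published method are omitted, so this is an assumption-laden illustration rather than the exact algorithm of Refs. [92], [328].

```python
import numpy as np

def event_synchronization(t1, t2):
    """Simplified event synchronization of two sorted event-time arrays.
    Returns (Q, q) with 0 <= Q <= 1 and -1 <= q <= 1."""
    c12 = c21 = 0.0
    for i, ti in enumerate(t1):
        for j, tj in enumerate(t2):
            # Dynamical delay: half the smallest neighboring inter-event interval.
            gaps = []
            if i > 0: gaps.append(t1[i] - t1[i - 1])
            if i < len(t1) - 1: gaps.append(t1[i + 1] - t1[i])
            if j > 0: gaps.append(t2[j] - t2[j - 1])
            if j < len(t2) - 1: gaps.append(t2[j + 1] - t2[j])
            tau = 0.5 * min(gaps) if gaps else np.inf
            dt = ti - tj
            if 0 < dt <= tau:
                c12 += 1.0            # event in series 1 follows one in series 2
            elif 0 < -dt <= tau:
                c21 += 1.0            # event in series 2 follows one in series 1
            elif dt == 0:
                c12 += 0.5
                c21 += 0.5
    norm = np.sqrt(len(t1) * len(t2))
    Q = (c12 + c21) / norm            # synchronization strength
    q = (c21 - c12) / norm            # delay asymmetry
    return Q, q

# Two nearly co-occurring extreme-event sequences (days of the year).
print(event_synchronization(np.array([10, 45, 80, 130]),
                            np.array([12, 44, 81, 129])))
```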

The spatial structures, organization, and scales of extreme rainfall over India during the ISM period were investigated. Fig. 29 shows that the degree centrality [see Eq. (4)] of the CNs reveals well-defined spatial structures in the monsoonal extreme precipitation [92]. Based on the median geographical length of the links, calculated using the formula for a spherical Earth projected onto a plane, it has been uncovered that extreme rainfall events are synchronized over distances of up to 250 km for most of the region. They further found that such a spatial organization opens the possibility of predicting the probable spatial extent of monsoonal rainfall activity without delay; e.g., for most of the subcontinent, the accuracy of the prediction ϵ is above 70%, and it even reaches 100% in certain places [92].

Fig. 29.


Degree centrality of the event synchronization CN for extreme rainfall over the Indian subcontinent. The extreme events threshold is (a) α=94% and (b) α=90%.

Figure from Ref. [92].

Extreme Rainfall over South America

This event synchronization CN approach was also applied to predict extreme rainfall events in the Central Andes of South America [94]. Based on real-time satellite-derived rainfall data (TRMM 3B42V7 [329]), Boers et al. were able to predict more than 60% of the rainfall events above the 99th percentile in the Central Andes, and even 90% during El Niño conditions. The interplay of northward-migrating frontal systems and a low-level wind channel from the western Amazon to the subtropics is considered to be the responsible mechanism.

Extreme rainfall events are defined as events above the threshold α = 99th percentile for all Dec–Jan–Feb seasons in the spatial domain (85°W, 30°W) and (40°S, 15°N), at a spatial resolution of 0.25° and a 3-hourly temporal resolution, for the time period from 1998 to 2012. After the construction of the event synchronization CN (see Section 2.1.3), the network divergence ΔS is defined as

\Delta S_{i} \equiv S_{i}^{\mathrm{in}} - S_{i}^{\mathrm{out}} = \sum_{j=1}^{N} A_{ij} - \sum_{j=1}^{N} A_{ji}, \qquad (123)

where S^in and S^out stand for the in-strength and out-strength at each node, respectively. Positive values of ΔS indicate sinks of the network; negative values indicate sources. The strengths into and out of a region R are

S_{i}^{\mathrm{in}}(R) = \frac{1}{|R|}\sum_{j\in R} A_{ij}, \qquad (124)
S_{i}^{\mathrm{out}}(R) = \frac{1}{|R|}\sum_{j\in R} A_{ji}, \qquad (125)

where |R| denotes the number of grid points contained in R.

Fig. 30a shows the network divergence ΔS (defined in Eq. (123)), which estimates the dynamics and temporal order of extreme rainfall in South America. The boxes labeled 1 to 7 are used for the tracking of extreme events. It was found that the most pronounced source region of the rainfall network is South-Eastern South America (SESA), defined as the box ranging from (60°W, 53°W) to (35°S, 30°S). Fig. 30b depicts where synchronized extreme events occur within two days after extreme events occurred in SESA, measured by S_i^out(SESA). For comparison, S_i^in(SESA) is shown in Fig. 30c. This analysis reveals that extreme events in SESA are followed by extreme events along a narrow band following the eastern Andean slopes up to western Bolivia, while they are only preceded by extreme events to the southwest. Based on the propagation of extreme rainfall from SESA to the eastern Central Andes (ECA), see Fig. 30d, extreme rainfall in the ECA is predicted within two days after extreme events have occurred in SESA. It should be noted that these extreme rainfall events could not be forecasted using other methods.
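Given a directed adjacency matrix of the event synchronization network, the network divergence and the region-restricted strengths of Eqs. (123)–(125) reduce to a few array operations; the sketch below uses a random adjacency matrix and an arbitrary index set as the region R.

```python
import numpy as np

def network_divergence(A):
    """Eq. (123): Delta S_i = in-strength minus out-strength of node i,
    with A[i, j] the weight of the directed link j -> i counted as 'in'."""
    return A.sum(axis=1) - A.sum(axis=0)

def strength_into_node_from_region(A, region):
    """Eq. (124): average strength of links from region R into each node."""
    return A[:, region].sum(axis=1) / len(region)

def strength_out_of_node_into_region(A, region):
    """Eq. (125): average strength of links from each node into region R."""
    return A[region, :].sum(axis=0) / len(region)

rng = np.random.default_rng(6)
A = (rng.random((200, 200)) < 0.05).astype(float)   # sparse directed network
np.fill_diagonal(A, 0.0)
sesa = np.arange(0, 20)                              # hypothetical region R
print("largest sink:", int(np.argmax(network_divergence(A))))
print("mean S_in(R):", round(float(strength_into_node_from_region(A, sesa).mean()), 3))
```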

Fig. 30.


Event synchronization CN for extreme rainfall over South America. (a) Network divergence, ΔS, defined as the difference of in-strength and out-strength at each node (see Eq. (123)). The boxes labeled 1 to 7 are used for the tracking of extreme events. (b) Strength out of SESA, S_i^out(SESA), which is the average out-strength restricted to SESA (see Eq. (125)). (c) Strength into SESA, S_i^in(SESA), which is the average in-strength restricted to SESA (see Eq. (124)). (d) Temporal evolution of extreme rainfall events from SESA to ECA along the sequence of boxes indicated in (a).

Figure from Ref. [94].

Extreme Rainfall over the Globe

The event synchronization CN method was also applied by Boers et al. to investigate the global pattern of extreme rainfall [96]. The gauge-calibrated, satellite-derived rainfall dataset TRMM 3B42 V7,31 with daily temporal resolution, from 50°N to 50°S and for the period 1998–2016, is used. The extreme rainfall event thresholds are the 80th, 81st, …, 99th percentiles of the wet days (>1 mm) for the Jun–Jul–Aug season. Note that consecutive days with rainfall above the threshold are considered as single events and placed on the first day of occurrence. To study the robustness of the method, Global Precipitation Climatology Project32 and NCEP/NCAR Reanalysis 1 data were also tested.

First, the synchronization of extreme events for each pair of nodes is computed (see Section 2.1.3). Network links are added between two nodes if the corresponding synchronization values are significant (p<0.005). The global distribution of the spatial distances over which significant synchronizations occur was studied. This distribution decays as a power law, p(d) ∼ d^{−α}, with an exponent α very close to 1 for distances d < 2500 km, but exhibits a super-power-law decay for longer distances. The scale break indicates that the significant links can be divided into two distinct classes: (i) links associated with regional weather systems, with distances up to 2500 km, which include mesoscale convective systems and tropical cyclones; and (ii) links associated with global-scale teleconnections. Such teleconnections are generally understood to be caused either by direct signal transport due to large-scale atmospheric circulations or by propagating waves triggered by disturbances of these circulations [96]. In Fig. 31, the teleconnection pattern for south-central Asia (SCA) for events above the 95th percentile is presented. Pronounced link bundles connecting SCA with eastern Asia, the African tropics, large parts of Europe and the eastern coast of North America, as well as the Southern Hemisphere extratropics, are apparent. Further analysis reveals that Rossby waves can be regarded as the physical mechanism underlying these global teleconnection patterns.

Fig. 31.


Event synchronization CN for extreme rainfall over globe. Teleconnection pattern for south-central Asia (SCA) for events above the 95th percentile. (a) Links connected to SCA, before correcting for the multiple-comparison bias. (b) Link bundles attached to SCA, after correcting for the multiple-comparison bias.

Figure from Ref. [96].

3.1.4. Atmospheric circulation and global warming

Earth’s atmosphere, made up essentially of the gases that surround our planet, exhibits circulation patterns that move air from one location to another and from the surface to upper levels. The atmospheric circulation is controlled by Earth’s rotation, barometric pressure, topography, ocean currents, and differences in temperature, salinity, etc.

Latitude structure of the circulation

Since latitudinal gradients in solar energy input are dominant drivers of the atmospheric circulation, its primary features depend on latitude. We show an idealized depiction of the large-scale atmospheric circulation on Earth in Fig. 32, which schematizes the main features of the circulation, both in a latitude–height plane and in the horizontal, without yet considering longitudinal variations due to continents, oceans, etc. This represents the circulation averaged over all longitudes, known as a zonal average. As can be seen, there are three large-scale convection cells in each hemisphere.

Fig. 32.


Schematic of major features of the atmospheric circulation. The average circulation is idealized as being independent of longitude and symmetric about the equator.

The figure is from https://courses.lumenlearning.com/geophysical/chapter/global-atmospheric-circulations/.

(i) The Hadley cell is a thermally driven, overturning circulation in which air tends to rise in the tropics and sink at slightly higher latitudes. Warming from the surface near the equator is transferred upward through a deep layer by convective clouds. The rising air spreads poleward and cools slowly, returns to the surface, and then moves back towards the equator. Roughly speaking, the Hadley circulation transports heat to about 30° latitude, and the average effect of transient weather disturbances transports the heat further poleward.

(ii) The Ferrel cell is also called the mid-latitude cell. In the Ferrel cell, air flows poleward and eastward near the surface and equatorward and westward at higher altitudes; this movement is the reverse of the airflow in the Hadley cell. This cell, first described by Ferrel, accounts for the westerly winds between latitudes 35° and 60° in both hemispheres.

(iii) The Polar cell is the smallest and weakest cell; it extends from between 60° and 70° north and south to the poles. Air in these cells sinks over the highest latitudes and flows out towards the lower latitudes at the surface. Note that this idealized, latitude-based picture of the atmospheric circulation is a reasonable first approximation, but land–ocean contrasts and other longitudinal variations are also important.

Climate Change Effects on Atmospheric Circulations

Climate change strongly affects the atmospheric circulation, especially the Hadley cell, which plays a pivotal role in the Earth’s climate by transporting energy and heat poleward [330], [331], [332]. An analysis of satellite observations indicates a poleward expansion of the Hadley cell by 2° latitude from 1979 to 2005 [330]. A possible mechanism for the changes in the Hadley cell extent and its relation to global warming has been reported [331]. Also, a possible mechanism for the changes in the Hadley cell’s strength and its relation to global warming was developed in Ref. [332]. Both the aforementioned observations and theories suggest a weakening and poleward expansion of the Hadley cell under global warming.

As is well known, the locations of the subtropical dry zones and the major tropical/subtropical deserts are associated with the subsiding branches of the Hadley cell [333]. Therefore, the poleward expansion of the Hadley cell may result in a drier future in some tropical/subtropical regions [331]. In addition, the poleward migration of the location of the tropical cyclone maximum intensity is also considered as a consequence of the expansion of the Hadley cell [334].

The strength of the Hadley cell can be quantified by the mass stream function Ψ, obtained from the observed zonal-mean meridional wind [335],

[\bar{V}] = \frac{g}{2\pi R\cos\phi}\,\frac{\partial \Psi}{\partial p}, \qquad (126)

where V is the meridional velocity in pressure coordinates, R is the mean radius of the Earth, and p is the pressure. The overbar and the brackets [·] stand for temporal and zonal averaging, respectively. When computing the Ψ field, one usually assumes Ψ=0 at the top of the atmosphere and integrates Eq. (126) downward to the surface. The intensity of the Hadley cell is much stronger in winter than in summer. To quantify the edges of the Hadley cell, the maximum of the absolute value of Ψ at 500 hPa is first determined (Ψ500), and the edges are then identified as the first latitude poleward of the maximum at which Ψ500 becomes zero [331]. Theories of the Hadley cell suggest that its meridional extent should scale with the height of the tropopause H [336],

\phi_{H} \sim \left( \frac{gH\Delta_{H}}{\Omega^{2}R^{2}} \right)^{1/2}, \qquad (127)

where H is computed from temperature data as the lowest pressure level at which the lapse rate decreases to 2 °C/km [337], ΔH is the tropospheric mean meridional potential temperature gradient, and Ω is the angular velocity of the Earth. From the thermodynamic equation and the mass continuity equation [335], the meridional overturning stream function Ψ of the circulation scales as

\Psi \sim \left( \frac{gH\Delta_{H}}{\Omega^{2}R^{2}} \right)^{3/2} \frac{RH\Delta_{H}}{\tau \Delta_{V}} \propto \frac{H^{5/2}\Delta_{H}^{5/2}}{\Delta_{V}}, \qquad (128)

where ΔV is the dry static stability of the tropical troposphere and τ is the characteristic overturning time scale of the circulation.
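Eq. (126) translates directly into a cumulative pressure integral of the zonal-mean meridional wind; the sketch below computes Ψ for an idealized wind profile and locates the Hadley-cell edge as the first zero crossing of Ψ500 poleward of its tropical maximum. The analytic wind profile, the grid and the constants are illustrative assumptions, not reanalysis data.

```python
import numpy as np

G, R_EARTH = 9.81, 6.371e6

def stream_function(v_zonal_mean, p_levels, lats_deg):
    """Psi(lat, p) = (2*pi*R*cos(lat)/g) * integral_0^p [V] dp'  (from Eq. (126)),
    assuming Psi = 0 at the top of the atmosphere.
    v_zonal_mean: array (n_lat, n_p); p_levels in Pa, increasing downward."""
    coslat = np.cos(np.deg2rad(lats_deg))[:, None]
    dp = np.diff(p_levels)
    v_mid = 0.5 * (v_zonal_mean[:, 1:] + v_zonal_mean[:, :-1])
    psi = np.zeros_like(v_zonal_mean)
    psi[:, 1:] = np.cumsum(v_mid * dp, axis=1)   # trapezoidal integration from the top
    return 2.0 * np.pi * R_EARTH * coslat / G * psi

# Idealized profile: one overturning cell per hemisphere, reversing near 30 deg,
# with flow aloft (centered near 200 hPa) opposite to the near-surface return flow.
lats = np.linspace(-90, 90, 181)
p = np.linspace(1e4, 1e5, 19)                    # 100 hPa ... 1000 hPa
aloft = np.sin(np.deg2rad(6 * lats))[:, None] * np.exp(-((p - 2e4) / 2e4) ** 2)
surface = -np.sin(np.deg2rad(6 * lats))[:, None] * np.exp(-((p - 9e4) / 2e4) ** 2)
psi = stream_function(aloft + surface, p, lats)

psi500 = psi[:, np.argmin(np.abs(p - 5e4))]      # Psi at ~500 hPa
i_max = np.argmax(np.abs(psi500[85:110])) + 85   # strongest cell near the equator
edge = next(i for i in range(i_max, 181) if psi500[i] * psi500[i_max] <= 0)
print("Hadley edge at ~%.1f deg latitude" % lats[edge])
```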

Although the conventional analysis of satellite observations and the climate change simulations suggest a poleward expansion of the Hadley cell, there are still two challenges. (i) The latitude–longitude structure of this expansion is not fully resolved. (ii) In contrast to models and theoretical considerations, which predict a decreasing intensity, an increasing trend of the Hadley cell intensity was found in reanalysis datasets [338]. To address these unsolved issues, Fan et al. developed an approach based on network and percolation frameworks to study the impacts of climate change on the atmospheric circulation [67]. They found an abrupt transition during the evolution of the climate network, indicating a consistent poleward expansion of the largest cluster, which corresponds to the tropical area, as well as a weakening of the link strengths in the tropics. This was found both in the reanalysis data and in the CMIP5 simulations. The underlying mechanism for the observed expansion of the tropical cluster was linked to the weakening and expansion of the Hadley cell.

In Ref. [67], the CNs were embedded into a two-dimensional lattice, where only nearest-neighbor links are considered. The strength of each link is calculated based on Eq. (15), using the month-to-month temperature differences. The links are sorted by strength and added one by one in decreasing order, i.e., the nodes that are most similar are connected first. Existing clusters grow when a new link connects one cluster to another. During this evolution the CN undergoes an abrupt and statistically significant phase transition, i.e., it exhibits a significant discontinuity in the order parameter G1, the relative size of the largest cluster. Due to the Earth’s spherical shape, the largest component of the climate network is redefined as

G_{1}(M) = \frac{\max\left[ \sum_{i\in S_{1}(M)} \cos(\phi_{i}),\, \ldots,\, \sum_{i\in S_{m}(M)} \cos(\phi_{i}),\, \ldots \right]}{\sum_{i=1}^{N} \cos(\phi_{i})}, \qquad (129)

where ϕi is the latitude of node i. The percolation threshold is determined according to Eq. (48), i.e., the step with the largest jump of G1 is regarded as the phase transition point. Fig. 33(a) shows the component structure of the CN on the global map just before the percolation threshold. At this point, the CN is characterized by three major communities: the largest one is located in the tropical region, while the second and third largest are located in the high latitudes of the southern and northern hemispheres. Fig. 33(b) depicts the relative size of the largest cluster, G1, as a function of the link occupation probability r during the evolution of the CN. It has been found that G1 exhibits an abrupt jump at the percolation threshold rc ≈ 0.53. The probability density function (PDF) of the link weights Wi,j is shown in Fig. 33(d).
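Relative to a standard largest-cluster computation, the only change in Eq. (129) is the cos(latitude) weighting; a sketch reusing the union-find idea is shown below, where component weights rather than node counts are merged. The lattice construction, link weights and threshold bookkeeping of Ref. [67] are simplified away.

```python
import numpy as np

def weighted_giant_component(n, lats_deg, edges_by_strength):
    """Track G1 of Eq. (129): the cos(latitude)-weighted size of the largest
    cluster, normalized by the total weight, as links are added one by one."""
    w = np.cos(np.deg2rad(lats_deg))
    parent = list(range(n))
    weight = list(w)

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    g1_history, g1 = [], w.max() / w.sum()
    for _, i, j in sorted(edges_by_strength, reverse=True):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri
            weight[ri] += weight[rj]
            g1 = max(g1, weight[ri] / w.sum())
        g1_history.append(g1)
    return np.array(g1_history)

# Toy network: 300 nodes at random latitudes, random link strengths.
rng = np.random.default_rng(7)
lats = rng.uniform(-90, 90, 300)
edges = [(rng.random(), i, j) for i in range(300) for j in range(i + 1, 300)
         if rng.random() < 0.05]
g1 = weighted_giant_component(300, lats, edges)
print("largest single-step jump of G1:", round(float(np.max(np.diff(g1))), 3))
```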

Fig. 33.


The network-percolation approach to the atmospheric circulation. Snapshots of the component structures of the CN. (a) Just before the percolation threshold. (b) The relative size of the giant cluster, G1, versus the fraction of the total number of links, r, for real (red), spatially shuffled (blue) and temporally shuffled (purple) records. (c) Just before the formation of the spanning cluster (at the second largest jump, at r ≈ 0.8). (d) The PDF of the link weights Wi,j around the globe, in real (solid line) and shuffled (dashed lines) data. The vertical (orange) line indicates the strength of the critical link, Wc, at the percolation threshold rc ≈ 0.53.

Reprinted figure from Ref. [67].

To determine the temporal evolution of the largest component Gc and its intensity Wc (the weight of the critical link that leads to the largest transition), a sequence of networks based on successive, non-overlapping temporal windows with lengths of 5 years each was constructed. The results suggest that the tropical cluster is expanding poleward while its intensity decreases significantly with time. The robust weakening and poleward expansion of the tropical component have been observed both in the reanalysis data and in 31 CMIP5 models. The network-percolation approach was also used to identify the climate change response, by comparing the topology of the tropical component for the first and last twenty years of the 21st century, i.e., 2080–2100 vs. 2006–2026. It was found that some regions, for example northern India, southern Africa, and western Australia, have a higher probability of being influenced by the tropical component, whereas the impacts in other regions, e.g., the Northeast Pacific, will become weaker in the future.

3.1.5. Atlantic meridional overturning circulation

The Atlantic Meridional Overturning Circulation (AMOC) is a crucial part of the climate system in the North Atlantic. It has a major impact on climate because of its meridional redistribution of heat [339], [340]. The AMOC has been identified as one of the important tipping elements of the Earth system [23]. Changes in the AMOC not only affect the North Atlantic and surrounding landmasses, but also have global impacts [156]. For example, a weakening of the AMOC leads to increased storminess in southern Europe [341], is connected to above-average sea-level rise at the US east coast [342], and is associated with increasing drought in the Sahel (in both frequency and intensity) [343]. Recently, Caesar et al. provided solid evidence for a weakening of the AMOC by about 3 ± 1 Sv (around 15%) since the mid-twentieth century [160]. This weakening is revealed by a characteristic fingerprint consisting of a pattern of cooling in the subpolar Atlantic Ocean and warming in the Gulf Stream region (see Fig. 34). The slowdown of the AMOC was found both in a high-resolution climate model (CM2.6), in response to increasing atmospheric carbon dioxide concentrations, and in the observed HadISST data since the late 19th century. An improved SST-based AMOC index (the subpolar cold patch) was developed, which is optimized in its regional and seasonal coverage to reconstruct AMOC changes. In particular, it was found that the AMOC decline since the 1950s is very likely to be largely anthropogenic. This slowdown of the AMOC is mainly caused by the gradually varying freshwater forcing in the northern North Atlantic.

Fig. 34. Slowdown of the Atlantic Meridional Overturning Circulation (AMOC). Comparison of SST trends in model and observations. Linear SST trends for the CM2.6 climate model of the Geophysical Fluid Dynamics Laboratory during a CO2-doubling experiment (left) and in a control run with fixed CO2 concentrations (middle). Right: linear SST trends for the observed HadISST data from 1870 to 2016. The data are shown for the November–May season.

Figure from Ref. [160].

The AMOC strength, defined as the zonally averaged volume transport at 26°N in the Atlantic, is now routinely monitored by the RAPID-MOCHA array [344]. Mean patterns of the meridional overturning circulation in the Atlantic are determined from the Community Earth System Model (CESM) simulation [345], [346]. The AMOC has its maximum at about 1000 m depth, around the separation latitude of the Gulf Stream, with a strength of about 20 Sv. Since a future collapse of the AMOC has been identified as a tipping element, it is crucial to develop early warning indicators for such a potential collapse. As discussed in Section 2.3, various techniques based on the concept of critical slowing down have been developed and successfully applied to detect early warning indicators in single time series [177]. In particular, early warning indicators for the tipping point of the AMOC have been discussed in [179], [181].

Recently, several CN-based warning signals for the AMOC collapse have been proposed that efficiently monitor spatial changes in the deep ocean circulation by analyzing the correlations between temperature records. Specifically, the mean degree, its kurtosis, the assortativity and the clustering increase when a tipping point is approached [69], [70]. The CN methodology has been applied to temperature time series from an idealized ocean model of the AMOC, as well as to AMOC strength time series from a coupled atmosphere–ocean GCM. In addition, the optimal measurement locations for providing early warning signals of an AMOC collapse have been determined by analyzing the performance of this indicator. Next, we provide an overview of these CN-based early warning indicators for the collapse of the AMOC.

Mheen et al. [69] used a two-dimensional (meridional-depth) idealized ocean model of the AMOC, in which tipping points can be explicitly computed. In this model, there are two active tracers: temperature T and salinity S, which are related to the density ρ by a linear equation of state [347]

$\rho = \rho_0\left[1 - \alpha_T\,(T - T_0) + \alpha_S\,(S - S_0)\right]$, (130)

where $\alpha_T$ and $\alpha_S$ are the thermal expansion and saline contraction coefficients, respectively, and $\rho_0$, $T_0$ and $S_0$ are reference quantities. The surface freshwater forcing $F_S$, which can be applied as a virtual salt flux, is

$F_S(\phi) = (\gamma + \eta)\cos\phi\,\cos\!\left(\frac{\pi\phi}{\phi_N}\right) + \beta\,\frac{F_p(\phi)}{Q}$, (131)

where $\phi$ indicates latitude, $\phi_N$ is the northern boundary of the equatorially symmetric domain, $\gamma$ is the strength of the background freshwater forcing, and $\eta$ is a white-noise term. $\beta$ is the strength of an anomalous freshwater flux, which is applied only over the latitude band [40°N, 60°N], and $Q$ is a constant used to normalize the surface-integrated salt flux to values between 0 and 1 for all parameter values. As shown in Fig. 35, the bifurcation diagram of the AMOC in the idealized ocean model indicates the existence of tipping points at $L_1$ and $L_2$. The model was discretized on a 32 × 16 spatial grid, and four values of $\beta$ close to $L_1$, labeled A, B, C and D, were considered. The output of this model is the temperature field time series.

Fig. 35. Tipping points of the AMOC in the idealized ocean model. (a) Bifurcation diagram of the sum of the maximum ($\Psi^+$) and minimum ($\Psi^-$) of the meridional overturning stream function vs. the anomalous freshwater flux strength $\beta$. (b) Pattern of the meridional overturning stream function $\Psi$ on the upper stable branch. (c) Pattern of $\Psi$ on the lower stable branch.

Reprinted figure from Ref. [69].

Next, an undirected and unweighted CN is constructed based on the Pearson correlation method (see Section 2.1) with no time lags and a threshold $C_c = 0.7$ [69], i.e., two nodes $i$ and $j$ are connected if the correlation $C_{i,j} > C_c$. Four network-based indicators, $E_d$, $E_c$, $K_d$ and $K_c$, were developed to study the dynamic evolution of the AMOC. Here $E_d$ is defined as the expectation value of the normalized degree distribution $d/d_{max}$; $E_c$ is the expectation value of the normalized clustering coefficients $c/c_{max}$; $K_d$ is the kurtosis of the degree distribution; and $K_c$ is the kurtosis of the distribution of clustering coefficients. These four network indicators are presented in Fig. 36. It was found that $E_d$ increases steeply and smoothly when the tipping point $L_1$ is approached. The explanation for this increase is the growing dominance of the first EOF in the variance of the solution over a large part of the domain. The same holds for the indicator $E_c$ (Fig. 36g). The kurtosis indicators $K_d$ and $K_c$ (Fig. 36h) show even more clearly a strong increase when the tipping point $L_1$ is approached before the collapse. Note that the CNs were constructed from the full spatial temperature field using a sliding window of 5000 years with a shift of 1000 years; the results were not sensitive to the sliding-window length. For comparison, previously suggested early warning indicators of the transition in the same AMOC time series, such as the lag-1 autocorrelation, DFA and the standard deviation (see our discussion in Section 2.3), are also plotted in Fig. 36 and tend to exhibit a rapid increase/decrease when approaching the tipping point. In particular, the lag-1 autocorrelation coefficient (Fig. 36c) reaches its maximum near the tipping point, indicating a critical slowing down of the AMOC. However, this indicator is not monotonic and also increases when the AMOC is still far from the transition. The DFA analysis shown in Fig. 36d does not provide early warnings of a transition. Fig. 36e depicts the standard deviation of the AMOC record; it shows a steady increase towards the transition, but it is difficult to set a threshold for an alarm. In Fig. 36f, the fraction of the time spent below the 0.5 percentile (estimated from the first 5000 years) of the AMOC strength is shown. This indicator displays a much sharper increase, but it is a rather ad hoc measure which depends on a calibration in the far past [69].
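A minimal sketch of how such network-based indicators can be computed from a (time × nodes) temperature data matrix is given below. The threshold value follows the description above; $E_c$ and $K_c$ would be obtained analogously from the clustering coefficients (e.g., via networkx). The synthetic "window" data are placeholders; this is an illustration, not the code of Ref. [69].

```python
import numpy as np
from scipy.stats import kurtosis

def cn_indicators(data, c_c=0.7):
    """Build an undirected, unweighted CN from Pearson correlations (no lag,
    threshold c_c) and return the indicators E_d and K_d described above.

    data : array of shape (n_times, n_nodes)"""
    C = np.corrcoef(data, rowvar=False)      # node-node correlation matrix
    np.fill_diagonal(C, 0.0)                 # no self-links
    A = (C > c_c).astype(int)                # adjacency matrix
    d = A.sum(axis=1)                        # degree of each node
    E_d = d.mean() / d.max() if d.max() > 0 else 0.0   # expectation of d / d_max
    K_d = kurtosis(d, fisher=False)          # kurtosis of the degree distribution
    return E_d, K_d

# Example on a synthetic sliding window (placeholder for the model temperature field):
rng = np.random.default_rng(1)
base = rng.standard_normal((5000, 1))                  # common mode (first-EOF-like)
coupling = np.linspace(0.0, 3.0, 512)                  # heterogeneous coupling strengths
window = coupling * base + rng.standard_normal((5000, 512))
print(cn_indicators(window))
```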

Fig. 36. Early warning signals of the AMOC. (a) Time series of the maximum value of the AMOC. (b) Transient change of $\beta$ versus time, from 0.05 to 0.17 m yr$^{-1}$ during 35,000 years. (c) The lag-1 autocorrelation of the projection of the time series onto the first EOF. (d) The DFA exponent, using linear (up) and quadratic (down) detrending. (e) Standard deviation of the AMOC. (f) Fraction of the time spent below the 0.5 percentile of the AMOC strength. (g) The network-based indicators $E_d$ (circles) and $E_c$ (squares). (h) Same as (g) but for the kurtosis indicators $K_d$ (circles) and $K_c$ (squares). The vertical dashed line indicates the value of $\beta$ at the tipping point.

Figure from Ref. [69].

The same CN-based analysis was also performed in the more realistic Fast Met Office/UK Universities Simulator (FAMOUS) model by Feng et al. [70]. The FAMOUS model is a reduced-resolution version of the Hadley Centre Coupled Model, with lower-resolution ocean and atmosphere components. The annual-mean AMOC strength data were obtained from a control simulation and from a freshwater-perturbed (referred to as “hosing”) simulation of the FAMOUS model. In the hosing simulation, the freshwater flux over the extratropical North Atlantic was increased linearly from zero to 1.0 Sv over 2000 years [70]. The complete AMOC streamfunction field for each of six 100-year equilibrium simulations was considered. It was found that, when the freshwater forcing is increased, a high degree in the network – indicating high spatial AMOC correlations – first appears at nodes in the South Atlantic at about 1000 m depth. The kurtosis $K_d$ of the CN was introduced as an effective indicator to capture the changes in the topology of the degree field. For the hosing simulation, there is a strong increase of $K_d$ to values far exceeding those of the control simulation significantly before the collapse time. By contrast, the critical-slowing-down indicators, variance and lag-1 autocorrelation, do not show any early warning signal. Moreover, the kurtosis $K_d$ can also be used to determine the optimal observation locations of the AMOC, which provide a strong anomalous signal at least 100 years before the transition.

3.2. Earth geometric surface relief

The topography or bathymetry of the Earth shows complex multifractal structures and scaling properties [34], [348], [349], [350], which can be regarded as a consequence of plate tectonic processes. Revealing the relations between the geometrical features of terrestrial surfaces and the underlying geological processes has long been a fundamental challenge in the Earth sciences. Plate tectonic theory is usually used to explain most of the major surface topographic features of the Earth [351]. It has been reported that there exist strong connections between the ocean bottom topography and the Earth's climate [352]. Moreover, the surface topography of the Earth plays a remarkable role in the dynamical evolution of the oceans, especially in regard to global climate change and sea-level rise. Sea-level rise is one of the major consequences of global warming [353], and its effects in combination with storm surges and other extreme events have already been observed [354]. Decreasing global CO2 emissions is crucial for limiting the risks of sea-level rise. It is estimated that the median sea-level rise will be between 0.7 and 1.2 m until the year 2300, even within the constraints of the Paris Agreement [355]. The world's large ice sheets in Greenland and Antarctica are considered to be the largest potential contributors to sea-level rise. If these ice sheets melted completely, the sea level would rise by about 7 m, 5 m and 53 m due to the Greenland ice sheet, the West Antarctic ice sheet, and the East Antarctic ice sheet, respectively. There is thus great interest in identifying/predicting the areas affected in response to sea-level rise.

Recently, the ETOPO1 Global Relief Model was released and provided new opportunities for a better understanding of Earth’s surface processes based on geomorphic signatures [356]. ETOPO1 is a 1 arc-minute global relief model of the Earth’s surface that integrates land topography and ocean bathymetry and was developed by NOAA. It was built from global and regional datasets and used to calculate the volumes of the world’s oceans and to derive a hypsographic curve of Earth’s surface (see Fig. 37).

Fig. 37. ETOPO1 Global Relief Model. ETOPO1 is a 1 arc-minute global relief model of the Earth's surface that integrates land topography and ocean bathymetry. The present mean sea level (zero height) is assumed as the vertical datum of the height relief.

Figure is from NOAA https://www.ngdc.noaa.gov/mgg/global/.

In this section, we review the statistical properties of the Earth's geometric surface relief using percolation theory; its fractal structure and scaling are also discussed. In particular, the present mean sea level on Earth will be shown to coincide with the critical threshold in a percolation description of the global topography; strong evidence reveals abrupt transitions that occurred during the evolution of the Earth's relief network, indicative of continental/cluster aggregation. This could help us to identify the critical nodes or locations that will be most exposed to global climate change.

3.2.1. Self-similarity and long-range correlations

Self-similarity and long-range correlations are remarkable features of the Earth's surface topography [34]. Starting about 30 years ago, new ideas in nonlinear dynamics, particularly fractals and scaling, provoked an explosive growth of research, both in modeling and in experimentally characterizing solid-earth geophysics, including the topography; for a review see Ref. [357]. The power spectrum $S$ of linear transects of the Earth's topography follows the scaling relation $S(k) \sim k^{-\beta_c}$, where $k$ stands for the wave number [34]. Such a scaling relation has been used to measure the self-similarity of the Earth's surface topography. Moreover, similar scaling relations have been identified in the Earth's sea-floor topography [358], the topology of river networks [359], as well as in natural rock surfaces [360].

The long-range correlation features of the Earth's surface topography can be described in the following way. The exponent $\beta_c$ of the power-law spectrum is related to the Hurst exponent $H$ of fractional Brownian motion (fBm) via $\beta_c = 2H + 1$, which yields $H \approx 0.5$ for the Earth's topography. Universal multifractal parameters estimated for specific parts of the surface, however, reveal a much more complex structure, i.e., $H = 0.46, 0.66, 0.77$ for bathymetry, continents and continental margins, respectively [348].

The third characteristic of the Earth's surface topography is the well-known bimodal distribution of heights, a consequence of plate tectonic processes [361]. It reflects the topographic dichotomy of continents and ocean basins.

3.2.2. Landmass and oceanic clusters

Although classical plate tectonic theory provides a framework that may explain most of the major surface topographic features of the Earth, a percolation-based description, as discussed in Section 2.2, was developed to study the geometrical features of the Earth from a statistical physics perspective [127], [130]. The analysis is based on the high-resolution ETOPO1 global relief records. The resolution is 1 arc-minute, i.e., $N = 10800 \times 21600$ grid points. The present mean sea level (zero height) is assumed as the vertical datum of the height relief $h(\phi_i, \theta_i)$, where $\phi_i$, $\theta_i$ are the latitude and longitude of grid point $i$.

In Ref. [127], the hypothetical water level was considered as the percolation control parameter. When it is decreased from the highest to the lowest available height on Earth, a geometrical percolation phase transition occurs at a critical level $h_c$, around which most parts of the landmass join together. Fig. 38 illustrates the landmass cluster aggregation when decreasing the sea level from $h = 100$ m (top) to $h = -3520$ m (bottom). To determine the percolation threshold, the probability of a node to be part of the largest island was defined as the order parameter; the mean island size $\chi$ (analogous to the susceptibility of the system) was defined as $\chi = \sum_s' s^2 n_s(h) \big/ \sum_s' s\, n_s(h)$, where $n_s(h)$ denotes the average number of islands of size $s$ at level $h$ and the prime on the sums stands for the exclusion of the largest island; the correlation length $\xi$ is defined as the average distance between nodes belonging to the same island cluster, $\xi^2 = \sum_s' 2 R_s^2 s^2 n_s(h) \big/ \sum_s' s^2 n_s(h)$, where $R_s$ is the radius of gyration of a given cluster of size $s$. It has been found that the order parameter for islands exhibits an abrupt jump around the zero height level $h = 0$, i.e., right at the present mean sea level (Fig. 39a). It was furthermore found that both quantities, $\chi$ and $\xi$, diverge at the present mean sea level, see Fig. 39b. The most remarkable observation is that the critical level $h_c$ coincides with the present mean sea level $h = 0$ on Earth. This may point to the important role of water on Earth and shed new light on tectonic plate motion [127].
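The sea-level sweep can be sketched as follows for a gridded height field, with scipy.ndimage.label identifying the island clusters at each hypothetical water level. Longitude periodicity and the latitude dependence of the cell areas, which are accounted for in Ref. [127], are ignored in this illustration, and the toy surface is a placeholder for the ETOPO1 data.

```python
import numpy as np
from scipy import ndimage

def landmass_percolation(h, levels):
    """For each hypothetical water level, compute the relative size of the largest
    island (order parameter) and the mean island size chi (susceptibility,
    largest cluster excluded)."""
    N = h.size
    G1, chi = [], []
    for lev in levels:
        labels, n_clusters = ndimage.label(h > lev)       # 4-neighbour land clusters
        if n_clusters == 0:
            G1.append(0.0)
            chi.append(0.0)
            continue
        sizes = np.sort(np.bincount(labels.ravel())[1:])  # cluster sizes (background removed)
        G1.append(sizes[-1] / N)
        rest = sizes[:-1]                                 # exclude the largest island
        chi.append((rest ** 2).sum() / rest.sum() if rest.sum() > 0 else 0.0)
    return np.array(G1), np.array(chi)

# Toy usage on a random smooth surface (stand-in for the ETOPO1 height field):
rng = np.random.default_rng(2)
h = ndimage.gaussian_filter(rng.standard_normal((200, 400)), sigma=5)
levels = np.linspace(h.max(), h.min(), 100)               # decreasing water level
G1, chi = landmass_percolation(h, levels)
print("level with the largest chi:", levels[int(np.argmax(chi))])
```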

Fig. 38. Schematic illustration of the landmass cluster aggregation when decreasing the sea level. A node is occupied if its height is above the given threshold $h$. Different colors stand for different clusters.

Figure from Ref. [127].

Fig. 39. Percolation analysis of the Earth's topography. (a) Relative surface area of the largest island (circles) and of the largest sea (triangles) as a function of the sea level. The landmass critical level is close to the current sea level $h = 0$, whereas the oceanic critical level is close to the level $h = -3287$ m, at which the total island and oceanic surface areas are equal. (b) The correlation length $\xi$ and the mean island size $\chi$ as a function of the sea level. The dashed line marks a geometrical percolation phase transition at the present mean sea level.

Reprinted figure from Ref. [127].

The same percolation analysis was also performed for the oceanic clusters, i.e., the hypothetical water level was increased from the lowest to the highest height. Fig. 39 shows a discontinuous jump in the oceanic order parameter at around $-3640$ m. A further investigation of the lunar topography reveals various characteristic features of the Moon. It was found that, at the critical level for the Moon, the amounts of 'land' and 'ocean' areas are equal, indicating a purely geometrical phase transition [127].

3.2.3. Origin of the discontinuity

It has been reported that a random network or lattice system always undergoes a continuous percolation phase transition and shows standard scaling features [81]. Yet, the order of the percolation transition of the Earth's topography is still an open question. To address it, Fan et al. developed a more refined percolation-based method [130]. The percolation model is defined as follows: starting from an unoccupied lattice, the sites are occupied one by one according to their rank, i.e., first the site with the largest height, then the second largest, etc. At each step, the fraction of occupied sites $p$ increases by the inverse of the total number of sites $N$ of the Earth's relief landscape. By this procedure, a configuration of occupied sites is obtained at every $p$. Here all the grid points of the ETOPO1 Global Relief Model were ranked according to their height $h(\phi_i, \theta_i)$, from the largest to the smallest value.

In the following, the size $s$ of the percolation landmass (oceanic) clusters was calculated based on Eq. (129), owing to the Earth's spherical geometry. The size of the $i$th gap, $g_i$, at each fraction $p$ is defined as

$g_i(p) \equiv s(p) - s\!\left(p - \frac{1}{N}\right)$. (132)

Specifically, $g_1$ denotes the largest gap, $g_2$ the second largest gap, etc. The larger the gap $g_i$, the larger are the two clusters before merging. The dynamical evolution of the largest landmass cluster $s$ as a function of the fraction of occupied nodes $p$ is shown in Fig. 40a. It was found that the Earth's relief network undergoes several abrupt and statistically significant phase transitions, i.e., it exhibits discontinuities in the order parameter $s$. Fig. 40b shows the landmass cluster structure of the network at the percolation threshold (just before the largest gap $g_1$). The results reveal that the network, just before this jump, is characterized by four major communities: the largest one is the Afro-Eurasian continental landmass, the second largest cluster is the Americas, the third is located in Antarctica, and the fourth is Oceania. Notably, there exists a critical node (64.458333°N, 171.141667°W) that connects the largest and second largest clusters at the percolation threshold $p_c \approx 0.321$, with an altitude of $h = -43$ m, i.e., below the current sea level.
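A minimal union-find sketch of this rank-ordered site percolation, recording the largest-cluster size $s(p)$ and the gaps $g_i$ of Eq. (132), is shown below. The spherical node weights of Eq. (129) and the longitudinal periodicity of the real grid are ignored here for brevity; this is an illustration of the procedure, not the implementation of Ref. [130].

```python
import numpy as np

def rank_percolation(h):
    """Occupy the sites of the height field h one by one, from the highest to the
    lowest, and record the relative size of the largest cluster s(p) together with
    the gaps g_i(p) = s(p) - s(p - 1/N) of Eq. (132)."""
    ny, nx = h.shape
    N = ny * nx
    order = np.argsort(h.ravel())[::-1]          # sites ranked by decreasing height
    parent = np.full(N, -1, dtype=np.int64)      # -1 marks an unoccupied site
    size = np.ones(N, dtype=np.int64)

    def find(i):                                 # union-find with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    s_max = 0
    s_curve, gaps = [], []
    for site in order:
        parent[site] = site                      # occupy the site
        y, x = divmod(site, nx)
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            yy, xx = y + dy, x + dx
            if 0 <= yy < ny and 0 <= xx < nx and parent[yy * nx + xx] >= 0:
                ra, rb = find(site), find(yy * nx + xx)
                if ra != rb:                     # merge the two clusters
                    if size[ra] < size[rb]:
                        ra, rb = rb, ra
                    parent[rb] = ra
                    size[ra] += size[rb]
        s_new = max(s_max, size[find(site)])
        gaps.append((s_new - s_max) / N)         # gap of the order parameter
        s_max = s_new
        s_curve.append(s_max / N)
    p = np.arange(1, N + 1) / N
    return p, np.array(s_curve), np.array(gaps)

# The percolation threshold is the fraction p just before the largest gap:
# p_c = p[np.argmax(gaps)] - 1 / h.size
```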

Fig. 40. Percolation analysis of the Earth's topography. (a) The relative size $s$ of the largest landmass cluster as a function of the fraction $p$ of the total landmass, for real (green) and shuffled (red) Earth relief records. $g_1$–$g_5$ indicate the five largest gaps, defined in Eq. (132). (b) Snapshot of the landmass clusters of the Earth's surface topography network at the percolation threshold, $p = 0.321$. The star indicates the critical node.

Reprinted figure from Ref. [130].

The same analysis has also been applied to the oceanic clusters, i.e., the nodes are occupied one by one according to their height in increasing order. This also gives rise to a discontinuous jump in the oceanic order parameter, at the percolation threshold $p_c \approx 0.379$ with an altitude of $h = -3621$ m. This is very close to the value $h = -3640$ m reported in Ref. [127]. Note that the corresponding critical node, (59.908333°S, 161.308333°E), connects the Atlantic+Indian Ocean Plate to the Pacific Plate. It is worth noting that the role of the largest cluster in the landmass structure is different from that in the oceanic one. These differences reveal the complex and distinct structures of the Earth's topography (continents) and bathymetry (oceans). This dichotomy is manifested in the well-known bimodal distribution of the Earth's topography [361].

To determine the order of the percolation phase transition in the Earth's relief network, a finite-size analysis was performed by altering the resolution of the nodes. The largest gap $g_1(L)$ was calculated for a given system size $L$, where $L$ is defined as the number of nodes in the zonal direction. According to Eq. (49), if $g_1(L) \to 0$ as $L \to \infty$, the corresponding giant cluster is assumed to undergo a continuous percolation transition; otherwise, the percolation is assumed to be discontinuous [147], [153]. In addition, the size of the order parameter at the percolation threshold [just before the largest jump $g_1$], $s_c(L)$, was also studied [see Eq. (51)]. The results are shown in Fig. 41a and b and indicate a discontinuous percolation transition, since (1) $g_1(L)$ tends to a non-zero constant, and (2) $d_f - d = 0$, i.e., the fractal dimension of the largest cluster at the threshold equals the spatial dimension. For comparison, a randomized topography obtained by shuffling (spatially randomizing) the original height profile was also considered (see Fig. 41a and b). The numerical results for the shuffled case suggest a continuous percolation transition with the known critical exponents $\beta/\nu = 5/48 \approx 0.104$ and $d_f = 91/48$ [106], [107], as expected when the long-range correlations are destroyed.

Fig. 41. Finite-size analysis of the percolation in the Earth's relief network. (a) Log–log plot of the largest gap $g_1$ vs. the network system size $L$, for the original real data (red) and shuffled data (blue). (b) Log–log plot of the relative size $s_c$ of the largest landmass cluster at the percolation threshold vs. $L$, for real data (red) and shuffled data (blue). (c) The average of the largest jump, $g_1(H, L)$, as a function of the system size $L$; (d) the corresponding percolation threshold $p_c(H, L)$ as a function of $L$, for percolation on 2D fractional Brownian motion surfaces with Hurst exponent $H = 0.66$.

Reprinted figure from Ref. [130].

In order to reveal the origin of the discontinuity, Fan et al. [130] further studied percolation on 2D fBm surfaces with Hurst exponent $H$ [362], [363]. The parameter $H$ usually lies between 0 and 1, where 0 corresponds to very noisy and 1 to smoother surfaces. The Earth's rough surface can be modeled via fBm [364], and the estimate for the topography of the Earth's continents is $H = 0.66$ [348]. The percolation analysis was therefore performed on fBm surfaces with $H = 0.66$. Similar to the real network, the largest cluster $s$ also exhibits abrupt transitions around $p \approx 0.3$. The average largest gap $g_1(H, L)$ as a function of the system size $L$ is shown in Fig. 41c. It was found that percolation on fBm surfaces with $H = 0.66$ is discontinuous, since $g_1(H, L)$ tends to a non-zero constant as $L \to \infty$. The percolation threshold $p_c(H, L)$ corresponding to the largest gap during the evolution of the site percolation is shown in Fig. 41d. The values are robust and agree well with the real data, where $p_c \approx 0.321$ [see Fig. 40]. This indicates that the combination of percolation and fBm methods can be used as an efficient tool to investigate the Earth's surface topography.
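A 2D fBm-like surface with a prescribed Hurst exponent can be generated, approximately, by Fourier filtering (random phases with a power-law amplitude spectrum), after which the rank-ordered percolation sketch above can be applied directly. This simple spectral synthesis is only one of several possible fBm generators and is meant as an illustration, not as the generator used in Refs. [362], [363].

```python
import numpy as np

def fbm_surface(n, H=0.66, seed=0):
    """Approximate 2D fractional Brownian surface of size n x n via Fourier
    filtering: amplitudes ~ |k|^-(H+1), i.e. power spectrum ~ |k|^-(2H+2)."""
    rng = np.random.default_rng(seed)
    kx = np.fft.fftfreq(n)
    ky = np.fft.fftfreq(n)
    k = np.sqrt(kx[None, :] ** 2 + ky[:, None] ** 2)
    k[0, 0] = np.inf                              # suppress the zero mode
    amplitude = k ** (-(H + 1.0))
    phases = np.exp(2j * np.pi * rng.random((n, n)))
    surface = np.real(np.fft.ifft2(amplitude * phases))
    return (surface - surface.mean()) / surface.std()

# Example: percolation threshold of a correlated surface with H = 0.66
# (using rank_percolation from the previous sketch):
# p, s, gaps = rank_percolation(fbm_surface(512, H=0.66))
# print("p_c ~", p[int(np.argmax(gaps))])
```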

3.3. Earthquake systems

Earthquakes are among the most devastating natural disasters that plague society. Thus, reliable and skillful earthquake forecasts over different time scales (from days to decades) are essential for establishing rational seismic risk reduction strategies and for enhancing preparedness and resilience. The earthquake process is a complex spatio-temporal phenomenon and has been regarded as an example of a self-organized critical system, in which events occur as cascades over a wide range of sizes, each determined by fine details of the rupture process. Despite rapid progress in the latter part of the 20th century, the study of earthquakes, like the science of many other complex natural systems, is still in its juvenile stages of exploration and discovery. The scientific challenge of understanding earthquake phenomena is both profound and practical, in particular regarding the deterministic prediction of specific event sizes, locations, and times. Recently, probabilistic forecasting, based on statistical patterns of earthquake occurrence, has become a much more realistic goal and has been actively explored and tested in global initiatives [36].

In this section, we first present a description of the phenomenological laws of earthquake occurrence; scale invariance and the inter-event time distribution are also briefly discussed. We then review how statistical mechanics approaches (memory analysis) can be applied to improve the forecasting skill for real earthquake catalogs.

3.3.1. Scale invariance and the inter-event distribution

First, we introduce some fundamental quantities that characterize earthquake occurrence. This information, such as occurrence time, hypocentral location, earthquake magnitude and depth, is usually reported in seismic catalogs available online for different geographic areas, for instance, the United States Geological Survey (USGS) earthquake catalog.12 We present here the main phenomenological laws for the statistical distribution of these quantities.

The most common way to describe the size of an earthquake is the Richter magnitude scale $m$, which takes the logarithm of the ground displacement as measured by a seismograph and applies a correction that varies with the distance from the earthquake to the seismograph [365]. The characteristic size of the fractured area, $L$, as well as the faulting duration, $\tau$, can both be expressed in terms of the magnitude as $L \sim \tau \sim 10^{0.5 m}$.

Next, we give some well-established empirical laws of earthquakes. In seismology, the Gutenberg–Richter (GR) law expresses the relationship between the magnitude $m$ and the cumulative number $N$ of earthquakes in a given region and time period with magnitude larger than or equal to $m$, which exhibits an exponential decay [366]

$\log_{10} N = a - b\, m$, (133)

where $a$ and $b$ are fitting parameters. As an example, we show the Gutenberg–Richter law for the Israeli earthquake catalog in Fig. 42. The $b$-value of the Gutenberg–Richter law has been calculated using the approach described by Marzocchi and Sandri [367] and was found to be $0.97 \pm 0.02$. The intercept of the fitted line defines the $a$-value, which quantifies the earthquake rate in the selected region; in this case $a = 5.36$. As expected, different catalogs exhibit different seismicity levels, quantified by the magnitude range and by the total number of events, corresponding to different values of the constant $a$. Conversely, the parameter $b$ appears to be quite independent of the geographical region and of the temporal interval. Typical experimental results yield $b \approx 1$ for tectonic earthquakes.
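For illustration, the maximum-likelihood (Aki-type) estimate of the $b$-value, with a half-bin correction in the spirit of the approach of Marzocchi and Sandri [367] and the commonly used Shi–Bolt standard error, can be sketched as follows; the magnitude array, completeness magnitude and binning are placeholders, not values from a specific catalog.

```python
import numpy as np

def gr_parameters(mags, m_c, dm=0.1):
    """Maximum-likelihood (Aki) estimate of the Gutenberg-Richter b-value for
    events with magnitude >= m_c, with a half-bin correction for the magnitude
    binning dm, plus the a-value implied by Eq. (133)."""
    m = np.asarray(mags, dtype=float)
    m = m[m >= m_c]
    b = np.log10(np.e) / (m.mean() - (m_c - dm / 2.0))
    # commonly used (Shi-Bolt) standard error of the b-value:
    sigma_b = 2.3 * b ** 2 * np.sqrt(((m - m.mean()) ** 2).sum() / (len(m) * (len(m) - 1)))
    a = np.log10(len(m)) + b * m_c          # from log10 N(m_c) = a - b m_c
    return a, b, sigma_b

# Toy usage with synthetic GR-distributed magnitudes (true b = 1):
rng = np.random.default_rng(3)
mags = 2.0 + rng.exponential(1.0 / np.log(10), size=20000)
print(gr_parameters(mags, m_c=2.0, dm=0.0))
```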

Fig. 42. Gutenberg–Richter law for the Israeli earthquake catalog. (a) Map of the events present in the selected area. (b) Cumulative number of events as a function of magnitude. The orange line is the fit of the distribution based on Eq. (133), with $a = 5.36$ and $b = 0.97$.

The second empirical law, the Omori law, was first described by the seismologist Omori in 1894 [368],

$n(t) = \frac{K}{(c + t)^{p}}$, (134)

where $n(t)$ is the aftershock rate and $t$ is the time elapsed since the mainshock occurrence; $K$ and $c$ are empirical constants controlling, respectively, the total number of aftershocks and the onset of the power-law decay. The third constant, $p$, modifies the decay rate and typically falls in the range 0.7–1.5.

The magnitude of an event understandably influences the number of aftershocks it triggers, i.e., the larger the mainshock magnitude, the larger the total number of triggered aftershocks. This property is known as the productivity law, which states that the number of aftershocks $n_A$ [the integral of Eq. (134)] belonging to a sequence increases exponentially with the mainshock magnitude $m_M$,

$n_A \sim 10^{\alpha_M m_M}$. (135)

The parameter $\alpha_M$ is typically small for swarm-type activity and large for clear primary aftershock sequences [369].

The distribution of waiting times between seismic events has attracted much attention and discussion over the last decades, since it was considered to have a universal scaling form [370]. Bak et al. proposed, for the first time, a unified scaling law for the waiting times between earthquakes, expressing a hierarchical organization in time, space and magnitude. They considered the California earthquake catalog and covered the region with a grid of cells of size $L \times L$. They then analyzed the inter-event times within each cell and calculated the histogram $P_{S,L}(T)$ of the inter-event times $T$ between earthquakes whose magnitudes are greater than $m = \log_{10} S$; this was repeated for different values of $L$ and different minimum magnitudes (see Fig. 43a). It was found that if the inter-event times $T$ are rescaled by $S^{-b} L^{d_f}$ (where $b$ is the Gutenberg–Richter $b$-value and $d_f$ is the spatial fractal dimension), and $P_{S,L}$ is rescaled by $T^{\alpha}$, where $\alpha$ is some exponent, the distributions of waiting times all appear to collapse onto a single curve (Fig. 43b). The scaling hypothesis was thus expressed as

$T^{\alpha} P_{S,L}(T) = f\!\left(T S^{-b} L^{d_f}\right)$, (136)

where $\alpha \approx 1$ can be identified with the Omori-law exponent for aftershocks, $b \approx 1$ is the $b$-value of the Gutenberg–Richter law, and $d_f \approx 1.2$ is the 2D fractal dimension of the locations of the epicenters projected onto the surface of the Earth [370].

Fig. 43. Unified scaling law for earthquakes. (a) The distribution $P_{S,L}(T)$ of interoccurrence times $T$ between earthquakes with magnitude $m$ greater than $\log_{10} S$ within an area of linear size $L$, for the California catalog. (b) The scaling function, see Eq. (136), for the data in (a) with $T > 38$ s, rescaled as $T^{\alpha} P_{S,L}(T)$ vs. the variable $x = c\, T S^{-b} L^{d_f}$, with $c = 10^{-4}$. The data collapse implies a unified law for earthquakes. The solid circles, squares, and triangles correspond to magnitude cutoffs $M_c = 2$, 3, and 4, respectively. The color coding represents the linear size $L = 0.25°$ (black), 0.5° (red), 1° (green), 2° (blue), and 4° (orange) of the cells.

Reprinted figure from Ref. [370].

Subsequently, Corral [371] discovered that the probability density function $D_{xy}(\tau)$ of the inter-occurrence time $\tau$ can be cast into the scaling form

$D_{xy}(\tau) = R_{xy}\, f(R_{xy}\tau)$, (137)

where $R_{xy}$ stands for the mean seismic rate of the $(x, y)$ region. The scaling function $f$ can be expressed by a generalized gamma distribution,

$f(\theta) = C\, \frac{1}{\theta^{1 - \gamma}}\, \exp\!\left(-\frac{\theta^{\delta}}{B}\right)$, (138)

with the parameters $\gamma = 0.67 \pm 0.05$, $\delta = 0.98 \pm 0.05$, $B = 1.58 \pm 0.15$, and $C = 0.50 \pm 0.10$. Corral verified the above scaling hypothesis by considering data from the global worldwide catalog of the National Earthquake Information Center (NEIC)13 and several local catalogs: the Southern California Earthquake Center (SCEC),14 the Japan University Network Earthquake Catalog (JUNEC),15 the Bulletins of the IGN (the Iberian Peninsula and the North of Africa),16 and the BGS catalog (the British Isles and the North Sea).17 The inter-event time distributions for the different geographic areas and for different values of the lower magnitude cutoff can be collapsed onto a universal scaling function, as shown in Fig. 44.

Fig. 44. Recurrence-time distributions without and with rescaling. (a) Probability densities from the NEIC worldwide catalog. (b) The same data after rescaling, together with a fit of the scaling function $f$, see Eq. (138).

Reprinted figure from Ref. [371].

Bak et al. and Corral are the two main proponents of the idea of a universal scaling law. The validity of the scaling relation Eq. (138) has attracted much interest. For example, Saichev and Sornette [372] carried out an extensive analysis of the distribution of inter-event times based on the Epidemic-Type Aftershock Sequence (ETAS) model (see more details in the following section) and found that it is only approximately universal and of a gamma form, assuming that $p$ in Eq. (134) has a value close to 1 and that the branching ratio of seismicity, the average number of events each event triggers, is around 0.7–1. Molchan [373] then published a rigorous mathematical study of the earthquake inter-event times and proved that if the scaling distribution were universal, its functional form would have to be exponential. Touati et al. analyzed the variation of the distribution for both real data (the SCEC catalog) and ETAS simulation data, and concluded that it is not universal [374]. They showed that the distribution is a bimodal mixture, dominated by correlated aftershocks at short waiting times and by independent events at longer times.

3.3.2. Modeling seismic time series by the point process approach

There are various statistical earthquake models describing the occurrence process of earthquakes that can be used for real-time earthquake forecasts [375]. The principle of these models is to evaluate the probabilities of earthquake occurrence by using a point-process approach. Among the different models, the ETAS model, which describes the clustering of mainshocks, foreshocks and aftershocks, has become a standard model for testing hypotheses and a starting point for short-term earthquake forecasts [376], [377], [378], [379].

Here we will focus on the ETAS model, which is used to generate synthetic earthquake catalogs. The ETAS model was developed by Ogata [369], [380], [381] who observed that seismic activity can be well described by the Gutenberg–Richter law Eq. (133), and the Omori law Eq. (134). The conditional rate in the ETAS model is given by,

$\lambda(t|H_t) = \mu + A \sum_{i:\, t_i < t} \exp\left[\alpha_M (m_i - M_z)\right] \left(1 + \frac{t - t_i}{c}\right)^{-p}$, (139)

where $t_i$ are the times of the past events and $m_i$ their magnitudes; $H_t = \{(t_i, m_i);\, t_i < t\}$ is the occurrence history. $\lambda(t|H_t)$ provides the probability of having an earthquake above a threshold magnitude $M_z$ at time $t$, given the earthquake history. Here, $A = K c^{-p}$ is the occurrence rate of earthquakes in the Omori law at zero lag [374], and $\alpha_M$ is the productivity parameter defined in Eq. (135). The ETAS model can also be written in terms of a stochastic integral [382],

$\lambda(t|H_t) = \mu + \int_{m_1}^{\infty}\!\!\int_{0}^{t} \frac{K}{(t - s + c)^{p}}\, e^{\alpha_M (m - M_z)}\, N(ds, dm)$, (140)

where $N(ds, dm) = 1$ if the infinitesimal element $(ds, dm)$ contains an event $(t_i, m_i)$ for some $i$, and $N(ds, dm) = 0$ otherwise. The five parameters $(\mu, A, c, \alpha_M, p)$ represent characteristics of the seismic activity of the region under study. Therefore, they vary spatially, and in some cases also temporally.

The ETAS model was later extended to a space–time version with the conditional intensity function [383],

$\lambda(t, x, y | H_t) = \mu(x, y) + \sum_{i:\, t_i < t} \kappa(m_i)\, g(t - t_i)\, f(x - x_i, y - y_i | m_i)$, (141)

where $\mu(x, y)$ is still the background intensity, now a function of space but not of time; $\kappa(m)$ is the expected number of events triggered by an event of magnitude $m$, given by

$\kappa(m) = A \exp\left[\alpha (m - M_z)\right]$, (142)

g(t) is the probability density function of the occurrence times of the triggered events,

$g(t) = \frac{p - 1}{c}\left(1 + \frac{t}{c}\right)^{-p}$. (143)

f(x,y|m) is the spatial distribution of the triggered events, which can be expressed in the following two ways: a short-range Gaussian decay,

$f(x, y | m) = \frac{1}{2\pi D^2 e^{\alpha_M (m - M_z)}} \exp\!\left(-\frac{x^2 + y^2}{2 D^2 e^{\alpha_M (m - M_z)}}\right)$, (144)

or a long-range inverse power decay,

$f(x, y | m) = \frac{(q - 1)\, D^{2(q - 1)}\, e^{\alpha_M (q - 1)(m - M_z)}}{\pi \left[x^2 + y^2 + D^2 e^{\alpha_M (m - M_z)}\right]^{q}}, \quad q > 1$. (145)

The ETAS model has been successfully used for operational earthquake forecasting, e.g., for the complex Amatrice–Norcia seismic sequence [384], and for next-day earthquake forecasts for the Japan region [385] and the California region [386]. More general discussions of modeling seismic time series by the point-process approach are given in Ref. [387].
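The temporal ETAS model of Eq. (139) can be simulated with a thinning (Ogata-type) scheme, exploiting the fact that the conditional intensity is non-increasing between events, so its value just after the current time bounds it until the next event. The parameter values below are illustrative (in the range discussed later for the Italian catalog) and the sketch is not an operational forecasting tool.

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative parameters of Eq. (139); time unit: days (placeholders, not a fit).
mu, A, alpha_M, c, p = 0.2, 6.26, 1.5, 0.007, 1.1
b, M_z, T_end = 1.0, 2.0, 365.0

def etas_rate(t, times, mags):
    """Conditional intensity lambda(t | H_t) of the temporal ETAS model, Eq. (139)."""
    if not times:
        return mu
    dt = t - np.asarray(times)
    return mu + A * np.sum(np.exp(alpha_M * (np.asarray(mags) - M_z))
                           * (1.0 + dt / c) ** (-p))

times, mags = [], []
t = 0.0
while t < T_end:
    lam_bound = etas_rate(t + 1e-9, times, mags)    # upper bound until the next event
    t += rng.exponential(1.0 / lam_bound)           # candidate event time
    if t >= T_end:
        break
    if rng.random() * lam_bound < etas_rate(t, times, mags):   # thinning acceptance
        times.append(t)
        mags.append(M_z + rng.exponential(1.0 / (b * np.log(10))))  # GR-distributed magnitude
print(f"simulated {len(times)} events with m >= {M_z}")
```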

3.3.3. Memory analysis

In this subsection, we review the memory analysis that has been applied to seismic records. The first method is the so-called detrended fluctuation analysis (DFA), which is used to detect long-range correlations, or long-term memory, in diverse time series, such as DNA sequences and climate records [180]. Details of the DFA are presented in Section 2.3.3. According to Eq. (70), if the time series has long-term memory, the fluctuation function $F(n)$ increases as a power law with an exponent $\alpha$. DFA was applied to earthquake interoccurrence times for the first time by Lennartz et al. [388]. They tested real and synthetic records for long-term correlations by employing the first two orders of detrended fluctuation analysis, DFA0 and DFA1, on the SCEC catalog (1995–1998) and the NCSN catalog (1995–1998). They found that there exists long-term memory between seismic events in the Californian catalogs, which shows up in characteristic fluctuations of both the magnitudes and the interoccurrence times. Both DFA0 and DFA1 yield a DFA exponent $\alpha \approx 0.75$, independently of the threshold $M_z$ and of the catalog, see Fig. 45.
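For completeness, a compact DFA implementation (here of order 2, i.e., DFA2) applied to an inter-occurrence time series might look as follows; the window sizes are a free choice and the synthetic series is a placeholder, so this is only an illustrative sketch of the method.

```python
import numpy as np

def dfa(x, scales, order=2):
    """Detrended fluctuation analysis: returns F(n) for each window size n."""
    y = np.cumsum(np.asarray(x, dtype=float) - np.mean(x))   # the profile
    F = []
    for n in scales:
        n_seg = len(y) // n
        residuals = []
        t = np.arange(n)
        for seg in range(n_seg):
            window = y[seg * n:(seg + 1) * n]
            trend = np.polyval(np.polyfit(t, window, order), t)  # polynomial detrending
            residuals.append(np.mean((window - trend) ** 2))
        F.append(np.sqrt(np.mean(residuals)))
    return np.array(F)

# Toy usage: uncorrelated waiting times should give alpha ~ 0.5.
rng = np.random.default_rng(4)
tau = rng.exponential(1.0, size=20000)
scales = np.unique(np.logspace(1.2, 3.5, 20).astype(int))
F = dfa(tau, scales, order=2)
alpha = np.polyfit(np.log(scales), np.log(F), 1)[0]
print(f"DFA2 exponent alpha = {alpha:.2f}")
```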

Fig. 45. Detrended fluctuation analysis of the interoccurrence times from the Californian catalogs. (a) For all earthquakes with magnitude $M \geq 2$, and (b) $M \geq 3$. Black circles indicate the SCEC catalog (1995–1998), red squares the NCSN catalog (1995–1998), and blue triangles an artificial Gutenberg–Richter distributed correlated data set with $\alpha = 0.8$. The open symbols show the DFA0 results and the filled symbols the DFA1 results. The corresponding slopes are represented by straight lines with $\alpha \approx 0.75$.

Figure from Ref. [388].

A similar analysis, DFA2, has been applied to the real Israeli and Italian catalogs by Fan et al. [212]. They found that the exponent $\alpha$ is also very close to 0.75 for the inter-occurrence time series between earthquake events, for different magnitude thresholds $M_z$. Figs. 46a–d show the DFA2 results. They also performed DFA on other seismic variables, such as the number of earthquake events and the released energy within a coarse time window $dt$. The released energy was defined as $S(t) = \sum_{l=1}^{E(t)} 10^{\frac{3}{2} M_l(t)}$, where $E(t)$ denotes the number of events that occurred between $t$ and $t + dt$, and $M_l(t)$ denotes the magnitude of the $l$th event. To deal with more homogeneous time series, Fan et al. switched to $s(t) = \log(S(t))$ for $S(t) > 1$ and zero for $S(t) \leq 1$, and $e(t) = \log(E(t))$ for $E(t) > 1$ and zero for $E(t) \leq 1$ [212]. The DFA analysis was then applied to the time series $s(t)$ and $e(t)$ of the Israeli and Italian catalogs, for different time windows $dt$. It was found that, for both regions and for all studied magnitudes, the scaling exponent is quite robust, $\alpha \approx 0.75$, i.e., the size of $dt$ does not affect the memory exponent $\alpha$. Thus, the return intervals, the number of events and the released energy are significantly long-term correlated and have very similar scaling exponents. The memory analysis has also been applied to the ETAS model. The DFA2 of the inter-occurrence times, with the parameters $A = 6.26$, $\mu = 0.2$, $p = 1.1$, $\alpha_M = 1.5$ and $c = 0.007$, see Eq. (139), yields $\alpha \approx 0.75$ and is shown in Fig. 46e.

Fig. 46. Interoccurrence time series and DFA from the Israeli, Italian and ETAS catalogs. Interoccurrence time series between earthquake events from the (a) Israeli and (b) Italian catalogs, using a magnitude threshold $M_z$. DFA of the interoccurrence times from the (c) Israeli and (d) Italian catalogs with different $M_z$ values. (e) DFA of the interoccurrence times from the ETAS model, with parameters $A = 6.26$, $\mu = 0.2$, $p = 1.1$, $\alpha_M = 1.5$ and $c = 0.007$, see Eq. (139). The solid line is the best-fit line with slope $\alpha = 0.75$, R-square > 0.99. For comparison, the shuffled data, with slope $\alpha = 0.5$ indicating no memory, are also shown in (e).

Reprinted figure from Ref. [212].

Another common method for studying memory in the occurrence of earthquakes is the conditional probability (CP) approach, introduced by Livina et al. [389]. Considering a time series of recurrence intervals, they sorted it in ascending order and divided it into four 25% quantiles; the first quantile, $Q_1$, represents the shortest 25% of the waiting times, etc. The distribution $P(\tau|\tau_0)$ of recurrence times $\tau$ that follow a prior recurrence time $\tau_0$ was then studied, where $\tau_0$ belongs to one of the two extreme quantiles, $Q_1$ or $Q_4$. Note that for records without memory, $P(\tau|\tau_0)$ should be independent of $\tau_0$ and identical to $P(\tau)$. For real data, Livina et al. found that $P(\tau|\tau_0)$ depends strongly on the previous recurrence time $\tau_0$, such that short recurrence times tend to follow short ones, and long recurrence times follow long ones (see Figs. 47a, c for the Israeli and Italian catalogs). To quantify the level of memory, Fan et al. analyzed the cumulative distribution function (CDF) of the recurrence times and denoted the CDFs for $Q$, $Q_1$ and $Q_4$ as $C_Q(\tau)$, $C_{Q_1}(\tau)$ and $C_{Q_4}(\tau)$, respectively. The strength of the memory for $Q_1$ is defined as

$\rho_1 = \frac{\int \left[C_{Q_1}(\tau) - C_{Q}(\tau)\right] d\tau}{\int d\tau}$, (146)

and similarly, the level of memory for Q4 is

$\rho_4 = \frac{\int \left[C_{Q_4}(\tau) - C_{Q}(\tau)\right] d\tau}{\int d\tau}$. (147)

Thus, $0 \leq \rho_1 \leq 1$ and $-1 \leq \rho_4 \leq 0$; a higher $|\rho_1|$ (or $|\rho_4|$) implies stronger memory, and $\rho_1 = 0$ (or $\rho_4 = 0$) implies no memory [212]. The results are shown in Fig. 47b and d for the Israeli and Italian earthquake catalogs. Fan et al. found $\rho_1 = 0.248$, $\rho_4 = -0.193$ for $Q_1$ and $Q_4$ of the Israeli catalog, and $\rho_1 = 0.260$, $\rho_4 = -0.153$ for $Q_1$ and $Q_4$ of the Italian catalog [212]. A further study suggests that the values of $\rho$ are robust and do not depend on $M_z$. To study the dependence of the correlations ($\alpha$, $\rho_1$, and $\rho_4$) on the geographical location (tectonic setting), the DFA and CP analyses were performed for other earthquake catalogs, including the New Zealand, the Southern California Earthquake Center, the Japan Unified hIgh-resolution relocated Catalog for Earthquakes, and the Preliminary Determination of Epicenters global earthquake catalogs. It has been found that for all catalogs and for all studied magnitudes the values of $\alpha$, $\rho_1$, and $\rho_4$ are quite robust.
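The quartile-based CP measures can be sketched as below. Note that the normalization of Eqs. (146)–(147) is reconstructed here as the mean difference between the conditional and unconditional CDFs over the range of recurrence times; the exact normalization used in Ref. [212] may differ, so this is only an illustration of the idea with a synthetic placeholder series.

```python
import numpy as np

def cp_memory(tau, n_grid=500):
    """Quartile-based conditional-probability memory measures rho_1 and rho_4:
    compare the CDF of recurrence times that follow a short (Q1) or long (Q4)
    previous recurrence time with the unconditional CDF (cf. Eqs. (146)-(147))."""
    tau = np.asarray(tau, dtype=float)
    tau0, tau1 = tau[:-1], tau[1:]                  # (previous, following) pairs
    q25, q75 = np.quantile(tau0, [0.25, 0.75])
    after_q1 = tau1[tau0 <= q25]                    # recurrence times after short tau0
    after_q4 = tau1[tau0 >= q75]                    # recurrence times after long tau0
    grid = np.linspace(tau.min(), tau.max(), n_grid)

    def cdf(sample):
        return np.searchsorted(np.sort(sample), grid, side="right") / len(sample)

    C_Q, C_Q1, C_Q4 = cdf(tau1), cdf(after_q1), cdf(after_q4)
    rho1 = np.mean(C_Q1 - C_Q)      # > 0: short tends to follow short (memory)
    rho4 = np.mean(C_Q4 - C_Q)      # < 0: long tends to follow long (memory)
    return rho1, rho4

# Uncorrelated waiting times should give rho_1 ~ rho_4 ~ 0:
rng = np.random.default_rng(5)
print(cp_memory(rng.exponential(1.0, size=50000)))
```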

Fig. 47. Memory analyses for the Israeli, Italian and ETAS catalogs. (a, c, e) Conditional PDF and (b, d, f) CDF of the recurrence times $\tau$ for (a, b) the Israeli catalog above the threshold $M_c = 2.0$, (c, d) the Italian catalog above the threshold $M_c = 3.0$, and (e, f) the ETAS catalog above the threshold $M_c = 2.0$.

Reprinted figure from Ref. [212].

Finally, the memory analyses DFA2 and CP were performed on ETAS catalogs by measuring the coefficients $\alpha$, $\rho_1$ and $\rho_4$, as shown in Fig. 46 and Fig. 47. The results indicate that the empirically observed earthquake memory (for the Israeli catalog) can be reproduced only for a narrow range of the model parameters. Moreover, the origin of memory in the ETAS model is controlled by: (i) the background (noise) rate parameter $\mu$, which affects the memory through the interference of temporally overlapping aftershock subsequences, i.e., a smaller $\mu$ leads to stronger memory, and (ii) the branching ratio $n = \frac{A c}{p - 1}\, \frac{\beta}{\beta - \alpha_M}$, where $\beta$ is the exponent of the magnitude distribution: the exponent $\alpha_M$ relating the production of aftershocks to the magnitude and the power $p$ of the Omori law affect the memory through the branching ratio of the ETAS model, i.e., a smaller $p$ and a larger $\alpha_M$ result in stronger memory (see Figs. 11 and 12 in Ref. [212]).

More recently, a lagged CP method was introduced by Zhang et al. to study possible long-term memory in both the inter-event times and the inter-event distances of earthquakes [390]. The CPs of times and distances were defined as $\rho(\tau_k|\tau_0)$ and $\rho(r_k|r_0)$, respectively, where $\tau_0$ ($r_0$) belongs to the $Q_1$ or $Q_3$ group and $\tau_k$ ($r_k$) is the lagged $k$th inter-event time (distance) that follows $\tau_0$ ($r_0$). In order to quantify the level of memory, the measure $S(\tau_k|\tau_0) \equiv 1 - s_{13}$ was introduced, where $s_{13}$ is the common (overlapping) area of the PDFs of $Q_1$ and $Q_3$; here $S(\tau_k|\tau_0) \in [0, 1]$. Similarly, $S(r_k|r_0)$ represents the level of memory for the inter-event distances. Interestingly, it has been shown that $S(k)$ can be expressed by the following scaling form,

$S(k) = L^{d_u}\, F\!\left(k\, 10^{b M_z}\right)\, 10^{a M_z}$, (148)

where $d_u$ and $a$ are constants, and $b = 1$ as in the Gutenberg–Richter law. The distributions of $S(\tau_k|\tau_0)$ and $S(r_k|r_0)$ and their corresponding scaling functions for the Italian catalog are presented in Fig. 48, with $d_u = 0.14$ and $a = 0.09$ for the inter-event times $\tau$, and $d_u = 0.08$ and $a = 0.24$ for the inter-event distances $r$. The scaling functions shown in Fig. 48c and d suggest a crossover between two distinct power-law regimes, $F(x) \sim x^{\gamma}$. Both scaling functions exhibit a significant crossover at $k\, 10^{b M_z} \approx 10^{5}$. These results indicate that the memory measures for different grid sizes and different magnitude thresholds can be rescaled onto a single function. However, it was found that the memory function of the ETAS model is very different from that of real earthquake catalogs, i.e., the model's memory is weaker (stronger) on short (long) time scales compared to the real catalogs. Moreover, the model does not exhibit the clear crossover observed in the real catalogs.

Fig. 48. The lagged CP memory analysis for the inter-event times and distances from the Italian catalog. The memory measures (a) $S(\tau_k|\tau_0)$ and (b) $S(r_k|r_0)$ as a function of the lag index $k$. Rescaled memory measures for (c) inter-event times and (d) inter-event distances. Colors represent different grid sizes $L$; the shapes of the symbols represent different magnitude thresholds $M_z$.

Reprinted figure from Ref. [390].

3.3.4. Earthquake forecasting

Earthquakes are among the most destructive natural disasters in the world, and skilled, reliable earthquake forecasting remains an ultimate goal. The term earthquake prediction usually refers to the specification of the occurrence time, location or magnitude of a future earthquake. In contrast, "the term earthquake forecasting usually refers to the evaluation of the occurrence probability $P_\Sigma$ of an earthquake inside a hypercell of volume $\Sigma$ centered at the point $\omega = (t, x, y, z, m)$. Therefore, the fundamental quantity is the occurrence probability $P_\Sigma(\omega)$ conditioned on the whole set of information available at time $t$" [36]. The probability of having an earthquake at time $t$, in the $k$th cell, conditioned on the previous history is

$P\!\left(\zeta_k = 1 \,|\, H_{t_k}\right) = \lambda\!\left(\omega_k | H_{t_k}\right) d\Sigma + O\!\left(d\Sigma^2\right)$, (149)

where $\lambda$ is the local conditional rate, as defined in Eqs. (139) or (141), and $\omega_k$ denotes the center of the $k$th cell. The forecasting probability $P_\Sigma$, the integral over $\Sigma$ of the occurrence rate $\lambda(\omega_k|H_{t_k})$, depends on the volume of the hypercell $\Sigma$: the smaller $\Sigma$, the more accurate the evaluation of $\lambda$. The temporal extension $T$ of $\Sigma$ also influences the forecasting accuracy. According to the time scale of $T$, one can somewhat arbitrarily separate Long-Term from Short-Term forecasting [391]: Long-Term forecasting, for $T$ in the range of decades to centuries; Intermediate-Term forecasting, for $T$ of the order of months; and Short-Term forecasting, for $T$ in the range from a few seconds up to several weeks. Short-Term forecasting is usually subdivided into post-seismic and pre-seismic forecasting, referring to the aftershocks and foreshocks of a mainshock. Post-seismic forecasting is still very important from the point of view of risk management, since aftershocks can have sizes comparable to, or even larger than, their triggering mainshock. Long-Term forecasting, however, is probably the most relevant from the engineering point of view, e.g., for urban planning and risk assessment. Based on early-warning foreshock observations, some big earthquakes could be successfully predicted, such as the 1975 Haicheng, China earthquake [392] and the 1995 Kozani–Grevena, Greece earthquake [393]. Nevertheless, most earthquakes do not present significant precursory patterns, e.g., the 2004 Parkfield (California) earthquake [394]; the evaluation of foreshocks or precursory patterns therefore remains an open question. Since most earthquake models, like the ETAS model, are based on the well-established empirical laws of aftershock spatio-temporal clustering, they are able, to some degree, to provide reliable risk assessments only for Long-Term and post-seismic Short-Term forecasting, but not for foreshocks. Nevertheless, how to validate a forecasting model is of great importance. In the following, we review various methods for evaluating earthquake predictions and earthquake forecasts.

Receiver operating characteristic diagram

The receiver operating characteristic (ROC) diagram is a graphical plot that illustrates the prediction quality of a binary classifier as its discrimination threshold is varied. For evaluating earthquake predictions, if an earthquake is expected ("YES") with a probability $P_\Sigma > P_t$, where $P_t$ is a threshold, and an event indeed occurs, we call it a hit; if an earthquake is not expected ("NO") in the cell but an event occurs, we call it a miss. Each "YES" prediction for which no corresponding earthquake is observed is a false alarm, and each "NO" prediction for which no earthquake occurs is a correct rejection. In this representation, there are four possible combinations of alarm declaration and event observation; see Table 1 for the corresponding contingencies. We denote by $a$ the number of hits, by $b$ the number of false alarms, by $c$ the number of misses and by $d$ the number of correct rejections. The true positive rate (TPR) is defined as

$\mathrm{TPR} = \frac{a}{a + c}$, (150)

and false positive rate (FPR) as

$\mathrm{FPR} = \frac{b}{b + d}$. (151)

When TPR is plotted against FPR on the unit square [0,1] × [0,1], the resulting curve is called the ROC. The perfect prediction corresponds to the point with coordinates TPR = 1 and FPR = 0, while the diagonal line represents the long-term behavior of random guessing. One can obtain the full ROC curve by varying $P_t$ continuously. As an example, Fig. 49 shows the ROC curve of an aftershock forecast obtained with a deep learning method.

Table 1. Contingency table for a binary event.

                 Observation: YES           Observation: NO
Forecast: YES    a = number of hits         b = number of false alarms
Forecast: NO     c = number of misses       d = number of correct rejections
Fig. 49. An example of ROC curves based on a neural network. (a) The true positive rate is plotted versus the false alarm rate. (b) A synthetic case of a 60-km-long, right-lateral strike-slip fault (red lines) at a depth of 10 km. The dashed diagonal represents random prediction (the null hypothesis).

Reprinted figure from Ref. [395].
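Given a set of cell-wise forecast probabilities and the corresponding binary observations, an ROC curve can be traced by sweeping the alarm threshold $P_t$, as in the following sketch; the input arrays are placeholders and the toy example is purely illustrative.

```python
import numpy as np

def roc_curve(p_forecast, occurred):
    """Sweep the alarm threshold P_t and return the FPR and TPR arrays
    of Eqs. (150)-(151)."""
    p = np.asarray(p_forecast, dtype=float)
    y = np.asarray(occurred, dtype=bool)
    fpr, tpr = [], []
    for p_t in np.unique(p)[::-1]:
        alarm = p > p_t
        a = np.sum(alarm & y)        # hits
        b = np.sum(alarm & ~y)       # false alarms
        c = np.sum(~alarm & y)       # misses
        d = np.sum(~alarm & ~y)      # correct rejections
        tpr.append(a / (a + c) if (a + c) > 0 else 0.0)
        fpr.append(b / (b + d) if (b + d) > 0 else 0.0)
    return np.array(fpr), np.array(tpr)

# Toy usage: a skilled forecast concentrates probability on the cells with events.
rng = np.random.default_rng(6)
y = rng.random(10000) < 0.05
p = np.clip(0.3 * y + 0.1 * rng.random(10000), 0.0, 1.0)
fpr, tpr = roc_curve(p, y)
print("area under the ROC curve ~", np.trapz(tpr, fpr))
```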

The Molchan diagram

Closely related to the ROC is the Molchan diagram, an error diagram that allows one to compare the predictions of the test model with those of a generic reference model [396]. It is a plot of the miss rate, $M = 1 - \mathrm{TPR}$, versus the fraction $F$ of the experiment's space–time volume occupied by alarms, or "YES" predictions. The best prediction in the Molchan diagram is the point $(F = 0, M = 0)$. In particular, the probability of obtaining $h$ or more hits by chance, given that there have been $n$ observed target earthquakes, is described by the binomial distribution

$\Gamma_M = \sum_{i=h}^{n} \binom{n}{i} F^{i} (1 - F)^{n - i}$, (152)

where $\binom{n}{i}$ denotes the binomial coefficient. One can obtain a confidence-level curve in the Molchan diagram by fixing $\Gamma_M$ and varying $F$ continuously. A very small $\Gamma_M$ value therefore indicates that the alarm set has high skill.
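The confidence value $\Gamma_M$ of Eq. (152) is simply a binomial tail probability and can be evaluated directly; a minimal sketch with illustrative numbers (8 hits out of 10 target events at 20% alarm coverage):

```python
from scipy.stats import binom

def molchan_gamma(h, n, F):
    """Probability of obtaining h or more hits by chance when n target earthquakes
    occurred and a fraction F of the space-time volume is covered by alarms,
    Eq. (152)."""
    return binom.sf(h - 1, n, F)    # = sum_{i=h}^{n} C(n,i) F^i (1-F)^(n-i)

print(molchan_gamma(h=8, n=10, F=0.2))   # very small Gamma_M -> high skill
```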

The Number test (N-test)

The N-test is used to assess how well the total number of forecasted earthquakes, $N_{forecast}$, matches the number of observed events, $N_{obs}$ [397]. The target earthquakes are assumed to follow a generalized Poisson distribution, and the probability mass function of the negative binomial distribution (NBD) can be expressed as [384]

$f(n; r, p) = \frac{\Gamma(r + n)}{n!\, \Gamma(r)}\, p^{n} (1 - p)^{r}$, (153)

where $\Gamma(r)$ is the gamma function, $n$ is the number of target earthquakes, and $r, p$ are the two parameters of the distribution. The NBD can be derived as

$f(n; r, p) = \int_0^{\infty} f_1(n; \lambda)\, f_2\!\left(\lambda; r, \frac{1 - p}{p}\right) d\lambda$, (154)

where $f_1(n; \lambda)$ is the Poisson distribution and $\lambda$ is assumed to follow a gamma distribution

$f_2(\lambda; k, \theta) = \frac{\lambda^{k - 1}\, e^{-\lambda/\theta}}{\theta^{k}\, \Gamma(k)}$, (155)

where $k, \theta$ are the shape and scale parameters of the gamma distribution, with $r = k$ and $p = \frac{\theta}{1 + \theta}$. The parameters $k$ and $\theta$ of the gamma distribution are then calculated from the forecast. To interpret the N-test results, one can use a one-sided test with an effective significance value $\alpha_{eff}$, which is half of the intended significance value $\alpha = 0.05$. Fig. 50c shows an example of the N-test.
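A sketch of the N-test with the negative-binomial forecast distribution of Eqs. (153)–(155) is given below; note that scipy's nbinom parameterizes the distribution with the complementary probability, so $p$ is mapped accordingly. The input values (gamma shape, scale, observed count) are placeholders.

```python
from scipy.stats import nbinom

def n_test(n_obs, k, theta, alpha=0.05):
    """N-test against a negative-binomial forecast: a gamma-distributed rate with
    shape k and scale theta gives r = k and p = theta / (1 + theta) in Eq. (153).
    scipy's nbinom uses the complementary probability 1 - p."""
    r = k
    p = theta / (1.0 + theta)
    delta_low = nbinom.cdf(n_obs, r, 1.0 - p)        # P(N <= N_obs)
    delta_high = nbinom.sf(n_obs - 1, r, 1.0 - p)    # P(N >= N_obs)
    alpha_eff = alpha / 2.0                          # one-sided effective significance
    consistent = (delta_low > alpha_eff) and (delta_high > alpha_eff)
    return delta_low, delta_high, consistent

# Example: forecast mean k * theta = 20 events, 14 events observed.
print(n_test(n_obs=14, k=10.0, theta=2.0))
```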

Fig. 50. Statistical tests of earthquake forecasts for the complex Amatrice–Norcia seismic sequence. (a) The Space test (S-test). The red dashed horizontal line is the critical value $\zeta = 0.05$ for rejection. (b) Expected number of target earthquakes for each forecast. (c) The Number test (N-test): probability distribution of the expected number of target earthquakes with a 95% confidence interval (marked in red) and the observed number of target earthquakes (vertical dashed blue line). (d) The CDFs of the forecast and observed frequency–magnitude distributions.

Figure from Ref. [384].

The Space test (S-test)

The S-test is a likelihood test in which only the spatial distributions of the forecast and of the observation are considered [398]. The S-test can be summarized by the quantile score

$\zeta = \frac{n\left\{S_i \,\middle|\, S_i \leq S,\; S_i \in S_s\right\}}{n\left\{S_s\right\}}$, (156)

where $S_i$ is the $i$th simulated spatial likelihood, $S_s$ is the set of simulated spatial likelihoods, $S$ is the likelihood of the spatial forecast given the observations, and $n\{A\}$ denotes the number of elements in a set $\{A\}$. If $\zeta$ is smaller than the critical significance value $\alpha$, the observed spatial distribution is inconsistent with the forecast, whereas values close to 1 indicate optimal spatial forecasts. An example of the S-test is shown in Fig. 50a.
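The quantile score of Eq. (156) is then simply the fraction of simulated spatial likelihoods that do not exceed the observed one (assuming, as reconstructed above, that the comparison in Eq. (156) is "≤"); a short sketch with placeholder inputs:

```python
import numpy as np

def s_test_quantile(S_obs, S_sim):
    """Quantile score zeta of Eq. (156): fraction of simulated spatial likelihoods
    S_i that do not exceed the observed spatial likelihood S."""
    return float(np.mean(np.asarray(S_sim) <= S_obs))

# Example: an observed likelihood near the middle of the simulated distribution
# (zeta ~ 0.5) is consistent with the forecast.
rng = np.random.default_rng(7)
print(s_test_quantile(-120.0, rng.normal(-121.0, 2.0, size=1000)))
```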

Besides the aforementioned tests, various other methods have been developed to evaluate the accuracy of a forecasting model, such as the Likelihood test (L-test), the Likelihood-Ratio test (R-test) and the Magnitude test (M-test). An instructive review of the evaluation of earthquake predictions and earthquake forecasts can be found in Ref. [399].

As discussed in the previous subsection, the possible origin of memory in earthquakes has been studied for both real catalogs and ETAS models. Here, we show that the memory analysis can significantly improve the short-term forecasting rate for real earthquake catalogs. When estimating the ETAS model parameters for a given earthquake catalog, the parameters are commonly inverted from the data based on the point-process maximum likelihood (ML) method, using the Davidon–Fletcher–Powell algorithm [380] or simulated annealing [400]. For example, using the ML method, one obtains $\mu = 0.2$, $A = 6.26$, $\alpha_M = 1.4$, $p = 1.13$ and $c = 0.007$, see Eq. (139), for the Italian catalog [400]. When the memory is taken into account, however, the ETAS model can reproduce the same memory as the real catalogs only for a small range of parameter values, i.e., $\mu = 0.2$, $A = 6.26$, $\alpha_M = 1.5$, $p = 1.1$ and $c = 0.007$ [212]. As a test case, the cumulative number of events, $N_C$, after the magnitude 5.7 Capitignano main shock that occurred on 18 January 2017 [401] was forecasted using the ETAS model; the cumulative number at time $t$ is calculated by integrating Eq. (139) from 0 to $t$. In this way, the forecasting accuracy within 14 days after the main shock can be improved by more than 20% by considering the memory. Moreover, a revised and generalized ETAS model was proposed recently by Zhang et al., in which the short- and long-term/distance memory is reproduced accurately [402]. This new ETAS model is also found to significantly improve earthquake forecasting for the Italian and Southern California earthquake catalogs.

4. Conclusions and perspectives

In this article, we have reviewed statistical physics and complex-network-based techniques that advance our knowledge of the complex Earth system, a relatively novel branch of geophysics. These techniques can help to address the understanding as well as the prediction of climate variability, of the Earth's geometric relief features, and of earthquakes. In this section we draw a few concluding remarks and address the perspectives that remain open for future research. We believe that these novel approaches will attract the attention of scientists in the related fields.

In Section 2 we started by describing the overall statistical physics methodology for analyzing important aspects of complex Earth subsystems. It includes complex networks, percolation theory, tipping-point analysis, entropy theory and complexity. Although it was shown that the aforementioned approaches can be successfully applied to Earth systems, individually or in combination, a lot of work lies ahead in order to develop a proper and comprehensive framework. Among the different open problems to be solved, we highlight the following four: (i) the need to gain a better knowledge and understanding of the underlying physical mechanisms; (ii) the need to extend the concepts of CNs to a more comprehensive description, such as multilayer climate networks [403], since the climate is a coupled and multilayered system; (iii) the need to develop a universal theory of critical phenomena (percolation) in climate systems, in particular for out-of-equilibrium systems; (iv) the need for a principle on how to anticipate the tipping points and the entropy of the Earth system by means of time series of multiple spatio-temporal variables.

In Section 3, we continued by reviewing the applications of statistical physics approaches to different Earth subsystems. Specifically, five weather and climate dynamical scenarios were highlighted in Section 3.1: the El Niño-Southern Oscillation, the Indian summer monsoon, extreme rainfall, atmospheric circulation and the Atlantic meridional overturning circulation. We suggested that the evolving interactions and patterns of CNs (and of other combined approaches) can serve as novel predictors which significantly improve predictions. Besides the above-mentioned scenarios, the CNs method has also been widely applied to other high-impact phenomena, such as the westward propagation of the Atlantic multidecadal oscillation [99], the detection of teleconnection paths [62], the identification of causal gateways and mediators [404], the early prediction of extreme stratospheric polar vortex states [405], the impacts of carbon dioxide (CO2) on global SAT [406], etc. In spite of these achievements, many unexplored problems related to climate systems still merit further attention. Topics that need to be investigated further include, for example, (i) how human activities have changed and continue to change the Earth's oceanic and atmospheric composition, since some of these changes have a direct or indirect impact on the energy balance of the Earth and are thus drivers of climate change; (ii) the statistical mechanics and CNs behavior in Earth System Models, ranging from simple energy balance models to general circulation models; (iii) the projected short-term and long-term abrupt phase transitions in paleoclimate, in particular the role of sea ice and Snowball-Earth initiations [407], [408], [409]; (iv) improving the predictability of severe convective weather processes, including thunderstorms, hail and tornadoes, by using the CNs approach.
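To illustrate the CNs methodology that underlies these applications, the sketch below shows the generic correlation-threshold construction of a climate network from node time series; the random data array, the 95th-percentile threshold and the degree measure are illustrative assumptions and do not reproduce the specific algorithms of the cited studies.

import numpy as np

# Minimal sketch (generic construction, not a specific cited algorithm):
# build a climate network by thresholding cross-correlations between node
# time series. `data` is a placeholder of shape (n_nodes, n_time), standing
# in for, e.g., surface air temperature anomalies at grid points.

rng = np.random.default_rng(0)
data = rng.standard_normal((50, 500))

anom = (data - data.mean(axis=1, keepdims=True)) / data.std(axis=1, keepdims=True)
corr = anom @ anom.T / anom.shape[1]                  # Pearson cross-correlation matrix

upper = np.abs(corr[np.triu_indices_from(corr, k=1)])
threshold = np.quantile(upper, 0.95)                  # keep the strongest 5% of links
adjacency = (np.abs(corr) > threshold).astype(int)
np.fill_diagonal(adjacency, 0)                        # no self-loops

degree = adjacency.sum(axis=1)                        # a simple node-level measure
print("mean degree:", degree.mean())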

In Section 3.2, we reviewed the current literature on the statistical features and the percolation framework of the Earth's surface topography. We pointed out evidence for abrupt transitions that occurred during the evolution of the Earth's relief network, indicative of continental/cluster aggregation. A further analysis based on fractional Brownian motion models suggests that long-range correlations may play a key role in the observed discontinuity. This research may not only facilitate the understanding of the geometrical phase transition on Earth, but can also be used to identify the critical nodes for future global change in the Earth's relief network. In particular, (i) the future movement of tectonic plates and (ii) the dynamic evolution of these critical nodes are of key importance and thus need to be addressed further.
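To give a flavor of this percolation framework, the sketch below occupies sites of a relief field that lie above a varying elevation threshold and tracks the fraction of sites in the largest connected cluster; its sharpest jump marks an abrupt, percolation-like transition. The elevation field is synthetic and the nearest-neighbour connectivity on a flat grid is an assumption, whereas the studies reviewed above use real gridded topography on the sphere.

import numpy as np
from scipy import ndimage

# Minimal sketch (illustration only): percolation analysis of a relief field.
# Sites above a varying threshold ("sea level") are occupied, and the relative
# size of the largest connected cluster is tracked as the threshold is lowered.
# `elevation` is a synthetic correlated field, not real topography.

rng = np.random.default_rng(1)
elevation = ndimage.gaussian_filter(rng.standard_normal((200, 200)), sigma=5)

levels = np.quantile(elevation, np.linspace(0.99, 0.01, 50))
largest = []
for h in levels:
    occupied = elevation > h                            # "land" above sea level h
    labels, n = ndimage.label(occupied)                 # nearest-neighbour clusters
    sizes = ndimage.sum(occupied, labels, range(1, n + 1)) if n else [0.0]
    largest.append(np.max(sizes) / occupied.size)       # largest-cluster fraction

jump = np.argmax(np.abs(np.diff(largest)))              # sharpest change between levels
print("largest jump occurs between levels", jump, "and", jump + 1)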

In Section 3.3, we first presented a description of the well-established empirical basic laws of earthquakes, namely the Gutenberg–Richter, Omori and productivity laws. We then discussed the scaling hypothesis of the inter-event distributions. The epidemic-type aftershock sequence (ETAS) model and various methods for evaluating earthquake predictions and earthquake forecasts have also been reviewed. Finally, we have shown that the two proposed memory analysis approaches, DFA and CP, can significantly improve the short-term forecasting rate for real earthquake catalogs. Nevertheless, earthquake prediction research has been plagued by controversy, and it remains an outstanding challenge. Besides the improvement of short-term aftershock forecasting, we also emphasize the potential of new directions: (i) the wide application of tipping points and memory analysis in theoretical earthquake models which account for accelerated moment release and foreshock occurrence together with other precursory patterns, where mainshocks are treated as characteristic earthquakes within a seismic cycle [410]; (ii) the development of an earthquake forecasting model that describes foreshock organization and memory behavior; (iii) motivated by the successful applications of network theory to the prediction of climate phenomena, the proposal of an earthquake network framework based on the occurrence time series [411], or in particular on the seismic waves (elastic body and shear waves) and the elastogravity signals preceding direct seismic waves [412].
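As a small worked example related to the first of these laws, the Gutenberg–Richter b-value can be estimated from a catalog with the standard Aki maximum-likelihood estimator, b = log10(e)/(⟨M⟩ − Mc). The sketch below applies it to a synthetic magnitude sample; it is not code from the studies reviewed here, and magnitude-binning corrections are ignored.

import numpy as np

# Minimal sketch (standard Aki maximum-likelihood estimator, not code from the
# review): estimate the Gutenberg-Richter b-value from magnitudes above a
# completeness threshold Mc. The catalog is synthetic, generated with b = 1.

rng = np.random.default_rng(2)
Mc = 3.0
magnitudes = Mc + rng.exponential(scale=1.0 / np.log(10), size=5000)  # b = 1 catalog

b_value = np.log10(np.e) / (magnitudes.mean() - Mc)   # Aki (1965) estimator
b_error = b_value / np.sqrt(len(magnitudes))          # approximate standard error
print(f"b = {b_value:.2f} +/- {b_error:.2f}")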

One of the key features of the statistical physics-based approaches reviewed here is the ability to forecast complex Earth phenomena better and more reliably, such as El Niño events, extreme rainfall in the Eastern Central Andes, the Indian summer monsoon, the collapse of the Atlantic meridional overturning circulation and the occurrence of earthquakes. Moreover, artificial intelligence and deep learning techniques [413] have achieved great success in recent years in many fields, such as phase transitions in statistical physics [414], [415], [416], data-driven Earth system science [417], ENSO forecasts [294], [295], Indian monsoon rainfall [418], as well as the forecasting of aftershock patterns [395] and the detection of earthquakes [419] in seismic systems. We are therefore confident that network-based and artificial intelligence-based approaches will complement and boost each other. For instance, applying machine learning to network attribute vectors (product similarity) can predict successful and failing firms with much better accuracy than current state-of-the-art techniques for market forecasting [420]. In addition, deep learning systems can themselves be regarded as complex networks, which provides insights into the structural and functional properties of the computational graph resulting from the learning process [421].

As a final remark, there is still room for many approaches to analyzing the climate resilience of societies, ecosystems and economies by using complex network theory. We currently lack appropriate models to better understand or predict the effects of cascading failures [30] triggered by the increasing adverse impact of extreme climate/weather events on interdependent critical infrastructures. Closing this knowledge gap is crucial for developing means to achieve both climate and infrastructure resilience. Another very relevant subject of investigation that will certainly attract huge attention is the assessment and quantification of the complex climate–health relationships. There is an overwhelming consensus that climate conditions, including temperatures and the spatial–temporal distribution of precipitation, have key implications for human health [422]. Moreover, the morbidity and outbreak of some diseases, such as vector-borne diseases (malaria and dengue fever), the ongoing COVID-19 pandemic [423], and influenza, are strongly affected by climate change and the environment [424], [425], [426], [427], [428]. However, climatic influences are often excluded from consideration in traditional epidemiologic models [113], e.g., the SIR, SIRS and SEIR models. We emphasize that incorporating climate factors into traditional epidemiologic models could be a very promising road towards a deeper understanding of epidemic processes and towards assessing the health consequences of climate change both regionally and globally.
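As one simple illustration of how a climate factor could enter such models, the sketch below couples a seasonally modulated transmission rate into a standard SIR model integrated with a forward Euler step; the sinusoidal form of the modulation and all parameter values are our own assumptions, not a model from the cited literature.

import numpy as np

# Minimal sketch (our own illustration): an SIR model whose transmission rate
# beta(t) is modulated by a seasonal "climate factor". All parameter values
# and the sinusoidal forcing are assumptions, chosen only to show how a
# climate variable could enter a traditional epidemiologic model.

def sir_with_climate(days=730, dt=0.1, beta0=0.3, amplitude=0.4, gamma=0.1):
    S, I, R = 0.99, 0.01, 0.0
    history = []
    for step in range(int(days / dt)):
        t = step * dt
        beta = beta0 * (1.0 + amplitude * np.cos(2.0 * np.pi * t / 365.0))  # seasonal forcing
        new_infections = beta * S * I * dt
        new_recoveries = gamma * I * dt
        S, I, R = S - new_infections, I + new_infections - new_recoveries, R + new_recoveries
        history.append((t, S, I, R))
    return np.array(history)

trajectory = sir_with_climate()
print("peak infected fraction:", trajectory[:, 2].max())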

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

We would like to gratefully acknowledge all colleagues with whom we have had interactions and discussions on the topics of this report.

In particular, we would like to thank A. Agarwala, N. Boers, A. Bunde, D. Chen, R. Cohen, H. A. Dijkstra, G. Falcone, L. Da, N. B. George, G. Dong, J. F. Donges, Z. Fu, J. Gao, A. Gozolchiani, H. Hof, R. Hofstetter, H. Inoue, V. Livina, T. M. Lenton, Y. Liu, W. Lucht, N. Marwan, W. Marzocchi, M. McPhaden, J. Nagler, P. Nooteboom, A. A. Saberi, L. M. Shekhtman, H. E. Stanley, V. Stolbova, Y. Sun, E. Surovyatkina, K. Yamasaki, N. Yuan, Y. Zhang, D. Zhou.

This work was partly supported by the ‘East Africa Peru India Climate Capacities — EPICC’ project, which is part of the International Climate Initiative (IKI), Germany. The Federal Ministry for the Environment, Nature Conservation and Nuclear Safety (BMU), Germany supports this initiative on the basis of a decision adopted by the German Bundestag. The Potsdam Institute for Climate Impact Research (PIK), Germany is leading the execution of the project together with its project partners The Energy and Resources Institute (TERI) and the Deutscher Wetterdienst (DWD). S.H. thanks the Italian Ministry of Foreign Affairs and International Cooperation jointly with the Israel Ministry of Science, Technology, and Space (MOST); the Israel Science Foundation, and the EU H2020 project RISE, Europe for financial support. J.K. was supported by the Russian Ministry of Science and Education Agreement No. 075-15-2020-808.

Editor: I. Procaccia

Appendix. Acronyms

This Appendix contains two tables of acronyms used throughout the paper: Table A.2 contains the scientific acronyms and Table A.3 the institutional ones.

Table A.3.

Institutional acronyms.

Acronym Meaning
AR Assessment Report
CMIP Coupled Model Intercomparison Project
ECMWF European Centre for Medium-Range Weather Forecasts
JUNEC Japan University Network Earthquake Catalog
IMD India Meteorological Department
IPCC Intergovernmental Panel on Climate Change
NCAR National Center for Atmospheric Research
NCEP National Centers for Environmental Prediction
NEIC National Earthquake Information Center
PCMDI Program for Climate Model Diagnosis and Intercomparison
SCEC Southern California Earthquake Center

Table A.2.

Scientific acronyms.

Acronym Meaning
ACF Autocorrelation function
AIRI All India Rainfall Index
AMOC Atlantic Meridional Overturning Circulation
ApEn Approximate Entropy
CNs Climate networks
DFA Detrended fluctuation analysis
EBM Energy Balance Model
EEG Electroencephalogram
ENB El Niño Basin
ENSO El Niño-Southern Oscillation
ESMs Earth System Models
ESS Earth System Science
ETAS Epidemic Type Aftershock Sequence
EWS Early warning signals
FAMOUS Fast Met Office/UK Universities Simulator
fBm Fractional Brownian motion
GCM General Circulation Model
GHGs Greenhouse Gases
ISM Indian Summer Monsoon
NAO North Atlantic Oscillation
OD Onset date
ONI Oceanic Niño Index
ROC Receiver Operating Characteristic
SampEn Sample entropy
SLP Sea Level Pressure
SST Sea Surface Temperature
SPB Spring Predictability Barrier
SysSampEn System Sample Entropy
WD Withdrawal date

References

