Abstract
Many dynamical systems, such as lakes, organisms, ocean circulation patterns, and financial markets, are now thought to have tipping points at which critical transitions to a contrasting state can occur. Because critical transitions can occur unexpectedly and are difficult to manage, there is a need for methods that can identify when a critical transition is approaching. Recent theory shows that we can identify the proximity of a system to a critical transition using a variety of so-called ‘early warning signals’, and successful empirical examples suggest a potential for practical applicability. However, while the range of proposed methods for predicting critical transitions is rapidly expanding, opinions on their practical use differ widely, and there is no comparative study that tests the limitations of the different methods for identifying approaching critical transitions in time-series data. Here, we summarize a range of currently available early warning methods and apply them to two simulated time series that are typical of systems undergoing a critical transition. In addition to a methodological guide, our work offers a practical toolbox that may be used in a wide range of fields to help detect early warning signals of critical transitions in time series data.
Introduction
The Earth’s past has been characterized by rapid and often unexpected punctuated shifts in temperature and climatic conditions [1], lakes and coral reefs have shifted among alternative states [2], neural cells move regularly between different dynamical regimes [3], and financial markets are notorious for abrupt shifts. The gradual change in some underlying condition (or driver), such as the accumulation of phosphorus in a lake or the increasing flux of freshwater from melting ice sheets into the ocean, can bring a system closer to a catastrophic bifurcation point (a ‘tipping point’), causing a loss of resilience in the sense that even small perturbations can invoke a shift to an alternative state [2], [4]. In most cases, however, information about the drivers or the values at which systemic responses are so easily triggered (critical thresholds) is difficult to acquire (but see [5]). Nonetheless, these sudden transitions incur large costs, as restoration to the previous conditions is difficult or sometimes even impossible [2].
To overcome these challenges, numerous studies have suggested the use of generic early warning signals (or leading indicators) that can detect the proximity of a system to a tipping point [6]. Such indicators are based on common mathematical properties of phenomena that appear in a broad range of systems as they approach a catastrophic bifurcation [6]. An important application of these leading indicators is their potential real-time use as warnings of increased risk for upcoming transitions. However, they also may be used to rank instances of a system (e.g. different patients, individual coral reefs, different markets etc.) according to their proximity to a critical threshold.
Several empirical studies have now demonstrated that leading indicators can be found in a variety of systems. Increases in autocorrelation have been documented prior to past climatic transitions [7], [8], increased variability has been shown before extinction in zooplankton lab experiments [9] and before an experimentally induced regime shift in a lake food web [10], whereas decreases in recovery rates have been demonstrated in chemical reactions [11], lasers [12], and in plankton [13]. However, the statistical detection of leading indicators in both past events and in real time remains challenging for at least two reasons. First, there is a lack of appropriate data. High frequency sampling and designed experimentation have been proposed as potential solutions that can improve the detection of leading indicators [6], [10]. In many important cases, however, high frequency sampling or experiments are impossible. Furthermore, in many systems, sampling schemes are designed explicitly to avoid temporal autocorrelation, which is, in fact, needed for the accurate application and assessment of leading indicators (see worked examples below).
Second, there is no clear framework for the application and detection of leading indicators. Different approaches have emerged in different fields [14] and have been applied to different types of transitions [15]. For instance, most leading indicators are based on detecting changes in the stability properties of a system around its equilibrium under a weak stochastic regime [6], whereas alternative approaches have been developed for systems experiencing highly noisy regimes [16]. As the literature is rapidly expanding, there is an urgent need for a coherent methodological framework and a comparison between approaches.
Here we present a methodological guide for using leading indicators for detecting critical transitions in time series. For this, we apply available leading indicators to two example datasets generated from a simple ecological model that is known to undergo a critical transition to an alternative state. While most of these methods have been applied to real-world data in papers that we cite, such applications inevitably depend on specific details (e.g. missing values, data transformation, coping with too-long sampling intervals or too-short time series) that make it difficult to compare the methods themselves. The exact location and nature of the critical transition is also ambiguous for real-world data. Therefore we gather issues of data preprocessing in a separate section (see “Step 1. Preprocessing” below), and illustrate the methods using simulated data with known, clearly defined critical transitions. The structure of the paper is as follows. First, we describe two categories of leading indicators: metric-based and model-based indicators. Second, we present the ecological model we use to generate the time series we use to detect critical transitions. Third, we show how each indicator is applied to the two simulated time series. We provide computer code alongside the worked-out examples. Last, we review the sensitivity and limitations of each indicator and discuss their interpretation. We trust that the framework and the tools we provide will encourage testing the ability of these indicators to detect upcoming transitions in real systems.
Methods
We group leading indicators of critical transitions into two broad categories: metric-based and model-based indicators (Table 1). Both types of indicators reflect changes in the properties of the observed time series of a system that is generated by a general process:
dx = f(x, θ)dt + g(x, θ)dW (1)
where x is the state of the system, f(x,θ) describes the deterministic part of the system, and g(x,θ)dW determines how stochasticity interacts with the state variable; dW is a white noise process. A slow change in the underlying conditions (drivers), θ, moves the system close to a threshold where a transition may occur. Metric-based indicators quantify changes in the statistical properties of the time series generated by equation 1 without attempting to fit the data with a specific model structure. Model-based methods quantify changes in the time series by attempting to fit the data to a model that is based on the general structure of equation 1. The ultimate goal of both types of indicators is to capture changes in the ‘memory’ (i.e. correlation structure) and variability of a time series and to determine if they follow patterns as predicted by models of critical transitions, while the system is approaching a transition into an alternative dynamic regime (Table 1).
Table 1. Early warning signals for critical transitions.
| | Method/Indicator | Rising memory | Rising variability | Flickering | Ref. |
|---|---|---|---|---|---|
| metrics | Autocorrelation at-lag-1 | x | | | [23] |
| | Autoregressive coefficient of AR(1) model | x | | | [19] |
| | Return rate (inverse of AR(1) coefficient) | x | | | [23] |
| | Detrended fluctuation analysis indicator | x | | | [7] |
| | Spectral density | x | | | [20] |
| | Spectral ratio (of low to high frequencies) | x | | | [25] |
| | Spectral exponent | x | | | [this paper] |
| | Standard deviation | | x | x | [28] |
| | Coefficient of variation | | x | x | [28] |
| | Skewness | | x | x | [29] |
| | Kurtosis | | x | x | [25] |
| | Conditional heteroskedasticity | | x | x | [32] |
| | BDS test | | x | x | [10] |
| models | Time-varying AR(p) models | x | x | | [38] |
| | Nonparametric drift-diffusion-jump models | x | x | x | [16] |
| | Threshold AR(p) models | | | x | [38] |
| | Potential analysis (potential wells estimator) | | | x | [43] |
Leading indicator or method, the primary underlying dynamical phenomenon associated with it, and the original reference in which it was developed.
Metric-based Indicators
Autocorrelation and spectral properties
The rate of return to equilibrium following a (small) perturbation slows down as systems approach critical transitions [17]. This slow return rate has been termed “critical slowing down” [18] and can be detected by changes in the correlation structure of a time series. In particular, critical slowing down causes an increase in the ‘short-term memory’ ( = correlation at low lags) of a system prior to a transition [19], [20].
Autocorrelation is the simplest way to measure slowing down: an increase in autocorrelation at-lag-1 indicates that the state of the system has become increasingly similar between consecutive observations [19]. There are at least three alternative ways to measure autocorrelation at-lag-1. The most straightforward is to estimate the first value of the autocorrelation function, ρ1 = E[(zt − μ)(zt+1 − μ)]/σ², where μ is the mean and σ² the variance of the variable zt [21]. Alternatively, one can use a conditional least-squares method to fit an autoregressive model of order 1 (linear AR(1) process) of the form zt+1 = α1zt + εt, where εt is a Gaussian white noise process and α1 is the autoregressive coefficient [21]. ρ1 and α1 are mathematically equivalent [21]. Slowing down can also be expressed as a return rate: the inverse of the first-order term of a fitted AR(1) model [1/α1] [22], [23]. The return rate has also been expressed as [1 − α1], which reflects the proportion of the distance from equilibrium that decays away at each time step [10].
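As a minimal illustration (not the earlywarnings package implementation), the following base R sketch computes these three quantities for a detrended series; the stand-in series z and all parameter choices are assumptions.

```r
# Minimal base R sketch: three ways to quantify critical slowing down.
z <- as.numeric(arima.sim(list(ar = 0.8), n = 500))  # stand-in for a detrended series
rho1   <- acf(z, lag.max = 1, plot = FALSE)$acf[2]                    # autocorrelation at lag 1
alpha1 <- ar.ols(z, order.max = 1, aic = FALSE, demean = TRUE)$ar[1]  # AR(1) coefficient
return_rate      <- 1 / alpha1    # return rate as the inverse of the AR(1) coefficient
decay_proportion <- 1 - alpha1    # proportion of the distance from equilibrium decaying per step
```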
Whereas autocorrelation at-lag-1 ignores changes in correlation structure at higher lags, power spectrum analysis can reveal changes in the complete spectral properties of a time series prior to a transition. Power spectrum analysis partitions the amount of variation in a time series into different frequencies [21]. A system close to a transition tends to show spectral reddening: higher variation at low frequencies [20]. Changes in the power spectra of a time series also can be expressed in different ways: by estimating the entire power spectrum and observing a shift in the power of spectral densities to lower frequencies [20]; by estimating the spectral exponent of the spectral density based on the slope of a linear fitted model on a double-log scale of spectral density versus frequency [24]; or by estimating the spectral ratio of the spectral density at low frequency (e.g. 0.05) to the spectral density at high frequency (e.g. 0.5) [25].
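A minimal sketch of these spectral indicators in base R is given below; the use of the raw periodogram (spec.pgram), the nearest-frequency lookup, and the stand-in series z are assumptions, not the implementations used in the cited studies.

```r
# Minimal sketch: spectral reddening indicators from the periodogram.
z  <- as.numeric(arima.sim(list(ar = 0.8), n = 500))  # stand-in for a detrended series
sp <- spec.pgram(z, plot = FALSE, detrend = TRUE)
spectral_exponent <- -coef(lm(log(sp$spec) ~ log(sp$freq)))[2]  # negative slope on a double-log scale
lo <- sp$spec[which.min(abs(sp$freq - 0.05))]                   # spectral density near frequency 0.05
hi <- sp$spec[which.min(abs(sp$freq - 0.5))]                    # spectral density near frequency 0.5
spectral_ratio <- lo / hi
```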
Detrended fluctuation analysis
Detrended fluctuation analysis (DFA) can be used to measure increases in short- and mid-term ‘memory’ in a time series of a system close to transition. Instead of estimating correlations at a given lag (like autocorrelation at-lag-1), DFA estimates a range of correlations by extracting the fluctuation function F(s) of a time series across window sizes s. If the time series is long-term power-law correlated, the fluctuation function increases as a power law, F(s) ∝ s^a, where a is the DFA fluctuation exponent [26]. The DFA fluctuation exponent is then rescaled to give a DFA indicator, which is usually estimated in time ranges between 10 and 100 time units, and which reaches the value 1 (rescaled from 1.5) at a critical transition [7]. Although the DFA captures similar information to autocorrelation at-lag-1, it is more data demanding (it requires >100 points for robust estimation) [27].
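The sketch below illustrates the basic DFA calculation (integrated profile, linear detrending within boxes, and the exponent as the slope of log F(s) versus log s); it is a simplified stand-in for the implementation used in [7], and the box sizes of 10 to 100 time units follow the text.

```r
# Minimal DFA sketch: fluctuation exponent from the slope of log F(s) vs log s.
dfa_exponent <- function(z, scales = unique(round(10^seq(1, 2, length.out = 10)))) {
  y <- cumsum(z - mean(z))                       # integrated profile of the series
  Fs <- sapply(scales, function(s) {
    nboxes <- floor(length(y) / s)
    rms <- sapply(seq_len(nboxes), function(b) {
      seg <- y[((b - 1) * s + 1):(b * s)]
      t   <- seq_len(s)
      sqrt(mean(residuals(lm(seg ~ t))^2))       # detrended fluctuation within the box
    })
    mean(rms)                                    # fluctuation function F(s)
  })
  unname(coef(lm(log(Fs) ~ log(scales)))[2])     # DFA fluctuation exponent a
}
z <- as.numeric(arima.sim(list(ar = 0.8), n = 500))  # stand-in series
dfa_exponent(z)
```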
Variance
Slow return rates back to a stable state close to a transition also can make the system state drift widely around the stable state. Moreover, strong disturbances potentially can push the system across boundaries of alternative states – a phenomenon termed flickering. Both slowing down and flickering will cause variance to increase prior to a complete transition [6]. Variance is the second moment around the mean μ of a distribution; it serves as an early warning when measured either as the standard deviation, SD = √[(1/n) Σ (zt − μ)²], or as the coefficient of variation, CV = SD/μ [28].
Skewness and Kurtosis
In some cases, disturbances push the state of the system towards values that are close to the boundary between the two alternative states. Because the dynamics at the boundary become slow [6], we may observe a rise in the skewness of a time series: the distribution of the values in the time series becomes asymmetric [29]. Just like variance, skewness can also increase because of flickering [6]. Skewness is the standardized third moment around the mean of a distribution, given by γ = [(1/n) Σ (zt − μ)³]/σ³. Note that skewness may increase or decrease, depending on whether the transition is towards an alternative state that is larger or smaller than the present state.
Flickering or strong perturbations also make it more likely that the state of a system reaches extreme values close to a transition. Such effects can lead to a rise in the kurtosis of a time series prior to the transition [25]; the distribution may become ‘leptokurtic’: the tails of the time series distribution become fatter due to the increased presence of rare values in the time series. Kurtosis is the standardized fourth moment around the mean of a distribution, estimated as κ = [(1/n) Σ (zt − μ)⁴]/σ⁴.
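The moment-based indicators of the preceding paragraphs reduce to sample moments computed within each rolling window; a minimal sketch, with z standing in for one window of data, is:

```r
# Minimal sketch: variance-, skewness- and kurtosis-based indicators for one window.
z  <- as.numeric(arima.sim(list(ar = 0.8), n = 500))  # stand-in for one rolling window
m  <- mean(z); s <- sd(z)
cv <- s / m                      # coefficient of variation
sk <- mean((z - m)^3) / s^3      # skewness: standardized third moment
ku <- mean((z - m)^4) / s^4      # kurtosis: standardized fourth moment
```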
Conditional heteroskedasticity
Another measure of change in the pattern of variability in a time series is conditional heteroskedasticity. Conditional heteroskedasticity means that variance at one time step has a positive relationship with variance at one or more previous time steps. This implies that periods of high variability will tend to follow periods of high variability and periods of low variability will tend to follow periods of low variability [30], [31]. As variability tends to increase prior to a transition, conditional heteroskedasticity can serve as a leading indicator because the portion of a time series near an impending shift will appear as a cluster of high variability while the portion of the time series away from the shift will appear as a cluster of low variability [32]. Conditional heteroskedasticity is based on a Lagrange multiplier test [30], [31], which is calculated by first extracting the residuals of a model fitted to the time series (usually an autoregressive model whose order is selected by a measure of relative goodness of fit, e.g. the Akaike Information Criterion); the residuals are then squared and regressed on themselves lagged by one time step. A positive slope of this linear regression suggests conditional heteroskedasticity. The coefficient of determination r² of the regression is compared with a χ² distribution with one degree of freedom to assign significance; the χ² value can be divided by the sample size to make it directly comparable to the r² value.
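A minimal sketch of this test in base R follows; the candidate AR orders (1 to 5), the stand-in series, and the use of n·r² as the Lagrange multiplier statistic are assumptions, and the earlywarnings package provides its own implementation.

```r
# Minimal sketch: Lagrange multiplier test for conditional heteroskedasticity.
z    <- as.numeric(arima.sim(list(ar = 0.8), n = 500))                          # stand-in for one rolling window
p    <- which.min(sapply(1:5, function(k) AIC(arima(z, order = c(k, 0, 0)))))   # AR order selected by AIC
res2 <- as.numeric(residuals(arima(z, order = c(p, 0, 0))))^2                   # squared residuals
fit  <- lm(res2[-1] ~ res2[-length(res2)])                                      # regress on their own lag
r2   <- summary(fit)$r.squared
p_value <- 1 - pchisq((length(res2) - 1) * r2, df = 1)                          # n * r^2 compared to chi-square(1)
```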
BDS test
The BDS test (after the initials of W. A. Brock, W. Dechert and J. Scheinkman) detects nonlinear serial dependence in time series [33]. The BDS test was not developed as a leading indicator, but it can help to avoid false detections of critical transitions due to model misspecification. After detrending (or first-differencing) to remove linear structure from the time series by fitting a linear model (e.g. ARMA(p,q), ARCH(q) or GARCH(p,q) models), the BDS test evaluates the null hypothesis that the remaining residuals are independent and identically distributed (i.i.d.) [10]. Rejection of the i.i.d. hypothesis implies that there is remaining structure in the time series, which could include a hidden nonlinearity, hidden nonstationarity, or another type of structure missed by detrending or model fitting. As critical transitions are considered to be triggered by strong nonlinear responses, the BDS test is expected to reject the i.i.d. hypothesis in the residual time series from a system that is approaching a critical transition. The BDS test can be helpful as an ad-hoc diagnostic test to detect nonlinearities in time series prior to transitions: if the BDS test rejects the i.i.d. hypothesis and there is another strong leading indicator, then the detected early warning is less likely to be a false positive.
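As a hedged sketch, the BDS test is available in the R package tseries; the call below (on AR(1) residuals, with ε set to fractions of the residual standard deviation as in Table 2) is an assumed usage, not the exact bootstrap procedure of the worked example below.

```r
# Minimal sketch: BDS test on AR(1) residuals (requires the tseries package).
library(tseries)
z   <- as.numeric(arima.sim(list(ar = 0.8), n = 1000))        # stand-in for a log-transformed series
res <- as.numeric(residuals(arima(z, order = c(1, 0, 0))))    # remove linear AR(1) structure
bds.test(res, m = 3, eps = c(0.5, 0.75, 1) * sd(res))         # embedding dimensions 2 and 3
```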
Model-based Indicators
Nonparametric drift-diffusion-jump models (DDJ models)
Often we do not know the underlying processes that generate the time series that we are analyzing for early warnings. Nonparametric drift-diffusion-jump models address this problem by fitting a general model that can approximate a wide range of nonlinear processes without the need to specify an explicit equation. Drift measures the local rate of change. Diffusion measures relatively small shocks that occur at each time step. Jumps are large intermittent shocks. Total variance combines the contributions of diffusion and jumps.
The approach is to estimate terms of a drift-diffusion-jump model as a surrogate for the unknown data generating process [16]:
dxt = f(xt, θt)dt + g(xt, θt)dWt + dJt (2)
Here x is the state variable, f(·) and g(·) are nonlinear functions, dW is white noise, and J is a jump process. Jumps are large, one-step, positive or negative shocks that are uncorrelated in time. Equation 2 is assumed to be subject to a critical transition at a critical parameter value, just as in equation 1. We assume that xt can be observed at discrete intervals of time Δt that can be short, i.e. very high-frequency observations are possible.
The data-generating process (eq 2) is unknown in the sense that the expressions for f(·) and g(·) are not known, θt is neither known nor measured, the critical value θc where x undergoes a catastrophic change is not known, and the parameters of the jump process are not known. From the time series, however, we can estimate drift, diffusion and jump statistics that may serve as leading indicators of the transition. We do this by assuming that high-frequency observations of the system in equation 2 can be approximated by fitting the drift-diffusion-jump model
dxt = μ(xt−)dt + σD(xt−)dWt + dJt (3)
In this fitted model (eq. 3), the drift, diffusion, and jump functions track the slow and unknown changes in θt. The drift function μ(·) measures the instantaneous deterministic change in the time series. The diffusion function σD(·) measures the standard deviation of relatively small shocks that occur at each time step. Jumps, the last term of equation 3, are relatively large shocks that occur intermittently. Jumps are characterized by an average magnitude σJ (with jump sizes Zn ∼ N(0, σJ²)), and the probability of a jump arriving in a small time increment dt is λ(xt, θt)dt. The subscript t− in μ(·) and σD(·) indicates that these functions are evaluated just before the time step. In practice, the drift, diffusion, and jump functions are estimated using nonparametric regression [34], [35]. The regression yields estimates of the drift μ(x), the total variance σtot²(x), and the jump intensity λ(x); the diffusion variance is then given by σD²(x) = σtot²(x) − λ(x)σJ²(x), where σJ²(x) is the jump-variance function. In addition, we can estimate the conditional variance of x using standard nonparametric regression techniques. This conditional variance rises to infinity at a critical point caused by bifurcation in f(·), g(·) or both. The conditional variance function can be estimated as the difference between the second conditional moment and the square of the first conditional moment, Var(xt+Δt | xt) = E[xt+Δt² | xt] − (E[xt+Δt | xt])² [16], [36]. An interesting feature of the drift-diffusion-jump model is that conditional variance and diffusion estimates may be useful for distinguishing bifurcations that occur in the drift from bifurcations that occur in the diffusion (so-called noise-induced transitions: an abrupt shift in the shape of the stationary distribution as in [37]). A bifurcation in the drift only may be indicated in advance by conditional variance but not diffusion. A bifurcation in the diffusion may be indicated by increases in both conditional variance and diffusion.
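As one piece of this machinery, the conditional variance Var(xt+Δt | xt) can be sketched with a simple kernel regression; ksmooth and the bandwidth choice below are stand-ins for the nonparametric regression of [16], and x_obs is an assumed observed series.

```r
# Minimal sketch: conditional variance as second conditional moment minus squared first moment.
x_obs <- as.numeric(arima.sim(list(ar = 0.9), n = 1000))  # stand-in for an observed series
x0 <- x_obs[-length(x_obs)]                               # x_t
x1 <- x_obs[-1]                                           # x_{t+1}
bw <- diff(range(x0)) / 10                                # kernel bandwidth (assumption)
m1 <- ksmooth(x0, x1,   kernel = "normal", bandwidth = bw, x.points = sort(x0))  # E[x_{t+1} | x_t]
m2 <- ksmooth(x0, x1^2, kernel = "normal", bandwidth = bw, x.points = sort(x0))  # E[x_{t+1}^2 | x_t]
cond_var <- m2$y - m1$y^2                                 # conditional variance along sorted x_t
```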
Time-varying AR(p) models
Time-varying autoregressive models provide a model-based approach for estimating time-dependent return rates in time series [38], which as we noted in the earlier section can act as an early warning of a critical transition. In time-invariant AR(p) models, the inverse of the characteristic root, λ, of a fitted AR(p) model [39] is similar in magnitude to the dominant eigenvalue of the Jacobian matrix computed at a stationary point of a deterministic discrete-time model [18]. Values of λ near zero imply that the state variable returns rapidly towards the mean; this central tendency diminishes as values approach one [22].
Time-varying AR(p) models assume that the coefficients of the AR(p) model can change through time, thereby allowing estimation of the time-dependent characteristic root as it varies along a time series up to a transition [38]. The general form of time-varying AR(p) models is
x(t) = b0(t) + b1(t)x(t−1) + … + bp(t)x(t−p) + ε(t) (4a)
bi(t) = bi(t−1) + φi(t), i = 0, 1, …, p (4b)
Equation 4a is a standard AR(p) model with coefficient b0 determining the mean of the time series, autoregressive coefficients bi determining the dynamics around the mean, and ε(t) giving the environmental variability associated with changes in the state variable; ε(t) is assumed to be a Gaussian random variable with mean zero and variance σ²ε. Equation 4b allows the coefficients bi to vary as random walks, with rates dictated by the variances σ²i of φi(t).
We incorporate measurement error using the measurement equation
x*(t) = x(t) + α(t) (5)
in which x*(t) is the observed value of the state variable, x(t) is the “true” modeled value, and α(t) is a Gaussian random variable with mean zero and variance σ²α. This makes it possible to factor out measurement error that could potentially obscure underlying dynamical patterns [38].
Together, equations 4a and 4b are a state-space model that can be fit using a Kalman filter [40]. Although we present the model assuming that data are sampled at equidistant points through time, the state-space structure allows for missing points. Fitting with a Kalman filter gives maximum likelihood parameter estimates, and likelihood ratio tests (LRT) can be used for statistical inference about the parameter estimates. Likelihood-based model selection such as Akaike’s Information Criterion (AIC) can also be used [38]. Because the variance components of the model, σ²i, are constrained to be non-negative, a standard LRT is overly conservative; the calculated P-values are too large, leading to acceptance of the null hypothesis that σ²i = 0 even when it is false. To correct for this, the LRT can be performed using the result that twice the difference in log likelihoods between models differing by q in the number of variance terms σ²i they contain is asymptotically distributed as a 50:50 mixture of χ²(q−1) and χ²q [41], [42]. Therefore, the corrected P-value is the average of the P-values calculated from the two χ² distributions. Since P(χ²(q−1) > x) is less than P(χ²q > x), this always leads to lower P-values than would be obtained from a standard LRT based on χ²q alone.
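A minimal sketch of this corrected P-value calculation (the 50:50 mixture of χ² distributions, with χ²0 taken as a point mass at zero) is:

```r
# Minimal sketch: P-value from a 50:50 mixture of chi-square(q-1) and chi-square(q).
mixture_lrt_p <- function(lr_stat, q) {
  p_q   <- 1 - pchisq(lr_stat, df = q)
  p_qm1 <- if (q > 1) 1 - pchisq(lr_stat, df = q - 1) else as.numeric(lr_stat <= 0)  # chi-square(0): point mass at 0
  0.5 * (p_qm1 + p_q)
}
mixture_lrt_p(37.1, 1)   # e.g. the time-varying AR(1) comparison reported in the Results
```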
Threshold AR(p) models
As described above, flickering occurs when a time series repeatedly crosses the domains of attraction of two alternative states. Identifying flickering can serve as an early warning for a permanent shift to an alternative state [6]. The difficulty lies in robustly estimating that a time series is jumping among two (or more) distinct states. Threshold AR(p) models are designed to identify these occasional transitions [38]. These models assume there are two underlying processes governing the dynamics in a time series, with the possibility that the state variable switches between them when it crosses a threshold. The two processes are described by two AR(p) models
x(t) = b0 + b1x(t−1) + … + bpx(t−p) + ε(t), if x(t−1) ≤ c (6a)
x(t) = b0′ + b1′x(t−1) + … + bp′x(t−p) + ε(t), if x(t−1) > c (6b)
where bi and bi′ (i = 0, …, p) denote separate sets of coefficients and c is the threshold. As with the time-varying AR(p) models (eqs 4), equation 5 is used to incorporate measurement error, and the Kalman filter is used to compute likelihoods in eqs 6, which in turn can be used for parameter estimation and model selection. In addition to the two sets of autoregression parameters bi and bi′, the parameters to be estimated are the threshold c and the variance of the process error σ²ε.
Potential analysis
An alternative way of probing the existence of alternative regimes in a time series is potential analysis. Just like threshold AR(p) models, this method in essence identifies flickering and serves as warning of the existence of alternative states. Potential analysis [43], [44] is a technique for deriving the shape of the underlying potential of a system. Potential analysis assumes that a time series may be approximated by a stochastic potential equation
dz = −(dU/dz)dt + σ dW (7)
where U(z) is a polynomial potential of even order (2nd order for a one-well potential, 4th order for a double-well potential, etc.), and dW is a white noise process of unit variance with intensity σ. The order of the best-fit polynomial in essence reflects the number of potential system states identified along the time series [43], [44].
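The idea can be sketched as follows: for equation 7 the stationary distribution satisfies p(z) ∝ exp(−2U(z)/σ²), so an empirical (unscaled) potential is minus the log of a density estimate and the number of wells is the number of its local minima. This is only an illustration of the principle, not the implementation of [43], [44]; the kernel density estimate and the bimodal stand-in data are assumptions.

```r
# Minimal sketch: count potential wells from an empirical potential U(z) ~ -log(density).
z <- c(rnorm(500, mean = 0), rnorm(500, mean = 4))  # stand-in series flickering between two states
d <- density(z, n = 256)                            # kernel density estimate of the states
U <- -log(d$y + 1e-12)                              # empirical potential (up to scale and offset)
n_wells <- sum(diff(sign(diff(U))) > 0)             # local minima of U = number of detected states
```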
Threshold AR(p) models and potential analysis are not, strictly speaking, early warnings for critical transitions, as flickering implies that the system already has undergone repeated state changes. Nonetheless flickering detection methods can robustly indicate the presence of alternative regimes during the period that the system has not permanently shifted to the alternative attractor.
Datasets
We applied all methods to simulated time series - in which we are certain that a critical transition was crossed - rather than to real-world time series, in order to illustrate the application of the methods across identical datasets. There are few available real-world time series that exhibit transitions, and for most of them there is no clear evidence that the transition is of the critical type we treat here. Thus, for the illustrative purposes of our methodological paper, simulated datasets allowed us to compare the methods independently of the uncertainties about the presence of a critical transition, data limitations, or insufficient data resolution that are common in empirical time series.
The two time series used were generated by a well-studied ecological model that describes the shift of a harvested resource to overexploitation [45], [46]. In the model, resource biomass x grows logistically and is harvested according to
dx = [r x (1 − x/K) − c x²/(x² + h²)] dt + σ x dW (8)
where r is the growth rate, K is the population’s carrying-capacity, h is the half-saturation constant, c is the grazing rate and dW is a white noise process with intensity (σx)2/dt. In the deterministic case, when c reaches a certain threshold value (c ≈ 2.604), the ecosystem undergoes a critical transition to overexploitation through a fold bifurcation (Fig. 1A).
We simulated time series for two cases. In the first case (which we henceforth call the critical slowing down or CSD dataset), we increased grazing rate c linearly in 1,000 time steps from 1 to 2.6771 (just after the bifurcation). At approximately time step 970 the system shifted to overexploitation (Fig. 1B). Parameter values used were r = 1, h = 1, K = 10, σ = 0.03. The values were not parameterized for specific cases, but are similar to ones typically used in the literature (e.g. [45], [47], [48]). In the second case (which we henceforth call the flickering dataset), we again increased grazing rate c linearly from 1 to 2.6771 but in 10,000 time steps (Fig. 1C). In the flickering dataset, we additionally assumed a small time-correlated inflow i of resource biomass that was generated by a simple equation for red noise scaled to the resource biomass x [49]: it+1 = (1 − 1/T) it + β ηt, where T is a parameter that represents the time scale over which noise becomes uncorrelated ( = 20), and β the standard deviation ( = 0.07) of the normally distributed error term ηt. Parameter values used were r = 1, h = 1, K = 10, σ = 0.15. For both scenarios we also included measurement error in the derived time series, xobs,t = xt + σobserr εt, where σobserr is the standard deviation of the normally distributed error term εt. We used σobserr = 0.1 for both the CSD and flickering datasets.
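A minimal Euler-Maruyama sketch of the CSD dataset (equation 8 with a linearly increasing grazing rate and added measurement error) is given below; the time step of one unit and the initial condition are assumptions.

```r
# Minimal sketch: Euler-Maruyama simulation of eq. 8 with a slowly increasing grazing rate c.
set.seed(1)
r <- 1; K <- 10; h <- 1; sigma <- 0.03
n  <- 1000; dt <- 1                        # one observation per unit time (assumption)
cc <- seq(1, 2.6771, length.out = n)       # slow linear increase of the driver
x  <- numeric(n); x[1] <- 8                # start near the underexploited state (assumption)
for (t in 1:(n - 1)) {
  drift <- r * x[t] * (1 - x[t] / K) - cc[t] * x[t]^2 / (x[t]^2 + h^2)
  x[t + 1] <- max(x[t] + drift * dt + sigma * x[t] * sqrt(dt) * rnorm(1), 0)
}
x_obs <- x + 0.1 * rnorm(n)                # add measurement error (sigma_obserr = 0.1)
```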
All simulated time series were produced in MATLAB R2011a using the software package GRIND (freely available at http://www.aew.wur.nl/UK/GRIND/). The estimation of the leading indicators was performed in R v.2.12.0 (http://www.r-project.org/) using R package earlywarnings (that can be downloaded at http://earlywarnings.r-forge.r-project.org/), except for the DFA and potential analysis, which were performed in MATLAB R2011a using Fortran and C computational kernels with shell scripts (can be obtained from VNL V.Livina@uea.ac.uk), and the time-varying AR(p) and threshold AR(p) models that were performed in MATLAB R2011a (and are available as supporting information in [38]). Further worked out examples can be also found at http://www.early-warning-signals.org.
Results
We present the results assuming that the only information available to a practitioner is a time series derived from a system that may be approaching a critical transition. The analysis is presented as a step-by-step procedure that starts with the preparation of the simulated time series (steps 1, 2), followed by the estimation of the leading indicators (step 3) and the testing of their sensitivity (step 4) and significance (step 5).
Step 1 Preprocessing
To sensibly apply leading indicators, we first selected the part of the time series that preceded the potential transition. For most methods the estimation of the indicators takes place within rolling windows of predetermined size up to the end of the time series prior to the transition. We selected data up to time-step 970 in the CSD dataset (Fig. 1B). We used the whole time series of the ‘flickering’ dataset, as it was difficult to clearly identify when the transition took place. We ensured that there were no missing values and that all data were equally spaced in time (i.e. a regular time series). Regular time series are especially important for leading indicators such as autocorrelation that estimate memory in a time series. Interpolation can solve issues of missing values and irregular time series, but it can also produce spurious correlations, so interpolated records should be checked against the original time series to ensure that the density of interpolated points is constant along the record [8]. Alternatively, points can be dropped to obtain a regular time series. However, all the methods we used in this paper can be applied to irregular as well as regular time series.
Equally important is the frequency of observations, that is, the time interval between values in the time series. In many cases data are recorded at different frequencies from the ones needed for the methods we illustrate. In principle, one needs data that are sampled at intervals shorter than the characteristic time scale of the slowest return rate of the system, especially when measuring indicators of critical slowing down [19], [50]. Averaging within non-overlapping windows of a given length produces records at longer time scales that may match the underlying dynamics of interest in the studied system [19], [27]. Choosing the length of the window to aggregate, however, depends on a fairly deep understanding of the dynamics of the system. In addition, aggregation may also solve the issue of missing values, although at the cost of losing data. Here, we did not need to aggregate our datasets because both were sampled at time scales that represented the characteristic time scale of the system we simulated.
We also transformed data where necessary. For example, we log-transformed (using log(z+1)) and in some cases also standardized [(zt − μ)/σ] the flickering dataset, because of the presence of values close to zero or extreme values, respectively. We checked that data transformations did not fundamentally change the distribution of the original data, as it is exactly the deviations from a constant normal distribution that the early warnings are sensitive to.
Step 2 Filtering-detrending
Non-stationarities in the mean of the time series can cause spurious indications of impending transitions, especially for the metrics that are estimated within rolling windows. Additionally, time series may be characterized by strong seasonal periodicities, which, if not removed, impose a strong correlation structure on the time series. For all metrics that were estimated within rolling windows, we removed trends or filtered out high frequencies using Gaussian smoothing (autocorrelation, variance, skewness), simple linear detrending (DFA), or by fitting linear autoregressive models (conditional heteroskedasticity). When applying these or any other type of detrending or filtering (i.e. first-differences, removing running means, loess smoothing), care should be taken not to over-fit or filter out the slow dynamics (of interest) from the dataset [8]. Alternatively, one could also detrend within the rolling windows rather than over the entire dataset. Lenton et al. [27] have shown that results from the two approaches do not significantly differ.
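A minimal sketch of the Gaussian detrending step, using base R's ksmooth on the simulated series x_obs from the Datasets section (the 10% bandwidth follows the Results), is:

```r
# Minimal sketch: remove the slow trend with a Gaussian kernel smoother and keep the residuals.
bw    <- 0.1 * length(x_obs)                                   # bandwidth = 10% of series length
trend <- ksmooth(seq_along(x_obs), x_obs, kernel = "normal",
                 bandwidth = bw, x.points = seq_along(x_obs))$y
resid_series <- x_obs - trend                                  # residuals used by the rolling-window metrics
```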
Step 3 Probing the Signals: Metric-based Indicators
Autocorrelation, variance and skewness
We estimated autocorrelation, variance (as standard deviation), and skewness within rolling windows half the size of the datasets (window size of 485 points for the CSD dataset and 5,000 points for the flickering dataset) (Fig. 2). We did that after detrending the CSD dataset using Gaussian smoothing with a bandwidth of 10% of the time series length (Fig. 2A). We used a sliding (overlapping) moving window based on the idea that indicators should be estimated as data become available. Using nonoverlapping moving windows, however, would give similar results [27]. Autocorrelation at-lag-1 increased almost linearly up to the transition with a strong trend as estimated by Kendall’s τ (rank correlation) both for the original (τ = 0.911) and the residual (after detrending) datasets (τ = 0.944) (Fig. 2E). Standard deviations also increased in both original and detrended records as expected (Fig. 2G), while skewness generally decreased (τ = −0.436 for the original data, τ = −0.475 for the residuals after detrending), but in a somewhat irregular fashion (Fig. 2I). All indicators behaved according to our expectations for systems gradually approaching a critical transition, as may be seen in detail for all rolling window metrics associated with critical slowing down in Figures S1, S2 in the Supporting Information.
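A minimal rolling-window sketch (using resid_series from the detrending step above; the half-length window and Kendall's τ follow the text, everything else is an assumption) is:

```r
# Minimal sketch: rolling-window autocorrelation, standard deviation and skewness with trend estimates.
win <- floor(length(resid_series) / 2)                       # window of half the series length
idx <- seq_len(length(resid_series) - win + 1)
ar1  <- sapply(idx, function(i) acf(resid_series[i:(i + win - 1)], lag.max = 1, plot = FALSE)$acf[2])
sdev <- sapply(idx, function(i) sd(resid_series[i:(i + win - 1)]))
skew <- sapply(idx, function(i) { w <- resid_series[i:(i + win - 1)]; mean((w - mean(w))^3) / sd(w)^3 })
tau_ar1  <- cor(idx, ar1,  method = "kendall")               # Kendall's tau of each indicator trend
tau_sd   <- cor(idx, sdev, method = "kendall")
tau_skew <- cor(idx, skew, method = "kendall")
```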
We estimated the same indicators for the flickering dataset on raw and log-transformed and standardized data (Fig. 2B, D). Autocorrelation (Fig. 2F) and skewness (Fig. 2J) increased, whereas standard deviation increased up to near time-step 8,000, after which it started to decline (Fig. 2H). In the flickering dataset, as the system was approaching the transition, excursions to the alternative attractor became more frequent (after time-step 2,000; Fig. 2B). The time series consisted of segments belonging to one or the other state (Fig. 1A). Autocorrelation was close to 1 and increased weakly (Fig. 2F). Progressively, segments belonging to the overexploited state became longer. As a result, standard deviation increased, but only up to the point where frequent transitions across the two attractors persisted (approx. up to time-step 8,000). After this point, the standard deviation decreased as only few points belonged to the underexploited state. Standardizing the data did not change the declining trend towards the end of the dataset, but only reduced its magnitude (Fig. 2H). The same few excursions to the underexploited state in the last part of the time series were responsible for the rise in skewness.
Autocorrelation at-lag-1 captured in a parsimonious way the changes in the correlation properties of a time series approaching a transition with respect to critical slowing down. A more complete picture of the changes in the spectral properties of the two datasets was also obtained by estimating the full variance spectrum using wavelet analysis (Fig. S3, S4 in the Supporting Information).
Detrended fluctuation analysis
The DFA indicator signaled an increase in the short-term memory for both datasets (Fig. 3B, D). It was estimated in rolling windows of half the size of the original record after removing a simple linear trend for both datasets. Despite oscillations, we could quantify its trend using Kendall’s τ. The values of the DFA indicator suggested that the CSD dataset was approaching the critical value of 1 (the transition), whereas in the flickering dataset it hovered just below and above 1, implying that the latter system had exceeded the critical point and was nonstationary. These values resembled the approaching-1 (Fig. 2E) and close-to-1 (Fig. 2F) values of autocorrelation at-lag-1.
Conditional heteroskedasticity
Conditional heteroskedasticity (CH) was estimated in rolling windows of 10% the size of the time series (Fig. 4). Within each rolling window we fit an autoregressive model selected using AIC from a suite of AR(p) models applied to the original data (Fig. 4A, B). Although measurement and process error remained constant in our datasets, we chose a relatively small rolling window size to minimize the chance of estimating an artificially large CH caused by increasing noise along the time series. We found significant CH (at P = 0.1) along the CSD dataset, which became consistently significant at the last part of the record (close to the transition) (Fig. 4C). In the flickering dataset, CH was always significant and its value even showed an increasing trend towards the end of the record (Fig. 4D).
BDS test
We removed the underlying linear structure by first-differencing, fitting an AR(1) model, or fitting a GARCH(0,1) model to the entire datasets after log-transforming. The remaining detrended data or the residuals were used to estimate the BDS statistic for embedding dimensions 2 and 3, and for ε values 0.5, 0.75, and 1 times the observed standard deviation of the time series (Table 2). For each case, the significance of the BDS statistic was calculated using 1,000 bootstrap iterations. Results for both datasets showed significant BDS tests based on bootstrapping (Table 2). The only exception was the case of the residuals from the GARCH(0,1) model with embedding dimension 2 in the flickering dataset (Table 2). Thus, in general, the BDS statistic provided strong evidence for nonlinearity. In principle, we could have also applied the BDS statistic within rolling windows to flag a potentially increasing nonlinearity in a time series that is approaching a transition. However, when we tested this hypothesis, we did not get consistent results (not shown). The fact that the BDS test requires a large number of observations for a reliable estimate and that it is sensitive to data preprocessing and filtering choices are the main reasons that limit its use as a rolling window metric.
Table 2. BDS statistic estimated on the critical slowing down and flickering datasets with measurement error.
| Dataset | Residuals from | Embedding dimension | ε = 0.5 SD | ε = 0.75 SD | ε = 1 SD |
|---|---|---|---|---|---|
| critical slowing down | first-difference detrending | 2 | 9.434* | 9.013* | 8.424* |
| critical slowing down | first-difference detrending | 3 | 8.346* | 8.042* | 7.497* |
| critical slowing down | AR(1) | 2 | 9.499* | 8.911* | 8.462* |
| critical slowing down | AR(1) | 3 | 8.379* | 7.639* | 7.307* |
| critical slowing down | GARCH(0,1) | 2 | 6.748* | 6.343* | 5.605* |
| critical slowing down | GARCH(0,1) | 3 | 6.089* | 5.469* | 4.802* |
| flickering | first-difference detrending | 2 | 16.033* | 16.33* | 16.754* |
| flickering | first-difference detrending | 3 | 17.599* | 17.821* | 18.039* |
| flickering | AR(1) | 2 | 15.476* | 15.866* | 16.332* |
| flickering | AR(1) | 3 | 16.999* | 17.304* | 17.577* |
| flickering | GARCH(0,1) | 2 | 1.087 | 0.974 | 0.820 |
| flickering | GARCH(0,1) | 3 | 3.472** | 3.389** | 3.155** |

*P<0.001, **P = 0.001.
In all cases, the BDS test significantly identified nonlinearity after 1,000 bootstrap iterations, except for the GARCH(0,1) residuals from the flickering dataset at embedding dimension 2.
Step 3 Probing the Signals: Model-based Indicators
Nonparametric drift-diffusion-jump models
The nonparametric DDJ model was not applied on rolling windows, but to the entire time series after log-transforming the data. We found an increase in conditional and total variance as well as in jump intensity in the CSD dataset (Fig. 5B, C, E) and a decrease in the diffusion term (Fig. 5D). The trends were noisy, but they became very clear when plotted against biomass values (due to smoothing) (Fig. 5F–I). For log-transformed values between 1.6 and 1.8, the indicators started to signal the upcoming transition. In the flickering dataset the indicators were very noisy and quite uninformative when plotted against time (Fig. 6B–E). However, after time-step 2,000, conditional variance, total variance, and jump intensity peaked and fluctuated between their maximum and minimum values. When we plotted the indicators versus biomass, the nonparametric variance-related functions (Fig. 6F, G, I) increased as biomass declined from 2 to 0. These values corresponded roughly to the limit between the two alternative states (log biomass of zero and 2) (Fig. 6A). This example shows that plotting nonparametric indicators versus the monitored variable may be more informative than plotting indicators over time.
Time-varying AR(p) models
We fitted time-varying AR(p) models with p = 1, 2, and 3 to the CSD dataset after log-transforming and standardizing the data. For all cases, we computed time-varying AR(p) models for which only the mean, b0, was allowed to vary through time and compared them to AR(p) models for which both the mean and the autoregressive coefficients (bi, i ≥ 1) were allowed to vary with time. The log-likelihood ratio test (LRT) indicated that the models with varying autoregressive coefficients were significantly better than the mean-varying-only models (χ²0+χ²1 mixture, LRT = 37.1, P<0.0001 for AR(1); χ²1+χ²2, LRT = 44.3, P<0.0001 for AR(2); and χ²2+χ²3, LRT = 46.1, P<0.0001 for AR(3)). Comparing across models, the best fit was derived with the time-varying AR(1) model (ΔAIC = 2.2758 and 0.8059 for p = 2 and 3, respectively) (Fig. 7A); the difference in the AIC between the time-varying AR(1) and AR(3) models, however, was small (Fig. 7B). We therefore computed the inverse of the characteristic root λ of both time-varying AR(1) and AR(3) models at each point in the time series from the estimates of their autoregressive coefficients bi(t) (Fig. 7C, D). Values of λ approaching 1 imply critical slowing down, while values of λ>1 imply loss of stationarity. We found a clear increasing trend in λ (τ = 0.736) in the case of the time-varying AR(1) model (Fig. 7C), as the time series approached the transition. The trend in λ for the time-varying AR(3) model was weaker (τ = 0.164), less smooth, and in some cases exceeded 1, indicating strong excursions to nonstationarity (Fig. 7D). This suggests that the results of fitting time-varying AR(p) models might be clearer if simpler models (with lower p) are used.
Threshold AR(p) models
We fitted the threshold AR(p) model only to the flickering dataset, as the method was developed to detect transitions in time series that jump between multiple states (Fig. 1C) [38]. The threshold AR(p) model was applied on log-transformed and standardized data. To simplify the analysis, we used only a subset of the original dataset, specifically observations between time step 7,200 and 7,700 (n = 500 points) (Fig. 8). We assumed that the time series was produced by two AR(p) processes of the same order. We tested orders of p = 1, 2, and 3 and found that the best-fitting model was an AR(3), with less-good fits for p = 1 (ΔAIC = 36.67) and p = 2 (ΔAIC = 1.75). The fit of the threshold AR(3) model was significantly better than the fit of a simple AR(3) (χ²4+χ²5 mixture, LRT = 27.79, P<0.0001). The same comparison was similarly significant for the AR(1) (χ²2+χ²3, LRT = 18.07, P<0.0004) and AR(2) (χ²3+χ²4, LRT = 20.88, P<0.0003) models (Fig. 8). The consistent results from the fitted threshold AR(p) models confirmed that the dataset was characterized by two distinct states, which suggests that in the future the system may eventually stabilize in the alternative state.
Potential analysis
In contrast to the threshold AR(p) model fitting, potential analysis was performed within rolling windows of different sizes (ranging from 10 points to half the size of the dataset). We applied it on untransformed data for both the CSD and flickering datasets (Fig. 9). In the CSD dataset, we found that the method detected predominantly one state along the entire time series regardless of window size (red color, Fig. 9A), but, interestingly, it also identified two states, especially for large rolling windows (green color, Fig. 9A). In the flickering dataset, one state was largely identified for most of the time series, except for the last 2,000 points, when multiple states were identified (Fig. 9B). Such a high number of detected states meant that, in principle, the data were on the edge of having no clear potential.
Step 4 Sensitivity Analysis
The utility of each of the leading indicators depends on the characteristics of the particular datasets we explored, and the specific choices made when performing the analyses, e.g., data transformations or detrending/filtering. Thus, it is necessary to check the robustness of our results to such choices. Here we did this for autocorrelation, standard deviation and skewness in the CSD dataset to illustrate that assumptions over specific parameters in the estimation of leading indicators need to be accompanied by a sensitivity analysis. In particular, we investigated the robustness of our rolling window metric results to the size of rolling windows and the degree of smoothing (filtering bandwidth). For this, we estimated autocorrelation, standard deviation and skewness in window sizes ranging from 25% to 75% of the time series length in increments of 10 points, and for bandwidths ranging from 5 to 200 in increments of 20 [8]. We quantified trends for all combinations of these two parameters using Kendall’s τ - although other quantifications of the trends can also be used. It is important to note that increasing but oscillating trends in the indicators can produce weak or even negative τ’s, and thus special care should be taken in the interpretation of the results of the sensitivity analysis.
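A minimal sketch of such a sensitivity analysis for autocorrelation at-lag-1 (looping over window fractions and smoothing bandwidths on the simulated series x_obs; the grids and the Gaussian smoother are assumptions) is:

```r
# Minimal sketch: Kendall's tau of the lag-1 autocorrelation trend over a grid of window sizes and bandwidths.
window_fracs <- seq(0.25, 0.75, by = 0.1)
bandwidths   <- seq(5, 200, by = 20)
tau_grid <- outer(window_fracs, bandwidths, Vectorize(function(wfrac, bw) {
  trend <- ksmooth(seq_along(x_obs), x_obs, kernel = "normal",
                   bandwidth = bw, x.points = seq_along(x_obs))$y
  res <- x_obs - trend
  win <- floor(wfrac * length(res))
  idx <- seq_len(length(res) - win + 1)
  ar1 <- sapply(idx, function(i) acf(res[i:(i + win - 1)], lag.max = 1, plot = FALSE)$acf[2])
  cor(idx, ar1, method = "kendall")
}))
```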
We found that autocorrelation at-lag-1 increased rapidly regardless of the bandwidth choice and the size of the rolling window (Fig. 10A, B). We found similarly strong trends for standard deviation, even though negative trends were identified for small bandwidths (Fig. 10C, D). This was probably because small bandwidths over-fit the data and removed most of the variability that the standard deviation was expected to capture. Trends in skewness were weaker, but mostly as expected (Fig. 10E, F). Although such sensitivity plots can guide the selection of the bandwidth and rolling window size that maximize the estimated trend, the specific choices of these two parameters should always be made according to the characteristics of the time series used. For instance, the choice of the rolling window size depends on a trade-off between availability of data and reliability of the estimation of the indicators [8]. We also did a sensitivity analysis of the DFA exponents for both datasets (Fig. 3E, F). The DFA exponent showed strong positive trends for both datasets. A similar sensitivity analysis on the specific parameter choices should be conducted for any leading indicator applied to any time series.
Step 5 Significance Testing
Although sensitivity analysis was important for testing the robustness of our results, it was equally important to test the significance of our results. Significance testing is especially relevant for identifying false positives (or type I errors): that trends in the indicators are not due to random chance. Some of the methods have built-in significance testing procedures (like conditional heteroskedasticity and the BDS test). The model-based indicators also allow for formal significance testing and model selection (e.g., the time-varying and threshold AR(p) models, and the potential analysis). The nonparametric DDJ model can be simulated after fitting to produce pseudo-data in Monte Carlo simulations that can be refitted to compute error estimates for total variance and jump intensity from the ensemble of fits [16].
For the remainder of the rolling window metrics, there is no built-in way to test a null hypothesis. The problem lies in the difficulty of specifying the exact null hypothesis, as it is not clear which particular data generating process could be used as the null model. Here, we suggest that the simplest null hypothesis one could imagine is that the trend estimates of the indicators are due to chance alone. To test this null hypothesis, we produced surrogate datasets to compare trend estimates in the original record with trend estimates obtained from records that have the same correlation structure and probability distribution as the original dataset, but that were produced by linear stationary processes [8]. Surrogate datasets can be obtained by different approaches, including generating data with the same Fourier spectrum and amplitudes [8], [51], or generating data from the simplest fitted linear first-order autoregressive model. Although these are only some of the ways surrogate data can be produced to test for trends [52], we used here a more general approach. We fit the best linear autoregressive moving average model (ARMA(p,q)) based on AIC to the residuals (after detrending/filtering), then generated 1,000 simulated datasets of the same length as the residual time series. For each simulated dataset, we estimated the trend of the rolling window metric (in particular we only tested autocorrelation at-lag-1, standard deviation, and skewness) using Kendall’s τ. We compared the Kendall τ of the original data to the fraction of cases in which it was equal to or smaller than the estimates of the simulated records, P(τ* ≥ τ). We estimated this probability for all combinations of bandwidth and rolling window size as we did for the sensitivity analysis (Fig. 10).
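A minimal sketch of this surrogate test follows; the ARMA order grid, the helper kendall_trend() (standing for whichever rolling-window trend computation is being tested), and tau_observed are hypothetical assumptions.

```r
# Minimal sketch: surrogate-data significance test for an indicator trend.
best <- NULL                                           # pick the best ARMA(p,q) for the residuals by AIC
for (p in 0:2) for (q in 0:2) {
  fit <- try(arima(resid_series, order = c(p, 0, q)), silent = TRUE)
  if (!inherits(fit, "try-error") && (is.null(best) || AIC(fit) < AIC(best))) best <- fit
}
tau_surrogate <- replicate(1000, {                     # 1,000 surrogate series from the fitted null model
  sim <- arima.sim(model = list(ar = coef(best)[grep("^ar", names(coef(best)))],
                                ma = coef(best)[grep("^ma", names(coef(best)))]),
                   n = length(resid_series), sd = sqrt(best$sigma2))
  kendall_trend(as.numeric(sim))                       # hypothetical helper: tau of the rolling indicator
})
p_value <- mean(tau_surrogate >= tau_observed)         # fraction of surrogate trends at least as strong
```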
We found that the increasing trends for autocorrelation at-lag-1 were significant (P<0.025) for any combination of rolling window size and filtering bandwidth (Fig. 11A, B), and P≤0.001 for the parameters we used in Fig. 1. Similar significant trends were estimated for the standard deviation with a few exceptions (Fig. 11C, D, P = 0.073 for original choice of parameters in the CSD dataset). Skewness trends were not significant, however (Fig. 11E, F, P = 0.8 for original choices of CSD dataset).
Whatever statistical testing is used, the conclusions will depend on the specific model chosen either to fit data in the case of model-based approaches, or to produce simulated records for metric-based approaches. Thus, when interpreting significance testing of leading indicators estimates, one needs to take these considerations into account.
Discussion
In this paper we applied a range of proposed early warning signals for critical transitions to two simulated time series. We presented a framework that combines metric-based and model-based indicators applied to time series data to identify an upcoming critical transition (Fig. 12). In line with previous studies [16], [48], [53], we found that there was no single best indicator or method for identifying an upcoming transition. Also, all methods required specific data treatment to yield sensible signals (Table 3). This observation across all methods for the same datasets stresses that a combination of approaches is the best way to determine whether there is a robust signal of an imminent transition in a time series.
Table 3. Rules of thumb for avoiding bottlenecks in detecting early warning signals in time series.
| Method/Indicator | Preprocessing: interpolation | Preprocessing: transformations | Filtering/Detrending | Sensitivity: rolling window size | Sensitivity: filtering | Significance |
|---|---|---|---|---|---|---|
| Autocorrelation at-lag-1 | necessary (1) | depends on data | + | + | + | null model (3) |
| Detrended Fluctuation Analysis | necessary (1) | no | + | + | − (5) | null model (3) |
| Standard deviation | no | depends on data | + | + | + | null model (3) |
| Skewness | no | depends on data | + | + | + | null model (3) |
| Kurtosis | no | depends on data | + | + | + | null model (4) |
| Conditional Heteroskedasticity | no | depends on data | − | + | − | built-in |
| BDS test | no | no | + | − (2) | + | bootstrapping |
| Time-varying AR(p) models | no | log-transform | + | − | − | built-in |
| Nonparametric Drift-Diffusion-Jump models | no | log-transform | − | − | − (4) | MC error estimates |
| Threshold AR(p) models | no | log-transform | − | − | − | built-in |
| Potential Analysis | no | no | n/a | + | − | null model (3) |

(1) Only when there are too many missing values. (2) To be applied only for rolling windows of >500 points. (3) Choice of null model contingent on system. (4) Depends on bandwidth of Gaussian kernel smoother. (5) Only polynomial detrending within rolling window.
[+: sensitive to; −: insensitive to; n/a: not applicable].
We analyzed only time series of a simulated ecological variable (resource biomass); however, our methods can equally be applied to time series representing any other response of interest: biological (e.g. gene expression), climatic (e.g. daily temperature), physiological (e.g. respiratory rhythm), social (e.g. numbers of tweets), or financial (e.g. the price of a stock). In all these cases, if the system in question undergoes a critical transition through a fold bifurcation, we expect the indicators to behave in a similar way as presented here. It is worthwhile testing this expectation on simulated data from such disparate systems, or even testing the indicators for other types of critical transitions than the ones we treated here. The big challenge for the future, though, is to test the indicators on real-world time series. Most studies so far have treated only subsets of indicators on real time series. Using our framework to test indicators on real-world time series will highlight limitations in the application and interpretation of the indicators beyond the ones we presented here. Future work is needed in this direction.
Nonetheless, our framework of combining metric-based and model-based indicators to detect critical transitions is encouraging, as it may reduce the chance of false alarms. For instance, a systematic increase in the external noise over the period leading up to a shift can produce an increase in variance-based indicators [29], but not in memory-based indicators (Table 1). However, cross-validation does not exclude the possibility of ‘missed alarms’ - cases where the indicators will not signal an approaching transition. Missed alarms can occur especially for transitions between attractors induced by major perturbations, or for chaotic dynamics far from local bifurcation points [15]. Importantly, early warnings can only signal an upcoming transition if conditions slowly move the system towards a bifurcation. This excludes their applicability, for instance, to situations in which external forcing changes faster than the response rate of the system [14].
Clearly the possibility of false alarms or missed signals is difficult to eliminate. Even in the case of a simulated time series that is known to be approaching a transition, certain methods may not be very informative [48]. By using single realizations from model-generated time series, we have been able to compare different methods on typical dynamical behaviors that occur before a critical transition. It will be worthwhile to robustly evaluate the performance of the different methods to quantify their reliability in signaling upcoming transitions. This could be done either statistically, by estimating indicators on multiple realizations of model generated time series, or by blind-testing the different methods on multiple datasets (e.g. [54]). Our results caution, however, that in all cases the performance of any method, as well as the interpretations based on them, will strongly depend on the characteristics of the actual time series tested.
In view of the limited scope of generic early warning signals, specific knowledge of the system may be of great use to reduce uncertainty. For instance, information about the noise level can help correct early warning estimates [55], or information on measurement error can be incorporated in the time-varying and threshold AR(p) model-based methods to improve estimation [38]. However, the most important source of information is insight about the drivers (or slow variables) that affect the stability properties of the system. For example, incorporating dynamics of drivers in the general model structure of time-varying AR(p) or Drift-Diffusion-Jump nonparametric model-based methods can greatly improve the estimation of early warnings. In other cases, information on drivers may offer evidence in support of concordant indicators, or can help explain why different indicators give different results [5].
In addition, driver-response relationships can help build mechanistic models of how the system works. On the one hand, such models can be used to estimate early warnings directly. For instance, generalized models can help measure critical slowing down when data are limited [56], and early warnings combined with dynamic linear modeling can also improve the estimation of indicators when information on mechanisms is limited [28]. On the other hand, such models can be used to build null models against which the significance of most indicators can be tested statistically.
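A minimal sketch of such a null-model test, reusing rolling_indicators() from above, is to fit a stationary linear model to the data, generate surrogate series from it (in the spirit of the surrogate-data methods of [51]), and ask how often the surrogates produce an indicator trend at least as strong as the one observed; the helper name and the number of surrogates are our own choices.

```r
# Minimal sketch: surrogate-data significance test for an indicator trend.
# The null model is a stationary AR(1) fitted to the (detrended) series 'x'.
surrogate_test <- function(x, nsurr = 200) {
  obs_tau <- rolling_indicators(x)$tau_ar1
  fit <- ar(x, order.max = 1, aic = FALSE)     # Yule-Walker AR(1) fit
  surr_tau <- replicate(nsurr, {
    s <- arima.sim(list(ar = fit$ar), n = length(x),
                   sd = sqrt(fit$var.pred))
    rolling_indicators(as.numeric(s))$tau_ar1
  })
  # one-sided p-value: chance that a stationary surrogate trends as strongly
  mean(surr_tau >= obs_tau)
}
```

With a mechanistic model at hand, the same scheme applies with model-generated surrogates, rather than AR(1) surrogates, as the null.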
Unfortunately, the knowledge needed to build such specific mechanistic models is limited in most cases. In the extreme case, the only information available is a time series of a response variable, as in the datasets we analyzed here. In practice, there are typically some other data available on drivers, triggers, or other processes, but mechanistic understanding differs widely between systems. The families of metric-based and model-based generic early warnings offer the opportunity to identify upcoming transitions even in the absence of any specific knowledge of the underlying generating process. Moreover, advances in data collection and high frequency monitoring can increase confidence in the potential of using early warnings in cases where mechanistic understanding is limited.
Such high frequency observations might also call for alternative methods. For instance, for high frequency data with inherent periodicities, such as electroencephalogram (EEG) time series of neural activity, Fourier decomposition or wavelet analysis may prove useful. In the Supporting Information (Fig. S4) we illustrate a potential application of wavelet analysis to such data, but these period-decomposition techniques have not yet been fully tested for detecting critical transitions.
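Wavelet analysis (as in Fig. S4) requires a dedicated package, but base R already allows a crude Fourier-based look at such data: the sketch below tracks, segment by segment, the share of spectral power at low frequencies, which is expected to grow as fluctuations ‘redden’ near a transition. The segment count and the frequency cut-off are arbitrary illustrative choices.

```r
# Minimal sketch: share of low-frequency spectral power in successive
# non-overlapping segments of a high-frequency series 'x'.
segment_lowfreq_power <- function(x, nseg = 10, fcut = 0.05) {
  seglen <- floor(length(x) / nseg)
  sapply(1:nseg, function(k) {
    seg <- x[((k - 1) * seglen + 1):(k * seglen)]
    sp  <- spectrum(seg, plot = FALSE)               # raw periodogram
    sum(sp$spec[sp$freq <= fcut]) / sum(sp$spec)     # low-frequency share
  })
}
```

Applied to successive recordings, a systematic rise of this ratio would play a role analogous to that of rising variance and autocorrelation in the slower, aperiodic series analyzed here.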
In other cases, multiple time series may be available. Monitoring more than one species in a community, or measuring the activity of numerous neural cells, yields multivariate time series that could enhance our ability to detect approaching transitions. In such cases, multivariate indices (like covariances) can be used [23], and extensions of the univariate time-varying AR(p) models to multivariate analogs have been proposed [38]. Similarly, spatial data can be of added value, as spatial information may also provide early warning signals. Some of these signals are mathematical analogs of the time series indicators (spatial variance [57], spatial skewness [58], spatial autocorrelation [59]), while others are system-specific, such as patch shape [60] and patch size distribution [61], [62]. These spatial indicators can be combined with the time series indicators presented here to provide more reliable signals [53]. We treat spatial indicators in depth in a separate paper.
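For a single spatial snapshot stored as a lattice, the analogous quantities are straightforward to compute; in the sketch below a simple nearest-neighbour correlation stands in for more formal statistics such as Moran's I, and the function name is our own.

```r
# Minimal sketch: spatial variance, spatial skewness and a nearest-neighbour
# spatial correlation for one snapshot stored as a numeric matrix 'z'.
spatial_indicators <- function(z) {
  v    <- var(as.vector(z))
  m    <- mean(z)
  skew <- mean((z - m)^3) / mean((z - m)^2)^1.5
  # correlation of each cell with its right-hand and lower neighbour
  right <- cor(as.vector(z[, -ncol(z)]), as.vector(z[, -1]))
  down  <- cor(as.vector(z[-nrow(z), ]), as.vector(z[-1, ]))
  c(spatial_variance    = v,
    spatial_skewness    = skew,
    spatial_correlation = mean(c(right, down)))
}

# Example on a random 50 x 50 field (no spatial structure, so the
# correlation should be close to zero)
set.seed(7)
spatial_indicators(matrix(rnorm(2500), 50, 50))
```

Tracked across snapshots taken as conditions change, trends in these spatial indicators complement the temporal indicators in the way described above.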
Clearly, we face formidable uncertainty when it comes to making decisions in the face of potential upcoming transitions. This uncertainty stems from multiple factors, including imprecise forecasts, insufficient data, and hidden nonlinearities [63], [64], as well as from peculiarities in the perception and tolerance of risk. Our framework for using early warning signals may help pave the way to a more robust evaluation of the risk of imminent transitions. Testing our framework on real-world datasets is the next step in that direction.
Supporting Information
Acknowledgments
This paper was based on the workshop ‘Practical Methods for Analysis of Early Warnings for Regime Shifts’ organized by VD, SRC and MS at the Santa Fe Institute in Santa Fe, New Mexico, USA, from 10 to 12 October 2011. We are grateful to the Santa Fe Institute and Arizona State University for financial support, and in particular to Sander van der Leeuw for the invitation to organize the workshop as part of the Complex Adaptive Systems Initiative. We are also grateful to Marty Anderies for serving as our critical observer during the workshop. We thank Tim Cline for writing the R code for the conditional heteroskedasticity indicator.
Footnotes
Competing Interests: The authors have declared that no competing interests exist.
Funding: VD, EHvN and MS are supported by a Spinoza Award and a European Research Council Advanced Grant both awarded to MS. WAB is grateful for financial and scientific support from the Vilas Trust and the Center for Robust Decision Making in Climate and Energy Policy funded by the National Science Foundation (NSF) through the Decision Making Under Uncertainty Program. SRC’s work is supported by the NSF and the Hilldale Fund of UW-Madison. VNL was supported by the Natural Environment Research Council project ‘Detecting and classifying bifurcations in the climate system’ (NE/F005474/1) and by the AXA Research Fund through a postdoctoral fellowship. VNL’s research was carried out on the High Performance Computing Cluster at the University of East Anglia. DS is supported by the US National Science Foundation (GRFP). AME’s work on critical transitions is supported by the US National Science Foundation through its LTER program (DEB 06-20443). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
References
- 1.Lenton TM, Held H, Kriegler E, Hall JW, Lucht W, et al. Tipping elements in the Earth’s climate system. Proc Nat Acad Sci USA. 2008;105:1786–1793. doi: 10.1073/pnas.0705414105. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Scheffer M, Carpenter S, Foley JA, Folke C, Walker B. Catastrophic shifts in ecosystems. Nature. 2001;413:591–596. doi: 10.1038/35098000. [DOI] [PubMed] [Google Scholar]
- 3.McSharry PE, Smith LA, Tarassenko L, Martinerie J, Le Van Quyen M, et al. Prediction of epileptic seizures: Are nonlinear methods relevant? Nat Med. 2003;9:241–242. doi: 10.1038/nm0303-241. [DOI] [PubMed] [Google Scholar]
- 4.Scheffer M. Critical Transitions in Nature and Society. Princeton, New Jersey: Princeton University Press. 2009.
- 5.Bestelmeyer BT, Ellison AM, Fraser WR, Gorman KB, Holbrook SJ, et al. Analysis of abrupt transitions in ecological systems. Ecosphere. 2011;2:129. [Google Scholar]
- 6.Scheffer M, Bascompte J, Brock WA, Brovkin V, Carpenter SR, et al. Early-warning signals for critical transitions. Nature. 2009;461:53–59. doi: 10.1038/nature08227. [DOI] [PubMed] [Google Scholar]
- 7.Livina VN, Lenton TM. A modified method for detecting incipient bifurcations in a dynamical system. Geophys Res Lett. 2007;34.
- 8.Dakos V, Scheffer M, van Nes EH, Brovkin V, Petoukhov V, et al. Slowing down as an early warning signal for abrupt climate change. Proc Nat Acad Sci USA. 2008;105:14308–14312. doi: 10.1073/pnas.0802430105. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9.Drake J, Griffen B. Early warning signals of extinction in deteriorating environments. Nature. 2010;467:456–459. doi: 10.1038/nature09389. [DOI] [PubMed] [Google Scholar]
- 10.Carpenter SR, Cole JJ, Pace ML, Batt R, Brock WA, et al. Early warnings of regime shifts: a whole-ecosystem experiment. Science. 2011;332:1079–1082. doi: 10.1126/science.1203672. [DOI] [PubMed] [Google Scholar]
- 11.Kramer J, Ross J. Stabilization of unstable states, relaxation, and critical slowing down in a bistable system. J Chem Phys. 1985;83:6234–6241. [Google Scholar]
- 12.Tredicce JR, Lippi GL, Mandel P, Charasse B, Chevalier A, et al. Critical slowing down at a bifurcation. Am J Phys. 2004;72:799–809. [Google Scholar]
- 13.Veraart AJ, Faassen EJ, Dakos V, van Nes EH, Lurling M, et al. Recovery rates reflect distance to a tipping point in a living system. Nature. 2012;481:357–359. doi: 10.1038/nature10723. [DOI] [PubMed] [Google Scholar]
- 14.Lenton TM. Early warning of climate tipping points. Nature Clim Change. 2011;1:201–209. [Google Scholar]
- 15.Hastings A, Wysham DB. Regime shifts in ecological systems can occur with no warning. Ecol Lett. 2010;13:464–472. doi: 10.1111/j.1461-0248.2010.01439.x. [DOI] [PubMed] [Google Scholar]
- 16.Carpenter SR, Brock WA. Early warnings of unknown nonlinear shifts: a nonparametric approach. Ecology. 2011;92:2196–2201. doi: 10.1890/11-0716.1. [DOI] [PubMed] [Google Scholar]
- 17.Wissel C. A universal law of the characteristic return time near thresholds. Oecologia. 1984;65:101–107. doi: 10.1007/BF00384470. [DOI] [PubMed] [Google Scholar]
- 18.Strogatz SH. Nonlinear Dynamics and Chaos: With Applications to Physics, Biology, Chemistry and Engineering. Reading, Massachusetts: Addison-Wesley. 1994.
- 19.Held H, Kleinen T. Detection of climate system bifurcations by degenerate fingerprinting. Geophys Res Lett. 2004;31.
- 20.Kleinen T, Held H, Petschel-Held G. The potential role of spectral properties in detecting thresholds in the Earth system: application to the thermohaline circulation. Ocean Dynamics. 2003;53:53–63. [Google Scholar]
- 21.Box GEP, Jenkins GM, Reinsel GC. Time Series Analysis: Forecasting and Control. New Jersey: Wiley. 2008.
- 22.Ives AR, Dennis B, Cottingham KL, Carpenter SR. Estimating community stability and ecological interactions from time-series data. Ecol Monogr. 2003;73:301–330. [Google Scholar]
- 23.Carpenter SR, Brock WA, Cole JJ, Kitchell JF, Pace ML. Leading indicators of trophic cascades. Ecol Lett. 2008;11:128–138. doi: 10.1111/j.1461-0248.2007.01131.x. [DOI] [PubMed] [Google Scholar]
- 24.Schroeder M. Fractals, Chaos, Power Laws: Minutes from an Infinite Paradise. New York: W. H. Freeman. 1991.
- 25.Biggs R, Carpenter SR, Brock WA. Turning back from the brink: Detecting an impending regime shift in time to avert it. Proc Nat Acad Sci USA. 2009;106:826–831. doi: 10.1073/pnas.0811729106. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26.Peng C-K, Buldyrev S, Havlin S, Simons M, Stanley H, et al. Mosaic organization of DNA nucleotides. Phys Rev E. 1994;49:1685–1689. doi: 10.1103/physreve.49.1685. [DOI] [PubMed] [Google Scholar]
- 27.Lenton TM, Livina VN, Dakos V, van Nes EH, Scheffer M. Early warning of climate tipping points from critical slowing down: comparing methods to improve robustness. Philos Transact A Math Phys Eng Sci. 2012;370:1185–1204. doi: 10.1098/rsta.2011.0304. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 28.Carpenter SR, Brock WA. Rising variance: a leading indicator of ecological transition. Ecol Lett. 2006;9:311–318. doi: 10.1111/j.1461-0248.2005.00877.x. [DOI] [PubMed] [Google Scholar]
- 29.Guttal V, Jayaprakash C. Changing skewness: an early warning signal of regime shifts in ecosystems. Ecol Lett. 2008;11:450–460. doi: 10.1111/j.1461-0248.2008.01160.x. [DOI] [PubMed] [Google Scholar]
- 30.Engle RF, Hendry DF, Trumble D. Small-sample properties of ARCH estimators and tests. Can J Econ. 1985;18:66–93. [Google Scholar]
- 31.Engle RF. A general approach to Lagrange multiplier model diagnostics. J Econom. 1982;20:83–104. [Google Scholar]
- 32.Seekell DA, Carpenter SR, Pace ML. Conditional heteroscedasticity as a leading indicator of ecological regime shifts. Am Nat. 2011;178:442–451. doi: 10.1086/661898. [DOI] [PubMed] [Google Scholar]
- 33.Brock WA, Dechert W, Scheinkman J, LeBaron B. A test for independence based upon the correlation dimension. Econom Rev. 1996;15:197–235. [Google Scholar]
- 34.Johannes M. The statistical and economic role of jumps in continuous-time interest rate models. J Finance. 2004;59:227–260. [Google Scholar]
- 35.Bandi F, Nguyen T. On the functional estimation of jump-diffusion models. J Econom. 2003;116:293–328. [Google Scholar]
- 36.Hardle W. Applied Nonparametric Regression. Cambridge, England: Cambridge University Press. 1990.
- 37.Horsthemke W, Lefever R. Noise-Induced Transitions : Theory and Applications in Physics, Chemistry, and Biology. Berlin: Springer. 2006.
- 38.Ives AR, Dakos V (in press) Detecting dynamical changes in nonlinear time series using locally linear state-space models. Ecosphere.
- 39.Box GEP, Jenkins GM, Reinsel GC. Time Series Analysis: Forecasting and Control. Englewood Cliffs, New Jersey, USA: Prentice-Hall. 1994.
- 40.Harvey AC. Forecasting, Structural Time Series Models and the Kalman Filter. Cambridge: Cambridge University Press. 1989.
- 41.Self SG, Liang KY. Asymptotic properties of maximum-likelihood estimators and likelihood ratio tests under non-standard conditions. J Am Stat Assoc. 1987;82:605–610. [Google Scholar]
- 42.Stram DO, Lee JW. Variance-components testing in the longitudinal mixed effects model. Biometrics. 1994;50:1171–1177. [PubMed] [Google Scholar]
- 43.Livina VN, Kwasniok F, Lenton TM. Potential analysis reveals changing number of climate states during the last 60 kyr. Clim Past. 2010;6:77–82. [Google Scholar]
- 44.Livina VN, Kwasniok F, Lohmann G, Kantelhardt JW, Lenton TM. Changing climate states and stability: from Pliocene to present. Climate Dynam. 2011;37:2437–2453. [Google Scholar]
- 45.May RM. Thresholds and breakpoints in ecosystems with a multiplicity of stable states. Nature. 1977;269:471–477. [Google Scholar]
- 46.Noy-Meir I. Stability of grazing systems an application of predator prey graphs. J Ecol. 1975;63:459–482. [Google Scholar]
- 47.van Nes EH, Scheffer M. Slow recovery from perturbations as a generic indicator of a nearby catastrophic shift. Am Nat. 2007;169:738–747. doi: 10.1086/516845. [DOI] [PubMed] [Google Scholar]
- 48.Dakos V, van Nes EH, D’Odorico P, Scheffer M. Robustness of variance and autocorrelation as indicators of critical slowing down. Ecology. 2011;93:264–271. doi: 10.1890/11-0889.1. [DOI] [PubMed] [Google Scholar]
- 49.Hasselmann K. Stochastic climate models, Part I: Theory. Tellus. 1976;28:473–485. [Google Scholar]
- 50.Thompson JMT, Sieber J. Predicting climate tipping as a noisy bifurcation: a review. Int J Bifurcat Chaos. 2011;21:399–423. [Google Scholar]
- 51.Theiler J, Eubank S, Longtin A, Galdrikian B, Farmer JD. Testing for nonlinearity in time series: the method of surrogate data. Physica D. 1992;58:77–94. [Google Scholar]
- 52.Halley JM, Kugiumtzis D. Nonparametric testing of variability and trend in some climatic records. Clim Change. 2011;109:549–568. [Google Scholar]
- 53.Dakos V, Kéfi S, Rietkerk M, van Nes EH, Scheffer M. Slowing down in spatially patterned ecosystems at the brink of collapse. Am Nat. 2011;177:E153–E166. doi: 10.1086/659945. [DOI] [PubMed] [Google Scholar]
- 54.Livina VN, Ditlevsen PD, Lenton TM. An independent test of methods of detecting system states and bifurcations in time-series data. Phys A. 2012;391:485–496. [Google Scholar]
- 55.Thompson JMT, Sieber J. Predicting climate tipping as a noisy bifurcation: a review. Int J Bifurcat Chaos. 2011;21:399–423. [Google Scholar]
- 56.Lade SJ, Gross T. Early warning signals for critical transitions: A generalized modeling approach. PLoS Comput Biol. 2012;8:e1002360. doi: 10.1371/journal.pcbi.1002360. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 57.Oborny B, Meszena G, Szabo G. Dynamics of populations on the verge of extinction. Oikos. 2005;109:291–296. [Google Scholar]
- 58.Guttal V, Jayaprakash C. Spatial variance and spatial skewness: leading indicators of regime shifts in spatial ecological systems. Theor Ecol. 2009;2:3–12. [Google Scholar]
- 59.Dakos V, van Nes E, Donangelo R, Fort H, Scheffer M. Spatial correlation as leading indicator of catastrophic shifts. Theor Ecol. 2010;3:163–174. [Google Scholar]
- 60.Rietkerk M, Dekker SC, de Ruiter PC, van de Koppel J. Self-organized patchiness and catastrophic shifts in ecosystems. Science. 2004;305:1926–1929. doi: 10.1126/science.1101867. [DOI] [PubMed] [Google Scholar]
- 61.Kéfi S, Rietkerk M, Alados CL, Pueyo Y, Papanastasis VP, et al. Spatial vegetation patterns and imminent desertification in Mediterranean arid ecosystems. Nature. 2007;449:213–217. doi: 10.1038/nature06111. [DOI] [PubMed] [Google Scholar]
- 62.Maestre FT, Escudero A. Is the patch size distribution of vegetation a suitable indicator of desertification processes? Ecology. 2009;90:1729–1735. doi: 10.1890/08-2096.1. [DOI] [PubMed] [Google Scholar]
- 63.Carpenter SR. Regime Shifts in Lake Ecosystems: Pattern and Variation. Oldendorf/Luhe, Germany. 2002.
- 64.Clark JS, Carpenter SR, Barber M, Collins S, Dobson A, et al. Ecological forecasts: An emerging imperative. Science. 2001;293:657–660. doi: 10.1126/science.293.5530.657. [DOI] [PubMed] [Google Scholar]