Global Ecology and Biogeography. 2013 Oct 29;23(3):259–263. doi: 10.1111/geb.12122

Analysis of stable states in global savannas: is the CART pulling the horse?

Niall P Hanan 1,*, Andrew T Tredennick 2, Lara Prihodko 1, Gabriela Bucini 3, Justin Dohn 2
PMCID: PMC4579867  PMID: 26430386

Abstract

Multiple stable states, bifurcations and thresholds are fashionable concepts in the ecological literature, a recognition that complex ecosystems may at times exhibit the interesting dynamic behaviours predicted by relatively simple biomathematical models. Recently, several papers in Global Ecology and Biogeography, Proceedings of the National Academy of Sciences USA, Science and elsewhere have attempted to quantify the prevalence of alternate stable states in the savannas of Africa, Australia and South America, and the tundra–taiga–grassland transitions of the circum-boreal region using satellite-derived woody canopy cover. While we agree with the logic that basins of attraction can be inferred from the relative frequencies of ecosystem states observed in space and time, we caution that the statistical methodologies underlying the satellite product used in these studies may confound our ability to infer the presence of multiple stable states. We demonstrate this point using a uniformly distributed ‘pseudo-tree cover’ database for Africa that we use to retrace the steps involved in creation of the satellite tree-cover product and subsequent analysis. We show how classification and regression tree (CART)-based products may impose discontinuities in satellite tree-cover estimates even when such discontinuities are not present in reality. As regional and global remote sensing and geospatial data become more easily accessible for ecological studies, we recommend careful consideration of how error distributions in remote sensing products may interact with the data needs and theoretical expectations of the ecological process under study.

Keywords: Alternate stable states, remote sensing, savanna, tree cover, tree-grass coexistence

Introduction

In recent years the existence of multiple stable states, bifurcations and thresholds in ecological systems has been a focus for ecologists, with examples in a variety of aquatic, marine and terrestrial systems (Scheffer et al., 2001). These empirical observations have been supported by conceptual and numerical models of how biotic and edaphic interactions may combine in complex systems to generate nonlinear dynamics and ‘ecosystem surprises’ (Scheffer et al., 2010). Such bifurcations have been observed in savannas, where the characteristic coexistence of trees and grasses in the landscape depends on climatic and edaphic constraints (primarily rainfall and soil hydrological characteristics), and their interactions with demographic constraints imposed by fire, herbivory and other factors that impact on woody recruitment and mortality (Sankaran et al., 2004). In particular, it seems clear that a positive feedback exists in savannas where grass fires reduce tree density and cover (by limiting seedling recruitment), releasing grass growth and leading to increased fire frequency or intensity. This positive feedback can drive a bifurcation between high woody cover–low fire savannas and low woody cover–high fire savannas and potentially delineate savanna and forest alternate stable states (Bond, 2008; Hanan et al., 2008; Hoffmann et al., 2012).

Recently, however, several authors [Favier et al., 2012, hereafter FA; Hirota et al., 2011 (HH); Staver et al., 2011 (SA); Murphy & Bowman, 2012 (MB); Ratajczak & Nippert, 2012 (RN)] have attempted to quantify the prevalence of alternate stable states in the tropical savannas and forests of Africa, Australia and South America using a satellite-derived woody cover product (Hansen et al., 2003). An analysis for the boreal regions of North America and Eurasia employed similar methods [Scheffer et al., 2012 (SH)]. While these studies make important contributions in attempting to identify stable states at regional and global scales, we wondered whether the statistical methodologies underlying the satellite tree-cover product might limit our ability to draw inferences regarding multiple stable states. In particular, the use of classification and regression tree (CART) methods may impose discontinuities in satellite tree-cover estimates that are not present in reality.

Here we use a simulated tree-cover dataset to show how CART methods may create discontinuities in predictions based on perfectly uniform data. We argue that, even if discontinuities and alternate stable states are present in reality, the error distributions introduced by CART methods confound our ability to infer the presence, location and intensity of these discontinuities. This highlights the need for ecologists to critically evaluate remote sensing-based products used to link process and pattern.

Inferring Stable States from Data

There is clear logic that the existence of multiple attractors (or ‘basins of attraction’) in ecosystem state-space can be inferred from the relative frequencies of ecosystem states observed in space or time (Fig. 1). A system with only one stable state will exhibit a unimodal frequency distribution, where the mean approximates the attractor and variability around the mean is influenced by environmental variability and stochastic events such as fire, storms or droughts (Fig. 1a). In a single-attractor scenario, the location of the attractor in state-space can be inferred from the peak in the probability distribution. Conversely, rarely observed states indicate excursions from the stable attractor and high ‘potential’ (Livina et al., 2010) for movement towards the attractor (Fig. 1c). In systems with multiple attractors, environmental variability and perturbations may propel a system beyond a dynamic threshold from where it will trend towards a new stable state (Fig. 1b). In multiple-attractor scenarios, peaks in observed frequency distributions still provide evidence for the location of the attractors in state-space, while rarely observed and high-potential states can be used to infer thresholds between alternate attractors (Fig. 1d).

Figure 1.

Representations of unimodal (left) and multimodal (right) systems in ecosystem state-space. (a), (b) Frequency histograms of a state variable (e.g. the frequency of observations of particular tree cover values). (c), (d) ‘Potential’ (U) diagrams estimated using the approach of Livina et al. (2010), where U ∼ –log(pd), and pd is a Gaussian-kernel probability density of the associated histograms. ‘Marbles’ indicate the location of inferred stable attractors.
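The relationship U ∼ –log(pd) in the caption can be illustrated with a minimal sketch. It is not the implementation used for Fig. 1: the bandwidth, the 1% grid, the deterministic quantile samples and the function names (`quantile_sample`, `gaussian_kde`, `potential`, `count_basins`) are all assumptions made for illustration, and basins are simply counted as local minima of U rather than by the polynomial fitting of Livina et al. (2010).

```python
import math
from statistics import NormalDist

def quantile_sample(mean, sd, n):
    """Deterministic stand-in for observations: n evenly spaced normal quantiles."""
    dist = NormalDist(mean, sd)
    return [dist.inv_cdf((i + 0.5) / n) for i in range(n)]

def gaussian_kde(samples, grid, bandwidth):
    """Gaussian-kernel probability density of `samples`, evaluated on `grid`."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        for x in grid
    ]

def potential(density):
    """Potential U = -log(pd); a tiny floor avoids log(0)."""
    return [-math.log(max(p, 1e-12)) for p in density]

def count_basins(u):
    """Count strict interior local minima of U (candidate attractors)."""
    return sum(
        1 for i in range(1, len(u) - 1) if u[i] < u[i - 1] and u[i] < u[i + 1]
    )

grid = [float(c) for c in range(0, 101)]  # tree cover, 0-100 %

# Hypothetical single-attractor and two-attractor systems.
unimodal = quantile_sample(50.0, 8.0, 400)
bimodal = quantile_sample(20.0, 5.0, 200) + quantile_sample(75.0, 5.0, 200)

basins_uni = count_basins(potential(gaussian_kde(unimodal, grid, bandwidth=6.0)))
basins_bi = count_basins(potential(gaussian_kde(bimodal, grid, bandwidth=6.0)))
print(basins_uni, basins_bi)
```

The number of basins found this way depends on the chosen bandwidth, so any real analysis would need to test its sensitivity to that choice.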

While the logic represented in Fig. 1 may be sound, our ability to infer stability landscapes in nature based on the frequency distributions of observed states may be limited when the ‘observations’ are actually model outputs with their own distinctive error distributions. Remote sensing-derived products, including those relating optical reflectance, thermal and microwave emissions to vegetation structure (leaf area index, tree cover, woody biomass) frequently employ empirical-statistical fits to a limited number of field observations. Before these products can be used for tertiary studies, it is important to evaluate the underlying structure of the models used, and their associated error distributions, because these characteristics may limit their use for ecological inference. We argue this is the case when attempting to infer multiple stable states using remote sensing ‘data’ produced using CART techniques because CART predictions, and thus error distributions, are inherently discontinuous.

A Test of CART-Based Methods

To demonstrate our contention that CART-based methods may lead to erroneous conclusions regarding discontinuities in tree cover we generated a random uniform ‘pseudo-tree cover’ dataset representing woody cover in Africa. First, we used long-term mean annual rainfall for Africa (Mitchell & Jones, 2005) and the empirical relationship of Sankaran et al. (2005) to estimate maximum potential woody cover (τ′) at 10 km × 10 km spatial resolution (Fig. 2a). Specifically, potential tree cover (τ′, %) was calculated from mean annual precipitation (P, mm year–1) as τ′ = 0.142P – 14.2 (bounded 0 ≤ τ′ ≤ 90%). Pseudo-tree cover (τ, %) was then randomized as τ = τ′R where R is a uniformly distributed random number (0.0 ≤ R ≤ 1.0). This produced a continental pseudo-tree cover dataset (τ) that conforms to broad spatial variations in actual tree cover, but within which all pixels deviate by some random fraction below the biogeographical maxima (Fig. 2b).
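As a minimal sketch of this construction, the snippet below applies the stated relationship τ′ = 0.142P – 14.2 (bounded to 0–90%) and the randomization τ = τ′R. The rainfall values are hypothetical uniform draws, not the Mitchell & Jones (2005) grid, and the function names are assumptions introduced here.

```python
import random

def potential_cover(p_mm):
    """tau' (%) from mean annual precipitation P (mm/yr), bounded to 0-90 %."""
    return min(90.0, max(0.0, 0.142 * p_mm - 14.2))

def pseudo_cover(p_mm, rng):
    """tau = tau' * R, with R drawn uniformly from [0, 1)."""
    return potential_cover(p_mm) * rng.random()

rng = random.Random(42)

# Hypothetical rainfall values standing in for the gridded climate data.
rainfall = [rng.uniform(0.0, 2000.0) for _ in range(1000)]
tau = [pseudo_cover(p, rng) for p in rainfall]

print(min(tau), max(tau))
```

Under this rule every pixel receiving less than 100 mm of rain per year has zero potential cover, and the 90% ceiling is reached at roughly 734 mm per year.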

Figure 2.

Potential woody canopy cover percentage (τ′) in Africa estimated from the empirical relationship of Sankaran et al. (2005) and mean annual rainfall (Mitchell & Jones, 2005) at c. 10 km spatial resolution (a). Data in (a) were reduced by a random factor (0–1.0) to create a uniformly distributed pseudo-tree cover (τ) database (b). A random 5% sample (c. 15 200 data-points) from (b) was used as a calibration set for a classification and regression tree (CART) analysis which was then used to generate new CART-based estimates for the whole continent (c).

We used τ to generate a plausible set of ‘pseudo-satellite metrics’ analogous to the satellite data metrics used in Hansen et al. (2003), with random noise added to better represent the diverse sources of noise and uncertainty present in real data (see Methods S1 and Fig. S1 in Supporting Information for details). We called the pseudo-satellite metrics ‘peak greenness’ (N, approximating the annual maximum normalized difference vegetation index), ‘seasonal greenness amplitude’ (Δ, the difference in greenness between wet season and dry season), ‘mean growing season length’ (L), minimum red reflectance (ρR, at the time of maximum greenness) and mean summertime surface temperature (T, related to latitude and tree cover). This combination of pseudo-tree cover and pseudo-satellite data allowed us to emulate the broad approach used to generate the global tree-cover product and the subsequent analyses of RN, HH, SA, FA and SH.

To emulate the methodologies used in generating the global tree-cover product (Hansen et al., 2003) we: (1) randomly selected a 5% sample of pseudo-tree cover and pseudo-satellite metrics from terrestrial Africa as calibration data (c. 15 200 points); (2) modelled the relationship between tree cover and satellite data using a regression tree, with terminal nodes pruned using a cross-validation approach; (3) used linear regression to model the residuals in each regression tree terminal node to smooth nodes and improve the fit to the original data; and (4) applied the smoothed regression tree to produce new (CART-based) predictions of tree cover for all of Africa (Fig. 2c). All analyses were conducted in R (available at http://cran.r-project.org) and associated statistical packages (see Methods S2, Table S1 and Fig. S2 for a more detailed description).
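The four steps can be sketched in a few lines of Python. This is an illustration under strong simplifying assumptions, not the R analysis described in Methods S2: it uses a single hypothetical noisy metric instead of five pseudo-satellite metrics, a fixed tree depth and minimum node size instead of cross-validated pruning, and hand-written helpers (`fit_tree`, `leaf_of`, `smooth_leaves`, `predict`) rather than any R package.

```python
import random

def fit_tree(xs, ys, depth, min_leaf):
    """Step 2: grow a regression tree on one predictor by least squares."""
    n = len(xs)
    mean = sum(ys) / n
    if depth == 0 or n < 2 * min_leaf:
        return {"mean": mean}
    order = sorted(range(n), key=lambda i: xs[i])
    sx = [xs[i] for i in order]
    sy = [ys[i] for i in order]
    total = sum(sy)
    best_gain, best_k, left = 0.0, None, 0.0
    for k in range(1, n):
        left += sy[k - 1]
        if k < min_leaf or n - k < min_leaf or sx[k] == sx[k - 1]:
            continue
        # Maximising this is equivalent to minimising the summed squared error.
        gain = left * left / k + (total - left) ** 2 / (n - k)
        if best_k is None or gain > best_gain:
            best_gain, best_k = gain, k
    if best_k is None:
        return {"mean": mean}
    k = best_k
    return {"cut": 0.5 * (sx[k - 1] + sx[k]),
            "left": fit_tree(sx[:k], sy[:k], depth - 1, min_leaf),
            "right": fit_tree(sx[k:], sy[k:], depth - 1, min_leaf)}

def leaf_of(tree, x):
    """Walk down the tree to the terminal node that contains x."""
    while "cut" in tree:
        tree = tree["left"] if x <= tree["cut"] else tree["right"]
    return tree

def smooth_leaves(tree, xs, ys):
    """Step 3: least-squares line through the residuals within each terminal node."""
    groups = {}
    for x, y in zip(xs, ys):
        leaf = leaf_of(tree, x)
        _, lx, lr = groups.setdefault(id(leaf), (leaf, [], []))
        lx.append(x)
        lr.append(y - leaf["mean"])
    for leaf, lx, lr in groups.values():
        xbar, rbar = sum(lx) / len(lx), sum(lr) / len(lr)
        sxx = sum((x - xbar) ** 2 for x in lx)
        sxr = sum((x - xbar) * (r - rbar) for x, r in zip(lx, lr))
        leaf["slope"] = sxr / sxx if sxx > 0 else 0.0
        leaf["intercept"] = rbar - leaf["slope"] * xbar

def predict(tree, x):
    """Nodal mean plus the fitted residual line (zero before smoothing)."""
    leaf = leaf_of(tree, x)
    return leaf["mean"] + leaf.get("intercept", 0.0) + leaf.get("slope", 0.0) * x

rng = random.Random(1)
tau = [rng.uniform(0.0, 80.0) for _ in range(4000)]   # uniform pseudo-cover, %
metric = [t + rng.gauss(0.0, 10.0) for t in tau]      # hypothetical noisy metric

idx = rng.sample(range(len(tau)), len(tau) // 20)     # step 1: 5 % calibration sample
cal_x = [metric[i] for i in idx]
cal_y = [tau[i] for i in idx]

tree = fit_tree(cal_x, cal_y, depth=3, min_leaf=10)   # step 2 (fixed depth, no pruning)
raw_pred = [predict(tree, x) for x in metric]         # nodal means only
smooth_leaves(tree, cal_x, cal_y)                     # step 3
smooth_pred = [predict(tree, x) for x in metric]      # step 4: apply to all pixels
print(len(set(raw_pred)), len(set(smooth_pred)))
```

Before smoothing, the predictions can take no more distinct values than there are terminal nodes (at most eight here). With only one predictor the within-node regression may remove more of that structure than it would in a five-metric setting, so the sketch should not be read as reproducing Fig. 3.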

To demonstrate the impact of CART-based interpolation techniques we contrast tree cover–rainfall relationships, tree-cover frequency histograms and potential diagrams (Livina et al., 2010), using the original random uniformly distributed tree cover and the CART-based estimates (Fig. 3). The regression tree produces mean tree-cover estimates in a number of terminal nodes, where the number of nodes is determined by the calibration data used to derive the tree, statistical thresholds for addition of branches, and selection of pruning methods. The within-terminal node regressions allow additional information content in the independent data (in our case, the pseudo-satellite metrics) to be used to reduce residuals between CART predictions and calibration data. However, while predictions are smoothed in this process, they remain centred on nodal means (Fig. S2). Inherent discontinuities in CART predictions are not entirely removed. Thus in Fig. 3 we see that the CART-based tree-cover estimates (even after linear smoothing of residuals) result in a strongly non-uniform frequency distribution with respect to rainfall, where the original data vary continuously. Analysis of the potential diagrams (Fig. 3c,f), using the polynomial approach of Livina et al. (2010), indicates the presence of four distinct basins of attraction in the CART estimates, where the original data are characterized by a single broad basin reflecting the prominence of deserts and arid savannas in Africa.
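The structure of the smoothed predictions described above can be written compactly. The notation is introduced here for illustration and does not appear in the original: M terminal nodes partition the space of satellite metrics into regions R_m, with nodal mean \bar{\tau}_m of the calibration data, and (\alpha_m, \beta_m) are the least-squares coefficients of the within-node residual regression.

```latex
\hat{\tau}(\mathbf{x}) \;=\; \bar{\tau}_{m} \;+\; \alpha_{m} \;+\; \boldsymbol{\beta}_{m}^{\top}\mathbf{x},
\qquad \mathbf{x} \in \mathcal{R}_{m}, \quad m = 1, \dots, M .
```

Because the residuals in each node have zero mean over the calibration data, the fitted term \alpha_m + \beta_m^{\top}\mathbf{x} also averages to zero there, so predictions within R_m stay centred on \bar{\tau}_m. Nothing constrains the prediction to be continuous across the boundary between two regions, which is where the discontinuities can arise.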

Figure 3.

Tree cover (TC)–rainfall relationship (a), tree-cover frequency histogram (b) and ‘potential’ (c) drawn from a uniformly distributed pseudo-tree cover dataset (τ) for Africa, and similar relationships (d, e and f) following classification and regression-tree (CART) analysis. Figures (a)–(c) are based on the map shown in Fig. 2 (b), while (d)–(f) are based on the map shown in Fig. 2 (c). These analyses show how the CART-based abstraction of the uniformly distributed data has discontinuities in (d) and (e) not present in the original data (a) and (b). Using the method of Livina et al. (2010) the CART data result in the identification of four distinct basins of attraction (marked with arrows) and associated bifurcation points (f) where the original data (c) reflect only the spatial dominance of deserts in the African land mass and a single basin. Note that this analysis using pseudo-data highlights how CART-based geospatial analyses may produce discontinuities: the analysis is not intended to duplicate actual patterns of tree cover in Africa.

Conclusions: The CART (May Be) Pulling the Horse

We agree with the hypotheses of FA, HH, SA and MB that fire and other processes in savannas may produce bifurcations and alternate stable states in tree density and cover, and with RN's contention that time-series data will improve our ability to distinguish these discontinuities. We also accept the premise that the presence of alternate states should be detectable in satellite images of terrestrial vegetation when collected at appropriate spatial scales and when using statistical transfer functions between raw spectral information and the measurement of interest (in this case, tree cover) that are continuous (rather than discontinuous). We conclude, however, that satellite-based tree-cover analyses based on CART methods will, by their very nature, produce discontinuous tree-cover distributions, even if the ‘true’ distribution is uniform (Fig. 3). Thus, while state bifurcations may indeed be a feature of savannas, the statistical characteristics of CART predictions preclude subsequent inference of discontinuities and alternate stable states as attempted by FA, HH and SA in savannas, and SH in boreal forest and taiga.

We acknowledge that the satellite-derived ‘vegetation continuous fields’ (VCF) tree-cover product (Hansen et al., 2003) is a unique and important global product that has many potential applications in ecological and biogeographical research. Indeed we have used VCF to explore the broad patterns of tree cover in Africa (Bucini & Hanan, 2007). While we conclude that the error distributions of the VCF (Version 3) product may make it unsuitable for the diagnosis of alternate stable states in global vegetation, we also accept that no satellite products are perfect and that VCF may be a good choice for a variety of other applications. The more recent VCF products (Version 5; available through the USGS EROS data centre at https://lpdaac.usgs.gov/) are based on a larger calibration set and several methodological changes that appear to improve overall performance. The continued reliance on CART methods may, however, indicate the need for continued caution in the diagnosis of alternate states as discussed here.

More generally our conclusions should serve as a reminder to all users of geospatial and remote sensing data that the statistical and functional models used to estimate earth surface properties from remotely sensed radiances (whether from satellite or other platforms) may interact with subsequent ecological or biophysical interpretations and applications. The increasing availability of remote sensing data provides ecologists and biogeographers with new opportunities to link theory and observations at local, regional and global scales. To make good use of these opportunities, however, we must learn to work with the inevitable errors and occasional biases that are (and will always be) present in earth observation (EO) products. The trick is to recognize when non-random errors in EO products may interact with the needs of a specific ecological or biophysical application. The tree-cover analyses discussed here seem to provide an especially clear example where statistical artefacts in an EO product interact with the motives of the application and thereby bias the conclusions (‘the CART pulling the horse’). We suspect, however, that interactions between the error distributions embedded in satellite data products and the models and assumptions underlying their ecological and biophysical applications may be commonplace. Our hope is that this paper will stimulate a more critical approach to the selection and application of geospatial data, and additional research into these intriguing issues, to ensure that the science remains firmly in the driver's seat and that the horse is, indeed, pulling the CART.

Acknowledgments

We appreciated the time and helpful comments of anonymous referees and editors. N.P.H., A.T.T. and L.P. were supported by the NASA Terrestrial Ecology Program and by a NASA Earth and Space Science Fellowship to A.T.T. N.P.H., A.T.T., L.P., G.B. and J.D. were supported by the National Science Foundation Division of Environmental Biology, the Coupled Natural & Human Systems program and a Graduate Research Fellowship grant to J.D.

Biography

Niklaus Zimmermann

Biosketch

Niall Hanan is an ecologist with the Geospatial Sciences Center of Excellence (South Dakota State University, Brookings). His research concentrates on the ecology and function of semi-arid grasslands and savannas with particular emphasis on grazing systems in Africa.

Author contributions: The ideas and motivation for this paper were developed in discussions among all authors. N.P.H. and A.T.T. developed the analytical approach and all authors contributed to writing and editing the manuscript.

SUPPORTING INFORMATION

Additional supporting information may be found in the online version of this article at the publisher's web-site.

Methods S1 Approximation of pseudo-satellite metrics.

Methods S2 Classification and regression tree analysis and comparisons with ‘vegetation continuous fields’ methods.

Figure S1 Example of the pseudo-data sets used for classification and regression tree analysis, showing 305 175 pixel values representing pseudo-tree cover plotted against mean annual rainfall, and the five pseudo-satellite responses, across continental Africa at c. 10 km spatial resolution.

Figure S2 Example classification and regression tree (CART) predictions for the sample data (c. 15 200 points, left column) and applied to all Africa (c. 305 000 points, right column), showing CART-nodes before residual smoothing (upper row) and post-residual smoothing (lower row).

Table S1 Comparison of accuracy assessments (root mean square error, %) provided for the MODIS VCF (Version 3) tree cover with similar statistics derived here.

References

  1. Bond WJ. What limits trees in C4 grasslands and savannas? Annual Review of Ecology, Evolution and Systematics. 2008;39:641–659.
  2. Bucini G, Hanan N. A continental scale analysis of tree cover in African savannas. Global Ecology and Biogeography. 2007;16:593–605.
  3. Favier C, Aleman J, Bremond L, Dubois MA, Freycon V, Yangakola J-M. Abrupt shifts in African savanna tree cover along a climatic gradient. Global Ecology and Biogeography. 2012;21:787–797.
  4. Hanan NP, Sea WB, Dangelmayr G, Govender N. Do fires in savannas consume woody biomass? A comment on approaches to modeling savanna dynamics. The American Naturalist. 2008;171:851–856. doi: 10.1086/587527.
  5. Hansen MC, DeFries RS, Townshend JRG, Carroll M, Dimiceli C, Sohlberg RA. Global percent tree cover at a spatial resolution of 500 meters: first results of the MODIS vegetation continuous fields algorithm. Earth Interactions. 2003;7(10):1–15.
  6. Hirota M, Holmgren M, van Nes EH, Scheffer M. Global resilience of tropical forest and savanna to critical transitions. Science. 2011;334:232–235. doi: 10.1126/science.1210657.
  7. Hoffmann WA, Geiger EL, Gotsch SG, Rossatto DR, Silva LCR, Lau OL, Haridasan M, Franco AC. Ecological thresholds at the savanna–forest boundary: how plant traits, resources and fire govern the distribution of tropical biomes. Ecology Letters. 2012;15:759–768. doi: 10.1111/j.1461-0248.2012.01789.x.
  8. Livina VN, Kwasniok F, Lenton TM. Potential analysis reveals changing number of climate states during the last 60K years. Climate of the Past. 2010;6:77–82.
  9. Mitchell TD, Jones PD. An improved method of constructing a database of monthly climate observations and associated high-resolution grids. International Journal of Climatology. 2005;25:693–712.
  10. Murphy BP, Bowman DMJS. What controls the distribution of tropical forest and savanna? Ecology Letters. 2012;15:748–758. doi: 10.1111/j.1461-0248.2012.01771.x.
  11. Ratajczak Z, Nippert JB. Comment on ‘Global resilience of tropical forest and savanna to critical transitions’. Science. 2012;336:541–541-c. doi: 10.1126/science.1219346.
  12. Sankaran M, Ratnam J, Hanan N. Tree-grass coexistence in savannas revisited – insights from an examination of assumptions and mechanisms invoked in existing models. Ecology Letters. 2004;7:480–490.
  13. Sankaran M, Hanan N, Scholes RJ, et al. Determinants of woody cover in African savannas. Nature. 2005;438:846–849. doi: 10.1038/nature04070.
  14. Scheffer M, Carpenter S, Foley JA, Folke C, Walker B. Catastrophic shifts in ecosystems. Nature. 2001;413:591–596. doi: 10.1038/35098000.
  15. Scheffer M, Bascompte J, Brock WA, Brovkin V, Carpenter SR, Dakos V, Held H, van Nes EH, Rietkerk M, Sugihara G. Early-warning signals for critical transitions. Nature. 2010;461:53–59. doi: 10.1038/nature08227.
  16. Scheffer M, Hirota M, Holmgren M, van Nes EH, Chapin FS. Thresholds for boreal biome transitions. Proceedings of the National Academy of Sciences USA. 2012;109:21384–21389. doi: 10.1073/pnas.1219844110.
  17. Staver AC, Archibald S, Levin SA. The global extent and determinants of savanna and forest as alternative biome states. Science. 2011;334:230–232. doi: 10.1126/science.1210465.

Articles from Global Ecology and Biogeography are provided here courtesy of Wiley
