MethodsX. 2019 Jul 19;6:1668–1676. doi: 10.1016/j.mex.2019.07.020

Quantification of ecological complexity and resilience from multivariate biological metrics datasets using singular value decomposition entropy

Antoni Ginebreda a, Laia Sabater-Liesa a, Damià Barceló a,b
PMCID: PMC6664095  PMID: 31384567


Method name: Singular value decomposition entropy

Keywords: Ecological resilience, Ecological complexity, Singular value decomposition, Singular value entropy, River phytoplankton, Ebro River

Abstract

The concept of resilience has become popular in many disciplines far beyond its original use in the field of ecology. Despite its wide use, it has received different definitions that do not always coincide. This ambiguity is even more evident in its quantitative characterization. Most of the available methods are heavily context-dependent and often difficult to apply in practice. Here, we propose to define and calculate resilience starting from the data matrices resulting from multivariate measurements of different biological metrics.

  • The resilience between two field scenarios (each characterized by its corresponding dataset) can be conveniently captured as the difference between their respective data complexities.

  • Complexity is quantified by means of the entropy associated with the spectral distribution of the singular values of each data matrix.

  • The proposed method is illustrated with a case study in which the resilience of a river (Ebro River, NE Spain) is calculated by comparing six biological metrics associated with the phytoplankton, upstream and downstream of a series of large reservoirs that alter the natural river flow regime.


Specifications Table

Subject Area: Environmental Science
More specific subject area: Freshwater Ecology
Method name: Singular Value Decomposition Entropy
Name and reference of original method: Singular Value Decomposition Entropy
Resource availability: The method has been evaluated using experimental data from Sabater-Liesa et al., 2019 [20]

Method details

The term ‘resilience’, which first appeared in ecological science in 1973 [1], rapidly spread to other scientific domains such as engineering, economics, medicine and the social sciences. Since then, many alternative definitions of resilience have been proposed [2–7]. For the purposes of this article, we follow Holling’s seminal concept [1], which refers to the capacity of an ecosystem to cope with changing external conditions without losing its structural and functional characteristics. Despite its broad use, the definition and interpretation of resilience remain a matter of considerable debate in the literature [2–7], particularly when they must be applied to specific case studies. The difficulties are greatest when resilience has to be quantified. Although many methods have been reported, they tend to be strongly context-dependent, so that their application is feasible only for specific experiments or scenarios [2,3]. There is therefore a need for general methods of resilience quantification that are broadly applicable and suitable for routine use in field ecology.

Data gathered from environmental biological field monitoring typically consist of measurements of different variables spanning space and time, which are conveniently organized as data matrices. Extracting information from such matrices is a problem usually addressed with multivariate statistics [8]. Among the many techniques available, we focus here on singular value decomposition (SVD; see details below), which underlies many of the methods widely used in multivariate data analysis. SVD has also been applied successfully in a wide variety of scientific and technical domains, including signal and image processing, genomic analysis, weather forecasting, chemometrics, disease surveillance and big-data analysis [9–15].

Here we are specifically interested in characterizing the complexity of the data (organized in appropriate matrices of empirical measurements or derived metrics), which is assumed to reflect quantitatively the complexity of the system itself. In turn, the extent of the change in complexity between two situations or scenarios of a given system is proposed as a general quantitative empirical metric of resilience [16]. To that end, we make use of the so-called SVD entropy, which captures how the singular values (SVs) of the analyzed data matrix are distributed (see Fig. 1). SVD entropy has found applications in areas such as econometrics [9–11], genome expression data processing [12], image processing [13] and medical sciences [14,15].

Fig. 1. Flow-chart overview of the method showing the connection between a dataset of measurements and its complexity quantification in terms of SVD entropy. The workflow includes the following steps: (a) SVD of the monitoring dataset matrix A, which yields the set of singular values {λA} (Eqs. (1) and (2)); (b) Calculation of the SVD entropy H(A) (Eq. (4)), which is taken as the dataset complexity.

Singular Value Decomposition of a data matrix (SVD)

Briefly, this technique consists of decomposing any m × n matrix A (m ≥ n) into a product of three matrices:

$$A = U \Sigma V^{T} \tag{1}$$

where U is an (m × m) unitary matrix, Σ is an (m × n) rectangular diagonal matrix with non-negative real numbers on the diagonal, V is an (n × n) real or complex unitary matrix, and V^T denotes its transpose (its conjugate transpose when V is complex). The diagonal entries λi of Σ are known as the singular values of A. The columns of U and the columns of V are called the left-singular vectors and right-singular vectors of A, respectively.
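As a minimal sketch of Eq. (1), the decomposition can be checked numerically with NumPy. The 4 × 3 matrix below is made up purely for illustration and is not taken from the study; `numpy.linalg.svd` is one common implementation, not necessarily the one the authors used.

```python
import numpy as np

# Hypothetical 4 x 3 data matrix (m = 4, n = 3); values are illustrative only.
A = np.array([
    [2.0, 0.0, 1.0],
    [1.0, 3.0, 0.0],
    [0.0, 1.0, 4.0],
    [1.0, 1.0, 1.0],
])

# full_matrices=True returns U as (m x m) and Vt as (n x n), matching Eq. (1).
# The singular values s come back sorted in decreasing order.
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Embed the singular values in an (m x n) rectangular diagonal matrix Sigma.
Sigma = np.zeros(A.shape)
Sigma[:s.size, :s.size] = np.diag(s)

# The product U @ Sigma @ Vt should recover A up to floating-point error.
reconstruction_ok = np.allclose(U @ Sigma @ Vt, A)
print("singular values:", s)
print("A == U Sigma V^T:", reconstruction_ok)
```

Only the singular values `s` are needed for the entropy calculation that follows; U and V are computed here just to verify the factorization.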

SVD entropy

Following [10], a complexity measure of the dataset contained in matrix A can be defined from its set of singular values λi (i = 1, …, n) by means of a suitable ‘Shannon-type entropy’ [17] (Fig. 1). To do so, we first arrange the singular values λi in decreasing order and normalize them so that:

$$\bar{\lambda}_i = \frac{\lambda_i}{\sum_{j=1}^{n} \lambda_j} \tag{2}$$

with

$$\sum_{i=1}^{n} \bar{\lambda}_i = 1 \tag{3}$$

The SVD entropy of A, denoted H(A), is thus defined as:

$$H(A) = -\sum_{i=1}^{n} \bar{\lambda}_i \ln(\bar{\lambda}_i) \tag{4}$$

To allow comparison between matrices of different dimensions, H(A) is conveniently normalized by dividing by ln(n), the maximum value attainable by H(A). In this way, H(A) is bounded between 0 and 1:

$$H(A) = -\frac{1}{\ln(n)} \sum_{i=1}^{n} \bar{\lambda}_i \ln(\bar{\lambda}_i) \tag{5}$$
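Eqs. (2)–(5) can be sketched in a few lines of Python. The function name `svd_entropy` and its `normalize` flag are hypothetical choices made for this illustration. Two assumptions are made explicit in the comments: terms with a zero normalized singular value are skipped (the usual convention 0·ln 0 = 0), and n is taken as the number of singular values returned for the matrix.

```python
import numpy as np

def svd_entropy(A, normalize=True):
    """SVD entropy of a 2-D array, following Eqs. (2)-(5) (illustrative sketch)."""
    A = np.asarray(A, dtype=float)
    s = np.linalg.svd(A, compute_uv=False)   # singular values, decreasing order
    total = s.sum()
    if total == 0:
        raise ValueError("all singular values are zero; entropy is undefined")
    p = s / total                            # Eq. (2); the p_i sum to 1 (Eq. (3))
    nz = p[p > 0]                            # assumption: 0 * ln(0) is treated as 0
    H = -np.sum(nz * np.log(nz))             # Eq. (4)
    if normalize:
        n = s.size                           # assumption: n = number of singular values
        if n < 2:
            raise ValueError("normalization by ln(n) needs at least two singular values")
        H = H / np.log(n)                    # Eq. (5)
    return float(H)

# Two limiting cases: equal singular values versus a rank-one matrix.
print(svd_entropy(np.eye(3)))        # all singular values equal
print(svd_entropy(np.ones((4, 3))))  # a single non-zero singular value
```

The identity matrix has equal singular values and therefore sits at the upper bound of Eq. (5), whereas the rank-one matrix sits at the lower bound, up to floating-point error.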

Fig. 2 shows two hypothetical examples of singular values distribution with their respective entropies calculated using Eq. (5).

Fig. 2. Two hypothetical distributions of singular values (arbitrary scale) illustrating a low-entropy and a high-entropy (complexity) profile, with their respective entropies (calculated using Eq. (5)).
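The two limiting cases behind the low- and high-entropy profiles can be worked out directly from the normalized entropy. The derivation below is an illustrative sketch that assumes the usual convention 0·ln 0 = 0; it is not part of the original text.

```latex
% Perfectly even spectrum: \bar{\lambda}_i = 1/n for all i
H(A) = -\frac{1}{\ln(n)} \sum_{i=1}^{n} \frac{1}{n} \ln\!\left(\frac{1}{n}\right)
     = \frac{\ln(n)}{\ln(n)} = 1

% Fully concentrated spectrum: \bar{\lambda}_1 = 1, \bar{\lambda}_i = 0 for i > 1
H(A) = -\frac{1}{\ln(n)} \left[ 1 \cdot \ln(1) + \sum_{i=2}^{n} 0 \cdot \ln(0) \right] = 0
```

Any real dataset falls between these extremes: the more evenly the singular values are spread, the closer the normalized entropy is to 1.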

SVD entropy and resilience

For a given variable, the resilience quantification proposed here involves comparing two related scenarios, each characterized by its corresponding data matrix, in terms of the increase or decrease in dataset complexity as measured by SVD entropy. A lower entropy reflects a non-uniform distribution of the singular values λi and thus corresponds to low complexity of the underlying data; conversely, a higher SVD entropy indicates that the λi are more evenly distributed (Fig. 2).

A system that is able to maintain its complexity after a perturbation is qualified as ‘resilient’, while the opposite behavior indicates a lack of resilience. Let us consider a system in two states A and B, each characterized by the corresponding matrix of measurements or metrics of its variables. The difference in complexity between states A and B (expressed as the difference in their SVD entropies) can be related to the system’s resilience. Since high resilience is associated with small changes in data complexity, a suitable and general measure of resilience is given by the following equation:

$$\mathrm{Resilience}_{A,B} = 1 - \left| H(A) - H(B) \right| = 1 - \left| \Delta H \right| \tag{6}$$

Since the normalized entropy H (Eq. (5)) always lies between 0 and 1, this resilience index also lies between 0 and 1. Resilience equals 1 if H(A) = H(B), which corresponds to no change in complexity between scenarios A and B and thus to maximum resilience. Conversely, if H(A) = 1 and H(B) = 0 (or vice versa), resilience becomes 0, reflecting the maximum possible change in complexity between the scenarios compared. The whole process is summarized in Fig. 3.
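The resilience index can be sketched as a short function that takes the absolute difference of the two normalized entropies, in line with the 1 − |ΔH| heading used in Table 1. The names `normalized_svd_entropy` and `resilience` are hypothetical, and the two random 6 × 5 matrices are synthetic stand-ins for states A and B, not real monitoring data.

```python
import numpy as np

def normalized_svd_entropy(A):
    """Normalized SVD entropy (Eq. (5)); zero terms are skipped (0*ln 0 = 0 assumed)."""
    s = np.linalg.svd(np.asarray(A, dtype=float), compute_uv=False)
    p = s / s.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)) / np.log(s.size))

def resilience(A, B):
    """Resilience index of Eq. (6): 1 minus the absolute entropy difference."""
    return 1.0 - abs(normalized_svd_entropy(A) - normalized_svd_entropy(B))

# Synthetic example: two random 6 x 5 matrices standing in for states A and B.
rng = np.random.default_rng(42)
state_a = rng.random((6, 5))
state_b = rng.random((6, 5))

print("H(A)       =", normalized_svd_entropy(state_a))
print("H(B)       =", normalized_svd_entropy(state_b))
print("Resilience =", resilience(state_a, state_b))
```

By construction the index is symmetric in A and B and equals 1 when a state is compared with itself.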

Fig. 3. Flowchart for the calculation of resilience, comparing the variation of entropy (complexity) between two related data matrices. The workflow includes the following steps: (a) SVD of the two monitoring dataset matrices A and B, which yields the sets of singular values {λA} and {λB} (Eqs. (1) and (2)); (b) Calculation of the SVD entropies H(A) and H(B) (Eq. (4)), which are taken as the complexities of the respective datasets; (c) Calculation of the system’s resilience (Eq. (6)).

Method validation using a case study

The foregoing method was tested on a stretch of the Ebro River (NE Spain). The Ebro basin is located in the northeastern part of the Iberian Peninsula and occupies a total surface of 85,362 km². The main river is 910 km long and flows from the Cantabrian Mountains to the Mediterranean Sea. In terms of water flow, the Ebro is the largest river in the Iberian Peninsula (mean annual discharge 435 m³ s⁻¹). The middle course of the Ebro mainstream is affected by three consecutive large reservoirs, Mequinenza (1500 hm³), Riba-roja (210 hm³) and Flix (11 hm³) [18,19], which cause major changes in the hydromorphological dynamics (alteration of flood peaks, retention of sediments, etc.) that are reflected in the ecological status of the river. The purpose of our exercise was to quantify the system’s resilience by comparing the data measured upstream and downstream of the reservoirs.

The biological data used in the present study were published elsewhere [20–23]. Twelve sites located along the mid-lower course, from Zaragoza to the proximity of the river mouth, were selected (Fig. 4). The first six sites were located upstream of the reservoirs, while the remaining six were downstream. Six biological variables related to the phytoplankton were considered. They comprised metrics related to algal community structure (Shannon-Wiener diversity, number of species, cell density, biovolume and chlorophyll-a concentration) and function (alkaline phosphatase activity, APA). The datasets used can be found in [23].

Fig. 4. Area of study: the middle course of the Ebro River, showing the sampling points located upstream and downstream of the reservoirs.

We constructed two dataset matrices for every measured biological variable, one for the upstream sites and one for the downstream sites. Each matrix is a table of sites × time, handled as a rectangular matrix of m columns (m: number of spatial sites) and n rows (n: number of campaigns). Since a matrix and its transpose have the same singular values, the orientation chosen does not affect the singular values obtained. The method outlined above was applied to each of the six metrics considered. The main results are summarized in Table 1 and Figs. 5 and 6.
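The data arrangement described above can be sketched as follows. The values are synthetic random numbers, and the number of campaigns (8) is an assumption made only for this illustration, since it is not stated here. The twelve sites and the six/six upstream/downstream split follow the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sites = 12        # twelve sites, as in the case study
n_campaigns = 8     # hypothetical number of sampling campaigns (not given in the text)

# Synthetic long-format records for one metric: (site index, campaign index, value).
records = [
    (site, camp, rng.lognormal())
    for site in range(n_sites)
    for camp in range(n_campaigns)
]

# Pivot the records into a sites x campaigns table.
table = np.full((n_sites, n_campaigns), np.nan)
for site, camp, value in records:
    table[site, camp] = value

# Split into the two scenario matrices: first six sites upstream, the rest downstream.
upstream = table[:6, :]
downstream = table[6:, :]

# Singular values do not depend on whether sites are stored as rows or as columns.
sv = np.linalg.svd(upstream, compute_uv=False)
sv_transposed = np.linalg.svd(upstream.T, compute_uv=False)
orientation_irrelevant = np.allclose(sv, sv_transposed)
print(upstream.shape, downstream.shape, orientation_irrelevant)
```

Each metric would get its own pair of `upstream` and `downstream` matrices, which are then passed to the entropy and resilience calculations.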

Table 1. Complexity (singular value entropy) and resilience (complexity maintenance) results for the biological metrics considered, upstream (UP) and downstream (DOWN) of the reservoirs.

Metric               SVD entropy H (UP)   SVD entropy H (DOWN)   Resilience, 1 − |ΔH|
APA                  0.638                0.389                  0.751
Biovolume            0.685                0.538                  0.853
Cell Density         0.521                0.457                  0.936
Chlorophyll-a        0.525                0.723                  0.802
Diversity            0.596                0.562                  0.967
Number of Species    0.520                0.590                  0.930
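As a quick arithmetic check, the resilience column can be recomputed from the two entropy columns of Table 1. The numbers below are copied from the table, while the dictionary layout and variable names are illustrative choices.

```python
# Values copied from Table 1: metric -> (H upstream, H downstream, reported resilience).
table1 = {
    "APA":               (0.638, 0.389, 0.751),
    "Biovolume":         (0.685, 0.538, 0.853),
    "Cell Density":      (0.521, 0.457, 0.936),
    "Chlorophyll-a":     (0.525, 0.723, 0.802),
    "Diversity":         (0.596, 0.562, 0.967),
    "Number of Species": (0.520, 0.590, 0.930),
}

recomputed = {}
for metric, (h_up, h_down, reported) in table1.items():
    # Resilience index: 1 minus the absolute difference of the two entropies.
    recomputed[metric] = 1.0 - abs(h_up - h_down)
    print(f"{metric:<18} reported={reported:.3f} recomputed={recomputed[metric]:.3f}")

# Metrics whose entropy is higher upstream than downstream.
higher_upstream = [m for m, (h_up, h_down, _) in table1.items() if h_up > h_down]
print("higher entropy upstream:", higher_upstream)
```

The recomputed values agree with the reported ones to within 0.001. The only visible difference is diversity (0.966 recomputed against 0.967 reported), which is plausibly due to the entropies having been rounded to three decimals before tabulation.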

Fig. 5. Distribution of the singular values for the biological metrics considered, upstream and downstream of the reservoirs. Note that the scale of the vertical axis is different for each biological metric.

Fig. 6. Comparison of entropy (complexity) values between upstream and downstream sites for the different biological metrics studied.

The spectra (distributions) of the singular values used to calculate the entropies of the biological metrics considered are shown in Fig. 5. The entropies, calculated using Eq. (5), had values in the range 0.38–0.72 (i.e., 38% to 72% of the maximum value). Four of the six variables measured (all except chlorophyll-a and, to a lesser extent, the number of species) exhibited higher entropy (complexity) at the sites located upstream of the reservoirs (and thus subject to a more natural hydrologic regime) than at those located downstream (regulated regime) (Table 1, Fig. 5). Resilience was quantified using Eq. (6) for the six biological metrics studied. The values obtained were in the range 0.75–0.97, the extremes corresponding to APA and diversity, respectively. The high resilience values obtained for diversity (0.967) and the number of species (0.930) are also apparent from the closeness of the upstream and downstream singular value distributions shown in Fig. 5e and f. Altogether, the medium to high resilience values indicate that the system is likely capable of recovering its complexity after the perturbation caused by the reservoirs, at least for the six variables examined. A deeper discussion and interpretation of the results can be found in [23].

In summary, the foregoing example highlights the generality and broad applicability of the proposed method of resilience quantification consisting of comparing the complexity of two data blocks (matrices) in terms of their respective singular value entropy.

Acknowledgments

This study has been financially supported by the EU FP7 project GLOBAQUA [Grant Agreement No. 603629] and by the Generalitat de Catalunya [Consolidated Research Groups: 2017 SGR 01404-Water and Soil Quality Unit].

Contributor Information

Antoni Ginebreda, Email: agmqam@cid.csic.es.

Laia Sabater-Liesa, Email: laia.sabater@idaea.csic.es.

Damià Barceló, Email: Dbcqam@cid.csic.es.

References

  • 1. Holling C.S. Resilience and stability of ecological systems. Annu. Rev. Ecol. Syst. 1973;4(1):1–23.
  • 2. Myers-Smith I.H., Trefry S.A., Swarbrick V.J. Resilience: easy to use but hard to define. Ideas Ecol. Evol. 2012;5
  • 3. Todman L. Defining and quantifying the resilience of responses to disturbance: a conceptual and modelling approach from soil science. Sci. Rep. 2016;6:28426. doi: 10.1038/srep28426.
  • 4. Desjardins E. Promoting resilience. Q. Rev. Biol. 2015;90(2):147–165. doi: 10.1086/681439.
  • 5. Xu L., Marinova D., Guo X. Resilience thinking: a renewed system approach for sustainability science. Sustain. Sci. 2015;10(1):123–138.
  • 6. Hodgson D., McDonald J.L., Hosken D.J. What do you mean, ‘resilient’? Trends Ecol. Evol. 2015;30(9):503–506. doi: 10.1016/j.tree.2015.06.010.
  • 7. Mumby P.J. Ecological resilience, robustness and vulnerability: how do these concepts benefit ecosystem management? Curr. Opin. Environ. Sustain. 2014;7:22–27.
  • 8. Legendre P., Legendre L. Numerical Ecology. 2nd English edition. Elsevier; Amsterdam: 1998.
  • 9. Caraiani P. The predictive power of singular value decomposition entropy for stock market dynamics. Phys. A Stat. Mech. Appl. 2014;393:571–578.
  • 10. Gu R., Shao Y. How long the singular value decomposed entropy predicts the stock market?—Evidence from the Dow Jones industrial Average Index. Phys. A Stat. Mech. Appl. 2016;453:150–161.
  • 11. Gu R., Xiong W., Li X. Does the singular value decomposition entropy have predictive power for stock market?—Evidence from the Shenzhen stock market. Phys. A Stat. Mech. Appl. 2015;439:103–113.
  • 12. Alter O., Brown P.O., Botstein D. Singular value decomposition for genome-wide expression data processing and modeling. Proc. Natl. Acad. Sci. 2000;97(18):10101–10106. doi: 10.1073/pnas.97.18.10101.
  • 13. Sadek R.A. SVD based image processing applications: state of the art, contributions and research challenges. arXiv. 2012;3(7):26–34. preprint arXiv:1211.7102.
  • 14. Li S.-y. Analysis of heart rate variability based on singular value decomposition entropy. J. Shanghai Univ. (Engl. Ed.) 2008;12(5):433.
  • 15. Sabatini A. Analysis of postural sway using entropy measures of signal complexity. Med. Biol. Eng. Comput. 2000;38(6):617–624. doi: 10.1007/BF02344866.
  • 16. Parrott L. Measuring ecological complexity. Ecol. Indic. 2010;10(6):1069–1076.
  • 17. Shannon C.E. A mathematical theory of communication. Bell Syst. Tech. J. 1948;27(3):379–423.
  • 18. Prats J., Val R., Armengol J., Dolz J. Temporal variability in the thermal regime of the lower Ebro River (Spain) and alteration due to anthropogenic factors. J. Hydrol. 2010;387:105–118.
  • 19. Romaní A.M., Sabater S., Muñoz I. The physical framework and historic human influences in the Ebro River. In: Barceló D., Petrovic M., editors. The Ebro River Basin. Springer; Berlin, Heidelberg: 2011. pp. 1–20.
  • 20. Artigas J. Phosphorus use by planktonic communities in a large regulated Mediterranean river. Sci. Total Environ. 2012;426:180–187. doi: 10.1016/j.scitotenv.2012.03.032.
  • 21. Sabater S. Longitudinal development of chlorophyll and phytoplankton assemblages in a regulated large river (the Ebro River). Sci. Total Environ. 2008;404(1):196–206. doi: 10.1016/j.scitotenv.2008.06.013.
  • 22. Tornes E. Reservoirs override seasonal variability of phytoplankton communities in a regulated Mediterranean river. Sci. Total Environ. 2014;475:225–233. doi: 10.1016/j.scitotenv.2013.04.086.
  • 23. Sabater-Liesa L., Ginebreda A., Barceló D. Shifts of environmental and phytoplankton variables in a regulated river: a spatial-driven analysis. Sci. Total Environ. 2018;642:968–978. doi: 10.1016/j.scitotenv.2018.06.096.

