PLOS One. 2019 Dec 23;14(12):e0226547. doi: 10.1371/journal.pone.0226547

Analysis of 13,312 benthic invertebrate samples from German streams reveals minor deviations in ecological status class between abundance and presence/absence data

Dominik Buchner 1,#, Arne J Beermann 1,2, Alex Laini 3, Peter Rolauffs 4, Simon Vitecek 5,6, Daniel Hering 2,4, Florian Leese 1,2,*,#
Editor: Fabrizio Frontalini 7
PMCID: PMC6927632  PMID: 31869356

Abstract

Benthic invertebrates are the organisms most commonly used to assess ecological status as required by the EU Water Framework Directive (WFD). For WFD-compliant assessments, benthic invertebrate communities are sampled, identified and counted. Taxa × abundance matrices are used to calculate indices and the resulting scores are compared to reference values to determine the ecological status class. DNA-based tools, such as DNA metabarcoding, provide a new and precise method for species identification but cannot deliver robust abundance data. To evaluate the applicability of DNA-based tools to ecological status assessment, we evaluated whether the results derived from presence/absence data are comparable to those derived from abundance data. We analysed benthic invertebrate community data obtained from 13,312 WFD assessments of German streams. Broken down to 30 official stream types, we compared assessment results based on abundance and presence/absence data for the assessment modules “organic pollution” (i.e., the saprobic index) and “general degradation” (a multimetric index) as well as their underlying metrics.

In 76.6% of cases, the ecological status class did not change after transforming abundance data to presence/absence data. In 12% of cases, the status class was reduced by one (e.g., from good to moderate), and in 11.2% of cases, the class increased by one. In only 0.2% of cases, the status shifted by two classes. Systematic stream type-specific deviations were found and differences between abundance and presence/absence data were most prominent for stream types where abundance information contributed directly to one or several metrics of the general degradation module. For a single stream type, these deviations led to a systematic shift in status from ‘good’ to ‘moderate’ (n = 201; with only n = 3 increasing). The systematic decrease in scores was observed, even when considering simulated confidence intervals for abundance data. Our analysis suggests that presence/absence data can yield similar assessment results to those for abundance-based data, despite type-specific deviations. For most metrics, it should be possible to intercalibrate the two data types without substantial efforts. Thus, benthic invertebrate taxon lists generated by standardised DNA-based methods should be further considered as a complementary approach.

Introduction

Status assessment of freshwater ecosystems is frequently performed with biological indicators. They are of particular importance in Europe, where the Water Framework Directive (Directive 2000/60/EC; WFD) requires EU member states to achieve ‘good ecological status’ for all water bodies by 2027, defined as a ‘slight deviation from undisturbed conditions.’ Ecological status is determined based on biological quality elements (BQEs), i.e., organism groups that reflect aquatic ecosystem integrity by responding to various pressures rather than the intensity of a single pressure. For rivers, the aquatic flora (phytobenthos and macrophytes), fish and, most frequently, benthic invertebrates are monitored. Assessment methods are defined for individual stream types (ST), defined by their sizes, ecoregions, or catchment geology, differing in their biota and resilience to stress [1]. The assessment procedure typically involves the standardised sampling of the BQE community, enumeration of taxa and estimating taxon abundance. Based on the resulting taxa lists, metrics are calculated and compared to reference values derived from undisturbed reference sites or through a modelling approach [2]. The resulting score is then translated into an ecological status class (ESC) of high, good, moderate, poor, or bad.

In compliance with the legal requirement of the WFD, all EU member states have developed nationwide assessment tools to monitor ecological status. While the same principles are applied across the entire EU, details of the assessment systems differ among member states. These differences are rooted in local monitoring traditions, different pressures affecting the water bodies, and biogeography. Assessment systems of different countries have been intercalibrated to enable comparisons of results [3].

The majority of national assessment systems rely on multimetric indices, i.e., combinations of quantitative or qualitative descriptors of a certain aspect of an ecosystem based on the taxon list, such as the number of taxa, share of sensitive species, or abundance of an indicator group [1]. Though the WFD requires the use of taxon abundances, the individual metrics do not necessarily use raw abundance data; many are based on taxon number, presence/absence of taxa, or abundance classes.

The surge in DNA-based community assessment [4,5] has raised questions about whether DNA-based identification can supersede morphological identification procedures and be used for assessment systems under the WFD. The DNA-based characterisation of biotic communities can uncover diversity patterns at high taxonomic resolution [6,7]. However, it is generally appreciated that PCR-based approaches, such as metabarcoding, cannot deliver reliable absolute abundance data for Metazoa [8,9], although strong and positive correlations between read number and biomass are sometimes found (e.g. [7,10,11]). To a lesser degree, the same has been found for single-celled organisms [12,13]. Presence/absence data alone, as inferred reliably using DNA-based tools, are incompatible with WFD requirements. However, given the benefits of molecular biomonitoring tools in terms of taxonomic, spatial and temporal resolution, it seems worthwhile exploring the use of presence/absence data to infer established indices. While we are well aware that these tools were not designed for presence/absence data, we think that the implementation of biomonitoring 2.0 could be achieved by two different strategies [14]. First, molecular data could be used to simply replace the classical taxa × abundance matrices with taxa × presence/absence matrices obtained by the molecular characterisation of communities. Second, an entirely new set of metrics and indices based on molecular data could be trained and calibrated to determine pressure gradients and human impact. While rebuilding the entire biomonitoring toolset, a formidable challenge, is likely the best way to deliver more specific assessments at higher resolution, its implementation will be costly and may thus be less attractive for policy-makers, requiring fundamental changes in legislation. 
Therefore, the ability to simply adapt existing tools to presence/absence data (which can be obtained quickly and reliably using molecular tools, such as metabarcoding) should be gauged [15,16]. Some studies of freshwater invertebrates [17,18] as well as case studies of marine invertebrates [19] have suggested a general coherence of assessment results between the two data types. However, systematic and large-scale studies directly comparing abundance and presence/absence data are scarce.

We performed the first nationwide comparison of river ESCs calculated with abundance data and with presence/absence data. Specifically, we hypothesise that (i) there is general congruence between metric or module results based on abundance and presence/absence data and (ii) the proportion of metrics that use raw abundance data determines how well presence/absence-based assessment results (i.e., the assignment of ESCs) correspond with the results of an abundance-based approach. We used a large dataset including over 13,000 samples from German WFD compliance monitoring using the PERLODES system. PERLODES includes metrics based on presence/absence, raw abundance and abundance classes, and is thus representative of a variety of approaches developed for BQEs also in other countries [1]. In addition to hypothesis testing, we describe type- and class-specific mismatches between assessment results.

Materials and methods

German stream assessment methodology

The German assessment system PERLODES using benthic invertebrates [20,21] was selected for our analysis. It is based on a national river typology with 30 river types. For each river type, the assessment system uses two modules, each of which provides an ESC classification: organic pollution module (OPM; based on a single metric, the saprobic index, SI) and general degradation module (GDM; integrating three to five metrics, depending on stream type (ST), see S1 Table). The GDM reflects the effects of various pressures, particularly habitat degradation, on the benthic invertebrate fauna. A core element of this module is the German fauna index (GFI), which relies on the abundance of specialist indicator taxa that mainly occur in near-natural or degraded habitats. It always accounts for 50% of the module’s result and is accompanied by two to four additional metrics, one of which, in most cases, is the proportion of Ephemeroptera, Plecoptera and Trichoptera specimens (EPT [%]). A third module, acidification (AM), was only applied to two STs, where the recovery process from previous acidification is ongoing. The final ESC is defined by the worst assessment result based on the OPM and the GDM and receives a classifier from high (ESC1) to bad (ESC5).
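The two combination rules described above can be sketched in a few lines of Python. This is only an illustration and not the PERLODES/ASTERICS implementation: the function names are hypothetical, metric scores are assumed to be scaled to 0–1, and the non-GFI metrics are assumed to share the remaining 50% equally.

```python
from statistics import mean

def gdm_score(gfi_score, other_scores):
    """Combine metric scores (assumed 0..1): the GFI carries 50%,
    the remaining metrics share the other 50% equally (assumption)."""
    return 0.5 * gfi_score + 0.5 * mean(other_scores)

def final_esc(opm_class, gdm_class):
    """Classes run from 1 (high) to 5 (bad); the worse (larger) class wins."""
    return max(opm_class, gdm_class)

# Hypothetical site: OPM says 'good' (2), GDM says 'moderate' (3)
example_class = final_esc(2, 3)
example_score = gdm_score(0.8, [0.4, 0.6])
```

Because the final class is a worst-case rule, a shift in either module is enough to change the overall result.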

The metrics of the PERLODES system use different components of the underlying taxa × abundance list. While some metrics (e.g., the number of Trichoptera species) use (i) raw taxa numbers without abundance, others (e.g., the share of taxa preferring the hyporhithral zone) use (ii) raw taxa × abundance matrices, and others (e.g., the SI) use (iii) taxon × abundance class matrices (see S1 Table; for details on metric and module calculation as well as abundance classes consult [22]).

Data source and permissions

Access to biological monitoring data was granted by nine of the largest German federal states (Bavaria, Baden-Wuerttemberg, Hesse, Lower Saxony, North Rhine-Westphalia, Saxony, Saxony-Anhalt and Schleswig-Holstein) through the LAWA (German expert Working Group on water issues of the Federal States). The total data set contained 13,401 samples obtained from monitoring sites in 1985–2013, covering 30 STs [23].

Data processing

To compare assessment results based on abundance or presence/absence data, raw data were used untransformed as well as transformed to presence/absence data for analyses in ASTERICS v. 4.04 [22]. Abundances in the raw data were replaced by either 1 (presence) or 0 (absence) for this transformation. Of the 13,401 samples, 89 had to be excluded owing to errors in the calculation of indices using ASTERICS, resulting in a total of 13,312 taxon lists. From the ASTERICS output, the results for the three modules (OPM, GDM and [if relevant] AM) and the final ESC as well as all relevant metrics for each ST (S1 Table) were exported to a .csv file. Data were extracted from this output file using a custom Python script (S1 File).
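The transformation itself is simple; a minimal Python sketch is given below. It is not the script from S1 File: the function name is hypothetical and the taxon counts are invented for illustration.

```python
def to_presence_absence(taxa_abundance):
    """Replace every abundance by 1 (present) or 0 (absent)."""
    return {taxon: 1 if abundance > 0 else 0
            for taxon, abundance in taxa_abundance.items()}

# Invented example sample (counts are not from the study data)
sample = {"Baetis rhodani": 120, "Gammarus pulex": 43, "Hydropsyche siltalai": 0}
pa = to_presence_absence(sample)
```

Both versions of each taxon list would then be passed through the same index calculation.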

Statistical analysis

i) Correlation and slope analysis

The correlations between abundance and presence/absence module results, individual metric results and final ESC were determined. Spearman’s rank correlation coefficients were obtained using R [cor.test] (https://www.r-project.org) because data were not normally distributed. A linear regression of presence/absence (y) on abundance (x) data was performed to test for systematic deviations, as indicated by slopes deviating from 1 (see S1 Fig) as well as Spearman’s ρ values.
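For readers who want to reproduce the two statistics without R, the following pure-Python sketch computes Spearman’s ρ (as the Pearson correlation of tie-averaged ranks) and the least-squares slope of y on x. It is an illustration only: the analysis in this study used R, no p-value is computed here, and the example values are invented.

```python
def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman's rho = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def ols_slope(x, y):
    """Slope of the least-squares line y = a + b * x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Invented metric values: abundance-based (x) vs presence/absence-based (y)
abundance_based = [1.0, 2.0, 3.0, 4.0]
presence_based = [1.5, 2.0, 2.5, 3.0]
rho = spearman(abundance_based, presence_based)
slope = ols_slope(abundance_based, presence_based)
```

In this toy example the ranking is preserved perfectly while the slope is below 1, which is the pattern a compressed value range would produce.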

ii) ESC and metric deviations

To quantify deviations between assessment results for the three modules and the final ESC from abundance data, a custom python script summarised cases that remained unchanged (0) or deviated by one, two, or three classes. This was performed for the entire dataset and for discrete ESCs to test if sites of a certain ESC were more prone to changes (e.g., due to differences in taxa numbers). We focussed on shifts between ‘good’ and ‘moderate’ ESCs, as systematic deviations from these classes would confound efforts to attain a ‘good’ ESC, as requested by the WFD.

iii) Deviation analysis

Deviations between abundance-based and presence/absence-based ESC assessments were predicted to be most severe in STs where the GDM largely relies on metrics that directly use abundance data (see S1 Table). To test this hypothesis, we estimated the contribution of abundance-based metrics to the assessment result (CAB) and related this to the cumulative percentages of mismatches between abundance-based and presence/absence-based assessment results. As the GFI always contributes 50% to the GDM, the contribution of any other metric (Cmetric) besides the GFI was defined as follows:

$$C_{\mathrm{metric}} = \frac{0.5}{\text{individual metrics}} \qquad (1)$$

Cmetric had values of 12.5%, 16.7%, 25.0%, or 100% (ST10 and ST20) depending on the number of metrics used for the calculation (S1 Table). We then multiplied this value by the number of metrics that actually relied on raw abundance data (e.g. two of the five metrics for ST16, i.e. Lit% and Pel%, see S1 Table) to obtain CAB, which had values of 0%, 16.7% or 25%. Finally, we tested for a correlation between the magnitude of the deviation in the ESC based on abundance or presence/absence data and CAB per ST using Spearman’s rank correlation.
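As a worked illustration of this arithmetic, the sketch below reproduces the values quoted above. The function names are hypothetical. It assumes that the 50% not carried by the fauna index is split equally among the other metrics, and that stream types whose module consists of a single metric (ST10 and ST20) are treated as a special case of 100%.

```python
def c_metric(n_other_metrics):
    """Contribution of one metric besides the fauna index.
    n_other_metrics: number of GDM metrics other than the fauna index.
    A module with a single metric in total is treated as 100% (assumption)."""
    if n_other_metrics == 0:
        return 1.0
    return 0.5 / n_other_metrics

def c_ab(n_other_metrics, n_raw_abundance_metrics):
    """Share of the GDM result that rests on raw abundance data."""
    return c_metric(n_other_metrics) * n_raw_abundance_metrics

# ST16: five metrics in total (fauna index + four others), two use raw abundances
st16_c_ab = c_ab(4, 2)
```

For ST16 this gives 2 × 12.5% = 25%, the same share as a single raw-abundance metric in a three-metric module.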

The workflow for the analysis is summarised in Fig 1. All data are available on request.

Fig 1. Overview of the study workflow.

Simulation of taxa lists

Within-site patchiness of macroinvertebrate communities can bias abundance estimates inferred by using standard sampling protocols. To test if the deviations in ESC estimates based on abundance or presence/absence data could be the result of non-representative sampling, a simulation was performed accounting for uncertainties in abundance estimates: for a subset of sampling sites, we produced 1000 replicates each by drawing abundances from a zero-truncated Poisson distribution while retaining the original taxa list. The susceptibilities of SI, EPT [%] and GFI to within-site variability of abundances were thus tested in 627 sampling sites in which a change in ESC from ‘good’ to ‘moderate’ was observed in our previous analyses. For these, metrics were calculated using a custom Python script, as ASTERICS cannot handle large amounts of data. For each metric, a confidence interval spanning the 2.5th and 97.5th percentile was computed from the simulated data using another custom Python script and functions from the numpy package. Results were plotted in R using the ggplot2 package.
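The simulation logic can be sketched as follows, using only the Python standard library. This is not the script used in the study: it assumes that each observed abundance serves as the Poisson mean, uses a simplified EPT [%] helper as the example metric, and works on an invented taxa list.

```python
import math
import random

def zero_truncated_poisson(lam, rng):
    """Draw from Poisson(lam) conditioned on k >= 1 by rejecting zeros.
    Uses Knuth's multiplication method, so exp(-lam) underflows for very
    large lam; adequate for a sketch with moderate abundances."""
    if lam <= 0:
        raise ValueError("lam must be positive (taxa list holds present taxa only)")
    threshold = math.exp(-lam)
    while True:
        k, p = 0, rng.random()
        while p > threshold:
            k += 1
            p *= rng.random()
        if k >= 1:
            return k

def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in 0..100) of a sorted list."""
    pos = (len(sorted_values) - 1) * q / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)

def ept_percent(abundances, ept_taxa):
    """Simplified EPT [%]: share of individuals belonging to EPT taxa."""
    total = sum(abundances.values())
    return 100 * sum(n for t, n in abundances.items() if t in ept_taxa) / total

def simulate_ci(abundances, metric, n_replicates=1000, seed=0):
    """2.5th and 97.5th percentile of a metric over resampled abundances;
    the taxa list itself is kept fixed."""
    rng = random.Random(seed)
    values = sorted(
        metric({t: zero_truncated_poisson(n, rng) for t, n in abundances.items()})
        for _ in range(n_replicates)
    )
    return percentile(values, 2.5), percentile(values, 97.5)

# Invented site; taxon names and counts are illustrative only
site = {"Baetis": 40, "Ephemera": 5, "Gammarus": 30, "Asellus": 10}
ept = {"Baetis", "Ephemera"}
low, high = simulate_ci(site, lambda a: ept_percent(a, ept))
```

A presence/absence-based metric value can then be compared against the interval `(low, high)` to see whether it lies within the range expected from sampling variation alone.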

Results

Overall congruence between abundance and presence/absence ESC estimates

Ecological status class (ESC) estimates inferred from presence/absence data and abundance data were congruent in 76.6% of all cases (10,191 of 13,312 comparisons; Fig 2, Table 1). Different ESCs were observed in 23.4% of all cases; 12% of presence/absence ESCs were one class lower than originally inferred (i.e., a shift towards a worse ecological status) and 11.2% of presence/absence ESCs were one class higher (i.e., a shift towards a better ecological status). Differences spanning more than one ESC were found in only 0.2% of cases, including 19 sites classified as worse and 15 sites classified as better by two ESCs.

Fig 2. Deviations in ecological status class (ESC) estimates obtained from presence/absence data using ASTERICS for 13,312 samples from 30 German stream types (y-axis, numbers in parentheses = data points).

Green indicates the proportion of ESCs that remained identical, yellow indicates those that differed by ±1 and orange indicates those that differed by ±2 ESCs. Deviations in the negative (worse) direction are shown on the left and deviations towards a more positive assessment (better) are shown on the right side.

Table 1. Deviation between presence/absence and abundance data for different assessment metrics.

The diagonal displays the numbers of sampling sites for which the results did not differ between the two data types.

Ecological status class

Abundance data    Presence/absence data
                  1      2      3      4      5
1                 247    103    1      0      0
2                 91     2922   627*   7      0
3                 1      420    2564   523    11
4                 0      5      532    2187   348
5                 0      0      9      443    2271

Organic pollution

Abundance data    Presence/absence data
                  1      2      3      4      5
1                 821    147    0      0      0
2                 165    8758   307    0      0
3                 0      428    2597   5      0
4                 0      0      49     31     1
5                 0      0      0      2      1

General degradation

Abundance data    Presence/absence data
                  1      2      3      4      5
1                 698    263    3      0      0
2                 162    2295   616    6      0
3                 2      426    2521   517    7
4                 0      5      534    2175   344
5                 0      0      9      445    2267

* indicates the number of stream assessments that would fall outside the threshold criterion (i.e. at least ESC = 2) after transformation. The high deviation is due to stream type 16.
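The percentages reported for the final ESC can be recomputed directly from the first block of Table 1. The sketch below tallies class shifts from that matrix. It is an illustration rather than the study's summary script, and the sign convention (positive = worse class under presence/absence data) is chosen here for convenience.

```python
from collections import Counter

# Final ESC block of Table 1.
# Rows: abundance-based class 1..5; columns: presence/absence-based class 1..5.
esc_matrix = [
    [247, 103, 1, 0, 0],
    [91, 2922, 627, 7, 0],
    [1, 420, 2564, 523, 11],
    [0, 5, 532, 2187, 348],
    [0, 0, 9, 443, 2271],
]

def tally_shifts(matrix):
    """Count samples by class shift (column index minus row index).
    Positive shifts mean a worse class with presence/absence data."""
    shifts = Counter()
    for i, row in enumerate(matrix):
        for j, n in enumerate(row):
            shifts[j - i] += n
    return shifts

shifts = tally_shifts(esc_matrix)
total = sum(shifts.values())
for shift in sorted(shifts):
    print(f"{shift:+d}: {shifts[shift]:5d} ({100 * shifts[shift] / total:.1f}%)")
```

The diagonal sums to 10,191 of 13,312 samples (76.6%), with 1,601 samples one class worse and 1,486 one class better.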

We observed the strongest (>20%) bias towards a lower status class inferred from presence/absence data for ST16 (small gravel-dominated lowland rivers; 37%, n = 1300), ST21_N (lake outflows, lowlands; 33%, n = 97), ST21_S (lake outflows, highlands; 31%, n = 16) and ST1.1 (small and mid-sized rivers of the Calcareous Alps; 23%, n = 26) (Fig 2). By contrast, STs most often (>20%) assigned to one or two status classes higher were ST10 (very large gravel-dominated rivers; 34%, n = 169), ST20 (very large sand-dominated rivers; 29%, n = 295) and ST7 (small coarse substrate-dominated calcareous highland rivers; 25%, n = 681). The lowest overall deviations in ESC estimates were observed in ST1.2 (large rivers of the Calcareous Alps; 0% deviation, n = 23) and ST3.2 (mid-sized rivers in the Pleistocene sediments of the alpine foothills; 8% deviation, n = 52).

As predicted, ESC estimates in presence/absence-based and abundance-based assessments were highly congruent in all STs, as reflected by a significant correlation for all STs (p < 0.001; mean Spearman’s ρ = 0.86, range 0.72–1, see S2 Table for details).

Responses of individual modules

Similar to the results obtained for final ESCs, general degradation module (GDM) results based on presence/absence and abundance data were identical in 74.9% (n = 9956) of cases (Fig 3). GDM results decreased by one (-1) class in 13.1% (n = 1740) and by two classes (-2) in 0.1% of cases (n = 12), and increased by one (+1) and two (+2) classes in 11.8% (n = 1567) and 0.1% (n = 12) of all cases, respectively. Similarly, the highest deviations (>20%) in the GDM results were observed, with a single exception, in STs where ESC estimates showed the highest deviations (one and two status classes lower: ST16 (38%), ST21_S (38%), ST21_N (33%), ST1.1 (27%), ST17 (25%), ST9.1 (24%) and ST15 (22%); one and two status classes higher: ST10 (34%), ST20 (31%) and ST7 (26%)). The most congruent results were obtained for ST1.2 in which only 9% of all GDM results decreased by one class. GDM results were responsible for 95.06% and 95.57% of the observed shifts in ESC, i.e. of similarly great importance for both data sets.

Fig 3. Deviations in general degradation module (GDM) results calculated from presence/absence data using ASTERICS for 13,295 samples from 30 German stream types (y-axis, numbers in parentheses = data points).

Green indicates assessment values that remained identical, yellow indicates those that differed by ±1 and orange indicates those that differed by ±2 classes. Deviations in the negative (worse) direction are shown on the left and deviations towards a more positive assessment (better) are shown on the right side.

Deviations in the organic pollution module (OPM) were moderate, with 91.7% congruent results (Fig 4). For ST1.1, ST1.2 and ST4, results were identical; the highest deviations in the OPM were observed in ST22 (marshland streams of the coastal plains; 18% of results differed).

Fig 4. Deviations in organic pollution module results calculated with presence/absence data using ASTERICS for 13,312 samples from 30 different German stream types (y-axis, numbers in parentheses = data points).

Green indicates assessment values that remained identical, yellow indicates values that differed by ±1 and orange indicates values that differed by ±2 classes. Deviations in the negative (worse) direction are shown on the left and deviations towards a more positive assessment (better) are shown on the right side.

Regression slopes describing relationships between abundance-based and presence/absence-based metric results for the two most relevant GDM metrics (i.e., GFI and EPT [%]; both used in 28 of 30 STs) generally deviated from 1, where a low ecological quality corresponded with higher scores (Fig 5A and 5B). Similarly, regression slopes describing relationships between abundance-based and presence/absence-based OPM results were, on average, less than 1 (mean = 0.922, min = 0.83 for ST3.2 and max = 1.1 for ST1.2, see Fig 6).

Fig 5. Regression analysis of presence/absence (y-axis) and abundance-based data (x-axis) for two metrics of the general degradation module: A) German fauna index, B) EPT [%].

Fig 6. Regression analysis of presence/absence (y-axis) and abundance-based data (x-axis) for the German saprobic index (organic pollution module, OPM).

General patterns of metric congruence in relation to data characteristics

As expected, metrics that use presence/absence of taxa (e.g. number of Trichoptera taxa) were perfectly correlated when calculated with presence/absence data (Spearman’s ρ = 1, S3 Table). Metrics that use abundance classes, such as the GFI, saprobic index, or rheo index, showed generally strong and significant correlations (mean Spearman’s ρ = 0.93, range: 0.89–0.96) between both data types. We observed the weakest correlations for metrics that rely on raw abundance data, e.g., the relative proportion of individuals that prefer the epirhithral, metarhithral, or hyporhithral zones (mean Spearman’s ρ = 0.60, range: 0.41–0.79).

We found a significant, positive correlation between the relative contribution of abundance data to GDM results and the magnitude of the deviation (Spearman’s ρ = 0.52, p = 0.004, S2 Fig).

Class boundary deviations

We found that presence/absence-based ESC estimates differed in 23% of the observed cases. Of major practical importance are shifts from ESC 2 (‘good’) to ESC 3 (‘moderate’, i.e. not meeting the requirements of the WFD). We found that 627 ‘good’ ESCs were classified as ‘moderate’, and 420 cases originally classified as ‘moderate’ were recovered as ‘good’ when using presence/absence data (Table 1). The highest proportion of cases showing a shift from ‘good’ to ‘moderate’ belonged to ST16 (small gravel-dominated lowland rivers) with 201 misclassified cases (S4 Table). We observed frequent misclassifications of ‘bad’ ESCs as ‘poor’ (n = 445) as well as vice versa (n = 348).

Detailed analyses based on simulated abundance data for the 627 cases where presence/absence data suggested ‘moderate’ instead of the original ‘good’ ESC indicate that this misclassification is due to the GDM but not the OPM (S3 Fig). There was also substantial overlap in the simulation-derived confidence intervals of assessment results and assessment results based on presence/absence-based data, despite using a very conservative Poisson-distributed simulation approach.

Discussion

General deviation patterns

In agreement with hypothesis (i), our analysis revealed strong congruence between abundance and presence/absence data for most modules and metrics. The slope of the regression, however, was <1.0 for all stream types (STs) with abundant sampling data, i.e., n > 100 data points. This indicates that the metric value spectrum underlying ecological status class (ESC) calculation is narrower, i.e., biased towards intermediate values, when using presence/absence data. Deviation of ESC was also substantially greater when metrics in the general degradation module (GDM) relied to a larger degree on raw abundance data (S2 Fig).

Over 75% of the ESCs remained unchanged after the transformation of abundance data to presence/absence data, while less than 25% of cases were classified as one ESC lower or higher. This shift can typically be explained by changes in the GDM while the organic pollution module (OPM) is more robust. Also, our simulation experiments demonstrate that the GDM is more sensitive to variation in abundance data: in about 95% of cases, the ESC is determined by the GDM. In particular, the systematic shift from good to moderate (ESC2 to ESC3) has immense policy implications because the latter indicates a failure to comply with the WFD requirements. Therefore, we analysed this class boundary in detail and simulated Poisson-distributed confidence intervals for abundance data. While about 50% of the presence/absence data fell within the confidence interval, a high percentage of data points were in a lower category (S3 Fig). The observed deviation suggests that there are, on average, slightly greater differences between abundance-based and presence/absence-based assessments than the difference in results obtained by two independent investigators performing morphological identification (about 16% [24]).

Specific deviation patterns

We found a notable bias when comparing presence/absence and abundance data for ST16 (small gravel-dominated lowland rivers), for which 1,300 data points were available. This is entirely due to the GDM, which is composed of five metrics for ST16 (S3 Table). While strong positive correlations between data types were found for the GFI, the number of Trichoptera taxa and EPT [%], the results for proportions of pelal-inhabiting (mud-dwelling) and littoral taxa (typical for shallow areas in lentic waters) were poorly correlated (ρ = 0.41 and 0.56). This can be explained by the large number of mud-dwelling and littoral taxa, albeit with low abundances, and the fact that both metrics use raw abundance data (S3 Table, red shading). As presence/absence data equalise proportions, these taxa disproportionately contribute to the index, consistent with the expectations of hypothesis (ii). High abundances of mud-dwelling and littoral taxa are considered atypical and/or bad for this ST, indicating a strong impact of flow modification and fine sediment entry. Transformation to presence/absence data systematically upweighted the importance of the low-abundance mud-dwelling and littoral taxa, leading to the systematically lower scores. We observed the same effect for ST21_N and ST21_S (lake outlets), for which only three core metrics are used to calculate the GDM (EPT [%], lake outlet typology index [equivalent to the GFI] and proportion of phytal taxa). However, the sample sizes were very low for these STs. Although only the proportion of phytal taxa (macrophyte-associated taxa) uses raw abundance data, these STs have few core metrics (S3 Table), so this single metric accounts for 25% of the overall assessment result, which equals the impact of both abundance-based metrics combined in ST16. Hypothesis (ii) is further supported by the results for ST18 (small loess and loam-dominated lowland rivers), where 19% of the cases had a lower ESC, which is about half as many as in the other types analysed. 
For this ST, the proportion of littoral taxa is also used to calculate the GDM, but here four core metrics are used in the calculation and therefore the impact is only 12.5% (with 50% always owed to the GFI), considerably alleviating the effect. The high deviation in ST1.1 (23%) might be related to the low number of data points (n = 26).
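The upweighting of low-abundance taxa described above can be made concrete with a small numerical example. The taxa and counts below are invented and the helper function is hypothetical; the point is only to show how a proportion metric changes when every present taxon counts as 1.

```python
def share(values, group):
    """Percentage of the summed values that belongs to taxa in `group`."""
    return 100 * sum(v for t, v in values.items() if t in group) / sum(values.values())

# Invented sample: two abundant non-littoral taxa, two rare littoral taxa
abundance = {"taxon_A": 180, "taxon_B": 15, "littoral_C": 3, "littoral_D": 2}
littoral = {"littoral_C", "littoral_D"}
presence = {t: 1 if n > 0 else 0 for t, n in abundance.items()}

by_abundance = share(abundance, littoral)  # 5 of 200 individuals
by_presence = share(presence, littoral)    # 2 of 4 taxa
```

Here the littoral share rises from 2.5% of individuals to 50% of taxa, so a metric that penalises littoral taxa scores the same sample far worse after the transformation.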

We also found a notable bias for ST6 and 7 (small fine and coarse substrate-dominated calcareous highland rivers), where presence/absence data performed systematically better than abundance data (20% and 25%, respectively). Unlike ST16, a high proportion of epirhithral taxa is typical for these STs, thus pushing the values towards more positive classes. Typically, presence/absence data overestimate this proportion, leading to the results being systematically higher for the metric and thus the GDM. The effect might be weaker for the same reason as for ST18 because it only contributes 12.5% to the calculation of the GDM.

For ST10 and ST20, many samples were evaluated more positively with presence/absence data than with abundance data (34% and 29%, respectively). Since there were almost no deviations when looking at the OPM, it is again obvious that the data type used for the GDM has the greatest influence on ESC results. Since the only core metric used in these STs is the Potamon type index, for which results were highly correlated (ρ = 0.94 and 0.95), the notable discrepancy has to be a class (border) assignment issue because correlations for the GDM were significantly lower (ρ = 0.84 and 0.87) than those for the underlying metric.

For ST5.1 (small, coarse substrate-dominated siliceous highland rivers), we obtained high correlations for all core metrics and even the GDM (ρ values of 0.95, 0.95, 0.96 and 0.97) while the overall results for the GDM show a high proportion of deviating values (20%). Here, again, we suspect that this is due to a class (border) assignment issue. In agreement with our initial hypothesis (ii), in most cases where presence/absence-based results differ from abundance-based results, one or more metrics directly relied on the abundance data used to calculate the GDM, leading to a positive correlation (see S3 Table). The class border issue, however, is independent of this.

Limitations and prospects of DNA metabarcoding for aquatic bioassessment

Molecular methods can generate highly resolved taxa lists in a standardised way and in a short time [25]. Furthermore, they unlock a wealth of otherwise inaccessible information, such as information on hidden diversity [26,27], and can be used for the early detection of rare and protected or alien invasive species [28,29]. However, these types of data come with additional sources of variation that need to be considered [14]. For example, metabarcoding approaches can identify ingested prey items, thereby generating much larger and diverse taxa lists. These highly resolved taxonomic data are not necessarily applicable to standardised assessment methods, such as those of the WFD; for instance, chironomid species, which can easily be identified using DNA-based data, are not considered in the assessment systems of most countries [30]. Yet, from this high-resolution data all possible data sets required by any currently used WFD assessment method can be derived, for instance to comply with operational taxa lists at genus-level.

Most importantly, DNA metabarcoding can only be used to roughly infer abundance data. Abundance is a key parameter in ecology and is of great value for biomonitoring. Without abundance or biomass data, population trends cannot be picked up. Therefore, abundance as well as other quantitative data such as biomass and size have immense value per se for understanding ecological dynamics and for establishing management or conservation strategies, and will also need to be considered in the future for various purposes (e.g. [31,32]).

Our results, as well as those by Beentjes et al. [18] for Dutch lentic and lotic freshwaters and Wright-Stow and Winterbourne [17] for New Zealand streams, indicate that the inability to generate abundance data using DNA-based methods is of limited concern for ecological status inferences, as many metrics are based on incidence data. For metrics based on raw abundance data or on abundance classes, frequent minor deviations and occasional major deviations can be expected, and this study showed that this is a concern especially for those using raw abundance data (S3 Table). In the case of Dutch freshwaters, the calculated Ecological Quality Ratios (EQRs) use abundance information only to a minor extent [18]. Also, this information is provided in the form of abundance classes, never as raw abundance information. This explains the generally higher agreement observed for the comparisons. While in the case of German streams the variation is at least in several cases substantially greater, available information can be used to train machine learning algorithms in order to calibrate both data types. The power of supervised machine learning for biomonitoring purposes and data intercalibration has been shown several times [33–36], yet testing this power on such data sets remains a subject for future analyses.

It should be noted that there are a few other incompatibilities between traditional and DNA-based methods. Some taxa are not detected due to primer bias [7,8] and gaps in reference databases. For the German operational taxa list, however, almost 90% of species-level records have barcodes. For most other countries in Europe, barcode coverage of operational taxa lists is also quite high (>60% on average; [37]). A study of diatoms by DNA metabarcoding and traditional morphological approaches has shown that even with only 10% of taxa available, rather robust ESC results are possible [12], but country-specific deviations have also been reported [38]. An additional layer of variation may be due to differences in sample size. For WFD-compliant macrozoobenthos-based assessment in Germany, a standardised number of specimens, either 350 or 700, is typically picked from the sample and identified [20]. This is different in other countries, and DNA metabarcoding of entire samples without prior sorting may thus lead to stronger deviations than those described in our in silico comparison here.

Therefore, we propose to test our approach in a more systematic fashion by simultaneously sampling, processing and identifying taxa with both the traditional method and the DNA-based method to validate the molecular method across STs or water bodies in general. With this approach, gaps in reference libraries can also be filled. This will also demonstrate the capacities of different sampling strategies as well as different laboratory and bioinformatic approaches to process standard samples. Likewise, optimisation of the molecular workflow will reduce waiting time and overall costs, and thus make molecular approaches more attractive to policy-makers and aquatic ecosystem managers [25,39]. In a next step, we will analyse how well ESC estimation based on presence/absence data performs across different national WFD-compliant tools and determine how the limitations of different molecular tools affect ESC/EQR inference. Thus, we aim to gauge whether our approach can be universally applied across Europe, as suggested by the results of this study. We are aware that the implementation of molecular tools will come at a cost and requires the utmost scientific rigour, but this is crucial for generating new data and comparing these with available data. So far, costs are in the same order of magnitude as for traditional lab and identification procedures [7]. If abundance estimates continue to be obtained via traditional inspection or, alternatively, automated image recognition [40], no substantial cut in costs can be expected. However, the central incentive for also including genetic data should be the fundamentally improved resolution, down to species or even population level [41], that can be obtained in a standardised fashion.

While we explicitly encourage the development of new metrics and indices that make use of the full potential inherent in metabarcoding data [36], we emphasise the importance of properly evaluating the potential to link metabarcoding data to established indices and relate them to existing data. This should ideally be done in parallel with ESC inference based on current methods to develop well-founded and properly validated approaches that can be accepted both scientifically and from a water management perspective.

Conclusions

Our results indicate that stream ESCs in Germany inferred from presence/absence data are, to a high degree, congruent with ESCs inferred from abundance data. However, a direct transformation of abundance-based ESC assessment into a presence/absence-based approach is not possible and will require calibration, e.g. by using supervised machine learning, based on direct comparisons of metabarcoding and traditional morpho-taxonomical data.
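As an illustration of how such congruence can be quantified, the following minimal sketch tallies the class shifts between two paired assessments of the same sites. It assumes classes coded from 1 (high) to 5 (bad), so that a positive shift means a worse class after transformation; the function name and the example values are hypothetical and are not data from this study.

```python
from collections import Counter
from typing import Dict, Sequence


def tally_class_shifts(
    abundance_classes: Sequence[int], pa_classes: Sequence[int]
) -> Dict[int, float]:
    """Return the percentage of sites per class shift (presence/absence minus abundance)."""
    if len(abundance_classes) != len(pa_classes):
        raise ValueError("Both assessments must cover the same sites")
    if not abundance_classes:
        raise ValueError("At least one site is required")
    shifts = Counter(pa - abd for abd, pa in zip(abundance_classes, pa_classes))
    n_sites = len(abundance_classes)
    return {shift: 100.0 * count / n_sites for shift, count in sorted(shifts.items())}


# Hypothetical status classes for eight sites (1 = high ... 5 = bad)
esc_abundance = [2, 2, 3, 1, 4, 3, 2, 5]
esc_pa = [2, 3, 3, 1, 3, 3, 2, 5]

shift_percentages = tally_class_shifts(esc_abundance, esc_pa)
for shift, percentage in shift_percentages.items():
    print(f"shift {shift:+d}: {percentage:.1f}% of sites")
```

A shift of 0 corresponds to an unchanged status class; a systematic surplus of positive or negative shifts within a stream type would point to the kind of type-specific deviation that calibration has to address.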

Supporting information

S1 Fig. Expected relationships between abundance (x-axis) and presence/absence-based (y-axis) metrics.

(PDF)

S2 Fig. Abundance metric analysis.

Deviation of the observed classification in the ecological status class (y-axis) in relation to the contribution of abundance information for the calculation of the module (CAB).

(PDF)

S3 Fig. Status class comparisons.

Comparison of abundance (‘abd’, black line with green confidence interval) and presence/absence (red dots) assessment results for 627 stream sites for which the transformation led to a status class shift from 2 to 3. Background shading indicates ecological status class intervals. These are geometrically defined for the %EPT and German fauna index (GFI) and therefore much narrower than for the German saprobic index.

(PDF)

S1 Table. Overview of metrics used for assessment of the 30 different stream types.

Metrics that do not use abundance data are shown in green, metrics that use abundance classes are shown in yellow, and metrics that use raw abundance data are shown in red. HMWB = Heavily modified water bodies. Abbreviations are listed. For further reference see http://www.fliessgewaesserbewertung.de/kurzdarstellungen/bewertung/

(XLSX)

S2 Table. Correlation analysis.

Overview of correlation values (Spearman's ρ) for the ecological status class (ESC) and its three (two) underlying assessment modules, calculated with abundance data and presence/absence data.

(XLSX)

S3 Table. Correlation between different metrics using abundance and presence/absence data (Spearman's ρ).

Metrics that do not use abundance data are shown in green, metrics that use abundance classes are shown in yellow, and metrics that use raw abundance data are shown in red. Values are only shown in cells for stream types for which the metric is actually used.

(XLSX)

S4 Table. Status class shifts.

Shifts from good to moderate (2→3) ecological status class, or vice versa (3 → 2), for the 30 analysed stream types after transformation to presence/absence data.

(XLSX)

S1 File. Programming code used for data analysis.

(ZIP)

Acknowledgments

We thank the German states and LAWA (Bund/Länder-Arbeitsgemeinschaft Wasser) and the state agencies for providing data for analyses and helpful discussion (Bavaria: Bavarian Environment Agency, Baden-Württemberg: Baden-Württemberg State Institute for the Environment, Survey and Nature Conservation, Hesse: Hessian Agency for Nature Conservation, Environment and Geology, Lower Saxony: Lower Saxony Water Management, Coastal Defence and Nature Conservation Agency, North Rhine-Westphalia: North Rhine-Westphalia State Agency for Nature, Environment and Consumer Protection, Saxony: Saxon State Office for the Environment, Agriculture and Geology, Saxony-Anhalt: State Agency for Flood Defence and Water Management of Saxony-Anhalt, Schleswig-Holstein: State Agency for Agriculture, Environment and Rural Areas Schleswig-Holstein). We furthermore thank Jens Arle (UBA Dessau) for specific comments on aspects of the manuscript. This work was conducted within the framework of EU COST Action DNAqua-Net (CA15219) and German Barcode of Life 2 project (project 01LI1501K) to Florian Leese.

Data Availability

Supporting Information S2–S4 Tables contain metrics results and correlation analyses based on 13,312 individual taxa lists. These raw data (taxa lists for the individual stream sites) cannot be shared publicly because they are owned by the individual federal states. All raw data were requested from and are available according to the Environmental Information Act (Umweltinformationsgesetz, UIG, 14.2.2005) from the individual federal states/institutions/persons as stated below. The authors did not receive special access privileges to the data. Baden-Württemberg: LUBW Landesanstalt für Umwelt, Messungen und Naturschutz Baden-Württemberg; Andreas Hoppe - Andreas.Hoppe@lubw.bwl.de; Bavaria: Bayerisches Landesamt für Umwelt; Folker Fischer - poststelle@lfu.bayern.de; Hesse: Hessisches Landesamt für Naturschutz, Umwelt und Geologie; Elisabeth Schlag - Elisabeth.Schlag@hlnug.hessen.de; Mecklenburg-West Pomerania: Landesamt für Umwelt, Naturschutz und Geologie Mecklenburg-Vorpommern Abteilung Geologie, Wasser und Boden; Andre Steinhaeuser - andre.steinhaeuser@lung.mv-regierung.de; Lower Saxony: Niedersaechsischer Landesbetrieb für Wasserwirtschaft, Küsten- und Naturschutz; Eva Bellack - Eva.Bellack@nlwkn-hi.niedersachsen.de; North Rhine-Westphalia: Jochen Lacombe (Landesamt fuer Umwelt-, Natur- und Verbraucherschutz NRW, LANUV) - Jochen.Lacombe@lanuv.nrw.de; Saxony: Sächsisches Landesamt für Umwelt, Landwirtschaft und Geologie, Abteilung Wasser, Boden und Wertstoffe und Betriebsgesellschaft für Umwelt und Landwirtschaft; Antje Mickel - Antje.Mickel@smul.sachsen.de; Saxony-Anhalt: Landesbetrieb für Hochwasserschutz und Wasserwirtschaft Sachsen-Anhalt; Data are freely accessible or can be requested via Martina Jährling - http://gldweb.dhi-wasy.com/gld-portal/; Schleswig-Holstein: Landesamt für Landwirtschaft, Umwelt und ländliche Räume SH; Annegret Holm - Annegret.Holm@llur.landsh.de.

Funding Statement

The authors received no specific funding for this work.

References

  • 1. Birk S, Bonne W, Borja A, Brucet S, Courrat A, Poikane S, et al. Three hundred ways to assess Europe’s surface waters: An almost complete overview of biological methods to implement the Water Framework Directive. Ecol Indic. 2012;18: 31–41.
  • 2. Nijboer RC, Johnson RK, Verdonschot PFM, Sommerhäuser M, Buffagni A. Establishing reference conditions for European streams. 2004;516: 15.
  • 3. Poikane S, Zampoukas N, Borja A, Davies SP, van de Bund W, Birk S. Intercalibration of aquatic ecological assessment methods in the European Union: Lessons learned and way forward. Environ Sci Policy. 2014;44: 237–246. doi: 10.1016/j.envsci.2014.08.006
  • 4. Hajibabaei M, Shokralla S, Zhou X, Singer GA, Baird DJ. Environmental barcoding: a next-generation sequencing approach for biomonitoring applications using river benthos. PloS One. 2011;6: e17497. doi: 10.1371/journal.pone.0017497
  • 5. Taberlet P, Coissac E, Pompanon F, Brochmann C, Willerslev E. Towards next-generation biodiversity assessment using DNA metabarcoding. Mol Ecol. 2012;21: 2045–2050. doi: 10.1111/j.1365-294X.2012.05470.x
  • 6. Zimmermann J, Glöckner G, Jahn R, Enke N, Gemeinholzer B. Metabarcoding vs. morphological identification to assess diatom diversity in environmental studies. Mol Ecol Resour. 2015;15: 526–542. doi: 10.1111/1755-0998.12336
  • 7. Elbrecht V, Vamos EE, Meissner K, Aroviita J, Leese F. Assessing strengths and weaknesses of DNA metabarcoding-based macroinvertebrate identification for routine stream monitoring. Methods Ecol Evol. 2017;8: 1265–1275. doi: 10.1111/2041-210x.12789
  • 8. Elbrecht V, Leese F. Can DNA-based ecosystem assessments quantify species abundance? Testing primer bias and biomass—sequence relationships with an innovative metabarcoding protocol. PloS One. 2015;10: e0130324. doi: 10.1371/journal.pone.0130324
  • 9. Pinol J, Mir G, Gomez-Polo P, Agusti N. Universal and blocking primer mismatches limit the use of high-throughput DNA sequencing for the quantitative metabarcoding of arthropods. Mol Ecol Resour. 2015;15: 819–830. doi: 10.1111/1755-0998.12355
  • 10. Hänfling B, Handley LL, Read DS, Hahn C, Li JL, Nichols P, et al. Environmental DNA metabarcoding of lake fish communities reflects long-term data from established survey methods. Mol Ecol. 2016;25: 3101–3119. doi: 10.1111/mec.13660
  • 11. Bista I, Carvalho G, Tang M, Walsh K, Zhou X, Hajibabaei M, et al. Performance of amplicon and shotgun sequencing for accurate biomass estimation in invertebrate community samples. Mol Ecol Resour. 2018;online early. doi: 10.1111/1755-0998.12888
  • 12. Kelly MG, Juggins S, Kille P, Mann D, Pass D, Sapp M, et al. A DNA based diatom metabarcoding approach for Water Framework Directive classification of rivers. Bristol: Environment Agency; 2018.
  • 13. Vasselon V, Bouchez A, Rimet F, Jacques SMS, Trobajo R, Corniquel M, et al. Avoiding quantification bias in metabarcoding: Application of a cell biovolume correction factor in diatom molecular biomonitoring. Methods Ecol Evol. 2018;9: 1060–1069. doi: 10.1111/2041-210X.12960
  • 14. Leese F, Bouchez A, Abarenkov K, Altermatt F, Borja Á, Bruce K, et al. Why We Need Sustainable Networks Bridging Countries, Disciplines, Cultures and Generations for Aquatic Biomonitoring 2.0: A Perspective Derived From the DNAqua-Net COST Action. Next Generation Biomonitoring: Part 1. 2018. pp. 63–99.
  • 15. Leese F, Hering D, Wägele J-W. Potenzial genetischer Methoden für das Biomonitoring der Wasserrahmenrichtlinie. WasserWirtschaft; 2017;7–8: 49–53.
  • 16. Hering D, Borja A, Jones JI, Pont D, Boets P, Bouchez A, et al. Implementation options for DNA-based identification into ecological status assessment under the European Water Framework Directive. Water Res. 2018;138: 192–205. doi: 10.1016/j.watres.2018.03.003
  • 17. Wright‐Stow AE, Winterbourn MJ. How well do New Zealand’s stream‐monitoring indicators, the macroinvertebrate community index and its quantitative variant, correspond? N Z J Mar Freshw Res. 2003;37: 461–470. doi: 10.1080/00288330.2003.9517180
  • 18. Beentjes KK, Speksnijder AGCL, Schilthuizen M, Schaub BEM, van der Hoorn BB. The influence of macroinvertebrate abundance on the assessment of freshwater quality in The Netherlands. Metabarcoding Metagenomics. 2018;2. doi: 10.3897/mbmg.2.26744
  • 19. Aylagas E, Borja A, Rodriguez-Ezpeleta N. Environmental status assessment using DNA metabarcoding: towards a genetics based Marine Biotic Index (gAMBI). PloS One. 2014;9: e90529. doi: 10.1371/journal.pone.0090529
  • 20. Meier C, Haase P, Rolauffs P, Schindehütte K, Schöll F, Sundermann A, et al. Methodisches Handbuch Fließgewässerbewertung. Handbuch zur Untersuchung und Bewertung von Fließgewässern auf der Basis des Makrozoobenthos vor dem Hintergrund der EG-Wasserrahmenrichtlinie. http://www.fliessgewaesserbewertung.de; 2006.
  • 21. Böhmer J, Rawer-Jost C, Zenker A, Meier C, Feld CK, Biss R, et al. Assessing streams in Germany with benthic invertebrates: Development of a multimetric invertebrate based assessment system. Limnologica. 2004;34: 416–432. doi: 10.1016/S0075-9511(04)80010-0
  • 22. ASTERICS. ASTERICS (AQEM/STAR Ecological River Classification System). Wageningen Software Labs; 2013. Available: http://www.fliessgewaesserbewertung.de/downloads
  • 23. Pottgießer T, Sommerhäuser M. Fließgewässertypologie Deutschlands: Die Gewässertypen und ihre Steckbriefe als Beitrag zur Umsetzung der EU-Wasserrahmenrichtlinie. Handbuch der Limnologie. 2004. pp. 1–16.
  • 24. Haase P, Pauls SU, Schindehutte K, Sundermann A. First audit of macroinvertebrate samples from an EU Water Framework Directive monitoring program: human error greatly lowers precision of assessment results. J North Am Benthol Soc. 2010;29: 1279–1291.
  • 25. Elbrecht V, Steinke D. Scaling up DNA metabarcoding for freshwater macrozoobenthos monitoring. Freshw Biol. 2019;64: 380–387. doi: 10.1111/fwb.13220
  • 26. Vivien R, Wyler S, Lafont M, Pawlowski J. Molecular barcoding of aquatic oligochaetes: implications for biomonitoring. PloS One. 2015;10: e0125485. doi: 10.1371/journal.pone.0125485
  • 27. Beermann AJ, Zizka VMA, Elbrecht V, Baranov V, Leese F. DNA metabarcoding reveals the complex and hidden responses of chironomids to multiple stressors. Environ Sci Eur. 2018;30. doi: 10.1186/s12302-018-0157-x
  • 28. Comtet T, Sandionigi A, Viard F, Casiraghi M. DNA (meta)barcoding of biological invasions: a powerful tool to elucidate invasion processes and help managing aliens. Biol Invasions. 2015;17: 905–922. doi: 10.1007/s10530-015-0854-y
  • 29. Klymus KE, Marshall NT, Stepien CA. Environmental DNA (eDNA) metabarcoding assays to detect invasive invertebrate species in the Great Lakes. PLOS ONE. 2017;12: e0177643. doi: 10.1371/journal.pone.0177643
  • 30. Weigand H, Beermann AJ, Čiampor F, Costa FO, Csabai Z, Duarte S, et al. DNA barcode reference libraries for the monitoring of aquatic biota in Europe: Gap-analysis and recommendations for future work. bioRxiv. 2019; 576553. doi: 10.1101/576553
  • 31. Hallmann CA, Sorg M, Jongejans E, Siepel H, Hofland N, Schwan H, et al. More than 75 percent decline over 27 years in total flying insect biomass in protected areas. PLOS ONE. 2017;12: e0185809. doi: 10.1371/journal.pone.0185809
  • 32. Seibold S, Gossner MM, Simons NK, Blüthgen N, Müller J, Ambarlı D, et al. Arthropod decline in grasslands and forests is associated with landscape-level drivers. Nature. 2019;574: 671–674. doi: 10.1038/s41586-019-1684-3
  • 33. Cordier T, Esling P, Lejzerowicz F, Visco J, Ouadahi A, Martins C, et al. Predicting the ecological quality status of marine environments from eDNA metabarcoding data using supervised machine learning. Environ Sci Technol. 2017;51: 9118–9126. doi: 10.1021/acs.est.7b01518
  • 34. Cordier T, Forster D, Dufresne Y, Martins CIM, Stoeck T, Pawlowski J. Supervised machine learning outperforms taxonomy-based environmental DNA metabarcoding applied to biomonitoring. Mol Ecol Resour. 2018;18: 1381–1391. doi: 10.1111/1755-0998.12926
  • 35. Cordier T, Lanzén A, Apotheloz-Perret-Gentil L, Stoeck T, Pawlowski J. Embracing environmental genomics and machine learning for routine biomonitoring. Trends Microbiol. 2018;online early. doi: 10.1016/j.tim.2018.10.012
  • 36. Pawlowski J, Kelly-Quinn M, Altermatt F, Apotheloz-Perret-Gentil L, Beja P, Boggero A, et al. The future of biotic indices in the ecogenomic era: Integrating (e)DNA metabarcoding in biological assessment of aquatic ecosystems. Sci Total Env. 2018;637–638: 1295–1310. doi: 10.1016/j.scitotenv.2018.05.002
  • 37. Weigand H, Beermann AJ, Čiampor F, Costa FO, Csabai Z, Duarte S, et al. DNA barcode reference libraries for the monitoring of aquatic biota in Europe: Gap-analysis and recommendations for future work. Sci Total Environ. 2019;678: 499–524. doi: 10.1016/j.scitotenv.2019.04.247
  • 38. Bailet B, Bouchez A, Franc A, Frigerio J-M, Keck F, Karjalainen S-M, et al. Molecular versus morphological data for benthic diatoms biomonitoring in Northern Europe freshwater and consequences for ecological status. Metabarcoding Metagenomics. 2019;3: e34002. doi: 10.3897/mbmg.3.34002
  • 39. Aylagas E, Borja Á, Muxika I, Rodríguez-Ezpeleta N. Adapting metabarcoding-based benthic biomonitoring into routine marine ecological status assessment networks. Ecol Indic. 2018;95: 194–202. doi: 10.1016/j.ecolind.2018.07.044
  • 40. Raitoharju J, Riabchenko E, Ahmad I, Iosifidis A, Gabbouj M, Kiranyaz S, et al. Benchmark database for fine-grained image classification of benthic macroinvertebrates. Image Vis Comput. 2018;78: 73–83. doi: 10.1016/j.imavis.2018.06.005
  • 41. Elbrecht V, Vamos EE, Steinke D, Leese F. Estimating intraspecific genetic diversity from community DNA metabarcoding data. PeerJ. 2018;6.

Decision Letter 0

Fabrizio Frontalini

18 Sep 2019

PONE-D-19-16057

Analysis of 13,000 benthic invertebrate samples from German streams reveals minor deviations in ecological status class between abundance and presence/absence data

PLOS ONE

Dear Prof Dr Leese,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

I have now received the comments of three external reviewers and as you can see they have very contrasting viewpoints and raised different concerns, particularly reviewer 2.

Reviewers 1 and 3 are very positive and have only minor to moderate suggestions. Reviewer 1 suggests clarifying some analyses and presenting the data, at least by including a supplementary figure. On the other hand, reviewer 2 is rather skeptical of the merit of the manuscript, as a direct transformation is very challenging and potentially biased without a correction factor.

We would appreciate receiving your revised manuscript by Nov 02 2019 11:59PM. When you are ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter.

To enhance the reproducibility of your results, we recommend that if applicable you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). This letter should be uploaded as separate file and labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. This file should be uploaded as separate file and labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. This file should be uploaded as separate file and labeled 'Manuscript'.

Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

We look forward to receiving your revised manuscript.

Kind regards,

Fabrizio Frontalini

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at http://www.journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and http://www.journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2.  We note that you have indicated that data from this study are available upon request. PLOS only allows data to be available upon request if there are legal or ethical restrictions on sharing data publicly. For more information on unacceptable data access restrictions, please see http://journals.plos.org/plosone/s/data-availability#loc-unacceptable-data-access-restrictions.

In your revised cover letter, please address the following prompts:

a) If there are ethical or legal restrictions on sharing a de-identified data set, please explain them in detail (e.g., data contain potentially sensitive information, data are owned by a third-party organization, etc.) and who has imposed them (e.g., an ethics committee). Please also provide contact information for a data access committee, ethics committee, or other institutional body to which data requests may be sent.

b) If there are no restrictions, please upload the minimal anonymized data set necessary to replicate your study findings as either Supporting Information files or to a stable, public repository and provide us with the relevant URLs, DOIs, or accession numbers. For a list of acceptable repositories, please see http://journals.plos.org/plosone/s/data-availability#loc-recommended-repositories.

We will update your Data Availability statement on your behalf to reflect the information you provide.

Additional Editor Comments (if provided):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #2: Partly

Reviewer #3: Yes

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: No

Reviewer #3: No

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

Reviewer #3: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The authors present a well-written and interesting paper on the effects of transforming abundance data into presence/absence data for freshwater quality monitoring. Their dataset is impressive, and the approach of using different metrics, which deal with different kinds of abundance data, makes it a valuable contribution to the discussions surrounding the adaptation of quality metrics to better incorporate molecular data. My only real concern with the paper is that the relevance of two of the analyses performed is not entirely clear, mostly due to lack of data or inclusion in the general discussion.

First, the authors describe a method to evaluate the impact of the transformation, by calculating the percentage of abundance-reliant data for each of the stream types. This in itself is interesting, but the results are only summarized in a single sentence (lines 269-271) and no data are shown. It would be good to include the data or a visualization of the data in a supplementary figure at least, maybe even in the main article. Second, the detailed analysis of ST5 and ST14 to see if deviations are different for the highest and lowest ESCs has merit, but the results are only presented in the results section and not mentioned any further in the discussion of the paper. As a reader, I fail to see whether it is either logical or of any impact that some metrics (I think most if not all of them are based on abundance classes) have significantly higher or lower deviations when data is transformed. I would urge the authors to discuss the relevance of these results in the discussion section. As of now this detailed analysis seems to be an afterthought that is not fully incorporated into the paper.

Besides these two main points, I have a number of smaller comments listed below. All in all, I think the authors did a good job of presenting a nation-wide study into the effects of transforming surveys into presence/absence data, and it will surely be a valuable contribution and inspiration for other nations to similarly assess the viability of adopting current assessment methodology to make use of metabarcoding data.

Line 39: There is a stray “4.”, likely left over from when the abstract was a numbered list.

Line 77: Consider the use of a comma after “water bodies” or restructuring the sentence. It now reads as if different pressures also affect biogeography, which I assume they don’t.

Line 107-111: Two papers mentioned here in passing (references 18 and 19) can be considered systematic and large-scale in my opinion. They both directly address the question whether data transformation into presence/absence is feasible, and use a large number of data points (roughly 1800 and 700, respectively). Can the authors maybe come back to these papers in the discussion, and compare results? As it is written now, it feels as if the authors try to increase the novelty of their own work by stating these papers merely “suggest” a general coherence.

Line 145: How many abundance classes are used in traditional metrics, and how are they classified? Including this information in the paper would allow readers (and authors) to evaluate whether observed deviations are to be expected or not. Deviations would be different if 3 classes were used, or 12 classes.

Line 235-236: What is meant by falling “outside the threshold criterion”?

Line 237 / Table 1: Consider putting these tables in the same order as Figures 2-4.

Line 237 / Table 1: The numbers in these tables do not add up. ESC has a total of 13312 samples (which matches the materials and methods), but OPM has a total of 13429, and GDM has a total of 13293.

Line 237 / Table 1: It should be noted that for the original assessments using abundance data ~77% of OPM are either “high” or “good”, whereas only ~30% of GDM scores fall in those two categories. This should be included in the discussion of the paper, especially since the change in OPM caused by the data transformation seems limited. It means that changes in GDM will in most cases be reflected directly in changes in ESC (as observed in figures 2 and 3), due to the facts that (1) for the ESC the lowest of the two sub-scores (OPM/GDM) counts as the final score, and (2) GDM is likely the lowest in most cases (perhaps the authors can calculate the percentage of cases?). Lower GDM would then almost always lead to lower ESC (unless the OPM was lower to start with, but that seems unlikely), and higher GDM would usually lead to higher ESC, when OPM was higher than original GDM.

Line 252-255: 28 stream types are mentioned, but figure 5A only has 25 panels, and 5B only has 27. Are some STs left out of the analysis, and if so, why? I’m also assuming that 5A is GFI and 5B is EPT[%], but this could be clarified in the text and especially in the legend of the figure.

Line 255-258: Here the authors also mention OPM results in relation to figure 5. Is any part of the figure representing OPM results?

Line 261-262: Metrics using p/a data will not change when transforming abundance to p/a data. This is rather logical, and would not require stating, let alone using statistics to calculate a “perfect” correlation of 1.

Line 269-271: This is the only mention of any results for the contribution of abundance data. Please show the data, or a figure representing the data.

Line 274: It is unclear what the authors mean by “systematic errors”. I’m assuming it means shifts in ESC, from the paragraph that follows, but this could be made clearer.

Line 274-280: Authors state a “limited number of cases”, but then cite shifts that amount to roughly 14% of the data points. In total one fourth of all ESCs changed (line 308). I feel this is more than just a “limited number”. Consider rephrasing to better represent the amount of class shifts.

Line 290-292: This sentence is hard to read. It would be better if it were reversed, e.g. “we observed significantly lower deviations in the metrics EPT [%] and % hyporhithral, and significantly higher deviations for SI, rheo index and GFI for ‘high’ ESCs”.

Line 327 and throughout the manuscript: Spearman’s correlation values are denoted with ρ (rho) or rs.

Line 237-333: At first, the pelagic and littoral taxa have low abundances, but later they have high abundances? The authors probably refer to the transformation changing the proportions of pelagic/littoral versus non-pelagic/littoral taxa, so it might be better to talk about “proportions” rather than “abundances”.

Line 337: Shouldn’t this be 16.7% instead of 33%? As GFI is already 50% of the GDM.

Line 340: This is 19% in figure 2.

Line 343: “significantly”. Has this been statistically tested? If not, consider using a different word here.

Line 346: What do authors mean by “the end of the spectrum”?

Line 348: These percentages also do not match figure 2 (nor figure 3).

Line 359: The class assignment is mentioned here. Have the authors also looked into raw scores, or just the final class assignments? It is likely that many of the samples that changed class were already close to a class border. In case the authors have looked at this, it might be worth including in the discussion.

Line 364-367: It is unclear what the authors mean with this sentence, and which correlation they refer to.

Line 377-380: On the contrary, higher resolution data is compatible, since it can always be translated into lower resolution data. E.g., all chironomid species could just be tallied under “Chironomidae” if the standard method only requires family-level information.

Line 381: Consider removing/replacing the word “Importantly”, since the previous sentence already starts with “Most importantly”.

Line 404: I’m not sure that more DNA extraction comparisons are all that’s needed. There are many other factors that play a role in metabarcoding, such as primers and pre-processing of material that may have much greater impact on the resulting taxa lists.

Line 405-407: Consider including a reference to e.g. Aylagas et al 2018, in which the cost and time reduction was calculated.

Line 554 / Figure 3: The number of samples is not the same in figure and legend.

Line 568-569 / Figure 5: Please state what figure A and B are representing. Maybe it would be good to include the slope/correlation values in the panels of the figures, instead of presenting them in a separate supplemental file.

Line 584 / S1 Table: Please don’t use abbreviations in the metric names, or provide abbreviations in the description instead of referring to a German website. Also, the names of the metrics in table S1 don’t match the names of the metrics in table S3.

Line: 587-588 / S2 Table: Description is a little short. Correlation values of what? Are these correlations between abundance and p/a data? If so, would it be possible to combine tables S2 and S3 into one supplement? Both tables show correlation values for GDM, but they don’t match?

Reviewer #2: Review remarks on “Analysis of 13,000 benthic invertebrate samples from German streams reveals minor deviations in ecological status class between abundance and presence/absence data”, submitted as PONE-D-19-16057 to PLOS ONE by Buchner et al. 2019

General comments:

1. The study presents the results of original research.

Yes.

2. Results reported have not been published elsewhere.

Yes.

3. Experiments, statistics, and other analyses are performed to a high technical standard and are described in sufficient detail.

Partly (see specific comments).

4. Conclusions are presented in an appropriate fashion and are supported by the data.

Partly (see specific comments).

5. The article is presented in an intelligible fashion and is written in standard English. Yes

6. The research meets all applicable standards for the ethics of experimentation and research integrity.

Yes.

7. The article adheres to appropriate reporting guidelines and community standards for data availability.

Yes.

Specific comments:

Line 42 -46 Statement “Systematic stream type-specific deviations were found and differences between abundance and presence/absence data were most prominent for stream types where abundance information contributed directly to one or several metrics of the general degradation module.” not underpinned by results.

Line 47- 48 “The systematic decrease in scores was observed, even when considering simulated confidence intervals for abundance data.” not underpinned by results.

Line 129 -121 “,…thus representative of a variety of approaches developed for other BQEs or in other countries” not underpinned by results.

Line 153 “transformed to presence/absence data”: not sufficiently described, how was the transformation done?

Line 197 – 208: Simulation of taxon lists: the role of this exercise is not clear, because the results are neither presented nor discussed.

Line 261 – 262: Metrics that use presence/absence data only were perfectly correlated when calculated with presence/absence data instead of abundance data (Spearman’s Rho = 1, Table S3). This looks like circular reasoning.

Line 281 – 286: Detailed analyses based on simulated abundance data for the 627 cases etc…. It is unclear, how was this done?

Line 288 – 299 Why a detailed analysis of ST5 and ST14 only? What was the rationale in relation to the hypothesis?

Line 324 – 327 While strong positive correlations between data types were found for the GFI, the number of Trichoptera species and EPT [%], the results for proportions of pelagic and littoral taxa were poorly correlated (r = 0.41 and 0.56). What are pelagic and littoral invertebrates in streams?

Line 390 - 391: However, correction factors can be included in the assessment formulas or class boundaries. Is this really meaningful and contradictory to arguments in Lines 98 through 102?

Lines 420 -426: These arguments are repeated in the conclusions.

Lines 422 – 424: However, the direct transformation of abundance-based ESC assessment into a presence/absence-based approach is not possible and will require correction factors determined from comparisons of abundance based and presence/absence data.

For theoretical and mathematical reasons, there can never be a “direct transformation” between the two approaches. The analysis and argument provide no convincing evidence for why large efforts should now be made to determine “correction factors”, while the authors at the same time advocate replacing the classical taxa × abundance matrices with taxa × presence/absence matrices obtained by the molecular characterisation of communities and entirely new sets of metrics and indices based on molecular data.

Reviewer #3: Review: Analysis of 13,000 benthic invertebrate samples from German streams reveals minor deviations in ecological status class between abundance and presence/absence data

The manuscript (MS) deals with a very interesting topic: whether presence/absence data are applicable to present monitoring activity and whether they give the same results as abundance-based metrics.

The MS is clear and contains a lot of effort and analysis to support its findings. The statistical tools used are adequate to answer the given scientific questions.

I have just minor comments on the MS.

Title: Why not put the exact number 13,312 in the title?

L31 “cheap and accurate”: this phrase occurs only in the abstract and is not mentioned in the Introduction. Please give a reference for why eDNA is cheaper and more accurate than traditional methods.

L 39 “4.” not needed

L 71 “The resulting score ..”: change to “The resulting score, the Ecological Quality Ratio (EQR), is then …”

Generally, the term EQR is used for assessment more often.

L77 If there are intercalibrated results compared among multiple countries, why not focus only on the types and metrics that were part of the intercalibration? The MS focuses only on German stream types, but with the use of a common intercalibration metric the results could attract broader interest.

L 150 “The total data set contained 13,401 samples obtained from monitoring sites in 1985–2013, …”

The question is why all of these data are used. This assumes that data quality is the same from 1985 to 2013, but the WFD is much younger; quality could differ from year to year, which could interfere with the differences between presence/absence and abundance assessments.

Also, some types have an order of magnitude more analysed samples than others. Why not standardize or limit the number of observations among types?

Why not limit the analyses to a shorter time frame, such as that of the second Water Basin Management plan?

L 154 What was the cause of errors in the calculation? Is it just mathematical or systematic for a given typology?

L 159 Appendix S1 was not included in the MS

L 255 FIG 5 should be in the supplement, not in the MS. It is not informative in this form, and the min and max values cannot be resolved.

L 279 (Table S4) This is one of the most informative tables, it should be in the MS text not in the supplement (include the number of unchanged “good” sites not only the shifts)

Fig 1. branch of “excluding data-… “ not needed, use the 13,312 samples

Table S4 include the number of unchanged “good” sites not only the shifts

**********

6. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Kevin Beentjes

Reviewer #2: No

Reviewer #3: Yes: Gabor Varbiro


PLoS One. 2019 Dec 23;14(12):e0226547. doi: 10.1371/journal.pone.0226547.r002

Author response to Decision Letter 0


15 Nov 2019

Dear editor, dear reviewers,

Thanks for the detailed and constructive comments on our manuscript. We have carefully revised our manuscript along the lines of your suggestions. Below, we provide a detailed point-by-point response to every aspect raised. Reviewer remarks that we revised as suggested are highlighted in green. Remarks where we did not follow the reviewers’ suggestions are highlighted in yellow. Here, we gave detailed explanations on why we did not incorporate them. Please be aware that the colour marking is only available in the uploaded document - here the responses are without colour code.

Best regards

Florian Leese, Dominik Buchner and all co-authors

-------

Reviewer #1: The authors present a well-written and interesting paper on the effects of transforming abundance data into presence/absence data for freshwater quality monitoring. Their dataset is impressive, and the approach of using different metrics, which deal with different kinds of abundance data, makes it a valuable contribution to the discussions surrounding the adaptation of quality metrics to better incorporate molecular data. My only real concern with the paper is that the relevance of two of the analyses performed is not entirely clear, mostly due to lack of data or inclusion in the general discussion.

First, the authors describe a method to evaluate the impact of the transformation, by calculating the percentage of abundance-reliant data for each of the stream types. This in itself is interesting, but the results are only summarized in a single sentence (lines 269-271) and no data is shown. It would be good to include the data or a visualization of the data in a supplementary figure at least, maybe even in the main article.

Reply: Thanks for the constructive remarks. We added the requested changes by creating a new figure (Figure S2). This is now mentioned in the respective section of the results and briefly discussed.

Second, the detailed analysis of ST5 and ST14 to see if deviations are different for the highest and lowest ESCs has merit, but the results are only presented in the results section and not mentioned any further in the discussion of the paper. As a reader, I fail to see if it is either logical or of any impact whether some metrics (I think most if not all of them are based on abundance classes) have significantly higher or lower deviations when data is transformed. I would urge the authors to discuss the relevance of these results in the discussion section. As of now this detailed analysis seems to be an afterthought that is not fully incorporated into the paper.

Reply: We decided to remove the analysis. While we find several significant differences (e.g. Stream Type 5, Deviation for Saprobic Index in Class 1: 0.0259 vs. Deviation in Class 2-5: 0.0361; p=0.0002), all of these are extremely small in absolute terms and also differ among stream types. We thus think that the analysis adds no clear value to the comparison done in this MS; it was, as stated, more of an ‘afterthought’.

Besides these two main points, I have a number of smaller comments listed below. All in all, I think the authors did a good job of presenting a nation-wide study into the effects of transforming surveys into presence/absence data, and it will surely be a valuable contribution and inspiration for other nations to similarly assess the viability of adopting current assessment methodology to make use of metabarcoding data.

Reply: Thanks for the positive remarks.

Line 39: There is a stray “4.”, likely left over from when the abstract was a numbered list.

Reply: done

Line 77: Consider the use of a comma after “water bodies” or restructuring the sentence. It now reads as if different pressures also affect biogeography, which I assume they don’t.

Reply: Thanks. We rephrased as follows: “These differences are rooted in local monitoring traditions, different biogeographies and different pressures affecting the water bodies.”

Line 107-111: Two papers mentioned here in passing (references 18 and 19) can be considered systematic and large-scale in my opinion. They both directly address the question whether data transformation into presence/absence is feasible, and use a large number of data points (roughly 1800 and 700, respectively). Can the authors maybe come back to these papers in the discussion, and compare results? As it is written now, it feels as if the authors try to increase the novelty of their own work by stating these papers merely “suggest” a general coherence.

Reply: We now discuss this in a new paragraph in the discussion.

Line 145: How many abundance classes are used in traditional metrics, and how are they classified? Including this information in the paper would allow readers (and authors) to evaluate whether observed deviations are to be expected or not. Deviations would be different if 3 classes were used, or 12 classes.

Reply: We now state this in this sentence as well as in Table S1.

Line 235-236: What is meant by falling “outside the threshold criterion”?

Reply: We added the requested information: “(i.e. at least ESC = 2)”

Line 237 / Table 1: Consider putting these tables in the same order as Figures 2-4.

Reply: Well spotted, thanks. We changed the order of the figures to make it consistent.

Line 237 / Table 1: The numbers in these tables do not add up. ESC has a total of 13312 samples (which matches the materials and methods), but OPM has a total of 13429, and GDM has a total of 13293.

Reply: Well spotted. We corrected the numbers. Please note that the GDM only has 13295 associated data points since it is not calculated for the 17 streams of type 22.

Line 237 / Table 1: It should be noted that for the original assessments using abundance data ~77% of OPM are either “high” or “good”, whereas only ~30% of GDM scores in those two categories. This should be included in the discussion of the paper, especially since the change in OPM seems limited by the data transformation. It means that changes in GDM will in most cases be reflected directly in changes in ESC (as observed in figures 2 and 3), due to the facts that (1) for the ESC the lowest of the two sub-scores (OPM/GDM) counts as the final score, and (2) GDM is likely the lowest in most cases (perhaps the authors can calculate the percentage of cases?). Lower GDM almost would then always lead to lower ESC (unless the OPM was lower to start with, but that seems unlikely), and higher GDM would usually lead to higher ESC, when OPM was higher than original GDM.

Reply: We checked the data and added the newly calculated results (it is 95% for both untransformed and transformed data) to the results (paragraph on GDM) and to the first paragraph of the discussion.
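The rule the reviewer describes (the worse of the two module classes becomes the final ESC) can be sketched as follows. This is an illustrative sketch only: the function names and the (OPM, GDM) pairs are invented, classes are assumed to run from 1 (‘high’) to 5 (‘bad’), and counting ties as "determined by GDM" is an assumption, not necessarily how the 95% figure was derived.

```python
def ecological_status_class(opm_class, gdm_class):
    """Final ESC is the worse module class; classes run 1 ('high') to 5 ('bad')."""
    return max(opm_class, gdm_class)


def share_determined_by_gdm(samples):
    """Fraction of samples whose GDM class is at least as bad as the OPM class.

    Ties are counted as 'determined by GDM' (an assumption of this sketch).
    """
    decided = sum(1 for opm, gdm in samples if gdm >= opm)
    return decided / len(samples)


# Invented (OPM class, GDM class) pairs, purely for illustration.
samples = [(2, 3), (1, 2), (2, 2), (3, 2)]
final_classes = [ecological_status_class(opm, gdm) for opm, gdm in samples]
gdm_share = share_determined_by_gdm(samples)
```

With real data, `samples` would hold one pair per assessed site, and the same share could be computed once for abundance data and once for presence/absence data.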

Line 252-255: 28 stream types are mentioned, but figure 5A only has 25 panels, and 5B only has 27. Are some STs left out of the analysis, and if so, why? I’m also assuming that 5A is GFI and 5B is EPT[%], but this could be clarified in the text and especially in the legend of the figure.

Reply: The different metrics / modules are not calculated for all stream types (see Tab. S1). Therefore, the number of stream types shown differs.

Line 255-258: Here the authors also mention OPM results in relation to figure 5. Is any part of the figure representing OPM results?

Reply: The plots were missing in figure 5 and are now added as figure 6.

Line 261-262: Metrics using p/a data will not change when transforming abundance to p/a data. This is rather logical, and would not require stating, let alone using statistics to calculate a “perfect” correlation of 1.

Reply: Thanks, changed. Now reads: “As expected, metrics that use presence/absence of taxa (e.g. number of Trichoptera taxa) were perfectly correlated when calculated with presence/absence data (Spearman’s ρ = 1, Table S3).” We want to keep this part in since it shows that there are 3 categories of metrics, of which only two are problematic when calculating metrics and modules with pa data.

Line 269-271: This is the only mention of any results for the contribution of abundance data. Please show the data, or a figure representing the data.

Reply: Thanks, as mentioned above in response to the main critique we have now added a new figure (Fig. S2) and discuss this also briefly.

Line 274: It is unclear what the authors mean by “systematic errors”. I’m assuming it means shifts in ESC, from the paragraph that follows, but this could be made clearer.

Reply: Now reads “differ” instead of systematic errors. Since we observe differences in both directions, they can’t be considered systematic.

Line 274-280: Authors state a “limited number of cases”, but then cite shifts that amount to roughly 14% of the data points. In total one fourth of all ESCs changed (line 308). I feel this is more than just a “limited number”. Consider rephrasing to better represent the amount of class shifts.

Reply: We agree and rephrase the sentence, which now reads: “We found that presence/absence-based ESC estimates differed in 23% of the observed cases. Of major practical importance are shifts from ESC 2 (‘good’) to ESC 3 (‘moderate’, i.e. not meeting the requirements of the WFD). …”

Line 290-292: This sentence is hard to read. It would be better if it were reversed, e.g. “we observed significantly lower deviations in the metrics EPT [%] and % hyporhithral, and significantly higher deviations for SI, rheo index and GFI for ‘high’ ESCs”.

Reply: This sentence was removed (whole analysis).

Line 327 and throughout the manuscript: Spearman’s correlation values are denoted with ρ (rho) or rs.

Reply: done

Line 237-333: At first, the pelagic and littoral taxa have low abundances, but later they have high abundances? The authors probably refer to the transformation changing the proportions of pelagic/littoral versus non-pelagic/littoral taxa, so it might be better to talk about “proportions” rather than “abundances”.

Reply: We agree that the paragraph starting L321 in the old, L318 in the new version, was misleading and rephrased it. It now reads “This can be explained by the large number of mud-dwelling and littoral taxa, albeit with low abundances, and the fact that both metrics use raw abundance data (Table S3, red shading). As presence/absence data equalise proportions, these taxa disproportionately contribute to the index, consistent with the expectations of hypothesis (ii). High abundances of mud-dwelling and littoral taxa are considered atypical and/or bad for this ST, indicating a strong impact of flow modification and fine sediment entry. Transformation to presence/absence data systematically upweighted the importance of the low-abundance mud-dwelling and littoral taxa, leading to the systematically lower scores.”
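The upweighting effect described in this reply can be illustrated with a small sketch. The taxa, counts and the `share_of_flagged_taxa` helper are invented for illustration and are not taken from the data set; the point is only that a share based on raw abundances and a share based on presence/absence can differ strongly when the flagged taxa are rare.

```python
# Invented sample: (taxon, abundance, is_mud_dwelling_or_littoral).
sample = [
    ("taxon A", 200, False),
    ("taxon B", 150, False),
    ("taxon C", 2, True),
    ("taxon D", 1, True),
]


def share_of_flagged_taxa(sample, use_abundance):
    """Share of flagged taxa, weighted by abundance or by presence (1/0)."""
    def weight(count):
        if use_abundance:
            return count
        return 1 if count > 0 else 0

    total = sum(weight(count) for _, count, _ in sample)
    flagged = sum(weight(count) for _, count, is_flagged in sample if is_flagged)
    return flagged / total


abundance_share = share_of_flagged_taxa(sample, use_abundance=True)   # 3 of 353 individuals
presence_share = share_of_flagged_taxa(sample, use_abundance=False)   # 2 of 4 taxa
```

In this made-up case, two rare taxa account for under 1% of individuals but for half of the taxon list, which is the kind of shift the reply attributes to the transformation.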

Line 337: Shouldn’t this be 16.7% instead of 33%? As GFI is already 50% of the GDM.

Reply: In fact, both numbers were wrong. Since the GFI (or its equivalent in the case of ST21) takes up 50%, the remaining 50% is divided between the two remaining core metrics, leading to a value of 25% each. This is now corrected.

Line 340: This is 19% in figure 2.

Reply: Thanks for mentioning this, it is correct now.

Line 343: “significantly”. Has this been statistically tested? If not, consider using a different word here.

Reply: Now reads “considerably”.

Line 346: What do authors mean by “the end of the spectrum”?

Reply: The term unnecessarily complicates the facts and we removed this. The section now reads: “We also found a notable bias for ST6 and 7 (small fine and coarse substrate-dominated calcareous highland rivers), where presence/absence data performed systematically better than abundance data (20% and 25%, respectively).”

Line 348: These percentages also do not match figure 2 (nor figure 3).

Reply: Thanks again, changed to 20 and 25%.

Line 359: The class assignment is mentioned here. Have the authors also looked into raw scores, or just the final class assignments? It is likely that many of the samples that changed class were already close to a class border. In case the authors have looked at this, it might be worth including in the discussion.

Reply: Correlations for raw scores are shown in Table S3. Since they are highly correlated in most cases (always higher than the underlying module) a class border issue is highly probable. When looking at the results of the simulation exercise it can be seen that the class border often interferes with the confidence intervals for the metrics.

Line 364-367: It is unclear what the authors mean with this sentence, and which correlation they refer to.

Reply: We added the reference to Table S3 here.

Line 377-380: On the contrary, higher resolution data is compatible, since it can always be translated into lower resolution data. E.g., all chironomid species could just be tallied under “Chironomidae” if the standard method only requires family-level information.

Reply: We added a sentence to highlight this: “Yet, from this high resolution data all possible data sets required by any currently used WFD assessment method can be derived, for instance to comply with operational taxa lists at genus-level.”

Line 381: Consider removing/replacing the word “Importantly”, since the previous sentence already starts with “Most importantly”.

Reply: Deleted.

Line 404: I’m not sure that more DNA extraction comparisons are all that’s needed. There are many other factors that play a role in metabarcoding, such as primers and pre-processing of material that may have much greater impact on the resulting taxa lists.

Reply: Indeed, this level of detail is also not needed here. We removed the sentence.

Line 405-407: Consider including a reference to e.g. Aylagas et al 2018, in which the cost and time reduction was calculated.

Reply: Done

Line 554 / Figure 3: The number of samples is not the same in figure and legend.

Reply: Done.

Line 568-569 / Figure 5: Please state what figure A and B are representing. Maybe it would be good to include the slope/correlation values in the panels of the figures, instead of presenting them in a separate supplemental file.

Reply: A and B are now stated in the legend. We chose not to include the slope since it makes the plots less readable. The general tendency can be seen well without slopes.

Line 584 / S1 Table: Please don’t use abbreviations in the metric names, or provide abbreviations in the description instead of referring to a German website. Also, the names of the metrics in table S1 don’t match the names of the metrics in table S3.

Reply: We added definitions to the abbreviations to Table S1.

Line: 587-588 / S2 Table: Description is a little short. Correlation values of what? Are these correlations between abundance and p/a data? If so, would it be possible to combine tables S2 and S3 into one supplement? Both tables show correlation values for GDM, but they don’t match?

Reply: Yes, these are correlations between abundance and p/a data. We extended the caption text as requested. Concerning the combination: we consider it important to keep module (S2) and metric (S3) separate, as we also discuss them separately. Regarding the mismatch: the module table (S2) shows correlations based on the five classes. For the metrics (S3), the correlation is calculated for the GDM prior to assigning it to classes, i.e. the metric has a continuous value range from 0 to 1.
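Why the two tables can disagree is easy to see in a small sketch: Spearman's ρ computed on continuous scores need not equal ρ computed after the scores are binned into five classes, because binning introduces ties. The scores and the equal-width class boundaries of 0.2 below are hypothetical; the actual class boundaries are not given here.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den


def to_class(score):
    """Map a 0..1 score to class 1 (best) .. 5 (worst); 0.2-wide bins are hypothetical."""
    return 5 - min(int(score / 0.2), 4)


# Invented module scores for six samples under the two data types.
abundance_scores = [0.10, 0.25, 0.45, 0.55, 0.70, 0.90]
presence_scores = [0.15, 0.22, 0.38, 0.62, 0.65, 0.95]

rho_continuous = spearman_rho(abundance_scores, presence_scores)
rho_classes = spearman_rho(
    [to_class(s) for s in abundance_scores],
    [to_class(s) for s in presence_scores],
)
```

Here the continuous scores are perfectly rank-correlated, while the class-based correlation is lower because several samples fall on different sides of a class boundary.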

---

Reviewer #2: Review remarks on “Analysis of 13,000 benthic invertebrate samples from German streams reveals minor deviations in ecological status class between abundance and presence/absence data”, submitted as PONE-D-19-16057 to PLOS ONE by Buchner et al. 2019

General comments:

1. The study presents the results of original research.

Yes.

2. Results reported have not been published elsewhere.

Yes.

3. Experiments, statistics, and other analyses are performed to a high technical standard and are described in sufficient detail.

Partly (see specific comments).

4. Conclusions are presented in an appropriate fashion and are supported by the data.

Partly (see specific comments).

5. The article is presented in an intelligible fashion and is written in standard English. Yes

6. The research meets all applicable standards for the ethics of experimentation and research integrity.

Yes.

7. The article adheres to appropriate reporting guidelines and community standards for data availability.

Yes.

Specific comments:

Line 42 -46 Statement “Systematic stream type-specific deviations were found and differences between abundance and presence/absence data were most prominent for stream types where abundance information contributed directly to one or several metrics of the general degradation module.” not underpinned by results.

Reply: It actually is underpinned by the results, but we agree that too little information was provided. To more explicitly show that the statement is underpinned by data we added Figure S2, added a new sentence to results / discussion (see comment above to reviewer 1). Results are now shown in Figure S2 where it can be clearly seen that the mismatch is bigger the more abundance data is used in the calculation of the underlying module. Also mentioned in results and discussion now.

Line 47- 48 “The systematic decrease in scores was observed, even when considering simulated confidence intervals for abundance data.” not underpinned by results.

Reply: To highlight this result we changed Figure S3. Systematic decrease is now indicated by a line fitted to the dots using a linear model. It can be seen that values calculated with pa-data score systematically lower.

Line 129 -121 “,…thus representative of a variety of approaches developed for other BQEs or in other countries” not underpinned by results.

Reply: We added a citation to Birk et al. 2012.

Line 153 “transformed to presence/absence data”: not sufficiently described, how was the transformation done?

Reply: Now reads: “Abundances in the raw data were replaced by either 1 (presence) or 0 (absence) for this transformation”
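A minimal sketch of the transformation described in this reply, assuming a sample is held as a mapping from taxon to abundance. The taxon names and counts are invented, and `to_presence_absence` is a hypothetical helper, not the authors' code.

```python
# Invented taxa x abundance records for a single sample.
abundance = {
    "Baetis rhodani": 120,
    "Gammarus pulex": 45,
    "Tubifex tubifex": 0,
    "Ancylus fluviatilis": 3,
}


def to_presence_absence(sample):
    """Replace each abundance by 1 (abundance > 0) or 0 (absent)."""
    return {taxon: 1 if count > 0 else 0 for taxon, count in sample.items()}


presence_absence = to_presence_absence(abundance)
```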

Line 197 – 208: Simulation of taxon lists: the role of this exercise is not clear, because the results are neither presented nor discussed.

Reply: The results are shown in Figure S3 and described from Line 280-285.

Line 261 – 262: Metrics that use presence/absence data only were perfectly correlated when calculated with presence/absence data instead of abundance data (Spearman’s Rho = 1, Table S3). This looks like circular reasoning.

Reply: Changed. It now reads: “As expected, metrics that use presence/absence of taxa (e.g. number of Trichoptera taxa) were perfectly correlated when calculated with presence/absence data (Spearman’s ρ = 1, Table S3). Metrics that use abundance classes, such as the GFI, saprobic index, or rheo index, showed generally strong and significant correlations (mean Spearman’s ρ = 0.93, range: 0.89–0.96) between both data types.”

Line 281 – 286: Detailed analyses based on simulated abundance data for the 627 cases etc…. It is unclear, how was this done?

Reply: Please take a look at lines 191-203. Variation in abundance was estimated by using the abundance of each taxon as the mean of a zero-truncated Poisson distribution from which new data points were generated.
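One way such a simulation could look is sketched below. It is not the authors' implementation: the function names are invented, the observed abundance is used as the rate parameter of the underlying Poisson distribution (so for very small counts the mean after truncation is slightly above the observed value), and the truncation is done by simple rejection of zeros.

```python
import random


def poisson(lam, rng):
    """Draw from Poisson(lam) by counting unit-rate exponential arrivals in [0, lam]."""
    count = 0
    t = rng.expovariate(1.0)
    while t <= lam:
        count += 1
        t += rng.expovariate(1.0)
    return count


def zero_truncated_poisson(lam, rng):
    """Rejection sampling: redraw until the value is at least 1."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    while True:
        k = poisson(lam, rng)
        if k > 0:
            return k


def simulate_taxon_lists(observed, n_sim, seed=0):
    """Generate n_sim simulated taxon lists from observed abundances (taxon -> count)."""
    rng = random.Random(seed)
    return [
        {taxon: zero_truncated_poisson(count, rng) for taxon, count in observed.items()}
        for _ in range(n_sim)
    ]


# Invented observed abundances for two taxa.
simulated = simulate_taxon_lists({"taxon A": 1, "taxon B": 50}, n_sim=200)
```

Each simulated list keeps every observed taxon present, so metrics recalculated on the simulated lists vary only through abundance, which is what a confidence interval for the abundance-based score requires.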

Line 288 – 299 Why a detailed analysis of ST5 and ST14 only? What was the rationale in relation to the hypothesis?

Reply: Thanks for highlighting this point. The analysis was a by-product and plays no central role with respect to the analyses. We deleted all sections on this analysis.

Line 324 – 327 While strong positive correlations between data types were found for the GFI, the number of Trichoptera species and EPT [%], the results for proportions of pelagic and littoral taxa were poorly correlated (r = 0.41 and 0.56). What are pelagic and littoral invertebrates in streams?

Reply: We added the definitions to the MS (pelagic = mud-dwelling; littoral = inhabiting lentic shallow water habitats). Regarding the correlations we added Fig. S2 and a more detailed discussion of the results.

Line 390 - 391: However, correction factors can be included in the assessment formulas or class boundaries. Is this really meaningful and contradictory to arguments in Lines 98 through 102?

Reply: Good point. We removed the correction factors and now suggest using the data with a supervised machine learning approach.

Lines 420 -426: These arguments are repeated in the conclusions.

Reply: Lines 420 – 426 are the conclusions. We are not sure what the reviewer means here. However, we rephrased the conclusions slightly to focus them.

Lines 422 – 424: However, the direct transformation of abundance-based ESC assessment into a presence/absence-based approach is not possible and will require correction factors determined from comparisons of abundance based and presence/absence data.

For theoretical and mathematical reasons, there can never be a “direct transformation” between the two approaches. Convincing evidence from the analysis, and a convincing argument, is missing as to why large efforts should now be made to determine “correction factors”, while the authors at the same time advocate replacing the classical taxa × abundance matrices with taxa × presence/absence matrices obtained by the molecular characterisation of communities and entirely new sets of metrics and indices based on molecular data.

Reply: Agreed; we removed the discussion about correction factors and suggest using the data with a supervised machine learning approach. We prefer an approach in which we expand the classical system by adding new metrics using molecular data.

---

Reviewer #3: Review: Analysis of 13,000 benthic invertebrate samples from German streams reveals minor deviations in ecological status class between abundance and presence/absence data

The manuscript (MS) deals with a very interesting topic: whether presence/absence data are applicable to present monitoring activity and whether they give the same results as abundance-based metrics.

The MS is clear and contains a lot of effort and analysis to support its findings. The statistical tools used are adequate to answer the given scientific questions.

I have just minor comments on the MS.

Title: Why not put the exact number 13,312 in the title?

Reply: Done as requested

L31 “cheap and accurate”: this phrase occurs only in the abstract and is not mentioned in the Introduction. Please give a reference for why eDNA is cheaper and more accurate than traditional methods.

Reply: This is a difficult point; we changed the wording here. There are papers such as Elbrecht et al. (2017) and Aylagas et al. (2018) that suggest it can be much cheaper. However, we think it is not the price that should drive the shift to the method, and we provide a more balanced section on the price in the discussion.

L 39 “4.” not needed

Reply: Done

L 71 “The resulting score ..” – change to “The resulting score, the Ecological Quality Ratio (EQR), is then …”

Generally, the term EQR is used for assessment more often.

Reply: We stick to the term ESC since it is more commonly used for WFD assessment in Germany.

L77 If there are intercalibrated results compared among multiple countries why not focus only on these types and these metrics which were the intercalibration? The MS is only focusing on German stream types but with the use of a common intercalibration metric, the results could achieve more broad interest.

Reply: This is a great idea, but since we only have access to the German data, we have to stick to these data for now. This analysis may be addressed in a future publication.

L 150 “The total data set contained 13,401 samples obtained from monitoring sites in 1985–2013,”

The question is why all of these data were used. This assumes that the data are of the same quality from 1985 to 2013, but the WFD is much younger; the quality could differ from year to year, and this could interfere with the differences between presence/absence and abundance assessments.

Also, in some types there appears to be an order of magnitude more analysed samples than in others. Why not standardize or limit the number of observations among types?

Why not limit the analyses to a smaller time frame, such as that of the second River Basin Management Plan?

Reply: This is also an interesting idea. Since these are official data from German federal states, we assume an overall high quality of the data, especially because macroinvertebrates were extensively monitored using similar protocols before the WFD. Limiting the dataset to a time frame might also add seasonal biases and reduce the statistical power of the analysis. Therefore, we will stick to the complete dataset.

L 154 What was the cause of errors in the calculation? Is it just mathematical or systematic for a given typology?

Reply: The taxa lists provided for these samples were not sufficient to calculate the required metrics. Therefore, ASTERICS returns an error when trying to calculate metrics from them.

L 159 Appendix S1 was not included in the MS

Reply: Apologies, all scripts are now included.

L 255 FIG 5 should be in the supplement, not in the MS. It is not informative in this form, nor can the min and max values be resolved.

Reply: As stated before it shows a trend towards generally narrower distribution of datapoints after transformation, which in our opinion adds value to the overall analysis. However, we leave the final decision to the editor.

L 279 (Table S4) This is one of the most informative tables, it should be in the MS text not in the supplement (include the number of unchanged “good” sites not only the shifts)

Reply: Table 1 in the MS gives the same information but is not limited to class shifts from 2 -> 3 or 3 -> 2. Therefore, we chose to put this much more generic table into the main MS, because it shows all shifts. However, if the reviewer and editor want us to shift it to the main MS we can for sure do so.
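As an illustration of the kind of tally that such a shift table summarises, the following sketch counts class shifts between the two data types. The class values are invented for the example and the function name is hypothetical; the only assumption taken from the WFD is that status classes run from 1 (high) to 5 (bad), so a positive shift means a worse status.

```python
from collections import Counter

def tally_shifts(abd_classes, pa_classes):
    """Count shifts (presence/absence class minus abundance class) per site."""
    if len(abd_classes) != len(pa_classes):
        raise ValueError("both lists must describe the same sites")
    return Counter(pa - abd for abd, pa in zip(abd_classes, pa_classes))

def as_percent(counts):
    """Convert a shift tally to percentages of all sites."""
    total = sum(counts.values())
    return {shift: 100.0 * n / total for shift, n in sorted(counts.items())}

# Hypothetical status classes for ten sites (1 = high ... 5 = bad).
abd = [2, 2, 3, 3, 1, 4, 2, 5, 3, 2]
pa = [2, 3, 3, 2, 1, 4, 2, 5, 3, 3]

shifts = tally_shifts(abd, pa)
print(as_percent(shifts))  # shift 0 = unchanged, +1 = one class worse
```

Unchanged sites appear under shift 0, so a single tally of this kind covers both the shifts and the unchanged cases.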

Fig 1. branch of “excluding data-… “ not needed, use the 13,312 samples

Reply: We think it is important to keep the current figure in order to keep all analyses performed on the available data sets reproducible (we had more than 13,312 initially).

Table S4 include the number of unchanged “good” sites not only the shifts

Reply: Please see comment on Table 1 in the MS. Unchanged sites can already be easily seen there.

Attachment

Submitted filename: Reviewer.docx

Decision Letter 1

Fabrizio Frontalini

3 Dec 2019

Analysis of 13,312 benthic invertebrate samples from German streams reveals minor deviations in ecological status class between abundance and presence/absence data

PONE-D-19-16057R1

Dear Dr. Leese,

We are pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it complies with all outstanding technical requirements.

Within one week, you will receive an e-mail containing information on the amendments required prior to publication. When all required modifications have been addressed, you will receive a formal acceptance letter and your manuscript will proceed to our production department and be scheduled for publication.

Shortly after the formal acceptance letter is sent, an invoice for payment will follow. To ensure an efficient production and billing process, please log into Editorial Manager at https://www.editorialmanager.com/pone/, click the "Update My Information" link at the top of the page, and update your user information. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to enable them to help maximize its impact. If they will be preparing press materials for this manuscript, you must inform our press team as soon as possible and no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

With kind regards,

Fabrizio Frontalini

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: All comments have been addressed

Reviewer #3: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes

Reviewer #3: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #3: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #3: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #3: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: I thank the authors for their thorough reply to all comments submitted in the previous review round. The most pressing issues were all addressed in the revised manuscript, and I have no further comments at this point.

Reviewer #3: (No Response)

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Kevin K. Beentjes

Reviewer #3: Yes: Gabor Varbiro

Acceptance letter

Fabrizio Frontalini

12 Dec 2019

PONE-D-19-16057R1

Analysis of 13,312 benthic invertebrate samples from German streams reveals minor deviations in ecological status class between abundance and presence/absence data

Dear Dr. Leese:

I am pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please notify them about your upcoming paper at this point, to enable them to help maximize its impact. If they will be preparing press materials for this manuscript, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

For any other questions or concerns, please email plosone@plos.org.

Thank you for submitting your work to PLOS ONE.

With kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Fabrizio Frontalini

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    S1 Fig. Expected relationships between abundance (x-axis) and presence/absence-based (y-axis) metrics.

    (PDF)

    S2 Fig. Abundance metric analysis.

    Deviation of the observed classification in the ecological status class (y-axis) in relation to the contribution of abundance information for the calculation of the module (CAB).

    (PDF)

    S3 Fig. Status class comparisons.

    Comparison of abundance (‘abd’, black line with green confidence interval) and presence/absence (red dots) assessment results for 627 stream sites for which the transformation led to a status class shift from 2 to 3. Background shading indicates ecological status class intervals. These are geometrically defined for the %EPT and German fauna index (GFI) and therefore much narrower than for the German saprobic index.

    (PDF)

    S1 Table. Overview of metrics used for assessment of the 30 different stream types.

    Metrics that do not use abundance data are shown in green, metrics that use abundance classes are shown in yellow, and metrics that use raw abundance data are shown in red. HMWB = Heavily modified water bodies. Abbreviations are listed. For further reference see http://www.fliessgewaesserbewertung.de/kurzdarstellungen/bewertung/

    (XLSX)

    S2 Table. Correlation analysis.

    Overview of correlation values (Spearman's ρ) for the ecological status class (ESC) and its three (two) underlying assessment modules, calculated with abundance data and presence/absence data.

    (XLSX)

    S3 Table. Correlation between different metrics using abundance and presence/absence data (Spearman's ρ).

    Metrics that do not use abundance data are shown in green, metrics that use abundance classes are shown in yellow, and metrics that use raw abundance data are shown in red. Values are only shown in cells for stream types for which the metric is actually used.

    (XLSX)

    S4 Table. Status class shifts.

    Shifts from good to moderate (2→3) ecological status class, or vice versa (3 → 2), for the 30 analysed stream types after transformation to presence/absence data.

    (XLSX)

    S1 File. Programming code used for data analysis.

    (ZIP)

    Attachment

    Submitted filename: Reviewer.docx

    Data Availability Statement

    Supporting Information S2–S4 Tables contain metrics results and correlation analyses based on 13,312 individual taxa lists. These raw data (taxa lists for the individual stream sites) cannot be shared publicly because they are owned by the individual federal states. All raw data were requested from and are available according to the Environmental Information Act (Umweltinformationsgesetz, UIG, 14.2.2005) from the individual federal states/institutions/persons as stated below. The authors did not receive special access privileges to the data: Baden-Württemberg: LUBW Landesanstalt für Umwelt, Messungen und Naturschutz Baden-Württemberg; Andreas Hoppe - Andreas.Hoppe@lubw.bwl.de; Bavaria: Bayerisches Landesamt für Umwelt; Folker Fischer - poststelle@lfu.bayern.de; Hesse: Hessisches Landesamt für Naturschutz, Umwelt und Geologie; Elisabeth Schlag - Elisabeth.Schlag@hlnug.hessen.de; Mecklenburg-West Pomerania: Landesamt für Umwelt, Naturschutz und Geologie Mecklenburg-Vorpommern Abteilung Geologie, Wasser und Boden; Andre Steinhaeuser - andre.steinhaeuser@lung.mv-regierung.de; Lower Saxony: Niedersaechsischer Landesbetrieb für Wasserwirtschaft, Küsten- und Naturschutz; Eva Bellack - Eva.Bellack@nlwkn-hi.niedersachsen.de; North Rhine-Westphalia: Jochen Lacombe (Landesamt fuer Umwelt-, Natur- und Verbraucherschutz NRW, LANUV) - Jochen.Lacombe@lanuv.nrw.de; Saxony: Sächsisches Landesamt für Umwelt, Landwirtschaft und Geologie, Abteilung Wasser, Boden und Wertstoffe und Betriebsgesellschaft für Umwelt und Landwirtschaft; Antje Mickel - Antje.Mickel@smul.sachsen.de; Saxony-Anhalt: Landesbetrieb für Hochwasserschutz und Wasserwirtschaft Sachsen-Anhalt; Data are freely accessible or can be requested via Martina Jährling - http://gldweb.dhi-wasy.com/gld-portal/; Schleswig-Holstein: Landesamt für Landwirtschaft, Umwelt und ländliche Räume SH; Annegret Holm - Annegret.Holm@llur.landsh.de.


    Articles from PLoS ONE are provided here courtesy of PLOS
