Author manuscript; available in PMC: 2022 Dec 1.
Published in final edited form as: Remote Sens Environ. 2021 Dec 1;266:1–14. doi: 10.1016/j.rse.2021.112685

Satellites for long-term monitoring of inland U.S. lakes: The MERIS time series and application for chlorophyll-a

Bridget N Seegers a,b,*, P Jeremy Werdell a, Ryan A Vandermeulen a,c, Wilson Salls d, Richard P Stumpf e, Blake A Schaeffer d, Tommy J Owens a,f, Sean W Bailey a, Joel P Scott a,f, Keith A Loftin g
PMCID: PMC9680834  NIHMSID: NIHMS1746942  PMID: 36424983

Abstract

Lakes and other surface fresh waterbodies provide drinking water, recreational and economic opportunities, food, and other critical support for humans, aquatic life, and ecosystem health. Lakes are also productive ecosystems that provide habitats and influence global cycles. Chlorophyll concentration provides a common metric of water quality, and is frequently used as a proxy for lake trophic state. Here, we document the generation and distribution of the complete MEdium Resolution Imaging Spectrometer (MERIS; Appendix A provides a complete list of abbreviations) radiometric time series for over 2300 satellite resolvable inland bodies of water across the contiguous United States (CONUS) and more than 5,000 in Alaska. This contribution greatly increases the ease of use of satellite remote sensing data for inland water quality monitoring, as well as highlights new horizons in inland water remote sensing algorithm development. We evaluate the performance of satellite remote sensing Cyanobacteria Index (CI)-based chlorophyll algorithms, the retrievals for which provide surrogate estimates of phytoplankton concentrations in cyanobacteria dominated lakes. Our analysis quantifies the algorithms’ abilities to assess lake trophic state across the CONUS. As a case study, we apply a bootstrapping approach to derive a new CI-to-chlorophyll relationship, ChlBS, which performs relatively well with a multiplicative bias of 1.11 (11%) and mean absolute error of 1.60 (60%). While the primary contribution of this work is the distribution of the MERIS radiometric timeseries, we provide this case study as a roadmap for future stakeholders’ algorithm development activities, as well as a tool to assess the strengths and weaknesses of applying a single algorithm across CONUS.

Keywords: MERIS timeseries, Inland waters, Remote sensing, Algorithm validation, Chlorophyll-a, Water quality

1. Introduction

Lakes and other inland waterbodies cover ∼3% of the Earth’s continental surface (Downing et al., 2006). They provide critical ecosystems and habitats and offer essential support for human health and well-being by providing drinking water, food, and recreation. Furthermore, lakes contribute to global-scale processes through their influence on many aspects of the biosphere, including methane fluxes (e.g., Bastviken et al., 2004; Walter et al., 2006), carbon cycles (MacKay et al., 2009; Mendonça et al., 2017), and other factors that regulate Earth’s climate (e.g. Cole et al., 2007; Downing, 2009, 2010; Raymond et al., 2013; Tranvik et al., 2009). Global lake conditions are often sensitive to human behaviors, such as land use changes, and thus may benefit from management and mitigation activities. It is therefore beneficial that lakes are monitored for long-term changes in water quality and anthropogenic induced responses that can influence human and ecosystem health, as well as global-scale processes. Robust and consistent water quality data sets remain critical for effective and interpretable lake monitoring.

Measurements of near-surface concentrations of the photosynthetic pigment chlorophyll-a (Chl; μg L−1) provide a useful and commonly measured metric of water quality. Most federal, state, and local in situ water quality monitoring programs collect Chl, albeit with varied measurement techniques (U.S. Environmental Protection Agency, 2000, 2009). The most conventional approach to obtaining Chl is collecting a discrete in situ sample from a dock or boat/ship (U.S. Environmental Protection Agency, 2011). Although high quality in situ-based sampling provides an essential way to measure a comprehensive suite of lake conditions and variables, it has limitations, especially in terms of temporal and spatial coverage. The large number of inland waters, many of which are in remote locations, makes routine in situ sampling costly and logistically challenging, if not nearly impossible over long distances and durations (Papenfus et al., 2020). Single or limited samples from a waterbody may not provide meaningful synopses of the state of the entire waterbody (e.g., Lesht et al., 2018; Kallio et al., 2003; Vos et al., 2003). While temporal limitations can be partially overcome using continuous probes, their spatial distributions remain sparse.

Satellite-borne ocean color instruments offer useful complements to in situ data collection that overcome spatio-temporal sampling limitations. At a minimum, these spectroradiometers measure visible and near-infrared (NIR) radiances at discrete wavelengths at the top-of-the atmosphere. Broadly speaking, atmospheric correction algorithms are applied to remove contributions of the atmosphere and surface reflection from the total signal (e.g., Mobley et al., 2016). Bio-optical algorithms are then applied to the remaining aquatic reflectances to produce estimates of biogeophysical properties, such as Chl (e.g., Hu et al., 2012, 2019; O’Reilly and Werdell, 2019), near-surface concentrations of suspended sediments (e.g., Ondrusek et al., 2012), spectral inherent optical properties (Werdell et al., 2018), and other indices of water quality and composition (e.g., Lee et al., 2007; Olmanson et al., 2008). Such remotely sensed data records can be used effectively to assess long-term system changes on broad spatial and temporal scales, as well as observe dynamic short-term events, such as episodic algal blooms. These data records can also be reevaluated retrospectively as remote sensing methods improve and lake management and mitigation needs evolve. Altogether, these capabilities offer an increased ability to quantify satellite resolvable inland water bodies’ ecosystem events, such as harmful algal blooms, enabling statistical inferences on their temporal frequency, spatial extent, magnitude, and occurrence (Clark et al., 2017; Urquhart et al., 2017; Mishra et al., 2019; Coffer et al., 2020).

Satellite data sets provide value for early warning event detection and decision support when other monitoring resources are limited or not available for deployment. Additionally, annual potential avoided costs associated with satellite Chl measures were estimated between $5.7 and $316 million US$ (Papenfus et al., 2020). That said, in practice they provide a complement to, not a replacement for, in situ lake monitoring, as remote sensing measurements have intrinsic challenges. First, the temporal distributions of useful retrievals are dictated by the confounding interferences of clouds and atmospheric aerosols (Frouin et al., 2019) and satellite-specific orbital repeatability (IOCCG, 2007), both of which influence the frequency with which any given water body is observed. Second, ocean color measurements are restricted by nature to a spectrally-dependent near-surface layer (Gordon and McCluney, 1975). Third, the measurements are considered most reliable when collected offshore because of complications with adjacency effects (Bulgarelli et al., 2014) and mixed land-water pixels, both of which limit the assembly of a complete synopsis of a waterbody. Additionally, most satellite instruments designed for global ocean color have ground sample sizes (e.g. sensor pixel size) between 300 and 1000 m (Werdell and McClain, 2019), which limit the number of resolvable water bodies (Clark et al., 2017). As such, the inland waters discussed in this paper refer primarily to larger lakes and reservoirs and not narrower or smaller waters such as streams, rivers, smaller lakes/reservoirs, and ponds (Fig. 1). The ground sample size also establishes a lower limit on the spatial scale of any aquatic feature that can be detected (that is, intrapixel variability cannot be resolved). 
Finally, ocean color remote sensing provides a finite number of available biogeophysical products relative to the larger number of variables that can be measured via in situ sampling (e.g., nutrient and toxin concentrations cannot be directly estimated from ocean color alone) (IOCCG, 2018).

Fig. 1.

Maps of resolvable lakes for CONUS and Alaska. Black represents satellite-resolvable lakes in the inland waters data set. A Minnesota inset map (dashed grey rectangle) highlights the large number of lakes across the state, which account for 70% of the lakes in this study. Resolvable lakes in CONUS meet a minimum three satellite pixel requirement. Alaska lakes only have a single pixel requirement; therefore, smaller lakes are shown in Alaska.

Despite some limitations, 40 years of satellite ocean color has successfully facilitated insights into the long-term state of waterbodies, as well as their spatial and temporal variabilities (e.g., Binding et al., 2015; Dutkiewicz et al., 2019; Gregg et al., 2017). Heritage satellite ocean color algorithms, however, were developed primarily for the global ocean and cannot always be appropriately applied to lakes and reservoirs given their optical complexity, overlying atmospheric conditions and altitudes, and proximity to land, to name only a few confounding issues (IOCCG, 2018). A variety of approaches has more recently emerged to address these challenges and extend Chl estimates from the oligotrophic open ocean to diverse systems and hypereutrophic lakes. Neil et al. (2019) presents a thorough review of 19 different approaches to derive inland water Chl from satellite ocean color that consider a range of empirical, semi-analytical, peak height methods, and neural network methods and an assessment of their performance in optically complex inland waters. Their results revealed the difficulty of establishing a single standardized approach. A need exists, however, to standardize metrics used for water quality assessment, especially for management that has human health implications (Clark et al., 2017; Coffer et al., 2020; Mishra et al., 2019; Urquhart et al., 2017). Standardized remote sensing products would also provide a foundation with which to compare lake quality assessments around the planet, which would ultimately reduce uncertainties related to their consideration in comparative or diagnostic models. Furthermore, meaningful scientific insights, such as trend analyses, can only be robustly realized with standardized data processing applied to the extended satellite record. 
To that end, a well-vetted, publicly available satellite inland water data record would greatly reduce the efforts associated with algorithm development and performance assessment, while also providing a consistent data set to enable broad algorithm refinement as well as determination if an approach requires regional adjustments.

The Cyanobacteria Assessment Network (CyAN) (Schaeffer et al., 2015) is one effort to apply consistent remote sensing methods to operational monitoring of inland water bodies, primarily lakes and reservoirs. CyAN is a joint U.S. Environmental Protection Agency (EPA), National Aeronautics and Space Administration (NASA), National Oceanic and Atmospheric Administration (NOAA), and U.S. Geological Survey (USGS) effort with a goal to produce and distribute operational, low latency remotely sensed metrics of cyanobacteria dominated harmful algal blooms (cyanoHABs) presence across observable contiguous United States (CONUS) and Alaskan lakes. The European Space Agency (ESA) Medium Resolution Imaging Spectrometer (MERIS, 2002–2012) onboard Envisat and the two Ocean and Land Color Instruments (OLCI, 2016-present) onboard Sentinel-3A and −3B provide the source of CyAN satellite data records. These ocean color instruments provide global coverage every 2–3 days at ∼300 m resolution. As part of the CyAN effort, mission-long data records for MERIS and OLCI are processed and distributed for the full CONUS and Alaska by the NASA Ocean Biology Processing Group (OBPG; https://oceancolor.gsfc.nasa.gov) at Goddard Space Flight Center.

These satellite data records have been recognized as a valuable tool to reduce public exposure to harmful events by guiding water quality sampling and beach closures in several states such as Utah, Wyoming, Oregon, and New Jersey (Schaeffer et al., 2018; WDEQ, 2019; OHA, 2019; NJDEP, 2020). The utility of the CyAN data set has been further demonstrated in a number of studies that quantify diverse aspects of cyanobacteria blooms across CONUS, including bloom extent, frequency, and severity (Clark et al., 2017; Coffer et al., 2020; Mishra et al., 2019; Urquhart et al., 2017). For example, motivated by human health and drinking water supply concerns, Clark et al. (2017) used CyAN satellite data products to demonstrate the ability of remote sensing to assist with the allocation of limited management resources across diverse lakes and regions. Mishra et al. (2019) established a satellite-based method to quantify seasonal and annual bloom magnitudes for lakes to further support decision making and resource allocation. And, with the goal of quantifying the socioeconomic benefits of remote sensing, Stroming et al. (2020) evaluated the use of CyAN satellite data for monitoring toxic cyanobacteria events in Utah Lake, Utah, a popular recreational waterbody, and estimated that reduced illnesses and improved human health outcomes from such monitoring could result in US$55,000 to more than US$1 million in benefits per cyanobacteria event. Time series of satellite data have been utilized for a variety of inland water analyses to examine the evolution of a single bloom event (Wynne et al., 2008), as well as larger area trends on seasonal and annual timescales (Urquhart et al., 2017; Coffer et al., 2020). Using the data set presented here, Coffer et al. (2020) demonstrated that the CONUS bloom season reported in the CyAN satellite data is well-supported in the literature, with blooms rising gradually starting in late spring and reaching a maximum in late summer/early autumn. Mishra et al. (2021) showed the CyAN product had an 84% accuracy for bloom detection based on the ability to match state-reported toxin levels that indicated bloom or no-bloom conditions.

The purpose of this paper is two-fold. First, we describe the production and distribution of the MERIS lakes time-series, which we have made publicly available (https://oceancolor.gsfc.nasa.gov/projects/inlandwaters/). This inland waters data set (ILW) contains 10 years (2002−2012) of observations. ILW will be expanded to include OLCI on both Sentinel-3A (2016-present) and Sentinel-3B (2018-present) as those data become available and are quality controlled (see Appendix B). By design, the distribution of ILW can substantially reduce the processing effort required by end users to work with the MERIS (and eventually OLCI) data for these inland bodies of water and, as such, we offer it as a standardized community resource for future lake and reservoir algorithm development and performance assessment. Second, to highlight the utility of ILW, we offer a case study of Chl algorithm development and performance assessment across CONUS, using the Cyanobacteria Index (CI) algorithm; the complete sequence of CI development is detailed in Coffer et al. (2020) and reviewed below. Our purpose in pursuing this case study is not to unequivocally recommend another Chl algorithm, as viable alternatives exist, all with their own strengths and weaknesses (Binding et al., 2011, 2013, 2019; Gower et al., 2005; Kutser, 2009; Lesht et al., 2013; Matthews et al., 2012; Matthews and Odermatt, 2015; Moses et al., 2012; Neil et al., 2019, to name only a few). Rather, we pursued this case study to provide a demonstration of how a unified approach, developed using a standardized satellite data set, performs for a diverse set of lakes across the CONUS. A wide diversity of CyAN stakeholders, with varied experience and investment in satellite ocean color, use the daily and weekly imagery distributed by NASA. 
We believe that demonstrating the development and assessment of Chl estimates through this case study may increase stakeholder accessibility, familiarity, use, and comfort with development and refinement of a derived biogeophysical variable. Our ultimate goal, however, remains creating awareness of ILW in the community. This data record is publicly available for use as-is to enable exploration of additional remote sensing algorithm development for U.S. lakes and inland waters.

2. Methods

2.1. Satellite data processing

Calibrated, geolocated top-of-atmosphere (Level-1B) MERIS data were acquired from the OBPG. The OBPG redistributes this data record through a data sharing agreement between NASA and ESA. The Level-1B data were processed to Level-2 imagery, which have the same projection and resolution as the Level-1 source data, by removing the contribution of spectral Rayleigh scattering from the top-of-atmosphere signal. This Rayleigh-corrected top-of-atmosphere reflectance (ρs(λ); unitless) was generated at 413, 443, 490, 510, 560, 620, 665, 681, 709, 754, and 885 nm. ILW includes both ρs(λ) and CIcyano, discussed in detail below. The spectral remote sensing reflectances Rrs(λ) are not provided, as it has been previously demonstrated that the standard OBPG atmospheric correction algorithm underperforms for many inland water bodies (see, e.g., Pahlevan et al., 2017; Warren et al., 2019).

Several processing masks were applied to exclude questionable Level-2 data. An inland waters specific cloud flag was adopted, as the default OBPG ocean processing cloud flag is occasionally triggered by highly reflective waters from blooms or suspended sediments (Wynne et al., 2018). For CONUS, a high resolution (∼60 m) land mask based on the NASA Shuttle Radar Topography Mission Water Body Data Shapefiles (NASA JPL, 2013) was used, with modifications by Urquhart (2018) to correct for embedded inaccuracies in that data set, such as missing lakes and reservoirs in Rhode Island and Massachusetts. As this is a static land mask, a flag for mixed land-water pixels was developed to identify cases where the land mask reported a water pixel, but that pixel did not contain water at the time of satellite observation, which is possible due to the ephemeral spatial extents of inland waters. Finally, flags to indicate potential contamination due to adjacency effects and to identify snow or ice covered water bodies were also applied (Wynne et al., 2018).

CIcyano was calculated from ρs(λ) after masking (Eq. (1)). This derivative spectral shape, or line-height, algorithm was selected by the CyAN Project to provide cyanoHAB detection. Through a baseline subtraction that effectively normalizes the absolute signal, line-height algorithms evaluate derivative spectral shape (curvature) in targeted spectral regions – in this case, Chl and phycobilin absorption. Line-height algorithms tend to be less sensitive to atmospheric conditions and satellite instrument calibration and data processing artifacts relative to alternative spectral matching and band ratio approaches for Chl estimation (Hu et al., 2019). Examples of other common line-height algorithms applied to inland waters include the Maximum Chlorophyll Index (MCI) (Binding et al., 2011, 2013, 2019; Lesht et al., 2013) and the maximum peak height (MPH) (Matthews et al., 2012; Matthews and Odermatt, 2015).

The CyAN implementation of CI proceeds as follows. First, derivative spectral shapes (SS) around 665 and 681 nm are calculated via:

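$$SS(\lambda) = \rho_s(\lambda) - \rho_s(\lambda^-) + \left[\rho_s(\lambda^-) - \rho_s(\lambda^+)\right] \frac{\lambda - \lambda^-}{\lambda^+ - \lambda^-} \tag{1}$$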

where the superscripts − and + indicate one sensor waveband less and more, respectively, than the target sensor waveband. The λ−, λ, and λ+ for MERIS SS(681) encompass sensor wavebands 665, 681, and 709 nm, while SS(665) incorporates 620, 665, and 681 nm (Lunetta et al., 2015). These SS are then used in a decision tree to identify cyanoHAB presence. The original implementation defined CI = −SS(681), with a positive CI defined as cyanoHAB presence, following the assumption that cyanobacteria are likely present when ρs(681) falls below its baseline value, which results from a combination of insignificant fluorescence and strong chlorophyll absorption from cyanobacteria at 681 nm (Seppälä et al., 2007; Wynne et al., 2008; Binding et al., 2011). Use of SS(681) alone, however, occasionally misidentifies other non-cyanobacteria phytoplankton blooms as cyanoHABs and, furthermore, cannot ubiquitously provide robust estimates of non-cyanobacteria biomass (Matthews et al., 2012; Wynne et al., 2010, 2013). As such, CI was augmented to also consider SS(665) to provide an additional metric for constraining CyAN estimates to cyanobacteria biomass. A spectral shape centered on 665 nm was used to identify the presence of phycocyanin, which separates cyanobacteria from other blooms (Lunetta et al., 2015). This approach was also used by Matthews et al. (2012) for detecting cyanobacteria in African lakes. A positive SS(665) further indicates cyanoHAB presence, following the assumption that phycocyanin absorption depresses ρs(620) and alters the curvature around 665 nm. The sign of SS(665) can be used to assign the derived CI value as either CIcyano or CInoncyano, with the subscripts indicating the presence of cyanobacteria or not, respectively. ILW only includes CIcyano, as the priority of CyAN is cyanoHAB detection, noting that CI and CInoncyano can be easily determined using the provided ρs(λ) (Eq. (1)). Wynne et al. (2018) provide additional details on the CyAN suite of products, as well as quality assurance metrics and additional exclusion criteria applied.
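As a minimal sketch of Eq. (1) and the sign-based assignment described above, the following Python computes SS(681), SS(665), and CI for a single pixel. The function names, the dictionary input keyed by band center, the example reflectances, and the handling of non-positive CI are illustrative assumptions; the operational CyAN decision tree and exclusion criteria are those of Wynne et al. (2018) and are not reproduced here.

```python
def spectral_shape(rho_minus, rho, rho_plus, lam_minus, lam, lam_plus):
    """Eq. (1): height of rho(lam) relative to a linear baseline drawn
    between the neighbouring bands lam_minus and lam_plus."""
    return (rho - rho_minus
            + (rho_minus - rho_plus) * (lam - lam_minus) / (lam_plus - lam_minus))


def classify_ci(rho_s):
    """rho_s: dict mapping MERIS band centre (nm) to Rayleigh-corrected
    reflectance (hypothetical input layout)."""
    ss681 = spectral_shape(rho_s[665], rho_s[681], rho_s[709], 665.0, 681.0, 709.0)
    ss665 = spectral_shape(rho_s[620], rho_s[665], rho_s[681], 620.0, 665.0, 681.0)
    ci = -ss681
    if ci <= 0:
        label = "no detection"   # assumption: non-positive CI is not reported
    elif ss665 > 0:
        label = "CIcyano"        # positive SS(665) taken as a phycocyanin signal
    else:
        label = "CInoncyano"
    return ci, label


# Made-up reflectances for illustration only.
example = {620: 0.030, 665: 0.032, 681: 0.028, 709: 0.040}
ci, label = classify_ci(example)
print(ci, label)
```

Because the baseline is subtracted, a spectrum that is exactly linear across the three bands yields SS = 0 regardless of its absolute magnitude, which is the normalizing property attributed to line-height algorithms above.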

As the final ILW processing step, Level-3 composites of ρs(λ) and CIcyano for CONUS were generated from the Level-2 imagery using the OBPG’s standard software and processes. This involved generation of Level-3 bin files covering CONUS using an integerized sinusoidal projection (Campbell et al., 1995), followed by production of Level-3 Standard Mapped Images (SMI) using a Plate Carrée projection and nearest neighbor weighting with a 300 m bin size. The bins for the ρs(λ) and CIcyano products are based on where the maximum CIcyano value is found. Daily imagery was produced for all products for all locations with valid satellite retrievals. The ρs(λ) and CIcyano products were further temporally composited as mean values over 7-day (centered on Wednesday), monthly, rolling 28-day, and seasonal ranges. Monthly and seasonal climatologies were also generated. ILW SMIs are provided as complete CONUS and Alaska maps stored as netCDF files, which provide flexibility for a wide range of potential end users. To further ensure accommodation of all potential end users, Appendix C provides details and recipes for reprojecting the SMI imagery into alternate map projections (e.g., Albers conic projection, which is used in operational CyAN processing), as well as for extracting regional spatial subsets and saving the imagery in alternate file formats (e.g., GeoTIFF). All data are publicly available via the NASA OceanColorWeb site (https://oceancolor.gsfc.nasa.gov/projects/inlandwaters/). Furthermore, all source code is available through the distribution of SeaDAS.
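To illustrate the 7-day temporal compositing, the sketch below finds the week containing a given date and averages the valid daily values of one pixel. It assumes that a window centered on Wednesday runs from Sunday through Saturday and that masked retrievals are simply skipped; the function names and the date-keyed dictionary are hypothetical and do not represent the OBPG binning software.

```python
from datetime import date, timedelta


def week_window(day):
    """Return (start, end) of the Sunday-to-Saturday week containing `day`,
    i.e. the 7-day window centred on that week's Wednesday."""
    days_since_sunday = (day.weekday() + 1) % 7  # weekday(): Monday=0 ... Sunday=6
    start = day - timedelta(days=days_since_sunday)
    return start, start + timedelta(days=6)


def weekly_mean(daily_values, day):
    """Mean of the valid daily values in the week containing `day`.

    daily_values: hypothetical dict mapping date -> float, or None where the
    pixel was masked (cloud, land, ice, ...). Returns None if nothing is valid.
    """
    start, end = week_window(day)
    valid = [v for d, v in daily_values.items()
             if start <= d <= end and v is not None]
    return sum(valid) / len(valid) if valid else None


# Made-up CIcyano values for a single pixel.
pixel = {
    date(2011, 9, 5): 0.004,
    date(2011, 9, 7): None,     # masked retrieval
    date(2011, 9, 10): 0.006,
    date(2011, 9, 12): 0.020,   # falls in the following week
}
print(week_window(date(2011, 9, 10)))
print(weekly_mean(pixel, date(2011, 9, 10)))
```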

2.2. In situ data

In situ Chl measurements based on discrete water samples from the Water Quality Portal (https://www.waterqualitydata.us) were acquired from the USGS CyAN Field Integrated Exploratory Lakes Database (Eslick et al., 2019). The in situ data were for CONUS only. The initial search for Chl data from 2002 to 2012 resulted in 547,783 measurements. After filtering these data using the criteria described below, the sample size reduced by 67%, leaving a final count of 148,018 measurements for use in this study. Data filtering used the following steps: (1) samples collected at >0.5 m depth were discarded, as were those lacking a reported depth value, to ensure that only near-surface samples were considered; (2) negative Chl values and extremely high values (> 2000 μg L−1) were discarded, as they are extreme outliers for this data set, as well as outside the range for meaningful satellite detection; (3) samples with different reported start and end dates were removed, as it was impossible to ascertain the actual collection time; (4) only sample types labeled “Sample-Routine,” “Field Msr/Obs,” or “Sample,” were retained, as other sample type designations indicate measurements for laboratory quality control that are not appropriate for validation; and (5) replicate samples with identical dates, times, locations and depths were removed. The linear distance from an in situ sampling station to the nearest shore was estimated from each sample’s coordinates using a revised version of the National Hydrography Dataset Plus lakes shapefile (NHDPlus v2.0 polygons; U.S. Geological Survey, 2012). Only samples collected >300 m from shore were retained to minimize potential inclusion of land-water pixels or those contaminated by adjacency effects or bottom reflectance in optically shallow water.
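The filtering steps above can be sketched as a single pass over the records. The field names (`depth_m`, `chl_ug_l`, `dist_to_shore_m`, and so on) are hypothetical and do not correspond to Water Quality Portal column names; the thresholds are those stated in the text.

```python
ALLOWED_TYPES = {"Sample-Routine", "Field Msr/Obs", "Sample"}


def filter_samples(records):
    """Apply filtering steps (1)-(5) and the 300 m shore-distance criterion.

    records: list of dicts with hypothetical keys depth_m, chl_ug_l,
    start_date, end_date, time, sample_type, lat, lon, dist_to_shore_m.
    """
    kept, seen = [], set()
    for r in records:
        depth = r.get("depth_m")
        if depth is None or depth > 0.5:             # (1) near-surface only
            continue
        if r["chl_ug_l"] < 0 or r["chl_ug_l"] > 2000:  # (2) negative or extreme values
            continue
        if r["start_date"] != r["end_date"]:         # (3) ambiguous collection time
            continue
        if r["sample_type"] not in ALLOWED_TYPES:    # (4) routine samples only
            continue
        key = (r["start_date"], r["time"], r["lat"], r["lon"], depth)
        if key in seen:                              # (5) replicate samples
            continue
        seen.add(key)
        if r["dist_to_shore_m"] <= 300:              # keep only >300 m from shore
            continue
        kept.append(r)
    return kept


# Made-up records for illustration only.
base = {"depth_m": 0.3, "chl_ug_l": 12.0, "start_date": "2008-07-01",
        "end_date": "2008-07-01", "time": "10:00", "sample_type": "Sample-Routine",
        "lat": 44.0, "lon": -88.4, "dist_to_shore_m": 850.0}
records = [
    base,
    dict(base),                               # replicate of the first record
    dict(base, time="11:00", depth_m=None),   # missing depth
    dict(base, time="12:00", chl_ug_l=2500),  # extreme value
    dict(base, time="13:00", dist_to_shore_m=120.0),  # too close to shore
]
print(len(filter_samples(records)))
```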

2.3. Chl algorithm case study

It has long been established that the NIR region of the spectrum has a meaningful relationship with Chl concentration (Gitelson, 1992). Tomlinson et al. (2016) demonstrated the use of CI to estimate Chl in high chlorophyll lakes in Florida. Here, we extend this effort to explore the use of CIcyano to estimate Chl across the CONUS. The Tomlinson et al. (2016) formulation is:

$$\mathrm{Chl}_{T16} = 4050\,(\pm 271) \times \mathrm{CI} + 20\,(\pm 3) \tag{2}$$

Their training data set focused on cyanobacteria dominated lakes and included remote sensing reflectances from above water radiometers for the calculation of CI and in situ Chl measurements ranging from eutrophic to hypereutrophic conditions (16 to 115 μg L−1). Tomlinson et al. (2016) reported a bias of 3 μg L−1 and a root mean square error (RMSE) of 15 μg L−1 for ChlT16 with a relative RMSE of 27%. They calculated CI (not CIcyano), but as these Florida lakes were cyanobacteria-dominated their CI is expected to be comparable to the CIcyano used throughout this paper.
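For orientation, Eq. (2) can be evaluated directly. The short sketch below also brackets the estimate by applying the reported ± values to both coefficients at once, which is a simplifying assumption (it ignores any covariance between slope and intercept and presumes CI ≥ 0); the function names are hypothetical.

```python
def chl_t16(ci, slope=4050.0, intercept=20.0):
    """Eq. (2): Chl (ug/L) from CI using the Tomlinson et al. (2016) coefficients."""
    return slope * ci + intercept


def chl_t16_range(ci):
    """Rough bracket from the reported coefficient uncertainties (assumes ci >= 0)."""
    low = chl_t16(ci, 4050.0 - 271.0, 20.0 - 3.0)
    high = chl_t16(ci, 4050.0 + 271.0, 20.0 + 3.0)
    return low, high


ci_example = 0.01  # made-up CI value
print(chl_t16(ci_example), chl_t16_range(ci_example))
```

Note that for any positive CI the intercept places the estimate above 20 μg L−1, consistent with a formulation trained on eutrophic to hypereutrophic lakes.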

Our in situ data set spans oligotrophic to hyper-eutrophic conditions (0.003 to 750 μg L−1), which presents an opportunity to re-tune a CI to Chl algorithm to a wider range of conditions more representative of CONUS lakes. Our re-tuning considered MERIS-derived CIcyano and in situ Chl. This required accumulating satellite-to-in situ match-ups. We acquired Level-3 MERIS-to-in situ match-ups from the OBPG following the methods of Scott and Werdell (2019). Briefly, this involved retrieving Level-3 daily SMIs from ILW, where a valid satellite pixel matched an in situ target on the same day the in situ sample was collected. This resulted in 1738 MERIS-to-in situ match-ups available for final analyses (Fig. 3).

Fig. 3.

CONUS match-up locations. Light grey circles represent a single sample and black circles represent ≥3 samples at a location. 15 states have at least one sampling location. The inset map shows match-up locations in Minnesota, which is the location for 72% of the match-up results. Yellow indicates the location of a sample that was part of the 20% of samples used for evaluating the ChlBS algorithm. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

A bootstrapping approach was used to relate the paired MERIS CIcyano with in situ Chl following Eq. (2). Bootstrapping creates a series of data sets via random sampling from and replacement to a larger, original data set (Efron, 1979). Bootstrapping assumes the full data set represents the population of interest and follows an iterative sampling-with-replacement strategy, where data selected for a sub-sampled data set are returned to the original full data set for potential reuse in a subsequent sub-sampled data set. A variety of bootstrapping approaches exist, and selection of method requires consideration of data set characteristics such as sample size and distribution (Efron and Tibshirani, 1997; Davison and Hinkley, 1997; Fox, 2002; Bi et al., 2021). For the bootstrapping used in this analysis, the 1738 satellite-to-in situ match-ups were split into two data sets – a training data set consisting of 80% of the match-ups and an evaluation data set consisting of the remaining 20% (Fig. 3). The bootstrapping sample-with-replacement method was applied to the training data set for 1500 iterations (Python with scikit-learn, a machine learning library). In each iteration, a regression analysis was executed to estimate the algorithm coefficients (Eq. (2)). Analysis of the full suite of runs allowed derivation of average coefficients and their 95% confidence intervals. The performance of the newly tuned bootstrapped Chl algorithm, ChlBS, was assessed with the evaluation data set (the isolated 20%). In the context of algorithm development, this process of resampling and exposing the algorithm to fits from multiple data sets offers advantages relative to a single linear regression fit on the full data set. It not only reduces the risk of overfitting to one particular data set, but also allows production of confidence intervals around the coefficients.
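The resampling procedure can be sketched as follows. This is an illustration under stated assumptions, not the code used in this study: it uses only the Python standard library rather than scikit-learn, fits an ordinary least squares line in linear space following the form of Eq. (2), takes percentile-based 95% intervals, and runs on synthetic match-ups with made-up coefficients. All function names are hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def summarize(values):
    """Mean and percentile-based 95% interval of the bootstrap estimates."""
    ordered = sorted(values)
    lo = ordered[int(0.025 * (len(ordered) - 1))]
    hi = ordered[int(0.975 * (len(ordered) - 1))]
    return statistics.fmean(values), lo, hi


def bootstrap_fit(pairs, n_iter=1500, train_frac=0.8, seed=0):
    """Split (CI, Chl) pairs 80/20, then refit on resampled training sets."""
    rng = random.Random(seed)
    shuffled = list(pairs)
    rng.shuffle(shuffled)
    n_train = round(train_frac * len(shuffled))
    train, evaluation = shuffled[:n_train], shuffled[n_train:]
    slopes, intercepts = [], []
    for _ in range(n_iter):
        sample = rng.choices(train, k=len(train))  # sampling with replacement
        xs, ys = zip(*sample)
        slope, intercept = fit_line(xs, ys)
        slopes.append(slope)
        intercepts.append(intercept)
    return {"slope": summarize(slopes),
            "intercept": summarize(intercepts),
            "evaluation": evaluation}


# Synthetic match-ups: Chl = 4000 * CI + 15 plus Gaussian noise (made-up values).
data_rng = random.Random(42)
pairs = []
for _ in range(200):
    ci = data_rng.uniform(0.0, 0.03)
    pairs.append((ci, 4000.0 * ci + 15.0 + data_rng.gauss(0.0, 5.0)))

result = bootstrap_fit(pairs)
print(result["slope"], result["intercept"], len(result["evaluation"]))
```

The held-out evaluation pairs are never resampled, so they remain independent of the fitted coefficients when performance is assessed.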

2.4. Performance assessment

Ocean color algorithms are often validated using least squares regressions and analysis of their errors (e.g., Werdell and Bailey, 2005). Our satellite-to-in situ data set is not normally distributed (Fig. 4). Therefore, mean square error statistics were not reported, as they are best suited for Gaussian distributions. Instead, we focus on mean bias and mean absolute error (MAE) to summarize algorithm performance (Seegers et al., 2018):

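$$\mathrm{Bias}_{\log} = 10^{\left(\frac{\sum_{i=1}^{n} \left[\log_{10}(M_i) - \log_{10}(O_i)\right]}{n}\right)} \tag{3}$$

$$\mathrm{MAE}_{\log} = 10^{\left(\frac{\sum_{i=1}^{n} \left|\log_{10}(M_i) - \log_{10}(O_i)\right|}{n}\right)} \tag{4}$$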

where M, O, and n represent the modeled satellite value, the in situ observation, and the sample size, respectively. Log-transformed metrics were used (Eqs. (3) and (4)) because error for Chl is heteroskedastic and the data span four orders of magnitude; this approach is common in the validation field (Seegers et al., 2018). Also, the log approach helps minimize analytical biases and the influence of outliers (Tofallis, 2015). The metrics, as shown, are based on the geometric mean, converted from log units, and are dimensionless. Their interpretation is roughly multiplicative, meaning that a biaslog of 1.3 indicates that the model is 1.3× (30%) greater on average than the observed variable, while a biaslog less than unity indicates a negative bias. MAElog never falls below unity, such that a MAElog of 1.2 indicates relative measurement error of 20% in either direction. While the term “error” is used here as it is in the vast majority of validation studies, it is acknowledged that all are actually referring to “misfit”, because the reference data also have observation errors and therefore the actual error is not quantifiable (Lynch, 2009).
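Eqs. (3) and (4) translate directly into code. The sketch below assumes strictly positive modeled and observed values (required by the logarithm); the function names and the example values are illustrative.

```python
import math


def bias_log(modeled, observed):
    """Eq. (3): multiplicative bias from the mean log10 difference."""
    diffs = [math.log10(m) - math.log10(o) for m, o in zip(modeled, observed)]
    return 10 ** (sum(diffs) / len(diffs))


def mae_log(modeled, observed):
    """Eq. (4): multiplicative mean absolute error from |log10 differences|."""
    diffs = [abs(math.log10(m) - math.log10(o)) for m, o in zip(modeled, observed)]
    return 10 ** (sum(diffs) / len(diffs))


# Made-up values: one retrieval is 2x too high, the other 2x too low.
modeled = [2.0, 5.0]
observed = [1.0, 10.0]
print(bias_log(modeled, observed), mae_log(modeled, observed))
```

In this example the over- and underestimates cancel in the bias, which is near unity, while the MAE is near 2, showing why the two metrics are reported together.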

Fig. 4.

Data distribution of the 1738 in situ values from the match-up results.

In the context of water quality monitoring, many stakeholders are additionally interested in broadly categorizing performance to simply determine “is there a problem?” and, if yes, “how bad is it?” (IOCS, 2015). Thus, performance was additionally assessed by quantifying the frequency for which the satellite retrieval properly identified the trophic category into which each corresponding in situ data point belonged. To do so, the in situ value was used to divide the match-up data set into four trophic categories based on criteria from the National Lakes Assessment: oligotrophic/mesotrophic (0–7 μg L−1), eutrophic (7–30 μg L−1), and two hypereutrophic conditions (>30 μg L−1) (U.S. Environmental Protection Agency, 2009). Because of its large range (30–650 μg L−1), the hypereutrophic category was further divided into “low hypereutrophic” (30–90 μg L−1) and “high hypereutrophic” (>90 μg L−1). High hypereutrophic conditions (>90 μg L−1) fall in the top 25% of data points in terms of Chl concentration in both the training and evaluation data sets. A confusion matrix was generated to report percentages of correct trophic level characterization by ChlBS. Generally, a confusion matrix is a visual display of an algorithm’s ability to properly predict categories.
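A minimal sketch of the trophic categorization and confusion matrix follows. The treatment of values falling exactly on a boundary (7, 30, or 90 μg L−1 assigned to the lower category) is an assumption, as are the function names and example values; rows are normalized by the number of in situ samples in each category.

```python
CATEGORIES = ["oligotrophic/mesotrophic", "eutrophic",
              "low hypereutrophic", "high hypereutrophic"]


def trophic_category(chl):
    """Assign a Chl value (ug/L) to one of the four categories.
    Boundary values go to the lower category (assumption)."""
    if chl <= 7:
        return CATEGORIES[0]
    if chl <= 30:
        return CATEGORIES[1]
    if chl <= 90:
        return CATEGORIES[2]
    return CATEGORIES[3]


def confusion_matrix(in_situ, satellite):
    """Percent of satellite retrievals in each category, per in situ category."""
    counts = {obs: {pred: 0 for pred in CATEGORIES} for obs in CATEGORIES}
    for o, m in zip(in_situ, satellite):
        counts[trophic_category(o)][trophic_category(m)] += 1
    percent = {}
    for obs, row in counts.items():
        total = sum(row.values())
        percent[obs] = {pred: (100.0 * c / total if total else 0.0)
                        for pred, c in row.items()}
    return percent


# Made-up match-ups for illustration only.
in_situ = [5.0, 6.0, 20.0, 100.0]
satellite = [4.0, 10.0, 25.0, 50.0]
matrix = confusion_matrix(in_situ, satellite)
for obs in CATEGORIES:
    print(obs, matrix[obs])
```

The diagonal entries give the percentage of correct trophic level characterizations for each in situ category.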

3. Results

3.1. The composition of ILW

ILW provides data across CONUS for over 2300 resolvable lakes with sizes greater than three 300 m pixels (Urquhart and Schaeffer, 2019; Coffer et al., 2020) and 15,450 waterbodies with sizes of at least one 300 m pixel (Clark et al., 2017), as well as 5874 lakes of at least one pixel in size in Alaska (Fig. 1). ILW currently consists of L2 files, L3-binned files, and L3 SMIs spanning 28 April 2002 to 9 April 2012, representing the mission life of MERIS. The total file volume for this first version of ILW is greater than 20 TB, with 16 TB of L2 ρs(λ) data. The remaining 4 TB are L3-mapped and L3-binned files, which provide 3600 daily files, plus 527 files for both weekly and rolling 28-day files, 120 monthly files, and 48 seasonal files. Occasionally, ILW will be reprocessed, so the exact file numbers will change and version numbers will be given to updated versions. Additionally, OLCI will be added to the ILW data set, which will greatly expand the time series.

Lakes can have wide-ranging MERIS ρs(λ) spectra. Lake Winnebago, Wisconsin and Utah Lake, Utah are from ecologically distinct regions of the USA, the upper Great Lakes Region and the Southwest, respectively. These lakes were selected for closer illustration of how the varied line heights relate to a dynamic range of CIcyano retrievals as well as to provide examples of MERIS imagery and ChlBS retrievals (Fig. 2). Level-3 Standard Mapped Images (SMI) from ILW were obtained for September 10, 2011, showcasing cloud-free scenes for Lake Winnebago, WI and Utah Lake, UT. Appendix B demonstrates the same products displayed for OLCI on Sentinel-3A and Sentinel-3B. The selected lakes exhibit divergent ρs(λ) spectra, yet the line height algorithms allow for meaningful interpretation of the data across the waterbodies as demonstrated in mapped imagery for both the CI and ChlBS algorithms (Fig. 2). The variations in spectral shape were highlighted by normalizing each ρs spectrum by its integrated value [ρs(λ) / ρs]. The integration was calculated over the 400–754 nm range using the trapezoidal rule.
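
The normalization described above can be sketched in Python. This is a minimal illustration, assuming the spectrum is given as parallel lists of band-center wavelengths and ρs values; the function names and the example values are hypothetical and are not taken from the ILW files.

```python
def trapezoid(y, x):
    """Integrate y over x with the trapezoidal rule."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1))


def normalize_spectrum(wavelengths, rho_s, lower=400.0, upper=754.0):
    """Divide each rho_s value by the spectrum's integral over [lower, upper] nm."""
    kept = [(w, r) for w, r in zip(wavelengths, rho_s) if lower <= w <= upper]
    wl = [w for w, _ in kept]
    rho = [r for _, r in kept]
    area = trapezoid(rho, wl)
    return wl, [r / area for r in rho]


if __name__ == "__main__":
    # Illustrative wavelengths (nm) and reflectances, not real MERIS retrievals.
    wl = [413.0, 490.0, 560.0, 665.0, 709.0, 754.0]
    rho = [0.010, 0.015, 0.030, 0.012, 0.020, 0.008]
    wl_n, shape = normalize_spectrum(wl, rho)
    brighter = [2.0 * r for r in rho]
    _, shape_bright = normalize_spectrum(wl, brighter)
    # Both spectra share the same normalized shape despite different amplitudes.
    print(all(abs(a - b) < 1e-12 for a, b in zip(shape, shape_bright)))
```

Because each normalized spectrum integrates to one, two spectra that differ only in amplitude collapse onto the same curve, which isolates differences in spectral shape.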

Fig. 2.

A comparison of 10 September 2011 satellite data from two lakes in different U.S. regions: Lake Winnebago, Wisconsin (top; A-F) and Utah Lake, Utah (bottom; G-L), showing mapped satellite images of ρs (665, 681, 709 nm; A-C, G-I), CIcyano (D,J), and ChlBS (E,K), and ρs(λ) spectra (F,L). The maps of each lake’s CI values and corresponding ChlBS demonstrate the different water types present (D,E,J,K). For each lake, the median MERIS ρs(λ) spectra are shown for pixels in discrete Chl ranges, with colors representing diverse water types based on discretized ranges of Chl values (F,L). To focus on the variations in spectral shape, and not spectral amplitude, each ρs spectrum was normalized by its integrated value [ρs(λ) / ρs] over the range of 400–754 nm. These spectra are included as part of the ILW suite. The circles along the lines mark the center wavelengths of the measured satellite bands (F,L).

3.2. The Chl algorithm development data set

The final filtered ILW CIcyano-to-in situ Chl match-up data set (https://oceancolor.gsfc.nasa.gov/fileshare/jeremy_werdell/CyAN_ChlBS/) included 1738 match-ups from 15 states across CONUS (Minnesota: 1263; Oregon: 291; Florida: 98; North Dakota: 51; Texas: 5; Nevada: 4; Nebraska, North Carolina, South Carolina, Wisconsin: 3; Idaho, Michigan, Utah: 2; Kansas, Virginia: 1) (Fig. 3). The majority of the match-ups occurred in Minnesota (72%) and Oregon (17%), while the remaining 13 states made up 10% of the data (Fig. 3). Ideally, the match-ups would be more evenly spread across the country; however, Minnesota has more satellite resolvable lakes than any other state, with 17.5% of all CONUS resolvable lakes located in that state (Schaeffer et al., 2018). Minnesota also provided one of the larger volumes of in situ data, making satellite matches more probable. Fortunately, Minnesota has diverse lakes, with in situ Chl concentrations included in this analysis ranging from 0.51 to 650 μg L−1. Additionally, the region includes farmland, urban systems, and forested watersheds, providing varied aquatic systems for algorithm testing. The full match-up data set included in situ measurements across water types from oligotrophic to hypereutrophic, with Chl ranging from 0.5 to 832 μg L−1, a mean concentration of 71.2 μg L−1, and a median concentration of 45 μg L−1. The vast majority (79%) of these match-ups included in situ values with Chl < 100 μg L−1 (Fig. 4). The summer and autumn seasons have the highest frequency of match-ups, with 96% occurring between May and October and 56% falling into the two months of August and September (Fig. 5).

Fig. 5.

The monthly distribution of the 1738 in situ and satellite match-up results. 95% of all match-ups occurred between May and October.

3.3. Performance of ChlT16

When applied to the evaluation satellite-to-in situ data set assembled in this study (20% of all available data), ChlT16 reported a positive biaslog of 1.33 (33%) and MAElog of 1.8 (80%) (Fig. 6, Table 1). Only the evaluation data set was considered here, for consistency with the ChlBS analysis. It is clear from Fig. 6 that ChlT16, which was developed primarily for hypereutrophic lakes in Florida, provides meaningful Chl retrievals when the analysis focuses only on high chlorophyll conditions (>20 μg L−1), with the biaslog dropping to 1.01 (1%) and MAElog dropping to 1.48 (48%) for this range (Table 1). For the eutrophic category, ChlT16 reported a positive biaslog and MAElog of 2.23 (123%) and 2.2 (120%), respectively. Performance in both hypereutrophic categories exceeded that of the eutrophic category. The low hypereutrophic category reported a reduction of biaslog to 1.16 (16%) and MAElog to 1.3 (30%). A tendency to underestimate Chl in the high hypereutrophic category yielded a negative biaslog of 0.60 (−40%) and MAElog of 1.7 (70%).

Fig. 6.

Comparison of in situ Chl versus ChlT16 retrievals (Tomlinson et al., 2016). The shading indicates the data density with dark colors being more data points compared to the lightly shaded markers.

Table 1.

ChlT16 and ChlBS summary statistics (Biaslog and MAElog in μg L−1) for total performance across 6 data ranges determined by in situ Chl concentrations using the evaluation data set (20% of the total data set). ChlT16 was developed for hypereutrophic lakes and is most appropriate for high Chl (>20 μg L−1) conditions; therefore, the 20–700 μg L−1 range results are shown.

Chl concentration (μg L−1)        Statistic   ChlT16 (Tomlinson et al., 2016)        ChlBS
                                              Chl = 4050(±271) * CICyano + 20(±3)    Chl = 6620(±646) * CICyano − 3.07(±5)

0–700                             N           348                                    348
                                  Biaslog     1.33                                   1.11
                                  MAElog      1.8                                    1.6
20–700                            N           270                                    270
                                  Biaslog     1.01                                   1.04
                                  MAElog      1.48                                   1.52
Oligotrophic/Mesotrophic (0–7)    N           19                                     19
                                  Biaslog     8.2                                    1.79
                                  MAElog      8.2                                    2.8
Eutrophic (7–30)                  N           94                                     94
                                  Biaslog     2.23                                   1.27
                                  MAElog      2.2                                    1.8
Hypereutrophic (30–90)            N           145                                    145
                                  Biaslog     1.16                                   1.19
                                  MAElog      1.3                                    1.4
High Hypereutrophic (>90)         N           82                                     82
                                  Biaslog     0.60                                   0.73
                                  MAElog      1.7                                    1.5

The large MAElog and biaslog in the oligotrophic range are unsurprising, as Tomlinson et al. (2016) reported a minimum detection level of ∼20 μg L−1 for ChlT16. Their intercept reflects the inherent eutrophic nature of these lakes, as well as a potential background concentration of Chl from other (non-cyano) phytoplankton that may be as much as 20 μg L−1. Their approach, therefore, does not accurately assess low concentrations and should not be applied under these conditions, which explains why eliminating the lowest Chl concentration locations led to improved performance.

3.4. Development and performance of ChlBS

The bootstrapping training data subsets, consisting of 1390 data points (80% of the full data set), were run through 1500 bootstrapping iterations (Fig. 7A), resulting in the following relationship:

ChlBS = 6620(±646) × CIcyano − 3.1(±5.2) (6)

Fig. 7.

(A) Bootstrap results for 1500 iterations of the CIcyano training data set. The red line is the mean fit and the grey shows the spread of fits. The color shading indicates data density with light grey representing fewer data points compared to black indicating high data density. (B) Predicted ChlBS versus evaluation data set in situ Chl concentrations with log-transformed axes. The grey to black shading is an indication of data density from low to high. (C) Results from panel B in normal space in the range 0–650 μg L−1. The bars indicate the potential uncertainty range for each predicted data point based on the ChlBS coefficients’ 95% confidence intervals. (D) Exploded view of the data from the box in the lower left of panel C allowing easier viewing of the high density, low Chl (0–90 μg L−1) data. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

While the intercept is negative, we note that its standard deviation is larger than its value, indicating that the intercept is not meaningfully different from zero. That said, we advise caution when considering ChlBS retrievals that approach zero, to avoid biases. While admittedly imperfect to do so, we considered removal of all negative ChlBS retrievals in our subsequent analyses. For these data, however, no negative values were reported and, therefore, no match-ups were removed from consideration.
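
Applying Eq. (6) with the negative-retrieval screening described above can be sketched in Python as follows. The coefficients are the mean values from Eq. (6); the function name `chl_bs`, the choice to return `None` for a negative retrieval, and the example CIcyano values are all illustrative assumptions.

```python
SLOPE = 6620.0      # mean slope from Eq. (6)
INTERCEPT = -3.1    # mean intercept from Eq. (6), in ug/L


def chl_bs(ci_cyano):
    """Estimate Chl (ug/L) from CIcyano; return None if the retrieval is negative."""
    chl = SLOPE * ci_cyano + INTERCEPT
    return chl if chl >= 0.0 else None


if __name__ == "__main__":
    # Hypothetical CIcyano values chosen only to exercise both branches.
    for ci in (0.0001, 0.001, 0.01):
        print(ci, chl_bs(ci))
```

Returning `None` rather than clipping to zero keeps screened retrievals distinguishable from valid low values, which matters when computing log-transformed metrics that require positive inputs.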

The performance of ChlBS was assessed using the evaluation data set, consisting of 348 data points (20% of the full data set). ChlBS reported a slight positive biaslog of 1.11 (11%) and MAElog of 1.6 (60%), both of which improve upon the performance of ChlT16 (Table 1; Fig. 7B). The improved performance of ChlBS relative to ChlT16 was anticipated, as it was trained on a larger data set covering a wide range of data values. The removal of data points < 20 μg L−1 improved the biaslog, with a slight reduction to 1.04 (4%), and the MAElog improved to 1.52 (52%) (Table 1). The poorest performance was seen in the oligotrophic/mesotrophic range, with a biaslog of 1.79 (79%) and a MAElog of 2.8 (180%). Performance in the eutrophic range gave a biaslog of 1.27 (27%) and a MAElog of 1.8 (80%). ChlBS yielded the lowest MAElog of 1.4 (40%) in the low hypereutrophic range, with a positive biaslog of 1.19 (19%). The tendency to underestimate in the high hypereutrophic range resulted in a negative biaslog of 0.73 (−27%) and a MAElog of 1.5 (50%). When considering the full range of data, ChlBS was an improvement over ChlT16, with a reduction in both biaslog and MAElog. However, the improvements were not seen uniformly across all concentrations. For example, in the 20–700 μg L−1 range and the low hypereutrophic range, ChlBS and ChlT16 performed similarly.

The bootstrapping approach resulted in a spread of potential curve fits (Fig. 7A), allowing for the calculation of coefficient confidence intervals. From this spread, the 95% confidence intervals for the ChlBS coefficients were determined (Eq. (6)). The evaluation data set was plotted using lines to indicate the spread of predicted Chl values that would result from using the full range of coefficients defined by the 95% confidence intervals (Fig. 7C,D). Clearly, the spread grows with increasing predicted Chl concentrations, reflecting the increasing uncertainty. Confidence intervals that are small relative to the coefficient indicate a strong algorithm fit, while larger confidence intervals imply a weaker relationship. As Fig. 7A shows, there is more spread in the curve fit at the high end, where there are far fewer data points. If a tighter fit were desired, more data at the high end could be useful.
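
A generic bootstrap of a linear fit, with percentile-based 95% confidence intervals for the coefficients, can be sketched in Python as follows. This is not the authors' implementation. It assumes resampling of match-up pairs with replacement, an ordinary least-squares fit within each iteration, and nearest-rank percentiles, all of which are choices made here for illustration; the function names are hypothetical.

```python
import random
import statistics


def fit_line(x, y):
    """Ordinary least-squares slope and intercept (an assumed per-iteration fit)."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def percentile(sorted_values, q):
    """Nearest-rank percentile of an already sorted list, q in [0, 1]."""
    return sorted_values[round(q * (len(sorted_values) - 1))]


def bootstrap_fit(x, y, n_iter=1500, seed=0):
    """Resample (x, y) pairs with replacement and refit n_iter times."""
    rng = random.Random(seed)
    n = len(x)
    slopes, intercepts = [], []
    for _ in range(n_iter):
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        if len(set(xs)) < 2:
            continue  # skip degenerate resamples with a single unique x
        slope, intercept = fit_line(xs, ys)
        slopes.append(slope)
        intercepts.append(intercept)
    slopes.sort()
    intercepts.sort()
    return {
        "slope": statistics.fmean(slopes),
        "slope_ci": (percentile(slopes, 0.025), percentile(slopes, 0.975)),
        "intercept": statistics.fmean(intercepts),
        "intercept_ci": (percentile(intercepts, 0.025), percentile(intercepts, 0.975)),
    }


if __name__ == "__main__":
    # Synthetic, noisy linear data for illustration only.
    rng = random.Random(1)
    ci = [i / 1000.0 for i in range(1, 101)]
    chl = [6000.0 * c + rng.gauss(0.0, 20.0) for c in ci]
    print(bootstrap_fit(ci, chl, n_iter=500))
```

With sparse data at the high end of the range, individual resamples may include few or none of those points, so the fitted slopes vary more and the confidence interval widens. This matches the behaviour described for Fig. 7A.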

A summary of the ability of ChlBS to assess four broad categories of trophic status (oligotrophic/mesotrophic, eutrophic, low hypereutrophic, and high hypereutrophic) is also presented in the form of a confusion matrix (Fig. 8). The ChlBS minimum detection limit improves upon that of ChlT16, but ChlBS nonetheless still overestimates Chl at low concentrations. In the 0–7 μg L−1 range, ChlBS properly assigned the category in 55% of cases and overestimated 45% of the time. ChlBS performed best in the eutrophic range, correctly predicting the trophic state in 72% of cases; 8% were oligotrophic/mesotrophic cases overestimated and miscategorized as eutrophic, and 21% were hypereutrophic cases underestimated and categorized as eutrophic (Fig. 8). The breakdown for the low hypereutrophic (30–90 μg L−1) range shows that low hypereutrophic cases were identified correctly 61% of the time, while overestimation resulted in lower Chl cases being miscategorized into the low hypereutrophic category 25% of the time, of which 23% were eutrophic cases and 2% were in the oligotrophic/mesotrophic range. Underestimation led to 14% of cases being miscategorized as low hypereutrophic rather than high hypereutrophic. In the high hypereutrophic range, ChlBS correctly predicted 59% of samples, while 40% were overestimated and categorized as high hypereutrophic rather than low hypereutrophic (Fig. 8). When all hypereutrophic waters, that is, all cases >30 μg L−1, were analyzed together, 85% (215 out of 254) of the ChlBS trophic categories matched the in situ measurements. Overall, ChlBS showed promise for trophic status identification.

Fig. 8.

Confusion matrix percentages to illustrate the frequency of which the algorithm, ChlBS, predicts the trophic status of the in situ sample placed into 4 broad categories of oligotrophic/mesotrophic (0–7 μg L−1), eutrophic (7–30 μg L−1), low hypereutrophic (30–90 μg L−1), and high hypereutrophic (> 90 μg L−1).

4. Discussion

Our primary motivation for this work was creating awareness that the CyAN Project developed and is distributing the mission-long MERIS time series of CONUS and Alaska lakes data, inclusive of core radiometric products and the CIcyano optical data product. Similar time series from OLCI on both Sentinel-3A and −3B will be developed and distributed when data are properly quality controlled, possibly by the time of publication of this work (https://oceancolor.gsfc.nasa.gov/projects/inlandwaters/). Given growing interest in satellite-based water quality monitoring, straightforward access to such data has become increasingly critical to support the emerging cohorts of new end-users and stakeholders. The distribution of this standardized and consolidated data set offers two substantial contributions to the water-quality monitoring communities. First, MERIS radiometric data for CONUS and Alaska have been processed to Level-2, projected onto a Plate Carrée Level-3 300-m grid, and included in this distribution. To our knowledge, such a data set has not been previously compiled in such a manner. This allows algorithm developers to pursue alternative methods that relate principal ocean color remote-sensing radiometric variables to any inland water biogeophysical data product of interest. We envision this radiometric data set could be used to evaluate existing inland remote sensing approaches, to enable new performance assessments as additional in situ data become available, and to support regional tunings and reparameterizations. Fig. 2 shows divergent ρs spectra across water types, demonstrating that radiometric distinction is possible and giving reason for optimism about the utility of this data set for inland waters. Although Lake Winnebago and Utah Lake show varied spectra, the satellite data were able to map meaningful biophysical variables of concern, such as chlorophyll concentrations and potentially dangerous harmful algal blooms.
The spatial distribution of phytoplankton could provide meaningful information for water managers concerned about human or ecosystem health. This demonstrates the power of satellite data for inland water use and creates awareness of its untapped potential to further develop approaches and algorithms for regional to global applications. Previously, processed satellite data were not easily available for inland waters and perhaps the most useful aspect of this contribution is that this data set removes much of the burden of satellite data processing for end-users and stakeholders.

Our secondary motivation was to explore the viability of using a single algorithm to estimate a biogeophysical variable across the spatially and temporally diverse CONUS data record. We explored using the readily available CyAN optical metrics of cyanobacteria presence (CIcyano) as a rudimentary estimate of chlorophyll biomass (ChlBS). This serves end-users with interest in the CyAN Project’s core biogeophysical deliverables and offers a performance assessment of their quality and utility across a spatially and temporally diverse data set. Perhaps more importantly, it also provides less familiar end-users a roadmap for using ILW in algorithm development, while demonstrating the challenge of universally relating an optical property to a biological or biogeochemical one. We acknowledge, of course, the need for many end-users and stakeholders to operate and communicate using biogeophysical variables in lieu of optical variables. As a line-height calculation stemming directly from measurements of reflectance, CIcyano is expected to be, in principle, universally applicable (assuming adequate consideration of Rayleigh calculations, mixed land-water pixels, clouds, snow/ice, and adjacency effects). Its relationship to any biogeophysical condition, however, requires conscientious consideration. We have demonstrated methods, and pitfalls to avoid, that should provide a useful roadmap and foundation with which to support an emerging cohort of end-users.

Regionally, Chl derived from CI has been shown to be meaningful (Tomlinson et al., 2016). Arising from the high Chl conditions from which it was developed, ChlT16 has an intercept of 20 μg L−1. ChlT16 is based on local above-water radiometry from Florida eutrophic and hypereutrophic lakes with data collected in one summer, thereby covering a narrow set of conditions. The intercept is likely an offset from the background chlorophyll not associated with cyanobacteria, which is unlikely in most lakes, although relevant to certain hypereutrophic Florida lakes. This offset has been adjusted for other high Chl regional applications, including to 10 μg L−1 in Lake Erie (Rowe et al., 2016), demonstrating the utility of regional tuning. It is encouraging that the ChlBS slope coefficient of 6620 is in the range of the more regionally tuned ChlT16 slope of 4050, giving confidence in the CIcyano-to-Chl relationship. Other studies have also successfully demonstrated the relationships of line heights to biogeophysical variables, primarily on a regional basis (Binding et al., 2011, 2019; Lesht et al., 2013; Matthews et al., 2012; Matthews and Odermatt, 2015). We explored the robustness of a universal CONUS application, which served to demonstrate how well a single algorithm can perform across spatially and optically diverse and complex waters. While both ChlT16 and ChlBS provide valuable information, their performance assessments show room for improvement and reiterate the need for additional development of inland water algorithms to expand the accuracy and applications of satellite remote sensing in these systems. Distribution of the ILW data set can support such efforts. For many end-user applications, ChlBS performance may be perfectly adequate for early warning detection or trend detection, not unlike the previous studies that used CIcyano cyanobacteria estimates (Clark et al., 2017; Urquhart et al., 2017; Mishra et al., 2019; Coffer et al., 2020).
For other applications, regional reparameterizations or alternative approaches may be most prudent. Neil et al. (2019) reviewed 19 Chl algorithms and 48 approaches to applying the algorithms for 13 different water types and proposed that an adaptive framework leads to overall improvement of estimates. However, this dynamic method of algorithm selection, while potentially more precise, may not be ideal for users who only require a simplified approach for broad water quality monitoring and response. Again, our case study serves to provide a recipe for simple algorithm development for stakeholders who wish to generate an alternative CIcyano-to-biogeophysical variable relationship.

Generally speaking, ChlBS showed utility in estimating chlorophyll concentrations across CONUS lakes (Table 1). Our results support the Matthews et al. (2012) finding that such algorithms are suitable for trophic status assessment and appropriate for providing some warning signs for harmful algal bloom (HAB) events. However, there remains uncertainty about the precise retrieved quantity of Chl and, therefore, analyses that require constraining small changes may not yet be able to consider satellite retrievals. The ChlBS algorithm tended to perform best in the >7 μg L−1 range, while underperforming at the lowest chlorophyll concentrations (oligotrophic-mesotrophic, Table 1). The reduction in performance at the low end could be caused by a number of factors. First, the ρs(λ) signal in the NIR must overcome the absorption of pure water in that range. The CIcyano 709 nm peak and the associated negative 681 nm curvature result from cyanobacteria scattering, chl-a absorption, and limited cyanobacteria fluorescence compared to other phytoplankton in the 665 to 709 nm range; at low cyanobacteria concentrations, the signal may not be strong enough for the 681 nm spectral shape to develop (Gower et al., 1999; Wynne et al., 2008). Furthermore, at very low signal levels, satellite instrument performance (e.g., signal-to-noise characteristics) in the NIR could confound meaningful retrievals. Ultimately, the reduced performance in the low concentration range limits confidence in assessments of modest changes in water quality in low Chl waters. Such limits in the low Chl range are not unique to ChlBS and have been previously reported as common for chlorophyll algorithms using red and near-infrared radiometric measurements (e.g., Binding et al., 2013; Gilerson et al., 2010; Moses et al., 2012; Palmer et al., 2015).
Neil et al. (2019) also acknowledged the limitations of red/near-infrared Chl algorithms at low concentrations and suggested a switching approach using a blue-green band ratio algorithm (e.g., O’Reilly and Werdell, 2019) in oligotrophic systems and a red/NIR method in systems with concentrations spanning 3–155 μg L−1. Gilerson et al. (2010) observed that Chl algorithms using the red/NIR portion of the spectrum outperform blue-green band ratio algorithms at concentrations greater than 5 μg L−1. Binding et al. (2019) also found that the MCI and CI worked better than band-ratio approaches for Chl > 10 μg L−1. Additionally, ChlBS tends to underestimate the highest concentrations (>90 μg L−1) (Table 1). This is also consistent with Neil et al. (2019), who suggested alternative approaches at high concentrations >155 μg L−1. Ultimately, we propose that end-users interested in the trophic state of a lake can use ChlBS to provide a functional state of the lake from the eutrophic to hypereutrophic range (Fig. 8) and, therefore, if nothing else, ChlBS retrievals provide an appropriate screening tool in familiar biogeophysical units.

Although coincident satellite-to-in situ match-up analysis is the most common form of validation, there are known challenges and sources of uncertainty associated with this approach (e.g., Werdell and Bailey, 2005; Zheng and Di Giacomo, 2017; Neil et al., 2019). First, there is the issue of a single in situ sample representing an entire pixel (in our case, 300 m) that may or may not be homogeneous. The potential intrapixel heterogeneity creates uncertainty about how well the in situ sample represents the mean across the larger pixel. And, as previously stated, multiple measurement techniques are often employed, all with their own uncertainties. Efforts to estimate the error associated with Chl laboratory techniques have found that in situ sampling methods have an average error of 39%, and as high as 68% (Trees et al., 1985; Gregor and Maršálek, 2004). This is in the same range as the ChlBS MAElog of 1.6 (60%). Ideally, algorithm analysis would include an assessment of the uncertainties in field measurements as well. Unfortunately, the in situ data used in this study do not include uncertainty estimates, nor information on systematic, or directional, biases. Therefore, it is not possible to fully assess what amount of uncertainty results from the in situ data versus the algorithm itself.

Another source of error for algorithms can be tied to diverse phytoplankton communities. Binding et al. (2019) found that MCI and CI tended to underestimate Chl in diatom-dominated stations and performed best with cyanobacteria populations, particularly Microcystis-dominated waters, showing algorithm sensitivity to community composition. Ideally, the influence of community composition would be considered during algorithm development.
The approach used in our analysis did not separate validation points by algal type as the information was not available in our in situ data set and, perhaps, some spread and error in the retrievals can be attributed to diversity in the phytoplankton populations considered. However, the ChlBS algorithm is specifically for cyanobacteria dominated lakes, that is, because the algorithm is based on CIcyano, the process inherently filters for cyanobacteria dominated bodies of water. Nonetheless, mixed phytoplankton communities may register a valid CIcyano value, which could introduce more error and uncertainty into the algorithm application. A future viable alternative approach may utilize validation that considers community composition and then applies a community-specific algorithm used to estimate Chl.

The bootstrapping approach using a CONUS match-up data set to modify the CIcyano-to-Chl relationship allowed for coefficient adjustment and derivation of confidence intervals that resulted in an algorithm more suitable for the diverse set of CONUS lakes, which ranged from oligotrophic to hypereutrophic conditions. The ChlBS exercise demonstrated that bootstrapping provides a useful alternative approach for remote sensing algorithm creation compared to the traditional least squares assignment of a linear relationship through the calibration points. Bootstrapping has been used successfully for Chl algorithms in other systems, including the Red Sea (Brewin et al., 2015) and the Great Lakes (Lesht et al., 2016), and offers an approach for consideration when using limited local data sets to develop a better performing algorithm for regional waterbodies. Another bootstrapping advantage is the confidence intervals around the coefficients, which provide insights into the strength of the estimated relationships between the variables (Fig. 7C,D). Large confidence intervals, which could result from outliers in the data set, indicate a weak, poorly constrained relationship, while reduced confidence intervals would suggest a stronger algorithm. The approach made it possible to get relatively good estimates for CONUS, but for increased precision some regional tuning may be necessary.

5. Conclusions

We produced the first full standardized MERIS (2002–2012) inland water time series, inclusive of radiometric products and an indicator of cyanoHABs, CIcyano, for use universally in algorithm development and performance assessment, as well as CONUS plus Alaska water quality monitoring activities. The primary contribution of this work is the public distribution of this data set, with similar Sentinel-3A and −3B OLCI data sets planned to follow (see Appendix B). We also explored the derivation and utility of a CIcyano-to-Chl algorithm for application across CONUS. The estimation of Chl for inland waterbodies utilizing the readily available CIcyano variable showed some potential. The original CIcyano-based Chl algorithm, ChlT16, was developed for eutrophic waters in Florida, USA (Tomlinson et al., 2016). A bootstrapping recalibration of the algorithm, ChlBS, demonstrated that the re-tuned algorithm is also appropriate for low Chl waters, albeit less precise and accurate in this range than for eutrophic conditions. Bootstrapping was demonstrated to be an effective approach to improve algorithm performance as large in situ data sets become available.

Ultimately, the need for reliable satellite remote sensing of inland bodies of water for monitoring, management decisions, and global climate modeling has been well documented. A standardized and easily available satellite data set should substantially facilitate and enable future work in this arena. In addition, well-validated satellite algorithms can be used to create a historical inland waters data set that allows evaluation of changes in waterbodies over time. The ChlBS algorithm explored in this work performed well for categorizing lakes by trophic status. These approaches are immediately available to support resource management decisions, such as cyanoHAB warnings, early detection activities, and lake trophic classification. However, the retrieval errors are relatively large and may therefore be unsuitable for precise estimates of Chl, which limits the types of analyses that can be robustly interpreted. To that end, we offer a bootstrapping case study that provides confidence intervals as a step forward in uncertainty assessment. Perhaps more importantly, the contribution of ILW, a daily time series of more than 2000 lakes across CONUS and 5000 in Alaska, can further support community progress in the development and performance assessment of improved algorithms and approaches.

Acknowledgements

We thank Jason Lefler and Christopher Proctor for their invaluable assistance, Caren Binding for insightful conversations, Erdem Karaköylü for guidance with Python, and Don Shea for working through MERIS processing challenges. This project was funded by the NASA Ocean Biology and Biogeochemistry Program/Applied Sciences Program under proposal 14-SMDUNSOL14-0001 and by U.S. EPA, NOAA, and USGS. Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the US Government. This article has been reviewed by the Center for Environmental Measurement and Modeling and approved for publication. The views expressed in this article are those of the authors and do not necessarily reflect the views or policies of the U.S. EPA, but do represent the views of the U.S. Geological Survey.

Footnotes

Authors’ responsibilities

All authors participated in discussions and in pre-submission reviews of the manuscript. Bridget Seegers, as the lead author, headed the analysis and did the bulk of the writing. She coordinated efforts with co-authors to maximize their contributions. Jeremy Werdell provided expertise on satellite remote sensing and algorithm assessment and contributed directly to the section on satellite processing, participated in active conversations during manuscript development, and gave comments on all manuscript versions. Ryan Vandermeulen provided satellite imagery and spectra as well as insightful comments on the manuscript. Wilson Salls acquired and filtered the data as well as providing meaningful and thoughtful comments on the draft manuscripts. Rick Stumpf, with much experience with HAB remote sensing, provided guidance on manuscript scope and comments on the draft manuscript. Blake Schaeffer, as a key collaborator, gave direction on manuscript focus and important comments on the draft manuscript. Tommy Owens and Sean Bailey produced the satellite data set making the research possible. In addition, they contributed to the manuscript by providing details on remote sensing product processing and detailed description of the data set. Joel Scott was responsible for the satellite to in situ remote sensing and provided comments on the draft manuscript. Keith Loftin is an expert on HABs and provided guidance and feedback on the draft manuscripts.

Declaration of Competing Interest

None.

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.rse.2021.112685.

