Abstract
Retrieval of vegetation properties from satellite and airborne optical data usually takes place after atmospheric correction, yet it is also possible to develop retrieval algorithms directly from top-of-atmosphere (TOA) radiance data. One of the key vegetation variables that can be retrieved from at-sensor TOA radiance data is leaf area index (LAI), provided that the algorithms account for atmospheric variability. We demonstrate the feasibility of LAI retrieval from Sentinel-2 (S2) TOA radiance data (L1C product) in a hybrid machine learning framework. To achieve this, the coupled leaf-canopy-atmosphere radiative transfer models PROSAIL-6SV were used to simulate a look-up table (LUT) of TOA radiance data and associated input variables. This LUT was then used to train the Bayesian machine learning algorithms Gaussian process regression (GPR) and variational heteroscedastic GPR (VHGPR). PROSAIL simulations were also used to train GPR and VHGPR models for LAI retrieval from S2 images at bottom-of-atmosphere (BOA) level (L2A product) for comparison purposes. The BOA and TOA LAI products were validated against a field dataset and gave consistent results with GPR (R² of 0.78) and with VHGPR (R² of 0.80); in both cases the TOA LAI product reached a slightly lower RMSE (about 10% reduction). Because they delivered superior accuracies and lower uncertainties, the VHGPR models were further applied for LAI mapping using S2 acquisitions over the agricultural sites Marchfeld (Austria) and Barrax (Spain). The models led to consistent LAI maps at BOA and TOA scale. The LAI maps were also compared against LAI maps generated by the SNAP toolbox, which is based on a neural network (NN). Maps were again consistent; however, the SNAP NN model tends to overestimate over dense vegetation cover. Overall, this study demonstrated that hybrid LAI retrieval algorithms can be developed from TOA radiance data given a cloud-free sky, thus without the need for atmospheric correction. To the benefit of the community, the development of such hybrid models for the retrieval of vegetation properties from BOA or TOA images has been streamlined in the freely downloadable ALG-ARTMO software framework.
Keywords: Biophysical variables, LAI estimation, Top-of-atmosphere radiance data, Sentinel-2, Gaussian process regression, Radiative transfer model
1. Introduction
The estimation of vegetation biophysical variables is key for a wide range of ecological and agricultural applications (Weiss et al., 2020). In particular, leaf area index (LAI) has proven to be successfully retrievable from optical sensors mounted on Earth-observing satellites (Verrelst et al., 2015b; Yan et al., 2019). Copernicus’ flagship for terrestrial Earth observation (EO), i.e. the Sentinel-2 (S2) constellation, provides free, full and open access optical data with very short revisit times (5 days with 2 satellites under cloud-free conditions, down to 2−3 days at mid-latitudes), high spatial resolution (< 30 m), and good spectral resolution (10−180 nm) (Drusch et al., 2012; Malenovský et al., 2012). This vast data stream has proven to be convenient for the quantification and monitoring of vegetation characteristics, with LAI as the most successful indicator of vegetation density (Fang et al., 2019; Verrelst et al., 2015b).
However, optical missions do not measure vegetation properties directly, and some essential pre-processing steps are required to transform at-sensor reflected radiation into interpretable surface reflectance measures, i.e. radiometric calibration and correction, geometric correction, and ultimately atmospheric correction. The retrieval of biophysical variables takes place typically after the atmospheric correction step where top-of-atmosphere (TOA) radiance is converted into bottom-of-atmosphere (BOA) reflectance. Consequently, the pre-processing from TOA to BOA data is a critical step, and determines the success of the subsequent retrieval process (Laurent et al., 2011b). Nevertheless, the TOA radiance to BOA reflectance conversion is not so straightforward. Typically, the atmospheric correction is based on the inversion of an atmospheric radiative transfer model (RTM) commonly through interpolation of pre-computed look-up tables (LUT). Together with the intrinsic errors of LUT interpolation, the ill-posedness of the inversion of atmospheric characteristics introduces important uncertainties in atmospheric correction (Thompson et al., 2019). Also, the atmospheric correction generally makes the assumption that the surface is Lambertian. Other steps that introduce errors are the corrections for topographic, adjacency, and bi-directional surface reflectance effects. These corrections are applied sequentially and independently, potentially accumulating errors into the BOA reflectance data (Gao et al., 2009).
In an attempt to overcome this limitation, it has earlier been suggested and successfully tested to retrieve biophysical variables directly from at-sensor TOA radiance, thus without the necessity to go through the atmospheric correction process (Fang and Liang, 2003; Lauvernet et al., 2008; Laurent et al., 2011b; Laurent et al., 2011a; Laurent et al., 2013; Laurent et al., 2014; Mousivand et al., 2015; Shi et al., 2016; Shi et al., 2017; Verrelst et al., 2019b; Bayat et al., 2020). The possibility of retrieving LAI directly from TOA radiance data was recently theoretically confirmed by means of global sensitivity analyses in which all leaf-canopy-atmosphere RTM variables were varied (Mousivand et al., 2014; Verrelst et al., 2019b; Prikaziuk and van der Tol, 2019). At TOA radiance level, LAI proved to be a dominant variable along the spectral range, and in multiple spectral regions it is not influenced by the atmospheric variables, especially in the short wave infrared (SWIR) (Verrelst et al., 2019b). An advantage of retrieving vegetation properties directly from TOA data is that the combined model simulates TOA radiance, which is the physical quantity measured by the sensor. This means that the simulated quantities can be directly compared with the reference measurements, unlike the BOA approach where a mismatch may exist due to the approximations and assumptions in the atmospheric correction step (Laurent et al., 2011b). The downside of these approaches, however, is that they require a sound physical understanding of the factors determining the at-sensor spectral TOA radiance, e.g. as studied in Fourty and Baret (1997), Verhoef and Bach (2003), Verhoef and Bach (2007), Yang et al. (2020). Although atmospherically-corrected BOA reflectance products are freely provided for operational missions such as S2, atmospheric correction remains a mandatory preprocessing step for experimental missions and airborne campaigns. It was also with such kinds of experimental data that the aforementioned TOA retrieval approaches were presented.
When it comes to the retrieval of vegetation properties from optical EO data, such as LAI, four principal families of retrieval methods can be identified: (1) parametric regression, (2) nonparametric regression, (3) inversion of RTMs, and (4) hybrid or combined methods (Verrelst et al., 2015a; Verrelst et al., 2019a). Regarding the retrieval of LAI from TOA radiance data, earlier retrieval approaches usually fall into the third category of methods, i.e., inversion of coupled leaf-canopy-atmosphere RTMs by making use of predefined look-up tables (LUT). The main drawback of this method is its long computational time, since for each pixel the LUT has to be queried and interpolated for inversion through a minimization function (Verrelst et al., 2015a; Verrelst et al., 2019a). In this regard, over the last few years hybrid retrieval methods have become an appealing alternative due to the fast progress in machine learning methods. Hybrid methods establish a statistical relationship between simulated spectra and a biophysical variable. These types of methods have been particularly successful in operational processing of EO data because they exploit the generic properties of physically-based methods combined with the flexibility and computational efficiency of machine learning regression algorithms (MLRAs) (Verrelst et al., 2015a; Verrelst et al., 2019a). One of the major advantages of these methods is that, once the MLRA is trained, it can process an image into a vegetation product quasi-instantly.
Hybrid retrieval implementations in an operational context have long been restricted to artificial neural networks (NNs). The combination of artificial NNs with the radiative transfer model (RTM) PROSAIL (Jacquemoud et al., 2009; Berger et al., 2018) has long been used in operational applications (Bacour et al., 2006; Baret et al., 2013) and continues to be used, e.g. for the processing of S2 images. For instance, NN models have been implemented into the biophysical processor tool of the Sentinel Application Platform (SNAP) (Weiss et al., 2016). At the same time, related studies reveal advantages in the application of alternative MLRAs over conventional NN techniques (Upreti et al., 2019). Especially the MLRA families of decision trees and kernel-based methods proved to be successful (Verrelst et al., 2015; Verrelst et al., 2019a). These methods tend to be simpler to train and can perform more robustly than NNs while maintaining competitive accuracies (Verrelst et al., 2012b; Verrelst et al., 2015b). Within the family of kernel-based MLRAs, two algorithms are noteworthy: kernel ridge regression (Suykens and Vandewalle, 1999), because of its simplicity and therefore fast run-time, and Gaussian Process Regression (GPR) (Rasmussen and Williams, 2006), because of its ability to provide additional information such as a ranking of relevant bands as well as associated uncertainties (Verrelst et al., 2013b; Verrelst et al., 2015b).
Given the progress made by the MLRAs, new opportunities emerged to develop retrieval models directly applicable to TOA radiance data. For instance, it is of interest to implement hybrid strategies with advanced MLRAs that at the same time provide associated uncertainty estimates (Verrelst et al., 2019b). Yet, what makes the implementation of the TOA approach challenging is the atmospheric part of the coupled model, which has to account for variability in atmospheric effects. Although multiple atmospheric RTMs have been developed, these models are often difficult to configure for generating a large number of simulations. Widely used atmospheric RTMs include 6SV (Vermote et al., 1997), libRadtran (Mayer and Kylling, 2005) and MODTRAN (Berk et al., 2006). To overcome this limitation, the Atmospheric Look-up table Generator (ALG) is one of the few software packages that enables executing atmospheric RTMs through a friendly graphical user interface (Vicent et al., 2020). In addition, the few software packages available to automate the retrieval from TOA data are still experimental. To the best of our knowledge, only the Automated Radiative Transfer Models Operator (ARTMO) scientific software framework is able to provide these steps in a streamlined and quasi-automatic way (Verrelst et al., 2012c). ARTMO not only runs leaf-canopy models; its recent TOC2TOA toolbox also allows coupling a leaf-canopy LUT with an ALG-generated atmospheric LUT to upscale the data to TOA radiance level under the assumption of a Lambertian surface (Verrelst et al., 2019b). At the same time, with ARTMO’s MLRA toolbox retrieval algorithms can be trained and maps of vegetation products can be generated from BOA reflectance or from TOA radiance EO data, as initially explored in Verrelst et al. (2019b).
Building on experience from the above studies, the main objective of this work was to develop, optimize and validate a hybrid LAI retrieval model applicable to S2 TOA radiance data. To demonstrate its validity, the developed model was compared against a hybrid retrieval model applicable to S2 BOA reflectance data. The pursued approach was as follows. First, a coupled atmosphere-canopy RTM was used to simulate a LUT of TOA radiance data with associated input variables. This LUT was then used to train a GPR model for LAI retrieval from an S2 TOA radiance image (L1C product) over the agricultural region Barrax, Spain. To do so, an atmospheric 6SV LUT was generated with the ALG toolbox, which was then coupled with PROSAIL simulations to generate TOA radiance data in ARTMO’s TOC2TOA toolbox. The TOA radiance data were then used by the MLRA toolbox for developing the retrieval model. Similarly, PROSAIL simulations were used to train a GPR model for LAI retrieval from S2 images at BOA level (L2A product) for comparison purposes. Finally, the obtained maps were validated against LAI maps generated by the NN model in the SNAP toolbox.
2. Theoretical framework: top-of-canopy and top-of-atmosphere simulations for retrieval
2.1. Leaf, canopy and atmosphere RTMs: PROSAIL and 6SV
In hybrid biophysical variable retrieval strategies, the model development is based on simulated data coming from RTMs. When aiming to develop hybrid retrieval models applicable to at-sensor TOA radiance data, the simulated data come from coupled vegetation surface and atmosphere RTMs. Here, we demonstrate the feasibility of this approach with the most standard leaf-canopy-atmosphere RTMs, as they are fast and freely available to the community. Vegetation top-of-canopy (TOC) reflectance simulations come from the combination of the leaf RTM PROSPECT-4 (Feret et al., 2008) with the canopy RTM SAIL (Verhoef, 1984), also known as PROSAIL (Jacquemoud et al., 2009; Berger et al., 2018). PROSPECT-4 is one of the most widely used RTMs for simulating leaf optical properties. It calculates directional-hemispherical reflectance and transmittance from 400 nm to 2500 nm at 1 nm spectral sampling. SAIL solves the radiative transfer equation for scattering and absorption of four upward/downward fluxes at the canopy scale. The leaf reflectance (ρl) and transmittance (τl) outputs of PROSPECT are entered into the SAIL model to simulate the TOC reflectance (ρc) in the 400−2500 nm spectral range at 1 nm sampling. The soil spectral reflectance is another important input of SAIL. Generally, field radiometric data are used, but spectra extracted from images have also been used successfully (Verrelst et al., 2019a).
By varying the RTM input variables, multiple model realizations are run, and both the inputs and the output spectra are stored in LUTs. The LUTs can subsequently be used for further processing such as mapping applications, e.g. by applying inversion strategies through minimization functions (Rivera et al., 2013; Verrelst et al., 2014), or by using these LUTs for training a hybrid retrieval strategy (Rivera Caicedo et al., 2014; Verrelst et al., 2016a). However, PROSAIL only generates TOC reflectance simulations, which means these data can solely be applied to images after atmospheric correction. An additional step is thus required when developing retrieval strategies directly from TOA radiance data, i.e. the coupling with an atmospheric RTM.
In order to extract biophysical variables directly from TOA radiance data, it is necessary to upscale the PROSAIL-simulated TOC LUT to TOA radiance level. This is achieved by coupling the LUT with simulations from an atmospheric RTM. Among the multiple atmospheric RTMs available, 6SV (Second Simulation of the Satellite Signal in the Solar Spectrum) (Vermote et al., 1997) is probably the most widely used computer code that simulates the propagation of radiation through the atmosphere. 6SV is an improved version of the 5SV code, developed by the Laboratoire d’Optique Atmospherique. It takes into account the main atmospheric effects: gaseous absorption by water vapor, carbon dioxide, oxygen and ozone, and scattering by molecules and aerosols. The computational accuracy for Rayleigh and aerosol scattering effects is based on the use of state-of-the-art approximations and the implementation of the successive order of scattering (SOS) algorithm (Lenoble et al., 2007). Just like PROSAIL, 6SV simulates the 400−2500 nm spectral range, but at a spectral resolution of 2.5 nm.
The outputs of 6SV are the following atmospheric transfer functions for each combination of key input parameters:
ρ0: Intrinsic atmospheric reflectance (unitless).
Tgas: Total gas transmittance (unitless).
Tdwn and Tup: Total downwards and upwards transmittance due to scattering (unitless).
S: Spherical albedo (unitless).
I0: Extraterrestrial solar irradiance in [mW·m−2·nm−1].
The TOA radiance spectrum (L) is calculated by coupling the atmospheric transfer functions generated by 6SV with the Lambertian surface reflectance (ρ) from PROSAIL following the equation:
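$$L = \frac{I_0\,\mu_{il}}{\pi}\left[\rho_0 + \frac{T_{gas}\,T_{dwn}\,T_{up}\,\rho}{1 - S\rho}\right] \tag{1}$$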
where μil = cos θil, with θil being the solar zenith angle. For the sake of simplicity, the spectral dependency of all terms in Eq. (1) has been omitted. A schematic overview of the coupling of PROSAIL with 6SV is provided in Fig. 1.
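The coupling step can be illustrated with a short Python sketch. It assumes the common 6S-type formulation L = (I0 μil/π)[ρ0 + Tgas Tdwn Tup ρ/(1 − Sρ)] for a Lambertian surface and evaluates it for a single wavelength. The function name `toa_radiance` and all numerical values are illustrative assumptions; they are not taken from ARTMO's TOC2TOA toolbox or from an actual 6SV run.

```python
import math


def toa_radiance(rho, rho0, t_gas, t_dwn, t_up, s_alb, i0, sza_deg):
    """TOA radiance over a Lambertian surface with reflectance rho.

    rho0, t_gas, t_dwn, t_up and s_alb are the unitless atmospheric
    transfer functions; i0 is the extraterrestrial solar irradiance.
    The result has the units of i0 per steradian.
    """
    mu_il = math.cos(math.radians(sza_deg))  # cosine of solar zenith angle
    # Surface contribution, including multiple surface-atmosphere bounces
    surface_term = t_gas * t_dwn * t_up * rho / (1.0 - s_alb * rho)
    return i0 * mu_il / math.pi * (rho0 + surface_term)


# Made-up values for one near-infrared wavelength (assumptions)
ATM = dict(rho0=0.02, t_gas=0.98, t_dwn=0.92, t_up=0.95,
           s_alb=0.08, i0=1000.0, sza_deg=30.0)

path_only = toa_radiance(rho=0.0, **ATM)    # black surface: path radiance
vegetated = toa_radiance(rho=0.40, **ATM)   # bright vegetated surface
```

With a black surface (ρ = 0) only the intrinsic atmospheric term remains, which is a convenient sanity check before applying the function to every LUT entry and wavelength.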
Fig. 1. Schematic illustration of the coupling of PROSAIL with 6SV. The PROSAIL part is adapted with permission from Berger et al. (2018). For 6SV only the dominant continuous variables are given. See Tables 2 and 3 for an explanation of the symbols.
2.2. Gaussian process regression
We included Gaussian Process Regression (GPR) (Rasmussen and Williams, 2006) in the hybrid retrieval scheme because it has shown competitive performance in variable retrieval (Verrelst et al., 2012a; Verrelst et al., 2013a) and model emulation in general (Camps-Valls et al., 2016; Vicent et al., 2018; Svendsen et al., 2020; Camps-Valls et al., 2019), and when applied to S2 and Sentinel-3 data in particular (Verrelst et al., 2012b; Verrelst et al., 2013b; Verrelst et al., 2015b; Upreti et al., 2019). See also the reviews of Verrelst et al. (2015a, 2019a) for a rationale of using GPR as opposed to alternative MLRAs.
Notationally, the GPR model establishes a relation between the input (B-bands spectrum) x ∈ ℝB and the output variable (here LAI) y ∈ ℝ of the form:
$$\hat{y} = f(\mathbf{x}) = \sum_{i=1}^{N} \alpha_i\, K(\mathbf{x}_i, \mathbf{x}) \tag{2}$$
where {xi}, i = 1, …, N, are the spectra used in the training phase, αi ∈ ℝ is the weight assigned to each one of them, and K is a function evaluating the similarity between the test spectrum x and each of the N training spectra xi. We used a scaled Gaussian kernel function,
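$$K(\mathbf{x}_i, \mathbf{x}_j) = \upsilon \exp\left(-\sum_{b=1}^{B} \frac{\left(x_i^{(b)} - x_j^{(b)}\right)^2}{2\sigma_b^2}\right) + \delta_{ij}\,\sigma_n^2 \tag{3}$$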
where υ is a scaling factor, B is the number of bands, σb is a dedicated parameter controlling the spread of the relations for each particular spectral band b, σn is the noise standard deviation and δij is the Kronecker’s symbol. The kernel is thus parametrized by signal (υ, σb) and noise (σn) hyperparameters, collectively denoted as θ = {υ, σb, σn}.
For training purposes, we assume that the observed variable is formed by noisy observations of the true underlying function, y = f(x) + ε. Moreover, we assume the noise to be additive, independent and identically Gaussian distributed with zero mean and variance σn². Let us define the stacked output values y = (y1, …, yN)⊤, the N × N matrix K of signal covariances among the training spectra (the signal part of Eq. (3)), the covariance terms of the test point k* = [k(x*, x1), …, k(x*, xN)]⊤, and k** = k(x*, x*), which represents the self-similarity of x*. From the previous model assumption, the output values are distributed according to:
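$$\begin{bmatrix}\mathbf{y}\\ y_*\end{bmatrix} \sim \mathcal{N}\left(\mathbf{0},\ \begin{bmatrix}\mathbf{K} + \sigma_n^2\mathbf{I}_N & \mathbf{k}_*\\ \mathbf{k}_*^{\top} & k_{**} + \sigma_n^2\end{bmatrix}\right) \tag{4}$$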
For prediction purposes, the GPR is obtained by computing the posterior distribution over the unknown output y*, p (y* |x*, 𝒟), where 𝒟≡ {xn, yn|n = 1, …,N} is the training dataset. Interestingly, this posterior can be shown to be a Gaussian distribution, for which one can estimate the predictive mean (point-wise predictions):
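$$\mu_{GP*} = \mathbf{k}_*^{\top}\left(\mathbf{K} + \sigma_n^2\mathbf{I}_N\right)^{-1}\mathbf{y} = \mathbf{k}_*^{\top}\boldsymbol{\alpha} \tag{5}$$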
and the predictive variance (confidence intervals):
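$$\sigma^2_{GP*} = \sigma_n^2 + k_{**} - \mathbf{k}_*^{\top}\left(\mathbf{K} + \sigma_n^2\mathbf{I}_N\right)^{-1}\mathbf{k}_* \tag{6}$$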
The corresponding hyperparameters θ are typically selected by Type-II Maximum Likelihood, using the marginal likelihood (also called evidence) of the observations, which is also analytical. When the derivatives of the log-evidence are also analytical, which is often the case, conjugate gradient ascent is typically used for optimization (see Rasmussen and Williams, 2006 for further details). A more detailed survey on GPR properties in remote sensing is provided in Camps-Valls et al. (2016), and a perspective outlook in Camps-Valls et al. (2019).
With respect to EO mapping applications, GPR is simple to train and works well with a relatively small data set, as opposed to other methods like neural networks. The use of a kernel with one length-scale per band (an automatic relevance determination, ARD, kernel) makes the GPR model quite flexible, and GPR often outperforms other non-parametric regression methods in remote sensing applications (Verrelst et al., 2012b; Verrelst et al., 2015b). Furthermore, GPR provides information about the level of uncertainty (or confidence intervals) of the prediction, e.g. a confidence map that provides insight into the robustness of the retrieval (Verrelst et al., 2013b), and about the relevance of bands, which can be used for identifying the sensitive spectral regions (Verrelst et al., 2016b; Camps-Valls et al., 2016; Camps-Valls et al., 2019).
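To make the predictive equations concrete, the following is a minimal, dependency-free Python sketch of GPR prediction with a per-band (ARD) Gaussian kernel. It assumes fixed hyperparameters rather than optimizing them by maximum likelihood, and uses a hand-written Cholesky solver; in practice a numerical library would be used. All function names and the toy data are hypothetical and are not the ARTMO MLRA toolbox implementation.

```python
import math


def ard_kernel(xi, xj, nu, sigma_b):
    """Signal part of the scaled Gaussian kernel, one length-scale per band."""
    s = sum((a - b) ** 2 / (2.0 * sb ** 2) for a, b, sb in zip(xi, xj, sigma_b))
    return nu * math.exp(-s)


def cholesky(a):
    """Lower-triangular factor L of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def chol_solve(low, b):
    """Solve (L L^T) x = b by forward and backward substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def gpr_predict(x_train, y_train, x_star, nu, sigma_b, sigma_n):
    """Predictive mean and variance of the noisy output at x_star."""
    n = len(x_train)
    # K + sigma_n^2 I: signal covariances plus noise on the diagonal
    k_noisy = [[ard_kernel(x_train[i], x_train[j], nu, sigma_b)
                + (sigma_n ** 2 if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    low = cholesky(k_noisy)
    alpha = chol_solve(low, y_train)                      # weights alpha_i
    k_star = [ard_kernel(x_star, xi, nu, sigma_b) for xi in x_train]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = chol_solve(low, k_star)
    k_ss = ard_kernel(x_star, x_star, nu, sigma_b)
    var = sigma_n ** 2 + k_ss - sum(k * w for k, w in zip(k_star, v))
    return mean, var


# Toy two-band "spectra" and LAI values (made up for illustration)
X_TRAIN = [[0.1, 0.2], [0.4, 0.5], [0.8, 0.9]]
Y_TRAIN = [0.5, 2.0, 4.5]
HYPER = dict(nu=1.0, sigma_b=[0.3, 0.3], sigma_n=0.01)

mean_at_train, var_at_train = gpr_predict(X_TRAIN, Y_TRAIN, X_TRAIN[1], **HYPER)
mean_far, var_far = gpr_predict(X_TRAIN, Y_TRAIN, [10.0, 10.0], **HYPER)
```

Near a training spectrum the predictive variance is small, while for a spectrum far from all training data it approaches the prior value υ + σn². This behaviour is what makes the per-pixel uncertainty maps informative about how well the training LUT covers the observed spectra.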
2.3. Heteroscedastic Gaussian process regression
Despite the great advantages for modeling, an important challenge in the practical use of GPR in EO mapping problems comes from the fact that signal and noise are very often correlated. As seen before, standard GP modeling assumes that the variance of the noise process σn is independent of the signal, which does not hold in most EO applications. This strong assumption of homoscedasticity is generally violated in many biophysical retrieval problems because the acquisition process is typically affected by noise in different amounts depending on the measured range of the variable. In order to deal with input-dependent noise variance, heteroscedastic GPs let the noise power vary smoothly throughout the input space, σn(x). This, however, does not lead to closed-form solutions, and several approximations have been proposed in the literature. Among them, the marginalized variational approximation yields a richer and more flexible heteroscedastic GP model (Lázaro-Gredilla and Titsias, 2011), which has yielded very good results in biophysical retrieval from EO data (Lázaro-Gredilla et al., 2013; Camps-Valls et al., 2016).
2.4. Sentinel-2 satellite measurements
ESA’s Sentinel-2 (S2) is a polar-orbiting, super-spectral and high spatial resolution mission comprising a pair of satellites (Sentinel-2A and Sentinel-2B) that enables a global revisit time below 5 days. The S2 mission delivers data from all land surfaces and coastal areas for supporting agro-ecosystem applications within the European Commission’s Copernicus programme. Each S2 satellite carries a Multi-Spectral Imager (MSI) with 13 spectral bands covering the visible and NIR (VNIR) to SWIR spectral domains. MSI covers the range from 400 to 2400 nm, with pixel sizes of 10, 20, or 60 m, depending on the spectral band (Drusch et al., 2012). Three of these bands are located in the red edge (centered at 705, 740 and 783 nm), an important region for vegetation studies. Other band configuration details of MSI are included in Table 1. The super-spectral resolution, the inclusion of the red edge region of the spectrum, the high revisit frequency and the high radiometric quality (Gascon et al., 2017) make S2 optical data very convenient for estimating biophysical variables.
Table 1.
Sentinel-2 MSI band settings. Bands used in this experiment are bolded.
| Band # | B1 | B2 | B3 | B4 | B5 | B6 | B7 | B8 | B8a | B9 | B10 | B11 | B12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Band center (nm) | 443 | 490 | 560 | 665 | 705 | 740 | 783 | 842 | 865 | 945 | 1375 | 1610 | 2190 |
| Band width (nm) | 20 | 65 | 35 | 30 | 15 | 15 | 20 | 115 | 20 | 20 | 30 | 90 | 180 |
| Spatial resolution (m) | 60 | 10 | 10 | 10 | 20 | 20 | 20 | 10 | 20 | 60 | 60 | 20 | 20 |
Two reflectance products from S2 MSI are available at different processing levels: Level-1C and Level-2A. The L1C product provides TOA reflectances (i.e., TOA radiance normalized by incident solar irradiance). The processing chain for this product involves radiometric calibration, geometric calibration and orthorectification. The L2A product provides BOA reflectance derived from the L1C product. Sentinel-2 Atmospheric Correction (S2AC) is achieved by means of the Sen2Cor atmospheric correction scheme (Main-Knorn et al., 2017). The baseline process of Sen2Cor is the cirrus/haze detection and removal (Richter et al., 2011b; Louis et al., 2010). The Aerosol Optical Thickness (AOT) is preferably derived from Dark Dense Vegetation (DDV) targets and water bodies (Kaufman and Sendra, 1988). Water vapour retrieval over land is performed using the Atmospheric Pre-corrected Differential Absorption (APDA) algorithm (Schläpfer et al., 1998). Sen2Cor is based on the libRadtran radiative transfer model (Emde et al., 2016), which simulates a wide variety of atmospheric conditions, solar geometries and ground elevations. A pre-computed LUT based on this atmospheric model is used to invert surface reflectance, using LUT interpolation to fill gaps in the simulation. Other optional pre-processing steps are performed, such as correction for adjacency, topography and BRDF effects. Additionally, the Lambertian surface assumption is applied (Richter et al., 2011a). All the errors of the L2A product related to measurements, modeling and assumptions, in combination with error propagation, result in non-negligible uncertainties that can impact a subsequent biophysical retrieval. L1C and L2A products are made available to users via the Copernicus Open Access Hub (SciHub). L2A can also be generated by the user from the L1C product using the Sentinel-2 Toolbox or the standalone version of the Sen2Cor processor. This offline processing allows the user to set certain input parameters (Müller-Wilm, 2018), e.g. the type of aerosol (rural and maritime), the type of atmosphere (mid latitude summer and mid latitude winter) and the ozone concentration. In addition, there is the option to enable the cirrus correction and the BRDF correction, which are disabled by default. Whereas this mode allows the user to adjust certain parameters of the atmospheric correction to the local environmental conditions, the product offered by ESA through SciHub is processed with the default values defined in the Sen2Cor algorithm (Clerc and team, 2020).
3. Materials and methods
Regarding the retrieval of LAI from S2 BOA and TOA data, we followed the methodology recently proposed by Verrelst et al. (2019b). In short, the hybrid retrieval strategy relies on GPR and VHGPR models trained with simulations from PROSAIL at the canopy scale and from PROSAIL-6SV at the atmosphere scale. Trained models are then applied to S2 L2A (BOA) and L1C (TOA) data for LAI mapping and validation with ground measurements. The following steps were necessary to conduct the methodology: (1) generation of simulations (LUTs) with PROSAIL and 6SV; (2) coupling of both LUTs to upscale to TOA radiance; (3) training GPR and VHGPR models with simulations and cross-validation; (4) validation with S2 data and ground measurements; (5) mapping variables; and (6) comparison with the LAI product coming from the SNAP Biophysical Processor. A schematic overview of the method is provided in Fig. 2, and key steps are detailed in the following sections, starting with a description of the used Sentinel-2 data.
Fig. 2.
Flowchart of the pursued workflow, divided into two levels: bottom-of-atmosphere (left) and top-of-canopy (right). A rounded rectangle represents input data/parameters, a normal rectangle represents a task/process, a rectangle with a curved bottom represents an intermediate output and a circle represents a final output/result.
3.1. PROSAIL simulations
The PROSAIL simulations were produced by coupling PROSPECT-4 with 4SAIL within the ARTMO framework. The PROSPECT-4 and 4SAIL input variables with their sampling ranges and distributions are shown in Table 2. This parametrization is based on measurement campaigns and/or other studies that used the same crops (Rivera et al., 2013; Verrelst et al., 2014; Verrelst et al., 2015b). Hot spot and solar/viewing angles were fixed. LAI and Cab were sampled 100 times with a Gaussian distribution, and the remaining variables were sampled 10 times with a uniform distribution. The Gaussian sampling of LAI and Cab puts more emphasis on the values at the actual growth stages of the crops. The selected illumination and viewing conditions agree with the satellite overpass conditions. Combining all input variable values would produce an unrealistic number of 1 billion simulations (100² × 10⁵). For that reason, a smaller LUT of 1000 reflectance realizations was randomly chosen. Since PROSAIL simulates reflectance from 400 nm to 2500 nm at a 1 nm spectral resolution, the output spectra were resampled to the band settings of S2 (Table 1) using the spectral response function provided by the Ground Segment as of 15 January 2018 (ESA, 2018).
Table 2.
Ranges, values and distributions of input variables used to establish the synthetic canopy reflectance database for use in the LUT. x̄: mean, SD: standard deviation.
| Symbol | Model variable | Units | Range | Distribution |
|---|---|---|---|---|
| Leaf variables: PROSPECT-4 | | | | |
| N | Leaf structure index | unitless | 1.3-2.5 | Uniform |
| Cab | Leaf chlorophyll content | [μg/cm2] | 5-75 | Gaussian (x̄: 35, SD: 30) |
| Cm | Leaf dry matter content | [g/cm2] | 0.001-0.03 | Uniform |
| Cw | Leaf water content | [cm] | 0.002-0.05 | Uniform |
| Canopy variables: 4SAIL | | | | |
| LAI | Leaf area index | [m2/m2] | 0.1-7 | Gaussian (x̄: 3, SD: 2) |
| αsoil | Soil scaling factor | unitless | 0-1 | Uniform |
| ALA | Average leaf angle | [°] | 40-70 | Uniform |
| HotS | Hot spot parameter | [m/m] | 0.01 | - |
| SZA | Sun zenith angle | [°] | 30 | - |
| VZA | View zenith angle | [°] | 0 | - |
| RAA | Relative azimuth angle | [°] | 0 | - |
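The sampling scheme of Table 2 can be sketched in Python as follows. This is a simplified illustration: it draws 1000 random input combinations directly instead of first building the full grid and then subsampling it, and it assumes that the Gaussian variables are truncated to their range by rejection sampling. Both choices, as well as the function names, are assumptions and do not describe the ARTMO implementation.

```python
import random

# Size of the full factorial grid: 2 variables with 100 samples each
# (LAI, Cab) and 5 variables with 10 samples each (N, Cm, Cw, alpha_soil, ALA)
full_grid = 100 ** 2 * 10 ** 5


def truncated_gauss(rng, mean, sd, lo, hi):
    """Gaussian draw restricted to [lo, hi] by rejection (assumed scheme)."""
    while True:
        value = rng.gauss(mean, sd)
        if lo <= value <= hi:
            return value


def sample_prosail_inputs(n, seed=0):
    """Draw n PROSAIL input combinations following the ranges of Table 2."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        rows.append({
            "N": rng.uniform(1.3, 2.5),
            "Cab": truncated_gauss(rng, 35.0, 30.0, 5.0, 75.0),
            "Cm": rng.uniform(0.001, 0.03),
            "Cw": rng.uniform(0.002, 0.05),
            "LAI": truncated_gauss(rng, 3.0, 2.0, 0.1, 7.0),
            "alpha_soil": rng.uniform(0.0, 1.0),
            "ALA": rng.uniform(40.0, 70.0),
            # Fixed hot spot and geometry
            "HotS": 0.01, "SZA": 30.0, "VZA": 0.0, "RAA": 0.0,
        })
    return rows


lut_inputs = sample_prosail_inputs(1000)
```

Each dictionary would then be passed to the leaf-canopy model to obtain one reflectance spectrum of the LUT.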
3.2. Noise model and added soil spectra
In order to improve the performance of the retrieval, ideally the training database should be as similar as possible to real Sentinel-2 data. This implies considering the uncertainties associated with sensor measurement accuracy and data processing, including radiometric calibration and atmospheric and geometric corrections. These different sources of uncertainty can introduce additive and multiplicative errors, which can be band dependent (applied to a single band) or band independent (applied to all bands) (Verger et al., 2011). All these variabilities and uncertainties were introduced in the simulated LUT based on white Gaussian noise, according to the noise model provided in Eq. (7) (Weiss et al., 2016):
$$R^*(\lambda) = R(\lambda)\left(1 + \frac{MD(\lambda) + MI}{100}\right) + AD(\lambda) + AI \tag{7}$$
where R(λ) and R*(λ) represent, respectively, the raw simulated reflectance for band λ and the reflectance with uncertainties for band λ. MD and MI are the multiplicative wavelength-dependent noise and the multiplicative wavelength-independent noise, respectively. AD and AI are the additive wavelength-dependent noise and the additive wavelength-independent noise. After some testing of additive and multiplicative noise, a value of 0.01 for AD and AI, and a value of 4% for MD and MI were used for all the bands. Similar noise levels were successfully used in recent works (Upreti et al., 2019; Verrelst et al., 2019b) with the aim of reducing over-fitting to the MLRA training database.
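The noise model can be sketched in a few lines of Python. The sketch assumes the form R*(λ) = R(λ)(1 + (MD(λ) + MI)/100) + AD(λ) + AI, and it interprets the reported values (4% and 0.01) as standard deviations of zero-mean Gaussian draws, with the band-independent terms drawn once per spectrum. That interpretation and the function name `add_noise` are assumptions.

```python
import random


def add_noise(reflectance, md_sd=4.0, mi_sd=4.0, ad_sd=0.01, ai_sd=0.01, rng=None):
    """Return a noisy copy of one simulated spectrum (list of band values).

    md_sd, mi_sd: multiplicative noise levels in percent.
    ad_sd, ai_sd: additive noise levels in reflectance units.
    """
    rng = rng or random.Random()
    # Band-independent terms: a single draw shared by all bands
    mi = rng.gauss(0.0, mi_sd)
    ai = rng.gauss(0.0, ai_sd)
    noisy = []
    for r in reflectance:
        # Band-dependent terms: a fresh draw for every band
        md = rng.gauss(0.0, md_sd)
        ad = rng.gauss(0.0, ad_sd)
        noisy.append(r * (1.0 + (md + mi) / 100.0) + ad + ai)
    return noisy


clean = [0.05, 0.45, 0.30]                       # made-up band reflectances
noisy = add_noise(clean, rng=random.Random(42))  # seeded for reproducibility
```

Setting all four noise levels to zero returns the input spectrum unchanged, which offers a simple check that the terms are wired as intended.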
Further, because PROSAIL is a vegetation canopy model and is not designed to simulate variability in soil cover, reflectance spectra from bare soil pixels were added to the PROSAIL simulations (Verrelst et al., 2019b). A dataset of 30 distinct soil samples was visually identified from the S2 L2A and L1C products (BOA and TOA), aiming to collect the most representative soil spectral signatures. Because these spectra came from the image itself, no additional noise was added to them.
3.3. 6SV simulations and coupling
The role of the atmosphere was simulated using the 6SV code (6SV2.1) (Kotchenova et al., 2006; Kotchenova and Vermote, 2007) within the ALG framework. The model input variables (Table 3) were set considering the experimental data conditions, similar to Verrelst et al. (2019b). The variables were sampled following the Latin Hypercube Sampling (LHS) method. The atmospheric profile used was Mid-Latitude Summer and the selected aerosol model was Continental. The geometric conditions of the canopy model PROSAIL were preserved. Finally, this atmospheric simulation generated a LUT of 1000 samples. This LUT consists of pairs of transfer functions and atmospheric parameters. The spectral range was limited to the S2 MSI spectral configuration (Table 1), matching the PROSAIL simulations.
Table 3. Range of 6SV input variables used for the simulations of the atmospheric transfer functions.
| Variable | Description | Units | Minimum | Maximum |
|---|---|---|---|---|
| O3C | O3 column concentration | [atm-cm] | 0.25 | 0.35 |
| CWV | Columnar water vapor | [g cm−2] | 0.4 | 4.5 |
| AOT | Aerosol optical thickness | unitless | 0.05 | 0.5 |
| SZA | Sun zenith angle | [°] | 30 | - |
| VZA | View zenith angle | [°] | 0 | - |
| RAA | Relative azimuth angle | [°] | 0 | - |
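The Latin Hypercube Sampling of the ranges in Table 3 can be sketched as follows. This is a minimal pure-Python illustration, not the ALG implementation. It assumes that a "-" in the Maximum column means the variable is kept fixed at its minimum value, and all names are hypothetical.

```python
import random

# Sampled ranges from Table 3; geometry is assumed to be fixed.
RANGES = {"O3C": (0.25, 0.35), "CWV": (0.4, 4.5), "AOT": (0.05, 0.5)}
FIXED = {"SZA": 30.0, "VZA": 0.0, "RAA": 0.0}

def latin_hypercube(ranges, n_samples, seed=None):
    """Draw n_samples points so that each variable hits every stratum once."""
    rng = random.Random(seed)
    columns = {}
    for name, (lo, hi) in ranges.items():
        # One random point in each of n equal-width strata of [0, 1) ...
        unit = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # ... then shuffle so strata are paired randomly across variables.
        rng.shuffle(unit)
        columns[name] = [lo + u * (hi - lo) for u in unit]
    return [
        {**{name: columns[name][i] for name in ranges}, **FIXED}
        for i in range(n_samples)
    ]

lut_inputs = latin_hypercube(RANGES, n_samples=1000, seed=42)
```

Compared with plain random sampling, the stratification guarantees that the full range of each atmospheric variable is covered even with a modest number of samples.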
The original 1 nm sampling of the PROSAIL surface simulations was resampled by spline interpolation to the 2.5 nm sampling of the 6SV atmospheric simulations. With this common spectral sampling, the surface and atmospheric LUTs were randomly combined and propagated to TOA radiance following Eq. (1) under the Lambertian surface assumption. This step was conducted using ARTMO’s TOC2TOA toolbox, which generates the TOA LUT consisting of pairs of radiance spectra and associated vegetation-atmosphere parameters. Finally, the TOC2TOA toolbox convolved this high-spectral-resolution TOA radiance with the S2 spectral response function.
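The coupling step can be sketched as below. Since Eq. (1) is not repeated in this section, the sketch assumes the commonly used Lambertian form L_TOA = L0 + (E_dir·μs + E_dif)·T·ρ / (π·(1 − S·ρ)), with path radiance L0, direct and diffuse irradiance at the surface E_dir and E_dif, total upward transmittance T, spherical albedo S and surface reflectance ρ. All function names and the toy values are hypothetical and do not reflect the TOC2TOA code.

```python
import math
import random

def toa_radiance(rho, l0, e_dir, e_dif, t_up, s_alb, sza_deg):
    """TOA radiance per wavelength for a Lambertian surface (assumed form)."""
    mu_s = math.cos(math.radians(sza_deg))
    return [
        l0_i + (ed_i * mu_s + ef_i) * t_i * r_i / (math.pi * (1.0 - s_i * r_i))
        for r_i, l0_i, ed_i, ef_i, t_i, s_i
        in zip(rho, l0, e_dir, e_dif, t_up, s_alb)
    ]

def band_average(spectrum, srf):
    """Weight a spectrum by a spectral response function on the same grid."""
    return sum(v * w for v, w in zip(spectrum, srf)) / sum(srf)

def couple_luts(surface_lut, atmosphere_lut, n_pairs, seed=None):
    """Randomly pair surface spectra with atmospheric transfer functions."""
    rng = random.Random(seed)
    return [(rng.choice(surface_lut), rng.choice(atmosphere_lut))
            for _ in range(n_pairs)]

# Toy values on a three-wavelength grid, in arbitrary but consistent units.
surface_lut = [[0.05, 0.30, 0.25], [0.10, 0.20, 0.30]]
atmosphere_lut = [{
    "l0": [8.0, 2.0, 0.5], "e_dir": [1200.0, 900.0, 200.0],
    "e_dif": [300.0, 80.0, 5.0], "t_up": [0.85, 0.93, 0.95],
    "s_alb": [0.15, 0.05, 0.02], "sza_deg": 30.0,
}]
pairs = couple_luts(surface_lut, atmosphere_lut, n_pairs=4, seed=1)
toa_lut = [toa_radiance(rho, **atm) for rho, atm in pairs]
band_value = band_average(toa_lut[0], srf=[0.2, 1.0, 0.2])
```

In this form, a black surface (ρ = 0) returns only the path radiance, which gives a quick sanity check on the transfer functions.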
3.4. Training the GPR LAI models, retrieval and cross-validation
The TOC LUT simulated with PROSAIL and the TOA LUT simulated with the coupled PROSAIL-6SV models were used to train GPR and VHGPR LAI retrieval models applicable to S2 at the corresponding BOA (L2A) and TOA (L1C) level. To assess the theoretical retrieval performance, a 5-fold cross-validation was performed. Three goodness-of-fit metrics were calculated: (1) the root-mean-squared error (RMSE), (2) the normalized RMSE (NRMSE, in %) and (3) the coefficient of determination (R2).
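The cross-validation and the three metrics can be sketched as follows. Two points are assumptions rather than statements about the ARTMO implementation: NRMSE is taken as RMSE divided by the range of the reference values, and R2 is taken as the squared Pearson correlation. The straight-line model only stands in for the (VH)GPR regressor, and all names are hypothetical.

```python
import math
import random

def k_fold_splits(n_samples, k=5, seed=None):
    """Yield (train_idx, test_idx); every sample is tested exactly once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx

def rmse(obs, est):
    return math.sqrt(sum((e - o) ** 2 for o, e in zip(obs, est)) / len(obs))

def nrmse_percent(obs, est):
    # Assumption: normalisation by the range of the reference values.
    return 100.0 * rmse(obs, est) / (max(obs) - min(obs))

def r_squared(obs, est):
    # Assumption: squared Pearson correlation coefficient.
    n = len(obs)
    mo, me = sum(obs) / n, sum(est) / n
    cov = sum((o - mo) * (e - me) for o, e in zip(obs, est))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_e = sum((e - me) ** 2 for e in est)
    return cov ** 2 / (var_o * var_e)

def cross_validate(x, y, fit, predict, k=5, seed=None):
    """Collect out-of-fold predictions, then score them once."""
    est = [None] * len(y)
    for train_idx, test_idx in k_fold_splits(len(y), k, seed):
        model = fit([x[i] for i in train_idx], [y[i] for i in train_idx])
        for i in test_idx:
            est[i] = predict(model, x[i])
    return {"RMSE": rmse(y, est), "NRMSE": nrmse_percent(y, est),
            "R2": r_squared(y, est)}

# Stand-in regressor: a least-squares straight line on one feature.
def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(xs, ys))
             / sum((a - mx) ** 2 for a in xs))
    return slope, my - slope * mx

def predict_line(model, xi):
    return model[0] * xi + model[1]

x = [0.1 * i for i in range(50)]
y = [2.0 * xi + 1.0 for xi in x]
scores = cross_validate(x, y, fit_line, predict_line, k=5, seed=0)
```

Here the scores are computed once over the pooled out-of-fold predictions. Averaging per-fold scores is an equally common choice and gives slightly different numbers.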
3.5. Validation with ground measurements and Sentinel-2 data
3.5.1. Marchfeld site
Validation data was collected from an area located east of Vienna in Lower Austria (Lat. 48°N, Long. 17°E). The site is a major agricultural production area in Austria with cropland occupying 60,000 ha, of which about 21,000 ha are irrigated regularly with groundwater throughout the growing season (Neugebauer and Vuolo, 2014). The region is characterized by a semi-arid climate representing the driest region of Austria.
The field campaign took place from April to August 2016. Eight different crop types, distributed over 72 parcels, were monitored to represent the prevailing crop types in the study area (Vuolo et al., 2018). The parcels include 33 ordinary fields and 39 one-hectare experimental plots. In-situ LAI measurements were collected with a Li-Cor LAI-2200 Plant Canopy Analyzer (Li-Cor, 1992). The LAI-2200 sensor uses a non-destructive method and is sensitive to all light-blocking objects in its view. It estimates LAI from canopy transmittance by quantifying the attenuation of radiation as it passes through the canopy (Li-Cor, 1992). Therefore, measurements were taken above (A) and below (B) the canopy. The LAI estimates represent the effective Plant Area Index (PAIe), because the optical sensor does not distinguish between photosynthetically active leaves and inactive plant parts such as senescent leaves or stems. Care was taken to measure LAI only on photosynthetically active vegetation. For example, measurements were interrupted on winter cereals as soon as the first signs of senescence appeared. The LAI-2200 was deployed in elementary sampling units (ESUs) within a radius of 5−10 m of a georeferenced point (accuracy of ±3−5 m). Each unit represented a homogeneous area with a single crop type. The ESUs were randomly chosen within the study area, the only restriction being field accessibility given the time constraints. For ordinary fields, the centres of the ESUs were placed in a corner of a square area of 60 m by 60 m within the field, measured from the field border. It was imperative that the field conditions were relatively homogeneous in terms of crop development. The ESUs in the experimental plots, which were part of a larger experimental setup, were located in the centre of each one-hectare plot.
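The principle of deriving an effective plant area index from above- and below-canopy readings can be illustrated with a short sketch. It is not the instrument firmware. It assumes the discretized gap-fraction inversion PAIe = 2·Σ −ln(T_i)·cos(θ_i)·W_i over five view rings, and the ring angles and weights below are nominal values as commonly quoted for this instrument family, which should be checked against the manual. All names are hypothetical.

```python
import math

# Nominal ring centre angles (degrees) and weights; treat as assumptions.
RING_ANGLES_DEG = (7.0, 23.0, 38.0, 53.0, 68.0)
RING_WEIGHTS = (0.034, 0.104, 0.160, 0.218, 0.484)

def effective_pai(above, below_readings):
    """Effective plant area index from one A and several B readings.

    above          : five ring readings taken above the canopy (A)
    below_readings : list of five-ring readings taken below the canopy (B)
    """
    pai = 0.0
    for i, (theta, weight) in enumerate(zip(RING_ANGLES_DEG, RING_WEIGHTS)):
        # Average the log-transmittance over all below-canopy readings.
        mean_log_t = sum(
            math.log(b[i] / above[i]) for b in below_readings
        ) / len(below_readings)
        pai += -mean_log_t * math.cos(math.radians(theta)) * weight
    return 2.0 * pai

# Synthetic check: a canopy with spherical leaf angles (G = 0.5) and a true
# PAI of 3 gives T_i = exp(-0.5 * 3 / cos(theta_i)) in each ring.
a_reading = [100.0] * 5
b_reading = [100.0 * math.exp(-0.5 * 3.0 / math.cos(math.radians(t)))
             for t in RING_ANGLES_DEG]
pai_estimate = effective_pai(a_reading, [b_reading] * 8)
```

Averaging the logarithm of the transmittance, rather than the transmittance itself, reduces the influence of clumping between individual B readings.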
Winter cereal, onion, and potato were assessed through three replications of one A and eight B measurements, randomly distributed in the ESU. Row crops like maize, carrot and sugar beet were estimated with four replications of one A and six B measurements, for a total of 24 single measurements to generate a single LAI value per ESU. The final dataset of in situ collected LAI consists of 114 measurements and complementary L1C radiance and L2A reflectance values extracted from the satellite image data. The SNAP “Reflectance-to-Radiance” tool was used to convert L1C reflectance to radiance. Table 4 lists the satellite images and the corresponding dates of field measurements which were used for the analysis.
Table 4. Satellite acquisition and ground measurements dates from April to September 2016 for Marchfeld campaign.
| Date-Ground Measurements | Date-Satellite Acquisition | Difference (Days) |
|---|---|---|
| 13 April | 13 April | 0 |
| 18 April | 13 April | 5 |
| 25 April | 26 April | 1 |
| 2 May | 6 May | 4 |
| 9 May | 6 May | 3 |
| 27−28 June | 25 June | 3 |
| 4−5 July | 2 July | 3 |
| 16 August | 14 August | 2 |
| 31 August | 31 August | 3 |
| 12 September | 10 September | 3 |
3.5.2. Barrax site
As a second test site, the agricultural area of Barrax, Spain, was chosen (Lat. 39°N, Long. 2°W). Although no field campaign has been conducted there in the last few years, this site was long used as a reference for ESA field campaigns in support of satellite missions (e.g., SPARC, SEN3EXP). The Barrax agricultural area has a rectangular form and an extent of 5 km by 10 km, and is characterized by a flat morphology and large, uniform land-use units. The region consists of approximately 65% dry land and 35% irrigated land, the latter mainly served by center pivot irrigation systems. This leads to a patchy landscape with large circular fields. Cultivated crops include garlic, alfalfa, onion, sunflower, corn, potato, sugar beet, vineyard and wheat. The average annual rainfall is about 400 mm.
3.6. Mapping and comparison with SNAP Biophysical Processor
As a final step, to evaluate the capability of the GPR and VHGPR models to generate LAI maps, we applied the trained models to S2 images using ARTMO’s MLRA toolbox (Rivera Caicedo et al., 2014). For both the Marchfeld and Barrax sites, a cloud-free spatial subset of S2 L1C (TOA) and L2A (BOA) imagery was used to evaluate the retrieval performance at both TOA and BOA scale. For Marchfeld an acquisition during the field campaign (2 July 2016) was used, while for Barrax a more recent acquisition was used (5 June 2017). The average SZA values were 30° and 20° for Marchfeld and Barrax, respectively. RGB composites of the subsets are shown in Fig. 3. For the Barrax site the L2A product was downloaded directly from Scihub. For the Marchfeld site, since no L2A image was available in Scihub for this acquisition date, the L1C subset was processed offline with the Sen2Cor Atmospheric Correction Processor (version 2.5.5) to obtain L2A reflectance. For this atmospheric correction the default parameters of the Sen2Cor algorithm were used (Clerc and team, 2020). Only the 10 m and 20 m bands of the S2 images were used, i.e. bands 2 to 8, 8a, 11 and 12. The images were resampled to 20 m, and for both sites a spatial subset of 400 by 400 pixels was selected.
Fig. 3. RGB composition of Sentinel-2 MSI over Marchfeld, Austria (left) and Barrax, Spain (right).
Lastly, the generated LAI maps were compared against the maps generated with the Biophysical Processor of the SNAP software (Weiss et al., 2016). In SNAP, the retrieval of LAI is based on an NN model and builds on earlier experience from other ESA land missions (Weiss et al., 2016). This model is also trained with a PROSAIL LUT. The variable ranges and distributions established for that LUT were similar to the ones we assumed in PROSAIL (Table 2). Although ESA recommends using this tool with TOC reflectance data, we also used TOA reflectance data to generate maps for comparison purposes.
4. Results
4.1. GPR LAI models and validation against field data
Before assessing the validity of the trained GPR and VHGPR models, a first inspection of the training data is provided. RTM simulations were run both with PROSAIL, which led to TOC reflectance, and with PROSAIL-6SV, which led to upscaled TOA radiance data. Because empirical soil spectra were added to the training data to account for non-vegetated surfaces, overview statistics of both the simulated vegetation spectra and the bare soil spectra are shown in Fig. 4. The figures at TOC and TOA scales demonstrate that a large variability of vegetation and soil profiles is covered in the training dataset. This is essential for processing a complete S2 image, including all kinds of non-vegetated surfaces. These datasets form the core of the LAI retrieval algorithms applied to S2 BOA and TOA images.
Fig. 4. General statistics (mean, standard deviation (SD), min−max) for the bare soil spectra collected from the S2-L2A product and the vegetation spectra simulated with PROSAIL (left) for the 10 and 20 m S2 bands. The same data upscaled to TOA radiance with 6SV (right).
Following the above-described processing scheme (Fig. 2), GPR and VHGPR models were trained with the TOC and TOA training datasets for the development of LAI retrieval models. A first step was to evaluate the theoretical performance of the trained models. A 5-fold cross-validation sampling strategy was applied to assess their theoretical goodness-of-fit. Both models showed good and consistent performance at BOA and TOA scale. At BOA scale, GPR obtained a slightly higher accuracy, with an R2 of 0.59 (RMSE: 1.08; NRMSE: 15.50%), than VHGPR, with an R2 of 0.58 (RMSE: 1.09; NRMSE: 15.68%). Results for the two models were practically identical at the TOA scale, yet somewhat poorer than at BOA scale: GPR with an R2 of 0.54 (RMSE: 1.14; NRMSE: 16.30%), and VHGPR with an R2 of 0.54 (RMSE: 1.14; NRMSE: 16.35%). The somewhat poorer TOA performance was expected because of the additional variability introduced into the LUT by the coupling with the atmosphere simulations. Yet the retrieval performance is similar to that of the TOC dataset, which reveals the consistency of the LAI retrieval method from TOA radiance data. For both datasets, the added noise and the saturation at higher LAI values explain the sub-optimal results; what matters, however, is the performance against ground validation data.
Regarding validation against field data, the Marchfeld dataset was used. For both BOA and TOA data and for the GPR and VHGPR models, the measured vs. estimated scatter plots are shown in Fig. 5, along with the goodness-of-fit statistics. For both GPR and VHGPR, the validation results do not suggest differences between the BOA and TOA scale for the various crop types. This is of interest, as it suggests that the retrieval models function just as well at both scales. It underlines the possibility of retrieving LAI directly from TOA radiance data. In contrast, VHGPR obtained slightly better validation results than GPR; e.g. the underestimations are smaller than with GPR. An advantage of GP models is that associated uncertainty estimates (confidence intervals) are provided. As can be observed, all estimates are accompanied by a consistent uncertainty range. Here an additional advantage of VHGPR appeared: it provided lower uncertainties, especially at lower LAI. Hence, VHGPR models were used for the subsequent mapping applications. Finally, when inspecting the estimates for the different crop types, all crops were reasonably well estimated. Only onion showed a systematic underestimation. Most likely, for this row crop the influence of bare soil plays a large role at the pixel scale, leading to the underestimation.
Fig. 5. Measured vs. estimated LAI values along the 1:1-line with associated confidence intervals (1 SD). Ground validation of several crops over the Marchfeld site for GPR (left) and VHGPR (right), for retrieval from S2-L2A (BOA) data (top) and S2-L1C (TOA) data (bottom).
4.2. LAI mapping from S2 BOA (L2A) and TOA (L1C) data
Because of the adequate validation of the LAI retrieval models at the S2 BOA and TOA scale, the subsequent step is applying the VHGPR models to the S2 subsets for mapping. The obtained LAI maps as generated for both BOA and TOA data products of the two subsets are shown in Fig. 6 (Marchfeld) and Fig. 7 (Barrax).
Fig. 6. LAI map (mean estimates; μ) (left), associated uncertainties (expressed as standard deviation (SD) around μ) (center), and relative uncertainties (expressed as coefficient of variation, CV = SD/μ × 100 in %) (right) as generated by the VHGPR algorithm from L2A (top) and L1C (middle) data for the Marchfeld test site. Scatter plots of both maps with gridded color density (bottom). For %CV, a maximum of 100% is set.
Fig. 7. LAI map (mean estimates; μ) (left), associated uncertainties (expressed as standard deviation (SD) around μ) (center), and relative uncertainties (expressed as coefficient of variation, CV = SD/μ × 100 in %) (right) as generated by the VHGPR algorithm from L2A (top) and L1C (middle) data for the Barrax test site. Scatter plots of both maps with gridded color density (bottom). For %CV, a maximum of 100% is set.
For both test sites, the obtained maps are realistic and represent well the spatial variation of the surface. Regarding Marchfeld (Fig. 6), a clear distinction can be made between green fields with high LAI and fields where crops are either senescent or have been harvested, with LAI close to zero. Regarding Barrax (Fig. 7), the irrigated circular agricultural fields can be identified on the maps together with their within-field variation, while non-vegetated areas are estimated with LAI values close to zero. When comparing the LAI maps obtained at BOA and TOA scale, the spatial distribution and the LAI range appear alike. The high correlation between BOA and TOA is best appreciated in the scatter plots for the Marchfeld site (R2 = 0.95, bottom of Fig. 6) and the Barrax site (R2 = 0.99, bottom of Fig. 7). The similarity between the BOA and TOA maps suggests that LAI can be retrieved directly from TOA radiance data, i.e. without the need for an atmospheric correction.
An advantage of the GPR and VHGPR models is that, because they are developed in a Bayesian framework, associated uncertainty maps are provided in addition to the LAI estimates. Two uncertainty outputs are calculated: absolute uncertainties, expressed as standard deviation (SD) around the mean estimate, and relative uncertainties (%CV = SD/mean estimate × 100). These maps provide additional information about the performance of the retrieval models on a per-pixel basis. Accordingly, the consistency across the BOA and TOA scales can be inspected.
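The relative uncertainty can be computed per pixel as in the sketch below, which also applies the 100% cap used for display in the CV maps. Assigning the cap to pixels with a zero or negative mean estimate is an assumption made here to avoid division by zero, and the function name is hypothetical.

```python
def relative_uncertainty(mean_lai, sd_lai, cap=100.0):
    """Per-pixel CV (%) = SD / mean * 100, limited to `cap`.

    Pixels whose mean estimate is not positive receive `cap` as well
    (an assumption of this sketch, chosen to avoid dividing by zero).
    """
    cv = []
    for mu, sd in zip(mean_lai, sd_lai):
        if mu <= 0.0:
            cv.append(cap)
        else:
            cv.append(min(cap, 100.0 * sd / mu))
    return cv

# A dense canopy, a near-bare pixel and a bare pixel (hypothetical values).
cv_values = relative_uncertainty([4.0, 0.1, 0.0], [0.4, 1.0, 0.5])
```

The second pixel shows why bare soils tend to saturate the CV map: a small mean LAI combined with an SD near 1 gives a very large ratio even though the estimate itself is plausible.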
In the associated uncertainty maps, low SD values over vegetated areas indicate high certainty, while higher values (light blue and yellow) indicate lower certainty. Low values over non-vegetated areas appear because of the close-to-zero LAI values for bare soils or senescent fields. A few areas with high uncertainties can be seen in the maps in red (especially for the Barrax site); these belong to surfaces not included in the LUTs, e.g. man-made surfaces. The uncertainty maps can be useful to reveal areas that need better representation in the training dataset (Verrelst et al., 2015b). Generally, the BOA and TOA uncertainty maps are consistent. The BOA map showed somewhat more areas with slightly higher SD values than the TOA map, i.e. higher uncertainties. This was expected for BOA due to the errors involved in its complex pre-processing step, as explained in Section 1. The scatter plot of BOA vs. TOA again confirms the consistency between both products (R2 of 0.91 for Marchfeld and R2 of 0.96 for Barrax). Although not shown, the SD uncertainties obtained from the original GPR model were systematically higher, which confirms that VHGPR leads to higher-quality maps than GPR.
While the SD map is related to the magnitude of LAI, the relative uncertainty as calculated by the coefficient of variation (CV) is probably easier to interpret, as it is a relative estimate expressed in percent. When inspecting this map for Marchfeld, it can be observed that the BOA map led to substantially more parcels with low uncertainties than the TOA map. Nevertheless, the regions with high uncertainties lie over parcels with close-to-zero LAI. The same occurs over Barrax: the irrigated vegetated areas are retrieved with low uncertainties, while the bare soils are retrieved with high uncertainties. This results from very low LAI estimates, which are correct, being associated with absolute uncertainties of 1 or higher, which leads to high relative uncertainties. Hence, it is not that the estimates are out of range; rather, the estimates are accompanied by high uncertainties. Typically, adding more bare soil spectra to the training data, accounting for local variability, would further reduce the uncertainties. The regions with out-of-range values weaken the correlation in the scatter plot. Yet the large majority of pixels fell precisely on the 1:1-line, again suggesting the consistency of both maps. It must also be remarked that the CV map obtained from the GPR models led to substantially more areas with high values (not shown).
4.3. Comparison against LAI maps obtained from SNAP NN model
A final step involves comparison against the official LAI product as obtained from ESA’s SNAP toolbox. Hence, the SNAP biophysical processor, based on an NN model, was used to generate the LAI maps at both BOA and TOA scale over the Marchfeld and Barrax sites. The obtained LAI maps are provided in Fig. 8 (Marchfeld) and Fig. 9 (Barrax). At a glance, the SNAP maps look similar to the maps generated by VHGPR; the same patterns are obtained and the LAI estimates appear alike. However, when comparing the BOA map against VHGPR in a scatter plot, it can be observed that SNAP clearly overestimates some vegetated parcels. This is especially the case for Barrax, with unrealistic LAI values exceeding 20 (see also the scatter plots). The LAI maps also look alike when obtained from TOA radiance data (L1C). Interestingly, at this level the SNAP NN model does not suffer from extreme overestimation. Despite the overestimation of the NN at BOA scale, it should be mentioned that the SNAP biophysical processor also generates quality indicators per pixel. The indicators show when the input reflectances are outside the training definition domain. An indicator is also flagged when the output value is outside the LAI range, which SNAP defines by a minimum, a maximum of 8, and a tolerance value. Hence, when the estimate is beyond 8, it is flagged as out-of-range.
Fig. 8. LAI map obtained with SNAP (left), scatter plot (center) and relative error map (right) between VHGPR (Fig. 6) and SNAP NN estimations from L2A (BOA) data (top) and L1C (TOA) data (bottom) for the Marchfeld test site. For visualization purposes, the LAI color bar was limited to a maximum of 7.
Fig. 9. LAI map obtained with SNAP (left), scatter plot (center) and relative error map (right) between VHGPR (Fig. 7) and SNAP NN estimations from L2A (BOA) data (top) and L1C (TOA) data (bottom) for the Barrax test site. For visualization purposes, the LAI color bar was limited to a maximum of 7.
The differences between the SNAP NN and VHGPR models are probably easier to interpret by mapping the relative errors. For both BOA and TOA, relative error maps are shown in Fig. 8 (Marchfeld) and Fig. 9 (Barrax). White areas indicate differences within 20%, i.e. no substantial change. Dark blue and red shades indicate large relative differences. For both the Marchfeld and Barrax sites, blue colors (i.e., underestimation) predominate. These parcels appear at low LAI values, i.e., parcels influenced by bare soil. When comparing both maps, it appears that the SNAP NN model does not reach zero values over senescent or non-vegetated surfaces. This suggests that the NN model is not optimized to estimate zero values when vegetation is absent or no longer green.
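The per-pixel comparison can be sketched as follows. The sign convention is an assumption: the relative difference is taken as (VHGPR − SNAP)/SNAP × 100, so negative values mean that VHGPR is lower than SNAP. The 20% tolerance follows the text, while the function names and class labels are hypothetical.

```python
def relative_difference_pct(estimate, reference):
    """Relative difference (%) of `estimate` with respect to `reference`."""
    if reference == 0.0:
        return float("nan")  # undefined where the reference is zero
    return 100.0 * (estimate - reference) / reference

def classify(diff_pct, tolerance=20.0):
    """Map a relative difference to one of the display classes."""
    if diff_pct != diff_pct:  # NaN check without extra imports
        return "undefined"
    if abs(diff_pct) <= tolerance:
        return "similar"
    return "lower" if diff_pct < 0.0 else "higher"

# Hypothetical (VHGPR, SNAP) LAI pairs for three pixels.
pixel_pairs = [(0.1, 0.5), (3.0, 3.2), (5.0, 4.0)]
classes = [classify(relative_difference_pct(v, s)) for v, s in pixel_pairs]
```

The first pixel illustrates the low-LAI case discussed above: a small absolute offset over bare soil already produces a large relative difference.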
5. Discussion
Building on the hybrid retrieval processing strategy proposed earlier in Verrelst et al. (2019b), in this work we developed LAI retrieval models applicable to S2 BOA (L2A) and TOA (L1C) data using Gaussian processes (GPs). These GP models can then be applied to S2 images for the operational production of LAI maps. Demonstrating that the TOA retrieval approach can be developed using a hybrid method based on coupled vegetation-atmosphere RTMs and GP models is one thing; what remained to be done was assessing the robustness and maturity of the retrieval models. This was done as follows. First the retrieval models were validated against ground measurements, then the mapping performance was tested over two European agricultural sites, and finally the maps were compared against the official S2 LAI product. The obtained validation results and the consistency of the generated maps confirmed that biophysical variables such as LAI can be meaningfully retrieved directly from at-sensor radiance data. Here we discuss the multiple aspects of the retrieval strategy, starting with the chosen variable.
The focus on LAI retrieval from TOA radiance data was motivated by an earlier leaf-canopy-atmosphere global sensitivity analysis (GSA) (Verrelst et al., 2019b). According to those GSA results, LAI is the most dominant variable driving the variability of TOA radiance along the visible-SWIR spectral range, followed by leaf chlorophyll and water content (Verrelst et al., 2019b). The good validation results obtained at the TOA scale were to be expected; the TOA-based GSA results indicated that, outside the water absorption regions and the blue region, the vegetation variables drive TOA radiance much more than atmospheric variables (Verrelst et al., 2019b). Moreover, the S2 bands used are conveniently located in the spectral regions where LAI plays a dominant role, especially the red and SWIR bands (B11 at 1610 nm and B12 at 2190 nm). The S2 band settings exploit the spectral information efficiently because, according to the GSA results, it is in these two SWIR bands that LAI is the main driver, with about 80% of the total sensitivity (STi) (Verrelst et al., 2019b). At the same time, the influence of the atmosphere in the visible part can introduce some error in both the TOA and BOA retrieval. While at the BOA scale atmospheric correction in principle takes care of it, in TOA retrieval the idea is that the influence of the atmosphere in the visible is accounted for directly in the retrieval model.
To assess the role of the S2 band positions, we examined the S2 bands used with respect to LAI retrieval performance. Related studies used similar S2 band settings in LAI model development but with one or two bands fewer. For instance, Verrelst et al. (2019b) used 9 bands, excluding B8, and Upreti et al. (2019) used 8 bands, excluding B2 and B8. To clarify the contribution of the bands, we conducted similar tests using 8, 9 and 10 bands in the LAI retrieval algorithms. Validation results showed insignificant differences in the retrieval performance (results not shown), which suggests that using one band more or less will not impact the results. Another aspect is that our validation results at BOA level obtained with GPR (RMSE: 0.70) and VHGPR (RMSE: 0.63) are consistent with the related work of Upreti et al. (2019), who found a similar performance using least-squares linear regression (RMSE: 0.68) and GPR with active learning (RMSE: 1.31). In comparison to the latter case, we obtained superior validation results without implementing an active learning strategy. At the same time, the reason for the good validation may also lie in the quality of the validation data; with over 100 ground measurements covering multiple crop types, it is a rich dataset. It is also worth mentioning that we did not encounter related S2 studies that provide LAI validation results at the TOA scale, implying that the BOA vs. TOA intercomparison conducted here is probably the most informative. To sum up, the consistent validation results and the similar estimates for all crop types at each scale support the conclusion that the hybrid retrieval method works at both TOA and BOA scales.
Apart from the targeted variable and the selected bands, a key factor of the hybrid model is the specific GP algorithm used, i.e. GPR or VHGPR. Both flavors of GP models achieved consistent performance in LAI estimation with both BOA and TOA data. The small but systematic superiority of VHGPR over GPR can be explained by its more flexible nature, which accounts for signal-to-noise relations. Unlike GPR, VHGPR does not assume that the noise is independent of the signal. When inspecting the uncertainty information offered by these methods, we can appreciate how VHGPR yields consistently lower uncertainties than GPR at low LAI values. We hypothesize that VHGPR adjusts better to the noise conditions at low levels of the variable due to its heteroscedastic nature. Although VHGPR takes somewhat longer to train than GPR (roughly twice as long, see Lázaro-Gredilla and Titsias, 2011; Camps-Valls et al., 2016), mapping run-times are of the same order for both algorithms, which suggests that training time should not be an obstacle to preferring VHGPR over GPR for achieving lower uncertainties and higher accuracy.
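The effect of the noise assumption on the predictive uncertainty can be illustrated with a minimal one-dimensional GP sketch in NumPy. It is not VHGPR: here the input-dependent noise variance is simply supplied by the user, whereas VHGPR learns it from the data through a variational approximation. The kernel choice, hyperparameter values and toy data are all assumptions.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return signal_var * np.exp(-0.5 * d2 / length_scale ** 2)

def gp_predict(x_train, y_train, x_test, noise_var,
               length_scale=1.0, signal_var=1.0):
    """Posterior mean and SD of the latent function at x_test.

    noise_var: scalar (homoscedastic) or one value per training sample
    (heteroscedastic, assumed known in this sketch).
    """
    noise = np.broadcast_to(np.asarray(noise_var, dtype=float), x_train.shape)
    k_tt = rbf_kernel(x_train, x_train, length_scale, signal_var) + np.diag(noise)
    k_st = rbf_kernel(x_test, x_train, length_scale, signal_var)
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_st @ alpha
    v = np.linalg.solve(chol, k_st.T)
    var = signal_var - np.sum(v ** 2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

# Toy data: noise grows with the input, mimicking signal-dependent noise.
x = np.linspace(0.0, 6.0, 13)
y = np.sin(x)
x_new = np.array([0.5, 5.5])
noise_het = 0.001 + 0.5 * (x / 6.0) ** 2

_, sd_homo = gp_predict(x, y, x_new, noise_var=0.1)
_, sd_het = gp_predict(x, y, x_new, noise_var=noise_het)
```

With a single noise level the two test points, placed symmetrically in the training range, receive the same SD. With input-dependent noise the SD shrinks where the data are clean and grows where they are noisy, which is the behaviour that a heteroscedastic model can exploit at low LAI.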
With the data and retrieval model in place, the essence of this work was to explore how well LAI can be retrieved directly from TOA data. Although the results obtained at BOA and TOA scale are consistent, it has not escaped our attention that the TOA validation results slightly outperformed those of BOA. It can be argued that retrieving from BOA data is probably a more complex approach due to the multiple processing steps involved in converting the L1C product into L2A. In addition to the uncertainties introduced by the hybrid LAI retrieval algorithm, the atmospheric correction carried out with the Sen2Cor procedure can introduce residual errors in the L2A product that propagate into the retrieval of biophysical variables. The elements that can affect the accuracy of the Sen2Cor atmospheric correction include atmospheric effects (i.e., Rayleigh and aerosol scattering) on the TOA data across the spectrum, particularly in the visible bands (Martins et al., 2017). In this respect, Sola et al. (2018) and Doxani et al. (2018) compared alternative atmospheric correction algorithms (6SV, ACOLITE and Sen2Cor). The authors reported diverging performance with respect to location, spectral band and land cover. These sensitivities suggest that atmospheric correction is not straightforward and can easily influence the resulting BOA reflectance. Djamai and Fernandes (2018) also suggested that more research is required to quantify the impact of different atmospheric correction algorithms on the retrieval of biophysical variables.
The alternative approach proposed here, i.e. developing retrieval models directly at the TOA scale, seems to simplify the problem. Yet challenges appear here as well, although probably fewer than at BOA scale. First of all, a clear, cloud-free atmosphere is assumed, as was the case at the two demonstration sites. It must be remarked that the TOA models have not yet been tested under hazy conditions, so a clear sky is a requirement. Notwithstanding, the observed moderate discrepancies between BOA and TOA against the ground validation require further explanation; they may be attributed to the use of different RTMs for the atmospheric simulation and for the TOA data preprocessing. While 6SV is used to generate the training dataset, the S2 TOA reflectance-to-radiance conversion in SNAP is done with routines based on libRadtran using the solar irradiance spectrum from Thuillier (Thuillier et al., 2003). Further work should ensure consistency between the atmospheric models used to simulate the training dataset and those used in the preprocessing of real TOA data. This can be easily done in the ALG-TOC2TOA packages. For instance, new TOA LUTs can be generated with different atmospheric RTMs for comparison purposes. PROSAIL, which has proven to be suitable for generating the training dataset, can be kept as the vegetation model, while distinct atmospheric models can be applied to perform the atmospheric correction of the TOA product, e.g. 6SV, libRadtran, MODTRAN. Another research task would be to enable TOC2TOA to generate TOA reflectance spectra instead of TOA radiance spectra, so that the retrieval can directly use the provided S2 TOA reflectance product.
Once the VHGPR model is trained, in principle it can be applied to any S2 image for LAI mapping, as demonstrated here over two agricultural sites. The consistency between the TOA and BOA scale and the low deviation shown in the uncertainty maps confirm the good performance of the VHGPR models. The uncertainty maps are a valuable addition compared with other retrieval methods: they provide information about the per-pixel performance and thus the portability of the VHGPR model (Verrelst et al., 2013b). For instance, the model can be applied to any S2 image as long as the uncertainties stay within a certain threshold. The uncertainty maps could also be used as a spatial mask to show only the pixels that meet a minimum level of confidence (Verrelst et al., 2015a). For example, pixels with relative uncertainties above 20% can be discarded, thereby meeting the GCOS recommendation for minimum LAI quality (GCOS, 2011). Moreover, the models performed more consistently than the SNAP NN model. As opposed to the NN-derived maps, more close-to-zero values were achieved for non-vegetated surfaces, while overestimations over dense vegetation did not occur. In the SNAP NN model, estimates above an LAI of 8 are flagged by additional quality layers as out-of-range, and can thus also be masked out. When comparing both approaches, the VHGPR-produced uncertainty estimates seem more valuable than threshold-based quality flags as an independent and quantitative source of information about the model performance (Verrelst et al., 2013b; Verrelst et al., 2015a).
It must also be remarked that, although LAI was successfully estimated over agricultural areas, it remains necessary to test and validate the models over other vegetation types typically present in an S2 image, such as forest and more heterogeneous areas. The estimates will most likely be poorer, as the models are not optimized for these vegetation types. Complementary training data are therefore required, e.g., from RTMs that are better equipped to simulate more heterogeneous vegetation types such as forest or shrubland. At the same time, the maps show higher levels of uncertainty over non-vegetated surfaces (such as bare soil and man-made surfaces) than over vegetated zones. The reference soil spectra extracted directly from the images would be limiting when moving beyond local conditions. To be more generically applicable, instead of extracting the soil spectra directly from the images, one option is to implement a reference soil dataset that represents a large variation of soil types, moisture, roughness and geometrical configurations (Jacquemoud et al., 1992; Weidong et al., 2002). From this database, a small subset of field-measured spectra could be selected that does not significantly increase the size of the LUT but represents the variability of the soil well, making the retrieval model more global (Weiss et al., 2016). The diversity of soil properties can be further increased by applying the concept of brightness: the soil reflectance values are multiplied by a brightness coefficient in order to reach a better soil representation (Verger et al., 2011). In case the soil spectra are too smooth, the implemented noise model should also be applied to them. Additionally, the models are currently not trained for water bodies and man-made surfaces. Addressing this would be a matter of either adding such spectra (e.g., from the USGS database (Kokaly et al., 2017) or the ECOSTRESS spectral library (Meerdink et al., 2019)) to the training dataset, or otherwise masking out those non-vegetated surfaces.
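The brightness concept mentioned above, in which a coefficient multiplies the soil reflectance, can be sketched as follows. This is only an illustration under stated assumptions: the function name `scale_soil_brightness`, the four-band toy spectrum and the chosen factors are hypothetical, and clipping to the range [0, 1] is an added safeguard rather than a step taken from the cited work.

```python
def scale_soil_brightness(soil_reflectance, brightness_factors):
    """Create brightness variants of one soil spectrum.

    Each variant is the input reflectance multiplied by one factor and
    clipped to the physical reflectance range [0, 1].
    """
    variants = []
    for factor in brightness_factors:
        variants.append(
            [min(max(r * factor, 0.0), 1.0) for r in soil_reflectance]
        )
    return variants

# Hypothetical soil reflectance in four bands and three brightness factors.
base_soil = [0.10, 0.15, 0.25, 0.30]
factors = [0.5, 1.0, 1.5]
soil_variants = scale_soil_brightness(base_soil, factors)
```

Each variant could then be used as an alternative soil background in the LUT simulations, which increases soil diversity without requiring additional measured spectra.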
Finally, with a view to moving beyond the results presented here and with the ambition of obtaining consistent LAI retrievals across the globe, optimally trained and robust models are required. To this end, PROSAIL and 6SV were fine-tuned to ensure that the LUTs represent the most variable conditions possible for a cloudless agricultural scene. For the vegetation variables, Gaussian distributions applied to LAI and Cab were appropriate to represent the vegetation conditions in spring and summer. This allowed the maximum information to be concentrated on the most characteristic LAI and Cab values for those dates. However, the lower representativeness at the extremes of the ranges may contribute to the underestimation and overestimation observed at extreme LAI values. This configuration, focused on one period of the year, might limit the global application of the model. To mitigate these limitations, a common approach is to better distribute the variables along their ranges, even if this increases the size of the LUT, which complicates the (VH)GPR training process and affects the efficiency of the retrieval (Verrelst et al., 2016a). Regarding the development of the LUTs for training, several modifications can be introduced that may lead to improved or more robust models. For instance, various PROSAIL and 6SV parameters were kept fixed, such as the hot-spot and the sun-target-sensor geometry, in order to facilitate the coupling between the vegetation and atmospheric RTMs. Also, the validity of the Lambertian surface assumption might have to be revisited, and more advanced TOA couplings that take surface anisotropy into account (Verhoef and Bach, 2003; Verhoef and Bach, 2007; Bayat et al., 2020; Yang et al., 2020) might lead to more accurate retrievals of vegetation properties. However, adding more variables to the LUT makes the training more difficult and may deteriorate the retrieval performance.
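The effect of Gaussian sampling on the representation of extreme LAI values can be illustrated with a small sketch. It assumes a truncated Gaussian drawn by rejection sampling; the mean, standard deviation, range and sample size used below are hypothetical and do not correspond to the LUT configuration of this study.

```python
import random

def sample_truncated_gaussian(mean, std, lower, upper, n, rng):
    """Draw n values from a Gaussian, rejecting draws outside [lower, upper]."""
    samples = []
    while len(samples) < n:
        value = rng.gauss(mean, std)
        if lower <= value <= upper:
            samples.append(value)
    return samples

def fraction_above(values, threshold):
    """Fraction of values strictly larger than threshold."""
    return sum(v > threshold for v in values) / len(values)

rng = random.Random(42)  # fixed seed for reproducibility

# Hypothetical LAI sampling settings (assumptions for illustration only).
lai_gauss = sample_truncated_gaussian(
    mean=3.0, std=1.5, lower=0.0, upper=8.0, n=1000, rng=rng
)
lai_uniform = [rng.uniform(0.0, 8.0) for _ in range(1000)]

# Share of samples in the upper part of the range for both designs.
high_gauss = fraction_above(lai_gauss, 6.0)
high_uniform = fraction_above(lai_uniform, 6.0)
```

With these settings the Gaussian design places far fewer samples at high LAI than the uniform design, which is the kind of lower representativeness at the extremes discussed above.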
To deal with the growth of the LUT, reduction strategies such as active learning (Pasolli et al., 2012) can be used, which has been demonstrated to be superior to random sampling, improving retrieval accuracy with lower sampling rates (Verrelst et al., 2016a; Upreti et al., 2019). Another option is to use sample distributions that better reflect reality, e.g., normal or log-normal distributions for key variables (Verrelst et al., 2019b). Global processing applications of the methods presented here should take into account the variability in the illumination conditions along the orbit and with acquisition time. Rather than being treated as an independent variable, the illumination conditions should be taken as an additional source of information for the retrieval, which could be included as an additional band for training the statistical model. Several of these LUT optimization aspects have been studied and implemented in the SNAP NN model, which led to a large and optimized LUT (41472 simulations, with 2/3 used for training) (Weiss et al., 2016). However, the validation of the NN model against the same Marchfeld field data yielded only a slightly better performance at the BOA scale (R2 of 0.83) (Vuolo et al., 2016) than reported in this study (R2 of 0.80). The (VH)GPR models presented here made use of about 1000 samples, which is considerably fewer than used in the SNAP NN model. Hence, LUT size is not a key factor for the (VH)GPR model; in fact, large training datasets make the model unnecessarily heavy and slow. What matters is the quality of the training data, which eventually boils down to seeking an optimal trade-off between realism, diversity and manageability.
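To give a feeling for how a LUT can be reduced while preserving diversity, the sketch below applies a greedy farthest-point selection based on Euclidean distance between spectra. It is a simple, model-free stand-in chosen for illustration and is not the active learning procedure of the cited studies, which evaluate candidate samples with the regression model itself. The function name `farthest_point_subset` and the random four-band toy spectra are assumptions.

```python
import math
import random

def farthest_point_subset(spectra, n_select):
    """Greedily pick n_select indices so that each new sample is the one
    farthest (Euclidean distance) from the samples already selected."""
    if not 0 < n_select <= len(spectra):
        raise ValueError("n_select must be between 1 and len(spectra)")
    selected = [0]  # start from the first sample (arbitrary choice)
    # Distance from every sample to its nearest selected sample.
    nearest = [math.dist(s, spectra[0]) for s in spectra]
    while len(selected) < n_select:
        idx = max(range(len(spectra)), key=nearest.__getitem__)
        selected.append(idx)
        # Update nearest-selected distances with the newly added sample.
        for i, s in enumerate(spectra):
            nearest[i] = min(nearest[i], math.dist(s, spectra[idx]))
    return selected

# Toy LUT: 200 random four-band spectra (hypothetical data).
rng = random.Random(0)
lut_spectra = [[rng.random() for _ in range(4)] for _ in range(200)]
subset = farthest_point_subset(lut_spectra, 20)
```

The selected indices could then be used to extract a compact training set from the full LUT, in line with the observation that about 1000 well-chosen samples can be sufficient for the (VH)GPR models.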
As a closing remark, although the focus of this work was on LAI retrieval from S2, it must be emphasized that essentially any hybrid retrieval model can be developed for any kind of TOA radiance data with the developed ALG-ARTMO software framework. For instance, this can be achieved with other combinations of leaf-canopy-atmosphere RTMs or with other machine learning regression algorithms. Future work will be dedicated to the retrieval of other vegetation and atmosphere variables and to other sensors, such as Sentinel-3. The software framework can be downloaded at http://artmotoolbox.com/. Code snippets and demos for GPR, VHGPR and other machine learning regression algorithms are available from https://isp.uv.es/soft_regression.html.
6. Conclusions
This study aimed to develop LAI retrieval models directly from Sentinel-2 (S2) top-of-atmosphere (TOA) radiance data (L1C product). To do so, a hybrid machine learning regression approach was developed using simulations from leaf-canopy-atmosphere radiative transfer models (RTMs). The coupled PROSAIL-6SV RTMs were used to simulate a look-up table (LUT) of TOA radiance data and associated input variables. This LUT was then used to train the Bayesian algorithms Gaussian processes regression (GPR) and variational heteroscedastic GPR (VHGPR). Similarly, PROSAIL simulations were used to train GPR and VHGPR models for LAI retrieval from S2 images at bottom-of-atmosphere (BOA) level (L2A product) for comparison purposes. The LAI products were adequately validated at TOA and BOA scale against a field dataset acquired at the agricultural site Marchfeld (Austria). The VHGPR model was further used for LAI mapping because it delivered superior accuracies and lower uncertainties. The LAI maps obtained over the agricultural sites Marchfeld and Barrax (Spain) from an S2 BOA and TOA subset were alike. A similar degree of consistency was obtained when comparing these LAI maps against the SNAP LAI products at BOA and TOA scale. Associated uncertainty maps supported the spatial consistency.
Altogether, this study demonstrated that hybrid LAI retrieval models can be developed directly from TOA radiance data given a cloud-free sky, thus without the need for atmospheric correction. It is expected that in the future hybrid retrieval models will be developed for a diversity of vegetation properties directly from TOA radiance data. The development of such hybrid retrieval models can be easily achieved within the ALG-ARTMO software framework.
Footnotes
Declaration of Competing Interest
None.
Acknowledgements
This work was supported by the European Research Council (ERC) under the ERC-2017-STG SENTIFLEX project (grant agreement 755617) and Ramón y Cajal Contract (Spanish Ministry of Science, Innovation and Universities). Gustau Camps-Valls was supported by the European Research Council (ERC) under the ERC-CoG-2014 SEDAL project (grant agreement 647423). We thank the two anonymous reviewers for their constructive suggestions to improve the quality of our paper.
References
- Bacour C, Baret F, Béal D, Weiss M, Pavageau K. Neural network estimation of LAI, fAPAR, fCover and LAI×Cab, from top of canopy MERIS reflectance data: Principles and validation. Remote Sens Environ. 2006;105:313–325.
- Baret F, Weiss M, Lacaze R, Camacho F, Makhmara H, Pacholcyzk P, Smets B. GEOV1: LAI and FAPAR essential climate variables and FCOVER global time series capitalizing over existing products. Part1: Principles of development and production. Remote Sens Environ. 2013;137:299–309.
- Bayat B, van der Tol C, Verhoef W. Retrieval of land surface properties from an annual time series of Landsat TOA radiances during a drought episode using coupled radiative transfer models. Remote Sens Environ. 2020;238:110917.
- Berger K, Atzberger C, Danner M, D’Urso G, Mauser W, Vuolo F, Hank T. Evaluation of the prosail model capabilities for future hyperspectral model environments: A review study. Remote Sens. 2018;10.
- Berk A, Anderson G, Acharya P, Bernstein L, Muratov L, Lee J, Fox M, Adler-Golden S, Chetwynd J, Hoke M, Lockwood R, et al. MODTRAN™5: 2006 update. 2006.
- Camps-Valls G, Sejdinovic D, Runge J, Reichstein M. A Perspective on Gaussian Processes for Earth Observation. Natl Sci Rev. 2019;6:616–618. doi: 10.1093/nsr/nwz028.
- Camps-Valls G, Verrelst J, Munoz-Mari J, Laparra V, Mateo-Jimenez F, Gomez-Dans J. A Survey on Gaussian Processes for Earth-Observation Data Analysis: A Comprehensive Investigation. IEEE Geosci Remote Sens Magaz. 2016;4:58–78.
- Clerc S team, M. Level 2A Data Quality Report S2 MPC, Technical Report. ESA; 2020.
- Djamai N, Fernandes R. Comparison of SNAP-Derived Sentinel-2A L2A Product to ESA Product over Europe. Remote Sens. 2018;10:926.
- Doxani G, Vermote E, Roger JC, Gascon F, Adriaensen S, Frantz D, Hagolle O, Hollstein A, Kirches G, Li F, Louis J, et al. Atmospheric Correction Inter-Comparison Exercise. Remote Sens. 2018;10:352. doi: 10.3390/rs10020352.
- Drusch M, Del Bello U, Carlier S, Colin O, Fernandez V, Gascon F, Hoersch B, Isola C, Laberinti P, Martimort P, Meygret A, et al. Sentinel-2: ESA’s Optical High-Resolution Mission for GMES Operational Services. Remote Sens Environ. 2012;120:25–36.
- Emde C, Buras-Schnell R, Kylling A, Mayer B, Gasteiger J, Hamann U, Kylling J, Richter B, Pause C, Dowling T, Bugliaro L. The libradtran software package for radiative transfer calculations (version 2.0.1). Geoscient. Model Develop. 2016;9:1647–1672.
- ESA. Sentinel-2 Spectral Response Functions (S2-SRF), v3.0, Ref.: COPE-GSEG-EOPG-TN-15-0007, Technical Report. European Space Agency (ESA); 2018.
- Fang H, Baret F, Plummer S, Schaepman-Strub G. An Overview of Global Leaf Area Index (LAI): Methods, Products, Validation, and Applications. Rev Geophys. 2019;57:739–799.
- Fang H, Liang S. Retrieving leaf area index with a neural network method: Simulation and validation. IEEE Trans Geosci Remote Sens. 2003;41:2052–2062.
- Feret JB, François C, Asner GP, Gitelson AA, Martin RE, Bidel LPR, Ustin SL, le Maire G, Jacquemoud S. PROSPECT-4 and 5: Advances in the leaf optical properties model separating photosynthetic pigments. Remote Sens Environ. 2008;112:3030–3043.
- Fourty T, Baret F. Vegetation water and dry matter contents estimated from top-of-the-atmosphere reflectance data: a simulation study. Remote Sens Environ. 1997;61:34–45.
- Gao BC, Montes MJ, Davis CO, Goetz AF. Atmospheric correction algorithms for hyperspectral remote sensing data of land and ocean. Remote Sens Environ. 2009;113:S17–S24.
- Gascon F, Bouzinac C, Thépaut O, Jung M, Francesconi B, Louis J, Lonjou V, Lafrance B, Massera S, Gaudel-Vacaresse A, Languille F, et al. Copernicus Sentinel-2A Calibration and Products Validation Status. Remote Sens. 2017;9:584.
- GCOS. Systematic Observation Requirements for Satellite-Based Products for Climate, 2011 Update, Supplemental Details to the Satellite-Based Component of the Implementation Plan for the Global Observing System for Climate in Support of the UNFCCC (2010 update, GCOS-154). 2011.
- Jacquemoud S, Baret F, Hanocq JF. Modeling spectral and bidirectional soil reflectance. Remote Sens Environ. 1992;41:123–132.
- Jacquemoud S, Verhoef W, Baret F, Bacour C, Zarco-Tejada P, Asner G, François C, Ustin S. PROSPECT + SAIL models: A review of use for vegetation characterization. Remote Sens Environ. 2009;113:S56–S66.
- Kaufman Y, Sendra C. Algorithm for automatic atmospheric corrections to visible and near-IR satellite imagery. Int J Remote Sens. 1988;9:1357–1381.
- Kokaly R, Clark R, Swayze GA, Livo K, Hoefen T, Pearson N, Wise R, Benzel W, Lowers H, Driscoll R, Klein A. USGS Spectral Library Version 7, volume 1035. U.S. Geological Survey; Reston, VA: 2017.
- Kotchenova S, Vermote E, Matarrese R, Klemm F., Jr Validation of a vector version of the 6S radiative transfer code for atmospheric correction of satellite data. part I: Path radiance. Appl Opt. 2006;45:6762–6774. doi: 10.1364/ao.45.006762.
- Kotchenova SY, Vermote EF. Validation of a vector version of the 6S radiative transfer code for atmospheric correction of satellite data. Part II. Homogeneous Lambertian and anisotropic surfaces. Appl Opt. 2007;46:4455–4464. doi: 10.1364/ao.46.004455.
- Laurent V, Verhoef W, Clevers J, Schaepman M. Estimating forest variables from top-of-atmosphere radiance satellite measurements using coupled radiative transfer models. Remote Sens Environ. 2011a;115:1043–1052.
- Laurent V, Verhoef W, Clevers J, Schaepman M. Inversion of a coupled canopy-atmosphere model using multi-angular top-of-atmosphere radiance data: A forest case study. Remote Sens Environ. 2011b;115:2603–2612.
- Laurent V, Verhoef W, Damm A, Schaepman M, Clevers J. A Bayesian object-based approach for estimating vegetation biophysical and biochemical variables from APEX at-sensor radiance data. Remote Sens Environ. 2013;139:6–17.
- Laurent VC, Schaepman ME, Verhoef W, Weyermann J, Chávez RO. Bayesian object-based estimation of LAI and chlorophyll from a simulated Sentinel-2 top-of-atmosphere radiance image. Remote Sens Environ. 2014;140:318–329.
- Lauvernet C, Baret F, Hascoët L, Buis S, Le Dimet FX. Multitemporal-patch ensemble inversion of coupled surface-atmosphere radiative transfer models for land surface characterization. Remote Sens Environ. 2008;112:851–861.
- Lázaro-Gredilla M, Titsias MK. Variational heteroscedastic Gaussian process regression. ICML; 2011. pp. 841–848.
- Lázaro-Gredilla M, Titsias MK, Verrelst J, Camps-Valls G. Retrieval of biophysical parameters with heteroscedastic Gaussian processes. IEEE Geosci Remote Sens Lett. 2013;11:838–842.
- Lenoble J, Herman M, Deuzé J, Lafrance B, Santer R, Tanré D. A successive order of scattering code for solving the vector equation of transfer in the earth’s atmosphere with aerosols. J Quant Spectrosc Radiat Transf. 2007;107:479–507.
- Li-Cor I. LAI-2000 plant canopy analyzer instruction manual. LI-COR Inc; Lincoln, Nebraska, USA: 1992.
- Louis J, Charantonis A, Berthelot B. Cloud detection for Sentinel-2; Proceedings of ESA Living Planet Symposium, ESA/ESRIN; Bergen, Norway. 2010.
- Main-Knorn M, Pflug B, Louis J, Debaecker V, Mueller-Wilm U, Gascon F. Sen2Cor for Sentinel-2. 2017.
- Malenovský Z, Rott H, Cihlar J, Schaepman M, Garcia JCa, Santos G, Fernandes R, Berger M. Sentinels for science: Potential of Sentinel-1,-2, and -3 missions for scientific observations of ocean, cryosphere, and land. Remote Sensing of Environment. 2012;120:91–101.
- Martins V, Barbosa C, de Carvalho L, Jorge D, Lobo F, Novo E. Assessment of Atmospheric Correction Methods for Sentinel-2 MSI Images Applied to Amazon Floodplain Lakes. Remote Sens. 2017;9:322.
- Mayer B, Kylling A. The libRadtran software package for radiative transfer calculations - description and examples of use. Atmos Chem Phys. 2005;5:1855–1877.
- Meerdink SK, Hook SJ, Roberts DA, Abbott EA. The ecostress spectral library version 1.0. Remote Sens. Environ. 2019;230:111196.
- Mousivand A, Menenti M, Gorte B, Verhoef W. Global sensitivity analysis of the spectral radiance of a soil-vegetation system. Remote Sens Environ. 2014;145:131–144.
- Mousivand A, Menenti M, Gorte B, Verhoef W. Multi-temporal, multi-sensor retrieval of terrestrial vegetation properties from spectral-directional radiometric data. Remote Sens Environ. 2015;158:311–330.
- Müller-Wilm U. Sen2Cor Configuration and User Manual, Technical Report. ESA; 2018.
- Neugebauer N, Vuolo F. Crop Water Requirements on Regional Level using Remote Sensing Data - A Case Study in the Marchfeld Region [Berechnung des Pflanzenwasserbedarfs für Sommerfeldfrüchte mittels Fernerkundungsdaten. Eine Fallstudie in der Marchfeld-Region]. Photogrammetrie-Fernerkundung-Geoinformation. 2014;2014:369–381.
- Pasolli E, Melgani F, Alajlan N, Bazi Y. Active learning methods for biophysical parameter estimation. IEEE Trans Geosci Remote Sens. 2012;50:4071–4084.
- Prikaziuk E, van der Tol C. Global Sensitivity Analysis of the SCOPE Model in Sentinel-3 Bands: Thermal Domain Focus. Remote Sensing. 2019;11:2424.
- Rasmussen CE, Williams CKI. Gaussian Processes for Machine Learning. The MIT Press; New York: 2006.
- Richter R, Louis J, Niezette M, Quality CD, Manager A. Sentinel-2 MSI-Level 2A Products Algorithm Theoretical Basis Document, Technical Report. 2011a.
- Richter R, Wang X, Bachmann M, Schläpfer D. Correction of cirrus effects in Sentinel-2 type of imagery. Int J Remote Sens. 2011b;32:2931–2941.
- Rivera J, Verrelst J, Leonenko G, Moreno J. Multiple cost functions and regularization options for improved retrieval of leaf chlorophyll content and LAI through inversion of the PROSAIL model. Remote Sens. 2013;5:3280–3304.
- Rivera Caicedo J, Verrelst J, Muñoz-Marí J, Moreno J, Camps-Valls G. Toward a semiautomatic machine learning retrieval of biophysical parameters. IEEE J Select Top Appl Earth Observ Remote Sens. 2014;7:1249–1259.
- Schläpfer D, Borel C, Keller J, Itten K. Atmospheric Precorrected Differential Absorption Technique to Retrieve Columnar Water Vapor. Remote Sens Environ. 1998;65:353–366.
- Shi H, Xiao Z, Liang S, Ma H. A method for consistent estimation of multiple land surface parameters from MODIS top-of-atmosphere time series data. IEEE Trans Geosci Remote Sens. 2017;55:5158–5173.
- Shi H, Xiao Z, Liang S, Zhang X. Consistent estimation of multiple parameters from MODIS top of atmosphere reflectance data using a coupled soil-canopy-atmosphere radiative transfer model. Remote Sens Environ. 2016;184:40–57.
- Sola I, García-Martín A, Sandonís-Pozo L, Álvarez-Mozos J, Pérez-Cabello F, González-Audícana M, Montorio Llovería R. Assessment of atmospheric correction methods for Sentinel-2 images in Mediterranean landscapes. Int J Appl Earth Obs Geoinf. 2018;73:63–76.
- Suykens J, Vandewalle J. Least squares support vector machine classifiers. Neural Process Lett. 1999;9:293–300.
- Svendsen D, Martino L, Camps-Valls G. Active Emulation of Computer Codes with Gaussian Processes - Application to Remote Sensing. Pattern Recogn. 2020;100:1–12.
- Thompson D, Guanter L, Berk A, Gao BC, Richter R, Schläpfer D, Thome K. Retrieval of atmospheric parameters and surface reflectance from visible and shortwave infrared imaging spectroscopy data. Surv Geophys. 2019;40:333–360.
- Thuillier G, Hersé M, Foujols T, Peetermans W, Gillotay D, Simon P, Mandel H, et al. The solar spectral irradiance from 200 to 2400 nm as measured by the SOLSPEC spectrometer from the ATLAS and EURECA missions. Sol Phys. 2003;214:1–22.
- Upreti D, Huang W, Kong W, Pascucci S, Pignatti S, Zhou X, Ye H, Casa R. A Comparison of Hybrid Machine Learning Algorithms for the Retrieval of Wheat Biophysical Variables from Sentinel-2. Remote Sens. 2019;11:481.
- Verger A, Baret F, Camacho F. Optimal modalities for radiative transfer-neural network estimation of canopy biophysical characteristics: Evaluation over an agricultural area with CHRIS/PROBA observations. Remote Sens Environ. 2011;115:415–426.
- Verhoef W. Light scattering by leaf layers with application to canopy reflectance modeling: The SAIL model. Remote Sens Environ. 1984;16:125–141.
- Verhoef W, Bach H. Simulation of hyperspectral and directional radiance images using coupled biophysical and atmospheric radiative transfer models. Remote Sens Environ. 2003;87:23–41.
- Verhoef W, Bach H. Coupled soil-leaf-canopy and atmosphere radiative transfer modeling to simulate hyperspectral multi-angular surface reflectance and TOA radiance data. Remote Sens Environ. 2007;109:166–182.
- Vermote E, Tanré D, Deuzé J, Herman M, Morcrette JJ. Second simulation of the satellite signal in the solar spectrum, 6S: an overview. IEEE Trans Geosci Remote Sens. 1997;35:675–686.
- Verrelst J, Alonso L, Camps-Valls G, Delegido J, Moreno J. Retrieval of vegetation biophysical parameters using Gaussian process techniques. IEEE Trans Geosci Remote Sens. 2012a;50:1832–1843.
- Verrelst J, Muñoz J, Alonso L, Delegido J, Rivera J, Camps-Valls G, Moreno J. Machine learning regression algorithms for biophysical parameter retrieval: Opportunities for Sentinel-2 and-3. Remote Sens. Environ. 2012b;118:127–139.
- Verrelst J, Romijn E, Kooistra L. Mapping vegetation density in a heterogeneous river floodplain ecosystem using pointable CHRIS/PROBA data. Remote Sens. 2012c;4:2866–2889.
- Verrelst J, Alonso L, Rivera Caicedo J, Moreno J, Camps-Valls G. Gaussian process retrieval of chlorophyll content from imaging spectroscopy data. IEEE J Select Top Appl Earth Observ Remote Sens. 2013a;6:867–874.
- Verrelst J, Rivera J, Moreno J, Camps-Valls G. Gaussian processes uncertainty estimates in experimental Sentinel-2 LAI and leaf chlorophyll content retrieval. ISPRS J Photogram Remote Sens. 2013b;86:157–167.
- Verrelst J, Rivera J, Leonenko G, Alonso L, Moreno J. Optimizing LUT-Based RTM Inversion for Semiautomatic Mapping of Crop Biophysical Parameters from Sentinel-2 and-3 Data: Role of Cost Functions. IEEE Trans Geosci Remote Sens. 2014;52(1):257–269.
- Verrelst J, Camps Valls G, Muñoz Marí J, Rivera J, Veroustraete F, Clevers J, Moreno J. Optical remote sensing and the retrieval of terrestrial vegetation bio-geophysical properties - A review. ISPRS J Photogram Remote Sens. 2015a;108:273–290.
- Verrelst J, Rivera JP, Veroustraete F, Muñoz-Marí J, Clevers JG, Camps-Valls G, Moreno J. Experimental Sentinel-2 LAI estimation using parametric, nonparametric and physical retrieval methods - A comparison. ISPRS J Photogram Remote Sens. 2015b;108:260–272.
- Verrelst J, Dethier S, Rivera JP, Munoz-Mari J, Camps-Valls G, Moreno J. Active Learning Methods for Efficient Hybrid Biophysical Variable Retrieval. IEEE Geosci Remote Sens Lett. 2016a;13:1012–1016.
- Verrelst J, Rivera JP, Gitelson A, Delegido J, Moreno J, Camps-Valls G. Spectral band selection for vegetation properties retrieval using gaussian processes regression. Int J Appl Earth Obs Geoinf. 2016b;52:554–567.
- Vicent J, Verrelst J, Rivera-Caicedo J, Sabater N, Munoz-Mari J, Camps-Valls G, Moreno J. Emulation as an accurate alternative to interpolation in sampling radiative transfer codes. IEEE J Select Top Appl Earth Observ Remote Sens. 2018;11:4918–4931. doi: 10.1109/jstars.2018.2875330.
- Verrelst J, Malenovský Z, van der Tol C, Camps-Valls G, Gastellu-Etchegorry JP, Lewis P, North P, Moreno J. Quantifying vegetation biophysical variables from imaging spectroscopy data: A review on retrieval methods. Surv Geophys. 2019a:1–41. doi: 10.1007/s10712-018-9478-y.
- Verrelst J, Vicent J, Rivera-Caicedo JP, Lumbierres M, Morcillo-Pallarés P, Moreno J. Global Sensitivity Analysis of Leaf-Canopy-Atmosphere RTMs: Implications for Biophysical Variables Retrieval from Top-of-Atmosphere Radiance Data. Remote Sens. 2019b;11. doi: 10.3390/rs11161923.
- Vicent J, Verrelst J, Sabater N, Alonso L, Rivera-Caicedo JP, Martino L, Muñoz-Marí J, Moreno J. Comparative analysis of atmospheric radiative transfer models using the atmospheric look-up table generator (alg) toolbox (version 2.0). Geoscientific Model Development. 2020;13. doi: 10.5194/gmd-13-1945-2020.
- Vuolo F, Neuwirth M, Immitzer M, Atzberger C, Ng WT. How much does multi-temporal Sentinel-2 data improve crop type classification? Int J Appl Earth Observ Geoinform. 2018;72:122–130.
- Vuolo F, Żółtak M, Pipitone C, Zappa L, Wenng H, Immitzer M, Weiss M, Baret F, Atzberger C. Data Service Platform for Sentinel-2 Surface Reflectance and Value-Added Products: System Use and Examples. Remote Sens. 2016;8.
- Weidong L, Baret F, Xingfa G, Qingxi T, Lanfen Z, Bing Z. Relating soil surface moisture to reflectance. Remote Sens Environ. 2002;81:238–246.
- Weiss M, Baret F. S2ToolBox Level 2 products: LAI, FAPAR, FCOVER, Version 1 1, in: ESA Contract nr 4000110612/14/I-BG. INRA Avignon; France: 2016. p. 52.
- Weiss M, Jacob F, Duveiller G. Remote sensing for agricultural applications: A meta-review. Remote Sens Environ. 2020;236:111402.
- Yan G, Hu R, Luo J, Weiss M, Jiang H, Mu X, Xie D, Zhang W. Review of indirect optical measurements of leaf area index: Recent advances, challenges, and perspectives. Agric Forest Meteorol. 2019;265:390–411.
- Yang P, van der Tol C, Yin T, Verhoef W. The SPART model: A soil-plant-atmosphere radiative transfer model for satellite measurements in the solar spectrum. Remote Sens Environ. 2020;247:111870.









