Three-phase hierarchical model-based and hybrid inference

Svetlana Saarela; Petri Varvia; Lauri Korhonen; Zhiqiang Yang; Paul L Patterson; Terje Gobakken; Erik Næsset; Sean P Healey; Göran Ståhl

doi:10.1016/j.mex.2023.102321

2023 Aug 6;11:102321.

PMCID: PMC10448159 PMID: 37637291

Abstract

Global commitments to mitigating climate change and halting biodiversity loss require reliable information about Earth's ecosystems. Increasingly, such information is obtained from multiple sources of remotely sensed data combined with data acquired in the field. This new wealth of data poses challenges regarding the combination of different data sources to derive the required information and assess uncertainties. In this article, we show how predictors and their variances can be derived when hierarchically nested models are applied. Previous studies have developed methods for cases involving two modeling steps, such as biomass prediction relying on tree-level allometric models and models linking plot-level field data with remotely sensed data. This study extends the analysis to cases involving three modeling steps to cover new important applications. The additional step might involve an intermediate model, linking field and remotely sensed data available from a small sample, for making predictions that are subsequently used for training a final prediction model based on remotely sensed data:

• In cases where the data in the final step are available wall-to-wall, we denote the approach three-phase hierarchical model-based inference (3pHMB),
• In cases where the data in the final step are available as a probability sample, we denote the approach three-phase hierarchical hybrid inference (3pHHY).

Keywords: Forest resources assessment, Remotely sensed data, Statistical inference, Superpopulation-based inference

Method name: Three-phase Hierarchical Model-based and Hybrid Inference

Graphical abstract

Specifications table

Subject area:	Environmental Science
More specific subject area:	Statistical Inference based on Earth Observation data
Name of your method:	Three-phase Hierarchical Model-based and Hybrid Inference
Name and reference of original method:	N.A.
Resource availability:	3pHMB_3pHHY_MC-based_valid.Rmd

Method details

Background

Assessment of state and change of forest biomass has become important worldwide due to the large carbon dioxide fluxes incurred by deforestation and afforestation, as well as by tree growth and harvesting in managed forests (e.g., [1]). Intergovernmental Panel on Climate Change (IPCC) guidance specifies that parties must report changes in biomass carbon stocks as well as uncertainties of the reported figures. Since large parts of the world's forests are inaccessible, applying remotely sensed (RS) data is an important option for such assessments (ibid.). However, the accuracies of the assessments should be quantified using objective and consistent methods (e.g., [2]).

A standard model-based approach to forest biomass assessment is to specify and estimate a model that links biomass measured in the field with RS metrics. The estimated model is then applied across the forest area of interest, whereby biomass predictions and their uncertainties can be obtained utilizing standard methods of model-based inference (e.g., [3]). Uncertainties arise since a model cannot perfectly predict the actual biomass conditions. In some cases, several models are needed to make biomass assessments feasible. For example, models are often applied to predict the biomass of individual trees in the field from measurements of their diameters and heights. Subsequently, these predicted biomasses are used for developing models linking field biomass at the level of sample plots with RS metrics. In this case, the overall uncertainty arises from two modeling steps (e.g., [4]).

Remote sensing methods that utilize laser measurements have revolutionized forest resource assessments [5], [6], [7]. These methods provide information about the 3D structure of forests, which is closely linked to biomass. However, laser measurements are expensive, and thus in many cases they cannot be conducted wall-to-wall but only through sample coverage [8]. Important examples include the strip samples of laser measurements conducted in Alaska [9], and the samples of laser footprints from NASA's Global Ecosystem Dynamics Investigation (GEDI; [1,10]) and the Ice, Cloud, and land Elevation Satellites 1 and 2 (ICESat-1 and -2; [11], [12], [13], [14], [15], [16], [17]). For many of these applications, existing methods for assessing uncertainties using frameworks such as model-assisted estimation (e.g., [9]), hybrid inference (e.g., [1,11]), and hierarchical model-based inference (e.g., [10]) have been sufficient.

Sometimes the available data structure requires additional levels in the hierarchy. New methods that utilize several RS-based models in hierarchical structures call for further development of methods to assess uncertainties. An example application involves modeling plot-level biomass from field measurements, linking these with laser measurements through a second model, and utilizing laser-based predictions of biomass at sample locations as substitutes for field data in estimating a third model for biomass prediction based on RS data available wall-to-wall, such as data from the Landsat satellite [18]. In this case, three hierarchically nested modeling steps contribute to the overall uncertainty of the biomass prediction for the study area. Ideally, the different datasets involved should stem from about the same time point (Hou et al. [19]).

Note the difference between this type of hierarchically nested models and the hierarchy of data and models applied in standard multi-level modeling (e.g., [20]). In our case, the target response variable in the first modeling step is replaced by model predictions of the response variable in the subsequent modeling steps. In standard multi-level modeling, the response variable does not change, and multi-level modeling (also called mixed-effects modeling) refers to handling groups of observations, within which the observations are dependent.

This article aims to develop and demonstrate predictors and methods for uncertainty assessment for model-based inference involving three hierarchically nested modeling steps. If data in the final step are available wall-to-wall, we denote the method three-phase hierarchical model-based inference (3pHMB). If data in the last step are available from a probability sample, we denote the method three-phase hierarchical hybrid inference (3pHHY). The methods are similar in the first three steps, and differ only in the last step (see the Graphical Abstract).

Overview

In the following sections, we first derive and present the formulas needed for applying the 3pHMB and 3pHHY frameworks. Secondly, we provide further theoretical support for the approach through an analysis based on a multivariate superpopulation model; in this section, we also demonstrate important consequences of multi-stage hierarchical modeling. Lastly, we validate our theoretical results through Monte Carlo simulation.

Details of three-phase hierarchical model-based inference (3pHMB)

We base our description of 3pHMB inference on an example from forest inventory aiming at assessing the biomass for some large study area. We define our population as the grid-cells that tessellate the study area into $N$ units. The size of the grid-cells corresponds to the size of plots utilized in field inventories and to the size of pixels for which RS data can be retrieved. For each grid-cell RS data are available, e.g., multispectral optical satellite data from the Landsat satellite (e.g., [21,22]). With the 3pHMB method, wall-to-wall RS data are used for predicting the biomass in each grid-cell and through averaging across the study area we predict the biomass density. All other data sources, described below, are utilized for providing data for estimating the parameters of this prediction model, which we denote the Landsat model.

When the 3pHMB method is applied, field data are not adequate for immediately estimating the parameters of the Landsat model, but “pseudo-field” biomass data are available from a sample of accurate predictions based on, e.g., airborne laser scanning (ALS) data. This sample dataset is used for estimating the Landsat model parameters. However, to make the ALS-based predictions, the parameters of the ALS model must also be estimated. This model is estimated from a small sample of field biomass assessments with corresponding ALS data.

The biomasses of trees in the field also need to be predicted from models (often called allometric biomass models, e.g., [23]), since direct measurement is expensive and destroys the resource being monitored. Thus, biomass models are typically estimated from small samples of trees, which are cut down and carefully weighed during dedicated research studies (e.g., [24,25]). When applying the 3pHMB method we often aggregate model predictions of tree-level biomass to plot level, e.g., using the method described in Saarela et al. ([4], pp. 11–12, Section “Aggregation of tree-level AGB predictions to plot level”) and in Varvia et al. ([26], Appendix A.1). In this article, we do not address the details of this aggregation but assume that plot-level biomass is either available from aggregation of tree-level predictions (based on field data) or directly predicted from plot-level measurements in the field of, e.g., basal area and mean height.

Thus, our study area of interest (AOI) is tessellated into $N$ grid-cells that constitute the population elements. We assume that the objective is to predict the population mean, ${\bar{y}}_{U}$ , of grid-cell level aboveground biomass density (AGBD), defined as

{\bar{y}}_{U} = \frac{1}{N} \sum_{i = 1}^{N} y_{i},

where $U$ denotes the set of elements in the AOI. The RS data available for this set (e.g., Landsat data) are denoted $P_{U}$, which is a matrix with $N$ observations of $t$ explanatory variables (plus a column of 1's for a model intercept). The dataset available for estimating the prediction model used in the final stage is denoted $S_{I I I}$. It has $n_{I I I}$ sample elements and contains data of the kind described above (denoted $P_{I I I}$ for this set) and intermediate RS data (e.g., from ALS) denoted $Z_{I I I}$, which is a matrix with $n_{I I I}$ observations of $q$ explanatory variables (plus a column of 1's). Further, the dataset $S_{I I}$ has $n_{I I}$ elements and data in the datasets $Z_{I I}$ (as above, but with $n_{I I}$ observations) and $X_{I I}$, which is a matrix with $n_{I I}$ observations of $h$ variables (plus a column of 1's). The latter data are field measurements (such as basal area, and plot averages of tree diameters and tree heights). Finally, the dataset $S_{I}$ has $n_{I}$ observations. It contains field measurements in $X_{I}$ (as above, but with $n_{I}$ observations) and data from actual measurements of AGBD, in the vector $y_{I}$.

For deriving a formula for the variance of the 3pHMB method, we start by defining a univariate linear model F, which links AGBD and field measurements at the grid-cell level, using data from $S_{I}$ :

F : Y_{i} = x_{i} β + ϵ_{i}, ϵ_{i} \sim N (0, ω^{2}),

(1)

where $Y_{i}$ is the variable of interest (AGBD), $x_{i}$ is a $(h + 1)$-length row vector of explanatory variables (field measurements) with a unit term for the intercept, $β$ is a $(h + 1)$-length column vector of model parameters to be estimated, and $ϵ_{i}$ is a variability (a.k.a. error) term which we assume follows a normal distribution with zero mean and constant variance $ω^{2}$.

The model parameters are estimated using the ordinary least squares (OLS) estimator (e.g., [27]):

\hat{β} = {(X_{I}^{T} X_{I})}^{- 1} X_{I}^{T} y_{I}

The estimated model is then used to predict the target variable (AGBD) for the elements in $S_{I I}$ :

{\hat{y}}_{F} = X_{I I} \hat{β} .

The index $F$ is used to denote that the AGBD predictions in $S_{I I}$ are based on the model F (Eq. (1)). The values ${\hat{y}}_{F}$ are predictions of AGBD based on field data, which are subsequently used for estimating the parameters of the model predicting AGBD from ALS data. The ALS model is thus the next model to be addressed in the model hierarchy.
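The first modeling step can be illustrated with a minimal sketch in Python (NumPy). All data, sample sizes, and parameter values below are simulated assumptions chosen only for illustration, and the helper `with_intercept` is hypothetical; the sketch merely mirrors the OLS estimator and the prediction ${\hat{y}}_{F} = X_{I I} \hat{β}$ given above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: n_I plots with measured AGBD, n_II plots with field data only.
n_I, n_II, h = 50, 200, 2

def with_intercept(m):
    """Prepend a column of 1's for the model intercept."""
    return np.column_stack([np.ones(len(m)), m])

# Simulated field measurements (e.g., basal area, mean height); values are made up.
X_I = with_intercept(rng.uniform(5.0, 40.0, size=(n_I, h)))
X_II = with_intercept(rng.uniform(5.0, 40.0, size=(n_II, h)))

beta_true = np.array([10.0, 3.0, 1.5])  # assumed, used for simulation only
omega = 8.0                             # assumed residual standard deviation
y_I = X_I @ beta_true + rng.normal(0.0, omega, size=n_I)

# OLS estimator of model F: beta_hat = (X_I^T X_I)^{-1} X_I^T y_I.
beta_hat = np.linalg.solve(X_I.T @ X_I, X_I.T @ y_I)

# Predictions of AGBD for the elements in S_II: y_hat_F = X_II beta_hat.
y_hat_F = X_II @ beta_hat

print(beta_hat.shape, y_hat_F.shape)
```

Solving the normal equations with `np.linalg.solve` avoids forming the explicit inverse; the result agrees with the estimator written above.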

To support the construction of the ALS model, we specify a multivariate model, G $^{*}$ , which links the $(h + 1)$ -multivariate response variable of field measurements $X_{i}$ and a $(q + 1)$ -length row vector $z_{i}$ of ALS explanatory variables:

G^{*} : X_{i} = z_{i} A^{*} + υ_{i}^{*}, υ_{i}^{*} \sim N (0, Δ^{*}),

where $A^{*}$ is a $((q + 1) \times (h + 1))$-matrix of model parameters to be estimated, and $υ_{i}^{*}$ is a $(h + 1)$-length vector of variability terms, which are assumed to be independent and normally distributed with zero mean and variance-covariance matrix $Δ^{*}$ of size $((h + 1) \times (h + 1))$.

By multiplying each model component in G $^{*}$ by $β$ we obtain the univariate model ([28], Appendix A.3)

X_{i} β = z_{i} A^{*} β + υ_{i}^{*} β, υ_{i}^{*} β \sim N (0, β^{T} Δ^{*} β),

Denoting $A^{*} β$ as $α$ , $υ_{i}^{*} β$ as $υ_{i}$ , and $β^{T} Δ^{*} β$ as $δ^{2}$ , we obtain the following model, which we denote as G:

G : X_{i} β = z_{i} α + υ_{i}, υ_{i} \sim N (0, δ^{2}) .

(2)

Here, $z_{i}$ is a $(q + 1)$-length row vector of ALS explanatory variables with a unit term for the intercept and $α$ is a $(q + 1)$-length column vector of model parameters to be estimated. This model links the expectation of AGBD, due to the model F, with ALS explanatory variables.

The predictions ${\hat{y}}_{F}$ are used as the response variables in the model G to estimate its model parameters, $α$ , based on the explanatory variables $Z_{I I}$ from $S_{I I}$ ([28], Appendix A.3):

\hat{α} = {(Z_{I I}^{T} Z_{I I})}^{- 1} Z_{I I}^{T} {\hat{y}}_{F} = {(Z_{I I}^{T} Z_{I I})}^{- 1} Z_{I I}^{T} X_{I I} \hat{β} = \hat{A^{*}} \hat{β} .

Here, $\hat{A^{*}} = {(Z_{I I}^{T} Z_{I I})}^{- 1} Z_{I I}^{T} X_{I I}$ are estimated model parameters from the multivariate model G $^{*}$ .

At this point, we have utilized the actual measurements of AGBD from $S_{I}$ to predict AGBD for all elements in $S_{I I}$ based on field measurement data. The AGBD predictions for $S_{I I}$ have subsequently been used for training an ALS model (the G model), through which we now can predict AGBD for $S_{I I I}$ :

{\hat{y}}_{G} = Z_{I I I} \hat{α} .

The predicted values ${\hat{y}}_{G}$ are used for training the model to be applied in the last stage, based on wall-to-wall RS data, i.e., the Landsat model.

In support of constructing the Landsat model, we introduce a second multivariate model, which links the $(q + 1)$ -multivariate variable $Z_{i}$ (ALS variables) as a multivariate response variable with a $(t + 1)$ -length row vector of Landsat explanatory variables $p_{i}$ ; the model is denoted Q $^{*}$ :

Q^{*} : Z_{i} = p_{i} Γ^{*} + e_{i}^{*}, e_{i}^{*} \sim N (0, Θ^{*}),

where $Γ^{*}$ is a $((t + 1) \times (q + 1))$-matrix of model parameters, and $e_{i}^{*}$ is a $(q + 1)$-length vector of variability terms, which are assumed to be independent and normally distributed with zero mean and variance-covariance matrix $Θ^{*}$ of size $((q + 1) \times (q + 1))$.

By multiplying model components in Q $^{*}$ with the model coefficients from G, we obtain the following univariate model

Z_{i} α = p_{i} Γ^{*} α + e_{i}^{*} α, e_{i}^{*} α \sim N (0, α^{T} Θ^{*} α) .

Denoting $Γ^{*} α$ as $γ$ , $e_{i}^{*} α$ as $e_{i}$ and $α^{T} Θ^{*} α$ as $θ^{2}$ we obtain the following model, which we denote as Q:

Q : Z_{i} α = p_{i} γ + e_{i}, e_{i} \sim N (0, θ^{2}),

(3)

where $γ$ is a $(t + 1)$-length vector of model coefficients to be estimated.

The ALS-based predictions of AGBD, ${\hat{y}}_{G}$ , are used for estimating the model parameters, $γ$ , using information on explanatory Landsat variables $P_{I I I}$ from $S_{I I I}$ :

\hat{γ} = {(P_{I I I}^{T} P_{I I I})}^{- 1} P_{I I I}^{T} {\hat{y}}_{G} = {(P_{I I I}^{T} P_{I I I})}^{- 1} P_{I I I}^{T} Z_{I I I} \hat{α} = {(P_{I I I}^{T} P_{I I I})}^{- 1} P_{I I I}^{T} Z_{I I I} \hat{A^{*}} \hat{β} = \hat{Γ^{*}} \hat{A^{*}} \hat{β},

where $\hat{Γ^{*}} = {(P_{I I I}^{T} P_{I I I})}^{- 1} P_{I I I}^{T} Z_{I I I}$ are estimated model parameters from the model Q $^{*}$ .

Finally, the estimated Q model is used to predict the AGBD across the entire AOI, i.e., for all the elements in $U$ :

{\hat{y}}_{U} = P_{U} \hat{γ} .

The 3pHMB population mean predictor is then:

{\hat{\bar{y}}}_{U_{3 p H M B}} = \frac{1}{N} \sum_{i = 1}^{N} p_{i} \hat{γ} = {\bar{p}}_{U} \hat{γ},

(4)

where ${\bar{p}}_{U}$ is the $(t + 1)$-length row vector of Landsat explanatory variable averages over the AOI.
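The complete prediction chain leading to Eq. (4) can be sketched as follows. This is an illustration on simulated data, not the authors' implementation: the data-generating function `simulate`, the helpers `ones_col` and `ols`, and all sizes and coefficients are assumptions. The sketch also checks numerically that the stepwise estimate $\hat{γ}$ coincides with the product $\hat{Γ^{*}} \hat{A^{*}} \hat{β}$.

```python
import numpy as np

rng = np.random.default_rng(1)

def ones_col(m):
    """Prepend a column of 1's for the intercept."""
    return np.column_stack([np.ones(len(m)), m])

def ols(A, b):
    """OLS coefficients (A^T A)^{-1} A^T b; b may be a vector or a matrix."""
    return np.linalg.solve(A.T @ A, A.T @ b)

def simulate(n):
    """Made-up data chain: Landsat p -> ALS z -> field x -> AGBD y."""
    p = rng.normal(size=(n, 2))
    z = 1.0 + p @ np.array([[1.0, 0.3], [0.2, 0.8]]) + rng.normal(scale=0.5, size=(n, 2))
    x = 20.0 + z @ np.array([4.0, 2.0]) + rng.normal(scale=1.0, size=n)
    y = 5.0 + 3.0 * x + rng.normal(scale=4.0, size=n)
    return ones_col(p), ones_col(z), ones_col(x), y

# Independent datasets S_I, S_II, S_III and the population U (assumed sizes).
_, _, X_I, y_I = simulate(40)
_, Z_II, X_II, _ = simulate(150)
P_III, Z_III, _, _ = simulate(600)
P_U, _, _, _ = simulate(5000)

beta_hat = ols(X_I, y_I)            # model F, trained on S_I
y_hat_F = X_II @ beta_hat           # F applied to S_II
alpha_hat = ols(Z_II, y_hat_F)      # model G, trained on S_II
y_hat_G = Z_III @ alpha_hat         # G applied to S_III
gamma_hat = ols(P_III, y_hat_G)     # model Q, trained on S_III

# 3pHMB population mean predictor, Eq. (4).
p_bar_U = P_U.mean(axis=0)
mean_3pHMB = p_bar_U @ gamma_hat

# Same coefficients via the multivariate models G* and Q*.
A_star_hat = ols(Z_II, X_II)
Gamma_star_hat = ols(P_III, Z_III)
gamma_check = Gamma_star_hat @ A_star_hat @ beta_hat

print(mean_3pHMB, np.allclose(gamma_hat, gamma_check))
```

Averaging the grid-cell predictions `P_U @ gamma_hat` gives the same value as `p_bar_U @ gamma_hat`, which is why only the vector of Landsat averages is needed in Eq. (4).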

Table 1 summarizes the description of the datasets and the estimation steps. In the table, the datasets used in the analyses are listed with descriptions of available variables and whether they are used for model training or model application.

Table 1.

A summary of datasets and estimation steps involved in 3pHMB prediction.

Dataset	Available information	Model training steps	Model application step
The field dataset $S_{I}$	AGBD and field measurements	A model linking AGBD and field measurements (model F)	–
The ALS dataset $S_{I I}$	Field measurements and ALS variables	An ALS model, used to predict AGBD from ALS variables (model G)	The model F is applied on the $S_{I I}$ dataset
The Landsat dataset $S_{I I I}$	ALS and Landsat variables	A Landsat model, used to predict AGBD from Landsat variables (model Q)	The model G is applied on the $S_{I I I}$ dataset
The AOI, $U$	Landsat data	–	The model Q is applied to the entire population

Fig. 1 gives a graphical overview of 3pHMB prediction.

The 3pHMB predictor variance and its estimator

Applying 3pHMB solely to make predictions across an AOI is fairly straightforward and does not require all the model formalism from the previous section. What is required is a procedure where the AGBD is first measured on the elements in $S_{I}$, then predicted in a stepwise manner for the elements in $S_{I I}$, $S_{I I I}$, and $U$. However, the formalism introduced in the previous section is needed for assessing the uncertainty associated with predicting the AGBD for the AOI. In this section, we build on the previous section and derive a formula for the variance of the predictor of AGBD, and its corresponding estimator.

For deriving the 3pHMB predictor variance, we begin with its expectation:

E [{\hat{\bar{y}}}_{U_{3 p H M B}}] = E [{\bar{p}}_{U} \hat{γ}] = {\bar{p}}_{U} E [\hat{γ}] .

Since independent datasets (normally) are used for the three modeling steps, the estimated parameters in each step are independent and thus

E [\hat{γ}] = E [\hat{Γ^{*}} \hat{A^{*}} \hat{β}] = Γ^{*} A^{*} β = γ .

Thus, the expectation of the predictor of the population mean is

E [{\hat{\bar{y}}}_{U_{3 p H M B}}] = {\bar{p}}_{U} γ .

Following the definition, the variance of ${\hat{\bar{y}}}_{U_{3 p H M B}}$ is

\begin{aligned}
V ({\hat{\bar{y}}}_{U3pHMB}) & = E [{({\hat{\bar{y}}}_{U3pHMB} - E [{\hat{\bar{y}}}_{U3pHMB}])}^{2}] \\
& = E [{({\bar{p}}_{U} \hat{γ} - E [{\bar{p}}_{U} \hat{γ}])}^{2}] \\
& = {\bar{p}}_{U} E [{(\hat{γ} - E [\hat{γ}])}^{2}] {\bar{p}}_{U}^{T} \\
& = {\bar{p}}_{U} Cov (\hat{γ}) {\bar{p}}_{U}^{T} .
\end{aligned}

(5)

It can be noted that the covariance of the estimated model parameters is at the core of this expression. Since several modeling steps have been involved in the estimation of the model parameters $\hat{γ}$ , the covariance can be decomposed into the following terms using the law of total covariance [29]:

\begin{aligned}
Cov (\hat{γ}) & = Cov (\hat{Γ^{*}} \hat{α}) \\
& = E_{G} [Cov_{Q^{*}} (\hat{Γ^{*}} \hat{α})] + Cov_{G} (E_{Q^{*}} [\hat{Γ^{*}} \hat{α}]) \\
& = E_{G} [{\hat{α}}^{T} Cov_{Q^{*}} (\hat{Γ^{*}}) \hat{α}] + Γ^{*} Cov_{G} (\hat{α}) Γ^{*^{T}} \\
& = α^{T} Cov_{Q^{*}} (\hat{Γ^{*}}) α + Γ^{*} Cov_{G} (\hat{α}) Γ^{*^{T}} + Tr [Cov_{Q^{*}} (\hat{Γ^{*}}) Cov_{G} (\hat{α})],
\end{aligned}

(6)

where the step from the second last to the last line is obtained from

\begin{aligned}
E_{G} [{\hat{α}}^{T} Cov_{Q^{*}} (\hat{Γ^{*}}) \hat{α}] & = E_{G} [\sum_{i = 1}^{(q + 1)} \sum_{j = 1}^{(q + 1)} {\hat{α}}_{i} Cov_{Q^{*}} ({\hat{γ}}_{i}, {\hat{γ}}_{j}) {\hat{α}}_{j}] \\
& = \sum_{i = 1}^{(q + 1)} \sum_{j = 1}^{(q + 1)} Cov_{Q^{*}} ({\hat{γ}}_{i}, {\hat{γ}}_{j}) E_{G} [{\hat{α}}_{i} {\hat{α}}_{j}] \\
& = \sum_{i = 1}^{(q + 1)} \sum_{j = 1}^{(q + 1)} Cov_{Q^{*}} ({\hat{γ}}_{i}, {\hat{γ}}_{j}) (α_{i} α_{j} + Cov ({\hat{α}}_{i}, {\hat{α}}_{j})) \\
& = \sum_{i = 1}^{(q + 1)} \sum_{j = 1}^{(q + 1)} Cov_{Q^{*}} ({\hat{γ}}_{i}, {\hat{γ}}_{j}) α_{i} α_{j} + \sum_{i = 1}^{(q + 1)} \sum_{j = 1}^{(q + 1)} Cov_{Q^{*}} ({\hat{γ}}_{i}, {\hat{γ}}_{j}) Cov ({\hat{α}}_{i}, {\hat{α}}_{j}) \\
& = α^{T} Cov_{Q^{*}} (\hat{Γ^{*}}) α + Tr [Cov_{Q^{*}} (\hat{Γ^{*}}) Cov_{G} (\hat{α})]
\end{aligned}

However, $Cov_{G} (\hat{α})$ in Eq. (6) emanates from several interacting models. To present the variance formula in a format that later allows estimation, $Cov_{G} (\hat{α})$ is further decomposed using the law of total covariance:

\begin{aligned}
Cov_{G} (\hat{α}) & = Cov_{G} (\hat{A^{*}} \hat{β}) \\
& = E_{F} [Cov_{G^{*}} (\hat{A^{*}} \hat{β})] + Cov_{F} (E_{G^{*}} [\hat{A^{*}} \hat{β}]) \\
& = E_{F} [{\hat{β}}^{T} Cov_{G^{*}} (\hat{A^{*}}) \hat{β}] + A^{*} Cov_{F} (\hat{β}) A^{*^{T}} \\
& = β^{T} Cov_{G^{*}} (\hat{A^{*}}) β + A^{*} Cov_{F} (\hat{β}) A^{*^{T}} + Tr [Cov_{G^{*}} (\hat{A^{*}}) Cov_{F} (\hat{β})],
\end{aligned}

(7)

where the third term ($Tr [Cov_{G^{*}} (\hat{A^{*}}) Cov_{F} (\hat{β})]$) follows from $E_{F} [\hat{β} {\hat{β}}^{T}] = β β^{T} + Cov (\hat{β})$.

Thus, the variance of the AGBD predictor can be expressed as:

\begin{aligned}
V ({\hat{\bar{y}}}_{U_{3pHMB}}) = {} & {\bar{p}}_{U} (α^{T} Cov_{Q^{*}} (\hat{Γ^{*}}) α) {\bar{p}}_{U}^{T} \\
& + {\bar{p}}_{U} (Γ^{*} β^{T} Cov_{G^{*}} (\hat{A^{*}}) β Γ^{*^{T}}) {\bar{p}}_{U}^{T} \\
& + {\bar{p}}_{U} (Γ^{*} A^{*} Cov_{F} (\hat{β}) A^{*^{T}} Γ^{*^{T}}) {\bar{p}}_{U}^{T} \\
& + {\bar{p}}_{U} (Γ^{*} Tr [Cov_{G^{*}} (\hat{A^{*}}) Cov_{F} (\hat{β})] Γ^{*^{T}}) {\bar{p}}_{U}^{T} \\
& + {\bar{p}}_{U} (Tr [Cov_{Q^{*}} (\hat{Γ^{*}}) (β^{T} Cov_{G^{*}} (\hat{A^{*}}) β + A^{*} Cov_{F} (\hat{β}) A^{*^{T}} + Tr [Cov_{G^{*}} (\hat{A^{*}}) Cov_{F} (\hat{β})])]) {\bar{p}}_{U}^{T} .
\end{aligned}

(8a)

In the simulation analyses that we performed to validate our methodology (illustrated in Table 7), we found that all the terms involving traces were small, and could be neglected in the variance formula, for simplification. Thus, a good approximation of the variance of the AGBD predictor would be

V ({\hat{\bar{y}}}_{U3pHMB}) = {\bar{p}}_{U} (α^{T} Cov_{Q^{*}} (\hat{Γ^{*}}) α) {\bar{p}}_{U}^{T} + {\bar{p}}_{U} (Γ^{*} β^{T} Cov_{G^{*}} (\hat{A^{*}}) β Γ^{*^{T}}) {\bar{p}}_{U}^{T} + {\bar{p}}_{U} (Γ^{*} A^{*} Cov_{F} (\hat{β}) A^{*^{T}} Γ^{*^{T}}) {\bar{p}}_{U}^{T} .

(8b)

Table 7.

The contribution of the terms involving traces to the overall variance of 3pHMB and 3pHHY predictors.

Predictor	Predictor variance based on Eq. (8a) and (11)	Terms involving traces	Relative contribution of terms involving traces
3pHMB	10.94	2.0 × 10⁻³	1.8 × 10⁻²%
3pHHY	11.03	0.8 × 10⁻³	0.7 × 10⁻²%

By replacing the covariances of estimated model parameters and the model parameters by their corresponding estimators, we obtain a variance estimator. In the simulation analysis that we performed to validate our methodology, it was found that the variance estimators were approximately unbiased.
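One way to turn Eq. (8b) into a plug-in estimator is sketched below. The specific covariance estimators are an assumption on our part, since they are not spelled out in this section: under OLS with independent, constant-variance errors, the three terms reduce to ${\hat{θ}}^{2} {\bar{p}}_{U} {(P_{I I I}^{T} P_{I I I})}^{- 1} {\bar{p}}_{U}^{T}$, ${\hat{δ}}^{2} {\bar{p}}_{U} \hat{Γ^{*}} {(Z_{I I}^{T} Z_{I I})}^{- 1} {\hat{Γ^{*}}}^{T} {\bar{p}}_{U}^{T}$, and ${\hat{ω}}^{2} {\bar{p}}_{U} \hat{Γ^{*}} \hat{A^{*}} {(X_{I}^{T} X_{I})}^{- 1} {\hat{A^{*}}}^{T} {\hat{Γ^{*}}}^{T} {\bar{p}}_{U}^{T}$, with residual variances estimated from each regression step. The data and the helpers `simulate`, `ones_col`, `ols_fit`, and `var_3phmb` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

def ones_col(m):
    return np.column_stack([np.ones(len(m)), m])

def simulate(n):
    """Made-up data chain: Landsat p -> ALS z -> field x -> AGBD y."""
    p = rng.normal(size=(n, 2))
    z = 1.0 + p @ np.array([[1.0, 0.3], [0.2, 0.8]]) + rng.normal(scale=0.5, size=(n, 2))
    x = 20.0 + z @ np.array([4.0, 2.0]) + rng.normal(scale=1.0, size=n)
    y = 5.0 + 3.0 * x + rng.normal(scale=4.0, size=n)
    return ones_col(p), ones_col(z), ones_col(x), y

def ols_fit(A, b):
    """Return OLS coefficients, (A^T A)^{-1}, and the residual variance estimate."""
    AtA_inv = np.linalg.inv(A.T @ A)
    coef = AtA_inv @ A.T @ b
    resid = b - A @ coef
    sigma2 = resid @ resid / (A.shape[0] - A.shape[1])
    return coef, AtA_inv, sigma2

def var_3phmb(X_I, y_I, Z_II, X_II, P_III, Z_III, p_bar_U):
    """Plug-in version of Eq. (8b) under assumed homoskedastic OLS errors."""
    beta_hat, XtX_inv, omega2_hat = ols_fit(X_I, y_I)
    alpha_hat, ZtZ_inv, delta2_hat = ols_fit(Z_II, X_II @ beta_hat)
    gamma_hat, PtP_inv, theta2_hat = ols_fit(P_III, Z_III @ alpha_hat)
    A_star_hat = ZtZ_inv @ Z_II.T @ X_II
    Gamma_star_hat = PtP_inv @ P_III.T @ Z_III
    g = p_bar_U @ Gamma_star_hat   # row vector p_bar_U Gamma*
    a = g @ A_star_hat             # row vector p_bar_U Gamma* A*
    term_Q = theta2_hat * (p_bar_U @ PtP_inv @ p_bar_U)  # Landsat model step
    term_G = delta2_hat * (g @ ZtZ_inv @ g)              # ALS model step
    term_F = omega2_hat * (a @ XtX_inv @ a)              # field model step
    return p_bar_U @ gamma_hat, (term_Q, term_G, term_F)

_, _, X_I, y_I = simulate(40)
_, Z_II, X_II, _ = simulate(150)
P_III, Z_III, _, _ = simulate(600)
p_bar_U = simulate(5000)[0].mean(axis=0)

mean_hat, terms = var_3phmb(X_I, y_I, Z_II, X_II, P_III, Z_III, p_bar_U)
var_hat = sum(terms)
print(mean_hat, var_hat, terms)
```

Returning the three terms separately makes it possible to see how much each modeling step contributes to the estimated variance; the trace terms of Eq. (8a) are omitted, as in Eq. (8b).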

If we would like to derive the MSE of the predictor rather than the variance, we may define the true AGBD over the AOI through the models involved. We first express the AGBD variable using the model Q as

Z_{U} α = P_{U} γ + e_{U},

Using model G, we can express $Z_{U} α$ as $X_{U} β - υ_{U}$ and thus $X_{U} β - υ_{U} = P_{U} γ + e_{U}$ , leading to

X_{U} β = P_{U} γ + e_{U} + υ_{U} .

Using model F, we can express $X_{U} β$ as $y_{U} - ϵ_{U}$ and thus $y_{U} - ϵ_{U} = P_{U} γ + e_{U} + υ_{U}$, which implies that the variable of interest for all elements in the AOI can be expressed using models F, G and Q as

y_{U} = P_{U} γ + e_{U} + υ_{U} + ϵ_{U} .

The AGBD mean for $U$ can thus be expressed as

{\bar{y}}_{U} = \frac{1}{N} \sum_{i = 1}^{N} (p_{i} γ + e_{i} + υ_{i} + ϵ_{i}) = {\bar{p}}_{U} γ + {\overline{e}}_{U} + {\overline{υ}}_{U} + {\overline{ϵ}}_{U} .

The MSE of the predictor ${\hat{\bar{y}}}_{U_{3 p H M B}}$ is then, according to the definition of MSE,

\begin{aligned}
MSE ({\hat{\bar{y}}}_{U_{3pHMB}}) & = E [{({\hat{\bar{y}}}_{U_{3pHMB}} - {\bar{y}}_{U})}^{2}] \\
& = E [{({\bar{p}}_{U} \hat{γ} - {\bar{p}}_{U} γ - {\bar{e}}_{U} - {\bar{υ}}_{U} - {\bar{ϵ}}_{U})}^{2}] .
\end{aligned}

Under an assumption of independence between the variability terms, the MSE can be expressed as

MSE ({\hat{\bar{y}}}_{U3pHMB}) = {\bar{p}}_{U} Cov (\hat{γ}) {\bar{p}}_{U}^{T} + \frac{θ^{2}}{N} + \frac{δ^{2}}{N} + \frac{ω^{2}}{N} = V ({\hat{\bar{y}}}_{U3pHMB}) + \frac{θ^{2}}{N} + \frac{δ^{2}}{N} + \frac{ω^{2}}{N} .

(9)

However, note that it is not obvious that the three variability terms are independent, and typically there is also spatial autocorrelation among them. Thus, MSE expressions become complicated and require further investigation.
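The step leading to Eq. (9) can be made explicit. The following is a sketch under the stated independence assumption, together with the additional assumptions (ours, not stated in the text) that $\hat{γ}$ is independent of the population-level variability terms and that the terms are uncorrelated between grid-cells. Since $E [\hat{γ}] = γ$ and all variability terms have zero mean, the cross terms vanish:

```latex
\begin{aligned}
MSE ({\hat{\bar{y}}}_{U_{3pHMB}})
 & = E [{({\bar{p}}_{U} (\hat{γ} - γ))}^{2}] + E [{\bar{e}}_{U}^{2}] + E [{\bar{υ}}_{U}^{2}] + E [{\bar{ϵ}}_{U}^{2}] \\
 & = {\bar{p}}_{U} Cov (\hat{γ}) {\bar{p}}_{U}^{T} + V ({\bar{e}}_{U}) + V ({\bar{υ}}_{U}) + V ({\bar{ϵ}}_{U}), \\
V ({\bar{e}}_{U}) & = \frac{1}{N^{2}} \sum_{i = 1}^{N} V (e_{i}) = \frac{θ^{2}}{N},
\end{aligned}
```

and likewise $V ({\bar{υ}}_{U}) = δ^{2} / N$ and $V ({\bar{ϵ}}_{U}) = ω^{2} / N$. With spatial autocorrelation, the sums of variances would have to include covariances between grid-cells, which is one reason why the simple form of Eq. (9) may not hold.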

Details of three-phase hierarchical hybrid inference (3pHHY)

Next, we present our second case, i.e., when the last step auxiliary information is available only for a probability sample from the AOI, rather than wall-to-wall. This step is the only difference compared to 3pHMB inference. In our forest inventory example, the auxiliary information in this case could be retrieved from the ICESat-2 spaceborne LiDAR.

Thus, ${\bar{p}}_{U}$ in this case is estimated using the design-based estimator (e.g., [30], p. 42)

{\hat{\bar{p}}}_{U} = \frac{1}{N} \sum_{S_{I V}} \frac{p_{k}}{π_{k}},

where $S_{I V}$ is a probability sample drawn from the AOI, $p_{k}$ is a vector of ICESat-2 auxiliary values for the $k^{th}$ element in the sample, and $π_{k}$ is the probability of including the $k^{th}$ element in the sample $S_{I V}$.

The design-based expectation of ${\hat{\bar{p}}}_{U}$ is

E_{D} [{\hat{\bar{p}}}_{U}] = {\bar{p}}_{U} .
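Design-unbiasedness can be checked exactly on a toy example. The sketch below assumes simple random sampling without replacement, so that $π_{k} = n / N$, and uses a hypothetical population of five elements with made-up auxiliary values; it enumerates every possible sample and averages the estimator (of Horvitz-Thompson type) over them using exact fractions.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical tiny population: N = 5 elements, each with (intercept, one auxiliary value).
P_U = [(1, 12), (1, 7), (1, 30), (1, 18), (1, 3)]
N, n = len(P_U), 2
pi_k = Fraction(n, N)  # inclusion probability under SRSWOR (assumed design)

def ht_mean(sample):
    """Design-based estimator (1/N) * sum over the sample of p_k / pi_k."""
    dims = len(sample[0])
    return tuple(sum(Fraction(p[j]) / pi_k for p in sample) / N for j in range(dims))

# True population mean vector p_bar_U.
p_bar_U = tuple(Fraction(sum(p[j] for p in P_U), N) for j in range(2))

# All samples of size n are equally likely under SRSWOR.
samples = list(combinations(P_U, n))
expectation = tuple(
    sum(ht_mean(s)[j] for s in samples) / len(samples) for j in range(2)
)

print(expectation == p_bar_U)
```

With equal inclusion probabilities the estimator reduces to the sample mean of the auxiliary vectors; for unequal $π_{k}$, the averaging over samples would have to be weighted by the sample probabilities.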

The 3pHHY predictor of the AGBD mean over the AOI is then (cf., [31], [32], [33])

{\hat{\bar{y}}}_{U 3 p H H Y} = \frac{1}{N} \sum_{S_{I V}} \frac{p_{k}}{π_{k}} \hat{γ} .

(10)

Fig. 2 gives a graphical overview of 3pHHY prediction.

The 3pHHY predictor variance and its estimator

To derive the variance of the 3pHHY predictor, we begin with the expectation of the predictor, which can be decomposed as

\begin{aligned}
E [{\hat{\bar{y}}}_{U3pHHY}] & = E [\frac{1}{N} \sum_{S_{IV}} \frac{p_{k}}{π_{k}} \hat{γ}] = E_{D} [E_{Q} [\frac{1}{N} \sum_{S_{IV}} \frac{p_{k}}{π_{k}} \hat{γ}]] \\
& = E_{D} [\frac{1}{N} \sum_{S_{IV}} \frac{p_{k}}{π_{k}} E_{Q} [\hat{γ}]] = {\bar{p}}_{U} E_{Q} [\hat{γ}] = {\bar{p}}_{U} Γ^{*} A^{*} β = {\bar{p}}_{U} γ,
\end{aligned}

where the subscript $D$ denotes expectation due to the design.

The variance of the 3pHHY predictor can be decomposed using the law of total variance as

\begin{aligned}
V ({\hat{\bar{y}}}_{U3pHHY}) & = V (\frac{1}{N} \sum_{S_{IV}} \frac{p_{k}}{π_{k}} \hat{γ}) \\
& = V_{D} (E_{Q} [\frac{1}{N} \sum_{S_{IV}} \frac{p_{k}}{π_{k}} \hat{γ}]) + E_{D} [V_{Q} (\frac{1}{N} \sum_{S_{IV}} \frac{p_{k}}{π_{k}} \hat{γ})] \\
& = V_{D} (\frac{1}{N} \sum_{S_{IV}} \frac{p_{k}}{π_{k}} γ) + E_{D} [\frac{1}{N} \sum_{S_{IV}} \frac{p_{k}}{π_{k}} Cov (\hat{γ}) {(\frac{1}{N} \sum_{S_{IV}} \frac{p_{k}}{π_{k}})}^{T}] \\
& = γ^{T} Cov (\frac{1}{N} \sum_{S_{IV}} \frac{p_{k}}{π_{k}}) γ + {\bar{p}}_{U} Cov (\hat{γ}) {\bar{p}}_{U}^{T} + Tr [Cov (\frac{1}{N} \sum_{S_{IV}} \frac{p_{k}}{π_{k}}) Cov (\hat{γ})] .
\end{aligned}

(11)

The third term in the final expression of Eq. (11) is due to

E_{D} [\frac{1}{N} \sum_{S_{I V}} \frac{p_{k}}{π_{k}} {(\frac{1}{N} \sum_{S_{I V}} \frac{p_{k}}{π_{k}})}^{T}] = {\bar{p}}_{U} {\bar{p}}_{U}^{T} + C o v (\frac{1}{N} \sum_{S_{I V}} \frac{p_{k}}{π_{k}}) .

Our simulation analysis showed that the third term, $T r [C o v (\frac{1}{N} \sum_{S_{I V}} \frac{p_{k}}{π_{k}}) C o v (\hat{γ})]$ , in Eq. (11) is small and may be neglected. Thus, for any given sampling design the 3pHHY variance is

V ({\hat{\bar{y}}}_{U 3 p H H Y}) = γ^{T} C o v (\frac{1}{N} \sum_{S_{I V}} \frac{p_{k}}{π_{k}}) γ + {\bar{p}}_{U} C o v (\hat{γ}) {\bar{p}}_{U}^{T} .

(12)

Under a simple random sampling without replacement design, the variance is

V ({\hat{\bar{y}}}_{U 3 p H H Y}) = \frac{1}{n_{I V}} (1 - \frac{n_{I V}}{N}) γ^{T} C o v (P_{U}) γ + {\bar{p}}_{U} C o v (\hat{γ}) {\bar{p}}_{U}^{T},

(13)

where $n_{I V}$ is the sample size, $(1 - \frac{n_{I V}}{N})$ is the finite population correction factor, and $C o v (P_{U})$ is the population covariance of the ICESat-2 explanatory variables. In the simulations performed to validate the results, Eq. (13) was used, since simple random sampling was assumed.
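The design-based first term of Eq. (13) can be verified by brute force on a toy population. In the sketch below, the population values, the coefficient vector `gamma` (treated as fixed), and the sample size are hypothetical, and the population covariance is assumed to use $N - 1$ in the denominator, which is the convention under which the simple random sampling formula is exact. The second, model-based term ${\bar{p}}_{U} Cov (\hat{γ}) {\bar{p}}_{U}^{T}$ is not covered here.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical population of N = 6 elements: (intercept, one auxiliary value).
P_U = [(1, 12), (1, 7), (1, 30), (1, 18), (1, 3), (1, 21)]
gamma = (Fraction(5), Fraction(3, 2))  # hypothetical fixed coefficients
N, n_IV, d = len(P_U), 3, 2

mean = [Fraction(sum(p[j] for p in P_U), N) for j in range(d)]

# Population covariance matrix Cov(P_U), with N - 1 in the denominator (assumed).
cov = [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in P_U) / (N - 1)
        for j in range(d)] for i in range(d)]

# First term of Eq. (13): (1/n)(1 - n/N) gamma^T Cov(P_U) gamma.
quad = sum(gamma[i] * cov[i][j] * gamma[j] for i in range(d) for j in range(d))
formula = Fraction(1, n_IV) * (1 - Fraction(n_IV, N)) * quad

def estimate(sample):
    """With pi_k = n/N the estimator is the sample mean of p_k gamma."""
    return sum(sum(p[j] * gamma[j] for j in range(d)) for p in sample) / n_IV

# Variance of the estimator over all equally likely samples.
values = [estimate(s) for s in combinations(P_U, n_IV)]
centre = sum(values) / len(values)
brute = sum((v - centre) ** 2 for v in values) / len(values)

print(formula == brute)
```

Exact rational arithmetic is used so that the comparison does not depend on floating-point tolerance.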

By replacing the covariance of estimated model parameters, $C o v (\hat{γ})$ , the model parameters, $γ$ , and the design-based covariance $C o v (\frac{1}{N} \sum_{S_{I V}} \frac{p_{k}}{π_{k}})$ with their corresponding estimators we obtain a variance estimator. Our simulation analysis showed that the variance estimator is approximately unbiased.

If we would like to derive the MSE of the 3pHHY predictor, we may proceed along the route previously outlined for 3pHMB, i.e.

\begin{aligned}
MSE ({\hat{\bar{y}}}_{U3pHHY}) & = E [{({\hat{\bar{y}}}_{U3pHHY} - {\bar{y}}_{U})}^{2}] \\
& = E [{(\frac{1}{N} \sum_{S_{IV}} \frac{p_{k}}{π_{k}} \hat{γ} - {\bar{p}}_{U} γ - {\bar{e}}_{U} - {\bar{υ}}_{U} - {\bar{ϵ}}_{U})}^{2}] \\
& = E [{(\frac{1}{N} \sum_{S_{IV}} \frac{p_{k}}{π_{k}} \hat{γ} - {\bar{p}}_{U} \hat{γ} + {\bar{p}}_{U} \hat{γ} - {\bar{p}}_{U} γ - {\bar{e}}_{U} - {\bar{υ}}_{U} - {\bar{ϵ}}_{U})}^{2}] \\
& = E [{((\frac{1}{N} \sum_{S_{IV}} \frac{p_{k}}{π_{k}} - {\bar{p}}_{U}) \hat{γ} + {\bar{p}}_{U} (\hat{γ} - γ) - {\bar{e}}_{U} - {\bar{υ}}_{U} - {\bar{ϵ}}_{U})}^{2}] \\
& = γ^{T} Cov (\frac{1}{N} \sum_{S_{IV}} \frac{p_{k}}{π_{k}}) γ + {\bar{p}}_{U} Cov (\hat{γ}) {\bar{p}}_{U}^{T} + Tr [Cov (\frac{1}{N} \sum_{S_{IV}} \frac{p_{k}}{π_{k}}) Cov (\hat{γ})] + \frac{θ^{2}}{N} + \frac{δ^{2}}{N} + \frac{ω^{2}}{N} \\
& = V ({\hat{\bar{y}}}_{U3pHHY}) + \frac{θ^{2}}{N} + \frac{δ^{2}}{N} + \frac{ω^{2}}{N} .
\end{aligned}

(14)

As in the case of 3pHMB, this MSE formula assumes independence between the variability terms and absence of spatial autocorrelation. Since these assumptions may not hold, deriving a useful MSE expression is complicated and requires further investigation.

Study of hierarchical modeling in the context of a superpopulation model

The previous sections of this article have provided the formulas needed for applying 3pHMB and 3pHHY for prediction, and for assessing uncertainties. In this section, we study properties of hierarchical modeling that cannot be straightforwardly deduced from the formulas presented in the previous sections. We do this by specifying a superpopulation model with normally distributed random variables. Some of the non-standard models from the previous sections, e.g., Eqs. (2) and (3), will now be derived through conditional expectation, thus validating the previously presented models. This section will also provide insights into effects of hierarchical modeling that are important to understand, such as the reduced variability of the response variable the more hierarchically nested models are added. Examples of the magnitude of such variability reduction are provided.

We assume that with each population element a multivariate random variable is associated, i.e., for the $i^{t h}$ observation, the values ( $y_{i}$ , $x_{i}$ , $z_{i}$ , $p_{i}$ ) are outcomes from a multivariate random variable ( $Y_{i}$ , $X_{i}$ , $Z_{i}$ , $P_{i}$ ). The joint distribution of the random variables for the elements in the population defines a superpopulation model denoted $ξ$ ([34], p. 80).

The superpopulation model $ξ$ shapes the relationship between the variables in the population, and the observations linked to them. We assume the same type of population elements, variables and observations as previously, i.e., $y_{i}$ is the AGBD measured in the field, $x_{i}$ is a field plot measurement, such as average tree diameter (DBH), $z_{i}$ is an ALS variable, and $p_{i}$ is a Landsat variable (in the case of 3pHMB) or an ICESat-2 variable (in the case of 3pHHY). The values ( $y_{i}$ , $x_{i}$ , $z_{i}$ , $p_{i}$ ) are realizations from the corresponding multivariate random variable ( $Y_{i}$ , $X_{i}$ , $Z_{i}$ , $P_{i}$ ).

We assume a multivariate normal distribution, i.e.

\begin{pmatrix} Y_{i} \\ X_{i} \\ Z_{i} \\ P_{i} \end{pmatrix} \sim N (μ, Σ),

where $μ = (μ_{Y}, μ_{X}, μ_{Z}, μ_{P})$ is a vector of superpopulation means, and $Σ$ is the superpopulation variance-covariance matrix with variances $σ_{(\cdot)}^{2}$ for each variable on the diagonal and covariances $σ_{(\cdot) (\cdot)}$ between the variables on the off-diagonal elements. We assume that our multivariate variables are exchangeable, i.e., every permutation of the order of the population elements follows the same joint distribution $ξ$ ([34], p. 85).

We now derive the models previously presented in Eqs. (1), (2), and (3) through the superpopulation model. We begin by describing the model linking AGBD and DBH, i.e., the model F. Following the joint distribution $ξ$ , the conditional probability density function of $Y_{i}$ , given that $X_{i} = x_{i}$ for the $i^{th}$ unit, is

f_{Y | X} (y | x) = \frac{f_{X, Y} (x, y)}{f_{X} (x)},

where $f_{X, Y} (x, y)$ is the joint probability density function of $Y$ and $X$ , and $f_{X} (x)$ is the marginal density function for $X$ . Under the normality assumption the conditional expectation of $Y_{i}$ given $X_{i} = x_{i}$ can be expressed as ([35], p. 72)

E [Y_{i} | X_{i} = x_{i}] = E [Y_{i}] + \frac{C o v (Y_{i}, X_{i})}{V (X_{i})} (x_{i} - E [X_{i}]) = μ_{Y} + r_{Y X} \frac{σ_{Y}}{σ_{X}} (x_{i} - μ_{X}) = μ_{Y | X}

(15)

where $r_{Y X}$ is the correlation between $Y_{i}$ and $X_{i}$ , and $σ_{(\cdot)}$ is the superpopulation standard deviation for the given variable. The variance of the conditional distribution is ([35], p. 72)

V a r (Y_{i} | X_{i} = x_{i}) = V (Y_{i}) - \frac{C o v^{2} (Y_{i}, X_{i})}{V (X_{i})} = σ_{Y}^{2} (1 - r_{Y X}^{2}) = σ_{Y | X}^{2} .

(16)

The conditional distribution of $Y_{i}$ , given that $X_{i} = x_{i}$ for the $i^{t h}$ observation is then

Y_{i} | X_{i} = x_{i} \sim N (μ_{Y | X}, σ_{Y | X}^{2}) .

Thus, in our example, the AGBD random variable associated with the $i^{t h}$ population element has the expectation $μ_{Y}$ and variance $σ_{Y}^{2}$ ; $X_{i}$ is the DBH random variable with expectation $μ_{X}$ and variance $σ_{X}^{2}$ . However, for a fixed outcome $x_{i}$ from $X_{i}$ , there is a random subset of $Y_{i}$ conditional on the fixed $x_{i}$ ; this subset is the variable $Y_{i} | X_{i} = x_{i}$ with expectation $μ_{Y | X} = μ_{Y} + r_{Y X} \frac{σ_{Y}}{σ_{X}} (x_{i} - μ_{X})$ and variance $σ_{Y | X}^{2} = σ_{Y}^{2} (1 - r_{Y X}^{2})$ . This corresponds to regressing $Y_{i}$ on $X_{i}$ , which reveals how much information about $Y_{i}$ is contained in an observation of $X_{i}$ ([35], p. 73).

Therefore, using the conditional distribution, we can define our first model (F), which links AGBD and DBH as

F : Y_{i} = μ_{Y | X} + ϵ_{i}, ϵ_{i} \sim N (0, σ_{Y | X}^{2}),

(17)

where $ϵ_{i}$ is the variability term. By denoting $ω^{2} = σ_{Y | X}^{2}$ , $β_{0} = (μ_{Y} - r_{Y X} \frac{σ_{Y}}{σ_{X}} μ_{X})$ and $β_{1} = r_{Y X} \frac{σ_{Y}}{σ_{X}}$ we can rewrite F in the form previously introduced, i.e.

F : Y_{i} = β_{0} + β_{1} x_{i} + ϵ_{i}, ϵ_{i} \sim N (0, ω^{2}),

and in linear algebra notation

F : Y_{i} = x_{i} β + ϵ_{i}, ϵ_{i} \sim N (0, ω^{2}),

where $β = {(β_{0}, β_{1})}^{T}$ and $x_{i} = (1, x_{i})$ . By substituting the variances and correlations in the parameter formulas with their corresponding estimators, we obtain the well-known formulas for parameter estimation from OLS regression.
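The correspondence between the conditional-distribution parameters and OLS can be checked numerically. The following Python sketch is an illustration only: it borrows the AGBD and DBH means, standard deviations and correlation from Table 5, draws a large bivariate normal sample (the sample size and seed are arbitrary assumptions), and compares the OLS fit with $β_{0}$ , $β_{1}$ and $ω^{2}$ computed from Eqs. (15) and (16).

```python
import numpy as np

# Superpopulation values for AGBD (Y) and DBH (X), taken from Table 5.
mu_y, mu_x = 62.62, 10.94
sd_y, sd_x = 49.53, 6.30
r_yx = 0.77

# Parameters of model F implied by the conditional normal distribution.
beta1 = r_yx * sd_y / sd_x
beta0 = mu_y - beta1 * mu_x
omega2 = sd_y**2 * (1.0 - r_yx**2)

# Draw a large bivariate normal sample of (Y, X).
rng = np.random.default_rng(1)  # arbitrary seed
cov = np.array([[sd_y**2, r_yx * sd_y * sd_x],
                [r_yx * sd_y * sd_x, sd_x**2]])
y, x = rng.multivariate_normal([mu_y, mu_x], cov, size=200_000).T

# Ordinary least squares fit of Y on (1, X).
design = np.column_stack([np.ones_like(x), x])
beta_hat, *_ = np.linalg.lstsq(design, y, rcond=None)
resid = y - design @ beta_hat
omega2_hat = resid @ resid / (len(y) - 2)

print("beta   (theory):", beta0, beta1)
print("beta   (OLS)   :", beta_hat[0], beta_hat[1])
print("omega2 (theory, OLS):", omega2, omega2_hat)
```

With a sample this large, the OLS estimates should agree with the theoretical values up to simulation noise.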

Similarly, to derive the model G (Eq. (2)), we introduce $E [Y_{i} | X_{i}]$ , which is a random variable since we now condition on the random variable $X_{i}$ rather than some fixed outcome $X_{i} = x_{i}$ ([27], p. 131–132). The variance of $E [Y_{i} | X_{i}]$ is

V (E [Y_{i} | X_{i}]) = V (μ_{Y} + r_{Y X} \frac{σ_{Y}}{σ_{X}} (X_{i} - μ_{X})) = V (r_{Y X} \frac{σ_{Y}}{σ_{X}} X_{i}) = r_{Y X}^{2} \frac{σ_{Y}^{2}}{σ_{X}^{2}} V (X_{i}) = r_{Y X}^{2} \frac{σ_{Y}^{2}}{σ_{X}^{2}} σ_{X}^{2} = r_{Y X}^{2} σ_{Y}^{2} .

(18)

The expectation of $E [Y_{i} | X_{i}]$ can be derived through the law of iterated expectations ([27], p. 132)

E [E [Y_{i} | X_{i}]] = E [Y_{i}] = μ_{Y} .

(19)

The next step is to introduce the ALS variable, $Z_{i}$ . Based on our superpopulation model, the conditional distribution of $E [Y_{i} | X_{i}]$ given that $Z_{i} = z_{i}$ has the expected value (cf. Eq. (15))

\begin{matrix} E [E [Y_{i} | X_{i}] | Z_{i} = z_{i}] & = & E [E [Y_{i} | X_{i}]] + \frac{Cov (E [Y_{i} | X_{i}], Z_{i})}{V (Z_{i})} (z_{i} - E [Z_{i}]) \\ & = & μ_{Y} + r_{X Z} r_{Y X} \frac{σ_{Y}}{σ_{Z}} (z_{i} - μ_{Z}) = μ_{Y | X | Z}, \end{matrix}

(20)

where,

\begin{matrix} Cov (E [Y_{i} | X_{i}], Z_{i}) & = & Cov (μ_{Y} + r_{Y X} \frac{σ_{Y}}{σ_{X}} (X_{i} - μ_{X}), Z_{i}) \\ & = & Cov (r_{Y X} \frac{σ_{Y}}{σ_{X}} X_{i}, Z_{i}) = r_{Y X} \frac{σ_{Y}}{σ_{X}} Cov (X_{i}, Z_{i}) \\ & = & r_{Y X} \frac{σ_{Y}}{σ_{X}} r_{X Z} \sqrt{V (X_{i}) V (Z_{i})} = r_{X Z} r_{Y X} σ_{Y} σ_{Z} . \end{matrix}

The variance is (cf. Eq. (16))

\begin{matrix} Var (E [Y_{i} | X_{i}] | Z_{i} = z_{i}) & = & V (E [Y_{i} | X_{i}]) - \frac{Cov^{2} (E [Y_{i} | X_{i}], Z_{i})}{V (Z_{i})} \\ & = & r_{Y X}^{2} σ_{Y}^{2} - \frac{{(r_{X Z} r_{Y X} σ_{Y} σ_{Z})}^{2}}{σ_{Z}^{2}} = σ_{Y}^{2} r_{Y X}^{2} (1 - r_{X Z}^{2}) = σ_{Y | X | Z}^{2} . \end{matrix}

(21)

The conditional distribution of $E [Y_{i} | X_{i}]$ , given that $Z_{i} = z_{i}$ is then

E [Y_{i} | X_{i}] | Z_{i} = z_{i} \sim N (μ_{Y | X | Z}, σ_{Y | X | Z}^{2}) .

Thus, we can define the model G:

G : E [Y_{i} | X_{i}] = μ_{Y | X | Z} + υ_{i}, υ_{i} \sim N (0, σ_{Y | X | Z}^{2}) .

(22)

Denoting model parameters $α_{0} = (μ_{Y} - r_{X Z} r_{Y X} \frac{σ_{Y}}{σ_{Z}} μ_{Z})$ , $α_{1} = r_{X Z} r_{Y X} \frac{σ_{Y}}{σ_{Z}}$ , and the variance of the variability term $υ_{i}$ as $δ^{2} = σ_{Y | X | Z}^{2}$ , we can rewrite G as

G : E [Y_{i} | X_{i}] = α_{0} + α_{1} z_{i} + υ_{i}, υ_{i} \sim N (0, δ^{2})

and in linear algebra notation

G : X_{i} β = z_{i} α + υ_{i}, υ_{i} \sim N (0, δ^{2}),

where $X_{i} = (1, X_{i})$ , $α = {(α_{0}, α_{1})}^{T}$ and $z_{i} = (1, z_{i})$ . This model coincides with the model presented in Eq. (2) in the previous section.

It remains to derive the model Q, (Eq. (3)) in a similar fashion. Thus, we define a new random variable, $E [E [Y_{i} | X_{i}] | Z_{i}]$ , which has variance

V (E [E [Y_{i} | X_{i}] | Z_{i}]) = V (μ_{Y} + r_{X Z} r_{Y X} \frac{σ_{Y}}{σ_{Z}} (Z_{i} - μ_{Z})) = V (r_{X Z} r_{Y X} \frac{σ_{Y}}{σ_{Z}} Z_{i}) = r_{X Z}^{2} r_{Y X}^{2} σ_{Y}^{2},

(23)

and expectation (following the law of iterated expectations)

E [E [E [Y_{i} | X_{i}] | Z_{i}]] = E [Y_{i}] = μ_{Y} .

(24)

The conditional distribution of $E [E [Y_{i} | X_{i}] | Z_{i}]$ given $P_{i} = p_{i}$ is

E [E [Y_{i} | X_{i}] | Z_{i}] | P_{i} = p_{i} \sim N (μ_{Y | X | Z | P}, σ_{Y | X | Z | P}^{2}),

where

E [E [E [Y_{i} | X_{i}] | Z_{i}] | P_{i} = p_{i}] = μ_{Y} + r_{Z P} r_{X Z} r_{Y X} \frac{σ_{Y}}{σ_{P}} (p_{i} - μ_{P}) = μ_{Y | X | Z | P},

(25)

given that the covariance

Cov (E [E [Y_{i} | X_{i}] | Z_{i}], P_{i}) = Cov (μ_{Y} + r_{X Z} r_{Y X} \frac{σ_{Y}}{σ_{Z}} (Z_{i} - μ_{Z}), P_{i}) = r_{X Z} r_{Y X} \frac{σ_{Y}}{σ_{Z}} Cov (Z_{i}, P_{i}) = r_{X Z} r_{Y X} \frac{σ_{Y}}{σ_{Z}} r_{Z P} σ_{Z} σ_{P} = r_{Z P} r_{X Z} r_{Y X} σ_{Y} σ_{P},

and the variance is

V (E [E [Y_{i} | X_{i}] | Z_{i}] | P_{i} = p_{i}) = σ_{Y}^{2} r_{Y X}^{2} r_{X Z}^{2} (1 - r_{Z P}^{2}) = σ_{Y | X | Z | P}^{2} .

(26)

Thus, the model Q is

Q : E [E [Y_{i} | X_{i}] | Z_{i}] = μ_{Y | X | Z | P} + e_{i}, e_{i} \sim N (0, σ_{Y | X | Z | P}^{2}) .

(27)

Denoting the model parameters as $γ_{0} = (μ_{Y} - r_{Z P} r_{X Z} r_{Y X} \frac{σ_{Y}}{σ_{P}} μ_{P})$ and $γ_{1} = r_{Z P} r_{X Z} r_{Y X} \frac{σ_{Y}}{σ_{P}}$ , and the variance of the variability term $e_{i}$ as $θ^{2} = σ_{Y | X | Z | P}^{2}$ , we can rewrite Q as

Q : E [E [Y_{i} | X_{i}] | Z_{i}] = γ_{0} + γ_{1} p_{i} + e_{i}, e_{i} \sim N (0, θ^{2}),

and in linear algebra notation as

Q : Z_{i} α = p_{i} γ + e_{i}, e_{i} \sim N (0, θ^{2}),

where $Z_{i} = (1, Z_{i})$ , $γ = {(γ_{0}, γ_{1})}^{T}$ and $p_{i} = (1, p_{i})$ , which coincides with Eq. (3) in the previous section.
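The three derivations follow the same pattern: each slope is the running product of correlations times $σ_{Y}$ divided by the standard deviation of the explanatory variable, and each variability-term variance is the variance of the current response variable times one minus the squared correlation of that step. The Python sketch below collects this pattern in one place. The function and class names are hypothetical, and the example call uses the input values of Table 5 purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class LinearModel:
    intercept: float
    slope: float
    resid_var: float      # variance of the variability term
    response_var: float   # variance of the model's response variable


def hierarchical_models(mu_y, mu_x, mu_z, mu_p,
                        sd_y, sd_x, sd_z, sd_p,
                        r_yx, r_xz, r_zp):
    """Return models F, G and Q implied by the superpopulation parameters."""
    models = {}
    chain = 1.0                 # running product of correlations
    response_var = sd_y**2      # model F uses Y itself as the response
    for name, r, mu, sd in (("F", r_yx, mu_x, sd_x),
                            ("G", r_xz, mu_z, sd_z),
                            ("Q", r_zp, mu_p, sd_p)):
        chain *= r
        slope = chain * sd_y / sd
        models[name] = LinearModel(
            intercept=mu_y - slope * mu,
            slope=slope,
            resid_var=response_var * (1.0 - r**2),
            response_var=response_var,
        )
        response_var *= r**2    # variance of the next (conditional-mean) response
    return models


# Example call with the Table 5 input values.
models = hierarchical_models(62.62, 10.94, 7.16, 8.28,
                             49.53, 6.30, 4.34, 5.69,
                             0.77, 0.75, 0.76)
for name, m in models.items():
    print(name, m)
```

In this sketch `resid_var` corresponds to $ω^{2}$ , $δ^{2}$ and $θ^{2}$ for F, G and Q, respectively, and `response_var` to the variance of the response variable used in each model.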

Decreased variability of the response variable

Deriving the models through conditional expectation reveals properties of 3pHMB (and 3pHHY) that cannot be immediately observed from the first section of this article. An important property to note is that as more modeling steps are included, the variability of the response variable decreases. This decreased variability (of proxy AGBD in our example) may have important implications in applications. One obvious case is if the objective is to map the AGBD distribution across a landscape. In case several hierarchically nested models have been utilized, the mapped AGBD variability in the landscape would be substantially smaller than the real variability. A potential solution to avoiding such decreased variability could be to apply calibration (e.g., [36]).

The decreased variability may also have negative implications if the objective is to predict the AGBD, not least for domains. As pointed out by Chambers and Clark [37], a means to minimize the uncertainty of model-based predictors is to ensure first-order balanced samples, meaning that the mean values of explanatory variables in the dataset used for model estimation coincide with the mean values of the explanatory variables in the target population. Although not specifically studied in this article, we hypothesize that first-order balanced sampling becomes increasingly important the more modeling steps are included. Further, whereas the most relevant uncertainty measure in connection with model-based inference would be the MSE, several studies suggest that for large areas, the relative difference between MSE and variance would be small (e.g., [33], [38]) in case model-unbiased predictors are applied. However, in case the variability of the response variable is substantially reduced, it remains to be evaluated if this assertion holds.

Thus, model-based inference should be applied with caution in case the variability of the (proxy) response variable is substantially reduced compared to the real-world variability of the response variable. Below, we demonstrate how the variance of the response variable is affected by multi-step modeling (the models F, G and Q).

Table 2 gives an overview of the models and the corresponding expectation and variance of the response variable, under the previously introduced superpopulation model.

Table 2.

The expectation and variance of the response variable used in models F, G and Q.

Model	Response variable	Expectation of the response variable	Variance of the response variable
F	$Y_{i}$	$μ_{Y}$	$σ_{Y}^{2}$
G	$E [Y_{i} | X_{i}]$	$μ_{Y}$	$r_{Y X}^{2} σ_{Y}^{2}$
Q	$E [E [Y_{i} | X_{i}] | Z_{i}]$	$μ_{Y}$	$r_{X Z}^{2} r_{Y X}^{2} σ_{Y}^{2}$

From Table 2, we can see that whereas the response variables have the same expectation, their variability around the expectation is not the same. With the increased hierarchy of conditions, the variability of the proxy target variable is decreasing in comparison to the variability of the actual target variable.

In Tables 3 and 4 we show numerical examples of the reduction of the variance of the response variable. In Table 3, this is shown for model G, i.e., for the case of two modeling steps.

Table 3.

The decrease of the variance of the response variable in model G ( $r_{Y X}^{2} σ_{Y}^{2}$ ) for different correlations ( $r_{Y X}$ ) between the explanatory and response variables in model F.

$r_{Y X} = 0.9$	$r_{Y X} = 0.8$	$r_{Y X} = 0.7$	$r_{Y X} = 0.6$
$0.81 \times σ_{Y}^{2}$	$0.64 \times σ_{Y}^{2}$	$0.49 \times σ_{Y}^{2}$	$0.36 \times σ_{Y}^{2}$

Table 4.

The variance of the response variable for model Q ( $r_{X Z}^{2} r_{Y X}^{2} σ_{Y}^{2}$ ) for different correlations between the explanatory and response variables in model F and model G ( $r_{Y X}$ and $r_{X Z}$ ).

	$r_{Y X} = 0.9$	$r_{Y X} = 0.8$	$r_{Y X} = 0.7$	$r_{Y X} = 0.6$
$r_{X Z} = 0.9$	$0.66 \times σ_{Y}^{2}$	$0.52 \times σ_{Y}^{2}$	$0.40 \times σ_{Y}^{2}$	$0.29 \times σ_{Y}^{2}$
$r_{X Z} = 0.8$	$0.52 \times σ_{Y}^{2}$	$0.41 \times σ_{Y}^{2}$	$0.31 \times σ_{Y}^{2}$	$0.23 \times σ_{Y}^{2}$
$r_{X Z} = 0.7$	$0.40 \times σ_{Y}^{2}$	$0.31 \times σ_{Y}^{2}$	$0.24 \times σ_{Y}^{2}$	$0.18 \times σ_{Y}^{2}$
$r_{X Z} = 0.6$	$0.29 \times σ_{Y}^{2}$	$0.23 \times σ_{Y}^{2}$	$0.18 \times σ_{Y}^{2}$	$0.13 \times σ_{Y}^{2}$

From Table 3 we can see that, e.g., if the correlation between the response and explanatory variables is 0.8, the variance of the response variable in model G is 36% smaller than the original variance of the response variable; if the correlation is 0.7, about half of the variability is lost.

Table 4 gives numerical examples for the case of three modeling steps. Now, the decreased variance of the response variable is further accentuated. For example, when both correlations are 0.8, 59% of the original variance is lost; if both are 0.7, the corresponding figure is 76%.

Table 4 suggests that even if the correlation between explanatory variables and the response variable is fairly strong in the first two modeling steps, the last model is estimated using a response variable with substantially reduced variability compared to the response variable it is mimicking.
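The factors in Tables 3 and 4 follow directly from the variance expressions in Table 2. As a small sketch, the Python snippet below recomputes them as $r_{Y X}^{2}$ and $r_{X Z}^{2} r_{Y X}^{2}$ , rounded to two decimals; the dictionary names are illustrative.

```python
correlations = [0.9, 0.8, 0.7, 0.6]

# Table 3: variance of the response of model G as a fraction of sigma_Y^2.
table3 = {r_yx: round(r_yx**2, 2) for r_yx in correlations}

# Table 4: variance of the response of model Q as a fraction of sigma_Y^2.
table4 = {(r_xz, r_yx): round(r_xz**2 * r_yx**2, 2)
          for r_xz in correlations for r_yx in correlations}

print("Table 3:", "  ".join(f"{table3[r]:.2f}" for r in correlations))
for r_xz in correlations:
    row = "  ".join(f"{table4[(r_xz, r_yx)]:.2f}" for r_yx in correlations)
    print(f"Table 4, r_XZ = {r_xz}: {row}")

# Share of the original variance lost when both correlations are 0.8.
print(f"Lost with r = 0.8 in both steps: {1 - table4[(0.8, 0.8)]:.0%}")
```

The share of variance lost is one minus the tabulated factor, which is how the percentages quoted in the text are obtained.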

Methods validation

We validated the correctness of the presented variance formulas for 3pHMB and 3pHHY prediction through Monte Carlo (MC) simulation. A real-world application of 3pHHY is presented by Varvia et al. [26].

An R Markdown file is available as supplementary material to this article. It provides an R code for the simulation with a step-by-step description of how the simulations were performed. The input information mimics boreal forest conditions in the northern part of Finland. The target variable is the AGBD ( $Y_{i}$ ). The explanatory variable for model F is DBH ( $X_{i})$ ; in model G it is an ALS variable related to vegetation height ( $Z_{i}$ ), and in model Q, it is an ICESat-2 variable related to vegetation height ( $P_{i}$ ). The superpopulation means ( $μ_{(\cdot)}$ ) and standard deviations ( $s d_{(\cdot)})$ of the variables and the correlations between them are presented in Table 5. The sizes of the datasets are also given in the table.

Table 5.

Input information.²

Description	Input values
Superpopulation means	$μ_{Y}$	$μ_{X}$	$μ_{Z}$	$μ_{P}$
	62.62	10.94	7.16	8.28
Superpopulation standard deviations	$s d_{Y}$	$s d_{X}$	$s d_{Z}$	$s d_{P}$
	49.53	6.30	4.34	5.69
Superpopulation correlations between variables	$r_{Y X}$	$r_{X Z}$	$r_{Z P}$
	0.77	0.75	0.76
Sizes of datasets involved in the analysis	$n_{I}$	$n_{I I}$	$n_{I I I}$	$n_{I V}$	$N$
	102	943	1721	5760	38,400

²The input data mimics the conditions of the empirical case study presented in Varvia et al. [26]. Note that, although $n_{I} < n_{I I} < n_{I I I} < n_{I V} < N$ in this case, the presented methodological framework does not require any specific dataset size relations.

The evaluation of the performance of the proposed predictors and variance formulas was conducted following the steps described below.

•
Step 1a: Using the input information on the superpopulation means and standard deviations, the “true” model parameters and variances of variability terms in models F, G $^{*}$ , G, Q $^{*}$ and Q were obtained following the theoretical outline presented in the section “Study of hierarchical modeling in the context of a superpopulation model”.
•
Step 1b: The explanatory variables ( $X_{I}$ for model F over the dataset $n_{I}$ , $Z_{I I}$ for model G over the dataset $n_{I I}$ , $P_{I I I}$ for model Q over the dataset $n_{I I I}$ , and $P_{U}$ over $U$ ) were generated based on their mean values and standard deviations following normal distributions.
•
Step 1c: The variances of the 3pHMB and 3pHHY predictors were computed according to Eqs. (8a) and (11) based on the results from steps 1a and 1b.

Steps 1a – 1c are preparations for the MC simulation and are located outside the MC loop, since they are performed only once. The MC iterations included the following steps:

•
Step 2a: The variability terms for the models F, G $^{*}$ , G, Q $^{*}$ and Q were generated randomly based on Step 1a. As a consequence, the response variables for models were also generated for all elements in the datasets.
•
Step 2b: The model parameters of the models F, G $^{*}$ , G, Q $^{*}$ and Q were estimated based on the simulated data, and applied for predicting the AOI population mean following 3pHMB inference. The predicted value was recorded for each MC iteration.
•
Step 2c: A simple random probability sample (without replacement) was drawn from $P_{U}$ and the AOI population mean was predicted following 3pHHY inference. Note that the predicted value was recorded for each MC iteration.
•
Step 2d: Variance estimators were applied to estimate the variance of the 3pHMB and 3pHHY predictors based on the simulated outcomes from each MC iteration. The estimates were recorded for each MC iteration.

Steps 2a – 2d were repeated one million times. Based on the MC simulations, the empirical variances of the 3pHMB and 3pHHY predictors were obtained and could be compared with the variances according to the results from Eqs. (8a) and (11) as well as the corresponding variance estimators.
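The overall structure of the simulation can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the supplementary R Markdown code: it omits the models G $^{*}$ and Q $^{*}$ , the 3pHHY sampling step and the variance estimators, it assumes that the explanatory variables in each dataset are generated by inverting the "true" models G and Q, and it uses far fewer iterations than one million. Function names, the seed and the number of iterations are arbitrary choices.

```python
import numpy as np

# Input values from Table 5.
mu_y, mu_x, mu_z, mu_p = 62.62, 10.94, 7.16, 8.28
sd_y, sd_x, sd_z, sd_p = 49.53, 6.30, 4.34, 5.69
r_yx, r_xz, r_zp = 0.77, 0.75, 0.76
n1, n2, n3, N = 102, 943, 1721, 38_400


def true_line(mu_expl, sd_expl, chain, response_var, r):
    """Intercept, slope and sd of the variability term of one model."""
    slope = chain * sd_y / sd_expl
    return mu_y - slope * mu_expl, slope, np.sqrt(response_var * (1 - r**2))


# Step 1a: "true" parameters of F, G and Q from the superpopulation model.
b0, b1, sd_eps = true_line(mu_x, sd_x, r_yx, sd_y**2, r_yx)
a0, a1, sd_ups = true_line(mu_z, sd_z, r_yx * r_xz, (r_yx * sd_y)**2, r_xz)
g0, g1, sd_e = true_line(mu_p, sd_p, r_yx * r_xz * r_zp,
                         (r_yx * r_xz * sd_y)**2, r_zp)

# Step 1b: explanatory variables, generated once outside the MC loop.
rng = np.random.default_rng(2023)  # arbitrary seed
x1 = rng.normal(mu_x, sd_x, n1)
z2 = rng.normal(mu_z, sd_z, n2)
p3 = rng.normal(mu_p, sd_p, n3)
p_u = rng.normal(mu_p, sd_p, N)


def ols(x, y):
    slope, intercept = np.polyfit(x, y, 1)
    return intercept, slope


def one_iteration():
    # Model F: simulate AGBD in the first dataset and estimate beta.
    y1 = b0 + b1 * x1 + rng.normal(0, sd_eps, n1)
    bh0, bh1 = ols(x1, y1)
    # Model G: simulate DBH from the true G, predict AGBD with the
    # estimated F, and regress the predictions on the ALS variable.
    x2 = (a0 + a1 * z2 + rng.normal(0, sd_ups, n2) - b0) / b1
    ah0, ah1 = ols(z2, bh0 + bh1 * x2)
    # Model Q: same idea one level up, with the third explanatory variable.
    z3 = (g0 + g1 * p3 + rng.normal(0, sd_e, n3) - a0) / a1
    gh0, gh1 = ols(p3, ah0 + ah1 * z3)
    # Predicted population mean from wall-to-wall auxiliary data.
    return gh0 + gh1 * p_u.mean()


preds = np.array([one_iteration() for _ in range(2000)])
print("mean of predictions     :", preds.mean())
print("model mean at mean(p_U) :", g0 + g1 * p_u.mean())
print("empirical variance      :", preds.var(ddof=1))
```

In this simplified setting, the average of the predictions should lie close to the superpopulation mean of AGBD, and the empirical variance plays the role of $V_{M C}$ ; it is not expected to reproduce the article's reported values exactly.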

The results of the simulations are presented in Tables 6 and 7. In addition, it was observed that the average of the predicted AOI population means over the MC iterations based on the 3pHMB and 3pHHY predictors corresponded well with the AGBD superpopulation mean, which shows that the predictors are approximately unbiased.

Table 6.

The variance of the 3pHMB and 3pHHY predictors and their corresponding estimated variances; $V ({\hat{\bar{y}}}_{U (\cdot)})$ is the variance of the population mean predictor based on Eqs. (8a) and (11), $V_{M C} ({\hat{\bar{y}}}_{U (\cdot)})$ is the empirical variance of the population mean predictor from the MC iterations, and $\hat{V} ({\hat{\bar{y}}}_{U (\cdot)})$ is the average of estimated variances across the MC iterations.

3pHMB			3pHHY
$V ({\hat{\bar{y}}}_{U_{3 p H M B}})$	$V_{M C} ({\hat{\bar{y}}}_{U_{3 p H M B}})$	$\hat{V} ({\hat{\bar{y}}}_{U_{3 p H M B}})$	$V ({\hat{\bar{y}}}_{U_{3 p H H Y}})$	$V_{M C} ({\hat{\bar{y}}}_{U_{3 p H H Y}})$	$\hat{V} ({\hat{\bar{y}}}_{U_{3 p H H Y}})$
10.94	10.96	10.96	11.03	11.04	11.04

Table 6 shows the results from comparing the analytically derived variances of the predictors with the corresponding empirical variances from the MC iterations, and the average values of estimated variances. The Table shows that Eqs. (8a) and (11) are valid as variance formulas and that the corresponding variance estimators are approximately unbiased.

Referring to the Eqs. (8a) and (8b), it was suggested that terms with traces in (8a) could be neglected due to their small contribution to the overall variance of the 3pHMB and 3pHHY predictors. The empirical results in Table 7 validate this assertion.

Conclusions

The main purpose of this study was to present predictors, variances, and variance estimators for hierarchical model-based inference and hierarchical hybrid inference involving three modeling steps. Previous studies have proposed solutions for methods involving two modeling steps (e.g., Saarela et al. [39]). The need for methods of this kind emanates from forest resources assessment utilizing multiple sources of remotely sensed data in combination with field data. In Monte Carlo simulations, the correctness of the proposed formulas was validated. In addition, it was shown that caution should be exercised when applying several modeling steps based on models with weak correlation between the explanatory variable(s) and the response variable, since multiple modeling steps dramatically reduce the variability of the predicted target variable, which may lead to problems in some applications. For example, a map produced on the basis of multiple modeling steps would display substantially reduced variability for the target variable compared to the real variability.

Although this article has focused on applications based on remotely sensed data, the methods are general and can be applied in any discipline where empirical data are lacking but proxies can be obtained through hierarchically nested models.

CRediT authorship contribution statement

Svetlana Saarela: Conceptualization, Methodology, Software, Formal analysis, Writing – original draft, Writing – review & editing. Petri Varvia: Data curation, Methodology, Writing – review & editing. Lauri Korhonen: Data curation, Resources, Funding acquisition, Writing – review & editing. Zhiqiang Yang: Software, Writing – review & editing. Paul L. Patterson: Methodology, Writing – review & editing. Terje Gobakken: Resources, Writing – review & editing. Erik Næsset: Resources, Methodology, Writing – review & editing. Sean P. Healey: Resources, Funding acquisition, Writing – review & editing. Göran Ståhl: Methodology, Writing – original draft, Writing – review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

Funding for the study was provided by the Academy of Finland (project number 332707) and by grants from NASA's GEDI Science Team (grant 80HQTR21T0013).

In the case of heterogeneity and correlation among model error terms, the generalized least squares estimator should be applied to estimate the model parameters. Varvia et al. [26] present an example where heterogeneous variance was taken into account in the context of 3pHHY.

Data availability

The data were simulated inside the validation code provided by the authors. An R Markdown file is attached as supplementary material.

References

1.Dubayah R., Armston J., Healey S.P., Bruening J.M., Patterson P.L., Kellner J.R., Duncanson L., Saarela S., Ståhl G., Yang Z., Tang H., Blair J.B., Fatoyinbo L., Goetz S., Hancock S., Hansen M., Hofton M., Hurtt G., Luthcke S. GEDI launches a new era of biomass inference from space. Environ. Res. Lett. 2022;17 doi: 10.1088/1748-9326/ac8694.
2.Araza A., de Bruin S., Herold M., Quegan S., Labriere N., Rodriguez-Veiga P., Avitabile V., Santoro M., Mitchard E.T.A., Ryan C.M., Phillips O.L., Willcock S., Verbeeck H., Carreiras J., Hein L., Schelhaas M.-J., Pacheco-Pascagaza A.M., da Conceição Bispo P., Laurin G.V., Vieilledent G., Slik F., Wijaya A., Lewis S.L., Morel A., Liang J., Sukhdeo H., Schepaschenko D., Cavlovic J., Gilani H., Lucas R. A comprehensive framework for assessing the accuracy and uncertainty of global above-ground biomass maps. Remote Sens. Environ. 2022;272 doi: 10.1016/j.rse.2022.112917.
3.McRoberts R.E. Probability- and model-based approaches to inference for proportion forest using satellite imagery as ancillary data. Remote Sens. Environ. 2010;114:1017–1025. doi: 10.1016/j.rse.2009.12.013.
4.Saarela S., Wästlund A., Holmström E., Mensah A.A., Holm S., Nilsson M., Fridman J., Ståhl G. Mapping aboveground biomass and its prediction uncertainty using LiDAR and field data, accounting for tree-level allometric and LiDAR model errors. For. Ecosyst. 2020;7:43. doi: 10.1186/s40663-020-00245-0.
5.Hyyppä J., Hyyppä H., Leckie D., Gougeon F., Yu X., Maltamo M. Review of methods of small-footprint airborne laser scanning for extracting forest inventory data in boreal forests. Int. J. Remote Sens. 2008;29:1339–1366. doi: 10.1080/01431160701736489.
6.Næsset E. Predicting forest stand characteristics with airborne scanning laser using a practical two-stage procedure and field data. Remote Sens. Environ. 2002;80:88–99. doi: 10.1016/S0034-4257(01)00290-5.
7.Wulder M.A., White J.C., Nelson R.F., Næsset E., Ørka H.O., Coops N.C., Hilker T., Bater C.W., Gobakken T. Lidar sampling for large-area forest characterization: a review. Remote Sens. Environ. 2012;121:196–209. doi: 10.1016/j.rse.2012.02.001.
8.Gobakken T., Næsset E., Nelson R., Bollandsås O.M., Gregoire T.G., Ståhl G., Holm S., Ørka H.O., Astrup R. Estimating biomass in Hedmark County, Norway using national forest inventory field plots and airborne laser scanning. Remote Sens. Environ. 2012;123:443–456. doi: 10.1016/j.rse.2012.01.025.
9.Andersen H.-E., Strunk J., Temesgen H. Using airborne light detection and ranging as a sampling tool for estimating forest biomass resources in the Upper Tanana Valley of Interior Alaska. West. J. Appl. For. 2011;26:157–164. doi: 10.1093/wjaf/26.4.157.
10.Saarela S., Holm S., Healey S., Andersen H.-E., Petersson H., Prentius W., Patterson P., Næsset E., Gregoire T., Ståhl G. Generalized hierarchical model-based estimation for aboveground biomass assessment using GEDI and Landsat data. Remote Sens. 2018;10:1832. doi: 10.3390/rs10111832.
11.Holm S., Nelson R., Ståhl G. Hybrid three-phase estimators for large-area forest inventory using ground plots, airborne lidar, and space lidar. Remote Sens. Environ. 2017;197:85–97. doi: 10.1016/j.rse.2017.04.004.
12.Lefsky M.A., Harding D.J., Keller M., Cohen W.B., Carabajal C.C., Del Bom Espirito-Santo F., Hunter M.O., de Oliveira R. Estimates of forest canopy height and aboveground biomass using ICESat. Geophys. Res. Lett. 2005;32 doi: 10.1029/2005GL023971.
13.Margolis H.A., Nelson R.F., Montesano P.M., Beaudoin A., Sun G., Andersen H.-E., Wulder M.A. Combining satellite lidar, airborne lidar, and ground plots to estimate the amount and distribution of aboveground biomass in the boreal forest of North America. Can. J. For. Res. 2015;45:838–855. doi: 10.1139/cjfr-2015-0006.
14.Markus T., Neumann T., Martino A., Abdalati W., Brunt K., Csatho B., Farrell S., Fricker H., Gardner A., Harding D., Jasinski M., Kwok R., Magruder L., Lubin D., Luthcke S., Morison J., Nelson R., Neuenschwander A., Palm S., Popescu S., Shum C., Schutz B.E., Smith B., Yang Y., Zwally J. The Ice, Cloud, and land Elevation Satellite-2 (ICESat-2): science requirements, concept, and implementation. Remote Sens. Environ. 2017;190:260–273. doi: 10.1016/j.rse.2016.12.029.
15.Narine L., Malambo L., Popescu S. Characterizing canopy cover with ICESat-2: a case study of southern forests in Texas and Alabama, USA. Remote Sens. Environ. 2022;281 doi: 10.1016/j.rse.2022.113242.
16.Popescu S.C., Zhou T., Nelson R., Neuenschwander A., Sheridan R., Narine L., Walsh K.M. Photon counting LiDAR: an adaptive ground and canopy height retrieval algorithm for ICESat-2 data. Remote Sens. Environ. 2018;208:154–170. doi: 10.1016/j.rse.2018.02.019.
17.Varvia P., Korhonen L., Bruguière A., Toivonen J., Packalen P., Maltamo M., Saarela S., Popescu S.C. How to consider the effects of time of day, beam strength, and snow cover in ICESat-2 based estimation of boreal forest biomass? Remote Sens. Environ. 2022;280 doi: 10.1016/j.rse.2022.113174.
18.Wulder M.A., White J.C., Bater C.W., Coops N.C., Hopkinson C., Chen G. Lidar plots — A new large-area data collection option: context, concepts, and case study. Can. J. Remote Sens. 2012;38:600–618. doi: 10.5589/m12-049.
19.Hou Z., Xu Q., McRoberts R.E., Greenberg J.A., Liu J., Heiskanen J., Pitkänen S., Packalen P. Effects of temporally external auxiliary data on model-based inference. Remote Sens. Environ. 2017;198:150–159. doi: 10.1016/j.rse.2017.06.013.
20.Snijders T.A.B., Bosker R.J. 2nd ed. SAGE; Los Angeles, Calif: 2012. Multilevel Analysis: An Introduction to Basic and Advanced Multilevel Modeling.
21.Roy D.P., Wulder M.A., Loveland T.R., Woodcock C.E., Allen R.G., Anderson M.C., Helder D., Irons J.R., Johnson D.M., Kennedy R., Scambos T.A., Schaaf C.B., Schott J.R., Sheng Y., Vermote E.F., Belward A.S., Bindschadler R., Cohen W.B., Gao F., Hipple J.D., Hostert P., Huntington J., Justice C.O., Kilic A., Kovalskyy V., Lee Z.P., Lymburner L., Masek J.G., McCorkel J., Shuai Y., Trezza R., Vogelmann J., Wynne R.H., Zhu Z. Landsat-8: science and product vision for terrestrial global change research. Remote Sens. Environ. 2014;145:154–172. doi: 10.1016/j.rse.2014.02.001.
22.Wulder M.A., White J.C., Loveland T.R., Woodcock C.E., Belward A.S., Cohen W.B., Fosnight E.A., Shaw J., Masek J.G., Roy D.P. The global Landsat archive: status, consolidation, and direction. Remote Sens. Environ. 2016;185:271–283. doi: 10.1016/j.rse.2015.11.032.
23.McRoberts R.E., Chen Q., Domke G.M., Ståhl G., Saarela S., Westfall J.A. Hybrid estimators for mean aboveground carbon per unit area. For. Ecol. Manag. 2016;378:44–56. doi: 10.1016/j.foreco.2016.07.007.
24.Repola J. Biomass equations for Scots pine and Norway spruce in Finland. Silva Fenn. 2009;43 doi: 10.14214/sf.184.
25.Repola J. Biomass equations for birch in Finland. Silva Fenn. 2008;42 doi: 10.14214/sf.236.
26.Varvia P., Saarela S., Maltamo M., Packalen P., Gobakken T., Næsset E., Ståhl G., Korhonen L. Estimation of boreal forest biomass from ICESat-2 data using hierarchical hybrid inference. 2023. doi: 10.48550/ARXIV.2307.04497.
27.Davidson R., MacKinnon J.G. Oxford University Press; 1993. Estimation and Inference in Econometrics.
28.Saarela S., Holm S., Healey S.P., Patterson P.L., Yang Z., Andersen H.-E., Dubayah R.O., Qi W., Duncanson L.I., Armston J.D., Gobakken T., Næsset E., Ekström M., Ståhl G. Comparing frameworks for biomass prediction for the global ecosystem dynamics investigation. Remote Sens. Environ. 2022;278 doi: 10.1016/j.rse.2022.113074.
29.Wallenius K.T. A conditional covariance formula with applications. Am. Stat. 1971;25:32–33.
30.Särndal C.-E., Swensson B., Wretman J. Springer Science & Business Media; 2003. Model Assisted Survey Sampling.
31.Ståhl G., Heikkinen J., Petersson H., Repola J., Holm S. Sample-based estimation of greenhouse gas emissions from forests—A new approach to account for both sampling and model errors. For. Sci. 2014;60:3–13. doi: 10.5849/forsci.13-005.
32.Ståhl G., Holm S., Gregoire T.G., Gobakken T., Næsset E., Nelson R. Model-based inference for biomass estimation in a LiDAR sample survey in Hedmark County, Norway. This article is one of a selection of papers from Extending Forest Inventory and Monitoring over Space and Time. Can. J. For. Res. 2011;41:96–107. doi: 10.1139/X10-161.
33.Ståhl G., Saarela S., Schnell S., Holm S., Breidenbach J., Healey S.P., Patterson P.L., Magnussen S., Næsset E., McRoberts R.E., Gregoire T.G. Use of models in large-area forest surveys: comparing model-assisted, model-based and hybrid estimation. For. Ecosyst. 2016;3:5. doi: 10.1186/s40663-016-0064-9.
34.Cassel C.-M., Särndal C.-E., Wretman J.H. 1977. Foundations of Inference in Survey Sampling.
35.Feller W. An introduction to probability theory and its applications. Wiley Ser. Probab. Math. Stat. 1971;1:343–366.
36.Lindgren N., Nyström K., Saarela S., Olsson H., Ståhl G. Importance of calibration for improving the efficiency of data assimilation for predicting forest characteristics. Remote Sens. 2022;14:4627. doi: 10.3390/rs14184627.
37.Chambers R., Clark R. Oxford University Press; 2012. An Introduction to Model-Based Survey Sampling With Applications.
38.Patterson P.L., Healey S.P., Ståhl G., Saarela S., Holm S., Andersen H.-E., Dubayah R.O., Duncanson L., Hancock S., Armston J., Kellner J.R., Cohen W.B., Yang Z. Statistical properties of hybrid estimators proposed for GEDI—NASA's global ecosystem dynamics investigation. Environ. Res. Lett. 2019;14 doi: 10.1088/1748-9326/ab18df.
39.Saarela S., Holm S., Grafström A., Schnell S., Næsset E., Gregoire T.G., Nelson R.F., Ståhl G. Hierarchical model-based inference for forest inventory utilizing three sources of information. Ann. For. Sci. 2016;73(4):895–910.

Associated Data

Data Availability Statement

The data were simulated inside the validation code provided by the authors. An R Markdown file is attached as supplementary material.
