Simultaneous confidence bands for functional data using the Gaussian Kinematic formula

Fabian JE Telschow; Armin Schwartzman

doi:10.1016/j.jspi.2021.05.008

Author manuscript; available in PMC: 2022 Jul 8.

Published in final edited form as: J Stat Plan Inference. 2021 Jun 5;216:70–94. doi: 10.1016/j.jspi.2021.05.008


PMCID: PMC9268949 NIHMSID: NIHMS1765530 PMID: 35813237

Abstract

We propose a construction of simultaneous confidence bands (SCBs) for functional parameters over arbitrary dimensional compact domains using the Gaussian Kinematic formula of t-processes (tGKF). Although the tGKF relies on Gaussianity, we show that a central limit theorem (CLT) for the parameter of interest is enough to obtain asymptotically precise covering even if the observations are non-Gaussian processes. As a proof of concept we study the functional signal-plus-noise model and derive a CLT for an estimator of the Lipschitz–Killing curvatures, the only data-dependent quantities in the tGKF. We further discuss extensions to discrete sampling with additive observation noise using scale space ideas from regression analysis. Our theoretical work is accompanied by a simulation study comparing different methods to construct SCBs for the population mean. We show that the tGKF outperforms state-of-the-art methods with precise covering for small sample sizes, and only a Rademacher multiplier-t bootstrap performs similarly well. A further benefit is that our SCBs are computationally fast even for domains of dimension greater than one. Applications of SCBs to diffusion tensor imaging (DTI) fibers (1D) and spatio-temporal temperature data (2D) are discussed.

Keywords: Simultaneous inference, Confidence bands, Random fields, Functional data, Climate

1. Introduction

In the past three decades functional data analysis has received increasing interest due to the possibility of recording and storing data collected with high frequency and/or high resolution in time and space. Many methods have been developed to study these complex data objects; for overviews of some recent developments in this fast growing field we refer the reader to the review articles Cuevas (2014) and Wang et al. (2016) and, among others, the books Ferraty and Vieu (2006) and Ramsay and Silverman (2007).

Despite the success of functional data analysis, quantification of uncertainty with simultaneous confidence bands has only recently received increasing attention. The existing methods for construction of simultaneous confidence bands (SCBs) split into three groups. The first group is based on functional central limit theorems (fCLTs) in the Banach space of continuous functions endowed with the maximum metric and evaluation of the maximum of the limiting Gaussian field, often using Monte-Carlo simulations with an estimate of the limiting covariance structure, cf. Bunea et al. (2011), Degras (2011, 2017), Cao et al. (2012) and Cao et al. (2014). The second group is based on the bootstrap, among others Cuevas et al. (2006), Chang et al. (2017), Wang et al. (2019) and Belloni et al. (2018). Thirdly, Liebl and Reimherr (2019) recently proposed the use of a generalized Kac–Rice formula for fast construction of SCBs, which is similar to our proposal, but limited to one-dimensional domains.

Except for Liebl and Reimherr (2019) the mentioned methods are computationally expensive, since they either simulate from an estimated limiting field or require drawing many bootstrap samples. This hinders their use for domains of dimension greater than one. Moreover, they often perform poorly on small samples.

In order to construct precise and efficiently computable SCBs for functional parameters over domains of arbitrary dimension, we propose to use random field theory (RFT). RFT was studied extensively in Adler (1981) and Adler and Taylor (2009), and has been successfully used in the neuroimaging community to control the family-wise error rate (FWER) of statistical 3D images, among others (Worsley et al., 1996, 2004). More precisely, we use the so-called Gaussian kinematic formula (GKF) for pointwise t-distributed random fields (Taylor, 2006; Taylor and Worsley, 2007).

In a nutshell, GKFs express the expected Euler characteristic (EEC) of the excursion set of a Gaussian related random field F(Z₁, … , Z_N), $F \in C^{2} (ℝ^{N}, ℝ)$ , in terms of a finite linear combination of D + 1 known functions, called Euler characteristic (EC) densities. Here Z₁, … , Z_N ~ Z are i.i.d. zero-mean, unit-variance Gaussian fields with twice differentiable sample paths over a nice compact subset of $ℝ^{D}$ . The linear coefficients in this formula are called Lipschitz–Killing curvatures (LKCs) and depend solely on the domain and the covariance structure of the derivative of Z. Remarkably, the only difference between GKFs for different Gaussian related fields is that the EC densities change, see Adler and Taylor (2009, p.315, (12.4.2)). Takemura and Kuriki (2002) have shown that the GKF for Gaussian fields is closely related to the Volume of Tubes formula dating all the way back to Working and Hotelling (1929). The latter has been applied for SCBs in nonlinear regression analysis, e.g., Johansen and Johnstone (1990), Krivobokova et al. (2010) and Lu and Kuriki (2017). In this sense the GKFs of Taylor (2006) can be interpreted as a generalization of the Volume of Tubes formula for repeated observations of functional data.

Our main contributions are the following. In Theorem 2 we show, based on the main result in Taylor et al. (2005), that asymptotically the error in the covering rate of SCBs for a function-valued population parameter based on the tGKF can be bounded and is small, if the targeted covering probability of the SCB is sufficiently high. This requires neither Gaussianity nor stationarity of the observed fields. It only requires that the estimator of the targeted function-valued parameter fulfills an fCLT in the Banach space of continuous functions with a sufficiently regular Gaussian limit field. Moreover, it requires consistent estimators for the LKCs. The latter have been studied in Taylor and Worsley (2007) and Telschow et al. (2020). We illustrate the general approach for the special case of SCBs of the population mean curve and the difference of population means for functional signal-plus-noise models, where we allow the error fields to be non-Gaussian. In particular, for such models defined over sufficiently regular domains $S \subset ℝ^{D}$ , D = 1, 2, we derive consistent estimators for the LKCs together with CLTs for them. In order to deal with observation noise we discuss SCBs for scale spaces. In Theorem 8 we give sufficient conditions to have weak convergence of a scale space field to a Gaussian limit, extending the results from Chaudhuri and Marron (2000) from regression analysis to repeated observations of functional data. Additionally, we prove that the LKCs of this limit field can be consistently estimated and therefore Theorem 2 can be used to bound the error in the covering rate for SCBs of the population mean of a scale space field.

Scale spaces are not the only way to deal with the observation noise and discrete sampling. For example local polynomial estimators can be used to estimate the population mean (Zhang et al., 2007; Degras, 2011; Zhang et al., 2016). Our proposed construction of SCBs is applicable in these cases provided that the LKCs can be consistently estimated and a bias correction is introduced. Note that in the ultra dense sampling case (Zhang et al., 2016) our developed theory for the SCBs of signal-plus-noise models can be directly used, since the bias introduced by the smoother is negligible. For less dense sampling schemes a bias correction and consistent estimates of the LKCs need to be derived, yet Theorem 2 can then still be applied.

The theory is accompanied by a simulation study using the R package SIRF (Spatial Inference for Random Fields), which can be found at https://github.com/ftelschow/SIRF, and we demonstrate the use of SCBs in two different data applications. In the simulation study we compare the performance of the tGKF approach to SCBs for different error fields, mainly with bootstrap approaches, and conclude that the tGKF approach not only often gives better coverage for small sample sizes, but also outperforms bootstrap approaches computationally. Moreover, the average width of the tGKF confidence bands is lower for large sample sizes. As a first application we demonstrate the use of SCBs for scale spaces on a diffusion tensor imaging (DTI) experiment to detect differences in population means between healthy subjects and patients. The second application constructs simultaneous confidence bands for the expected increase in mean summer and winter temperatures over North America obtained from NARCCAP simulations (Mearns et al., 2013).

Organization of the article.

Section 2.1 introduces necessary notations and definitions. In Section 2.2 and Section 2.3 we describe the general idea of construction of SCBs for functional parameters using the tGKF. In Section 2.4 we prove a theorem establishing that the SCBs from the tGKF give accurate covering rates. Applications to the functional signal-plus-noise model are given in Section 3. In particular, we provide consistent estimators of the LKCs and prove consistency and a CLT for them in Section 3.2. SCBs for scale space models are discussed in Section 3.3. In Section 4 we compare the tGKF approach in different simulations to competing state of the art methods to construct SCBs for functional signal-plus-noise models. Section 5 applies these ideas to real data.

2. Simultaneous confidence bands

2.1. Preliminary definitions and notations

Definition 1.

A random field G is a random function from a parameter space S to $ℝ$ . The function $G_{ω} : S \to ℝ$ , where ω ∈ Ω is an element of the underlying probability space, is called a sample path. Moreover, G is called Gaussian random field, if for all (s₁, … , s_M) ∈ S^M, $M \in ℕ$ , the random vector (G(s₁), … , G(s_M)) has a multivariate Gaussian distribution.

A detailed treatment of random fields can be found in Adler and Taylor (2009). Another important quantity that we need is the Euler characteristic (EC) of a set A ⊆ S, denoted by χ(A). The EC of a topological space is an invariant that can be rigorously defined using homology groups of a topological space (Bredon, 2013) or it can be defined using triangulations (Flegg, 2001). Intuitively, the EC of $A \subset ℝ^{D}$ is the number of connected components for D = 1, the number of connected components minus the number of holes for D = 2 and the number of connected components minus the number of holes plus the number of hollows (for example the hole in the 2-sphere) for D = 3.

Classical results on functional central limit theorems (fCLTs), which we will use later, make use of semi-metrics (Jain and Marcus, 1975). Hence we recall the definition:

Definition 2.

A semi-metric on a topological space S is a map $δ : S \times S \to ℝ$ such that (1.) δ(s, s′) ≥ 0, (2.) δ(s, s′) = 0 if and only if s = s′, (3.) δ(s, s′) = δ(s′, s). This implies that each metric is a semi-metric. A semi-metric is called continuous with respect to the topology of S, if it is a continuous function in the product topology of S × S.

To improve readability we abbreviate higher order partial derivatives using the following notation

f^{(I)} = \frac{\partial^{| I |} f}{\partial s_{d_{1}} \dots \partial s_{d_{K}}},

where K = |I| denotes the number of elements in the multi-index I = (d₁, … , d_K). We denote with $C (S)$ the space of continuous functions from S to $ℝ$ and with “⇒” weak convergence in $C (S)$ endowed with the maximum norm ‖f‖_∞ = max_s∈S|f(s)|. Moreover, a letter in Fraktur font, for example $\mathfrak{r}$ , will always denote the covariance function of a random field.

2.2. SCBs for functional parameters

We describe a general well-known scheme for construction of simultaneous confidence bands (SCBs) for a functional parameter s ↦ θ(s), s ∈ S, where $S \subset ℝ^{D}$ is compact. Hereafter, we assume that all functions of s ∈ S belong to $C (S)$ .

Let $s \mapsto {\hat{θ}}_{N} (s)$ and $s \mapsto {\hat{ς}}_{N} (s)$ be estimators of functional parameters θ, ς respectively, fulfilling

τ_{N} \frac{{\hat{θ}}_{N} - θ}{ς} \overset{N \to \infty}{\Rightarrow} G,

(E1)

ℙ (lim_{N \to \infty} {‖ {\hat{ς}}_{N} - ς ‖}_{\infty} = 0) = 1.

(E2)

Here G is a zero-mean Gaussian field with covariance function $\mathfrak{r}$ satisfying $\mathfrak{r} (s, s) = 1$ for all s ∈ S, and τ_N → ∞ is a sequence of positive numbers. Under (E1) and (E2) Slutsky’s Lemma implies

τ_{N} \frac{{\hat{θ}}_{N} - θ}{{\hat{ς}}_{N}} \overset{N \to \infty}{\Rightarrow} G .

(1)

Thus, it is easy to verify that the collection of intervals

S C B (s, q_{α, N}) = [{\hat{θ}}_{N} (s) - q_{α, N} \frac{{\hat{ς}}_{N} (s)}{τ_{N}}, {\hat{θ}}_{N} (s) + q_{α, N} \frac{{\hat{ς}}_{N} (s)}{τ_{N}}]

(2)

form (1 − α)-simultaneous confidence bands of θ, i.e.

ℙ (\forall s \in S : θ (s) \in S C B (s, q_{α, N})) = 1 - α,

provided that

ℙ (max_{s \in S} τ_{N} | \frac{{\hat{θ}}_{N} (s) - θ (s)}{{\hat{ς}}_{N} (s)} | > q_{α, N}) = α .

(3)

The quantiles q_α,N are in general unknown and need to be estimated.
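The band in Eq. (2) is straightforward to evaluate once a quantile is available. The following is a minimal sketch for the special case where θ is a mean, the estimators are the pointwise sample mean and standard deviation, and τ_N = √N. The function names `scb` and `covers`, the storage of curves as lists on a common grid, and the value q = 3 are illustrative assumptions, not part of the original method description.

```python
import math
import statistics

def scb(samples, q):
    """Band (2) for the mean: theta_hat(s) +/- q * sigma_hat(s) / tau_N with tau_N = sqrt(N).

    samples: list of N curves, each a list of values on a common grid.
    q: quantile q_{alpha,N}, assumed to be supplied by some estimator.
    """
    n = len(samples)
    tau = math.sqrt(n)
    lower, upper = [], []
    for values in zip(*samples):             # loop over grid points s
        center = statistics.fmean(values)    # theta_hat(s)
        scale = statistics.stdev(values)     # sigma_hat(s), divisor N - 1
        half = q * scale / tau
        lower.append(center - half)
        upper.append(center + half)
    return lower, upper

def covers(lower, upper, truth):
    """True if truth(s) lies inside the band at every grid point."""
    return all(lo <= t <= up for lo, t, up in zip(lower, truth, upper))

# Toy usage: three curves on a grid of two points, hypothetical quantile q = 3.
curves = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]
lower, upper = scb(curves, q=3.0)
inside = covers(lower, upper, [1.0, 2.0])    # the pointwise means lie in the band
```

Simultaneous coverage means that `covers` returns true for the true parameter with probability 1 − α; the whole difficulty lies in choosing q accordingly.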

To the best of our knowledge there are two approaches for this. Limit approximations try to estimate q_α,N by estimation of the asymptotic covariance function and simulations of sample paths from the corresponding Gaussian field (Degras, 2011, 2017). Better performances for small samples can be achieved by bootstrap approaches such as the parametric bootstrap proposed in Degras (2011) or the multiplier bootstrap (Chang et al., 2017). Compare also Appendix A.
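To make the bootstrap alternative concrete, here is a hedged sketch of a Rademacher multiplier-t bootstrap for q_α,N in the mean setting. The studentization by the bootstrap standard deviation, the default number of replicates and all names are assumptions made for illustration; the procedure in the cited references and in Appendix A may differ in details.

```python
import math
import random
import statistics

def multiplier_t_quantile(samples, alpha=0.05, n_boot=500, seed=0):
    """Approximate q_{alpha,N} by a Rademacher multiplier-t bootstrap (sketch)."""
    rng = random.Random(seed)
    n = len(samples)
    columns = list(zip(*samples))                        # values per grid point
    means = [statistics.fmean(col) for col in columns]
    # Residual curves Y_n - Y_bar, stored per grid point.
    resid = [[y - m for y in col] for col, m in zip(columns, means)]
    maxima = []
    for _ in range(n_boot):
        g = [rng.choice((-1.0, 1.0)) for _ in range(n)]  # Rademacher multipliers
        stat = 0.0
        for col in resid:
            boot = [gi * r for gi, r in zip(g, col)]
            sd = statistics.stdev(boot)                  # bootstrap standard deviation
            if sd > 0.0:
                t = math.sqrt(n) * statistics.fmean(boot) / sd
                stat = max(stat, abs(t))                 # maximum over the grid
        maxima.append(stat)
    maxima.sort()
    index = min(n_boot - 1, math.ceil((1.0 - alpha) * n_boot) - 1)
    return maxima[index]                                 # empirical (1 - alpha) quantile

# Toy usage on simulated curves a*cos(2s) + b*sin(2s) (purely illustrative data).
rng = random.Random(1)
grid = [k / 24 for k in range(25)]
curves = []
for _ in range(20):
    a, b = rng.gauss(0, 1), rng.gauss(0, 1)
    curves.append([a * math.cos(2 * s) + b * math.sin(2 * s) for s in grid])
q_boot = multiplier_t_quantile(curves, alpha=0.05, n_boot=200)
```

The cost grows with the number of replicates times the number of grid points, which is the computational burden referred to in the introduction.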

2.3. Estimation of the quantile using the tGKF

In this section we propose to use the Gaussian kinematic formula for t-fields (tGKF) as proven in Taylor (2006) to approximate the quantiles q_α,N.

2.3.1. The Gaussian Kinematic Formula for t-fields

Let $G_{1}, \dots, G_{N} \overset{i.i.d.}{\sim} G$ be zero-mean, unit-variance Gaussian fields. The field T satisfying

T (s) = \frac{\sqrt{N} G_{N} (s)}{\sqrt{\frac{1}{N - 1} \sum_{n = 1}^{N - 1} G_{n} {(s)}^{2}}},

is called a t_N−1-field. We pose the following assumptions on G:

(G1) G has almost surely $C^{2}$ -sample paths.
(G2) The joint distribution of the derivative fields (G^(d)(s), G^(d,l)(s)) is nondegenerate for all s ∈ S and d, l = 1, … , D.
(G3) There is an ϵ > 0 such that
$E [{(G^{(d, l)} (s) - G^{(d, l)} (s^{'}))}^{2}] \leq K {| log ‖ s - s^{'} ‖ |}^{- (1 + γ)}$
for all d, l = 1, … , D and for all ‖s − s′‖ < ϵ. Here K > 0 and γ > 0 are finite constants.
(G4) $E [G (s) G (s^{'})] = 1$ if and only if s = s′.

Remark 1.

Assumptions (G1)-(G3) imply (cf., Adler and Taylor (2009, Thm 11.3.3)) that the paths of G are almost surely Morse functions. This is necessary since the proof of the GKF is based on the classical Morse theorem for Whitney-stratified manifolds (Goresky and MacPherson, 1988). Thus, (G2) is necessary to ensure that almost surely all critical points are nondegenerate. Assumption (G3) is satisfied for any field G having almost surely $C^{3}$ -sample paths and all third derivatives have a finite second $C (S)$ -moment, see Definition 3. In particular, this holds true for any Gaussian field with $C^{3}$ -sample paths. For completeness the argument is carried out in more detail in Appendix B.1. Condition (G4) excludes, for example, periodic fields.

Under these assumptions the tGKF is an exact, analytical formula of the expectation of the Euler characteristic χ of the excursion sets A(T, u) = {s ∈ S | T(s) > u} of T. This formula as proven in Taylor (2006) or Adler and Taylor (2009) is

E [χ (A (T, u))] = \sum_{d = 0}^{D} L_{d} (S, G) ρ_{d}^{t_{N - 1}} (u), D \in ℕ,

(4)

where $ρ_{d}^{t_{N - 1}}$ , d = 0, … , D, is the d-th Euler characteristic density of a t_N−1-field (Taylor and Worsley, 2007), usually known, and L_d denotes the d-th Lipschitz–Killing curvature, which only depends on G and the parameter space S. Note that L₀(S, G) = χ(S) and hence is usually known.
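For orientation, the EC densities for d ≤ 2 can be written out explicitly. The expressions below are the commonly quoted closed forms for a t-field with ν = N − 1 degrees of freedom, in the normalization where the LKCs carry the dependence on the covariance structure (they agree with the table in Worsley et al. (1996) up to that paper's resel scaling). They are reproduced here as a convenience under that assumption and should be checked against Taylor and Worsley (2007) before use.

```latex
\begin{aligned}
\rho_{0}^{t_{\nu}}(u) &= \mathbb{P}(t_{\nu} > u), \\
\rho_{1}^{t_{\nu}}(u) &= \frac{1}{2\pi}\Big(1 + \frac{u^{2}}{\nu}\Big)^{-\frac{\nu - 1}{2}}, \\
\rho_{2}^{t_{\nu}}(u) &= \frac{1}{(2\pi)^{3/2}}\,
  \frac{\Gamma\big(\frac{\nu + 1}{2}\big)}{\sqrt{\nu / 2}\;\Gamma\big(\frac{\nu}{2}\big)}\,
  u \Big(1 + \frac{u^{2}}{\nu}\Big)^{-\frac{\nu - 1}{2}} .
\end{aligned}
```

As ν → ∞ these converge to the Gaussian EC densities 1 − Φ(u), (2π)^{−1} e^{−u²/2} and (2π)^{−3/2} u e^{−u²/2}, where Φ is the standard normal distribution function.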

Eq. (4) is useful, since by the expected Euler characteristic heuristic (EECH) (Taylor et al., 2005), we expect

\frac{1}{2} ℙ (max_{s \in S} | T (s) | > u) \leq ℙ (max_{s \in S} T (s) > u) \approx E [χ (A (T, u))],

(5)

to be a good approximation for large thresholds u. In the case that $S \subset ℝ$ this approximation is always from above. This is due to the fact that the Euler characteristic of one-dimensional sets is always non-negative and hence, using the Markov inequality, we obtain

\frac{1}{2} ℙ (max_{s \in S} | T (s) | > u) \leq ℙ (χ (A (T, u)) \geq 1) \leq E [χ (A (T, u))] .

The same argument is heuristically valid for high enough thresholds in any dimension, since the excursion set will with high probability consist mostly of simply-connected sets. This heuristic has been rigorously proven in Taylor et al. (2005) for T being a Gaussian field.
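The one-dimensional argument can be illustrated numerically: for a function sampled on an ordered grid, the EC of the excursion set is the number of maximal runs of grid points above the threshold, and the maximum exceeds u exactly when this count is at least one. The sketch below uses a deterministic test function and a hypothetical helper name `excursion_ec_1d`; it only illustrates the counting, not the expectation in Eq. (4).

```python
import math

def excursion_ec_1d(values, u):
    """Euler characteristic of {s : f(s) > u} for f sampled on an ordered 1D grid.

    In one dimension the EC equals the number of connected components, i.e. the
    number of maximal runs of consecutive grid points with f > u.
    """
    components = 0
    inside = False
    for v in values:
        above = v > u
        if above and not inside:    # a new component starts here
            components += 1
        inside = above
    return components

grid = [k / 200 for k in range(201)]              # grid on [0, 1]
f = [math.sin(4 * math.pi * s) for s in grid]     # two full periods
ec = excursion_ec_1d(f, 0.5)                      # two bumps exceed the level 0.5
```

Averaging such counts over many simulated fields would give a Monte Carlo estimate of the left-hand side of Eq. (4).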

2.3.2. The tGKF-estimator of q_α,N

Let ${\hat{L}}_{d} (S, G)$ be consistent estimators of the LKCs L_d(S, G) for d = 1, … , D, then combining the tGKF (4) and the EEC heuristic Eq. (5) yields the natural plug-in estimator ${\hat{q}}_{α, N}$ of the quantile q_α,N defined in Eq. (3) as the largest solution u of

{\hat{E E C}}_{t_{N - 1}} (u) = L_{0} (S) ρ_{0}^{t_{N - 1}} (u) + \sum_{d = 1}^{D} {\hat{L}}_{d} (S, G) ρ_{d}^{t_{N - 1}} (u) = \frac{α}{2} .

(6)

The following result is important for the proof of the accuracy of the SCBs derived using the tGKF estimator of q_α,N.

Theorem 1.

Assume that ${\hat{L}}_{d}^{N} (S, G)$ is a consistent estimator of L_d(S, G). Then ${\hat{q}}_{α, N}$ given by (6) converges almost surely for N tending to infinity to the largest solution ${\tilde{q}}_{α}$ of

E E C_{G} (u) = L_{0} (S) ρ_{0} (u) + \sum_{d = 1}^{D} L_{d} (S, G) ρ_{d} (u) = \frac{α}{2},

(7)

where ρ_d are the Euler characteristic densities of a Gaussian field, which can be found in Adler and Taylor (2009, p.315, (12.4.2)).
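Solving Eq. (6) numerically only requires the EC densities and a one-dimensional root search. The sketch below covers D ≤ 2 and assumes the closed-form t-field EC densities in the LKC normalization (as tabulated, up to resel scaling, in Worsley et al. (1996)). The tail probability is computed by Simpson's rule to stay within the standard library, and the function names, the scan range `u_max` and the step size are illustrative choices.

```python
import math

def t_tail(u, nu, steps=400):
    """P(t_nu > u) via Simpson's rule applied to the t density on [0, u]."""
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi))

    def dens(x):
        return math.exp(log_c - (nu + 1) / 2 * math.log1p(x * x / nu))

    h = u / steps
    total = dens(0.0) + dens(u)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * dens(k * h)
    return 0.5 - total * h / 3

def eec_t(u, lkc, nu):
    """Left-hand side of Eq. (6) for D <= 2; lkc = [L0, L1] or [L0, L1, L2]."""
    base = (1 + u * u / nu) ** (-(nu - 1) / 2)
    gamma_ratio = math.exp(math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2))
    rho = [
        t_tail(u, nu),                                             # rho_0
        base / (2 * math.pi),                                      # rho_1
        gamma_ratio / math.sqrt(nu / 2) * u * base / (2 * math.pi) ** 1.5,  # rho_2
    ]
    return sum(l * r for l, r in zip(lkc, rho))

def tgkf_quantile(alpha, lkc, nu, u_max=20.0, step=0.1, tol=1e-8):
    """Largest u (up to the scan resolution) with eec_t(u) = alpha / 2."""
    target = alpha / 2
    hi = u_max
    if eec_t(hi, lkc, nu) > target:
        raise ValueError("increase u_max: EEC(u_max) still exceeds alpha / 2")
    # Scan downwards from u_max to bracket the last crossing of the level alpha / 2.
    while hi - step > 0 and eec_t(hi - step, lkc, nu) <= target:
        hi -= step
    lo = hi - step
    while hi - lo > tol:                      # bisection inside the bracket
        mid = 0.5 * (lo + hi)
        if eec_t(mid, lkc, nu) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# With L1 = L2 = 0 the solution reduces to the two-sided t quantile.
q_pointwise = tgkf_quantile(0.05, [1.0, 0.0, 0.0], nu=29)
# Hypothetical LKCs L0 = 1, L1 = 5 for a one-dimensional domain.
q_band = tgkf_quantile(0.05, [1.0, 5.0], nu=29)
```

The downward scan is one simple way to respect the requirement of taking the largest solution; a production implementation would more likely rely on a library root finder and a library t distribution.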

Remark 2.

Estimation of LKCs is not widely studied yet. For stationary random fields this has been done in Kiebel et al. (1999). For non-stationary random fields over domains in arbitrary dimensions, consistent estimators based on residuals have been introduced in Taylor and Worsley (2007) and Telschow et al. (2020). The former uses a warping to stationary transform, while the latter uses projections onto Hermite polynomials. For signal-plus-noise models and D ≤ 2 we take a more direct approach based on normalized residuals in Section 3.2.

2.4. Asymptotic covering rates

This section discusses the accuracy of the SCBs obtained using the tGKF. Since the expected Euler characteristic of the excursion sets is only an approximation of the excursion probabilities, there is no hope to prove that the covering of these confidence bands is exact. In particular, if α is large the approximation fails badly and usually will lead to confidence bands that are too wide. However, for values of α < 0.1 typically used for confidence bands, the EEC approximation works astonishingly well. Theoretically, this has been made precise for Gaussian fields in Theorem 4.3 of Taylor et al. (2005), which is the main ingredient in the proof of the next result. Additionally, it relies on the fCLT (1) and the consistency of ${\hat{q}}_{α, N}$ for ${\tilde{q}}_{α}$ proved in Theorem 1.

Theorem 2.

Assume (E1–E2) and assume that the limiting Gaussian field G satisfies (G1–G3). Moreover, let ${\hat{L}}_{d}^{N}$ be a sequence of consistent estimators of L_d for d = 1, … , D. Then there exists an α′ ∈ (0, 1) such that for all α ≤ α′ we have that the SCBs defined in Eq. (2) fulfill

lim_{N \to \infty} | 1 - α - ℙ (\forall s \in S : θ (s) \in S C B (s, {\hat{q}}_{α, N})) | \leq e^{- (\frac{1}{2} + \frac{1}{2 σ_{c}^{2}}) {\tilde{q}}_{α}^{2}} < e^{- \frac{{\tilde{q}}_{α}^{2}}{2}},

where $σ_{c}^{2}$ is the critical variance of an associated field of G, ${\hat{q}}_{α, N}$ is the quantile estimated using Eq. (6) and ${\tilde{q}}_{α}$ is defined in Theorem 1.

Typically, in our simulations we have that, for α = 0.05, the quantile ${\tilde{q}}_{α}$ is about 3 leading to an upper bound of ≈ 0.011, if we use the weaker bound without the critical variance.
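The quoted value can be checked directly. The short computation below evaluates the weaker bound for the quantile 3 mentioned in the text; the critical variance of 1 used for the sharper bound is a purely hypothetical value chosen to show how much the extra term can help.

```python
import math

q_tilde = 3.0                                    # approximate quantile quoted in the text
weak_bound = math.exp(-q_tilde ** 2 / 2)         # bound without the critical variance
# Hypothetical critical variance sigma_c^2 = 1, for illustration only.
sigma_c2 = 1.0
sharp_bound = math.exp(-(0.5 + 0.5 / sigma_c2) * q_tilde ** 2)
```

Here `weak_bound` equals e^{−4.5}, which rounds to 0.011.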

3. Application to the functional signal-plus-noise model

As it is an important example, we discuss in depth SCBs for the population mean in the functional signal-plus-noise model given by

Y (s) = μ (s) + σ (s) Z (s), for s \in S .

(8)

Here μ, σ are continuously differentiable functions on a compact domain $S \subset ℝ^{D}$ and Z is a stochastic process with zero mean and covariance function $cov [Z (s), Z (s^{'})] = \mathfrak{c} (s, s^{'})$ for s, s′ ∈ S satisfying $\mathfrak{c} (s, s^{'}) = 1$ if and only if s = s′. Additionally, we will need regularity conditions based on Definition 3 to guarantee that an i.i.d. sample of (8) fulfills a functional central limit theorem.

Definition 3.

We say a random field Z with domain S is $(L^{p}, δ)$ -Lipschitz, if there exists a (semi)-metric δ on S, continuous in the standard topology of $ℝ^{D}$ , and a random variable A satisfying $E [| A |^{p}] < \infty$ such that
$| Z (s) - Z (s^{'}) | \leq A δ (s, s^{'})$ for all $s, s^{'} \in S$ (9)
and $\int_{0}^{1} H^{1 / 2} (S, δ, ϵ) d ϵ < \infty$ , where H(S, δ, ϵ) denotes the metric entropy function of the (semi)-metric space (S, δ), e.g., Adler and Taylor (2009, Def. 1.3.1.).
We say a random field Z has finite pth $C (S)$ -moment, if $E [‖ Z ‖_{\infty}^{p}] < \infty$ .

Proposition 1.

Any $(L^{p}, δ)$ -Lipschitz field over a compact set S has finite pth $C (S)$ -moment if $E [| Z (s) |^{p}] < \infty$ for some s ∈ S.

Remark 3.

Any $(L^{p}, δ)$ -Lipschitz field Z has necessarily almost surely continuous sample paths. Moreover, this property is the main ingredient in the version of a CLT in $C (S)$ proven in Jain and Marcus (1975) or Ledoux and Talagrand (2013, Section 10.1).
Since any continuous Gaussian field satisfies the finite pth $C (S)$ -moment condition, cf. Landau and Shepp (1970), it is possible to prove a reverse of Proposition 1 for continuously differentiable Gaussian fields. Moreover, finite pth $C (S)$ -moment conditions are often assumed, when uniform consistency of estimates of the covariance function are required, e.g., Hall et al. (2006) and Li et al. (2010).

3.1. Asymptotic SCBs for the one and two sample case

As an application of Theorem 2 we derive SCBs for the population mean and the difference of population means in one and two sample scenarios of the signal-plus-noise model. We base our assumptions on the $(L^{2}, δ)$ -Lipschitz property.

Theorem 3 (Asymptotic SCBs for the Signal-plus-noise Model).

Let $Y_{1}, \dots, Y_{N} \overset{i.i.d.}{\sim} Y$ be a sample of model (8) and assume Z is an $(L^{2}, δ)$ -Lipschitz field. Define Y(s) = (Y₁(s), … , Y_N(s)).

(i) The estimators
${\hat{μ}}_{N} (s) = \bar{Y} (s) = \frac{1}{N} \sum_{n = 1}^{N} Y_{n} (s), {\hat{σ}}_{N}^{2} (s) = {\hat{var}}_{N} [Y (s)] = \frac{1}{N - 1} \sum_{n = 1}^{N} {(Y_{n} (s) - \bar{Y} (s))}^{2},$
fulfill the conditions (E1–2) with $τ_{N} = \sqrt{N}$ , θ = μ, ς = σ and $\mathfrak{r} = \mathfrak{c}$ .
(ii) Let ${\hat{L}}_{1}, \dots, {\hat{L}}_{D}$ be consistent estimators of the LKCs of G. Additionally, assume that the $(L^{2}, δ)$ -Lipschitz field Z has almost surely $C^{3}$ -sample paths, that all partial derivatives up to order 3 of Z are $(L^{2}, δ)$ -Lipschitz and that G fulfills the non-degeneracy condition (G2). Then the accuracy result of Theorem 2 holds true for the SCBs
$S C B (s, {\hat{q}}_{α, N}) = {\hat{μ}}_{N} (s) \pm {\hat{q}}_{α, N} \frac{{\hat{σ}}_{N} (s)}{\sqrt{N}}$
with ${\hat{q}}_{α, N}$ given in Theorem 1.

Remark 4.

Under the assumptions on Z and G in (ii), the warping estimator (Taylor and Worsley, 2007), the Hermite Projection estimator (Telschow et al., 2020) and the estimators discussed in Section 3.2 are all consistent estimators of the LKCs of G. Hence the first assumption in (ii) is not restrictive.
A simple condition on Z to ensure that G fulfills the non-degeneracy condition (G2) is that for all d, l ∈ {1, … , D} we have that cov [(Z^(l)(s), Z^(d,l)(s))] has full rank for all s ∈ S. A proof is provided in Lemma 13 in the Appendix.

Theorem 4 (Asymptotic SCBs for Difference of Means of Two Signal-plus-noise Models).

Let $Y_{1}, \dots, Y_{N} \overset{i.i.d.}{\sim} Y$ and $X_{1}, \dots, X_{M} \overset{i.i.d.}{\sim} X$ be independent samples, where

Y (s) = μ_{Y} (s) + σ_{Y} (s) Z_{Y} (s) and X (s) = μ_{X} (s) + σ_{X} (s) Z_{X} (s),

with Z_Y, Z_X both $(L^{2}, δ)$ -Lipschitz fields and assume that c = lim_N,M→∞ N/M. Then

(i) Condition (E1) is satisfied, i.e.
$\frac{\sqrt{N + M - 2} (\bar{Y} - \bar{X} - μ_{Y} + μ_{X})}{\sqrt{(1 + c^{- 1}) {\hat{σ}}_{N} {(Y)}^{2} + (1 + c) {\hat{σ}}_{N} {(X)}^{2}}} \overset{N, M \to \infty}{\Rightarrow} G = \frac{\sqrt{1 + c^{- 1}} σ_{Y} G_{Y} - \sqrt{1 + c} σ_{X} G_{X}}{\sqrt{(1 + c^{- 1}) σ_{Y}^{2} + (1 + c) σ_{X}^{2}}},$
where G_Y, G_X are Gaussian fields with the same covariance structures as Z_Y and Z_X, respectively, and the denominator converges uniformly almost surely, i.e. condition (E2) is satisfied.
(ii) If there are consistent estimators of the LKCs L_d of G and Z_X, Z_Y have $C^{3}$ -sample paths, fulfill the non-degeneracy condition (G2) and all their partial derivatives are $(L^{p}, δ)$ -Lipschitz fields with finite $C (S)$ -variances, then the accuracy result of Theorem 2 holds for the sets
$S C B (s, {\hat{q}}_{α, N}) = \bar{Y} (s) - \bar{X} (s) \pm {\hat{q}}_{α, N} \frac{\sqrt{(1 + c^{- 1}) {\hat{σ}}_{N} {(Y)}^{2} + (1 + c) {\hat{σ}}_{N} {(X)}^{2}}}{\sqrt{N + M - 2}}$
with ${\hat{q}}_{α, N}$ given in Theorem 1.

3.2. Estimation of LKCs

For D ≤ 2, we can obtain consistent estimators of the LKCs by directly implementing their definitions. These estimators are conceptually easier to understand than existing estimators (Taylor and Worsley, 2007; Telschow et al., 2020). Additionally, they allow for simple proofs of their consistency and fulfill a CLT.

For a zero-mean, unit-variance Gaussian field G, the LKCs L_d(S, G) are intrinsic volumes of S with respect to the pseudo-Riemannian metric Λ given in standard coordinates of $ℝ^{D}$ as

Λ_{d l} (G, s) = Λ_{d l} (s) = cov [G^{(d)} (s), G^{(l)} (s)], d, l = 1, \dots, D .

(10)

Using this notation the general expression of the LKCs (Adler and Taylor, 2009, Definition 10.7.2) for D = 1 yields

L_{1} (S, G) = {vol}_{1} (S, Λ) = \int_{S} \sqrt{var [\frac{d G}{d s} (s)]} d s .

(11)

In the case of $S \subset ℝ^{2}$ with piecewise $C^{2}$ -boundary ∂S parametrized by a piecewise $C^{2}$ -function $γ : [0, 1] \to ℝ^{2}$ the LKCs are given by

\begin{aligned} L_{1} &= \frac{1}{2} length(\partial S, Λ) = \frac{1}{2} \int_{0}^{1} \sqrt{\frac{d γ^{T}}{d t} (t) Λ (γ (t)) \frac{d γ}{d t} (t)} d t, \\ L_{2} &= {vol}_{2} (S, Λ) = \int_{S} \sqrt{det (Λ (s))} d s_{1} d s_{2} . \end{aligned}

(12)

An application of the chain rule shows that L₁ is independent of the parametrization γ of ∂S.

These formulas allow us to estimate the unknown LKCs from a sample Y₁, … , Y_N ~ Y of model (8). The estimators are built on the normalized residuals

R_{n} (s) = (Y_{n} (s) - \bar{Y} (s)) / {\hat{σ}}_{N} (s), s \in S, n = 1, \dots, N .

(13)

To abbreviate the following formulas we define R(s) = (R₁(s), … , R_N(s)). In view of Eq. (11) a natural estimator of L₁ for D = 1 is given by

{\hat{L}}_{1}^{N} = \int_{0}^{1} \sqrt{\hat{Var} [\frac{d}{d s} (R (s))]} d s,

(14)

and Eqs. (12) suggest the estimators

\begin{aligned} {\hat{L}}_{1}^{N} &= \frac{1}{2} \int_{0}^{1} \sqrt{\frac{d γ^{T}}{d t} (t) {\hat{Λ}}_{N} (γ (t)) \frac{d γ}{d t} (t)} d t, \\ {\hat{L}}_{2}^{N} &= \int_{S} \sqrt{det ({\hat{Λ}}_{N} (s))} d s_{1} d s_{2}, \end{aligned}

(15)

for D = 2. Here ${\hat{Λ}}_{N} (s) = {\hat{var}}_{N} [\nabla R (s)]$ is the empirical covariance matrix of the sample ∇R₁, … , ∇R_N consisting of the gradients of the normalized residuals. In order to prove consistency of the LKC estimates, we establish that ${\hat{Λ}}_{N}$ converges uniformly almost surely to Λ.
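For D = 1 the estimator (14) can be sketched in a few lines. The version below assumes curves observed on a fine, sorted grid, replaces the derivative by forward differences and the integral by a Riemann sum; these discretization choices and the function name `lkc1_estimate` are assumptions for illustration and not the implementation in the SIRF package. The toy field a·cos(2s) + b·sin(2s) with independent standard normal a, b has covariance cos(2(s − s′)), so the derivative has variance 4 and L₁ = 2 on [0, 1].

```python
import math
import random

def lkc1_estimate(samples, grid):
    """Estimate L1 for D = 1 from normalized residuals, cf. Eqs. (13) and (14).

    Derivatives are replaced by forward differences and the integral by a
    Riemann sum, so the grid is assumed to be fine and sorted.
    """
    n = len(samples)
    # Normalized residuals R_n(s) = (Y_n(s) - Y_bar(s)) / sigma_hat(s) per grid point.
    resid = []
    for col in zip(*samples):
        mean = sum(col) / n
        sd = math.sqrt(sum((y - mean) ** 2 for y in col) / (n - 1))
        resid.append([(y - mean) / sd for y in col])
    total = 0.0
    for k in range(len(grid) - 1):
        h = grid[k + 1] - grid[k]
        # Forward difference of each residual curve; its sample mean is zero
        # because the residuals sum to zero at every grid point.
        diffs = [(b - a) / h for a, b in zip(resid[k], resid[k + 1])]
        var = sum(d * d for d in diffs) / (n - 1)
        total += math.sqrt(var) * h
    return total

# Toy usage: simulated curves with known L1 = 2 on [0, 1].
rng = random.Random(0)
grid = [k / 100 for k in range(101)]
curves = []
for _ in range(1000):
    a, b = rng.gauss(0, 1), rng.gauss(0, 1)
    curves.append([a * math.cos(2 * s) + b * math.sin(2 * s) for s in grid])
l1_hat = lkc1_estimate(curves, grid)
```

The estimate can then be plugged into Eq. (6) together with L₀ = χ(S) = 1 for an interval.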

Theorem 5.

If Z and Z^(d) for d = 1, … , D are $(L^{2}, δ)$ -Lipschitz fields, then

ℙ (lim_{N \to \infty} {\hat{Λ}}_{N} = Λ) = 1.

If Z and Z^(d) for d = 1, … , D are even $(L^{4}, δ)$ -Lipschitz, we obtain

\sqrt{N} (ι ({\hat{Λ}}_{N}) - ι (Λ)) \overset{N \to \infty}{\Rightarrow} G (0, \mathfrak{t})

with $ι : Sym (2) \to ℝ^{3}$ mapping

(\begin{matrix} a & b \\ b & c \end{matrix}) \mapsto (a, b, c)

and the matrix valued covariance function $\mathfrak{t} : S \times S \to Sym (3)$ is given componentwise by

\mathfrak{t}_{d l} (s, s^{'}) = cov [Z^{(d)} (s) Z^{(l)} (s), Z^{(d)} (s^{'}) Z^{(l)} (s^{'})]

(16)

for all s, s′ ∈ S and d, l = 1, … , D.

This theorem is the backbone in the proof of consistency and the CLT for the estimators (14) and (15).

Theorem 6 (Consistency of LKCs).

Under the setting of Theorem 3(ii) it follows for d = 1, 2 that

ℙ (lim_{N \to \infty} {\hat{L}}_{d}^{N} = L_{d}) = 1.

Remark 5.

Our proposed sample estimator of Λ through residuals is only one viable possibility. In general consistent estimation of the LKCs can be achieved by any estimator of Λ, which converges uniformly almost surely to Λ. Especially, this means that it is not necessary to observe complete fields as our residual approach suggests.

Theorem 7 (CLT for LKCs).

Assume Z and Z^(d) for d = 1, 2 are $(L^{4}, δ)$ -Lipschitz fields. Then, if $S \subset ℝ$ ,

\sqrt{N} ({\hat{L}}_{1}^{N} - L_{1}) \to \frac{1}{2} \int_{S} \frac{G (s)}{\sqrt{Λ (s)}} d s,

and, if $S \subset ℝ^{2}$ , we obtain

\sqrt{N} (({\hat{L}}_{1}^{N}, {\hat{L}}_{2}^{N}) - (L_{1}, L_{2})) \to (\frac{1}{2} \int_{0}^{1} \frac{1}{\sqrt{γ^{'} {(s)}^{T} Λ (γ (s)) γ^{'} (s)}} tr (Λ (γ (s)) ι (G (γ (s)))) d s, \int_{S} \frac{1}{\sqrt{det (Λ (s))}} tr (Λ (s) ι (diag (1, - 1, 1) G (s))) d s),

where G(s) is a Gaussian field with zero-mean and covariance function $\mathfrak{t}$ as in Theorem 5 and γ is a parametrization of the boundary ∂S.

Corollary 1.

Assume additionally to the Assumptions of Theorem 7 that Z is Gaussian with covariance function $\mathfrak{c}$ and $S \subset ℝ$ . Then, we have the simplified representation

\sqrt{N} ({\hat{L}}_{1}^{N} - L_{1}) \overset{N \to \infty}{\Rightarrow} N (0, τ^{2}),

where

τ^{2} = \frac{1}{2} \int_{S} \int_{S} \frac{\dot{\mathfrak{c}} {(s, s^{'})}^{2}}{\sqrt{\dot{\mathfrak{c}} (s, s) \dot{\mathfrak{c}} (s^{'}, s^{'})}} d s d s^{'} \quad \text{with} \quad \dot{\mathfrak{c}} (s, s^{'}) = \frac{\partial^{2} \mathfrak{c}}{\partial s \partial s^{'}} (s, s^{'}) .

Estimation of LKCs in the scenario of two independent samples can be achieved along the same lines as in the one sample case. Here the independence of the samples implies that the covariance matrix Λ of the partial derivatives of the limiting field in Theorem 4(i) splits into a sum of covariance matrices depending on G_X and G_Y, i.e.,

Λ (G) = Λ (\frac{\sqrt{1 + c^{- 1}} σ_{Y} G_{Y}}{\sqrt{(1 + c^{- 1}) σ_{Y}^{2} + (1 + c) σ_{X}^{2}}}) + Λ (\frac{\sqrt{1 + c} σ_{X} G_{X}}{\sqrt{(1 + c^{- 1}) σ_{Y}^{2} + (1 + c) σ_{X}^{2}}}) .

Under the assumptions of Theorem 4(ii) these summands can be separately consistently estimated using the respective normalized residuals

R_{n}^{Y} = \frac{\sqrt{1 + M / N} \cdot (Y_{n} - \bar{Y})}{\sqrt{(1 + M / N) {\hat{σ}}_{N}^{2} (Y) + (1 + N / M) {\hat{σ}}_{N}^{2} (X)}}

R_{n}^{X} = \frac{\sqrt{1 + N / M} \cdot (X_{n} - \bar{X})}{\sqrt{(1 + M / N) {\hat{σ}}_{N}^{2} (Y) + (1 + N / M) {\hat{σ}}_{N}^{2} (X)}}

and the sum of these estimators is a uniformly almost surely consistent estimator of Λ(G). Thus it yields a consistent estimator of the LKCs in the two sample scenario.

3.3. Discrete sampling and additive noise: A scale space approach

In applications data of a signal-plus-noise model is usually observed on a discrete grid with possible additive measurement noise yielding the model

Y (s_{p}) = μ (s_{p}) + σ (s_{p}) Z (s_{p}) + ε (s_{p}), for p = 1, \dots, P,

(17)

where $\mathbf{S} = (s_{1}, \dots, s_{P}) \in S^{P} \subset ℝ^{D \times P}$ is a random (or deterministic) sampling of the domain S, ε is a random field on S with covariance function $\mathfrak{e}$ and finite second $C (S)$ -moment representing the observation noise, and ε, Z and $\mathbf{S}$ are mutually independent.

A generic way to estimate μ and perform inference on it is via local polynomial smoothers. Degras (2011) discussed construction of SCBs for these estimators. Since he proved that local linear smoothers under certain conditions similar to ours satisfy a fCLT, it is easy to extend our previous results of the signal plus noise model to his setup. However, there is no satisfactory solution to the question of choosing a data-driven bandwidth for finite sample sizes. Therefore, we consider scale spaces and their SCBs for inference on the population mean μ instead. This concept was originally introduced for regression in Chaudhuri and Marron (1999, 2000). Our goal is to construct SCBs for the smoothed population mean simultaneously across different bandwidths. For simplicity we restrict ourselves to the Priestley–Rao smoothing estimator. Extensions to local polynomial or other appropriate linear smoothers are possible.

Definition 4 (Scale Space Field).

We define the Scale Space field with respect to a continuous kernel function $K : \tilde{S} \times [h_{0}, h_{1}] \to ℝ$ with $\tilde{S} \supset S$ and 0 < h₀ < h₁ < ∞ corresponding to Model (17) as

\tilde{Y} (s, h) = \frac{1}{P} \sum_{p = 1}^{P} Y (s_{p}) K (s - s_{p}, h)

with mean

\tilde{μ} (s, h) = \frac{1}{P} \sum_{p = 1}^{P} μ (s_{p}) K (s - s_{p}, h) .
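As an illustration of Definition 4, the following sketch evaluates the scale space field of one discretely observed curve on a grid of locations and bandwidths. The function names, the one-dimensional domain and the choice of a Gaussian density as kernel K are assumptions made for this example only.

```python
import numpy as np

def gaussian_kernel(x, h):
    # Gaussian density with standard deviation h (assumed normalization).
    return np.exp(-0.5 * (x / h) ** 2) / (h * np.sqrt(2.0 * np.pi))

def scale_space_field(y, s_obs, s_eval, bandwidths, kernel=gaussian_kernel):
    """Evaluate (1/P) * sum_p y(s_p) K(s - s_p, h) for all pairs (s, h).

    Returns an array of shape (len(s_eval), len(bandwidths)).
    """
    y = np.asarray(y, dtype=float)
    s_obs = np.asarray(s_obs, dtype=float)
    s_eval = np.asarray(s_eval, dtype=float)
    diff = s_eval[:, None, None] - s_obs[None, None, :]       # shape (S, 1, P)
    h = np.asarray(bandwidths, dtype=float)[None, :, None]   # shape (1, H, 1)
    weights = kernel(diff, h)                                 # shape (S, H, P)
    return (weights * y).mean(axis=-1)                        # average over p

P = 200
s_obs = (np.arange(1, P + 1) - 0.5) / P   # design points as in Remark 6
y = np.ones(P)                            # a constant curve
field = scale_space_field(y, s_obs, np.array([0.5]), [0.02, 0.05])
```

For a constant curve and an interior location the smoothed value stays close to the constant, because the average over the design points approximates the integral of the kernel.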

In order to apply the previously presented theory we have to obtain first a functional CLT. The version we present is similar to Theorem 3.2 of Chaudhuri and Marron (2000) with the difference that we consider the limit with respect to the number of observed curves and include the case of possibly having the (random) measurement points depend on the number of samples, too. The regression version in Chaudhuri and Marron (2000) treats the limit of observed measurement points going to infinity for one noisy observed function.

Theorem 8.

Let $Y_{1}, \dots, Y_{N} \overset{i.i.d.}{\sim} Y$ be a sample from Model (17), where P is allowed to depend on N, and assume that Z has finite second $C (S)$ -moment. Further assume max_s∈S σ(s) ≤ B < ∞ and

r ((s, h), (s^{'}, h^{'})) = lim_{N \to \infty} \frac{1}{P^{2}} \sum_{p, p^{'} = 1}^{P} E [(σ (s_{p}) σ (s_{p^{'}}) c (s_{p}, s_{p^{'}}) + e (s_{p}, s_{p^{'}})) K (s - s_{p}, h) K (s^{'} - s_{p^{'}}, h^{'})]

(18)

exists for all (s, h), $(s^{'}, h^{'}) \in S \times H$ , where the expectation is w.r.t. the sampling distribution of S = (s₁, … , s_P). Finally, assume that the smoothing kernel (s, h) ↦ K(s, h) is α-Hölder continuous. Then in $C (S \times H)$

\sqrt{N} (N^{- 1} \sum_{n = 1}^{N} {\tilde{Y}}_{n} (s, h) - E [\tilde{μ} (s, h)]) \overset{N \to \infty}{\Rightarrow} G (0, r) .

Remark 6.

Suppose S ⊂ [0, 1] is non-random. If P is independent of N, s_p = (p − 0.5)/P for p = 1, … , P for all n = 1, … , N and $ε (s_{1}), \dots, ε (s_{P}) \overset{i.i.d.}{\sim} N (0, η^{2})$ , then Assumption (18) is trivially satisfied. If instead P → ∞ as N → ∞, it is sufficient that the following integral exists and is finite for all (s, h), $(s^{'}, h^{'}) \in S \times H$

r ((s, h), (s^{'}, h^{'})) = \int_{S} \int_{S} (σ (τ) σ (τ^{'}) c (τ, τ^{'}) + e (τ, τ^{'})) K (s - τ, h) K (s^{'} - τ^{'}, h^{'}) d τ d τ^{'},

in order to have Assumption (18) satisfied. For example, this is the case if $c$ and $e$ are continuous.

In order to use Theorem 2 it remains to show that the LKCs can be consistently estimated and the assumptions of the GKF from Section 2.3 are satisfied.

Proposition 2.

Under the setting of Theorem 8 assume additionally that the kernel $K \in C^{3} (S \times H)$ . Define

\hat{\tilde{μ}} (s, h) = \frac{1}{N} \sum_{n = 1}^{N} {\tilde{Y}}_{n} (s, h) \quad \text{and} \quad \hat{\tilde{σ}} (s, h) = \sqrt{\frac{1}{N} \sum_{n = 1}^{N} {({\tilde{Y}}_{n} (s, h) - \hat{\tilde{μ}} (s, h))}^{2}}

and assume that $r$ has continuous partial derivatives up to order 3 and the covariance matrices of

lim_{N \to \infty} \frac{1}{P} \sum_{p = 1}^{P} (σ (s_{p}) Z (s_{p}) + ε (s_{p})) (\frac{\partial K (s - s_{p}, h)}{\partial x}, \frac{\partial^{2} K (s - s_{p}, h)}{\partial s \partial h}), x = s \text{ or } h,

have rank 2. Then $\hat{\tilde{μ}} (s, h)$ and $\hat{\tilde{σ}} (s, h)$ satisfy Assumptions (L), (E2) and (G3). Thus, all assumptions of Theorem 2 are satisfied.

Remark 7.

Assume the situation of Remark 6. Then the assumption that $r$ has continuous partial derivatives up to order 3 follows directly from the assumption $K \in C^{3} (S \times H)$ .

4. Simulations

All simulations are based on 5000 Monte Carlo simulations in order to estimate the covering rate of the SCBs. We compare SCBs constructed using the tGKF, the non-parametric bootstrap-t (Boots-t) and the multiplier-t with Gaussian (gMult-t) or Rademacher (rMult-t) multipliers. For the Gaussian simulations we include the fast and fair bands (ffscb) from Liebl and Reimherr (2019). Bootstrap methods are based on 5000 bootstrap replicates.
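To judge whether a simulated covering rate is compatible with the nominal level, it helps to know the Monte Carlo error. The sketch below computes twice the standard error of a Bernoulli proportion; the function name is hypothetical, and combining p = 0.95 with 5000 replications is an assumption about how the reference bands in the figures are obtained.

```python
import math

def two_standard_errors(p=0.95, n_sim=5000):
    # Twice the standard error of the mean of n_sim Bernoulli(p) draws.
    return 2.0 * math.sqrt(p * (1.0 - p) / n_sim)

half_width = two_standard_errors()
print(round(half_width, 4))  # 0.0062
```

Under these assumptions, simulated covering rates between roughly 0.944 and 0.956 are within Monte Carlo error of the nominal 95%.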

4.1. Coverage: Smooth Gaussian case

This first set of simulations deals with the most favorable case for the tGKF method. We simulate samples from the following smooth signal-plus-noise models (8)

Model A (1 D) : Y^{A} (s) = sin (8 π s) exp (- 3 s) + \frac{{(0.6 - s)}^{2} + 1}{6} \cdot \frac{a^{T} K^{A} (s)}{‖ K^{A} (s) ‖}, s \in [0, 1]

Model B (1 D) : Y^{B} (s) = sin (8 π s) exp (- 3 s) + \frac{{(0.6 - s)}^{2} + 1}{6} \cdot \frac{b^{T} K^{B} (s)}{‖ K^{B} (s) ‖}, s \in [0, 1]

Model C (2 D) : Y^{C} (s) = s_{1} s_{2} + \frac{s_{1} + 1}{s_{2}^{2} + 1} \cdot \frac{c^{T} K^{C} (s)}{‖ K^{C} (s) ‖}, s = (s_{1}, s_{2}) \in {[0, 1]}^{2}

with $a \sim N (0, I_{7 \times 7})$ , $b \sim N (0, I_{21 \times 21})$ and $c \sim N (0, I_{36 \times 36})$ . The vector K^A(s) has entries $K_{i}^{A} (s) = \binom{6}{i} s^{i} {(1 - s)}^{6 - i}$ , the (i, 6)-th Bernstein polynomial for i = 0, … , 6, K^B(s) has entries $K_{i}^{B} (s) = exp (- \frac{{(s - x_{i})}^{2}}{2 h_{i}^{2}})$ with x_i = i/21, h_i = 0.04 for i < 10, h₁₁ = 0.2 and h_i = 0.08 for i > 10, and K^C(s) is the vector of all entries from the 6 × 6-matrix $K_{i j} (s) = exp (- \frac{{‖ s - x_{i j} ‖}^{2}}{2 h^{2}})$ with x_ij = (i, j)/6 and h = 0.06. Examples of sample paths of the signal-plus-noise models and of the error fields are shown in the top two rows of Fig. 1.

We simulated samples from Model A and B on an equidistant grid of 200 points of [0, 1]. Model C was simulated on an equidistant grid with 50 points in each dimension.
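For concreteness, a minimal sketch of how samples from Model A could be generated is given below. The function names are hypothetical, and the grid is assumed to include both endpoints of [0, 1]; the mean, the standard deviation function and the Bernstein basis are taken from the model definition above.

```python
import numpy as np
from math import comb

def bernstein_basis(s, degree=6):
    # Matrix with entries C(degree, i) * s^i * (1 - s)^(degree - i).
    i = np.arange(degree + 1)
    binom = np.array([comb(degree, k) for k in i], dtype=float)
    return binom * s[:, None] ** i * (1.0 - s[:, None]) ** (degree - i)

def simulate_model_a(n_samples, n_grid=200, rng=None):
    """Draw n_samples curves from Model A on an equidistant grid of [0, 1]."""
    rng = np.random.default_rng() if rng is None else rng
    s = np.linspace(0.0, 1.0, n_grid)
    K = bernstein_basis(s)                                 # shape (n_grid, 7)
    K_unit = K / np.linalg.norm(K, axis=1, keepdims=True)  # unit-norm rows
    mu = np.sin(8 * np.pi * s) * np.exp(-3 * s)            # population mean
    sigma = ((0.6 - s) ** 2 + 1.0) / 6.0                   # pointwise sd
    a = rng.standard_normal((n_samples, K.shape[1]))       # a ~ N(0, I_7)
    return s, mu + sigma * (a @ K_unit.T)

s, Y = simulate_model_a(50, rng=np.random.default_rng(1))
```

Because the basis vector is normalized to unit length at every location, the error field has unit pointwise variance, so the factor in front of it is exactly the standard deviation function of the model.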

According to the results of this simulation, shown in the bottom row of Fig. 1, the tGKF and the rMult-t method perform best since they are close to the nominal level for all considered sample sizes; ffscb converges to the nominal level, too. Table 1 summarizes the computation times of the compared methods. It shows that the tGKF method not only provides correct covering rates, but is also approximately 20 times faster than its competitors. That ffscb has similar computation times as the bootstrap methods is surprising. Personal communication with the authors suggests that their R-code needs to be optimized.

Table 1.

Comparison of the runtime for different methods to construct SCBs. The reported runtime is the average over 1000 calls to construct an SCB for a sample of size N = 50 from Model A in 1D and Model C in 2D.

	tGKF	ffscb	Boots-t	gMult-t	rMult-t
Model A (1D)	4.2 s	115.2 s	88.4 s	86.6 s	83.9 s
Model C (2D)	92.8 s	-	2039.6 s	1858.4 s	1864.9 s


4.2. Coverage: Smooth non-Gaussian case

The tGKF method is based on a formula valid for Gaussian fields only. However, we have seen in Section 2.4 that under appropriate conditions the covering is expected to be good asymptotically even if this condition is not fulfilled. For the simulations here we use Model A with the only change that a has i.i.d. entries distributed as a Student’s $t_{3} / \sqrt{3}$ random variable, which is non-Gaussian but still symmetric. A second simulation study tackles asymmetry of the noise distribution and increasing similarity to Gaussianity. Here we use Model B where b has i.i.d. entries distributed as $(χ_{ν}^{2} - ν) / \sqrt{2 ν}$ random variables for different values of ν. These random variables are non-symmetric, but for ν → ∞ they converge to a standard normal.
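The two non-Gaussian coefficient distributions can be sampled as sketched below; the function names are hypothetical. Both are standardized: a t₃ variable has variance 3, and a χ²_ν variable has mean ν and variance 2ν. The skewness of the standardized χ²_ν variable is √(8/ν), which quantifies how the asymmetry vanishes as ν grows.

```python
import numpy as np

def standardized_t3(size, rng):
    # Student's t with 3 degrees of freedom, rescaled to unit variance.
    return rng.standard_t(3, size=size) / np.sqrt(3.0)

def standardized_chi2(nu, size, rng):
    # Chi-square with nu degrees of freedom, centered and scaled to unit variance.
    return (rng.chisquare(nu, size=size) - nu) / np.sqrt(2.0 * nu)

def chi2_skewness(nu):
    # Theoretical skewness of the standardized chi-square variable.
    return np.sqrt(8.0 / nu)

rng = np.random.default_rng(2)
a = standardized_t3(7, rng)            # coefficients for the modified Model A
b = standardized_chi2(7, 21, rng)      # coefficients for Model B with nu = 7
```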

Fig. 2 shows that in the symmetric case of Model A the tGKF has some overcoverage for small sample sizes, but eventually reaches the targeted region for 95% coverage. For the particular process used in this simulation, ffscb seems to converge faster to the nominal level than the tGKF SCBs. A possible explanation is that the slight undercovering, which was present in the Gaussian case, is compensated by the non-Gaussianity. A careful comparison of these cases will be part of future research. For this symmetric non-Gaussian distribution the rMult-t still works well. This is probably because it preserves all moments up to the fourth. The case of non-symmetric distributions usually produces undercovering for the tGKF. However, as predicted by Theorem 3, for large N it eventually gets close to the targeted covering. In this case, the bootstrap-t converges faster to the correct covering rate because it does not require symmetry (see Fig. 3).

Fig. 3. — Simulation results for smooth non Gaussian fields (Model B). **Left:** samples from the error fields. **Right:** simulated covering rates. The solid black line is the targeted level of the SCBs and the dashed black line is twice the standard error for a Bernoulli random variable with p = 0.95.

4.3. Average width and variance of different SCBs

An important feature of a confidence band is its pointwise width averaged over the domain, together with the variance of this average width over repeated experiments. It is preferable to have the smallest possible width that still has the correct coverage. Moreover, the width of an SCB should remain stable, meaning that its variance should be small.

Here we focus on the estimation of the quantile ${\hat{q}}_{N, α}$ , which for all methods except for ffscb is closely related to the width of the confidence band averaged over the domain. We decided to present this estimate, since it can be compared to its theoretical value and is easier to interpret. We simulate estimates of the quantile ${\hat{q}}_{N, α}$ obtained from different methods for the Gaussian Model B and the non-Gaussian Model B with ν = 7. The optimal quantile q_α is simulated using a Monte-Carlo simulation with 50,000 replications. Since the ffscb method from the R-package ffscb is not based on the quantile q_α we instead compute for this method

{\hat{q}}_{N, α}^{ffscb} (s) = \sqrt{N} \frac{ffscb.up (s) - ffscb.low (s)}{2 {\hat{σ}}_{N} (s)}

and report its average value over S. This quantity is comparable to the quantile ${\hat{q}}_{N, α}$ from the other methods and closely related to the average width of the ffscb.
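The conversion from band limits to an implied quantile can be sketched as follows. The function name and argument names are hypothetical; the inputs are assumed to be arrays holding the upper and lower band limits and the pointwise sample standard deviation on a common equidistant grid, so that the grid mean approximates the average over S.

```python
import numpy as np

def implied_quantile(upper, lower, sigma_hat, n_samples):
    # Pointwise implied quantile, then its average over the grid.
    q_s = np.sqrt(n_samples) * (upper - lower) / (2.0 * sigma_hat)
    return q_s.mean()

# A band of the form mean +/- q * sigma / sqrt(N) returns q itself.
grid = np.linspace(0.0, 1.0, 101)
sigma_hat = 1.0 + grid**2
center = np.sin(grid)
N, q = 50, 3.1
upper = center + q * sigma_hat / np.sqrt(N)
lower = center - q * sigma_hat / np.sqrt(N)
q_back = implied_quantile(upper, lower, sigma_hat, N)
```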

We compare ffscb from the R-package ffscb, Degras’ asymptotic method from the R-package SCBmeanfd, SCBs based on the GKF and the tGKF, the non-parametric bootstrap-t, the multiplier-t, the non-parametric bootstrap and a simple multiplier bootstrap. The latter two methods use the variance of the original sample instead of its bootstrapped version; see Appendix A.

Tables 2 and 3 show the simulation results. We can draw two main conclusions from these tables. First, the tGKF method has the smallest standard deviation among the compared methods in the quantile while still having good coverage (at least asymptotically in the non-Gaussian case). Second, the main competitor – the bootstrap-t – has a much higher standard error in the width, but its coverage converges faster in the highly non-Gaussian and asymmetric case. This shows that the tGKF method is the most stable among the compared methods.

Table 2.

Bias of average estimate of ${\hat{q}}_{0.05, N}$ and twice its standard error for different methods of SCBs and the Gaussian Model B. The simulations are based on 1000 Monte Carlo simulations.

N	10	20	30	50	100	150
true	4.118	3.382	3.211	3.081	2.993	2.960
tGKF	−0.140 ± 0.033	−0.016 ± 0.011	−0.008 ± 0.007	+0.003 ± 0.004	+0.007 ± 0.002	+0.013 ± 0.001
GKF	−1.190 ± 0.016	−0.459 ± 0.008	−0.289 ± 0.005	−0.159 ± 0.003	−0.072 ± 0.002	−0.039 ± 0.001
Degras	−1.252 ± 0.021	−0.500 ± 0.011	−0.324 ± 0.007	−0.191 ± 0.005	−0.100 ± 0.003	−0.063 ± 0.003
Boots	−1.427 ± 0.023	−0.569 ± 0.012	−0.360 ± 0.010	−0.202 ± 0.007	−0.096 ± 0.005	−0.058 ± 0.004
Boots-t	+1.701 ± 0.604	+0.226 ± 0.050	+0.083 ± 0.019	+0.027 ± 0.009	+0.006 ± 0.005	+0.010 ± 0.004
gMult	−1.121 ± 0.159	−0.460 ± 0.031	−0.297 ± 0.016	−0.168 ± 0.009	−0.080 ± 0.005	−0.048 ± 0.004
gMult-t	−0.688 ± 0.046	−0.292 ± 0.017	−0.200 ± 0.011	−0.121 ± 0.007	−0.060 ± 0.005	−0.035 ± 0.004
rMult	−1.428 ± 0.179	−0.616 ± 0.032	−0.403 ± 0.015	−0.232 ± 0.009	−0.111 ± 0.005	−0.067 ± 0.004
rMult-t	−0.018 ± 0.122	−0.000 ± 0.023	−0.009 ± 0.013	−0.003 ± 0.008	+0.001 ± 0.005	+0.005 ± 0.004
ffscb	−0.530 ± 0.055	−0.181 ± 0.027	−0.111 ± 0.018	−0.057 ± 0.013	−0.022 ± 0.009	−0.048 ± 0.007


Table 3.

Bias of average estimate of ${\hat{q}}_{0.05, N}$ and twice its standard error for different methods of SCBs and the non-Gaussian Model B with ν = 7. The simulations are based on 1000 Monte Carlo simulations.

N	10	20	30	50	100	150
true	4.532	3.628	3.373	3.190	3.048	3.004
tGKF	−0.558 ± 0.040	−0.262 ± 0.014	−0.170 ± 0.008	−0.106 ± 0.004	−0.048 ± 0.002	−0.031 ± 0.001
GKF	−1.606 ± 0.019	−0.705 ± 0.010	−0.451 ± 0.006	−0.268 ± 0.004	−0.127 ± 0.002	−0.083 ± 0.001
Degras	−1.666 ± 0.022	−0.749 ± 0.011	−0.488 ± 0.008	−0.301 ± 0.005	−0.156 ± 0.003	−0.108 ± 0.003
Boots	−1.835 ± 0.025	−0.809 ± 0.013	−0.516 ± 0.010	−0.307 ± 0.007	−0.147 ± 0.005	−0.099 ± 0.004
Boots-t	+1.618 ± 0.699	+0.218 ± 0.098	+0.080 ± 0.042	+0.023 ± 0.016	+0.006 ± 0.006	+0.001 ± 0.004
gMult	−1.471 ± 0.205	−0.703 ± 0.036	−0.457 ± 0.018	−0.278 ± 0.009	−0.135 ± 0.005	−0.091 ± 0.004
gMult-t	−1.104 ± 0.047	−0.544 ± 0.018	−0.371 ± 0.012	−0.240 ± 0.008	−0.128 ± 0.005	−0.090 ± 0.004
rMult	−1.763 ± 0.234	−0.870 ± 0.039	−0.574 ± 0.018	−0.352 ± 0.009	−0.175 ± 0.005	−0.116 ± 0.004
rMult-t	−0.422 ± 0.133	−0.278 ± 0.027	−0.194 ± 0.015	−0.130 ± 0.009	−0.064 ± 0.005	−0.044 ± 0.004
ffscb	−0.947 ± 0.053	−0.427 ± 0.026	−0.274 ± 0.018	−0.166 ± 0.013	−0.077 ± 0.008	−0.092 ± 0.006


4.4. The influence of observation noise

In order to study the influence of observation noise, we evaluate the dependence of the covering rate of the smoothed mean of a signal-plus-noise model with added i.i.d. Gaussian observation noise on the bandwidth used in a local linear smoother (Degras, 2011) and the standard deviation of the observation noise. For the simulations we generate samples from the Gaussian Model A and Model B on an equidistant grid of [0, 1] with 100 points and add $N (0, σ^{2})$ -distributed independent observation noise with standard deviation σ ∈ {0.02, 0.1, 0.2}. Afterwards we smooth the samples with a Gaussian kernel with bandwidths h ∈ {0.02, 0.03, 0.05, 0.1}. The smoothed curves are evaluated on an equidistant grid with 400 points. The results of these simulations are shown in Figs. 4 and 5. In most cases the nominal covering is achieved independent of N. Only for small smoothing bandwidths and a large standard deviation of the observation errors is a slight overcoverage present. This might be a problem with the estimation of the LKCs, since they depend on numerical derivatives of the sample paths, which are highly variable in these particular scenarios.

Fig. 4. — Simulation results for Gaussian fields (Model A) with observation noise. **Top row:** samples from the error fields. **Bottom row:** simulated covering rates. The solid black line is the targeted level of the SCBs and the dashed black line is twice the standard error for a Bernoulli random variable with p = 0.95.

Fig. 5. — Simulation results for Gaussian fields (Model B) with observation noise. **Top row:** samples from the error fields. **Bottom row:** simulated covering rates. The solid black line is the targeted level of the SCBs and the dashed black line is twice the standard error for a Bernoulli random variable with p = 0.95.

We also study the covering rate of the population mean of the scale space field of Model B. The generation of the samples is the same as for the previous simulations. The only difference is that instead of smoothing with one bandwidth we construct the scale space field. Here we use an equidistant grid of 100 points of the interval [0.02, 0.1]. The results can be found in Fig. 6 and show that the covering rate of the tGKF SCB is close to nominal.

Fig. 6. — Simulation results for the scale space field from the Gaussian fields of Model B with added observation noise. **Left two panels:** samples from the error fields. **Right panel:** simulated covering rates. The solid black line is the targeted level of the SCBs and the dashed black line is twice the standard error for a Bernoulli random variable with p = 0.95.

4.5. SCBs for the difference of population means of two independent samples

Since the single sample scenario was studied in great detail, we only present the case of one dimensional smooth Gaussian noise fields in the two sample scenario. Moreover, we only report the results for the tGKF approach. The previous observations regarding the other methods carry over to this scenario.

The simulations are designed as follows. We generate two samples of sizes N and M such that N/M = c ∈ {1, 2, 4}. We are interested in four different scenarios. The first scenario is the most favorable one, with both samples having the same correlation structure and the same variance function. Here we use the Gaussian Model A from Section 4.1 for both samples. In all remaining scenarios one of the samples will always be this error field. In order to check the effect of the two samples having different correlation structures, we use the Gaussian Model B from Section 4.1 as the second sample. To study the dependence on the variance while the correlation structure is the same, we change the variance function in the Gaussian Model A to σ²(s) = 0.04 for the second sample. As an example where both the correlation and the variance are different, we use the Gaussian Model B with the modification that the error field has pointwise variance σ²(s) = 0.04. The results of these simulations are shown in Fig. 7 and show that, except for very unbalanced sample sizes between the two groups, the covering rate of the tGKF SCB is close to nominal.

Fig. 7. — Simulation results for the SCBs of the difference between mean curves of two samples of smooth Gaussian fields for varying c = M/N. The solid black line is the targeted level of the SCBs and the dashed black line is twice the standard error for a Bernoulli random variable with p = 0.95.

5. Applications

5.1. DTI fibers

Our first data example (1D) concerns the impact of the eating disorder anorexia nervosa on white-matter tissue properties during adolescence in young females. The experimental setup and a different methodology to statistically analyze the data using pointwise testing and permutation tests can be found in the original article (Travis et al., 2015). The data set consists of a control group of 15 healthy subjects and 15 patients. For each subject 27 different neural fibers were extracted. The data for each fiber consists of fractional anisotropy values sampled on an equidistant grid of length 100.

In order to locate differences in the DTI fibers between the two groups, we computed for each fiber defined on the domain S = [0, 100] the two-sample 95%-SCBs for the difference of the population mean of the control group and the patients as explained in Section 3. Robustness of detected differences across scales is tested by computing the SCBs for the corresponding scale space fields, see Section 3. For the latter we used a Gaussian smoothing kernel and the considered bandwidth range $H = [1.5, 10]$ was sampled at 200 equidistant bandwidths.

The results for the three fibers for which we find significant differences are shown in Fig. 8. Our results are mostly consistent with the results from Travis et al. (2015) in the sense that both approaches find significant differences at similar locations in the right thalamic radiation and the left superior longitudinal fasciculus (SLF). Additionally, the SCBs detect significant differences in the right cingulum hippocampus. Travis et al. (2015) claim further significant findings. However, these are based on criteria that do not take simultaneous testing along the fibers into account. Therefore SCBs might not be able to detect them, since they correct for multiplicity along the fiber.

5.2. Climate data

Our second data example (2D) concerns the change in temperature over North America within the next century. The data was obtained from the North American Regional Climate Change Assessment Program (NARCCAP) project (Mearns et al., 2013). It consists of two sets of 29 spatially registered arrays of mean seasonal temperatures for summer (June–August) and winter (December–February) evaluated at a fine grid of fixed locations 0.5 degrees apart in geographic longitude and latitude over the time periods 1971–1999 and 2041–2069. Sommerfeld et al. (2018) analyzed this data set with the aim to detect regions at risk of exceeding a 2 °C temperature increase. To complement their analysis, we use their model and data processing and provide 90% SCBs for the estimated difference of mean temperatures between the two time periods assuming that the two samples have the same covariance structure.

Our results are shown in Fig. 9. From the 2 °C contour lines (yellow) of the lower bound of the SCBs we conclude that there is likely an increase of at least 2 °C in the time period 2041–2069 compared to 1971–1999 during the summer months in the region of the Rocky Mountains and the Sierra Madre Occidental mountains of Mexico, since in this region the lower bound of the SCBs is larger than 2 °C. Similarly, for mean winter temperatures, small regions around the Hudson Bay and in the Canadian Shield are at risk of an increase of more than 2 °C. The area enclosed by the 2 °C contour lines of the upper bound of the SCBs shows that during summer large areas in Canada and Alaska as well as smaller patches at the Hudson Bay and the Gulf of Mexico are not at risk of experiencing a temperature increase of more than 2 °C. For winter mean temperature there are only small regions for which we are certain that the temperature does not rise above 2 °C.

The aforementioned areas within the contour lines of the lower and upper bounds of the SCBs are similar to the COverage Probability Excursion (CoPE) sets introduced in Sommerfeld et al. (2018). CoPE sets are a pair ${\hat{A}}^{+} \subset {\hat{A}}^{-}$ of data driven random sets, for which ${\hat{A}}^{-}$ completely contains with a preset probability the true excursion set, while the set ${\hat{A}}^{+}$ is completely contained in the true excursion set.

The interior of the yellow contour lines, i.e., the set obtained by thresholding the lower bound of the SCB, is similar to ${\hat{A}}^{+}$ , while thresholding the upper bound yields a set that can be compared to ${\hat{A}}^{-}$ . In fact, we believe thresholded SCBs can be interpreted as conservative CoPE sets, explaining why our analysis is comparable to the results in Sommerfeld et al. (2018). Exploring this connection between CoPE sets and SCBs will be future research.
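Thresholding an SCB can be sketched in a few lines. The function name `threshold_scb`, the array layout and the toy numbers are hypothetical; the sketch only illustrates that the set where the lower bound exceeds the level is always contained in the set where the upper bound exceeds it.

```python
import numpy as np

def threshold_scb(lower, upper, level=2.0):
    """Threshold the lower and upper SCB limits at a fixed level.

    inner: locations where even the lower bound exceeds the level.
    outer: locations where the upper bound exceeds the level; its
           complement contains the locations where the whole band
           lies at or below the level.
    """
    inner = lower > level
    outer = upper > level
    return inner, outer

# Toy example on a 4 x 5 grid of estimated temperature differences.
estimate = np.linspace(0.5, 3.5, 20).reshape(4, 5)
half_width = 0.4
inner, outer = threshold_scb(estimate - half_width, estimate + half_width)
```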

Acknowledgments

F.T. and A.S. were partially supported by NIH grant R01EB026859. F.T. thanks the WIAS Berlin for its hospitality where parts of the research for this article were performed. We also thank Samuel Davenport for proofreading the manuscript and spotting an error in the proof of Theorem 2.

This work was partially funded by National Institute of Health, USA grant R01EB026859.

Appendix A. Bootstrap methods

Non-parametric bootstrap-t for q_α,N (Degras, 2011).

Assume that the estimators $s \mapsto {\hat{η}}_{N} (s)$ and $s \mapsto {\hat{ς}}_{N} (s)$ are obtained from a sample $Y_{1}, \dots, Y_{N} \overset{i.i.d.}{\sim} Y$ of random functions. Then the non-parametric bootstrap-t estimator of q_α,N is obtained as follows:

1. Resample from Y₁, … , Y_N with replacement to produce a bootstrap sample $Y_{1}^{*}, \dots, Y_{N}^{*}$ .
2. Compute ${\hat{η}}_{N}^{*}$ and ${\hat{ς}}_{N}^{*}$ using the sample $Y_{1}^{*}, \dots, Y_{N}^{*}$ .
3. Compute $T^{*} = {max}_{s \in S} τ_{N} | {\hat{η}}_{N}^{*} (s) - {\hat{η}}_{N} (s) | / {\hat{ς}}_{N}^{*} (s)$ .
4. Repeat steps 1 to 3 many times to approximate the conditional law $L^{*} = L (T^{*} ∣ Y_{1}, \dots, Y_{N})$ and take the (1 − α) · 100% quantile of $L^{*}$ to estimate q_α,N.
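The four steps above can be sketched as follows for the population mean, where the estimators are the pointwise sample mean and sample standard deviation and τ_N = √N is assumed. The function name and the array layout (rows are curves, columns are grid points) are hypothetical choices for this illustration.

```python
import numpy as np

def bootstrap_t_quantile(Y, alpha=0.05, n_boot=1000, rng=None):
    """Bootstrap-t estimate of the SCB quantile for the mean curve."""
    rng = np.random.default_rng() if rng is None else rng
    N = Y.shape[0]
    mu_hat = Y.mean(axis=0)
    t_stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, N, size=N)       # step 1: resample curves
        Y_star = Y[idx]
        mu_star = Y_star.mean(axis=0)          # step 2: bootstrap mean
        sd_star = Y_star.std(axis=0, ddof=1)   # step 2: bootstrap sd
        # step 3: maximum of the studentized bootstrap deviation
        t_stats[b] = np.max(np.sqrt(N) * np.abs(mu_star - mu_hat) / sd_star)
    return np.quantile(t_stats, 1.0 - alpha)   # step 4: empirical quantile

rng = np.random.default_rng(4)
Y = rng.normal(size=(30, 50))
q_boot = bootstrap_t_quantile(Y, n_boot=200, rng=rng)
```

The resulting band would be the sample mean plus or minus `q_boot` times the pointwise sample standard deviation divided by √N.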

Remark 8.

Note that the variance in the denominator is also bootstrapped, which corresponds to the standard bootstrap-t approach for confidence intervals, cf. DiCiccio and Efron (1996). This is done in order to mimic the l.h.s. in (1) and improves the small sample coverage.

According to our simulations in Section 4 this estimator works well only for large enough sample sizes, although Degras (2011) introduced it especially for small sample sizes. Moreover, it is well known that confidence intervals for finite dimensional parameters based on the bootstrap-t have highly variable end points for small sample sizes, cf., Good (2005, Section 3.3.3). Evidence that this remains the case in the functional world is provided in Tables 2 and 3.

Multiplier-t bootstrap for q_α,N.

The second bootstrap method builds on residuals and a version of the multiplier (or wild) bootstrap as introduced in Chang et al. (2017). Here we assume that we can construct residuals $R_{n}^{N}$ for n = 1, … , N satisfying $\sum R_{n}^{N} = 0$ and $\sum R_{n}^{N} (s) R_{n}^{N} (s^{'}) \to r$ as N → ∞ uniformly almost surely. For example, for the signal-plus-noise model the residuals $R_{n}^{N} = \sqrt{\frac{N}{N - 1}} (Y_{n} - {\hat{μ}}_{N})$ satisfy these conditions if the error field Z is $(L^{2}, δ)$ -Lipschitz and has finite second $C (S)$ -moment. Here, as before, ${\hat{η}}_{N} = {\hat{μ}}_{N}$ and ${\hat{ς}}_{N} = {\hat{σ}}_{N}$ denote the pointwise sample mean and the pointwise sample standard deviation. Algorithmically, the multiplier bootstrap estimate of q_α,N is obtained as follows:

1. Compute residuals $R_{1}^{N}, \dots, R_{N}^{N}$ and draw multipliers $g_{1}, \dots, g_{N} \overset{i.i.d.}{\sim} g$ with $E [g] = 0$ and var[g] = 1.
2. Compute ${\hat{σ}}_{N}^{*} (s)$ , i.e. the sample standard deviation of $g_{1} R_{1}^{N} (s), \dots, g_{N} R_{N}^{N} (s)$ .
3. Compute $T^{*} (s) = \frac{1}{\sqrt{N}} \sum_{n = 1}^{N} g_{n} \frac{R_{n}^{N} (s)}{{\hat{σ}}_{N}^{*} (s)}$ .
4. Repeat steps 1 to 3 many times to approximate the conditional law $L^{*} = L (T^{*} ∣ Y_{1}, \dots, Y_{N})$ and take the (1 − α) · 100% quantile of $L^{*}$ to estimate q_α,N.
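A minimal sketch of the multiplier-t bootstrap for the signal-plus-noise model is given below. The function name is hypothetical, and summarizing each replicate by the maximum of |T*(s)| over the grid is an assumption made in analogy with the bootstrap-t algorithm, since the steps above define T* only pointwise.

```python
import numpy as np

def multiplier_t_quantile(Y, alpha=0.05, n_boot=1000,
                          multiplier="rademacher", rng=None):
    """Multiplier-t bootstrap estimate of the SCB quantile for the mean."""
    rng = np.random.default_rng() if rng is None else rng
    N = Y.shape[0]
    # Residuals of the signal-plus-noise model; they sum to zero pointwise.
    R = np.sqrt(N / (N - 1.0)) * (Y - Y.mean(axis=0))
    t_max = np.empty(n_boot)
    for b in range(n_boot):
        if multiplier == "rademacher":
            g = rng.choice([-1.0, 1.0], size=N)   # mean 0, variance 1
        else:
            g = rng.standard_normal(N)            # Gaussian multipliers
        gR = g[:, None] * R
        sd_star = gR.std(axis=0, ddof=1)          # step 2
        T = gR.sum(axis=0) / (np.sqrt(N) * sd_star)  # step 3
        t_max[b] = np.max(np.abs(T))              # assumed summary over s
    return np.quantile(t_max, 1.0 - alpha)        # step 4

rng = np.random.default_rng(5)
Y = rng.normal(size=(30, 50))
q_rad = multiplier_t_quantile(Y, n_boot=200, rng=rng)
q_gau = multiplier_t_quantile(Y, n_boot=200, multiplier="gaussian", rng=rng)
```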

In our simulations we use Gaussian and Rademacher multipliers. The latter perform much better for small sample sizes than Gaussian multipliers. This is probably the reason why they were used in Chang et al. (2017).

Appendix B. Proofs

B.1. Proof of Claim in Remark 1

Using the multivariate mean value theorem and the Cauchy–Schwarz inequality yields

{| G^{(d, l)} (s) - G^{(d, l)} (s^{'}) |}^{2} \leq max_{t \in S} {‖ \nabla G^{(d, l)} (t) ‖}^{2} {‖ s - s^{'} ‖}^{2} .

Applying the expectation to both sides and then taking the maximum of the resulting sums we obtain

E [{| G^{(d, l)} (s) - G^{(d, l)} (s^{'}) |}^{2}] \leq D max_{k = 1, \dots, D} E [max_{t \in S} {| G^{(d, l, k)} (t) |}^{2}] {‖ s - s^{'} ‖}^{2} .

The proof now follows from the following two observations. Firstly, by Remark 3 each of the expectations over which the maximum is taken is finite (Landau and Shepp, 1970, Thm 5), since all components of the gradient ∇G^(d,l) are Gaussian fields with almost surely continuous sample paths. Secondly, |log x|⁻² ≥ x² for all 0 < x < 1.
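The second observation can be verified by elementary calculus. The following short derivation is added for the reader's convenience; the auxiliary function g is introduced here only for this purpose.

```latex
% For 0 < x < 1 define g(x) = x |\log x| = -x \log x.
\begin{align*}
g'(x) &= -\log x - 1 = 0 \iff x = e^{-1},
&
\max_{0 < x < 1} g(x) &= g(e^{-1}) = e^{-1} < 1 .
\end{align*}
% Hence x |\log x| < 1, i.e. |\log x|^{-1} > x > 0, and squaring gives
\[
  |\log x|^{-2} \geq x^{2} \qquad \text{for all } 0 < x < 1 .
\]
```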

B.2. Proof of Theorem 1

Lemma 9.

Let the function ${\hat{f}}_{N} (u)$ and its first derivative ${\hat{f}}_{N}^{'} (u)$ be uniformly consistent estimators of the function f(u) and its first derivative f′(u), respectively, where both are uniformly continuous over $u \in ℝ$ . Assume there exists an open interval I = (a, b) such that f is strictly monotone on I and there exists a unique solution u₀ ∈ I to the equation f(u) = 0. Define ${\hat{u}}_{N} = sup \{ u \in I : {\hat{f}}_{N} (u) = 0 \}$ . Then ${\hat{u}}_{N}$ is a consistent estimator of u₀.

Proof.

Assume w.l.o.g. that f is strictly decreasing on I. Thus, for any ε > 0 we have f(u₀ − ε) > 0 > f(u₀ + ε) by f(u₀) = 0. The assumption that ${\hat{f}}_{N} (u)$ is a consistent estimator of f(u) yields

ℙ ({\hat{f}}_{N} (u_{0} - ε) > 0 > {\hat{f}}_{N} (u_{0} + ε)) \to 1,

which implies that with probability tending to 1, there is a root of ${\hat{f}}_{N}$ in I_0,ε = (u₀ − ε, u₀ + ε). On the other hand, the monotonicity of f guarantees the existence of a δ > 0 such that inf{|f(u)| : u ∈ I \ I_0,ε } > δ. Moreover, by the uniform consistency of ${\hat{f}}_{N}$ , we have that

ℙ (sup_{u \in I} | {\hat{f}}_{N} (u) - f (u) | < δ / 2) \to 1.

Therefore, using the inequality

inf_{u \in I \ I_{0, ε}} | {\hat{f}}_{N} (u) | \geq inf_{u \in I \ I_{0, ε}} | f (u) | - sup_{u \in I \ I_{0, ε}} | {\hat{f}}_{N} (u) - f (u) |,

we can conclude that

ℙ (inf_{u \in I \ I_{0, ε}} | {\hat{f}}_{N} (u) | > δ / 2) \geq ℙ (inf_{u \in I \ I_{0, ε}} | f (u) | - sup_{u \in I \ I_{0, ε}} | {\hat{f}}_{N} (u) - f (u) | > δ / 2) = ℙ (sup_{u \in I \ I_{0, ε}} | {\hat{f}}_{N} (u) - f (u) | < inf_{u \in I \ I_{0, ε}} | f (u) | - δ / 2) \to 1,

which implies that with probability tending to 1, there is no root of ${\hat{f}}_{N}$ outside I_0,ε. Hence from the definition of ${\hat{u}}_{N}$ , it is clear that ${\hat{u}}_{N}$ is the only root of ${\hat{f}}_{N}$ in I with probability tending to 1. As an immediate consequence we have that

ℙ [| {\hat{u}}_{N} - u_{0} | < ε] = ℙ [{\hat{u}}_{N} \in I_{0, ε}] \to 1,

which finishes the proof that ${\hat{u}}_{N}$ is a consistent estimator of u₀.

Lemma 10.

Let ${\hat{L}}_{d}^{N}$ be a consistent estimator of L_d and let EEC_G(u) be given as in Eq. (7). Then

1. ${‖ {EEC}_{G} (u) - {\hat{EEC}}_{t_{N - 1}} (u) ‖}_{\infty} \overset{N \to \infty}{\to} 0$ almost surely.
2. ${‖ {EEC}_{G}^{'} (u) - {\hat{{EEC}^{'}}}_{t_{N - 1}} (u) ‖}_{\infty} \overset{N \to \infty}{\to} 0$ almost surely.

Proof.

Part 1.:

This is a direct consequence of the consistency of the LKC estimates and the observation that the EC densities ρ^tν of a t_ν-field with ν = N − 1 degrees of freedom converge uniformly to the EC densities ρ^G of a Gaussian field as N tends to infinity, i.e.

lim_{ν \to \infty} max_{u \in ℝ} | ρ^{t_{ν}} (u) - ρ^{G} (u) | = 0.

The latter follows from Worsley (1994, Theorem 5.4), which shows that the uniform convergence of the EC densities is implied by

lim_{ν \to \infty} max_{u \in ℝ} | {(1 + \frac{u^{2}}{ν})}^{- \frac{ν - 1}{2}} - e^{- \frac{u^{2}}{2}} | = 0.

To see this, note that the difference

h_{ν} (u) = {(1 + \frac{u^{2}}{ν})}^{- \frac{ν - 1}{2}} - e^{- \frac{u^{2}}{2}} \geq 0, for u \in ℝ

fulfills lim_{u→ ± ∞} h_ν(u) = 0. Thus, the maximum $C_{ν} = {max}_{u \in ℝ} | h_{ν} (u) |$ exists by continuity of h_ν. Moreover, note that h_ν(u) ≥ h_{ν+1}(u) for ν ≥ 1 and all $u \in ℝ$ , and that lim_ν→∞ h_ν(u) = 0 for each fixed u. Hence, C_ν converges to zero for ν → ∞.
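The uniform convergence used in this step can also be checked numerically. The sketch below evaluates h_ν on a fine grid for a few values of ν; the function name, the truncation of the real line to [−20, 20] and the grid size are assumptions of this illustration, and a grid maximum is of course only an approximation of C_ν.

```python
import numpy as np

def h_extremes(nu, u_max=20.0, n_grid=20001):
    """Grid maximum and minimum of h_nu(u) on [-u_max, u_max]."""
    u = np.linspace(-u_max, u_max, n_grid)
    h = (1.0 + u**2 / nu) ** (-(nu - 1.0) / 2.0) - np.exp(-0.5 * u**2)
    return h.max(), h.min()

# Approximate C_nu for increasing degrees of freedom.
results = {nu: h_extremes(nu) for nu in (5, 50, 500)}
sups = [results[nu][0] for nu in (5, 50, 500)]
mins = [results[nu][1] for nu in (5, 50, 500)]
```

On this grid the maxima decrease with ν and the minima stay at zero up to rounding, in line with h_ν ≥ 0 and C_ν → 0.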

Part 2.:

This follows similarly by applying the same arguments to the derivatives of the EC densities.

Proof of Theorem 1.

In order to prove the almost sure convergence ${\hat{q}}_{α, N} \overset{N \to \infty}{\to} {\tilde{q}}_{α}$ , note that for u large enough u ↦ EEC_G(u) is strictly monotonically decreasing and therefore combining Lemmas 9 and 10 yields the claim.

B.3. Proof of Theorem 2

By Taylor et al. (2005, Theorem 4.3) we have for $Z$ a zero-mean Gaussian field over a parameter set $T$ that

\underset{u \to \infty}{lim inf} - u^{- 2} log | ℙ (max_{t \in T} Z (t) \geq u) - {EEC}_{Z} (u) | \geq \frac{1}{2} (1 + \frac{1}{σ_{c}^{2}}),

where $σ_{c}^{2}$ is a variance depending on an associated field to $Z$ . This implies that there is a $\tilde{u}$ such that for all $u \geq \tilde{u}$ we have that

| ℙ (max_{t \in T} Z (t) \geq u) - {EEC}_{Z} (u) | \leq e^{- (\frac{1}{2} + \frac{1}{2 σ_{c}^{2}}) u^{2}} .

(19)

Equipped with this result using the definition M = max_s∈S G(s) and |M| = max_s∈S|G(s)| we compute

| 1 - α - ℙ (\forall s \in S : η (s) \in S C B (s, {\hat{q}}_{α, N})) | \leq | - α + ℙ (max_{s \in S} | τ_{N} \frac{{\hat{η}}_{N} (s) - η (s)}{{\hat{ς}}_{N} (s)} | > {\hat{q}}_{α, N}) | \leq | ℙ (| M | > {\hat{q}}_{α, N}) - α | + | ℙ (max_{s \in S} | τ_{N} \frac{{\hat{η}}_{N} (s) - η (s)}{{\hat{ς}}_{N} (s)} | > {\hat{q}}_{α, N}) - ℙ (| M | > {\hat{q}}_{α, N}) | = I + I I .

Here II converges to zero for N tending to infinity by the fCLT for ${\hat{η}}_{N}$ and the consistency of ${\hat{ς}}_{N}$ from (E1–2). Therefore it remains to treat I.

To deal with this summand, note that

ℙ (| M | > {\hat{q}}_{α, N}) = ℙ (max_{s \in S} | G (s) |^{2} > {\hat{q}}_{α, N}^{2}) = ℙ (max_{(s, v) \in S \times S^{0}} Z (s, v) > {\hat{q}}_{α, N}),

where the Gaussian random field $Z$ over $T = S \times S^{0}$ , where S⁰ = {1, −1}, is defined by $Z (s, v) = v \cdot G (s)$ .

Using the above equality, ${\hat{EEC}}_{t_{N - 1}} ({\hat{q}}_{α, N}) = α / 2$ , i.e. the definition of our estimator ${\hat{q}}_{α, N}$ from Eq. (6) and Lemma 10 we have that

I = | ℙ (max_{(s, v) \in S \times S^{0}} Z (s, v) > {\hat{q}}_{α, N}) - 2 {\hat{EEC}}_{t_{N - 1}} ({\hat{q}}_{α, N}) | \overset{N \to \infty}{\to} | ℙ (max_{(s, v) \in S \times S^{0}} Z (s, v) > {\tilde{q}}_{α}) - 2 {EEC}_{G} ({\tilde{q}}_{α}) | .

Thus, using the fact that $L_{d} (T, Z) = L_{0} (S^{0}) L_{d} (S, G) = 2 L_{d} (S, G)$ , so that ${EEC}_{Z} = 2 {EEC}_{G}$ , together with (19) and the observation that ${\tilde{q}}_{α}$ increases monotonically as α decreases for α small enough, we can bound the limit of I by

\underset{N \to \infty}{lim} I = | ℙ (max_{(s, v) \in S \times S^{0}} Z (s, v) > {\tilde{q}}_{α}) - {EEC}_{Z} ({\tilde{q}}_{α}) | \leq e^{- (\frac{1}{2} + \frac{1}{2 σ_{c}^{2}}) {\tilde{q}}_{α}^{2}}

for all α smaller than some α′, which finishes the proof.

Remark 9.

The specific definition of σ_c associated with the Gaussian field $Z$ can be found in Taylor et al. (2005).

B.4. Proof of Proposition 1

Assume that Z is $(L^{p}, δ)$ -Lipshitz, then using convexity of |·|^p we compute

E [‖ Z ‖_{\infty}^{p}] \leq 2^{p - 1} E [max_{s \in S} {| Z (s) - Z (s^{'}) |}^{p}] + 2^{p - 1} E [max_{s \in S} {| Z (s^{'}) |}^{p}] \leq 2^{p - 1} E [| A |^{p}] max_{s \in S} δ {(s, s^{'})}^{p} + 2^{p - 1} E [{| Z (s^{'}) |}^{p}] < \infty .

Hence Z also has a finite pth $C (S)$ -moment.

B.5. Proofs of Theorems 3 and 4

The following Lemma provides almost sure uniform convergence results and will be used often in the following proofs.

Lemma 11.

Assume that $X_{1}, \dots, X_{N} \overset{i . i . d .}{~} X$ and $Y_{1}, \dots, Y_{N} \overset{i . i . d .}{~} Y$ are $(L^{1}, δ)$ -Lipshitz. Then $\bar{X} \overset{N \to \infty}{\to} E [X]$ uniformly almost surely. If X and Y are $(L^{2}, δ)$ -Lipshitz, then ${\hat{cov}}_{N} [X, Y] \overset{N \to \infty}{\to} cov [X, Y]$ uniformly almost surely.

Proof.

First claim:

Using the generic uniform convergence result in Davidson (1994, Theorem 21.8), we only need to establish strong stochastic equicontinuity (SSE) of the random function $\bar{X} - E [X]$ , since pointwise convergence is obvious by the SLLNs. SSE, however, can be easily established using Davidson (1994, Theorem 21.10 (ii)), since

| N^{- 1} \sum_{n = 1}^{N} (X_{n} (s) - X_{n} (s^{'})) - E [X (s) - X (s^{'})] | \leq (N^{- 1} \sum_{n = 1}^{N} A_{n} + E [A]) δ (s, s^{'}) = C_{N} δ (s, s^{'})

for all s, s′ ∈ S. Here $A_{1}, \dots, A_{N} \overset{i.i.d.}{~} A$ denote the random variables from the $(L^{1}, δ)$ -Lipshitz property of the X_n’s and X and hence the random variable C_N converges almost surely to the constant $2 E [A]$ by the SLLNs.

Second claim:

Adopting the same strategy as above and assuming w.l.o.g. $E [X] = E [Y] = 0$ , we compute

| \frac{1}{N} \sum_{n = 1}^{N} X_{n} (s) Y_{n} (s) - X_{n} (s^{'}) Y_{n} (s^{'}) | \leq (\frac{1}{N} \sum_{n = 1}^{N} {‖ X_{n} ‖}_{\infty} B_{n} + {‖ Y_{n} ‖}_{\infty} A_{n}) δ (s, s^{'}) \leq (\sqrt{\sum_{n = 1}^{N} \frac{{‖ X_{n} ‖}_{\infty}^{2}}{N}} \sqrt{\sum_{n = 1}^{N} \frac{B_{n}^{2}}{N}} + \sqrt{\sum_{n = 1}^{N} \frac{{‖ Y_{n} ‖}_{\infty}^{2}}{N}} \sqrt{\sum_{n = 1}^{N} \frac{A_{n}^{2}}{N}}) δ (s, s^{'}),

where $‖ X ‖_{\infty} = {max}_{s \in S} | X (s) |$ and $B_{1}, \dots, B_{N} \overset{i . i . d .}{~} B$ denote the random variables from the $(L^{2}, δ)$ -Lipshitz property of the Y_n’s and Y. Again by the SLLNs the random Lipshitz constant converges almost surely and is finite, since X and Y have finite second $C (S)$ -moments and are $(L^{2}, δ)$ -Lipshitz.
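The first claim of Lemma 11 can be illustrated by simulation. The sketch below uses a hypothetical field X(s) = A sin(2πs) + B cos(2πs) on S = [0, 1] with A ~ N(1, 1) and B ~ N(0, 1), which is Lipschitz in s with random constant 2π(|A| + |B|); the model, the grid and the sample sizes are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
s = np.linspace(0.0, 1.0, 201)
basis = np.vstack([np.sin(2 * np.pi * s), np.cos(2 * np.pi * s)])  # shape (2, 201)
true_mean = np.sin(2 * np.pi * s)  # E[X](s), since E[A] = 1 and E[B] = 0

def sup_error_of_mean(n_samples):
    """Sup-norm distance (on the grid) between the sample mean and E[X]."""
    coeffs = rng.normal(loc=[1.0, 0.0], scale=1.0, size=(n_samples, 2))
    fields = coeffs @ basis  # each row is one sample path X_n on the grid
    return float(np.max(np.abs(fields.mean(axis=0) - true_mean)))

sup_errors = {n: sup_error_of_mean(n) for n in (100, 1000, 10000)}
for n, err in sup_errors.items():
    print(f"N = {n:6d}   sup-norm error of the sample mean ~ {err:.4f}")
```

The printed errors are random, but they should shrink roughly at rate N^{-1/2} as N grows.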

Lemma 12.

Let $c$ be a covariance function. Then

(i) If $c$ is continuous and has continuous partial derivatives up to order K, then the zero-mean Gaussian field G with covariance $c$ has $C^{K}$ -sample paths with almost surely uniformly and absolutely convergent expansions
$G^{I} (s) = \sum_{i = 1}^{\infty} \sqrt{λ_{i}} A_{i} φ_{i}^{I} (s),$
where λ_i, φ_i are the eigenvalues and eigenfunctions of the covariance operator associated with $c$ and ${\{A_{i}\}}_{i \in ℕ}$ are i.i.d. $N (0, 1)$ .
(ii) If Z and all its partial derivatives Z^I with |I| = K′ ≤ K, $K \in ℕ$ , are $(L^{2}, δ)$ -Lipshitz fields with finite $C (S)$ -variances, then $c$ is continuous and all partial derivatives $\partial^{| I | + | I^{'} |} c (s, s^{'}) / \partial s^{I} \partial {s^{'}}^{I^{'}}$ for |I|, |I′| ≤ K exist and are continuous for all s, s′ ∈ S.

Proof.

Since $c$ is continuous the field G is mean-square continuous. Hence there is a Karhunen–Loève expansion of the form
$G (s) = \sum_{i = 1}^{\infty} \sqrt{λ_{i}} A_{i} φ_{i} (s),$
where λ_i, φ_i are the eigenvalues and eigenfunctions of the covariance operator associated with $c$ and ${\{A_{i}\}}_{i \in ℕ}$ are i.i.d. $N (0, 1)$ . From Ferreira and Menegatto (2012, Theorem 5.1) we have that $φ_{i} \in C^{K} (S)$ . Moreover, it is easy to deduce from their equation (4.3) that
$G^{I} (s) = \sum_{i = 1}^{\infty} \sqrt{λ_{i}} A_{i} φ_{i}^{I} (s)$
is almost surely absolutely and uniformly convergent.

The continuity is a simple consequence of the $(L^{2}, δ)$ -Lipshitz property and the finite $C (S)$ -variances. Let X be a field with these properties, then using the Cauchy–Schwarz inequality

| c (s, t) - c (s^{'}, t^{'}) | \leq | E [(X_{s} - X_{s^{'}}) X_{t} + (X_{t} - X_{t^{'}}) X_{s^{'}}] | \leq E [| X_{s} - X_{s^{'}} | | X_{t} |] + E [| X_{t} - X_{t^{'}} | | X_{s^{'}} |] \leq \sqrt{E [{(X_{s} - X_{s^{'}})}^{2}]} \sqrt{E [max_{t \in S} X_{t}^{2}]} + \sqrt{E [{(X_{t} - X_{t^{'}})}^{2}]} \sqrt{E [max_{s^{'} \in S} X_{s^{'}}^{2}]} \leq C (δ (s, s^{'}) + δ (t, t^{'}))

for some C < ∞ and therefore $c$ and the covariances of Z^I are continuous. We only show that $\partial c (s, s^{'}) / \partial s_{d}$ exists and is continuous. The argument is similar for the higher partial derivatives. From the definition we obtain for all s, s′

lim_{h \to 0} h^{- 1} (c (s, s^{'}) - c (s + h e_{d}, s^{'})) = lim_{h \to 0} E [h^{- 1} (Z (s) - Z (s + h e_{d})) Z (s^{'})],

where e_d denotes the dth element of the standard basis of $ℝ^{D}$ . Thus, we only have to prove that we can interchange limits and integration. The latter is an immediate consequence of Lebesgue’s dominated convergence theorem, where we obtain the $L^{1}$ majorant from the $(L^{2}, δ)$ -Lipshitz property as AZ(s′), where $A \in L^{2}$ .
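The termwise differentiation of the expansion in part (i) can be made concrete with a truncated series. The sketch below is an illustration under assumptions that do not come from the paper: eigenfunctions φ_i(s) = √2 sin(iπs) on [0, 1], eigenvalues λ_i = i^{-6}, and a truncation after 20 terms. It compares the termwise derivative of one simulated path with a finite-difference derivative of the same path.

```python
import numpy as np

rng = np.random.default_rng(3)
s = np.linspace(0.0, 1.0, 2001)
step = s[1] - s[0]

idx = np.arange(1, 21)[:, None]            # indices i = 1, ..., 20 (truncation is an assumption)
lam = 1.0 / idx[:, 0] ** 6                 # hypothetical eigenvalues lambda_i = i^(-6)
phi = np.sqrt(2) * np.sin(idx * np.pi * s)                  # phi_i(s), shape (20, 2001)
dphi = np.sqrt(2) * idx * np.pi * np.cos(idx * np.pi * s)   # phi_i'(s)

a = rng.standard_normal(20)                # i.i.d. N(0, 1) coefficients A_i
weights = np.sqrt(lam) * a

path = weights @ phi                       # truncated expansion of G
d_path_series = weights @ dphi             # termwise derivative of the expansion
d_path_fd = np.gradient(path, step)        # finite-difference derivative of the path

# compare on interior grid points, where np.gradient uses central differences
max_gap = float(np.max(np.abs(d_path_series[1:-1] - d_path_fd[1:-1])))
print(f"max |termwise - finite difference| on the interior grid: {max_gap:.2e}")
```

The fast decay of the assumed eigenvalues mirrors the smoothness of the covariance required in the lemma; with slowly decaying eigenvalues the differentiated series need not converge.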

Proof of Theorem 3.

Since Z is $(L^{2}, δ)$ -Lipshitz the main result in Jain and Marcus (1975) immediately implies (E1) with $τ_{N} = \sqrt{N}$ and $r = c$ . Condition (E2) is obtained from the second part of Lemma 11, since σ(s)Z(s) is $(L^{2}, δ)$ -Lipshitz and has finite second $C (S)$ -moment.
We only need to show that the Gaussian limit field with the covariance $c$ fulfills (G1) and (G3). Note that condition (G3) is a consequence of (G1) and the $C^{3}$ -sample paths by Remark 1. But (G1) is already a consequence of Lemma 12.

Lemma 13.

Let Z fulfill the assumptions of Theorem 3(ii) except for (G2) and for all d, l ∈ {1, … , D} suppose that cov [(Z^(d)(s), Z^(d,l)(s))] has full rank for all s. Then $G = G (0, c)$ fulfills (G2).

Proof.

Using the series expansions from Lemma 12 we have for multi-indices I₁, … , I_K, $K \in ℕ$ , and all $v \in ℝ^{K}$ that

(\partial^{| I_{1} |} G (s) / \partial s^{I_{1}}, \dots, \partial^{| I_{K} |} G (s) / \partial s^{I_{K}}) v^{T} = \sum_{i = 1}^{\infty} \sqrt{λ_{i}} A_{i} \sum_{k = 1}^{K} v_{k} φ_{i}^{I_{k}} (s)

is convergent for all s (even uniformly). Note that we used here that the expansions are absolutely convergent such that we can change orders in the infinite sums. Thus, it is easy to deduce that (G^{I_1}, … , G^{I_K}) is a Gaussian field.

Therefore (G^(d)(s), G^(d,l)(s)) is a multivariate Gaussian random variable for all s ∈ S, which is non-degenerate if and only if its covariance matrix is non-singular. But this is the case by the assumption, since it is identical to the covariance matrix cov [(Z^(d)(s), Z^(d,l)(s))].

Proof of Theorem 4.

The proof is almost identical to the proof of Theorem 3 and therefore omitted.

B.6. Proof of Theorem 5

First note that using the definition of R_n from Eq. (13) we obtain

R_{n}^{(d)} = {(\frac{μ - \bar{Y}}{{\hat{σ}}_{N}})}^{(d)} + {(\frac{σ}{{\hat{σ}}_{N}})}^{(d)} Z_{n} + \frac{σ}{{\hat{σ}}_{N}} Z_{n}^{(d)} .

Thus, the entries of the sample covariance matrix ${\hat{Λ}}_{N}$ are given by

\hat{var} [Z] {(\frac{σ}{{\hat{σ}}_{N}})}^{(d)} {(\frac{σ}{{\hat{σ}}_{N}})}^{(l)} + \hat{cov} [Z, Z^{(d)}] \frac{σ}{{\hat{σ}}_{N}} {(\frac{σ}{{\hat{σ}}_{N}})}^{(l)} + \hat{cov} [Z, Z^{(l)}] \frac{σ}{{\hat{σ}}_{N}} {(\frac{σ}{{\hat{σ}}_{N}})}^{(d)} + \hat{cov} [Z^{(d)}, Z^{(l)}] \frac{σ^{2}}{{\hat{σ}}_{N}^{2}}

(20)

Now, the second part of Lemma 11 applied to Z, Z^(d) and Z^(l), together with the fact that by

{\hat{σ}}_{N}^{2} (s) / σ^{2} (s) = \hat{var} [σ (s) Z (s)] / σ^{2} (s) = \hat{var} [Z (s)]

(21)

{({\hat{σ}}_{N}^{2} (s) / σ^{2} (s))}^{(l)} = 2 \hat{cov} [Z (s), Z^{(l)} (s)] .

(22)

the terms involving σ converge uniformly almost surely to one and zero, respectively, implies

{\hat{cov}}_{N} [R^{(d)}, R^{(l)}] \overset{N \to \infty}{\to} cov [Z^{(d)}, Z^{(l)}] = Λ_{d l}

uniformly almost surely. Thus, ${\hat{Λ}}_{N} \to Λ$ uniformly almost surely.
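The consistency of the residual-based estimator can be checked in a small 1D simulation. The sketch below assumes a hypothetical model Y_n = μ + σ Z_n on S = [0, 1] with Z_n(s) = A_n cos(ks) + B_n sin(ks), A_n, B_n i.i.d. N(0, 1) and k = 2π, so that var[Z(s)] = 1 and Λ(s) = var[Z′(s)] = k²; the mean, the standard deviation function and the use of finite differences for the derivative are illustration choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples, k = 5000, 2 * np.pi
s = np.linspace(0.0, 1.0, 201)
step = s[1] - s[0]

# hypothetical signal-plus-noise sample Y_n = mu + sigma * Z_n
a, b = rng.standard_normal((2, n_samples, 1))
z = a * np.cos(k * s) + b * np.sin(k * s)      # var Z(s) = 1, var Z'(s) = k^2
mu = np.sin(s)                                  # arbitrary smooth mean
sigma = 1.0 + 0.5 * s                           # arbitrary smooth standard deviation
y = mu + sigma * z

# standardized residuals R_n = (Y_n - Ybar) / sigma_hat and their derivatives
residuals = (y - y.mean(axis=0)) / y.std(axis=0, ddof=1)
d_residuals = np.gradient(residuals, step, axis=1, edge_order=2)

lambda_hat = d_residuals.var(axis=0, ddof=1)    # pointwise estimate of Lambda(s)
rel_err = float(np.max(np.abs(lambda_hat - k**2)) / k**2)
print(f"max relative error of Lambda_hat over the grid: {rel_err:.3f}")
```

Note that neither μ nor σ enters the standardized residuals asymptotically, which is the cancellation expressed by (20)–(22).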

Now let X, Y be $(L^{4}, δ)$ -Lipshitz, then

| X (s) Y (s) - X (s^{'}) Y (s^{'}) | \leq (‖ X ‖_{\infty} A + ‖ Y ‖_{\infty} B) δ (s, s^{'})

with A, B the random variables in the $(L^{4}, δ)$ -Lipshitz property of Y and X, respectively. Note that

E [{(‖ X ‖_{\infty} A + ‖ Y ‖_{\infty} B)}^{2}] \leq 2 E [{(‖ X ‖_{\infty} A)}^{2} + {(‖ Y ‖_{\infty} B)}^{2}] \leq 2 \sqrt{E [‖ X ‖_{\infty}^{4}] E [A^{4}]} + 2 \sqrt{E [‖ Y ‖_{\infty}^{4}] E [B^{4}]} < \infty .

by (a + b)² ≤ 2a² + 2b² for all a, $b \in ℝ$ and the Cauchy–Schwarz inequality. Thus, a sample $X_{1} Y_{1}, \dots, X_{N} Y_{N} \overset{i . i . d .}{~} X Y$ fulfills the assumptions for the CLT in $C (S)$ given in Jain and Marcus (1975). Therefore, the following sums converge to a Gaussian field in $C (S)$ :

\sqrt{N} (\hat{var} [Z]), \sqrt{N} \hat{cov} [Z, Z^{(l)}], \sqrt{N} \hat{cov} [Z^{(d)}, Z^{(l)}] for d, l = 1, \dots, D .

Thus, using the latter together with Eq. (20) and the uniform almost sure convergence from (21) and (22), we obtain

\sqrt{N} (\hat{cov} [R^{(d)}, R^{(l)}] - Λ_{d l}) \overset{N \to \infty}{\Rightarrow} G (0, t_{d l})

with $t_{d l} (s, s^{'}) = cov [Z^{(d)} (s) Z^{(l)} (s), Z^{(d)} (s^{'}) Z^{(l)} (s^{'})]$ . This combined with the standard multivariate CLT yields the claim.

B.7. Proof of Theorem 7

By Theorem 5 the claim follows from the functional delta method (Kosorok, 2008, Theorem 2.8), provided that the corresponding functions are Hadamard differentiable and we can compute their derivatives.

Case 1D:

We have to prove that the function

H : (C (S), ‖ \cdot ‖_{\infty}) \to ℝ, f \mapsto \int_{S} \sqrt{f (s)} d s

is Hadamard differentiable. To this end, note that the integral is a bounded linear operator and hence it is Fréchet differentiable with derivative being the integral itself. Moreover, $f \mapsto \sqrt{f}$ is Hadamard differentiable by Kosorok (2008, Lemma 12.2) with Hadamard derivative $α \mapsto α / (2 \sqrt{f})$ tangential even to the Skorohod space D(S). Combining this with the fCLT for ${\hat{Λ}}_{N}$ given in Theorem 5, we obtain that $\sqrt{N} ({\hat{L}}_{1}^{N} - L_{1})$ converges in distribution to

D H_{Λ} (G) = \frac{1}{2} \int_{S} \frac{G (s)}{\sqrt{Λ (s)}} d s,

where G(s) is the asymptotic Gaussian field given in Theorem 5.
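For completeness, the derivative in the 1D case can also be computed directly from the difference quotient. The following is a sketch under the additional assumptions, stated here only for illustration, that min_{s ∈ S} f(s) > 0 and that α_t → α uniformly as t → 0:

```latex
% Sketch: assumes \min_{s \in S} f(s) > 0 and \alpha_t \to \alpha uniformly as t \to 0.
\begin{align*}
\frac{H(f + t\alpha_t) - H(f)}{t}
  &= \int_S \frac{\sqrt{f(s) + t\alpha_t(s)} - \sqrt{f(s)}}{t}\, ds \\
  &= \int_S \frac{\alpha_t(s)}{\sqrt{f(s) + t\alpha_t(s)} + \sqrt{f(s)}}\, ds \\
  &\xrightarrow[t \to 0]{} \int_S \frac{\alpha(s)}{2\sqrt{f(s)}}\, ds .
\end{align*}
```

Setting f = Λ and α = G recovers the limit expression for the 1D case.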

Case 2D:

The strategy of the proof is the same as in 1D, i.e. we need to calculate the Hadamard (Fréchet) derivative of

H : (C (S) \times C (S) \times C (S), ‖ \cdot ‖_{\infty}) \to ℝ^{2}, (f_{1}, f_{2}, f_{3}) \mapsto (\frac{1}{2} \int_{0}^{1} \sqrt{{\frac{d γ}{d t}}^{T} (t) ι (f_{1} (γ (t)), f_{2} (γ (t)), f_{3} (γ (t))) \frac{d γ}{d t} (t)} d t, \int_{S} \sqrt{det (Λ (s))} d s_{1} d s_{2}) .

The arguments are the same as before. Thus, using the chain rule and derivatives of matrices with respect to their components the Hadamard derivative evaluated at the field G is given by

d H_{Λ} (G) = (\frac{1}{2} \int_{0}^{1} \frac{1}{\sqrt{{\frac{d γ}{d t}}^{T} (t) Λ (γ (t)) \frac{d γ}{d t} (t)}} tr (Λ (γ (t)) ι (G (γ (t)))) d t, \int_{S} \frac{1}{\sqrt{det (Λ (s))}} tr (Λ (s) ι (diag (1, - 1, 1) G (s))) d s) .

B.8. Proof of Corollary 1

Note that it is well known that the covariance function of the derivative of a differentiable field with covariance function $c$ is given by $\dot{c} (s, s^{'}) = D_{s}^{1} D_{s^{'}}^{1} c (s, s^{'})$ . Moreover, using the moment formula for multivariate Gaussian fields (Isserlis' theorem) we have that

cov [{(Z^{'} (s))}^{2}, {(Z^{'} (s^{'}))}^{2}] = E [({(Z^{'} (s))}^{2} - \dot{c} (s, s)) ({(Z^{'} (s^{'}))}^{2} - \dot{c} (s^{'}, s^{'}))] = E [{(Z^{'} (s))}^{2} {(Z^{'} (s^{'}))}^{2}] - \dot{c} (s, s) \dot{c} (s^{'}, s^{'}) = \dot{c} (s, s) \dot{c} (s^{'}, s^{'}) + 2 {\dot{c} (s, s^{'})}^{2} - \dot{c} (s, s) \dot{c} (s^{'}, s^{'}) = 2 {\dot{c} (s, s^{'})}^{2} .

Combining this with the observation that the variance of the zero mean Gaussian random variable $\frac{1}{2} \int_{S} \frac{G (s)}{\sqrt{Λ (s)}} d s$ is given by

τ^{2} = \frac{1}{4} \int_{S} \int_{S} \frac{cov [{(Z^{'} (s))}^{2}, {(Z^{'} (s^{'}))}^{2}]}{\sqrt{Λ (s) Λ (s^{'})}} d s d s^{'}

yields the claim.
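The Gaussian moment identity behind this computation, cov[X², Y²] = 2 cov[X, Y]² for zero-mean jointly Gaussian X and Y (Isserlis' theorem), can be checked by Monte Carlo. The sketch below uses a hypothetical bivariate normal pair with unit variances and correlation 0.6 standing in for (Z′(s), Z′(s′)); the sample size is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(2)
rho = 0.6                                   # hypothetical value of cov[X, Y]
cov_matrix = np.array([[1.0, rho], [rho, 1.0]])

# draw zero-mean jointly Gaussian pairs (X, Y)
x, y = rng.multivariate_normal(np.zeros(2), cov_matrix, size=400_000).T

empirical = float(np.cov(x**2, y**2)[0, 1])  # sample covariance of the squares
theoretical = 2 * rho**2                     # Isserlis: cov[X^2, Y^2] = 2 cov[X, Y]^2
print(f"empirical {empirical:.3f}   vs   2 * rho^2 = {theoretical:.3f}")
```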

B.9. Proof of Theorem 8

We want to apply Pollard (1990, Theorem 10.6). Therefore, except for the indices we adopt the notation of that theorem and define the necessary variables. Recall that max_s∈S σ(s) ≤ B < ∞. We obtain

f_{N n} (s, h) = \frac{1}{\sqrt{N} P} \sum_{p = 1}^{P} (σ (s_{p}) Z_{n} (s_{p}) + ε_{n p}) K (s - s_{p}, h),

F_{N n} = \sqrt{\frac{2}{N}} (B max_{s \in S} | Z_{n} (s) | + max_{s \in S} | ε_{n} (s) |),

X_{N} (s, h) = \sum_{n = 1}^{N} f_{N n} (s, h) .

We have to establish the assumptions (i), (iii) and (iv) as (v) is trivially satisfied in our case and (ii) is Assumption (18). As discussed in Degras (2011, p.1759) the manageability (i) follows from the inequality

| f_{N n} (s, h) - f_{N n} (s^{'}, h^{'}) | \leq \frac{1}{\sqrt{N}} \sqrt{\frac{1}{P} \sum_{p = 1}^{P} {(σ (s_{p}) Z_{n} (s_{p}) + ε_{n} (s_{p}))}^{2}} \sqrt{\frac{1}{P} \sum_{p = 1}^{P} {(K (s - s_{p}, h) - K (s^{'} - s_{p}, h^{'}))}^{2}} \leq \sqrt{\frac{2}{N}} (B max_{s \in S} | Z_{n} (s) | + max_{s \in S} | ε_{n} (s) |) L {‖ (s, h) - (s^{'}, h^{'}) ‖}^{α} = L F_{N n} ϵ,

if ‖(s, h) − (s′, h′)‖ < ϵ^1/α. Assumption (iii) follows since we can compute

\sum_{n = 1}^{N} E [F_{N n}^{2}] = N E [F_{N 1}^{2}] \leq 4 B^{2} E [max_{s \in S} {| Z_{1} (s) |}^{2}] + 4 E [max_{s \in S} ε_{1}^{2} (s)] < \infty

and (iv) is due to

\sum_{n = 1}^{N} E [F_{N n}^{2} I (F_{N n} > ϵ)] = N E [F_{N 1}^{2} I (\sqrt{N} F_{N 1} > \sqrt{N} ϵ)] \overset{N \to \infty}{\to} 0

for all ϵ > 0, which follows from the convergence theorem for integrals with monotonically increasing integrands and the fact that by Markov’s inequality

E [I (\sqrt{N} F_{N 1} > \sqrt{N} ϵ)] = Pr (\sqrt{N} F_{N 1} > \sqrt{N} ϵ) \leq \frac{E [\sqrt{N} F_{N 1}]}{\sqrt{N} ϵ} \overset{N \to \infty}{\to} 0,

(23)

for fixed ϵ > 0.

The weak convergence to a Gaussian field now follows from Pollard (1990, Theorem 10.6).
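The smoothed summands f_{Nn}(s, h) are kernel averages of the discrete observations over locations s and bandwidths h. The sketch below computes the per-curve average (1/P) Σ_p y_p K(s − s_p, h) on a grid of (s, h) values, without the additional factor N^{-1/2}. The Gaussian kernel and the function names are assumptions made for this example; the proof only uses the Hölder-type condition on K.

```python
import numpy as np

def gaussian_kernel(x, h):
    """Gaussian density with standard deviation h (an assumed choice of K)."""
    return np.exp(-0.5 * (x / h) ** 2) / (h * np.sqrt(2 * np.pi))

def scale_space_smooth(y_obs, s_obs, s_eval, bandwidths, kernel=gaussian_kernel):
    """Return (1/P) * sum_p y_p K(s - s_p, h) for all s in s_eval and h in bandwidths."""
    diffs = s_eval[:, None, None] - s_obs[None, None, :]   # shape (S, 1, P)
    weights = kernel(diffs, bandwidths[None, :, None])     # shape (S, H, P)
    return weights @ y_obs / len(s_obs)                    # shape (S, H)

# usage on one hypothetical noisy curve observed at P = 1000 equidistant points
rng = np.random.default_rng(4)
s_obs = np.linspace(0.0, 1.0, 1000)
y_obs = np.sin(2 * np.pi * s_obs) + 0.3 * rng.standard_normal(s_obs.size)
s_eval = np.linspace(0.2, 0.8, 61)
bandwidths = np.array([0.02, 0.05, 0.1])
smoothed = scale_space_smooth(y_obs, s_obs, s_eval, bandwidths)
print(smoothed.shape)
```

Each column of the result is the curve smoothed at one bandwidth, so the output can be viewed as one sample path of the scale space field over S × H.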

B.10. Proof of Proposition 2

The first step is to establish that for each N the field with $C^{3}$ -sample paths

\tilde{Z} (s, h) = \frac{1}{P} \sum_{p = 1}^{P} (σ (s_{p}) Z (s_{p}) + ε (s_{p})) K (s - s_{p}, h),

has finite second $C (S)$ -moment. Moreover, the constant is uniformly bounded over all N. Additionally, we require that the field itself and its first derivatives are $(L^{2}, δ)$ -Lipshitz again uniformly over all N, since then the same arguments as in Lemma 11 will yield the consistency of the estimators of the LKCs from Theorem 6. Therefore, note that

| \tilde{Z} (s, h) | \leq \frac{1}{P} \sum_{p = 1}^{P} | σ (s_{p}) Z (s_{p}) + ε (s_{p}) | \cdot | K (s - s_{p}, h) | \leq \sqrt{\frac{1}{P} \sum_{p = 1}^{P} {(σ (s_{p}) Z (s_{p}) + ε (s_{p}))}^{2}} \sqrt{\frac{1}{P} \sum_{p = 1}^{P} {(K (s - s_{p}, h))}^{2}} \leq (B max_{s \in S} | Z (s) | + max_{s \in S} | ε (s) |) | K (s, h) | .

This yields using (a + b)² ≤ 2(a² + b²) that

E [max_{(s, h) \in S \times H} | \tilde{Z} (s, h) |^{2}] \leq 2 (B^{2} E [max_{s \in S} | Z (s) |^{2}] + C) max_{(s, h) \in S \times H} K {(s, h)}^{2} < \infty,

where the bound is independent of N. Basically, the same argument yields the $(L^{2}, ‖ \cdot ‖)$ -Lipshitz property for $\tilde{Z} (s, h)$ and all of its partial derivatives up to order 3 with a bounding $L^{2}$ random variable independent of N. The differentiability of the sample paths of the limiting Gaussian field follows again from Lemma 12(i).

Footnotes

CRediT authorship contribution statement

Fabian J.E. Telschow: Conceptualization, Methodology, Software.

References

Adler RJ, 1981. The Geometry of Random Fields, Vol. 62. SIAM.
Adler RJ, Taylor JE, 2009. Random Fields and Geometry. Springer Science & Business Media.
Belloni A, Chernozhukov V, Chetverikov D, Wei Y, 2018. Uniformly valid post-regularization confidence regions for many functional parameters in z-estimation framework. Ann. Statist 46 (6B), 3643.
Bredon GE, 2013. Topology and Geometry, Vol. 139. Springer Science & Business Media.
Bunea F, Ivanescu AE, Wegkamp MH, 2011. Adaptive inference for the mean of a Gaussian process in functional data. J. R. Stat. Soc. Ser. B Stat. Methodol 73 (4), 531–558.
Cao G, Yang L, Todem D, 2012. Simultaneous inference for the mean function based on dense functional data. J. Nonparametr. Stat 24 (2), 359–377.
Cao G, et al., 2014. Simultaneous confidence bands for derivatives of dependent functional data. Electron. J. Stat 8 (2), 2639–2663.
Chang C, Lin X, Ogden RT, 2017. Simultaneous confidence bands for functional regression models. J. Statist. Plann. Inference 188, 67–81.
Chaudhuri P, Marron JS, 1999. SiZer for exploration of structures in curves. J. Amer. Statist. Assoc 94 (447), 807–823.
Chaudhuri P, Marron JS, 2000. Scale space view of curve estimation. Ann. Statist 408–428.
Cuevas A, 2014. A partial overview of the theory of statistics with functional data. J. Statist. Plann. Inference 147, 1–23.
Cuevas A, Febrero M, Fraiman R, 2006. On the use of the bootstrap for estimating functions with functional data. Comput. Statist. Data Anal 51 (2), 1063–1074.
Davidson J, 1994. Stochastic Limit Theory: An Introduction for Econometricians. OUP Oxford.
Degras DA, 2011. Simultaneous confidence bands for nonparametric regression with functional data. Statist. Sinica 21 (4), 1735–1765.
Degras D, 2017. Simultaneous confidence bands for the mean of functional data. Wiley Interdiscip. Rev. Comput. Stat 9 (3).
DiCiccio TJ, Efron B, 1996. Bootstrap confidence intervals. Statist. Sci 189–212.
Ferraty F, Vieu P, 2006. Nonparametric Functional Data Analysis: Theory and Practice. Springer Science & Business Media.
Ferreira JC, Menegatto VA, 2012. Reproducing properties of differentiable Mercer-like kernels. Math. Nachr 285 (8–9), 959–973.
Flegg G, 2001. From Geometry to Topology. Courier Corporation.
Good PI, 2005. Permutation, Parametric and Bootstrap Tests of Hypotheses: A Practical Guide to Resampling Methods for Testing Hypotheses. Springer Science & Business Media.
Goresky M, MacPherson R, 1988. Stratified Morse theory. In: Stratified Morse Theory. Springer, pp. 3–22.
Hall P, Müller H-G, Wang J-L, 2006. Properties of principal component methods for functional and longitudinal data analysis. Ann. Statist 1493–1517.
Jain NC, Marcus MB, 1975. Central limit theorems for C(S)-valued random variables. J. Funct. Anal 19 (3), 216–231.
Johansen S, Johnstone IM, 1990. Hotelling’s theorem on the volume of tubes: some illustrations in simultaneous inference and data analysis. Ann. Statist 18 (2), 652–684.
Kiebel SJ, Poline J-B, Friston KJ, Holmes AP, Worsley KJ, 1999. Robust smoothness estimation in statistical parametric maps using standardized residuals from the general linear model. Neuroimage 10 (6), 756–766.
Kosorok MR, 2008. Introduction to Empirical Processes and Semiparametric Inference. Springer.
Krivobokova T, Kneib T, Claeskens G, 2010. Simultaneous confidence bands for penalized spline estimators. J. Amer. Statist. Assoc 105 (490), 852–863.
Landau H, Shepp LA, 1970. On the supremum of a Gaussian process. Sankhyā 369–378.
Ledoux M, Talagrand M, 2013. Probability in Banach Spaces: Isoperimetry and Processes. Springer Science & Business Media.
Li Y, Hsing T, et al., 2010. Uniform convergence rates for nonparametric regression and principal component analysis in functional/longitudinal data. Ann. Statist 38 (6), 3321–3351.
Liebl D, Reimherr M, 2019. Fast and fair simultaneous confidence bands for functional parameters. arXiv preprint arXiv:1910.00131.
Lu X, Kuriki S, 2017. Simultaneous confidence bands for contrasts between several nonlinear regression curves. J. Multivariate Anal 155, 83–104.
Mearns L, Sain S, Leung L, Bukovsky M, McGinnis S, Biner S, Caya D, Arritt R, Gutowski W, Takle E, et al., 2013. Climate change projections of the North American regional climate change assessment program (NARCCAP). Clim. Change 120 (4), 965–975.
Pollard D, 1990. Empirical processes: theory and applications. In: NSF-CBMS Regional Conference Series in Probability and Statistics. JSTOR, pp. i–86.
Ramsay JO, Silverman BW, 2007. Applied Functional Data Analysis: Methods and Case Studies. Springer.
Sommerfeld M, Sain S, Schwartzman A, 2018. Confidence regions for spatial excursion sets from repeated random field observations, with an application to climate. J. Amer. Statist. Assoc 113 (523), 1327–1340.
Takemura A, Kuriki S, 2002. On the equivalence of the tube and Euler characteristic methods for the distribution of the maximum of Gaussian fields over piecewise smooth domains. Ann. Appl. Probab 768–796.
Taylor JE, 2006. A Gaussian kinematic formula. Ann. Probab 34 (1), 122–158.
Taylor J, Takemura A, Adler RJ, 2005. Validity of the expected Euler characteristic heuristic. Ann. Probab 1362–1396.
Taylor JE, Worsley KJ, 2007. Detecting sparse signals in random fields, with an application to brain mapping. J. Amer. Statist. Assoc 102 (479), 913–928.
Telschow F, Schwartzman A, Cheng D, Pranav P, 2020. Estimation of expected Euler characteristic curves of nonstationary smooth Gaussian random fields. arXiv preprint arXiv:1908.02493.
Travis KE, Golden NH, Feldman HM, Solomon M, Nguyen J, Mezer A, Yeatman JD, Dougherty RF, 2015. Abnormal white matter properties in adolescent girls with anorexia nervosa. NeuroImage: Clin 9, 648–659.
Wang J-L, Chiou J-M, Müller H-G, 2016. Functional data analysis. Annu. Rev. Stat. Appl 3, 257–295.
Wang Y, Wang G, Wang L, Ogden RT, 2019. Simultaneous confidence corridors for mean functions in functional data analysis of imaging data. Biometrics.
Working H, Hotelling H, 1929. Applications of the theory of error to the interpretation of trends. J. Amer. Statist. Assoc 24 (165A), 73–85.
Worsley KJ, 1994. Local maxima and the expected Euler characteristic of excursion sets of χ², F and t fields. Adv. Appl. Probab 26 (1), 13–42.
Worsley KJ, Marrett S, Neelin P, Vandal AC, Friston KJ, Evans AC, et al., 1996. A unified statistical approach for determining significant signals in images of cerebral activation. Hum. Brain Mapp 4 (1), 58–73.
Worsley KJ, Taylor JE, Tomaiuolo F, Lerch J, 2004. Unified univariate and multivariate random field theory. Neuroimage 23, S189–S195.
Zhang J-T, Chen J, et al., 2007. Statistical inferences for functional data. Ann. Statist 35 (3), 1052–1079.
Zhang X, Wang J-L, et al., 2016. From sparse to dense functional data and beyond. Ann. Statist 44 (5), 2281–2321.

[R1] Adler RJ, 1981. The Geometry of Random Fields, Vol. 62. Siam. [Google Scholar]

[R2] Adler RJ, Taylor JE, 2009. Random Fields and Geometry. Springer Science & Business Media. [Google Scholar]

[R3] Belloni A, Chernozhukov V, Chetverikov D, Wei Y, 2018. Uniformly valid post-regularization confidence regions for many functional parameters in z-estimation framework. Ann. Statist 46 (6B), 3643. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R4] Bredon GE, 2013. Topology and Geometry, Vol. 139. Springer Science & Business Media. [Google Scholar]

[R5] Bunea F, Ivanescu AE, Wegkamp MH, 2011. Adaptive inference for the mean of a Gaussian process in functional data. J. R. Stat. Soc. Ser. B Stat. Methodol 73 (4), 531–558. [Google Scholar]

[R6] Cao G, Yang L, Todem D, 2012. Simultaneous inference for the mean function based on dense functional data. J. Nonparametr. Stat 24 (2), 359–377. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R7] Cao G, et al. , 2014. Simultaneous confidence bands for derivatives of dependent functional data. Electron. J. Stat 8 (2), 2639–2663. [Google Scholar]

[R8] Chang C, Lin X, Ogden RT, 2017. Simultaneous confidence bands for functional regression models. J. Statist. Plann. Inference 188, 67–81. [Google Scholar]

[R9] Chaudhuri P, Marron JS, 1999. Sizer for exploration of structures in curves. J. Amer. Statist. Assoc 94 (447), 807–823. [Google Scholar]

[R10] Chaudhuri P, Marron JS, 2000. Scale space view of curve estimation. Ann. Statist 408–428. [Google Scholar]

[R11] Cuevas A, 2014. A partial overview of the theory of statistics with functional data. J. Statist. Plann. Inference 147, 1–23. [Google Scholar]

[R12] Cuevas A, Febrero M, Fraiman R, 2006. On the use of the bootstrap for estimating functions with functional data. Comput. Statist. Data Anal 51 (2), 1063–1074. [Google Scholar]

[R13] Davidson J, 1994. Stochastic Limit Theory: An Introduction for Econometricians. OUP Oxford. [Google Scholar]

[R14] Degras DA, 2011. Simultaneous confidence bands for nonparametric regression with functional data. Statist. Sinica 21 (4), 1735–1765. [Google Scholar]

[R15] Degras D, 2017. Simultaneous confidence bands for the mean of functional data. Wiley Interdiscip. Rev. Comput. Stat 9 (3). [Google Scholar]

[R16] DiCiccio TJ, Efron B, 1996. Bootstrap confidence intervals. Statist. Sci 189–212. [Google Scholar]

[R17] Ferraty F, Vieu P, 2006. Nonparametric Functional Data Analysis: Theory and Practice. Springer Science & Business Media. [Google Scholar]

[R18] Ferreira JC, Menegatto VA, 2012. Reproducing properties of differentiable Mercer-like kernels. Math. Nachr 285 (8–9), 959–973. [Google Scholar]

[R19] Flegg G, 2001. From Geometry to Topology. Courier Corporation. [Google Scholar]

[R20] Good PI, 2005. Permutation, Parametric and Bootstrap Tests of Hypotheses: A Practical Guide to Resampling Methods for Testing Hypotheses. Springer Science & Business Media. [Google Scholar]

[R21] Goresky M, MacPherson R, 1988. Stratified morse theory. In: Stratified Morse Theory. Springer, pp. 3–22. [Google Scholar]

[R22] Hall P, Müller H-G, Wang J-L, 2006. Properties of principal component methods for functional and longitudinal data analysis. Ann. Statist 1493–1517. [Google Scholar]

[R23] Jain NC, Marcus MB, 1975. Central limit theorems for C (S)-valued random variables. J. Funct. Anal 19 (3), 216–231. [Google Scholar]

[R24] Johansen S, Johnstone IM, 1990. Hotelling’s theorem on the volume of tubes: some illustrations in simultaneous inference and data analysis. Ann. Statist 18 (2), 652–684. [Google Scholar]

[R25] Kiebel SJ, Poline J-B, Friston KJ, Holmes AP, Worsley KJ, 1999. Robust smoothness estimation in statistical parametric maps using standardized residuals from the general linear model. Neuroimage 10 (6), 756–766. [DOI] [PubMed] [Google Scholar]

[R26] Kosorok MR, 2008. Introduction to Empirical Processes and Semiparametric Inference. Springer. [Google Scholar]

[R27] Krivobokova T, Kneib T, Claeskens G, 2010. Simultaneous confidence bands for penalized spline estimators. J. Amer. Statist. Assoc 105 (490), 852–863. [Google Scholar]

[R28] Landau H, Shepp LA, 1970. On the supremum of a Gaussian process. Sankhyā 369–378. [Google Scholar]

[R29] Ledoux M, Talagrand M, 2013. Probability in Banach Spaces: Isoperimetry and Processes. Springer Science & Business Media. [Google Scholar]

[R30] Li Y, Hsing T, et al. , 2010. Uniform convergence rates for nonparametric regression and principal component analysis in functional/longitudinal data. Ann. Statist 38 (6), 3321–3351. [Google Scholar]

[R31] Liebl D, Reimherr M, 2019. Fast and fair simultaneous confidence bands for functional parameters. arxiv preprint arXiv:1910.00131. [Google Scholar]

[R32] Lu X, Kuriki S, 2017. Simultaneous confidence bands for contrasts between several nonlinear regression curves. J. Multivariate Anal 155, 83–104. [Google Scholar]

[R33] Mearns L, Sain S, Leung L, Bukovsky M, McGinnis S, Biner S, Caya D, Arritt R, Gutowski W, Takle E, et al. , 2013. Climate change projections of the North American regional climate change assessment program (NARCCAP). Clim. Change 120 (4), 965–975. [Google Scholar]

[R34] Pollard D, 1990. Empirical processes: theory and applications. In: NSF-CBMS Regional Conference Series in Probability and Statistics. JSTOR, pp. i–86. [Google Scholar]

[R35] Ramsay JO, Silverman BW, 2007. Applied Functional Data Analysis: Methods and Case Studies. Springer. [Google Scholar]

[R36] Sommerfeld M, Sain S, Schwartzman A, 2018. Confidence regions for spatial excursion sets from repeated random field observations, with an application to climate. J. Amer. Statist. Assoc 113 (523), 1327–1340. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R37] Takemura A, Kuriki S, 2002. On the equivalence of the tube and Euler characteristic methods for the distribution of the maximum of Gaussian fields over piecewise smooth domains. Ann. Appl. Probab 768–796. [Google Scholar]

[R38] Taylor JE, 2006. A Gaussian kinematic formula. Ann. Probab 34 (1), 122–158. [Google Scholar]

[R39] Taylor J, Takemura A, Adler RJ, 2005. Validity of the expected Euler characteristic heuristic. Ann. Probab 1362–1396. [Google Scholar]

[R40] Taylor JE, Worsley KJ, 2007. Detecting sparse signals in random fields, with an application to brain mapping. J. Amer. Statist. Assoc 102 (479), 913–928. [Google Scholar]

[R41] Telschow F, Schwartzman A, Cheng D, Pranav P, 2020. Estimation of expected Euler characteristic curves of nonstationary smooth Gaussian random fields. arXiv preprint arXiv:1908.02493.

[R42] Travis KE, Golden NH, Feldman HM, Solomon M, Nguyen J, Mezer A, Yeatman JD, Dougherty RF, 2015. Abnormal white matter properties in adolescent girls with anorexia nervosa. NeuroImage: Clin 9, 648–659.

[R43] Wang J-L, Chiou J-M, Müller H-G, 2016. Functional data analysis. Annu. Rev. Stat. Appl 3, 257–295.

[R44] Wang Y, Wang G, Wang L, Ogden RT, 2019. Simultaneous confidence corridors for mean functions in functional data analysis of imaging data. Biometrics.

[R45] Working H, Hotelling H, 1929. Applications of the theory of error to the interpretation of trends. J. Amer. Statist. Assoc 24 (165A), 73–85.

[R46] Worsley KJ, 1994. Local maxima and the expected Euler characteristic of excursion sets of χ², F and t fields. Adv. Appl. Probab 26 (1), 13–42.

[R47] Worsley KJ, Marrett S, Neelin P, Vandal AC, Friston KJ, Evans AC, et al., 1996. A unified statistical approach for determining significant signals in images of cerebral activation. Hum. Brain Mapp 4 (1), 58–73.

[R48] Worsley KJ, Taylor JE, Tomaiuolo F, Lerch J, 2004. Unified univariate and multivariate random field theory. Neuroimage 23, S189–S195.

[R49] Zhang J-T, Chen J, et al., 2007. Statistical inferences for functional data. Ann. Statist 35 (3), 1052–1079.

[R50] Zhang X, Wang J-L, et al., 2016. From sparse to dense functional data and beyond. Ann. Statist 44 (5), 2281–2321.
