A SIMULTANEOUS CONFIDENCE BAND FOR SPARSE LONGITUDINAL REGRESSION

Shujie Ma; Lijian Yang; Raymond J Carroll

doi:10.5705/ss.2010.034

Author manuscript; available in PMC: 2013 Feb 27.

Published in final edited form as: Stat Sin. 2012;22:95–122. doi: 10.5705/ss.2010.034

PMCID: PMC3583240 NIHMSID: NIHMS269746 PMID: 23459083

Abstract

Functional data analysis has received considerable recent attention and a number of successful applications have been reported. In this paper, asymptotically simultaneous confidence bands are obtained for the mean function of the functional regression model, using piecewise constant spline estimation. Simulation experiments corroborate the asymptotic theory. The confidence band procedure is illustrated by analyzing CD4 cell counts of HIV infected patients.

Key words and phrases: B spline, confidence band, functional data, Karhunen-Loève L² representation, knots, longitudinal data, strong approximation

1. Introduction

Functional data analysis (FDA) has in recent years become a focal area in statistics research, and much has been published in this area. An incomplete list includes Cardot, Ferraty, and Sarda (2003), Cardot and Sarda (2005), Ferraty and Vieu (2006), Hall and Heckman (2002), Hall, Müller, and Wang (2006), Izem and Marron (2007), James, Hastie, and Sugar (2000), James (2002), James and Silverman (2005), James and Sugar (2003), Li and Hsing (2007), Li and Hsing (2009), Morris and Carroll (2006), Müller and Stadtmüller (2005), Müller, Stadtmüller, and Yao (2006), Müller and Yao (2008), Ramsay and Silverman (2005), Wang, Carroll, and Lin (2005), Yao and Lee (2006), Yao, Müller, and Wang (2005a), Yao, Müller, and Wang (2005b), Yao (2007), Zhang and Chen (2007), Zhao, Marron, and Wells (2004), and Zhou, Huang, and Carroll (2008). According to Ferraty and Vieu (2006), a functional data set consists of iid realizations {ξ_i (x), x ∈ χ}, 1 ≤ i ≤ n, of a smooth stochastic process (random curve) {ξ (x), x ∈ χ} over an entire interval χ. A more data oriented alternative in Ramsay and Silverman (2005) emphasizes smooth functional features inherent in discretely observed longitudinal data, so that the recording of each random curve ξ_i(x) is over a finite number of points in χ, and contaminated with noise. This second view is taken in this paper.

A typical functional data set therefore has the form {X_ij, Y_ij}, 1 ≤ i ≤ n, 1 ≤ j ≤ N_i, in which N_i observations are taken for the i^th subject, with X_ij and Y_ij the j^th predictor and response variables, respectively, for the i^th subject. Generally, the predictor X_ij takes values in a compact interval χ = [a, b]. For the i^th subject, its sample path {X_ij, Y_ij} is the noisy realization of a continuous time stochastic process ξ_i(x) in the sense that

Y_{ij} = ξ_{i} (X_{ij}) + σ (X_{ij}) ε_{ij},

(1.1)

with errors ε_ij satisfying E (ε_ij) = 0 and $E (ε_{ij}^{2}) = 1$, and where {ξ_i(x), x ∈ χ} are iid copies of a process {ξ(x), x ∈ χ} that is L², i.e., E ∫_χ ξ²(x)dx < +∞.

For the standard process {ξ(x), x ∈ χ}, one defines the mean function m(x) = E{ξ(x)} and the covariance function G(x, x′) = cov {ξ(x), ξ(x′)}. Let the sequences $\{λ_{k}\}_{k = 1}^{\infty}$ and $\{ψ_{k} (x)\}_{k = 1}^{\infty}$ be the eigenvalues and eigenfunctions of G(x, x′), respectively, in which λ₁ ≥ λ₂ ≥ ⋯ ≥ 0, $\sum_{k = 1}^{\infty} λ_{k} < \infty$, the $\{ψ_{k}\}_{k = 1}^{\infty}$ form an orthonormal basis of L² (χ), and $G (x, x^{'}) = \sum_{k = 1}^{\infty} λ_{k} ψ_{k} (x) ψ_{k} (x^{'})$, which implies that ∫ G(x, x′) ψ_k (x′) dx′ = λ_kψ_k(x).

The process {ξ_i(x), x ∈ χ} allows the Karhunen-Loève L² representation

ξ_{i} (x) = m (x) + \sum_{k = 1}^{\infty} ξ_{ik} ϕ_{k} (x),

where the random coefficients ξ_ik are uncorrelated with mean 0 and variance 1, and the functions $ϕ_{k} = \sqrt{λ_{k}} ψ_{k}$. In what follows, we assume that λ_k = 0 for k > κ, where κ is a positive integer; thus $G (x, x^{'}) = \sum_{k = 1}^{κ} ϕ_{k} (x) ϕ_{k} (x^{'})$ and the data generating process is now written as

Y_{ij} = m (X_{ij}) + \sum_{k = 1}^{κ} ξ_{ik} ϕ_{k} (X_{ij}) + σ (X_{ij}) ε_{ij} .

(1.2)

The sequences $\{λ_{k}\}_{k = 1}^{κ}$ and $\{ϕ_{k} (x)\}_{k = 1}^{κ}$ and the random coefficients ξ_ik exist mathematically, but are unknown and unobservable.

Two distinct types of functional data have been studied. Li and Hsing (2007) and Li and Hsing (2009) concern dense functional data, which in the context of model (1.1) means min_{1≤i≤n} N_i → ∞ as n → ∞. On the other hand, Yao, Müller, and Wang (2005a), Yao, Müller, and Wang (2005b), and Yao (2007) studied sparse longitudinal data for which the N_i are iid copies of a positive integer-valued random variable. Pointwise asymptotic distributions were obtained in Yao (2007) for local polynomial estimators of m(x) based on sparse functional data, but without uniform confidence bands. Nonparametric simultaneous confidence bands are a powerful tool of global inference for functions; see Claeskens and Van Keilegom (2003), Fan and Zhang (2000), Hall and Titterington (1988), Härdle (1989), Härdle and Marron (1991), Huang, Wang, Yang, and Kravchenko (2008), Ma and Yang (2010), Song and Yang (2009), Wang and Yang (2009), Wu and Zhao (2007), Zhao and Wu (2008), and Zhou, Shen, and Wolfe (1998) for their theory and applications. The fact that a simultaneous confidence band has not been established for functional data analysis is certainly not due to lack of interesting applications, but to the greater technical difficulty in formulating such bands for functional data and establishing their theoretical properties. Specifically, the strong approximation results used to establish the asymptotic confidence level in nearly all published works on confidence bands, commonly known as “Hungarian embedding”, are unavailable for sparse functional data.

In this paper, we present simultaneous confidence bands for m(x) in sparse functional data via a piecewise-constant spline smoothing approach. While there exist a number of smoothing methods for estimating m(x) and G(x, x′), such as kernels (Yao, Müller, and Wang (2005a); Yao, Müller, and Wang (2005b); Yao (2007)), penalized splines (Cardot, Ferraty, and Sarda (2003); Cardot and Sarda (2005); Yao and Lee (2006)), wavelets (Morris and Carroll (2006)), and parametric splines (James (2002)), we choose B splines (Zhou, Huang, and Carroll (2008)) for simple implementation, fast computation, and explicit expression; see Huang and Yang (2004), Wang and Yang (2007), and Xue and Yang (2006) for discussion of the relative merits of various smoothing methods.

We organize our paper as follows. In Section 2 we state our main results on confidence bands constructed from piecewise constant splines. In Section 3 we provide further insights into the error structure of spline estimators. Section 4 describes the actual steps to implement the confidence bands. Section 5 reports findings of a simulation study. An empirical example in Section 6 illustrates how to use the proposed confidence band for inference. Proofs of technical lemmas are in the Appendix.

2. Main results

For convenience, we denote the supremum norm of a function r on [a, b] by ∥r∥_∞ = sup_{x∈[a,b]} |r(x)|, and the modulus of continuity of a continuous function r on [a, b] by ω (r, δ) = max_{x,x′∈[a,b],|x−x′|≤δ} |r(x) − r(x′)|. Denote by ∥g∥₂ the theoretical L² norm of a function g on [a, b], ${∥ g ∥}_{2}^{2} = E \{g^{2} (X)\} = \int_{a}^{b} g^{2} (x) f (x) dx$, where f(x) is the density function of X, and the empirical L² norm as ${∥ g ∥}_{2, N_{T}}^{2} = N_{T}^{- 1} \sum_{i = 1}^{n} \sum_{j = 1}^{N_{i}} g^{2} (X_{ij})$, where we denote the total sample size by $N_{T} = \sum_{i = 1}^{n} N_{i}$. Without loss of generality, we take the range of X, χ = [a, b], to be [0, 1]. For any β ∈ (0, 1], we denote the collection of order β Hölder continuous functions on [0, 1] by

C^{0, β} [0, 1] = \left\{ϕ : {∥ ϕ ∥}_{0, β} = \sup_{x \neq x^{'}, x, x^{'} \in [0, 1]} \frac{| ϕ (x) - ϕ (x^{'}) |}{{| x - x^{'} |}^{β}} < + \infty\right\},

in which ∥ϕ∥_{0,β} is the C^{0,β}-seminorm of ϕ. Let C [0, 1] be the collection of continuous functions on [0, 1]. Clearly, C^{0,β} [0, 1] ⊂ C [0, 1] and, if ϕ ∈ C^{0,β} [0, 1], then ω (ϕ, δ) ≤ ∥ϕ∥_{0,β} δ^β.

To introduce the spline functions, divide the finite interval [0, 1] into (N_s+1) equal subintervals χ_J = [t_J, t_{J+1}), J = 0, …, N_s − 1, and χ_{N_s} = [t_{N_s}, 1]. A sequence of equally spaced points $\{t_{J}\}_{J = 1}^{N_{s}}$, called interior knots, is given as

t_{0} = 0 < t_{1} < \dots < t_{N_{s}} < 1 = t_{N_{s} + 1}, \quad t_{J} = J h_{s}, \quad 0 \leq J \leq N_{s} + 1, \quad h_{s} = 1 / (N_{s} + 1),

in which h_s is the distance between neighboring knots. We denote by G⁽⁻¹⁾ = G⁽⁻¹⁾ [0, 1] the space of functions that are constant on each χ_J. For any x ∈ [0, 1], define its location index as J(x) = J_n(x) = min {[x/h_s], N_s} so that t_{J_n(x)} ≤ x < t_{J_n(x)+1}, ∀x ∈ [0, 1]. We propose to estimate the mean function m(x) by

\hat{m} (x) = \underset{g \in G^{(- 1)}}{argmin} \sum_{i = 1}^{n} \sum_{j = 1}^{N_{i}} \{Y_{ij} - g (X_{ij})\}^{2} .

(2.1)
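Because G⁽⁻¹⁾ consists of functions that are constant on each χ_J, the least squares problem (2.1) decouples across subintervals and its solution on χ_J is the sample mean of the responses whose predictors fall in χ_J. The following is a minimal sketch of this computation, assuming the predictors have already been rescaled to [0, 1]; the function names `location_index`, `fit_constant_spline`, and `predict` are illustrative and do not come from the paper.

```python
import numpy as np

def location_index(x, n_s):
    """J(x) = min([x / h_s], N_s) with h_s = 1 / (N_s + 1)."""
    return np.minimum(np.floor(np.asarray(x) * (n_s + 1)).astype(int), n_s)

def fit_constant_spline(x, y, n_s):
    """Minimizer of (2.1): the mean of Y within each of the N_s + 1 subintervals."""
    idx = location_index(x, n_s)
    counts = np.bincount(idx, minlength=n_s + 1)
    sums = np.bincount(idx, weights=y, minlength=n_s + 1)
    # An empty subinterval leaves the coefficient undetermined; mark it as NaN.
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(counts > 0, sums / counts, np.nan)

def predict(coef, x):
    """Evaluate the fitted step function at the points x."""
    return coef[location_index(x, len(coef) - 1)]

# Toy illustration with one interior knot (two subintervals).
x = np.array([0.1, 0.2, 0.6, 0.9, 1.0])
y = np.array([1.0, 3.0, 2.0, 4.0, 6.0])
coef = fit_constant_spline(x, y, n_s=1)
```

In the toy illustration the two coefficients are the means of the responses on [0, 0.5) and [0.5, 1], respectively.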

The technical assumptions we need are as follows:

(A1)
The regression function m(x) ∈ C^0,1 [0, 1].
(A2)
The functions f(x), σ(x), and ϕ_k(x) ∈ C^0,β [0, 1] for some β ∈ (2/3, 1] with f(x) ∈ [c_f, C_f], σ(x) ∈ [c_σ, C_σ], x ∈ [0, 1], for constants 0 < c_f ≤ C_f < ∞, 0 < c_σ ≤ C_σ < ∞.
(A3)
The set of random variables ${(N_{i})}_{i = 1}^{n}$ is a subset of ${(N_{i})}_{i = 1}^{\infty}$ consisting of independent variables N_i, the numbers of observations made for the i-th subject, i = 1, 2, …, with N_i ~ N, where N > 0 is a positive integer-valued random variable with $E {N^{2 r}} \leq r! c_{N}^{r}$ , r = 2, 3, … for some constant c_N > 0. The set of random variables ${(X_{ij}, Y_{ij}, ε_{ij})}_{i = 1, j = 1}^{n, N_{i}}$ is a subset of ${(X_{ij}, Y_{ij}, ε_{ij})}_{i = 1, j = 1}^{\infty, \infty}$ in which ${(X_{ij}, ε_{ij})}_{i = 1, j = 1}^{\infty, \infty}$ are iid. The number κ of nonzero eigenvalues is finite and the random coefficients ξ_ik, k = 1, …, κ, i = 1, …, ∞ are iid N (0, 1). The variables ${(N_{i})}_{i = 1}^{\infty}, {(ξ_{ik})}_{i = 1, k = 1}^{\infty, κ}, {(X_{ij})}_{i = 1, j = 1}^{\infty, \infty}, {(ε_{ij})}_{i = 1, j = 1}^{\infty, \infty}$ are independent.
(A4)
As n → ∞, the number of interior knots N_s = o (n^ϑ) for some ϑ ∈ (1/3, 2β − 1) while $N_{s}^{- 1} = o \{n^{- 1 / 3} {(\log n)}^{- 1 / 3}\}$. The subinterval length $h_{s} \sim N_{s}^{- 1}$.
(A5)
There exists r > 2/ {β − (1 + ϑ) /2} such that E |ε₁₁|^r < ∞.

Assumptions (A1), (A2), (A4) and (A5) are similar to (A1)–(A4) in Wang and Yang (2009), with (A1) weaker than its counterpart. Assumption (A3) is the same as (A1.1), (A1.2), and (A5) in Yao, Müller, and Wang (2005b), without requiring joint normality of the measurement errors ε_ij.

We now introduce the B-spline basis of G⁽⁻¹⁾, the space of piecewise constant splines, as $\{b_{J} (x)\}_{J = 0}^{N_{s}}$, which are simply indicator functions of the intervals χ_J, b_J(x) = I_{χ_J} (x), J = 0, 1, …, N_s. Define

\begin{array}{c} c_{J, n} = {∥ b_{J} ∥}_{2}^{2} = \int_{0}^{1} b_{J} (x) f (x) dx, J = 0, \dots, N_{s}, \\ σ_{Y}^{2} (x) = var (Y ∣ X = x) = G (x, x) + σ^{2} (x), \forall x \in [0, 1], \end{array}

(2.2)

σ_{n}^{2} (x) = c_{J (x), n}^{- 2} \{n E (N_{1})\}^{- 1} \left[\frac{E \{N_{1} (N_{1} - 1)\}}{E N_{1}} \sum_{k = 1}^{κ} \left\{\int_{χ_{J (x)}} ϕ_{k} (u) f (u) du\right\}^{2} + \int_{χ_{J (x)}} σ_{Y}^{2} (u) f (u) du\right] .

(2.3)

In addition, define $Q_{N_{s} + 1} (α) = b_{N_{s} + 1} - a_{N_{s} + 1}^{- 1} \log \{- (1 / 2) \log (1 - α)\}$,

a_{N_{s} + 1} = \{2 \log (N_{s} + 1)\}^{1 / 2}, \quad b_{N_{s} + 1} = a_{N_{s} + 1} - \frac{\log (2 π a_{N_{s} + 1}^{2})}{2 a_{N_{s} + 1}},

(2.4)

for any α ∈ (0, 1). We now state our main results.

Theorem 1

Under Assumptions (A1)-(A5), for any α ∈ (0, 1),

\begin{array}{c} \lim_{n \to \infty} P \left\{\sup_{x \in [0, 1]} | \hat{m} (x) - m (x) | / σ_{n} (x) \leq Q_{N_{s} + 1} (α)\right\} = 1 - α, \\ \lim_{n \to \infty} P \left\{| \hat{m} (x) - m (x) | / σ_{n} (x) \leq Z_{1 - α / 2}\right\} = 1 - α, \forall x \in [0, 1], \end{array}

where σ_n(x) and Q_{N_s+1} (α) are given in (2.3) and (2.4), respectively, while Z_1−α/2 is the 100 (1 − α/2)^th percentile of the standard normal distribution.
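The two critical values in Theorem 1 are simple closed-form quantities. The sketch below evaluates Q_{N_s+1}(α) from (2.4) alongside the normal percentile Z_{1−α/2}; the function names are illustrative, and N_s ≥ 1 is assumed so that a_{N_s+1} > 0.

```python
import math
from statistics import NormalDist

def band_quantile(alpha, n_s):
    """Q_{N_s+1}(alpha) = b - a^{-1} log{-(1/2) log(1 - alpha)}, with a, b as in (2.4)."""
    a = math.sqrt(2.0 * math.log(n_s + 1))
    b = a - math.log(2.0 * math.pi * a * a) / (2.0 * a)
    return b - math.log(-0.5 * math.log(1.0 - alpha)) / a

def pointwise_quantile(alpha):
    """Z_{1 - alpha/2}, the 100(1 - alpha/2)th percentile of N(0, 1)."""
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

q_band = band_quantile(0.05, 44)      # simultaneous critical value for N_s = 44
q_point = pointwise_quantile(0.05)    # pointwise critical value
```

For a fixed α the simultaneous critical value grows slowly with N_s, at the rate {2 log(N_s + 1)}^{1/2}, which is why the band is wider than the pointwise intervals.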

The definition of σ_n(x) in (2.3) does not allow for practical use. The next proposition provides two data-driven alternatives.

Proposition 1

Under Assumptions (A2), (A3), and (A5), as n → ∞,

\sup_{x \in [0, 1]} \left\{| σ_{n}^{- 1} (x) σ_{n, IID} (x) - 1 | + | σ_{n}^{- 1} (x) σ_{n, LONG} (x) - 1 |\right\} = O (h_{s}^{β}),

in which for x ∈ [0, 1], σ_n,IID (x) ≡ σ_Y (x) {f(x)h_snE(N₁)}^−1/2 and

σ_{n, LONG} (x) \equiv σ_{n, IID} (x) \left[1 + \frac{E \{N_{1} (N_{1} - 1)\}}{E N_{1}} h_{s} \frac{G (x, x) f (x)}{σ_{Y}^{2} (x)}\right]^{1 / 2} .

Using σ_n,IID(x) instead of σ_n(x) means treating the (X_ij, Y_ij) as iid data rather than as sparse longitudinal data, while using σ_n,LONG(x) means correctly accounting for the longitudinal correlation structure. The difference between the two approaches, although asymptotically negligible uniformly for x ∈ [0, 1] according to Proposition 1, is significant in finite samples, as shown in the simulation results of Section 5. For a similar phenomenon with kernel smoothing, see Wang, Carroll, and Lin (2005).
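To give a sense of the finite-sample gap, the sketch below evaluates the ratio σ_n,LONG(x)/σ_n,IID(x) from Proposition 1 at a single point. The inputs are borrowed from the simulation design of Section 5 purely for illustration (N uniform on 25, …, 35, f ≡ 1, σ = 0.5, N_s = 44, and x = 1/2, where G(x, x) = ϕ₁²(x) + ϕ₂²(x) = 4/5); the helper names are hypothetical.

```python
import math

def uniform_moments(lo, hi):
    """E(N) and E{N(N - 1)} for N uniform on the integers lo, ..., hi."""
    values = range(lo, hi + 1)
    en = sum(values) / len(values)
    en_nm1 = sum(v * (v - 1) for v in values) / len(values)
    return en, en_nm1

def long_over_iid(en, en_nm1, h_s, g_xx, f_x, sigma_y2_x):
    """sigma_{n,LONG}(x) / sigma_{n,IID}(x) as defined in Proposition 1."""
    return math.sqrt(1.0 + (en_nm1 / en) * h_s * g_xx * f_x / sigma_y2_x)

en, en_nm1 = uniform_moments(25, 35)
# sigma_Y^2(x) = G(x, x) + sigma^2 = 0.8 + 0.25 at x = 1/2 with sigma = 0.5.
ratio = long_over_iid(en, en_nm1, h_s=1.0 / 45, g_xx=0.8, f_x=1.0, sigma_y2_x=1.05)
```

Under these assumed inputs the ratio is roughly 1.22, so ignoring the within-subject correlation would understate the standard deviation by about a fifth even though the correction term is of order h_s.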

Corollary 1

Under Assumptions (A1)-(A5), for any α ∈ (0, 1), as n → ∞, an asymptotic 100 (1 − α) % simultaneous confidence band for m(x), x ∈ [0, 1] is

\hat{m} (x) \pm σ_{n} (x) Q_{N_{s} + 1} (α),

while an asymptotic 100 (1 − α) % pointwise confidence interval for m(x), x ∈ [0, 1], is m̂(x) ± σ_n(x)Z_1−α/2.

3. Decomposition

In this section, we decompose the estimation error m̂(x) − m(x) by the representation of Y_ij as the sum of m (X_ij), $\sum_{k = 1}^{κ} ξ_{ik} ϕ_{k} (X_{ij})$ , and σ (X_ij) ε_ij.

We introduce the rescaled B-spline basis ${B_{J} (x)}_{J = 0}^{N_{s}}$ for G⁽⁻¹⁾, which is $B_{J} (x) \equiv b_{J} (x) {∥ b_{J} ∥}_{2}^{- 1}$ , J = 0, …, N_s. Therefore,

B_{J} (x) \equiv b_{J} (x) {c_{J, n}}^{- 1 / 2}, J = 0, \dots, N_{s} .

(3.1)

It is easily verified that ${∥ B_{J} ∥}_{2}^{2} = 1$ , J = 0, 1, …, N_s, 〈B_J, B_J′〉 ≡ 0, J ≠ J′.

The definition of m̂(x) in (2.1) means that

\hat{m} (x) \equiv \sum_{J = 0}^{N_{s}} {\hat{λ}}_{J}^{'} b_{J} (x),

(3.2)

with coefficients $\{{\hat{λ}}_{0}^{'}, \dots, {\hat{λ}}_{N_{s}}^{'}\}^{T}$ as solutions of the least squares problem

\{{\hat{λ}}_{0}^{'}, \dots, {\hat{λ}}_{N_{s}}^{'}\}^{T} = \underset{\{λ_{0}, \dots, λ_{N_{s}}\} \in R^{N_{s} + 1}}{argmin} \sum_{i = 1}^{n} \sum_{j = 1}^{N_{i}} \left\{Y_{ij} - \sum_{J = 0}^{N_{s}} λ_{J} b_{J} (X_{ij})\right\}^{2} .

Simple linear algebra shows that $\hat{m} (x) \equiv \sum_{J = 0}^{N_{s}} {\hat{λ}}_{J} B_{J} (x)$ , where the coefficients {λ̂₀, …, λ̂_{N_s}}^T are solutions of the least squares problem

\{{\hat{λ}}_{0}, \dots, {\hat{λ}}_{N_{s}}\}^{T} = \underset{\{λ_{0}, \dots, λ_{N_{s}}\} \in R^{N_{s} + 1}}{argmin} \sum_{i = 1}^{n} \sum_{j = 1}^{N_{i}} \left\{Y_{ij} - \sum_{J = 0}^{N_{s}} λ_{J} B_{J} (X_{ij})\right\}^{2} .

(3.3)

Projecting the relationship in model (1.2) onto the linear subspace of $R^{N_{T}}$ spanned by {B_J (X_ij)}_{1≤j≤N_i,1≤i≤n,0≤J≤N_s}, we obtain the following crucial decomposition in the space G⁽⁻¹⁾ of spline functions:

\hat{m} (x) = \tilde{m} (x) + \tilde{e} (x) = \tilde{m} (x) + \tilde{ε} (x) + \sum_{k = 1}^{κ} {\tilde{ξ}}_{k} (x),

(3.4)

\tilde{m} (x) = \sum_{J = 0}^{N_{s}} {\tilde{λ}}_{J} B_{J} (x), \tilde{ε} (x) = \sum_{J = 0}^{N_{s}} {\tilde{a}}_{J} B_{J} (x), {\tilde{ξ}}_{k} (x) = \sum_{J = 0}^{N_{s}} {\tilde{τ}}_{k, J} B_{J} (x) .

(3.5)

The vectors {λ̃₀, …, λ̃_{N_s}}^T, {ã₀, …, ã_{N_s}}^T, and {τ̃_k,0, …, τ̃_{k,N_s}}^T are solutions to (3.3) with Y_ij replaced by m(X_ij), σ (X_ij) ε_ij, and ξ_ikϕ_k (X_ij), respectively. We cite next an important result concerning the function m̃(x). The first part is from de Boor (2001), p. 149, and the second is from Theorem 5.1 of Huang (2003).

Theorem 2

There is an absolute constant C_g > 0 such that for every ϕ ∈ C [0, 1], there exists a function g ∈ G⁽⁻¹⁾ [0, 1] that satisfies ∥g − ϕ∥_∞ ≤ C_gω (ϕ, h_s). In particular, if ϕ ∈ C^0,β [0, 1] for some β ∈ (0, 1], then ${∥ g - ϕ ∥}_{\infty} \leq C_{g} {∥ ϕ ∥}_{0, β} h_{s}^{β}$ . Under Assumptions (A1) and (A4), with probability approaching 1, the function m̃(x) defined in (3.5) satisfies ∥m̃(x) − m(x)∥_∞ = O (h_s).

The next proposition concerns the function ẽ(x) given in (3.4).

Proposition 2

Under Assumptions (A2)-(A5), for any τ ∈ R, and σ_n(x), a_{N_s+1}, and b_{N_s+1} as given in (2.3) and (2.4),

\lim_{n \to \infty} P \left\{\sup_{x \in [0, 1]} | σ_{n} {(x)}^{- 1} \tilde{e} (x) | \leq τ / a_{N_{s} + 1} + b_{N_{s} + 1}\right\} = \exp (- 2 e^{- τ}) .

4. Implementation

In this section, we describe procedures to implement the confidence bands and intervals given in Corollary 1. Given any data set $\{(X_{ij}, Y_{ij})\}_{j = 1, i = 1}^{N_{i}, n}$ from model (1.2), the spline estimator m̂(x) is obtained by (3.2), and the number of interior knots in (3.2) is taken to be $N_{s} = [c N_{T}^{1 / 3} (\log n)]$, in which [a] denotes the integer part of a and c is a positive constant. When constructing the confidence bands, one needs to evaluate the function $σ_{n}^{2} (x)$ by estimating the unknown functions f(x), $σ_{Y}^{2} (x)$, and G (x, x), and then plugging in these estimators; the same approach is taken in Wang and Yang (2009).

The number of interior knots for pilot estimation of f(x), $σ_{Y}^{2} (x)$ , and G (x, x) is taken to be $N_{s}^{*} = [n^{1 / 3}]$ , and $h_{s}^{*} = 1 / (1 + N_{s}^{*})$ . The histogram pilot estimator of the density function f(x) is

\hat{f} (x) = \left\{\sum_{i = 1}^{n} \sum_{j = 1}^{N_{i}} b_{J (x)} (X_{ij})\right\} / \left\{\left(\sum_{i = 1}^{n} N_{i}\right) h_{s}^{*}\right\} .

Defining the vector $R = \{R_{ij}\}_{1 \leq j \leq N_{i}, 1 \leq i \leq n}^{T} = \{{(Y_{ij} - \hat{m} (X_{ij}))}^{2}\}_{1 \leq j \leq N_{i}, 1 \leq i \leq n}^{T}$, the estimator of $σ_{Y}^{2} (x)$ is ${\hat{σ}}_{Y}^{2} (x) = \sum_{J = 0}^{N_{s}^{*}} {\hat{ρ}}_{J} b_{J} (x)$, where the coefficients $\{{\hat{ρ}}_{0}, \dots, {\hat{ρ}}_{N_{s}^{*}}\}^{T}$ are solutions of the least squares problem:

\{{\hat{ρ}}_{0}, \dots, {\hat{ρ}}_{N_{s}^{*}}\}^{T} = \underset{\{ρ_{0}, \dots, ρ_{N_{s}^{*}}\} \in R^{N_{s}^{*} + 1}}{argmin} \sum_{i = 1}^{n} \sum_{j = 1}^{N_{i}} \left\{R_{i j} - \sum_{J = 0}^{N_{s}^{*}} ρ_{J} b_{J} (X_{i j})\right\}^{2} .

The pilot estimator of covariance function G (x, x′) is

\hat{G} (x, x^{'}) = \arg \min_{g \in G^{(- 1)} \otimes G^{(- 1)}} \sum_{i = 1}^{n} \sum_{j, j^{'} = 1, j \neq j^{'}}^{N_{i}} \{C_{ijj^{'}} - g (X_{ij}, X_{ij^{'}})\}^{2},

where C_ijj′ = {Y_ij − m̂ (X_ij)} {Y_ij′ − m̂ (X_ij′)}, 1 ≤ j, j′ ≤ N_i, 1 ≤ i ≤ n. The function σ_n(x) is estimated by either σ̂_n,IID(x) ≡ σ̂_Y(x) {f̂(x)h_sN_T}^−1/2 or

{\hat{σ}}_{n, LONG} (x) \equiv {\hat{σ}}_{n, IID} (x) \left\{1 + \left(\sum_{i = 1}^{n} N_{i}^{2} / N_{T} - 1\right) \frac{\hat{G} (x, x)}{{\hat{σ}}_{Y}^{2} (x)} \hat{f} (x) h_{s}\right\}^{1 / 2} .

We now state a result that is easily proved by the standard theory of kernel and spline smoothing, as in Wang and Yang (2009).

Proposition 3

Under Assumptions (A1)-(A5), as n → ∞

\sup_{x \in [0, 1]} \left\{| {\hat{σ}}_{n, IID} (x) σ_{n, IID}^{- 1} (x) - 1 | + | {\hat{σ}}_{n, LONG} (x) σ_{n, LONG}^{- 1} (x) - 1 |\right\} = O_{a.s.} (h_{s}^{β} + n^{- 1 / 2} N_{s}^{- 1} {(\log n)}^{1 / 2}) .

Proposition 1, about how σ_n,IID(x) and σ_n,LONG(x) uniformly approximate σ_n(x), and Proposition 3 together imply that both σ̂_n,IID(x) and σ̂_n,LONG(x) approximate σ_n(x) uniformly at a rate faster than n^{−1/2+1/3} (log n)^{1/2−1/3}, according to Assumption (A5). Therefore as n → ∞, the confidence bands

\hat{m} (x) \pm {\hat{σ}}_{n, IID} (x) Q_{N_{s} + 1} (α),

(4.1)

\hat{m} (x) \pm {\hat{σ}}_{n, LONG} (x) Q_{N_{s} + 1} (α),

(4.2)

with Q_{N_s+1} (α) given in (2.4), and the pointwise intervals m̂(x) ± σ̂_n,IID(x)Z_1−α/2, m̂(x) ± σ̂_n,LONG(x)Z_1−α/2 have asymptotic confidence level 1 − α.
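The steps of this section can be assembled into a short routine. The following is a sketch under stated assumptions rather than the authors' code: predictors are taken to be rescaled to [0, 1], every subinterval is assumed to contain data, the diagonal of Ĝ is computed cell by cell as the average of C_ijj′ over within-subject pairs falling in the same pilot subinterval, and a negative Ĝ(x, x) is truncated at zero (a practical safeguard not discussed in the text). All function names are illustrative.

```python
import math
import numpy as np

def bin_index(x, n_knots):
    """Location index J(x) = min([x (N + 1)], N) for N interior knots."""
    return np.minimum(np.floor(np.asarray(x) * (n_knots + 1)).astype(int), n_knots)

def bin_means(idx, values, n_bins):
    """Piecewise-constant least squares fit: the mean of `values` in each bin."""
    counts = np.bincount(idx, minlength=n_bins)
    sums = np.bincount(idx, weights=values, minlength=n_bins)
    return sums / np.maximum(counts, 1)  # empty bins are set to 0 (a simplification)

def band_quantile(alpha, n_s):
    """Q_{N_s+1}(alpha) from (2.4)."""
    a = math.sqrt(2.0 * math.log(n_s + 1))
    b = a - math.log(2.0 * math.pi * a * a) / (2.0 * a)
    return b - math.log(-0.5 * math.log(1.0 - alpha)) / a

def long_band(x, y, subject, n_s, alpha=0.05, grid_size=101):
    """Band (4.2) evaluated on an equally spaced grid of [0, 1]."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    _, sid = np.unique(subject, return_inverse=True)
    n, n_total = int(sid.max()) + 1, x.size
    h_s = 1.0 / (n_s + 1)

    # Spline estimator (3.2) and residuals Y_ij - m_hat(X_ij).
    idx = bin_index(x, n_s)
    m_coef = bin_means(idx, y, n_s + 1)
    resid = y - m_coef[idx]

    # Pilot knots N_s^* = [n^{1/3}]; the small offset guards against rounding.
    n_p = int(math.floor(n ** (1.0 / 3.0) + 1e-9))
    p_bins = n_p + 1
    pidx = bin_index(x, n_p)
    f_hat = np.bincount(pidx, minlength=p_bins) / (n_total / p_bins)  # count / (N_T h_s^*)
    sig_y2 = bin_means(pidx, resid ** 2, p_bins)

    # Diagonal of G_hat: average of C_ijj' over within-subject pairs j != j' in the
    # same pilot bin, using sum_{j != j'} r_j r_j' = (sum r)^2 - sum r^2.
    cell, size = sid * p_bins + pidx, n * p_bins
    s = np.bincount(cell, weights=resid, minlength=size).reshape(n, p_bins)
    q = np.bincount(cell, weights=resid ** 2, minlength=size).reshape(n, p_bins)
    m = np.bincount(cell, minlength=size).reshape(n, p_bins)
    pairs = (m * (m - 1)).sum(axis=0)
    g_diag = np.maximum((s ** 2 - q).sum(axis=0) / np.maximum(pairs, 1), 0.0)  # truncation: an assumption

    # Plug-in standard deviations sigma_hat_{n,IID} and sigma_hat_{n,LONG} on the grid.
    grid = np.linspace(0.0, 1.0, grid_size)
    gp = bin_index(grid, n_p)
    sd_iid = np.sqrt(sig_y2[gp] / (f_hat[gp] * h_s * n_total))
    n_i = np.bincount(sid)
    ratio = (n_i ** 2).sum() / n_total - 1.0
    sd_long = sd_iid * np.sqrt(1.0 + ratio * g_diag[gp] / sig_y2[gp] * f_hat[gp] * h_s)

    m_hat = m_coef[bin_index(grid, n_s)]
    half = sd_long * band_quantile(alpha, n_s)
    return grid, m_hat, m_hat - half, m_hat + half
```

Replacing `band_quantile(alpha, n_s)` by the normal percentile Z_{1−α/2} in the last step would give the pointwise intervals, and dropping the inflation factor would give band (4.1).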

5. Simulation

To illustrate the finite-sample performance of the spline approach, we generated data from the model

Y_{ij} = m (X_{ij}) + \sum_{k = 1}^{2} ξ_{ik} ϕ_{k} (X_{ij}) + σ ε_{ij}, 1 \leq j \leq N_{i}, 1 \leq i \leq n,

with X ~ Uniform[0, 1], ξ_k ~ Normal(0, 1), k = 1, 2, ε ~ Normal(0, 1), N_i having a discrete uniform distribution on {25, …, 35} for 1 ≤ i ≤ n, and $m (x) = \sin \{2 π (x - 1 / 2)\}, ϕ_{1} (x) = - 2 \cos \{π (x - 1 / 2)\} / \sqrt{5}, ϕ_{2} (x) = \sin \{π (x - 1 / 2)\} / \sqrt{5}$, thus λ₁ = 2/5, λ₂ = 1/10. The noise levels were σ = 0.5, 1.0, the number of subjects n was taken to be 20, 50, 100, 200, the confidence levels were 1 − α = 0.95, 0.99, and the constant c in the definition of N_s in Section 4 was taken to be 1, 2, 3. We found that the confidence band (4.1) did not have good coverage rates for moderate sample sizes, and hence in Table 1 we report the coverage as the percentage out of the total 200 replications for which the true curve was covered by (4.2) at the 101 points {k/100, k = 0, …, 100}.
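One replication of this design can be generated as in the sketch below. The function names and the random number generator are illustrative choices, not taken from the paper.

```python
import numpy as np

def mean_fn(x):
    return np.sin(2.0 * np.pi * (x - 0.5))

def phi1(x):
    return -2.0 * np.cos(np.pi * (x - 0.5)) / np.sqrt(5.0)

def phi2(x):
    return np.sin(np.pi * (x - 0.5)) / np.sqrt(5.0)

def simulate(n, sigma, rng):
    """Draw one data set: subject labels, predictors X_ij, and responses Y_ij."""
    n_i = rng.integers(25, 36, size=n)            # discrete uniform on 25, ..., 35
    subject = np.repeat(np.arange(n), n_i)
    x = rng.uniform(0.0, 1.0, size=subject.size)
    xi = rng.standard_normal((n, 2))              # random coefficients xi_i1, xi_i2
    eps = rng.standard_normal(subject.size)
    y = (mean_fn(x) + xi[subject, 0] * phi1(x) + xi[subject, 1] * phi2(x)
         + sigma * eps)
    return subject, x, y

rng = np.random.default_rng(2012)
subject, x, y = simulate(n=20, sigma=0.5, rng=rng)

# Numerical check of the eigenvalues: lambda_k is the integral of phi_k^2 over [0, 1].
u = (np.arange(100000) + 0.5) / 100000
lambda1, lambda2 = np.mean(phi1(u) ** 2), np.mean(phi2(u) ** 2)
```

The midpoint-rule values `lambda1` and `lambda2` reproduce λ₁ = 2/5 and λ₂ = 1/10, and the same grid confirms that ϕ₁ and ϕ₂ are orthogonal on [0, 1].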

Table 1.

Uniform coverage rates from 200 replications using the confidence band (4.2). For each sample size n, the first row is the coverage of a nominal 95% confidence band, while the second row is for a 99% confidence band.

σ	n	1 − α	c = 1	c = 2	c = 3

0.5	20	0.950	0.920	0.930	0.800
	20	0.990	0.990	0.990	0.900

	50	0.950	0.960	0.965	0.910
	50	0.990	0.995	0.995	0.965

	100	0.950	0.955	0.955	0.955
	100	0.990	1.000	1.000	0.985

	200	0.950	0.950	0.965	0.975
	200	0.990	0.985	0.985	0.990

1.0	20	0.950	0.935	0.930	0.735
	20	0.990	0.990	0.990	0.870

	50	0.950	0.975	0.960	0.895
	50	0.990	0.995	0.995	0.980

	100	0.950	0.950	0.940	0.935
	100	0.990	0.995	0.990	0.990

	200	0.950	0.940	0.965	0.960
	200	0.990	0.985	0.995	0.995

At all noise levels, the coverage percentages for the confidence band (4.2) are very close to the nominal confidence levels 0.95 and 0.99 for c = 1, 2, but decline for c = 3 when n = 20, 50. The coverage percentages thus depend on the choice of N_s, and the dependency becomes stronger when sample sizes decrease. For large sample sizes n = 100, 200, the effect of the choice of N_s on the coverage percentages is insignificant. Because N_s varies with N_i, for 1 ≤ i ≤ n, the data-driven selection of some “optimal” N_s remains an open problem.

We next examine two alternative methods to compute the confidence band, based on the observation that the estimated mean function m̂(x) and the confidence intervals are step functions that remain the same on each subinterval χ_J, 0 ≤ J ≤ N_s. Following an associate editor’s suggestion, locally weighted smoothing was applied to the upper and lower confidence limits to generate a smoothed confidence band. Following a referee’s suggestion to treat the number (N_s + 1) of subintervals as fixed instead of growing to infinity, a naive parametric confidence band was computed as

\hat{m} (x) \pm {\hat{σ}}_{n, LONG} (x) Q_{1 - α, N_{s} + 1}

(5.1)

in which Q_{1−α,N_s+1} = Z_{{1+(1−α)^{1/(N_s+1)}}/2} is the (1 − α) quantile of the maximal absolute value of (N_s + 1) iid N (0, 1) random variables. We compare the performance of the confidence band in (4.2), the smoothed band, and the naive parametric band in (5.1). Given n = 20 with N_s = 8, 12, and n = 50 with N_s = 44 (by taking c = 1 in the definition of N_s in Section 4), σ = 0.5, 1.0, and 1 − α = 0.99, Table 2 reports the coverage percentages P̂, P̂_naive, P̂_smooth and the average maximal widths W, W_naive, W_smooth of the N_s + 1 intervals out of 200 replications calculated from confidence bands (4.2), (5.1), and the smoothed confidence bands, respectively.
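The naive critical value follows from the identity P(max_J |Z_J| ≤ q) = {2Φ(q) − 1}^{N_s+1} for iid standard normal Z_J. The sketch below computes it and compares it with the extreme-value critical value Q_{N_s+1}(α) of (2.4); the function names are illustrative.

```python
import math
from statistics import NormalDist

def naive_quantile(alpha, n_s):
    """(1 - alpha) quantile of the maximum of |Z_0|, ..., |Z_{N_s}| for iid N(0, 1)."""
    p = (1.0 + (1.0 - alpha) ** (1.0 / (n_s + 1))) / 2.0
    return NormalDist().inv_cdf(p)

def band_quantile(alpha, n_s):
    """Q_{N_s+1}(alpha) from (2.4), repeated here so the sketch is self-contained."""
    a = math.sqrt(2.0 * math.log(n_s + 1))
    b = a - math.log(2.0 * math.pi * a * a) / (2.0 * a)
    return b - math.log(-0.5 * math.log(1.0 - alpha)) / a

# The settings of Table 2: 1 - alpha = 0.99 with N_s = 8, 12, 44.
pairs = {n_s: (naive_quantile(0.01, n_s), band_quantile(0.01, n_s)) for n_s in (8, 12, 44)}
```

In each of these settings the naive critical value is smaller than Q_{N_s+1}(α), which is consistent with the narrower widths W_naive reported in Table 2.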

Table 2.

Uniform coverage rates and average maximal widths of confidence intervals from 200 replications using the confidence bands (4.2), (5.1), and the smoothed bands respectively, for 1 − α = 0.99.

n	σ	N_s	P̂	P̂_naive	P̂_smooth	W	W_naive	W_smooth

20	0.5	8	0.820	0.505	0.910	1.490	1.210	1.480
	0.5	12	0.930	0.765	0.955	1.644	1.363	1.628

	1.0	8	0.910	0.655	0.970	1.725	1.401	1.721
	1.0	12	0.960	0.820	0.985	1.937	1.606	1.928

50	0.5	44	0.990	0.960	0.990	1.651	1.522	1.609

	1.0	44	0.990	0.975	1.000	2.054	1.893	2.016

In all experiments, one has P̂_smooth ≥ P̂ > P̂_naive and W > W_smooth > W_naive. The coverage percentages for both the confidence bands in (4.2) and the smoothed bands are much closer to the nominal level than those of the naive bands in (5.1), while the smoothed bands perform slightly better than the constant spline bands in (4.2), with coverage percentages generally closer to the nominal level and smaller widths. Based on these observations, the naive band is not recommended due to poor coverage. As for the smoothed band, although it has slightly better coverage than the constant spline band, its asymptotic property has yet to be established, and the second step smoothing adds to its conceptual complexity and computational burden. Therefore, with everything considered, the constant spline band is recommended for its satisfactory theoretical property, fast computing, and conceptual simplicity.

For visualization of the actual function estimates, at σ = 0.5 with n = 20, 50, Figure 1 depicts the simulated data points and the true curve, and Figure 2 shows the true curve, the estimated curve, the uniform confidence band, and the pointwise confidence intervals.

Figure 1.

Plots of simulated data scatter points at σ = 0.5: (a) n = 20, (b) n = 50, and the true curve.

Figure 2.

Plots of confidence bands (4.2) (upper and lower solid lines), pointwise confidence intervals (upper and lower dashed lines), the spline estimator (middle thin line), and the true function (middle thick line): (a) 1 − α = 0.95, n = 20, (b) 1 − α = 0.95, n = 50, (c) 1 − α = 0.99, n = 20, (d) 1 − α = 0.99, n = 50.

6. Empirical example

In this section, we apply the confidence band procedure of Section 4 to the data collected from a study by the AIDS Clinical Trials Group, ACTG 315 (Zhou, Huang, and Carroll (2008)). In this study, 46 HIV-1 infected patients were treated with potent antiviral therapy consisting of ritonavir, 3TC, and AZT. After initiation of the treatment on day 0, patients were followed for up to 10 visits. Scheduled visit times common to all patients were 7, 14, 21, 28, 35, 42, 56, 70, 84, and 168 days. Since the patients did not follow the scheduled times exactly and/or missed some visits, the actual visit times T_ij were irregularly spaced and varied from day 0 to day 196. The CD4+ cell counts during HIV/AIDS treatment are taken as the response variable Y from day 0 to day 196. Figure 3 shows that the data points (dots) are extremely sparse between day 100 and day 150, thus we first transform the data by $X_{ij} = T_{ij}^{1 / 3}$. A histogram (not shown) indicates that the X_ij-values are distributed fairly uniformly. The number of interior knots in (3.2) is taken to be N_s = 6, so that the range for visit time T, which is [0, 196], is divided into seven unequal subintervals, and in each subinterval the mean CD4+ cell count and the confidence band remain the same. Table 3 gives the mean CD4+ cell counts and the confidence limits on each subinterval at simultaneous confidence level 0.95. For instance, from day 4 to day 14, the mean CD4+ cell count is 241.62, with lower and upper limits 171.81 and 311.43, respectively.

Figure 3.

Plots of the piecewise-constant spline estimator (thick line), the data (dots), and (a) confidence band (4.2) (upper and lower solid lines), the smoothed band (upper and lower thin lines), (b) pointwise confidence intervals (upper and lower thin lines) at confidence level 0.95.

Table 3.

The mean CD4+ cell counts and the confidence limits on each subinterval at simultaneous confidence level 0.95.

Days	Mean CD4+ cell counts	Confidence limits
[0, 1)	178.23	[106.73, 249.72]
[1, 4)	200.32	[130.51, 270.13]
[4, 15)	241.62	[171.81, 311.43]
[15, 36)	271.87	[194.70, 349.04]
[36, 71)	299.51	[222.34, 376.68]
[71, 123)	280.78	[203.50, 358.06]
[123, 196]	299.27	[221.99, 376.55]
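The unequal day-scale subintervals arise because the knots are equally spaced on the transformed scale X = T^{1/3}. The sketch below back-transforms the knots to days, assuming (this preprocessing detail is not spelled out in the text) that X is rescaled to [0, 1] by dividing by 196^{1/3}; the function name is hypothetical.

```python
def knots_in_days(t_max, n_s):
    """Knots equally spaced in X = T^(1/3), mapped back to the original day scale."""
    x_max = t_max ** (1.0 / 3.0)
    h = x_max / (n_s + 1)                 # knot spacing on the cube-root scale
    return [(j * h) ** 3 for j in range(n_s + 2)]

days = knots_in_days(196.0, 6)
```

Under this assumption the knot at index J equals 196 (J/7)³ = 4J³/7 days, that is, approximately 0.57, 4.57, 15.43, 36.57, 71.43, and 123.43 for the six interior knots, which lie close to the subinterval endpoints listed in Table 3.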

Figure 3 depicts (a) the 95% simultaneous (smoothed) confidence band according to (4.2) in (median) thin lines, and (b) the pointwise 95% confidence intervals in thin lines. The center thick line is the piecewise-constant spline fit m̂(x). It can be seen that the pointwise confidence intervals are, as expected, narrower than the uniform confidence band by the same ratio. Figure 3 is essentially a graphical representation of Table 3; both confirm that the mean CD4+ cell count generally increases over time, as Zhou, Huang, and Carroll (2008) pointed out. The advantage of the current method is that such inference on the overall trend is made with a predetermined type I error probability, in this case 0.05.

7. Discussion

In this paper, we have constructed a simultaneous confidence band for the mean function m(x) for sparse longitudinal data via piecewise-constant spline fitting. Our approach extends the asymptotic results in Wang and Yang (2009) for i.i.d. random designs to a much more complicated data structure by allowing dependence of measurements within each subject. The proposed estimator has good asymptotic behavior, and the confidence band had coverage very close to the nominal in our simulation study. An empirical study for the mean CD4+ cell counts illustrates the practical use of the confidence band.

Clearly the simultaneous confidence band in (4.2) can be improved in terms of both theoretical and numerical performance if higher order spline or local linear estimators are used. Constant piecewise spline estimators are less appealing and have sub-optimal convergence rates in the sense of Hall, Müller, and Wang (2006), which uses local linear approaches. Establishing the asymptotic confidence level for such extensions, however, requires highly sophisticated extreme value theory, for sequences of non-stationary Gaussian processes over intervals growing to infinity. That is much more difficult than the proofs of this paper. We consider the confidence band in (4.2) significant because it is the first of its kind for the longitudinal case with complete theoretical justification, and with satisfactory numerical performance for commonly encountered data sizes.

Our methodology can be applied to construct simultaneous confidence bands for other functional objects, such as the covariance function G(x, x′) and its eigenfunctions, see Yao (2007). It can also be adapted to the estimation of regression functions in the functional linear model, as in Li and Hsing (2007). We expect further research along these lines to yield deep theoretical results with interesting applications.

Acknowledgments

The authors thank Shuzhuan Zheng and the seminar participants at the University of Michigan, Georgia Institute of Technology, Georgia State University, University of Toledo, University of Georgia, Soochow University, University of Science and Technology of China, and Peking University for their comments on the paper. Ma and Yang’s research was supported in part by NSF Awards DMS 0706518, DMS 1007594, an MSU Summer Support Fellowship and a grant from Risk Management Institute, National University of Singapore. Carroll’s research was supported by a grant from the National Cancer Institute (CA57030) and by Award Number KUS-CI-016-04, made by King Abdullah University of Science and Technology (KAUST). The detailed and insightful comments from an associate editor and two referees are gratefully acknowledged.

Appendix

Throughout this section, a_n ~ b_n means $\lim_{n \to \infty} b_{n} / a_{n} = c$ , where c is some nonzero constant, and for functions a_n(x), b_n(x), a_n(x) = u {b_n(x)} means a_n(x)/b_n(x) → 0 as n → ∞ uniformly for x ∈ [0, 1].

A.1. Preliminaries

We first state some results on strong approximation, extreme value theory and the classic Bernstein inequality. These are used in the proofs of Lemma A.7, Theorem 1, and Lemma A.6.

Lemma A.1

(Theorem 2.6.7 of Csörgő and Révész (1981)) Suppose that ξ_i, 1 ≤ i ≤ n, are iid with E(ξ₁) = 0, $E (ξ_{1}^{2}) = 1$, and H(x) > 0 (x ≥ 0) is an increasing continuous function such that x^{−2−γ} H(x) is increasing for some γ > 0 and x⁻¹ log H(x) is decreasing with EH (|ξ₁|) < ∞. Then there exists a Wiener process {W (t), 0 ≤ t < ∞} that is a Borel function of ξ_i, 1 ≤ i ≤ n, and constants C₁, C₂, a > 0 which depend only on the distribution of ξ₁, such that for any $\{x_{n}\}_{n = 1}^{\infty}$ satisfying H⁻¹ (n) < x_n < C₁ (n log n)^{1/2} and $S_{k} = \sum_{i = 1}^{k} ξ_{i}$,

P \left\{\max_{1 \leq k \leq n} | S_{k} - W (k) | > x_{n}\right\} \leq C_{2} n \{H (a x_{n})\}^{- 1} .

Lemma A.2

Let $ξ_{i}^{(n)}$, 1 ≤ i ≤ n, be jointly normal with $ξ_{i}^{(n)} \sim N (0, 1)$. Let $r_{ij}^{(n)} = E ξ_{i}^{(n)} ξ_{j}^{(n)}$ be such that for γ > 0, C_r > 0, $| r_{ij}^{(n)} | < C_{r} / n^{γ}$, i ≠ j. Then for τ ∈ R, as n → ∞, P{M_n,ξ ≤ τ/a_n + b_n} → exp(−2e^{−τ}), in which $M_{n, ξ} = \max \{ | ξ_{1}^{(n)} |, \dots, | ξ_{n}^{(n)} | \}$ and a_n, b_n are as in (2.4) with N_s + 1 replaced by n.

Proof

Let $\{η_{i}\}_{i = 1}^{n}$ be i.i.d. standard normal r.v.’s, $u = \{u_{i}\}_{i = 1}^{n}$, $v = \{υ_{i}\}_{i = 1}^{n}$ be vectors of real numbers, and ω = min(|u₁|, …, |u_n|, |υ₁|, …, |υ_n|). By the Normal Comparison Lemma (Leadbetter, Lindgren and Rootzén (1983), Lemma 11.1.2),

\begin{array}{c} | P \{ - υ_{j} < ξ_{j}^{(n)} \leq u_{j} \text{ for } j = 1, \dots, n \} - P \{ - υ_{j} < η_{j} \leq u_{j} \text{ for } j = 1, \dots, n \} | \\ \leq \frac{4}{2 π} \sum_{1 \leq i < j \leq n} | r_{ij}^{(n)} | \{ 1 - | r_{ij}^{(n)} |^{2} \}^{- 1 / 2} \exp ( \frac{- ω^{2}}{1 + r_{ij}^{(n)}} ) . \end{array}

If u₁ = ⋯ = u_n = υ₁ = ⋯ = υ_n = τ/a_n + b_n = τ_n, it is clear that $τ_{n}^{2} / (2 \log n) \to 1$ as n → ∞. Then $τ_{n}^{2} > (2 - ε) \log n$ for any ε > 0 and large n. Since $1 - r_{ij}^{(n) 2} \geq 1 - (C_{r} / n^{γ})^{2} \to 1$ as n → ∞ for i ≠ j, there exists C_r2 > 0 such that $1 - r_{ij}^{(n) 2} \geq C_{r 2} > 0$ and $1 + r_{ij}^{(n)} < 1 + ε$ for any ε > 0 and large n. Let M_n,η = max{|η₁|, …, |η_n|}. By Leadbetter, Lindgren and Rootzén (1983), Theorem 1.5.3, P{M_n,η ≤ τ_n} → exp(−2e^{−τ}) as n → ∞, while the above results entail

\begin{matrix} | P (M_{n, ξ} \leq τ_{n}) - P (M_{n, η} \leq τ_{n}) | \leq \frac{4}{2 π} \sum_{1 \leq i < j \leq n} | r_{ij}^{(n)} | \{ 1 - | r_{ij}^{(n)} |^{2} \}^{- 1 / 2} \exp ( \frac{- ω^{2}}{1 + r_{ij}^{(n)}} ) \\ \leq \frac{4}{2 π} \sum_{1 \leq i < j \leq n} C_{r} n^{- γ} C_{r 2}^{- 1 / 2} \exp \{ \frac{- (2 - ε) \log n}{1 + ε} \} \leq C_{r}^{'} n^{2 - γ - (2 - ε) (1 + ε)^{- 1}} \to 0 \end{matrix}

as n → ∞, provided ε is chosen small enough that 3ε/(1 + ε) < γ, so that the exponent 2 − γ − (2 − ε)(1 + ε)⁻¹ is negative. Hence P{M_n,ξ ≤ τ_n} → exp(−2e^{−τ}) as n → ∞.
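The limit for the independent case can be checked numerically. The sketch below is only an illustration: it assumes that the constants in (2.4), with N_s + 1 replaced by n, take the classical extreme value form a_n = (2 log n)^{1/2} and b_n = a_n − (log log n + log 4π)/(2a_n); this form is not restated in this appendix, and the function names are hypothetical. For i.i.d. standard normal variables the probability P{M_n,η ≤ τ/a_n + b_n} is available in closed form, so no simulation is needed.

```python
import math


def normalizing_constants(n):
    """Assumed form of a_n, b_n in (2.4) with N_s + 1 replaced by n."""
    a_n = math.sqrt(2.0 * math.log(n))
    b_n = a_n - (math.log(math.log(n)) + math.log(4.0 * math.pi)) / (2.0 * a_n)
    return a_n, b_n


def prob_max_abs_iid(n, tau):
    """Exact P{max_i |eta_i| <= tau/a_n + b_n} for n i.i.d. N(0, 1) variables."""
    a_n, b_n = normalizing_constants(n)
    u = tau / a_n + b_n
    # P(|eta| > u) = erfc(u / sqrt(2)); log1p keeps precision when this is tiny.
    return math.exp(n * math.log1p(-math.erfc(u / math.sqrt(2.0))))


def double_exponential_limit(tau):
    """The limit exp(-2 e^{-tau}) appearing in Lemma A.2."""
    return math.exp(-2.0 * math.exp(-tau))


for n in (10**2, 10**4, 10**6):
    print(n, prob_max_abs_iid(n, 1.0), double_exponential_limit(1.0))
```

The agreement improves only slowly as n grows, which is typical of extreme value approximations for normal maxima.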

Lemma A.3

(Theorem 1.2 of Bosq (1998)) Suppose that $\{ξ_{i}\}_{i = 1}^{n}$ are iid with E(ξ₁) = 0, $σ^{2} = E ξ_{1}^{2}$, and there exists c > 0 such that for r = 3, 4, …, $E | ξ_{1} |^{r} \leq c^{r - 2} r! E ξ_{1}^{2} < + \infty$. Then for each n > 1, t > 0, $P (| S_{n} | \geq \sqrt{n} σ t) \leq 2 \exp [ - t^{2} \{ 4 + 2 c t / (\sqrt{n} σ) \}^{- 1} ]$, in which $S_{n} = \sum_{i = 1}^{n} ξ_{i}$.
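As an informal illustration of the inequality, consider symmetric ±1 variables, for which σ = 1 and E|ξ₁|^r = 1, so Cramér's condition holds with c = 1. The sketch below compares the exact binomial tail with the bound, reading the denominator as 4 + 2ct/(√n σ). The choice of distribution and the function names are assumptions made for this example only.

```python
import math


def exact_tail_rademacher(n, t):
    """Exact P(|S_n| >= sqrt(n) * t) when xi_i = +1 or -1 with probability 1/2."""
    threshold = math.sqrt(n) * t
    count = 0
    for k in range(n + 1):  # k = number of +1 values, so S_n = 2k - n
        if abs(2 * k - n) >= threshold:
            count += math.comb(n, k)
    return count / 2**n


def bernstein_bound(n, t, sigma=1.0, c=1.0):
    """Right-hand side of Lemma A.3 for the given n, t, sigma and c."""
    return 2.0 * math.exp(-t**2 / (4.0 + 2.0 * c * t / (math.sqrt(n) * sigma)))


for t in (1.0, 2.0, 3.0):
    print(t, exact_tail_rademacher(100, t), bernstein_bound(100, t))
```

In this bounded example the bound is far from tight; in the proofs only the exponential rate in t² matters.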

Lemma A.4

Under Assumption (A2), as n → ∞, for c_J,n defined in (2.2), c_J,n = f(t_J) h_s (1 + r_J,n), 〈b_J, b_J′〉 ≡ 0, J ≠ J′, where max_{0≤J≤N_s} |r_J,n| ≤ Cω(f, h_s). There exist constants C_B > c_B > 0 such that $c_{B} h_{s}^{1 - r / 2} \leq E \{ B_{J} (X_{ij}) \}^{r} \leq C_{B} h_{s}^{1 - r / 2}$ for r = 1, 2, … and 1 ≤ J ≤ N_s + 1, 1 ≤ j ≤ N_i, 1 ≤ i ≤ n.

Proof

By the definition of c_J,n in (2.2),

c_{J, n} = \int b_{J} (x) f (x) dx = \int_{[t_{J}, t_{J + 1}]} f (x) dx = f (t_{J}) h_{s} + \int_{[t_{J}, t_{J + 1}]} \{ f (x) - f (t_{J}) \} dx .

Hence for all J = 0, …, N_s, |c_J,n − f(t_J) h_s| ≤ ∫_{[t_J, t_J+1]} |f(x) − f(t_J)| dx ≤ ω(f, h_s) h_s, or |r_J,n| = |c_J,n − f(t_J) h_s| {f(t_J) h_s}⁻¹ ≤ Cω(f, h_s), J = 0, …, N_s. By (3.1), $E \{ B_{J} (X_{ij}) \}^{r} = (c_{J, n})^{- r / 2} \int b_{J} (x) f (x) dx = (c_{J, n})^{1 - r / 2} \sim h_{s}^{1 - r / 2}$.
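The approximation c_J,n ≈ f(t_J) h_s can be seen in a small numerical example. The sketch below assumes equally spaced knots t_J = J h_s with h_s = 1/(N_s + 1), consistent with the index range 0 ≤ J ≤ N_s used here, and uses the illustrative density f(x) = 0.5 + x on [0, 1], which is not taken from the paper. For this f the modulus of continuity is ω(f, h_s) = h_s and min f = 0.5, so the relative error r_J,n should be at most 2h_s.

```python
def density(x):
    """Illustrative density on [0, 1]; an assumption for this example only."""
    return 0.5 + x


def worst_relative_error(num_interior_knots):
    """Return (h_s, max_J |r_{J,n}|) where c_{J,n} = f(t_J) h_s (1 + r_{J,n})."""
    h_s = 1.0 / (num_interior_knots + 1)
    worst = 0.0
    for J in range(num_interior_knots + 1):
        t_J = J * h_s
        # Closed form of the integral of 0.5 + x over [t_J, t_J + h_s].
        c_J = 0.5 * h_s + ((t_J + h_s) ** 2 - t_J**2) / 2.0
        r_J = c_J / (density(t_J) * h_s) - 1.0
        worst = max(worst, abs(r_J))
    return h_s, worst


for N_s in (9, 99, 999):
    print(N_s, worst_relative_error(N_s))
```

The maximal relative error shrinks in proportion to h_s, as the bound |r_J,n| ≤ Cω(f, h_s) suggests for a Lipschitz density.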

Proof of Proposition 1

By Lemma A.4 and Assumption (A2) on the continuity of functions $ϕ_{k}^{2} (x)$ , σ²(x) and f(x) on [0, 1], for any x ∈ [0, 1]

\begin{matrix} | \int_{χ_{J (x)}} ϕ_{k} (x) f (x) du - \int_{χ_{J (x)}} ϕ_{k} (u) f (u) du | \leq ω (ϕ_{k} f, h_{s}) h_{s} = O (h_{s}^{1 + β}), \\ | \int_{χ_{J (x)}} \{ σ_{Y}^{2} (x) f (x) - σ_{Y}^{2} (u) f (u) \} du | \leq ω (σ_{Y}^{2} f, h_{s}) h_{s} = O (h_{s}^{1 + β}) . \end{matrix}

Hence,

\begin{array}{c} σ_{n}^{2} (x) = c_{J (x), n}^{- 2} (n {EN}_{1})^{- 1} \int_{χ_{J (x)}} σ_{Y}^{2} (u) f (u) du \times \{ 1 + \frac{E \{ N_{1} (N_{1} - 1) \}}{{EN}_{1}} \sum_{k = 1}^{κ} ( \int_{χ_{J (x)}} ϕ_{k} (u) f (u) du )^{2} \{ \int_{χ_{J (x)}} σ_{Y}^{2} (u) f (u) du \}^{- 1} \} \\ = \{ f (x) h_{s} + U (h_{s}^{1 + β}) \}^{- 2} (n {EN}_{1})^{- 1} \{ σ_{Y}^{2} (x) f (x) h_{s} + U (h_{s}^{1 + β}) \} \times \{ 1 + \frac{E \{ N_{1} (N_{1} - 1) \}}{{EN}_{1}} \sum_{k = 1}^{κ} \{ ϕ_{k} (x) f (x) h_{s} + U (h_{s}^{1 + β}) \}^{2} \{ σ_{Y}^{2} (x) f (x) h_{s} + U (h_{s}^{1 + β}) \}^{- 1} \} \\ = \{ f (x) h_{s} n {EN}_{1} \}^{- 1} σ_{Y}^{2} (x) \{ 1 + \frac{E \{ N_{1} (N_{1} - 1) \}}{E N_{1}} \frac{\sum_{k = 1}^{κ} ϕ_{k}^{2} (x) f (x) h_{s}}{σ_{Y}^{2} (x)} \} \{ 1 + U (h_{s}^{β}) \} = σ_{n, LONG}^{2} (x) \{ 1 + U (h_{s}^{β}) \} = σ_{n, IID}^{2} (x) \{ 1 + U (h_{s}^{β}) \} . \end{array}

A.2. Proof of Theorem 1

Note that $B_{J (x)} (x) \equiv c_{J (x), n}^{- 1 / 2}, x \in [0, 1]$ , so the terms ξ̃_k(x) and ε̃(x) defined in (3.5) are

\begin{array}{c} {\tilde{ξ}}_{k} (x) = \sum_{J = 0}^{N_{s}} N_{T}^{- 1} B_{J} (x) {∥ B_{J} ∥}_{2, N_{T}}^{- 2} \sum_{i = 1}^{n} \sum_{j = 1}^{N_{i}} B_{J} (X_{ij}) ϕ_{k} (X_{ij}) ξ_{ik} \\ = c_{J (x), n}^{- 1 / 2} ∥ B_{J (x)} ∥_{2, N_{T}}^{- 2} N_{T}^{- 1} \sum_{i = 1}^{n} \sum_{j = 1}^{N_{i}} B_{J (x)} (X_{ij}) ϕ_{k} (X_{ij}) ξ_{ik}, \\ \tilde{ε} (x) = c_{J (x), n}^{- 1 / 2} ∥ B_{J (x)} ∥_{2, N_{T}}^{- 2} N_{T}^{- 1} \sum_{i = 1}^{n} \sum_{j = 1}^{N_{i}} B_{J (x)} (X_{ij}) σ (X_{ij}) ε_{ij} . \end{array}

Let

\begin{array}{r} {\hat{ξ}}_{k} (x) = {∥ B_{J (x)} ∥}_{2, N_{T}}^{2} {\tilde{ξ}}_{k} (x) = c_{J (x), n}^{- 1 / 2} N_{T}^{- 1} \sum_{i = 1}^{n} R_{ik, ξ, J (x)} ξ_{ik}, \\ \hat{ε} (x) = {∥ B_{J (x)} ∥}_{2, N_{T}}^{2} \tilde{ε} (x) = c_{J (x), n}^{- 1 / 2} N_{T}^{- 1} \sum_{i = 1}^{n} \sum_{j = 1}^{N_{i}} R_{ij, ε, J (x)} ε_{ij}, \end{array}

(8.1)

where

R_{ik, ξ, J} = \sum_{j = 1}^{N_{i}} B_{J} (X_{ij}) ϕ_{k} (X_{ij}), R_{ij, ε, J} = B_{J} (X_{ij}) σ (X_{ij}), 0 \leq J \leq N_{s} .

(8.2)

Lemma A.5

Under Assumption (A3), for ẽ(x) given in (3.4) and ξ̂_k(x), ε̂(x) given in (8.1), we have

| \tilde{e} (x) - \{ \sum_{k = 1}^{κ} {\hat{ξ}}_{k} (x) + \hat{ε} (x) \} | \leq A_{n} (1 - A_{n})^{- 1} | \sum_{k = 1}^{κ} {\hat{ξ}}_{k} (x) + \hat{ε} (x) |, x \in [0, 1],

where $A_{n} = \sup_{0 \leq J \leq N_{s}} | {∥ B_{J} ∥}_{2, N_{T}}^{2} - 1 |$ . There exists C_A > 0, such that for large n, $P (A_{n} \geq C_{A} \sqrt{\log (n) / ({nh}_{s})}) \leq 2 n^{- 3}$ . $A_{n} = O_{a . s .} (\sqrt{\log (n) / ({nh}_{s})})$ as n → ∞.

See the supplement of Wang and Yang (2009) for a detailed proof.

Lemma A.6

Under Assumptions (A2) and (A3), for R_1k,ξ,J, R_{11, ε,J} in (8.2),

\begin{array}{l} {ER}_{1 k, ξ, J}^{2} = c_{J, n}^{- 1} [ E (N_{1}) \int b_{J} (u) ϕ_{k}^{2} (u) f (u) du + E \{ N_{1} (N_{1} - 1) \} ( \int b_{J} (u) ϕ_{k} (u) f (u) du )^{2} ], \\ {ER}_{11, ε, J}^{2} = c_{J, n}^{- 1} \int b_{J} (u) σ^{2} (u) f (u) du, 0 \leq J \leq N_{s}, \end{array}

there exist 0 < c_R < C_R < ∞, such that ${ER}_{1 k, ξ, J}^{2}, {ER}_{11, ε, J}^{2} \in [c_{R}, C_{R}]$ for 0 ≤ J ≤ N_s, $\sup_{0 \leq J \leq N_{s}} | n^{- 1} \sum_{i = 1}^{n} R_{i k, ξ, J}^{2} - {ER}_{1 k, ξ, J}^{2} | = O_{a . s .} (\sqrt{\log n / ({nh}_{s})})$, 1 ≤ k ≤ κ, and $\sup_{0 \leq J \leq N_{s}} | N_{T}^{- 1} \sum_{i = 1}^{n} \sum_{j = 1}^{N_{i}} R_{ij, ε, J}^{2} - {ER}_{11, ε, J}^{2} | = O_{a . s .} (\sqrt{\log n / ({nh}_{s})})$ as n → ∞.

Proof

By independence of X_1j, 1 ≤ j ≤ N₁ and N₁ and (3.1),

\begin{matrix} {ER}_{1 k, ξ, J}^{2} = E [ \sum_{j, j' = 1}^{N_{1}} E \{ B_{J} (X_{1 j}) B_{J} (X_{1 j'}) ϕ_{k} (X_{1 j}) ϕ_{k} (X_{1 j'}) ∣ N_{1} \} ] \\ = E [ \sum_{j = 1}^{N_{1}} E \{ B_{J}^{2} (X_{1 j}) ϕ_{k}^{2} (X_{1 j}) ∣ N_{1} \} ] + E [ \sum_{j \neq j'}^{N_{1}} E \{ B_{J} (X_{1 j}) B_{J} (X_{1 j'}) ϕ_{k} (X_{1 j}) ϕ_{k} (X_{1 j'}) ∣ N_{1} \} ] \\ = c_{J, n}^{- 1} [ E (N_{1}) \int b_{J} (u) ϕ_{k}^{2} (u) f (u) du + E \{ N_{1} (N_{1} - 1) \} ( \int b_{J} (u) ϕ_{k} (u) f (u) du )^{2} ] . \end{matrix}

It is easily shown that there exist 0 < c_R < C_R < ∞ such that $c_{R} \leq {ER}_{1 k, ξ, J}^{2} \leq C_{R}$, 0 ≤ J ≤ N_s. Let $ζ_{i, J} = ζ_{i, k, J} = R_{ik, ξ, J}^{2}$ and $ζ_{i, J}^{*} = ζ_{i, J} - E (ζ_{1, J})$. For r ≥ 1 and large n,

\begin{matrix} E (ζ_{i, J})^{r} = E \{ \sum_{j = 1}^{N_{i}} B_{J} (X_{ij}) ϕ_{k} (X_{ij}) \}^{2 r} \leq C_{ϕ}^{2 r} E \{ \sum_{j = 1}^{N_{i}} B_{J} (X_{ij}) \}^{2 r} \\ = C_{ϕ}^{2 r} E [ \sum_{0 \leq ν_{1}, \dots, ν_{N_{i}} \leq 2 r}^{ν_{1} + \dots + ν_{N_{i}} = 2 r} \binom{2 r}{ν_{1} \dots ν_{N_{i}}} \prod_{j = 1}^{N_{i}} E \{ B_{J} (X_{ij}) \}^{ν_{j}} ] \\ \leq C_{ϕ}^{2 r} E [ N_{1}^{2 r} \max \prod_{j = 1}^{N_{i}} E \{ B_{J} (X_{ij}) \}^{ν_{j}} ] \leq C_{ϕ}^{2 r} ({EN}_{1}^{2 r}) C_{B} h_{s}^{1 - r} \leq C_{ϕ}^{2 r} C_{B} c_{N}^{r} r! h_{s}^{1 - r} = C_{ζ} r! h_{s}^{1 - r}, \\ E (ζ_{i, J})^{r} \geq c_{ϕ}^{2 r} E [ \sum_{j = 1}^{N_{i}} E \{ B_{J} (X_{ij}) \}^{2 r} ] \geq c_{ϕ}^{2 r} ({EN}_{1}) c_{B} h_{s}^{1 - r}, \end{matrix}

by Lemma A.4. So {E(ζ_1,J)}^r ~ 1, E(ζ_i,J)^r ≫ {E(ζ_1,J)}^r for r ≥ 2, and there exist $C_{ζ}^{'} > c_{ζ}^{'} > 0$ such that $C_{ζ}^{'} h_{s}^{- 1} \geq σ_{ζ *}^{2} \geq c_{ζ}^{'} h_{s}^{- 1}$, for $σ_{ζ *} = [ E \{ (ζ_{i, J}^{*})^{2} \} ]^{1 / 2}$. We obtain $E | ζ_{i, J}^{*} |^{r} \leq c_{*}^{r - 2} r! E (ζ_{i, J}^{*})^{2}$ with $c_{*} = (C_{ζ} / c_{ζ}^{'})^{\frac{1}{r - 2}} h_{s}^{- 1}$, which implies that $\{ ζ_{i, J}^{*} \}_{i = 1}^{n}$ satisfies Cramér’s condition. Applying Lemma A.3 to $\sum_{i = 1}^{n} ζ_{i, J}^{*}$, for r > 2 and any large enough δ > 0, $P \{ n^{- 1} | \sum_{i = 1}^{n} ζ_{i, J}^{*} | \geq δ \sqrt{\log n / ({nh}_{s})} \}$ is bounded by

2 \exp \{ \frac{- δ^{2} (C_{ζ}^{'})^{- 1} (\log n)}{4 + 2 (C_{ζ} / c_{ζ}^{'})^{\frac{1}{r - 2}} δ (c_{ζ}^{'})^{- 1} h_{s}^{1 / 2} (\log n)^{1 / 2} n^{- 1 / 2}} \} \leq 2 \exp \{ \frac{- δ^{2} (\log n)}{4 C_{ζ}^{'}} \} \leq 2 n^{- 3} .

Hence $\sum_{n = 1}^{\infty} P \{ \sup_{0 \leq J \leq N_{s}} | \frac{1}{n} \sum_{i = 1}^{n} R_{ik, ξ, J}^{2} - {ER}_{1 k, ξ, J}^{2} | \geq δ \sqrt{\log n / ({nh}_{s})} \} \leq \sum_{n = 1}^{\infty} \frac{2 N_{s}}{n^{3}} < \infty$. Thus, $\sup_{0 \leq J \leq N_{s}} | n^{- 1} \sum_{i = 1}^{n} R_{ik, ξ, J}^{2} - {ER}_{1 k, ξ, J}^{2} | = O_{a . s .} (\sqrt{\log n / ({nh}_{s})})$ as n → ∞ by the Borel-Cantelli Lemma. The properties of R_ij,ε,J are obtained similarly.

Order all X_ij, 1 ≤ j ≤ N_i, 1 ≤ i ≤ n from large to small as X_(t), X₍₁₎ ≥ … ≥ X_{(N_T)}, and denote the ε_ij corresponding to X_(t) as ε_(t). By (8.1),

\begin{array}{l} \hat{ε} (x) = c_{J (x), n}^{- 1} N_{T}^{- 1} \sum_{t = 1}^{N_{T}} b_{J (x)} (X_{(t)}) σ (X_{(t)}) ε_{(t)} \\ = c_{J (x), n}^{- 1} N_{T}^{- 1} \sum_{t = 1}^{N_{T}} b_{J (x)} (X_{(t)}) σ (X_{(t)}) \{ S_{t} - S_{t - 1} \}, \end{array}

where $S_{q} = \sum_{t = 1}^{q} ε_{(t)}, q \geq 1$ and S₀ = 0.

Lemma A.7

Under Assumptions (A2)-(A5), there is a Wiener process {W (t), 0 ≤ t < ∞} independent of {N_i, X_ij, 1 ≤ j ≤ N_i, ξ_ik, 1 ≤ k ≤ κ, 1 ≤ i ≤ n}, such that as n → ∞, $\sup_{x \in [0, 1]} | {\hat{ε}}^{(0)} (x) - \hat{ε} (x) | = o_{a . s .} (n^{t})$ for some t < − (1 − ϑ) /2 < 0, where ε̂⁽⁰⁾ (x) is

(c_{J (x), n} N_{T})^{- 1} \sum_{t = 1}^{N_{T}} b_{J (x)} (X_{(t)}) σ (X_{(t)}) \{ W (t) - W (t - 1) \}, x \in [0, 1] .

(8.3)

Proof

Define M_{N_T} = max_{1≤q≤N_T} |S_q − W(q)|, in which {W(t), 0 ≤ t < ∞} is the Wiener process of Lemma A.1. As a Borel function of the set of variables {ε_(t), 1 ≤ t ≤ N_T}, it is independent of {N_i, X_ij, 1 ≤ j ≤ N_i, ξ_ik, 1 ≤ k ≤ κ, 1 ≤ i ≤ n}, since {ε_(t), 1 ≤ t ≤ N_T} is. Further,

\begin{array}{l} \sup_{x \in [0, 1]} | {\hat{ε}}^{(0)} (x) - \hat{ε} (x) | = \sup_{x \in [0, 1]} c_{J (x), n}^{- 1} N_{T}^{- 1} | b_{J (x)} (X_{(N_{T})}) σ (X_{(N_{T})}) \{ W (N_{T}) - S_{N_{T}} \} + \sum_{t = 1}^{N_{T} - 1} \{ b_{J (x)} (X_{(t)}) σ (X_{(t)}) - b_{J (x)} (X_{(t + 1)}) σ (X_{(t + 1)}) \} \{ W (t) - S_{t} \} | \\ \leq \max_{0 \leq J \leq N_{s} + 1} c_{J, n}^{- 1} N_{T}^{- 1} \{ b_{J} (X_{(N_{T})}) σ (X_{(N_{T})}) + \sum_{t = 1}^{N_{T} - 1} | b_{J} (X_{(t)}) σ (X_{(t)}) - b_{J} (X_{(t + 1)}) σ (X_{(t + 1)}) | \} M_{N_{T}} \\ \leq \max_{0 \leq J \leq N_{s} + 1} c_{J, n}^{- 1} N_{T}^{- 1} M_{N_{T}} \{ 3 C_{σ} + \sum_{1 \leq t \leq N_{T} - 1, X_{(t)} \in b_{J}} | σ (X_{(t)}) - σ (X_{(t + 1)}) | \} \end{array}

which, by the Hölder continuity of σ in Assumption (A2), is bounded by

\begin{matrix} N_{T}^{- 1} M_{N_{T}} \max_{0 \leq J \leq N_{s} + 1} c_{J, n}^{- 1} \{ 3 C_{σ} + {∥ σ ∥}_{0, β} \sum_{1 \leq t \leq N_{T} - 1, X_{(t)} \in b_{J}} | X_{(t)} - X_{(t + 1)} |^{β} \} \leq \\ N_{T}^{- 1} M_{N_{T}} \max_{0 \leq J \leq N_{s} + 1} c_{J, n}^{- 1} \{ 3 C_{σ} + {∥ σ ∥}_{0, β} n_{J}^{1 - β} ( \sum_{1 \leq t \leq N_{T} - 1, X_{(t)} \in b_{J}} | X_{(t)} - X_{(t + 1)} | )^{β} \} \\ \leq N_{T}^{- 1} M_{N_{T}} (\max_{0 \leq J \leq N_{s} + 1} c_{J, n}^{- 1}) \{ 3 C_{σ} + {∥ σ ∥}_{0, β} h_{s}^{β} (\max_{0 \leq J \leq N_{s} + 1} n_{J})^{1 - β} \} \end{matrix}

where $n_{J} = \sum_{t = 1}^{N_{T}} I (X_{(t)} \in χ_{J})$, 0 ≤ J ≤ N_s + 1, has a binomial distribution with parameters (N_T, p_J,n), where p_J,n = ∫_{χ_J} f(x) dx. A simple application of Lemma A.3 entails $\max_{0 \leq J \leq N_{s} + 1} n_{J} = O_{a . s .} (N_{T} N_{s}^{- 1})$. Meanwhile, let H(x) = x^r and x_n = n^{t′} with t′ ∈ (2/r, β − (1 + ϑ)/2); such a t′ exists by Assumption (A4), which requires r > 2/{β − (1 + ϑ)/2}. It is clear that $\{ ε_{(t)} \}_{t = 1}^{N_{T}}$ satisfies the conditions in Lemma A.1. Since $\frac{n}{H (a x_{n})} = a^{- r} n^{1 - r t^{'}} = O (n^{- γ_{1}})$ for some γ₁ > 1, one can use the probability inequality in Lemma A.1 and the Borel-Cantelli Lemma to obtain M_{N_T} = O_a.s.(x_n) = O_a.s.(n^{t′}). Hence Lemma A.4 and the above imply

\begin{matrix} \sup_{x \in [0, 1]} | {\hat{ε}}^{(0)} (x) - \hat{ε} (x) | = O_{a . s .} (N_{s} n^{t^{'} - 1}) \{ 1 + N_{s}^{- β} (N_{T} N_{s}^{- 1})^{1 - β} \} \\ = O_{a . s .} (N_{s} n^{t^{'} - 1} + N_{s} n^{t^{'} - 1} \times N_{s}^{- 1} n^{1 - β}) \\ = O_{a . s .} (N_{s} n^{t^{'} - 1} + N_{s} n^{t^{'} - β}) = o_{a . s .} (n^{t^{'} - β + ϑ}) \end{matrix}

since t′ < β − (1 + ϑ) /2 by definition, implying t′ − 1 ≤ t′ − β < − (1 + ϑ) /2. The Lemma follows by setting t = t′ − β + ϑ.
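The first equality in the bound on sup |ε̂⁽⁰⁾(x) − ε̂(x)| above is a summation by parts: for weights a_t and partial sums D_t with D_0 = 0, Σ_{t=1}^{N} a_t (D_t − D_{t−1}) = a_N D_N + Σ_{t=1}^{N−1} (a_t − a_{t+1}) D_t, applied with a_t = b_J(x)(X_(t)) σ(X_(t)) and D_t = W(t) − S_t. The following sketch checks this identity numerically; the weights and increments are arbitrary illustrative numbers, not quantities from the paper.

```python
import random


def direct_sum(a, increments):
    """sum_t a_t (D_t - D_{t-1}) with D_0 = 0 and increments[t-1] = D_t - D_{t-1}."""
    return sum(a_t * d_t for a_t, d_t in zip(a, increments))


def summed_by_parts(a, increments):
    """a_N D_N + sum_{t=1}^{N-1} (a_t - a_{t+1}) D_t, the rearranged form."""
    partial_sums = []
    running = 0.0
    for d_t in increments:
        running += d_t
        partial_sums.append(running)
    n_terms = len(a)
    boundary = a[n_terms - 1] * partial_sums[n_terms - 1]
    interior = sum((a[t] - a[t + 1]) * partial_sums[t] for t in range(n_terms - 1))
    return boundary + interior


random.seed(0)
weights = [random.random() for _ in range(50)]
steps = [random.gauss(0.0, 1.0) for _ in range(50)]
print(direct_sum(weights, steps), summed_by_parts(weights, steps))
```

The rearranged form is what allows every term to be bounded by M_{N_T} = max_q |S_q − W(q)| times the total variation of the weights.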

Now

\begin{array}{l} {\hat{ε}}^{(0)} (x) = c_{J (x), n}^{- 1} N_{T}^{- 1} \sum_{t = 1}^{N_{T}} b_{J (x)} (X_{(t)}) σ (X_{(t)}) Z_{(t)} \\ = c_{J (x), n}^{- 1} N_{T}^{- 1} \sum_{i = 1}^{n} \sum_{j = 1}^{N_{i}} b_{J (x)} (X_{ij}) σ (X_{ij}) Z_{ij}, \end{array}

(8.4)

where Z_(t) = W(t) − W(t − 1), 1 ≤ t ≤ N_T, are i.i.d. N(0, 1), and ξ_ik, Z_ij, X_ij, N_i are independent for 1 ≤ k ≤ κ, 1 ≤ j ≤ N_i, 1 ≤ i ≤ n. Moreover, ξ̂_k(x) and ε̂⁽⁰⁾(x) are conditionally independent given X_ij, N_i, 1 ≤ j ≤ N_i, 1 ≤ i ≤ n. If the conditional variances of ξ̂_k(x), ε̂⁽⁰⁾(x) given (X_ij, N_i)_{1≤j≤N_i,1≤i≤n} are $σ_{ξ_{k}, n}^{2} (x), σ_{ε, n}^{2} (x)$, we have

\begin{matrix} σ_{ξ_{k}, n} (x) = \{ c_{J (x), n}^{- 1} N_{T}^{- 2} \sum_{i = 1}^{n} R_{ik, ξ, J (x)}^{2} \}^{1 / 2}, \\ σ_{ε, n} (x) = \{ c_{J (x), n}^{- 1} N_{T}^{- 2} \sum_{i = 1}^{n} \sum_{j = 1}^{N_{i}} R_{ij, ε, J (x)}^{2} \}^{1 / 2}, \end{matrix}

(8.5)

where R_ik,ξ,J(x), R_ij,ε,J(x), and c_J(x),n are given in (8.2) and (2.2).

Lemma A.8

Under Assumptions (A2) and (A3), let

η (x) = \{ \sum_{k = 1}^{κ} σ_{ξ_{k}, n}^{2} (x) + σ_{ε, n}^{2} (x) \}^{- 1 / 2} \{ \sum_{k = 1}^{κ} {\hat{ξ}}_{k} (x) + {\hat{ε}}^{(0)} (x) \},

(8.6)

with σ_{ξ_k,n}(x), σ_ε,n(x), ξ̂_k(x), ε̂⁽⁰⁾(x), and c_J(x),n given in (8.5), (8.1), (8.3), and (2.2). Then η(x) is a Gaussian process consisting of (N_s + 1) standard normal variables $\{ η_{J} \}_{J = 0}^{N_{s}}$ such that η(x) = η_{J(x)} for x ∈ [0, 1], and there exists a constant C > 0 such that for large n, sup_{0≤J≠J′≤N_s} |Eη_Jη_J′| ≤ Ch_s.

Proof

It is apparent that ℒ {η_J |(X_ij, N_i), 1 ≤ j ≤ N_i, 1 ≤ i ≤ n} = N (0, 1) for 0 ≤ J ≤ N_s, so ℒ {η_J} = N (0, 1), for 0 ≤ J ≤ N_s. For J ≠ J′, by (8.2) and (3.1), R_ij,ε,J R_ij,ε,J′ = B_J (X_ij) B_J′ (X_ij) σ² (X_ij) = 0, along with (8.4), (8.3), the conditional independence of ξ̂_k(x), ε̂⁽⁰⁾(x) on X_ij, N_i, 1 ≤ j ≤ N_i, 1 ≤ i ≤ n, and independence of ξ_ik, Z_ij, X_ij, N_i, 1 ≤ k ≤ κ, 1 ≤ j ≤ N_i, 1 ≤ i ≤ n, E (η_Jη_J′) is

\begin{array}{l} E [ \{ \sum_{i = 1}^{n} ( \sum_{k = 1}^{κ} R_{ik, ξ, J}^{2} + \sum_{j = 1}^{N_{i}} R_{ij, ε, J}^{2} ) \}^{- 1 / 2} \{ \sum_{i = 1}^{n} ( \sum_{k = 1}^{κ} R_{ik, ξ, J^{'}}^{2} + \sum_{j = 1}^{N_{i}} R_{ij, ε, J^{'}}^{2} ) \}^{- 1 / 2} E \{ \sum_{k = 1}^{κ} ( \sum_{i = 1}^{n} R_{ik, ξ, J} ξ_{ik} ) ( \sum_{i = 1}^{n} R_{ik, ξ, J^{'}} ξ_{ik} ) + \\ ( \sum_{i = 1}^{n} \sum_{j = 1}^{N_{i}} R_{ij, ε, J} Z_{ij} ) ( \sum_{i = 1}^{n} \sum_{j = 1}^{N_{i}} R_{ij, ε, J^{'}} Z_{ij} ) ∣ (X_{ij}, N_{i})_{1 \leq j \leq N_{i}, 1 \leq i \leq n} \} ] = {EC}_{n, J, J^{'}} \end{array}

in which $C_{n, J, J^{'}} = \{ N_{T}^{- 1} \sum_{i = 1}^{n} ( \sum_{k = 1}^{κ} R_{ik, ξ, J}^{2} + \sum_{j = 1}^{N_{i}} R_{ij, ε, J}^{2} ) \}^{- 1 / 2} \times \{ N_{T}^{- 1} \sum_{i = 1}^{n} ( \sum_{k = 1}^{κ} R_{ik, ξ, J^{'}}^{2} + \sum_{j = 1}^{N_{i}} R_{ij, ε, J^{'}}^{2} ) \}^{- 1 / 2} \{ N_{T}^{- 1} \sum_{k = 1}^{κ} \sum_{i = 1}^{n} R_{ik, ξ, J} R_{ik, ξ, J^{'}} \}$. Note that according to the definitions of R_ik,ξ,J, R_ij,ε,J, and Lemma A.5,

N_{T}^{- 1} \sum_{i = 1}^{n} {\sum_{k = 1}^{κ} R_{ik, ξ, J}^{2} + \sum_{j = 1}^{N_{i}} R_{ij, ε, J}^{2}}

$\geq c_{σ}^{2} N_{T}^{- 1} \sum_{i = 1}^{n} \sum_{j = 1}^{N_{i}} B_{J}^{2} (X_{ij}) = c_{σ}^{2} {∥ B_{J} ∥}_{2, N_{T}}^{2} \geq c_{σ}^{2} (1 - A_{n})$ , for 0 ≤ J ≤ N_s,

P [ \inf_{0 \leq J \neq J^{'} \leq N_{s}} \{ N_{T}^{- 1} \sum_{i = 1}^{n} ( \sum_{k = 1}^{κ} R_{ik, ξ, J}^{2} + \sum_{j = 1}^{N_{i}} R_{ij, ε, J}^{2} ) \} \times \{ N_{T}^{- 1} \sum_{i = 1}^{n} ( \sum_{k = 1}^{κ} R_{ik, ξ, J^{'}}^{2} + \sum_{j = 1}^{N_{i}} R_{ij, ε, J^{'}}^{2} ) \} \geq c_{σ}^{4} ( 1 - C_{A} \sqrt{\frac{\log (n)}{{nh}_{s}}} )^{2} ] \geq 1 - 2 n^{- 3},

by Lemma A.5. Thus for large n, with probability ≥ 1 − 2n⁻³, the denominator of C_n,J,J′ (the product of the two square-root normalizing factors) is uniformly greater than $c_{σ}^{2} / 2$. Applying Bernstein’s inequality to $N_{T}^{- 1} \{ \sum_{k = 1}^{κ} \sum_{i = 1}^{n} R_{ik, ξ, J} R_{ik, ξ, J^{'}} \}$, there exists C₀ > 0 such that, for large n,

P (\sup_{0 \leq J \neq J^{'} \leq N_{s}} | N_{T}^{- 1} \sum_{k = 1}^{κ} \sum_{i = 1}^{n} R_{ik, ξ, J} R_{ik, ξ, J^{'}} | \leq C_{0} h_{s}) \geq 1 - 2 n^{- 3} .

Putting the above together, for large n, $C_{1} = C_{0} {(c_{σ}^{2} / 2)}^{- 1}$ ,

P (\sup_{0 \leq J \neq J^{'} \leq N_{s}} ∣ C_{n, J, J^{'}} ∣ \leq C_{1} h_{s}) \geq 1 - 4 n^{- 3} .

Note that as a continuous random variable, sup_{0≤J≠J′≤N_s}|C_n,J,J′| ∈ [0, 1], thus

E (\sup_{0 \leq J \neq J^{'} \leq N_{s}} ∣ C_{n, J, J^{'}} ∣) = \int_{0}^{1} P (\sup_{0 \leq J \neq J^{'} \leq N_{s}} ∣ C_{n, J, J^{'}} ∣ > t) dt .

For large n, C₁h_s < 1 and then E (sup_{0≤J≠J′≤N_s} |C_n,J,J′|) is

\begin{matrix} \int_{0}^{C_{1} h_{s}} P \{ \sup_{0 \leq J \neq J^{'} \leq N_{s}} ∣ C_{n, J, J^{'}} ∣ > t \} dt + \int_{C_{1} h_{s}}^{1} P \{ \sup_{0 \leq J \neq J^{'} \leq N_{s}} ∣ C_{n, J, J^{'}} ∣ > t \} dt \\ \leq \int_{0}^{C_{1} h_{s}} 1 dt + \int_{C_{1} h_{s}}^{1} 4 n^{- 3} dt \leq C_{1} h_{s} + 4 n^{- 3} \leq {Ch}_{s} \end{matrix}

for some C > 0 and large enough n. The lemma now follows from

\sup_{0 \leq J \neq J^{'} \leq N_{s}} | E (C_{n, J, J^{'}}) | \leq E (\sup_{0 \leq J \neq J^{'} \leq N_{s}} | C_{n, J, J^{'}} |) \leq {Ch}_{s} .

By Lemma A.8, the (N_s + 1) standard normal variables η₀, …, η_{N_s} satisfy the conditions of Lemma A.2. Hence for any τ ∈ R,

\lim_{n \to \infty} P (\sup_{x \in [0, 1]} ∣ η (x) ∣ \leq τ / a_{N_{s} + 1} + b_{N_{s} + 1}) = \exp (- 2 e^{- τ}) .

(8.7)

For x ∈ [0, 1] and R_ik,ξ,J, R_ij,ε,J given in (8.2), define the ratio of population and sample quantities as $r_{n} (x) = \{ n E (N_{1}) / N_{T} \}^{1 / 2} \{ {\bar{R}}_{n} (x) / \bar{R} (x) \}^{1 / 2}$, with

\begin{matrix} {\bar{R}}_{n} (x) = N_{T}^{- 1} {\sum_{i = 1}^{n} (\sum_{k = 1}^{κ} R_{ik, ξ, J (x)}^{2} + \sum_{j = 1}^{N_{i}} R_{ij, ε, J (x)}^{2})} \\ \bar{R} (x) = {({EN}_{1})}^{- 1} \sum_{k = 1}^{κ} {ER}_{1 k, ξ, J (x)}^{2} + {ER}_{11, ε, J (x)}^{2} . \end{matrix}

Lemma A.9

Under Assumptions (A2), (A3), for η(x), σ_n(x) in (8.6), (2.3),

∣ σ_{n} (x)^{- 1} \{ \sum_{k = 1}^{κ} {\hat{ξ}}_{k} (x) + {\hat{ε}}^{(0)} (x) \} - η (x) ∣ = ∣ r_{n} (x) - 1 ∣ ∣ η (x) ∣

(8.8)

and, as n → ∞, $\sup_{x \in [0, 1]} \{ a_{N_{s} + 1} ∣ r_{n} (x) - 1 ∣ \} = O_{a . s .} (\sqrt{\{ \log (N_{s} + 1) \} (\log n) / ({nh}_{s})})$.

Proof

Equation (8.8) follows from the definitions of η(x) and σ_n(x). By Lemma A.6, $\sup_{x \in [0, 1]} ∣ N_{T}^{- 1} \sum_{i = 1}^{n} \sum_{j = 1}^{N_{i}} R_{ij, ε, J (x)}^{2} - {ER}_{11, ε, J (x)}^{2} ∣ = O_{a . s .} (\sqrt{\log n / ({nh}_{s})})$,

\begin{array}{c} \sup_{x \in [0, 1]} ∣ N_{T}^{- 1} \sum_{k = 1}^{κ} \sum_{i = 1}^{n} R_{ik, ξ, J (x)}^{2} - ({EN}_{1})^{- 1} \sum_{k = 1}^{κ} {ER}_{1 k, ξ, J (x)}^{2} ∣ \\ \leq \sup_{x \in [0, 1]} ({EN}_{1})^{- 1} \sum_{k = 1}^{κ} ∣ n^{- 1} \sum_{i = 1}^{n} R_{ik, ξ, J (x)}^{2} - {ER}_{1 k, ξ, J (x)}^{2} ∣ \\ + \sup_{x \in [0, 1]} ({EN}_{1})^{- 1} \sum_{k = 1}^{κ} | n ({EN}_{1}) N_{T}^{- 1} - 1 | ∣ n^{- 1} \sum_{i = 1}^{n} R_{ik, ξ, J (x)}^{2} ∣ \\ = O_{a . s .} (\sqrt{\log n / ({nh}_{s})}) + O_{a . s .} (n^{- 1 / 2}) = O_{a . s .} (\sqrt{\log n / ({nh}_{s})}), \end{array}

and there exist constants 0 < c_R̄ < C_R̄ < ∞ such that for all x ∈ [0,1], c_R̄ < R̄(x) < C_R̄. Thus, sup_x∈[0,1] |R̄_n(x) − R̄(x)| is bounded by

\begin{matrix} \sup_{x \in [0, 1]} ∣ N_{T}^{- 1} \sum_{k = 1}^{κ} \sum_{i = 1}^{n} R_{ik, ξ, J (x)}^{2} - {({EN}_{1})}^{- 1} \sum_{k = 1}^{κ} {ER}_{1 k, ξ, J (x)}^{2} ∣ + \\ \sup_{x \in [0, 1]} ∣ N_{T}^{- 1} \sum_{i = 1}^{n} \sum_{j = 1}^{N_{i}} R_{ij, ε, J (x)}^{2} - {ER}_{11, ε, J (x)}^{2} ∣ = O_{a . s .} (\sqrt{\log n / ({nh}_{s})}) . \end{matrix}

Thus $\sup_{x \in [0, 1]} ∣ {{\bar{R}}_{n} (x)}^{1 / 2} - {\bar{R} (x)}^{1 / 2} ∣ \leq \sup_{x \in [0, 1]} ∣ {\bar{R}}_{n} (x) - \bar{R} (x) ∣ \sup_{x \in [0, 1]} {\bar{R} (x)}^{- 1 / 2} = O_{a . s .} (\sqrt{\log n / ({nh}_{s})})$ . Then sup_x∈[0,1] {a_{N_s+1} |r_n(x) − 1|} is bounded by

\begin{array}{l} a_{N_{s} + 1} [ \{ nE (N_{1}) / N_{T} \}^{1 / 2} \sup_{x \in [0, 1]} ∣ \{ {\bar{R}}_{n} (x) / \bar{R} (x) \}^{1 / 2} - 1 ∣ + ∣ 1 - \{ nE (N_{1}) / N_{T} \}^{1 / 2} ∣ ] \\ \leq a_{N_{s} + 1} [ \{ nE (N_{1}) / N_{T} \}^{1 / 2} \sup_{x \in [0, 1]} \{ \bar{R} (x) \}^{- 1 / 2} \sup_{x \in [0, 1]} ∣ \{ {\bar{R}}_{n} (x) \}^{1 / 2} - \{ \bar{R} (x) \}^{1 / 2} ∣ + ∣ 1 - \{ nE (N_{1}) / N_{T} \}^{1 / 2} ∣ ] = O_{a . s .} (\sqrt{\{ \log (N_{s} + 1) \} (\log n) / ({nh}_{s})}) . \end{array}

Proof of Proposition 2

The proof follows from Lemmas A.5, A.7, A.9, (8.7), and Slutsky’s Theorem.

Proof of Theorem 1

By Theorem 2, ∥m̃(x) − m(x)∥_∞ = O_p (h_s), so

\begin{matrix} a_{N_{s} + 1} (\sup_{x \in [0, 1]} σ_{n}^{- 1} (x) | \tilde{m} (x) - m (x) |) = O_{p} {{({nh}_{s})}^{1 / 2} \sqrt{\log (N_{s} + 1) h_{s}}} = o_{p} (1), \\ a_{N_{s} + 1} (\sup_{x \in [0, 1]} σ_{n}^{- 1} (x) | \hat{m} (x) - m (x) | - \sup_{x \in [0, 1]} σ_{n}^{- 1} (x) | \sum_{k = 1}^{κ} {\tilde{ξ}}_{k} (x) + \tilde{ε} (x) |) = o_{p} (1) . \end{matrix}

Meanwhile, (3.4) and Proposition 2 entail that, for any τ ∈ R,

\lim_{n \to \infty} P \{ a_{N_{s} + 1} ( \sup_{x \in [0, 1]} σ_{n}^{- 1} (x) ∣ \sum_{k = 1}^{κ} {\tilde{ξ}}_{k} (x) + \tilde{ε} (x) ∣ - b_{N_{s} + 1} ) \leq τ \} = \exp (- 2 e^{- τ}) .

Thus Slutsky’s Theorem implies that

\lim_{n \to \infty} P \{ a_{N_{s} + 1} ( \sup_{x \in [0, 1]} σ_{n}^{- 1} (x) ∣ \hat{m} (x) - m (x) ∣ - b_{N_{s} + 1} ) \leq τ \} = \exp (- 2 e^{- τ}) .

Let $τ = - \log \{ - \frac{1}{2} \log (1 - α) \}$; then the definitions of a_{N_s+1}, b_{N_s+1}, and Q_{N_s+1}(α) in (2.4) entail

\begin{array}{c} \lim_{n \to \infty} P \{ m (x) \in \hat{m} (x) \pm σ_{n} (x) Q_{N_{s} + 1} (α), \forall x \in [0, 1] \} \\ = \lim_{n \to \infty} P \{ Q_{N_{s} + 1}^{- 1} (α) \sup_{x \in [0, 1]} σ_{n}^{- 1} (x) ∣ \tilde{e} (x) + \tilde{m} (x) - m (x) ∣ \leq 1 \} = 1 - α \end{array}

by (3.4). That σ_n(x)⁻¹ {m̂(x) − m(x)} →_d N(0, 1) for any x ∈ [0, 1] follows by directly using η(x) ~ N(0, 1), without reference to sup_x∈[0,1] |η(x)|.
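The choice of τ in the last step can be verified numerically, since exp(−2e^{−τ}) = 1 − α exactly when τ = −log{−(1/2) log(1 − α)}. The sketch below also evaluates the resulting critical value, assuming that (2.4) defines a_{N_s+1} = {2 log(N_s + 1)}^{1/2}, b_{N_s+1} = a_{N_s+1} − {log log(N_s + 1) + log 4π}/(2a_{N_s+1}) and Q_{N_s+1}(α) = b_{N_s+1} + τ/a_{N_s+1}. These forms are not restated in this appendix and are labeled here as assumptions, as is the function name.

```python
import math


def band_critical_value(num_interior_knots, alpha):
    """Return (Q, tau) under the assumed form of (2.4); N_s = num_interior_knots."""
    n_bins = num_interior_knots + 1
    a = math.sqrt(2.0 * math.log(n_bins))
    b = a - (math.log(math.log(n_bins)) + math.log(4.0 * math.pi)) / (2.0 * a)
    tau = -math.log(-0.5 * math.log(1.0 - alpha))
    return b + tau / a, tau


for alpha in (0.10, 0.05, 0.01):
    q, tau = band_critical_value(20, alpha)
    # The limiting coverage exp(-2 e^{-tau}) should equal 1 - alpha.
    print(alpha, q, math.exp(-2.0 * math.exp(-tau)))
```

Under these assumptions the band at level 1 − α is m̂(x) ± σ_n(x) Q_{N_s+1}(α), and the printed coverage column reproduces 1 − α for every α.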

Contributor Information

Shujie Ma, Email: mashujie@stt.msu.edu.

Lijian Yang, Email: yanglijian@suda.edu.cn.

Raymond J. Carroll, Email: carroll@stat.tamu.edu.

References

Bosq D. Nonparametric Statistics for Stochastic Processes. Springer-Verlag; New York: 1998.
Cardot H, Ferraty F, Sarda P. Spline estimators for the functional linear model. Statistica Sinica. 2003;13:571–591.
Cardot H, Sarda P. Estimation in generalized linear models for functional data via penalized likelihood. Journal of Multivariate Analysis. 2005;92:24–41.
Claeskens G, Van Keilegom I. Bootstrap confidence bands for regression curves and their derivatives. Annals of Statistics. 2003;31:1852–1884.
Csörgő M, Révész P. Strong Approximations in Probability and Statistics. Academic Press; New York-London: 1981.
de Boor C. A Practical Guide to Splines. Springer-Verlag; New York: 2001.
Fan J, Zhang WY. Simultaneous confidence bands and hypothesis testing in varying-coefficient models. Scandinavian Journal of Statistics. 2000;27:715–731.
Ferraty F, Vieu P. Nonparametric Functional Data Analysis: Theory and Practice. Springer; Berlin: 2006.
Hall P, Heckman N. Estimating and depicting the structure of a distribution of random functions. Biometrika. 2002;89:145–158.
Hall P, Müller HG, Wang JL. Properties of principal component methods for functional and longitudinal data analysis. Annals of Statistics. 2006;34:1493–1517.
Hall P, Titterington DM. On confidence bands in nonparametric density estimation and regression. Journal of Multivariate Analysis. 1988;27:228–254.
Härdle W. Asymptotic maximal deviation of M-smoothers. Journal of Multivariate Analysis. 1989;29:163–179.
Härdle W, Marron JS. Bootstrap simultaneous error bars for nonparametric regression. Annals of Statistics. 1991;19:778–796.
Huang J. Local asymptotics for polynomial spline regression. Annals of Statistics. 2003;31:1600–1635.
Huang J, Yang L. Identification of nonlinear additive autoregressive models. Journal of the Royal Statistical Society B. 2004;66:463–477.
Huang X, Wang L, Yang L, Kravchenko AN. Management practice effects on relationships of grain yields with topography and precipitation. Agronomy Journal. 2008;100:1463–1471.
Izem R, Marron JS. Analysis of nonlinear modes of variation for functional data. Electronic Journal of Statistics. 2007;1:641–676.
James GM, Hastie T, Sugar C. Principal component models for sparse functional data. Biometrika. 2000;87:587–602.
James GM. Generalized linear models with functional predictors. Journal of the Royal Statistical Society B. 2002;64:411–432.
James GM, Silverman BW. Functional adaptive model estimation. Journal of the American Statistical Association. 2005;100:565–576.
James GM, Sugar CA. Clustering for sparsely sampled functional data. Journal of the American Statistical Association. 2003;98:397–408.
Leadbetter MR, Lindgren G, Rootzén H. Extremes and Related Properties of Random Sequences and Processes. Springer-Verlag; New York: 1983.
Li Y, Hsing T. On rates of convergence in functional linear regression. Journal of Multivariate Analysis. 2007;98:1782–1804.
Li Y, Hsing T. Uniform convergence rates for nonparametric regression and principal component analysis in functional/longitudinal data. Annals of Statistics. 2009 forthcoming.
Ma S, Yang L. A jump-detecting procedure based on spline estimation. Journal of Nonparametric Statistics. 2010 in press.
Morris JS, Carroll RJ. Wavelet-based functional mixed models. Journal of the Royal Statistical Society B. 2006;68:179–199. doi: 10.1111/j.1467-9868.2006.00539.x.
Müller HG, Stadtmüller U. Generalized functional linear models. Annals of Statistics. 2005;33:774–805.
Müller HG, Stadtmüller U, Yao F. Functional variance processes. Journal of the American Statistical Association. 2006;101:1007–1018.
Müller HG, Yao F. Functional additive models. Journal of the American Statistical Association. 2008;103:1534–1544.
Ramsay JO, Silverman BW. Functional Data Analysis. Second Edition. Springer; New York: 2005.
Song Q, Yang L. Spline confidence bands for variance function. Journal of Nonparametric Statistics. 2009;21:589–609.
Wang N, Carroll RJ, Lin X. Efficient semiparametric marginal estimation for longitudinal/clustered data. Journal of the American Statistical Association. 2005;100:147–157.
Wang L, Yang L. Spline-backfitted kernel smoothing of nonlinear additive autoregression model. Annals of Statistics. 2007;35:2474–2503.
Wang J, Yang L. Polynomial spline confidence bands for regression curves. Statistica Sinica. 2009;19:325–342.
Wu W, Zhao Z. Inference of trends in time series. Journal of the Royal Statistical Society B. 2007;69:391–410.
Xue L, Yang L. Additive coefficient modelling via polynomial spline. Statistica Sinica. 2006;16:1423–1446.
Yao F, Lee TCM. Penalized spline models for functional principal component analysis. Journal of the Royal Statistical Society B. 2006;68:3–25.
Yao F, Müller HG, Wang JL. Functional linear regression analysis for longitudinal data. Annals of Statistics. 2005a;33:2873–2903.
Yao F, Müller HG, Wang JL. Functional data analysis for sparse longitudinal data. Journal of the American Statistical Association. 2005b;100:577–590.
Yao F. Asymptotic distributions of nonparametric regression estimators for longitudinal or functional data. Journal of Multivariate Analysis. 2007;98:40–56.
Zhang JT, Chen J. Statistical inferences for functional data. Annals of Statistics. 2007;35:1052–1079.
Zhao X, Marron JS, Wells MT. The functional data analysis view of longitudinal data. Statistica Sinica. 2004;14:789–808.
Zhao Z, Wu W. Confidence bands in nonparametric time series regression. Annals of Statistics. 2008;36:1854–1878.
Zhou L, Huang J, Carroll RJ. Joint modelling of paired sparse functional data using principal components. Biometrika. 2008;95:601–619. doi: 10.1093/biomet/asn035.
Zhou S, Shen X, Wolfe DA. Local asymptotics of regression splines and confidence regions. Annals of Statistics. 1998;26:1760–1782.
