Abstract
Principal component analysis (PCA) is one of the key techniques in functional data analysis. One important feature of functional PCA is the need to smooth or regularize the estimated principal component curves. Silverman's method for smoothed functional principal component analysis is an important approach in situations where the sample curves are fully observed, owing to its theoretical and practical advantages. However, the lack of knowledge about the theoretical properties of this method makes it difficult to generalize it to the situation where the sample curves are only observed at discrete time points. In this paper, we first establish the existence of the solutions of the successive optimization problems in this method. We then provide upper bounds for the bias parts of the estimation errors for both eigenvalues and eigenfunctions. We also prove functional central limit theorems for the variation parts of the estimation errors. As a corollary, we obtain the convergence rates of the estimators of eigenvalues and eigenfunctions, where these rates depend on both the sample size and the smoothing parameters. Under some conditions on the convergence rates of the smoothing parameters, we also prove the asymptotic normality of the estimators.
Keywords: Functional PCA, smoothing methods, roughness penalty, convergence rates, functional central limit theorem, asymptotic normality
1. Introduction
Principal component analysis (PCA) is one of the key techniques in multivariate analysis and functional data analysis. An important difference between classical PCA and functional PCA is that there is a need for smoothing or regularizing of the estimated principal component curves in functional PCA (see Chapter 9 in Ramsay and Silverman [12]). Many methods have been proposed to estimate the smoothed functional principal components when the sample curves are fully observed. A general overview of these methods and an extensive list of references can be found in Ramsay and Silverman [12]. The reader can find more discussion of theoretical aspects and of nonparametric methods for functional data analysis in Ferraty and Vieu [6]. Functional PCA has many important applications. For example, functional principal component regression (see for instance Cardot, Ferraty and Sarda [2]) is a direct application of functional principal components analysis.
The approach proposed in Silverman [15] is an important method for smoothed functional PCA (see Chapter 9 in Ramsay and Silverman [12]) due to its theoretical and practical advantages. First, the weak assumptions underlying this method make it applicable to data from many fields. Silverman [15] did not make any assumptions on the mean curves and sample curves. Hence, in addition to data with smooth random curves, this method can be applied to analyze data where the sample curves are unsmooth or even discontinuous, such as those encountered in financial engineering, survival analysis and other fields. For covariance functions, Silverman [15] only assumed that they have series expansions in terms of their eigenfunctions, without imposing any smoothness constraint. This is attractive because the covariance functions are continuous but unsmooth in many important models, such as stochastic differential equation models in financial engineering and counting process models in survival analysis. Second, Silverman's method controls the smoothness of the eigenfunction curves by directly imposing roughness penalties on these functions instead of on the sample curves or covariance functions. Furthermore, this approach changes the eigenvalue and eigenfunction problems in the usual L2 space into problems in another Hilbert space, the Sobolev space (with a norm different from the usual norm in the Sobolev space). Therefore, many powerful tools from the theory of Hilbert spaces can be employed to study the properties of this method. Third, this approach incorporates the smoothing step into the step for computing eigenvalues and eigenfunctions. Therefore, this method is computationally efficient, with the same computational load as the usual unsmoothed functional PCA. Fourth, the estimates produced by this method are invariant under scale transformations. As pointed out by Huang, Shen and Buja [8], invariance under scale transformations should be a guiding principle when introducing roughness penalties into functional PCA.
Despite all these advantages, the lack of knowledge about the theoretical properties of this method makes it difficult to generalize it to the situations where the sample curves are only observed at discrete time points. Silverman [15] only proved consistency of the estimates as the sample size goes to infinity and the smoothing parameter goes to zero. Even the existence of the solutions to the successive optimization problems in this method had not been established. It is not clear how the estimation errors depend on the sample size and the smoothing parameter. The asymptotic normality of the estimators also needs to be proved. In this paper, we aim to solve these open problems. In Section 2, we give the detailed background, basic notations and our main assumptions. In Section 3, Silverman's method is introduced and the existence theorem for the successive optimization problems is proven. Our main results appear in Section 4. Section 5 contains detailed proofs of our theorems.
2. Notations and main assumptions
We introduce notations and definitions used throughout the paper. Let ℕ denote the collection of all positive integers. We consider a finite time interval [a, b]. In this paper, we will mainly consider functions in the following two spaces: the L2 space

L2([a, b]) = {f : ∫_a^b f(t)² dt < ∞}

and the Sobolev space

W2²([a, b]) = {f ∈ L2([a, b]) : f and f′ are absolutely continuous and f″ ∈ L2([a, b])},

where f′ and f″ denote the first and second derivatives of f, respectively. For any f, g ∈ L2([a, b]), define the usual inner product

(f, g) = ∫_a^b f(t)g(t) dt,
with corresponding squared norm ∥f∥² = (f, f). Given a smoothing parameter α > 0, for any f, g ∈ W2²([a, b]), define

[f, g] = ∫_a^b f″(t)g″(t) dt

and the inner product

(f, g)α = (f, g) + α[f, g],

with corresponding squared norm ∥f∥α² = (f, f)α. Note that if α = 0, we return to the L2([a, b]) space. For any bounded operator B from L2([a, b]) to L2([a, b]), define the norm

∥B∥ = sup{∥Bf∥ : f ∈ L2([a, b]), ∥f∥ = 1}.   (2.1)
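To make the α-inner product concrete, here is a minimal numerical sketch (ours, not part of the paper; the grid, the crude finite-difference second derivative, and the function name alpha_inner are our own choices):

```python
import numpy as np

def alpha_inner(f_vals, g_vals, t, alpha):
    """Quadrature approximation of (f, g)_alpha = (f, g) + alpha * (f'', g'')
    for functions sampled on the grid t."""
    f2 = np.gradient(np.gradient(f_vals, t), t)  # crude second derivatives
    g2 = np.gradient(np.gradient(g_vals, t), t)
    return np.trapz(f_vals * g_vals, t) + alpha * np.trapz(f2 * g2, t)

t = np.linspace(0.0, 1.0, 501)
f = np.sin(2 * np.pi * t)
# (f, f)_alpha should be close to 1/2 + alpha * (2*pi)^4 / 2
print(alpha_inner(f, f, t, alpha=0.1))
```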
For any measurable function A(s, t) on [a, b] × [a, b], if

∫_a^b ∫_a^b A(s, t)² ds dt < ∞,

then A defines a bounded operator from L2([a, b]) to L2([a, b]). To simplify the notation, we also use A to denote this operator, that is,

(Af)(s) = ∫_a^b A(s, t)f(t) dt,

and we have

∥A∥ ≤ (∫_a^b ∫_a^b A(s, t)² ds dt)^{1/2}.
Let X(t), a ≤ t ≤ b, be a measurable stochastic process on [a, b]. Under Assumption 1 below, X(t) ∈ L2([a, b]) a.s. Let {X1(t), X2(t), ⋯, Xn(t)} be i.i.d. sample curves from the distribution of X(t). Assume that EX(t) = ν(t). Define Γ to be the covariance function

Γ(s, t) = E[(X(s) − ν(s))(X(t) − ν(t))],

and Γ̂n to be the sample covariance function

Γ̂n(s, t) = (1/n) Σ_{i=1}^n (Xi(s) − X̄(s))(Xi(t) − X̄(t)),

where X̄ is the sample mean curve

X̄(t) = (1/n) Σ_{i=1}^n Xi(t).
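As an illustration (our own sketch; the paper assumes fully observed curves, while any computation necessarily works with curves sampled on a grid), X̄ and Γ̂n can be computed as follows:

```python
import numpy as np

def sample_covariance(X):
    """X: (n, m) array, row i = curve X_i sampled on a common grid of m points.
    Returns the sample mean curve and the sample covariance surface."""
    Xbar = X.mean(axis=0)
    Xc = X - Xbar                        # center each curve
    Gamma_hat = Xc.T @ Xc / X.shape[0]   # Gamma_hat(s, t) evaluated on the grid
    return Xbar, Gamma_hat

# Example: Brownian-motion paths on [0, 1]; Gamma_hat should approach min(s, t).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 101)
X = np.zeros((500, 101))
X[:, 1:] = np.cumsum(rng.normal(scale=np.sqrt(1.0 / 100), size=(500, 100)), axis=1)
Xbar, Gamma_hat = sample_covariance(X)
```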
We will give our basic assumptions below. Silverman [15] made three assumptions in Section 5.2 in order to prove the consistency result. Our assumptions are stronger than those in Silverman [15].
Assumption 1
E(∫_a^b X(t)² dt)² < ∞.   (2.2)
Remark
This assumption is stronger than the first assumption in Section 5.2 of Silverman [15]. Under condition (2.2), the central limit theorem for the sample covariance function holds (see Section 2 in Dauxois, Pousse and Romain [3] and Chapter 10 in Ledoux and Talagrand [10]).
- Assumption 1 is satisfied by many stochastic processes used in applications. For example, if X(t) is a bounded process, it is obvious that (2.2) is true. Gaussian processes are an important class of stochastic processes which are widely used in statistics and other areas. Suppose that X(t) is a Gaussian process with mean zero. Then

E(∫_a^b X(t)² dt)² ≤ 3(∫_a^b Γ(t, t) dt)².

Hence, if Γ(t, t) is integrable on [a, b], which is satisfied by the Gaussian processes commonly encountered in applications, (2.2) is true. Now let us consider standard Brownian motion, the most widely studied Gaussian process. For standard Brownian motion, Γ(t, t) = t, hence Assumption 1 is satisfied. It is well known that its sample paths are continuous and nowhere differentiable almost surely. For non-Gaussian processes, let us consider a Poisson process with rate 1 on [0, 1]. Its sample paths are step functions taking only integer values and hence discontinuous. It is easy to verify that Assumption 1 is satisfied by Poisson processes. - Under condition (2.2), we have

∫_a^b ∫_a^b Γ(s, t)² ds dt ≤ (∫_a^b Γ(t, t) dt)² < ∞.
Therefore, the operator Γ is a Hilbert-Schmidt operator, hence a compact operator (see Section XI.6 in Dunford and Schwartz [5] or Section 97 in Riesz and Sz.-Nagy [13]). It follows that the set of eigenvalues of this operator is bounded and at most countable, with at most one limit point at 0. Because the covariance operator Γ is always nonnegative-definite, all the eigenvalues are nonnegative. Let λ1 ≥ λ2 ≥ ⋯ ≥ 0 be the collection of all eigenvalues and let γ1, γ2, ⋯ be the corresponding eigenfunctions. Every eigenfunction has been scaled to have L2-norm 1. The set of all the eigenfunctions forms an orthonormal basis of L2([a, b]). Furthermore, we have the decomposition

Γ(s, t) = Σ_{j=1}^∞ λj γj(s) γj(t),   (2.3)

where the series on the right hand side converges in the L2 sense. If Γ is a continuous function, the series on the right hand side converges absolutely and uniformly. Although Silverman [15] did not assume that Γ is square integrable, he assumed the decomposition form (2.3). - We have
- By (2.2), X(s) is square integrable a.s. Hence, the sample covariance function Γ̂n satisfies

∫_a^b ∫_a^b Γ̂n(s, t)² ds dt < ∞

a.s. Then the eigenvalues λ̂1 ≥ λ̂2 ≥ ⋯ ≥ 0 of Γ̂n are well defined since the operator Γ̂n is nonnegative-definite, and the corresponding eigenfunctions γ̂j, j ∈ ℕ, satisfy

Γ̂n γ̂j = λ̂j γ̂j, ∥γ̂j∥ = 1.
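Continuing the sketch above (again our own illustration, not the paper's procedure), the eigenpairs (λ̂j, γ̂j) can be approximated by discretizing the integral operator: on a uniform grid with spacing h, the operator with kernel Γ̂n acts as the matrix h·Γ̂n.

```python
import numpy as np
from numpy.linalg import eigh

def fpca_unsmoothed(Gamma_hat, t):
    """Approximate eigenvalues/eigenfunctions of the covariance operator whose
    kernel is Gamma_hat, discretized on a uniform grid t with spacing h."""
    h = t[1] - t[0]
    evals, evecs = eigh(h * Gamma_hat)      # operator ~ h * kernel matrix
    order = np.argsort(evals)[::-1]         # largest eigenvalue first
    evals, evecs = evals[order], evecs[:, order]
    return evals, evecs / np.sqrt(h)        # rescale to unit L2 norm

lam_hat, gam_hat = fpca_unsmoothed(Gamma_hat, t)  # from the previous sketch
```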
Suppose that we are interested in estimating the first K eigenvalues and eigenfunctions of Γ.
Assumption 2
Any eigenvalue λj, 1 ≤ j ≤ K, has multiplicity 1, so that

λ1 > λ2 > ⋯ > λK > λK+1 ≥ 0.
Remark
This assumption is just the third assumption in Section 5.2 of Silverman [15]. If an eigenvalue has multiplicity 1, then the corresponding eigenfunction is uniquely determined up to a sign. If the multiplicity is larger than 1, the eigenfunctions cannot be uniquely determined up to a sign.
Assumption 3
The eigenfunctions γj, 1 ≤ j ≤ K, belong to W2²([a, b]).
Remark
This assumption is the second assumption in Section 5.2 of Silverman [15] and is essential in our paper.
- If the covariance function Γ satisfies some smoothness conditions, then Assumption 3 is true. For example, suppose that Γ(s, t), ∂²Γ(s, t)/∂s² and ∂⁴Γ(s, t)/∂s²∂t² are all continuous on [a, b] × [a, b] (hence they are bounded and square integrable); then one can easily verify that

γk″(s) = λk^{−1} ∫_a^b [∂²Γ(s, t)/∂s²] γk(t) dt.

Hence, by the Cauchy-Schwarz inequality and ∥γk∥ = 1, we have γk″ ∈ L2([a, b]), so that γk ∈ W2²([a, b]). - There are many important random processes whose covariance functions are not smooth, but whose eigenfunctions corresponding to nonzero eigenvalues belong to W2²([a, b]). The simplest examples are standard Brownian motion and the Poisson process with rate 1 on the time interval [0, 1]. Their covariance functions are the same and equal min(s, t), 0 ≤ s, t ≤ 1 (see page 89 in the book by Glasserman [7]). The eigenvalues and eigenfunctions are

λj = 4/((2j − 1)²π²), γj(t) = √2 sin((2j − 1)πt/2), j ∈ ℕ

(checked numerically in the sketch following this remark).
The next example is the famous Black-Scholes model in finance. Let St denote the price of a stock at time t. Then St satisfies the following SDE,

dSt = μSt dt + σSt dWt,   (2.4)

where μ is the instantaneous mean return, σ is the instantaneous return volatility and Wt is a Brownian motion. The covariance function of St is smooth except at the points on the diagonal line {(s, t) : s = t}. The same is true for the following example. Consider the counting process model in survival analysis. Let Nt be the number of occurrences of the event in [0, t]. Then Nt satisfies

Nt = ∫_0^t λ(s) ds + Mt,

where λ(t) is a smooth intensity function and Mt is a martingale.
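As an informal numerical check of the Brownian-motion example (our own sketch; the grid size is arbitrary), the eigenvalues of the kernel min(s, t) on [0, 1] can be compared with the closed form above:

```python
import numpy as np
from numpy.linalg import eigh

t = np.linspace(0.0, 1.0, 400)
h = t[1] - t[0]
Cmin = np.minimum.outer(t, t)                 # covariance kernel min(s, t)
num = np.sort(eigh(h * Cmin)[0])[::-1][:5]    # numerical operator eigenvalues

j = np.arange(1, 6)
exact = 4.0 / ((2 * j - 1) ** 2 * np.pi ** 2) # closed form 4/((2j-1)^2 pi^2)
print(num, exact, sep="\n")                   # the two should nearly agree
```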
Silverman [15] introduced a “half-smoothing” operator which plays an important role in this paper. We give a rigorous definition of this operator here. We first define an unbounded operator L in L2([a, b]). The domain of L is 𝒟(L) = {f ∈ L2([a, b]) : f, f′ are absolutely continuous and f″ ∈ L2([a, b])}, and for any f ∈ 𝒟(L),

Lf = f″.
Then L is a closed but unbounded operator and 𝒟(L) is dense in L2([a, b]) (for the definition of closed operators, see Chapter VIII of Riesz and Sz.-Nagy [13] or Chapter 13 of Rudin [14]). Let L* be the adjoint operator of L. By the theorem in Section 118 of Riesz and Sz.-Nagy [13] or Theorem 13.13 in Rudin [14], (I + αL*L)−1 is a bounded, positive self-adjoint operator with norm less than or equal to 1, where α ≥ 0 is the smoothing parameter. Now it follows from Theorems 12.33 and 13.31 in Rudin [14] that (I + αL*L)−1 has a unique positive and self-adjoint square root Sα with norm less than or equal to 1, which is the “half-smoothing” operator in Silverman [15]. Therefore,

Sα² = (I + αL*L)−1,   (2.5)

and by Theorem 13.11 (b) in Rudin [14], the inverse Sα−1 exists and is self-adjoint because (I + αL*L)−1 is invertible.
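For intuition about Sα, here is a rough matrix analogue under our own finite-difference discretization (a sketch, not the paper's operator-theoretic construction): replace L by a second-difference matrix D and take the positive square root of (I + αDᵀD)⁻¹.

```python
import numpy as np
from scipy.linalg import inv, sqrtm

def half_smoother(m, alpha, h):
    """Matrix analogue of S_alpha = (I + alpha L*L)^(-1/2) on an m-point grid
    with spacing h; D approximates L = d^2/dt^2 at interior points."""
    D = (np.diag(np.ones(m - 1), 1) - 2.0 * np.eye(m)
         + np.diag(np.ones(m - 1), -1))[1:-1] / h ** 2
    M = np.eye(m) + alpha * D.T @ D           # I + alpha L*L
    return np.real(sqrtm(inv(M)))             # unique positive square root

S = half_smoother(m=101, alpha=1e-4, h=0.01)  # S @ S ~ inv(I + alpha D'D)
```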
3. Silverman's approach to smoothed functional PCA
In this section, we always assume that the independent sample curves X1(t), …, Xn(t) are entirely observed. We first consider the usual population functional principal components. The first population functional principal component is defined as the linear functional ℓ1(X) of X which maximizes

Var(ℓ(X))

over all nonzero bounded linear functionals ℓ on L2([a, b]) with norm ∥ℓ∥ = 1. The second population functional principal component is defined as the linear functional ℓ2(X) of X which maximizes

Var(ℓ(X))

over all linear functionals ℓ with norm ‖ℓ‖ = 1 such that ℓ(X) is uncorrelated with ℓ1(X). Similarly, we can define all the other population functional principal components ℓ3(X), …. Because X takes values in L2([a, b]), which is a real Hilbert space, by the Riesz representation theorem, for any bounded linear functional ℓ there is a unique γ ∈ L2([a, b]) such that for any f ∈ L2([a, b]),

ℓ(f) = (γ, f).

Hence there exist γj ∈ L2([a, b]), j ∈ ℕ, with ‖γj‖ = 1, such that the population functional principal components are ℓj(X) = (γj, X), j ∈ ℕ. γj is called the j-th principal component weight function or the j-th principal component curve. Because

Var((γ, X)) = (γ, Γγ),

γ1 is the solution of the following optimization problem,

maximize (γ, Γγ) subject to γ ∈ L2([a, b]), ∥γ∥ = 1.   (3.1)
The maximum value of (3.1) is just the largest eigenvalue λ1 of Γ, and γ1 is the corresponding eigenfunction (see Section 2, Chapter 3 in Weinberger [16]). γ2 is the solution of the optimization problem,

maximize (γ, Γγ) subject to ∥γ∥ = 1 and (γ, γ1) = 0.   (3.2)
The maximum value of (3.2) is just the second eigenvalue λ2 of Γ and γ2 is the corresponding eigenfunction. Similarly, γj is the eigenfunction corresponding to the eigenvalue λj which is also the variance of the j-th principal component.
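For completeness, here is the standard variational argument behind (3.1) (a routine derivation we supply, using only that Γ is self-adjoint and nonnegative-definite); it is not spelled out in the paper:

```latex
% Maximize (\gamma, \Gamma\gamma) subject to \|\gamma\|^2 = 1 via a Lagrangian:
\[
J(\gamma) = (\gamma, \Gamma\gamma) - \lambda\bigl((\gamma, \gamma) - 1\bigr).
\]
% Setting the derivative along any direction h to zero, and using self-adjointness,
\[
2(h, \Gamma\gamma) - 2\lambda (h, \gamma) = 0 \quad \text{for all } h
\;\Longrightarrow\; \Gamma\gamma = \lambda\gamma,
\]
% so any maximizer is an eigenfunction, and the attained value is
\[
(\gamma, \Gamma\gamma) = \lambda(\gamma, \gamma) = \lambda,
\]
% which is largest when \lambda = \lambda_1, with \gamma_1 the eigenfunction.
```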
Because the covariance function Γ is usually unknown, we cannot obtain the population principal component weight functions directly. Hence, one uses the sample covariance function Γ̂n to estimate Γ and uses the eigenvalues and eigenfunctions of Γ̂n to estimate the eigenvalues and eigenfunctions of Γ. We call these the non-smooth estimators. However, the non-smooth principal component curves can show substantial variability (see Chapter 9 in Ramsay and Silverman [12]). There is a need for smoothing of the estimated principal component weight functions.
Silverman [15] (see also Chapter 9 in Ramsay and Silverman [12]) proposed a method of incorporating smoothing by replacing the usual L2 norm with a norm that takes the roughness of the functions into account. Let α be a nonnegative smoothing parameter. Define the estimators {(λ̂jα, γ̂jα) : j ∈ ℕ} of {(λj, γj) : j ∈ ℕ} to be the solutions of the following successive optimization problems. First, γ̂1α is the solution of the optimization problem

maximize (γ, Γ̂nγ)/∥γ∥α² over nonzero γ ∈ W2²([a, b]).   (3.3)

Let λ̂1α be the maximum value of (3.3). For any k ∈ ℕ, if we have obtained (λ̂jα, γ̂jα), 1 ≤ j ≤ k, then γ̂k+1α is the solution of the optimization problem

maximize (γ, Γ̂nγ)/∥γ∥α² over nonzero γ ∈ W2²([a, b]) with (γ, γ̂jα)α = 0, 1 ≤ j ≤ k,   (3.4)

and λ̂k+1α is the maximum value of (3.4). Note that (λ̂jα, γ̂jα) depends on both the sample size n and the smoothing parameter α.
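On a grid, the successive problems (3.3) and (3.4) collapse into a single generalized symmetric eigenproblem, because the α-orthogonality constraints are orthogonality in the inner product induced by I + αDᵀD. The following sketch is one possible implementation under our own discretization (it reuses Gamma_hat and t from the earlier sketches):

```python
import numpy as np
from scipy.linalg import eigh

def silverman_fpca(Gamma_hat, t, alpha, K):
    """Grid version of (3.3)-(3.4): solve h*Gamma_hat v = lam (I + alpha D'D) v,
    where D approximates d^2/dt^2, so that v' D'D v ~ (1/h) * int (f'')^2."""
    m, h = len(t), t[1] - t[0]
    D = (np.diag(np.ones(m - 1), 1) - 2.0 * np.eye(m)
         + np.diag(np.ones(m - 1), -1))[1:-1] / h ** 2
    lam, V = eigh(h * Gamma_hat, np.eye(m) + alpha * D.T @ D)
    lam, V = lam[::-1][:K], V[:, ::-1][:, :K]   # K largest eigenvalues first
    return lam, V / np.sqrt(h)                  # values of the weight functions

lam_a, gam_a = silverman_fpca(Gamma_hat, t, alpha=1e-4, K=3)
```

scipy returns the eigenvalues of the matrix pair in ascending order and normalizes eigenvectors in the (I + αDᵀD) inner product, which matches the scale-invariance of the Rayleigh quotient in (3.3) up to the grid scaling.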
First of all, we need to show that the solutions of the successive optimization problems (3.3) and (3.4) exist.
Theorem 3.1
Under Assumption 1, the solutions of the successive optimization problems (3.3) and (3.4) exist for any α ≥ 0 almost surely. Moreover, we have, for any f ∈ W2²([a, b]) and j ∈ ℕ,

(f, Γ̂nγ̂jα) = λ̂jα(f, γ̂jα)α.   (3.5)
Similarly, define {(λjα, γjα) : j ∈ ℕ} to be the solutions of the successive optimization problems (3.3) and (3.4) with Γ̂n replaced by Γ. We then have the analogous equalities for Γ: for any f ∈ W2²([a, b]) and j ∈ ℕ,

(f, Γγjα) = λjα(f, γjα)α.   (3.6)
Note that λ̂j0 = λ̂j, γ̂j0 = γ̂j, λj0 = λj and γj0 = γj.
Theorem 1 in Silverman [15] gives the consistency of the estimators λ̂jα and γ̂jα as α → 0 and n → ∞.
4. Asymptotic theory
Fix a positive integer K. We will assume throughout this section that we want to estimate the first K principal component curves. For any 1 ≤ k ≤ K, define
Then, under Assumption 3, Lk is finite and is a measure of the roughness of the first k eigenfunctions of Γ. For standard Brownian motion and the Poisson process with rate 1 (see Remark (3) after Assumption 3),
For any 1 ≤ k ≤ K, we have the decompositions

λ̂kα − λk = (λ̂kα − λkα) + (λkα − λk),   (4.1)

γ̂kα − γk = (γ̂kα − γkα) + (γkα − γk).   (4.2)
The last terms on the right hand sides of both (4.1) and (4.2) are nonrandom. They are the “bias terms” due to the introduction of α. We will give upper bounds for the norms of these terms. The first terms on the right hand sides of both (4.1) and (4.2) are the “variation terms” due to the randomness of the sample curves. We will prove a functional central limit theorem for these terms. In order to avoid any confusion, it should be pointed out that (4.1) and (4.2) are not bias-variance decompositions in the strict sense, because λkα and γkα are not the expectations of λ̂kα and γ̂kα, respectively. Since it is hard to express or characterize the exact expectations of λ̂kα and γ̂kα, the asymptotic properties of the usual bias and variation terms in the strict sense may not be easily studied. Heuristic calculations of the usual bias and variation terms in the strict sense were performed in Section 6 of Silverman [15].
Note that even if the multiplicity of λk is one, we cannot uniquely determine γk because −γk is also an eigenfunction. In the following theorem, by “given γk” we mean that not only is γk an eigenfunction, but also the direction (sign) of γk is given.
Define
(4.3) |
Theorem 4.1
Under Assumptions 1 – 3, for any 1 ≤ k ≤ K and 0 ≤ α ≤ α0,
(4.4) |
Given γk, 1 ≤ k ≤ K, we can uniquely choose γkα for each α ∈ [0, α0] such that γkα is a continuous function of α and (γkα, γk) > 0 for all 0 ≤ α ≤ α0, and we have
(4.5) |
Remark
- If K is fixed or bounded, we have
Hence, the convergence rates for eigenvalues and eigenfunctions are different: eigenvalues have faster convergence rates than eigenfunctions. As K → ∞, we have α0 → 0. If we choose α in such a way that 0 ≤ α ≤ α0 and the right hand sides of (4.4) and (4.5) converge to zero, then the estimation errors for the eigenvalues and eigenfunctions converge to zero for all 1 ≤ k ≤ K.
The convergence rates for both eigenvalues and eigenfunctions depend on Lk. If the eigenfunctions are less smooth, that is, Lk is large, then the convergence is slow.
- (4.4) and (4.5) give the upper bounds. However, the lower bounds are 0 for any k ∈ ℕ. Here is a simple example. Without loss of generality, let k = 2. Suppose [a, b] = [0, 2π],
Note that the right hand side in the above equality converges both uniformly and in L2([0, 2π] × [0, 2π]) to a strictly positive definite covariance function. Its first eigenvalue and eigenfunction are 2 and , the second ones are 1 and . It is interesting to note that the eigenfunctions of Γ are the same as the solutions of the successive optimization problems (3.3) and (3.4). The first maximum value of the successive optimization problems (3.3) and (3.4) is and the second one is still 1. That is, in this case, we have and for any α, hence the lower bounds are zero.
Define Cℝ[0, α0] to be the normed space of all continuous real functions on [0, α0] equipped with the norm sup0≤α≤α0 | · |. Let Π1≤j≤K Cℝ[0, α0] denote the product space of K copies of Cℝ[0, α0]. Define CL2([a,b])[0, α0] to be the normed space of all continuous functions on [0, α0] taking values in L2([a, b]), equipped with the norm sup0≤α≤α0 ‖ · ‖. Similarly, we define Π1≤j≤K CL2([a,b])[0, α0].
For each 1 ≤ k ≤ K and each n, we will view γ̂kα as a stochastic process with index α ∈ [0, α0] and values in L2([a, b]), and view λ̂kα as a stochastic process with index α ∈ [0, α0] and values in ℝ. However, on the following subset of the probability space,

(4.6)

the γ̂kα are not uniquely determined up to signs. We will show in the proof of the following theorem that Ω0 is measurable and that its probability goes to zero as n → ∞. Hence, how we define γ̂kα on Ω0 does not affect our asymptotic results. In order to make the development of our theory easier, we will use the following definition
(4.7) |
Theorem 4.2
Under Assumptions 1 – 3 and the definition (4.7), we can properly choose the signs of γ̂kα to make the sequence

(4.8)

of stochastic processes measurable with sample paths in

a.s. Furthermore, the sequence converges in distribution to a Gaussian random element with values in and mean zero. Similarly, the sequence
(4.9) |
of stochastic processes has sample paths in a.s. and converges in distribution to a Gaussian random element with values in and mean zero.
Remark
Recall the definition of Gaussian random elements in a separable Banach space. Suppose that X is a random element with values in a Banach space B with mean zero. Then X is a Gaussian random element if for any bounded linear functional f, f(X) is a Gaussian random variable. If X is a Gaussian random element, we can define its covariance operator Q. Q is a bounded operator from the dual space B′ to B such that for any f, g ∈ B′, g(Qf) = E[f(X)g(X)]. Note that the distribution of a Gaussian random element with values in a Banach space and mean zero is determined by its covariance operator. For further properties of Gaussian random elements in Banach spaces, see Ledoux and Talagrand [10].
The covariance operators of (4.8) and (4.9) can be characterized by the “half-smoothing” operator Sα defined in (2.5) and the limit distribution of √n(Γ̂n − Γ). However, the characterization involves some technical definitions. The reader can find the characterization in the proof of this theorem.
The measurability and the a.s. continuity of the sample paths of the processes (4.8) and (4.9) are not obvious at all.
The convergences of (4.8) and (4.9) are weak convergences of probability measures on the product spaces defined above, which are stronger than convergence of only the marginal distributions of (4.8) and (4.9).
Now from Theorem 4.1 and Theorem 4.2, we have the following corollaries.
Corollary 4.1
Under Assumptions 1 – 3, for any 1 ≤ k ≤ K and 0 ≤ α ≤ α0
(4.10) |
where
Remark
From Corollary 4.1, it seems that smoothing (that is, α > 0) is unnecessary, since when α = 0 we get the best order n−1/2. We clarify this point with the following remarks.
- Both Silverman [15] and this paper consider the ideal situation where every sample curve is observed at all points in [a, b] without any noise or measurement error. Although in this situation the estimates are consistent when α = 0, smoothing is advantageous.
- – First, because the “bias terms” and the “variation terms” are not the bias and the variation in the strict sense, they are correlated. Since the upper bounds on the right hand sides of (4.10) are the sums of the upper bounds for the bias terms and the variation terms, the upper bounds in (4.10) are actually for the cases in which the bias terms and the variation terms are positively correlated. These are the worst cases when we introduce smoothing. In some cases, such as those in Section 6.3 of Silverman [15], the mean squared errors for some α > 0 are less than those for α = 0. For these cases, it is possible that the bias terms and the variation terms are negatively correlated, and hence the estimation errors should be much less than the upper bounds in (4.10). Section 6.4 of Silverman [15] gave an optimal α with order for estimates of eigenfunctions. By Corollary 4.1, if we choose the optimal α, we obtain the best asymptotic rates . Even for the worst cases, if we take , we can obtain the rate .
- – Second, from a practical viewpoint, it is desirable that the estimates of the principal component curves preserve the main patterns of the true principal component curves. However, the sample curves of many stochastic processes are nonsmooth or even discontinuous, such as the examples in Remark (3) after Assumption 3. Hence, their sample covariance functions have many local variations, and so do the eigenfunctions of those sample covariance functions. In these cases, the local variations can be removed by using an appropriate amount of smoothing, that is, by choosing an appropriate positive α.
In practice, people cannot observe the entire sample curves. The observations can only be made at discrete points, often with noise or measurement error. The observation points could be dense or sparse. If the sample curves are smooth and the observation points are dense, we can obtain a smoothed estimate of each sample function and perform the usual functional PCA. This method cannot be applied to other situations. However, Silverman's method can be generalized to all these situations (see Qi and Zhao [11]). In our generalization, smoothing is essential and the smoothing parameters must be positive. The theoretical results in this paper have been applied to prove the consistency results in Qi and Zhao [11].
If α goes to 0 fast enough as n → ∞, we have the following asymptotic normalities.
Corollary 4.2
Under Assumptions 1 – 3, for any sequence {αn, n ≥ 1} with , the joint distributions of
converge to the same Gaussian distribution with mean zero. For any sequence {αn, n ≥ 1} with , the joint distributions of
converge to the same Gaussian distribution with mean zero.
Remark
Dauxois et al. [3] established the asymptotic normality of the eigenvalues and eigenfunctions of Γ̂n and characterized the covariance operators of the limiting Gaussian random elements. Those results are special cases of Corollary 4.2 with all αn equal to zero. Therefore, all the limiting Gaussian distributions in Corollary 4.2 are the same as those in Dauxois et al. [3].
5. Proofs
Proof of Theorem 3.1
By Remark (3) after Assumption 1, ∥Γ̂n∥ < ∞ a.s. Fix a sample and α ≥ 0 such that ∥Γ̂n∥ < ∞. Consider the Hilbert space W2²([a, b]) equipped with the inner product (·,·)α. For any f, g ∈ W2²([a, b]), the functional (f, Γ̂ng) defines a bilinear form on W2²([a, b]) and

|(f, Γ̂ng)| ≤ ∥Γ̂n∥ ∥f∥ ∥g∥ ≤ ∥Γ̂n∥ ∥f∥α ∥g∥α.

Hence, there is a unique bounded operator Rα on W2²([a, b]) such that for any f, g ∈ W2²([a, b]),

(f, Γ̂ng) = (f, Rαg)α

(see Section 84 in Riesz and Sz.-Nagy [13]). It is easy to see that Rα is symmetric and nonnegative-definite. We want to show that Rα is a compact operator (note that a compact operator is called a completely continuous operator in Riesz and Sz.-Nagy [13]). By Definition 4 in Section 85 of Riesz and Sz.-Nagy [13], we only need to show that for any bounded sequence {fm} in W2²([a, b]), one can select a subsequence {fmk} such that
(5.1) |
as k, l → ∞. Because Γ̂n is a compact operator in L2([a, b]) (see Remark (2) after Assumption 1) and {fm} is also a bounded sequence in L2([a, b]), one can select a subsequence {fmk} such that {Γ̂nfmk} converges; then (5.1) is true for {fmk}. Hence Rα is a compact operator. It has eigenvalues λ̂1α ≥ λ̂2α ≥ ⋯ and corresponding eigenfunctions γ̂jα, j ∈ ℕ. They are the solutions of the successive optimization problems (3.3) and (3.4) (see Chapter 3 of Weinberger [16]). Now for any f ∈ W2²([a, b]) and any j ∈ ℕ, because Rαγ̂jα = λ̂jαγ̂jα,
we have

(f, Γ̂nγ̂jα) = (f, Rαγ̂jα)α = λ̂jα(f, γ̂jα)α,

which is (3.5).
Proof of Theorem 4.1
The proof of the existence and uniqueness of the choices of the signs of γkα, 1 ≤ k ≤ K, making them continuous functions of α will be postponed to the proof of Theorem 4.2, because we need some technical lemmas proved there. We will assume that we can choose the signs of γkα, 1 ≤ k ≤ K, such that they are continuous functions of α for all 0 ≤ α ≤ α0 and γk0 = γk, 1 ≤ k ≤ K.
For any 1 ≤ k ≤ K, let Pk be the orthogonal projection operator in L2([a,b]) onto the space spanned by {γ1,… ,γk} and I be the identity operator in L2([a, b]). Then (I − Pk) is the orthogonal projection operator onto the closed subspace spanned by {γj,j ≥ (k + 1)}.
Lemma 1
For any k ∈ ℕ and α1 ≥ α2 ≥ 0,
Proof. It follows from Theorem 8.1 in Chapter 3 of Weinberger [16].
Lemma 2
For any 1 ≤ k ≤ K and α ≥ 0, we have
(5.2) |
Proof. For any j < k, by (3.6), we have
So
By Assumption 2 and Lemma 1, . Therefore,
and we have
where the last inequality in the second line follows from the Cauchy-Schwarz inequality.
Lemma 3
For any 1 ≤ k ≤ K and any
(if k = 1, the right hand side is defined to be infinity), we have
(5.3) |
Furthermore, if
(if k = 1, the right hand side is defined to be infinity), we have
(5.4) |
For any α ≥ 0, we have
(5.5) |
Hence, as α → 0, .
Proof. Let span(γ1, … , γk) denote the linear subspace spanned by γ1, … , γk.
From Theorem 5.1 (Poincare's Principle) in Chapter 3 of Weinberger [16], we have
(5.6) |
where the equality in the third line of (5.6) is true because (I − Pk−1) is the orthogonal projection operator onto the closed subspace spanned by {γj, j ≥ k}, which is orthogonal to span(γ1, … , γk−1), and both of them are invariant subspaces of Γ. The last inequality in (5.6) holds because the largest eigenvalue of Γ restricted to the closed subspace spanned by {γj, j ≥ k} is λk and the L2 norm of is less than 1. On the other hand, we have
(5.7) |
The equality in the last line follows from the fact that the smallest eigenvalue of Γ in span(γ1, … , γk) is λk. The last inequality holds because, for any β ∈ span(γ1, … , γk), we may write β = c1γ1 + ⋯ + ckγk, where c1, … , ck are some real numbers, and then we have
where the inequality in the second line is due to Cauchy-Schwarz inequality. Now from (5.6), (5.7) and Lemma 1, we have
From these inequalities, it can be derived that
Therefore, as α → 0.
Again by (5.6), (5.7), and note that , we have
Then
hence,
.
Now by (5.2), we have
After rearranging the terms, we then obtain
When the expression in braces on the left of the above inequality is positive, which is equivalent to
(if k = 1, the right hand side is defined to be infinity), we have
(5.8) |
When
(if k = 1, the right hand side is defined to be infinity), it can be shown that
and then it follows from (5.8) that
Lemma 4
For any 1 ≤ k ≤ K and any
(5.9) |
we have
(5.10) |
Proof. By the following orthogonal decomposition
(5.11) |
we have
(5.12) |
where the last inequality follows from the fact that belongs to the closed subspace spanned by {γj, j ≥ k + 1} in which the largest eigenvalue of Γ is λk+1. On the other hand, by (3.6), we have
(5.13) |
then
(5.14) |
It follows from (5.9) that . Then by (5.5), we have
hence,
(5.15) |
Because
we have
(5.16) |
From (5.14), (5.15) and (5.16),
Now by Lemma 2,
Now we can prove Theorem 4.1. It follows from the definition (4.3) of α0 that all the conditions in Lemmas 3 and 4 are satisfied. From the orthogonal decomposition
we have
Hence, it follows from Lemma 2, Lemma 4 and (5.4) in Lemma 3 that
(5.17) |
Define
(5.18) |
By solving the following inequalities,
we obtain . Since
By the definition (4.3) of α0 and (5.18), we have
Hence, for any 0 ≤ α ≤ α0, we have . Now it follows from (5.17) that, for any 0 ≤ α ≤ α0,
(5.19) |
Because is a continuous function of α, is also a continuous function of α and . Hence, it follows from (5.19) that for all 0 ≤ α ≤ α0.
From (5.16), (5.17) and (5.4), we have
By (5.17) and , we have
and thus
Proof of Theorem 4.2
We first study the properties of the “half-smoothing” operators Sα. As shown at the end of Section 2, Sα is a bounded linear operator from L2([a, b]) to L2([a, b]) with norm less than or equal to 1. Moreover, Sα is a one-to-one (injective) map. Hence, its inverse exists. When α = 0, S0 is just the identity operator I in L2([a, b]). The following lemma gives the reason why Sα is called a “half-smoothing” operator.
Lemma 5
The range of Sα (or the domain of Sα−1) is W2²([a, b]). Moreover, for any f, g ∈ W2²([a, b]),

(Sα−1f, Sα−1g) = (f, g)α.   (5.20)
Proof. If α = 0, the results are trivial. Hence, we assume that α > 0. Since the space C∞[a, b] of smooth functions is dense in the space
for any , there exists a sequence {fm ∈ C∞[a, b], m ∈ ℕ} such that ∥fm − f∥α → 0. One can see that the domain of contains C∞[a, b], hence C∞[a, b] is also in the domain of . Now we compute
(5.21) |
as m, l → ∞. Hence, is a Cauchy sequence in L2([a, b]). It converges to some function, say g, in L2([a, b]). Since Sα is a bounded operator, converges to Sαg in L2-norm. However, fm converges to f in ∥ · ∥α norm, it also converges in L2-norm. Therefore, Sαg = f, that is, f is in the range of Sα. Hence, is in the range of Sα. Because for any m ∈ ℕ, from a similar calculation as in (5.21),
and
we have .
Now we show that the range of Sα is equal to . Since we have shown that is in the range of Sα and Sα is a one-to-one map, we only need to show that the range of under is L2([a, b]). By (5.20) and the completeness of , the range of under is a closed subspace of L2([a, b]). If the range of under is not L2([a, b]), then we can find 0 ≠ h ∈ L2([a, b]) such that
Since one can see that the domain of is contained in , we have
Then
However, because the range of is the whole L2([a, b]), we have Sαh = 0. Hence h = 0 since Sα is a one-to-one map. We get a contradiction. Therefore, the range of Sα is equal to .
Lemma 6
λ̂jα and Sα−1γ̂jα, j ∈ ℕ, are the eigenvalues and eigenfunctions of the compact operator SαΓ̂nSα in L2([a, b]), and λjα and Sα−1γjα, j ∈ ℕ, are the eigenvalues and eigenfunctions of SαΓSα. Moreover, there are no other eigenvalues of SαΓ̂nSα and SαΓSα.
Note that the L2 norms of Sα−1γ̂jα and Sα−1γjα may not be 1.
Proof. If α = 0, the results are trivial. Hence, we assume that α > 0. Because (λ̂jα, γ̂jα), j ∈ ℕ, are the solutions of the successive optimization problems (3.3) and (3.4), by Lemma 5,
Hence, λ̂1α and Sα−1γ̂1α are the first eigenvalue and the corresponding eigenfunction of SαΓ̂nSα. Similarly, we can prove the conclusions for the other eigenvalues and eigenfunctions.
Define H to be the space of all compact operators from L2([a, b]) to L2([a, b]), equipped with the norm (2.1):

(5.22)

For the definition and properties of compact operators in Banach spaces, we refer the reader to Chapter 21 in Lax [9]. Define a sequence of stochastic processes

Zn(α) = √n Sα(Γ̂n − Γ)Sα, 0 ≤ α ≤ α0,

which is indexed by α and takes values in H, because both Γ̂n and Γ are compact operators and Sα is a bounded operator. Note that Zn(0) = √n(Γ̂n − Γ). We follow the notation of Dauxois et al. [3]. Let F denote the space of Hilbert-Schmidt operators from L2([a, b]) to L2([a, b]). Then F is a Hilbert space with an inner product denoted by < ·, · >F. By Assumption 1,
Thus Γ̂n, Γ ∈ F. It follows from Proposition 5 in Dauxois et al. [3] that {Zn(0), n ∈ ℕ}, regarded as a sequence of random elements with values in F, converges in distribution to the Gaussian random element in F with mean 0 and covariance operator Q, where
(5.23) |
X ⊗ X denotes the bounded operator from L2([a, b]) to L2([a, b]) with (X ⊗ X)(γ) = (γ, X)X for any γ ∈ L2([a, b]). Γ⊗̃Γ denotes the bounded operator from F to F with (Γ⊗̃Γ)(Λ) = 〈Λ,Γ〉F Γ for any Λ ∈ F. The other terms in (5.23) are defined similarly. Note that according to the definition (5.23), Q is an operator from F to F. However, because F is a Hilbert space, there is an isometry between F and its dual space F′. Hence, Q can be regarded as a bounded operator from F′ to F, and then it satisfies the definition of covariance operators in Remark (1) after Theorem 4.2. However, in this paper, we will consider the space H of compact operators, which is larger than the space F of Hilbert-Schmidt operators (every Hilbert-Schmidt operator is compact). In the proof of Proposition 6 in Dauxois et al. [3], the authors used the fact that if A is a Hilbert-Schmidt operator, then (A − zI)−1 is also a Hilbert-Schmidt operator, where z is a complex number which is not an eigenvalue of A and I is the identity operator. However, this is not true in general. But (A − zI)−1 is a bounded operator. Because the norm (2.1) in H is smaller than the norm in F, the embedding map i : F ↪ H (i maps any Hilbert-Schmidt operator to itself) is a bounded operator. Then we have
Lemma 7
{Zn(0), n ∈ ℕ}, regarded as a sequence of random elements with values in H, converges in distribution to a Gaussian random element in H with mean zero and covariance operator iQi*, where i* is the adjoint operator of i and Q is defined in (5.23).
Proof. It follows immediately from the following lemma.
Lemma 8
Suppose that {Xn, n ≥ 1} is a sequence of random elements with values in a Banach space B, and that Xn converges in distribution to a Gaussian random element X with mean zero and covariance operator Λ. Let T be a bounded operator (that is, a continuous linear map) from B to another Banach space C. Then T(Xn) converges in distribution to T(X), which is also a Gaussian random element with mean zero and covariance operator TΛT*, where T* is the adjoint operator of T.
Proof. Since T is a continuous map from B to C, by the continuous mapping theorem, T(Xn) converges in distribution to T(X). Now we show that T(X) is a Gaussian random element. For any bounded linear functional f ∈ C′, fοT ∈ B′. Hence, f(T(X)) = f ο T(X) is a Gaussian random variable since X is Gaussian. Thus T(X) is Gaussian and obviously its mean is zero. In order to compute its covariance operator, we introduce the following notation. For any x ∈ B, y ∈ C and f ∈ B′, g ∈ C′, define 〈x, f〉B = f(x), 〈y, g〉C = g(y). By the definition of covariance operators (see Remark (1) after Theorem 4.2) and the definition of adjoint operators, for any g, h ∈ C′,
Therefore, the covariance operator of TX is TΛT*.
Lemma 9
For any finite collection 0 ≤ α1 < … < αk ≤ α0, the sequence

(Zn(α1), …, Zn(αk)), n ∈ ℕ,
converges in distribution to a Gaussian random element with values in Hk and mean zero, where Hk is the product space of k copies of H.
Proof. This lemma follows from Lemma 8 and the fact that
is a continuous and linear function of Zn(0), since Sαi, i = 1, … , k, are bounded operators.
Unfortunately, Sα is not continuous as α → 0 under the norm (2.1). For example, let
By (5.20),
Define . Then ∥gn∥ = 1 and
Therefore, ∥Sα − I∥ ≥ 1 for all α > 0. Note that S0 = I. However, we have the following results.
Lemma 10
For any f ∈ L2([a, b]), α → Sαf is a continuous map from [0, α0] to L2([a, b]).
Proof. Let E be the resolution of the identity for the self-adjoint operator Sα0 (for reference, see Chapter 12 of Rudin [14]). Because Sα0 is a positive operator with ∥Sα0∥ ≤ 1, Ef,f is a bounded positive Borel measure on [0, 1]. Fix α ∈ [0, α0].
Now define a family of continuous functions on [0, 1],
then Sα = φα(Sα0). Let α′ ∈ [0, α0] and α′ → α. It follows from Theorem 12.21 and 12.23 in Chapter 12 of Rudin [14] that
The integrand on the right hand side is bounded. If α ≠ 0, the integrand converges to 0 at each point of [0, 1] as α′ → α. By the bounded convergence theorem, ∥(Sα′ − Sα)f∥² → 0. If α = 0, the integrand converges to 0 at each point of [0, 1] except 0. If we can show that Ef,f({0}), the measure of the set {0} under Ef,f, is zero, then by the bounded convergence theorem, we still have ∥(Sα′ − Sα)f∥² → 0. In fact, for any g ∈ L2([a, b]),
Hence, Sα0 E({0})f = 0. Because Sα0 is a one-to-one operator, E({0})f = 0. Therefore,
Lemma 11
For any compact operator Λ in L2([a, b]), α → SαΛSα is a continuous map from [0, α0] to H.
Proof. By Lemma 11 in Section XI.9 of Dunford and Schwartz [5], there exists a sequence Λm of bounded operators having finite-dimensional ranges, such that ∥Λm − Λ∥ → 0. If we can show that for each m, α → SαΛmSα is a continuous map, then since ∥SαΛmSα − SαΛSα∥ ≤ ∥Λm − Λ∥ → 0 uniformly in α, α → SαΛSα is continuous. Now fix m and 0 ≤ α ≤ α0. Let {e1, …, ek} be an orthonormal basis of the range of Λm and let α′ → α. For any f ∈ L2([a, b]) with ∥f∥ ≤ 1,
Because
which converges to 0 uniformly for all f ∈ L2([a, b]) with ∥f∥ ≤ 1 by Lemma 10. Now
which converges to 0 uniformly for all f ∈ L2([a, b]) with ∥f∥ ≤ 1 by Lemma 10, where Λm* is the adjoint operator of Λm. Hence, ∥Sα′ΛmSα′ − SαΛmSα∥ → 0.
In the next lemma, we assume that all the eigenfunctions have norms 1.
Lemma 12
Suppose that α → Λ(α) is a continuous map from [0, α0] to the subspace of H consisting of positive compact operators in L2([a, b]). Assume that the first K eigenvalues of Λ(α) for any α ∈ [0, α0] are positive and mutually different, and each of them has multiplicity 1. Then, given the first K eigenfunctions of Λ(0), there exist unique choices of the first K eigenfunctions of Λ(α) for any α ∈ (0, α0] such that the k-th eigenfunction is a continuous map from [0, α0] to L2([a, b]) for any 1 ≤ k ≤ K.
Note that for each 1 ≤ k ≤ K and 0 ≤ α ≤ α0, there exist two eigenfunctions with norm 1 of Λ(α) corresponding to its k-th eigenvalue, and each one is equal to the other multiplied by −1.
Proof. Let λk(α), 1 ≤ k ≤ K, denote the first K eigenvalues of Λ(α). Let Ek(α) be the orthogonal projection onto the space spanned by the k-th eigenfunction, 1 ≤ k ≤ K, 0 ≤ α ≤ α0. Note that Ek(α) does not depend on the sign of the eigenfunction.
We first show that for any 1 ≤ k ≤ K, Ek(α) is a continuous function from [0, α0] to H. For any fixed α ∈ [0, α0], we can find a small positive number εα such that the K + 1 intervals
are disjoint. Since Λ(α) is a continuous function, we can choose a neighborhood ℳα of α in [0, α0], such that for any α′ ∈ ℳα
where the first inequality follows from Corollary 4 in Section XI.9 of Dunford and Schwartz [5]. Now we define K circles on the complex plane ℂ,
Then one can see that for any α′ ∈ ℳα, the disk bounded by the circle Ck only contains the k-th eigenvalues of Λ(α′). Hence, we have (see Section VII.3 of Dunford and Schwartz [4] or Definition 10.26 in Rudin [14])
for any α′ ∈ ℳα. Since (zI − Λ(α′))−1 is a continuous function of z ∈ Ck and Ck is a compact set, we have
(5.24) |
Since Λ(α) is a continuous function of α, for any 0 < δ < 1, we can find a neighborhood 𝒩α of α such that
(5.25) |
Now for any α′ ∈ ℳα ⋂ 𝒩α,
(5.26) |
Since δ can be arbitrarily small, Ek(α) is continuous at α.
Now we show that for any given α ∈ [0, α0], and given , there exists a neighborhood [α1, α2] of α such that for any α′ ∈ [α1, α2], we can uniquely choose such that is continuous in this neighborhood. Because Ek(α′) is a continuous function of α′, is a continuous function of α′ and its value is 1 at α′ = α. Hence, we can find a neighborhood [α1, α2] of α such that for α′ ∈ [α1, α2]. Then
are eigenfunctions and continuous in [α1, α2]. Now we show the uniqueness. Suppose , α′ ∈ [α1, α2] is another choice of the eigenfunctions such that it is continuous and . If for some , , we have . Since both the inner products and are continuous functions for α′ ∈ [α1, α2]. By the choice of [α1, α2], . Because , one of them must be negative. Without loss of generality, we assume that . Since , it follows from the intermediate value theorem that there is at least one point α‴ between α and α″ such that . However, it is impossible because
Hence we have proved the uniqueness.
Fix . Let 𝒱 be the set
By the arguments in the last paragraph, 𝒱 is nonempty. Now we show that the set 𝒱 is an open set. Suppose that α* is any point in 𝒱. It follows from the last paragraph that there exists a neighborhood [α1, α2] of α* such that given e[α*], we can uniquely choose the sign of e[α] for any α ∈ [α1, α2] to make e[α], α ∈ [α1, α2] a continuous function. We show that [α1, α2] ⊂ 𝒱. Let α** be any point in [α1, α2]. It is easy to see that we can choose the signs of e[α] for all α ∈ [0, α**] such that e[α] is a continuous function of α in [0, α**]. We only need to show the uniqueness of e[α]. The uniqueness is obvious if α** ≥ α* since α* ∈ 𝒱. Hence we assume that α** < α*. We will proceed by contradiction. Assume that there are two different continuous functions and , 0 ≤ α ≤ α**. By the definition of [α1, α2], we can choose a continuous function , α** ≤ α ≤ α*. Define
and
Then and are two different continuous functions in [0, α*], which contradicts α* ∈ 𝒱. Hence, 𝒱 is an open set.
Now if we can prove that 𝒱 is also a closed set, then 𝒱 = [0, α0]. Let αm ∈ 𝒱 be a sequence of positive numbers converging to α ∈ [0, α0]. If αm ≥ α for some m, it is obvious that α ∈ 𝒱. Hence we assume that αm < α for all m. Then we can uniquely choose the signs of such that is continuous in [0, α). Let be one of the two eigenfunctions with norm 1. Because for any α′ < α

goes to zero as α′ → α, . Since is continuous in [0, α), converges either to 1 or −1. In the latter case, we change to . Hence, without loss of generality, we assume that as α′ → α. Now one can see that is continuous on [0, α] and its uniqueness is obvious. Hence, α ∈ 𝒱. We have proven that 𝒱 is a closed set.
Define CH[0, α0] to be the space of all continuous functions from [0, α0] to H (see Chapter 3 of Billingsley [1]). For any {Λ(α) : 0 ≤ α ≤ α0} ∈ CH[0, α0], define the norm

∥Λ∥ = sup0≤α≤α0 ∥Λ(α)∥.   (5.27)
Under the norm (5.27), CH[0, α0] is a Banach space. Recall the definition

Zn(α) = √n Sα(Γ̂n − Γ)Sα, 0 ≤ α ≤ α0.

By Lemma 11, we can regard the stochastic processes Zn on [0, α0] as random elements with values in CH[0, α0]. Define a linear map Θ: H → CH[0, α0] such that for any compact operator U ∈ H,

Θ(U)(α) = SαUSα, 0 ≤ α ≤ α0.   (5.28)
Lemma 13
Θ is a bounded operator and the sequence {Zn, n ∈ ℕ} of stochastic processes with sample paths in CH[0, α0] converges in distribution to a Gaussian random element with mean zero and covariance operator ΘiQi*Θ*.
Proof. Since the norm of Sα is less than or equal to 1, for any V ∈ H,
Hence, the map (5.28) is continuous and hence a bounded operator. Since Zn = Θ(Zn(0)), the lemma follows from Lemmas 7 and 8.
Now for any 1 ≤ k ≤ K, define
(5.29) |
Note that by Lemma 6, and are the eigenfunctions of SαΓ̂nSα and SαΓSα with norm 1. By (5.29) and because and , we have
(5.30) |
and
(5.31) |
Define , 1 ≤ k ≤ K, and εK = min1≤k≤K ε̃k. Then the K + 1 intervals
(5.32) |
are disjoint. By the definition (4.3) of α0 and (5.5) in Lemma 3, for any 0 ≤ α ≤ α0 and 1 ≤ k ≤ K,
(5.33) |
Hence, the λkα are mutually different for all 0 ≤ α ≤ α0. Now, given γk, 1 ≤ k ≤ K, by Lemma 11 and Lemma 12, we can uniquely choose the first K eigenfunctions of SαΓSα such that and , 1 ≤ k ≤ K, are continuous functions of α. We have proved the claims about the continuity of γkα, 1 ≤ k ≤ K, made at the beginning of the proof of Theorem 4.1.
Now we define K circles in the complex plane ℂ,
(5.34) |
Note that the K discs bounded by Ck, 1 ≤ k ≤ K, are disjoint and the intersections between these discs and the real line in the complex plane are just the first K intervals in (5.32). Let Ek(α) be the orthogonal projection onto the space spanned by the k-th eigenfunction of SαΓSα, 1 ≤ k ≤ K, 0 ≤ α ≤ α0. Now, because it follows from (5.33) that for any 0 ≤ α ≤ α0 and 1 ≤ k ≤ K the disk bounded by the circle Ck contains only the k-th eigenvalue of SαΓSα, we have
(5.35) |
By Lemma 11, SαΓSα is a continuous function of α. Hence, by a similar calculation as in (5.26), it can be shown that Ek(α) is a continuous function of α.
Recall that we define in (4.6)
Lemma 14
Ω0 is a measurable set and P(Ω0) → 0 as n → ∞.
Proof. Consider the subset
ε is an open subset of the space of all positive compact operators, which is closed in H; hence it is measurable. Let (Ω, ℱ) be the probability space and ([0, α0], ℬ[0, α0]) be the Lebesgue space. Since SαΓ̂nSα has continuous sample paths, it is jointly measurable in (Ω × [0, α0], ℱ × ℬ[0, α0]). One can see that is the projection of the set {(ω, α) : SαΓ̂nSα ∈ ε} to Ω. Therefore, is measurable, and so is Ω0. By (5.33) and the definition of εK (just above (5.32)), we have
By Corollary 4 in Section XI.9 of Dunford and Schwartz [5],
(5.36) |
Hence,
(5.37) |
by the law of large numbers.
For any ω ∈ Ω0, define to be zero. For any ω ∉ Ω0, define to be the orthogonal projection onto the space spanned by the k-th eigenfunction of SαΓ̂nSα (note that it does not depend on the sign of the eigenfunction). By the same argument as in the proof of Lemma 12, we can show that it is a continuous function of SαΓ̂nSα, so it is measurable and continuous in α. Now let {em, m ∈ ℕ} be a complete orthonormal basis of L2([a, b]); we choose
(5.38) |
in and 0 in Ω0, where χ is the indicator function. Then is measurable and
(5.39) |
Now by Lemma 11, Lemma 12 and the definition of Ω0, for any ω ∉ Ω0, we can uniquely choose , 1 ≤ k ≤ K, such that , 1 ≤ k ≤ K are continuous functions of α. is measurable by the following lemma. By (5.31), , 1 ≤ k ≤ K are continuous and measurable with .
Lemma 15
For any 1 ≤ k ≤ K, is a measurable map to CL2([a,b])[0, α0].
Proof. In , is a continuous function of α. Since , let in . In Ω0, define T̂(1) = 0. Then T̂(1) is a nonnegative random variable. By Lemma 12, we have in , if α ≤ T̂(1),
Define a random element
in and 0 in Ω0. Define a random variable and a random element
in and 0 in Ω0. Similarly, we can define (T̂(3), ζ3), …. One can show that for any ω ∈ Ω0c, there are only finitely many T̂(m)(ω) < α0, m = 0, 1, 2, …, where T̂(0)(ω) = 0. Hence in Ω0c, we have
where and χ is the indicator function. Hence, is measurable.
By (5.33) and (5.36), in the event , for any 0 ≤ α ≤ α0, 1 ≤ k ≤ K, the disk bounded by the circle Ck contains only the k-th eigenvalues of SαΓ̂nSα and SαΓSα. Hence, in the event , for any 0 ≤ α ≤ α0, 1 ≤ k ≤ K, we have
(5.40) |
The proofs of the following Lemma 16 and Lemma 17 follow the ideas of Section 2 in Dauxois et al. [3]. Define linear maps ϕk : CH[0, α0] → CH[0, α0], 1 ≤ k ≤ K, such that for any Λ ∈ CH[0, α0] and 0 ≤ α ≤ α0,
(5.41) |
where (ϕk(Λ))(α) denotes the value of ϕk(Λ) at the point α. Then define ΦK = (ϕ1, ϕ2, …, ϕK), which is a linear map from CH[0, α0] to the product space of K copies of CH[0, α0]. One can verify that the ϕk are continuous. Hence ΦK is a bounded operator.
Lemma 16
The sequence of stochastic processes has sample paths in a.s. and converges in distribution to a Gaussian random element with mean zero and covariance operator .
Proof. In the event , for each z ∈ CK,
(5.42) |
If
where
then by (5.42), we have an absolutely convergent series expansion
Hence,
(5.43) |
where
Hence, in the event ,
(5.44) |
Now in the event , by (5.42) and (5.43),
(5.45) |
Now we have from (5.44) and (5.45), for any δ > 0,
(5.46) |
as n → ∞. By Lemmas 8 and 13, ΦK(Zn) = (ϕ1(Zn), ϕ2(Zn), …, ϕK(Zn)) converges in distribution to a Gaussian random element with mean zero and covariance operator . Now by (5.46), converges in distribution to the same limit.
Define linear maps Ψk : CH[0, α0] → CL2([a,b])[0, α0], 1 ≤ k ≤ K, such that for any Λ ∈ CH[0, α0],
(5.47) |
Then we define a linear map such that for any ,
(5.48) |
It is easy to see that ψK is a bounded operator.
Lemma 17
The sequence of stochastic processes has sample paths in a.s. and converges in distribution to a Gaussian random element with mean zero and covariance operator .
Proof. By the definitions (5.29) of . In , we have
By (5.46), and ϕk(Zn) have the same limit distribution. Because for any Λ ∈ CH[0, α0],
(5.49) |
where we use the facts that
So we have
in probability. By (5.39) and the continuities of and , we have
(5.50) |
in probability. Now
(5.51) |
By (5.50), the first term in the last line converges to 0 in probability and in probability. Hence, has the same limit distribution as which converges to a Gaussian random element with mean zero and covariance operator by Lemmas 8 and 16.
Define linear maps ϕk : CH[0, α0] → CH[0, α0], 1 ≤ k ≤ K, such that for any Λ ∈ CH[0, α0],

where Ψk is defined in (5.47) and (Ψk(Λ))(α) denotes the value of Ψk(Λ) at α. Define a linear map ℧K such that for any (Λ1, …, ΛK),
(5.52) |
It is easy to see that ℧K is a bounded operator.
Lemma 18
The sequence of stochastic processes has sample paths in and converges in distribution to a Gaussian random element with mean zero and covariance operator .
Proof. The continuities of and follow from Lemma 11 and the inequalities
for any 0 ≤ α, α′ ≤ α0. In ,
(5.53) |
By Lemmas 16 and 17, and in probability. Hence by (5.53), has the same limit distribution as
which, by (5.51), has the same distribution as
Hence, has the same limit distribution as which converges to a Gaussian random element with mean zero and covariance operator by Lemmas 8 and 16.
Define a linear map such that for any ,
(5.54) |
ℑK is a bounded operator.
Lemma 19
The sequence of stochastic processes has sample paths in a.s. and converges in distribution to a Gaussian random element with mean zero and covariance operator .
Proof. By (5.31),
Therefore,
(5.55) |
Because
in probability, by the definition (5.54) of ℑK, (5.55) and Lemma 17, has the same limit distribution as which converges to a Gaussian random element with mean zero and covariance operator .
Proof of Corollary 4.1
By Lemma 18 and Lemma 19, the stochastic processes and converge in distribution; hence they are tight by Theorem 5.2 in Billingsley [1], since CH[0, α0] and CL2([a,b])[0, α0] are both complete and separable. Therefore, for any ϵ > 0, one can find a positive number M depending on ϵ such that
In other words,
uniformly in α, which, combined with Theorem 4.1, gives the corollary.
Proof of Corollary 4.2
First, we have decompositions
Under the conditions on αn for eigenvalues and eigenfunctions respectively, by Theorem 4.1, we have and respectively. Since and converge in distribution by Theorem 4.2, they are tight. Hence, the asymptotic normalities of and follow from Theorem 4.2 and the following lemma. The corollary then follows at once.
Lemma 20
Suppose that F is a metric space with distance d. Let CF[0, α0] denote the space of continuous functions on [0, α0] taking values in F. Suppose we have a sequence {Yn(α), 0 ≤ α ≤ α0, n ∈ ℕ} of stochastic processes with sample paths in CF[0, α0]. Assume that Yn is tight and that Yn(0) converges in distribution to a random element Y in F; then for any sequence αn of positive numbers converging to 0, Yn(αn) also converges in distribution to Y.
Proof. First, we show that for any ϵ > 0 we can find δ > 0 such that
Since Yn is tight, we can find a compact subset Χ of CF[0, α0] such that
We can find a finite number of elements Λ1, …, Λm ∈ Χ such that for any Λ ∈ Χ, we can find i such that . Furthermore, we can find δ > 0 such that,
Now it is easy to see that for any Λ ∈ Χ,
Hence,
If αn ≤ δ, we have
Since ϵ is arbitrary, d(Yn(0), Yn(αn)) → 0 in probability.
Acknowledgments
Supported in part by NIH grant R01 GM59507, a pilot project from the Yale Pepper Center, and NSF grant DMS 0714817.
References
- 1. Billingsley P. Convergence of Probability Measures. 2nd Edition. Wiley-Interscience; 1999.
- 2. Cardot H, Ferraty F, Sarda P. Functional linear model. Statistics and Probability Letters. 1999;45:11–22.
- 3. Dauxois J, Pousse A, Romain Y. Asymptotic theory for the principal component analysis of a random vector function: some applications to statistical inference. J. Multivariate Anal. 1982;12:136–154.
- 4. Dunford N, Schwartz JT. Linear Operators, General Theory, Part 1. Wiley-Interscience; 1988.
- 5. Dunford N, Schwartz JT. Linear Operators, Spectral Theory, Self Adjoint Operators in Hilbert Space, Part 2. Wiley-Interscience; 1988.
- 6. Ferraty F, Vieu P. Nonparametric Functional Data Analysis: Theory and Practice. Springer; 2006.
- 7. Glasserman P. Monte Carlo Methods in Financial Engineering. Springer; 2003.
- 8. Huang JZ, Shen H, Buja A. Functional principal components analysis via penalized rank one approximation. Electron. J. Statist. 2008;2:678–695.
- 9. Lax PD. Functional Analysis. Wiley-Interscience; 2002.
- 10. Ledoux M, Talagrand M. Probability in Banach Spaces: Isoperimetry and Processes. Ergebnisse der Mathematik und ihrer Grenzgebiete, 3. Folge, Band 23. Springer; 2006.
- 11. Qi X, Zhao H. Functional principal component analysis for discretely observed functional data. Submitted, 2010.
- 12. Ramsay JO, Silverman BW. Functional Data Analysis. 2nd Edition. Springer; New York: 2005.
- 13. Riesz F, Sz.-Nagy B. Functional Analysis. Dover Publications; 1990.
- 14. Rudin W. Functional Analysis. 2nd Edition. McGraw-Hill; 1991.
- 15. Silverman BW. Smoothed functional principal components analysis by choice of norm. The Annals of Statistics. 1996;24:1–24.
- 16. Weinberger HF. Variational Methods for Eigenvalue Approximation, 2nd Edition. CBMS-NSF Regional Conference Series in Applied Mathematics. Society for Industrial and Applied Mathematics; 1987.