A Two-Stage Approach for Semilinear In-Slide Models

Jinhong You; Haibo Zhou

doi:10.1016/j.jmva.2008.01.013

Author manuscript; available in PMC: 2009 Oct 2.

Published in final edited form as: J Multivar Anal. 2008 Sep 1;99(8):1610–1634. doi: 10.1016/j.jmva.2008.01.013


PMCID: PMC2756113 NIHMSID: NIHMS67348 PMID: 19802362

Abstract

The semilinear in-slide models (SLIMs) have been shown to be an effective method for normalizing microarray data (Fan, et al. 2004). Using a backfitting method, Fan, Peng and Huang (2005) proposed a profile least squares (PLS) estimation for the parametric and nonparametric components. The general asymptotic properties of their estimator have not been developed. In this paper, we consider a new approach, two-stage estimation, which enables us to establish the asymptotic normality of both the parametric and nonparametric component estimators. We further propose a plug-in bandwidth selector using the asymptotic normality of the nonparametric component estimator. The proposed method allows for the modeling of the aggregated SLIMs case, where we can explicitly show that taking the aggregated information into account improves both the parametric and nonparametric component estimators under the proposed two-stage approach. Some simulation studies are conducted to illustrate the finite sample performance of the proposed procedures.

Key words and phrases: Semilinear regression, In-slide model, Two-stage estimation, Asymptotic normality, Aggregated information

1 Introduction

Microarray technology is an important tool for quantitatively monitoring gene expression patterns and has been widely used in functional genomics (see e.g. Schena et al., 1995; Brown and Botstein 1999). Since great variations in experimental conditions exist in the microarray process, it is essential to normalize the raw microarray data before any meaningful inference or analysis can be done. Useful normalization techniques that have been developed include the global normalization method (e.g. Kroll and Wölfl 2002), the “lowess” method (e.g. Dudoit et al. 2002), and the rank based procedure (e.g. Tseng et al. 2001). However, these normalization techniques generally need some restrictive biological assumptions. For example, the global normalization method needs the assumption that there is no print-tip block effect and no intensity effect. Without such an assumption, the global normalization method would be statistically biased. The “lowess” method requires the assumption that the average expression levels of up- and down-regulated genes at each intensity level are about the same in each print-tip block. The rank based procedure assumes that there are not many genes that are up-regulated (or down-regulated).

New statistical approaches have been sought to relax those restrictive biological assumptions. For example, two-way semilinear models have been proposed to normalize the microarray data (Huang, et al. 2003, Huang and Zhang 2003, Huang, Wang and Zhang 2005). This method does not make the usual assumptions underlying the existing methods mentioned above. The two-way semilinear model approach can also incorporate uncertainty due to normalization into significant analysis of microarrays.

Fan, et al. (2004) proposed a method to estimate the intensity and print-tip effects by aggregating information from the replications in a microarray. Let G be the number of genes, I_g be the number of replications of the gth gene, and R_gi and G_gi be the red (Cy5) and green (Cy3) intensities of the gth gene in the ith replication, respectively. Further, let Y_gi be the log-intensity ratio of the red over the green channel of the gth gene in the ith replication, and let U_gi be the corresponding average of the log-intensities of the red and green channels. That is, Y_gi = log₂(R_gi/G_gi) and U_gi = (1/2) log₂(R_gi G_gi). The following semilinear model was proposed by Fan, et al. (2004) to fit the intensity and print-tip block effects
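As a small illustration of the two transformed quantities just defined, the following sketch computes Y_gi and U_gi from raw channel intensities. The function name and the example intensities are hypothetical and chosen only so that the logarithms are easy to check by hand.

```python
import numpy as np

def log_ratio_and_intensity(red, green):
    """Return (Y, U): log2 ratio R/G and the average log2 intensity."""
    red = np.asarray(red, dtype=float)
    green = np.asarray(green, dtype=float)
    y = np.log2(red / green)          # Y_gi = log2(R_gi / G_gi)
    u = 0.5 * np.log2(red * green)    # U_gi = (1/2) log2(R_gi * G_gi)
    return y, u

# Hypothetical intensities for three spots (powers of two for readability)
red = np.array([1024.0, 512.0, 2048.0])
green = np.array([256.0, 512.0, 1024.0])
Y, U = log_ratio_and_intensity(red, green)
```

Here Y measures differential expression between the two channels, while U measures overall spot brightness, which is the argument of the intensity effect m(·).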

Y_{g i} = α_{g} + β_{r_{g i}} + γ_{c_{g i}} + m (U_{g i}) + ε_{g i},

(1.1)

where α_g is the treatment effect associated with the gth gene, r_gi and c_gi are the row and column of the print-tip block where the gth gene of the ith replication resides, β and γ are the row and column effects with constraints $\sum_{i = 1}^{r} β_{i} = 0$ and $\sum_{j = 1}^{c} γ_{j} = 0$, where r and c are the numbers of rows and columns of the print-tip blocks, m(·) is a smooth function of U representing the intensity effect, and the ε_gi’s are random errors with mean zero and variance σ².

Using matrix notation, model (1.1) can be re-written as

Y = B α + X β + M + ε, n = \sum_{g = 1}^{G} I_{g}

(1.2)

where Y = (Y₁,…, Y_n)^T is the response, B = blockdiag(1_I₁,…, 1_{I_G}) with 1_{I_g} being a vector of length I_g and all elements 1, X = (X₁,…,X_n)^T is an n × p design matrix with p being the sum of the numbers of row and column, α = (α₁,…, α_G)^T is the effect of gene, β = (β₁,…, β_r, γ₁,…, γ_c)^T is the print-tip block effect, M = (m(U₁),…,m(U_n))^T is the intensity effect and ε = (ε₁,…, ε_n)^T is the random error.

Model (1.2) can be viewed as an extension of the usual fixed-effects parametric model to the semiparametric context. Such a fixed-effects model is an appropriate specification if one is interested in a specific set of subjects, and it has been widely applied in econometric analysis (see, e.g., Lichtenberg 1988, Honoré 1994, Baltagi 1995, Entorf 1997).

For the case where I_g ≡ I, Baltagi and Li (2002) proposed difference-based series (DBS) estimators for β and m(·). They established the asymptotic normality of the former and derived the convergence rate of the latter. Fan, Peng and Huang (2005) proposed profile least squares (PLS) estimators for β and m(·) by combining the local linear, least squares and backfitting procedures. They established the asymptotic normality of the former and derived an upper bound for the mean squared error of the latter. You, Zhou, and Zhou (2005) proposed semiparametric least squares (SLE) estimators for β and m(·) by approximating the nonparametric component with a series. For the DBS, PLS and SLE estimators, it is not easy to establish the asymptotic normality of the nonparametric component estimators. The reason is that the DBS and SLE involve the series approximation and the PLS uses a backfitting procedure. This hinders the application of these estimators in practice, as it is difficult to select a bandwidth and to conduct inference on the nonparametric component. In addition, Baltagi and Li (2002) and You, Zhou, and Zhou (2005) only consider the non-aggregated model.

Real microarray data often have different numbers of replications, i.e. I_g may not always be the same across different g. This structure may arise from the fact that different studies have different replication numbers or that, within the same study, uncontrollable experimental conditions such as image corruption, array fabrication error, etc., may lead to different I_g for different g (Golub et al. 1999, Alizadeh et al. 2000, Hedenfalk et al. 2001, Nguyen et al. 2004). An extension of model (1.2) to the case of unequal I_g has not yet been developed.

In this paper, we describe a two-stage estimation procedure. In the first stage, a series approximation is used to obtain series estimates of the parametric and nonparametric components. In the second stage, we plug in the first-stage estimates and eliminate the nuisance parameters α_g by differencing. This transforms model (1.2) into an ordinary semilinear regression model. We then propose ordinary profile least squares and local polynomial estimators for the parametric and nonparametric components, respectively. The asymptotic normality of the proposed estimators is established. In particular, we show that the estimator of the parametric component achieves the semiparametric efficiency bound. We extend the two-stage estimation to the aggregated SLIMs case. Using the PLS estimation, the aggregated information can only be used to improve the parametric components (Fan, Peng and Huang 2005). We explicitly demonstrate that under our two-stage estimation, the aggregated information can be used to improve both the parametric and nonparametric component estimates.

The layout of the remainder of this paper is as follows. In Section 2 we describe the proposed two-stage estimation. In Section 3 we derive the asymptotic properties of the two-stage estimators. Extending the two-stage estimation to the aggregated SLIMs case is considered in Section 4. Section 5 presents results from numerical studies. Section 6 concludes. All proofs of main results are relegated to the Appendix.

2 A Two-Stage Procedure

Throughout this paper we assume that G → ∞ and 2 ≤ I_g ≤ c for some fixed constant c. The two-stage estimation is as follows. In the first stage, a series approximation is used to obtain series estimates of the parametric and nonparametric components. In the second stage, the first-stage estimates are plugged in and, by differencing, we eliminate the nuisance parameters α_g and transform model (1.2) into an ordinary semilinear regression model. The ordinary profile least squares and local polynomial estimates are then obtained for the parametric and nonparametric components, respectively.

Since m(u) is a smooth function, it can be approximated by ζ^T (u)ϑ where ζ (u) = (ζ _{k_n1}(u),…, ζ_{k_nk_n}(u))^T is a vector of approximating functions, such as power series or B-splines, ϑ is an unknown k_n-variate constant vector and k_n is a positive integer which is dependent on n. Thus, model (1.2) can be written as

Y = B α + X β + Ξ ϑ + ε^{*},

(2.1)

where Ξ is an n × k_n matrix with i-th row being ζ(U_i) = (ζ_{k_n1}(U_i),…, ζ_{k_nk_n}(U_i))^T , ε^* = ε + M − Ξϑ and M = (m(U₁),…,m(U_n))^T . Define M_B = I_n − B(B^TB)⁻¹B^T . Then pre-multiplying (2.1) by M_B leads to

M_{B} Y = M_{B} X β + M_{B} Ξ ϑ + M_{B} ε^{*} .

(2.2)

If we take M_Bε^* as the residuals, model (2.2) is a version of the usual linear regression. By the usual “profile” or “partialing out” formula, the estimator of β can be written as

{\tilde{β}}_{n} = {(X^{T} M_{B} M_{M_{B} Ξ} M_{B} X)}^{- 1} X^{T} M_{B} M_{M_{B} Ξ} M_{B} Y,

(2.3)

where M_{M_BΞ} = I_n − P_{M_BΞ} = I_n − M_BΞ (Ξ^TM_BΞ)⁻Ξ^TM_B and A⁻ denotes any generalized inverse of matrix A. An estimator of ϑ is

{\tilde{ϑ}}_{n} = {(Ξ^{T} M_{B} Ξ)}^{-} Ξ^{T} M_{B} (Y - X {\tilde{β}}_{n}) .

Then an obvious estimator of m(u) is m̃_n(u) = ζ^T (u)ϑ̃_n, which is a nonparametric projection estimator. As in You, Zhou and Zhou (2005), we can establish the asymptotic normality of β̃_n. However, it is a great challenge to establish the asymptotic normality of m̃_n(u). The lack of asymptotic normality of the nonparametric component estimator poses difficulties for bandwidth selection and hinders statistical inference. In the following we propose two-stage estimators for both the parametric and nonparametric components and establish the asymptotic normality of both.
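The first-stage computation in (2.2) and (2.3) can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the authors' implementation: pre-multiplying by M_B is carried out as within-gene centering, β and ϑ are obtained from one joint least squares fit (which coincides with the partialled-out formula (2.3) when the centered design has full column rank), a power-series basis without a constant term stands in for ζ(u), and the toy data, seed and function names are all hypothetical.

```python
import numpy as np

def center_within_groups(a, groups):
    """Apply M_B = I - B (B^T B)^{-1} B^T, i.e. subtract each gene's mean."""
    a = np.asarray(a, dtype=float)
    counts = np.bincount(groups)
    if a.ndim == 1:
        return a - (np.bincount(groups, weights=a) / counts)[groups]
    sums = np.zeros((counts.size, a.shape[1]))
    np.add.at(sums, groups, a)            # column-wise sums per gene
    return a - (sums / counts[:, None])[groups]

def first_stage(y, x, xi, groups):
    """Series estimates (beta_tilde, theta_tilde) from the centered model (2.2)."""
    yc = center_within_groups(y, groups)
    design = np.hstack([center_within_groups(x, groups),
                        center_within_groups(xi, groups)])
    # lstsq returns the minimum-norm solution, one choice of generalized inverse
    coef, *_ = np.linalg.lstsq(design, yc, rcond=None)
    p = x.shape[1]
    return coef[:p], coef[p:]

# Toy data: every value below is hypothetical
rng = np.random.default_rng(0)
G, I = 300, 3
groups = np.repeat(np.arange(G), I)       # gene label of each observation
n = G * I
x = rng.normal(size=(n, 2))
u = rng.uniform(0.0, 1.0, size=n)
xi = np.column_stack([u, u**2, u**3])     # power-series basis, no constant term
beta_true = np.array([1.0, -0.5])
alpha = rng.normal(size=G)                # gene effects, removed by centering
y = alpha[groups] + x @ beta_true + np.sin(2.0 * u) + 0.1 * rng.normal(size=n)

beta_tilde, theta_tilde = first_stage(y, x, xi, groups)
```

The constant term is left out of the basis because M_B annihilates it, so the level of m(·) is not identified from the centered model alone.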

For convenience, let $ι (g, i) = \sum_{g_{1} = 1}^{g - 1} I_{g_{1}} + i$ and $Q (g, i) = {(I_{g} - 1)}^{- 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} (Y_{ι (g, i_{1})} - X_{ι (g, i_{1})}^{T} {\tilde{β}}_{n} - {\tilde{m}}_{n} (U_{ι (g, i_{1})}))$ with g = 1,…,G and i = 1,…, I_g. Subtracting Q(g, i) from both sides of model (1.2), we have

Y_{ι (g, i)} - Q (g, i) = X_{ι (g, i)}^{T} β + m (U_{ι (g, i)}) + ε_{ι (g, i)} + α_{g} - Q (g, i) .

According to Lemma 1 and Lemma 2 in the appendix, we have

\begin{matrix} α_{g} - Q (g, i) = \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} ε_{ι (g, i_{1})} + O_{P} (k_{n} / \sqrt{n} + k_{n}^{- 3 / 2}) . \end{matrix}

Therefore, if we denote $Y_{ι (g, i)}^{*} = Y_{ι (g, i)} - Q (g, i)$ we have

Y_{ι (g, i)}^{*} = X_{ι (g, i)}^{T} β + m (U_{ι (g, i)}) + ε_{ι (g, i)} - \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} ε_{ι (g, i_{1})} + O_{P} (k_{n} / \sqrt{n} + k_{n}^{- 3 / 2}) .

(2.4)
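The quantity Q(g, i) is a leave-one-out average of the first-stage residuals within gene g, so it can be computed without an explicit loop. The sketch below is illustrative only: the function name and the toy residuals are hypothetical, and the residual vector is assumed to hold Y − X^T β̃_n − m̃_n(U) in the ι(g, i) ordering.

```python
import numpy as np

def leave_one_out_group_mean(resid, groups):
    """Q(g, i): mean of the other first-stage residuals of the same gene."""
    counts = np.bincount(groups)
    sums = np.bincount(groups, weights=resid)
    # Requires I_g >= 2 for every gene, as assumed throughout the paper
    return (sums[groups] - resid) / (counts[groups] - 1)

# Hypothetical residuals for two genes with I_1 = 2 and I_2 = 3
resid = np.array([1.0, 3.0, 2.0, 4.0, 9.0])
groups = np.array([0, 0, 1, 1, 1])
Q = leave_one_out_group_mean(resid, groups)
```

The transformed response is then Y* = Y − Q, element by element.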

It is easy to see that (2.4) is an ordinary semilinear regression model. The ordinary profile least squares and local polynomial estimations can be used to estimate β and m(·). The detail is as follows. For any given β, (2.4) can be written as

Y_{ι (g, i)}^{*} - X_{ι (g, i)}^{T} β = m (U_{ι (g, i)}) + ε_{ι (g, i)}^{* *}, g = 1, \dots, G, i = 1, \dots, I_{g}

(2.5)

where $ε_{ι (g, i)}^{* *} = ε_{ι (g, i)} + α_{g} - Q (g, i)$ . This transforms the semilinear regression model into the usual nonparametric model. Applying a local linear regression technique in a small neighborhood of u₀, one can approximate m(u) locally by a linear function

m (u) \approx m (u_{0}) + m^{'} (u_{0}) (u - u_{0}) \equiv a + b (u - u_{0})

with m′ (u) = ∂m/∂u. This leads to the following weighted local least squares problem: find a, b to minimize

\sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} \{Y_{ι (g, i)}^{*} - X_{ι (g, i)}^{T} β - a - b (U_{ι (g, i)} - u_{0})\}^{2} K_{h} (U_{ι (g, i)} - u_{0}),

(2.6)

where K(·) is a kernel function, h is a bandwidth and K_h(·) = K(·/h)/h. The solution to minimizing the sum in (2.6) is given by

(\hat{a} (u), h \hat{b} (u))^{T} = {(D_{u}^{T} W_{u} D_{u})}^{- 1} D_{u}^{T} W_{u} (Y^{*} - X β),

(2.7)

where

X = (\begin{matrix} X_{1}^{T} \\ ⋮ \\ X_{n}^{T} \end{matrix}) = (\begin{matrix} X_{11} & \dots & X_{1 p} \\ ⋮ & ⋱ & ⋮ \\ X_{n 1} & \dots & X_{n p} \end{matrix}), D_{u} = (\begin{matrix} 1 & (U_{1} - u) / h \\ ⋮ & ⋮ \\ 1 & (U_{n} - u) / h \end{matrix}),

and

W_{u} = diag (K_{h} (U_{1} - u), \dots, K_{h} (U_{n} - u)) .
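The closed-form solution (2.7) translates directly into a weighted least squares solve at each evaluation point. The following sketch assumes an Epanechnikov kernel (a symmetric density with compact support); the function names and the toy data are hypothetical, and the input y plays the role of Y* − Xβ for a given β.

```python
import numpy as np

def epanechnikov(t):
    """Symmetric density with compact support [-1, 1]."""
    return 0.75 * np.clip(1.0 - t**2, 0.0, None)

def local_linear(u0, u, y, h):
    """Solve (2.6) at u0; returns (a_hat, b_hat), estimates of m(u0) and m'(u0)."""
    t = (u - u0) / h
    w = epanechnikov(t) / h                   # K_h(U_i - u0)
    d = np.column_stack([np.ones_like(t), t]) # rows (1, (U_i - u0)/h) of D_u
    dtw = d.T * w                             # D_u^T W_u
    a_hat, hb_hat = np.linalg.solve(dtw @ d, dtw @ y)
    return a_hat, hb_hat / h                  # second entry of (2.7) is h * b_hat

# A local linear fit reproduces an exactly linear function
u = np.linspace(0.0, 1.0, 101)
y = 2.0 + 3.0 * u
a_hat, b_hat = local_linear(0.5, u, y, h=0.2)
```

Scaling the second column of D_u by 1/h, as in the paper, keeps the 2 × 2 system well conditioned when h is small.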

Replacing m(·) by â(·) in (2.5) results in the following model

{\hat{Y}}_{ι (g, i)}^{*} = {\hat{X}}_{ι (g, i)}^{T} β + ε_{ι (g, i)}^{* * *}, g = 1, \dots, G and i = 1, \dots, I_{g},

(2.8)

where

\begin{matrix} {\hat{Y}}_{ι (g, i)}^{*} = Y_{ι (g, i)}^{*} - (1, 0) {(D_{U_{ι (g, i)}}^{T} W_{U_{ι (g, i)}} D_{U_{ι (g, i)}})}^{- 1} D_{U_{ι (g, i)}}^{T} W_{U_{ι (g, i)}} Y^{*}, \\ {\hat{X}}_{ι (g, i)} = X_{ι (g, i)} - (1, 0) {(D_{U_{ι (g, i)}}^{T} W_{U_{ι (g, i)}} D_{U_{ι (g, i)}})}^{- 1} D_{U_{ι (g, i)}}^{T} W_{U_{ι (g, i)}} X \end{matrix}

and $ε_{ι (g, i)}^{* * *} = ε_{ι (g, i)}^{* *} + \bar{m} (U_{ι (g, i)}) - {\bar{ε}}_{ι (g, i)}^{* *}$ with

\begin{matrix} \bar{m} (U_{ι (g, i)}) = m (U_{ι (g, i)}) - (1, 0) {(D_{U_{ι (g, i)}}^{T} W_{U_{ι (g, i)}} D_{U_{ι (g, i)}})}^{- 1} D_{U_{ι (g, i)}}^{T} W_{U_{ι (g, i)}} M, \\ {\bar{ε}}_{ι (g, i)}^{* *} = (1, 0) {(D_{U_{ι (g, i)}}^{T} W_{U_{ι (g, i)}} D_{U_{ι (g, i)}})}^{- 1} D_{U_{ι (g, i)}}^{T} W_{U_{ι (g, i)}} ε^{* *}, \\ M = {(m (U_{1}), \dots, m (U_{n}))}^{T} and ε^{* *} = {(ε_{1}^{* *}, \dots, ε_{n}^{* *})}^{T} . \end{matrix}

Taking $ε_{ι (g, i)}^{* * *}$ as residuals and applying the least squares method to (2.8), we obtain a two-stage estimator of β as

{\hat{β}}_{n} = {({\hat{X}}^{T} \hat{X})}^{- 1} {\hat{X}}^{T} {\hat{Y}}^{*},

(2.9)

where I_n is an n × n identity matrix,

S = (\begin{matrix} (1, 0) {(D_{U_{1}}^{T} W_{U_{1}} D_{U_{1}})}^{- 1} D_{U_{1}}^{T} W_{U_{1}} \\ ⋮ \\ (1, 0) {(D_{U_{n}}^{T} W_{U_{n}} D_{U_{n}})}^{- 1} D_{U_{n}}^{T} W_{U_{n}} \end{matrix}), Y^{*} = (\begin{matrix} Y_{1}^{*} \\ ⋮ \\ Y_{n}^{*} \end{matrix}), \begin{matrix} \hat{X} = (I_{n} - S) X, \\ {\hat{Y}}^{*} = (I_{n} - S) Y^{*} . \end{matrix}

Correspondingly, a two-stage estimator of m(·) is

{\hat{m}}_{n} (u) = (1, 0) {(D_{u}^{T} W_{u} D_{u})}^{- 1} D_{u}^{T} W_{u} (Y^{*} - X {\hat{β}}_{n}) .

(2.10)
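The second stage in (2.9) and (2.10) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: a Gaussian kernel is used for simplicity (its normalizing constant cancels in the smoother), the bandwidth is fixed by hand rather than chosen by a plug-in rule, m̂_n is evaluated only at the observed U_i, and the toy data stand in for (Y*, X, U) after the first-stage transformation.

```python
import numpy as np

def smoother_matrix(u, h):
    """Matrix S whose i-th row is (1, 0)(D^T W D)^{-1} D^T W evaluated at U_i."""
    n = u.size
    s = np.empty((n, n))
    for i in range(n):
        t = (u - u[i]) / h
        w = np.exp(-0.5 * t**2)               # Gaussian kernel weights
        d = np.column_stack([np.ones(n), t])
        dtw = d.T * w
        s[i] = np.linalg.solve(dtw @ d, dtw)[0]
    return s

def second_stage(y_star, x, u, h):
    """Profile least squares estimate of beta (2.9) and fitted m at each U_i (2.10)."""
    s = smoother_matrix(u, h)
    x_hat = x - s @ x                         # (I_n - S) X
    y_hat = y_star - s @ y_star               # (I_n - S) Y*
    beta_hat, *_ = np.linalg.lstsq(x_hat, y_hat, rcond=None)
    m_hat = s @ (y_star - x @ beta_hat)
    return beta_hat, m_hat

# Toy semilinear data: every value below is hypothetical
rng = np.random.default_rng(1)
n = 400
x = rng.normal(size=(n, 2))
u = rng.uniform(0.0, 1.0, size=n)
beta_true = np.array([1.0, -0.5])
m_true = np.sin(2.0 * u)
y_star = x @ beta_true + m_true + 0.1 * rng.normal(size=n)

beta_hat, m_hat = second_stage(y_star, x, u, h=0.1)
```

Forming the full n × n matrix S is convenient for small n; for array-scale data one would evaluate the smoother on a grid or use binning instead.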

The error variance $σ^{2} = Var (ε_{1})$ describes the noise level. Apart from its intrinsic interest as a parameter of the model, its estimation is essential in constructing confidence regions, model-based tests, model selection procedures, signal-to-noise ratio determination, and so on. We propose the following estimate of σ², in which M̂ = (m̂_n(U₁),…, m̂_n(U_n))^T denotes the vector of fitted values of the nonparametric component:

{\hat{σ}}_{n}^{2} = \frac{1}{n + \sum_{g = 1}^{G} I_{g} / (I_{g} - 1)} {(Y^{*} - X {\hat{β}}_{n} - \hat{M})}^{T} (Y^{*} - X {\hat{β}}_{n} - \hat{M}) .
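The normalizing constant in this estimator can be motivated by a short variance calculation. The following worked step is a sketch that ignores the estimation error in β̂_n and m̂_n and the remainder term in (2.4), and uses only the independence of the errors with common variance σ².

```latex
\begin{aligned}
\operatorname{Var}\Big( ε_{ι(g,i)} - \frac{1}{I_g - 1} \sum_{i_1 \neq i} ε_{ι(g,i_1)} \Big)
  &= σ^2 \Big\{ 1 + \frac{1}{I_g - 1} \Big\} = σ^2 \frac{I_g}{I_g - 1}, \\
\sum_{g=1}^{G} \sum_{i=1}^{I_g} σ^2 \frac{I_g}{I_g - 1}
  &= σ^2 \sum_{g=1}^{G} \frac{I_g^2}{I_g - 1}
   = σ^2 \Big\{ n + \sum_{g=1}^{G} \frac{I_g}{I_g - 1} \Big\}.
\end{aligned}
```

Dividing the residual sum of squares by the quantity in braces therefore centers the estimator at σ² to first order, and the same factor appears in the asymptotic variances of Theorems 1 and 2.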

In the next section, we will establish the asymptotic properties of β̂_n,m̂_n(·) and ${\hat{σ}}_{n}^{2}$ .

3 Asymptotic Normality of the Two-Stage Estimators

To present the asymptotic properties of β̂_n, m̂_n(·) and ${\hat{σ}}_{n}^{2}$ , we make the following assumptions.

Assumption 1

(X_i,U_i, ε_i) are independent and identically distributed as (X₁,U₁, ε₁).

Assumption 2

(i) For every k_n there is a nonsingular matrix M such that for Mζ (u), the smallest eigenvalue of E[M(ζ(U₁) − Eζ (U₁))]^⊗2 is bounded away from zero uniformly in k_n.

(ii) There is a sequence of constants δ₀(k_n) satisfying sup_u∈𝓊 ║Mζ (u)║ ≤ δ₀(k_n) and k_n satisfies that (δ₀(k_n))²k_n/n → 0 as n → ∞, where 𝓊 is the support of U₁, and for a matrix A, ║A║ = {tr(AA^T)}^{1/2} denotes the Euclidean norm of A.

Assumption 3

(i) m(u) and h_j(u) = E(X_j1|U₁ = u) are twice continuously differentiable on 𝓊 where j = 1,…, p.

(ii) For m(u) or h_j(u), j = 1,…, p, there exists ϑ = (ϑ₁,…,ϑ_{k_n})^T such that ${sup}_{u \in 𝓊} | g (u) - ϑ^{T} ζ (u) | = O (k_{n}^{- 2})$ with g(u) = m(u) or h_j(u).

(iii) $k_{n} = c_{k} n^{4 / 15 + ν}$ for some constant c_k satisfying 0 < c_k < ∞ and some ν satisfying 0 ≤ ν < 1/30.

Assumption 4

The function K(·) is a symmetric density function with compact support.

Assumption 5

$h = c_{h} n^{- 1 / 5}$ for some constant c_h satisfying 0 < c_h < ∞.

Remark 1

Assumption 2 is a standard assumption in series estimation methods. Assumption 3 says that the uniform approximation error to the function shrinks at the rate $k_{n}^{- 2}$ . Assumption 2 and Assumption 3 are not the simplest conditions, but it is known that many series functions satisfy them, e.g. power series and splines. Assumption 4 and Assumption 5 are standard assumptions in kernel or local polynomial estimation.

Under the above assumptions, the following theorems provide the asymptotic properties of β̂_n, m̂_n(·) and ${\hat{σ}}_{n}^{2}$ .

Theorem 1

Suppose that Assumption 1 to Assumption 5 hold. Then it holds that

\sqrt{n} ({\hat{β}}_{n} - β) \overset{D}{\to} N (0, \{lim_{n \to \infty} \frac{1}{n} \sum_{g = 1}^{G} I_{g}^{2} {(I_{g} - 1)}^{- 1}\} σ^{2} Σ^{- 1}) as n \to \infty,

where $Σ = E (Π_{1} Π_{1}^{T})$ and Π₁ = X₁ − E(X₁|U₁).

Theorem 2

Suppose that Assumption 1 to Assumption 5 hold. Then it holds that

\sqrt{n h} [{\hat{m}}_{n} (u) - m (u) - \frac{h^{2}}{2} \frac{μ_{2}^{2} - μ_{1} μ_{3}}{μ_{2} - μ_{1}^{2}} m^{″} (u)] \overset{D}{\to} N (0, ζ (u)) as n \to \infty

provided that p(u) ≠ 0, where $μ_{j} = \int_{- \infty}^{\infty} u^{j} K (u) d u$, $ν_{j} = \int_{- \infty}^{\infty} u^{j} K^{2} (u) d u$,

ζ (u) = \frac{\{{lim}_{n \to \infty} n^{- 1} \sum_{g = 1}^{G} I_{g}^{2} / (I_{g} - 1)\} σ^{2} (α_{0}^{2} ν_{0} + 2 α_{0} α_{1} ν_{1} + α_{1}^{2} ν_{2})}{p (u)},

with $α_{0} = - μ_{2} / (μ_{2} - μ_{1}^{2})$ and $α_{1} = μ_{1} / (μ_{2} - μ_{1}^{2})$, and p(·) is the density function of U₁.

Remark 2

According to Theorem 1, when I_g ≡ I the asymptotic covariance matrix of β̂_n reduces to {I/(I − 1)}σ²Σ⁻¹, i.e., the semiparametric efficiency bound (Fan, Peng and Huang 2005).
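The reduction stated in this remark follows from a one-line calculation, written out here for convenience with n = GI in the balanced case.

```latex
\frac{1}{n} \sum_{g=1}^{G} \frac{I_g^2}{I_g - 1}
  = \frac{1}{G I} \cdot \frac{G I^2}{I - 1}
  = \frac{I}{I - 1},
\qquad \text{e.g. } I = 2 \Rightarrow 2, \quad I = 4 \Rightarrow \tfrac{4}{3}.
```

The factor decreases toward 1 as the number of replications grows, reflecting the shrinking cost of eliminating the gene effects α_g.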

Theorem 3

Suppose that Assumption 1 to Assumption 5 hold. If $E ε_{1}^{4} < \infty$ holds, then

\sqrt{n} ({\hat{σ}}_{n}^{2} - σ^{2}) \overset{D}{\to} N (0, κ) as n \to \infty,

where $κ = θ_{1}^{0} E (ε_{1}^{4}) + θ_{2}^{0} σ^{4}$ with $τ (n) = n / \{n + \sum_{g = 1}^{G} I_{g} / {(I_{g} - 1)}^{2}\}$ ,

θ_{1}^{0} = lim_{n \to \infty} τ (n) \sum_{g = 1}^{G} \{1 + \frac{1}{{(I_{g} - 1)}^{2}} + \frac{2}{(I_{g} - 1)}\} I_{g}

and

θ_{2}^{0} = lim_{n \to \infty} τ (n) \sum_{g = 1}^{G} \frac{1}{{(I_{g} - 1)}^{3}} (- I_{g}^{4} + 2 I_{g}^{3} + 6 I_{g}^{2} + I_{g}) .

Further, we define

\begin{matrix} {\hat{Σ}}_{n} = \sum_{g = 1}^{G} I_{g}^{2} {(I_{g} - 1)}^{- 1} {\hat{σ}}_{n}^{2} {({\hat{X}}^{T} \hat{X})}^{- 1}, \\ {\hat{ψ}}_{ι (g, i)} = (Y_{ι (g, i)} - X_{ι (g, i)}^{T} {\hat{β}}_{n} - {\hat{m}}_{n} (U_{ι (g, i)})) - (Y_{ι (g, i - 1)} - X_{ι (g, i - 1)}^{T} {\hat{β}}_{n} - {\hat{m}}_{n} (U_{ι (g, i - 1)})) \end{matrix}

for g = 1,…,G, i = 2,…, I_g,

\begin{matrix} θ_{1} = τ (n) \sum_{g = 1}^{G} \{1 + \frac{1}{{(I_{g} - 1)}^{2}} + \frac{2}{(I_{g} - 1)}\} I_{g}, \\ θ_{2} = τ (n) \sum_{g = 1}^{G} \frac{1}{{(I_{g} - 1)}^{3}} (- I_{g}^{4} + 2 I_{g}^{3} + 6 I_{g}^{2} + I_{g}), θ_{3} = \sum_{g = 1}^{G} (4 I_{g} - 2), \\ θ_{4} = \sum_{g = 1}^{G} \{(I_{g} - 1) (I_{g} + 2) + 4 I_{g}\} and {\hat{κ}}_{n} = (θ_{1} / θ_{3}) \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} {\hat{ψ}}_{ι (g, i)}^{4} + \{θ_{2} - (θ_{1} θ_{4}) / θ_{3}\} {\hat{σ}}_{n}^{4} . \end{matrix}

The next theorem shows that Σ̂_n and κ̂_n are consistent estimators of $\{{lim}_{n \to \infty} n^{- 1} \sum_{g = 1}^{G} I_{g}^{2} {(I_{g} - 1)}^{- 1}\} σ^{2} Σ^{- 1}$ and κ, respectively.

Theorem 4

Suppose that Assumption 1 to Assumption 5 hold. If $E ε_{1}^{4} < \infty$ holds, then

{\hat{Σ}}_{n} \to_{p} \{lim_{n \to \infty} n^{- 1} \sum_{g = 1}^{G} I_{g}^{2} {(I_{g} - 1)}^{- 1}\} σ^{2} Σ^{- 1} and {\hat{κ}}_{n} \to_{p} κ as n \to \infty .

4 Two-stage Estimation for the Aggregated SLIM

So far, the intensity effect and the gene effect have been estimated using the information within one slide. Therefore, the arrays are allowed to have different gene effects, namely, α_g can be slide-dependent. This is reasonable when the samples are drawn from different subjects. However, in many practical situations, the samples may come from the same subject. In those cases, it is natural to assume that the gene effects are the same across arrays and that the information from other arrays can be aggregated. This assumption is helpful for improving the precision and for assessing the quality of an array using the coefficient of variation (Tseng, et al. 2001). Therefore, Fan, Peng and Huang (2005) further proposed an aggregated SLIM. This kind of aggregation idea also appears in the work of Huang, Wang and Zhang (2003) for a very different semiparametric model. The aggregated SLIM is defined as

Y_{i j} = B_{i j}^{T} α + X_{i j}^{T} β_{j} + m_{j} (U_{i j}) + ε_{i j}, i = 1, \dots, n, j = 1, \dots, J .

(4.1)

where Y_j=(Y_1j,…,Y_nj)^T, B_j = (B_1j , …,B_nj)^T , X_j = (X_1j ,… ,X_nj)^T , U_j = (U_1j ,…,U_nj)^T , α = (α₁, …, α_G)^T , β_j = (β_1j , …, β_{p_j j})^T and ε_j = (ε_1j ,…, ε_nj)^T.

Fan, Peng and Huang (2005) proposed an aggregated profile least squares (APLS) estimator for $β = {(β_{1}^{T}, \dots, β_{J}^{T})}^{T}$ and described an estimation procedure for the nonparametric components. Here we propose an aggregated two-stage procedure.

4.1 Estimating the parametric component

We will investigate two cases. One is that X_{ij₁} and X_{ij₂} are independent and the other is that X_{ij₁} and X_{ij₂} are dependent, where j₁ ≠ j₂.

Case 1

Suppose that β̃_jn and m̃_jn(·) are series estimators of β_j and m_j(·), respectively, each based on the individual equation for array j. Let

\nabla_{ι (g, i), j} = Y_{ι (g, i), j} - X_{ι (g, i), j}^{T} {\tilde{β}}_{j n} - {\tilde{m}}_{j n} (U_{ι (g, i), j}) .

For fixed j, subtracting

{(I_{g} J - 1)}^{- 1} \{\sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} \nabla_{ι (g, i_{1}), j} + \sum_{j_{1} = 1, j_{1} \neq j}^{J} \sum_{i_{1} = 1}^{I_{g}} \nabla_{ι (g, i_{1}), j_{1}}\}

from both sides of model (4.1), we have

\begin{matrix} Y_{ι (g, i), j} - \frac{1}{I_{g} J - 1} \{\sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} \nabla_{ι (g, i_{1}), j} + \sum_{j_{1} = 1, j_{1} \neq j}^{J} \sum_{i_{1} = 1}^{I_{g}} \nabla_{ι (g, i_{1}), j_{1}}\} \\ = & X_{ι (g, i), j}^{T} β_{j} + m_{j} (U_{ι (g, i), j}) + ε_{ι (g, i), j} + α_{g} - \frac{1}{I_{g} J - 1} \{\sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} \nabla_{ι (g, i_{1}), j} + \sum_{j_{1} = 1, j_{1} \neq j}^{J} \sum_{i_{1} = 1}^{I_{g}} \nabla_{ι (g, i_{1}), j_{1}}\} \\ = & X_{ι (g, i), j}^{T} β_{j} + m_{j} (U_{ι (g, i), j}) + ε_{ι (g, i), j} - \frac{1}{I_{g} J - 1} \{\sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} ε_{ι (g, i_{1}), j} + \sum_{j_{1} = 1, j_{1} \neq j}^{J} \sum_{i_{1} = 1}^{I_{g}} ε_{ι (g, i_{1}), j_{1}}\} \\ + O_{p} (max_{1 \leq j \leq J} k_{j n} / \sqrt{n} + max_{1 \leq j \leq J} k_{j n}^{- 3 / 2}) . \end{matrix}

Therefore, applying the usual profile least squares estimation we can obtain an aggregated two-stage estimator of β_j as

{\hat{β}}_{j n}^{(1) A} = {({\hat{X}}_{j}^{T} {\hat{X}}_{j})}^{- 1} {\hat{X}}_{j}^{T} {\hat{Y}}_{j}^{(1) *},

where S_j and X̂_j have the same definitions as S and X̂, the ι(g, i)th element of $Y_{j}^{(1) *}$ is $Y_{ι (g, i), j} - {(I_{g} J - 1)}^{- 1} \{\sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} \nabla_{ι (g, i_{1}), j} + \sum_{j_{1} = 1, j_{1} \neq j}^{J} \sum_{i_{1} = 1}^{I_{g}} \nabla_{ι (g, i_{1}), j_{1}}\}$, and ${\hat{Y}}_{j}^{(1) *} = (I_{n} - S_{j}) Y_{j}^{(1) *}$ .

Case 2

For fixed j, subtracting $\{I_{g} (J - 1)\}^{- 1} \sum_{j_{1} = 1}^{J} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} \nabla_{ι (g, i_{1}), j_{1}}$ from both sides of model (4.1), we have

\begin{matrix} Y_{ι (g, i), j} - \frac{1}{I_{g} (J - 1)} \sum_{j_{1} = 1}^{J} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} \nabla_{ι (g, i_{1}), j_{1}} = X_{ι (g, i), j}^{T} β_{j} + m_{j} (U_{ι (g, i), j}) \\ + ε_{ι (g, i), j} - \frac{1}{I_{g} (J - 1)} \sum_{j_{1} = 1}^{J} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} ε_{ι (g, i_{1}), j_{1}} + O_{p} (max_{1 \leq j \leq J} k_{j n} / \sqrt{n} + max_{1 \leq j \leq J} k_{j n}^{- 3 / 2}) . \end{matrix}

Therefore, applying the usual profile least squares estimation we can obtain an aggregated two-stage estimator of β_j as

{\hat{β}}_{j n}^{(2) A} = {({\hat{X}}_{j}^{T} {\hat{X}}_{j})}^{- 1} {\hat{X}}_{j}^{T} {\hat{Y}}_{j}^{(2) *},

where the ι(g, i)th element of $Y_{j}^{(2) *}$ is $Y_{ι (g, i), j} - \{I_{g} (J - 1)\}^{- 1} \sum_{j_{1} = 1}^{J} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} \nabla_{ι (g, i_{1}), j_{1}}$ and, as in Case 1, ${\hat{Y}}_{j}^{(2) *} = (I_{n} - S_{j}) Y_{j}^{(2) *}$ .

For ${\hat{β}}_{j n}^{(1) A}$ and ${\hat{β}}_{j n}^{(2) A}$ we have the following asymptotic properties.

Theorem 5

Under some regularity conditions (same as Assumption 1 to Assumption 5) it holds that

\sqrt{n} ({\hat{β}}_{j n}^{(1) A} - β_{j}) \overset{D}{\to} N (0, lim_{n \to \infty} n^{- 1} \sum_{g = 1}^{G} \frac{J I_{g}^{2}}{J I_{g} - 1} σ^{2} Σ_{j}^{- 1}) as n \to \infty

where $Σ_{j} = E (Π_{1 j} Π_{1 j}^{T})$ and $Π_{1 j} = X_{1 j} - E (X_{1 j} | U_{1 j})$ .

Theorem 6

Under some regularity conditions (same as Assumption 1 to Assumption 5) it holds that

\sqrt{n} ({\hat{β}}_{j n}^{(2) A} - β_{j}) \overset{D}{\to} N (0, lim_{n \to \infty} n^{- 1} \sum_{g = 1}^{G} \frac{J (I_{g} - 1) + 1}{J (I_{g} - 1)} σ^{2} Σ_{j}^{- 1}) as n \to \infty

where Σ_j is defined in Theorem 5.

Remark 3

From Theorems 5 and 6 we can see that the aggregated information can be used to improve the two-stage estimators of the parametric components, and that the degree of improvement depends on whether X_{ij₁} and X_{ij₂} are independent or dependent. Moreover, when I_g ≡ I, ${lim}_{n \to \infty} n^{- 1} \sum_{g = 1}^{G} \frac{J I_{g}^{2}}{J I_{g} - 1}$ reduces to JI/(JI − 1). Thus, according to Fan, Peng and Huang (2005), our aggregated two-stage estimator has the same asymptotic covariance as that of the aggregated PLS estimator.
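To make the size of the improvement in Case 1 concrete, the balanced-case variance factors of Theorem 1 and Theorem 5 can be compared directly. The numerical values below are illustrative choices of I and J, not results reported in the paper.

```latex
\frac{J I}{J I - 1} < \frac{I}{I - 1} \quad \text{for } J \ge 2,
\qquad \text{e.g. } I = 2,\ J = 4: \quad
\frac{J I}{J I - 1} = \frac{8}{7} \approx 1.14
\quad \text{versus} \quad \frac{I}{I - 1} = 2 .
```

The inequality holds because x/(x − 1) is decreasing for x > 1, so pooling J arrays acts like increasing the number of replications from I to JI when eliminating the gene effects.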

4.2 Estimating the nonparametric components

We propose an aggregated local linear estimator of m_j(·) for Cases 1 and 2. In Case 1, it has the form

{\hat{m}}_{j n}^{(1) A} (u) = (1, 0) {(D_{j u}^{T} W_{j u} D_{j u})}^{- 1} D_{j u}^{T} W_{j u} (Y_{j}^{(1) *} - X_{j} {\hat{β}}_{j n}^{(1) A}) .

In Case 2, it has the form

{\hat{m}}_{j n}^{(2) A} (u) = (1, 0) {(D_{j u}^{T} W_{j u} D_{j u})}^{- 1} D_{j u}^{T} W_{j u} (Y_{j}^{(2) *} - X_{j} {\hat{β}}_{j n}^{(2) A}) .

For ${\hat{m}}_{j n}^{(1) A} (u)$ and ${\hat{m}}_{j n}^{(2) A} (u)$, we have the following asymptotic properties.

Theorem 7

Under some regularity conditions (same as Assumption 1 to Assumption 5) it holds that

{\hat{m}}_{j n}^{(1) A} (u) - {\hat{m}}_{j n}^{(2) A} (u) = o_{p} (h^{2} + \frac{1}{\sqrt{n h}}) .

Further,

\sqrt{n h} [{\hat{m}}_{j n}^{(1) A} (u) - m_{j} (u) - \frac{h^{2}}{2} \frac{μ_{2}^{2} - μ_{1} μ_{3}}{μ_{2} - μ_{1}^{2}} m_{j}^{″} (u)] \overset{D}{\to} N (0, ζ_{j}^{A} (u)) as n \to \infty

provided that p_j(u) ≠ 0, where

ζ_{j}^{A} (u) = \frac{\{{lim}_{n \to \infty} n^{- 1} \sum_{g = 1}^{G} \frac{J I_{g}^{2}}{J I_{g} - 1}\} σ^{2} (α_{0}^{2} ν_{0} + 2 α_{0} α_{1} ν_{1} + α_{1}^{2} ν_{2})}{p_{j} (u)},

and p_j(u) is the density function of U_1j.

Remark 4

From Theorem 7, we can see that taking the aggregated information into account can improve the estimate of the nonparametric component as well.

5 Simulation Studies

In this section, we conduct some simulations to show the finite sample performance of the estimators proposed in the previous sections. In order to compare our estimators with those in Fan, Peng and Huang (2005), we take Example 1 of Fan, Peng and Huang (2005).

Example 1

We select G = 100, 200, 400, 800 and I = 2, 3, 4. For each pair (G, I), we simulate 200 datasets from model (1.2). The details of the simulation scheme for this example are as follows:

α_g: The expression levels of the genes are generated from the standard double-exponential distribution.
β: For the row effects, first generate $\{β_{i}^{'}, i = 1, \dots, 4\}$ from N(0, 0.5), then set $β_{i} = β_{i}^{'} - \bar{β′}$ , which guarantees that $\sum_{i = 1}^{4} β_{i} = 0$. The column effects are generated in the same way.
U: The intensity is generated from a mixture distribution. We generate u from the probability density 0.0004(u−6)³I(6 < u < 16) with probability 0.7 and from the uniform distribution over [6, 16] with probability 0.3.
m(·): Set the function $m (u) = \sqrt{5} (sin (u) - 0.2854)$ , whose expectation is 0.
X: For each given gene, its associated block is assigned at random to one of 32 print-tip blocks.
ε: ε_gi is generated from the standard normal distribution.
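The scheme above can be sketched as a data generator. Several details are assumptions made only for illustration and are not stated in the text: N(0, 0.5) is read as variance 0.5, the 32 print-tip blocks are laid out as 4 rows by 8 columns, the block of each observation is drawn independently, and the function name and seed are hypothetical. The first mixture component is sampled by inverting its distribution function F(u) = 0.0001(u − 6)⁴ on (6, 16).

```python
import numpy as np

def simulate_slim(G, I, rng, n_rows=4, n_cols=8):
    """Draw one dataset from model (1.1) with I replications per gene."""
    n = G * I
    genes = np.repeat(np.arange(G), I)
    alpha = rng.laplace(0.0, 1.0, size=G)           # standard double-exponential

    def centered_effects(k):
        raw = rng.normal(0.0, np.sqrt(0.5), size=k)  # assumes 0.5 is the variance
        return raw - raw.mean()                      # enforces the sum-to-zero constraint

    beta_row = centered_effects(n_rows)
    gamma_col = centered_effects(n_cols)

    # Mixture for U: density 0.0004 (u - 6)^3 w.p. 0.7, Uniform[6, 16] w.p. 0.3
    from_cubic = rng.random(n) < 0.7
    u_cubic = 6.0 + 10.0 * rng.random(n) ** 0.25     # inverse of F(u) = 0.0001 (u - 6)^4
    u_unif = rng.uniform(6.0, 16.0, size=n)
    u = np.where(from_cubic, u_cubic, u_unif)

    m = np.sqrt(5.0) * (np.sin(u) - 0.2854)
    rows = rng.integers(0, n_rows, size=n)           # random print-tip block
    cols = rng.integers(0, n_cols, size=n)
    eps = rng.normal(size=n)
    y = alpha[genes] + beta_row[rows] + gamma_col[cols] + m + eps
    return {"y": y, "u": u, "genes": genes, "rows": rows, "cols": cols,
            "alpha": alpha, "beta_row": beta_row, "gamma_col": gamma_col}

data = simulate_slim(G=100, I=2, rng=np.random.default_rng(1))
```

The row and column labels can be expanded into indicator columns to form the design matrix X of model (1.2).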

For the proposed estimation, in the first stage, we use a cubic B-spline basis function defined by

ζ (u | u^{0}, \dots, u^{4}) = \frac{1}{3!} \sum_{j = 0}^{4} {(- 1)}^{j} (\begin{matrix} 4 \\ j \end{matrix}) {[max (0, u - u^{j})]}^{3},

where u⁰,…, u⁴ are the evenly-spaced design knots. In the second stage, we take the Gaussian kernel, i.e.

K_{h} (u) = \frac{1}{h \sqrt{2 π}} exp (- u^{2} / (2 h^{2})),

and the bandwidth is selected by the plug-in method. The performance of the estimators is assessed by the mean squared errors (MSEs). The results are summarized in Table 1 and Figure 1.
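The truncated-power formula for the cubic B-spline basis function given above can be evaluated directly. The sketch below assumes unit-spaced knots 0, 1, …, 4 purely for illustration (the paper only states that the design knots are evenly spaced), and the function name is hypothetical.

```python
from math import comb, factorial

def cubic_bspline(u, knots):
    """Evaluate (1/3!) * sum_j (-1)^j C(4, j) [max(0, u - u^j)]^3 on five knots."""
    total = 0.0
    for j, knot in enumerate(knots):
        total += (-1) ** j * comb(4, j) * max(0.0, u - knot) ** 3
    return total / factorial(3)

knots = [0.0, 1.0, 2.0, 3.0, 4.0]   # assumed unit spacing
values = [cubic_bspline(u, knots) for u in (1.0, 2.0, 3.0, 5.0)]
```

With unit spacing the function is the familiar bell-shaped cubic B-spline: it peaks at the middle knot and vanishes outside the knot range, because the alternating sum is a fourth-order difference of a cubic polynomial.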

Table 1.

MSEs of Example 1 (non-aggregation). Fan, Peng and Huang (2005)’s estimation and the proposed estimation

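Estimation	Component	I	G=100	G=200	G=400	G=800
Proposed estimation	m(·)	2	0.1451	0.0742	0.0369	0.0208
Proposed estimation	m(·)	3	0.0767	0.0380	0.0233	0.0132
Proposed estimation	m(·)	4	0.0517	0.0269	0.0167	0.0991
Proposed estimation	β	2	0.0670	0.0287	0.0156	0.0070
Proposed estimation	β	3	0.0316	0.0149	0.0074	0.0032
Proposed estimation	β	4	0.0214	0.0100	0.0056	0.0020
Fan, Peng and Huang (2005)’s estimation	m(·)	2	0.1454	0.0752	0.0358	0.0201
Fan, Peng and Huang (2005)’s estimation	m(·)	3	0.0780	0.0397	0.0234	0.0137
Fan, Peng and Huang (2005)’s estimation	m(·)	4	0.0515	0.0273	0.0167	0.0100
Fan, Peng and Huang (2005)’s estimation	β	2	0.0668	0.0299	0.0151	0.0069
Fan, Peng and Huang (2005)’s estimation	β	3	0.0318	0.0148	0.0071	0.0033
Fan, Peng and Huang (2005)’s estimation	β	4	0.0211	0.0098	0.0050	0.0024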


Figure 1. The estimators of m(·) with G = 200 and I = 4. Dotted line: the proposed estimator; dash-dotted line: Fan, Peng and Huang (2005)’s estimator; and solid line: m(·).

From Table 1 and Figure 1 we can see that the two-stage estimators have almost the same finite sample performance as the profile least squares estimators. This phenomenon is also observed for the case of aggregation across arrays; we omit the details here.

6 Concluding Remarks

In this paper, we have proposed a two-stage estimation procedure for the semilinear in-slide models. The main advantage of our approach over the existing ones is that we can establish the asymptotic normalities for the corresponding parametric and nonparametric component estimators, respectively. We further extended the two-stage estimation to aggregated semilinear in-slide models. The advantage of the two-stage estimation over the existed estimations in this case is that we can explicitly show that taking the aggregated information can lead to improvement in both the the parametric and nonparametric component estimators. The significance of developing these asymptotic normalities lies in that we can do bandwidth selection and statistical inference for the interested parametric and nonparametric components.

This is still a fast-evolving area of research, and additional effort in this direction is warranted. For example, how to take heteroscedasticity into account to improve the two-stage estimation is still an open problem.

Acknowledgments

This research is supported by a grant from the National Institutes of Health (CA 79949).

Appendix. Proof of Main Results

Lemma 1

Let (X₁, Y₁),…, (X_n, Y_n) be i.i.d. random vectors, where the Y_i’s are scalar random variables. Further assume that E|Y_i|⁴ < ∞ and sup_x ∫ |y|⁴f(x, y)dy < ∞, where f denotes the joint density of (X, Y). Let K be a bounded positive function with bounded support that satisfies a Lipschitz condition. Then if nh⁸ → 0 and nh²/(log n)² → ∞, it holds that

\sup_{x} \Big| \frac{1}{n} \sum_{i = 1}^{n} [K_{h} (X_{i} - x) Y_{i} - E \{K_{h} (X_{i} - x) Y_{i}\}] \Big| = O_{p} \Big( \Big\{ \frac{\log (1 / h)}{n h} \Big\}^{1 / 2} \Big) .

The proof of Lemma 1 follows immediately from the result of Mack and Silverman (1982).
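To get a feel for the two bandwidth conditions in Lemma 1, the following sketch evaluates nh⁸ and nh²/(log n)² along the illustrative choice h = n^(−1/5). This rate is an assumption made here for the example (it is a common order for local linear smoothing), not a bandwidth prescribed by the lemma, and the function name is hypothetical.

```python
import math


def bandwidth_diagnostics(n, exponent=0.2):
    """Return (h, n*h**8, n*h**2 / (log n)**2) for the assumed rate h = n**(-exponent)."""
    h = n ** (-exponent)
    return h, n * h ** 8, n * h ** 2 / math.log(n) ** 2


sample_sizes = [10 ** k for k in range(2, 7)]
diagnostics = [bandwidth_diagnostics(n) for n in sample_sizes]

for n, (h, to_zero, to_infinity) in zip(sample_sizes, diagnostics):
    print(f"n={n:>8d}  h={h:.4f}  n*h^8={to_zero:.3e}  n*h^2/(log n)^2={to_infinity:.3f}")
```

Along this sequence the first quantity shrinks and the second grows, which is the behaviour the lemma requires in the limit.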

Lemma 2

Suppose that Assumption 3 to Assumption 5 hold. Then it holds that ${lim}_{n \to \infty} \frac{1}{n} {\hat{X}}^{T} \hat{X} = Σ$ where X̂ is defined in Section 2 and Σ is defined in Theorem 1.

The proof of Lemma 2 is straightforward, and we omit the details.

Lemma 3

Suppose that Assumption 1 to Assumption 3 hold. Then we have β̃_n − β = O_p(n^−1/2). Further,

\tilde{β}_{n} - β = \{Π^{T} (I_{n} - P_{B}) Π\}^{- 1} Π^{T} (I_{n} - P_{B}) ε + o_{p} (n^{- 1 / 2})

where Π = (Π₁,…, Π_n)^T , Π_i = X_i − E(X_i|U_i) and P_B = B(B^TB)⁻¹B^T .

Lemma 4

Suppose that Assumption 1 to Assumption 3 hold. Then we have

$‖ \tilde{ϑ}_{n} - ϑ ‖ \to_{p} 0$ as n → ∞;
$\tilde{ϑ}_{n} - ϑ = O_{p} (k_{n}^{1 / 2} / n^{1 / 2} + k_{n}^{- 2});$
$\sup_{u \in 𝓊} | \tilde{m}_{n} (u) - m (u) | = O_{p} (k_{n} / \sqrt{n} + k_{n}^{- 3 / 2}).$

Further,
$\begin{aligned} \tilde{ϑ}_{n} - ϑ = {} & \{ζ^{T} (I_{n} - P_{B}) ζ\}^{- 1} ζ^{T} (I_{n} - P_{B}) ε \\ & + \{ζ^{T} (I_{n} - P_{B}) ζ\}^{- 1} ζ^{T} (I_{n} - P_{B}) (m (U_{1}) - ζ^{T} (U_{1}) ϑ, \dots, m (U_{n}) - ζ^{T} (U_{n}) ϑ)^{T} \\ & + O_{p} (k_{n}^{3 / 2} / n + n^{- 1 / 2}) . \end{aligned}$

The proof of Lemma 3 is the same as that of Theorem 1 in You, Zhou and Zhou (2005). Applying the root-n consistency of β̃_n and following the proof of Theorem 1 in Horowitz and Mammen (2004), we can show that Lemma 4 holds. We omit the details here.

Proof of Theorem 1

For convenience, let $Δ_{g} (- i) = \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} \{ X_{ι (g, i_{1})}^{T} (β - \tilde{β}_{n}) + m (U_{ι (g, i_{1})}) - \tilde{m}_{n} (U_{ι (g, i_{1})}) \}$ for g = 1,…,G. Then, according to the definition of β̂_n, it can be verified that

\begin{aligned} \hat{β}_{n} - β = {} & (\hat{X}^{T} \hat{X})^{- 1} \hat{X}^{T} (I_{n} - S) \Big\{ X β + M + ε - \Big( \frac{1}{I_{1} - 1} \sum_{i_{1} = 2}^{I_{1}} ε_{ι (1, i_{1})}, \dots, \frac{1}{I_{1} - 1} \sum_{i_{1} = 1, i_{1} \neq I_{1}}^{I_{1}} ε_{ι (1, i_{1})}, \dots, \frac{1}{I_{G} - 1} \sum_{i_{1} = 1, i_{1} \neq I_{G}}^{I_{G}} ε_{ι (G, i_{1})} \Big)^{T} \Big\} \\ & - (\hat{X}^{T} \hat{X})^{- 1} \hat{X}^{T} (I_{n} - S) (Δ_{1} (- 1), \dots, Δ_{1} (- I_{1}), \dots, Δ_{G} (- I_{G}))^{T} \\ = {} & (\hat{X}^{T} \hat{X})^{- 1} J_{1} + (\hat{X}^{T} \hat{X})^{- 1} J_{2}, \text{ say} . \end{aligned}

Therefore, by Lemma 2, to complete the proof we only need to show that

\frac{1}{\sqrt{n}} J_{1} \to_{D} N (0, lim_{n \to \infty} \frac{1}{n} \sum_{g = 1}^{G} \frac{I_{g}^{2}}{I_{g} - 1} σ^{2} Σ) as n \to \infty

(A.1)

and J₂ = o_p(n^1/2). Following the same argument as in the proof of Theorem 1 in Fan and Huang (2005), we have

\frac{1}{\sqrt{n}} J_{1} = \frac{1}{\sqrt{n}} \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} Π_{ι (g, i)} (ε_{ι (g, i)} - \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} ε_{ι (g, i_{1})}) + o_{p} (1) .

Since the terms $\sum_{i = 1}^{I_{g}} Π_{ι (g, i)} (ε_{ι (g, i)} - \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} ε_{ι (g, i_{1})})$, g = 1,…,G, are independent random vectors with mean zero and finite covariance matrix

\begin{aligned} Cov \Big\{ \sum_{i = 1}^{I_{g}} Π_{ι (g, i)} \Big( ε_{ι (g, i)} - \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} ε_{ι (g, i_{1})} \Big) \Big\} & = I_{g} Σ \, Cov \Big\{ ε_{ι (g, 1)} - \frac{1}{I_{g} - 1} \sum_{i_{1} = 2}^{I_{g}} ε_{ι (g, i_{1})} \Big\} \\ & = \frac{I_{g}^{2}}{I_{g} - 1} σ^{2} Σ, \end{aligned}

by the central limit theorem and Slutsky’s theorem (A.1) holds. Moreover,

\begin{matrix} \frac{1}{n} J_{2} = \frac{1}{n} {\hat{X}}^{T} (I_{n} - S) {(Δ_{1} (- 1), \dots, Δ_{1} (- I_{1}), \dots, Δ_{G} (- I_{G}))}^{T} \\ = & \frac{1}{n} {\hat{X}}^{T} {(Δ_{1} (- 1), \dots, Δ_{1} (- I_{1}), \dots, Δ_{G} (- I_{G}))}^{T} - \frac{1}{n} {\hat{X}}^{T} S {(Δ_{1} (- 1), \dots, Δ_{1} (- I_{1}), \dots, Δ_{G} (- I_{G}))}^{T} \\ = & J_{21} + J_{22}, say . \end{matrix}
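The factor I_g²/(I_g − 1) in the covariance used for (A.1) can be checked with exact arithmetic. The sketch below assumes, as the display does, that the errors within a group are i.i.d. with variance σ² and that the Π_{ι(g,i)} are independent across i with mean zero and covariance Σ, so that cross terms vanish. The function names are hypothetical.

```python
from fractions import Fraction


def leave_one_out_variance_factor(I):
    """Var(eps_1 - average of the other I-1 errors) divided by sigma^2.

    For i.i.d. errors the variance of a linear combination is sigma^2 times
    the sum of squared coefficients: 1 for eps_1 and -1/(I-1) for each other error.
    """
    coefficients = [Fraction(1)] + [Fraction(-1, I - 1)] * (I - 1)
    return sum(c * c for c in coefficients)


def group_covariance_factor(I):
    """Scalar multiplying sigma^2 * Sigma in the covariance of one group's sum."""
    return I * leave_one_out_variance_factor(I)


for I in (2, 3, 4, 10):
    print(I, leave_one_out_variance_factor(I), group_covariance_factor(I))
```

For balanced groups with I_g = I, the limit in (A.1) then carries the factor I/(I − 1), which decreases towards 1 as the number of replications per group grows.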

Let $O (u) = (1, 0) {(D_{u}^{T} W_{u} D_{u})}^{- 1} D_{u}^{T} W_{u} .$ By the definition of X̂ it holds that

\begin{matrix} J_{21} = \frac{1}{n} \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} Π_{ι (g, i)} \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} (X_{ι (g, i_{1})}^{T} (β - {\tilde{β}}_{n}) + m (U_{ι (g, i_{1})}) - {\tilde{m}}_{n} (U_{ι (g, i_{1})})) \\ + \frac{1}{n} \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} O (U_{ι (g, i)}) Π \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} (X_{ι (g, i_{1})}^{T} (β - {\tilde{β}}_{n}) + m (U_{ι (g, i_{1})}) - {\tilde{m}}_{n} (U_{ι (g, i_{1})})) \\ + \frac{1}{n} \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} {h (U_{ι (g, i)}) - O (U_{ι (g, i)}) H} \cdot \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} (X_{ι (g, i_{1})}^{T} (β - {\tilde{β}}_{n}) + m (U_{ι (g, i_{1})}) - {\tilde{m}}_{n} (U_{ι (g, i_{1})})) \\ = J_{211} + J_{212} + J_{213}, say \end{matrix}

where h(u) = (E(X₁₁|U₁ = u),…,E(X_p1|U₁ = u))^T and H = (h(U₁),…,h(U_n))^T . By Fan and Huang (2005) it holds that

max_{1 \leq g \leq G} max_{1 \leq i \leq I_{g}} ‖ O (U_{ι (g, i)}) Π ‖ = O_{p} (h^{2} + \frac{1}{\sqrt{n h}})

and

max_{1 \leq g \leq G} max_{1 \leq i \leq I_{g}} ‖ h (U_{ι (g, i)}) - O (U_{ι (g, i)}) H ‖ = O_{p} (h^{2} + \frac{1}{\sqrt{n h}}) .

Therefore, combining Lemma 3 and Lemma 4 we have

J_{212} = O_{p} (h^{2} + \frac{1}{\sqrt{n h}}) \cdot {O_{p} (n^{- 1}) + O_{p} (k_{n} / \sqrt{n} + k_{n}^{- 3 / 2})} = o_{p} (n^{- 1 / 2})

and J₂₁₃ = o_p(n^−1/2). Further,

\begin{matrix} J_{211} & = & \frac{1}{n} \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} Π_{ι (g, i)} \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} X_{ι (g, i_{1})}^{T} (β - {\tilde{β}}_{n}) \\ + \frac{1}{n} \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} Π_{ι (g, i)} \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} (m (U_{ι (g, i_{1})}) - {\tilde{m}}_{n} (U_{ι (g, i_{1})})) \end{matrix}

It is easy to see that

E {\sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} Π_{ι (g, i)} \frac{1}{I_{g} - 1} \sum_{i_{1} = 2}^{I_{g}} X_{ι (g, i_{1})}^{T}}^{\otimes 2} = \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} E {(Π_{ι (g, i)} \frac{1}{I_{g} - 1} \sum_{i_{1} = 2}^{I_{g}} X_{ι (g, i_{1})}^{T})}^{\otimes 2} = O (n)

where A^{⊗2} means A^TA. Combining this with the root-n consistency of β̃_n, it holds that

\frac{1}{n} \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} Π_{ι (g, i)} \frac{1}{I_{g} - 1} \sum_{i_{1} = 2}^{I_{g}} X_{ι (g, i_{1})}^{T} (β - {\tilde{β}}_{n}) = O_{p} (n^{- 1}) .

According to the definition of m̃_n(·) we have

\begin{matrix} \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} Π_{ι (g, i)} \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} (m (U_{ι (g, i_{1})}) - {\tilde{m}}_{n} (U_{ι (g, i_{1})})) \\ = & \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} Π_{ι (g, i)} \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} (m (U_{ι (g, i_{1})}) - ζ^{T} (U_{ι (g, i_{1})}) {(Ξ^{T} M_{B} Ξ)}^{-} Ξ^{T} M_{B} M) \\ + \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} Π_{ι (g, i)} \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} ζ^{T} (U_{ι (g, i_{1})}) {(Ξ^{T} M_{B} Ξ)}^{-} Ξ^{T} M_{B} X (β - {\tilde{β}}_{n}) \\ + \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} Π_{ι (g, i)} \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} ζ^{T} (U_{ι (g, i_{1})}) {(Ξ^{T} M_{B} Ξ)}^{-} Ξ^{T} M_{B} ε = J_{3} + J_{4} + J_{5}, say . \end{matrix}

Now, we will prove J_s = o_p(n^1/2) for s = 3, 4 and 5. For convenience, we let

{\tilde{m}}_{ι (g, i)} = \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} (m (U_{ι (g, i_{1})}) - ζ^{T} (U_{ι (g, i_{1})}) {(Ξ^{T} M_{B} Ξ)}^{-} Ξ^{T} M_{B} M) .

It is easy to see, in order to complete the proof of J₃ = o_p(n^1/2), we just need to show that $G^{- 1} \sum_{g = 1}^{G} Π_{ι (g, 1), 1} {\tilde{m}}_{ι (g, 1)} = o_{p} (n^{- 1 / 2}) .$ Following the proof of Lemma 3 we have

\tilde{m}_{1} = \max_{1 \leq g \leq G} | \tilde{m}_{ι (g, 1)} | = O (\sqrt{k_{n}} / n + k_{n}^{- 1}) \quad \text{a.s.}

Put τ_g = Π_{ι(g,1),1} m̃_{ι(g,1)}. For any δ > 0, set

Π_{ι (g, 1), 1}^{'} = Π_{ι (g, 1), 1} I \{ | Π_{ι (g, 1), 1} | \leq δ^{2} g^{1 / 2} \} \quad \text{and} \quad Π_{ι (g, 1), 1}^{″} = Π_{ι (g, 1), 1} I \{ | Π_{ι (g, 1), 1} | > δ^{2} g^{1 / 2} \}

so that

τ_{g} = {\tilde{m}}_{ι (g, 1)} Π_{ι (g, 1), 1}^{'} + {\tilde{m}}_{ι (g, 1)} Π_{ι (g, 1), 1}^{″} .

By the three-series theorem we obtain $\sum_{g = 1}^{\infty} | Π_{ι (g, 1), 1}^{″} | < \infty$ for all g = 1,…,G. This implies that

\frac{1}{\sqrt{G}} \sum_{g = 1}^{G} Π_{ι (g, 1), 1}^{″} \tilde{m}_{ι (g, 1)} = o (1) \quad \text{a.s.}

For g = 1,…,G, let $τ_{g}^{'} = Π_{ι (g, 1), 1}^{'} {\tilde{m}}_{ι (g, 1)} .$ Then given ${\tilde{Δ}}_{n} = {U_{1}, \dots, U_{n}}, τ_{1}^{'}, \dots, τ_{G}^{'}$ are independent and

E (τ_{g} | {\tilde{Δ}}_{n}) = 0, max_{1 \leq g \leq G} | τ_{g} | \leq {\tilde{m}}_{1} δ^{2} G^{1 / 2} and E (τ_{g}^{2} | {\tilde{Δ}}_{n}) = 2 {\tilde{m}}_{ι (g, 1)} σ^{2} .

By Bernstein’s inequality we have

\begin{matrix} p_{m} & = & P [\underset{n \geq m}{\cup} {\frac{1}{\sqrt{G}} | \sum_{g = 1}^{G} τ_{g}^{'} | \geq δ}] \leq \sum_{G \geq m} E [\frac{1}{\sqrt{G}} \sum_{g = 1}^{G} Pr {| \sum_{g = 1}^{G} τ_{g}^{'} | \geq δ | {\tilde{Δ}}_{n}}] \\ \leq & 2 \sum_{G \geq m} \sum_{g = 1}^{G} E [exp {- \frac{G {(δ / G)}^{2}}{(2 / G) \sum_{g = 1}^{G} E [{(τ_{g}^{'})}^{2} | {\tilde{Δ}}_{n}] + δ^{2} G^{1 / 2} {\tilde{m}}_{1} (δ / G)}}] \\ \leq & 2 \sum_{G \geq m} \sum_{g = 1}^{G} E [exp {\frac{δ^{2}}{2 δ^{3} G^{1 / 2} {\tilde{m}}_{1}}}] \leq 2 \sum_{G \geq m} G^{- 2} \to 0 \end{matrix}

as m →∞. By this we have

Pr \Big( \Big| \frac{1}{\sqrt{G}} \sum_{g = 1}^{G} τ_{g}^{'} \Big| \geq δ \Big) \leq p_{m} + Pr (\tilde{m}_{1} \geq δ^{2}) \leq 2 δ .

Therefore, J₃ = o(n^1/2) a.s.

By the Cauchy-Schwarz inequality, it holds that

\begin{matrix} ‖ J_{4} ‖ & = & \frac{1}{2} \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} Π_{ι (g, i)}^{T} \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} ζ^{T} (U_{ι (g, i_{1})}) {(Ξ^{T} M_{B} Ξ)}^{-} \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} Π_{ι (g, i)}^{T} \\ + \frac{1}{2} \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} {(β - {\tilde{β}}_{n})}^{T} X^{T} M_{B} Ξ {(Ξ^{T} M_{B} Ξ)}^{-} Ξ^{T} M_{B} X (β - {\tilde{β}}_{n}) = S_{1} + S_{2}, say . \end{matrix}

Further,

S_{1} = O (n^{- 1} k_{n}) \cdot \frac{1}{2} \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} ‖ Π_{ι (g, i)}^{T} \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} ζ^{T} (U_{ι (g, i_{1})}) ‖ = O_{p} (k_{n}^{2}) = o_{p} (n) .

and

S_{2} \leq O_{p} (1) \cdot {(β - {\tilde{β}}_{n})}^{T} X^{T} X (β - {\tilde{β}}_{n}) = O_{p} (1) = o_{p} (n) .

Thus, J₄ = o_p(n^1/2).

In addition, it holds that

\begin{matrix} J_{5} & = & \frac{1}{2} \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} Π_{ι (g, i)}^{T} \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} ζ^{T} (U_{ι (g, i_{1})}) {(Ξ^{T} M_{B} Ξ)}^{-} \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i \neq 1}^{I_{g}} ζ (U_{ι (g, i_{1})}) Π_{ι (g, i)} \\ + \frac{1}{2} \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} ε^{T} \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} ζ^{T} (U_{ι (g, i_{1})}) {(Ξ^{T} M_{B} Ξ)}^{-} \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i \neq 1}^{I_{g}} ζ (U_{ι (g, i_{1})}) ε \\ = & S_{3} + S_{4}, say . \end{matrix}

It is easy to see that

E S_{3} \leq O (1) \cdot E {Π M_{B} Ξ {(Ξ^{T} M_{B} Ξ)}^{-} Ξ^{T} M_{B} Π^{T}} = O (k_{n}) = o (n^{\frac{1}{2}}) .

This implies that S₃ = o_p(n^1/2). Following the same line, we can show that S₄ = o_p(n^1/2). So J₅ = o_p(n^1/2) holds. This completes the proof of Theorem 1.

Proof of Theorem 2

According to the definition of m̂_n(u) it holds that

\begin{aligned} \hat{m}_{n} (u) = {} & (1, 0) (D_{u}^{T} W_{u} D_{u})^{- 1} D_{u}^{T} W_{u} \Big\{ ε - \Big( \frac{1}{I_{1} - 1} \sum_{i_{1} = 2}^{I_{1}} ε_{ι (1, i_{1})}, \dots, \frac{1}{I_{1} - 1} \sum_{i_{1} = 1, i_{1} \neq I_{1}}^{I_{1}} ε_{ι (1, i_{1})}, \dots, \frac{1}{I_{G} - 1} \sum_{i_{1} = 1, i_{1} \neq I_{G}}^{I_{G}} ε_{ι (G, i_{1})} \Big)^{T} \Big\} \\ & + (1, 0) (D_{u}^{T} W_{u} D_{u})^{- 1} D_{u}^{T} W_{u} (Δ_{1} (- 1), \dots, Δ_{1} (- I_{1}), \dots, Δ_{G} (- I_{G}))^{T} \\ & + (1, 0) (D_{u}^{T} W_{u} D_{u})^{- 1} D_{u}^{T} W_{u} X (β - \hat{β}_{n}) + (1, 0) (D_{u}^{T} W_{u} D_{u})^{- 1} D_{u}^{T} W_{u} M \\ = {} & J_{1} + J_{2} + J_{3} + J_{4}, \text{ say} . \end{aligned}

It is easy to see that

D_{u}^{T} W_{u} D_{u} = (\begin{matrix} \sum_{i = 1}^{n} K_{h} (U_{i} - u) & \sum_{i = 1}^{n} (\frac{U_{i} - u}{h}) K_{h} (U_{i} - u) \\ \sum_{i = 1}^{n} (\frac{U_{i} - u}{h}) K_{h} (U_{i} - u) & \sum_{i = 1}^{n} {(\frac{U_{i} - u}{h})}^{2} K_{h} (U_{i} - u) \end{matrix}) .

Each element of the above matrix is in the form of kernel regression. By Lemma 1 it holds that

D_{u}^{T} W_{u} D_{u} = n p (u) \otimes \begin{pmatrix} 1 & 0 \\ 0 & μ_{2} \end{pmatrix} \Big[ 1 + O_{p} \Big( \Big\{ \frac{\log (1 / h)}{n h} \Big\}^{1 / 2} \Big) \Big]

holds uniformly in 𝓊, where ⊗ is the Kronecker product and µ₂ = ∫_𝓊 u²K(u)du. By the same argument, we have

D_{u}^{T} W_{u} X = n p (u) E (1^{T} X_{1} | U) \otimes \begin{pmatrix} 1 & 0 \\ 0 & μ_{2} \end{pmatrix} \Big[ 1 + O_{p} \Big( \Big\{ \frac{\log (1 / h)}{n h} \Big\}^{1 / 2} \Big) \Big]
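The operator (1, 0)(D_u^T W_u D_u)^{−1} D_u^T W_u that appears throughout this proof is the local linear smoother evaluated at u. A minimal sketch of it is given below, built from the same kernel-weighted sums that form the entries of D_u^T W_u D_u. The Epanechnikov kernel and the function names are assumptions made for the illustration; the paper's own kernel and implementation are not specified in this section.

```python
def epanechnikov(t):
    """Epanechnikov kernel on [-1, 1] (an assumed choice for this sketch)."""
    return 0.75 * (1.0 - t * t) if abs(t) <= 1.0 else 0.0


def local_linear_fit(u0, U, Y, h, kernel=epanechnikov):
    """Evaluate (1, 0) (D'WD)^{-1} D'W Y at the point u0.

    D has rows (1, (U_i - u0) / h) and W is diagonal with entries K_h(U_i - u0).
    """
    s0 = s1 = s2 = t0 = t1 = 0.0
    for u, y in zip(U, Y):
        z = (u - u0) / h
        w = kernel(z) / h  # K_h(U_i - u0)
        s0 += w            # entries of D'WD
        s1 += z * w
        s2 += z * z * w
        t0 += w * y        # entries of D'WY
        t1 += z * w * y
    det = s0 * s2 - s1 * s1
    if det <= 0.0:
        raise ValueError("too few design points within the bandwidth")
    # First row of the inverse of the 2x2 matrix, applied to (t0, t1).
    return (s2 * t0 - s1 * t1) / det


# A linear regression function is reproduced exactly by a local linear fit.
U = [i / 50 for i in range(51)]
Y = [2.0 + 3.0 * u for u in U]
fitted = local_linear_fit(0.37, U, Y, h=0.2)
print(fitted)
```

Working with the five weighted sums directly avoids forming D_u and W_u as matrices, which keeps the sketch free of any linear algebra library.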

Therefore, combining the fact $‖ β - {\hat{β}}_{n} ‖ = O_{p} (n^{- 1 / 2})$ we have J₃ = O_p(n^−1/2). Moreover, let

Δ_{g}^{(1)} (- i) = \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} X_{ι (g, i_{1})}^{T} (β - {\tilde{β}}_{n}) and Δ_{g}^{(2)} (- i) = \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} (m (U_{ι (g, i_{1})}) - {\tilde{m}}_{n} (U_{ι (g, i_{1})}))

for g = 1,…,G. Then, we have

\begin{matrix} J_{2} & = & (1, 0) {(D_{u}^{T} W_{u} D_{u})}^{- 1} (\begin{matrix} \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} K_{h} (U_{ι (g, i)} - u) Δ_{g}^{(1)} (- i) \\ \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} \frac{U_{ι (g, i)} - u}{h} K_{h} (U_{ι (g, i)} - u) Δ_{g}^{(1)} (- i) \end{matrix}) \\ + (1, 0) {(D_{u}^{T} W_{u} D_{u})}^{- 1} (\begin{matrix} \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} K_{h} (U_{ι (g, i)} - u) Δ_{g}^{(2)} (- i) \\ \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} \frac{U_{ι (g, i)} - u}{h} K_{h} (U_{ι (g, i)} - u) Δ_{g}^{(2)} (- i) \end{matrix}) = J_{21} + J_{22}, say . \end{matrix}

By the root-n consistency of β̃_n and the same argument as for J₃, it is easy to see that J₂₁ = O_p(n^−1/2). Further,

\begin{matrix} (1, 0) {(D_{u}^{T} W_{u} D_{u})}^{- 1} \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} K_{h} (U_{ι (g, i)} - u) Δ_{g}^{(2)} (- i) \\ = & (1, 0) {(D_{u}^{T} W_{u} D_{u})}^{- 1} \sum_{i = 1}^{n} K_{h} (U_{i} - u) (m (U_{i}) - ζ^{T} (U_{i}) {(Ξ^{T} M_{B} Ξ)}^{-} Ξ^{T} M_{B} M) \\ + (1, 0) {(D_{u}^{T} W_{u} D_{u})}^{- 1} \sum_{i = 1}^{n} K_{h} (U_{i} - u) ζ^{T} (U_{i}) {(Ξ^{T} M_{B} Ξ)}^{-} Ξ^{T} M_{B} X (β - {\tilde{β}}_{n}) \\ + (1, 0) {(D_{u}^{T} W_{u} D_{u})}^{- 1} \sum_{i = 1}^{n} K_{h} (U_{i} - u) ζ^{T} (U_{i}) {(Ξ^{T} M_{B} Ξ)}^{-} Ξ^{T} M_{B} ε = J_{5} + J_{6} + J_{7} say . \end{matrix}

Applying Lemma 1 and the root-n consistency of β̃_n, we can show that J₆ = o_p(n^−1/2). Moreover, by the same argument as in the proof of Theorem 1 in Horowitz and Mammen (2004), we can show that $J_{5} = o_{p} \{h^{2} + 1 / \sqrt{n h}\}$ and $J_{7} = o_{p} \{h^{2} + 1 / \sqrt{n h}\}$. Altogether, we have $J_{2} = o_{p} \{h^{2} + 1 / \sqrt{n h}\}$.

According to the usual nonparametric regression result we have

\sqrt{n h} [J_{4} - m (u) - \frac{h^{2}}{2} \frac{μ_{2}^{2} - μ_{1} μ_{3}}{μ_{2} - μ_{1}^{2}} m^{″} (u)] \to_{p} 0 as n \to \infty .

Therefore, in order to complete the proof we just need to show that

\sqrt{n h} J_{1} \to_{D} N (0, ζ (u)) as n \to \infty .

Let

Q = \frac{1}{n} \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} [α_{0} + α_{1} (\frac{U_{ι (g, i)} - u}{h})] K_{h} (U_{ι (g, i)} - u) (ε_{ι (g, i)} - \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} ε_{ι (g, i_{1})})

where $α_{0} = μ_{2} / (μ_{2} - μ_{1}^{2})$ and $α_{1} = - μ_{1} / (μ_{2} - μ_{1}^{2})$. It follows that

\sqrt{n h} \Big[ J_{1} + J_{4} - m (u) - \frac{h^{2}}{2} \frac{μ_{2}^{2} - μ_{1} μ_{3}}{μ_{2} - μ_{1}^{2}} m^{″} (u) \Big] = p^{- 1} (u) \sqrt{n h} Q + o_{p} (1) .
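For a concrete kernel, the constants α₀, α₁ and the bias constant (μ₂² − μ₁μ₃)/(μ₂ − μ₁²) from the usual nonparametric regression result can be evaluated exactly. The sketch below does so for the Epanechnikov kernel, taking μ_j = ∫ u^j K(u) du by analogy with the definition of μ₂ above. Both the kernel choice and the function name are assumptions made for illustration only.

```python
from fractions import Fraction


def epanechnikov_moment(j):
    """mu_j = integral of u**j * 0.75 * (1 - u**2) over [-1, 1], computed exactly."""
    if j % 2 == 1:
        return Fraction(0)  # odd moments of a symmetric kernel vanish
    # 2 * 0.75 * (integral of u**j - u**(j + 2) over [0, 1])
    return Fraction(3, 4) * 2 * (Fraction(1, j + 1) - Fraction(1, j + 3))


mu1, mu2, mu3 = (epanechnikov_moment(j) for j in (1, 2, 3))

alpha0 = mu2 / (mu2 - mu1 ** 2)
alpha1 = -mu1 / (mu2 - mu1 ** 2)
bias_constant = (mu2 ** 2 - mu1 * mu3) / (mu2 - mu1 ** 2)

print("mu2 =", mu2)
print("alpha0 =", alpha0, "alpha1 =", alpha1)
print("bias constant =", bias_constant)
```

For any symmetric kernel μ₁ = μ₃ = 0, so α₀ = 1, α₁ = 0 and the bias constant reduces to μ₂; the general expressions matter only when the kernel is asymmetric.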

The variance of $\sqrt{n h} Q$ is

\begin{matrix} Var (\sqrt{n h} Q) & = & \frac{h σ^{2} α_{0}^{2}}{n} \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} \frac{I_{g}}{I_{g} - 1} E K_{h}^{2} (U_{ι (g, i)} - u) \\ + \frac{h σ^{2} α_{1}^{2}}{n} \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} \frac{I_{g}}{I_{g} - 1} E {{(\frac{U_{ι (g, i)} - u}{h})}^{2} K_{h}^{2} (U_{ι (g, i)} - u)} \\ + \frac{h σ^{2} α_{0} α_{1}}{n} \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} \frac{I_{g}}{I_{g} - 1} E {(\frac{U_{ι (g, i)} - u}{h}) K_{h}^{2} (U_{ι (g, i)} - u)} \\ - \frac{h σ^{2} α_{0}^{2}}{n} \sum_{g = 1}^{G} \sum_{i_{1} = 1}^{I_{g}} \sum_{i_{2} = 1}^{I_{g}} \frac{I_{g}}{{(I_{g} - 1)}^{2}} E {K_{h} (U_{ι (g, i_{1})} - u) K_{h} (U_{ι (g, i_{2})} - u)} \\ + \frac{h σ^{2} α_{1}^{2}}{n} \sum_{g = 1}^{G} \sum_{i_{1} = 1}^{I_{g}} \sum_{i_{2} = 1}^{I_{g}} \frac{I_{g}}{{(I_{g} - 1)}^{2}} \\ \cdot E {(\frac{U_{ι (g, i_{1})} - u}{h}) K_{h} (U_{ι (g, i_{1})} - u) (\frac{U_{ι (g, i_{2})} - u}{h}) K_{h} (U_{ι (g, i_{2})} - u)} \\ + \frac{2 h σ^{2} α_{0} α_{1}}{n} \sum_{g = 1}^{G} \sum_{i_{1} = 1}^{I_{g}} \sum_{i_{2} = 1}^{I_{g}} \frac{I_{g}}{{(I_{g} - 1)}^{2}} E {(\frac{U_{ι (g, i_{1})} - u}{h}) K_{h}^{2} (U_{ι (g, i_{2})} - u))} \\ = & J_{8} + J_{9} + J_{10} + J_{11} + J_{12} + J_{13}, say . \end{matrix}

It is easy to see that $J_{8} \to_{p} α_{0}^{2} σ^{2} ν_{0}$, $J_{9} \to_{p} α_{1}^{2} σ^{2} ν_{2}$, $J_{10} \to_{p} α_{1} α_{0} σ^{2} ν_{1}$, and $J_{s} \to 0$ for s = 11, 12, 13 as n →∞. Altogether,

Var (\sqrt{n h} Q) = p^{- 1} (u) (α_{0}^{2} ν_{0} + 2 α_{0} α_{1} ν_{1} + α_{1}^{2} ν_{2}) + o (1) .

Let

a_{g} = \sqrt{h} \sum_{i = 1}^{I_{g}} \Big[ α_{0} + α_{1} \Big( \frac{U_{ι (g, i)} - u}{h} \Big) \Big] K_{h} (U_{ι (g, i)} - u) \Big( ε_{ι (g, i)} - \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} ε_{ι (g, i_{1})} \Big)

and $B_{n}^{2} = \sum_{g = 1}^{G} E a_{g}^{2} .$ Then

B_{n}^{2} = n p^{- 1} (u) (α_{0}^{2} ν_{0} + 2 α_{0} α_{1} ν_{1} + α_{1}^{2} ν_{2}) σ^{2} + o (n) .

A simple calculation shows that

\sum_{g = 1}^{G} E {| a_{g} |}^{3} \leq O (1) \cdot \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} h^{\frac{3}{2}} [| α_{0} | + | α_{1} | \cdot | \frac{U_{ι (g, i)} - u}{h} |] K_{h}^{3} (U_{ι (g, i)} - u) = O (n h^{- 1 / 2}) .

It follows that $\lim_{n \to \infty} B_{n}^{- 3} \sum_{g = 1}^{G} E | a_{g} |^{3} = 0.$ Hence the Lyapunov condition holds, and the proof is completed by the central limit theorem.

Proof of Theorem 3

For convenience, let

\nabla_{ι (g, i)} = X_{ι (g, i)}^{T} (β - {\tilde{β}}_{n}) + m (U_{ι (g, i)}) - \tilde{m} (U_{ι (g, i)})

and

\nabla_{ι (g, i)}^{*} = X_{ι (g, i)}^{T} (β - {\hat{β}}_{n}) + m (U_{ι (g, i)}) - \hat{m} (U_{ι (g, i)}) .

By the definition of ${\hat{σ}}_{n}^{2}$ it can be decomposed as

\begin{aligned} \hat{σ}_{n}^{2} = {} & d (n) \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} \Big\{ ε_{ι (g, i)} - \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} ε_{ι (g, i_{1})} \Big\}^{2} + d (n) \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} \Big\{ \nabla_{ι (g, i)} - \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} \nabla_{ι (g, i_{1})} \Big\}^{2} \\ & + d (n) \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} (\nabla_{ι (g, i)}^{*})^{2} + 2 d (n) \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} \Big\{ ε_{ι (g, i)} - \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} ε_{ι (g, i_{1})} \Big\} \Big\{ \nabla_{ι (g, i)} - \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} \nabla_{ι (g, i_{1})} \Big\} \\ & + 2 d (n) \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} \Big\{ ε_{ι (g, i)} - \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} ε_{ι (g, i_{1})} \Big\} \nabla_{ι (g, i)}^{*} \\ & + 2 d (n) \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} \Big\{ \nabla_{ι (g, i)} - \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} \nabla_{ι (g, i_{1})} \Big\} \nabla_{ι (g, i)}^{*} \\ = {} & J_{1} + \dots + J_{6}, \text{ say}, \end{aligned}

where $d (n) = 1 / \{ n + \sum_{g = 1}^{G} I_{g} / (I_{g} - 1) \} .$ Applying Lemmas 3 and 4 together with Theorems 1 and 2, it is easy to show that J_s = o_p(n^−1/2) for s = 2, 3 and 6.

Let

ζ_{g} = \sum_{i = 1}^{I_{g}} \Big\{ ε_{ι (g, i)} - (I_{g} - 1)^{- 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} ε_{ι (g, i_{1})} \Big\}^{2} .

Obviously, the ζ_g are independent random variables with $E ζ_{g} = (I_{g} - 1)^{- 1} I_{g}^{2} σ^{2} .$ Further,

\begin{aligned} & E \Big[ \sum_{i = 1}^{I_{g}} \Big\{ ε_{ι (g, i)}^{2} + \frac{1}{(I_{g} - 1)^{2}} \Big( \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} ε_{ι (g, i_{1})} \Big)^{2} - \frac{2}{I_{g} - 1} ε_{ι (g, i)} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} ε_{ι (g, i_{1})} \Big\} \Big]^{2} \\ = {} & E \Big( \sum_{i = 1}^{I_{g}} ε_{ι (g, i)}^{2} \Big)^{2} + E \Big[ \sum_{i = 1}^{I_{g}} \frac{1}{(I_{g} - 1)^{2}} \Big( \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} ε_{ι (g, i_{1})} \Big)^{2} \Big]^{2} \\ & + E \Big[ \frac{2}{I_{g} - 1} \sum_{i = 1}^{I_{g}} ε_{ι (g, i)} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} ε_{ι (g, i_{1})} \Big]^{2} + E \Big[ \frac{2}{(I_{g} - 1)^{2}} \sum_{i_{1} = 1}^{I_{g}} \sum_{i_{3} = 1}^{I_{g}} ε_{ι (g, i_{1})}^{2} \Big( \sum_{i_{4} = 1, i_{4} \neq i_{3}}^{I_{g}} ε_{ι (g, i_{4})} \Big)^{2} \Big] \\ & - E \Big[ \frac{4}{I_{g} - 1} \sum_{i_{1} = 1}^{I_{g}} \sum_{i_{3} = 1}^{I_{g}} ε_{ι (g, i_{1})}^{2} ε_{ι (g, i_{3})} \sum_{i_{4} = 1, i_{4} \neq i_{3}}^{I_{g}} ε_{ι (g, i_{4})} \Big] \\ & - E \Big[ \frac{4}{(I_{g} - 1)^{3}} \sum_{i_{1} = 1}^{I_{g}} \sum_{i_{3} = 1}^{I_{g}} \Big( \sum_{i_{2} = 1, i_{2} \neq i_{1}}^{I_{g}} ε_{ι (g, i_{2})} \Big)^{2} ε_{ι (g, i_{3})} \sum_{i_{4} = 1, i_{4} \neq i_{3}}^{I_{g}} ε_{ι (g, i_{4})} \Big] = J_{7} + \dots + J_{12}, \text{ say} . \end{aligned}

It is easy to see that

\begin{matrix} J_{7} = I_{g} E (ε_{1}^{4}) + I_{g} (I_{g} - 1) σ^{4}, \\ J_{8} = {(I_{g} - 1)}^{- 2} I_{g} E (ε_{1}^{4}) + 2 I_{g} {(I_{g} - 1)}^{- 3} [3 {(I_{g} - 2)}^{2} + (I_{g} - 1)] σ^{4}, \\ J_{9} = 12 I_{g} {(I_{g} - 1)}^{- 1} σ^{4}, J_{10} = 2 {(I_{g} - 1)}^{- 1} I_{g} E (ε_{1}^{4}) + 2 I_{g} σ^{4}, \\ J_{11} = 0, and J_{12} = - 16 I_{g} {(I_{g} - 1)}^{- 2} (I_{g} - 2) σ^{4} . \end{matrix}

In summary, we have

\begin{matrix} E (ζ_{g}^{2}) & = & {1 + {(I_{g} - 1)}^{- 2} + 2 {(I_{g} - 1)}^{- 1}} I_{g} E (ε_{1}^{4}) \\ + {(I_{g} - 1)}^{- 3} (I_{g}^{5} - 2 I_{g}^{4} + 2 I_{g}^{3} + 6 I_{g}^{2} + I_{g}) σ^{4} . \end{matrix}

Then, by some simple calculation, we have

\begin{matrix} Var (ζ_{g}) & = & E (ζ_{g}^{2}) - {E (ζ_{g})}^{2} = {1 + {(I_{g} - 1)}^{- 2} + 2 {(I_{g} - 1)}^{- 1}} I_{g} E (ε_{1}^{4}) \\ + {(I_{g} - 1)}^{- 3} (- I_{g}^{4} + 2 I_{g}^{3} + 6 I_{g}^{2} + I_{g}) σ^{4} . \end{matrix}

Therefore,

\begin{aligned} Var \Big( \frac{\sqrt{n}}{n + \sum_{g = 1}^{G} I_{g} / (I_{g} - 1)} \sum_{g = 1}^{G} ζ_{g} \Big) = {} & \frac{n}{\{ n + \sum_{g = 1}^{G} I_{g} / (I_{g} - 1) \}^{2}} \\ & \cdot \sum_{g = 1}^{G} \Big[ \Big\{ 1 + \frac{1}{(I_{g} - 1)^{2}} + \frac{2}{I_{g} - 1} \Big\} I_{g} E (ε_{1}^{4}) + \frac{1}{(I_{g} - 1)^{3}} (- I_{g}^{4} + 2 I_{g}^{3} + 6 I_{g}^{2} + I_{g}) σ^{4} \Big] . \end{aligned}

According to the definition, J₄ can be written as

\begin{aligned} J_{4} = {} & d (n) \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} \Big( ε_{ι (g, i)} - \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} ε_{ι (g, i_{1})} \Big) X_{ι (g, i)}^{T} (β - \tilde{β}_{n}) \\ & - d (n) \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} \Big( ε_{ι (g, i)} - \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} ε_{ι (g, i_{1})} \Big) \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} X_{ι (g, i_{1})}^{T} (β - \tilde{β}_{n}) \\ & + d (n) \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} \Big( ε_{ι (g, i)} - \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} ε_{ι (g, i_{1})} \Big) (m (U_{ι (g, i)}) - \tilde{m} (U_{ι (g, i)})) \\ & - d (n) \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} \Big( ε_{ι (g, i)} - \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} ε_{ι (g, i_{1})} \Big) \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} (m (U_{ι (g, i_{1})}) - \tilde{m} (U_{ι (g, i_{1})})) \\ = {} & J_{41} - J_{42} + J_{43} - J_{44}, \text{ say} . \end{aligned}

By the proof of Theorem 1, we can show that

d (n) \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} (ε_{ι (g, i)} - \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} ε_{ι (g, i_{1})}) X_{ι (g, i)}^{T} = O_{p} (n^{- \frac{1}{2}}) .

Therefore, combining this with the root-n consistency of β̃_n, we have J₄₁ = o_p(n^−1/2). By the same argument we can show that J₄₂ = o_p(n^−1/2). Further, it holds that

\begin{matrix} J_{43} \\ = & d (n) \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} (ε_{ι (g, i)} - \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} ε_{ι (g, i_{1})}) {m (U_{ι (g, i)}) - ζ {(U_{ι (g, i)})}^{T} {Ξ^{T} M_{B} Ξ}^{-} Ξ^{T} M_{B} M} \\ + d (n) \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} (ε_{ι (g, i)} - \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} ε_{ι (g, i_{1})}) ζ {(U_{ι (g, i)})}^{T} {Ξ^{T} M_{B} Ξ}^{-} Ξ^{T} M_{B} X (β - {\tilde{β}}_{n}) \\ + d (n) \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} (ε_{ι (g, i)} - \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} ε_{ι (g, i_{1})}) ζ {(U_{ι (g, i)})}^{T} {Ξ^{T} M_{B} Ξ}^{-} Ξ^{T} M_{B} ε \\ = & J_{421} + J_{422} + J_{423} say . \end{matrix}

Following the same line as proving

\sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} Π_{ι (g, i)} \frac{1}{I_{g} - 1} \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} (m (U_{ι (g, i_{1})}) - {\tilde{m}}_{n} (U_{ι (g, i_{1})})) = o_{p} (n^{1 / 2})

in the proof of Theorem 1, we have J_42s = o_p(n^−1/2) for s = 1, 2 and 3. Thus J₄ = o_p(n^−1/2). By the same argument, we can show that J₅ = o_p(n^−1/2). This completes the proof of the theorem.

Proof of Theorem 4

The consistency of Σ̂ is straightforward to prove, and we omit the details. We only show the second result. To simplify the notation we write

\nabla_{ι (g, i)} = (X_{ι (g, i)}^{T} - X_{ι (g, i - 1)}^{T}) (β - {\hat{β}}_{n}) + (m (U_{ι (g, i)}) - m (U_{ι (g, i - 1)})) - ({\hat{m}}_{n} (U_{ι (g, i)}) - {\hat{m}}_{n} (U_{ι (g, i - 1)})) .

Then it holds that

\begin{aligned} \sum_{g = 1}^{G} \sum_{i = 2}^{I_{g}} \hat{ψ}_{ι (g, i)}^{4} = {} & \sum_{g = 1}^{G} \sum_{i = 2}^{I_{g}} (ε_{ι (g, i)} - ε_{ι (g, i - 1)})^{4} + \sum_{g = 1}^{G} \sum_{i = 2}^{I_{g}} \nabla_{ι (g, i)}^{4} \\ & + 4 \sum_{g = 1}^{G} \sum_{i = 2}^{I_{g}} (ε_{ι (g, i)} - ε_{ι (g, i - 1)})^{3} \nabla_{ι (g, i)} + 4 \sum_{g = 1}^{G} \sum_{i = 2}^{I_{g}} (ε_{ι (g, i)} - ε_{ι (g, i - 1)}) \nabla_{ι (g, i)}^{3} \\ & + 6 \sum_{g = 1}^{G} \sum_{i = 2}^{I_{g}} (ε_{ι (g, i)} - ε_{ι (g, i - 1)})^{2} \nabla_{ι (g, i)}^{2} = J_{1} + \dots + J_{5}, \text{ say} . \end{aligned}

For J₁ we have

\begin{matrix} \sum_{g = 1}^{G} \sum_{i = 2}^{I_{g}} {(ε_{ι (g, i)} - ε_{ι (g, i - 1)})}^{4} = \sum_{g = 1}^{G} \sum_{i = 2}^{I_{g}} {ε_{ι (g, i)}^{4} - ε_{ι (g, i - 1)}^{4} + 4 ε_{ι (g, i)}^{2} ε_{ι (g, i - 1)}^{2} + 2 ε_{ι (g, i)}^{2} ε_{ι (g, i - 1)}^{2} \\ + 4 ε_{ι (g, i)}^{2} ε_{ι (g, i)} ε_{ι (g, i - 1)} + 4 ε_{ι (g, i - 1)}^{2} ε_{ι (g, i)} ε_{ι (g, i - 1)}} \\ = & \sum_{g = 1}^{G} [(4 I_{g} - 2) E ε_{1}^{4} + {(I_{g} - 1) (I_{g} + 2) + 4 I_{g}} σ^{4}] + o_{p} (n) . \end{matrix}

Combining Theorem 1 and Theorem 2 it is easy to show that $\sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} \nabla_{ι (g, i)}^{4} = o_{p} (1) .$ Next, according to the Hölder inequality, for s = 1, 2 and 3 we have

\Big| \sum_{g = 1}^{G} \sum_{i = 2}^{I_{g}} (ε_{ι (g, i)} - ε_{ι (g, i - 1)})^{s} \nabla_{ι (g, i)}^{4 - s} \Big| \leq \Big( \sum_{g = 1}^{G} \sum_{i = 2}^{I_{g}} \nabla_{ι (g, i)}^{4} \Big)^{(4 - s) / 4} \Big\{ \sum_{g = 1}^{G} \sum_{i = 2}^{I_{g}} (ε_{ι (g, i)} - ε_{ι (g, i - 1)})^{4} \Big\}^{s / 4} .
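The Hölder bound above pairs the exponents 4/s and 4/(4 − s), so that both factors on the right-hand side involve fourth powers. The sketch below checks the inequality numerically on simulated stand-ins for the error differences and the ∇ terms. The Gaussian draws, sample size and function name are arbitrary choices made for the illustration and are not taken from the paper.

```python
import random


def holder_sides(a, b, s):
    """Left and right sides of |sum a^s b^(4-s)| <= (sum b^4)^((4-s)/4) (sum a^4)^(s/4)."""
    lhs = abs(sum(x ** s * y ** (4 - s) for x, y in zip(a, b)))
    rhs = (sum(y ** 4 for y in b) ** ((4 - s) / 4)
           * sum(x ** 4 for x in a) ** (s / 4))
    return lhs, rhs


rng = random.Random(0)
# Stand-ins: a for the error differences, b for the (small) nabla terms.
a = [rng.gauss(0.0, 1.0) for _ in range(500)]
b = [rng.gauss(0.0, 0.1) for _ in range(500)]

checks = {s: holder_sides(a, b, s) for s in (1, 2, 3)}
for s, (lhs, rhs) in checks.items():
    print(f"s={s}: lhs={lhs:.6f} <= rhs={rhs:.6f}")
```

Because the right-hand side depends on the ∇ terms only through the sum of their fourth powers, a bound on that single sum controls all three cross terms at once.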

Therefore, we can show that J_i = o_p(n) for i = 3,…,5. Thus, the proof is complete.

Proof of Theorems 5 and 6

Following the proof of Theorem 1, we can show that

\begin{aligned} \sqrt{n} (\hat{β}_{j n}^{(1) A} - β_{j}) = {} & Σ_{j}^{- 1} \frac{1}{\sqrt{n}} \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} Π_{ι (g, i), j} \\ & \cdot \Big[ ε_{ι (g, i), j} - \frac{1}{I_{g} J - 1} \Big\{ \sum_{i_{1} = 1, i_{1} \neq i}^{I_{g}} ε_{ι (g, i_{1}), j} + \sum_{j_{1} = 1, j_{1} \neq j}^{J} \sum_{i_{1} = 1}^{I_{g}} ε_{ι (g, i_{1}), j_{1}} \Big\} \Big] + o_{p} (1) \end{aligned}

and

\sqrt{n} (\hat{β}_{j n}^{(2) A} - β_{j}) = Σ_{j}^{- 1} \frac{1}{\sqrt{n}} \sum_{g = 1}^{G} \sum_{i = 1}^{I_{g}} Π_{ι (g, i), j} \Big\{ ε_{ι (g, i), j} - \frac{1}{I_{g} (J - 1)} \sum_{j_{1} = 1, j_{1} \neq j}^{J} \sum_{i_{1} = 1}^{I_{g}} ε_{ι (g, i_{1}), j_{1}} \Big\} + o_{p} (1) .

Therefore, combining the central limit theorem and Slutsky’s theorem we can show that Theorem 5 and Theorem 6 hold.

Proof of Theorem 7

Applying Theorem 5 and Theorem 6, by the same argument as proving Theorem 2 we can show Theorem 7 holds.

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

AMS 1980 classifications: primary 62H12; secondary 62A10

Contributor Information

Jinhong You, Email: jyou@bios.unc.edu.

Haibo Zhou, Email: zhou@bios.unc.edu.

References

1.Alizadeh AA, Eisen MB, Davis RE, Ma C, Lossos IS, Rosenwald A, Broldrick JC, Sabet H, Tran T, Yu X, Powell JI, Yang L, Marti GE, Morre T, Hudson J, Jr, Lu L, Lewis DB, Tibshirani R, Sherlock G, Chan WC, Greiner TC, Weisen-beuger DD, Armitage JO, Warnke R, Levy R, Wilson W, Grever MR, Byrd JC, Botstein D, Brown PO, Staudt LM. Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature. 2000;403:503–511. doi: 10.1038/35000501. [DOI] [PubMed] [Google Scholar]
2.Baltagi BH. Econometric Analysis of Panel Data. John Wiley & Sons; 1995. [Google Scholar]
3.Baltagi BH, Li D. Series Estimation of Partially Linear Panel Data Models with Fixed Effects. Annals of Economics and Finance. 2002;3:103–116. [Google Scholar]
4.Bickel PJ, Kwon J. Inference for semiparametric models: some questions and an answer. With comments and a rejoinder by the authors. Statistics Sinica. 2001;11:863–960. [Google Scholar]
5.Brown PO, Botstein D. Exploring the new world of the genome with microarrays. Nature Genetics. 1999;21:33–37. doi: 10.1038/4462. [DOI] [PubMed] [Google Scholar]
6.Dudoit S, Yang YH, Lu P, Lin DM, Peng V, Nagai J, Speed TP. Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Research. 2002;30:e15. doi: 10.1093/nar/30.4.e15. [DOI] [PMC free article] [PubMed] [Google Scholar]
7.Entorf H. Random walks with drifts: nonsense regression and spurious fixed-effect estimation. Journal of Econometrics. 1997;80:287–296. [Google Scholar]
8.Fan J, Tam P, Vande Woude G, Ren Y. Normalization and analysis of cDNA micro-arrays using within-array replications applied to neuroblastoma cell response to a cytokine. Proceedings of the National Academy of Science. 2004:1135–1140. doi: 10.1073/pnas.0307557100. [DOI] [PMC free article] [PubMed] [Google Scholar]
9.Fan J, Peng H, Huang T. Semilinear high-dimensional model for normalization of microarray data: a theoretical analysis and partial consistency. Journal of American Statistical Association. 2005;471:781–798. [Google Scholar]
10.Fan J, Huang T. Profile likelihood inferences on semiparametric varying-coefficient partially linear models. Bernoulli. 2005;11:1031–1057. [Google Scholar]
11.Golub TR, Slonim DK, Tamayo P, Huard C, Gaasenbeek M, Mesirov JP, Coller H, Loh ML, Downing JR, Caligiuri MA, Bloomfield CD, Lander ES. Molecular classification of cancer: Class discovery and class prediction by gene expression monitoring. Science. 1999;286:531–537. doi: 10.1126/science.286.5439.531.
12.Hedenfalk I, Duggan D, Chen Y, Radmacher M, Bittner M, Simon R, Meltzer P, Gusterson B, Esteller M, Kallioniemi O, Wilfond B, Borg A, Trent J. Gene-expression profiles in hereditary breast cancer. The New England Journal of Medicine. 2001;344:539–548. doi: 10.1056/NEJM200102223440801.
13.Honoré BE. Orthogonality conditions for Tobit models with fixed effects and lagged dependent variables. Journal of Econometrics. 1994;59:35–61.
14.Horowitz JL, Mammen E. Nonparametric estimation of an additive model with a link function. The Annals of Statistics. 2004;32:2412–2443.
15.Huang J, Kuo H, Koroleva I, Zhang C, Soares MB. A semi-linear model for normalization and analysis of cDNA microarray data. Tech Report 321, University of Iowa, Department of Statistics. 2003.
16.Huang J, Zhang C. Asymptotic analysis of a two-way semiparametric regression model for microarray data. Statistica Sinica. 2003; to appear.
17.Huang J, Wang D, Zhang C. A two-way semilinear model for normalization and analysis of cDNA microarray data. Journal of the American Statistical Association. 2005;471:814–829.
18.Kroll TC, Wölfl S. Ranking: a closer look on globalization methods for normalization of gene expression arrays. Nucleic Acids Research. 2002;30:e50. doi: 10.1093/nar/30.11.e50.
19.Lichtenberg FR. Estimation of the internal adjustment cost model using longitudinal establishment data. Review of Economics and Statistics. 1988;70:421–430.
20.Mack YP, Silverman BW. Weak and strong uniform consistency of kernel regression estimates. Z. Wahrsch. Verw. Gebiete. 1982;61:405–415.
21.Nguyen DV, Wang N, Carroll RJ. Evaluation of missing value estimation for microarray data. Journal of Data Science. 2004;2:347–370.
22.Schaid DJ. Case-parents design for gene-environment interaction. Genetic Epidemiology. 1999;16:261–273. doi: 10.1002/(SICI)1098-2272(1999)16:3<261::AID-GEPI3>3.0.CO;2-M.
23.Schena M, Shalon D, Davis RW, Brown PO. Quantitative monitoring of gene expression patterns with a complementary cDNA microarray. Science. 1995;270:467–470. doi: 10.1126/science.270.5235.467.
24.Tseng GC, Oh MK, Rohlin L, Liao JC, Wong WH. Issues in cDNA microarray analysis: quality filtering, channel normalization, models of variations and assessment of gene effects. Nucleic Acids Research. 2001;29:2549–2557. doi: 10.1093/nar/29.12.2549.
25.You J, Zhou Y, Zhou X. Series Estimation in Partially Linear In-slide Regression Models. Journal of the Royal Statistical Society, Ser B. 2005; submitted.
