Author manuscript; available in PMC: 2009 Oct 2.
Published in final edited form as: J Multivar Anal. 2008 Sep 1;99(8):1610–1634. doi: 10.1016/j.jmva.2008.01.013

A Two-Stage Approach for Semilinear In-Slide Models

Jinhong You 1, Haibo Zhou 1
PMCID: PMC2756113  NIHMSID: NIHMS67348  PMID: 19802362

Abstract

Semilinear in-slide models (SLIMs) have been shown to be an effective method for normalizing microarray data (Fan, et al. 2004). Using a backfitting method, Fan, Peng and Huang (2005) proposed a profile least squares (PLS) estimation for the parametric and nonparametric components. The general asymptotic properties of their estimator have not been developed. In this paper, we consider a new approach, two-stage estimation, which enables us to establish the asymptotic normality of both the parametric and nonparametric component estimators. We further propose a plug-in bandwidth selector using the asymptotic normality of the nonparametric component estimator. The proposed method allows for the modeling of the aggregated SLIMs case, where we can explicitly show that taking the aggregated information into account improves both the parametric and nonparametric component estimators under the proposed two-stage approach. Some simulation studies are conducted to illustrate the finite sample performance of the proposed procedures.

Key words and phrases: Semilinear regression, In-slide model, Two-stage estimation, Asymptotic normality, Aggregated information

1 Introduction

Microarray technology is an important tool for quantitatively monitoring gene expression patterns and has been widely used in functional genomics (see e.g. Schena et al., 1995; Brown and Botstein 1999). Since great variations in experimental conditions exist in the microarray process, it is essential to normalize the raw microarray data before any meaningful inference or analysis can be done. Useful normalization techniques include the global normalization method (e.g. Kroll and Wölfl 2002), the “lowess” method (e.g. Dudoit et al. 2002), and the rank-based procedure (e.g. Tseng et al. 2001). However, these normalization techniques generally require some restrictive biological assumptions. For example, the global normalization method assumes that there is no print-tip block effect and no intensity effect. Without such an assumption, the global normalization method would be statistically biased. The “lowess” method requires that the average expression levels of up- and down-regulated genes at each intensity level are about the same in each print-tip block. The rank-based procedure assumes that there are not many genes that are up-regulated (or down-regulated).

New statistical approaches have been sought to relax those restrictive biological assumptions. For example, two-way semilinear models have been proposed to normalize the microarray data (Huang, et al. 2003, Huang and Zhang 2003, Huang, Wang and Zhang 2005). This method does not make the usual assumptions underlying the existing methods mentioned above. The two-way semilinear model approach can also incorporate uncertainty due to normalization into significant analysis of microarrays.

Fan, et al. (2004) proposed a method to estimate the intensity and print-tip effects by aggregating information from the replications in a microarray. Let $G$ be the number of genes, $I_g$ be the number of replications of the $g$th gene, and $R_{gi}$ and $G_{gi}$ be the red (Cy5) and green (Cy3) intensities of the $g$th gene in the $i$th replication, respectively. Further, let $Y_{gi}$ be the log-intensity ratio of the red over green channels of the $g$th gene in the $i$th replication, and let $U_{gi}$ be the corresponding average of the log-intensities of the red and green channels. That is, $Y_{gi} = \log_2(R_{gi}/G_{gi})$ and $U_{gi} = \frac{1}{2}\log_2(R_{gi}G_{gi})$. The following semilinear model was proposed by Fan, et al. (2004) to fit the intensity and print-tip block effects:

$$Y_{gi} = \alpha_g + \beta_{r_{gi}} + \gamma_{c_{gi}} + m(U_{gi}) + \varepsilon_{gi}, \tag{1.1}$$

where $\alpha_g$ is the treatment effect associated with the $g$th gene, $r_{gi}$ and $c_{gi}$ are the row and column of the print-tip block where the $g$th gene of the $i$th replication resides, $\beta$ and $\gamma$ are the row and column effects with constraints $\sum_{i=1}^{r}\beta_i = 0$ and $\sum_{j=1}^{c}\gamma_j = 0$, where $r$ and $c$ are the numbers of rows and columns of the print-tip blocks, $m(\cdot)$ is a smooth function of $U$ representing the intensity effect, and the $\varepsilon_{gi}$'s are random errors with mean zero and variance $\sigma^2$.
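The two derived quantities entering model (1.1) can be computed directly from the raw channel intensities. The following is a minimal sketch; the function name `log_ratio_and_intensity` and the toy intensities are illustrative assumptions, not part of the original paper.

```python
import numpy as np

def log_ratio_and_intensity(red, green):
    """Return (Y, U) with Y = log2(R/G) and U = 0.5 * log2(R*G)."""
    red = np.asarray(red, dtype=float)
    green = np.asarray(green, dtype=float)
    y = np.log2(red / green)          # log-intensity ratio, red over green
    u = 0.5 * np.log2(red * green)    # average of the two log-intensities
    return y, u

# Toy intensities (hypothetical values chosen to give round numbers)
y, u = log_ratio_and_intensity([1024.0, 256.0], [256.0, 256.0])
```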

Using matrix notation, model (1.1) can be re-written as

$$Y = B\alpha + X\beta + M + \varepsilon, \qquad n = \sum_{g=1}^{G} I_g, \tag{1.2}$$

where $Y = (Y_1,\dots,Y_n)^T$ is the response, $B = \mathrm{blockdiag}(1_{I_1},\dots,1_{I_G})$ with $1_{I_g}$ being a vector of length $I_g$ with all elements equal to 1, $X = (X_1,\dots,X_n)^T$ is an $n \times p$ design matrix with $p$ being the sum of the numbers of rows and columns, $\alpha = (\alpha_1,\dots,\alpha_G)^T$ is the gene effect, $\beta = (\beta_1,\dots,\beta_r,\gamma_1,\dots,\gamma_c)^T$ is the print-tip block effect, $M = (m(U_1),\dots,m(U_n))^T$ is the intensity effect and $\varepsilon = (\varepsilon_1,\dots,\varepsilon_n)^T$ is the random error.

Model (1.2) can be viewed as an extension of the usual fixed-effects parametric model to the semiparametric context. Such a fixed-effects model is an appropriate specification if one is interested in a specific set of subjects, and it has been widely applied in econometric analysis (e.g. Lichtenberg 1988, Honoré 1994, Baltagi 1995, Entorf 1997).

For the case where $I_g \equiv I$, Baltagi and Li (2002) proposed difference-based series (DBS) estimators for β and m(·). They established the asymptotic normality of the former and derived the convergence rate of the latter. Fan, Peng and Huang (2005) proposed profile least squares (PLS) estimators for β and m(·) by combining the local linear, least squares and backfitting procedures. They established the asymptotic normality of the former and derived an upper bound for the mean squared error of the latter. You, Zhou, and Zhou (2005) proposed semiparametric least squares (SLE) estimators for β and m(·) by approximating the nonparametric component with a series. For the DBS, PLS and SLE estimators, it is not easy to establish the asymptotic normality of the nonparametric component estimators. The reason is that the DBS and SLE involve the series approximation and the PLS uses a backfitting procedure. This hinders the application of these estimators in practice, as it is difficult to select a bandwidth and to make inference on the nonparametric component. In addition, Baltagi and Li (2002) and You, Zhou, and Zhou (2005) only consider the non-aggregated model.

Real microarray data often have different numbers of replications reported, i.e. $I_g$ may not always be the same across different $g$. This structure may arise from the fact that different studies have different replication numbers or that, within the same study, uncontrollable experimental conditions such as image corruption, array fabrication error, etc., may lead to different $I_g$ for different $g$ (Golub et al. 1999, Alizadeh et al. 2000, Hendenfalk et al. 2001, Nguyen et al. 2004). The extension of model (1.2) to the case of unequal $I_g$ has not been developed.

In this paper, we describe a two-stage estimation procedure. In the first stage, the series approximating estimation is used to obtain the series estimates of the parametric and nonparametric components. In the second stage, we input the first-stage estimates and eliminate the nuisance parameters αg by difference. This transforms model (1.2) into an ordinary semilinear regression model. We then propose an ordinary profile least squares estimation for the parametric and nonparametric components, respectively. The asymptotic normalities of the proposed estimators are established. In particular, we show that the estimator of the parametric component achieves the semiparametric efficiency bound. We extend the two-stage estimate to the aggregated SLIMs case. Using the PLS estimation the aggregated information can only be used to improve the parametric components (Fan, Peng and Huang 2005). We explicitly demonstrate that under our two-stage estimation, the aggregated information can be used to improve both of the parametric and nonparametric component estimates.

The layout of the remainder of this paper is as follows. In Section 2 we describe the proposed two-stage estimation. In Section 3 we derive the asymptotic properties of the two-stage estimators. Extending the two-stage estimation to the aggregated SLIMs case is considered in Section 4. Section 5 presents results from numerical studies. Section 6 concludes. All proofs of main results are relegated to the Appendix.

2 A Two-Stage Procedure

Throughout this paper we assume that $G \to \infty$ and $2 \le I_g \le c$ for some fixed constant $c$. The two-stage estimation is as follows. In the first stage, the series approximation technique is used to obtain the series estimates of the parametric and nonparametric components, respectively. In the second stage, the first-stage estimates are plugged in and, by differencing, we eliminate the nuisance parameters $\alpha_g$ and transform model (1.2) into an ordinary semilinear regression model. The ordinary profile least squares and local polynomial estimates are then obtained for the parametric and nonparametric components, respectively.

Since $m(u)$ is a smooth function, it can be approximated by $\zeta^T(u)\vartheta$, where $\zeta(u) = (\zeta_{k_n 1}(u),\dots,\zeta_{k_n k_n}(u))^T$ is a vector of approximating functions, such as power series or B-splines, $\vartheta$ is an unknown $k_n$-variate constant vector and $k_n$ is a positive integer which depends on $n$. Thus, model (1.2) can be written as

$$Y = B\alpha + X\beta + \Xi\vartheta + \varepsilon^*, \tag{2.1}$$

where $\Xi$ is an $n \times k_n$ matrix with $i$-th row being $\zeta(U_i) = (\zeta_{k_n 1}(U_i),\dots,\zeta_{k_n k_n}(U_i))^T$, $\varepsilon^* = \varepsilon + M - \Xi\vartheta$ and $M = (m(U_1),\dots,m(U_n))^T$. Define $M_B = I_n - B(B^TB)^{-1}B^T$. Then pre-multiplying (2.1) by $M_B$ leads to

$$M_B Y = M_B X\beta + M_B\Xi\vartheta + M_B\varepsilon^*. \tag{2.2}$$

If we take $M_B\varepsilon^*$ as the residuals, model (2.2) is a version of the usual linear regression. By the usual “profile” or “partialing out” formula, the estimator of β can be written as

$$\tilde\beta_n = (X^T M_B M_{M_B\Xi} M_B X)^{-1} X^T M_B M_{M_B\Xi} M_B Y, \tag{2.3}$$

where $M_{M_B\Xi} = I_n - P_{M_B\Xi} = I_n - M_B\Xi(\Xi^T M_B\Xi)^{-}\Xi^T M_B$ and $A^{-}$ denotes any generalized inverse of the matrix $A$. An estimator of $\vartheta$ is

$$\tilde\vartheta_n = (\Xi^T M_B\Xi)^{-}\Xi^T M_B (Y - X\tilde\beta_n).$$

Then an obvious estimator of $m(u)$ is $\tilde m_n(u) = \zeta^T(u)\tilde\vartheta_n$, which is a nonparametric projection estimator. As in You, Zhou and Zhou (2005), we can establish the asymptotic normality of $\tilde\beta_n$. However, it is a great challenge to establish the asymptotic normality of $\tilde m_n(u)$. The lack of asymptotic normality of the nonparametric component estimator poses difficulties for bandwidth selection and hinders statistical inference. In the following we propose two-stage estimators for both the parametric and nonparametric components and establish the asymptotic normality of both.
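The first-stage formulas can be sketched numerically as follows. This is only an illustration under stated assumptions: the function name `first_stage`, the argument `groups` (a gene label per observation, standing in for the matrix B), and the use of the Moore-Penrose pseudo-inverse as the generalized inverse are choices made here, not prescribed by the paper. Multiplying by M_B is implemented as subtracting within-gene means.

```python
import numpy as np

def first_stage(Y, X, Xi, groups):
    """Series estimates (beta_tilde, theta_tilde) for Y = B a + X b + Xi th + e."""
    Y, X, Xi = (np.asarray(a, dtype=float) for a in (Y, X, Xi))
    _, idx = np.unique(np.asarray(groups).ravel(), return_inverse=True)
    counts = np.bincount(idx)

    def sweep(A):
        # M_B A with M_B = I - B (B'B)^{-1} B': remove within-gene means
        A2 = A.reshape(len(idx), -1)
        sums = np.zeros((len(counts), A2.shape[1]))
        np.add.at(sums, idx, A2)
        return (A2 - (sums / counts[:, None])[idx]).reshape(A.shape)

    Yc, Xc, Xic = sweep(Y), sweep(X), sweep(Xi)
    gram_inv = np.linalg.pinv(Xic.T @ Xic)      # (Xi' M_B Xi)^-

    def resid_maker(A):
        # M_{M_B Xi} A: residual after projecting on the columns of M_B Xi
        return A - Xic @ (gram_inv @ (Xic.T @ A))

    beta = np.linalg.pinv(Xc.T @ resid_maker(Xc)) @ (Xc.T @ resid_maker(Yc))
    theta = gram_inv @ (Xic.T @ (Yc - Xc @ beta))
    return beta, theta

# Noise-free toy data (hypothetical): the estimates should recover the truth
rng = np.random.default_rng(0)
groups = np.repeat(np.arange(30), 3)
X = rng.normal(size=(90, 2))
U = rng.uniform(size=90)
Xi = np.column_stack([U, U**2])
Y = rng.normal(size=30)[groups] + X @ np.array([1.0, -2.0]) + Xi @ np.array([0.5, 3.0])
beta_tilde, theta_tilde = first_stage(Y, X, Xi, groups)
```

Because a constant column in Ξ is annihilated by M_B, the pseudo-inverse is used so that the sketch still runs when the basis contains an intercept; in that case the level of m(·) is not separately identified from the gene effects.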

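For convenience, let $\iota(g,i) = \sum_{g_1=1}^{g-1} I_{g_1} + i$ and

$$Q(g,i) = (I_g-1)^{-1}\sum_{i_1=1,\, i_1\ne i}^{I_g}\left(Y_{\iota(g,i_1)} - X_{\iota(g,i_1)}^T\tilde\beta_n - \tilde m_n(U_{\iota(g,i_1)})\right)$$

with $g = 1,\dots,G$ and $i = 1,\dots,I_g$. Subtracting $Q(g,i)$ from both sides of model (1.2), we have

$$Y_{\iota(g,i)} - Q(g,i) = X_{\iota(g,i)}^T\beta + m(U_{\iota(g,i)}) + \varepsilon_{\iota(g,i)} + \alpha_g - Q(g,i).$$

According to Lemma 1 and Lemma 2 in the appendix, we have

$$\alpha_g - Q(g,i) = -\frac{1}{I_g-1}\sum_{i_1=1,\, i_1\ne i}^{I_g}\varepsilon_{\iota(g,i_1)} + O_P\left(k_n/\sqrt{n} + k_n^{-3/2}\right).$$

Therefore, if we denote $Y^*_{\iota(g,i)} = Y_{\iota(g,i)} - Q(g,i)$, we have

$$Y^*_{\iota(g,i)} = X_{\iota(g,i)}^T\beta + m(U_{\iota(g,i)}) + \varepsilon_{\iota(g,i)} - \frac{1}{I_g-1}\sum_{i_1=1,\, i_1\ne i}^{I_g}\varepsilon_{\iota(g,i_1)} + O_p\left(k_n/\sqrt{n} + k_n^{-3/2}\right). \tag{2.4}$$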
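The differencing step that removes the gene effect amounts to subtracting, for each observation, the mean first-stage residual of the other replicates of the same gene. A minimal sketch follows; the function name `leave_one_out_adjust` and the argument `fitted` (standing for the first-stage fit of the parametric and nonparametric parts) are assumptions made for illustration.

```python
import numpy as np

def leave_one_out_adjust(Y, fitted, groups):
    """Return Y* = Y - Q, where Q(g, i) averages the first-stage residuals
    of the other replicates of gene g (all replicates except i)."""
    Y = np.asarray(Y, dtype=float)
    resid = Y - np.asarray(fitted, dtype=float)   # Y - X beta_tilde - m_tilde(U)
    _, idx = np.unique(np.asarray(groups).ravel(), return_inverse=True)
    counts = np.bincount(idx)
    if counts.min() < 2:
        raise ValueError("every gene needs at least two replicates (I_g >= 2)")
    sums = np.bincount(idx, weights=resid)        # per-gene residual totals
    Q = (sums[idx] - resid) / (counts[idx] - 1)   # leave-one-out mean
    return Y - Q

# Toy example (hypothetical numbers): two genes with 3 and 2 replicates
y_star = leave_one_out_adjust([1.0, 2.0, 3.0, 10.0, 20.0],
                              np.zeros(5), [0, 0, 0, 1, 1])
```

The check on `counts` mirrors the requirement that every gene has at least two replicates, since otherwise no other replicate is available to estimate the gene effect.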

It is easy to see that (2.4) is an ordinary semilinear regression model. The ordinary profile least squares and local polynomial estimations can be used to estimate β and m(·). The details are as follows. For any given β, (2.4) can be written as

$$Y^*_{\iota(g,i)} - X_{\iota(g,i)}^T\beta = m(U_{\iota(g,i)}) + \varepsilon^{**}_{\iota(g,i)}, \quad g = 1,\dots,G,\ i = 1,\dots,I_g, \tag{2.5}$$

where $\varepsilon^{**}_{\iota(g,i)} = \varepsilon_{\iota(g,i)} + \alpha_g - Q(g,i)$. This transforms the semilinear regression model into the usual nonparametric model. Applying a local linear regression technique in a small neighborhood of $u_0$, one can approximate $m(u)$ locally by a linear function

$$m(u) \approx m(u_0) + m'(u_0)(u - u_0) \equiv a + b(u - u_0)$$

with $m'(u) = \partial m/\partial u$. This leads to the following weighted local least squares problem: find $a$, $b$ to minimize

$$\sum_{g=1}^{G}\sum_{i=1}^{I_g}\left\{Y^*_{\iota(g,i)} - X_{\iota(g,i)}^T\beta - a - b(U_{\iota(g,i)} - u_0)\right\}^2 K_h(U_{\iota(g,i)} - u_0), \tag{2.6}$$

where $K(\cdot)$ is a kernel function, $h$ is a bandwidth and $K_h(\cdot) = K(\cdot/h)/h$. The solution to minimizing the sum in (2.6) is given by

$$(\hat a(u), h\hat b(u))^T = (D_u^T W_u D_u)^{-1} D_u^T W_u (Y^* - X\beta), \tag{2.7}$$

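where

$$X = \begin{pmatrix} X_1^T \\ \vdots \\ X_n^T \end{pmatrix} = \begin{pmatrix} X_{11} & \cdots & X_{1p} \\ \vdots & \ddots & \vdots \\ X_{n1} & \cdots & X_{np} \end{pmatrix}, \qquad D_u = \begin{pmatrix} 1 & (U_1 - u)/h \\ \vdots & \vdots \\ 1 & (U_n - u)/h \end{pmatrix},$$

and

$$W_u = \mathrm{diag}(K_h(U_1 - u),\dots,K_h(U_n - u)).$$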
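The local linear solution can be written as a weight vector applied to the response. The sketch below builds the design matrix and kernel weights at a single point and returns the row (1,0)(DᵀWD)⁻¹DᵀW; the function names and the choice of the Epanechnikov kernel (a symmetric density with compact support) are illustrative assumptions.

```python
import numpy as np

def epanechnikov(t):
    """Symmetric density with compact support [-1, 1]."""
    return np.where(np.abs(t) <= 1.0, 0.75 * (1.0 - t**2), 0.0)

def local_linear_weights(U, u, h, kernel=epanechnikov):
    """Row vector (1,0)(D'WD)^{-1} D'W, so that a_hat(u) = weights @ response."""
    U = np.asarray(U, dtype=float)
    t = (U - u) / h
    D = np.column_stack([np.ones_like(t), t])   # rows (1, (U_i - u)/h)
    w = kernel(t) / h                           # K_h(U_i - u)
    DtW = D.T * w                               # D' W without forming diag(w)
    return np.linalg.solve(DtW @ D, DtW)[0]

# A local linear fit reproduces a straight line exactly
U = np.linspace(0.0, 1.0, 21)
weights = local_linear_weights(U, u=0.4, h=0.3)
a_hat = weights @ (2.0 + 3.0 * U)
```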

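Replacing m(·) by â(·) in (2.5) results in the following model

$$\hat Y^*_{\iota(g,i)} = \hat X_{\iota(g,i)}^T\beta + \varepsilon^{***}_{\iota(g,i)}, \quad g = 1,\dots,G \text{ and } i = 1,\dots,I_g, \tag{2.8}$$

where

$$\hat Y^*_{\iota(g,i)} = Y^*_{\iota(g,i)} - (1,0)\left(D_{U_{\iota(g,i)}}^T W_{U_{\iota(g,i)}} D_{U_{\iota(g,i)}}\right)^{-1} D_{U_{\iota(g,i)}}^T W_{U_{\iota(g,i)}} Y^*,$$

$$\hat X_{\iota(g,i)} = X_{\iota(g,i)} - (1,0)\left(D_{U_{\iota(g,i)}}^T W_{U_{\iota(g,i)}} D_{U_{\iota(g,i)}}\right)^{-1} D_{U_{\iota(g,i)}}^T W_{U_{\iota(g,i)}} X$$

and $\varepsilon^{***}_{\iota(g,i)} = \varepsilon^{**}_{\iota(g,i)} + \bar m(U_{\iota(g,i)}) - \bar\varepsilon^{**}_{\iota(g,i)}$ with

$$\bar m(U_{\iota(g,i)}) = m(U_{\iota(g,i)}) - (1,0)\left(D_{U_{\iota(g,i)}}^T W_{U_{\iota(g,i)}} D_{U_{\iota(g,i)}}\right)^{-1} D_{U_{\iota(g,i)}}^T W_{U_{\iota(g,i)}} M,$$

$$\bar\varepsilon^{**}_{\iota(g,i)} = (1,0)\left(D_{U_{\iota(g,i)}}^T W_{U_{\iota(g,i)}} D_{U_{\iota(g,i)}}\right)^{-1} D_{U_{\iota(g,i)}}^T W_{U_{\iota(g,i)}} \varepsilon^{**},$$

$M = (m(U_1),\dots,m(U_n))^T$ and $\varepsilon^{**} = (\varepsilon^{**}_1,\dots,\varepsilon^{**}_n)^T$.

Taking $\varepsilon^{***}_{\iota(g,i)}$ as residuals and applying the least squares method to (2.8), we obtain a two-stage estimator of β as

$$\hat\beta_n = (\hat X^T\hat X)^{-1}\hat X^T\hat Y^*, \tag{2.9}$$

where, with $I_n$ denoting the $n \times n$ identity matrix,

$$S = \begin{pmatrix} (1,0)(D_{U_1}^T W_{U_1} D_{U_1})^{-1} D_{U_1}^T W_{U_1} \\ \vdots \\ (1,0)(D_{U_n}^T W_{U_n} D_{U_n})^{-1} D_{U_n}^T W_{U_n} \end{pmatrix}, \quad Y^* = \begin{pmatrix} Y_1^* \\ \vdots \\ Y_n^* \end{pmatrix}, \quad \hat X = (I_n - S)X, \quad \hat Y^* = (I_n - S)Y^*.$$

Correspondingly, a two-stage estimator of m(·) is

$$\hat m_n(u) = (1,0)(D_u^T W_u D_u)^{-1} D_u^T W_u (Y^* - X\hat\beta_n). \tag{2.10}$$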
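The second stage can be sketched in a few lines once the adjusted response is available. The code below forms the smoother matrix S row by row, computes the profile least squares estimate, and returns a function for the local linear estimate of m. The function names, the Gaussian kernel (used in the simulation section, although the theory assumes compact support), and the toy data are assumptions for illustration only.

```python
import numpy as np

def _ll_weights(U, u, h):
    """(1,0)(D'WD)^{-1} D'W at the point u, with a Gaussian kernel."""
    t = (U - u) / h
    w = np.exp(-0.5 * t**2) / (h * np.sqrt(2.0 * np.pi))   # K_h(U_i - u)
    D = np.column_stack([np.ones_like(t), t])
    DtW = D.T * w
    return np.linalg.solve(DtW @ D, DtW)[0]

def second_stage(Y_star, X, U, h):
    """Profile least squares beta_hat and the local linear estimator m_hat."""
    Y_star, X, U = (np.asarray(a, dtype=float) for a in (Y_star, X, U))
    S = np.vstack([_ll_weights(U, u, h) for u in U])       # n x n smoother
    X_hat = X - S @ X                                      # (I - S) X
    Y_hat = Y_star - S @ Y_star                            # (I - S) Y*
    beta_hat = np.linalg.solve(X_hat.T @ X_hat, X_hat.T @ Y_hat)

    def m_hat(u):
        return _ll_weights(U, u, h) @ (Y_star - X @ beta_hat)

    return beta_hat, m_hat

# Noise-free toy data with a linear m, which the smoother reproduces exactly
rng = np.random.default_rng(1)
U = rng.uniform(size=60)
X = rng.normal(size=(60, 2))
Y_star = X @ np.array([0.5, -1.0]) + 1.0 + 2.0 * U
beta_hat, m_hat = second_stage(Y_star, X, U, h=0.2)
```

Forming the full n × n matrix S is convenient for a sketch but costs O(n²) memory; for large arrays one would evaluate the smoother on a grid or in blocks.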

The error variance $\sigma^2 = \mathrm{Var}(\varepsilon_1)$ is the quantity that describes the noise level. Apart from its intrinsic interest as a parameter of the model, its estimation is essential in constructing confidence regions, model-based tests, model selection procedures, signal-to-noise ratio determination, and so on. We propose an estimate of $\sigma^2$ as follows

$$\hat\sigma_n^2 = \frac{1}{n + \sum_{g=1}^{G} I_g/(I_g - 1)}\,(Y^* - X\hat\beta_n - \hat M)^T (Y^* - X\hat\beta_n - \hat M).$$

In the next section, we will establish the asymptotic properties of $\hat\beta_n$, $\hat m_n(\cdot)$ and $\hat\sigma_n^2$.

3 Asymptotic Normality of the Two-Stage Estimators

To present the asymptotic properties of $\hat\beta_n$, $\hat m_n(\cdot)$ and $\hat\sigma_n^2$, we make the following assumptions.

Assumption 1

(Xi,Ui, εi) are independent and identically distributed as (X1,U1, ε1).

Assumption 2

(i) For every $k_n$ there is a nonsingular matrix $M$ such that, for $M\zeta(u)$, the smallest eigenvalue of $E[M(\zeta(U_1) - E\zeta(U_1))]^{\otimes 2}$ is bounded away from zero uniformly in $k_n$.

(ii) There is a sequence of constants $\delta_0(k_n)$ satisfying $\sup_{u\in\mathcal{U}}\|M\zeta(u)\| \le \delta_0(k_n)$, and $k_n$ satisfies $(\delta_0(k_n))^2 k_n/n \to 0$ as $n \to \infty$, where $\mathcal{U}$ is the support of $U_1$ and, for a matrix $A$, $\|A\| = \{\mathrm{tr}(AA^T)\}^{1/2}$ denotes the Euclidean norm of $A$.

Assumption 3

(i) $m(u)$ and $h_j(u) = E(X_{j1} \mid U_1 = u)$ are twice continuously differentiable on $\mathcal{U}$, where $j = 1,\dots,p$.

(ii) For $m(u)$ or $h_j(u)$, $j = 1,\dots,p$, there exists $\vartheta = (\vartheta_1,\dots,\vartheta_{k_n})^T$ such that $\sup_{u\in\mathcal{U}}|g(u) - \vartheta^T\zeta(u)| = O(k_n^{-2})$ with $g(u) = m(u)$ or $h_j(u)$.

(iii) $k_n = c_k n^{4/15+\nu}$ for some constant $c_k$ satisfying $0 < c_k < \infty$ and some $\nu$ satisfying $0 \le \nu < 1/30$.

Assumption 4

The function K(·) is a symmetric density function with compact support.

Assumption 5

$h = c_h n^{-1/5}$ for some constant $c_h$ satisfying $0 < c_h < \infty$.

Remark 1

Assumption 2 is a standard assumption in series estimation methods. Assumption 3 says that the uniform approximation error to the function shrinks at the rate $k_n^{-2}$. Assumption 2 and Assumption 3 are not the weakest possible conditions, but it is known that many series functions satisfy them, e.g. power series and splines. Assumption 4 and Assumption 5 are standard assumptions used in kernel or local polynomial estimation.

Under the above assumptions, the following theorems provide the asymptotic properties of $\hat\beta_n$, $\hat m_n(\cdot)$ and $\hat\sigma_n^2$.

Theorem 1

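Suppose that Assumption 1 to Assumption 5 hold. Then it holds that

$$\sqrt{n}(\hat\beta_n - \beta) \to_D N\left(0, \left\{\lim_{n\to\infty}\frac{1}{n}\sum_{g=1}^{G} I_g^2 (I_g - 1)^{-1}\right\}\sigma^2\Sigma^{-1}\right) \quad \text{as } n \to \infty,$$

where $\Sigma = E(\Pi_1\Pi_1^T)$ and $\Pi_1 = X_1 - E(X_1 \mid U_1)$.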

Theorem 2

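Suppose that Assumption 1 to Assumption 5 hold. Then it holds that

$$\sqrt{nh}\left[\hat m_n(u) - m(u) - \frac{h^2}{2}\,\frac{\mu_2^2 - \mu_1\mu_3}{\mu_2 - \mu_1^2}\,m''(u)\right] \to_D N(0, \zeta(u)) \quad \text{as } n \to \infty,$$

provided that $p(u) \ne 0$, where $\mu_j = \int u^j K(u)\,du$, $\nu_j = \int u^j K^2(u)\,du$,

$$\zeta(u) = \left\{\lim_{n\to\infty} n^{-1}\sum_{g=1}^{G} I_g^2/(I_g - 1)\right\}\frac{\sigma^2(\alpha_0^2\nu_0 + 2\alpha_0\alpha_1\nu_1 + \alpha_1^2\nu_2)}{p(u)},$$

with $\alpha_0 = \mu_2/(\mu_2 - \mu_1^2)$ and $\alpha_1 = -\mu_1/(\mu_2 - \mu_1^2)$, and $p(\cdot)$ is the density function of $U_1$.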
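The kernel constants in the bias and variance of the nonparametric estimator can be evaluated numerically for a given kernel. The sketch below does this by Gauss-Legendre quadrature for a kernel supported on [-1, 1]. The function name `kernel_constants` is hypothetical, and the sign convention for α1 (taken here as −μ1/(μ2 − μ1²), the standard local linear equivalent-kernel form) is an assumption; it is immaterial for symmetric kernels, where μ1 = 0.

```python
import numpy as np

def kernel_constants(kernel, n_nodes=20):
    """Return (bias constant, variance constant) for a kernel on [-1, 1].

    bias constant     = (mu2^2 - mu1*mu3) / (mu2 - mu1^2)
    variance constant = a0^2*nu0 + 2*a0*a1*nu1 + a1^2*nu2
    """
    x, w = np.polynomial.legendre.leggauss(n_nodes)   # quadrature on [-1, 1]
    K = kernel(x)
    mu = [np.sum(w * x**j * K) for j in range(4)]      # mu_0, ..., mu_3
    nu = [np.sum(w * x**j * K**2) for j in range(3)]   # nu_0, nu_1, nu_2
    denom = mu[2] - mu[1] ** 2
    a0, a1 = mu[2] / denom, -mu[1] / denom
    bias_const = (mu[2] ** 2 - mu[1] * mu[3]) / denom
    var_const = a0**2 * nu[0] + 2 * a0 * a1 * nu[1] + a1**2 * nu[2]
    return bias_const, var_const

# Epanechnikov kernel: the constants reduce to mu_2 = 1/5 and nu_0 = 3/5
bias_const, var_const = kernel_constants(lambda t: 0.75 * (1.0 - t**2))
```

For a polynomial kernel such as the Epanechnikov kernel the quadrature is exact, so the two constants equal 1/5 and 3/5 up to rounding.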

Remark 2

According to Theorem 1, when $I_g \equiv I$ the asymptotic covariance matrix of $\hat\beta_n$ reduces to $\{I/(I-1)\}\sigma^2\Sigma^{-1}$, i.e. the semiparametric efficiency bound (Fan, Peng and Huang 2005).

Theorem 3

Suppose that Assumption 1 to Assumption 5 hold. If $E\varepsilon_1^4 < \infty$ holds, then

$$\sqrt{n}(\hat\sigma_n^2 - \sigma^2) \to_D N(0, \kappa) \quad \text{as } n \to \infty,$$

where κ=θ10E(ε14)θ20σ4withτ(n)=n/{n+g=1GIg/(Ig1)2},

θ10=limnτ(n)g=1G{1+1(Ig1)2+2(Ig1)}Ig

and

θ20=limnτ(n)g=1G1(Ig1)3(Ig4+2Ig3+6Ig2+Ig).

Further, we define

Σ^n=g=1GIg2(Ig1)1σ^n2(X^TX^)1,ψ^ι(g,i)=(Yι(g,i)Xι(g,i)Tβ^nm^n(Uι(g,i)))(Yι(g,i1)Xι(g,i1)Tβ^nm^n(Uι(g,i1)))

for g = 1,…,G, i = 2,…, Ig,

θ1=τ(n)g=1G{1+1(Ig1)2+2(Ig1)}Ig,θ2=τ(n)g=1G1(Ig1)3(Ig4+2Ig3+6Ig2+Ig),θ3=g=1G(4Ig2),θ4=g=1G{(Ig1)(Ig+2)+4Ig}andκ^n=θ1/θ3g=1Gi=1Igψ^ι(g,i)4+{θ2(θ1θ4)/θ3}σ^n4

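The next theorem shows that $\hat\Sigma_n$ and $\hat\kappa_n$ are consistent estimators of $\{\lim_{n\to\infty} n^{-1}\sum_{g=1}^{G} I_g^2 (I_g - 1)^{-1}\}\sigma^2\Sigma^{-1}$ and $\kappa$, respectively.

Theorem 4

Suppose that Assumption 1 to Assumption 5 hold. If $E\varepsilon_1^4 < \infty$ holds, then

$$\hat\Sigma_n \to_p \left\{\lim_{n\to\infty} n^{-1}\sum_{g=1}^{G} I_g^2 (I_g - 1)^{-1}\right\}\sigma^2\Sigma^{-1} \quad \text{and} \quad \hat\kappa_n \to_p \kappa \quad \text{as } n \to \infty.$$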

4 Two-stage Estimation for the Aggregated SLIM

So far, the intensity effect and the gene effect have been estimated using the information within one slide. Therefore, the arrays are allowed to have different gene effects, namely, $\alpha_g$ can be slide-dependent. This is reasonable when samples are drawn from different subjects. However, in many practical situations, the samples may come from the same subject. In those cases, it is natural to assume that the gene effects are the same across arrays and that the information from other arrays can be aggregated. This assumption is helpful for improving the precision and for assessing the quality of an array using the coefficient of variation (Tseng, et al. 2001). Therefore, Fan, Peng and Huang (2005) further proposed an aggregated SLIM. This kind of aggregation idea also appeared in the work of Huang, Wang and Zhang (2003) for a very different semiparametric model. The aggregated SLIM is defined as

$$Y_{ij} = B_{ij}^T\alpha + X_{ij}^T\beta_j + m_j(U_{ij}) + \varepsilon_{ij}, \quad i = 1,\dots,n,\ j = 1,\dots,J, \tag{4.1}$$

where Yj=(Y1j,…,Ynj)T, Bj = (B1j , …,Bnj)T , Xj = (X1j ,… ,Xnj)T , Uj = (U1j ,…,Unj)T , α = (α1, …, αG)T , βj = (β1j , …, βpj j)T and εj = (ε1j ,…, εnj)T.

Fan, Peng and Huang (2005) proposed an aggregated profile least squares (APLS) estimator for $\beta = (\beta_1^T,\dots,\beta_J^T)^T$ and described an estimation procedure for the nonparametric components. We here propose an aggregated two-stage procedure.

4.1 Estimating the parametric component

We will investigate two cases. One is that $X_{ij_1}$ and $X_{ij_2}$ are independent and the other is that $X_{ij_1}$ and $X_{ij_2}$ are dependent, where $j_1 \ne j_2$.

Case 1

Suppose that $\tilde\beta_{jn}$ and $\tilde m_{jn}(\cdot)$ are the series estimators of $\beta_j$ and $m_j(\cdot)$, respectively, which are based on the individual equations. Let

ι(g,i),1=Yι(g,i),jXι(g,i),jTβ˜jnm˜jn(Uι(g,i),j).

For fixed j, if subtracting

(IgJ1)1{i1=1,i1iIgι(g,i1),j+j1=1,j1jJi1=1Igι(g,i1),j1}

from the two sides of model (4.1) we have

Yι(g,i),j1IgJ1{i1=1,i1iIgι(g,i1),j+j1=1,j1jJi1=1Igι(g,i1),j1}=Xι(g,i),jTβj+m(Uι(g,i),j)+ει(g,i),j+αg1IgJ1{i1=1,i1iIgι(g,i1),j+j1=1,j1jJi1=1Igι(g,i1),j1}=Xι(g,i),jTβj+m(Uι(g,i),j)+ει(g,i),j1IgJ1{i1=1,i1iIgει(g,i1),j+j1=1,j1jJi1=1Igει(g,i1),j}+Op(max1jJkjn/n+max1jJkjn3/2).

Therefore, applying the usual profile least squares estimation we can obtain an aggregated two-stage estimator of $\beta_j$ as

$$\hat\beta^{A}_{jn(1)} = (\hat X_j^T\hat X_j)^{-1}\hat X_j^T\hat Y^*_{j(1)},$$

where Sj, j have the same definitions as S and , the ι(g, i)th element of Yj(1)*isYι(g,i),j(IgJ1)1{i1=1,i1iIgι(g,i1),j+j1=1,j1jJi1=1Igι(g,i1),j1}andY^j(1)*=(InSj)Yj(1)*.

Case 2

For fixed j, if subtracting {Ig(J1)}1j1=1Ji1=1,i1iIgι(g,i1),j1. from the two sides of model (4.1) we have

Yι(g,i),j1Ig(J1)j1=1Ji1=1,i1iIgι(g,i1),j1=Xι(g,i),jTβj+m(Uι(g,i),j)+ει(g,i),j1Ig(J1)j1=1Ji1=1,i1iIgει(g,i1),j1+Op(max1jJkjn/n+max1jJkjn3/2).

Therefore, applying the usual profile least squares estimation we can obtain an aggregated two-stage estimator of $\beta_j$ as

$$\hat\beta^{A}_{jn(2)} = (\hat X_j^T\hat X_j)^{-1}\hat X_j^T\hat Y^*_{j(2)},$$

where the ι(g, i)th element of Yj(2)*isYι(g,i),j{Ig(J1)}1j1=1Ji1=1,i1iIgι(g,i1),j1.

For $\hat\beta^{A}_{jn(1)}$ and $\hat\beta^{A}_{jn(2)}$ we have the following asymptotic properties.

Theorem 5

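Under some regularity conditions (same as Assumption 1 to Assumption 5) it holds that

$$\sqrt{n}\left(\hat\beta^{A}_{jn(1)} - \beta_j\right) \to_D N\left(0, \lim_{n\to\infty} n^{-1}\sum_{g=1}^{G}\frac{J I_g^2}{J I_g - 1}\,\sigma^2\Sigma_j^{-1}\right) \quad \text{as } n \to \infty,$$

where $\Sigma_j = E(\Pi_{1j}\Pi_{1j}^T)$ and $\Pi_{1j} = X_{1j} - E(X_{1j} \mid U_{1j})$.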

Theorem 6

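Under some regularity conditions (same as Assumption 1 to Assumption 5) it holds that

$$\sqrt{n}\left(\hat\beta^{A}_{jn(2)} - \beta_j\right) \to_D N\left(0, \lim_{n\to\infty} n^{-1}\sum_{g=1}^{G}\frac{J(I_g - 1) + 1}{J(I_g - 1)}\,\sigma^2\Sigma_j^{-1}\right) \quad \text{as } n \to \infty,$$

where $\Sigma_j$ is defined in Theorem 5.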

Remark 3

From Theorems 5 and 6 we can see that the aggregated information can be used to improve the two-stage estimators of the parametric components, and that the degree of improvement depends on whether $X_{ij_1}$ and $X_{ij_2}$ are independent or dependent. Moreover, when $I_g \equiv I$, $\lim_{n\to\infty} n^{-1}\sum_{g=1}^{G} J I_g^2/(J I_g - 1)$ reduces to $JI/(JI - 1)$. Thus, according to Fan, Peng and Huang (2005), our aggregated two-stage estimator has the same asymptotic covariance as that of the aggregated PLS estimator.
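The gain from aggregation in Case 1 can be seen by evaluating the finite-sample version of the factor that multiplies σ²Σ⁻¹ in Theorems 1 and 5. The following sketch uses a hypothetical helper `variance_factor`; setting J = 1 gives the single-slide factor of Theorem 1.

```python
import numpy as np

def variance_factor(I, J=1):
    """n^{-1} * sum_g J*I_g^2 / (J*I_g - 1), with n = sum_g I_g.

    J = 1 corresponds to a single slide (Theorem 1);
    J > 1 to the aggregated estimator in Case 1 (Theorem 5).
    """
    I = np.asarray(I, dtype=float)
    return np.sum(J * I**2 / (J * I - 1.0)) / np.sum(I)

single = variance_factor([3] * 10)            # I/(I-1) with I = 3
aggregated = variance_factor([3] * 10, J=4)   # JI/(JI-1) with JI = 12
```

With three replicates per gene the factor drops from 3/2 on a single slide to 12/11 when four slides are aggregated, and it approaches 1 as J grows.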

4.2 Estimating the nonparametric components

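We propose an aggregated local linear estimator of $m_j(\cdot)$ for Cases 1 and 2. In Case 1, it has the form

$$\hat m^{A}_{jn(1)}(u) = (1,0)(D_{ju}^T W_{ju} D_{ju})^{-1} D_{ju}^T W_{ju}\left(Y^*_{j(1)} - X_j\hat\beta^{A}_{jn(1)}\right).$$

In Case 2, it has the form

$$\hat m^{A}_{jn(2)}(u) = (1,0)(D_{ju}^T W_{ju} D_{ju})^{-1} D_{ju}^T W_{ju}\left(Y^*_{j(1)} - X_j\hat\beta^{A}_{jn(2)}\right).$$

For $\hat m^{A}_{jn(1)}(u)$ and $\hat m^{A}_{jn(2)}(u)$, we have the following asymptotic properties.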

Theorem 7

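Under some regularity conditions (same as Assumption 1 to Assumption 5) it holds that

$$\hat m^{A}_{jn(1)}(u) - \hat m^{A}_{jn(2)}(u) = o_p\left\{h^2 + \frac{1}{\sqrt{nh}}\right\}.$$

Further,

$$\sqrt{nh}\left[\hat m^{A}_{jn(1)}(u) - m_j(u) - \frac{h^2}{2}\,\frac{\mu_2^2 - \mu_1\mu_3}{\mu_2 - \mu_1^2}\,m_j''(u)\right] \to_D N(0, \zeta_j^{A}(u)) \quad \text{as } n \to \infty,$$

provided that $p_j(u) \ne 0$, where

$$\zeta_j^{A}(u) = \left\{\lim_{n\to\infty} n^{-1}\sum_{g=1}^{G}\frac{J I_g^2}{J I_g - 1}\right\}\frac{\sigma^2(\alpha_0^2\nu_0 + 2\alpha_0\alpha_1\nu_1 + \alpha_1^2\nu_2)}{p_j(u)},$$

and $p_j(u)$ is the density function of $U_{1j}$.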

Remark 4

From Theorem 7, we can see that taking the aggregated information into account can improve the estimate of the nonparametric component as well.

5 Simulation Studies

In this section, we conduct some simulations to show the finite sample performance of the estimators proposed in the preceding sections. In order to compare our estimators with those in Fan, Peng and Huang (2005), we take Example 1 of Fan, Peng and Huang (2005).

Example 1

We select G = 100, 200, 400, 800 and I = 2, 3, 4. For each pair of (G, I), we simulate 200 datasets from model (1.2). The details of simulation scheme for this example are as follows:

  • αg: The expression levels of the genes are generated from the standard double-exponential distribution.

  • β: For the row effects, first generate $\{\beta'_i, i = 1,\dots,4\}$ from N(0, 0.5), then set $\beta_i = \beta'_i - \bar\beta'$, which guarantees that $\sum_{i=1}^{4}\beta_i = 0$. The column effects are generated in the same way.

  • U: The intensity is generated from a mixture distribution. We generate $u$ from the density $0.0004(u-6)^3 I(6 < u < 16)$ with probability 0.7 and from the uniform distribution over [6, 16] with probability 0.3.

  • m(·): Set the function $m(u) = 5(\sin(u) - 0.2854)$, whose expectation is 0.

  • X: For each given gene, its associated block is assigned at random at one of 32 print-tip blocks.

  • ε: εgi is generated from the standard normal distribution.
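The simulation scheme listed above can be sketched as a data generator. Several details are assumptions made here because the text does not fix them: N(0, 0.5) is read as variance 0.5, the 32 print-tip blocks are laid out as 4 rows by 8 columns, and a block is drawn independently for every observation rather than once per gene. The function name `simulate_slim` is hypothetical.

```python
import numpy as np

def simulate_slim(G, I, rng):
    """Draw one dataset from model (1.2) following the scheme of Example 1."""
    n = G * I
    groups = np.repeat(np.arange(G), I)
    alpha = rng.laplace(loc=0.0, scale=1.0, size=G)   # standard double exponential

    def centred_effects(k):
        b = rng.normal(0.0, np.sqrt(0.5), size=k)     # assumes 0.5 is a variance
        return b - b.mean()                           # effects sum to zero

    row, col = centred_effects(4), centred_effects(8) # assumed 4 x 8 block layout
    block_r = rng.integers(0, 4, size=n)              # assumed: drawn per observation
    block_c = rng.integers(0, 8, size=n)

    # The density 0.0004 (u-6)^3 on (6, 16) has CDF 0.0001 (u-6)^4,
    # so inverse-CDF sampling gives u = 6 + 10 * v**(1/4) for v ~ U(0, 1).
    from_poly = rng.random(n) < 0.7
    U = np.where(from_poly,
                 6.0 + 10.0 * rng.random(n) ** 0.25,
                 rng.uniform(6.0, 16.0, size=n))

    m = 5.0 * (np.sin(U) - 0.2854)
    Y = alpha[groups] + row[block_r] + col[block_c] + m + rng.normal(size=n)
    return Y, U, groups, block_r, block_c

Y, U, groups, block_r, block_c = simulate_slim(G=200, I=4, rng=np.random.default_rng(2))
```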

For the proposed estimation, in the first stage, we use a cubic B-spline basis function defined by

$$\zeta(u \mid u_0,\dots,u_4) = \frac{1}{3!}\sum_{j=0}^{4}(-1)^j\binom{4}{j}\left[\max(0, u - u_j)\right]^3,$$

where $u_0,\dots,u_4$ are the evenly-spaced design knots. In the second stage, we take the Gaussian kernel, i.e.

$$K_h(u) = \frac{1}{h\sqrt{2\pi}}\exp\{-u^2/(2h^2)\},$$

and the bandwidth is selected by the plug-in method. The performance of the estimators is assessed by the mean squared errors (MSEs). The results are summarized in Table 1 and Figure 1.
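The basis function and kernel used in the simulations can be coded directly from their formulas. The sketch below evaluates the truncated-power form of the cubic B-spline for five given knots and the scaled Gaussian kernel; the function names are illustrative, and unit-spaced knots are used in the example so that the standard cardinal B-spline values are recovered.

```python
import numpy as np
from math import comb, factorial

def cubic_bspline(u, knots):
    """(1/3!) * sum_j (-1)^j C(4, j) max(0, u - u_j)^3 for five knots u_0..u_4."""
    u = np.asarray(u, dtype=float)
    total = np.zeros_like(u)
    for j, knot in enumerate(knots):
        total += (-1) ** j * comb(4, j) * np.maximum(0.0, u - knot) ** 3
    return total / factorial(3)

def gaussian_kh(u, h):
    """Scaled Gaussian kernel K_h(u) = exp(-u^2 / (2 h^2)) / (h sqrt(2 pi))."""
    u = np.asarray(u, dtype=float)
    return np.exp(-u**2 / (2.0 * h**2)) / (h * np.sqrt(2.0 * np.pi))

# With unit-spaced knots the basis peaks at 2/3 and vanishes outside [u_0, u_4]
values = cubic_bspline([2.0, 5.0, -1.0], knots=[0.0, 1.0, 2.0, 3.0, 4.0])
peak = gaussian_kh(0.0, h=1.0)
```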

Table 1.

MSEs of Example 1 (non-aggregation): Fan, Peng and Huang (2005)’s estimation and the proposed estimation

Estimation                               Component   I   G=100    G=200    G=400    G=800
Proposed estimation                      m(·)        2   0.1451   0.0742   0.0369   0.0208
                                                     3   0.0767   0.0380   0.0233   0.0132
                                                     4   0.0517   0.0269   0.0167   0.0991
                                         β           2   0.0670   0.0287   0.0156   0.0070
                                                     3   0.0316   0.0149   0.0074   0.0032
                                                     4   0.0214   0.0100   0.0056   0.0020
Fan, Peng and Huang (2005)’s estimation  m(·)        2   0.1454   0.0752   0.0358   0.0201
                                                     3   0.0780   0.0397   0.0234   0.0137
                                                     4   0.0515   0.0273   0.0167   0.0100
                                         β           2   0.0668   0.0299   0.0151   0.0069
                                                     3   0.0318   0.0148   0.0071   0.0033
                                                     4   0.0211   0.0098   0.0050   0.0024

Figure 1.

The estimators of m(·) with G = 200 and I = 4. Dotted line: the proposed estimator; dash-dotted line: Fan, Peng and Huang (2005)’s estimator; and solid line: m(·).

From Table 1 and Figure 1 we can see that the two-stage estimators have almost the same finite sample performance as the profile least squares estimators. This phenomenon is also observed for the case of aggregation across arrays. We omit the details here.

6 Concluding Remarks

In this paper, we have proposed a two-stage estimation procedure for the semilinear in-slide models. The main advantage of our approach over the existing ones is that we can establish the asymptotic normality of the corresponding parametric and nonparametric component estimators. We further extended the two-stage estimation to aggregated semilinear in-slide models. The advantage of the two-stage estimation over the existing estimations in this case is that we can explicitly show that taking the aggregated information into account leads to improvement in both the parametric and nonparametric component estimators. The significance of developing these asymptotic normality results is that they allow bandwidth selection and statistical inference for the parametric and nonparametric components of interest.

This is still a fast-evolving area of research and additional effort in this direction is warranted. For example, how to take heteroscedasticity into account to improve the two-stage estimation is still an open problem.

Acknowledgments

This research is supported by a grant from the National Institutes of Health (CA 79949).

Appendix. Proof of Main Results

Lemma 1

Let $(X_1, Y_1),\dots,(X_n, Y_n)$ be i.i.d. random vectors, where the $Y_i$'s are scalar random variables. Further assume that $E|Y_i|^4 < \infty$ and $\sup_x \int |y|^4 f(x, y)\,dy < \infty$, where $f$ denotes the joint density of $(X, Y)$. Let $K$ be a bounded positive function with bounded support that satisfies a Lipschitz condition. Then if $nh^8 \to 0$ and $nh^2/(\log n)^2 \to \infty$, it holds that

$$\sup_x\left|\frac{1}{n}\sum_{i=1}^{n}\left[K_h(X_i - x)Y_i - E\{K_h(X_i - x)Y_i\}\right]\right| = O_p\left(\left\{\frac{\log(1/h)}{nh}\right\}^{1/2}\right).$$

The proof of Lemma 1 follows immediately from the result of Mack and Silverman (1982).

Lemma 2

Suppose that Assumption 3 to Assumption 5 hold. Then it holds that $\lim_{n\to\infty} n^{-1}\hat X^T\hat X = \Sigma$, where $\hat X$ is defined in Section 2 and $\Sigma$ is defined in Theorem 1.

The proof of Lemma 2 is trivial. We here omit the detail.

Lemma 3

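Suppose that Assumption 1 to Assumption 3 hold. Then we have $\tilde\beta_n - \beta = O_p(n^{-1/2})$. Further,

$$\tilde\beta_n - \beta = \{\Pi^T(I_n - P_B)\Pi\}^{-1}\Pi^T(I_n - P_B)\varepsilon + o_p(n^{-1/2}),$$

where $\Pi = (\Pi_1,\dots,\Pi_n)^T$, $\Pi_i = X_i - E(X_i \mid U_i)$ and $P_B = B(B^TB)^{-1}B^T$.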

Lemma 4

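Suppose that Assumption 1 to Assumption 3 hold. Then we have

  1. $\|\tilde\vartheta_n - \vartheta\| \to_p 0$ as $n \to \infty$;

  2. $\|\tilde\vartheta_n - \vartheta\| = O_p(k_n^{1/2}/n^{1/2} + k_n^{-2})$;

  3. $\sup_{u\in\mathcal{U}}|\tilde m_n(u) - m(u)| = O_p(k_n/\sqrt{n} + k_n^{-3/2})$;

    Further,

  4. $\tilde\vartheta_n - \vartheta = \{\zeta^T(I_n - P_B)\zeta\}^{-1}\zeta^T(I_n - P_B)\varepsilon + \{\zeta^T(I_n - P_B)\zeta\}^{-1}\zeta^T(I_n - P_B)(m(U_1) - \zeta^T(U_1)\vartheta,\dots,m(U_n) - \zeta^T(U_n)\vartheta) + O_p(k_n^{3/2}/n + n^{-1/2})$.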

The proof of Lemma 3 is the same as that of Theorem 1 in You, Zhou and Zhou (2005). Applying the root-n consistency of $\tilde\beta_n$ and following the proof of Theorem 1 in Horowitz and Mammen (2004), we can show that Lemma 4 holds. We omit the details here.

Proof of Theorem 1

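For convenience, let

$$\Delta_g(i) = \frac{1}{I_g - 1}\sum_{i_1=1,\, i_1\ne i}^{I_g}\left(X_{\iota(g,i_1)}^T(\beta - \tilde\beta_n) + m(U_{\iota(g,i_1)}) - \tilde m_n(U_{\iota(g,i_1)})\right)$$

for $g = 1,\dots,G$. Then, according to the definition of $\hat\beta_n$ it can be verified that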

β^nβ=(X^TX^)1X^T(InS){Xβ+M+ε(1I11i1=2I1ει(1,i1),,1I11i1=1,i1I1I1ει(1,i1),,1IG1i1=1,i1IGIGει(G,i1))T}(X^TX^)1X^T(InS)(Δ1(1),,Δ1(I1),,ΔG(IG))T=(X^TX^)1J1+(X^TX^)1J2,say.

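Therefore, by Lemma 2, in order to complete the proof we just need to show that

$$\frac{1}{\sqrt{n}}J_1 \to_D N\left(0, \lim_{n\to\infty}\frac{1}{n}\sum_{g=1}^{G}\frac{I_g^2}{I_g - 1}\,\sigma^2\Sigma\right) \quad \text{as } n \to \infty \tag{A.1}$$

and $J_2 = o_p(n^{1/2})$. Following the same argument as the proof of Theorem 1 in Fan and Huang (2005) we have

$$\frac{1}{\sqrt{n}}J_1 = \frac{1}{\sqrt{n}}\sum_{g=1}^{G}\sum_{i=1}^{I_g}\Pi_{\iota(g,i)}\left(\varepsilon_{\iota(g,i)} - \frac{1}{I_g - 1}\sum_{i_1=1,\, i_1\ne i}^{I_g}\varepsilon_{\iota(g,i_1)}\right) + o_p(1).$$

Since the $\sum_{i=1}^{I_g}\Pi_{\iota(g,i)}\left(\varepsilon_{\iota(g,i)} - \frac{1}{I_g - 1}\sum_{i_1=1,\, i_1\ne i}^{I_g}\varepsilon_{\iota(g,i_1)}\right)$'s are independent random variables with mean zero and finite covariance matrix

$$\mathrm{Cov}\left\{\sum_{i=1}^{I_g}\Pi_{\iota(g,i)}\left(\varepsilon_{\iota(g,i)} - \frac{1}{I_g - 1}\sum_{i_1=1,\, i_1\ne i}^{I_g}\varepsilon_{\iota(g,i_1)}\right)\right\} = I_g\,\Sigma\,\mathrm{Cov}\left\{\varepsilon_{\iota(g,1)} - \frac{1}{I_g - 1}\sum_{i_1=2}^{I_g}\varepsilon_{\iota(g,i_1)}\right\} = \frac{I_g^2}{I_g - 1}\,\sigma^2\Sigma,$$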

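by the central limit theorem and Slutsky’s theorem, (A.1) holds. Moreover,

$$\frac{1}{n}J_2 = \frac{1}{n}\hat X^T(I_n - S)(\Delta_1(1),\dots,\Delta_1(I_1),\dots,\Delta_G(I_G))^T = \frac{1}{n}\hat X^T(\Delta_1(1),\dots,\Delta_1(I_1),\dots,\Delta_G(I_G))^T - \frac{1}{n}\hat X^T S(\Delta_1(1),\dots,\Delta_1(I_1),\dots,\Delta_G(I_G))^T = J_{21} + J_{22}, \text{ say.}$$

Let $O(u) = (1,0)(D_u^T W_u D_u)^{-1} D_u^T W_u$. By the definition of $\hat X$ it holds that

$$J_{21} = \frac{1}{n}\sum_{g=1}^{G}\sum_{i=1}^{I_g}\Pi_{\iota(g,i)}\Delta_g(i) + \frac{1}{n}\sum_{g=1}^{G}\sum_{i=1}^{I_g}O(U_{\iota(g,i)})\Pi\,\Delta_g(i) + \frac{1}{n}\sum_{g=1}^{G}\sum_{i=1}^{I_g}\{h(U_{\iota(g,i)}) - O(U_{\iota(g,i)})H\}\Delta_g(i) = J_{211} + J_{212} + J_{213}, \text{ say,}$$

where $\Delta_g(i)$ is as defined at the start of the proof, $h(u) = (E(X_{11} \mid U_1 = u),\dots,E(X_{p1} \mid U_1 = u))^T$ and $H = (h(U_1),\dots,h(U_n))^T$. By Fan and Huang (2005) it holds that

$$\max_{1\le g\le G}\max_{1\le i\le I_g}\|O(U_{\iota(g,i)})\Pi\| = O_p\left(h^2 + \frac{1}{\sqrt{nh}}\right)$$

and

$$\max_{1\le g\le G}\max_{1\le i\le I_g}\|h(U_{\iota(g,i)}) - O(U_{\iota(g,i)})H\| = O_p\left(h^2 + \frac{1}{\sqrt{nh}}\right).$$

Therefore, combining Lemma 3 and Lemma 4 we have

$$J_{212} = O_p\left(h^2 + \frac{1}{\sqrt{nh}}\right)\left\{O_p(n^{-1}) + O_p\left(k_n/\sqrt{n} + k_n^{-3/2}\right)\right\} = o_p(n^{-1/2})$$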

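and $J_{213} = o_p(n^{-1/2})$. Further,

$$J_{211} = \frac{1}{n}\sum_{g=1}^{G}\sum_{i=1}^{I_g}\Pi_{\iota(g,i)}\frac{1}{I_g - 1}\sum_{i_1=1,\, i_1\ne i}^{I_g}X_{\iota(g,i_1)}^T(\beta - \tilde\beta_n) + \frac{1}{n}\sum_{g=1}^{G}\sum_{i=1}^{I_g}\Pi_{\iota(g,i)}\frac{1}{I_g - 1}\sum_{i_1=1,\, i_1\ne i}^{I_g}\left(m(U_{\iota(g,i_1)}) - \tilde m_n(U_{\iota(g,i_1)})\right).$$

It is easy to see that

$$E\left\{\sum_{g=1}^{G}\sum_{i=1}^{I_g}\Pi_{\iota(g,i)}\frac{1}{I_g - 1}\sum_{i_1=2}^{I_g}X_{\iota(g,i_1)}^T\right\}^2 = \sum_{g=1}^{G}\sum_{i=1}^{I_g}E\left(\Pi_{\iota(g,i)}\frac{1}{I_g - 1}\sum_{i_1=2}^{I_g}X_{\iota(g,i_1)}^T\right)^2 = O(n),$$

where $A^2$ means $A^TA$. Combining this with the root-n consistency of $\tilde\beta_n$, it holds that

$$\frac{1}{n}\sum_{g=1}^{G}\sum_{i=1}^{I_g}\Pi_{\iota(g,i)}\frac{1}{I_g - 1}\sum_{i_1=2}^{I_g}X_{\iota(g,i_1)}^T(\beta - \tilde\beta_n) = O_p(n^{-1}).$$

According to the definition of $\tilde m_n(\cdot)$ we have

$$\sum_{g=1}^{G}\sum_{i=1}^{I_g}\Pi_{\iota(g,i)}\frac{1}{I_g - 1}\sum_{i_1=1,\, i_1\ne i}^{I_g}\left(m(U_{\iota(g,i_1)}) - \tilde m_n(U_{\iota(g,i_1)})\right) = \sum_{g=1}^{G}\sum_{i=1}^{I_g}\Pi_{\iota(g,i)}\frac{1}{I_g - 1}\sum_{i_1=1,\, i_1\ne i}^{I_g}\left(m(U_{\iota(g,i_1)}) - \zeta^T(U_{\iota(g,i_1)})(\Xi^T M_B\Xi)^{-}\Xi^T M_B M\right)$$

$$\quad + \sum_{g=1}^{G}\sum_{i=1}^{I_g}\Pi_{\iota(g,i)}\frac{1}{I_g - 1}\sum_{i_1=1,\, i_1\ne i}^{I_g}\zeta^T(U_{\iota(g,i_1)})(\Xi^T M_B\Xi)^{-}\Xi^T M_B X(\beta - \tilde\beta_n) + \sum_{g=1}^{G}\sum_{i=1}^{I_g}\Pi_{\iota(g,i)}\frac{1}{I_g - 1}\sum_{i_1=1,\, i_1\ne i}^{I_g}\zeta^T(U_{\iota(g,i_1)})(\Xi^T M_B\Xi)^{-}\Xi^T M_B\varepsilon = J_3 + J_4 + J_5, \text{ say.}$$

Now, we will prove $J_s = o_p(n^{1/2})$ for $s = 3$, 4 and 5. For convenience, we let

$$\tilde m_{\iota(g,i)} = \frac{1}{I_g - 1}\sum_{i_1=1,\, i_1\ne i}^{I_g}\left(m(U_{\iota(g,i_1)}) - \zeta^T(U_{\iota(g,i_1)})(\Xi^T M_B\Xi)^{-}\Xi^T M_B M\right).$$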

It is easy to see, in order to complete the proof of J3 = op(n1/2), we just need to show that G1g=1GΠι(g,1),1m˜ι(g,1)=op(n1/2). Following the proof of Lemma 3 we have

m˜1=max1gG|m˜ι(g,1)|=O(kn/n+kn1)a.s..

Put τg = Πι(g,1),1m̃ι(g,1). For any δ > 0, set

Π˜ι(g,1),1=Πι(g,1),1I{|Πι(g,1),1|δ2g1/2}andΠ˜ι(g,1),1=Πι(g,1),1I{|Πι(g,1),1|>δ2g1/2}

so that

τg=m˜ι(g,1)Πι(g,1),1+m˜ι(g,1)Πι(g,1),1.

By the three-series theorem we obtain g=1|Πι(g,1),1|< for all g = 1,…,G. This implies that

1Gg=1GΠι(g,1),1m˜ι(g,1)=o(1)a.s..

For g = 1,…,G, let τg=Πι(g,1),1m˜ι(g,1). Then given Δ˜n={U1,,Un},τ1,,τG are independent and

E(τg|Δ˜n)=0,max1gG|τg|m˜1δ2G1/2andE(τg2|Δ˜n)=2m˜ι(g,1)σ2.

By Bernstein’s inequality we have

pm=P[nm{1G|g=1Gτg|δ}]GmE[1Gg=1GPr{|g=1Gτg|δ|Δ˜n}]2Gmg=1GE[exp{G(δ/G)2(2/G)g=1GE[(τg)2|Δ˜n]+δ2G1/2m˜1(δ/G)}]2Gmg=1GE[exp{δ22δ3G1/2m˜1}]2GmG20

as m →∞. By this we have

Pr(|1Gg=1Gτg|δ)pm+Pr(m˜1δ2)2δ.

Therefore, J3 = o(n1/2) a.s.

By the Cauchy-Schwarz inequality, it holds that

J4=12g=1Gi=1IgΠι(g,i)T1Ig1i1=1,i1iIgζT(Uι(g,i1))(ΞTMBΞ)1Ig1i1=1,i1iIgΠι(g,i)T+12g=1Gi=1Ig(ββ˜n)TXTMBΞ(ΞTMBΞ)ΞTMBX(ββ˜n)=S1+S2,say.

Further,

S1=O(n1kn)12g=1Gi=1IgΠι(g,i)T1Ig1i1=1,i1iIgζT(Uι(g,i1))=Op(kn2)=op(n).

and

S2Op(1)(ββ˜n)TXTX(ββ˜n)=Op(1)=op(n).

Thus, J4 = op(n1/2).

In addition, it holds that

J5=12g=1Gi=1IgΠι(g,i)T1Ig1i1=1,i1iIgζT(Uι(g,i1))(ΞTMBΞ)1Ig1i1=1,i1Igζ(Uι(g,i1))Πι(g,i)+12g=1Gi=1IgεT1Ig1i1=1,i1iIgζT(Uι(g,i1))(ΞTMBΞ)1Ig1i1=1,i1Igζ(Uι(g,i1))ε=S3+S4,say.

It is easy to see that

ES3O(1)E{ΠMBΞ(ΞTMBΞ)ΞTMBΠT}=O(kn)=o(n12).

This implies that $S_3 = o_p(n^{1/2})$. Along the same lines, we can show that $S_4 = o_p(n^{1/2})$. So $J_5 = o_p(n^{1/2})$ holds. In summary, the proof of Theorem 1 is complete.

Proof of Theorem 2

According to the definition of $\hat m_n(u)$ it holds that

m^n(u)=(1,0)(DuTWuDu)1DuTWu{ε(1I11i1=2I1ει(1,i1),,1I11i1=1,i1I1I1ει(1,i1),,1IG1i1=1,i1IGIGει(G,i1))T}+(1,0)(DuTWuDu)1DuTWu(Δ1(1),,Δ1(I1),,G(IG))T+(1,0)(DuTWuDu)1DuTWuX(ββ^n)+(1,0)(DuTWuDu)1DuTWuM=J1+J2+J3+J4,say.

It is easy to see that

DuTWuDu=(i=1nKh(Uiu)i=1n(Uiuh)Kh(Uiu)i=1n(Uiuh)Kh(Uiu)i=1n(Uiuh)2Kh(Uiu)).

Each element of the above matrix is in the form of kernel regression. By Lemma 1 it holds that

DuTWuDu=np(u)(100μ2)[1+{log(1/h)nh}12]

holds uniformly in 𝓊, where ⊗ is the Kronecker product and µ2 = ∫𝓊u2K(u)du By using the same argument, we have

DuTWuX=np(u)E(1TX1|U)(100μ2)[1+{log(1/h)nh}12]

Therefore, combining the fact that $\beta - \hat\beta_n = O_p(n^{-1/2})$ we have $J_3 = O_p(n^{-1/2})$. Moreover, let

Δg(1)(i)=1Ig1i1=1,i1iIgXι(g,i1)T(ββ˜n)andΔg(2)(i)=1Ig1i1=1,i1iIg(m(Uι(g,i1))m˜n(Uι(g,i1)))

for g = 1,…,G. Then, we have

J2=(1,0)(DuTWuDu)1(g=1Gi=1IgKh(Uι(g,i)u)Δg(1)(i)g=1Gi=1IgUι(g,i)uhKh(Uι(g,i)u)Δg(1)(i))+(1,0)(DuTWuDu)1(g=1Gi=1IgKh(Uι(g,i)u)Δg(2)(i)g=1Gi=1IgUι(g,i)uhKh(Uι(g,i)u)Δg(2)(i))=J21+J22,say.

By the root-$n$ consistency of $\tilde\beta_n$ and the same argument used for $J_3$, it is easy to see that $J_{21}=O_p(n^{-1/2})$. Further,

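$$
\begin{aligned}
&(1,0)(D_u^{T}W_uD_u)^{-1}\sum_{g=1}^{G}\sum_{i=1}^{I_g}K_h(U_{\iota(g,i)}-u)\Delta_g^{(2)}(i)\\
&\quad=(1,0)(D_u^{T}W_uD_u)^{-1}\sum_{i=1}^{n}K_h(U_i-u)\big(m(U_i)-\zeta^{T}(U_i)(\Xi^{T}M_B\Xi)^{-}\Xi^{T}M_BM\big)\\
&\qquad+(1,0)(D_u^{T}W_uD_u)^{-1}\sum_{i=1}^{n}K_h(U_i-u)\zeta^{T}(U_i)(\Xi^{T}M_B\Xi)^{-}\Xi^{T}M_BX(\beta-\tilde\beta_n)\\
&\qquad+(1,0)(D_u^{T}W_uD_u)^{-1}\sum_{i=1}^{n}K_h(U_i-u)\zeta^{T}(U_i)(\Xi^{T}M_B\Xi)^{-}\Xi^{T}M_B\varepsilon\\
&\quad=J_5+J_6+J_7,\quad\text{say}.
\end{aligned}
$$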

Applying Lemma 1 and the root-$n$ consistency of $\tilde\beta_n$, we can show that $J_6=o_p(n^{-1/2})$. Moreover, by the same argument as in the proof of Theorem 1 in Horowitz and Mammen (2004), we can show that $J_5=o_p\{h^{2}+1/\sqrt{nh}\}$ and $J_7=o_p\{h^{2}+1/\sqrt{nh}\}$. Combining these results, we have $J_{22}=o_p\{h^{2}+1/\sqrt{nh}\}$.

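By the standard nonparametric regression result, we have

$$\sqrt{nh}\Big[J_4-m(u)-\frac{h^{2}}{2}\,\frac{\mu_2^{2}-\mu_1\mu_3}{\mu_2-\mu_1^{2}}\,m''(u)\Big]\stackrel{p}{\longrightarrow}0\quad\text{as } n\to\infty.$$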

Therefore, in order to complete the proof we just need to show that

$$\sqrt{nh}\,J_1\stackrel{D}{\longrightarrow}N(0,\zeta(u))\quad\text{as } n\to\infty.$$

Let

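$$Q=\frac{1}{n}\sum_{g=1}^{G}\sum_{i=1}^{I_g}\Big[\alpha_0+\alpha_1\Big(\frac{U_{\iota(g,i)}-u}{h}\Big)\Big]K_h(U_{\iota(g,i)}-u)\Big(\varepsilon_{\iota(g,i)}-\frac{1}{I_g-1}\sum_{i_1=1,\,i_1\ne i}^{I_g}\varepsilon_{\iota(g,i_1)}\Big)$$

where $\alpha_0=\mu_2/(\mu_2-\mu_1^{2})$ and $\alpha_1=-\mu_1/(\mu_2-\mu_1^{2})$. It follows that

$$\sqrt{nh}\Big[J_1+J_4-m(u)-\frac{h^{2}}{2}\,\frac{\mu_2^{2}-\mu_1\mu_3}{\mu_2-\mu_1^{2}}\,m''(u)\Big]=p^{-1}(u)\sqrt{nh}\,Q+o_p(1).$$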
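As a worked illustration of these constants, consider a kernel that is symmetric about zero and integrates to one. This special case is an assumption made here for illustration and is not required by the proof. Writing $\mu_j=\int u^{j}K(u)\,du$, in line with the definition of $\mu_2$, symmetry gives $\mu_1=\mu_3=0$, so that

$$\alpha_0=\frac{\mu_2}{\mu_2-\mu_1^{2}}=1,\qquad \alpha_1=-\frac{\mu_1}{\mu_2-\mu_1^{2}}=0,\qquad \frac{\mu_2^{2}-\mu_1\mu_3}{\mu_2-\mu_1^{2}}=\mu_2,$$

and the bias term reduces to the familiar $\tfrac{h^{2}}{2}\mu_2\,m''(u)$. For the Epanechnikov kernel $K(u)=\tfrac{3}{4}(1-u^{2})$ on $[-1,1]$, for instance, $\mu_2=\tfrac{3}{4}\big(\tfrac{2}{3}-\tfrac{2}{5}\big)=\tfrac{1}{5}$.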

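The variance of $\sqrt{nh}\,Q$ is

$$
\begin{aligned}
\operatorname{Var}(\sqrt{nh}\,Q)={}&\frac{h\sigma^{2}\alpha_0^{2}}{n}\sum_{g=1}^{G}\sum_{i=1}^{I_g}\frac{I_g}{I_g-1}E K_h^{2}(U_{\iota(g,i)}-u)\\
&+\frac{h\sigma^{2}\alpha_1^{2}}{n}\sum_{g=1}^{G}\sum_{i=1}^{I_g}\frac{I_g}{I_g-1}E\Big\{\Big(\frac{U_{\iota(g,i)}-u}{h}\Big)^{2}K_h^{2}(U_{\iota(g,i)}-u)\Big\}\\
&+\frac{2h\sigma^{2}\alpha_0\alpha_1}{n}\sum_{g=1}^{G}\sum_{i=1}^{I_g}\frac{I_g}{I_g-1}E\Big\{\Big(\frac{U_{\iota(g,i)}-u}{h}\Big)K_h^{2}(U_{\iota(g,i)}-u)\Big\}\\
&-\frac{h\sigma^{2}\alpha_0^{2}}{n}\sum_{g=1}^{G}\sum_{i_1=1}^{I_g}\sum_{i_2=1}^{I_g}\frac{I_g}{(I_g-1)^{2}}E\big\{K_h(U_{\iota(g,i_1)}-u)K_h(U_{\iota(g,i_2)}-u)\big\}\\
&+\frac{h\sigma^{2}\alpha_1^{2}}{n}\sum_{g=1}^{G}\sum_{i_1=1}^{I_g}\sum_{i_2=1}^{I_g}\frac{I_g}{(I_g-1)^{2}}E\Big\{\Big(\frac{U_{\iota(g,i_1)}-u}{h}\Big)K_h(U_{\iota(g,i_1)}-u)\Big(\frac{U_{\iota(g,i_2)}-u}{h}\Big)K_h(U_{\iota(g,i_2)}-u)\Big\}\\
&+\frac{2h\sigma^{2}\alpha_0\alpha_1}{n}\sum_{g=1}^{G}\sum_{i_1=1}^{I_g}\sum_{i_2=1}^{I_g}\frac{I_g}{(I_g-1)^{2}}E\Big\{\Big(\frac{U_{\iota(g,i_1)}-u}{h}\Big)K_h^{2}(U_{\iota(g,i_2)}-u)\Big\}\\
={}&J_8+J_9+J_{10}+J_{11}+J_{12}+J_{13},\quad\text{say}.
\end{aligned}
$$

It is easy to see that $J_8\stackrel{p}{\to}\alpha_0^{2}\sigma^{2}\nu_0$, $J_9\stackrel{p}{\to}\alpha_1^{2}\sigma^{2}\nu_2$, $J_{10}\stackrel{p}{\to}2\alpha_1\alpha_0\sigma^{2}\nu_1$, and $J_s\to0$ for $s=11$, 12, 13 as $n\to\infty$. Combining these results,

$$\operatorname{Var}(\sqrt{nh}\,Q)=p^{-1}(u)(\alpha_0^{2}\nu_0+2\alpha_0\alpha_1\nu_1+\alpha_1^{2}\nu_2)+o(1).$$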

Let

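$$a_g=\sqrt{h}\sum_{i=1}^{I_g}\Big[\alpha_0+\alpha_1\Big(\frac{U_{\iota(g,i)}-u}{h}\Big)\Big]K_h(U_{\iota(g,i)}-u)\Big(\varepsilon_{\iota(g,i)}-\frac{1}{I_g-1}\sum_{i_1=1,\,i_1\ne i}^{I_g}\varepsilon_{\iota(g,i_1)}\Big)$$

and $B_n^{2}=\sum_{g=1}^{G}Ea_g^{2}$. Then

$$B_n^{2}=np^{-1}(u)(\alpha_0^{2}\nu_0+2\alpha_0\alpha_1\nu_1+\alpha_1^{2}\nu_2)\sigma^{2}+o(n).$$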

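A simple calculation shows that

$$\sum_{g=1}^{G}E|a_g|^{3}\le O(1)\sum_{g=1}^{G}\sum_{i=1}^{I_g}h^{3/2}E\Big\{\Big[|\alpha_0|+|\alpha_1|\Big|\frac{U_{\iota(g,i)}-u}{h}\Big|\Big]K_h^{3}(U_{\iota(g,i)}-u)\Big\}=O(nh^{-1/2}).$$

It follows that $\lim_{n\to\infty}B_n^{-3}\sum_{g=1}^{G}E|a_g|^{3}=0$. By the central limit theorem, the proof is complete.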
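The limit used in this last step can be traced through the two orders already stated. The following is a sketch of that step. It assumes, as is usual for kernel estimators but not restated in this proof, that $nh\to\infty$ and that the leading constant in $B_n^{2}$ is positive, so that $B_n^{-3}=O(n^{-3/2})$:

$$B_n^{-3}\sum_{g=1}^{G}E|a_g|^{3}=O(n^{-3/2})\cdot O(nh^{-1/2})=O\{(nh)^{-1/2}\}\to0.$$

This is the Lyapunov condition with third moments for the summands $a_1,\ldots,a_G$, which are independent across blocks when the errors are.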

Proof of Theorem 3

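For convenience, let

$$\varrho_{\iota(g,i)}=X_{\iota(g,i)}^{T}(\beta-\tilde\beta_n)+m(U_{\iota(g,i)})-\tilde m(U_{\iota(g,i)})$$

and

$$\varrho_{\iota(g,i)}^{*}=X_{\iota(g,i)}^{T}(\beta-\hat\beta_n)+m(U_{\iota(g,i)})-\hat m(U_{\iota(g,i)}).$$

By the definition of $\hat\sigma_n^{2}$, it can be decomposed as

$$
\begin{aligned}
\hat\sigma_n^{2}={}&d(n)\sum_{g=1}^{G}\sum_{i=1}^{I_g}\Big\{\varepsilon_{\iota(g,i)}-\frac{1}{I_g-1}\sum_{i_1=1,\,i_1\ne i}^{I_g}\varepsilon_{\iota(g,i_1)}\Big\}^{2}
+d(n)\sum_{g=1}^{G}\sum_{i=1}^{I_g}\Big\{\varrho_{\iota(g,i)}-\frac{1}{I_g-1}\sum_{i_1=1,\,i_1\ne i}^{I_g}\varrho_{\iota(g,i_1)}\Big\}^{2}\\
&+d(n)\sum_{g=1}^{G}\sum_{i=1}^{I_g}\varrho_{\iota(g,i)}^{*2}
+2d(n)\sum_{g=1}^{G}\sum_{i=1}^{I_g}\Big\{\varepsilon_{\iota(g,i)}-\frac{1}{I_g-1}\sum_{i_1=1,\,i_1\ne i}^{I_g}\varepsilon_{\iota(g,i_1)}\Big\}\Big\{\varrho_{\iota(g,i)}-\frac{1}{I_g-1}\sum_{i_1=1,\,i_1\ne i}^{I_g}\varrho_{\iota(g,i_1)}\Big\}\\
&+2d(n)\sum_{g=1}^{G}\sum_{i=1}^{I_g}\Big\{\varepsilon_{\iota(g,i)}-\frac{1}{I_g-1}\sum_{i_1=1,\,i_1\ne i}^{I_g}\varepsilon_{\iota(g,i_1)}\Big\}\varrho_{\iota(g,i)}^{*}
+2d(n)\sum_{g=1}^{G}\sum_{i=1}^{I_g}\Big\{\varrho_{\iota(g,i)}-\frac{1}{I_g-1}\sum_{i_1=1,\,i_1\ne i}^{I_g}\varrho_{\iota(g,i_1)}\Big\}\varrho_{\iota(g,i)}^{*}\\
={}&J_1+\cdots+J_6,\quad\text{say},
\end{aligned}
$$

where $d(n)=1/\{n+\sum_{g=1}^{G}I_g/(I_g-1)\}$. Applying Lemmas 3 and 4 and Theorems 1 and 2, it is easy to show that $J_s=o_p(n^{-1/2})$ for $s=2$, 3 and 6.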

Let

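$$\zeta_g=\sum_{i=1}^{I_g}\Big\{\varepsilon_{\iota(g,i)}-(I_g-1)^{-1}\sum_{i_1=1,\,i_1\ne i}^{I_g}\varepsilon_{\iota(g,i_1)}\Big\}^{2}.$$

Obviously, the $\zeta_g$'s are independent random variables with $E\zeta_g=(I_g-1)^{-1}I_g^{2}\sigma^{2}$. Further,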
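The mean of $\zeta_g$ also explains the normalising constant $d(n)$: since $I_g^{2}/(I_g-1)=I_g+I_g/(I_g-1)$, summing over blocks gives $\sum_g E\zeta_g=\sigma^{2}\{n+\sum_g I_g/(I_g-1)\}=\sigma^{2}/d(n)$. The sketch below checks both facts numerically. It is only an illustration under the assumption of independent mean-zero errors with variance $\sigma^{2}$, and the function names `zeta`, `d_n` and `sigma2_leading_term` are hypothetical.

```python
import numpy as np

def zeta(eps):
    """zeta_g = sum_i {eps_i - mean of the other errors in the block}^2.

    eps has shape (..., I_g); the last axis indexes positions within a block.
    """
    I = eps.shape[-1]
    others_mean = (eps.sum(axis=-1, keepdims=True) - eps) / (I - 1)
    return ((eps - others_mean) ** 2).sum(axis=-1)

def d_n(block_sizes):
    """d(n) = 1 / {n + sum_g I_g / (I_g - 1)}, with n = sum_g I_g."""
    I = np.asarray(block_sizes, dtype=float)
    return 1.0 / (I.sum() + (I / (I - 1.0)).sum())

def sigma2_leading_term(eps_blocks):
    """d(n) * sum_g zeta_g for equal-sized blocks, computed from the true errors.

    eps_blocks has shape (G, I); this mimics only the leading term of the
    variance estimator, not the estimator itself (which uses residuals).
    """
    G, I = eps_blocks.shape
    return d_n([I] * G) * zeta(eps_blocks).sum()
```

With many blocks, `sigma2_leading_term` should be close to $\sigma^{2}$, in line with the role of $d(n)$ as the normaliser that makes the leading term unbiased.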

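$$
\begin{aligned}
&E\Big[\sum_{i=1}^{I_g}\Big\{\varepsilon_{\iota(g,i)}^{2}+\frac{1}{(I_g-1)^{2}}\Big(\sum_{i_1=1,\,i_1\ne i}^{I_g}\varepsilon_{\iota(g,i_1)}\Big)^{2}-\frac{2}{I_g-1}\varepsilon_{\iota(g,i)}\sum_{i_1=1,\,i_1\ne i}^{I_g}\varepsilon_{\iota(g,i_1)}\Big\}\Big]^{2}\\
&\quad=E\Big(\sum_{i=1}^{I_g}\varepsilon_{\iota(g,i)}^{2}\Big)^{2}
+E\Big[\sum_{i=1}^{I_g}\frac{1}{(I_g-1)^{2}}\Big(\sum_{i_1=1,\,i_1\ne i}^{I_g}\varepsilon_{\iota(g,i_1)}\Big)^{2}\Big]^{2}
+E\Big[\frac{2}{I_g-1}\sum_{i=1}^{I_g}\varepsilon_{\iota(g,i)}\sum_{i_1=1,\,i_1\ne i}^{I_g}\varepsilon_{\iota(g,i_1)}\Big]^{2}\\
&\qquad+E\Big[\frac{2}{(I_g-1)^{2}}\sum_{i_1=1}^{I_g}\sum_{i_3=1}^{I_g}\varepsilon_{\iota(g,i_1)}^{2}\Big(\sum_{i_4=1,\,i_4\ne i_3}^{I_g}\varepsilon_{\iota(g,i_4)}\Big)^{2}\Big]
-E\Big[\frac{4}{I_g-1}\sum_{i_1=1}^{I_g}\sum_{i_3=1}^{I_g}\varepsilon_{\iota(g,i_1)}^{2}\varepsilon_{\iota(g,i_3)}\sum_{i_4=1,\,i_4\ne i_3}^{I_g}\varepsilon_{\iota(g,i_4)}\Big]\\
&\qquad-E\Big[\frac{4}{(I_g-1)^{3}}\sum_{i_1=1}^{I_g}\sum_{i_3=1}^{I_g}\Big(\sum_{i_2=1,\,i_2\ne i_1}^{I_g}\varepsilon_{\iota(g,i_2)}\Big)^{2}\varepsilon_{\iota(g,i_3)}\sum_{i_4=1,\,i_4\ne i_3}^{I_g}\varepsilon_{\iota(g,i_4)}\Big]\\
&\quad=J_7+\cdots+J_{12},\quad\text{say}.
\end{aligned}
$$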

It is easy to see that

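$$
\begin{aligned}
J_7&=I_gE(\varepsilon_1^{4})+I_g(I_g-1)\sigma^{4},\\
J_8&=(I_g-1)^{-2}I_gE(\varepsilon_1^{4})+2I_g(I_g-1)^{-3}\big[3(I_g-2)^{2}+(I_g-1)\big]\sigma^{4},\\
J_9&=12I_g(I_g-1)^{-1}\sigma^{4},\\
J_{10}&=2(I_g-1)^{-1}I_gE(\varepsilon_1^{4})+2I_g\sigma^{4},\\
J_{11}&=0,\quad\text{and}\\
J_{12}&=-16I_g(I_g-1)^{-2}(I_g-2)\sigma^{4}.
\end{aligned}
$$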

In summary, we have

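$$E(\zeta_g^{2})=\{1+(I_g-1)^{-2}+2(I_g-1)^{-1}\}I_gE(\varepsilon_1^{4})+(I_g-1)^{-3}(I_g^{5}-2I_g^{4}+2I_g^{3}+6I_g^{2}+I_g)\sigma^{4}.$$

Then, by a simple calculation, we have

$$\operatorname{Var}(\zeta_g)=E(\zeta_g^{2})-\{E(\zeta_g)\}^{2}=\{1+(I_g-1)^{-2}+2(I_g-1)^{-1}\}I_gE(\varepsilon_1^{4})+(I_g-1)^{-3}(-I_g^{4}+2I_g^{3}+6I_g^{2}+I_g)\sigma^{4}.$$

Therefore,

$$\operatorname{Var}\Big(\frac{\sqrt{n}}{n+\sum_{g=1}^{G}I_g/(I_g-1)}\sum_{g=1}^{G}\zeta_g\Big)=\frac{n}{\{n+\sum_{g=1}^{G}I_g/(I_g-1)\}^{2}}\sum_{g=1}^{G}\Big[\Big\{1+\frac{1}{(I_g-1)^{2}}+\frac{2}{I_g-1}\Big\}I_gE(\varepsilon_1^{4})+\frac{1}{(I_g-1)^{3}}(-I_g^{4}+2I_g^{3}+6I_g^{2}+I_g)\sigma^{4}\Big].$$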

According to the definition, J4 can be written as

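$$
\begin{aligned}
J_4={}&d(n)\sum_{g=1}^{G}\sum_{i=1}^{I_g}\Big(\varepsilon_{\iota(g,i)}-\frac{1}{I_g-1}\sum_{i_1=1,\,i_1\ne i}^{I_g}\varepsilon_{\iota(g,i_1)}\Big)X_{\iota(g,i)}^{T}(\beta-\tilde\beta_n)\\
&-d(n)\sum_{g=1}^{G}\sum_{i=1}^{I_g}\Big(\varepsilon_{\iota(g,i)}-\frac{1}{I_g-1}\sum_{i_1=1,\,i_1\ne i}^{I_g}\varepsilon_{\iota(g,i_1)}\Big)\frac{1}{I_g-1}\sum_{i_1=1,\,i_1\ne i}^{I_g}X_{\iota(g,i_1)}^{T}(\beta-\tilde\beta_n)\\
&+d(n)\sum_{g=1}^{G}\sum_{i=1}^{I_g}\Big(\varepsilon_{\iota(g,i)}-\frac{1}{I_g-1}\sum_{i_1=1,\,i_1\ne i}^{I_g}\varepsilon_{\iota(g,i_1)}\Big)\big(m(U_{\iota(g,i)})-\tilde m(U_{\iota(g,i)})\big)\\
&-d(n)\sum_{g=1}^{G}\sum_{i=1}^{I_g}\Big(\varepsilon_{\iota(g,i)}-\frac{1}{I_g-1}\sum_{i_1=1,\,i_1\ne i}^{I_g}\varepsilon_{\iota(g,i_1)}\Big)\frac{1}{I_g-1}\sum_{i_1=1,\,i_1\ne i}^{I_g}\big(m(U_{\iota(g,i_1)})-\tilde m(U_{\iota(g,i_1)})\big)\\
={}&J_{41}-J_{42}+J_{43}-J_{44},\quad\text{say}.
\end{aligned}
$$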

By the proof of Theorem 1, we can show that

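$$d(n)\sum_{g=1}^{G}\sum_{i=1}^{I_g}\Big(\varepsilon_{\iota(g,i)}-\frac{1}{I_g-1}\sum_{i_1=1,\,i_1\ne i}^{I_g}\varepsilon_{\iota(g,i_1)}\Big)X_{\iota(g,i)}^{T}=O_p(n^{-1/2}).$$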

Therefore, combining this with the root-$n$ consistency of $\tilde\beta_n$, we have $J_{41}=o_p(n^{-1/2})$. By the same argument, we can show that $J_{42}=o_p(n^{-1/2})$. Further, it holds that

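$$
\begin{aligned}
J_{43}={}&d(n)\sum_{g=1}^{G}\sum_{i=1}^{I_g}\Big(\varepsilon_{\iota(g,i)}-\frac{1}{I_g-1}\sum_{i_1=1,\,i_1\ne i}^{I_g}\varepsilon_{\iota(g,i_1)}\Big)\big\{m(U_{\iota(g,i)})-\zeta(U_{\iota(g,i)})^{T}\{\Xi^{T}M_B\Xi\}^{-}\Xi^{T}M_BM\big\}\\
&+d(n)\sum_{g=1}^{G}\sum_{i=1}^{I_g}\Big(\varepsilon_{\iota(g,i)}-\frac{1}{I_g-1}\sum_{i_1=1,\,i_1\ne i}^{I_g}\varepsilon_{\iota(g,i_1)}\Big)\zeta(U_{\iota(g,i)})^{T}\{\Xi^{T}M_B\Xi\}^{-}\Xi^{T}M_BX(\beta-\tilde\beta_n)\\
&+d(n)\sum_{g=1}^{G}\sum_{i=1}^{I_g}\Big(\varepsilon_{\iota(g,i)}-\frac{1}{I_g-1}\sum_{i_1=1,\,i_1\ne i}^{I_g}\varepsilon_{\iota(g,i_1)}\Big)\zeta(U_{\iota(g,i)})^{T}\{\Xi^{T}M_B\Xi\}^{-}\Xi^{T}M_B\varepsilon\\
={}&J_{421}+J_{422}+J_{423},\quad\text{say}.
\end{aligned}
$$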

Following the same line as proving

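$$\sum_{g=1}^{G}\sum_{i=1}^{I_g}\Pi_{\iota(g,i)}\frac{1}{I_g-1}\sum_{i_1=1,\,i_1\ne i}^{I_g}\big(m(U_{\iota(g,i_1)})-\tilde m_n(U_{\iota(g,i_1)})\big)=o_p(n^{1/2})$$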

in the proof of Theorem 1, we have $J_{42s}=o_p(n^{-1/2})$ for $s=1$, 2 and 3. Thus $J_4=o_p(n^{-1/2})$. By the same argument, we can show that $J_5=o_p(n^{-1/2})$. This completes the proof of Theorem 3.

Proof of Theorem 4

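The consistency of $\hat\Sigma$ is straightforward to prove, and we omit the details. We show only the second result. To simplify the notation, we write

$$\varrho_{\iota(g,i)}=\big(X_{\iota(g,i)}^{T}-X_{\iota(g,i-1)}^{T}\big)(\beta-\hat\beta_n)+\big(m(U_{\iota(g,i)})-m(U_{\iota(g,i-1)})\big)-\big(\hat m_n(U_{\iota(g,i)})-\hat m_n(U_{\iota(g,i-1)})\big).$$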

Then it holds that

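$$
\begin{aligned}
\sum_{g=1}^{G}\sum_{i=2}^{I_g}\hat\psi_{\iota(g,i)}^{4}={}&\sum_{g=1}^{G}\sum_{i=2}^{I_g}\big(\varepsilon_{\iota(g,i)}-\varepsilon_{\iota(g,i-1)}\big)^{4}+\sum_{g=1}^{G}\sum_{i=2}^{I_g}\varrho_{\iota(g,i)}^{4}+4\sum_{g=1}^{G}\sum_{i=2}^{I_g}\big(\varepsilon_{\iota(g,i)}-\varepsilon_{\iota(g,i-1)}\big)^{3}\varrho_{\iota(g,i)}\\
&+4\sum_{g=1}^{G}\sum_{i=2}^{I_g}\big(\varepsilon_{\iota(g,i)}-\varepsilon_{\iota(g,i-1)}\big)\varrho_{\iota(g,i)}^{3}+6\sum_{g=1}^{G}\sum_{i=2}^{I_g}\big(\varepsilon_{\iota(g,i)}-\varepsilon_{\iota(g,i-1)}\big)^{2}\varrho_{\iota(g,i)}^{2}\\
={}&J_1+\cdots+J_5,\quad\text{say}.
\end{aligned}
$$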

For J1 we have

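$$
\begin{aligned}
\sum_{g=1}^{G}\sum_{i=2}^{I_g}\big(\varepsilon_{\iota(g,i)}-\varepsilon_{\iota(g,i-1)}\big)^{4}={}&\sum_{g=1}^{G}\sum_{i=2}^{I_g}\big\{\varepsilon_{\iota(g,i)}^{4}+\varepsilon_{\iota(g,i-1)}^{4}+4\varepsilon_{\iota(g,i)}^{2}\varepsilon_{\iota(g,i-1)}^{2}+2\varepsilon_{\iota(g,i)}^{2}\varepsilon_{\iota(g,i-1)}^{2}\\
&\qquad-4\varepsilon_{\iota(g,i)}^{2}\varepsilon_{\iota(g,i)}\varepsilon_{\iota(g,i-1)}-4\varepsilon_{\iota(g,i-1)}^{2}\varepsilon_{\iota(g,i)}\varepsilon_{\iota(g,i-1)}\big\}\\
={}&\sum_{g=1}^{G}\big[(4I_g-2)E\varepsilon_1^{4}+\{(I_g-1)(I_g+2)+4I_g\}\sigma^{4}\big]+o_p(n).
\end{aligned}
$$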

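Combining Theorem 1 and Theorem 2, it is easy to show that $\sum_{g=1}^{G}\sum_{i=2}^{I_g}\varrho_{\iota(g,i)}^{4}=o_p(1)$. Next, by the Hölder inequality, for $s=1$, 2 and 3 we have

$$\Big|\sum_{g=1}^{G}\sum_{i=2}^{I_g}\big(\varepsilon_{\iota(g,i)}-\varepsilon_{\iota(g,i-1)}\big)^{s}\varrho_{\iota(g,i)}^{4-s}\Big|\le\Big(\sum_{g=1}^{G}\sum_{i=2}^{I_g}\varrho_{\iota(g,i)}^{4}\Big)^{(4-s)/4}\Big(\sum_{g=1}^{G}\sum_{i=2}^{I_g}\big(\varepsilon_{\iota(g,i)}-\varepsilon_{\iota(g,i-1)}\big)^{4}\Big)^{s/4}.$$

Therefore, we can show that $J_i=o_p(n)$ for $i=3,\ldots,5$. Thus, the proof is complete.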

Proof of Theorems 5 and 6

Following the proof of Theorem 1, we can show that

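$$\sqrt{n}\big(\hat\beta_{jn}^{(1)A}-\beta_j\big)=\Sigma_j^{-1}\frac{1}{\sqrt{n}}\sum_{g=1}^{G}\sum_{i=1}^{I_g}\Pi_{\iota(g,i),j}\Big[\varepsilon_{\iota(g,i),j}-\frac{1}{I_gJ-1}\Big\{\sum_{i_1=1,\,i_1\ne i}^{I_g}\varepsilon_{\iota(g,i_1),j}+\sum_{j_1=1,\,j_1\ne j}^{J}\sum_{i_1=1}^{I_g}\varepsilon_{\iota(g,i_1),j_1}\Big\}\Big]+o_p(1)$$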

and

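$$\sqrt{n}\big(\hat\beta_{jn}^{(2)A}-\beta_j\big)=\Sigma_j^{-1}\frac{1}{\sqrt{n}}\sum_{g=1}^{G}\sum_{i=1}^{I_g}\Pi_{\iota(g,i),j}\Big\{\varepsilon_{\iota(g,i),j}-\frac{1}{I_g(J-1)}\sum_{j_1=1,\,j_1\ne j}^{J}\sum_{i_1=1}^{I_g}\varepsilon_{\iota(g,i_1),j_1}\Big\}+o_p(1).$$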

Therefore, combining the central limit theorem and Slutsky's theorem, we can show that Theorem 5 and Theorem 6 hold.

Proof of Theorem 7

Applying Theorem 5 and Theorem 6 and using the same argument as in the proof of Theorem 2, we can show that Theorem 7 holds.

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

AMS 1980 classifications: primary 62H12; secondary 62A10

Contributor Information

Jinhong You, Email: jyou@bios.unc.edu.

Haibo Zhou, Email: zhou@bios.unc.edu.

References

  • 1. Alizadeh AA, Eisen MB, Davis RE, Ma C, Lossos IS, Rosenwald A, Broldrick JC, Sabet H, Tran T, Yu X, Powell JI, Yang L, Marti GE, Morre T, Hudson J, Jr, Lu L, Lewis DB, Tibshirani R, Sherlock G, Chan WC, Greiner TC, Weisen-beuger DD, Armitage JO, Warnke R, Levy R, Wilson W, Grever MR, Byrd JC, Botstein D, Brown PO, Staudt LM. Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature. 2000;403:503–511. doi: 10.1038/35000501.
  • 2. Baltagi BH. Econometric Analysis of Panel Data. John Wiley & Sons; 1995.
  • 3. Baltagi BH, Li D. Series Estimation of Partially Linear Panel Data Models with Fixed Effects. Annals of Economics and Finance. 2002;3:103–116.
  • 4. Bickel PJ, Kwon J. Inference for semiparametric models: some questions and an answer. With comments and a rejoinder by the authors. Statistica Sinica. 2001;11:863–960.
  • 5. Brown PO, Botstein D. Exploring the new world of the genome with microarrays. Nature Genetics. 1999;21:33–37. doi: 10.1038/4462.
  • 6. Dudoit S, Yang YH, Lu P, Lin DM, Peng V, Nagai J, Speed TP. Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Research. 2002;30:e15. doi: 10.1093/nar/30.4.e15.
  • 7. Entorf H. Random walks with drifts: nonsense regression and spurious fixed-effect estimation. Journal of Econometrics. 1997;80:287–296.
  • 8. Fan J, Tam P, Vande Woude G, Ren Y. Normalization and analysis of cDNA micro-arrays using within-array replications applied to neuroblastoma cell response to a cytokine. Proceedings of the National Academy of Science. 2004:1135–1140. doi: 10.1073/pnas.0307557100.
  • 9. Fan J, Peng H, Huang T. Semilinear high-dimensional model for normalization of microarray data: a theoretical analysis and partial consistency. Journal of American Statistical Association. 2005;471:781–798.
  • 10. Fan J, Huang T. Profile likelihood inferences on semiparametric varying-coefficient partially linear models. Bernoulli. 2005;11:1031–1057.
  • 11. Golub TR, Slonim DK, Tamayo P, Huard C, Gassenbeek M, Mesirov P, Celler H, Loh ML, Downing JR, Caligiuri MA, Bloomfield CD, Lander ES. Molecular classification of cancer: Class discovery and class prediction by gene expression monitoring. Science. 1999;286:531–537. doi: 10.1126/science.286.5439.531.
  • 12. Hedenfalk I, Duggan D, Chen Y, Radmacher M, Bittner M, Simon R, Meltzer P, Gusterson B, Esteller M, Kallioniemi O, Wilfond B, Borf A, Trent J. Gene-expression profiles in hereditary breast cancer. The New England Journal of Medicine. 2001;344:539–548. doi: 10.1056/NEJM200102223440801.
  • 13. Honoré BE. Orthogonality conditions for Tobit models with fixed effects and lagged dependent variables. Journal of Econometrics. 1994;59:35–61.
  • 14. Horowitz JL, Mammen E. Nonparametric estimation of an additive model with a link function. The Annals of Statistics. 2004;32:2412–2443.
  • 15. Huang J, Kuo H, Koroleva I, Zhang C, Soares MB. A semi-linear model for normalization and analysis of cDNA microarray data. Tech Report 321, University of Iowa, Department of Statistics. 2003.
  • 16. Huang J, Zhang C. Asymptotic analysis of a two-way semiparametric regression model for microarray data. Statistica Sinica. 2003; to appear.
  • 17. Huang J, Wang D, Zhang C. A two-way semilinear model for normalization and analysis of cDNA microarray data. Journal of the American Statistical Association. 2005;471:814–829.
  • 18. Kroll TC, Wölfl S. Ranking: a closer look on globalization methods for normalization of gene expression arrays. Nucleic Acids Research. 2002;30:e50. doi: 10.1093/nar/30.11.e50.
  • 19. Lichtenberg FR. Estimation of the internal adjustment cost model using longitudinal establishment data. Review of Economics and Statistics. 1988;70:421–430.
  • 20. Mack YP, Silverman BW. Weak and strong uniform consistency of kernel regression estimates. Z. Wahrsch. Verw. Gebiete. 1982;61:405–415.
  • 21. Nguyen DV, Wang N, Carroll RJ. Evaluation of missing value estimation for microarray data. Journal of Data Science. 2004;2:347–370.
  • 22. Schaid DJ. Case-parents design for gene-environment interaction. Genetic Epidemiology. 1999;16:261–273. doi: 10.1002/(SICI)1098-2272(1999)16:3<261::AID-GEPI3>3.0.CO;2-M.
  • 23. Schena M, Shalon D, Davis RW, Brown PO. Quantitative monitoring of gene expression patterns with a complementary cDNA microarray. Science. 1995;270:467–470. doi: 10.1126/science.270.5235.467.
  • 24. Tseng GC, Oh MK, Rohlin L, Liao JC, Wong WH. Issues in cDNA microarray analysis: quality filtering, channel normalization, models of variations and assessment of gene effects. Nucleic Acids Research. 2001;29:2549–2557. doi: 10.1093/nar/29.12.2549.
  • 25. You J, Zhou Y, Zhou X. Series Estimation in Partially Linear In-slide Regression Models. Journal of the Royal Statistical Society, Ser B. 2005; submitted.
