Structure Identification, Estimation and Variable Selection for Varying Coefficient EV Models With Longitudinal Data

Mingtao Zhao; Jingxiang Cao; Jun Sun; Yan Fan; Sanying Feng; Fanqun Li

doi:10.1002/sim.70434

2026 Apr 9;45:e70434. doi: 10.1002/sim.70434

Mingtao Zhao ¹, Jingxiang Cao ¹, Jun Sun ¹, Yan Fan ², Sanying Feng ³, Fanqun Li ¹,✉

PMCID: PMC13112365 PMID: 41956973

ABSTRACT

In this article, we propose a bias‐corrected double penalized quadratic inference functions method to simultaneously identify model structure, estimate parameters, and perform variable selection for varying coefficient errors‐in‐variables (EV) models with longitudinal data. Unlike the linear models or the partial linear varying coefficient models, the proposed method does not assume in advance whether each regression coefficient is constant or varying. Instead, it represents each coefficient as a nonparametric function and identifies from the data whether it is constant or varying. By employing a B‐spline basis to approximate the unknown coefficient functions, the proposed method integrates a bias‐corrected quadratic inference function with two penalty terms to achieve structure identification, estimation, and variable selection. Under certain regularity conditions, the consistency and sparsity properties of the estimator are established. Moreover, a three‐step iterative algorithm is developed to implement the proposed method in practice. Simulation studies and a real data analysis demonstrate the superior finite‐sample performance of the method.

Keywords: bias‐corrected double penalized quadratic inference functions, longitudinal data, structure identification, variable selection, varying coefficient EV models

1. Introduction

Varying coefficient (VC) models extend the classical linear regression framework by treating each regression coefficient as a smooth function, thus allowing covariate effects to evolve dynamically across longitudinal measurements. Their enhanced flexibility and interpretability have spurred considerable interest in recent longitudinal studies. A wide range of estimation techniques has been developed, including kernel methods [1, 2], local polynomial approaches [3], least squares estimators [4], and spline‐based methods [5, 6, 7]. In addition, Xue and Zhu [8] proposed an empirical likelihood‐based interval estimation method, and Qu and Li [9] introduced a penalized quadratic inference functions (QIF) method for model estimation. Comprehensive reviews of VC models can be found in Fan and Zhang [10] and Park et al. [11].

Suppose longitudinal data $\{(Y_{i j}, X_{i j}, t_{i j}) : i = 1, 2, \dots, n, j = 1, 2, \dots, n_{i}\}$ satisfy the VC model

\begin{align} Y_{i j} = X_{i j}^{T} α (t_{i j}) + ε_{i j}, i = 1, 2, \dots, n, j = 1, 2, \dots, n_{i}, \end{align}

(1)

where $Y_{i j} \in ℝ$ is the response variable, $X_{i j} \in ℝ^{q}$ represents the covariates at $t_{i j} \in ℝ$ , and $X_{i j} = {(X_{i j}^{1}, X_{i j}^{2}, \dots, X_{i j}^{q})}^{T}$ . The term $ε_{i j} \in ℝ$ denotes a zero‐mean stochastic process, and $ε_{i j}$ is independent of $X_{i j}$ . For each $t \in ℝ$ , the coefficient vector $α (t) = {(α_{1} (t), α_{2} (t), \dots, α_{q} (t))}^{T}$ consists of some unknown smooth functions $α_{k} (t) (k = 1, 2, \dots, q)$ defined on the interval $t \in [0, 1]$ with some scale transformation. Some moment assumptions are stated as $E (Y_{i j} | X_{i j}, t_{i j}) = μ_{i j}$ and $var (Y_{i j} | X_{i j}, t_{i j}) = ν (μ_{i j})$ , where $ν (\cdot)$ is a known variance function.

Previous studies often assume that covariates are measured without errors. However, in practice, precise measurements of some covariates are often difficult to obtain, which results in measurement errors or unobserved covariates. Neglecting these measurement errors can result in biased parameter estimates and misleading inferences. To address this issue, we extend model (1) by incorporating additive measurement errors in the covariates, resulting in the varying coefficient errors‐in‐variables (VCEV) model

\begin{align} \begin{cases} Y_{i j} = X_{i j}^{T} α (t_{i j}) + ε_{i j}, \\ W_{i j} = X_{i j} + u_{i j}, \end{cases} \quad i = 1, 2, \dots, n, j = 1, 2, \dots, n_{i}, \end{align}

(2)

where $W_{i j} = {(W_{i j}^{1}, W_{i j}^{2}, \dots, W_{i j}^{q})}^{T} \in ℝ^{q}$ represents the observed covariates, and $u_{i j} = {(u_{i j}^{1}, u_{i j}^{2}, \dots, u_{i j}^{q})}^{T} \in ℝ^{q}$ denotes the zero‐mean measurement errors with a diagonal covariance matrix $\Sigma_{u}$ . Additionally, we assume that $cov (u_{i j_{1}}, u_{i j_{2}}) = 0$ for $j_{1} \neq j_{2}$ , the errors $u_{i j}$ are mutually independent, and all $u_{i j}$ are independent of $(X_{i j}, t_{i j}, ε_{i j})$ , where $cov (\cdot)$ represents the covariance operator. To effectively account for measurement errors, supplementary information regarding $\Sigma_{u}$ is required in practice, and it is typically assumed that $\Sigma_{u}$ can be either estimated from the data or known in advance.
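To make the effect of ignoring measurement error concrete, the following Python sketch simulates a deliberately simplified special case: one covariate, one constant coefficient, one observation per subject, and a known error variance. It compares the naive least squares slope computed from $W$ with a moment‐corrected slope that subtracts $n σ_{u}^{2}$ from $\sum W^{2}$. The function names, sample size, and parameter values are illustrative assumptions and are not taken from the article.

```python
import random

def simulate(n, beta, sigma_u, seed=0):
    """Draw (W, Y) with Y = beta * X + eps and W = X + u (toy setting)."""
    rng = random.Random(seed)
    w, y = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)                    # true, unobserved covariate
        y.append(beta * x + rng.gauss(0.0, 0.5))   # response with model error
        w.append(x + rng.gauss(0.0, sigma_u))      # error-prone surrogate
    return w, y

def naive_and_corrected(w, y, sigma_u):
    """Return the naive slope and the slope with the error variance removed."""
    n = len(w)
    swy = sum(wi * yi for wi, yi in zip(w, y))
    sww = sum(wi * wi for wi in w)
    naive = swy / sww                              # attenuated toward zero
    corrected = swy / (sww - n * sigma_u ** 2)     # moment correction
    return naive, corrected

w, y = simulate(n=20000, beta=2.0, sigma_u=0.5)
naive, corrected = naive_and_corrected(w, y, sigma_u=0.5)
print(f"naive = {naive:.3f}, corrected = {corrected:.3f}")
```

In this toy setting the naive slope is pulled toward zero by roughly the factor $var (X) / (var (X) + σ_{u}^{2})$, while the corrected slope stays close to the true value.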

For model (2), Li and Greene [12] applied a locally corrected method to estimate the coefficient functions. Yang, Li and Peng [13] explored the empirical likelihood method for model (2) in the context of longitudinal data. For the VCEV models and partial linear varying coefficient EV (PLVCEV) models with longitudinal data, Zhao, Gao and Cui [14] and Zhao et al. [15] proposed a type of bias‐corrected penalized QIF method, which handles measurement errors in covariates and within‐subject correlations simultaneously, and estimates and selects the significant nonzero parametric and nonparametric components. More studies about VCEV models with longitudinal data can be found in Zhao, Gao and Cui [14] and references therein; details are omitted here. Moreover, for structural change points in varying coefficient models, Zhao et al. [16] proposed the adaptive jump‐preserving (AJP) estimator. In the context of simultaneous measurement‐error correction and change‐point detection, Zhao et al. [17] developed the single‐index measurement error jump regression model. Further work on model (2) is not reviewed here.

Structure identification and variable selection [18, 19, 20] are of fundamental importance, as the validity of a fitted model and its subsequent inferences are critically dependent on the correctness of the specified structure. Linear models traditionally assume that all regression coefficients are constant; however, VC models generalize this concept by allowing the coefficients to be functions that vary over the domain. Building on this idea, partial linear varying coefficient (PLVC) models further distinguish between covariates by assuming that while some effects remain constant, others are set as varying functions in advance. In practice, arbitrarily designating which subset of variables should have constant versus varying effects on the response introduces a significant risk of model misspecification. Tang, Wang and Zhu [21]; Wang and Lin [22] and Xu et al. [23] have developed methodologies that consistently differentiate among varying coefficients, nonzero constant coefficients, and zero coefficients, in a manner that essentially achieves performance as if the true model structure and relevant variables were known a priori, thereby yielding robust selection outcomes in the analysis of longitudinal data. Inspired by Tang et al. [21], Xu et al. [23] and Wang and Lin [24], we propose a structure identification, estimation and variable selection approach based on the bias‐corrected double penalized quadratic inference functions (BCDPQIF), which is capable of addressing both measurement errors and within‐subject correlations, while accurately identifying the model structure. Additionally, by appropriately choosing tuning parameters, we provide a theoretical analysis of the consistency and sparsity properties of the proposed method.

The rest of this article is organized as follows. The BCDPQIF method and some theoretical results are stated in Section 2. Computational algorithm and selection of tuning parameters are presented in Section 3. Simulation studies and a real data analysis are performed to evaluate the proposed method in Section 4. Finally, we provide the conclusions and a discussion in Section 5. The derivation process of some equations, the proofs of theorems and some other numerical results are provided in Appendix A (Tables A1, A2, A3, A4).

2. Methodology and Main Results

2.1. Bias‐Corrected Double Penalized Quadratic Inference Functions Method

Following Wang and Lin [24], $α_{k} (t) (k = 1, 2, \dots, q)$ can be represented approximately as

\begin{align} α_{k} (t) = β_{k} + η_{k} (t), \end{align}

(3)

where $E (η_{k} (t)) = 0$ . Denote a B‐spline basis $B (t) = {(B_{1} (t), B_{2} (t), \dots, B_{L} (t))}^{T}$ with the order $d$ , where $L = K + d$ , $K (> 0)$ is the number of interior knots. The compact support and piecewise‐polynomial structure of B‐splines markedly reduce computational complexity, facilitating rapid model fitting even in high‐dimensional or large‐scale datasets. Additionally, B‐splines deliver superior flexibility and precision in coefficient function estimation, adeptly capturing localized features and complex functional patterns. Thus, $η_{k} (t)$ can be approximated as

\begin{align} η_{k} (t) \approx \sum_{l = 1}^{L} B_{l} (t) γ_{k l} = B {(t)}^{T} γ_{k}, \end{align}

(4)

where $γ_{k} = {(γ_{k 1}, γ_{k 2}, \dots, γ_{k L})}^{T} \in ℝ^{L}$ , $k = 1, 2, \dots, q$ , is the vector of B‐spline regression coefficients. Thus we have

\begin{align} α_{k} (t) \approx β_{k} + B {(t)}^{T} γ_{k}, k = 1, 2, \dots, q . \end{align}

(5)
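As a rough illustration of Equations (4) and (5), the following Python sketch evaluates a clamped B‐spline basis on $[0, 1]$ with the Cox‐de Boor recursion and then forms $β_{k} + B {(t)}^{T} γ_{k}$. The function names, the equally spaced interior knots, and the numerical values of $β_{k}$ and $γ_{k}$ are illustrative assumptions; the centering constraint $E (η_{k} (t)) = 0$ is not enforced here.

```python
def bspline_basis(t, interior_knots, order):
    """Evaluate the L = K + order clamped B-spline basis functions at t in [0, 1]."""
    knots = [0.0] * order + sorted(interior_knots) + [1.0] * order
    # Order-1 (piecewise constant) functions, one per knot span.
    basis = []
    for i in range(len(knots) - 1):
        left, right = knots[i], knots[i + 1]
        inside = left <= t < right
        # Close the last non-empty span so that t = 1 is covered.
        if t == knots[-1] and left < right and right == knots[-1]:
            inside = True
        basis.append(1.0 if inside else 0.0)
    # Cox-de Boor recursion raises the order one step at a time.
    for k in range(2, order + 1):
        raised = []
        for i in range(len(knots) - k):
            value = 0.0
            d_left = knots[i + k - 1] - knots[i]
            d_right = knots[i + k] - knots[i + 1]
            if d_left > 0:
                value += (t - knots[i]) / d_left * basis[i]
            if d_right > 0:
                value += (knots[i + k] - t) / d_right * basis[i + 1]
            raised.append(value)
        basis = raised
    return basis

def alpha_approx(t, beta_k, gamma_k, interior_knots, order):
    """Spline approximation beta_k + B(t)^T gamma_k of one coefficient function."""
    b = bspline_basis(t, interior_knots, order)
    return beta_k + sum(bl * gl for bl, gl in zip(b, gamma_k))

knots = [0.25, 0.5, 0.75]          # K = 3 interior knots (assumed)
order = 4                          # cubic splines, so L = K + d = 7
print(len(bspline_basis(0.3, knots, order)))
print(alpha_approx(0.3, 1.5, [0.0] * 7, knots, order))
```

When every entry of $γ_{k}$ is zero the approximation collapses to the constant $β_{k}$, which is exactly the case the structure identification step is designed to detect.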

By replacing $α_{k} (t) (k = 1, 2, \dots, q)$ by Equation (5), model (2) can be represented as

\begin{align} \begin{cases} Y_{i j} \approx X_{i j}^{T} β + {\tilde{X}}_{i j}^{T} γ + ε_{i j}, \\ W_{i j} = X_{i j} + u_{i j}, \\ {\tilde{W}}_{i j} = {\tilde{X}}_{i j} + ũ_{i j}, \end{cases} \quad i = 1, 2, \dots, n, j = 1, 2, \dots, n_{i}, \end{align}

(6)

where $β = {(β_{1}, β_{2}, \dots, β_{q})}^{T}$ , $γ = {(γ_{1}^{T}, γ_{2}^{T}, \dots, γ_{q}^{T})}^{T}$ , $B_{i j} = I_{q} \otimes B (t_{i j})$ , ${\tilde{X}}_{i j} = B_{i j} X_{i j}$ , ${\tilde{W}}_{i j} = B_{i j} W_{i j}$ , $ũ_{i j} = B_{i j} u_{i j}$ , and $I_{q}$ is the $q \times q$ identity matrix. As before, $u_{i j}$ are independent of $(X_{i j}, t_{i j}, ε_{i j})$ , with $E (u_{i j}) = 0$ , $cov (u_{i j}) = \Sigma_{u}$ , and $cov (u_{i j_{1}}, u_{i j_{2}}) = 0$ for $j_{1} \neq j_{2}$ .
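The construction ${\tilde{X}}_{i j} = (I_{q} \otimes B (t_{i j})) X_{i j}$ simply stacks each covariate multiplied by the basis vector. The short Python sketch below shows this for one observation and checks that $X_{i j}^{T} β + {\tilde{X}}_{i j}^{T} γ$ equals $\sum_{k} X_{i j}^{k} (β_{k} + B {(t_{i j})}^{T} γ_{k})$. The helper names and the small numerical inputs are hypothetical, chosen only for illustration.

```python
def kron_design(x_ij, b_t):
    """Return X_tilde_ij = (I_q kron B(t_ij)) X_ij as a flat list of length q * L."""
    return [x * b for x in x_ij for b in b_t]

def linear_predictor(x_ij, b_t, beta, gamma_flat):
    """Compute X_ij^T beta + X_tilde_ij^T gamma for one observation."""
    x_tilde = kron_design(x_ij, b_t)
    constant_part = sum(x * bk for x, bk in zip(x_ij, beta))
    varying_part = sum(xt * g for xt, g in zip(x_tilde, gamma_flat))
    return constant_part + varying_part

x_ij = [2.0, -1.0]                 # q = 2 covariates (assumed values)
b_t = [0.25, 0.75]                 # L = 2 basis values at t_ij (assumed values)
beta = [1.0, 0.5]
gamma_flat = [1.0, 1.0, 2.0, 2.0]  # (gamma_1^T, gamma_2^T)^T stacked

print(kron_design(x_ij, b_t))
print(linear_predictor(x_ij, b_t, beta, gamma_flat))
```

The same function applied to $W_{i j}$ instead of $X_{i j}$ gives ${\tilde{W}}_{i j}$, which is what is available in practice.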

We can obtain the following generalized estimating equations (GEE) about $θ = {(β^{T}, γ^{T})}^{T}$ as

\begin{align} \sum_{i = 1}^{n} {(W_{i}, {\tilde{W}}_{i})}^{T} V_{i}^{- 1} (Y_{i} - (W_{i}, {\tilde{W}}_{i}) θ) = 0 . \end{align}

(7)

Thus, we have

\begin{align} E (\sum_{i = 1}^{n} [{(W_{i}, {\tilde{W}}_{i})}^{T} V_{i}^{- 1} (Y_{i} - (W_{i}, {\tilde{W}}_{i}) θ)]) \\ = E (\sum_{i = 1}^{n} [({(X_{i}, {\tilde{X}}_{i})}^{T} + {(u_{i}, ũ_{i})}^{T}) V_{i}^{- 1} (Y_{i} - ((X_{i}, {\tilde{X}}_{i}) + (u_{i}, ũ_{i})) θ)]) \\ = E (\sum_{i = 1}^{n} [{(X_{i}, {\tilde{X}}_{i})}^{T} V_{i}^{- 1} (Y_{i} - (X_{i}, {\tilde{X}}_{i}) θ) + {(u_{i}, ũ_{i})}^{T} V_{i}^{- 1} (Y_{i} - (X_{i}, {\tilde{X}}_{i}) θ) \\ - {(X_{i}, {\tilde{X}}_{i})}^{T} V_{i}^{- 1} (u_{i}, ũ_{i}) θ - {(u_{i}, ũ_{i})}^{T} V_{i}^{- 1} (u_{i}, ũ_{i}) θ]) \\ = - n E ({(u_{i}, ũ_{i})}^{T} V_{i}^{- 1} (u_{i}, ũ_{i}) θ) \\ \neq 0 . \end{align}

This demonstrates that Equation (7) is biased when $θ \neq 0$ , so it cannot be used to obtain unbiased estimates. To overcome this drawback, we obtain an unbiased estimating equation by adding the term $E ({(u_{i}, ũ_{i})}^{T} V_{i}^{- 1} (u_{i}, ũ_{i}) θ)$ . Accordingly, the bias‐corrected GEE for $θ$ can be derived as

\begin{align} \sum_{i = 1}^{n} \{{(W_{i}, {\tilde{W}}_{i})}^{T} V_{i}^{- 1} (Y_{i} - (W_{i}, {\tilde{W}}_{i}) θ) + E ({(u_{i}, ũ_{i})}^{T} V_{i}^{- 1} (u_{i}, ũ_{i}) θ)\} = 0, \end{align}

(8)

where $W_{i} = {(W_{i 1}, W_{i 2}, \dots, W_{i n_{i}})}^{T}$ , ${\tilde{W}}_{i} = {({\tilde{W}}_{i 1}, {\tilde{W}}_{i 2}, \dots, {\tilde{W}}_{i n_{i}})}^{T}$ , $Y_{i} = {(Y_{i 1}, Y_{i 2}, \dots, Y_{i n_{i}})}^{T}$ , $V_{i}$ is the covariance matrix of $Y_{i}$ . Then we take $V_{i}$ as $V_{i} = A_{i}^{1 / 2} R_{i} (ρ) A_{i}^{1 / 2}$ , where $R_{i} (ρ)$ is a common working correlation matrix with a nuisance parameter $ρ$ , $A_{i} = diag (var (Y_{i 1}), var (Y_{i 2}), \dots, var (Y_{i n_{i}}))$ . Liang and Zeger [25] stated that, in some simple cases, a consistent estimator for $ρ$ may not exist, which could undermine the validity of the GEE method.

To overcome this limitation of the GEE, Qu, Lindsay and Li [26] proposed a QIF approach, assuming that $R_{i}^{- 1} (ρ) = \sum_{κ = 1}^{s} a_{κ} M_{κ}$ , where $M_{κ} (κ = 1, 2, \dots, s)$ are some known simple matrices and $a_{κ} (κ = 1, 2, \dots, s)$ are some unknown constants. The QIF method treats $a_{κ} (κ = 1, 2, \dots, s)$ as the nuisance parameters, and approximates $R_{i}^{- 1} (ρ)$ by a linear combination of a class of basis matrices as

\begin{align} R_{i}^{- 1} (ρ) = \sum_{κ = 1}^{s} a_{κ} M_{κ} . \end{align}

(9)

One can see more details about $M_{κ} (κ = 1, 2, \dots, s)$ in Qu, Lindsay, and Li [26], which are omitted here. By substituting $R_{i}^{- 1} (ρ)$ into Equation (8), the resulting new bias‐corrected GEE is derived as

\begin{align} \sum_{i = 1}^{n} \{{(W_{i}, {\tilde{W}}_{i})}^{T} A_{i}^{- 1 / 2} (\sum_{κ = 1}^{s} a_{κ} M_{κ}) A_{i}^{- 1 / 2} (Y_{i} - (W_{i}, {\tilde{W}}_{i}) θ) \\ + E ({(u_{i}, ũ_{i})}^{T} A_{i}^{- 1 / 2} (\sum_{κ = 1}^{s} a_{κ} M_{κ}) A_{i}^{- 1 / 2} (u_{i}, ũ_{i}) θ)\} = 0 . \end{align}

(10)

Thus, following Qu, Lindsay and Li [26], one can define a bias‐corrected extended score function ${\overline{g}}_{n} (θ)$ as

\begin{align} {\overline{g}}_{n} (θ) & = \frac{1}{n} \sum_{i = 1}^{n} g_{i} (θ) \\ = \frac{1}{n} \sum_{i = 1}^{n} (\begin{array}{c} {(W_{i}, {\tilde{W}}_{i})}^{T} A_{i}^{- 1 / 2} M_{1} A_{i}^{- 1 / 2} (Y_{i} - (W_{i}, {\tilde{W}}_{i}) θ) + D_{i}^{(1)} θ \\ {(W_{i}, {\tilde{W}}_{i})}^{T} A_{i}^{- 1 / 2} M_{2} A_{i}^{- 1 / 2} (Y_{i} - (W_{i}, {\tilde{W}}_{i}) θ) + D_{i}^{(2)} θ \\ ⋮ \\ {(W_{i}, {\tilde{W}}_{i})}^{T} A_{i}^{- 1 / 2} M_{s} A_{i}^{- 1 / 2} (Y_{i} - (W_{i}, {\tilde{W}}_{i}) θ) + D_{i}^{(s)} θ \end{array}), \end{align}

where $D_{i}^{(κ)} = E ({(u_{i}, ũ_{i})}^{T} A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2} (u_{i}, ũ_{i})), κ = 1, 2, \dots, s .$ By some matrix calculations, we have

\begin{align} D_{i}^{(κ)} = \\ (\begin{matrix} tr (A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2}) \cdot \Sigma_{u} & \Sigma_{u} \otimes (1_{1 \times n_{i}} diag (A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2}) B_{i}^{T}) \\ {(\Sigma_{u} \otimes (1_{1 \times n_{i}} diag (A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2}) B_{i}^{T}))}^{T} & \Sigma_{u} \otimes (B_{i} diag (A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2}) B_{i}^{T}) \end{matrix}), \end{align}

(11)

where $1_{1 \times n_{i}} = {(1, 1, \dots, 1)}_{1 \times n_{i}}$ , $B_{i} = (B (t_{i 1}), B (t_{i 2}), \dots, B (t_{i n_{i}}))$ , $diag (\cdot)$ is a diagonal matrix operator. The detailed derivation process about Equation (11) can be found in Appendix A.

Since $\Sigma_{u}$ is unknown, $D_{i}^{(κ)}$ needs to be estimated. Without loss of generality, we first consider a balanced longitudinal dataset, that is, $n_{i} = n_{0}$ ( $i = 1, 2, \dots, n$ ) and $n_{0}$ is a fixed positive integer. Suppose $W_{i j}$ ( $j = 1, 2, \dots, n_{0}$ ) can be observed $m_{i}$ times for each subject, with $W_{i j}^{(r)} = X_{i j} + u_{i j}^{(r)}$ , $r = 1, 2, \dots, m_{i}$ . A consistent estimator for $\Sigma_{u}$ can be computed as follows

\begin{align} {\hat{\Sigma}}_{u} = \frac{1}{n n_{0}} \sum_{i = 1}^{n} \sum_{j = 1}^{n_{0}} (\frac{1}{m_{i} - 1} \sum_{r = 1}^{m_{i}} (W_{i j}^{(r)} - {\overline{W}}_{i j}) {(W_{i j}^{(r)} - {\overline{W}}_{i j})}^{T}), \end{align}

(12)

where ${\overline{W}}_{i j} = \frac{1}{m_{i}} \sum_{r = 1}^{m_{i}} W_{i j}^{(r)}$ . It should be pointed out that for unbalanced longitudinal data, following the idea from Xue, Qu and Zhou [27], one can utilize cluster‐specific transformation matrices to reformat the data with an unbalanced cluster size, details are omitted here. Then one can get a consistent estimator ${\hat{D}}_{i}^{(κ)}$ using the plug‐in method as

\begin{align} {\hat{D}}_{i}^{(κ)} = \\ (\begin{matrix} tr (A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2}) \cdot {\hat{\Sigma}}_{u} & {\hat{\Sigma}}_{u} \otimes (1_{1 \times n_{i}} diag (A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2}) B_{i}^{T}) \\ {({\hat{\Sigma}}_{u} \otimes (1_{1 \times n_{i}} diag (A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2}) B_{i}^{T}))}^{T} & {\hat{\Sigma}}_{u} \otimes (B_{i} diag (A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2}) B_{i}^{T}) \end{matrix}) . \end{align}

(13)
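The replicate‐based estimator in Equation (12) is a pooled within‐cell sample covariance. A minimal Python sketch for the balanced case follows; the nested‐list data layout (`replicates[i][j]` holding the $m_{i}$ repeated measurements of $W_{i j}$) and the function name are assumptions made for illustration only.

```python
def estimate_sigma_u(replicates):
    """Pooled replicate covariance of the measurement errors, as in Equation (12).

    replicates[i][j] is a list of m_i repeated measurements of W_ij,
    each measurement being a list of length q. Requires m_i >= 2.
    """
    n = len(replicates)
    n0 = len(replicates[0])
    q = len(replicates[0][0][0])
    sigma = [[0.0] * q for _ in range(q)]
    for subject in replicates:
        for reps in subject:
            m = len(reps)
            mean = [sum(r[a] for r in reps) / m for a in range(q)]
            for r in reps:
                dev = [r[a] - mean[a] for a in range(q)]
                # Accumulate the outer product of deviations, scaled by 1 / (m - 1).
                for a in range(q):
                    for b in range(q):
                        sigma[a][b] += dev[a] * dev[b] / (m - 1)
    return [[value / (n * n0) for value in row] for row in sigma]

# One subject, one time point, two replicates of a two-dimensional covariate.
toy = [[[[1.0, 0.0], [3.0, 0.0]]]]
print(estimate_sigma_u(toy))
```

Because the model takes the measurement error covariance to be diagonal, one could also keep only the diagonal of the result; the sketch returns the full matrix as written in Equation (12).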

For the sake of simplicity, for both balanced and unbalanced data, we denote the estimators of $\Sigma_{u}$ and $D_{i}^{(κ)}$ as ${\hat{\Sigma}}_{u}$ and ${\hat{D}}_{i}^{(κ)}$ , respectively. Therefore, based on Equations (12) and (13), a consistent estimator ${\hat{\overline{g}}}_{n} (θ)$ for ${\overline{g}}_{n} (θ)$ can be obtained as

\begin{align} {\hat{\overline{g}}}_{n} (θ) & = \frac{1}{n} \sum_{i = 1}^{n} ĝ_{i} (θ) \\ = \frac{1}{n} \sum_{i = 1}^{n} (\begin{matrix} {(W_{i}, {\tilde{W}}_{i})}^{T} A_{i}^{- 1 / 2} M_{1} A_{i}^{- 1 / 2} (Y_{i} - (W_{i}, {\tilde{W}}_{i}) θ) + {\hat{D}}_{i}^{(1)} θ \\ {(W_{i}, {\tilde{W}}_{i})}^{T} A_{i}^{- 1 / 2} M_{2} A_{i}^{- 1 / 2} (Y_{i} - (W_{i}, {\tilde{W}}_{i}) θ) + {\hat{D}}_{i}^{(2)} θ \\ ⋮ \\ {(W_{i}, {\tilde{W}}_{i})}^{T} A_{i}^{- 1 / 2} M_{s} A_{i}^{- 1 / 2} (Y_{i} - (W_{i}, {\tilde{W}}_{i}) θ) + {\hat{D}}_{i}^{(s)} θ \end{matrix}) . \end{align}

It is evident that the system $E ({\hat{\overline{g}}}_{n} (θ)) = 0$ contains more equations than parameters to be estimated, so it is over‐identified. As a result, it cannot be solved directly to estimate $θ$ . To overcome this problem, following Qu, Lindsay, and Li [26], we construct a bias‐corrected QIF (BCQIF) about $θ$ as

\begin{align} Q_{n} (θ) & = n {\hat{\overline{g}}}_{n}^{T} (θ) Ω_{n}^{- 1} {\hat{\overline{g}}}_{n} (θ), \end{align}

(14)

where $Ω_{n} = \frac{1}{n} \sum_{i = 1}^{n} ĝ_{i} (θ) ĝ_{i}^{T} (θ)$ . The matrix $Ω_{n}$ is the sample covariance of the moment conditions and serves as the optimal weighting matrix in the generalized method of moments (GMM) criterion. With this weighting, the resulting estimator attains the smallest asymptotic variance among estimators based on the same moment conditions (Qu, Lindsay and Li [26]).

Then a BCQIF estimator $\tilde{θ}$ can be obtained as

\begin{align} \tilde{θ} & = \underset{θ}{\arg \min} Q_{n} (θ) . \end{align}

(15)
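To show how Equations (14) and (15) fit together, the following Python sketch evaluates a bias‐corrected QIF in a heavily simplified setting: a single covariate with a constant coefficient (no spline part), $A_{i} = I$, and $s = 2$ basis matrices, with $M_{1}$ the identity and $M_{2}$ the matrix with zeros on the diagonal and ones elsewhere. Under these assumptions the correction term is $tr (M_{1}) σ_{u}^{2} θ = n_{i} σ_{u}^{2} θ$ for the first score and zero for the second. The simulated data, the grid search, and all names are illustrative choices, not the article's implementation.

```python
import random

def simulate(n, n_i, theta, sigma_u, seed=1):
    """Toy longitudinal data: scalar covariate, constant coefficient."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        shared = rng.gauss(0.0, 0.5)   # subject effect, induces within-subject correlation
        w_i, y_i = [], []
        for _ in range(n_i):
            x = rng.gauss(0.0, 1.0)
            y_i.append(theta * x + shared + rng.gauss(0.0, 0.5))
            w_i.append(x + rng.gauss(0.0, sigma_u))   # error-prone covariate
        data.append((w_i, y_i))
    return data

def bcqif(theta, data, sigma_u):
    """Q_n(theta) = n * gbar^T Omega_n^{-1} gbar for the two-score toy case."""
    n = len(data)
    g1_sum = g2_sum = o11 = o12 = o22 = 0.0
    for w_i, y_i in data:
        resid = [y - w * theta for w, y in zip(w_i, y_i)]
        diag = sum(w * r for w, r in zip(w_i, resid))
        # Score for M_1 = I, with the bias correction n_i * sigma_u^2 * theta.
        g1 = diag + len(w_i) * sigma_u ** 2 * theta
        # Score for M_2 (ones off the diagonal); tr(M_2) = 0, so no correction.
        g2 = sum(w_i) * sum(resid) - diag
        g1_sum += g1
        g2_sum += g2
        o11 += g1 * g1
        o12 += g1 * g2
        o22 += g2 * g2
    g1_bar, g2_bar = g1_sum / n, g2_sum / n
    o11, o12, o22 = o11 / n, o12 / n, o22 / n
    det = o11 * o22 - o12 * o12
    # Closed-form inverse of the 2 x 2 matrix Omega_n.
    quad = (g1_bar ** 2 * o22 - 2 * g1_bar * g2_bar * o12 + g2_bar ** 2 * o11) / det
    return n * quad

data = simulate(n=500, n_i=3, theta=2.0, sigma_u=0.5)
grid = [k / 50 for k in range(0, 201)]   # candidate values in [0, 4]
theta_tilde = min(grid, key=lambda th: bcqif(th, data, 0.5))
print(theta_tilde)
```

A grid search is used only to keep the sketch dependency‐free; in the full model $θ$ is a vector and a Newton‐type iteration, as in Section 3.1, is the natural choice.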

The VCEV models assume that all regression coefficients are varying, the PLVCEV models presuppose that some coefficients are constant while the others are varying, and the linear EV models set all coefficients to constants. In practice, each of these assumptions carries a risk of model misspecification. Moreover, some regression coefficients may be zero. Structure identification and variable selection are therefore indispensable. To address these problems, we propose the following BCDPQIF $Q_{p} (θ)$ , which performs structure identification, estimation and variable selection simultaneously for model (2), defined as

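\begin{align} Q_{p} (θ) = Q_{n} (θ) + n \sum_{k = 1}^{q} p_{λ_{1 k}} (‖ γ_{k} ‖_{H}) + n \sum_{k = 1}^{q} p_{λ_{2 k}} (| β_{k} |) I (‖ γ_{k} ‖_{H} = 0), \end{align}

(16)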

where $‖ γ_{k} ‖_{H} = {(γ_{k}^{T} H γ_{k})}^{1 / 2}$ , $k = 1, 2, \dots, q$ , and $H = {(h_{i j})}_{L \times L}$ with $h_{i j} = \int_{0}^{1} B_{i} (t) B_{j} (t) d t$ . $I (\cdot)$ is the indicator function, and $λ_{1 k}, λ_{2 k} (k = 1, 2, \dots, q)$ are tuning parameters. $p_{λ_{1 k}} (\cdot)$ and $p_{λ_{2 k}} (\cdot)$ are the SCAD [18] penalty functions defined as

\begin{align} p_{λ} (| w |) = \begin{cases} λ | w |, & | w | \leq λ, \\ \frac{(a^{2} - 1) λ^{2} - {(| w | - a λ)}^{2}}{2 (a - 1)}, & λ < | w | \leq a λ, \\ \frac{1}{2} (a + 1) λ^{2}, & | w | > a λ, \end{cases} \end{align}

(17)

where $a = 3.7$ . The penalty functions $p_{λ_{1 k}} (\cdot)$ and $p_{λ_{2 k}} (\cdot)$ need not be SCAD; other familiar penalties, such as the LASSO or MCP [19], can be used instead. In our work, we employ the SCAD penalty function to evaluate the proposed method.
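For reference, the SCAD penalty in Equation (17) and its derivative on $| w | > 0$ can be written in a few lines of Python. The function names are our own; the derivative follows by differentiating each branch of Equation (17).

```python
def scad(w, lam, a=3.7):
    """SCAD penalty p_lambda(|w|) from Equation (17)."""
    w = abs(w)
    if w <= lam:
        return lam * w
    if w <= a * lam:
        return ((a * a - 1) * lam ** 2 - (w - a * lam) ** 2) / (2 * (a - 1))
    return 0.5 * (a + 1) * lam ** 2

def scad_derivative(w, lam, a=3.7):
    """Derivative of the SCAD penalty with respect to |w| (right derivative at 0)."""
    w = abs(w)
    if w <= lam:
        return lam
    if w <= a * lam:
        return (a * lam - w) / (a - 1)
    return 0.0

for w in (0.5, 1.0, 2.0, 3.7, 10.0):
    print(w, scad(w, lam=1.0), scad_derivative(w, lam=1.0))
```

The penalty grows linearly near zero, like the LASSO, and becomes flat beyond $a λ$, so large coefficients are not shrunk.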

Then the BCDPQIF estimator of $θ = {(β^{T}, γ^{T})}^{T}$ is given by

\begin{align} \hat{θ} = {({\hat{β}}^{T}, {\hat{γ}}^{T})}^{T} = \arg \min_{θ} Q_{p} (θ) . \end{align}

(18)

Furthermore, the BCDPQIF estimators of $α_{k} (t) (k = 1, 2, \dots, q)$ can be obtained by

\begin{align} {\hat{α}}_{k} (t) = {\hat{β}}_{k} + B {(t)}^{T} {\hat{γ}}_{k} . \end{align}

(19)

Remark 1

Leveraging the SCAD penalty function's properties, the BCDPQIF method can identify, estimate and select the coefficient functions simultaneously. In Equation (16), the first penalty term, $n \sum_{k = 1}^{q} p_{λ_{1 k}} (‖ γ_{k} ‖_{H})$ , determines whether the functional component $η_{k} (\cdot)$ of $α_{k} (\cdot)$ is zero or nonzero, thereby distinguishing varying coefficients from constant ones. For coefficients identified as constant, the second penalty term, $n \sum_{k = 1}^{q} p_{λ_{2 k}} (| β_{k} |) I (‖ γ_{k} ‖_{H} = 0),$ further evaluates whether they are zeros or not and selects the nonzero constant coefficients. As a result, the BCDPQIF method is more versatile because it does not require pre‐specifying whether a coefficient is constant or varying. Instead, it can identify both varying and constant coefficients, and simultaneously estimate and select the varying coefficients and the nonzero constant coefficients. This generality makes it applicable to a wide range of models, including the linear EV models, VCEV models and PLVCEV models.
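The decision rule described in Remark 1 can be summarized by a small Python helper that labels each coefficient from its fitted $({\hat{β}}_{k}, {\hat{γ}}_{k})$. The function names, the numerical tolerance `tol`, and the identity matrix used for $H$ in the example are assumptions for illustration.

```python
def h_norm(gamma_k, H):
    """Compute ||gamma_k||_H = (gamma_k^T H gamma_k)^(1/2)."""
    size = len(gamma_k)
    quad = sum(gamma_k[a] * H[a][b] * gamma_k[b]
               for a in range(size) for b in range(size))
    return max(quad, 0.0) ** 0.5

def classify(beta, gamma, H, tol=1e-8):
    """Label each coefficient as 'varying', 'constant' (nonzero), or 'zero'."""
    labels = []
    for beta_k, gamma_k in zip(beta, gamma):
        if h_norm(gamma_k, H) > tol:
            labels.append("varying")      # functional part survived the first penalty
        elif abs(beta_k) > tol:
            labels.append("constant")     # constant part survived the second penalty
        else:
            labels.append("zero")
    return labels

H = [[1.0, 0.0], [0.0, 1.0]]              # placeholder for the Gram matrix of the basis
beta_hat = [1.0, 0.5, 0.0]
gamma_hat = [[0.3, -0.3], [0.0, 0.0], [0.0, 0.0]]
print(classify(beta_hat, gamma_hat, H))
```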

2.2. Asymptotic Properties

First, we give some necessary notations. Let $α_{0} (\cdot) = {(α_{01} (\cdot), α_{02} (\cdot), \dots, α_{0 q} (\cdot))}^{T}$ be the real coefficients in model (2). Correspondingly, the real $β_{k}$ and $η_{k} (t)$ are denoted as $β_{0 k}$ and $η_{0 k} (t)$ for $k = 1, 2, \dots, q$ , where $α_{0 k} (t) = β_{0 k} + η_{0 k} (t)$ . Let $γ_{0 k}$ be the B‐spline regression coefficient corresponding to $η_{0 k} (t)$ . Denote $β_{0} = {(β_{01}, β_{02}, \dots, β_{0 q})}^{T}$ , $γ_{0} = {(γ_{01}^{T}, γ_{02}^{T}, \dots, γ_{0 q}^{T})}^{T}$ . Without loss of generality, it is assumed that $α_{0 k} (\cdot) (k = 1, 2, \dots, c)$ are nonzero constant coefficients, $α_{0 k} (\cdot) (k = c + 1, c + 2, \dots, v)$ are varying coefficients, $α_{0 k} (\cdot) = 0$ for $k = v + 1, v + 2, \dots, q$ .

Denote $𝒞 = \{1, 2, \dots, c\}, 𝒱 = \{c + 1, c + 2, \dots, v\}, 𝒵 = \{v + 1, v + 2, \dots, q\}$ . $k \in 𝒞$ implies that $α_{k} (\cdot)$ is a nonzero constant, $k \in 𝒱$ implies that $α_{k} (\cdot)$ is a varying coefficient, and $k \in 𝒵$ implies $α_{k} (\cdot) = 0$ . Using the BCDPQIF method, $α_{k} (\cdot) (k = 1, 2, \dots, q)$ can be classified into three categories, that is, varying coefficients, nonzero constant coefficients and zero coefficients. Denote $α^{(𝒞)} (\cdot) = {(α_{1} (\cdot), α_{2} (\cdot), \dots, α_{c} (\cdot))}^{T}$ and $α^{(𝒱)} (\cdot) = {(α_{c + 1} (\cdot), α_{c + 2} (\cdot), \dots, α_{v} (\cdot))}^{T}$ . The corresponding real functions of $α^{(𝒞)} (\cdot)$ and $α^{(𝒱)} (\cdot)$ are denoted as $α_{0}^{(𝒞)} (\cdot)$ and $α_{0}^{(𝒱)} (\cdot)$ , respectively.

Some necessary regularity conditions for the asymptotic properties are stated as follows.

C1:
$0 < n_{i} < \infty$ for $i = 1, 2, \dots, n$ .
C2:
$α_{k} (t) (k = 1, 2, \dots, q)$ are $r$ th continuously differentiable on $(0, 1)$ , and $r \geq 2$ .
C3:
$\exists$ unique $θ_{0} \in Θ$ satisfies $E ({\hat{\overline{g}}}_{n} (θ_{0})) = o (1)$ , where $Θ$ is the parameter space.
C4:
There exists an invertible matrix $Ω_{0}$ s.t. $Ω_{n} \overset{a.s.}{\to} Ω_{0}$ .
C5:
$\sup ‖V_{i}‖ < \infty$ and there exists $δ > 0$ s.t. $\sup E {‖ ε_{i} ‖^{2 + δ}} < \infty$ , $E {‖u_{i}‖}^{8} < \infty$ , $E {‖ũ_{i}‖}^{8} < \infty$ , where $‖ \cdot ‖$ is the modulus of the largest singular value.
C6:
$A_{i} \geq 0$ , $\sup_{i} ‖A_{i}‖ < \infty$ .
C7:
$E {‖X_{i}‖}^{4} < \infty$ , $E ‖ {\tilde{X}}_{i} ‖^{4} < \infty$ , $i = 1, 2, \dots, n$ .
C8:
Let the interior knots $\{ω_{i}, i = 1, 2, \dots, K\}$ of $B (t)$ satisfy $\max_{1 \leq i \leq K} |Δ ω_{i + 1} - Δ ω_{i}| = o (K^{- 1})$ and $Δ ω_{\max} / Δ ω_{\min} \leq c$ , where $c \geq 0$ is a constant, $Δ ω_{\max} = \max_{1 \leq i \leq K} Δ ω_{i}, Δ ω_{\min} = \min_{1 \leq i \leq K} Δ ω_{i}$ , $Δ ω_{i} = ω_{i} - ω_{i - 1}$ , $ω_{0} = 0$ , $ω_{K + 1} = 1$ .
C9:
${\dot{\hat{\overline{g}}}}_{n} (θ) = \frac{\partial {\hat{\overline{g}}}_{n} (θ)}{\partial θ}$ exists and is continuous, and from the weak law of large numbers, when $\hat{θ} \overset{p}{\to} θ_{0}$ , exists $J_{0}$ s.t.
$\begin{align} \lim_{n \to \infty} \frac{1}{n} \sum_{i = 1}^{n} E (\begin{array}{c} {(X_{i}, {\tilde{X}}_{i})}^{T} A_{i}^{- 1 / 2} M_{1} A_{i}^{- 1 / 2} (X_{i}, {\tilde{X}}_{i}) \\ {(X_{i}, {\tilde{X}}_{i})}^{T} A_{i}^{- 1 / 2} M_{2} A_{i}^{- 1 / 2} (X_{i}, {\tilde{X}}_{i}) \\ ⋮ \\ {(X_{i}, {\tilde{X}}_{i})}^{T} A_{i}^{- 1 / 2} M_{s} A_{i}^{- 1 / 2} (X_{i}, {\tilde{X}}_{i}) \end{array}) \equiv J_{0} . \end{align}$
C10:
Denote $a_{n} = \max_{k} \{| p_{λ_{1 k}}^{″} ({‖γ_{0 k}‖}_{H}) |, | p_{λ_{2 k}}^{″} (|β_{0 k}|) |, β_{0 k} \neq 0, γ_{0 k} \neq 0\}$ , then $a_{n} \to 0$ as $n \to \infty$ .
C11:
$p_{λ} (t)$ satisfies
$\begin{align} \underset{n \to \infty}{\lim \inf} \underset{{‖γ_{k}‖}_{H} \to 0}{\lim \inf} λ_{1 k}^{- 1} p_{λ_{1 k}}^{'} ({‖γ_{k}‖}_{H}) > 0, \\ \underset{n \to \infty}{\lim \inf} \underset{β_{k} \to 0^{+}}{\lim \inf} λ_{2 k}^{- 1} p_{λ_{2 k}}^{'} (|β_{k}|) > 0, \end{align}$
where $k = v + 1, v + 2, \dots, q$ .

Remark 2

These conditions are often used in the literature on nonparametric and semi‐parametric statistical inference. C1 implies $N = \sum_{i = 1}^{n} n_{i} = O (n)$ . C2 is the smoothness condition on $α_{k} (t) (k = 1, 2, \dots, q)$ and is necessary for studying the convergence rate of the B‐spline estimator. C4 and C9 can be easily obtained by the weak law of large numbers when $n \to \infty$ . C3, C5‐C7, C9 can be seen in Tian, Xue and Liu [28]. C8 is a standard requirement on the knots of B‐spline approximations (Schumaker [29]). C10 and C11 can be seen in Tian, Xue and Liu [28]; Zhao and Xue [30] and Fan and Li [18].

Theorem 1

Assuming the conditions $C 1 \sim C 11$ hold and $K = O (N^{1 / (2 r + 1)})$ , we have

$\begin{align} ‖{\hat{α}}_{k} (\cdot) - α_{k} (\cdot)‖ = O (N^{- r / (2 r + 1)}), k = 1, 2, \dots, q . \end{align}$

Theorem 2

Assuming the conditions $C 1 \sim C 10$ hold, let $λ_{\max} = \max_{k} \{λ_{1 k}, λ_{2 k}\}$ and $λ_{\min} = \min_{k} \{λ_{1 k}, λ_{2 k}\}$ , satisfy $λ_{\max} \to 0$ and $n^{r / (2 r + 1)} \cdot λ_{\min} \to \infty$ . Then with probability tending to 1, we have

i.
${\hat{α}}_{k} (\cdot) = {\hat{β}}_{k} \neq 0, k \in 𝒞$ ;

ii.
${\hat{α}}_{k} (\cdot) = 0, k \in 𝒵$ .

Theorem 3

Assuming the conditions $C 1 \sim C 10$ hold, and $K = O (N^{1 / (2 r + 1)})$ , we have

$\sqrt{n} ({\hat{α}}^{(𝒞)} - α_{0}^{(𝒞)}) \overset{ℒ}{\to} N (0, A^{- 1} B A^{- 1}),$

where $A$ and $B$ are defined in the proof of Theorem 3 in Appendix A, and “ $\overset{ℒ}{\to}$ ” denotes convergence in distribution.

Theorem 1 shows that the BCDPQIF estimators of the varying coefficients attain the optimal convergence rate, while Theorem 2 states that the BCDPQIF estimators have the sparsity property: with probability tending to 1, nonzero constant coefficients are identified as constants and zero coefficients are estimated as zero. Together, Theorems 1, 2 and 3 show that the BCDPQIF method possesses the oracle property.

3. Computational Algorithm and Selection of Tuning Parameters

3.1. Computational Algorithm

According to Equation (18), $\hat{θ}$ does not have an explicit form, meaning that we can only obtain a numerical approximation of $\hat{θ}$ . First, observe that the first two derivatives of $Q_{n} (θ)$ are continuous. Therefore, around a given point $θ^{(0)}$ , $Q_{n} (θ)$ can be approximated as

\begin{align} Q_{n} (θ) & \approx Q_{n} (θ^{(0)}) + {\dot{Q}}_{n} {(θ^{(0)})}^{T} (θ - θ^{(0)}) + \frac{1}{2} {(θ - θ^{(0)})}^{T} {\ddot{Q}}_{n} (θ^{(0)}) (θ - θ^{(0)}), \end{align}

where ${\dot{Q}}_{n} (\cdot)$ and ${\ddot{Q}}_{n} (\cdot)$ represent the first and second derivatives of $Q_{n} (\cdot)$ w.r.t. $θ$ , respectively. According to Qu, Lindsay and Li [26], we get

\begin{align} n^{- 1} {\dot{Q}}_{n} (\cdot) & = 2 {\dot{\hat{\overline{g}}}}_{n}^{T} Ω_{n}^{- 1} {\hat{\overline{g}}}_{n} - {\hat{\overline{g}}}_{n}^{T} Ω_{n}^{- 1} {\dot{Ω}}_{n} Ω_{n}^{- 1} {\hat{\overline{g}}}_{n}, \\ n^{- 1} {\ddot{Q}}_{n} (\cdot) & = 2 {\dot{\hat{\overline{g}}}}_{n}^{T} Ω_{n}^{- 1} {\dot{\hat{\overline{g}}}}_{n} + R_{n}, \end{align}

where

\begin{align} R_{n} & = 2 {\ddot{\hat{\overline{g}}}}_{n}^{T} Ω_{n}^{- 1} {\hat{\overline{g}}}_{n} - 4 {\dot{\hat{\overline{g}}}}_{n}^{T} Ω_{n}^{- 1} {\dot{Ω}}_{n} Ω_{n}^{- 1} {\hat{\overline{g}}}_{n} \\ & \quad + 2 {\hat{\overline{g}}}_{n}^{T} Ω_{n}^{- 1} {\dot{Ω}}_{n} Ω_{n}^{- 1} {\dot{Ω}}_{n} Ω_{n}^{- 1} {\hat{\overline{g}}}_{n} - {\hat{\overline{g}}}_{n}^{T} Ω_{n}^{- 1} {\ddot{Ω}}_{n} Ω_{n}^{- 1} {\hat{\overline{g}}}_{n} . \end{align}

Likewise, given an initial value $t_{0}$ , we have

\begin{align} p_{λ} (| t |) \approx p_{λ} (| t_{0} |) + \frac{1}{2} \frac{p_{λ}^{'} (| t_{0} |)}{| t_{0} |} (t^{2} - t_{0}^{2}), t \approx t_{0} . \end{align}
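A quick numerical check of this local quadratic approximation, written as a Python sketch with our own function names and with the SCAD penalty as $p_{λ}$, shows that the approximation agrees with the penalty at $t_{0}$ and stays close to it nearby. The values $t_{0} = 0.8$ and $λ = 1$ are arbitrary illustrative choices.

```python
def scad(w, lam, a=3.7):
    """SCAD penalty p_lambda(|w|)."""
    w = abs(w)
    if w <= lam:
        return lam * w
    if w <= a * lam:
        return ((a * a - 1) * lam ** 2 - (w - a * lam) ** 2) / (2 * (a - 1))
    return 0.5 * (a + 1) * lam ** 2

def scad_derivative(w, lam, a=3.7):
    """Derivative of the SCAD penalty with respect to |w|."""
    w = abs(w)
    if w <= lam:
        return lam
    if w <= a * lam:
        return (a * lam - w) / (a - 1)
    return 0.0

def lqa(t, t0, lam):
    """Local quadratic approximation of p_lambda(|t|) around a nonzero t0."""
    return scad(t0, lam) + 0.5 * scad_derivative(t0, lam) / abs(t0) * (t * t - t0 * t0)

t0, lam = 0.8, 1.0
for t in (0.7, 0.8, 0.9):
    print(t, scad(t, lam), lqa(t, t0, lam))
```

The approximation is quadratic in $t$, which is what allows the penalized criterion to be minimized with Newton‐type updates; it requires $t_{0} \neq 0$, so implementations usually set very small coefficients to zero.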

Therefore, apart from a constant, $Q_{p} (θ)$ can be represented by

\begin{align} Q_{p} (θ) & \approx Q_{n} (θ^{(0)}) + {\dot{Q}}_{n} {(θ^{(0)})}^{T} (θ - θ^{(0)}) \\ & \quad + \frac{1}{2} {(θ - θ^{(0)})}^{T} {\ddot{Q}}_{n} (θ^{(0)}) (θ - θ^{(0)}) + \frac{n}{2} θ^{T} \Sigma_{λ} (θ^{(0)}) θ, \end{align}

(20)

where

\begin{align} \Sigma_{λ} (θ^{(0)}) & = (\begin{matrix} \Sigma_{λ} (β^{(0)}) & 0 \\ 0 & \Sigma_{λ} (γ^{(0)}) \otimes H \end{matrix}), \end{align}

\begin{align} \Sigma_{λ} (γ^{(0)}) & = diag \{\frac{p_{λ_{11}}^{'} (‖ γ_{1}^{(0)} ‖_{H})}{‖ γ_{1}^{(0)} ‖_{H}}, \frac{p_{λ_{12}}^{'} (‖ γ_{2}^{(0)} ‖_{H})}{‖ γ_{2}^{(0)} ‖_{H}}, \dots, \frac{p_{λ_{1 q}}^{'} (‖ γ_{q}^{(0)} ‖_{H})}{‖ γ_{q}^{(0)} ‖_{H}}\}, \end{align}

\begin{align} \Sigma_{λ} (β^{(0)}) & = diag \{\frac{p_{λ_{21}}^{'} (|β_{1}^{(0)}|)}{|β_{1}^{(0)}|} I (‖ γ_{1}^{(0)} ‖_{H} = 0), \frac{p_{λ_{22}}^{'} (|β_{2}^{(0)}|)}{|β_{2}^{(0)}|} I (‖ γ_{2}^{(0)} ‖_{H} = 0), \dots, \\ & \quad \frac{p_{λ_{2 q}}^{'} (|β_{q}^{(0)}|)}{|β_{q}^{(0)}|} I (‖ γ_{q}^{(0)} ‖_{H} = 0)\} . \end{align}

Because the minimizer has no closed form, we propose a three‐step iterative computational algorithm as follows.

Step 1. This step first identifies varying and constant coefficients for each $α_{k} (\cdot)$ for $k = 1, 2, \dots, q$ . It provides an initial partition of the coefficient space and reduces model complexity. Specifically, we define the Step 1 estimator ${\hat{θ}}_{(1)}$ as

\begin{align} {\hat{θ}}_{(1)} = \arg \min_{θ} \{Q_{n} (θ) + n \sum_{k = 1}^{q} p_{λ_{1 k}} (‖ γ_{k} ‖_{H})\} . \end{align}

(21)

Through the minimization procedure (21), nonzero values of $‖ γ_{k} ‖_{H}$ are identified and selected; when $‖ γ_{k} ‖_{H} > 0$ , the coefficient function $α_{k} (\cdot)$ is identified as varying, whereas $‖ γ_{k} ‖_{H} = 0$ indicates it is constant.

Since ${\hat{θ}}_{(1)}$ lacks a closed‐form expression, we approximate it via the following iterative procedure

\begin{align} {\hat{θ}}_{(1)}^{(m + 1)} & = {\hat{θ}}_{(1)}^{(m)} - {[{\ddot{Q}}_{n} ({\hat{θ}}_{(1)}^{(m)}) + n \Sigma_{(1)}]}^{- 1} [{\dot{Q}}_{n} ({\hat{θ}}_{(1)}^{(m)}) + n \Sigma_{(1)} {\hat{θ}}_{(1)}^{(m)}], \end{align}

(22)

where $\Sigma_{(1)} = (\begin{matrix} \Sigma_{0} & 0 \\ 0 & \Sigma_{λ} ({\hat{γ}}^{(m)}) \otimes H \end{matrix})$ , $\Sigma_{0} = diag \{0_{1}, 0_{2}, \dots, 0_{q}\}$ and

\begin{align} \Sigma_{λ} ({\hat{γ}}^{(m)}) & = diag \{\frac{p_{λ_{11}}^{'} (‖ {\hat{γ}}_{1}^{(m)} ‖_{H})}{‖ {\hat{γ}}_{1}^{(m)} ‖_{H}}, \frac{p_{λ_{12}}^{'} (‖ {\hat{γ}}_{2}^{(m)} ‖_{H})}{‖ {\hat{γ}}_{2}^{(m)} ‖_{H}}, \dots, \frac{p_{λ_{1 q}}^{'} (‖ {\hat{γ}}_{q}^{(m)} ‖_{H})}{‖ {\hat{γ}}_{q}^{(m)} ‖_{H}}\} . \end{align}

(23)

We initialize this iteration with $\tilde{θ}$ from Equation (15) and iterate Equation (22) until convergence to obtain the approximate solution ${\hat{θ}}_{(1)}$ . This process effectively identifies whether each $α_{k} (\cdot)$ is a varying or constant coefficient before proceeding to variable selection.
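The mechanics of the update in Equation (22) can be seen in a scalar toy problem. The Python sketch below replaces $Q_{n} (θ)$ by the stand‐in $n {(θ - z)}^{2}$, where $z$ plays the role of an unpenalized estimate, and applies the same penalized Newton step with the local quadratic weight $p_{λ}^{'} (| θ |) / | θ |$. The quadratic stand‐in, the threshold `tol` below which an iterate is set to zero, and all names are assumptions made for illustration; this is not the article's full algorithm.

```python
def scad_derivative(w, lam, a=3.7):
    """Derivative of the SCAD penalty with respect to |w|."""
    w = abs(w)
    if w <= lam:
        return lam
    if w <= a * lam:
        return (a * lam - w) / (a - 1)
    return 0.0

def penalized_newton(z, lam, n=100, tol=1e-10, max_iter=200):
    """Minimize n * (theta - z)^2 + n * p_lambda(|theta|) by penalized Newton steps."""
    theta = z                                  # start from the unpenalized estimate
    for _ in range(max_iter):
        if abs(theta) < tol:
            return 0.0                         # treat tiny iterates as exact zeros
        weight = scad_derivative(theta, lam) / abs(theta)   # local quadratic weight
        grad = 2.0 * n * (theta - z)           # first derivative of the toy Q_n
        hess = 2.0 * n                         # second derivative of the toy Q_n
        new_theta = theta - (grad + n * weight * theta) / (hess + n * weight)
        if abs(new_theta - theta) < tol:
            return new_theta if abs(new_theta) >= tol else 0.0
        theta = new_theta
    return theta

print(penalized_newton(z=0.1, lam=0.5))   # small signal is shrunk to zero
print(penalized_newton(z=3.0, lam=0.5))   # large signal is left unshrunk
```

In the full method the same step is applied to the vector $θ$, with the block of $\Sigma_{(1)}$ corresponding to $β$ set to zero so that only the spline coefficients $γ_{k}$ are penalized in Step 1.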

Step 2. Next, taking ${\hat{θ}}_{(1)}$ as an initial value, we refine the constant coefficients by selecting nonzero values. Specifically, we define

\begin{align} {\hat{θ}}_{(2)} = \arg \min_{θ} \{Q_{n} ({\hat{θ}}_{(1)}) + n \sum_{k = 1}^{q} p_{λ_{2 k}} (| β_{k} |)\} . \end{align}

(24)

Similar to Step 1, ${\hat{θ}}_{(2)}$ also has no closed‐form solution, so we propose an iterative process in Step 2 as

\begin{align} {\hat{θ}}_{(2)}^{(m + 1)} & = {\hat{θ}}_{(2)}^{(m)} - {[{\ddot{Q}}_{n} ({\hat{θ}}_{(2)}^{(m)}) + n \sum_{(2)}]}^{- 1} [{\dot{Q}}_{n} ({\hat{θ}}_{(2)}^{(m)}) + n \sum_{(2)} {\hat{θ}}_{(2)}^{(m)}], \end{align}

(25)

where $\sum_{(2)} = (\begin{array}{cc} \sum_{λ} ({\hat{β}}^{(m)}) & 0 \\ 0 & \sum_{0} \otimes H \end{array})$ and

\begin{align} \sum_{λ} ({\hat{β}}^{(m)}) & = diag \{\frac{p_{λ_{21}}^{'} (| {\hat{β}}_{1}^{(m)} |)}{| {\hat{β}}_{1}^{(m)} |} I (‖ {\hat{γ}}_{1}^{(m)} ‖_{H} = 0), \dots, \\ \frac{p_{λ_{2 q}}^{'} (| {\hat{β}}_{q}^{(m)} |)}{| {\hat{β}}_{q}^{(m)} |} I (‖ {\hat{γ}}_{q}^{(m)} ‖_{H} = 0)\} . \end{align}

Iterate Equation (25) until convergence to approximate ${\hat{θ}}_{(2)}$ . This step focuses on eliminating uninformative constant coefficients while retaining those that are pertinent.

Step 3. Finally, we alternate between Steps 1 and 2 until overall convergence, arriving at the final estimator $\hat{θ}$ . By iteratively identifying constant versus varying coefficients and selecting only the nonzero constants, this procedure yields a parsimonious yet flexible model and performs structure identification, estimation and variable selection without requiring structural assumptions in advance.

3.2. Selection of Tuning Parameters

The tuning parameters $\{λ_{1 k}, λ_{2 k}\}_{k = 1}^{q}$ are of vital importance for variable selection: they control the amount of penalization and determine the outcomes of structure identification and variable selection. However, selecting all $2 q$ tuning parameters separately is computationally very expensive. To overcome this problem, we define the adaptive tuning parameters $λ_{1 k}$ and $λ_{2 k}$ as

\begin{align} λ_{1 k} = \frac{λ_{1}}{‖ {\tilde{γ}}_{k} ‖_{H}}, λ_{2 k} = \frac{λ_{2}}{| {\hat{β}}_{k} |}, \end{align}

where ${\tilde{γ}}_{k}$ , for $k = 1, 2, \dots, q$ , are defined by Equation (15), and ${\hat{β}}_{k}$ corresponds to the solution ${\hat{θ}}_{(1)}$ obtained in Step 1. The adaptive tuning parameters thus reduce the search to the two scalars $λ_{1}$ and $λ_{2}$ , which significantly reduces the computational complexity.

To obtain the optimal tuning parameters in Steps 1 and 2, we employ a BIC‐type criterion separately for each step, thereby balancing model fit and complexity. Specifically, in Step 1, we determine the optimal ${\hat{λ}}_{1}$ via

\begin{align} BIC_{1} (λ) = Q_{n} ({\hat{θ}}_{(1)}) + d f_{λ_{1 k}} \log (n), \quad {\hat{λ}}_{1} = \arg \min_{λ} BIC_{1} (λ), \end{align}

(26)

where $d f_{λ_{1 k}} = \sum_{k = 1}^{q} I (‖ γ_{k} ‖_{H} \neq 0)$ . This quantity $d f_{λ_{1 k}}$ counts the number of nonzero varying coefficients and thus penalizes more complex models. Similarly, in Step 2 we use an analogous BIC‐type criterion to obtain the optimal ${\hat{λ}}_{2}$ :

\begin{align} BIC_{2} (λ) = Q_{n} ({\hat{θ}}_{(2)}) + d f_{λ_{2 k}} \log (n), \quad {\hat{λ}}_{2} = \arg \min_{λ} BIC_{2} (λ), \end{align}

(27)

where $d f_{λ_{2 k}} = \sum_{k = 1}^{q} I (| β_{k} | I (‖ γ_{k} ‖_{H} = 0) \neq 0) .$ Hence, $d f_{λ_{2 k}}$ counts only those nonzero constant coefficients that contribute to the model when the corresponding varying components are zero. By explicitly penalizing the inclusion of additional parameters, the BIC‐type criteria help select tuning parameters that yield a parsimonious yet informative model.
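The two BIC‐type criteria amount to a one‐dimensional grid search in each step. The following Python sketch illustrates the idea under stated assumptions: `fit` is a hypothetical callable that returns the value of $Q_{n}$ and the degrees of freedom for a given tuning parameter, the grid of candidate values is chosen by the user, and the tolerance `tol` used to decide whether a norm is zero is not part of the original text.

```python
import numpy as np


def h_norms(gamma_blocks, H):
    """Row-wise norms ||gamma_k||_H = sqrt(gamma_k^T H gamma_k)."""
    return np.sqrt(np.einsum("kl,lm,km->k", gamma_blocks, H, gamma_blocks))


def df_step1(gamma_blocks, H, tol=1e-6):
    """Degrees of freedom in Eq. (26): number of k with ||gamma_k||_H != 0."""
    return int(np.sum(h_norms(gamma_blocks, H) > tol))


def df_step2(beta, gamma_blocks, H, tol=1e-6):
    """Degrees of freedom in Eq. (27): nonzero beta_k whose varying part is zero."""
    is_constant = h_norms(gamma_blocks, H) <= tol
    return int(np.sum((np.abs(beta) > tol) & is_constant))


def select_by_bic(lambda_grid, fit, n):
    """Return (best_lambda, best_bic) with BIC = Q_n + df * log(n).

    `fit(lam)` is a hypothetical callable returning (Q_n value, df) for the
    penalized estimator computed at the tuning parameter lam.
    """
    best_lam, best_bic = None, np.inf
    for lam in lambda_grid:
        q_value, df = fit(lam)
        bic = q_value + df * np.log(n)
        if bic < best_bic:
            best_lam, best_bic = lam, bic
    return best_lam, best_bic
```

The same `select_by_bic` routine can be called once with a Step 1 fit (using `df_step1`) and once with a Step 2 fit (using `df_step2`), mirroring the separate selection of ${\hat{λ}}_{1}$ and ${\hat{λ}}_{2}$ .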

4. Numerical Studies

4.1. Simulations Studies

We perform some numerical simulations to evaluate the performance of the proposed method in finite samples. The performance of estimator ${\hat{α}}^{(𝒞)}$ in the simulation will be assessed by using the generalized mean square error (GMSE) [28], which is defined as

\begin{align} GMSE & = {({\hat{α}}^{(𝒞)} - α_{0}^{(𝒞)})}^{T} E (X^{(𝒞)} {(X^{(𝒞)})}^{T}) ({\hat{α}}^{(𝒞)} - α_{0}^{(𝒞)}), \end{align}

(28)

where ${\hat{α}}^{(𝒞)} = {({\hat{β}}_{1}, {\hat{β}}_{2}, \dots, {\hat{β}}_{c})}^{T}$ . The performance of estimator ${\hat{α}}^{(𝒱)}$ in the simulation will be assessed by using the square root of average square errors (RASE) [28]

\begin{align} RASE & = {\{\frac{1}{M} \sum_{m = 1}^{M} \sum_{k = c + 1}^{v} {[{\hat{α}}_{k} (t_{m}) - α_{0 k} (t_{m})]}^{2}\}}^{1 / 2} . \end{align}

(29)

where $t_{m} (m = 1, 2, \dots, M)$ are grid points at which ${\hat{α}}_{k} (t_{m})$ is evaluated. A smaller RASE or GMSE signifies higher estimation accuracy, indicating that $\hat{α} (\cdot)$ is closer to the true value $α (\cdot)$ . In our simulations, grid points were equally spaced on $[0, 1]$ and $M = 200$ .
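As a concrete illustration of Equations (28) and (29), the two accuracy measures can be computed as in the Python sketch below. The function names and array layouts are assumptions made for illustration: `second_moment` stands for $E (X^{(𝒞)} {(X^{(𝒞)})}^{T})$ , and the grid arrays have one row per grid point and one column per varying coefficient.

```python
import numpy as np


def gmse(alpha_hat_const, alpha_true_const, second_moment):
    """GMSE of Eq. (28): (a_hat - a_0)^T E(X X^T) (a_hat - a_0)."""
    d = np.asarray(alpha_hat_const, float) - np.asarray(alpha_true_const, float)
    return float(d @ second_moment @ d)


def rase(alpha_hat_grid, alpha_true_grid):
    """RASE of Eq. (29) for arrays of shape (M, number of varying coefficients)."""
    diff = np.asarray(alpha_hat_grid, float) - np.asarray(alpha_true_grid, float)
    return float(np.sqrt(np.sum(diff ** 2) / diff.shape[0]))


# M = 200 equally spaced grid points on [0, 1], as used in the simulations
grid = np.linspace(0.0, 1.0, 200)
```

For example, an estimate that is off by a constant 0.1 on each of three varying coefficients gives a RASE of $\sqrt{3 \times 0.01} \approx 0.173$ , regardless of $M$ .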

To assess the performance of structure identification, estimation and variable selection, we introduce the following notation. Let “CZ” denote the average number of correctly identified zero coefficients; “IZ” the average number of nonzero coefficients incorrectly identified as zero; “CV” the average number of correctly identified varying coefficients; “IV” the average number of non‐varying coefficients incorrectly identified as varying; “CC” the average number of correctly identified constant coefficients; and “IC” the average number of non‐constant coefficients incorrectly identified as constant. “CF” represents the percentage of simulations in which the true model structure was correctly identified. Smaller values of IZ, IV, and IC, along with values of CZ, CV, and CC closer to those of the true model, indicate better performance in structure identification and selection.
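The counts defined above can be tabulated from the identified structures across replications, as in the following Python sketch. It assumes, purely for illustration, that each coefficient is labeled "Z" (zero), "C" (nonzero constant) or "V" (varying); the function name `selection_summary` is hypothetical.

```python
import numpy as np


def selection_summary(true_labels, predicted_labels):
    """Average CZ, IZ, CV, IV, CC, IC and the proportion CF over replications.

    true_labels: length-q sequence of "Z", "C" or "V".
    predicted_labels: array-like of shape (replications, q) with the same labels.
    """
    truth = np.asarray(true_labels)
    pred = np.asarray(predicted_labels)
    summary = {}
    for label in ("Z", "V", "C"):
        correct = (pred == label) & (truth == label)
        incorrect = (pred == label) & (truth != label)
        summary["C" + label] = float(correct.sum(axis=1).mean())
        summary["I" + label] = float(incorrect.sum(axis=1).mean())
    # CF: share of replications in which every coefficient is labeled correctly
    summary["CF"] = float(np.all(pred == truth, axis=1).mean())
    return summary
```

Under this labeling, a zero coefficient that is retained as a nonzero constant lowers CZ and raises IC by the same amount in that replication.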

Suppose that the real model (2) satisfies $𝒞 = \{1, 2\}, 𝒱 = \{3, 4, 5\}, 𝒵 = \{6, 7, 8\}$ and

\begin{align} α^{(𝒞)} (t) & = (α_{1} (t), α_{2} (t)) = (5, 6), \\ α^{(𝒱)} (t) & = (α_{3} (t), α_{4} (t), α_{5} (t)) = (0.4 \cdot e^{2 t - 1}, \sin (π t), 0.1 \cdot {(2 - 2 t)}^{3}), \\ α^{(𝒵)} (t) & = (α_{6} (t), α_{7} (t), α_{8} (t)) = (0, 0, 0) . \end{align}

We took $X_{i j} \sim N (8, σ_{X}^{2} I_{8})$ and $u_{i j} \sim N (0, σ_{u}^{2} I_{8})$ , where $j = 1, 2, \dots, 8$ , $σ_{X} = 8$ and $I_{8}$ is the $8 \times 8$ identity matrix. We set $σ_{u}$ to $0.2, 0.4$ and $0.6$ , and $t_{i j} \sim U [0, 1]$ . $ε_{i} = {(ε_{i 1}, ε_{i 2}, \dots, ε_{i n_{i}})}^{T} \sim N (0, σ^{2} Corr (ε_{i}, ρ))$ , where $σ^{2} = 1$ and $Corr (ε_{i}, ρ)$ is a known correlation matrix with parameter $ρ$ . Thus, we can get $A_{i} = diag (1, 1, \dots, 1)$ . In our work, we set $n = 300, 350, 400$ , $n_{i} = 10, 20$ and $ε_{i}$ has the first‐order autoregressive $(AR (1))$ and exchangeable (EX) correlation structures with $ρ = 0.3, 0.7$ . The cubic B‐spline basis was applied with the knots being equally spaced in $[0, 1]$ and $K = ⌊c_{0} \times N^{1 / 5}⌋$ , where $⌊ a ⌋$ denotes the largest integer not greater than $a$ . Following Tian, Xue and Liu [28], we choose $c_{0} = 0.4$ .
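One replication of this simulation design can be generated as in the Python sketch below. It is a sketch under assumptions rather than the authors' code: it assumes that model (2) has the form $Y_{i j} = X_{i j}^{T} α (t_{i j}) + ε_{i j}$ with surrogate covariates $W_{i j} = X_{i j} + u_{i j}$ , that every component of $X_{i j}$ has mean 8, and the function names and the random seed are hypothetical.

```python
import numpy as np


def corr_matrix(m, rho, structure):
    """Exchangeable ("EX") or first-order autoregressive ("AR1") correlation matrix."""
    if structure == "EX":
        return (1.0 - rho) * np.eye(m) + rho * np.ones((m, m))
    if structure == "AR1":
        idx = np.arange(m)
        return rho ** np.abs(idx[:, None] - idx[None, :])
    raise ValueError("structure must be 'EX' or 'AR1'")


def true_alpha(t):
    """True coefficient functions alpha_1..alpha_8 evaluated at t, shape (len(t), 8)."""
    t = np.asarray(t, dtype=float)
    zero = np.zeros_like(t)
    return np.column_stack([
        np.full_like(t, 5.0), np.full_like(t, 6.0),              # constant set
        0.4 * np.exp(2 * t - 1), np.sin(np.pi * t),              # varying set
        0.1 * (2 - 2 * t) ** 3,
        zero, zero, zero,                                        # zero set
    ])


def simulate(n, n_i, sigma_u, rho, structure, seed=0, sigma_x=8.0, sigma=1.0):
    """Draw (t, W, Y, X) for n subjects with n_i observations each."""
    rng = np.random.default_rng(seed)
    t = rng.uniform(0.0, 1.0, size=(n, n_i))
    X = rng.normal(8.0, sigma_x, size=(n, n_i, 8))
    U = rng.normal(0.0, sigma_u, size=(n, n_i, 8))
    R = corr_matrix(n_i, rho, structure)
    eps = rng.multivariate_normal(np.zeros(n_i), sigma ** 2 * R, size=n)
    alpha = true_alpha(t.ravel()).reshape(n, n_i, 8)
    Y = np.sum(X * alpha, axis=2) + eps
    return t, X + U, Y, X
```

Only `t`, `W` and `Y` would be passed to an estimator; the error‐free `X` is returned so that the effect of ignoring measurement error can be checked.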

For each simulated longitudinal data set, we compared the BCDPQIF method with the LASSO, MCP and SCAD penalty functions, together with the version that neglects measurement errors and uses the SCAD penalty function; these are denoted as BCDPQIF‐LASSO, BCDPQIF‐MCP, BCDPQIF‐SCAD and BCDPQIF‐nSCAD, respectively. For simplicity, they are abbreviated as “LASSO”, “MCP”, “SCAD” and “nSCAD” in the following tables. ${\hat{λ}}_{1 k}, {\hat{λ}}_{2 k} (k = 1, 2, \dots, q)$ were selected by Equations (26) and (27).

Tables 1 and 3 report the model estimation results under the EX correlation structure for $n_{i} = 10$ and $n_{i} = 20$ , respectively, and Tables 2 and 4 report the corresponding estimation results under the AR(1) correlation structure. Tables 5 and 7 report the structure identification and variable selection results under the EX correlation structure for $n_{i} = 10$ and $n_{i} = 20$ , and Tables 6 and 8 report the corresponding results under the AR(1) correlation structure.

TABLE 1.

Model estimation with the EX correlation structure ( $n_{i} = 10$ ).

| $ρ$ | $σ_{u}$ | Method | GMSE (n = 300) | RASE (n = 300) | GMSE (n = 350) | RASE (n = 350) | GMSE (n = 400) | RASE (n = 400) |
|---|---|---|---|---|---|---|---|---|
| 0.3 | 0.2 | LASSO | 0.006310 | 0.020265 | 0.005318 | 0.018661 | 0.004500 | 0.017346 |
| | | MCP | 0.005170 | 0.020117 | 0.004126 | 0.018650 | 0.003502 | 0.017217 |
| | | SCAD | 0.005350 | 0.020085 | 0.004272 | 0.018533 | 0.003607 | 0.017198 |
| | | nSCAD | 0.007214 | 0.020461 | 0.006025 | 0.018826 | 0.005227 | 0.017122 |
| | 0.4 | LASSO | 0.021447 | 0.034504 | 0.022946 | 0.031387 | 0.022236 | 0.029445 |
| | | MCP | 0.013685 | 0.033768 | 0.013808 | 0.030912 | 0.011439 | 0.028973 |
| | | SCAD | 0.014983 | 0.033763 | 0.014196 | 0.031039 | 0.012539 | 0.028929 |
| | | nSCAD | 0.036712 | 0.035110 | 0.038602 | 0.031825 | 0.034361 | 0.030109 |
| | 0.6 | LASSO | 0.067761 | 0.050695 | 0.061829 | 0.046198 | 0.062955 | 0.043836 |
| | | MCP | 0.033222 | 0.049347 | 0.025128 | 0.044601 | 0.022594 | 0.041510 |
| | | SCAD | 0.034123 | 0.049820 | 0.027653 | 0.044584 | 0.023667 | 0.041760 |
| | | nSCAD | 0.143232 | 0.053116 | 0.120532 | 0.047648 | 0.129252 | 0.044608 |
| 0.7 | 0.2 | LASSO | 0.005144 | 0.018004 | 0.004381 | 0.017552 | 0.003754 | 0.017102 |
| | | MCP | 0.004269 | 0.017916 | 0.003456 | 0.017511 | 0.002752 | 0.017053 |
| | | SCAD | 0.004457 | 0.017971 | 0.003609 | 0.017524 | 0.002794 | 0.017049 |
| | | nSCAD | 0.006139 | 0.017992 | 0.004973 | 0.017827 | 0.004717 | 0.017225 |
| | 0.4 | LASSO | 0.022754 | 0.030121 | 0.024283 | 0.031665 | 0.020305 | 0.028867 |
| | | MCP | 0.014025 | 0.029910 | 0.014666 | 0.031067 | 0.011411 | 0.028234 |
| | | SCAD | 0.015138 | 0.029797 | 0.014507 | 0.030956 | 0.011690 | 0.028112 |
| | | nSCAD | 0.036937 | 0.030407 | 0.037578 | 0.031802 | 0.033961 | 0.028730 |
| | 0.6 | LASSO | 0.079412 | 0.045530 | 0.067183 | 0.046290 | 0.063914 | 0.042990 |
| | | MCP | 0.033737 | 0.044692 | 0.028633 | 0.044774 | 0.025319 | 0.041379 |
| | | SCAD | 0.034388 | 0.044384 | 0.030047 | 0.044629 | 0.025696 | 0.041235 |
| | | nSCAD | 0.144105 | 0.046027 | 0.130574 | 0.047405 | 0.126245 | 0.043912 |


TABLE 3.

Model estimation with the EX correlation structure ( $n_{i} = 20$ ).

| $ρ$ | $σ_{u}$ | Method | GMSE (n = 300) | RASE (n = 300) | GMSE (n = 350) | RASE (n = 350) | GMSE (n = 400) | RASE (n = 400) |
|---|---|---|---|---|---|---|---|---|
| 0.3 | 0.2 | LASSO | 0.002549 | 0.014618 | 0.002126 | 0.012966 | 0.001890 | 0.012367 |
| | | MCP | 0.002203 | 0.014552 | 0.002022 | 0.012872 | 0.001506 | 0.012342 |
| | | SCAD | 0.002390 | 0.014589 | 0.002042 | 0.012890 | 0.001584 | 0.012278 |
| | | nSCAD | 0.004089 | 0.014739 | 0.003274 | 0.013051 | 0.003253 | 0.012508 |
| | 0.4 | LASSO | 0.010233 | 0.024553 | 0.009077 | 0.022487 | 0.007610 | 0.021105 |
| | | MCP | 0.007098 | 0.024214 | 0.006204 | 0.022208 | 0.005311 | 0.020843 |
| | | SCAD | 0.007654 | 0.024516 | 0.006359 | 0.022315 | 0.005350 | 0.020757 |
| | | nSCAD | 0.031571 | 0.025823 | 0.029056 | 0.023655 | 0.027567 | 0.022327 |
| | 0.6 | LASSO | 0.026446 | 0.034791 | 0.025621 | 0.032215 | 0.022216 | 0.030112 |
| | | MCP | 0.015586 | 0.033946 | 0.013372 | 0.031520 | 0.011238 | 0.029165 |
| | | SCAD | 0.016384 | 0.034301 | 0.014309 | 0.031568 | 0.011855 | 0.029246 |
| | | nSCAD | 0.127600 | 0.038364 | 0.124034 | 0.036174 | 0.115740 | 0.033771 |
| 0.7 | 0.2 | LASSO | 0.011945 | 0.021913 | 0.002154 | 0.012708 | 0.001604 | 0.011912 |
| | | MCP | 0.009801 | 0.021855 | 0.001838 | 0.012687 | 0.001377 | 0.011901 |
| | | SCAD | 0.009734 | 0.021733 | 0.001872 | 0.012659 | 0.001403 | 0.011876 |
| | | nSCAD | 0.031744 | 0.022608 | 0.004244 | 0.012763 | 0.003275 | 0.012015 |
| | 0.4 | LASSO | 0.010903 | 0.024006 | 0.008541 | 0.021701 | 0.008329 | 0.020267 |
| | | MCP | 0.008149 | 0.024126 | 0.005660 | 0.021320 | 0.005311 | 0.020045 |
| | | SCAD | 0.008548 | 0.024027 | 0.005942 | 0.021531 | 0.005358 | 0.020022 |
| | | nSCAD | 0.032437 | 0.024605 | 0.030968 | 0.022312 | 0.029802 | 0.021041 |
| | 0.6 | LASSO | 0.022386 | 0.035063 | 0.024778 | 0.031569 | 0.021510 | 0.030168 |
| | | MCP | 0.015425 | 0.034393 | 0.014333 | 0.031084 | 0.010042 | 0.029233 |
| | | SCAD | 0.017002 | 0.034705 | 0.014282 | 0.030983 | 0.010343 | 0.029355 |
| | | nSCAD | 0.114663 | 0.037800 | 0.124042 | 0.034664 | 0.120750 | 0.034191 |


TABLE 2.

Model estimation with the AR(1) correlation structure ( $n_{i} = 10$ ).

| $ρ$ | $σ_{u}$ | Method | GMSE (n = 300) | RASE (n = 300) | GMSE (n = 350) | RASE (n = 350) | GMSE (n = 400) | RASE (n = 400) |
|---|---|---|---|---|---|---|---|---|
| 0.3 | 0.2 | LASSO | 0.005067 | 0.020788 | 0.004916 | 0.018783 | 0.004804 | 0.017505 |
| | | MCP | 0.004491 | 0.020572 | 0.004093 | 0.018768 | 0.003825 | 0.017414 |
| | | SCAD | 0.004634 | 0.020750 | 0.004196 | 0.018769 | 0.003833 | 0.017394 |
| | | nSCAD | 0.005291 | 0.020818 | 0.005076 | 0.018990 | 0.004946 | 0.017501 |
| | 0.4 | LASSO | 0.024966 | 0.034666 | 0.020821 | 0.032083 | 0.020043 | 0.030934 |
| | | MCP | 0.015624 | 0.034262 | 0.012613 | 0.031646 | 0.011056 | 0.030374 |
| | | SCAD | 0.015726 | 0.034356 | 0.013997 | 0.031824 | 0.011561 | 0.030416 |
| | | nSCAD | 0.037888 | 0.035410 | 0.031702 | 0.031750 | 0.030249 | 0.031064 |
| | 0.6 | LASSO | 0.076168 | 0.051224 | 0.071597 | 0.046558 | 0.064740 | 0.043943 |
| | | MCP | 0.032306 | 0.049338 | 0.026233 | 0.044643 | 0.023371 | 0.042344 |
| | | SCAD | 0.032083 | 0.049692 | 0.027803 | 0.044845 | 0.024317 | 0.042389 |
| | | nSCAD | 0.142457 | 0.052199 | 0.133610 | 0.048358 | 0.112066 | 0.045093 |
| 0.7 | 0.2 | LASSO | 0.006348 | 0.020563 | 0.004656 | 0.018560 | 0.003374 | 0.017292 |
| | | MCP | 0.005043 | 0.020318 | 0.003572 | 0.018437 | 0.002682 | 0.017168 |
| | | SCAD | 0.005124 | 0.020370 | 0.003710 | 0.018446 | 0.002878 | 0.017186 |
| | | nSCAD | 0.007161 | 0.020453 | 0.004955 | 0.018410 | 0.003912 | 0.017337 |
| | 0.4 | LASSO | 0.022236 | 0.034752 | 0.021721 | 0.031719 | 0.021380 | 0.030122 |
| | | MCP | 0.015580 | 0.034360 | 0.012842 | 0.031043 | 0.011394 | 0.029633 |
| | | SCAD | 0.017398 | 0.034046 | 0.013235 | 0.030967 | 0.011727 | 0.029443 |
| | | nSCAD | 0.035691 | 0.035031 | 0.032769 | 0.032387 | 0.033609 | 0.030349 |
| | 0.6 | LASSO | 0.062320 | 0.050684 | 0.073482 | 0.045768 | 0.065321 | 0.045198 |
| | | MCP | 0.029349 | 0.049476 | 0.029730 | 0.043799 | 0.020605 | 0.043522 |
| | | SCAD | 0.030288 | 0.050020 | 0.031292 | 0.043488 | 0.020856 | 0.043477 |
| | | nSCAD | 0.133572 | 0.050931 | 0.124449 | 0.046314 | 0.120049 | 0.045432 |


TABLE 4.

Model estimation with the AR(1) correlation structure ( $n_{i} = 20$ ).

| $ρ$ | $σ_{u}$ | Method | GMSE (n = 300) | RASE (n = 300) | GMSE (n = 350) | RASE (n = 350) | GMSE (n = 400) | RASE (n = 400) |
|---|---|---|---|---|---|---|---|---|
| 0.3 | 0.2 | LASSO | 0.003031 | 0.014615 | 0.002251 | 0.013376 | 0.002001 | 0.012555 |
| | | MCP | 0.002757 | 0.014560 | 0.002151 | 0.013237 | 0.001696 | 0.012479 |
| | | SCAD | 0.002769 | 0.014623 | 0.002149 | 0.013293 | 0.001736 | 0.012396 |
| | | nSCAD | 0.004133 | 0.014716 | 0.003351 | 0.013513 | 0.003169 | 0.012532 |
| | 0.4 | LASSO | 0.008171 | 0.023962 | 0.008643 | 0.023011 | 0.007514 | 0.021520 |
| | | MCP | 0.006882 | 0.023678 | 0.006250 | 0.022543 | 0.005167 | 0.021042 |
| | | SCAD | 0.007213 | 0.023850 | 0.006275 | 0.022768 | 0.005229 | 0.021094 |
| | | nSCAD | 0.025199 | 0.025266 | 0.026554 | 0.024029 | 0.025662 | 0.022530 |
| | 0.6 | LASSO | 0.026435 | 0.035735 | 0.024354 | 0.032726 | 0.022199 | 0.029838 |
| | | MCP | 0.017692 | 0.035287 | 0.013702 | 0.032172 | 0.012334 | 0.029107 |
| | | SCAD | 0.018006 | 0.035391 | 0.013737 | 0.031987 | 0.012320 | 0.029036 |
| | | nSCAD | 0.118056 | 0.038565 | 0.112571 | 0.036187 | 0.111787 | 0.033875 |
| 0.7 | 0.2 | LASSO | 0.002589 | 0.014614 | 0.002075 | 0.013225 | 0.001798 | 0.012293 |
| | | MCP | 0.002230 | 0.014582 | 0.001883 | 0.013104 | 0.001592 | 0.012196 |
| | | SCAD | 0.002253 | 0.014590 | 0.001872 | 0.013095 | 0.001635 | 0.012196 |
| | | nSCAD | 0.003952 | 0.014572 | 0.003295 | 0.013263 | 0.003026 | 0.012375 |
| | 0.4 | LASSO | 0.009626 | 0.024461 | 0.008094 | 0.022350 | 0.006898 | 0.020712 |
| | | MCP | 0.007010 | 0.024280 | 0.006497 | 0.022211 | 0.005155 | 0.020495 |
| | | SCAD | 0.007257 | 0.024291 | 0.006661 | 0.022158 | 0.005165 | 0.020371 |
| | | nSCAD | 0.029485 | 0.025269 | 0.024169 | 0.023269 | 0.024452 | 0.021766 |
| | 0.6 | LASSO | 0.026455 | 0.035658 | 0.023275 | 0.032126 | 0.022329 | 0.029903 |
| | | MCP | 0.016085 | 0.034704 | 0.012669 | 0.031518 | 0.011182 | 0.028931 |
| | | SCAD | 0.017075 | 0.034907 | 0.013049 | 0.031266 | 0.010934 | 0.028944 |
| | | nSCAD | 0.117351 | 0.039046 | 0.117707 | 0.035933 | 0.117497 | 0.033745 |


TABLE 5.

Structure identification and variable selection with the EX correlation structure ( $n_{i} = 10$ ).

| $ρ$ | $σ_{u}$ | n | Method | CZ | IZ | CV | IV | CC | IC | CF |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.3 | 0.2 | 300 | LASSO | 2.8400 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1600 | 0.8650 |
| | | | MCP | 2.8100 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1900 | 0.8450 |
| | | | SCAD | 2.8850 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1150 | 0.9100 |
| | | | nSCAD | 2.7850 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2150 | 0.8200 |
| | | 350 | LASSO | 2.8800 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1200 | 0.9000 |
| | | | MCP | 2.8450 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1550 | 0.8750 |
| | | | SCAD | 2.9050 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0950 | 0.9250 |
| | | | nSCAD | 2.8450 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1550 | 0.8700 |
| | | 400 | LASSO | 2.8600 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1400 | 0.8900 |
| | | | MCP | 2.8800 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1200 | 0.8900 |
| | | | SCAD | 2.9300 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0700 | 0.9400 |
| | | | nSCAD | 2.8350 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1650 | 0.8600 |
| | 0.4 | 300 | LASSO | 2.5700 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4300 | 0.6400 |
| | | | MCP | 2.5750 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4250 | 0.6400 |
| | | | SCAD | 2.8400 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1600 | 0.8700 |
| | | | nSCAD | 2.5150 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4850 | 0.5900 |
| | | 350 | LASSO | 2.8000 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2000 | 0.8400 |
| | | | MCP | 2.7450 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2550 | 0.7850 |
| | | | SCAD | 2.8650 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1350 | 0.8950 |
| | | | nSCAD | 2.7550 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2450 | 0.7850 |
| | | 400 | LASSO | 2.8750 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1250 | 0.8950 |
| | | | MCP | 2.8450 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1550 | 0.8550 |
| | | | SCAD | 2.9250 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0750 | 0.9300 |
| | | | nSCAD | 2.7650 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2350 | 0.7950 |
| | 0.6 | 300 | LASSO | 2.4500 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.5500 | 0.5350 |
| | | | MCP | 2.3850 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.6150 | 0.5150 |
| | | | SCAD | 2.7950 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2050 | 0.8250 |
| | | | nSCAD | 2.3150 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.6850 | 0.4750 |
| | | 350 | LASSO | 2.6100 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3900 | 0.6650 |
| | | | MCP | 2.6300 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3700 | 0.6750 |
| | | | SCAD | 2.8150 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1850 | 0.8400 |
| | | | nSCAD | 2.4250 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.5750 | 0.5550 |
| | | 400 | LASSO | 2.6800 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3200 | 0.7150 |
| | | | MCP | 2.6450 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3550 | 0.7000 |
| | | | SCAD | 2.8500 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1500 | 0.8700 |
| | | | nSCAD | 2.5350 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4650 | 0.6350 |


TABLE 7.

Structure identification and variable selection with the EX correlation structure ( $n_{i} = 20$ ).

| $ρ$ | $σ_{u}$ | n | Method | CZ | IZ | CV | IV | CC | IC | CF |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.3 | 0.2 | 300 | LASSO | 2.8950 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1050 | 0.9150 |
| | | | MCP | 2.8650 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1350 | 0.8900 |
| | | | SCAD | 2.9350 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0650 | 0.9450 |
| | | | nSCAD | 2.9050 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0950 | 0.9250 |
| | | 350 | LASSO | 2.9100 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0900 | 0.9250 |
| | | | MCP | 2.8900 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1100 | 0.9000 |
| | | | SCAD | 2.9800 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0200 | 0.9850 |
| | | | nSCAD | 2.8500 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1500 | 0.8850 |
| | | 400 | LASSO | 2.9450 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0550 | 0.9500 |
| | | | MCP | 2.9400 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0600 | 0.9450 |
| | | | SCAD | 2.9950 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0050 | 0.9950 |
| | | | nSCAD | 2.9250 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0750 | 0.9350 |
| | 0.4 | 300 | LASSO | 2.8100 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1900 | 0.8450 |
| | | | MCP | 2.8100 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1900 | 0.8500 |
| | | | SCAD | 2.9000 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1000 | 0.9200 |
| | | | nSCAD | 2.6250 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3750 | 0.6600 |
| | | 350 | LASSO | 2.8700 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1300 | 0.8850 |
| | | | MCP | 2.8600 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1400 | 0.8750 |
| | | | SCAD | 2.9200 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0800 | 0.9300 |
| | | | nSCAD | 2.7300 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2700 | 0.7800 |
| | | 400 | LASSO | 2.9200 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0800 | 0.9250 |
| | | | MCP | 2.8750 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1250 | 0.8850 |
| | | | SCAD | 2.9300 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0700 | 0.9350 |
| | | | nSCAD | 2.8050 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1950 | 0.8350 |
| | 0.6 | 300 | LASSO | 2.8050 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1950 | 0.8150 |
| | | | MCP | 2.7850 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2150 | 0.7950 |
| | | | SCAD | 2.9000 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1000 | 0.9100 |
| | | | nSCAD | 2.5550 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4450 | 0.6350 |
| | | 350 | LASSO | 2.8600 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1400 | 0.8700 |
| | | | MCP | 2.8550 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1450 | 0.8650 |
| | | | SCAD | 2.9150 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0850 | 0.9250 |
| | | | nSCAD | 2.5750 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4250 | 0.6550 |
| | | 400 | LASSO | 2.9000 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1000 | 0.9050 |
| | | | MCP | 2.8050 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1950 | 0.8300 |
| | | | SCAD | 2.8950 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1050 | 0.9050 |
| | | | nSCAD | 2.6350 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3650 | 0.6800 |


TABLE 6.

Structure identification and variable selection with the AR(1) correlation structure ( $n_{i} = 10$ ).

| $ρ$ | $σ_{u}$ | n | Method | CZ | IZ | CV | IV | CC | IC | CF |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.3 | 0.2 | 300 | LASSO | 2.6800 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3200 | 0.7350 |
| | | | MCP | 2.6600 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3400 | 0.7150 |
| | | | SCAD | 2.6600 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3400 | 0.7000 |
| | | | nSCAD | 2.6550 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3450 | 0.6950 |
| | | 350 | LASSO | 2.7400 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2600 | 0.7650 |
| | | | MCP | 2.7150 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2850 | 0.7550 |
| | | | SCAD | 2.8950 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1050 | 0.9050 |
| | | | nSCAD | 2.6950 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3050 | 0.7350 |
| | | 400 | LASSO | 2.8700 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1300 | 0.8850 |
| | | | MCP | 2.8450 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1550 | 0.8800 |
| | | | SCAD | 2.9150 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0850 | 0.9150 |
| | | | nSCAD | 2.8900 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1100 | 0.9050 |
| | 0.4 | 300 | LASSO | 2.6050 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3950 | 0.6650 |
| | | | MCP | 2.5800 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4200 | 0.6350 |
| | | | SCAD | 2.6250 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3750 | 0.6700 |
| | | | nSCAD | 2.4950 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.5050 | 0.5750 |
| | | 350 | LASSO | 2.6850 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3150 | 0.7000 |
| | | | MCP | 2.6800 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3200 | 0.7150 |
| | | | SCAD | 2.8100 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1900 | 0.8300 |
| | | | nSCAD | 2.5850 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4150 | 0.6400 |
| | | 400 | LASSO | 2.8600 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1400 | 0.8800 |
| | | | MCP | 2.8050 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1950 | 0.8400 |
| | | | SCAD | 2.8400 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1600 | 0.8600 |
| | | | nSCAD | 2.7100 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2900 | 0.7650 |
| | 0.6 | 300 | LASSO | 2.5600 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4400 | 0.6150 |
| | | | MCP | 2.4950 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.5050 | 0.5700 |
| | | | SCAD | 2.5200 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4800 | 0.5950 |
| | | | nSCAD | 2.4050 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.5950 | 0.5100 |
| | | 350 | LASSO | 2.6250 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3750 | 0.6750 |
| | | | MCP | 2.5550 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4450 | 0.6100 |
| | | | SCAD | 2.6150 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3850 | 0.6650 |
| | | | nSCAD | 2.4750 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.5250 | 0.5500 |
| | | 400 | LASSO | 2.7450 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2550 | 0.7900 |
| | | | MCP | 2.7700 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2300 | 0.7850 |
| | | | SCAD | 2.7800 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2200 | 0.8100 |
| | | | nSCAD | 2.6000 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4000 | 0.6650 |


TABLE 8.

Structure identification and variable selection with the AR(1) correlation structure ( $n_{i} = 20$ ).

| $ρ$ | $σ_{u}$ | n | Method | CZ | IZ | CV | IV | CC | IC | CF |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.3 | 0.2 | 300 | LASSO | 2.8050 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1950 | 0.8450 |
| | | | MCP | 2.8150 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1850 | 0.8500 |
| | | | SCAD | 2.8500 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1500 | 0.8700 |
| | | | nSCAD | 2.7700 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2300 | 0.8100 |
| | | 350 | LASSO | 2.9150 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0850 | 0.9250 |
| | | | MCP | 2.8750 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1250 | 0.9100 |
| | | | SCAD | 2.9500 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0500 | 0.9500 |
| | | | nSCAD | 2.7600 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2400 | 0.8250 |
| | | 400 | LASSO | 2.9250 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0750 | 0.9300 |
| | | | MCP | 2.8950 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1050 | 0.9100 |
| | | | SCAD | 2.9950 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0050 | 0.9950 |
| | | | nSCAD | 2.9100 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0900 | 0.9150 |
| | 0.4 | 300 | LASSO | 2.8150 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1850 | 0.8350 |
| | | | MCP | 2.7750 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2250 | 0.8050 |
| | | | SCAD | 2.8300 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1700 | 0.8500 |
| | | | nSCAD | 2.6550 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3450 | 0.6900 |
| | | 350 | LASSO | 2.8250 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1750 | 0.8550 |
| | | | MCP | 2.7850 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2150 | 0.8300 |
| | | | SCAD | 2.8900 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1100 | 0.9050 |
| | | | nSCAD | 2.6900 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3100 | 0.7350 |
| | | 400 | LASSO | 2.8700 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1300 | 0.8800 |
| | | | MCP | 2.8600 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1400 | 0.8800 |
| | | | SCAD | 2.9100 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0900 | 0.9200 |
| | | | nSCAD | 2.7500 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2500 | 0.7950 |
| | 0.6 | 300 | LASSO | 2.7550 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2450 | 0.7850 |
| | | | MCP | 2.7300 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2700 | 0.7600 |
| | | | SCAD | 2.7850 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2150 | 0.8050 |
| | | | nSCAD | 2.5600 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4400 | 0.6650 |
| | | 350 | LASSO | 2.8250 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1750 | 0.8400 |
| | | | MCP | 2.7900 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2100 | 0.8050 |
| | | | SCAD | 2.7950 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2050 | 0.8100 |
| | | | nSCAD | 2.6150 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3850 | 0.6600 |
| | | 400 | LASSO | 2.8450 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1550 | 0.8700 |
| | | | MCP | 2.8250 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1750 | 0.8350 |
| | | | SCAD | 2.8750 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1250 | 0.8950 |
| | | | nSCAD | 2.6350 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3650 | 0.6900 |


For completeness, additional and more detailed tables not included in the main text are provided in Appendix A, so that all results are fully documented and reproducible. In particular, Appendix A includes simulation results for higher dimensional settings ( $q = 20, 50$ ), and the conclusions in these higher dimensional scenarios coincide with those presented in the main text. In summary, from Tables 1 to 8, we can draw the following conclusions.

1. In the vast majority of experimental scenarios, the estimation accuracy of BCDPQIF‐LASSO, BCDPQIF‐SCAD, and BCDPQIF‐MCP surpasses that of BCDPQIF‐nSCAD, demonstrating the effectiveness of the proposed bias‐correction strategy. Overall, the four methods exhibit comparably good performance in structure identification. Moreover, neglecting measurement errors results in biased estimation for model (2).
2. Under the same conditions, as both the sample size and the number of observations increase, the performances of BCDPQIF‐SCAD, BCDPQIF‐MCP, and BCDPQIF‐LASSO improve. Notably, BCDPQIF‐SCAD and BCDPQIF‐MCP generally outperform BCDPQIF‐LASSO when estimating varying coefficients.
3. Similarly, as the magnitude of measurement errors increases, the performances of BCDPQIF‐SCAD, BCDPQIF‐MCP, and BCDPQIF‐LASSO deteriorate under the same conditions. When measurement errors are small, the performance differences among these methods are minimal; however, when measurement errors become substantial, BCDPQIF‐SCAD and BCDPQIF‐MCP significantly outperform BCDPQIF‐LASSO, indicating that BCDPQIF‐LASSO is less robust than BCDPQIF‐SCAD and BCDPQIF‐MCP.

Overall, these numerical results confirm that the proposed method performs well in handling measurement errors and within‐subject correlations, and in structure identification, estimation and variable selection.

4.2. Real Data Analysis

We now apply the proposed BCDPQIF method to data from the Multicenter AIDS Cohort Study, comprising 283 homosexual men infected with HIV between 1984 and 1991. This dataset has been widely used to illustrate VC models [5], VCEV models [31] and PLVCEVM [15]. Because CD4 cells are crucial for immune function, the study focused on how risk factors, such as cigarette smoking, drug use, and pre‐infection CD4 cell levels, influence the post‐infection depletion of CD4 percentages. Previous analyses aimed to describe the trend of mean CD4 depletion over time and to evaluate the effects of pre‐infection CD4 percentage and age at HIV infection. In our application, we account for measurement errors in the covariates and demonstrate the utility of the BCDPQIF method on this dataset.

Let $Y$ be the individual's CD4 percentage, $X_{1}$ be the centered preCD4 percentage, $X_{2}$ be the centered age at HIV infection, $X_{3} = X_{1} \cdot X_{2}$ , $X_{4} = X_{1}^{2}$ and $X_{5} = X_{2}^{2}$ . Then we consider the following model

\begin{align} Y = α_{0} (t) + X_{1} α_{1} (t) + X_{2} α_{2} (t) + X_{3} α_{3} (t) + X_{4} α_{4} (t) + X_{5} α_{5} (t) + ε, \end{align}

(30)

where $α_{0} (t)$ is the baseline of CD4 percentage; $α_{1} (t)$ and $α_{2} (t)$ describe the effects of preCD4 percentage and age at HIV infection, two covariates that, in clinical practice, are particularly prone to measurement error (due to laboratory assay variability and patient recall), $α_{3} (t)$ describes the interaction effect between the preCD4 percentage and age at HIV infection, $α_{4} (t)$ and $α_{5} (t)$ correspond to the $X_{1}^{2}$ and $X_{2}^{2}$ terms, respectively, capturing the quadratic effects of pre‐infection CD4 percentage and age at HIV infection. $t$ is the visiting time for each patient.

In this application, we considered that the observations of the pre‐infection CD4 percentage and age may contain measurement errors. The validity of the BCDPQIF method was verified by adding measurement errors to the covariates, that is,

\begin{align} W_{1} = X_{1} + u_{1}, \quad W_{2} = X_{2} + u_{2}, \end{align}

where ${(u_{1}, u_{2})}^{T} \sim N (0, \sum_{u}), \sum_{u} = σ_{u}^{2} I_{2}$ . We took $σ_{u} = 0$ , which assumes no measurement error, as well as $σ_{u} = 0.4$ and $0.6$ , which represent different levels of measurement error.

The BCDPQIF method identified one varying coefficient, $α_{0} (t)$ . Figure 1 shows the curve of ${\hat{α}}_{0} (t)$ over time under different levels of measurement error. It shows that $α_{0} (t)$ decreases quickly at the beginning of HIV infection and that the rate of decrease then slows down, which is similar to the finding of Zhao and Xue [31]. Furthermore, we found that the estimated functional curves ${\hat{α}}_{1} (t)$ under different measurement errors are very close to each other, which means that our bias‐corrected model selection scheme works well. This further demonstrates that the proposed structure identification, estimation and variable selection method is valuable in practice.

FIGURE 1.

The fitted plot of the BCDPQIF estimation ${\hat{α}}_{0} (t)$ .

5. Conclusion and Discussion

In this article, combining the merits of Xu et al. [23] and Wang and Lin [24], we proposed a BCDPQIF method for varying coefficient EV models with longitudinal data. Xu et al. [23] focused on unified variable selection for longitudinal varying coefficient models, and Wang and Lin [24] studied generalized partial linear varying coefficient models with longitudinal data. Notably, their approaches can perform structure identification and variable selection simultaneously. However, they do not take into account the situation where the model contains measurement errors, which are inevitable in practice. For longitudinal data in particular, both measurement errors and unknown working correlation matrices need to be handled appropriately. For precisely this reason, we studied structure identification, estimation and variable selection for the VCEV models with longitudinal data.

It is important to highlight that the VCEV models discussed here fall under a broad category of models, which includes both the linear EV models and the PLVCEV models. The proposed BCDPQIF method can perform structure identification, estimation and variable selection simultaneously for these models. To be precise, the BCDPQIF method can not only handle measurement errors and unknown within‐subject correlations, but also identify whether each regression coefficient in the model is constant or varying, and select the nonzero constant coefficients. This means that the BCDPQIF method avoids the risk of misspecifying the model in advance as a linear EV, VCEV or PLVCEV model. Theoretical and numerical results support the validity of the method.

Furthermore, the BCDPQIF method is versatile and can be extended to structure identification, estimation and variable selection in a variety of models, including the additive models and the single‐index varying coefficient models, among others. Additionally, the BCDPQIF method is applicable to other forms of correlated data analysis, such as panel data and clustered data. In future work, we plan to use this method to investigate more complex modeling frameworks.

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

This research was supported by the National Natural Science Foundation of China (Nos. 12371293 and 12401373), the University Social Science Research Project of Anhui Province (Nos. 2022AH050560, 2023AH010008, 2023AH050203, 2024AH050013, and 2024AH050015), the Social Science Foundation of Anhui Province (Nos. AHSKYQ2025D17 and AHSKF2022D08), the Social Science Foundation of the Ministry of Education of China (Nos. 24YJAZH146 and 21YJC910003), the National Social Science Foundation of China (No. 23BTJ061), the University Natural Science Research Project of Anhui Province (Nos. 2024AH050015, KJ2021A0486 and 2024AH050017), Innovation Team Project of Anhui Province (2023AH010008), Postgraduate Education Reform and Quality Improvement Project of Henan Province (YJS2026AL016).

Appendix A.

Derivation Process of $D_{i}^{(κ)}$

\begin{align} D_{i}^{(κ)} = \\ (\begin{matrix} tr (A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2}) \cdot \sum_{u} & \sum_{u} \otimes (1_{1 \times n_{i}} diag (A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2}) B_{i}^{T}) \\ {(\sum_{u} \otimes (1_{1 \times n_{i}} diag (A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2}) B_{i}^{T}))}^{T} & \sum_{u} \otimes (B_{i} diag (A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2}) B_{i}^{T}) \end{matrix}) . \end{align}

First, we know that

\begin{align} B_{i j} = I_{q} \otimes B (t_{i j}) = {(\begin{matrix} B (t_{i j}) & 0 & \dots & 0 \\ 0 & B (t_{i j}) & \dots & 0 \\ ⋮ & ⋮ & ⋱ & ⋮ \\ 0 & 0 & \dots & B (t_{i j}) \end{matrix})}_{q L \times q}, ũ_{i j} = {(\begin{matrix} B (t_{i j}) u_{i j}^{1} \\ B (t_{i j}) u_{i j}^{2} \\ ⋮ \\ B (t_{i j}) u_{i j}^{q} \end{matrix})}_{q L \times 1}, \end{align}

and

\begin{align} ũ_{i} = {(ũ_{i 1}, ũ_{i 2}, \dots, ũ_{i n_{i}})}^{T} = {(\begin{matrix} u_{i 1}^{1} B^{T} (t_{i 1}) & u_{i 1}^{2} B^{T} (t_{i 1}) & \dots & u_{i 1}^{q} B^{T} (t_{i 1}) \\ u_{i 2}^{1} B^{T} (t_{i 2}) & u_{i 2}^{2} B^{T} (t_{i 2}) & \dots & u_{i 2}^{q} B^{T} (t_{i 2}) \\ ⋮ & ⋮ & ⋱ & ⋮ \\ u_{i n_{i}}^{1} B^{T} (t_{i n_{i}}) & u_{i n_{i}}^{2} B^{T} (t_{i n_{i}}) & \dots & u_{i n_{i}}^{q} B^{T} (t_{i n_{i}}) \end{matrix})}_{n_{i} \times q L} . \end{align}
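The two displays above say that the expanded error vector is the measurement error pushed through a Kronecker product with the spline basis. A minimal NumPy sketch follows; the function name `expand_error` and the toy numbers are assumptions made only for illustration, with `B_t` playing the role of $B (t_{i j})$ and `u_ij` the role of $u_{i j}$.

```python
import numpy as np

def expand_error(u_ij, B_t):
    """Return u_tilde_ij = (I_q kron B(t_ij)) u_ij as a vector of length q*L.

    u_ij : array of shape (q,), measurement error at one observation time
    B_t  : array of shape (L,), spline basis evaluated at that time
    """
    q = u_ij.shape[0]
    # B_ij has shape (q*L, q): one copy of B(t_ij) per covariate, on the block diagonal
    B_ij = np.kron(np.eye(q), B_t.reshape(-1, 1))
    return B_ij @ u_ij

# Toy example with q = 2 covariates and L = 3 basis functions (hypothetical values)
u = np.array([2.0, -1.0])
B = np.array([1.0, 0.5, 0.25])
u_tilde = expand_error(u, B)
```

The result stacks $B (t_{i j}) u_{i j}^{1}, \dots, B (t_{i j}) u_{i j}^{q}$, which is the same vector as `np.kron(u, B)`.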

For simplicity, we define the matrix $ζ^{κ}$ as

ζ^{κ} = A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2} = [\begin{matrix} ι_{11}^{κ} & ι_{12}^{κ} & \dots & ι_{1 n_{i}}^{κ} \\ ι_{21}^{κ} & ι_{22}^{κ} & \dots & ι_{2 n_{i}}^{κ} \\ ⋮ & ⋮ & ⋱ & ⋮ \\ ι_{n_{i} 1}^{κ} & ι_{n_{i} 2}^{κ} & \dots & ι_{n_{i} n_{i}}^{κ} \end{matrix}] .

Then, $D_{i}^{(κ)}$ can be reexpressed as follows

\begin{align} D_{i}^{(κ)} & = E ({(u_{i}, ũ_{i})}^{T} ζ^{κ} (u_{i}, ũ_{i})) = (\begin{matrix} E (u_{i}^{T} ζ^{κ} u_{i}) & E (u_{i}^{T} ζ^{κ} ũ_{i}) \\ E (ũ_{i}^{T} ζ^{κ} u_{i}) & E (ũ_{i}^{T} ζ^{κ} ũ_{i}) \end{matrix}) . \end{align}

After some matrix calculations, we can have

\begin{align} E (u_{i} ζ^{κ} u_{i}^{T}) & = E ((u_{i 1}, u_{i 2}, \dots, u_{i n_{i}}) [\begin{matrix} ι_{11}^{κ} & ι_{12}^{κ} & \dots & ι_{1 n_{i}}^{κ} \\ ι_{21}^{κ} & ι_{22}^{κ} & \dots & ι_{2 n_{i}}^{κ} \\ ⋮ & ⋮ & ⋱ & ⋮ \\ ι_{n_{i} 1}^{κ} & ι_{n_{i} 2}^{κ} & \dots & ι_{n_{i} n_{i}}^{κ} \end{matrix}] (\begin{array}{c} u_{i 1}^{T} \\ u_{i 2}^{T} \\ ⋮ \\ u_{i n_{i}}^{T} \end{array})) \\ = E (\sum_{j = 1}^{n_{i}} u_{i j} ι_{j 1}^{κ} u_{i 1}^{T} + \sum_{j = 1}^{n_{i}} u_{i j} ι_{j 2}^{κ} u_{i 2}^{T} + \dots + \sum_{j = 1}^{n_{i}} u_{i j} ι_{j n_{i}}^{κ} u_{i n_{i}}^{T}) \\ = [ι_{11}^{κ} E (u_{i 1} u_{i 1}^{T}) + ι_{22}^{κ} E (u_{i 2} u_{i 2}^{T}) + \dots + ι_{n_{i} n_{i}}^{κ} E (u_{i n_{i}} u_{i n_{i}}^{T})] \\ = tr (ζ^{κ}) \cdot \sum_{u} . \end{align}

\begin{align} E (u_{i} ζ^{κ} ũ_{i}^{T}) & = E ({(u_{i 1}, u_{i 2}, \dots, u_{i n_{i}})}_{q \times n_{i}} [\begin{matrix} ι_{11}^{κ} & ι_{12}^{κ} & \dots & ι_{1 n_{i}}^{κ} \\ ι_{21}^{κ} & ι_{22}^{κ} & \dots & ι_{2 n_{i}}^{κ} \\ ⋮ & ⋮ & ⋱ & ⋮ \\ ι_{n_{i} 1}^{κ} & ι_{n_{i} 2}^{κ} & \dots & ι_{n_{i} n_{i}}^{κ} \end{matrix}] {(\begin{array}{c} ũ_{i 1}^{T} \\ ũ_{i 2}^{T} \\ ⋮ \\ ũ_{i n_{i}}^{T} \end{array})}_{n_{i} \times L q}) \\ = E (\sum_{j = 1}^{n_{i}} u_{i j} ι_{j 1}^{κ} ũ_{i 1}^{T} + \sum_{j = 1}^{n_{i}} u_{i j} ι_{j 2}^{κ} ũ_{i 2}^{T} + \dots + \sum_{j = 1}^{n_{i}} u_{i j} ι_{j n_{i}}^{κ} ũ_{i n_{i}}^{T}) \\ = {[ι_{11}^{κ} E (u_{i 1} ũ_{i 1}^{T}) + ι_{22}^{κ} E (u_{i 2} ũ_{i 2}^{T}) + \dots + ι_{n_{i} n_{i}}^{κ} E (u_{i n_{i}} ũ_{i n_{i}}^{T})]}_{q \times L q} \\ = \sum_{u} \otimes ({(1, 1, \dots, 1)}_{1 \times n_{i}} diag (ζ^{κ}) B_{i}^{T}) . \end{align}

\begin{align} E (ũ_{i} ζ^{κ} ũ_{i}^{T}) & = E ((ũ_{i 1}, ũ_{i 2}, \dots, ũ_{i n_{i}}) [\begin{matrix} ι_{11}^{κ} & ι_{12}^{κ} & \dots & ι_{1 n_{i}}^{κ} \\ ι_{21}^{κ} & ι_{22}^{κ} & \dots & ι_{2 n_{i}}^{κ} \\ ⋮ & ⋮ & ⋱ & ⋮ \\ ι_{n_{i} 1}^{κ} & ι_{n_{i} 2}^{κ} & \dots & ι_{n_{i} n_{i}}^{κ} \end{matrix}] {(\begin{array}{c} ũ_{i 1}^{T} \\ ũ_{i 2}^{T} \\ ⋮ \\ ũ_{i n_{i}}^{T} \end{array})}_{n_{i} \times L q}) \\ = E {(\sum_{j = 1}^{n_{i}} ũ_{i j} ι_{j 1}^{κ} ũ_{i 1}^{T} + \sum_{j = 1}^{n_{i}} ũ_{i j} ι_{j 2}^{κ} ũ_{i 2}^{T} + \dots + \sum_{j = 1}^{n_{i}} ũ_{i j} ι_{j n_{i}}^{κ} ũ_{i n_{i}}^{T})}_{L q \times L q} \\ = {[ι_{11}^{κ} E (ũ_{i 1} ũ_{i 1}^{T}) + ι_{22}^{κ} E (ũ_{i 2} ũ_{i 2}^{T}) + \dots + ι_{n_{i} n_{i}}^{κ} E (ũ_{i n_{i}} ũ_{i n_{i}}^{T})]}_{L q \times L q} \\ = \sum_{u} \otimes (B_{i} diag (A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2}) B_{i}^{T}) . \end{align}

Therefore, we obtain $D_{i}^{(κ)}$ as defined in Equation (11).
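The first block of the derivation, $E (u_{i}^{T} ζ^{κ} u_{i}) = tr (ζ^{κ}) \cdot \sum_{u}$ , can be checked by simulation. The sketch below is only an illustration under stated assumptions: the rows $u_{i j}^{T}$ are drawn as independent mean‐zero Gaussian vectors with common covariance `sigma_u`, and the function name, the matrix `zeta` and all numerical values are hypothetical.

```python
import numpy as np

def mc_quadratic_moment(zeta, sigma_u, n_draws=200_000, seed=0):
    """Monte Carlo estimate of E(u^T zeta u) for an n_i x q error matrix u.

    Assumes the rows of u are i.i.d. N(0, sigma_u), independent across rows.
    """
    rng = np.random.default_rng(seed)
    n_i = zeta.shape[0]
    q = sigma_u.shape[0]
    # u has shape (n_draws, n_i, q): one error matrix per draw
    u = rng.multivariate_normal(np.zeros(q), sigma_u, size=(n_draws, n_i))
    zu = zeta @ u  # broadcast matrix product, shape (n_draws, n_i, q)
    # Sum u^T (zeta u) over rows j and draws r, then average over draws
    return np.einsum("rja,rjb->ab", u, zu) / n_draws

# Hypothetical inputs: n_i = 4 time points, q = 2 error-prone covariates
zeta = np.array([[2.0, 0.5, 0.0, 0.0],
                 [0.5, 1.0, 0.3, 0.0],
                 [0.0, 0.3, 1.5, 0.2],
                 [0.0, 0.0, 0.2, 1.0]])
sigma_u = np.array([[0.4, 0.1],
                    [0.1, 0.2]])
closed_form = np.trace(zeta) * sigma_u  # tr(zeta) * Sigma_u
```

Only the diagonal of `zeta` contributes to the expectation because errors at different time points are taken to be uncorrelated, so comparing `mc_quadratic_moment(zeta, sigma_u)` with `closed_form` should show agreement up to Monte Carlo error.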

Proof of Theorems

Firstly, we present two necessary lemmas.

Lemma 1

If C1‐C11 hold, and $K = O (N^{1 / (2 r + 1)})$ , then we have

$\begin{align} {\dot{\hat{\overline{g}}}}_{n} (θ) \overset{p}{\to} - J_{0}, \sqrt{n} {\hat{\overline{g}}}_{n} (θ_{0}) \overset{ℒ}{\to} N (0, Ω_{0}) . \end{align}$

According to Equation (14), we have

$\begin{align} {\hat{\overline{g}}}_{n} (θ) = \frac{1}{n} \sum_{i = 1}^{n} ĝ_{i} (θ) = \frac{1}{n} \sum_{i = 1}^{n} (\begin{matrix} {(W_{i}, {\tilde{W}}_{i})}^{T} A_{i}^{- 1 / 2} M_{1} A_{i}^{- 1 / 2} (Y_{i} - (W_{i}, {\tilde{W}}_{i}) θ) + {\hat{D}}_{i}^{(1)} θ \\ {(W_{i}, {\tilde{W}}_{i})}^{T} A_{i}^{- 1 / 2} M_{2} A_{i}^{- 1 / 2} (Y_{i} - (W_{i}, {\tilde{W}}_{i}) θ) + {\hat{D}}_{i}^{(2)} θ \\ ⋮ \\ {(W_{i}, {\tilde{W}}_{i})}^{T} A_{i}^{- 1 / 2} M_{s} A_{i}^{- 1 / 2} (Y_{i} - (W_{i}, {\tilde{W}}_{i}) θ) + {\hat{D}}_{i}^{(s)} θ \end{matrix}) . \end{align}$

Denote the $κ$th block of ${\dot{\hat{\overline{g}}}}_{n} (θ)$ by ${\dot{\hat{\overline{g}}}}_{n κ} (θ)$ , $κ = 1, 2, \dots, s$ . Then

$\begin{array}{l} {\dot{\hat{\overline{g}}}}_{n κ} (θ) \\ = - \frac{1}{n} \sum_{i = 1}^{n} ({(W_{i}, {\tilde{W}}_{i})}^{T} A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2} (W_{i}, {\tilde{W}}_{i}) - {\hat{D}}_{i}^{(κ)}) \\ = - \frac{1}{n} \sum_{i = 1}^{n} ({(X_{i} + u_{i}, {\tilde{X}}_{i} + ũ_{i})}^{T} A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2} (X_{i} + u_{i}, {\tilde{X}}_{i} + ũ_{i}) - {\hat{D}}_{i}^{(κ)}) \\ = - \frac{1}{n} \sum_{i = 1}^{n} (\underset{Δ_{1 i}}{\underset{⏟}{{(X_{i}, {\tilde{X}}_{i})}^{T} A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2} (X_{i}, {\tilde{X}}_{i})}} + \underset{Δ_{2 i}}{\underset{⏟}{{(X_{i}, {\tilde{X}}_{i})}^{T} A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2} (u_{i}, ũ_{i})}} \\ + \underset{Δ_{3 i}}{\underset{⏟}{{(u_{i}, ũ_{i})}^{T} A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2} (X_{i}, {\tilde{X}}_{i})}} + \underset{Δ_{4 i}}{\underset{⏟}{{(u_{i}, ũ_{i})}^{T} A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2} (u_{i}, ũ_{i})}} - {\hat{D}}_{i}^{(κ)}) \\ = - (Δ_{1} + Δ_{2} + Δ_{3} + Δ_{4} - \frac{1}{n} \sum_{i = 1}^{n} {\hat{D}}_{i}^{(κ)}) . \end{array}$

Then we have

$\begin{align} Δ_{4} - \frac{1}{n} \sum_{i = 1}^{n} {\hat{D}}_{i}^{(κ)} & = \frac{1}{n} \sum_{i = 1}^{n} {(u_{i}, ũ_{i})}^{T} A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2} (u_{i}, ũ_{i}) - D_{i}^{(κ)} + D_{i}^{(κ)} - \frac{1}{n} \sum_{i = 1}^{n} {\hat{D}}_{i}^{(κ)} . \end{align}$

Clearly, according to the law of large numbers, we have $\frac{1}{n} \sum_{i = 1}^{n} {(u_{i}, ũ_{i})}^{T} A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2} (u_{i}, ũ_{i}) - D_{i}^{(κ)} \overset{p}{\to} 0$ and $D_{i}^{(κ)} - \frac{1}{n} \sum_{i = 1}^{n} {\hat{D}}_{i}^{(κ)} \overset{p}{\to} 0$ as $n \to \infty$ . So we get $Δ_{4} - \frac{1}{n} \sum_{i = 1}^{n} {\hat{D}}_{i}^{(κ)} \overset{p}{\to} 0$ . Under C9, we get $Δ_{1} \overset{p}{\to} J_{0}^{(κ)}$ . It remains to show that $Δ_{2} \overset{p}{\to} 0$ and $Δ_{3} \overset{p}{\to} 0$ .

Write $Δ_{2} = \frac{1}{n} \sum_{i = 1}^{n} ξ_{i κ}$ , where $ξ_{i κ} = {(X_{i}, {\tilde{X}}_{i})}^{T} A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2} (u_{i}, ũ_{i})$ . Clearly, $E (ξ_{i κ}) = 0$ . From C4‐C7, we see that $cov (ξ_{i κ})$ is bounded. By the law of large numbers, we get $Δ_{3}^{T} = Δ_{2} \overset{p}{\to} 0$ . Thus, we have ${\dot{\hat{\overline{g}}}}_{n κ} (θ) \overset{p}{\to} - J_{0}^{(κ)}$ and ${\dot{\hat{\overline{g}}}}_{n} (θ) \overset{p}{\to} - J_{0}$ , where $J_{0} = {(J_{0}^{(1)}, J_{0}^{(2)}, \dots, J_{0}^{(s)})}^{T}$ . Applying a Taylor expansion to ${\hat{\overline{g}}}_{n} (θ)$ at $θ_{0}$ , we have

$\begin{align} {\hat{\overline{g}}}_{n} (θ) = {\hat{\overline{g}}}_{n} (θ_{0}) + {\dot{\hat{\overline{g}}}}_{n} (θ_{0}) (θ - θ_{0}) + o (θ - θ_{0}) . \end{align}$

Denote the $κ$th block of ${\hat{\overline{g}}}_{n} (θ_{0})$ by ${\hat{\overline{g}}}_{n κ} (θ_{0})$ , $κ = 1, 2, \dots, s$ . Then

$\begin{align} {\hat{\overline{g}}}_{n κ} (θ_{0}) & = \frac{1}{n} \sum_{i = 1}^{n} [{(W_{i}, {\tilde{W}}_{i})}^{T} A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2} (Y_{i} - (W_{i}, {\tilde{W}}_{i}) θ_{0}) + {\hat{D}}_{i}^{(κ)} θ_{0}] \\ = \frac{1}{n} \sum_{i = 1}^{n} [{(X_{i}, {\tilde{X}}_{i})}^{T} A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2} ε_{i}] \\ - \frac{1}{n} \sum_{i = 1}^{n} [{(X_{i}, {\tilde{X}}_{i})}^{T} A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2} (u_{i}, ũ_{i}) θ_{0}] \\ + \frac{1}{n} \sum_{i = 1}^{n} [{(u_{i}, ũ_{i})}^{T} A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2} J_{X R}] \\ + \frac{1}{n} \sum_{i = 1}^{n} [{(u_{i}, ũ_{i})}^{T} A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2} ε_{i}] \\ + \frac{1}{n} \sum_{i = 1}^{n} [{(X_{i}, {\tilde{X}}_{i})}^{T} A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2} J_{X R}] \\ - \frac{1}{n} \sum_{i = 1}^{n} [{(u_{i}, ũ_{i})}^{T} A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2} (u_{i}, ũ_{i}) θ_{0}] + \frac{1}{n} \sum_{i = 1}^{n} D_{i}^{(κ)} θ_{0} \\ = J_{1} - J_{2} + J_{3} + J_{4} + J_{5} - J_{6} + \frac{1}{n} \sum_{i = 1}^{n} D_{i}^{(κ)} θ_{0} . \end{align}$

where $R (t_{i j}) = {(R_{1} (t_{i j}), R_{2} (t_{i j}), \dots, R_{q} (t_{i j}))}^{T}$ , $R_{k} (t_{i j}) = α_{k} (t_{i j}) - B {(t_{i j})}^{T} γ_{k} - β_{k}, k = 1, 2, \dots, q$ , and $J_{X R} = diag (X_{i} R_{i}^{T}) {(1, 1, \dots, 1)}_{n_{i} \times 1}$ .

$\begin{align} R_{i}^{T} = (R (t_{i 1}), R (t_{i 2}), \dots, R (t_{i n_{i}})) = (\begin{matrix} R_{1} (t_{i 1}) & R_{1} (t_{i 2}) & \dots & R_{1} (t_{i n_{i}}) \\ R_{2} (t_{i 1}) & R_{2} (t_{i 2}) & \dots & R_{2} (t_{i n_{i}}) \\ ⋮ & ⋮ & ⋱ & ⋮ \\ R_{q} (t_{i 1}) & R_{q} (t_{i 2}) & \dots & R_{q} (t_{i n_{i}}) \end{matrix}) . \end{align}$

Denote $J_{1} = \frac{1}{n} \sum_{i = 1}^{n} φ_{i}$ , where $φ_{i} = {(X_{i}, {\tilde{X}}_{i})}^{T} A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2} ε_{i}$ . According to C5‐C7, we have $E (φ_{i}) = 0$ and

$cov (φ_{i}) = {(X_{i}, {\tilde{X}}_{i})}^{T} A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2} V_{i} A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2} (X_{i}, {\tilde{X}}_{i}) < \infty .$

By the law of large numbers, we get $J_{1} \overset{p}{\to} 0$ . Similarly, we have $J_{2} \overset{p}{\to} 0$ and $J_{3} \overset{p}{\to} 0$ .

Denote $J_{4} = \frac{1}{n} \sum_{i = 1}^{n} ϕ_{i}$ , where $ϕ_{i} = {(u_{i}, ũ_{i})}^{T} A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2} ε_{i}$ . Since $ε_{i}$ and $u_{i}$ are independent of each other, we have $E (ϕ_{i}) = 0$ . According to the Cauchy‐Schwarz inequality and C5‐C7, we have

$\begin{align} {(cov (ϕ_{i}))}^{2} & \leq E ({(u_{i}, ũ_{i})}^{T} A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2} (u_{i}, ũ_{i})) E (ε_{i}^{T} A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2} ε_{i}) < \infty . \end{align}$

Thus, $J_{4} \overset{p}{\to} 0$ . By the law of large numbers, from the definition of ${\hat{D}}_{i}^{(κ)}$ , we have $J_{6} - \frac{1}{n} \sum_{i = 1}^{n} {\hat{D}}_{i}^{(κ)} θ_{0} \overset{p}{\to} 0$ . From C8, we have $J_{5} = O_{p} (n^{- 1 / 2} K^{- r}) = o_{p} (n^{- 1 / 2})$ and $J_{3} = o_{p} (n^{- 1 / 2})$ . So, we have ${\hat{\overline{g}}}_{n} (θ) \overset{p}{\to} J_{0} (θ_{0} - θ), \forall θ \in Θ .$

Following Tian, Xue and Liu [28], according to the results above, we have

$\begin{align} {\hat{\overline{g}}}_{n κ} (θ_{0}) & = \frac{1}{n} \sum_{i = 1}^{n} [{((X_{i}, {\tilde{X}}_{i}) + (u_{i}, ũ_{i}))}^{T} A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2} \\ (ε_{i} - (u_{i}, ũ_{i}) θ_{0}) + {\hat{D}}_{i}^{(κ)} θ_{0}] + o_{p} (n^{- 1 / 2}) \\ = \frac{1}{n} \sum_{i = 1}^{n} [{(X_{i}, {\tilde{X}}_{i})}^{T} A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2} ε_{i} - {(X_{i}, {\tilde{X}}_{i})}^{T} A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2} (u_{i}, ũ_{i}) θ_{0} \\ + {(u_{i}, ũ_{i})}^{T} A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2} ε_{i} \\ - {(u_{i}, ũ_{i})}^{T} A_{i}^{- 1 / 2} M_{κ} A_{i}^{- 1 / 2} (u_{i}, ũ_{i}) θ_{0} + {\hat{D}}_{i}^{(κ)} θ_{0}] + o_{p} (n^{- 1 / 2}) \\ = \frac{1}{n} \sum_{i = 1}^{n} (ψ_{i κ 1} + ψ_{i κ 2} + ψ_{i κ 3} + ψ_{i κ 4}) + o_{p} (n^{- 1 / 2}) \\ = \frac{1}{n} \sum_{i = 1}^{n} ψ_{i κ} + o_{p} (n^{- 1 / 2}) . \end{align}$

where $ψ_{i} = {(ψ_{i 1}, ψ_{i 2}, \dots, ψ_{i s})}^{T}$ and $ψ_{i κ} = ψ_{i κ 1} + ψ_{i κ 2} + ψ_{i κ 3} + ψ_{i κ 4}$ . So we have ${\hat{\overline{g}}}_{n} (θ_{0}) = \frac{1}{n} \sum_{i = 1}^{n} ψ_{i} + o_{p} (n^{- 1 / 2})$ and $Ω_{n} (θ_{0}) = \frac{1}{n} \sum_{i = 1}^{n} ψ_{i} ψ_{i}^{T} + o (1)$ . From C5‐C7, we get $E (ψ_{i κ m}) = 0$ and $cov (ψ_{i κ m}) < \infty$ for $m = 1, 2, 3, 4$ . By the properties of covariance matrices, we have

$\begin{align} cov (ψ_{i κ}) \leq \sum_{m = 1}^{4} cov (ψ_{i κ m}) + \sum_{m \neq l} \sqrt{cov (ψ_{i κ m}) cov (ψ_{i κ l})} < \infty, \\ \forall a \in ℝ^{s (q + q L)}, a^{T} a = 1, E (a^{T} ψ_{i}) = 0, \sup_{i} E ‖ a^{T} ψ_{i} ‖^{3} \leq ‖ a^{T} ‖ \sup_{i} E ‖ ψ_{i} ‖^{3} . \end{align}$

According to the Slutsky Theorem, we have $\sqrt{n} {\hat{\overline{g}}}_{n} (θ_{0}) \overset{ℒ}{\to} N (0, Ω_{0}), {\hat{\overline{g}}}_{n} (θ_{0}) = O_{p} (n^{- 1 / 2})$ . The proof of Lemma 1 is completed.
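The bias‐corrected extended score ${\hat{g}}_{i} (θ)$ analysed in Lemma 1 can be written down numerically for one subject. The following is a minimal sketch, not the authors' implementation: the function and argument names are assumptions, `W` stands for the stacked observed design $(W_{i}, {\tilde{W}}_{i})$, and the correction matrices ${\hat{D}}_{i}^{(κ)}$ are simply passed in as `D_hats` rather than estimated.

```python
import numpy as np

def bias_corrected_score(W, Y, theta, A_inv_sqrt, basis_mats, D_hats):
    """Stack the s blocks  W^T A^{-1/2} M_k A^{-1/2} (Y - W theta) + D_k theta.

    W          : (n_i, p) observed (error-contaminated) design for one subject
    Y          : (n_i,) responses
    theta      : (p,) parameter vector
    A_inv_sqrt : (n_i, n_i) matrix A_i^{-1/2}
    basis_mats : list of s (n_i, n_i) basis matrices M_1, ..., M_s
    D_hats     : list of s (p, p) bias-correction matrices
    """
    resid = Y - W @ theta
    blocks = []
    for M, D in zip(basis_mats, D_hats):
        zeta = A_inv_sqrt @ M @ A_inv_sqrt
        # Naive score block plus the additive correction for measurement error
        blocks.append(W.T @ zeta @ resid + D @ theta)
    return np.concatenate(blocks)

# Toy inputs (hypothetical): n_i = 3, p = 2, s = 2 basis matrices
W = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
theta = np.array([1.0, 2.0])
Y = W @ theta                           # zero residual, so only D_k theta remains
M1, M2 = np.eye(3), np.ones((3, 3)) - np.eye(3)
D_hats = [np.zeros((2, 2)), np.eye(2)]
g = bias_corrected_score(W, Y, theta, np.eye(3), [M1, M2], D_hats)
```

With a zero residual the first block vanishes and the second block reduces to ${\hat{D}}_{i}^{(2)} θ$ , which makes the additive role of the correction term visible.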

Lemma 2

If C1‐C11 hold, we get

$‖ n^{- 1} {\dot{Q}}_{n} (θ_{0}) - 2 {\dot{\hat{\overline{g}}}}_{n}^{T} (θ_{0}) Ω_{n}^{- 1} {\hat{\overline{g}}}_{n} (θ_{0}) ‖ = O_{p} (n^{- 1}),$ (A1)

$‖ n^{- 1} {\ddot{Q}}_{n} (θ_{0}) - 2 {\dot{\hat{\overline{g}}}}_{n}^{T} (θ_{0}) Ω_{n}^{- 1} {\dot{\hat{\overline{g}}}}_{n} (θ_{0}) ‖ = o_{p} (1) .$ (A2)

The proof of Lemma 2 is similar to that of Lemma 2 in Tian, Xue and Liu [28], and the details are omitted here.

Proof of Theorem 1

Let $δ = n^{- r / (2 r + 1)}, β = β_{0} + δ C_{1}, γ = γ_{0} + δ C_{2}$ and $C = {(C_{1}^{T}, C_{2}^{T})}^{T}$ . To prove Theorem 1, it is sufficient to show that, for any $ε > 0$ , there exists a large constant $C_{0}$ such that

$P \{\inf_{‖ C ‖ = C_{0}} Q_{p} (θ) ⩾ Q_{p} (θ_{0})\} ⩾ 1 - ε .$ (A3)

Obviously, when $ε \geq 1$ , Equation (A3) is always true. Therefore, we consider the case that $ε \in (0, 1)$ . Assume $α_{k} (\cdot) = 0 (k = q_{1} + 1, q_{1} + 2, \dots, q)$ , $p_{λ} (0) = 0$ and let $Δ (β, γ) = \frac{1}{K} [Q_{p} (θ) - Q_{p} (θ_{0})], θ_{0} = {(β_{0}^{T}, γ_{0}^{T})}^{T}$ , we have

$\begin{align} Δ (β, γ) & \geq \frac{1}{K} [Q_{n} (θ) - Q_{n} (θ_{0})] + \frac{n}{K} \sum_{l = 1}^{q_{1}} [p_{λ_{1 l}} ({‖γ_{l}‖}_{H}) - p_{λ_{1 l}} ({‖γ_{l 0}‖}_{H})] \\ + \frac{n}{K} \sum_{k = 1}^{q_{1}} [p_{λ_{2 k}} (|β_{k}|) - p_{λ_{2 k}} (|β_{k 0}|)] \\ = {\tilde{Δ}}_{1} + {\tilde{Δ}}_{2} + {\tilde{Δ}}_{3} . \end{align}$

Applying a Taylor expansion to $Q_{n} (θ)$ at $θ_{0}$ , we have $Q_{n} (θ) = Q_{n} (θ_{0} + δ C) = Q_{n} (θ_{0}) + δ C^{T} {\dot{Q}}_{n} (θ_{0}) + \frac{1}{2} δ^{2} C^{T} {\ddot{Q}}_{n} (\tilde{θ}) C,$ where $\tilde{θ}$ lies between $θ$ and $θ_{0}$ . According to Lemmas 1 and 2, we get

$\begin{align} δ C^{T} {\dot{Q}}_{n} (θ_{0}) & = δ C^{T} \{2 n {\dot{\hat{\overline{g}}}}_{n} (θ_{0}) Ω_{n}^{- 1} {\hat{\overline{g}}}_{n} (θ_{0}) + n O_{p} (n^{- 1})\} \\ = ‖ C ‖ O_{p} (\sqrt{n} δ) + ‖ C ‖ O_{p} (δ), \\ \frac{1}{2} δ^{2} C^{T} {\ddot{Q}}_{n} (θ_{0}) C & = δ^{2} C^{T} \{2 n {\dot{\hat{\overline{g}}}}_{n}^{T} (θ_{0}) Ω_{n}^{- 1} {\dot{\hat{\overline{g}}}}_{n} (θ_{0}) + n o_{p} (1)\} C \\ = n δ^{2} C^{T} {\dot{\hat{\overline{g}}}}_{n}^{T} (θ_{0}) Ω_{n}^{- 1} {\dot{\hat{\overline{g}}}}_{n} (θ_{0}) C + n δ^{2} ‖ C ‖^{2} o_{p} (1) . \end{align}$

Therefore, we have

$\begin{align} {\tilde{Δ}}_{1} & = \frac{1}{K} \{n δ^{2} ‖ C ‖^{2} J_{0}^{T} Ω_{0}^{- 1} J_{0} + ‖ C ‖ O_{p} (\sqrt{n} δ) + ‖ C ‖ O_{p} (δ) + n δ^{2} ‖ C ‖^{2} o_{p} (1)\} . \end{align}$

Obviously, $n δ^{2} ‖ C ‖^{2} J_{0}^{T} Ω_{0}^{- 1} J_{0} \geq 0$ . When $C$ is large enough,

$\begin{align} n δ^{2} ‖ C ‖^{2} J_{0}^{T} Ω_{0}^{- 1} J_{0} & \geq ‖ C ‖ O_{p} (\sqrt{n} δ), n δ^{2} ‖ C ‖^{2} J_{0}^{T} Ω_{0}^{- 1} J_{0} \geq n δ^{2} ‖ C ‖^{2} o_{p} (1) . \end{align}$

So when $C$ is large enough, ${\tilde{Δ}}_{1} > 0$ . Next, by Taylor expansion, we get

$\begin{align} {\tilde{Δ}}_{2} & = \frac{n}{K} \sum_{k = 1}^{p_{1}} [p_{λ_{2 k}} (|β_{k}|) - p_{λ_{2 k}} (|β_{k 0}|)] \\ = \frac{1}{K} \sum_{k = 1}^{p_{1}} [n δ p_{λ_{2 k}}^{'} (|β_{k 0}|) sgn (β_{k 0}) |C_{2}| + n δ^{2} p_{λ_{2 k}}^{″} (β_{k 0}) {|C_{2}|}^{2} (1 + o (1))] \\ \leq \frac{1}{K} \{\sqrt{p_{1}} n δ a_{n} ‖ C ‖ + n δ^{2} a_{n} ‖ C ‖^{2}\}, \\ {\tilde{Δ}}_{3} & = \frac{n}{K} \sum_{k = 1}^{p_{1}} [p_{λ_{2 k}} (|β_{k}|) - p_{λ_{2 k}} (|β_{k 0}|)] \\ = \frac{1}{K} \sum_{k = 1}^{p_{1}} [n δ p_{λ_{2 k}}^{'} (|β_{k 0}|) sgn (β_{k 0}) |C_{1}| + n δ^{2} p_{λ_{2 k}}^{″} (β_{k 0}) {|C_{1}|}^{2} (1 + o (1))] \\ \leq \frac{1}{K} \{\sqrt{p_{1}} n δ a_{n} ‖ C ‖ + n δ^{2} a_{n} ‖ C ‖^{2}\} . \end{align}$

We can see that for sufficiently large $C_{0}$ , ${\tilde{Δ}}_{1} \geq {\tilde{Δ}}_{2}$ and ${\tilde{Δ}}_{1} \geq {\tilde{Δ}}_{3}$ uniformly in $‖ C ‖ = C_{0}$ . Thus, inequality (A3) holds. According to Schumaker [29], we get

$‖ α_{k} (t) - β_{0 k} - B^{T} (t) γ_{0 k} ‖ = O_{p} (n^{- r / (2 r + 1)}), k = 1, 2, \dots, q .$

And then we have

$\begin{align} {‖{\hat{α}}_{k} (t) - α_{k} (t)‖}^{2} \\ = \int_{0}^{1} {\{B^{T} (t) {\hat{γ}}_{k} + {\hat{β}}_{k} - B^{T} (t) γ_{0 k} - β_{0 k} - α_{k} (t) + β_{0 k} + B^{T} (t) γ_{0 k}\}}^{2} d t \\ \leq 2 \int_{0}^{1} {\{B^{T} (t) {\hat{γ}}_{k} - B^{T} (t) γ_{0 k} + {\hat{β}}_{k} - β_{0 k}\}}^{2} d t + 2 \int_{0}^{1} {\{α_{k} (t) - β_{0 k} - B^{T} (t) γ_{0 k}\}}^{2} d t \\ = O_{p} (n^{- 2 r / (2 r + 1)}) . \end{align}$
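The inequality in the display above is the elementary bound $(a + b)^{2} \leq 2 a^{2} + 2 b^{2}$ applied pointwise in $t$ . As a short worked step, with $a (t)$ and $b (t)$ being labels introduced here only for illustration (the estimation error of the spline coefficients and the spline approximation error, respectively):

$\begin{align} a (t) & = B^{T} (t) ({\hat{γ}}_{k} - γ_{0 k}) + {\hat{β}}_{k} - β_{0 k}, b (t) = β_{0 k} + B^{T} (t) γ_{0 k} - α_{k} (t), \\ {\hat{α}}_{k} (t) - α_{k} (t) & = a (t) + b (t), \\ \int_{0}^{1} {\{a (t) + b (t)\}}^{2} d t & \leq 2 \int_{0}^{1} a^{2} (t) d t + 2 \int_{0}^{1} b^{2} (t) d t . \end{align}$

The first integral is controlled by the convergence rate $δ = n^{- r / (2 r + 1)}$ established through (A3), and the second by the spline approximation rate from Schumaker [29], so both terms are of order $n^{- 2 r / (2 r + 1)}$ .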

This completes the proof of Theorem 1; see also Fan and Li [18].

Proof of Theorem 2

To prove part (i), we just need to prove ${‖γ_{k}‖}_{H} = 0$ for $k \in 𝒞 \cup 𝒵$ . According to Theorem 1, it is sufficient to show that, for any $θ$ that satisfies $‖ {\hat{θ}}^{(𝒞)} - θ_{0}^{(𝒞)} ‖ = O_{p} (N^{- 1 / (2 r + 1)})$ and $‖ {\hat{θ}}^{(𝒱)} - θ_{0}^{(𝒱)} ‖ = O_{p} (N^{- 1 / (2 r + 1)})$ , and for some small $e = C n^{- 1 / (2 r + 1)}$ , as $n \to \infty$ , with probability tending to 1, we have

$\frac{\partial Q_{p} (θ)}{\partial γ_{k l}} < 0, - e < γ_{k l} < 0, l = 1, 2, \dots, L, k \in 𝒞 \cup 𝒵,$ (A4)

$\frac{\partial Q_{p} (θ)}{\partial γ_{k l}} > 0, e > γ_{k l} > 0, l = 1, 2, \dots, L, k \in 𝒞 \cup 𝒵 .$ (A5)

According to Equations (A1) and (A2), we have

$\begin{align} \frac{\partial Q_{p} (θ)}{\partial γ_{k l}} & = 2 n \frac{\partial {\hat{\overline{g}}}_{n} (θ)}{\partial γ_{k l}} Ω_{n}^{- 1} {\hat{\overline{g}}}_{n} (θ) + o_{p} (n^{- 1}) + n p_{λ_{1 k}}^{'} ({‖ γ_{k} ‖}_{H}) \frac{\sum_{j = 1}^{L} h_{l j} γ_{k j}}{‖ γ_{k} ‖_{H}} \\ = 2 n \frac{\partial {\hat{\overline{g}}}_{n} (θ)}{\partial γ_{k l}} Ω_{n}^{- 1} {\hat{\overline{g}}}_{n} (θ) + o_{p} (n^{- 1}) + n p_{λ_{1 k}}^{'} (‖ γ_{k} ‖_{H}) \frac{{(H γ_{k})}_{l}}{‖ γ_{k} ‖_{H}} \\ = n λ_{1 k} \{2 λ_{1 k}^{- 1} \frac{\partial {\hat{\overline{g}}}_{n} (θ)}{\partial γ_{k l}} Ω_{n}^{- 1} {\hat{\overline{g}}}_{n} (θ) + λ_{1 k}^{- 1} p_{λ_{1 k}}^{'} (‖ γ_{k} ‖_{H}) \frac{{(H γ_{k})}_{l}}{‖ γ_{k} ‖_{H}}\} + o_{p} (n^{- 1}) . \end{align}$

According to condition C10 and $n^{r / (2 r + 1)} λ_{\min} \to \infty$ , the sign of $\frac{\partial Q_{p} (θ)}{\partial γ_{k l}}$ is completely determined by that of $γ_{k l}$ , so Equations (A4) and (A5) hold. This completes the proof of part (i).

Similarly, to prove part (ii), we need ${‖γ_{k}‖}_{H} = 0$ for $k \in 𝒞 \cup 𝒵$ to hold with probability tending to one. It is clear that ${\hat{γ}}_{k} = 0$ for $k = 1, 2, \dots, c$ , so that ${\hat{α}}_{k} (\cdot)$ has been reduced to a constant; it remains to prove that ${\hat{β}}_{k} = 0$ for $k \in 𝒵$ . It is sufficient to show that, for any $β$ that satisfies $‖ {\hat{θ}}^{(𝒞)} - θ_{0}^{(𝒞)} ‖ = O_{p} (N^{- 1 / (2 r + 1)})$ and $‖ {\hat{θ}}^{(𝒱)} - θ_{0}^{(𝒱)} ‖ = O_{p} (N^{- 1 / (2 r + 1)})$ , and for some small $e = C n^{- 1 / (2 r + 1)}$ , as $n \to \infty$ , with probability tending to 1, we have

$\begin{align} \frac{\partial Q_{p} (θ)}{\partial β_{k}} & < 0, - e < β_{k} < 0, and \\ \frac{\partial Q_{p} (θ)}{\partial β_{k}} & > 0, e > β_{k} > 0, k = v + 1, v + 2, \dots, q . \end{align}$

Applying similar techniques as in the analysis of part (i), we have

$\begin{align} \frac{\partial Q_{p} (θ)}{\partial β_{k}} & = 2 n \frac{\partial {\hat{\overline{g}}}_{n}^{T} (θ)}{\partial β_{k}} Ω_{n}^{- 1} {\hat{\overline{g}}}_{n} (θ) + o_{p} (n^{- 1}) + n p_{λ_{2 k}}^{'} (|β_{k}|) sgn (β_{k}) \\ = n λ_{2 k} \{2 λ_{2 k}^{- 1} \frac{\partial {\hat{\overline{g}}}_{n}^{T} (θ)}{\partial β_{k}} Ω_{n}^{- 1} {\hat{\overline{g}}}_{n} (θ) + λ_{2 k}^{- 1} p_{λ_{2 k}}^{'} (|β_{k}|) sgn (β_{k})\} + o_{p} (n^{- 1}) . \end{align}$

It is clear that the sign of $\frac{\partial Q_{p} (θ)}{\partial β_{k}}$ is completely determined by the sign of $β_{k}$ . Then, with probability tending to 1, ${\hat{β}}_{k} = 0$ for $k \in 𝒵$ . This completes the proof of part (ii); see Xu et al. [23] and the references therein.

Proof of Theorem 3

Let $α_{0} (\cdot) = {(α_{01} (\cdot), α_{02} (\cdot), \dots, α_{0 q} (\cdot))}^{T}$ be the true coefficient functions in model (2), and write

$\begin{align} θ_{0} & = {(β_{0}^{T}, γ_{0}^{T})}^{T} = {({(β_{0}^{(𝒞)})}^{T}, {(β_{0}^{(𝒱)})}^{T}, {(β_{0}^{(𝒵)})}^{T}, {(γ_{0}^{(𝒞)})}^{T}, {(γ_{0}^{(𝒱)})}^{T}, {(γ_{0}^{(𝒵)})}^{T})}^{T}, \end{align}$

where $β_{0}^{(𝒞)} = {(β_{01}, \dots, β_{0 c})}^{T}, β_{0}^{(𝒱)} = {(β_{0 (c + 1)}, \dots, β_{0 v})}^{T}, β_{0}^{(𝒵)} = {(β_{0 (v + 1)}, \dots, β_{0 q})}^{T}$ , and $γ_{0}^{(𝒞)} = {(γ_{01}^{T}, \dots, γ_{0 c}^{T})}^{T}, γ_{0}^{(𝒱)} = {(γ_{0 (c + 1)}^{T}, \dots, γ_{0 v}^{T})}^{T}, γ_{0}^{(𝒵)} = {(γ_{0 (v + 1)}^{T}, \dots, γ_{0 q}^{T})}^{T} .$

Thus,

$\begin{align} (W_{i}, {\tilde{W}}_{i}) \hat{θ} & = W_{i}^{(𝒞)} {\hat{β}}^{(𝒞)} + W_{i}^{(𝒱)} {\hat{β}}^{(𝒱)} + {\tilde{W}}_{i}^{(𝒱)} {\hat{γ}}^{(𝒱)} . \end{align}$

Then, Theorems 1 and 2 imply that, as $n \to \infty$ , with probability tending to 1, the objective function $Q_{p} (θ)$ attains its minimum at $\hat{θ} = {({({\hat{β}}^{(𝒞)})}^{T}, {({\hat{β}}^{(𝒱)})}^{T}, {(0)}^{T}, {(0)}^{T}, {({\hat{γ}}^{(𝒱)})}^{T}, {(0)}^{T})}^{T} .$ Denote $Q_{1 p} (θ) = \frac{\partial Q_{p} (θ)}{\partial β_{1}^{(𝒞)}}, Q_{2 p} (θ) = \frac{\partial Q_{p} (θ)}{\partial γ^{(𝒱)}}$ . We know that

$\begin{array}{l} Q_{1 p} ({({({\hat{β}}^{(𝒞)})}^{T}, {({\hat{β}}^{(𝒱)})}^{T}, {(0)}^{T}, {(0)}^{T}, {({\hat{γ}}^{(𝒱)})}^{T}, {(0)}^{T})}^{T}) \\ = - \frac{2}{n} \sum_{i = 1}^{n} \sum_{r = 1}^{s} \sum_{r^{'} = 1}^{s} [{(X_{i}^{(𝒞)})}^{T} A_{i}^{- 1 / 2} M_{r} A_{i}^{- 1 / 2} (W_{i}, {\tilde{W}}_{i}) Ω_{r, r^{'}}^{- 1} \\ {{(W_{i}, {\tilde{W}}_{i})}^{T} A_{i}^{- 1 / 2} M_{r^{'}} A_{i}^{- 1 / 2} (Y_{i} - (W_{i}, {\tilde{W}}_{i}) \hat{θ}) + {\hat{D}}_{i}^{(r^{'})} \hat{θ}}] \\ + o_{p} (n^{- 1}) + \sum_{k = 1}^{c} p_{2 k}^{'} (| {\hat{β}}_{k} |) sgn ({\hat{β}}_{k}) \\ = - \frac{2}{n} \sum_{i = 1}^{n} {(X_{i}^{(𝒞)})}^{T} τ_{i} \{W_{i}^{(𝒞)} β^{(𝒞)} + W_{i}^{(𝒱)} β^{(𝒱)} + {\tilde{W}}_{i}^{(𝒱)} γ^{(𝒱)} \\ - [W_{i}^{(𝒞)} {\hat{β}}^{(𝒞)} + W_{i}^{(𝒱)} {\hat{β}}^{(𝒱)} + {\tilde{W}}_{i}^{(𝒱)} {\hat{γ}}^{(𝒱)}] + ε_{i} + J_{X R}\} \\ + o_{p} (n^{- 1}) + \sum_{k = 1}^{c} p_{2 k}^{'} (| {\hat{β}}_{k} |) sgn ({\hat{β}}_{k}) \\ = 0 . \end{array}$ (A6)

and

$\begin{align} Q_{2 p} ({({({\hat{β}}^{(𝒞)})}^{T}, {({\hat{β}}^{(𝒱)})}^{T}, {(0)}^{T}, {(0)}^{T}, {({\hat{γ}}^{(𝒱)})}^{T}, {(0)}^{T})}^{T}) \\ = - \frac{2}{n} \sum_{i = 1}^{n} \sum_{r = 1}^{s} \sum_{r^{'} = 1}^{s} [{({\tilde{X}}_{i}^{(𝒱)})}^{T} A_{i}^{- 1 / 2} M_{r} A_{i}^{- 1 / 2} (W_{i}, {\tilde{W}}_{i}) Ω_{r, r^{'}}^{- 1} \\ \{{(W_{i}, {\tilde{W}}_{i})}^{T} A_{i}^{- 1 / 2} M_{r^{'}} A_{i}^{- 1 / 2} (Y_{i} - [W_{i}^{(𝒞)} {\hat{β}}^{(𝒞)} + W_{i}^{(𝒱)} {\hat{β}}^{(𝒱)} \\ + {\tilde{W}}_{i}^{(𝒱)} {\hat{γ}}^{(𝒱)}]) + {\hat{D}}_{i}^{(r^{'})} \hat{θ}\}] + o_{p} (n^{- 1}) + \sum_{k = c + 1}^{v} p_{1 k}^{'} ({‖{\hat{γ}}_{k}‖}_{H}) \frac{H {\hat{γ}}_{k}}{{‖{\hat{γ}}_{k}‖}_{H}} \\ = - \frac{2}{n} \sum_{i = 1}^{n} {({\tilde{X}}_{i}^{(𝒱)})}^{T} τ_{i} \{W_{i}^{(𝒞)} β^{(𝒞)} + W_{i}^{(𝒱)} β^{(𝒱)} + {\tilde{W}}_{i}^{(𝒱)} γ^{(𝒱)} \\ - [W_{i}^{(𝒞)} {\hat{β}}^{(𝒞)} + W_{i}^{(𝒱)} {\hat{β}}^{(𝒱)} + {\tilde{W}}_{i}^{(𝒱)} {\hat{γ}}^{(𝒱)}] + ε_{i} + J_{X R}\} \\ + o_{p} (n^{- 1}) + \sum_{k = c + 1}^{v} p_{1 k}^{'} ({‖{\hat{γ}}_{k}‖}_{H}) \frac{H {\hat{γ}}_{k}}{{‖{\hat{γ}}_{k}‖}_{H}} \\ = 0 . \end{align}$ (A7)

where $Ω_{r r^{'}}^{- 1}$ is the $(r, r^{'})$ block of $Ω_{0}^{- 1}$ and $τ_{i} = \sum_{r = 1}^{s} \sum_{r^{'} = 1}^{s} A_{i}^{- 1 / 2} M_{r} A_{i}^{- 1 / 2} (W_{i}, {\tilde{W}}_{i}) Ω_{r r^{'}}^{- 1} {(W_{i}, {\tilde{W}}_{i})}^{T} A_{i}^{- 1 / 2} M_{r^{'}} A_{i}^{- 1 / 2}$ .

Applying a Taylor expansion to $p_{λ_{2 k}}^{'} (| {\hat{β}}_{k} |)$ , we have

$p_{λ_{2 k}}^{'} (| {\hat{β}}_{k} |) = p_{λ_{2 k}}^{'} (|β_{0 k}|) + \{p_{λ_{2 k}}^{″} (|β_{0 k}|) + o_{p} (1)\} ({\hat{β}}_{k} - β_{0 k}) .$

Condition C10 implies that $p_{λ_{2 k}}^{″} (|β_{0 k}|) = o_{p} (1)$ , and note that $p_{λ_{2 k}}^{'} (|β_{0 k}|) = 0$ as $λ_{\max} \to o_{p} (\hat{β} - β_{0})$ . Thus, $p_{λ_{2 k}}^{'} (| {\hat{β}}_{k} |) = 0$ . By the same argument, we know that ${‖{\hat{γ}}_{k}‖}_{H} \geq a λ_{1 k}$ for $n$ large enough. Thus, $p_{λ_{1 k}}^{'} ({‖γ_{0 k}‖}_{H}) = 0$ and $p_{λ_{1 k}}^{″} ({‖γ_{0 k}‖}_{H}) = 0$ , which imply that $p_{1 k}^{'} (‖ {\hat{γ}}_{k} ‖_{H}) = 0$ .
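The argument above relies on the penalty derivative vanishing once its argument exceeds $a λ$ . The section does not restate the penalty, so the sketch below assumes the SCAD penalty of Fan and Li [18] with the conventional default $a = 3.7$ ; the function name is hypothetical and the code is only an illustration of that flat‐tail property.

```python
import numpy as np

def scad_derivative(theta, lam, a=3.7):
    """Derivative p'_lam(|theta|) of the SCAD penalty (assumed penalty, a > 2).

    p'_lam(t) = lam                      for 0 <= t <= lam
              = (a*lam - t)_+ / (a - 1)  for t > lam
    """
    t = np.abs(np.asarray(theta, dtype=float))
    inner = lam * (t <= lam)                                   # constant (LASSO-like) region
    outer = np.maximum(a * lam - t, 0.0) / (a - 1.0) * (t > lam)  # linearly decaying, then flat
    return inner + outer

lam = 1.0
small = scad_derivative(0.5, lam)    # inside [0, lam]: full shrinkage rate lam
middle = scad_derivative(2.0, lam)   # between lam and a*lam: reduced rate
large = scad_derivative(5.0, lam)    # beyond a*lam: no shrinkage at all
```

Because the derivative is exactly zero beyond $a λ$ , large nonzero coefficients are left unpenalized, which is what makes the penalty terms drop out of (A6) and (A7) under this assumption.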

Hence, according to equations (A6) and (A7), we have

$\begin{align} - \frac{2}{n} \sum_{i = 1}^{n} {(X_{i}^{(𝒞)})}^{T} τ_{i} (W_{i}^{(𝒞)} (β^{(𝒞)} - {\hat{β}}^{(𝒞)}) + W_{i}^{(𝒱)} (β^{(𝒱)} - {\hat{β}}^{(𝒱)}) \\ + {\tilde{W}}_{i}^{(𝒱)} (γ^{(𝒱)} - {\hat{γ}}^{(𝒱)}) + ε_{i} + J_{X R}) + o_{p} (β^{(𝒞)} - {\hat{β}}^{(𝒞)}) = 0 . \end{align}$ (A8)

$\begin{align} - \frac{2}{n} \sum_{i = 1}^{n} {({\tilde{X}}_{i}^{(𝒱)})}^{T} τ_{i} (W_{i}^{(𝒞)} (β^{(𝒞)} - {\hat{β}}^{(𝒞)}) + W_{i}^{(𝒱)} (β^{(𝒱)} - {\hat{β}}^{(𝒱)}) \\ + {\tilde{W}}_{i}^{(𝒱)} (γ^{(𝒱)} - {\hat{γ}}^{(𝒱)}) + ε_{i} + J_{X R}) + o_{p} (Δ θ^{(𝒱)}) = 0 . \end{align}$ (A9)

To clearly present the combined term, we have

$W_{i}^{(𝒱)} Δ θ^{(𝒱)} = W_{i}^{(𝒱)} (β^{(𝒱)} - {\hat{β}}^{(𝒱)}) + {\tilde{W}}_{i}^{(𝒱)} (γ^{(𝒱)} - {\hat{γ}}^{(𝒱)}),$

where $Δ θ^{(𝒱)} = (\begin{array}{l} β^{(𝒱)} - {\hat{β}}^{(𝒱)} \\ γ^{(𝒱)} - {\hat{γ}}^{(𝒱)} \end{array})$ and $W_{i}^{(𝒱)} = (\begin{array}{l} W_{i}^{(𝒱)}, {\tilde{W}}_{i}^{(𝒱)} \end{array}) .$ Denote $Φ_{n} \equiv \frac{1}{n} \sum_{i = 1}^{n} {(X_{i}^{(𝒞)})}^{T} τ_{i} W_{i}^{(𝒞)}$ , $Ψ_{n} \equiv \frac{1}{n} \sum_{i = 1}^{n} {(X_{i}^{(𝒞)})}^{T} τ_{i} W_{i}^{(𝒱)}$ , and $ϱ \equiv \frac{1}{n} \sum_{i = 1}^{n} {(X_{i}^{(𝒞)})}^{T} τ_{i} (ε_{i} + J_{X R})$ . Then Equation (A8) can be rewritten as

$Φ_{n} (β^{(𝒞)} - {\hat{β}}^{(𝒞)}) + Ψ_{n} Δ θ^{(𝒱)} + ϱ + o_{p} (β^{(𝒞)} - {\hat{β}}^{(𝒞)}) = 0 .$

Similarly, denote $Ã \equiv \frac{1}{n} \sum_{i = 1}^{n} {({\tilde{X}}_{i}^{(𝒱)})}^{T} τ_{i} W_{i}^{(𝒞)}, \tilde{B} \equiv \frac{1}{n} \sum_{i = 1}^{n} {({\tilde{X}}_{i}^{(𝒱)})}^{T} τ_{i} W_{i}^{(𝒱)},$ and $\tilde{R} \equiv \frac{1}{n} \sum_{i = 1}^{n} {({\tilde{X}}_{i}^{(𝒱)})}^{T} τ_{i} (ε_{i} + J_{X R}) .$ Then Equation (A9) becomes $Ã (β^{(𝒞)} - {\hat{β}}^{(𝒞)}) + \tilde{B} Δ θ^{(𝒱)} + \tilde{R} + o_{p} (Δ θ^{(𝒱)}) = 0$ , so that $Δ θ^{(𝒱)} = - {\tilde{B}}^{- 1} Ã (β^{(𝒞)} - {\hat{β}}^{(𝒞)}) - {\tilde{B}}^{- 1} \tilde{R} + o_{p} (Δ θ^{(𝒱)})$ . Substituting $Δ θ^{(𝒱)}$ into the rewritten form of Equation (A8), we get

$\begin{align} Φ_{n} (β^{(𝒞)} - {\hat{β}}^{(𝒞)}) - Ψ_{n} {\tilde{B}}^{- 1} Ã (β^{(𝒞)} - {\hat{β}}^{(𝒞)}) - Ψ_{n} {\tilde{B}}^{- 1} \tilde{R} + ϱ + o_{p} (β^{(𝒞)} - {\hat{β}}^{(𝒞)}) = 0 . \end{align}$ (A10)

The foregoing formula is equivalent to

$\begin{align} \frac{1}{n} \sum_{i = 1}^{n} {(X_{i}^{(𝒞)})}^{T} τ_{i} W_{i}^{(𝒞)} (β^{(𝒞)} - {\hat{β}}^{(𝒞)}) \\ - [\frac{1}{n} \sum_{i = 1}^{n} {(X_{i}^{(𝒞)})}^{T} τ_{i} W_{i}^{(𝒱)}] {\tilde{B}}^{- 1} [\frac{1}{n} \sum_{i = 1}^{n} {({\tilde{X}}_{i}^{(𝒱)})}^{T} τ_{i} W_{i}^{(𝒞)}] (β^{(𝒞)} - {\hat{β}}^{(𝒞)}) \\ - [\frac{1}{n} \sum_{i = 1}^{n} {(X_{i}^{(𝒞)})}^{T} τ_{i} W_{i}^{(𝒱)}] {\tilde{B}}^{- 1} [\frac{1}{n} \sum_{i = 1}^{n} {({\tilde{X}}_{i}^{(𝒱)})}^{T} τ_{i} (ε_{i} + J_{X R})] \\ + \frac{1}{n} \sum_{i = 1}^{n} {(X_{i}^{(𝒞)})}^{T} τ_{i} (ε_{i} + J_{X R}) + o_{p} (β^{(𝒞)} - {\hat{β}}^{(𝒞)}) = 0 . \end{align}$
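The substitution that leads to (A10) is ordinary block elimination of the varying‐coefficient part from a two‐block linear system. The sketch below illustrates it with small hypothetical matrices and with the $o_{p}$ remainder terms ignored; `A_t`, `B_t` and `R_t` stand for $Ã$ , $\tilde{B}$ and $\tilde{R}$ , and `x`, `y` for $β^{(𝒞)} - {\hat{β}}^{(𝒞)}$ and $Δ θ^{(𝒱)}$ . All names and numbers are assumptions for illustration.

```python
import numpy as np

def eliminate_varying_block(Phi, Psi, rho, A_t, B_t, R_t):
    """Solve  Phi x + Psi y + rho = 0  and  A_t x + B_t y + R_t = 0  by eliminating y.

    Step 1: y = -B_t^{-1} A_t x - B_t^{-1} R_t          (from the second equation)
    Step 2: (Phi - Psi B_t^{-1} A_t) x = Psi B_t^{-1} R_t - rho   (cf. (A10))
    """
    B_inv_A = np.linalg.solve(B_t, A_t)
    B_inv_R = np.linalg.solve(B_t, R_t)
    schur = Phi - Psi @ B_inv_A            # coefficient of x after elimination
    x = np.linalg.solve(schur, Psi @ B_inv_R - rho)
    y = -B_inv_A @ x - B_inv_R
    return x, y

# Hypothetical toy system: constant block of dimension 2, varying block of dimension 1
Phi = np.array([[2.0, 0.0], [0.0, 3.0]])
Psi = np.array([[1.0], [0.0]])
rho = np.array([1.0, 2.0])
A_t = np.array([[1.0, 1.0]])
B_t = np.array([[4.0]])
R_t = np.array([3.0])
x, y = eliminate_varying_block(Phi, Psi, rho, A_t, B_t, R_t)
```

Both original equations are satisfied by the returned pair, so working with the reduced equation in `x` alone loses no information about the constant‐coefficient block.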

According to Equation (A10), we have

$\frac{1}{n} \sum_{i = 1}^{n} Ã {\tilde{B}}^{- 1} {({\tilde{X}}_{i}^{(𝒱)})}^{T} τ_{i} \{W_{i}^{(𝒞)} - W_{i}^{(𝒱)} {\tilde{B}}^{- 1} Ã\} = 0,$

$\frac{1}{n} \sum_{i = 1}^{n} Ã {\tilde{B}}^{- 1} {({\tilde{X}}_{i}^{(𝒱)})}^{T} τ_{i} [ε_{i} + J_{X R} - W_{i}^{(𝒱)} {\tilde{B}}^{- 1} \tilde{R}] = 0 .$

and

$\begin{align} \{\frac{1}{n} \sum_{i = 1}^{n} {{\overset{˘}{W}}_{i}^{(𝒞)} τ_{i} ({\overset{˘}{W}}_{i}^{(𝒞)})}^{T} + o_{p} (1)\} \sqrt{n} ({\hat{β}}_{1}^{(𝒞)} - β_{01}^{(𝒞)}) \\ = \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} {\overset{˘}{W}}_{i}^{(𝒞)} τ_{i} ε_{i} - \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} {\overset{˘}{W}}_{i}^{(𝒞)} τ_{i} {\tilde{X}}_{i}^{(𝒱)} [{\tilde{B}}^{- 1} + o_{p} (1)] \tilde{R} + \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} {\overset{˘}{W}}_{i}^{(𝒞)} τ_{i} J_{X R} \\ = G_{1} + G_{2} + G_{3}, \end{align}$

where ${\overset{˘}{W}}_{i}^{(C)} = {(W_{i}^{(C)})}^{T} - Ã {\tilde{B}}^{- 1} {(W_{i}^{(𝒱)})}^{T}$ .

It is clear that $\frac{1}{\sqrt{n}} \sum_{i = 1}^{n} {\overset{˘}{W}}_{i}^{(C)} τ_{i} {\tilde{X}}_{i}^{(𝒱)} = 0$ implies $G_{2} = 0$ . Using the law of large numbers, we can obtain $\frac{1}{n} \sum_{i = 1}^{n} {\overset{˘}{W}}_{i}^{(C)} τ_{i} τ_{i} {({\overset{˘}{W}}_{i}^{(C)})}^{T} \overset{P}{\to} A$ . According to the central limit theorem, we have

$G_{1} = \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} {\overset{˘}{W}}_{i}^{(𝒞)} τ_{i} ε_{i} \overset{ℒ}{\to} N (0, B),$

$\begin{align} A & = E \{{(W^{(𝒞)})}^{T} τ_{i} - E ({({\tilde{X}}^{(𝒱)})}^{T} diag (τ) W^{(𝒞)} | t). \\ {[E ({({\tilde{X}}^{(𝒱)})}^{T} diag (τ) W^{(𝒱)} | t)]}^{- 1} E ({(W^{(𝒱)})}^{T} τ | t)\}^{\otimes 2}, \end{align}$

$\begin{align} B & = E \{{(W^{(𝒞)})}^{T} τ_{i} - E ({({\tilde{X}}^{(𝒱)})}^{T} diag (τ) W^{(𝒱)} | t) \\ {[E ({({\tilde{X}}^{(𝒱)})}^{T} diag (τ) W^{(𝒱)} | t)]}^{- 1} E ({(W^{(𝒱)})}^{T} τ | t) ε\}^{\otimes 2}, \end{align}$

where the superscript “ $\otimes 2$ ” denotes the matrix operator defined, for a matrix $M$ , by $M^{\otimes 2} = M M^{T}$ , and

$\begin{array}{rl} W^{(𝒞)} & = {({(W_{1}^{(𝒞)})}^{T}, {(W_{2}^{(𝒞)})}^{T}, \dots, {(W_{n}^{(𝒞)})}^{T})}^{T}, \\ W^{(𝒱)} & = {({(W_{1}^{(𝒱)})}^{T}, {(W_{2}^{(𝒱)})}^{T}, \dots, {(W_{n}^{(𝒱)})}^{T})}^{T}, \\ {\tilde{X}}^{(𝒱)} & = {({({\tilde{X}}_{1}^{(𝒱)})}^{T}, {({\tilde{X}}_{2}^{(𝒱)})}^{T}, \dots, {({\tilde{X}}_{n}^{(𝒱)})}^{T})}^{T}, τ = {(τ_{1}, τ_{2}, \dots, τ_{n})}^{T} . \end{array}$

Following Tian, Xue and Liu [28], we know that $G_{3} = o_{p} (1)$ . By Slutsky's theorem, Theorem 3 is proved.

Some Additional Numerical Results

TABLE A1. Structure identification and variable selection with the EX correlation structure ($n_{i} = 10$).

| $ρ$ | $σ_{u}$ | $n$ | Method | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.7 | 0.2 | 300 | LASSO | 2.7200 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2800 | 0.7400 |
| | | | MCP | 2.6800 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3200 | 0.7200 |
| | | | SCAD | 2.7150 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2850 | 0.7500 |
| | | | nSCAD | 2.6750 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3250 | 0.7450 |
| | | 350 | LASSO | 2.8200 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1800 | 0.8650 |
| | | | MCP | 2.8000 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2000 | 0.8450 |
| | | | SCAD | 2.8450 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1550 | 0.8900 |
| | | | nSCAD | 2.8600 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1400 | 0.8850 |
| | | 400 | LASSO | 2.8600 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1400 | 0.9000 |
| | | | MCP | 2.8350 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1650 | 0.8750 |
| | | | SCAD | 2.9200 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0800 | 0.9300 |
| | | | nSCAD | 2.8950 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1050 | 0.9100 |
| | 0.4 | 300 | LASSO | 2.6500 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3500 | 0.6800 |
| | | | MCP | 2.6450 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3550 | 0.6800 |
| | | | SCAD | 2.7200 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2800 | 0.7350 |
| | | | nSCAD | 2.5700 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4300 | 0.6200 |
| | | 350 | LASSO | 2.6900 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3100 | 0.7200 |
| | | | MCP | 2.6850 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3150 | 0.7250 |
| | | | SCAD | 2.8450 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1550 | 0.8800 |
| | | | nSCAD | 2.6000 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4000 | 0.6550 |
| | | 400 | LASSO | 2.8100 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1900 | 0.8500 |
| | | | MCP | 2.7800 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2200 | 0.8250 |
| | | | SCAD | 2.8700 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1300 | 0.9000 |
| | | | nSCAD | 2.7450 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2550 | 0.8000 |
| | 0.6 | 300 | LASSO | 2.3500 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.6500 | 0.5100 |
| | | | MCP | 2.2350 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.7650 | 0.4800 |
| | | | SCAD | 2.6600 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3400 | 0.7300 |
| | | | nSCAD | 2.1750 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.8250 | 0.4150 |
| | | 350 | LASSO | 2.4450 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.5550 | 0.5450 |
| | | | MCP | 2.4450 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.5550 | 0.5500 |
| | | | SCAD | 2.7800 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2200 | 0.8300 |
| | | | nSCAD | 2.3150 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.6850 | 0.4600 |
| | | 400 | LASSO | 2.7100 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2900 | 0.7400 |
| | | | MCP | 2.6500 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3500 | 0.6950 |
| | | | SCAD | 2.8500 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1500 | 0.8550 |
| | | | nSCAD | 2.5000 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.5000 | 0.5800 |

TABLE A2. Structure identification and variable selection with the EX correlation structure ($n_{i} = 20$).

| $ρ$ | $σ_{u}$ | $n$ | Method | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.7 | 0.2 | 300 | LASSO | 2.8550 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1450 | 0.8700 |
| | | | MCP | 2.8200 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1800 | 0.8500 |
| | | | SCAD | 2.9000 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1000 | 0.9150 |
| | | | nSCAD | 2.7700 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2300 | 0.8200 |
| | | 350 | LASSO | 2.8650 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1350 | 0.9000 |
| | | | MCP | 2.8500 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1500 | 0.8850 |
| | | | SCAD | 2.9550 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0450 | 0.9650 |
| | | | nSCAD | 2.8750 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1250 | 0.9000 |
| | | 400 | LASSO | 2.9000 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1000 | 0.9050 |
| | | | MCP | 2.8950 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1050 | 0.9000 |
| | | | SCAD | 2.9850 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0150 | 0.9850 |
| | | | nSCAD | 2.8950 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1050 | 0.9250 |
| | 0.4 | 300 | LASSO | 2.7800 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2200 | 0.8200 |
| | | | MCP | 2.7300 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2700 | 0.7900 |
| | | | SCAD | 2.8650 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1350 | 0.8850 |
| | | | nSCAD | 2.7200 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2800 | 0.7450 |
| | | 350 | LASSO | 2.8600 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1400 | 0.8650 |
| | | | MCP | 2.8250 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1750 | 0.8450 |
| | | | SCAD | 2.9100 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0900 | 0.9150 |
| | | | nSCAD | 2.7200 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2800 | 0.7650 |
| | | 400 | LASSO | 2.8600 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1400 | 0.8950 |
| | | | MCP | 2.8350 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1650 | 0.8750 |
| | | | SCAD | 2.9100 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0900 | 0.9250 |
| | | | nSCAD | 2.8000 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2000 | 0.8200 |
| | 0.6 | 300 | LASSO | 2.6950 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3050 | 0.7350 |
| | | | MCP | 2.6750 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3250 | 0.7300 |
| | | | SCAD | 2.8200 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1800 | 0.8550 |
| | | | nSCAD | 2.5300 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4700 | 0.6150 |
| | | 350 | LASSO | 2.8100 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1900 | 0.8400 |
| | | | MCP | 2.7650 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2350 | 0.8150 |
| | | | SCAD | 2.8550 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1450 | 0.8800 |
| | | | nSCAD | 2.5650 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4350 | 0.6300 |
| | | 400 | LASSO | 2.8750 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1250 | 0.8900 |
| | | | MCP | 2.8550 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1450 | 0.8750 |
| | | | SCAD | 2.9050 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0950 | 0.9150 |
| | | | nSCAD | 2.6500 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3500 | 0.7050 |

TABLE A3. Structure identification and variable selection with the AR(1) correlation structure ($n_{i} = 10$).

| $ρ$ | $σ_{u}$ | $n$ | Method | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.7 | 0.2 | 300 | LASSO | 2.6450 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3550 | 0.6900 |
| | | | MCP | 2.6100 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3900 | 0.6700 |
| | | | SCAD | 2.6400 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3600 | 0.6800 |
| | | | nSCAD | 2.6050 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3950 | 0.6650 |
| | | 350 | LASSO | 2.6700 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3300 | 0.7150 |
| | | | MCP | 2.7150 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2850 | 0.7450 |
| | | | SCAD | 2.7300 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2700 | 0.7500 |
| | | | nSCAD | 2.6950 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3050 | 0.7150 |
| | | 400 | LASSO | 2.8600 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1400 | 0.8750 |
| | | | MCP | 2.8300 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1700 | 0.8600 |
| | | | SCAD | 2.8700 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1300 | 0.8850 |
| | | | nSCAD | 2.8400 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1600 | 0.8550 |
| | 0.4 | 300 | LASSO | 2.5950 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4050 | 0.6500 |
| | | | MCP | 2.6000 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4000 | 0.6350 |
| | | | SCAD | 2.5950 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4050 | 0.6200 |
| | | | nSCAD | 2.5300 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4700 | 0.6100 |
| | | 350 | LASSO | 2.6000 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4000 | 0.6500 |
| | | | MCP | 2.5950 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4050 | 0.6400 |
| | | | SCAD | 2.6100 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3900 | 0.6600 |
| | | | nSCAD | 2.5100 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4900 | 0.6100 |
| | | 400 | LASSO | 2.6800 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3200 | 0.7250 |
| | | | MCP | 2.7050 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2950 | 0.7350 |
| | | | SCAD | 2.8650 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1350 | 0.8800 |
| | | | nSCAD | 2.5750 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4250 | 0.6450 |
| | 0.6 | 300 | LASSO | 2.5450 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4550 | 0.6100 |
| | | | MCP | 2.4450 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.5550 | 0.5400 |
| | | | SCAD | 2.5150 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4850 | 0.5850 |
| | | | nSCAD | 2.3050 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.6950 | 0.4450 |
| | | 350 | LASSO | 2.6600 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3400 | 0.7100 |
| | | | MCP | 2.5700 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4300 | 0.6300 |
| | | | SCAD | 2.6000 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4000 | 0.6500 |
| | | | nSCAD | 2.4950 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.5050 | 0.5800 |
| | | 400 | LASSO | 2.6450 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3550 | 0.6750 |
| | | | MCP | 2.6450 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3550 | 0.6800 |
| | | | SCAD | 2.6550 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3450 | 0.6750 |
| | | | nSCAD | 2.5500 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4500 | 0.6150 |

TABLE A4. Structure identification and variable selection with the AR(1) correlation structure ($n_{i} = 20$).

| $ρ$ | $σ_{u}$ | $n$ | Method | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.7 | 0.2 | 300 | LASSO | 2.8200 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1800 | 0.8400 |
| | | | MCP | 2.8250 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1750 | 0.8400 |
| | | | SCAD | 2.8400 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1600 | 0.8500 |
| | | | nSCAD | 2.7750 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2250 | 0.8000 |
| | | 350 | LASSO | 2.8300 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1700 | 0.8750 |
| | | | MCP | 2.8550 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1450 | 0.8700 |
| | | | SCAD | 2.8750 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1250 | 0.8900 |
| | | | nSCAD | 2.8150 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1850 | 0.8550 |
| | | 400 | LASSO | 2.9300 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0700 | 0.9350 |
| | | | MCP | 2.9000 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1000 | 0.9200 |
| | | | SCAD | 2.9600 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0400 | 0.9600 |
| | | | nSCAD | 2.8250 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1750 | 0.8700 |
| | 0.4 | 300 | LASSO | 2.7350 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2650 | 0.7600 |
| | | | MCP | 2.7350 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2650 | 0.7550 |
| | | | SCAD | 2.8000 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2000 | 0.8100 |
| | | | nSCAD | 2.6200 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3800 | 0.6650 |
| | | 350 | LASSO | 2.8650 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1350 | 0.8650 |
| | | | MCP | 2.8050 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1950 | 0.8300 |
| | | | SCAD | 2.8600 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1400 | 0.8750 |
| | | | nSCAD | 2.7100 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2900 | 0.7450 |
| | | 400 | LASSO | 2.8550 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1450 | 0.8650 |
| | | | MCP | 2.8450 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1550 | 0.8550 |
| | | | SCAD | 2.8950 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1050 | 0.9000 |
| | | | nSCAD | 2.7200 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2800 | 0.7650 |
| | 0.6 | 300 | LASSO | 2.7000 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3000 | 0.7350 |
| | | | MCP | 2.7050 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2950 | 0.7400 |
| | | | SCAD | 2.7750 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2250 | 0.8050 |
| | | | nSCAD | 2.5650 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4350 | 0.6250 |
| | | 350 | LASSO | 2.7500 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2500 | 0.7800 |
| | | | MCP | 2.7250 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2750 | 0.7700 |
| | | | SCAD | 2.8300 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1700 | 0.8450 |
| | | | nSCAD | 2.5700 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4300 | 0.6400 |
| | | 400 | LASSO | 2.8400 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1600 | 0.8650 |
| | | | MCP | 2.8050 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1950 | 0.8450 |
| | | | SCAD | 2.8700 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1300 | 0.8900 |
| | | | nSCAD | 2.6350 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3650 | 0.6850 |

TABLE A5. Model estimation results ($q = 20, n_{i} = 20, ρ = 0.7$).

| Corstr | $ρ$ | $σ_{u}$ | Method | GMSE ($n = 400$) | RASE ($n = 400$) | GMSE ($n = 500$) | RASE ($n = 500$) | GMSE ($n = 600$) | RASE ($n = 600$) |
|---|---|---|---|---|---|---|---|---|---|
| AR(1) | 0.7 | 0.4 | LASSO | 0.016985 | 0.062900 | 0.016123 | 0.056167 | 0.015715 | 0.052109 |
| | | | MCP | 0.008156 | 0.061890 | 0.005986 | 0.053690 | 0.003870 | 0.049172 |
| | | | SCAD | 0.008249 | 0.061449 | 0.006156 | 0.053667 | 0.004017 | 0.049403 |
| | | | nSCAD | 0.144295 | 0.078095 | 0.144290 | 0.071223 | 0.142096 | 0.067723 |
| | | 0.6 | LASSO | 0.066033 | 0.099954 | 0.065660 | 0.087637 | 0.063251 | 0.080042 |
| | | | MCP | 0.017984 | 0.091795 | 0.011278 | 0.079057 | 0.008998 | 0.070233 |
| | | | SCAD | 0.019279 | 0.093662 | 0.012399 | 0.080058 | 0.009650 | 0.071932 |
| | | | nSCAD | 0.692059 | 0.141239 | 0.663470 | 0.133779 | 0.657981 | 0.125970 |
| EX | 0.7 | 0.4 | LASSO | 0.019243 | 0.063297 | 0.016841 | 0.055279 | 0.016527 | 0.050517 |
| | | | MCP | 0.008648 | 0.061288 | 0.005200 | 0.052335 | 0.004022 | 0.048058 |
| | | | SCAD | 0.008824 | 0.061437 | 0.005602 | 0.052955 | 0.004182 | 0.048277 |
| | | | nSCAD | 0.175114 | 0.072692 | 0.166807 | 0.066381 | 0.158860 | 0.061465 |
| | | 0.6 | LASSO | 0.072644 | 0.098746 | 0.071361 | 0.086338 | 0.067447 | 0.079995 |
| | | | MCP | 0.017900 | 0.092662 | 0.011884 | 0.077784 | 0.008770 | 0.070767 |
| | | | SCAD | 0.018550 | 0.093136 | 0.011706 | 0.078541 | 0.008805 | 0.071786 |
| | | | nSCAD | 0.760916 | 0.131829 | 0.735102 | 0.126376 | 0.715669 | 0.120093 |

TABLE A6. Structure identification and variable selection results ($q = 20, n_{i} = 20, ρ = 0.7$).

| Corstr | $σ_{u}$ | $n$ | Method | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| EX | 0.4 | 400 | LASSO | 14.8950 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1050 | 0.9200 |
| | | | MCP | 14.8650 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1350 | 0.8900 |
| | | | SCAD | 14.9450 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0550 | 0.9450 |
| | | | nSCAD | 14.5550 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4450 | 0.7150 |
| | | 500 | LASSO | 14.9050 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0950 | 0.9150 |
| | | | MCP | 14.8850 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1150 | 0.9050 |
| | | | SCAD | 14.9150 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0850 | 0.9700 |
| | | | nSCAD | 14.5550 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4450 | 0.7600 |
| | | 600 | LASSO | 14.8800 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1200 | 0.9550 |
| | | | MCP | 14.9300 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0700 | 0.9500 |
| | | | SCAD | 14.9300 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0700 | 0.9900 |
| | | | nSCAD | 14.6550 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3450 | 0.7850 |
| | 0.6 | 400 | LASSO | 13.2200 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 1.7800 | 0.5750 |
| | | | MCP | 12.4450 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 2.5550 | 0.4600 |
| | | | SCAD | 14.5550 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4450 | 0.8550 |
| | | | nSCAD | 13.1600 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 1.8400 | 0.6700 |
| | | 500 | LASSO | 14.4250 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.5750 | 0.6850 |
| | | | MCP | 14.0250 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.9750 | 0.4950 |
| | | | SCAD | 14.8000 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2000 | 0.9150 |
| | | | nSCAD | 13.7800 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 1.2200 | 0.7300 |
| | | 600 | LASSO | 14.8250 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1750 | 0.8500 |
| | | | MCP | 14.7400 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2600 | 0.8200 |
| | | | SCAD | 14.9050 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0950 | 0.9300 |
| | | | nSCAD | 14.1350 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.8650 | 0.7600 |
| AR(1) | 0.4 | 400 | LASSO | 14.7650 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2350 | 0.8100 |
| | | | MCP | 14.7500 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2500 | 0.7800 |
| | | | SCAD | 14.9100 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0900 | 0.9350 |
| | | | nSCAD | 14.2950 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.7050 | 0.6500 |
| | | 500 | LASSO | 14.9650 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0350 | 0.9700 |
| | | | MCP | 14.9550 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0450 | 0.9800 |
| | | | SCAD | 14.9300 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0700 | 0.9750 |
| | | | nSCAD | 14.6600 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3400 | 0.8200 |
| | | 600 | LASSO | 14.9900 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0100 | 0.9900 |
| | | | MCP | 14.9000 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1000 | 0.9900 |
| | | | SCAD | 14.9950 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0050 | 0.9950 |
| | | | nSCAD | 14.8200 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1800 | 0.8600 |
| | 0.6 | 400 | LASSO | 13.1750 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 1.8250 | 0.5900 |
| | | | MCP | 12.1650 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 2.8350 | 0.4500 |
| | | | SCAD | 14.1700 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.8300 | 0.6700 |
| | | | nSCAD | 12.7650 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 2.2350 | 0.5100 |
| | | 500 | LASSO | 14.7050 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2950 | 0.7850 |
| | | | MCP | 14.5800 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4200 | 0.6900 |
| | | | SCAD | 14.9200 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0800 | 0.9500 |
| | | | nSCAD | 13.9900 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 1.0100 | 0.8000 |
| | | 600 | LASSO | 14.9050 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0950 | 0.9250 |
| | | | MCP | 14.8400 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1600 | 0.8800 |
| | | | SCAD | 14.9700 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0300 | 0.9750 |
| | | | nSCAD | 14.2900 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.7100 | 0.8050 |

TABLE A7. Model estimation results ($q = 50, n_{i} = 20$).

| Corstr | $ρ$ | $σ_{u}$ | Method | GMSE ($n = 400$) | RASE ($n = 400$) | GMSE ($n = 700$) | RASE ($n = 700$) | GMSE ($n = 1000$) | RASE ($n = 1000$) |
|---|---|---|---|---|---|---|---|---|---|
| AR(1) | 0.3 | 0.6 | LASSO | 0.064341 | 0.114592 | 0.031191 | 0.067650 | 0.042419 | 0.059851 |
| | | | MCP | 0.029075 | 0.106077 | 0.015399 | 0.066843 | 0.010787 | 0.056794 |
| | | | SCAD | 0.027497 | 0.108374 | 0.015930 | 0.067008 | 0.011287 | 0.056623 |
| | | | nSCAD | 0.146192 | 0.137163 | 0.128575 | 0.082160 | 0.125350 | 0.074936 |
| | | 0.7 | LASSO | 0.112530 | 0.138968 | 0.051961 | 0.079297 | 0.043889 | 0.070980 |
| | | | MCP | 0.043292 | 0.128857 | 0.022971 | 0.075008 | 0.020832 | 0.069268 |
| | | | SCAD | 0.046243 | 0.129316 | 0.025064 | 0.075690 | 0.021372 | 0.069486 |
| | | | nSCAD | 0.253093 | 0.170923 | 0.222431 | 0.103973 | 0.184898 | 0.084143 |
| EX | 0.7 | 0.6 | LASSO | 0.089834 | 0.131975 | 0.033852 | 0.069576 | 0.039466 | 0.057969 |
| | | | MCP | 0.042206 | 0.124017 | 0.018991 | 0.067334 | 0.011058 | 0.055223 |
| | | | SCAD | 0.037738 | 0.122862 | 0.018790 | 0.067459 | 0.011709 | 0.055316 |
| | | | nSCAD | 0.211071 | 0.150111 | 0.190835 | 0.084163 | 0.155614 | 0.072097 |
| | | 0.7 | LASSO | 0.147694 | 0.157707 | 0.059348 | 0.076340 | 0.049364 | 0.071095 |
| | | | MCP | 0.048806 | 0.145195 | 0.021851 | 0.072973 | 0.020772 | 0.068934 |
| | | | SCAD | 0.048354 | 0.147517 | 0.023644 | 0.073777 | 0.021018 | 0.068360 |
| | | | nSCAD | 0.316021 | 0.190023 | 0.352679 | 0.097988 | 0.191410 | 0.083184 |

TABLE A8. Structure identification and variable selection results ($q = 50, n_{i} = 20$).

| Corstr | $σ_{u}$ | $n$ | Method | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| EX | 0.6 | 400 | LASSO | 43.7350 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 1.2650 | 0.6300 |
| | | | MCP | 44.3550 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.6450 | 0.6600 |
| | | | SCAD | 44.5850 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4150 | 0.7850 |
| | | | nSCAD | 43.5950 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 1.4050 | 0.4250 |
| | | 700 | LASSO | 44.5250 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4750 | 0.7950 |
| | | | MCP | 44.6300 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3700 | 0.8100 |
| | | | SCAD | 44.6950 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3050 | 0.8550 |
| | | | nSCAD | 43.4700 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 1.5300 | 0.4700 |
| | | 1000 | LASSO | 44.8850 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1150 | 0.9800 |
| | | | MCP | 44.7750 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2250 | 0.9750 |
| | | | SCAD | 44.7150 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2850 | 0.9650 |
| | | | nSCAD | 44.3200 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.6800 | 0.7150 |
| | 0.7 | 400 | LASSO | 43.9000 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 1.1000 | 0.5000 |
| | | | MCP | 44.2400 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.7600 | 0.5800 |
| | | | SCAD | 44.2700 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.7300 | 0.7500 |
| | | | nSCAD | 42.5300 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 2.4700 | 0.2500 |
| | | 700 | LASSO | 44.3050 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.6950 | 0.7650 |
| | | | MCP | 44.5450 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4550 | 0.7500 |
| | | | SCAD | 44.7350 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2650 | 0.8200 |
| | | | nSCAD | 43.0150 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 1.9850 | 0.4100 |
| | | 1000 | LASSO | 44.8650 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1350 | 0.9050 |
| | | | MCP | 44.8200 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1800 | 0.8800 |
| | | | SCAD | 44.8050 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1950 | 0.8650 |
| | | | nSCAD | 43.8600 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 1.1400 | 0.6250 |
| AR(1) | 0.6 | 400 | LASSO | 44.3300 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.6700 | |
| | | | MCP | 44.4300 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.5700 | 0.6800 |
| | | | SCAD | 44.6150 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3850 | 0.8150 |
| | | | nSCAD | 43.1700 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 1.8300 | 0.3450 |
| | | 700 | LASSO | 44.7150 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2850 | 0.9400 |
| | | | MCP | 44.7300 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2700 | 0.8900 |
| | | | SCAD | 44.7950 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2050 | 0.9050 |
| | | | nSCAD | 43.7650 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 1.2350 | 0.5450 |
| | | 1000 | LASSO | 44.9100 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.0900 | 0.9900 |
| | | | MCP | 44.8600 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1400 | 0.9800 |
| | | | SCAD | 44.7650 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2350 | 0.9900 |
| | | | nSCAD | 44.6150 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3850 | 0.9050 |
| | 0.7 | 400 | LASSO | 44.2400 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.7600 | 0.6000 |
| | | | MCP | 43.9600 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 1.0400 | 0.5700 |
| | | | SCAD | 44.5100 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.4900 | 0.7650 |
| | | | nSCAD | 42.5300 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 2.4700 | 0.3000 |
| | | 700 | LASSO | 44.7250 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2750 | 0.8300 |
| | | | MCP | 44.4550 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.5450 | 0.8000 |
| | | | SCAD | 44.7500 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.2500 | 0.8250 |
| | | | nSCAD | 43.2500 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 1.7500 | 0.3850 |
| | | 1000 | LASSO | 44.8050 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.1950 | 0.8900 |
| | | | MCP | 44.6400 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.3600 | 0.8850 |
| | | | SCAD | 44.2850 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 0.7150 | 0.8800 |
| | | | nSCAD | 43.8650 | 0.0000 | 3.0000 | 0.0000 | 2.0000 | 1.1350 | 0.6950 |

Higher Dimensional Simulation Results

In the higher dimensional case, under reasonable conditions, we relaxed the criterion used in our code for determining whether the varying coefficient is zero. According to Equation (21), nonzero values of $‖ γ_{k} ‖_{H}$ are identified and selected. In our simulation with eight varying coefficients, we adopt the threshold $‖ γ_{k} ‖_{H} > 10^{- 7}$ to declare the coefficient function $α_{k} (\cdot)$ as varying, whereas $‖ γ_{k} ‖_{H} < 10^{- 7}$ indicates that it is constant. In the higher dimensional settings, that is, $q = 20, 50$, we replace the threshold $10^{- 7}$ by $10^{- 3}$. We reached the same conclusions in the higher dimensional settings. The detailed results are as follows.
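The thresholding rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `classify_structure` and the example norm values are hypothetical, the norms $‖ γ_{k} ‖_{H}$ are taken as already computed, and a norm exactly equal to the threshold is treated as constant, a case the text does not specify.

```python
def classify_structure(gamma_norms, threshold=1e-7):
    """Label each coefficient 'varying' if its norm exceeds the threshold, else 'constant'.

    gamma_norms: precomputed values of ||gamma_k||_H (assumed given).
    threshold:   1e-7 in the eight-coefficient simulation, 1e-3 for q = 20, 50.
    """
    return ["varying" if norm > threshold else "constant" for norm in gamma_norms]


# Hypothetical norms, chosen only to show the effect of the two thresholds.
norms = [0.0, 5e-5, 0.8]
low_dim = classify_structure(norms, threshold=1e-7)
high_dim = classify_structure(norms, threshold=1e-3)
```

With these illustrative values, the middle coefficient is labelled varying under the $10^{- 7}$ threshold but constant under the relaxed $10^{- 3}$ threshold, which is the practical effect of the relaxation.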

1. Suppose that the true model (2) satisfies $𝒞 = \{1, 2\}, 𝒱 = \{3, 4, 5\}, 𝒵 = \{6, 7, \dots, 20\}$ and
$\begin{align} α^{(𝒞)} (t) & = (α_{1} (t), α_{2} (t)) = (5, 6), \\ α^{(𝒱)} (t) & = (α_{3} (t), α_{4} (t), α_{5} (t)) \\ & = (0.7 \cdot e^{2 t - 1}, 1.5 \cdot \sin (π t), 0.2 \cdot {(2 - 2 t)}^{3}), \\ α^{(𝒵)} (t) & = (α_{6} (t), α_{7} (t), \dots, α_{20} (t)) = (0, 0, \dots, 0) . \end{align}$

We took $X_{i j} \sim N (3, σ_{X}^{2} I_{20})$ and $u_{i j} \sim N (0, σ_{u}^{2} I_{20})$, where $j = 1, 2, \dots, 20$, $σ_{X} = 3$, and $I_{20}$ is the $20 \times 20$ identity matrix. We set $σ_{u}$ to $0.4$ and $0.6$, and $t_{i j} \sim U [0, 1]$. The errors satisfy $ε_{i} = {(ε_{i 1}, ε_{i 2}, \dots, ε_{i n_{i}})}^{T} \sim N (0, σ^{2} Corr (ε_{i}, ρ))$, where $σ^{2} = 1$ and $Corr (ε_{i}, ρ)$ is a known correlation matrix with parameter $ρ$. Thus, we get $A_{i} = diag (1, 1, \dots, 1)$. We set $n = 400, 500, 600$ and $n_{i} = 20$, and $ε_{i}$ has either the first‐order autoregressive (AR(1)) or the exchangeable (EX) correlation structure with $ρ = 0.7$. A cubic B‐spline basis was used with knots equally spaced in $[0, 1]$ and $K = ⌊c_{0} \times N^{1 / 5}⌋$, where $⌊ c_{0} ⌋$ denotes the largest integer not greater than $c_{0}$ [28]. The results are reported in Tables A5 and A6.

2. Suppose that the true model (2) satisfies $𝒞 = \{1, 2\}, 𝒱 = \{3, 4, 5\}, 𝒵 = \{6, 7, \dots, 50\}$ and
$\begin{align} α^{(𝒞)} (t) & = (α_{1} (t), α_{2} (t)) = (5, 6), \\ α^{(𝒱)} (t) & = (α_{3} (t), α_{4} (t), α_{5} (t)) = (e^{2 t}, 6 \cdot \sin (π t), {(2 - 2 t)}^{3}), \\ α^{(𝒵)} (t) & = (α_{6} (t), α_{7} (t), \dots, α_{50} (t)) = (0, 0, \dots, 0) . \end{align}$

We took $X_{i j} \sim N (5, σ_{X}^{2} I_{50})$ and $u_{i j} \sim N (0, σ_{u}^{2} I_{50})$, where $j = 1, 2, \dots, 50$, $σ_{X} = 5$, and $I_{50}$ is the $50 \times 50$ identity matrix. We set $σ_{u}$ to $0.6$ and $0.7$, and $t_{i j} \sim U [0, 1]$. The errors satisfy $ε_{i} = {(ε_{i 1}, ε_{i 2}, \dots, ε_{i n_{i}})}^{T} \sim N (0, σ^{2} Corr (ε_{i}, ρ))$, where $σ^{2} = 1$ and $Corr (ε_{i}, ρ)$ is a known correlation matrix with parameter $ρ$. Thus, we get $A_{i} = diag (1, 1, \dots, 1)$. We set $n = 400, 700, 1000$ and $n_{i} = 20$, and $ε_{i}$ has either the first‐order autoregressive (AR(1)) correlation structure with $ρ = 0.3$ or the exchangeable (EX) correlation structure with $ρ = 0.7$. A cubic B‐spline basis was used with knots equally spaced in $[0, 1]$ and $K = ⌊c_{0} \times N^{1 / 5}⌋$, where $⌊ c_{0} ⌋$ denotes the largest integer not greater than $c_{0}$ [28]. The results are reported in Tables A7 and A8.
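The ingredients of the two simulation designs above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the EX and AR(1) matrices use their standard textbook forms (constant off‐diagonal $ρ$ and $ρ^{|j - k|}$, respectively), and both $c_{0}$ and the interpretation of $N$ as the total number of observations are assumptions, since neither is fixed in this section.

```python
import math


def ex_corr(n_i, rho):
    """Exchangeable correlation: 1 on the diagonal, rho elsewhere (standard form)."""
    return [[1.0 if j == k else rho for k in range(n_i)] for j in range(n_i)]


def ar1_corr(n_i, rho):
    """AR(1) correlation: entry (j, k) equals rho ** |j - k| (standard form)."""
    return [[rho ** abs(j - k) for k in range(n_i)] for j in range(n_i)]


def alpha_setting1(t):
    """Nonzero true coefficients (alpha_1, ..., alpha_5) at time t for the q = 20 design."""
    return [
        5.0,
        6.0,
        0.7 * math.exp(2 * t - 1),
        1.5 * math.sin(math.pi * t),
        0.2 * (2 - 2 * t) ** 3,
    ]


def num_knots(c0, total_obs):
    """K = floor(c0 * N^(1/5)); c0 and the meaning of N are assumptions here."""
    return math.floor(c0 * total_obs ** 0.2)


# Example: n = 400 subjects with n_i = 20 visits each and a hypothetical c0 = 1.
K = num_knots(1.0, 400 * 20)
R_ex = ex_corr(20, 0.7)
R_ar = ar1_corr(20, 0.7)
```

The remaining coefficients $α_{6}, \dots, α_{20}$ are identically zero and are therefore omitted from `alpha_setting1`; the $q = 50$ design differs only in the varying functions and the dimension.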

Data Availability Statement

The data that support the findings of this study are available on request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions.

References

1. Wu C. O., Chiang C. T., and Hoover D. R., “Asymptotic Confidence Regions for Kernel Smoothing of a Varying Coefficient Model With Longitudinal Data,” Journal of the American Statistical Association 93, no. 444 (1998): 1388–1402, 10.1080/01621459.1998.10473800.
2. Wu C. O. and Chiang C. T., “Kernel Smoothing on Varying Coefficient Models With Longitudinal Dependent Variable,” Statistica Sinica 10, no. 2 (2000): 433–456.
3. Hoover D. R., Rice J. A., Wu C. O., and Yang L. P., “Nonparametric Smoothing Estimates of Time‐Varying Coefficient Models With Longitudinal Data,” Biometrika 85, no. 4 (1998): 809–822, 10.1093/biomet/85.4.809.
4. Lin D. and Ying Z., “Semiparametric and Nonparametric Regression Analysis of Longitudinal Data,” Journal of the American Statistical Association 96, no. 456 (2001): 103–126, 10.1198/016214501750333018.
5. Huang J. Z., Wu C. O., and Zhou L., “Varying Coefficient Models and Basis Function Approximations for the Analysis of Repeated Measurements,” Biometrika 89, no. 1 (2002): 111–128, 10.1093/biomet/89.1.111.
6. Huang J. Z., Wu C. O., and Zhou L., “Polynomial Spline Estimation and Inference for Varying Coefficient Models With Longitudinal Data,” Statistica Sinica 14, no. 3 (2004): 763–788, http://www.jstor.org/stable/24307415.
7. Tang Q. and Cheng L., “M‐Estimation and B‐Spline Approximation for Varying Coefficient Models With Longitudinal Data,” Journal of Nonparametric Statistics 20, no. 7 (2008): 611–625, 10.1080/10485250802375950.
8. Xue L. and Zhu L., “Empirical Likelihood for a Varying Coefficient Model With Longitudinal Data,” Journal of the American Statistical Association 102, no. 478 (2007): 642–654, 10.1198/016214507000000293.
9. Qu A. and Li R., “Quadratic Inference Functions for Varying Coefficient Models With Longitudinal Data,” Biometrics 62, no. 2 (2006): 379–391, 10.1111/j.1541-0420.2005.00490.x.
10. Fan J. and Zhang W., “Statistical Methods With Varying Coefficient Models,” Statistics and Its Interface 1, no. 1 (2008): 179–195, 10.4310/SII.2008.v1.n1.a15.
11. Park B. U., Mammen E., Lee Y. K., and Lee E. R., “Varying Coefficient Regression Models: A Review and New Developments,” International Statistical Review 83, no. 1 (2015): 36–64, 10.1111/insr.12029.
12. Li L. and Greene T., “Varying Coefficients Model With Measurement Error,” Biometrics 64, no. 2 (2008): 519–526, 10.1111/j.1541-0420.2007.00921.x.
13. Yang Y., Li G., and Peng H., “Empirical Likelihood of Varying Coefficient Errors‐In‐Variables Models With Longitudinal Data,” Journal of Multivariate Analysis 127 (2014): 1–18, 10.1016/j.jmva.2014.02.004.
14. Zhao M., Gao Y., and Cui Y., “Variable Selection for Longitudinal Varying Coefficient Errors‐In‐Variables Models,” Communications in Statistics ‐ Theory and Methods 51, no. 11 (2022): 3713–3738, 10.1080/03610926.2020.1801738.
15. Zhao M., Xu X., Zhu Y., Zhang K., and Zhou Y., “Model Estimation and Selection for Partial Linear Varying Coefficient EV Models With Longitudinal Data,” Journal of Applied Statistics 50, no. 3 (2023): 512–534, 10.1080/02664763.2021.1904847.
16. Zhao Y. Y., Lin J. G., Huang X. F., and Wang H. X., “Adaptive Jump‐Preserving Estimates in Varying Coefficient Models,” Journal of Multivariate Analysis 149 (2016): 65–80, 10.1016/j.jmva.2016.03.005.
17. Zhao Y.‐Y., Lei K., Liu Y., et al., “Single‐Index Measurement Error Jump Regression Model in Alzheimer's Disease Studies,” Statistics in Medicine 44, no. 7 (2025): e70081, 10.1002/sim.70081.
18. Fan J. and Li R., “Variable Selection via Nonconcave Penalized Likelihood and Its Oracle Properties,” Journal of the American Statistical Association 96, no. 456 (2001): 1348–1360, 10.1198/016214501753382273.
19. Zhang C., “Nearly Unbiased Variable Selection Under Minimax Concave Penalty,” Annals of Statistics 38, no. 2 (2010): 894–942, 10.1214/09-AOS729.
20. Song Y., Han H., Fu L., and Wang T., “Penalized Weighted Smoothed Quantile Regression for High‐Dimensional Longitudinal Data,” Statistics in Medicine 43, no. 10 (2024): 2007–2042, 10.1002/sim.10056.
21. Tang Y., Wang H. J., and Zhu Z., “Variable Selection in Quantile Varying Coefficient Models With Longitudinal Data,” Computational Statistics & Data Analysis 57, no. 1 (2013): 435–449, 10.1016/j.csda.2012.07.015.
22. Wang K. and Lin L., “Simultaneous Structure Estimation and Variable Selection in Partial Linear Varying Coefficient Models for Longitudinal Data,” Journal of Statistical Computation and Simulation 85, no. 7 (2014): 1459–1473, 10.1080/00949655.2013.878716.
23. Xu X., Zhou Y., Zhang K., and Zhao M., “Unified Variable Selection for Varying Coefficient Models With Longitudinal Data,” Journal of Systems Science and Complexity 36, no. 2 (2023): 822–842, 10.1007/s11424-022-2109-1.
24. Wang K. and Lin L., “Robust and Efficient Estimator for Simultaneous Model Structure Identification and Variable Selection in Generalized Partial Linear Varying Coefficient Models With Longitudinal Data,” Statistical Papers 60, no. 5 (2017): 1649–1676, 10.1007/s00362-017-0890-z.
25. Liang K. Y. and Zeger S. L., “Longitudinal Data Analysis Using Generalized Linear Models,” Biometrika 73, no. 1 (1986): 13–22, 10.1093/biomet/73.1.13.
26. Qu A., Lindsay B. G., and Li B., “Improving Generalized Estimating Equations Using Quadratic Inference Functions,” Biometrika 87, no. 4 (2000): 823–836, 10.1093/biomet/87.4.823.
27. Xue L., Qu A., and Zhou J., “Consistent Model Selection for Marginal Generalized Additive Model for Correlated Data,” Journal of the American Statistical Association 105, no. 492 (2010): 1518–1530, 10.1198/jasa.2010.tm10128.
28. Tian R., Xue L., and Liu C., “Penalized Quadratic Inference Functions for Semiparametric Varying Coefficient Partially Linear Models With Longitudinal Data,” Journal of Multivariate Analysis 132, no. 10 (2014): 94–110, 10.1016/j.jmva.2014.07.015.
29. Schumaker L., Spline Functions: Basic Theory (Cambridge University Press, 2007).
30. Zhao P. and Xue L., “Variable Selection for Semiparametric Varying Coefficient Partially Linear Errors‐In‐Variables Models,” Journal of Multivariate Analysis 101, no. 8 (2010): 1872–1883, 10.1016/j.jmva.2010.03.005.
31. Zhao P. and Xue L., “Variable Selection for Varying Coefficient Models With Measurement Errors,” Metrika 74, no. 2 (2011): 231–245, 10.1007/s00184-010-0300-1.

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Data Availability Statement

The data that support the findings of this study are available on request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions.

[sim70434-bib-0001] 1. Wu C. O., Chiang C. T., and Hoover D. R., “Asymptotic Confidence Regions for Kernel Smoothing of a Varying Coefficient Model With Longitudinal Data,” Journal of the American Statistical Association 93, no. 444 (1998): 1388–1402, 10.1080/01621459.1998.10473800. [DOI] [Google Scholar]

[sim70434-bib-0002] 2. Wu C. O. and Chiang C. T., “Kernel Smoothing on Varying Coefficient Models With Longitudinal Dependent Variable,” Statistica Sinica 10, no. 2 (2000): 433–456, 10.1007/s11424-022-2109-1. [DOI] [Google Scholar]

[sim70434-bib-0003] 3. Hoover D. R., Rice J. A., Wu C. O., and Yang L. P., “Nonparametric Smoothing Estimates of Time‐Varying Coefficient Models With Longitudinal Data,” Biometrika 85, no. 4 (1998): 809–822, 10.1093/biomet/85.4.809. [DOI] [Google Scholar]

[sim70434-bib-0004] 4. Lin D. and Ying Z., “Semiparametric and Nonparametric Regression Analysis of Longitudinal Data,” Journal of the American Statistical Association 96, no. 456 (2001): 103–126, 10.1198/016214501750333018. [DOI] [Google Scholar]

5. Huang J. Z., Wu C. O., and Zhou L., “Varying Coefficient Models and Basis Function Approximations for the Analysis of Repeated Measurements,” Biometrika 89, no. 1 (2002): 111–128, 10.1093/biomet/89.1.111.

6. Huang J. Z., Wu C. O., and Zhou L., “Polynomial Spline Estimation and Inference for Varying Coefficient Models With Longitudinal Data,” Statistica Sinica 14, no. 3 (2004): 763–788, http://www.jstor.org/stable/24307415.

7. Tang Q. and Cheng L., “M‐Estimation and B‐Spline Approximation for Varying Coefficient Models With Longitudinal Data,” Journal of Nonparametric Statistics 20, no. 7 (2008): 611–625, 10.1080/10485250802375950.

8. Xue L. and Zhu L., “Empirical Likelihood for a Varying Coefficient Model With Longitudinal Data,” Journal of the American Statistical Association 102, no. 478 (2007): 642–654, 10.1198/016214507000000293.

9. Qu A. and Li R., “Quadratic Inference Functions for Varying Coefficient Models With Longitudinal Data,” Biometrics 62, no. 2 (2006): 379–391, 10.1111/j.1541-0420.2005.00490.x.

10. Fan J. and Zhang W., “Statistical Methods With Varying Coefficient Models,” Statistics and Its Interface 1, no. 1 (2008): 179–195, 10.4310/SII.2008.v1.n1.a15.

11. Park B. U., Mammen E., Lee Y. K., and Lee E. R., “Varying Coefficient Regression Models: A Review and New Developments,” International Statistical Review 83, no. 1 (2015): 36–64, 10.1111/insr.12029.

12. Li L. and Greene T., “Varying Coefficients Model With Measurement Error,” Biometrics 64, no. 2 (2008): 519–526, 10.1111/j.1541-0420.2007.00921.x.

13. Yang Y., Li G., and Peng H., “Empirical Likelihood of Varying Coefficient Errors‐In‐Variables Models With Longitudinal Data,” Journal of Multivariate Analysis 127 (2014): 1–18, 10.1016/j.jmva.2014.02.004.

14. Zhao M., Gao Y., and Cui Y., “Variable Selection for Longitudinal Varying Coefficient Errors‐In‐Variables Models,” Communications in Statistics ‐ Theory and Methods 51, no. 11 (2022): 3713–3738, 10.1080/03610926.2020.1801738.

15. Zhao M., Xu X., Zhu Y., Zhang K., and Zhou Y., “Model Estimation and Selection for Partial Linear Varying Coefficient EV Models With Longitudinal Data,” Journal of Applied Statistics 50, no. 3 (2023): 512–534, 10.1080/02664763.2021.1904847.

16. Zhao Y. Y., Lin J. G., Huang X. F., and Wang H. X., “Adaptive Jump‐Preserving Estimates in Varying Coefficient Models,” Journal of Multivariate Analysis 149 (2016): 65–80, 10.1016/j.jmva.2016.03.005.

17. Zhao Y.‐Y., Lei K., Liu Y., et al., “Single‐Index Measurement Error Jump Regression Model in Alzheimer's Disease Studies,” Statistics in Medicine 44, no. 7 (2025): e70081, 10.1002/sim.70081.

18. Fan J. and Li R., “Variable Selection via Nonconcave Penalized Likelihood and Its Oracle Properties,” Journal of the American Statistical Association 96, no. 456 (2001): 1348–1360, 10.1198/016214501753382273.

19. Zhang C., “Nearly Unbiased Variable Selection Under Minimax Concave Penalty,” Annals of Statistics 38, no. 2 (2010): 894–942, 10.1214/09-AOS729.

20. Song Y., Han H., Fu L., and Wang T., “Penalized Weighted Smoothed Quantile Regression for High‐Dimensional Longitudinal Data,” Statistics in Medicine 43, no. 10 (2024): 2007–2042, 10.1002/sim.10056.

21. Tang Y., Wang H. J., and Zhu Z., “Variable Selection in Quantile Varying Coefficient Models With Longitudinal Data,” Computational Statistics & Data Analysis 57, no. 1 (2013): 435–449, 10.1016/j.csda.2012.07.015.

22. Wang K. and Lin L., “Simultaneous Structure Estimation and Variable Selection in Partial Linear Varying Coefficient Models for Longitudinal Data,” Journal of Statistical Computation and Simulation 85, no. 7 (2014): 1459–1473, 10.1080/00949655.2013.878716.

23. Xu X., Zhou Y., Zhang K., and Zhao M., “Unified Variable Selection for Varying Coefficient Models With Longitudinal Data,” Journal of Systems Science and Complexity 36, no. 2 (2023): 822–842, 10.1007/s11424-022-2109-1.

24. Wang K. and Lin L., “Robust and Efficient Estimator for Simultaneous Model Structure Identification and Variable Selection in Generalized Partial Linear Varying Coefficient Models With Longitudinal Data,” Statistical Papers 60, no. 5 (2017): 1649–1676, 10.1007/s00362-017-0890-z.

25. Liang K. Y. and Zeger S. L., “Longitudinal Data Analysis Using Generalized Linear Models,” Biometrika 73, no. 1 (1986): 13–22, 10.1093/biomet/73.1.13.

26. Qu A., Lindsay B. G., and Li B., “Improving Generalized Estimating Equations Using Quadratic Inference Functions,” Biometrika 87, no. 4 (2000): 823–836, 10.1093/biomet/87.4.823.

27. Xue L., Qu A., and Zhou J., “Consistent Model Selection for Marginal Generalized Additive Model for Correlated Data,” Journal of the American Statistical Association 105, no. 492 (2010): 1518–1530, 10.1198/jasa.2010.tm10128.

28. Tian R., Xue L., and Liu C., “Penalized Quadratic Inference Functions for Semiparametric Varying Coefficient Partially Linear Models With Longitudinal Data,” Journal of Multivariate Analysis 132, no. 10 (2014): 94–110, 10.1016/j.jmva.2014.07.015.

29. Schumaker L., Spline Functions: Basic Theory (Cambridge University Press, 2007).

30. Zhao P. and Xue L., “Variable Selection for Semiparametric Varying Coefficient Partially Linear Errors‐In‐Variables Models,” Journal of Multivariate Analysis 101, no. 8 (2010): 1872–1883, 10.1016/j.jmva.2010.03.005.

31. Zhao P. and Xue L., “Variable Selection for Varying Coefficient Models With Measurement Errors,” Metrika 74, no. 2 (2011): 231–245, 10.1007/s00184-010-0300-1.
