Author manuscript; available in PMC: 2023 Jul 21.
Published in final edited form as: Stat Sin. 2023 May;33(Spec Iss):1295–1318. doi: 10.5705/ss.202021.0006

Globally Adaptive Longitudinal Quantile Regression with High Dimensional Compositional Covariates

Huijuan Ma, Qi Zheng, Zhumin Zhang, Huichuan Lai, Limin Peng
PMCID: PMC10361693  NIHMSID: NIHMS1757788  PMID: 37483468

Abstract

In this work, we propose a longitudinal quantile regression framework that enables a robust characterization of heterogeneous covariate-response associations in the presence of high-dimensional compositional covariates and repeated measurements of both response and covariates. We develop a globally adaptive penalization procedure, which can consistently identify covariate sparsity patterns across a continuum set of quantile levels. The proposed estimation procedure properly aggregates longitudinal observations over time, and ensures the satisfaction of the sum-zero coefficient constraint that is needed for proper interpretation of the effects of compositional covariates. We establish the oracle rate of uniform convergence and weak convergence of the resulting estimators, and further justify the proposed uniform selector of the tuning parameter in terms of achieving global model selection consistency. We derive an efficient algorithm by incorporating existing R packages to facilitate stable and fast computation. Our extensive simulation studies confirm the theoretical findings. We apply the proposed method to a longitudinal study of cystic fibrosis children where the association between gut microbiome and other diet-related biomarkers is of interest.

Key words and phrases: Compositional covariates, Globally adaptive penalization, Longitudinal data, Quantile regression

1. Introduction

Compositional data are frequently encountered in a variety of research fields. Examples include household expenditure compositions in economics, geochemical compositions of rocks in geology, and human microbiome compositions in medical studies. Compositional data consist of proportions that are bounded between zero and one and sum to one, and are often high-dimensional. For instance, human microbiome data are usually captured by percentages (or relative abundances) of gene sequencing reads (Tyler et al., 2014) at a certain taxonomic level, and the number of operational taxonomic units (OTUs) (e.g. phyla or genera) can range over hundreds, thousands, or even millions. With technological advancement, an increasing number of studies have collected such compositional data repeatedly over time. A common question of substantive interest is how these longitudinal compositional measurements are associated with other longitudinal biomarkers or clinical outcomes. This poses a regression problem subject to multi-fold complications, including a large number of covariates, positivity and unit-sum constraints on the covariates, and within-subject dependence of longitudinal observations.

To deal with the high-dimensionality of covariates, a notable line of research has been established in the penalization framework (for example, Meinshausen and Bühlmann, 2006; Zhang and Huang, 2008; Kim et al., 2008; Lv and Fan, 2009; Fan and Lv, 2011). Extensions to longitudinal settings have been developed (for example, Wang et al., 2012; Zheng et al., 2018). When covariates are compositional, given the unit-sum constraint, an increase in one covariate must induce a decrease in at least one other covariate. Applying traditional penalized regression methods without accounting for the compositional nature of covariates may lead to results that are difficult to interpret. A common strategy for accommodating compositional covariates is to apply a sensible transformation to the compositional proportions before incorporating them into a regression model, as in the linear log-contrast model and the logistic normal multinomial regression model (Aitchison, 1982; Aitchison and Bacon-Shone, 1984; Aitchison, 2003; Xia et al., 2013). Many efforts have also been devoted to dealing with covariates that are both compositional and high-dimensional. For example, Lin et al. (2014) proposed a Lasso-penalized method for the linear log-contrast regression model that properly accounts for the compositional nature of covariates. Shi et al. (2016) studied an extension of Lin et al. (2014)’s model with a set of linear constraints. Lu et al. (2019) further generalized the model to a generalized linear log-contrast model and proposed an $\ell_1$-penalized likelihood estimation procedure.

Methods for addressing high-dimensional compositional covariates in a longitudinal setting, however, have been scarce. Moreover, existing approaches are mostly based on mean-based linear regression, which typically confines covariate effects to location shifts and thus can be restrictive for real data. Quantile regression (Koenker and Bassett, 1978), characterized by its flexibility to assess covariate effects across different quantile levels, has demonstrated promising utility for identifying and depicting dynamic covariate-response associations that often shed useful scientific insight. The modeling strategy of quantile regression has been incorporated into the analysis of longitudinal data from various perspectives (Koenker, 2004; Wang and Fygenson, 2009; Ma, Peng, and Fu, for example). In the presence of high-dimensional covariates, many authors (Li et al., 2007; Zou and Yuan, 2008; Wang et al., 2012; Zheng et al., 2013; Fan et al., 2014, among others) have studied penalized quantile regression methods. These methods model a single or multiple pre-specified quantiles of the response; in other words, they are locally concerned. The locally concerned methods are subject to inherent issues such as undesirable variability in variable selection results across neighboring quantile levels, and potential failure to detect some important variables due to an off-target selection of quantile levels. To address these limitations, Zheng et al. (2015) proposed the perspective of globally concerned quantile regression, which allows for simultaneously examining regression quantiles over a continuum set of quantile levels and thus reflects the underlying scientific interest in a more robust way. While demonstrating improved stability and “power” of variable selection compared to locally concerned quantile regression approaches, Zheng et al. (2015)’s method is not suitable for handling either longitudinal data or compositional covariates.

In this work, we develop a globally concerned longitudinal quantile regression framework, which is tailored to evaluate the effects of high-dimensional longitudinal compositional covariates on longitudinal responses. We consider a longitudinal linear log-contrast quantile regression model, where quantiles of the longitudinal response are linked to the log contrasts of the corresponding compositional covariates. To avoid the shortfall associated with selecting an irrelevant covariate as the reference in log-contrasts, we reformulate the model into a symmetric form with zero-sum constraint of coefficients, which ensures sensible interpretations of the effects of compositional covariates. We propose a regularization method, where a globally adaptive Lasso penalty is imposed to the longitudinal quantile loss function that appropriately aggregates the repeated measurements from the same subject. We further adapt the rq.fit.fnc() function in the existing R package quantreg to facilitate the estimation in the presence of the zero-sum constraint of coefficients.

We conduct theoretical studies for the proposed method in the ultra-high dimensional setting, where the number of covariates $p$ may increase exponentially with sample size $n$ (i.e. $\log p = o(n^b)$ for some $b > 0$) and the number of relevant covariates $s$ also increases with $n$. We attain the uniform convergence rate of the proposed estimator as $O_p(\sqrt{s \log n / n})$, which is the fastest possible. Because the longitudinal quantile loss function is not differentiable, to attain this result we cannot adapt the existing work on linear regression based methods for high-dimensional compositional data, such as Lin et al. (2014), which penalizes a smooth least-squares loss function. Instead, we employ theoretical techniques, including chaining theory (Talagrand, 2005), the contraction inequality (Ledoux and Talagrand, 1991), and empirical process theory (van der Vaart and Wellner, 1996), as in Zheng et al. (2015), which however did not address the longitudinal data structure and the compositional constraint for high-dimensional covariates. In this work, we develop new lines of arguments to account for these special data features. A notable effort is that we properly formulate and establish a crucial Karush-Kuhn-Tucker (KKT) condition tailored to compositional data, which is new in the literature. In addition, we thoroughly justify that penalizing the proposed longitudinal quantile loss function, which adopts the simple working independence assumption, is capable of accommodating longitudinal data with dependent repeated measures.

Our theoretical studies yield some useful results that were not discussed in existing work that handles high-dimensional compositional covariates based on log-contrast models, such as Lin et al. (2014) and Shi et al. (2016). For example, our theoretical investigation reveals that the globally adaptive estimator based on a constrained linear log-contrast quantile regression model is asymptotically equivalent to its unconstrained counterpart as long as the reference variable for the latter is a truly relevant variable, which however would not be known in advance. In addition, we establish the weak convergence of any linear combination of the proposed estimator to a Gaussian process. We develop a GIC-type uniform tuning parameter selector. We show that the proposed estimation and tuning parameter selection procedures can correctly identify all globally relevant variables with probability tending to one (i.e. global model selection consistency).

The rest of this article is outlined as follows. In Section 2, we introduce a globally concerned framework built upon a longitudinal linear log-contrast quantile regression model. Then we propose a globally adaptive regularization procedure based on a symmetric model representation with the zero-sum coefficient constraint. In Section 3, we present the asymptotic studies for the proposed estimation procedure. In Section 4, we investigate the finite sample performance of proposed method through simulation studies. Finally, our methodology is illustrated by an application to a longitudinal observational study of cystic fibrosis children.

2. Methodology

2.1. Longitudinal Linear Log-contrast Quantile Regression Model

Consider a longitudinal study with $n$ subjects. Let $Y_i(t)$, $X_i(t)$, and $W_i(t)$ denote, respectively, the longitudinal response, the $r \times 1$ vector of regular covariates including 1 as the first component, and the $p \times 1$ vector of compositional covariates at time $t$ for subject $i$ ($i = 1, \ldots, n$). A component of $X_i(t)$ may flexibly represent the value of a time-dependent covariate measured at time $t$ or a summary of the covariate history up to time $t$. We consider the setting where $r$ is fixed and $p$ increases with $n$ satisfying $\log p = o(n^b)$ for some $b > 0$. At each time point $t$, the compositional covariates in $W_i(t)$ are subject to the unit-sum constraint. That is, $W_i(t)$ belongs to the $(p-1)$-dimensional positive simplex $\mathbb{S}^{p-1} = \{(w_1, \ldots, w_p)^\top : w_j > 0,\ j = 1, \ldots, p;\ \sum_{j=1}^{p} w_j = 1\}$. Suppose $Y_i(t)$, $X_i(t)$, and $W_i(t)$ are observed at $m_i$ time points, denoted by $\{t_i^{(k)}, k = 1, \ldots, m_i\}$. Define a counting process for the observation times as $N_i(t) = \sum_{k=1}^{m_i} I(t_i^{(k)} \le t)$.

To gain a comprehensive and flexible view of how covariates influence the response, we adopt quantile regression modeling to formulate covariate effects on the $\tau$th conditional quantile of $Y(t)$ given $X(t)$ and $W(t)$, which is defined as $Q_{Y(t)}\{\tau \mid X(t), W(t)\} = \inf\{y : \Pr\{Y(t) \le y \mid X(t), W(t)\} \ge \tau\}$. However, directly plugging $W(t)$ into a regression model would be problematic because the components of $W(t)$ cannot change freely due to the unit-sum constraint, and thus interpreting the coefficients for $W(t)$ would be difficult. To deal with the unit-sum constraint, we apply Aitchison and Bacon-Shone (1984)’s log-contrast (or log-ratio) transformation that transforms the compositional $W_i(t)$ from $\mathbb{S}^{p-1}$ to $Z_i^{p}(t) \doteq \{\log\{W_{i1}(t)/W_{ip}(t)\}, \ldots, \log\{W_{i,p-1}(t)/W_{ip}(t)\}\}^\top$, where $W_{ij}(t)$ denotes the $j$th component of $W_i(t)$. The transformation from $W(t)$ to $Z_i^{p}(t)$ is one-to-one and $Z_i^{p}(t)$ ranges freely in $\mathbb{R}^{p-1}$ without any constraint. A log-contrast transformation requires selecting a reference covariate. For $Z_i^{p}(t)$, the $p$th component of $W_i(t)$, $W_{ip}(t)$, serves as the reference.

We consider the following longitudinal linear log-contrast quantile regression model:

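$$Q_{Y_i(t)}\{\tau \mid X_i(t), W_i(t)\} = X_i(t)^\top \alpha_0(\tau) + Z_i^{p}(t)^\top \beta_{0,\backslash p}(\tau) \quad \text{for } \tau \in \Delta, \tag{2.1}$$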

where $\alpha_0(\tau)$ is an $r \times 1$ vector of regression coefficients for $X_i(t)$, $\beta_{0,\backslash p}(\tau) \doteq \{\beta_{0,1}(\tau), \ldots, \beta_{0,p-1}(\tau)\}^\top$ is a $(p-1) \times 1$ vector of regression coefficients for $Z_i^{p}(t)$, and $\Delta \subset (0, 1)$ is a set of quantile levels, pre-specified to align with the scientific problem of interest. For example, if the interest is to identify variables influencing the center of the response distribution, one may choose $\Delta = [0.4, 0.6]$. If the interest lies in the upper tail of the response distribution, one may choose $\Delta = [0.75, 0.9]$. One subtle drawback of model (2.1) is that any variable selection based on model (2.1) would automatically include $W_{ip}(t)$ as a relevant covariate, even when $W_{ip}(t)$ is not a relevant variable.

Following the strategy employed in the setting of linear regression with compositional covariates (Lin et al., 2014; Shi et al., 2016), we define $\beta_{0,p}(\tau) = -\sum_{j=1}^{p-1} \beta_{0,j}(\tau)$, and re-express model (2.1) as

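$$Q_{Y_i(t)}\{\tau \mid X_i(t), Z_i(t)\} = X_i(t)^\top \alpha_0(\tau) + Z_i(t)^\top \beta_0(\tau), \quad \text{subject to } \sum_{j=1}^{p} \beta_{0,j}(\tau) = 0, \quad \text{for } \tau \in \Delta. \tag{2.2}$$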

Here $Z_i(t) = \{\log\{W_{i1}(t)\}, \ldots, \log\{W_{ip}(t)\}\}^\top$ and $\beta_0(\tau) = \{\beta_{0,1}(\tau), \ldots, \beta_{0,p-1}(\tau), \beta_{0,p}(\tau)\}^\top$ with $\beta_{0,j}(\tau)$ denoting the $j$th component of $\beta_0(\tau)$. Unlike model (2.1), model (2.2) takes a symmetric form and does not involve a choice of the reference covariate. The symmetric form of model (2.2) also enables estimation that possesses desirable properties including scale invariance, permutation invariance and selection invariance (Aitchison, 1982; Lin et al., 2014).
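The equivalence between the reference-based form (2.1) and the symmetric form (2.2) can be checked numerically. The following is a minimal Python sketch, not the authors' code; the composition `w`, the coefficients `beta_ref`, and all function names are hypothetical and chosen only for illustration.

```python
import math

def log_contrast(w):
    """Log-ratios of the first p-1 components against the last (reference) one."""
    ref = w[-1]
    return [math.log(wj / ref) for wj in w[:-1]]

def symmetric_coefficients(beta_ref):
    """Append beta_p = -(beta_1 + ... + beta_{p-1}) so the coefficients sum to zero."""
    return list(beta_ref) + [-sum(beta_ref)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Illustrative composition (positive, sums to one) and reference-form coefficients.
w = [0.2, 0.3, 0.1, 0.4]
beta_ref = [1.0, -0.5, 2.0]

beta_sym = symmetric_coefficients(beta_ref)
pred_ref = dot(log_contrast(w), beta_ref)               # compositional part of (2.1)
pred_sym = dot([math.log(wj) for wj in w], beta_sym)    # compositional part of (2.2)

# Scale invariance: multiplying w by a constant leaves the symmetric predictor
# unchanged because the coefficients sum to zero.
pred_scaled = dot([math.log(5.0 * wj) for wj in w], beta_sym)
```

Under these assumptions `pred_ref`, `pred_sym`, and `pred_scaled` agree up to floating-point error, which is the sense in which the zero-sum constraint removes the need to pick a reference component.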

Many longitudinal quantile regression models studied in literature (for example, Lipsitz et al., 1997; Wang and Fygenson, 2009; Sun et al., 2016; Cho et al., 2016; Gao and Liu, 2019) bear similar forms to model (2.1) or (2.2) but do not involve the zero-sum coefficient constraint. In addition, they were investigated under the locally concerned perspective.

We study a globally concerned framework based on the longitudinal quantile regression model (2.2), where a covariate is considered as relevant if it has nonzero effects on the conditional quantiles of Y(t) at some, not necessarily all, quantile levels in Δ. That is, the set of relevant (or active) compositional covariates is defined as

$$S_\Delta = \{j \in \{1, \ldots, p\} : \exists\, \tau \in \Delta,\ |\beta_{0,j}(\tau)| > 0\}.$$

It is clear that $S_\tau \doteq S_{\{\tau\}} \subseteq S_\Delta$ when $\tau \in \Delta$. The globally concerned perspective warrants a global sparsity assumption, i.e. $s \doteq |S_\Delta| = o(n)$, for the purpose of model identifiability, where $|\cdot|$ denotes the cardinality.

2.2. Globally adaptive L1 penalized estimation

The observed longitudinal data can be generally formulated as {(Yi(t)dNi(t), Xi(t)dNi(t), Zi(t)dNi(t)), i = 1, …, n}. When p is fixed, model (2.2) without the zero-sum coefficient constraint can be estimated by minimizing the longitudinal quantile loss function,

$$Q(\alpha, \beta; \tau) = \frac{1}{n} \sum_{i=1}^{n} \int_0^\infty \rho_\tau\{Y_i(t) - X_i(t)^\top \alpha - Z_i(t)^\top \beta\}\, dN_i(t),$$

where $\rho_\tau(u) = u(\tau - I\{u \le 0\})$ is the $\tau$th quantile loss function. By definition, $Q(\alpha, \beta; \tau)$ is an equally weighted sum of the quantile loss function assessed at all within-subject observation time points. This mimics the idea of constructing a generalized estimating equation (GEE) for longitudinal data under the working independence assumption (Liang and Zeger, 1986). The same strategy has also been adopted by existing work on longitudinal quantile regression (Wang and Fygenson, 2009; Sun et al., 2016, for example). Estimation based on $Q(\alpha, \beta; \tau)$, like the GEE approach, can properly accommodate longitudinal data with correlated repeated measures.
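As an illustration of how the loss aggregates repeated measurements, the sketch below evaluates the check function at every visit of every subject and divides by the number of subjects. It is a minimal Python sketch under the working independence formulation described above; the data layout (a list of visits per subject) and the toy values are assumptions made for this example.

```python
def check_loss(u, tau):
    """Quantile check function rho_tau(u) = u * (tau - 1{u <= 0})."""
    return u * (tau - (1.0 if u <= 0 else 0.0))

def longitudinal_quantile_loss(subjects, alpha, beta, tau):
    """Sum the check loss over all visits of all subjects, then divide by n.

    `subjects` holds one list per subject; each visit is a tuple (y, x, z)
    with x the regular covariates and z the log-compositional covariates.
    """
    total = 0.0
    for visits in subjects:
        for y, x, z in visits:
            fitted = sum(a * xj for a, xj in zip(alpha, x))
            fitted += sum(b * zj for b, zj in zip(beta, z))
            total += check_loss(y - fitted, tau)
    return total / len(subjects)

# Two hypothetical subjects with 2 and 1 visits, respectively.
subjects = [
    [(1.0, [1.0, 0.5], [-1.2, -0.4]), (2.0, [1.0, 0.5], [-0.9, -0.5])],
    [(0.0, [1.0, 1.5], [-0.3, -1.3])],
]
loss = longitudinal_quantile_loss(subjects, alpha=[0.0, 0.0], beta=[0.0, 0.0], tau=0.5)
```

Subjects with more visits contribute more terms to the sum, but no within-subject correlation structure is modeled, which mirrors the working independence idea.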

We propose to apply the adaptively weighted $L_1$ regularization to $Q(\alpha, \beta; \tau)$ to address the high-dimensionality of $Z_i(t)$. This renders a regression coefficient estimator $\hat\gamma(\tau)$ as a solution to the following constrained minimization problem,

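$$\hat\gamma(\tau) \doteq (\hat\alpha(\tau)^\top, \hat\beta(\tau)^\top)^\top = \arg\min_{\alpha,\, \beta:\ \sum_{j=1}^{p} \beta_j = 0} \Big( Q(\alpha, \beta; \tau) + \lambda \sum_{j=1}^{p} \omega_j(\tau) |\beta_j| \Big). \tag{2.3}$$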

Aligning with the perspective of globally concerned quantile regression, $\lambda$ is a tuning parameter which is constant over $\tau$ and controls the global sparsity over $\tau \in \Delta$, namely, $S_\Delta$. Here $\omega_j(\tau)$ is a nonnegative adaptive weight function that gauges the importance of $Z_{ij}(t)$, the $j$th component of $Z_i(t)$, $j = 1, \ldots, p$. The adaptive weights may take the following forms: (w1) $\omega_j(\tau) = 1/|\check\beta_j(\tau)|$; (w2) $\omega_j(\tau) = 1/(\sup_{\tau \in \Delta} |\check\beta_j(\tau)|)$; (w3) $\omega_j(\tau) = 1/\int_\Delta |\check\beta_j(\tau)|\, d\tau$, where $\check\beta(\tau)$ is a uniformly consistent estimator of $\beta_0(\tau)$. As discussed in Zheng et al. (2015), (w2) and (w3) are two globally adaptive weights that capture the global impact of a covariate, and may be theoretically and empirically preferable. A uniformly consistent estimator $\check\beta(\tau)$ can be obtained by directly adapting Belloni and Chernozhukov (2011)’s approach to high-dimensional longitudinal compositional data (i.e. solving the minimization problem (2.3) with the penalty term and tuning parameter selector presented in Belloni and Chernozhukov (2011)). This can be justified by slightly modifying the proof of Theorem 1 (in Section 3), combined with the techniques of Belloni and Chernozhukov (2011).
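To make the difference between the local weight (w1) and the global weights (w2) and (w3) concrete, the following Python sketch computes all three from an initial estimate evaluated on a grid of quantile levels. It is only a sketch under stated assumptions: the grid, the toy initial estimates, the trapezoidal approximation of the integral in (w3), and the convention of an infinite weight when the initial estimate is exactly zero are all choices made for this example, not details taken from the paper.

```python
import math

def inverse(x):
    """1/x, with an infinite weight when the initial estimate is exactly zero."""
    return math.inf if x == 0 else 1.0 / x

def weights_w1(beta_check):
    """(w1): pointwise weights, one row per grid point."""
    return [[inverse(abs(b)) for b in row] for row in beta_check]

def weights_w2(beta_check):
    """(w2): inverse of the supremum of |beta_j| over the grid."""
    p = len(beta_check[0])
    return [inverse(max(abs(row[j]) for row in beta_check)) for j in range(p)]

def weights_w3(beta_check, taus):
    """(w3): inverse of the integral of |beta_j| over the grid (trapezoidal rule)."""
    p = len(beta_check[0])
    weights = []
    for j in range(p):
        area = sum(
            (taus[k + 1] - taus[k])
            * (abs(beta_check[k][j]) + abs(beta_check[k + 1][j])) / 2.0
            for k in range(len(taus) - 1)
        )
        weights.append(inverse(area))
    return weights

# Hypothetical initial estimates: rows are grid points, columns are covariates.
taus = [0.25, 0.5, 0.75]
beta_check = [[1.0, 0.0, -0.5], [2.0, 0.0, 0.0], [1.0, 0.0, 0.5]]

w1 = weights_w1(beta_check)
w2 = weights_w2(beta_check)
w3 = weights_w3(beta_check, taus)
```

In this toy input the third covariate has a zero initial estimate at $\tau = 0.5$ only, so (w1) assigns it an infinite weight at that level while (w2) and (w3) keep a finite weight, reflecting its effect elsewhere in the grid.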

To solve the constrained minimization problem in (2.3), we first write the objective function as a classical quantile loss function. Let $e_j$ be a $p$-dimensional vector with the $j$th component equal to 1 and all others equal to 0, $j = 1, \ldots, p$. In addition, for any integer $m \ge 2$, denote the $m$-vectors of ones and zeros by $1_m$ and $0_m$, respectively. Since $\rho_\tau(u) + \rho_\tau(-u) = |u|$,

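$$\lambda \sum_{j=1}^{p} \omega_j(\tau) |\beta_j| = \sum_{j=1}^{p} \big\{ \rho_\tau(Y_j^* - X_j^{*\top} \alpha - Z_j^{*\top} \beta) + \rho_\tau(Y_{p+j}^* - X_{p+j}^{*\top} \alpha - Z_{p+j}^{*\top} \beta) \big\},$$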

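where $(Y_j^*, X_j^*, Z_j^*) = (0, 0_r, \lambda \omega_j(\tau) e_j)$ and $(Y_{p+j}^*, X_{p+j}^*, Z_{p+j}^*) = (0, 0_r, -\lambda \omega_j(\tau) e_j)$. Letting $\gamma = (\alpha^\top, \beta^\top)^\top$, we then formulate the equality constraint $\sum_{j=1}^{p} \beta_j = 0$ as two inequality constraints, $\sum_{j=1}^{p} \beta_j \ge 0$ (or $(0_r^\top, 1_p^\top) \gamma \ge 0$ in matrix form) and $-\sum_{j=1}^{p} \beta_j \ge 0$ (or $(0_r^\top, -1_p^\top) \gamma \ge 0$ in matrix form). Then the quantile regression problem in (2.3) with linear inequality constraints can be solved by the existing function rq.fit.fnc() in the R package quantreg, using the augmented dataset $\{Y_i(t_i^{(k)}), X_i(t_i^{(k)}), Z_i(t_i^{(k)}),\ k = 1, \ldots, m_i;\ i = 1, \ldots, n\}$, coupled with $\{(Y_j^*, X_j^*, Z_j^*), (Y_{p+j}^*, X_{p+j}^*, Z_{p+j}^*),\ j = 1, \ldots, p\}$.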
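The data augmentation step rests on the identity $\rho_\tau(u) + \rho_\tau(-u) = |u|$. The Python sketch below builds the $2p$ pseudo-observations and checks that their total check loss reproduces the weighted $L_1$ penalty. It does not call quantreg or solve the optimization; the function names and numerical values are hypothetical.

```python
def check_loss(u, tau):
    """Quantile check function rho_tau(u) = u * (tau - 1{u <= 0})."""
    return u * (tau - (1.0 if u <= 0 else 0.0))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def penalty_rows(lam, weights, r):
    """Pseudo-observations (y*, x*, z*): p rows with +lam*w_j*e_j, then p rows with -lam*w_j*e_j."""
    p = len(weights)
    rows = []
    for sign in (1.0, -1.0):
        for j, w in enumerate(weights):
            z = [0.0] * p
            z[j] = sign * lam * w
            rows.append((0.0, [0.0] * r, z))
    return rows

def loss_on_rows(rows, alpha, beta, tau):
    """Total check loss of the residuals y* - x*'alpha - z*'beta."""
    return sum(check_loss(y - dot(x, alpha) - dot(z, beta), tau) for y, x, z in rows)

# Illustrative values; beta sums to zero as required by the constraint.
lam, weights = 2.0, [1.0, 0.5, 3.0]
alpha, beta, tau = [0.7, -0.2], [0.4, -1.0, 0.6], 0.3

augmented = loss_on_rows(penalty_rows(lam, weights, r=len(alpha)), alpha, beta, tau)
direct = lam * sum(w * abs(b) for w, b in zip(weights, beta))
```

Because the pseudo-responses and pseudo-regular-covariates are all zero, the value of `alpha` does not affect `augmented`, and the result does not depend on `tau`.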

The set of relevant compositional covariates, SΔ, is estimated by

$$\hat S_\Delta \doteq \{j \in \{1, \ldots, p\} : \exists\, \tau \in \Delta,\ |\hat\beta_j(\tau)| > 0\}.$$
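In practice the estimated coefficient functions are only available at finitely many quantile levels, so the estimated relevant set can be read off as the union of the level-specific supports. The short Python sketch below assumes estimates stored as rows over a grid of quantile levels; the grid, the toy values, and the 0-based indexing are assumptions of this example.

```python
def relevant_set(beta_grid, tol=0.0):
    """0-based indices j with |beta_j(tau)| > tol at some grid point."""
    p = len(beta_grid[0])
    return {j for j in range(p) if any(abs(row[j]) > tol for row in beta_grid)}

# Hypothetical estimates on a three-point grid; every row sums to zero.
taus = [0.25, 0.5, 0.75]
beta_grid = [
    [0.8, 0.0, -0.8, 0.0],
    [0.5, 0.0, -0.5, 0.0],
    [0.0, 0.6, -0.6, 0.0],
]

global_set = relevant_set(beta_grid)
local_sets = {tau: relevant_set([row]) for tau, row in zip(taus, beta_grid)}
```

Here the first two covariates are each inactive at some level but belong to the global set, while each level-specific set is a subset of the global one.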

2.3. Tuning parameter selection

Tuning parameter selection plays an important role in variable selection. In the globally concerned setting, a critical idea is to set λ as a common tuning parameter across all τ ∈ Δ as a means to control the overall model complexity and avoid overfitting. We adapt the generalized information criterion (GIC) (Nishii, 1984; Fan and Tang, 2013) to the problem setting of globally concerned longitudinal quantile regression with compositional covariates.

Specifically, we propose the following uniform selector of tuning parameter by minimizing

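$$\mathrm{GIC}(\lambda) = \int_\Delta \log \hat\sigma_\lambda(\tau)\, d\tau + (|\hat S_\lambda| - 1)\, \phi_n,$$

where $\hat S_\lambda = \{j \in \{1, \ldots, p\} : \sup_{\tau \in \Delta} |\hat\beta_{j,\lambda}(\tau)| \ne 0\}$,

$$\hat\sigma_\lambda(\tau) = \frac{1}{n} \sum_{i=1}^{n} \int_0^\infty \rho_\tau\{Y_i(t) - X_i(t)^\top \hat\alpha_\lambda(\tau) - Z_i(t)^\top \hat\beta_\lambda(\tau)\}\, dN_i(t),$$

and $\phi_n$ is a sequence converging to 0 as $n$ increases. Here $\hat\beta_{j,\lambda}(\tau)$, $\hat\alpha_\lambda(\tau)$, and $\hat\beta_\lambda(\tau)$ stand for the proposed estimates of $\beta_j(\tau)$, $\alpha(\tau)$, and $\beta(\tau)$, respectively, with the tuning parameter fixed at $\lambda$. A popular choice of $\phi_n$ is $n^{-1} \log(p) \log(\log(n))$. Note that the model size pertaining to the compositional covariates is $|\hat S_\lambda| - 1$ due to the zero-sum constraint.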

As shown in Theorem 3, with a properly chosen $\phi_n$ and a reasonable upper bound imposed on the model size, the proposed tuning parameter $\hat\lambda$, which is the minimizer of $\mathrm{GIC}(\lambda)$ with respect to $\lambda$, can consistently identify the true model $S_\Delta$. In other words, with probability tending to 1, $\hat S_{\hat\lambda} = S_\Delta$.
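The selector can be sketched as follows, assuming the fitted quantities have already been computed on a grid of quantile levels for each candidate tuning parameter. This is an illustrative Python sketch rather than the authors' implementation: the trapezoidal approximation of the integral, the candidate values, and the toy fitted values are assumptions, and no model fitting is performed here.

```python
import math

def trapezoid(xs, ys):
    """Trapezoidal approximation of the integral of ys over xs."""
    return sum((xs[k + 1] - xs[k]) * (ys[k] + ys[k + 1]) / 2.0 for k in range(len(xs) - 1))

def selected_set(beta_hat):
    """Indices whose estimated coefficient is nonzero at some grid point."""
    p = len(beta_hat[0])
    return {j for j in range(p) if max(abs(row[j]) for row in beta_hat) != 0}

def gic(taus, sigma_hat, beta_hat, n):
    """Integrated log loss plus (model size - 1) * phi_n, phi_n = log(p) log(log n) / n."""
    p = len(beta_hat[0])
    phi_n = math.log(p) * math.log(math.log(n)) / n
    fit = trapezoid(taus, [math.log(s) for s in sigma_hat])
    return fit + (len(selected_set(beta_hat)) - 1) * phi_n

taus = [0.25, 0.5, 0.75]
# Hypothetical fits: candidate lambda -> (sigma_hat on the grid, beta_hat on the grid).
fits = {
    0.5: ([0.50, 0.60, 0.50], [[0.9, -0.5, -0.3, -0.1, 0.0]] * 3),
    1.0: ([0.51, 0.61, 0.51], [[0.9, -0.5, -0.4, 0.0, 0.0]] * 3),
}
gic_values = {lam: gic(taus, s, b, n=100) for lam, (s, b) in fits.items()}
best_lambda = min(gic_values, key=gic_values.get)
```

In this toy input the larger candidate gives a slightly worse fit but a smaller model, and the complexity term tips the criterion in its favor.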

2.4. Grid-based Approximation

With finite sample sizes, minimizing (2.3) for all $\tau \in \Delta$ would yield estimates that are exactly piecewise constant functions of $\tau$. While the exact breakpoints of these piecewise constant functions can be identified by adapting Koenker and d’Orey (1987) and Portnoy (1991)’s procedure, the computational expense can be overwhelming in ultra-high dimensional cases. Therefore, we approximate $\hat\alpha(\cdot)$ and $\hat\beta(\cdot)$ by piecewise constant functions that only jump at the grid points of a pre-specified, sufficiently fine $\tau$-grid in $\Delta$ to alleviate the computational burden. Let $S_n$ denote the $\tau$-grid in $\Delta$, $\tau_0 < \tau_1 < \cdots < \tau_{M(n)}$, and define the size of $S_n$ as $\|S_n\| = \max\{\tau_k - \tau_{k-1} : k = 1, \ldots, M(n)\}$. The grid-based approximations are given by $\hat\alpha_{S_n}(\tau) = \sum_{k=1}^{M(n)} \hat\alpha(\tau_k) I(\tau_{k-1} < \tau \le \tau_k)$ and $\hat\beta_{S_n}(\tau) = \sum_{k=1}^{M(n)} \hat\beta(\tau_k) I(\tau_{k-1} < \tau \le \tau_k)$. Under certain smoothness assumptions for $\alpha_0(\cdot)$ and $\beta_0(\cdot)$, we can show that $(\hat\alpha_{S_n}(\cdot), \hat\beta_{S_n}(\cdot))$ and $(\hat\alpha(\cdot), \hat\beta(\cdot))$ have the same convergence rate and asymptotic distribution if $\|S_n\|$ converges to 0 at the rate $o((ns)^{-1/2})$.
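The grid-based approximation amounts to a right-continuous lookup: a quantile level in $(\tau_{k-1}, \tau_k]$ is assigned the estimate computed at $\tau_k$. A minimal Python sketch of this lookup for a single coefficient is given below; the grid and the values are made up for illustration, and levels outside $(\tau_0, \tau_{M(n)}]$ are rejected because the displayed sum is zero there.

```python
import bisect

def grid_size(taus):
    """Mesh size: the largest gap between consecutive grid points."""
    return max(b - a for a, b in zip(taus, taus[1:]))

def piecewise_constant(taus, values):
    """Return f with f(tau) = values[k] whenever taus[k-1] < tau <= taus[k]."""
    def f(tau):
        if not (taus[0] < tau <= taus[-1]):
            raise ValueError("tau must lie in (tau_0, tau_M]")
        # bisect_left gives the first index k with taus[k] >= tau.
        return values[bisect.bisect_left(taus, tau)]
    return f

# Hypothetical grid and estimates of one coefficient at the grid points.
taus = [0.1, 0.3, 0.5]
values = [1.0, 2.0, 3.0]   # values[0] corresponds to tau_0 and is never returned
beta_on_grid = piecewise_constant(taus, values)
```

For example, `beta_on_grid(0.2)` and `beta_on_grid(0.3)` both return the estimate at 0.3, while `beta_on_grid(0.31)` returns the estimate at 0.5.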

3. Theoretical Results

Without loss of generality, we assume that $r$, the number of usual covariates, is finite. Let $S_\Delta = \{1, \ldots, s\}$ and we use $S_\Delta^c = \{s+1, \ldots, p\}$ to denote the collection of all irrelevant compositional variables. We allow the number of compositional covariates $p_n \equiv p$ and the true model size $s_n \equiv s$ to increase with the sample size $n$. To ease the presentation, we often omit the subscript $n$ when it is clear from the context.

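Let $V_i(t) = (X_i(t)^\top, Z_i(t)^\top)^\top$ and $\gamma(\tau) = (\alpha(\tau)^\top, \beta(\tau)^\top)^\top$ satisfying $\sum_{j=1}^{p} \beta_j(\tau) = 0$. Thus, $\gamma_0(\tau) = (\alpha_0(\tau)^\top, \beta_0(\tau)^\top)^\top$. We decompose $Z_i(t)$ into $(Z_{ia}(t)^\top, Z_{ib}(t)^\top)^\top$ and $V_i(t)$ into $(V_{ia}(t)^\top, V_{ib}(t)^\top)^\top$, where $Z_{ia}(t) = (Z_{i,1}(t), \ldots, Z_{i,s}(t))^\top$, $V_{ia}(t) = (X_i(t)^\top, Z_{ia}(t)^\top)^\top$, and $V_{ib}(t) = Z_{ib}(t) = (Z_{i,s+1}(t), \ldots, Z_{i,p}(t))^\top$. Likewise, $\beta(\tau) = (\beta_a(\tau)^\top, \beta_b(\tau)^\top)^\top$ and $\gamma(\tau) = (\gamma_a(\tau)^\top, \gamma_b(\tau)^\top)^\top$, where $\beta_a(\tau) = (\beta_1(\tau), \ldots, \beta_s(\tau))^\top$, $\gamma_a(\tau) = (\alpha(\tau)^\top, \beta_a(\tau)^\top)^\top$, and $\gamma_b(\tau) = \beta_b(\tau) = (\beta_{s+1}(\tau), \ldots, \beta_p(\tau))^\top$. Regularity conditions (C1)–(C5) are stated in Section S1 of the Supplementary Materials.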

In Theorem 1, we show that the proposed estimator is uniformly consistent over $\Delta$, with the convergence rate $O_p(\sqrt{(r+s) \log n / n})$, which is the fastest possible and is as good as that of Zheng et al. (2015)’s globally adaptive estimator. Concerning a single $\tau$ or a finite number of $\tau$’s, we establish a faster convergence rate, $O_p(\sqrt{(r+s)/n})$, as stated in Corollary 1.

Theorem 1. Suppose conditions (C1)–(C5) (stated in the Supplementary Materials) hold. Furthermore, we assume that $n / ((r+s)^3 \log^2 \max\{n, r+p\}) \to \infty$ and

supj>r+s,δRr+s1E[0{Vij(t)Vi(t)δ}2dNi(t)]δ2=o(log max{n,r+p}(r+s) log n).

If r + s = o(n1/3), supjSΔ,τΔλwj(τ)=Op(n log n), λ/(r+s log max{n,r+p}) and (infj>r+s,τΔwj(τ))1n/(r+s) log max{n,r+p}=Op(1) then the proposed estimator satisfies

$$\sup_{\tau \in \Delta} \|\hat\gamma(\tau) - \gamma_0(\tau)\| = O_p\big(\sqrt{(r+s) \log n / n}\big).$$

Corollary 1. Suppose the conditions in Theorem 1 hold. Then the proposed estimator satisfies

$$\|\hat\gamma(\tau_0) - \gamma_0(\tau_0)\| = O_p\big(\sqrt{(r+s)/n}\big).$$

In Theorem 2, we establish the weak convergence of the proposed estimator.

Theorem 2. Suppose the conditions in Theorem 1 hold. If $(r+s)^3 \log^4 n = o(n)$, then for any given $\xi \in \mathbb{R}^{r+s-1}$ with $\|\xi\| = 1$, we have the following results:

  1. If n/{(r+s) log n} inf1js,τΔ|β0j(τ)|, then
    n1/2ξ[Hτ{γ^(τ)γ0(τ)}+λnϖ(τ)]
    converges weakly to a mean zero Gaussian process with covariance
    Σ(τ,τ)=E{hn,ξ,τ(V(t),Y)hn,ξ,τ(V(t),Y)}E{hn,ξ,τ(V(t),Y)}E{hn,ξ,τ(V(t),Y)},
    where hn,ξ,τ(V(t),Y)=0ξV(t)ψτ{Y(t)V(t)γ0(τ)}dN(t), ψτ(u) = τI(u < 0),
    Hτ=(E[0ft,τ{0Vi(t)}Via(t)Via(t)dNi(t)]000),
    ϖ(τ)=(0r,(ω(τ)sign(β0(τ))),0ps), ○ denotes the Hadamard product, and ω(τ) = (ω1(τ), …, ωp(τ));
  2. If supτΔ n1/2{jSτλwj2(τ)}1/2=op(1), then n1/2ξHτ{γ^(τ)γ0(τ)} converges weakly to a mean zero Gaussian process with covariance Σ(τ, τ′).

To establish the asymptotic properties of the GIC tuning parameter selector, we assume the following condition (C5+), which is an enhanced version of (C5) presented in the Supplementary Materials:

(C5+) (a)

0<Λmin:=infδR,r+κ,δ0δE[0Vi(t)Vi(t)dNi(t)]δδ2supδR,r+κ,δ0δE[0Vi(t)Vi(t)dNi(t)]δδ2:=Λmax<.

(b)

q:=supδR,r+κ,δ0E[0|Vi(t)δ|2dNi(t)]3/2E[0|Vi(t)δ|3dNi(t)]>0,

where R={δ=(δx,δz):δxr,j=1pδzj=0,δz0r} with ∥ · ∥0 denoting the L0 norm.

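In addition, we set a model size upper bound, denoted by $\kappa$, with $s < \kappa < p$. Define

$$\xi_n = \min\Big\{ \min_{1 \le j \le r} \int_\Delta |\alpha_{0j}(\tau)|\, d\tau,\ \min_{1 \le j \le s} \int_\Delta |\beta_{0j}(\tau)|\, d\tau \Big\},$$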

which measures the minimal overall effect of usual and compositional relevant variables upon the conditional distribution. Theorem 3 and Corollary 2 present results on the consistency of tuning parameter selection based on GIC.

Theorem 3. Suppose the conditions in Theorem 1 and (C5+) hold. Further, log(r+p)/n=o(ϕn),ϕn=o(ξn5/2), and κn1 log max{n,r+p}=o(ξn3), then

$$P\Big( \inf_{S \ne S_\Delta,\ |S| \le \kappa} \mathrm{GIC}(S) > \mathrm{GIC}(S_\Delta) \Big) \to 1.$$

Corollary 2. Under the same conditions as in Theorem 3, if

{infj>r+s,τΔwj(τ)}1n/(r+s) log max{n,r+p}=Op(1)

and supτΔ,jSτ wj(τ)=Op(n/(r+slog max{n,r+p})), then P(S^λ^=SΔ)1.

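For any $1 \le l \le s$, we use $Z_i^{l}(t)$ to denote the log-ratio transformed $W_i(t)$ when the reference is the $l$th component, i.e. $Z_i^{l}(t)$ is the vector $Z_i(t) - Z_{i,l}(t) 1_p$ with the $l$th component removed. We also define $V_i^{l}(t) = (X_i(t)^\top, Z_i^{l}(t)^\top)^\top$. Let $\gamma_{\backslash l}(\tau) = (\alpha(\tau)^\top, \beta_{\backslash l}(\tau)^\top)^\top$, where $\beta_{\backslash l}(\tau) = (\beta_1(\tau), \ldots, \beta_{l-1}(\tau), \beta_{l+1}(\tau), \ldots, \beta_p(\tau))^\top$. Let $\hat\gamma_{\backslash l}(\tau)$ be the solution of the following unconstrained minimization problem

$$\frac{1}{n} \sum_{i=1}^{n} \int_0^\infty \rho_\tau\{Y_i(t) - V_i^{l}(t)^\top \gamma_{\backslash l}\}\, dN_i(t) + \lambda \sum_{j=1,\, j \ne l}^{p} \omega_j(\tau) |\beta_j|, \tag{3.4}$$

where $\gamma_{\backslash l} = (\alpha_1, \ldots, \alpha_r, \beta_1, \ldots, \beta_{l-1}, \beta_{l+1}, \ldots, \beta_p)^\top$. Then the globally adaptive unconstrained estimator $\hat\gamma_l^{u}(\tau)$ with the $l$th component as the reference is

$$\Big( \hat\gamma_{1,\backslash l}(\tau), \ldots, \hat\gamma_{r,\backslash l}(\tau), \hat\gamma_{r+1,\backslash l}(\tau), \ldots, \hat\gamma_{r+l-1,\backslash l}(\tau), -\sum_{k=1,\, k \ne l}^{p} \hat\gamma_{r+k,\backslash l}(\tau), \hat\gamma_{r+l+1,\backslash l}(\tau), \ldots, \hat\gamma_{r+p,\backslash l}(\tau) \Big)^\top.$$

We state the asymptotic properties of $\hat\gamma_l^{u}(\tau)$ in the following theorem: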

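Theorem 4. Under the same conditions as in Theorem 2, if $(r+s)^3 \log^4 n = o(n)$ and $\sup_{\tau \in \Delta,\, j \in S_\tau} n^{-1/2} \lambda w_j(\tau) = o_p(1)$, then for any given $\xi \in \mathbb{R}^{r+s-1}$, $\|\xi\| = 1$, and $1 \le l \le s$, we have

  1. $n^{1/2} \xi^\top H_\tau \{\hat\gamma_l^{u}(\tau) - \gamma_0(\tau)\}$ converges weakly to a mean zero Gaussian process with covariance $\Sigma(\tau, \tau')$ and $P(\sup_{\tau \in \Delta} \|\hat\gamma_{b,l}^{u}(\tau)\| = 0) \to 1$;

  2. $n^{1/2} \xi^\top \{\hat\gamma_l^{u}(\tau) - \gamma_0(\tau)\}$ and $n^{1/2} \xi^\top \{\hat\gamma(\tau) - \gamma_0(\tau)\}$ are asymptotically equivalent.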

Theorem 4 indicates that the proposed constrained estimator is asymptotically equivalent to an unconstrained estimator that uses a relevant variable as the reference. The latter approach however requires preliminary knowledge about truly relevant variables, which may not be available in real applications.

By our theorems, the technical constraints for $s$ include $(r+s)^3 \log^2 \max\{n, r+p\} = o(n)$ and $(r+s)^3 \log^4 n = o(n)$. When $p = O(n^a)$ ($a > 0$), we can allow $s$ to be close to but smaller than $o(n^{1/3})$, which is the fastest model size growth rate derived in Welsh (1989) and He and Shao (2000) for an unpenalized quantile regression estimator to achieve asymptotic normality. Proofs of the theorems are provided in the Supplementary Materials (Section S4).

4. Simulation Studies

In this section, we carry out simulation studies to evaluate the finite sample performance of the proposed method. We consider the sample size n = 100 and generate Y(t) based on the assumed quantile regression model with r = 4 and p = 400. Specifically, we generate the longitudinal observation times ti(k), k = 1, …, mi, from a standard Poisson process, where mi is the integer part of 2+Ui with Ui ~ Uniform(0, 2). With r = 4, we generate Xi1 from Uniform(0, 1) and Xi2 from Bernoulli(0.5). For each observed time point t=ti(k), we first generate a p-dimensional vector Z˜i(t)=(Z˜i1(t),,Z˜ip(t)) from a multivariate normal distribution Np(0, Σ), where Σ = (ρ|ij|) with ρ = 0.5. Next, we set Zˇij(t)=Φ(Z˜ij(t)) for j ≠ 7 and Zˇi7(t)=Φ(Z˜i7(t)), and then standardize Zˇij(t) so that its second moment equals 1, where Φ(·) is the standard normal distribution function and j = 1, …, p. The standardized Zˇij(t)’s (j = 1, …, p) form the covariate vector Zi(t).
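Parts of this data-generating scheme can be sketched in Python. The sketch below covers only the observation times and the correlated latent covariates mapped through $\Phi$; the standardization step and any special treatment of the seventh component are deliberately left out. It rests on two assumptions not stated in the text: that the observation times are the first $m_i$ arrival times of a rate-one Poisson process, and that the correlation structure is first-order autoregressive, $\rho^{|j-k|}$, which permits a simple recursion.

```python
import random
from statistics import NormalDist

def observation_times(rng):
    """m = integer part of 2 + U with U ~ Uniform(0, 2); times are cumulative
    Exp(1) gaps, i.e. arrival times of a rate-one Poisson process (assumption)."""
    m = int(2 + rng.uniform(0.0, 2.0))
    times, t = [], 0.0
    for _ in range(m):
        t += rng.expovariate(1.0)
        times.append(t)
    return times

def ar1_normal_vector(p, rho, rng):
    """Draw from N_p(0, Sigma) with Sigma[j][k] = rho ** |j - k| via an AR(1) recursion."""
    z = [rng.gauss(0.0, 1.0)]
    scale = (1.0 - rho * rho) ** 0.5
    for _ in range(p - 1):
        z.append(rho * z[-1] + scale * rng.gauss(0.0, 1.0))
    return z

def to_unit_interval(z):
    """Apply the standard normal distribution function componentwise."""
    phi = NormalDist().cdf
    return [phi(v) for v in z]

rng = random.Random(2023)          # arbitrary seed for reproducibility
times = observation_times(rng)
latent = [to_unit_interval(ar1_normal_vector(400, 0.5, rng)) for _ in times]
```

Each subject then has two or three visits, with one 400-dimensional transformed vector per visit.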

To generate the longitudinal responses, we consider the following four setups: Setup (I): Data are generated from a longitudinal linear model with independent homogeneous errors,

Yi(t)=Xi1+Xi2t+Zi(t)b+ϵi(t),

where b = (1, 0.8, 0.9, 1, 2, −1.5, −4.2, 0, …, 0), ϵi(t) ~ N(0, 1) for any t > 0, and ϵi(t) and ϵi(t′) are independent for t > 0, t′ > 0 and t ≠ t′.

Setup (II): Data are generated from a longitudinal linear model with dependent homogeneous errors,

Yi(t)=Xi1+Xi2t+Zi(t)b+ai+ϵi(t),

where b = (1, 0.8, 0.9, 1, 2, −1.5, −4.2, 0, …, 0), ai ~ N(0, 1/2), ϵi(t) ~ N(0, 1/2) for t > 0, ϵi(t) and ϵi(t′) are independent for t > 0, t′ > 0 and t ≠ t′, and ai and ϵi(t) are independent for t > 0.

Setup (III): Data are generated from a longitudinal linear model with independent heterogeneous errors,

Yi(t)=Xi1+Xi2t+Zi(t)b1+(Xi1+Zi(t)b2)ϵi(t),

where b1 = b = (1, 0.9, 0.75, 0.5, 0.8, 1, −4.95, 0, …, 0), b2 = (0, 0.25, 0, 1, 0, 0, −1.25, 0, …, 0), ϵi(t) ~ N(0, 1) for any t > 0, and ϵi(t) and ϵi(t′) are independent for t > 0, t′ > 0 and t ≠ t′.

Setup (IV): Data are generated from a longitudinal linear model with dependent heterogeneous errors,

Yi(t)=Xi1+Xi2t+Zi(t)b1+(Xi1+Zi(t)b2)(ai+ϵi(t)),

where b1 = b = (1, 0.8, 0.9, 1, 2, −1.5, −4.2, 0, …, 0), b2 = (0, 0.2, 0, 0.1, 0, 0, −0.3, 0, …, 0), ai ~ N(0, 1/2) and ϵi(t) ~ N(0, 1/2) for t > 0, ϵi(t) and ϵi(t′) are independent for t > 0, t′ > 0 and t ≠ t′, and ai and ϵi(t) are independent for t > 0.

Under Setups (I) and (II), we can show that

QYi(t){τXi(t),Zi(t)}=Qe(τ)Xi1+Xi2t+Zi(t)b,

where Qe(τ) is the τ–th quantile of standard normal distribution. Under Setups (III) and (IV), we can show that

QYi(t){τXi(t),Zi(t)}={1+Qe(τ)}Xi1+Xi2t+Zi(t){b1+b2Qe(τ)}.

In all setups, the true regression coefficients for Zi(t) satisfy the zero-sum constraint at each τ.
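The zero-sum property can be verified directly from the listed coefficient vectors, since the quantile-specific coefficient of $Z_i(t)$ is $b$ in Setups (I) and (II) and $b_1 + b_2 Q_e(\tau)$ in Setups (III) and (IV). The Python sketch below performs this check on an evenly spaced grid over $[0.1, 0.9]$; only the nonzero leading entries of each vector are used, and the grid and variable names are choices made for this example.

```python
from statistics import NormalDist

# Leading nonzero entries of the coefficient vectors listed in Setups (I)-(IV).
b = [1, 0.8, 0.9, 1, 2, -1.5, -4.2]
b1_iii, b2_iii = [1, 0.9, 0.75, 0.5, 0.8, 1, -4.95], [0, 0.25, 0, 1, 0, 0, -1.25]
b1_iv, b2_iv = b, [0, 0.2, 0, 0.1, 0, 0, -0.3]

def coefficient_at(tau, b1, b2):
    """Quantile-specific coefficient b1 + b2 * Q_e(tau), Q_e = standard normal quantile."""
    q = NormalDist().inv_cdf(tau)
    return [u + v * q for u, v in zip(b1, b2)]

taus = [0.1 + 0.025 * k for k in range(33)]   # 0.1, 0.125, ..., 0.9
max_abs_sum = max(
    abs(sum(coefficient_at(t, b1, b2)))
    for t in taus
    for b1, b2 in [(b, [0] * 7), (b1_iii, b2_iii), (b1_iv, b2_iv)]
)

# In Setup (III) the fourth coefficient, 0.5 + Q_e(tau), vanishes at one level only.
tau_zero = NormalDist().cdf(-0.5)
fourth_at_zero = coefficient_at(tau_zero, b1_iii, b2_iii)[3]
```

The second part illustrates why a globally concerned definition of relevance matters: the fourth covariate in Setup (III) has no effect at $\tau = \Phi(-0.5)$, roughly 0.31, yet it is relevant at all other levels in $\Delta$.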

We evaluate the finite-sample performance of the proposed globally adaptive Lasso estimators with weights (w2) and (w3), denoted by AW2 and AW3 respectively. We set $\Delta = [0.1, 0.9]$ and the $\tau$-grid $S_n$ as $\{0.1 < 0.125 < \cdots < 0.9\}$. We select the tuning parameter $\lambda$ by the GIC with $\phi_n = \log(\log n) \log p / n$, except for that in the initial estimator. The candidate values for $\lambda$ include $N/4$ equally spaced grid points between $N/150$ and $N/15$, where $N = \sum_{i=1}^{n} m_i$ is the total number of longitudinal observations. We adapt Belloni and Chernozhukov (2011)’s method over $\Delta$ to get the estimator $\check\beta(\tau)$ for calculating the adaptive weight functions.

We compare AW2 and AW3 to the locally concerned adaptive Lasso estimator at a single predetermined quantile level $\tau = 0.2$, $0.5$, or $0.8$, denoted by SS($\tau$), as well as the pointwise approach, which simply combines the estimates from SS($\tau$) over $\tau \in \Delta$, denoted by PS. We also consider four other benchmark estimation procedures, ALasso (i), ALasso (ii), ALasso (iii), and ALasso (iv). The ALasso (i) estimators are the unconstrained estimators obtained by minimizing (3.4) with the reference, the $l$th component, randomly chosen. The ALasso (ii) estimators are the globally adaptive estimators derived from model (2.2) without considering the zero-sum constraint. That is, the ALasso (ii) estimators, $(\hat\alpha^{(ii)}(\tau), \hat\beta^{(ii)}(\tau))$, are obtained as

$$\arg\min_{\alpha, \beta} \Big\{ \frac{1}{n} \sum_{i=1}^{n} \int_0^\infty \rho_\tau\{Y_i(t) - X_i(t)^\top \alpha - Z_i(t)^\top \beta\}\, dN_i(t) + \lambda \sum_{j=1}^{p} \omega_j(\tau) |\beta_j| \Big\}.$$

ALasso (iii) estimator is obtained from fitting the log-contrast model based on the relevant variables selected by ALasso (ii) approach. ALasso (iv) estimator is obtained by solving the minimization problem (2.3) without including the zero-sum constraint, using the selected relevant variables to fit a log-contrast model, and then selecting the tuning parameter based on the GIC criterion and determining the final estimator.

We assess the variable selection performance of the different methods described above in terms of mean number of correctly identified relevant variables (NCN), mean number of incorrectly selected variables (NIN), percentage of under-fitted models (PUUF), percentage of correctly fitted models (PCF), and percentage of over-fitted models (POF). To evaluate the global estimation accuracy over τ ∈ Δ, we consider three different types of average estimation errors, AEE1, AEE2 and AEE, where

$$\mathrm{AEE}_q=\frac{1}{|\Delta|}\int_{\Delta}\|\hat{\beta}(\tau)-\beta^{*}(\tau)\|_{q}\,d\tau,\quad q=1,2,\infty.$$

For SS(τ), we calculate the average estimation errors by extending the coefficient estimate to a constant coefficient function over τ ∈ Δ. To assess how well the estimated coefficients satisfy the zero-sum constraint, we adopt the criterion SUM, defined as $\mathrm{SUM}=\sum_{j=1}^{p}\beta_j(\tau^{*})$, where βj(·) stands for the estimated coefficient function and $\tau^{*}=\arg\max_{\tau\in\Delta}|\sum_{j=1}^{p}\beta_j(\tau)|$. Better performance is indicated by NCN closer to 7, the true number of relevant covariates; PCF closer to 100%; NIN, PUF and POF closer to 0; smaller AEE1, AEE2 and AEE∞; and SUM closer or equal to 0.
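As a rough illustration of these two accuracy criteria, the sketch below approximates the average estimation error over a τ-grid by the trapezoidal rule and evaluates SUM at the grid point with the largest absolute coefficient sum. Restricting τ to a finite grid and the choice of quadrature rule are assumptions made for illustration; the function names are hypothetical.

```python
import numpy as np

def aee(beta_hat, beta_true, taus, q):
    """Average L_q estimation error over the tau-grid (q = 1, 2 or np.inf).

    beta_hat, beta_true: arrays of shape (len(taus), p), one row per tau.
    """
    diff = np.asarray(beta_hat, dtype=float) - np.asarray(beta_true, dtype=float)
    norms = np.linalg.norm(diff, ord=q, axis=1)      # L_q norm at each tau
    taus = np.asarray(taus, dtype=float)
    # Trapezoidal approximation of the integral over Delta
    integral = np.sum(np.diff(taus) * (norms[1:] + norms[:-1]) / 2.0)
    return integral / (taus[-1] - taus[0])

def sum_criterion(beta_hat):
    """Coefficient sum at the grid point where its absolute value is largest."""
    sums = np.asarray(beta_hat, dtype=float).sum(axis=1)
    return sums[np.argmax(np.abs(sums))]
```

An estimator that satisfies the zero-sum constraint at every grid point gives `sum_criterion` equal to zero up to floating-point error.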

The simulation results for setups (I)–(IV) are presented in Table S1, Table S2, Table S3, and Table 1 respectively, where Tables S1–S3 are provided in the Supplementary Materials. The results are summarized over 300 replicates. As seen from these tables, the proposed estimators with the globally adaptive weights, AW2 and AW3, perform well in all setups, whether the error terms are homogeneous or heterogeneous, and whether they are independent or dependent across time points. In all setups, the PCFs based on these estimators are around or above 85% and the zero-sum constraint is always met by the estimated coefficient functions. As shown by additional simulations reported in the Supplementary Materials (see Tables S4–S5), the PCFs can further increase as the variance of the longitudinal error decreases. In setups (I) and (II) where the effects of Z(t) are constant over τ, the estimation accuracy is comparable between the proposed globally adaptive estimators and the locally concerned estimators, SS(τ), τ = 0.2, 0.5, 0.8. However, variable selection based on SS(τ) is more likely to miss relevant variables, as reflected by the higher PUFs, particularly when τ = 0.2 or 0.8. In setups (III) and (IV) where the effects of Z(t) are not constant over τ, SS(τ) performs much worse in variable selection than AW2 and AW3. This may contribute to the deterioration in the average estimation errors for SS(τ) observed in setups (III) and (IV). In all setups, the PS method produces average estimation errors similar to those of AW2 and AW3. However, the PS method tends to overfit, with POF equal to 31.7% in setup (I), 26.3% in setup (II), and 23% in setups (III) and (IV).
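The selection summaries reported in these tables could be tallied across replicates along the following lines. The text does not spell out the classification rules, so this sketch assumes that a fitted model is under-fitted if it misses at least one relevant variable, correctly fitted if it selects exactly the relevant set, and over-fitted if it contains all relevant variables plus at least one irrelevant one. The function name is hypothetical.

```python
def selection_summary(selected_sets, true_set):
    """Summarize variable selection over simulation replicates.

    selected_sets: iterable of collections of selected variable indices.
    true_set: collection of truly relevant variable indices.
    Returns NCN, NIN (means) and PUF, PCF, POF (percentages).
    """
    true_set = set(true_set)
    reps = 0
    ncn = nin = under = correct = over = 0
    for sel in selected_sets:
        sel = set(sel)
        hits = len(sel & true_set)        # correctly identified relevant variables
        false_pos = len(sel - true_set)   # incorrectly selected variables
        ncn += hits
        nin += false_pos
        if hits < len(true_set):
            under += 1                    # assumed rule: misses a relevant variable
        elif false_pos == 0:
            correct += 1
        else:
            over += 1
        reps += 1
    return {
        "NCN": ncn / reps,
        "NIN": nin / reps,
        "PUF": 100.0 * under / reps,
        "PCF": 100.0 * correct / reps,
        "POF": 100.0 * over / reps,
    }
```

Under these rules the three percentages always add up to 100, which matches the rows of the reported tables.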

Table 1:

Simulation results under Setup (IV) with dependent heterogeneous errors

| Procedure | Method | AEE1 | AEE2 | AEE∞ | NCN | NIN | PUF (%) | PCF (%) | POF (%) | SUM |
|---|---|---|---|---|---|---|---|---|---|---|
| Proposed | AW2 | 2.261 | 1.024 | 0.694 | 6.923 | 0.040 | 7.7 | 88.3 | 4.0 | 0.000 |
| | AW3 | 2.297 | 1.041 | 0.707 | 6.877 | 0.017 | 12.0 | 86.7 | 1.3 | 0.000 |
| | SS(0.2) | 2.828 | 1.275 | 0.841 | 6.110 | 0.007 | 57.3 | 42.0 | 0.7 | 0.000 |
| | SS(0.5) | 1.908 | 0.863 | 0.577 | 6.670 | 0.017 | 29.0 | 69.3 | 1.7 | 0.000 |
| | SS(0.8) | 2.623 | 1.206 | 0.829 | 6.273 | 0.023 | 43.3 | 54.7 | 2.0 | 0.000 |
| | PS | 2.277 | 1.034 | 0.702 | 6.970 | 0.257 | 3.0 | 74.0 | 23.0 | 0.000 |
| ALasso (i) | AW2 | 2.456 | 1.067 | 0.706 | 6.917 | 1.030 | 8.0 | 0.7 | 91.3 | 0.000 |
| | AW3 | 2.491 | 1.084 | 0.718 | 6.860 | 1.010 | 13.3 | 0.7 | 86.0 | 0.000 |
| ALasso (ii) | AW2 | 2.326 | 1.064 | 0.725 | 6.913 | 0.033 | 8.7 | 88.3 | 3.0 | 1.916 |
| | AW3 | 2.352 | 1.076 | 0.734 | 6.857 | 0.017 | 14.0 | 84.3 | 1.7 | −2.315 |
| ALasso (iii) | AW2 | 2.549 | 1.146 | 0.770 | 6.913 | 0.033 | 8.7 | 88.3 | 3.0 | 0.000 |
| | AW3 | 2.668 | 1.199 | 0.811 | 6.857 | 0.017 | 14.0 | 84.3 | 1.7 | 0.000 |
| ALasso (iv) | AW2 | 2.294 | 1.030 | 0.685 | 6.963 | 0.380 | 3.7 | 63.7 | 32.7 | 0.000 |
| | AW3 | 2.305 | 1.035 | 0.690 | 6.957 | 0.347 | 4.3 | 65.0 | 30.7 | 0.000 |

When examining the results from the globally adaptive estimators under ALasso (i), we note a common overfitting problem associated with adopting the unconstrained log-contrast model. This is because the ALasso (i) procedure automatically includes the reference compositional covariate, which may not be a truly relevant covariate. The results under ALasso (ii) suggest that the underlying zero-sum constraint on the coefficients is not satisfied unless it is explicitly accounted for in the estimation procedure. In such a situation, interpreting the resulting coefficient estimates as the effects of compositional covariates would be problematic. The ALasso (iii) approach renders satisfactory rates of correct fitting but yields larger estimation errors than the proposed method. The ALasso (iv) method tends to overfit, with percentages of overfitting above 25%. The enlarged estimation errors and the overfitting behavior reflect the disadvantage of handling the zero-sum constraint separately from model estimation and variable selection. In summary, the simulation results suggest the importance of the proposed globally adaptive estimators as well as their satisfactory empirical performance.

5. A real data example

We applied the proposed method to a longitudinal dataset from the Feeding Infants Right from the STart (FIRST) study. The FIRST study is an ongoing prospective observational study, which has enrolled and followed up children with cystic fibrosis (CF) from the neonatal period. In this study, various diet-related biomarkers have been collected repeatedly at pre-specified CF care visits. For example, fecal specimens were collected at approximately 2, 4, 6, 8, and 12 months of age for each child. The gut microbiome composition data were extracted from the fecal specimens by 16S rRNA gene pyrosequencing and comprise relative abundances of 364 unique genera subject to the unit-sum constraint. The levels of calprotectin, a biomarker for inflammation in the gastrointestinal (GI) tract, were also tracked over time and recorded in micrograms per gram of stool. In our analysis of the FIRST dataset, the specific question of interest is how the gut microbiome composition is associated with the calprotectin level over time. Identifying the subcompositional bacterial taxa that are linked to the variations in calprotectin can provide useful insight into the early CF disease mechanism.

The final dataset includes 135 subjects and a total of 328 longitudinal records after the exclusion of 7 children with low birth weight. Table S6 in the Supplementary Materials presents basic summary statistics for gender, number of longitudinal records, and calprotectin levels. It is shown that 56% of subjects are boys, and about 50% of subjects have 3 or 4 longitudinal records. It is also noted that the calprotectin levels present a skewed distribution with median (= 64.5) considerably smaller than mean (= 111.2). In this case, adopting longitudinal quantile regression modeling can deliver a more comprehensive and robust view about how the gut microbiome composition influences calprotectin levels.

In our analysis, we implement the proposed globally adaptive methods with the adaptive weights (w2) and (w3) and Δ = (0.2, 0.8] (denoted by AW2 and AW3 respectively), the locally concerned adaptive-Lasso method SS(τ) with τ = 0.2, 0.3, …, 0.8, and the pointwise method (denoted by PS), which takes the union of the sets selected by SS(τ) with τ = 0.2, 0.225, …, 0.8. We include gender as a regular covariate. The compositional covariates are the relative abundances of 364 genera measured from the gut microbiome samples. We exclude six genera that have relative abundance below the detection limit in all samples. In addition, we replace all non-detectable relative abundances by an extremely small constant, $10^{-20}$, which is much smaller than the minimum nonzero relative abundance captured in our dataset, $4.418 \times 10^{-6}$. For the tuning parameter selection, the candidate values of λ consist of N/4 equally spaced grid points between N/150 and N/15, where N = 328. To avoid selecting boundary values of λ, ϕn in the GIC criterion is chosen as log(log n) log p/(20n) for both the globally concerned and the locally concerned quantile regression. Estimates below $10^{-4}$ are shrunk to zero.
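The preprocessing steps described above might look as follows in code. Several details are assumptions made only for illustration: non-detectable abundances are taken to be stored as zeros, the compositional covariates are assumed to enter the model on the log scale (as in a log-contrast formulation), no renormalization is applied after the replacement, and "below 10^{-4}" is read as "below 10^{-4} in absolute value". The function names are hypothetical.

```python
import numpy as np

def prepare_log_composition(abund, floor=1e-20):
    """Drop genera never detected, floor remaining zeros, and take logs.

    abund: (N, p) array of relative abundances, zeros meaning non-detectable
    (an assumption). Returns the log-abundance matrix and the boolean mask
    of genera that were kept.
    """
    abund = np.asarray(abund, dtype=float)
    keep = (abund > 0).any(axis=0)           # detected in at least one sample
    kept = abund[:, keep]
    kept = np.where(kept > 0, kept, floor)   # replace non-detectable values
    return np.log(kept), keep

def hard_threshold(coef, cutoff=1e-4):
    """Set coefficient estimates smaller than cutoff in absolute value to zero."""
    coef = np.asarray(coef, dtype=float)
    return np.where(np.abs(coef) < cutoff, 0.0, coef)
```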

To evaluate each method, we compute prediction errors as follows. We first randomly split the 135 subjects into a training set of size 120 and a testing set of size 15. We apply the method to the training data set and obtain the estimator of $(\alpha_0(\tau), \beta_0(\tau))$, denoted by $(\hat{\alpha}_{\mathrm{train}}(\tau), \hat{\beta}_{\mathrm{train}}(\tau))$. Then we calculate the prediction error in the testing set as

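$$\mathrm{PE}(\Delta)=\frac{\sum_{i\in T}\int_{\Delta}\int_{0}^{\infty}\rho_{\tau}\{Y_i(t)-X_i(t)^{\top}\hat{\alpha}_{\mathrm{train}}(\tau)-Z_i(t)^{\top}\hat{\beta}_{\mathrm{train}}(\tau)\}\,dN_i(t)\,d\tau}{\sum_{i=1}^{n}1\{i\in T\}},$$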

where T denotes the test set. For SS(τ), we calculate PE(Δ) by treating the coefficient estimate as a constant valued function over τ ∈ Δ.
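A minimal sketch of this evaluation is given below, assuming long-format test data (one row per visit), coefficient estimates stored on a τ-grid, and a trapezoidal approximation of the integral over Δ. The splitting is done at the subject level so that all visits of a child fall on the same side of the split. The function names and the grid-based quadrature are illustrative assumptions.

```python
import numpy as np

def split_subjects(subject_ids, n_test, rng):
    """Randomly pick n_test subjects; return a row mask for the test set."""
    ids = np.unique(subject_ids)
    test_ids = rng.choice(ids, size=n_test, replace=False)
    return np.isin(subject_ids, test_ids), test_ids

def prediction_error(y, X, Z, alpha_hat, beta_hat, taus, n_test_subjects):
    """Integrated check loss on test rows, averaged over test subjects.

    alpha_hat: (K, d) and beta_hat: (K, p), with row k fitted at taus[k].
    """
    taus = np.asarray(taus, dtype=float)
    losses = np.empty(len(taus))
    for k, tau in enumerate(taus):
        r = y - X @ alpha_hat[k] - Z @ beta_hat[k]
        losses[k] = np.sum(r * (tau - (r < 0)))   # check loss summed over visits
    # Trapezoidal approximation of the integral over Delta
    integral = np.sum(np.diff(taus) * (losses[1:] + losses[:-1]) / 2.0)
    return integral / n_test_subjects
```

For a single-quantile fit such as SS(τ), the same function applies with every row of `alpha_hat` and `beta_hat` set to the one fitted coefficient vector.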

Table 2 lists the genus sets selected by the different methods. The average prediction errors (PE), along with the corresponding standard deviations (within parentheses), are also presented. The calculations of PE are based on 200 random splits into training and test sets. From Table 2, we observe that the selected genus sets vary considerably across the locally concerned methods, SS(τ) with different choices of τ. These observations suggest that some genera may have varying influences on different quantiles of calprotectin level, and in part, may also reflect the variable selection instability associated with SS(τ) (Zheng et al., 2015). For example, the genus “g115” may only affect median calprotectin but not the other lower or upper quantiles of calprotectin. In contrast, the proposed globally concerned methods give robust and yet parsimonious selection of genus sets. For example, the selected genus sets are almost identical between AW2 and AW3. Most of the selected genera are also selected by one of the SS(τ) methods. Naively pooling the results from the SS(τ) methods, as done by the PS method, leads to the selection of an excessive number of genera (i.e. 26 genera). Some genera selected by SS(τ) but not by AW2 or AW3 are possibly “false positives”, as suggested by the apparent overfitting behavior of the PS method demonstrated in the simulation studies. Moreover, the proposed method AW3 yields the smallest prediction error, and the prediction error of AW2 is close to the second smallest one. The locally concerned SS(τ) methods produce larger prediction errors because they neglect important genera that do not show effects at the τ-th quantile but are relevant to other quantiles. In summary, the proposed globally adaptive methods strike the best balance between parsimonious variable selection and accurate prediction, while retaining sensible interpretations by satisfying the zero-sum constraint on the coefficients.

Table 2:

Analysis of the FIRST dataset

| τ | Method | Selected Genus Sets | PE |
|---|---|---|---|
| [0.2, 0.8] | AW2 | g50 g93 g115 g137 g147 g152 g162 g178 g184 g197 g204 g210 g213 g219 g297 g319 g370 | 0.5279 (0.0837) |
| | AW3 | g50 g93 g115 g147 g152 g162 g178 g197 g204 g210 g213 g219 g297 g319 g370 | 0.5271 (0.0837) |
| | PS | g14 g32 g50 g64 g93 g115 g119 g137 g147 g152 g153 g162 g178 g183 g184 g188 g193 g197 g199 g204 g210 g213 g219 g297 g319 g370 | 0.5278 (0.0826) |
| 0.2 | SS | g147 g153 g213 | 0.7893 (0.1420) |
| 0.3 | SS | None | 0.6833 (0.1206) |
| 0.4 | SS | g50 g93 g119 g147 g162 g183 g197 g199 g204 g213 g297 | 0.6135 (0.1038) |
| 0.5 | SS | g14 g115 g137 g147 g193 g197 g204 g213 g219 g297 g319 | 0.5855 (0.0976) |
| 0.6 | SS | None | 0.5960 (0.0981) |
| 0.7 | SS | g147 g152 g178 g184 g197 g204 g213 g297 g319 g370 | 0.6621 (0.1025) |
| 0.8 | SS | g32 g147 g152 g162 g178 g197 g204 g213 | 0.7926 (0.1235) |

6. Discussions

In this work, we develop a globally concerned longitudinal quantile regression framework which accommodates high-dimensional compositional covariates. The proposed method can achieve the oracle convergence rate as well as the global model selection consistency, while enjoying interpretative advantages.

The longitudinal quantile regression model adopted in this work assumes that covariate effects do not change over time. To accommodate temporal covariate effects, model (2.1) or (2.2) can be extended with regression coefficients formulated as bivariate functions of τ and t. An intuitive approach to tackle this extension is to combine the proposed method with the strategy of Park and He (2017). That is, the longitudinal loss function may be modified by incorporating spline approximations to the regression coefficient functions with the penalty term adjusted accordingly. Nevertheless, this approach may be computationally prohibitive given the additional high-dimensional layer induced by the spline approximations. More specifically, suppose there are L spline basis functions, and $L = O(n^{1/5})$. Based on the proposed estimation for model (2.1), the computational complexity is about $O(n^2 \cdot p \cdot M(n))$, according to the result of Klee and Minty (1972) for the simplex algorithm. When considering the spline-based estimation for the extended model with time-varying coefficients, we expect that the computational intensity would be roughly equivalent to that of fitting a quantile regression model for a dataset with sample size nM(n) and covariate dimension pL, which is about $O(n^2 M(n)^2 p L)$. Given M(n) = O(n), as suggested by Zheng et al. (2015), tackling the more flexible model with time-varying coefficients would require $O(n^{6/5})$ times the computational effort needed for the proposed model (2.1), which can be computationally prohibitive for high-dimensional applications. How to address such an obstacle merits future research.
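The factor quoted above follows from dividing the two complexity orders and substituting the stated rates for M(n) and L:

$$\frac{O(n^{2}M(n)^{2}pL)}{O(n^{2}\,p\,M(n))}=O(M(n)\,L)=O(n\cdot n^{1/5})=O(n^{6/5}).$$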

After applying the proposed method to a real dataset, assessing the adequacy of model (2.1) with the pre-specified quantile index set Δ and the selected relevant variables may be of practical interest. To this end, we can adapt the model checking strategy of Peng and Huang (2008), and consider the stochastic process

$$K_n(\tau)=n^{-1/2}\sum_{i=1}^{n}\int_{0}^{\infty}W(V_i(t))\,\psi_{\tau}\{Y_i(t)-X_i(t)^{\top}\hat{\alpha}(\tau)-Z_i(t)^{\top}\hat{\beta}(\tau)\}\,dN_i(t),$$

as an analogue of the martingale-based diagnostic process employed by Peng and Huang (2008), where ψτ(u) = τ − I(u < 0). Here W(·) is a known bounded function and Vi(t) = (Xi(t), Zi(t)). A lack-of-fit test statistic may be constructed based on supτ∈Δ |Kn(τ)|. Following the lines of Peng and Huang (2008), the corresponding p-value can be obtained by using a properly designed resampling scheme to approximate the distribution of Kn(·) under model assumption (2.1).
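The observed value of such a statistic could be computed along the following lines, taking ψ_τ(u) = τ − I(u < 0), the usual quantile score. This is a sketch only: the elementwise tanh used for W, the restriction of τ to a finite grid, and taking the supremum as the largest absolute coordinate are all assumptions made for illustration, and the resampling step needed for a p-value is not shown.

```python
import numpy as np

def lack_of_fit_process(y, X, Z, alpha_hat, beta_hat, taus, n_subjects, W=np.tanh):
    """Evaluate K_n(tau) on a tau-grid from long-format data.

    alpha_hat: (K, d) and beta_hat: (K, p), with row k fitted at taus[k].
    W is applied elementwise to V = (X, Z); tanh is an illustrative bounded choice.
    Returns an array of shape (K, d + p).
    """
    V = np.hstack([X, Z])
    WV = W(V)
    K = np.empty((len(taus), V.shape[1]))
    for k, tau in enumerate(taus):
        r = y - X @ alpha_hat[k] - Z @ beta_hat[k]
        psi = tau - (r < 0)                  # psi_tau(u) = tau - 1{u < 0}
        K[k] = WV.T @ psi / np.sqrt(n_subjects)
    return K

def sup_statistic(K):
    """Largest absolute coordinate of K_n over the tau-grid."""
    return float(np.max(np.abs(K)))
```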

Following the idea of weighted GEE (Liang and Zeger, 1986) and the quasi-likelihood approach for median regression (Jung, 1996), we may incorporate within-subject correlation of repeated measures to further improve estimation efficiency of the proposed method. Specifically, we may consider a weighted penalized estimating equation,

$$n^{-1/2}\sum_{i=1}^{n}V_i\,Q_i(\tau;\alpha,\beta)^{-1}S_i(\tau;\alpha,\beta)+\lambda\sum_{j=1}^{p}\omega_j(\tau)\,\mathrm{sign}(\beta_j)=0,$$

subject to the constraint $\sum_{j=1}^{p}\beta_j=0$, where $V_i=(V_i(t_i^{(1)}),\ldots,V_i(t_i^{(m_i)}))$, $S_i(\tau;\alpha,\beta)=(S_{i1}(\tau;\alpha,\beta),\ldots,S_{i,m_i}(\tau;\alpha,\beta))^{\top}$ with $S_{ik}(\tau;\alpha,\beta)=I(Y_i(t_i^{(k)})-X_i(t_i^{(k)})^{\top}\alpha-Z_i(t_i^{(k)})^{\top}\beta\le 0)-\tau$, and Qi(τ; α, β) is a working covariance matrix that approximates the covariance of Si(τ; α, β). When Qi(τ; α, β) is an identity matrix $I_{m_i}$, solving this estimating equation is equivalent to minimizing (2.3), which adopts the working independence assumption. However, we note that the weighted estimating equation loses the nice monotonicity property possessed by the unweighted version. In addition, the covariance of Si(τ; α, β) is often unknown in practice and its empirical estimate may not be stable when the sample size is not large, as in the FIRST dataset. As suggested by a referee, one possible solution to alleviate the computational issue is to adopt an iterative algorithm where one first solves the weighted estimating equation with the parameters α and β in the weight function $Q_i(\tau;\alpha,\beta)^{-1}$ fixed and then updates the weight function with the resulting parameter estimates. In this case, the estimating equation involved in each iteration is still monotone. Applying this strategy may lead to an approach that improves estimation efficiency while being computationally viable. Investigating this weighted method in more detail is of future research interest.
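For illustration, the unpenalized part of this weighted estimating function can be evaluated as below, with S_ik taken as the indicator that the k-th residual of subject i is at most zero, minus τ. The working covariance is supplied by the user as a function of the number of visits; both that interface and the function names are hypothetical, and the penalty term, the zero-sum constraint, and the iterative updating of the weights are deliberately left out.

```python
import numpy as np

def subject_scores(y_i, X_i, Z_i, alpha, beta, tau):
    """S_ik = 1{residual at visit k <= 0} - tau for one subject."""
    r = y_i - X_i @ alpha - Z_i @ beta
    return (r <= 0).astype(float) - tau

def weighted_estimating_function(subjects, alpha, beta, tau, working_cov):
    """n^{-1/2} * sum_i V_i Q_i^{-1} S_i, held at fixed (alpha, beta).

    subjects: list of (y_i, X_i, Z_i) with m_i rows each.
    working_cov: function m -> (m, m) working covariance matrix (assumed form).
    """
    n = len(subjects)
    total = 0.0
    for y_i, X_i, Z_i in subjects:
        S_i = subject_scores(y_i, X_i, Z_i, alpha, beta, tau)
        V_i = np.hstack([X_i, Z_i]).T          # columns are the visit-level covariates
        Q_i = working_cov(len(y_i))
        total = total + V_i @ np.linalg.solve(Q_i, S_i)
    return total / np.sqrt(n)
```

Passing `working_cov=np.eye` recovers the unweighted, working-independence version.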

Acknowledgements

The authors are grateful for the valuable comments from the Associate Editor and two Referees. This work was partially supported by National Institutes of Health grants R01HL113548 and R01DK209692, the National Natural Science Foundation of China (11901200, 71931004) and the Shanghai Pujiang program (19PJ1403400).

Footnotes

Supplementary Materials

The detailed proofs of lemmas and theorems and additional simulation studies are provided in the Supplementary Materials.

References

  1. Aitchison J (1982). The statistical analysis of compositional data. Journal of the Royal Statistical Society. Series B (Methodological) 44, 139–177.
  2. Aitchison J (2003). The Statistical Analysis of Compositional Data. Caldwell, New Jersey: Blackburn Press.
  3. Aitchison J and Bacon-Shone J (1984). Log contrast models for experiments with mixtures. Biometrika 71, 323–330.
  4. Belloni A and Chernozhukov V (2011). ℓ1-penalized quantile regression in high-dimensional sparse models. The Annals of Statistics 39, 82–130.
  5. Cho H, Hong HG, and Kim M-O (2016). Efficient quantile marginal regression for longitudinal data with dropouts. Biostatistics 17(3), 561–575.
  6. Fan J, Fan Y, and Barut E (2014). Adaptive robust variable selection. The Annals of Statistics 42, 324–351.
  7. Fan J and Lv J (2011). Non-concave penalized likelihood with NP-dimensionality. IEEE Transactions on Information Theory 57, 5467–5484.
  8. Fan Y and Tang CY (2013). Tuning parameter selection in high dimensional penalized likelihood. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 75, 531–552.
  9. Gao X and Liu Q (2019). Sparsity identification in ultra-high dimensional quantile regression models with longitudinal data. Communications in Statistics-Theory and Methods, 1–25.
  10. He X and Shao Q-M (2000). On parameters of increasing dimensions. Journal of Multivariate Analysis 73(1), 120–135.
  11. Jung S-H (1996). Quasi-likelihood for median regression models. Journal of the American Statistical Association 91, 251–257.
  12. Kim Y, Choi H, and Oh H-S (2008). Smoothly clipped absolute deviation on high dimensions. Journal of the American Statistical Association 103, 1665–1673.
  13. Klee V and Minty G (1972). How good is the simplex algorithm? In Inequalities III, edited by O. Shisha.
  14. Koenker R (2004). Quantile regression for longitudinal data. Journal of Multivariate Analysis 91(1), 74–89.
  15. Koenker R and Bassett G (1978). Regression quantiles. Econometrica 46, 33–50.
  16. Koenker R and d’Orey V (1987). Algorithm AS 229: Computing regression quantiles. Journal of the Royal Statistical Society, Series C (Applied Statistics) 36, 383–393.
  17. Ledoux M and Talagrand M (1991). Probability in Banach Spaces: Isoperimetry and Processes. Ergebnisse der Mathematik und ihrer Grenzgebiete, Volume 23. Springer, Berlin.
  18. Li Y, Liu Y, and Zhu J (2007). Quantile regression in reproducing kernel Hilbert spaces. Journal of the American Statistical Association 102, 255–268.
  19. Liang K-Y and Zeger SL (1986). Longitudinal data analysis using generalized linear models. Biometrika 73(1), 13–22.
  20. Lin W, Shi P, Feng R, and Li H (2014). Variable selection in regression with compositional covariates. Biometrika 101, 785–797.
  21. Lipsitz SR, Fitzmaurice GM, Molenberghs G, and Zhao LP (1997). Quantile regression methods for longitudinal data with drop-outs: application to CD4 cell counts of patients infected with the human immunodeficiency virus. Journal of the Royal Statistical Society: Series C (Applied Statistics) 46(4), 463–476.
  22. Lu J, Shi P, and Li H (2019). Generalized linear models with linear constraints for microbiome compositional data. Biometrics 75, 235–244.
  23. Lv J and Fan Y (2009). A unified approach to model selection and sparse recovery using regularized least squares. The Annals of Statistics 37, 3498–3528.
  24. Ma H, Peng L, and Fu H. Quantile regression modeling of latent trajectory features with longitudinal data. Journal of Applied Statistics 46, 2884–2904.
  25. Meinshausen N and Bühlmann P (2006). High-dimensional graphs and variable selection with the lasso. The Annals of Statistics 34, 1436–1462.
  26. Nishii R (1984). Asymptotic properties of criteria for selection of variables in multiple regression. The Annals of Statistics 12, 758–765.
  27. Park S and He X (2017). Hypothesis testing for regional quantiles. Journal of Statistical Planning and Inference 191, 13–24.
  28. Peng L and Huang Y (2008). Survival analysis with quantile regression models. Journal of the American Statistical Association 103(482), 637–649.
  29. Portnoy S (1991). Asymptotic behavior of the number of regression quantile breakpoints. SIAM Journal on Scientific and Statistical Computing 12, 867–883.
  30. Shi P, Zhang A, and Li H (2016). Regression analysis for microbiome compositional data. The Annals of Applied Statistics 10, 1019–1040.
  31. Sun X, Peng L, Manatunga A, and Marcus M (2016). Quantile regression analysis of censored longitudinal data with irregular outcome-dependent follow-up. Biometrics 72(1), 64–73.
  32. Talagrand M (2005). The Generic Chaining. Springer, Berlin.
  33. Tyler AD, Smith MI, and Silverberg MS (2014). Analyzing the human microbiome: a “how to” guide for physicians. The American Journal of Gastroenterology 109, 983–993.
  34. van der Vaart A and Wellner J (1996). Weak Convergence and Empirical Processes: With Applications to Statistics. New York: Springer.
  35. Wang H and Fygenson M (2009). Inference for censored quantile regression models in longitudinal studies. The Annals of Statistics 37, 756–781.
  36. Wang L, Wu Y, and Li R (2012). Quantile regression for analyzing heterogeneity in ultrahigh dimension. Journal of the American Statistical Association 101, 1418–1429.
  37. Wang L, Zhou J, and Qu A (2012). Penalized generalized estimating equations for high-dimensional longitudinal data analysis. Biometrics 68(2), 353–360.
  38. Welsh A (1989). On M-processes and M-estimation. The Annals of Statistics 17(1), 337–361.
  39. Xia F, Chen J, Fung W, and Li H (2013). A logistic normal multinomial regression model for microbiome compositional data analysis. Biometrics 69, 1053–1063.
  40. Zhang C-H and Huang J (2008). The sparsity and bias of the Lasso selection in high-dimensional linear regression. The Annals of Statistics 36(4), 1567–1594.
  41. Zheng Q, Gallagher C, and Kulasekera KB (2013). Adaptive penalized quantile regression for high dimensional data. Journal of Statistical Planning and Inference 143, 1029–1038.
  42. Zheng Q, Peng L, and He X (2015). Globally adaptive quantile regression with ultra-high dimensional data. The Annals of Statistics 43, 2225–2258.
  43. Zheng X, Fu B, Zhang J, and Qin G (2018). Variable selection for longitudinal data with high-dimensional covariates and dropouts. Journal of Statistical Computation and Simulation 88(4), 712–725.
  44. Zou H and Yuan M (2008). Composite quantile regression and the oracle model selection theory. The Annals of Statistics 36, 1108–1126.
