Author manuscript; available in PMC: 2019 Apr 30.
Published in final edited form as: Stat Med. 2018 Feb 21;37(9):1577–1586. doi: 10.1002/sim.7619

Variable selection with group structure in competing risks quantile regression

Kwang Woo Ahn a, Soyoung Kim a,*
PMCID: PMC5889760  NIHMSID: NIHMS939135  PMID: 29468710

Abstract

We study the group bridge and the adaptive group bridge penalties for competing risks quantile regression with group variables. While the group bridge consistently identifies non-zero group variables, the adaptive group bridge consistently selects variables not only at group level, but also at within-group level. We allow the number of covariates to diverge as the sample size increases. The oracle property for both methods is also studied. The performance of the group bridge and the adaptive group bridge is compared in simulation and in a real data analysis. The simulation study shows that the adaptive group bridge selects non-zero within-group variables more consistently than the group bridge. A bone marrow transplant study is provided as an example.

Keywords: Adaptive lasso, Competing risks quantile regression, Group bridge

1. Introduction

Quantile regression provides an alternative to the Cox proportional hazards model and the accelerated failure time (AFT) model in survival analysis [1]. It is often preferred when the survival distribution is skewed. There is a rich literature on survival quantile regression. Peng and Huang [2] proposed martingale-based estimating equations. Reich and Smith [3] developed a semiparametric Bayesian quantile regression model for censored data. Yin et al. [4] studied a power-transformed quantile regression model for survival data. Yin and Cai [5] proposed quantile regression models for correlated survival data.

Recently, quantile regression for competing risks data has received much attention. Peng and Fine [1] proposed a semiparametric model based on the competing risks AFT model. Sun et al. [6] developed a regression model when the failure type is missing in competing risks data. Lee and Fine [7] studied parametric and nonparametric methods to make inference on cumulative incidence quantiles.

In spite of increasing popularity of quantile regression for survival and competing risks data, the current literature on variable selection is somewhat limited. Jiang et al. [8] proposed the adaptive lasso for a composite quantile regression with randomly censored data. Wang et al. [9] also studied the adaptive lasso for censored quantile regression. They all studied a survival setting, not a competing risks setting. In addition, their proposed methods addressed variable selection at individual level, not at group level. In practice, clinicians often encounter group variables such as categorical variables. For example, Verneris et al. [10] studied the outcomes of the patients having reduced-intensity conditioning allogeneic hematopoietic cell transplantation from 1999 to 2011. They studied competing risks outcomes including relapse and treatment-related mortality (TRM), where relapse and TRM are competing risks to each other. The variables that they considered for analysis consisted of binary and categorical variables.

Several penalties have been proposed to select group variables for linear regression and competing risks settings. Yuan and Lin [11] proposed the group lasso, which selects variables at group level, not at within-group level. Huang et al. [12] developed the group bridge to select both non-zero group and non-zero within-group variables. However, they studied group selection consistency only and did not show within-group variable selection consistency. Zhou and Zhu [13] proposed an adaptive hierarchical lasso having group variable selection consistency and within-group variable selection consistency. Zhao et al. [14] applied the adaptive hierarchical lasso penalty to identify non-zero variables at both levels for quantile linear regression. Fu et al. [15] extensively studied lasso, adaptive lasso, SCAD, and MCP for individual variable selection and their group variable selection versions for the subdistribution hazards model. However, they did not address within-group variable selection. In addition, their oracle property was limited to a fixed number of covariates. Despite extensive work in group variable selection for linear, linear quantile, and subdistribution hazards regression models, there is little literature on group variable selection in competing risks quantile regression. In particular, group and within-group level variable selection techniques remain unexplored in the current literature to the best of the authors’ knowledge.

We propose the group bridge and the adaptive group bridge for bi-level variable selection, that is, group and within-group variable selection, under the competing risks quantile regression model of Peng and Fine [1]. While the group bridge consistently identifies non-zero group variables, the adaptive group bridge consistently selects non-zero variables at both group level and within-group level. When there is no group structure for variables, individual variable selection can be handled as a special case of the proposed methods. To our knowledge, even individual variable selection has not been studied for competing risks quantile regression. We study the oracle property of both methods while allowing the number of variables to diverge as the sample size increases. We show in a simulation study that the adaptive group bridge identifies non-zero within-group variables more consistently than the group bridge. In Section 2, we describe the proposed methods and study their theoretical properties. In Section 3, we compare the performance of the adaptive group bridge and the group bridge via a simulation study. We illustrate a real data example in Section 4 and conclude briefly in Section 5. All the proofs of the theorems and the lemmas in this paper can be found in the online Supplementary Materials.

2. Method

In this section, we propose a penalized competing risks quantile regression model and study its theoretical properties. We begin with some notation. Without loss of generality, we consider two causes of failure ε ∈ {1,2} with sample size n. We allow the number of covariates dn to increase as n increases. Let Ti, Ci, εi, and $Z_i = (1, Z_{i1}, \ldots, Z_{id_n})^T$ be the event time, censoring time, cause of failure, and covariate vector of subject i for i = 1, …, n. Denote β0(τ) = {βj,0(τ); j = 0, …, dn}T as the true parameter vector given quantile τ, where β0,0(τ) is the true intercept coefficient. Let Xi = Ti ∧ Ci be the observed time and δi = I(Ti ≤ Ci)I(εi = 1), where a ∧ b = min(a, b). We assume that (Ti, εi, Ci, Zi) are independent and identically distributed, and that Ti and Ci are independent given Zi for i = 1, …, n. The study period is [0, L]. Let F1(t|Zi) be the cumulative incidence of cause 1 at time t given Zi, where F1(t|Zi) = P(Ti ≤ t, εi = 1|Zi). Given covariate Z, we define the τth conditional quantile of F1(t|Z) as Q1(τ|Z) = inf{t: F1(t|Z) ≥ τ}. For $\tau \in [\tau_L, \tau_U]$ with $0 < \tau_L < \tau_U < 1$, we consider Q1(τ|Z) = g{ZTβ(τ)}, where g(·) is a known monotone link function. Let ‖·‖ be the Euclidean norm and a⊗2 = aaT for a vector a.
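To make the definition of Q1(τ|Z) concrete, the following minimal sketch assumes a hypothetical cumulative incidence of the form F1(t|Z) = p1 Φ(log t − μ), where μ stands in for a linear predictor and p1 = P(ε = 1). This functional form and the function names are illustrative assumptions only. Because a cumulative incidence function plateaus at p1 rather than at 1, the quantile is finite only for τ < p1.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for Phi and its inverse


def cif_cause1(t, lin_pred, p1):
    """Assumed cumulative incidence F1(t | Z) = p1 * Phi(log t - lin_pred)."""
    return p1 * _STD_NORMAL.cdf(math.log(t) - lin_pred)


def cif_quantile(tau, lin_pred, p1):
    """Q1(tau | Z) = inf{t : F1(t | Z) >= tau} under the assumed F1."""
    if not 0.0 < tau < p1:
        # F1 never exceeds p1, so no finite t reaches a level tau >= p1.
        raise ValueError("tau must lie strictly between 0 and p1")
    return math.exp(_STD_NORMAL.inv_cdf(tau / p1) + lin_pred)
```

Under this assumed form, taking g = exp gives log Q1(τ|Z) = Φ^{-1}(τ/p1) + μ, so only the intercept changes with τ.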

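Let $\tilde{Z}_i = (Z_{i1}, \ldots, Z_{id_n})^T$. For simplicity, we assume that the $Z_i$'s are fixed over time. Let $N_i^G(t) = I(C_i \le T_i)\, I(C_i \le t)$ be the counting process for censoring and $Y_i(t) = I(X_i \ge t)$. We use the Cox proportional hazards model to fit the censoring times $C_i$:

$$\lambda^G(t \mid \tilde{Z}_i) = \lambda_0^G(t)\, e^{\alpha^T \tilde{Z}_i},$$

where $\lambda_0^G(t)$ is an arbitrary baseline hazard function for censoring and $\alpha$ is the unknown parameter vector. Define

$$S_G^{(d)}(\alpha, t) = n^{-1} \sum_{i=1}^{n} Y_i(t)\, \tilde{Z}_i^{\otimes d}\, e^{\alpha^T \tilde{Z}_i},$$

where d = 0, 1, and 2. The baseline cumulative hazard function for censoring, $\Lambda_0^G(t)$, is estimated by the Breslow-type estimator [16]:

$$\hat{\Lambda}_0^G(t; \hat{\alpha}) = \int_0^t \frac{\sum_{i=1}^{n} dN_i^G(u)}{n\, S_G^{(0)}(\hat{\alpha}, u)},$$

where $\hat{\alpha}$ is the estimator of $\alpha$ based on the Cox proportional hazards model. Then, we estimate G(t|Zi) as follows:

$$\hat{G}(t \mid Z_i) = \exp\left\{ -\int_0^t e^{\hat{\alpha}^T \tilde{Z}_i}\, d\hat{\Lambda}_0^G(u; \hat{\alpha}) \right\}.$$

We can obtain the consistency of $\hat{\alpha}$, $\hat{\Lambda}_0^G(t; \hat{\alpha})$, and $\hat{G}(t \mid Z_i)$ as follows: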

Lemma 2.1

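Assume Conditions (a)-(e) as in the Appendix. Then, we have $\|\hat{\alpha} - \alpha\| = O_p(\sqrt{d_n/n})$, $\sup_t |\hat{\Lambda}_0^G(t; \hat{\alpha}) - \Lambda_0^G(t)| = O_p(\sqrt{d_n/n})$, and $\sup_t |\hat{G}(t \mid Z) - G(t \mid Z)| = O_p(\sqrt{d_n/n})$.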

When the censoring distribution G does not depend on any covariates, the Kaplan-Meier estimator can be used instead of the Breslow estimator. The proof of Lemma 2.1 can be found in the online Supplemental Materials.
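For the covariate-free case, a minimal sketch of the Kaplan-Meier estimate of the censoring survival function G(t) = P(C > t) is given below. It treats censoring as the "event", matching the indicator I(Ci ≤ Ti) in the counting process for censoring. The function name and the list-based interface are assumptions made for illustration, and tie handling is kept deliberately simple.

```python
def km_censoring_survival(times, censored):
    """Kaplan-Meier estimate of G(t) = P(C > t).

    times    : observed times X_i = min(T_i, C_i)
    censored : True where the observation was censored (C_i <= T_i)
    Returns a step function t -> G_hat(t).
    """
    jump_times = sorted({t for t, c in zip(times, censored) if c})
    steps = []          # (jump time, survival value from that time on)
    surv = 1.0
    for u in jump_times:
        at_risk = sum(1 for t in times if t >= u)                     # Y(u)
        n_cens = sum(1 for t, c in zip(times, censored) if c and t == u)
        surv *= 1.0 - n_cens / at_risk
        steps.append((u, surv))

    def G_hat(t):
        value = 1.0
        for u, s in steps:
            if u <= t:
                value = s
            else:
                break
        return value

    return G_hat
```

The values G_hat(X_i) for subjects with δi = 1 then serve as the inverse probability of censoring weights in the estimating equation.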

Next, we define some notation for group variables and their memberships. Assume that we have K groups of variables. Let A1, …, AK be subsets of {1, …, dn} representing group memberships of variables, where the Ak’s may overlap. Define βA(τ) = {βj(τ), j ∈ A}T and βA,0(τ) = {βj,0(τ); j ∈ A}T for a set A. To distinguish the individual memberships between non-zero βj,0(τ)’s and zero βj,0(τ)’s, we define B1 and B2 such that βj,0(τ) ≠ 0 if j ∈ B1 and βj,0(τ) = 0 if j ∈ B2. To distinguish the group memberships between non-zero βAk,0(τ)’s and zero βAk,0(τ)’s, without loss of generality we further define E1 and E2 such that $E_1 = \bigcup_{k=1}^{K_1} A_k$ and $E_2 = \bigcup_{k=K_1+1}^{K} A_k$, where $\beta_{A_k,0}(\tau) \ne 0$ for 1 ≤ k ≤ K1 and $\beta_{A_k,0}(\tau) = 0$ for K1 + 1 ≤ k ≤ K.

To estimate β(τ), Peng and Fine [1] considered the estimating equation Sn(b, τ) = 0, where

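$$S_n(b,\tau) = n^{-1/2} \sum_{i=1}^{n} Z_i \left[ \frac{I\{X_i \le g(Z_i^T b)\}\, I(\delta_i = 1)}{\hat{G}(X_i \mid Z_i)} - \tau \right]. \quad (1)$$

To solve Sn(b, τ) = 0, Peng and Fine [1] proposed minimizing the following L1-type convex function:

$$U_n(b,\tau) = \sum_{i=1}^{n} I(\delta_i = 1) \left| \frac{g^{-1}(X_i) - b^T Z_i}{\hat{G}(X_i \mid Z_i)} \right| + \left| M - b^T \sum_{i=1}^{n} \frac{-Z_i\, I(\delta_i = 1)}{\hat{G}(X_i \mid Z_i)} \right| + \left| M - b^T \sum_{i=1}^{n} 2 Z_i \tau \right|,$$

where M is a very large positive number that bounds $|b^T \sum_{i=1}^{n} Z_i I(\delta_i = 1)/\hat{G}(X_i \mid Z_i)|$ and $|b^T \sum_{i=1}^{n} 2 Z_i \tau|$ for all b in the parameter space for β0(τ). They studied the consistency and the asymptotic normality of the estimator of β0(τ) obtained by solving Sn(b, τ) = 0 when G is non-covariate dependent and dn is fixed.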
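The link between the L1-type function and the estimating equation can be sketched as follows. The sketch assumes that g is increasing, that M is large enough for both M-terms to be positive, and that b is a point where no residual g^{-1}(Xi) − bTZi is exactly zero, so that Un is differentiable there.

```latex
\begin{aligned}
\frac{\partial U_n(b,\tau)}{\partial b}
 &= \sum_{i=1}^{n} \frac{I(\delta_i = 1)}{\hat{G}(X_i \mid Z_i)}\, Z_i
    \bigl[ 2 I\{ g^{-1}(X_i) < Z_i^T b \} - 1 \bigr]
  + \sum_{i=1}^{n} \frac{Z_i\, I(\delta_i = 1)}{\hat{G}(X_i \mid Z_i)}
  - 2\tau \sum_{i=1}^{n} Z_i \\
 &= 2 \sum_{i=1}^{n} Z_i \left[
      \frac{I\{ X_i < g(Z_i^T b) \}\, I(\delta_i = 1)}{\hat{G}(X_i \mid Z_i)} - \tau
    \right]
  \;=\; 2\, n^{1/2}\, S_n(b,\tau).
\end{aligned}
```

The strict inequality differs from the one in (1) only at ties, so under these assumptions a minimizer of Un is an approximate root of Sn.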

To select variables at bi-level, we propose the following penalized function:

$$W_n(b,\tau) = U_n(b,\tau) + \lambda_n \sum_{k=1}^{K} c_k \left( \sum_{j \in A_k} \frac{|b_j|}{|\tilde{\beta}_j(\tau)|^{\nu}} \right)^{\gamma}, \quad (2)$$

where $\tilde{\beta}_j(\tau)$ is the jth component of a consistent estimator $\tilde{\beta}(\tau)$ of β(τ), ν ≥ 0, λn > 0, and 0 < γ < 1. Following Huang et al. [12], we set $c_k \propto |A_k|^{1-\gamma}$, where |A| is the cardinality of A. If ν = 0, the penalty term is the group bridge penalty of Huang et al. [12] and Huang et al. [17]. When ν > 0, we call the penalty term the adaptive group bridge penalty. The adaptive group bridge reduces to i) individual variable selection when |Ak| = 1 for all k; and ii) the adaptive hierarchical lasso penalty of Zhou and Zhu [13] when γ = 1/2 and ck = 1 for all k.
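As an illustration of the penalty term in (2), the sketch below evaluates it for a given coefficient vector. The function name, the 0-based indexing of penalized coefficients, and the choice of proportionality constant 1 in ck = |Ak|^(1−γ) are assumptions made for this example.

```python
def adaptive_group_bridge_penalty(b, beta_tilde, groups, lam, gamma=0.5, nu=1.0):
    """Evaluate lam * sum_k c_k * (sum_{j in A_k} |b_j| / |beta_tilde_j|**nu)**gamma.

    b          : candidate coefficients (penalized part only, 0-based)
    beta_tilde : initial consistent estimates, same indexing as b
    groups     : list of index lists A_k (groups may overlap)
    nu = 0 gives the group bridge penalty; nu > 0 the adaptive version,
    which requires every beta_tilde[j] to be non-zero.
    """
    total = 0.0
    for A in groups:
        c_k = len(A) ** (1.0 - gamma)   # assumed proportionality constant of 1
        inner = sum(abs(b[j]) / abs(beta_tilde[j]) ** nu for j in A)
        total += c_k * inner ** gamma
    return lam * total
```

With singleton groups the inner sum has one term, so the penalty acts on each coefficient separately, in line with the individual-selection special case.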

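Minimizing Wn(b, τ) can be reformulated as minimizing

$$W_n(b,\theta,\tau) = U_n(b,\tau) + \sum_{k=1}^{K} \theta_k^{1-1/\gamma}\, c_k^{1/\gamma} \sum_{j \in A_k} \frac{|b_j|}{|\tilde{\beta}_j(\tau)|^{\nu}} + \zeta_n \sum_{k=1}^{K} \theta_k, \quad (3)$$

where θ = (θ1, …, θK)T. By defining

$$\theta_k = c_k \left( \frac{1-\gamma}{\zeta_n \gamma} \right)^{\gamma} \left( \sum_{j \in A_k} \frac{|\beta_j(\tau)|}{|\tilde{\beta}_j(\tau)|^{\nu}} \right)^{\gamma}, \quad k = 1, \ldots, K,$$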

we can show the following lemma similarly to Proposition 1 of Huang et al. [12] and thus its proof is omitted:

Lemma 2.2

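Assume that $\lambda_n = \zeta_n^{1-\gamma}\, \gamma^{-\gamma}\, (1-\gamma)^{\gamma-1}$ for 0 < γ < 1. Then, $\hat{\beta}(\tau)$ minimizes Wn(b, τ) if and only if $\{\hat{\beta}(\tau), \hat{\theta}\}$ minimizes Wn(b, θ, τ), where θk > 0 and $\hat{\theta}_k > 0$ for k = 1, …, K.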
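The equivalence in Lemma 2.2 can be checked numerically for a single group. Writing s for the weighted sum of |bj| over the group, the sketch below profiles the θ-dependent part of (3) at the closed-form θk and compares it with the bridge term λn ck s^γ from (2). The helper names are hypothetical and the check is an illustration, not a proof.

```python
def lambda_from_zeta(zeta, gamma):
    """lambda_n = zeta_n**(1-gamma) * gamma**(-gamma) * (1-gamma)**(gamma-1)."""
    return zeta ** (1.0 - gamma) * gamma ** (-gamma) * (1.0 - gamma) ** (gamma - 1.0)


def theta_star(s, c, zeta, gamma):
    """Closed-form minimizer over theta > 0 for one group with weighted sum s."""
    return c * ((1.0 - gamma) / (zeta * gamma)) ** gamma * s ** gamma


def theta_part(theta, s, c, zeta, gamma):
    """theta-dependent part of (3) for one group."""
    return theta ** (1.0 - 1.0 / gamma) * c ** (1.0 / gamma) * s + zeta * theta
```

Since theta ** (1 − 1/γ) is convex on θ > 0, the closed-form value is the unique minimizer of the one-group objective, and the minimized value equals lambda_from_zeta(zeta, gamma) * c * s ** gamma.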

Define $\tilde{S}_n(b,\tau) = n^{-1/2} \sum_{i=1}^{n} Z_i \left[ F_1\{g(Z_i^T b) \mid Z_i\} - \tau \right]$. Denote $\tilde{S}_n'(b,\tau)$ as the first derivative of $\tilde{S}_n(b,\tau)$ with respect to b. We first study the oracle property of the group bridge estimator given τ. We assume that

  • (C1)

    There exists ω > 0 such that $P(C = \omega \mid Z) \ge c > 0$ and $P(C > \omega \mid Z) = 0$ for any Z.

  • (C2)

    Zij and βj,0(τ) are uniformly bounded for j = 1,…, dn.

  • (C3)

    f1(t|z) is bounded above uniformly in t and z, where f1(t|z) = dF1(t|z)/dt.

  • (C4)

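    Define $H(b) = E\{n^{-1/2} \tilde{S}_n'(b,\tau)\} = E[Z^{\otimes 2} f_1\{g(Z^T b) \mid Z\}\, g'(Z^T b)]$. For some ρ0 > 0, C1 > 0, and C2 > 0, $\inf_{b \in B(\rho_0)} \kappa\{H(b)\} \ge C_1$ and $\sup_{b \in B(\rho_0)} \kappa\{H(b)\} \le C_2 < \infty$, where $B(\rho_0) = \{b \in \mathbb{R}^{d_n+1} : \|b - \beta_0(\tau)\| \le \rho_0\}$ and κ(H) is the eigenvalue of a matrix H.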

  • (C5)

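    $\Sigma(\tau) = \mathrm{Var}\{S_n(b,\tau)\}$. There exist C3 > 0 and C4 > 0 such that $\inf_{\beta \in B(\rho_0)} \kappa\{\Sigma(\tau)\} \ge C_3$ and $\sup_{\beta \in B(\rho_0)} \kappa\{\Sigma(\tau)\} \le C_4 < \infty$, where ρ0 > 0.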

  • (C6)

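    There exists a constant C5 > 0 such that $\sup_{b \in B(\rho_0),\, 0 \le i \le d_n} n^{-1}\, \mathrm{Cov}\{\tilde{S}'_{n,ij}(b,\tau), \tilde{S}'_{n,ij'}(b,\tau)\} \le C_5 < \infty$ for all 0 < j, j′ < dn, where $\tilde{S}'_{n,ij}(b,\tau)$ is the (i, j)th entry of $\tilde{S}_n'(b,\tau)$.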

  • (C7)

    $d_n^4/n \to 0$.

  • (C8)

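    $C_n = \max_j \sum_{k=1}^{K} I(j \in A_k)$ is bounded and $\frac{\lambda_n^2}{n} \sum_{k=1}^{K_1} c_k^2 \left\{ \sum_{j \in A_k} |\beta_{j,0}(\tau)| \right\}^{2\gamma-2} |A_k| \le d_n M_n$, $M_n = O_p(1)$, where $\lambda_n / [n^{\gamma/2}\, \kappa_{\max}\{\Sigma(\tau)\}\, d_n^{1-\gamma/2}] \to \infty$ as n → ∞.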

  • (C9)

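    $\lambda_n n^{-1/2} \to 0$, $1/\kappa_{\min}\{\Sigma(\tau)\} + \kappa_{\max}\{\Sigma(\tau)\} + \sum_{k=1}^{K} c_k^2 = O(1)$, and $\lambda_n / (n^{\gamma/2} d_n^{1-\gamma/2}) \to \infty$ as n → ∞.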

(C1)−(C5) are similar to the standard conditions for the competing risks quantile regression of Peng and Fine [1]. Peng and Fine [1] suggested using the truncated censoring time min(C, L) in (C1) so that (C1) is always satisfied. In practice, ω can be chosen as large as possible so that only a small information loss occurs [1]. (C4)−(C6) and (C8) control the behavior of the estimating equation as dn grows. Conditions similar to (C4)−(C8) were used to allow dn to diverge as n → ∞ in Cai et al. [18], Huang et al. [12], and Huang et al. [17]. (C5), (C6), and (C9) restrict the variability of $\mathrm{Var}\{S_n(b,\tau)\}$ and $\mathrm{Var}\{\tilde{S}_n'(b,\tau)\}$ as n and dn increase. (C8) and (C9) control λn, the number of variables within group, and the magnitude of the true parameters in non-zero groups, as in Huang et al. [17]. The variance matrix Σ(τ) in Condition (C5) can be specified as follows. Define $e_G(\alpha_0, t)$ and $A(\alpha_0)$ as in the Appendix. We further define $h(t,u,Z_i) = \exp(\alpha_0^T \tilde{Z}_i) \int_u^t \{\tilde{Z}_i - e_G(\alpha_0, u)\}\, d\Lambda_0^G(v)$, $M_i^G(t) = N_i^G(t) - \int_0^t Y_i(u) \exp(\alpha_0^T \tilde{Z}_i)\, d\Lambda_0^G(u)$,

$$q(t) = E\left( \frac{1}{n} \sum_{i=1}^{n} \int_0^L \left[ h^T(t,0,Z_i)\, A(\alpha_0)^{-1} \{\tilde{Z}_i - e_G(\alpha_0,t)\} + \frac{\exp(\alpha_0^T \tilde{Z}_i)\, I(u \le t)}{s_G^{(0)}(\alpha_0,u)} \right] dM_i^G(u) \right),$$

and $w_i(b) = Z_i\, I\{X_i \le g(Z_i^T b)\}\, I(\delta_i = 1)\, q(X_i)/G(X_i \mid Z_i)$. Then, $\Sigma(\tau) = E\{\eta_1(\tau)\eta_1(\tau)^T\}$, where $\eta_i(\tau) = Z_i\left[ I\{X_i \le g(Z_i^T \beta_0(\tau))\}\, I(\delta_i = 1)/G(X_i \mid Z_i) - \tau \right] + w_i\{\beta_0(\tau)\}$. The online Supplementary Materials include the detailed derivation of ηi(τ) and the asymptotic normality of the estimator obtained by solving Sn(b,τ) = 0 for fixed dn. Denote →d as convergence in distribution.

First, the following lemma shows the consistency of the estimator obtained by solving Sn(b,τ) = 0 when dn diverges as n → ∞:

Lemma 2.3

Let $\tilde{\beta}(\tau)$ be the estimator obtained by solving Sn(b, τ) = 0. Then, under the conditions (C1)−(C7), we have $\|\tilde{\beta}(\tau) - \beta_0(\tau)\| = O_p(\sqrt{d_n/n})$.

The proof of Lemma 2.3 can be found in the online Supplementary Materials. Peng and Fine [1] studied the consistency of $\tilde{\beta}(\tau)$ for non-covariate dependent censoring with a fixed number of covariates. Lemma 2.3 extends their result to covariate-dependent censoring with diverging dn. Similarly to Huang et al. [17], we have the following theorem for the group bridge estimator given τ:

Theorem 2.4

Assume ν = 0 in (2). Under (C1) − (C9), we have

  1. Consistency: $\|\hat{\beta}(\tau) - \beta_0(\tau)\| = O_p(\sqrt{d_n/n})$.

  2. Group variable selection consistency: $P\{\hat{\beta}_{E_2}(\tau) = 0\} \to 1$.

  3. Asymptotic distribution: for fixed unknown $\{E_1, \beta_{E_1,0}\}$,
    $$\sqrt{n}\,\{\hat{\beta}_{E_1}(\tau) - \beta_{E_1,0}(\tau)\} \to_d N\left[0,\; H_{11}\{\beta_0(\tau)\}^{-1}\, \Sigma_{11}(\tau)\, H_{11}\{\beta_0(\tau)\}^{-1}\right],$$
    where $H_{11}\{\beta_0(\tau)\}$ and $\Sigma_{11}(\tau)$ are the leading |E1| × |E1| submatrices of H{β0(τ)} and Σ(τ), respectively.

Using Lemma 2.3, Theorem 2.4 can be shown similarly to the proofs of Theorems 1 and 2 of Huang et al. [17] and thus its proof is omitted. Theorem 2.4 shows the group variable selection consistency and the $\sqrt{n/d_n}$-consistency of the group bridge estimator.

Although the group bridge can consistently select non-zero group variables, it may not effectively eliminate zero individual variables within non-zero group variables. This may be improved by using ν > 0 in (2), that is, the adaptive group bridge penalty. For the adaptive group bridge, we have the following theorem given τ:

Theorem 2.5

Assume ν > 0 in (2). In addition to (C1) − (C7), we assume

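(C8b) For some ν1 and ν2 such that 0 < ν1 < 1, 0 < ν2, and ν2/(1 − ν1) < ν, $\min_{j \in B_1} |\beta_{j,0}(\tau)| = O_p\{(d_n/n)^{\nu_1/2}\}$, $\max_k |A_k \cap B_1| = O\{(n/d_n)^{\nu_2/2}\}$, and

$$\sum_{k=1}^{K_1} c_k \left\{ \left( \sum_{j \in A_k \cap B_1} |\beta_{j,0}(\tau)|^{1-\nu} \right)^{\gamma-1} \sum_{j \in A_k \cap B_1} \frac{1}{|\beta_{j,0}(\tau)|^{\nu}} \right\} = O_p(d_n).$$

(C9b) $\lambda_n/\sqrt{n} \to 0$, $\sqrt{n/d_n}\, \tilde{\beta}_j = O_p(1)$, and $\min\left( \lambda_n n^{(\nu-1)/2} d_n^{-(1+\nu)/2},\; \lambda_n n^{\gamma(\nu-1)/2} d_n^{-1+\gamma(1-\nu)/2} \right) \to \infty$.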

Then, we have

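  1. Consistency: $\|\hat{\beta}(\tau) - \beta_0(\tau)\| = O_p(\sqrt{d_n/n})$.

  2. Bi-level variable selection consistency: $P\{\hat{\beta}_{B_2}(\tau) = 0\} \to 1$.

  3. Asymptotic distribution: for fixed unknown $\{B_1, \beta_{B_1,0}\}$,
    $$\sqrt{n}\,\{\hat{\beta}_{B_1}(\tau) - \beta_{B_1,0}(\tau)\} \to_d N\left[0,\; H_{11}\{\beta_0(\tau)\}^{-1}\, \Sigma_{11}(\tau)\, H_{11}\{\beta_0(\tau)\}^{-1}\right],$$
    where $H_{11}\{\beta_0(\tau)\}$ and $\Sigma_{11}(\tau)$ are the leading |B1| × |B1| submatrices of H{β0(τ)} and Σ(τ), respectively.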

The proof of Theorem 2.5 can be found in the online Supplementary Materials. (C8b) controls the magnitude of non-zero parameters and the number of non-zero parameters. It requires that the smallest magnitude of the non-zero parameters not shrink towards zero too fast. (C9b) controls λn and ν as n → ∞ to obtain the oracle property. Theorem 2.5 provides the oracle property of the adaptive group bridge estimator. In particular, it shows that the adaptive group bridge consistently identifies not only non-zero group variables, but also non-zero within-group variables.

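To obtain $\hat{\beta}(\tau)$, we minimize Wn(b,θ,τ) of (3). The optimization algorithm is as follows:

  1. Obtain a consistent estimator $\tilde{\beta}(\tau)$ and an initial value $\beta^{(0)}(\tau)$ from Peng and Fine [1] or the group bridge.

  2. Compute
    $$\theta_k^{(i)} = c_k \left( \frac{1-\gamma}{\zeta_n \gamma} \right)^{\gamma} \left( \sum_{j \in A_k} \frac{|\beta_j^{(i)}(\tau)|}{|\tilde{\beta}_j(\tau)|^{\nu}} \right)^{\gamma}, \quad k = 1, \ldots, K.$$

  3. Obtain $\beta^{(i+1)}(\tau)$ by minimizing $W_n(b, \theta^{(i)}, \tau)$ with respect to b.

  4. Repeat (2)–(3) until $\|\beta^{(i+1)}(\tau) - \beta^{(i)}(\tau)\| < 10^{-4}$.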
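With θ held fixed, the objective in Step 3 is a sum of absolute values that are linear in b, so it can be written as an ordinary L1 (median) regression on augmented data. The sketch below builds such augmented rows; it is one common device for weighted L1 penalties and is not confirmed as the authors' implementation. The function names, the default M, and the per-coefficient weights argument (the combined multiplier of |bj| from the middle term of (3), with 0 for the unpenalized intercept) are assumptions for illustration, and every θk is assumed to be positive.

```python
def augmented_l1_rows(g_inv_x, Z, delta, G_hat, tau, weights, M=1e6):
    """Rows (y_r, x_r) such that sum_r |y_r - x_r . b| equals
    U_n(b, tau) + sum_j weights[j] * |b[j]|.

    g_inv_x : g^{-1}(X_i) for each subject (log X_i when g = exp)
    Z       : covariate rows including the leading 1 for the intercept
    delta   : 1 for an observed cause 1 event, 0 otherwise
    G_hat   : estimated censoring survival G_hat(X_i | Z_i), all > 0
    """
    p = len(Z[0])
    rows = []
    v_event = [0.0] * p   # sum_i -Z_i I(delta_i = 1) / G_hat_i
    v_tau = [0.0] * p     # sum_i 2 * tau * Z_i
    for y, z, d, g in zip(g_inv_x, Z, delta, G_hat):
        if d == 1:
            rows.append((y / g, [zj / g for zj in z]))
            v_event = [a - zj / g for a, zj in zip(v_event, z)]
        v_tau = [a + 2.0 * tau * zj for a, zj in zip(v_tau, z)]
    rows.append((M, v_event))
    rows.append((M, v_tau))
    for j, w in enumerate(weights):
        if w > 0.0:
            x = [0.0] * p
            x[j] = w          # pseudo-observation |0 - w * b_j|
            rows.append((0.0, x))
    return rows


def l1_objective(rows, b):
    """Sum of absolute residuals over the augmented rows."""
    return sum(abs(y - sum(xj * bj for xj, bj in zip(x, b))) for y, x in rows)
```

Any median-regression solver applied to these rows then minimizes the Step 3 objective. A group whose θk reaches zero corresponds to an infinite weight, so its coefficients would be fixed at zero and removed before building the rows.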

The minimization in Step 3 can be implemented using R package quantreg [19]. To choose a tuning parameter ζn in (3), we propose the following BIC-type criterion motivated by Lee et al. [20] and Shows et al. [21]:

$$\frac{2}{n}\, U_n\{\hat{\beta}(\tau), \tau\} + C \log(d_n)\, p_n\, \frac{\log(n)}{2n},$$

where pn is the number of non-zero estimates given ζn and C is some positive number.

3. Simulation

We performed simulation studies under two group variable settings: i) group variables consisting of continuous variables; and ii) group variables consisting of continuous variables and categorical variables. Censoring times and event times were independently generated. Let $Z = (1, \tilde{Z}^T)^T$, $\beta_{00}(\tau) = \{\beta_{1,0}(\tau), \ldots, \beta_{d_n,0}(\tau)\}^T$, and $\zeta_{00}(\tau) = \{\zeta_{1,0}(\tau), \ldots, \zeta_{d_n,0}(\tau)\}^T$. Event times and cause of failure were generated as follows:

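$$P(\varepsilon = 1) = p_1,$$
$$P(T \le t \mid \varepsilon = 1, Z) = \Phi\{\log t - \beta_{00}(\tau)^T \tilde{Z}\},$$
$$P(T \le t \mid \varepsilon = 2, Z) = \Phi\{\log t - \zeta_{00}(\tau)^T \tilde{Z}\},$$
$$\log Q_1(\tau \mid Z) = \Phi^{-1}(\tau/p_1) + \beta_{00}(\tau)^T \tilde{Z},$$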
$$G(t) = \exp\{-\lambda_c\, e^{\alpha_0^T \tilde{Z}}\, t\}.$$

Thus, $\beta_0(\tau) = \{\Phi^{-1}(\tau/p_1), \beta_{00}(\tau)^T\}^T$. We set $\beta_{00}(\tau) = \zeta_{00}(\tau)$. Selecting non-zero βj,0(τ) for j = 1, …, dn was of interest in this simulation study. We selected p1 and λc to generate 40% cause 1 events, 30% cause 2 events, and 30% censoring. Each simulation was repeated 1000 times. The competing risks quantile regression of Peng and Fine [1] and the group bridge were used to obtain $\tilde{\beta}(\tau)$. The adaptive group bridge with ν = 1 was compared to the group bridge. We evaluated the mean squared error, calculated as

$$\mathrm{MSE} = \frac{1}{1000} \sum_{i=1}^{1000} \|\hat{\beta}_{i,0}(\tau) - \beta_{00}(\tau)\|^2,$$

where $\hat{\beta}_{i,0}(\tau)$ is the estimator of β00(τ) at the ith iteration given τ. The proposed BIC-type criterion with C = 1.5 was used to select the tuning parameter. Two τ values were examined: τ = 0.1 and 0.25. We first considered Setting i), group variables consisting of continuous variables, with a non-covariate dependent censoring distribution, that is, α0 = 0. We examined n = 400, 600, and 800. To generate Z, three correlated continuous variables for each group were generated from N(0, Σ), where

$$\Sigma = \begin{pmatrix} 1 & 0.5 & 0.5 \\ 0.5 & 1 & 0.5 \\ 0.5 & 0.5 & 1 \end{pmatrix}.$$

Variables were assumed to be independent if they belong to different groups. For n = 400, 600, and 800, there were 9, 10, 11 groups, respectively. The true β0(τ) for n = 400 was {β1,0(τ), …, β9,0(τ)}T = (1, −1, 0, −1, 1, 0, 1, 0, 0)T and {β10,0(τ), …, β27,0(τ)}T = (0, …, 0)T. For n=600, we added {β28,0(τ), β29,0(τ), β30,0(τ)}T = (0,0,0)T. For n=800, we further added {β31,0(τ), β32,0(τ), β33,0(τ)}T = (0,0,0)T. This setting allowed dn to grow as n increased. The number of non-zero groups and non-zero individual variables of the underlying model were 3 and 5, respectively, for each n.
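A minimal sketch of how data under Setting i) might be generated is shown below. It assumes log T = β00TZ + standard normal noise for both causes (which follows from the probit-type model with β00 = ζ00) and exponential censoring with rate λc. The defaults p1 = 4/7 and lam_c = 0.3 are hypothetical placeholders, not the values used in the study, and the function name is illustrative.

```python
import math
import random


def simulate_setting_i(n, n_groups, beta, p1=4 / 7, lam_c=0.3, rho=0.5, seed=0):
    """Generate (X_i, delta_i, Z_i) with groups of 3 equicorrelated N(0, 1) covariates.

    beta  : true slope vector of length 3 * n_groups (no intercept)
    p1    : P(cause = 1); hypothetical default
    lam_c : exponential censoring rate; hypothetical default, not calibrated
    """
    if len(beta) != 3 * n_groups:
        raise ValueError("beta must have length 3 * n_groups")
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        z = []
        for _ in range(n_groups):
            shared = rng.gauss(0.0, 1.0)   # common factor gives pairwise correlation rho
            z += [math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
                  for _ in range(3)]
        cause = 1 if rng.random() < p1 else 2
        log_t = sum(b * zj for b, zj in zip(beta, z)) + rng.gauss(0.0, 1.0)
        t = math.exp(log_t)
        c = rng.expovariate(lam_c)         # covariate-free censoring (alpha_0 = 0)
        x = min(t, c)
        delta = int(t <= c and cause == 1)
        data.append((x, delta, z))
    return data
```

For example, the n = 400 design corresponds to n_groups = 9 with beta = [1, -1, 0, -1, 1, 0, 1, 0, 0] followed by 18 zeros. In practice lam_c would be tuned until the observed censoring proportion is close to the 30% target.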

Table 1 summarizes the simulation results. “AGB-CQ”, “AGB-GB”, and “GB” indicate the adaptive group bridge with $\tilde{\beta}(\tau)$ from Peng and Fine [1], the adaptive group bridge with $\tilde{\beta}(\tau)$ from the group bridge, and the group bridge, respectively. “% Corr. Group” and “% Corr. Individual” represent the proportions of iterations in which the corresponding variable selection method correctly identified the non-zero group variables and non-zero individual variables of the underlying model, respectively. “Group Size” and “Model Size” are the mean numbers of groups and individual variables selected by each variable selection method, respectively. “MSER” is the ratio of the median MSE of each variable selection method to that of the oracle estimator. The adaptive group bridge and the group bridge identified the true non-zero and zero groups very well in group variable selection. The mean group sizes of the adaptive group bridge and the group bridge were very close to 3. However, the group bridge performed poorly in within-group variable selection, that is, individual variable selection. It over-identified individual variables as non-zero variables. On the other hand, the adaptive group bridge identified the true non-zero individual variables well. In addition, as n increased, the mean group sizes and the mean model sizes of the adaptive group bridge became closer to 3 and 5, respectively. The MSERs of the adaptive group bridge with $\tilde{\beta}(\tau)$ from the group bridge were lower than those of the other methods. Furthermore, the MSERs of the adaptive group bridge generally got smaller as n increased. We also conducted a simulation under the same setting except that the pairwise correlation between continuous variables was assumed to be 0.2 if they belonged to different groups. The results were similar to Table 1 and are not reported.

Table 1.

Simulation results for group variables consisting of continuous variables with G(X).

τ     n    Method  % Corr. Group  % Corr. Individual  Group Size  Model Size  MSER
0.1   400  AGB-CQ  0.995          0.987               3.005       5.013       2.108
0.1   400  AGB-GB  0.991          0.976               3.001       5.011       1.159
0.1   400  GB      0.988          0.554               3.012       5.570       2.451
0.1   600  AGB-CQ  0.996          0.996               3.004       5.005       2.281
0.1   600  AGB-GB  0.996          0.992               2.999       5.000       1.056
0.1   600  GB      0.994          0.622               3.008       5.472       2.491
0.1   800  AGB-CQ  0.997          0.996               3.003       5.004       1.834
0.1   800  AGB-GB  0.997          0.995               3.003       5.005       1.024
0.1   800  GB      0.997          0.627               3.003       5.434       2.315
0.25  400  AGB-CQ  0.917          0.855               3.097       5.185       1.316
0.25  400  AGB-GB  0.931          0.871               3.076       5.150       1.041
0.25  400  GB      0.945          0.363               3.062       5.963       1.543
0.25  600  AGB-CQ  0.936          0.884               3.073       5.141       1.293
0.25  600  AGB-GB  0.948          0.892               3.054       5.115       1.077
0.25  600  GB      0.960          0.358               3.041       5.977       1.583
0.25  800  AGB-CQ  0.949          0.905               3.059       5.111       1.041
0.25  800  AGB-GB  0.966          0.929               3.038       5.085       1.006
0.25  800  GB      0.973          0.404               3.028       5.875       1.463

Next, we performed a simulation study for Setting ii), group variables consisting of continuous variables and categorical variables, with a non-covariate dependent censoring distribution. We examined 3 sample sizes: n = 600, 900, and 1200. For n = 600, there were 10 groups: 5 groups consisting of continuous variables (Groups 1 to 5) and 5 groups consisting of categorical variables (Groups 6 to 10). Groups 1 and 2 contained 6 continuous variables each and Groups 3 to 5 comprised 3 continuous variables each. The pairwise correlation among continuous variables within a group was 0.5. There was no correlation between continuous variables if they belonged to different groups. Groups 6 and 7 consisted of 7 categories each (that is, 6 indicator variables each) and Groups 8 to 10 had 4 categories each (that is, 3 indicator variables each). The reference group for each categorical variable was set to 0. Thus, there were 42 variables in total. The true β0(τ) for n = 600 was {β1,0(τ), …, β6,0(τ)}T = (1, −1, 0, …, 0)T, {β7,0(τ), …, β12,0(τ)}T = (0, …, 0)T, {β13,0(τ), β14,0(τ), β15,0(τ)}T = (1, 0, 0)T, {β16,0(τ), …, β21,0(τ)}T = (0, …, 0)T, {β22,0(τ), …, β27,0(τ)}T = (1, −1, 0, …, 0)T, {β28,0(τ), …, β33,0(τ)}T = (0, …, 0)T, {β34,0(τ), β35,0(τ), β36,0(τ)}T = (1, 0, 0)T, and {β37,0(τ), …, β42,0(τ)}T = (0, …, 0)T. For n = 900, we added one more group consisting of 3 continuous variables with pairwise correlation 0.5 and {β43,0(τ), β44,0(τ), β45,0(τ)}T = (0, 0, 0)T. For n = 1200, in addition to {β43,0(τ), β44,0(τ), β45,0(τ)}T, we further added a categorical variable having 4 categories, that is, 3 indicator variables: {β46,0(τ), β47,0(τ), β48,0(τ)}T = (0, 0, 0)T. Thus, the numbers of non-zero groups and non-zero individual variables of the underlying model were 4 and 6, respectively, for each n.

Table 2 shows the simulation results. The adaptive group bridge identified the true non-zero and zero groups better than the group bridge when n = 600 and 900 for τ = 0.1, and n = 600 for τ = 0.25. When n = 1200, both methods selected non-zero group variables very well. The mean group sizes of the adaptive group bridge were very close to 4. The group bridge performed poorly in individual variable selection, as in Setting i). On the other hand, the adaptive group bridge identified the true non-zero individual variables well. In addition, as n increased, the mean group sizes and the mean model sizes of the adaptive group bridge became closer to 4 and 6, respectively. The MSERs of the adaptive group bridge with $\tilde{\beta}(\tau)$ from the group bridge were lower than those of the other methods. In addition, the MSERs of the adaptive group bridge generally got smaller as n increased.

Table 2.

Simulation results for group variables consisting of continuous and categorical variables with G(X).

τ     n     Method  % Corr. Group  % Corr. Individual  Group Size  Model Size  MSER
0.1   600   AGB-CQ  0.776          0.500               3.737       5.354       4.061
0.1   600   AGB-GB  0.870          0.652               3.861       5.603       1.621
0.1   600   GB      0.536          0.136               3.348       5.485       8.226
0.1   900   AGB-CQ  0.952          0.809               3.955       5.835       3.216
0.1   900   AGB-GB  0.989          0.913               3.989       5.934       1.344
0.1   900   GB      0.830          0.306               3.798       6.386       4.387
0.1   1200  AGB-CQ  0.990          0.910               3.998       5.940       3.283
0.1   1200  AGB-GB  0.999          0.973               4.001       5.989       1.248
0.1   1200  GB      0.955          0.383               3.955       6.704       4.147
0.25  600   AGB-CQ  0.933          0.647               4.015       6.097       1.925
0.25  600   AGB-GB  0.955          0.768               4.023       6.098       1.421
0.25  600   GB      0.881          0.187               3.897       7.187       2.660
0.25  900   AGB-CQ  0.974          0.810               4.025       6.163       1.681
0.25  900   AGB-GB  0.979          0.879               4.019       6.118       1.212
0.25  900   GB      0.979          0.220               3.997       7.393       2.484
0.25  1200  AGB-CQ  0.965          0.832               4.033       6.167       1.519
0.25  1200  AGB-GB  0.964          0.888               4.034       6.126       1.250
0.25  1200  GB      0.991          0.264               4.005       7.370       2.730

Last, we performed a simulation study for Setting ii) with covariate-dependent censoring distribution. We used the same β0(τ) as in Setting ii) with non-covariate dependent censoring distribution. The true α0 for G(t|Z) when n = 600 was (α1,0, …, α6,0)T = (1,−1,0, …, 0)T, (α7,0, …, α21,0)T = (0, …, 0)T, (α22,0, …, α27,0)T = (1,−1, 0, …, 0)T, and (α28,0, …, α42,0)T = (0, …, 0)T. For n = 900 and 1200, we added (α43,0, α44,0, α45,0)T = (0, 0, 0)T and (α46,0, α47,0, α48,0)T = (0, 0, 0)T, respectively. The Breslow-type estimator was used to estimate G(t|Z). We selected p1 and λc to generate 50% cause 1 events, 20% cause 2 events, and 30% censoring. Table 3 summarizes the simulation results. In general, the results were similar to Table 2. The adaptive group bridge performed better than the group bridge in terms of individual variable selection and MSER.

Table 3.

Simulation results for group variables consisting of continuous and categorical variables with G(X|Z)

τ     n     Method  % Corr. Group  % Corr. Individual  Group Size  Model Size  MSER
0.1   600   AGB-CQ  0.769          0.547               3.725       5.393       3.096
0.1   600   AGB-GB  0.862          0.687               3.850       5.615       1.625
0.1   600   GB      0.533          0.155               3.372       5.538       7.692
0.1   900   AGB-CQ  0.956          0.830               3.976       5.859       2.504
0.1   900   AGB-GB  0.981          0.902               3.991       5.915       1.356
0.1   900   GB      0.796          0.287               3.849       6.411       4.049
0.1   1200  AGB-CQ  0.996          0.912               3.996       5.945       3.250
0.1   1200  AGB-GB  0.998          0.963               4.000       5.982       1.248
0.1   1200  GB      0.949          0.370               3.946       6.669       3.972
0.25  600   AGB-CQ  0.940          0.674               4.007       6.040       1.881
0.25  600   AGB-GB  0.963          0.803               4.024       6.051       1.346
0.25  600   GB      0.907          0.210               3.914       7.179       2.376
0.25  900   AGB-CQ  0.966          0.854               4.031       6.099       1.680
0.25  900   AGB-GB  0.967          0.891               4.031       6.093       1.206
0.25  900   GB      0.885          0.261               4.088       7.363       2.350
0.25  1200  AGB-CQ  0.980          0.888               4.020       6.099       1.537
0.25  1200  AGB-GB  0.986          0.920               4.014       6.078       1.177
0.25  1200  GB      0.996          0.296               4.000       7.202       2.386

4. Bone marrow transplant data example

The adaptive group bridge was applied to a bone marrow transplant data set. Verneris et al. [10] studied the outcomes of patients who received reduced-intensity conditioning allogeneic hematopoietic cell transplantation from 1999 to 2011. We considered 2011 patients with human leukocyte antigen fully-matched unrelated donors. Relapse was the outcome of interest for the analysis. Treatment-related mortality (TRM) was a competing risk. Relapse, TRM, and censoring accounted for 40.5%, 26.6%, and 32.9% of the patients, respectively. In addition, 69%, 16%, and 8% of relapse events occurred within 6 months, between 6 and 12 months, and between 12 and 24 months, respectively. Thus, the distribution of relapse events was skewed. The overall relapse rate at 1 year was about 35%. The 13 binary or categorical variables that we considered for variable selection included disease type, recipient age, donor age, donor-recipient sex match, donor-recipient cytomegalovirus (CMV) match, ABO blood type match, donor parity, disease status at transplant, conditioning intensity, total body irradiation, graft type, graft-versus-host disease (GVHD) prophylaxis, and in-vivo T cell depletion. They consisted of 28 indicator variables. The censoring distribution did not depend on any covariates based on the Cox proportional hazards model.

We selected variables for the 0.35th competing risks quantile regression for relapse using the following three selection methods: the group bridge, the adaptive group bridge with $\tilde{\beta}(\tau)$ from Peng and Fine [1], and the adaptive group bridge with $\tilde{\beta}(\tau)$ from the group bridge. The reference group was set to zero. Table 4 shows the selected variables and their estimates. The group bridge selected disease status at transplant, CMV match, conditioning intensity, in-vivo T cell depletion, graft type, and GVHD prophylaxis. On the other hand, both versions of the adaptive group bridge, with $\tilde{\beta}(\tau)$ from Peng and Fine [1] and with $\tilde{\beta}(\tau)$ from the group bridge, selected the same variables: disease status at transplant, CMV match, conditioning intensity, and in-vivo T cell depletion. The adaptive group bridge did not select graft type and GVHD prophylaxis, which is why all of their estimates are zero in Table 4. The competing risks quantile regression of Peng and Fine [1] was fitted using the variables selected by at least one of the three methods. “CQ” in Table 4 indicates the estimates and p-values from the competing risks quantile regression of Peng and Fine [1]. It suggests that all variables selected by the adaptive group bridge appeared to be significant. However, graft type and GVHD prophylaxis, which only the group bridge selected, appeared not to be significant based on their p-values.

Table 4.

Selected variables and estimates. “ref” means the reference group. “CQ” indicates the competing risks quantile regression.

Variable                  Subcategory              AGB-CQ Est.  AGB-GB Est.  GB Est.  CQ Est.  CQ p-value
Disease status            Early (ref)              0            0            0        0
Disease status            Intermediate             0            0            0        −0.055   0.503
Disease status            Advanced                 −0.891       −0.891       −0.859   −0.460   < 0.001
CMV match                 +/+ (ref)                0            0            0        0
CMV match                 +/−                      −0.455       −0.445       −0.476   −0.363   0.018
CMV match                 −/+                      0            0            0        −0.034   0.902
CMV match                 −/−                      0            0            0        −0.027   0.774
CMV match                 Missing                  0            0            0        −0.054   0.939
Conditioning intensity    Reduced intensity (ref)  0            0            0        0
Conditioning intensity    Nonmyeloablative         −1.056       −1.060       −1.165   −0.536   < 0.001
In-vivo T cell depletion  No (ref)                 0            0            0        0
In-vivo T cell depletion  Yes                      −0.488       −0.488       −0.466   −0.301   0.010
Graft type                Bone marrow (ref)        0            0            0        0
Graft type                Peripheral blood         0            0            0.181    0.130    0.137
GVHD prophylaxis          FK506 ± others (ref)     0            0            0        0
GVHD prophylaxis          Others                   0            0            0.175    0.121    0.212

5. Conclusion

The group bridge and the adaptive group bridge were proposed to select variables for the competing risks quantile regression. Their oracle property was studied. In particular, the adaptive group bridge not only consistently identifies non-zero group variables, but also consistently selects non-zero within-group variables. We also proposed the BIC-type criterion to choose a tuning parameter. The proposed BIC-type criterion appears to work properly in the simulation study. The adaptive group bridge selected non-zero within-group variables more consistently than the group bridge in the simulation study. A bone marrow transplant example showed the usefulness of the adaptive group bridge.

The proposed method is limited to the case dn < n. Developing a group variable selection method for dn > n would be a crucial research problem. A two-step variable selection procedure may be developed for this: once we screen group variables in the first step, we may use the adaptive group bridge to obtain a more parsimonious list of non-zero variables in the second step. The theoretical justification of the proposed BIC-type criterion needs to be studied in the future.

Acknowledgments

This work was supported in part by Institutional Research Grant #14-247-29 from the American Cancer Society and the MCW Cancer Center, and the US National Cancer Institute (U24CA076518). The authors would like to thank the Associate Editor and two anonymous reviewers for their helpful comments that significantly improved the manuscript.

Appendix

For G(t|Z) and the Breslow estimator, we assume the following:

  1. $\int_0^L \lambda_{0G}(t)\,dt < \infty$ and $P\{Y_i(t) = 1\} > 0$ for $t \in [0, L]$, $i = 1, \ldots, n$, and $d_n^4/n \to 0$ as $n \to \infty$.

  2. $Z_{ij}$ is bounded almost surely for all $i, j$, and $\alpha^T Z$ is bounded almost surely for any $Z$ and $\alpha \in \mathcal{B}$, where $\mathcal{B}$ is a neighborhood of $\alpha_0$.

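  3. For $d = 0, 1, 2$, there exists a neighborhood $\mathcal{B}$ of $\alpha_0$ such that $s_G^{(d)}(\alpha, t)$ are continuous functions and $\sup_{t \in (0, L),\, \alpha \in \mathcal{B}} \| S_G^{(d)}(\alpha, t) - s_G^{(d)}(\alpha, t) \| \to 0$ in probability.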

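  4. The matrix $A(\alpha_0) = \int_0^L v_G(\alpha_0, t)\, s_G^{(0)}(\alpha_0, t)\, \lambda_{0G}(t)\,dt$ is positive definite, where $v_G(\alpha, t) = s_G^{(2)}(\alpha, t)/s_G^{(0)}(\alpha, t) - e_G(\alpha, t)^{\otimes 2}$ and $e_G(\alpha, t) = s_G^{(1)}(\alpha, t)/s_G^{(0)}(\alpha, t)$.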

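  5. For all $\alpha \in \mathcal{B}$ and $t \in [0, L]$, $S_G^{(1)}(\alpha, t) = \partial S_G^{(0)}(\alpha, t)/\partial \alpha$ and $S_G^{(2)}(\alpha, t) = \partial^2 S_G^{(0)}(\alpha, t)/(\partial \alpha\, \partial \alpha^T)$, where $S_G^{(d)}(\alpha, t)$, $d = 0, 1, 2$, are continuous functions of $\alpha$ uniformly in $t \in [0, L]$ and are bounded on $\mathcal{B} \times [0, L]$, and $s_G^{(0)}$ is bounded away from zero on $\mathcal{B} \times [0, L]$.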
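
To make the quantities in conditions 3 to 5 concrete, the following Python sketch evaluates empirical versions of them at a single time point on a toy data set. It assumes the standard Andersen-Gill form $S_G^{(d)}(\alpha, t) = n^{-1} \sum_i Y_i(t) Z_i^{\otimes d} \exp(\alpha^T Z_i)$, which is not restated in this appendix; the toy data and all function names are hypothetical. Under that assumption, the empirical analogue of $v_G$ is a weighted covariance matrix of the covariates in the risk set, and the derivative relation in condition 5 can be checked numerically.

```python
import math


def risk_set_moments(alpha, Z, at_risk):
    """Empirical S^(0), S^(1), S^(2) at one time point.

    Assumes the Andersen-Gill form n^{-1} sum_i Y_i(t) Z_i^{(x)d} exp(alpha'Z_i);
    at_risk[i] plays the role of Y_i(t).
    """
    n, d = len(Z), len(alpha)
    s0 = 0.0
    s1 = [0.0] * d
    s2 = [[0.0] * d for _ in range(d)]
    for z, y in zip(Z, at_risk):
        if not y:
            continue
        w = math.exp(sum(a * zj for a, zj in zip(alpha, z))) / n
        s0 += w
        for j in range(d):
            s1[j] += w * z[j]
            for k in range(d):
                s2[j][k] += w * z[j] * z[k]
    return s0, s1, s2


def e_and_v(alpha, Z, at_risk):
    """Empirical analogues of e_G = S1/S0 and v_G = S2/S0 - e e'."""
    s0, s1, s2 = risk_set_moments(alpha, Z, at_risk)
    d = len(alpha)
    e = [s1[j] / s0 for j in range(d)]
    v = [[s2[j][k] / s0 - e[j] * e[k] for k in range(d)] for j in range(d)]
    return e, v


# Hypothetical toy data: four subjects, two covariates, one subject not at risk.
Z = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0]]
at_risk = [True, True, True, False]
alpha = [0.5, -0.2]

e, v = e_and_v(alpha, Z, at_risk)
print("e =", e)
print("v =", v)

# Numerical check of the derivative relation in condition 5 for S^(1).
h = 1e-6
s0, s1, _ = risk_set_moments(alpha, Z, at_risk)
for j in range(len(alpha)):
    up = list(alpha)
    up[j] += h
    down = list(alpha)
    down[j] -= h
    fd = (risk_set_moments(up, Z, at_risk)[0]
          - risk_set_moments(down, Z, at_risk)[0]) / (2 * h)
    print(j, fd, s1[j])
```

Because the empirical $v$ here is a weighted covariance matrix, it is always positive semidefinite; condition 4 additionally requires that the limiting integral $A(\alpha_0)$ be strictly positive definite.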

Footnotes

Supplementary Material

Additional supplementary material may be found in the online version of this article at the publisher's web site.

References

  1. Peng L, Fine JP. Competing risks quantile regression. Journal of the American Statistical Association. 2009;104:1440–1453.
  2. Peng L, Huang Y. Survival analysis with quantile regression models. Journal of the American Statistical Association. 2008;103:637–649.
  3. Reich BJ, Smith LB. Bayesian quantile regression for censored data. Biometrics. 2013;69:651–660. doi: 10.1111/biom.12053.
  4. Yin G, Zeng D, Li H. Power-transformed linear quantile regression with censored data. Journal of the American Statistical Association. 2008;103:1214–1224.
  5. Yin G, Cai J. Quantile regression models with multivariate failure time data. Biometrics. 2005;61:151–161. doi: 10.1111/j.0006-341X.2005.030815.x.
  6. Sun Y, Wang HJ, Gilbert PB. Quantile regression for competing risks data with missing cause of failure. Statistica Sinica. 2012;22:703–728. doi: 10.5705/ss.2010.093.
  7. Lee M, Fine J. Inference for cumulative incidence quantiles via parametric and nonparametric approaches. Statistics in Medicine. 2011;30:3221–3235. doi: 10.1002/sim.4349.
  8. Jiang R, Qian W, Zhou Z. Variable selection and coefficient estimation via composite quantile regression with randomly censored data. Statistics & Probability Letters. 2012;82:308–317.
  9. Wang HJ, Zhou J, Li Y. Variable selection for censored quantile regression. Statistica Sinica. 2013;23:145–167. doi: 10.5705/ss.2011.100.
  10. Verneris MR, Lee SJ, Ahn KW, Wang HL, Battiwalla M, Inamoto Y, Munker R, Aljurf M, Saber W, Spellman S, et al. HLA-mismatch is associated with worse outcomes after unrelated donor reduced intensity transplantation: An analysis from the CIBMTR. Biology of Blood and Marrow Transplantation. 2015;21:1783–1789. doi: 10.1016/j.bbmt.2015.05.028.
  11. Yuan M, Lin Y. Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B. 2006;68:49–67.
  12. Huang J, Ma S, Xie H, Zhang CH. A group bridge approach for variable selection. Biometrika. 2009;96:339–355. doi: 10.1093/biomet/asp020.
  13. Zhou N, Zhu J. Group variable selection via a hierarchical lasso and its oracle property. Statistics and Its Interface. 2010;3:557–574.
  14. Zhao W, Zhang R, Liu J. Sparse group variable selection based on quantile hierarchical lasso. Journal of Applied Statistics. 2014;41:1658–1677.
  15. Fu Z, Parikh CR, Zhou B. Penalized variable selection in competing risks regression. Lifetime Data Analysis. 2016;23:353–376. doi: 10.1007/s10985-016-9362-3.
  16. Breslow NE. Discussion of the paper by D. R. Cox. Journal of the Royal Statistical Society: Series B. 1972;34:216–217.
  17. Huang J, Li L, Liu Y, Zhao X. Group selection in the Cox model with a diverging number of covariates. Statistica Sinica. 2014;24:1787–1810.
  18. Cai J, Fan J, Li R, Zhou H. Variable selection for multivariate failure time data. Biometrika. 2005;92:303–316. doi: 10.1093/biomet/92.2.303.
  19. Koenker R. quantreg: Quantile Regression. 2016. R package version 5.26. URL https://CRAN.R-project.org/package=quantreg.
  20. Lee ER, Noh H, Park BU. Model selection via Bayesian information criterion for quantile regression models. Journal of the American Statistical Association. 2014;109:216–229.
  21. Shows JH, Lu W, Zhang HH. Sparse estimation and inference for censored median regression. Journal of Statistical Planning and Inference. 2010;140:1903–1917. doi: 10.1016/j.jspi.2010.01.043.
