Abstract
We study the group bridge and the adaptive group bridge penalties for competing risks quantile regression with group variables. While the group bridge consistently identifies non-zero group variables, the adaptive group bridge consistently selects variables not only at group level, but also at within-group level. We allow the number of covariates to diverge as the sample size increases. The oracle property for both methods is also studied. The performance of the group bridge and the adaptive group bridge is compared in simulation and in a real data analysis. The simulation study shows that the adaptive group bridge selects non-zero within-group variables more consistently than the group bridge. A bone marrow transplant study is provided as an example.
Keywords: Adaptive lasso, Competing risks quantile regression, Group bridge
1. Introduction
Quantile regression provides an alternative to the Cox proportional hazards model and the accelerated failure time (AFT) model in survival analysis [1]. It is often preferred when the survival distribution is skewed. There is a rich literature on survival quantile regression. Peng and Huang [2] proposed martingale-based estimating equations. Reich and Smith [3] developed a semiparametric Bayesian quantile regression model for censored data. Yin et al. [4] studied a power-transformed quantile regression model for survival data. Yin and Cai [5] proposed quantile regression models for correlated survival data.
Recently, quantile regression for competing risks data has received much attention. Peng and Fine [1] proposed a semiparametric model based on the competing risks AFT model. Sun et al. [6] developed a regression model for competing risks data in which the failure type is missing. Lee and Fine [7] studied parametric and nonparametric methods for inference on cumulative incidence quantiles.
Despite the increasing popularity of quantile regression for survival and competing risks data, the literature on variable selection is limited. Jiang et al. [8] proposed the adaptive lasso for composite quantile regression with randomly censored data. Wang et al. [9] also studied the adaptive lasso for censored quantile regression. Both considered a survival setting, not a competing risks setting. In addition, their methods addressed variable selection at the individual level, not at the group level. In practice, clinicians often encounter group variables such as categorical variables. For example, Verneris et al. [10] studied the outcomes of patients who underwent reduced-intensity conditioning allogeneic hematopoietic cell transplantation from 1999 to 2011. They studied competing risks outcomes including relapse and treatment-related mortality (TRM), where relapse and TRM are competing risks for each other. The variables that they considered for analysis consisted of binary and categorical variables.
Several penalties have been proposed to select group variables for linear regression and competing risks settings. Yuan and Lin [11] proposed the group lasso, which selects variables at group level, not at within-group level. Huang et al. [12] developed the group bridge to select both non-zero group and non-zero within-group variables. However, they studied group selection consistency only and did not show within-group variable selection consistency. Zhou and Zhu [13] proposed an adaptive hierarchical lasso having group variable selection consistency and within-group variable selection consistency. Zhao et al. [14] applied the adaptive hierarchical lasso penalty to identify non-zero variables at both levels for quantile linear regression. Fu et al. [15] extensively studied lasso, adaptive lasso, SCAD, and MCP for individual variable selection and their group variable selection versions for the subdistribution hazards model. However, they did not address within-group variable selection. In addition, their oracle property was limited to a fixed number of covariates. Despite extensive work in group variable selection for linear, linear quantile, and subdistribution hazards regression models, there is little literature on group variable selection in competing risks quantile regression. In particular, group and within-group level variable selection techniques remain unexplored in the current literature to the best of the authors’ knowledge.
We propose the group bridge and the adaptive group bridge for bi-level variable selection, that is, group and within-group variable selection, under the competing risks quantile regression model of Peng and Fine [1]. While the group bridge consistently identifies non-zero group variables, the adaptive group bridge consistently selects non-zero variables at both the group level and the within-group level. When the variables have no group structure, individual variable selection can be handled as a special case of the proposed methods. To our knowledge, even individual variable selection has not been studied for competing risks quantile regression. We study the oracle property of both methods while allowing the number of variables to diverge as the sample size increases. In a simulation study, we show that the adaptive group bridge identifies non-zero within-group variables more consistently than the group bridge. In Section 2, we describe the proposed methods and study their theoretical properties. In Section 3, we compare the performance of the adaptive group bridge and the group bridge via simulation. We present a real data example in Section 4 and conclude briefly in Section 5. All proofs of the theorems and lemmas in this paper can be found in the online Supplementary Materials.
2. Method
In this section, we propose a penalized competing risks quantile regression model and study its theoretical properties. We begin with some notation. Without loss of generality, we consider two causes of failure ε ∈ {1,2} with sample size n. We allow the number of covariates dn to increase as n increases. Let Ti, Ci, εi, and Zi be the event time, censoring time, cause of failure, and covariate vector of subject i for i = 1, …, n. Denote β0(τ) = {βj,0(τ); j = 0, …, dn}T as the true parameter vector given quantile τ, where β0,0 is the true intercept coefficient. Let Xi = Ti ˄ Ci be the observed time and δi = I(Ti ≤ Ci)I(εi = 1), where a ˄ b = min (a, b). We assume that (Ti, εi, Ci, Zi) are independent and identically distributed, and that the Ti’s and Ci’s are independent given Zi for i = 1, …, n. The study period is [0, L]. Let F1(t|Zi) be the cumulative incidence of cause 1 at time t given Zi, where F1(t|Zi) = P(Ti ≤ t,εi = 1|Zi). Given covariate Z, we define the τth conditional quantile of F1(t|Z) as Q1(τ|Z) = inf{t: F1(t|Z) ≥ τ}. For τ ∈ [τL,τU] with 0 < τL,τU < 1, we consider Q1(τ|Z) = g{ZTβ(τ)}, where g(·) is a known monotone link function. Let ‖·‖ be the Euclidean norm and a⊗2 = aaT for a vector a.
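The observed-data definitions above can be illustrated with a small sketch. The function names `observe` and `cif_quantile` are hypothetical, and the quantile function covers only the simplified case of fully observed data with no covariates; it is meant to show the definition Q1(τ) = inf{t: F1(t) ≥ τ}, not the regression estimator.

```python
def observe(T, C, eps):
    """Return (X, delta) with X_i = min(T_i, C_i) and
    delta_i = I(T_i <= C_i) * I(eps_i == 1)."""
    X = [min(t, c) for t, c in zip(T, C)]
    delta = [int(t <= c and e == 1) for t, c, e in zip(T, C, eps)]
    return X, delta


def cif_quantile(times, causes, tau):
    """Empirical tau-th quantile of the cause-1 cumulative incidence,
    inf{t : F1(t) >= tau}, assuming no censoring (illustration only).
    Returns None when the empirical F1 never reaches tau."""
    n = len(times)
    count = 0
    for t, e in sorted(zip(times, causes)):
        if e == 1:
            count += 1
            if count / n >= tau:
                return t
    return None


T = [2.0, 5.0, 1.0, 4.0]      # event times
C = [3.0, 3.0, 6.0, 6.0]      # censoring times
eps = [1, 1, 2, 1]            # causes of failure
X, delta = observe(T, C, eps)
```

Because F1 is a subdistribution function that levels off below 1, `cif_quantile(T, eps, 0.9)` returns `None` here: only three of the four subjects fail from cause 1, so the empirical F1 never reaches 0.9. This is one reason the model restricts τ to an interval [τL, τU].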
Let . For simplicity, we assume that the Zi’s are fixed over time. Let be the counting process for censoring and Yi(t) = I(Xi ≥ t). We use the Cox proportional hazards model to fit the censoring times Ci:
where is an arbitrary baseline hazard function for censoring and αT is the unknown parameter vector. Define
where d = 0,1, and 2. The baseline cumulative hazard function for censoring is estimated by the Breslow-type estimator [16]:
where is the estimator of α based on the Cox proportional hazards model. Then, we estimate as follows:
We can obtain the consistency of , , and as follows:
Lemma 2.1
Assume Conditions (a)-(e) in the Appendix. Then, we have , and .
When the censoring distribution G does not depend on any covariates, the Kaplan-Meier estimator can be used instead of the Breslow estimator. The proof of Lemma 2.1 can be found in the online Supplementary Materials.
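For the covariate-independent case just mentioned, a minimal sketch of a Kaplan-Meier estimate of the censoring survival function is given below. The name `km_censoring` is hypothetical, the convention G(t) = P(C > t) is an assumption, and the code assumes no ties between failure and censoring times; it is not the authors' implementation.

```python
def km_censoring(X, censored):
    """Kaplan-Meier estimate of G(t) = P(C > t), treating censoring as the event.

    X        : observed times X_i = min(T_i, C_i)
    censored : 1 if subject i was censored at X_i, 0 if a failure (any cause)
    Assumes no ties between failure times and censoring times.
    Returns a right-continuous step function G_hat(t).
    """
    censor_times = sorted({x for x, c in zip(X, censored) if c == 1})
    steps = []
    surv = 1.0
    for s in censor_times:
        at_risk = sum(1 for x in X if x >= s)
        d = sum(1 for x, c in zip(X, censored) if x == s and c == 1)
        surv *= 1.0 - d / at_risk
        steps.append((s, surv))

    def G_hat(t):
        value = 1.0
        for s, g in steps:
            if s <= t:
                value = g
            else:
                break
        return value

    return G_hat


G_hat = km_censoring([1.0, 2.0, 3.0, 4.0], [0, 1, 0, 1])
```

In inverse-probability-of-censoring weighting, each cause-1 failure is then weighted by the reciprocal of the estimated censoring survival probability at its observed time.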
Next, we define some notation for group variables and their memberships. Assume that we have K groups of variables. Let A1,…,AK be subsets of {1, …, dn} representing group memberships of variables, where the Ak’s may overlap. Define βA(τ) = {βj(τ); j ∈ A}T and βA,0(τ) = {βj,0(τ); j ∈ A}T for a set A. To distinguish the individual memberships between non-zero βj,0(τ)’s and zero βj,0(τ)’s, we define B1 and B2 such that βj,0(τ) ≠ 0 if j ∈ B1 and βj,0(τ) = 0 if j ∈ B2. To distinguish the group memberships between non-zero ’s and zero ’s, without loss of generality we further define E1 and E2 such that and , where for 1 ≤ k ≤ K1 and for K1 + 1 ≤ k ≤ K.
To estimate β(τ), Peng and Fine [1] considered the estimating equation Sn(b, τ) = 0, where
(1) |
To solve Sn(b, τ) = 0, Peng and Fine [1] proposed the following L1-type convex function:
where M is a very large positive number to bound and for all b’s in the parameter space for β0(τ). They studied the consistency and the asymptotic normality of the estimator of β0(τ) obtained by solving Sn(b, τ) = 0 when G is non-covariate dependent and dn is fixed.
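For orientation, an inverse-probability-of-censoring-weighted estimating function and its L1-type objective in the style of Peng and Fine [1] can be sketched as follows. The symbol \hat G for the estimated censoring survival function, the use of g^{-1} for a general link, and the exact arrangement of terms are assumptions made for illustration and may differ from the authors' displays.

```latex
S_n(b,\tau) \;=\; n^{-1/2}\sum_{i=1}^{n} Z_i
  \left[\frac{I\{X_i \le g(Z_i^{T}b)\}\,\delta_i}{\hat G(X_i \mid Z_i)} - \tau\right],
\qquad
U_n(b,\tau) \;=\; \sum_{i=1}^{n}\delta_i
  \left|\frac{g^{-1}(X_i) - b^{T}Z_i}{\hat G(X_i \mid Z_i)}\right|
  + \left|M - b^{T}\sum_{l=1}^{n}\frac{-Z_l\,\delta_l}{\hat G(X_l \mid Z_l)}\right|
  + \left|M - b^{T}\sum_{l=1}^{n} 2\tau Z_l\right| .
```

Under this form, when M is large enough that both bracketed terms are positive, a subgradient of U_n with respect to b is proportional to S_n(b, τ), so a minimizer of U_n approximately solves S_n(b, τ) = 0. The two terms involving M act as pseudo-observations, which is what lets standard L1 regression software carry out the minimization.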
To select variables at bi-level, we propose the following penalized function:
(2) |
where is a consistent estimator of β(τ), ν ≥ 0, λn > 0, and 0 < γ < 1. Following Huang et al. [12], we set , where |A| is the cardinality of A. If ν = 0, the penalty term is the group bridge penalty of Huang et al. [12] and Huang et al. [17]. When ν > 0, we call the penalty term the adaptive group bridge penalty. The adaptive group bridge reduces to i) individual variable selection when |Ak| = 1 for all k; and ii) the adaptive hierarchical lasso penalty of Zhou and Zhu [13] when γ = 1/2 and ck = 1 for all k.
Minimizing Wn(b, τ) can be reformulated as minimizing
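A minimal sketch of the penalty term follows, assuming it has the form λn Σk ck (Σj∈Ak |bj| / |β̃j|^ν)^γ with ck = |Ak|^(1−γ). Both the functional form and the choice of ck are assumptions modeled on Huang et al. [12]; the function name and the 0-based indexing are illustrative.

```python
def adaptive_group_bridge_penalty(b, groups, beta_tilde, lam, gamma, nu):
    """Assumed form: lam * sum_k c_k * (sum_{j in A_k} |b_j| / |beta_tilde_j|**nu)**gamma,
    with c_k = |A_k|**(1 - gamma).

    groups     : list of index lists A_k (0-based, may overlap)
    beta_tilde : initial consistent estimate; entries must be non-zero when nu > 0
    nu = 0 gives the (non-adaptive) group bridge penalty.
    """
    total = 0.0
    for A in groups:
        c_k = len(A) ** (1.0 - gamma)
        inner = sum(abs(b[j]) / abs(beta_tilde[j]) ** nu for j in A)
        total += c_k * inner ** gamma
    return lam * total


groups = [[0, 1], [2]]
b = [1.0, -1.0, 0.0]
beta_tilde = [2.0, 2.0, 1.0]
gb = adaptive_group_bridge_penalty(b, groups, beta_tilde, lam=1.0, gamma=0.5, nu=0.0)
agb = adaptive_group_bridge_penalty(b, groups, beta_tilde, lam=1.0, gamma=0.5, nu=1.0)
```

With ν > 0, coefficients whose initial estimates are small in magnitude receive a larger effective penalty, which is the mechanism behind within-group selection.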
(3) |
where θ = (θ1, …, θK)T. By defining
we can show the following lemma similarly to Proposition 1 of Huang et al. [12] and thus its proof is omitted:
Lemma 2.2
Assume that for 0 < γ < 1. Then, minimizes Wn (b, τ) if and only if minimizes , where θk > 0 and for k = 1,…, K.
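The equivalence in Lemma 2.2 can be checked numerically for a single group. The sketch below assumes that the constants of Proposition 1 of Huang et al. [12] carry over unchanged: λ = ζ^(1−γ) γ^(−γ) (1−γ)^(γ−1), the augmented group term is θ^(1−1/γ) c^(1/γ) s + ζθ, and the minimizing θ is c {(1−γ)/(ζγ)}^γ s^γ, where s denotes the (weighted) L1 norm of the group's coefficients. All function names are hypothetical.

```python
def lambda_from_zeta(zeta, gamma):
    """Assumed link between the two tuning parameters (Huang et al. [12], Prop. 1)."""
    return zeta ** (1.0 - gamma) * gamma ** (-gamma) * (1.0 - gamma) ** (gamma - 1.0)


def theta_hat(s, c, zeta, gamma):
    """Closed-form minimizer over theta of the augmented group term."""
    return c * ((1.0 - gamma) / (zeta * gamma)) ** gamma * s ** gamma


def augmented_term(theta, s, c, zeta, gamma):
    """theta**(1 - 1/gamma) * c**(1/gamma) * s + zeta * theta, for theta > 0."""
    return theta ** (1.0 - 1.0 / gamma) * c ** (1.0 / gamma) * s + zeta * theta


s, c, zeta, gamma = 2.0, 1.5, 0.7, 0.5
bridge_term = lambda_from_zeta(zeta, gamma) * c * s ** gamma
th = theta_hat(s, c, zeta, gamma)
profiled = augmented_term(th, s, c, zeta, gamma)
```

Here `profiled` matches `bridge_term` up to rounding, and moving θ away from `th` in either direction increases the augmented term. This is why the non-convex bridge penalty can be handled by alternating between a closed-form θ update and a weighted L1 problem in b.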
Define . Denote as the first derivative of with respect to b. We first study the oracle property of the group bridge estimator given τ. We assume that
- (C1) There exists ω > 0 such that and for any .
- (C2) Zij and βj,0(τ) are uniformly bounded for j = 1,…, dn.
- (C3) f1(t|z) is bounded above uniformly in t and z, where f1(t|z) = dF1(t|z)/dt.
- (C4) Define . For some ρ0 > 0, C1 > 0, and C2 > 0, and , where and κ(H) is the eigenvalue of a matrix H.
- (C5) Σ(τ) = Var{Sn (b,τ)}. There exist C3 > 0 and C4 > 0 such that and , where ρ0 > 0.
- (C6) There exists a constant C5 > 0 such that , for all 0 < j, j′ < dn, where is the (i, j)th entry of .
- (C7) .
- (C8) is bounded and , Mn = Op (1), where as n → ∞.
- (C9) , , as n → ∞.
(C1)−(C5) are similar to the standard conditions for the competing risks quantile regression of Peng and Fine [1]. Peng and Fine [1] suggested using a truncated censoring time C = min(C, L) for ω in (C1) so that (C1) is always satisfied. In practice, ω can be chosen as large as possible so that only a small loss of information occurs [1]. (C4)−(C6) and (C8) control the behavior of the estimating equation as dn grows. Conditions similar to (C4)−(C8) were used to allow dn to diverge as n → ∞ in Cai et al. [18], Huang et al. [12], and Huang et al. [17]. (C5), (C6), and (C9) restrict the variability of Var{Sn(b,τ)} and as n and dn increase. (C8) and (C9) control λn, the number of variables within each group, and the magnitude of the true parameters in non-zero groups, as in Huang et al. [17]. The variance matrix Σ(τ) in Condition (C5) can be specified as follows: Define eG(α0,t) and A(α0) as in the Appendix. We further define , and
. Then, , where . The Supplementary Materials include the detailed derivation of ηi(τ) and the asymptotic normality of the estimator obtained by solving Sn(b,τ) = 0 for fixed dn. Denote →d as convergence in distribution.
First of all, the following lemma shows the consistency of the estimator obtained by solving Sn (b,τ) = 0 when dn diverges as n ⟶ ∞:
Lemma 2.3
Let be the estimator obtained by solving Sn (b, τ) = 0. Then, under the conditions (C1) − (C7), we have .
The proof of Lemma 2.3 can be found in the online Supplementary Materials. Peng and Fine [1] studied the consistency of for non-covariate dependent censoring with fixed number of covariates. Lemma 2.3 extends their result to covariate-dependent censoring with diverging dn. Similarly to Huang et al. [17], we have the following theorem for the group bridge estimator given τ:
Theorem 2.4
Assume ν = 0 in (2). Under (C1) − (C9), we have
- Consistency: .
- Group variable selection consistency: .
- Asymptotic distribution: for fixed unknown ,
where and are the leading |E1| × |E1| submatrices of H{β0(τ)} and Σ(τ), respectively.
Using Lemma 2.3, Theorem 2.4 can be shown similarly to the proofs of Theorems 1 and 2 of Huang et al. [17] and thus its proof is omitted. Theorem 2.4 shows the group variable selection consistency of the group bridge estimator.
Although the group bridge can consistently select non-zero group variables, it may not effectively eliminate zero individual variables within non-zero groups. This may be improved by using ν > 0 in (2), that is, the adaptive group bridge penalty. For the adaptive group bridge, we have the following theorem given τ:
Theorem 2.5
Assume ν > 0 in (2). In addition to (C1) − (C7), we assume
(C8b) For some ν1 and ν2 such that 0 < ν1 < 1, 0 < ν2, and ν2/(1 − ν1) < ν, , , and
, , and .
Then, we have
- Consistency: .
- Bi-level variable selection consistency: .
- Asymptotic distribution: for fixed unknown ,
where H11 {β0(τ)} and Σ11(τ) are the leading |B1| × |B1| submatrices of H{β0(τ)} and Σ(τ), respectively.
The proof of Theorem 2.5 can be found in the online Supplementary Materials. (C8b) controls the magnitude of the non-zero parameters and the number of non-zero parameters. It requires that the smallest magnitude of the non-zero parameters does not shrink towards zero too fast. (C9b) controls λn and ν as n → ∞ to obtain the oracle property. Theorem 2.5 provides the oracle property of the adaptive group bridge estimator. In particular, it shows that the adaptive group bridge consistently identifies not only non-zero group variables, but also non-zero within-group variables.
To obtain , we minimize of (3). Then, the optimization algorithm is as follows:
1. Obtain a consistent estimator and an initial value β(0)(τ) from Peng and Fine [1] or the group bridge.
2. Compute
3. Obtain β(i+1) (τ) by minimizing with respect to b.
The minimization in Step 3 can be implemented using the R package quantreg [19]. To choose the tuning parameter ζn in (3), we propose the following BIC-type criterion motivated by Lee et al. [20] and Shows et al. [21]:
where pn is the number of non-zero estimates given ζn and C is some positive number.
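The θ update in Step 2 and the per-coefficient L1 weights used in Step 3 can be sketched as follows. The closed forms are assumptions borrowed from Proposition 1 of Huang et al. [12] with ck = |Ak|^(1−γ); the function names, the 0-based indexing, and the use of an infinite weight to encode a group already set to zero are illustrative choices, not the authors' code.

```python
import math


def step2_theta(beta, groups, beta_tilde, zeta, gamma, nu):
    """theta_k = c_k * ((1 - gamma) / (zeta * gamma))**gamma * s_k**gamma,
    where s_k = sum_{j in A_k} |beta_j| / |beta_tilde_j|**nu (assumed form)."""
    thetas = []
    for A in groups:
        c_k = len(A) ** (1.0 - gamma)
        s_k = sum(abs(beta[j]) / abs(beta_tilde[j]) ** nu for j in A)
        thetas.append(c_k * ((1.0 - gamma) / (zeta * gamma)) ** gamma * s_k ** gamma)
    return thetas


def step3_weights(thetas, groups, beta_tilde, gamma, nu, d):
    """Weights w_j so that Step 3 minimizes U_n(b) + sum_j w_j * |b_j|.
    A group with theta_k = 0 gets infinite weight: its coefficients stay at zero."""
    w = [0.0] * d
    for theta, A in zip(thetas, groups):
        c_k = len(A) ** (1.0 - gamma)
        for j in A:
            if theta == 0.0:
                w[j] = math.inf
            else:
                w[j] += (theta ** (1.0 - 1.0 / gamma) * c_k ** (1.0 / gamma)
                         / abs(beta_tilde[j]) ** nu)
    return w


groups = [[0, 1], [2]]
thetas = step2_theta([1.0, -1.0, 0.0], groups, [1.0, 1.0, 1.0],
                     zeta=1.0, gamma=0.5, nu=1.0)
weights = step3_weights(thetas, groups, [1.0, 1.0, 1.0], gamma=0.5, nu=1.0, d=3)
```

One common device for the weighted problem is to append, for each finite weight wj, a pseudo-observation with response 0 and design row wj·ej to the data passed to an L1 regression solver, and to drop the columns with infinite weight. Whether the authors proceeded this way is not stated in the text.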
3. Simulation
We performed simulation studies under two group variable settings: i) group variables consisting of continuous variables; and ii) group variables consisting of continuous variables and categorical variables. Censoring times and event times were independently generated. Let , , and . Event times and cause of failure were generated as follows:
Thus, . We set . Selecting non-zero βj,0 (τ) for j = 1, …, dn was of interest in this simulation study. We selected p1 and λc to generate 40% cause 1 events, 30% cause 2 events, and 30% censoring. Each simulation was run for 1000 iterations. The competing risks quantile regression of Peng and Fine [1] and the group bridge were used to estimate . The adaptive group bridge with ν = 1 was compared to the group bridge. We evaluated the mean squared error that was calculated by
where is the estimator of at the ith iteration given τ. The proposed BIC-type criterion with C = 1.5 was used to select the tuning parameter. Two τ values were examined: τ = 0.1 and 0.25. We first considered Setting i) group variables consisting of continuous variables with non-covariate dependent censoring distribution, that is, α0 = 0. We examined n = 400, 600, and 800. To generate , three correlated continuous variables for each group were generated from N(0, Σ), where
Variables were assumed to be independent if they belonged to different groups. For n = 400, 600, and 800, there were 9, 10, and 11 groups, respectively. The true β0(τ) for n = 400 was {β1,0(τ), …, β9,0(τ)}T = (1, −1, 0, −1, 1, 0, 1, 0, 0)T and {β10,0(τ), …, β27,0(τ)}T = (0, …, 0)T. For n = 600, we added {β28,0(τ), β29,0(τ), β30,0(τ)}T = (0,0,0)T. For n = 800, we further added {β31,0(τ), β32,0(τ), β33,0(τ)}T = (0,0,0)T. This setting allowed dn to grow as n increased. The numbers of non-zero groups and non-zero individual variables in the underlying model were 3 and 5, respectively, for each n.
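The true coefficient vector and group structure of Setting i) can be written down directly, which makes the counts of non-zero groups and variables easy to verify. The function names are hypothetical and indices are 0-based.

```python
def setting_one_truth(n):
    """True slope vector and group memberships for Setting i):
    9, 10, or 11 groups of three variables for n = 400, 600, or 800."""
    n_groups = {400: 9, 600: 10, 800: 11}[n]
    beta = [1, -1, 0, -1, 1, 0, 1, 0, 0] + [0] * (3 * n_groups - 9)
    groups = [list(range(3 * k, 3 * k + 3)) for k in range(n_groups)]
    return beta, groups


def count_nonzero(beta, groups):
    """Return (number of non-zero groups, number of non-zero variables)."""
    nonzero_groups = sum(1 for A in groups if any(beta[j] != 0 for j in A))
    nonzero_individual = sum(1 for v in beta if v != 0)
    return nonzero_groups, nonzero_individual


for n in (400, 600, 800):
    beta, groups = setting_one_truth(n)
    print(n, len(beta), count_nonzero(beta, groups))
```

The number of slopes grows from 27 to 30 to 33 while the non-zero structure stays fixed at 3 groups and 5 variables, matching the description above.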
Table 1 summarizes the simulation results. “AGB-CQ”, “AGB-GB”, and “GB” indicate the adaptive group bridge with the initial estimator from Peng and Fine [1], the adaptive group bridge with the initial estimator from the group bridge, and the group bridge, respectively. “% Corr. Group” and “% Corr. Individual” are the proportions of iterations in which the corresponding method correctly identified the non-zero group variables and the non-zero individual variables of the underlying model, respectively. “Group Size” and “Model Size” are the mean numbers of groups and individual variables selected by each method, respectively. “MSER” is the ratio of the median MSE of each method to that of the oracle estimator. Both the adaptive group bridge and the group bridge identified the true non-zero and zero groups very well. The mean group sizes of the adaptive group bridge and the group bridge were very close to 3. However, the group bridge performed poorly in within-group variable selection, that is, individual variable selection: it tended to select zero individual variables as non-zero. In contrast, the adaptive group bridge identified the true non-zero individual variables well. In addition, as n increased, the mean group sizes and the mean model sizes of the adaptive group bridge became closer to 3 and 5, respectively. The MSERs of the adaptive group bridge with the initial estimator from the group bridge were lower than those of the other methods. Furthermore, the MSERs of the adaptive group bridge generally decreased as n increased. We also conducted a simulation under the same setting except that the pairwise correlation between continuous variables in different groups was set to 0.2. The results were similar to Table 1 and are not reported.
Table 1.
τ | n | Method | % Corr. Group | % Corr. Individual | Group Size | Model Size | MSER |
---|---|---|---|---|---|---|---|
0.1 | 400 | AGB-CQ | 0.995 | 0.987 | 3.005 | 5.013 | 2.108 |
0.1 | 400 | AGB-GB | 0.991 | 0.976 | 3.001 | 5.011 | 1.159 |
0.1 | 400 | GB | 0.988 | 0.554 | 3.012 | 5.570 | 2.451 |
0.1 | 600 | AGB-CQ | 0.996 | 0.996 | 3.004 | 5.005 | 2.281 |
0.1 | 600 | AGB-GB | 0.996 | 0.992 | 2.999 | 5.000 | 1.056 |
0.1 | 600 | GB | 0.994 | 0.622 | 3.008 | 5.472 | 2.491 |
0.1 | 800 | AGB-CQ | 0.997 | 0.996 | 3.003 | 5.004 | 1.834 |
0.1 | 800 | AGB-GB | 0.997 | 0.995 | 3.003 | 5.005 | 1.024 |
0.1 | 800 | GB | 0.997 | 0.627 | 3.003 | 5.434 | 2.315 |
0.25 | 400 | AGB-CQ | 0.917 | 0.855 | 3.097 | 5.185 | 1.316 |
0.25 | 400 | AGB-GB | 0.931 | 0.871 | 3.076 | 5.150 | 1.041 |
0.25 | 400 | GB | 0.945 | 0.363 | 3.062 | 5.963 | 1.543 |
0.25 | 600 | AGB-CQ | 0.936 | 0.884 | 3.073 | 5.141 | 1.293 |
0.25 | 600 | AGB-GB | 0.948 | 0.892 | 3.054 | 5.115 | 1.077 |
0.25 | 600 | GB | 0.960 | 0.358 | 3.041 | 5.977 | 1.583 |
0.25 | 800 | AGB-CQ | 0.949 | 0.905 | 3.059 | 5.111 | 1.041 |
0.25 | 800 | AGB-GB | 0.966 | 0.929 | 3.038 | 5.085 | 1.006 |
0.25 | 800 | GB | 0.973 | 0.404 | 3.028 | 5.875 | 1.463 |
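The selection summaries reported in the simulation tables can be computed along the following lines. This is a sketch under two assumptions that the text does not spell out: an iteration counts as "correct" only when the selected set equals the true non-zero set exactly, and an estimate is treated as zero when its absolute value is below a small tolerance `tol`. The function name is hypothetical.

```python
def selection_summary(estimates, beta_true, groups, tol=1e-8):
    """Summarize variable selection over simulation iterations.

    estimates : list of estimated coefficient vectors, one per iteration
    beta_true : true coefficient vector
    groups    : list of index lists (0-based) giving group memberships
    """
    true_vars = {j for j, v in enumerate(beta_true) if v != 0}
    true_groups = {k for k, A in enumerate(groups) if true_vars & set(A)}
    corr_group = corr_indiv = 0
    group_total = model_total = 0
    for est in estimates:
        sel_vars = {j for j, v in enumerate(est) if abs(v) > tol}
        sel_groups = {k for k, A in enumerate(groups) if sel_vars & set(A)}
        corr_group += int(sel_groups == true_groups)
        corr_indiv += int(sel_vars == true_vars)
        group_total += len(sel_groups)
        model_total += len(sel_vars)
    m = len(estimates)
    return {
        "pct_corr_group": corr_group / m,
        "pct_corr_individual": corr_indiv / m,
        "group_size": group_total / m,
        "model_size": model_total / m,
    }


summary = selection_summary(
    estimates=[[0.9, 0.0, 0.0, 0.0], [1.1, 0.2, 0.0, 0.0]],
    beta_true=[1.0, 0.0, 0.0, 0.0],
    groups=[[0, 1], [2, 3]],
)
```

In this toy input the second iteration keeps a spurious variable inside the correct group, so group-level selection is correct in both iterations while individual-level selection is correct in only one. This is the pattern the group bridge shows in the tables.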
Next, we performed a simulation study for Setting ii) group variables consisting of continuous variables and categorical variables with non-covariate dependent censoring distribution. We examined 3 sample sizes: n = 600, 900, and 1200. For n = 600, there were 10 groups: 5 groups consisting of continuous variables (Groups 1 to 5) and 5 groups consisting of categorical variables (Groups 6 to 10). Groups 1 and 2 contained 6 continuous variables each and Groups 3 to 5 contained 3 continuous variables each. The pairwise correlation among continuous variables within a group was 0.5. There was no correlation between continuous variables in different groups. Groups 6 and 7 had 7 categories each (that is, 6 indicator variables each) and Groups 8 to 10 had 4 categories each (that is, 3 indicator variables each). The reference group for each categorical variable was set to 0. Thus, there were 42 variables in total. The true β0(τ) for n = 600 was {β1,0(τ), …, β6,0(τ)}T = (1, −1, 0, …, 0)T, {β7,0(τ), …, β12,0(τ)}T = (0, …, 0)T, {β13,0(τ), β14,0(τ), β15,0(τ)}T = (1,0,0)T, {β16,0(τ), …, β21,0(τ)}T = (0, …, 0)T, {β22,0(τ), …, β27,0(τ)}T = (1, −1, 0, …, 0)T, {β28,0(τ), …, β33,0(τ)}T = (0, …, 0)T, {β34,0(τ), β35,0(τ), β36,0(τ)}T = (1,0,0)T, and {β37,0(τ), …, β42,0(τ)}T = (0, …, 0)T. For n = 900, we added one more group consisting of 3 continuous variables with pairwise correlation 0.5 and {β43,0(τ), β44,0(τ), β45,0(τ)}T = (0,0,0)T. For n = 1200, in addition to {β43,0(τ), β44,0(τ), β45,0(τ)}T, we further added a categorical variable having 4 categories, that is, 3 indicator variables: {β46,0(τ), β47,0(τ), β48,0(τ)}T = (0,0,0)T. Thus, the numbers of non-zero groups and non-zero individual variables in the underlying model were 4 and 6, respectively, for each n.
Table 2 shows the simulation results. The adaptive group bridge identified the true non-zero and zero groups better than the group bridge when n = 600 and 900 for τ = 0.1, and when n = 600 for τ = 0.25. When n = 1200, both methods selected non-zero group variables very well. The mean group sizes of the adaptive group bridge were very close to 4. As in Setting i), the group bridge performed poorly in individual variable selection. In contrast, the adaptive group bridge identified the true non-zero individual variables well. In addition, as n increased, the mean group sizes and the mean model sizes of the adaptive group bridge became closer to 4 and 6, respectively. The MSERs of the adaptive group bridge with the initial estimator from the group bridge were lower than those of the other methods. In addition, the MSERs of the adaptive group bridge generally decreased as n increased.
Table 2.
τ | n | Method | % Corr. Group | % Corr. Individual | Group Size | Model Size | MSER |
---|---|---|---|---|---|---|---|
0.1 | 600 | AGB-CQ | 0.776 | 0.500 | 3.737 | 5.354 | 4.061 |
0.1 | 600 | AGB-GB | 0.870 | 0.652 | 3.861 | 5.603 | 1.621 |
0.1 | 600 | GB | 0.536 | 0.136 | 3.348 | 5.485 | 8.226 |
0.1 | 900 | AGB-CQ | 0.952 | 0.809 | 3.955 | 5.835 | 3.216 |
0.1 | 900 | AGB-GB | 0.989 | 0.913 | 3.989 | 5.934 | 1.344 |
0.1 | 900 | GB | 0.830 | 0.306 | 3.798 | 6.386 | 4.387 |
0.1 | 1200 | AGB-CQ | 0.990 | 0.910 | 3.998 | 5.940 | 3.283 |
0.1 | 1200 | AGB-GB | 0.999 | 0.973 | 4.001 | 5.989 | 1.248 |
0.1 | 1200 | GB | 0.955 | 0.383 | 3.955 | 6.704 | 4.147 |
0.25 | 600 | AGB-CQ | 0.933 | 0.647 | 4.015 | 6.097 | 1.925 |
0.25 | 600 | AGB-GB | 0.955 | 0.768 | 4.023 | 6.098 | 1.421 |
0.25 | 600 | GB | 0.881 | 0.187 | 3.897 | 7.187 | 2.660 |
0.25 | 900 | AGB-CQ | 0.974 | 0.810 | 4.025 | 6.163 | 1.681 |
0.25 | 900 | AGB-GB | 0.979 | 0.879 | 4.019 | 6.118 | 1.212 |
0.25 | 900 | GB | 0.979 | 0.220 | 3.997 | 7.393 | 2.484 |
0.25 | 1200 | AGB-CQ | 0.965 | 0.832 | 4.033 | 6.167 | 1.519 |
0.25 | 1200 | AGB-GB | 0.964 | 0.888 | 4.034 | 6.126 | 1.250 |
0.25 | 1200 | GB | 0.991 | 0.264 | 4.005 | 7.370 | 2.730 |
Last, we performed a simulation study for Setting ii) with covariate-dependent censoring distribution. We used the same β0(τ) as in Setting ii) with non-covariate dependent censoring distribution. The true α0 for n = 600 was (α1,0, …, α6,0)T = (1, −1, 0, …, 0)T, (α7,0, …, α21,0)T = (0, …, 0)T, (α22,0, …, α27,0)T = (1, −1, 0, …, 0)T, and (α28,0, …, α42,0)T = (0, …, 0)T. For n = 900 and 1200, we added (α43,0, α44,0, α45,0)T = (0, 0, 0)T and (α46,0, α47,0, α48,0)T = (0, 0, 0)T, respectively. The Breslow-type estimator was used to estimate . We selected p1 and λc to generate 50% cause 1 events, 20% cause 2 events, and 30% censoring. Table 3 summarizes the simulation results. In general, the results were similar to those in Table 2. The adaptive group bridge performed better than the group bridge in terms of individual variable selection and MSER.
Table 3.
τ | n | Method | % Corr. Group | % Corr. Individual | Group Size | Model Size | MSER |
---|---|---|---|---|---|---|---|
0.1 | 600 | AGB-CQ | 0.769 | 0.547 | 3.725 | 5.393 | 3.096 |
0.1 | 600 | AGB-GB | 0.862 | 0.687 | 3.850 | 5.615 | 1.625 |
0.1 | 600 | GB | 0.533 | 0.155 | 3.372 | 5.538 | 7.692 |
0.1 | 900 | AGB-CQ | 0.956 | 0.830 | 3.976 | 5.859 | 2.504 |
0.1 | 900 | AGB-GB | 0.981 | 0.902 | 3.991 | 5.915 | 1.356 |
0.1 | 900 | GB | 0.796 | 0.287 | 3.849 | 6.411 | 4.049 |
0.1 | 1200 | AGB-CQ | 0.996 | 0.912 | 3.996 | 5.945 | 3.250 |
0.1 | 1200 | AGB-GB | 0.998 | 0.963 | 4.000 | 5.982 | 1.248 |
0.1 | 1200 | GB | 0.949 | 0.370 | 3.946 | 6.669 | 3.972 |
0.25 | 600 | AGB-CQ | 0.940 | 0.674 | 4.007 | 6.040 | 1.881 |
0.25 | 600 | AGB-GB | 0.963 | 0.803 | 4.024 | 6.051 | 1.346 |
0.25 | 600 | GB | 0.907 | 0.210 | 3.914 | 7.179 | 2.376 |
0.25 | 900 | AGB-CQ | 0.966 | 0.854 | 4.031 | 6.099 | 1.680 |
0.25 | 900 | AGB-GB | 0.967 | 0.891 | 4.031 | 6.093 | 1.206 |
0.25 | 900 | GB | 0.885 | 0.261 | 4.088 | 7.363 | 2.350 |
0.25 | 1200 | AGB-CQ | 0.980 | 0.888 | 4.020 | 6.099 | 1.537 |
0.25 | 1200 | AGB-GB | 0.986 | 0.920 | 4.014 | 6.078 | 1.177 |
0.25 | 1200 | GB | 0.996 | 0.296 | 4.000 | 7.202 | 2.386 |
4. Bone marrow transplant data example
The adaptive group bridge was applied to a bone marrow transplant data set. Verneris et al. [10] studied the outcomes of patients who underwent reduced-intensity conditioning allogeneic hematopoietic cell transplantation from 1999 to 2011. We considered 2011 patients with human leukocyte antigen fully-matched unrelated donors. Relapse was the outcome of interest for the analysis. Treatment-related mortality (TRM) was a competing risk. Relapse occurred in 40.5% of patients and TRM in 26.6%, and 32.9% were censored. In addition, 69%, 16%, and 8% of relapse events occurred within 6 months, between 6 and 12 months, and between 12 and 24 months, respectively. Thus, the distribution of relapse events was skewed. The overall relapse rate at 1 year was about 35%. The 13 binary or categorical variables that we considered for variable selection were disease type, recipient age, donor age, donor-recipient sex match, donor-recipient cytomegalovirus (CMV) match, ABO blood type match, donor parity, disease status at transplant, conditioning intensity, total body irradiation, graft type, graft-versus-host disease (GVHD) prophylaxis, and in-vivo T cell depletion. They consisted of 28 indicator variables. Based on the Cox proportional hazards model, the censoring distribution did not depend on any covariates.
We selected variables for the competing risks quantile regression for relapse at the 0.35th quantile using the following three selection methods: the group bridge, the adaptive group bridge with the initial estimator from Peng and Fine [1], and the adaptive group bridge with the initial estimator from the group bridge. The reference group was set to zero. Table 4 shows the selected variables and their estimates. The group bridge selected disease status at transplant, CMV match, conditioning intensity, in-vivo T cell depletion, graft type, and GVHD prophylaxis. In contrast, the adaptive group bridge selected the same variables with either initial estimator: disease status at transplant, CMV match, conditioning intensity, and in-vivo T cell depletion. The adaptive group bridge did not select graft type or GVHD prophylaxis, which is why all of their estimates are zero in Table 4. The competing risks quantile regression of Peng and Fine [1] was fitted using the variables selected by at least one of the three methods. “CQ” in Table 4 indicates the estimates and p-values from this fit. It suggests that all variables selected by the adaptive group bridge were significant, whereas graft type and GVHD prophylaxis, which only the group bridge selected, were not significant based on their p-values.
Table 4.
Variable | Subcategory | AGB-CQ Est. | AGB-GB Est. | GB Est. | CQ Est. | CQ p-value |
---|---|---|---|---|---|---|
Disease status | Early (ref) | 0 | 0 | 0 | 0 | |
Disease status | Intermediate | 0 | 0 | 0 | −0.055 | 0.503 |
Disease status | Advanced | −0.891 | −0.891 | −0.859 | −0.460 | < 0.001 |
CMV match | +/+ (ref) | 0 | 0 | 0 | 0 | |
CMV match | +/− | −0.455 | −0.445 | −0.476 | −0.363 | 0.018 |
CMV match | −/+ | 0 | 0 | 0 | −0.034 | 0.902 |
CMV match | −/− | 0 | 0 | 0 | −0.027 | 0.774 |
CMV match | Missing | 0 | 0 | 0 | −0.054 | 0.939 |
Conditioning intensity | Reduced intensity (ref) | 0 | 0 | 0 | 0 | |
Conditioning intensity | Nonmyeloablative | −1.056 | −1.060 | −1.165 | −0.536 | < 0.001 |
In-vivo T cell depletion | No (ref) | 0 | 0 | 0 | 0 | |
In-vivo T cell depletion | Yes | −0.488 | −0.488 | −0.466 | −0.301 | 0.010 |
Graft type | Bone marrow (ref) | 0 | 0 | 0 | 0 | |
Graft type | Peripheral blood | 0 | 0 | 0.181 | 0.130 | 0.137 |
GVHD prophylaxis | FK506 ± others (ref) | 0 | 0 | 0 | 0 | |
GVHD prophylaxis | Others | 0 | 0 | 0.175 | 0.121 | 0.212 |
5. Conclusion
The group bridge and the adaptive group bridge were proposed to select variables for the competing risks quantile regression. Their oracle property was studied. In particular, the adaptive group bridge not only consistently identifies non-zero group variables, but also consistently selects non-zero within-group variables. We also proposed the BIC-type criterion to choose a tuning parameter. The proposed BIC-type criterion appears to work properly in the simulation study. The adaptive group bridge selected non-zero within-group variables more consistently than the group bridge in the simulation study. A bone marrow transplant example showed the usefulness of the adaptive group bridge.
The proposed method is limited to the case dn < n. Developing a group variable selection method for dn > n would be an important research problem. A two-step variable selection procedure may be developed for this: after screening group variables in the first step, we may use the adaptive group bridge to obtain a more parsimonious list of non-zero variables in the second step. The theoretical justification of the proposed BIC-type criterion also needs to be studied in the future.
Acknowledgments
This work was supported in part by Institutional Research Grant #14-247-29 from the American Cancer Society and the MCW Cancer Center, and the US National Cancer Institute (U24CA076518). The authors would like to thank the Associate Editor and two anonymous reviewers for their helpful comments that significantly improved the manuscript.
Appendix
For and the Breslow estimator, we assume the following:
(a) and P{Yi(t) = 1} > 0 for t ∈ [0, L], i = 1, …, n, and as n → ∞.
(b) Zij is bounded almost surely for all i,j and is bounded almost surely for any and α ∈ ℬ, where ℬ is a neighborhood of α0.
(c) For d = 0, 1, 2, there exists a neighborhood ℬ of α0 such that are continuous functions and in probability.
(d) The matrix is positive definite, where and .
(e) For all α ∈ ℬ, t ∈ [0, L], , and , where , d = 0, 1, 2 are continuous functions of α ∈ ℬ uniformly in t ∈ [0, L] and are bounded on ℬ × [0, L], and is bounded away from zero on ℬ × [0, L].
Footnotes
Supplementary Material
Additional supplementary material may be found in the online version of this article at the publisher’s website.
References
- 1. Peng L, Fine JP. Competing risks quantile regression. Journal of the American Statistical Association. 2009;104:1440–1453.
- 2. Peng L, Huang Y. Survival analysis with quantile regression models. Journal of the American Statistical Association. 2008;103:637–649.
- 3. Reich BJ, Smith LB. Bayesian quantile regression for censored data. Biometrics. 2013;69:651–660. doi: 10.1111/biom.12053.
- 4. Yin G, Zeng D, Li H. Power-transformed linear quantile regression with censored data. Journal of the American Statistical Association. 2008;103:1214–1224.
- 5. Yin G, Cai J. Quantile regression models with multivariate failure time data. Biometrics. 2005;61:151–161. doi: 10.1111/j.0006-341X.2005.030815.x.
- 6. Sun Y, Wang HJ, Gilbert PB. Quantile regression for competing risks data with missing cause of failure. Statistica Sinica. 2012;22:703–728. doi: 10.5705/ss.2010.093.
- 7. Lee M, Fine J. Inference for cumulative incidence quantiles via parametric and nonparametric approaches. Statistics in Medicine. 2011;30:3221–3235. doi: 10.1002/sim.4349.
- 8. Jiang R, Qian W, Zhou Z. Variable selection and coefficient estimation via composite quantile regression with randomly censored data. Statistics & Probability Letters. 2012;82:308–317.
- 9. Wang HJ, Zhou J, Li Y. Variable selection for censored quantile regression. Statistica Sinica. 2013;23:145–167. doi: 10.5705/ss.2011.100.
- 10. Verneris MR, Lee SJ, Ahn KW, Wang HL, Battiwalla M, Inamoto Y, Munker R, Aljurf M, Saber W, Spellman S, et al. HLA-mismatch is associated with worse outcomes after unrelated donor reduced intensity transplantation: An analysis from the CIBMTR. Biology of Blood and Marrow Transplantation. 2015;21:1783–1789. doi: 10.1016/j.bbmt.2015.05.028.
- 11. Yuan M, Lin Y. Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B. 2006;68:49–67.
- 12. Huang J, Ma S, Xie H, Zhang CH. A group bridge approach for variable selection. Biometrika. 2009;96:339–355. doi: 10.1093/biomet/asp020.
- 13. Zhou N, Zhu J. Group variable selection via a hierarchical lasso and its oracle property. Statistics and Its Interface. 2010;3:557–574.
- 14. Zhao W, Zhang R, Liu J. Sparse group variable selection based on quantile hierarchical lasso. Journal of Applied Statistics. 2014;41:1658–1677.
- 15. Fu Z, Parikh CR, Zhou B. Penalized variable selection in competing risks regression. Lifetime Data Analysis. 2016;23:353–376. doi: 10.1007/s10985-016-9362-3.
- 16. Breslow NE. Discussion of the paper by D. R. Cox. Journal of the Royal Statistical Society: Series B. 1972;34:216–217.
- 17. Huang J, Li L, Liu Y, Zhao X. Group selection in the Cox model with a diverging number of covariates. Statistica Sinica. 2014;24:1787–1810.
- 18. Cai J, Fan J, Li R, Zhou H. Variable selection for multivariate failure time data. Biometrika. 2005;92:303–316. doi: 10.1093/biomet/92.2.303.
- 19. Koenker R. quantreg: Quantile Regression. 2016. URL https://CRAN.R-project.org/package=quantreg. R package version 5.26.
- 20. Lee ER, Noh H, Park BU. Model selection via Bayesian information criterion for quantile regression models. Journal of the American Statistical Association. 2014;109:216–229.
- 21. Shows JH, Lu W, Zhang HH. Sparse estimation and inference for censored median regression. Journal of Statistical Planning and Inference. 2010;140:1903–1917. doi: 10.1016/j.jspi.2010.01.043.