On correlation rank screening for ultra-high dimensional competing risks data

Xiaolin Chen; Chenguang Li; Tao Zhang; Zhenlong Gao

doi:10.1080/02664763.2021.1884209

2021 Feb 9;49(7):1848–1864.

PMCID: PMC9042004 PMID: 35707564

Abstract

In recent years, numerous feature screening schemes have been developed for ultra-high dimensional standard survival data with only one failure event. Nevertheless, existing literature pays little attention to related investigations for competing risks data, in which subjects suffer from multiple mutually exclusive failures. In this article, we develop a new marginal feature screening for ultra-high dimensional time-to-event data to allow for competing risks. The proposed procedure is model-free, and robust against heavy-tailed distributions and potential outliers for time to the type of failure of interest. Apart from this, it is invariant to any monotone transformation of event time of interest. Under rather mild assumptions, it is shown that the newly suggested approach possesses the ranking consistency and sure independence screening properties. Some numerical studies are conducted to evaluate the finite-sample performance of our method and make a comparison with its competitor, while an application to a real data set is provided to serve as an illustration.

Keywords: Consistency in ranking, feature screening, model-free, sure independence screening, ultra-high dimensional competing risks data

1. Introduction

Competing risks data are commonly encountered in a variety of fields, such as biomedical studies, reliability testing and empirical health economics. Unlike standard survival analysis, where subjects can experience only one type of failure, competing risks analysis involves multiple but mutually exclusive failure events, also called competing events. As an example, in prostate cancer studies, patients may die of many other causes before their prostate cancer becomes symptomatic, owing to the slow progression of the disease.

In regression analysis of competing risks data, it is of essential concern to evaluate the effect of a predictor (covariate or feature) on time to a particular type of failure. In modern applications, data with high or ultra-high dimensional covariates are routinely collected, and competing risks data are no exception. Here, ultra-high dimension means that the dimension of the covariates could increase at a nonpolynomial rate with the sample size. To establish a parsimonious model with the purpose of enhancing model interpretability, penalized variable selection methods have been extended from complete data to competing risks data by several authors (see, for instance, [1,10,12]). A large body of literature has demonstrated that penalized approaches work well for data with moderately high dimensional covariates. When applied to ultra-high dimensional data, however, these procedures suffer from the simultaneous challenges of computational expediency, statistical accuracy and algorithmic stability [8].

To tackle these challenges, Fan and Lv [7] highlighted the marginal feature screening procedures and established the general asymptotic framework for marginal Pearson correlation feature screening under the linear regression model. Since this pioneering work, marginal feature screening has almost become an indispensable tool in the ultra-high dimensional data analysis, and various extensions of marginal Pearson correlation feature screening are now available in the literature. We refer to the review paper by Liu et al. [16] and the references therein for recent developments.

As far as ultra-high dimensional time-to-event data are concerned, recent years have witnessed a rapid growth of literature on feature screening methods, including [3,4,11,13,20–22] to cite a few. However, almost all the existing feature screening methods only work for standard survival data with a single type of failure. Naive application of these approaches to competing risks scenarios is problematic; thus specialized procedures need to be developed, just as for nonparametric and low-dimensional semiparametric problems [18]. Motivated by this point, Li et al. [14] and Chen et al. [5] recently developed model-based and model-free feature screening methods for ultra-high dimensional competing risks data, respectively. In [14], the authors suggested a joint feature screening under the well-known Fine-Gray competing risks model [9] by extending the methodology of Xu and Chen [19]. In terms of feature screening, however, we believe that model-free procedures are more appropriate than model-based ones in ultra-high dimensional data analysis, because they apply to a wider range of settings and avoid the limitations of specific regression models. Chen et al. [5] put forward the first model-free marginal and conditional feature screening approaches for ultra-high dimensional competing risks data by mimicking the cSIRS of Zhou and Zhu [22] and the SCS of Chen et al. [4] for standard survival data.

In this article, we propose, in a model-free fashion, a new marginal feature screening procedure for ultra-high dimensional competing risks data. Our screening utility quantifies the correlation of each predictor with the survival time of the event type of interest, denoted by type-1 without loss of generality, through the correlation between this predictor and the type-1 failure's cumulative incidence function (CIF). The CIF is one of the two most essential notions in competing risks regression analysis. By sorting the screening indices, the features having effects on the type-1 failure's CIF will be ranked at the top; thus, those predictive for type-1 failure could be identified successfully. The newly proposed technique enjoys several appealing advantages, such as robustness against heavy-tailed distributions and potential outliers for time to the type of failure of interest, and invariance to any monotone transformation of the event time of interest. Besides, it is noted that our methodology reduces to the CRS of Zhang et al. [20] in the absence of competing events.

The rest of this article is organized as follows. In Section 2, we firstly develop a new marginal screening methodology for competing risks data and then establish the corresponding large-sample properties. Section 3 evaluates the finite-sample performances of the newly proposed procedure and compares it with its competitor, crSIRS. Our approach is further illustrated by a bladder cancer data set in Section 4. A brief discussion is provided in Section 5. Lastly, the proofs of Section 2 are deferred to the appendix.

2. Methodology and theoretical properties

Without loss of generality, it is assumed that there exist only two types of failures: type-1 failure is of interest and type-2 is the competing event. Otherwise, type-2 includes all other competing risks. Let T and J denote the continuous failure time and the associated failure type, respectively. In the presence of right censoring, one can only observe $Y = \min \{T, C\}$ and $δ = I (T \leq C)$, where C is the censoring time with cumulative distribution function $G (y)$. Furthermore, J is unobservable, and thus meaningless, if T is censored by C. Denote by $x = (X_{1}, X_{2}, \dots, X_{p})^{T}$ the p-dimensional vector of covariates, which are potentially predictive for type-1 failure. To ease presentation, throughout this article we assume that $E (X_{l}) = 0$ for $l = 1, \dots, p$.

To identify a small number of features being most influential for type-1 failure, we firstly define the active and inactive predictors without specifying a regression model in the same fashion as those for the complete and standard survival data [22,23]. Let $F_{1} (t | x) = Pr (T \leq t, J = 1 | x)$ , which is the cumulative incidence function of type-1 failure. Then define

$A = \{1 \leq l \leq p : F_{1} (t | x) \text{ functionally depends on } X_{l}\}.$

If $l \in A$ , $X_{l}$ is referred to as an active predictor, otherwise it is referred to as an inactive predictor.

Mimicking the marginal utility used in the correlation rank sure independent screening procedure of Zhang et al. [20], we propose a new marginal utility measure to rank all the covariates related to the failure type of interest. Define $Ω (Y) = E \{x F_{1} (Y)\}$ with $F_{1} (y) = \Pr (T \leq y, J = 1)$ and denote by $Ω_{l} (Y)$ the lth component of $Ω (Y)$. It is easy to see that $Ω_{l} (Y) = \mathrm{cov} \{X_{l}, F_{1} (Y)\}$. Thus $Ω_{l} (Y)$ evaluates the correlation between $X_{l}$ and type-1 failure. Then, we suggest using

$ω_{l} = E [Ω_{l}^{2} (Y)]$

for $l = 1, 2, \dots, p$ as the population version of our screening procedure's marginal utility.
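The covariance identity stated above can be checked in one line. The step below is only a worked restatement under the centering assumption $E (X_{l}) = 0$ made earlier in this section:

$\mathrm{cov} \{X_{l}, F_{1} (Y)\} = E \{X_{l} F_{1} (Y)\} - E (X_{l}) E \{F_{1} (Y)\} = E \{X_{l} F_{1} (Y)\} = Ω_{l} (Y).$

In particular, if $X_{l}$ is independent of $Y$, then $E \{X_{l} F_{1} (Y)\} = E (X_{l}) E \{F_{1} (Y)\} = 0$, so the utility of such a predictor vanishes.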

Suppose that $(Y_{i}, δ_{i}, δ_{i} J_{i}, x_{i})$, $i = 1, \dots, n$, are n independent copies of $(Y, δ, δ J, x)$. We construct the sample version of $ω_{l}$ from these independent and identically distributed observations. For notational clarity, it is assumed throughout the paper that the sample predictors are all standardized; that is, $n^{- 1} \sum_{i = 1}^{n} X_{i l} = 0$ and $n^{- 1} \sum_{i = 1}^{n} X_{i l}^{2} = 1$ for $l = 1, \dots, p$. Without right censoring, $(T_{i}, J_{i})$, $i = 1, \dots, n$, are fully observed, and the cumulative incidence function of type-1 failure could be estimated by its empirical counterpart, i.e. $n^{- 1} \sum_{i = 1}^{n} I (T_{i} \leq y, J_{i} = 1)$. In the presence of censoring, this observation and the idea of inverse probability of censoring weighting motivate the following estimator,

$\hat{F}_{1} (y) = \frac{1}{n} \sum_{i = 1}^{n} \frac{δ_{i} I (Y_{i} \leq y, J_{i} = 1)}{1 - \hat{G}_{n} (Y_{i})},$

where

$\hat{G}_{n} (y) = 1 - \prod_{i = 1}^{n} \left( 1 - \frac{1}{\sum_{j = 1}^{n} I \{Y_{j} \geq Y_{i}\}} \right)^{(1 - δ_{i}) I (Y_{i} \leq y)}$

is the well-known Kaplan–Meier estimator of $G (y)$ . Consequently, the sample version of $ω_{l}$ could be provided by

$\hat{ω}_{l} = \left\{ \frac{1}{n} \sum_{i = 1}^{n} X_{i l} \hat{F}_{1} (Y_{i}) \right\}^{2}.$
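The three displays above translate fairly directly into array operations. The following is a minimal sketch, not the authors' implementation: the function names `ipcw_cif_at_observations` and `crcrs_utilities` are hypothetical, observed times are assumed to have no ties, and censored records are assumed to carry cause code 0.

```python
import numpy as np

def ipcw_cif_at_observations(y, delta, cause, cause_of_interest=1):
    """Evaluate the IPCW estimate of F_1 at every observed time Y_i (assumes no ties)."""
    y = np.asarray(y, dtype=float)
    delta = np.asarray(delta, dtype=int)
    cause = np.asarray(cause, dtype=int)
    n = y.size
    # at_risk[i] = sum_j I{Y_j >= Y_i}
    at_risk = (y[None, :] >= y[:, None]).sum(axis=1)
    # Kaplan-Meier factors for the censoring distribution: only censored points contribute
    factors = np.where(delta == 0, 1.0 - 1.0 / at_risk, 1.0)
    # leq[i, j] = I{Y_j <= Y_i}
    leq = y[None, :] <= y[:, None]
    # 1 - G_hat(Y_i): product of the factors over {j : Y_j <= Y_i}
    surv_c = np.where(leq, factors[None, :], 1.0).prod(axis=1)
    # IPCW weights, zero unless a failure of the type of interest is observed
    event1 = (delta == 1) & (cause == cause_of_interest)
    weights = np.zeros(n)
    weights[event1] = 1.0 / surv_c[event1]
    # F1_hat(Y_i) = n^{-1} sum_j weights_j I{Y_j <= Y_i}
    return (leq * weights[None, :]).sum(axis=1) / n

def crcrs_utilities(x, f1_at_y):
    """Squared sample covariance between each standardized column of x and F1_hat(Y)."""
    x = np.asarray(x, dtype=float)
    f1 = np.asarray(f1_at_y, dtype=float)
    # Standardize so that each column has sample mean 0 and second moment 1
    # (columns are assumed non-constant)
    xc = x - x.mean(axis=0)
    xs = xc / np.sqrt((xc ** 2).mean(axis=0))
    return (xs.T @ f1 / f1.size) ** 2

# Toy data: four subjects, the second one censored
y = [1.0, 2.0, 3.0, 4.0]
delta = [1, 0, 1, 1]
cause = [1, 0, 1, 2]
x = [[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]]
f1 = ipcw_cif_at_observations(y, delta, cause)
omega = crcrs_utilities(x, f1)
```

In this toy case the censored subject inflates the weight of the later type-1 failure from 1 to 1.5, and only the first covariate receives a nonzero utility. The pairwise comparisons cost O(n²) memory, which is acceptable for the sample sizes considered in this paper but would need a sorted-scan version for much larger n.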

Eventually, the index set of active predictors is recovered by picking up those with large ${\hat{ω}}_{l}$ 's; that is,

$\hat{A} = \{1 \leq l \leq p : \hat{ω}_{l} \geq ν_{n}\},$

where $ν_{n}$ is a threshold sequence given in advance and varies with sample size n. Alternatively, one could give another more practical definition of $\hat{A}$ :

$\hat{A} = \{1 \leq l \leq p : \hat{ω}_{l} \text{ ranks among the first } k \text{ largest of all}\},$

where the submodel size k is designated by the user in advance. See [7] and [8] for a detailed discussion of the choice of k. Because our methodology is motivated by the correlation rank screening for standard survival data, we subsequently refer to it as the correlation rank sure independent screening approach for ultra-high dimensional competing risks data (crCRS for short).
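The two definitions of $\hat{A}$ differ only in how the cut is made. As a small illustration (a sketch with hypothetical function names and zero-based indices, rather than the one-based indices used in the text):

```python
import numpy as np

def select_by_threshold(omega, nu):
    """Indices l with omega_l >= nu (hard-threshold rule)."""
    return np.flatnonzero(np.asarray(omega, dtype=float) >= nu)

def select_top_k(omega, k):
    """Indices of the k largest utilities (top-k rule), returned in increasing order."""
    order = np.argsort(-np.asarray(omega, dtype=float), kind="stable")
    return np.sort(order[:k])

omega = [0.2, 0.05, 0.6, 0.1]
by_threshold = select_by_threshold(omega, 0.1)
top_two = select_top_k(omega, 2)
```

The top-k rule is usually easier to apply in practice because it does not require knowing the scale of the utilities in advance.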

To build the large-sample theories for crCRS, such as sure screening property [7] and consistency of ranking [23], we suppose that the following assumptions hold:

(A.1)
There exists a positive constant η such that $\Pr (t \leq T \leq C) \geq η$, where $t \in (0, τ]$ with τ being the maximum follow-up time. Furthermore, $\sup \{t : \Pr (T > t) > 0\} \geq \sup \{t : \Pr (C > t) > 0\}$; $G (y)$ has a uniformly bounded first-order derivative.
(A.2)
The second moments of all covariates are uniformly bounded; that is, there exists a positive constant M such that $\max_{1 \leq l \leq p} E (X_{l}^{2}) \leq M$.

Assumption (A.1) is frequently imposed in the survival analysis literature to make sure that the Kaplan–Meier estimator is well behaved. Assumption (A.2) is borrowed directly from Zhang et al. [20]. The following theorem states the sure screening property of our crCRS.

Theorem 2.1

Under Assumptions (A.1) and (A.2), for sufficiently large n, there exists a positive constant $d_{1}$ such that

$\Pr \left( \max_{1 \leq l \leq p} | \hat{ω}_{l} - ω_{l} | \geq c_{1} n^{- κ} \right) \leq O \left( p \exp \left[ - d_{1} \left\{ \frac{n^{1 - 2 κ}}{\log \log (n)} \right\}^{1 / 2} \right] \right),$

where $c_{1}$ and $κ < 1 / 2$ are prespecified positive constants. Furthermore, assuming that $\min_{l \in A} ω_{l} \geq 2 c_{1} n^{- κ}$ and setting $ν_{n} = c_{1} n^{- κ}$, we have

$\Pr (A \subset \hat{A}) \geq 1 - O \left( s \exp \left[ - d_{1} \left\{ \frac{n^{1 - 2 κ}}{\log \log (n)} \right\}^{1 / 2} \right] \right),$

where s is the number of truly important predictors and may increase with sample size n.
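The passage from the first display to the second is a standard argument, sketched here under the assumption that the exponential tail bound holds for each $| \hat{ω}_{l} - ω_{l} |$ individually; the full proof is in the appendix. On the event that $\max_{l \in A} | \hat{ω}_{l} - ω_{l} | < c_{1} n^{- κ}$, the signal condition gives

$\hat{ω}_{l} > ω_{l} - c_{1} n^{- κ} \geq 2 c_{1} n^{- κ} - c_{1} n^{- κ} = ν_{n} \quad \text{for every } l \in A,$

so every active predictor passes the threshold and $A \subset \hat{A}$. Taking complements and applying the union bound over the active set only,

$\Pr (A \not\subset \hat{A}) \leq \Pr \left( \max_{l \in A} | \hat{ω}_{l} - ω_{l} | \geq c_{1} n^{- κ} \right) \leq \sum_{l \in A} \Pr \left( | \hat{ω}_{l} - ω_{l} | \geq c_{1} n^{- κ} \right),$

which is a sum of s terms. This is why the factor s, rather than p, appears in the sure screening bound.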

Denote by $x_{A}$ the vector consisting of $X_{l}$ with $l \in A$ and define $x_{A^{c}}$ similarly. In Theorem 2.2, we provide the property of consistency in ranking of our crCRS. Before presenting the theorem, we impose the following assumption in addition to the assumptions mentioned above.

(A.3)
The type-1 failure and $x_{A^{c}}$ are conditionally independent given $x_{A}$ ; $x_{A^{c}}$ is independent of $x_{A}$ .

This assumption may seem relatively strong. However, we only use it to prove the consistency in ranking property for the population version of our screening utility, not for the sample version. In addition, even when this assumption is violated, our crCRS still works well in our experience from the numerical studies.

Theorem 2.2

Under Assumption (A.3), as long as $\min_{l \in A} ω_{l} > 0$, we have $ω_{l} = 0$ if and only if $l \in A^{c}$, and thus $\max_{l \in A^{c}} ω_{l} < \min_{l \in A} ω_{l}$. Moreover, under the same assumptions as those in Theorem 2.1, we have

$\Pr \left( \max_{l \in A^{c}} \hat{ω}_{l} < \min_{l \in A} \hat{ω}_{l} \right) \geq 1 - O \left( p \exp \left[ - d_{1} \left\{ \frac{n^{1 - 2 κ}}{\log \log (n)} \right\}^{1 / 2} \right] \right),$

where $d_{1}$ is defined in Theorem 2.1.

The results in Theorem 2.2 can be proved along the same lines as the proof of Theorem 2.2 in [20]. Thus, we omit the proof in this article.

3. Simulation studies

In this section, we conduct extensive numerical studies to assess the finite-sample properties of our proposed crCRS, and compare it with its competitor, the crSIRS of [5]. When implementing crCRS and crSIRS, we choose the model size of the estimated index set of active predictors to be $k = [n / \log (n)]$. The overall performance of both approaches is examined, based on 500 replications, through three summary statistics: the selection proportion of each feature, denoted by $P_{l}$, where l is the index of the predictor; the proportion of replications in which all active covariates are selected, denoted by $P_{a}$; and the minimum model size S needed to include all active covariates. The dimensionality p is set to be 1000 and 3000, while the sample size n varies from case to case to obtain satisfactory summary statistics.
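For concreteness, the submodel size and the minimum model size can be computed as follows. This is a sketch with hypothetical function names; it reads $[\cdot]$ as the integer part, which agrees with the value $[301 / \log (301)] = 52$ used in Section 4, and it uses zero-based predictor indices.

```python
import math
import numpy as np

def submodel_size(n):
    """k = [n / log(n)], with [.] read as the integer part."""
    return int(math.floor(n / math.log(n)))

def minimum_model_size(omega, active):
    """Smallest k such that the top-k utilities contain every active index."""
    omega = np.asarray(omega, dtype=float)
    # ranks[l] = position of predictor l when utilities are sorted in decreasing order
    ranks = np.empty(omega.size, dtype=int)
    ranks[np.argsort(-omega, kind="stable")] = np.arange(1, omega.size + 1)
    return int(ranks[list(active)].max())

k_301 = submodel_size(301)
s_example = minimum_model_size([0.5, 0.1, 0.4, 0.3], active=[0, 2])
```

Across replications, $P_{a}$ is then the fraction of runs in which the minimum model size does not exceed k, and the reported quantiles of S are taken over the same runs.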

Throughout this section, the censoring time takes the uniform distribution on $(0, \tilde{c})$ , where different $\tilde{c}$ is chosen to produce an approximate 40% censoring rate in each example. The covariate vector $x$ follows the multivariate normal distribution with mean vector $0_{p \times 1}$ and covariance matrix $Σ = (ρ^{| i - j |})_{p \times p}$ , where the values of ρ are specified as 0.5 and 0.8 in subsequent examples. Failure times and associated types are generated according to specific scenarios in the following examples. By design, the first three predictors are truly important for type-1 failure in all examples.

Example 3.1

Failure types are firstly generated according to $Pr (J = 1) = 1 - Pr (J = 2) = 0.6$ , which is free of the covariates. Then given J = 1 and $x$ , the model used for generating T is

$\log (T) = β^{T} x + a ε,$

where $β = (1.5, 1.8, 1.3, 0, \dots, 0)^{T}$ , a = 0.5, 1, 1.5, and the error term ε follows standard normal distribution or standard Gumbel distribution (Type-I extreme value distribution). Similarly, given J = 2 and $x$ , T is generated from $\log (T) = 3 (α^{T} x + ϵ)$ , where $α = (0, 0, 0, 1.0, 1.0, 1.0, 0, \dots, 0)^{T}$ and ϵ follows standard normal distribution.
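The data-generating mechanism of Example 3.1 can be sketched as below. Several details are assumptions rather than facts from the paper: the function names are hypothetical, the censoring bound `c_tilde` must be tuned by the user to reach roughly 40% censoring (its values are not reported here), the AR(1) recursion is used as an equivalent way of drawing from the covariance $Σ = (ρ^{| i - j |})$, and NumPy's default (maximum-type) Gumbel is taken as the "standard Gumbel" error.

```python
import numpy as np

def ar1_covariates(n, p, rho, rng):
    """Draw n rows from N(0, Sigma) with Sigma_ij = rho^|i-j| via an AR(1) recursion."""
    z = rng.standard_normal((n, p))
    x = np.empty((n, p))
    x[:, 0] = z[:, 0]
    for j in range(1, p):
        x[:, j] = rho * x[:, j - 1] + np.sqrt(1.0 - rho ** 2) * z[:, j]
    return x

def simulate_example_3_1(n, p, rho, a, c_tilde, rng, gumbel=False):
    """Return (Y, delta, delta * J, x) for one replication; requires p >= 6."""
    x = ar1_covariates(n, p, rho, rng)
    beta = np.zeros(p)
    beta[:3] = [1.5, 1.8, 1.3]
    alpha = np.zeros(p)
    alpha[3:6] = 1.0
    # Failure type is drawn independently of the covariates: Pr(J = 1) = 0.6
    cause = np.where(rng.random(n) < 0.6, 1, 2)
    eps1 = rng.gumbel(size=n) if gumbel else rng.standard_normal(n)
    eps2 = rng.standard_normal(n)
    log_t = np.where(cause == 1, x @ beta + a * eps1, 3.0 * (x @ alpha + eps2))
    t = np.exp(log_t)
    # Uniform censoring on (0, c_tilde)
    c = rng.uniform(0.0, c_tilde, size=n)
    y = np.minimum(t, c)
    delta = (t <= c).astype(int)
    return y, delta, delta * cause, x

rng = np.random.default_rng(0)
y, delta, delta_cause, x = simulate_example_3_1(
    n=50, p=10, rho=0.5, a=1.0, c_tilde=5.0, rng=rng
)
```

The third returned array follows the $δ_{i} J_{i}$ convention of Section 2, so censored subjects carry the code 0.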

In this example, the relationship between the log type-1 failure time and the covariates is simply linear. Tables 1–4 report the results of our simulation studies for Example 3.1. From these results, it can easily be seen that both procedures perform similarly in this specific setting. Concerning crCRS, we conclude that it is relatively robust to the error distribution, the correlations among covariates and the choice of a with small sample sizes and very high dimensions, at least in this example.

Table 2.

The proportion of $P_{l}$ and $P_{a}$ in Example 3.1 with $ε_{1}$ following standard Gumbel distribution.

				p = 1000				p = 3000
a	ρ	n	Method	$P_{1}$	$P_{2}$	$P_{3}$	$P_{a}$	$P_{1}$	$P_{2}$	$P_{3}$	$P_{a}$
0.5	0.5	100	crCRS	0.92	0.99	0.99	0.91	0.85	0.98	0.98	0.83
			crSIRS	0.97	1.00	0.94	0.91	0.93	0.99	0.88	0.82
		150	crCRS	0.99	1.00	1.00	0.99	0.98	1.00	1.00	0.98
			crSIRS	1.00	1.00	1.00	1.00	0.99	1.00	0.99	0.98
	0.8	100	crCRS	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00
			crSIRS	1.00	1.00	1.00	1.00	0.99	1.00	0.99	0.99
		150	crCRS	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00
			crSIRS	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00
1.0	0.5	100	crCRS	0.90	0.99	0.99	0.89	0.82	0.97	0.98	0.80
			crSIRS	0.94	1.00	0.93	0.88	0.91	0.98	0.84	0.77
		150	crCRS	0.98	1.00	1.00	0.98	0.97	1.00	1.00	0.97
			crSIRS	1.00	1.00	1.00	0.99	0.99	1.00	0.99	0.98
	0.8	100	crCRS	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00
			crSIRS	1.00	1.00	0.99	0.99	0.99	1.00	0.99	0.98
		150	crCRS	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00
			crSIRS	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00
1.5	0.5	100	crCRS	0.85	0.98	0.98	0.84	0.77	0.96	0.96	0.73
			crSIRS	0.91	0.99	0.90	0.82	0.86	0.97	0.78	0.68
		150	crCRS	0.96	1.00	1.00	0.96	0.96	1.00	1.00	0.96
			crSIRS	1.00	1.00	0.99	0.98	0.98	1.00	0.97	0.96
	0.8	100	crCRS	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00
			crSIRS	0.99	0.99	0.99	0.98	0.98	1.00	0.98	0.96
		150	crCRS	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00
			crSIRS	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00

Table 3.

The 5%, 25%, 50%, 75% and 95% quantiles of minimum model size S to include all the important covariates in Example 3.1 with $ε_{1}$ following a standard normal distribution.

				p = 1000					p = 3000
a	ρ	n	Method	5%	25%	50%	75%	95%	5%	25%	50%	75%	95%
0.5	0.5	100	crCRS	3	4	5	7	33	3	4	6	14	104
			crSIRS	3	3	4	7	28	3	3	5	15	113
		150	crCRS	3	4	5	6	9	3	3	4	5	11
			crSIRS	3	3	3	3	5	3	3	3	3	14
	0.8	100	crCRS	3	4	5	6	7	3	4	5	6	7
			crSIRS	3	3	3	3	5	3	3	3	4	8
		150	crCRS	3	4	5	5	6	3	4	5	5	6
			crSIRS	3	3	3	3	4	3	3	3	3	4
1.0	0.5	100	crCRS	3	4	5	8	42	3	4	6	16	125
			crSIRS	3	3	4	8	35	3	3	6	18	139
		150	crCRS	3	4	5	6	10	3	4	5	6	15
			crSIRS	3	3	3	3	6	3	3	3	4	18
	0.8	100	crCRS	3	4	5	6	7	3	4	5	6	7
			crSIRS	3	3	3	3	6	3	3	3	4	9
		150	crCRS	3	4	5	5	6	3	4	5	6	6
			crSIRS	3	3	3	3	4	3	3	3	3	4
1.5	0.5	100	crCRS	3	4	5	10	60	3	4	7	22	163
			crSIRS	3	3	5	12	46	3	3	8	26	185
		150	crCRS	3	4	5	6	14	3	4	5	6	20
			crSIRS	3	3	3	3	9	3	3	3	5	27
	0.8	100	crCRS	3	4	5	6	7	3	4	5	6	7
			crSIRS	3	3	3	4	8	3	3	3	4	12
		150	crCRS	3	4	5	6	6	3	4	5	6	6
			crSIRS	3	3	3	3	4	3	3	3	3	4


Table 1.

The proportion of $P_{l}$ and $P_{a}$ in Example 3.1 with $ε_{1}$ following a standard normal distribution.

				p = 1000				p = 3000
a	ρ	n	Method	$P_{1}$	$P_{2}$	$P_{3}$	$P_{a}$	$P_{1}$	$P_{2}$	$P_{3}$	$P_{a}$
0.5	0.5	100	crCRS	0.94	0.99	0.99	0.93	0.84	0.99	0.97	0.82
			crSIRS	0.96	1.00	0.95	0.91	0.92	0.99	0.89	0.82
		150	crCRS	0.99	0.998	1.00	0.99	0.98	0.998	1.00	0.98
			crSIRS	1.00	1.00	1.00	1.00	0.99	1.00	0.99	0.98
	0.8	100	crCRS	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00
			crSIRS	1.00	1.00	1.00	1.00	0.99	1.00	0.98	0.98
		150	crCRS	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00
			crSIRS	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00
1.0	0.5	100	crCRS	0.93	0.99	0.99	0.91	0.81	0.98	0.97	0.78
			crSIRS	0.96	1.00	0.95	0.91	0.90	0.99	0.86	0.78
		150	crCRS	0.98	1.00	1.00	0.98	0.97	1.00	1.00	0.97
			crSIRS	1.00	1.00	1.00	1.00	0.99	1.00	0.98	0.98
	0.8	100	crCRS	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00
			crSIRS	1.00	1.00	0.99	0.99	0.99	0.99	0.98	0.98
		150	crCRS	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00
			crSIRS	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00
1.5	0.5	100	crCRS	0.90	0.99	0.98	0.87	0.78	0.97	0.96	0.75
			crSIRS	0.95	0.99	0.93	0.88	0.86	0.98	0.81	0.70
		150	crCRS	0.97	1.00	1.00	0.97	0.96	1.00	1.00	0.96
			crSIRS	0.99	1.00	0.99	0.99	0.98	1.00	0.98	0.96
	0.8	100	crCRS	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00
			crSIRS	0.99	1.00	0.99	0.99	0.99	0.99	0.97	0.96
		150	crCRS	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00
			crSIRS	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00


Table 4.

The 5%, 25%, 50%, 75% and 95% quantiles of minimum model size S to include all the important covariates in Example 3.1 with $ε_{1}$ following standard Gumbel distribution.

				p = 1000					p = 3000
a	ρ	n	Method	5%	25%	50%	75%	95%	5%	25%	50%	75%	95%
0.5	0.5	100	crCRS	3	4	5	8	37	3	4	6	12	98
			crSIRS	3	3	4	7	34	3	3	5	13	103
		150	crCRS	3	4	5	6	10	3	4	5	6	12
			crSIRS	3	3	3	3	6	3	3	3	3	10
	0.8	100	crCRS	3	4	5	6	7	3	4	5	6	7
			crSIRS	3	3	3	3	6	3	3	3	3	7
		150	crCRS	3	4	5	6	6	3	4	5	6	6
			crSIRS	3	3	3	3	4	3	3	3	3	4
1.0	0.5	100	crCRS	3	4	5	9	50	3	4	6	15	169
			crSIRS	3	3	4	10	49	3	3	6	19	137
		150	crCRS	3	4	5	6	13	3	4	5	6	14
			crSIRS	3	3	3	3	8	3	3	3	4	13
	0.8	100	crCRS	3	4	5	6	7	3	4	5	6	7
			crSIRS	3	3	3	4	6	3	3	3	4	10
		150	crCRS	3	4	5	6	7	3	4	5	6	6
			crSIRS	3	3	3	3	4	3	3	3	3	4
1.5	0.5	100	crCRS	3	4	6	13	83	3	4	8	28	227
			crSIRS	3	3	5	14	87	3	4	9	34	195
		150	crCRS	3	4	5	6	22	3	4	5	6	22
			crSIRS	3	3	3	4	14	3	3	3	4	27
	0.8	100	crCRS	3	4	5	6	7	3	4	5	6	7
			crSIRS	3	3	3	4	9	3	3	3	4	17
		150	crCRS	3	4	5	6	7	3	4	5	6	7
			crSIRS	3	3	3	3	4	3	3	3	3	5


Example 3.2

Failure type follows the same distribution as that in Example 3.1. The conditional model of T, given J = 1 and $x$ , is specified by

$\log (T) = β^{T} x + \exp {γ^{T} x} ε,$

where $β = (1, 0.8, 0, \dots, 0)^{T}$ , $γ = (0, 0, 1.2, 0, \dots, 0)^{T}$ , and ε follows standard normal distribution or standard Gumbel distribution. The distribution for T conditional on J = 2 and $x$ is the same as that in Example 3.1.

Data generated from this example exhibit heteroscedasticity in type-1 failure time. Simulation results are summarized in Tables 5 and 6. From these results, we observe that crCRS behaves uniformly better than crSIRS for all the situations considered here. Particularly, crCRS presents considerable improvements over crSIRS under scenarios with higher dimensions and smaller sample sizes.

Table 5.

The proportion of $P_{l}$ and $P_{a}$ in Example 3.2. $G (0, 1)$ denotes the standard Gumbel distribution.

				p = 1000				p = 3000
$ε_{1}$	ρ	n	Method	$P_{1}$	$P_{2}$	$P_{3}$	$P_{a}$	$P_{1}$	$P_{2}$	$P_{3}$	$P_{a}$
$N (0, 1)$	0.5	300	crCRS	1.00	1.00	0.88	0.88	1.00	1.00	0.72	0.72
			crSIRS	1.00	1.00	0.75	0.75	1.00	1.00	0.53	0.53
		400	crCRS	1.00	1.00	0.93	0.93	1.00	1.00	0.90	0.90
			crSIRS	1.00	1.00	0.89	0.89	1.00	1.00	0.73	0.73
	0.8	150	crCRS	1.00	1.00	1.00	1.00	1.00	0.99	0.99	0.99
			crSIRS	0.96	0.92	0.71	0.70	0.93	0.90	0.64	0.62
		250	crCRS	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00
			crSIRS	1.00	1.00	0.96	0.96	1.00	1.00	0.90	0.90
$G (0, 1)$	0.5	200	crCRS	0.99	0.99	0.91	0.89	0.96	0.98	0.85	0.81
			crSIRS	0.996	0.99	0.82	0.81	0.98	0.97	0.66	0.64
		300	crCRS	1.00	1.00	0.99	0.99	1.00	1.00	0.98	0.98
			crSIRS	1.00	1.00	0.97	0.97	1.00	1.00	0.91	0.91
	0.8	100	crCRS	0.99	1.00	1.00	0.99	0.98	1.00	0.98	0.96
			crSIRS	0.91	0.89	0.72	0.69	0.84	0.81	0.63	0.57
		200	crCRS	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00
			crSIRS	1.00	1.00	0.98	0.98	1.00	1.00	0.95	0.95


Table 6.

The 5%, 25%, 50%, 75% and 95% quantiles of minimum model size S to include all the important covariates in Example 3.2.

				p = 1000					p = 3000
$ε_{1}$	ρ	n	Method	5%	25%	50%	75%	95%	5%	25%	50%	75%	95%
$N (0, 1)$	0.5	300	crCRS	5	6	8	20	124	5	6	13	63	708
			crSIRS	3	4	13	53	186	3	9	46	167	635
		400	crCRS	5	6	7	12	81	5	6	7	19	146
			crSIRS	3	3	6	19	131	3	4	17	73	309
	0.8	150	crCRS	3	4	5	6	8	3	4	5	6	9
			crSIRS	3	3	9	40	248	3	4	14	74	512
		250	crCRS	3	5	5	6	7	3	5	5	6	7
			crSIRS	3	3	3	5	34	3	3	4	10	90
$G (0, 1)$	0.5	200	crCRS	4	5	6	12	101	4	5	8	22	168
			crSIRS	3	3	6	22	177	3	4	15	93	560
		300	crCRS	4	5	6	6	11	4	5	6	7	25
			crSIRS	3	3	3	5	33	3	3	4	11	107
	0.8	100	crCRS	3	4	5	6	11	3	4	6	7	17
			crSIRS	3	3	7	33	208	3	4	14	89	804
		200	crCRS	3	4	5	6	7	3	4	5	6	7
			crSIRS	3	3	3	4	16	3	3	3	5	32


Note: $G (0, 1)$ denotes the standard Gumbel distribution.

Example 3.3

Failure types are generated as those in Example 3.1. Given J = 1 and $x$ , failure time T follows the following nonlinear model

$\log (T) = 1.2 X_{1} + \sin^{2} (X_{2}) + (0.2 + X_{3})^{- 3} + ε,$

where ε follows the standard normal distribution or standard Gumbel distribution. The distribution for T conditional on J = 2 and $x$ is the same as that in Example 3.1.

In this example, we examine the performance of both methods under a general nonlinear regression structure. Summary statistics are presented in Tables 7 and 8. These results lead to findings similar to those in Example 3.2. Furthermore, the improvement of crCRS over crSIRS is more pronounced than in the previous example.

Table 7.

The proportion of $P_{l}$ and $P_{a}$ in Example 3.3. $G (0, 1)$ denotes the standard Gumbel distribution.

				p = 1000				p = 3000
$ε_{1}$	ρ	n	Method	$P_{1}$	$P_{2}$	$P_{3}$	$P_{a}$	$P_{1}$	$P_{2}$	$P_{3}$	$P_{a}$
$N (0, 1)$	0.5	300	crCRS	0.91	0.97	1.00	0.90	0.81	0.94	1.00	0.78
			crSIRS	0.88	0.90	1.00	0.81	0.75	0.72	1.00	0.60
		400	crCRS	0.97	1.00	1.00	0.97	0.93	0.99	1.00	0.92
			crSIRS	0.97	0.97	1.00	0.94	0.93	0.87	1.00	0.82
	0.8	100	crCRS	0.94	1.00	1.00	0.94	0.89	0.98	1.00	0.89
			crSIRS	0.69	0.78	0.89	0.61	0.52	0.65	0.76	0.43
		200	crCRS	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00
			crSIRS	0.98	1.00	1.00	0.97	0.95	0.99	1.00	0.95
$G (0, 1)$	0.5	300	crCRS	0.88	0.98	1.00	0.87	0.76	0.92	1.00	0.72
			crSIRS	0.88	0.87	1.00	0.79	0.75	0.71	0.99	0.60
		400	crCRS	0.97	1.00	1.00	0.97	0.90	0.99	1.00	0.90
			crSIRS	0.96	0.95	1.00	0.93	0.85	0.86	1.00	0.75
	0.8	100	crCRS	0.95	0.99	1.00	0.94	0.85	0.99	1.00	0.85
			crSIRS	0.62	0.72	0.86	0.52	0.49	0.59	0.71	0.40
		200	crCRS	1.00	1.00	1.00	1.00	1.00	1.00	1.00	1.00
			crSIRS	0.98	0.99	1.00	0.97	0.93	0.97	1.00	0.92


Table 8.

The 5%, 25%, 50%, 75% and 95% quantiles of minimum model size S to include all the important covariates in Example 3.3.

				p = 1000					p = 3000
$ε_{1}$	ρ	n	Method	5%	25%	50%	75%	95%	5%	25%	50%	75%	95%
$N (0, 1)$	0.5	300	crCRS	5	6	7	16	123	5	6	10	44	335
			crSIRS	3	4	9	34	164	3	7	33	117	506
		400	crCRS	5	6	6	10	46	5	6	7	15	122
			crSIRS	3	3	5	14	77	3	4	9	40	215
	0.8	100	crCRS	4	5	6	8	26	4	5	7	9	48
			crSIRS	3	5	13	46	199	3	7	32	131	558
		200	crCRS	4	5	6	7	8	4	5	6	7	8
			crSIRS	3	3	4	5	22	3	3	4	8	37
$G (0, 1)$	0.5	300	crCRS	5	6	8	19	121	5	6	13	60	473
			crSIRS	3	5	13	41	214	3	9	29	147	624
		400	crCRS	5	6	7	10	43	5	6	7	17	157
			crSIRS	3	3	6	18	88	3	5	12	64	405
	0.8	100	crCRS	4	5	7	9	23	4	5	7	10	71
			crSIRS	3	7	20	53	231	3	11	36	168	729
		200	crCRS	5	6	6	7	8	4	5	6	7	8
			crSIRS	3	3	4	6	20	3	3	4	9	70


Note: $G (0, 1)$ denotes the standard Gumbel distribution.

Example 3.4

Type-1 failure times are generated from the distribution with the CIF $F_{1} (t | x) = 1 - [1 - p_{0} {1 - \exp (- t)}]^{\exp (2 X_{1} X_{3} + 1.5 X_{2})}$ , where $p_{0} = 0.5$ . The CIF of type-2 failure is obtained by setting $Pr (J = 2 | x) = 1 - Pr (J = 1 | x)$ and $Pr (T \leq t | J = 2, x)$ to be the exponential distribution with rate $0.25 \exp (α^{T} x)$ and $α = (0, 0, 0, 0.5, 0.2, 0.3, 0, \dots, 0)^{T}$ .
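One way to draw failure times from this specification is inverse-transform sampling. The sketch below is an assumption about how such data could be generated, not a description of the authors' code: it takes $\Pr (J = 1 | x)$ to be the limit of $F_{1} (t | x)$ as $t \to \infty$, namely $1 - (1 - p_{0})^{\exp (2 X_{1} X_{3} + 1.5 X_{2})}$, and then inverts the conditional distribution of T given J = 1. The function name is hypothetical.

```python
import numpy as np

def simulate_example_3_4_failures(x, rng, p0=0.5):
    """Draw (T, J) given covariates x (n x p, p >= 6) under the Example 3.4 CIFs."""
    x = np.asarray(x, dtype=float)
    n, p = x.shape
    eta = 2.0 * x[:, 0] * x[:, 2] + 1.5 * x[:, 1]
    # Pr(J = 1 | x) = F_1(infinity | x) = 1 - (1 - p0)^{exp(eta)}
    p1 = 1.0 - (1.0 - p0) ** np.exp(eta)
    cause = np.where(rng.random(n) < p1, 1, 2)
    # Type 1: solve F_1(t | x) = u * p1 for t, with u uniform on (0, 1)
    u = rng.random(n)
    inner = (1.0 - (1.0 - u * p1) ** np.exp(-eta)) / p0
    # Numerical guard only: keeps the logarithm finite if rounding pushes inner to 1
    inner = np.clip(inner, 0.0, 1.0 - 1e-15)
    t1 = -np.log1p(-inner)
    # Type 2: exponential with rate 0.25 * exp(alpha' x)
    alpha = np.zeros(p)
    alpha[3:6] = [0.5, 0.2, 0.3]
    rate2 = 0.25 * np.exp(x @ alpha)
    t2 = rng.exponential(1.0 / rate2)
    return np.where(cause == 1, t1, t2), cause

rng = np.random.default_rng(1)
x_demo = rng.standard_normal((200, 8))
t_demo, cause_demo = simulate_example_3_4_failures(x_demo, rng)
```

Censoring would then be applied on top of these failure times exactly as in the other examples.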

Example 3.4 provides a case in which the predictors' effects on type-1 failure time are modeled directly through the CIF. Tables 9 and 10 provide the numerical results, from which we can see that our crCRS is slightly better than crSIRS in terms of the evaluated criteria.

Table 9.

The proportion of $P_{l}$ and $P_{a}$ in Example 3.4.

			p = 1000				p = 3000
ρ	n	Method	$P_{1}$	$P_{2}$	$P_{3}$	$P_{a}$	$P_{1}$	$P_{2}$	$P_{3}$	$P_{a}$
0.5	200	crCRS	0.74	1.00	0.87	0.68	0.62	1.00	0.79	0.55
		crSIRS	0.77	1.00	0.80	0.67	0.64	1.00	0.65	0.48
	300	crCRS	0.92	1.00	0.96	0.89	0.85	1.00	0.94	0.81
		crSIRS	0.95	1.00	0.93	0.90	0.88	1.00	0.88	0.80
0.8	100	crCRS	0.72	0.90	0.76	0.65	0.63	0.83	0.65	0.53
		crSIRS	0.69	0.96	0.69	0.58	0.58	0.87	0.58	0.48
	200	crCRS	0.98	1.00	0.99	0.97	0.94	0.99	0.95	0.91
		crSIRS	0.97	1.00	0.97	0.95	0.94	0.99	0.93	0.90


Note: $G (0, 1)$ denotes the standard Gumbel distribution.

Table 10.

The 5%, 25%, 50%, 75% and 95% quantiles of minimum model size S to include all the important covariates in Example 3.4.

			p = 1000					p = 3000
ρ	n	Method	5%	25%	50%	75%	95%	5%	25%	50%	75%	95%
0.5	200	crCRS	3	4	12	66	431	3	5	28	180	1319
		crSIRS	3	4	12	72	444	3	8	45	219	1502
	300	crCRS	3	3	4	14	109	3	3	6	32	287
		crSIRS	3	3	4	14	108	3	3	8	38	366
0.8	100	crCRS	3	4	8	52	352	3	4	18	157	1233
		crSIRS	3	4	13	61	366	3	5	28	198	1316
	200	crCRS	3	3	3	5	19	3	3	4	6	79
		crSIRS	3	3	3	5	37	3	3	3	7	126


Note: $G (0, 1)$ denotes the standard Gumbel distribution.

4. Real data analysis

In this section, we illustrate our developed crCRS, along with crSIRS, on a readily available bladder carcinoma data set [6], which was subsequently reanalyzed by Binder et al. [2], Ambrogi and Scheike [1] and others. This data set consists of 404 records of patients diagnosed with non-muscle-invasive bladder cancer. One of the purposes of this study is to validate a gene expression signature for progression or death from bladder cancer, which is the failure event of interest. However, patients could also die from other or unknown causes, i.e. competing risks events. The techniques for standard survival analysis are, therefore, no longer valid. In our analysis, the covariates include 1381 microarray features and 5 potentially influential clinical predictors (Age, Sex, BCG/MMC treatment, Reevaluated WHO grade and Reevaluated pathological disease stage). Since the clinical covariates are only collected for 301 observations, our subsequent analysis is confined to this subset. Among these patients, 84 experienced progression or death from bladder cancer, 33 died from other or unknown reasons, and the rest were right-censored.

Setting the model size to be $[301 / \log (301)] = 52$, we first apply crCRS and crSIRS separately to pick the top 52 predictors among the 1386 candidates. It is noted that both procedures select only grade as an important predictor among the five clinical covariates. The IDs of identified genes are listed in Table 11. It can be observed from this table that a total of 79 unique genes are discovered and 22 genes are recognized by both methods.

Table 11.

The Gene IDs for the selected genes by crCRS and crSIRS. (The Gene IDs are ranked by order of screening result.)

Method	Gene IDs
crSIRS	SEQ820	SEQ34	SEQ973	SEQ972	SEQ266	SEQ370	SEQ440	SEQ709
	SEQ377	SEQ337	SEQ265	SEQ833	SEQ368	SEQ1226	SEQ493	SEQ1142
	SEQ246	SEQ287	SEQ782	SEQ78	SEQ765	SEQ1262	SEQ1253	SEQ1157
	SEQ884	SEQ152	SEQ1327	SEQ1381	SEQ882	SEQ264	SEQ766	SEQ847
	SEQ1038	SEQ62	SEQ247	SEQ974	SEQ681	SEQ369	SEQ767	SEQ339
	SEQ982	SEQ1165	SEQ768	SEQ807	SEQ785	SEQ162	SEQ1227	SEQ1225
	SEQ1302	SEQ151	SEQ336
crCRS	SEQ973	SEQ266	SEQ972	SEQ265	SEQ264	SEQ782	SEQ974	SEQ828
	SEQ213	SEQ982	SEQ820	SEQ279	SEQ648	SEQ1369	SEQ21	SEQ794
	SEQ882	SEQ790	SEQ73	SEQ1339	SEQ1015	SEQ1211	SEQ1172	SEQ833
	SEQ1192	SEQ71	SEQ1105	SEQ848	SEQ369	SEQ377	SEQ1253	SEQ211
	SEQ336	SEQ1302	SEQ927	SEQ493	SEQ1327	SEQ1164	SEQ370	SEQ410
	SEQ157	SEQ1380	SEQ1193	SEQ1031	SEQ425	SEQ152	SEQ34	SEQ62
	SEQ246	SEQ361	SEQ789


To further reduce the dimensionality and build a more parsimonious model for easier interpretation, we then fit the popular proportional subdistribution hazards model [9] to the data with the selected covariates via regularization techniques [10]. To be specific, the adaptive LASSO, SCAD and MCP are implemented following Fu et al. [10], with tuning parameters selected by GCV. The clinical variable grade is retained once more by the adaptive LASSO after crCRS and crSIRS, with coefficient estimates (standard errors) of −0.06 (0.04) and −0.11 (0.07), respectively. The IDs of the finally selected genes and the corresponding coefficient estimates (standard errors) are presented in Table 12. We can see that the penalized approaches produce similar patterns for the two groups of selected genes. For example, MCP leads to the sparsest model, while SCAD results in the most complex model. In addition, SEQ820 and SEQ279 are identified by all three penalties for crCRS, and by two approaches for crSIRS. This finding suggests that they are very likely to be truly predictive for the time to progression or death from bladder cancer.

Table 12.

The summary statistics for regularized variable selections of ALASSO, SCAD and MCP based on the selected genes of crCRS and crSIRS.

	ALASSO		SCAD		MCP
	Gene ID	Est. (Se.)	Gene ID	Est. (Se.)	Gene ID	Est. (Se.)
crSIRS	SEQ377	0.59 (0.16)	SEQ820	0.14 (0.06)	SEQ820	0.53 (0.24)
	SEQ833	0.22 (0.11)	SEQ34	0.80 (0.33)	SEQ34	0.83 (0.35)
	SEQ287	0.42 (0.13)	SEQ709	0.06 (0.01)	SEQ287	0.12 (0.05)
	SEQ1262	−0.92 (0.11)	SEQ833	0.17 (0.07)	SEQ1262	−0.31 (0.17)
	SEQ1253	−0.09 (0.04)	SEQ246	−0.04 (0.01)	SEQ681	−0.09 (0.03)
	SEQ847	−0.15 (0.08)	SEQ287	0.22 (0.09)
	SEQ1227	−0.13 (0.05)	SEQ1262	−0.37 (0.17)
	SEQ1225	−0.05 (0.02)	SEQ884	−0.02 (0.01)
			SEQ1381	0.07 (0.02)
			SEQ847	−0.002 (<0.001)
			SEQ681	−0.09 (0.03)
			SEQ162	−0.04 (0.01)
crCRS	SEQ265	0.27 (0.11)	SEQ973	0.04 (0.01)	SEQ820	0.58 (0.21)
	SEQ820	0.23 (0.11)	SEQ972	0.01 (>0.00)	SEQ279	−0.18 (0.08)
	SEQ279	−0.58 (0.10)	SEQ820	0.24 (0.09)	SEQ34	0.79 (0.35)
	SEQ377	0.67 (0.13)	SEQ279	−0.12 (0.02)	SEQ246	−0.03 (0.01)
	SEQ1031	0.11 (0.06)	SEQ833	0.11 (0.02)
			SEQ377	0.13 (0.03)
			SEQ1327	0.04 (0.01)
			SEQ370	0.02 (>0.00)
			SEQ34	0.34 (0.12)
			SEQ246	−0.07 (0.02)


Note: Est. and Se. stand for estimates of regression coefficients and standard error, respectively.
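
As a quick cross-check of the overlap discussed above, the gene IDs in Table 12 can be tallied by how many of the three penalties retain them. The following Python sketch only transcribes the gene IDs from Table 12; the dictionary layout and the helper name `selection_counts` are illustrative choices and are not part of the original analysis.

```python
from collections import Counter

# Gene IDs transcribed from Table 12 (coefficients omitted).
selected = {
    "crSIRS": {
        "ALASSO": {"SEQ377", "SEQ833", "SEQ287", "SEQ1262", "SEQ1253",
                   "SEQ847", "SEQ1227", "SEQ1225"},
        "SCAD": {"SEQ820", "SEQ34", "SEQ709", "SEQ833", "SEQ246", "SEQ287",
                 "SEQ1262", "SEQ884", "SEQ1381", "SEQ847", "SEQ681", "SEQ162"},
        "MCP": {"SEQ820", "SEQ34", "SEQ287", "SEQ1262", "SEQ681"},
    },
    "crCRS": {
        "ALASSO": {"SEQ265", "SEQ820", "SEQ279", "SEQ377", "SEQ1031"},
        "SCAD": {"SEQ973", "SEQ972", "SEQ820", "SEQ279", "SEQ833", "SEQ377",
                 "SEQ1327", "SEQ370", "SEQ34", "SEQ246"},
        "MCP": {"SEQ820", "SEQ279", "SEQ34", "SEQ246"},
    },
}


def selection_counts(by_penalty):
    """Count, for each gene, how many penalties retained it."""
    counts = Counter()
    for genes in by_penalty.values():
        counts.update(genes)
    return counts


for method, by_penalty in selected.items():
    counts = selection_counts(by_penalty)
    in_all = sorted(g for g, c in counts.items() if c == len(by_penalty))
    print(method, "retained by all three penalties:", in_all)
```

Genes retained under every penalty are the ones least sensitive to the choice of regularizer, which is the sense in which the text singles out individual genes.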

5. Conclusion

In this work, we have proposed another model-free marginal feature screening approach for ultra-high dimensional competing risks data, motivated by the correlation rank screening for standard survival data of Zhang et al. [20]. The new method enjoys several desirable properties, as described in Section 1. Nevertheless, crCRS can only identify predictors that affect the cumulative incidence function of the event time of interest. It is well known that the effect of a covariate on the cause-specific hazard function can differ from its effect on the cumulative incidence function [15]. Features that are informative for the time to the failure type of interest through the cause-specific hazard function may therefore be missed by crCRS, and also by crSIRS. In addition, the bladder cancer study in Section 4 is in fact a multi-center study, in which the data were collected in different countries. This information should be taken into account during the feature screening process. Such problems deserve further investigation.

Acknowledgments

Xiaolin Chen's research is supported by the Natural Science Foundation of Shandong Province of China (ZR2020MA023) and the National Natural Science Foundation of China (11771250). Tao Zhang's research is supported by the National Natural Science Foundation of China (11861014) and Natural Science Foundation of Guangxi (2018GSNSFAA281145).

Appendix 1. Lemma and proofs of the theorems.

For convenience of presentation, we introduce two additional pieces of notation. Define ${\tilde{F}}_{1} (y) = n^{- 1} \sum_{i = 1}^{n} δ_{i} I (Y_{i} \leq y, J_{i} = 1) / \{1 - G (Y_{i})\}$ and ${\tilde{ω}}_{l} = \{n^{- 1} \sum_{i = 1}^{n} X_{i l} F_{1} (Y_{i})\}^{2}$. We first establish a useful exponential inequality for the absolute difference between the estimated and the population cumulative incidence functions of type-1 failure.

Lemma A.1

For any $ε \in (0, 1)$ , under Assumption (A.1), we have

$\Pr (\sup_{0 < y < τ} | {\tilde{F}}_{1} (y) - F_{1} (y) | \geq ε) \leq 2 (n + 1) \exp (- 2 n ε^{2} η^{2}) .$

Proof.

Under Assumption (A.1), it is noted that

$0 \leq \frac{I (Y_{i} \leq y, J_{i} = 1) δ_{i}}{1 - G (Y_{i})} \leq \frac{1}{η} .$

The desired result then follows from Hoeffding's inequality and modern empirical process theory [17].
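
To get a feel for how quickly the bound in Lemma A.1 becomes informative, its right-hand side can be evaluated numerically. The Python sketch below is only an illustration: the function name `lemma_a1_bound` and the values of ε and η are hypothetical, with η standing for the constant from Assumption (A.1) that bounds $1 - G (Y_{i})$ from below in the proof.

```python
import math


def lemma_a1_bound(n, eps, eta):
    """Right-hand side of Lemma A.1: 2 (n + 1) exp(-2 n eps^2 eta^2)."""
    if not 0.0 < eps < 1.0:
        raise ValueError("Lemma A.1 is stated for eps in (0, 1)")
    return 2.0 * (n + 1) * math.exp(-2.0 * n * eps**2 * eta**2)


# Illustrative values only; eps and eta are not taken from the paper.
eps, eta = 0.1, 0.5
for n in (100, 1000, 2000, 5000):
    print(n, lemma_a1_bound(n, eps, eta))
```

Because of the factor $2 (n + 1)$, the bound can exceed 1 for moderate n and only becomes a useful probability bound once the exponential term dominates.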

Let $(Ω, \mathcal{F}, P)$ be the probability space underlying all the random variables in the paper, where Ω is the sample space, $\mathcal{F}$ is the σ-algebra and P is the probability measure. According to Lemma A.1 in [22], under Assumption (A.1), it can be obtained that

$\begin{aligned} \sup_{0 < y \leq τ} | {\hat{F}}_{1} (y) - {\tilde{F}}_{1} (y) | \\ = \sup_{0 < y \leq τ} | n^{- 1} \sum_{i = 1}^{n} δ_{i} I (Y_{i} \leq y, J_{i} = 1) [\frac{1}{1 - \hat{G} (Y_{i})} - \frac{1}{1 - G (Y_{i})}] | \\ \leq \sup_{0 < y \leq τ} | \frac{1}{1 - \hat{G} (y)} - \frac{1}{1 - G (y)} | \\ = O \{(\frac{\log (n)}{n})^{1 / 2}\} \end{aligned}$ (A1)

almost surely. Denote by $Ω_{0}$ a subset of Ω with $\Pr (Ω_{0}) = 1$ such that $\sup_{0 < y \leq τ} | {\hat{F}}_{1} (y) - {\tilde{F}}_{1} (y) | = O \{(\frac{\log (n)}{n})^{1 / 2}\}$ on it. Under Assumption (A.2), by the strong law of large numbers, there exists a positive constant $c_{2}$ such that

$\frac{1}{n} \sum_{i = 1}^{n} X_{i l}^{2} \leq c_{2}$ (A2)

holds almost surely for sufficiently large n. That is, there exists $Ω_{l} \subseteq Ω$ with $\Pr (Ω_{l}) = 1$ such that (A2) holds on $Ω_{l}$ for sufficiently large n. Let $Ω_{*} = \bigcap_{l = 0}^{p} Ω_{l}$; then it holds that $\Pr (Ω_{*}) = 1$. In the subsequent proofs, it is implicitly understood that the events considered are in fact intersections with $Ω_{*}$.

Proof of Theorem 2.1.

By the Cauchy–Schwarz inequality and Equation (A2), it is easy to obtain that

$| \frac{1}{n} \sum_{i = 1}^{n} X_{i l} F_{1} (Y_{i}) | \leq c_{2}^{1 / 2}$ (A3)

and

$| \frac{1}{n} \sum_{i = 1}^{n} X_{i l} {\hat{F}}_{1} (Y_{i}) | \leq c_{2}^{1 / 2} .$ (A4)

Thus,

$\begin{aligned} | {\hat{ω}}_{l} - {\tilde{ω}}_{l} | \\ = | [\frac{1}{n} \sum_{i = 1}^{n} X_{i l} {\hat{F}}_{1} (Y_{i}) + \frac{1}{n} \sum_{i = 1}^{n} X_{i l} F_{1} (Y_{i})] [\frac{1}{n} \sum_{i = 1}^{n} X_{i l} \{{\hat{F}}_{1} (Y_{i}) - F_{1} (Y_{i})\}] | \\ \leq 2 c_{2}^{1 / 2} | \frac{1}{n} \sum_{i = 1}^{n} X_{i l} \{{\hat{F}}_{1} (Y_{i}) - F_{1} (Y_{i})\} | \\ \leq 2 c_{2} [\frac{1}{n} \sum_{i = 1}^{n} \{{\hat{F}}_{1} (Y_{i}) - F_{1} (Y_{i})\}^{2}]^{1 / 2} \\ \leq 2 c_{2} \sup_{0 < y \leq τ} | {\hat{F}}_{1} (y) - F_{1} (y) | \\ \leq 2 c_{2} \sup_{0 < y \leq τ} | {\hat{F}}_{1} (y) - {\tilde{F}}_{1} (y) | + 2 c_{2} \sup_{0 < y \leq τ} | {\tilde{F}}_{1} (y) - F_{1} (y) |, \end{aligned}$ (A5)

where the second inequality follows from the Cauchy–Schwarz inequality. Then for sufficiently large n, we have

$\begin{aligned} \Pr (| {\hat{ω}}_{l} - {\tilde{ω}}_{l} | \geq \frac{1}{2} c_{1} n^{- κ}) \\ \leq \Pr (2 c_{2} \sup_{0 < y \leq τ} | {\hat{F}}_{1} (y) - {\tilde{F}}_{1} (y) | \geq \frac{1}{4} c_{1} n^{- κ}) + \Pr (2 c_{2} \sup_{0 < y \leq τ} | {\tilde{F}}_{1} (y) - F_{1} (y) | \geq \frac{1}{4} c_{1} n^{- κ}) \\ = \Pr (2 c_{2} \sup_{0 < y \leq τ} | {\tilde{F}}_{1} (y) - F_{1} (y) | \geq \frac{1}{4} c_{1} n^{- κ}) \\ \leq 2 (n + 1) \exp \{- 2 n (\frac{1}{8 c_{2}} c_{1} n^{- κ})^{2} η^{2}\} \\ = 2 (n + 1) \exp (- 2 c_{3} n^{1 - 2 κ}), \end{aligned}$ (A6)

where the first equality comes from Equation (A1), the second inequality is based on Lemma A.1 and $c_{3} = 64^{- 1} c_{2}^{- 2} c_{1}^{2} η^{2}$ .

By the Cauchy–Schwarz inequality and Equation (A2), under Assumption (A.2), it is easy to show that

$\begin{aligned} \Pr (| {\tilde{ω}}_{l} - ω_{l} | \geq \frac{1}{2} c_{1} n^{- κ}) \\ \leq \Pr (c_{1}^{1 / 2} M^{1 / 2} | \frac{1}{n} \sum_{i = 1}^{n} X_{i l} F_{1} (Y_{i}) - E \{X_{i l} F_{1} (Y_{i})\} | \geq \frac{1}{2} c_{1} n^{- κ}) . \end{aligned}$ (A7)

Then, by a derivation similar to that in Zhang et al. [20], there exist positive constants $c_{4}$ and $c_{5}$ such that

$\Pr (| {\tilde{ω}}_{l} - ω_{l} | \geq \frac{1}{2} c_{1} n^{- κ}) \leq \exp \{- (\frac{n^{1 - 2 κ}}{\log \log (n)})^{1 / 2} c_{4} + c_{5}\} .$ (A8)

Note that

$\begin{aligned} \Pr (| {\hat{ω}}_{l} - ω_{l} | \geq c_{1} n^{- κ}) \\ \leq \Pr (| {\hat{ω}}_{l} - {\tilde{ω}}_{l} | \geq \frac{1}{2} c_{1} n^{- κ}) + \Pr (| {\tilde{ω}}_{l} - ω_{l} | \geq \frac{1}{2} c_{1} n^{- κ}) . \end{aligned}$ (A9)

Combining Equations (A6)–(A9), we have

$\begin{aligned} \Pr (| {\hat{ω}}_{l} - ω_{l} | \geq c_{1} n^{- κ}) \\ \leq 2 \exp (- 2 c_{3} n^{1 - 2 κ} + \log (n + 1)) + \exp \{- (\frac{n^{1 - 2 κ}}{\log \log (n)})^{1 / 2} c_{4} + c_{5}\} \\ = 2 \exp \{- (\frac{n^{1 - 2 κ}}{\log \log (n)})^{1 / 2} [2 c_{3} \{\log \log (n)\}^{1 / 2} n^{1 / 2 - κ} - \{\frac{\log \log (n)}{n^{1 - 2 κ}}\}^{1 / 2} \log (n + 1)]\} \\ + \exp \{- (\frac{n^{1 - 2 κ}}{\log \log (n)})^{1 / 2} c_{4} + c_{5}\} \\ = O (\exp \{- (\frac{n^{1 - 2 κ}}{\log \log (n)})^{1 / 2} d_{1}\}), \end{aligned}$ (A10)

where $d_{1} = c_{4}$ . Finally, we arrive at

$\begin{aligned} \Pr (\max_{1 \leq l \leq p} | {\hat{ω}}_{l} - ω_{l} | \geq c_{1} n^{- κ}) \\ \leq \sum_{l = 1}^{p} \Pr (| {\hat{ω}}_{l} - ω_{l} | \geq c_{1} n^{- κ}) \\ = O (p \exp \{- (\frac{n^{1 - 2 κ}}{\log \log (n)})^{1 / 2} d_{1}\}) . \end{aligned}$ (A11)

This completes the proof of the first part of Theorem 2.1.

Now we turn to the second part:

$\begin{aligned} \Pr (A \subset \hat{A}) \\ = \Pr (\min_{l \in A} {\hat{ω}}_{l} \geq c_{1} n^{- κ}) \\ \geq \Pr (\min_{l \in A} ω_{l} - \min_{l \in A} {\hat{ω}}_{l} \leq c_{1} n^{- κ}) \\ \geq \Pr (\max_{l \in A} | ω_{l} - {\hat{ω}}_{l} | \leq c_{1} n^{- κ}) \\ \geq 1 - O (s \exp \{- (\frac{n^{1 - 2 κ}}{\log \log (n)})^{1 / 2} d_{1}\}), \end{aligned}$ (A12)

where the first inequality follows from the fact that $\min_{l \in A} ω_{l} \geq 2 c_{1} n^{- κ}$, and the last inequality is obtained similarly to Equation (A11). This establishes the second result of Theorem 2.1.
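
The quantities controlled in this proof, ${\hat{F}}_{1}$ and ${\hat{ω}}_{l}$, can be computed directly from data. The following Python sketch is one possible reading of the formulas in (A1) and (A5), under several assumptions that are not taken from the paper: $\hat{G}$ is obtained from a Kaplan–Meier estimator that treats censoring as the event, there are no tied observation times, the covariates are standardized column-wise, and the top d covariates are kept instead of applying the threshold $c_{1} n^{- κ}$. All function names and the simulated data are hypothetical.

```python
import numpy as np


def km_censoring_survival(Y, delta):
    """Kaplan-Meier estimate of 1 - G at each Y_i, with censoring
    (delta == 0) treated as the event. Assumes no tied times."""
    n = len(Y)
    order = np.argsort(Y)
    censored = 1 - delta[order]          # 1 where the sorted observation is censored
    at_risk = n - np.arange(n)           # risk-set size at each sorted time
    surv_sorted = np.cumprod(1.0 - censored / at_risk)
    surv = np.empty(n)
    surv[order] = surv_sorted
    return surv


def cif_at_sample(Y, delta, J):
    """Inverse-probability-of-censoring-weighted estimate of F_1 at each Y_i."""
    n = len(Y)
    surv_c = km_censoring_survival(Y, delta)
    type1 = (delta == 1) & (J == 1)
    weights = np.zeros(n)
    weights[type1] = 1.0 / surv_c[type1]  # only observed type-1 failures contribute
    order = np.argsort(Y)
    F1 = np.empty(n)
    F1[order] = np.cumsum(weights[order]) / n
    return F1


def screening_utilities(X, Y, delta, J):
    """omega_hat_l = {n^-1 sum_i X_il F1_hat(Y_i)}^2 for each column l."""
    F1 = cif_at_sample(Y, delta, J)
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)  # standardization is an assumption
    return (Xs.T @ F1 / len(Y)) ** 2


# Simulated illustration: only covariate 0 drives the type-1 failure rate.
rng = np.random.default_rng(0)
n, p, d = 500, 20, 5
X = rng.standard_normal((n, p))
rate1, rate2 = np.exp(1.5 * X[:, 0]), 0.5
T = rng.exponential(1.0 / (rate1 + rate2))
cause = np.where(rng.random(n) < rate1 / (rate1 + rate2), 1, 2)
C = rng.exponential(1.0 / 0.3, size=n)
Y = np.minimum(T, C)
delta = (T <= C).astype(int)
J = np.where(delta == 1, cause, 0)

omega_hat = screening_utilities(X, Y, delta, J)
kept = np.argsort(omega_hat)[::-1][:d]   # indices of the d largest utilities
print("kept covariates:", kept)
```

Keeping the d largest utilities is a common practical substitute for the theoretical threshold, since $c_{1}$ and κ are unknown in applications.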

Funding Statement

This work was supported by the Natural Science Foundation of Shandong Province of China (ZR2020MA023), the National Natural Science Foundation of China [11771250,11861014] and the Natural Science Foundation of Guangxi Province [2018GSNSFAA281145].

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

1. Ambrogi F. and Scheike T., Penalized estimation for competing risks regression with applications to high-dimensional covariates, Biostatistics 17 (2016), pp. 708–721.
2. Binder H., Allignol A., Schumacher M., and Beyersmann J., Boosting for high-dimensional time-to-event data with competing risk, Bioinformatics 25 (2009), pp. 890–896.
3. Chen X., Chen X., and Wang H., Robust feature screening for ultra-high dimensional right censored data via distance correlation, Comput. Stat. Data Anal. 119 (2018), pp. 118–138.
4. Chen X., Zhang Y., Chen X., and Liu Y., A simple model-free survival conditional feature screening, Stat. Probab. Lett. 146 (2019), pp. 156–160.
5. Chen X., Zhang Y., Liu Y., and Chen X., Model-free feature screening for ultra-high dimensional competing risks data, Stat. Probab. Lett. 164 (2020), pp. 108815.
6. Dyrskjøt L., Zieger K., Real F.X., Malats N., Carrato A., Hurst C., Kotwal S., Knowles M., Malmström P.-U., de la Torre M., Wester K., Allory Y., Vordos D., Caillault A., Radvanyi F., Hein A.-M.K., Jensen J.L., Jensen K.M.E., Marcussen N., and Orntoft T.F., Gene expression signatures predict outcome in non-muscle invasive bladder carcinoma: A multicenter validation study, Clin. Cancer Res. 13 (2007), pp. 3545–3551.
7. Fan J. and Lv J., Sure independence screening for ultrahigh dimensional feature space, J. R. Stat. Soc. Ser. B 70 (2008), pp. 849–911.
8. Fan J., Samworth R., and Wu Y., Ultrahigh dimensional feature selection: Beyond the linear model, J. Mach. Learn. Res. 10 (2009), pp. 2013–2038.
9. Fine J. and Gray P., A proportional hazards model for the subdistribution of a competing risk, J. Am. Stat. Assoc. 94 (1999), pp. 496–509.
10. Fu Z., Parikh C., and Zhou B., Penalized variable selection in competing risks regression, Lifetime Data Anal. 23 (2017), pp. 353–376.
11. Gorst-Rasmussen A. and Scheike T., Independent screening for single-index hazard rate models with ultrahigh dimensional features, J. R. Stat. Soc. Ser. B 72 (2013), pp. 217–245.
12. Ha I., Lee M., Oh S., Jeong J., Sylvester R., and Lee Y., Variable selection in subdistribution hazard frailty models with competing risks data, Stat. Med. 33 (2014), pp. 4590–4604.
13. He X., Wang L., and Hong H., Quantile-adaptive model-free variable screening for high-dimensional heterogeneous data, Ann. Stat. 41 (2013), pp. 342–369.
14. Li E., Mei B., and Tian M., Feature screening based on ultrahigh dimensional competing risks models (in Chinese), Sci. Sinica Math. 48 (2018), pp. 1061–1086.
15. Li G. and Yang Q., Joint inference for competing risks survival data, J. Am. Stat. Assoc. 111 (2016), pp. 1289–1300.
16. Liu J., Zhong W., and Li R., A selective overview of feature screening for ultrahigh-dimensional data, Sci. China Math. 58 (2015), pp. 2033–2054.
17. Pollard D., Convergence of Stochastic Processes, Springer-Verlag, New York, 1984.
18. Putter H., Fiocco M., and Geskus R., Tutorial in biostatistics: competing risks and multi-state models, Stat. Med. 26 (2007), pp. 2389–2430.
19. Xu C. and Chen J., The sparse MLE for ultrahigh-dimensional feature screening, J. Am. Stat. Assoc. 109 (2014), pp. 1257–1269.
20. Zhang J., Liu Y., and Wu Y., Correlation rank screening for ultrahigh-dimensional survival data, Comput. Stat. Data Anal. 108 (2017), pp. 121–132.
21. Zhao S. and Li Y., Principled sure independence screening for Cox models with ultra-high-dimensional covariates, J. Multivariate Anal. 105 (2012), pp. 397–411.
22. Zhou T. and Zhu L., Model-free features screening for ultrahigh dimensional censored regression, Stat. Comput. 27 (2017), pp. 947–961.
23. Zhu L., Li L., Li R., and Zhu L., Model-free feature screening for ultrahigh-dimensional data, J. Am. Stat. Assoc. 106 (2011), pp. 1464–1475.

