Random survival forests for competing risks

Hemant Ishwaran; Thomas A Gerds; Udaya B Kogalur; Richard D Moore; Stephen J Gange; Bryan M Lau

Biostatistics. 2014 Apr 11;15(4):757–773. doi: 10.1093/biostatistics/kxu010

PMCID: PMC4173102 PMID: 24728979

Abstract

We introduce a new approach to competing risks using random forests. Our method is fully non-parametric and can be used for selecting event-specific variables and for estimating the cumulative incidence function. We show that the method is highly effective for both prediction and variable selection in high-dimensional problems and in settings such as HIV/AIDS that involve many competing risks.

Keywords: AIDS, Brier score, Competing risks, C-index, Cumulative incidence function, Ensemble

1. Introduction

Individuals subject to competing risks are observed from study entry until the event of interest occurs, a competing event occurs, or the individual is right censored before experiencing any of the events. Formally, let $T_i^o$ be the event time for the $i$th subject, $i = 1, \ldots, n$, and let $\delta_i^o$ be the event type, $\delta_i^o \in \{1, \ldots, J\}$, where $J \ge 1$. Let $C_i^o$ denote the censoring time for individual $i$ such that the actual time of event $T_i^o$ is unobserved and one only observes $T_i = \min(T_i^o, C_i^o)$ and the event indicator $\delta_i = \delta_i^o I(T_i^o \le C_i^o)$. When $\delta_i = 0$, the individual is said to be censored at $T_i$; otherwise, if $\delta_i = j > 0$, the individual is said to have an event of type $j$ at time $T_i$. The observed data are $(T_i, \delta_i, \mathbf{x}_i)_{1 \le i \le n}$, where $\mathbf{x}_i$ is a $p$-dimensional vector of covariates.

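We are interested in predicting events and in the discovery of risk factors. For the latter, we shall distinguish between risk factors for the cause-specific hazard and risk factors for the cumulative incidence. The cause-specific hazard function for event $j$ given covariates $\mathbf{x}$ is

$$\alpha_j(t \mid \mathbf{x}) = \lim_{\Delta t \to 0} \frac{P\{t \le T^o \le t + \Delta t,\ \delta^o = j \mid T^o \ge t, \mathbf{x}\}}{\Delta t} = \lim_{\Delta t \to 0} \frac{P\{t \le T^o \le t + \Delta t,\ \delta^o = j \mid \mathbf{x}\}}{\Delta t \, S(t \mid \mathbf{x})}.$$

Here, $S(t \mid \mathbf{x}) = P\{T^o \ge t \mid \mathbf{x}\}$ is the event-free survival probability function given $\mathbf{x}$. The cause-specific hazard function describes the instantaneous risk of event $j$ for subjects that currently are event-free. Factors found to change the instantaneous event risk are associated with the biological mechanism behind event $j$. On the other hand, the probability that an event occurs in a specific time period, say $[0, t]$, depends on the cause-specific hazards of the other events (Gray, 1988). The probability of an event is determined using the cumulative incidence function (CIF), defined as the probability of experiencing an event of type $j$ by time $t$; i.e. $F_j(t \mid \mathbf{x}) = P\{T^o \le t,\ \delta^o = j \mid \mathbf{x}\}$. The CIF and cause-specific hazard function are related according to

$$F_j(t \mid \mathbf{x}) = \int_0^t S(s- \mid \mathbf{x})\, \alpha_j(s \mid \mathbf{x})\, ds, \qquad S(t \mid \mathbf{x}) = \exp\Big(-\sum_{l=1}^{J} \int_0^t \alpha_l(s \mid \mathbf{x})\, ds\Big). \tag{1.1}$$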

Informally speaking, event $j$ can only occur for those surviving other risks. A covariate that reduces the cause-specific hazard of a competing risk increases the event-free survival probability and thereby indirectly increases the cumulative incidence of event $j$. Thus, covariates found to change the $t$-year risk of event $j$ (i.e. the cumulative incidence) are those that change the cause-specific hazard function of event $j$ and those that change the cause-specific hazard functions of the competing risks.
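The indirect effect described above can be checked numerically. The following is a minimal sketch that assumes constant (exponential) cause-specific hazards for two causes, an assumption made purely for illustration; under it, relation (1.1) has the closed form $F_j(t) = \frac{\alpha_j}{\alpha_1 + \alpha_2}\{1 - e^{-(\alpha_1 + \alpha_2)t}\}$. The function names and hazard values are hypothetical.

```python
import math


def cif_closed_form(t, alpha_j, alpha_other):
    """CIF of cause j at time t when both cause-specific hazards are constant."""
    total = alpha_j + alpha_other
    return alpha_j / total * (1.0 - math.exp(-total * t))


def cif_numeric(t, alpha_j, alpha_other, steps=20000):
    """Midpoint-rule evaluation of (1.1): integral of S(s) * alpha_j over [0, t]."""
    h = t / steps
    total = alpha_j + alpha_other
    acc = 0.0
    for k in range(steps):
        s = (k + 0.5) * h
        acc += math.exp(-total * s) * alpha_j * h  # S(s) * alpha_j * ds
    return acc


# Same hazard for the event of interest, two different competing hazards.
high_competing = cif_closed_form(5.0, alpha_j=0.1, alpha_other=0.2)
low_competing = cif_closed_form(5.0, alpha_j=0.1, alpha_other=0.05)
print(high_competing, low_competing)
```

Lowering the competing hazard leaves the cause-specific hazard of the event of interest unchanged yet raises its cumulative incidence, which is the distinction between the two kinds of risk factor drawn in the text.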

When the aim is to assist decision-making and patient counseling, we are interested in $t$-year predictions and in finding covariates that affect the cumulative incidence. On the other hand, to understand and discuss treatment options for the biological mechanism that drives the risk of a specific event, we focus on the cause-specific hazard function.

In this paper, we propose a new approach to competing risks that builds on the framework of random survival forests (RSF) (Ishwaran and others, 2008), an extension of Breiman's random forests (Breiman, 2001) to right-censored survival settings. Our novel approach benefits from the many useful properties of forests and has the following important features: (a) it directly estimates the CIF; (b) it provides accurate prediction performance; (c) it models non-linear effects and interactions; (d) it can be used for event-specific selection of risk factors; (e) it can be used effectively in high-dimensional settings; and (f) it is free of model assumptions.

Section 2 describes the main parameters which we estimate by using ensembles. Section 3 describes the competing risks forest algorithm, introduces terminal node estimators used for constructing ensembles, and describes splitting rules for growing competing risk trees suitable for either cause-specific hazard or CIF inference. The prediction error for the proposed ensemble estimators and variable selection are discussed in Sections 4 and 5. Section 6 studies the performance of our method using synthetic data. In Section A of supplementary material available at Biostatistics online (http://www.biostatistics.oxfordjournals.org), we consider performance over a collection of well-known data sets. Section 7 utilizes RSF to identify event-specific variables using the Johns Hopkins HIV Clinical Cohort, a large database involving over 6000 HIV patients.

2. Parameters of interest

2.1. Expected number of life years lost and cause-$j$ mortality

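In addition to estimating the CIF, we propose a one-dimensional summary of the cumulative incidence referred to as the expected number of life years lost due to cause $j$ (Andersen, 2012). In right-censored data, it is not feasible to get a reliable estimate of the expected lifetime. Therefore, for a fixed time point $\tau$ we consider the restricted mean lifetime conditional on $\mathbf{x}$: $E[\min(T^o, \tau) \mid \mathbf{x}] = \int_0^\tau S(t \mid \mathbf{x})\, dt$. The truncation time point is chosen such that the probability of being uncensored at $\tau$ is bounded away from zero; i.e. $P\{C^o > \tau \mid \mathbf{x}\} > \epsilon$ for some $\epsilon > 0$. In practice, we will typically set $\tau$ in accordance with the observed follow-up period (see Section 3). We extend the notation of Andersen (2012) to the case with covariates and note the relation $1 - S(t \mid \mathbf{x}) = \sum_{j=1}^{J} F_j(t \mid \mathbf{x})$, which holds for all values $\mathbf{x}$ and all $t$. The expected number of years lost before time $\tau$ is

$$\tau - \int_0^\tau S(t \mid \mathbf{x})\, dt = \int_0^\tau \{1 - S(t \mid \mathbf{x})\}\, dt = \sum_{j=1}^{J} \int_0^\tau F_j(t \mid \mathbf{x})\, dt.$$

Our summary value is $M_j(\tau \mid \mathbf{x}) = \int_0^\tau F_j(t \mid \mathbf{x})\, dt$, which the above shows equals the expected number of life years lost due to cause $j$ before time $\tau$. We shall also call $M_j(\tau \mid \mathbf{x})$ the cause-$j$ mortality.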

2.2. Terminal node estimators

We describe non-parametric estimators of the event-free survival function, the cause-specific CIF, and mortality. The estimators are described here using the entire learning data set, but in implementation they are calculated within the terminal node of a RSF tree and then aggregated to form the ensemble (see Section 3.1).

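Let $t_1 < t_2 < \cdots < t_m$ denote the distinct and ordered event times from $(T_i)_{1 \le i \le n}$. Let $d_j(t_k) = \sum_{i=1}^{n} I(T_i = t_k, \delta_i = j)$ be the number of type $j$ events at $t_k$, and $N_j(t) = \sum_{i=1}^{n} I(T_i \le t, \delta_i = j)$ be the number of type $j$ events in $[0, t]$. Define also $d(t_k) = \sum_j d_j(t_k)$, the total number of events occurring at time $t_k$, $N(t) = \sum_j N_j(t)$, the total number of events occurring in $[0, t]$, and $Y(t) = \sum_{i=1}^{n} I(T_i \ge t)$, the number of individuals at risk (event-free and uncensored) just prior to $t$. The Nelson–Aalen estimator for the cumulative event-specific hazard function is given by

$$\hat H_j(t) = \int_0^t \frac{N_j(ds)}{Y(s)} = \sum_{k=1}^{m(t)} \frac{d_j(t_k)}{Y(t_k)},$$

where $m(t) = \max\{k : t_k \le t\}$. The Kaplan–Meier estimator for the event-free survival function is given by

$$\hat S(t) = \prod_{k=1}^{m(t)} \left(1 - \frac{d(t_k)}{Y(t_k)}\right).$$

We use the Aalen–Johansen estimator (Aalen and Johansen, 1978) to estimate $F_j(t)$:

$$\hat F_j(t) = \int_0^t \hat S(s-)\, \hat H_j(ds) = \sum_{k=1}^{m(t)} \hat S(t_{k-1})\, \frac{d_j(t_k)}{Y(t_k)}, \qquad \hat S(t_0) = 1.$$

The cause-$j$ mortality is estimated by $\hat M_j(\tau) = \int_0^\tau \hat F_j(t)\, dt$. We set $\tau$ to be the largest observed time.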
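The three estimators can be computed in a single pass over the ordered event times. The sketch below is a plain-Python illustration rather than the implementation used in the paper; the function names and the toy data are hypothetical, event type 0 is taken to mean censored, and no attempt is made at efficiency.

```python
def competing_risk_estimates(times, events, cause):
    """Nelson-Aalen, Kaplan-Meier and Aalen-Johansen estimates for one cause.

    times  : observed times T_i
    events : event indicators delta_i (0 = censored, 1..J = event type)
    cause  : the event type j of interest
    Returns a list of (t_k, H_j, S, F_j) evaluated at each distinct event time.
    """
    event_times = sorted({t for t, d in zip(times, events) if d != 0})
    H, S, F = 0.0, 1.0, 0.0
    curve = []
    for tk in event_times:
        at_risk = sum(1 for t in times if t >= tk)
        d_all = sum(1 for t, d in zip(times, events) if t == tk and d != 0)
        d_j = sum(1 for t, d in zip(times, events) if t == tk and d == cause)
        H += d_j / at_risk          # Nelson-Aalen increment
        F += S * d_j / at_risk      # Aalen-Johansen: S here is S(t_{k-1})
        S *= 1.0 - d_all / at_risk  # Kaplan-Meier update
        curve.append((tk, H, S, F))
    return curve


def mortality(curve, tau):
    """Integrate the step function F_j over [0, tau] (cause-j mortality)."""
    area, prev_t, prev_F = 0.0, 0.0, 0.0
    for tk, _, _, F in curve:
        if tk >= tau:
            break
        area += prev_F * (tk - prev_t)
        prev_t, prev_F = tk, F
    return area + prev_F * (tau - prev_t)


# Toy data: five subjects, two causes, one censored observation.
times = [1, 2, 3, 4, 5]
events = [1, 2, 0, 1, 2]
curve1 = competing_risk_estimates(times, events, cause=1)
curve2 = competing_risk_estimates(times, events, cause=2)
m1 = mortality(curve1, tau=5)
```

On the toy data the two cause-specific CIF estimates sum to one minus the Kaplan–Meier estimate at every event time, mirroring the relation used in Section 2.1.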

3. Competing risk forests

An RSF (Ishwaran and others, 2008) is a collection of randomly grown survival trees. Each tree is grown using an independent bootstrap sample of the learning data using random feature selection at each node. RSF trees are generally grown very deeply with many terminal nodes (the ends of the tree). Trees in competing risk forests are grown similarly. What differs are the splitting rules used to grow the tree (Section 3.3) and the estimated values calculated within the terminal nodes used to define the ensemble (Section 3.1).

To grow a competing risk forest, we highlight two conceptually different approaches:

(1) Separate competing risk trees are grown for each of the $J$ events in each bootstrap sample. The splitting rules used to grow the trees are event-specific.
(2) A single competing risk tree is grown in each bootstrap sample. The splitting rules are either event-specific, or combine event-specific splitting rules across the $J$ events.

The second approach is more efficient (especially for high-dimensional problems and large data settings), sufficient for most tasks, and what we do in this article. In the next subsections, we describe how to calculate various ensembles useful for competing risks and provide details of competing risk trees. The forest algorithm is then summarized in Section 3.4.

3.1. Event-specific ensembles

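Let $(T_i, \delta_i, \mathbf{x}_i)_{1 \le i \le n}$ denote the learning data. As stated earlier, an RSF tree is grown using an independent bootstrap sample of the learning data. Let $c_{i,b}$ be the number of times case $i$ occurs in bootstrap sample $b$. To define the CIF for the $b$th tree, take a case's covariate $\mathbf{x}$ and drop it down the tree. Let $m_b(\mathbf{x})$ denote the indices for cases from the learning data whose covariates share the terminal node with $\mathbf{x}$. Denoting node-specific event counts by $N_{j,b}(t \mid \mathbf{x}) = \sum_{i \in m_b(\mathbf{x})} c_{i,b}\, I(T_i \le t, \delta_i = j)$ and the number at risk by $Y_b(t \mid \mathbf{x}) = \sum_{i \in m_b(\mathbf{x})} c_{i,b}\, I(T_i \ge t)$, we define tree $b$'s CIF as

$$\hat F_{j,b}(t \mid \mathbf{x}) = \int_0^t \hat S_b(u- \mid \mathbf{x})\, \frac{N_{j,b}(du \mid \mathbf{x})}{Y_b(u \mid \mathbf{x})},$$

where $\hat S_b(t \mid \mathbf{x})$ is tree $b$'s Kaplan–Meier estimate of event-free survival. With $B$ trees in the forest, the ensemble estimates of the CIF and the cause-$j$ mortality, respectively, equal

$$\bar F_j(t \mid \mathbf{x}) = \frac{1}{B} \sum_{b=1}^{B} \hat F_{j,b}(t \mid \mathbf{x}), \qquad \bar M_j(\tau \mid \mathbf{x}) = \int_0^\tau \bar F_j(t \mid \mathbf{x})\, dt.$$

For reporting an internal error rate, we use out-of-bag (OOB) ensembles. By standard bootstrap theory, each bootstrap sample leaves out approximately 37% of the data. The OOB data are used to construct the OOB ensemble. Let $O_i \subset \{1, \ldots, B\}$ be the index set of trees where $c_{i,b} = 0$; i.e. $O_i$ records trees where case $i$ is OOB. The OOB ensemble estimates of the CIF and the cause-$j$ mortality are, respectively, given by

$$\bar F_j^{\mathrm{oob}}(t \mid \mathbf{x}_i) = \frac{1}{|O_i|} \sum_{b \in O_i} \hat F_{j,b}(t \mid \mathbf{x}_i), \qquad \bar M_j^{\mathrm{oob}}(\tau \mid \mathbf{x}_i) = \int_0^\tau \bar F_j^{\mathrm{oob}}(t \mid \mathbf{x}_i)\, dt.$$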

The OOB predicted value for a case does not use event time outcome information for that case, and, therefore, because it is a cross-validation based estimator, it can be used for estimation of the prediction error.
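The bookkeeping behind OOB ensembles can be sketched in a few lines. The code below is illustrative only: tree predictions are taken as given numbers (for example a tree's CIF estimate at one fixed time point), and all names are hypothetical. It also checks empirically that roughly 37% of cases are left out of a bootstrap sample, since $(1 - 1/n)^n \to e^{-1} \approx 0.368$.

```python
import random


def bootstrap_counts(n, rng):
    """Number of times each of the n cases appears in one bootstrap sample."""
    counts = [0] * n
    for _ in range(n):
        counts[rng.randrange(n)] += 1
    return counts


def oob_ensemble(tree_predictions, counts_per_tree, i):
    """Average the predictions for case i over the trees in which i is OOB.

    tree_predictions[b][i] : prediction of tree b for case i
    counts_per_tree[b][i]  : bootstrap count of case i in tree b
    Returns None when case i is in-bag for every tree.
    """
    oob_trees = [b for b, c in enumerate(counts_per_tree) if c[i] == 0]
    if not oob_trees:
        return None
    return sum(tree_predictions[b][i] for b in oob_trees) / len(oob_trees)


# Fraction of cases left out of a single bootstrap sample.
rng = random.Random(2014)
counts = bootstrap_counts(2000, rng)
oob_fraction = sum(1 for c in counts if c == 0) / len(counts)

# Three hypothetical trees and two cases: case 0 is OOB in trees 0 and 2.
counts_per_tree = [[0, 2], [1, 1], [0, 2]]
tree_predictions = [[0.2, 0.4], [0.9, 0.1], [0.4, 0.6]]
pred_case0 = oob_ensemble(tree_predictions, counts_per_tree, 0)
pred_case1 = oob_ensemble(tree_predictions, counts_per_tree, 1)
```

Because the OOB average for a case uses only trees that never saw that case, it behaves like a cross-validated prediction without refitting the forest.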

3.2. Event-free survival ensembles

An efficient method to analyze event-free survival probability is to simply use the tree-specific estimators already computed from the competing risks forests, which saves the computation time needed to grow a separate forest. Thus, we estimate the forest event-free survival using the ensemble $\bar S(t \mid \mathbf{x}) = \frac{1}{B} \sum_{b=1}^{B} \hat S_b(t \mid \mathbf{x})$, where $\hat S_b(t \mid \mathbf{x})$ is the Kaplan–Meier estimate from tree $b$.

3.3. Splitting rules

Here, we describe two splitting rules that can be used to grow competing risk trees. For notational convenience, we describe these rules for the root node using the entire learning data, but the idea extends obviously to any tree node and to bootstrap data.

As before, let $(T_i, \delta_i)_{1 \le i \le n}$ denote the survival times and event indicators, and let $t_1 < t_2 < \cdots < t_m$ be the distinct event times. Suppose that the proposed split for the root node is of the form $x \le c$ and $x > c$ for a continuous predictor $x$ (this can be obviously generalized to categorical variables). Such a split forms two daughter nodes containing two new sets of competing risk data. To indicate these data, we use a subscript of $l$ and $r$ for the left and right daughter nodes, and denote by $\alpha_{jl}(t)$ and $\alpha_{jr}(t)$ the cause-$j$ specific hazard rates in the left and the right daughter nodes, respectively. Similarly, define $F_{jl}(t)$ and $F_{jr}(t)$ to be the CIF for the left and the right daughter nodes, respectively.

The number of individuals at risk at time $t$ in the left and right daughter nodes are, respectively, $Y_l(t) = \sum_{i=1}^{n} I(T_i \ge t, x_i \le c)$ and $Y_r(t) = \sum_{i=1}^{n} I(T_i \ge t, x_i > c)$, where $x_i$ is the $x$-predictor for individual $i$. The number of individuals who are at risk at time $t$ is $Y(t) = Y_l(t) + Y_r(t)$. The number of type $j$ events at time $t$ for the left and right daughters is, respectively,

$$d_{j,l}(t) = \sum_{i=1}^{n} I(T_i = t, \delta_i = j, x_i \le c), \qquad d_{j,r}(t) = \sum_{i=1}^{n} I(T_i = t, \delta_i = j, x_i > c),$$

and $d_j(t) = d_{j,l}(t) + d_{j,r}(t)$ is the total number of type $j$ events at $t$. Define also $\tau$, $\tau_l$, and $\tau_r$ to be the largest time on study in the parent node and the two daughters, respectively.

3.3.1. Generalized log-rank test

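Our first splitting rule is the log-rank test. In the setting with competing risks, this is a test of the null hypothesis $H_0: \alpha_{jl}(t) = \alpha_{jr}(t)$ for all $t \le \tau$. The test is based on the weighted difference of the cause-specific Nelson–Aalen estimates in the two daughter nodes. Specifically, for a split at the value $c$ for variable $x$, the splitting rule is

$$L_j^{\mathrm{LR}}(x, c) = \frac{1}{\hat\sigma_j^{\mathrm{LR}}(x, c)} \sum_{k=1}^{m} W_j(t_k) \left( d_{j,l}(t_k) - \frac{d_j(t_k)\, Y_l(t_k)}{Y(t_k)} \right), \tag{3.1}$$

where the variance estimate is given by

$$\big(\hat\sigma_j^{\mathrm{LR}}(x, c)\big)^2 = \sum_{k=1}^{m} W_j(t_k)^2\, d_j(t_k)\, \frac{Y_l(t_k)}{Y(t_k)} \left(1 - \frac{Y_l(t_k)}{Y(t_k)}\right) \left(\frac{Y(t_k) - d_j(t_k)}{Y(t_k) - 1}\right).$$

Time-dependent weights $W_j(t) > 0$ are used to make the test more sensitive to early or late differences between the cause-specific hazards. The choice $W_j(t) = 1$ corresponds to the standard log-rank test, which has optimal power for detecting alternatives where the cause-specific hazards are proportional. The best split is found by maximizing $|L_j^{\mathrm{LR}}(x, c)|$ over $x$ and $c$.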
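As a rough illustration of the cause-specific log-rank split, the sketch below evaluates the standardized statistic with unit weights ($W_j(t) = 1$) for a candidate split $x \le c$ and then searches the observed values of one predictor for the split with the largest absolute statistic. It assumes the usual hypergeometric variance of the log-rank test; the function names and toy data are hypothetical and the code favours clarity over speed.

```python
import math


def logrank_split_stat(times, events, x, c, cause):
    """Standardized cause-specific log-rank statistic (unit weights) for x <= c."""
    event_times = sorted({t for t, d in zip(times, events) if d == cause})
    num, var = 0.0, 0.0
    for tk in event_times:
        Y = sum(1 for t in times if t >= tk)
        Yl = sum(1 for t, xi in zip(times, x) if t >= tk and xi <= c)
        d = sum(1 for t, e in zip(times, events) if t == tk and e == cause)
        dl = sum(1 for t, e, xi in zip(times, events, x)
                 if t == tk and e == cause and xi <= c)
        num += dl - d * Yl / Y  # observed minus expected in the left daughter
        if Y > 1:
            var += d * (Yl / Y) * (1.0 - Yl / Y) * (Y - d) / (Y - 1)
    return num / math.sqrt(var) if var > 0 else 0.0


def best_split(times, events, x, cause):
    """Candidate cut point (an observed value of x) maximizing |statistic|."""
    cuts = sorted(set(x))[:-1]  # the largest value would leave the right node empty
    return max(cuts,
               key=lambda c: abs(logrank_split_stat(times, events, x, c, cause)),
               default=None)


times = [1, 2, 3, 4]
events = [1, 1, 1, 1]
x = [0, 0, 1, 1]
stat = logrank_split_stat(times, events, x, c=0.5, cause=1)
flipped = logrank_split_stat(times, events, [1 - v for v in x], c=0.5, cause=1)
cut = best_split(times, events, x, cause=1)
```

Swapping the two daughters changes only the sign of the statistic, which is why the split is chosen on its absolute value.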

3.3.2. Gray's test

The cause-$j$ specific log-rank splitting rule (3.1) is useful if the main purpose is to detect variables that affect the cause-$j$ specific hazard. It may not be optimal if the purpose is also prediction of cumulative event probabilities. In this case, better results may be obtained with splitting rules that select variables based on their direct effect on the cumulative incidence. For this reason, we model our second splitting rule after Gray's test (Gray, 1988), which tests the null hypothesis $H_0: F_{jl}(t) = F_{jr}(t)$ for all $t \le \tau$. For notational simplicity, consider analysis of event $j$ and assume $J = 2$; that is, we pool all events not equal to event $j$. Gray's statistic for testing the null is a weighted comparison of the estimated cause-$j$ subdistribution hazards in the two daughter nodes, standardized by a variance estimate that is based on the asymptotic normal representation under the null hypothesis; see Gray (1988) for details.

In the special case where the censoring time is known for those cases that have an event before the end of follow-up, it is possible to obtain the score statistic of Gray's test by a simple modification of the log-rank test statistic. This is achieved by substituting in (3.1) for $Y(t)$ the modified risk set:

$$Y_j^*(t) = \sum_{i=1}^{n} I\big(T_i \ge t \ \text{ or } \ (T_i < t,\ \delta_i \ne j,\ C_i^o > t)\big).$$

This motivates our modified splitting rule. The splitting rule based on the score statistic that uses the modified risk sets is denoted by $L_j^{\mathrm{G}}(x, c)$ and given by substituting $Y_{j,l}^*$ for $Y_l$ and $Y_j^*$ for $Y$ in (3.1), where $Y_{j,l}^*$ is the modified risk set restricted to the left daughter node. Note that if the censoring time is not known for those cases that have an event before the end of follow-up, the largest observed time is used, and the statistic is still a good (and computationally efficient) approximation of Gray's test statistic; see Fine and Gray (1999, Section 3.2).
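The modified risk set keeps a case with a competing event "at risk" for cause $j$ until its censoring time. The following sketch counts the size of that risk set at a given time; it is an illustration under the assumptions stated in the text, with hypothetical function and argument names. When no censoring times are supplied, the largest observed time is substituted for cases with an event, as the text suggests.

```python
def gray_risk_set_size(times, events, t, cause, censor_times=None):
    """Size of the modified (Gray-type) risk set for `cause` at time t.

    A case is counted if it is still under observation (T_i >= t), or if it
    had a competing event before t and its censoring time exceeds t.
    events: 0 = censored, otherwise the event type.
    """
    if censor_times is None:
        # Assumption from the text: use the largest observed time when the
        # censoring time of a case with an event is unknown.
        largest = max(times)
        censor_times = [T if d == 0 else largest for T, d in zip(times, events)]
    count = 0
    for T, d, C in zip(times, events, censor_times):
        still_observed = T >= t
        competing_before = T < t and d not in (0, cause) and C > t
        if still_observed or competing_before:
            count += 1
    return count


times = [1, 2, 3, 4]
events = [2, 1, 0, 1]        # case 0 has a competing event at time 1
censor_times = [5, 5, 3, 5]  # hypothetical known censoring times
standard = sum(1 for T in times if T >= 3)
modified = gray_risk_set_size(times, events, t=3, cause=1, censor_times=censor_times)
fallback = gray_risk_set_size(times, events, t=3, cause=1)
```

In the toy data the ordinary risk set at time 3 contains two cases, while the modified risk set also retains the case that experienced the competing event at time 1.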

3.3.3. Composite splitting rules

If the aim is to predict the CIF of all causes simultaneously, or if interest is in identifying variables that are important for any cause, it can be useful to combine the cause-specific splitting rules across the event types:

$$L^{\mathrm{LR}}(x, c) = \frac{\sum_{j=1}^{J} \hat\sigma_j^{\mathrm{LR}}(x, c)\, L_j^{\mathrm{LR}}(x, c)}{\sqrt{\sum_{j=1}^{J} \big(\hat\sigma_j^{\mathrm{LR}}(x, c)\big)^2}}, \tag{3.2}$$

$$L^{\mathrm{G}}(x, c) = \frac{\sum_{j=1}^{J} \hat\sigma_j^{\mathrm{G}}(x, c)\, L_j^{\mathrm{G}}(x, c)}{\sqrt{\sum_{j=1}^{J} \big(\hat\sigma_j^{\mathrm{G}}(x, c)\big)^2}}. \tag{3.3}$$

The best split is found by maximizing the absolute value of the chosen statistic over $x$ and $c$. Note that we have ignored the dependence in the test statistics in defining the variance. We do so because calculations that account for this dependence are not suitable for random forest trees. As these trees are grown deeply, tree nodes typically have few observations, which makes estimation of a covariance matrix problematic due to the limited data and will result in a poorly performing split-statistic. We should remark that a bias may occur with (3.3) if the censoring times remain unknown for cases that have an event before the end of the follow-up. However, our empirical results indicate that, for the purpose of building competing risk forests, the modified Gray splitting rule performs very well.

3.4. Competing risks forest algorithm

The steps required to construct a competing risks forest can be summarized as follows.

(1) Draw $B$ bootstrap samples from the learning data.
(2) Grow a competing risk tree for each bootstrap sample. At each node of the tree, randomly select $M$ candidate variables (the "mtry" parameter of Section 6.2). The node is split using the candidate variable that maximizes a competing risk splitting rule.
(3) Grow the tree to full size under the constraint that a terminal node should have no fewer than a prespecified number $n_0 > 0$ of unique cases.
(4) Calculate the terminal node estimators of Section 2.2 for each tree, $b = 1, \ldots, B$.
(5) Take the average of each estimator over the $B$ trees to obtain its ensemble.

4. Prediction performance

4.1. Performance metrics

To assess prediction performance, we use the concordance index and the prediction error defined by the integrated Brier score (BS). The concordance index (C-index) is related to the area under the receiver operating characteristic curve and estimates the probability that, in a randomly selected pair of cases, the case that fails first had a worse predicted outcome. The BS is the squared difference between actual and predicted outcome.

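Individuals are ranked by ensemble cause-$j$ mortality. We say that case $i$ has a higher risk of event $j$ than case $i'$ if $\bar M_j(\tau \mid \mathbf{x}_i) > \bar M_j(\tau \mid \mathbf{x}_{i'})$. Wolbers and others (2013) described a time-truncated concordance index for competing risks, which in our setting is

$$C_j(\tau) = P\big\{\bar M_j(\tau \mid \mathbf{x}_i) > \bar M_j(\tau \mid \mathbf{x}_{i'}) \ \big|\ T_i^o \le \tau,\ \delta_i^o = j, \text{ and } (T_i^o < T_{i'}^o \text{ or } \delta_{i'}^o \ne j)\big\}.$$

Thus, the ensemble prediction of the cumulative incidence is concordant with the outcome if either the case with the higher cause-$j$ mortality has event $j$ before the other case has an event of cause $j$ or if the other case has a competing event. We also consider the time-dependent BS (Graf and others, 1999; Gerds and Schumacher, 2006) and its integral (IBS) to assess the performance of the ensemble CIF:

$$\mathrm{BS}_j(t) = E\big[\big(I(T^o \le t, \delta^o = j) - \bar F_j(t \mid \mathbf{x})\big)^2\big], \qquad \mathrm{IBS}_j(\tau) = \int_0^\tau \mathrm{BS}_j(t)\, dt.$$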

4.2. OOB estimate of prediction error

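Denote by $(T_i, \delta_i, \mathbf{x}_i)_{1 \le i \le n}$ the right-censored observations in a validation data set of size $n$. Based on these data, the prediction error can be estimated using inverse probability of censoring weights (IPCWs) (Gerds and Schumacher, 2006; Wolbers and others, 2013). This technique requires an estimate of the censoring distribution. Let $\hat G(t)$ denote the so-called reverse Kaplan–Meier estimate of the censoring distribution. We shall assume that the censoring times are independent of the covariates, the event times, and the event type. Thus, $\hat G(t)$ provides an unbiased estimate of the probability of being uncensored at time $t$. To estimate $C_j(\tau)$ we define weights $\hat W_{ii',1} = \hat G(T_i-)\, \hat G(T_i)$ and $\hat W_{ii',2} = \hat G(T_i-)\, \hat G(T_{i'}-)$. The OOB-IPCW estimate at the largest observation time $\tau$ is

$$\hat C_j(\tau) = \frac{\sum_{i=1}^{n} \sum_{i'=1}^{n} \big(A_{ii'} \hat W_{ii',1}^{-1} + B_{ii'} \hat W_{ii',2}^{-1}\big)\, Q_{ii'}\, N_i^j(\tau)}{\sum_{i=1}^{n} \sum_{i'=1}^{n} \big(A_{ii'} \hat W_{ii',1}^{-1} + B_{ii'} \hat W_{ii',2}^{-1}\big)\, N_i^j(\tau)},$$

where $A_{ii'} = I(T_i < T_{i'})$, $B_{ii'} = I(T_i \ge T_{i'},\ \delta_{i'} \notin \{0, j\})$, $N_i^j(\tau) = I(T_i \le \tau, \delta_i = j)$, and $Q_{ii'}$ indicates that the OOB ensemble cause-$j$ mortality (Section 3.1) of case $i$ exceeds that of case $i'$. Using weights $\hat w_i(t) = I(T_i \le t, \delta_i \ne 0) / \hat G(T_i-) + I(T_i > t) / \hat G(t)$ (Binder and others, 2009), the OOB estimate of the integrated BS for event $j$ is given by

$$\widehat{\mathrm{IBS}}_j(\tau) = \frac{1}{n} \int_0^\tau \sum_{i=1}^{n} \hat w_i(t)\, \big\{I(T_i \le t, \delta_i = j) - \hat F_{j,i}^{\mathrm{oob}}(t)\big\}^2\, dt,$$

where $\hat F_{j,i}^{\mathrm{oob}}(t)$ is the OOB ensemble CIF (Section 3.1) evaluated at the covariates of case $i$. Note that extremely large weights may occur, but can be avoided by evaluating the IPCW statistics at an earlier time point $\tau$.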
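To illustrate the weighting idea, the sketch below computes an IPCW Brier score at a single time point from a reverse Kaplan–Meier estimate of the censoring distribution. It assumes the standard weight form of Gerds and Schumacher (2006): an event observed at $T_i \le t$ receives weight $1/\hat G(T_i-)$, a case still under observation at $t$ receives $1/\hat G(t)$, and a case censored before $t$ receives weight zero. Tied event and censoring times are not treated specially, and all names and data are hypothetical.

```python
def reverse_km(times, events):
    """Reverse Kaplan-Meier estimate of G(t) = P(C > t); censoring is the 'event'."""
    cens_times = sorted({t for t, d in zip(times, events) if d == 0})
    steps, G = [], 1.0
    for tk in cens_times:
        at_risk = sum(1 for t in times if t >= tk)
        n_cens = sum(1 for t, d in zip(times, events) if t == tk and d == 0)
        G *= 1.0 - n_cens / at_risk
        steps.append((tk, G))

    def G_at(t, left=False):
        """G(t), or the left limit G(t-) when left=True."""
        value = 1.0
        for tk, g in steps:
            if tk < t or (tk == t and not left):
                value = g
            else:
                break
        return value

    return G_at


def ipcw_brier(times, events, pred_cif, t, cause):
    """IPCW Brier score at time t; pred_cif[i] is the predicted CIF of case i at t."""
    G = reverse_km(times, events)
    total = 0.0
    for T, d, F in zip(times, events, pred_cif):
        if T <= t and d != 0:
            w = 1.0 / G(T, left=True)  # event observed by time t
        elif T > t:
            w = 1.0 / G(t)             # still under observation at t
        else:
            w = 0.0                    # censored before t: status unknown
        y = 1.0 if (T <= t and d == cause) else 0.0
        total += w * (y - F) ** 2
    return total / len(times)


# Without censoring the weights are all one (plain Brier score).
plain = ipcw_brier([1, 2, 3, 4], [1, 2, 1, 2], [0.5] * 4, t=2.5, cause=1)
# One censored case at time 2 is dropped and the later cases are up-weighted.
weighted = ipcw_brier([1, 2, 3, 4], [1, 0, 1, 2], [0.5] * 4, t=3.5, cause=1)
```

Integrating such scores over a grid of time points up to $\tau$ gives an integrated Brier score; using OOB predictions for `pred_cif` corresponds to the OOB version described in the text.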

5. Variable selection

5.1. Variable importance

RSF variable selection typically involves filtering variables on the basis of variable importance (VIMP). VIMP measures the increase (or decrease) in prediction error for the forest ensemble when a variable is randomly “noised-up” (Breiman, 2001). A large positive VIMP shows that the prediction accuracy of the forest is substantially degraded when a variable is noised-up; thus a large VIMP indicates a potentially predictive variable.

In Breiman's original definition, VIMP is calculated by noising up a variable by permuting its value randomly. A more effective noising-up method, and one used throughout this paper, is random node assignment (Ishwaran and others, 2008). In random node assignment, cases are dropped down a tree and randomly assigned to a daughter node whenever the parent node splits on the target variable. This is more effective than permutation since it leads to a random assignment regardless of the type of variable. For example, permuting a discrete variable with, say, two values may not lead to a sufficiently noised-up feature.

Both non-event-specific and event-specific VIMP can be readily calculated for competing risks. To compute event-specific VIMP, we estimate the prediction error as described in Section 4.2. Then we noise up the data by random node assignment, and recompute the prediction error. The difference in these two values gives the VIMP for each variable for each event $j$.

5.2. Minimal depth

Minimal depth assesses the predictiveness of a variable by the depth of the first split of a variable relative to the root node of a tree (Ishwaran, Kogalur, Gorodeski and others, 2010). The smaller this value, the more predictive is the variable. There are unique advantages to using minimal depth in a competing risk setting. First, unlike VIMP, there is an easily derived minimal depth threshold that can be used for selecting variables. Secondly, minimal depth is non-event-specific, and therefore by fitting a single forest it can be used to identify all variables that affect long-term event probabilities. On the other hand, while non-event-specific analyses are useful, it may also be important to identify variables that are event-specific. In Section 6.2, we describe a simple way to combine minimal depth with event-specific VIMP.

6. Empirical results

6.1. Simulations

We used simulations based on Cox-exponential hazard models Inline graphic of two competing events () given a vector of covariates . In all simulations, we set . Six continuous predictors were drawn independently from a standard normal distribution and six binary predictors from a binomial distribution with success probability of 50%. We set

such that variables Inline graphic , , , have an effect on the hazard of event 1 only, variables , , , have an effect on the hazard of event 2 only, and variables , , , have an effect on both hazards. The effect size for the continuous variables was set to and for the discrete variables the effect size was set to 1.5. This was our “linear model”. The additive structure of the linear model was changed in our “quadratic model”. Here, the squared variables, Inline graphic , have an additional effect on the event-specific hazards where

Finally, we consider an “interaction model”. Additional interaction effects were added to the linear model of the form

We set the effect sizes of the interaction terms to

In all three simulation models, Inline graphic independent noise variables were drawn independently from a standard normal distribution and added to the simulated data sets. We set in our low-dimensional scenarios and in our high-dimensional scenarios. In all settings, independent right censoring was induced by drawing censoring times from an exponential distribution with rate Inline graphic . This yielded approximately 33% censored observations.
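A data-generating mechanism of this kind can be sketched with latent exponential event times: drawing one independent exponential time per cause with rate equal to that cause's hazard, and taking the minimum, yields constant cause-specific hazards of the Cox-exponential form. The code below is only an illustration; the baseline hazards, coefficients, censoring rate, and sample size are hypothetical placeholders rather than the values used in the paper, and the function name is invented.

```python
import math
import random


def simulate_competing_risks(n, beta1, beta2, base1, base2, censor_rate, rng):
    """Simulate (time, event, x) triples with two competing events.

    The first half of the covariates is standard normal, the second half
    Bernoulli(0.5). Event codes: 0 = censored, 1 or 2 = event type.
    """
    p = len(beta1)
    half = p // 2
    data = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(half)]
        x += [float(rng.random() < 0.5) for _ in range(p - half)]
        rate1 = base1 * math.exp(sum(b * v for b, v in zip(beta1, x)))
        rate2 = base2 * math.exp(sum(b * v for b, v in zip(beta2, x)))
        t1 = rng.expovariate(rate1)       # latent time to event 1
        t2 = rng.expovariate(rate2)       # latent time to event 2
        c = rng.expovariate(censor_rate)  # independent right censoring
        t_event, cause = (t1, 1) if t1 <= t2 else (t2, 2)
        if c < t_event:
            data.append((c, 0, x))
        else:
            data.append((t_event, cause, x))
    return data


# Hypothetical settings: x1 and x3 act on event 1, x2 and x4 on event 2.
rng = random.Random(0)
sim = simulate_competing_risks(
    n=300,
    beta1=[0.5, 0.0, 0.5, 0.0],
    beta2=[0.0, 0.5, 0.0, 0.5],
    base1=0.1, base2=0.1, censor_rate=0.1, rng=rng,
)
censored_share = sum(1 for _, d, _ in sim if d == 0) / len(sim)
```

Noise variables can be appended to each covariate vector after the outcome is generated, since they do not enter either hazard.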

6.2. Forest models

The R-package randomForestSRC (Ishwaran and Kogalur, 2013) was used for computations. For each simulation experiment, 1000 trees were grown using the log-rank splitting rule (3.2) and the modified Gray's splitting rule (3.3). Terminal node size was set at Inline graphic (the default software setting). Randomized splitting was used. Within each parent node, for each of the randomly selected candidate variables, “nsplit” randomly selected split points were chosen (this is in contrast to non-random splitting where all possible split points for each of the candidate variables are considered). The tree node was split on that variable and random split point maximizing the absolute value of the split-statistic. We set nsplit Inline graphic . A small nsplit value is necessary in settings involving a mixture of discrete and continuous variables to avoid biasing splits toward continuous variables (Loh and Shih, 1997; Ishwaran, Kogalur, Gorodeski and others, 2010).

We fit RSF using log-rank splitting (3.2) for each event using weights Inline graphic and . We denote the resulting forests as and . Because we focus on performance over event 1 only (for ease of interpretation), we only report the results for . Additionally, three forests were fit using Gray's modified splitting rule (3.3): used ; used ; used . Only and are reported.

Variables were selected using minimal depth variable selection Ishwaran, Kogalur, Gorodeski and others (2010). Those variables whose event-specific VIMP was positive, and that met a minimal depth threshold (estimated from the forest), represented the final selected set of variables. As noted in Ishwaran, Kogalur and others (2010), the number of variables selected at each node, Inline graphic , referred to as “mtry”, should be set high when using minimal depth in high-dimensional applications. In our high-dimensional simulations (), we used . The default setting was used in low-dimensions (). See Ishwaran, Kogalur and others (2010) for further discussion on setting tuning parameters in high dimensions.

For comparison, we used four alternative methods. For the first, we used the proportional subhazard method of Fine and Gray (1999), abbreviated as FG Inline graphic and FG. Computations were implemented using the R-software package cmprsk (Gray, 2006). For the second method, we used cause-specific Cox regression (abbreviated as and ) for each of the competing events. Predictions of the CIF were obtained by combining the Cox models using (1.1). Computations were implemented using the R-software riskRegression (Gerds and others, 2012). For both approaches, we specified additive effects of the predictor variables and resorted to selecting variables by using Inline graphic -values. A cutoff of 5% was used. For the third method, we applied a stepwise selection algorithm (CRRstep) to the Fine-Gray regression models as proposed in Kuk and Varadhan (2013). We used backward elimination as implemented in the R-package crrstep with an Akaike information criterion selection criterion (Varadhan and Kuk, 2013). For the fourth method, we used Cox-likelihood based boosting (Binder and Schumacher, 2008), abbreviated as CoxBoost Inline graphic and CoxBoost. This uses boosting to fit proportional subhazards as in Fine and Gray (1999). Computations were implemented using the CoxBoost R-software package (Binder, 2009). The optimal number of boosting iterations was estimated using 10-fold cross-validation. A boosting penalty of 100 was used.

6.3. Simulation results

The simulations were repeated 1000 times independently and results were averaged over the runs. To estimate prediction performance (Table 1), in each simulation run we generated a training set with Inline graphic independent observations and a test set with independent observations. The C-index and the integrated BS were truncated at a sufficiently low quantile of the observed event time distribution. A lower benchmark for prediction performance was obtained in each simulation study by fitting a null model which ignores all covariates. An upper benchmark was obtained by fitting the data generating model to each training data set, i.e. the combination of two cause-specific Cox regression models that were given the correct linear predictor (including quadratic and interaction terms) and no noise variables.

Table 1.

Cox-exponential simulations (1000 replications)

	Low-dimensional simulations
	Linear model		Quadratic model		Interaction model
	IBS	C-index	IBS	C-index	IBS	C-index

NM	15.4 (0.7)	—	15.1 (0.6)	—	15.5 (0.6)	—
DGM	10.1 (0.6)	82.1 (1.4)	9.5 (0.7)	79.4 (2.2)	10.8 (0.7)	80.3 (1.5)
	13.6 (0.7)	75.9 (2.3)	13.9 (0.7)	72.1 (2.6)	15.1 (0.6)	61.6 (2.5)
	13.2 (0.7)	76.5 (2.1)	13.6 (0.6)	72.0 (2.4)	15.0 (0.6)	61.3 (2.6)
	13.1 (0.6)	77.2 (2)	13.5 (0.6)	72.9 (2.3)	15.0 (0.6)	61.9 (2.5)
CoxBoost	11.4 (0.7)	79.6 (1.9)	13.9 (0.8)	65.7 (4)	15.4 (0.7)	55.9 (4.1)
FG	11.7 (0.9)	79.5 (1.8)	14.6 (1.0)	66.4 (2.7)	16.5 (0.9)	57.5 (2.5)
Cox	10.9 (0.7)	80.1 (1.7)	14.4 (1.0)	66.6 (2.6)	16.5 (0.9)	57.6 (2.5)
CRRstep	13.6 (2.1)	62.7 (18.4)	14.8 (0.9)	55.4 (12.1)	15.7 (0.7)	49.7 (9.1)

	High-dimensional simulations
	Linear model		Quadratic model		Interaction model
	IBS	C-index	IBS	C-index	IBS	C-index

NM	15.4 (0.7)	—	15.1 (0.6)	—	15.5 (0.6)	—
DGM	10.1 (0.6)	82.0 (1.4)	9.8 (0.9)	79.3 (2.2)	10.8 (0.7)	80.3 (1.5)
	14.9 (0.7)	67.4 (2.9)	14.8 (0.7)	65.0 (3.6)	15.5 (0.7)	53.3 (2.5)
	14.9 (0.7)	65.8 (3.5)	14.6 (0.7)	64.5 (3.4)	15.6 (0.6)	52.2 (2.4)
	14.8 (0.7)	68.3 (3.1)	14.4 (0.6)	66.7 (3.2)	15.6 (0.6)	52.7 (2.5)
CoxBoost	13.5 (1.0)	72.0 (4)	14.8 (0.8)	58.4 (5.7)	15.7 (0.7)	51.5 (2.9)


Performance measures are average (standard deviation) of test set C-index and integrated BS (IBS). NM is the null model, which assigns the same predicted CIF to each observation, and DGM is the data generating model fitted to the training set.

Based on Table 1, we draw the following conclusions:

In the low-dimensional linear simulations, Fine–Gray, Cox, and CoxBoost are better than RSF.
For the low-dimensional quadratic and interaction model, RSF outperforms the other methods.
RSF is better than CoxBoost in the high-dimensional quadratic and interaction model simulations, but CoxBoost is better in the linear model.
The event-specific RSF models and tended to be slightly better than the composite model in the low-dimensional simulations, but this trend was less pronounced in the high-dimensional simulations, and in some cases it was reversed (high-dimensional linear model).
We were not able to calculate Fine–Gray or Cox in the high-dimensional simulations. This is expected of unregularized methods which perform poorly in high-dimensional problems.

To assess VIMP, we calculated selection rates across the 1000 runs (Tables 2 and 3). True positive rates were summarized separately for each of the predictors Inline graphic . False positive rates were averaged across noise variables. Based on Tables 2 and 3, we draw the following conclusions:

The log-rank splitting forest performs best in identifying only those variables affecting the event 1 cause-specific hazard (i.e. and ). Recall that log-rank splitting is designed to test for differences in the cause-specific hazard: the results confirm the efficacy of the approach.
The Gray composite splitting forest is designed to discover all variables affecting the event 1 CIF, which are . We find it does a good job doing so. Furthermore, in nearly all simulations, it achieves the smallest false positive rate over the noise variables.

Table 2.

Variable selection frequencies (%) from the low-dimensional simulation study

	$x_1$	$x_2$	$x_3$	$x_4$	$x_5$	$x_6$	$x_7$	$x_8$	$x_9$	$x_{10}$	$x_{11}$	$x_{12}$	Noise
Linear model
	34.4	37	17.1	17.6	85.3	85.4	42.4	42.2	16.7	18	94.1	94.1	2.0
	90.3	91.2	22	19.7	39.5	42.4	94.1	94.5	17.5	18.8	39.1	39.2	7.5
	88.4	88.9	6.5	6	81.8	81.6	93.2	94.4	2	2.6	88.4	86.6	5.2
CoxBoost	99.9	99.7	78.2	75.8	91.4	91.6	99.9	100	82.5	80.5	95	94.4	37.6
DGM	99.7	100	7.1	7.1	99.9	99.5	99.9	99.9	7	5.5	100	99.9	0
Cox	99.6	99.6	9.5	8.2	99.4	99.6	99.9	99.8	8.9	8.5	99.8	99.6	8.9
FG	97.8	97.3	41	42	70	72.2	99.2	99.3	48.7	46.9	79.8	77.4	8.9
CRRstep	48.8	48.9	31.2	31.5	42.7	42.6	49	49.1	35.2	33.5	44.9	45.1	11.5
Quadratic model
	79.1	42.9	35.9	14.1	91.8	97.9	8.2	10	6.1	3.7	31.1	31.9	3.8
	99.9	84.1	23.3	18.9	59.2	42	31.1	32.9	7.4	7.5	10.6	9.6	7.6
	100	80.7	5.5	6.2	96.4	83.6	27.5	30.9	2.5	2.4	23.8	22.5	6.4
CoxBoost	92	83.5	60.9	46.3	57.9	70.1	68.3	66.4	37.5	38.6	46.2	43.4	28.7
DGM	100	97.2	7.1	6.7	99.6	97.6	72.1	74.2	7.5	7.2	72.8	71.4	0
Cox	92.9	66.7	10.5	9.3	76.6	69.9	45.3	46.3	8.2	9.6	41.1	37.8	8.4
FG	80.3	66.1	34.8	18.9	30.4	44.5	39.3	40.7	12.1	12.3	18.4	16.5	7.2
CRRstep	43.6	35.9	23.3	18	26.1	28.1	27.8	28.7	13.6	13	16.5	15.5	9.7
Interaction model
	28.3	27.2	18.1	18.5	65.2	59.5	20.5	27.7	8.9	11	66.5	57	14.2
	50.1	57.2	20.2	18.9	27.5	26.6	45.5	61.1	10.4	13.9	19.8	17.4	17.1
	48.1	57.4	17.4	16.6	43.3	44.6	46.2	62.1	5.8	5.2	45.4	42.3	17.7
CoxBoost	23.3	17	16.6	16.4	16.7	15	50.2	59.8	22.4	28.9	31.5	27.6	14.9
DGM	99.1	96.9	7	6.4	97.4	97.1	56	45	6.1	6.1	54.1	50.3	0
Cox	19	8.3	8.7	7.6	14.1	7.9	46.8	59.1	7.9	6.7	47.4	40.4	8.0
FG	15.2	7	8.5	6.5	7.2	6.7	40.2	54.5	12	14.4	22.5	19.5	6.5
CRRstep	15.8	5.7	6	5.8	5.8	5.9	19.8	22	9.1	5	7.4	6.2	5.7


As described in Section 6.1, four variables have an effect on the hazard of event 1 only, four variables have an effect on the hazard of event 2 only, and four variables have an effect on both hazards. Shown are the true positive rates separately for variables $x_1$–$x_{12}$. False positive rates over the noise variables are given in columns labeled “Noise”.

Table 3.

Variable selection frequencies (%) from the high-dimensional simulation study

	$x_1$	$x_2$	$x_3$	$x_4$	$x_5$	$x_6$	$x_7$	$x_8$	$x_9$	$x_{10}$	$x_{11}$	$x_{12}$	Noise
Linear model
	15.3	15	7.7	7	54.8	53.9	21.4	23.3	9.5	9.9	83.4	81.5	1.6
	53.5	51.7	5.5	6.6	16.2	17	64	67.1	6.4	4.9	11.6	10	3.4
	44.5	44.8	1.4	1.4	35.7	35.3	56.6	59.8	0.3	0.1	48.2	44.6	1.4
CoxBoost	89.9	91.2	26.8	27.3	47.2	46.5	93.6	93.6	33.6	31.3	57.8	53.6	3.9
DGM	99.9	99.8	8.4	6.3	99.8	99.7	100	100	6.2	4.7	99.8	100	0
Quadratic model
	60.8	22.2	24.5	11.9	82.7	95.4	4.8	4.5	3.6	4.1	16.2	18.4	1.5
	99.4	30.7	6	12.4	48.4	8.2	12.3	11	1.4	2.4	2	2.7	3.7
	99.1	31.1	3	5.8	93.5	40.4	13.4	12.7	0.6	1.2	8.6	9.5	4.5
CoxBoost	68.7	32.7	13.4	8	16.7	18.3	18.1	18.7	3.2	4.7	5.6	5.1	1.8
DGM	100	96.6	7.5	7.7	99.8	97.9	69.8	72.8	6.1	8.2	71.2	70.9	0
Interaction model
	3.8	4.4	4.5	3.5	10	9.8	7.7	13.4	6.6	7.3	38.1	33	2.6
	10.2	9.7	6.2	4.2	5.3	6.4	15.1	25.7	2.7	3.3	4.3	3.9	4.4
	10.8	12.3	7.3	4.4	10.5	11.6	15.8	25.8	1.6	1.2	17.7	15.9	5.4
CoxBoost	2.6	1.8	1.4	1.2	1.3	1.6	15.5	25.7	3.9	5.2	6	6.1	1.2
	97.9	97	6.8	7.7	98.5	97.3	56.8	46.8	8.5	7.7	54	47.9	0

The first group of variables has an effect on the hazard of event 1 only, the second group has an effect on the hazard of event 2 only, and the third group has an effect on both hazards. Shown are the true positive rates for each of these variables separately. False positive rates over the noise variables are given in the column labeled “Noise”.

7. Highly active antiretroviral therapy for HIV infection

The Johns Hopkins HIV Clinical Cohort is a longitudinal, dynamic, clinical cohort of HIV-infected patients receiving primary care through the Johns Hopkins AIDS Service, which provides care to a large proportion of HIV-infected patients in the Baltimore metropolitan area (Moore, 1998). From this cohort, we identified 2960 individuals who initiated effective antiretroviral therapy between 1996 and 2005; for these individuals we wished to predict time to all-cause mortality and time to AIDS-defining illnesses after the initiation of effective treatment. Variables included laboratory measurements (CD4 nadir, HIV-RNA levels, total lymphocyte counts, and hemoglobin, albumin, and creatinine levels) as well as non-laboratory measurements (prior diagnosis of an AIDS-defining illness, prophylaxis for Pneumocystis jiroveci pneumonia, sex, race, history of injection drug use, history of heavy alcohol use, heroin use, cocaine use, and a medical history of personality disorder, anxiety, depression, schizophrenia, or suicide attempt). All marker measurements were restricted to the measurement closest to the time of initiation of highly active antiretroviral therapy (HAART), within a window starting 1 year prior to treatment, with the exception of the nadir CD4 count, which was the lowest CD4 count measured prior to the initiation of effective treatment.

Including death, there were 15 competing risk outcomes. Forests were fit using the modified Gray splitting rule (3.3), with a weight specified for every event. The same tuning parameters were used as before except for terminal node size, which was determined by optimizing the OOB event-averaged C-index on a small subset of the original data (the remaining data were used for the analysis). In addition, to determine cause-specific hazard risk factors, we fit a separate RSF to each event using log-rank splitting (3.2) with event-specific weights.

Figure 1 displays the ensemble CIF for each of the 15 outcomes from the RSF analysis using the composite Gray splitting rule. The CIFs have been grouped by similar ranges for better visualization. Most apparent is that death has a nearly uniformly higher incidence than all other events. Some AIDS illnesses have incidence curves that rise rapidly and then level off. For example, the incidence of non-Hodgkin's lymphoma increases rapidly and then begins to flatten after 4 years.

Fig. 1. — Averaged ensemble CIF for all 15 events from the HAART study using RSF. CIFs have been grouped by similar vertical ranges for better visualization.
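To make the quantity plotted in Figure 1 concrete, the following is a minimal sketch of a non-parametric (Aalen–Johansen type) CIF estimate for a single sample. It is not the forest ensemble itself, which averages such estimates over trees; the function name `aalen_johansen_cif` and the toy data are assumptions introduced for illustration only.

```python
def aalen_johansen_cif(times, events, cause):
    """Non-parametric CIF for one cause (illustrative sketch).

    times  : observed times (event or censoring)
    events : 0 = censored, 1..J = event type
    cause  : event type whose CIF is wanted
    Returns a list of (time, CIF) pairs at each distinct event time.
    """
    surv = 1.0  # event-free survival just before the current time
    cif = 0.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e != 0})
    for t in event_times:
        at_risk = sum(1 for s in times if s >= t)
        d_all = sum(1 for s, e in zip(times, events) if s == t and e != 0)
        d_cause = sum(1 for s, e in zip(times, events) if s == t and e == cause)
        # Probability of being event-free up to t, then failing from `cause` at t.
        cif += surv * d_cause / at_risk
        # Update event-free survival using events of any type.
        surv *= 1.0 - d_all / at_risk
        curve.append((t, cif))
    return curve


# Toy data (hypothetical): four subjects, two event types, one censored.
toy_times = [1, 2, 3, 4]
toy_events = [1, 2, 0, 1]
cif_1 = aalen_johansen_cif(toy_times, toy_events, cause=1)
cif_2 = aalen_johansen_cif(toy_times, toy_events, cause=2)
```

In this toy example the two CIFs at the last event time sum to one minus the event-free survival, which is the relationship that links the event-specific curves to overall survival.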

Table 4 lists the minimal depth and event-specific VIMP for each variable for the five most frequent outcomes, which include death, the most frequently occurring event. Minimal depth values were obtained using Gray's splitting; event-specific VIMP values were obtained using log-rank splitting. Variables selected by minimal depth (a total of 11) represent factors affecting the predictions for all events. The top three variables are nadir CD4 count, albumin level, and total lymphocyte count. These three factors, however, have different cause-specific hazard effects, as seen from their event-specific VIMP. For death, albumin level is the most influential factor; nadir CD4 count is influential for the four other events; and total lymphocyte count is influential for HIV encephalopathy. The importance of albumin for death is not surprising, as it is a marker for many general health issues such as liver disease, malnutrition, renal disease, and dehydration. However, it is not a marker for immune function, whereas nadir CD4 is a marker for immune system damage. Candidiasis, pneumocystis pneumonia, HIV encephalopathy, and Mycobacterium avium complex are all infection related; thus it is not surprising that nadir CD4 is influential for these outcomes.

Table 4.

Minimal depth and event-specific VIMP for risk factors from HAART analysis for the top five most frequent outcomes

		VIMP
	Minimal depth (all events)	Death	Candidiasis	PCP	HIV encephalopathy	MAC
Nadir CD4 prior to HAART	1.14	0.45	4.81	8.88	5.28	6.89
Albumin level	1.81	5.51	0.60	0.08	2.87	0.01
Total lymphocyte counts	2.21	1.36	1.02	1.89	2.49	1.85
Hemoglobin level	2.48	0.98	0.65	0.77	1.48	2.58
Creatinine level	3.24	2.40	0.10	0.51	0.91	1.58
Injected drug use	3.32	0.64	0.20	0.17	0.23	0.10
HIV-RNA levels	3.53	0.12	0.51	2.38	0.24	3.50
Age	3.63	1.67	0.00	0.09	0.39	1.79
Pre-2000	3.93	0.00	0.26	0.06	0.15	0.06
AIDS prior to HAART	4.14	0.32	0.46	1.23	0.42	2.46
PCP prophylaxis	4.69	0.04	1.58	0.85	0.16	0.99
History of hepatitis C	5.06	0.54	0.02	0.26	1.10	0.09
Race	5.64	0.06	1.12	0.08	0.33	0.21
Heterosexual	5.74	0.05	0.05	0.02	0.14	0.05
History of mental illness	5.90	0.18	0.01	0.05	0.10	0.02
Sex	6.26	0.20	0.08	0.04	0.08	0.03
History of hepatitis B	6.36	0.18	0.03	0.08	0.03	0.03
Cocaine	6.43	0.08	0.00	0.01	0.05	0.03
Men sex with men	6.49	0.02	0.02	0.11	0.02	0.10
Depression	6.54	0.01	0.04	0.05	0.04	0.03
Heroin	6.68	0.07	0.04	0.01	0.02	0.05
Current smoker	7.04	0.01	0.06	0.11	0.03	0.03
Suicide attempt	7.51	0.01	0.02	0.01	0.04	0.00
Alcohol	7.53	0.01	0.02	0.01	0.03	0.04
Anxiety	7.60	0.00	0.00	0.00	0.00	0.00
Personality disorder	7.60	0.00	0.00	0.00	0.00	0.00
Schizophrenia	7.60	0.00	0.00	0.00	0.00	0.00
Smoking history	7.60	0.00	0.00	0.00	0.00	0.00
C-index		73.9	70.5	78.1	80.0	87.7

PCP, pneumocystis pneumonia; MAC, Mycobacterium avium complex.

The minimal depth threshold separates selected from non-selected variables; because 11 variables were selected, it falls between PCP prophylaxis (4.69) and history of hepatitis C (5.06). The event-specific C-index is listed in the last row of the table.
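As a small worked illustration of how the event-specific VIMP columns of Table 4 are read, the sketch below ranks variables within each event. It uses only the first five rows of the table (the variables with the smallest minimal depth); the dictionary layout and the helper name `top_variable` are assumptions made for this example.

```python
# Event-specific VIMP values copied from the first five rows of Table 4.
EVENTS = ["Death", "Candidiasis", "PCP", "HIV encephalopathy", "MAC"]
VIMP = {
    "Nadir CD4 prior to HAART": [0.45, 4.81, 8.88, 5.28, 6.89],
    "Albumin level":            [5.51, 0.60, 0.08, 2.87, 0.01],
    "Total lymphocyte counts":  [1.36, 1.02, 1.89, 2.49, 1.85],
    "Hemoglobin level":         [0.98, 0.65, 0.77, 1.48, 2.58],
    "Creatinine level":         [2.40, 0.10, 0.51, 0.91, 1.58],
}


def top_variable(event):
    """Return the variable with the largest VIMP for the given event."""
    j = EVENTS.index(event)
    return max(VIMP, key=lambda name: VIMP[name][j])


top_by_event = {event: top_variable(event) for event in EVENTS}
```

Restricted to these rows, albumin level ranks first for death and nadir CD4 count ranks first for the other four events, in line with the discussion of Table 4.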

8. Discussion

In this paper, we described a novel extension of RSF to competing risk settings. We introduced new splitting rules for growing competing risk trees and described several ensemble estimators useful for competing risks. These included ensembles for the CIF as well as event-specific estimates of mortality. We described a novel non-parametric method for event-specific variable selection and showed how minimal depth, a new variable selection method for RSF, could be used for identifying non-event-specific variables. Our two splitting rules, log-rank splitting and the modified Gray's splitting rule, are designed to test different null hypotheses. Log-rank splitting tests for equality of the cause-specific hazard, while the modified Gray's splitting rule tests for equality of the CIF. We showed how event-specific VIMP and minimal depth variable selection could be used individually or simultaneously with these rules to identify variables specific to certain events or common to all events.
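The distinction between the two null hypotheses can be made concrete with a sketch of a cause-specific two-sample log-rank statistic for a candidate split, in which events of other types simply leave the risk set as censored observations do. This is the standard textbook statistic, written here under that assumption; it is not claimed to reproduce the exact form of rule (3.2), and the function name `cause_specific_logrank` and the toy data are hypothetical.

```python
import math


def cause_specific_logrank(times, events, in_left, cause):
    """Standardized log-rank statistic comparing the left daughter with the parent.

    times   : observed times
    events  : 0 = censored, 1..J = event type
    in_left : True if the subject falls in the left daughter node
    cause   : event type whose cause-specific hazard is compared
    """
    numerator, variance = 0.0, 0.0
    event_times = sorted({t for t, e in zip(times, events) if e == cause})
    for t in event_times:
        # Subjects with any earlier event or censoring are no longer at risk.
        y = sum(1 for s in times if s >= t)
        y_left = sum(1 for s, g in zip(times, in_left) if s >= t and g)
        d = sum(1 for s, e in zip(times, events) if s == t and e == cause)
        d_left = sum(
            1 for s, e, g in zip(times, events, in_left)
            if s == t and e == cause and g
        )
        numerator += d_left - d * y_left / y  # observed minus expected
        if y > 1:
            frac = y_left / y
            variance += d * frac * (1.0 - frac) * (y - d) / (y - 1)
    return numerator / math.sqrt(variance) if variance > 0 else 0.0


# Toy data (hypothetical): the left daughter fails earlier from cause 1.
stat_separated = cause_specific_logrank(
    [1, 2, 3, 4], [1, 1, 1, 1], [True, True, False, False], cause=1
)
# Toy data (hypothetical): both daughters have identical event patterns.
stat_balanced = cause_specific_logrank(
    [1, 1, 2, 2, 3, 3], [1, 1, 1, 1, 1, 1],
    [True, False, True, False, True, False], cause=1
)
```

A split search would evaluate the absolute value of such a statistic over candidate splits and keep the largest; a Gray-type rule would instead keep subjects with competing events in the risk set, which is what makes it a test on the CIF.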

RSF computations were implemented using the randomForestSRC package. In the future, we plan to complement the package with a Java application that will allow users to restore an RSF analysis for prediction on new data. This would make it possible to apply competing risk prediction in clinical settings; see Section B of the supplementary material available at Biostatistics online (http://www.biostatistics.oxfordjournals.org) for further discussion. Computational load is always an issue in large-scale problems, and we mention two strategies to combat it. One is to utilize randomized splitting via “nsplit”. This not only mitigates bias but also greatly reduces computation times. A second strategy is to utilize the OpenMP-enabled version of randomForestSRC (http://www.ccs.miami.edu/~hishwaran/rfsrc.html), which implements parallel processing. Computation time then decreases approximately linearly with the number of CPUs, which can translate into substantial computational gains.
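The idea behind randomized splitting can be sketched as follows: rather than evaluating the splitting rule at every distinct value of a covariate, only a random subset of at most “nsplit” candidate points is evaluated. The helper `candidate_splits` is hypothetical and is not the package's implementation; treating a non-positive `nsplit` as "use all split points" is likewise an assumption of this sketch.

```python
import random


def candidate_splits(values, nsplit, rng):
    """Return the split points to evaluate for one covariate.

    values : covariate values of the subjects in the node
    nsplit : maximum number of random split points (<= 0 means use all)
    rng    : random.Random instance, passed in for reproducibility
    """
    # Splitting at the largest value would leave an empty right daughter.
    unique = sorted(set(values))[:-1]
    if nsplit <= 0 or nsplit >= len(unique):
        return unique
    return sorted(rng.sample(unique, nsplit))


rng = random.Random(0)
node_values = list(range(20))          # hypothetical covariate values
few = candidate_splits(node_values, nsplit=3, rng=rng)
full = candidate_splits(node_values, nsplit=0, rng=rng)
```

With 20 distinct values, exhaustive search evaluates 19 candidate points per covariate while `nsplit=3` evaluates 3, which is where the computational saving comes from; capping the number of candidates also reduces the advantage that covariates with many distinct values have in a split search.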

Supplementary material

Supplementary material is available at http://biostatistics.oxfordjournals.org.

Funding

This work was supported by the National Institutes of Health [R01CA16373 to H.I., K01-AI071754, U01-AI42590, and R01-DA011602 to R.D.M., S.J.G., and B.M.L.]; and the National Science Foundation [DMS 114899 to H.I.].

Acknowledgements

We gratefully acknowledge the help of two referees whose reviews improved the manuscript. Conflict of Interest: None declared.

References

Aalen O., Johansen S. An empirical transition matrix for non-homogeneous Markov chains based on censored observations. Scandinavian Journal of Statistics. 1978;5:141–150.
Andersen P. K. A note on the decomposition of number of life years lost according to causes of death. Research Report. 2012. University of Copenhagen, Department of Biostatistics, 2. doi: 10.1002/sim.5903.
Binder H. CoxBoost: Cox Models by Likelihood Based Boosting for a Single Survival Endpoint or Competing Risks. R package version 1.1. http://cran.r-project.org.
Binder H., Allignol A., Schumacher M., Beyersmann J. Boosting for high-dimensional time-to-event data with competing risks. Bioinformatics. 2009;25:890–896. doi: 10.1093/bioinformatics/btp088.
Binder H., Schumacher M. Allowing for mandatory covariates in boosting estimation of sparse high-dimensional survival models. BMC Bioinformatics. 2008;9:14. doi: 10.1186/1471-2105-9-14.
Breiman L. Random forests. Machine Learning. 2001;45:5–32.
Fine J. P., Gray R. J. A proportional hazards model for the subdistribution of a competing risk. Journal of the American Statistical Association. 1999;94(446):496–509.
Gerds T. A., Schumacher M. Consistent estimation of the expected Brier score in general survival models with right-censored event times. Biometrical Journal. 2006;48(6):1029–1040. doi: 10.1002/bimj.200610301.
Gerds T. A., Scheike T. H., Andersen P. K. Absolute risk regression for competing risks: interpretation, link functions, and prediction. Statistics in Medicine. 2012;31:3921–3930. doi: 10.1002/sim.5459.
Graf E., Schmoor C., Sauerbrei W. F., Schumacher M. Assessment and comparison of prognostic classification schemes for survival data. Statistics in Medicine. 1999;18:2529–2545. doi: 10.1002/(sici)1097-0258(19990915/30)18:17/18<2529::aid-sim274>3.0.co;2-5.
Gray R. J. A class of K-sample tests for comparing the cumulative incidence of a competing risk. The Annals of Statistics. 1988;16:1141–1154.
Gray R. J. cmprsk: Subdistribution Analysis of Competing Risks. R package version 2.1-7. http://cran.r-project.org.
Ishwaran H., Kogalur U. B. randomForestSRC: Random Forests for Survival, Regression and Classification (RF-SRC). R package version 1.4.0. http://cran.r-project.org.
Ishwaran H., Kogalur U. B., Blackstone E. H., Lauer M. S. Random survival forests. The Annals of Applied Statistics. 2008;2:841–860.
Ishwaran H., Kogalur U. B., Chen X., Minn A. J. Random survival forests for high-dimensional data. Statistical Analysis and Data Mining. 2010;4:115–132.
Ishwaran H., Kogalur U. B., Gorodeski E. Z., Minn A. J., Lauer M. S. High-dimensional variable selection for survival data. Journal of the American Statistical Association. 2010;105(489):205–217.
Kuk D., Varadhan R. Model selection in competing risks regression. Statistics in Medicine. 2013;32(18):3077–3088. doi: 10.1002/sim.5762.
Loh W.-Y., Shih Y.-S. Split selection methods for classification trees. Statistica Sinica. 1997;7:815–840.
Moore R. D. Understanding the clinical and economic outcomes of HIV therapy: the Johns Hopkins HIV clinical practice cohort. Journal of Acquired Immune Deficiency Syndrome and Human Retrovirology. 1998;17(Suppl. 1):38. doi: 10.1097/00042560-199801001-00011.
Varadhan R., Kuk D. crrstep: Stepwise Covariate Selection for the Fine & Gray Competing Risks Regression Model. R package version 2013-02.12. http://cran.r-project.org.
Wolbers M., Koller M. T., Witteman J. C. Concordance for prognostic models with competing risks. Research Report. 2013. University of Copenhagen, Department of Biostatistics, 3.
