The wild bootstrap for multivariate Nelson–Aalen estimators

Tobias Bluhmki; Dennis Dobler; Jan Beyersmann; Markus Pauly

doi:10.1007/s10985-018-9423-x

2018 Mar 6;25(1):97–127. doi: 10.1007/s10985-018-9423-x


PMCID: PMC6323102 PMID: 29512005

Abstract

We rigorously extend the widely used wild bootstrap resampling technique to the multivariate Nelson–Aalen estimator under Aalen’s multiplicative intensity model. Aalen’s model covers general Markovian multistate models including competing risks subject to independent left-truncation and right-censoring. This leads to various statistical applications such as asymptotically valid confidence bands or tests for equivalence and proportional hazards. This is exemplified in a data analysis examining the impact of ventilation on the duration of intensive care unit stay. The finite sample properties of the new procedures are investigated in a simulation study.

Electronic supplementary material

The online version of this article (10.1007/s10985-018-9423-x) contains supplementary material, which is available to authorized users.

Keywords: Conditional central limit theorem, Counting process, Equivalence test, Proportional hazards, Kolmogorov–Smirnov test, Survival analysis, Weak convergence

Introduction

One of the most crucial quantities within the analysis of time-to-event data with independently right-censored and left-truncated survival times is the cumulative hazard function, also known as cumulative transition intensity. Most commonly, it is nonparametrically estimated by the well-known Nelson–Aalen estimator (Andersen et al. 1993, Chapter IV). In this context, time-simultaneous confidence bands are perhaps the best interpretative tool to account for related estimation uncertainties.

The construction of confidence bands is typically based on the asymptotic behavior of the underlying stochastic processes, more precisely, the (properly standardized) Nelson–Aalen estimator asymptotically behaves like a Wiener process. Early approaches utilized this property to derive confidence bands for the cumulative hazard function; see e.g., Bie et al. (1987) or Section IV.1.3 in Andersen et al. (1993).

However, Dudek et al. (2008) found that this approach applied to small samples can result in considerable deviations from the nominal level. To improve small sample properties, Efron (1979, 1981) suggested a computationally convenient and flexible resampling technique, called bootstrap, where the unknown non-Gaussian quantile is approximated via repeated generation of point estimates based on random samples of the original data. For a detailed discussion within the standard right-censored survival setup, see also Akritas (1986), Lo and Singh (1986), and Horvath and Yandell (1987). The simulation study of Dudek et al. (2008) particularly reports improvements of bootstrap-based confidence bands for the hazard function as compared to those using asymptotic quantiles. An alternative is the so-called wild bootstrap, first proposed in the context of regression analyses (Wu 1986). As done in Lin et al. (1993), the basic idea is to replace the (standardized) residuals with independent standardized variates (so-called multipliers) while keeping the data fixed. One advantage compared to Efron’s bootstrap is robustness against variance heteroscedasticity (Wu 1986). Using standard normal multipliers, this resampling procedure has been applied to construct time-simultaneous confidence bands for survival curves under the Cox proportional hazards model (Lin et al. 1994) and adapted to cumulative incidence functions in the competing risks setting (Lin 1997). The latter approach has recently been extended to general wild bootstrap multipliers with mean zero and variance one (Beyersmann et al. 2013), who report possibly improved small sample performance. This result was confirmed in Dobler and Pauly (2014) as well as Dobler et al. (2017), where more general resampling schemes are discussed. In all these references, only one multiplier, typically standard normal, was required per individual, because each individual experienced at most one event. If an individual may experience more than one event, multiple multipliers per individual were considered, e.g., by Dabrowska and Ho (2000) in the context of Cox modelling; see also Shu et al. (2007) for Cox models in a semi-Markov illness-death model, and Liu et al. (2008) in a progressive multistate model for current leukemia-free survival.

The present article focuses on the nonparametric estimation of cumulative hazard functions and proposes a general and flexible wild bootstrap resampling technique, which is valid for a large class of time-to-event models. In particular, the procedure is not limited to the standard survival or competing risks framework. The key assumption is that the involved counting processes satisfy the so-called multiplicative intensity model (Andersen et al. 1993). Consequently, arbitrary Markovian multistate models with finite state space are covered, as well as various other intensity models (e.g., excess or relative mortality models, cf. Andersen and Væth 1989) and specific semi-Markov situations (Andersen et al. 1993, Example X.1.7). Independent right-censoring and left-truncation can straightforwardly be incorporated.

The main aim of this article is to mathematically justify the wild bootstrap technique for the multivariate Nelson–Aalen estimator in this general framework, while using not necessarily normally distributed but possibly multiple multipliers per counting process in the resampling step. This is accomplished by a novel martingale-based proof that discloses the close connection between the estimator and its wild bootstrap version. This insight would not have been possible by generalizing the elementary approach for showing bootstrap validity in competing risks given in Beyersmann et al. (2013) and Dobler and Pauly (2014) to the present set-up. Compared to the standard survival or competing risks setting, with at most one transition per individual, the major difficulty is to account for counting processes having an arbitrarily large random number of jumps. We will see that utilizing only one multiplier per individual counting process leads to the wrong covariance structure in general; instead, one multiplier per increment is required. As Beyersmann et al. (2013) suggested in the competing risks setting, we also allow for more general multipliers with expectation 0 and variance 1 and extend the resulting weak convergence theorems to resample the multivariate Nelson–Aalen estimator in our general setting. For practical applications, this result allows, for instance, within- or two-sample comparisons and the formulation of statistical tests.

The wild bootstrap is exemplified to statistically assess the impact of mechanical ventilation in the intensive care unit (ICU) on the length of stay. A related problem is to investigate ventilation-free days, which was established as an efficacy measure in patients subject to acute respiratory failure (Schoenfeld et al. 2002). However, applications of their methodology (see e.g., Sauaia et al. 2009; Stewart et al. 2009) rely on the constant hazards assumption. Other publications like de Wit et al. (2008), Trof et al. (2012), or Curley et al. (2015) used a Kaplan–Meier-type procedure that does not account for the more complex multistate structure. In contrast, we propose an illness-death model with recovery that methodologically works under the more general time-inhomogeneous Markov assumption and captures both the time-dependent structure of mechanical ventilation and the competing endpoint ‘death in ICU’.

The remainder of this article is organized as follows: Sect. 2 introduces cumulative hazard functions and their Nelson–Aalen estimators using counting process formulations. After summarizing its asymptotic properties, Sect. 3 offers our main theorem on conditional weak convergence for the wild bootstrap. This allows for various statistical applications in Sect. 4: Two-sided hypothesis tests and various sorts of time-simultaneous confidence bands are deduced, as well as simultaneous confidence intervals for a finite set of time points. Furthermore, tests for equivalence, inferiority and superiority as well as for proportionality of two hazard functions constitute useful criteria in practical data analyses. A simulation study assessing small and large sample performances of both the derived confidence bands in comparison to the algebraic approach based on the time-transformed Brownian motion and the tests for proportional hazards is reported in Sect. 5. The SIR-3 data on patients in ICU (Beyersmann et al. 2006; Wolkewitz et al. 2008) serves as its template and is practically revisited in Sect. 6. Concluding remarks and a discussion are given in Sect. 7. All proofs are deferred to Appendix A and the non-applicability of the ordinary multiplier resampling is verified in Appendix B.

Nonparametric estimation under the multiplicative intensity structure

Throughout, we adopt the notation of Andersen et al. (1993). For $k \in \mathbb{N}$ , let $N = {(N_{1}, \dots, N_{k})}^{'}$ be a multivariate counting process which is adapted to a filtration ${(F_{t})}_{t \geq 0}$ . Each entry $N_{j}, j = 1, \dots, k,$ is supposed to be a càdlàg function, zero at time zero, and to have piecewise constant paths with jumps of size one. In addition, assume that no two components jump at the same time and that each $N_{j} (t)$ satisfies the multiplicative intensity model of Aalen (1978) with intensity process given by $λ_{j} (t) = α_{j} (t) Y_{j} (t)$ . Here, $Y_{j} (t)$ defines a predictable process not depending on unknown parameters and $α_{j}$ describes a non-negative (hazard) function. For well-definedness, the observation of $N$ is restricted to the interval $[0, τ]$ , where $τ < τ_{j} = \sup \{u \geq 0 : \int_{(0, u]} α_{j} (s) d s < \infty\}$ for all $j = 1, \dots, k$ . The multiplicative intensity structure covers several customary frameworks in the context of time-to-event analysis. The following overview specifies frequently used models.

Example 1

(a) Markovian multistate models with finite state space $S$ are very popular in biostatistics. In this setting, $Y_{ℓ} (t)$ represents the total number of individuals in state $ℓ$ just prior to t (‘number at risk’), whereas $α_{ℓ m} (t)$ is the instantaneous risk (‘transition intensity’) to switch from state $ℓ$ to m, where $ℓ, m \in S$ , $ℓ \neq m$ . Here, $N_{ℓ} = \sum_{i = 1}^{n} N_{ℓ ; i}$ is the aggregation over individual-specific counting processes with $n \in \mathbb{N}$ individuals under study. For specific examples (such as competing risks or the illness-death model) and details including the incorporation of independent left-truncation and right-censoring, see Andersen et al. (1993) and Aalen et al. (2008).
(b) Other examples are the relative or excess mortality model, where not all individuals necessarily share the same hazard rate $α$ . In this case Y cannot be interpreted as the total number of individuals at risk as in part (a); see Example IV.1.11 in Andersen et al. (1993) for details.
(c) The time-inhomogeneous Markov assumption required in part (a) can even be relaxed in specific situations: Following Example X.1.7 in Andersen et al. (1993), consider an illness-death model without recovery. Assuming that the transition intensity $α_{12}$ depends on the duration d in the intermediate state, but not on time t, leads to a semi-Markov process not satisfying the multiplicative intensity structure. This is because the intensity process of $N_{12} (t)$ is given by $α_{12} (t - T) Y_{1} (t)$ , where the first factor of the product is not deterministic anymore. Here, T is the random transition time into state 1. However, when $d = t - T$ is used as the basic timescale, the counting process $K (d) = N_{12} (d + T)$ has intensity $α_{12} (d) Y_{1} (d)$ with respect to the filtration
$\begin{matrix} F_{d} = σ \{(N_{01} (t), N_{02} (t)) : 0 < t < τ\} \lor σ \{K (d) : 0 < d < \infty\} . \end{matrix}$

Thus, the multiplicative intensity structure is fulfilled.

Under the above assumptions, the Doob–Meyer decomposition applied to $N_{j}$ leads to

\begin{matrix} d N_{j} (s) = λ_{j} (s) d s + d M_{j} (s), \end{matrix}

2.1

where the $M_{j}$ are zero-mean martingales with respect to ${(F_{t})}_{t \in [0, τ]}$ . The canonical nonparametric estimator of the cumulative hazard function $A_{j} (t) = \int_{(0, t]} α_{j} (s) d s$ is given by the so-called Nelson–Aalen estimator

\begin{matrix} {\hat{A}}_{jn} (t) = \int_{(0, t]} \frac{J_{j} (s)}{Y_{j} (s)} d N_{j} (s) . \end{matrix}

Here, $J_{j} (t) = \mathbf{1} \{Y_{j} (t) > 0\}$ , $\frac{0}{0} : = 0$ , and $n \in \mathbb{N}$ is a sample size-related number (that goes to infinity in asymptotic considerations). Its multivariate counterpart is introduced by ${\hat{A}}_{n} : = {({\hat{A}}_{1 n}, \dots, {\hat{A}}_{kn})}^{'}$ . As in Andersen et al. (1993), suppose that there exist deterministic functions $y_{j}$ with $\inf_{u \in [0, τ]} y_{j} (u) > 0$ such that

\begin{matrix} \sup_{s \in [0, τ]} \left| \frac{Y_{j} (s)}{n} - y_{j} (s) \right| \overset{P}{\to} 0 \quad \text{for all } j = 1, \dots, k, \end{matrix}

2.2

where ‘ $\overset{P}{\to}$ ’ denotes convergence in probability for $n \to \infty$ . For each j, define the normalized Nelson–Aalen process $W_{jn} : = \sqrt{n} ({\hat{A}}_{jn} - A_{j})$ possessing the asymptotic martingale representation

\begin{matrix} W_{jn} (t) ≑ \sqrt{n} \int_{(0, t]} \frac{J_{j} (s)}{Y_{j} (s)} d M_{j} (s) \end{matrix}

2.3

with $M_{j}$ given by (2.1). Here, ‘ $≑$ ’ means that the difference of both sides converges to zero in probability. Define the vectorial aggregation of all $W_{jn}$ as $W_{n} = {(W_{1 n}, \dots, W_{kn})}^{'}$ and let ‘ $\overset{d}{\to}$ ’ denote convergence in distribution for $n \to \infty$ . Then, Theorem IV.1.2 in Andersen et al. (1993) in combination with (2.2) provides a weak convergence result on the k-dimensional space $D {[0, τ]}^{k}$ of càdlàg functions endowed with the product Skorohod topology.

Theorem 1

If assumption (2.2) holds, we have convergence in distribution

\begin{matrix} W_{n} \overset{d}{⟶} U = {(U_{1}, \dots, U_{k})}^{'}, \end{matrix}

2.4

on $D {[0, τ]}^{k}$ , where $U_{1}, \dots, U_{k}$ are independent zero-mean Gaussian martingales with covariance functions $ψ_{j} (s_{1}, s_{2}) : = \operatorname{Cov} (U_{j} (s_{1}), U_{j} (s_{2})) = \int_{(0, s_{1}]} \frac{α_{j} (s)}{y_{j} (s)} d s$ for $j = 1, \dots, k$ and $0 \leq s_{1} \leq s_{2} \leq τ$ .

The covariance function $ψ_{j}$ is commonly approximated by the Aalen-type

\begin{matrix} {\hat{σ}}_{j}^{2} (s_{1}) = n \int_{(0, s_{1}]} \frac{J_{j} (s)}{Y_{j}^{2} (s)} d N_{j} (s) \end{matrix}

2.5

or the Greenwood-type estimator

\begin{matrix} {\hat{σ}}_{j}^{2} (s_{1}) = n \int_{(0, s_{1}]} \frac{J_{j} (s) (Y_{j} (s) - Δ N_{j} (s))}{Y_{j}^{3} (s)} d N_{j} (s) \end{matrix}

2.6

which are consistent for $ψ_{j} (s_{1}, s_{2})$ under the assumption of Theorem 1; cf. (4.1.6) and (4.1.7) in Andersen et al. (1993). Here, $Δ N_{j} (s)$ denotes the jump size of $N_{j}$ at time s.
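The estimators above can be sketched in a few lines of code. The following is a minimal illustration, not the authors' implementation: it assumes plain right-censored survival data without ties and without left-truncation, so that $Y (s)$ is simply the number of individuals still under observation just before s. The function name `nelson_aalen` and its input format are hypothetical.

```python
def nelson_aalen(times, events):
    """Nelson-Aalen estimate with Aalen- and Greenwood-type variances.

    times  : observed times (event or right-censoring), assumed distinct
    events : 1 if the observed time is an event, 0 if it is censored
    Returns a list of tuples (t, A_hat, var_aalen, var_greenwood),
    one per observed event time, in increasing order of t.
    """
    n = len(times)
    order = sorted(range(n), key=lambda i: times[i])
    cum_haz = var_aalen = var_green = 0.0
    out = []
    for rank, i in enumerate(order):
        at_risk = n - rank  # Y(s): individuals under observation just before s
        if events[i]:       # dN(s) = 1 at an observed event
            cum_haz += 1.0 / at_risk
            var_aalen += n / at_risk ** 2                   # increment of (2.5)
            var_green += n * (at_risk - 1) / at_risk ** 3   # increment of (2.6), jump size 1
            out.append((times[i], cum_haz, var_aalen, var_green))
    return out


if __name__ == "__main__":
    for row in nelson_aalen([2.0, 1.0, 3.0, 4.0], [1, 1, 0, 1]):
        print(row)
```

Both variance estimates are returned on the scale of $\sqrt{n} ({\hat{A}}_{n} - A)$ , matching the factor n in (2.5) and (2.6).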

Inference via Brownian bridges and the wild bootstrap

As discussed in Andersen et al. (1993), the limit process $U$ can analytically be approximated via Brownian bridges. However, improved coverage probabilities in the simulation study in Sect. 5 suggest that the proposed wild bootstrap approach may be preferable. First, we sum up the classic result.

Inference via transformed Brownian bridges

The asymptotic mutual independence stated in Theorem 1 allows us to focus on a single component of $W_{n}$ , say $W_{1 n} = \sqrt{n} ({\hat{A}}_{1 n} - A_{1})$ . For notational convenience, we suppress the subscript 1. Let g be a positive (weight) function on an interval $[t_{1}, t_{2}] \subset [0, τ]$ of interest and $B^{0}$ a standard Brownian bridge process. Then, as $n \to \infty$ , it is established in Section IV.1 in Andersen et al. (1993) that

\begin{matrix} \sup_{s \in [t_{1}, t_{2}]} \left| \frac{\sqrt{n} ({\hat{A}}_{n} (s) - A (s))}{1 + {\hat{σ}}^{2} (s)} g \left( \frac{{\hat{σ}}^{2} (s)}{1 + {\hat{σ}}^{2} (s)} \right) \right| \overset{d}{⟶} \sup_{s \in [ϕ (t_{1}), ϕ (t_{2})]} | g (s) B^{0} (s) | . \end{matrix}

3.1

Here $ϕ (t) = \frac{σ^{2} (t)}{1 + σ^{2} (t)}$ , $σ^{2} (t) = ψ (t, t)$ and ${\hat{σ}}^{2} (t)$ is a consistent estimator for $σ^{2} (t)$ , such as (2.5) or (2.6). Quantiles of the right-hand side of (3.1) for $g \equiv 1$ are recorded in tables (e.g., Koziol and Byar 1975; Hall and Wellner 1980; Schumacher 1984). For general g, they can be approximated via standard statistical software.

Even though relation (3.1) enables statistical inference based on the asymptotics of a central limit theorem, appropriate resampling procedures usually showed improved properties; see e.g., Hall and Wilson (1991), Good (2005) and Pauly et al. (2015).

Wild bootstrap resampling

In contrast to, for instance, a competing risks model where each counting process $N_{j}$ is at most n, the number $N_{j} (τ)$ is not necessarily bounded in our setup, which only assumes Aalen’s multiplicative intensity model. Hence, a modification of the multiplier resampling scheme under competing risks suggested by Lin (1997) and elaborated by Beyersmann et al. (2013) is required. For this purpose, introduce counting process-specific stochastic processes indexed by $s \in [0, τ]$ that are independent of $N_{j}, Y_{j}$ for all $j = 1, \dots, k$ . Let ${(G_{j} (s))}_{s \in [0, τ]}, 1 \leq j \leq k,$ be independently and identically distributed (i.i.d.) white noise processes such that each $G_{j} (s)$ satisfies $E (G_{j} (s)) = 0$ and $\operatorname{var} (G_{j} (s)) = 1$ , $j = 1, \dots, k$ , $s \in [0, τ]$ . That is, all $ℓ$ -dimensional marginals of $G_{1}$ , $ℓ \in \mathbb{N}$ , shall be the same $ℓ$ -fold product measure. Then, a wild bootstrap version of the normalized multivariate Nelson–Aalen estimator $W_{n}$ is defined as

\begin{matrix} {\hat{W}}_{n} (t) = & {({\hat{W}}_{1 n} (t), \dots, {\hat{W}}_{kn} (t))}^{'} \\ : = & \sqrt{n} (\int_{(0, t]} \frac{J_{1} (s)}{Y_{1} (s)} G_{1} (s) d N_{1} (s), \dots, \int_{(0, t]} \frac{J_{k} (s)}{Y_{k} (s)} G_{k} (s) d N_{k} (s))^{'} . \end{matrix}

3.2

In words, ${\hat{W}}_{n}$ is obtained from representation (2.3) of $W_{n}$ by substituting the unknown individual martingale processes $M_{j}$ with the observable quantities $G_{j} N_{j}$ . Even though only the values of each $G_{j}$ at the jump times of $N_{j}$ are relevant, this construction in terms of white noise processes enables a consideration of the wild bootstrap process on a product probability space; see the Appendix for details.

Consider for a moment the special case of a multistate model with n i.i.d. individuals (Example 1(a)). For instance, the competing risks model in Lin (1997) involves at most one transition (and thus one multiplier) per individual, while Glidden (2002) introduces only one multiplier per individual for estimating state occupation probabilities in general non-Markov multistate models. In contrast, our resampling approach is new in the sense that it involves independent weightings of all jumps even within the same individual. The consequence is that, instead of considering one multiplier per individual, we need to utilize white noise processes as done in (3.2) in order to account for a random number of events per individual.

The limit distribution of ${\hat{W}}_{n}$ may be approximated by simulating a large number of replicates of the G’s, while the data is kept fixed. For a competing risks setting with standard normally distributed multipliers, our general scheme reduces to the one discussed in Lin (1997).
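The resampling step for a single component can be sketched as follows. This is only an illustration under stated assumptions: the data enter through the at-risk counts $Y (s)$ at the ordered jump times of N, one fresh multiplier is drawn per jump, and standard normal multipliers are used by default. The function name `wild_bootstrap_paths` and the optional `draw` argument (any generator of mean-zero, variance-one variates) are hypothetical.

```python
import math
import random


def wild_bootstrap_paths(at_risk, n, n_boot=1000, seed=0, draw=None):
    """Simulate replicates of the wild bootstrap process (3.2), one component.

    at_risk : values Y(s) > 0 at the ordered jump times of N (data, kept fixed)
    n       : sample size used in the sqrt(n) normalization
    draw    : callable returning one multiplier with mean 0 and variance 1;
              defaults to standard normal variates
    Returns a list of n_boot paths; each path holds the value of the
    bootstrap process at the successive jump times.
    """
    rng = random.Random(seed)
    if draw is None:
        draw = lambda: rng.gauss(0.0, 1.0)
    root_n = math.sqrt(n)
    paths = []
    for _ in range(n_boot):
        value, path = 0.0, []
        for y in at_risk:
            value += root_n * draw() / y  # one new multiplier for every jump
            path.append(value)
        paths.append(path)
    return paths


if __name__ == "__main__":
    sims = wild_bootstrap_paths([4, 3, 1], n=4, n_boot=5, seed=42)
    print(sims[0])
```

Only the multipliers change between replicates; the at-risk counts and jump times stay fixed, in line with the description above.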

For the remainder of the paper, we summarize the available data in the $σ$ -algebra $C_{0} = σ \{N_{j} (u), Y_{j} (u) : j = 1, \dots, k, u \in [0, τ]\}$ . A natural way to introduce a filtration based on $C_{0}$ that progressively collects information on the white noise processes is by setting

\begin{matrix} C_{t} = C_{0} \lor σ \{G_{j} (s) : j = 1, \dots, k, s \in [0, t]\} . \end{matrix}

The following lemma is a key argument in an innovative, martingale-based consistency proof of the proposed wild bootstrap technique.

Lemma 1

For each $n \in \mathbb{N}$ , the wild bootstrap version of the multivariate Nelson–Aalen estimator ${({\hat{W}}_{n} (t))}_{t \in [0, τ]}$ is a square-integrable martingale with respect to the filtration ${(C_{t})}_{t \in [0, τ]}$ with orthogonal components. Its predictable variation process is given by

\begin{matrix} ⟨ {\hat{W}}_{n} ⟩ : t ⟼ n (\int_{0}^{t} \frac{J_{1} (s)}{Y_{1}^{2} (s)} d N_{1} (s), \dots, \int_{0}^{t} \frac{J_{k} (s)}{Y_{k}^{2} (s)} d N_{k} (s)) \end{matrix}

and its optional variation process by

\begin{matrix} [{\hat{W}}_{n}] : t ⟼ n (\int_{0}^{t} \frac{J_{1} (s)}{Y_{1}^{2} (s)} G_{1}^{2} (s) d N_{1} (s), \dots, \int_{0}^{t} \frac{J_{k} (s)}{Y_{k}^{2} (s)} G_{k}^{2} (s) d N_{k} (s)) . \end{matrix}
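The form of the predictable variation can be made plausible by a direct moment computation. The following is a sketch and not part of the original proof; it only uses that, given $C_{0}$ , the processes $N_{j}$ and $Y_{j}$ are fixed and that the multipliers $G_{j} (s)$ at distinct jump times are independent with mean 0 and variance 1, so that all cross terms vanish:

```latex
\begin{align*}
E\bigl(\hat{W}_{jn}(t) \mid C_0\bigr)
  &= \sqrt{n} \int_{(0,t]} \frac{J_j(s)}{Y_j(s)} \, E\bigl(G_j(s)\bigr) \, dN_j(s) = 0, \\
\operatorname{var}\bigl(\hat{W}_{jn}(t) \mid C_0\bigr)
  &= n \int_{(0,t]} \frac{J_j(s)}{Y_j^2(s)} \, \operatorname{var}\bigl(G_j(s)\bigr) \, dN_j(s)
   = n \int_{(0,t]} \frac{J_j(s)}{Y_j^2(s)} \, dN_j(s).
\end{align*}
```

The last expression coincides with the Aalen-type variance estimator (2.5) evaluated at t, so the conditional variance of the wild bootstrap process is exactly the usual variance estimate of the Nelson–Aalen process.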

The following conditional weak convergence result justifies the approximation of the limit distribution of $W_{n}$ via ${\hat{W}}_{n}$ given $C_{0}$ . Both the general framework, which requires only Aalen’s multiplicative intensity structure, and the use of possibly non-normal multipliers are original to the present paper.

Theorem 2

Let $U$ be as in Theorem 1. Assuming (2.2), we have the following conditional convergence in distribution on $D {[0, τ]}^{k}$ given $C_{0}$ as $n \to \infty$ :

\begin{matrix} {\hat{W}}_{n} \overset{d}{⟶} U \quad \text{in probability.} \end{matrix}

Remark 1

Reconsider the ordinary multiplier resampling based on a sequence of (time-constant) i.i.d. random variables $D_{1}, \dots, D_{n}$ with $E (D_{1}) = 0$ and unit variance where, in the resampling step, the martingales $M_{j}$ are replaced with $D_{j} N_{j}$ . In contrast to the wild bootstrap based on white noise processes, the wild bootstrap using the time-constant sequence $D_{1}, \dots, D_{n}$ fails to reproduce the correct covariance structure of the Nelson–Aalen process. Even in the special univariate Markovian case, the limit process does not have independent increments and it hence necessarily differs from the asymptotics described in Theorem 2; see Appendix B for details.
It is due to the martingale property of the wild bootstrapped multivariate Nelson–Aalen estimator that we anticipate a good finite sample approximation of the unknown distribution of the Nelson–Aalen estimator. In particular, the wild bootstrap, realized by white noise processes as above, succeeds in imitating the martingale structure of the original Nelson–Aalen estimator. The predictable variation process of the wild bootstrap process equals the optional variation process of the centered Nelson–Aalen process. Hence, both processes share the same properties and approximately the same covariance structure.
Suppose that $E (n^{k} J_{1} (u) / Y_{1}^{k} (u)) = O (1)$ for some $k \in N$ and all $u \in [0, τ]$ , which for example holds for any $k \in N$ if $Y_{1}$ has a number at risk interpretation. Since different increments of $W_{n}$ (to arbitrary powers) are uncorrelated, it can be shown that the convergence in Theorem 1 for single $t \in [0, τ]$ even holds in the Mallows metric $d_{p}$ for any $0 < p \leq k$ ; see, for instance, Bickel and Freedman (1981) for such theorems related to the classical bootstrap. Provided that the rth moment of $G_{1} (u)$ exists, similar arguments show that the convergence in probability in Theorem 2 for single $t \in [0, τ]$ holds in the Mallows metric $d_{p}$ for any even $0 < p \leq r$ as well. This of course includes white noise processes with centered Poi(1) or standard normal marginals, as applied later on.

Statistical applications

Throughout this section denote by $α \in (0, 1)$ the nominal level of all inference procedures.

Confidence bands

After having established all required weak convergence results, we discuss different possibilities for realizing confidence bands for $A_{j}$ around the Nelson–Aalen estimator ${\hat{A}}_{jn}$ , $j = 1, \dots, k,$ on an interval $[t_{1}, t_{2}] \subset [0, τ]$ of interest. Later on, we propose a confidence band for differences of cumulative hazard functions. As in Sect. 3.1, we first focus on $A_{1}$ and suppress the index 1 for notational convenience. Following Andersen et al. (1993), Section IV.1, we consider weight functions

\begin{matrix} g_{1} (s) = {(s (1 - s))}^{- 1 / 2} \quad \text{or} \quad g_{2} \equiv 1 \end{matrix}

as choices for g in relation (3.1). The resulting confidence bands are commonly known as equal precision and Hall–Wellner bands, respectively. We apply a log-transformation in order to improve small sample level $α$ control. Combining the previous sections’ convergences with the functional delta-method and Slutsky’s lemma yields

Theorem 3

Under condition (2.2), for any $0 \leq t_{1} \leq t_{2} \leq τ$ such that $A (t_{1}) > 0$ , we have the following convergences in distribution on the càdlàg space $D [t_{1}, t_{2}]$ :

\begin{matrix} (\sqrt{n} {\hat{A}}_{n} \frac{\log {\hat{A}}_{n} - \log A}{1 + {\hat{σ}}^{2}}) \cdot g \circ \frac{{\hat{σ}}^{2}}{1 + {\hat{σ}}^{2}} \overset{d}{⟶} (g B^{0}) \circ ϕ \quad \text{and} \end{matrix}

4.1

\begin{matrix} (\frac{{\hat{W}}_{n}}{1 + σ^{* 2}}) \cdot g \circ \frac{σ^{* 2}}{1 + σ^{* 2}} \overset{d}{⟶} (g B^{0}) \circ ϕ \end{matrix}

4.2

conditionally given $C_{0}$ in probability, with $ϕ$ as in Sect. 3 and the wild bootstrap variance estimator $σ^{* 2} (t) : = n \int_{(0, t]} J (s) Y^{- 2} (s) G^{2} (s) d N (s)$ .

In particular, $σ^{* 2}$ is a uniformly consistent estimate for $σ^{2}$ (Dobler and Pauly 2014) and, being the optional variation process of the wild bootstrap Nelson–Aalen process, it may be one natural choice for variance estimation. For practical purposes, we adapt the approach of Beyersmann et al. (2013) and estimate $σ^{2}$ based on the empirical variance of the wild bootstrap quantities ${\hat{W}}_{n}$ . The continuity of the supremum functional translates (4.1) and (4.2) into weak convergences for the corresponding suprema. Hence, the consistency of the following critical values is ensured:

\begin{matrix} c_{1 - α}^{g} = & (1 - α) \text{ quantile of } L \left( \sup_{s \in [t_{1}, t_{2}]} | g (\hat{ϕ} (s)) B^{0} (\hat{ϕ} (s)) | \right), \\ {\tilde{c}}_{1 - α}^{g} = & (1 - α) \text{ quantile of } L \left( \sup_{s \in [t_{1}, t_{2}]} \left| \frac{{\hat{W}}_{n} (s)}{1 + σ^{* 2} (s)} g \left( \frac{σ^{* 2} (s)}{1 + σ^{* 2} (s)} \right) \right| \, \Big| \, C_{0} \right), \end{matrix}

where $L (\cdot)$ denotes the law of a random variable. Here, g equals either $g_{1}$ or $g_{2}$ and $\hat{ϕ} = \frac{{\hat{σ}}^{2}}{1 + {\hat{σ}}^{2}}$ . Note that ${\tilde{c}}_{1 - α}^{g}$ is, in fact, a random variable. The results are back-transformed into four confidence bands for A, abbreviated with HW and EP for the Hall–Wellner and equal precision bands and with a and w for bands based on quantiles of the asymptotic distribution and the wild bootstrap, respectively. In our simulation studies these bands are also compared with the linear confidence band $C B_{dir}^{w}$ , which is based on the critical value

\begin{matrix} {\tilde{c}}_{1 - α} = & (1 - α) \text{ quantile of } L \left( \sup_{s \in [t_{1}, t_{2}]} | {\hat{W}}_{n} (s) | \, \Big| \, C_{0} \right) . \end{matrix}

Corollary 1

Under the assumptions of Theorem 3, the following bands for the cumulative hazard function ${(A (s))}_{s \in [t_{1}, t_{2}]}$ provide an asymptotic coverage probability of $1 - α$ :

\begin{matrix} {CB}_{EP}^{a} = & {[{\hat{A}}_{n} (s) exp (\mp \frac{c_{1 - α}^{g_{1}}}{\sqrt{n} {\hat{A}}_{n} (s)} {\hat{σ}}_{n} (s))]}_{s \in [t_{1}, t_{2}]} \\ {CB}_{HW}^{a} = & {[{\hat{A}}_{n} (s) exp (\mp \frac{c_{1 - α}^{g_{2}}}{\sqrt{n} {\hat{A}}_{n} (s)} (1 + {\hat{σ}}_{n}^{2} (s)))]}_{s \in [t_{1}, t_{2}]} \\ {CB}_{EP}^{w} = & {[{\hat{A}}_{n} (s) exp (\mp \frac{{\tilde{c}}_{1 - α}^{g_{1}}}{\sqrt{n} {\hat{A}}_{n} (s)} {\hat{σ}}_{n} (s))]}_{s \in [t_{1}, t_{2}]} \\ {CB}_{HW}^{w} = & {[{\hat{A}}_{n} (s) exp (\mp \frac{{\tilde{c}}_{1 - α}^{g_{2}}}{\sqrt{n} {\hat{A}}_{n} (s)} (1 + {\hat{σ}}_{n}^{2} (s)))]}_{s \in [t_{1}, t_{2}]} \\ {CB}_{dir}^{w} = & {[{\hat{A}}_{n} (s) \mp \frac{{\tilde{c}}_{1 - α}}{\sqrt{n}}]}_{s \in [t_{1}, t_{2}]} . \end{matrix}

4.3
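As a rough illustration of how the direct band ${CB}_{dir}^{w}$ could be computed in practice, the sketch below takes the Nelson–Aalen values at the jump times inside $[t_{1}, t_{2}]$ together with simulated wild bootstrap paths evaluated at the same time points, and approximates ${\tilde{c}}_{1 - α}$ by an empirical quantile of the path-wise suprema. The helper names `empirical_quantile` and `direct_band` are hypothetical, and the choice of the quantile convention (a simple order statistic) is an assumption.

```python
import math


def empirical_quantile(values, prob):
    """Smallest order statistic x such that at least a fraction prob of values are <= x."""
    ordered = sorted(values)
    k = max(0, math.ceil(prob * len(ordered)) - 1)
    return ordered[k]


def direct_band(A_hat, boot_paths, n, alpha=0.05):
    """Direct wild bootstrap band A_hat(s) -/+ c / sqrt(n).

    A_hat      : Nelson-Aalen values at the time points of interest
    boot_paths : simulated wild bootstrap paths on the same time points
    Returns (band, c), where band is a list of (lower, upper) pairs and
    c is the simulated critical value.
    """
    # Supremum of |W_hat| along each replicate; the paths are step functions,
    # so the maximum over the recorded time points is the supremum.
    sups = [max(abs(w) for w in path) for path in boot_paths]
    c = empirical_quantile(sups, 1.0 - alpha)
    half_width = c / math.sqrt(n)
    return [(a - half_width, a + half_width) for a in A_hat], c


if __name__ == "__main__":
    toy_paths = [[0.1, -0.4], [0.2, 0.3], [-1.0, 0.5], [0.0, 0.2]]
    band, c = direct_band([0.25, 0.5], toy_paths, n=4, alpha=0.25)
    print(c, band)
```

The lower limits are not truncated at zero here, which makes the drawback of this band (possible negative values) directly visible.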

Remark 2

(a) Note that the wild bootstrap quantile ${\tilde{c}}_{1 - α}$ does not require an estimate of $ϕ$ , thereby eliminating one possible cause of inaccuracy within the derivation of the other bands. However, the corresponding band ${CB}_{dir}^{w}$ has the disadvantage of possibly including negative values.
(b) The confidence bands are only well-defined if the left endpoint $t_{1}$ of the bands’ time interval is larger than the first observed event time. In particular, these bands yield unstable results for small values of ${\hat{A}}_{n} (t_{1})$ due to the division in the exponential function; see Lin et al. (1994) for a similar observation.
(c) The present approach directly allows the construction of confidence bands for within-sample comparisons of multiple $A_{1}, \dots, A_{k}$ . For instance, a confidence band for the difference $A_{1} - A_{2}$ may be obtained via quantiles based on the conditional convergence in distribution ${\hat{W}}_{1 n} - {\hat{W}}_{2 n} \overset{d}{⟶} U_{1} - U_{2} \sim \operatorname{Gauss} (0, ψ_{1} + ψ_{2})$ in probability by simply applying the continuous mapping theorem and taking advantage of the independence of $U_{1}$ and $U_{2}$ ; see Whitt (1980) for the continuity of the difference functional. For that purpose, the distribution of
$\begin{matrix} D (t) = \sqrt{n} g (t) ({\hat{A}}_{1 n} (t) - A_{1} (t) - ({\hat{A}}_{2 n} (t) - A_{2} (t))), \end{matrix}$ (4.4)
with positive weight function g can be approximated by the conditional distribution of $\hat{D} (t) = g (t) ({\hat{W}}_{1 n} (t) - {\hat{W}}_{2 n} (t))$ . With $g \equiv 1$ , an approximate $(1 - α) \cdot 100 \%$ confidence band for the difference $A_{1} - A_{2}$ of two cumulative hazard functions on $[t_{1}, t_{2}]$ is
$\begin{matrix} {[({\hat{A}}_{1 n} (s) - {\hat{A}}_{2 n} (s)) \pm {\tilde{q}}_{1 - α} / \sqrt{n}]}_{s \in [t_{1}, t_{2}]}, \end{matrix}$ (4.5)
where
$\begin{matrix} {\tilde{q}}_{1 - α} = (1 - α) \text{ quantile of } L \left( \sup_{s \in [t_{1}, t_{2}]} | {\hat{W}}_{1 n} (s) - {\hat{W}}_{2 n} (s) | \, \Big| \, C_{0} \right) . \end{matrix}$
Similar arguments additionally enable common two-sample comparisons. A practical data analysis using other weight functions g in the context of cumulative incidence functions is given in Hieke et al. (2013).

Remark 3

(Construction of confidence intervals)

In particular, Theorem 3 yields a convergence result on $R^{m}$ for a finite set of time points $\{s_{1}, \dots, s_{m}\} \subset [0, τ], m \in \mathbb{N}$ . Hence, using critical values ${\tilde{c}}_{1 - α}$ and ${\tilde{c}}_{1 - α}^{g}$ obtained from the law of the maximum $\max_{s_{1}, \dots, s_{m}}$ instead of the supremum, a variant of Corollary 1 specifies simultaneous confidence intervals $I_{1} \times \dots \times I_{m}$ for $(A (s_{1}), \dots, A (s_{m}))$ with asymptotic coverage probability $1 - α$ . Since the error multiplicity is taken into account, the asymptotic coverage probability of a single such interval $I_{j}$ for $A (s_{j})$ is greater than $1 - α$ .
Due to the asymptotic independence of the entries of the multivariate Nelson–Aalen estimator, a confidence region for the value of a multivariate cumulative hazard function $(A_{1} (t), \dots, A_{k} (t))$ at time $t \in [0, τ]$ may be found using Šidák’s correction: Letting $J_{1}, \dots, J_{k}$ be pointwise confidence intervals for $A_{1} (t), \dots, A_{k} (t)$ with asymptotic coverage probability ${(1 - α)}^{1 / k}$ , each found using the wild bootstrap principle, the coverage probability of $J_{1} \times \dots \times J_{k}$ for $(A_{1} (t), \dots, A_{k} (t))$ clearly goes to $1 - α$ as $n \to \infty$ .
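The per-interval coverage required by Šidák’s correction is easy to compute. The short sketch below (the function name `sidak_coverage` is hypothetical) returns the level ${(1 - α)}^{1 / k}$ that each of the k pointwise intervals needs so that, under independence, the product of the coverages equals $1 - α$ .

```python
def sidak_coverage(alpha, k):
    """Coverage each of k independent intervals needs for joint coverage 1 - alpha."""
    if not 0.0 < alpha < 1.0 or k < 1:
        raise ValueError("need 0 < alpha < 1 and k >= 1")
    return (1.0 - alpha) ** (1.0 / k)


if __name__ == "__main__":
    per_interval = sidak_coverage(0.05, 3)
    # Multiplying the k identical coverages recovers the joint level 1 - alpha.
    print(per_interval, per_interval ** 3)
```

For example, with $α = 0.05$ and $k = 3$ each pointwise interval would be built at a level of roughly 98.3%.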

Hypothesis tests for equivalence, inferiority, superiority, and equality

Adapting the principle of confidence interval inclusion as discussed in Wellek (2010), Section 3.1, to time-simultaneous confidence bands, hypothesis tests for equivalence of cumulative hazard functions become readily available. To this end, let $ℓ, u : [t_{1}, t_{2}] \to (0, \infty)$ be positive, continuous functions and denote by ${(a_{n} (s), \infty)}_{s \in [t_{1}, t_{2}]}$ and ${[0, b_{n} (s))}_{s \in [t_{1}, t_{2}]}$ the one-sided (half-open) analogues of any confidence band of the previous subsection with asymptotic coverage probability $1 - α$ . Furthermore, let $A_{0} : [t_{1}, t_{2}] \to [0, \infty)$ be a pre-specified non-decreasing, continuous function for which equivalence to A shall be tested. More precisely:

\begin{matrix} H : \{A (s) \leq A_{0} (s) - ℓ (s) \text{ or } A (s) \geq A_{0} (s) + u (s) \text{ for some } s \in [t_{1}, t_{2}]\} \\ \text{vs. } K : \{A_{0} (s) - ℓ (s) < A (s) < A_{0} (s) + u (s) \text{ for all } s \in [t_{1}, t_{2}]\} . \end{matrix}

Corollary 2

Under the assumptions of Theorem 3, a hypothesis test $ψ_{n}$ of asymptotic level $α$ for H vs K is given by the following decision rule: Reject H if and only if the combined two-sided confidence band ${(a_{n} (s), b_{n} (s))}_{s \in [t_{1}, t_{2}]}$ is fully contained in the region spanned by ${(A_{0} (s) - ℓ (s), A_{0} (s) + u (s))}_{s \in [t_{1}, t_{2}]}$ . Further, it holds under K that $E (ψ_{n}) \to 1$ as $n \to \infty$ , i.e., $ψ_{n}$ is consistent.
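The decision rule of Corollary 2 can be sketched in a few lines of Python. This is only an illustration under the assumption that the band limits and the functions $A_{0}$, $ℓ$, and $u$ have already been evaluated on a common grid that is fine enough to represent $[t_{1}, t_{2}]$; the function name `reject_h` and all numbers are hypothetical.

```python
from typing import Sequence


def reject_h(
    a_n: Sequence[float],
    b_n: Sequence[float],
    a0: Sequence[float],
    ell: Sequence[float],
    u: Sequence[float],
) -> bool:
    """Return True iff the band (a_n, b_n) lies strictly inside
    (A0 - ell, A0 + u) at every grid point, i.e. iff H is rejected."""
    lengths = {len(a_n), len(b_n), len(a0), len(ell), len(u)}
    if len(lengths) != 1 or not a_n:
        raise ValueError("all inputs must share one non-empty grid")
    return all(
        a0_s - ell_s < lo and hi < a0_s + u_s
        for lo, hi, a0_s, ell_s, u_s in zip(a_n, b_n, a0, ell, u)
    )


if __name__ == "__main__":
    grid = [5.0, 10.0, 15.0, 20.0]   # hypothetical evaluation days
    a0 = [0.10 * s for s in grid]    # hypothetical reference function A_0
    margin = [0.5] * len(grid)       # hypothetical margins ell = u
    lower = [v - 0.2 for v in a0]    # made-up lower band limit a_n
    upper = [v + 0.3 for v in a0]    # made-up upper band limit b_n
    print(reject_h(lower, upper, a0, margin, margin))
```

Strict inequalities are used to match the open equivalence region in the definition of K.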

Similar arguments lead to analogous one-sided tests for the inferiority or superiority of the true cumulative hazard function to a prespecified function $A_{0}$. Moreover, statistical tests for equality of two cumulative hazard functions can be constructed using the weak convergence results of Remark 2(c):

\begin{matrix} H_{=} : \{A_{1} \equiv A_{2} \text{ on } [t_{1}, t_{2}]\} \quad \text{vs.} \quad K_{\neq} : \{A_{1} (s) \neq A_{2} (s) \text{ for some } s \in [t_{1}, t_{2}]\} . \end{matrix}

Corollary 3 below yields an asymptotic level $α$ test for $H_{=}$ . Bajorunaite and Klein (2007) and Dobler and Pauly (2014) used similar two-sided tests for comparing cumulative incidence functions in a two-sample problem.

Corollary 3

(A Kolmogorov–Smirnov-type test) Under the assumptions of Theorem 3 and letting g again be a positive weight function,

\begin{matrix} φ_{n}^{KS} = \mathbf{1} \{\sup_{s \in [t_{1}, t_{2}]} \sqrt{n} g (s) | {\hat{A}}_{1 n} (s) - {\hat{A}}_{2 n} (s) | > {\tilde{q}}_{1 - α}\} \end{matrix}

defines a consistent, asymptotic level $α$ resampling test for $H_{=}$ vs. $K_{\neq}$. Here ${\tilde{q}}_{1 - α}$ is the $(1 - α)$-quantile of $\mathcal{L} (\sup_{s \in [t_{1}, t_{2}]} | \hat{D} (s) | \mid C_{0})$.

Similarly, Theorem 3 enables the construction of other tests, e.g., of Cramér-von Mises type. Furthermore, by taking the supremum over a discrete set $\{s_{1}, \dots, s_{m}\} \subset [0, τ]$, the Kolmogorov–Smirnov test of Corollary 3 can also be used to test

\begin{matrix} {\tilde{H}}_{=} : \{A_{1} (s_{j}) = A_{2} (s_{j}) \text{ for all } 1 \leq j \leq m\} \\ \text{vs. } {\tilde{K}}_{\neq} : \{A_{1} (s_{j}) \neq A_{2} (s_{j}) \text{ for some } 1 \leq j \leq m\} . \end{matrix}

Note that in a similar way, two-sample extensions of Corollaries 2 and 3 can be established following Dobler and Pauly (2014).

Tests for proportionality

A major assumption of the widely used Cox (1972) regression model is the assumption of proportional hazards over time. Several authors have developed procedures for testing the null hypothesis of proportionality, see e.g., Gill and Schumacher (1987), Lin (1991), Grambsch and Therneau (1994), Hess (1995), Scheike and Martinussen (2004), Kraus (2007), Bagdonavičius et al. (2010), or Chen et al. (2015), and the references cited therein. However, most of these approaches have mainly been investigated for the standard survival framework under (independent) right-censoring and left-truncation mechanisms, even though they may be generalized to more general settings. In the context of the proposed wild bootstrap resampling technique, this motivates the explicit formulation of a two-sample proportionality test for the general setting only assuming a multiplicative intensity structure. This covers, for instance, arbitrary Markovian multistate models.

The framework is an unpaired two-sample model given by independent counting processes $N^{(1)}, N^{(2)}$ and predictable processes $Y^{(1)}, Y^{(2)}$, assuming the conditions of Sect. 2 for each group, and with sample sizes $n_{1}$ and $n_{2}$, respectively. Let again $J^{(j)} (t) = \mathbf{1} \{Y^{(j)} (t) > 0\}$, $j = 1, 2$. Denote by ${\hat{A}}_{n_{j}}^{(j)} (t) = \int_{(0, t]} \frac{J^{(j)} (s)}{Y^{(j)} (s)} d N^{(j)} (s)$ the Nelson–Aalen estimator of the cumulative hazard function $A^{(j)}$ and by $α^{(j)}$ the corresponding rate, $j = 1, 2$. To motivate a suitable test statistic, we make use of the following equivalence between proportionality of the hazards and proportionality of the cumulative hazards:

\begin{matrix} α^{(1)} (t) = c α^{(2)} (t) \text{ for all } t \in [0, τ] \text{ and some } c > 0 \\ ⟺ A^{(1)} (t) = c A^{(2)} (t) \text{ for all } t \in [0, τ] \text{ and some } c > 0, \end{matrix}

which, as the null hypothesis of interest, is denoted by $H_{0, prop}$ . In a natural way, similar to Gill and Schumacher (1987), this leads to statistics of the form

\begin{matrix} T_{n_{1}, n_{2}} = ρ (\sqrt{\frac{n_{1} n_{2}}{n}} \frac{{\hat{A}}_{n_{2}}^{(2)}}{{\hat{A}}_{n_{1}}^{(1)}}, \sqrt{\frac{n_{1} n_{2}}{n}} \frac{{\hat{A}}_{n_{2}}^{(2)} (τ)}{{\hat{A}}_{n_{1}}^{(1)} (τ)}), \end{matrix}

$n = n_{1} + n_{2}$, where $ρ$ is an adequate distance on $D [0, τ]$, e.g., $ρ (f, g) = \sup w | f - g |$ (leading to Kolmogorov–Smirnov-type tests) or $ρ (f, g) = \int {(f - g)}^{2} w^{2} d λ$ (leading to Cramér-von-Mises-type tests), where $w : [0, τ] \to [0, \infty)$ is a suitable weight function. Later on, we choose $w = {\hat{A}}_{n_{1}}^{(1)}$, which ensures the evaluation of $ρ$ on $\{{\hat{A}}_{n_{1}}^{(1)} > 0\}$. Let ${\hat{W}}_{n_{1}}^{(1)}$ and ${\hat{W}}_{n_{2}}^{(2)}$ be the obvious wild bootstrap versions of the sample-specific centered Nelson–Aalen estimators; cf. (3.2).
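The role of the terminal ratio in $T_{n_{1}, n_{2}}$ can be made explicit by a short calculation, added here for convenience. It assumes that $A^{(j)} (t) = \int_{0}^{t} α^{(j)} (s) d s$, $j = 1, 2$, and that $A^{(1)} (τ) > 0$. Under $H_{0, prop}$ with proportionality constant $c$,

\begin{matrix} \frac{A^{(2)} (t)}{A^{(1)} (t)} = \frac{\int_{0}^{t} α^{(2)} (s) d s}{c \int_{0}^{t} α^{(2)} (s) d s} = \frac{1}{c} = \frac{A^{(2)} (τ)}{A^{(1)} (τ)} \quad \text{for all } t \in [0, τ] \text{ with } A^{(1)} (t) > 0 . \end{matrix}

Hence both arguments of $ρ$ estimate the same constant under the null hypothesis, whereas a ratio of cumulative hazards that varies in $t$ yields a non-zero distance between the two limits.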

Theorem 4

Let $ρ$ be either the above Kolmogorov–Smirnov- or the Cramér-von Mises-type statistic with $w = {\hat{A}}_{n_{1}}^{(1)}$. If $n_{1} / n \to p \in (0, 1)$ as $\min (n_{1}, n_{2}) \to \infty$, then the test for $H_{0, prop}$

\begin{matrix} φ_{n_{1}, n_{2}}^{prop} = \mathbf{1} \{T_{n_{1}, n_{2}} > {\tilde{q}}_{1 - α}\} \end{matrix}

has asymptotic level $α$ under $H_{0, prop}$ and asymptotic power 1 on the whole complement of $H_{0, prop}$. Here ${\tilde{q}}_{1 - α}$ is the $(1 - α)$-quantile of

\begin{matrix} \mathcal{L} \Big( ρ \Big( \sqrt{\frac{n_{1}}{n}} \frac{{\hat{W}}_{n_{2}}^{(2)}}{{\hat{A}}_{n_{1}}^{(1)}} - \sqrt{\frac{n_{2}}{n}} {\hat{W}}_{n_{1}}^{(1)} \frac{{\hat{A}}_{n_{2}}^{(2)}}{{[{\hat{A}}_{n_{1}}^{(1)}]}^{2}}, \\ \sqrt{\frac{n_{1}}{n}} \frac{{\hat{W}}_{n_{2}}^{(2)} (τ)}{{\hat{A}}_{n_{1}}^{(1)} (τ)} - \sqrt{\frac{n_{2}}{n}} {\hat{W}}_{n_{1}}^{(1)} (τ) \frac{{\hat{A}}_{n_{2}}^{(2)} (τ)}{{[{\hat{A}}_{n_{1}}^{(1)} (τ)]}^{2}} \Big) \Big| C_{0} \Big) . \end{matrix}
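To illustrate how the test statistic $T_{n_{1}, n_{2}}$ with weight $w = {\hat{A}}_{n_{1}}^{(1)}$ might be evaluated, consider the following Python sketch. It assumes that both Nelson–Aalen estimators have been evaluated on a common increasing grid which contains all their jump times and whose last point is $τ$; the function name and the toy numbers are hypothetical, and the Cramér-von Mises integral is computed as a sum over the resulting step function.

```python
import math
from typing import Sequence, Tuple


def proportionality_statistics(
    times: Sequence[float],
    a1: Sequence[float],
    a2: Sequence[float],
    n1: int,
    n2: int,
) -> Tuple[float, float]:
    """Return the (KS-type, CvM-type) statistics for H_{0,prop}."""
    if not (len(times) == len(a1) == len(a2)) or len(times) < 2:
        raise ValueError("need one common grid with at least two points")
    if a1[-1] <= 0.0:
        raise ValueError("A1-hat(tau) must be positive")
    scale = math.sqrt(n1 * n2 / (n1 + n2))
    ratio_tau = a2[-1] / a1[-1]
    # With w = A1-hat, w * (f - g) simplifies to
    # scale * (A2-hat - A1-hat * ratio_tau); the weight is 0 where A1-hat = 0.
    weighted = [
        scale * (y - x * ratio_tau) if x > 0.0 else 0.0
        for x, y in zip(a1, a2)
    ]
    ks = max(abs(d) for d in weighted)
    # The weighted difference is constant on [times[i], times[i + 1]).
    cvm = sum(
        d * d * (t_next - t)
        for d, t, t_next in zip(weighted, times, times[1:])
    )
    return ks, cvm


if __name__ == "__main__":
    grid = [0.0, 1.0, 2.0]   # toy grid, last point plays the role of tau
    a1_hat = [1.0, 2.0, 4.0]  # toy values of the first estimator
    a2_hat = [1.0, 2.0, 2.0]  # toy values of the second estimator
    print(proportionality_statistics(grid, a1_hat, a2_hat, n1=2, n2=2))
```

The simplification in the comment avoids any division by ${\hat{A}}_{n_{1}}^{(1)} (t)$, in line with the remark that the weight restricts the evaluation to $\{{\hat{A}}_{n_{1}}^{(1)} > 0\}$.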

Simulation study

The motivating example behind the present simulation study is the SIR-3 data of Sect. 6. The setting is a specification of Example 1(a) called illness-death model with recovery. As illustrated in the multistate pattern of Fig. 1, the model has state space $S = {0, 1, 2}$ and includes the transition hazards $α_{01}, α_{10}, α_{02},$ and $α_{12}$ . The simulation of the underlying quantities is based on the methodology suggested by Allignol et al. (2011) generalized to the time-inhomogeneous Markovian multistate framework, which can be seen as a nested series of competing risks experiments. More precisely, the individual initial states are derived from the proportions of individuals at $t = 0$ and the censoring times are obtained from a multinomial experiment using probability masses equal to the increments of the censoring Kaplan–Meier estimate originated from the SIR-3 data. Similarly, event times are generated according to a multinomial distribution with probabilities given by the increments of the original Nelson–Aalen estimators. These times are subsequently included into the multistate simulation algorithm described in Beyersmann et al. (2012), Section 8.2. Since censoring times are sampled independently and each simulation step is only based on the current time and the current state, the resulting data follows a Markovian structure. A more formal justification of the multistate simulation algorithm can be found in Gill and Johansen (1990) and Theorem II.6.7 in Andersen et al. (1993).

Fig. 1 — Illness-death model with recovery and transition hazards $α_{01}, α_{10}, α_{02},$ and $α_{12}$ at time t

In addition to the original number of 747 patients, we consider three reduced sample sizes of 373, 186, and 93 patients. For each scenario we simulate 1000 studies. As an overview, the mean number of events for each possible transition and scenario is reported in Table 1.

Table 1.

Mean number of events per transition on [5, 30] provided by the simulation study of Sect. 5

Sample size	Transition
Sample size	$1 \to 0$	$0 \to 1$	$0 \to 2$	$1 \to 2$
93	20.1	4.3	43.9	10.1
186	42.7	8.3	96.5	21.7
373	85.6	17.0	193.4	43.7
747	170.9	33.9	387.4	87.4
$747^{a}$	171	34	387	87


$^{a}$ Original data

The mean number of events for 747 patients reflects the original number of events. All numbers are restricted to the time interval [5, 30], which is chosen because only a small number of events occurs before $t = 5$ (left panel of Fig. 2). Further, fewer than 10% of all individuals are still under observation after day 30. In particular, asymptotic approximations tend to be poor at the left- and right-hand tails; cf. Remark 2(b) and Lin (1997).

Fig. 2 — 95% confidence bands based on standard normal multipliers for the cumulative hazard of end-of-stay from the data example in Sect. 6. The solid black lines are the Nelson–Aalen estimators separately for ‘no ventilation’ (state 0, right plot) and ‘ventilation’ (state 1, left plot)

Utilizing the R-package sde (Iacus 2014), the quantiles $c_{1 - α}^{g}$ in (4.3) of each single study are empirically estimated by simulating 1000 sample paths of a standard Brownian bridge. These quantiles are separately derived for both the Aalen- and Greenwood-type variance estimates (2.5) and (2.6). The bootstrap critical values are based on 1000 bootstrap realizations of ${\hat{W}}_{n}$ for each simulation step including both standard normal and centered Poisson variates with variance one. The latter is motivated by a slightly better performance compared to standard normal multipliers (Beyersmann et al. 2013; Dobler et al. 2017). Furthermore, Liu (1988) argued in a classical (linear regression) problem that wild bootstrap weights with skewness equal to one satisfy the second order correctness of the resampling approach. According to the cited simulation results, a similar result might hold true in our context, as the Poisson variates have skewness equal to one and standard normal variates are symmetric. A careful analysis of the convergence rates, however, is certainly beyond the scope of this article. In order to guarantee statistical reliability, we do not derive confidence bands for sample sizes and transitions with a mean number of observed transitions distinctly smaller than 20. The nominal level is set to $α = 0.05$ . All simulations are performed with the R-computing environment version 3.3.2 (R Core Team 2016).
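The two multiplier distributions can be illustrated by a short Python sketch. The simulations of this section were carried out in R; the code below is merely our own illustration, and it generates Poisson variates with Knuth's multiplication method because the Python standard library offers no Poisson sampler. The helper `centered_poisson_moments` confirms numerically that a centered Poisson(1) variate has mean zero, variance one, and skewness one.

```python
import math
import random


def centered_poisson(rng: random.Random) -> int:
    """Draw Poisson(1) - 1 using Knuth's multiplication method."""
    threshold = math.exp(-1.0)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count - 1


def centered_poisson_moments(terms: int = 60):
    """Mean, variance, skewness of Poisson(1) - 1 from the truncated pmf."""
    pmf = [math.exp(-1.0) / math.factorial(k) for k in range(terms)]
    mean = sum(p * (k - 1) for k, p in enumerate(pmf))
    var = sum(p * (k - 1 - mean) ** 2 for k, p in enumerate(pmf))
    third = sum(p * (k - 1 - mean) ** 3 for k, p in enumerate(pmf))
    return mean, var, third / var ** 1.5


if __name__ == "__main__":
    rng = random.Random(2018)  # arbitrary seed for reproducibility
    poisson_multipliers = [centered_poisson(rng) for _ in range(5)]
    normal_multipliers = [rng.gauss(0.0, 1.0) for _ in range(5)]
    print(poisson_multipliers, normal_multipliers)
    print(centered_poisson_moments())
```

Both multiplier types are centered with unit variance, so they differ only in their higher moments, which is the point of the skewness argument above.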

According to Table 2, almost all bands constructed via Brownian bridges consistently tend to be rather conservative in our setting, i.e., they result in too broad bands. Here, the usage of the Greenwood-type variance estimate yields more accurate coverage probabilities compared to the Aalen-type estimate. In contrast, the wild bootstrap approach mostly outperforms the Brownian bridge procedures: The log-transformed wild bootstrap bands approximately keep the nominal level even for the smaller sample sizes, except for the $0 \to 1$ transition with the smallest sample size considered for this transition (373 patients, corresponding to only 17 events on average; cf. Table 1). We also observe that the log-transformation in general improves coverage for the wild bootstrap procedure. The current simulation study showed no clear preference for the choice of weight. Note that all wild bootstrap bands for transition $0 \to 2$ show a similar, but mostly reduced conservativeness compared to the bands provided by Brownian bridges. We have to emphasize that coverage probabilities for the cumulative hazard functions drastically decrease to approximately 75% for all sample sizes if log-transformed pointwise confidence intervals were wrongly interpreted time-simultaneously (results not shown).

Table 2.

Empirical coverage probabilities (%) from the simulation study of Sect. 5 separately for each transition and different simulated number of individuals

Transition	N	Type of confidence band
		Brownian bridge				Wild bootstrap
		95% log EP		95% log HW		95% log EP		95% log HW		95% direct
		Aalen	Greenwood	Aalen	Greenwood	Poisson	Standard normal	Poisson	Standard normal	Poisson	Standard normal
$0 \to 1$	373	96.4	95.9	95.5	95.3	92.5	92.5	92.5	92.5	91.4	91.9
	747	97.7	97.3	97.2	97.0	95.0	94.9	95.2	95.0	92.9	93.2
$0 \to 2$	93	98.0	97.1	98.4	97.3	97.6	97.8	97.3	96.0	96.6	96.6
	186	98.3	95.5	98.9	97.4	97.2	98.2	98.0	96.1	96.2	96.2
	373	98.1	95.0	98.2	96.9	97.2	97.3	97.1	97.1	96.0	96.3
	747	98.6	96.2	98.8	97.7	97.7	97.8	97.4	97.8	96.0	96.2
$1 \to 0$	93	97.0	94.9	97.0	95.2	95.1	95.1	94.8	94.8	93.7	93.7
	186	97.3	95.8	97.7	96.1	95.6	95.7	95.7	95.4	94.5	94.3
	373	97.2	96.3	97.9	97.0	95.2	95.3	95.9	96.3	95.2	95.3
	747	97.8	96.8	97.5	96.9	96.6	96.6	95.9	96.0	96.1	96.3
$1 \to 2$	186	97.5	96.7	97.2	96.2	94.7	94.5	94.3	94.7	93.2	93.3
	373	98.2	97.7	98.2	97.8	95.8	95.8	95.1	95.2	94.6	94.3
	747	97.2	96.6	96.6	96.0	94.4	95.5	94.3	94.7	94.9	95.1


EP: equal-precision band; HW: Hall–Wellner band

The second set of simulations follows the test for proportional hazards derived in Theorem 4 with regard to keeping the preassigned error level under the null hypothesis. For that purpose, we assume a competing risks model with two competing events separately for two unpaired patient groups. For an illustration, see for instance, Figure 3.1 in Beyersmann et al. (2012).

We consider four different constant hazard scenarios: (I) the hazards for the type-1 event are set to $α_{01}^{(1)} (t) = α_{01}^{(2)} (t) = 2$ (no effect on the type-1 hazard, in particular, a hazard ratio of $c = 1$); (II) $α_{01}^{(1)} (t) = 1$ and $α_{01}^{(2)} (t) = 2$ (large effect); (III) $α_{01}^{(1)} (t) = α_{01}^{(2)} (t) = 1$ (again no effect); (IV) $α_{01}^{(1)} (t) = 1$ and $α_{01}^{(2)} (t) = 1.5$ (moderate effect). In each scenario, we set $α_{02}^{(1)} (t) = α_{02}^{(2)} (t) = 2$, i.e., we consistently assume no group effect on the competing hazard. Further, scenario-specific administrative censoring times are chosen such that approximately 25% of the individuals are censored. The simulation designs are selected such that we include different effect sizes as well as different type-1 hazard ratio configurations with respect to the competing hazards. We consider a balanced design with $n_{1} = n_{2} = n \in \{125, 250, 500, 1000\}$. The right-hand tail of the domain of interest is set to $τ = 0.3$. Simulation of the event times and types follows the procedure explained in Section 3.2 of Beyersmann et al. (2012). As before, we simulate 1000 studies for each scenario and sample size configuration, whereas the critical values of the Kolmogorov–Smirnov-type and Cramér-von-Mises-type statistics from Sect. 4.3 are derived from 1000 bootstrap samples utilizing both standard normal and centered Poisson variates with variance one.
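A data-generating step of this kind can be sketched in Python as follows. The sketch rests on two assumptions of ours that the text does not spell out: every individual of a group shares one fixed administrative censoring time $c$, chosen such that $\exp (- (α_{01} + α_{02}) c) = 0.25$, and the event type is drawn with probability $α_{01} / (α_{01} + α_{02})$ for type 1 after sampling the event time from the all-cause hazard. All function names are hypothetical.

```python
import math
import random
from typing import List, Tuple


def admin_censoring_time(total_hazard: float, censored_fraction: float = 0.25) -> float:
    """Fixed time c with P(T > c) = censored_fraction for T ~ Exp(total_hazard)."""
    return -math.log(censored_fraction) / total_hazard


def simulate_competing_risks(
    n: int, a01: float, a02: float, rng: random.Random,
    censored_fraction: float = 0.25,
) -> List[Tuple[float, int]]:
    """Return (observed time, status) pairs; status 0 = censored, 1 or 2 = event type."""
    total = a01 + a02
    c = admin_censoring_time(total, censored_fraction)
    data = []
    for _ in range(n):
        t = rng.expovariate(total)  # waiting time under the all-cause hazard
        if t > c:
            data.append((c, 0))     # administratively censored at c
        else:
            cause = 1 if rng.random() < a01 / total else 2
            data.append((t, cause))
    return data


if __name__ == "__main__":
    rng = random.Random(2018)  # arbitrary seed
    group1 = simulate_competing_risks(125, a01=1.0, a02=2.0, rng=rng)  # Scenario II, group 1
    group2 = simulate_competing_risks(125, a01=2.0, a02=2.0, rng=rng)  # Scenario II, group 2
    for name, group in (("group 1", group1), ("group 2", group2)):
        share = sum(status == 0 for _, status in group) / len(group)
        print(name, "censored share:", round(share, 3))
```

Under these assumptions the censoring time exceeds $τ = 0.3$ in all four scenarios, so censoring does not interfere with the domain of interest.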

The results for the type I error rates (for $α = 0.05$ ) are displayed in Table 3. As expected from consistency, the type I error control becomes better with a larger number of patients for both test statistics in each scenario. Except for Scenario (II), all procedures keep the type I error rate quite accurately for $n \geq 500$ . For smaller sample sizes, all tests tend to be conservative with a particular advantage for the Kolmogorov–Smirnov statistic.

Table 3.

Simulated size of $φ_{n_{1}, n_{2}}^{prop}$ for nominal size $α = 5\%$ under different sample sizes and constant hazard configurations

	Scenario I				Scenario II				Scenario III				Scenario IV
	KMS		CvM		KMS		CvM		KMS		CvM		KMS		CvM
$n_{i}$	SN	Poi	SN	Poi	SN	Poi	SN	Poi	SN	Poi	SN	Poi	SN	Poi	SN	Poi
125	0.029	0.024	0.033	0.030	0.029	0.026	0.027	0.023	0.045	0.041	0.028	0.030	0.046	0.042	0.030	0.025
250	0.035	0.039	0.039	0.040	0.040	0.038	0.037	0.034	0.039	0.040	0.037	0.034	0.034	0.034	0.033	0.030
500	0.057	0.054	0.059	0.060	0.034	0.038	0.040	0.041	0.056	0.053	0.047	0.045	0.044	0.044	0.043	0.044
1000	0.050	0.050	0.047	0.047	0.048	0.049	0.043	0.046	0.047	0.049	0.045	0.048	0.058	0.059	0.053	0.056


In each scenario, $τ = 0.3$ and approximately $25\%$ of individuals are censored

KMS: Kolmogorov–Smirnov-type statistic; CvM: Cramér-von-Mises-type statistic; SN: standard normal multiplier; Poi: centered Poisson multiplier
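When reading Table 3, it may help to keep the Monte Carlo error of the simulated sizes in mind. The following Python sketch, which is our own addition and uses a normal approximation to the binomial distribution, computes the standard error of an empirical rejection rate based on 1000 simulated studies; the function names are hypothetical.

```python
import math
from typing import Tuple


def monte_carlo_se(level: float, replications: int) -> float:
    """Standard error of an empirical rejection rate with true rate `level`."""
    return math.sqrt(level * (1.0 - level) / replications)


def plausible_range(level: float, replications: int, z: float = 1.96) -> Tuple[float, float]:
    """Approximate 95% range of the empirical rate around the nominal level."""
    se = monte_carlo_se(level, replications)
    return level - z * se, level + z * se


if __name__ == "__main__":
    print(monte_carlo_se(0.05, 1000))
    print(plausible_range(0.05, 1000))
```

For a true size of 5% and 1000 replications, empirical sizes between roughly 0.036 and 0.064 are thus compatible with exact level control.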

Data example

The SIR-3 (Spread of Nosocomial Infections and Resistant Pathogens) cohort study at the Charité University Hospital in Berlin, Germany, prospectively collected data on the occurrence and consequences of hospital-acquired infections in intensive care (Beyersmann et al. 2006; Wolkewitz et al. 2008). A device of particular interest in critically ill patients is mechanical ventilation. The present data analysis investigates the impact of ventilation on the length of intensive care unit stay, which is, e.g., of interest in cost-benefit analyses in hospital epidemiology (Beyersmann et al. 2011). The analysis considers a random subset of 747 patients of the SIR-3 data which one of us has made publicly available (Beyersmann et al. 2012). Patients may either be ventilated (state 1 as in Fig. 1) or not ventilated (state 0) upon admission. Switches in device usage are modeled as transitions between the intermediate states 0 and 1. Patients move into state 2 upon discharge from the unit. The numbers of observed transitions are reported in the last row of Table 1. We start by separately considering the two cumulative end-of-stay hazards $A_{12}$ and $A_{02}$, followed by a more formal group comparison as in Remark 2(c). Based on the approach suggested by Beyersmann et al. (2012), Section 11.3, we find it reasonable to assume the Markov property. Figure 2 displays the Nelson–Aalen estimates of $A_{12}$ and $A_{02}$ accompanied by simultaneous 95% confidence bands utilizing 1000 wild bootstrap versions with standard normal variates and restricted to the time interval [5, 30] of intensive care unit days. As before, the left-hand limit of the interval is chosen because Nelson–Aalen estimation regarding $A_{12}$ picks up at $t = 5$; cf. the left panel of Fig. 2. Graphical validation of empirical means and variances of ${\hat{W}}_{n}$ showed good agreement with the theoretical limit quantities stated in Remark 1. Bands using Poisson variates are similar (both results not shown). Figure 3 also displays the 95% pointwise confidence intervals based on a log-transformation. The performance of both equal precision and Hall–Wellner bands is comparable for transitions out of the ventilation state. However, the latter tend to be wider for the $0 \to 2$ transitions for later days due to more unstable weights at the right-hand tail. Equal precision bands are graphically competitive when compared to the pointwise confidence intervals. Ventilation significantly reduces the hazard of end-of-stay, since the upper half-space is not contained in the 95% confidence band of the cumulative hazard difference; see Fig. 4.

Fig. 3 — 95% equal precision confidence bands based on standard normal multipliers and 95% log-transformed pointwise confidence intervals for the cumulative hazard of end-of-stay from the data example in Sect. 6. The solid black lines are the Nelson–Aalen estimators separately for ‘no ventilation’ (state 0, right plot) and ‘ventilation’ (state 1, left plot)

Fig. 4 — 95% confidence bands from relation (4.5) based on standard normal multipliers and 95% linear pointwise confidence intervals for the difference of the two cumulative hazards of end-of-stay from the data example in Sect. 6. The solid black line is the difference ‘ventilation vs. no ventilation’ of the Nelson–Aalen estimators within the two ventilation groups

As a second illustration, we apply the nonparametric proportionality test suggested in Sect. 4.3 to the present study example. For that purpose, we consider sex-specific subsamples (441 male vs. 306 female patients) and test $H_{0, prop}$ using both the Kolmogorov–Smirnov- and the Cramér-von-Mises-type test statistic. The Nelson–Aalen estimators separately displayed for males and females are given in Supplementary Figure S1. p values are computed from 1000 bootstrap samples utilizing standard normal variates, and the right-hand limit of the interval of interest is set to 30 days. Tabulated results are given in Supplementary Table S1, including the test statistics $T_{n_{1}, n_{2}}$ and the bootstrap quantiles ${\tilde{q}}_{0.95}$ of Theorem 4 as well as the corresponding bootstrap p values $\tilde{p}$. Non-proportionality cannot be inferred for any of the transitions at the $5\%$ level.

A concise R-script is available in the online supplementary material, which executes the bootstrap core operation given in relation (3.2) in a computationally efficient way.

Discussion and further research

We have given a rigorous presentation of a weak convergence result for the wild bootstrap methodology for the multivariate Nelson–Aalen estimator in a general setting only assuming Aalen’s multiplicative intensity structure of the underlying counting processes. This allowed the construction of time-simultaneous confidence bands and intervals as well as asymptotically valid equivalence and equality tests for cumulative hazard functions. In the context of time-to-event analysis, our general framework is not restricted to the standard survival or competing risks setting, but also covers arbitrary Markovian multistate models with finite state space, other classes of intensity models like relative survival or excess mortality models, and even specific semi-Markov situations. Additionally, independent left-truncation and right-censoring can be incorporated. The procedure has also been used to construct a test for proportional hazards. We want to emphasize that the framework induces a random number of multipliers in relation (3.2); thus, it goes beyond existing approaches for competing risks as in Beyersmann et al. (2013) or Lin (1997). Easy and computationally convenient implementation and within- or two-sample comparisons demonstrate its attractiveness in various practical applications.

Future work will be on the approximation of the asymptotic distribution corresponding to the matrix of transition probabilities (see Aalen and Johansen 1978) and functionals thereof in general Markovian multistate models. This is of great practical interest, because no similar Brownian bridge procedure is available to perform time-simultaneous statistical inference. In particular, previous implications rely on pointwise considerations. Note that such an approach would significantly simplify the original justifications given by Lin (1997) and generalize his idea, which has mainly been used in the context of competing risks (Scheike and Zhang 2003; Hyun et al. 2009; Beyersmann et al. 2013). In addition, we plan to extend the utilized wild bootstrap technique to general semiparametric regression models; see Lin et al. (2000) for an application in the survival context. Current work investigates to what degree the martingale properties presented in this article may be exploited to obtain wild bootstrap consistencies for such functionals of Nelson–Aalen estimates or for estimators in semiparametric regression models. We are confident that the present approach will lead to reliable inference procedures in these contexts for which there has been only little research on such general methodology.

In contrast to the procedure of Schoenfeld et al. (2002) or the framework in Cube et al. (2017), the more general illness-death model with recovery does not rely on a constant hazards assumption and captures both the time-dependent structure of mechanical ventilation and the competing event ‘death in ICU’. This significantly improves medical interpretations. The widths of the confidence bands were competitive compared to the pointwise confidence intervals, i.e., demonstrated usefulness in practical situations.

In the present data analysis, the multistate perspective is the natural way to assess the impact of time-dependent exposures on complex survival outcomes. This is a current topic in various fields of applied research; thus, the wild bootstrap procedure in its very general formulation is applicable to analyses regarding, for instance, different stages of illicit drug use (Mayet et al. 2012), the clinical course of liver diseases (Jepsen et al. 2015), antibiotics in hospital epidemiology (Munoz-Price et al. 2016), alternative outcomes in leukemia trials (Schmoor et al. 2013; Eefting et al. 2016), or joint replacements in orthopaedic patients (Gillam et al. 2012). It has even been recently applied in a study investigating femoral fracture risk, disability, and mortality in an elderly population (Bluhmki et al. 2017).

It has to be emphasized that our simulation study suggested that the wild bootstrap approach leads to more powerful procedures (i.e., to narrower confidence bands) compared to the approximation via Brownian bridges. As expected, the applied log-transformation results in improved small sample properties compared to the untransformed wild bootstrap bands. Based on the current simulation study, however, it was difficult to clearly recommend which type of band and which type of multiplier should be used.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (R script, 2.6 KB)

Supplementary material 2 (PDF, 78.1 KB)

Acknowledgements

The authors would like to thank two referees and an associate editor for their helpful comments which increased the quality of this article.

Appendix

A Proofs

Proof (of Lemma 1)

Due to similarity, it is enough to concentrate on the first component only; thus, we subsequently suppress the subscript ‘1’. The independence of all white noise processes immediately implies the orthogonality of all component processes. At first, we verify the martingale property; the square-integrability is obviously fulfilled since $E (G_{1}^{2} (0)) < \infty$. To this end, let $0 \leq s \leq t$. By measurability of the counting and the predictable process with respect to $C_{0}$, we have

\begin{matrix} E ({\hat{W}}_{n} (t) | C_{s}) & = \sqrt{n} \int_{(0, t]} \frac{J (u)}{Y (u)} E (G (u) | C_{s}) d N (u) \\ & = \sqrt{n} \int_{(0, s]} \frac{J (u)}{Y (u)} G (u) d N (u) + \sqrt{n} \int_{(s, t]} \frac{J (u)}{Y (u)} E (G (u)) d N (u) \\ & = {\hat{W}}_{n} (s) \end{matrix}

by the independence of $σ (G (u))$ and $C_{s}$ for all $u > s$ . Hence, the martingale property is shown.

The predictable variation process $⟨ {\hat{W}}_{n} ⟩$ is the compensator of ${\hat{W}}_{n}^{2}$ , i.e., we calculate

\begin{matrix} E ({\hat{W}}_{n}^{2} (t) | C_{s}) & = n \Big( \int_{(0, s]} \int_{(0, s]} + \int_{(s, t]} \int_{(0, s]} + \int_{(0, s]} \int_{(s, t]} + \int_{(s, t]} \int_{(s, t]} \Big) E (G (u) G (v) | C_{s}) \frac{J (u) J (v)}{Y (u) Y (v)} d N (u) d N (v) \\ & = n \Big( \int_{(0, s]} \int_{(0, s]} G (u) G (v) + \int_{(s, t]} \int_{(0, s]} E (G (u)) G (v) + \int_{(0, s]} \int_{(s, t]} G (u) E (G (v)) \\ & \quad + \int_{(s, t]} \int_{(s, t]} E (G (u) G (v)) \Big) \frac{J (u) J (v)}{Y (u) Y (v)} d N (u) d N (v) \\ & = n \Big( \int_{(0, s]} \int_{(0, s]} G (u) G (v) \frac{J (u) J (v)}{Y (u) Y (v)} d N (u) d N (v) + \int_{(s, t]} E (G^{2} (u)) \frac{J (u)}{Y^{2} (u)} d N (u) \Big) \\ & = {\hat{W}}_{n}^{2} (s) + n \int_{(0, t]} \frac{J (u)}{Y^{2} (u)} d N (u) - n \int_{(0, s]} \frac{J (u)}{Y^{2} (u)} d N (u), \end{matrix}

again by the $C_{s}$ -measurability of G(u) for $u \leq s$ and their independence for $u > s$ . The second to last equality is due to the independence of G(u) and G(v) for $u \neq v$ . Hence, ${({\hat{W}}_{n}^{2} (t) - n \int_{(0, t]} \frac{J (u)}{Y^{2} (u)} d N (u))}_{t \in [0, τ]}$ is a martingale.

Letting $Δ f$ denote the jump-size process of a càdlàg function f, the definition of the optional variation process yields

\begin{matrix} [{\hat{W}}_{n}] (t) & = \sum_{0 < u \leq t} {(Δ {\hat{W}}_{n} (u))}^{2} = n \sum_{0 < u \leq t} G^{2} (u) \frac{J (u)}{Y^{2} (u)} Δ N (u) \\ & = n \int_{(0, t]} G^{2} (u) \frac{J (u)}{Y^{2} (u)} d N (u), \end{matrix}

where the sum is taken over all jump points of N. $□$
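The objects appearing in Lemma 1 can be illustrated by a minimal Python sketch. It is not the R script from the supplementary material; it assumes the simplest special case of right-censored survival data without ties and standard normal multipliers, and all function names and toy data are ours. One wild bootstrap path is the cumulative sum of $\sqrt{n} \, G (u) \, Δ N (u) / Y (u)$ over the observed event times, and its predictable variation is $n \int J / Y^{2} \, d N$.

```python
import math
import random
from typing import List, Sequence, Tuple


def nelson_aalen_increments(
    times: Sequence[float], events: Sequence[int]
) -> List[Tuple[float, float]]:
    """Return (event time, dN/Y) pairs for right-censored data without ties."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    n = len(times)
    increments = []
    for rank, i in enumerate(order):
        at_risk = n - rank  # number at risk just before this ordered time
        if events[i]:
            increments.append((times[i], 1.0 / at_risk))
    return increments


def wild_bootstrap_path(increments, n: int, rng: random.Random):
    """One path of W_n: sqrt(n) times the multiplier-weighted increments."""
    root_n = math.sqrt(n)
    value, path = 0.0, []
    for time, inc in increments:
        value += root_n * rng.gauss(0.0, 1.0) * inc  # fresh multiplier per event
        path.append((time, value))
    return path


def predictable_variation(increments, n: int):
    """n * integral of J / Y^2 dN, the conditional variance of W_n."""
    total, out = 0.0, []
    for time, inc in increments:
        total += n * inc * inc
        out.append((time, total))
    return out


if __name__ == "__main__":
    times = [2.0, 3.5, 1.2, 4.1, 0.7, 5.0]  # toy observation times
    events = [1, 0, 1, 1, 1, 0]             # 1 = event, 0 = censored
    inc = nelson_aalen_increments(times, events)
    rng = random.Random(2018)               # arbitrary seed
    print(wild_bootstrap_path(inc, len(times), rng))
    print(predictable_variation(inc, len(times)))
```

Repeating `wild_bootstrap_path` many times with the data held fixed gives the conditional distribution from which bootstrap quantiles are taken; the number of multipliers equals the number of observed events and is therefore random.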

Proof (of Theorem 2)

It is enough to verify the conditions of Rebolledo’s martingale central limit theorem (in conditional probability); see e.g., Theorem II.5.1 in Andersen et al. (1993). Since the filtration $C_{0}$ at time $s = 0$ is not trivial, the resulting weak convergence will hold given $C_{0}$ as well, in probability. From the classical theory we know that the Aalen-type variance estimator, which is in fact the predictable variation process of ${\hat{W}}_{n}$, is uniformly consistent for the variance function.

It remains to prove the Lindeberg condition (2.5.3) on page 83 in Andersen et al. (1993). But, by the same arguments as in the proof of Lemma 1, this is exactly the same as the Lindeberg condition for the Nelson–Aalen estimator itself. And this holds due to the main assumption (2.2).

Hence, Rebolledo’s martingale central limit theorem yields the desired weak convergence as well as the uniform consistency of the optional variation process. $□$

Proof (of Theorem 3)

For convergence (4.1), see Section IV.1 in Andersen et al. (1993) in combination with Slutsky’s theorem. Convergence (4.2) follows from the consistency of $σ^{* 2}$, Slutsky’s theorem, and Theorem 2, since ${\hat{W}}_{n}$ asymptotically mimics the distribution of $\sqrt{n} ({\hat{A}}_{n} - A)$. The functional delta-method for $(x \mapsto \log x)$ completes the proof. $□$

Proof (of Corollaries 1 and 3)

Due to the continuous limit distribution, the conditional quantiles converge as well in probability; see e.g., Janssen and Pauls (2003), Lemma 1. The consistency of $φ_{n}^{KS}$ under $K_{\neq}$ follows from the convergence in probability of the conditional quantile towards a finite value and from the uniform consistency of the multivariate Nelson–Aalen estimator for the cumulative hazard functions. Since the factor $\sqrt{n}$ tends to infinity, the test statistic also goes to infinity in probability under $K_{\neq}$. $□$

Proof (of Corollary 2)

The proof extends the arguments of Wellek (2010), Section 3.1, from confidence intervals to confidence bands. Write $H = H_{1} \cup H_{2}$ where

\begin{matrix} H_{1} : \{A (s) \leq A_{0} (s) - ℓ (s) \text{ for some } s \in [t_{1}, t_{2}]\} \\ \text{and} \quad H_{2} : \{A (s) \geq A_{0} (s) + u (s) \text{ for some } s \in [t_{1}, t_{2}]\} . \end{matrix}

Suppose H is true and assume without loss of generality that $H_{1}$ is true; the case of $H_{2}$ is analogous. Then the probability of a false rejection of H amounts to

\begin{matrix} & P (A_{0} (s) - ℓ (s) < a_{n} (s) \text{ and } b_{n} (s) < A_{0} (s) + u (s) \text{ for all } s \in [t_{1}, t_{2}]) \\ & \leq P (A_{0} (s) - ℓ (s) < a_{n} (s) \text{ for all } s \in [t_{1}, t_{2}]) \\ & \leq P (A (s) < a_{n} (s) \text{ for some } s \in [t_{1}, t_{2}]) ⟶ α . \end{matrix}

Here the last inequality holds since $H_{1}$ is true and the convergence is due to the asymptotic coverage probability of the confidence band ${(a_{n} (s), \infty)}_{s \in [t_{1}, t_{2}]}$ .

In order to prove consistency, suppose the alternative hypothesis K is true and choose any $ε$ such that

\begin{matrix} 0 < ε < \inf_{s \in [t_{1}, t_{2}]} \{ - (A_{0} (s) - ℓ (s) - A (s)) \land (A_{0} (s) + u (s) - A (s)) \} . \end{matrix}

Thus, by the (uniform) consistency of the Nelson–Aalen estimator and the wild bootstrap quantiles, the probability of a correct rejection of H equals

\begin{matrix} & P (A_{0} (s) - ℓ (s) < a_{n} (s) \text{ and } b_{n} (s) < A_{0} (s) + u (s) \text{ for all } s \in [t_{1}, t_{2}]) \\ & \geq P (A (s) - ε < a_{n} (s) \text{ and } b_{n} (s) < A (s) + ε \text{ for all } s \in [t_{1}, t_{2}]) ⟶ 1 \end{matrix}

as $n \to \infty$ . For the convergence in the previous display, also note that $a_{n} \overset{P}{\to} A$ as well as $b_{n} \overset{P}{\to} A$ uniformly in $[t_{1}, t_{2}]$ . $□$

Proof (of Theorem 4)

Let $t_{0} > 0$ . Denote by $D_{> 0} [t_{0}, τ] \subset D [t_{0}, τ]$ the cone of positive càdlàg functions that are bounded away from zero. It is easy to see that the functional $ϕ : D_{> 0}^{2} [t_{0}, τ] \to D_{> 0} [t_{0}, τ], (f, g) \mapsto \frac{f}{g}$ is Hadamard-differentiable tangentially to the set of pairs of continuous functions $C^{2} [t_{0}, τ]$ with continuous and linear Hadamard-derivative

\begin{matrix} ϕ_{(f, g)}^{'} : C^{2} [t_{0}, τ] \to C [t_{0}, τ], (h_{1}, h_{2}) ⟼ \frac{h_{1}}{g} - h_{2} \frac{f}{g^{2}} . \end{matrix}
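The form of this derivative can be verified from the difference quotient. The following is a sketch, where $t \to 0$ and the perturbations $h_{1, t} \to h_{1}$ , $h_{2, t} \to h_{2}$ converge uniformly on $[t_{0}, τ]$ (this notation is introduced here for illustration only):

\begin{matrix} \frac{1}{t} (\frac{f + t h_{1, t}}{g + t h_{2, t}} - \frac{f}{g}) = \frac{h_{1, t} g - h_{2, t} f}{g (g + t h_{2, t})} ⟶ \frac{h_{1}}{g} - h_{2} \frac{f}{g^{2}} . \end{matrix}

The convergence is uniform on $[t_{0}, τ]$ because $g$ is bounded away from zero and the limits $h_{1}, h_{2}$ are bounded.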

A simpler Hadamard-differentiability result holds for $ϕ$ ’s restriction to $τ$ , i.e., ${ϕ |}_{τ} : {(0, \infty)}^{2} ∋ (f (τ), g (τ)) \mapsto \frac{f (τ)}{g (τ)}$ with continuous, linear Hadamard-derivative

\begin{matrix} {(ϕ |_{τ})}_{(f, g)}^{'} : \mathbb{R}^{2} \to \mathbb{R}, (h_{1} (τ), h_{2} (τ)) ⟼ \frac{h_{1} (τ)}{g (τ)} - h_{2} (τ) \frac{f (τ)}{g^{2} (τ)} . \end{matrix}

Hence, we apply the functional $δ$ -method and the continuous mapping theorem to

\begin{matrix} \sqrt{\frac{n_{1} n_{2}}{n}} (ϕ ({\hat{A}}_{n_{2}}^{(2)}, {\hat{A}}_{n_{1}}^{(1)}) - ϕ (A^{(2)}, A^{(1)})) \quad \text{and} \quad ϕ_{({\hat{A}}_{n_{2}}^{(2)}, {\hat{A}}_{n_{1}}^{(1)})}^{'} (\sqrt{\frac{n_{1}}{n}} {\hat{W}}_{n_{2}}^{(2)}, \sqrt{\frac{n_{2}}{n}} {\hat{W}}_{n_{1}}^{(1)}), \end{matrix}

respectively, verifying their equality in distribution in the limit (conditionally in probability for the latter). Proceed similarly with the restricted functional ${ϕ |}_{τ}$ . Furthermore, the difference of the two functionals above is again Hadamard-differentiable tangentially to the set of pairs of continuous functions. Our specific choices of the distance $ρ$ are continuous functionals, so the continuous mapping theorem applies once more. To conclude the proof of the asymptotic behaviour of $φ_{n_{1}, n_{2}}^{prop}$ under $H_{0, prop}$ , note that the particular weight function solves the problem of dividing by zero at $t_{0} = 0$ .

For the asymptotic power assertion, let $t_{1} \in [0, τ]$ be a time point at which $H_{0, prop}$ is violated. Then

\begin{matrix} ρ (\frac{{\hat{A}}_{n_{2}}^{(2)}}{{\hat{A}}_{n_{1}}^{(1)}}, \frac{{\hat{A}}_{n_{2}}^{(2)} (τ)}{{\hat{A}}_{n_{1}}^{(1)} (τ)}) \end{matrix}

converges in probability to a positive value, whence $T_{n_{1}, n_{2}} \overset{p}{\to} \infty$ follows. The conditional quantiles, however, still converge to a finite constant in probability by the above arguments. $□$

B Theoretical justification for the non-applicability of the ordinary multiplier resampling

In this part of the appendix, following the suggestion of a referee, we show in detail why the ordinary multiplier bootstrap fails to provide the correct limit process. For ease of presentation, we focus on the univariate Nelson–Aalen estimator in the special Markovian and uncensored multistate model: Let ${(X (t))}_{t}$ denote a Markov process with state space $\{1, 2, \dots, k\}$ , where $k \geq 2$ , i.e., given the present state of the process, the future development does not depend on previously occupied states. Without loss of generality, we focus on the transition from state 1 to state 2 in order to prove our claim. That is, we restrict attention to the cumulative hazard function given by $A (d t) = A_{12} (d t) = P (X (t + d t) = 2 ∣ X (t) = 1)$ ; for ease of presentation, this subscript is omitted. Obviously, having shown that the competing resampling technique fails in this particular situation, the same is immediately implied in the general and multivariate case.

In this set-up, the alternative resampling approach is available because the counting process N for all $1 \to 2$ transitions is a sum of $n \in \mathbb{N}$ i.i.d. counting processes $N_{i}, i = 1, \dots, n$ , i.e., $N = \sum_{i = 1}^{n} N_{i}$ . Hence, each counting process satisfies the Doob–Meyer decomposition (2.1) such that we have the martingale representation

\begin{matrix} M (t) = N (t) - \int_{(0, t]} Y (s) d A (s) . \end{matrix}

The ordinary multiplier bootstrap works as follows: Consider the martingale representation of the Nelson–Aalen estimator,

\begin{matrix} W_{n} (t) = \sqrt{n} \int_{(0, t]} \frac{J (u)}{Y (u)} d M (u) + o_{p} (1) = \sqrt{n} \sum_{i = 1}^{n} \int_{(0, t]} \frac{J (u)}{Y (u)} d M_{i} (u) + o_{p} (1), \end{matrix}

where $o_{p} (1) \overset{p}{\to} 0$ as $n \to \infty$ ; cf. (2.3) and omit the index j therein. In the resampling step, the individual counting process martingales $M_{i} (u)$ are replaced with $G_{i} \cdot N_{i} (u)$ . Here, $G_{1}, \dots, G_{n}$ are i.i.d. random variables with zero mean and unit variance, which are independent of the data. The resulting ordinary multiplier bootstrap process shall be denoted as

\begin{matrix} {\tilde{W}}_{n} (t) = \sqrt{n} \sum_{i = 1}^{n} G_{i} \int_{(0, t]} \frac{J (u)}{Y (u)} d N_{i} (u) . \end{matrix}

We now show that this approach asymptotically does not yield the appropriate correlation structure. The limit processes of the Nelson–Aalen estimator and its wild bootstrap version as discussed in Section 3.2 are centered Gaussian martingales and, thus, have independent increments. On the other hand, this does not hold for the limit process of ${\tilde{W}}_{n}$ : For any choice of $0 \leq s \leq t \leq τ$ we have

\begin{matrix} \mathrm{cov} ({\tilde{W}}_{n} (t), {\tilde{W}}_{n} (s) ∣ C_{0}) \\ = E ({\tilde{W}}_{n} (t) {\tilde{W}}_{n} (s) ∣ C_{0}) \\ = n \sum_{i = 1}^{n} \sum_{k = 1}^{n} E (G_{i} G_{k}) \int_{(0, t]} \frac{J (u)}{Y (u)} d N_{i} (u) \int_{(0, s]} \frac{J (v)}{Y (v)} d N_{k} (v) \\ = \frac{1}{n} \sum_{i = 1}^{n} \int_{(0, t]} \frac{n J (u)}{Y (u)} d N_{i} (u) \int_{(0, s]} \frac{n J (v)}{Y (v)} d N_{i} (v) \\ = \frac{1}{n} \sum_{i = 1}^{n} \int_{(0, t]} \frac{1}{y (u)} d N_{i} (u) \int_{(0, s]} \frac{1}{y (v)} d N_{i} (v) + o_{p} (1) \\ \overset{p}{\to} \tilde{ψ} (s, t) : = E (\int_{(0, t]} \frac{1}{y (u)} d N_{1} (u) \int_{(0, s]} \frac{1}{y (v)} d N_{1} (v)), \end{matrix}

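where the third equality uses $E (G_{i} G_{k}) = 0$ for $i \neq k$ and $E (G_{i}^{2}) = 1$ , which holds because the multipliers are i.i.d. with zero mean and unit variance and independent of the data, and where the last convergence in probability holds due to the law of large numbers as $n \to \infty$ . Let X denote the Markov process underlying the counting process $N_{1}$ . A twofold application of Fubini’s theorem yields that the above expectation equals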

\begin{matrix} \int_{(0, t]} \int_{(0, s]} \frac{1}{y (u)} \frac{1}{y (v)} E (d N_{1} (u) d N_{1} (v)) . \end{matrix}

As we assumed that there is no censoring, we have

\begin{matrix} E (d N_{1} (u) d N_{1} (v)) = P (X (u) = 1, X (u + d u) = 2, X (v) = 1, X (v + d v) = 2) . \end{matrix}

Now, we distinguish between two different cases: First, we treat the diagonal which contributes the following integral to the asymptotic covariance:

\begin{matrix} \int_{(0, \min (s, t)]} \frac{E (d N_{1} (u))}{y^{2} (u)} = \int_{(0, \min (s, t)]} \frac{E (Y_{1} (u) d A (u))}{y^{2} (u)} = \int_{(0, \min (s, t)]} \frac{d A (u)}{y (u)}, \end{matrix}

where $Y_{1}$ is the at risk indicator of the process X for a transition out of state 1. This integral corresponds to the asymptotic covariance function of the Nelson–Aalen estimator, cf. Theorem 1. Unfortunately, the overall asymptotic covariance of the alternative multiplier approach is increased due to the off-diagonal integral parts:

\begin{matrix} \int_{(0, t]} \int_{(0, s]} \frac{\mathbf{1} \{u \neq v\}}{y (u) y (v)} E (d N_{1} (u) d N_{1} (v)) \\ = \int_{(0, s]} \int_{(0, s]} \frac{\mathbf{1} \{u \neq v\}}{y (u) y (v)} E (d N_{1} (u) d N_{1} (v)) \\ + \int_{(s, t]} \int_{(0, s]} \frac{\mathbf{1} \{u \neq v\}}{y (u) y (v)} E (d N_{1} (u) d N_{1} (v)) \\ = 2 \int_{(0, s]} \int_{(u, s]} \frac{1}{y (u) y (v)} E (d N_{1} (u) d N_{1} (v)) \\ + \int_{(s, t]} \int_{(0, s]} \frac{1}{y (u) y (v)} E (d N_{1} (u) d N_{1} (v)) . \end{matrix}

In the case of $u < v$ and due to the assumed Markov property, the above expected values reduce to

\begin{matrix} E (d N_{1} (u) d N_{1} (v)) = P (X (u) = 1, X (u + d u) = 2, X (v) = 1, X (v + d v) = 2) \\ = P (X (v + d v) = 2 ∣ X (v) = 1) P (X (v) = 1 ∣ X (u + d u) = 2) \\ \times P (X (u + d u) = 2 ∣ X (u) = 1) P (X (u) = 1) \\ = P (X (u) = 1) P_{21} (u, v) d A (u) d A (v), \end{matrix}

where $P_{21} (u, v) = P (X (v) = 1 ∣ X (u) = 2)$ . All in all, using $y (u) = P (X (u) = 1)$ , we obtain the following asymptotic covariance of the ordinary multiplier bootstrap version of the Nelson–Aalen estimator:

\begin{matrix} \tilde{ψ} (s, t) = \int_{(0, \min (s, t)]} \frac{d A (u)}{y (u)} + \int_{(0, s]} \int_{(0, t]} \frac{P_{21} (\min (u, v), \max (u, v))}{y (\max (u, v))} d A (u) d A (v) . \end{matrix}

We may conclude two things from this covariance representation: First, we see that $\tilde{ψ} (s, t) \neq ψ (s, t)$ . That is, the asymptotic covariance function of the Nelson–Aalen estimator as given in Theorem 1 is altered by the ordinary multiplier bootstrap. As $\tilde{ψ} (s, t)$ is in general not a function of $min (s, t)$ , the underlying limit Gaussian process does not even have independent increments. Hence, the alternative multiplier approach fails in reproducing an adequately distributed stochastic process.

Second, we see that the correct limit covariance structure is ensured only if the second integral in the above representation of $\tilde{ψ}$ vanishes. This is the case if $P_{21} \equiv 0$ which holds, for example, if there is at most one $1 \to 2$ transition possible as in the competing risks special case. This is no surprise, though: In the competing risks set-up, both resampling options, our proposed wild bootstrap and the ordinary multiplier bootstrap, reduce to the same procedure; see Beyersmann et al. (2013) for the applicability of the wild bootstrap to Aalen–Johansen estimators in competing risks models.
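The variance inflation derived in this appendix can be illustrated numerically. The following is a minimal sketch under several assumptions that are not part of the article: a two-state Markov process with constant hazards `alpha` ( $1 \to 2$ ) and `beta` ( $2 \to 1$ ), all individuals starting in state 1, no censoring, and a wild bootstrap that attaches one independent zero-mean, unit-variance multiplier to each observed jump (so that it coincides with the ordinary multiplier bootstrap when every individual jumps at most once). All function names and parameter values are hypothetical. The code compares the conditional variances at time `tau` of the two resampling processes given the data.

```python
import numpy as np


def simulate_paths(n, alpha, beta, tau, rng):
    """Simulate n two-state Markov paths on (0, tau], all starting in state 1.

    Returns two lists of lists: per-individual 1->2 and 2->1 jump times.
    """
    jumps12, jumps21 = [], []
    for _ in range(n):
        t, state = 0.0, 1
        j12, j21 = [], []
        while True:
            rate = alpha if state == 1 else beta
            t += rng.exponential(1.0 / rate)  # exponential sojourn time
            if t > tau:
                break
            if state == 1:
                j12.append(t)
                state = 2
            else:
                j21.append(t)
                state = 1
        jumps12.append(j12)
        jumps21.append(j21)
    return jumps12, jumps21


def conditional_variances(n, alpha=1.0, beta=1.0, tau=2.0, seed=0):
    """Conditional variances at tau, given the data, of both resampling schemes.

    ordinary multiplier: n * sum_i (sum over i's own 1->2 jumps of 1/Y(u))^2
    per-jump (wild):     n * sum over all 1->2 jumps of 1/Y(u)^2
    """
    rng = np.random.default_rng(seed)
    jumps12, jumps21 = simulate_paths(n, alpha, beta, tau, rng)
    # Pool all jumps as (time, individual, change in the number at risk).
    events = [(t, i, -1) for i, ts in enumerate(jumps12) for t in ts]
    events += [(t, i, +1) for i, ts in enumerate(jumps21) for t in ts]
    events.sort()
    at_risk = n                  # Y(0+): everybody starts in state 1
    per_subject = np.zeros(n)    # integral of J/Y dN_i for each individual
    per_jump_sq = 0.0            # integral of J/Y^2 dN
    for _, i, change in events:
        if change == -1:         # a 1->2 transition of individual i
            inc = 1.0 / at_risk  # Y is left-continuous: count before the jump
            per_subject[i] += inc
            per_jump_sq += inc ** 2
        at_risk += change
    multiplier_var = n * float(np.sum(per_subject ** 2))
    wild_var = n * per_jump_sq
    return multiplier_var, wild_var


if __name__ == "__main__":
    m_var, w_var = conditional_variances(2000)
    print("ordinary multiplier:", m_var, "per-jump multipliers:", w_var)
```

With unit hazards and `tau = 2`, the diagonal term $\int_{(0, τ]} d A (u) / y (u)$ equals $\log ((e^{4} + 1) / 2) \approx 3.33$ , since $y (u) = (1 + e^{- 2 u}) / 2$ in this model. The per-jump variance should be close to this value, whereas the ordinary multiplier variance is larger as soon as some individual makes more than one $1 \to 2$ transition, in line with the off-diagonal term of $\tilde{ψ}$ .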

References

Aalen OO. Nonparametric inference for a family of counting processes. Ann Stat. 1978;6(4):701–726.
Aalen OO, Johansen S. An empirical transition matrix for non-homogeneous Markov chains based on censored observations. Scand J Stat. 1978;5(3):141–150.
Aalen OO, Borgan Ø, Gjessing HK. Survival and event history analysis: a process point of view. New York: Springer; 2008.
Akritas MG. Bootstrapping the Kaplan–Meier estimator. J Am Stat Assoc. 1986;81(396):1032–1038.
Allignol A, Schumacher M, Wanner C, Drechsler C, Beyersmann J. Understanding competing risks: a simulation point of view. BMC Med Res Methodol. 2011;11:86. doi: 10.1186/1471-2288-11-86.
Andersen PK, Væth M. Simple parametric and nonparametric models for excess and relative mortality. Biometrics. 1989;45(2):523–535.
Andersen PK, Borgan Ø, Gill RD, Keiding N. Statistical models based on counting processes. New York: Springer; 1993.
Bagdonavičius V, Levuliené R, Nikulin MS. Goodness-of-fit criteria for the Cox model from left truncated and right censored data. J Math Sci. 2010;167(4):436–443.
Bajorunaite R, Klein JP. Two-sample tests of the equality of two cumulative incidence functions. Comput Stat Data Anal. 2007;51(9):4269–4281.
Beyersmann J, Gastmeier P, Grundmann H, Bärwolff S, Geffers C, Behnke M, Rüden H, Schumacher M. Use of multistate models to assess prolongation of intensive care unit stay due to nosocomial infection. Infect Control Hosp Epidemiol. 2006;27(05):493–499. doi: 10.1086/503375.
Beyersmann J, Wolkewitz M, Allignol A, Grambauer N, Schumacher M. Application of multistate models in hospital epidemiology: advances and challenges. Biom J. 2011;53(2):332–350. doi: 10.1002/bimj.201000146.
Beyersmann J, Allignol A, Schumacher M. Competing risks and multistate models with R. New York: Springer; 2012.
Beyersmann J, Di Termini S, Pauly M. Weak convergence of the wild bootstrap for the Aalen–Johansen estimator of the cumulative incidence function of a competing risk. Scand J Stat. 2013;40(3):387–402.
Bickel PJ, Freedman DA. Some asymptotic theory for the bootstrap. Ann Stat. 1981;9(6):1196–1217.
Bie O, Borgan Ø, Liestøl K. Confidence intervals and confidence bands for the cumulative hazard rate function and their small sample properties. Scand J Stat. 1987;14(3):221–233.
Bluhmki T, Peter RS, Rapp K, König H-H, Becker C, Lindlbauer I, Rothenbacher D, Beyersmann J, Büchele G. Understanding mortality of femoral fractures following low-impact trauma in persons with and without care need. J Am Med Dir Assoc. 2017;18(3):221–226. doi: 10.1016/j.jamda.2016.08.022.
Chen W, Wang D, Li Y. A class of tests of proportional hazards assumption for left-truncated and right-censored data. J Appl Stat. 2015;42(11):2307–2320.
Cox D. Regression models and life tables (with discussion). J R Stat Soc. 1972;34:187–220.
Curley MAQ, Wypij D, Watson RS, Grant MJC, Asaro LA, Cheifetz IM, Dodson BL, Franck LS, Gedeit RG, Angus DC, Matthay MA. Protocolized sedation vs usual care in pediatric patients mechanically ventilated for acute respiratory failure: a randomized clinical trial. J Am Med Assoc. 2015;313(4):379–389. doi: 10.1001/jama.2014.18399.
Dabrowska DM, Ho W-T. Confidence bands for comparison of transition probabilities in a Markov chain model. Lifetime Data Anal. 2000;6(1):5–21. doi: 10.1023/a:1009681332533.
de Wit M, Gennings C, Jenvey WI, Epstein SK. Randomized trial comparing daily interruption of sedation and nursing-implemented sedation algorithm in medical intensive care unit patients. Crit Care. 2008;12(3):R70. doi: 10.1186/cc6908.
Dobler D, Pauly M. Bootstrapping Aalen–Johansen processes for competing risks: handicaps, solutions, and limitations. Electron J Stat. 2014;8(2):2779–2803.
Dobler D, Beyersmann J, Pauly M. Non-strange weird resampling for complex survival data. Biometrika. 2017;104(3):699–711.
Dudek A, Goćwin M, Leśkow J. Simultaneous confidence bands for the integrated hazard function. Comput Stat. 2008;23(1):41–62.
Eefting M, de Wreede LC, Halkes CJM, Peter A, Kersting S, Marijt EWA, Veelken H, Putter H, Schetelig J, Falkenburg JHF. Multi-state analysis illustrates treatment success after stem cell transplantation for acute myeloid leukemia followed by donor lymphocyte infusion. Haematologica. 2016;101(4):506–514. doi: 10.3324/haematol.2015.136846.
Efron B. Bootstrap methods: another look at the jackknife. Ann Stat. 1979;7(1):1–26.
Efron B. Censored data and the bootstrap. J Am Stat Assoc. 1981;76(374):312–319.
Gill R, Schumacher M. A simple test of the proportional hazards assumption. Biometrika. 1987;74(2):289–300.
Gill RD, Johansen S. A survey of product-integration with a view toward application in survival analysis. Ann Stat. 1990;18(4):1501–1555.
Gillam MH, Ryan P, Salter A, Graves SE. Multi-state models and arthroplasty histories after unilateral total hip arthroplasties. Acta Orthop. 2012;83(3):220–226. doi: 10.3109/17453674.2012.684140.
Glidden DV. Robust inference for event probabilities with non-Markov event data. Biometrics. 2002;58(2):361–368. doi: 10.1111/j.0006-341x.2002.00361.x.
Good PI. Permutation, parametric, and bootstrap tests of hypotheses. 3rd ed. New York: Springer; 2005.
Grambsch PM, Therneau TM. Proportional hazards tests and diagnostics based on weighted residuals. Biometrika. 1994;81(3):515–526.
Hall P, Wilson SR. Two guidelines for bootstrap hypothesis testing. Biometrics. 1991;47(2):757–762.
Hall WJ, Wellner JA. Confidence bands for a survival curve from censored data. Biometrika. 1980;67(1):133–143.
Hess KR. Graphical methods for assessing violations of the proportional hazards assumption in Cox regression. Stat Med. 1995;14(15):1707–1723. doi: 10.1002/sim.4780141510.
Hieke S, Bertz H, Dettenkofer M, Schumacher M, Beyersmann J. Initially fewer bloodstream infections for allogeneic vs. autologous stem-cell transplants in neutropenic patients. Epidemiol Infect. 2013;141(01):158–164. doi: 10.1017/S0950268812000283.
Horvath L, Yandell BS. Convergence rates for the bootstrapped product-limit process. Ann Stat. 1987;15(3):1155–1173.
Hyun S, Sun Y, Sundaram R. Assessing cumulative incidence functions under the semiparametric additive risk model. Stat Med. 2009;28(22):2748–2768. doi: 10.1002/sim.3640.
Iacus SM (2014) sde: simulation and inference for stochastic differential equations. https://CRAN.R-project.org/package=sde. R package version 2.0.13. Accessed 8 Nov 2017
Janssen A, Pauls T. How do bootstrap and permutation tests work? Ann Stat. 2003;31(3):768–806.
Jepsen P, Vilstrup H, Andersen PK. The clinical course of cirrhosis: the importance of multistate models and competing risks analysis. Hepatology. 2015;62(1):292–302. doi: 10.1002/hep.27598.
Koziol JA, Byar DP. Percentage points of the asymptotic distributions of one and two sample K-S statistics for truncated or censored data. Technometrics. 1975;17(4):507–510.
Kraus D. Data-driven smooth tests of the proportional hazards assumption. Lifetime Data Anal. 2007;13(1):1–16. doi: 10.1007/s10985-006-9027-8.
Lin DY. Goodness-of-fit analysis for the Cox regression model based on a class of parameter estimators. J Am Stat Assoc. 1991;86(415):725–728.
Lin DY. Non-parametric inference for cumulative incidence functions in competing risks studies. Stat Med. 1997;16(8):901–910. doi: 10.1002/(sici)1097-0258(19970430)16:8<901::aid-sim543>3.0.co;2-m.
Lin DY, Wei LJ, Ying Z. Checking the Cox model with cumulative sums of martingale-based residuals. Biometrika. 1993;80(3):557–572.
Lin DY, Fleming TR, Wei LJ. Confidence bands for survival curves under the proportional hazards model. Biometrika. 1994;81(1):73–81.
Lin DY, Wei LJ, Yang I, Ying Z. Semiparametric regression for the mean and rate functions of recurrent events. J R Stat Soc Ser B (Stat Methodol). 2000;62(4):711–730.
Liu L, Logan B, Klein JP. Inference for current leukemia free survival. Lifetime Data Anal. 2008;14(4):432–446. doi: 10.1007/s10985-008-9093-1.
Liu RY. Bootstrap procedures under some non-iid models. Ann Stat. 1988;16(4):1696–1708.
Lo SH, Singh K. The product-limit estimator and the bootstrap: some asymptotic representations. Probab Theory Relat Fields. 1986;71(3):455–465.
Mayet A, Legleye S, Falissard B, Chau N. Cannabis use stages as predictors of subsequent initiation with other illicit drugs among French adolescents: use of a multi-state model. Addict Behav. 2012;37(2):160–166. doi: 10.1016/j.addbeh.2011.09.012.
Pauly M, Brunner E, Konietschke F. Asymptotic permutation tests in general factorial designs. J R Stat Soc Ser B (Stat Methodol). 2015;77(2):461–473.
R Core Team (2016) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/. Accessed 8 Nov 2017
Sauaia A, Moore EE, Johnson JL, Ciesla DJ, Biffl WL. Validation of postinjury multiple organ failure scores. Shock. 2009;31(5):438–447. doi: 10.1097/SHK.0b013e31818ba4c6.
Scheike TH, Martinussen T. On estimation and tests of time-varying effects in the proportional hazards model. Scand J Stat. 2004;31(1):51–62.
Scheike TH, Zhang M-J. Extensions and applications of the Cox–Aalen survival model. Biometrics. 2003;59(4):1036–1045. doi: 10.1111/j.0006-341x.2003.00119.x.
Schmoor C, Schumacher M, Finke J, Beyersmann J. Competing risks and multistate models. Clin Cancer Res. 2013;19(1):12–21. doi: 10.1158/1078-0432.CCR-12-1619.
Schoenfeld DA, Bernard GR; ARDS Network. Statistical evaluation of ventilator-free days as an efficacy measure in clinical trials of treatments for acute respiratory distress syndrome. Crit Care Med. 2002;30(8):1772–1777. doi: 10.1097/00003246-200208000-00016.
Schumacher M. Two-sample tests of Cramér-von Mises- and Kolmogorov–Smirnov-type for randomly censored data. Int Stat Rev. 1984;52(3):263–281.
Munoz-Price LS, Frencken JF, Tarima S, Bonten M. Handling time-dependent variables: antibiotics and antibiotic resistance. Clin Infect Dis. 2016;62(12):1558–1563. doi: 10.1093/cid/ciw191.
Shu Y, Klein JP, Zhang M-J. Asymptotic theory for the Cox semi-Markov illness-death model. Lifetime Data Anal. 2007;13(1):91–117. doi: 10.1007/s10985-006-9018-9.
Stewart RM, Park PK, Hunt JP, McIntyre RC Jr, McCarthy J, Zarzabal LA, Michalek JE; for the NIH/NHLBI ARDS Clinical Trials Network. Less is more: improved outcomes in surgical patients with conservative fluid administration and central venous catheter monitoring. J Am Coll Surg. 2009;208(5):725–735.
Trof RJ, Beishuizen A, Cornet AD, de Wit RJ, Girbes ARJ, Groeneveld ABJ. Volume-limited versus pressure-limited hemodynamic management in septic and nonseptic shock. Crit Care Med. 2012;40(4):1177–1185. doi: 10.1097/CCM.0b013e31823bc5f9.
von Cube M, Schumacher M, Wolkewitz M. Basic parametric analysis for a multi-state model in hospital epidemiology. BMC Med Res Methodol. 2017;17(1):111. doi: 10.1186/s12874-017-0379-4.
Wellek S. Testing statistical hypotheses of equivalence and noninferiority. 2nd ed. Boca Raton: CRC Press; 2010.
Whitt W. Some useful functions for functional limit theorems. Math Oper Res. 1980;5(1):67–85.
Wolkewitz M, Vonberg RP, Grundmann H, Beyersmann J, Gastmeier P, Bärwolff S, Geffers C, Behnke M, Rüden H, Schumacher M. Risk factors for the development of nosocomial pneumonia and mortality on intensive care units: application of competing risks models. Crit Care. 2008;12(2):R44. doi: 10.1186/cc6852.
Wu CFJ. Jackknife, bootstrap and other resampling methods in regression analysis. Ann Stat. 1986;14(4):1261–1295.

Supplementary Materials

Supplementary material 1 (R, 2.6 KB)

Supplementary material 2 (PDF, 78.1 KB)

[CR1] Aalen OO. Nonparametric inference for a family of counting processes. Ann Stat. 1978;6(4):701–726. [Google Scholar]

[CR2] Aalen OO, Johansen S. An empirical transition matrix for non-homogeneous Markov chains based on censored observations. Scand J Stat. 1978;5(3):141–150. [Google Scholar]

[CR3] Aalen OO, Borgan Ø, Gjessing HK. Survival and event history analysis: a process point of view. New York: Springer; 2008. [Google Scholar]

[CR4] Akritas MG. Bootstrapping the Kaplan–Meier estimator. J Am Stat Assoc. 1986;81(396):1032–1038. [Google Scholar]

[CR5] Allignol A, Schumacher M, Wanner C, Drechsler C, Beyersmann J. Understanding competing risks: a simulation point of view. BMC Med Res Methodol. 2011;11:86. doi: 10.1186/1471-2288-11-86. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR6] Andersen PK, Væth M. Simple parametric and nonparametric models for excess and relative mortality. Biometrics. 1989;45(2):523–535. [PubMed] [Google Scholar]

[CR7] Andersen PK, Borgan Ø, Gill RD, Keiding N. Statistical models based on counting processes. New York: Springer; 1993. [Google Scholar]

[CR8] Bagdonavičius V, Levuliené R, Nikulin MS. Goodness-of-fit criteria for the Cox model from left truncated and right censored data. J Math Sci. 2010;167(4):436–443. [Google Scholar]

[CR9] Bajorunaite R, Klein JP. Two-sample tests of the equality of two cumulative incidence functions. Comput Stat Data Anal. 2007;51(9):4269–4281. [Google Scholar]

[CR10] Beyersmann J, Gastmeier P, Grundmann H, Bärwolff S, Geffers C, Behnke M, Rüden H, Schumacher M. Use of multistate models to assess prolongation of intensive care unit stay due to nosocomial infection. Infect Control Hosp Epidemiol. 2006;27(05):493–499. doi: 10.1086/503375. [DOI] [PubMed] [Google Scholar]

[CR11] Beyersmann J, Wolkewitz M, Allignol A, Grambauer N, Schumacher M. Application of multistate models in hospital epidemiology: advances and challenges. Biom J. 2011;53(2):332–350. doi: 10.1002/bimj.201000146. [DOI] [PubMed] [Google Scholar]

[CR12] Beyersmann J, Allignol A, Schumacher M. Competing risks and multistate models with R. New York: Springer; 2012. [Google Scholar]

[CR13] Beyersmann J, Di Termini S, Pauly M. Weak convergence of the wild bootstrap for the Aalen–Johansen estimator of the cumulative incidence function of a competing risk. Scand J Stat. 2013;40(3):387–402. [Google Scholar]

[CR14] Bickel PJ, Freedman DA. Some asymptotic theory for the bootstrap. Ann Stat. 1981;9(6):1196–1217. [Google Scholar]

[CR15] Bie O, Borgan Ø, Liestøl K. confidence intervals and confidence bands for the cumulative hazard rate function and their small sample properties. Scand J Stat. 1987;14(3):221–233. [Google Scholar]

[CR16] Bluhmki T, Peter RS, Rapp K, König H-H, Becker C, Lindlbauer I, Rothenbacher D, Beyersmann J, Büchele G. Understanding mortality of femoral fractures following low-impact trauma in persons with and without care need. J Am Med Dir Assoc. 2017;18(3):221–226. doi: 10.1016/j.jamda.2016.08.022. [DOI] [PubMed] [Google Scholar]

[CR17] Chen Wei, Wang Dehui, Li Yanfeng. A class of tests of proportional hazards assumption for left-truncated and right-censored data. J Appl Stat. 2015;42(11):2307–2320. [Google Scholar]

[CR18] Cox D. Regression models and life tables (with discussion) J R Stat Soc. 1972;34:187–220. [Google Scholar]

[CR19] Curley MAQ, Wypij D, Watson RS, Grant MJC, Asaro LA, Cheifetz IM, Dodson BL, Franck LS, Gedeit RG, Angus DC, Matthay MA. Protocolized sedation vs usual care in pediatric patients mechanically ventilated for acute respiratory failure: a randomized clinical trial. J Am Med Assoc. 2015;313(4):379–389. doi: 10.1001/jama.2014.18399. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR20] Dabrowska DM, Ho W-T. Confidence bands for comparison of transition probabilities in a Markov chain model. Lifetime Data Anal. 2000;6(1):5–21. doi: 10.1023/a:1009681332533. [DOI] [PubMed] [Google Scholar]

[CR21] de Wit M, Gennings C, Jenvey WI, Epstein SK. Randomized trial comparing daily interruption of sedation and nursing-implemented sedation algorithm in medical intensive care unit patients. Crit Care. 2008;12(3):R70. doi: 10.1186/cc6908. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR22] Dobler D, Pauly M. Bootstrapping Aalen–Johansen processes for competing risks: handicaps, solutions, and limitations. Electron J Stat. 2014;8(2):2779–2803. [Google Scholar]

[CR23] Dobler D, Beyersmann J, Pauly M. Non-strange weird resampling for complex survival data. Biometrika. 2017;104(3):699–711. [Google Scholar]

[CR24] Dudek A, Goćwin M, Leśkow J. Simultaneous confidence bands for the integrated hazard function. Comput Stat. 2008;23(1):41–62. [Google Scholar]

[CR25] Eefting M, de Wreede LC, Halkes CJM, Peter A, Kersting S, Marijt EWA, Veelken H, Putter H, Schetelig J, Falkenburg JHF. Multi-state analysis illustrates treatment success after stem cell transplantation for acute myeloid leukemia followed by donor lymphocyte infusion. Haematologica. 2016;101(4):506–514. doi: 10.3324/haematol.2015.136846. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR26] Efron B. Bootstrap methods: another look at the jackknife. Ann Stat. 1979;7(1):1–26. [Google Scholar]

[CR27] Efron B. Censored data and the bootstrap. J Am Stat Assoc. 1981;76(374):312–319. [Google Scholar]

[CR28] Gill R, Schumacher M. A simple test of the proportional hazards assumption. Biometrika. 1987;74(2):289–300. [Google Scholar]

[CR29] Gill RD, Johansen S. A survey of product-integration with a view toward application in survival analysis. Ann Stat. 1990;18(4):1501–1555. [Google Scholar]

[CR30] Gillam MH, Ryan P, Salter A, Graves SE. Multi-state models and arthroplasty histories after unilateral total hip arthroplasties. Acta Orthop. 2012;83(3):220–226. doi: 10.3109/17453674.2012.684140. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR31] Glidden DV. Robust inference for event probabilities with non-Markov event data. Biometrics. 2002;58(2):361–368. doi: 10.1111/j.0006-341x.2002.00361.x. [DOI] [PubMed] [Google Scholar]

[CR32] Good PI. Permutation, parametric, and bootstrap tests of hypotheses. 3. New York: Springer; 2005. [Google Scholar]

[CR33] Grambsch PM, Therneau TM. Proportional hazards tests and diagnostics based on weighted residuals. Biometrika. 1994;81(3):515–526. [Google Scholar]

[CR34] Hall P, Wilson SR. Two guidelines for bootstrap hypothesis testing. Biometrics. 1991;47(2):757–762. [Google Scholar]

[CR35] Hall WJ, Wellner JA. Confidence bands for a survival curve from censored data. Biometrika. 1980;67(1):133–143. [Google Scholar]

[CR36] Hess KR. Graphical methods for assessing violations of the proportional hazards assumption in Cox regression. Stat Med. 1995;14(15):1707–1723. doi: 10.1002/sim.4780141510. [DOI] [PubMed] [Google Scholar]

[CR37] Hieke S, Bertz H, Dettenkofer M, Schumacher M, Beyersmann J. Initially fewer bloodstream infections for allogeneic vs. autologous stem-cell transplants in neutropenic patients. Epidemiol Infect. 2013;141(01):158–164. doi: 10.1017/S0950268812000283. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR38] Horvath L, Yandell BS. Convergence rates for the bootstrapped product-limit process. Ann Stat. 1987;15(3):1155–1173. [Google Scholar]

[CR39] Hyun S, Sun Y, Sundaram R. Assessing cumulative incidence functions under the semiparametric additive risk model. Stat Med. 2009;28(22):2748–2768. doi: 10.1002/sim.3640. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR40] Iacus SM (2014) sde: simulation and inference for stochastic differential equations. https://CRAN.R-project.org/package=sde. R package version 2.0.13. Accessed 8 Nov 2017

[CR41] Janssen A, Pauls T. How do bootstrap and permutation tests work? Ann Stat. 2003;31(3):768–806. [Google Scholar]

[CR42] Jepsen P, Vilstrup H, Andersen PK. The clinical course of cirrhosis: the importance of multistate models and competing risks analysis. Hepatology. 2015;62(1):292–302. doi: 10.1002/hep.27598. [DOI] [PubMed] [Google Scholar]

[CR43] Koziol JA, Byar DP. Percentage points of the asymptotic distributions of one and two sample K-S statistics for truncated or censored data. Technometrics. 1975;17(4):507–510. [Google Scholar]

[CR44] Kraus D. Data-driven smooth tests of the proportional hazards assumption. Lifetime Data Anal. 2007;13(1):1–16. doi: 10.1007/s10985-006-9027-8. [DOI] [PubMed] [Google Scholar]

[CR45] Lin DY. Goodness-of-fit analysis for the Cox regression model based on a class of parameter estimators. J Am Stat Assoc. 1991;86(415):725–728. [Google Scholar]

[CR46] Lin DY. Non-parametric inference for cumulative incidence functions in competing risks studies. Stat Med. 1997;16(8):901–910. doi: 10.1002/(sici)1097-0258(19970430)16:8<901::aid-sim543>3.0.co;2-m. [DOI] [PubMed] [Google Scholar]

[CR47] Lin DY, Wei LJ, Ying Z. Checking the Cox model with cumulative sums of martingale-based residuals. Biometrika. 1993;80(3):557–572. [Google Scholar]

[CR48] Lin DY, Fleming TR, Wei LJ. Confidence bands for survival curves under the proportional hazards model. Biometrika. 1994;81(1):73–81. [Google Scholar]

[CR49] Lin DY, Wei LJ, Yang I, Ying Z. Semiparametric regression for the mean and rate functions of recurrent events. J R Stat Soc Ser B (Stat Methodol) 2000;62(4):711–730. [Google Scholar]

[CR50] Liu L, Logan B, Klein JP. Inference for current leukemia free survival. Lifetime Data Anal. 2008;14(4):432–446. doi: 10.1007/s10985-008-9093-1. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR51] Liu RY. Bootstrap procedures under some non-iid models. Ann Stat. 1988;16(4):1696–1708. [Google Scholar]

[CR52] Lo SH, Singh K. The product-limit estimator and the bootstrap: some asymptotic representations. Probab Theory Relat Fields. 1986;71(3):455–465. [Google Scholar]

[CR53] Mayet A, Legleye S, Falissard B, Chau N. Cannabis use stages as predictors of subsequent initiation with other illicit drugs among French adolescents: use of a multi-state model. Addict Behav. 2012;37(2):160–166. doi: 10.1016/j.addbeh.2011.09.012. [DOI] [PubMed] [Google Scholar]

[CR54] Pauly M, Brunner E, Konietschke F. Asymptotic permutation tests in general factorial designs. J R Stat Soc Ser B (Stat Methodol) 2015;77(2):461–473. [Google Scholar]

[CR55] R Core Team (2016) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/. Accessed 8 Nov 2017

[CR56] Sauaia A, Moore EE, Johnson JL, Ciesla DJ, Biffl WL. Validation of postinjury multiple organ failure scores. Shock. 2009;31(5):438–447. doi: 10.1097/SHK.0b013e31818ba4c6. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR57] Scheike TH, Martinussen T. On estimation and tests of time-varying effects in the proportional hazards model. Scand J Stat. 2004;31(1):51–62. [Google Scholar]

[CR58] Scheike TH, Zhang M-J. Extensions and applications of the Cox–Aalen survival model. Biometrics. 2003;59(4):1036–1045. doi: 10.1111/j.0006-341x.2003.00119.x. [DOI] [PubMed] [Google Scholar]

[CR59] Schmoor C, Schumacher M, Finke J, Beyersmann J. Competing risks and multistate models. Clin Cancer Res. 2013;19(1):12–21. doi: 10.1158/1078-0432.CCR-12-1619. [DOI] [PubMed] [Google Scholar]

[CR60] Schoenfeld DA, Bernard GR, ARDS Network. Statistical evaluation of ventilator-free days as an efficacy measure in clinical trials of treatments for acute respiratory distress syndrome. Crit Care Med. 2002;30(8):1772–1777. doi: 10.1097/00003246-200208000-00016. [DOI] [PubMed] [Google Scholar]

[CR61] Schumacher M. Two-sample tests of Cramér-von Mises- and Kolmogorov–Smirnov-type for randomly censored data. Int Stat Rev. 1984;52(3):263–281. [Google Scholar]

[CR62] Munoz-Price LS, Frencken JF, Tarima S, Bonten M. Handling time-dependent variables: antibiotics and antibiotic resistance. Clin Infect Dis. 2016;62(12):1558–1563. doi: 10.1093/cid/ciw191. [DOI] [PubMed] [Google Scholar]

[CR63] Shu Y, Klein JP, Zhang M-J. Asymptotic theory for the Cox semi-Markov illness-death model. Lifetime Data Anal. 2007;13(1):91–117. doi: 10.1007/s10985-006-9018-9. [DOI] [PubMed] [Google Scholar]

[CR64] Stewart RM, Park PK, Hunt JP, McIntyre RC Jr, McCarthy J, Zarzabal LA, Michalek JE; for the NIH/NHLBI ARDS Clinical Trials Network. Less is more: improved outcomes in surgical patients with conservative fluid administration and central venous catheter monitoring. J Am Coll Surg. 2009;208(5):725–735. [DOI] [PubMed]

[CR65] Trof RJ, Beishuizen A, Cornet AD, de Wit RJ, Girbes ARJ, Groeneveld ABJ. Volume-limited versus pressure-limited hemodynamic management in septic and nonseptic shock. Crit Care Med. 2012;40(4):1177–1185. doi: 10.1097/CCM.0b013e31823bc5f9. [DOI] [PubMed] [Google Scholar]

[CR66] von Cube M, Schumacher M, Wolkewitz M. Basic parametric analysis for a multi-state model in hospital epidemiology. BMC Med Res Methodol. 2017;17(1):111. doi: 10.1186/s12874-017-0379-4. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR67] Wellek S. Testing statistical hypotheses of equivalence and noninferiority. 2nd ed. Boca Raton: CRC Press; 2010. [Google Scholar]

[CR68] Whitt W. Some useful functions for functional limit theorems. Math Oper Res. 1980;5(1):67–85. [Google Scholar]

[CR69] Wolkewitz M, Vonberg RP, Grundmann H, Beyersmann J, Gastmeier P, Bärwolff S, Geffers C, Behnke M, Rüden H, Schumacher M. Risk factors for the development of nosocomial pneumonia and mortality on intensive care units: application of competing risks models. Crit Care. 2008;12(2):R44. doi: 10.1186/cc6852. [DOI] [PMC free article] [PubMed] [Google Scholar]

[CR70] Wu CFJ. Jackknife, bootstrap and other resampling methods in regression analysis. Ann Stat. 1986;14(4):1261–1295. [Google Scholar]
