Abstract
This article deals with studies that monitor occurrences of a recurrent event for n subjects or experimental units. It is assumed that the ith unit is monitored over a random period [0, τi]. The successive inter-event times Ti1, Ti2, ..., are assumed independent of τi. The random number of event occurrences over the monitoring period is Ki = max{k ∈ {0, 1, 2, ...} : Ti1 + Ti2 + ... + Tik ≤ τi}. The Tijs are assumed to be i.i.d. from an unknown distribution function F which belongs to a parametric family of distributions {F(·; θ) : θ ∈ Θ}. The τis are assumed to be i.i.d. from an unknown distribution function G. The problem of estimating θ, and consequently the distribution F, is considered under the assumption that the τis are informative about the inter-event distribution. Specifically, 1 − G = (1 − F)^β for some unknown β > 0, a generalized Koziol-Green (cf., Koziol and Green (1976); Chen, Hollander, and Langberg (1982)) model. Asymptotic properties of estimators of θ, β, and F are presented. Efficiencies of estimators of θ and F are ascertained relative to estimators which ignore the informative monitoring aspect. These comparisons reveal the gain in efficiency when the informative structure of the model is exploited. Concrete demonstrations are given for F exponential and for a two-parameter Weibull.
Keywords: Counting processes, efficiency comparisons, exponential inter-event times, generalized Koziol-Green model, martingales, Weibull inter-event times
1 Introduction
The parametric, semiparametric, and nonparametric estimation of the distribution function of an event time has been the subject of intense research in past decades, especially in settings where there is at most one observed event (so-called single-event settings) during the monitoring period per experimental unit. Among the seminal papers dealing with this problem are those of Kaplan and Meier (1958), Efron (1967), Cox (1972), Breslow and Crowley (1974), Aalen (1978), and Borgan (1984); see the books of Fleming and Harrington (1991), Andersen, Borgan, Gill, and Keiding (1993), Kalbfleisch and Prentice (2002), and Aalen, Borgan, and Gjessing (2008). The situation where the event is recurrent, so that there could be more than one event occurrence per unit, has also been dealt with, albeit not yet as thoroughly as the single-event case. In the recurrent event setting, the estimation problem has been considered by Gill (1980, 1981), Vardi (1982a,b), Wang and Chang (1999), and Peña, Strawderman, and Hollander (2001). Gill (1981) dealt with the problem of nonparametric inference for renewal processes in a life testing setting. Vardi (1982a) presented an algorithm for obtaining the maximum likelihood (ML) estimator of the survivor function when the underlying interoccurrence times are arithmetic. Sellke (1988), in the case of a single unit, considered the problem of establishing weak convergence of a Nelson-Aalen-type estimator when the length of the monitoring period increases without bound. Peña et al. (2001) proposed Nelson-Aalen and Kaplan-Meier-type estimators and derived their asymptotic properties when the number of units increases but with the monitoring time for each unit being finite with probability one, in contrast to the setting in Sellke (1988) where the monitoring time increases to infinity.
An important issue that arose in the single-event setting is the impact of an informative censoring mechanism. An analytically tractable informative random censorship model was proposed by Koziol and Green (1976) which assumes proportional hazards for the event time and the censoring time. This model was utilized by Chen et al. (1982) to study exact properties of the Kaplan-Meier estimator. Cheng and Lin (1987) also utilized this model to derive an estimator of the survivor function which exploits the informative censoring structure, and showed that their estimator is more efficient than the Kaplan and Meier (1958) estimator, especially under heavy censoring. Hollander and Peña (1989) also used this Koziol-Green model to obtain a more efficient class of confidence bands for the survivor function.
There are many situations, however, in the engineering, actuarial, biomedical, public health, social and economic sciences, as well as in business, where the event of interest is recurrent. Examples of such events are machine (mechanical or electronic) malfunction, non-life insurance claim, onset of depression, heart attack, economic recession, marital strife, and commission of a criminal act. In this recurrent event setting, the impact of an informative monitoring period has not been examined extensively. The purpose of this article is to study that impact, especially in terms of efficiency gains and losses in the estimation of the inter-event time parameter and distribution. As pointed out in Peña et al. (2001) and Peña and Hollander (2004), recurrent event data have additional features that require attention in performing statistical inference. Two of these important features are: (i) because of the sum-quota data accrual scheme, the number of observed event occurrences is informative about the inter-event distribution even if G is unrelated to F; and (ii) the variable that right-censors the last inter-event time at the end of the monitoring period is dependent on the previous inter-event times. Thus, there is both informative and dependent censoring in recurrent event data. Because of these additional features, there is a need to study the additional impact of having a G informative about F in the estimation of F or its parameters, in particular the efficiency gain when the informative structure is exploited.
Several models have been proposed for informative censoring. William (1989) proposed a model where the censoring variable is related to the frailty of the individual. He showed in particular that in the case of exponential frailty the use of the Kaplan-Meier estimator can lead to errors in estimating the survivor probability. Wang, Qin, and Chiang (2001) proposed various models where the occurrence of recurrent events is modeled by a subject-specific nonstationary Poisson process via a latent variable. Siannis (2004) considered a parametric model where the parameter represents the level of dependence between the failure and the censoring process. In this article we employ a generalization to the recurrent event setting of the model studied in Koziol and Green (1976), the so-called Koziol-Green (KG) model. This KG model has been the one most utilized in studying efficiency aspects under informative censoring in single-event settings; see for instance Chen et al. (1982), which obtains exact properties of the Kaplan-Meier estimator under this model, and Cheng and Lin (1987), which derives an estimator of the survivor function utilizing the informative structure. We point out that, just as in the case of the single-event setting, the utility of the proposed generalized KG model is not primarily to provide a practical and realistic model, but rather to provide a medium in which to examine analytically properties of inference procedures with recurrent event data.
The major goal of this article is to obtain estimators of the inter-event time distribution and its parameter for this generalized KG model and to ascertain the loss in efficiency if one ignores the informative structure. An outline of this article is as follows. Section 2 introduces relevant processes, describes the generalized KG model and its properties, and develops the estimators. The framework of stochastic processes is adopted to gain generality. Section 3 deals with asymptotic properties of the estimators under the KG model and of those estimators derived by ignoring the KG assumption. Section 4 performs efficiency comparisons of the estimators that exploit the informative structure relative to those derived ignoring the structure. In particular, the efficiency of a fully nonparametric estimator of the inter-event distribution is examined. Section 5 presents the results of simulation studies which examine small- to moderate-sample properties of estimators for models in which closed-form analytical expressions are not possible, specifically when the inter-event distribution is a two-parameter Weibull. Finally, Section 6 provides some concluding remarks.
2 Model of Interest and Estimators
2.1 Random Entities
All random entities are defined on a basic probability space (Ω, ℱ, P). We suppose that there are n subjects in the study. For the ith subject, {Sij, j = 1, 2, ...} are the successive calendar times of event occurrences, while {Tij, j = 1, 2, ...} are the successive inter-event times. Thus, we have Si0 = 0, Sij = Ti1 + Ti2 + ... + Tij, and Tij = Sij – Si,j–1. The Tijs are assumed to be i.i.d. nonnegative r.v.s with a common absolutely continuous distribution function F. In this paper we restrict attention to the i.i.d. inter-event times setting; the possibly more relevant model for biostatistical applications with correlated inter-event times, specifically with the association induced by frailty components, will be dealt with in a separate paper.
The renewal function associated with F is
\[ \rho_F(t; \theta) = \sum_{n=1}^{\infty} F^{*(n)}(t; \theta), \tag{1} \]
where F*(n) is the nth convolution of F, the distribution of Sin. We assume that . The hazard rate function of F(·; θ) is
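\[ \lambda_F(t; \theta) = \frac{f(t; \theta)}{\bar{F}(t; \theta)}, \]
where f(·; θ) is the density function of F(·; θ) and F̄ = 1 – F (analogously, Ḡ = 1 – G). The cumulative hazard function is
\[ \Lambda_F(t; \theta) = \int_0^t \lambda_F(w; \theta)\, dw = -\log \bar{F}(t; \theta) \]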
(similarly λG and ΛG). For subject i, i = 1, 2, ... , n, the recurrent event is observed over a random period [0, τi], where the τis represent the end of the monitoring periods and are i.i.d. according to some distribution function G. These τis may also be viewed as the right-censoring variables, though the right-censoring structure is somewhat different from the usual right-censoring structure with a single event per unit, as explained in the next paragraph. Furthermore, the τis and Tijs are mutually independent. With Z+,0 = {0, 1, 2, ...}, the (random) number of event occurrences observed for the ith unit is Ki = max{k ∈ Z+,0 : Sik ≤ τi}. Therefore, the random observables for the ith subject are represented by the vector
\[ D_i = (K_i, \tau_i, T_{i1}, T_{i2}, \ldots, T_{iK_i}, \tau_i - S_{iK_i}). \tag{2} \]
Let us examine in more detail the censoring structure for this model. Note that there will always be one right-censored inter-event time per unit. This inter-event time, denoted by Ti,Ki+1, is right-censored by the variable τi – SiKi. Observe that the right-censoring variable depends on the previous inter-event times through SiKi, and in fact, since both Ti,Ki+1 and SiKi depend on the random variable Ki, then they are dependent. More interestingly, observe that Ti,Ki+1 has a distribution different from Ti1 owing to the randomness of Ki. This is evident from the resulting sum-quota constraint given by
\[ S_{iK_i} = \sum_{j=1}^{K_i} T_{ij} \le \tau_i < \sum_{j=1}^{K_i+1} T_{ij} = S_{i,K_i+1}, \]
and, in fact, Ti,Ki+1 is stochastically larger than Ti1, a length-biased sampling phenomenon. The intuition is that the inter-event time that covers the end of the monitoring period tends to be longer, similar to the well-known phenomenon that a passenger arriving at a bus stop is more likely to arrive during a long gap between buses than during a short one. In this recurrent event model we thus have the situation where the right-censoring structure is both dependent and informative. Observe that the sample space for Di is the subset
such that d =(k, τ, t1, t2, ... , tk, c) ∈ satisfies .
2.2 Generalized KG Model for Recurrent Events
The generalized Koziol-Green model for this recurrent event setting postulates that there exists a β > 0 such that Ḡ(t; θ, β) = F̄(t; θ)^β. This condition is equivalent to ΛG(·; θ, β) = βΛF(·; θ). In the single-event setting, the parameter β is referred to as the censoring parameter since Pr(τi < Ti) = β/(1 + β) and τi right-censors Ti. In contrast, for the recurrent event setting, the parameter β determines the length of the monitoring period relative to the inter-event times. For example, when F(t; θ) = 1 − exp(−θt) for t ≥ 0, then
\[ \bar{G}(t; \theta, \beta) = \exp(-\beta\theta t), \quad t \ge 0, \qquad \text{so that} \qquad E(\tau_i) = \frac{1}{\theta\beta} = \frac{E(T_{i1})}{\beta}. \]
More precisely, for this exponential model, 1/β is the expected number of event occurrences during the monitoring period. This is so since, given τi, the expected number of event occurrences over [0, τi] is θτi, and the result follows since E(τi) = 1/(θβ). For this recurrent event setting, the case where β ∈ (0, 1] is of more practical relevance since it leads to more observed recurrences, though β > 1 is also an admissible value. The property Pr(τi < Ti1) = β/(1 + β) still holds true for this recurrent event setting, but it seems inappropriate to refer to β in this setting as the ‘censoring parameter’ since as pointed out earlier, the effective right-censoring variable for Ti,Ki+1 is τi − SiKi, whose distribution not only depends on β but also on F. Thus, it is perhaps more proper to refer to β as the ‘monitoring parameter!’
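The two properties just stated for the exponential model, E(Ki) = 1/β and Pr(τi < Ti1) = β/(1 + β), can be checked empirically. The following is a minimal simulation sketch, not code from the original study; the function names `simulate_unit` and `simulate_study` and the chosen parameter values are illustrative assumptions.

```python
import random


def simulate_unit(theta, beta, rng):
    """Simulate one unit under the exponential generalized KG model.

    Assumptions of this sketch: inter-event times are exponential with
    rate theta, and the monitoring time tau is exponential with rate
    beta * theta, so that 1 - G = (1 - F) ** beta.
    Returns (K, tau, observed_gaps, censored_gap).
    """
    tau = rng.expovariate(beta * theta)
    gaps = []
    elapsed = 0.0
    while True:
        t = rng.expovariate(theta)
        if elapsed + t > tau:
            # This gap covers the end of the monitoring period, so it is
            # right-censored by tau - S_{iK_i}.
            break
        gaps.append(t)
        elapsed += t
    return len(gaps), tau, gaps, tau - elapsed


def simulate_study(n, theta, beta, seed=0):
    """Simulate n independent units with a reproducible seed."""
    rng = random.Random(seed)
    return [simulate_unit(theta, beta, rng) for _ in range(n)]


if __name__ == "__main__":
    data = simulate_study(n=20000, theta=2.0, beta=0.5, seed=1)
    mean_k = sum(d[0] for d in data) / len(data)
    frac_no_event = sum(1 for d in data if d[0] == 0) / len(data)
    # mean_k should be close to 1 / beta, and frac_no_event close to
    # beta / (1 + beta), up to Monte Carlo error.
    print(mean_k, frac_no_event)
```

Each simulated record mirrors the observable vector in (2): the count, the monitoring time, the complete gap times, and the censored last gap, which together satisfy the sum-quota constraint.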
A crucial independence property that was exploited in Chen et al. (1982) and the other papers dealing with this model in the single-event setting is the fact that min(τi, Ti) and I{Ti ≤ τi} are independent if and only if the KG model holds (cf., Allen (1963)). This allowed the derivation of the exact means and variances of the Kaplan-Meier estimator in Chen et al. (1982), and also facilitated the derivation of the semiparametric estimator of the survivor function in Cheng and Lin (1987). For the recurrent event setting in this paper, such an independence property does not come directly into play. For single-event settings, it has been pointed out, e.g., Csörgő and Faraway (1998), that the KG model may not occur much in practice, though its utility is more from the theoretical point of view as it provides a model allowing for the analytical examinations of properties of procedures. Our proposing the generalized KG model for recurrent event data is also not meant primarily as a practical model, but rather it is meant to provide a model specimen for the examination of analytical properties of procedures utilizing recurrent event data.
2.3 Estimators
The first goal of this article is, given the data in (2), to develop estimators of θ and β, and consequently, an estimator of F. For generality we adopt the approach of counting processes and martingales (cf., Fleming and Harrington (1991) and Andersen et al. (1993)).
For i = 1, 2, ..., n and s ∈ , let
| (3) |
The processes in (3) counts the number of failures for unit i at time s; indicates whether unit i is still under observation at time s; and indicates whether unit i is past its monitoring time at time s, respectively. We augment the probability space by the filtration F = {}, where is given by
with representing the σ-field containing all the information available at time 0, and is the σ-field containing all information generated on all subjects up to time s. Also, for each i = 1, 2, ..., n, define the backward recurrence process via
| (4) |
which is the time elapsed at time s since the last event occurrence. This is an F-adapted and a left-continuous process, hence is an F-predictable process. For s ∈ , let
| (5) |
From stochastic integration theory, and are, respectively, the compensators of and , so
| (6) |
are, for each i, square-integrable F-martingales. From the results of Jacod (1974/75), the full likelihood process is, for s ∈ ,
| (7) |
Taking logarithm we obtain the log-likelihood process
| (8) |
For a vector a, let , and define the operator ∇θ = ∂/∂θ ≡ (∂/∂θj,j = 1, 2, ..., p)t. Let
| (9) |
Under the regularity condition that the order of differentiation with respect to θ and β and integration with respect to Lebesgue measure can be interchanged, we obtain the score processes for θ and β by taking the derivatives of (8) with respect to θ and β, respectively, to be
| (10) |
| (11) |
To make notation compact, let
| (12) |
Then, the vector of score processes becomes
| (13) |
Let s* ∈ . Equating the score vector in (13), evaluated at s = s*, to 0 and solving for θ and β, we obtain the maximum likelihood (ML) estimators and of θ and β, respectively. In general, it is not possible to obtain closed-form expressions for these estimators, so in practice numerical methods such as the Newton-Raphson (NR) algorithm or the Nelder-Mead simplex algorithm could be used to obtain the ML estimates.
The observed Fisher information process, which is the negative of the second partial derivatives with respect to the parameters is
| (14) |
where, with
we have
Provided that the matrix inverse of the observed Fisher information matrix exists, the Newton-Raphson iteration below for numerically computing the parameter estimates may be implemented.
A more convenient computational implementation is achieved when we let s* → ∞ since the score equation for the ML estimator of β satisfies
| (15) |
As a consequence, the estimator of θ, denoted by , satisfies
| (16) |
since . Equation (16) may be solved through numerical methods, e.g., Newton-Raphson iteration.
Having obtained the estimate of θ, then the parametric estimator of F̄ is provided by
| (17) |
We will obtain the asymptotic distribution of this estimator in Section 3 and compare this with a fully nonparametric estimator in Section 4.
2.4 Computational Forms
To facilitate the numerical implementation of the procedures to obtain estimates, we observe the forms below for the integrals with respect to the martingale processes. Let which could possibly depend also on (θ, β), and
Then, we have the following identities.
These expressions are used in re-expressing the score functions as s → ∞. With regards to the terms in the observed Fisher information, we also observe the following identities:
with .
2.4.1 Exponential Inter-Event Times
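Specializing to the case of exponentially distributed inter-event times with λF(t; θ) = θ, so that ϕF(t; θ) = 1/θ, we find that the score functions for θ and β, as s → ∞, are
\[ U_\theta(\infty; \theta, \beta) = \frac{1}{\theta}\Big(\sum_{i=1}^n K_i + n\Big) - (1+\beta)\sum_{i=1}^n \tau_i, \qquad U_\beta(\infty; \theta, \beta) = \frac{n}{\beta} - \theta \sum_{i=1}^n \tau_i. \tag{18} \]
Equating both to zero and solving for θ and β yields the estimators
\[ \hat{\theta} = \frac{\sum_{i=1}^n K_i}{\sum_{i=1}^n \tau_i}, \qquad \hat{\beta} = \frac{n}{\sum_{i=1}^n K_i}, \tag{19} \]
so the ML estimator of θ is just the occurrence-exposure rate.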
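For the exponential case the estimators need only the event counts Ki and the monitoring times τi. The sketch below computes the occurrence-exposure rate for θ and the ratio n/ΣKi for β, and evaluates the two score functions so that one can confirm they vanish at the estimates. The function names are illustrative, and the score expressions are those implied by the exponential log-likelihood Σ[Ki log θ − θτi + log(βθ) − βθτi] as s → ∞.

```python
def kg_exponential_mle(counts, taus):
    """ML estimates (theta_hat, beta_hat) under the exponential KG model.

    counts[i] is the number of events K_i observed for unit i and
    taus[i] is its monitoring time tau_i.
    """
    n = len(counts)
    if n == 0 or n != len(taus):
        raise ValueError("counts and taus must be non-empty and equal in length")
    total_events = sum(counts)
    total_time = sum(taus)
    if total_events == 0 or total_time <= 0:
        raise ValueError("need at least one event and positive exposure")
    theta_hat = total_events / total_time   # occurrence-exposure rate
    beta_hat = n / total_events             # reciprocal of mean events per unit
    return theta_hat, beta_hat


def kg_exponential_score(theta, beta, counts, taus):
    """Score functions (U_theta, U_beta) of the exponential KG model."""
    n = len(counts)
    total_events = sum(counts)
    total_time = sum(taus)
    u_theta = (total_events + n) / theta - (1.0 + beta) * total_time
    u_beta = n / beta - theta * total_time
    return u_theta, u_beta


if __name__ == "__main__":
    counts = [2, 0, 3]
    taus = [1.0, 0.5, 2.5]
    theta_hat, beta_hat = kg_exponential_mle(counts, taus)
    print(theta_hat, beta_hat)
    print(kg_exponential_score(theta_hat, beta_hat, counts, taus))
```

Note that the estimate of β is undefined when no events are observed at all, which is why the sketch raises an error in that case.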
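For this exponential case, with K̄ = (1/n)(K1 + ... + Kn) and τ̄ = (1/n)(τ1 + ... + τn), we also obtain the observed Fisher information matrix to be
\[ n \begin{pmatrix} (\bar{K} + 1)/\theta^2 & \bar{\tau} \\ \bar{\tau} & 1/\beta^2 \end{pmatrix}. \tag{20} \]
Dividing by n and taking the limit as n → ∞, noting that K̄ converges in probability to E(Ki) = 1/β and τ̄ converges in probability to E(τi) = 1/(θβ), the in-probability limiting matrix is
\[ \begin{pmatrix} \dfrac{1+\beta}{\beta\theta^2} & \dfrac{1}{\theta\beta} \\ \dfrac{1}{\theta\beta} & \dfrac{1}{\beta^2} \end{pmatrix}. \]
The inverse of this matrix is
\[ \Sigma(\infty; \theta, \beta) = \begin{pmatrix} \beta\theta^2 & -\theta\beta^2 \\ -\theta\beta^2 & \beta^2(1+\beta) \end{pmatrix}. \tag{21} \]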
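The matrix inversion for the exponential case is easy to verify numerically. The sketch below assumes the limiting per-unit information matrix with entries (1 + β)/(βθ²), 1/(θβ), and 1/β², which follows from E(Ki) = 1/β and E(τi) = 1/(θβ), and compares its numerical inverse with the closed form whose (1, 1) element is βθ². Function names are illustrative.

```python
def limiting_information(theta, beta):
    """Per-unit limiting Fisher information for (theta, beta), exponential case."""
    return [
        [(1.0 + beta) / (beta * theta ** 2), 1.0 / (theta * beta)],
        [1.0 / (theta * beta), 1.0 / beta ** 2],
    ]


def invert_2x2(m):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def asymptotic_covariance(theta, beta):
    """Closed-form inverse of the limiting information matrix."""
    return [
        [beta * theta ** 2, -theta * beta ** 2],
        [-theta * beta ** 2, beta ** 2 * (1.0 + beta)],
    ]


if __name__ == "__main__":
    numeric = invert_2x2(limiting_information(2.0, 0.5))
    closed = asymptotic_covariance(2.0, 0.5)
    print(numeric)
    print(closed)
```

The determinant of the information matrix simplifies to 1/(β³θ²), which is positive for all admissible θ and β, so the inverse always exists in this model.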
2.4.2 Weibull Inter-Event Times
In this subsubsection we present the estimation procedure when F is Weibull with shape parameter θ1 and scale parameter θ2, so that θ = (θ1, θ2)t ∈ . For this case we have
| (22) |
We observe that given θ = (θ1, θ2) and β in this Weibull situation, the ratio of E(τ) to E(T1) equals 1/β^{1/θ1}. Thus, if one desires approximately k0 observed recurrences per unit, then the monitoring parameter could be chosen approximately equal to 1/k0^{θ1}.
From (22), it follows that
| (23) |
The second equation arising in (16) simplifies to
where, for brevity of notation but with a slight conflict with our earlier usage of Ti,Ki+1, from hereon we shall let
Consequently, in terms of is given by
| (24) |
where
Let us then define, for i = 1, 2, ... , n,
Substituting the expression in (24) in the first equation arising in (16), we obtain that is the solution in θ1 of the equation
| (25) |
The solution to this one-dimensional equation may be obtained numerically by plotting, direct search, or a Newton-Raphson iteration. The last approach is the one implemented in the simulations for Weibull inter-event times. For the purpose of implementing this procedure, say in R, define the function via
In terms of the w-function, we also have
Furthermore, define via
where
Then, in terms of this r-function,
Define the function according to
| (26) |
so that solves q() = 0 from (25). The derivative of this q-function is
| (27) |
Using these function definitions, the Newton-Raphson iteration for obtaining the estimate of the Weibull shape parameter θ1 is given by
| (28) |
Upon obtaining the estimate of θ1, the estimates of the scale parameter θ2 and the monitoring parameter β could then be obtained using (24) for and, from (15),
| (29) |
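The Newton-Raphson scheme in (28) is the standard one-dimensional update θ1 ← θ1 − q(θ1)/q′(θ1). The sketch below is a generic version of that update, written in Python rather than R; the callables `q` and `dq` stand for the q-function in (26) and its derivative in (27), which the user would have to supply, and the quadratic used in the demonstration is only a stand-in, not the Weibull score.

```python
def newton_raphson(q, dq, x0, tol=1e-10, max_iter=100):
    """Solve q(x) = 0 by Newton-Raphson starting from x0.

    q and dq are callables returning the function value and its
    derivative. Raises if the derivative vanishes or if the iteration
    does not converge within max_iter steps.
    """
    x = x0
    for _ in range(max_iter):
        slope = dq(x)
        if slope == 0:
            raise ZeroDivisionError("derivative vanished; try another start")
        step = q(x) / slope
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton-Raphson did not converge")


if __name__ == "__main__":
    # Stand-in example: the positive root of x^2 - 2.
    root = newton_raphson(lambda x: x * x - 2.0, lambda x: 2.0 * x, x0=1.0)
    print(root)
```

In practice a starting value for the shape parameter could come from a plot of q or a coarse grid search, the two other approaches the text mentions.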
3 Asymptotic Properties
In this section we study the asymptotic properties of the ML estimators. We will make use of the results in Borgan (1984), which deals with the consistency and asymptotic normality of ML estimators in parametric counting process models. We will consider the case where the number of subjects is increasing to infinity (n → ∞) in contrast to the situation where only one subject is considered and the time of monitoring increases to infinity (τ → ∞) as in Sellke (1988). Some of the regularity conditions in Borgan (1984) will be reformulated in terms of gap-times, which will enable obtaining more useful analytical conditions.
3.1 Reformulated Processes
Following an idea exploited in Sellke (1988) and also in Peña et al. (2001), we define stochastic processes via
| (30) |
The first time parameter corresponds to calendar time, while the second time parameter represents gap or inter-event time. Note that Zi(s, t) indicates whether at calendar time s at most t units of time have elapsed since the last event occurrence. Because it is F-adapted and has left-continuous paths, Zi(·, t) is a bounded F-predictable process.
To facilitate our asymptotic analysis we introduce a generalized likelihood process involving two time indices defined via
| (31) |
Notice here that the likelihood in Sellke (1988) cannot be used directly to derive asymptotic properties because it contains no informative censoring part: there, the distribution of the censoring time does not depend on that of the inter-event time. The most important difference between the two likelihoods is thus the contribution of the censored observations. Observe that L(s; θ, β) = limt→∞ LG(s, t; θ, β) since limt→∞ Zi(s, t) = 1. Therefore, functions and estimators derived from L(s; θ, β) are limits of analogous functions and estimators obtained from LG(s, t; θ, β) as t → ∞. As we will see, however, dealing with LG is more convenient analytically.
Taking the logarithm of LG to obtain the generalized log-likelihood lG = log LG, and then the partial derivatives of lG with respect to θ and β, we obtain the generalized score process
| (32) |
where Hi is as defined in (12), and Mi(s, t; θ, β) = with
If (θ, β) are the true parameter values, then for fixed t ∈ , Mi(·, t; θ, β) is a vector of square-integrable martingales with F-predictable quadratic variation (PQV) process given by
| (33) |
where Dg(a) is the diagonal matrix with diagonal elements being the elements of the vector a. Re-scaling by 1/n and simplifying, we have
where
The matrix Σ11Bn(s, t; θ, β) has the alternative representation given by
where
| (34) |
is the generalized at-risk process (cf., Peña et al. (2001)). It is of interest to obtain the limit of Σn(s, t; θ, β) as n → ∞ to be able to use Borgan's results. For this purpose, we need the following lemma.
Lemma 1 For (s*, t*) ∈ (0, ∞)2, then, as n →∞,
;
,
where the function y(s, t; θ, β) is given by (cf., Peña et al. (2001))
| (35) |
where ρF(·; θ), the renewal function, is given in (1).
Proof: Result (i) is just the Glivenko-Cantelli theorem, whereas result (ii) was established in Peña et al. (2001). ∥
Using this lemma, it is immediate that as n → ∞,
where the limiting matrix
| (36) |
has components given by
Also, observe that the limiting matrix in (36) can be decomposed into
Furthermore, with
| (37) |
the generalized observed Fisher information process, it is straightforward to show that as n →∞,
The limiting results pertaining to Σn and IG are conditions in Borgan (1984)'s set of regularity conditions, and from his theorem the following results for the recurrent event setting follow.
Theorem 1 Let (s*, t*) ∈ such that Σ(s*, t*; θ, β) is positive definite. Under the ‘usual regularity conditions’ in Borgan (1984),
- (Consistency) There exists a sequence of solutions ((s*, t*), (s*, t*))t to the sequence of equations UG(s*, t*; θ, β) = 0 satisfying, as n → ∞,
- (Asymptotic Normality) As n → ∞,
where MVN means multivariate normal. Furthermore, is a consistent estimator of Σ(s*, t*; θ, β).
The following corollary is immediate from Theorem 1.
Corollary 1 Under the conditions of Theorem 1, as n → ∞,
where Σ11(s*, t*; θ, β) = (Γ(s*, t*; θ, β) + Ξ(s*; θ, β))−1 with
| (38) |
| (39) |
Since of main interest to us is the case where t → ∞, we observe that because y(s, t; θ, β) = 0 whenever s < t, then the limiting function arising in (38) as t → ∞ is
| (40) |
With a view towards investigating the loss in efficiency when one uses a fully nonparametric estimator of the inter-event survivor function F̄, from Corollary 1 and the δ-method, we are able to get the asymptotic distribution of the parametric estimator of F̄(t) in (17).
Corollary 2 Under the conditions of Corollary 1, as n → ∞,
where the variance function is
| (41) |
where Γ(θ, β) and Ξ(θ, β) are obtained from (38) and (39), respectively, by letting s and t both tend to ∞.
Note that under regularity conditions,
We remark here that the result in Corollary 2 could be strengthened to the weak convergence of the process to a Gaussian process, see Adekpedjou (2007)'s dissertation for this extended result; however, for the specific purpose of performing efficiency comparisons, the weaker version in Corollary 2 suffices.
4 Efficiency Comparisons
The second goal of this article is to examine efficiency issues in the presence of an informative monitoring period. The major reason for introducing the generalized KG model for recurrent events is in order to have an analytically tractable model which facilitates the investigation of some efficiency questions with recurrent event data. Through the informative monitoring model, we seek to provide concrete answers to the following questions. (i) If one employs an estimator of θ that was derived ignoring the informativeness of G on F, how much efficiency loss is incurred? (ii) If one uses an estimator of θ which was derived ignoring the additional event recurrences after the first event, but which takes into account the informativeness of G on F, what is the cost in terms of efficiency? This further leads to the question of how much information is contributed by the event occurrences past the first occurrence. (iii) If one adopts the fully nonparametric estimator of F̄(t) studied in Peña et al. (2001), but the true underlying model is the generalized KG model, how much efficiency is sacrificed?
4.1 Ignoring Informative Monitoring
We now address the first question. We suppose that is the estimator of θ that was derived ignoring the informativeness of G on θ. Then, it is not difficult to see that the limiting variance of equals Γ(s*; θ, β)−1, where the matrix Γ is defined in (38). With det Q denoting the determinant of a matrix Q, a measure of the change in asymptotic relative efficiency of the sequence {} relative to the sequence {} is
Let us examine further the situation when s* → ∞. Denote by
| (42) |
with (·; θ) being the inverse of ΛF(·; θ). Then, making the variable substitution z = βΛF(v; θ) in the expression for Ξ(θ, β), we obtain the compact expression
| (43) |
where Z is a unit exponential random variable. Note in particular that Ξ(θ, β), being a covariance matrix, is nonnegative definite, so det Ξ(θ, β) ≥ 0. Since Γ(θ, β) is also a limiting covariance matrix, hence is nonnegative definite, then this establishes that
as is to be expected. Analogously, by straightforward manipulations and with Z still a unit exponential variable, we are able to find the more compact expression for Γ(θ, β) given by
| (44) |
In the special case where the recurrent event accrual follows a homogeneous Poisson process so that ΛF(t; θ) = θt, we are able to obtain closed form expressions for the above quantities. In this situation, ϕF(t;θ) = 1/θ and the renewal function is ρF(t; θ) = θtI(t ≥ 0). Since ϕF(·; θ) is constant in t, this immediately shows that Ξ(θ, β) = 0. Straightforward calculations also show that in this exponential case,
\[ \Gamma(\theta, \beta) = \frac{1}{\beta\theta^2}. \tag{45} \]
Observe that the inverse (reciprocal) of this quantity equals the (1, 1)th element of Σ(∞; θ, β) given in (21) which was obtained via a limiting operation on the observed Fisher information.
Therefore, when F is the exponential distribution, ΔARE() = 0, that is, there is no loss of efficiency by ignoring the fact that Ḡ(t; θ, β) = exp(−βθt) is informative about F̄(t;θ) = exp(−θt)! Surprising as it may look at first, this actually is a logical result for in this case, from (19) and by an easy calculation,
| (46) |
that is, both estimators of θ are just the occurrence-exposure rate. However, for other non-exponential F where ϕF(·; θ) is not constant, then det Ξ(θ, β) need not be zero, hence there could be loss in efficiency by ignoring the informativeness of G for F, as in the situation of Weibull inter-event times dealt with via simulations in Section 5.
4.2 Single-Event versus Recurrent Event Methods
To address the second efficiency question, denote by the estimator of θ based only on the possibly right-censored first event times, that is, the single-event estimator, but taking into account the informativeness of G for F. Then, the only technical change is that the appropriate y(∞, t; θ, β) function to utilize in the Γ(θ, β) expression is
| (47) |
As a consequence, the limiting variance of is [Γ1(θ, β) + Ξ(θ, β)]−1, where
| (48) |
with Z a unit exponential random variable. With Γ2(θ, β) denoting the second term in the expression for Γ(θ, β) in (44), that is,
| (49) |
then the change in asymptotic relative efficiency of the sequence {} relative to {} is
| (50) |
Again, this will always be positive, indicating that there will always be gain in efficiency by utilizing the additional event occurrences.
In the special case where F is the exponential distribution, simple calculations reveal that
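\[ \Gamma_1(\theta, \beta) = \frac{1}{(1+\beta)\theta^2}, \qquad \Gamma_2(\theta, \beta) = \frac{1}{\beta(1+\beta)\theta^2}, \qquad \Xi(\theta, \beta) = 0. \]
As a consequence, when the inter-event times are exponentially distributed, the change in asymptotic relative efficiency in (50) equals
\[ \frac{\Gamma_2(\theta, \beta)}{\Gamma_1(\theta, \beta)} = \frac{1}{\beta}, \tag{51} \]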
which, as noted in Section 2, is the expected number of event occurrences in each unit's monitoring period. This result could further be interpreted as follows. When β → 0, the monitoring period lengthens and consequently more recurrences will be observed (per unit), which will provide more information, thereby making the recurrent-event estimator more efficient relative to the single-event estimator at the incremental order of 1/β. In contrast, when β → ∞, the monitoring period shortens, so there will either be no event observed or at most a single event observed (per unit); in this situation the two estimators become very close, hence the gain in efficiency goes down to zero.
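The exponential-case efficiency gain can be reproduced with a few lines of arithmetic. The sketch below assumes the information contributions Γ1 = 1/((1 + β)θ²) from the possibly censored first event and Γ2 = 1/(β(1 + β)θ²) from later recurrences, with Ξ = 0 for the exponential model; the function names are illustrative.

```python
def gamma_first_event(theta, beta):
    """Information about theta from the (possibly censored) first event."""
    return 1.0 / ((1.0 + beta) * theta ** 2)


def gamma_later_events(theta, beta):
    """Additional information about theta from recurrences after the first."""
    return 1.0 / (beta * (1.0 + beta) * theta ** 2)


def are_gain(theta, beta):
    """Change in asymptotic relative efficiency from using all recurrences.

    With Xi = 0 in the exponential case, this is
    (Gamma1 + Gamma2) / Gamma1 - 1 = Gamma2 / Gamma1.
    """
    return gamma_later_events(theta, beta) / gamma_first_event(theta, beta)


if __name__ == "__main__":
    for beta in (0.25, 0.5, 1.0, 4.0):
        print(beta, are_gain(theta=2.0, beta=beta))
```

The gain does not depend on θ, and Γ1 + Γ2 recovers the full-data information 1/(βθ²).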
4.3 Optimal Design Choice
Viewed in a different light, this result for exponentially distributed inter-event times implies that if experimenter #1 has a sample of size n1 and uses the recurrences for estimating θ, then experimenter #2 will need a sample of size n2 = n1(1 + 1/β) to attain the same (asymptotic) precision if he/she ignores the recurrences and uses only the possibly right-censored times to first event occurrence, that is, single-event methods. The flip side to these two approaches in terms of cost is that experimenter #1 will take more time to perform the study compared to experimenter #2, assuming that all units are entered into the study at the same time. A choice of which design to utilize entails taking into consideration costs associated with the experimental units and the duration of the study.
To amplify on the choice of study design, let us assume that C1 is the cost per experimental unit, while C2 is the cost incurred per unit of time while the study is ongoing. Let there be n1 units in study design #1, which monitors all event occurrences over the units' monitoring periods [0, τi]. For this study, the overall study duration will be maxi≤n1 τi. On the other hand, let n2 be the number of units for study design #2, which monitors each unit only until its first event occurs or until it gets right-censored by τi. The overall study duration in this case will be maxi≤n2 min(Ti1, τi). Therefore, the expected total costs for these study designs will be, respectively,
\[ TC_1(n_1; \theta, \beta) = C_1 n_1 + C_2\, E\Big(\max_{i \le n_1} \tau_i\Big), \qquad TC_2(n_2; \theta, \beta) = C_1 n_2 + C_2\, E\Big(\max_{i \le n_2} \min(T_{i1}, \tau_i)\Big). \]
In the case of an exponential F, note that τi has the same distribution as Vi/(θβ) and min(Ti1, τi) has the same distribution as Vi/[θ(1 + β)], where the Vis are i.i.d. unit exponential variables. By recalling that
where V(n) is the largest order statistic among V1, V2, ..., Vn i.i.d. EXP(1), then
Since the two study designs will lead to the same asymptotic precision when n2 = n1(1 + 1/β), then study design #1 would be cost-preferable to study design #2 if TC1(n1; θ, β) < TC2(n1(1 + 1/β); θ, β). Using the above approximations, this inequality will occur if
| (52) |
Of course, for this decision criterion to be usable, some prior or pilot estimates of θ and β will be needed.
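Given pilot values of θ and β, the cost comparison for exponential inter-event times can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' procedure: it uses the exact harmonic sum for the expected maximum of n unit exponentials, takes τi to be exponential with rate θβ and min(Ti1, τi) to be exponential with rate θ(1 + β), and rounds n2 = n1(1 + 1/β) up to an integer. The function names are hypothetical.

```python
import math


def harmonic(n: int) -> float:
    """Expected maximum of n i.i.d. unit exponentials: 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / k for k in range(1, n + 1))


def expected_total_costs(n1, theta, beta, c1, c2):
    """Return (TC1, TC2, n2) for designs with matched asymptotic precision.

    Design 1 follows n1 units over their whole monitoring periods; design 2
    follows n2 = n1 (1 + 1/beta) units until first event or censoring.
    """
    n2 = math.ceil(n1 * (1.0 + 1.0 / beta))
    # Study duration is a maximum of exponentials, scaled by 1 / rate.
    tc1 = c1 * n1 + c2 * harmonic(n1) / (theta * beta)
    tc2 = c1 * n2 + c2 * harmonic(n2) / (theta * (1.0 + beta))
    return tc1, tc2, n2


def prefer_design_1(n1, theta, beta, c1, c2):
    """True when monitoring recurrences (design 1) has the lower expected cost."""
    tc1, tc2, _ = expected_total_costs(n1, theta, beta, c1, c2)
    return tc1 < tc2
```

For example, with n1 = 50, θ = 1, β = 0.5 and C1 = 1, design #1 is preferred when C2 = 10 but not when C2 = 30, reflecting the trade-off between per-unit cost and the cost of a longer study.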
4.4 Efficiency of Generalized PLE
We now address the third question posed earlier which pertains to the efficiency of the fully nonparametric estimator of the inter-event distribution F relative to an estimator derived using information about the structure of F and G. The KG model in the single-event settings was used for these efficiency studies, so the extended KG model for our recurrent event setting is a justifiably reasonable model to perform analogous efficiency studies. Peña et al. (2001) obtained the generalized product-limit estimator (GPLE) in this recurrent event setting. This estimator of the inter-event time survivor function F̄(t) is given by
| (53) |
where Yi(·, ·) is the generalized at-risk process defined in (34). It was established in their paper that the GPLE is asymptotically normal with mean F̄(t) and asymptotic variance σ2(s, t; θ, β)/n, where
| (54) |
with y(s, t; θ, β) the function given in (35). We compare this limiting variance with the limiting variance of the parametric estimator of F̄(t; θ), which is η2(t; θ, β) in (41). The comparison is made as s* → ∞.
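As a computational illustration, when the calendar-time horizon covers every monitoring period, a product-limit estimator of this kind can be computed from the pooled inter-event times, treating each unit's last, incomplete gap τi − (Ti1 + ... + TiKi) as right-censored. The sketch below assumes that form. The function name `gple` and the input layout are hypothetical, and this is not the authors' code.

```python
from collections import Counter


def gple(gaps_by_unit, t):
    """Product-limit estimate of the inter-event survivor function at t.

    gaps_by_unit: list of (complete_gaps, censored_gap) pairs, one per unit,
    where complete_gaps are the fully observed inter-event times and
    censored_gap is the time from the last event to the end of monitoring.
    """
    complete = []
    all_gaps = []
    for done, last in gaps_by_unit:
        complete.extend(done)
        all_gaps.extend(done)
        all_gaps.append(last)          # the censored gap is at risk only
    events = Counter(complete)         # number of complete gaps at each time
    surv = 1.0
    for w in sorted(events):
        if w > t:
            break
        at_risk = sum(1 for g in all_gaps if g >= w)
        surv *= 1.0 - events[w] / at_risk
    return surv
```

With two units, `[([1.0, 2.0], 0.5), ([3.0], 1.5)]`, four gaps are at risk at time 1 and two at time 2, so the estimate is 0.75 on [1, 2) and 0.375 on [2, 3).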
Theorem 2 For the generalized KG model in this recurrent event setting, the asymptotic relative efficiency of the GPLE relative to the parametric estimator at time point is
where
| (55) |
Proof: The desired asymptotic relative efficiency is the ratio between η2((t; θ); θ, β) and σ2((t; θ); θ, β). The expressions in the theorem follow via straightforward manipulations of the expressions for η2(t; θ, β) and σ2(t; θ, β), by using the variable transformation v = ΛF(t) in the integrals, and noting that ∇θΛF(t; θ) = ∫_0^t ψ(w; θ) λF(w; θ) dw. ∥
Corollary 3 For F̄(t; θ) = exp(−θt), the exponential case, the ARE of the GPLE relative to the parametric estimator at the time point t/θ is

$$\mathrm{ARE} = \frac{p\,(\log p)^2}{1 - p},$$

with p ≡ p(t; β) = exp{−(1 + β)t}.
Proof: In this exponential setting, we have already seen from earlier calculations that Ξ(θ, β) = 0, while Γ(θ, β) = 1/(θ2β). Furthermore, (t; θ) = t/θ, ρF(t; θ) = θt, ϕF(t; θ) = 1/θ, so that (t; θ) = 1/θ. The expressions follow upon simplifying the ARE and noting the above facts. ∥
From the ARE expression in Corollary 3, the following properties are easily established.
Corollary 4 The ARE in Corollary 3 satisfies (i) limβ→∞ ARE = 0; (ii) limβ→0 ARE = t²/[exp(t) − 1]; and (iii) if p0 is the solution in (0, 1) of exp(p − 1) − √p = 0, then ARE ≤ p0(log p0)²/(1 − p0) ≈ 0.65 for all t and β.
The ARE expression in Corollary 3 for the exponential inter-event time distribution is plotted in Figure 1 as a function of p(t; β). In Corollary 4, the result in (i) indicates that the shorter the monitoring periods become, which happens when β → ∞, the more inefficient the GPLE becomes. This may seem surprising at first sight, since there will then be very few observed recurrences; however, because G is informative about F, the right-censored observations contain information about F, and this is exploited by the parametric estimator. On the other hand, when β → 0, the monitoring periods become longer, and the impact of the right-censored observations tends to be negligible because more and more complete inter-event times are observed. Indeed, the limiting ARE expression in (ii) of Corollary 4 is the ARE of the nonparametric empirical survivor function estimator Σi I{Ti > t}/n relative to the parametric estimator exp(−t/T̄), where T̄ is the sample mean, based on an i.i.d. sample T1, T2, ..., Tn from a unit exponential distribution. It is quite interesting to observe the similarity with the efficiency results in single-event settings. The upper bound for the ARE in (iii) is similar to that obtained by Cheng and Lin (1987) concerning the Kaplan-Meier estimator when compared to the estimator that exploits the informative structure of the KG model in the single-event setting.
Figure 1.
Asymptotic relative efficiency of the nonparametric estimator of the inter-event survivor function relative to the parametric estimator, as a function of p ≡ p(t; β) = F̄((t; θ); θ)^(β+1), in the homogeneous Poisson process (HPP) case.
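The behavior of the exponential-case ARE can be explored numerically. The sketch below assumes the ARE has the form p(log p)²/(1 − p) with p = exp{−(1 + β)t}. As β → 0 this form reduces to t²/[exp(t) − 1], the ARE of the empirical survivor function relative to exp(−t/T̄) for unit exponential data. The maximizing value of p is found by bisection on log p + 2(1 − p) = 0, which is equivalent to exp(p − 1) = √p on (0, 1). The function names are hypothetical.

```python
import math


def are_exponential(p: float) -> float:
    """Assumed ARE of the GPLE relative to the parametric estimator, 0 < p < 1."""
    return p * math.log(p) ** 2 / (1.0 - p)


def p_of(t: float, beta: float) -> float:
    """p(t; beta) = exp(-(1 + beta) t) for the exponential case."""
    return math.exp(-(1.0 + beta) * t)


def maximizing_p(lo: float = 1e-6, hi: float = 0.9, tol: float = 1e-12) -> float:
    """Bisection for the root of log p + 2(1 - p) in (0, 1), away from p = 1."""
    def g(p):
        return math.log(p) + 2.0 * (1.0 - p)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


p0 = maximizing_p()
upper_bound = are_exponential(p0)   # roughly 0.65
```

Under this assumed form the bound depends on t and β only through p, which is why a single curve in p suffices to describe the exponential case.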
More generally, it is of interest to know if the ARE expression in Theorem 2 is always bounded above by unity. A partial answer is provided by the following theorem. Below, for a matrix A, tr(A) is its trace.
Theorem 3 The ARE() expression in Theorem 2 is bounded above by
where ej(θ, β), j = 1, 2, ..., p, are the eigenvalues of Γ(θ, β)−1Ξ(θ, β). In particular, if the parameter θ is one-dimensional, then ARE() ≤ 1.
Proof: To establish the result, we first observe that if f : and g : with square-integrable components with respect to a measure μ, and if S is a positive definite p × p symmetric matrix, then
| (56) |
This inequality follows by first letting f̄ = S−1/2f, and then noting that the left-hand side of (56) equals
Applying the Cauchy-Schwarz Inequality to each of the p terms, we obtain
The inequality in (56) follows by then noting that
To utilize this result in the context of the ARE expression, define the following.
Then, a direct application of (56) to the ARE expression in Theorem 2 leads to the upper bound in the statement of the theorem. The case of p = 1 is an immediate consequence. ∥
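The chain of steps used in the proof can be written out as follows. This is a sketch under assumptions, not a verbatim restatement of (56): f is taken to be a p-vector of functions and g a scalar function, both square-integrable with respect to μ, and f̄ = S^{−1/2}f has components f̄_j.

```latex
\begin{align*}
\Big(\int f g \, d\mu\Big)^{t} S^{-1} \Big(\int f g \, d\mu\Big)
  &= \sum_{j=1}^{p} \Big(\int \bar{f}_j \, g \, d\mu\Big)^{2}
     && \text{(since } \bar{f} = S^{-1/2} f\text{)} \\
  &\le \sum_{j=1}^{p} \Big(\int \bar{f}_j^{\,2} \, d\mu\Big) \Big(\int g^{2} \, d\mu\Big)
     && \text{(Cauchy-Schwarz, term by term)} \\
  &= \operatorname{tr}\Big(S^{-1} \int f f^{t} \, d\mu\Big) \int g^{2} \, d\mu
     && \text{(since } \bar{f}^{\,t} \bar{f} = f^{t} S^{-1} f\text{)}.
\end{align*}
```

When p = 1 the trace reduces to a scalar and the bound is the ordinary Cauchy-Schwarz inequality, which is consistent with the special role of the one-dimensional case in the theorem.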
Observe therefore that if the parameter of the inter-event distribution is one-dimensional, then the parametric estimator of the inter-event survivor function will never be less efficient, asymptotically, than the generalized product-limit estimator under this generalized KG model. However, if the parameter vector θ is more than one-dimensional, then it is possible that the ARE will not have an upper bound of unity.
5 Simulation Studies
Obtaining exact analytical efficiency expressions under inter-event distributions other than the exponential is difficult, since closed-form expressions for the renewal function, which appears in the variance expression, are not generally available. To examine non-exponential inter-event distributions, we resorted to computer simulation studies. Specifically, we considered a Weibull inter-event distribution with shape parameter θ1 and scale parameter θ2. The purpose of the simulation study was to compare the efficiency of the three estimators of θ, as well as to ascertain the efficiency of the GPLE F̃(t) relative to the parametric estimator F̂(t).
The simulation code was written in the R language (Ihaka and Gentleman (1996)), and a Newton-Raphson procedure, as described in the subsubsection dealing with the Weibull inter-event distribution, was implemented. A total of M = 5000 simulation replications were performed, where for each replication a recurrent event data set following the generalized KG model was generated for each combination of values of n, θ1, θ2, and β. For each resulting data set, the three estimates of θ, as well as the accompanying estimates of β, were obtained. The estimates F̂(t) and F̃(t) of F(t) were also obtained for t-values coinciding with the percentiles of the true Weibull distribution.
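The data-generation step can be sketched as follows. This is an illustration in Python rather than the authors' R code. It assumes the parametrization F̄(t) = exp{−(t/θ2)^θ1}, under which 1 − G = (1 − F)^β makes τi Weibull with the same shape and with scale θ2 β^(−1/θ1). The function names are hypothetical.

```python
import random


def draw_weibull(shape, scale, rng, beta=1.0):
    """Draw from the survivor function exp(-beta * (t / scale) ** shape).

    beta = 1 gives an inter-event time; beta != 1 gives a monitoring time
    whose survivor function is the inter-event survivor function to the power beta.
    """
    return scale * (rng.expovariate(1.0) / beta) ** (1.0 / shape)


def simulate_unit(shape, scale, beta, rng):
    """Return (complete_gaps, tau) for one unit monitored over [0, tau]."""
    tau = draw_weibull(shape, scale, rng, beta)
    gaps, elapsed = [], 0.0
    while True:
        gap = draw_weibull(shape, scale, rng)
        if elapsed + gap > tau:        # next event falls outside [0, tau]
            return gaps, tau
        gaps.append(gap)
        elapsed += gap


def mean_events(n_units, shape, scale, beta, seed=0):
    """Average number of observed events per unit over n_units simulated units."""
    rng = random.Random(seed)
    total = sum(len(simulate_unit(shape, scale, beta, rng)[0]) for _ in range(n_units))
    return total / n_units
```

For θ1 = 1 the inter-event times are exponential and the mean number of events per unit is 1/β, so `mean_events(20000, 1.0, 1.0, 0.5)` should be near 2.0, in line with the MeanEvs column of Table 1.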
As a measure of the efficiency of the estimator of θ based on the full generalized KG model over each of the two competing estimators, we computed the estimate of its (generalized) mean-squared error (MSE) given by
| (57) |
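where the estimate entering (57) is the one obtained from the mth replication, m = 1, 2, ..., M, and the true θ value is denoted by θ0 = (θ10, θ20)^t. The estimates of the MSE for the two competing estimators are similarly defined. The estimate of the efficiency of the full-model estimator over a competing estimator is then defined via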
| (58) |
The efficiency relative to the remaining estimator is analogously defined. Similarly, the measure of efficiency of F̃ over F̂ at time point t is the ratio of the MSEs of F̃(t) and F̂(t) computed over the M replications. The simulation was performed for combinations of
As is to be expected, the results were somewhat invariant with respect to a change in the scale parameter θ2, so we only report here those associated with θ2 = 1.0. To conserve space, we also only report the cases n ∈ {20, 50}, since the conclusions are unchanged.
Table 1 presents the efficiencies of the θ-estimators for the different cases. The column MeanEvs gives the mean number of events observed per subject; note that more events are observed as β decreases. It is clear from these results that the estimator based on the full generalized KG model, which uses both the recurrences and the informative monitoring structure, is always more efficient than the other two estimators, as is to be expected. It is interesting to observe that as β increases, its efficiency gain over the estimator that ignores the informative monitoring increases, whereas the opposite holds for the comparison with the single-event estimator. Intuitively, when there are fewer event occurrences, the information coming from the τis, which is used by the full-model estimator but ignored by the estimator that disregards the informative monitoring, becomes more important. For the single-event estimator, on the other hand, a decrease in β leads to more event occurrences, but these are ignored by this estimator, hence the increase in efficiency of the full-model estimator over it. Interestingly, the efficiency behavior is very similar across values of n for the comparison between the full-model estimator and the estimator that ignores the informative monitoring. One may wonder why the full-model estimator is still more efficient than the single-event estimator even when the average number of events per subject is less than unity. An explanation is that the full-model estimator still utilizes the values of the τis, which provide additional information, even if there was only one event observed on a subject. The single-event estimator, on the other hand, only utilizes the minimum of the τi-value and the time to first event occurrence Ti1, together with the indicator of which was smaller. We point out that exact closed-form expressions for the asymptotic covariance matrix of the single-event estimator are obtainable in this Weibull case by exploiting the independence between the censoring indicator and min{Ti1, τi}. We compared these asymptotic variances with the simulated variances, and there is indeed good agreement when the sample size is large (n ∈ {50, 100}).
Table 1.
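Relative efficiency (in decimal format) of the estimator of θ based on the full generalized KG model relative to the estimator that ignores the informative monitoring (Eff vs. non-informative) and to the single-event estimator (Eff vs. single-event), under the Weibull(θ1, θ2) inter-event distribution for different β-values, for n ∈ {20, 50}, and for θ2 = 1. MeanEvs represents the mean number of events observed for each of the subjects. Results are based on 5000 simulation replications.
| n | θ1 | θ2 | β | MeanEvs | Eff vs. non-informative | Eff vs. single-event |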
|---|---|---|---|---|---|---|
| 20 | 0.9 | 1 | 0.3 | 3.89 | 1.26 | 37.62 |
| 20 | 0.9 | 1 | 0.5 | 2.25 | 1.47 | 17.12 |
| 20 | 0.9 | 1 | 0.7 | 1.57 | 1.72 | 9.88 |
| 20 | 1.0 | 1 | 0.3 | 3.34 | 1.25 | 30.24 |
| 20 | 1.0 | 1 | 0.5 | 2.01 | 1.51 | 12.62 |
| 20 | 1.0 | 1 | 0.7 | 1.43 | 1.82 | 8.69 |
| 20 | 1.5 | 1 | 0.3 | 1.98 | 1.54 | 11.11 |
| 20 | 1.5 | 1 | 0.5 | 1.34 | 1.90 | 6.83 |
| 20 | 1.5 | 1 | 0.7 | 1.02 | 2.12 | 4.38 |
| 20 | 2.0 | 1 | 0.3 | 1.47 | 1.65 | 6.42 |
| 20 | 2.0 | 1 | 0.5 | 1.06 | 2.02 | 4.25 |
| 20 | 2.0 | 1 | 0.7 | 0.84 | 2.41 | 3.31 |
| 50 | 0.9 | 1 | 0.3 | 3.90 | 1.26 | 30.27 |
| 50 | 0.9 | 1 | 0.5 | 2.25 | 1.41 | 13.48 |
| 50 | 0.9 | 1 | 0.7 | 1.57 | 1.66 | 7.96 |
| 50 | 1.0 | 1 | 0.3 | 3.34 | 1.30 | 23.29 |
| 50 | 1.0 | 1 | 0.5 | 2.00 | 1.53 | 9.91 |
| 50 | 1.0 | 1 | 0.7 | 1.42 | 1.71 | 6.69 |
| 50 | 1.5 | 1 | 0.3 | 1.97 | 1.49 | 9.53 |
| 50 | 1.5 | 1 | 0.5 | 1.34 | 1.75 | 5.55 |
| 50 | 1.5 | 1 | 0.7 | 1.02 | 2.11 | 3.82 |
| 50 | 2.0 | 1 | 0.3 | 1.47 | 1.66 | 5.58 |
| 50 | 2.0 | 1 | 0.5 | 1.06 | 2.01 | 3.62 |
| 50 | 2.0 | 1 | 0.7 | 0.84 | 2.30 | 3.04 |
Figure 2 provides the plots of the efficiency of F̃(t) over F̂(t) for different θ1 and n values. In each plot frame, we superimposed the curves associated with the different values of β so as to show the effect of changing β. The graphs are plotted with the true value of F(t) on the abscissa as a way to standardize them. From these plots, it is evident that the efficiency behavior is very similar for the different combinations of θ1 and n, as well as for varying values of θ2, which are not shown here. It is also clear that the relative efficiency of the GPLE over the parametric estimator never exceeds 0.70. It would be a mathematical challenge to determine the exact upper bound of this relative efficiency, in analogy to the approximate 0.65 that was found for exponential inter-event times, but this appears to be a difficult problem owing to the lack of a closed form for the renewal function of a Weibull distribution. With regard to the impact of the monitoring parameter β, when this parameter is increased, the relative efficiency of the GPLE decreases for larger values of t. This could be attributed to the fact that there will be fewer observations whose inter-event times are in this region, hence the GPLE suffers, aside from the fact that it does not utilize information coming from the τis, whereas the parametric estimator is able to use information from all the observations, including the τis, for estimating F(t) at larger values of t.
Figure 2.
Plots of efficiency of the GPLE F̃ relative to the parametric estimator F̂ under a Weibull(θ1, θ2) inter-event distribution for varying values of n and β. The curves are black and solid for β = .3, red and dashed for β = .5, and blue and dotted for β = .7. The value of θ2 is 1.0. These curves are based on 5000 simulation replications.
6 Concluding Remarks
In this paper we have examined efficiency aspects of estimators of the inter-event distribution in a recurrent event setting. A generalized Koziol-Green model was introduced to provide an analytically tractable model of an informative monitoring period. This enables the analytical assessment of efficiencies of estimators. Of particular interest was to study the loss in efficiency if the informative monitoring structure is ignored in the estimation procedure, and to see the gain in efficiency when one utilizes the event recurrences instead of just simply using the time-to-first, possibly right-censored, event occurrence. The generalized product-limit estimator of the inter-event distribution, which does not utilize the informative monitoring aspect, was also examined in terms of its loss in efficiency relative to the parametric estimator, the latter exploiting the informative structure. In a sequel article, a semiparametric estimator of the inter-event distribution based on recurrent event data, and which exploits the generalized KG structure, will be compared to the generalized product-limit estimator.
Acknowledgments
The authors wish to thank Dr. Alex McLain and Dr. Laura Taylor for helpful discussions, and also wish to thank a Referee and the Editor for their comments.
References
- Aalen O. Nonparametric inference for a family of counting processes. Ann. Statist. 1978;6:701–726.
- Aalen OO, Borgan Ø, Gjessing HK. Survival and event history analysis: A process point of view. Springer; New York: 2008. Statistics for Biology and Health.
- Adekpedjou A. Ph.D. thesis. University of South Carolina; Columbia, South Carolina: 2007. Estimation with Recurrent Event Data Under an Informative Monitoring Period.
- Allen W. A note on conditional probability of failure when hazards are proportional. Operations Research. 1963;11:658–659.
- Andersen P, Borgan O, Gill R, Keiding N. Statistical Models Based on Counting Processes. Springer-Verlag; New York: 1993.
- Borgan Ø. Maximum likelihood estimation in parametric counting process models, with applications to censored failure time data. Scand. J. Statist. 1984;11:1–16.
- Breslow N, Crowley J. A large sample study of the life table and product limit estimates under random censorship. The Annals of Statistics. 1974;2:437–453.
- Chen Y, Hollander M, Langberg N. Small-sample results for the Kaplan-Meier estimator. Journal of the American Statistical Association. 1982;77:141–144.
- Cheng PE, Lin GD. Maximum likelihood estimation of a survival function under the Koziol-Green proportional hazards model. Statist. Probab. Lett. 1987;5:75–80.
- Cox D. Regression models and life tables (with discussion). Journal of the Royal Statistical Society. 1972;34:187–220.
- Csörgő S, Faraway JJ. The paradoxical nature of the proportional hazards model of random censorship. Statistics. 1998;31:67–78.
- Efron B. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability. Prentice-Hall; New York: 1967. The two-sample problem with censored data; pp. 831–853.
- Fleming T, Harrington D. Counting Processes and Survival Analysis. Wiley; New York: 1991.
- Gill RD. Nonparametric estimation based on censored observations of a Markov renewal process. Z. Wahrsch. Verw. Gebiete. 1980;53:97–116.
- Gill RD. Testing with replacement and the product limit estimator. Ann. Statist. 1981;9:853–860.
- Hollander M, Peña E. Families of confidence bands for the survival function under the general random censorship model and the Koziol-Green model. Canadian Journal of Statistics. 1989;17:59–74.
- Ihaka R, Gentleman R. R: a language for data analysis and graphics. Journal of Computational and Graphical Statistics. 1996;5:299–314.
- Jacod J. Multivariate point processes: predictable projection, Radon-Nikodým derivatives, representation of martingales. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete. 1974/75;31:235–253.
- Kalbfleisch JD, Prentice RL. The statistical analysis of failure time data. 2nd ed. Wiley-Interscience [John Wiley & Sons]; Hoboken, NJ: 2002. Wiley Series in Probability and Statistics.
- Kaplan EL, Meier P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 1958;53:457–481.
- Koziol J, Green S. A Cramer-von Mises statistic for randomly censored data. Biometrika. 1976;63:139–156.
- Peña E, Hollander M. Models for Recurrent Events in Reliability and Survival Analysis. In: Soyer R, Mazzuchi T, Singpurwalla N, editors. Mathematical Reliability: An Expository Perspective. Kluwer Academic Publishers; 2004. pp. 105–123. chap. 6.
- Peña E, Strawderman R, Hollander M. Nonparametric estimation with recurrent event data. J. Amer. Statist. Assoc. 2001;96:1299–1315.
- Sellke T. Statistical decision theory and related topics, IV, Vol. 2 (West Lafayette, Ind., 1986). Springer; New York: 1988. Weak convergence of the Aalen estimator for a censored renewal process; pp. 183–194.
- Siannis F. Applications of a parametric model for informative censoring. Biometrics. 2004;60:704–714. doi: 10.1111/j.0006-341X.2004.00220.x.
- Vardi Y. Nonparametric estimation in renewal processes. Ann. Statist. 1982a;10:772–785.
- Vardi Y. Nonparametric estimation in the presence of length bias. Ann. Statist. 1982b;10:616–620.
- Wang M-C, Chang S-H. Nonparametric estimation of a recurrent survival function. J. Amer. Statist. Assoc. 1999;94:146–153. doi: 10.1080/01621459.1999.10473831.
- Wang M-C, Qin J, Chiang C-T. Analyzing recurrent event data with informative censoring. J. Amer. Statist. Assoc. 2001;96:1057–1065. doi: 10.1198/016214501753209031.
- William AL. A Model for Informative Censoring. J. Amer. Statist. Assoc. 1989;84:749–752.


