Estimation and Efficiency with Recurrent Event Data under Informative Monitoring

Akim Adekpedjou; Edsel A Peña; Jonathan Quiton

doi:10.1016/j.jspi.2009.08.007

Author manuscript; available in PMC: 2011 Mar 1.

Published in final edited form as: J Stat Plan Inference. 2010 Mar 1;140(3):597–615. doi: 10.1016/j.jspi.2009.08.007

PMCID: PMC2793417 NIHMSID: NIHMS145648 PMID: 20161471

Abstract

This article deals with studies that monitor occurrences of a recurrent event for n subjects or experimental units. It is assumed that the i^th unit is monitored over a random period [0,τ_i]. The successive inter-event times T_i1, T_i2, ..., are assumed independent of τ_i. The random number of event occurrences over the monitoring period is K_i = max{k ∈ {0, 1, 2, ...} : T_i1 + T_i2 + ... + T_ik ≤ τ_i}. The T_ijs are assumed to be i.i.d. from an unknown distribution function F which belongs to a parametric family of distributions $C = \{F (\cdot; θ) : θ \in ϴ \subset ℜ^{p}\}$. The τ_is are assumed to be i.i.d. from an unknown distribution function G. The problem of estimating θ, and consequently the distribution F, is considered under the assumption that the τ_is are informative about the inter-event distribution. Specifically, 1 – G = (1 – F)^β for some unknown β > 0, a generalized Koziol-Green (cf., Koziol and Green (1976); Chen, Hollander, and Langberg (1982)) model. Asymptotic properties of estimators of θ, β, and F are presented. Efficiencies of estimators of θ and F are ascertained relative to estimators that ignore the informative monitoring aspect. These comparisons reveal the gain in efficiency when the informative structure of the model is exploited. Concrete demonstrations are performed for F exponential and for a two-parameter Weibull.

Keywords: Counting processes, efficiency comparisons, exponential inter-event times, generalized Koziol-Green model, martingales, Weibull inter-event times

1 Introduction

The parametric, semiparametric, and nonparametric estimation of the distribution function of an event time has been the subject of intense research in past decades, especially in settings where there is at most one observed event (so-called single-event settings) during the monitoring period per experimental unit. Among the seminal papers dealing with this problem are those of Kaplan and Meier (1958), Efron (1967), Cox (1972), Breslow and Crowley (1974), Aalen (1978), and Borgan (1984); see the books of Fleming and Harrington (1991), Andersen, Borgan, Gill, and Keiding (1993), Kalbfleisch and Prentice (2002), and Aalen, Borgan, and Gjessing (2008). The situation where the event is recurrent, so that there could be more than one event occurrence per unit, has also been dealt with, albeit not yet as thoroughly as the single-event case. In the recurrent event setting, the estimation problem has been considered by Gill (1980, 1981), Vardi (1982a,b), Wang and Chang (1999), and Peña, Strawderman, and Hollander (2001). Gill (1981) dealt with the problem of nonparametric inference for renewal processes in a life testing setting. Vardi (1982a) presented an algorithm for obtaining the maximum likelihood (ML) estimator of the survivor function when the underlying interoccurrence times are arithmetic. Sellke (1988), in the case of a single unit, considered the problem of establishing weak convergence of a Nelson-Aalen-type estimator when the length of the monitoring period increases without bound. Peña et al. (2001) proposed Nelson-Aalen and Kaplan-Meier-type estimators and derived their asymptotic properties when the number of units increases but with the monitoring time for each unit being finite with probability one, in contrast to the setting in Sellke (1988) where the monitoring time increases to infinity.

An important issue that arose in the single-event setting is the impact of an informative censoring mechanism. An analytically tractable informative random censorship model was proposed by Koziol and Green (1976) which assumes proportional hazards for the event time and the censoring time. This model was utilized by Chen et al. (1982) to study exact properties of the Kaplan-Meier estimator. Cheng and Lin (1987) also utilized this model to derive an estimator of the survivor function which exploits the informative censoring structure, and showed that their estimator is more efficient than the Kaplan and Meier (1958) estimator, especially under heavy censoring. Hollander and Peña (1989) also used this Koziol-Green model to obtain a more efficient class of confidence bands for the survivor function.

There are many situations however in the engineering, actuarial, biomedical, public health, social and economic sciences, as well as in business, where the event of interest is recurrent. Examples of such events are machine (mechanical or electronic) malfunction, non-life insurance claim, onset of depression, heart attack, economic recession, marital strife, and commission of a criminal act. In this recurrent event setting, the impact of an informative monitoring period has not been examined extensively. This article is for the purpose of studying the impact of an informative monitoring period especially in the context of efficiency gains and losses in the estimation of the inter-event time parameter and distribution. As pointed out in Peña et al. (2001) and Peña and Hollander (2004), recurrent event data have additional features that require attention in performing statistical inference. Two of these important features are: (i) because of the sum-quota data accrual scheme, the number of observed event occurrences is informative about the inter-event distribution even if G is unrelated to F; and (ii) the variable that right-censors the last inter-event time at the end of the monitoring period is dependent on the previous inter-event times. Thus, there is both informative and dependent censoring in recurrent event data. Because of these additional features for recurrent event data, there is a need to study the additional impact of having a G informative about F in the estimation of F or its parameters, in particular, in the efficiency gain when the informative structure is exploited.

Several models have been proposed for informative censoring. William (1989) proposed a model where the censoring variable is related to the frailty of the individual. He showed in particular that in the case of exponential frailty the use of the Kaplan-Meier estimator can lead to errors in estimating the survivor probability. Wang, Qin, and Chiang (2001) proposed various models where the occurrence of recurrent events is modeled by a subject-specific nonstationary Poisson process via a latent variable. Siannis (2004) considered a parametric model where the parameter represents the level of dependence between the failure and the censoring process. In this article we employ a generalization to the recurrent event setting of the model studied in Koziol and Green (1976), the so-called Koziol-Green (KG) model. This KG model has been the one most utilized in studying efficiency aspects under informative censoring in single-event settings; see for instance Chen et al. (1982), which obtains exact properties of the Kaplan-Meier estimator under this model, and Cheng and Lin (1987), which derives an estimator of the survivor function utilizing the informative structure. We point out that, just as in the single-event setting, the utility of the proposed generalized KG model is not primarily to provide a practical and realistic model, but rather to provide a medium in which to examine analytically the properties of inference procedures with recurrent event data.

The major goal of this article is to obtain estimators of the inter-event time distribution and its parameter for this generalized KG model and to ascertain the loss in efficiency if one ignores the informative structure. An outline of this article is as follows. Section 2 introduces relevant processes, describes the generalized KG model and its properties, and develops the estimators. The framework of stochastic processes is adopted to gain generality. Section 3 deals with asymptotic properties of the estimators under the KG model and of those estimators derived by ignoring the KG assumption. Section 4 performs efficiency comparisons of the estimators that exploit the informative structure relative to those which were derived ignoring the structure. In particular, the efficiency of a fully nonparametric estimator of the inter-event distribution is examined. Section 5 presents the results of simulation studies which examine small- to moderate-sample properties of estimators for models in which closed-form analytical expressions are not possible, specifically when the inter-event distribution is a two-parameter Weibull. Finally, Section 6 provides some concluding remarks.

2 Model of Interest and Estimators

2.1 Random Entities

All random entities are defined on a basic probability space (Ω, $F$ , P). We suppose that there are n subjects in the study. For the i^th subject, {S_ij, j = 1, 2, ..., } are the successive calendar times of event occurrences, while {T_ij, j = 1, 2, ...} are the successive inter-event times. Thus, we have S_i0 = 0, S_ij = $Σ_{k = 1}^{j}$ T_ik and T_ij = S_ij – S_i,j–1. The T_ijs are assumed to be i.i.d. nonnegative r.v.s with a common absolutely continuous distribution function F. In this paper we restrict to the i.i.d. inter-event times setting, while the possibly more relevant model for biostatistical applications with correlated inter-event times, specifically with the association induced by frailty components, will be dealt with in a separate paper.

The renewal function associated with F is

ρ_{F}(t) = \sum_{n=1}^{\infty} F^{\star(n)}(t) \, I\{t \geq 0\},

(1)

where F*⁽ⁿ⁾ is the n-fold convolution of F, that is, the distribution of S_in. We assume that $F \in C = \{F (\cdot; θ) : θ \in ϴ \subset ℜ^{p}\}$. The hazard rate function of F(·; θ) is

λ_{F}(\cdot; θ) = \frac{f(\cdot; θ)}{1 - F(\cdot; θ)} = \frac{f(\cdot; θ)}{\bar{F}(\cdot; θ)},

where F̄ = 1 – F (analogously, Ḡ = 1 – G). The cumulative hazard function is

Λ_{F} (\cdot; θ) = \int_{0}^{\cdot} λ_{F} (v; θ) d v

(similarly λ_G and Λ_G). For subject i, i = 1, 2, ... , n, the recurrent event is observed over a random period [0, τ_i], where the τ_is represent the ends of the monitoring periods and are i.i.d. according to some distribution function G. These τ_is may also be viewed as the right-censoring variables, though the right-censoring structure is somewhat different from the usual right-censoring structure with a single event per unit, as explained in the next paragraph. Furthermore, the τ_is and T_ijs are mutually independent. With Z_+,0 = {0, 1, 2, ...}, the (random) number of event occurrences observed for the i^th unit is K_i = max{k ∈ Z_+,0 : S_ik ≤ τ_i}. Therefore, the random observables for the i^th subject are represented by the vector

D_{i} = (K_{i}, τ_{i}, T_{i 1}, \dots, T_{i K_{i}}, τ_{i} - S_{i K_{i}}), i = 1, 2, \dots, n .

(2)
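As an illustration of how the observable vector in (2) arises from the sum-quota scheme, the following is a minimal Python sketch. The function name `observe_unit` and its return layout are hypothetical choices made here, not part of the paper, and the sketch assumes the supplied list of inter-event times is long enough that its cumulative sum exceeds τ_i.

```python
from itertools import accumulate

def observe_unit(inter_event_times, tau):
    """Return (K, tau, observed_gaps, censored_gap) for one unit.

    Hypothetical helper: assumes the gaps are nonnegative and that their
    running sum eventually exceeds tau, so that K is determined.
    """
    event_times = list(accumulate(inter_event_times))    # S_i1, S_i2, ...
    k = sum(1 for s in event_times if s <= tau)          # K_i = max{k : S_ik <= tau}
    last_event = event_times[k - 1] if k > 0 else 0.0    # S_{iK_i}, with S_i0 = 0
    return k, tau, list(inter_event_times[:k]), tau - last_event

# Gaps 1.0, 2.0, 3.0 with tau = 4.5: two events are observed (at 1.0 and 3.0),
# and the third gap is right-censored at 4.5 - 3.0 = 1.5.
print(observe_unit([1.0, 2.0, 3.0], 4.5))
```

The last component is the right-censoring value τ_i − S_{iK_i} for the (K_i + 1)^th inter-event time.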

Let us examine in more detail the censoring structure for this model. Note that there will always be one right-censored inter-event time per unit. This inter-event time, denoted by T_{i,K_i+1}, is right-censored by the variable τ_i – S_{iK_i}. Observe that the right-censoring variable depends on the previous inter-event times through S_{iK_i}, and in fact, since both T_{i,K_i+1} and S_{iK_i} depend on the random variable K_i, then they are dependent. More interestingly, observe that T_{i,K_i+1} has a distribution different from T_i1 owing to the randomness of K_i. This is evident from the resulting sum-quota constraint given by

\sum_{j = 1}^{K_{i}} T_{i j} \leq τ_{i} < \sum_{j = 1}^{K_{i}} T_{i j} + T_{i, K_{i} + 1},

and, in fact, T_{i,K_i+1} is stochastically larger than T_i1, a length-biased sampling phenomenon. The intuition is that the inter-event time that covers the end of the monitoring period tends to be longer, similar to the well-known phenomenon that a passenger arriving at a bus stop has a greater chance of riding a bus that waits longer! In this recurrent event model we thus have the situation where the right-censoring structure is both dependent and informative. Observe that the sample space for D_i is the subset

D_{i} \subset \bigcup_{k=0}^{\infty} (\{k\} \times ℜ_{+} \times ℜ_{+}^{k} \times ℜ_{+})

such that d =(k, τ, t₁, t₂, ... , t_k, c) ∈ $D_{i}$ satisfies $t_{j} \leq τ, (j = 1, 2, \dots, k), c = τ - \sum_{j = 1}^{k} t_{j}$ .

2.2 Generalized KG Model for Recurrent Events

The generalized Koziol-Green model for this recurrent event setting postulates that there exists a β > 0 such that Ḡ(t; θ, β) = F̄(t; θ)^β. This condition is equivalent to Λ_G(·; θ, β) = βΛ_F (·; θ). In the single-event setting, the parameter β is referred to as the censoring parameter since Pr(τ_i < T_i) = β/(1 + β) and τ_i right-censors T_i. In contrast, for the recurrent event setting, the parameter β determines the length of the monitoring period relative to the inter-event times. For example, when F(t; θ) = 1 − exp(−θt) for t ≥ 0, then

\frac{E (τ_{i} ∣ θ, β)}{E (T_{i 1} ∣ θ)} = \frac{1 ∕ (θ β)}{1 ∕ θ} = \frac{1}{β} .

More precisely, for this exponential model, 1/β is the expected number of event occurrences during the monitoring period. This is so since, given τ_i, the expected number of event occurrences over [0, τ_i] is θτ_i, and the result follows since E(τ_i) = 1/(θβ). For this recurrent event setting, the case where β ∈ (0, 1] is of more practical relevance since it leads to more observed recurrences, though β > 1 is also an admissible value. The property Pr(τ_i < T_i1) = β/(1 + β) still holds true for this recurrent event setting, but it seems inappropriate to refer to β in this setting as the ‘censoring parameter’ since as pointed out earlier, the effective right-censoring variable for T_{i,K_i+1} is τ_i − S_{iK_i}, whose distribution not only depends on β but also on F. Thus, it is perhaps more proper to refer to β as the ‘monitoring parameter!’
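The two facts stated above for the exponential model, E(K_i) = 1/β and Pr(τ_i < T_i1) = β/(1 + β), can be checked by simulation. The sketch below is illustrative only: the function name `simulate_exponential_kg`, the record layout, and the parameter values are assumptions made here, and it uses only Python's standard `random` module.

```python
import random

def simulate_exponential_kg(n, theta, beta, seed=0):
    """Simulate n units with Exp(theta) gaps and Exp(theta * beta) monitoring times.

    Hypothetical helper: returns a list of (K_i, tau_i, observed_gaps, censored_gap).
    """
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        tau = rng.expovariate(theta * beta)   # survivor function of tau is exp(-theta*beta*t)
        gaps, s = [], 0.0
        while True:
            t = rng.expovariate(theta)        # next inter-event time
            if s + t > tau:                   # this gap straddles tau and is censored
                break
            gaps.append(t)
            s += t
        records.append((len(gaps), tau, gaps, tau - s))
    return records

records = simulate_exponential_kg(20000, theta=2.0, beta=0.5, seed=1)
mean_k = sum(r[0] for r in records) / len(records)
frac_no_event = sum(1 for r in records if r[0] == 0) / len(records)
print(mean_k)          # should be close to 1/beta = 2
print(frac_no_event)   # should be close to beta/(1 + beta) = 1/3
```

The event {K_i = 0} coincides with {τ_i < T_i1}, which is why the fraction of units with no observed event estimates β/(1 + β).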

A crucial independence property that was exploited in Chen et al. (1982) and the other papers dealing with this model in the single-event setting is the fact that min(τ_i, T_i) and I{T_i ≤ τ_i} are independent if and only if the KG model holds (cf., Allen (1963)). This allowed the derivation of the exact means and variances of the Kaplan-Meier estimator in Chen et al. (1982), and also facilitated the derivation of the semiparametric estimator of the survivor function in Cheng and Lin (1987). For the recurrent event setting in this paper, such an independence property does not come directly into play. For single-event settings, it has been pointed out, e.g., Csörgő and Faraway (1998), that the KG model may not occur much in practice, though its utility is more from the theoretical point of view as it provides a model allowing for the analytical examinations of properties of procedures. Our proposing the generalized KG model for recurrent event data is also not meant primarily as a practical model, but rather it is meant to provide a model specimen for the examination of analytical properties of procedures utilizing recurrent event data.

2.3 Estimators

The first goal of this article is, given the data in (2), to develop estimators of θ and β, and consequently, an estimator of F. For generality we adopt the approach of counting processes and martingales (cf., Fleming and Harrington (1991) and Andersen et al. (1993)).

For i = 1, 2, ..., n and s ∈ $R_{+}$ , let

N_{i}^{†}(s) = \sum_{j=1}^{\infty} I\{S_{ij} \leq s \land τ_{i}\}, \quad Y_{i}^{†}(s) = I\{τ_{i} \geq s\}, \quad \text{and} \quad N_{i}^{τ}(s) = I\{τ_{i} \leq s\}.

(3)

The processes in (3), respectively, count the number of events observed for unit i by time s; indicate whether unit i is still under observation at time s; and indicate whether unit i is past its monitoring time at time s. We augment the probability space by the filtration F = { $F_{s} : s \geq 0$ }, where $F_{s}$ is given by

F_{s} = F_{0} \lor \left\{ \bigvee_{i=1}^{n} σ\{N_{i}^{†}(v), Y_{i}^{†}(v+), 0 \leq v \leq s\} \right\},

with $F_{0}$ representing the σ-field containing all the information available at time 0, and $F_{s}$ is the σ-field containing all information generated on all subjects up to time s. Also, for each i = 1, 2, ..., n, define the backward recurrence process via

R_{i} (s) = s - S_{i N_{i}^{†} (s -)},

(4)

which is the time elapsed at time s since the last event occurrence. This is an F-adapted and a left-continuous process, hence is an F-predictable process. For s ∈ $ℜ_{+}$ , let

A_{i}^{†} (s; θ) = \int_{0}^{s} Y_{i}^{†} (v) λ_{F} (R_{i} (v); θ) d v and A_{i}^{τ} (s; θ, β) = \int_{0}^{s} Y_{i}^{†} (v) β λ_{F} (v; θ) d v .

(5)
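To make the backward recurrence process in (4) concrete, here is a small Python sketch that evaluates R_i(s) for one unit. The function name `backward_recurrence` and its argument layout are hypothetical. The left limit N_i^†(s−) is implemented by counting event times strictly less than s, which is what makes the path left-continuous.

```python
from bisect import bisect_left
from itertools import accumulate

def backward_recurrence(s, inter_event_times, tau):
    """R_i(s) = s - S_{i N_i(s-)}: time elapsed at s since the last observed event.

    Hypothetical helper, defined here only for 0 <= s <= tau.
    """
    if not 0.0 <= s <= tau:
        raise ValueError("s must lie in [0, tau]")
    # Event times observed within the monitoring period [0, tau].
    event_times = [t for t in accumulate(inter_event_times) if t <= tau]
    n_before = bisect_left(event_times, s)   # number of events strictly before s
    last_event = event_times[n_before - 1] if n_before > 0 else 0.0
    return s - last_event

gaps = [1.0, 1.5, 1.5]                       # events at calendar times 1.0, 2.5, 4.0
print(backward_recurrence(3.0, gaps, 5.0))   # 0.5
print(backward_recurrence(2.5, gaps, 5.0))   # 1.5, since the event at 2.5 is not yet counted
```

At an event time the process still reports the elapsed time since the previous event, and it resets to zero only immediately afterward.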

From stochastic integration theory, $A_{i}^{†} (s; θ)$ and $A_{i}^{τ} (s; θ, β)$ are, respectively, the compensators of $N_{i}^{†} (s)$ and $N_{i}^{τ} (s)$ , so

M_{i}^{†} (s; θ) = N_{i}^{†} (s) - A_{i}^{†} (s; θ) and M_{i}^{τ} (s; θ, β) = N_{i}^{τ} (s) - A_{i}^{τ} (s; θ, β)

(6)

are, for each i, square-integrable F-martingales. From the results of Jacod (1974/75), the full likelihood process is, for s ∈ $ℜ_{+}$ ,

\begin{aligned} L(s; θ, β) = & \prod_{i=1}^{n} \prod_{v=0}^{s} \left\{ Y_{i}^{†}(v)^{ΔN_{i}^{†}(v) + ΔN_{i}^{τ}(v)} \, λ_{F}(R_{i}(v); θ)^{ΔN_{i}^{†}(v)} \, [β λ_{F}(v; θ)]^{ΔN_{i}^{τ}(v)} \right\} \\ & \times \exp\left\{ - \sum_{i=1}^{n} \int_{0}^{s} Y_{i}^{†}(v) [λ_{F}(R_{i}(v); θ) + β λ_{F}(v; θ)] \, dv \right\}. \end{aligned}

(7)

Taking logarithm we obtain the log-likelihood process

\begin{aligned} l(s; θ, β) = & \sum_{i=1}^{n} \left\{ \int_{0}^{s} \log(Y_{i}^{†}(v) β λ_{F}(v; θ)) \, dN_{i}^{τ}(v) + \int_{0}^{s} \log(Y_{i}^{†}(v) λ_{F}(R_{i}(v); θ)) \, dN_{i}^{†}(v) \right\} \\ & - \sum_{i=1}^{n} \left\{ \int_{0}^{s} Y_{i}^{†}(v) β λ_{F}(v; θ) \, dv + \int_{0}^{s} Y_{i}^{†}(v) λ_{F}(R_{i}(v); θ) \, dv \right\}. \end{aligned}

(8)

For a vector a, let $a^{0} = 1, a^{\otimes 1} = a, a^{\otimes 2} = a a^{t}$ , and define the operator ∇_θ = ∂/∂θ ≡ (∂/∂θ_j,j = 1, 2, ..., p)^t. Let

φ (v; θ) = \nabla_{θ} log λ_{F} (v; θ) .

(9)

Under the regularity condition that the order of differentiation with respect to θ and β and integration with respect to Lebesgue measure can be interchanged, we obtain the score processes for θ and β by taking the derivatives of (8) with respect to θ and β, respectively, to be

U_{θ} (s; θ, β) = \sum_{i = 1}^{n} {\int_{0}^{s} φ (v; θ) M_{i}^{τ} (d v; θ, β) + \int_{0}^{s} φ (R_{i} (v); θ) M_{i}^{†} (d v, θ)};

(10)

U_{β} (s; θ, β) = \sum_{i = 1}^{n} \int_{0}^{s} \frac{1}{β} M_{i}^{τ} (d v; θ, β) .

(11)

To make notation compact, let

\begin{aligned} H_{i}(v; θ, β) &= \begin{pmatrix} φ(v; θ) & φ(R_{i}(v); θ) \\ \frac{1}{β} & 0 \end{pmatrix}; \\ M_{i}(v; θ, β) &= (M_{i}^{τ}(v; θ, β), M_{i}^{†}(v; θ))^{t} \quad \text{and} \quad U(v; θ, β) = (U_{θ}^{t}(v; θ, β), U_{β}(v; θ, β))^{t}. \end{aligned}

(12)

Then, the vector of score processes becomes

U (s; θ, β) = \sum_{i = 1}^{n} \int_{0}^{s} H_{i} (v; θ, β) d M_{i} (v; θ, β), s \in ℜ_{+} .

(13)

Let s* ∈ $R_{+}$ . Equating the score vector in (13), evaluated at s = s*, to 0 and solving for θ and β, we obtain the maximum likelihood (ML) estimators $\hat{θ} = \hat{θ} (s^{*})$ and $\hat{β} = \hat{β} (s^{*})$ of θ and β, respectively. In general, it is not possible to obtain closed-form expressions for these estimators, so in practice numerical methods such as the Newton-Raphson (NR) algorithm or the Nelder-Mead simplex algorithm could be used to obtain the ML estimates.

The observed Fisher information process, which is the negative of the second partial derivatives with respect to the parameters is

I(s; θ, β) = \begin{bmatrix} I_{11}(s; θ, β) & I_{12}(s; θ, β) \\ I_{21}(s; θ, β) & I_{22}(s; θ, β) \end{bmatrix} = - \begin{bmatrix} \frac{\partial^{2}}{\partial θ \partial θ^{t}} & \frac{\partial^{2}}{\partial θ \partial β} \\ \frac{\partial^{2}}{\partial β \partial θ^{t}} & \frac{\partial^{2}}{\partial β^{2}} \end{bmatrix} l(s; θ, β),

(14)

where, with

\begin{aligned} \dot{λ}_{F}(\cdot; θ) &= \frac{\partial}{\partial θ} λ_{F}(\cdot; θ) \quad \text{and} \quad \ddot{λ}_{F}(\cdot; θ) = \frac{\partial^{2}}{\partial θ \partial θ^{t}} λ_{F}(\cdot; θ); \\ \dot{φ}_{F}(\cdot; θ) &= \frac{\partial}{\partial θ} φ_{F}(\cdot; θ) = \frac{\ddot{λ}_{F}(\cdot; θ)}{λ_{F}(\cdot; θ)} - φ_{F}(\cdot; θ)^{\otimes 2}, \end{aligned}

we have

\begin{aligned} I_{11}(s; θ, β) &= \sum_{i=1}^{n} \int_{0}^{s} Y_{i}^{†}(v) \left( β \ddot{λ}_{F}(v; θ) + \ddot{λ}_{F}(R_{i}(v); θ) \right) dv - \sum_{i=1}^{n} \int_{0}^{s} \left( \dot{φ}_{F}(v; θ) \, dN_{i}^{τ}(v) + \dot{φ}_{F}(R_{i}(v); θ) \, dN_{i}^{†}(v) \right); \\ I_{12}(s; θ, β) &= I_{21}(s; θ, β)^{t} = \sum_{i=1}^{n} \int_{0}^{s} Y_{i}^{†}(v) \dot{λ}_{F}(v; θ) \, dv; \\ I_{22}(s; θ, β) &= \frac{1}{β^{2}} \sum_{i=1}^{n} N_{i}^{τ}(s). \end{aligned}

Provided that the matrix inverse of the observed Fisher information matrix exists, the Newton-Raphson iteration below for numerically computing the parameter estimates may be implemented.

(\hat{θ}^{\mathrm{new}}, \hat{β}^{\mathrm{new}})^{t} \leftarrow (\hat{θ}^{\mathrm{old}}, \hat{β}^{\mathrm{old}})^{t} + I(s^{\star}; \hat{θ}^{\mathrm{old}}, \hat{β}^{\mathrm{old}})^{-1} U(s^{\star}; \hat{θ}^{\mathrm{old}}, \hat{β}^{\mathrm{old}}).

A more convenient computational implementation is achieved when we let s* → ∞ since the score equation for the ML estimator of β satisfies

\hat{β} (\infty) = \hat{β} = \frac{n}{\sum_{i = 1}^{n} Λ_{F} (τ_{i}; \hat{θ})} .

(15)

As a consequence, the estimator of θ, denoted by $\hat{θ} (\infty) = \hat{θ}$ , satisfies

\begin{aligned} & \sum_{i=1}^{n} \left\{ φ_{F}(τ_{i}; \hat{θ}) + \sum_{j=1}^{K_{i}} φ_{F}(T_{ij}; \hat{θ}) \right\} = \\ & \quad n \frac{\sum_{i=1}^{n} \dot{Λ}_{F}(τ_{i}; \hat{θ})}{\sum_{i=1}^{n} Λ_{F}(τ_{i}; \hat{θ})} + \sum_{i=1}^{n} \left\{ \sum_{j=1}^{K_{i}} \dot{Λ}_{F}(T_{ij}; \hat{θ}) + \dot{Λ}_{F}(τ_{i} - S_{iK_{i}}; \hat{θ}) \right\}, \end{aligned}

(16)

since $\int_{0}^{t} φ_{F} (w; θ) λ_{F} (w; θ) d w = {\overset{•}{Λ}}_{F} (t; θ)$ . Equation (16) may be solved through numerical methods, e.g., Newton-Raphson iteration.

Having obtained the estimate $\hat{θ} (s^{⋆})$ of θ, then the parametric estimator of F̄ is provided by

\hat{\overset{‒}{F}} (\cdot) = \overset{‒}{F} (\cdot; \hat{θ} (s^{⋆})) .

(17)

We will obtain the asymptotic distribution of this estimator in Section 3 and compare this with a fully nonparametric estimator in Section 4.

2.4 Computational Forms

To facilitate the numerical implementation of the procedures to obtain estimates, we observe the forms below for the integrals with respect to the martingale processes. Let $ψ : ℜ_{+} \to ℜ$ which could possibly depend also on (θ, β), and

Ψ (s; θ, β) = \int_{0}^{s} ψ (v; θ, β) λ_{F} (v; θ) d v .

Then, we have the following identities.

\begin{aligned} \sum_{i=1}^{n} \int_{0}^{\infty} ψ(v; θ, β) M_{i}^{τ}(dv; θ, β) &= \sum_{i=1}^{n} [ψ(τ_{i}; θ, β) - β Ψ(τ_{i}; θ, β)]; \\ \sum_{i=1}^{n} \int_{0}^{\infty} ψ(R_{i}(v); θ, β) M_{i}^{†}(dv; θ, β) &= \sum_{i=1}^{n} \left[ \sum_{j=1}^{K_{i}} ψ(T_{ij}; θ, β) - \left\{ \sum_{j=1}^{K_{i}} Ψ(T_{ij}; θ, β) + Ψ(τ_{i} - S_{iK_{i}}; θ, β) \right\} \right]. \end{aligned}

These expressions are used in re-expressing the score functions as s → ∞. With regards to the terms in the observed Fisher information, we also observe the following identities:

\overset{•}{λ}_{F} = φ_{F} λ_{F} and \overset{• •}{λ}_{F} = (\overset{•}{φ}_{F} + φ_{F}^{\otimes 2}) λ_{F}

with ${\overset{•}{φ}}_{F} = (\partial ∕ \partial θ^{t}) φ_{F} = (\partial^{2} ∕ \partial θ \partial θ^{t}) \log λ_{F}$ .

2.4.1 Exponential Inter-Event Times

Specializing to the case of exponentially distributed inter-event times with λ_F (t; θ) = θ, so that φ_F(t; θ) = 1/θ, we find that

U_{θ} (\infty; θ, β) = \frac{n + \sum_{i = 1}^{n} K_{i}}{θ} - (1 + β) \sum_{i = 1}^{n} τ_{i} and U_{β} (\infty; θ, β) = \frac{n}{β} - θ \sum_{i = 1}^{n} τ_{i} .

(18)

Equating both to zero and solving for θ and β yields the estimators

\hat{θ} = \frac{\sum_{i = 1}^{n} K_{i}}{\sum_{i = 1}^{n} τ_{i}} and \hat{β} = \frac{n}{\sum_{i = 1}^{n} K_{i}},

(19)

so the ML estimator of θ is just the occurrence-exposure rate.
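The closed forms in (19) are simple to compute, and the scores in (18) can be evaluated at the estimates as a check. The following is a minimal Python sketch; the function names `exponential_kg_mle` and `exponential_kg_score` are hypothetical, and the three-unit data set is made up purely for illustration.

```python
def exponential_kg_mle(counts, taus):
    """Closed-form ML estimates (theta_hat, beta_hat) from equation (19).

    counts[i] is K_i and taus[i] is tau_i. Requires at least one observed event.
    """
    total_events = sum(counts)
    if total_events == 0:
        raise ValueError("no events observed: beta_hat is undefined")
    theta_hat = total_events / sum(taus)   # occurrence-exposure rate
    beta_hat = len(counts) / total_events
    return theta_hat, beta_hat

def exponential_kg_score(theta, beta, counts, taus):
    """Score functions (U_theta, U_beta) at s = infinity, from equation (18)."""
    n = len(counts)
    u_theta = (n + sum(counts)) / theta - (1.0 + beta) * sum(taus)
    u_beta = n / beta - theta * sum(taus)
    return u_theta, u_beta

counts, taus = [2, 0, 3], [1.0, 0.5, 2.5]   # illustrative data, not from the paper
theta_hat, beta_hat = exponential_kg_mle(counts, taus)
print(theta_hat, beta_hat)                   # 5 events over 4.0 time units, 3 units
print(exponential_kg_score(theta_hat, beta_hat, counts, taus))  # both approximately 0
```

With five events over a total monitoring time of 4.0 across three units, the estimates are θ̂ = 1.25 and β̂ = 0.6, and both scores vanish up to rounding.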

For this exponential case, with $\bar{K} = \sum_{i=1}^{n} K_{i} / n$ and $\bar{τ} = \sum_{i=1}^{n} τ_{i} / n$, we also obtain the observed Fisher information matrix to be

I(\infty; θ, β) = n \begin{bmatrix} \frac{1 + \bar{K}}{θ^{2}} & \bar{τ} \\ \bar{τ} & \frac{1}{β^{2}} \end{bmatrix}.

(20)

Dividing by n and taking the limit as n → ∞, and noting that $\bar{K} \overset{p}{\to} E(N_{1}^{†}(τ_{1})) = 1/β$ and $\bar{τ} \overset{p}{\to} E(τ_{1}) = 1/(θβ)$, the in-probability limiting matrix is

Σ (\infty; θ, β) = [\begin{matrix} \frac{1 + β}{θ^{2} β} & \frac{1}{θ β} \\ \frac{1}{θ β} & \frac{1}{β^{2}} \end{matrix}] .

The inverse of this matrix is

Σ {(\infty; θ, β)}^{- 1} = [\begin{matrix} θ^{2} β & - θ β^{2} \\ - θ β^{2} & β^{2} (1 + β) \end{matrix}] .

(21)
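For readers who wish to verify (21), the inversion can be worked out directly from the 2 × 2 formula. The intermediate determinant below is computed here and is not displayed in the original text.

```latex
\det Σ(\infty; θ, β)
  = \frac{1 + β}{θ^{2} β} \cdot \frac{1}{β^{2}} - \frac{1}{θ^{2} β^{2}}
  = \frac{(1 + β) - β}{θ^{2} β^{3}}
  = \frac{1}{θ^{2} β^{3}},
\qquad
Σ(\infty; θ, β)^{-1}
  = θ^{2} β^{3}
    \begin{bmatrix}
      \frac{1}{β^{2}} & - \frac{1}{θ β} \\
      - \frac{1}{θ β} & \frac{1 + β}{θ^{2} β}
    \end{bmatrix}
  = \begin{bmatrix}
      θ^{2} β & - θ β^{2} \\
      - θ β^{2} & β^{2} (1 + β)
    \end{bmatrix}.
```

The (1,1) entry θ²β is the asymptotic variance factor for the estimator of θ when β is estimated jointly.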

2.4.2 Weibull Inter-Event Times

In this subsubsection we present the estimation procedure when F is Weibull with shape parameter θ₁ and scale parameter θ₂, so that θ = (θ₁, θ₂)^t ∈ $ℜ_{+}^{2}$ . For this case we have

Λ_{F} (t; θ) = {(θ_{2} t)}^{θ_{1}} and λ_{F} (t; θ) = (θ_{1} θ_{2}) {(θ_{2} t)}^{θ_{1} - 1} .

(22)

We observe that, given θ = (θ₁, θ₂) and β in this Weibull situation, the ratio of E(τ) to E(T₁) equals $1 / β^{1/θ_{1}}$. Thus, if one desires approximately k₀ observed recurrences per unit, then the monitoring parameter could be chosen approximately equal to $1 / k_{0}^{θ_{1}}$.
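The ratio of means quoted above follows from noting that, under the generalized KG model, the monitoring time is itself Weibull with the same shape and a rescaled scale parameter. A short derivation, using the standard Weibull mean and written in the notation of (22), is sketched below.

```latex
\bar{G}(t; θ, β) = \bar{F}(t; θ)^{β}
  = \exp\{ - β (θ_{2} t)^{θ_{1}} \}
  = \exp\{ - (β^{1/θ_{1}} θ_{2} t)^{θ_{1}} \},
\qquad\text{so}\qquad
\frac{E(τ)}{E(T_{1})}
  = \frac{Γ(1 + 1/θ_{1}) / (β^{1/θ_{1}} θ_{2})}{Γ(1 + 1/θ_{1}) / θ_{2}}
  = β^{-1/θ_{1}} .
```

Setting this ratio equal to k₀ and solving gives β = k₀^{−θ₁}, which is the suggested choice of the monitoring parameter.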

From (22), it follows that

φ_{F}(t; θ) = \begin{bmatrix} \frac{1}{θ_{1}} [1 + \log Λ_{F}(t; θ)] \\ \frac{θ_{1}}{θ_{2}} \end{bmatrix} \quad \text{and} \quad \dot{Λ}_{F}(t; θ) = \begin{bmatrix} \frac{1}{θ_{1}} \log Λ_{F}(t; θ) \\ \frac{θ_{1}}{θ_{2}} \end{bmatrix} Λ_{F}(t; θ).

(23)

The second equation arising in (16) simplifies to

\sum_{i = 1}^{n} K_{i} = \sum_{i = 1}^{n} \sum_{j = 1}^{K_{i} + 1} Λ_{F} (T_{i j}; \hat{θ})

where, for brevity of notation but with a slight conflict with our earlier usage of T_{i,K_i+1}, from hereon we shall let

T_{i K_{i} + 1} \equiv τ_{i} - S_{i K_{i}} .

Consequently, ${\hat{θ}}_{2}$ in terms of ${\hat{θ}}_{1}$ is given by

\hat{θ}_{2}(\hat{θ}_{1}) = \left( \frac{\sum_{i=1}^{n} K_{i}}{\sum_{i=1}^{n} \sum_{j=1}^{K_{i}+1} T_{ij}^{\hat{θ}_{1}}} \right)^{1/\hat{θ}_{1}} = \left( \frac{K_{•}}{W(\hat{θ}_{1})} \right)^{1/\hat{θ}_{1}},

(24)

where

K_{•} = \sum_{i = 1}^{n} K_{i} and W (θ_{1}) = \sum_{i = 1}^{n} \sum_{j = 1}^{K_{i} + 1} T_{i j}^{θ_{1}} .

Let us then define, for i = 1, 2, ... , n,

τ_{i}^{*} (θ_{1}) = K_{•} \frac{τ_{i}^{θ_{1}}}{W (θ_{1})} and T_{i j}^{*} (θ_{1}) = K_{•} \frac{T_{i j}^{θ_{1}}}{W (θ_{1})}, j = 1, 2, \dots, K_{i} + 1 .

Substituting the expression in (24) in the first equation arising in (16), we obtain that ${\hat{θ}}_{1}$ is the solution in θ₁ of the equation

\begin{aligned} & (n + K_{•}) + \sum_{i=1}^{n} \left\{ \log τ_{i}^{*}(θ_{1}) + \sum_{j=1}^{K_{i}} \log T_{ij}^{*}(θ_{1}) \right\} = \\ & \quad n \frac{\sum_{i=1}^{n} τ_{i}^{*}(θ_{1}) \log τ_{i}^{*}(θ_{1})}{\sum_{i=1}^{n} τ_{i}^{*}(θ_{1})} + \sum_{i=1}^{n} \sum_{j=1}^{K_{i}+1} T_{ij}^{*}(θ_{1}) \log T_{ij}^{*}(θ_{1}). \end{aligned}

(25)

The solution to this one-dimensional equation may be obtained numerically by plotting, direct search, or Newton-Raphson iteration. The last approach is implemented in the simulations for Weibull inter-event times. For the purpose of implementing this procedure, say in R, define the function $w : ℜ_{+}^{2} \to ℜ$ via

w(t, θ) = t^{θ} \quad \text{so that} \quad \dot{w}(t, θ) \equiv \frac{\partial}{\partial θ} w(t, θ) = w(t, θ) \log t.

In terms of the w-function, we also have

W (θ) = \sum_{i = 1}^{n} \sum_{j = 1}^{K_{i} + 1} w (T_{i j}, θ), τ_{i}^{*} (θ) = K_{•} \frac{w (τ_{i}, θ)}{W (θ)} and T_{i j}^{*} (θ) = K_{•} \frac{w (T_{i j}, θ)}{W (θ)} .

Furthermore, define $r : ℜ_{+}^{2} \to ℜ$ via

r(t, θ) = \frac{w(t, θ)}{W(θ)} \left[ \log t - \frac{\dot{W}(θ)}{W(θ)} \right],

where

\dot{W}(θ) \equiv \frac{dW(θ)}{dθ} = \sum_{i=1}^{n} \sum_{j=1}^{K_{i}+1} \dot{w}(T_{ij}, θ).

Then, in terms of this r-function,

\dot{τ}_{i}^{*}(θ) \equiv \frac{\partial}{\partial θ} τ_{i}^{*}(θ) = K_{•} r(τ_{i}, θ) \quad \text{and} \quad \dot{T}_{ij}^{*}(θ) \equiv \frac{\partial}{\partial θ} T_{ij}^{*}(θ) = K_{•} r(T_{ij}, θ).

Define the function $q : ℜ_{+} \to ℜ$ according to

\begin{aligned} q(θ) = \; & n + K_{•} + \sum_{i=1}^{n} \log τ_{i}^{*}(θ) + \sum_{i=1}^{n} \sum_{j=1}^{K_{i}} \log T_{ij}^{*}(θ) \\ & - n \frac{\sum_{i=1}^{n} τ_{i}^{*}(θ) \log τ_{i}^{*}(θ)}{\sum_{i=1}^{n} τ_{i}^{*}(θ)} - \sum_{i=1}^{n} \sum_{j=1}^{K_{i}+1} T_{ij}^{*}(θ) \log T_{ij}^{*}(θ), \end{aligned}

(26)

so that ${\hat{θ}}_{1}$ solves q( ${\hat{θ}}_{1}$ ) = 0 from (25). The derivative of this q-function is

\begin{aligned} \dot{q}(θ) \equiv \frac{dq(θ)}{dθ} = \; & \sum_{i=1}^{n} \frac{\dot{τ}_{i}^{*}(θ)}{τ_{i}^{*}(θ)} + \sum_{i=1}^{n} \sum_{j=1}^{K_{i}} \frac{\dot{T}_{ij}^{*}(θ)}{T_{ij}^{*}(θ)} - n \frac{\sum_{i=1}^{n} \dot{τ}_{i}^{*}(θ) [1 + \log τ_{i}^{*}(θ)]}{\sum_{i=1}^{n} τ_{i}^{*}(θ)} \\ & + n \left[ \frac{\sum_{i=1}^{n} τ_{i}^{*}(θ) \log τ_{i}^{*}(θ)}{\sum_{i=1}^{n} τ_{i}^{*}(θ)} \right] \left[ \frac{\sum_{i=1}^{n} \dot{τ}_{i}^{*}(θ)}{\sum_{i=1}^{n} τ_{i}^{*}(θ)} \right] - \sum_{i=1}^{n} \sum_{j=1}^{K_{i}+1} \dot{T}_{ij}^{*}(θ) [1 + \log T_{ij}^{*}(θ)]. \end{aligned}

(27)

Using these function definitions, the Newton-Raphson iteration for obtaining the estimate of the Weibull shape parameter θ₁ is given by

\hat{θ}_{1}^{\mathrm{new}} \leftarrow \hat{θ}_{1}^{\mathrm{old}} - \frac{q(\hat{θ}_{1}^{\mathrm{old}})}{\dot{q}(\hat{θ}_{1}^{\mathrm{old}})}.

(28)

Upon obtaining the estimate of θ₁, the estimates of the scale parameter θ₂ and the monitoring parameter β could then be obtained using (24) for ${\hat{θ}}_{2}$ and, from (15),

\hat{β} = \frac{n}{\sum_{i = 1}^{n} {({\hat{θ}}_{2} τ_{i})}^{{\hat{θ}}_{1}}} .

(29)
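The procedure in (24) through (29) can be sketched in a few lines of code. The text suggests R; the illustration below instead uses Python with only the standard library, and it is a sketch under stated assumptions rather than the authors' implementation. The names `simulate_weibull_kg`, `q_and_qdot`, and `fit_weibull_kg`, the data layout, the starting value 1.0, and the step-halving guard that keeps the iterate positive are all choices made here. The sketch assumes at least one observed event and strictly positive times.

```python
import math
import random

def simulate_weibull_kg(n, shape, scale, beta, seed=0):
    """Hypothetical generator: list of (tau_i, gaps_i); gaps_i holds the K_i observed
    gaps followed by the censored residual tau_i - S_{iK_i}."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        # tau has survivor function exp(-beta * (scale * t) ** shape)
        tau = (rng.expovariate(1.0) / beta) ** (1.0 / shape) / scale
        gaps, s = [], 0.0
        while True:
            t = rng.expovariate(1.0) ** (1.0 / shape) / scale
            if s + t > tau:
                break
            gaps.append(t)
            s += t
        gaps.append(tau - s)
        data.append((tau, gaps))
    return data

def q_and_qdot(theta, data):
    """Evaluate q(theta) from (26) and its derivative from (27)."""
    n = len(data)
    k_dot = sum(len(g) - 1 for _, g in data)                        # K.
    W = sum(t ** theta for _, g in data for t in g)                 # W(theta)
    W_dot = sum(t ** theta * math.log(t) for _, g in data for t in g)

    def star(t):       # K. * w(t, theta) / W(theta)
        return k_dot * t ** theta / W

    def star_dot(t):   # K. * r(t, theta)
        return k_dot * (t ** theta / W) * (math.log(t) - W_dot / W)

    tau_s = [star(tau) for tau, _ in data]
    tau_d = [star_dot(tau) for tau, _ in data]
    sum_tau = sum(tau_s)
    sum_tau_log = sum(x * math.log(x) for x in tau_s)
    q = n + k_dot + sum(math.log(x) for x in tau_s) - n * sum_tau_log / sum_tau
    qd = (sum(d / x for d, x in zip(tau_d, tau_s))
          - n * sum(d * (1.0 + math.log(x)) for d, x in zip(tau_d, tau_s)) / sum_tau
          + n * (sum_tau_log / sum_tau) * (sum(tau_d) / sum_tau))
    for _, gaps in data:
        for j, t in enumerate(gaps):
            x, d = star(t), star_dot(t)
            if j < len(gaps) - 1:            # observed gaps only, j = 1, ..., K_i
                q += math.log(x)
                qd += d / x
            q -= x * math.log(x)             # all K_i + 1 gaps
            qd -= d * (1.0 + math.log(x))
    return q, qd

def fit_weibull_kg(data, theta1=1.0, tol=1e-10, max_iter=50):
    """Newton-Raphson iteration (28), then (24) for theta2 and (29) for beta."""
    for _ in range(max_iter):
        q, qd = q_and_qdot(theta1, data)
        step = q / qd
        while theta1 - step <= 0.0:          # guard added here, not in the paper
            step /= 2.0
        theta1 -= step
        if abs(step) < tol:
            break
    k_dot = sum(len(g) - 1 for _, g in data)
    W = sum(t ** theta1 for _, g in data for t in g)
    theta2 = (k_dot / W) ** (1.0 / theta1)
    beta = len(data) / sum((theta2 * tau) ** theta1 for tau, _ in data)
    return theta1, theta2, beta

data = simulate_weibull_kg(200, shape=1.5, scale=1.0, beta=0.3, seed=1)
print(fit_weibull_kg(data))   # estimates of (theta1, theta2, beta)
```

Because θ₂ and β are profiled out through (24) and (29), only a one-dimensional root search in θ₁ is needed, which is the computational convenience the text points to.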

3 Asymptotic Properties

In this section we study the asymptotic properties of the ML estimators. We will make use of the results in Borgan (1984), which deals with the consistency and asymptotic normality of ML estimators in parametric counting process models. We will consider the case where the number of subjects is increasing to infinity (n → ∞) in contrast to the situation where only one subject is considered and the time of monitoring increases to infinity (τ → ∞) as in Sellke (1988). Some of the regularity conditions in Borgan (1984) will be reformulated in terms of gap-times, which will enable obtaining more useful analytical conditions.

3.1 Reformulated Processes

Following an idea exploited in Sellke (1988) and also in Peña et al. (2001), we define stochastic processes $Z_{i} : ℜ_{+}^{2} \to \{0, 1\}$ via

Z_{i}(s, t) = I\{R_{i}(s) \leq t\} \quad \text{for } i = 1, 2, \dots, n.

(30)

The first time parameter corresponds to calendar time, while the second time parameter represents gap or inter-event time. Note that Z_i(s, t) indicates whether, at calendar time s, at most t units of time have elapsed since the last event occurrence. Because Z_i(·, t) is F-adapted and has left-continuous paths, it is an F-predictable process; it is also bounded.

To facilitate our asymptotic analysis we introduce a generalized likelihood process involving two time indices $L_{G} : ℜ_{+}^{2} \times ϴ \times ℜ_{+} \to ℜ_{+}$ defined via

\begin{aligned} L_{G}(s, t; θ, β) = \prod_{i=1}^{n} \Bigg\{ & \left[ \prod_{v=0}^{s} λ_{G}(v; θ, β)^{ΔN_{i}^{τ}(v)} \right] \exp\left\{ - \int_{0}^{s} Y_{i}^{†}(v) λ_{G}(v; θ, β) \, dv \right\} \times \\ & \left[ \prod_{v=0}^{s} λ_{F}(R_{i}(v); θ)^{Z_{i}(v, t) ΔN_{i}^{†}(v)} \right] \exp\left\{ - \int_{0}^{s} Y_{i}^{†}(v) Z_{i}(v, t) λ_{F}(R_{i}(v); θ) \, dv \right\} \Bigg\}. \end{aligned}

(31)

Notice that the likelihood in Sellke (1988) could not be used directly to derive the asymptotic properties here because it does not contain an informative censoring part: there, the distribution of the censoring time does not depend on that of the inter-event time. The most important difference between the two likelihoods is therefore the contribution of the censored observations. Observe that L(s; θ, β) = lim_t→∞L_G(s, t; θ, β) since lim_t→∞Z_i(s, t) = 1. Therefore, functions and estimators derived from L(s; θ, β) are limits of analogous functions and estimators obtained from L_G(s, t; θ, β) as t → ∞. As we will see, however, dealing with L_G is more convenient analytically.

Taking the logarithm of L_G to obtain the generalized log-likelihood l_G = log L_G, and then the partial derivatives of l_G with respect to θ and β, we obtain the generalized score process

U_{G} (s, t; θ, β) = [\begin{matrix} \partial ∕ \partial θ \\ \partial ∕ \partial β \end{matrix}] l_{G} (s, t; θ, β) = \sum_{i = 1}^{n} \int_{0}^{s} H_{i} (v; θ, β) M_{i} (d v, t; θ, β),

(32)

where H_i is as defined in (12), and M_i(s, t; θ, β) = ${(M_{i}^{τ} (s; θ, β), M_{i} (s, t; θ))}^{t}$ with

M_{i} (s, t; θ) = \int_{0}^{s} Z_{i} (v, t) M_{i}^{†} (d v; θ) .

If (θ, β) are the true parameter values, then for fixed t ∈ $ℜ_{+}$ , M_i(·, t; θ, β) is a vector of square-integrable martingales with F-predictable quadratic variation (PQV) process given by

〈 M_{i} (\cdot, t; θ, β) 〉 (s) = Dg (\int_{0}^{s} Y_{i}^{†} (v) β λ_{F} (v; θ) d v, \int_{0}^{s} Y_{i}^{†} (v) Z_{i} (v, t) λ_{F} (R_{i} (v); θ) d v),

(33)

where Dg(a) is the diagonal matrix with diagonal elements being the elements of the vector a. Re-scaling by 1/n and simplifying, we have

Σ_{n} (s, t; θ, β) \equiv \frac{1}{n} 〈 U_{G} (\cdot, t; θ, β) 〉 (s) = [\begin{matrix} Σ_{11 n} (s, t; θ, β) & Σ_{12 n} (s; θ, β) \\ Σ_{21 n} (s; θ, β) & σ_{22 n} (s; θ, β) \end{matrix}]

where

\begin{matrix} Σ_{11 n} (s, t; θ, β) & = Σ_{11 A n} (s; θ, β) + Σ_{11 B n} (s, t; θ, β); \\ Σ_{11 A n} (s; θ, β) & = \frac{1}{n} \sum_{i = 1}^{n} \int_{0}^{s} φ_{F} {(v; θ)}^{\otimes 2} Y_{i}^{†} (v) β λ_{F} (v; θ) d v; \\ Σ_{11 B n} (s, t; θ, β) & = \frac{1}{n} \sum_{i = 1}^{n} \int_{0}^{s} φ_{F} {(R_{i} (v); θ)}^{\otimes 2} Y_{i}^{†} (v) Z_{i} (v, t) λ_{F} (R_{i} (v); θ) d v; \\ Σ_{12 n} (s; θ, β) & = Σ_{21 n} {(s; θ, β)}^{t} = \frac{1}{n} \sum_{i = 1}^{n} \int_{0}^{s} \frac{1}{β} φ_{F} (v; θ) Y_{i}^{†} (v) β λ_{F} (v; θ) d v; \\ σ_{22 n} (s; θ, β) & = \frac{1}{n} \sum_{i = 1}^{n} \int_{0}^{s} \frac{1}{β^{2}} Y_{i}^{†} (v) β λ_{F} (v; θ) d v . \end{matrix}

The matrix Σ_11Bn(s, t; θ, β) has the alternative representation given by

Σ_{11 B n} (s, t; θ, β) = \frac{1}{n} \sum_{i = 1}^{n} \int_{0}^{t} φ_{F} {(w; θ)}^{\otimes 2} Y_{i} (s, w) λ_{F} (w; θ) d w,

where

Y_{i} (s, t) = \sum_{j = 1}^{N_{i}^{†} (s -)} I {T_{i j} \geq t} + I {(s \land τ_{i}) - S_{i N_{i}^{†} (s -)} \geq t}

(34)

is the generalized at-risk process (cf., Peña et al. (2001)). It is of interest to obtain the limit of Σ_n(s, t; θ, β) as n → ∞ to be able to use Borgan's results. For this purpose, we need the following lemma.

Lemma 1 For (s*, t*) ∈ (0, ∞)², as n → ∞,

(i) $\sup_{v \in [0, s^{⋆}]} ∣ \frac{1}{n} \sum_{i = 1}^{n} Y_{i}^{†} (v) - \overset{‒}{G} (v) ∣ \overset{p}{\to} 0$ ;
(ii) $\sup_{(v, w) \in [0, s^{⋆}] \times [0, t^{⋆}]} ∣ \frac{1}{n} \sum_{i = 1}^{n} Y_{i} (v, w) - y (v, w; θ, β) ∣ \overset{p}{\to} 0$ ,

where the function y(s, t; θ, β) is given by (cf., Peña et al. (2001))

y (s, t; θ, β) = \overset{‒}{F} (t; θ) \overset{‒}{G} (t; θ, β) {1 + \frac{1}{\overset{‒}{G} (t; θ, β)} \int_{t}^{s} ρ_{F} (w - t; θ) G (d w; θ, β)} I {t \leq s},

(35)

where ρ_F(·; θ), the renewal function, is given in (1).

Proof: Result (i) is just the Glivenko-Cantelli theorem, whereas result (ii) was established in Peña et al. (2001). ∥
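To make the generalized at-risk process in (34) concrete, the following sketch counts, for one unit, the completed gap times of length at least t that were observed strictly before calendar time s, plus the gap still in progress at time s ∧ τ_i. It assumes that N_i^†(s−) counts the events occurring before s and within the monitoring period; the function name and the example values are hypothetical.

```python
def generalized_at_risk(gap_times, tau, s, t):
    """Y_i(s, t) in (34) for one unit with successive gap times and monitoring time tau."""
    count = 0
    last_event = 0.0
    cumulative = 0.0
    for gap in gap_times:
        cumulative += gap
        # Stop at the first event that falls outside [0, tau] or not strictly before s.
        if cumulative > tau or cumulative >= s:
            break
        if gap >= t:
            count += 1  # completed gap of length at least t
        last_event = cumulative
    # Gap in progress (right-censored) at calendar time min(s, tau).
    if min(s, tau) - last_event >= t:
        count += 1
    return count


gaps = [1.0, 1.5, 1.5, 3.0]  # hypothetical gap times; events at 1.0, 2.5, 4.0, 7.0
print(generalized_at_risk(gaps, tau=5.0, s=10.0, t=1.2))
```

At t = 0 the function returns the number of observed events before s plus one, since every gap, including the censored one, is at risk at gap time zero.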

Using this lemma, it is immediate that as n → ∞,

sup_{(v, w) \in [0, s^{⋆}] \times [0, t^{⋆}]} ‖ Σ_{n} (v, w; θ, β) - Σ (v, w; θ, β) ‖ \overset{p}{\to} 0

where the limiting matrix

Σ (s, t; θ, β) = [\begin{matrix} Σ_{11} (s, t; θ, β) & Σ_{12} (s; θ, β) \\ Σ_{21} (s; θ, β) & σ_{22} (s; θ, β) \end{matrix}]

(36)

has components given by

\begin{matrix} Σ_{11} (s, t; θ, β) & = Σ_{11 A} (s; θ, β) + Σ_{11 B} (s, t; θ, β); \\ Σ_{11 A} (s; θ, β) & = \int_{0}^{s} φ_{F} {(v; θ)}^{\otimes 2} β λ_{F} (v; θ) exp {- β Λ_{F} (v; θ)} d v; \\ Σ_{11 B} (s, t; θ, β) & = \int_{0}^{t} φ_{F} {(w; θ)}^{\otimes 2} y (s, w; θ, β) λ_{F} (w; θ) d w; \\ Σ_{12} (s; θ, β) & = Σ_{21} {(s; θ, β)}^{t} = \int_{0}^{s} \frac{1}{β} φ_{F} (v; θ) β λ_{F} (v; θ) exp {- β Λ_{F} (v; θ)} d v; \\ σ_{22} (s; θ, β) & = \int_{0}^{s} \frac{1}{β^{2}} β λ_{F} (v; θ) exp {- β Λ_{F} (v; θ)} d v . \end{matrix}

Also, observe that the limiting matrix in (36) can be decomposed into

\begin{matrix} Σ (s, t; θ, β) = & \int_{0}^{s} {[\begin{matrix} φ (v; θ) \\ 1 ∕ β \end{matrix}]}^{\otimes 2} β λ_{F} (v; θ) exp {- β Λ_{F} (v; θ)} d v + \\ \int_{0}^{t} {[\begin{matrix} φ (w; θ) \\ 0 \end{matrix}]}^{\otimes 2} y (s, w; θ, β) λ_{F} (w; θ) d w . \end{matrix}

Furthermore, with

I_{G} (s, t; θ, β) = - [\begin{matrix} \frac{\partial^{2}}{\partial θ \partial θ^{t}} & \frac{\partial^{2}}{\partial θ \partial β} \\ \frac{\partial^{2}}{\partial β \partial θ^{t}} & \frac{\partial^{2}}{\partial β^{2}} \end{matrix}] l_{G} (s, t; θ, β)

(37)

the generalized observed Fisher information process, it is straightforward to show that as n →∞,

sup_{(v, w) \in [0, s^{⋆}] \times [0, t^{⋆}]} ‖ \frac{1}{n} I_{G} (v, w; θ, β) - Σ (v, w; θ, β) ‖ \overset{p}{\to} 0 .

The limiting results pertaining to Σ_n and I_G are conditions in Borgan (1984)'s set of regularity conditions, and from his theorem the following results for the recurrent event setting follow.

Theorem 1 Let (s*, t*) ∈ $ℜ_{+}^{2}$ such that Σ(s*, t*; θ, β) is positive definite. Under the ‘usual regularity conditions’ in Borgan (1984),

(Consistency) There exists a sequence of solutions ( ${\hat{θ}}_{n}$ (s*, t*), ${\hat{β}}_{n}$ (s*, t*))^t to the sequence of equations U_G(s*, t*; θ, β) = 0 satisfying, as n → ∞,
$‖ (\begin{matrix} {\hat{θ}}_{n} (s^{⋆}, t^{⋆}) \\ {\hat{β}}_{n} (s^{⋆}, t^{⋆}) \end{matrix}) - (\begin{matrix} θ \\ β \end{matrix}) ‖ \overset{p}{\to} 0;$
(Asymptotic Normality) As n → ∞,
$\sqrt{n} [(\begin{matrix} {\hat{θ}}_{n} (s^{⋆}, t^{⋆}) \\ {\hat{β}}_{n} (s^{⋆}, t^{⋆}) \end{matrix}) - (\begin{matrix} θ \\ β \end{matrix})] \overset{d}{\to} M V N (0, Σ {(s^{⋆}, t^{⋆}; θ, β)}^{- 1}),$

where MVN means multivariate normal. Furthermore, $\frac{1}{n} I_{G} (s^{⋆}, t^{⋆}; {\hat{θ}}_{n} (s^{⋆}, t^{⋆}), {\hat{β}}_{n} (s^{⋆}, t^{⋆}))$ is a consistent estimator of Σ(s*, t*; θ, β).

The following corollary is immediate from Theorem 1.

Corollary 1 Under the conditions of Theorem 1, as n → ∞,

\sqrt{n} ({\hat{θ}}_{n} (s^{⋆}, t^{⋆}) - θ) \overset{d}{\to} M V N (0, Σ^{11} (s^{⋆}, t^{⋆}; θ, β)),

where Σ¹¹(s*, t*; θ, β) = (Γ(s*, t*; θ, β) + Ξ(s*; θ, β))⁻¹ with

Γ (s, t; θ, β) = \int_{0}^{t} φ_{F} {(w; θ)}^{\otimes 2} y (s, w; θ, β) λ_{F} (w; θ) d w;

(38)

\begin{matrix} Ξ (s; θ, β) = & \int_{0}^{s} φ_{F} {(v; θ)}^{\otimes 2} β λ_{F} (v; θ) exp {- β Λ_{F} (v; θ)} d v - \\ \frac{{(\int_{0}^{s} φ_{F} (v; θ) β λ_{F} (v; θ) exp {- β Λ_{F} (v; θ)} d v)}^{\otimes 2}}{\int_{0}^{s} β λ_{F} (v; θ) exp {- β Λ_{F} (v; θ)} d v} . \end{matrix}

(39)

Since the case of main interest to us is t → ∞, we observe that, because y(s, t; θ, β) = 0 whenever s < t, the limiting function arising from (38) as t → ∞ is

Γ (s; θ, β) \equiv lim_{t \to \infty} Γ (s, t; θ, β) = \int_{0}^{s} φ_{F} {(w; θ)}^{\otimes 2} y (s, w; θ, β) λ_{F} (w; θ) d w .

(40)

With a view towards investigating the loss in efficiency when one uses a fully nonparametric estimator of the inter-event survivor function F̄, from Corollary 1 and the δ-method, we are able to get the asymptotic distribution of the parametric estimator $\hat{\overset{‒}{F}} (t)$ of F̄(t) in (17).

Corollary 2 Under the conditions of Corollary 1, as n → ∞,

\sqrt{n} ({\hat{\overset{‒}{F}}}_{n} (t) - \overset{‒}{F} (t; θ)) \overset{d}{\to} N (0, η^{2} (t; θ, β))

where the variance function is

η^{2} (t; θ, β) = exp {- 2 Λ_{F} (t; θ)} {(\nabla_{θ} Λ_{F} (t; θ))}^{t} {[Γ (θ, β) + Ξ (θ, β)]}^{- 1} (\nabla_{θ} Λ_{F} (t; θ)),

(41)

where Γ(θ, β) and Ξ(θ, β) are obtained from (38) and (39), respectively, by letting s and t both tend to ∞.
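The δ-method step behind (41) can be written out as follows. This is a sketch under the assumption that the parametric estimator in (17) is the plug-in estimator obtained by evaluating F̄(t; θ) = exp{−Λ_F(t; θ)} at the estimate of θ.

```latex
\begin{aligned}
\bar{F}(t; θ) &= \exp\{-Λ_{F}(t; θ)\}, \\
\nabla_{θ} \bar{F}(t; θ) &= -\exp\{-Λ_{F}(t; θ)\}\, \nabla_{θ} Λ_{F}(t; θ), \\
η^{2}(t; θ, β) &= (\nabla_{θ} \bar{F}(t; θ))^{t}\, [Γ(θ, β) + Ξ(θ, β)]^{-1}\, (\nabla_{θ} \bar{F}(t; θ)) \\
&= \exp\{-2 Λ_{F}(t; θ)\}\, (\nabla_{θ} Λ_{F}(t; θ))^{t}\, [Γ(θ, β) + Ξ(θ, β)]^{-1}\, (\nabla_{θ} Λ_{F}(t; θ)).
\end{aligned}
```

The minus sign in the gradient disappears in the quadratic form, which is why only the factor exp{−2Λ_F(t; θ)} remains in front.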

Note that under regularity conditions,

\nabla_{θ} Λ_{F} (t; θ) = \int_{0}^{t} φ_{F} (v; θ) λ_{F} (v; θ) d v .

We remark here that the result in Corollary 2 could be strengthened to the weak convergence of the process $\sqrt{n} ({\hat{\overset{‒}{F}}}_{n} (\cdot) - \overset{‒}{F} (\cdot; θ))$ to a Gaussian process, see Adekpedjou (2007)'s dissertation for this extended result; however, for the specific purpose of performing efficiency comparisons, the weaker version in Corollary 2 suffices.

4 Efficiency Comparisons

The second goal of this article is to examine efficiency issues in the presence of an informative monitoring period. The major reason for introducing the generalized KG model for recurrent events is in order to have an analytically tractable model which facilitates the investigation of some efficiency questions with recurrent event data. Through the informative monitoring model, we seek to provide concrete answers to the following questions. (i) If one employs an estimator of θ that was derived ignoring the informativeness of G on F, how much efficiency loss is incurred? (ii) If one uses an estimator of θ which was derived ignoring the additional event recurrences after the first event, but which takes into account the informativeness of G on F, what is the cost in terms of efficiency? This further leads to the question of how much information is contributed by the event occurrences past the first occurrence. (iii) If one adopts the fully nonparametric estimator of F̄(t) studied in Peña et al. (2001), but the true underlying model is the generalized KG model, how much efficiency is sacrificed?

4.1 Ignoring Informative Monitoring

We now address the first question. We suppose that ${\tilde{θ}}_{n} (s^{*})$ is the estimator of θ that was derived ignoring the informativeness of G on F. Then, it is not difficult to see that the limiting variance of $\sqrt{n} {\tilde{θ}}_{n} (s^{*})$ equals Γ(s*; θ, β)⁻¹, where Γ(s*; θ, β) is the t → ∞ limit in (40) of the matrix defined in (38). With det Q denoting the determinant of a matrix Q, a measure of the change in asymptotic relative efficiency of the sequence { ${\hat{θ}}_{n} (s^{*})$ } relative to the sequence { ${\tilde{θ}}_{n} (s^{*})$ } is

\begin{matrix} Δ ARE ({{\hat{θ}}_{n} (s^{*})} : {{\tilde{θ}}_{n} (s^{*})}) = & det [Γ {(s^{*}; θ, β)}^{- 1} (Γ (s^{*}; θ, β) + Ξ (s^{*}; θ, β)) - I] \\ = & det [Γ {(s^{*}; θ, β)}^{- 1} Ξ (s^{*}; θ, β)] = \frac{det Ξ (s^{*}; θ, β)}{det Γ (s^{*}; θ, β)} . \end{matrix}

Let us examine further the situation when s* → ∞. Denote by

{\overset{‒}{φ}}_{F} (\cdot; θ) = φ_{F} (Λ_{F}^{- 1} (\cdot; θ); θ)

(42)

with $Λ_{F}^{- 1}$ (·; θ) being the inverse of Λ_F(·; θ). Then, making the variable substitution z = βΛ_F(v; θ) in the expression for Ξ(θ, β), we obtain the compact expression

\begin{matrix} Ξ (θ, β) & = \int_{0}^{\infty} {\overset{‒}{φ}}_{F} {(z ∕ β; θ)}^{\otimes 2} exp (- z) d z - {(\int_{0}^{\infty} {\overset{‒}{φ}}_{F} (z ∕ β; θ) exp (- z) d z)}^{\otimes 2} \\ = Cov [{\overset{‒}{φ}}_{F} (\frac{Z}{β}; θ)], \end{matrix}

(43)

where Z is a unit exponential random variable. Note in particular that Ξ(θ, β), being a covariance matrix, is nonnegative definite, so det Ξ(θ, β) ≥ 0. Since Γ(θ, β) is also a limiting covariance matrix, hence is nonnegative definite, then this establishes that

Δ ARE ({{\hat{θ}}_{n} (\infty)} : {{\tilde{θ}}_{n} (\infty)}) \geq 0,

as is to be expected. Analogously, by straightforward manipulations and with Z still a unit exponential variable, we are able to find the more compact expression for Γ(θ, β) given by

\begin{matrix} Γ (θ, β) = (\frac{1}{1 + β}) E [{\overset{‒}{φ}}_{F} {(\frac{Z}{1 + β}; θ)}^{\otimes 2}] + \\ \int_{0}^{\infty} {\overset{‒}{φ}}_{F} {(v; θ)}^{\otimes 2} exp (- v) [\int_{v}^{\infty} ρ_{F} (Λ_{F}^{- 1} (z; θ) - Λ_{F}^{- 1} (v; θ); θ) β exp (- β z) d z] d v . \end{matrix}

(44)

In the special case where the recurrent event accrual follows a homogeneous Poisson process so that Λ_F(t; θ) = θt, we are able to obtain closed form expressions for the above quantities. In this situation, ϕ_F(t;θ) = 1/θ and the renewal function is ρ_F(t; θ) = θtI(t ≥ 0). Since ϕ_F(·; θ) is constant in t, this immediately shows that Ξ(θ, β) = 0. Straightforward calculations also show that in this exponential case,

Γ (θ, β) = \frac{1}{θ^{2} β} .

(45)

Observe that the inverse (reciprocal) of this quantity equals the (1, 1)th element of Σ(∞; θ, β) given in (21) which was obtained via a limiting operation on the observed Fisher information.

Therefore, when F is the exponential distribution, ΔARE( $\hat{θ} : \tilde{θ}$ ) = 0, that is, there is no loss of efficiency by ignoring the fact that Ḡ(t; θ, β) = exp(−βθt) is informative about F̄(t;θ) = exp(−θt)! Surprising as it may look at first, this actually is a logical result for in this case, from (19) and by an easy calculation,

{\hat{θ}}_{n} = {\tilde{θ}}_{n} = \frac{\sum_{i = 1}^{n} K_{i}}{\sum_{i = 1}^{n} τ_{i}},

(46)

that is, both estimators of θ are just the occurrence-exposure rate. However, for other non-exponential F where ϕ_F(·; θ) is not constant, then det Ξ(θ, β) need not be zero, hence there could be loss in efficiency by ignoring the informativeness of G for F, as in the situation of Weibull inter-event times dealt with via simulations in Section 5.
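The occurrence-exposure rate in (46) is easy to examine by simulation. The sketch below generates homogeneous Poisson process data with exponential monitoring times whose rate is βθ, as implied by Ḡ(t; θ, β) = exp(−βθt), and returns the estimate together with the average number of events per unit. The function names, the seed, and the parameter values are hypothetical choices for illustration.

```python
import random


def simulate_unit(theta, beta, rng):
    """Return (K, tau) for one unit: HPP(theta) events observed on [0, tau]."""
    tau = rng.expovariate(beta * theta)  # survivor function exp(-beta * theta * t)
    k, elapsed = 0, rng.expovariate(theta)
    while elapsed <= tau:
        k += 1
        elapsed += rng.expovariate(theta)
    return k, tau


def occurrence_exposure_rate(theta, beta, n, seed=0):
    """Total events over total monitoring time, and the mean number of events per unit."""
    rng = random.Random(seed)
    units = [simulate_unit(theta, beta, rng) for _ in range(n)]
    total_events = sum(k for k, _ in units)
    total_exposure = sum(tau for _, tau in units)
    return total_events / total_exposure, total_events / n


print(occurrence_exposure_rate(theta=2.0, beta=0.5, n=5000))
```

For large n the first returned value should be close to θ and the second close to 1/β, the expected number of events per monitoring period in the exponential case.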

4.2 Single-Event versus Recurrent Event Methods

To address the second efficiency question, denote by ${\overset{ˇ}{θ}}_{n}$ the estimator of θ based only on the possibly right-censored first event times, that is, the single-event estimator, but taking into account the informativeness of G for F. Then, the only technical change is that the appropriate y(∞, t; θ, β) function to utilize in the Γ(θ, β) expression is

y_{1} (\infty, t; θ, β) = \overset{‒}{F} (t; θ) \overset{‒}{G} (t; θ, β) = exp {- (1 + β) Λ_{F} (t; θ)} .

(47)

As a consequence, the limiting variance of $\sqrt{n} {\overset{ˇ}{θ}}_{n}$ is [Γ₁(θ, β) + Ξ(θ, β)]⁻¹, where

\begin{matrix} Γ_{1} (θ, β) & = \int_{0}^{\infty} φ_{F} {(v; θ)}^{\otimes 2} exp {- (1 + β) Λ_{F} (v; θ)} λ_{F} (v; θ) d v \\ = (\frac{1}{1 + β}) E [{\overset{‒}{φ}}_{F} {(\frac{Z}{1 + β}; θ)}^{\otimes 2}], \end{matrix}

(48)

with Z a unit exponential random variable. With Γ₂(θ, β) denoting the second term in the expression for Γ(θ, β) in (44), that is,

\begin{matrix} Γ_{2} (θ, β) = \int_{0}^{\infty} {\overset{‒}{φ}}_{F} {(v; θ)}^{\otimes 2} exp (- v) \times \\ [\int_{v}^{\infty} ρ_{F} (Λ_{F}^{- 1} (z; θ) - Λ_{F}^{- 1} (v; θ); θ) β exp (- β z) d z] d v, \end{matrix}

(49)

then the change in asymptotic relative efficiency of the sequence { ${\hat{θ}}_{n}$ } relative to { ${\overset{ˇ}{θ}}_{n}$ } is

\begin{matrix} Δ ARE ({{\hat{θ}}_{n}} : {{\overset{ˇ}{θ}}_{n}}) & = det [{(Γ_{1} (θ, β) + Ξ (θ, β))}^{- 1} (Γ_{1} (θ, β) + Γ_{2} (θ, β) + Ξ (θ, β)) - I] \\ = \frac{det Γ_{2} (θ, β)}{det (Γ_{1} (θ, β) + Ξ (θ, β))} . \end{matrix}

(50)

Again, this will always be positive, indicating that there will always be gain in efficiency by utilizing the additional event occurrences.

In the special case where F is the exponential distribution, simple calculations reveal that

Γ_{1} (θ, β) = \frac{1}{θ^{2} (1 + β)} and Γ_{2} (θ, β) = \frac{1}{θ^{2} β (1 + β)} .

As a consequence, when the inter-event times are exponentially distributed,

Δ ARE ({{\hat{θ}}_{n}} : {{\overset{ˇ}{θ}}_{n}}) = \frac{1}{β},

(51)

which as noted in Section 2 is the expected number of event occurrences in each unit's monitoring period. This result could further be interpreted as follows. When β → 0, then the monitoring period lengthens, consequently more recurrences will be observed (per unit), which will provide more information, thereby making { ${\hat{θ}}_{n}$ } more efficient relative to { ${\overset{ˇ}{θ}}_{n}$ } at the incremental order of 1/β. Whereas, when β → ∞, then the monitoring period shortens so there will either be no event observed or more likely just a single event observed (per unit), so in this situation, the two estimators become very close, hence the gain in efficiency goes down to zero.
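The exponential-case values of Γ₁ and Γ₂ can be checked numerically. The sketch below integrates the defining expressions with a simple midpoint rule, using ϕ_F = 1/θ, λ_F = θ, and the closed form y(∞, w; θ, β) = exp{−(1 + β)θw}(1 + 1/β), which follows from (35) with ρ_F(t; θ) = θt. The function names, the quadrature rule, and the truncation point are assumptions made for illustration.

```python
import math


def integrate(f, upper, steps=20_000):
    """Composite midpoint rule on [0, upper]."""
    h = upper / steps
    return h * sum(f((i + 0.5) * h) for i in range(steps))


def exponential_information_terms(theta, beta):
    """Numerical Gamma_1 and Gamma_2 for exponential inter-event times."""
    phi_sq = (1.0 / theta) ** 2  # phi_F(t; theta)^2, constant in t
    lam = theta                  # hazard rate of the exponential gap times
    upper = 60.0 / ((1.0 + beta) * theta)  # integrands are below exp(-60) past this point
    # First-event term: at-risk probability exp{-(1 + beta) * theta * v}.
    gamma1 = integrate(
        lambda v: phi_sq * math.exp(-(1 + beta) * theta * v) * lam, upper)
    # All-events term: y(inf, w) = exp{-(1 + beta) * theta * w} * (1 + 1 / beta).
    gamma = integrate(
        lambda w: phi_sq * math.exp(-(1 + beta) * theta * w) * (1 + 1 / beta) * lam, upper)
    return gamma1, gamma - gamma1


g1, g2 = exponential_information_terms(theta=2.0, beta=0.5)
print(g1, g2, g2 / g1)
```

Because Ξ(θ, β) = 0 in the exponential case, the ratio of the second to the first returned value is the efficiency gain in (51).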

4.3 Optimal Design Choice

Viewed in a different light, this result for the exponentially-distributed inter-event times implies that if experimenter #1 has a sample of size n₁ and uses the recurrences for estimating θ, then experimenter #2 will need a sample of size n₂ = n₁(1 + 1/β) to gain the same (asymptotic) precision if he/she ignores the recurrences and uses only the possibly right-censored times to first event occurrence, that is, single-event methods. The flip-side to these two approaches in terms of cost is that experimenter #1 will take more time to perform the study compared to experimenter #2, assuming that all units are entered into the study at the same time. A choice of which design to utilize entails taking into consideration costs associated with the experimental units and the duration of performing the study.

To amplify on the choice of study design, let us assume that C₁ is the cost per experimental unit, while C₂ is the cost incurred per unit of time while the study is ongoing. Let there be n₁ units in study design #1, which monitors all event occurrences over the units' monitoring periods [0, τ_i]. For this study, the overall study duration will be max_i≤n₁τ_i. On the other hand, let n₂ be the number of units for study design #2, which monitors each unit only until the first event occurs or until it gets right-censored by τ_i. The overall study duration in this case will be max_i≤n₂ ( $τ_{i} \land T_{i 1}$ ). Therefore, the expected total costs for each of these study designs will be, respectively,

\begin{matrix} T C_{1} (n_{1}; θ, β) & = C_{1} n_{1} + C_{2} E {max_{i \leq n_{1}} τ_{i}}; \\ T C_{2} (n_{2}; θ, β) & = C_{1} n_{2} + C_{2} E {max_{i \leq n_{2}} (τ_{i} \land T_{i 1})} . \end{matrix}

In the case of an exponential F, note that $τ_{i} \overset{d}{=} V_{i} ∕ (β θ)$ and $τ_{i} \land T_{i 1} \overset{d}{=} V_{i} ∕ ((β + 1) θ)$ where the V_is are i.i.d. unit exponential variables. By recalling that

E {V_{(n)}} = \sum_{j = 1}^{n} \frac{1}{n - j + 1} \approx log (n)

where V_(n) is the largest order statistic among V₁, V₂, ..., V_n i.i.d. EXP(1), then

\begin{matrix} T C_{1} (n_{1}; θ, β) & \approx C_{1} n_{1} + C_{2} log (n_{1}) ∕ (β θ); \\ T C_{2} (n_{2}; θ, β) & \approx C_{1} n_{2} + C_{2} log (n_{2}) ∕ ((β + 1) θ) . \end{matrix}

Since the two study designs will lead to the same asymptotic precision when n₂ = n₁(1 + 1/β), then study design #1 would be cost-preferable to study design #2 if TC₁(n₁; θ, β) < TC₂(n₁(1 + 1/β); θ, β). Using the above approximations, this inequality will occur if

\frac{C_{1}}{C_{2}} > \frac{1}{n_{1} θ (β + 1)} log [n_{1} {(\frac{β}{β + 1})}^{β}] .

(52)

Of course, for this decision criterion to be usable, some prior or pilot estimates of θ and β will be needed.
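The cost comparison between the two designs can also be carried out directly from the expected total costs, without the logarithmic approximation. The sketch below uses the exact harmonic-sum expression for the expected maximum of unit exponential variables and rounds n₂ = n₁(1 + 1/β) up to an integer. The rounding, the function names, and the cost values are assumptions made for illustration; in practice pilot estimates of θ and β would be plugged in.

```python
import math


def harmonic(n):
    """Expected maximum of n i.i.d. unit exponential variables."""
    return sum(1.0 / j for j in range(1, n + 1))


def expected_costs(n1, theta, beta, c1, c2):
    """Expected total costs of design #1 and of the equally precise design #2."""
    n2 = math.ceil(n1 * (1 + 1 / beta))  # sample size giving the same precision
    tc1 = c1 * n1 + c2 * harmonic(n1) / (beta * theta)
    tc2 = c1 * n2 + c2 * harmonic(n2) / ((beta + 1) * theta)
    return n2, tc1, tc2


# Hypothetical costs: cheap study time first, then expensive study time.
for c2 in (1.0, 100.0):
    n2, tc1, tc2 = expected_costs(n1=50, theta=1.0, beta=0.5, c1=1.0, c2=c2)
    print(c2, n2, round(tc1, 2), round(tc2, 2), "design #1" if tc1 < tc2 else "design #2")
```

When time is cheap relative to experimental units, the recurrent-event design tends to be preferable; when time is expensive, the shorter single-event design may win despite needing more units.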

4.4 Efficiency of Generalized PLE

We now address the third question posed earlier which pertains to the efficiency of the fully nonparametric estimator of the inter-event distribution F relative to an estimator derived using information about the structure of F and G. The KG model in the single-event settings was used for these efficiency studies, so the extended KG model for our recurrent event setting is a justifiably reasonable model to perform analogous efficiency studies. Peña et al. (2001) obtained the generalized product-limit estimator (GPLE) in this recurrent event setting. This estimator of the inter-event time survivor function F̄(t) is given by

{\tilde{\overset{‒}{F}}}_{n} (s^{⋆}, t) = \prod_{w \leq t} [1 - \frac{\sum_{i = 1}^{n} N_{i} (s^{⋆}, d w)}{\sum_{i = 1}^{n} Y_{i} (s^{⋆}, w)}]

(53)

where $N_{i} (s, t) = \int_{0}^{s} Z_{i} (v, t) N_{i}^{†} (d v)$ and Y_i(·, ·) is the generalized at-risk process defined in (34). It was established in their paper that ${\tilde{\overset{‒}{F}}}_{n} (s, t)$ is asymptotically normal with mean F̄(t) and asymptotic variance σ²(s, t; θ, β)/n, where

σ^{2} (s, t; θ, β) = \overset{‒}{F} {(t; θ)}^{2} \int_{0}^{t} \frac{Λ_{F} (d w; θ)}{y (s, w; θ, β)}

(54)

where y(s, t; θ, β) is the function given in (35). We compare this limiting variance with the limiting variance of the parametric estimator of F̄(t; θ), which is η²(t; θ, β) in (41). The comparison is made as s* → ∞.

Theorem 2 For the generalized KG model in this recurrent event setting, the asymptotic relative efficiency of the GPLE $\tilde{\overset{‒}{F}}$ relative to the parametric estimator $\hat{\overset{‒}{F}}$ at time point $Λ_{F}^{- 1} (t)$ is

A R E (\tilde{\overset{‒}{F}} : \hat{\overset{‒}{F}}) = \frac{{(\int_{0}^{t} {\overset{‒}{φ}}_{F} (w; θ) d w)}^{t} {[Γ (θ, β) + Ξ (θ, β)]}^{- 1} (\int_{0}^{t} {\overset{‒}{φ}}_{F} (w; θ) d w)}{\int_{0}^{t} {[1 + Q (w; θ, β)] exp (- (1 + β) w)}^{- 1} d w},

where

Q (w; θ, β) = \int_{0}^{\infty} ρ_{F} [Λ_{F}^{- 1} (v + w; θ) - Λ_{F}^{- 1} (w; θ); θ] β exp (- β v) d v .

(55)

Proof: The desired asymptotic relative efficiency is the ratio between η²( $Λ_{F}^{- 1}$ (t; θ); θ, β) and σ²( $Λ_{F}^{- 1}$ (t; θ); θ, β). The expressions in the theorem follow via straightforward manipulations of the expressions for η²(t; θ, β) and σ²(t; θ, β), by using the variable transformation v = Λ_F(t) in the integrals, and noting that ∇_θΛ_F(t; θ) = $\int_{0}^{t}$ φ_F(w; θ) λ_F(w; θ)dw. ∥

Corollary 3 For F̄(t; θ) = exp(−θt), the exponential case, the ARE of the GPLE relative to the parametric estimator at time point $Λ_{F}^{- 1}$ (t) is

A R E (\tilde{\overset{‒}{F}} : \hat{\overset{‒}{F}}) = \frac{{[(1 + β) t]}^{2}}{exp {(1 + β) t} - 1} = \frac{p (t; β)}{1 - p (t; β)} {(- log p (t; β))}^{2}

with $p (t; β) = \Pr {(τ_{i} \land T_{i 1}) > Λ_{F}^{- 1} (t; θ)} = \overset{‒}{F} (Λ_{F}^{- 1} (t)) \overset{‒}{G} (Λ_{F}^{- 1} (t))$ .

Proof: In this exponential setting, we have already seen from earlier calculations that Ξ(θ, β) = 0, while Γ(θ, β) = 1/(θ²β). Furthermore, $Λ_{F}^{- 1}$ (t; θ) = t/θ, ρ_F(t; θ) = θt, ϕ_F(t; θ) = 1/θ, so that ${\overset{‒}{φ}}_{F}$ (t; θ) = 1/θ. The expressions follow upon simplifying the ARE and noting the above facts. ∥
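The simplification mentioned in the proof can be spelled out by inserting the exponential-case quantities into the ARE expression of Theorem 2. The following is a sketch of that calculation, with "numerator" and "denominator" referring to the two parts of that expression.

```latex
\begin{aligned}
\int_{0}^{t} \bar{φ}_{F}(w; θ)\, dw &= \frac{t}{θ}, \qquad
[Γ(θ, β) + Ξ(θ, β)]^{-1} = θ^{2} β
\;\Rightarrow\; \text{numerator} = β t^{2}, \\
Q(w; θ, β) &= \int_{0}^{\infty} v\, β \exp(-β v)\, dv = \frac{1}{β}, \\
\text{denominator} &= \frac{β}{1 + β} \int_{0}^{t} \exp\{(1 + β) w\}\, dw
= \frac{β\, [\exp\{(1 + β) t\} - 1]}{(1 + β)^{2}}, \\
ARE &= \frac{β t^{2} (1 + β)^{2}}{β\, [\exp\{(1 + β) t\} - 1]}
= \frac{[(1 + β) t]^{2}}{\exp\{(1 + β) t\} - 1}.
\end{aligned}
```

Writing p = exp{−(1 + β)t} then gives (1 + β)t = −log p and exp{(1 + β)t} − 1 = (1 − p)/p, which yields the second form in Corollary 3.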

From the ARE expression in Corollary 3, the following properties are easily established.

Corollary 4 The ARE( $\tilde{\overset{‒}{F}} : \hat{\overset{‒}{F}}$ ) in Corollary 3 satisfies (i) lim_β→∞ARE( $\tilde{\overset{‒}{F}} : \hat{\overset{‒}{F}}$ ) = 0; (ii) lim_β→0ARE( $\tilde{\overset{‒}{F}} : \hat{\overset{‒}{F}}$ ) = t²/[exp(t) − 1]; and (iii) if p₀ is the solution in (0, 1) of exp(p − 1) − $\sqrt{p} = 0$ , then

sup_{p (t; β) \in [0, 1]} A R E (\tilde{\overset{‒}{F}} : \hat{\overset{‒}{F}}) = \frac{p_{0} {(- log p_{0})}^{2}}{1 - p_{0}} \approx 0.65 .
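The maximizer p₀ and the bound of approximately 0.65 can be reproduced numerically. The sketch below evaluates the ARE of Corollary 3 as a function of p and locates its maximum with a golden-section search, which relies on the function being unimodal on (0, 1). The function names and search settings are hypothetical.

```python
import math


def are_gple(p):
    """ARE of the GPLE relative to the parametric estimator, for p in (0, 1)."""
    return p * math.log(p) ** 2 / (1.0 - p)


def maximize_are(lo=1e-6, hi=1 - 1e-6, iters=200):
    """Golden-section search for the maximizer of are_gple on (lo, hi)."""
    ratio = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    for _ in range(iters):
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if are_gple(c) < are_gple(d):
            a = c  # the maximum lies to the right of c
        else:
            b = d  # the maximum lies to the left of d
    p0 = (a + b) / 2
    return p0, are_gple(p0)


p0, max_are = maximize_are()
print(p0, max_are, math.exp(p0 - 1) - math.sqrt(p0))
```

The last printed value is the residual of the equation defining p₀ and should be essentially zero at the maximizer.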

The ARE expression in Corollary 3 for the exponential inter-event time distribution as a function of p(t; β) is plotted in Figure 1. In Corollary 4, the result in (i) indicates that the shorter the monitoring periods become, which happens when β → ∞, then the more inefficient the GPLE becomes. This may seem surprising at first sight since one may think that there will be very few observed recurrences; however, because of the informativeness of G for F, then the right-censored observations will contain information about F, and this is being exploited by the parametric estimator. On the other hand, when β → 0, then the monitoring periods become longer, and in this case the impact of the right-censored observations will tend to be negligible because more and more complete observations will be observed. Indeed, this is manifested by observing that the limiting ARE expression in (ii) of Corollary 4 is the ARE of the parametric estimator $\hat{\overset{‒}{F}}$ (t) = exp(−t/T̄), where T̄ is the sample mean, relative to the nonparametric empirical survivor function estimator $\tilde{\overset{‒}{F}}$ (t) = $\sum_{i = 1}^{n}$ I{T_i > t}/n based on an i.i.d. sample T₁, T₂, ..., T_n from a unit exponential distribution. It is quite interesting to observe the similarity with the efficiency results in single-event settings. The upper bound for the ARE in (iii) is similar to that obtained by Cheng and Lin (1987) concerning the Kaplan-Meier estimator when compared to the estimator that exploits the informative structure of the KG model in the single-event setting.

Figure 1. Asymptotic relative efficiency of the nonparametric estimator $\tilde{\overset{‒}{F}}$ of the inter-event survivor function relative to the parametric estimator $\hat{\overset{‒}{F}}$ as a function of p ≡ p(t; β) = F̄( $Λ_{F}^{- 1}$ (t; θ); θ)^{β+1} in the HPP case.

More generally, it is of interest to know if the ARE expression in Theorem 2 is always bounded above by unity. A partial answer is provided by the following theorem. Below, for a matrix A, tr(A) is its trace.

Theorem 3 The ARE( $\tilde{\overset{‒}{F}} : \hat{\overset{‒}{F}}$ ) expression in Theorem 2 is bounded above by

t r {{(I + Γ {(θ, β)}^{- 1} Ξ (θ, β))}^{- 1}} = \sum_{j = 1}^{p} \frac{1}{1 + e_{j} (θ, β)},

where e_j(θ, β), j = 1, 2, ..., p, are the eigenvalues of Γ(θ, β)⁻¹Ξ(θ, β). In particular, if the parameter θ is one-dimensional, then ARE( $\tilde{\overset{‒}{F}} : \hat{\overset{‒}{F}}$ ) ≤ 1.

Proof: To establish the result, we first observe that if f : $ℜ \to ℜ^{p}$ and g : $ℜ \to ℜ$ with square-integrable components with respect to a measure μ, and if S is a positive definite p × p symmetric matrix, then

(\int f^{t} g d μ) S^{- 1} (\int f g d μ) \leq (\int g^{2} d μ) tr (S^{- 1} \int f^{\otimes 2} d μ) .

(56)

This inequality follows by first letting f̄ = S^−1/2f, and then noting that the left-hand side of (56) equals

(\int f^{t} g d μ) S^{- 1} (\int f g d μ) = {(\int \overset{‒}{f} g d μ)}^{t} (\int \overset{‒}{f} g d μ) = \sum_{j = 1}^{p} {(\int {\overset{‒}{f}}_{j} g d μ)}^{2} .

Applying the Cauchy-Schwarz Inequality to each of the p terms, we obtain

\sum_{j = 1}^{p} {(\int {\overset{‒}{f}}_{j} g d μ)}^{2} \leq \sum_{j = 1}^{p} (\int {\overset{‒}{f}}_{j}^{2} d μ) (\int g^{2} d μ) .

The inequality in (56) follows by then noting that

\sum_{j = 1}^{p} (\int {\overset{‒}{f}}_{j}^{2} d μ) = \int {\overset{‒}{f}}^{t} \overset{‒}{f} d μ = \int f^{t} S^{- 1} f d μ = \int tr (S^{- 1} f^{\otimes 2}) d μ = tr (S^{- 1} \int f^{\otimes 2} d μ) .

To utilize this result in the context of the ARE expression, define the following.

\begin{matrix} f (w; θ) & = φ_{F} (w; θ) \sqrt{y (\infty, w; θ, β)}; \\ g (w; θ) & = I {w \leq t} ∕ \sqrt{y (\infty, w; θ, β)}; \\ S (θ, β) & = Γ (θ, β) + Ξ (θ, β); \\ d μ (w; θ) & = λ_{F} (w; θ) d w . \end{matrix}

Then, a direct application of (56) to the ARE expression in Theorem 2 leads to the upper bound in the statement of the theorem. The case of p = 1 is an immediate consequence. ∥

Observe therefore that if the parameter of the inter-event distribution is one-dimensional, then the parametric estimator of the inter-event survivor function will never be less efficient, asymptotically, than the generalized product-limit estimator under this generalized KG model. However, if the parameter vector θ is more than one-dimensional, then it is possible that the ARE will not have an upper bound of unity.
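The behavior of the upper bound in Theorem 3 for a two-dimensional parameter can be illustrated with a small computation. The sketch below evaluates tr{(I + Γ⁻¹Ξ)⁻¹} for 2 × 2 matrices; the matrices used are hypothetical and are not derived from any particular inter-event distribution.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]


def matmul2(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)] for r in range(2)]


def trace_bound(gamma, xi):
    """Upper bound tr{(I + Gamma^{-1} Xi)^{-1}} of Theorem 3 for p = 2."""
    g = matmul2(inv2(gamma), xi)
    shifted = [[1 + g[0][0], g[0][1]], [g[1][0], 1 + g[1][1]]]
    inverse = inv2(shifted)
    return inverse[0][0] + inverse[1][1]


# Hypothetical matrices: eigenvalues of Gamma^{-1} Xi are 1 and 3, then 1/3 and 1.
print(trace_bound([[2.0, 0.0], [0.0, 1.0]], [[2.0, 0.0], [0.0, 3.0]]))
print(trace_bound([[2.0, 1.0], [1.0, 2.0]], [[1.0, 0.0], [0.0, 1.0]]))
```

The second call gives a bound above one, which shows that for p = 2 the bound by itself does not force the ARE below unity; when Ξ vanishes the bound equals p.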

5 Simulation Studies

Obtaining exact analytical efficiency expressions under inter-event distributions other than the exponential distribution is difficult since closed form expressions for the renewal function, which appears in the variance expression, are not generally available. To examine non-exponential inter-event distributions, we resorted to computer simulation studies. Specifically, we considered in the simulation a Weibull inter-event distribution with shape parameter θ₁ and scale parameter θ₂. The purpose of the simulation study was to compare the efficiency of the estimators $\hat{θ}, \tilde{θ}$ , and $\overset{ˇ}{θ}$ as well as to ascertain the efficiency of the GPLE F̃(t) relative to the parametric estimator F̂(t).

The simulation code was in the R language (Ihaka and Gentleman (1996)), and a Newton-Raphson procedure as described in the subsubsection dealing with Weibull inter-event distribution was implemented. A total of M = 5000 simulation replications were performed, where for each replication, a recurrent event data set following the generalized KG model was generated for combinations of values of n, θ₁, θ₂, and β. For each of the resulting recurrent event data sets, the estimates $\hat{θ}, \tilde{θ}$ , and $\overset{ˇ}{θ}$ , as well as $\hat{β}$ and $\overset{ˇ}{β}$ were obtained. The estimates F̂(t) and F̃(t) of F(t) were also obtained for t-values coinciding with the percentiles of the true Weibull distribution.
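The data-generation step can be sketched as follows. This is not the authors' R code; it is a Python illustration that assumes the Weibull cumulative hazard Λ_F(t; θ) = (θ₂t)^θ₁, which is the form suggested by (29). Under the generalized KG model, Λ_F(τ_i; θ) is then exponential with rate β, and Λ_F(T_ij; θ) is unit exponential. The function names and the seed are hypothetical.

```python
import random


def simulate_kg_weibull(n, theta1, theta2, beta, seed=0):
    """Generate (tau_i, observed gap times) for n units under the generalized KG model."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        # Lambda_F(tau) = (theta2 * tau) ** theta1 is exponential with rate beta.
        tau = rng.expovariate(beta) ** (1.0 / theta1) / theta2
        gaps, elapsed = [], 0.0
        while True:
            # Lambda_F(T) is unit exponential, so T is Weibull(theta1, theta2).
            gap = rng.expovariate(1.0) ** (1.0 / theta1) / theta2
            if elapsed + gap > tau:
                break  # this gap is right-censored by the end of monitoring
            gaps.append(gap)
            elapsed += gap
        data.append((tau, gaps))
    return data


def mean_events(data):
    """Average number of completed events per unit."""
    return sum(len(gaps) for _, gaps in data) / len(data)


print(mean_events(simulate_kg_weibull(n=5000, theta1=1.0, theta2=1.0, beta=0.5)))
```

With θ₁ = 1 the model is the exponential case, so the mean number of events per unit should be close to 1/β.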

As a measure of the efficiency of $\hat{θ}$ over $\tilde{θ}$ , we computed the estimate of its (generalized) mean-squared error (MSE) given by

\hat{M S E} (\hat{θ}, θ_{0}) = \frac{1}{M} \sum_{m = 1}^{M} [({\hat{θ}}_{m} - θ_{0}) {({\hat{θ}}_{m} - θ_{0})}^{t}]

(57)

where ${\hat{θ}}_{m} = {({\hat{θ}}_{1 m}, {\hat{θ}}_{2 m})}^{t}$ is the estimate obtained from the mth replication. The true θ value is denoted by θ₀ = (θ₁₀, θ₂₀)^t. The estimates of the MSE for $\tilde{θ}$ and $\overset{ˇ}{θ}$ are similarly defined. The estimate of the efficiency of $\hat{θ}$ over $\tilde{θ}$ is then defined via

Eff (\hat{θ} : \tilde{θ}) = \frac{det [\hat{M S E} (\tilde{θ}, θ_{0})]}{det [\hat{M S E} (\hat{θ}, θ_{0})]} .

(58)
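The efficiency measure in (57) and (58) amounts to a ratio of determinants of averaged outer-product matrices. A minimal sketch for a two-dimensional parameter is given below; the function names and the small set of replicate estimates are hypothetical.

```python
def mse_matrix(estimates, truth):
    """Average of (est - truth)(est - truth)^t over replications, as in (57)."""
    m = len(estimates)
    acc = [[0.0, 0.0], [0.0, 0.0]]
    for est in estimates:
        dev = (est[0] - truth[0], est[1] - truth[1])
        for r in range(2):
            for c in range(2):
                acc[r][c] += dev[r] * dev[c] / m
    return acc


def det2(a):
    """Determinant of a 2x2 matrix."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def efficiency(est_hat, est_tilde, truth):
    """Eff(hat : tilde) as in (58): det MSE(tilde) divided by det MSE(hat)."""
    return det2(mse_matrix(est_tilde, truth)) / det2(mse_matrix(est_hat, truth))


# Hypothetical replicate estimates around the true value (1, 1).
hat = [(1.1, 1.0), (1.0, 1.1), (0.9, 1.0), (1.0, 0.9)]
tilde = [(1.2, 1.0), (1.0, 1.2), (0.8, 1.0), (1.0, 0.8)]
print(efficiency(hat, tilde, truth=(1.0, 1.0)))
```

Values above one indicate that the estimator in the first argument has the smaller generalized mean-squared error.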

Eff( $\hat{θ} : \overset{ˇ}{θ}$ ) is analogously defined. In an analogous manner, the measure of efficiency of F̃ over F̂ at time point t is the ratio of the MSEs of F̃(t) and F̂(t) computed over the M replications. The simulation was performed for combinations of

n \in {20, 50, 100}, θ_{1} \in {0.9, 1.0, 1.2, 1.5, 2.0},

As is to be expected, the results were somewhat invariant with respect to a change in the scale parameter θ₂, so we only report here those associated with θ₂ = 1.0. To conserve space, we also only report the cases n ∈ {20, 50} since the conclusions are unchanged.

Table 1 presents the efficiencies of the θ-estimators for the different cases. We have also provided the column MeanEvs which represents the mean number of events observed for each of the subjects. Note that when β is decreased, then there are more events observed. It is clear from the results of this simulation that the $\hat{θ}$ estimator is always more efficient than the other two estimators, as is to be expected. It is interesting to observe that as β increases, the efficiency gain of $\hat{θ}$ over $\tilde{θ}$ increases, whereas the opposite holds for the comparison with the $\overset{ˇ}{θ}$ estimator. This could intuitively be explained by the fact that when there are fewer event occurrences, then the information coming from the τ_is, which are being used in the $\hat{θ}$ estimator but which are ignored by the $\tilde{θ}$ estimator, becomes more important. On the other hand, for the $\overset{ˇ}{θ}$ estimator, a decrease in β leads to more event occurrences but which are ignored by this estimator, hence the increase in efficiency of the $\hat{θ}$ estimator over the $\overset{ˇ}{θ}$ estimator. Interestingly, the efficiency behavior is very similar over changes in n for the $\hat{θ}$ and $\tilde{θ}$ comparison. One may wonder why the $\hat{θ}$ estimator is still more efficient than the $\overset{ˇ}{θ}$ estimator even when the average number of events per subject is less than unity. An explanation for this is that the $\hat{θ}$ estimator still utilizes the values of τ, which provide additional information, even if there was only one event observed on a subject. The estimator $\overset{ˇ}{θ}$ , on the other hand, only utilizes the minimum between the τ_i-value and the time-to-first event occurrence T_i1, together with the indicator of which was smaller.
We point out that the exact closed-form expression of the asymptotic covariance matrix for the estimator $\overset{ˇ}{θ}$ is obtainable for this Weibull case by exploiting the independence between the censoring indicator and min{T_i1, τ_i}. We compared these asymptotic variances with the simulated variances and there is indeed good agreement when the sample size is large (n ∈ {50, 100}).

Table 1.

Relative efficiency (in decimal format) of $\hat{θ}$ estimator relative to the $\tilde{θ}$ and $\overset{ˇ}{θ}$ estimators under the Weibull(θ₁,θ₂) inter-event distribution for different β-values, for n ∈ {20, 50}, and for θ₂ = 1. MeanEvs represents the mean number of events observed for each of the subjects. Results are based on 5000 simulation replications.

n	θ₁	θ₂	β	MeanEvs	$Eff (\hat{θ} : \tilde{θ})$	$Eff (\hat{θ} : \overset{ˇ}{θ})$
20	0.9	1	0.3	3.89	1.26	37.62
20	0.9	1	0.5	2.25	1.47	17.12
20	0.9	1	0.7	1.57	1.72	9.88
20	1.0	1	0.3	3.34	1.25	30.24
20	1.0	1	0.5	2.01	1.51	12.62
20	1.0	1	0.7	1.43	1.82	8.69
20	1.5	1	0.3	1.98	1.54	11.11
20	1.5	1	0.5	1.34	1.90	6.83
20	1.5	1	0.7	1.02	2.12	4.38
20	2.0	1	0.3	1.47	1.65	6.42
20	2.0	1	0.5	1.06	2.02	4.25
20	2.0	1	0.7	0.84	2.41	3.31
50	0.9	1	0.3	3.90	1.26	30.27
50	0.9	1	0.5	2.25	1.41	13.48
50	0.9	1	0.7	1.57	1.66	7.96
50	1.0	1	0.3	3.34	1.30	23.29
50	1.0	1	0.5	2.00	1.53	9.91
50	1.0	1	0.7	1.42	1.71	6.69
50	1.5	1	0.3	1.97	1.49	9.53
50	1.5	1	0.5	1.34	1.75	5.55
50	1.5	1	0.7	1.02	2.11	3.82
50	2.0	1	0.3	1.47	1.66	5.58
50	2.0	1	0.5	1.06	2.01	3.62
50	2.0	1	0.7	0.84	2.30	3.04
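The MeanEvs column can be reproduced approximately with a short simulation. The sketch below is not the authors' code. It assumes the parametrization 1 – F(t) = exp{–(θ₂t)^θ₁}, which this section does not restate, so that 1 – G(t) = exp{–β(θ₂t)^θ₁}. The function name `simulate_mean_events`, the number of simulated subjects, and the seed are illustrative choices.

```python
import random


def simulate_mean_events(theta1, theta2, beta, n_subjects, seed=0):
    """Average number of events per subject under the generalized KG model.

    Assumed parametrization (an assumption, not taken from this section):
    1 - F(t) = exp(-(theta2 * t) ** theta1) and 1 - G(t) = (1 - F(t)) ** beta.
    """
    rng = random.Random(seed)
    total_events = 0
    for _ in range(n_subjects):
        # Inverse transform for the monitoring time tau:
        # beta * (theta2 * tau) ** theta1 is a unit exponential variate.
        tau = (rng.expovariate(1.0) / beta) ** (1.0 / theta1) / theta2
        elapsed = 0.0
        while True:
            # Inverse transform for one Weibull inter-event time.
            elapsed += rng.expovariate(1.0) ** (1.0 / theta1) / theta2
            if elapsed > tau:
                break  # the next event falls outside [0, tau]
            total_events += 1
    return total_events / n_subjects


if __name__ == "__main__":
    for beta in (0.3, 0.5, 0.7):
        print(beta, round(simulate_mean_events(1.0, 1.0, beta, 20000), 2))
```

For θ₁ = θ₂ = 1 the inter-event times are unit exponential and τ_i is exponential with mean 1/β, so the expected number of events per subject is 1/β. This is in line with the tabulated values of about 3.34, 2.00, and 1.43 for β = 0.3, 0.5, and 0.7.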


Figure 2 provides plots of the efficiency of F̃(t) relative to F̂(t) for different θ₁ and n values. In each plot frame, we superimposed the curves associated with the different values of β so as to show the effect of changing β. We have plotted these graphs with the true value of F(t) on the abscissa as a way to standardize the graphs. From these plots, it is evident that the efficiency behavior is very similar across the different combinations of θ₁ and n, as well as for varying values of θ₂, which are not shown here. It is also clear that the relative efficiency of the GPLE with respect to the parametric estimator never exceeds 0.70. It would be of interest to know the exact upper bound of this relative efficiency, in analogy to the approximate 0.65 that was found for exponential inter-event times, but this appears to be a difficult problem because the renewal function of a Weibull distribution has no closed form. With regard to the impact of the monitoring parameter β, when this parameter is increased, the relative efficiency of the GPLE decreases for larger values of t. This could be attributed to the fact that there will be fewer observations whose inter-event times are in this region, so the GPLE suffers, in addition to the fact that it does not utilize information coming from the τ_is. The parametric estimator, by contrast, is able to use information from all the observations, including the τ_is, for estimating F(t) at larger values of t.

Figure 2.

Plots of efficiency of the GPLE F̃ relative to the parametric estimator F̂ under a Weibull(θ₁, θ₂) inter-event distribution for varying values of n and β. The curves are black and solid for β = 0.3, red and dashed for β = 0.5, and blue and dotted for β = 0.7. The value of θ₂ is 1.0. These curves are based on 5000 simulation replications.

6 Concluding Remarks

In this paper we have examined efficiency aspects of estimators of the inter-event distribution in a recurrent event setting. A generalized Koziol-Green model was introduced to provide an analytically tractable model of an informative monitoring period, which enables the analytical assessment of efficiencies of estimators. Of particular interest was the loss in efficiency when the informative monitoring structure is ignored in the estimation procedure, and the gain in efficiency when one utilizes the event recurrences instead of simply using the time-to-first, possibly right-censored, event occurrence. The generalized product-limit estimator of the inter-event distribution, which does not utilize the informative monitoring aspect, was also examined in terms of its loss in efficiency relative to the parametric estimator, the latter exploiting the informative structure. In a sequel article, a semiparametric estimator of the inter-event distribution based on recurrent event data, and which exploits the generalized KG structure, will be compared to the generalized product-limit estimator.

Acknowledgments

The authors wish to thank Dr. Alex McLain and Dr. Laura Taylor for helpful discussions, and also wish to thank a Referee and the Editor for their comments.

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

References

Aalen O. Nonparametric inference for a family of counting processes. Ann. Statist. 1978;6:701–726.
Aalen OO, Borgan Ø, Gjessing HK. Survival and event history analysis: A process point of view. Springer; New York: 2008. Statistics for Biology and Health.
Adekpedjou A. Ph.D. thesis. University of South Carolina; Columbia, South Carolina: 2007. Estimation with Recurrent Event Data Under an Informative Monitoring Period.
Allen W. A note on conditional probability of failure when hazards are proportional. Operations Research. 1963;11:658–659.
Andersen P, Borgan O, Gill R, Keiding N. Statistical Models Based on Counting Processes. Springer-Verlag; New York: 1993.
Borgan Ø. Maximum likelihood estimation in parametric counting process models, with applications to censored failure time data. Scand. J. Statist. 1984;11:1–16.
Breslow N, Crowley J. A large sample study of the life table and product limit estimates under random censorship. The Annals of Statistics. 1974;2:437–453.
Chen Y, Hollander M, Langberg N. Small-sample results for the Kaplan-Meier estimator. Journal of the American Statistical Association. 1982;77:141–144.
Cheng PE, Lin GD. Maximum likelihood estimation of a survival function under the Koziol-Green proportional hazards model. Statist. Probab. Lett. 1987;5:75–80.
Cox D. Regression models and life tables (with discussion). Journal of the Royal Statistical Society. 1972;34:187–220.
Csörgő S, Faraway JJ. The paradoxical nature of the proportional hazards model of random censorship. Statistics. 1998;31:67–78.
Efron B. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability. Prentice-Hall; New York: 1967. The two-sample problem with censored data; pp. 831–853.
Fleming T, Harrington D. Counting Processes and Survival Analysis. Wiley; New York: 1991.
Gill RD. Nonparametric estimation based on censored observations of a Markov renewal process. Z. Wahrsch. Verw. Gebiete. 1980;53:97–116.
Gill RD. Testing with replacement and the product limit estimator. Ann. Statist. 1981;9:853–860.
Hollander M, Peña E. Families of confidence bands for the survival function under the general random censorship model and the Koziol-Green model. Canadian Journal of Statistics. 1989;17:59–74.
Ihaka R, Gentleman R. R: a language for data analysis and graphics. Journal of Computational and Graphical Statistics. 1996;5:299–314.
Jacod J. Multivariate point processes: predictable projection, Radon-Nikodým derivatives, representation of martingales. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete. 1974/75;31:235–253.
Kalbfleisch JD, Prentice RL. The statistical analysis of failure time data. 2nd ed. Wiley-Interscience [John Wiley & Sons]; Hoboken, NJ: 2002. Wiley Series in Probability and Statistics.
Kaplan EL, Meier P. Nonparametric estimation from incomplete observations. J. Amer. Statist. Assoc. 1958;53:457–481.
Koziol J, Green S. A Cramer-von Mises statistic for randomly censored data. Biometrika. 1976;63:139–156.
Peña E, Hollander M. Models for Recurrent Events in Reliability and Survival Analysis. In: Soyer R, Mazzuchi T, Singpurwalla N, editors. Mathematical Reliability: An Expository Perspective. Kluwer Academic Publishers; 2004. pp. 105–123. chap. 6.
Peña E, Strawderman R, Hollander M. Nonparametric estimation with recurrent event data. J. Amer. Statist. Assoc. 2001;96:1299–1315.
Sellke T. Statistical decision theory and related topics, IV, Vol. 2 (West Lafayette, Ind., 1986). Springer; New York: 1988. Weak convergence of the Aalen estimator for a censored renewal process; pp. 183–194.
Siannis F. Applications of a parametric model for informative censoring. Biometrics. 2004;60:704–714. doi: 10.1111/j.0006-341X.2004.00220.x.
Vardi Y. Nonparametric estimation in renewal processes. Ann. Statist. 1982a;10:772–785.
Vardi Y. Nonparametric estimation in the presence of length bias. Ann. Statist. 1982b;10:616–620.
Wang M-C, Chang S-H. Nonparametric estimation of a recurrent survival function. J. Amer. Statist. Assoc. 1999;94:146–153. doi: 10.1080/01621459.1999.10473831.
Wang M-C, Qin J, Chiang C-T. Analyzing recurrent event data with informative censoring. J. Amer. Statist. Assoc. 2001;96:1057–1065. doi: 10.1198/016214501753209031.
William AL. A Model for Informative Censoring. J. Amer. Statist. Assoc. 1989;84:749–752.

