. Author manuscript; available in PMC: 2017 Sep 2.
Published in final edited form as: J Nonparametr Stat. 2016 Sep 2;28(4):716–735. doi: 10.1080/10485252.2016.1225733

Asymptotics for a Class of Dynamic Recurrent Event Models

Edsel A Peña 1,*
PMCID: PMC5533306  NIHMSID: NIHMS832199  PMID: 28761316

Abstract

Asymptotic properties, both consistency and weak convergence, of estimators arising in a general class of dynamic recurrent event models are presented. The class of models takes into account the impact of interventions after each event occurrence, the impact of accumulating event occurrences, the induced informative and dependent right-censoring mechanism due to the data-accrual scheme, and the effect of covariate processes on the recurrent event occurrences. The class subsumes as special cases many of the recurrent event models that have been considered in biostatistics, reliability, and the social sciences. The asymptotic properties presented have the potential of being useful in developing goodness-of-fit and model validation procedures, confidence interval and confidence band constructions, and hypothesis testing procedures for the finite- and infinite-dimensional parameters of a general class of dynamic recurrent event models, albeit for models without frailties.

Keywords and Phrases: consistency, compensators, counting processes, full models, marginal models, martingales, repair models, sum-quota accrual, weak convergence

1 Introduction and Background

Recurrent events pervade many disciplines, such as the biomedical and public health sciences, the engineering sciences, the social and political sciences, the economic sciences, and even sports. Examples of such events are non-fatal heart attacks, hospitalization of a patient with a chronic disease, migraines, breakdown of an electronic or mechanical system, discovery of a bug in a software program, disagreement in a marriage, change of a job, the Dow Jones Industrial Average (DJIA) decreasing by at least 200 points during a trading day, a perfect baseball game in the Major Leagues, a goal scored in a World Cup soccer game, and many others. The mathematical modeling of recurrent events, together with the development of statistical inference procedures for the models, is of paramount importance.

There are two approaches to the specification of mathematical models for recurrent events. The first is a full specification of the probability measure on the measurable space induced by the monitoring of the recurrent event. This is done by specifying the joint distributions of the calendar times of event occurrences, or equivalently the joint distributions of the inter-event times. Alternatively, the probability measure can be specified as a measure on the space of paths of the stochastic process arising from the monitoring of the recurrent event. The simplest and perhaps most common full parametric model is when the counting process associated with the event accrual is assumed to follow a homogeneous Poisson process (HPP), in which case the inter-event times are independent and identically distributed (IID) with common negative exponential distribution. One may also specify a nonparametric model by simply assuming that the inter-event time distribution is some unknown continuous distribution, resulting in the IID renewal model. The general dynamic model of interest in this article is of the full model variety.
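As a small illustration (not part of the paper's development), the two simplest full models mentioned above can be simulated by drawing inter-event times and accumulating them into calendar times; the function name and the rate chosen below are ours, purely for the sketch. With exponential gaps this is a homogeneous Poisson process; any other inter-event sampler gives a general IID renewal model.

```python
import random

def simulate_renewal(s_star, sample_gap):
    """Calendar times of event occurrences on [0, s_star] for an IID
    renewal model; `sample_gap` draws one inter-event time.  With
    exponential gaps this is a homogeneous Poisson process (HPP)."""
    times, s = [], 0.0
    while True:
        s += sample_gap()
        if s > s_star:
            break
        times.append(s)
    return times

rng = random.Random(42)
# HPP with rate 2: IID exponential(rate=2) inter-event times,
# so E[N(100)] = 200 for the monitoring window [0, 100].
events = simulate_renewal(100.0, lambda: rng.expovariate(2.0))
```

Replacing the exponential sampler by, say, `lambda: rng.weibullvariate(1.0, 1.5)` gives a non-Poisson renewal model with the same code path.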

The second modeling approach is referred to as marginal modeling. In its basic form, the event position within a unit is utilized as a stratifying variable, and a (marginal) probability measure is specified for each of the resulting strata. This approach was pioneered in the papers [16, 19]. It should be observed that the class of full models subsumes the class of marginal models. However, proponents of the marginal modeling approach espouse it since it generally leads to an easier interpretation of model parameters, though, at the same time, it may be difficult to justify a full model which is consistent with the specified marginal models. In fact, there could be several full models that are consistent with the marginal models.

An IID distributional specification for the inter-event times is clearly an oversimplification, since it will often be the case that after an event occurrence some type of intervention, such as a corrective measure or a repair, will be performed, thereby altering the distribution of the time to the next event occurrence. Furthermore, time-dependent concomitant variables could also impact the distributions of the inter-event times, and within a unit the inter-event times may be correlated owing to unobserved latent variables. The number of event occurrences could also impact these distributions, such as when event occurrences weaken the unit, thereby stochastically shortening the time to the next event occurrence. Due to practical and unavoidable constraints, the monitoring of the event can also only be performed over a finite, possibly random, observation window; thus a sum-quota accrual scheme ensues, wherein the number of observed event occurrences is a random variable which is informative about the event occurrence mechanism. This finite monitoring constraint also produces a right-censored observation, which cannot be ignored in performing inference because of selection bias issues. The class of dynamic recurrent event models proposed in [12] incorporates the above considerations. This class of models is a specific member of the class of models of interest in this article. The major goal of this article is to obtain the asymptotic properties of semiparametric estimators of the model parameters for the general class of dynamic recurrent event models of the type in [12]. Note that algorithmic issues for the semiparametric estimators of the model in [12] were dealt with in [14].

This article focuses on the large-sample properties of semiparametric estimators for the parameters of the class of dynamic models described in section 2. These semiparametric estimators are described in section 3. Consistency properties of the estimators will be established in section 5, while weak convergence properties will be developed in section 6.

2 Class of Dynamic Models

In this section we describe the general class of dynamic models of interest. In the sequel, (Ω, ℱ, P) is the basic probability space on which all random entities are defined. Consider a study in which units are to be monitored over the calendar time interval [0, s*], where s* ∈ (0, ∞) is a fixed calendar time. For a unit in this study we suppose that there is a 1 × p vector of possibly time-varying bounded covariates X = {X(s) : s ∈ [0, s*]}. We shall denote by N = {N(s) : s ∈ [0, s*]} the counting process such that N(s) is the number of observed event occurrences over the period [0, s]. There is also a random time τ such that the unit is only under observation over the period [0, τ], so its at-risk process is Y = {Y(s) = I{τ ≥ s} : s ∈ [0, s*]}, where I{·} is the indicator function. With the definitions of N and Y, note that when τ < s*, no more events are observed over (τ, s*] for the unit, so that N(s) − N(τ) = 0 for s ∈ (τ, s*], and this is reflected in the probabilistic statement in equation (1) below. We shall denote by 𝔉 = {ℱs : s ∈ [0, s*]} a filtration on (Ω, ℱ, P) such that N, Y, and X are 𝔉-adapted and, in addition, Y and X are also 𝔉-predictable.

The class of dynamic models of interest postulates that for k ∈ {1, 2, …} and with dN(s) ≡ N((s + ds)−) − N(s−), as ds ↓ 0 and for s ∈ [0, s*),

\[
P\{dN(s) = k \mid \mathcal{F}_{s-}\} = Y(s)\bigl[\lambda(s \mid X(s))\, I\{k = 1\} + o_p(1)\, I\{k \ne 1\}\bigr]\, ds, \quad \text{a.e.-}[P], \tag{1}
\]

where

\[
\lambda(s \mid X(s)) = \lambda_0[\mathcal{E}(s)]\, \rho[s, N(s-); \alpha]\, \psi[X(s)\beta]. \tag{2}
\]

In (2), 𝔈 = {ℰ(s) : s ∈ [0, s*]} is an 𝔉-predictable process with paths that are piecewise left-continuous, nonnegative, with ℰ(s) ≤ s, and piecewise differentiable with derivative satisfying ℰ′(s) ≥ 0; λ0(·) is an unknown baseline hazard rate function with cumulative hazard function Λ0(·) = ∫0· λ0(s) ds; ρ(·, ·; α) is a known nonnegative bounded function over ℜ+ × ℕ, where ℕ = {0, 1, 2, …}, with ρ(s, 0; α) = 1 and α ∈ ℜq an unknown q-dimensional parameter; and ψ(·) is a known nonnegative link function on ℜ, with β ∈ ℜp an unknown p-dimensional regression parameter. The process 𝔈 is called the effective age process. We shall assume that τ ~ G(·), where G(·) is some distribution function which does not involve (λ0(·), α, β); hence G is considered a nuisance parameter. The regressor X is a vector-valued bounded and predictable process whose probabilistic structure may also contain some unknown nuisance parameters. A technical condition that we will assume (see the paper [11]) is that the counting process N is non-explosive over [0, s*], that is, P{N(s*) < ∞} = 1. This condition necessarily imposes a constraint on the form of the function ρ(·, ·; ·) and the model parameters.

The model parameter of main interest is

\[
\theta = (\Lambda_0(\cdot), \alpha, \beta) \in \Theta \subseteq \mathcal{C} \times \Re^q \times \Re^p, \tag{3}
\]

where 𝒞 is some class of cumulative hazard functions on ℜ+, which will typically be a nonparametric class. Thus, θ will be a semiparametric parameter. Defining the process M = {M(s; θ) : s ∈ [0, s*]} with M(s; θ) = N(s) − A(s; θ) and where

\[
A(s; \theta) = \int_0^s Y(v)\, \lambda_0[\mathcal{E}(v)]\, \rho[v, N(v-); \alpha]\, \psi[X(v)\beta]\, dv, \tag{4}
\]

the model is tantamount to the condition that M is a zero-mean square-integrable 𝔉-martingale. The model specified in (1) and (2) is a slightly more general version of those in [12] and [14], since we allow the ρ-function to depend directly on s aside from N(s−). For more background about this class of models, and for many specific models subsumed by it, see [12, 14]. This general class includes as special cases models that have been considered in the biostatistics and reliability settings. To mention two specific models: if ℰ(s) = s − S_{N(s−)}, with 0 = S0 < S1 < S2 < … being the times of successive event occurrences, so that ℰ(·) is the backward recurrence time function, the model corresponds to resetting the age of the unit to zero after each event occurrence, which is referred to in the reliability literature as a perfect repair; while if ℰ(s) = s, we say that a minimal repair is performed after each event occurrence. If the latter specification is further coupled with ρ(v, k; α) = 1, then we recover the Andersen-Gill multiplicative intensity model [4], and also the Cox proportional hazards (PH) model [7] when ψ(v) = exp(v).
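To make the two repair regimes concrete, the following sketch (illustrative only; the helper names are ours) evaluates the effective age ℰ(s) at a calendar time s given the event times, under perfect repair (backward recurrence time) and under minimal repair:

```python
import bisect

def effective_age_perfect(s, event_times):
    """Backward recurrence time: time since the last event strictly
    before s (perfect repair resets the effective age to zero at each
    event occurrence)."""
    j = bisect.bisect_left(event_times, s)   # events strictly before s
    last = event_times[j - 1] if j > 0 else 0.0
    return s - last

def effective_age_minimal(s, event_times):
    """Minimal repair: the effective age equals calendar time."""
    return s

S = [1.0, 2.5, 4.0]                       # hypothetical event times
assert effective_age_perfect(3.0, S) == 0.5
assert effective_age_perfect(2.5, S) == 1.5   # left-continuous: age just before the repair at 2.5
assert effective_age_minimal(3.0, S) == 3.0
```

The left-continuity at event times mirrors the piecewise left-continuous paths required of ℰ in the model.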

We suppose there are n units in the study, so we will observe n independent and identically distributed (IID) copies 𝔇n ≡ 𝔇 = (D1, D2, …, Dn) of the basic observable D = (N, Y, τ, X). We denote by 𝒟 the sample space of D, so that the sample space for 𝔇 is 𝒟n. A larger filtration on (Ω, ℱ, P) is formed from the n unit filtrations according to

\[
\mathfrak{F} = \bigvee_{i=1}^n \mathfrak{F}_i = \sigma\Bigl(\bigcup_{i=1}^n \mathfrak{F}_i\Bigr).
\]

Inference on the model parameter θ = (Λ0(·), α, β), or on relevant functionals of θ, is to be based on the realization of 𝔇n. Properties of the inferential procedures are to be examined as n → ∞.

We shall use functional notation in the sequel. Thus, for a possibly vector-valued function g defined on 𝒟, Pg will represent the theoretical expectation of g(D), while ℙg ≡ ℙng will represent the empirical expectation of g given 𝔇n. That is, Pg = ∫ g(d) P(dd) and ℙg = (1/n) Σi=1n g(Di). The theoretical and empirical covariances of g are defined, respectively, via Vg = P(g − Pg)⊗2 and 𝕍g = ℙ(g − ℙg)⊗2, where, for a column vector a, we write a⊗0 = 1, a⊗1 = a, and a⊗2 = aaT.
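This functional notation can be mirrored numerically; the following is a hedged sketch (helper names are ours) of the empirical operators ℙg and 𝕍g for a vector-valued g, with a⊗2 = aaT:

```python
import numpy as np

def P_emp(g_values):
    """Empirical expectation  ℙg = (1/n) Σᵢ g(Dᵢ)  over rows of g_values."""
    return np.mean(np.asarray(g_values, dtype=float), axis=0)

def V_emp(g_values):
    """Empirical covariance  𝕍g = ℙ(g − ℙg)⊗2,  where a⊗2 = a aᵀ.
    Note the divisor is n (not n − 1), matching the definition in the text."""
    g = np.asarray(g_values, dtype=float)
    centered = g - g.mean(axis=0)
    return centered.T @ centered / len(g)

g = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])   # g(Dᵢ) for n = 3 units
assert np.allclose(P_emp(g), [3.0, 4.0])
assert np.allclose(V_emp(g), np.cov(g.T, bias=True))
```

The `bias=True` comparison just confirms that the n-divisor convention is being used.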

3 Semiparametric Estimators

3.1 Doubly-Indexed Processes

The intensity model in (4) has the distinctive feature that the baseline hazard rate λ0(·) is evaluated at time s at the effective age ℰ(s). Since interest centers on inference about λ0(·) or Λ0(·), we need to de-couple λ0(·) from ℰ(·). As demonstrated in [17, 13, 14], such de-coupling is facilitated through the use of doubly-indexed processes.

Let t* ∈ (0, ∞) be fixed, and define 𝒮 = [0, s*] and 𝒯 = [0, t*]. Form ℐ = 𝒮 × 𝒯. For our purpose we define the following ℐ-indexed processes associated with the (N, Y, τ, X) processes for one unit: Z = {Z(s, t) : (s, t) ∈ ℐ}, N = {N(s, t) : (s, t) ∈ ℐ}, A = {A(s, t; θ) : (s, t) ∈ ℐ}, and M = {M(s, t; θ) : (s, t) ∈ ℐ}, where

\[
\begin{aligned}
Z(s, t) &= I\{\mathcal{E}(s) \le t\}; \\
N(s, t) &= \int_0^s Z(v, t)\, N(dv); \\
A(s, t; \theta) &= \int_0^s Z(v, t)\, A(dv; \theta); \\
M(s, t; \theta) &= N(s, t) - A(s, t; \theta) = \int_0^s Z(v, t)\, M(dv; \theta).
\end{aligned}
\]

As an interpretation, note that N(s, t) is the number of occurrences of the recurrent event over the period [0, s] for which the effective ages at these occurrences are at most t. We introduce the following notation: for a finite subset T ⊂ 𝒯, N(·, T) ≡ (N(·, t) : t ∈ T), and similarly for the other processes.
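As an illustrative computation (names hypothetical), N(s, t) counts events jointly by calendar time and by effective age at occurrence; under perfect repair the effective age at an event time is the preceding inter-event gap:

```python
def N_double(s, t, event_times, eff_age_at):
    """N(s, t): number of events occurring in [0, s] whose effective age
    at occurrence is at most t; `eff_age_at(v)` returns ℰ(v) just before v."""
    return sum(1 for v in event_times if v <= s and eff_age_at(v) <= t)

S = [1.0, 2.5, 4.0]   # hypothetical event times

def age(v):
    # perfect-repair effective age at calendar time v: gap since previous event
    prev = max((u for u in S if u < v), default=0.0)
    return v - prev

assert N_double(5.0, 10.0, S, age) == 3   # all three events have age <= 10
assert N_double(5.0, 1.2, S, age) == 1    # ages at events are 1.0, 1.5, 1.5
```

Fixing s = s* and letting t vary traces out the marginal counting process in the effective-age scale, which is exactly the object the estimators below work with.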

Proposition 1

Let T ⊂ 𝒯 be a finite set. Then {M(s,T; θ) : s ∈ 𝒮} is a |T|-dimensional zero-mean square-integrable martingale with predictable quadratic covariation process

\[
\langle M(\cdot, T; \theta) \rangle(s) = \bigl[\bigl(A(s, \min(t_1, t_2); \theta)\bigr)_{t_1, t_2 \in T}\bigr], \quad s \in \mathcal{S}.
\]

Consequently, PN(s, T) = PA(s, T; θ) and VM(s, T; θ) = P〈M(·, T; θ)〉(s).

Proof

Follows from the boundedness and predictability of s ↦ Z(s, T), the fact that Z(s, t1)Z(s, t2) = Z(s, min(t1, t2)), stochastic integration theory, and the representation M(s, T; θ) = ∫0s Z(v, T) M(dv; θ).

Let s ∈ 𝒮 and denote by

\[
0 \equiv S_0 < S_1 < S_2 < \cdots < S_{N(s-)} < S_{N(s-)+1} \equiv \min(s, \tau)
\]

the N(s−) successive event occurrence times for the unit. Define the (random) functions ℰj: 𝒮 → ℜ via

\[
\mathcal{E}_j(v) = \mathcal{E}(v)\, I_{(S_{j-1}, S_j]}(v)
\]

for j = 1, 2, …, N(s−) + 1. By assumption, ℰj(·) is nondecreasing and differentiable on (Sj−1, Sj). We denote by ℰj−1(·) its inverse function and by ℰj′(·) its derivative. Define the (random) functions φj : 𝒮 → ℜ according to

\[
\varphi_j(v; \alpha, \beta) = \rho(v, j-1; \alpha)\, \psi[X(v)\beta]\, \mathcal{E}_j'(v)\, I_{(S_{j-1}, S_j]}(v),
\]

for j = 1, 2, …,N(s−) + 1. Next, we define the doubly-indexed process Y = {Y (s, t; α, β) : (s, t) ∈ ℐ} according to

\[
Y(s, t; \alpha, \beta) = \sum_{j=1}^{N(s-)+1} \varphi_j[\mathcal{E}_j^{-1}(t); \alpha, \beta]\, I_{(\mathcal{E}_j(S_{j-1}), \mathcal{E}_j(S_j)]}(t). \tag{5}
\]

This is a generalized at-risk process. The importance of these doubly-indexed processes arises from the representation of the A-process in Proposition 2, which de-couples the effective age process ℰ(·) from the baseline hazard function Λ0(·), and from the change-of-variable identity in Proposition 3. Restricted forms of these results were used in the IID recurrent event model considered in [15, 13].

Proposition 2

For (s, t) ∈ ℐ,
\[
A(s, t; \theta) = \int_0^t Y(s, w; \alpha, \beta)\, \Lambda_0(dw).
\]

Proof

Partition the region of integration (0, s] into the disjoint union (0, s] = ∪j=1N(s−)+1 (Sj−1, Sj]; perform a change of variable on each subinterval; and then simplify.

Proposition 3

Let {H(s, t) : (s, t) ∈ ℐ} be a bounded vector-valued process such that, for each t, s ↦ H(s, t) is predictable. For (s, t) ∈ ℐ, we have

\[
\int_0^s H(s, \mathcal{E}(v))\, M(dv, t; \theta) = \int_0^t H(s, w)\, M(s, dw; \theta).
\]
Proof

Start with the left-hand side; decompose the M process into its N and A components; then perform the same manipulations as in the proof of Proposition 2.

3.2 Estimation of Λ0

Propositions 1 and 2 now combine to suggest the stochastic differential equation, for an observable D,

\[
N(s, dt) = Y(s, t; \alpha, \beta)\, \Lambda_0(dt) + M(s, dt; \theta).
\]

When data 𝔇n are available from n units, we therefore obtain the empirical differential form

\[
\mathbb{P} N(s, dt) = \{\mathbb{P} Y(s, t; \alpha, \beta)\}\, \Lambda_0(dt) + \mathbb{P} M(s, dt; \theta). \tag{6}
\]

Define

\[
S^{(0)}(s, t; \alpha, \beta) = \mathbb{P} Y(s, t; \alpha, \beta) \equiv \frac{1}{n} \sum_{i=1}^n Y_i(s, t; \alpha, \beta) \tag{7}
\]

and J(s, t; α, β) = I{S(0)(s, t; α, β) > 0}. With the convention that 0/0 = 0, we obtain from (6) the stochastic integral identity

\[
\int_0^t \frac{J(s, w; \alpha, \beta)}{S^{(0)}(s, w; \alpha, \beta)}\, \mathbb{P} N(s, dw) = \int_0^t J(s, w; \alpha, \beta)\, \Lambda_0(dw) + \int_0^t \frac{J(s, w; \alpha, \beta)}{S^{(0)}(s, w; \alpha, \beta)}\, \mathbb{P} M(s, dw; \theta). \tag{8}
\]

Let us consider the last term in (8). We have

\[
\int_0^t \frac{J(s, w; \alpha, \beta)}{S^{(0)}(s, w; \alpha, \beta)}\, \mathbb{P} M(s, dw; \theta) = \frac{1}{n} \sum_{i=1}^n \int_0^t \frac{J(s, w; \alpha, \beta)}{S^{(0)}(s, w; \alpha, \beta)}\, M_i(s, dw; \theta) = \frac{1}{n} \sum_{i=1}^n \int_0^s \frac{J(s, \mathcal{E}_i(v); \alpha, \beta)}{S^{(0)}(s, \mathcal{E}_i(v); \alpha, \beta)}\, M_i(dv, t; \theta), \tag{9}
\]

where the last equality is obtained by invoking Proposition 3. The integrand in each summand in (9) is bounded and predictable, so it follows from stochastic integration theory that, for i = 1, 2, …, n,

\[
P \int_0^s \frac{J(s, \mathcal{E}_i(v); \alpha, \beta)}{S^{(0)}(s, \mathcal{E}_i(v); \alpha, \beta)}\, M_i(dv, t; \theta) = 0. \tag{10}
\]

It therefore follows from (8) and (10) that

\[
P \int_0^t \frac{J(s, w; \alpha, \beta)}{S^{(0)}(s, w; \alpha, \beta)}\, \mathbb{P} N(s, dw) = P \int_0^t J(s, w; \alpha, \beta)\, \Lambda_0(dw).
\]

Analogously to Aalen’s idea [1], if for the moment we assume that (α, β) is known, we may propose a method-of-moments estimator for Λ0(·) given by

\[
\tilde{\Lambda}_0(s, t; \alpha, \beta) = \int_0^t \frac{J(s, w; \alpha, \beta)}{S^{(0)}(s, w; \alpha, \beta)}\, \mathbb{P} N(s, dw) = \int_0^t \frac{\mathbb{P} N(s, dw)}{S^{(0)}(s, w; \alpha, \beta)}. \tag{11}
\]

However, (α, β) is not known; hence Λ̃0 is not yet an estimator. We now therefore find an estimator of (α, β), which will then be plugged into (11) to obtain a legitimate estimator of Λ0.
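For intuition only: in the very special case of perfect repair with ρ ≡ 1 and ψ ≡ 1, the generalized at-risk average S(0)(s*, t) is proportional to the number of inter-event gap times (complete or right-censored) that are at least t, and (11) reduces to a Nelson-Aalen-type estimator computed on the gaps. A sketch under those assumptions only (our own naming; not the general estimator, which requires the full effective-age machinery):

```python
import numpy as np

def abn_gaps(complete_gaps, censored_gaps=()):
    """Nelson-Aalen-type reduction of (11) for perfect repair with
    rho = psi = 1: jump 1/#{gaps >= t} at each complete gap time t."""
    tj = np.sort(np.asarray(complete_gaps, dtype=float))
    allg = np.concatenate([tj, np.asarray(censored_gaps, dtype=float)])
    jumps = np.array([1.0 / np.sum(allg >= t) for t in tj])
    return tj, np.cumsum(jumps)

t, Lam = abn_gaps([1.0, 2.0, 3.0], censored_gaps=[2.5])
# risk sets of sizes 4, 3, 1 give jumps 1/4, 1/3, 1
```

The censored gap enters only through the risk sets, mirroring how the sum-quota scheme's censored last gap contributes to S(0) but not to ℙN.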

3.3 Estimator of (α, β)

For the purpose of estimating (α, β), we form a generalized likelihood process, based on 𝔇n, denoted by L = {L(s, t; θ) : (s, t) ∈ ℐ}. We define

\[
L(s, t; \theta) = \prod_{i=1}^n \prod_{v=0}^s \bigl[A_i(dv, t; \theta)\bigr]^{N_i(\Delta v, t)}\, \bigl[1 - A_i(dv, t; \theta)\bigr]^{1 - N_i(\Delta v, t)},
\]

with the understanding that when the product operation is over a continuous index, such as v in the second product, it is interpreted as a product-integral; see [10]. By properties of the product-integral, and re-writing in expanded form, we have that

\[
L(s, t; \theta) = \Bigl\{\prod_{i=1}^n \prod_{v=0}^s \bigl[Z_i(v, t)\, A_i(dv; \theta)\bigr]^{Z_i(v, t)\, N_i(\Delta v)}\Bigr\} \exp\bigl\{-n\, \mathbb{P} A(s, t; \theta)\bigr\}.
\]

This likelihood process involves the functional parameter Λ0(·), for which we have an estimator given in (11) if (α, β) is known. We can therefore obtain a profile likelihood for (α, β) by replacing the Λ0(·) in L(s*, t*; θ) by the Λ̃0(s*, ·; α, β) in (11). Doing so yields a profile likelihood function given by

\[
L_P(s, t; \alpha, \beta) = \prod_{i=1}^n \prod_{v=0}^s \left[\frac{\rho(v, N_i(v-); \alpha)\, \psi[X_i(v)\beta]}{S^{(0)}(s, \mathcal{E}_i(v); \alpha, \beta)}\right]^{N_i(\Delta v, t)}. \tag{12}
\]

This function may also be viewed as a generalized partial likelihood function for (α, β), very much reminiscent of the Cox partial likelihood function; see [7, 8, 4, 9, 3, 2]. From this partial likelihood function we obtain its maximizer as our estimator of (α, β); that is,

\[
(\hat{\alpha}, \hat{\beta}) \equiv (\hat{\alpha}(s, t), \hat{\beta}(s, t)) = \arg\max_{(\alpha, \beta) \in \Re^q \times \Re^p} L_P(s, t; \alpha, \beta). \tag{13}
\]

Numerical methods, such as the Newton-Raphson algorithm, are needed to obtain the values of (α̂, β̂), as has been done in [14].
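A generic Newton-Raphson iteration of the kind used to obtain (α̂, β̂) can be sketched as follows; the `score` and `hessian` callables stand in for the first and second derivatives of the log partial likelihood (the toy check below uses a concave quadratic with a known maximizer, not the actual likelihood):

```python
import numpy as np

def newton_raphson(score, hessian, eta0, tol=1e-10, max_iter=50):
    """Newton-Raphson root-finder for a score equation score(eta) = 0,
    where `hessian(eta)` is the derivative matrix of the score."""
    eta = np.asarray(eta0, dtype=float)
    for _ in range(max_iter):
        step = np.linalg.solve(hessian(eta), score(eta))
        eta = eta - step
        if np.linalg.norm(step) < tol:
            break
    return eta

# Toy check: concave quadratic "log-likelihood" with maximizer (1, 2).
A = np.array([[2.0, 0.5], [0.5, 1.0]])
target = np.array([1.0, 2.0])
score = lambda e: -A @ (e - target)      # gradient of -(e-target)'A(e-target)/2
hess = lambda e: -A                      # its (constant) Hessian
assert np.allclose(newton_raphson(score, hess, [0.0, 0.0]), target)
```

For a quadratic objective the iteration converges in a single step; for the actual partial likelihood several iterations and a reasonable starting value are needed.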

Having obtained an estimator of (α, β), we now replace (α, β) in Λ̃0(s*, t; α, β) by (α̂, β̂) to obtain an estimator of Λ0(·). The resulting estimator is

\[
\hat{\Lambda}_0(s^*, t) = \tilde{\Lambda}_0(s^*, t; \hat{\alpha}, \hat{\beta}) = \int_0^t \frac{\mathbb{P} N(s^*, dw)}{S^{(0)}(s^*, w; \hat{\alpha}, \hat{\beta})}, \quad t \in \mathcal{T}. \tag{14}
\]

Observe that the form of this estimator is analogous to the estimator of the baseline hazard function in the Cox PH model [7, 6, 4], hence it seems appropriate to refer to this as a generalized Aalen-Breslow-Nelson (ABN) estimator.

Denoting by F0 the distribution function associated with the baseline hazard function Λ0, the product-integral representation of F0 in terms of Λ0 yields a product-limit type estimator of the survivor function F̄0(t) = 1 − F0(t), given by

\[
\hat{\bar{F}}_0(s^*, t) = \prod_{w=0}^t \Bigl[1 - \frac{\mathbb{P} N(s^*, dw)}{S^{(0)}(s^*, w; \hat{\alpha}, \hat{\beta})}\Bigr], \quad t \in \mathcal{T}. \tag{15}
\]
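For an estimated hazard with finitely many jumps, the product-integral in (15) reduces to a finite product of factors 1 − ΔΛ̂0. A minimal sketch (hypothetical helper name):

```python
import numpy as np

def product_limit(jump_times, hazard_increments):
    """Survivor-function estimate  ∏_{w ≤ t} [1 − ΔΛ̂0(w)]  evaluated at
    each jump time, given the jump sizes of the estimated hazard."""
    surv = np.cumprod(1.0 - np.asarray(hazard_increments, dtype=float))
    return np.asarray(jump_times, dtype=float), surv

t, Fbar = product_limit([1.0, 2.0, 3.0], [0.25, 1/3, 0.5])
assert np.allclose(Fbar, [0.75, 0.5, 0.25])
```

Because each factor lies in [0, 1] when the increments do, the resulting estimate is automatically nonincreasing, as a survivor function must be.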

Small to moderate sample size properties of the estimators presented above were examined through simulation studies in [14] for specific forms of the effective age process ℰ, for a function ρ which was made to depend on s only through N(s−), and for an exponential link function ψ. Applications of these estimators to some real data sets were also presented in that paper. However, general asymptotic properties of these estimators are still unavailable, and establishing the large-sample properties of these semiparametric estimators is the raison d’être of the current paper.

4 Preliminaries for Asymptotics

For studying the large-sample properties of our semiparametric estimators, it is convenient to first deal with the model where A in (4) is of the form

\[
A(s; \eta) = \int_0^s Y(v)\, \lambda_0[\mathcal{E}(v)]\, \kappa(v; \eta)\, dv. \tag{16}
\]

Here κ = {κ(s; η) : s ∈ 𝒮} is a bounded and predictable process, and η ∈ Γ with Γ an open subset of ℜk. We assume that η ↦ κ(s; η) is twice-differentiable, and we let

\[
\dot{\kappa}(s; \eta) = \frac{\partial}{\partial \eta}\, \kappa(s; \eta) \quad \text{and} \quad \ddot{\kappa}(s; \eta) = \frac{\partial^2}{\partial \eta\, \partial \eta^T}\, \kappa(s; \eta).
\]

Later, to obtain the specific results for the model in (4), we simply identify η with (α, β) and take

\[
\kappa(s; \eta) = \rho(s, N(s-); \alpha)\, \psi[X(s)\beta].
\]

With the above simplification, for one unit monitored over 𝒮 = [0, s*], we will then define

\[
\begin{aligned}
\varphi_j(v; \eta) &= \kappa(v; \eta)\, \mathcal{E}_j'(v)\, I_{(S_{j-1}, S_j]}(v), \quad j = 1, 2, \ldots, N(s-) + 1; \\
Y(s, t; \eta) &= \sum_{j=1}^{N(s-)+1} \varphi_j[\mathcal{E}_j^{-1}(t); \eta]\, I_{(\mathcal{E}_j(S_{j-1}), \mathcal{E}_j(S_j)]}(t),
\end{aligned}
\]

so that with n units, we will then have

\[
S^{(0)}(s, t; \eta) = \mathbb{P} Y(s, t; \eta) = \frac{1}{n} \sum_{i=1}^n Y_i(s, t; \eta),
\]

where in this last function the κ functions may also depend on i.

We denote by (η0, Λ00) the true parameter vector and, to simplify notation, we suppress writing the true parameter values in our functions when no confusion can arise. Thus, Ai(s*, t) ≡ Ai(s*, t; η0, Λ00), Yi(s*, t) ≡ Yi(s*, t; η0), and Mi(s*, t) ≡ Mi(s*, t; η0, Λ00).

In establishing consistency and weak convergence properties of the estimators, we will need a general weak convergence result of processes formed as stochastic integrals of the processes Mi(s*, t), i = 1, 2, …, n, which we recall are martingales with respect to s* but not with respect to t.

Given an n and an (s*, t, η), let us define a random discrete probability measure ℚn(·; s*, t, η) on the (random) set

\[
\mathcal{K}_n(s^*) = \{(i, j) : j = 1, 2, \ldots, N_i(s^*-) + 1;\ i = 1, 2, \ldots, n\}
\]

according to the probabilities

\[
\mathbb{Q}_n((i, j); s^*, t, \eta) = \frac{1}{n}\Bigl\{\frac{Y_i(s^*, t; \eta)}{S^{(0)}(s^*, t; \eta)}\Bigr\}\Bigl\{\frac{\varphi_{ij}[\mathcal{E}_{ij}^{-1}(t); \eta]}{Y_i(s^*, t; \eta)}\, I_{(\mathcal{E}_i(S_{i,j-1}),\, \mathcal{E}_i(S_{ij})]}(t)\Bigr\}.
\]

For a function g : 𝒦n(s*) → ℜr, which may be random and may also depend on (s*, t, η),

\[
\mathbb{E}_n(s^*, t, \eta)\, g \equiv \mathbb{Q}_n(s^*, t, \eta)\, g
\]

will denote its expectation with respect to the p.m. ℚn and

\[
\mathbb{V}_n(s^*, t, \eta)\, g \equiv \mathbb{Q}_n(s^*, t, \eta)\bigl[g - \mathbb{Q}_n(s^*, t, \eta)\, g\bigr]^{\otimes 2} = \mathbb{Q}_n(s^*, t, \eta)\, g^{\otimes 2} - \bigl[\mathbb{Q}_n(s^*, t, \eta)\, g\bigr]^{\otimes 2}
\]

will denote its variance-covariance matrix with respect to ℚn.

Let us also define

\[
\bar{\mathbb{Q}}_n(i; s^*, t, \eta) = \frac{1}{n}\Bigl\{\frac{Y_i(s^*, t; \eta)}{S^{(0)}(s^*, t; \eta)}\Bigr\} = \frac{Y_i(s^*, t; \eta)}{\sum_{l=1}^n Y_l(s^*, t; \eta)}, \quad i = 1, 2, \ldots, n.
\]

Thus, when the function g : 𝒦n(s*) → ℜr is such that g(i, j) = g*(i) for some g*, then

\[
\mathbb{Q}_n(s^*, t, \eta)\, g = \sum_{i=1}^n g^*(i)\, \bar{\mathbb{Q}}_n(i; s^*, t, \eta) = \sum_{i=1}^n g^*(i)\Bigl[\frac{Y_i(s^*, t; \eta)}{\sum_{l=1}^n Y_l(s^*, t; \eta)}\Bigr].
\]

In this case, the variance-covariance matrix of g with respect to ℚn also takes a correspondingly simplified form.
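For functions with g(i, j) = g*(i), the ℚn-expectation above is just a weighted average of the values g*(i) with weights Yi/Σl Yl; a small numerical sketch (our own names):

```python
import numpy as np

def weighted_Q_expectation(g_star, Y):
    """Q_n-expectation of g(i, j) = g*(i): weighted average of g*(i)
    with weights Y_i / sum_l Y_l, per the simplified form above."""
    w = np.asarray(Y, dtype=float)
    g = np.asarray(g_star, dtype=float)
    return float(g @ (w / w.sum()))

# equal at-risk weights give the plain average; unequal weights tilt it
assert np.isclose(weighted_Q_expectation([1.0, 3.0], [1.0, 1.0]), 2.0)
assert np.isclose(weighted_Q_expectation([1.0, 3.0], [3.0, 1.0]), 1.5)
```

The corresponding ℚn-variance is the weighted second moment minus the square of this weighted mean.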

Theorem 1

Let {Hi(n)(s*, t) : t ∈ 𝒯}, for i = 1, 2, …, n and n = 1, 2, …, be a triangular array of vector-valued processes, and assume the following conditions:

  1. For each i, Hi(n) is bounded and, for each v ∈ [0, s*], Hi(n)(s*, ℰi(v)) is 𝔉-predictable;

  2. There exists a deterministic function s(0) : ℐ → ℜ+ such that
\[
\sup_{t \in \mathcal{T}} \bigl| S^{(0)}(s^*, t) - s^{(0)}(s^*, t) \bigr| \xrightarrow{p} 0
\]

    and inft∈𝒯 s(0)(s*, t) > 0; and

  3. There exists a deterministic matrix function v : ℐ → ℜ+ such that
\[
\sup_{w \in \mathcal{T}} \bigl\| \mathbb{Q}_n(s^*, w)\bigl\{[H^{(n)}(s^*, w)]^{\otimes 2}\bigr\} - v(s^*, w) \bigr\| \xrightarrow{p} 0,
\]
    and, for every t ∈ (0, t*], the matrix
\[
\Sigma(s^*, t) = \int_0^t v(s^*, w)\, s^{(0)}(s^*, w)\, \Lambda_{00}(dw)
\]

    is positive definite.

Defining the stochastic integrals, for n = 1, 2,,

\[
W^{(n)}(s^*, t) = \frac{1}{\sqrt{n}} \sum_{i=1}^n \int_0^t H_i^{(n)}(s^*, w)\, M_i(s^*, dw),
\]

then {W(n)(s*, t) : t ∈ 𝒯} converges weakly on Skorokhod’s space D[0, t*] to a zero-mean Gaussian process {W(∞)(s*, t) : t ∈ 𝒯} whose covariance function is

\[
\mathrm{Cov}\{W^{(\infty)}(s^*, t_1), W^{(\infty)}(s^*, t_2)\} = \Sigma(s^*, \min(t_1, t_2)).
\]
Proof

The proof of this result is analogous to the proof of the general theorem in [15].
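As a rough numerical illustration of such Gaussian limits (our own simulation, not part of the paper), consider the toy HPP case with Mi(s) = Ni(s) − λs, corresponding informally to a constant integrand: the normalized sum n−1/2 Σi Mi(s) should have mean near 0 and variance near λs.

```python
import math, random, statistics

def normalized_martingale_sum(rate, s, n, rng):
    """(1/sqrt(n)) * sum_i [N_i(s) - rate*s] for n IID rate-`rate`
    Poisson counting processes observed over [0, s]."""
    total = 0.0
    for _ in range(n):
        t, k = 0.0, 0
        while True:                     # count exponential gaps up to s
            t += rng.expovariate(rate)
            if t > s:
                break
            k += 1
        total += k - rate * s           # the martingale M_i(s) = N_i(s) - rate*s
    return total / math.sqrt(n)

rng = random.Random(7)
draws = [normalized_martingale_sum(1.0, 2.0, 50, rng) for _ in range(400)]
m, v = statistics.mean(draws), statistics.variance(draws)
# mean should be near 0 and variance near rate*s = 2
```

The Monte Carlo mean and variance match the zero mean and the variance function Λ00-integral of Theorem 1 in this degenerate special case; the theorem's content is that this Gaussian behavior persists with data-dependent, effective-age-indexed integrands.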

5 Consistency Properties

In this section we will establish the consistency of the sequences of estimators η̂n and Λ̂0n(s*, ·) as the number of units n increases to infinity. We shall assume the following set of “regularity conditions,” which are somewhat analogous to those in [4] and [5].

  • (C1)
    For each (s, t) ∈ ℐ, η ↦ κ(s, t; η) is twice continuously differentiable with
\[
\dot{\kappa}(s, t; \eta) = \frac{\partial}{\partial \eta}\, \kappa(s, t; \eta) \quad \text{and} \quad \ddot{\kappa}(s, t; \eta) = \frac{\partial^2}{\partial \eta\, \partial \eta^T}\, \kappa(s, t; \eta).
\]

    Furthermore, the operations of differentiation (with respect to η) and integration can be interchanged.

  • (C2)
    There exists a deterministic function s(0) : ℐ × Γ → ℜ+ such that
\[
\sup_{t \in \mathcal{T};\, \eta \in \Gamma} \bigl| S^{(0)}(s^*, t; \eta) - s^{(0)}(s^*, t; \eta) \bigr| \xrightarrow{p} 0,
\]

    with inft∈𝒯 s(0)(s*, t; η) > 0 and with Λ00(t*) = ∫0t* λ00(w) dw < ∞.

  • (C3)
    There exist deterministic functions s(1) : ℐ × Γ2 → ℜk and s(2) : ℐ × Γ2 → (ℜk)⊗2 such that, with
\[
Q_n^{(1)}(s^*, t; \eta_1, \eta_2) = \mathbb{Q}_n(s^*, t; \eta_1)\Bigl[\frac{\dot{\kappa}}{\kappa}(\mathcal{E}^{-1}(t); \eta_2)\Bigr]; \qquad Q_n^{(2)}(s^*, t; \eta_1, \eta_2) = \mathbb{Q}_n(s^*, t; \eta_1)\Bigl[\frac{\ddot{\kappa}}{\kappa}(\mathcal{E}^{-1}(t); \eta_2)\Bigr],
\]
    and
\[
q^{(1)}(s^*, t; \eta_1, \eta_2) = \frac{s^{(1)}}{s^{(0)}}(s^*, t; \eta_1, \eta_2); \qquad q^{(2)}(s^*, t; \eta_1, \eta_2) = \frac{s^{(2)}}{s^{(0)}}(s^*, t; \eta_1, \eta_2),
\]
    we have
\[
\sup_{t \in \mathcal{T};\, (\eta_1, \eta_2) \in \Gamma^2} \bigl\| Q_n^{(1)}(s^*, t; \eta_1, \eta_2) - q^{(1)}(s^*, t; \eta_1, \eta_2) \bigr\| \xrightarrow{p} 0; \qquad \sup_{t \in \mathcal{T};\, (\eta_1, \eta_2) \in \Gamma^2} \bigl\| Q_n^{(2)}(s^*, t; \eta_1, \eta_2) - q^{(2)}(s^*, t; \eta_1, \eta_2) \bigr\| \xrightarrow{p} 0.
\]
  • (C4)
    With v(s*, t) satisfying
\[
\sup_{t \in \mathcal{T}} \Bigl\| \mathbb{V}_n(s^*, t)\Bigl[\frac{\dot{\kappa}}{\kappa}(\mathcal{E}^{-1}(t))\Bigr] - v(s^*, t) \Bigr\| \xrightarrow{p} 0,
\]
    the matrix
\[
\Sigma(s^*, t) = \int_0^t v(s^*, w)\, s^{(0)}(s^*, w)\, \Lambda_{00}(dw)
\]

    is positive definite for each t ∈ (0, t*].

  • (C5)
    For each s ∈ [0, s*], the mappings
\[
(v, \eta) \mapsto \frac{\dot{\kappa}}{\kappa}(v; \eta) - Q_n^{(1)}(s, \mathcal{E}(v); \eta, \eta); \qquad (v, \eta) \mapsto \frac{\ddot{\kappa}}{\kappa}(v; \eta) - Q_n^{(2)}(s, \mathcal{E}(v); \eta, \eta),
\]

    are bounded and 𝔉s-measurable for each v ∈ [0, s].

We first establish an intermediate result.

Lemma 1

For w ∈ 𝒯 and η ∈ Γ, we have

\[
\frac{\dot{S}^{(0)}}{S^{(0)}}(s^*, w; \eta) = Q_n^{(1)}(s^*, w; \eta, \eta); \qquad \frac{\ddot{S}^{(0)}}{S^{(0)}}(s^*, w; \eta) = Q_n^{(2)}(s^*, w; \eta, \eta).
\]
Proof

The proofs are straightforward and hence omitted.

For notational brevity, let us define

\[
\Psi_n(s^*, t^*; \eta) = \frac{\partial}{\partial \eta}\Bigl\{\frac{1}{n}\, l_P(s^*, t^*; \eta)\Bigr\}; \qquad \Psi(s^*, t^*; \eta) = \int_0^{t^*} \bigl[q^{(1)}(s^*, w; \eta_0, \eta) - q^{(1)}(s^*, w; \eta, \eta)\bigr]\, s^{(0)}(s^*, w)\, \Lambda_{00}(dw),
\]

where lP(s*, t*; η) = log LP(s*, t*; η) is the logarithm of the partial likelihood function. We are now in a position to state a result concerning the consistency of the partial likelihood MLE of η. Without loss of generality, we shall assume that the maximizer of the partial likelihood can be obtained as a zero of η ↦ Ψn(s*, t*; η).

Theorem 2

If η̂n is such that Ψn(s*, t*; η̂n) = 0 and if, for every ε > 0, we have that

\[
\inf_{\{\eta :\, \|\eta - \eta_0\| \ge \varepsilon\}} \bigl\| \Psi(s^*, t^*; \eta) \bigr\| > 0,
\]

then, under the regularity conditions (C1)–(C5), η̂n →p η0.

Proof

From (12), (C1), and Lemma 1, we have

\[
\Psi_n(s^*, t^*; \eta) = \mathbb{P} \int_0^{s^*} \Bigl[\frac{\dot{\kappa}}{\kappa}(v; \eta) - Q_n^{(1)}(s^*, \mathcal{E}(v); \eta, \eta)\Bigr]\, N(dv, t^*) = \mathbb{P} \int_0^{s^*} \Bigl[\frac{\dot{\kappa}}{\kappa}(v; \eta) - Q_n^{(1)}(s^*, \mathcal{E}(v); \eta, \eta)\Bigr]\, M(dv, t^*) \tag{17}
\]
\[
\qquad + \mathbb{P} \int_0^{s^*} \Bigl[\frac{\dot{\kappa}}{\kappa}(v; \eta) - Q_n^{(1)}(s^*, \mathcal{E}(v); \eta, \eta)\Bigr]\, A(dv, t^*). \tag{18}
\]

By (C5) and Theorem 1, the term in (17) is op(1). On the other hand, the term in (18) becomes, after splitting the region of integration into the disjoint intervals (Sj−1, Sj] for j = 1, 2, …, N(s*−) + 1 and then doing a variable transformation,

\[
\text{Term (18)} = \int_0^{t^*} \mathbb{P}\Bigl\{\sum_{j=1}^{N(s^*-)+1} \Bigl[\frac{\dot{\kappa}}{\kappa}(\mathcal{E}_j^{-1}(w); \eta) - Q_n^{(1)}(s^*, w; \eta, \eta)\Bigr]\, \varphi_j[\mathcal{E}_j^{-1}(w); \eta_0]\, I_{(\mathcal{E}_j(S_{j-1}),\, \mathcal{E}_j(S_j)]}(w)\Bigr\}\, \Lambda_{00}(dw) = \int_0^{t^*} S^{(0)}(s^*, w)\, \bigl[Q_n^{(1)}(s^*, w; \eta_0, \eta) - Q_n^{(1)}(s^*, w; \eta, \eta)\bigr]\, \Lambda_{00}(dw).
\]

By conditions (C2) and (C3), this last term will converge uniformly in probability to Ψ(s*, t*; η), so that we will have the result

\[
\sup_{\eta \in \Gamma} \bigl\| \Psi_n(s^*, t^*; \eta) - \Psi(s^*, t^*; \eta) \bigr\| \xrightarrow{p} 0. \tag{19}
\]

Finally, observe that Ψ(s*, t*; η0) = 0, so, by the condition of the theorem coupled with (19), it follows from Theorem 5.9 of van der Vaart [18] that η̂n →p η0.

Indeed, more can be said based on the following Lemma 2, which will also be used in the weak convergence proof of Section 6. Since Σ(s*, t*) is positive definite, this lemma implies that η0 is, in fact, a maximizer of the limit in probability of the normalized log-partial likelihood difference [lP(s*, t*; η) − lP(s*, t*; η0)]/n.

Lemma 2

Under conditions (C1)–(C5),

\[
\dot{\Psi}_n(s^*, t^*) \equiv \frac{\partial^2}{\partial \eta\, \partial \eta^T}\Bigl\{\frac{1}{n}\, l_P(s^*, t^*; \eta)\Bigr\}\Bigr|_{\eta = \eta_0} = -\int_0^{t^*} \mathbb{V}_n(s^*, w)\Bigl[\frac{\dot{\kappa}}{\kappa}(\mathcal{E}^{-1}(w))\Bigr]\, S^{(0)}(s^*, w)\, \Lambda_{00}(dw) + o_p(1) \xrightarrow{p} -\Sigma(s^*, t^*).
\]
Proof

Straightforward, though tedious, calculations show that

\[
\begin{aligned}
\dot{\Psi}_n(s^*, t^*; \eta) &= \mathbb{P} \int_0^{s^*} \Bigl[\frac{\ddot{\kappa}}{\kappa}(v; \eta) - \frac{\ddot{S}^{(0)}}{S^{(0)}}(s^*, \mathcal{E}(v); \eta)\Bigr]\, N(dv, t^*) - \mathbb{P} \int_0^{s^*} \Bigl\{\Bigl[\frac{\dot{\kappa}}{\kappa}(v; \eta)\Bigr]^{\otimes 2} - \Bigl[\frac{\dot{S}^{(0)}}{S^{(0)}}(s^*, \mathcal{E}(v); \eta)\Bigr]^{\otimes 2}\Bigr\}\, N(dv, t^*) \\
&= \int_0^{t^*} \bigl\{Q_n^{(2)}(s^*, w; \eta_0, \eta) - Q_n^{(2)}(s^*, w; \eta, \eta)\bigr\}\, S^{(0)}(s^*, w; \eta_0)\, \Lambda_{00}(dw) \\
&\quad - \int_0^{t^*} \Bigl\{\mathbb{Q}_n(s^*, w; \eta_0)\Bigl[\frac{\dot{\kappa}}{\kappa}(\mathcal{E}^{-1}(w); \eta)\Bigr]^{\otimes 2} - \bigl[Q_n^{(1)}(s^*, w; \eta, \eta)\bigr]^{\otimes 2}\Bigr\}\, S^{(0)}(s^*, w; \eta_0)\, \Lambda_{00}(dw) + o_p(1).
\end{aligned}
\]

Evaluating at η = η0, and noting that

\[
\mathbb{Q}_n(s^*, w; \eta_0)\Bigl[\frac{\dot{\kappa}}{\kappa}(\mathcal{E}^{-1}(w); \eta_0)\Bigr] = Q_n^{(1)}(s^*, w; \eta_0, \eta_0)
\]

then yields the representation given in the statement of the lemma. Letting n → ∞, the limiting matrix is −Σ(s*, t*).

Theorem 3

Under conditions (C1)–(C5), Λ̂0n(s*, ·) converges uniformly in probability to Λ00(·) on [0, t*], that is,

\[
\sup_{t \in [0, t^*]} \bigl| \hat{\Lambda}_{0n}(s^*, t) - \Lambda_{00}(t) \bigr| \xrightarrow{p} 0.
\]
Proof

With
\[
\Lambda_0^*(s^*, t) = \int_0^t I\{S^{(0)}(s^*, w; \hat{\eta}) > 0\}\, \Lambda_{00}(dw),
\]
we have that

\[
\bigl|\hat{\Lambda}_0(s^*, t) - \Lambda_{00}(t)\bigr| \le \bigl|\hat{\Lambda}_0(s^*, t) - \Lambda_0^*(s^*, t)\bigr| + \bigl|\Lambda_0^*(s^*, t) - \Lambda_{00}(t)\bigr|
\]
\[
\le \Bigl|\hat{\Lambda}_0(s^*, t) - \int_0^t \frac{I\{S^{(0)}(s^*, w) > 0\}}{S^{(0)}(s^*, w)}\, \mathbb{P} N(s^*, dw)\Bigr| \tag{20}
\]

\[
\quad + \Bigl|\int_0^t \frac{I\{S^{(0)}(s^*, w) > 0\}}{S^{(0)}(s^*, w)}\, \mathbb{P} M(s^*, dw)\Bigr| \tag{21}
\]

\[
\quad + \Bigl|\int_0^t I\{S^{(0)}(s^*, w; \hat{\eta}) = 0\}\, \Lambda_{00}(dw)\Bigr|. \tag{22}
\]

Term (22) is bounded above by

\[
\int_0^{t^*} I\{S^{(0)}(s^*, w; \hat{\eta}) = 0\}\, \Lambda_{00}(dw),
\]

which is op(1) since S(0)(s*, w; η̂) →p s(0)(s*, w), and since by (C2) we have Λ00(t*) < ∞ and infw∈[0,t*] s(0)(s*, w) > 0. Term (20) is bounded above by

\[
\Bigl\{\sup_{w \in [0, t^*]} \Bigl|\frac{I\{S^{(0)}(s^*, w; \hat{\eta}) > 0\}}{S^{(0)}(s^*, w; \hat{\eta})} - \frac{I\{S^{(0)}(s^*, w) > 0\}}{S^{(0)}(s^*, w)}\Bigr|\Bigr\}\, \mathbb{P} N(s^*, t^*).
\]

But ℙN(s*, t*) = ℙM(s*, t*) + ℙA(s*, t*). By Theorem 1, ℙM(s*, t*) = op(1), while ℙA(s*, t*) = ∫0t* S(0)(s*, w) Λ00(dw), which converges in probability to ∫0t* s(0)(s*, w) Λ00(dw), a finite quantity by (C2). Thus, ℙN(s*, t*) = Op(1). Since

\[
\sup_{w \in [0, t^*]} \Bigl|\frac{I\{S^{(0)}(s^*, w; \hat{\eta}) > 0\}}{S^{(0)}(s^*, w; \hat{\eta})} - \frac{I\{S^{(0)}(s^*, w) > 0\}}{S^{(0)}(s^*, w)}\Bigr| = o_p(1),
\]

it therefore follows that term (20) is op(1). Finally, by Theorem 1, we have that the process

\[
\Bigl\{\frac{1}{\sqrt{n}} \sum_{i=1}^n \int_0^t \frac{I\{S^{(0)}(s^*, w) > 0\}}{S^{(0)}(s^*, w)}\, M_i(s^*, dw) : t \in [0, t^*]\Bigr\}
\]

converges weakly to a zero-mean Gaussian process whose covariance function is

\[
\mathrm{Cov}(G(t_1), G(t_2)) = \int_0^{\min(t_1, t_2)} \frac{\Lambda_{00}(dw)}{s^{(0)}(s^*, w)}
\]

for t1, t2 ∈ [0, t*]. As a consequence,

\[
\sup_{t \in [0, t^*]} \Bigl|\frac{1}{\sqrt{n}} \sum_{i=1}^n \int_0^t \frac{I\{S^{(0)}(s^*, w) > 0\}}{S^{(0)}(s^*, w)}\, M_i(s^*, dw)\Bigr|
\]

converges weakly to supt∈[0,t*] |G(t)|, which is Op(1). It follows that

\[
\sup_{t \in [0, t^*]} \Bigl|\int_0^t \frac{I\{S^{(0)}(s^*, w) > 0\}}{S^{(0)}(s^*, w)}\, \mathbb{P} M(s^*, dw)\Bigr| = \frac{1}{\sqrt{n}} \sup_{t \in [0, t^*]} \Bigl|\frac{1}{\sqrt{n}} \sum_{i=1}^n \int_0^t \frac{I\{S^{(0)}(s^*, w) > 0\}}{S^{(0)}(s^*, w)}\, M_i(s^*, dw)\Bigr| = o_p(1).
\]

This completes the proof of the theorem.

6 Distributional Properties

In this section we establish the limiting distributional properties of {√n(η̂n − η0) : n = 1, 2, …} and {Wn(s*, t) : t ∈ 𝒯; n = 1, 2, …}, where

\[
W_n(s^*, t) = \sqrt{n}\,\bigl[\hat{\Lambda}_0^{(n)}(s^*, t) - \Lambda_{00}(t)\bigr].
\]

Define the process {Bn(s*, t): t ∈ 𝒯; n = 1, 2, …} according to

\[
B_n(s^*, t) = \int_0^t I\{S^{(0)}(s^*, w) > 0\}\, \frac{\dot{S}^{(0)}(s^*, w)}{[S^{(0)}(s^*, w)]^2}\, \mathbb{P} N(s^*, dw).
\]

Let us also define the process {Vn(s*, t): t ∈ 𝒯; n = 1, 2, …} via

\[
V_n(s^*, t) = \sqrt{n}\,\bigl[\hat{\Lambda}_0^{(n)}(s^*, t) - \Lambda_{00}(t)\bigr] + \sqrt{n}\,(\hat{\eta}_n - \eta_0)^T B_n(s^*, t).
\]

Furthermore, we shall assume that η̂n solves the equation

\[
U_P^{(n)}(s^*, t^*; \eta) = 0, \quad \text{with} \quad U_P^{(n)}(s^*, t^*; \eta) = \frac{\partial}{\partial \eta}\, l_P(s^*, t^*; \eta).
\]

We now present and prove a result from which the asymptotic properties follow.

Theorem 4

Under conditions (C1)–(C5), we have the representations

\[
\sqrt{n}\,(\hat{\eta}_n - \eta_0) = [\Sigma(s^*, t^*)]^{-1} \Bigl\{\sqrt{n}\, \mathbb{P} \int_0^{t^*} \Bigl[\frac{\dot{\kappa}}{\kappa}[\mathcal{E}^{-1}(w)] - \frac{\dot{S}^{(0)}}{S^{(0)}}(s^*, w)\Bigr]\, M(s^*, dw)\Bigr\} + o_p(1); \tag{23}
\]

and

\[
V_n(s^*, t) = \int_0^{t^*} I(w \le t)\, \frac{I\{S^{(0)}(s^*, w) > 0\}}{S^{(0)}(s^*, w)}\, \sqrt{n}\, \mathbb{P} M(s^*, dw) + o_p(1). \tag{24}
\]

Furthermore, { n(η^n-η0)} and {Vn(s*, t): t ∈ 𝒯 } are asymptotically independent with each weakly converging to Gaussian limits.

Proof

From the definition of η̂n, we have by first-order Taylor expansion that

\[
\sqrt{n}\,(\hat{\eta}_n - \eta_0) = \bigl[-\dot{\Psi}_n(s^*, t^*; \tilde{\eta}_n)\bigr]^{-1}\, \bigl[\sqrt{n}\, \Psi_n(s^*, t^*; \eta_0)\bigr],
\]

where η̃n lies in a ball centered at η0 with radius ‖η̂n − η0‖. It is easy to see that

\[
\sqrt{n}\, \Psi_n(s^*, t^*; \eta_0) = \sqrt{n}\, \mathbb{P} \int_0^{s^*} \Bigl\{\frac{\dot{\kappa}}{\kappa}(v) - \frac{\dot{S}^{(0)}}{S^{(0)}}(s^*, \mathcal{E}(v))\Bigr\}\, M(dv, t^*) = \sqrt{n}\, \mathbb{P} \int_0^{t^*} \Bigl\{\frac{\dot{\kappa}}{\kappa}[\mathcal{E}^{-1}(w)] - \frac{\dot{S}^{(0)}}{S^{(0)}}(s^*, w)\Bigr\}\, M(s^*, dw).
\]

Furthermore, since η̂n →p η0, and by virtue of Lemma 2, we have that

\[
\bigl[-\dot{\Psi}_n(s^*, t^*; \tilde{\eta}_n)\bigr]^{-1} = [\Sigma(s^*, t^*)]^{-1} + o_p(1).
\]

As such, we obtain the representation for √n(η̂n − η0).

Once again, by first-order Taylor expansion, we have that on the set where S(0)(s*,w; η̂n) > 0,

\[
\frac{1}{S^{(0)}(s^*, w; \hat{\eta}_n)} = \frac{1}{S^{(0)}(s^*, w; \eta_0)} - (\hat{\eta}_n - \eta_0)^T\, \frac{\dot{S}^{(0)}(s^*, w; \tilde{\eta}_n)}{[S^{(0)}(s^*, w; \tilde{\eta}_n)]^2},
\]

with η̃n inside the ball centered at η0 with radius ||η̂nη0||. Defining

\[
\Lambda_0^*(s^*, t) = \int_0^t I\{S^{(0)}(s^*, w; \hat{\eta}_n) > 0\}\, \Lambda_{00}(dw),
\]

and recalling that

\[
\hat{\Lambda}_0^{(n)}(s^*, t) = \int_0^t \frac{I\{S^{(0)}(s^*, w; \hat{\eta}_n) > 0\}}{S^{(0)}(s^*, w; \hat{\eta}_n)}\, \mathbb{P} N(s^*, dw),
\]

we obtain

\[
\begin{aligned}
\sqrt{n}\,\bigl[\hat{\Lambda}_0^{(n)}(s^*, t) - \Lambda_0^*(s^*, t)\bigr] &= \int_0^t \frac{I\{S^{(0)}(s^*, w; \hat{\eta}_n) > 0\}}{S^{(0)}(s^*, w; \hat{\eta}_n)}\, \sqrt{n}\, \mathbb{P} M(s^*, dw) \\
&\quad - \sqrt{n}\,(\hat{\eta}_n - \eta_0)^T \int_0^t I\{S^{(0)}(s^*, w; \hat{\eta}_n) > 0\}\, \frac{\dot{S}^{(0)}(s^*, w; \tilde{\eta}_n)}{[S^{(0)}(s^*, w; \tilde{\eta}_n)]^2}\, \mathbb{P} N(s^*, dw).
\end{aligned}
\]

The representation for Vn(s*, t) given in the statement of the theorem now follows by noting that

\[
\sup_{0 \le t \le t^*} \sqrt{n}\,\bigl|\Lambda_0^*(s^*, t) - \Lambda_{00}(t)\bigr| = o_p(1); \quad \sup_{0 \le t \le t^*} \bigl|S^{(0)}(s^*, t; \hat{\eta}_n) - S^{(0)}(s^*, t; \eta_0)\bigr| = o_p(1); \quad \sup_{0 \le t \le t^*} \bigl\|\dot{S}^{(0)}(s^*, t; \hat{\eta}_n) - \dot{S}^{(0)}(s^*, t; \eta_0)\bigr\| = o_p(1).
\]

Finally, let t = (t1, t2, …, tp)T with components in 𝒯. From the just-established representations, with I{w ≤ t} = (I{w ≤ t1}, …, I{w ≤ tp})T, we have

\[
\begin{bmatrix} \sqrt{n}\,(\hat{\eta}_n - \eta_0) \\ V_n(s^*, t) \end{bmatrix} = \begin{bmatrix} \Sigma(s^*, t^*)^{-1} & 0 \\ 0 & I \end{bmatrix} \sqrt{n}\, \mathbb{P} \int_0^{t^*} \begin{bmatrix} \dfrac{\dot{\kappa}}{\kappa}[\mathcal{E}^{-1}(w)] - \dfrac{\dot{S}^{(0)}}{S^{(0)}}(s^*, w) \\[1ex] I(w \le t)\, \dfrac{I\{S^{(0)}(s^*, w) > 0\}}{S^{(0)}(s^*, w)} \end{bmatrix} M(s^*, dw) + o_p(1).
\]

By the main weak convergence theorem, or by invoking the Martingale Central Limit Theorem after a time transformation, this converges weakly to the random vector

\[
\begin{bmatrix} W_1 \\ W_2 \end{bmatrix} = \begin{bmatrix} \Sigma(s^*, t^*)^{-1} & 0 \\ 0 & I \end{bmatrix} \begin{bmatrix} Z_1 \\ Z_2 \end{bmatrix},
\]

where (Z1T,Z2T)T is a (k + p)-dimensional zero mean multivariate normal random vector with covariance matrix

\[
\mathrm{Cov}\begin{bmatrix} Z_1 \\ Z_2 \end{bmatrix} = \operatorname*{plim}_{n \to \infty} \int_0^{t^*} \mathbb{V}_n(s^*, w) \begin{bmatrix} \dfrac{\dot{\kappa}}{\kappa}[\mathcal{E}^{-1}(w)] - \dfrac{\dot{S}^{(0)}}{S^{(0)}}(s^*, w) \\[1ex] I(w \le t)\, \dfrac{I\{S^{(0)}(s^*, w) > 0\}}{S^{(0)}(s^*, w)} \end{bmatrix} S^{(0)}(s^*, w)\, \Lambda_{00}(dw).
\]

However, the covariance matrix between Z1 and Z2 equals 0 since, for every w ∈ 𝒯,

\[
\mathbb{Q}_n(s^*, w)\Bigl[\frac{\dot{\kappa}}{\kappa}[\mathcal{E}^{-1}(w)] - \frac{\dot{S}^{(0)}}{S^{(0)}}(s^*, w)\Bigr] = 0.
\]

Because the limits are Gaussian, this then establishes that √n(η̂ − η0) and Vn(s*, ·) are asymptotically independent.

The following two corollaries are then immediate consequences of the preceding theorem and elements of its proof.

Corollary 1

Under the conditions of Theorem 4, as n → ∞,

\[
\sqrt{n}\,(\hat{\eta}_n - \eta_0) \xrightarrow{d} N\bigl(0, [\Sigma(s^*, t^*)]^{-1}\bigr).
\]
Proof

This is immediate from the fact that Z1 in the proof of Theorem 4 is a k-dimensional zero-mean normal vector with covariance matrix Σ(s*, t*).

Corollary 2

Under the conditions of Theorem 4, as n → ∞, the process Wn(s*, ·) = √n[Λ̂0(n)(s*, ·) − Λ00(·)] converges weakly in the Skorokhod space D[𝒯] to a zero-mean Gaussian process with covariance function given by

\[
c(s^*, t_1, t_2) = \int_0^{\min(t_1, t_2)} \frac{\Lambda_{00}(dw)}{s^{(0)}(s^*, w)} + b(s^*, t_1)^T\, \{\Sigma(s^*, t^*)\}^{-1}\, b(s^*, t_2), \tag{25}
\]

for t1, t2 ∈ 𝒯 and with b(s*, t) = ∫0t q(1)(s*, w) Λ00(dw).

Proof

From Theorem 4 we have the results that

\[
\sqrt{n}\,(\hat{\eta}_n - \eta_0) \xrightarrow{d} W_1(s^*, t^*),
\]

where W1(s*, t*) ~ N(0, [Σ(s*, t*)]−1). Also, we have that

\[
\{V_n(s^*, t) : t \in \mathcal{T}\} \Rightarrow \{Z_2(s^*, t) : t \in \mathcal{T}\},
\]

where {Z2(s*, t): t ∈ 𝒯 } is a zero-mean Gaussian process with covariance function

\[
\mathrm{Cov}\{Z_2(s^*, t_1), Z_2(s^*, t_2)\} = \int_0^{\min(t_1, t_2)} \frac{\Lambda_{00}(dw)}{s^{(0)}(s^*, w)}.
\]

In addition, W1(s*, t*) and {Z2(s*, t): t ∈ 𝒯 } are independent. It is also evident that

\[
\sup_{t \in \mathcal{T}} \bigl\| B_n(s^*, t) - b(s^*, t) \bigr\| \xrightarrow{p} 0.
\]

From the representations in Theorem 4, it follows that {Wn(s*, t): t ∈ 𝒯 } converges weakly to the process W ≡ {W(s*, t): t ∈ 𝒯 } with

\[
W(s^*, t) = Z_2(s^*, t) - b(s^*, t)^{\mathsf T}\, W_1(s^*, t^*).
\]

As such W is a zero-mean Gaussian process and its covariance function is

\[
c(s^*, t_1, t_2) = \mathrm{Cov}\{W(s^*, t_1), W(s^*, t_2)\}
= \int_0^{\min(t_1, t_2)} \frac{\Lambda_{00}(dw)}{s^{(0)}(s^*, w)}
+ b(s^*, t_1)^{\mathsf T}\, [\Sigma(s^*, t^*)]^{-1}\, b(s^*, t_2).
\]
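For completeness, the covariance computation can be expanded in one line. Since Z2(s*, ·) and W1(s*, t*) are independent, the cross terms vanish and

\[
\begin{aligned}
\mathrm{Cov}\{W(s^*, t_1), W(s^*, t_2)\}
&= \mathrm{Cov}\{Z_2(s^*, t_1), Z_2(s^*, t_2)\}
 + b(s^*, t_1)^{\mathsf T}\,\mathrm{Cov}[W_1(s^*, t^*)]\,b(s^*, t_2) \\
&= \int_0^{\min(t_1, t_2)} \frac{\Lambda_{00}(dw)}{s^{(0)}(s^*, w)}
 + b(s^*, t_1)^{\mathsf T}\,[\Sigma(s^*, t^*)]^{-1}\,b(s^*, t_2),
\end{aligned}
\]

the second equality using Cov[W1(s*, t*)] = [Σ(s*, t*)]−1 from Theorem 4.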

This completes the proof of the corollary.

Consistent estimators of the covariance functions are then easily obtained. The covariance matrix Σ(s*, t*) may be estimated by

\[
\hat\Sigma(s^*, t^*) = \int_0^{t^*}
\left[ \frac{\dot\kappa}{\kappa}[E^{-1}(w); \hat\eta_n] - \frac{\dot S^{(0)}}{S^{(0)}}(s^*, w; \hat\eta_n) \right]^{\otimes 2}
S^{(0)}(s^*, w; \hat\eta_n)\, \hat\Lambda_0^{(n)}(s^*, dw; \hat\eta_n).
\]

For the covariance function of Z2(s*, ·), a consistent estimator is given by

\[
\widehat{\mathrm{Cov}}[Z_2(s^*, t_1), Z_2(s^*, t_2)] = \int_0^{\min(t_1, t_2)} \frac{\hat\Lambda_0^{(n)}(s^*, dw)}{S^{(0)}(s^*, w; \hat\eta_n)}.
\]

On the other hand, an estimator of b(s*, t) is given by

\[
\hat b(s^*, t) = \int_0^t \frac{\dot S^{(0)}}{S^{(0)}}(s^*, w; \hat\eta_n)\, \hat\Lambda_0^{(n)}(s^*, dw).
\]

From these estimators, we are then able to obtain a consistent estimator of the covariance function c(s*, t1, t2) of the limiting Gaussian process W(s*, ·). This estimator is

\[
\hat c(s^*, t_1, t_2) = \widehat{\mathrm{Cov}}[Z_2(s^*, t_1), Z_2(s^*, t_2)]
+ \hat b(s^*, t_1)^{\mathsf T}\, [\hat\Sigma(s^*, t^*)]^{-1}\, \hat b(s^*, t_2).
\]
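Since the baseline estimator is a step function, the three plug-in integrals above reduce to finite sums over its jump times. The following is a minimal numerical sketch of that assembly. It assumes, hypothetically, that a fitted model has already supplied, at the jump times w_1 < … < w_m of Λ̂0(n)(s*, ·): the jumps `dLambda`, the values `S0` of S(0)(s*, w; η̂n), the vectors `Sdot_over_S` of (Ṡ(0)/S(0))(s*, w; η̂n), and the vectors `kappa_score` of (κ̇/κ)[E−1(w); η̂n]; none of these quantities is computed here.

```python
import numpy as np

def c_hat(t1, t2, w, dLambda, S0, Sdot_over_S, kappa_score, t_star):
    """Plug-in estimate of c(s*, t1, t2) for the limiting Gaussian process,
    assembled from the three estimators in the text (hypothetical inputs).

    w            : (m,) jump times of the baseline estimator on [0, t*]
    dLambda      : (m,) jumps of hat-Lambda_0^(n)(s*, .) at w
    S0           : (m,) S^(0)(s*, w_j; eta_hat)
    Sdot_over_S  : (k, m) (Sdot^(0)/S^(0))(s*, w_j; eta_hat)
    kappa_score  : (k, m) (kappa-dot/kappa)[E^{-1}(w_j); eta_hat]
    """
    # hat-Sigma(s*, t*): sum of outer products of the centered score
    diff = kappa_score - Sdot_over_S                       # (k, m)
    keep = w <= t_star
    Sigma_hat = (diff[:, keep] * S0[keep] * dLambda[keep]) @ diff[:, keep].T

    # hat-Cov[Z2(s*, t1), Z2(s*, t2)]: sum of dLambda_j / S0_j up to min(t1, t2)
    up = w <= min(t1, t2)
    cov_Z2 = np.sum(dLambda[up] / S0[up])

    # hat-b(s*, t): sum of (Sdot/S)(w_j) dLambda_j up to t
    b = lambda t: Sdot_over_S[:, w <= t] @ dLambda[w <= t]

    return cov_Z2 + b(t1) @ np.linalg.solve(Sigma_hat, b(t2))
```

By construction the returned value is symmetric in (t1, t2), as the covariance function must be, since Σ̂ is symmetric.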

Observe that the results in Corollaries 1 and 2 are highly analogous to those in [4] pertaining to the estimators of the parameters of the Cox proportional hazards model. However, one needs to be cautious: under the setting considered here, the limit functions appearing in the above results are more complicated, as they must reflect the sum-quota accrual scheme and the dynamics of the interventions or repairs performed after each event occurrence.

Through these asymptotic results, large-sample confidence intervals and bands, large-sample hypothesis testing procedures, and goodness-of-fit or model validation methods for the infinite-dimensional parameters may now be constructed for this general dynamic model for recurrent events. We note, however, that the results presented in this paper are still limited to the general dynamic recurrent event model without frailties. It remains an open problem to obtain large-sample results for the general dynamic model incorporating frailties.
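As the simplest instance of such a construction, Corollary 2 yields a pointwise large-sample confidence interval for Λ00(t): since √n[Λ̂0(n)(s*, t) − Λ00(t)] is asymptotically N(0, c(s*, t, t)), an approximate 95% interval is Λ̂0(n)(s*, t) ∓ 1.96 √(ĉ(s*, t, t)/n). The sketch below is purely illustrative; `Lambda_hat_t` and `c_hat_tt` are hypothetical placeholders for the baseline estimate and the estimated covariance ĉ(s*, t, t) at a fixed t.

```python
import numpy as np

def pointwise_ci(Lambda_hat_t, c_hat_tt, n, z=1.959964):
    """Pointwise large-sample CI for Lambda_00(t) based on Corollary 2.
    Lambda_hat_t : baseline estimate at t; c_hat_tt : estimated c(s*, t, t).
    Both would come from a fitted model; here they are placeholders.
    """
    half = z * np.sqrt(c_hat_tt / n)
    return Lambda_hat_t - half, Lambda_hat_t + half

# Illustrative values only:
lo, hi = pointwise_ci(Lambda_hat_t=1.4, c_hat_tt=2.5, n=150)
```

A simultaneous confidence band over 𝒯 would instead require the distribution of the supremum of the limiting Gaussian process, which could be approximated by simulation or resampling; the pointwise interval above is only the most elementary use of the weak convergence result.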

Footnotes

AMS Subject Classification: Primary: 62N01, 62N02; Secondary: 62G10

References

  • 1. Aalen O. Nonparametric inference for a family of counting processes. Annals of Statistics. 1978;6:701–726.
  • 2. Aalen OO, Borgan Ø, Gjessing HK. Survival and Event History Analysis: A Process Point of View. Statistics for Biology and Health. Springer; New York: 2008.
  • 3. Andersen P, Borgan Ø, Gill R, Keiding N. Statistical Models Based on Counting Processes. Springer-Verlag; New York: 1993.
  • 4. Andersen P, Gill R. Cox’s regression model for counting processes: a large sample study. Annals of Statistics. 1982;10:1100–1120.
  • 5. Borgan O. Maximum likelihood estimation in parametric counting process models, with applications to censored failure time data. Scandinavian Journal of Statistics. 1984;11:1–16.
  • 6. Breslow N, Crowley J. A large sample study of the life table and product limit estimates under random censorship. Annals of Statistics. 1974;2:437–453.
  • 7. Cox D. Regression models and life tables (with discussion). Journal of the Royal Statistical Society, Series B. 1972;34:187–220.
  • 8. Cox D. Partial likelihood. Biometrika. 1975;62:269–276.
  • 9. Fleming T, Harrington D. Counting Processes and Survival Analysis. Wiley; New York: 1991.
  • 10. Gill RD, Johansen S. A survey of product-integration with a view toward application in survival analysis. Annals of Statistics. 1990;18(4):1501–1555.
  • 11. Gjessing HK, Røysland K, Peña EA, Aalen OO. Recurrent events and the exploding Cox model. Lifetime Data Analysis. 2010;16(4):525–546. doi: 10.1007/s10985-010-9180-y.
  • 12. Peña E, Hollander M. Models for recurrent events in reliability and survival analysis. In: Soyer R, Mazzuchi T, Singpurwalla N, editors. Mathematical Reliability: An Expository Perspective. Chapter 6. Kluwer Academic Publishers; 2004. pp. 105–123.
  • 13. Peña EA, Strawderman RL, Hollander M. Nonparametric estimation with recurrent event data. Journal of the American Statistical Association. 2001;96(456):1299–1315.
  • 14. Peña E, Slate E, González JR. Semiparametric inference for a general class of models for recurrent events. Journal of Statistical Planning and Inference. 2007;137:1727–1747. doi: 10.1016/j.jspi.2006.05.004.
  • 15. Peña EA, Strawderman RL, Hollander M. A weak convergence result relevant in recurrent and renewal models. In: Recent Advances in Reliability Theory (Bordeaux, 2000). Statistics for Industry and Technology. Birkhäuser; Boston, MA: 2000. pp. 493–514.
  • 16. Prentice R, Williams B, Peterson A. On the regression analysis of multivariate failure time data. Biometrika. 1981;68:373–379.
  • 17. Sellke T. Weak convergence of the Aalen estimator for a censored renewal process. In: Gupta S, Berger J, editors. Statistical Decision Theory and Related Topics IV, Vol. 2. 1988. pp. 183–194.
  • 18. van der Vaart AW. Asymptotic Statistics. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press; Cambridge: 1998.
  • 19. Wei L, Lin D, Weissfeld L. Regression analysis of multivariate incomplete failure time data by modeling marginal distributions. Journal of the American Statistical Association. 1989;84:1065–1073.
