Asymptotics for a Class of Dynamic Recurrent Event Models

Edsel A Peña

doi:10.1080/10485252.2016.1225733

. Author manuscript; available in PMC: 2017 Sep 2.

Published in final edited form as: J Nonparametr Stat. 2016 Sep 2;28(4):716–735. doi: 10.1080/10485252.2016.1225733

Asymptotics for a Class of Dynamic Recurrent Event Models

Edsel A Peña ^1,^*

PMCID: PMC5533306 NIHMSID: NIHMS832199 PMID: 28761316

Abstract

Asymptotic properties, both consistency and weak convergence, of estimators arising in a general class of dynamic recurrent event models are presented. The class of models take into account the impact of interventions after each event occurrence, the impact of accumulating event occurrences, the induced informative and dependent right-censoring mechanism due to the data-accrual scheme, and the effect of covariate processes on the recurrent event occurrences. The class of models subsumes as special cases many of the recurrent event models that have been considered in biostatistics, reliability, and in the social sciences. The asymptotic properties presented have the potential of being useful in developing goodness-of-fit and model validation procedures, confidence intervals and confidence bands constructions, and hypothesis testing procedures for the finite- and infinite-dimensional parameters of a general class of dynamic recurrent event models, albeit the models without frailties.

Keywords and Phrases: consistency, compensators, counting processes, full models, marginal models, martingales, repair models, sum-quota accrual, weak convergence

1 Introduction and Background

Recurrent events pervade many disciplines such as the biomedical and public health sciences, engineering sciences, social and political sciences, economic sciences, and even sporting events. Examples of such events are non-fatal heart attacks, hospitalization of a patient with a chronic disease, migraines, breakdown of an electronic or mechanical system, discovery of a bug in a software, disagreement in a marriage, change of a job, Dow Jones Industrial Average (DJIA) decreasing by at least 200 points during a trading day, a perfect baseball game in the Major Leagues, a goal scored in a World Cup soccer game, and many others. The mathematical modeling of recurrent events, together with the development of statistical inference procedures for the models, are of paramount importance.

There are two approaches to the specification of mathematical models for recurrent events. The first is a full specification of the probability measure on the measurable space induced by the monitoring of the recurrent event. This is done by specifying the joint distributions of the calendar times of event occurrences, or equivalently the joint distributions of the inter-event times. Alternatively, the probability measure can be specified as a measure on the space of paths of the stochastic process arising from the monitoring of the recurrent event. The simplest and perhaps most common full parametric model is when the counting process associated with the event accrual is assumed to follow a homogeneous Poisson process (HPP), in which case the inter-event times are independent and identically distributed (IID) with common negative exponential distribution. One may also specify a nonparametric model by simply assuming that the inter-event time distribution is some unknown continuous distribution, resulting in the IID renewal model. The general dynamic model of interest in this article is of the full model variety.

The second modeling approach is referred to as marginal modeling. In its basic form, the event position within a unit is utilized as a stratifying variable, and a (marginal) probability measure is specified for each of the resulting strata. This approach was pioneered in the papers [16, 19]. It should be observed that the class of full models subsumes the class of marginal models. However, proponents of the marginal modeling approach espouse this marginal approach since it generally leads to an easier interpretation of model parameters though, at the same time, it may be difficult to justify a full model which is consistent with the specified marginal models. In fact, there could be several full models that are consistent with the marginal models.

An IID distributional specification for the inter-event times is clearly an oversimplification since it will often be the case that after an event occurrence some type of intervention, such as a corrective measure or a repair, will be performed, thereby altering the distribution of the time to the next event occurrence. Furthermore, time-dependent concomitant variables could also impact the distributions of the inter-event times, and within a unit the interevent times may be correlated owing to unobserved latent variables. The number of event occurrences could also impact these distributions, such as when event occurrences weakens the unit, thereby stochastically shortening the time to the next event occurrence. Due to practical and unavoidable constraints, the monitoring of the event could also only be performed over a finite, possibly random, observation window, and thus a sum-quota accrual scheme ensues wherein the number of observed event occurrences is a random variable which is informative about the event occurrence mechanism. This finite monitoring constraint also produces a right-censored observation, which could not be ignored in performing inference because of selection bias issues. The class of dynamic recurrent event models proposed in [12] incorporates the above considerations. This class of models is a specific member of the class of models of interest in this article. The major goals of this article are to obtain the asymptotic properties of semi-parametric estimators of the model parameters for the general class of dynamic recurrent event time models of the type in [12]. Note that algorithmic issues of the semi-parametric estimators for the model in [12] were dealt with in [14].

This article focuses on the large-sample properties of semiparametric estimators for the parameters of the class of dynamic models described in section 2. These semiparametric estimators are described in section 3. Consistency properties of the estimators will be established in section 5, while weak convergence properties will be developed in section 6.

2 Class of Dynamic Models

In this section we describe the general class of dynamic models of interest. In the sequel, (Ω, ℱ,P) is the basic probability space on which all random entities are defined. Consider a study in which units are to be monitored over the calendar time [0, s^*], where s^* ∈ (0,∞) is a fixed calendar time. For a unit in this study we suppose that there is a 1 × q vector of possibly time-varying bounded covariates X = {X(s) : s ∈ [0, s^*]}. We shall denote by N^† = {N^†(s) : s ∈ [0, s^*]} the counting process such that N^†(s) is the number of observed event occurrences over the period [0, s]. There is also a random time τ such that the unit is only under observation over the period [0, τ], so its at-risk process is Y^† = {Y^†(s) = I{τ ≥ s} : s ∈ [0, s^*]}, where I{·} is the indicator function. With the definition of N^† and Y^†, note that when τ < s^*, no more events are observed over (τ, s^*] for the unit, so that N^†(s) − N^†(τ) = 0 for s ∈ (τ, s^*], and this is reflected in the probabilistic statement in equation (1) below. We shall denote by 𝔉 = {ℱ_s: s ∈ [0, s^*]} a filtration on (Ω, ℱ, P) such that N^†, Y^†, and X are 𝔉-adapted and, in addition, Y^† and X are also 𝔉-predictable.

The class of dynamic models of interest postulates that for k ∈ {1, 2, …} and with dN^†(s) ≡ N^†((s + ds)−) − N^†(s−), as ds ↓ 0 and for s ∈ [0, s^*),

P {{d N}^{†} (s) \geq k ∣ F_{s -}} = Y^{†} (s) [λ (s ∣ X (s)) I {k = 1} + o_{p} (1) I {k \geq 1}] d s, a.e. - [P],

(1)

where

λ (s ∣ X (s)) = λ_{0} [E (s)] ρ [s, N^{†} (s -); α] ψ [X (s) β] .

(2)

In (2), 𝔈 = {ℰ(s) : s ∈ [0, s^*]} is an 𝔉-predictable process with paths that are piecewise left-continuous, nonnegative, ℰ(s) ≤ s, and piecewise differentiable with derivative satisfying ℰ′(s) ≥ 0; λ₀(·) is an unknown baseline hazard rate function with cumulative hazard function $Λ_{0} (\cdot) = \int_{0}^{\cdot} λ_{0} (s) d s$ is a known nonnegative bounded function over ℜ₊ × ℕ where ℕ = {0, 1, 2, …} and with ρ(s, 0; α) = 1 and α ∈ ℜ^q is an unknown q-dimensional parameter; and ψ(·) is a known nonnegative link function on ℜ and with β ∈ ℜ^p an unknown p-dimensional regression parameter. The process 𝔈 is called the effective age process. We shall assume that τ ~ G(·), where G(·) is some distribution function which does not involve (λ₀(·), α, β), hence it is considered a nuisance parameter. The regressor X is a vector-valued bounded and predictable process whose probabilistic structure may also contain some unknown nuisance parameters. A technical condition that we will assume (see the paper [11]) is that the counting process N^† is non-explosive over [0, s^*], that is, P {N^†(s^*) < ∞} = 1. This condition necessarily imposes a constraint on the form of the function ρ(·, ·; ·) and the model parameters.

The model parameter of main interest is

θ = (Λ_{0} (\cdot), α, β) \in Θ \equiv C \times R^{q} \times R^{p}

(3)

where 𝒞 is some class of cumulative hazard functions on ℜ₊, which will typically be a nonparametric class. Thus, θ will be a semiparametric parameter. Defining the process M^† = {M^†(s; θ) : s ∈ [0, s^*]} with M^†(s; θ) = N^†(s) − A^†(s; θ) and where

A^{†} (s; θ) = \int_{0}^{s} Y^{†} (v) λ_{0} [E (v)] ρ [v, N^{†} (v -); α] ψ [X (v) β] d v,

(4)

the model is tantamount to the condition that M^† is a zero-mean square-integrable 𝔉-martingale. The model specified in (1) and (2) is a slightly more general version of those in [12] and [14] since we allow the ρ-function to also directly depend on s aside from N^†(s−). For more background about this class of models and many specific models subsumed by this class of models, see [12, 14]. This general class of models includes as special cases models that have been considered in the biostatistics and reliability settings. To mention two specific models, if ℰ(s) = s − S_{N^†(_s−)} with 0 = S₀ < S₁ < S₂ < … being the times of successive event occurrences, so ℰ(·) represents the backward recurrence time function, the model coincides with resetting the age of the unit to zero after each event occurrence, which is referred to in the reliability literature as a perfect repair; while if we have ℰ(s) = s, then we say that a minimal repair is performed after each event occurrence. If the latter specification is further coupled with ρ(v, k; α) = 1, then we recover the Andersen-Gill multiplicative intensity model [4]; also the Cox proportional hazards (PH) model [7] when ψ(v) = exp(v).

We suppose there are n units in the study, so we will observe n independent and identically distributed (IID) copies 𝔇_n ≡ 𝔇 = (D₁,D₂, …,D_n) of the basic observable D = (N^†,Y^†, ℰ,X). We denote by 𝒟 the sample space of D, so that the sample space for 𝔇 is 𝒟ⁿ. A larger filtration on (Ω, ℱ, P) is formed from the n unit filtrations according to

F = ⋁_{i = 1}^{n} F_{i} = σ (\cup_{i = 1}^{n} F_{i}) .

Inference on the model parameter θ = (Λ₀(·), α, β), or relevant functionals of θ, are to be based on the realization of 𝔇_n. Properties of the inferential procedures are to be examined when n → ∞.

We shall use functional notation in the sequel. Thus, for a possibly vector-valued function g defined on 𝒟, Pg will represent the theoretical expectation of g(D), while ℙg ≡ ℙ_ng will represent the empirical expectation of g given 𝔇_n. That is, Pg = ∫ g(d)P(dd) and $ℙ g = \frac{1}{n} \sum_{i = 1}^{n} g (D_{i})$ . The theoretical and empirical covariances of g are defined, respectively, via Vg = P(g − Pg)^⊗2 and 𝕍g = ℙ(g − ℙg)^⊗2, where, for a column vector a, we write a^⊗0 = 1, a^⊗1 = a, and a^⊗2 = aa^T.

3 Semiparametric Estimators

3.1 Doubly-Indexed Processes

The intensity model in (4) has the distinctive feature that the baseline hazard rate λ₀(·) is evaluated at time s at the effective age ℰ(s). Since of interest is to infer about λ₀(·) or Λ₀(·), we need to de-couple λ₀(·) from ℰ(·). As demonstrated in [17, 13, 14] such de-coupling is facilitated through the use of doubly-indexed processes.

Let t^* ∈ (0,∞) be fixed, and define 𝒮 = [0, s^*] and 𝒯 = [0, t^*]. Form ℐ = 𝒮 × 𝒯. For our purpose we define the following ℐ-indexed processes associated with the (N^†, Y^†, ℰ,X) processes for one unit: Z = {Z(s, t) : (s, t) ∈ ℐ}, N = {N(s, t) : (s, t) ∈ ℐ}, A = {A(s, t; θ) : (s, t) ∈ ℐ}, and M = {M(s, t; θ) : (s, t) ∈ ℐ}, where

\begin{matrix} Z (s, t) = I {E (s) \leq t}; \\ N (s, t) = \int_{0}^{s} Z (v, t) N^{†} (d v); \\ A (s, t; θ) = \int_{0}^{s} Z (v, t) A^{†} (d v; θ); \\ M (s, t; θ) = N (s, t) - A (s, t; θ) = \int_{0}^{s} Z (v, t) M^{†} (d v; θ) . \end{matrix}

As an interpretation, note that N(s, t) is the number of occurrences of the recurrent event over the period [0, s] and for which the effective ages on these occurrences are at most t. We introduce the following notation: For a finite subset T ⊂ 𝒯, N(·,T) ≡ (N(·, t) : t ∈ T) and similarly for the other processes.

Proposition 1

Let T ⊂ 𝒯 be a finite set. Then {M(s,T; θ) : s ∈ 𝒮} is a |T|-dimensional zero-mean square-integrable martingale with predictable quadratic covariation process

〈 M (\cdot, T; θ) 〉 (s) = [{(A (s, min (t_{1}, t_{2}); θ))}_{t_{1}, t_{2} \in T}], s \in S .

Consequently, PN(s,T) = PA(s,T; θ) and VM(s,T; θ) = P〈M(·,T; θ)〉(s).

Proof

Follows from the boundedness and predictability of s ↦ Z(s,T), the fact that Z(s, t₁)Z(s, t₂) = Z(s,min(t₁, t₂)), by stochastic integration theory, and since $M (s, T; θ) = \int_{0}^{s} Z (v, T) M^{†} (d v; θ)$ .

Let s ∈ 𝒮 and denote by

0 \equiv S_{0} < S_{1} < S_{2} < \dots < S_{N^{†} (s -)} < S_{N^{†} (s -) + 1} \equiv min (s, τ)

the N^†(s−) successive event occurrence times for the unit. Define the (random) functions ℰ_j: 𝒮 → ℜ via

E_{j} (v) = E (v) I_{(S_{j - 1}, S_{j}]} (v)

for j = 1, 2, …,N^†(s−)+1. By condition, on (S_j₋₁, S_j), ℰ_j(·) is nondecreasing and differentiable. We denote by $E_{j}^{- 1} (\cdot)$ its inverse function and by $E_{j}^{'} (\cdot)$ its derivative. Define the (random) functions φ_j: 𝒮 → ℜ according to

φ_{j} (v; α, β) = \frac{ρ (v, j - 1; α) ψ [X (v) β]}{E_{j}^{'} (v)} I_{(S_{j - 1}, S_{j}]} (v),

for j = 1, 2, …,N^†(s−) + 1. Next, we define the doubly-indexed process Y = {Y (s, t; α, β) : (s, t) ∈ ℐ} according to

Y (s, t; α, β) = \sum_{j = 1}^{N^{†} (s -) + 1} φ_{j} [E_{j}^{- 1} (t); α, β] I_{(E_{j} (S_{j - 1}), E_{j} (S_{j})]} (t) .

(5)

This is a generalized at-risk process. The importance of these doubly-indexed processes arise from the representation of the A-process in Proposition 2, which de-couples the effective age process ℰ(·) from the baseline hazard function Λ₀(·), and the change-of-variable identity in Proposition 3. Restricted forms of these results were used in the IID recurrent event model considered in [15, 13].

Proposition 2

For (s, t) ∈ ℐ, $A (s, t; θ) = \int_{0}^{t} Y (s, w; α, β) Λ_{0} (d w)$ .

Proof

Partition the region of integration (0, s] into the disjoint union $(0, s] = \cup_{j = 1}^{N^{†} (s -) + 1} (S_{j - 1}, S_{j}]$ ; do a variable transformation on each region; manipulate; and then simplify.

Proposition 3

Let {H(s, t) : (s, t) ∈ ℐ} be a bounded vector-valued process such that for each t, s ↦ H(s, t) is predictable. For (s, t) ∈ ℐ, we have

\int_{0}^{s} H (s, E (v)) M (d v, t; θ) = \int_{0}^{t} H (s, w) M (s, d w; θ) .

Proof

Start with the left-hand side, write the M process into its N^† and A^† components, then perform the same manipulations as in the proof of Proposition 2.

3.2 Estimation of Λ₀

Propositions 1 and 2 now combine to suggest the stochastic differential equation, for an observable D,

N (s^{*}, d t) = Y (s^{*}, t; α, β) Λ_{0} (d t) + M (s^{*}, d t; θ) .

When data 𝔇_n is available from n units, we therefore obtain the differential form

ℙ N (s^{*}, d t) = {ℙ Y (s^{*}, t; α, β)} Λ_{0} (d t) + ℙ M (s^{*}, d t; θ) .

(6)

Define

S^{(0)} (s, t; α, β) = ℙ Y (s, t; α, β) \equiv \frac{1}{n} \sum_{i = 1}^{n} Y_{i} (s, t; α, β)

(7)

and J(s, t; α, β) = I{S⁽⁰⁾(s, t; α, β) > 0}. With the convention that 0/0 = 0, we obtain from (6) the stochastic integral identity

\int_{0}^{t} \frac{J (s^{*}, w; α, β)}{S^{(0)} (s^{*}, w; α, β)} ℙ N (s^{*}, d w) = \int_{0}^{t} J (s^{*}, w; α, β) Λ_{0} (d w) + \int_{0}^{t} \frac{J (s^{*}, w; α, β)}{S^{(0)} (s^{*}, w; α, β)} ℙ M (s^{*}, d w; θ) .

(8)

Let us consider the last term in (8). We have

\begin{array}{l} \int_{0}^{t} \frac{J (s^{*}, w; α, β)}{S^{(0)} (s^{*}, w; α, β)} ℙ M (s^{*}, d w; θ) = \frac{1}{n} \sum_{i = 1}^{n} \int_{0}^{t} \frac{J (s^{*}, w; α, β)}{S^{(0)} (s^{*}, w; α, β)} M_{i} (s^{*}, d w; θ) \\ = \frac{1}{n} \sum_{i = 1}^{n} \int_{0}^{s^{*}} \frac{J (s^{*}, E_{i} (v); α, β)}{S^{(0)} (s^{*}, E_{i} (v); α, β)} M_{i} (d v, t; θ) \end{array}

(9)

where the last equality is obtained by invoking Proposition 3. The integrand in each summand in (9) is bounded and predictable, so it follows from stochastic integration theory that, for i = 1, 2, …, n,

P \int_{0}^{s^{*}} \frac{J (s^{*}, E_{i} (v); α, β)}{S^{(0)} (s^{*}, E_{i} (v); α, β)} M_{i} (d v, t; θ) = 0.

(10)

It therefore follows from (8) and (10) that

P \int_{0}^{t} \frac{J (s^{*}, w; α, β)}{S^{(0)} (s^{*}, w; α, β)} ℙ N (s^{*}, d w) = P \int_{0}^{t} J (s^{*}, w; α, β) Λ_{0} (d w) .

Analogously to Aalen’s idea [1], if for the moment we assume that (α, β) is known, we may propose a method-of-moments estimator for Λ₀(·) given by

{\tilde{Λ}}_{0} (t; α, β) = \int_{0}^{t} \frac{J (s^{*}, w; α, β)}{S^{(0)} (s^{*}, w; α, β)} ℙ N (s^{*}, d w) = \int_{0}^{t} \frac{ℙ N (s^{*}, d w)}{S^{(0)} (s^{*}, w; α, β)} .

(11)

However, (α, β) is not known, hence Λ̃₀ is not an estimator. We now therefore find an estimator for (α, β), which will then be plugged-in (11) to obtain a legitimate estimator of Λ₀.

3.3 Estimator of (α, β)

For the purpose of estimating (α, β), we form a generalized likelihood process, based on 𝔇_n, denoted by L = {L(s, t; θ) : (s, t) ∈ ℐ}. We define

L (s, t; θ) = \prod_{i = 1}^{n} \prod_{v = 0}^{s} {[A_{i} (d v, t; θ)]}^{N_{i} (Δ v, t)} {[1 - A_{i} (d v, t; θ)]}^{1 - N_{i} (Δ v, t)},

with the understanding that when the product operation is over a continuous index, such as v in the second product operation, then it means product-integral; see [10]. By property of the product-integral and re-writing in an expanded form, we have that

L (s, t; θ) = {\prod_{i = 1}^{n} \prod_{v = 0}^{s} {[Z_{i} (v, t) A_{i}^{†} (d v; θ)]}^{Z_{i} (v, t) N_{i}^{†} (Δ v)}} exp {- n ℙ A (s, t; θ)} .

This likelihood process involves the functional parameter Λ₀(·), for which we have an estimator given in (11) if (α, β) is known. We can therefore obtain a profile likelihood for (α, β) by replacing the Λ₀(·) in L(s^*, t^*; θ) by the Λ̃₀(s^*, ·; α, β) in (11). Doing so yields a profile likelihood function given by

L_{P} (s^{*}, t^{*}; α, β) = \prod_{i = 1}^{n} \prod_{v = 0}^{s^{*}} {[\frac{ρ (v, N_{i}^{†} (v -); α) ψ [X_{i} (v) β]}{S^{(0)} (s^{*}, E_{i} (v); α, β)}]}^{N_{i} (Δ v, t^{*})} .

(12)

This function may also be viewed as a generalized partial likelihood function for (α, β) being very much reminiscent of the Cox partial likelihood function; see [7, 8, 4, 9, 3, 2]. From this partial likelihood function we obtain its maximizer as our estimator of (α, β), that is,

(\hat{α}, \hat{β}) \equiv (\hat{α} (s^{*}, t^{*}), \hat{β} (s^{*}, t^{*})) = arg max_{(α, β) \in R^{q} \times R^{p}} L_{P} (s^{*}, t^{*}; α, β) .

(13)

Numerical methods, such as the Newton-Raphson algorithm, are needed to obtain the values of (α̂, β̂), as has been done in [14].

Having obtained an estimator of (α, β), now replace (α, β) in Λ̃₀(s^*, t; α, β) to obtain an estimator of Λ₀(·). This resulting estimator of Λ₀(·) is given by

{\hat{Λ}}_{0} (s^{*}, t) = {\tilde{Λ}}_{0} (s^{*}, t; \hat{α}, \hat{β}) = \int_{0}^{t} \frac{ℙ N (s^{*}, d w)}{S^{(0)} (s^{*}, w; \hat{α}, \hat{β})}, t \in T

(14)

Observe that the form of this estimator is analogous to the estimator of the baseline hazard function in the Cox PH model [7, 6, 4], hence it seems appropriate to refer to this as a generalized Aalen-Breslow-Nelson (ABN) estimator.

Denoting by F₀ the distribution function associated with the baseline hazard function Λ₀, then dictated by the product-integral representation of F₀ by Λ₀, we are able to obtain a product-limit type estimator of the survivor function F̄₀(t) = 1 − F₀(t) given by

{\hat{\bar{F}}}_{0} (s^{*}, t) = \prod_{w = 0}^{t} [1 - \frac{ℙ N (s^{*} d w)}{S^{(0)} (s^{*}, w; \hat{α}, \hat{β})}], t \in T .

(15)

Small to moderate sample size properties of the estimators presented above were examined through simulation studies in [14] for specific forms of the effective age process ℰ, for a function ρ which was made to depend on s only through N^†(s−), and for an exponential link function ψ. Applications of these estimators to some real data sets were also presented in that paper. However, general asymptotic properties of these estimators are still unavailable, and establishing the large-sample properties of these semiparametric estimators is the raison d’être of the current paper.

4 Preliminaries for Asymptotics

For studying the large-sample properties of our semiparametric estimators, it is first convenient to deal with the model where A^† in (4) is of form

A^{†} (s; η) = \int_{0}^{s} Y^{†} (v) λ_{0} [E (v)] κ (v; η) d v .

(16)

Here κ = {κ(s; η) : s ∈ 𝒮} is a bounded and predictable process and η ∈ Γ with Γ an open subset of ℜ^k. We assume that η ↦ κ(s; η) is twice-differentiable and we let

\dot{κ} (s; η) = \nabla_{η} κ (s; η) and \ddot{κ} (s; η) = \nabla_{η η^{T}} κ (s; η) .

Later to obtain the specific results for the model in (4), we then simply identify η with (α, β) and with

κ (s; η) = ρ (s, N^{†} (s -); α) ψ [X (s) β] .

With the above simplification, for one unit monitored over 𝒮 = [0, s^*], we will then define

\begin{array}{l} φ_{j} (v; η) & = & \frac{κ (v; η)}{E^{'} (v)} I_{(S_{j - 1}, S_{j}]} (v), j = 1, 2, \dots, N^{†} (s^{*} -) + 1; \\ Y (s^{*}, t; η) & = & \sum_{j = 1}^{N^{†} (s^{*} -) + 1} φ_{j} [E_{j}^{- 1} (t); η] I_{(E (S_{j - 1}), E (S_{j})]} (t), \end{array}

so that with n units, we will then have

S^{(0)} (s^{*}, t; η) = ℙ Y (s^{*}, t; η) = \frac{1}{n} \sum_{i = 1}^{n} Y_{i} (s^{*}, t; η)

where in this last function the κ functions may also depend on i.

We denote by (η⁰, $Λ_{0}^{0}$ ) the true parameter vector, and to simplify notation, we suppress writing these true parameter vector in our functions if no confusion could arise. Thus, $A_{i} (s^{*}, t) \equiv A_{i} (s^{*}, t; η^{0}, Λ_{0}^{0})$ , Y_i(s^*, t) ≡ Y_i(s^*, t; η⁰), and $M_{i} (s^{*}, t) \equiv M_{i} (s^{*}, t; η^{0}, Λ_{0}^{0})$ .

In establishing consistency and weak convergence properties of the estimators, we will need a general weak convergence result of processes formed as stochastic integrals of the processes M_i(s^*, t), i = 1, 2, …, n, which we recall are martingales with respect to s^* but not with respect to t.

Given an n and an (s^*, t, η), let us define a random discrete probability measure ℚ_n(·; s^*, t, η) on the (random) set

K_{n} (s^{*}) = {(i, j) : j = 1, 2, \dots, N_{i}^{†} (s^{*} -) + 1; i = 1, 2, \dots, n}

according to the probabilities

ℚ_{n} ((i, j); s^{*}, t, η) = \frac{1}{n} {\frac{Y_{i} (s^{*}, t; η)}{S^{(0)} (s^{*}, t; η)}} {\frac{φ_{i j} [E_{i j}^{- 1} (t); η]}{Y_{i} (s^{*}, t; η)} I_{(E_{i} (S_{i j - 1}), E_{i} (S_{i j})]} (t)} .

For a function g : 𝒦_n(s^*) → ℜ^r, which could be random and also depending on (s^*, t, η),

E_{ℚ_{n} (s^{*}, t, η)} g \equiv ℚ_{n} (s^{*}, t, η) g

will denote its expectation with respect to the p.m. ℚ_n and

\begin{array}{l} V_{ℚ_{n} (s^{*}, t, η)} g \equiv ℚ_{n} (s^{*}, t, η) {[g - ℚ_{n} (s^{*}, t, η) g]}^{\otimes 2} \\ = ℚ_{n} (s^{*}, t, η) g^{\otimes 2} - {[ℚ_{n} (s^{*}, t, η) g]}^{\otimes 2} \end{array}

will denote its variance-covariance matrix with respect to ℚ_n.

Let us also define

ℚ_{n} (i; s^{*}, t, η) = \frac{1}{n} {\frac{Y_{i} (s^{*}, t; η)}{S^{(0)} (s^{*}, t; η)}} = \frac{Y_{i} (s^{*}, t; η)}{\sum_{l = 1}^{n} Y_{l} (s^{*}, t; η)}, i = 1, 2, \dots, n .

Thus, when the function g : 𝒦_n(s^*) → ℜ^r is such that g(i, j) = g^*(i) for some g^*, then

ℚ_{n} (s^{*}, t, η) g = \sum_{i = 1}^{n} g^{*} (i) ℚ_{n} (i; s^{*}, t, η) = \sum_{i = 1}^{n} g^{*} (i) [\frac{Y_{i} (s^{*}, t; η)}{\sum_{l = 1}^{n} Y_{l} (s^{*}, t; η)}] .

In this case, the variance-covariance matrix of g with respect to ℚ_n is also in more simplified form.

Theorem 1

Let { $H_{i}^{(n)} (s, t) : (s, t) \in ℐ = [0, s^{*}] \times [0, t^{*}]$ } for i = 1, 2, …, n; n = 1, 2, … be a triangular array of vector processes, and assume the following conditions:

∀i, $H_{i}^{(n)}$ is bounded and ∀v ∈ [0, s], $H_{i}^{(n)} (s, E_{i} (v))$ is 𝔽-predictable;
There exists a deterministic function s⁽⁰⁾ : ℐ → ℜ₊ such that
$∣ S^{(0)} (s^{*}, t) - s^{(0)} (s^{*}, t) ∣ \overset{u p}{\to} 0$

and inf_t_∈𝒯 s⁽⁰⁾(s^*, t) > 0; and
There exists a deterministic matrix function v : ℐ → ℜ₊ such that
$‖ ℚ_{n} (s^{*}, w) {{[H^{(n)} (s^{*}, w)]}^{\otimes 2}} - v (s^{*}, w) ‖ \overset{u p}{\to} 0,$

and for every t ∈ (0, t^*],
$\sum (s^{*}, t) = \int_{0}^{t} v (s^{*}, w) s^{(0)} (s^{*}, w) Λ_{0}^{0} (d w)$

is positive definite.

Defining the stochastic integrals, for n = 1, 2, …,

W^{(n)} (s^{*}, t) = \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} \int_{0}^{t} H_{i}^{(n)} (s^{*}, w) M_{i} (s^{*}, d w),

then {W⁽ⁿ⁾(s^*, t) : t ∈ 𝒯} converges weakly on Skorokhod’s space D[0, t^*] to a zero-mean Gaussian process {W^(∞)(s^*, t) : t ∈ 𝒯} whose covariance function is

Cov {W^{(\infty)} (s^{*}, t_{1}), W^{(\infty)} (s^{*}, t_{2})} = \sum (s^{*}, min (t_{1}, t_{2})) .

Proof

The proof of this result is analogous to the proof of the general theorem in [15].

5 Consistency Properties

In this section we will establish the consistency of the sequence of estimators η̂_n and Λ̂_n(s^*, ·) as the number of units n increases to infinity. We shall assume the following set of “regularity conditions.” These are somewhat analogous conditions to those in [4] and [5].

(C1)
For each (s, t) ∈ ℐ, η ↦ κ(s, t; η) is twice-continuously differentiable with
$\dot{κ} (s, t; η) = \nabla_{η} κ (s, t; η) and \ddot{κ} (s, t; η) = \nabla_{η η^{T}} κ (s, t; η) .$

Furthermore, the operations of differentiation (with respect to η) and integration could be interchanged.
(C2)
There exists a deterministic function s⁽⁰⁾ : ℐ × Γ → ℜ₊ such that
$sup_{t \in T; η \in Γ} ∣ S^{(0)} (s^{*}, t; η) - s^{(0)} (s^{*}, t; η) ∣ \overset{p}{\to} 0,$

and with inf_t_∈𝒯 s⁽⁰⁾(s^*, t; η) > 0 and with $Λ_{0}^{0} (t^{*}) = \int_{0}^{t^{*}} λ_{0}^{0} (w) d w < \infty$ .
(C3)
There exist deterministic functions s⁽¹⁾ : ℐ × Γ² → ℜ^k and s⁽²⁾ : ℐ × Γ² → (ℜ^k)^⊗2 such that with
$\begin{array}{l} Q_{n}^{(1)} (s^{*}, t; η_{1}, η_{2}) & = & ℚ_{n} (s^{*}, t; η_{1}) [\frac{\dot{κ}}{κ} (E^{- 1} (t); η_{2})]; \\ Q_{n}^{(2)} (s^{*}, t; η_{1}, η_{2}) & = & ℚ_{n} (s^{*}, t; η_{1}) [\frac{\ddot{κ}}{κ} (E^{- 1} (t); η_{2})], \end{array}$

and
$\begin{array}{l} q^{(1)} (s^{*}, t; η_{1}, η_{2}) & = & \frac{s^{(1)}}{s^{(0)}} (s^{*}, t; η_{1}, η_{2}); \\ q^{(2)} (s^{*}, t; η_{1}, η_{2}) & = & \frac{s^{(2)}}{s^{(0)}} (s^{*}, t; η_{1}, η_{2}), \end{array}$

we have
$\begin{array}{l} {sup}_{t \in T; (η_{1}, η_{2}) \in Γ^{2}} ‖ Q_{n}^{(1)} (s^{*}, t; η_{1}, η_{2}) - q^{(1)} (s^{*}, t; η_{1}, η_{2}) ‖ \overset{p}{\to} 0; \\ {sup}_{t \in T; (η_{1}, η_{2}) \in Γ^{2}} ‖ Q_{n}^{(2)} (s^{*}, t; η_{1}, η_{2}) - q^{(2)} (s^{*}, t; η_{1}, η_{2}) ‖ \overset{p}{\to} 0. \end{array}$
(C4)
With v(s^*, t) satisfying
$sup_{t \in T} ‖ V_{ℚ_{n} (s^{*}, t)} [\frac{\dot{κ}}{κ} (E^{- 1} (t))] - v (s^{*}, t) ‖ \overset{p r}{\to} 0,$

the matrix
$\sum (s^{*}, t) = \int_{0}^{t} v (s^{*}, w) s^{(0)} (s^{*}, w) Λ_{0}^{0} (d w)$

is positive definite for each t ∈ (0, t^*].
(C5)
For each s ∈ [0, s^*], the mappings
$\begin{array}{l} (v, η) & \mapsto & \frac{\dot{κ}}{κ} (v; η) - Q_{n}^{(1)} (s, E (v); η, η); \\ (v, η) & \mapsto & \frac{\ddot{κ}}{κ} (v; η) - Q_{n}^{(2)} (s, E (v); η, η), \end{array}$

are bounded and 𝔉_s₋-measurable for each v ∈ [0, s].

We first establish an intermediate result.

Lemma 1

For w ∈ 𝒯 and η ∈ Γ, we have

\begin{array}{l} \frac{{\dot{S}}^{(0)}}{S^{(0)}} (s^{*}, w; η) & = & Q_{n}^{(1)} (s^{*}, w; η, η); \\ \frac{{\ddot{S}}^{(0)}}{S^{(0)}} (s^{*}, w; η) & = & Q_{n}^{(2)} (s^{*}, w; η, η) . \end{array}

Proof

The proofs are straightforward and hence omitted.

For notational brevity, let us define

\begin{matrix} Ψ_{n} (s^{*}, t^{*}; η) = \nabla_{η} {\frac{1}{n} l_{P} (s^{*}, t^{*}; η)}; \\ Ψ (s^{*}, t^{*}; η) = \int_{0}^{t^{*}} [q^{(1)} (s^{*}, w; η^{0}, η) - q^{(1)} (s^{*}, w; η, η)] s^{(0)} (s^{*}, w) Λ_{0}^{0} (d w), \end{matrix}

where l_P (s^*, t^*; η) = log L_P (s^*, t^*; η) is the logarithm of the partial likelihood function. We are now in position to state a result concerning the consistency of the partial MLE of η. Without loss of generality, we shall assume that the maximizer of the partial likelihood can be obtained as a zero of η ↦ Ψ_n(s^*, t^*; η).

Theorem 2

If η̂_n is such that Ψ_n(s^*, t^*; η̂_n) = 0 and if, for every ε > 0, we have that

inf_{{η : ‖ η - η^{0} ‖ \geq ε}} ‖ Ψ (s^{*}, t^{*}; η) ‖ > 0,

then, under the regularity conditions (C1)–(C5), ${\hat{η}}_{n} \overset{p}{\to} η^{0}$ .

Proof

From (12), (C1), and Lemma 1, we have

\begin{array}{l} Ψ_{n} (s^{*}, t^{*}; η) = ℙ_{n} \int_{0}^{s^{*}} [\frac{\dot{κ}}{κ} (v; η) - Q_{n}^{(1)} (s^{*}, E (v); η, η)] N (d v, t^{*}) \\ = ℙ_{n} \int_{0}^{s^{*}} [\frac{\dot{κ}}{κ} (v; η) - Q_{n}^{(1)} (s^{*}, E (v); η, η)] M (d v, t^{*}) + \end{array}

(17)

ℙ_{n} \int_{0}^{s^{*}} [\frac{\dot{κ}}{κ} (v; η) - Q_{n}^{(1)} (s^{*}, E (v); η, η)] A (d v, t^{*}) .

(18)

By (C5) and Theorem 1, the term in (17) is o_p(1). On the other hand, the term in (18) becomes, after splitting the region of integration into the disjoint intervals (S_j₋₁, S_j] for j = 1, 2, …, N^†(s^*−) + 1 and then doing a variable transformation,

\begin{array}{l} Term (18) = \int_{0}^{t^{*}} ℙ_{n} {\sum_{j = 1}^{N^{†} (s^{*} -) + 1} [\frac{\dot{κ}}{κ} (E_{j}^{- 1} (w); η) - Q_{n}^{(1)} (s^{*}, w; η, η)] \times φ_{j} [E_{j}^{- 1} (w); η] I_{(E (S_{j - 1}), E (S_{j})]} (w)} Λ_{0}^{0} (d w) \\ = \int_{0}^{t^{*}} S^{(0)} (s^{*}, w) [Q_{n}^{(1)} (s^{*}, w; η^{0}, η) - Q_{n}^{(1)} (s^{*}, w; η, η)] Λ_{0}^{0} (d w) . \end{array}

By conditions (C2) and (C3), this last term will converge uniformly in probability to Ψ(s^*, t^*; η), so that we will have the result

sup_{η \in Γ} ‖ Ψ_{n} (s^{*}, t^{*}; η) - Ψ (s^{*}, t^{*}; η) ‖ \overset{p}{\to} 0.

(19)

Finally, observe that Ψ(s^*, t^*; η⁰) = 0, so by the condition of the theorem and coupling with (19), it follows from Theorem 5.9 of van der Vaart [18]) that ${\hat{η}}_{n} \overset{p}{\to} η^{0}$ .

Indeed, there is more to be said based on the following Lemma 2 which will also be used in the weak convergence result proof in Section 6. Since Σ(s^*, t^*) is positive definite, this lemma implies that, in fact, η⁰ is a maximizer of the limit in probability of the log-partial likelihood [l_P (s^*, t^*; η) − l_P (s^*, t^*)]/n.

Lemma 2

Under conditions (C1)–(C5),

\begin{array}{l} {\dot{Ψ}}_{n} (s^{*}, t^{*}) \equiv \nabla_{η η^{T}} {\frac{1}{n} l_{P} (s^{*}, t^{*}; η)} ∣_{η = η^{0}} \\ = - \int_{0}^{t^{*}} V_{ℚ_{n} (s^{*}, w)} [\frac{\dot{κ}}{κ} (E^{- 1} (w))] S^{(0)} (s^{*}, w) Λ_{0}^{0} (d w) + o_{p} (1) \\ \overset{p}{\to} - \sum (s^{*}, t^{*}) . \end{array}

Proof

Straightforward, though tedious, calculations show that

\begin{array}{l} {\dot{Ψ}}_{n} (s^{*}, t^{*}; η) = ℙ_{n} \int_{0}^{s^{*}} [\frac{\ddot{κ}}{κ} (v; η) - \frac{{\ddot{S}}^{(0)}}{S^{(0)}} (s^{*}, E (v); η)] N (d v, t^{*}) - \\ ℙ_{n} \int_{0}^{s^{*}} {{[\frac{\dot{κ}}{κ} (v; η)]}^{\otimes 2} - {[\frac{{\dot{S}}^{(0)}}{S^{(0)}} (s^{*}, E (v); η)]}^{\otimes 2}} N (d v, t^{*}) \\ = \int_{0}^{t^{*}} {Q_{n}^{(2)} (s^{*}, w; η^{0}, η) - Q_{n}^{(2)} (s^{*}, w; η, η)} S^{(0)} (s^{*}, w; η^{0}) Λ_{0}^{0} (d w) - \\ \int_{0}^{t^{*}} {ℚ_{n} (s^{*}, w; η^{0}) {[\frac{\dot{κ}}{κ} (E^{- 1} (w); η)]}^{\otimes 2} - {[Q_{n}^{(1)} (s^{*}, w; η, η)]}^{\otimes 2}} \times \\ S^{(0)} (s^{*}, w; η^{0}) Λ_{0}^{0} (d w) + o_{p} (1) . \end{array}

Evaluating at η = η⁰, and noting that

ℚ_{n} (s^{*}, w; η^{0}) [\frac{\dot{κ}}{κ} (E^{- 1} (w); η^{0})] = Q_{n}^{(1)} (s^{*}, w; η^{0}, η^{0})

then yields the representation given in the statement of the lemma. Letting n → ∞, the limiting matrix is −Σ(s^*, t^*).

Theorem 3

Under conditions (C1)–(C5), Λ̂₀_n(s^*, ·) converges uniformly in probability to $Λ_{0}^{0} (\cdot)$ on [0, t^*], that is,

sup_{t \in [0, t^{*}]} | {\hat{Λ}}_{0 n} (s^{*}, t) - Λ_{0}^{0} (t) | \overset{p}{\to} 0.

Proof

With $Λ_{0}^{*} (s^{*}, t) = \int_{0}^{t} I {S^{(0)} (s^{*}, w; \hat{η}) > 0} Λ_{0}^{0} (d w)$ , we have that

\begin{array}{l} ∣ {\hat{Λ}}_{0} (s^{*}, t) - Λ_{0}^{0} (t) ∣ \leq ∣ {\hat{Λ}}_{0} (s^{*}, t) - Λ_{0}^{*} (s^{*}, t) ∣ + ∣ Λ_{0}^{*} (s^{*}, t) - Λ_{0}^{0} (t) ∣ \\ \leq | {\hat{Λ}}_{0} (s^{*}, t) - \int_{0}^{t} \frac{I {S^{(0)} (s^{*}, w) > 0}}{S^{(0)} (s^{*}, w)} ℙ N (s^{*}, d w) | + \end{array}

(20)

| \int_{0}^{t} \frac{I {S^{(0)} (s^{*}, w) > 0}}{S^{(0)} (s^{*}, w)} ℙ M (s^{*}, d w) | +

(21)

| \int_{0}^{t} I {S^{(0)} (s^{*}, w; \hat{η}) = 0} Λ_{0}^{0} (d w) | .

(22)

Term (22) is bounded above by

| \int_{0}^{t^{*}} I {S^{(0)} (s^{*}, w; \hat{η}) = 0} Λ_{0}^{0} (d w) |,

which is o_p(1) since $S^{(0)} (s^{*}, w; \hat{η}) \overset{p}{\to} s^{(0)} (s^{*}, w)$ and by (C2) we have $Λ_{0}^{0} (t^{*}) < \infty$ and inf_w∈[0,t^*] s⁽⁰⁾(s^*,w) > 0. Term (20) is bounded above by

{sup_{w \in [0, t^{*}]} | \frac{I {S^{(0)} (s^{*}, w; \hat{η}) > 0}}{S^{(0)} (s^{*}, w; \hat{η})} - \frac{I {S^{(0)} (s^{*}, w) > 0}}{S^{(0)} (s^{*}, w)} |} ℙ N (s^{*}, t^{*}) .

But ℙN(s^*, t^*) = ℙM(s^*, t^*)+ℙA(s^*, t^*). By Theorem 1, ℙM(s^*, t^*) = o_p(1), while $ℙ A (s^{*}, t^{*}) = \int_{0}^{t^{*}} S^{(0)} (s^{*}, w) Λ_{0}^{0} (d w)$ , which converges in probability to $\int_{0}^{t^{*}} s^{(0)} (s^{*}, w) Λ_{0}^{0} (d w)$ , a finite quantity by (C2). Thus, ℙN(s^*, t^*) = O_p(1). Since

sup_{w \in [0, t^{*}]} | \frac{I {S^{(0)} (s^{*}, w; \hat{η}) > 0}}{S^{(0)} (s^{*}, w; \hat{η})} - \frac{I {S^{(0)} (s^{*}, w) > 0}}{S^{(0)} (s^{*}, w)} | = o_{p} (1)

it therefore follows that term (20) is o_p(1). Finally, by Theorem 1, we have that the process

{\frac{1}{\sqrt{n}} \sum_{i = 1}^{n} \int_{0}^{t} \frac{I {S^{(0)} (s^{*}, w) > 0}}{S^{(0)} (s^{*}, w)} M_{i} (s^{*}, d w) : t \in [0, t^{*}]}

converges weakly to a zero-mean Gaussian process G̃ whose covariance function is

Cov (\tilde{G} (t_{1}), \tilde{G} (t_{2})) = \int_{0}^{min (t_{1}, t_{2})} \frac{Λ_{0}^{0} (d w)}{s^{(0)} (s^{*}, w)}

for t₁, t₂ ∈ [0, t^*]. As a consequence,

sup_{t \in [0, t^{*}]} | \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} \int_{0}^{t} \frac{I {S^{(0)} (s^{*}, w) > 0}}{S^{(0)} (s^{*}, w)} M_{i} (s^{*}, d w) |

converges weakly to sup_t∈[0,t^*] |G̃(t)|, which is O_p(1). It follows that

\begin{array}{l} sup_{t \in [0, t^{*}]} | \int_{0}^{t} \frac{I {S^{(0)} (s^{*}, w) > 0}}{S^{(0)} (s^{*}, w)} ℙ M (s^{*}, d w) | = \frac{1}{\sqrt{n}} sup_{t \in [0, t^{*}]} | \frac{1}{\sqrt{n}} \sum_{i = 1}^{n} \int_{0}^{t} \frac{I {S^{(0)} (s^{*}, w) > 0}}{S^{(0)} (s^{*}, w)} M_{i} (s^{*}, d w) | \\ = o_{p} (1) . \end{array}

This completes the proof of the theorem.

6 Distributional Properties

In this section we establish the limiting distributional properties of { $\sqrt{n} [{\hat{η}}_{n} - η^{0}]$ , n = 1, 2, …} and {W_n(s^*, t): t ∈ 𝒯; n = 1, 2, …}, where

W_{n} (s^{*}, t) = \sqrt{n} [{\hat{Λ}}_{0}^{(n)} (s^{*}, t) - Λ_{0}^{0} (t)] .

Define the process {B_n(s^*, t): t ∈ 𝒯; n = 1, 2, …} according to

B_{n} (s^{*}, t) = \int_{0}^{t} I {S^{(0)} (s^{*}, w) > 0} \frac{{\dot{S}}^{(0)} (s^{*}, w)}{{[{\dot{S}}^{(0)} (s^{*}, w)]}^{2}} ℙ_{n} N (s^{*}, d w) .

Let us also define the process {V_n(s^*, t): t ∈ 𝒯; n = 1, 2, …} via

V_{n} (s^{*}, t) = \sqrt{n} [{\hat{Λ}}_{0}^{(n)} (s^{*}, t) - Λ_{0}^{0} (t)] + \sqrt{n} {({\hat{η}}_{n} - η^{0})}^{T} B_{n} (s^{*}, t) .

Furthermore, we shall assume that η̂_n solves the equation

U_{P}^{(n)} (s^{*}, t^{*}; η) = 0 with U_{P}^{(n)} (s^{*}, t; η) = \nabla_{η} l_{P} (s^{*}, t; η) .

We now present and prove a result from which the asymptotic properties follow.

Theorem 4

Under conditions (C1)–(C5), we have the representations

\sqrt{n} ({\hat{η}}_{n} - η^{0}) = {[\sum (s^{*}, t^{*})]}^{- 1} \times {\sqrt{n} ℙ_{n} \int_{0}^{t^{*}} [\frac{\dot{κ}}{κ} [E^{- 1} (w)] - \frac{{\dot{S}}^{(0)}}{S^{(0)}} (s^{*}, w)] M (s^{*}, d w)} + o_{p} (1);

(23)

and

V_{n} (s^{*}, t) = \sqrt{n} \int_{0}^{t^{*}} I (w \leq t) \frac{I {S^{(0)} (s^{*}, w) > 0}}{S^{(0)} (s^{*}, w)} ℙ_{n} M (s^{*}, d w) + o_{p} (1) .

(24)

Furthermore, { $\sqrt{n} ({\hat{η}}_{n} - η^{0})$ } and {V_n(s^*, t): t ∈ 𝒯 } are asymptotically independent with each weakly converging to Gaussian limits.

Proof

From the definition of η̂_n, we have by first-order Taylor expansion that

\sqrt{n} ({\hat{η}}_{n} - η^{0}) = {[- {\dot{Ψ}}_{n} (s^{*}, t^{*}; {\tilde{η}}_{n})]}^{- 1} [\sqrt{n} Ψ_{n} (s^{*}, t^{*}; η^{0})]

where η̃_n is in a neighborhood centered at η⁰ and whose radius is ||η̂_n − η⁰||. It is easy to see that

\begin{array}{l} \sqrt{n} Ψ_{n} (s^{*}, t^{*}; η^{0}) = \sqrt{n} ℙ_{n} \int_{0}^{s^{*}} {\frac{\dot{κ}}{κ} (v) - \frac{{\dot{S}}^{(0)}}{S^{(0)}} (s^{*}, E (v))} M (d v, t^{*}) \\ = \sqrt{n} ℙ_{n} \int_{0}^{t^{*}} {\frac{\dot{κ}}{κ} [E^{- 1} (w)] - \frac{{\dot{S}}^{(0)}}{S^{(0)}} (s^{*}, w)} M (s^{*}, d w) . \end{array}

Furthermore, since ${\hat{η}}_{n} \overset{p}{\to} η^{0}$ , and by virtue of Lemma 2, we have that

{[- {\dot{Ψ}}_{n} (s^{*}, t^{*}; {\tilde{η}}_{n})]}^{- 1} = {[\sum (s^{*}, t^{*})]}^{- 1} + o_{p} (1) .

As such we obtain the representation for $\sqrt{n} ({\hat{η}}_{n} - η^{0})$ .

Once again, by first-order Taylor expansion, we have that on the set where S⁽⁰⁾(s^*,w; η̂_n) > 0,

\frac{1}{S^{(0)} (s^{*}, w; {\hat{η}}_{n})} = \frac{1}{S^{(0)} (s^{*}, w; η^{0})} - {({\hat{η}}_{n} - η^{0})}^{T} \frac{{\dot{S}}^{(0)} (s^{*}, w; {\tilde{η}}_{n})}{{[S^{(0)} (s^{*}, w; {\tilde{η}}_{n})]}^{2}}

with η̃_n inside the ball centered at η⁰ with radius ||η̂_n − η⁰||. Defining

Λ_{0}^{*} (s^{*}, t) = \int_{0}^{t} I {S^{(0)} (s^{*}, w; {\hat{η}}_{n}) > 0} Λ_{0}^{0} (d w),

and recalling that

{\hat{Λ}}_{0}^{(n)} (s^{*}, t) = \int_{0}^{t} \frac{I {S^{(0)} (s^{*}, w; {\hat{η}}_{n}) > 0}}{S^{(0)} (s^{*}, w; {\hat{η}}_{n})} ℙ_{n} N (s^{*}, d w),

we obtain

\begin{array}{c} \sqrt{n} [{\hat{Λ}}_{0}^{(n)} (s^{*}, t) - Λ_{0}^{*} (s^{*}, t)] = \int_{0}^{t} \frac{I {S^{(0)} (s^{*}, w; {\hat{η}}_{n}) > 0}}{S^{(0)} (s^{*}, w; {\hat{η}}_{n})} \sqrt{n} ℙ_{n} M (s^{*}, d w) - \\ \sqrt{n} {({\hat{η}}_{n} - η^{0})}^{T} \int_{0}^{t} I {S^{(0)} (s^{*}, w; {\hat{η}}_{n}) > 0} \frac{[{\dot{S}}^{(0)} (s^{*}, w; {\hat{η}}_{n})]}{{[S^{(0)} (s^{*}, w; {\hat{η}}_{n})]}^{2}} ℙ_{n} N (s^{*}, d w) . \end{array}

The representation for V_n(s^*, t) given in the statement of the lemma now follows by noting that

\begin{matrix} {sup}_{0 \leq t \leq t^{*}} ‖ \sqrt{n} [Λ_{0}^{*} (s^{*}, t) - Λ_{0}^{0} (t)] ‖ = o_{p} (1); \\ {sup}_{0 \leq t \leq t^{*}} ‖ S^{(0)} (s^{*}, t; {\hat{η}}_{n}) - S^{(0)} (s^{*}, t; η^{0}) ‖ = o_{p} (1); \\ {sup}_{0 \leq t \leq t^{*}} ‖ {\dot{S}}^{(0)} (s^{*}, t; {\hat{η}}_{n}) - {\dot{S}}^{(0)} (s^{*}, t; η^{0}) ‖ = o_{p} (1) . \end{matrix}

Finally, let t = (t₁, t₂, …, t_p)^T ⊂ 𝒯. From the just-established representations, with I{w ≤ t} = (I{w ≤ t₁}, …, I{w ≤ t_p})^T, we have

[\begin{matrix} \sqrt{n} ({\hat{η}}_{n} - η^{0}) \\ V_{n} (s^{*}, t) \end{matrix}] = [\begin{matrix} \sum {(s^{*}, t^{*})}^{- 1} & 0 \\ 0 & I \end{matrix}] \times \sqrt{n} ℙ_{n} \int_{0}^{t^{*}} [\begin{matrix} \frac{\dot{κ}}{κ} [E^{- 1} (w)] - \frac{{\dot{S}}^{(0)}}{S^{(0)}} (s^{*}, w) \\ I (w \leq t) \frac{I {S^{(0)} (s^{*}, w) > 0}}{S^{(0)} (s^{*}, w)} \end{matrix}] M (s^{*}, d w) + o_{p} (1) .

By the main weak convergence theorem or by invoking theMartingale Central Limit Theorem after a time transformation, this converges weakly to the random vector

[\begin{matrix} W_{1} \\ W_{2} \end{matrix}] = [\begin{matrix} \sum {(s^{*}, t^{*})}^{- 1} & 0 \\ 0 & I \end{matrix}] [\begin{matrix} Z_{1} \\ Z_{2} \end{matrix}]

where (Z₁^T,Z₂^T)^T is a (k + p)-dimensional zero mean multivariate normal random vector with covariance matrix

Cov [(Z_{1} Z_{2}), (Z_{1} Z_{2})] = {plim}_{n \to \infty} \int_{0}^{t^{*}} ℚ_{n} (s^{*}, w) \times [\frac{\dot{κ}}{κ} [E^{- 1} (w)] - \frac{{\dot{S}}^{(0)}}{S^{(0)}} (s^{*}, w) I (w \leq t) \frac{I {S^{(0)} (s^{*}, w) > 0}}{S^{(0)} (s^{*}, w)}] S^{(0)} (s^{*}, w) Λ_{0}^{0} (d w) .

However, the covariance matrix between Z₁ and Z₂ equals 0 since, for every w ∈ 𝒯,

ℚ_{n} (s^{*}, w) [\frac{\dot{κ}}{κ} [E^{- 1} (w)] - \frac{{\dot{S}}^{(0)}}{S^{(0)}} (s^{*}, w)] = 0 .

Because of the Gaussian limits, this then establishes that $\sqrt{n} (\hat{η} - η^{0})$ and V_n(s^*, ·) are asymptotically independent.

The following two corollaries are then immediate consequences of the preceding theorem and elements of its proof.

Corollary 1

Under the conditions of Theorem 4, as n → ∞,

\sqrt{n} ({\hat{η}}_{n} - η^{0}) \overset{d}{\to} N (0, \sum {(s^{*}, t^{*})}^{- 1}) .

Proof

This is immediate from the fact that Z₁ in the proof of Theorem 4 is a k-dimensional zero-mean normal vector with covariance matrix Σ(s^*, t^*).

Corollary 2

Under the conditions of Theorem 4, as n → ∞, the process $W_{n} (s^{*}, \cdot) = \sqrt{n} [Λ_{0}^{(n)} (s^{*}, \cdot) - Λ_{0}^{0} (\cdot)]$ converges weakly in Skorokhod’s D[𝒯]-space to a zero-mean Gaussian process with covariance function given by

c (s^{*}, t_{1}, t_{2}) = \int_{0}^{min (t_{1}, t_{2})} \frac{Λ_{0}^{0} (d w)}{s^{(0)} (s^{*}, w)} + b {(s^{*}, t_{1})}^{T} {\sum (s^{*}, t^{*})}^{- 1} b (s^{*}, t_{2}),

(25)

for t₁, t₂ ∈ 𝒯 and with $b (s^{*}, t) = \int_{0}^{t} q^{(1)} (s^{*}, w) Λ_{0}^{0} (d w)$ .

Proof

From Theorem 4 we have the results that

\sqrt{n} ({\hat{η}}_{n} - η^{0}) \overset{d}{\to} W_{1} (s^{*}, t^{*})

where W₁(s^*, t^*) ~ N(0, [Σ(s^*, t^*)]⁻¹). Also, we have that

{V_{n} (s^{*}, t) : t \in T} \Rightarrow {Z_{2} (s^{*}, t) : t \in T}

where {Z₂(s^*, t): t ∈ 𝒯 } is a zero-mean Gaussian process with covariance function

Cov {Z_{2} (s^{*}, t_{1}), Z_{2} (s^{*}, t_{2})} = \int_{0}^{min (t_{1}, t_{2})} \frac{Λ_{0}^{0} (d w)}{S^{(0)} (s^{*}, w)} .

In addition, W₁(s^*, t^*) and {Z₂(s^*, t): t ∈ 𝒯 } are independent. It is also evident that

sup_{t \in T} ‖ B_{n} (s^{*}, t) - b (s^{*}, t) ‖ \overset{p}{\to} 0.

From the representations in Theorem 4, it follows that {W_n(s^*, t): t ∈ 𝒯 } converges weakly to the process W_∞ ≡ {W_∞(s^*, t): t ∈ 𝒯 } with

W_{\infty} (s^{*}, t) = Z_{2} (s^{*}, t) - b {(s^{*}, t)}^{T} W_{1} (s^{*}, t^{*}) .

As such W_∞ is a zero-mean Gaussian process and its covariance function is

\begin{array}{l} c (s^{*}, t_{1}, t_{2}) = Cov {W_{\infty} (s^{*}, t_{1}), W_{\infty} (s^{*}, t_{2})} \\ = \int_{0}^{min (t_{1}, t_{2})} \frac{Λ_{0}^{0} (d w)}{s^{(0)} (s^{*}, w)} + b {(s^{*}, t_{1})}^{T} {[\sum (s^{*}, t^{*})]}^{- 1} b (s^{*}, t_{2}) . \end{array}

This completes the proof of the corollary.

Possible consistent estimators of the covariance functions are then easily obtained. For the covariance matrix Σ(s^*, t^*), this could be estimated by

\sum^{^} (s^{*}, t^{*}) = \int_{0}^{t^{*}} ℚ_{n} (s^{*}, w; {\hat{η}}_{n}) {[\frac{\dot{κ}}{κ} [E^{- 1} (w); {\hat{η}}_{n}] - \frac{{\dot{S}}^{(0)}}{S^{(0)}} (s^{*}, w; {\hat{η}}_{n})]}^{\otimes 2} \times S^{(0)} (s^{*}, w; {\hat{η}}_{n}) {\hat{Λ}}_{0}^{(n)} (s^{*}, d w; {\hat{η}}_{n}) .

For the covariance function of Z₂(s^*, ·), a consistent estimator is given by

\hat{Cov} [Z_{2} (s^{*}, t_{1}), Z_{2} (s^{*}, t_{2})] = \int_{0}^{min (t_{1}, t_{2})} \frac{{\hat{Λ}}_{0}^{(n)} (s^{*}, d w)}{S^{(0)} (s^{*}, w; {\hat{η}}_{n})} .

On the otherhand, an estimator of b(s^*, t) is given by

\hat{b} (s^{*}, t) = \int_{0}^{t} \frac{{\dot{S}}^{(0)} (s^{*}, w; {\hat{η}}_{n})}{S^{(0)} (s^{*}, w; {\hat{η}}_{n})} {\hat{Λ}}_{0}^{(n)} (s^{*}, d w) .

From these estimators, we are then able to obtain a consistent estimator of the covariance function c(s^*, t₁, t₂) of the limiting Gaussian process W_∞(s^*, ·). This estimator is

\hat{c} (s^{*}, t_{1}, t_{2}) = \hat{Cov} [Z_{2} (s^{*}, t_{1}), Z_{2} (s^{*}, t_{2})] + \hat{b} {(s^{*}, t_{1})}^{T} {[\sum^{^} (s^{*}, t^{*})]}^{- 1} \hat{b} (s^{*}, t_{1}) .

Observe that the results in Corollaries 1 and 2 are highly analogous to those in [4] pertaining to the estimators of the parameters of the Cox proportional hazards model. However, one need to be cautious since under the setting being considered, the limit functions appearing in the above results are more complicated as they must reflect aspects of the sum-quota accrual scheme and the dynamics of the performed interventions or repairs after each event occurrence.

Through these asymptotic results, large-sample confidence intervals and bands, large-sample hypothesis testing procedures, and goodness-of-fit or model validation methods for the infinite-dimensional parameters may now be constructed for this general dynamic model for recurrent events. We note, however, that the results presented in this paper are still limited to the general dynamic recurrent event model without frailties. It remains an open problem to obtain large-sample results for the general dynamic model incorporating frailties.

Footnotes

AMS Subject Classification: Primary: 62N01, 62N02; Secondary: 62G10

References

1.Aalen O. Nonparametric inference for a family of counting processes. Annals of Statistics. 1978;6:701–726. [Google Scholar]
2.Aalen Odd O, Borgan Ørnulf, Gjessing Håkon K. Statistics for Biology and Health. Springer; New York: 2008. Survival and event history analysis: A process point of view. [Google Scholar]
3.Andersen P, Borgan O, Gill R, Keiding N. Statistical Models Based on Counting Processes. Springer-Verlag; New York: 1993. [Google Scholar]
4.Andersen P, Gill R. Cox’s regression model for counting processes: a large sample study. Annals of Statistics. 1982;10:1100–1120. [Google Scholar]
5.Borgan O. Maximum likelihood estimation in parametric counting process models, with applications to censored failure time data. Scandinavian Journal of Statistics. 1984;11:1–16. [Google Scholar]
6.Breslow N, Crowley J. A large sample study of the life table and product limit estimates under random censorship. The Annals of Statistics. 1974;2:437–453. [Google Scholar]
7.Cox D. Regression models and life tables (with discussion) Journal of the Royal Statistical Society. 1972;34:187–220. [Google Scholar]
8.Cox D. Partial likelihood. Biometrika. 1975;62:269–276. [Google Scholar]
9.Fleming T, Harrington D. Counting Processes and Survival Analysis. Wiley; New York: 1991. [Google Scholar]
10.Gill Richard D, Johansen Søren. A survey of product-integration with a view toward application in survival analysis. Ann Statist. 1990;18(4):1501–1555. [Google Scholar]
11.Gjessing Håkon K, Røysland Kjetil, Pena Edsel A, Aalen Odd O. Recurrent events and the exploding Cox model. Lifetime Data Anal. 2010;16(4):525–546. doi: 10.1007/s10985-010-9180-y. [DOI] [PMC free article] [PubMed] [Google Scholar]
12.Peña E, Hollander M. Mathematical Reliability: An Expository Perspective. In: Soyer R, Mazzuchi T, Singpurwalla N, editors. Models for Recurrent Events in Reliability and Survival Analysis. chapter 6. Kluwer Academic Publishers; 2004. pp. 105–123. [Google Scholar]
13.Peña EA, Strawderman RL, Hollander M. Nonparametric estimation with recurrent event data. J Amer Statist Assoc. 2001;96(456):1299–1315. [Google Scholar]
14.Peña Edsel, Slate Elizabeth, Gonzalez Juan Ramon. Semiparametric inference for a general class of models for recurrent events. Journal of Statistical Planning and Inference. 2007;137:1727–1747. doi: 10.1016/j.jspi.2006.05.004. [DOI] [PMC free article] [PubMed] [Google Scholar]
15.Peña Edsel A, Strawderman Robert L, Hollander Myles. Recent advances in reliability theory (Bordeaux, 2000) Birkhäuser Boston; Boston, MA: 2000. A weak convergence result relevant in recurrent and renewal models; pp. 493–514. Stat. Ind. Technol. [Google Scholar]
16.Prentice R, Williams B, Peterson A. On the regression analysis of multivariate failure time data. Biometrika. 1981;68:373–379. [Google Scholar]
17.Sellke T. Weak convergence of the Aalen estimator for a censored renewal process. In: Gupta S, Berger J, editors. Statistical Decision Theory and Related Topics IV. Vol. 2. 1988. pp. 183–194. [Google Scholar]
18.van der Vaart AW. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press; Cambridge: 1998. Asymptotic statistics. [Google Scholar]
19.Wei L, Lin D, Weissfeld L. Regression analysis of multivariate incomplete failure time data by modeling marginal distributions. J Amer Statist Assoc. 1989;84:1065–1073. [Google Scholar]

[R1] 1.Aalen O. Nonparametric inference for a family of counting processes. Annals of Statistics. 1978;6:701–726. [Google Scholar]

[R2] 2.Aalen Odd O, Borgan Ørnulf, Gjessing Håkon K. Statistics for Biology and Health. Springer; New York: 2008. Survival and event history analysis: A process point of view. [Google Scholar]

[R3] 3.Andersen P, Borgan O, Gill R, Keiding N. Statistical Models Based on Counting Processes. Springer-Verlag; New York: 1993. [Google Scholar]

[R4] 4.Andersen P, Gill R. Cox’s regression model for counting processes: a large sample study. Annals of Statistics. 1982;10:1100–1120. [Google Scholar]

[R5] 5.Borgan O. Maximum likelihood estimation in parametric counting process models, with applications to censored failure time data. Scandinavian Journal of Statistics. 1984;11:1–16. [Google Scholar]

[R6] 6.Breslow N, Crowley J. A large sample study of the life table and product limit estimates under random censorship. The Annals of Statistics. 1974;2:437–453. [Google Scholar]

[R7] 7.Cox D. Regression models and life tables (with discussion) Journal of the Royal Statistical Society. 1972;34:187–220. [Google Scholar]

[R8] 8.Cox D. Partial likelihood. Biometrika. 1975;62:269–276. [Google Scholar]

[R9] 9.Fleming T, Harrington D. Counting Processes and Survival Analysis. Wiley; New York: 1991. [Google Scholar]

[R10] 10.Gill Richard D, Johansen Søren. A survey of product-integration with a view toward application in survival analysis. Ann Statist. 1990;18(4):1501–1555. [Google Scholar]

[R11] 11.Gjessing Håkon K, Røysland Kjetil, Pena Edsel A, Aalen Odd O. Recurrent events and the exploding Cox model. Lifetime Data Anal. 2010;16(4):525–546. doi: 10.1007/s10985-010-9180-y. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R12] 12.Peña E, Hollander M. Mathematical Reliability: An Expository Perspective. In: Soyer R, Mazzuchi T, Singpurwalla N, editors. Models for Recurrent Events in Reliability and Survival Analysis. chapter 6. Kluwer Academic Publishers; 2004. pp. 105–123. [Google Scholar]

[R13] 13.Peña EA, Strawderman RL, Hollander M. Nonparametric estimation with recurrent event data. J Amer Statist Assoc. 2001;96(456):1299–1315. [Google Scholar]

[R14] 14.Peña Edsel, Slate Elizabeth, Gonzalez Juan Ramon. Semiparametric inference for a general class of models for recurrent events. Journal of Statistical Planning and Inference. 2007;137:1727–1747. doi: 10.1016/j.jspi.2006.05.004. [DOI] [PMC free article] [PubMed] [Google Scholar]

[R15] 15.Peña Edsel A, Strawderman Robert L, Hollander Myles. Recent advances in reliability theory (Bordeaux, 2000) Birkhäuser Boston; Boston, MA: 2000. A weak convergence result relevant in recurrent and renewal models; pp. 493–514. Stat. Ind. Technol. [Google Scholar]

[R16] 16.Prentice R, Williams B, Peterson A. On the regression analysis of multivariate failure time data. Biometrika. 1981;68:373–379. [Google Scholar]

[R17] 17.Sellke T. Weak convergence of the Aalen estimator for a censored renewal process. In: Gupta S, Berger J, editors. Statistical Decision Theory and Related Topics IV. Vol. 2. 1988. pp. 183–194. [Google Scholar]

[R18] 18.van der Vaart AW. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press; Cambridge: 1998. Asymptotic statistics. [Google Scholar]

[R19] 19.Wei L, Lin D, Weissfeld L. Regression analysis of multivariate incomplete failure time data by modeling marginal distributions. J Amer Statist Assoc. 1989;84:1065–1073. [Google Scholar]

PERMALINK

Asymptotics for a Class of Dynamic Recurrent Event Models

Edsel A Peña

Abstract

1 Introduction and Background

2 Class of Dynamic Models

3 Semiparametric Estimators

3.1 Doubly-Indexed Processes

Proposition 1

Proof

Proposition 2

Proof

Proposition 3

Proof

3.2 Estimation of Λ0

3.3 Estimator of (α, β)

4 Preliminaries for Asymptotics

Theorem 1

Proof

5 Consistency Properties

Lemma 1

Proof

Theorem 2

Proof

Lemma 2

Proof

Theorem 3

Proof

6 Distributional Properties

Theorem 4

Proof

Corollary 1

Proof

Corollary 2

Proof

Footnotes

References

ACTIONS

PERMALINK

RESOURCES

Similar articles

Cited by other articles

Links to NCBI Databases

3.2 Estimation of Λ₀