Contemporary Clinical Trials Communications. 2018 Mar 12;10:50–56. doi: 10.1016/j.conctc.2018.03.002

A novel approach for analyzing data on recurrent events with duration to estimate the combined cumulative rate of both variables over time

Sudipta Bhattacharya
PMCID: PMC5898580  PMID: 29696158

Abstract

Recurrent adverse events in clinical trials, once they occur, often continue for some duration of time, and the number of events together with their durations is clinically considered a measure of the severity of the disease under study. While methods are available for analyzing recurrent events, for analyzing durations, or for analyzing both side by side, no effort has been made so far to combine them and present them as a single measure. Such a single-valued combined measure may help clinicians assess the overall effect of the recurrence of incidents comprising events and durations. A non-parametric approach is adopted here to develop an estimator of the combined rate of both the recurrence of events and the event continuation, that is, the duration per event. The proposed estimator produces a single numerical value, the interpretation and meaningfulness of which are discussed through the analysis of a real-life clinical dataset. The algebraic expression of the variance is derived, the asymptotic normality of the estimator is noted, and it is demonstrated how the estimator can be used in statistical hypothesis testing. Possible further development of the estimator is also noted: adjusting, through covariates, for the dependence of event occurrences on the history of the process generating the recurrent events, and handling dependent censoring.

Keywords: Recurrent events, Duration per event, Intensity, Nelson-Aalen estimator

1. Introduction

In clinical trials on diseases such as Chronic Obstructive Pulmonary Disease (COPD), asthma, or migraine, event durations are of interest along with event counts, as together they define the severity of the disease.

Several approaches are standard for analyzing data on recurrent events. When covariates are not time dependent, Poisson regression or negative binomial regression, as described by Lawless [1], can be used. For time-dependent covariates, the mean or rate function of recurrent events can be estimated, e.g., by the methods introduced by Lin et al. [2,3] and by Miloslavsky et al. [4], all three of which are based on the definition of the intensity function introduced by Andersen and Gill [5]. If event occurrence is considered dependent on previous events, the stratified Andersen-Gill model (Cook and Lawless [6], pp 175–176) can be used. In addition, the non-parametric Nelson-Aalen estimator [7] for the rate or mean function of recurrent events, and its extensions by Cook et al. [8] for event-dependent censoring and termination, are commonly used.

For the analysis of waiting times, both under the assumption of independence among waiting times and under departures from that assumption, a detailed discussion is provided in chapter 4 of Cook and Lawless [6], pp 121–160. Alternatively, proportional hazards modeling is also used, through stratified Cox-type models [9] based on total time as well as on gap times, introduced by Prentice, Williams and Peterson [10], and through marginal Cox models based on total time, introduced by Wei, Lin and Weissfeld [11].

Methods for analyzing data on duration are covered thoroughly by Metcalfe et al. [12]. Hu et al. [13] also proposed methods for analyzing event duration. The bivariate approach to recurrent events with duration is through an alternating two-state process, the two states being the ‘exacerbation state’ and the ‘exacerbation-free state’, as described by Cook and Lawless ([6] section 6.5, pp 216–218 and section 6.7.2, pp 232–236).

However, none of the methods mentioned above presents an estimate of the combined cumulative rate or mean of recurrent events and event durations over time.

In this paper, a non-parametric estimator is proposed that takes the totality of the data into account by dealing simultaneously with both the recurrence of events and their duration. As a result, it produces a single numerical value that estimates the overall effect of the incident. The proposed estimator can therefore be regarded as a joint or combined rate of both the event recurrences and the duration per event.

The concept of the proposed estimator is developed over the next few sections as follows. Section 2 describes the mathematical motivation, development and properties of the estimator. Section 3 discusses the interpretation, usefulness and meaningfulness of the single value produced by the estimator, based on a real-life clinical dataset. Section 4 describes the possible uses and advantages of the proposed estimator and mentions potential further developments.

2. Mathematical development

The mathematical framework of this paper is built on a process that generates recurrent events for individual subjects, who constitute a population. The assumption regarding the occurrence of events is that an event can either occur and end at the same time instant $t$ (as in any Poisson process), or occur at one time instant and then continue for some time before it ends.

2.1. Definitions

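Let $X_t$ denote the number of events observed to have occurred to an individual subject by (i.e., on or before) time $t$.

The intensity function is defined (Cook and Lawless [6], chapter 1.3, p 10) as

$$\chi(t \mid \mathcal{H}_t) = \lim_{\Delta t \downarrow 0} \frac{\Pr(\Delta X_t = 1 \mid \mathcal{H}_t)}{\Delta t} \tag{1}$$

where $\Delta X_t = X_{t+\Delta t} - X_t$ and $\mathcal{H}_t = \{X(s), 0 \le s < t\}$ is the history of the process.

Note that the intensity function can also be viewed as $\chi(t \mid \mathcal{H}_t)\,dt = P(dX_t = 1 \mid \mathcal{H}_t) = E\{dX_t \mid \mathcal{H}_t\}$.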

Based on the definition above of the intensity function for the occurrence (or, to be precise, the onset) of a new event, let us define the intensity function for the ending of an event that has already occurred (and started) to an individual subject at time-point $\tilde{t}_0 < t$ (i.e., the onset of the event was at time-point $\tilde{t}_0 < t$) and is not continuing until (i.e., has ended by) time $t$ as:

$$\chi^Z(t \mid \mathcal{H}_t) = \lim_{\Delta t \downarrow 0} \frac{\Pr(\Delta Z_t = 1 \mid \mathcal{H}_t)}{\Delta t} \tag{2}$$

where $\mathcal{H}_t = \{X(s), 0 \le s < t\}$ is the history of the process and $\Delta Z_t = Z_{t+\Delta t} - Z_t$, with $Z_t$ denoting an indicator function such that

  • $Z_t = 1$ when an event that has already occurred (and started) at time-point $\tilde{t}_0 < t$ continues until time $t$, or

  • $Z_t = 0$ when an event that has already occurred (and started) at time-point $\tilde{t}_0 < t$ has also ended by time $t$.

Note that this intensity function can also be viewed as $\chi^Z(t \mid \mathcal{H}_t)\,dt = \Pr(\Delta Z_t = 1 \mid \mathcal{H}_t) = E\{dZ_t \mid \mathcal{H}_t\}$.

2.2. Mathematical motivation

Let us now define the following variables:

  • $N_t$ = total count of the onsets of events that have occurred to the population of $n$ subjects by time $t$, which is non-decreasing over time, and

  • $N_t^Z$ = count of events that have already occurred to the population of $n$ subjects before time $t$ and are continuing until time $t$.

Clearly, $N_t \ge N_t^Z$ at any given time $t$.

If we define a new variable $N_t^S$ as $N_t^S = N_t + N_t^Z$, then $\Delta N_t^S = \Delta N_t + \Delta N_t^Z$.

Then,

$$\Delta N_t = \Delta N(t) = \sum_{k=1}^{n} \left[X^{(k)}(t+\Delta t) - X^{(k)}(t)\right] Y^{(k)}(t)\, C^{(k)}(t),$$

where $X^{(k)}(t)$ is an indicator function such that:

  • $X^{(k)}(t) = 1$ when a new event has occurred (in terms of the onset of that new event) by time $t$ to the $k$th individual subject, or

  • $X^{(k)}(t) = 0$ when a new event has not occurred (in terms of the onset of that new event) until time $t$ since the preceding event occurred and ended for the $k$th individual subject;

$Y^{(k)}(t)$ is an indicator function such that:

  • $Y^{(k)}(t) = 1$ when the $k$th subject belongs to the risk set at time $t$ for having a new event, since the preceding event occurred and ended,

  • $Y^{(k)}(t) = 0$ when the $k$th subject does not belong to the risk set at time $t$ for having a new event, since the preceding event has occurred and is continuing;

and $C^{(k)}(t)$ is an indicator function such that:

  • $C^{(k)}(t) = I(t \le C^{(k)})$ indicates whether the $k$th subject is under observation at time $t$.

Clearly, $C^{(k)}(t)$ is the indicator for censoring of a subject, and here we assume data to be missing at random after censoring.

We also consider

$$\Delta N_t^Z = \sum_{k=1}^{n} \left[1 - \left\{Z^{(k)}(t) - Z^{(k)}(t+\Delta t)\right\}\right] Z^{(k)}(t)\, C^{(k)}(t),$$

where $Z^{(k)}(t)$ is an indicator function such that:

  • $Z^{(k)}(t) = 1$ if an event has already occurred (and started) to the $k$th subject and is continuing until time $t$;

  • $Z^{(k)}(t) = 0$ if an event has already occurred (and started) to the $k$th subject and has also ended by time $t$.

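It should be noted that since $Y^{(k)}(t) + Z^{(k)}(t) = 1$ at any given time $t$ for any $k$,

$$\sum_{k=1}^{n} Y^{(k)}(t)\, C^{(k)}(t) + \sum_{k=1}^{n} Z^{(k)}(t)\, C^{(k)}(t) = \sum_{k=1}^{n} \left\{Y^{(k)}(t) + Z^{(k)}(t)\right\} C^{(k)}(t) = \sum_{k=1}^{n} C^{(k)}(t). \tag{3}$$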
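The indicators of this section can be made concrete with a small sketch. The function names, the dictionary layout (`episodes`, `last_day`) and the toy data below are hypothetical and are not taken from the paper. Time is assumed to be measured in whole days, and episodes of one subject are assumed not to overlap. Under those assumptions the sketch computes Y, Z and C for each subject and checks identity (3).

```python
def indicators(episodes, last_day, t):
    """Return (Y, Z, C) for one subject at day t.

    episodes: list of (onset_day, end_day) pairs (hypothetical layout)
    last_day: last day the subject is under observation
    """
    c = 1 if t <= last_day else 0
    # Z = 1 while an event that started before t has not yet ended by t.
    z = 1 if any(onset < t < end for onset, end in episodes) else 0
    # Y = 1 exactly when no earlier event is still continuing.
    y = 1 - z
    return y, z, c


def risk_set_sizes(subjects, t):
    """Return (sum of Y*C, sum of Z*C, sum of C) over all subjects at day t."""
    rows = [indicators(s["episodes"], s["last_day"], t) for s in subjects]
    sum_yc = sum(y * c for y, z, c in rows)
    sum_zc = sum(z * c for y, z, c in rows)
    sum_c = sum(c for y, z, c in rows)
    return sum_yc, sum_zc, sum_c


# Toy data (invented): the second subject drops out after day 4.
subjects = [
    {"episodes": [(2, 5)], "last_day": 10},
    {"episodes": [(1, 3)], "last_day": 4},
]

for t in (3, 6):
    sum_yc, sum_zc, sum_c = risk_set_sizes(subjects, t)
    assert sum_yc + sum_zc == sum_c  # identity (3)
    print(t, sum_yc, sum_zc, sum_c)
```

Because Y is computed as 1 - Z, the identity holds by construction; the sketch is mainly useful for seeing how the two risk sets split the subjects still under observation.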

2.3. Development of the estimator

As already defined, $N_t^S = N_t + N_t^Z$, which implies $\Delta N_t^S = \Delta N_t + \Delta N_t^Z$, where $\Delta N_t = N_{t+\Delta t} - N_t$.

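By the total probability theorem,

$$P(a \cup b) = P(\varepsilon)P(a \cup b \mid \varepsilon) + P(\varepsilon^C)P(a \cup b \mid \varepsilon^C) = P(\varepsilon)P(a \mid \varepsilon) + P(\varepsilon^C)P(b \mid \varepsilon^C),$$

in case $a$ and $b$ are disjoint, $P(b \mid \varepsilon) = 0$ and $P(a \mid \varepsilon^C) = 0$, where

  • $\varepsilon$: risk set of subjects at time $t$ for having a new event

  • $\varepsilon^C$: set of (subjects with) existing events (i.e., events that did not end by time $t$) that are continuing until time $t$

  • $a$: occurrence of a new event to a subject within the interval $[t, t+\Delta t)$

  • $b$: an existing event continuing during the interval $[t, t+\Delta t)$

This implies that

$$\frac{\Delta N_t^S}{\sum_{k=1}^{n} C^{(k)}(t)} = \frac{\Delta N_t + \Delta N_t^Z}{\sum_{k=1}^{n} C^{(k)}(t)} = \frac{\Delta N_t}{\sum_{k=1}^{n} C^{(k)}(t)} + \frac{\Delta N_t^Z}{\sum_{k=1}^{n} C^{(k)}(t)}$$

$$= \left[\frac{\sum_{k=1}^{n} Y^{(k)}(t) C^{(k)}(t)}{\sum_{k=1}^{n} C^{(k)}(t)}\right] \times \frac{\Delta N_t}{\sum_{k=1}^{n} Y^{(k)}(t) C^{(k)}(t)} + \left[\frac{\sum_{k=1}^{n} Z^{(k)}(t) C^{(k)}(t)}{\sum_{k=1}^{n} C^{(k)}(t)}\right] \times \frac{\Delta N_t^Z}{\sum_{k=1}^{n} Z^{(k)}(t) C^{(k)}(t)},$$

where $\sum_{k=1}^{n} C^{(k)}(t) = \sum_{k=1}^{n} Y^{(k)}(t) C^{(k)}(t) + \sum_{k=1}^{n} Z^{(k)}(t) C^{(k)}(t)$, from equation (3).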

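Integrating over time,

$$\int_0^t \frac{dN_u^S}{\sum_{k=1}^{n} C^{(k)}(u)} = \int_0^t \frac{dN_u + dN_u^Z}{\sum_{k=1}^{n} C^{(k)}(u)} = \int_0^t \frac{dN_u}{\sum_{k=1}^{n} C^{(k)}(u)} + \int_0^t \frac{dN_u^Z}{\sum_{k=1}^{n} C^{(k)}(u)}$$

$$= \int_0^t \left[\frac{\sum_{k=1}^{n} Y^{(k)}(u) C^{(k)}(u)}{\sum_{k=1}^{n} C^{(k)}(u)}\right] \times \frac{dN_u}{\sum_{k=1}^{n} Y^{(k)}(u) C^{(k)}(u)} + \int_0^t \left[\frac{\sum_{k=1}^{n} Z^{(k)}(u) C^{(k)}(u)}{\sum_{k=1}^{n} C^{(k)}(u)}\right] \times \frac{dN_u^Z}{\sum_{k=1}^{n} Z^{(k)}(u) C^{(k)}(u)}.$$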

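Then

$$\int_0^t \frac{dN_u^S}{\sum_{k=1}^{n} C^{(k)}(u)} = \widehat{\int_0^t p_u\, \chi(u \mid \mathcal{H}_u^n)\,du} + \widehat{\int_0^t p_u^Z \left\{1 - \chi^Z(u \mid \mathcal{H}_u^n)\,du\right\}}$$

may be looked upon as the estimator of the cumulative weighted event-time recurrence rate, where

$$p_u = \left[\frac{\sum_{k=1}^{n} \left\{Y^{(k)}(u) C^{(k)}(u)\right\}}{\sum_{k=1}^{n} C^{(k)}(u)}\right],$$

$$\widehat{\int_0^t \chi(u \mid \mathcal{H}_u^n)\,du} = \int_0^t \left[\frac{dN_u}{\sum_{k=1}^{n} \left\{Y^{(k)}(u) C^{(k)}(u)\right\}}\right],$$

$$p_u^Z = \left[\frac{\sum_{k=1}^{n} Z^{(k)}(u) C^{(k)}(u)}{\sum_{k=1}^{n} C^{(k)}(u)}\right],$$

and

$$\widehat{\int_0^t \left\{1 - \chi^Z(u \mid \mathcal{H}_u^n)\,du\right\}} = \int_0^t \left[\frac{dN_u^Z}{\sum_{k=1}^{n} Z^{(k)}(u) C^{(k)}(u)}\right].$$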

A detailed derivation of these individual components of the estimator is provided in the Appendix section of this paper.
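A minimal sketch of how the estimator of this section might be computed on daily data is given below. It is an illustration under stated assumptions, not the author's implementation: the function name `cumulative_weighted_rate`, the dictionary keys and the toy data are hypothetical, time is discretised to whole days, episodes of one subject do not overlap, and an event is counted on its onset day and on each later day before its end day.

```python
def cumulative_weighted_rate(subjects, horizon):
    """Return the cumulative curve of (onsets + continuing events) / (subjects observed).

    subjects: list of dicts with keys
        "episodes": list of (onset_day, end_day) pairs (hypothetical layout)
        "last_day": last day under observation (censoring day)
    horizon: last day of follow-up to evaluate (days run from 1 to horizon)
    """
    curve = []
    total = 0.0
    for day in range(1, horizon + 1):
        observed = [s for s in subjects if day <= s["last_day"]]
        if observed:
            # Increment of N: onsets on this day among observed subjects.
            onsets = sum(
                1 for s in observed for onset, end in s["episodes"] if onset == day
            )
            # Increment of N^Z: earlier events still continuing on this day.
            continuing = sum(
                1 for s in observed for onset, end in s["episodes"] if onset < day < end
            )
            total += (onsets + continuing) / len(observed)
        curve.append(total)
    return curve


# Toy data (invented): two subjects followed for 10 days, no drop-out.
subjects = [
    {"episodes": [(2, 5)], "last_day": 10},
    {"episodes": [(1, 3), (6, 7)], "last_day": 10},
]
curve = cumulative_weighted_rate(subjects, horizon=10)
print(curve[-1])
```

Returning the whole curve rather than only the final value makes it possible to plot the estimate over time for each treatment arm. Days on which nobody is under observation add nothing to the sum.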

2.4. Properties

2.4.1. Asymptotic property and variance

By the very way it is developed mathematically, the estimator of the cumulative weighted event-time recurrence rate defined above should retain the asymptotic normality held by the Nelson-Aalen estimator. In addition, the derivation of the variance estimate for the Nelson-Aalen estimator in Cook and Lawless ([6] chapter 3.4.1, pp 68–69) can be adapted to derive the variance estimate of the proposed estimator as follows.

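$$\widehat{\mathrm{Var}}\left(\widehat{\int_0^t p_u\, \chi(u \mid \mathcal{H}_u^n)\,du} + \widehat{\int_0^t p_u^Z \left\{1 - \chi^Z(u \mid \mathcal{H}_u^n)\,du\right\}}\right) = \mathrm{Var}\left(\int_0^t \frac{dN_u^S}{\sum_{k=1}^{n} C^{(k)}(u)}\right) = \mathrm{Var}\left(\int_0^t \frac{dN_u}{\sum_{k=1}^{n} C^{(k)}(u)} + \int_0^t \frac{dN_u^Z}{\sum_{k=1}^{n} C^{(k)}(u)}\right)$$

$$= \mathrm{Var}\left(\int_0^t \frac{dN_u}{\sum_{k=1}^{n} C^{(k)}(u)}\right) + \mathrm{Var}\left(\int_0^t \frac{dN_u^Z}{\sum_{k=1}^{n} C^{(k)}(u)}\right),$$

since $\mathrm{Cov}(\Delta N_t, \Delta N_t^Z) = 0$. The estimated variance is then

$$\sum_{h:\, t_{(h)} \le t} \frac{\Delta N_{t_{(h)}}}{\left\{\sum_{k=1}^{n} C^{(k)}(t_{(h)})\right\}^2} + \sum_{h:\, t_{(h)} \le t} \frac{\Delta N^Z_{t_{(h)}}}{\left\{\sum_{k=1}^{n} C^{(k)}(t_{(h)})\right\}^2} = \sum_{h:\, t_{(h)} \le t} \frac{\Delta N^S_{t_{(h)}}}{\left\{\sum_{k=1}^{n} C^{(k)}(t_{(h)})\right\}^2},$$

since $\Delta N_t^S = \Delta N_t + \Delta N_t^Z$.

Hence,

$$\widehat{\mathrm{Var}}\left(\widehat{\int_0^t p_u\, \chi(u \mid \mathcal{H}_u^n)\,du} + \widehat{\int_0^t p_u^Z \left\{1 - \chi^Z(u \mid \mathcal{H}_u^n)\,du\right\}}\right) = \sum_{h:\, t_{(h)} \le t} \frac{\Delta N^S_{t_{(h)}}}{\left\{\sum_{k=1}^{n} C^{(k)}(t_{(h)})\right\}^2}.$$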

The practical utility of the proposed estimator's asymptotic normality is that, for large samples, it can be used to test statistical hypotheses comparing treatment effects on the cumulative weighted event-time recurrence rate.
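The variance expression and its use in a two-arm comparison can be sketched as follows. This is only an illustration: the function names and the numbers are hypothetical, the inputs are assumed to be per-day increments of the combined count and per-day numbers of subjects under observation, and a large-sample normal (z) reference distribution is assumed in place of the t-test that section 3.2 mentions.

```python
import math


def rate_and_variance(increments, observed):
    """Return (estimate, variance estimate) from per-day inputs.

    increments: per-day increments of the combined count N^S
    observed:   per-day number of subjects under observation
    Days with nobody under observation are skipped.
    """
    pairs = [(d, c) for d, c in zip(increments, observed) if c > 0]
    rate = sum(d / c for d, c in pairs)
    variance = sum(d / c ** 2 for d, c in pairs)
    return rate, variance


def two_sample_z(rate_a, var_a, rate_b, var_b):
    """Two-sided z-test for the difference of two independent estimates."""
    z = (rate_a - rate_b) / math.sqrt(var_a + var_b)
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Invented example: 10 days, two subjects observed on every day.
rate, variance = rate_and_variance([1, 2, 1, 1, 0, 1, 0, 0, 0, 0], [2] * 10)
print(rate, variance)
print(two_sample_z(4.0, 0.5, 3.0, 0.5))
```

Treating the two arms as independent is what allows their variance estimates to be added in the denominator of the test statistic.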

2.4.2. Poisson properties

In the case of a Poisson process or a renewal process without duration, that is, when events occur and end at the same time instant $t$, the estimator of the cumulative weighted event-time recurrence rate reduces to the regular Nelson-Aalen estimator of the cumulative rate of event occurrences (or of the mean, in the case of a Poisson process, as discussed in detail in Cook and Lawless [6], p 68), equivalently known as the cumulative intensity function or the integrated hazard function over time.
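The reduction can be written out in a few lines. The sketch below uses the paper's notation and assumes only that no event has positive duration, so that no subject is ever in the continuing state, and that at least one subject is under observation at each time considered.

```latex
% Assumption: no event has positive duration, so for every subject k and time u
% Z^{(k)}(u) = 0 and therefore Y^{(k)}(u) = 1.
\begin{align*}
p_u &= \frac{\sum_{k=1}^{n} Y^{(k)}(u)\, C^{(k)}(u)}{\sum_{k=1}^{n} C^{(k)}(u)} = 1,
\qquad p_u^Z = 0, \qquad dN_u^Z = 0, \\
\int_0^t \frac{dN_u^S}{\sum_{k=1}^{n} C^{(k)}(u)}
  &= \int_0^t \frac{dN_u}{\sum_{k=1}^{n} C^{(k)}(u)}
   = \int_0^t \frac{dN_u}{\sum_{k=1}^{n} Y^{(k)}(u)\, C^{(k)}(u)} .
\end{align*}
```

The last expression has the Nelson-Aalen form: increments of the event count divided by the number of subjects at risk and under observation.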

2.4.3. Equivalence with mean total duration

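Under the assumption of no drop-outs, $\sum_{k=1}^{n} C^{(k)}(t) = n$ for $0 \le t \le \ddot{T}$, where $n$ is the total number of subjects in the population and $\ddot{T}$ is the total time all these $n$ subjects were under study. The estimator of the cumulative weighted event-time recurrence rate is then

$$\int_0^{\ddot{T}} \left[\frac{dN_u^S}{\sum_{k=1}^{n} C^{(k)}(u)}\right] = \sum_{h:\, t_{(h)} \le \ddot{T}} \frac{\Delta N^S_{t_{(h)}}}{\sum_{k=1}^{n} C^{(k)}(t_{(h)})} = \sum_{h:\, t_{(h)} \le \ddot{T}} \frac{\Delta N(t_{(h)}) + \Delta N^Z_{t_{(h)}}}{\sum_{k=1}^{n} C^{(k)}(t_{(h)})} = \frac{1}{n} \sum_{k=1}^{n} \sum_{i=1}^{[m_k/2]} d^{k}_{t^{k}_{(2i-1)} t^{k}_{2i}},$$

where $\{d^{k}_{t^{k}_{(2i-1)} t^{k}_{2i}}\}$, $i = 1(1)[m_k/2]$, $0 \le t^{k}_{1} \le \dots \le t^{k}_{m_k - 1} \le t^{k}_{m_k} \le \ddot{T}$, are the successive duration-times for the $k$th subject (i.e., the time-ordered sequence of time-intervals giving the durations of the recurrent events that occurred to the $k$th subject) over the time-period $(0, \ddot{T})$, and $t^{k}_{1}, \dots, t^{k}_{m_k}$ are the time-points at which recurrent events occurred and then ended for the $k$th subject (i.e., the time-ordered sequence of the onset and end time-points of those events).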

Note that the right-hand side of the expression above is simply the mean of the total duration-time, i.e., the average over subjects of the sum of the durations of the recurrent events that occurred to them during the study.
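A small worked example may make the equivalence concrete. The data below are invented for illustration, and the counting convention (an event counts on its onset day and on each later day before its end day) is an assumption in line with the day-unit rule stated in section 3.

```latex
% Hypothetical data: n = 2 subjects, \ddot{T} = 10 days, no drop-outs.
% Subject 1: one event with onset on day 2 and end on day 5.
% Subject 2: two events with (onset, end) days (1, 3) and (6, 7).
\begin{align*}
\Delta N^S_{t_{(h)}},\ t_{(h)} = 1, \dots, 10 &: \quad 1,\ 2,\ 1,\ 1,\ 0,\ 1,\ 0,\ 0,\ 0,\ 0 \\
\sum_{h:\, t_{(h)} \le \ddot{T}} \frac{\Delta N^S_{t_{(h)}}}{n}
  &= \frac{1 + 2 + 1 + 1 + 0 + 1}{2} = 3 \\
\frac{1}{n} \sum_{k=1}^{2} \sum_{i} d^{k}_{t^{k}_{(2i-1)} t^{k}_{2i}}
  &= \frac{(5 - 2) + \{(3 - 1) + (7 - 6)\}}{2} = 3
\end{align*}
```

Both sides give 3 event-days per subject over the 10 days, as the equality requires when nobody drops out.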

Consequently, if the event occurrence (or, to be precise, onset) rate and the duration distribution are stationary over time, then by construction the cumulative weighted event-time recurrence rate is essentially the same as the mean of the total duration-time (per some unit time). That is, the estimator proposed here is equivalent to $E[D_T \mid T = t]$ for large $t$, where $D_T$ is a random variable denoting the sum of the durations of the recurrent events that occurred to a subject over a given time-period, i.e., $D_t = \sum_{i=1}^{[m/2]} d_{t_{(2i-1)} t_{2i}}$. Here $\{d_{t_{(2i-1)} t_{2i}}\}$, $i = 1(1)[m/2]$, $0 \le t_1 \le \dots \le t_{m-1} \le t_m \le t$, are the random variables denoting the successive durations of the recurrent events that occurred to a subject over the time-period $(0, t)$, and $t_1, \dots, t_m$ are the time-points at which the recurrent events occurred and then ended for that subject (i.e., the time-ordered sequence of onset and end time-points). However, since the event onset rate and the distribution of event durations might both vary with time, the estimates from the proposed estimator might only be comparable to the numerical values of $D_t$ as $t$ increases.

3. Data analysis

The non-parametric estimator introduced here estimates a novel parameter for assessing the disease condition in a population: a single value combining the recurrence of events and their duration over time, which no existing or standard method has dealt with so far. The purpose of the data analysis is therefore to understand the role of such an estimator in the context of data on recurrent events with duration, through the meaning, usefulness and interpretation of the numerical value it produces, rather than to compare it with an existing estimator of the event-time recurrence rate, since none exists so far.

It should be noted that if an event ends for a subject at time $t$, another event can begin for that subject immediately afterwards, that is, at time $t + \Delta t$, which would make the previous event appear to continue for that subject through time $t + \Delta t$. Hence, when the day is the time unit of a dataset used to estimate the cumulative weighted event-time recurrence rate and an event ends for a subject on a particular day, the event is considered to continue for that subject only until the day before.
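The day-unit convention above can be expressed as a small helper. The name `event_days` is hypothetical, and the handling of an event whose recorded end day equals its onset day (counted once, on the onset day) is an assumption made here rather than something the paper specifies.

```python
def event_days(onset_day, end_day):
    """Days on which one event is counted: onset day through the day before the end day."""
    if end_day <= onset_day:
        # Assumption: an event that starts and ends on the same day counts once.
        return [onset_day]
    return list(range(onset_day, end_day))


print(event_days(2, 5))
print(event_days(4, 4))
```

With this rule, an event with onset on day 2 and end on day 5 contributes three event-days, which equals its duration of 5 - 2 = 3 days.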

An exacerbation dataset from a historical COPD study is used for the data analysis and for estimating the cumulative weighted event-time recurrence rate, with COPD exacerbations as the recurrent events. The Nelson-Aalen estimator of the event rate and the descriptive means of the total duration and of the duration of the disease (in days) have been calculated on the same dataset, so that the results can be compared side by side to understand the meaning and usefulness of the estimator introduced here.

The dataset has a total of 165 patients, of whom 117 received the active drug under study (drug A, say) and the remaining 48 received an active comparator (drug B, say) for one year, which was the study period. All patients in both treatment arms had at least one exacerbation. 28 patients on drug A and 11 patients on drug B had more than one exacerbation during the one-year on-treatment study period. The maximum number of exacerbations in a patient was three on drug A and four on drug B. The minimum duration of a COPD exacerbation was one day on drug A (maximum 34 days) and two days on drug B (maximum 31 days). Two patients on drug A and one patient on drug B were observed with an event for which no duration-time was recorded in the dataset. 38 patients on drug A and 20 on drug B dropped out of the study early.

Table 1 shows the results of the different estimators: cumulative weighted event-time recurrence rate introduced here, the mean duration and the Nelson-Aalen event rate.

Table 1. Comparison of estimators for event rate and for duration of events with the proposed estimator.

| Treatment | N | Cumulative Weighted Event-Time Recurrence Rate (SE) | Mean (SD) of total duration-time | Mean Duration (SD) | Nelson-Aalen Event Rate (SE) | Product of N.A. Event Rate and Mean Duration |
|---|---|---|---|---|---|---|
| A | 117 | 9.0652 (0.331) | 9.730 (5.579) | 7.5608 (3.635) | 1.4383 (0.126) | 10.8747 |
| B | 48 | 10.1556 (0.759) | 9.766 (6.273) | 7.650 (4.100) | 1.4315 (0.185) | 10.9510 |
| Treatment comparison (p-value) | | 0.4879 | | | 0.9060 | |

3.1. Meaningfulness

As seen in Table 1, a numerical comparison of the estimated cumulative weighted event-time recurrence rate with the product of the Nelson-Aalen estimate of the event rate and the mean duration indicates that the new estimator may be regarded as a combined rate (over time) of event recurrence and duration. The product of the Nelson-Aalen estimate and the mean duration (an average of 10.8747 event-days for drug A and 10.9510 for drug B) can be viewed as a single numerical value that summarizes the totality of the data by combining the intensity of the disease, through the Nelson-Aalen estimate of the recurrent event rate, with the average duration of events. Likewise, the estimated cumulative weighted event-time recurrence rate (9.0652 event-days for drug A and 10.1556 event-days for drug B, both in 1 patient-year) summarizes, as a single numerical value, the combined cumulative weighted rate (within a year) of the number of event occurrences (intensity of the disease) together with the length of time those events continued.
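The arithmetic behind the last column of Table 1 can be reproduced directly from the reported values. The dictionary layout and key names below are hypothetical; the numbers are those given in Table 1.

```python
# Values copied from Table 1; the key names are illustrative only.
table1 = {
    "A": {"na_rate": 1.4383, "mean_duration": 7.5608, "proposed": 9.0652},
    "B": {"na_rate": 1.4315, "mean_duration": 7.650, "proposed": 10.1556},
}

# Product of the Nelson-Aalen event rate and the mean duration, per arm.
products = {
    arm: row["na_rate"] * row["mean_duration"] for arm, row in table1.items()
}

for arm, row in table1.items():
    print(f"{arm}: product = {products[arm]:.4f}, proposed = {row['proposed']:.4f}")
```

Printing the product next to the proposed estimate puts both quantities on the same event-day scale for each arm.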

3.2. Usefulness

Because of the asymptotic normality of the estimator mentioned in section 2.4.1, for large samples it can be used as a statistic for the t-test of a statistical hypothesis comparing treatment effects on the cumulative weighted event-time recurrence rate, as presented in Table 1. Since neither the p-value for the proposed estimator nor the p-value for the N.A. estimator is significant, comparing the two p-values is not meaningful. However, the ability to perform hypothesis testing with the proposed estimator shows its advantage over both the Nelson-Aalen estimator, which does not take duration into account, and the simple product of the Nelson-Aalen event rate and the mean duration, which cannot be used as a test statistic for comparing treatment effects.

Fig. 1 presents the progression over one year of the cumulative weighted event-time recurrence rate and of the Nelson-Aalen estimate for the COPD dataset described above. As observed in Fig. 1, the difference in treatment effects between drugs A and B is more clearly visible in the curves for the cumulative weighted event-time recurrence rate than in the curves for the Nelson-Aalen estimates of the cumulative intensity function of event recurrence.

Fig. 1. Comparison of the proposed combined rate and N.A. rate over time.

The estimates of the cumulative weighted event-time recurrence rate in Table 1 (9.0652 event-days for treatment A and 10.1556 event-days for treatment B, both in 1 patient-year) are numerically comparable to the means of the total duration-time (9.73 for treatment A and 9.766 for treatment B) over the one-year time-period, as discussed in section 2.4.3.

3.3. Interpretability

When the covariance of two variables is considered, it is viewed as a single numerical value expressing the joint behavior of the two variables. Likewise, the proposed rate should be viewed as a linear combination of the progression of event recurrence and of duration over time.

If the cumulative weighted rate of event-time recurrences of 9.0652 event-days over 1 patient-year for drug A (10.1556 event-days in 1 patient-year for drug B) is viewed as a combined, single-valued rate of event recurrence and duration, then the proposed estimator can be used to understand the overall effect of the recurrence of COPD exacerbations in these subjects under the two treatments. In other words, this combined, single-valued cumulative weighted rate of event-time recurrences may be used as a measure of the severity of a disease involving both the recurrence of events (intensity of the disease) and the duration of events. This is because the proposed estimator takes into account the totality of the data on the disease causing recurrent events with duration, and produces a single value containing all the information in that data (recurrence of events, duration of events and total observation-time).

4. Discussion

The estimator of the cumulative weighted event-time recurrence rate introduced in this paper accounts for both the recurrence of events and the duration per event over time. For recurrent adverse events with duration that occur in clinical studies, the occurrence of an event depends on whether or not the preceding event is still continuing; hence, event continuation should be taken into account when considering the risk set(s) over time for the occurrence of the following event(s). Moreover, in a process generating recurrent events in which every event, once it has occurred, may continue for some duration of time, events and the associated time per event should be considered inseparable in defining the process, as done in section 2.2 of this paper. One natural way of analyzing data on recurrent events with durations is therefore to deal with both simultaneously by introducing an extra parameter in the method of estimation, as is done in the estimator shown in section 2.3. The proposed estimator, named the cumulative weighted event-time recurrence rate in section 2.3, is discussed in section 3 as a tool for understanding the overall effect of the recurrence of events with duration, e.g., the severity of a disease observed in a real-life dataset. It summarizes the totality of the data on the disease causing recurrent events with duration by estimating, as a single numerical value, the combined rate (over a given period of time) of event recurrences (intensity of the disease) together with the length of time those events continued.

The proposed estimator is easy to calculate and easy to interpret. It can be interpreted in either of the following two ways. It can be viewed as an estimate of the combined rate of both the recurrence of events and the duration of events, calculated over a given period of time. Or, under certain conditions (e.g., the event occurrence (onset) rate and duration distribution being stationary over time and/or the study period being very long, as described in section 2.4.3), it can be viewed as an estimate of the mean of the total duration-time (in other words, the average of the sum of the durations of the recurrent events that occurred to a subject during the study).

While the alternating two-state model described in Cook and Lawless ([6] section 6.5, pp 216–218 and section 6.7.2, pp 232–236) is a very appropriate approach to analyzing data on recurrent events with duration, the method relies on many assumptions that are sometimes hard to justify (e.g., a distribution for the frailty-type parameter representing the conditional transition intensities between states, or the assumption that the hazard function depends on the preceding events, i.e., on the process generating recurrent events, only through the time-dependent covariates). The estimator proposed in this paper, in contrast, uses a minimal set of assumptions, owing to its background in non-parametric methodology.

The estimator developed in this paper may be extended to fit different conditions (namely, a Poisson or even a general point process in the sense that jumps are greater than one unit, time-dependent covariates, and dependent censoring) by adapting various methods for estimating the mean or the rate function of recurrent events, e.g., the methods introduced by Lin et al. [2], by Lin et al. [3], and by Miloslavsky et al. [4], i.e., methods in which the increments in the number of events at any time and the risk sets for the following events over time are considered in the estimation procedure. This work is left for future research.

Acknowledgements

The author is deeply thankful to Professor Jerry F. Lawless of the University of Waterloo, Canada, for all his critical input, without which this work would never have been complete. The author sincerely thanks the AstraZeneca Respiratory statisticians and the Information sharing group for their kind support in providing a relevant dataset.

Footnotes

Appendix A

Supplementary data related to this article can be found at https://doi.org/10.1016/j.conctc.2018.03.002.

Appendix.

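Since $\Delta N_t = \sum_{k=1}^{n} \left[X^{(k)}(t+\Delta t) - X^{(k)}(t)\right] Y^{(k)}(t) C^{(k)}(t) = \sum_{k=1}^{n} \Delta X^{(k)}(t)\, Y^{(k)}(t)\, C^{(k)}(t)$ denotes the number of events that occurred (i.e., the onsets of the events that are observed) to the population of $n$ subjects within the time-interval $[t, t+\Delta t)$, then, denoting by $\mathcal{H}_t^n$ the history by time $t$ of the process generating recurrent events for the population of $n$ subjects,

$$E[\Delta N_t \mid \mathcal{H}_t^n] = E\left[\sum_{k=1}^{n} \Delta X^{(k)}(t)\, Y^{(k)}(t)\, C^{(k)}(t) \,\Big|\, \mathcal{H}_t^n\right] = \sum_{k=1}^{n} Y^{(k)}(t) C^{(k)}(t)\, E[\Delta X^{(k)}(t) \mid \mathcal{H}_t^{(k)}]$$

$$= \sum_{k=1}^{n} Y^{(k)}(t) C^{(k)}(t) \Pr(\Delta X^{(k)}(t) = 1 \mid \mathcal{H}_t^{(k)}) = \sum_{k=1}^{n} \left[Y^{(k)}(t) C^{(k)}(t)\, \chi^{(k)}(t \mid \mathcal{H}_t^{(k)})\, \Delta t\right].$$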

The above equation is derived by following equation (1), since for any $k$:

$$\Pr(\Delta X^{(k)}(t) = 1 \mid \mathcal{H}_t^{(k)}) = \chi^{(k)}(t \mid \mathcal{H}_t^{(k)})\, \Delta t,$$

where $\chi^{(k)}(t \mid \mathcal{H}_t^{(k)})$ is the intensity function for the occurrence (onset, to be precise) of an event to the $k$th subject at time $t$ given $\mathcal{H}_t^{(k)}$, which denotes the history of the process for the $k$th subject by time $t$.

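However, under the assumption that for all $k$, $\chi^{(k)}(t \mid \mathcal{H}_t^{(k)}) = \chi(t \mid \mathcal{H}_t^n)$, which denotes the population intensity function for event recurrence to anyone in the population of $n$ subjects at any time $t$,

$$E[\Delta N_t \mid \mathcal{H}_t^n] = \chi(t \mid \mathcal{H}_t^n) \left[\sum_{k=1}^{n} \left\{Y^{(k)}(t) C^{(k)}(t)\right\}\right] \Delta t,$$

which implies

$$E\left[\left\{\Delta N_t - \chi(t \mid \mathcal{H}_t^n) \left[\sum_{k=1}^{n} \left\{Y^{(k)}(t) C^{(k)}(t)\right\}\right] \Delta t\right\} \,\Big|\, \mathcal{H}_t^n\right] = 0.$$

Considering $M(t) = N_t - \int_0^t \chi(u \mid \mathcal{H}_u^n) \left[\sum_{k=1}^{n} \left\{Y^{(k)}(u) C^{(k)}(u)\right\}\right] du$, it can be written that $E[dM(t)] = 0$, indicating the martingale property of $M(t)$, as explained in Andersen, Borgan, Gill and Keiding [14].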

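Now, because $\sum_{k=1}^{n} Y^{(k)}(t) C^{(k)}(t)$ and $\sum_{k=1}^{n} C^{(k)}(t)$ are both predictable at any time $t$, the integrals can be written in the following form:

$$\int_0^t \left[\frac{dN_u}{\sum_{k=1}^{n} C^{(k)}(u)}\right] = \int_0^t \left[\frac{\chi(u \mid \mathcal{H}_u^n) \left[\sum_{k=1}^{n} \left\{Y^{(k)}(u) C^{(k)}(u)\right\}\right] du}{\sum_{k=1}^{n} C^{(k)}(u)}\right] + \int_0^t \left[\frac{dM_u}{\sum_{k=1}^{n} C^{(k)}(u)}\right]$$

Or, equivalently,

$$\int_0^t \left[\frac{dN_u}{\sum_{k=1}^{n} C^{(k)}(u)}\right] = \int_0^t \chi(u \mid \mathcal{H}_u^n) \left[\frac{\sum_{k=1}^{n} \left\{Y^{(k)}(u) C^{(k)}(u)\right\}}{\sum_{k=1}^{n} C^{(k)}(u)}\right] du + \int_0^t \left[\frac{dM_u}{\sum_{k=1}^{n} C^{(k)}(u)}\right]$$

(following the same mathematical theory as that used in defining the Nelson-Aalen estimator [7]).

This implies that

$$\widehat{\int_0^t p_u\, \chi(u \mid \mathcal{H}_u^n)\,du} = \int_0^t \left[\frac{dN_u}{\sum_{k=1}^{n} C^{(k)}(u)}\right], \quad \text{where } p_u = \left[\frac{\sum_{k=1}^{n} \left\{Y^{(k)}(u) C^{(k)}(u)\right\}}{\sum_{k=1}^{n} C^{(k)}(u)}\right]$$

(also, naturally, $\widehat{\int_0^t \chi(u \mid \mathcal{H}_u^n)\,du} = \int_0^t \left[\frac{dN_u}{\sum_{k=1}^{n} \left\{Y^{(k)}(u) C^{(k)}(u)\right\}}\right]$).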

For the estimation of the rate of the continuing events, it is noteworthy that, since the recurrence of events together with the duration for which they are sustained is of interest here, the ending time of events may not be relevant in this context. This is because, if an event ends at any time $t$, then by the definition of the instantaneous intensity function for event ending, that event is no longer visible at time $t + \Delta t$. In contrast, events that occur at time $t$ and continue beyond time $t$ (i.e., do not also end at time $t$), and events that occurred before time $t$ and continue until time $t$ (i.e., end at some point beyond $t$), are all visible at time $t + \Delta t$.

Consequently, as the objective here is to find the rate of such continuing cases of events, the following is derived.

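Since $\Delta N_t^Z = \sum_{k=1}^{n} \left[1 - \left\{Z^{(k)}(t) - Z^{(k)}(t+\Delta t)\right\}\right] Z^{(k)}(t) C^{(k)}(t) = \sum_{k=1}^{n} \left\{\left(1 - \Delta Z^{(k)}(t)\right) Z^{(k)}(t) C^{(k)}(t)\right\}$ denotes the number of already occurred (i.e., existing) events continuing during the interval $[t, t+\Delta t)$, then, denoting by $\mathcal{H}_t^n$ the history by time $t$ of the process generating recurrent events for the population of $n$ subjects,

$$E[\Delta N_t^Z \mid \mathcal{H}_t^n] = E\left[\sum_{k=1}^{n} \left\{\left(1 - \Delta Z^{(k)}(t)\right) Z^{(k)}(t) C^{(k)}(t)\right\} \,\Big|\, \mathcal{H}_t^n\right] = \sum_{k=1}^{n} Z^{(k)}(t) C^{(k)}(t)\, E\left[\left\{1 - \Delta Z^{(k)}(t)\right\} \mid \mathcal{H}_t^{(k)}\right]$$

$$= \sum_{k=1}^{n} Z^{(k)}(t) C^{(k)}(t) \Pr\left(\left\{1 - \Delta Z^{(k)}(t)\right\} = 1 \mid \mathcal{H}_t^{(k)}\right) = \left[\sum_{k=1}^{n} \left\{1 - \chi^{Z(k)}(t \mid \mathcal{H}_t^{(k)})\, \Delta t\right\} \left\{Z^{(k)}(t) C^{(k)}(t)\right\}\right].$$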

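The above equation is derived by following equation (2), since for any $k$:

$$\Pr\left[1 - \left\{\Delta Z^{(k)}(t)\right\} = 1 \mid \mathcal{H}_t^{(k)}\right] = \Pr\left(\Delta Z^{(k)}(t) = 0 \mid \mathcal{H}_t^{(k)}\right) = \left\{1 - \chi^{Z(k)}(t \mid \mathcal{H}_t^{(k)})\, \Delta t\right\};$$

where $\chi^{Z(k)}(t \mid \mathcal{H}_t^{(k)})$ is the intensity function for the ending of an already occurred event to the $k$th subject at $t$ given $\mathcal{H}_t^{(k)}$, which denotes the history of the process generating recurrent events for the $k$th subject by time $t$.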

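However, under the assumption that for all $k$, $\chi^{Z(k)}(t \mid \mathcal{H}_t^{(k)}) = \chi^Z(t \mid \mathcal{H}_t^n)$, which denotes the population intensity function for the ending of an already occurred event to anyone in that population of $n$ subjects at any time $t$,

$$E[\Delta N_t^Z \mid \mathcal{H}_t^n] = \left\{1 - \chi^Z(t \mid \mathcal{H}_t^n)\, \Delta t\right\} \left[\sum_{k=1}^{n} \left\{Z^{(k)}(t) C^{(k)}(t)\right\}\right],$$

which implies

$$E\left[\left\{\Delta N_t^Z - \left\{1 - \chi^Z(t \mid \mathcal{H}_t^n)\, \Delta t\right\} \left\{\sum_{k=1}^{n} \left\{Z^{(k)}(t) C^{(k)}(t)\right\}\right\}\right\} \,\Big|\, \mathcal{H}_t^n\right] = 0.$$

Considering

$$M^Z(t) = N_t^Z - \int_0^t \left[\sum_{k=1}^{n} \left\{Z^{(k)}(u) C^{(k)}(u)\right\}\right] \left\{1 - \chi^Z(u \mid \mathcal{H}_u^n)\,du\right\} = N_t^Z - \left[\sum_{0 \le u \le t} \sum_{k=1}^{n} \left\{Z^{(k)}(u) C^{(k)}(u)\right\} - \int_0^t \chi^Z(u \mid \mathcal{H}_u^n) \left[\sum_{k=1}^{n} \left\{Z^{(k)}(u) C^{(k)}(u)\right\}\right] du\right],$$

it can be written that $E[dM^Z(t)] = 0$, indicating the martingale property of $M^Z(t)$, as explained in Andersen, Borgan, Gill and Keiding [14].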

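Now, once again, because $\sum_{k=1}^{n} Z^{(k)}(t) C^{(k)}(t)$ and $\sum_{k=1}^{n} C^{(k)}(t)$ are both predictable at time $t$, the integrals can be written in the following form:

$$\int_0^t \left[\frac{dN_u^Z}{\sum_{k=1}^{n} C^{(k)}(u)}\right] = \int_0^t \left[\frac{\sum_{k=1}^{n} Z^{(k)}(u) C^{(k)}(u)}{\sum_{k=1}^{n} C^{(k)}(u)}\right] \times \left\{1 - \chi^Z(u \mid \mathcal{H}_u^n)\,du\right\} + \int_0^t \left[\frac{dM_u^Z}{\sum_{k=1}^{n} C^{(k)}(u)}\right]$$

Or, equivalently,

$$\int_0^t \left[\frac{dN_u^Z}{\sum_{k=1}^{n} C^{(k)}(u)}\right] = \sum_{0 \le u \le t} \left[\frac{\sum_{k=1}^{n} Z^{(k)}(u) C^{(k)}(u)}{\sum_{k=1}^{n} C^{(k)}(u)}\right] - \int_0^t \chi^Z(u \mid \mathcal{H}_u^n) \left[\frac{\sum_{k=1}^{n} Z^{(k)}(u) C^{(k)}(u)}{\sum_{k=1}^{n} C^{(k)}(u)}\right] du + \int_0^t \left[\frac{dM_u^Z}{\sum_{k=1}^{n} C^{(k)}(u)}\right],$$

(following the same mathematical theory as that used in defining the Nelson-Aalen estimator [7]).

This implies that

$$\widehat{\int_0^t p_u^Z \left\{1 - \chi^Z(u \mid \mathcal{H}_u^n)\,du\right\}} = \int_0^t \left[\frac{dN_u^Z}{\sum_{k=1}^{n} C^{(k)}(u)}\right], \quad \text{where } p_u^Z = \left[\frac{\sum_{k=1}^{n} Z^{(k)}(u) C^{(k)}(u)}{\sum_{k=1}^{n} C^{(k)}(u)}\right]$$

(also, naturally, $\widehat{\int_0^t \left\{1 - \chi^Z(u \mid \mathcal{H}_u^n)\,du\right\}} = \int_0^t \left[\frac{dN_u^Z}{\sum_{k=1}^{n} Z^{(k)}(u) C^{(k)}(u)}\right]$).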


References

  • 1. Lawless J.F. Regression models for Poisson process data. J. Am. Stat. Assoc. 1987;82:808–815.
  • 2. Lin D.Y., Wei L.J., Yang I., Ying Z. Semiparametric regression for the mean and rate function of recurrent events. J. Roy. Stat. Soc. B. 2000;62(part 4):711–730.
  • 3. Lin D.Y., Wei L.J., Ying Z. Semiparametric transformation models for point processes. J. Am. Stat. Assoc. 2001;96:620–628. 454.
  • 4. Miloslavsky M., Keles S., van der Laan M.J., Butler S. Recurrent events analysis in the presence of time-dependent covariates and dependent censoring. J. Roy. Stat. Soc. B. 2004;66(part 1):239–257.
  • 5. Andersen P.K., Gill R.D. Cox's regression model for counting processes, A large sample study. Ann. Stat. 1982;10:1100–1120.
  • 6. Cook R.J., Lawless J.F. Springer; 2007. The Statistical Analysis of Recurrent Events. pp. 175–176, 121–160, 216–218, 232–236, 10, 68–69, 68.
  • 7. Aalen O.O. vol. 2. Springer-Verlag; New York: 1978. Nonparametric Inference for a Family of Counting Process, Springer Lecture Notes in Statistics; pp. 1–25.
  • 8. Cook R.J., Lawless J.F., Lakhal-Chaieb L., Lee K. Robust estimation of mean functions and treatment effects for recurrent event under event dependent censoring and termination: application to skeletal complications in cancer metastatic to bone. J. Am. Stat. Assoc. 2009;485:60–75.
  • 9. Cox D.R. Regression models and life-tables (with discussion). J. Roy. Stat. Soc. B. 1972;34:187–220.
  • 10. Prentice R.L., Williams B.J., Peterson A.V. On the regression analysis of multivariate failure time data. Biometrika. 1981;68:373–379.
  • 11. Wei L.J., Lin D.Y., Weissfeld L. Regression analysis of multivariate incomplete failure time data by modeling marginal distributions. J. Am. Stat. Assoc. 1989;84:1065–1073.
  • 12. Metcalfe C., Thompson S.G., White I.R. Analyzing the duration of recurrent events in clinical trials: a comparison of approaches using data from the UK700 trial of psychiatric case management. Contemp. Clin. Trials. 2005;26(4):443–458. doi: 10.1016/j.cct.2005.04.005.
  • 13. Joan Hu X., Lorenzi M., Spinelli J.J., Celes Ying S., McBride M.L. Analysis of recurrent events with non-negligible event duration, with application to assessing hospital utilization. Lifetime Data Anal. 2011;17(Issue 2):215–233. doi: 10.1007/s10985-010-9183-8.
  • 14. Andersen P.K., Borgan O., Gill R.D., Keiding N. Springer-Verlag; New York: 1993. Statistical Modeling Based on Counting Processes; pp. 48–56.


Articles from Contemporary Clinical Trials Communications are provided here courtesy of Elsevier
