Design and analysis of nested case-control studies for recurrent events subject to a terminal event

Ina Jazić; Sebastien Haneuse; Benjamin French; Gaëtan MacGrogan; Virginie Rondeau

doi:10.1002/sim.8302

Author manuscript; available in PMC: 2020 Sep 30.

Published in final edited form as: Stat Med. 2019 Jul 9;38(22):4348–4362. doi: 10.1002/sim.8302


PMCID: PMC7423396 NIHMSID: NIHMS1036451 PMID: 31290191

Abstract

The process by which patients experience a series of recurrent events, such as hospitalizations, may be subject to death. In cohort studies, one strategy for analyzing such data is to fit a joint frailty model for the intensities of the recurrent event and death, which estimates covariate effects on the two event types while accounting for their dependence. When certain covariates are difficult to obtain, however, researchers may only have the resources to sub-sample patients on whom to collect complete data: one way is using the nested case-control (NCC) design, in which risk set sampling is performed based on a single outcome. We develop a general framework for the design of NCC studies in the presence of recurrent and terminal events and propose estimation and inference for a joint frailty model for recurrence and death using data arising from such studies. We propose a maximum weighted penalized likelihood approach using flexible spline models for the baseline intensity functions. Two standard error estimators are proposed: a sandwich estimator and a perturbation resampling procedure. We investigate operating characteristics of our estimators as well as design considerations via a simulation study and illustrate our methods using two studies: one on recurrent cardiac hospitalizations in patients with heart failure and the other on local recurrence and metastasis in patients with breast cancer.

Keywords: joint frailty models, penalized likelihood, nested case-control studies, recurrent events

1 |. INTRODUCTION

In many clinical and epidemiological settings, patients can experience a series of repeated, or recurrent, events, such as hospitalizations, tumor recurrences, cardiovascular events, or infections. A variety of statistical methods have been developed to analyze recurrent event data.¹ Frequently, though, death can terminate the recurrent event process in an informative fashion (e.g., multiple hospitalizations may be associated with an increased risk of death). Two broad classes of models have been proposed for this setting: models for the marginal rate or mean function of recurrent events, accounting for the dependent terminal event^2–5; and joint models that simultaneously consider both recurrent and terminal events.^6–12 These joint models typically structure dependence among the recurrent and terminal events with a frailty – a random effect often considered to follow a parametric distribution across individuals. Such models could be based on the calendar timescale,^8,9 in which interest lies in the time to each event from some origin time, or the gap timescale,^9,10 in which interest lies in the time intervals between events. In this manuscript, we focus on the model from Rondeau et al.,⁹ which adopts the joint frailty model formulation from Liu et al.⁸ but models the baseline intensity functions with M-splines, thereby simplifying computation and enabling the estimation of intensities and absolute risks.¹³ Estimation and inference have been established for this model when outcome and covariate information is available on all study participants.

In univariate time-to-event settings, if one or more exposures of interest are difficult to obtain, researchers might employ the nested case-control (NCC) design to sub-sample patients on whom to collect complete data.¹⁴ Typically, this is implemented by performing risk set sampling on the event of interest, possibly matching on certain factors. When considering a single event, estimation and inference is most often performed for a Cox model.¹⁵ However, limited work has been performed on the use of NCC studies to address scientific questions involving more than one outcome. In particular, no work has been performed on NCC studies in the context of recurrent events with a dependent terminal event, either in terms of design or analysis. In this manuscript, we propose novel designs for NCC studies in this context that take advantage of the multiplicity of the recurrent events, and develop estimation and inference with respect to the above joint frailty model.

We illustrate our methods using two datasets. The first is a subset of patients from the Penn Heart Failure Study (PHFS) consisting of patients referred to an outpatient heart failure (HF) specialty clinic at three sites in the United States, followed up for recurrent cardiac hospitalizations and a composite terminal event.^16,17 The second is a cohort of patients who were diagnosed with primary operable invasive breast carcinoma from Institut Bergonié, a comprehensive cancer center serving southwestern France. These patients were followed up for local recurrence, metastasis, and death.^12,13,18

The remainder of the manuscript is organized as follows. Section 2 describes the joint frailty model for recurrent and terminal events of Rondeau et al.⁹ In Section 3, we describe NCC studies and propose a general framework for their design in the context of recurrent events that are subject to a terminal event. Additionally, we propose methods for estimation and inference with respect to a joint frailty model for recurrent events and a terminal event to analyze data from such a study. We perform estimation via maximum penalized weighted likelihood, and consider both a sandwich estimator and a perturbation resampling procedure for standard error estimation. In Section 4, we evaluate small-sample characteristics of our proposed estimator through a simulation study based on the aforementioned PHFS data. Finally, in Section 5, these methods are illustrated using the two datasets described above, and we conclude with a discussion in Section 6.

2 |. JOINT FRAILTY MODEL FOR RECURRENT AND TERMINAL EVENTS

Let i = 1,…,N denote an individual in a full cohort. Adopting the notation from Rondeau et al.,⁹ for individual i, let C_i be the censoring time and D_i be the terminal event time. X_ij is the j^th recurrence time from some well-defined “origin” (j = 1,…,n_i), where n_i is the number of recurrent events observed for individual i. For each recurrence time, let T_ij = min(X_ij, C_i, D_i) be the j^th follow-up time, and let δ_ij = I{T_ij = X_ij} be the indicator of whether X_ij was observed. Analogously, let $T_{i}^{*} = \min (C_{i}, D_{i})$ be the last follow-up time for individual i, where $δ_{i}^{*} = I \{T_{i}^{*} = D_{i}\}$ indicates whether the terminal event was observed. Let $N_{i}^{R *} (t)$ be the true number of recurrent events in (0,t], while $N_{i}^{R} (t) = N_{i}^{R *} (\min \{T_{i}^{*}, t\})$ is the observed number of recurrent events in that interval. Analogously, let $N_{i}^{D *} (t) = I (D_{i} \leq t)$ indicate whether the terminal event has occurred in (0,t], while $N_{i}^{D} (t) = I (T_{i}^{*} \leq t, δ_{i}^{*} = 1)$ is the observed indicator for the terminal event in that interval. Additionally, let $Y_{i} (t) = I (T_{i}^{*} \geq t)$ be the at-risk process, and let $Z_{i}^{R} (t)$ and $Z_{i}^{D} (t)$ be the covariate processes for the recurrent and terminal events, respectively, which may be time-independent or external time-dependent covariates.^9,19 Let $D = \{T_{i j}, T_{i}^{*}, δ_{i j}, δ_{i}^{*}, Z_{i}^{R} (u), Z_{i}^{D} (u); 0 \leq u \leq t; j = 1, \dots, n_{i}; i = 1, \dots, N\}$ denote the complete data available on all N individuals in the study cohort. Then, one can define $F_{t} = σ \{Y_{i} (u), N_{i}^{R} (u), N_{i}^{D} (u), Z_{i}^{R} (u), Z_{i}^{D} (u), 0 \leq u \leq t, ω_{i}, i = 1, \dots, N\}$ to be the σ-algebra generated from the entire observed data history, as well as the unobserved frailty ω_i. The joint model can be constructed under the following assumptions, from Liu et al.⁸ and Rondeau et al.⁹:

1. Continuous recurrent, terminating, and censoring processes are assumed such that recurrent events and death cannot occur at the same time, adopting the convention that death happens first in some interval [t, t + dt).
2. $N_{i}^{R *} (t)$ is constant after time D_i but can increase after C_i; in other words, death precludes the occurrence of new recurrent events, but censoring does not.
3. Y_i(t)r_i(t) is defined as the intensity of the recurrent event process at time t conditional on the covariates and the frailty, such that
$r_{i} (t) d t = P (d N_{i}^{R *} (t) = 1 ∣ Z_{i}^{R} (t), ω_{i}, D_{i} \geq t)$
4. Likewise, Y_i(t)λ_i(t) is defined as the intensity of the terminal event process at time t conditional on the covariates and the frailty, such that
$λ_{i} (t) d t = P (d N_{i}^{D *} (t) = 1 ∣ Z_{i}^{D} (t), ω_{i}, D_{i} \geq t)$
5. Censoring is noninformative (i.e., C_i is conditionally independent of $N_{i}^{R *} (t)$ given $Z_{i}^{R} (t)$ for all t and is conditionally independent of $N_{i}^{D *} (t)$ given $Z_{i}^{D} (t)$ for all t) and in particular does not depend on the frailty ω_i.

Then, in the calendar timescale, the joint model for the intensity functions for the recurrent and terminal events can be written as:

r_{i} (t ∣ ω_{i}) = ω_{i} r_{0} (t) \exp \{β_{1}^{'} Z_{i}^{R} (t)\} = ω_{i} r_{i} (t)

λ_{i} (t ∣ ω_{i}) = ω_{i}^{α} λ_{0} (t) \exp \{β_{2}^{'} Z_{i}^{D} (t)\} = ω_{i}^{α} λ_{i} (t)

where r₀(·) and λ₀(·) are baseline intensity functions. The two intensity functions share a common patient-specific frailty ω, which accounts for residual dependence within and among the recurrent and terminal events unaccounted for by the covariates in the model. Analyses typically adopt the assumption that the frailties for each patient are independent draws from a parametric distribution, such as a log-Normal distribution, such that log ω has mean 0 and variance σ², or a Gamma distribution with mean 1 and variance θ. The parameter α allows for the frailty to act differently on the terminal event than on the recurrent events. If α = 1, the frailty has the same impact on the intensities of the recurrent and terminal events, while α = 0 indicates that the terminal event is independent of the recurrent events, and may be analyzed separately. Note that the frailty also serves to encode the impact of the number of recurrent events, an internal time-dependent covariate, on the terminal event process.^9,13 While we focus on the calendar timescale in this manuscript, this model could also be adapted to the gap timescale by replacing T_ij with S_ij = T_ij − T_i(j−1) in the above model.⁹
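To make the role of α concrete, the following minimal Python sketch evaluates the two conditional intensities for a single individual with time-fixed covariates. The function name `joint_intensities` and the constant baseline intensities in the usage note are illustrative assumptions, not part of the model fitting described in this manuscript.

```python
import math

def joint_intensities(t, omega, alpha, r0, lam0, beta1, z_r, beta2, z_d):
    """Return (r_i(t | omega), lambda_i(t | omega)) under the joint frailty model.

    r0 and lam0 are callables giving the baseline intensities at time t;
    beta1/z_r and beta2/z_d are coefficient and covariate sequences.
    """
    lin_r = sum(b * z for b, z in zip(beta1, z_r))
    lin_d = sum(b * z for b, z in zip(beta2, z_d))
    recurrent = omega * r0(t) * math.exp(lin_r)
    terminal = (omega ** alpha) * lam0(t) * math.exp(lin_d)
    return recurrent, terminal
```

With hypothetical constant baselines such as `r0 = lambda t: 0.5` and `lam0 = lambda t: 0.1`, setting `alpha = 0` makes the terminal intensity identical for every value of `omega`, which is the independence case described above.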

In the complete data setting, estimation for this joint frailty model has been developed using the EM algorithm, involving the Breslow estimator for the baseline intensity functions,^8,10 as well as direct maximization of the observed log-likelihood under parametric and flexible modeling of the baseline intensity functions⁹ – in this manuscript, we focus on the latter strategy.

3 |. FITTING THE JOINT FRAILTY MODEL TO DATA FROM NESTED CASE-CONTROL STUDIES

3.1 |. Nested case-control studies: univariate case

In this article, we develop methods for performing estimation and inference for the joint frailty model for recurrent events and a terminal event in contexts where complete data are not available – for example, it may be expensive, time-consuming, or otherwise infeasible to collect certain exposure or covariate measures on the full cohort. The NCC study design is a frequently used, cost-efficient outcome-dependent sampling design for univariate time-to-event outcomes in this setting,¹⁴ the goal of which is to collect such exposure or covariate measurements on a subset of individuals. Operationally, the design first identifies all cases (i.e., individuals who experience the event of interest). For each case, m controls are randomly sampled without replacement from the risk set formed at the time of the event. The exposure or covariate measures of interest are then ascertained for the case and each selected control. Ultimately, measurements are collected on all cases and only a subset of controls, who may be expected to provide less information about parameters of interest. Since the sampling of controls across different risk sets is independent, the same patient may be sampled as a control for more than one case or may become a case after having been previously selected as a control. Extensions of the basic design permit matching to facilitate control of confounding²⁰ or the improvement of efficiency through counter-matching.²¹ When scientific interest lies in performing estimation and inference with respect to a univariate Cox model for the event of interest, one can analyze data from an NCC design using either a modification of the usual partial likelihood¹⁵ or inverse probability weighting.²²
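The risk set sampling step can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions (no matching, a single event per individual, and risk sets defined by follow-up time alone); the function name `ncc_sample` and its arguments are hypothetical.

```python
import random

def ncc_sample(follow_up, is_case, m, rng):
    """Sample up to m controls per case from the risk set at the case's event time.

    follow_up: dict mapping individual id -> observed follow-up time
    is_case:   dict mapping individual id -> True if the event was observed
    Returns a dict mapping each case id to the set of selected control ids.
    """
    selected = {}
    for j in sorted(follow_up):
        if not is_case[j]:
            continue
        # Individuals still under follow-up at the case's event time,
        # excluding the case itself; later cases remain eligible.
        eligible = sorted(k for k in follow_up
                          if k != j and follow_up[k] >= follow_up[j])
        selected[j] = set(rng.sample(eligible, min(m, len(eligible))))
    return selected
```

Because each risk set is sampled independently, the same individual can appear in the selected set of several cases, and an individual who later becomes a case can first be drawn as a control.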

In previous work, existing NCC studies have been repurposed to learn about an outcome other than the one used to define cases,^23–25 but the resulting analyses have remained within the univariate survival framework. NCC studies have, however, not been explicitly extended to the case where scientific interest lies in recurrent events, particularly in the presence of a terminal event. In this manuscript, we investigate the unique design considerations that stem from the structure of these failure times, and propose an approach to analyzing data arising from such a study – specifically, fitting a joint frailty model for recurrent events and death.

3.2 |. Design considerations in the presence of recurrent and terminal events

Consider the setting where subjects might experience up to K recurrent events (that is, max_i n_i = K). Then, each individual i is a member of one of 2(K +1) classes based on his/her event history, denoted C_rd where r ϵ {0,…,K} (indicating the number of recurrent events the individual has experienced) and d ϵ {0,1} (indicating whether the individual has experienced the terminal event). This event history, and thus class membership, which we refer to as C_i(t), is a function of time. For example, an individual who experiences two recurrent events at times T_i1 and T_i2 and dies at time $T_{i}^{*}$ has the following class membership function:

C_{i} (t) = \begin{cases} C_{00} & \text{for } t < T_{i 1} \\ C_{10} & \text{for } T_{i 1} \leq t < T_{i 2} \\ C_{20} & \text{for } T_{i 2} \leq t < T_{i}^{*} \\ C_{21} & \text{for } t \geq T_{i}^{*} \end{cases}
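As an illustration, the class membership function can be computed directly from an individual's observed event times. The helper below is a hypothetical sketch; it returns the pair (r, d) indexing the class C_rd and assumes that no recurrent event is recorded after the terminal event.

```python
def class_membership(t, recurrent_times, terminal_time=None):
    """Return (r, d) such that C_i(t) = C_rd.

    recurrent_times: observed recurrent event times for the individual
    terminal_time:   observed terminal event time, or None if censored
    """
    r = sum(1 for x in recurrent_times if x <= t)
    d = 1 if terminal_time is not None and terminal_time <= t else 0
    return r, d
```

For an individual with recurrent events at times 2 and 5 and death at time 9, the function returns (0, 0) before time 2, (1, 0) on [2, 5), (2, 0) on [5, 9), and (2, 1) from time 9 onward.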

We propose that C_i(t) be used as the basis for a general class of NCC designs for studies of recurrent events that are subject to a terminal event. Specifically, a design may be defined based on which values of the class membership function C_i(t) designate case-defining or index events. Let $T_{i}^{s}$ denote the first time at which conditions for being an index case are met for an individual i. If $T_{i}^{s}$ exists, then individual i is selected into the study as a case at time $T_{i}^{s}$. Note that, below, as in standard applications of the NCC design, all individuals who have experienced the index event are selected into the NCC study. Moreover, it is assumed that information about any non-index events is recorded and can be accessed for all individuals selected by the design (i.e. the cases and selected controls). For each design, complete data are available on only the sub-sample of selected individuals, $D_{NCC}$.

In this setting, there is substantial flexibility in how one defines the index event (in contrast to univariate settings, where there is only one possible index event). To this end, we propose five designs that we consider at greater length in this manuscript: (1) Terminal, (2) Recurrent I, (3) Composite endpoint (CEP) I, (4) Recurrent II, and (5) CEP II. In design 1, any instance of the terminal event is considered to be the index event, where $T_{i}^{s} = \min (t : C_{i} (t) = C_{r 1}, r \in \{0, \dots, K\})$. Analogously, in design 2, the first instance of the recurrent event is considered to be the index event, such that $T_{i}^{s} = \min (t : C_{i} (t) = C_{10})$. We also explore the use of a composite endpoint, in which the first of a recurrent or terminal event for an individual is considered to be the index event. NCC studies have previously used composite endpoints that do not involve multiple recurrent events.^26–28 Design 3 incorporates the first recurrent event into a composite endpoint with death, such that all individuals are cases except for those who are censored for both events – that is, $T_{i}^{s} = \min (t : C_{i} (t) = C_{10} \text{ or } C_{i} (t) = C_{01})$.

We also propose novel designs that exploit the potential multiplicity in the recurrent events – in a resource-limited setting, it may be advisable to prioritize the selection of cases that provide more information with respect to the frailty-associated parameters θ and α. The Recurrent II design is analogous to the Recurrent I design, but the index event is the second instance of the recurrent event – in other words, $T_{i}^{s} = \min (t : C_{i} (t) = C_{20})$. Similarly, the CEP II design is analogous to the CEP I design, but incorporates the second recurrent event into a composite endpoint with death, such that $T_{i}^{s} = \min (t : C_{i} (t) = C_{20} \text{ or } C_{i} (t) = C_{r 1}, r \in \{0, 1\})$. Note that this is far from an exhaustive set of designs: for instance, the Recurrent II and CEP II designs may be easily extended. A design using the j^th recurrent event as the index event uses $T_{i}^{s} = \min (t : C_{i} (t) = C_{j 0})$, while a design based on the composite endpoint of the j^th recurrent event and death uses $T_{i}^{s} = \min (t : C_{i} (t) = C_{j 0} \text{ or } C_{i} (t) = C_{r 1}, r \in \{0, \dots, j - 1\})$.
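The index times for the five designs can be written compactly once an individual's observed event times are known. The sketch below is illustrative only: the design labels and the function name `index_time` are hypothetical, and it relies on the fact that no recurrent event can follow the terminal event, so the earliest qualifying time is a simple minimum.

```python
def index_time(design, recurrent_times, terminal_time=None):
    """Return T_i^s for one individual, or None if no index event is observed.

    design is one of: "terminal", "recurrent_1", "cep_1", "recurrent_2", "cep_2".
    """
    rec = sorted(recurrent_times)
    first = rec[0] if len(rec) >= 1 else None
    second = rec[1] if len(rec) >= 2 else None
    candidates = {
        "terminal": [terminal_time],
        "recurrent_1": [first],
        "cep_1": [first, terminal_time],
        "recurrent_2": [second],
        "cep_2": [second, terminal_time],
    }[design]
    observed = [c for c in candidates if c is not None]
    return min(observed) if observed else None
```

For example, an individual with one recurrent event at time 3 and death at time 7 is not a case under the Recurrent II design but is a case at time 7 under the CEP II design.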

Sample schematics for the five designs we focus on are displayed in Figure 1. Each graphic displays the length of follow-up and the timing of recurrent and terminal events experienced by members of the same hypothetical cohort. The schematics differ on which events are considered to be index events, which is dictated by the design. For example, the Terminal design designates the terminal event as the index event; consequently, each case of the terminal event is selected into the study, and generates the risk set marked on the graphic with a dashed vertical line, from which one (circled) control is also selected into the study. The symbols at the right of each graphic indicate which individuals have been selected into the NCC study: in other words, the individuals on which investigators have access to difficult-to-obtain exposure or risk factor information.

3.3 |. Penalized weighted likelihood for the joint frailty model

Let ξ = (r₀(·),λ₀(·),β₁,β₂,θ,α) denote the collection of unknown parameters from the joint frailty model in Section 2. Throughout the construction of the likelihood, it is assumed that all individuals are independent and that event times for a particular individual are independent conditional on the frailty and the covariates. Drawing on the notation established in Section 2, the likelihood contribution from individual i conditional on the frailty using the calendar timescale can be written as⁹:

L_{i} (ξ ∣ ω) = \exp \{- ω_{i} \sum_{j = 1}^{n_{i}} \int_{T_{i (j - 1)}}^{T_{i j}} d R_{i} (s)\} [\prod_{j = 1}^{n_{i}} {[ω_{i} d R_{i} (T_{i j})]}^{δ_{i j}}] \exp \{- ω_{i}^{α} \int_{0}^{T_{i}^{*}} d Λ_{i} (s)\} {[ω_{i}^{α} d Λ_{i} (T_{i}^{*})]}^{δ_{i}^{*}}

The marginal observed log-likelihood can be found by integrating over the frailty. For example, when ω follows a log-Normal distribution such that log ω has mean 0 and variance σ², we have:

l (ξ) = \log \prod_{i = 1}^{N} \int_{0}^{\infty} L_{i} (ξ ∣ ω) f (ω) d ω = \sum_{i = 1}^{N} [\sum_{j = 1}^{n_{i}} (δ_{i j} \log d R_{i} (T_{i j})) + δ_{i}^{*} \log d Λ_{i} (T_{i}^{*}) - \log (σ \sqrt{2 π}) + \log \int_{0}^{\infty} ω^{N_{i}^{R} (T_{i}^{*}) + α δ_{i}^{*} - 1} \exp \{- ω \sum_{j = 1}^{n_{i}} \int_{T_{i (j - 1)}}^{T_{i j}} d R_{i} (s) - ω^{α} \int_{0}^{T_{i}^{*}} d Λ_{i} (s) - \frac{{(\ln ω)}^{2}}{2 σ^{2}}\} d ω]
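For readers who wish to verify the exponent of ω and the normalizing term in the expression above, the intermediate step can be sketched as follows. It uses only the standard log-Normal density and the conditional likelihood contribution already given, together with the identity that the sum of the δ_ij equals the observed number of recurrent events, N_i^R(T_i^*).

```latex
% Log-Normal density of the frailty (log omega has mean 0, variance sigma^2)
f(\omega) = \frac{1}{\omega \sigma \sqrt{2\pi}}
            \exp\left\{ -\frac{(\ln \omega)^{2}}{2\sigma^{2}} \right\}

% Collecting powers of omega in L_i(xi | omega) f(omega),
% with \sum_j \delta_{ij} = N_i^R(T_i^*):
L_{i}(\xi \mid \omega) f(\omega)
  = \left[ \prod_{j=1}^{n_{i}} dR_{i}(T_{ij})^{\delta_{ij}} \right]
    d\Lambda_{i}(T_{i}^{*})^{\delta_{i}^{*}}
    \frac{1}{\sigma \sqrt{2\pi}}
    \, \omega^{N_{i}^{R}(T_{i}^{*}) + \alpha \delta_{i}^{*} - 1}
    \exp\left\{ -\omega \sum_{j=1}^{n_{i}} \int_{T_{i(j-1)}}^{T_{ij}} dR_{i}(s)
                - \omega^{\alpha} \int_{0}^{T_{i}^{*}} d\Lambda_{i}(s)
                - \frac{(\ln \omega)^{2}}{2\sigma^{2}} \right\}
```

Taking logarithms and integrating over ω yields the bracketed summand: the factors free of ω give the first three terms, and the remaining integral has no closed form and is evaluated numerically.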

As in Rondeau et al.,⁹ we model the baseline intensity functions with cubic M-splines,²⁹ a variant of B-splines that is useful for modeling baseline intensity functions since they are guaranteed to be non-negative. Moreover, they integrate to I-splines, which allow for intuitive representation of the cumulative baseline intensity function.³⁰ Below are the models for the baseline intensity functions:

r_{0} (t) = \sum_{k = 1}^{m_{r}} M_{r k} (t) η_{r k}

λ_{0} (t) = \sum_{k = 1}^{m_{λ}} M_{λ k} (t) η_{λ k}

where M_rk and M_λk are the k^th basis functions of the cubic M-spline, m_r and m_λ are the number of spline basis functions used, and η_r >0 and η_λ >0 are spline coefficient vectors of length m_r and m_λ, respectively.

To induce smoothness in the estimated baseline intensity functions, the observed log-likelihood is penalized by a term quantifying their roughness, based on the second derivative of the baseline intensity functions^9,30:

l^{p} (ξ) = l (ξ) - κ_{1} \int_{0}^{\infty} r_{0}^{″ 2} (t) d t - κ_{2} \int_{0}^{\infty} λ_{0}^{″ 2} (t) d t

where κ₁, κ₂ ≥ 0 are smoothing parameters. Given complete data on all N individuals in the study cohort, $D$, one could proceed with estimation and inference by maximizing the penalized observed log-likelihood. As implemented in the R package frailtypack,^31,32 the observed log-likelihood is maximized using a modified robust Marquardt optimization algorithm.³³

Suppose that complete data are only available on a sub-sample of patients selected via an NCC design, $D_{NCC}$. Let $δ_{i}^{*}$ be an indicator of whether individual i was observed to experience the index event (i.e. an individual with defined $T_{i}^{s}$) and $R_{i}$ be the risk set formed at the observed event time $T_{i}^{s}$ if $δ_{i}^{*} = 1$. Furthermore, let V_0ij be a binary variable indicating whether individual i was selected as a control from $R_{j}$. As described by Samuelsen²² and Cai and Zheng,³⁴ D_NCC consists of all individuals for whom $V_{i} = 1$, where $V_{i} = δ_{i}^{*} + (1 - δ_{i}^{*}) V_{0 i}$ and $V_{0 i} = 1 - \prod_{j : i \in R_{j}} (1 - δ_{j}^{*} V_{0 i j})$ is an indicator that the individual was selected as a control from at least one of the risk sets formed by the observed index events. Furthermore, the corresponding probability of being selected by the NCC design over the course of the observed follow-up period is $π_{i} = P (V_{i} = 1 ∣ D) = δ_{i}^{*} + (1 - δ_{i}^{*}) π_{0 i}$, where

π_{0 i} = 1 - \prod_{j : i \in R_{j}} (1 - δ_{j}^{*} [\frac{m}{‖ R_{j} ‖ - 1}]) .

(1)

Intuitively, the expression above involves the computation of the probability that individual i is not selected from any risk set from which he or she is eligible to be selected as a control. Each term in the product represents the probability that individual i is not selected from the particular risk set formed by some case j (i.e., that individual i is not one of the m selected controls); these probabilities can simply be multiplied because the selection of controls from each risk set is independent. Note that $‖ R_{j} ‖ - 1$, one less than the cardinality of $R_{j}$, is used in the denominator above, since the case used to define a risk set is not eligible to be selected as a control from that risk set. If matching is performed on additional covariates Z, the expression for the selection probability of a control remains the same if $R_{j}$ is redefined to be the set of individuals that have Z = Z_j and that are at risk for the index event at the observed event time for individual j if $δ_{j}^{*} = 1$.^35,36
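The selection probabilities in equation (1) are straightforward to compute once the risk sets are enumerated. The following Python sketch is a minimal illustration without matching; the function name `inclusion_probabilities` and its data layout are hypothetical, and the cap at 1 for risk sets with fewer than m eligible controls is an added assumption rather than part of equation (1).

```python
def inclusion_probabilities(risk_sets, cases, m=1):
    """Return a dict of pi_i for every individual appearing in any risk set.

    risk_sets: dict mapping each case j to the set of individuals at risk
               at T_j^s (the case itself included)
    cases:     set of individuals who experienced the index event
    """
    subjects = set(cases).union(*risk_sets.values())
    probs = {}
    for i in subjects:
        if i in cases:
            probs[i] = 1.0  # all cases are selected
            continue
        not_selected = 1.0
        for j, members in risk_sets.items():
            if i in members:
                # m controls drawn from the |R_j| - 1 non-case members
                not_selected *= 1.0 - min(1.0, m / (len(members) - 1))
        probs[i] = 1.0 - not_selected
    return probs
```

As a small check, consider four individuals a, b, c, d in which a is a case with risk set {a, b, c, d} and b is a later case with risk set {b, c, d}. With m = 1, individual c has selection probability 1 − (1 − 1/3)(1 − 1/2) = 2/3, so the corresponding inverse probability weight is 1.5.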

For a given NCC design, an estimate of ξ, denoted $\hat{ξ}$, can be obtained using the same algorithm as for the complete data; here, the maximized function is the inverse-probability-weighted penalized log-likelihood consisting of contributions from individuals who have been selected into the NCC study:

l^{p w} (ξ) = \sum_{i = 1}^{N} V_{i} w_{i} l_{i} (ξ) - κ_{1} \int_{0}^{\infty} r_{0}^{″ 2} (t) d t - κ_{2} \int_{0}^{\infty} λ_{0}^{″ 2} (t) d t .

(2)

where $w_{i} = π_{i}^{- 1}$ and $l_{i} (ξ) = \log L_{i} (ξ)$. This estimator is asymptotically normal, as described in Appendix A of the Supporting Information. Note that weights are only applied to the unpenalized portion of the likelihood: the penalty term is a function of the estimated baseline intensity functions and cannot be decomposed into individual-level contributions.

3.4 |. Standard error estimation

We propose two standard error estimators: a conservative sandwich (CS) estimator and a perturbation resampling (PR) estimator. As outlined in detail in Appendix A of the Supporting Information, an estimator of the asymptotic variance of $\hat{ξ}$ can be written as the sum of two terms: the first term corresponds to the appropriate sandwich estimator for the complete data, while the other term accounts for the NCC design, incorporating the covariance of all pairs of sampling indicators (V_i, V_j). One strategy, as in Saarela et al.²³ and Kim and Kaplan,²⁵ is to use a plug-in estimator that ignores the design component, specifically $\widetilde{Var} [\hat{ξ}] = N^{- 1} {({\hat{J}}^{p w})}^{- 1} {\hat{Γ}}_{I} {({\hat{J}}^{p w})}^{- 1}$, where

{\hat{J}}^{p w} = \frac{1}{N} \frac{\partial^{2}}{\partial ξ^{2}} l^{p w} (\hat{ξ}), {\hat{Γ}}_{I} = \frac{1}{N} \sum_{i = 1}^{N} U_{i}^{w} (\hat{ξ}) U_{i}^{w} {(\hat{ξ})}^{T} .

Here, $U_{i} (ξ) = \frac{\partial}{\partial ξ} l_{i} (ξ)$ is the score contribution of individual i. Since the sampling indicators V_i and V_j are negatively correlated if they correspond to members of the same risk set, $\widetilde{Var} [\hat{ξ}]$ may be expected to yield conservative standard error estimates, particularly when risk sets are small.

As an alternative to the sandwich estimator $\widetilde{Var} [\hat{ξ}]$, we propose a standard error estimator based on a perturbation resampling procedure developed for NCC studies.³⁴ By working with the exact weights under the sampling design, this procedure accounts for the effect of correlation among the sampling indicators on variance estimation. The algorithm proceeds by first generating $I = \{I_{a b} : a, b = 1, \dots, N\}$ as N² random draws from a distribution with $E [I_{a b}] = Var [I_{a b}] = 1$, such as an Exponential(1) distribution. Then, set

V_{i}^{*} = δ_{i}^{*} I_{i i} + (1 - δ_{i}^{*}) [1 - \prod_{l : i \in R_{l}} (1 - δ_{l}^{*} V_{0 i l} I_{i l})]

and

π_{i}^{*} = δ_{i}^{*} + (1 - δ_{i}^{*}) [1 - \exp \{- \sum_{l : i \in R_{l}} δ_{l}^{*} \frac{\sum_{k \in R_{l}} V_{0 k l} I_{k l}}{‖ R_{l} ‖ - 1}\}]

for each $i \in D_{NCC}$. Note that the perturbed control selection probability takes a Breslow-type form instead of the product-limit form of π_0i above. This is because the Breslow-type form is guaranteed to produce a probability between 0 and 1, while the perturbed analog of the product-limit form of the probability may exceed 1 if large values of $I_{k l}$ are drawn and $‖ R_{l} ‖$ is small. Note also that if cases and controls are additionally matched on covariates Z, then $R_{l}$ in the above expressions will be redefined to restrict to the set of individuals that have Z = Z_l and that are at risk for the index event at the observed event time $T_{l}^{s}$. The perturbed weights can be computed as $w_{i}^{*} = V_{i}^{*} / π_{i}^{*}$ and the following perturbed penalized log-likelihood can be written:

l^{p w *} (ξ) = \sum_{i = 1}^{N} V_{i} w_{i}^{*} l_{i} (ξ) - κ_{1} \int_{0}^{\infty} r_{0}^{″ 2} (t) d t - κ_{2} \int_{0}^{\infty} λ_{0}^{″ 2} (t) d t .

the maximizer of which we denote as ${\hat{ξ}}^{(1)}$. These steps are then repeated B times to give $\{{\hat{ξ}}^{(b)}, b = 1, \dots, B\}$, which can be used to approximate the sampling distribution of $\hat{ξ}$. Thus, an estimate of $Var [\hat{ξ}]$ is the (empirical) variance of the ${\hat{ξ}}^{(b)}$, excluding perturbations that did not converge.
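One perturbation of the weights can be sketched as follows. This is a minimal Python illustration of the expressions for the perturbed sampling indicator and selection probability given above, not the implementation used for the analyses; the function name `perturbed_weights`, the data layout, and the lazy generation of only those draws that are actually needed are assumptions made for brevity.

```python
import math
import random

def perturbed_weights(cases, risk_sets, selected, rng):
    """Return perturbed weights w_i^* = V_i^* / pi_i^* for sampled individuals.

    cases:     set of individuals who experienced the index event
    risk_sets: dict mapping case l -> set of individuals in R_l (case included)
    selected:  dict mapping case l -> set of controls sampled from R_l
    rng:       random.Random instance supplying Exponential(1) draws
    """
    sampled = set(cases).union(*selected.values())
    draws = {}

    def I(a, b):
        # Exponential(1) has mean 1 and variance 1; draw each I_ab once.
        if (a, b) not in draws:
            draws[(a, b)] = rng.expovariate(1.0)
        return draws[(a, b)]

    weights = {}
    for i in sorted(sampled):
        if i in cases:
            v_star, pi_star = I(i, i), 1.0
        else:
            prod, cum = 1.0, 0.0
            for l, members in risk_sets.items():
                if i not in members:
                    continue
                v_0il = 1.0 if i in selected[l] else 0.0
                prod *= 1.0 - v_0il * I(i, l)
                # Breslow-type increment for risk set R_l
                cum += sum(I(k, l) for k in sorted(selected[l])) / (len(members) - 1)
            v_star = 1.0 - prod
            pi_star = 1.0 - math.exp(-cum)
        weights[i] = v_star / pi_star
    return weights
```

Repeating this step B times with fresh draws, refitting the model under each set of perturbed weights, and taking the empirical variance of the resulting estimates gives the PR standard errors.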

4 |. SIMULATIONS

To evaluate the operating characteristics of the proposed methods, we conducted simulation studies based on data from the Penn Heart Failure Study (PHFS) referenced in the Introduction. The PHFS data are further introduced in Section 4.1 and a complete analysis can be found in Section 5.1. Generally, the approach for each of the simulations was to initially generate 10,000 “full cohorts” based on the PHFS data. From these, data that would be observed from a given NCC design were generated by mimicking the appropriate selection procedure.

4.1 |. Penn Heart Failure Study

The Penn Heart Failure Study is a prospective cohort of 2,136 patients referred to an outpatient heart failure (HF) specialty clinic at three sites in the United States, recruited between 2003 and 2012.^16,17 Clinical data and venous blood samples were collected at the time of study enrollment. Patients were followed up for a median of 5.4 years for multiple endpoints, including cardiovascular and non-cardiovascular hospitalization, ventricular assist device (VAD) placement, transplantation, and death. In both the simulation study and the data analysis presented in Section 5.1, we consider hypothetical studies in which scientific interest lies with the association between serum biomarker concentrations and the risk of cardiac hospitalization and a terminal event (defined as the composite endpoint of all-cause death, cardiac transplantation, or VAD placement). For the purposes of this analysis, we restricted the cohort to the 1,187 individuals with serum biomarker measurements who had a reduced ejection fraction.

After administratively censoring follow-up at 500 days, 178 patients experienced the terminal event during follow-up, while 463 patients experienced at least one cardiac hospitalization after study enrollment, for a total of 811 recurrent events. An average of 0.7 cardiac hospitalizations were experienced per individual, while an average of 1.8 cardiac hospitalizations occurred among individuals who experienced at least one recurrent event. The maximum number of observed cardiac hospitalizations was 8. More detailed information on the frequency of events, as well as complete covariate information broken down by outcome can be found in Appendix C of the Supporting Information.

4.2 |. Setup

We considered two sets of 10,000 full cohorts – in one, each cohort was of size N=1,187 (the same size as the PHFS data), while in the other, each cohort was of size N=10,000, a reasonable magnitude for large cohort studies (some, like the Nurses’ Health Study,³⁷ are even larger) and cohorts assembled from electronic medical records. For each cohort, individual-level covariate vectors were sampled with replacement, thereby preserving a reasonable joint distribution of the covariates among individuals in the simulated cohort. In other words, the covariates of any individual in any simulated cohort corresponded to the covariates of some individual in the original PHFS data. Then, new outcomes were simulated for these individuals, based on a fit of the joint frailty model to the real PHFS data that adjusts for baseline covariates (from Basuray et al.¹⁷: age, sex, race, site, ischemic-type heart failure, history of chronic kidney disease, history of hypertension) and the (log-transformed and centered) concentration of soluble Toll-like receptor-2 (ST2), under the gap timescale, using a log-normal frailty, with 8 internal knots. The results for this model fit can be found in Appendix C of the Supporting Information. Analogously to Rondeau et al.,⁹ new outcomes for each individual i in the dataset were simulated according to the following procedure:

1. Generate the frailty term log ω_i ∼ N(0,σ²), and compute linear predictors $\exp\{\log ω_{i} + β_{1}^{'} Z_{i}^{R}\}$ and $\exp\{α \log ω_{i} + β_{2}^{'} Z_{i}^{D}\}$ based on the individual’s covariates and the regression coefficients from the model fit to the real data.
2. Based on the estimated baseline intensity functions from the model fit to the real data, calculate r_i(t|ω_i) and λ_i(t|ω_i). Transform these into the corresponding survivor functions $S_{i}^{R}(t ∣ ω_{i})$ and $S_{i}^{D}(t ∣ ω_{i})$ for this individual.
3. Generate u_D ∼ Unif(0,1) and find t_D such that $S_{i}^{D}(t_{D} ∣ ω_{i}) = u_{D}$, rounded up to the nearest day. Let t_D be the time of death. If no such t_D exists, then the individual is censored for death at 500 days.
4. Generate u_R1 ∼ Unif(0,1), and find t_R1 such that $S_{i}^{R}(t_{R1} ∣ ω_{i}) = u_{R1}$, rounded up to the nearest day. Let gap time t_R1 be the time of the first recurrent event. If no such t_R1 exists, or t_R1 > t_D, then the individual does not experience any recurrent events.
5. If a recurrent event is recorded, then repeat the previous step as long as the sum of the generated gap times does not exceed t_D. That is, if $\sum_{m = 1}^{n_{i}} t_{Rm} < t_{D}$ but $\sum_{m = 1}^{n_{i} + 1} t_{Rm} > t_{D}$, then the individual experiences n_i recurrent events at times $(t_{R1}, t_{R1} + t_{R2}, \dots, \sum_{m = 1}^{n_{i}} t_{Rm})$.
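The procedure above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the baseline intensities are taken to be constant (so each survivor function can be inverted in closed form), whereas the simulations described here use spline-based baseline estimates from the PHFS fit. The function name, argument names, and all default values are hypothetical.

```python
import math
import random

def simulate_individual(rng, lin_pred_rec, lin_pred_death, sigma2, alpha,
                        base_rec=0.002, base_death=0.0005, max_days=500):
    """Simulate one individual's death time and recurrent event times (in days)."""
    # Step 1: log-normal frailty, log(omega) ~ N(0, sigma2).
    log_omega = rng.gauss(0.0, math.sqrt(sigma2))
    # Steps 1-2: conditional intensities (constant baselines are an assumption).
    rate_rec = base_rec * math.exp(log_omega + lin_pred_rec)
    rate_death = base_death * math.exp(alpha * log_omega + lin_pred_death)

    def draw_time(rate):
        # Solve S(t) = exp(-rate * t) = u for t, rounded up to the nearest day.
        u = 1.0 - rng.random()  # u lies in (0, 1], which avoids log(0)
        return max(1, math.ceil(-math.log(u) / rate))

    # Step 3: death time, administratively censored at max_days.
    t_death = draw_time(rate_death)
    died = t_death <= max_days
    follow_up = t_death if died else max_days

    # Steps 4-5: accumulate gap times until they exceed the end of follow-up.
    event_times, clock = [], 0
    while True:
        clock += draw_time(rate_rec)
        if clock > follow_up:
            break
        event_times.append(clock)
    return {"died": died, "follow_up": follow_up, "event_times": event_times}
```

Because every gap time is at least one day, the loop always terminates. Passing a seeded `random.Random` instance makes a simulated cohort reproducible.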

The simulated cohorts and the original PHFS data had similar event distributions and frequencies, on average. For each simulated cohort, under both initial sample sizes, we formed five NCC studies: one for each of the designs described in Section 3.2. The rates of the index event defined by each design in this cohort are 15.0% for the Terminal design, 39.0% for the Recurrent I design, 44.2% for the CEP I design, 15.9% for the Recurrent II design, and 26.1% for the CEP II design. For each case of the index event, m = 1 control was selected from the risk set formed at the time at which the event occurred.
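As an illustration of the control selection step, the following Python sketch draws m controls for each index case from the individuals still under observation at the case's event time. The risk set definition used here (everyone with follow-up extending to the event time, excluding the case) is a simplifying assumption and may differ in detail from the design-specific definitions in Section 3.2; the function and argument names are hypothetical.

```python
import random

def sample_ncc(index_times, exit_times, m=1, seed=0):
    """Risk-set sampling: draw m controls for each index case at its event time.

    index_times[i] is the index-event time of individual i (None if not a case);
    exit_times[i] is the end of follow-up (terminal event or censoring).
    Returns a list of (case, event_time, sampled_controls) tuples.
    """
    rng = random.Random(seed)
    cases = [(t, i) for i, t in enumerate(index_times) if t is not None]
    sampled_sets = []
    for t, case in sorted(cases):
        # Risk set: everyone still under observation at time t, excluding the case.
        at_risk = [j for j, exit_t in enumerate(exit_times)
                   if exit_t >= t and j != case]
        controls = rng.sample(at_risk, min(m, len(at_risk)))
        sampled_sets.append((case, t, controls))
    return sampled_sets
```

Note that an individual who later becomes a case remains eligible as a control at earlier event times, as is usual under risk set sampling.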

4.3 |. Analyses

For each of the full cohorts and NCC datasets, we obtained maximum penalized likelihood and maximum weighted penalized likelihood estimates of the unknown parameters. For selection of the smoothing parameters, we followed the procedure described in the simulation section of Rondeau et al.,⁹ in which cross-validation was performed on a shared frailty model to obtain k₁ and on a Cox model to obtain k₂. If the model did not converge under these initial values, we multiplied or divided each k by a factor of 10 (depending on whether the initial value was ≥ 100) until convergence occurred. (When convergence was achieved, we found that varying values of k had little impact on the regression parameters.) All models were fit on the gap timescale with 8 internal knots, matching the timescale under which the outcomes were simulated. The conservative sandwich estimator was computed for all settings, and derivatives used to compute this estimator were computed numerically. For the perturbation resampling-based estimator, we set B = 1000; due to the computational burden involved, we evaluated this estimator for the first 1000 iterations of the simulation and present it for selected settings. Throughout, to avoid a small number of iterations having an unduly large impact on the results, we removed all iterations with frailty parameter estimates >5 MAD (median absolute deviation) from the median estimate,³⁸ typically excluding 2–3% of the iterations.
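The outlier rule at the end of this paragraph can be sketched as follows. The sketch assumes the raw (unscaled) median absolute deviation; whether a consistency constant such as 1.4826 was applied is not stated here, and the function name is hypothetical.

```python
import statistics

def keep_within_mad(estimates, n_mad=5.0):
    """Return the indices of iterations within n_mad MADs of the median estimate."""
    med = statistics.median(estimates)
    # Unscaled median absolute deviation (an assumption; no consistency constant).
    mad = statistics.median([abs(x - med) for x in estimates])
    return [i for i, x in enumerate(estimates) if abs(x - med) <= n_mad * mad]
```

Applied to the vector of frailty parameter estimates across iterations, the returned indices identify the iterations retained for the summaries.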

4.4 |. Results

Table 1 presents results for the percent bias of the parameter estimates under the NCC designs. Since the goal of an NCC study is to recover estimates that one would have obtained in the full cohort study, we compare the average of the parameter estimates from all NCC studies under a particular design to the average parameter estimates from the cohort. Overall, one can see a reduction in the bias of the estimates of the regression and frailty parameters as N increases. The CEP I and CEP II designs, where the index event is a composite endpoint, have the best performance: they are nearly unbiased (within 1%) under both N = 1187 and N = 10,000. The Terminal design performs comparably to the Recurrent I design while marked bias remains for the Recurrent II design (based on the second recurrent event), even when N = 10,000. This relative ordering of performance is mirrored in the degree to which the standard error estimators approximate the empirical standard error, as seen in Table 2. The CS estimator estimates the empirical standard error well, especially for large N, although performance is poorer for α. Results in Appendix B of the Supporting Information suggest that the CS estimators are not consistently larger than the perturbation standard error estimators.
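For concreteness, the percent bias reported in Table 1 can be read as the following computation. This is a sketch based on the description above and the table caption; the function name and inputs are hypothetical.

```python
def percent_bias(ncc_estimates, cohort_estimates):
    """Percent bias of the mean NCC estimate relative to the mean cohort estimate."""
    ncc_mean = sum(ncc_estimates) / len(ncc_estimates)
    cohort_mean = sum(cohort_estimates) / len(cohort_estimates)
    return 100.0 * (ncc_mean - cohort_mean) / cohort_mean
```

A negative value indicates that, on average, the NCC estimate is smaller than the cohort estimate, assuming a positive cohort mean.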

TABLE 1.

Percent bias of the simulation estimates of the recurrent and terminal β parameters for log ST2, σ² and α, with respect to the average estimates from cohort simulations. Results are shown for the five NCC designs that vary by the index event.

	N = 1187				N = 10,000
	β_logST2				β_logST2
Design	Rec	Term	σ²	α	Rec	Term	σ²	α

Terminal	1.1	7.1	−4.2	7.9	0.2	1.2	−0.4	1.1
Rec I	−6.5	−3.9	−7.3	3.5	−2.3	−0.7	−1.9	2.0
CEP I	−0.1	−0.2	−0.7	0.0	0.0	0.0	−0.1	0.0
Rec II	−19.3	−14.5	−22.3	1.2	−11.8	−9.5	−12.3	1.1
CEP II	0.4	−0.4	−0.7	−0.7	0.1	0.0	−0.1	0.0


TABLE 2.

Simulation results for standard error estimation for the recurrent and terminal β parameters for log ST2, σ² and α. Shown are empirical standard errors (Emp) and conservative sandwich standard error estimates (CS) for the full cohort and the five NCC designs that vary by the index event.

		Rec β_logST2		Term β_logST2		σ²		α
	Design	Emp	CS	Emp	CS	Emp	CS	Emp	CS

N = 1187	Cohort	0.105	0.104	0.311	0.329	0.214	0.226	0.312	0.342
	Terminal	0.207	0.196	0.464	0.457	0.393	0.399	0.494	0.444
	Rec I	0.148	0.132	0.542	0.451	0.220	0.241	0.444	0.431
	CEP I	0.123	0.121	0.330	0.339	0.220	0.237	0.313	0.329
	Rec II	0.208	0.184	0.802	0.653	0.309	0.326	0.677	0.790
	CEP II	0.148	0.144	0.353	0.368	0.293	0.303	0.326	0.352

N = 10,000	Cohort	0.035	0.036	0.093	0.087	0.074	0.072	0.090	0.068
	Terminal	0.069	0.069	0.142	0.129	0.139	0.139	0.148	0.109
	Rec I	0.069	0.059	0.262	0.210	0.100	0.092	0.172	0.128
	CEP I	0.042	0.042	0.099	0.094	0.076	0.077	0.090	0.070
	Rec II	0.117	0.087	0.457	0.317	0.171	0.146	0.351	0.229
	CEP II	0.049	0.050	0.108	0.104	0.102	0.101	0.094	0.076


Table 3 compares the relative uncertainty of the estimates under different designs, which we define as the ratio of the empirical standard error of estimates from a certain NCC study design to the empirical standard error of estimates from the full cohort. The CEP I design achieves close to full efficiency, although it is also the design that contains the most individuals on average. However, the next most efficient design is the CEP II design, based on the composite endpoint of the second recurrent event and death – it is no less efficient than the Recurrent I design, which tends to have more individuals, on every parameter except σ². Note that the Terminal design estimates the terminal β more efficiently than the recurrent β, and α more efficiently than σ², while the opposite is true of the designs based exclusively on recurrent events. Analogous simulation results for the gamma frailty can be found in Appendix B of the Supporting Information – the same general conclusions hold.
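The relative uncertainty is a simple ratio and can be reproduced from the empirical standard errors in Table 2. The sketch below does this for two designs at N = 1187; the dictionary layout and names are our own. Because Table 2 reports rounded values, ratios recomputed this way can differ from Table 3 in the last digit for some entries.

```python
# Empirical standard errors for N = 1187, copied from Table 2.
EMP_SE = {
    "Cohort":   {"rec_beta": 0.105, "term_beta": 0.311, "sigma2": 0.214, "alpha": 0.312},
    "Terminal": {"rec_beta": 0.207, "term_beta": 0.464, "sigma2": 0.393, "alpha": 0.494},
    "Rec I":    {"rec_beta": 0.148, "term_beta": 0.542, "sigma2": 0.220, "alpha": 0.444},
}

def relative_uncertainty(design, reference="Cohort", table=EMP_SE):
    """Ratio of a design's empirical SE to the full-cohort empirical SE, per parameter."""
    return {p: table[design][p] / table[reference][p] for p in table[reference]}

for design in ("Terminal", "Rec I"):
    ratios = relative_uncertainty(design)
    print(design, {p: round(r, 2) for p, r in ratios.items()})
```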

TABLE 3.

Relative uncertainty, defined as the ratio of the (empirical) standard error for estimates based on a given NCC design to that of the estimates from an analysis of the full cohort, of simulation estimates of the recurrent and terminal β parameters for log ST2, σ² and α. Results are shown for the five NCC designs that vary by the index event, for N = 1187.

	β_logST2
Design	Rec	Term	σ²	α	# Obs

Terminal	1.97	1.49	1.84	1.58	378
Rec I	1.41	1.74	1.03	1.42	723
CEP I	1.18	1.06	1.03	1.00	829
Rec II	1.98	2.58	1.44	2.17	359
CEP II	1.41	1.14	1.37	1.04	581


Note that the Recurrent II design, based on the second recurrent event, has the worst performance in terms of bias and efficiency, as well as the greatest discrepancy between empirical standard error and the standard error estimates. One reason this occurs is that particular individuals may be excluded from the study by design – it is often the case that individuals who experience the terminal event early in follow-up have few or no opportunities to be selected into the NCC study. If they are selected, a very high weight would be placed on their log-likelihood contributions, leading to inefficiency and possible instability in estimation; if they are not selected, the exclusion of these highly informative individuals may bias the parameter estimates.

This same mechanism is in play, but to a lesser extent, in the Recurrent I design, based on the first recurrent event. This also explains why the Recurrent I design performs more poorly than the Terminal design: the early occurrence of a recurrent event in an individual does not preclude their membership in a risk set formed at the time of a terminal event. The individuals whose exclusion is of concern in the Recurrent II design are guaranteed to be included in the Terminal design, since they experience the terminal event. The same reasoning also explains the high-quality performance of the designs based on the composite endpoint. The issue of asymmetry between the two event types is avoided, with the additional benefit of gathering the individuals that are most informative for the estimation of the frailty parameters.

5 |. DATA APPLICATIONS

The ability to jointly analyze recurrent events and death is of interest in multiple therapeutic areas, two of which are cardiovascular disease and cancer. Towards this, we apply our proposed methods for NCC studies in this context to two datasets: the first, a subset of patients in the Penn Heart Failure Study (PHFS), and the second, a cohort of breast cancer patients from Institut Bergonié. The analysis strategy we present is similar for both datasets: we construct hypothetical NCC studies under each of the proposed designs and compare analyses from the appropriate weighted joint frailty models to those for the cohort.

To serve as a “gold-standard” analysis, for each dataset we fit a joint frailty model for recurrent and terminal events to the full cohort data using frailtypack, adjusted for the appropriate covariates. Complete information for both cohort fits can be found in Appendix C of the Supporting Information. Next, we simulated a single NCC study for each of the five designs described in Figure 1 and Section 3.2. After the index cases were defined and “selected” into our study, a single control was randomly sampled from the risk set formed by each case. Inverse probability weights were computed for the selected controls, while case weights were taken to be 1. Then, a joint frailty model for recurrent and terminal events was fit to the data, adjusting for appropriate covariates, in which each selected individual’s log-likelihood contribution was weighted using the appropriate inverse probability weights. For illustration, we used a log-normal frailty for analyzing the PHFS data, and we used a gamma frailty for analyzing the Institut Bergonié data.
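To make the weighting step concrete, the sketch below computes inverse probability weights of the Samuelsen type,²² in which a non-case's inclusion probability is one minus the probability of never being sampled at any case time at which they were at risk. This is one common form and is assumed here for illustration; it ignores ties, delayed entry, and additional matching, and the exact weights used in our analyses are those defined earlier in the manuscript. All names are hypothetical.

```python
def inclusion_weights(exit_times, cases, m=1):
    """Inverse probability weights under risk-set sampling of m controls per case.

    exit_times[i] is the end of follow-up for individual i;
    cases is a list of (event_time, case_index) pairs for the index event.
    Cases get weight 1; non-cases never at risk at a case time get None.
    """
    case_ids = {idx for _, idx in cases}
    weights = []
    for i, exit_i in enumerate(exit_times):
        if i in case_ids:
            weights.append(1.0)  # index cases are included with certainty
            continue
        p_never = 1.0
        for t, _ in cases:
            n_at_risk = sum(1 for e in exit_times if e >= t)
            if exit_i >= t and n_at_risk > 1:
                # Probability of not being drawn among the m controls at time t.
                p_never *= 1.0 - min(1.0, m / (n_at_risk - 1))
        p_incl = 1.0 - p_never
        weights.append(1.0 / p_incl if p_incl > 0 else None)
    return weights
```

Individuals with small inclusion probabilities receive large weights, which is the source of the instability discussed for the Recurrent II design.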

5.1 |. Heart failure: Penn Heart Failure Study

We conducted a full analysis of the PHFS data described in Section 4.1. As in the simulations, all models were adjusted for the same baseline covariates as in the joint frailty model from Basuray et al.¹⁷: age (<50, 50–60, 60–70, ≥ 70 years old), sex, race (white, black, other), site (PA, OH, WI), ischemic-type heart failure, history of chronic kidney disease, and history of hypertension. In our data analysis, we also adjusted for the following serum biomarkers as our covariates of primary interest, the concentrations of which were log-transformed and centered: ST2 (explored in the simulation study), B-type natriuretic peptide (BNP), soluble fms-like tyrosine kinase receptor-1 (sFlt-1), and high-sensitivity C-reactive protein (hsCRP). All models were fit using 8 internal knots, adopting a log-normal frailty. Table 4 displays adjusted estimates of the hazard ratio (HR) association between our four biomarkers of interest and the risk of recurrent hospitalization and the terminal event, along with corresponding Wald-based 95% confidence intervals (CI).
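A Wald-based interval for a hazard ratio is obtained by building the interval on the log scale and exponentiating. The sketch below shows the calculation; the function name is hypothetical, and the inputs in the usage note are illustrative values rather than output from the fitted models.

```python
import math

def hazard_ratio_ci(beta, se, z=1.96):
    """Hazard ratio and Wald-based 95% CI from a log-hazard coefficient and its SE."""
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))
```

For instance, `hazard_ratio_ci(0.336, 0.042)` returns a hazard ratio of about 1.40 with an interval of roughly 1.29 to 1.52. The standard error passed in would be the conservative sandwich estimate for the NCC analyses.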

TABLE 4.

Hazard ratio (HR) estimates of the association between various biomarkers and the risk of recurrent hospitalization and the terminal event (defined as the composite endpoint of all-cause death, cardiac transplant, and vascular assist device (VAD) placement), as well as estimates of σ² and α based on fitting the joint frailty model to the full PHFS data and a single nested case-control study formed under each of five designs. Also shown are 95% confidence intervals based on the conservative sandwich estimate of the standard error (CI_cs).

	Rec HR				Term HR
Design	BNP	ST2	sFlt-1	hsCRP	BNP	ST2	sFlt-1	hsCRP	σ²	α	# obs

Cohort (CI)	1.40 (1.29–1.52)	1.30 (1.08–1.57)	1.44 (1.10–1.87)	1.07 (0.98–1.17)	1.95 (1.54–2.47)	1.95 (1.27–2.99)	2.68 (1.57–4.59)	1.31 (1.06–1.62)	1.31 (1.02–1.61)	1.95 (1.48–2.42)	(1187)
Terminal (CI_CS)	1.40 (1.18–1.65)	1.20 (0.82–1.77)	1.16 (0.72–1.88)	1.10 (0.90–1.35)	2.08 (1.38–3.11)	2.17 (0.68–6.91)	4.08 (1.61–10.29)	1.64 (1.06–2.51)	1.17 (0.58–1.76)	2.72 (1.58–3.87)	330
Rec I (CI_CS)	1.42 (1.29–1.56)	1.28 (1.03–1.60)	1.54 (1.14–2.08)	1.08 (0.97–1.20)	2.03 (1.47–2.80)	3.36 (1.64–6.87)	3.73 (1.62–8.56)	1.29 (0.92–1.79)	1.22 (0.92–1.52)	2.68 (1.96–3.41)	746
CEP I (CI_CS)	1.40 (1.27–1.54)	1.25 (0.99–1.57)	1.35 (1.02–1.78)	1.08 (0.97–1.20)	2.18 (1.66–2.86)	2.35 (1.27–4.33)	3.50 (1.77–6.91)	1.32 (0.99–1.76)	1.33 (0.99–1.66)	2.47 (1.82–3.10)	823
Rec II (CI_CS)	1.23 (1.08–1.40)	1.26 (0.93–1.72)	1.79 (1.20–2.66)	1.10 (0.95–1.27)	1.70 (1.09–2.65)	3.01 (1.07–8.49)	4.18 (0.59–29.83)	1.62 (0.86–3.06)	0.84 (0.50–1.18)	2.88 (1.37–4.40)	350
CEP II (CI_CS)	1.49 (1.30–1.71)	1.23 (0.92–1.66)	1.31 (0.85–2.02)	1.08 (0.93–1.25)	2.27 (1.58–3.27)	2.38 (1.09–5.20)	3.05 (1.27–7.34)	1.28 (0.91–1.79)	1.65 (1.00–2.29)	2.36 (1.49–3.24)	535


The Cohort row of Table 4 presents these HR estimates and confidence intervals based on the full cohort. We find that higher levels of each of the biomarkers are associated with a greater risk of cardiac hospitalization and of the composite terminal event, with the exception of hsCRP for hospitalization. The remaining rows of this table display results from the analysis performed on each of the NCC studies. Hazard ratio estimates from the NCC studies are generally compatible with those from the full cohort, despite being based on a fraction of the full N=1,187 patients. There is generally more variability in the estimation of hazard ratios for the terminal event than the recurrent event – the lower event rates for the terminal event compared to hospitalization may be responsible, but this may also be compounded by uncertainty in the estimation of α, which modifies the effect of the frailty on the intensity of the terminal event.

5.2 |. Breast cancer: Institut Bergonié

Next, we considered a cohort of patients diagnosed with primary operable invasive breast carcinoma who underwent surgery as first treatment between 1989 and 1993 at Institut Bergonié, a comprehensive cancer center serving southwestern France.^12,13 The origin of follow-up was the surgery date, and patients were followed up for local recurrence and metastasis (which we define as the recurrent event), with a median follow-up of 13.7 years. Clinical characteristics were recorded, as well as immunohistochemical analyses from tumor microarray (TMA) blocks.¹⁸ We restricted our dataset to the 1,050 patients with complete data on androgen receptor (AR) and other key covariates of interest. A total of 352 patients experienced at least one recurrent event, for a total of 452 recurrent events (a maximum of 3 per patient), and 324 patients died during follow-up. Complete covariate information broken down by outcome can be found in Appendix D of the Supporting Information.

All models were adjusted for the same covariates as in Mauguen et al.¹³: age (<40, 40–55, >55 years old), tumor size at surgery (>20mm or ≤ 20mm), grade of cancer (I, II, or III), whether the tumor was positive for ER/PR, whether the tumor was positive for HER2, whether peritumoral vascular invasion was observed, and whether pathological nodal involvement was detected in ≥ 1 node; additionally, we adjusted for whether the tumor was positive for AR. We considered hypothetical studies in which scientific interest lies in the association between the presence of androgen receptor (AR) and the risk of experiencing the recurrent event or death. All models were fit using 6 internal knots, adopting a gamma frailty. Table 5 displays adjusted estimates of the hazard ratio (HR) association between the presence of androgen receptor and the risk of local recurrence/metastasis and death, along with corresponding Wald-based 95% confidence intervals (CI).

TABLE 5.

Hazard ratio (HR) estimates of the association between the presence of androgen receptor and the risk of local recurrence/metastasis and death, as well as estimates of θ and α based on fitting the joint frailty model to the full Institut Bergonié data and a single nested case-control study formed under each of five designs. Also shown are 95% confidence intervals based on the conservative sandwich estimate of the standard error (CI_cs).

Design	Rec HR_AR	Term HR_AR	θ	α	# obs

Cohort (CI)	0.58 (0.45–0.75)	0.26 (0.14–0.50)	1.07 (1.01–1.13)	4.27 (3.62–4.92)	(1050)
Terminal (CI_CS)	0.60 (0.41–0.86)	0.26 (0.10–0.70)	1.06 (0.97–1.14)	4.27 (3.39–5.14)	549
Rec I (CI_CS)	0.55 (0.36–0.86)	0.37 (0.10–1.40)	1.05 (0.96–1.13)	4.17 (2.64–5.70)	591
CEP I (CI_CS)	0.55 (0.43–0.71)	0.24 (0.13–0.46)	1.08 (1.02–1.15)	4.29 (3.52–5.06)	709
Rec II (CI_CS)	0.61 (0.31–1.20)	0.30 (0.04–2.16)	0.90 (0.69–1.12)	3.67 (1.40–5.94)	171
CEP II (CI_CS)	0.63 (0.47–0.86)	0.25 (0.12–0.54)	1.08 (1.01–1.16)	4.43 (3.73–5.13)	577


The Cohort row of Table 5 presents HR estimates and confidence intervals based on the full cohort, where it appears that a tumor positive for AR is associated with a lower risk of local recurrence, metastasis, and death. The remaining rows of this table display results from the analysis performed on each of the NCC studies. Once again, hazard ratio estimates from the nested case-control studies are generally compatible with those from the full cohort, despite being based on a fraction of the full cohort, with patterns in the standard error magnitudes mirroring those in the PHFS data and the simulations. Compared to the PHFS data, α is quite high, implying a stronger differential impact of the frailty on the terminal event. In this table, the CEP II design has tighter confidence intervals than either the Terminal or Recurrent I designs, despite selecting a similar number of patients (577, compared with 549 and 591); we hypothesize that this performance comes from having selected more patients who are more informative for all components of the model.

6 |. DISCUSSION

This manuscript proposes a framework for the design of nested case-control studies when scientific interest lies in both recurrent and terminal events, and additionally proposes estimation and inference for the joint frailty model based on data arising from such studies, thereby expanding the scope of scientific questions that can be answered with an NCC study. While previous work has examined the use of an existing NCC study to learn about a non-index outcome,^23–25 such work has remained within the standard univariate survival framework for NCC studies, and has not dealt with design considerations. To our knowledge, this is the first work that examines the design and analysis of NCC studies when recurrent events and a terminal event are of interest.

As pointed out by a reviewer, an important consideration is the choice of the frailty distribution. We have examined the use of gamma and log-normal frailty distributions, although in principle, any distribution with mean 1 could be used. From our perspective, the use of a particular frailty distribution in an analysis is not necessarily a belief statement about the truth of that distribution, but rather a pragmatic strategy adopted to capture heterogeneity between subjects that remains unaccounted for by the covariates in the model. As in many real-world settings, analysts are unlikely to specify the correct model for the linear predictor; the fit of the model can be improved by including a mechanism of absorbing residual heterogeneity. Note that previous work in the complete data setting has demonstrated that parameter estimation in the joint model framework is generally robust to misspecification of the frailty distribution.¹¹ It is not unreasonable to suppose that similar results hold in the NCC setting.

When the goal is to perform a joint analysis of recurrent and terminal events, and there is flexibility in the design of the NCC study, our simulation results suggest that defining the index case on the basis of a composite endpoint is best. However, the best choice for a given scientific question must be determined on a case-by-case basis, and practical constraints may limit the types of designs that are possible. If a design must be based on a single event type, then the simulations suggest that overall performance is better for the design based on a terminal event. Designs based on the recurrent event may have acceptable performance as well if event rates are sufficiently high, and may be desirable if primary interest lies in a recurrent event process that is dependent on a terminal event.

For standard error estimation, we proposed and presented results for both a conservative sandwich estimator and a perturbation resampling estimator. The risk sets we deal with in this paper are large, so the sandwich estimator has not seemed to overestimate the empirical standard error; however, if fine matching is involved, using the perturbation resampling estimator may be more appropriate. Even in settings where the sandwich estimator is adequate for the regression parameters, though, the perturbation resampling framework is useful for estimating standard errors for quantities for which the explicit form of the variance has not been derived.³⁴ One limitation of our approach is that time-varying covariates cannot be used in this scheme unless investigators have access to their complete trajectories, due to the fact that inverse probability weighting breaks the matching within risk sets that typically occurs in an NCC study.²²

The methods introduced in this manuscript have been incorporated into a recent release of the R package frailtypack; sample code can be found in the package manual as well as Appendix E of the Supporting Information. This methodology could be extended to accommodate multiple types of recurrent events (for example, under the model from Mazroui et al.¹²), which also brings new possibilities for design: index events could be defined as the first occurrence of one or both types of recurrent events. Furthermore, as evident in the PHFS data, recurrent and terminal events could be common events in the populations of interest. A natural extension of these methods would allow for the selection of only a subset of cases of the index event into the NCC study, which would enhance the cost-effectiveness of such a study.

Supplementary Material

Supp info

NIHMS1036451-supplement-Supp_info.pdf (225.2 KB, PDF)

ACKNOWLEDGMENTS

This work was supported by the National Institutes of Health [grant numbers R01 CA181360, T32 CA009337] and the Chateaubriand Fellowship of the Office for Science & Technology of the Embassy of France in the United States. The Radiation Effects Research Foundation (RERF), Hiroshima and Nagasaki, Japan is a public interest foundation funded by the Japanese Ministry of Health, Labour and Welfare (MHLW) and the US Department of Energy (DOE). The research was also funded in part through DOE award DE-HS0000031 to the National Academy of Sciences. The views of the authors do not necessarily reflect those of the two governments.

Footnotes

Data availability statement

Data sharing is not applicable to this article as no new data were created or analyzed in this study.

SUPPORTING INFORMATION

Additional Supporting Information may be found online in the supporting information tab for this article.

Conflict of interest

The authors declare no potential conflicts of interest.

References

1. Cook RJ, Lawless JF. The Statistical Analysis of Recurrent Events. New York, NY: Springer Science and Business Media; 2007.
2. Ghosh D, Lin DY. Nonparametric analysis of recurrent events and death. Biometrics. 2000;56(2):554–562.
3. Ghosh D, Lin DY. Semiparametric analysis of recurrent events data in the presence of dependent censoring. Biometrics. 2003;59(4):877–885.
4. Cook RJ, Lawless JF, Lakhal-Chaieb L, Lee KA. Robust estimation of mean functions and treatment effects for recurrent event under event-dependent censoring and termination: application to skeletal complications in cancer metastatic to bone. J Am Stat Assoc. 2009;104(485):60–75.
5. Hsieh J-J, Ding AA, Wang W. Regression analysis for recurrent events data under dependent censoring. Biometrics. 2011;67(3):719–729.
6. Lancaster T, Intrator O. Panel data with survival: hospitalization of HIV-positive patients. J Am Stat Assoc. 1998;93(441):46–53.
7. Wang MC, Qin J, Chiang CT. Analyzing recurrent event data with informative censoring. J Am Stat Assoc. 2001;96(455):1057–1065.
8. Liu L, Wolfe RA, Huang X. Shared frailty models for recurrent events and a terminal event. Biometrics. 2004;60(3):747–756.
9. Rondeau V, Mathoulin-Pélissier S, Jacqmin-Gadda H, Brouste V, Soubeyran P. Joint frailty models for recurring events and death using maximum penalized likelihood estimation: application on cancer events. Biostatistics. 2007;8(4):708–721.
10. Huang X, Liu L. A joint frailty model for survival and gap times between recurrent events. Biometrics. 2007;63(2):389–397.
11. Mazroui Y, Mathoulin-Pélissier S, Soubeyran P, Rondeau V. General joint frailty model for recurrent event data with a dependent terminal event: application to follicular lymphoma data. Stat Med. 2012;31(11–12):1162–1176.
12. Mazroui Y, Mathoulin-Pélissier S, MacGrogan G, Brouste V, Rondeau V. Multivariate frailty models for two types of recurrent events with a dependent terminal event: application to breast cancer data. Biometrical J. 2013;55(6):866–884.
13. Mauguen A, Rachet B, Mathoulin-Pélissier S, MacGrogan G, Laurent A, Rondeau V. Dynamic prediction of risk of death using history of cancer recurrences in joint frailty models. Stat Med. 2013;32(20):5366–5380.
14. Thomas DC. Addendum to “Methods of cohort analysis: appraisal by application to asbestos mining” by Liddell FDK, McDonald JC, Thomas DC. J R Stat Soc A. 1977;140(4):469–491.
15. Goldstein L, Langholz B. Asymptotic theory for nested case-control sampling in the Cox regression model. Ann Stat. 1992;20(4):1903–1928.
16. Ky B, French B, Levy WC, et al. Multiple biomarkers for risk prediction in chronic heart failure. Circ-Heart Fail. 2012;5:183–190.
17. Basuray A, French B, Ky B, et al. Heart failure with recovered ejection fraction: clinical description, biomarkers, and outcomes. Circulation. 2014;129(23):2380–2387.
18. De Mascarel I, Debled M, Brouste V, et al. Comprehensive prognostic analysis in breast cancer integrating clinical, tumoral, micro-environmental and immunohistochemical criteria. SpringerPlus. 2015;4(1):528.
19. French B, Heagerty PJ. Marginal mark regression analysis of recurrent marked point process data. Biometrics. 2009;65(2):415–422.
20. Langholz B, Clayton D. Sampling strategies in nested case-control studies. Environ Health Persp. 1994;102(Suppl 8):47–51.
21. Langholz B, Borgan Ø. Counter-matching: a stratified nested case-control sampling method. Biometrika. 1995;82(1):69–79.
22. Samuelsen SO. A pseudolikelihood approach to analysis of nested case-control studies. Biometrika. 1997;84(2):379–394.
23. Saarela O, Kulathinal S, Arjas E, Läärä E. Nested case-control data utilized for multiple outcomes: a likelihood approach and alternatives. Stat Med. 2008;27(28):5991–6008.
24. Salim A, Yang Q, Reilly M. The value of reusing prior nested case-control data in new studies with different outcome. Stat Med. 2012;31(11–12):1291–1302.
25. Kim RS, Kaplan RC. Analysis of secondary outcomes in nested case-control study designs. Stat Med. 2014;33(24):4215–4226.
26. Ridker PM, Rifai N, Pfeffer M, Sacks F, Lepage S, Braunwald E. Elevation of tumor necrosis factor-α and increased risk of recurrent coronary events after myocardial infarction. Circulation. 2000;101(18):2149–2153.
27. Hak E, Buskens E, Essen GA, et al. Clinical effectiveness of influenza vaccination in persons younger than 65 years with high-risk medical conditions: the PRISMA study. Arch Int Med. 2005;165(3):274–280.
28. Gejl M, Starup-Linde J, Scheel-Thomsen J, Gregersen S, Vestergaard P. Risk of cardiovascular disease: the effects of diabetes and anti-diabetic drugs – a nested case-control study. Int J Cardiol. 2015;178:292–296.
29. Ramsay JO. Monotone regression splines in action. Stat Sci. 1988;3(4):425–461.
30. Joly P, Commenges D, Letenneur L. A penalized likelihood approach for arbitrarily censored and truncated data: application to age-specific incidence of dementia. Biometrics. 1998;54(1):185–194.
31. Król A, Mauguen A, Mazroui Y, Laurent A, Michiels S, Rondeau V. Tutorial in joint modeling and prediction: a statistical software for correlated longitudinal outcomes, recurrent events and a terminal event. J Stat Softw. 2017;81(3):1–52.
32. R Core Team. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing; 2018.
33. Marquardt D. An algorithm for least-squares estimation of non-linear parameters. J Soc Ind Appl Math. 1963;11(2):431–441.
34. Cai T, Zheng Y. Resampling procedures for making inference under nested case-control studies. J Am Stat Assoc. 2013;108(504):1532–1544.
35. Salim A, Hultman C, Sparén P, Reilly M. Combining data from 2 nested case-control studies of overlapping cohorts to improve efficiency. Biostatistics. 2008;10(1):70–79.
36. Støer NC, Samuelsen SO. Inverse probability weighting in nested case-control studies with additional matching – a simulation study. Stat Med. 2013;32(30):5328–5339.
37. Stampfer MJ, Colditz GA, Willett WC, et al. Postmenopausal estrogen therapy and cardiovascular disease – ten-year follow-up from the Nurses’ Health Study. New Engl J Med. 1991;325(11):756–762.
38. Gronsbell JL, Cai T. Semi-supervised approaches to efficient evaluation of model prediction performance. J R Stat Soc B. 2018;80(3):579–594.

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Supp info

NIHMS1036451-supplement-Supp_info.pdf^{(225.2KB, pdf)}
