Efficient Multiple Imputation for Sensitivity Analysis of Recurrent Events Data with Informative Censoring

Guoqing Diao; Guanghan F Liu; Donglin Zeng; Yilong Zhang; Gregory Golm; Joseph F Heyse; Joseph G Ibrahim

doi:10.1080/19466315.2020.1819403

Author manuscript; available in PMC: 2023 Jan 1.

Published in final edited form as: Stat Biopharm Res. 2020 Nov 5;14(2):153–161. doi: 10.1080/19466315.2020.1819403

PMCID: PMC9119645 NIHMSID: NIHMS1716690 PMID: 35601027

Abstract

Missing data are commonly encountered in clinical trials due to dropout or nonadherence to study procedures. In trials in which recurrent events are of interest, the observed count can be an undercount of the events if a patient drops out before the end of the study. In many applications, the data are not necessarily missing at random and it is often not possible to test the missing at random assumption. Consequently, it is critical to conduct sensitivity analysis. We develop a control-based multiple imputation method for recurrent events data, where patients who drop out of the study are assumed to have a similar response profile to those in the control group after dropping out. Specifically, we consider the copy reference approach and the jump to reference approach. We model the recurrent event data using a semiparametric proportional intensity frailty model with the baseline hazard function completely unspecified. We develop nonparametric maximum likelihood estimation and inference procedures. We then impute the missing data based on the large sample distribution of the resulting estimators. The variance estimation is corrected by a bootstrap procedure. Simulation studies demonstrate the proposed method performs well in practical settings. We provide applications to two clinical trials.

Keywords: bootstrap method, clinical trials, missing data, nonparametric maximum likelihood estimation

1. Introduction

Recurrent events data, consisting of the times of repeated events, are routinely encountered in biomedical studies; examples include times of tumor recurrences, recurrent hospital visits, and recurrent adverse events (AEs). An extensive literature has been devoted to modeling recurrent event times using various extensions of the Cox proportional hazards model (Cox 1972), such as the Andersen and Gill model (Andersen & Gill 1982), the stratification models (Prentice et al. 1981), the marginal means/rates model (Wei et al. 1989, Lin et al. 2000), the frailty model (Therneau & Grambsch 2013, Kelly & Lim 2000), and multi-state models (Andersen & Keiding 2002, Meira-Machado et al. 2009). A tutorial on the analysis of recurrent events data using these models can be found in Amorim & Cai (2015). On the other hand, in some applications, it is of interest to model event count data (Roussel et al. 2019). Commonly used models for count data are the Poisson model and the negative binomial model, with or without zero inflation (Lambert 1992, Long 1997, Ridout et al. 2001, Lee et al. 2001).

In clinical trials, missing data frequently arise due to dropout or nonadherence to study procedures. In particular, the observed recurrent events count may be an undercount of the events if a patient drops out before the end of the study. In many cases, the dropout time may depend on the underlying event process inducing so-called informative dropout. Consequently, the data are not necessarily missing at random (MAR). Several authors have developed regression models under different assumptions on the dependence between the informative dropout time and the recurrent events process, e.g., see Wang et al. (2001), Huang et al. (2006), Zhao & Tong (2011), Zhao et al. (2013), Li et al. (2015), Zeng et al. (2014), Liu et al. (2016), and Diao et al. (2017). In these methods, the (missing) event count data between the dropout time and the end of study were not considered. When the recurrent events count data between randomization and end of study are of interest and there exists informative dropout, it is recommended by regulatory agencies to conduct sensitivity analysis to assess the robustness of study findings against the MAR assumption (European Medicines Agency 2010, National Academy of Sciences 2010).

In sensitivity analysis, it is desirable to obtain conservative results such that any biases are unlikely to favor the experimental treatment (from a regulatory viewpoint). One class of methods for such sensitivity analysis is the control-based imputation methods, in which the missing data in the experimental group are assumed to follow a distribution similar to that in the control group. In seminal work, Carpenter et al. (2013) proposed a general framework for imputing the missing data based on the control group. Specifically, the authors described three types of control-based imputation methods: copy increments in reference (CIR), jump to reference (JR), and copy reference (CR); see Figure 1 in Liu & Pang (2016) for a graphical presentation of these three different imputation methods. Carpenter et al. (2013) originally proposed these control-based imputation methods for completely observed continuous data. There have been several extensions of the methods by Carpenter et al. (2013); for example, see Keene et al. (2014), Lu et al. (2015), Liu & Pang (2016), Gao, Liu, Zeng, Diao, Heyse & Ibrahim (2017), Tang (2017a), and Tang (2017b). These methods are either developed for non-censored data or not applicable in the presence of informative dropout.

More recently, Gao, Liu, Zeng, Xu, Lin, Diao, Golm, Heyse & Ibrahim (2017) developed a control-based multiple imputation method for sensitivity analysis for recurrent events data in the presence of informative censoring. A piecewise exponential proportional intensity model with frailty was first used to model the recurrent events data. A Bayesian approach was then proposed to draw samples from the posterior distributions of the parameters in multiple imputation. A bootstrap procedure was then used to estimate the variances of the parameter estimates in the imputed data. There are two limitations in the approach of Gao, Liu, Zeng, Xu, Lin, Diao, Golm, Heyse & Ibrahim (2017). First, one needs to specify the cutpoints in the piecewise exponential baseline cumulative hazard function. Secondly, computation using Markov chain Monte Carlo (MCMC) methods to sample data from the posterior distributions of the parameters can be very intensive.

To overcome the limitations of the existing methods, we develop an efficient method for estimating the unknown parameters in the recurrent events model and in the multiple imputation procedure. The baseline cumulative hazard function is assumed to be non-decreasing but otherwise unspecified. We estimate the unknown parameters by maximizing the nonparametric likelihood function and impute data based on the large sample distributions of the resulting nonparametric maximum likelihood estimators. The rest of this paper is organized as follows. In Section 2, we describe the proposed estimation method as well as the multiple imputation approach. A bootstrap method is used to estimate the standard errors of the estimates. Extensive simulations are presented in Section 3. We apply the proposed methods to two real data sets in Section 4 and some discussion is provided in Section 5.

2. Methods

Before describing the multiple imputation procedure, we first introduce the model for recurrent events data. Suppose that in a clinical study there are n subjects. For the ith subject, we observe events at times $\{t_{i1}, \dots, t_{im_i}\}$ until C_i ≤ τ_i, where m_i is the number of events observed by C_i, a possibly informative censoring time, and τ_i is the end of study or time of the last scheduled visit. Let X_i = (A_i, Z_i) denote a set of covariates which include the treatment indicator A_i and other covariates Z_i. The observed data consist of $\{(X_i, t_{i1}, \dots, t_{im_i}, C_i, τ_i), i = 1, \dots, n\}$. We denote by N_i(t) the number of events for the ith subject by time t.

We assume that N_i(t) follows a proportional intensity model with a gamma frailty

$$λ(t \mid b_i, X_i) = λ(t)\, b_i \exp(X_i^{T} β), \tag{1}$$

where λ(t) is an unspecified baseline hazard function and the b_i’s are i.i.d. Gamma(γ⁻¹, γ⁻¹). We fix the mean of b_i as 1 to ensure model identifiability. Let $Λ(t) = \int_{0}^{t} λ(s)\, ds$. Note that the goal is to study the sensitivity of the multiple imputation methods. Typically in the sensitivity analysis, we assume MAR. Consequently, the likelihood function for the unknown parameters θ ≡ (β, γ, Λ) is given by

$$\begin{aligned}
L_n(β, γ, Λ) &= \prod_{i=1}^{n} \int_{b_i} \prod_{t \le C_i} \{λ(t)\, b_i \exp(X_i^{T} β)\}^{ΔN_i(t)} \exp\{-Λ(C_i)\, b_i \exp(X_i^{T} β)\} \\
&\qquad \times \frac{(1/γ)^{1/γ}}{Γ(1/γ)}\, b_i^{1/γ - 1} \exp\left(-\frac{b_i}{γ}\right) d b_i \\
&= \prod_{i=1}^{n} \left[\prod_{t \le C_i} \{λ(t) \exp(X_i^{T} β)\}^{ΔN_i(t)}\right] \times \frac{Γ(1/γ + N_i(C_i))}{Γ(1/γ)} \times \frac{(1/γ)^{1/γ}}{\{Λ(C_i) \exp(X_i^{T} β) + 1/γ\}^{N_i(C_i) + 1/γ}},
\end{aligned}$$

where

$$\frac{Γ(1/γ + N_i(C_i))}{Γ(1/γ)} = \begin{cases} 1, & N_i(C_i) = 0, \\ \prod_{j=0}^{N_i(C_i) - 1} (1/γ + j), & N_i(C_i) > 0. \end{cases}$$

The corresponding log-likelihood function is given by

$$\log L_n(θ) = \sum_{i=1}^{n} \left[\sum_{j=1}^{m_i} \{\log λ(t_{ij}) + X_i^{T} β\} + \sum_{k=0}^{N_i(C_i) - 1} \log(1/γ + k) + γ^{-1} \log γ^{-1} - (N_i(C_i) + 1/γ) \log\{Λ(C_i) \exp(X_i^{T} β) + 1/γ\}\right].$$

Ideally, one would like to maximize the above likelihood to estimate the unknown parameters; the maximum of the likelihood, however, does not exist since one can always let λ(t) go to ∞ at an observed event time while fixing the value of Λ(t). Therefore, we use the nonparametric likelihood technique by replacing λ(t) with Λ{t}, the jump size of Λ(·) at t. We maximize the resulting nonparametric likelihood to obtain the nonparametric maximum likelihood estimators (NPMLEs) of θ, denoted by $\hat{θ}_n \equiv (\hat{β}_n, \hat{γ}_n, \hat{Λ}_n)$. Following the arguments in Murphy (1994) and Murphy (1995), one can show that $\hat{θ}_n$ is consistent and asymptotically normal. Furthermore, we can treat the jump sizes of Λ(t) at the observed event times as parameters and apply parametric likelihood theory to estimate the covariance matrix of $\hat{θ}_n$ by inverting the observed information matrix (Parner 1998). The resulting covariance matrix estimate is denoted by $\widehat{\mathrm{Cov}}(\hat{θ}_n)$.
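The nonparametric log-likelihood described above can be sketched as follows. This is a minimal illustration rather than the authors' C implementation: the function name `np_loglik`, the dictionary layout of `subjects`, and the representation of Λ as a step function with jumps `jump_sizes` at the sorted distinct event times `jump_times` are all assumptions made for this example.

```python
import math
from bisect import bisect_right


def np_loglik(beta, gamma, jump_times, jump_sizes, subjects):
    """Nonparametric log-likelihood for the gamma-frailty intensity model.

    jump_times : sorted distinct observed event times
    jump_sizes : Lambda{t} at each jump time (all > 0)
    subjects   : list of dicts with keys 'x' (covariates), 'events'
                 (event times, each present in jump_times) and 'c'
                 (censoring time)
    """
    # Cumulative baseline hazard evaluated at each jump time.
    cum, total = [], 0.0
    for size in jump_sizes:
        total += size
        cum.append(total)
    jump_at = dict(zip(jump_times, jump_sizes))

    def Lambda(t):
        k = bisect_right(jump_times, t)  # number of jumps at or before t
        return cum[k - 1] if k > 0 else 0.0

    inv_g = 1.0 / gamma
    ll = 0.0
    for s in subjects:
        eta = sum(b * x for b, x in zip(beta, s["x"]))
        n_i = len(s["events"])
        # sum_j {log Lambda{t_ij} + X_i' beta}, with lambda replaced by the jump
        ll += sum(math.log(jump_at[t]) for t in s["events"]) + n_i * eta
        # sum_{k=0}^{N_i - 1} log(1/gamma + k), written as a log-gamma difference
        ll += math.lgamma(inv_g + n_i) - math.lgamma(inv_g)
        ll += inv_g * math.log(inv_g)
        ll -= (n_i + inv_g) * math.log(Lambda(s["c"]) * math.exp(eta) + inv_g)
    return ll
```

In practice this function would be handed to a numerical optimizer over (β, γ, jump sizes); the choice of optimizer is not specified in the text and is left open here.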

We now describe how to impute the missing data N_i(τ_i) − N_i(C_i), the number of events in (C_i, τ_i] for the ith subject if the subject dropped out before the last scheduled visit. As shown in the Appendix, the conditional distribution of b_i given the observed data is Gamma with shape 1/γ + N_i(C_i) and rate $1/γ + Λ(C_i) \exp(X_i^{T} β)$.

The distribution of the number of events after dropout is then negative binomial with number of failures 1/γ + N_i(C_i) (until the experiment is stopped) and success probability

$$1 - \frac{1 + γ Λ(C_i) \exp(X_i^{T} β)}{1 + γ \{Λ(τ_i) - Λ(C_i)\} \exp(\tilde{X}_i^{T} β) + γ Λ(C_i) \exp(X_i^{T} β)}.$$
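As a hedged illustration of this imputation distribution, the sketch below evaluates the success probability and draws a post-dropout count through the equivalent gamma-Poisson mixture (draw b_i from its conditional Gamma distribution, then a Poisson count given b_i). The function names and the simple Poisson sampler are assumptions for this example; `eta_obs` stands for X_i^T β and `eta_imp` for X̃_i^T β.

```python
import math
import random


def nb_success_prob(gamma, Lam_c, Lam_tau, eta_obs, eta_imp):
    """Success probability of the negative binomial for events in (C_i, tau_i]."""
    pre = gamma * Lam_c * math.exp(eta_obs)
    post = gamma * (Lam_tau - Lam_c) * math.exp(eta_imp)
    return 1.0 - (1.0 + pre) / (1.0 + post + pre)


def draw_poisson(mean, rng):
    """Knuth's multiplication method; adequate for the small means of a sketch."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def draw_missing_count(gamma, n_obs, Lam_c, Lam_tau, eta_obs, eta_imp, rng):
    """Draw N_i(tau_i) - N_i(C_i) via the gamma-Poisson mixture."""
    shape = 1.0 / gamma + n_obs
    rate = 1.0 / gamma + Lam_c * math.exp(eta_obs)
    b = rng.gammavariate(shape, 1.0 / rate)  # second argument is the scale
    return draw_poisson(b * (Lam_tau - Lam_c) * math.exp(eta_imp), rng)
```

Marginalizing over b_i in this mixture gives a negative binomial count with the shape and success probability stated above, so a direct negative binomial sampler could be substituted where one is available.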

As in Gao, Liu, Zeng, Xu, Lin, Diao, Golm, Heyse & Ibrahim (2017), we consider two types of control-based imputation approaches: copy reference and jump to reference. We define ${\tilde{X}}_{i}$ as X_i with the treatment indicator A_i set to 0 for the jump to reference method and ${\tilde{X}}_{i} = Z_{i}$ for the copy reference method. In copy reference control-based imputation, we assume that there is no difference between patients in the treated group and patients in the control group, both before and after dropout. Therefore, we first fit the recurrent events model for the patients in the control group only. Then we impute the number of events between the censoring time and the last scheduled visit using covariates Z_i for subjects in both groups. In the jump to reference control-based imputation, we assume that after a patient in the treatment group drops out, the distribution of the outcome is the same as that of patients in the control group, while the outcome distribution prior to dropout follows that of the treatment group. Therefore, we first fit the recurrent events model using all the data and then impute the missing data fixing the treatment indicator at 0 (control) for both groups.
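The difference between the two approaches comes down to which patients enter the model fit and which covariate vector is used before and after dropout. A minimal sketch, assuming a hypothetical helper `imputation_design` and a simple dictionary layout for subjects (neither is from the original implementation):

```python
def imputation_design(subjects, method):
    """Return (fitting subset, pre-dropout covariates, imputation covariates).

    subjects : list of dicts with 'a' (1 = treated, 0 = control) and 'z' (list)
    method   : 'CR' (copy reference) or 'JR' (jump to reference)
    """
    if method == "CR":
        # Fit on control patients only; treatment never enters the model.
        fit_set = [s for s in subjects if s["a"] == 0]
        x_obs = [list(s["z"]) for s in subjects]
        x_imp = [list(s["z"]) for s in subjects]
    elif method == "JR":
        # Fit on everyone; impute with the treatment indicator fixed at 0.
        fit_set = list(subjects)
        x_obs = [[s["a"]] + list(s["z"]) for s in subjects]
        x_imp = [[0] + list(s["z"]) for s in subjects]
    else:
        raise ValueError("method must be 'CR' or 'JR'")
    return fit_set, x_obs, x_imp
```

Under jump to reference, a treated patient therefore keeps the treatment effect in the pre-dropout term of the imputation distribution but loses it in the post-dropout term.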

After the missing data are imputed, we fit an appropriate model for the event count data such as the Poisson or negative binomial regression model. The point estimates of the unknown parameters are then averages of the parameter estimates based on multiple imputations. Since the covariance matrix of the estimators is not analytically tractable, we use the bootstrap procedure. The detailed multiple imputation and bootstrap procedures are described as follows and have been implemented in a computer program in the C language.

Step 1. Generate B bootstrap samples by sampling from the original dataset with replacement. For each bootstrap sample b = 1, …, B, we obtain $\hat{θ}_{nb}$ and $\widehat{\mathrm{Cov}}(\hat{θ}_{nb})$.
Step 2. Multiple imputation: we draw m samples from a multivariate normal distribution $MN(\hat{θ}_{nb}, \widehat{\mathrm{Cov}}(\hat{θ}_{nb}))$, denoted by $θ_b^{(j)}, j = 1, \dots, m$. For each sample of the parameters and the ith subject, draw a random number from a negative binomial distribution with number of successes $1/γ_b^{(j)} + N_i(C_i)$ and success probability
$$\frac{1 + γ_b^{(j)} Λ_b^{(j)}(C_i) \exp(X_i^{T} β_b^{(j)})}{1 + γ_b^{(j)} \{Λ_b^{(j)}(τ_i) - Λ_b^{(j)}(C_i)\} \exp(\tilde{X}_i^{T} β_b^{(j)}) + γ_b^{(j)} Λ_b^{(j)}(C_i) \exp(X_i^{T} β_b^{(j)})}.$$
Here the count drawn is the number of failures before the stated number of successes, so this probability is the complement of the success probability displayed earlier, with the roles of success and failure interchanged. To generate multivariate normal random variables, we first perform a Cholesky decomposition on $\widehat{\mathrm{Cov}}(\hat{θ}_{nb})$ such that $\widehat{\mathrm{Cov}}(\hat{θ}_{nb}) = V_b V_b^{T}$, where V_b is a lower triangular matrix with positive diagonal elements. Therefore, we can draw samples as $θ_b^{(j)} = \hat{θ}_{nb} + V_b U^{(j)}$, where the U^(j)’s are i.i.d. multivariate normal random vectors with mean zero and an identity covariance matrix.
Step 3. Perform the primary analysis by fitting a regression model (e.g., the zero-inflated Poisson model) for N_i(τ_i) based on the imputed samples. Denote the parameter estimates in the primary analysis as $\hat{ζ}_b^{(j)}$, b = 1, …, B; j = 1, …, m. Then, the point estimate for bootstrap sample b is $\hat{ζ}_b \equiv \sum_{j=1}^{m} \hat{ζ}_b^{(j)} / m$. The overall point estimate is $\hat{ζ} = \sum_{b=1}^{B} \hat{ζ}_b / B$. The covariance of $\hat{ζ}$ is estimated by the sample variance of $\hat{ζ}_b$, b = 1, …, B, from the B bootstrap samples. We then test covariate effects and construct confidence intervals using the normal approximation.
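Steps 1 to 3 can be sketched as the loop below for a scalar parameter of interest. This is an illustrative outline, not the authors' C program: the callables `fit`, `impute`, and `analyze` are hypothetical placeholders for the NPMLE fit, the negative binomial imputation, and the primary analysis model, and the small Cholesky routine is written out only to keep the example self-contained.

```python
import math
import random
import statistics


def cholesky_lower(cov):
    """Lower-triangular V with cov = V V^T (cov symmetric positive definite)."""
    n = len(cov)
    V = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(V[i][k] * V[j][k] for k in range(j))
            if i == j:
                V[i][j] = math.sqrt(cov[i][i] - s)
            else:
                V[i][j] = (cov[i][j] - s) / V[j][j]
    return V


def draw_mvn(mean, cov, rng):
    """One draw from MN(mean, cov): mean + V U with U standard normal."""
    V = cholesky_lower(cov)  # refactored on every call for brevity
    u = [rng.gauss(0.0, 1.0) for _ in mean]
    return [mu + sum(V[i][k] * u[k] for k in range(i + 1))
            for i, mu in enumerate(mean)]


def bootstrap_mi(data, fit, impute, analyze, B, m, rng):
    """Steps 1-3 for a scalar parameter of interest.

    fit(sample)            -> (theta_hat, cov_hat)   [hypothetical]
    impute(sample, theta)  -> completed data set     [hypothetical]
    analyze(completed)     -> scalar estimate zeta   [hypothetical]
    """
    zeta_b = []
    for _ in range(B):
        sample = [rng.choice(data) for _ in data]      # Step 1: resample subjects
        theta_hat, cov_hat = fit(sample)
        draws = [analyze(impute(sample, draw_mvn(theta_hat, cov_hat, rng)))
                 for _ in range(m)]                    # Step 2: m imputations
        zeta_b.append(sum(draws) / m)                  # Step 3: average over j
    point = sum(zeta_b) / B
    se = statistics.stdev(zeta_b)                      # bootstrap standard error
    return point, se
```

Because the imputation is nested inside each bootstrap sample, the spread of the per-sample averages reflects both sampling variability and imputation variability, which is why no separate combining rule appears in the sketch.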

Remark 1.

We can use either the point estimate based on the whole sample data or the average of the estimates based on bootstrap samples. In our extensive numerical studies, as expected, they are very close to each other. Hence, we reported the results using the point estimates based on the whole sample data and the standard errors based on the bootstrap samples in the subsequent numerical studies.
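Given a point estimate and its bootstrap standard error, the normal-approximation test and confidence interval mentioned in Step 3 amount to a Wald calculation. A minimal sketch, with the helper name `wald_summary` and the default 95% level chosen for this example:

```python
import math


def wald_summary(estimate, std_err, z_crit=1.959964):
    """Wald z statistic, two-sided p-value, and confidence interval."""
    z = estimate / std_err
    # P(|Z| > |z|) for a standard normal Z equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    ci = (estimate - z_crit * std_err, estimate + z_crit * std_err)
    return z, p_value, ci
```

For instance, an estimate of −0.770 with a standard error of 0.165 gives z ≈ −4.67 and a two-sided p-value below 0.001.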

Remark 2.

The approach described in Step 2 is the “frequentist” version of the proper imputation advocated by Rubin (1987). The estimator based on this imputation approach is also called a “type A estimator” by Wang & Robins (1998). On the other hand, directly drawing multiple random numbers for the event using the negative binomial distribution plugged in with ${\hat{θ}}_{n}$ is referred to as the improper imputation by Rubin (1987).

3. Simulation Studies

We conducted simulation studies to examine the performance of the two proposed control-based multiple imputation methods. Specifically, we generate recurrent events data from the proportional intensity model

$$λ(t \mid A_i, Z_i, b_i) = b_i \exp(β_1 A_i + β_2 Z_i),$$

where A_i is the treatment indicator following a Bernoulli distribution with success probability 0.5, Z_i is a continuous covariate from a normal distribution with mean 0 and variance 0.25, and b_i is a Gamma random variable with mean 1 and variance 1. The true parameter values of β₁ and β₂ are −0.5 and 0.5, respectively. We consider both scenarios of non-informative censoring and informative censoring. Under non-informative censoring, we generate the censoring time from an exponential distribution with mean 5. Under informative censoring, we generate the censoring time from an exponential distribution with mean 5/b_i. The censoring is informative in the sense that patients at a higher risk of the recurrent events are also at a higher risk of censoring. It can be shown that the marginal distribution of the informative censoring time is P(C_i ≤ t) = E{P(C_i ≤ t|b_i)} = 0.2t/(1+0.2t). The end of study τ is set to be 5 for all patients. Consequently, the missingness fractions are 63.2% and 50% under non-informative censoring and informative censoring, respectively. It is clear that N_i(τ) follows a negative binomial distribution with parameters r = 1 and p = 1−1/{1+τ exp(β₁A_i+β₂Z_i)}, where r is the number of failures until the experiment is stopped and p is the success probability in each experiment. Therefore, the true parameters in a negative binomial regression model are α (dispersion parameter) = 1, β₀ (intercept) = log(τ) = 1.6094, β₁ (treatment effect) = −0.5, and β₂ (covariate effect of Z) = 0.5.
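The data-generating design above can be sketched as follows, together with a Monte Carlo check of the stated missingness fractions, 1 − e⁻¹ ≈ 63.2% under non-informative censoring and 0.2·5/(1 + 0.2·5) = 50% under informative censoring. The function names, seeds, and the use of exponential gap times (valid because the baseline intensity is constant at 1) are choices made for this illustration.

```python
import math
import random


def simulate_subject(rng, beta1=-0.5, beta2=0.5, tau=5.0, informative=True):
    """One subject from the simulation design (baseline intensity equal to 1)."""
    a = 1 if rng.random() < 0.5 else 0
    z = rng.gauss(0.0, 0.5)                       # variance 0.25 means SD 0.5
    b = rng.gammavariate(1.0, 1.0)                # mean 1, variance 1
    rate = b * math.exp(beta1 * a + beta2 * z)
    cens_rate = 0.2 * b if informative else 0.2   # censoring mean 5/b or 5
    c = min(rng.expovariate(cens_rate), tau)
    # Constant intensity: exponential gap times up to the censoring time.
    events, t = [], rng.expovariate(rate)
    while t <= c:
        events.append(t)
        t += rng.expovariate(rate)
    return {"a": a, "z": z, "c": c, "events": events, "dropout": c < tau}


def dropout_fraction(n, informative, seed=1):
    """Monte Carlo estimate of the fraction censored before the end of study."""
    rng = random.Random(seed)
    return sum(simulate_subject(rng, informative=informative)["dropout"]
               for _ in range(n)) / n
```

Calling `dropout_fraction(20000, False)` and `dropout_fraction(20000, True)` should return values close to the two analytical fractions.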

After a patient is censored before the end of study, for the jump to reference control-based imputation method, we generate data based on the model for the control group regardless of whether the patient is in the treatment group or the control group. On the other hand, for the copy reference control-based imputation method, conditional on the observed number of events by C_i, we generate the remaining number of events by the end of study τ from a negative binomial distribution with parameters r = 1+N_i(C_i) and p = {(τ−C_i) exp(β₂Z_i)}/{1+τ exp(β₂Z_i)}. We consider sample sizes of 200 and 400. In all simulations, the number of imputations is 50 and we generate 100 bootstrap samples for each data set. The results are based on 1,000 replicates.
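The copy reference parameters above follow from the general imputation distribution of Section 2 by substituting the simulation settings. The worked step below assumes γ = 1 (frailty variance 1), Λ(t) = t (baseline intensity 1), and a linear predictor of β₂Z_i both before and after dropout, as the copy reference assumption implies:

```latex
% Copy reference simulation: \gamma = 1, \Lambda(t) = t, and
% X_i^T\beta = \tilde{X}_i^T\beta = \beta_2 Z_i (assumptions stated in the lead-in).
\begin{aligned}
r &= 1/\gamma + N_i(C_i) = 1 + N_i(C_i), \\
p &= 1 - \frac{1 + C_i e^{\beta_2 Z_i}}
              {1 + (\tau - C_i) e^{\beta_2 Z_i} + C_i e^{\beta_2 Z_i}}
   = \frac{(\tau - C_i) e^{\beta_2 Z_i}}{1 + \tau e^{\beta_2 Z_i}}.
\end{aligned}
```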

Table 1 presents a summary of the parameter estimates based on the full data, assuming that the number of events is observed even after censoring. No treatment switch here means that the patient continues to receive the experimental treatment after censoring. Treatment switch means that after censoring, the patient stops receiving the experimental treatment, and we assume that the distribution of the outcome after censoring is the same as that of patients in the placebo group. When there is no treatment switch, the NPMLEs from the full data appear to have small biases regardless of the censoring scheme, the standard errors are correctly estimated, and the empirical coverage probabilities of the 95% confidence intervals are close to the nominal level. With treatment switching (i.e., data are generated using the jump to reference method), as expected, the true treatment effect is shrunk towards 0; however, the standard error estimates still accurately reflect the true variation of the estimates under non-informative censoring. The estimates of the other parameters do not appear to be affected by the treatment switch, except that the standard errors appear to be underestimated under informative censoring.

Table 1:

Summary of simulation results based on full data (negative binomial regression).

		Treatment Switch			No Treatment Switch
Parameter	True	Mean	SD	SE	Mean	SD	SE	CP
n = 200, non-informative censoring
α	1.000	1.0032	0.1341	0.1329	0.9823	0.1354	0.1350	0.931
β₀	1.609	1.5962	0.1106	0.1108	1.5988	0.1142	0.1095	0.943
β₁	−0.500	−0.2825	0.1586	0.1587	−0.5039	0.1630	0.1595	0.940
β₂	0.500	0.5007	0.1582	0.1606	0.4997	0.1646	0.1619	0.946
n = 400, non-informative censoring
α	1.000	1.0127	0.0941	0.0944	0.9919	0.0971	0.0960	0.941
β₀	1.609	1.6043	0.0784	0.0784	1.6030	0.0795	0.0776	0.940
β₁	−0.500	−0.2862	0.1131	0.1122	−0.5027	0.1139	0.1128	0.948
β₂	0.500	0.5028	0.1133	0.1133	0.5012	0.1153	0.1137	0.956
n = 200, informative censoring
α	1.000	1.0960	0.1448	0.1419	0.9790	0.1298	0.1345	0.941
β₀	1.609	1.6000	0.1118	0.1149	1.5977	0.1141	0.1097	0.946
β₁	−0.500	−0.2216	0.1741	0.1641	−0.4974	0.1659	0.1592	0.939
β₂	0.500	0.4974	0.1701	0.1664	0.4935	0.1574	0.1613	0.945
n = 400, informative censoring
α	1.000	1.1087	0.1072	0.1011	0.9897	0.0971	0.0957	0.940
β₀	1.609	1.6061	0.0784	0.0813	1.6032	0.0781	0.0776	0.944
β₁	−0.500	−0.2238	0.1219	0.1161	−0.5006	0.1139	0.1126	0.946
β₂	0.500	0.4997	0.1215	0.1169	0.5010	0.1134	0.1137	0.952

Mean, sample average of the parameter estimates; SD, sample standard deviation of the parameter estimates; SE, sample average of the standard error estimates; CP, empirical coverage probability of the 95% confidence intervals.
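The summary measures defined in this footnote, together with the mean-squared error reported in the later tables, can be computed from the replicate-level output as in the sketch below. The helper name `summarize` and its inputs (one estimate and one standard error per simulation replicate) are assumptions for this example.

```python
import statistics


def summarize(estimates, std_errs, truth, z_crit=1.959964):
    """Mean, SD, SE, CP and MSE over simulation replicates."""
    mean = statistics.fmean(estimates)
    sd = statistics.stdev(estimates)
    se = statistics.fmean(std_errs)
    # A replicate covers the truth when |estimate - truth| <= z * SE.
    covered = sum(abs(est - truth) <= z_crit * s
                  for est, s in zip(estimates, std_errs))
    mse = statistics.fmean([(est - truth) ** 2 for est in estimates])
    return {"Mean": mean, "SD": sd, "SE": se,
            "CP": covered / len(estimates), "MSE": mse}
```

Agreement between the SD and SE entries indicates that the standard error estimator tracks the true sampling variability, which is how the tables in this section are read.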

Tables 2 and 3 present the results from the copy reference and the jump to reference control-based imputation methods, respectively. As in Gao, Liu, Zeng, Xu, Lin, Diao, Golm, Heyse & Ibrahim (2017), we calculated the true value of the treatment effect using the average of the estimates from the full generated datasets with events observed after dropout. Under non-informative censoring, for both control-based imputation methods, the estimates of the nuisance parameters, i.e., (α, β₀, β₂), have small biases. The bootstrap method estimates the standard errors of the multiple imputation estimates accurately for all parameters, with the SE columns closely tracking the SD columns. Under informative censoring, the estimate of the intercept term appears to be biased for both imputation methods. The copy reference method leads to treatment effect estimates that are larger in magnitude, and to larger standard error estimates, than the jump to reference method. A similar phenomenon has been observed for the copy reference and the jump to reference control-based imputation methods with continuous data (e.g., see Tables 1 and 2 in Liu & Pang (2016)). These results also suggest that the copy reference approach is less sensitive to informative censoring than the jump to reference approach.

Table 2:

Summary of simulation results (negative binomial regression) with copy reference control-based imputation.

		Full Data			Imputation
Parameter	True	Mean	SD	MSE	Mean	SD	SE	MSE
n = 200, non-informative censoring
α	1.000	0.9804	0.1374	0.0193	0.9770	0.1719	0.1711	0.0301
β₀	1.609	1.6050	0.1114	0.0124	1.5986	0.1248	0.1252	0.0157
β₁	−0.500	−0.3900	0.1602	0.0257	−0.3767	0.1365	0.1353	0.0188
β₂	0.500	0.4924	0.1627	0.0265	0.5038	0.1875	0.1813	0.0352
n = 400, non-informative censoring
α	1.000	1.0003	0.0976	0.0095	0.9921	0.1219	0.1215	0.0149
β₀	1.609	1.6070	0.0755	0.0057	1.6159	0.0873	0.0872	0.0077
β₁	−0.500	−0.3825	0.1105	0.0122	−0.3802	0.0954	0.0946	0.0091
β₂	0.500	0.4870	0.1134	0.0130	0.4823	0.1279	0.1272	0.0167
n = 200, informative censoring
α	1.000	1.1017	0.1520	0.0335	1.0183	0.1737	0.1804	0.0305
β₀	1.609	1.4755	0.1170	0.0292	1.3957	0.1349	0.1308	0.0599
β₁	−0.500	−0.3930	0.1790	0.0320	−0.4016	0.1474	0.1419	0.0218
β₂	0.500	0.5179	0.1839	0.0341	0.5211	0.2071	0.1920	0.0433
n = 400, informative censoring
α	1.000	1.1051	0.1096	0.0231	1.0331	0.1305	0.1288	0.0181
β₀	1.609	1.4766	0.0819	0.0219	1.4010	0.0940	0.0923	0.0484
β₁	−0.500	−0.3870	0.1236	0.0153	−0.3920	0.0997	0.0999	0.0100
β₂	0.500	0.5139	0.1237	0.0155	0.5203	0.1382	0.1344	0.0195

Mean, sample average of the parameter estimates; SD, sample standard deviation of the parameter estimates; SE, sample average of the standard error estimates; MSE, mean-squared error.

Table 3:

Summary of simulation results (negative binomial regression) with jump to reference control-based imputation.

		Full Data			Imputation
Parameter	True	Mean	SD	MSE	Mean	SD	SE	MSE
n = 200, non-informative censoring
α	1.000	1.0158	0.1382	0.0193	1.0036	0.1732	0.1686	0.0300
β₀	1.609	1.5928	0.1127	0.0130	1.5970	0.1241	0.1245	0.0156
β₁	−0.500	−0.2844	0.1632	0.0266	−0.2724	0.0943	0.0955	0.0090
β₂	0.500	0.5042	0.1664	0.0277	0.5045	0.1859	0.1825	0.0346
n = 400, non-informative censoring
α	1.000	1.0175	0.0965	0.0096	1.0111	0.1227	0.1190	0.0152
β₀	1.609	1.6007	0.0786	0.0062	1.6078	0.0903	0.0867	0.0082
β₁	−0.500	−0.2823	0.1138	0.0130	−0.2736	0.0653	0.0663	0.0043
β₂	0.500	0.4960	0.1097	0.0121	0.4935	0.1276	0.1279	0.0163
n = 200, informative censoring
α	1.000	1.0989	0.1389	0.0291	1.0575	0.1861	0.1848	0.0379
β₀	1.609	1.5934	0.1111	0.0124	1.3821	0.1249	0.1297	0.0631
β₁	−0.500	−0.2064	0.1670	0.0279	−0.2873	0.1047	0.1060	0.0175
β₂	0.500	0.4966	0.1714	0.0294	0.5256	0.1931	0.1957	0.0379
n = 400, informative censoring
α	1.000	1.1082	0.1000	0.0217	1.0736	0.1326	0.1316	0.0230
β₀	1.609	1.6054	0.0807	0.0065	1.4010	0.0957	0.0911	0.0487
β₁	−0.500	−0.2181	0.1157	0.0133	−0.2939	0.0743	0.0738	0.0110
β₂	0.500	0.5023	0.1207	0.0146	0.5302	0.1419	0.1371	0.0211

Mean, sample average of the parameter estimates; SD, sample standard deviation of the parameter estimates; SE, sample average of the standard error estimates; MSE, mean-squared error.

We conducted additional simulation studies to compare the performance of the proposed method with that of the method of Gao, Liu, Zeng, Xu, Lin, Diao, Golm, Heyse & Ibrahim (2017). Specifically, we considered the same simulation setting as in Gao, Liu, Zeng, Xu, Lin, Diao, Golm, Heyse & Ibrahim (2017). The simulation results are summarized in Table 4. Although the results from the two methods are similar, the proposed method is computationally much more efficient than the method of Gao, Liu, Zeng, Xu, Lin, Diao, Golm, Heyse & Ibrahim (2017). It takes about 12.4 seconds and 14.4 seconds using the proposed method to analyze one data set on an iMac with a 3 GHz Intel Core i5 processor, for the copy reference approach and the jump to reference approach, respectively. In contrast, it takes about 1273 seconds and 2373 seconds to analyze one data set using the method of Gao, Liu, Zeng, Xu, Lin, Diao, Golm, Heyse & Ibrahim (2017), for the copy reference approach and the jump to reference approach, respectively. The proposed method is therefore roughly 100 times faster for the copy reference approach and roughly 165 times faster for the jump to reference approach.

Table 4:

Summary statistics for the treatment effect in the comparison between the proposed method and the method of Gao, Liu, Zeng, Xu, Lin, Diao, Golm, Heyse & Ibrahim (2017).

	Gao et al. (2017)				Proposed
Setting	Bias	SD	SE	MSE	Bias	SD	SE	MSE
Copy reference	0.024	0.166	0.169	0.028	−0.001	0.167	0.165	0.028
Jump to reference	0.003	0.149	0.150	0.022	0.012	0.146	0.147	0.021

Bias, difference between the sample average of the parameter estimates and true parameter value; SD, sample standard deviation of the parameter estimates; SE, sample average of the standard error estimates based on bootstrap method; MSE, mean-squared error.

4. Data Analysis

4.1. A Diabetes Trial

We first apply the proposed methods to a diabetes trial that was first described in Mathieu et al. (2015). In this multicenter, randomized, double-blind, placebo-controlled, 24-week clinical trial, patients with inadequately controlled type 2 diabetes who were being treated with insulin were randomized to receive the treatment with sitagliptin 100 mg/day or placebo in a 1:1 ratio. The investigators were interested in evaluating the efficacy and safety of co-administration of sitagliptin with intensively titrated insulin glargine. Particularly, hypoglycemia AEs, a common side effect of treatment with insulin, were the endpoint of interest for the safety analysis. A total of 658 patients (with 329 in each group) participated in the trial. Among the 295 patients in the sitagliptin group and 303 in the placebo group who completed the 24-week follow-up in the study, the number of episodes of hypoglycemia AEs ranged from 0 to 27 with a mean of 1.18, and from 0 to 26 with a mean of 2.45, respectively. One hypothesis of the safety study was that sitagliptin reduces the incidence of hypoglycemia AEs. This example was also previously studied by Gao, Liu, Zeng, Xu, Lin, Diao, Golm, Heyse & Ibrahim (2017), who argued that the censoring may be dependent on the incidence of hypoglycemia AEs leading to informative censoring and the missingness of these data is likely not at random.

As in Gao, Liu, Zeng, Xu, Lin, Diao, Golm, Heyse & Ibrahim (2017), we consider a negative binomial regression model for the analysis of the event rate of hypoglycemia AEs. Besides the treatment indicator, we include three baseline covariates in the model: A1C value (a measure of glycemic control), insulin dose, and body weight. Nine patients (3 in the placebo group and 6 in the treatment group) with missing baseline covariates were excluded from the analysis. The corresponding sample means (standard deviations) of these three baseline variables in the combined sample were 8.73 (1.01), 36.77 (20.87), and 87.69 (21.04). The characteristics of these three variables were similar between the sitagliptin group and the placebo group. We set τ_i = τ = 168 (days) for subjects who dropped out from the study prior to the Week 24 visit. The model is specified as

$$\log μ_i = ζ_0 + ζ_{\mathrm{trt}}\, \mathrm{Trt}_i + ζ_{\mathrm{A1C}}\, \mathrm{A1C}_i + ζ_{\mathrm{BW}}\, \mathrm{BW}_i + ζ_{\mathrm{Insulin}}\, \mathrm{Insulin}_i,$$

where μ_i is the mean number of hypoglycemia AEs for subject i, ζ₀ is the intercept, and the remaining regression coefficients are the effects of treatment (Trt), baseline A1C, body weight (BW), and insulin dose on the rate of hypoglycemia AEs. Since all τ_i’s are the same, we omit the usual offset term log τ_i in the model.
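To see why the offset can be dropped, consider the same model written with an explicit offset. The rate-scale intercept ζ₀* below is notation introduced only for this illustration:

```latex
% \zeta_0^{*} (rate-scale intercept) is notation introduced here for illustration.
\log \mu_i = \log \tau_i + \zeta_0^{*} + \zeta_{trt}\, Trt_i + \cdots
\quad\Longrightarrow\quad
\zeta_0 = \zeta_0^{*} + \log \tau
\quad \text{when } \tau_i = \tau \text{ for all } i.
```

With τ = 168 days, the constant log τ ≈ 5.124 is absorbed into the intercept, and the covariate effects are unchanged.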

Before imputing the missing data between C_i, the informative censoring time, and τ_i, we fit the proportional intensity Gamma frailty model

$$λ_i(t) = λ(t)\, b_i \exp(β_{\mathrm{trt}}\, \mathrm{Trt}_i + β_{\mathrm{A1C}}\, \mathrm{A1C}_i + β_{\mathrm{BW}}\, \mathrm{BW}_i + β_{\mathrm{Insulin}}\, \mathrm{Insulin}_i),$$

where λ_i(t) is the conditional intensity function for the ith subject given a Gamma frailty b_i with mean 1 and variance γ and the baseline covariates, and λ(t) is the baseline intensity function. We standardize the baseline covariates by subtracting the sample means and then dividing by the standard deviations such that the baseline intensity function has a meaningful interpretation. Table 5 presents the results of the NPMLEs of the unknown parameters.
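The standardization step, and the way a coefficient on the standardized scale maps back to the original units, can be sketched as follows. The helper names are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption, since the text does not say which version was used.

```python
import statistics


def standardize(values):
    """Center by the sample mean and scale by the sample standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values], mean, sd


def to_original_scale(coef_std, sd):
    """Coefficient per original unit implied by a per-SD coefficient."""
    return coef_std / sd
```

Since exp{β (x − mean)/sd} = exp{(β/sd) x} · exp(−β · mean/sd), the slope is divided by the standard deviation and the constant factor is absorbed into the baseline intensity, which then refers to a subject with average covariate values.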

Table 5:

Parameter estimates from the proportional intensity frailty model for the diabetes trial (baseline covariates standardized).

Model parameter	Placebo group			Both groups
Model parameter	Estimate	Std. Err.	P-value	Estimate	Std. Err.	P-value
Treatment				−0.814	0.180	< 0.001
Baseline A1C	0.130	0.122	0.287	0.066	0.091	0.469
Baseline Insulin	0.213	0.118	0.072	0.309	0.100	0.002
Baseline Weight	−0.282	0.133	0.033	−0.306	0.106	0.004
Frailty Variance	3.692	0.429	< 0.001	4.293	0.395	< 0.001

We then apply the proposed imputation and bootstrap procedures to analyze the event rate of hypoglycemia AEs. We set B = 1000 and m = 100. Computation times on a MacBook Pro with a 2.8 GHz Intel Core i7 processor for the copy reference imputation and the jump to reference imputation were 1777 and 1821 seconds, respectively. Tables 6 and 7 present the results from fitting the negative binomial regression model with the standardized baseline covariates and the original baseline covariates, respectively. Both control-based imputation methods detected a significant treatment effect in reducing the event rate of hypoglycemia AEs. In addition, baseline body weight and insulin dose also have significant effects on the event rate of hypoglycemia AEs. These results are similar to those reported in Table 5 of Gao, Liu, Zeng, Xu, Lin, Diao, Golm, Heyse & Ibrahim (2017); however, the standard error estimates of the regression coefficient estimates appear to be smaller than those in Gao, Liu, Zeng, Xu, Lin, Diao, Golm, Heyse & Ibrahim (2017), suggesting a potential efficiency gain from the proposed methods.

Table 6:

Parameter estimates from control-based multiple imputation for the diabetes trial (baseline covariates standardized).

	Copy reference			Jump to reference
Parameter	Estimate	Std. Err.	P-value	Estimate	Std. Err.	P-value
Intercept	0.930	0.106	< 0.001	0.926	0.102	< 0.001
Treatment	−0.770	0.165	< 0.001	−0.743	0.152	< 0.001
Baseline A1C	0.066	0.093	0.475	0.066	0.090	0.460
Baseline Insulin	0.321	0.095	< 0.001	0.333	0.094	< 0.001
Baseline Weight	−0.264	0.091	0.004	−0.271	0.095	0.004
Dispersion Parameter	4.272	0.389	< 0.001	4.309	0.372	< 0.001


Table 7:

Parameter estimates from control-based multiple imputation for the diabetes trial (original baseline covariates).

	Copy reference			Jump to reference
Parameter	Estimate	Std. Err.	P-value	Estimate	Std. Err.	P-value
Intercept	1.033	0.932	0.268	1.024	0.949	0.281
Treatment	−0.772	0.172	< 0.001	−0.744	0.152	< 0.001
Baseline A1C	0.061	0.093	0.507	0.064	0.093	0.491
Baseline Insulin	0.014	0.004	< 0.001	0.015	0.004	< 0.001
Baseline Weight	−0.014	0.005	0.006	−0.014	0.005	0.004
Dispersion Parameter	4.296	0.381	< 0.001	4.329	0.380	< 0.001


This approach can be further adapted to incorporate a tipping point analysis as a sensitivity analysis, and we use the diabetes trial example to demonstrate. Specifically, considering the jump to reference approach, at the imputation stage we increase the intensity for the treatment group by multiplying it by a constant g (g > 1). As g increases, the number of imputed events for patients who dropped out of the treatment group increases. We gradually increase g until the treatment effect is no longer statistically significant. This value of g is the tipping point. This tipping point approach is similar to the one described in Akacha & Ogundimu (2016). In this case, the tipping point we obtained is g = 4.5. Such a large tipping point is likely to be out of range in practice, suggesting that the significant result is robust.
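The search for the tipping point can be sketched as a simple grid scan. This is only an illustration under stated assumptions: `find_tipping_point`, the grid, the 0.05 significance level, and the toy p-value function are all hypothetical and are not taken from the paper. In a real analysis, `p_value_at(g)` would rerun the imputation with the intensity multiplied by g, refit the negative binomial model, and return the treatment p-value.

```python
from typing import Callable, Iterable, Optional


def find_tipping_point(
    p_value_at: Callable[[float], float],
    grid: Iterable[float],
    alpha: float = 0.05,
) -> Optional[float]:
    """Return the first g on the grid whose treatment p-value is >= alpha.

    Returns None when the effect stays significant over the whole grid.
    """
    for g in grid:
        if p_value_at(g) >= alpha:
            return g
    return None


def toy_p_value(g: float) -> float:
    # Purely illustrative stand-in for "impute with multiplier g, refit, test".
    # It is NOT derived from the diabetes trial data.
    return min(1.0, 0.004 * g * g)


# Increasing grid of multipliers: 1.0, 1.5, ..., 7.0
grid = [1.0 + 0.5 * k for k in range(13)]
tipping_point = find_tipping_point(toy_p_value, grid)
print(tipping_point)
```

A finer grid near the first non-significant value would sharpen the estimate, at the cost of more imputation runs.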

4.2. A Bladder Cancer Trial

We also apply the proposed methods to a bladder cancer trial conducted by the Veterans Administration Co-operative Urological Research Group (Andrews & Herzberg 2012). This randomized clinical trial consisted of 116 patients with superficial bladder cancer who were randomized to receive placebo, pyridoxine (vitamin B6 tablets), or thiotepa (a chemotherapy drug with antitumor activity). As in Akacha & Ogundimu (2016), for illustrative purposes, we focus on the data from the placebo arm and the thiotepa arm, with 47 and 38 patients, respectively. We are interested in assessing the treatment effect on the rate of occurrence of new bladder tumors, accounting for the effects of the number and size of tumors at baseline, which range from 1 to 8 and from 1 to 7, respectively. We set τ_i = τ = 45 (months) for subjects who dropped out of the study prior to the month 45 visit. In addition, we set B = 1000 and m = 100. Computation times for the copy reference imputation and the jump to reference imputation were 149 and 150 seconds, respectively.

We first fit the proportional intensity Gamma frailty model as in the analysis of the diabetes trial data in the previous subsection. Table 8 presents the NPMLEs of the regression parameters and the variance of the Gamma frailty. Table 9 summarizes the results from the negative binomial regression model under the proposed imputation and bootstrap methods. There is a significant effect of the baseline number of tumors; a larger number of baseline tumors is associated with a higher rate of occurrence of new bladder tumors. On the other hand, there appears to be little effect of baseline tumor size. The treatment effect is borderline significant under both control-based imputation methods. As expected, the treatment effect from the jump to reference imputation method is smaller in magnitude than that from the copy reference imputation method, and it has a smaller standard error estimate. Compared to the results in Akacha & Ogundimu (2016), which assumed a parametric form for the baseline intensity function, the parameter estimates from the proposed methods were in the same direction but had smaller standard error estimates. Since the baseline intensity function is unspecified, the proposed methods are more robust to misspecification of the baseline intensity than those in Akacha & Ogundimu (2016).

Table 8:

Parameter estimates from the proportional intensity frailty model for the bladder cancer trial.

Model parameter	Placebo group			Both groups
Model parameter	Estimate	Std. Err.	P-value	Estimate	Std. Err.	P-value
Treatment				−0.559	0.295	0.058
Number of tumors	0.125	0.128	0.332	0.233	0.081	0.004
Size of tumors	0.004	0.120	0.973	−0.024	0.101	0.810
Frailty Variance	0.671	0.311	0.016	0.779	0.280	0.003


Table 9:

Parameter estimates from control-based multiple imputation for the bladder cancer trial.

	Copy reference			Jump to reference
Parameter	Estimate	Std. Err.	P-value	Estimate	Std. Err.	P-value
Intercept	0.464	0.342	0.175	0.409	0.364	0.262
Treatment	−0.409	0.213	0.055	−0.345	0.186	0.063
Baseline Number	0.200	0.078	0.011	0.228	0.084	0.006
Baseline Size	−0.006	0.088	0.945	0.004	0.092	0.963
Dispersion Parameter	0.754	0.233	0.001	0.857	0.248	0.001


5. Discussion

In clinical trials with recurrent events, treatment effects may be evaluated using the intensity of events occurring within the study period. For various reasons, patients may discontinue the study early, which results in right censoring. Conventional analysis may assume non-informative censoring, which is analogous to the MAR assumption in longitudinal clinical trials. These assumptions cannot be verified using the observed data. Therefore, sensitivity analysis is typically recommended to assess the robustness of the analysis results under different assumptions about the potential outcomes that would have been observed had the discontinued patients been followed to the end of the study.

Both copy reference and jump to reference methods are commonly used for sensitivity analysis in longitudinal trials with missing data. In this paper, we evaluated these approaches for clinical trials with recurrent events under a semiparametric proportional intensity frailty model. This extends previous research in this area (Akacha & Ogundimu 2016) to allow for a completely unspecified baseline hazard function. The maximum likelihood estimates for the proportional intensity frailty model are obtained and used to impute the missing data after discontinuation. The variance estimates for the parameters are obtained from a bootstrap procedure, which corrects the overestimation of the variance by the regular multiple imputation analysis. The likelihood approach substantially reduces the computational burden relative to Bayesian MCMC imputation, which makes it feasible for users to apply larger numbers of imputations and bootstrap samples to improve accuracy. The simulations illustrated the computational advantage of the proposed method over the Bayesian MCMC based method of Gao, Liu, Zeng, Xu, Lin, Diao, Golm, Heyse & Ibrahim (2017).

It should be noted that the proportional intensity model assumes that the ratio of intensities for the recurrent events between treatment groups is constant over the study period (after adjusting for other covariates). The cumulative intensity functions can be arbitrary. The parameters involved in the model are latent quantities defined for a study with complete follow-up (without censoring). These parameters are then used to define the sensitivity analysis approaches. For both copy reference and jump to reference, the censoring is assumed to be non-informative for the control group. Therefore, the missing data for the control group are imputed under the non-informative censoring assumption. The missing data in the treatment group are imputed under the copy reference or jump to reference approaches. In the copy reference approach, the intensity profile for dropout patients in the treatment group is the same as that of patients in the control group, both before and after dropout. In the jump to reference approach, it is assumed that after a patient in the treatment group drops out, the intensity profile jumps to that of the control group.

Both approaches assess treatment effects for a hypothetical estimand (ICH E9 (R1) 2017) in which the intensity ratio is measured under a hypothetical condition, that is, assuming we can follow up the dropout patients until the end of the study with no other medication given after discontinuation of the assigned study therapy. This hypothetical condition is infeasible in real clinical trials because patients who discontinue study therapies almost always take other medications, so the outcomes after discontinuation would be confounded by those medications. The hypothetical estimand assesses the ‘pure’ treatment effect of the randomly assigned therapies if a trial could be conducted under this hypothetical condition. Note that imputation has been done for all discontinued patients. Imputation is controversial for patients who died, as the outcomes after death are undefined. This topic is beyond the scope of this paper. Some discussion of estimands and issues regarding death can be found in recent papers (e.g., Leuchs et al. 2017; Wang et al. 2017).

The proposed method can incorporate a pattern-mixture model. Suppose that the full data consist of {Y ≡ (Y_obs, Y_mis),R,X}, where Y_obs and Y_mis denote the observed and missing responses, respectively, the indicator variable R takes value 0 or 1, depending on whether Y is missing or observed, and X is a set of observed covariates. Pattern-mixture models specify the joint distribution of the response and the missing data mechanism by:

$$p(Y_{obs}, Y_{mis}, R \mid X) = p(R \mid X)\, p(Y_{obs}, Y_{mis} \mid R, X),$$

where p(R|X) is the conditional distribution of the missingness pattern given observed covariates and p(Y_obs, Y_mis|R,X) is the probability distribution of the response given the missingness pattern and observed covariates. Under the MAR assumption, p(Y_mis|Y_obs,R = 0,X) = p(Y_mis|Y_obs,R = 1,X). In this case, we can use the conditional distribution p(Y_mis|Y_obs,R = 1,X) to impute the missing data. Under the MNAR assumption, however, this equality does not necessarily hold. One can conduct sensitivity analysis by considering different models for p(Y_mis|Y_obs,R = 0,X). The proposed method is a special pattern-mixture model in which missing responses in the treatment group are imputed from the conditional distribution generated from the control group. The methods can be readily applied under different assumed models in the multiple imputation step. For example, suppose there are two types of missing responses in the treatment group: one is believed to be MAR, and the other is MNAR. Then we can impute those under MAR using the intensity of the treatment group, and impute those under MNAR using the jump to reference approach.
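The mixed MAR/MNAR example can be sketched as a rule for choosing the covariate vector used at the imputation step. This is a hypothetical illustration rather than the authors' implementation: the function name `imputation_covariates`, the labels "MAR" and "MNAR", and the coding of the treatment indicator as 1 (treatment) and 0 (control) are assumptions made here.

```python
import numpy as np


def imputation_covariates(x, treat_idx, mechanism):
    """Covariate vector used to impute events after dropout (hypothetical helper).

    x         : covariates of a treatment-group patient, treatment coded as 1
    treat_idx : position of the treatment indicator in x (assumed coding)
    mechanism : "MAR"  -> keep the treatment-group intensity
                "MNAR" -> jump to reference: use the control-group intensity
    """
    x_tilde = np.array(x, dtype=float)  # copy, so the input is not modified
    if mechanism == "MNAR":
        x_tilde[treat_idx] = 0.0
    elif mechanism != "MAR":
        raise ValueError(f"unknown mechanism: {mechanism!r}")
    return x_tilde


# Hypothetical patient: treatment indicator followed by two standardized covariates.
x_i = [1.0, 0.3, -1.2]
print(imputation_covariates(x_i, treat_idx=0, mechanism="MAR"))
print(imputation_covariates(x_i, treat_idx=0, mechanism="MNAR"))
```

Only the post-dropout covariate vector changes between the two cases; the observed data before dropout are used as recorded.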

One limitation of the proposed methods is that we assume a monotone missing data pattern. While important, it is beyond the scope of this paper to incorporate non-monotone missingness. Future research in this direction is warranted.

Appendix

In this section, we derive the conditional distribution of the Gamma frailty b_i given the observed data. The conditional joint probability density of b_i and $\{t_{i1}, \dots, t_{im_i}\}$ given C_i and X_i is

$$\left[\prod_{t \le C_i} \{\lambda(t)\, b_i \exp(X_i^T \beta)\}^{\Delta N_i(t)}\right] \exp\{-\Lambda(C_i)\, b_i \exp(X_i^T \beta)\}\, \frac{(1/\gamma)^{1/\gamma}}{\Gamma(1/\gamma)}\, b_i^{1/\gamma - 1} \exp\left(-\frac{b_i}{\gamma}\right) = \left\{\prod_{j=1}^{m_i} \lambda(t_{ij}) \exp(X_i^T \beta)\right\} \frac{(1/\gamma)^{1/\gamma}}{\Gamma(1/\gamma)}\, b_i^{1/\gamma + m_i - 1} \exp\left[-b_i \{1/\gamma + \Lambda(C_i) \exp(X_i^T \beta)\}\right].$$

The conditional density of $\{t_{i1}, \dots, t_{im_i}\}$ given C_i and X_i is then

$$\int_0^\infty \left\{\prod_{j=1}^{m_i} \lambda(t_{ij}) \exp(X_i^T \beta)\right\} \frac{(1/\gamma)^{1/\gamma}}{\Gamma(1/\gamma)}\, b_i^{1/\gamma + m_i - 1} \exp\left[-b_i \{1/\gamma + \Lambda(C_i) \exp(X_i^T \beta)\}\right] d b_i = \left\{\prod_{j=1}^{m_i} \lambda(t_{ij}) \exp(X_i^T \beta)\right\} \frac{\Gamma(1/\gamma + m_i)}{\Gamma(1/\gamma)}\, \frac{(1/\gamma)^{1/\gamma}}{\{\Lambda(C_i) \exp(X_i^T \beta) + 1/\gamma\}^{m_i + 1/\gamma}}.$$

By taking the ratio of the above two densities, we obtain that the conditional distribution of b_i given the observed data is Gamma with shape parameter 1/γ + m_i and rate parameter $1/\gamma + \Lambda(C_i) \exp(X_i^T \beta)$. Note that m_i = N_i(C_i).
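As a minimal numerical sketch of this result, the shape and rate of the conditional Gamma distribution can be computed directly and used to draw frailties. The function name `frailty_posterior` and all input values are hypothetical; in practice Λ(C_i), β, and γ would be replaced by their NPMLEs.

```python
import numpy as np


def frailty_posterior(m_i, cum_base_at_c, lin_pred, gamma):
    """Return (shape, rate) of the Gamma distribution of b_i given the observed data.

    m_i           : observed number of events, N_i(C_i)
    cum_base_at_c : cumulative baseline intensity at dropout, Lambda(C_i)
    lin_pred      : linear predictor X_i^T beta
    gamma         : frailty variance
    """
    shape = 1.0 / gamma + m_i
    rate = 1.0 / gamma + cum_base_at_c * np.exp(lin_pred)
    return shape, rate


# Hypothetical inputs, not taken from either trial.
shape, rate = frailty_posterior(m_i=3, cum_base_at_c=1.5, lin_pred=0.2, gamma=0.8)

rng = np.random.default_rng(2020)
# NumPy's gamma sampler takes (shape, scale), so scale = 1 / rate.
frailty_draws = rng.gamma(shape, 1.0 / rate, size=200_000)

print(f"exact mean = {shape / rate:.3f}, Monte Carlo mean = {frailty_draws.mean():.3f}")
```

The conditional mean shape/rate exceeds 1 when a patient has had more events than expected before dropout and falls below 1 otherwise, which is how the observed history informs the imputed future.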

Under the proportional intensity frailty model (1), N_i(τ_i) − N_i(C_i), i.e., the number of events in (C_i, τ_i], follows a Poisson distribution with mean $b_i \{\Lambda(\tau_i) - \Lambda(C_i)\} \exp(\tilde{X}_i^T \beta)$ given b_i, where $\tilde{X}_i$ is the vector of covariate values used in the multiple imputation. It follows that the conditional probability mass function of N_i(τ_i) − N_i(C_i) given (X_i, C_i) and $\tilde{X}_i$ is

$$P(N_i(\tau_i) - N_i(C_i) = k \mid X_i, C_i, \tilde{X}_i) = \int_0^\infty \frac{1}{k!} \exp\left[-b_i \{\Lambda(\tau_i) - \Lambda(C_i)\} \exp(\tilde{X}_i^T \beta)\right] \left[b_i \{\Lambda(\tau_i) - \Lambda(C_i)\} \exp(\tilde{X}_i^T \beta)\right]^k \times \frac{\{1/\gamma + \Lambda(C_i) \exp(X_i^T \beta)\}^{1/\gamma + m_i}}{\Gamma(1/\gamma + m_i)}\, b_i^{1/\gamma + m_i - 1} \exp\left[-b_i \{1/\gamma + \Lambda(C_i) \exp(X_i^T \beta)\}\right] d b_i$$

$$= \frac{1}{k!}\, \frac{\Gamma(1/\gamma + m_i + k)}{\Gamma(1/\gamma + m_i)} \left[\frac{\{\Lambda(\tau_i) - \Lambda(C_i)\} \exp(\tilde{X}_i^T \beta)}{\{\Lambda(\tau_i) - \Lambda(C_i)\} \exp(\tilde{X}_i^T \beta) + 1/\gamma + \Lambda(C_i) \exp(X_i^T \beta)}\right]^k \times \left[\frac{1/\gamma + \Lambda(C_i) \exp(X_i^T \beta)}{\{\Lambda(\tau_i) - \Lambda(C_i)\} \exp(\tilde{X}_i^T \beta) + 1/\gamma + \Lambda(C_i) \exp(X_i^T \beta)}\right]^{1/\gamma + m_i}.$$

The above probability mass function corresponds to a negative binomial distribution with number of failures 1/γ + m_i until the experiment is stopped and success probability

$$\frac{\{\Lambda(\tau_i) - \Lambda(C_i)\} \exp(\tilde{X}_i^T \beta)}{\{\Lambda(\tau_i) - \Lambda(C_i)\} \exp(\tilde{X}_i^T \beta) + 1/\gamma + \Lambda(C_i) \exp(X_i^T \beta)}.$$
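A minimal sketch of drawing the imputed counts from this negative binomial distribution follows. The function name `impute_future_events` and all input values are hypothetical. NumPy's `negative_binomial(n, p)` uses the pmf Γ(k + n) / {k! Γ(n)} p^n (1 − p)^k, so matching the formula above requires n = 1/γ + m_i and p equal to one minus the success probability defined in the text. The implied mean of the imputed count is (1/γ + m_i) a / b, with a and b as named in the code.

```python
import numpy as np


def impute_future_events(m_i, cum_c, cum_tau, lin_pred_obs, lin_pred_imp,
                         gamma, rng, size=None):
    """Draw N_i(tau_i) - N_i(C_i) from its negative binomial distribution.

    lin_pred_obs : X_i^T beta, covariates under which the data were observed
    lin_pred_imp : X~_i^T beta, covariates used for imputation
    """
    r = 1.0 / gamma + m_i                             # "number of failures"
    a = (cum_tau - cum_c) * np.exp(lin_pred_imp)      # future cumulative intensity
    b = 1.0 / gamma + cum_c * np.exp(lin_pred_obs)    # rate of the frailty posterior
    # NumPy's p multiplies the n-th power in the pmf, hence p = b / (a + b).
    return rng.negative_binomial(r, b / (a + b), size=size)


# Hypothetical inputs, not taken from either trial.
rng = np.random.default_rng(2020)
imputed = impute_future_events(m_i=3, cum_c=1.5, cum_tau=2.5, lin_pred_obs=0.2,
                               lin_pred_imp=1.0, gamma=0.8, rng=rng, size=200_000)

# Closed-form mean r * a / b for the same inputs.
r = 1.0 / 0.8 + 3
a = (2.5 - 1.5) * np.exp(1.0)
b = 1.0 / 0.8 + 1.5 * np.exp(0.2)
print(f"exact mean = {r * a / b:.3f}, Monte Carlo mean = {imputed.mean():.3f}")
```

Sampling the count in one step avoids drawing the frailty explicitly; drawing b_i from its conditional Gamma distribution and then a Poisson count with mean b_i a gives the same distribution.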

References

Akacha M & Ogundimu EO (2016), ‘Sensitivity analyses for partially observed recurrent event data’, Pharmaceutical Statistics 15(1), 4–14.
Amorim LD & Cai J (2015), ‘Modelling recurrent events: a tutorial for analysis in epidemiology’, International Journal of Epidemiology 44(1), 324–333.
Andersen PK & Gill RD (1982), ‘Cox’s regression model for counting processes: a large sample study’, The Annals of Statistics 10(4), 1100–1120.
Andersen PK & Keiding N (2002), ‘Multi-state models for event history analysis’, Statistical Methods in Medical Research 11(2), 91–115.
Andrews DF & Herzberg AM (2012), Data: A Collection of Problems from Many Fields for the Student and Research Worker, Springer Science & Business Media.
Carpenter JR, Roger JH & Kenward MG (2013), ‘Analysis of longitudinal trials with protocol deviation: a framework for relevant, accessible assumptions, and inference via multiple imputation’, Journal of Biopharmaceutical Statistics 23(6), 1352–1371.
Cox DR (1972), ‘Regression models and life-tables’, Journal of the Royal Statistical Society: Series B (Methodological) 34(2), 187–202.
Diao G, Zeng D, Hu K & Ibrahim JG (2017), ‘Modeling event count data in the presence of informative dropout with application to bleeding and transfusion events in myelodysplastic syndrome’, Statistics in Medicine 36(22), 3475–3494.
European Medicines Agency (2010), ‘Guideline on missing data in confirmatory clinical trials’. Available from: http://www.ema.europa.eu/docs/en_GB/document_library/Scientific_guideline/2010/09/WC500096793.pdf (accessed: August 4, 2017).
Gao F, Liu GF, Zeng D, Xu L, Lin B, Diao G, Golm G, Heyse JF & Ibrahim JG (2017), ‘Control-based imputation for sensitivity analyses in informative censoring for recurrent event data’, Pharmaceutical Statistics 16(6), 424–432.
Gao F, Liu G, Zeng D, Diao G, Heyse JF & Ibrahim JG (2017), ‘On inference of control-based imputation for analysis of repeated binary outcomes with missing data’, Journal of Biopharmaceutical Statistics 27(3), 358–372.
Huang C-Y, Wang M-C & Zhang Y (2006), ‘Analysing panel count data with informative observation times’, Biometrika 93(4), 763–775.
ICH E9 (R1) (2017), ‘ICH harmonised guidelines: Estimands and sensitivity analysis in clinical trials’. Available from: https://www.ich.org/fileadmin/Public_Web_Site/ICH_Products/Guidelines/Efficacy/E9/E9-R1EWG_Step2_Guideline_2017_0616.pdf (accessed: February 22, 2019).
Keene ON, Roger JH, Hartley BF & Kenward MG (2014), ‘Missing data sensitivity analysis for recurrent event data using controlled imputation’, Pharmaceutical Statistics 13(4), 258–264.
Kelly PJ & Lim LL-Y (2000), ‘Survival analysis for recurrent event data: an application to childhood infectious diseases’, Statistics in Medicine 19(1), 13–33.
Lambert D (1992), ‘Zero-inflated Poisson regression, with an application to defects in manufacturing’, Technometrics 34(1), 1–14.
Lee AH, Wang K & Yau KK (2001), ‘Analysis of zero-inflated Poisson data incorporating extent of exposure’, Biometrical Journal 43(8), 963–975.
Leuchs A-K, Brandt A, Zinserling J & Benda N (2017), ‘Disentangling estimands and the intention-to-treat principle’, Pharmaceutical Statistics 16(1), 12–19.
Li Y, He X, Wang H, Zhang B & Sun J (2015), ‘Semiparametric regression of multivariate panel count data with informative observation times’, Journal of Multivariate Analysis 140, 209–219.
Lin DY, Wei L-J, Yang I & Ying Z (2000), ‘Semiparametric regression for the mean and rate functions of recurrent events’, Journal of the Royal Statistical Society: Series B (Statistical Methodology) 62(4), 711–730.
Liu GF & Pang L (2016), ‘On analysis of longitudinal clinical trials with missing data using reference-based imputation’, Journal of Biopharmaceutical Statistics 26(5), 924–936.
Liu L, Huang X, Yaroshinsky A & Cormier JN (2016), ‘Joint frailty models for zero-inflated recurrent events in the presence of a terminal event’, Biometrics 72(1), 204–214.
Long SJ (1997), Regression Models for Categorical and Limited Dependent Variables, Vol. 7, Sage Publications, Beverly Hills, CA.
Lu K, Li D & Koch GG (2015), ‘Comparison between two controlled multiple imputation methods for sensitivity analyses of time-to-event data with possibly informative censoring’, Statistics in Biopharmaceutical Research 7(3), 199–213.
Mathieu C, Shankar RR, Lorber D, Umpierrez G, Wu F, Xu L, Golm GT, Latham M, Kaufman KD & Engel SS (2015), ‘A randomized clinical trial to evaluate the efficacy and safety of co-administration of sitagliptin with intensively titrated insulin glargine’, Diabetes Therapy 6(2), 127–142.
Meira-Machado L, de Uña-Álvarez J, Cadarso-Suárez C & Andersen PK (2009), ‘Multi-state models for the analysis of time-to-event data’, Statistical Methods in Medical Research 18(2), 195–222.
Murphy SA (1994), ‘Consistency in a proportional hazards model incorporating a random effect’, The Annals of Statistics 22(2), 712–731.
Murphy SA (1995), ‘Asymptotic theory for the frailty model’, The Annals of Statistics 23(1), 182–198.
National Academy of Sciences (2010), ‘The Prevention and Treatment of Missing Data in Clinical Trials. Panel on Handling Missing Data in Clinical Trials, Committee on National Statistics, Division of Behavioral and Social Sciences and Education’.
Parner E (1998), ‘Asymptotic theory for the correlated Gamma-frailty model’, The Annals of Statistics 26(1), 183–214.
Prentice RL, Williams BJ & Peterson AV (1981), ‘On the regression analysis of multivariate failure time data’, Biometrika 68(2), 373–379.
Ridout M, Hinde J & Demétrio CG (2001), ‘A score test for testing a zero-inflated Poisson regression model against zero-inflated negative binomial alternatives’, Biometrics 57(1), 219–223.
Roussel R, Duran-García S, Zhang Y, Shah S, Darmiento C, Shankar RR, Golm GT, Lam RL, O’Neill EA, Gantz I et al. (2019), ‘Double-blind, randomized clinical trial comparing the efficacy and safety of continuing or discontinuing the dipeptidyl peptidase-4 inhibitor sitagliptin when initiating insulin glargine therapy in patients with type 2 diabetes: The CompoSIT-I study’, Diabetes, Obesity and Metabolism 21(4), 781–790.
Rubin DB (1987), Multiple Imputation for Nonresponse in Surveys, Wiley, New York.
Tang Y (2017a), ‘An efficient monotone data augmentation algorithm for multiple imputation in a class of pattern mixture models’, Journal of Biopharmaceutical Statistics 27(4), 620–638.
Tang Y (2017b), ‘An efficient multiple imputation algorithm for control-based and delta-adjusted pattern mixture models using SAS’, Statistics in Biopharmaceutical Research 9(1), 116–125.
Therneau TM & Grambsch PM (2013), Modeling Survival Data: Extending the Cox Model, Springer Science & Business Media.
Wang C, Scharfstein DO, Colantuoni E, Girard TD, Yan Y et al. (2017), ‘Inference in randomized trials with death and missingness’, Biometrics 73(2), 431–440.
Wang M-C, Qin J & Chiang C-T (2001), ‘Analyzing recurrent event data with informative censoring’, Journal of the American Statistical Association 96(455), 1057–1065.
Wang N & Robins JM (1998), ‘Large-sample theory for parametric multiple imputation procedures’, Biometrika 85(4), 935–948.
Wei L-J, Lin DY & Weissfeld L (1989), ‘Regression analysis of multivariate incomplete failure time data by modeling marginal distributions’, Journal of the American Statistical Association 84(408), 1065–1073.
Zeng D, Ibrahim JG, Chen M-H, Hu K & Jia C (2014), ‘Multivariate recurrent events in the presence of multivariate informative censoring with applications to bleeding and transfusion events in myelodysplastic syndrome’, Journal of Biopharmaceutical Statistics 24(2), 429–442.
Zhao X & Tong X (2011), ‘Semiparametric regression analysis of panel count data with informative observation times’, Computational Statistics & Data Analysis 55(1), 291–300.
Zhao X, Tong X & Sun J (2013), ‘Robust estimation for panel count data with informative observation times’, Computational Statistics & Data Analysis 57(1), 33–40.
