Abstract
Survival analysis is used in the medical field to identify the effect of predictive variables on time to a specific event. Generally, not all variation of survival time can be explained by observed covariates. The effect of unobserved variables on the risk of a patient is called frailty. In multicenter studies, the unobserved center effect can induce frailty on its patients, which can lead to selection bias over time when ignored. For this reason, it is common practice in multicenter studies to include a random frailty term modeling center effect. In a more complex event structure, more than one type of event is possible. Independent frailty variables representing center effect can be incorporated in the model for each competing event. However, in the medical context, events representing disease progression are likely related, and this correlation is missed when frailties are assumed to be independent. In this work, an additive gamma frailty model is proposed to account for correlation between center‐level frailties in a competing risks model. Correlation indicates a common center effect on both events and measures how closely the risks are related. Estimation of the model using the expectation‐maximization algorithm is illustrated. The model is applied to a data set from a multicenter clinical trial on breast cancer from the European Organisation for Research and Treatment of Cancer (EORTC trial 10854). Hospitals are compared using empirical Bayes estimates together with corresponding confidence intervals.
Keywords: correlated frailty, competing risks, EM algorithm, multicenter, unobserved heterogeneity
1. INTRODUCTION
Survival data arise where interest lies in the time from a specific time origin until occurrence of an event of interest. Prominent applications are found in the medical field, where, eg, time from diagnosis of disease until death could be studied. What distinguishes survival analysis from other types of statistical analysis is the type of data it deals with: it is generally incomplete. Since it takes time to observe an event, it is usually not possible to collect complete information. A popular method to model the effect of covariates on risk of event occurrence is through the semiparametric Cox proportional hazards model.1 In some situations, more than one type of endpoint is possible, eg, when different causes of death are studied. Analogous to the single endpoint situation, the Cox model can be used to model the effect of covariates on the cause‐specific transition hazards2 of each cause of failure. A more complicated event structure with intermediate states can be modeled by a multistate model.2 Dependence in survival data can be modeled by a random frailty term, which models heterogeneity between observations or between clusters of observations. The frailty term represents unobserved covariates on the individual or cluster level that act on the risk of event occurrence. The frailty variance can be interpreted as a measure of heterogeneity between clusters or individuals; however, it can also be seen as a measure of dependence within a cluster.
Multicenter studies are a common strategy to collect sufficient data for a clinical study. Patients are clustered within treatment centers and possible correlation between patients within a center can be modeled by using a shared frailty model. Shared frailty models are able to model dependence; however, these models limit the unobserved covariates modeled by the frailty to have the same effect within a cluster. In the presence of competing events, the use of one frailty per center acting on all causes of failure is questionable. Similarly, using J independent frailties per center, one for each cause of failure, does not yield a complete picture of the data structure. Frailties for different competing events within a center are likely to be correlated, since they represent the same unobserved covariates on cluster level. Yashin et al3 first introduced a correlated gamma frailty model to analyze twin survival data. They decompose a twin's frailty into a sum of two independent frailties, one of which is shared by both twins. Petersen et al4 use this idea of adding frailty components, which act multiplicatively on the individual hazard, and describe more complex variance components models for survival data.
Clustered data in the presence of competing risks further complicate possible dependence structures and different approaches are taken. Extensions of Fine and Gray's subdistribution hazard model5 incorporate a frailty term to model cluster dependence on the cumulative incidence function of the event of interest in the presence of competing events.6, 7, 8, 9 Wienke et al10, 11 analyze correlated frailty models in the presence of competing risks, however, assuming independence between risks. The assumption of independence is questionable since related events (eg, events representing disease progression) might be influenced similarly by the same unobserved covariates. Wienke et al12 extend the bivariate correlated gamma frailty model of Yashin et al3 to model dependence among competing risks based on parametric marginal survival functions. Gorfine and Hsu13 combine frailty components multiplicatively to model dependence between competing risks for clustered survival data. Liquet et al14 analyze hospital heterogeneity in multistate models using independent and joint frailty models to model dependence between transition intensities. Rotolo et al15 propose to incorporate correlated frailties in multistate models acting on the transition‐specific hazard functions. They construct frailties by combining a common cluster component and a transition‐specific component multiplicatively.
In this paper, we propose an additive gamma frailty model that acts multiplicatively on the cause‐specific hazard to model dependence within clusters and between two competing events. The method can be used to investigate hospital heterogeneity in a competing risks setting. An elegant estimation procedure using the expectation‐maximization algorithm (EM algorithm) is outlined as well as a strategy to calculate the standard error of the estimates. In contrast to Wienke et al12 who model dependence among competing risks by using a parametric approach, our method is based on the semiparametric Cox model.1 Compared to methods suggested by Gorfine and Hsu13 and Rotolo et al,15 which combine frailty components multiplicatively, in this article, a gamma decomposition is proposed to model dependence between risks. The advantage of our method is its simplicity in construction and estimation, which is based on the mathematical properties of the gamma distribution. Additionally, estimation through the EM algorithm provides empirical Bayes estimates for each center's frailty, which can be used to compare centers.
In Sections 2 and 3, the cause‐specific hazards model and frailty model will be reviewed briefly. The proposed competing risks frailty model is presented in Section 4. In Section 5, the method is applied to a data example and corresponding results are presented. A simulation study to investigate the performance of the correlated frailty model is discussed in Section 6. A discussion follows in Section 7.
2. COMPETING RISKS MODEL
Competing risks models are used when more than one type of failure is possible. An example is the study of different causes of death. A fundamental concept in competing risks is the cause‐specific hazard: the hazard of failing from a particular cause at a given time, given that the subject is still event free at that time.
For right‐censored survival times, the cause‐specific hazard of cause j for a subject i with covariate vector X i is as follows:
$$\lambda_j(t \mid X_i) = \lambda_{j0}(t)\,\exp\left(\beta_j^{\top} X_i\right), \tag{1}$$
where λ j0 is the cause‐specific baseline hazard for cause j and β j assesses the effect of the covariates X i on the progression rate to cause j.2 Here, the effects of covariates are quantified on the cause‐specific hazard and not on the marginal hazard. Only if the censoring due to the competing risks is noninformative conditionally on the covariates in the model can the estimates also be interpreted as effects on the marginal hazard.
3. FRAILTY MODEL
The concept of frailty introduces random effects in survival models, which represent the presence of unobserved heterogeneity. The variance of this random component is a measure used to quantify heterogeneity in the data. Vaupel et al16 discussed univariate frailty models with a gamma distribution and applied this concept to survival. Clayton17 used frailties in the multivariate analysis of chronic disease incidence in families.
A frailty is an unobserved random factor varying over the population of individuals, which is assumed to have a multiplicative effect on the hazard of a single individual or a group or cluster of individuals. In univariate frailty models, each individual has its own independent frailty, while in shared frailty models, clustered individuals share a common frailty.
For subject i with covariate vector X i belonging to cluster k with frailty W k, the hazard is given as
$$\lambda(t \mid X_i, W_k) = W_k\,\lambda_0(t)\,\exp\left(\beta^{\top} X_i\right). \tag{2}$$
A convenient choice for the frailty distribution is the gamma distribution, since its posterior distribution given survival data stays in the gamma family.4
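The conjugacy mentioned above can be illustrated for the simplest case of a single shared frailty. The sketch below assumes a prior W ~ Γ(ν, ν) in the shape and rate parameterization (mean one, variance 1/ν) and a cluster contributing `n_events` events and a summed cumulative hazard `cum_hazard`; the function and argument names are illustrative and do not come from the paper.

```python
def gamma_frailty_posterior(nu, n_events, cum_hazard):
    """Posterior (shape, rate) of a shared frailty with prior W ~ Gamma(nu, nu).

    n_events   : number of observed events in the cluster
    cum_hazard : sum over cluster members of Lambda_0(t_i) * exp(beta' x_i)

    The cluster likelihood is proportional to w**n_events * exp(-w * cum_hazard),
    so multiplying by the gamma prior gives another gamma density.
    """
    if nu <= 0 or n_events < 0 or cum_hazard < 0:
        raise ValueError("nu must be positive; events and hazard non-negative")
    return nu + n_events, nu + cum_hazard


def posterior_mean(nu, n_events, cum_hazard):
    """Posterior mean of the frailty, i.e. shape divided by rate."""
    shape, rate = gamma_frailty_posterior(nu, n_events, cum_hazard)
    return shape / rate


# 12 events against an expected 8: the crude ratio is 1.5,
# while the posterior mean with nu = 4 is pulled toward 1.
print(posterior_mean(4.0, 12, 8.0))
```

With no events and no exposure the posterior equals the prior, so the posterior mean is one; as the cluster grows, the posterior mean moves toward the crude ratio of observed to expected events.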
4. COMPETING RISKS FRAILTY MODEL
Heterogeneity between centers in a competing risks setting can be modeled by assigning each center J frailties, one for each cause of failure. The J frailty terms within a center may be chosen to be independent; however, the effects within a center are likely to be related, which is ignored in such a model. In a more realistic model, frailties within a center are correlated. A model for the dependence structure was first proposed by Yashin et al3 in a twin study, decomposing the frailty of each twin as a sum of two independent frailties one of which is shared. Petersen et al4 use an additive variance components structure on multiplicative gamma frailty models and outline its estimation. The correlated frailty model proposed in this article follows their approach.
4.1. Frailty decomposition
In the following, let W k1,W k2 denote the frailty variables corresponding to two causes of failure within hospital k (k = 1,…,K). Correlation between frailties is constructed by decomposing each frailty as the sum of two independent gamma distributed variables, one of which is common in both frailties.18, 19 For cause j ( j = 1,2), frailties are given as
$$W_{kj} = \frac{Z_{k0} + Z_{kj}}{\nu_0 + \nu_j}, \tag{3}$$
where
$$Z_{k0} \sim \Gamma(\nu_0, 1), \qquad Z_{kj} \sim \Gamma(\nu_j, 1), \quad j = 1,2. \tag{4}$$
The random variables Z k0,Z k1,Z k2 are independent and, from now on, referred to as the independent frailty components of hospital k. This results in the following frailty distribution:
$$W_{kj} \sim \Gamma(\nu_0 + \nu_j,\ \nu_0 + \nu_j), \quad j = 1,2. \tag{5}$$
The expectation of the frailty variables is equal to one, which corresponds to no hospital effect or the average hospital effect. Their variance and correlation are given as
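$$\operatorname{Var}(W_{kj}) = \frac{1}{\nu_0 + \nu_j}, \quad j = 1,2, \tag{6}$$

$$\operatorname{Cor}(W_{k1}, W_{k2}) = \frac{\nu_0}{\sqrt{(\nu_0 + \nu_1)(\nu_0 + \nu_2)}}. \tag{7}$$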
This construction allows for positive correlation only. In many practical situations, however, it may be justified to disregard negative correlation, eg, when competing events describe disease progression. A further restriction is that not all combinations of variances and correlation are possible in this construction. A large correlation does not allow the variances to be too different or, equivalently, different frailty variances do not allow the correlation to be (almost) one:
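$$0 \le \operatorname{Cor}(W_{k1}, W_{k2}) \le \sqrt{\frac{\operatorname{Var}(W_{k1})}{\operatorname{Var}(W_{k2})}}, \tag{8}$$

$$0 \le \operatorname{Cor}(W_{k1}, W_{k2}) \le \sqrt{\frac{\operatorname{Var}(W_{k2})}{\operatorname{Var}(W_{k1})}}. \tag{9}$$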
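The relation between the component parameters and the frailty moments can be made concrete with a small sketch. It assumes the construction W kj = (Z k0 + Z kj)/(ν 0 + ν j) with independent Z components following gamma distributions with shapes ν 0, ν 1, ν 2 and rate one; the function names are illustrative only.

```python
import math
import random


def moments_from_nu(nu0, nu1, nu2):
    """Return (Var(W_k1), Var(W_k2), Cor(W_k1, W_k2)) for given shapes."""
    s1, s2 = nu0 + nu1, nu0 + nu2
    return 1.0 / s1, 1.0 / s2, nu0 / math.sqrt(s1 * s2)


def max_correlation(var1, var2):
    """Largest correlation the additive construction can represent."""
    return math.sqrt(min(var1, var2) / max(var1, var2))


def nu_from_moments(var1, var2, cor):
    """Invert moments_from_nu; fails when the combination is not attainable."""
    if not 0.0 <= cor <= max_correlation(var1, var2):
        raise ValueError("correlation outside the range this construction allows")
    nu0 = cor / math.sqrt(var1 * var2)
    # Clip tiny negative values caused by floating point rounding.
    return nu0, max(0.0, 1.0 / var1 - nu0), max(0.0, 1.0 / var2 - nu0)


def sample_frailties(nu0, nu1, nu2, rng):
    """Draw one pair (W_k1, W_k2); all three shapes must be positive."""
    z0 = rng.gammavariate(nu0, 1.0)  # shared component
    z1 = rng.gammavariate(nu1, 1.0)  # component specific to cause 1
    z2 = rng.gammavariate(nu2, 1.0)  # component specific to cause 2
    return (z0 + z1) / (nu0 + nu1), (z0 + z2) / (nu0 + nu2)


nu = nu_from_moments(0.25, 0.25, 0.3)
print(nu, sample_frailties(*nu, random.Random(1)))
```

For example, variances 0.1 and 0.3 cap the correlation at about 0.58, so a requested correlation of 0.8 is rejected.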
4.2. Model estimation
Model parameters are obtained by maximizing the log‐likelihood function based on the observed data. Since frailties associated to different centers and individuals across hospitals are independent, the likelihood is the product of hospital likelihoods. For simplicity, only the log‐likelihood and necessary quantities of a single center k are given in the following.
Denote by n k and d k j the number of patients and the number of patients that fail from cause j ( j = 1,2) in hospital k, respectively. Let X k i, t k i, and δ k i (δ k i = 0,1,2) be the covariate vector for patient i treated at hospital k, the event or censoring time, and the event or censoring indicator, respectively. In the following, let β j be the vector of regression coefficients, λ j0 the baseline hazard, and Λj0 the cumulative baseline hazard for cause j ( j = 1,2). If the frailties were observed, the complete data yields the following log‐likelihood for hospital k
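$$\ell_k = \sum_{j=1}^{2} \left[ d_{kj} \log W_{kj} + \sum_{i=1}^{n_k} \left\{ \mathbf{1}\{\delta_{ki} = j\} \left( \log \lambda_{j0}(t_{ki}) + \beta_j^{\top} X_{ki} \right) - W_{kj}\, \Lambda_{j0}(t_{ki}) \exp\left(\beta_j^{\top} X_{ki}\right) \right\} \right] + \sum_{j=0}^{2} \log f(Z_{kj}), \qquad W_{kj} = \frac{Z_{k0} + Z_{kj}}{\nu_0 + \nu_j}, \tag{10}$$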
where f is the probability density function of the independent and gamma distributed frailty components.
Integrating out all frailty components specific to each center in the log‐likelihood yields the observed data log‐likelihood, which is computationally challenging to maximize (see Appendix A for details). Considering the unobserved frailties as missing information yields a typical application of the EM algorithm.20
4.3. Implementation
For fixed parameter ν=(ν 0,ν 1,ν 2), the estimation procedure uses the EM algorithm to approximate the observed data log‐likelihood to find optimal regression coefficients and baseline hazards.20 The approximated observed data log‐likelihood is then employed in a three‐dimensional search to find a maximum likelihood estimate (MLE) for ν.
Since ν is fixed throughout the EM iterations, the estimation concerns the regression coefficients and baseline hazards only. The conditional expectations of the terms $\log(Z_{k0} + Z_{kj})$, ( j = 1,2), and of $\log f(Z_{kj})$, ( j = 0,1,2), given observed data are irrelevant to the estimation in the complete data case (10), since these terms involve neither the regression coefficients nor the baseline hazards. Therefore, the E‐step reduces to the calculation of the conditional expectations of the frailties W k j = (Z k0 + Z k j)/(ν 0 + ν j), ( j = 1,2) given observed data. As a result, defining $\Lambda_{kj} = \sum_{i=1}^{n_k} \Lambda_{j0}(t_{ki}) \exp(\beta_j^{\top} X_{ki}) / (\nu_0 + \nu_j)$, ( j = 1,2), it is sufficient to consider
| (11) |
| (12) |
| (13) |
where f is the conditional probability density function of a frailty component given data, and c k(l,m,ν 0,ν 1,ν 2) is a function over the number of events of each type of failure for fixed frailty parameters. Details about the computations are outlined in Appendix A.
Since the conditional distributions of the frailty components Z k0,Z k1,Z k2 given observed data are mixtures of gamma distributions (see Appendix A for details), it is straightforward to compute the quantities (11), (12), (13). Notably, the factor c k(l,m,ν 0,ν 1,ν 2) is the same in all three expectations.
The M‐step consists of estimating the updated baseline hazards Λ10(t),Λ20(t) and coefficient vectors β 1,β 2, through maximization of the conditional log‐likelihood, given frailties estimated in the E‐step. This can be done with existing software, eg, using coxph() from the R 21 package survival,22 incorporating the logarithm of the expected frailties as offset into the cause‐specific hazards model. The algorithm iterates over these two steps and stops once the approximation of the observed data log‐likelihood has converged (eg, a change in the log‐likelihood smaller than 1e−06).
Until now, the frailty parameter ν=(ν 0,ν 1,ν 2) was fixed throughout the EM iterations. Profile likelihood is used to obtain the MLE of ν; the function optim() is used to find the optimal ν, maximizing the observed data log‐likelihood approximated with the EM algorithm (see the supplementary material in this paper).
4.4. Estimation of the standard error
Louis23 discussed how to obtain the covariance matrix for the regression parameters, which stays within the EM algorithm framework, using only derivatives of the complete data log‐likelihood. This approach does not yet include the uncertainty caused by estimating the frailty parameters ν=(ν 0,ν 1,ν 2) outside of the EM algorithm. Putter and van Houwelingen24 (see the supplementary material in the article of Putter and van Houwelingen) proposed estimation as described in the following.
Let η̂(ν) denote the MLEs of the regression coefficients and baseline hazards given frailty parameters ν, and denote by ν̂ the MLE of ν maximizing the observed data log‐likelihood. The combined covariance matrix of (ν̂, η̂(ν̂)) is given as
| (14) |
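where Σνν and Σηη are the covariance matrices of the frailty parameter estimates and of the regression parameter estimates for given ν, respectively, and the derivative term contains the partial derivatives of the regression parameter estimates with respect to ν. The term on the bottom right of (14) represents the covariance of the regression parameter estimates evaluated at the estimated frailty parameters; it is obtained using a Taylor expansion and the score functions around the MLEs. The off‐diagonal terms are the covariance matrices between the frailty parameter estimates and the regression parameter estimates and can be derived in a similar way; see Appendix B for details.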
The term Σνν is computed from the Hessian matrix obtained using the hessian() function from the numDeriv package25 around the point estimate of ν found by the optim() function in R.21 We proceed by inverting the negative of the Hessian matrix, since the inverse of the observed profile information is equal to the ν component of the full observed inverse information evaluated at the MLE (see section 8.6.2 in the work of Young and Smith26).
The derivative of the regression parameter estimates with respect to ν is approximated numerically. The derivative around the MLE is estimated by calculating the slope between parameter estimates for values of ν close to the MLE.
The term Σηη can be computed as described by Louis.23 It requires the gradient vector and second derivative matrix of the complete data log‐likelihood, but not the ones associated to the incomplete data case, see Appendix B for details.
The standard error of the estimated regression parameters η can be calculated by taking the square root of the corresponding diagonal elements of the covariance matrix (14). To obtain the standard error of the frailty variances and correlation, we apply the multivariate delta method on Σνν (see section 5.6 in the work of Casella and Berger27). See the supplementary material for implementation in R.
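The delta method step for the frailty variances and correlation can be sketched as follows. The sketch assumes that the variances are 1/(ν 0 + ν j) and the correlation is ν 0 divided by the square root of (ν 0 + ν 1)(ν 0 + ν 2), and it uses a central finite difference Jacobian instead of analytic derivatives; it is written in Python for illustration and is not the R implementation in the supplementary material. All function names are hypothetical.

```python
import math


def frailty_summaries(nu):
    """Map nu = (nu0, nu1, nu2) to [Var(W_k1), Var(W_k2), Cor(W_k1, W_k2)]."""
    nu0, nu1, nu2 = nu
    s1, s2 = nu0 + nu1, nu0 + nu2
    return [1.0 / s1, 1.0 / s2, nu0 / math.sqrt(s1 * s2)]


def numerical_jacobian(func, x, eps=1e-6):
    """Central finite difference Jacobian of func at x (list of rows)."""
    n_out = len(func(x))
    jac = [[0.0] * len(x) for _ in range(n_out)]
    for c in range(len(x)):
        up, down = list(x), list(x)
        up[c] += eps
        down[c] -= eps
        f_up, f_down = func(up), func(down)
        for r in range(n_out):
            jac[r][c] = (f_up[r] - f_down[r]) / (2.0 * eps)
    return jac


def delta_method_se(nu, cov_nu):
    """Standard errors sqrt(g' Sigma g) for each summary, g a Jacobian row."""
    jac = numerical_jacobian(frailty_summaries, list(nu))
    k = len(nu)
    ses = []
    for row in jac:
        var = sum(row[a] * cov_nu[a][b] * row[b]
                  for a in range(k) for b in range(k))
        ses.append(math.sqrt(var))
    return ses


# Illustrative input only: nu and its covariance matrix are made up.
cov = [[0.01, 0.0, 0.0], [0.0, 0.01, 0.0], [0.0, 0.0, 0.01]]
print(delta_method_se((1.2, 2.8, 2.8), cov))
```

A numerical Jacobian keeps the sketch short; analytic derivatives would serve equally well for these simple functions.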
4.5. Empirical Bayes estimates
Heterogeneity between hospitals may raise the question of hospital ranking based on their frailty or relative performance. A popular method to compare institutions is the empirical Bayes approach introduced to this setting by Thomas et al.28 If many centers are involved, a crude center effect estimate may explode for small centers due to large variation and not due to a real center effect.29 The empirical Bayes estimator helps distinguish observations that are “extreme by nature” and those that are “extreme by chance” and is very well suited for the analysis of quality comparison data.30 The empirical Bayes approach not only uses information on a particular center to quantify its performance but also uses information on all centers to help improve the estimate.
Following the work of van Houwelingen,30 the empirical Bayes principle will be outlined. Let X 1,…,X K be independent outcomes with densities f (x k,θ k) and θ 1,…,θ K iid with distribution G. The optimal estimator under mean squared error loss for each θ k is given by the Bayes estimator d(x k|G) = E(θ k|x k,G), when G is known. When G is unknown, one can estimate E(θ k|x k,G) through an estimate of the distribution G. The resulting estimator is shrunken toward the mean, where the amount of shrinkage depends on the variance of the underlying distribution. In the context of center performance, X k represents the outcome and θ k the true unobserved performance of center k.
The E‐step of the EM algorithm computes the empirical Bayes estimate of the center frailties given current model parameters and ν. Hence, computing a last E‐step based on the MLE of regression parameters and ν after convergence of the algorithm will give the empirical Bayes estimate of center frailties.
Even though empirical Bayes estimates are preferred to crude performance estimates when analyzing quality comparison data, interpretation of results should be made with caution, as reasons for different outcomes may lie outside a center's responsibility. Statistical issues in comparing institutions are discussed in more detail in the work of Goldstein and Spiegelhalter.31
The conditional distribution of Z k0,Z k1, and Z k2 given data is the weighted sum of gamma distributions depending on the number of events of each type (see Appendix A for details). To obtain prediction intervals for the empirical Bayes estimates, a simplified sampling procedure is applied.
Step 1. Sample a tuple (l,m) from the set {(0,0),…,(d k1,d k2)}, where d k1 and d k2 are the number of events of type 1 and type 2, respectively.
Step 2. Sample Z k0,Z k1, and Z k2 from gamma distributions Γ(d k1 + d k2 + ν 0 − l − m,1 + Λk1 + Λk2), Γ(l + ν 1,1 + Λk1) and Γ(m + ν 2,1 + Λk2), respectively.
By repeating this sampling procedure many times, lower and upper confidence limits can be found by taking the 2.5% and 97.5% quantiles.
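The sampling procedure can be sketched in a few lines. The text does not state the probabilities with which a tuple (l,m) is drawn, so the sketch takes them as an optional input and defaults to uniform sampling, which is an assumption. The frailties are then formed as W kj = (Z k0 + Z kj)/(ν 0 + ν j), and the gamma distributions are read as shape and rate. Function and argument names are illustrative.

```python
import random


def sample_posterior_frailties(d1, d2, lam1, lam2, nu, n_draws=10000,
                               tuple_weights=None, seed=None):
    """Draw pairs (W_k1, W_k2) for one center following the two steps.

    d1, d2        : number of events of type 1 and type 2 in the center
    lam1, lam2    : the quantities written Lambda_k1 and Lambda_k2 in the text
    nu            : (nu0, nu1, nu2), all positive
    tuple_weights : optional dict {(l, m): weight}; uniform if None (assumption)
    """
    nu0, nu1, nu2 = nu
    rng = random.Random(seed)
    tuples = [(l, m) for l in range(d1 + 1) for m in range(d2 + 1)]
    weights = [1.0 if tuple_weights is None else tuple_weights[t]
               for t in tuples]
    draws = []
    for _ in range(n_draws):
        l, m = rng.choices(tuples, weights=weights)[0]
        # random.gammavariate takes (shape, scale), so scale = 1 / rate.
        z0 = rng.gammavariate(d1 + d2 + nu0 - l - m, 1.0 / (1.0 + lam1 + lam2))
        z1 = rng.gammavariate(l + nu1, 1.0 / (1.0 + lam1))
        z2 = rng.gammavariate(m + nu2, 1.0 / (1.0 + lam2))
        draws.append(((z0 + z1) / (nu0 + nu1), (z0 + z2) / (nu0 + nu2)))
    return draws


def quantile_interval(values, lower=0.025, upper=0.975):
    """Empirical lower and upper quantiles of a list of draws."""
    ordered = sorted(values)
    last = len(ordered) - 1
    return ordered[int(round(lower * last))], ordered[int(round(upper * last))]


draws = sample_posterior_frailties(3, 2, 1.5, 0.8, (1.2, 2.8, 2.8), seed=1)
print(quantile_interval([w1 for w1, _ in draws]))
```

Passing the mixture weights from Appendix A through `tuple_weights` would replace the uniform default.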
5. DATA APPLICATION
5.1. Data description
The data used in this work originates from the European Organisation for Research and Treatment of Cancer trial 10854, which studied the effect of one course of perioperative chemotherapy given directly after surgery on survival.32 The data set includes 2795 women treated for invasive stage I or II breast cancer, randomized for treatment in 15 different centers. Breast cancer is one of the most common types of cancer in women. The standard treatment for breast cancer is surgery, which may be followed by chemotherapy, radiotherapy, or both. Disease progression after surgery can be described in terms of events that a patient might experience. A patient can develop local recurrence (LR), which means that the tumor grows back at the site of surgery and/or might develop distant metastasis (DM), which corresponds to a tumor growth not at the site of surgery and/or she might die.
Patients were excluded from this analysis following exclusion criteria of the trial (n = 41) and if information on relevant covariates was missing (n = 91). Furthermore, all five patients from a particular center were excluded because of the small number of patients treated at this center, leaving a total of 2658 patients from 14 different centers for analysis.
The competing risks model for this data is illustrated in Figure 1. Two competing events are considered, recurrence of disease (LR or DM) and death. The starting state is the state a patient enters after surgery, being alive with no evidence of disease after surgical removal of the primary tumor (ANED).
Figure 1.

Initially, 2658 patients are alive with no evidence of disease (ANED)
The choice of covariates to analyze is based on a previous study on the same data.33 The following prognostic factors are considered in the analysis: age (≥50, 40‐50, <40), tumor size (<2 cm, ≥2 cm), nodal status (negative, positive), type of surgery (mastectomy, breast conserving), perioperative chemotherapy (yes, no), adjuvant chemotherapy (yes, no), and adjuvant radiotherapy (yes, no). Patients' characteristics are provided in Table 1.
Table 1.
Characteristics of 2658 patients
| Variable | N | (%) |
|---|---|---|
| Age | ||
| ≥50 | 1602 | (60.3) |
| 40‐50 | 762 | (28.7) |
| <40 | 294 | (11.1) |
| Tumor size | ||
| <2 cm | 798 | (30.0) |
| ≥2 cm | 1860 | (70.0) |
| Nodal status | ||
| Negative | 1407 | (52.9) |
| Positive | 1251 | (47.1) |
| Surgery | ||
| Mastectomy | 1164 | (43.8) |
| Breast conserving | 1494 | (56.2) |
| Perioperative chemotherapy | ||
| Yes | 1325 | (49.8) |
| No | 1333 | (50.2) |
| Adjuvant chemotherapy | ||
| No | 2173 | (81.8) |
| Yes | 485 | (18.2) |
| Adjuvant radiotherapy | ||
| No | 54 | (2.0) |
| Yes | 2604 | (98.0) |
5.2. Competing risks model with independent frailties
To account for center effect in a cause‐specific regression model, each cause of failure within a hospital is assigned its own independent frailty.
The model can be estimated similarly to the classical competing risks model, by using coxph() together with the frailty() function from the R package survival 22 or the emfrail() function from the frailtyEM 34 package. The results of the estimated model with independent gamma frailties are shown in Table 2.
Table 2.
Cause‐specific hazards model with independent frailties
| | ANED→Recurrence | | ANED→Death | |
|---|---|---|---|---|
| | HR | 95% CI | HR | 95% CI |
| Age | | | | |
| ≥50 | 1.00 | | 1.00 | |
| 40‐50 | 1.00 | 0.85‐1.19 | 0.84 | 0.68‐1.04 |
| <40 | 1.43 | 1.16‐1.76 | 1.03 | 0.79‐1.34 |
| Tumor size (≥2 vs <2 cm) | 1.41 | 1.22‐1.64 | 1.46 | 1.21‐1.76 |
| NodST (pos. vs neg.) | 1.55 | 1.34‐1.79 | 2.22 | 1.87‐2.63 |
| Surgery (cons. vs mast.) | 0.92 | 0.80‐1.05 | 0.82 | 0.70‐0.97 |
| PeriCT (no vs yes) | 1.15 | 1.02‐1.30 | 1.11 | 0.96‐1.29 |
| AdjCT (yes vs no) | 0.79 | 0.64‐0.97 | 0.82 | 0.64‐1.05 |
| AdjRT (yes vs no) | 1.20 | 0.73‐1.98 | 1.12 | 0.62‐2.00 |
| | Variance | SE | Variance | SE |
| Frailty | 0.05 | 0.03 | 0.13 | 0.06 |
Abbreviations: NodST (pos. vs neg.), Nodal status (positive vs negative); Surgery (cons. vs mast.), Surgery (breast conserving vs mastectomy); PeriCT, Perioperative chemotherapy; AdjCT, Adjuvant chemotherapy; AdjRT, Adjuvant radiotherapy. ANED, alive with no evidence of disease; CI, confidence interval; HR, hazard ratio; SE, standard error.
A young age (<40) significantly increases the risk of experiencing recurrence (HR: 1.43; CI: 1.16‐1.76), as do a larger tumor size (HR: 1.41; CI: 1.22‐1.64), a positive nodal status (HR: 1.55; CI: 1.34‐1.79), and not receiving perioperative chemotherapy (HR: 1.15; CI: 1.02‐1.30), whereas adjuvant chemotherapy significantly decreases it (HR: 0.79; CI: 0.64‐0.97). The frailty variance for transition 1 is estimated to be equal to 0.05.
A larger tumor size and a positive nodal status also have a significant effect on death before recurrence with HR: 1.46 (CI: 1.21‐1.76) and HR: 2.22 (CI: 1.87‐2.63). For death, type of surgery also has a significant effect with HR equal to 0.82 (CI: 0.70‐0.97) for breast conserving therapy compared to mastectomy. This finding is unexpected and should probably be ascribed to insufficient adjustment for factors related to the choice of primary surgical treatment. The frailty variance for this transition is estimated to be equal to 0.13.
A different frailty model assigns to each hospital a shared frailty term for both causes of failure. Neither the independent nor the shared frailty model is realistic. The former assumes an independent effect of the unobserved covariates on the two events and the latter assumes them to have the same effect on both events. A model allowing for possible correlation between frailties is probably a more accurate representation of reality.
5.3. Competing risks model with correlated frailties
In Table 3, the results for the competing risks frailty model with correlated frailties are shown.
Table 3.
Cause‐specific hazards model with correlated frailties
| | ANED→Recurrence | | ANED→Death | |
|---|---|---|---|---|
| | HR | 95% CI | HR | 95% CI |
| Age | | | | |
| ≥50 | 1.00 | | 1.00 | |
| 40‐50 | 1.00 | 0.69‐1.44 | 0.35 | 0.05‐2.76 |
| <40 | 1.42 | 0.92‐2.18 | 0.62 | 0.06‐6.48 |
| Tumor size (≥2 vs <2 cm) | 1.41 | 1.05‐1.89 | 0.96 | 0.25‐3.73 |
| NodST (pos. vs neg.) | 1.55 | 1.15‐2.08 | 1.72 | 0.47‐6.27 |
| Surgery (cons. vs mast.) | 0.92 | 0.70‐1.22 | 0.65 | 0.18‐2.31 |
| PeriCT (no vs yes) | 1.15 | 0.89‐1.48 | 1.14 | 0.35‐3.70 |
| AdjCT (yes vs no) | 0.79 | 0.50‐1.27 | 0.80 | 0.06‐10.08 |
| AdjRT (yes vs no) | 1.18 | 0.81‐1.71 | 0.66 | 0.12‐3.72 |
| | Variance | SE | Variance | SE |
| Frailty | 0.05 | 0.03 | 0.27 | 0.22 |
| | Correlation | SE | | |
| Correlation | 0.37 | 0.18 | | |
Abbreviations: NodST (pos. vs neg.), Nodal status (positive vs negative); Surgery (cons. vs mast.), Surgery (breast conserving vs mastectomy); PeriCT, Perioperative chemotherapy; AdjCT, Adjuvant chemotherapy; AdjRT, Adjuvant radiotherapy. ANED, alive with no evidence of disease; CI, confidence interval; HR, hazard ratio; SE, standard error.
The hazard ratios for recurrence are almost unchanged compared to the independent frailty model. However, in the correlated frailty model, nodal status and tumor size are the only significant factors. The hazard ratios for death without recurrence are very different from the independent frailty model. This can be explained by the small number of deaths without recurrence in the data set. The variation added by additionally estimating the frailties increased the standard errors, and fewer variables are significant.
The variance of the frailty for transition 1 (ANED → Recurrence) is equal to 0.05 with a standard error of 0.03. For transition 2 (ANED → Death), the frailty variance is equal to 0.27 with a standard error of 0.22. The correlation of the frailties is estimated to be equal to 0.37 with a standard error of 0.18. Given these frailty variances, the maximum correlation between frailties in this model is 0.43 resulting from inequalities (8) and (9).
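As a quick check of the maximum correlation quoted above, assuming the bound takes the form of the square root of the ratio of the smaller to the larger frailty variance, the estimated variances 0.05 and 0.27 give:

$$\operatorname{Cor}(W_{k1}, W_{k2}) \le \sqrt{\frac{\min\{\operatorname{Var}(W_{k1}), \operatorname{Var}(W_{k2})\}}{\max\{\operatorname{Var}(W_{k1}), \operatorname{Var}(W_{k2})\}}} = \sqrt{\frac{0.05}{0.27}} \approx 0.43 .$$

The estimated correlation of 0.37 therefore lies fairly close to the upper end of the range that this construction can represent.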
5.4. Empirical Bayes estimates
Figure 2 shows the empirical Bayes estimates of the frailties of each center together with 95% prediction intervals, for event recurrence and death. A value equal to 1 implies that there is no center effect. Centers are ordered by number of patients treated. The prediction intervals are computed by sampling from the gamma mixture distribution of frailties and taking 2.5% and 97.5% quantiles as lower and upper limits.
Figure 2.

Empirical Bayes estimates of frailties and 95% prediction intervals for event recurrence and death of 14 centers, sorted by number of patients
The left panel of Figure 2 shows the frailties for the event recurrence for 14 hospitals ordered by number of patients treated. Two hospitals (9 and 11) have a significantly increased risk for their patients to develop recurrence. One hospital (12) has a significantly decreased risk for its patients to develop recurrence. Further, one can see that the width of the prediction intervals decreases with a growing number of patients in the hospital.
The right panel of Figure 2 shows that one hospital (11) has an increased risk for its patients to move to the state death. One hospital (14) has a marginally significant decreased risk for its patients to die.
To visualize the relation of the frailties within a hospital, the empirical Bayes estimates of the two frailties for each center are plotted against each other in Figure 3, together with the joint empirical distribution of the frailties for two centers with index 11 and 12.
Figure 3.

Empirical Bayes estimates of frailties for two causes of failure plotted together for 14 centers. For centers with index 11 and 12 the joint empirical distribution of the frailties is shown in red and blue, respectively
The hospital effects on a patient can be investigated by looking at the difference in cumulative hazard and cumulative incidence between the hospitals for a particular patient. This is shown in Figure 4, for a patient whose covariate values correspond to the mean covariate values in the data.
Figure 4.

Upper panels: cumulative hazards for an average patient for recurrence (on the left) and death (on the right); each line represents a hospital. Lower panels: cumulative incidence of an average patient for recurrence (on the left) and death (on the right); each line represents a hospital
A pairwise comparison of cumulative incidence curves for an average patient treated in two hospitals further illustrates the difference in effects. This is depicted in Figure 5, which shows the stacked cumulative incidence curves for an average patient treated in the two hospitals with the lowest and highest frailties for recurrence. The prognosis shown in the left panel estimates a lower risk for both events, compared to the right panel. This is explained by the estimated correlation between frailties (Table 3) and the empirical Bayes estimates of the hospitals (Figure 3), which indicate that a hospital with a decreased risk for one cause also has a decreased risk for the other cause. This makes the hospital corresponding to the left panel more appealing.
Figure 5.

Left panel: stacked cumulative incidence curves for an average patient treated in hospital with lowest estimated frailty for recurrence. Right panel: cumulative incidence curves for an average patient treated in hospital with highest estimated frailty for recurrence
6. SIMULATION
To investigate the performance of the correlated frailty model, a simulation study is conducted. Multiple data scenarios are simulated and the results of the independent and correlated frailty model are compared. Motivated by the data example from Section 5, a similar scenario with 2700 patients distributed equally over 15 centers is used for simulation. To study how the number of centers affects the estimation, different scenarios with 5, 30, and 50 centers are considered, while keeping the total number of patients fixed to 2700 (see Table 4).
Table 4.
Scenarios for simulation
| Scenario | n | K | n k | Var(W k1) | Var(W k2) | Cor(W k1,W k2) | Correlation Bounds |
|---|---|---|---|---|---|---|---|
| A | 2700 | 5 | 540 | 0.25 | 0.25 | 0.3 | (0, 1) |
| B | 2700 | 15 | 180 | 0.25 | 0.25 | 0.3 | (0, 1) |
| C | 2700 | 30 | 90 | 0.25 | 0.25 | 0.3 | (0, 1) |
| D | 2700 | 50 | 54 | 0.25 | 0.25 | 0.3 | (0, 1) |
| E | 2700 | 15 | 180 | 0.1 | 0.3 | 0.8 | (0, 0.58) |
| F | 2700 | 15 | 180 | 0.25 | 0.25 | −0.3 | (0, 1) |
Notation: n, total number of patients; K, number of centers; n k, number of patients per center; W kj (j = 1, 2), center‐specific frailty for cause j.
Survival times are generated by using two Weibull baseline hazards with a common shape parameter a and rate parameters b 1 and b 2 for the two causes of failure, respectively. Weibull parameters are fixed throughout the data scenarios and are estimated from the data example of Section 5 (a = 1.01, b 1 = 0.05, b 2 = 0.03).
Different frailty variance structures are simulated in the different scenarios. Using an additive gamma model as presented in Section 4, correlated frailties are sampled with variances equal to 0.25 and correlation equal to 0.3 for scenarios A, B, C, and D. As discussed in Section 4, different frailty variances by construction do not allow too large correlations; in addition, correlation is assumed to be positive to use the proposed method. To study the performance of the method proposed in this article, data scenarios E and F that violate these assumptions are simulated. Center and patient distribution are set closest to the data example (15 centers with 180 patients each). Frailties for scenarios E and F in Table 4 are sampled from a multivariate lognormal distribution. Scenario E considers a situation in which the correlation is too large to be modeled: frailty variances are equal to 0.1 and 0.3 for cause 1 and cause 2, respectively, while correlation is equal to 0.8. Scenario F represents a situation in which negative correlation is present, with frailty variances equal to 0.25 and correlation equal to −0.3.
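The two frailty-generating mechanisms can be sketched as follows. The exact parametrization of Section 4 is not reproduced here, so the gamma sampler uses the standard shared-component construction (an assumption): independent gamma components Y0, Y1, Y2 with unit scale, and W_j = (Y0 + Y_j) / (k0 + k_j). The lognormal sampler assumes frailties with mean 1. Function names and seeds are hypothetical.

```python
import numpy as np

def sample_gamma_frailties(n_centers, var1, var2, rho, rng):
    """Correlated gamma frailties with mean 1 via a shared additive component."""
    k0 = rho / np.sqrt(var1 * var2)   # shape of the component shared by both causes
    k1 = 1.0 / var1 - k0              # shape of the cause-1 specific component
    k2 = 1.0 / var2 - k0              # shape of the cause-2 specific component
    if k0 < 0 or k1 < 0 or k2 < 0:
        raise ValueError("correlation not attainable with this construction")

    def draw(shape):
        # A shape of zero means the component is absent.
        return rng.gamma(shape, 1.0, n_centers) if shape > 0 else np.zeros(n_centers)

    y0, y1, y2 = draw(k0), draw(k1), draw(k2)
    w1 = (y0 + y1) * var1             # same as dividing by k0 + k1 = 1 / var1
    w2 = (y0 + y2) * var2
    return w1, w2

def sample_lognormal_frailties(n_centers, var1, var2, rho, rng):
    """Bivariate lognormal frailties with mean 1 and the requested moments."""
    s11, s22 = np.log1p(var1), np.log1p(var2)       # log-scale variances
    s12 = np.log1p(rho * np.sqrt(var1 * var2))      # log-scale covariance
    mean = [-s11 / 2.0, -s22 / 2.0]                 # gives E[W] = 1
    z = rng.multivariate_normal(mean, [[s11, s12], [s12, s22]], n_centers)
    w = np.exp(z)
    return w[:, 0], w[:, 1]

rng = np.random.default_rng(2019)
w1_b, w2_b = sample_gamma_frailties(15, 0.25, 0.25, 0.3, rng)       # scenario B
w1_f, w2_f = sample_lognormal_frailties(15, 0.25, 0.25, -0.3, rng)  # scenario F
```

Under this construction the gamma sampler rejects the settings of scenarios E and F, which is why a different distribution is needed to generate them.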
Table 4 summarizes all scenarios simulated. Censoring times are simulated from a uniform distribution between 9 and 14 years, motivated by the data example.
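Generating the event times for one center can be sketched as below. The sketch assumes cumulative cause-specific hazards of the form w_j · b_j · t^a, omits covariates, and uses hypothetical names; it is one way to implement the described design, not the supplementary code itself.

```python
import numpy as np

def simulate_center(w1, w2, n_patients, rng, a=1.01, b1=0.05, b2=0.03,
                    cens_low=9.0, cens_high=14.0):
    """Simulate competing-risks data for one center with frailties (w1, w2)."""
    r1, r2 = w1 * b1, w2 * b2
    total = r1 + r2
    # Inverse-transform sampling: the all-cause cumulative hazard is total * t**a.
    event_time = (rng.exponential(1.0, n_patients) / total) ** (1.0 / a)
    # With a common shape parameter the failure cause is independent of the time.
    cause = np.where(rng.random(n_patients) < r1 / total, 1, 2)
    cens_time = rng.uniform(cens_low, cens_high, n_patients)
    time = np.minimum(event_time, cens_time)
    status = np.where(event_time <= cens_time, cause, 0)   # 0 = censored
    return time, status

rng = np.random.default_rng(8002)
# Frailty values here are placeholders for one sampled center.
time, status = simulate_center(w1=1.2, w2=0.9, n_patients=180, rng=rng)
```

Because both causes share the shape parameter, drawing the all-cause time first and the cause second is equivalent to taking the minimum of two latent cause-specific times.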
For each scenario, 1000 data sets are simulated for which two models are estimated: a model with independent frailties for the two causes and the proposed correlated frailty model. Results for frailty variance and empirical Bayes estimates are shown in Tables 5 and 6, respectively.
Table 5.
Frailty variance results of simulation study for 6 different data scenarios
| Scenario | Parameter | True Value | Correlated Frailty Model: Mean (avSE; empSE) | Correlated: Bias | Correlated: RMSE | Independent Frailty Model: Mean (empSE) | Independent: Bias | Independent: RMSE |
|---|---|---|---|---|---|---|---|---|
| A | Var(W k1) | 0.25 | 0.20 (0.13; 0.15) | −0.05 | 0.16 | 0.72 (0.28) | 0.47 | 0.55 |
| | Var(W k2) | 0.25 | 0.20 (0.13; 0.15) | −0.05 | 0.16 | 0.59 (0.29) | 0.34 | 0.45 |
| | Cor(W k1,W k2) | 0.3 | 0.29 (0.20; 0.29) | −0.01 | 0.29 | | | |
| B | Var(W k1) | 0.25 | 0.24 (0.10; 0.10) | −0.01 | 0.10 | 0.57 (0.18) | 0.32 | 0.37 |
| | Var(W k2) | 0.25 | 0.24 (0.10; 0.10) | −0.01 | 0.10 | 0.42 (0.18) | 0.17 | 0.24 |
| | Cor(W k1,W k2) | 0.3 | 0.3 (0.21; 0.22) | 0.00 | 0.22 | | | |
| C | Var(W k1) | 0.25 | 0.25 (0.07; 0.08) | 0.00 | 0.08 | 0.39 (0.13) | 0.14 | 0.19 |
| | Var(W k2) | 0.25 | 0.25 (0.08; 0.08) | 0.00 | 0.08 | 0.30 (0.12) | 0.05 | 0.12 |
| | Cor(W k1,W k2) | 0.30 | 0.33 (0.19; 0.17) | 0.03 | 0.17 | | | |
| D | Var(W k1) | 0.25 | 0.25 (0.06; 0.06) | 0.00 | 0.06 | 0.29 (0.09) | 0.04 | 0.10 |
| | Var(W k2) | 0.25 | 0.24 (0.07; 0.07) | −0.01 | 0.07 | 0.25 (0.08) | 0.00 | 0.08 |
| | Cor(W k1,W k2) | 0.3 | 0.33 (0.16; 0.15) | 0.03 | 0.16 | | | |
| E | Var(W k1) | 0.1 | 0.09 (0.04; 0.04) | −0.01 | 0.04 | 0.18 (0.12) | 0.08 | 0.15 |
| | Var(W k2) | 0.3 | 0.19 (0.08; 0.08) | −0.11 | 0.13 | 0.41 (0.18) | 0.11 | 0.22 |
| | Cor(W k1,W k2) | 0.80 | 0.67 (0.17; 0.15) | −0.13 | 0.20 | | | |
| F | Var(W k1) | 0.25 | 0.20 (0.08; 0.08) | −0.05 | 0.20 | 0.51 (0.18) | 0.26 | 0.32 |
| | Var(W k2) | 0.25 | 0.20 (0.08; 0.09) | −0.05 | 0.10 | 0.37 (0.19) | 0.12 | 0.22 |
| | Cor(W k1,W k2) | −0.3 | 0.02 (0.05; 0.06) | 0.32 | 0.32 | | | |
Abbreviations and notation: empSE, empirical standard error; avSE, average standard error; RMSE, root‐mean‐square error; W kj (j = 1, 2), center‐specific frailty for cause j.
Table 6.
Empirical Bayes results of simulation study for six different data scenarios
| Scenario | Parameter | Bias | RMSE | Coverage | Bias(F(t 1)) | Bias(F(t 2)) | Bias(F(t 3)) | RMSE(F(t 1)) | RMSE(F(t 2)) | RMSE(F(t 3)) |
|---|---|---|---|---|---|---|---|---|---|---|
| A | W k1 | 0.10 | 0.30 | 0.39 | 0.00 | 0.00 | −0.04 | 0.01 | 0.02 | 0.05 |
| | W k2 | 0.10 | 0.30 | 0.49 | 0.00 | 0.00 | −0.02 | 0.01 | 0.01 | 0.03 |
| B | W k1 | 0.12 | 0.23 | 0.75 | 0.00 | 0.00 | −0.04 | 0.02 | 0.03 | 0.06 |
| | W k2 | 0.11 | 0.24 | 0.82 | 0.00 | 0.00 | −0.02 | 0.01 | 0.03 | 0.04 |
| C | W k1 | 0.10 | 0.23 | 0.88 | 0.00 | 0.00 | −0.04 | 0.02 | 0.04 | 0.07 |
| | W k2 | 0.09 | 0.25 | 0.91 | 0.00 | 0.00 | −0.03 | 0.02 | 0.03 | 0.05 |
| D | W k1 | 0.09 | 0.25 | 0.92 | 0.00 | 0.00 | −0.04 | 0.03 | 0.05 | 0.07 |
| | W k2 | 0.08 | 0.28 | 0.93 | 0.00 | 0.00 | −0.02 | 0.02 | 0.04 | 0.06 |
| E | W k1 | 0.04 | 0.14 | 0.86 | 0.00 | 0.00 | −0.04 | 0.02 | 0.03 | 0.06 |
| | W k2 | 0.10 | 0.23 | 0.80 | 0.00 | 0.00 | −0.02 | 0.01 | 0.02 | 0.04 |
| F | W k1 | 0.09 | 0.20 | 0.79 | 0.00 | 0.00 | −0.04 | 0.02 | 0.03 | 0.06 |
| | W k2 | 0.08 | 0.22 | 0.85 | 0.00 | 0.00 | −0.03 | 0.01 | 0.03 | 0.05 |
Abbreviations and notation: RMSE, root‐mean‐square error; Coverage, coverage of 95% prediction intervals; F, cause‐specific cumulative incidence; t 1,t 2,t 3, quartiles of overall event time distribution; W kj (j = 1, 2), center‐specific frailty for cause j.
Table 5 shows that the independent frailty model generally overestimates the frailty variances, with a large bias and a large root‐mean‐square error (RMSE). The overestimation is more pronounced in data sets with fewer centers.
The correlated frailty model produces estimates that are, on average, closer to the true parameter values, with a bias of less than half the empirical standard error except in scenarios E and F. Its empirical standard errors are smaller than those of the independent model and are comparable to the average standard error, which is close to, and in most cases no larger than, the empirical standard error. Root‐mean‐square errors are generally smaller for the correlated frailty model than for the independent model.
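The summary measures reported in Table 5 can be computed from the per-replicate estimates as sketched below. The definitions are the conventional ones and are assumed here, since they are not spelled out in this section; the function name and the example numbers are hypothetical.

```python
import numpy as np

def summarize_estimates(estimates, std_errors, true_value):
    """Summarize one parameter over simulation replicates."""
    est = np.asarray(estimates, dtype=float)
    mean = est.mean()
    return {
        "mean": mean,
        "bias": mean - true_value,                   # average estimate minus truth
        "emp_se": est.std(ddof=1),                   # spread of the estimates
        "av_se": float(np.mean(std_errors)),         # mean of model-based SEs
        "rmse": float(np.sqrt(np.mean((est - true_value) ** 2))),
    }

# Four made-up replicates of a variance estimate with true value 0.25.
summary = summarize_estimates([0.20, 0.30, 0.25, 0.15],
                              [0.05, 0.07, 0.06, 0.06], 0.25)
```

These definitions imply that the squared RMSE is approximately the squared bias plus the squared empirical standard error. For scenario A, the reported bias of −0.05 and empSE of 0.15 give sqrt(0.0025 + 0.0225) ≈ 0.16, which matches the tabulated RMSE.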
For scenarios with a larger number of centers, better estimation results are obtained. Scenario D, with 50 centers per data set, shows the best estimation results: average standard errors are close to the empirical standard errors and RMSEs are small. Scenario E showcases a situation in which the correlation is too large to be modeled with the additive gamma construction. Given the frailty variances of 0.1 and 0.3, the correlation is restricted to the interval (0, 0.58) (see Table 4 and Equations (8), (9)). The method in this case finds a middle ground and underestimates the frailty variance for cause 2 to allow for a larger correlation. Scenario F considers negative correlation. In this case, frailty variances are underestimated; however, they are still closer to the true values than the estimates of the independent model, and the correlation estimate is very close to 0.
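The restriction on the correlation can be illustrated numerically. Equations (8) and (9) are not reproduced in this section, so the sketch below assumes the usual bound for an additive gamma frailty with a shared component, namely the square root of the ratio of the smaller to the larger variance. The function name is hypothetical.

```python
import math

def max_correlation(var1, var2):
    """Largest correlation attainable under the assumed additive gamma bound."""
    return math.sqrt(min(var1, var2) / max(var1, var2))

bound_e = max_correlation(0.1, 0.3)     # scenario E: unequal variances
bound_b = max_correlation(0.25, 0.25)   # scenarios A-D and F: equal variances
```

Under this assumption the bound is about 0.58 for scenario E and 1 for equal variances, in agreement with the last column of Table 4; the true correlation of 0.8 in scenario E lies outside it.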
Table 6 shows summary measures of empirical Bayes estimates over the different data scenarios. Bias and RMSEs are reported together with coverage probabilities of the prediction intervals, which are obtained with the sampling method described in Section 4. The number of centers has a stronger effect on the empirical Bayes estimates than on the frailty variance estimates. Scenario A, with only five centers, shows very poor coverage of the 95% prediction intervals, with probabilities of 0.394 and 0.492 for the empirical Bayes estimates corresponding to cause 1 and cause 2, respectively. Scenarios with 15 centers (B, E, and F) achieved coverage probabilities between 0.749 and 0.864, and scenarios with more centers (C and D) achieved values between 0.877 and 0.930. Bias and RMSE of the empirical Bayes estimates appear consistent over the different scenarios. To quantify the performance of the method on the estimation of the center‐specific cumulative incidence, its bias and RMSE are estimated at the quartiles of the theoretical overall event time distribution (t 1 = 3.55, t 2 = 8.48, t 3 = 16.85). The estimates appear unbiased at t 1 and t 2 but worsen at the later time t 3. Interestingly, the bias and RMSEs appear not to be influenced much by the number of centers, and the RMSE even becomes slightly worse when more centers are present in the data. An explanation could be that the estimation of the cumulative incidence becomes more challenging because the data are generated from many different hazard rates.
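The quoted quartiles can be reproduced under a simple reading of the simulation design: frailties set to their mean of 1 and an all-cause cumulative hazard of (b 1 + b 2) · t^a. This parametrization is an assumption, and the function name is hypothetical.

```python
import math

def overall_event_time_quantile(p, a=1.01, b1=0.05, b2=0.03):
    """Quantile of the all-cause event time when S(t) = exp(-(b1 + b2) * t**a)."""
    return (-math.log(1.0 - p) / (b1 + b2)) ** (1.0 / a)

t1, t2, t3 = (overall_event_time_quantile(p) for p in (0.25, 0.50, 0.75))
```

Rounded to two decimals, the three values agree with t 1 = 3.55, t 2 = 8.48 and t 3 = 16.85. Since censoring times lie between 9 and 14 years, t 3 falls beyond the follow-up window, which may contribute to the poorer performance at that time point.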
For some of the simulated data sets, the standard errors of the frailty variance and correlation estimates could not be obtained because the Hessian matrix obtained during optimization was not positive definite. In this case, a second attempt was made by restarting the optimization of the frailty components from a different starting value. This procedure was able to compute results in some cases (see Table 7). If the Hessian was still not positive definite, the data set was discarded. The number of failed estimations depended strongly on the number of centers in the data set. Counts of successful second attempts and discarded data sets are given in Table 7.
Table 7.
Failed estimation of standard error
| Scenario | Total Nbr. of Data Sets | Total Successful Runs | Success at Second Attempt | Failed Estimation | Evaluated |
|---|---|---|---|---|---|
| A | 1200 | 1062 | 80 | 138 | 1000 |
| B | 1200 | 1165 | 17 | 35 | 1000 |
| C | 1200 | 1117 | 1 | 83 | 1000 |
| D | 1200 | 1142 | 0 | 58 | 1000 |
| E | 1200 | 1046 | 232 | 154 | 1000 |
| F | 1200 | 1196 | 0 | 4 | 1000 |
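The retry logic described above can be sketched as follows. The callable `fit` is a hypothetical stand-in for the optimization of the frailty parameters and is assumed to return the estimate together with the Hessian of the negative log-likelihood; none of these names come from the supplementary code.

```python
import numpy as np

def standard_errors_from_hessian(hessian):
    """Return standard errors, or None if the Hessian is not positive definite."""
    h = np.asarray(hessian, dtype=float)
    h = (h + h.T) / 2.0                      # remove numerical asymmetry
    try:
        np.linalg.cholesky(h)                # succeeds only for positive definite h
    except np.linalg.LinAlgError:
        return None
    return np.sqrt(np.diag(np.linalg.inv(h)))

def fit_with_retry(fit, starting_values):
    """Try each starting value in turn; give up (None) if all attempts fail."""
    for start in starting_values:
        estimate, hessian = fit(start)
        se = standard_errors_from_hessian(hessian)
        if se is not None:
            return estimate, se
    return None                              # data set would be discarded
```

Passing two starting values reproduces the two-attempt scheme summarized in Table 7.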
7. DISCUSSION AND CONCLUSION
Using shared frailty models to account for unobserved covariates in multicenter studies is common practice to avoid bias and to measure the amount of heterogeneity between centers. Correlated frailty models extend the shared frailty model by incorporating dependence structures between related individuals. Dependence among the transition intensities of competing risks has also become a topic of interest.12, 14, 15
The model presented uses correlated gamma frailties to model dependence within hospitals and between two competing risks. The mathematical properties of the gamma distribution are exploited to construct and estimate correlated frailties. An estimation procedure using the EM algorithm is outlined and estimation of the standard error is illustrated. The estimation procedure provides empirical Bayes estimates for hospital frailties, which, together with their prediction intervals, can be used to compare hospital effects. The model is applied to breast cancer data and a moderate correlation between the frailties of the competing events recurrence and death is estimated. A simulation study is conducted to investigate performance of the method in different situations. Data scenarios with differing number of centers and correlation structures are considered and estimates of a model with independent frailties are compared to the proposed correlated frailty model. The performance of the empirical Bayes estimates obtained by the method was studied under different conditions.
The independent frailty model proved unable to account for center frailty when the frailties are correlated. The correlated frailty model outperformed it in all data scenarios, with respect to both the estimates and the size of the empirical standard errors. Its estimation benefits from a larger number of centers in the data. In data scenarios with unattainable correlation structures, it still performed reasonably well and behaved in a predictable way.
The method is well suited to investigate hospital heterogeneity in the presence of competing risks. It distinguishes between common and separate effects of a hospital on two competing events and performed well in a simulation study. The proposed model can be extended to the case of more than two competing events. Dependence between risks can be modeled by adding frailty components, where shared components induce dependence between risks. However, the model is limited to positive correlation between frailties.
Wienke et al12 pointed out that, in the case of cause‐specific mortality, the presence of risk factors might increase the risk of death with respect to all diseases, making the case for positive dependence between risks. At the same time, the authors argue that everyone dies eventually, so if the risk of death from one cause is decreased, the risk from another cause must be increased, which suggests negative correlation between risks. Further study should be dedicated to the nature of dependencies among competing risks.
Putter and van Houwelingen35 compare a two‐point frailty distribution to a gamma distribution to model association between transition times in multistate models. An advantage of the two‐point frailty model is that it allows the two frailty terms to operate on different scales and that, in contrast to the gamma distribution, it allows negative association. In their simulation study, the two‐point frailty outperforms the gamma distribution. A similar model could be used in the competing risks setting to model dependence between risks, possibly with three or four points.
Supporting information
SIM_8002‐Supp‐0001‐RCodeforSIM.R
ACKNOWLEDGEMENTS
Research leading to this article was supported by the KWF Kankerbestrijding under grant UL2015‐8028. The European Organisation for Research and Treatment of Cancer (EORTC) is gratefully acknowledged for making data from EORTC trial 10854 available for this analysis.
APPENDIX A.
PROBABILITIES FOR E‐STEP
Let z k0,z k1,z k2 be the independent gamma distributed frailty components and let d kj ( j = 1,2) be the number of failures of type j in hospital k (k = 1,…,K ). Defining ( j = 1,2), the conditional probability of the data given frailty components is given as
Integrating over the frailty components yields the following conditional probabilities:
The observed data likelihood is given as
The conditional probabilities of the frailty components given the data necessary for the E‐step are given as
where
APPENDIX B.
OBSERVED INFORMATION OF REGRESSION PARAMETERS
The term can be computed as described by Louis.23
Let ℓ ∗ and ℓ be the log‐likelihood and the conditional log‐likelihood given frailties. The Fisher information for can be rewritten in terms of the conditional log‐likelihood given as
(A1)
where W are the unobserved frailties and R is the set of possible frailties given the data. Notably, the last term is zero at the MLE, and thus, a simplified notation for the Fisher information at the MLE is given as
where the first term represents the full information and the second term represents the loss of information due to the unobserved frailties.
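For orientation, Louis's identity in its usual form can be written as below. The symbols β for the regression parameters and D for the observed data are assumptions introduced here; ℓ ∗ and ℓ are the log‐likelihood and the conditional log‐likelihood given the frailties W, and expectations are taken over W given D.

```latex
\begin{align*}
I(\beta) &=
  E_{W}\!\left[-\frac{\partial^{2}\ell}{\partial\beta\,\partial\beta^{\top}} \,\middle|\, D\right]
  - E_{W}\!\left[\frac{\partial\ell}{\partial\beta}\,
                 \frac{\partial\ell}{\partial\beta^{\top}} \,\middle|\, D\right]
  + \frac{\partial\ell^{*}}{\partial\beta}\,
    \frac{\partial\ell^{*}}{\partial\beta^{\top}}, \\[4pt]
I(\hat\beta) &=
  E_{W}\!\left[-\frac{\partial^{2}\ell}{\partial\beta\,\partial\beta^{\top}} \,\middle|\, D\right]_{\beta=\hat\beta}
  - E_{W}\!\left[\frac{\partial\ell}{\partial\beta}\,
                 \frac{\partial\ell}{\partial\beta^{\top}} \,\middle|\, D\right]_{\beta=\hat\beta}.
\end{align*}
```

The second line drops the last term because the score of the observed log‐likelihood vanishes at the MLE. Since the frailty density does not involve the regression parameters, derivatives of the complete‐data log‐likelihood with respect to β coincide with those of the conditional log‐likelihood.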
Let
d k1,d k2: number of failures of cause 1 and cause 2 in hospital k, respectively;
d 1,d 2: number of failures of cause 1 and cause 2 in total, respectively;
: number of failures of cause 1 at time in hospital k;
: number of failures of cause 2 at time in hospital k;
: number of failures of cause 1 at time ;
: number of failures of cause 2 at time ;
t kl,l = 1,…,d k1: ordered event times for cause 1 in hospital k;
t km,m = 1,…,d k2: ordered event times for cause 2 in hospital k;
, (l ′ = 1,…,d 1): ordered event times for cause 1;
, (m ′ = 1,…,d 2): ordered event times for cause 2;
;
;
;
;
R k(t) = {i:t k i ≥ t}: risk set at time t for hospital k.
The conditional log‐likelihood given frailties can be expressed as
The term is the product of the gradient vector of the conditional log‐likelihood with itself. The elements of the gradient vector are
The second‐order derivatives to calculate the full information matrix I (full) are
Rueten‐Budde AJ, Putter H, Fiocco M. Investigating hospital heterogeneity with a competing risks frailty model. Statistics in Medicine. 2019;38:269–288. 10.1002/sim.8002
REFERENCES
- 1. Cox DR. Regression models and life‐tables. J R Stat Soc. 1972;34:187‐220. 10.1007/978-1-4612-4380-9_37
- 2. Putter H, Fiocco M, Geskus RB. Tutorial in biostatistics: competing risks and multi‐state models. Statist Med. 2007;26:2389‐2430.
- 3. Yashin AI, Vaupel JW, Iachine IA. Correlated individual frailty: an advantageous approach to survival analysis of bivariate data. Math Popul Stud. 1995;5:145‐159. 10.1080/08898489509525394
- 4. Petersen JH, Andersen PK, Gill RD. Variance components models for survival data. Statistica Neerlandica. 1996;50:193‐211. 10.1111/j.1467-9574.1996.tb01487.x
- 5. Fine JP, Gray RJ. A proportional hazards model for the subdistribution of a competing risk. J Am Stat Assoc. 1999;94:496‐509. 10.2307/2670170
- 6. Katsahian S, Resche‐Rigon M, Chevret S, Porcher R. Analysing multicentre competing risks data with a mixed proportional hazards model for the subdistribution. Statist Med. 2006;25:4267‐4278. 10.1002/sim.2684
- 7. Scheike TH, Sun Y, Zhang MJ, Jensen TK. A semiparametric random effects model for multivariate competing risks data. Biometrika. 2010;97:133‐145. 10.1093/biomet/asp082
- 8. Dixon SN, Darlington GA, Desmond AF. A competing risks model for correlated data based on the subdistribution hazard. Lifetime Data Anal. 2011;17:473‐495. 10.1007/s10985-011-9198-9
- 9. Zhou B, Fine J, Latouche A, Labopin M. Competing risks regression for clustered data. Biostatistics. 2012;13:371‐383. 10.1093/biostatistics/kxr032
- 10. Wienke A, Christensen K, Holm NV, Yashin AI. Heritability of death from respiratory diseases: an analysis of Danish twin survival data using a correlated frailty model. Stud Health Technol Inform. 2000;77:407‐411. 10.3233/978-1-60750-921-9-407
- 11. Wienke A, Holm NV, Skytthe A, Yashin AI. The heritability of mortality due to heart diseases: a correlated frailty model applied to Danish twins. Twin Res Hum Genet. 2001;4:266‐274. 10.1375/1369052012399
- 12. Wienke A, Christensen K, Skytthe A, Yashin AI. Genetic analysis of cause of death in a mixture model of bivariate lifetime data. Stat Model. 2002;2:89‐102. 10.1191/1471082x02st030oa
- 13. Gorfine M, Hsu L. Frailty‐based competing risks model for multivariate survival data. Biometrics. 2011;67:415‐426. 10.1111/j.1541-0420.2010.01470.x
- 14. Liquet B, Timsit J‐F, Rondeau V. Investigating hospital heterogeneity with a multi‐state frailty model: application to nosocomial pneumonia disease in intensive care units. BMC Med Res Methodol. 2012;12:79. 10.1186/1471-2288-12-79
- 15. Rotolo F, Rondeau V, Legrand C. Incorporation of nested frailties into semiparametric multi‐state models. Statist Med. 2016;35:609‐621. 10.1002/sim.6734
- 16. Vaupel JW, Manton KG, Stallard E. The impact of heterogeneity in individual frailty on the dynamics of mortality. Demography. 1979;16:439‐454. 10.2307/2061224
- 17. Clayton DG. A model for association in bivariate life tables and its application in epidemiological studies of familial tendency in chronic disease incidence. Biometrika. 1978;65:141‐151. 10.2307/2335289
- 18. Fiocco M, Putter H, van Houwelingen JC. A new serially correlated gamma‐frailty process for longitudinal count data. Biostatistics. 2009;10:245‐257. 10.1093/biostatistics/kxn031
- 19. Fiocco M, Putter H, van Houwelingen JC. Meta‐analysis of pairs of survival curves under heterogeneity: a Poisson correlated gamma‐frailty approach. Statist Med. 2009;28:3782‐3797. 10.1002/sim.3752
- 20. Dempster AP, Laird NM, Rubin DB. Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc. 1977;39:1‐38.
- 21. R Core Team. R: a language and environment for statistical computing. 2016. https://www.R-project.org/
- 22. Therneau TM. A package for survival analysis in S. 2015. R package version 2.38. https://CRAN.R-project.org/package=survival
- 23. Louis TA. Finding the observed information matrix when using the EM algorithm. J R Stat Soc Ser B Methodol. 1982;44:226‐233.
- 24. Putter H, van Houwelingen HC. Dynamic frailty models based on compound birth‐death processes. Biostatistics. 2015;16:550‐564. 10.1093/biostatistics/kxv002
- 25. Gilbert P, Varadhan R. numDeriv: accurate numerical derivatives. 2016. R package version 2016.8‐1. https://CRAN.R-project.org/package=numDeriv
- 26. Young GA, Smith RL. Essentials of Statistical Inference. Cambridge, UK: Cambridge University Press; 2005.
- 27. Casella G, Berger RL. Statistical Inference. Pacific Grove, CA: Duxbury Press; 2001.
- 28. Thomas N, Longford NT, Rolph JE. Empirical bayes methods for estimating hospital‐specific mortality rates. Statist Med. 1994;13:889‐903. 10.1002/sim.4780130902
- 29. van Houwelingen HC, Brand R, Louis TA. Empirical Bayes methods for monitoring health care quality. 2000. https://www.lumc.nl/sub/3020/att/EmpiricalBayes.pdf
- 30. van Houwelingen HC. The role of empirical bayes methodology as a leading principle in modern medical statistics. Biom J. 2014;56:919‐932. 10.1002/bimj.201400073
- 31. Goldstein H, Spiegelhalter DJ. League tables and their limitations: statistical issues in comparisons of institutional performance. J R Stat Soc A Stat Soc. 1996;159:385‐443. 10.2307/2983325
- 32. van der Hage JA, van de Velde CJH, Julien JP, et al. Improved survival after one course of perioperative chemotherapy in early breast cancer patients: long‐term results from the European organization for research and treatment of cancer (EORTC) Trial 10854. Eur J Cancer. 2001;37:2184‐2193. 10.1016/S0959-8049(01)00294-5
- 33. de Bock GH, Putter H, Bonnema J, van der Hage JA, Bartelink H, van de Velde CJ. The impact of loco‐regional recurrences on metastatic progression in early‐stage breast cancer: a multistate model. Breast Cancer Res Treat. 2009;117:401‐408. 10.1007/s10549-008-0300-2
- 34. Balan TA, Putter H. frailtyEM: an r package for estimating semiparametric shared frailty models. 2017. R package version 0.7.9.
- 35. Putter H, van Houwelingen HC. Frailties in multi‐state models: are they identifiable? do we need them? Stat Methods Med Res. 2015;24:675‐692. 10.1177/0962280211424665
