Non-proportional hazards model with a PVF frailty term: application with a melanoma dataset

Karen C Rosa; Vinicius F Calsavara; Francisco Louzada

doi:10.1080/02664763.2024.2354443

2024 May 14;52(1):1–27. doi: 10.1080/02664763.2024.2354443


PMCID: PMC11727191 PMID: 39811081

Abstract

Survival data analysis often uses the Cox proportional hazards (PH) model. This model is widely applied due to its straightforward interpretation of the hazard ratio under the assumption that the ratio of the hazard rates for two subjects remains constant over time. However, in several randomized clinical trials with long-term survival data comparing two new treatments, it is frequently observed that Kaplan-Meier plots exhibit crossing survival curves. This violates the PH assumption, so the Cox PH model cannot be applied to evaluate the treatment's effect on survival. This paper introduces a novel long-term survival model with non-PH that incorporates a frailty term into the hazard function. This model allows us to examine the effect of prognostic factors on survival and quantify the degree of unobservable heterogeneity. The model parameters are estimated using the maximum likelihood estimation procedure, and we evaluate the performance of the proposed models through simulation studies. Additionally, we demonstrate the applicability of our approach by fitting the models to a real skin cancer dataset.

Keywords: Cure rate models, generalized time-dependent log-log model, maximum likelihood estimation, random effects, skin cancer

Mathematical subject classifications: 62N02, 62N03

1. Introduction

Melanoma, a type of skin cancer, garners considerable attention in oncology due to its profound effect on patient well-being and outcomes. Clinical outcomes hold immense significance for healthcare providers and public policies. Researchers frequently focus on estimating survival rates, including overall survival, cancer-specific survival, and disease-free survival. These rates are impacted by diverse factors, including the type of cancer and patient characteristics, such as gender, Body Mass Index (BMI), education level, age at diagnosis, clinical stage of the disease, type of treatment, and other pertinent information attainable from medical records.

In the literature, studies have reported melanoma-specific survival rates after ten years ranging from 24% to 88% [24]. In 2018, it was estimated that Brazil witnessed approximately 6000 new cases of melanoma, according to the Brazilian National Institute of Cancer (INCA) [16]. Conversely, the International Agency for Research on Cancer (IARC) reported approximately 7000 new cases [20]. Melanoma is responsible for approximately 2000 deaths each year in Brazil. These statistics highlight the significant impact and ongoing challenges associated with melanoma within the country [16,20].

The survival data analysis often involves fitting the conventional Cox proportional hazards (PH) model [17] to assess the impact of patients' baseline characteristics on the hazard rate, assuming that it remains constant over time. However, in certain situations, the effects of covariates may vary over time, rendering the Cox PH model inadequate. Calsavara et al. [9] noted that certain tumor types may initially respond well to chemotherapy or radiotherapy, but genetic mechanisms can develop tolerance to treatment, resulting in a loss of effectiveness over time. Using the Cox PH model to model the hazard rate may lead to inadequate conclusions in such scenarios. As [56] observed, the Cox PH model has been extensively utilized in various problems, even when the PH assumption is violated, which can have implications for the accuracy and reliability of the results.

In practice, the PH assumption is often evaluated using Schoenfeld residuals [27,50,57]. Klein and Moeschberger [36] recommend assessing the assumption by examining log cumulative hazard plots over time for parallelism. The literature offers various graphical methods for detecting violations of this assumption, with [28] introducing eight such methods. If deviations from the assumption are detected, several potential approaches can be considered, such as redefining covariates, utilizing a stratified model to account for non-proportional covariates, fitting a non-PH model, or employing other suitable strategies to address the issue. These techniques aim to ensure accurate and reliable analyses.

Several approaches have been developed to address non-PH. These include the nonparametric accelerated failure time model by [34,51], the hybrid hazard model proposed by [21], and the extensions of hybrid hazard models by [39,40]. Mackenzie [41] introduced the generalized time-dependent logistic (GTDL) model, which was further extended by [44] for the gamma frailty model. More recently, [9] extended the GTDL model using a power variance function (PVF) frailty model. While these models have shown success in scenarios where all subjects are susceptible to the event of interest, real-world situations often involve long-term survivors or individuals who are immune to the event of interest. To address such cases, various authors have extensively studied long-term survival models. One commonly used model is the mixture model introduced by [6,8], which postulates that the population survival function is a mixture of susceptible individuals who have the potential to experience the event of interest, as well as non-susceptible individuals who are destined to never experience the event, often referred to as long-term survivors.

Survival models can be expanded to capture the effects of unobserved covariates that were not included in the model, such as genetic and environmental factors or information overlooked during the planning stage. Hougaard [31] highlighted the benefits of incorporating two sources of heterogeneity (observable and unobservable) in a model. Frailty models enable quantifying the degree of unobservable heterogeneity by introducing a random effect (multiplicative or additive) in the hazard function. The random effect accounts for heterogeneity among individuals and allows for the assessment of covariate effects that were either unobserved or cannot be directly observed. Neglecting an important covariate increases the degree of unobserved heterogeneity, consequently impacting the estimation of model parameters. Incorporating a frailty term can assist in mitigating this concern [9]. Frailty models have been extensively studied by various authors [2,15,32,48,60,62]. Additionally, other authors have explored long-term survival models with a frailty term [1,9,12,13,23,33,49,52,54,64].

This paper proposes an alternative approach to modeling long-term survival data that accommodates non-PH structure. Our methodology employs the generalized time-dependent complement log-log model with a PVF frailty term. The proposed model is motivated by a real medical dataset obtained from a skin cancer study conducted on 6752 patients diagnosed with melanoma in the state of São Paulo, Brazil. The study recruited patients between 2000 and 2014, with follow-up conducted until 2018. The event of interest was defined as death due to cancer, while patients who were lost to follow-up or died from other causes during the follow-up period were considered right-censored observations. Selecting treatment strategies for these patients depends on various factors, such as the type and stage of cancer, the patient's overall health, and individual preferences. In the context of this study, patients in clinical stages I or II generally undergo surgery as a primary treatment modality. Many of these patients are expected to achieve long-term survival beyond ten years of follow-up. On the other hand, patients in advanced clinical stages receive alternative therapies, which may include radiation therapy, chemotherapy, targeted therapy, immunotherapy, or a combination thereof. Their prognosis is generally poorer due to the increased potential for metastatic spread.

The study's primary objective is to assess the surgery effect on survival rates adjusted by age at diagnosis and gender. Additionally, the study aims to quantify the amount of unobserved heterogeneity resulting from the lack of relevant clinical information.

Our paper is structured as follows. In the Background section, we provide an overview of the complementary log-log model, its extended version, and the generalized extended complementary log-log model with a frailty term. The Inference section discusses the inference methods based on the likelihood function for the proposed models. In the Simulation study, we conducted a numerical evaluation to assess the asymptotic properties of the estimators across various sample sizes. In the Application section, we performed a survival data analysis by applying the proposed models to a real dataset on melanoma cancer. Finally, we conclude with final remarks.

Complementary log-log regression model

Consider a non-negative random variable, denoted as T>0, representing the failure time. The complementary log-log hazard (CLL) function, proposed by [43], is given by

h_{0} (t; x_{1}) = \exp \{ - \exp (αt + x_{1}^{⊤} β) \},

(1)

where α denotes the time effect, $x_{1}^{⊤} = (1, x_{1_{1}}, \dots, x_{1_{p}})$ is the time-independent covariate vector, and $β^{⊤}$ = $(β_{0}, β_{1}, \dots, β_{p})$ the vector of regression coefficients.

The associated cumulative hazard function, denoted by $H_{0} (t; x_{1})$ , and survival function, $S_{0} (t; x_{1})$ , can be written as

H_{0} (t; x_{1}) = \int_{0}^{t} \exp \{ - \exp (αy + x_{1}^{⊤} β) \} d y

(2)

and

S_{0} (t; x_{1}) = \exp \left( - \int_{0}^{t} \exp \{ - \exp (αy + x_{1}^{⊤} β) \} d y \right),

(3)

respectively.

The CLL model (1) is considered to be non-PH because the hazard function ratio between two individuals i and j ( $i \neq j$ and $i, j = 1, \dots, n$ ) with distinct observed covariate vectors changes over time. However, in the specific scenario where $α = 0$ , the CLL model converges to the PH model.

As stated by [43], the behavior of the hazard function (1) depends on the value of α. When $α > 0$ , the hazard function decreases over time. Conversely, when $α < 0$ , it increases. A value of $α = 0$ results in a constant hazard function. Due to the hazard function's shape, the model (1) is particularly suitable for modeling phenomena with monotone failure rates.

The survival function (3) is proper for $α \leq 0$ . However, when the time effect α is positive, the CLL model naturally extends to an improper distribution, enabling the estimation of the proportion of non-susceptible individuals to the event of interest. This proportion can be estimated when $α > 0$ by computing the following limit:

p (x_{1}) = \lim_{t \to \infty} \exp \left( - \int_{0}^{t} \exp \{ - \exp (αy + x_{1}^{⊤} β) \} d y \right) \in (0, 1) .

(4)
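Equations (1) to (4) can be evaluated numerically with a few lines of code. The following is a minimal sketch, not the authors' implementation: it uses a fixed composite Simpson rule in place of adaptive quadrature, writes `eta` for the linear predictor $x_{1}^{⊤} β$, and truncates the limit in (4) at a point `t_cap` beyond which the integrand is below $e^{- 50}$. The function names and the truncation rule are assumptions made for illustration.

```python
import math


def cll_hazard(t, alpha, eta):
    """CLL hazard (1): exp{-exp(alpha*t + eta)}, with eta standing for x1' beta."""
    z = alpha * t + eta
    # exp(z) overflows a float near z = 709; the hazard is 0 to machine precision there.
    return 0.0 if z > 700.0 else math.exp(-math.exp(z))


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0


def cll_cum_hazard(t, alpha, eta):
    """Cumulative hazard (2), obtained by numerical integration."""
    return simpson(lambda y: cll_hazard(y, alpha, eta), 0.0, t)


def cll_survival(t, alpha, eta):
    """Survival function (3)."""
    return math.exp(-cll_cum_hazard(t, alpha, eta))


def cll_cure_fraction(alpha, eta):
    """Limit (4) for alpha > 0; returns 0.0 when alpha <= 0 (proper survival)."""
    if alpha <= 0.0:
        return 0.0
    # Beyond t_cap the integrand is below exp(-50), so the tail is negligible.
    t_cap = max((math.log(50.0) - eta) / alpha, 0.0)
    return cll_survival(t_cap, alpha, eta)


if __name__ == "__main__":
    # Illustrative values only: alpha = 2 and eta = -1.
    for t in (0.5, 1.0, 2.0):
        print(t, cll_hazard(t, 2.0, -1.0), cll_survival(t, 2.0, -1.0))
    print("long-term survivor proportion:", cll_cure_fraction(2.0, -1.0))
```

With a positive time effect the survival curve levels off at the value returned by `cll_cure_fraction`, whereas a non-positive time effect yields a proper survival function and the sketch returns zero.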

An advantage of the CLL model over traditional long-term survival models is that it does not require an additional parameter for estimating the long-term survivors $p (x_{1})$ . However, the CLL model has a restrictive constraint on the hazard function imposed by $0 \leq h_{0} (t; x_{1}) \leq 1$ for all t>0. To address this limitation, we have introduced the parameter $λ > 0$ into the hazard function (1). This inclusion allows for a more flexible hazard function that is not limited to the interval $[0, 1]$ , resulting in the evolution of the CLL model into the extended CLL model. The extended CLL hazard function (or extended CLL model) is expressed as follows

h_{0} (t; x_{1}) = λ \exp \{ - \exp (αt + x_{1}^{⊤} β) \},

(5)

where $λ > 0$ is the scale parameter, α is a measure of the time effect, $x_{1}^{⊤} = (1, x_{1_{1}}, \dots, x_{1_{p}})$ is the time-independent covariate vector, and $β^{⊤}$ = $(β_{0}, β_{1}, \dots, β_{p})$ the vector of regression coefficients.

The hazard function (5) exhibits the same shape as the CLL model (1), as previously mentioned. However, it is not constrained to the unit interval. It is worth noting that the conventional CLL model is obtained when $λ = 1$ . Figure 1 illustrates the baseline hazard and survival functions for various parameter values of the extended CLL model, considering only one group factor as independent variable.

Figure 1. — Baseline hazard (left panel) and survival (right panel) functions from the extended CLL model. The parameter values used are as follows: Group1, $α = 2$ , $λ = 4, β_{0} = 0$ , and $β_{1} = - 1$ ; Group2, $α = - 2$ , $λ = 4, β_{0} = 0$ , and $β_{1} = - 1$ ; and Group3, $α = 0$ , $λ = 4, β_{0} = 0$ , and $β_{1} = - 1$ . The subscript numerals indicate the values of the fixed covariates. (Please refer to the online version of this article for the interpretation of the references to color in this figure legend.)

1.1. Extended complementary log-log frailty model

The multiplicative frailty model extends the PH model introduced by [17]. In this model, the hazard function of each unit is influenced by a non-negative unobservable random variable V (or random effect), which acts multiplicatively on the baseline hazard function. Based on the extended CLL model (5), the hazard function of the ith individual, given $v_{i}$ , can be represented as

h_{i} (t; v_{i}, x_{1_{i}}) = v_{i} h_{0} (t; x_{1_{i}}) = v_{i} λ \exp \{ - \exp (α t + x_{1_{i}}^{⊤} β) \} .

(6)

The conditional hazard function, $h_{i} (\cdot; v_{i}, x_{1_{i}})$ , is lower than the baseline hazard function, $h_{0} (\cdot; x_{1_{i}})$ , when $v_{i} < 1$ ; conversely, when $v_{i} > 1$ , $h_{i} (\cdot; v_{i}, x_{1_{i}}) > h_{0} (\cdot; x_{1_{i}})$ . Furthermore, the frailty model (6) reduces to the extended CLL model (5) when $v_{i} = 1$ , and to the CLL model (1) if, in addition, $λ = 1$ .

Given the multiplicative influence of the random effect on the hazard function, appropriate candidates for frailty distributions are typically expected to be non-negative, continuous, and time-independent. In the study by [43], the gamma and inverse Gaussian distributions were considered with mean 1 and a variance of θ. Nevertheless, other options such as log-normal, positive stable, and power variance function (PVF) distributions can also be considered.

Within our scope, we delve into the family of distributions known as the power variance function [30,61]. This family encompasses special cases such as the inverse Gaussian, gamma, and positive stable distributions. The density function associated with the PVF distribution [63] is given by

\begin{aligned} f (v; μ, ψ, γ) & = e^{- ψ (1 - γ) (\frac{v}{μ} - \frac{1}{γ})} \frac{1}{π} \sum_{k = 1}^{\infty} (- 1)^{k + 1} \frac{[ψ (1 - γ)]^{k (1 - γ)} μ^{kγ} Γ (kγ + 1)}{γ^{k} k!} \\ & \quad \times v^{- kγ - 1} \sin (kγπ), \end{aligned}

where $0 < γ \leq 1$ , $ψ > 0$ and $μ > 0$ .

We adopt the standard assumption in frailty models that $E (V) = μ = 1$ , and denote the frailty variance by θ ( $Var (V) = μ^{2} / ψ := θ$ ). The historical concept introduced by [62] drives this choice and allows for a meaningful interpretation of the parameter θ as a measure of the level of unobserved heterogeneity within the population.

To eliminate the unobserved component $(v_{i})$ in the frailty model (6), the random effect term can be integrated out. As a result, the marginal survival function can be expressed as follows

S (t; x_{1}) = E_{V} [S (t; x_{1}, v)] = \int_{0}^{\infty} S (t; x_{1}, v) f_{v} (v) d v = L_{v} [- \log S_{0} (t; x_{1})],

where $f_{v} (\cdot)$ represents the probability density of the corresponding frailty distribution, $S_{0} (\cdot)$ denotes the baseline survival function, and $L_{v} [\cdot]$ represents the Laplace transform of the frailty distribution.

The unconditional survival and hazard functions associated with the PVF frailty model are expressed as follows, respectively,

S (t; x_{1}) = \exp \left\{ \frac{1 - γ}{γθ} \left[ 1 - \left( 1 + \frac{λθ}{1 - γ} \int_{0}^{t} \exp \{ - \exp (αy + x_{1}^{⊤} β) \} d y \right)^{γ} \right] \right\}

(7)

and

h (t; x_{1}) = \frac{λ \exp \{ - \exp (αt + x_{1}^{⊤} β) \}}{\left[ 1 + \frac{λθ}{1 - γ} \int_{0}^{t} \exp \{ - \exp (αy + x_{1}^{⊤} β) \} d y \right]^{1 - γ}} .

(8)
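For readers who want the intermediate step, expressions (7) and (8) can be recovered from the Laplace-transform identity for the marginal survival function. The sketch below assumes the usual closed form of the PVF Laplace transform under mean 1 and variance θ, which is not written out in the text, and uses $H (t; x_{1})$ as shorthand (introduced here) for the cumulative hazard of the extended CLL model (5).

```latex
\begin{aligned}
S(t; x_{1}, v) &= \exp\{-v\,H(t; x_{1})\},
\qquad
H(t; x_{1}) = \lambda \int_{0}^{t} \exp\{-\exp(\alpha y + x_{1}^{\top}\beta)\}\,dy, \\
L_{v}[s] &= E_{V}\left[e^{-sV}\right]
= \exp\left\{\frac{1-\gamma}{\gamma\theta}
\left[1 - \left(1 + \frac{\theta s}{1-\gamma}\right)^{\gamma}\right]\right\}, \\
S(t; x_{1}) &= L_{v}\left[H(t; x_{1})\right]
= \exp\left\{\frac{1-\gamma}{\gamma\theta}
\left[1 - \left(1 + \frac{\lambda\theta}{1-\gamma}
\int_{0}^{t} \exp\{-\exp(\alpha y + x_{1}^{\top}\beta)\}\,dy\right)^{\gamma}\right]\right\}, \\
h(t; x_{1}) &= -\frac{d}{dt}\log S(t; x_{1})
= \frac{dH(t; x_{1})}{dt}\left(1 + \frac{\theta\,H(t; x_{1})}{1-\gamma}\right)^{\gamma-1}.
\end{aligned}
```

Substituting $dH / dt = λ \exp \{ - \exp (αt + x_{1}^{⊤} β) \}$ into the last line gives the hazard in (8).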

From now on, we will use the term CLL PVF frailty model or simply CLL PVF model to refer to the model with the survival function defined in (7).

As $θ \to 0$ in the extended CLL frailty model, the extended CLL model (5) is recovered. When $λ = 1$ and $θ \to 0$ , the model (1) is obtained. Furthermore, the CLL PVF model is flexible, encompassing several other frailty models as special cases. For instance, when $γ \to 0$ , the CLL gamma frailty model is obtained, while $γ = 0.5$ yields the CLL inverse Gaussian frailty model. The CLL positive stable frailty model is a special case of the CLL PVF model, but it requires some asymptotic considerations to establish this relationship. For more comprehensive information, we refer interested readers to the work by [63].

Like the CLL model, the CLL PVF model (8) is also classified as a non-PH model, as the hazard function ratio between distinct subjects is not constant over time. The CLL PVF model accommodates positive values for the time effect ( $α > 0$ ), and the long-term survivor can be expressed as follows

\begin{aligned} p (x_{1}) & = \lim_{t \to \infty} S (t; x_{1}) \\ & = \lim_{t \to \infty} \exp \left\{ \frac{1 - γ}{γθ} \left[ 1 - \left( 1 + H_{0} (t; x_{1}) \frac{λθ}{1 - γ} \right)^{γ} \right] \right\} \in (0, 1), \end{aligned}

(9)

where $H_{0} (\cdot; x_{1})$ is the cumulative hazard function given in (2).

In scenarios where the estimated parameter α is positive, both the CLL model and CLL PVF frailty model offer estimations for the proportion of long-term survivors. These estimates can be obtained from (4) and (9), respectively. However, when the estimated parameter α is negative, both models suggest the absence of long-term survivors. In such cases, the survival functions depicted in (3) and (7) represent proper survival functions.

We enhance the extended CLL model (5) and CLL PVF frailty model (6) by including explanatory variables through the parameter α. This approach is more justifiable as it directly captures the effect of a treatment. As noted by [10], some patients will be cured if a treatment is effective for a specific group, resulting in an estimated $α > 0$ . Conversely, if the treatment is insufficient, the estimate will be $α < 0$ , indicating the absence of long-term survivors. Including explanatory variables through the time effect parameter allows for modeling intersections between survival curves, providing an advantage. Consequently, the extended CLL model (5) and CLL PVF frailty model (6) offer more flexibility compared to the standard approach (1) proposed by [43].

As previously mentioned, explanatory variables are incorporated into the proposed models through the time effect parameter α and in the traditional manner in the hazard function using two sets of covariate vectors, $x_{1} \in R^{p + 1}$ and $x_{2} \in R^{q + 1}$ , where $x^{⊤} = (x_{1}^{⊤}, x_{2}^{⊤}) \in R^{w}$ represents a w-dimensional covariate vector, with w = p + q + 2. Importantly, parameter α can be estimated as either positive or negative. To ensure $α \in R$ , we employ the identity link function,

α (x_{2_{i}}) = x_{2_{i}}^{⊤} α,

(10)

where $x_{2_{i}}^{⊤} = (1, x_{2_{i 1}}, x_{2_{i 2}}, \dots, x_{2_{iq}})$ and $α^{⊤} = (α_{0}, α_{1}, \dots, α_{q})$ are the sets of covariates and their regression coefficients, respectively.

In practice, the covariate vectors can be the same, indicating that $x = x_{1} = x_{2}$ . As suggested by [9], if the researcher has prior knowledge about specific variables that may be associated with long-term survivors, they recommend linking those subset variables to the α parameter. This approach facilitates a more focused analysis and can yield valuable insights into the relationship between these variables and long-term survival.

An advantage of the proposed models is that they do not require the addition of a new parameter for estimating long-term survival. Additionally, these models can accommodate both proper and improper distributions depending on the values of the time effect.

The CLL model (1) does not provide a flexible parametric fit for modeling phenomena with non-monotone failure rates, such as unimodal and bathtub-shaped failure rates commonly observed in biological studies and reliability analysis. In contrast, the model (8) accommodates both monotone and non-monotone failure rates, making it applicable to various problems in survival data analysis. The shapes of the hazard and survival functions obtained from the CLL PVF frailty model with different parameter values are illustrated in Figure 2.

Figure 2. — Baseline hazard (left panel) and survival (right panel) functions from the CLL PVF frailty model. The parameter values used are as follows: Group1, $α_{0} = 1$ , $α_{1} = 2$ , $λ = 1$ , $β_{0} = 0$ , $β_{1} = 0.3$ , $γ = 0.7$ and $θ = 2$ ; Group2, $α_{0} = 1$ , $α_{1} = - 2$ , $λ = 4$ , $β_{0} = 0$ , $β_{1} = 1$ , $γ = 0.7$ and $θ = 1$ ; Group3, $α_{0} = - 1$ , $α_{1} = - 2$ , $λ = 2$ , $β_{0} = 0$ , $β_{1} = 1$ , $γ = 0.7$ and $θ = 0.5$ ; Group4, $α_{0} = - 0.05$ , $α_{1} = 0.1$ , $λ = 4$ , $β_{0} = 0$ , $β_{1} = 1$ , $γ = 0.9$ and $θ = 0.5$ ; The subscript numerals indicate the values of the fixed covariates. (Please refer to the online version of this article for the interpretation of the references to color in this figure legend.)

The proposed model includes intercept terms in two components, $α_{0}$ and $β_{0}$ , which can lead to optimization issues. These problems are likely due to parameter non-identifiability. To address this, we will consider including the intercept term only in the component α, as the parameter λ can fulfill the intercept $β_{0}$ role.

2. Inference

In this section, we discuss inference for the model under the classical approach. Consider a situation in which the time to an event is not completely observed and is subject to right censoring. Let Y and C be non-negative random variables representing the failure and censoring times, respectively. For the ith subject, $i = 1, \dots, n$ , we observe $T_{i} = \min (Y_{i}, C_{i})$ and $δ_{i} = I (Y_{i} \leq C_{i})$ , where $δ_{i} = 1$ indicates a failure and $δ_{i} = 0$ a censored time. The observed dataset is denoted as $D = (t, δ, X)$ , where $t = (t_{1}, \dots, t_{n})^{⊤}$ represents the observed times, $δ = (δ_{1}, \dots, δ_{n})^{⊤}$ the failure indicators, and $X$ is a matrix associated with the observed covariates ( $x_{1}$ and $x_{2}$ ). We assume T is an independent and identically distributed random variable with survival function $S (\cdot; ϑ, x_{1}, x_{2})$ and hazard function $h (\cdot; ϑ, x_{1}, x_{2})$ , where $ϑ$ is a κ-dimensional unknown parameter vector that will be estimated using the observed dataset. Furthermore, we assume that C is independent of T. Assuming a non-informative censoring mechanism, the likelihood function of $ϑ$ is given by

L (ϑ; D) \propto \prod_{i = 1}^{n} h (t_{i}; ϑ, x_{1 i}, x_{2 i})^{δ_{i}} S (t_{i}; ϑ, x_{1 i}, x_{2 i}) .

The log-likelihood function, denoted as $ℓ (ϑ) = \log L (ϑ; D)$ , is expressed as

ℓ (ϑ) \propto \sum_{i = 1}^{n} δ_{i} \log h (t_{i}; ϑ, x_{1 i}, x_{2 i}) + \sum_{i = 1}^{n} \log S (t_{i}; ϑ, x_{1 i}, x_{2 i}) .

Consequently, in the case of the extended CLL regression model, the log-likelihood function of $ϑ = (α, λ, β)^{⊤}$ can be expressed as follows

\begin{aligned} ℓ (ϑ) & = - λ \sum_{i = 1}^{n} \int_{0}^{t_{i}} \exp \{ - \exp (x_{2 i}^{⊤} α y + x_{1 i}^{⊤} β) \} d y \\ & \quad + \log (λ) \sum_{i = 1}^{n} δ_{i} - \sum_{i = 1}^{n} δ_{i} \exp (x_{2 i}^{⊤} α t_{i} + x_{1 i}^{⊤} β) . \end{aligned}

(11)

For the CLL PVF frailty regression model, the log-likelihood function for $ϑ = (α, λ, β, θ, γ)^{⊤}$ can be expressed as

\begin{aligned} ℓ (ϑ) & = - (1 - γ) \sum_{i = 1}^{n} δ_{i} \log \left[ 1 + \frac{θλ}{1 - γ} \int_{0}^{t_{i}} \exp \{ - \exp (x_{2 i}^{⊤} α y + x_{1 i}^{⊤} β) \} d y \right] \\ & \quad + \sum_{i = 1}^{n} \frac{1 - γ}{γθ} \left( 1 - \left[ 1 + \frac{θλ}{1 - γ} \int_{0}^{t_{i}} \exp \{ - \exp (x_{2 i}^{⊤} α y + x_{1 i}^{⊤} β) \} d y \right]^{γ} \right) \\ & \quad + \log (λ) \sum_{i = 1}^{n} δ_{i} - \sum_{i = 1}^{n} δ_{i} \exp (x_{2 i}^{⊤} α t_{i} + x_{1 i}^{⊤} β) . \end{aligned}

(12)
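The log-likelihoods (11) and (12) translate directly into code once the integral is approximated. The following sketch is a plain-Python illustration under stated assumptions, not the R code used in the study: the integral is computed with a fixed Simpson rule, covariate rows are passed as lists, and the toy data in the usage section are hypothetical values chosen only so that the script runs.

```python
import math


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _integral(t, a, eta, n=400):
    """Simpson approximation of the integral of exp{-exp(a*y + eta)} over [0, t]."""
    f = lambda y: math.exp(-math.exp(a * y + eta))
    h = t / n
    total = f(0.0) + f(t)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(k * h)
    return total * h / 3.0


def loglik_extended_cll(alpha, lam, beta, times, delta, X1, X2):
    """Log-likelihood (11); alpha and beta are coefficient lists."""
    ll = 0.0
    for t, d, x1, x2 in zip(times, delta, X1, X2):
        a, eta = _dot(x2, alpha), _dot(x1, beta)
        ll += d * (math.log(lam) - math.exp(a * t + eta))  # delta_i * log hazard
        ll -= lam * _integral(t, a, eta)                    # log survival
    return ll


def loglik_cll_pvf(alpha, lam, beta, theta, gamma, times, delta, X1, X2):
    """Log-likelihood (12); requires theta > 0 and 0 < gamma < 1."""
    ll = 0.0
    for t, d, x1, x2 in zip(times, delta, X1, X2):
        a, eta = _dot(x2, alpha), _dot(x1, beta)
        base = 1.0 + theta * lam * _integral(t, a, eta) / (1.0 - gamma)
        ll += d * (math.log(lam) - math.exp(a * t + eta)
                   - (1.0 - gamma) * math.log(base))
        ll += (1.0 - gamma) / (gamma * theta) * (1.0 - base ** gamma)
    return ll


if __name__ == "__main__":
    # Hypothetical toy data: one binary covariate, intercept only in alpha.
    times = [0.8, 2.5, 4.0, 6.3]
    delta = [1, 0, 1, 0]
    X2 = [[1.0, 0.0], [1.0, 1.0], [1.0, 0.0], [1.0, 1.0]]  # rows (1, x)
    X1 = [[0.0], [1.0], [0.0], [1.0]]                      # rows (x,)
    print(loglik_extended_cll([0.18, -0.13], 1.15, [1.0], times, delta, X1, X2))
    print(loglik_cll_pvf([0.18, -0.13], 1.15, [1.0], 1.5, 0.5, times, delta, X1, X2))
```

Either function can be negated and handed to a numerical optimizer; positivity of λ and θ and the range of γ would then need to be enforced through bounds or a reparameterization.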

No closed-form solution is available for the definite integrals in the likelihood functions (11) and (12), and numerical integration methods are required for approximating them. For the ith subject, the definite integral

\int_{0}^{t_{i}} \exp \{ - \exp (x_{2 i}^{⊤} α y + x_{1 i}^{⊤} β) \} d y = \int_{0}^{t_{i}} f (y) d y

can be approximated using quadrature formulas, where the desired integral over the range $[0, t_{i}]$ is approximated by the sum [46],

\int_{0}^{t_{i}} f (y) d y \approx \sum_{k = 1}^{m} w_{k} f (ξ_{k}),

where $f (\cdot)$ is the function to be integrated, $ξ_{k}$ are the nodes, $w_{k}$ are the weights, and m is the number of quadrature points. In practice, general-purpose integration software is not based on a simple quadrature formula but rather on a much more flexible adaptive quadrature rule [3]. For adaptive quadrature, the quadrature points and weights are determined dynamically during integration, according to the required accuracy. The integrate function in R software [53] provides such adaptive quadrature.
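The difference between a fixed quadrature formula and an adaptive rule can be seen in a short example. The sketch below is an assumption-laden illustration: the fixed rule is composite Simpson written explicitly as nodes $ξ_{k}$ and weights $w_{k}$, and the adaptive rule is recursive adaptive Simpson, which is a different algorithm from the one behind R's integrate function and is used here only because it is compact.

```python
import math


def simpson_rule(a, b, m):
    """Nodes xi_k and weights w_k of composite Simpson with m (odd) points."""
    if m < 3 or m % 2 == 0:
        raise ValueError("m must be an odd integer >= 3")
    h = (b - a) / (m - 1)
    nodes = [a + k * h for k in range(m)]
    weights = [h / 3.0 * (1.0 if k in (0, m - 1) else 4.0 if k % 2 else 2.0)
               for k in range(m)]
    return nodes, weights


def fixed_quadrature(f, a, b, m=201):
    """Approximate the integral by the weighted sum of f at the nodes."""
    nodes, weights = simpson_rule(a, b, m)
    return sum(w * f(x) for x, w in zip(nodes, weights))


def adaptive_simpson(f, a, b, tol=1e-10, max_depth=40):
    """Recursive adaptive Simpson: subdivide only where the error estimate is large."""
    def rule(fa, fm, fb, width):
        return width / 6.0 * (fa + 4.0 * fm + fb)

    def refine(a, b, fa, fm, fb, whole, tol, depth):
        m = 0.5 * (a + b)
        flm, frm = f(0.5 * (a + m)), f(0.5 * (m + b))
        left = rule(fa, flm, fm, m - a)
        right = rule(fm, frm, fb, b - m)
        err = left + right - whole
        if depth <= 0 or abs(err) <= 15.0 * tol:
            return left + right + err / 15.0
        return (refine(a, m, fa, flm, fm, left, tol / 2.0, depth - 1)
                + refine(m, b, fm, frm, fb, right, tol / 2.0, depth - 1))

    fa, fm, fb = f(a), f(0.5 * (a + b)), f(b)
    return refine(a, b, fa, fm, fb, rule(fa, fm, fb, b - a), tol, max_depth)


if __name__ == "__main__":
    # Illustrative integrand: time effect 0.18 and linear predictor 0.3.
    f = lambda y: math.exp(-math.exp(0.18 * y + 0.3))
    print(fixed_quadrature(f, 0.0, 5.0), adaptive_simpson(f, 0.0, 5.0))
```

For a smooth integrand such as this one the two approximations agree to many digits; the adaptive rule matters more when the integrand changes sharply over part of the range.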

We utilized numerical maximization to obtain the maximum likelihood estimates (MLEs) of parameters by optimizing the log-likelihood functions (11) and (12). Various routines for numerical maximization can be found in the literature. In our study, we employed the optim routine available in the R software and specifically applied the L-BFGS-B optimization method. To handle the integrals in the likelihood functions, which do not have analytical solutions, we utilized the integrate function provided in the R software. This function enables the adaptive quadrature of functions of one variable.

To obtain confidence intervals (CIs) and test hypotheses about the model parameters, it is necessary to examine the asymptotic properties of maximum likelihood (ML) estimators. Large sample inference for the parameter vector $ϑ$ can be based on the MLEs and their estimated standard error.

Let $\hat{ϑ}$ be the ML estimator of $ϑ$ . Under certain regularity conditions [37,42,59], the ML estimator $\hat{ϑ}$ is consistent and follows a normal joint asymptotic distribution with a mean $ϑ$ and a variance-covariance matrix $H (\hat{ϑ})$ . For a large sample size, $\hat{ϑ}$ is efficient, that is, $H (\hat{ϑ}) = I^{- 1} (\hat{ϑ})$ , where $I^{- 1} (\hat{ϑ})$ is the inverse expected Fisher information matrix. For the proposed models, the expected Fisher information is difficult to compute, but it can be approximated by the observed information, $Σ (\hat{ϑ})$ , which is given by

Σ (\hat{ϑ}) = - \frac{\partial^{2} ℓ (ϑ; D)}{\partial ϑ \partial ϑ^{⊤}} |_{ϑ = \hat{ϑ}} .

The benefit of employing this observed version is that it can be readily calculated using any computational routine. Thus, we have

\hat{ϑ} \xrightarrow{D} N_{κ} (ϑ, Σ^{- 1} (\hat{ϑ})), \quad \text{as } n \to \infty,

where $\xrightarrow{D}$ denotes convergence in distribution. Therefore, the asymptotic confidence interval for the ith component of parameter vector $ϑ$ , $ϑ_{i}$ , $i = 1, 2, \dots, κ$ with a two-sided $(1 - α^{*})$ confidence level is computed as

{\hat{ϑ}}_{i} \pm z_{1 - α^{*} / 2} \sqrt{h_{(ii)}^{- 1} (\hat{ϑ})},

where ${\hat{ϑ}}_{i}$ is the ML estimator of $ϑ_{i}$ , $h_{(ii)}^{- 1} (\hat{ϑ})$ is the ith diagonal element of the inverse observed Fisher information matrix, and $z_{1 - α^{*} / 2}$ denotes the $(1 - α^{*} / 2)$ th quantile for a standard normal random variable.
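A one-parameter toy case shows how the observed information and the resulting interval are computed. The sketch below is illustrative and rests on assumptions: the second derivative is approximated by a central finite difference, the likelihood is a simple exponential model rather than one of the proposed models, and the event count and total follow-up time are hypothetical.

```python
import math
from statistics import NormalDist


def observed_information_1d(loglik, theta_hat, h=1e-4):
    """Minus the second derivative of loglik at theta_hat (central difference)."""
    return -(loglik(theta_hat + h) - 2.0 * loglik(theta_hat)
             + loglik(theta_hat - h)) / h ** 2


def wald_ci(estimate, variance, level=0.95):
    """Two-sided asymptotic interval: estimate +/- z * sqrt(variance)."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    half_width = z * math.sqrt(variance)
    return estimate - half_width, estimate + half_width


if __name__ == "__main__":
    # Hypothetical data summary: 25 events over 50 units of follow-up.
    events, total_time = 25, 50.0
    loglik = lambda rate: events * math.log(rate) - rate * total_time
    rate_hat = events / total_time                  # closed-form MLE for this toy model
    info = observed_information_1d(loglik, rate_hat)
    print(rate_hat, info, wald_ci(rate_hat, 1.0 / info))
```

With several parameters the same idea applies to the full Hessian matrix: its negative is inverted and the ith diagonal element plays the role of `1.0 / info`.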

The assumption of asymptotic normality for MLEs is applicable under specific regularity conditions, which are challenging to assess in our models. To address this concern, we conducted a simulation study detailed in the following section. Researchers commonly use simulation studies to evaluate the asymptotic properties of MLEs, mainly when analytical investigations are challenging. Notably, many authors have utilized simulations to examine the asymptotic behavior of MLEs in various contexts, as demonstrated in studies by [11,23,25,26,43,47].

2.1. Nonparametric bootstrap

In contrast to conventional statistical inference approaches, which rely on parametric assumptions or large sample approximations for reliable conclusions, nonparametric bootstrap sampling uses computationally intensive techniques to ensure valid inferences across a broad spectrum of data-generating situations, offering a robust alternative. It does not depend on strict assumptions about the underlying data distribution. This method involves creating numerous resamples from the original dataset. This technique builds a distribution of estimates by repeatedly drawing samples and recalculating statistics of interest. This empirical distribution can then be used to compute confidence intervals, estimate standard errors, and perform hypothesis tests.

Numerous bootstrap methodologies have been developed to tackle various statistical challenges. We refer interested readers to comprehensive resources such as [18,19,29].

We focused on determining the confidence interval for the long-term survivors, considering that the expression for $p (x)$ as provided in Equations (4) or (9) lacks an explicit form. The central concept is to treat the empirical distribution of the bootstrap estimates as an approximation to the unknown distribution of the parameter of interest. The process for obtaining the bootstrap percentile $100 (1 - α^{*}) %$ confidence interval for long-term survivors is outlined as follows:

1. Generate B bootstrap samples through resampling with replacement from the original dataset $D = (t, δ, X)$ . Each bootstrap sample, denoted as $D_{(1)}^{*}, D_{(2)}^{*}, \dots, D_{(B)}^{*}$ , contains n observations as in the original sample;
2. From each bootstrap sample, compute the MLE for long-term survivors and denote the resulting estimate as ${\hat{p}}^{*}$ . This yields B values, $\{ {\hat{p}}_{1}^{*}, \dots, {\hat{p}}_{B}^{*} \}$ ;
3. Utilizing $\{ {\hat{p}}_{1}^{*}, \dots, {\hat{p}}_{B}^{*} \}$ , calculate the bootstrap percentile $100 (1 - α^{*}) %$ confidence interval for p as
$[{\hat{Q}}_{α^{*} / 2}^{*}, {\hat{Q}}_{1 - α^{*} / 2}^{*}],$
where ${\hat{Q}}_{α^{*} / 2}^{*}$ and ${\hat{Q}}_{1 - α^{*} / 2}^{*}$ represent the quantiles $(α^{*} / 2)$ and $(1 - α^{*} / 2)$ of the bootstrap distribution of $\{ {\hat{p}}_{1}^{*}, \dots, {\hat{p}}_{B}^{*} \}$ , respectively.
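The percentile procedure above can be sketched generically. In the code below, `estimator` is any function that maps a resampled dataset to an estimate; in the setting of this section it would refit the model by maximum likelihood and evaluate the long-term survivor proportion. To keep the example self-contained, a crude placeholder (the proportion of censored rows) is used instead, and the rows of data are hypothetical. Neither is part of the original analysis.

```python
import random


def percentile(sorted_vals, q):
    """Linear-interpolation quantile of an already sorted list, 0 <= q <= 1."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac


def bootstrap_percentile_ci(data, estimator, B=1000, level=0.95, seed=None):
    """Resample rows with replacement B times and return the percentile interval."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(B):
        resample = [data[rng.randrange(n)] for _ in range(n)]  # same size as the original
        estimates.append(estimator(resample))
    estimates.sort()
    a = 1.0 - level
    return percentile(estimates, a / 2.0), percentile(estimates, 1.0 - a / 2.0)


def censored_fraction(rows):
    """Placeholder estimator (NOT an MLE): share of rows with delta == 0."""
    return sum(1 - d for _, d, _ in rows) / len(rows)


if __name__ == "__main__":
    # Hypothetical rows (t_i, delta_i, x_i), for illustration only.
    rows = [(float(i + 1), 1 if i % 5 < 3 else 0, i % 2) for i in range(40)]
    print(bootstrap_percentile_ci(rows, censored_fraction, B=500, seed=2024))
```

Resampling whole rows keeps each subject's time, failure indicator and covariates together, which matches the description of drawing from $D = (t, δ, X)$.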

It is worth mentioning that the algorithm can be applied to each component of the parameter vector of interest $ϑ$ .

3. Simulation study

We conducted a simulation study to assess the performance of the MLEs for the parameters of the CLL PVF frailty and extended CLL models across different sample sizes. Additionally, we introduced regression parameters into the time effect parameter $α$ , specifically $α (x) = x^{⊤} α$ . As previously mentioned, we did not incorporate the intercept term in the other component ( $β$ ) due to the optimization issues. We considered two simulation studies assuming i) one covariate and ii) three covariates (dichotomous variables) drawn from the Bernoulli distribution, in line with the real-data application. Furthermore, we assumed an exponential distribution with a rate parameter τ to model the censoring times. The value of τ was carefully chosen to control the proportion of right-censored observations. The generation of datasets $(t_{i}, δ_{i}, x_{i})$ from the CLL PVF frailty and extended CLL models involved the following steps.

Determine the desired parameter values $ϑ = (α, λ, β)^{⊤}$ (extended CLL model) or $ϑ = (α, λ, β, θ, γ)^{⊤}$ (CLL PVF frailty model);
For the ith subject, draw $x_{i} \sim Bernoulli (η_{j})$ , where $η_{j}$ is the success probability associated with the jth covariate, and $U_{i}^{*} \sim Uniform (0, 1)$ ;
Determine the proportion of long-term survivors $p_{i} (x_{i})$ according to the desired model;
Draw $C_{i} \sim Exponential (τ)$ , where τ is set to control the proportion of right-censored observations;
If $u_{i}^{*} < p_{i} (x_{i})$ , set $t_{i}^{*} = \infty$ ; otherwise, generate $t_{i}^{*}$ from the extended CLL or CLL PVF frailty model as the root of $S (t_{i}^{*}; ϑ) = 1 - u^{'}$ , where $u^{'}$ is a draw of $U^{'} \sim Uniform (0, 1 - p_{i} (x_{i}))$ ;
Let $t_{i} = \min \{t_{i}^{*}, c_{i}\}$ ;
If $t_{i} = t_{i}^{*}$ , set $δ_{i} = 1$ ; otherwise $δ_{i} = 0$ ;
The data for the ith subject are $\{t_{i}, δ_{i}, x_{i}\}$ , $i = 1, \dots, n$ .
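The steps above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' R code: `simulate_subject` is a hypothetical helper, the root of $S (t^{*}; ϑ) = 1 - u^{'}$ is found by bisection, and the survival function used in the demonstration is a simple stand-in (an exponential mixture cure form), not the extended CLL or CLL PVF frailty survival function.

```python
import math
import random


def simulate_subject(p_cure, surv, censor_rate, rng, t_max=1e6):
    """Generate one (t_i, delta_i) pair.

    p_cure      : proportion of long-term survivors p_i(x_i)
    surv        : improper survival function, decreasing from 1 towards p_cure
    censor_rate : rate tau of the exponential censoring distribution
    """
    c = rng.expovariate(censor_rate)          # C_i ~ Exponential(tau)
    if rng.random() < p_cure:                 # u_i* < p_i: long-term survivor
        t_star = math.inf
    else:
        u = rng.uniform(0.0, 1.0 - p_cure)    # U' ~ Uniform(0, 1 - p_i)
        target = 1.0 - u                      # solve S(t*) = 1 - u'
        lo, hi = 0.0, 1.0
        while surv(hi) > target and hi < t_max:
            hi *= 2.0                         # expand until the root is bracketed
        for _ in range(200):                  # bisection; S is decreasing in t
            mid = 0.5 * (lo + hi)
            if surv(mid) > target:
                lo = mid
            else:
                hi = mid
        t_star = 0.5 * (lo + hi)
    t = min(t_star, c)
    delta = 1 if t == t_star else 0
    return t, delta


# Stand-in survival function (ASSUMPTION, for illustration only).
p = 0.3
stand_in_surv = lambda t: p + (1.0 - p) * math.exp(-t)

rng = random.Random(7)
sample = [simulate_subject(p, stand_in_surv, 0.2, rng) for _ in range(4000)]
event_fraction = sum(d for _, d in sample) / len(sample)
print(f"observed event fraction: {event_fraction:.3f}")
```

Raising `censor_rate` shortens the censoring times and therefore increases the proportion of right-censored observations, which is the role τ plays in the design.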

We conducted the Monte Carlo simulation studies considering five sample sizes: $n = 100, 200, 300, 500,$ and 1000. For the initial simulation study, which involved one covariate, we set up the parameter values based on the real-world data application for the surgery covariate as follows: $ϑ^{⊤} = (0.18, - 0.13, 1.15, 1.0)$ and $ϑ^{⊤} = (0.11, - 0.09, 2.0, 1.1, 1.5, 0)$ for the extended CLL model and CLL PVF frailty model, respectively.

The second simulation study was based on three covariates, with parameter values taken from the real-data application, as shown in Table 7. For the extended CLL model, we fixed $λ = 0.916$ , $α^{⊤} = (0.184, - 0.145, 0.007, 0.019)$ , $β^{⊤} = (1.093, - 0.143, - 0.213)$ ; for the CLL frailty model, we set $λ = 1.65$ , $α^{⊤} = (0.125, - 0.098, 0.001, 0.003)$ , $β^{⊤} = (1.267, - 0.145, - 0.177)$ , and $θ = 1.457$ . The parameter γ in the CLL PVF frailty model was fixed close to zero (CLL gamma frailty model), in agreement with the results obtained from the real-data application.

Table 7.

Maximum likelihood estimates (MLEs), standard errors (SEs), $95 %$ asymptotic confidence intervals (CIs), and AIC values for the extended CLL, CLL PVF frailty, CLL gamma frailty, and CLL IG frailty models fitted to the melanoma dataset with surgery, age at diagnosis, and gender as covariates.

Model	Extended CLL				CLL PVF frailty
			CI(95%)				CI(95%)
Parameter	MLE	SE	Lower	Upper	MLE	SE	Lower	Upper
$α_{0}$	0.184	0.010	0.165	0.203	0.135	0.015	0.105	0.165
$α_{1_{(Yes)}}$	−0.145	0.010	−0.164	−0.125	−0.110	0.015	−0.139	−0.081
$α_{2_{(Older)}}$	0.007	0.007	−0.006	0.020	0.005	0.007	−0.009	0.018
$α_{3_{(Male)}}$	0.019	0.007	0.006	0.032	0.007	0.007	−0.007	0.021
λ	0.916	0.057	0.805	1.027	1.347	0.128	1.095	1.598
$β_{1_{(Yes)}}$	1.093	0.028	1.037	1.148	1.217	0.035	1.148	1.286
$β_{2_{(Older)}}$	−0.143	0.031	−0.204	−0.083	−0.163	0.031	−0.225	−0.102
$β_{3_{(Male)}}$	−0.213	0.029	−0.270	−0.156	−0.203	0.030	−0.262	−0.143
γ	–	–	–	–	0.329	0.126	0.082	0.576
θ	–	–	–	–	1.511	0.266	0.989	2.033
max $ℓ (\cdot)$	$- 6639.61$				$- 6609.39$
AIC	$13,295.22$				$13,238.78$
Model	CLL Gamma frailty				CLL IG frailty
$α_{0}$	0.125	0.016	0.093	0.156	0.153	0.012	0.130	0.176
$α_{1_{(Yes)}}$	−0.098	0.016	−0.129	−0.068	−0.125	0.012	−0.148	−0.102
$α_{2_{(Older)}}$	0.001	0.007	−0.012	0.014	0.003	0.007	−0.010	0.016
$α_{3_{(Male)}}$	0.003	0.007	−0.011	0.016	0.009	0.007	−0.004	0.022
λ	1.655	0.163	1.335	1.975	1.413	0.134	1.150	1.676
$β_{1_{(Yes)}}$	1.267	0.034	1.201	1.334	1.221	0.034	1.156	1.287
$β_{2_{(Older)}}$	−0.145	0.030	−0.203	−0.087	−0.149	0.031	−0.209	−0.089
$β_{3_{(Male)}}$	−0.177	0.028	−0.233	−0.121	−0.194	0.029	−0.251	−0.137
γ	–	–	–	–	–	–	–	–
θ	1.457	0.194	1.078	1.837	1.468	0.302	0.877	2.059
max $ℓ (\cdot)$	$- 6602.22$				$- 6612.81$
AIC	$13,222.44$				$13,243.62$

For each simulated scenario, we computed the average of the MLEs of the parameters, their standard deviations (SDs), the root mean square errors (RMSEs) of the MLEs, and the empirical coverage probabilities (CPs) of the $90 %$ and $95 %$ CIs based on b = 1000 Monte Carlo replicates. The coverage probability is the proportion of replicates in which the estimated asymptotic CI contains the true parameter value. With b = 1000, the empirical CPs are expected to fall within $[0.881, 0.919]$ and $[0.936, 0.964]$ for the $90 %$ and $95 %$ CIs, respectively. These ranges are obtained from a test of proportions that models the simulation as a Binomial(b, r) distribution, where b is the number of Monte Carlo runs (b = 1000) and r is the nominal level (r = 0.90 or r = 0.95).
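As a rough illustration of these summaries, the Python sketch below computes the bias, SD, RMSE, and CP from a set of replicate estimates, together with the range in which an empirical CP is expected to fall. The function names are hypothetical, the Wald form of the asymptotic CI is assumed, and the expected range uses a normal approximation to the binomial, which reproduces the limits quoted above.

```python
from math import sqrt
from statistics import NormalDist, mean, pstdev


def summarize(estimates, std_errors, true_value, level=0.95):
    """Monte Carlo summaries for one parameter across replicates."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    amle = mean(estimates)
    # An asymptotic (Wald) CI covers the truth when |estimate - truth| <= z * SE.
    covered = [abs(e - true_value) <= z * s for e, s in zip(estimates, std_errors)]
    return {
        "AMLE": amle,
        "Bias": amle - true_value,
        "SD": pstdev(estimates),
        "RMSE": sqrt(mean([(e - true_value) ** 2 for e in estimates])),
        "CP": sum(covered) / len(covered),
    }


def expected_cp_range(nominal, n_runs, conf=0.95):
    """Range for the empirical CP under Binomial(n_runs, nominal), normal approximation."""
    z = NormalDist().inv_cdf(0.5 + conf / 2)
    half_width = z * sqrt(nominal * (1 - nominal) / n_runs)
    return nominal - half_width, nominal + half_width


for r in (0.90, 0.95):
    lower, upper = expected_cp_range(r, 1000)
    print(f"nominal {r:.2f}: [{lower:.3f}, {upper:.3f}]")
```

Empirical CPs outside these ranges, such as some of the values for λ and θ at small n in the tables that follow, point to under- or over-coverage beyond what Monte Carlo error alone would explain.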

The simulations, with 1000 Monte Carlo runs per scenario, were performed in R. The parameters were estimated with the L-BFGS-B algorithm, one of the methods available in R's optim function.

3.1. Asymptotic properties: a single covariate

Table 1 presents the results of the simulation study for the CLL gamma frailty and extended CLL models with a single dichotomous covariate. The average estimates moved closer to the true values as the sample size increased, so the bias approaches 0 for all model parameters. The RMSEs and SDs also decreased towards 0 as the sample size increased, and the two were nearly identical for $n \geq 300$ . With increasing sample size, the empirical coverage probabilities (CPs) of all parameters were reasonably close to the nominal levels, regardless of the model.
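The agreement between RMSE and SD at larger sample sizes follows from the usual decomposition of the mean squared error, which holds exactly when the SD is computed with divisor b and approximately otherwise:

$RMSE^{2} = Bias^{2} + SD^{2}.$

As a check against Table 1 (n = 100, parameter θ of the CLL frailty model), $\sqrt{(- 0.170)^{2} + 0.664^{2}} = \sqrt{0.4698} \approx 0.685$ , which matches the reported RMSE. At n = 1000, the same parameter gives $\sqrt{(- 0.020)^{2} + 0.215^{2}} \approx 0.216$ , so the bias contributes almost nothing and the RMSE is essentially the SD.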

Table 1.

Average of maximum likelihood estimates (AMLE), square roots of the mean squared errors (RMSEs), and standard deviations (SDs) of the maximum likelihood estimates, and empirical coverage probabilities (CPs) of $90 %$ and $95 %$ CIs.

		CLL frailty model					Extended CLL model
		$α_{0}$	$α_{1}$	λ	β	θ	$α_{0}$	$α_{1}$	λ	β
n		0.11	$- 0.09$	2	1.1	1.5	$0.18$	$- 0.13$	1.15	1.00
100	AMLE	0.127	−0.104	2.035	1.077	1.330	0.186	−0.131	1.181	0.988
	Bias	0.017	−0.014	0.035	−0.023	−0.170	0.006	−0.001	0.031	−0.012
	RMSE	0.045	0.043	0.731	0.149	0.685	0.033	0.040	0.238	0.142
	SD	0.042	0.041	0.730	0.148	0.664	0.033	0.040	0.236	0.141
	CP(90%)	0.918	0.926	0.872	0.907	0.892	0.922	0.906	0.907	0.903
	CP(95%)	0.971	0.969	0.912	0.948	0.944	0.964	0.961	0.948	0.942
200	AMLE	0.118	−0.096	2.022	1.088	1.422	0.183	−0.130	1.166	0.988
	Bias	0.008	−0.006	0.022	−0.012	−0.078	0.003	0.000	0.016	−0.012
	RMSE	0.028	0.027	0.523	0.106	0.474	0.022	0.026	0.163	0.095
	SD	0.026	0.026	0.522	0.105	0.468	0.022	0.026	0.163	0.095
	CP(90%)	0.923	0.921	0.897	0.891	0.902	0.899	0.911	0.907	0.912
	CP(95%)	0.966	0.960	0.928	0.951	0.953	0.951	0.956	0.953	0.951
300	AMLE	0.116	−0.095	2.017	1.096	1.436	0.181	−0.129	1.159	0.999
	Bias	0.006	−0.005	0.017	−0.004	−0.064	0.001	0.001	0.009	−0.001
	RMSE	0.023	0.023	0.435	0.088	0.408	0.018	0.021	0.135	0.080
	SD	0.022	0.022	0.435	0.088	0.403	0.018	0.021	0.135	0.080
	CP(90%)	0.893	0.892	0.879	0.877	0.891	0.904	0.892	0.895	0.892
	CP(95%)	0.940	0.943	0.921	0.942	0.934	0.958	0.945	0.944	0.944
500	AMLE	0.115	−0.094	1.991	1.093	1.442	0.181	−0.130	1.157	0.997
	Bias	0.005	−0.004	−0.009	−0.007	−0.058	0.001	0.000	0.007	−0.003
	RMSE	0.017	0.017	0.319	0.063	0.308	0.014	0.016	0.105	0.057
	SD	0.017	0.016	0.319	0.063	0.303	0.014	0.016	0.104	0.057
	CP(90%)	0.895	0.904	0.884	0.914	0.888	0.887	0.894	0.875	0.915
	CP(95%)	0.950	0.960	0.938	0.960	0.940	0.945	0.942	0.939	0.957
1000	AMLE	0.112	−0.092	2.004	1.097	1.480	0.180	−0.130	1.154	0.998
	Bias	0.002	−0.002	0.004	−0.003	−0.020	0.000	0.000	0.004	−0.002
	RMSE	0.011	0.011	0.226	0.046	0.216	0.010	0.011	0.071	0.042
	SD	0.011	0.011	0.226	0.046	0.215	0.010	0.011	0.071	0.042
	CP(90%)	0.902	0.902	0.889	0.893	0.888	0.886	0.878	0.892	0.890
	CP(95%)	0.953	0.949	0.947	0.950	0.941	0.940	0.939	0.950	0.950

3.2. Asymptotic properties: three covariates

Tables 2 and 3 show the results of the second simulation study, which assumed three dichotomous covariates. The results were similar to those of the first study: the average estimates converged towards the true values as the sample size increased, the RMSEs and SDs decreased, and the empirical coverage probabilities (CPs) of all parameters closely approximated the nominal levels, regardless of the model.

Table 2.

	Extended CLL model
		$α_{0}$	$α_{1}$	$α_{2}$	$α_{3}$	λ	$β_{1}$	$β_{2}$	$β_{3}$
n		0.184	−0.145	0.007	0.019	0.916	1.093	−0.143	−0.213
100	AMLE	0.194	−0.141	0.011	0.025	0.969	1.101	−0.179	−0.256
	Bias	0.010	0.004	0.004	0.006	0.053	0.008	−0.036	−0.043
	RMSE	0.055	0.061	0.067	0.074	0.283	0.232	0.301	0.387
	SD	0.054	0.061	0.066	0.074	0.278	0.232	0.299	0.385
	CP(90%)	0.897	0.914	0.920	0.923	0.889	0.918	0.909	0.908
	CP(95%)	0.946	0.967	0.968	0.975	0.933	0.960	0.959	0.950
200	AMLE	0.190	−0.144	0.008	0.020	0.939	1.096	−0.158	−0.227
	Bias	0.006	0.001	0.001	0.001	0.023	0.003	−0.015	−0.014
	RMSE	0.033	0.038	0.036	0.037	0.181	0.148	0.184	0.182
	SD	0.033	0.038	0.036	0.037	0.179	0.148	0.183	0.181
	CP(90%)	0.906	0.907	0.919	0.917	0.907	0.910	0.908	0.919
	CP(95%)	0.949	0.954	0.970	0.970	0.953	0.955	0.956	0.956
300	AMLE	0.187	−0.143	0.008	0.020	0.933	1.094	−0.150	−0.223
	Bias	0.003	0.002	0.001	0.001	0.017	0.001	−0.007	−0.010
	RMSE	0.025	0.029	0.028	0.028	0.142	0.112	0.141	0.142
	SD	0.025	0.029	0.028	0.028	0.141	0.112	0.141	0.141
	CP(90%)	0.912	0.906	0.923	0.915	0.916	0.914	0.919	0.917
	CP(95%)	0.956	0.957	0.964	0.961	0.958	0.967	0.960	0.956
500	AMLE	0.186	−0.144	0.007	0.019	0.928	1.094	−0.147	−0.219
	Bias	0.002	0.001	0.000	0.000	0.012	0.001	−0.004	−0.006
	RMSE	0.019	0.022	0.019	0.020	0.108	0.088	0.105	0.106
	SD	0.019	0.022	0.019	0.020	0.107	0.088	0.105	0.106
	CP(90%)	0.897	0.896	0.921	0.921	0.918	0.906	0.905	0.927
	CP(95%)	0.952	0.952	0.959	0.962	0.966	0.954	0.960	0.967
1000	AMLE	0.186	−0.146	0.007	0.019	0.922	1.093	−0.145	−0.214
	Bias	0.002	−0.001	0.000	0.000	0.006	0.000	−0.002	−0.001
	RMSE	0.013	0.016	0.014	0.014	0.075	0.065	0.074	0.080
	SD	0.013	0.016	0.014	0.014	0.075	0.065	0.074	0.080
	CP(90%)	0.903	0.890	0.916	0.898	0.908	0.885	0.922	0.891
	CP(95%)	0.947	0.941	0.957	0.956	0.951	0.936	0.958	0.937

Table 3.

	CLL frailty model
		$α_{0}$	$α_{1}$	$α_{2}$	$α_{3}$	λ	$β_{1}$	$β_{2}$	$β_{3}$	θ
n		0.125	$- 0.098$	0.001	0.003	1.65	$1.267$	$- 0.145$	$- 0.177$	1.457
100	AMLE	0.049	−0.017	0.007	0.023	1.869	1.360	−0.227	−0.245	1.517
	Bias	−0.076	0.081	0.006	0.020	0.219	0.093	−0.082	−0.068	0.060
	RMSE	0.773	0.809	0.346	0.755	0.904	1.157	1.172	0.763	1.053
	SD	0.770	0.805	0.346	0.756	0.879	1.154	1.169	0.761	1.052
	CP(90%)	0.912	0.921	0.930	0.928	0.903	0.920	0.914	0.899	0.899
	CP(95%)	0.952	0.971	0.976	0.974	0.925	0.964	0.953	0.941	0.945
200	AMLE	0.124	−0.090	0.000	0.002	1.699	1.266	−0.156	−0.184	1.371
	Bias	−0.001	0.008	−0.001	−0.001	0.049	−0.001	−0.011	−0.007	−0.086
	RMSE	0.157	0.161	0.045	0.039	0.537	0.176	0.215	0.202	0.673
	SD	0.157	0.161	0.045	0.039	0.535	0.177	0.215	0.202	0.668
	CP(90%)	0.913	0.916	0.895	0.914	0.884	0.873	0.881	0.908	0.870
	CP(95%)	0.950	0.962	0.955	0.966	0.919	0.945	0.938	0.952	0.929
300	AMLE	0.131	−0.100	0.001	0.002	1.682	1.264	−0.147	−0.181	1.375
	Bias	0.006	−0.002	0.000	−0.001	0.032	−0.003	−0.002	−0.004	−0.082
	RMSE	0.064	0.070	0.029	0.029	0.417	0.132	0.156	0.152	0.519
	SD	0.064	0.070	0.029	0.029	0.416	0.133	0.156	0.152	0.513
	CP(90%)	0.901	0.897	0.902	0.902	0.897	0.895	0.899	0.902	0.867
	CP(95%)	0.940	0.948	0.960	0.955	0.928	0.958	0.950	0.958	0.921
500	AMLE	0.133	−0.103	0.000	0.002	1.663	1.265	−0.145	−0.179	1.387
	Bias	0.008	−0.005	−0.001	−0.001	0.013	−0.002	0.000	−0.002	−0.070
	RMSE	0.029	0.028	0.019	0.020	0.295	0.101	0.114	0.117	0.380
	SD	0.028	0.028	0.019	0.020	0.295	0.101	0.114	0.117	0.374
	CP(90%)	0.900	0.890	0.915	0.902	0.896	0.900	0.900	0.915	0.877
	CP(95%)	0.942	0.950	0.967	0.948	0.941	0.947	0.959	0.956	0.925
1000	AMLE	0.129	−0.101	0.002	0.003	1.665	1.268	−0.148	−0.177	1.436
	Bias	0.004	−0.003	0.001	0.000	0.015	0.001	−0.003	0.000	−0.021
	RMSE	0.019	0.018	0.013	0.014	0.215	0.072	0.083	0.084	0.262
	SD	0.018	0.018	0.013	0.014	0.214	0.072	0.083	0.085	0.261
	CP(90%)	0.904	0.904	0.896	0.888	0.907	0.885	0.898	0.902	0.894
	CP(95%)	0.959	0.954	0.956	0.949	0.954	0.939	0.952	0.946	0.946

4. Survival data analysis

In this section, we analyze long-term survival data by fitting the proposed models to a skin cancer dataset, as described in the previous sections. Our study focuses on cancer-related deaths as the event of interest. The objective is to evaluate the effect of surgery, gender, and age at diagnosis on survival. We present the estimated survival curves alongside the Kaplan-Meier curves [35]. To assess the uncertainty of the estimates for the proportion of long-term survivors, we computed the bootstrap percentile $95 %$ confidence intervals based on B = 1000 bootstrap samples as described in Subsection 2.1. We also fitted alternative survival models and compared them with the proposed models.

4.1. Skin cancer data

The long-term survival data are from a retrospective survey focusing on melanoma cancer, a type of skin cancer, conducted in São Paulo, Brazil. The survival data includes individuals diagnosed with melanoma between 2000 and 2014, with follow-up until 2018. These data were provided by the São Paulo Oncocenter Foundation (FOSP) and can be downloaded from their website at https://fosp.saude.sp.gov.br/. The hospital cancer registry (RHC/FOSP) was established in 2000 to document cancer cases treated in the state. Currently, 77 active hospital cancer registries contribute to the dataset every three months. The FOSP, affiliated with the State Health Secretariat, is crucial in developing and implementing healthcare policies in oncology. As highlighted by [4], these policies are vital tools for oncology hospitals, guiding protocol development and improving patient care practices.

A total of 414 patients were excluded from the analysis due to missing values in the observed covariates, resulting in a final sample size of 6752 patients. Among these patients, 5981 ( $88.6 %$ ) underwent surgery, 3417 ( $51.6 %$ ) were female, and 4551 ( $67.4 %$ ) were aged over 50 years. A total of 1914 events ( $28.3 %$ ) were recorded throughout the follow-up period. The study recorded a maximum observation period of approximately 18.54 years, with a corresponding median follow-up period of 5.19 years.

The AJCC staging system is widely recognized worldwide for selecting appropriate treatment strategies in patients with melanoma and other solid tumors. Early clinical stages (I or II) have a more favorable prognosis as they represent localized melanoma and are typically treated with surgery. Most patients in these stages survive beyond ten years. In clinical stage III, surgery is combined with radiotherapy and systemic treatments, with melanoma-specific survival rates varying from $24 %$ to $88 %$ after ten years. Stage IV indicates metastatic disease and carries the worst prognosis. In our study, $72.1 % (4313)$ of surgical patients were in clinical stages I and II, while $68.6 % (529)$ of non-surgical patients were in stages III and IV.

The melanoma dataset was initially analyzed by [9], with a specific focus on evaluating the impact of surgery on survival rates. Our study expanded upon their analysis by incorporating additional valuable information from the registry, including gender and age at diagnosis.

Figure 3 displays the log cumulative baseline hazard plotted against time (follow-up period) for the surgery, age, and gender covariates. According to [36], if the proportionality assumption holds, these curves should be approximately parallel, with constant vertical separation. However, the observed plots indicate non-PH for the surgery covariate. Specifically, proportionality becomes questionable before 5 years, which agrees with the findings reported in [9].
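One common way to build such a diagnostic is to compute the Kaplan-Meier estimate within each covariate group and plot $\log \{- \log \hat{S} (t)\}$ against time. The Python sketch below shows the calculation for one group; it is an assumption about how such curves can be obtained, not a description of how Figure 3 was produced, and the function names and toy data are hypothetical.

```python
import math


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (event time, S(t)) pairs."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect all subjects (events and censorings) tied at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve


def log_cumulative_hazard(curve):
    """log H(t) = log(-log S(t)), skipping points where S(t) is 0 or 1."""
    return [(t, math.log(-math.log(s))) for t, s in curve if 0.0 < s < 1.0]


# Toy data (made up): follow-up times and event indicators (1 = event).
toy_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
print(toy_curve)
print(log_cumulative_hazard(toy_curve))
```

Applying the two functions to each level of a covariate and overlaying the resulting curves gives the plot; under PH the curves differ only by a vertical shift.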

Figure 3 also includes a plot of the standardized Schoenfeld residuals against time for these covariates, as obtained from the fitted Cox PH model. Table 4 presents the results of the PH assumption testing for the Cox PH model fit [27]. The findings provide strong evidence that the surgery variable exhibits a time-varying effect, whereas the age at diagnosis and gender variables demonstrate a constant effect over time.

Table 4.

Test of proportional hazards assumption.

Variable	ρ	$χ^{2}$	p-value
Surgery	0.287	150	<0.0001
Age at diagnosis	0.002	0.008	0.926
Gender	−0.030	1.680	0.195

We fitted the extended CLL and CLL PVF frailty models to the melanoma data to evaluate the effect of the covariates on survival. Furthermore, we incorporated the explanatory variables into the parameter α using an identity link function (10).

Table 5 shows the results of the fitted models. It is worth noting that the estimated value of γ closely approaches zero, indicating the suitability of the CLL gamma frailty model. We also fitted the main special cases, including the CLL inverse Gaussian ( $γ = 0.5$ ) and gamma ( $γ \to 0$ ) frailty models, and the corresponding results are in Table 6. Notably, based on the AIC values, the CLL gamma frailty model emerges as the preferred choice among the four models.

Table 5.

Maximum likelihood estimates (MLEs), standard errors (SEs), $95 %$ asymptotic confidence intervals (CIs), and AIC values for the extended CLL and CLL PVF frailty models fitted to the melanoma dataset, by surgery, age at diagnosis, and gender.

Model	Extended CLL				CLL PVF frailty
			CI(95%)				CI(95%)
Parameter	MLE	SE	Lower	Upper	MLE	SE	Lower	Upper
$α_{0}$	0.182	0.008	0.166	0.199	0.125	0.015	0.095	0.154
$α_{1_{(Yes)}}$	−0.131	0.009	−0.149	−0.113	−0.094	0.014	−0.122	−0.066
λ	1.136	0.061	1.017	1.255	2.064	0.202	1.669	2.460
$β_{(Yes)}$	0.969	0.024	0.922	1.016	1.143	0.032	1.080	1.206
γ	–	–	–	–	0.130	0.083	0.0001	0.294
θ	–	–	–	–	1.547	0.222	1.112	1.983
max $ℓ (\cdot)$	$- 6702.18$				$- 6665.36$
AIC	$13,412.36$				$13,342.72$
$α_{0}$	0.118	0.007	0.105	0.131	0.084	0.010	0.064	0.104
$α_{1_{(Older)}}$	0.024	0.012	0.001	0.048	0.018	0.018	−0.017	0.053
λ	0.221	0.011	0.200	0.243	0.259	0.017	0.225	0.294
$β_{(Older)}$	−0.331	0.070	−0.468	−0.193	−0.550	0.127	−0.798	−0.302
γ	–	–	–	–	0.001	0.014	0.0001	0.028
θ	–	–	–	–	2.201	0.310	1.593	2.809
max $ℓ (\cdot)$	$- 6975.46$				$- 6946.31$
AIC	$13,958.92$				$13,904.62$
$α_{0}$	0.112	0.006	0.101	0.123	0.078	0.009	0.060	0.096
$α_{1_{(Male)}}$	0.050	0.012	0.025	0.074	0.050	0.021	0.010	0.090
λ	0.216	0.009	0.199	0.234	0.253	0.014	0.226	0.280
$β_{(Male)}$	−0.518	0.074	−0.664	−0.372	−0.869	0.158	−1.179	−0.560
γ	–	–	–	–	0.002	0.025	0.0001	0.051
θ	–	–	–	–	2.108	0.293	1.534	2.682
max $ℓ (\cdot)$	$- 6952.25$				$- 6923.03$
AIC	$13,912.50$				$13,858.06$

Table 6.

Model	CLL Gamma frailty				CLL IG frailty
			CI(95%)				CI(95%)
Parameter	MLE	SE	Lower	Upper	MLE	SE	Lower	Upper
$α_{0}$	0.114	0.016	0.083	0.145	0.146	0.011	0.124	0.167
$α_{1_{(Yes)}}$	−0.085	0.015	−0.114	−0.055	−0.112	0.011	−0.133	−0.090
λ	2.013	0.184	1.652	2.373	1.769	0.158	1.458	2.079
$β_{(Yes)}$	1.139	0.031	1.079	1.199	1.094	0.030	1.036	1.152
γ	–	–	–	–	–	–	–	–
θ	1.492	0.197	1.107	1.878	1.565	0.318	0.942	2.188
max $ℓ (\cdot)$	$- 6663.11$				$- 6674.26$
AIC	$13,336.22$				$13,358.52$
$α_{0}$	0.084	0.010	0.064	0.104	0.091	0.009	0.073	0.109
$α_{1_{(Older)}}$	0.018	0.018	−0.018	0.053	0.027	0.016	−0.005	0.058
λ	0.260	0.018	0.225	0.294	0.265	0.018	0.229	0.300
$β_{(Older)}$	−0.551	0.127	−0.800	−0.302	−0.512	0.114	−0.735	−0.288
γ	–	–	–	–	–	–	–	–
θ	2.213	0.311	1.604	2.821	2.662	0.518	1.646	3.678
max $ℓ (\cdot)$	$- 6946.31$				$- 6949.66$
AIC	$13,902.62$				$13,909.32$
$α_{0}$	0.078	0.009	0.061	0.096	0.085	0.008	0.069	0.101
$α_{1_{(Male)}}$	0.048	0.020	0.008	0.088	0.057	0.018	0.022	0.092
λ	0.254	0.014	0.227	0.281	0.260	0.015	0.230	0.289
$β_{(Male)}$	−0.857	0.155	−1.161	−0.553	−0.783	0.132	−1.041	−0.524
γ	–	–	–	–	–	–	–	–
θ	2.111	0.293	1.538	2.685	2.575	0.504	1.587	3.564
max $ℓ (\cdot)$	$- 6923.02$				$- 6926.50$
AIC	$13,856.04$				$13,863.00$

The results indicate a statistically significant effect of surgery on lifetime, regardless of the fitted model, since the $95 %$ CI for β does not include 0. Moreover, the time effects differ between groups, with $α_{0}$ and $α_{1}$ being statistically significant, except for age at diagnosis. Note that the estimated values of $α_{0}$ and $α_{0} + α_{1}$ are positive in all four fitted models. These findings suggest the presence of long-term survivors for each of the three covariates.

The CLL gamma frailty model demonstrates the best fit based on the AIC value among the four models. However, the CLL PVF frailty model can also be considered, given the slight difference in AIC values and the similarity of parameter estimates. Nevertheless, we choose the CLL gamma frailty model as the working model, and all interpretations are based on it.
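The AIC values behind this comparison can be reproduced from the maximized log-likelihoods reported in Tables 5 and 6 with the standard formula $AIC = - 2 ℓ (\hat{ϑ}) + 2 k$ . The short Python check below uses the surgery-only fits; the parameter counts k are read off the rows of the tables (four for the extended CLL model, plus θ for the gamma and IG frailty models, plus θ and γ for the PVF frailty model).

```python
def aic(max_loglik, n_params):
    """Akaike information criterion: -2 * log-likelihood + 2 * number of parameters."""
    return -2.0 * max_loglik + 2.0 * n_params


# (maximized log-likelihood, number of parameters) for the surgery-only fits.
surgery_fits = {
    "Extended CLL": (-6702.18, 4),
    "CLL PVF frailty": (-6665.36, 6),
    "CLL gamma frailty": (-6663.11, 5),
    "CLL IG frailty": (-6674.26, 5),
}

aic_values = {name: round(aic(ll, k), 2) for name, (ll, k) in surgery_fits.items()}
best_model = min(aic_values, key=aic_values.get)

for name, value in aic_values.items():
    print(f"{name}: {value:.2f}")
print("smallest AIC:", best_model)
```

The PVF frailty model attains almost the same log-likelihood as the gamma frailty model but pays for one extra parameter, which is why its AIC is slightly larger.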

According to the results, the estimated value of $\hat{θ}$ is 1.492, indicating a degree of unobserved heterogeneity when only the surgery covariate is in the model. When incorporating age at diagnosis ( $\hat{θ} = 2.213$ ) or gender ( $\hat{θ} = 2.111$ ) into the model, higher levels of estimated unobserved heterogeneity are observed. Furthermore, the estimated time effects are ${\hat{α}}_{0} = 0.114$ ; CI( $95 %) = [0.083; 0.145]$ in the no surgery group and ${\hat{α}}_{0} + {\hat{α}}_{1} = 0.029$ ; CI( $95 %) = [0.022; 0.038]$ in the surgery group. These estimates indicate that the time effect differs between the two groups. Since the time effects are positive, the model suggests the presence of long-term survivors, as supported by the estimated proportions: ${\hat{p}}_{0} = 0.278$ ; bootstrap CI( $95 %) = [0.236; 0.323]$ (no surgery group) and ${\hat{p}}_{1} = 0.616$ ; bootstrap CI( $95 %) = [0.560; 0.629]$ (surgery group).

Regarding age at diagnosis, the estimated time effects are ${\hat{α}}_{0} = 0.084$ ; CI( $95 %) = [0.064; 0.104]$ in the younger patients, and ${\hat{α}}_{0} + {\hat{α}}_{1} = 0.102$ ; CI( $95 %) = [0.070; 0.133]$ in the older patients. As both time effects are positive, the model suggests the presence of long-term survivors. The estimated proportions are ${\hat{p}}_{0} = 0.662$ ; bootstrap CI( $95 %) = [0.629; 0.686]$ (younger patients) and ${\hat{p}}_{1} = 0.556$ ; bootstrap CI( $95 %) = [0.509; 0.575]$ (older patients). For the gender variable, we observe ${\hat{α}}_{0} = 0.078$ ; CI( $95 %) = [0.061; 0.096]$ , ${\hat{p}}_{0} = 0.650$ ; bootstrap CI( $95 %) = [0.607; 0.679]$ in the female group, and ${\hat{α}}_{0} + {\hat{α}}_{1} = 0.126$ ; CI( $95 %) = [0.090; 0.164]$ , ${\hat{p}}_{1} = 0.532$ ; bootstrap CI( $95 %) = [0.496; 0.570]$ in the male group.

In general, the fitted survival curves closely follow the Kaplan-Meier curves. However, including the frailty term in the CLL model allows the quantification of unobserved heterogeneity, which is vital in clinical practice. This is particularly valuable because important covariates, such as Breslow thickness, ulceration, and mitotic rate [7,22], were not observed in the dataset.

Figures 4 and 5 present the estimated survival and hazard functions, respectively, obtained from the extended CLL and CLL gamma frailty models for each observed covariate. The survival function estimates from both models closely resemble the Kaplan-Meier curves, with the CLL gamma frailty model providing a better fit. Additionally, regardless of the model, the hazard function curves are consistently higher for patients who did not undergo surgery, particularly during the first five years of follow-up. The extended CLL and CLL gamma frailty models show decreasing hazard functions over time, with the curves intersecting at specific time points. Such crossing of hazard curves is not possible in the traditional CLL model [43], where the time effect (α) is assumed to be equal across all groups. Hence, incorporating explanatory variables into the α parameter offers the advantage of capturing the effects of individual patient groups and allowing the hazard curves to cross, as illustrated in Figure 5.

Figure 4. — Estimated survival curve obtained via Kaplan-Meier (black line) for melanoma dataset, and estimated survival function according to extended CLL model (left panel) and CLL PVF frailty model (right panel) for surgery (top panel), age at diagnosis (middle panel) and gender (bottom panel). (Please refer to the online version of this article for the interpretation of the references to color in this figure legend.)

Figure 5. — Estimated hazard function according to extended CLL model and CLL gamma frailty model for surgery (left panel), age at diagnosis (middle panel) and gender (right panel). (Please refer to the online version of this article for the interpretation of the references to color in this figure legend.)

We further conducted model fitting by including surgery, age at diagnosis, and gender as covariates, and the results are displayed in Table 7. According to the AIC criterion, the frailty models appear to be the better choice, with the CLL gamma frailty model having the lowest AIC value. Evidence indicates an association between surgery, age at diagnosis, and gender with the failure rate. This is supported by the fact that the $95 %$ CIs of the coefficients $β^{⊤} = (β_{1}, β_{2}, β_{3})$ do not include 0, regardless of the fitted models. Furthermore, the time effect is statistically significant exclusively for the surgery covariate, indicating that the time effect differs between the groups ( $α_{0}$ and $α_{1}$ are statistically significant).

Based on the AIC criterion, maximum likelihood values, and the number of parameters in the model, we select the CLL gamma frailty model as our working model. The estimated value of $\hat{θ}$ is 1.457, indicating a reasonable degree of unobserved heterogeneity in the sample. Furthermore, the estimated time effect is ${\hat{α}}_{0} = 0.125$ ; CI( $95 %) = [0.093; 0.156]$ in the no-surgery group and ${\hat{α}}_{0} + {\hat{α}}_{1} = 0.026$ ; CI( $95 %) = [0.015; 0.038]$ in the surgery group. Since the estimated time effects are positive, the model suggests the presence of long-term survivors. The best survival rates are associated with young female patients undergoing surgery, as illustrated in Figure 6.
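With the identity link $α (x) = x^{⊤} α$ , the time effect of any covariate profile is a sum of the relevant coefficients. The Python sketch below evaluates it for the eight profiles using the rounded CLL gamma frailty estimates from Table 7; the dictionary keys and function name are illustrative choices. Because the inputs are rounded to three decimals, the surgery-only contrast comes out as 0.027 rather than the 0.026 quoted above, which presumably was computed from unrounded estimates.

```python
from itertools import product

# Rounded CLL gamma frailty estimates of the time-effect coefficients (Table 7).
alpha_hat = {"intercept": 0.125, "surgery": -0.098, "older": 0.001, "male": 0.003}


def time_effect(surgery, older, male):
    """alpha(x) = x' alpha for 0/1 covariates, identity link."""
    return (
        alpha_hat["intercept"]
        + alpha_hat["surgery"] * surgery
        + alpha_hat["older"] * older
        + alpha_hat["male"] * male
    )


# Keys are (surgery, older, male) indicator triples.
effects = {combo: time_effect(*combo) for combo in product((0, 1), repeat=3)}

for (surgery, older, male), value in effects.items():
    print(f"surgery={surgery} older={older} male={male}: alpha(x) = {value:.3f}")
```

All eight values are positive, which is the condition under which the model implies a positive proportion of long-term survivors in every subgroup.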

Figure 6. — Estimated survival function according to the CLL gamma frailty model for the surgery, gender, and age at diagnosis. (Please refer to the online version of this article for the interpretation of the references to color in this figure legend.)

The proposed frailty model enables quantification of the amount of unobserved heterogeneity, which is crucial in clinical practice. Using the likelihood ratio test, we tested the appropriateness of the frailty term in the CLL frailty model. The test statistic is $Λ = 2 \{ℓ (\hat{ϑ}) - ℓ ({\hat{ϑ}}_{0})\}$ , where ${\hat{ϑ}}_{0}$ is the maximum likelihood estimator of $ϑ$ under the null hypothesis ( $H_{0}: θ = 0$ ). Under regularity conditions, [14,42,58] demonstrated that the null distribution of Λ is a mixture, in equal proportions, of a chi-squared distribution with one degree of freedom and a point mass at 0. In our analysis, we obtain a test statistic of $Λ = 58.45$ with a corresponding p-value $< 0.0001$ , which provides evidence in favor of including the frailty term in the model.
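The mixture arises because the null value θ = 0 lies on the boundary of the parameter space. For an observed Λ > 0, the p-value under the 50:50 mixture is half the upper tail probability of a chi-squared distribution with one degree of freedom. A minimal Python sketch of this calculation follows; the function names are hypothetical, and the tail probability uses the identity $P (χ_{1}^{2} > x) = erfc (\sqrt{x / 2})$ .

```python
import math


def chi2_1_sf(x):
    """Upper tail probability P(chi-squared with 1 df > x), for x >= 0."""
    return math.erfc(math.sqrt(x / 2.0))


def boundary_lrt_pvalue(lam):
    """p-value under the 0.5 * (point mass at 0) + 0.5 * chi-squared(1) mixture."""
    if lam <= 0.0:
        return 1.0
    return 0.5 * chi2_1_sf(lam)


p_value = boundary_lrt_pvalue(58.45)  # test statistic reported in the text
print(f"p-value = {p_value:.3e}")
```

Using the unadjusted chi-squared reference distribution instead would double the p-value, so the mixture makes the test less conservative; with Λ = 58.45 the conclusion is the same either way.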

Furthermore, given the slight difference in AIC values and the similarity of parameter estimates, the CLL PVF frailty model emerges as a viable alternative to the CLL gamma frailty model.

The inclusion of the scalar λ in both the traditional model (1) and the extended CLL frailty model (8) enhanced the model's flexibility, as demonstrated by the estimates $\hat{λ} \neq 1$ in all models except the CLL model.

Figure 6 displays the estimated survival function based on the CLL gamma frailty model, considering various combinations of covariates, including surgery, gender, and age at diagnosis. The estimated long-term survivors are presented in Table 8. Patients who underwent surgery exhibited better survival rates than those who did not, which is expected since most patients receiving this treatment had an early diagnosis. Furthermore, younger females tend to have better survival rates among patients who underwent surgery than their counterparts who did not. Female patients, in general, demonstrated slightly improved long-term survival compared to male patients when considering the same treatment (surgery). On the other hand, in the absence of surgery, estimated long-term survival rates are poorer, particularly among older male patients. These findings align with observations reported in previous studies by [9,25,26,38,45,54].

Table 8.

Estimated long-term survivors according to the CLL gamma frailty model by surgery, gender, and age at diagnosis.

				Bootstrap CI(95%)
Surgery	Gender	Age at diagnosis	Estimate	Lower	Upper
Yes	Female	Younger	0.741	0.701	0.775
Yes	Female	Older	0.633	0.587	0.675
Yes	Male	Younger	0.619	0.567	0.665
Yes	Male	Older	0.517	0.470	0.545
No	Female	Younger	0.321	0.255	0.378
No	Female	Older	0.283	0.227	0.327
No	Male	Younger	0.278	0.221	0.322
No	Male	Older	0.247	0.201	0.286

Our study's findings are consistent with observations from routine clinical practice and previous research. Existing studies have consistently identified surgery, gender, and age as important independent variables associated with cancer-related mortality. These studies demonstrate that younger patients and women generally have a more favorable prognosis [5,55].

4.2. Comparison between proposed models and alternative models

Using the AIC criterion, we compared the proposed models with alternative long-term survival models, including the GTDL model, which accounts for non-proportional hazards and allows for long-term survivors. Additionally, we applied the widely used mixture cure model proposed by [6,8], considering various distributions for the proper survival (susceptible patients) such as exponential, Weibull, gamma, Gompertz, log-normal, and log-logistic. Despite the PH assumption being violated for the surgery covariate, we also considered the Cox PH model as it is the most commonly applied in practice.

Several models were fitted for each covariate, and the AIC values are shown in Table 9. The CLL gamma frailty model outperformed the competing models both when surgery alone was considered and when surgery was adjusted for age and sex. When only age or only sex was considered, the mixture cure model with a log-logistic distribution had the smallest AIC values. These results suggest that the proposed models are competitive with the alternative models, mainly when the PH assumption is questionable.

Table 9.

AIC value obtained from the fitted models according to the covariates included in the models.

		Covariate
Model	Distribution	Surgery	Age	Sex	Surgery+Age+Sex
Extended CLL	–	13,412.36	13,958.92	13,912.50	13,295.22
	PVF frailty	13,342.72	13,904.62	13,858.06	13,238.78
CLL	Gamma frailty	13,336.22	13,902.62	13,856.04	13,222.44
	IG frailty	13,358.52	13,909.32	13,863.00	13,243.62
Mixture Cure	Exponential	13,587.72	13,907.99	13,867.28	13,468.57
	Weibull	13,589.07	13,908.80	13,868.09	13,469.99
	Gamma	13,589.70	13,909.95	13,869.24	13,470.56
	Gompertz	13,576.30	13,895.05	13,854.49	13,474.63
	Log-normal	13,573.22	13,888.04	13,847.58	13,466.18
	Log-logistic	13,568.74	13,879.42	13,839.02	13,446.04
GTDL¹	–	13,347.91	13,922.99	13,877.49	13,231.76
Cox PH	–	31,608.73	32,066.37	32,021.78	31,474.95

¹ Covariates were also incorporated into the time effect parameter ( $α)$ as done in [9].

5. Concluding remarks

We extended the complementary log-log model proposed by [43] by incorporating a scalar parameter λ into the hazard function. This modification enhances flexibility since the new hazard function is not constrained to the unit interval. Furthermore, we proposed a generalized version of the extended CLL model by including a PVF frailty term. One notable advantage of the proposed models is that they neither require an additional parameter for estimating long-term survival nor assume the presence of long-term survivors. Both situations are handled through the α parameter, which accommodates proper distributions $(α \leq 0)$ and improper distributions $(α > 0)$ . This flexibility makes the models applicable to scenarios with and without long-term survivors. When the estimated value of α is positive, the proportion of long-term survivors can be calculated from the CLL model parameters. Moreover, incorporating a frailty term in the hazard function allows the unobserved heterogeneity to be estimated through the parameter θ. In our simulation study, designed to examine the properties of the MLEs of the model parameters, the bias and RMSEs approached zero as the sample size increased. However, our findings suggest that the CLL frailty model may not be well-suited for small sample sizes ( $n \leq 100$ ). To illustrate the practical relevance and applicability of the proposed models, we applied them to a real melanoma dataset. The analysis revealed statistically significant associations of surgery, age, and gender with the time to event. Notably, the time effect (α) was statistically significant only for the surgery variable, indicating a difference in the time effect between patients who underwent surgery and those who did not. Furthermore, a substantial level of unobserved heterogeneity was estimated ( $\hat{θ} = 1.46$ ), suggesting that the model did not account for genetic and environmental factors or for information overlooked during the planning stage. Overall, the proposed model outperformed the alternative survival models, particularly when surgery was adjusted for age and sex. While further research is required, our findings indicate that these parametric models, by incorporating a frailty term, offer an improved approach for analyzing non-proportional hazards in the presence of long-term survivors.

Acknowledgments

The authors thank the Oncology Foundation of São Paulo for providing the melanoma dataset.

Funding Statement

The research of Francisco Louzada is supported by the Brazilian organizations CNPq (Grant number: 308849/2021-3) and FAPESP (Grant number: 2013/07375-0). The research of Vinicius F. Calsavara is supported in part by the NIH National Center for Advancing Translational Sciences UCLA CTSI UL1 TR001881.

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

1. Aalen O.O., Heterogeneity in survival analysis, Stat. Med. 7 (1988), pp. 1121–1137.
2. Andersen P.K., Borgan Ø., Gill R.D., and Keiding N., Statistical Models Based on Counting Processes, Springer, New York, NY, 1993.
3. de Andrade B.B. and Souza G.S., Likelihood computation in the normal-gamma stochastic frontier model, Comput. Stat. 33 (2018), pp. 967–982.
4. Andrade C.T.d., Magedanz A.M.P.C.B., Escobosa D.M., Tomaz W.M., Santinho C.S., Lopes T.O., and Lombardo V., The importance of a database in the management of healthcare services, Einstein (São Paulo) 10 (2012), pp. 360–365.
5. Balch C.M., Thompson J.F., Gershenwald J.E., Soong Sj., Ding S., McMasters K.M., Coit D.G., Eggermont A.M., Gimotty P.A., Johnson T.M., Kirkwood J.M., Leong S.P., Ross M.I., Byrd D.R., Cochran A.J., Mihm Jr M.C., Morton D.L., Atkins M.B., Flaherty K.T., and Sondak V.K., Age as a predictor of sentinel node metastasis among patients with localized melanoma: An inverse correlation of melanoma mortality and incidence of sentinel node metastasis among young and old patients, Ann. Surg. Oncol. 21 (2014), pp. 1075–1081.
6. Berkson J. and Gage R.P., Survival curve for cancer patients following treatment, J. Am. Stat. Assoc. 47 (1952), pp. 501–515.
7. Bertolli E., Franke V., Calsavara V.F., de Macedo M.P., Pinto C.A.L., van Houdt W.J., Wouters M.W., Neto J.P.D., and van Akkooi A.C., Validation of a nomogram for non-sentinel node positivity in melanoma patients, and its clinical implications: A Brazilian–Dutch study, Ann. Surg. Oncol. 26 (2019), pp. 395–405.
8. Boag J.W., Maximum likelihood estimates of the proportion of patients cured by cancer therapy, J. R. Stat. Soc. B 11 (1949), pp. 15–53.
9. Calsavara V.F., Milani E.A., Bertolli E., and Tomazella V., Long-term frailty modeling using a non-proportional hazards model: Application with a melanoma dataset, Stat. Methods. Med. Res. 29 (2020), pp. 2100–2118.
10. Calsavara V.F., Rodrigues A.S., Rocha R., Louzada F., Tomazella V., Souza A.C., Costa R.A., and Francisco R.P., Zero-adjusted defective regression models for modeling lifetime data, J. Appl. Stat. 46 (2019), pp. 2434–2459.
11. Calsavara V.F., Rodrigues A.S., Rocha R., Tomazella V., and Louzada F., Defective regression models for cure rate modeling with interval-censored data, Biom. J. 61 (2019), pp. 841–859.
12. Calsavara V.F., Rodrigues A.S., Tomazella V.L.D., and de Castro M., Frailty models power variance function with cure fraction and latent risk factors negative binomial, Commun. Stat.-Theory Methods 46 (2017), pp. 9763–9776.
13. Calsavara V.F., Tomazella V.L.D., and Fogo J.C., The effect of frailty term in the standard mixture model, Chil. J. Stat. 4 (2013), pp. 95–109.
14. Claeskens G., Nguti R., and Janssen P., One-sided tests in shared frailty models, Test 17 (2008), pp. 69–82.
15. Clayton D., A model for association in bivariate life tables and its application in epidemiological studies of familial tendency in chronic disease incidence, Biometrika 65 (1978), pp. 141–151.
16. Coordenação de Prevenção e Vigilância, Instituto Nacional de Câncer José Alencar Gomes da Silva. Estimativa 2018: Incidência de Câncer no Brasil. Coordenação de Prevenção e Vigilância – Rio de Janeiro. (2017). Available at http://www1.inca.gov.br/estimativa/2018/.
17. Cox D.R., Regression models and life-tables, J. R. Stat. Soc. B 34 (1972), pp. 187–220.
18. Davison A.C. and Hinkley D.V., Bootstrap Methods and Their Application (Cambridge Series in Statistical and Probabilistic Mathematics), Cambridge University Press, Cambridge, 1997. doi:10.1017/CBO9780511802843.
19. Efron B. and Tibshirani R.J., An Introduction to the Bootstrap, 1st ed., Chapman and Hall/CRC, New York, NY, 1994. doi:10.1201/9780429246593.
20. Ferlay J., Ervik M., Lam F., Laversanne M., Colombet M., Mery L., Piñeros M., Znaor A., Soerjomataram I., and Bray F., Global Cancer Observatory: Cancer Today, International Agency for Research on Cancer, Lyon, 2024. Available at https://gco.iarc.who.int/today (accessed 01 February 2019).
21. Etezadi-Amoli J. and Ciampi A., Extended hazard regression for censored survival data with covariates: A spline approximation for the baseline hazard function, Biometrics 43 (1987), pp. 181–192.
22. Fonseca I.B., Lindote M.V.N., Monteiro M.R., Doria Filho E., Pinto C.A.L., Jafelicci A.S., de Melo Lôbo M., Calsavara V.F., Bertolli E., and Neto J.P.D., Sentinel node status is the most important prognostic information for clinical stage IIB and IIC melanoma patients, Ann. Surg. Oncol. 27 (2020), pp. 4133–4140.
23. Gazon A.B., Milani E.A., Mota A.L., Louzada F., Tomazella V.L., and Calsavara V.F., Nonproportional hazards model with a frailty term for modeling subgroups with evidence of long-term survivors: Application to a lung cancer dataset, Biom. J. 64 (2022), pp. 105–130.
24. Gershenwald J.E., Scolyer R.A., Hess K.R., Sondak V.K., Long G.V., Ross M.I., Lazar A.J., Faries M.B., Kirkwood J.M., McArthur G.A., Haydu L.E., Eggermont A.M.M., Flaherty K.T., Balch C.M., and Thompson J.F., Melanoma staging: Evidence-based changes in the American Joint Committee on Cancer eighth edition cancer staging manual, CA: A Cancer J. Clinic. 67 (2017), pp. 472–492.
25. Gómez Y.M., Gallardo D.I., Bourguignon M., Bertolli E., and Calsavara V.F., A general class of promotion time cure rate models with a new biological interpretation, Lifetime. Data. Anal. 29 (2023), pp. 66–86.
26. Gómez Y.M., Gallardo D.I., Leão J., and Calsavara V.F., On a new piecewise regression model with cure rate: Diagnostics and application to medical data, Stat. Med. 40 (2021), pp. 6723–6742.
27. Grambsch P.M. and Therneau T.M., Proportional hazards tests and diagnostics based on weighted residuals, Biometrika 81 (1994), pp. 515–526.
28. Hess K.R., Graphical methods for assessing violations of the proportional hazards assumption in Cox regression, Stat. Med. 14 (1995), pp. 1707–1723.
29. Hesterberg T.C., What teachers should know about the bootstrap: Resampling in the undergraduate statistics curriculum, Am. Stat. 69 (2015), pp. 371–386.
30. Hougaard P., Survival models for heterogeneous populations derived from stable distributions, Biometrika 73 (1986), pp. 387–396.
31. Hougaard P., Modelling heterogeneity in survival data, J. Appl. Probab. 28 (1991), pp. 695–701.
32. Hougaard P., Frailty models for survival data, Lifetime. Data. Anal. 1 (1995), pp. 255–273.
33. Hougaard P., Myglegaard P., and Borch-Johnsen K., Heterogeneity models of disease susceptibility, with application to diabetic nephropathy, Biometrics 50 (1994), pp. 1178–1188.
34. Kalbfleisch J.D. and Prentice R.L., The Statistical Analysis of Failure Time Data, John Wiley & Sons, Hoboken, NJ, 2011.
35. Kaplan E.L. and Meier P., Nonparametric estimation from incomplete observations, J. Am. Stat. Assoc. 53 (1958), pp. 457–481.
36. Klein J.P. and Moeschberger M.L., Survival Analysis: Statistical Methods for Censored and Truncated Data, Springer Verlag, New York, 2003.
37. Lawless J.F., Statistical Models and Methods for Lifetime Data, John Wiley & Sons, Hoboken, NJ, 2011.
38. Leão J., Bourguignon M., Saulo H., Santos-Neto M., and Calsavara V., The negative binomial beta prime regression model with cure rate: Application with a melanoma dataset, J. Stat. Theory. Pract. 15 (2021), p. 63.
39. Louzada-Neto F., Extended hazard regression model for reliability and survival analysis, Lifetime. Data. Anal. 3 (1997), pp. 367–381.
40. Louzada-Neto F., Polyhazard models for lifetime data, Biometrics 55 (1999), pp. 1281–1285.
41. Mackenzie G., Regression models for survival data: The generalized time-dependent logistic family, Statistician. 45 (1996), pp. 21–34.
42. Maller R. and Zhou X., Survival Analysis with Long-Term Survivors, John Wiley & Sons, Chichester, 1996.
43. Milani E.A., Diniz C.A.R., and Tomazella V.L., Generalized time-dependent complement log-log model, Chil. J. Stat. 5 (2014), pp. 29–44.
44. Milani E.A., Tomazella V.L., Dias T.C., and Louzada F., The generalized time-dependent logistic frailty model: An application to a population-based prospective study of incident cases of lung cancer diagnosed in Northern Ireland, Braz. J. Probab. Stat. 29 (2015), pp. 132–144.
45. Molina K.C., Calsavara V.F., Tomazella V., and Milani E.A., Survival models induced by zero-modified power series discrete frailty: Application with a melanoma data set, Stat. Methods. Med. Res. 30 (2021), pp. 1874–1889.
46. Monahan J., Numerical Methods of Statistics. In Cambridge Series in Statistical and Probabilistic Mathematics, 2nd ed., Cambridge University Press, Cambridge, 2011. doi:10.1017/CBO9780511977176.
47. Mota A., Milani E.A., Calsavara V.F., Tomazella V.L., Leão J., Ramos P.L., Ferreira P.H., and Louzada F., Weighted Lindley frailty model: Estimation and application to lung cancer data, Lifetime. Data. Anal. 27 (2021), pp. 561–587.
48. Oakes D., A model for association in bivariate survival data, J. R. Stat. Soc. B 44 (1982), pp. 414–422.
49. Peng Y., Taylor J.M.G., and Yu B., A marginal regression model for multivariate failure time data with a surviving fraction, Lifetime. Data. Anal. 13 (2007), pp. 351–369.
50. Pettitt A. and Bin Daud I., Investigating time dependence in Cox's proportional hazards model, Appl. Stat. 39 (1990), pp. 313–329.
51. Prentice R.L., Linear rank tests with right censored data, Biometrika 65 (1978), pp. 167–179.
52. Price D.L. and Manatunga A.K., Modelling survival data with a cured fraction using frailty models, Stat. Med. 20 (2001), pp. 1515–1527.
53. R Core Team, R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2020.
54. Rodrigues A.S., Calsavara V.F., Bertolli E., Peres S.V., and Tomazella V.L., Bayesian long-term survival model including a frailty term: Application to melanoma data, Chil. J. Stat. 12 (2021), pp. 53–69.
55. Sabel M.S., Griffith K., Sondak V.K., Lowe L., Schwartz J.L., Cimmino V.M., Chang A.E., Rees R.S., Bradford C.R., and Johnson T.M., Predictors of nonsentinel lymph node positivity in patients with a positive sentinel node for melanoma, J. Am. Coll. Surg. 201 (2005), pp. 37–47.
56. Schemper M., Cox analysis of survival data with non-proportional hazard functions, Statistician. 41 (1992), pp. 455–465.
57. Schoenfeld D., Partial residuals for the proportional hazards regression model, Biometrika 69 (1982), pp. 239–241.
58. Self S.G. and Liang K.Y., Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions, J. Am. Stat. Assoc. 82 (1987), pp. 605–610.
59. Sen P.K., Singer J.M., and de Lima A.C.P., From Finite Sample to Asymptotic Methods in Statistics, Cambridge University Press, New York, NY, 2010.
60. Sinha D. and Dey D., Semiparametric Bayesian analysis of survival data, J. Am. Stat. Assoc. 92 (1997), pp. 1195–1212.
61. Tweedie M.C.K., An index which distinguishes between some important exponential families. Statistics: Applications and New Directions. Proceedings of the Indian Statistical Institute Golden Jubilee International Conference, J.K. Ghosh and J. Roy, eds., Indian Statistical Institute, Calcutta, 1984, pp. 579–604.
62. Vaupel J.W., Manton K.G., and Stallard E., The impact of heterogeneity in individual frailty on the dynamics of mortality, Demography 16 (1979), pp. 439–454.
63. Wienke A., Frailty Models in Survival Analysis, Chapman & Hall/CRC, Boca Raton, 2011.
64. Yu B. and Peng Y., Mixture cure models for multivariate survival data, Comput. Stat. Data. Anal. 52 (2008), pp. 1524–1532.
