NIHPA Author Manuscripts
Author manuscript; available in PMC: 2026 Apr 18.
Published before final editing as: Commun Stat Simul Comput. 2026 Apr 8:10.1080/03610918.2026.2637684. doi: 10.1080/03610918.2026.2637684

A New Estimation Algorithm for Destructive Cure Model: Illustration with Exponentially Weighted Poisson Competing Risks

Suvra Pal 1,2,*, Souvik Roy 1
PMCID: PMC13089356  NIHMSID: NIHMS2160053  PMID: 42005470

Abstract

We propose an improved estimation method for the destructive cure rate model by introducing a generic maximum likelihood algorithm, the sequential quadratic Hamiltonian (SQH) scheme, which employs a gradient-free optimization approach. The SQH algorithm is applied to the destructive cure model with exponentially weighted Poisson competing risks, and its performance is evaluated through a comprehensive simulation study. Specifically, we compare the model-fitting accuracy of SQH with the recently developed conjugate gradient line search (CGLS) algorithm. Given that the CGLS method has been shown to outperform the widely used expectation-maximization algorithm and other optimization routines available in R (e.g., optim, nlm, and Rcgmin), our focus is on assessing whether the SQH algorithm can offer further improvements. Simulation results show that the SQH algorithm yields parameter estimates with consistently lower bias and root mean square error, resulting in more accurate and precise cure rate estimation. Furthermore, due to its gradient-free nature, the SQH algorithm requires less CPU time than CGLS. These advantages position the SQH algorithm as a preferred estimation method over CGLS for the destructive cure rate model. To demonstrate its practical utility, we apply the SQH algorithm to a well-known melanoma dataset and present the analysis results.

Keywords: Cure rate, Long-term survivors, Destructive cure model, Competing risks, Optimization

1. Introduction

Advancements in the treatment of diseases such as cancer and heart disease have resulted in a significant number of patients showing no signs of recurrence by the end of a prolonged follow-up period. In the medical literature, these individuals are referred to as recurrence-free survivors. Among them, some may never experience a recurrence, even after extended observation, as the disease may become biologically inactive or undetectable. These patients are considered long-term survivors or are regarded as “cured”. However, estimating the probability of cure (or cure rate) from survival data poses a major challenge, as it is difficult to determine which recurrence-free survivors are truly cured. A patient might remain recurrence-free through the study period but still be at risk for recurrence shortly thereafter. This uncertainty complicates the identification of long-term survivors based solely on follow-up data. Despite these challenges, accurately estimating treatment-specific cure rates is crucial. It provides insights into long-term survival trends for a given disease and serves as an important metric for assessing a treatment’s effectiveness. Such estimates are especially valuable when comparing new therapies to standard treatments and considering their potential for clinical adoption.

The literature on cure rate models is extensive, and the field has rapidly emerged as a key area of modern statistical research. Foundational work dates back to Boag (1949), with further development by Berkson & Gage (1952), who introduced the widely known mixture cure rate model. This model expresses the population survival function of the time-to-event variable Y as a combination of cured and susceptible sub-populations:

S_pop(y) = p_0 + (1 − p_0) S_s(y),  (1)

where p_0 denotes the proportion of cured individuals, and S_s(y) is the survival function for the susceptible (non-cured) individuals (Peng & Yu, 2021). Notably, S_pop(y) is not a proper survival function, since lim_{y→∞} S_pop(y) = p_0 > 0. A limitation of the mixture model in eqn (1) is its inability to account for competing risks, where multiple latent factors may independently cause the event of interest. To address this, Chen et al. (1999) proposed the promotion time cure rate model, which assumes the number of latent competing risks follows a Poisson distribution. The population survival function in this framework is:

S_pop(y) = e^{−η(1 − S(y))},  (2)

where η is the mean number of risk factors, and S(y) is the common survival function for the time taken by each risk factor to trigger the event. The corresponding cure rate is e^{−η}.

To unify the mixture and promotion time models, Rodrigues et al. (2009) introduced the Conway–Maxwell–Poisson (COM-Poisson) cure rate model, which assumes the number of competing risks follows a COM-Poisson distribution; see also Treszoks & Pal (2025). This model accommodates both over-dispersion and under-dispersion relative to the Poisson distribution. The survival function is given by:

S_pop(y) = Z(η S(y), ϕ) / Z(η, ϕ),  (3)

where Z(a, ϕ) = Σ_{j=0}^∞ a^j/(j!)^ϕ is the normalizing function of the COM-Poisson distribution, η is related to the average number of risk factors, and ϕ is the dispersion parameter. In this model, the cure rate is 1/Z(η, ϕ). Notably, the model reduces to the mixture cure model as ϕ → ∞, with p_0 = 1/(1 + η), and to the promotion time cure model when ϕ = 1. For additional developments using the COM-Poisson distribution in cure rate modeling, see Balakrishnan & Pal (2012, 2013, 2014, 2015a,b, 2016) and Pal & Balakrishnan (2017b). Building on this line of work, Rodrigues et al. (2011) offered a biologically meaningful interpretation of event occurrence mechanisms and introduced the destructive cure rate model, which accounts for the possible elimination (or destruction) of risk factors following initial treatment; see also Cooner et al. (2007).
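As a quick numerical check of these limiting cases, the normalizing constant Z(η, ϕ) and the cure rate 1/Z(η, ϕ) can be computed by truncating the infinite series. The sketch below is our illustration (the function names and the 200-term truncation are our choices; the series is assumed convergent, e.g., ϕ > 0):

```python
import math

def com_poisson_Z(eta, phi, terms=200):
    """Truncated normalizing constant Z(eta, phi) = sum_{j>=0} eta^j / (j!)^phi."""
    total = 0.0
    for j in range(terms):
        # Work in log space so that large j does not overflow the factorial.
        log_term = j * math.log(eta) - phi * math.lgamma(j + 1)
        total += math.exp(log_term)
    return total

def com_poisson_cure_rate(eta, phi, terms=200):
    """Cure rate 1 / Z(eta, phi) under the COM-Poisson cure rate model (3)."""
    return 1.0 / com_poisson_Z(eta, phi, terms)
```

For example, with η = 2 this recovers the promotion time cure rate e^{−2} at ϕ = 1, and a large dispersion value such as ϕ = 50 recovers the mixture-model cure rate 1/(1 + η) = 1/3.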

We consider a competing risks framework that accounts for the potential elimination of risk factors following initial treatment. Within this context, Rodrigues et al. (2011) proposed the destructive cure rate model, assuming a weighted Poisson distribution for the initial number of competing risks and estimating model parameters via direct maximization of the observed likelihood function. Subsequent work has extended this model in various directions. Interested readers can refer to Gallardo et al. (2016), Pal & Balakrishnan (2016, 2017a,c, 2018); Pal et al. (2018), Majakwara & Pal (2019), and Treszoks & Pal (2023, 2024), among others. In particular, Pal & Balakrishnan (2016, 2017a,c, 2018), and Pal et al. (2018) developed likelihood-based inference procedures by incorporating alternative weight functions into the weighted Poisson distribution of the initial number of risks. These studies employed the expectation-maximization (EM) algorithm for maximum likelihood estimation (MLE) of model parameters. However, they encountered challenges related to flat likelihood surfaces with respect to certain parameters, which hindered simultaneous optimization. To mitigate this, a profile likelihood approach was embedded within the EM algorithm. Although the profile likelihood method improved performance, it had notable drawbacks. The root mean square error (RMSE) of regression parameters related to the cure rate remained high, potentially compromising the accuracy of overall survival inference. Additionally, the method required multiple runs of the EM algorithm, leading to significant computational overhead. To address these limitations, Pal & Roy (2021, 2022, 2023) introduced the conjugate gradient line search (CGLS) algorithm, enabling simultaneous optimization of all parameters without relying on the EM framework. 
The CGLS method produced more accurate estimates, particularly for cure rate parameters, and was computationally more efficient than both EM and other optimization routines available in R, including conjugate gradient variants. Overall, CGLS demonstrated superior performance.

In this paper, we propose a new estimation method for the destructive cure rate model, called the sequential quadratic Hamiltonian (SQH) algorithm. This approach is grounded in Pontryagin’s maximum principle, which offers a pointwise and locally structured formulation. As a result, the SQH algorithm updates unknown parameters locally and does not require gradient computation, contributing to its robustness and computational efficiency. The SQH method has been successfully applied in diverse domains, including the Liouville mass transport problem (Roy & Borzì, 2017), medical imaging (Dey et al., 2024; Roy, 2021; Roy et al., 2023), and parabolic optimal control problems (Breitenbach & Borzì, 2019), where it has proven to be a powerful and reliable optimization tool. While the CGLS method has already been shown to outperform EM and other optimization techniques, the primary aim of this work is to demonstrate that the SQH algorithm further improves upon the performance of CGLS. We highlight the advantages of SQH in terms of accuracy and computational speed and aim to position it as a valuable tool for parameter estimation - not only within the context of cure rate models but also in broader applications.

The remainder of the paper is organized as follows. Section 2 provides a brief overview of the destructive cure rate model. In Section 3, we present a detailed account of the proposed SQH algorithm. Section 4 reports the findings of an extensive simulation study, where the initial number of competing risks is assumed to follow an exponentially weighted Poisson distribution. In this section, we compare the performance of the SQH algorithm with the CGLS method proposed by Pal & Roy (2021), emphasizing the advantages of our approach. Section 5 illustrates the application of the SQH algorithm to a well-known melanoma dataset. Finally, Section 6 concludes the paper with a summary of key findings and a discussion of potential future research directions.

2. Cure rate model with destruction of competing risks

Consider a practical scenario in which an unobserved number of risk factors (or competing risks) contribute to the occurrence of an event of interest, such as cancer-related death or disease recurrence. For instance, the development of a malignant tumor may involve multiple cancerous cells, yet the exact number of such cells remains unobservable. Similarly, in the context of child mortality, various adverse environmental factors may act in combination to form a set of competing risks, but the precise number of fatal risks is not directly observable. In such cases, we refer to malignant cells or fatal environmental factors as competing risks.

Let M denote a random variable representing the unobserved number of initial competing risks. Since M is not directly observable, we assume it follows a discrete distribution with mass function p_m = P[M = m]. Suppose an intervention, such as medical treatment or environmental action, results in the reduction or elimination of some of these initial risks, with each initial risk factor remaining active with probability p, leaving D(M) active risk factors that can potentially cause the event of interest. For example, following chemotherapy or radiation, a subset of malignant cells may be eradicated. Likewise, public health interventions may lower the number of fatal risks in child mortality scenarios. To model this destruction process, we associate a Bernoulli random variable X_j with each initial risk factor, such that P(X_j = 1) = p, where p represents the activation probability. We then define the number of remaining, active risk factors as:

D = X_1 + X_2 + ⋯ + X_M if M > 0, and D = 0 if M = 0.  (4)

Since the conditional distribution of D given M = m is Binomial(m, p), and using the distribution of M, we can derive the marginal distribution of D, as shown in Rodrigues et al. (2011). Given D = d, let W_j, for j = 1, 2, …, d, denote the progression times, i.e., the time taken by each active risk factor to cause the event. As in Rodrigues et al. (2011), we assume that the W_j are independent and identically distributed, independent of D, and follow a distribution function F(·) = 1 − S(·), where S(·) is the associated survival function. Although individual progression times are unobserved, we do observe the time taken by the first active risk factor to trigger the event, which is referred to as the lifetime. In a competing risks framework, the lifetime Y is defined as:

Y = min{W_1, …, W_D} if D > 0, and Y = ∞ if D = 0.  (5)

An infinite lifetime, corresponding to D=0, reflects a proportion p0 of the population that is not susceptible to the event, i.e., individuals who are cured. Estimating this cure rate p0 is a central objective in cure rate modeling.
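To make the two-stage mechanism in (4)-(5) concrete, the following sketch simulates one subject's lifetime from a generic initial-risk distribution. The function names, the inverse-CDF sampling over the first 1000 support points, and the callable arguments are our illustrative choices, not part of the model specification:

```python
import random

def simulate_lifetime(p_mass, p_activate, draw_W, rng=None):
    """One draw from the destructive competing-risks mechanism (4)-(5).

    p_mass:     function m -> P[M = m] (sampled here by inverse CDF)
    p_activate: probability p that an initial risk stays active
    draw_W:     function returning one progression time W_j
    Returns (Y, cured): Y is infinite when D = 0, i.e. the subject is cured.
    """
    rng = rng or random.Random(0)
    # Draw M by inverse-CDF sampling over the first 1000 support points.
    u, cdf = rng.random(), 0.0
    for m in range(1000):
        cdf += p_mass(m)
        if u <= cdf:
            break
    # Thin the M initial risks with independent Bernoulli(p) indicators X_j.
    d = sum(rng.random() < p_activate for _ in range(m))
    # Lifetime (5): first activation among the D remaining risks.
    if d == 0:
        return float("inf"), True
    return min(draw_W() for _ in range(d)), False
```

A degenerate mass function with all probability at M = 0 always returns an infinite (cured) lifetime, while a point mass at M = 3 with p = 1 returns the minimum of three progression times.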

It is important to note that the destructive cure rate model is not identifiable without further assumptions (Li et al., 2001). One common approach to ensuring identifiability is to incorporate covariates. For instance, we can model the activation probability p as a function of covariates x using the logistic link function:

p = exp(x′β_1) / {1 + exp(x′β_1)},

and simultaneously model a parameter governing the distribution of M in terms of another set of covariates z, using an appropriate link function g(z′β_2), where β_1 and β_2 are vectors of regression coefficients. To maintain identifiability, either β_1 or β_2 must omit the intercept term, and the covariate sets x and z must not overlap. The population survival function (also known as the long-term survival function) of the lifetime Y defined in (5) is given by:

S_pop(y) = P(Y > y) = Σ_{d=0}^∞ P[D = d] {S(y)}^d.

The corresponding population density function (or long-term density function) is obtained by differentiating:

f_pop(y) = −dS_pop(y)/dy.

By specifying the distribution of M, and consequently D, explicit expressions for Spop(y) and fpop(y) can be derived.

3. Proposed estimation method: sequential quadratic Hamiltonian algorithm

We consider a situation where lifetime data might be incompletely observed due to right censoring. Let T_i represent the actual failure time and C_i the censoring time for the i-th individual, where i = 1, 2, …, n and n denotes the sample size. The observed time, denoted Y_i, is defined as Y_i = min(T_i, C_i). To indicate whether a failure time is observed or censored, we define a censoring indicator δ_i, where δ_i = 1 if the failure is observed, and δ_i = 0 if it is right censored. The observed data can then be expressed as O = {(y_i, δ_i, x_i, z_i), i = 1, 2, …, n}, where x_i and z_i are covariates associated with the i-th individual. Assuming the censoring is non-informative, the likelihood function based on the observed data is given by

L(θ) = ∏_{i=1}^n [f_pop(y_i | x_i, z_i)]^{δ_i} [S_pop(y_i | x_i, z_i)]^{1−δ_i},

where θ denotes the vector of unknown model parameters, f_pop(·) is the population density function, and S_pop(·) is the corresponding survival function. Our objective is to estimate the unknown parameter vector θ that maximizes the likelihood function. By taking the natural logarithm of the likelihood, we obtain the observed data log-likelihood function as:

l(θ) = Σ_{i=1}^n [δ_i log f_pop(y_i | x_i, z_i) + (1 − δ_i) log S_pop(y_i | x_i, z_i)].
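In code, the observed-data log-likelihood is a direct transcription of this sum. The sketch below uses our own naming, with the model-specific density and survival functions passed in as callables:

```python
import math

def log_likelihood(theta, data, f_pop, S_pop):
    """Observed-data log-likelihood l(theta) under non-informative right censoring.

    data:  iterable of (y_i, delta_i, x_i, z_i) tuples
    f_pop: population density, called as f_pop(y, x, z, theta)
    S_pop: population survival, called as S_pop(y, x, z, theta)
    """
    total = 0.0
    for y, delta, x, z in data:
        if delta == 1:
            total += math.log(f_pop(y, x, z, theta))   # observed failure
        else:
            total += math.log(S_pop(y, x, z, theta))   # right-censored
    return total
```

For instance, with a plain exponential model f_pop(y) = θ e^{−θy}, S_pop(y) = e^{−θy}, and θ = 1, an observed failure at y = 1 and a censored time at y = 2 give l(1) = −3.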

To perform this maximization, we utilize the SQH method. The maximization problem is formulated as

max_{θ ∈ T_ad} l(θ),

where Tad represents the admissible parameter space. The SQH method is based on an augmented version of the log-likelihood function, defined as

l_ϵ(θ, θ̃) = l(θ) − ϵ ‖θ − θ̃‖²,

where ϵ > 0 is a penalization parameter updated dynamically during iterations, ‖·‖ denotes the Euclidean norm, and θ̃ is the previous parameter estimate. This penalty term encourages the updated parameter vector to remain close to the prior estimate, particularly when ϵ is large. If the increase in the log-likelihood is not sufficient, ϵ is increased; otherwise, it is decreased. The steps involved in the SQH algorithm are presented below.

Algorithm 3.1.

(SQH algorithm).

  1. Initialize: choose ϵ > 0, tolerance κ > 0, ρ > 0, step-up parameter λ > 1, step-down parameter ζ ∈ (0, 1), initial guess θ^0 ∈ T_ad, and the maximum number of iterations MaxIter.

  2. Set iteration counter k=0.

  3. Choose θ_i ∈ T_{i,ad}, where T_{i,ad} denotes the admissible set for the i-th element of θ (i = 1, 2, …, s, with s denoting the dimension of θ), such that
    l_ϵ(θ_1, θ_2^k, …, θ_s^k, θ^k) = max_{v_1 ∈ T_{1,ad}} l_ϵ(v_1, θ_2^k, …, θ_s^k, θ^k),
    l_ϵ(θ_1^k, θ_2, …, θ_s^k, θ^k) = max_{v_2 ∈ T_{2,ad}} l_ϵ(θ_1^k, v_2, …, θ_s^k, θ^k),
    ⋮
    l_ϵ(θ_1^k, θ_2^k, …, θ_s, θ^k) = max_{v_s ∈ T_{s,ad}} l_ϵ(θ_1^k, θ_2^k, …, v_s, θ^k).
  4. Calculate τ = ‖θ − θ^k‖².

  5. If l(θ) − l(θ^k) < ρτ, set ϵ = λϵ and return to step 3.

  6. Otherwise, set ϵ = ζϵ, and update θ^{k+1} = θ.

  7. Increment k = k + 1.

  8. If τ < κ or k > MaxIter, stop and return θ^k as the estimated optimal parameter vector. Otherwise, repeat from step 3.
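The steps above can be sketched as follows. The coordinate-wise maximization in step 3 is approximated here by a grid search over each admissible interval (an illustrative choice on our part; any one-dimensional maximizer can be substituted), and all function and parameter names are ours:

```python
def sqh_maximize(l, theta0, bounds, eps=1.0, lam=10.0, zeta=0.5,
                 rho=1e-3, kappa=1e-8, max_iter=1000, grid=25):
    """Gradient-free SQH maximization of l over box constraints `bounds`.

    Each coordinate i maximizes l(theta) - eps * (v - theta_i^k)^2 with the
    other coordinates held at the previous iterate, as in step 3; eps is
    increased by `lam` when the gain l(theta) - l(theta^k) falls below
    rho * tau (step 5) and decreased by `zeta` otherwise (step 6).
    """
    theta = list(theta0)
    for _ in range(max_iter):
        while True:
            cand = list(theta)
            for i, (lo, hi) in enumerate(bounds):
                vals = [lo + (hi - lo) * j / grid for j in range(grid + 1)]
                vals.append(theta[i])   # keep the current point as a candidate
                best_v, best_val = theta[i], None
                for v in vals:
                    trial = theta[:i] + [v] + theta[i + 1:]
                    val = l(trial) - eps * (v - theta[i]) ** 2
                    if best_val is None or val > best_val:
                        best_v, best_val = v, val
                cand[i] = best_v
            tau = sum((a - b) ** 2 for a, b in zip(cand, theta))
            if l(cand) - l(theta) >= rho * tau:
                break           # sufficient increase: accept the update
            eps *= lam          # otherwise tighten the penalty, redo step 3
        eps *= zeta
        theta = cand
        if tau < kappa:
            break               # step 8: convergence
    return theta
```

On a simple concave objective such as l(θ) = −(θ_1 − 0.3)² − (θ_2 + 0.2)², the iterates settle on the maximizer to within the grid resolution.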

4. Simulation study: demonstration with exponentially weighted Poisson competing risks

4.1. Simulation setup

Although the SQH algorithm is formulated in a general framework, for the purposes of the simulation study, we assume that the initial number of competing risks, denoted by M in (4), follows a weighted Poisson distribution with the probability mass function:

P(M = m; η, ϕ) = exp(−η e^ϕ) (η e^ϕ)^m / m!,  m = 0, 1, 2, …,  (6)

where ϕ ∈ R and η > 0. Given this assumption, and noting that the conditional distribution of D given M = m is Binomial(m, p), Pal & Balakrishnan (2017c) showed that the marginal distribution of D is given by:

P(D = d; η, ϕ, p) = exp(−η p e^ϕ) (η p e^ϕ)^d / d!,  d = 0, 1, 2, …,  (7)

which is again a weighted Poisson distribution with an exponential weight. Furthermore, Rodrigues et al. (2011) demonstrated that the population survival function for the lifetime variable Y in equation (5), under this framework, is:

S_pop(y) = P(Y > y) = exp{−η p e^ϕ F(y)},  (8)

and the corresponding population density function is:

f_pop(y) = η p e^ϕ S_pop(y) f(y),  (9)

where f(·) denotes the probability density function of the progression times, given by f(y) = dF(y)/dy. The cure rate, defined as the long-term survival probability, is then:

p_0 = lim_{y→∞} S_pop(y) = exp(−η p e^ϕ).  (10)

Although we assume a weighted Poisson distribution for the initial number of competing risks in this study, alternative distributions, such as those explored in Pal & Balakrishnan (2018) and Pal et al. (2018), can also be readily incorporated.
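Equations (8)-(10) translate directly into code. The sketch below is ours (a generic progression-time distribution F and density f are passed in as callables, and the function names are our own):

```python
import math

def ewp_spop(y, eta, p, phi, F):
    """Population survival (8): S_pop(y) = exp{-eta * p * e^phi * F(y)}."""
    return math.exp(-eta * p * math.exp(phi) * F(y))

def ewp_fpop(y, eta, p, phi, F, f):
    """Population density (9): f_pop(y) = eta * p * e^phi * S_pop(y) * f(y)."""
    return eta * p * math.exp(phi) * ewp_spop(y, eta, p, phi, F) * f(y)

def ewp_cure_rate(eta, p, phi):
    """Cure rate (10): p0 = lim_{y -> inf} S_pop(y) = exp(-eta * p * e^phi)."""
    return math.exp(-eta * p * math.exp(phi))
```

A useful self-check is that f_pop agrees with the numerical derivative −dS_pop/dy, and that S_pop(y) approaches the cure rate p_0 as y grows.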

To compare the performance of the SQH algorithm with the previously proposed CGLS algorithm by Pal & Roy (2021), we adopt the same simulation design as in their study. In this setup, the parameter η is modeled using a log-linear link function: η = exp(z′β_2), where β_2 does not include the intercept term to preserve identifiability. Following Pal & Roy (2021) and to facilitate comparison, we also assume that F(·) and f(·) represent the distribution and density functions, respectively, of a two-parameter Weibull distribution:

F(y) = 1 − exp{−(γ_2 y)^{1/γ_1}},  f(y) = (1/(γ_1 y)) (γ_2 y)^{1/γ_1} {1 − F(y)},  (11)

where y > 0, γ_1 > 0, and γ_2 > 0. Consequently, the parameter vector is defined as θ = (β_1′, β_2′, γ_1, γ_2, ϕ)′, with the admissible parameter space T_ad = {θ : β_1 ∈ R^p, β_2 ∈ R^q, γ_1 > 0, γ_2 > 0, ϕ ∈ R}. It is worth highlighting that the SQH algorithm is flexible and can be extended to accommodate semi-parametric or non-parametric approaches for estimating F(·) and f(·), as well as the inclusion of covariates in these functions.

We begin with a brief overview of the data generation process. To ensure comparability with the study by Pal & Roy (2021), we replicate their setup, specifically emulating the characteristics of the real melanoma dataset available in the "timereg" package of R software. This dataset includes two primary covariates: tumor thickness (measured in mm) and ulceration status (coded as 1 for presence and 0 for absence of an ulcer). Preliminary analysis shows that 44% of patients presented with an ulcer. For these patients, the mean and standard deviation of tumor thickness were 4.34 mm and 3.22 mm, respectively. In contrast, among patients without an ulcer, the mean and standard deviation were 1.81 mm and 2.19 mm, respectively. Based on histograms of tumor thickness for the two groups, a Weibull distribution appears suitable for the ulcer group, while an exponential distribution is appropriate for the non-ulcer group. To simulate ulceration status, we first draw a uniform random variable U ~ Uniform(0, 1). If U ≤ 0.44, ulceration status (z) is set to 1, and tumor thickness (x) is drawn from a Weibull distribution with parameters selected to yield a theoretical mean of 4.34 and variance of 10.37. If U > 0.44, ulceration status is set to 0, and x is drawn from an exponential distribution with a mean of 1.81. To ensure identifiability in our model, we define the parameter η as a function of ulceration status (excluding the intercept) and p as a function of tumor thickness (including the intercept). The respective link functions are:

η = exp(β_2 z)  and  p = exp(β_0 + β_1 x) / {1 + exp(β_0 + β_1 x)}.

We set η = 3 when an ulcer is present, which implies β_2 = log(3) = 1.099. To determine β_0 and β_1, we choose target values of p corresponding to the minimum and maximum observed tumor thickness values, x_min and x_max, such that p = 0.3 at x_min and p = 0.9 at x_max. We then solve the system:

exp(β_0 + β_1 x_min) / {1 + exp(β_0 + β_1 x_min)} = 0.3,
exp(β_0 + β_1 x_max) / {1 + exp(β_0 + β_1 x_max)} = 0.9.

The resulting values of β0 and β1 will vary depending on the generated tumor thickness values in each dataset.
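This 2×2 system has a closed-form solution on the logit scale, which the sketch below implements (function names are ours):

```python
import math

def logit(p):
    """Log-odds transform."""
    return math.log(p / (1.0 - p))

def solve_betas(x_min, x_max, p_min=0.3, p_max=0.9):
    """Solve the two logistic equations for (beta0, beta1) so that the
    activation probability equals p_min at x_min and p_max at x_max."""
    beta1 = (logit(p_max) - logit(p_min)) / (x_max - x_min)
    beta0 = logit(p_min) - beta1 * x_min
    return beta0, beta1
```

Plugging the solution back into the logistic link reproduces the target probabilities at both endpoints.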

To incorporate random censoring, censoring times (C) are generated from an exponential distribution with a censoring rate of 0.15. The generation of lifetime data under the destructive cure model with exponentially weighted Poisson competing risks proceeds as follows:

  1. Draw the initial number of risk factors M from an exponentially weighted Poisson distribution (as defined in (6)) using a specified value of ϕ.

  2. If M = 0, set the number of active risk factors D = 0.

  3. If M > 0, draw D ~ Binomial(M, p).

  4. If D = 0, set the observed lifetime Y = C.

  5. If D > 0, draw D Weibull-distributed variables W_1, W_2, …, W_D with density as defined in (11) using parameters γ_1 and γ_2, and compute Y = min{min(W_1, …, W_D), C}.

  6. Set the censoring indicator δ = 0 if Y = C, and δ = 1 otherwise.
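Steps 1-6 can be assembled into a single data-generating function. The sketch below is our illustration: the inverse-CDF Poisson draw exploits the fact that the pmf in (6) is Poisson with mean η e^ϕ, the Weibull draw inverts F in (11), and we interpret the 0.15 censoring rate as the exponential rate parameter (an assumption on our part):

```python
import math
import random

def generate_observation(eta, p, phi, gamma1, gamma2, cens_rate, rng):
    """One (Y, delta) draw following steps 1-6 of the data-generation scheme."""
    # Step 1: M ~ exponentially weighted Poisson (6), i.e. Poisson(eta * e^phi).
    mean_m = eta * math.exp(phi)
    u, cdf, m, pm = rng.random(), 0.0, 0, math.exp(-mean_m)
    while True:                      # inverse-CDF Poisson sampling
        cdf += pm
        if u <= cdf or m > 10000:
            break
        m += 1
        pm *= mean_m / m
    # Steps 2-3: thin the initial risks, D | M ~ Binomial(M, p).
    d = sum(rng.random() < p for _ in range(m))
    # Censoring time from an exponential distribution.
    c = rng.expovariate(cens_rate)
    # Steps 4-5: lifetime is the first activation, truncated by censoring.
    if d == 0:
        y = c
    else:
        # Weibull draw by inverting F(w) = 1 - exp{-(gamma2 * w)^(1/gamma1)}.
        w_min = min((-math.log(1.0 - rng.random())) ** gamma1 / gamma2
                    for _ in range(d))
        y = min(w_min, c)
    # Step 6: censoring indicator.
    delta = 0 if y == c else 1
    return y, delta
```

For example, setting p = 0 forces D = 0, so every generated observation is censored (δ = 0) at its censoring time.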

4.2. Parameter settings and initialization

We adopt parameter settings similar to those used in Pal & Roy (2021), with the following true values: (γ_1, γ_2) = (0.215, 0.183) and (0.316, 0.179); sample sizes n = 100 and 200; and values of ϕ = −0.5 and 0.7. All simulations were performed using the R statistical software, and the results are based on 500 Monte Carlo replications. To initialize the SQH algorithm, we followed the strategy proposed in Pal & Roy (2021). For each model parameter, we constructed an interval by allowing a 20% deviation from its true value, and a random value from this interval was selected as the initial estimate. For the fixed parameters of the SQH algorithm, we conducted a preliminary sensitivity analysis and set the values as follows: ϵ = 100, λ = 100, ρ = 100, and ζ = 0.5. The convergence criterion was defined by a tolerance level of κ = 0.001, and the maximum number of iterations was set to MaxIter = 1000. In comparing the SQH and CGLS algorithms, we ensured fairness by using identical datasets and initial values across both methods.

4.3. Simulation results and discussion

Tables 1 and 2 report the simulation results for bias and RMSE when the true value of ϕ is −0.5 and 0.7, respectively. The results show that the SQH algorithm consistently converges to the true parameter values, with relatively small biases and RMSEs. As expected, RMSEs generally decrease with increasing sample size, highlighting a favorable property of the method. Compared to the CGLS algorithm, the SQH algorithm achieves lower biases and RMSEs for the regression parameters (β_0, β_1, β_2) and the parameter ϕ. This improvement is particularly important, given that the cure rate depends entirely on these parameters, indicating that the SQH algorithm yields more accurate and reliable cure rate estimates. For the lifetime parameters γ_1 and γ_2, both algorithms perform comparably in terms of bias and RMSE.

Table 1:

Comparison of SQH and CGLS algorithms in terms of bias and RMSE with ϕ = −0.5

n (γ1,γ2) Parameter Bias RMSE
SQH CGLS SQH CGLS
100 (0.215,0.183) β2 −0.007 0.092 0.120 0.466
β0 −0.003 −0.186 0.098 0.556
β1 −0.002 0.078 0.063 0.314
γ1 −0.005 −0.011 0.038 0.040
γ2 0.002 0.002 0.011 0.012
ϕ 0.002 0.002 0.069 0.374
200 (0.215,0.183) β2 0.008 0.043 0.117 0.298
β0 −0.003 −0.069 0.099 0.259
β1 0.008 0.056 0.062 0.194
γ1 0.002 −0.004 0.029 0.027
γ2 0.001 0.001 0.008 0.009
ϕ 0.010 −0.025 0.068 0.222
100 (0.316,0.179) β2 −0.005 0.108 0.121 0.553
β0 −0.002 −0.424 0.099 0.967
β1 −0.001 0.168 0.065 0.614
γ1 −0.007 −0.015 0.052 0.059
γ2 0.003 0.003 0.017 0.019
ϕ 0.000 0.067 0.069 0.506
200 (0.316,0.179) β2 0.009 0.050 0.119 0.357
β0 −0.003 −0.167 0.101 0.446
β1 0.009 0.101 0.061 0.304
γ1 0.004 −0.006 0.040 0.039
γ2 0.001 0.002 0.012 0.013
ϕ 0.011 −0.003 0.072 0.312

Table 2:

Comparison of SQH and CGLS algorithms in terms of bias and RMSE with ϕ = 0.70

n (γ1,γ2) Parameter Bias RMSE
SQH CGLS SQH CGLS
100 (0.215,0.183) β2 0.007 0.066 0.125 0.303
β0 −0.005 −0.041 0.100 0.240
β1 0.004 0.032 0.070 0.205
γ1 0.011 −0.008 0.033 0.025
γ2 −0.001 0.001 0.010 0.012
ϕ 0.013 0.021 0.089 0.188
200 (0.215,0.183) β2 −0.006 0.006 0.118 0.194
β0 −0.005 −0.004 0.105 0.141
β1 −0.001 0.026 0.075 0.126
γ1 0.010 −0.002 0.023 0.018
γ2 −0.002 0.000 0.010 0.007
ϕ −0.001 0.008 0.096 0.140
100 (0.316,0.179) β2 0.011 0.070 0.126 0.360
β0 −0.002 −0.120 0.099 0.440
β1 0.008 0.084 0.061 0.409
γ1 0.009 −0.011 0.038 0.038
γ2 0.001 0.001 0.013 0.018
ϕ 0.015 0.047 0.086 0.282
200 (0.316,0.179) β2 −0.004 0.009 0.117 0.236
β0 −0.003 −0.021 0.101 0.190
β1 0.000 0.043 0.053 0.184
γ1 0.008 −0.003 0.029 0.026
γ2 −0.001 0.000 0.012 0.011
ϕ 0.003 0.009 0.087 0.180

To further assess the accuracy and precision of the cure rate estimates produced by the SQH algorithm, Table 3 presents the corresponding biases and RMSEs, with the continuous covariate x fixed at 2.5 and the binary covariate z set to 0 and 1. The results demonstrate that the SQH algorithm substantially outperforms the CGLS algorithm, offering noticeably smaller biases (indicating improved accuracy) and considerably lower RMSEs (reflecting greater precision) in estimating the cure rate.

Table 3:

Comparison of SQH and CGLS algorithms in terms of bias and RMSE of the estimate of cure rate, p_0(x, z), with n = 100

z (γ_1, γ_2, ϕ) Bias of p_0(x = 2.5, z) RMSE of p_0(x = 2.5, z)
SQH CGLS SQH CGLS
0 (0.215, 0.183, −0.50) 0.000 −0.003 0.028 0.080
1 (0.215, 0.183, −0.50) 0.004 −0.019 0.061 0.113
0 (0.215, 0.183, 0.70) −0.004 −0.008 0.061 0.092
1 (0.215, 0.183, 0.70) 0.003 −0.004 0.041 0.047
0 (0.316, 0.179, −0.50) 0.000 −0.005 0.029 0.091
1 (0.316, 0.179, −0.50) 0.003 −0.021 0.065 0.121
0 (0.316, 0.179, 0.70) −0.008 −0.012 0.055 0.103
1 (0.316, 0.179, 0.70) −0.001 −0.004 0.037 0.051

Table 4 presents the biases and RMSEs for the estimated population survival probabilities, which are functions of both the cure rate and the lifetime parameters. For this evaluation, the observed lifetime was fixed at y=5, the continuous covariate at x=2.5, and the binary covariate z was varied between 0 and 1. The results demonstrate that the SQH algorithm consistently achieves lower biases and RMSEs compared to the CGLS algorithm. Beyond accuracy and precision, computational efficiency is also a critical consideration. Table 5 reports the CPU times (in seconds) required by each algorithm to produce estimation results, including parameter estimates, biases, and RMSEs, across 500 Monte Carlo runs. The results show that the SQH algorithm, which is gradient free, is substantially faster than the CGLS algorithm across all parameter settings considered.

Table 4:

Comparison of SQH and CGLS algorithms in terms of bias and RMSE of the estimate of S_pop(y | x, z) with n = 100

z (γ_1, γ_2, ϕ) Bias of S_pop(y = 5 | x = 2.5, z) RMSE of S_pop(y = 5 | x = 2.5, z)
SQH CGLS SQH CGLS
0 (0.215, 0.183, −0.50) −0.002 −0.003 0.026 0.048
1 (0.215, 0.183, −0.50) −0.002 −0.017 0.063 0.085
0 (0.215, 0.183, 0.70) −0.004 −0.008 0.054 0.068
1 (0.215, 0.183, 0.70) −0.002 −0.024 0.074 0.080
0 (0.316, 0.179, −0.50) −0.003 −0.004 0.027 0.055
1 (0.316, 0.179, −0.50) −0.004 −0.020 0.065 0.090
0 (0.316, 0.179, 0.70) −0.010 −0.010 0.056 0.079
1 (0.316, 0.179, 0.70) −0.010 −0.025 0.075 0.084

Table 5:

Comparison of SQH and CGLS algorithms in terms of CPU times (in seconds)

n γ1,γ2,ϕ CPU Time (seconds)
SQH CGLS
100 (0.215, 0.183, −0.50) 5.451 189.678
200 (0.215, 0.183, −0.50) 7.819 184.643
100 (0.215, 0.183, 0.70) 5.601 151.000
200 (0.215, 0.183, 0.70) 7.745 158.732
100 (0.316, 0.179, −0.50) 5.411 230.199
200 (0.316, 0.179, −0.50) 7.313 235.498
100 (0.316, 0.179, 0.70) 5.339 185.281
200 (0.316, 0.179, 0.70) 7.210 186.556

As previously noted, the fixed parameters in the SQH algorithm (ϵ, λ, ρ, and ζ) were selected based on a comprehensive preliminary study, with all analyses conducted over 500 Monte Carlo replications. Our findings indicate that the estimation results are largely insensitive to the choice of ζ within the interval (0, 1). Table 6 provides a brief sensitivity analysis for ζ, with ϵ, λ, and ρ all set to 100. Similar trends were observed across other parameter configurations, which are omitted here for brevity. Additionally, CPU times varied minimally with different ζ values. For example, under the settings n = 100, γ_1 = 0.215, γ_2 = 0.183, ϕ = −0.5, the CPU times (in seconds) for ζ = 0.1, 0.5, and 0.9 were 5.078, 5.410, and 5.020, respectively. Comparable sensitivity analyses were also performed for the other fixed parameters (ϵ, λ, and ρ), and the estimation outcomes remained stable across the tested values.

Table 6:

Sensitivity analysis of the SQH algorithm with respect to the parameter ζ ∈ (0, 1) for n = 100

γ1,γ2,ϕ Parameter ζ=0.1 ζ=0.5 ζ=0.9
Bias RMSE Bias RMSE Bias RMSE
(0.215, 0.183, −0.50) β2 −0.006 0.125 −0.007 0.120 −0.005 0.125
β0 −0.002 0.100 −0.003 0.098 0.000 0.102
β1 0.000 0.049 −0.002 0.063 0.001 0.082
γ1 0.007 0.039 −0.005 0.038 −0.007 0.038
γ2 0.001 0.010 0.002 0.011 0.002 0.012
ϕ 0.002 0.063 0.002 0.069 0.007 0.083
(0.215, 0.183, 0.70) β2 0.006 0.125 0.007 0.125 0.015 0.126
β0 −0.006 0.100 −0.005 0.100 0.001 0.102
β1 0.000 0.069 0.004 0.070 0.014 0.072
γ1 0.004 0.023 0.011 0.033 0.005 0.031
γ2 −0.001 0.011 −0.001 0.010 0.000 0.010
ϕ 0.011 0.089 0.013 0.089 0.024 0.093

4.4. Comparison of SQH with CGLS and EM algorithms

In this section, we present, for interested readers, the results of a small simulation study comparing the proposed SQH algorithm with the CGLS and EM algorithms (see Table 7). Specifically, we consider ϕ = −0.50 and (γ_1, γ_2) = (0.316, 0.179). The EM results are taken from Pal & Roy (2021). For the parameters associated with the cure rate, namely β_2, β_0, β_1, and ϕ, the EM algorithm exhibits substantially larger biases and RMSEs than those obtained using the SQH algorithm. In contrast, for the lifetime parameters γ_1 and γ_2, the biases and RMSEs of the EM and SQH algorithms are comparable. Overall, these results indicate that the SQH algorithm outperforms the EM algorithm.

Table 7:

Comparison of SQH with CGLS and EM algorithms in terms of bias and RMSE with ϕ = −0.50 and (γ_1, γ_2) = (0.316, 0.179)

n Parameter Bias RMSE
SQH CGLS EM SQH CGLS EM
100 β2 −0.005 0.108 0.149 0.121 0.553 0.635
β0 −0.002 −0.424 −0.364 0.099 0.967 11.803
β1 −0.001 0.168 0.229 0.065 0.614 1.342
γ1 −0.007 −0.015 0.001 0.052 0.059 0.047
γ2 0.003 0.003 0.000 0.017 0.019 0.017
ϕ 0.000 0.067 0.627 0.069 0.506 1.307
200 β2 0.009 0.050 0.067 0.119 0.357 0.428
β0 −0.003 −0.167 −1.055 0.101 0.446 2.271
β1 0.009 0.101 0.341 0.061 0.304 0.914
γ1 0.004 −0.006 −0.003 0.040 0.039 0.037
γ2 0.001 0.002 0.000 0.012 0.013 0.014
ϕ 0.011 −0.003 0.635 0.072 0.312 1.300

5. Application: analysis of melanoma data

To demonstrate the practical utility of the proposed SQH algorithm within the destructive cure model featuring exponentially weighted Poisson competing risks, we applied it to the well-known melanoma dataset from the E1690 study. This dataset was previously used by Pal & Roy (2021) to illustrate the CGLS algorithm in the same modeling context. The data originate from a clinical trial investigating the efficacy of high-dose interferon alfa-2b (INF) in preventing recurrence of cutaneous melanoma, a form of malignant skin cancer. After excluding records with missing values, the analysis included 417 patients enrolled between 1991 and 1995, with follow-up continuing through 1998. The time-to-event, or lifetime, is defined as the time to death or censoring, measured in years, with approximately 56% of the observations censored.

For our analysis, we selected tumor thickness (in millimeters) and treatment group (1 for INF, 0 for observation) as covariates. Following the approach of Pal & Roy (2021), the parameter p was modeled as a function of tumor thickness (including an intercept), while the parameter η was linked to the treatment group (excluding an intercept) to ensure model identifiability. In this parameterization, β0 and β1 represent the regression coefficients associated with p, and β2 corresponds to η. It is worth noting that although the selection of covariates for η and p is an important modeling consideration, it lies beyond the scope of the current study and is left for future research (Masud et al., 2018).

The lifetime was modeled using a Weibull distribution, as specified in (11). After a preliminary tuning study, the SQH algorithm parameters were set to ϵ = 5, λ = 5, ρ = 5, and ζ = 0.5. To obtain initial estimates of the model parameters, we followed the procedure outlined by Pal & Roy (2021). Specifically, initial values for the Weibull parameters γ1 and γ2 were derived by equating the sample mean and variance of the observed lifetimes to the theoretical mean and variance of the Weibull distribution given in (11); solving these two equations yielded the starting values of γ1 and γ2. For the regression parameters (β2, β0, β1) and the parameter ϕ, we conducted a four-dimensional grid search, with each parameter searched over the interval [−5, 5] in increments of 0.1. The combination of values that maximized the observed data log-likelihood function, evaluated at the previously computed initial values of γ1 and γ2, was selected as the initial guess for (β2, β0, β1, ϕ).
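The moment-matching initialization described above can be sketched as follows. This is our own Python illustration (not the authors' code) in a generic Weibull shape/scale parameterization; the mapping to the paper's (γ1, γ2) in (11) depends on how (11) is written. The squared coefficient of variation of a Weibull distribution depends on the shape only, so the shape equation can be solved by bisection and the scale then follows from the mean.

```python
import math
import statistics

def weibull_moment_init(lifetimes):
    """Initial Weibull (shape, scale) values by matching sample moments.

    Matches the sample mean and variance to the theoretical Weibull
    moments: mean = scale * Gamma(1 + 1/k) and
    var = scale^2 * (Gamma(1 + 2/k) - Gamma(1 + 1/k)^2).
    """
    m = statistics.mean(lifetimes)
    v = statistics.variance(lifetimes)
    cv2 = v / m ** 2  # squared coefficient of variation; depends on k only

    def f(k):
        # Gamma(1+2/k)/Gamma(1+1/k)^2 - 1 - cv2, computed on the log scale
        return math.exp(math.lgamma(1 + 2 / k)
                        - 2 * math.lgamma(1 + 1 / k)) - 1 - cv2

    lo, hi = 0.05, 100.0  # f is decreasing in k; this brackets the root
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    shape = 0.5 * (lo + hi)
    scale = m / math.gamma(1 + 1 / shape)
    return shape, scale
```

In the same spirit, the grid search over (β2, β0, β1, ϕ) would simply evaluate the observed-data log-likelihood at each grid point with γ1 and γ2 held at these starting values, and keep the maximizing combination.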

Table 8 presents the parameter estimates and corresponding standard errors for the destructive cure model with exponentially weighted Poisson competing risks. For comparison, we also include the CGLS and EM estimation results reported by Pal & Roy (2021). Standard errors were computed using a non-parametric bootstrap procedure, as the second-order derivatives of the observed log-likelihood function proved to be highly unstable, particularly with respect to the parameter ϕ. This issue was similarly encountered by Pal & Roy (2021), who also relied on bootstrap methods for estimating standard errors in their CGLS implementation. For the EM algorithm, which employs a profile likelihood approach for ϕ, the standard errors were computed using the expressions for the Hessian matrix components (with ϕ treated as fixed) provided in Pal & Balakrishnan (2017c). The results indicate that all three methods produce comparable parameter estimates. However, overall, the standard errors obtained using the SQH algorithm are smaller than those from the CGLS and EM methods.

Table 8:

Estimates and standard errors of the parameters for the destructive cure model with exponentially weighted Poisson competing risks using the melanoma data

Parameter   Estimate                       Standard Error
            SQH      CGLS     EM          SQH     CGLS    EM
β2           0.019    0.025    0.024      0.093   0.147   0.147
β0          −4.782   −4.752   −4.803      0.051   0.076   0.147
β1           0.008    0.003    0.002      0.018   0.023   0.023
γ1           0.582    0.586    0.586      0.037   0.041   0.039
γ2           0.396    0.392    0.392      0.032   0.036   0.028
ϕ            4.318    4.349    4.400      0.050   0.076     -
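The nonparametric bootstrap used for the standard errors in Table 8 can be sketched generically as below. This is our own illustration: `fit` is a placeholder for a full estimation run (in the paper's setting, the SQH algorithm applied to the resampled data), shown here with a trivial estimator for concreteness.

```python
import math
import random

def bootstrap_se(data, fit, n_boot=500, seed=0):
    """Nonparametric bootstrap standard errors.

    Resamples subjects with replacement, refits the model on each
    resample via `fit` (which returns a list of parameter estimates),
    and reports the standard deviation of each parameter across the
    bootstrap replicates.
    """
    rng = random.Random(seed)
    n = len(data)
    draws = []
    for _ in range(n_boot):
        resample = [data[rng.randrange(n)] for _ in range(n)]
        draws.append(fit(resample))
    p = len(draws[0])
    ses = []
    for j in range(p):
        col = [d[j] for d in draws]
        mu = sum(col) / n_boot
        ses.append(math.sqrt(sum((x - mu) ** 2 for x in col) / (n_boot - 1)))
    return ses
```

This avoids second derivatives of the observed log-likelihood entirely, which is why it is preferred when the Hessian is numerically unstable, as reported for ϕ.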

To assess the adequacy of the destructive cure model with exponentially weighted Poisson competing risks and Weibull lifetimes, we evaluated the normalized randomized quantile residuals based on the SQH parameter estimates. We present the QQ plot in Figure 1, where each point represents the median of five sets of ordered residuals. The plot indicates that the model provides an excellent fit to the melanoma dataset. Additionally, a formal assessment of the normality of the residuals was conducted using the Kolmogorov–Smirnov test, yielding a p-value of 0.933. This high p-value offers strong support for the assumption of normally distributed residuals.

Figure 1: QQ plot of normalized randomized quantile residuals
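The residual diagnostic described above can be sketched as follows. This is our own Python illustration under a generic right-censoring setup: `cdf` stands in for the fitted model's population lifetime distribution function (the paper's actual CDF under exponentially weighted Poisson competing risks is defined elsewhere in the article). For an event the residual is Φ⁻¹(F(t)); for a censored time the probability is drawn uniformly from [F(t), 1], which makes the residuals exactly standard normal when the fitted model is correct.

```python
import math
import random
from statistics import NormalDist

def randomized_quantile_residuals(times, events, cdf, seed=0):
    """Normalized randomized quantile residuals for right-censored data."""
    rng = random.Random(seed)
    nd = NormalDist()
    res = []
    for t, d in zip(times, events):
        u = cdf(t) if d == 1 else rng.uniform(cdf(t), 1.0)
        u = min(max(u, 1e-12), 1 - 1e-12)  # guard the normal quantile
        res.append(nd.inv_cdf(u))
    return res

def ks_statistic_vs_normal(res):
    """One-sample Kolmogorov-Smirnov distance to the standard normal."""
    nd = NormalDist()
    s = sorted(res)
    n = len(s)
    return max(max((i + 1) / n - nd.cdf(x), nd.cdf(x) - i / n)
               for i, x in enumerate(s))
```

The QQ plot in the paper additionally stabilizes the randomization by plotting, at each order statistic, the median over five independent sets of residuals.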

Figure 2 displays (for all three methods) the estimated overall survival functions for patients with tumor thicknesses of 0.70 mm, 3.05 mm, and 10 mm, representing the 5th, 50th, and 95th percentiles of the tumor thickness distribution, respectively. The survival curves clearly plateau at a non-zero level, indicating the existence of a cured subgroup within the population. Additionally, the similarity of the survival curves across different tumor thickness values reflects the near-zero estimate of β1, suggesting minimal influence of tumor thickness on overall survival. Likewise, the estimate of β2 being close to zero implies that the survival curves for the treatment and observation groups are nearly identical.

Figure 2: Plots of overall survival function for patients with different tumor thickness values

6. Conclusion

In this paper, we focused on parameter estimation for the destructive cure model, assuming that the initial risks follow an exponentially weighted Poisson distribution. To this end, we proposed an improved estimation method, the SQH algorithm, which is based on a gradient-free optimization approach. Although our application of the SQH algorithm centered on the destructive cure model with exponentially weighted Poisson competing risks, the method is readily adaptable to other competing risk distributions. Moreover, the SQH algorithm has potential for broader application in general maximum likelihood estimation problems. Compared to the recently proposed CGLS algorithm, which itself has demonstrated superior performance over the EM algorithm and other optimization methods, the SQH algorithm achieves lower bias and RMSE for parameters associated with cure rate estimation. This leads to more accurate and precise estimates of the cure rate. In addition, our CPU time analysis showed that the SQH algorithm is significantly faster than the CGLS algorithm across all parameter configurations. When applied to the melanoma dataset, the parameter estimates obtained via SQH were consistent with those from the CGLS and EM approaches but exhibited notably smaller standard errors.

The SQH algorithm holds promise for use in more complex cure rate models, such as the Conway-Maxwell (COM) Poisson cure rate model and the destructive COM-Poisson cure rate model (Balakrishnan & Pal, 2016; Majakwara & Pal, 2019), where profile likelihood methods have been used to estimate the COM-Poisson shape parameter ϕ due to the flatness of the likelihood surface. Evaluating the performance of the SQH algorithm in these models, and comparing it to the CGLS algorithm, the EM algorithm, and variants such as the stochastic EM algorithm (Davies et al., 2021; Pal, 2021), is a promising direction for future research. We are currently investigating these extensions and intend to report our findings in a future paper.

Funding

The research reported in this publication was supported by the National Institute of General Medical Sciences of the National Institutes of Health under Award Numbers R15GM150091 and R35GM156859. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. This research was also partly supported by the US National Science Foundation under grant number 2309491.

Footnotes

Conflict of interest

The authors declare no potential conflict of interest.

Data Availability Statement

The R codes for the data generation and the SQH algorithm are available on the GitHub page https://github.com/suvrapal/SQH-EWP.

References

1. Balakrishnan N, & Pal S (2012). EM algorithm-based likelihood estimation for some cure rate models. Journal of Statistical Theory and Practice, 6 (4), 698–724.
2. Balakrishnan N, & Pal S (2013). Lognormal lifetimes and likelihood-based inference for flexible cure rate models based on COM-Poisson family. Computational Statistics & Data Analysis, 67, 41–67.
3. Balakrishnan N, & Pal S (2014). COM-Poisson cure rate models and associated likelihood-based inference with exponential and Weibull lifetimes. In Frenkel I, Karagrigoriou A, Lisnianski A, & Kleyner A (Eds.), Applied reliability engineering and risk analysis: probabilistic models and statistical inference (pp. 308–348). Chichester, U.K.: John Wiley & Sons.
4. Balakrishnan N, & Pal S (2015a). An EM algorithm for the estimation of parameters of a flexible cure rate model with generalized gamma lifetime and model discrimination using likelihood- and information-based methods. Computational Statistics, 30 (1), 151–189.
5. Balakrishnan N, & Pal S (2015b). Likelihood inference for flexible cure rate models with gamma lifetimes. Communications in Statistics - Theory and Methods, 44 (19), 4007–4048.
6. Balakrishnan N, & Pal S (2016). Expectation maximization-based likelihood inference for flexible cure rate models with Weibull lifetimes. Statistical Methods in Medical Research, 25 (4), 1535–1563.
7. Berkson J, & Gage RP (1952). Survival curve for cancer patients following treatment. Journal of the American Statistical Association, 47 (259), 501–515.
8. Boag JW (1949). Maximum likelihood estimates of the proportion of patients cured by cancer therapy. Journal of the Royal Statistical Society. Series B (Methodological), 11 (1), 15–53.
9. Breitenbach T, & Borzì A (2019). On the SQH Scheme to Solve Nonsmooth PDE Optimal Control Problems. Numerical Functional Analysis and Optimization, 40 (13), 1489–1531. doi: 10.1080/01630563.2019.1599911
10. Chen MH, Ibrahim JG, & Sinha D (1999). A new Bayesian model for survival data with a surviving fraction. Journal of the American Statistical Association, 94 (447), 909–919.
11. Cooner F, Banerjee S, & Sinha D (2007). Flexible cure rate modeling under latent activation schemes. Journal of the American Statistical Association, 102 (478), 560–572.
12. Davies K, Pal S, & Siddiqua JA (2021). Stochastic EM algorithm for generalized exponential cure rate model and an empirical study. Journal of Applied Statistics, 48 (12), 2112–2135.
13. Dey A, Borzì A, & Roy S (2024). A high contrast and resolution reconstruction algorithm in quantitative photoacoustic tomography. Journal of Computational and Applied Mathematics, 116065.
14. Gallardo DI, Bolfarine H, & Pedroso-de Lima AC (2016). Destructive weighted Poisson cure rate models with bivariate random effects: classical and Bayesian approaches. Computational Statistics & Data Analysis, 98, 31–45.
15. Li CS, Taylor JMG, & Sy JP (2001). Identifiability of cure models. Statistics & Probability Letters, 54 (4), 389–395.
16. Majakwara J, & Pal S (2019). On some inferential issues for the destructive COM-Poisson-generalized gamma regression cure rate model. Communications in Statistics - Simulation and Computation, 48 (10), 3118–3142.
17. Masud A, Tu W, & Yu Z (2018). Variable selection for mixture and promotion time cure rate models. Statistical Methods in Medical Research, 27 (7), 2185–2199.
18. Pal S (2021). A simplified stochastic EM algorithm for cure rate model with negative binomial competing risks: an application to breast cancer data. Statistics in Medicine, 40 (28), 6387–6409.
19. Pal S, & Balakrishnan N (2016). Destructive negative binomial cure rate model and EM-based likelihood inference under Weibull lifetime. Statistics & Probability Letters, 116, 9–20.
20. Pal S, & Balakrishnan N (2017a). An EM type estimation procedure for the destructive exponentially weighted Poisson regression cure model under generalized gamma lifetime. Journal of Statistical Computation and Simulation, 87 (6), 1107–1129.
21. Pal S, & Balakrishnan N (2017b). Likelihood inference for COM-Poisson cure rate model with interval-censored data and Weibull lifetimes. Statistical Methods in Medical Research, 26 (5), 2093–2113.
22. Pal S, & Balakrishnan N (2017c). Likelihood inference for the destructive exponentially weighted Poisson cure rate model with Weibull lifetime and an application to melanoma data. Computational Statistics, 32 (2), 429–449.
23. Pal S, & Balakrishnan N (2018). Likelihood inference based on EM algorithm for the destructive length-biased Poisson cure rate model with Weibull lifetime. Communications in Statistics - Simulation and Computation, 47 (3), 644–660.
24. Pal S, Majakwara J, & Balakrishnan N (2018). An EM algorithm for the destructive COM-Poisson regression cure rate model. Metrika, 81 (2), 143–171.
25. Pal S, & Roy S (2021). On the estimation of destructive cure rate model: a new study with exponentially weighted Poisson competing risks. Statistica Neerlandica, 75 (3), 324–342.
26. Pal S, & Roy S (2022). A new non-linear conjugate gradient algorithm for destructive cure rate model and a simulation study: illustration with negative binomial competing risks. Communications in Statistics - Simulation and Computation, 51 (11), 6866–6880.
27. Pal S, & Roy S (2023). On the parameter estimation of Box-Cox transformation cure model. Statistics in Medicine, 42 (15), 2600–2618.
28. Peng Y, & Yu B (2021). Cure Models: Methods, Applications and Implementation. Chapman and Hall/CRC.
29. Rodrigues J, de Castro M, Balakrishnan N, & Cancho VG (2011). Destructive weighted Poisson cure rate models. Lifetime Data Analysis, 17 (3), 333–346.
30. Rodrigues J, de Castro M, Cancho VG, & Balakrishnan N (2009). COM-Poisson cure rate survival models and an application to a cutaneous melanoma data. Journal of Statistical Planning and Inference, 139 (10), 3605–3611.
31. Roy S (2021). A new nonlinear sparse reconstruction framework in ultrasound-modulated optical tomography. IEEE Transactions on Computational Imaging.
32. Roy S, & Borzì A (2017). Numerical Investigation of a Class of Liouville Control Problems. Journal of Scientific Computing, 73 (1), 178–202. doi: 10.1007/s10915-017-0410-2
33. Roy S, Jeon G, & Moon S (2023). Radon transform with gaussian beam: Theoretical and numerical reconstruction scheme. Applied Mathematics and Computation, 452, 128024.
34. Treszoks J, & Pal S (2023). On the estimation of interval censored destructive negative binomial cure model. Statistics in Medicine, 42 (28), 5113–5134.
35. Treszoks J, & Pal S (2024). A destructive shifted Poisson cure model for interval censored data and an efficient estimation algorithm. Communications in Statistics - Simulation and Computation, 53 (5), 2135–2149.
36. Treszoks J, & Pal S (2025). Likelihood inference for unified transformation cure model with interval censored data. Computational Statistics, 40, 125–151.
