Author manuscript; available in PMC: 2021 Jul 27.
Published in final edited form as: Biometrics. 2016 Apr 14;72(4):1348–1357. doi: 10.1111/biom.12514

A Sup-Score Test for the Cure Fraction in Mixture Models for Long-Term Survivors

Wei-Wen Hsu 1, David Todem 2,*, KyungMann Kim 3
PMCID: PMC8314275  NIHMSID: NIHMS994999  PMID: 27078815

Summary.

The evaluation of cure fractions in oncology research under the well known cure rate model has attracted considerable attention in the literature, but most of the existing testing procedures have relied on restrictive assumptions. A common assumption has been to restrict the cure fraction to a constant under alternatives to homogeneity, thereby neglecting any information from covariates. This article extends the literature by developing a score-based statistic that incorporates covariate information to detect cure fractions, with the existing testing procedure serving as a special case. A complication of this extension, however, is that the implied hypotheses are not typical and standard regularity conditions to conduct the test may not even hold. Using empirical processes arguments, we construct a sup-score test statistic for cure fractions and establish its limiting null distribution as a functional of mixtures of chi-square processes. In practice, we suggest a simple resampling procedure to approximate this limiting distribution. Our simulation results show that the proposed test can greatly improve efficiency over tests that neglect the heterogeneity of the cure fraction under the alternative. The practical utility of the methodology is illustrated using ovarian cancer survival data with long-term follow-up from the Surveillance, Epidemiology, and End Results (SEER) registry.

Keywords: Cure rate model, Goodness-of-fit, Likelihood ratio, Ovarian cancer, SEER registry, Sensitivity analysis, Score functions, Unidentified parameters

1. Introduction

In some oncology studies, it is becoming evident with the advent of modern therapeutic agents that a non-negligible fraction of patients may not relapse or die from the disease even after sufficient follow-up. Survival data from these studies typically exhibit heavy administrative censoring with long-term survivors, consistent with improper survival functions. Mixture survival models that incorporate a cure fraction have become a popular tool to analyze such data. Well known developments of this class of models include the seminal works of Boag (1949) and Berkson and Gage (1952), with subsequent work by Farewell (1982, 1986). More recent investigations include, among others, the works of Maller and Zhou (1996), Sy and Taylor (2000), Yau and Ng (2001), and references therein.

A natural question arising from real applications of these models in oncology research is whether the cure fraction representing the inherent heterogeneity in the population is consistent with observed data (Klebanov and Yakovlev, 2007; Li et al., 2007). Unlike other mixture models which typically view heterogeneity as a nuisance (e.g., Todem et al., 2012), the evaluation of cure fractions in two-component survival models is often of scientific interest. For this reason, nearly all investigations involving cure models provide some form of inference on the cure fraction, but the invoked testing procedures for the most part rely on restrictive assumptions. A common restriction is that constant cure fractions are often assumed under the alternative to homogeneity (see, e.g., Li et al., 2007; Klebanov and Yakovlev, 2007; Zhao et al., 2009). Albeit important, such an approach may fail to detect cure in a population for which the cure fraction varies with covariates. A good illustration of this limitation, which strongly motivated this research, is provided by survival data on ovarian cancer from the SEER (Surveillance, Epidemiology, and End Results) registry in the Los Angeles area. It has been reported in the literature that cure for ovarian cancer typically occurs after about 10 years following the initial treatment regimen (Tai et al., 2005). Using time to death from ovarian cancer in the SEER registry, analyses based on a constant cure fraction have not been satisfactory, with only marginal evidence of cure in this population. But the Kaplan–Meier estimate and the associated confidence band of the survival curve, coupled with the biology of this tumor, give a clear indication of the dependence of the cure fraction on age group, with younger pre-menopausal patients exhibiting a higher cure rate and an obvious plateau occurring nearly 5 years earlier than their older counterparts (Figure 1).
Any analysis that ignores such heterogeneity in the population may fail to detect cure for younger patients, especially if a substantial portion of the sample is made of older post-menopausal patients as is often the case in ovarian cancer studies.

Figure 1.

Kaplan–Meier estimates for ovarian cancer data from SEER registry of Los Angeles stratified by patient’s age group with 95% confidence bands (gray areas). Sample sizes from the younger to the older age group are: 172, 782, 882, and 626 patients.

The preliminary analysis raises the question of how to incorporate covariates, such as age, in detecting cure among the ovarian cancer patients in the SEER registry. If covariates can stratify the population into a small number of strata (e.g., the four age groups in Figure 1), the generalized score test proposed by Silvapulle and Silvapulle (1995) can be used to assess stratum-specific cure fractions. However, when the number of strata is large, or more generally when there are dense continuous covariates, evaluating the cure fraction is nontrivial. For example, in the SEER registry it is unclear how a cure fraction potentially varying with age can be evaluated without relying on an arbitrary discretization of age.

In this article, we consider the nontrivial setting where the cure fraction is related to continuous covariates through a regression technique, which is very typical in real applications of these models. Under the basic assumptions of uniformly bounded covariates and bounded regression slope parameters, which we will elaborate upon later, the null hypothesis then translates into testing infinitely large intercepts in the cure fraction regression model. Sy and Taylor (2000, page 234) have implicitly used this assumption in their cure rate evaluation by reporting that the standard Cox proportional hazards (PH) model is a special case of cure rate models, with an infinitely large intercept in the logistic model for the cure fraction. Their inference was, however, ad hoc, ignoring that whenever the intercept becomes infinitely large, the slope parameters vanish from the null, resulting in the loss of identifiability known as the Davies problem (Davies, 1977), and that the regularity conditions for the validity of the Wald test may not be satisfied. Using empirical processes arguments, we construct a test statistic that rigorously addresses all the inherent complications and establish its limiting null distribution as a functional of mixtures of chi-square processes. We focus on score tests because the competing methods that require fitting the cure rate model may be problematic, owing to the well known identifiability issues inherent to this class of models (Li et al., 2001).

The rest of this article is organized as follows. In Section 2, we give a brief description of cure survival models, develop a score-based test statistic to evaluate the cure fraction in the presence of continuous covariates, and establish its limiting null distribution, which is rigorously approximated by a resampling procedure. We conduct numerical studies to evaluate the finite sample performance of the proposed testing procedure in Section 3. We also illustrate its practical utility using ovarian cancer survival data with long-term follow-up from the SEER registry in Section 4. Concluding remarks are provided in Section 5.

2. The Method

2.1. The Cure Rate Model

Suppose that T is the survival time, with the date of cancer diagnosis (e.g., ovarian cancer) serving as the origin, and C the potential censoring variable. Empirically, only y = min(T, C) and the censoring indicator δ = 1{T = y} are observed. Cure rate models posit the existence of a partially observable binary variable that views the population as a mixture of subjects deemed cured and uncured. Under such heterogeneity, the survival function of the population is a weighted mixture of a degenerate survival function for cured patients and a proper survival function for uncured patients. More precisely, the mixture cure rate model has marginal survival function at time t,

SM(t) = (1 − π) + πS(t),  0 < π ≤ 1,  (1)

where π represents the uncure fraction or incidence probability, and S(t) the survival function for subjects deemed uncured. Unless π = 1, the marginal survival function SM(t) is improper in that lim_{t→∞} SM(t) = 1 − π, in contrast to the latent survival function S(t), which converges to 0 at infinity. In real applications of these models, covariates are usually considered both for π and for S(t). Unlike Berkson and Gage (1952), who assumed a constant π, other investigations involving this class of models generally consider a parametric regression model relating π to covariates through a logistic function. A flexible semiparametric formulation relating π to covariates has also been discussed (see, e.g., Wang et al., 2012). For the latent survival function S(t), Cox-type regression models have been proposed, with the baseline hazard function estimated nonparametrically (e.g., Kuk and Chen, 1992; Peng and Dear, 2000) or weakly parametrically (e.g., Yin and Ibrahim, 2005).
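The improper behavior of SM(t) in (1) can be sketched numerically. Below is a minimal illustration assuming a Weibull latent survival function S(t) = exp{−λt^α}; the function names and parameter values are illustrative, not part of the paper.

```python
import numpy as np

def weibull_survival(t, lam=0.01, alpha=0.75):
    """Proper Weibull latent survival function S(t) = exp(-lam * t**alpha)."""
    return np.exp(-lam * np.asarray(t, dtype=float) ** alpha)

def marginal_survival(t, pi, lam=0.01, alpha=0.75):
    """Improper marginal survival S_M(t) = (1 - pi) + pi * S(t), 0 < pi <= 1."""
    return (1.0 - pi) + pi * weibull_survival(t, lam, alpha)

# As t grows, S_M(t) plateaus at the cure fraction 1 - pi rather than decaying to 0.
print(marginal_survival(0.0, pi=0.7))   # 1.0 at the origin
print(marginal_survival(1e9, pi=0.7))   # ~0.3, the cure fraction 1 - pi
```

With π = 1 the plateau disappears and SM(t) reduces to the proper survival function S(t).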

2.2. Hypothesis Formulation and Challenges for Inference

We are interested in the evaluation of the cure fraction against alternatives that vary with covariates, which is congruent with real applications of these models in cancer research. We specifically focus on the logistic model that relates the cure proportion to covariates, but the methodology and results carry over to any increasing, differentiable, and invertible function. For this, assume that there exists some covariate information contained in a column vector z of dimension q, and a general cure fraction model of the form 1 − π(z) = {1 + exp{ϑ + z′θ}}^{−1}, where ϑ is the intercept and θ = (θ1, θ2, …, θq)′ ∈ Θ ⊂ ℝ^q the parameter vector associated with z. The uncure fraction is denoted π(z) to highlight its dependence on z. To evaluate the cure fraction, one is then interested in the one-sided hypothesis,

H0: π*(z) = 1 for all z  vs.  H1: π*(z) < 1 for some z,  (2)

where π*(z) = {1 + exp{−ϑ* − z′θ*}}^{−1}, with ϑ* and θ* being the true values of ϑ and θ. This testing problem is non-standard and requires some working assumptions. Unlike existing approaches that restrict θ* to zero and evaluate the hypothesis ϑ* = ∞ against ϑ* < ∞, our investigation uses the less restrictive assumption that confines θ to a compact set, i.e., sup_{θ∈Θ} ‖θ‖ ≤ ρ for some ρ < ∞. Here ‖·‖ represents the Euclidean norm of a vector. Under the additional working assumption of a uniformly bounded covariate z, i.e., Pr(‖z‖ ≤ ς) = 1, ς < ∞, the null hypothesis holds only when the true intercept ϑ* goes to infinity. A nontrivial complication is that whenever ϑ* becomes infinitely large, the true slope parameter vector θ* disappears from the cure rate model, with any value of θ* producing the same null distribution. To see this more clearly, consider the transformation ψ = exp{−ϑ}, so that π*(z) = {1 + ψ* exp{−z′θ*}}^{−1}, with ψ* representing the true ψ. Whenever ϑ* tends to infinity, ψ* tends to zero, and θ* vanishes from the model. The null and alternative hypotheses specified in (2), coupled with the compactness condition, then become,

H0: ψ* = 0 for all θ ∈ Θ  vs.  H1: ψ* > 0 for some θ ∈ Θ.

In addition to parameters vanishing under the null, the implied hypotheses are not typical, and standard regularity conditions to conduct the test may not even hold. The parameter ψ under the null lies on the boundary of the cone [0, ∞), and the support space Θ of the unidentified nuisance parameters may not even be known to the analyst a priori. From a methodological standpoint, several articles in the literature examine related issues and complications, but these works typically accommodate only one or two of them. For example, Chernoff (1954), Self and Liang (1987), and more recently Andrews (2001) addressed the problem of parameters lying on the boundary of the parameter space under the null hypothesis, which arises in the context of one-sided alternatives (see, e.g., Silvapulle and Silvapulle, 1995; Verbeke and Molenberghs, 2003). These two issues, however, are often discussed in settings where all the nuisance parameters can be identified under the null hypothesis. Several articles in the literature have also studied the problem of unidentifiable nuisance parameters under the null hypothesis (see, e.g., Davies, 1977; Hansen, 1996; Andrews, 2001; Ritz and Skovgaard, 2005; Zhu and Zhang, 2006; Song et al., 2009; Di and Liang, 2011). However, these tests are often conducted under the condition that the support space of the unidentified parameters is known. In our testing problem, the support Θ of the nuisance parameter may not be known a priori. This is a nontrivial complication because the empirical processes arguments and results often used for this type of problem may not hold when the support of the unidentified nuisance parameters is also unknown to the analyst.

2.3. A Sup-Score Test for the Cure Fraction

Assume that the Cox-type model h(t | x) = h0(t) exp{x′β}, for some observed covariate vector x, is entertained for the hazard function of the uncured population. Here, h0(t) represents the associated baseline hazard function and β the regression coefficient vector associated with x. Unlike typical proportional hazards models with proper distributions, which do not require estimation of the baseline hazard function h0(t), cure rate models, even when coupled with a Cox-type regression model for the hazard function of the uncured population, still require the estimation of h0(t). In this article, we focus on finite-dimensional parametric models for this baseline hazard. Simple examples are the two-parameter Weibull and log-logistic models, with hazard functions h0(t) = λαt^{α−1} and h0(t) = λαt^{α−1}/(1 + λt^α), respectively, indexed by parameters α > 0 and λ > 0.

Denote by {yi, wi, δi} the observed data for subject i, where wi is a vector consisting of the distinct elements of {xi, zi}. Assume that {yi, wi, δi}, i = 1, …, n, are independent and identically distributed copies of the random quantity {y, w, δ}. Denote by γ the collection of model parameters under the null homogeneous model and by ℓn(ψ, θ, γ) the log likelihood function associated with parameter (ψ, θ, γ) and observed data {yi, wi, δi}, i = 1, …, n. Under the conditional independence assumption of T and C given w, this log likelihood function is

ℓn(ψ, θ, γ) = Σ_{i=1}^n [ (1 − δi) log( 1 − π(zi) + π(zi){S0(yi)}^{exp{ηi}} ) + δi log( π(zi) exp{ηi} h0(yi){S0(yi)}^{exp{ηi}} ) ] + Σ_{i=1}^n log g(wi),

where ηi = xi′β, π(zi) = {1 + ψ exp{−zi′θ}}^{−1}, and g(·) represents the density function of w with respect to some dominating measure. Because the scientific interest focuses on the distribution of {y, δ} given w, the term g(wi) in the log likelihood is treated as nuisance and ignored. When a two-parameter Weibull model is assumed, S0(yi) = exp{−λyi^α} and h0(yi) = λαyi^{α−1}; for a two-parameter log-logistic model, S0(yi) = 1/(1 + λyi^α) and h0(yi) = λαyi^{α−1}/(1 + λyi^α), with α > 0, λ > 0, and γ = {α, λ, β}.
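The log likelihood above, with a Weibull baseline and the nuisance term log g(wi) dropped, can be coded directly. This is a minimal sketch under those assumptions; the function name and test values are illustrative.

```python
import numpy as np

def cure_loglik(y, delta, x, z, psi, theta, beta, lam, alpha):
    """Log likelihood of the mixture cure model with a two-parameter Weibull
    baseline, omitting the nuisance term log g(w_i). y: times, delta: failure
    indicators, x and z: covariate matrices; psi, lam, alpha: scalars."""
    eta = x @ beta                                   # eta_i = x_i' beta
    pi = 1.0 / (1.0 + psi * np.exp(-(z @ theta)))    # uncure fraction pi(z_i)
    S0 = np.exp(-lam * y ** alpha)                   # Weibull baseline survival
    h0 = lam * alpha * y ** (alpha - 1.0)            # Weibull baseline hazard
    Se = S0 ** np.exp(eta)                           # {S0(y_i)}^{exp(eta_i)}
    censored = np.log((1.0 - pi) + pi * Se)          # contribution when delta_i = 0
    failed = np.log(pi * np.exp(eta) * h0 * Se)      # contribution when delta_i = 1
    return float(np.sum((1.0 - delta) * censored + delta * failed))
```

At ψ = 0 every π(zi) equals 1 and the expression collapses to the standard Weibull proportional hazards log likelihood, which is the null model of the test.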

To construct the test, we impose a regularity condition that constrains the estimators of the null model parameters to converge at the √n rate. Such a condition is usually met if a finite-dimensional model coupled with a parametric likelihood is assumed. More specifically, we adopt the following:

Condition 1. Let Λ be a nonsingular matrix and νn(γ) the score function with respect to γ under the null hypothesis. Assume the existence of a consistent estimator γ̂ of γ*, the true value of γ, obtained under the null, such that √n(γ̂ − γ*) = n^{−1/2}Λ^{−1}νn(γ*) + op(1), where op(1) represents a term converging to zero in probability as n → ∞.

For fixed θ, let un(θ) denote the score function ∂ℓn(ψ, θ, γ)/∂ψ with ψ and γ fixed at ψ* = 0 and γ*, and ûn(θ) its estimated version with γ* replaced by γ̂. We also denote by bi(θ, γ*) and di(θ, γ*) the contributions of subject i to the function un(θ) and to the matrix n^{−1}∂²ℓn(ψ, θ, γ)/∂ψ∂γ evaluated at ψ* = 0 and γ*, respectively. Define ci(θ, γ*) = bi(θ, γ*) − h̃(θ, γ*)′Λ^{−1}ai(γ*), where h̃(θ, γ*) = E{d1(θ, γ*)} and ai(γ*) is the contribution of subject i to the score function νn(γ) evaluated at γ*. A basic Taylor expansion can be used to show that n^{−1/2}ûn(θ) = n^{−1/2}Σ_{i=1}^n ci(θ, γ*) + μn(θ), where μn(θ) →p 0 as n → ∞ for each θ, ensuring pointwise convergence of n^{−1/2}ûn(θ) to a mean-zero normal distribution with asymptotic variance E{c1(θ, γ*)²}. Because of the one-sided alternative ψ* > 0, this pointwise normal approximation can be used to construct a score-type test statistic in the spirit of Silvapulle and Silvapulle (1995) to evaluate H0 against H1. More precisely, for each fixed θ, the score test statistic is given by,

ŝn(θ) = n^{−1} ûn(θ)² ι̂(θ) 1{ûn(θ) > 0},  (3)

with ι̂(θ) being an estimate of the inverse of E{c1(θ, γ*)²}. Assuming parametric estimates Ŝ0(yi) and η̂i of S0(yi) and ηi under the null hypothesis are readily available, ûn(θ) in (3) takes the general form ûn(θ) = Σ_{i=1}^n [(1 − δi){Ŝ0(yi)}^{−exp{η̂i}} − 1] exp{−zi′θ}. Further details about ûn(θ) and the derivation of ι̂(θ) for a baseline Weibull model are given in the Web Appendix. Because θ is unknown, ŝn(θ) is not strictly a score statistic, owing to its dependence on θ. To remove this dependence, the following supremum score statistic is used,

Tn = sup_{θ∈Θ} ŝn(θ).  (4)

The null hypothesis as formulated in (2) is rejected for large values of Tn. The test based on constant cure fractions is a special case of the proposed test in (4) with Θ = {0}, resulting in the test statistic s^n(0).
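The score in (3) and the supremum in (4) can be sketched on a grid of θ values. This is only an illustration: the variance estimate below is a crude plug-in (mean of squared score contributions) that ignores the correction for estimating the null parameters, whereas the paper's ι̂(θ) accounts for it via the ci(θ, γ*); the function and argument names are illustrative.

```python
import numpy as np

def sup_score(theta_grid, y, delta, z, eta_hat, lam_hat, alpha_hat):
    """Grid approximation of T_n = sup_theta s_n(theta), using the score
    u_n(theta) = sum_i [(1 - delta_i){S0(y_i)}^{-exp(eta_i)} - 1] exp(-z_i' theta)
    under a fitted Weibull null with parameters (lam_hat, alpha_hat)."""
    n = len(y)
    S0 = np.exp(-lam_hat * y ** alpha_hat)                # fitted baseline survival
    base = (1.0 - delta) * S0 ** (-np.exp(eta_hat)) - 1.0 # theta-free factor
    best = 0.0
    for theta in theta_grid:
        c = base * np.exp(-(z @ theta))                   # per-subject contributions
        u = c.sum()                                       # u_n(theta)
        iota = np.mean(c ** 2)                            # crude variance plug-in
        s = u ** 2 / (n * iota) if u > 0 else 0.0         # one-sided score statistic
        best = max(best, s)
    return best
```

The constant cure fraction test corresponds to the one-point grid `theta_grid = [np.zeros(q)]`, i.e., Θ = {0}.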

2.4. Asymptotic Properties of the Sup-Score Statistic Tn

Define the function classes B = {bi(θ, γ): θ ∈ Θ, γ ∈ Γ, i = 1, …, n} and D = {di(θ, γ): θ ∈ Θ, γ ∈ Γ, i = 1, …, n}, specified under the null hypothesis ψ* = 0. Under the condition that the function classes B and D are pointwise measurable and satisfy the uniform entropy condition (for the definition, see van der Vaart and Wellner, 2000), and other regularity conditions in the Appendix, it can be shown that n^{−1}ûn(θ) is uniformly consistent, i.e., sup_{θ∈Θ} |n^{−1}ûn(θ)| →p 0, and that the random process n^{−1/2}ûn(θ) converges in distribution to a mean-zero Gaussian process in θ with covariance function σc(θ1, θ2) = E{c1(θ1, γ*)c1(θ2, γ*)}, θ1, θ2 ∈ Θ (see Theorem 1 in Appendix B). Assume that the random process ŝn(θ) converges to a process s(θ) in θ as n → ∞. The well known continuous mapping theorem then ensures that sup_{θ∈Θ} ŝn(θ) →d sup_{θ∈Θ} s(θ), as n → ∞.

Because of the asymptotic normality of n^{−1/2}ûn(θ) for each θ, s(θ) is distributed as a mixture of χ²0 and χ²1, each with weight lim_{n→∞} Pr(n^{−1/2}ûn(θ) > 0) = 0.5. Although the distribution of s(θ) has a closed form, the distribution of its functional counterpart sup_{θ∈Θ} s(θ) is very complicated and difficult to derive analytically. A simple nonparametric bootstrap could be used to approximate the null distribution of this statistic (Efron and Tibshirani, 1993), but it requires refitting the null model for each bootstrap sample, which may be a daunting task computationally. In this article, we propose a simple resampling technique which perturbs the influence function of n^{−1/2}ûn(θ) using normal variates to approximate the asymptotic null distribution of the test statistic Tn. This technique has been extensively used in the literature when the asymptotic distributions are complicated and analytically intractable (see, e.g., Parzen et al., 1994; Lin et al., 1994; Hansen, 1996; Zhu and Zhang, 2006). Briefly, define τn(θ) = Σ_{i=1}^n ci(θ, γ*)ξi, where {ξ1, …, ξn} are independent and identically distributed standard normal variates and ci(θ, γ*) is the influence function of n^{−1/2}ûn(θ). The process τ̂n(θ) in θ is defined similarly, with γ* replaced by its null estimate γ̂ satisfying Condition 1. Clearly, given the observed data {yi, wi, δi}, i = 1, …, n, the random variation in τn(θ) and τ̂n(θ) results solely from the randomness of ξi, i = 1, …, n. Results in Appendix C give a theoretical justification for the resampling procedure, in that the unconditional distribution of n^{−1/2}ûn(θ) is asymptotically equivalent to the conditional distribution of n^{−1/2}τ̂n(θ) given the observed data {yi, wi, δi}, i = 1, …, n.
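The perturbation scheme can be sketched as follows, assuming the influence-function contributions ci(θ) have been precomputed on a grid of θ values (the array name `contrib` and its (grid points × subjects) layout are illustrative choices, not the paper's notation).

```python
import numpy as np

def perturbation_pvalue(contrib, t_obs, n_resamples=1000, rng=None):
    """Approximate the null distribution of T_n by perturbing the influence
    contributions with i.i.d. standard normal multipliers. `contrib` is an
    (m, n) array: row j holds c_i(theta_j) for grid point theta_j."""
    rng = np.random.default_rng(0) if rng is None else rng
    m, n = contrib.shape
    iota = np.mean(contrib ** 2, axis=1)              # variance estimate per grid point
    sup_stats = np.empty(n_resamples)
    for b in range(n_resamples):
        xi = rng.standard_normal(n)                   # multipliers xi_1, ..., xi_n
        tau = contrib @ xi                            # tau_n(theta_j) = sum_i c_i(theta_j) xi_i
        s = np.where(tau > 0, tau ** 2 / (n * iota), 0.0)
        sup_stats[b] = s.max()                        # sup over the grid, one resample
    return np.mean(sup_stats >= t_obs)                # resampling p-value
```

Each resample reuses the fitted null model, which is what makes this scheme much cheaper than refitting within a nonparametric bootstrap.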

2.5. Specification of the Support Set Θ

The calculation of the test statistic Tn, both for the observed sample and for the artificial samples from the resampling scheme, requires specification of Θ, the unknown support set of θ. This specification can be a daunting task in practice for moderate to high-dimensional covariates. Hence, to control the dimensionality of the problem in those instances, we operate with hyperspherical coordinates, which only require the analyst to specify the upper bound of the radial coordinate. To see this, consider the smallest hypersphere Φ containing Θ for which the radial coordinate is at most κ, i.e., κ = inf{ρ < ∞ : sup_{θ∈Θ} ‖θ‖ ≤ ρ}. The compactness of Θ ensures the existence of such κ. Because Θ ⊂ Φ, any vector θ = (θ1, …, θq)′ ∈ Θ has a hyperspherical coordinate ϕ = (r, φ1, …, φq−1), q > 1, such that

θ1 = r cos(φ1),
θk = r cos(φk) ∏_{u=1}^{k−1} sin(φu),  2 ≤ k ≤ q − 1,
θq = r ∏_{u=1}^{q−1} sin(φu),

where r = ‖θ‖ is the radial coordinate and φk, k = 1, …, q − 1, are angular coordinates. Under this specification, we have the constraints 0 ≤ r ≤ κ, φk ∈ [0, π), k = 1, …, q − 2, and φq−1 ∈ [0, 2π), yielding Φ = [0, κ] × [0, π)^{⊗(q−2)} × [0, 2π), with [0, π)^{⊗(q−2)} denoting the (q − 2)-fold product space. Because of its dependence on κ, Φ will be denoted Φ(κ), with Φ(0) representing the trivial set containing only the origin. The range of φk, k = 1, …, q − 2, is restricted to π to define a unique set of coordinates for each point in the hypersphere. For example, when q = 3, the vector (θ1, θ2, θ3)′ is replaced by the spherical coordinate (r, φ1, φ2), taking values in the set Φ(κ) = [0, κ] × [0, π) × [0, 2π), κ > 0, where θ1 = r cos(φ1), θ2 = r sin(φ1) cos(φ2), and θ3 = r sin(φ1) sin(φ2). The hyperspherical transformation is important from a practical viewpoint unless q = 1, in which case the reparameterization is unnecessary and ϕ = θ. Given that each angle φk, k = 1, …, q − 1, q > 1, lies in a bounded interval, specifying the set Φ(κ) only requires the specification of κ, the upper bound for r, albeit unknown. To address the uncertainty about κ, we recommend that the sensitivity of the test statistic Tn = sup_{ϕ∈Φ(κ)} ŝn(ϕ) with respect to κ be examined.
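The coordinate map above is easy to code and to check, since the radial coordinate equals the Euclidean norm of θ. A small sketch (the function name is illustrative):

```python
import numpy as np

def from_hyperspherical(phi):
    """Map hyperspherical coordinates (r, phi_1, ..., phi_{q-1}) to the
    Cartesian vector theta = (theta_1, ..., theta_q). By construction,
    r equals the Euclidean norm of the returned vector."""
    r, angles = float(phi[0]), np.asarray(phi[1:], dtype=float)
    q = len(angles) + 1
    theta = np.empty(q)
    sin_prod = 1.0                        # running product sin(phi_1)...sin(phi_{k-1})
    for k in range(q - 1):
        theta[k] = r * sin_prod * np.cos(angles[k])
        sin_prod *= np.sin(angles[k])
    theta[q - 1] = r * sin_prod           # theta_q = r * prod_u sin(phi_u)
    return theta
```

This is why only the upper bound κ on r must be specified: the angular coordinates already live in the bounded boxes [0, π) and [0, 2π).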

3. Simulations

To evaluate the finite sample performance of the proposed score test Tn, we conduct a numerical study and use hyperspherical coordinates to specify the unknown support space. This performance is first compared to that of the constant cure fraction score test, which restricts the unknown support set to the origin (Θ = {0}, corresponding to Φ(0) under hyperspherical coordinates), and further to that of the statistic evaluated on the set Θ = {θ*}, where θ*, the true θ, is known. Throughout all simulations, we generate with probability π*(x) survival data from a Weibull model with true hazard function h(t|x) = h0(t) exp{0.1x1 + 0.2x2}, where h0(t) = 0.01 × 0.75 × t^{−0.25}. Here x = (x1, x2)′, with covariates x1 and x2 generated independently from a uniform distribution between 0 and 1 and from a standard normal truncated on the interval [−1, 1], respectively. The failure times are potentially censored according to an exponential model with rate λc taking values 0.0002, 0.001, and 0.002, representing light, moderate, and heavy censoring. These values of λc are selected to investigate the effect of the censoring rates on the empirical test size and power. The tests are performed assuming the working uncure fraction model π(x) = {1 + ψ exp{−θ1x1 − θ2x2}}^{−1} under the alternative, coupled with polar coordinates (r, φ1) defined such that (θ1, θ2) = (r cos(φ1), r sin(φ1)), with support set Φ(κ) = [0, κ] × [0, 2π), where κ ∈ {2, 4, 6}. To approximate the null distribution of the test statistic Tn, the number K of resamples is set to 1,000. All simulations are replicated 1000 times for sample sizes 400, 600, and 800.

To investigate the empirical sizes of the proposed tests, we set π*(x) = 1. Results reported in Table 1 show that all tests, including the classical test based on constant cure fractions, maintain the size for moderate to heavy censoring. But they all tend to be slightly conservative for light censoring, in that they reject the null less often than anticipated. This conservativeness has been previously reported in the literature (see, e.g., Peng et al., 2001). A heuristic explanation for this phenomenon is that when censoring is light, a relatively high proportion of subjects have observed failure times, making it harder to detect cure. In the extreme, if the censoring rate is negligible, detecting cure is practically impossible, in which case adjudication from subject matter experts on the time threshold beyond which cancer is deemed cured may be required.

Table 1.

Empirical size and power of the score test statistic under the true model π*(x) = {1 + ψ* exp{3x1 + 2.5x2}}^{−1}, with exponential censoring rate λc, at the 5% significance level

Support              λc = 0.002 (heavy)       λc = 0.001 (moderate)    λc = 0.0002 (light)
                     n = 400  600    800      n = 400  600    800      n = 400  600    800
ψ* = 0
  Φ(1)               0.049   0.041  0.044     0.040   0.029  0.043     0.026   0.028  0.032
  Φ(0)               0.056   0.058  0.045     0.046   0.034  0.049     0.032   0.035  0.034
ψ* = 0.05
  Φ(6)               0.722   0.869  0.940     0.857   0.967  0.996     0.992   1.000  1.000
  Φ(4)               0.695   0.882  0.957     0.874   0.975  0.994     0.998   1.000  1.000
  Φ(2)               0.679   0.901  0.967     0.887   0.982  0.997     0.998   0.999  1.000
  Θ = {(−3, −2.5)′}  0.833   0.939  0.976     0.920   0.984  0.997     0.994   1.000  1.000
  Φ(0)               0.117   0.130  0.147     0.110   0.160  0.202     0.108   0.140  0.166

We further investigate the empirical power of the sup-score test when the true uncure fraction depends on covariates through π*(x) = {1 + ψ* exp{3x1 + 2.5x2}}^{−1} with ψ* = 0.05. The powers at the 5% nominal level are reported in Table 1. Overall, a larger sample size improves the power of detecting the alternatives. As expected, all tests lose power with increasing censoring rates, especially when censoring is substantial. More importantly, the proposed test statistic Tn is consistently more powerful than the constant cure fraction test when the pre-selected set Θ contains the true parameter. The optimum power is achieved when the support set Θ is restricted to the singleton {(−3, −2.5)′}, representing the true θ.

We also conduct a simulation study to evaluate the power of the test statistics when the true cure fraction: (i) does not depend on any covariates; or (ii) depends on covariates through a log–log, a complementary log–log, or a probit link function (see Table 2). The proposed test continues to assume that the working uncure fraction is related to covariates via the logistic transformation under the alternative. Results in Table 2 show that when the true cure fraction does not depend on covariates, the constant non-cure fraction test slightly outperforms the proposed test. However, when the true cure fraction depends on covariates via other monotonic transformations, the proposed test coupled with the logistic model continues to outperform the test that ignores covariates.

Table 2.

Empirical power of the score test statistics at the 5% significance level under covariate-dependent non-cure fractions π*(x), with exponential censoring rate λc

Support     λc = 0.002 (heavy)       λc = 0.001 (moderate)    λc = 0.0002 (light)
            n = 400  600    800      n = 400  600    800      n = 400  600    800
π*(xi) = 0.85
  Φ(6)      0.045   0.081  0.103     0.114   0.189  0.281     0.814   0.967  0.993
  Φ(4)      0.070   0.101  0.145     0.132   0.234  0.320     0.863   0.975  0.995
  Φ(2)      0.086   0.116  0.160     0.178   0.292  0.371     0.916   0.989  0.999
  Φ(0)      0.163   0.175  0.238     0.348   0.447  0.545     0.958   0.993  1.000
π*(xi) = 1 − exp{−exp{2 − 2x1i − 1.5x2i}}
  Φ(6)      0.517   0.742  0.877     0.760   0.926  0.986     0.999   1.000  1.000
  Φ(4)      0.537   0.752  0.874     0.811   0.950  0.984     0.998   1.000  1.000
  Φ(2)      0.566   0.732  0.887     0.817   0.933  0.985     0.992   1.000  1.000
  Φ(0)      0.086   0.113  0.110     0.089   0.105  0.148     0.099   0.097  0.125
π*(xi) = exp{−exp{−2 + 2x1i + 1.5x2i}}
  Φ(6)      0.764   0.906  0.966     0.879   0.967  0.993     0.992   1.000  1.000
  Φ(4)      0.763   0.917  0.965     0.895   0.984  0.998     0.997   1.000  1.000
  Φ(2)      0.745   0.920  0.979     0.898   0.983  0.997     0.998   1.000  1.000
  Φ(0)      0.091   0.118  0.120     0.147   0.189  0.213     0.423   0.551  0.664
π*(xi) = F(2 − 2x1i − 1.5x2i)
  Φ(6)      0.683   0.866  0.953     0.856   0.977  0.994     0.998   1.000  1.000
  Φ(4)      0.717   0.892  0.958     0.877   0.972  0.995     0.998   1.000  1.000
  Φ(2)      0.701   0.871  0.957     0.870   0.981  0.994     0.994   1.000  1.000
  Φ(0)      0.091   0.139  0.153     0.103   0.139  0.179     0.077   0.104  0.144

Function F denotes the CDF of a standard normal distribution.

Finally, we conduct an extensive simulation study to evaluate the impact of selecting κ, the upper bound of the radial coordinate, on the power of the proposed test. For this, we generate data from the model in (1) with π*(x) = {1 + 0.05 exp{3x1 + 2.5x2}}^{−1}, but allow κ to vary from 0 to 10 in increments of 0.2. These simulations are replicated 1000 times for sample size 400. Results at the 5% nominal level are plotted in Figure 2. When the selected support set does not contain θ*, the true value of θ, the power of the test increases with increasing values of the upper bound κ, with maximum power attained when θ* lies on the circle or, more generally, on the boundary of the hypersphere. However, when Φ(κ) contains θ* and is made unnecessarily large, the proposed test loses some power. These findings have practical implications from a sensitivity analysis viewpoint. In practice, we recommend that increasing values of κ be entertained until the supremum test statistic levels off, at which point the smallest such κ should be used for the analysis. Using unnecessarily large support sets may lead to a loss of statistical power.

Figure 2.

The empirical power (at 5% significance level, on the first y-axis) and the observed test statistics (on the second y-axis) with various values of the upper bound κ when the true value of r is 3.91.

4. Application: Ovarian Cancer Data from SEER Database

We applied the proposed sup-score test to detect a cure fraction in ovarian cancer data from the SEER registry. Our analysis focuses on 2468 ovarian cancer patients in the Los Angeles area who were diagnosed between 1992 and 2009 with the first malignant primary tumor in the ovary. Because sufficient follow-up is a fundamental assumption for cure detection, we first conduct a preliminary analysis to verify whether these data meet this condition. The observed follow-up time ranges from 1 to 175 months (median 28 months), while the censored time ranges from 1 to 215 months (median 59 months), both containing the 10-year mark for ovarian cancer. More formally, we perform the nonparametric test developed by Maller and Zhou (1994) for sufficient follow-up. Adopting the notation of these authors, let T*n and Tn denote the largest uncensored survival time and the largest observed survival time from a sample of size n, respectively. Define α̂n = (1 − Nn/n)^n, where Nn is the number of uncensored survival times in the interval (2T*n − Tn, Tn]. Sufficient follow-up is achieved when α̂n is small, say less than 0.05. Using mortality data from the SEER registry in the Los Angeles area, the observed value of α̂n is smaller than 0.0001, providing strong evidence of sufficient follow-up.
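The Maller and Zhou (1994) diagnostic described above reduces to a few lines of code. This is a sketch of the formula as stated in the text (the function name is illustrative; it assumes at least one uncensored observation):

```python
import numpy as np

def maller_zhou_alpha(y, delta):
    """Maller-Zhou statistic for sufficient follow-up: alpha_n = (1 - N_n/n)^n,
    where N_n counts uncensored times in (2*T_star - T_max, T_max], with T_star
    the largest uncensored time and T_max the largest observed time."""
    y = np.asarray(y, dtype=float)
    delta = np.asarray(delta, dtype=int)
    n = len(y)
    t_star = y[delta == 1].max()      # largest uncensored survival time
    t_max = y.max()                   # largest observed time (censored or not)
    lower = 2.0 * t_star - t_max      # interval of width 2*(t_max - t_star)
    N_n = int(np.sum((delta == 1) & (y > lower) & (y <= t_max)))
    return (1.0 - N_n / n) ** n
```

Intuitively, a long censored plateau after the last observed death widens the interval, inflates Nn, and drives α̂n toward zero, signaling sufficient follow-up.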

Covariates considered in this analysis include the patient’s age at diagnosis (Age) in years, the number of primary tumors (Tumors), and the number of regional lymph nodes found by the pathologist to contain metastases (PosNodes). We assume that the hazard function for the uncured group follows a Cox-type model of the form h(t|x) = h0(t) exp{x′β}, where x = (Age, Tumors, PosNodes)′ and β = (β1, β2, β3)′, with a parametric baseline hazard function h0(t). Moreover, we consider the following working model for the uncured fraction: π(z) = {1 + ψ exp{−z′θ}}^{−1}, where z = (Age, Tumors, PosNodes)′, with the associated regression slope θ = (θ1, θ2, θ3)′ vanishing from the model whenever ψ = 0. We adopt the spherical coordinates (r, φ1, φ2) of θ, with θ1 = r cos(φ1), θ2 = r sin(φ1) cos(φ2), and θ3 = r sin(φ1) sin(φ2), and approximate their domain Φ(κ) = [0, κ] × [0, π) × [0, 2π), κ > 0, by a fine grid. The upper bound κ of the radial coordinate r is assumed to take values in the set {0.5, 1, 3, 5, 7} to evaluate the sensitivity of inferences to κ. To perform the proposed test, two working baseline survival models, namely the Weibull and the log-logistic models, are considered. For comparison purposes, the classical test that assumes a constant cure fraction under the alternative, obtained by setting θ = (0, 0, 0)′ or κ = 0, in which case π(z) = {1 + ψ}^{−1} with ψ ≥ 0, is also performed. The null distribution of all test statistics is approximated by 10,000 resamples.

Results from Table 3 show that the observed value of the sup-score test statistic levels off for values of κ above 3, both for the Weibull and the log-logistic working baseline models. Once the observed test statistic has stabilized, the p-value remains relatively stable between κ = 3 and κ = 5, but increases markedly at κ = 7. This behavior is consistent with our simulation results in Section 3, in that increasing the upper bound κ after the supremum test statistic has stabilized may inflate the p-value at the expense of power. The sensitivity analysis provides strong evidence that the proposed tests, coupled with either of the two working baseline survival models, reject the homogeneity hypothesis at the 5% significance level, in contrast to the classical test that neglects covariate information under the alternative. This analysis of the SEER registry provides a good example of classical score tests based on constant alternatives losing power to detect heterogeneity in the population.

Table 3.

The supremum score test values (p-values) for ovarian cancer data based on 10,000 resamples, assuming a Weibull model and a log-logistic model as the working baseline survival model

Baseline survival   Constant test Φ(0)   Proposed tests with various upper bounds κ
                                         Φ(0.5)              Φ(1)                 Φ(3)                 Φ(5)                 Φ(7)
Weibull             1.5323 (0.1215)      60.6564 (0.0110)    92.3026 (0.0027)     107.1630 (0.0010)    107.1630 (0.0019)    107.1630 (0.0288)
Log-logistic        1.6545 (0.1009)      94.3724 (<0.0001)   115.4942 (<0.0001)   123.5085 (<0.0001)   123.5085 (<0.0001)   123.5085 (0.0069)

Φ(0) = {0} × [0, π) × [0, 2π) and Φ(κ) = [0, κ] × [0, π) × [0, 2π) with κ ∈ {0.5, 1, 3, 5, 7}

5. Discussion

In this article, we have developed a method to detect a cure fraction across the covariate space that does not require the cure rate model to be estimated. The derivation of this method comes at a small price: the covariates and associated regression slope parameters of the cure rate model are confined to compact sets, which is merely a technical requirement in practice. Under this working condition, the homogeneity hypothesis translates into evaluating infinitely large intercepts in the cure fraction model, leading to nontrivial complications. Using empirical process arguments, we constructed the test statistic and established its limiting null distribution as a functional of mixtures of chi-square processes. To perform the test in practice, we proposed a simple resampling procedure to rigorously approximate this limiting distribution. A key contribution of this article is that the proposed sup-score test can detect heterogeneity in the population in settings where the classical approach, which relies on constant cure fractions under the alternative, may fail to detect cure. The rationale for the covariate adjustment, however, builds on the behavior of the tests under model misspecification. We noticed a slight deterioration in power of the proposed tests when the true cure fraction does not depend on covariates. However, given that the true model is usually unknown to the analyst in practice, assuming a priori a model that relates the cure fraction to covariates may be the most conservative strategy for detecting cure in the population.

The proposed methodology relies on some key assumptions. For example, it assumes that the cure rate depends only on continuous variables. This is a limitation because more general formulations relating the cure rate not only to dense (continuous) variables but also to sparse (categorical) variables are often entertained in practice. For the ovarian cancer data, the regression model for the cure fraction could be extended to include race, targeting the investigation of racial disparities, one of the endpoints of the SEER study. Even when only continuous variables are considered, the proposed test assumes that the regression model relating the continuous covariates to the cure fraction is well specified. This can be a daunting task in practice given that the true model is usually unknown to the analyst. Moreover, the test assumes that the null model is indexed by a finite-dimensional parameter γ coupled with a parametric likelihood, and that there is a consistent estimator of γ converging at the $\sqrt{n}$ rate. Relaxing this condition to estimators that converge at rates slower than $\sqrt{n}$, such as in models involving nonparametric estimates of the baseline hazard function, is technically challenging and may require inferential techniques used in the context of identifiability loss in semiparametric models (see, e.g., Song, Kosorok, and Fine, 2009). This extension and other generalizations of the test (e.g., to general estimating functions satisfying a smoothness condition, such as estimating functions derived from a quasi-likelihood) merit future research.


Acknowledgements

We thank the editor and the associate editor for their helpful comments and constructive suggestions. This work was supported by the second author’s NCI/NIH K-award, 1K01 CA131259 and its supplement from the 2009 ARRA funding mechanism.

Appendix

Appendix A: Additional Regularity Conditions

Condition 2. Assume that Γ, the support of γ, is a compact set and that γ*, the true value of γ, is an interior point of Γ.

Condition 3. The function classes, B and D, are pointwise measurable and satisfy the uniform entropy condition; see van der Vaart and Wellner (2000) for the definitions. For example, functions which are uniformly bounded and uniformly Lipschitz of order > {dim(γ) + dim(θ)}/2 satisfy the above conditions, where dim(·) denotes the dimension of a vector.

Condition 4. The processes $n^{-1}\sum_{i=1}^n c_i(\theta_1,\gamma^*)c_i(\theta_2,\gamma^*)$, indexed by $\theta_1,\theta_2$, converge almost surely to $\sigma_c(\theta_1,\theta_2)$, uniformly over $\theta_1,\theta_2\in\Theta$.

Condition 5. Under the null ψ = 0, the function $\tilde{h}(\theta,\gamma)$ is uniformly bounded, i.e., $\sup_{\theta\in\Theta,\,\gamma\in\Gamma}\|\tilde{h}(\theta,\gamma)\|_2 < \infty$, where $\|\cdot\|_2$ denotes a matrix norm.

Appendix B: Theorem 1 and Derivation of Resampling Method

Theorem 1. Under H0 and Conditions 1–5 in Appendix A, as n → ∞,

$\sup_{\theta\in\Theta}\|n^{-1}\hat{u}_n(\theta)\| \stackrel{p}{\longrightarrow} 0,$

and the random processes $n^{-1/2}\hat{u}_n(\theta)$, indexed by $\theta$, converge in distribution to a mean-zero Gaussian process in $\theta$ with covariance function $\sigma_c(\theta_1,\theta_2)$, $\theta_1,\theta_2\in\Theta$.

Proof of Theorem 1:

  1. Uniform consistency of $n^{-1}u_n(\theta)$

    Condition 3 implies that B is Donsker and Glivenko–Cantelli (van der Vaart and Wellner, 1996, pp. 122–133). Therefore, $\sup_{\theta\in\Theta}\|n^{-1}\sum_{i=1}^n [b_i(\theta,\gamma^*) - E\{b_i(\theta,\gamma^*)\}]\| \stackrel{p}{\longrightarrow} 0$. Since $n^{-1}u_n(\theta) = n^{-1}\sum_{i=1}^n b_i(\theta,\gamma^*)$ and $E\{b_i(\theta,\gamma^*)\} = 0$ for all $\theta\in\Theta$, the uniform consistency result follows.

  2. Uniform consistency of $n^{-1}\hat{u}_n(\theta)$

    A Taylor expansion of $\hat{u}_n(\theta)$ around the true value $\gamma^*$ gives
    $\hat{u}_n(\theta) = u_n(\theta) - \tilde{h}_n(\theta,\gamma^{**})(\hat{\gamma} - \gamma^*),$
    where $\tilde{h}_n(\theta,\gamma) = -\partial u_n(\theta)/\partial\gamma' = \sum_{i=1}^n d_i(\theta,\gamma)$ and $\gamma^{**}$ belongs to the line segment between $\gamma^*$ and $\hat{\gamma}$; that is, $\gamma^{**} = a^*\gamma^* + (1-a^*)\hat{\gamma}$ for some $a^* \in [0,1]$. This expansion is possible because of Condition 2. By the triangle inequality, $\|n^{-1}\hat{u}_n(\theta)\| \le \|n^{-1}u_n(\theta)\| + \|n^{-1}\tilde{h}_n(\theta,\gamma^{**})(\hat{\gamma} - \gamma^*)\|$.

    From step 1, we already have $\sup_{\theta\in\Theta}\|n^{-1}u_n(\theta)\| \stackrel{p}{\longrightarrow} 0$, so we only focus on $\|n^{-1}\tilde{h}_n(\theta,\gamma^{**})(\hat{\gamma} - \gamma^*)\|$. Note that $\|n^{-1}\tilde{h}_n(\theta,\gamma^{**})(\hat{\gamma} - \gamma^*)\| \le \|\{n^{-1}\tilde{h}_n(\theta,\gamma^{**}) - \tilde{h}(\theta,\gamma^{**})\}(\hat{\gamma} - \gamma^*)\| + \|\tilde{h}(\theta,\gamma^{**})(\hat{\gamma} - \gamma^*)\|$, where $\tilde{h}(\theta,\gamma) = E\{d_1(\theta,\gamma)\}$.

    We define the norm of a matrix $M$ as $\|M\|_2 = \sup_{z:\,z\ne 0}\|Mz\|/\|z\|$, so that $\|Mz\| \le \|M\|_2\|z\|$ for any $z \ne 0$. Applying this inequality to $\{n^{-1}\tilde{h}_n(\theta,\gamma^{**}) - \tilde{h}(\theta,\gamma^{**})\}(\hat{\gamma} - \gamma^*)$ and $\tilde{h}(\theta,\gamma^{**})(\hat{\gamma} - \gamma^*)$, we then have
    $\|n^{-1}\tilde{h}_n(\theta,\gamma^{**})(\hat{\gamma} - \gamma^*)\| \le \{\|n^{-1}\tilde{h}_n(\theta,\gamma^{**}) - \tilde{h}(\theta,\gamma^{**})\|_2 + \|\tilde{h}(\theta,\gamma^{**})\|_2\}\,\|\hat{\gamma} - \gamma^*\|.$ (B1)

    From Condition 3, D is Donsker and Glivenko–Cantelli (van der Vaart and Wellner, 1996, pp. 122–133). Therefore, $\sup_{\theta\in\Theta}\|n^{-1}\tilde{h}_n(\theta,\gamma^{**}) - \tilde{h}(\theta,\gamma^{**})\|_2 \stackrel{p}{\longrightarrow} 0$. From Condition 5, it is also clear that $\sup_{\theta\in\Theta}\|\tilde{h}(\theta,\gamma^{**})\|_2 < \infty$. Because $\hat{\gamma} - \gamma^* \stackrel{p}{\longrightarrow} 0$ from Condition 1 and the inequality in (B1), we have $\sup_{\theta\in\Theta}\|n^{-1}\tilde{h}_n(\theta,\gamma^{**})(\hat{\gamma} - \gamma^*)\| \stackrel{p}{\longrightarrow} 0$. Hence, the uniform consistency of $n^{-1}\hat{u}_n(\theta)$ is obtained.

  3. Asymptotic distribution of $n^{-1/2}\hat{u}_n(\theta)$

Applying a Taylor expansion to $\hat{u}_n(\theta)$ around the true value $\gamma^*$, we have $\hat{u}_n(\theta) \approx u_n(\theta) - \tilde{h}_n(\theta,\gamma^*)(\hat{\gamma} - \gamma^*)$, where $\tilde{h}_n(\theta,\gamma) = \sum_{i=1}^n d_i(\theta,\gamma)$. From Condition 1, we have $n^{1/2}(\hat{\gamma} - \gamma^*) = n^{-1/2}\Lambda^{-1}\upsilon_n(\gamma^*) + o_p(1)$. These two results then lead to $n^{-1/2}\hat{u}_n(\theta) \approx n^{-1/2}u_n(\theta) - n^{-1}\tilde{h}_n(\theta,\gamma^*)\,n^{-1/2}\Lambda^{-1}\upsilon_n(\gamma^*)$. From Condition 3, D is Donsker and Glivenko–Cantelli (van der Vaart and Wellner, 1996, pp. 122–133), so $\sup_{\theta\in\Theta}\|n^{-1}\tilde{h}_n(\theta,\gamma^*) - \tilde{h}(\theta,\gamma^*)\|_2 \stackrel{p}{\longrightarrow} 0$, where $\tilde{h}(\theta,\gamma) = E\{d_1(\theta,\gamma)\}$. Hence, $n^{-1/2}\hat{u}_n(\theta) \approx n^{-1/2}u_n(\theta) - \tilde{h}(\theta,\gamma^*)\,n^{-1/2}\Lambda^{-1}\upsilon_n(\gamma^*)$. Since $u_n(\theta) = \sum_{i=1}^n b_i(\theta,\gamma^*)$ and $\upsilon_n(\gamma^*) = \sum_{i=1}^n a_i(\gamma^*)$, the approximation of $n^{-1/2}\hat{u}_n(\theta)$ can be rewritten as $n^{-1/2}\hat{u}_n(\theta) \approx n^{-1/2}\sum_{i=1}^n \{b_i(\theta,\gamma^*) - \tilde{h}(\theta,\gamma^*)\Lambda^{-1}a_i(\gamma^*)\}$.

Under Condition 3 the function classes B and D are Donsker, and under Condition 5 $\tilde{h}(\theta,\gamma^*)$ is uniformly bounded for $\theta\in\Theta$, so the function class $\{b_i(\theta,\gamma^*) - \tilde{h}(\theta,\gamma^*)\Lambda^{-1}a_i(\gamma^*) : \theta\in\Theta,\ i = 1, 2, \ldots, n\}$ is also Donsker. By applying the Donsker theorem, the random process $n^{-1/2}\hat{u}_n(\theta)$ converges in distribution to a centered Gaussian process in $\theta$ as n → ∞, with covariance kernel function $\sigma_c(\theta_1,\theta_2) = E\{c_1(\theta_1,\gamma^*)c_1(\theta_2,\gamma^*)\}$, $\theta_1,\theta_2\in\Theta$, where $c_i(\theta,\gamma) = b_i(\theta,\gamma) - \tilde{h}(\theta,\gamma)\Lambda^{-1}a_i(\gamma)$, $\theta\in\Theta$.

Theoretical justification of the resampling method.

Under H0 and Conditions 1–5 in Appendix A, the unconditional distribution of $n^{-1/2}\hat{u}_n(\theta)$ is asymptotically equivalent to the conditional distribution of $n^{-1/2}\hat{\tau}_n(\theta)$ given the observed data $\{y_i, w_i, \delta_i\}_{i=1}^n$.

To derive the resampling method in this article, we first need to show that the conditional distribution of $n^{-1/2}\tau_n(\theta)$ given the observed data $\{y_i, w_i, \delta_i\}_{i=1}^n$ converges asymptotically to a centered Gaussian process as n → ∞ with covariance kernel $\sigma_c(\theta_1,\theta_2)$, $\theta_1,\theta_2\in\Theta$. Note that, given the observed data, $n^{-1/2}\tau_n(\theta)$ is a Gaussian process with conditional covariance function

$\mathrm{Cov}\big(n^{-1/2}\tau_n(\theta_1),\, n^{-1/2}\tau_n(\theta_2)\big) = n^{-1}\mathrm{Cov}\Big(\sum_{i=1}^n c_i(\theta_1,\gamma^*)\xi_i,\ \sum_{i=1}^n c_i(\theta_2,\gamma^*)\xi_i\Big).$

From the independence of the multipliers $\xi_i$, we have

$\mathrm{Cov}\big(n^{-1/2}\tau_n(\theta_1),\, n^{-1/2}\tau_n(\theta_2)\big) = n^{-1}\sum_{i=1}^n \mathrm{Cov}\big(c_i(\theta_1,\gamma^*)\xi_i,\ c_i(\theta_2,\gamma^*)\xi_i\big), \quad \theta_1,\theta_2\in\Theta.$

Note that $n^{-1}\sum_{i=1}^n \mathrm{Cov}\big(c_i(\theta_1,\gamma^*)\xi_i,\ c_i(\theta_2,\gamma^*)\xi_i\big) = n^{-1}\sum_{i=1}^n c_i(\theta_1,\gamma^*)E\{\xi_i^2\}c_i(\theta_2,\gamma^*)$, with $E\{\xi_i^2\} = 1$. Using Condition 4, the random processes $n^{-1}\sum_{i=1}^n c_i(\theta_1,\gamma^*)c_i(\theta_2,\gamma^*)$, indexed by $\theta_1,\theta_2$, converge to $\sigma_c(\theta_1,\theta_2)$ uniformly over $\theta_1,\theta_2\in\Theta$.

Second, given the observed data $\{y_i, w_i, \delta_i\}_{i=1}^n$, we show that under H0 and Conditions 1–4 the conditional distribution of the processes $n^{-1/2}\hat{\tau}_n(\theta)$ is asymptotically equivalent to that of the processes $n^{-1/2}\tau_n(\theta)$, $\theta\in\Theta$.

We know $n^{-1/2}\hat{\tau}_n(\theta) = n^{-1/2}\sum_{i=1}^n c_i(\theta,\hat{\gamma})\xi_i$. Using a Taylor expansion under Condition 2, we have $c_i(\theta,\hat{\gamma}) \approx c_i(\theta,\gamma^*) + [\partial c_i(\theta,\gamma^*)/\partial\gamma]'(\hat{\gamma} - \gamma^*)$. This then gives $n^{-1/2}\sum_{i=1}^n c_i(\theta,\hat{\gamma})\xi_i \approx n^{-1/2}\big\{\sum_{i=1}^n c_i(\theta,\gamma^*)\xi_i + \sum_{i=1}^n [\partial c_i(\theta,\gamma^*)/\partial\gamma]'(\hat{\gamma} - \gamma^*)\xi_i\big\}$. We consider the quantity $n^{-1/2}\sum_{i=1}^n [\partial c_i(\theta,\gamma^*)/\partial\gamma]'(\hat{\gamma} - \gamma^*)\xi_i = \big\{n^{-1}\sum_{i=1}^n [\partial c_i(\theta,\gamma^*)/\partial\gamma]'\xi_i\big\}\big\{n^{1/2}(\hat{\gamma} - \gamma^*)\big\}$. From Condition 1, we have $n^{1/2}(\hat{\gamma} - \gamma^*) = O_p(1)$, where $O_p(1)$ denotes boundedness in probability. The function classes $\{[\partial c_i(\theta,\gamma^*)/\partial\gamma]'\xi_i : \theta\in\Theta\}$, $i = 1, 2, \ldots, n$, are Donsker because B and D are Donsker under Condition 3 and $\xi_i = O_p(1)$. For fixed data $\{y_i, w_i, \delta_i\}_{i=1}^n$, we then have $\sup_{\theta\in\Theta}\|n^{-1}\sum_{i=1}^n [\partial c_i(\theta,\gamma^*)/\partial\gamma]'\xi_i\| \stackrel{p}{\longrightarrow} 0$ as n → ∞. Given the observed data, this finally gives $n^{-1/2}\sum_{i=1}^n c_i(\theta,\hat{\gamma})\xi_i \approx n^{-1/2}\sum_{i=1}^n c_i(\theta,\gamma^*)\xi_i$. Hence, the processes $n^{-1/2}\hat{\tau}_n(\theta)$ converge asymptotically to centered Gaussian processes with covariance kernel $\sigma_c(\theta_1,\theta_2)$, $\theta_1,\theta_2\in\Theta$. From Theorem 1, the processes $n^{-1/2}\hat{u}_n(\theta)$ also converge asymptotically to the same centered Gaussian processes, which gives the desired result.
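The multiplier-resampling scheme above can be sketched in a few lines. This is our illustrative sketch, not the authors' implementation: the function name `sup_score_pvalue`, the inputs `C` and `u_hat`, and the simplification of treating the score as scalar per θ and standardizing by an empirical variance are all assumptions; only the idea of perturbing the score contributions $c_i(\theta,\hat{\gamma})$ with i.i.d. standard normal multipliers $\xi_i$ comes from the text.

```python
import numpy as np

def sup_score_pvalue(C, u_hat, n_resample=10000, seed=0):
    """Multiplier-resampling p-value for a sup-score test (illustrative sketch).

    C     : (n, G) array with entries c_i(theta_g, gamma_hat), the estimated
            score contributions over a grid theta_1, ..., theta_G
            (treated as scalar per theta for simplicity).
    u_hat : length-G array of observed scores u_n(theta_g) at gamma_hat.
    """
    rng = np.random.default_rng(seed)
    n, G = C.shape
    sigma2 = (C ** 2).mean(axis=0)               # empirical sigma_c(theta, theta)
    T_obs = np.max((u_hat / np.sqrt(n)) ** 2 / sigma2)
    xi = rng.standard_normal((n_resample, n))    # i.i.d. N(0, 1) multipliers
    tau = xi @ C / np.sqrt(n)                    # resampled processes n^{-1/2} tau_n
    T_star = np.max(tau ** 2 / sigma2, axis=1)   # resampled sup statistics
    return float(np.mean(T_star >= T_obs))

# Toy illustration with simulated score contributions on a 5-point theta grid:
rng = np.random.default_rng(2024)
C = rng.standard_normal((300, 5))
u_obs = C.sum(axis=0)                            # scores behaving as under H0
print(sup_score_pvalue(C, u_obs, n_resample=2000))
```

With 10,000 resamples, as in the data analysis of Section 4, the reported p-value is the proportion of resampled sup statistics at least as large as the observed one.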

Footnotes

6. Supplementary Materials

Web Appendices referenced in Section 2.3 and additional simulation results are available with this article at the Biometrics website on Wiley Online Library.

References

  1. Andrews DWK (2001). Testing when a parameter is on the boundary of the maintained hypothesis. Econometrica 69, 683–734.
  2. Berkson J and Gage RP (1952). Survival curve for cancer patients following treatment. Journal of the American Statistical Association 47, 501–515.
  3. Boag JW (1949). Maximum likelihood estimates of the proportion of patients cured by cancer therapy. Journal of the Royal Statistical Society, Series B (Methodological) 11, 15–53.
  4. Chernoff H (1954). On the distribution of the likelihood ratio. The Annals of Mathematical Statistics 25, 573–578.
  5. Davies RB (1977). Hypothesis testing when a nuisance parameter is present only under the alternative. Biometrika 64, 247–254.
  6. Di C-Z and Liang K-Y (2011). Likelihood ratio testing for admixture models with application to genetic linkage analysis. Biometrics 67, 1249–1259.
  7. Efron B and Tibshirani R (1993). An Introduction to the Bootstrap. New York: Chapman & Hall.
  8. Farewell VT (1982). The use of mixture models for the analysis of survival data with long-term survivors. Biometrics 38, 1041–1046.
  9. Farewell VT (1986). Mixture models in survival analysis: Are they worth the risk? Canadian Journal of Statistics 14, 257–262.
  10. Hansen BE (1996). Inference when a nuisance parameter is not identified under the null hypothesis. Econometrica 64, 413–430.
  11. Klebanov LB and Yakovlev AY (2007). A new approach to testing for sufficient follow-up in cure-rate analysis. Journal of Statistical Planning and Inference 137, 3557–3569.
  12. Kuk AY and Chen C-H (1992). A mixture model combining logistic regression with proportional hazards regression. Biometrika 79, 531–541.
  13. Li C-S, Taylor JM, and Sy JP (2001). Identifiability of cure models. Statistics & Probability Letters 54, 389–395.
  14. Li Y, Tiwari RC, and Guha S (2007). Mixture cure survival models with dependent censoring. Journal of the Royal Statistical Society, Series B (Statistical Methodology) 69, 285–306.
  15. Lin DY, Fleming TR, and Wei LJ (1994). Confidence bands for survival curves under the proportional hazards model. Biometrika 81, 73–81.
  16. Maller RA and Zhou S (1994). Testing for sufficient follow-up and outliers in survival data. Journal of the American Statistical Association 89, 1499–1506.
  17. Maller RA and Zhou X (1996). Survival Analysis with Long-term Survivors. New York: John Wiley & Sons.
  18. Parzen MI, Wei LJ, and Ying Z (1994). A resampling method based on pivotal estimating functions. Biometrika 81, 341–350.
  19. Peng Y and Dear KB (2000). A nonparametric mixture model for cure rate estimation. Biometrics 56, 237–243.
  20. Peng Y, Dear KBG, and Carriere KC (2001). Testing for the presence of cured patients: A simulation study. Statistics in Medicine 20, 1783–1796.
  21. Ritz C and Skovgaard IM (2005). Likelihood ratio tests in curved exponential families with nuisance parameters present only under the alternative. Biometrika 92, 507–517.
  22. SEER (2012). Surveillance, Epidemiology, and End Results (SEER) Program (www.seer.cancer.gov) Research Data (1973–2009), National Cancer Institute, DCCPS, Surveillance Research Program, Surveillance Systems Branch, released April 2012, based on the November 2011 submission.
  23. Self SG and Liang K-Y (1987). Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions. Journal of the American Statistical Association 82, 605–610.
  24. Silvapulle MJ and Silvapulle P (1995). A score test against one-sided alternatives. Journal of the American Statistical Association 90, 342–349.
  25. Song R, Kosorok MR, and Fine JP (2009). On asymptotically optimal tests under loss of identifiability in semiparametric models. The Annals of Statistics 37, 2409–2444.
  26. Sy JP and Taylor JM (2000). Estimation in a Cox proportional hazards cure model. Biometrics 56, 227–236.
  27. Tai P, Yu E, Cserni G, Vlastos G, Royce M, Kunkler IV, et al. (2005). Minimum follow-up time required for the estimation of statistical cure of cancer patients: Verification using data from 42 cancer sites in the SEER database. BMC Cancer 5, 48.
  28. Todem D, Hsu W-W, and Kim K (2012). On the efficiency of score tests for homogeneity in two-component parametric models for discrete data. Biometrics 68, 975–982.
  29. van der Vaart AW and Wellner JA (1996). Weak Convergence and Empirical Processes. New York: Springer.
  30. van der Vaart AW and Wellner JA (2000). Preservation theorems for Glivenko–Cantelli and uniform Glivenko–Cantelli theorems. In High Dimensional Probability II, Giné E, Mason DM, and Wellner JA (eds). Boston: Birkhäuser.
  31. Verbeke G and Molenberghs G (2003). The use of score tests for inference on variance components. Biometrics 59, 254–262.
  32. Wang L, Du P, and Liang H (2012). Two-component mixture cure rate model with spline estimated nonparametric components. Biometrics 68, 726–735.
  33. Yau KKW and Ng ASK (2001). Long-term survivor mixture model with random effects: Application to a multi-centre clinical trial of carcinoma. Statistics in Medicine 20, 1591–1607.
  34. Yin G and Ibrahim JG (2005). Cure rate models: A unified approach. Canadian Journal of Statistics 33, 559–570.
  35. Zhao Y, Lee AH, Yau KK, Burke V, and McLachlan GJ (2009). A score test for assessing the cured proportion in the long-term survivor mixture model. Statistics in Medicine 28, 3454–3466.
  36. Zhu H and Zhang H (2006). Generalized score test of homogeneity for mixed effects models. The Annals of Statistics 34, 1545–1569.
