Journal of Applied Statistics
. 2019 Jun 27;47(2):231–247. doi: 10.1080/02664763.2019.1635571

Estimation in the single change-point hazard function for interval-censored data with a cure fraction

Bing Wang 1, Xiaoguang Wang 1 (corresponding author), Lixin Song 1
PMCID: PMC9041630  PMID: 35706516

ABSTRACT

In reliability and survival analysis, the hazard function plays a significant role because it displays the instantaneous failure rate at any time point. In practice, an abrupt change in the hazard function may occur at an unknown time point, for instance after a maintenance activity or a major operation. Under these circumstances, identifying the change point and estimating the size of the change are meaningful. In this paper, we assume that the hazard function is piecewise constant with a single jump at an unknown time. We propose the single change-point model for interval-censored survival data with a cure fraction. Estimation methods for the proposed model are investigated, and large-sample properties of the estimators are established. Simulation studies are carried out to evaluate the performance of the estimation method. Liver cancer data and breast cancer data are analyzed as applications.

KEYWORDS: Survival analysis, interval censoring, change-point hazard model, cure fraction, pseudo-maximum likelihood

1. Introduction

The change-point problem in distribution arises in quality control and has recently received much attention. In survival analysis, it is of great importance to detect a time lag in a treatment effect or to identify the change point in a hazard function. In medical follow-up studies after a major operation, e.g. liver transplantation or bone marrow transplantation, the initial risk is usually high, and then the risk drops to a lower constant long-term risk. In this case, the hazard function with a change point is the commonly employed model:

λ(t)=β+θI(t>τ), (1)

where β, β+θ and τ are positive constants. In this model, I(·) is the indicator function of an event, and β and β+θ are the hazard rates before and after the change point τ, respectively. The jump θ can be either positive or negative, reflecting an increase or a decrease in the hazard rate. There are three main research directions for this model in the literature. The first is fitting the model by maximum-likelihood (ML) methods (see [3,13,15,17]). The second is testing the existence of the change point, considered by [10,16]. The third is estimating the change-point location through the structural properties of (1), explored by [2,6]; their estimation procedures require a certain function of the estimated cumulative hazard function.

For standard survival models, we generally assume that all the patients will die from the event of interest. However, in clinical studies, a substantial proportion of patients who respond favorably to the treatment appear subsequently to be free of any signs or symptoms of the disease and may be considered cured, while the remaining patients may eventually relapse. The standard survival models may be inappropriate for this type of data. To deal with this situation, the cure model was proposed by [1] as follows:

S(t) = 1 − p + pS0(t), (2)

where S(t) is the proportion of people surviving at any given point in time, p is the proportion that is not cured, and S0(t) represents the survival function of the uncured people. Several methods have been proposed to fit cure models, such as the expectation-maximization algorithm and the Markov chain Monte Carlo method [1,19,27]. Survival models with a cure fraction have been studied extensively for decades and many applications have been reported [14], but these models do not consider possible change-point phenomena. In reality, cured patients may well exist in change-point situations. For example, the nonlymphoblastic leukemia data studied in [15] were shown to exhibit an abrupt change in the recurrence rate. At the same time, the Kaplan–Meier estimator of the survival function levels off below 1, indicating the presence of cured patients who will never suffer a relapse of leukemia. Inspired by this, we pursue the change-point model (1) with the possible presence of a cure fraction.

Interval censoring is an increasingly common type of censoring, and there is a voluminous literature on the statistical analysis of interval-censored failure time data (see [9,23,24]). In these studies, the event is only known to occur within a time interval determined by the follow-up visits. In some situations, interval censoring is due to data-processing procedures. The liver cancer data in our real data analysis are from the Surveillance, Epidemiology, and End Results (SEER) cancer incidence public-use database. SEER only releases the processed survival month. Thus, we cannot obtain day-level information, let alone the exact event time. The survival month T after diagnosis is defined by

T = floor((last contact date − diagnosis date) / (average days in month)),

where floor(t) = n for t ∈ [n, n+1), n ∈ ℕ. By the definition of the survival month T, the exact event time lies in the interval [T, T+1) or [T, ∞) [29]. The breast cancer data in our application come from a retrospective study of 94 patients who received radiation therapy. The clinician, who determined whether or not retraction had occurred, examined each patient at a series of observation times. Hence, the time of retraction is known only to lie between the last and the present observation times.
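The survival-month construction above can be sketched in a few lines. The average-days constant 30.4375 (= 365.25/12) is our assumption, not a documented SEER value, and the function name is ours:

```python
import math
from datetime import date

def survival_month_interval(diagnosis: date, last_contact: date,
                            avg_days_in_month: float = 30.4375):
    """Return the survival month T and the interval [T, T+1) that is
    known to contain the exact (unobserved) event time."""
    t = math.floor((last_contact - diagnosis).days / avg_days_in_month)
    return t, (t, t + 1)
```

For a patient last contacted 100 days after diagnosis, the event time is only known to lie in survival months [3, 4).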

In this article, we focus on modeling 'case 2' interval-censored survival data with cured patients by model (1). Compared with [28], we deal with a different type of data: we replace the Kaplan–Meier method with the ICM method. The NPMLE computed by ICM is consistent, which helps establish the consistency of the estimated uncured rate. Additionally, we need some assumptions on the interval censoring to obtain asymptotic properties, and we propose a change-point test procedure.

The rest of the article is organized as follows. In Section 2, we outline the notation and model description of the single change-point hazard model with a cure fraction for interval-censored data. Details of pseudo-maximum-likelihood estimation are presented in Section 3. The change-point test procedure is proposed in Section 4. Asymptotic properties are investigated in Section 5. Extensive simulation results are reported in Section 6. In Section 7, we apply the proposed method to the liver cancer data and breast cancer data. Technical proofs are relegated to the supplemental material.

2. Model formulation

Under the cure model (2), let η be the indicator variable with η=1 if the patient is uncured and η=0 if cured, let T be the failure time of a patient, and let T* be the failure time of an uncured patient. Define p = P(η=1); that is, p is the probability of being uncured. Then the relationship between the cumulative distribution functions (c.d.f.) of T and T* is

F(t) = P(T ≤ t | η=1)P(η=1) + P(T ≤ t | η=0)P(η=0) = pP(T* ≤ t) = pF0(t),

where F(t) and F0(t) are the c.d.f. of T and T*, respectively. Correspondingly, provided f(t) and f0(t) are their density functions, the hazard rate function of T is of the form

λ(t) = f(t) / (1 − F(t)) = pf0(t) / (1 − pF0(t)). (3)

Note that λ(t) → 0 as t → ∞ for p < 1, and hence the hazard rate cannot remain constant in the presence of a cure fraction. A simple example is the exponential lifetime with cured patients, where T* is exponentially distributed with a constant hazard rate ψ. Then the hazard rate of T is

λ(t) = pψ exp(−ψt) / (1 − p + p exp(−ψt)),

which is no longer constant.

Now, assume that the hazard function of T* is specified as

λ0(t)=β+θI(t>τ).

Then the corresponding density function and c.d.f. are, respectively,

f0(t) = λ0(t) exp(−∫0^t λ0(u) du) = { βe^{−βt}, 0 ≤ t ≤ τ;  (β+θ)e^{−βt−θ(t−τ)}, t > τ, }

and

F0(t) = { 1 − e^{−βt}, 0 ≤ t ≤ τ;  1 − e^{−βt−θ(t−τ)}, t > τ. }

By (3), the hazard function of T is

λ(t) = { pβe^{−βt} / (1 − p + pe^{−βt}), 0 ≤ t ≤ τ;  p(β+θ)e^{−βt−θ(t−τ)} / (1 − p + pe^{−βt−θ(t−τ)}), t > τ. } (4)
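As a concrete check of these formulas, here is a minimal implementation of F0 and the hazard in (4); the function names are ours:

```python
import math

def F0(t, beta, theta, tau):
    """C.d.f. of the uncured failure time under lambda0(t) = beta + theta*I(t > tau)."""
    extra = theta * (t - tau) if t > tau else 0.0
    return 1.0 - math.exp(-beta * t - extra)

def hazard(t, beta, theta, tau, p):
    """Hazard (4) of the observable failure time T in the cure model."""
    extra = theta * (t - tau) if t > tau else 0.0
    f0 = (beta + (theta if t > tau else 0.0)) * math.exp(-beta * t - extra)
    return p * f0 / (1.0 - p * F0(t, beta, theta, tau))
```

The jump of the hazard at τ, equal to pθe^{−βτ}/(1 − p + pe^{−βτ}), can be verified numerically from this sketch.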

There is a jump at τ of size

pθ exp(−βτ) / (1 − p + p exp(−βτ)),

which is increasing in p if θ > 0 and decreasing if θ < 0, attaining its extreme value θ at p = 1 (a maximum for θ > 0 and a minimum for θ < 0). Apparently, there are two different mathematical forms before and after the point τ, and λ(t) ≠ λ0(t). As is common in change-point models, we suppose the existence of bounds τ1 and τ2 such that 0 < τ1 ≤ τ ≤ τ2 < ∞ (see [2,6,16]). In medical research, one often has to deal with interval-censored survival data when patients are assessed only at pre-scheduled visits. If the event has not occurred at one visit but has occurred by the following visit, the time T is known only within an interval. Following the usual formulation, let (Ti, Li, Ri), 1 ≤ i ≤ n, be a sample of the random variables (r.v.) (T, L, R). We postulate that Li and Ri are non-negative and independent of Ti, follow a joint distribution function H, and satisfy P(Ri − Li ≥ c) = 1 for some positive constant c. Moreover, we assume that H has a density h satisfying

h(l, r) > 0  if  0 < F0(l) < F0(r) < 1. (5)

Let Oi = (Li, Ri, δL,i, δR,i, δI,i), i = 1, …, n, denote the observed data, where δL,i, δR,i and δI,i are the censoring indicators: δL,i = 1 when Ti is left censored (Ti < Li), δR,i = 1 when Ti is right censored (Ti ≥ Ri), and δI,i = 1 when Ti is interval censored (Li ≤ Ti < Ri). Note that δL,i + δR,i + δI,i = 1 for all i. The observed likelihood function of the parameters β, θ, p and τ under model (4) for interval-censored data is given by

L(β, θ, p, τ | Oi, i = 1, …, n) = ∏_{i=1}^n {pF0(Li)}^{δL,i} {pF0(Ri) − pF0(Li)}^{δI,i} {1 − pF0(Ri)}^{δR,i}. (6)

3. Pseudo-maximum-likelihood estimation

According to the preceding discussion, the log-likelihood function based on the observed data can be written as

ln(ϕ) = log L(β, θ, p, τ | Oi, i = 1, …, n) = ∑_{i=1}^n [δL,i{log p + log F0(Li)} + δI,i{log p + log(F0(Ri) − F0(Li))} + δR,i log(1 − pF0(Ri))] = ∑_{i=1}^n l(β, θ, p, τ | Oi),

where ϕ = (ξ, τ) with ξ = (β, θ, p), and l(β, θ, p, τ | Oi) is the log-likelihood of a single observation Oi, given by

l(β, θ, p, τ | Oi) = δL,i{log p + I(Li ≤ τ) log(1 − e^{−βLi}) + I(Li > τ) log(1 − e^{−βLi−θ(Li−τ)})} + δI,i{log p + I(Ri ≤ τ) log(e^{−βLi} − e^{−βRi}) + I(Li ≤ τ < Ri) log(e^{−βLi} − e^{−βRi−θ(Ri−τ)}) + I(Li > τ) log(e^{−βLi−θ(Li−τ)} − e^{−βRi−θ(Ri−τ)})} + δR,i{I(Ri ≤ τ) log(1 − p + pe^{−βRi}) + I(Ri > τ) log(1 − p(1 − e^{−βRi−θ(Ri−τ)}))}. (7)
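The single-observation log-likelihood (7) can be coded compactly through the c.d.f. F0, which expands to exactly the indicator cases above; this is a sketch with our own function name:

```python
import math

def loglik_obs(beta, theta, p, tau, L, R, dL, dI, dR):
    """Log-likelihood (7) of one observation O = (L, R, dL, dI, dR);
    dL, dI, dR are the left-, interval- and right-censoring indicators."""
    def F0(t):
        return 1.0 - math.exp(-beta * t - (theta * (t - tau) if t > tau else 0.0))
    if dL:                                    # T < L: contribution p*F0(L)
        return math.log(p) + math.log(F0(L))
    if dI:                                    # L <= T < R: p*F0(R) - p*F0(L)
        return math.log(p) + math.log(F0(R) - F0(L))
    return math.log(1.0 - p * F0(R))          # T >= R: 1 - p*F0(R)
```

Summing `loglik_obs` over the sample gives the log-likelihood used for the pseudo-ML step.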

From (7), it is obvious that the function l(β, θ, p, τ) is not continuous in τ. Hence, the sufficient conditions for consistency of the classical maximum-likelihood estimator (MLE) are not met, and the MLE is not appropriate here. Thus, we resort to the pseudo-likelihood approach, which overcomes this problem.

The pseudo-likelihood approach was proposed by [7] and further studied by others including [11,12]. The key idea is to replace the true (but unknown) ‘nuisance’ parameters p and τ in (7) by their consistent estimators p^ and τ^, and then treat the log-likelihood function ln(β,θ,p^,τ^), called the pseudo log-likelihood function, as a usual likelihood function of β and θ to generate the pseudo-MLE (β^,θ^) of (β,θ).

The consistent estimators of τ and p can be obtained by nonparametric methods as follows. As in [14], let

p^=F^n(Y(n)), (8)

where Y(n) = max{Ri : i = 1, …, n}, and F^n(t) denotes the nonparametric estimate of the c.d.f. of the failure times, obtained by the iterative convex minorant (ICM) algorithm. The ICM algorithm, proposed by [9], is fast in computing the nonparametric maximum-likelihood estimate (NPMLE) of the distribution function for interval-censored data without covariates. We will show that p^ is consistent in Section 5.

To develop the estimator of τ, note that the cumulative hazard function of T can be obtained by

Λ(t) = ∫0^t λ(u) du = { −log(1 − p + pe^{−βt}), 0 ≤ t ≤ τ;  −log(1 − p + pe^{−βt−θ(t−τ)}), t > τ. }

Let

Λ*(t) = −log[(1/p){e^{−Λ(t)} − 1 + p}] = { βt, 0 ≤ t ≤ τ;  βt + θ(t−τ), t > τ, }

which is a piecewise linear function of t. Further define

X(t) = {(Λ*(D) − Λ*(t))/(D − t) − (Λ*(t) − Λ*(0))/t} g{t(D − t)}, (9)

for 0 < t < D, where D > τ2 and g(x) = x^q, 0 ≤ q ≤ 1. Then we have

X(t) = Λ*(D)g{t(D − t)}/(D − t) − DΛ*(t)g{t(D − t)}/{(D − t)t} = θ(D − τ)/(D − t) · g{t(D − t)} I(t ≤ τ) + θ(τ/t) · g{t(D − t)} I(t > τ), (10)

which is increasing (decreasing) on [0, τ] and decreasing (increasing) on (τ, D] for θ > 0 (θ < 0). Let Xn(t) be the empirical version of (9), with the unknown cumulative hazard function Λ(t) and p replaced by the estimator Λ^(t) = −log{1 − F^n(t)} and p^ in (8), respectively.

Then an estimator of τ is given by

τ^ = { inf{t ∈ [τ1, τ2] : Xn(t±) = sup_{u∈[τ1,τ2]} Xn(u)}, if θ > 0;  inf{t ∈ [τ1, τ2] : Xn(t±) = inf_{u∈[τ1,τ2]} Xn(u)}, if θ < 0. } (11)

In the absence of a cure fraction, Chang et al. [2] showed that τ^ is a consistent estimator of τ, with τ^ − τ = Op(n^{−1}). Using the asymptotic properties of Λ^(t) and p^, we can also establish the consistency of τ^.
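The construction of τ^ through (9)-(11) can be sketched as follows. In the paper, F^n is the ICM-based NPMLE; here `F_hat` is any callable c.d.f. estimate (the usage check below simply plugs in the true F), and a grid search stands in for the exact inf/sup in (11). Function and variable names are ours:

```python
import math

def tau_change_point(F_hat, grid, p_hat, D, tau1, tau2, q=1.0, theta_positive=True):
    """Estimate tau by maximizing (theta > 0) or minimizing (theta < 0)
    the empirical version X_n(t) of (9) over a grid within [tau1, tau2]."""
    def Lam_star(t):
        Lam = -math.log(1.0 - F_hat(t))                       # cumulative hazard
        return -math.log((math.exp(-Lam) - 1.0 + p_hat) / p_hat)
    def X(t):
        g = (t * (D - t)) ** q                                # g(x) = x^q
        return ((Lam_star(D) - Lam_star(t)) / (D - t)
                - (Lam_star(t) - Lam_star(0.0)) / t) * g
    ts = [t for t in grid if tau1 <= t <= tau2]
    pick = max if theta_positive else min
    return pick(ts, key=X)
```

With the true F at (p, β, θ, τ) = (0.8, 1, 1, 1), q = 1 and D = 1.75, (10) reduces to θ(D − τ)t on [0, τ] and θτ(D − t) beyond, so the grid maximizer sits exactly at τ = 1.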

4. Change-point test

In this section, we propose a test to determine whether there is a change point in the hazard function. We test the null hypothesis H0: θ = 0 or τ = ∞, which means there is no change point in the survival distribution, versus the alternative hypothesis that there is one change point. Following the results of [21,25,26], together with our model (4), we propose two modified test statistics:

RMF1 = 2 sup_{τ∈{x1,…,xk}} {ln(β^(τ), θ^(τ), p^, τ) − ln(β^0, 0, p^, ∞)},  RMF2 = (1/k) ∑_{τ∈{x1,…,xk}} {ln(β^(τ), θ^(τ), p^, τ) − ln(β^0, 0, p^, ∞)},

where β^(τ) and θ^(τ) are estimators of the parameters of model (4) obtained by maximizing the pseudo-likelihood ln(β, θ, p^, τ), β^0 is the MLE when θ = 0 or τ = ∞, p^ is obtained from (8), and {x1, …, xk} are equally spaced points in the interval [τ1, τ2]. The asymptotic distributions of RMF1 and RMF2 under the null hypothesis are complicated. Hence, we apply a resampling procedure to obtain the critical values under H0, taking RMF1 as an example:

Step 1. Calculate p^0 by (8), and β^0 by maximizing log L(β, 0, p^0, ∞ | Oi, i = 1, …, n).

Step 2. Obtain a series of observation times 0 = s1 < s2 < … < sm = ∞, m ≤ 2n, by arranging all the points in the set ∪_{i=1}^n {Li, Ri} from small to large and removing repeated points.

Step 3. Generate the failure time data {T~i}_{i=1}^n from model (4) with p = p^0, β = β^0, θ = 0 and τ = ∞. Obtain the interval-censored data {L~i, R~i}_{i=1}^n by setting (L~i, R~i) = (sj, sj+1) if T~i ∈ (sj, sj+1], i = 1, …, n, j ∈ {1, …, m}.

Step 4. Generate a total of B (e.g. B = 500) simulated trials by repeating Step 3. Obtain the likelihood ratio statistic RMF1^b, b = 1, …, B, for each trial.

Step 5. Reject the null hypothesis if RMF1, the likelihood ratio statistic calculated from the original trial, is larger than the 95th percentile of {RMF1^b, b = 1, …, B}.
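Steps 2-3 of the resampling procedure can be sketched as below. This sketch assumes all pooled endpoints are finite (right-censored observations with infinite R would be excluded from the grid), and the function name and defaults are ours:

```python
import math
import random

def simulate_null_interval_data(obs_pairs, p0, beta0, rng=None):
    """Steps 2-3: pool observed endpoints into an ordered visit grid
    s_1 < ... < s_m, draw failure times from the no-change-point cure model
    (theta = 0, tau = infinity), and interval-censor them on the grid."""
    rng = rng or random.Random(1)
    s = sorted({x for pair in obs_pairs for x in pair})
    data = []
    for _ in obs_pairs:
        u = rng.random()
        if u >= p0:                                 # cured: never fails
            data.append((s[-1], math.inf))
            continue
        t = -math.log(1.0 - u / p0) / beta0         # invert F(t) = p0*(1 - e^{-beta0*t})
        j = max((k for k, sk in enumerate(s) if sk < t), default=-1)
        if j == len(s) - 1:
            data.append((s[-1], math.inf))          # beyond last visit: right censored
        elif j < 0:
            data.append((0.0, s[0]))                # before first visit: left censored
        else:
            data.append((s[j], s[j + 1]))           # T in (s_j, s_{j+1}]
    return data
```

Repeating this generation B times and refitting both models on each simulated data set yields the resampled statistics RMF1^b.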

5. Asymptotic properties

We first introduce some notation. For ease of presentation, we consider the case θ > 0 only. Let ϕ = (β, θ, p, τ) ∈ Θ = (0, ∞) × (0, ∞) × (0, 1) × [τ1, τ2], μ = (β, θ) and ν = (τ, p). The true value of ϕ = (μ, ν) = (β, θ, p, τ) is denoted by ϕ0 = (μ0, ν0) = (β0, θ0, p0, τ0). Write P = Pμ0,ν0 for the probability measure of O = (L, R, δL, δI, δR) and employ the abbreviation Pg = ∫g dP for any measurable function g. Denote by Pn the empirical measure of the observations {Oi, i = 1, …, n}, with Png = ∫g dPn = n^{−1}∑_{i=1}^n g(Oi). Define the parameter spaces for μ and ν as

Θ1 = {μ = (β, θ) : |β − β0| ≤ A1, |θ − θ0| ≤ A2},  Θ2 = {ν = (τ, p) : |τ − τ0| ≤ η1, |p − p0| ≤ η2}, (12)

where A1, A2, η1 and η2 are small positive constants. The first partial derivative of l(μ, ν | O) with respect to μ is denoted by l˙μ(μ, ν | O) = (l˙β(μ, ν | O), l˙θ(μ, ν | O))ᵀ, where

l˙β(μ, ν | O) = δL{I(L ≤ τ) Le^{−βL}/(1 − e^{−βL}) + I(L > τ) Le^{−βL−θ(L−τ)}/(1 − e^{−βL−θ(L−τ)})} + δI{I(R ≤ τ)(Re^{−βR} − Le^{−βL})/(e^{−βL} − e^{−βR}) + I(L ≤ τ < R)(Re^{−βR−θ(R−τ)} − Le^{−βL})/(e^{−βL} − e^{−βR−θ(R−τ)}) + I(L > τ)(Re^{−βR−θ(R−τ)} − Le^{−βL−θ(L−τ)})/(e^{−βL−θ(L−τ)} − e^{−βR−θ(R−τ)})} − δR{I(R ≤ τ) pRe^{−βR}/(1 − p + pe^{−βR}) + I(R > τ) pRe^{−βR−θ(R−τ)}/(1 − p(1 − e^{−βR−θ(R−τ)}))}, (13)

and

l˙θ(μ, ν | O) = δL I(L > τ)(L − τ)e^{−βL−θ(L−τ)}/(1 − e^{−βL−θ(L−τ)}) + δI{I(L ≤ τ < R)(R − τ)e^{−βR−θ(R−τ)}/(e^{−βL} − e^{−βR−θ(R−τ)}) + I(L > τ)((R − τ)e^{−βR−θ(R−τ)} − (L − τ)e^{−βL−θ(L−τ)})/(e^{−βL−θ(L−τ)} − e^{−βR−θ(R−τ)})} − δR I(R > τ) p(R − τ)e^{−βR−θ(R−τ)}/(1 − p(1 − e^{−βR−θ(R−τ)})). (14)

Denote l˙μ(μ, ν | O)l˙μ(μ, ν | O)ᵀ by l¨μμ(μ, ν | O). From (13) and (14), it follows that

Pl˙μ(μ, ν | O) < ∞,  Pl¨μμ(μ, ν | O) < ∞, (15)

provided that the censoring distribution has a finite variance. Here, a vector or matrix being less than infinity means that all its components are finite. Note that μ0 is the unique point such that Pl˙μ(μ, ν0 | O) = 0; accordingly, μ^ is obtained by solving Pnl˙μ(μ, ν0 | Oi, i = 1, …, n) = 0.

The main results on the asymptotic properties of p^, τ^, β^ and θ^ are presented in the next five theorems; their proofs are given in the supplemental material. To state the theorems, we define the right extreme τF0 of F0 by

τF0 = sup{t ≥ 0 : F0(t) < 1}.

Under the assumptions on H, we have H(∞, τF0) < 1, and hence the constant τH = sup{t > τF0 : H(∞, t) < 1} is well defined.

Theorem 1 Consistency of p^

Suppose that 0 < p < 1 and F is continuous at τH when τH < ∞. Then p^ → p in probability as n → ∞.

Remark 1

Theorem 1 is a modification of [14] for interval-censored data. Because the Kaplan–Meier estimator is not appropriate for case 2 interval-censored data, the nonparametric estimate of the c.d.f. of the failure time is obtained here by the ICM algorithm. The strong consistency of F^n obtained by the ICM algorithm is proved in [9], which suffices for the proof of Theorem 1.

Using the consistency of p^ and Λ^(t), we establish the consistency of τ^ in the following theorem.

Theorem 2 Consistency of τ^

Assume that F is continuous at τH when τH < ∞ and that τF0 > D. Then the estimator τ^ of τ defined in (11) is consistent.

Theorem 3 Consistency of μ^

Suppose that p^ and τ^ are obtained by (8) and (11), respectively. Then Pnl˙μ(μ^, ν^) = op(n^{−1/2}) almost surely, and μ^ converges in outer probability to μ0.

Remark 2

Note that op(1) in the following representations indicates convergence to zero in outer probability in case that the term involved is not Borel measurable.

Theorem 4 The rate of convergence of μ^

Under the conditions of Theorem 3, √n(μ^ − μ0) = Op(1).

Theorem 5 Asymptotic normality

Under the conditions of Theorem 3, √n(μ^ − μ0) is asymptotically normal with mean 0 and variance {Pl¨μμ(μ0, ν0)}^{−2}V, where V = Var{Λ1 + Pl¨μν(μ0, ν0)Λ2}, and Λ1 and Λ2 are random vectors satisfying

√n [(Pn − P)l˙μ(μ0, ν0), ν^ − ν0]ᵀ →d [Λ1, Λ2]ᵀ.

Remark 3

For the asymptotic variance of √n(μ^ − μ0) in Theorem 5, a precise representation of V can be found in Corollary 3.1.4 of [11] for the i.i.d. setup. In that case, there exists a function γ(O, μ, ν) satisfying √n Pl¨μμ(μ0, ν0)(ν^ − ν0) = √n Pnγ(O, μ0, ν0) + op(1), which gives V = Var{l˙μ(μ0, ν0 | O)} + Var{γ(O, μ0, ν0)}, where γ(O, μ0, ν0) is defined by (3.1.21) in [11]. Without such a γ(O, μ0, ν0), a closed form of V is not available, but we can estimate the variance by the bootstrap method discussed below.

Since the asymptotic variances of μ^ and ν^ are often intractable, we resort to the bootstrap method applied in [18,22]. The algorithm proceeds in three steps.

Step 1. Resample the pairs (L1, R1), …, (Ln, Rn) with replacement, selecting each pair (Li, Ri) with probability 1/n. Denote the resampled data sets by a(1), …, a(B), for some positive integer B.

Step 2. For each set of bootstrap data a(b), b=1,,B, evaluate the estimates of interest.

Step 3. Calculate the sample means and standard deviations of estimators.
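The three bootstrap steps can be sketched generically; `estimate` stands in for the full pseudo-ML fitting routine applied to a resampled data set, and the function name and defaults are ours:

```python
import random
import statistics

def bootstrap_se(pairs, estimate, B=200, rng=None):
    """Resample the (L_i, R_i) pairs with replacement (probability 1/n each),
    re-run the estimator on every bootstrap data set, and report the sample
    mean and standard deviation of the B replicates."""
    rng = rng or random.Random(7)
    n = len(pairs)
    reps = []
    for _ in range(B):
        sample = [pairs[rng.randrange(n)] for _ in range(n)]
        reps.append(estimate(sample))
    return statistics.mean(reps), statistics.stdev(reps)
```

In practice, `estimate` would return the fitted p^, τ^, β^ or θ^; the scheme works for any scalar functional of the resampled intervals.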

6. Simulation results

To evaluate our approach, we conduct extensive simulations under different settings. In particular, the studies explore the influence of the value of q in (9), the jump size, the change-point location, the sample size and the censoring level. We also check the power of the test procedure. In all cases, we compute the estimate of p by (8) and of τ by (11), and then obtain the estimates of β and θ.

The study considers data simulated from the hazard function given in (3); the corresponding distribution function is

F(t) = { 1 − (1 − p + pe^{−βt}), 0 ≤ t ≤ τ;  1 − (1 − p + pe^{−βt−θ(t−τ)}), t > τ, }

where p = 0.8 and β = 1. To check the influence of the jump size, we fix τ = 1 and set θ ∈ {0.5, 1, 1.5}; to see the effect of the change-point location, we set τ ∈ {0.1, 0.5, 1, 1.5} with θ = 1. The change-point search range is set to (0.05, 1.75). There are six parameter configurations in total. We also let q ∈ {0.25, 0.5, 0.75, 1} for g in (9) to assess the influence of q. Each Ti is generated by solving F(t) = ui numerically, where ui ∼ U(0, 1). The total number of visit times for each subject is generated as 1 plus a Poisson random variable with mean parameter σ. The first observation times are sampled from U(0, b), where b is a positive constant. The gap times between adjacent visits are sampled from an exponential distribution with mean 2, and the visit times are then given by the cumulative sums of the gap times. The observed interval for the ith subject is the pair of consecutive observation times whose interval contains Ti, with the convention that if Ti is less (greater) than the smallest (largest) observation time, then the lower (upper) bound of the observed interval is 0 (∞). Different censoring levels can be obtained by adjusting b and σ. We consider four kinds of left and right censoring levels (CL): (30%, 30%), (20%, 20%), (0, 50%) and (0, 30%). For the purposes of this study, 1000 data sets of the form {(Li, Ri)}_{i=1}^n are generated for each parameter configuration, where n ∈ {200, 400, 800}.
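The data-generating scheme above can be sketched as follows; the inverse-c.d.f. step solves F(t) = u in closed form (u ≥ p gives a cured subject, T = ∞), the visit process follows the stated design, and all names are ours:

```python
import math
import random

def draw_poisson(lam, rng):
    """Poisson draw via Knuth's product method (adequate for small lam)."""
    limit, k, prod = math.exp(-lam), 0, 1.0
    while True:
        prod *= rng.random()
        if prod <= limit:
            return k
        k += 1

def simulate_subject(p, beta, theta, tau, sigma, b, rng):
    """One subject: invert F(t) = p*F0(t) at u ~ U(0,1) (cured if u >= p),
    build visit times (first ~ U(0,b), then Poisson(sigma) further visits
    with Exp(mean 2) gaps), and return the censoring interval (L, R)."""
    u = rng.random()
    if u >= p:
        t = math.inf                                       # cured subject
    elif u / p <= 1.0 - math.exp(-beta * tau):             # failure before tau
        t = -math.log(1.0 - u / p) / beta
    else:                                                  # invert the t > tau branch
        t = (theta * tau - math.log(1.0 - u / p)) / (beta + theta)
    visits = [rng.uniform(0.0, b)]
    for _ in range(draw_poisson(sigma, rng)):
        visits.append(visits[-1] + rng.expovariate(0.5))   # gaps Exp(mean 2)
    lo, hi = 0.0, math.inf
    for v in visits:
        if v < t:
            lo = max(lo, v)                                # largest visit before T
        else:
            hi = min(hi, v)                                # smallest visit at/after T
    return lo, hi
```

Running `simulate_subject` n times yields one simulated data set {(Li, Ri)}; cured or late failures come out right censored (R = ∞), consistent with the convention above.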

First, we investigate the effect of different values of q in (9). Since the choice of q does not affect p^, β^ and θ^ directly, τ^ is the only quantity used to assess q. Table 1 displays the empirical biases (Bias) and sample standard deviations (SD) of τ^ under different model scenarios with β = 1, p = 0.8, θ ∈ {0.5, 1, 1.5}, τ ∈ {0.1, 0.5, 1, 1.5} and sample size n = 800. The results indicate that the proposed method with any q ∈ {0.25, 0.5, 0.75, 1} performs reasonably well. However, smaller sample standard deviations of τ^ are obtained with q = 1 in all situations. Hence, q = 1 is recommended.

Table 1. Results of the estimation with β = 1, p = 0.8, θ ∈ {0.5, 1, 1.5}, τ ∈ {0.1, 0.5, 1, 1.5} and n = 800. CL denotes the average rates of left and right censoring. τ0.25, τ0.5, τ0.75 and τ1 denote the pseudo-maximum-likelihood method with q = 0.25, 0.5, 0.75 and 1, respectively.

  τ0.25 τ0.5 τ0.75 τ1
Con. CL τ θ Bias SD Bias SD Bias SD Bias SD
1 (30%, 30%) 0.1 1 0.050 0.271 0.048 0.256 0.042 0.241 0.039 0.231
2 (30%, 30%) 0.5 1 0.016 0.387 0.010 0.251 0.002 0.161 0.003 0.111
3 (30%, 30%) 1 1 0.007 0.227 0.001 0.195 0.011 0.181 0.003 0.162
4 (30%, 30%) 1.5 1 0.007 0.172 0.017 0.157 0.019 0.168 0.019 0.168
5 (30%, 30%) 1 0.5 0.008 0.373 0.021 0.342 0.020 0.331 0.006 0.337
6 (30%, 30%) 1 1.5 0.016 0.159 0.006 0.144 0.004 0.123 0.001 0.119
7 (20%, 20%) 0.1 1 0.017 0.157 0.015 0.148 0.007 0.151 0.003 0.151
8 (20%, 20%) 0.5 1 0.005 0.213 0.003 0.118 0.005 0.104 0.006 0.096
9 (20%, 20%) 1 1 0.005 0.175 0.006 0.163 0.005 0.158 0.001 0.141
10 (20%, 20%) 1.5 1 0.005 0.144 0.003 0.145 0.009 0.126 0.006 0.119
11 (20%, 20%) 1 0.5 0.003 0.245 0.001 0.237 0.009 0.203 0.002 0.195
12 (20%, 20%) 1 1.5 0.007 0.130 0.007 0.114 0.002 0.101 0.005 0.097
13 (0, 50%) 0.1 1 0.125 0.403 0.124 0.398 0.121 0.393 0.115 0.296
14 (0, 50%) 0.5 1 0.017 0.228 0.005 0.260 0.018 0.196 0.003 0.211
15 (0, 50%) 1 1 0.002 0.308 0.005 0.288 0.004 0.287 0.001 0.270
16 (0, 50%) 1.5 1 0.004 0.220 0.009 0.218 0.004 0.199 0.016 0.216
17 (0, 50%) 1 0.5 0.007 0.374 0.009 0.378 0.007 0.378 0.003 0.367
18 (0, 50%) 1 1.5 0.001 0.193 0.005 0.189 0.006 0.172 0.002 0.157
19 (0, 30%) 0.1 1 0.090 0.304 0.087 0.299 0.083 0.293 0.072 0.282
20 (0, 30%) 0.5 1 0.001 0.133 0.004 0.098 0.007 0.093 0.003 0.089
21 (0, 30%) 1 1 0.011 0.198 0.001 0.191 0.011 0.168 0.008 0.167
22 (0, 30%) 1.5 1 0.006 0.177 0.003 0.169 0.007 0.169 0.001 0.168
23 (0, 30%) 1 0.5 0.001 0.377 0.001 0.297 0.006 0.292 0.007 0.289
24 (0, 30%) 1 1.5 0.002 0.148 0.001 0.140 0.004 0.138 0.001 0.134

Next, we compare our method with the MLE method suggested by [3,15,25,26], which obtains the estimators as follows: with τ fixed, let ξ^n(τ) be the value of ξ maximizing ln(ϕ). Then τ is estimated by

τ^ = inf{τ ∈ [τ1, τ2] : max(ln(ξ^n(τ), τ), ln(ξ^n(τ±), τ±)) = sup_{τ∈[τ1,τ2]} ln(ξ^n(τ), τ)}.

Then the maximum-likelihood estimator of ξ is obtained as ξ^n=ξ^n(τ^).

Table 2 displays the empirical biases and sample standard deviations of the estimators for different configurations of the model parameters from the sets θ ∈ {0.5, 1, 1.5} and τ ∈ {0.1, 0.5, 1, 1.5}. We also set different censoring levels and consider samples of size 800. The results in Table 2 indicate that both methods perform reasonably well in estimating the model parameters for the cases investigated, except when τ = 0.1 (see the configuration sequence 1–7–13–19), since too few failures occur before time 0.1 to yield good estimates. The performances of the two methods are generally comparable, showing similar improvement or deterioration as the model parameters change. More specifically, the biases and standard deviations of the estimators are smaller for a larger jump size (see configuration sequences 3–5–6, 9–11–12, 15–17–18 and 21–23–24). Comparing configurations 1 to 6 with 7 to 12 in Table 2, the biases and standard deviations of the estimators are smaller at lower left and right censoring levels. The performances are also better with a smaller right censoring rate, referring to configurations 13 to 24. Looking at the configuration sequences 2–3–4, 8–9–10, 14–15–16 and 20–21–22, the change-point location has no great influence on the estimates, provided there are enough samples on both sides of the change point. Our method also yields smaller biases than the MLE method in most cases; further, when the jump size is small or the change point is close to 0, our method is markedly better. In Table 3, we take into account the influence of the sample size. As expected, biases and standard deviations decrease as the sample size increases.

Table 2. Results of the estimation with β=1, p=0.8, θ{0.5,1,1.5}, τ{0.1,0.5,1,1.5} and n=800. PMLE denotes the pseudo-maximum-likelihood method with q=1 and CL denotes the level of right and left censoring.

  PMLE MLE
Con. CL τ θ   p^ τ^ β^ θ^ p^ τ^ β^ θ^
1 (30%, 30%) 0.1 1 Bias 0.005 0.039 0.087 0.089 0.001 0.116 0.087 0.229
        SD 0.022 0.231 0.433 0.451 0.020 0.341 0.496 0.415
2 (30%, 30%) 0.5 1 Bias 0.001 0.003 0.003 0.014 0.003 0.015 0.008 0.046
        SD 0.020 0.111 0.113 0.222 0.019 0.093 0.109 0.229
3 (30%, 30%) 1 1 Bias 0.001 0.003 0.002 0.012 0.001 0.015 0.001 0.040
        SD 0.022 0.111 0.067 0.298 0.021 0.125 0.069 0.320
4 (30%, 30%) 1.5 1 Bias 0.001 0.019 0.001 0.011 0.002 0.074 0.005 0.113
        SD 0.023 0.168 0.062 0.414 0.020 0.284 0.075 0.582
5 (30%, 30%) 1 0.5 Bias 0.002 0.006 0.019 0.038 0.002 0.015 0.014 0.105
        SD 0.024 0.337 0.063 0.237 0.021 0.261 0.084 0.337
6 (30%, 30%) 1 1.5 Bias 0.001 0.001 0.003 0.010 0.001 0.011 0.004 0.040
        SD 0.019 0.119 0.066 0.349 0.019 0.071 0.062 0.411
7 (20%, 20%) 0.1 1 Bias 0.001 0.003 0.100 0.103 0.001 0.037 0.079 0.116
        SD 0.015 0.151 0.379 0.363 0.015 0.185 0.401 0.371
8 (20%, 20%) 0.5 1 Bias 0.001 0.006 0.011 0.024 0.001 0.004 0.015 0.026
        SD 0.015 0.096 0.091 0.142 0.016 0.079 0.092 0.135
9 (20%, 20%) 1 1 Bias 0.001 0.001 0.001 0.007 0.001 0.014 0.002 0.018
        SD 0.015 0.141 0.063 0.212 0.015 0.125 0.066 0.202
10 (20%, 20%) 1.5 1 Bias 0.001 0.006 0.002 0.024 0.001 0.018 0.002 0.027
        SD 0.015 0.119 0.054 0.236 0.015 0.159 0.056 0.263
11 (20%, 20%) 1 0.5 Bias 0.001 0.002 0.002 0.019 0.001 0.023 0.002 0.052
        SD 0.015 0.195 0.061 0.139 0.015 0.210 0.063 0.161
12 (20%, 20%) 1 1.5 Bias 0.001 0.005 0.001 0.017 0.001 0.005 0.001 0.024
        SD 0.015 0.097 0.062 0.247 0.015 0.097 0.062 0.229
13 (0, 50%) 0.1 1 Bias 0.010 0.115 0.002 0.065 0.001 0.130 0.032 0.252
        SD 0.032 0.296 0.413 0.478 0.028 0.410 0.500 0.447
14 (0, 50%) 0.5 1 Bias 0.008 0.003 0.046 0.004 0.001 0.001 0.029 0.135
        SD 0.033 0.211 0.119 0.317 0.029 0.218 0.123 0.351
15 (0, 50%) 1 1 Bias 0.009 0.001 0.022 0.044 0.002 0.002 0.008 0.251
        SD 0.037 0.270 0.075 0.479 0.027 0.245 0.091 0.667
16 (0, 50%) 1.5 1 Bias 0.005 0.016 0.006 0.045 0.006 0.128 0.001 0.319
        SD 0.035 0.216 0.085 0.591 0.032 0.358 0.092 0.868
17 (0, 50%) 1 0.5 Bias 0.002 0.003 0.014 0.108 0.009 0.023 0.003 0.399
        SD 0.038 0.367 0.083 0.438 0.034 0.363 0.107 0.622
18 (0, 50%) 1 1.5 Bias 0.005 0.002 0.006 0.027 0.001 0.020 0.005 0.084
        SD 0.025 0.157 0.077 0.440 0.025 0.134 0.088 0.655
19 (0, 30%) 0.1 1 Bias 0.004 0.072 0.045 0.034 0.001 0.120 0.053 0.073
        SD 0.023 0.282 0.366 0.392 0.018 0.294 0.436 0.400
20 (0, 30%) 0.5 1 Bias 0.001 0.003 0.020 0.028 0.001 0.001 0.023 0.057
        SD 0.019 0.089 0.091 0.199 0.019 0.090 0.101 0.207
21 (0, 30%) 1 1 Bias 0.001 0.008 0.003 0.045 0.001 0.007 0.005 0.082
        SD 0.020 0.167 0.069 0.284 0.019 0.149 0.068 0.276
22 (0, 30%) 1.5 1 Bias 0.001 0.001 0.003 0.018 0.001 0.040 0.008 0.075
        SD 0.020 0.168 0.062 0.349 0.019 0.231 0.069 0.454
23 (0, 30%) 1 0.5 Bias 0.003 0.007 0.012 0.047 0.001 0.008 0.012 0.078
        SD 0.022 0.289 0.062 0.221 0.020 0.265 0.074 0.217
24 (0, 30%) 1 1.5 Bias 0.001 0.001 0.005 0.016 0.001 0.005 0.009 0.088
        SD 0.017 0.134 0.069 0.301 0.017 0.098 0.070 0.416

Table 3. Results of the estimation with β=1, p=0.8, τ=1, θ{0.5,1.5} and n{200,400,800}. PMLE denotes the pseudo-maximum-likelihood method with q=1, and CL denotes the level of right and left censoring.

  PMLE MLE
Con. CL θ n   p^ τ^ β^ θ^ p^ τ^ β^ θ^
5 (30%, 30%) 0.5 200 Bias 0.001 0.046 0.081 0.127 0.006 0.055 0.035 0.586
        SD 0.035 0.159 0.206 0.378 0.037 0.439 0.172 0.730
      400 Bias 0.004 0.011 0.031 0.105 0.001 0.020 0.026 0.214
        SD 0.028 0.139 0.158 0.293 0.028 0.353 0.112 0.364
      800 Bias 0.002 0.006 0.019 0.038 0.002 0.005 0.014 0.105
        SD 0.024 0.337 0.063 0.237 0.021 0.120 0.084 0.337
6 (30%, 30%) 1.5 200 Bias 0.001 0.003 0.002 0.089 0.002 0.021 0.005 0.497
        SD 0.036 0.215 0.116 0.433 0.037 0.199 0.120 0.775
      400 Bias 0.002 0.005 0.001 0.052 0.001 0.012 0.005 0.276
        SD 0.029 0.167 0.088 0.387 0.027 0.157 0.093 0.593
      800 Bias 0.001 0.001 0.003 0.010 0.001 0.011 0.004 0.040
        SD 0.019 0.119 0.066 0.349 0.019 0.071 0.062 0.411
11 (20%, 20%) 0.5 200 Bias 0.002 0.013 0.012 0.099 0.003 0.071 0.017 0.282
        SD 0.027 0.372 0.095 0.279 0.028 0.443 0.118 0.504
      400 Bias 0.001 0.024 0.003 0.043 0.001 0.069 0.003 0.122
        SD 0.022 0.313 0.078 0.219 0.022 0.348 0.081 0.290
      800 Bias 0.001 0.002 0.002 0.019 0.001 0.023 0.002 0.052
        SD 0.015 0.195 0.061 0.139 0.015 0.210 0.063 0.161
12 (20%, 20%) 1.5 200 Bias 0.001 0.014 0.005 0.035 0.002 0.057 0.014 0.318
        SD 0.027 0.203 0.100 0.413 0.027 0.203 0.107 0.579
      400 Bias 0.001 0.005 0.006 0.020 0.002 0.019 0.007 0.058
        SD 0.022 0.134 0.077 0.340 0.022 0.113 0.082 0.364
      800 Bias 0.001 0.005 0.001 0.017 0.001 0.005 0.001 0.024
        SD 0.015 0.097 0.062 0.247 0.015 0.097 0.062 0.229
17 (0, 50%) 0.5 200 Bias 0.012 0.012 0.060 0.395 0.006 0.005 0.048 1.013
        SD 0.053 0.451 0.167 0.657 0.052 0.471 0.208 1.075
      400 Bias 0.012 0.026 0.045 0.185 0.005 0.016 0.026 0.636
        SD 0.048 0.451 0.105 0.512 0.038 0.416 0.135 0.856
      800 Bias 0.002 0.003 0.014 0.108 0.009 0.023 0.003 0.399
        SD 0.038 0.367 0.083 0.438 0.034 0.363 0.107 0.622
18 (0, 50%) 1.5 200 Bias 0.007 0.009 0.001 0.034 0.001 0.013 0.011 0.598
        SD 0.051 0.297 0.148 0.592 0.045 0.318 0.181 0.935
      400 Bias 0.009 0.002 0.014 0.029 0.002 0.002 0.018 0.386
        SD 0.042 0.270 0.108 0.541 0.036 0.256 0.127 0.837
      800 Bias 0.005 0.002 0.006 0.027 0.001 0.020 0.005 0.084
        SD 0.025 0.157 0.077 0.440 0.025 0.080 0.088 0.655
23 (0, 30%) 0.5 200 Bias 0.004 0.006 0.030 0.232 0.002 0.109 0.027 0.492
        SD 0.038 0.356 0.109 0.434 0.034 0.414 0.137 0.614
      400 Bias 0.003 0.024 0.014 0.145 0.001 0.036 0.018 0.311
        SD 0.027 0.349 0.094 0.368 0.026 0.394 0.121 0.492
      800 Bias 0.003 0.007 0.012 0.047 0.001 0.008 0.012 0.078
        SD 0.022 0.289 0.062 0.221 0.020 0.265 0.074 0.217
24 (0, 30%) 1.5 200 Bias 0.002 0.015 0.004 0.023 0.001 0.019 0.006 0.351
        SD 0.035 0.260 0.128 0.361 0.034 0.267 0.136 0.748
      400 Bias 0.001 0.008 0.002 0.016 0.001 0.013 0.003 0.180
        SD 0.026 0.158 0.095 0.383 0.026 0.164 0.104 0.540
      800 Bias 0.001 0.001 0.005 0.016 0.001 0.005 0.009 0.016
        SD 0.017 0.134 0.069 0.301 0.017 0.098 0.070 0.310

To assess the power of the proposed change-point test statistics, we examine the percentage of simulated trials in which one change point is detected. We consider samples of size 800, β = 1, p = 0.8, θ ∈ {0, 0.1, 0.2, 0.3, 0.4, 0.5}, τ ∈ {0.1, 0.5, 1, 1.5}, and a left and right censoring level of (20%, 20%). The number of equally spaced points in the interval [τ1, τ2] is 500. The results are listed in Table 4. The powers of the two statistics are larger for a bigger jump size and smaller when the change-point location is close to 0 or tends to infinity. For each configuration, the power of RMF1 is slightly larger than that of RMF2. For the configuration θ = 0, the empirical type I error rates are 5% and 4.8% for RMF1 and RMF2, respectively, showing that our methods maintain the overall type I error of α = 0.05 when the true model has no change point. Overall, the results demonstrate that our methods perform well in identifying the true model and estimating the parameters.

Table 4. Size and powers with one or no change point.

                        θ
τ            0      0.1    0.2    0.3    0.4    0.5
0.1   RMF1   5.0%  12.8%  20.6%  36.4%  48.6%  60.4%
      RMF2   4.8%  11.0%  20.0%  30.8%  40.2%  48.0%
0.5   RMF1   5.0%  22.6%  64.0%  88.4%  98.2%  100%
      RMF2   4.8%  26.0%  54.4%  76.0%  92.2%  100%
1     RMF1   5.0%  22.0%  56.0%  84.4%  96.6%  100%
      RMF2   4.8%  24.4%  50.2%  84.2%  94.8%  100%
1.5   RMF1   5.0%  16.4%  32.4%  64.4%  84.8%  90.4%
      RMF2   4.8%  16.8%  32.0%  56.6%  82.6%  86.0%
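The Monte Carlo power estimates in Table 4 can be sketched as follows. This is an illustrative outline, not the authors' code: the test statistic (RMF1 or RMF2, whose construction is given in Section 4) is treated as a user-supplied callable, and event times for susceptible subjects are drawn by inverse transform from the piecewise-constant hazard β + θ·I(t > τ).

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_power(test_statistic, critical_value, beta=1.0, theta=0.3,
                   tau=0.5, p=0.8, n=800, n_rep=500):
    """Estimate empirical power by Monte Carlo: the fraction of simulated
    trials in which the (user-supplied) statistic exceeds its null critical
    value.  `test_statistic` is a placeholder for RMF1 or RMF2."""
    rejections = 0
    for _ in range(n_rep):
        # Susceptible subjects (probability p) fail with hazard
        # beta + theta*I(t > tau); the rest are cured (never fail).
        susceptible = rng.random(n) < p
        t = -np.log(rng.random(n)) / beta        # exponential(beta) draws
        jump = t > tau                           # crossed the change point:
        t[jump] = tau + (t[jump] - tau) * beta / (beta + theta)
        t[~susceptible] = np.inf                 # cured fraction
        if test_statistic(t) > critical_value:
            rejections += 1
    return rejections / n_rep
```

With θ=0 the rejection rate estimates the size of the test; with θ>0 it estimates the power at that jump.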

7. Real data analysis

We are interested in determining whether a change point exists in the hazard function, whether the cure model is appropriate, and in estimating the location of the change point together with the other model parameters.

7.1. The liver cancer data

The proposed method is applied to the 2008–2013 Iowa liver cancer data set from the Surveillance, Epidemiology, and End Results (SEER) cancer incidence public-use database. Observations with missing values for any of the variables are excluded from this study, as are patients who were lost to follow-up after diagnosis or who died immediately after diagnosis. The final analytic data set includes 546 patients, of whom 19.6% are right-censored and the remainder interval-censored. As discussed in Section 1, we calculate the survival time T in months as the difference between the last contact date and the diagnosis date, rounded down to whole months. We treat the resulting observations as interval-censored data and analyze them with the change-point model.
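The interval-censored representation above can be sketched as follows; since the survival time is rounded down to whole months, an observed event at T months is only known to lie in the interval (T, T+1], while a right-censored subject contributes (T, ∞). The function and argument names here are illustrative, not taken from SEER.

```python
import math

def month_interval(observed_months, is_right_censored):
    """Map a survival time rounded down to whole months to a censoring
    interval (L, R]: an event observed at month T lies in (T, T+1];
    a right-censored subject has R = infinity."""
    t = math.floor(observed_months)
    if is_right_censored:
        return (t, math.inf)
    return (t, t + 1)
```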

Table 5 records the 5-year relative survival probability for liver cancer; with the development of medical treatment, the improvement in the 5-year relative survival probability is evident. In the liver cancer data, 107 patients were still alive at the last visit, and the 95% confidence interval of p^ is (0.926, 0.971). Hence, it is appropriate to allow for the presence of long-term survivors, and our suggested model (4) is suitable for the analysis of these data. Based on the change-point model (4), the MLE of the parameters is (p^,τ^,β^,θ^)=(0.954, 55, 0.0065, 0.0256), which implies that the hazard function has a change point around 55 months with a jump of 0.0256.

Table 5. Liver cancer 5-year relative survival probability.

Year                                  1975  1980  1985  1990  1995  2000   2005   2010
5-year relative survival probability  3.0%  3.2%  7.0%  5.3%  5.7%  11.7%  16.8%  16.8%
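The fitted change-point cure model for the liver cancer data can be evaluated as below. This is a sketch under the assumption that model (4) takes the standard mixture-cure form S(t) = (1 − p) + p·exp{−Λ(t)} with Λ(t) = βt + θ·max(t − τ, 0) and p the susceptible proportion; the defaults are the reported liver cancer estimates.

```python
import math

def cure_survival(t, p=0.954, tau=55.0, beta=0.0065, theta=0.0256):
    """Population survival under an assumed mixture cure model with a
    single change-point hazard: S(t) = (1 - p) + p * exp(-Lambda(t)),
    Lambda(t) = beta*t + theta*max(t - tau, 0).  As t grows, S(t)
    approaches the cure fraction 1 - p."""
    cum_hazard = beta * t + theta * max(t - tau, 0.0)
    return (1.0 - p) + p * math.exp(-cum_hazard)
```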

Figure 1 shows two modified cumulative hazard functions. The broken line is generated by the estimated model (4), and the irregular line is the NPMLE. In the figure, the slope of the irregular line changes around 50–60, which gives the rough range of the change point, and the kink in the broken line is at 55, the change point obtained from model (4). Table 6 records the estimates of each parameter. Since the asymptotic variance of μ^ is often intractable, we resort to the bootstrap method proposed in [4] to produce the standard deviations (SD), based on 1000 repetitions. The results in Table 6 put p at around 0.95, indicating a proportion of 5% long-term survivors. The change point τ is estimated as 55 months, with an estimated jump of about 0.0256 in the hazard rate. We fitted the data using the survival cure model (4) and applied the test procedure in Section 4 to determine whether there is an abrupt change in the recurrence rate of liver cancer. We obtain RMF1=265.502 and RMF2=135.979. Under the null hypothesis, the 95% quantiles of RMF1 and RMF2 are 254.478 and 70.339, respectively. Hence, there is overwhelming evidence to reject the null hypothesis H0 and conclude that a change point does exist for these data.
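The bootstrap standard deviations reported in Tables 6 and 7 can be reproduced in outline as follows. The model-fitting routine of Section 3 is represented by an assumed callable `fit`, not implemented here; only the resampling-and-refitting loop is shown.

```python
import numpy as np

def bootstrap_sd(data, fit, n_boot=1000, seed=0):
    """Nonparametric bootstrap SD: resample subjects with replacement,
    refit the model on each replicate, and report the standard deviation
    of each parameter estimate across the n_boot replicates.  `fit`
    stands in for the pseudo-maximum-likelihood estimator."""
    rng = np.random.default_rng(seed)
    n = len(data)
    estimates = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample with replacement
        estimates.append(fit([data[i] for i in idx]))
    return np.std(np.asarray(estimates), axis=0, ddof=1)
```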

Figure 1.


Modified cumulative hazard function for liver cancer data where the broken line is the result of the estimation of change point model and the irregular line is the result of NPMLE.

Table 6. Estimates for liver cancer data.

p^      SD      τ^      SD      β^      SD        θ^      SD
0.954   0.0130  55.00   0.274   0.0065  0.000637  0.0256  0.00179

7.2. The breast cancer data

The breast cancer data set is described in [5]. It records the time to cosmetic deterioration (breast retraction) for women with Stage 1 breast cancer who underwent radiotherapy. The data come from a retrospective study of 94 patients who received radiation. Each woman made a series of visits to a clinician, who determined whether or not retraction had occurred. If it had, the event time was known only to lie between the times of the last and present visits. In total, 40.6% of the observations were right-censored and 5% were left-censored.

By the bootstrap procedure described in Section 5 and (8), the 95% confidence interval of p^ is (0.866, 0.901); hence the cure model (4) is suitable for these data. Our estimates of (τ,β,θ,p), together with their sample standard deviations, are shown in Table 7. We apply the test procedures in Section 4 to determine whether there is a change point, obtaining RMF1=10.157 and RMF2=5.878. The 95% quantiles of RMF1 and RMF2 under the null hypothesis are 6.804 and 3.987, respectively. Hence, there is strong evidence to reject the null hypothesis H0 and conclude that a change point does exist for the data.

Table 7. Estimates for breast cancer data.

p^      SD      τ^      SD     β^     SD     θ^     SD
0.882   0.0170  16.00   0.074  0.017  0.004  0.028  0.010

8. Discussion

In this paper, we develop a pseudo-maximum-likelihood method for the single change-point hazard model for interval-censored data with a cure fraction. To guarantee the consistency of the uncured-rate estimate, we apply the ICM method to compute the NPMLE. Compared with [3], our approach yields a smaller bias in the change-size estimate. The simulations and two real data examples illustrate that the proposed method can effectively handle interval-censored data with a cure fraction.

In future work, we will focus on the multiple change-point hazard model. There is some existing research on this model, such as [8,20]; however, it does not consider interval-censored data.

Supplementary Material

supplementarydata.pdf

Funding Statement

The research work of Wang is supported by the National Natural Science Foundation of China under grant 11471065. The research work of Song is supported by the National Natural Science Foundation of China under grants 11371077 and 61175041.

Disclosure statement

No potential conflict of interest was reported by the authors.

References

  • 1.Berkson J. and Gage R.P., Survival curve for cancer patients following treatment, J. Am. Stat. Assoc. 47 (1952), pp. 501–515. doi: 10.1080/01621459.1952.10501187 [DOI] [Google Scholar]
  • 2.Chang I.S., Chen C.H., and Hsiung C.A., Estimation in change-point hazard rate models with random censorship, Lecture Notes Monogr. Ser. 23 (1994), pp. 78–92. doi: 10.1214/lnms/1215463115 [DOI] [Google Scholar]
  • 3.Dupuy J.F., Estimation in a change-point hazard regression model, Stat. Probab. Lett. 76 (2006), pp. 182–190. doi: 10.1016/j.spl.2005.07.013 [DOI] [Google Scholar]
  • 4.Efron B., Bootstrap methods: another look at the jackknife, Ann. Stat. 7 (1979), pp. 1–26. doi: 10.1214/aos/1176344552 [DOI] [Google Scholar]
  • 5.Finkelstein D.M. and Wolfe R.A., A semiparametric model for regression analysis of interval-censored failure time data, Biometrics 41 (1985), pp. 933–945. doi: 10.2307/2530965 [DOI] [PubMed] [Google Scholar]
  • 6.Gijbels I. and Gürler Ü., Estimation of a change point in a hazard function based on censored data, Lifetime Data Anal. 9 (2003), pp. 395–411. doi: 10.1023/B:LIDA.0000012424.71723.9d [DOI] [PubMed] [Google Scholar]
  • 7.Gong G. and Samaniego F.J., Pseudo maximum likelihood estimation: theory and applications, Ann. Stat. 9 (1981), pp. 861–869. doi: 10.1214/aos/1176345526 [DOI] [Google Scholar]
  • 8.Goodman M.S., Li Y., and Tiwari R.C., Detecting multiple change points in piecewise constant hazard functions, J. Appl. Stat. 38 (2011), pp. 2523–2532. PMID: 22707842. doi: 10.1080/02664763.2011.559209 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Groeneboom P. and Wellner J., Information Bounds and Nonparametric Maximum Likelihood Estimation, Oberwolfach Seminars. Birkhäuser, Basel, 1992. [Google Scholar]
  • 10.Henderson R., A problem with the likelihood ratio test for a change-point hazard rate model, Biometrika 77 (1990), pp. 835–843. doi: 10.1093/biomet/77.4.835 [DOI] [Google Scholar]
  • 11.Hu H., Large sample theory for pseudo-maximum likelihood estimates in semiparametric models, Ph.D. diss., University of Washington, 1998.
  • 12.Huang J., Efficient estimation for the proportional hazards model with interval censoring, Ann. Stat. 24 (1996), pp. 540–568. doi: 10.1214/aos/1032894452 [DOI] [Google Scholar]
  • 13.Loader C.R., Inference for a hazard rate change point, Biometrika 78 (1991), pp. 749–757. doi: 10.1093/biomet/78.4.749 [DOI] [Google Scholar]
  • 14.Maller R.A. and Zhou X., Survival Analysis with Long-term Survivors, Wiley, New York, 1996. [Google Scholar]
  • 15.Matthews D.E. and Farewell V.T., On testing for a constant hazard against a change-point alternative, Biometrics 38 (1982), pp. 463–468. doi: 10.2307/2530460 [DOI] [PubMed] [Google Scholar]
  • 16.Matthews D.E., Farewell V.T., and Pyke R., Asymptotic score-statistic processes and tests for constant hazard against a change-point alternative, Ann. Stat. 13 (1985), pp. 583–591. doi: 10.1214/aos/1176349540 [DOI] [Google Scholar]
  • 17.Nguyen H.T., Rogers G.S., and Walker E.A., Estimation in change-point hazard rate models, Biometrika 71 (1984), pp. 299–304. doi: 10.1093/biomet/71.2.299 [DOI] [Google Scholar]
  • 18.Pan W., Extending the iterative convex minorant algorithm to the cox model for interval-censored data, J. Comput. Graph. Stat. 8 (1999), pp. 109–120. [Google Scholar]
  • 19.Peng Y. and Dear K.B.G., A nonparametric mixture model for cure rate estimation, Biometrics 56 (2000), pp. 237–243. doi: 10.1111/j.0006-341X.2000.00237.x [DOI] [PubMed] [Google Scholar]
  • 20.Qian L., Zhang W., Multiple change-point detection in piecewise exponential hazard regression models with long-term survivors and right censoring, in Contemporary Developments in Statistical Theory: A Festschrift for Hira Lal Koul, Springer International Publishing, Cham, 2014, pp. 289–304.
  • 21.Qin J. and Sun J., Statistical analysis of right-censored failure-time data with partially specified hazard rates, Can. J. Stat. 25 (1997), pp. 325–336. doi: 10.2307/3315782 [DOI] [Google Scholar]
  • 22.Sun J., Variance estimation of a survival function for interval-censored survival data, Stat. Med. 20 (2001), pp. 1249–1257. doi: 10.1002/sim.719 [DOI] [PubMed] [Google Scholar]
  • 23.Sun J., The Statistical Analysis of Interval-censored Failure Time Data, Springer, New York, 2006. [Google Scholar]
  • 24.Turnbull B.W., The empirical distribution function with arbitrarily grouped, censored and truncated data, J. R. Stat. Soc. Ser. B (Methodological) 38 (1976), pp. 290–295. [Google Scholar]
  • 25.Vexler A. and Hutson A., Statistics in the Health Sciences, Chapman and Hall/CRC, New York, 2018. [Google Scholar]
  • 26.Vexler A., Hutson A., and Chen X., Statistical Testing Strategies in the Health Sciences, Chapman and Hall/CRC, New York, 2016. [Google Scholar]
  • 27.Zhang J. and Peng Y., A new estimation method for the semiparametric accelerated failure time mixture cure model, Stat. Med. 26 (2007), pp. 3157–3171. doi: 10.1002/sim.2748 [DOI] [PubMed] [Google Scholar]
  • 28.Zhao X., Wu X., and Zhou X., A change-point model for survival data with long-term survivors, Stat. Sin. 19 (2009), pp. 377–390. [Google Scholar]
  • 29.Zhou J., Zhang J., McLain A.C., and Cai B., A multiple imputation approach for semiparametric cure model with interval censored data, Comput. Stat. Data. Anal. 99 (2016), pp. 105–114. doi: 10.1016/j.csda.2016.01.013 [DOI] [Google Scholar]
