Journal of Applied Statistics. 2020 Jun 25;48(8):1429–1441. doi: 10.1080/02664763.2020.1784854

Bayesian analysis of the Box-Cox transformation model based on left-truncated and right-censored data

Chunjie Wang 1, Jingjing Jiang 1, Linlin Luo 1, Shuying Wang 1
PMCID: PMC9041913  PMID: 35706470

Abstract

In this paper, we discuss inference for the Box-Cox transformation model when one faces left-truncated and right-censored data, which often occur in studies involving, for example, a cross-sectional sampling scheme. It is well known that the Box-Cox transformation model includes many commonly used models as special cases, such as the proportional hazards model and the additive hazards model. For inference, a Bayesian estimation approach is proposed in which a piecewise constant function is used to approximate the baseline hazard function. A conditional-marginal prior, whose marginal part is free of any constraints, is employed to deal with the computational challenges caused by the constraints on the parameters, and an MCMC sampling procedure is developed. A simulation study is conducted to assess the finite-sample performance of the proposed method and indicates that it works well in practical situations. We apply the approach to a set of data arising from a retirement center.

Keywords: Left-truncated and right-censored data, additive hazards model, proportional hazards model, Bayesian, MCMC sampling

2010 Mathematics Subject Classifications: 62N02, 62N01

1. Introduction

In this paper, we discuss inference for the Box-Cox transformation model when one faces left-truncated and right-censored data. It is well known that the Box-Cox transformation model includes many commonly used models as special cases, such as the proportional hazards model and the additive hazards model. Left-truncated and right-censored failure time data occur in many areas, including medicine, economics, engineering, sociology and marketing [10,20,24]. Such data arise especially when a cross-sectional sampling scheme is involved.

One specific example of left-truncated and right-censored data is given by a well-known dementia study that recruited about 10,000 Canadians over the age of 65, who were screened for dementia and followed for their onset date of dementia and the subsequent time of death. Left truncation occurs because dementia had already occurred at recruitment for some subjects. The well-known SEER database on the natural history of lung cancer provides another example of left-truncated and right-censored data, since only the individuals diagnosed with lung cancer and still alive at the recruitment time were eligible for inclusion. It is well known that with left truncation, the observed failure times tend to be longer than those generated by the underlying distribution of the general population, and thus special methods are needed to take this into account in the analysis.

A great deal of literature exists on the analysis of left-truncated failure time data and of length-biased data, a special case of left-truncated data, and most of the existing methods can be classified into two types. One consists of the approaches that condition on the left truncation times [15,21,25,29] and the other of the unconditional approaches [8,26,27]. More specifically, among others, Turnbull [25] and Sun [21] discussed nonparametric estimation of a survival function, and [1,28] considered the regression analysis under the proportional hazards censored model. One can also find some parametric methods in Balakrishnan and Mitra [2–4] and other semiparametric estimation procedures in Ning et al. [17–19,23]. However, all of the methods above are frequentist methods, and in the following, we propose a Bayesian estimation procedure.

The rest of this paper is organized as follows. In Section 2, we first introduce the notation and model that will be used throughout the paper as well as the structure of the observed data, and then describe the resulting likelihood function and the prior to be used. In particular, the failure time of interest will be assumed to follow the Box-Cox transformation model with a piecewise constant baseline hazard function. The Bayesian estimation procedure is developed in Section 3, where the ARMS algorithm is used to deal with the complexity of the posterior distribution. In Section 4, a simulation study is performed and indicates that the proposed method works well in practical situations. The method is applied to the data arising from a retirement center in Section 5, and some conclusions are given in Section 6.

2. Notation, model and assumptions

Consider a failure time study that may involve left truncation, meaning that the failure time of interest is observed only if the study subject experiences some event. Let $T$ and $A$ denote the failure time of interest and the left truncation time, respectively. Then we have $T \ge A$; that is, only the subjects with $A \le T$ can be included in the study. A general set-up for this is that $T$ represents the time from some initial event to the failure event of interest and $A$ the time from the same initial event to some cut-off event or the study enrollment time. Of course, a failure time study usually also involves a right-censoring time, denoted by $C$, and a vector of covariates, denoted by $Z$, which does not depend on time. Suppose that the study involves $n$ independent subjects and define $y_i=\min(T_i,C_i)$, the observed time, and $\delta_i=I(T_i\le C_i)$, the censoring indicator. Then the observed data have the form $D=\{D_i=(y_i,A_i,Z_i,\delta_i),\ i=1,\ldots,n\}$.
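To make the data structure concrete, the following minimal Python sketch builds one observed record from latent times. The names `Observation` and `observe` are hypothetical and introduced only for illustration; covariates are omitted for brevity.

```python
from typing import NamedTuple, Optional


class Observation(NamedTuple):
    y: float     # observed time, min(T, C)
    A: float     # left-truncation time
    delta: int   # 1 if the failure is observed (T <= C), 0 if censored


def observe(T: float, C: float, A: float) -> Optional[Observation]:
    """Return the observed record, or None when the subject is
    truncated out of the sample (T < A)."""
    if T < A:
        return None
    return Observation(y=min(T, C), A=A, delta=int(T <= C))
```

A subject with `T < A` never enters the data set, which is what distinguishes truncation from censoring: a censored subject still contributes a record with `delta = 0`.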

For the description of the covariate effect on T, the most commonly used model is perhaps the Cox proportional hazards model [5] given by

$$\lambda(t\mid Z)=\lambda_0(t)\exp\{\beta' Z\}. \tag{1}$$

Another commonly used model is the additive hazards model [16] with the form

$$\lambda(t\mid Z)=\lambda_0(t)+\beta' Z. \tag{2}$$

It is well-known that sometimes the models above may be restrictive, and in this paper, we consider the Box-Cox transformation hazard model [30] given by

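$$\phi\{\lambda(t\mid Z_i)\}=\phi\{\lambda_0(t)\}+\beta' Z_i, \tag{3}$$

where $\phi(\cdot)$ is a known link function having the form

$$\phi(y)=\begin{cases}(y^{\gamma}-1)/\gamma, & \gamma\neq 0,\\ \log(y), & \gamma=0.\end{cases}$$

It is easy to see that model (3) can be rewritten as

$$\lambda(t\mid Z_i)=[\lambda_0(t)^{\gamma}+\gamma\beta' Z_i]^{1/\gamma}, \tag{4}$$

and the model above reduces to model (1) as $\gamma\to 0$, and to model (2) when $\gamma=1$.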
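The two special cases can be checked numerically. The sketch below evaluates the transformed hazard for a given baseline value; the function name `transformed_hazard` and the argument `eta` (standing for the linear predictor β′Z) are illustrative choices, not notation from the paper.

```python
import math


def transformed_hazard(lam0: float, eta: float, gamma: float) -> float:
    """Hazard [lam0**gamma + gamma*eta]**(1/gamma), with eta = beta'Z.

    gamma == 0 is treated as the limiting case lam0 * exp(eta).
    """
    if gamma == 0.0:
        return lam0 * math.exp(eta)
    base = lam0 ** gamma + gamma * eta
    if base < 0:
        # The hazard must be non-negative, which restricts (beta, lambda).
        raise ValueError("lam0**gamma + gamma*eta must be non-negative")
    return base ** (1.0 / gamma)
```

With `gamma = 1` the function returns `lam0 + eta` (additive hazards), and for very small positive `gamma` it approaches `lam0 * exp(eta)` (proportional hazards).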

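To derive the likelihood function, by following Ibrahim, Chen and Sinha [12], we will assume that $\lambda_0(t)$ is a piecewise constant function. More specifically, let $0<s_1<\cdots<s_K$ denote a partition of the time axis (with $s_0=0$) and assume that $\lambda_0(y)=\lambda_k$ for $y\in(s_{k-1},s_k]$, $k=1,\ldots,K$. Define $\omega_{ik}=1$ if subject $i$ fails or is censored in interval $k$ and 0 otherwise, and $\nu_{ik}=1$ if the truncation time of subject $i$ falls into the $k$th interval and 0 otherwise. Then for the $i$th subject, we have the hazard function $\lambda_k(t\mid Z_i)=[\lambda_k^{\gamma}+\gamma\beta' Z_i]^{\omega_{ik}/\gamma}$ for $t$ in the $k$th interval, and the likelihood function has the form

$$
\begin{aligned}
L(\beta,\lambda\mid D)
&=\prod_{i=1}^{n}\prod_{k=1}^{K} f(y_i\mid Z_i)^{\delta_i}\,S(y_i\mid Z_i)^{1-\delta_i}\,\frac{1}{S(A_i)}\\
&=\prod_{i=1}^{n}\prod_{k=1}^{K} \lambda(y_i\mid Z_i)^{\delta_i}\,S(y_i\mid Z_i)\,\frac{1}{S(A_i)}\\
&=\prod_{i=1}^{n}\prod_{k=1}^{K}(\lambda_k^{\gamma}+\gamma\beta' Z_i)^{\delta_i\omega_{ik}/\gamma}
\exp\Big\{-\omega_{ik}\Big\{(\lambda_k^{\gamma}+\gamma\beta' Z_i)^{1/\gamma}(y_i-s_{k-1})+\sum_{g=1}^{k-1}(\lambda_g^{\gamma}+\gamma\beta' Z_i)^{1/\gamma}(s_g-s_{g-1})\Big\}\Big\}\\
&\quad\times\exp\Big\{\nu_{ik}\Big\{(\lambda_k^{\gamma}+\gamma\beta' Z_i)^{1/\gamma}(A_i-s_{k-1})+\sum_{g=1}^{k-1}(\lambda_g^{\gamma}+\gamma\beta' Z_i)^{1/\gamma}(s_g-s_{g-1})\Big\}\Big\}.
\end{aligned}
$$

In the above, $y=(y_1,\ldots,y_n)$, $A=(A_1,\ldots,A_n)$, and $\delta=(\delta_1,\ldots,\delta_n)$. We also assume that $A_i=0$ if there is no left truncation.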
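The likelihood above can be sketched in Python as a log-likelihood that adds the log-hazard at an observed failure, subtracts the cumulative hazard up to the observed time, and adds back the cumulative hazard up to the truncation time. This is a minimal sketch assuming γ > 0, observed times in (0, s_K], and a precomputed linear predictor `eta[i]` = β′Z_i; the function names are hypothetical.

```python
import math
from bisect import bisect_left


def hazard(lam_k, eta, gamma):
    # Hazard on one interval: (lam_k**gamma + gamma*eta)**(1/gamma), gamma > 0.
    return (lam_k ** gamma + gamma * eta) ** (1.0 / gamma)


def cum_hazard(t, eta, lam, s, gamma):
    """Cumulative hazard up to time t; s = [s_0 = 0, s_1, ..., s_K]."""
    total = 0.0
    for k, lam_k in enumerate(lam):
        lo, hi = s[k], s[k + 1]
        if t <= lo:
            break
        total += hazard(lam_k, eta, gamma) * (min(t, hi) - lo)
    return total


def log_lik(y, A, delta, eta, lam, s, gamma):
    """Log-likelihood for left-truncated, right-censored data."""
    ll = 0.0
    for y_i, a_i, d_i, eta_i in zip(y, A, delta, eta):
        k = bisect_left(s, y_i) - 1            # interval with s[k] < y_i <= s[k+1]
        ll += d_i * math.log(hazard(lam[k], eta_i, gamma))
        ll -= cum_hazard(y_i, eta_i, lam, s, gamma)   # log S(y_i | Z_i)
        ll += cum_hazard(a_i, eta_i, lam, s, gamma)   # -log S(A_i | Z_i)
    return ll
```

Setting `A[i] = 0` makes the truncation term vanish, which matches the convention used when a subject is not left-truncated.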

For the specification of the prior distribution of (β,λ), note that under model (3) or (4), we have the nonlinear constraints

$$\lambda_k^{\gamma}+\gamma\beta' Z_i\ge 0\quad (i=1,\ldots,n;\ k=1,\ldots,K) \tag{5}$$

due to the non-negative property of the risk function. To deal with this, one way is to specify an appropriately truncated joint prior distribution such as the truncated multivariate normal prior N(μ,Σ). This would lead to the prior distribution of the form

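$$\pi(\beta,\lambda)=\pi(\beta\mid\lambda)\,\pi(\lambda)\,I\{\lambda_k^{\gamma}+\gamma\beta' Z_i\ge 0,\ \text{for all } i,k\}.$$

Following this route, we would need to analytically compute the normalizing constant

$$c(\lambda)=\int\cdots\int_{\lambda_k^{\gamma}+\gamma\beta' Z_i\ge 0\ \text{for all } i,k}\exp\Big\{-\frac{1}{2}(\beta-\mu)'\Sigma^{-1}(\beta-\mu)\Big\}\,d\beta_1\cdots d\beta_p.$$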

Motivated by the discussion above, we propose the following joint prior for (β,λ)

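$$\pi(\beta,\lambda)=\pi(\beta_g\mid\beta_{(-g)},\lambda)\,\pi(\beta_{(-g)},\lambda)\times I\Big(\beta_g\ge-\min_{i,k}\Big\{\frac{\lambda_k^{\gamma}+\gamma\beta_{(-g)}' Z_{i(-g)}}{\gamma Z_{ig}}\Big\}\Big),$$

where the subscript $(-g)$ denotes the vector after the $g$th element is removed, that is, $Z_{(-g)}=(Z_1,\ldots,Z_{g-1},Z_{g+1},\ldots,Z_P)'$ and $\beta_{(-g)}=(\beta_1,\ldots,\beta_{g-1},\beta_{g+1},\ldots,\beta_P)'$. In consequence, $(\beta_g\mid\beta_{(-g)},\lambda)$ has the truncated normal distribution

$$\pi(\beta_g\mid\beta_{(-g)},\lambda)=c^{-1}(\beta_{(-g)},\lambda)\exp\Big(-\frac{\beta_g^{2}}{2\sigma_g^{2}}\Big)\times I\Big(\beta_g\ge-\min_{i,k}\Big\{\frac{\lambda_k^{\gamma}+\gamma\beta_{(-g)}' Z_{i(-g)}}{\gamma Z_{ig}}\Big\}\Big)$$

with the normalizing constant given by

$$c(\beta_{(-g)},\lambda)=\sqrt{2\pi}\,\sigma_g\Big[1-\Phi\Big(-\min_{i,k}\Big\{\frac{\lambda_k^{\gamma}+\gamma\beta_{(-g)}' Z_{i(-g)}}{\gamma Z_{ig}\sigma_g}\Big\}\Big)\Big]. \tag{6}$$

In the above, $\Phi(\cdot)$ denotes the cumulative distribution function of the standard normal distribution and $\sigma_g$ denotes the standard deviation of $\beta_g$. In the following, we will assume that the components of $\lambda$ are independent a priori, with each $\lambda_k\sim\mathrm{Gamma}(\varsigma,\xi)$, $k=1,\ldots,K$, and that $\beta_g$ and $\lambda$ are independent a priori. We can specify a normal prior distribution for each component of $\beta_{(-g)}$.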
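As an illustration of how the truncation point and the normalizing constant in (6) might be evaluated, here is a small Python sketch. It assumes γ > 0 and strictly positive values of the gth covariate (so that dividing the constraint by γZ_ig preserves the inequality); the argument `eta_minus_g[i]` stands for the linear predictor with the gth term removed. All names are hypothetical.

```python
import math


def std_normal_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def beta_g_lower_bound(lam, eta_minus_g, z_g, gamma):
    """Truncation point -min_{i,k} (lam_k**gamma + gamma*eta_i) / (gamma*Z_ig).

    Assumes gamma > 0 and every Z_ig > 0.
    """
    return -min(
        (lam_k ** gamma + gamma * e_i) / (gamma * z_ig)
        for e_i, z_ig in zip(eta_minus_g, z_g)
        for lam_k in lam
    )


def normalizing_constant(lower, sigma_g):
    """sqrt(2*pi) * sigma_g * [1 - Phi(lower / sigma_g)]."""
    return math.sqrt(2.0 * math.pi) * sigma_g * (1.0 - std_normal_cdf(lower / sigma_g))
```

Because the constant depends on λ and on the remaining regression coefficients, it has to be carried along in their full conditionals; having it in closed form is what keeps those conditionals tractable.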

3. Bayesian inference procedure

Now we develop the Bayesian inference procedure for (β,λ). Note that based on the assumption above, we have that the gth component of β has a truncated normal prior and the full conditional posterior of these parameters has the form

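$$
\begin{aligned}
\pi(\beta_g\mid\beta_{(-g)},\lambda,D)&\propto L(\beta,\lambda\mid D)\,\pi(\beta_g\mid\beta_{(-g)},\lambda),\\
\pi(\beta_l,\,l\neq g\mid\beta_{(-l)},\lambda,D)&\propto L(\beta,\lambda\mid D)\,\pi(\beta_l)\,c^{-1}(\beta_{(-g)},\lambda),
\end{aligned}
$$

and

$$\pi(\lambda_k\mid\beta,\lambda_{(-k)},D)\propto L(\beta,\lambda\mid D)\,\pi(\lambda_k)\,c^{-1}(\beta_{(-g)},\lambda),$$

where

$$\pi(\beta_l)\propto\exp\big(-\beta_l^{2}/2\sigma_l^{2}\big),\qquad \pi(\lambda_k)\propto\lambda_k^{\varsigma-1}\exp(-\xi\lambda_k).$$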

Gilks and Wild [7] showed that when the full conditional densities are log-concave, the adaptive rejection sampling (ARS) method can sample efficiently from a univariate log-concave distribution; ARS cannot, however, be used to sample from a distribution that is not log-concave. For such distributions, the Metropolis-Hastings (MH) algorithm can be used to update one parameter or a group of parameters at a time, but the chain may converge slowly and reject often unless the proposal density closely matches the shape of the full conditional density. Since ARS provides a way to adapt the proposal to the full conditional density, one can use it to create a good proposal density. A single MH step is then added to the ARS algorithm to create the adaptive rejection Metropolis sampling (ARMS) algorithm within the Gibbs chain.

Since the full conditional posterior distributions of the above parameters are not log-concave, one can employ the method proposed by Gilks et al. [6] to sample from them. Specifically, they considered full conditional distributions that are not log-concave and extended the ARS algorithm to include a Hastings-Metropolis step. In other words, when log-concavity fails, the ARMS algorithm can be used for sampling. In the following, we use the HI package in R to sample the parameters, with the specific sampling steps as follows.

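Step 0: Set the initial values of the parameters, $\beta_g^{(0)},\beta_l^{(0)},\lambda_k^{(0)}$; then, for iterations $q=1,\ldots,Q$ of the chain:

Step 1: Updating $\beta_g$ with the ARMS algorithm

$$\beta_g^{(q)}\sim\pi(\beta_g\mid\beta_{(-g)}^{(q-1)},\lambda^{(q-1)},D).$$

The posterior density function of $\beta_g$ is obtained as

$$
\begin{aligned}
\pi(\beta_g\mid\beta_{(-g)},\lambda,D)
&\propto L(\beta,\lambda\mid D)\,\pi(\beta_g\mid\beta_{(-g)},\lambda)\\
&\propto\prod_{i=1}^{n}\prod_{k=1}^{K}(\lambda_k^{\gamma}+\gamma\beta' Z_i)^{\delta_i\omega_{ik}/\gamma}
\exp\Big\{-\omega_{ik}\Big\{(\lambda_k^{\gamma}+\gamma\beta' Z_i)^{1/\gamma}(y_i-s_{k-1})+\sum_{g=1}^{k-1}(\lambda_g^{\gamma}+\gamma\beta' Z_i)^{1/\gamma}(s_g-s_{g-1})\Big\}\Big\}\\
&\quad\times\exp\Big\{\nu_{ik}\Big\{(\lambda_k^{\gamma}+\gamma\beta' Z_i)^{1/\gamma}(A_i-s_{k-1})+\sum_{g=1}^{k-1}(\lambda_g^{\gamma}+\gamma\beta' Z_i)^{1/\gamma}(s_g-s_{g-1})\Big\}\Big\}\\
&\quad\times\frac{\exp\Big(-\dfrac{\beta_g^{2}}{2\sigma_g^{2}}\Big)\,I\Big(\beta_g\ge-\min_{i,k}\Big\{\dfrac{\lambda_k^{\gamma}+\gamma\beta_{(-g)}' Z_{i(-g)}}{\gamma Z_{ig}}\Big\}\Big)}{\sqrt{2\pi}\,\sigma_g\Big[1-\Phi\Big(-\min_{i,k}\Big\{\dfrac{\lambda_k^{\gamma}+\gamma\beta_{(-g)}' Z_{i(-g)}}{\gamma Z_{ig}\sigma_g}\Big\}\Big)\Big]}.
\end{aligned}
$$

Furthermore, the logarithmic posterior density function is given as

$$
\begin{aligned}
\log\pi(\beta_g\mid\beta_{(-g)},\lambda,D)
&\propto\log L(\beta,\lambda\mid D)\,\pi(\beta_g\mid\beta_{(-g)},\lambda)\\
&\propto\sum_{i=1}^{n}\sum_{k=1}^{K}\Big[\frac{\delta_i\omega_{ik}}{\gamma}\log(\lambda_k^{\gamma}+\gamma\beta' Z_i)
-\omega_{ik}\Big\{(\lambda_k^{\gamma}+\gamma\beta' Z_i)^{1/\gamma}(y_i-s_{k-1})+\sum_{g=1}^{k-1}(\lambda_g^{\gamma}+\gamma\beta' Z_i)^{1/\gamma}(s_g-s_{g-1})\Big\}\\
&\qquad+\nu_{ik}\Big\{(\lambda_k^{\gamma}+\gamma\beta' Z_i)^{1/\gamma}(A_i-s_{k-1})+\sum_{g=1}^{k-1}(\lambda_g^{\gamma}+\gamma\beta' Z_i)^{1/\gamma}(s_g-s_{g-1})\Big\}\Big]\\
&\quad-\frac{\beta_g^{2}}{2\sigma_g^{2}}-\log\Big(1-\Phi\Big(-\min_{i,k}\Big\{\frac{\lambda_k^{\gamma}+\gamma\beta_{(-g)}' Z_{i(-g)}}{\gamma Z_{ig}\sigma_g}\Big\}\Big)\Big).
\end{aligned}
$$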

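Step 2: Updating $\beta_l$ with the ARMS algorithm

$$\beta_l^{(q)}\sim\pi(\beta_l,\,l\neq g\mid\beta_{(-l)}^{(q)},\lambda^{(q-1)},D).$$

The posterior density function of $\beta_l$ is obtained as

$$
\begin{aligned}
\pi(\beta_l,\,l\neq g\mid\beta_{(-l)},\lambda,D)
&\propto L(\beta,\lambda\mid D)\,\pi(\beta_l)\,c^{-1}(\beta_{(-g)},\lambda)\\
&\propto\prod_{i=1}^{n}\prod_{k=1}^{K}(\lambda_k^{\gamma}+\gamma\beta' Z_i)^{\delta_i\omega_{ik}/\gamma}
\exp\Big\{-\omega_{ik}\Big\{(\lambda_k^{\gamma}+\gamma\beta' Z_i)^{1/\gamma}(y_i-s_{k-1})+\sum_{g=1}^{k-1}(\lambda_g^{\gamma}+\gamma\beta' Z_i)^{1/\gamma}(s_g-s_{g-1})\Big\}\Big\}\\
&\quad\times\exp\Big\{\nu_{ik}\Big\{(\lambda_k^{\gamma}+\gamma\beta' Z_i)^{1/\gamma}(A_i-s_{k-1})+\sum_{g=1}^{k-1}(\lambda_g^{\gamma}+\gamma\beta' Z_i)^{1/\gamma}(s_g-s_{g-1})\Big\}\Big\}\\
&\quad\times\frac{\exp\Big(-\dfrac{\beta_l^{2}}{2\sigma_l^{2}}\Big)}{\sqrt{2\pi}\,\sigma_g\Big[1-\Phi\Big(-\min_{i,k}\Big\{\dfrac{\lambda_k^{\gamma}+\gamma\beta_{(-g)}' Z_{i(-g)}}{\gamma Z_{ig}\sigma_g}\Big\}\Big)\Big]}.
\end{aligned}
$$

The logarithmic posterior density function is

$$
\begin{aligned}
\log\pi(\beta_l,\,l\neq g\mid\beta_{(-l)},\lambda,D)
&\propto\log L(\beta,\lambda\mid D)\,\pi(\beta_l)\,c^{-1}(\beta_{(-g)},\lambda)\\
&\propto\sum_{i=1}^{n}\sum_{k=1}^{K}\Big[\frac{\delta_i\omega_{ik}}{\gamma}\log(\lambda_k^{\gamma}+\gamma\beta' Z_i)
-\omega_{ik}\Big\{(\lambda_k^{\gamma}+\gamma\beta' Z_i)^{1/\gamma}(y_i-s_{k-1})+\sum_{g=1}^{k-1}(\lambda_g^{\gamma}+\gamma\beta' Z_i)^{1/\gamma}(s_g-s_{g-1})\Big\}\\
&\qquad+\nu_{ik}\Big\{(\lambda_k^{\gamma}+\gamma\beta' Z_i)^{1/\gamma}(A_i-s_{k-1})+\sum_{g=1}^{k-1}(\lambda_g^{\gamma}+\gamma\beta' Z_i)^{1/\gamma}(s_g-s_{g-1})\Big\}\Big]\\
&\quad-\frac{\beta_l^{2}}{2\sigma_l^{2}}-\log\Big(1-\Phi\Big(-\min_{i,k}\Big\{\frac{\lambda_k^{\gamma}+\gamma\beta_{(-g)}' Z_{i(-g)}}{\gamma Z_{ig}\sigma_g}\Big\}\Big)\Big).
\end{aligned}
$$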

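Step 3: Updating $\lambda_s$ with the ARMS algorithm

$$\lambda_s^{(q)}\sim\pi(\lambda_s\mid\beta^{(q)},D).$$

The posterior density function of $\lambda_s$ is obtained as

$$
\begin{aligned}
\pi(\lambda_s\mid\beta,\lambda_{(-s)},D)
&\propto L(\beta,\lambda\mid D)\,\pi(\lambda_s)\,c^{-1}(\beta_{(-g)},\lambda)\\
&\propto\prod_{i=1}^{n}\prod_{k=1}^{K}(\lambda_k^{\gamma}+\gamma\beta' Z_i)^{\delta_i\omega_{ik}/\gamma}
\exp\Big\{-\omega_{ik}\Big\{(\lambda_k^{\gamma}+\gamma\beta' Z_i)^{1/\gamma}(y_i-s_{k-1})+\sum_{g=1}^{k-1}(\lambda_g^{\gamma}+\gamma\beta' Z_i)^{1/\gamma}(s_g-s_{g-1})\Big\}\Big\}\\
&\quad\times\exp\Big\{\nu_{ik}\Big\{(\lambda_k^{\gamma}+\gamma\beta' Z_i)^{1/\gamma}(A_i-s_{k-1})+\sum_{g=1}^{k-1}(\lambda_g^{\gamma}+\gamma\beta' Z_i)^{1/\gamma}(s_g-s_{g-1})\Big\}\Big\}\\
&\quad\times\frac{\lambda_s^{\varsigma-1}\exp(-\xi\lambda_s)}{\sqrt{2\pi}\,\sigma_g\Big[1-\Phi\Big(-\min_{i,s}\Big\{\dfrac{\lambda_s^{\gamma}+\gamma\beta_{(-g)}' Z_{i(-g)}}{\gamma Z_{ig}\sigma_g}\Big\}\Big)\Big]}.
\end{aligned}
$$

The logarithmic posterior density function is

$$
\begin{aligned}
\log\pi(\lambda_s\mid\beta,\lambda_{(-s)},D)
&\propto\log L(\beta,\lambda\mid D)\,\pi(\lambda_s)\,c^{-1}(\beta_{(-g)},\lambda)\\
&\propto\sum_{i=1}^{n}\sum_{k=1}^{K}\Big[\frac{\delta_i\omega_{ik}}{\gamma}\log(\lambda_k^{\gamma}+\gamma\beta' Z_i)
-\omega_{ik}\Big\{(\lambda_k^{\gamma}+\gamma\beta' Z_i)^{1/\gamma}(y_i-s_{k-1})+\sum_{g=1}^{k-1}(\lambda_g^{\gamma}+\gamma\beta' Z_i)^{1/\gamma}(s_g-s_{g-1})\Big\}\\
&\qquad+\nu_{ik}\Big\{(\lambda_k^{\gamma}+\gamma\beta' Z_i)^{1/\gamma}(A_i-s_{k-1})+\sum_{g=1}^{k-1}(\lambda_g^{\gamma}+\gamma\beta' Z_i)^{1/\gamma}(s_g-s_{g-1})\Big\}\Big]\\
&\quad+(\varsigma-1)\log\lambda_s-\xi\lambda_s-\log\Big(1-\Phi\Big(-\min_{i,s}\Big\{\frac{\lambda_s^{\gamma}+\gamma\beta_{(-g)}' Z_{i(-g)}}{\gamma Z_{ig}\sigma_g}\Big\}\Big)\Big).
\end{aligned}
$$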

Because the truncated prior has this specific form, $c^{-1}(\beta_{(-g)},\lambda)$ is available in closed form, which makes the full conditional posteriors of these parameters easier to handle. Remarkably, the posterior estimation is very robust with respect to the choice of $g$ in (6).
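The overall structure of Steps 0 to 3 can be sketched in Python as a coordinate-wise sampler. Note the substitution: this sketch uses a plain random-walk Metropolis update for each coordinate instead of ARMS, purely to keep the example short and dependency-free. The function `metropolis_within_gibbs`, its arguments, and the step size are hypothetical; `log_post` is any user-supplied log posterior that returns `-inf` outside the constraint region.

```python
import math
import random


def metropolis_within_gibbs(log_post, init, n_iter, step=0.5, seed=0):
    """Update each coordinate in turn with a random-walk Metropolis step.

    `init` must lie inside the support of the target (finite log_post).
    """
    rng = random.Random(seed)
    theta = list(init)
    cur = log_post(theta)
    draws = []
    for _ in range(n_iter):
        for j in range(len(theta)):
            prop = theta.copy()
            prop[j] += rng.gauss(0.0, step)
            new = log_post(prop)
            # Accept with probability min(1, exp(new - cur)); an infeasible
            # proposal has new = -inf and is therefore always rejected.
            if rng.random() < math.exp(min(0.0, new - cur)):
                theta, cur = prop, new
        draws.append(theta.copy())
    return draws
```

In use, one would discard an initial stretch of `draws` as burn-in and summarize the rest, for instance by posterior means and standard deviations of each coordinate.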

4. A simulation study

A simulation study was conducted to examine the finite-sample properties of the proposed method. In the study, the failure time of interest $T$ was generated from the transformation model

$$\lambda(t\mid Z_i)=[\lambda_0(t)^{\gamma}+\gamma\beta' Z_i]^{1/\gamma}.$$

Here we assumed that there exist two covariates, with $Z_1$ following the normal distribution $N(5,1)$ and $Z_2$ generated from the Bernoulli distribution with success probability 0.5. We set $\beta_1=0.7$ and $\beta_2=1$. It was also assumed that the baseline hazard function $\lambda_0(t)$ is a piecewise function with $K=5$ intervals, with $\lambda_0(t)=2t$ or $\lambda_0(t)=2t^{2}$. The transformation parameter $\gamma$ was set to 0, 0.5 and 1, respectively. Furthermore, the underlying left truncation time $A$ was independently generated from a $U(0,10)$ random variable. To form a prevalent cohort of sample size $n$, realizations of $(A,T)$ were generated until $n$ subjects satisfied the sampling constraint $T\ge A$. The censoring time for the residual survival time $T-A$ was generated from a uniform distribution $U(0,c_1)$, where $c_1$ was selected so that the censoring rate was approximately 0% or 20%. In addition, $\beta_1$ and $\beta_2$ were given the normal prior $N(0,10^{4})$, and $\lambda_k\sim\mathrm{Gamma}(\varsigma,\xi)$ with hyperparameters $\varsigma=2$ and $\xi=0.001$.
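The data-generating scheme can be sketched in Python for the special case γ = 1 with λ0(t) = 2t, where the cumulative hazard t² + ηt (with η = β′Z) can be inverted in closed form. This is only an illustration under stated assumptions: the function name `simulate_ltrc`, the default `c1`, and the choice to redraw when η < 0 are hypothetical and are not taken from the paper.

```python
import math
import random


def simulate_ltrc(n, beta=(0.7, 1.0), a_max=10.0, c1=5.0, seed=1):
    """Draw n left-truncated, right-censored records (y, A, Z, delta)
    for gamma = 1 and lambda_0(t) = 2t, i.e. hazard 2t + eta."""
    rng = random.Random(seed)
    data = []
    while len(data) < n:
        z = (rng.gauss(5.0, 1.0), float(rng.random() < 0.5))
        eta = beta[0] * z[0] + beta[1] * z[1]
        if eta < 0:
            continue  # assumption: redraw if the hazard would be negative at t = 0
        e = rng.expovariate(1.0)
        # Solve t**2 + eta*t = e for the failure time t >= 0.
        t = (-eta + math.sqrt(eta * eta + 4.0 * e)) / 2.0
        a = rng.uniform(0.0, a_max)
        if t < a:
            continue  # truncated out: subject is never observed
        c = rng.uniform(0.0, c1)   # censoring time for the residual life T - A
        resid = t - a
        data.append((a + min(resid, c), a, z, int(resid <= c)))
    return data
```

Under these settings most candidate draws are rejected by the condition T ≥ A, so the loop runs many more than `n` times; that rejection step is what produces the prevalent-cohort sample.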

Table 1 presents the results given by the proposed estimation procedure with the baseline hazard function $\lambda_0(t)$ taken as $2t$ and the censoring rate being 0 or 0.2. Here we set $\gamma=0.5$ for the Box-Cox transformation and the sample size to $n=200$, 300 or 500. The chain length was set at 10,000, and the parameters were estimated from the remaining 7000 samples after discarding the first 3000 as burn-in. We replicated the simulation 500 times for each of the sample sizes 200, 300 and 500. In the table, PARA denotes the parameter being estimated, BIAS the empirical bias of the estimates, SD the standard deviation of the estimates, SEE the mean of the estimated standard errors, and CP the coverage probability of the 95% confidence interval. One can see from the table that when the sample size is 200, the proposed estimation procedure already gives reasonable results, and when the sample size increases to 300 and 500, the estimates are very close to the true parameter values. The efficiency of the estimation procedure increases with the sample size. At the same time, the values of SD and SEE are very close, and the coverage probability is around 95%.
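For concreteness, the four summary measures can be computed from replicated fits along the following lines. This is a minimal sketch: the function `summarize` and its inputs (one point estimate, one standard error, and one interval per replication) are hypothetical names, and how each interval is constructed is left to the caller.

```python
from statistics import mean, stdev


def summarize(estimates, std_errors, lowers, uppers, truth):
    """BIAS, SD, SEE and CP over simulation replications."""
    return {
        "BIAS": mean(estimates) - truth,     # empirical bias
        "SD": stdev(estimates),              # sample SD of the point estimates
        "SEE": mean(std_errors),             # mean of estimated standard errors
        "CP": mean(1.0 if lo <= truth <= hi else 0.0
                   for lo, hi in zip(lowers, uppers)),  # interval coverage
    }
```

Agreement between SD and SEE indicates that the estimated standard errors reflect the actual sampling variability of the estimator.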

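Table 1. Summary statistics for the estimator under different censoring proportions (cen% = 20% and cen% = 0%) with baseline hazard function λ0(t) = 2t.

| n | PARA | BIAS (20%) | SD (20%) | SEE (20%) | CP (20%) | BIAS (0%) | SD (0%) | SEE (0%) | CP (0%) |
|---|---|---|---|---|---|---|---|---|---|
| 200 | β1 | 0.0648 | 0.0665 | 0.0913 | 0.968 | 0.0741 | 0.0681 | 0.0897 | 0.932 |
| 200 | β2 | −0.0195 | 0.3623 | 0.3852 | 0.952 | 0.0128 | 0.3435 | 0.3633 | 0.956 |
| 300 | β1 | 0.0638 | 0.0661 | 0.0855 | 0.942 | 0.0668 | 0.0655 | 0.0862 | 0.940 |
| 300 | β2 | 0.0104 | 0.3129 | 0.3292 | 0.962 | 0.0151 | 0.2945 | 0.3047 | 0.954 |
| 500 | β1 | 0.0464 | 0.0633 | 0.0824 | 0.972 | 0.0392 | 0.0643 | 0.0824 | 0.970 |
| 500 | β2 | 0.0096 | 0.2544 | 0.2619 | 0.960 | −0.0139 | 0.2386 | 0.2402 | 0.948 |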

For the results given in Table 2, we focused on the situation with the baseline hazard function $\lambda_0(t)=2t^{2}$, with the other set-ups being the same as in Table 1. They indicate that with the two different baseline hazard functions, the proposed estimation procedure seems to perform well under different sample sizes. In Table 3, we considered the situation where $\lambda_0(t)=2t^{2}$ and the censoring rate is 0.2 under different Box-Cox transformations, with $\gamma$ set to 0 and 1, respectively. Again, the obtained results suggest that the Bayesian inference procedure proposed above gives satisfactory results.

Table 2. Summary statistics for the estimator under different censoring proportions (cen% = 20% and cen% = 0%) with baseline hazard function λ0(t) = 2t².

| n | PARA | BIAS (20%) | SD (20%) | SEE (20%) | CP (20%) | BIAS (0%) | SD (0%) | SEE (0%) | CP (0%) |
|---|---|---|---|---|---|---|---|---|---|
| 200 | β1 | −0.0218 | 0.0588 | 0.0806 | 0.978 | −0.0139 | 0.0560 | 0.0797 | 0.990 |
| 200 | β2 | −0.0203 | 0.3352 | 0.3594 | 0.980 | −0.0144 | 0.3306 | 0.3344 | 0.946 |
| 300 | β1 | −0.0192 | 0.0559 | 0.0739 | 0.976 | −0.0229 | 0.0499 | 0.0759 | 0.982 |
| 300 | β2 | −0.0162 | 0.2970 | 0.3027 | 0.956 | −0.0252 | 0.2726 | 0.2789 | 0.958 |
| 500 | β1 | −0.0401 | 0.0469 | 0.0721 | 0.976 | −0.0505 | 0.0477 | 0.0694 | 0.964 |
| 500 | β2 | −0.0404 | 0.2273 | 0.2383 | 0.946 | −0.0380 | 0.1996 | 0.2194 | 0.952 |

Table 3. Summary statistics for the estimator with different γ (γ = 0 and γ = 1) under the baseline hazard function λ0(t) = 2t². The censoring rate was set to 0.2.

| n | PARA | BIAS (γ=0) | SD (γ=0) | SEE (γ=0) | CP (γ=0) | BIAS (γ=1) | SD (γ=1) | SEE (γ=1) | CP (γ=1) |
|---|---|---|---|---|---|---|---|---|---|
| 200 | β1 | −0.0478 | 0.0931 | 0.0941 | 0.928 | −0.0342 | 0.0760 | 0.1051 | 0.982 |
| 200 | β2 | −0.0641 | 0.1774 | 0.1766 | 0.936 | 0.0763 | 0.5041 | 0.5694 | 0.978 |
| 300 | β1 | −0.0504 | 0.0712 | 0.0766 | 0.930 | −0.0408 | 0.0741 | 0.0936 | 0.974 |
| 300 | β2 | −0.0648 | 0.1422 | 0.1437 | 0.934 | −0.0204 | 0.4037 | 0.4801 | 0.972 |
| 500 | β1 | −0.0472 | 0.0499 | 0.0592 | 0.942 | −0.0454 | 0.0562 | 0.0809 | 0.972 |
| 500 | β2 | −0.0563 | 0.1034 | 0.1111 | 0.946 | −0.0758 | 0.3556 | 0.3919 | 0.958 |

5. An application

In this section, we apply the Bayesian estimation procedure proposed in the previous sections to a set of left-truncated and right-censored data arising from a study on a retirement center [11] concerning the age at death. The study consists of 462 residents living in the retirement center from January 1964 to July 1975, and the observed data include the age at which each subject entered the center and the age at death, or the age at which the subject moved out of the center or the study stopped. It is easy to see that the observations were left-truncated, with the age at entry as the truncation time, and that the moving-out or study-stopping time serves as the censoring time. The subset of those individuals whose age is more than 786 months (65.5 years) is a length-biased data set in which the distribution of the truncation variable is uniform; 448 individuals are included in this subset. One objective of the study is to investigate whether gender had an effect on the age at death. In the following analysis, for simplicity, we use years as the unit.

To apply the proposed approach, define Z = 1 if the subject is male and 0 otherwise. Table 4 presents the estimation results given by the approach with K = 4, 5, 6 and γ = 0, 0.5 and 1. They include the regression parameter (Para), the estimated gender effect (Estimate), the estimated standard deviation (Std), the lower (Lower) and upper (Upper) bounds of the 95% confidence interval, and the p-value of the significance test (P-value). One can see that all results suggest that the male residents seem to have a significantly higher death rate than the female residents, which is consistent with the original analysis results given under the Cox model [13]. To further illustrate the results, Figure 1 gives the estimated hazard functions.

Table 4. Analysis of the retirement center data with different transformation parameters γ using K = 4, 5, 6.

| K | Para | γ | Estimate | Std | Lower | Upper | P-value |
|---|---|---|---|---|---|---|---|
| 4 | β | 0 | 0.2745 | 0.1510 | 0.0238 | 0.5947 | 0.0685 |
| 4 | β | 0.5 | 0.0816 | 0.0419 | 0.0095 | 0.1697 | 0.0520 |
| 4 | β | 1 | 0.0199 | 0.0104 | 0.0025 | 0.0429 | 0.0575 |
| 5 | β | 0 | 0.2761 | 0.1523 | 0.0258 | 0.6014 | 0.0699 |
| 5 | β | 0.5 | 0.0858 | 0.0423 | 0.0112 | 0.1739 | 0.0434 |
| 5 | β | 1 | 0.0209 | 0.0103 | 0.0031 | 0.0433 | 0.0429 |
| 6 | β | 0 | 0.2727 | 0.1497 | 0.0265 | 0.5880 | 0.0706 |
| 6 | β | 0.5 | 0.0834 | 0.0420 | 0.0111 | 0.1724 | 0.0477 |
| 6 | β | 1 | 0.0203 | 0.0103 | 0.0032 | 0.0424 | 0.0494 |

Figure 1.

Estimated hazards for the retirement center data under models with different transformation parameters γ = 0, 0.5 and 1, using K = 5. (a) Under the model with γ = 0, the hazards were estimated for all male and female subjects. (b) Under the model with γ = 0.5, the hazards were estimated for all male and female subjects. (c) Under the model with γ = 1, the hazards were estimated for all male and female subjects.

6. Discussion and conclusions

In this paper, we discussed regression analysis of left-truncated and right-censored data, which often occur in many applications, and for the problem, a Bayesian estimation procedure was developed. In particular, we considered a class of Box-Cox transformation hazards functions, which are semiparametric and clearly more flexible than the parametric models discussed in Balakrishnan and Mitra [2–4] among others. For the assessment of the finite sample properties of the proposed approach, a simulation study was conducted and suggested that the approach seems to work well for practical situations. Also the method was illustrated through a real study on a retirement center.

Note that in the proposed estimation procedure, to deal with the complexity of the data and model, we presented a form of joint prior that absorbs the nonlinear constraints into one parameter and frees all other parameters from the constraints. The Bayesian estimation was carried out by MCMC sampling with the ARMS algorithm. Also note that instead of left-truncated and right-censored data, a more general type of failure time data is left-truncated and interval-censored data or truncated and doubly censored data [22]. For these latter situations, the proposed estimation approach cannot be directly applied, and one possible approach is to combine the proposed method with an imputation method.

Acknowledgments

We would like to thank the editor for their significant guidance. Also, we would like to thank the anonymous reviewers for orienting us toward important references and for helping in improving this work.

Funding Statement

The research of the first author was supported by grants from the National Natural Science Foundation of China (11671054). This work of the corresponding author was partly supported by the National Natural Science Foundation of China Grant No. 11901054 and the Mathematics Tianyuan Foundation of NSFC (11926340, 11926341).

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

1. Asgharian M., M'Lan C.E., and Wolfson D.B., Length-biased sampling with right censoring, Appl. Statist. 97 (2002), pp. 201–209.
2. Balakrishnan N. and Mitra D., Likelihood inference for lognormal data with left truncation and right censoring with an illustration, J. Statist. Plann. Inference 141 (2011), pp. 3536–3553. doi: 10.1016/j.jspi.2011.05.007
3. Balakrishnan N. and Mitra D., Left truncated and right censored Weibull data and likelihood inference with an illustration, Comput. Stat. Data. Anal. 56 (2012), pp. 4011–4025. doi: 10.1016/j.csda.2012.05.004
4. Balakrishnan N. and Mitra D., Likelihood inference based on left truncated and right censored data from a gamma distribution, IEEE Trans. Reliab. 62 (2013), pp. 679–688. doi: 10.1109/TR.2013.2273039
5. Cox D.R., Regression models and life-tables, J. R. Stat. Soc. Ser. B. 34 (1972), pp. 187–220.
6. Gilks W.R., Best N.G., and Tan K.K.C., Adaptive rejection Metropolis sampling within Gibbs sampling, Appl. Statist. 44 (1995), pp. 455–472. doi: 10.2307/2986138
7. Gilks W.R. and Wild P., Adaptive rejection sampling for Gibbs sampling, Appl. Statist. 41 (1992), pp. 337–348. doi: 10.2307/2347565
8. Gill R., Vardi Y., and Wellner J.A., Large sample theory of empirical distributions in biased sampling models, Ann. Statist. 16 (1988), pp. 1069–1112. doi: 10.1214/aos/1176350948
10. Huang C.Y. and Qin J., Nonparametric estimation for length-biased and right-censored data, Biometrika 98 (2011), pp. 177–186. doi: 10.1093/biomet/asq069
11. Hyde J., Survival analysis with incomplete observations, in Biostatistics Casebook, John Wiley and Sons, New York, 1980.
12. Ibrahim J.G., Chen M.H., and Sinha D., Bayesian Survival Analysis, Springer-Verlag, New York, 2001.
13. Klein J.P. and Moeschberger M.L., Survival Analysis Techniques for Censored and Truncated Data, 2nd ed., Springer, New York, 1997.
15. Lagakos S.W., Barraj L.M., and Gruttola V.D., Nonparametric analysis of truncated survival data, with application to AIDS, Biometrika 75 (1988), pp. 515–523. doi: 10.1093/biomet/75.3.515
16. Lin D.Y. and Ying Z., Semiparametric analysis of the additive risk model, Biometrika 81 (1994), pp. 61–71. doi: 10.1093/biomet/81.1.61
17. Ning J., Qin J., and Shen Y., Buckley-James-type estimator with right-censored and length-biased data, Biometrics 67 (2011), pp. 1369–1378. doi: 10.1111/j.1541-0420.2011.01568.x
18. Ning J., Qin J., and Shen Y., Score estimating equations from embedded likelihood functions under accelerated failure time model, J. Am. Stat. Assoc. 109 (2014), pp. 1625–1635. doi: 10.1080/01621459.2014.946034
19. Ning J., Qin J., and Shen Y., Semiparametric accelerated failure time model for length-biased data with application to dementia study, Stat. Sin. 24 (2014), pp. 313–333.
20. Qin J. and Shen Y., Statistical methods for analyzing right-censored length-biased data under Cox model, Biometrics 66 (2010), pp. 382–392. doi: 10.1111/j.1541-0420.2009.01287.x
21. Sun J., Self-consistency estimation of distributions based on truncated and doubly censored data with applications to AIDS cohort studies, Lifetime Data Anal. 3 (1997), pp. 305–313. doi: 10.1023/A:1009609227969
22. Sun J., The Statistical Analysis of Interval-Censored Failure Time Data, Springer Science+Business Inc., New York, 2006.
23. Sun Y.f., Chan K.C.G., and Qin J., Simple and fast overidentified rank estimation for right-censored length-biased data and backward recurrence time, Biometrics 74 (2018), pp. 77–85. doi: 10.1111/biom.12727
24. Shen Y., Ning J., and Qin J., Analyzing length-biased data with semiparametric transformation and accelerated failure time models, J. Am. Stat. Assoc. 104 (2009), pp. 1192–1202. doi: 10.1198/jasa.2009.tm08614
25. Turnbull B.W., The empirical distribution function with arbitrarily grouped, censored and truncated data, J. R. Stat. Soc. B. 38 (1976), pp. 290–295.
26. Vardi Y., Nonparametric estimation in the presence of length bias, Ann. Statist. 10 (1982), pp. 616–620. doi: 10.1214/aos/1176345802
27. Vardi Y., Empirical distributions in selection bias models, Ann. Statist. 13 (1985), pp. 178–203. doi: 10.1214/aos/1176346585
28. Vardi Y. and Zhang C.H., Large sample study of empirical distributions in a random-multiplicative censoring model, Ann. Statist. 20 (1992), pp. 1022–1039. doi: 10.1214/aos/1176348668
29. Wang M.C., Nonparametric estimation from cross-sectional survival data, J. Am. Stat. Assoc. 86 (1991), pp. 130–143. doi: 10.1080/01621459.1991.10475011
30. Yin G. and Ibrahim J.G., Bayesian transformation hazard models, IMS Lecture Notes Monogr. Ser. 49 (2006), pp. 170–182. doi: 10.1214/074921706000000446

Articles from Journal of Applied Statistics are provided here courtesy of Taylor & Francis
