Abstract
In this paper, we discuss inference for the Box-Cox transformation model based on left-truncated and right-censored data, which often arise, for example, in studies that involve a cross-sectional sampling scheme. It is well known that the Box-Cox transformation model includes many commonly used models as special cases, such as the proportional hazards model and the additive hazards model. For inference, a Bayesian estimation approach is proposed in which a piecewise constant function is used to approximate the baseline hazards function. In addition, a conditional-marginal prior, whose marginal part is free of any constraints, is employed to deal with the computational challenges caused by the constraints on the parameters, and an MCMC sampling procedure is developed. A simulation study conducted to assess the finite sample performance of the proposed method indicates that it works well in practical situations. We apply the approach to a set of data arising from a retirement center.
Keywords: Left-truncated and right-censored data, additive hazards model, proportional hazards model, Bayesian, MCMC sampling
2010 Mathematics Subject Classifications: 62N02, 62N01
1. Introduction
In this paper, we discuss inference for the Box-Cox transformation model based on left-truncated and right-censored data. It is well known that the Box-Cox transformation model includes many commonly used models as special cases, such as the proportional hazards model and the additive hazards model. Left-truncated and right-censored failure time data occur in many areas, including medicine, economics, engineering, sociology and marketing [10,20,24]. They are especially common when a cross-sectional sampling scheme is involved.
One specific example of left-truncated and right-censored data is given by a well-known dementia study that recruited about 10,000 Canadians over the age of 65, who were screened for dementia and followed for their onset date of dementia and the subsequent time of death. Left truncation occurs since dementia had already occurred at recruitment for some subjects. The SEER database on the natural history of lung cancer provides another example, since only individuals who had been diagnosed with lung cancer and were still alive at the recruitment time were eligible for inclusion. It is well known that under left truncation, the observed failure times tend to be longer than those generated by the underlying distribution of the general population, and thus special methods are needed to take this into account in the analysis.
A large literature exists on the analysis of left-truncated failure time data and of length-biased data, a special case of left-truncated data, and most of the existing methods can be classified into two types. One consists of methods that are conditional on the left truncation times [15,21,25,29], and the other consists of unconditional approaches [8,26,27]. More specifically, among others, Turnbull [25] and Sun [21] discussed nonparametric estimation of a survival function, and [1,28] considered regression analysis of censored data under the proportional hazards model. One can also find some parametric methods in Balakrishnan and Mitra [2–4] and other semiparametric estimation procedures in Ning et al. [17–19] and Sun et al. [23]. However, all of the methods above are frequentist methods, and in the following, we propose a Bayesian estimation procedure.
The rest of this paper is organized as follows. In Section 2, we first introduce the notation and model that will be used throughout the paper, as well as the structure of the observed data, and then describe the resulting likelihood function and the prior to be used. In particular, the failure time of interest is assumed to follow the Box-Cox transformation model with a piecewise constant baseline hazards function. The Bayesian estimation procedure is developed in Section 3, where the adaptive rejection Metropolis sampling (ARMS) algorithm is used to deal with the complexity of the posterior distribution. In Section 4, a simulation study is performed, which indicates that the proposed method works well in practical situations. The method is applied to the data arising from a retirement center in Section 5, and some conclusions are given in Section 6.
2. Notation, model and assumptions
Consider a failure time study that may involve left truncation, meaning that the failure time of interest is observed only if the study subject experiences some event. Let $T$ and $A$ denote the failure time of interest and the left truncation time, respectively. Then we have $T \ge A$; that is, only the subjects with $T \ge A$ can be included in the study. A general set-up for this is that $T$ represents the time from some initial event to the failure event of interest and $A$ the time from the same initial event to some cut-off event or the study enrollment time. Of course, a failure time study usually also involves a right-censoring time, denoted by $C$, and a vector of covariates denoted by $Z$, which does not depend on time. Suppose that the study involves $n$ independent subjects and define $X_i = \min(T_i, C_i)$, the observed time, and $\delta_i = I(T_i \le C_i)$, the censoring indicator. Then the observed data have the form $\{(X_i, \delta_i, A_i, Z_i);\ i = 1, \ldots, n\}$.
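For the description of the covariate effect on $T$, the most commonly used model is perhaps the Cox proportional hazards model [5] given by

$$\lambda(t \mid Z) = \lambda_0(t) \exp(\beta^\top Z), \tag{1}$$

where $\lambda(t \mid Z)$ denotes the hazard function of $T$ given $Z$, $\lambda_0(t)$ an unknown baseline hazards function and $\beta$ a vector of regression parameters. Another commonly used model is the additive hazards model [16] with the form

$$\lambda(t \mid Z) = \lambda_0(t) + \beta^\top Z. \tag{2}$$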
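It is well known that the models above may sometimes be too restrictive, and in this paper, we consider the Box-Cox transformation hazard model [30] given by

$$\phi\{\lambda(t \mid Z)\} = \phi\{\lambda_0(t)\} + \beta^\top Z, \tag{3}$$

where $\phi(\cdot)$ is a known link function having the form

$$\phi(y) = \begin{cases} (y^\gamma - 1)/\gamma, & \gamma \neq 0, \\ \log(y), & \gamma = 0. \end{cases}$$

It is easy to see that model (3) can be rewritten as

$$\lambda(t \mid Z) = \{\lambda_0(t)^\gamma + \gamma \beta^\top Z\}^{1/\gamma}, \tag{4}$$

and the model above reduces to model (1) as $\gamma \to 0$ and to model (2) with $\gamma = 1$.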
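The two special cases can be checked numerically. The following is a minimal sketch, assuming the transformation hazard takes the form $\{\lambda_0(t)^\gamma + \gamma\beta^\top Z\}^{1/\gamma}$ as in [30]; the function name `transformation_hazard` and the numerical values are hypothetical and chosen only for illustration.

```python
import math

def transformation_hazard(baseline, linear_predictor, gamma):
    """Hazard under the assumed Box-Cox transformation form.

    gamma = 0 is treated as the limiting proportional hazards case.
    """
    if gamma == 0:
        return baseline * math.exp(linear_predictor)
    inner = baseline ** gamma + gamma * linear_predictor
    if inner < 0:
        raise ValueError("baseline**gamma + gamma * linear_predictor must be >= 0")
    return inner ** (1.0 / gamma)

# Hypothetical baseline hazard value and linear predictor beta'Z.
lam0, eta = 0.5, 0.3
additive = transformation_hazard(lam0, eta, 1.0)      # equals lam0 + eta
proportional = transformation_hazard(lam0, eta, 0.0)  # equals lam0 * exp(eta)
near_zero = transformation_hazard(lam0, eta, 1e-6)    # approaches the gamma = 0 value
```

Intermediate values of $\gamma$ interpolate between the additive and the multiplicative covariate effect, which is why the positivity check only matters for $\gamma \neq 0$.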
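To derive the likelihood function, following Ibrahim et al. [12], we will assume that $\lambda_0(t)$ is a piecewise constant function. More specifically, let $0 = s_0 < s_1 < \cdots < s_K$ denote a partition of the time axis and assume that $\lambda_0(t) = \lambda_k$ for $t \in (s_{k-1}, s_k]$, $k = 1, \ldots, K$. For each subject $i$ and interval $k$, define one indicator that equals 1 if subject $i$ fails or is censored in interval $k$ and 0 otherwise, and a second indicator that equals 1 if the truncation time of subject $i$ falls into the $k$th interval and 0 otherwise. Then for the $i$th subject, we have the hazard function $\lambda(t \mid Z_i) = (\lambda_k^\gamma + \gamma \beta^\top Z_i)^{1/\gamma}$ for $t$ in the $k$th interval, and the likelihood function has the form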
In the above, , , and . Also we assume that if there does not exist left truncation.
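As an illustration of how such a likelihood can be evaluated, the sketch below uses the standard conditional likelihood for left-truncated and right-censored data written in a generic form, $\prod_i \lambda(X_i \mid Z_i)^{\delta_i} \exp\{-\int_{A_i}^{X_i} \lambda(u \mid Z_i)\,du\}$, together with a piecewise constant baseline. The data layout `(x, delta, a, z)`, the function names and the convention that the last interval extends to infinity are assumptions made for this example; the notation may differ from the authors' display.

```python
import math

def cumulative_hazard(t, cuts, hazards):
    """Integrate a piecewise constant hazard from 0 to t.

    cuts holds the K - 1 interior cut points; hazards holds K values and
    the last interval is assumed to extend to infinity.
    """
    total, left = 0.0, 0.0
    for k, h in enumerate(hazards):
        right = cuts[k] if k < len(cuts) else math.inf
        if t <= left:
            break
        total += h * (min(t, right) - left)
        left = right
    return total

def subject_hazards(lambdas, eta, gamma):
    """Per-interval hazards for one subject; gamma = 0 is the PH limit."""
    if gamma == 0:
        return [lam * math.exp(eta) for lam in lambdas]
    inner = [lam ** gamma + gamma * eta for lam in lambdas]
    if min(inner) <= 0:
        raise ValueError("positivity constraint violated")
    return [v ** (1.0 / gamma) for v in inner]

def log_likelihood(data, beta, lambdas, cuts, gamma):
    """Conditional log-likelihood; data is a list of (x, delta, a, z)."""
    ll = 0.0
    for x, delta, a, z in data:
        eta = sum(b * zj for b, zj in zip(beta, z))
        h = subject_hazards(lambdas, eta, gamma)
        k = sum(1 for s in cuts if x > s)  # interval (s_{k-1}, s_k] containing x
        if delta == 1:
            ll += math.log(h[k])
        # Only the hazard accumulated after the truncation time a contributes.
        ll -= cumulative_hazard(x, cuts, h) - cumulative_hazard(a, cuts, h)
    return ll

# One hypothetical subject: entry at 0.5, failure at 1.5, covariate 1.0.
example = log_likelihood([(1.5, 1, 0.5, [1.0])], beta=[0.1],
                         lambdas=[0.2, 0.4], cuts=[1.0], gamma=1.0)
```

Setting the truncation time `a` to 0 removes the truncation correction, which recovers the usual right-censored likelihood.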
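For the specification of the prior distribution of $(\beta, \lambda)$, where $\lambda = (\lambda_1, \ldots, \lambda_K)^\top$, note that under model (3) or (4), we have the nonlinear constraints

$$\lambda_k^\gamma + \gamma \beta^\top Z_i \ge 0, \quad i = 1, \ldots, n; \ k = 1, \ldots, K, \tag{5}$$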
due to the non-negative property of the risk function. To deal with this, one way is to specify an appropriately truncated joint prior distribution such as the truncated multivariate normal prior . This would lead to the prior distribution of the form
Following this route, we would need to analytically compute the normalizing constant
Motivated by the discussion above, we propose the following joint prior for
where represents the vector after the gth element is removed, that is, and . In consequence, has the truncated normal distribution
with the normalizing constant given by
| (6) |
In the above, $\Phi(\cdot)$ denotes the cumulative distribution function of the standard normal distribution and denotes the standard deviation of . In the following, we will assume that the components of λ are independent a priori, and each . And and λ are independent a priori. We can specify a normal prior distribution for each component of .
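To make the role of the truncation concrete, the following sketch shows how a positivity constraint of the form $\lambda_k^\gamma + \gamma\beta^\top Z_i \ge 0$ can be turned into a lower bound for a single coefficient, and how the normalizing constant of a normal prior truncated below at that bound is obtained from $\Phi$. It assumes $\gamma > 0$ and that the chosen covariate is nonnegative for every subject; these assumptions, the function names and the numbers are illustrative and need not match the exact expression in (6).

```python
import math

def std_normal_cdf(x):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def lower_bound_beta_g(g, beta, lambdas, covariates, gamma):
    """Smallest beta[g] keeping lambda_k**gamma + gamma * beta'z_i >= 0.

    Assumes gamma > 0 and z_i[g] >= 0 for all subjects (illustrative).
    Subjects with z_i[g] == 0 place no restriction on beta[g].
    """
    min_lam = min(lam ** gamma for lam in lambdas)  # tightest interval
    bound = -math.inf
    for z in covariates:
        if z[g] <= 0:
            continue
        rest = sum(b * zj for j, (b, zj) in enumerate(zip(beta, z)) if j != g)
        bound = max(bound, -(min_lam + gamma * rest) / (gamma * z[g]))
    return bound

def truncation_constant(lower, mu, sigma):
    """Normalizing constant of N(mu, sigma^2) truncated to [lower, inf)."""
    return 1.0 - std_normal_cdf((lower - mu) / sigma)

# Hypothetical values: two covariates, the second one binary.
covs = [[0.2, 1.0], [-0.4, 0.0], [1.0, 1.0]]
lb = lower_bound_beta_g(1, beta=[0.3, 0.0], lambdas=[0.5, 1.0],
                        covariates=covs, gamma=1.0)
const = truncation_constant(lb, mu=0.0, sigma=1.0)
```

Because the bound depends on the remaining coefficients and on λ, the constant changes at every MCMC iteration, which is why having it in closed form matters for the sampler.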
3. Bayesian inference procedure
Now we develop the Bayesian inference procedure for . Note that based on the assumption above, we have that the gth component of β has a truncated normal prior and the full conditional posterior of these parameters has the form
and
where
Also note that Gilks and Wild [7] showed that, if a full conditional density is log-concave, the adaptive rejection sampling (ARS) method can sample from it efficiently; however, the ARS algorithm cannot be applied to densities that are not log-concave. To sample from such distributions, the Metropolis-Hastings (MH) algorithm can be used to update one parameter or a group of parameters at a time. The resulting chain may converge slowly, however, and to avoid a high rejection probability, the proposal density should approximate the shape of the full conditional density well. Since the ARS provides a way of adapting a proposal function to the full conditional density, it can be used to construct a good proposal density. Appending a single MH step to the ARS algorithm then yields the adaptive rejection Metropolis sampling (ARMS) algorithm within the Gibbs chain.
Since the full conditional posterior distributions of the above parameters are not log-concave, one can employ the method proposed by Gilks et al. [6] to sample from them. Specifically, they considered full conditional distributions that are not log-concave and extended the ARS algorithm to include a Metropolis-Hastings step. In other words, when log-concavity fails, the ARMS algorithm can be used for sampling. In the following, we will use the HI package in R to sample the parameters, with the specific steps as follows.
Step 0: Given the initial values of each parameter: ; let chain :
Step 1: Updating with ARMS algorithm
The posterior density function of is obtained as
Furthermore, the logarithmic posterior density function is given as
Step 2: Updating with ARMS algorithm
The posterior density function of is obtained
The logarithmic posterior density function is:
Step 3: Updating with ARMS algorithm:
The posterior density function of is obtained:
The logarithmic posterior density function is
With the truncated prior specified above, the normalizing constant in (6) has a closed form, and therefore the full conditional posteriors of these parameters are easier to deal with. Notably, the posterior estimation is very robust with respect to the choice of g in (6).
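The structure of the sampling steps above can be sketched in a few lines. The example below is not ARMS: it swaps in a plain random-walk Metropolis update for each full conditional, which keeps the same Gibbs-style sweep but is simpler to write down. The target density, the step size and the function names are hypothetical stand-ins and do not correspond to the posterior of the model.

```python
import math
import random

def log_target(b, lam):
    """Unnormalized log density of an illustrative constrained target.

    b has a standard normal kernel restricted to b >= -lam, and lam has an
    Exp(1) kernel on lam > 0; both are hypothetical choices.
    """
    if lam <= 0 or b < -lam:
        return -math.inf
    return -0.5 * b * b - lam

def metropolis_update(current, log_density, step, rng):
    """One random-walk Metropolis update of a scalar parameter."""
    proposal = current + rng.gauss(0.0, step)
    log_ratio = log_density(proposal) - log_density(current)
    # 1 - random() lies in (0, 1], so the logarithm is always defined.
    if math.log(1.0 - rng.random()) < log_ratio:
        return proposal
    return current

def run_chain(n_iter, burn_in, seed=1):
    rng = random.Random(seed)
    b, lam = 0.0, 1.0  # initial values chosen inside the constraint
    draws = []
    for it in range(n_iter):
        # Update each parameter from its full conditional in turn.
        b = metropolis_update(b, lambda v: log_target(v, lam), 0.8, rng)
        lam = metropolis_update(lam, lambda v: log_target(b, v), 0.8, rng)
        if it >= burn_in:
            draws.append((b, lam))  # keep only post burn-in draws
    return draws

draws = run_chain(n_iter=2000, burn_in=600)
```

Proposals that violate the constraint receive log density minus infinity and are always rejected, so every retained draw satisfies the constraint; ARMS pursues the same goal with a proposal that adapts to the shape of the full conditional.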
4. A simulation study
A simulation study was conducted to examine the finite sample properties of the proposed method. In the study, the failure time of interest $T$ was generated from the transformation model
Here we assumed that there exist two covariates, with following the normal distribution and generated from the Bernoulli distribution with success probability 0.5. We set , . It was also assumed that the baseline hazards function is a piecewise constant function with K = 5 intervals and or . The transformation parameter γ was set to 0, 0.5 or 1. Furthermore, the underlying left truncation time A was independently generated from the U(0, 10) distribution. To form a prevalent cohort of sample size n, realizations of $(T, A)$ were generated until n subjects satisfied the sampling constraint $T \ge A$. The censoring time for the residual survival time T−A was generated from a uniform distribution, U(0, ), where was selected so that the censoring rate was approximately 0 or 0.2. In addition, given parameters and obey normal prior ; . The hyperparameter is , .
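The sampling scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions: the covariates are drawn as a standard normal and a Bernoulli(0.5) variable, failure times are obtained by inverting the cumulative hazard of a piecewise constant hazard, and all numerical values (regression coefficients, baseline levels, cut points, upper censoring limit) are hypothetical placeholders rather than the settings of the study. The demonstration uses γ = 0, for which no positivity constraint arises.

```python
import math
import random

def subject_hazards(lambdas, eta, gamma):
    """Per-interval hazards under the assumed transformation model."""
    if gamma == 0:
        return [lam * math.exp(eta) for lam in lambdas]
    inner = [lam ** gamma + gamma * eta for lam in lambdas]
    if min(inner) <= 0:
        raise ValueError("positivity constraint violated for this covariate draw")
    return [v ** (1.0 / gamma) for v in inner]

def draw_failure_time(hazards, cuts, rng):
    """Invert the cumulative hazard at an Exp(1) draw (hazards must be > 0)."""
    target = -math.log(1.0 - rng.random())
    left = 0.0
    for k, h in enumerate(hazards):
        right = cuts[k] if k < len(cuts) else math.inf
        mass = h * (right - left)  # cumulative hazard added by interval k
        if target <= mass:
            return left + target / h
        target -= mass
        left = right
    raise RuntimeError("cumulative hazard never reached the target")

def simulate_cohort(n, beta, lambdas, cuts, gamma, censor_max, rng):
    cohort = []
    while len(cohort) < n:
        z = [rng.gauss(0.0, 1.0), float(rng.random() < 0.5)]
        eta = sum(b * zj for b, zj in zip(beta, z))
        t = draw_failure_time(subject_hazards(lambdas, eta, gamma), cuts, rng)
        a = rng.uniform(0.0, 10.0)
        if t < a:
            continue  # left truncation: this subject is never observed
        c = rng.uniform(0.0, censor_max)  # censoring of the residual time T - A
        x = min(t, a + c)
        delta = 1 if t <= a + c else 0
        cohort.append((x, delta, a, z))
    return cohort

rng = random.Random(2024)
cohort = simulate_cohort(n=50, beta=[0.5, 0.5],
                         lambdas=[0.05, 0.1, 0.15, 0.2, 0.25],
                         cuts=[2.0, 4.0, 6.0, 8.0], gamma=0.0,
                         censor_max=5.0, rng=rng)
```

In practice the upper censoring limit would be tuned by trial runs until the observed proportion of censored subjects matches the target rate.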
Table 1 presents the results given by the proposed estimation procedure with the baseline hazards function taken as and the censoring rates being 0 or 0.2. Here we set $\gamma = 0.5$ for the Box-Cox transformation and the sample size n = 200, 300 or 500. The chain length was set at 10,000; the first 3000 draws were discarded as burn-in and the parameters were estimated from the remaining 7000 draws. We carried out 500 replications for each of the sample sizes 200, 300 and 500. In the table, PARA denotes the parameter being estimated, BIAS the empirical bias of the estimates, SD the sample standard deviation of the estimates, SEE the mean of the estimated standard errors, and CP the empirical coverage probability of the confidence interval. One can see from the table that the proposed estimation procedure gives reasonable results already when the sample size is 200, and that the estimates remain close to the true parameter values when the sample size increases to 300 and 500. The efficiency of the estimation procedure increases with the sample size, as reflected by the decreasing SD. At the same time, the values of SD and SEE are reasonably close, and the coverage probabilities are around 95%.
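For readers who want to reproduce summary columns of this kind, the sketch below computes BIAS, SD, SEE and CP from a set of replications, following the verbal definitions given above. The function name and the four replications are hypothetical and serve only to show the arithmetic.

```python
import math

def summarize(estimates, std_errors, intervals, true_value):
    """Summary statistics over simulation replications.

    estimates  : point estimate from each replication
    std_errors : estimated standard error from each replication
    intervals  : (lower, upper) interval from each replication
    """
    m = len(estimates)
    mean_est = sum(estimates) / m
    bias = mean_est - true_value
    sd = math.sqrt(sum((e - mean_est) ** 2 for e in estimates) / (m - 1))
    see = sum(std_errors) / m
    cp = sum(1 for lo, hi in intervals if lo <= true_value <= hi) / m
    return {"BIAS": bias, "SD": sd, "SEE": see, "CP": cp}

# Four hypothetical replications with true parameter value 1.0.
stats = summarize(
    estimates=[0.9, 1.1, 1.0, 1.2],
    std_errors=[0.1, 0.1, 0.2, 0.2],
    intervals=[(0.7, 1.1), (0.9, 1.3), (0.6, 1.4), (1.05, 1.35)],
    true_value=1.0,
)
```

A well-calibrated procedure shows SEE close to SD and CP close to the nominal level of the intervals.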
Table 1. Summary statistics for the estimator under different censoring proportions with baseline hazards function .
| n | PARA | BIAS | SD | SEE | CP | BIAS | SD | SEE | CP |
|---|---|---|---|---|---|---|---|---|---|
| 200 | $\beta_1$ | 0.0648 | 0.0665 | 0.0913 | 0.968 | 0.0741 | 0.0681 | 0.0897 | 0.932 |
| | $\beta_2$ | −0.0195 | 0.3623 | 0.3852 | 0.952 | 0.0128 | 0.3435 | 0.3633 | 0.956 |
| 300 | $\beta_1$ | 0.0638 | 0.0661 | 0.0855 | 0.942 | 0.0668 | 0.0655 | 0.0862 | 0.940 |
| | $\beta_2$ | 0.0104 | 0.3129 | 0.3292 | 0.962 | 0.0151 | 0.2945 | 0.3047 | 0.954 |
| 500 | $\beta_1$ | 0.0464 | 0.0633 | 0.0824 | 0.972 | 0.0392 | 0.0643 | 0.0824 | 0.970 |
| | $\beta_2$ | 0.0096 | 0.2544 | 0.2619 | 0.960 | −0.0139 | 0.2386 | 0.2402 | 0.948 |

Note: the two blocks of BIAS, SD, SEE and CP columns correspond to the two censoring rates considered (0 and 0.2).
For the results given in Table 2, we focused on the situations with the baseline hazards function , with the other set-ups being the same as for Table 1. They indicate that with the two different baseline hazards functions, the proposed estimation procedure seems to perform well under different sample sizes. In Table 3, we considered the situation where is , the censoring rate is 0.2 and the Box-Cox transformation parameter γ is set to 0 or 1. Again the obtained results suggest that the Bayesian inference procedure proposed above gives satisfactory results.
Table 2. Summary statistics for the estimator under different censoring proportions with baseline hazards function .
| n | PARA | BIAS | SD | SEE | CP | BIAS | SD | SEE | CP |
|---|---|---|---|---|---|---|---|---|---|
| 200 | $\beta_1$ | −0.0218 | 0.0588 | 0.0806 | 0.978 | −0.0139 | 0.0560 | 0.0797 | 0.990 |
| | $\beta_2$ | −0.0203 | 0.3352 | 0.3594 | 0.980 | −0.0144 | 0.3306 | 0.3344 | 0.946 |
| 300 | $\beta_1$ | −0.0192 | 0.0559 | 0.0739 | 0.976 | −0.0229 | 0.0499 | 0.0759 | 0.982 |
| | $\beta_2$ | −0.0162 | 0.2970 | 0.3027 | 0.956 | −0.0252 | 0.2726 | 0.2789 | 0.958 |
| 500 | $\beta_1$ | −0.0401 | 0.0469 | 0.0721 | 0.976 | −0.0505 | 0.0477 | 0.0694 | 0.964 |
| | $\beta_2$ | −0.0404 | 0.2273 | 0.2383 | 0.946 | −0.0380 | 0.1996 | 0.2194 | 0.952 |

Note: the two blocks of BIAS, SD, SEE and CP columns correspond to the two censoring rates considered (0 and 0.2).
Table 3. Summary statistics for the estimator with different γ under the baseline hazards function . The censoring rate was set to 0.2.
| n | PARA | BIAS | SD | SEE | CP | BIAS | SD | SEE | CP |
|---|---|---|---|---|---|---|---|---|---|
| 200 | $\beta_1$ | −0.0478 | 0.0931 | 0.0941 | 0.928 | −0.0342 | 0.0760 | 0.1051 | 0.982 |
| | $\beta_2$ | −0.0641 | 0.1774 | 0.1766 | 0.936 | 0.0763 | 0.5041 | 0.5694 | 0.978 |
| 300 | $\beta_1$ | −0.0504 | 0.0712 | 0.0766 | 0.930 | −0.0408 | 0.0741 | 0.0936 | 0.974 |
| | $\beta_2$ | −0.0648 | 0.1422 | 0.1437 | 0.934 | −0.0204 | 0.4037 | 0.4801 | 0.972 |
| 500 | $\beta_1$ | −0.0472 | 0.0499 | 0.0592 | 0.942 | −0.0454 | 0.0562 | 0.0809 | 0.972 |
| | $\beta_2$ | −0.0563 | 0.1034 | 0.1111 | 0.946 | −0.0758 | 0.3556 | 0.3919 | 0.958 |

Note: the two blocks of BIAS, SD, SEE and CP columns correspond to the two values of γ considered (0 and 1).
5. An application
In this section, we apply the Bayesian estimation procedure proposed in the previous sections to a set of left-truncated and right-censored data arising from a study on a retirement center [11] concerning the age at death. The study consists of 462 residents who lived in the retirement center between January 1964 and July 1975, and the observed data include the age at which each subject entered the center and the age at death, or the age at which the subject moved out of the center or the study stopped. It is easy to see that the observations are left-truncated, with the age at entry as the truncation time, and that the age at moving out or at the end of the study serves as the censoring time. The subset of individuals whose age is more than 786 months (65.5 years) forms a length-biased data set, for which the distribution of the truncation variable is uniform; 448 individuals are included in this subset. One objective of the study is to investigate whether gender had an effect on the age at death. In the following analysis, for simplicity, we use years as the time unit.
To apply the proposed approach, define Z = 1 if the subject is male and 0 otherwise. Table 4 presents the estimation results given by the approach with K = 4, 5 and 6 and γ = 0, 0.5 and 1. They include the regression parameter (Para), the estimated gender effect (Estimate), the estimated standard deviation (Std), the lower (Lower) and upper (Upper) bounds of the confidence interval, and the p-value of the significance test (P-value). One can see that all results suggest that the male residents seem to have a significantly higher death rate than the female residents, which is consistent with the original analysis results given under the Cox model [13]. To further illustrate the results, Figure 1 gives the estimated hazard functions.
Table 4. Analysis of the retirement center data with different transformation parameters γ using K = 4, 5, 6.
| Para | γ | Estimate | Std | Lower | Upper | P-value |
|---|---|---|---|---|---|---|
| K = 4 | | | | | | |
| | 0 | 0.2745 | 0.1510 | 0.0238 | 0.5947 | 0.0685 |
| β | 0.5 | 0.0816 | 0.0419 | 0.0095 | 0.1697 | 0.0520 |
| | 1 | 0.0199 | 0.0104 | 0.0025 | 0.0429 | 0.0575 |
| K = 5 | | | | | | |
| | 0 | 0.2761 | 0.1523 | 0.0258 | 0.6014 | 0.0699 |
| β | 0.5 | 0.0858 | 0.0423 | 0.0112 | 0.1739 | 0.0434 |
| | 1 | 0.0209 | 0.0103 | 0.0031 | 0.0433 | 0.0429 |
| K = 6 | | | | | | |
| | 0 | 0.2727 | 0.1497 | 0.0265 | 0.5880 | 0.0706 |
| β | 0.5 | 0.0834 | 0.0420 | 0.0111 | 0.1724 | 0.0477 |
| | 1 | 0.0203 | 0.0103 | 0.0032 | 0.0424 | 0.0494 |
Figure 1. Estimated hazards for the retirement center data under models with different transformation parameters γ = 0, 0.5 and 1, using K = 5. In each panel, the hazards were estimated for all male and female subjects: (a) γ = 0; (b) γ = 0.5; (c) γ = 1.
6. Discussion and conclusions
In this paper, we discussed regression analysis of left-truncated and right-censored data, which often occur in many applications, and for the problem, a Bayesian estimation procedure was developed. In particular, we considered a class of Box-Cox transformation hazards functions, which are semiparametric and clearly more flexible than the parametric models discussed in Balakrishnan and Mitra [2–4] among others. For the assessment of the finite sample properties of the proposed approach, a simulation study was conducted and suggested that the approach seems to work well for practical situations. Also the method was illustrated through a real study on a retirement center.
Note that in the proposed estimation procedure, to deal with the complexity of the data and model, we presented a form of joint prior that absorbs the nonlinear constraints into one parameter and leaves all other parameters free of constraints. The Bayesian estimation was implemented by MCMC sampling with the ARMS algorithm. Also note that, instead of left-truncated and right-censored data, a more general type of failure time data is left-truncated and interval-censored data or truncated and doubly censored data [22]. For these latter situations, the proposed estimation approach cannot be directly applied, and one possible approach is to combine the proposed method with an imputation method.
Acknowledgments
We would like to thank the editor for their significant guidance. Also, we would like to thank the anonymous reviewers for orienting us toward important references and for helping in improving this work.
Funding Statement
The research of the first author was supported by a grant from the National Natural Science Foundation of China (11671054). The work of the corresponding author was partly supported by the National Natural Science Foundation of China (Grant No. 11901054) and the Mathematics Tianyuan Foundation of NSFC (11926340, 11926341).
Disclosure statement
No potential conflict of interest was reported by the author(s).
References
- 1. Asgharian M., M'Lan C.E., and Wolfson D.B., Length-biased sampling with right censoring, J. Am. Stat. Assoc. 97 (2002), pp. 201–209.
- 2. Balakrishnan N. and Mitra D., Likelihood inference for lognormal data with left truncation and right censoring with an illustration, J. Statist. Plann. Inference 141 (2011), pp. 3536–3553. doi: 10.1016/j.jspi.2011.05.007
- 3. Balakrishnan N. and Mitra D., Left truncated and right censored Weibull data and likelihood inference with an illustration, Comput. Stat. Data. Anal. 56 (2012), pp. 4011–4025. doi: 10.1016/j.csda.2012.05.004
- 4. Balakrishnan N. and Mitra D., Likelihood inference based on left truncated and right censored data from a gamma distribution, IEEE Trans. Reliab. 62 (2013), pp. 679–688. doi: 10.1109/TR.2013.2273039
- 5. Cox D.R., Regression models and life-tables, J. R. Stat. Soc. Ser. B. 34 (1972), pp. 187–220.
- 6. Gilks W.R., Best N.G., and Tan K.K.C., Adaptive rejection Metropolis sampling within Gibbs sampling, Appl. Statist. 44 (1995), pp. 455–472. doi: 10.2307/2986138
- 7. Gilks W.R. and Wild P., Adaptive rejection sampling for Gibbs sampling, Appl. Statist. 41 (1992), pp. 337–348. doi: 10.2307/2347565
- 8. Gill R., Vardi Y., and Wellner J.A., Large sample theory of empirical distributions in biased sampling models, Ann. Statist. 16 (1988), pp. 1069–1112. doi: 10.1214/aos/1176350948
- 10. Huang C.Y. and Qin J., Nonparametric estimation for length-biased and right-censored data, Biometrika 98 (2011), pp. 177–186. doi: 10.1093/biomet/asq069
- 11. Hyde J., Survival analysis with incomplete observations, in Biostatistics Casebook, John Wiley and Sons, New York, 1980.
- 12. Ibrahim J.G., Chen M.H., and Sinha D., Bayesian Survival Analysis, Springer-Verlag, New York, 2001.
- 13. Klein J.P. and Moeschberger M.L., Survival Analysis Techniques for Censored and Truncated Data, 2nd ed., Springer, New York, 1997.
- 15. Lagakos S.W., Barraj L.M., and Gruttola V.D., Nonparametric analysis of truncated survival data, with application to AIDS, Biometrika 75 (1988), pp. 515–523. doi: 10.1093/biomet/75.3.515
- 16. Lin D.Y. and Ying Z., Semiparametric analysis of the additive risk model, Biometrika 81 (1994), pp. 61–71. doi: 10.1093/biomet/81.1.61
- 17. Ning J., Qin J., and Shen Y., Buckley-James-type estimator with right-censored and length-biased data, Biometrics 67 (2011), pp. 1369–1378. doi: 10.1111/j.1541-0420.2011.01568.x
- 18. Ning J., Qin J., and Shen Y., Score estimating equations from embedded likelihood functions under accelerated failure time model, J. Am. Stat. Assoc. 109 (2014), pp. 1625–1635. doi: 10.1080/01621459.2014.946034
- 19. Ning J., Qin J., and Shen Y., Semiparametric accelerated failure time model for length-biased data with application to dementia study, Stat. Sin. 24 (2014), pp. 313–333.
- 20. Qin J. and Shen Y., Statistical methods for analyzing right-censored length-biased data under Cox model, Biometrics 66 (2010), pp. 382–392. doi: 10.1111/j.1541-0420.2009.01287.x
- 21. Sun J., Self-consistency estimation of distributions based on truncated and doubly censored data with applications to AIDS cohort studies, Lifetime Data Anal. 3 (1997), pp. 305–313. doi: 10.1023/A:1009609227969
- 22. Sun J., The Statistical Analysis of Interval-Censored Failure Time Data, Springer Science+Business Inc., New York, 2006.
- 23. Sun Y.F., Chan K.C.G., and Qin J., Simple and fast overidentified rank estimation for right-censored length-biased data and backward recurrence time, Biometrics 74 (2018), pp. 77–85. doi: 10.1111/biom.12727
- 24. Shen Y., Ning J., and Qin J., Analyzing length-biased data with semiparametric transformation and accelerated failure time models, J. Am. Stat. Assoc. 104 (2009), pp. 1192–1202. doi: 10.1198/jasa.2009.tm08614
- 25. Turnbull B.W., The empirical distribution function with arbitrarily grouped, censored and truncated data, J. R. Stat. Soc. B. 38 (1976), pp. 290–295.
- 26. Vardi Y., Nonparametric estimation in the presence of length bias, Ann. Statist. 10 (1982), pp. 616–620. doi: 10.1214/aos/1176345802
- 27. Vardi Y., Empirical distributions in selection bias models, Ann. Statist. 13 (1985), pp. 178–203. doi: 10.1214/aos/1176346585
- 28. Vardi Y. and Zhang C.H., Large sample study of empirical distributions in a random-multiplicative censoring model, Ann. Statist. 20 (1992), pp. 1022–1039. doi: 10.1214/aos/1176348668
- 29. Wang M.C., Nonparametric estimation from cross-sectional survival data, J. Am. Stat. Assoc. 86 (1991), pp. 130–143. doi: 10.1080/01621459.1991.10475011
- 30. Yin G. and Ibrahim J.G., Bayesian transformation hazard models, IMS Lecture Notes Monogr. Ser. 49 (2006), pp. 170–182. doi: 10.1214/074921706000000446

