A Monte Carlo method for variance estimation for estimators based on induced smoothing

Zhezhen Jin; Yongzhao Shao; Zhiliang Ying

doi:10.1093/biostatistics/kxu021

Biostatistics. 2014 May 7;16(1):179–188.

PMCID: PMC4288129 PMID: 24812418

Abstract

An important issue in statistical inference for semiparametric models is how to provide reliable and consistent variance estimation. Brown and Wang (2005. Standard errors and covariance matrices for smoothed rank estimators. Biometrika 92, 732–746) proposed a variance estimation procedure based on an induced smoothing for non-smooth estimating functions. Herein a Monte Carlo version is developed that does not require any explicit form for the estimating function itself, as long as numerical evaluation can be carried out. A general convergence theory is established, showing that any one-step iteration leads to a consistent variance estimator and continuation of the iterations converges at an exponential rate. The method is demonstrated through the Buckley–James estimator and the weighted log-rank estimators for censored linear regression, and rank estimation for multiple event times data.

Keywords: Accelerated failure time model, Asymptotic fiducial distribution, Buckley–James estimator, Censored data, Contraction mapping, Estimating function, Kaplan–Meier estimator, Monte Carlo integration, Rank estimator

1. Introduction

Many important estimators involve solving non-smooth and perhaps discontinuous estimating functions. Examples include least absolute deviation estimator (Bloomfield and Steiger, 1983), various rank estimators (Hettmansperger and McKean, 1998), and parameter estimators in censored linear regression (Buckley and James, 1979; Ritov, 1990; Tsiatis, 1990; Lai and Ying, 1991; Ying, 1993). Such estimators are also common in the econometrics literature, where robust procedures are advocated and censoring and sampling bias may occur. Koenker and Bassett (1978) pioneered the quantile regression method and Powell (1984) developed an extended least absolute deviation estimator for censored regression, all involving non-smooth estimating equations.

When the estimating functions are non-smooth, the limiting distributions of the resulting estimators often involve density functions, as exhibited in the above-cited examples. It is therefore desirable to develop methods for variance estimation that bypass density estimation. An interesting development is due to Brown and Wang (2005), who used a pseudo-Bayesian approach to obtain a naturally induced smoothed version whereby a consistent variance estimator can be obtained through an iterative procedure. They demonstrated the usefulness of their method through rank estimators. The work of Brown and Wang (2007) contains an extension of the method to censored linear regression using the Gehan estimating function. Wang and Zhao (2008), Johnson and Strawderman (2009), and Fu and others (2010) further applied the procedure to the analysis of clustered data. Recently, the procedure has been successfully applied to other regression models: Pang and others (2012) applied it to the censored quantile regression model, Li and others (2012) to the accelerated hazards model, and Lin and Peng (2013) to the linear transformation model.

It appears that a key component in the implementation and theoretic analysis in Brown and Wang (2005, 2007) is that the smoothed version of the estimating function, i.e. integration of the original estimating function with respect to a normal kernel, has a closed analytic form from which the iterative algorithm and convergence analysis can be carried out. This would exclude many well-known estimators of which the estimating functions are non-smooth and non-monotone. In particular, it excludes all weighted log-rank estimators (except for the case of Gehan) and the Buckley–James estimator for censored linear regression. They also noted that their approach is effective only when the underlying estimating function is monotone.

Motivated by Brown and Wang (2005), this paper develops a general way for approximating standard errors. It is based on the use of numerical approximations to integrals and modifies the Brown–Wang method so that the scope and applicability are substantially expanded. This new development is particularly appealing to complicated situations, such as the Buckley–James and weighted log-rank estimators in censored linear regression where the estimating functions take complex forms. The lack of an explicit form for the smoothed estimating function in the general situation also entails that new analytic tools are needed to ensure general convergence of the iterative algorithm. Indeed, we show that the so-called contraction mapping theorem is applicable stochastically and, in consequence, an exponential rate of convergence is established.

The paper is organized as follows. In the next section, we describe our proposed method along with theory. In Section 3, we apply the proposed method to estimate the variance of the parameter estimators in rank estimation and least squares estimation for censored regression, and discuss its extension to the multivariate cases. In Section 4, we present several simulation studies and real examples. We conclude with a discussion in Section 5. Supplementary material available at Biostatistics online outlines all theoretical proofs.

2. Method and theory

Let $\beta$ be a $p$-dimensional vector of parameters that is related to the $n$ observations. We shall use $U_n(\beta)$ to denote a vector of estimating functions for $\beta$. The resulting estimator $\hat\beta$ is obtained by solving $U_n(\beta) = 0$. Without loss of generality, we assume that $U_n$ is properly scaled so that $U_n(\beta)$ converges to a non-random function $u(\beta)$ and $u(\beta_0) = 0$, where $\beta_0$ denotes the true parameter. The first two basic assumptions are as follows.

Assumption A1 —

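$n^{1/2} U_n(\beta_0)$ is asymptotically normal with mean 0 and a covariance matrix $V$, i.e.

$$n^{1/2} U_n(\beta_0) \to N(0, V) \quad \text{in distribution}. \qquad (2.1)$$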

Assumption A2 —

The estimator $\hat\beta$ obtained by solving $U_n(\beta) = 0$ is $n^{1/2}$-consistent, and is asymptotically normal with mean $\beta_0$ and covariance matrix $n^{-1}\Sigma$.

Assumptions A1 and A2 are usually satisfied by many estimating functions. Inference on $\beta_0$, e.g. construction of a confidence set, requires a consistent estimate of $\Sigma$. As noted in Jin and others (2001) and Brown and Wang (2005), a consistent estimator $\hat V$ of $V$ is usually easy to obtain because it only involves $U_n(\beta)$. If $U_n(\beta)$ has a continuous derivative, the typical asymptotic arguments, as discussed in Brown and Wang (2005), would lead to

$$n^{1/2}(\hat\beta - \beta_0) = -A^{-1} n^{1/2} U_n(\beta_0) + o_p(1), \qquad (2.2)$$

that is, $\hat\beta$ is asymptotically normal with mean $\beta_0$ and covariance matrix $n^{-1}\Sigma$, where $\Sigma = A^{-1} V (A^{-1})^T$ and $A = \partial u(\beta_0)/\partial\beta^T$. Thus, $A$ can be estimated by $\hat A = \partial U_n(\hat\beta)/\partial\beta^T$ and $\Sigma$ can be estimated by $\hat A^{-1} \hat V (\hat A^{-1})^T$. However, the estimating function is often non-smooth, so one cannot estimate the slope matrix by simply taking partial derivatives. As a consequence, variance estimation for $\hat\beta$ can be a challenging issue.

The idea behind the elegant approach of Brown and Wang (2005) is to use the asymptotic fiducial distribution as the basis for an induced smoothing kernel. Specifically, $\beta$ is in distribution approximately equal to $\hat\beta + n^{-1/2}\Sigma^{1/2} Z$, where $Z$ is the standard $p$-variate normal random vector. It induces the following smoothed version of the estimating function:

$$\tilde U_n(\beta) = E_Z\{U_n(\beta + n^{-1/2}\Sigma^{1/2} Z)\}, \qquad (2.3)$$

where $E_Z$ denotes the expectation with respect to $Z$. Brown and Wang (2005) then suggested to obtain $\hat\beta$ and $\hat\Sigma$ by jointly solving $\tilde U_n(\beta) = 0$ and the following equation:

$$\Sigma = \tilde A(\beta)^{-1} \hat V \{\tilde A(\beta)^{-1}\}^T, \qquad (2.4)$$

where $\tilde A(\beta) = \partial \tilde U_n(\beta)/\partial\beta^T$ and $\hat V$ is a consistent estimator of $V$. Because both sides of (2.4) involve $\Sigma$, an iterative algorithm for solving results.
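The induced smoothing in (2.3) is a Gaussian expectation, so it can be approximated by averaging the estimating function over normal draws. The sketch below is only an illustration under stated assumptions: the function names, the use of a Cholesky factor as the matrix square root, and the toy step function are choices made here and are not taken from the paper. The step function is used because its Gaussian smoothing has a closed form, which gives something to compare the Monte Carlo average against.

```python
import math
import numpy as np

def smooth_by_monte_carlo(U, beta, Sigma, n, M=20000, rng=None):
    """Monte Carlo approximation of E_Z[U(beta + n^{-1/2} Sigma^{1/2} Z)].

    U     : callable mapping a length-p array to a length-p array
    beta  : point at which the smoothed function is evaluated
    Sigma : p x p positive-definite matrix (covariance on the sqrt(n) scale)
    n     : sample size
    M     : number of standard normal draws (hypothetical default)
    """
    rng = np.random.default_rng() if rng is None else rng
    beta = np.atleast_1d(np.asarray(beta, dtype=float))
    # Any square root of Sigma gives the same expectation; Cholesky is used here.
    root = np.linalg.cholesky(np.atleast_2d(Sigma)) / math.sqrt(n)
    Z = rng.standard_normal((M, beta.size))
    values = np.array([U(beta + root @ z) for z in Z])
    return values.mean(axis=0)

def step(b):
    """Toy non-smooth estimating function: -1/2 below zero, +1/2 from zero on."""
    return np.where(b >= 0.0, 0.5, -0.5)

def exact_smoothed_step(b, sd):
    """Closed form of E[step(b + sd * Z)] = Phi(b / sd) - 1/2, written with erf."""
    return 0.5 * math.erf(b / (sd * math.sqrt(2.0)))

rng = np.random.default_rng(0)
approx = smooth_by_monte_carlo(step, [0.3], [[4.0]], n=16, M=40000, rng=rng)
exact = exact_smoothed_step(0.3, sd=0.5)  # sd = sqrt(4 / 16)
print(approx[0], exact)
```

The two printed numbers should agree up to Monte Carlo error, which shrinks at the usual rate of one over the square root of `M`.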

For certain rank estimators and quantiles, Brown and Wang (2005, 2007) were able to obtain manageable analytic forms for the smoothed estimating function $\tilde U_n(\beta)$ and its derivative $\tilde A(\beta)$, and showed that their proposed iterative algorithm converges numerically. The approach, however, is not applicable if estimating functions are non-smooth and their smoothed version is too complicated to be written out in simple analytic forms. Examples of this kind include all weighted log-rank estimators (except the Gehan estimator) and the Buckley–James estimator for censored linear regression. Thus, there is a need to develop a simple and more generally applicable algorithm to estimate the variance $\Sigma$. It is also desirable to investigate the convergence property of the iterative algorithms under more general conditions.

Next we state another assumption on local asymptotic linearity (LAL), which is generally satisfied even for estimating functions $U_n(\beta)$ that are non-smooth and/or non-monotone. In fact, the LAL is a commonly used assumption for proving asymptotic normality. In particular, all examples considered in Brown and Wang (2005, 2007) and in this paper satisfy the following LAL assumption.

Assumption A3 —

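$U_n(\beta)$ is locally asymptotically linear (LAL) at $\beta_0$, i.e.

$$\sup_{\beta \in \mathcal{N}(\beta_0)} \frac{\|U_n(\beta) - U_n(\beta_0) - A(\beta - \beta_0)\|}{n^{-1/2} + \|\beta - \beta_0\|} = o_p(1), \qquad (2.5)$$

where $A$ is a non-degenerate slope matrix, $\|\cdot\|$ denotes the Euclidean norm, and $\mathcal{N}(\beta_0)$ is some small neighborhood of $\beta_0$.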

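Note that Assumptions A1–A3 imply (2.2) with $A$ as defined in (2.5). That is, $n^{1/2}(\hat\beta - \beta_0) = -A^{-1} n^{1/2} U_n(\beta_0) + o_p(1)$. It follows that $\hat\beta$ is asymptotically normal with mean $\beta_0$ and covariance matrix $n^{-1}\Sigma = n^{-1} A^{-1} V (A^{-1})^T$.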

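Using either Stein's Identity (Stein, 1981) or a simple integration by parts argument, the derivative of the smoothed estimating function $\tilde U_n(\beta)$ defined in (2.3) satisfies the following equation:

$$\frac{\partial \tilde U_n(\beta)}{\partial \beta^T} = n^{1/2} E_Z\{U_n(\beta + n^{-1/2}\Sigma^{1/2} Z) Z^T\} \Sigma^{-1/2}. \qquad (2.6)$$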

Remark 2.1 —

The validity of (2.6) does not require the existence of the partial derivative of $U_n(\beta)$. It is valid as long as the order of the partial derivative and the expectation is exchangeable, which can be checked with Fubini's Theorem in measure theory; see supplementary material available at Biostatistics online for a proof.
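The integration-by-parts argument behind (2.6) can be sketched as follows. This is a heuristic outline under assumed notation, not the proof in the supplementary material: $U_n$ is the estimating function, $Z$ is a standard $p$-variate normal vector with density $\varphi_p$, and $H = n^{-1/2}\Sigma^{1/2}$ is the smoothing scale. The key point is that, after a change of variables, the parameter appears only in the Gaussian density, which is always differentiable.

```latex
\begin{align*}
\tilde U_n(\beta)
  &= E_Z\{U_n(\beta + HZ)\}
   = \int U_n(y)\,|H|^{-1}\varphi_p\{H^{-1}(y-\beta)\}\,dy, \\
\frac{\partial}{\partial\beta^{T}}\,\varphi_p\{H^{-1}(y-\beta)\}
  &= \varphi_p\{H^{-1}(y-\beta)\}\,\{H^{-1}(y-\beta)\}^{T}H^{-1}, \\
\frac{\partial \tilde U_n(\beta)}{\partial\beta^{T}}
  &= \int U_n(y)\,|H|^{-1}\varphi_p\{H^{-1}(y-\beta)\}\,
       \{H^{-1}(y-\beta)\}^{T}H^{-1}\,dy \\
  &= E_Z\{U_n(\beta + HZ)\,Z^{T}\}\,H^{-1}
   = n^{1/2}\,E_Z\{U_n(\beta + n^{-1/2}\Sigma^{1/2}Z)\,Z^{T}\}\,\Sigma^{-1/2}.
\end{align*}
```

The third line is where differentiation and integration are exchanged, which is the step the remark refers to. No derivative of $U_n$ is taken at any point.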

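Under the above three general Assumptions A1–A3, a very simple consistent estimate of $\Sigma$ is given by $\hat\Sigma = \hat A^{-1} \hat V (\hat A^{-1})^T$, where $\hat A = n^{1/2} E_Z\{U_n(\hat\beta + n^{-1/2} Z) Z^T\}$ is the right-hand side of (2.6) evaluated at $\beta = \hat\beta$ and $\Sigma = I$, with $I$ being the identity matrix. Note that the two integrals in (2.3) and (2.6) can be numerically approximated arbitrarily well by a simple Monte Carlo method (MCM) or the Gaussian quadrature method (GQM). More specifically, we propose the following two numerical methods to provide simple consistent estimates of the variance $\Sigma$.

An MCM

Step 1: Calculate a consistent estimator $\hat V$ of the matrix $V$.

Step 2: Choose $\Sigma^{(0)} = I$ and a large number $M$.

Step 3: For the $k$th step ($k \geq 1$), generate $Z_j$, $j = 1, \ldots, M$, from the multivariate normal distribution $N(0, I)$. Estimate $A$ by

$$\hat A^{(k)} = \frac{n^{1/2}}{M} \sum_{j=1}^{M} U_n\{\hat\beta + n^{-1/2} (\Sigma^{(k-1)})^{1/2} Z_j\} Z_j^T (\Sigma^{(k-1)})^{-1/2}.$$

Step 4: Calculate $\{\hat A^{(k)}\}^{-1}$ and define $\Sigma^{(k)} = \{\hat A^{(k)}\}^{-1} \hat V [\{\hat A^{(k)}\}^{-1}]^T$.

Step 5: Repeat Steps 3 and 4 for the next $k$ until $\hat A^{(k)}$ and $\Sigma^{(k)}$ converge.

The covariance matrix of $n^{1/2}(\hat\beta - \beta_0)$ will be estimated using the $\Sigma^{(k)}$ at convergence in the above iterative algorithm.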

The convergence of $\hat A^{(k)}$ and $\Sigma^{(k)}$ can be assessed by commonly used matrix convergence criteria, such as the absolute difference or relative change between successive iterates. One may choose a very large $M$ to ensure a good approximation for the Gaussian integral.
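The Monte Carlo steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function name `mcm_variance`, the identity starting value, the Cholesky factor used as the matrix square root, and the maximum-absolute-difference stopping rule are all choices made here. The normal draws are generated once and reused in every iteration, so that the stopping rule is not dominated by Monte Carlo noise. The toy example is the sample median, whose estimating function is a step function and whose asymptotic variance for standard normal data is $\pi/2$.

```python
import numpy as np

def mcm_variance(U, beta_hat, V_hat, n, M=5000, max_iter=50, tol=1e-4, rng=None):
    """Iterative Monte Carlo estimate of Sigma = A^{-1} V A^{-T}.

    U        : callable, length-p array -> length-p array (scaled estimating function)
    beta_hat : solution of U(beta) = 0
    V_hat    : consistent estimate of the covariance of sqrt(n) * U(beta_0)
    """
    rng = np.random.default_rng() if rng is None else rng
    beta_hat = np.atleast_1d(np.asarray(beta_hat, dtype=float))
    V_hat = np.atleast_2d(np.asarray(V_hat, dtype=float))
    p = beta_hat.size
    Sigma = np.eye(p)                      # starting value (assumed identity)
    Z = rng.standard_normal((M, p))        # draws reused across iterations
    for k in range(1, max_iter + 1):
        L = np.linalg.cholesky(Sigma)      # one square root of the current Sigma
        U_vals = np.array([U(beta_hat + L @ z / np.sqrt(n)) for z in Z])
        # Monte Carlo version of sqrt(n) * E[U(beta_hat + n^{-1/2} L Z) Z^T] L^{-1}
        A = np.sqrt(n) * (U_vals.T @ Z / M) @ np.linalg.inv(L)
        A_inv = np.linalg.inv(A)
        Sigma_new = A_inv @ V_hat @ A_inv.T
        Sigma_new = (Sigma_new + Sigma_new.T) / 2.0   # remove rounding asymmetry
        change = np.max(np.abs(Sigma_new - Sigma))
        Sigma = Sigma_new
        if change < tol:
            break
    return Sigma, A, k

# Toy example: sample median of standard normal data.
rng = np.random.default_rng(1)
n = 100_000
x = np.sort(rng.standard_normal(n))
theta_hat = np.median(x)

def U_median(theta):
    """Scaled estimating function F_n(theta) - 1/2 (a step function in theta)."""
    return np.searchsorted(x, theta, side="right") / n - 0.5

# V = 1/4 because the indicator 1{x <= theta_0} has variance 1/4 at the median.
Sigma_med, A_med, n_iter = mcm_variance(U_median, [theta_hat], [[0.25]], n, rng=rng)
print(Sigma_med[0, 0], np.pi / 2)
```

The printed estimate should be in the neighbourhood of $\pi/2$, with some scatter that comes from both the data and the finite number of draws.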

A GQM

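Replace Step 3 in MCM with

Step 3′: Choose grid $p$-vector points $z_j$, $j = 1, \ldots, M$, based on a pre-specified accuracy criterion, and calculate

$$\hat A^{(k)} = n^{1/2} \sum_{j=1}^{M} w_j U_n\{\hat\beta + n^{-1/2} (\Sigma^{(k-1)})^{1/2} z_j\} z_j^T (\Sigma^{(k-1)})^{-1/2},$$

where $w_j$ are the Gaussian quadrature weights. One choice of $(z_j, w_j)$ is based on 1D Gauss–Hermite quadrature calculations.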
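One way to build such a grid is a tensor product of 1D Gauss–Hermite rules, sketched below. The helper names and the choice of `m` nodes per dimension are assumptions of this illustration. NumPy's `hermegauss` uses the weight function $\exp(-x^2/2)$, so dividing its weights by $\sqrt{2\pi}$ turns the rule into an expectation under the standard normal distribution. The grid has `m**p` points, so this construction is only practical for small `p`, and quadrature is generally less accurate for discontinuous integrands than for smooth ones.

```python
import itertools
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def gauss_hermite_grid(p, m):
    """Tensor-product nodes and weights for a standard p-variate normal expectation."""
    x, w = hermegauss(m)                   # 1D rule for the weight exp(-x^2 / 2)
    w = w / np.sqrt(2.0 * np.pi)           # normalise so the 1D weights sum to 1
    nodes = np.array(list(itertools.product(x, repeat=p)))        # shape (m**p, p)
    weights = np.prod(np.array(list(itertools.product(w, repeat=p))), axis=1)
    return nodes, weights

def gqm_slope(U, beta_hat, Sigma, n, m=10):
    """Quadrature version of sqrt(n) * E[U(beta_hat + n^{-1/2} L Z) Z^T] L^{-1}."""
    beta_hat = np.atleast_1d(np.asarray(beta_hat, dtype=float))
    L = np.linalg.cholesky(np.atleast_2d(Sigma))   # one square root of Sigma
    nodes, weights = gauss_hermite_grid(beta_hat.size, m)
    U_vals = np.array([U(beta_hat + L @ z / np.sqrt(n)) for z in nodes])
    UZ = (U_vals * weights[:, None]).T @ nodes     # weighted sum of U(.) z^T
    return np.sqrt(n) * UZ @ np.linalg.inv(L)

# Check on a linear estimating function, for which the slope is known exactly.
A_true = np.array([[2.0, 0.5], [0.5, 1.0]])
b_true = np.array([1.0, -1.0])
A_quad = gqm_slope(lambda b: A_true @ (b - b_true), b_true,
                   np.array([[1.0, 0.2], [0.2, 0.5]]), n=50, m=5)
print(np.round(A_quad, 6))
```

For a linear estimating function the rule is exact up to rounding, so `A_quad` reproduces `A_true`. This function can replace the Monte Carlo average inside an iteration such as Steps 3 to 5.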

The following theorem justifies the convergence of the algorithms in MCM and GQM. A proof of the theorem can be found in supplementary material available at Biostatistics online.

Theorem 2.2 —

Under Assumptions A1–A3, the one-step ($k = 1$) estimates of $A$ and $\Sigma$ in the MCM or in the GQM with a large $M$ are consistent as $n \to \infty$. Moreover, the iteration algorithm in either MCM or GQM converges under Assumptions A1–A3.

3. Examples

We illustrate the methods with three examples: weighted log-rank estimators, the Buckley–James estimators for censored linear regression, and their extensions to multivariate data. It should be noted that the approach of Brown and Wang (2005, 2007) is not applicable to any of the three examples, as the corresponding estimating functions are non-monotone and do not have a simple form that permits explicit evaluation of the induced smoothed versions.

3.1. Weighted log-rank estimators

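Consider the accelerated failure time (AFT) model for survival times (Kalbfleisch and Prentice, 2002). Let $T_i$ be the failure time and $X_i$ be the $p$-vector of covariates for the $i$th individual, $i = 1, \ldots, n$. The AFT model relates the logarithm of the failure time, $\log T_i$, linearly to the covariates

$$\log T_i = \beta_0^T X_i + \epsilon_i, \qquad (3.1)$$

where $\epsilon_i$ are independent and identically distributed random errors with unknown distribution function $F$ and $\beta_0$ is a $p$-vector of unknown regression parameters. Instead of the $T_i$, we observe $\tilde T_i = \min(T_i, C_i)$ and $\Delta_i = 1(T_i \leq C_i)$, where $C_i$ are censoring times. We assume that observations $(\tilde T_i, \Delta_i, X_i)$, $i = 1, \ldots, n$, are independent and identically distributed.

The general weighted log-rank estimating function with weight function $\phi$ takes the form

$$U_\phi(\beta) = n^{-1} \sum_{i=1}^{n} \Delta_i \phi\{\beta, e_i(\beta)\} [X_i - \bar X\{\beta, e_i(\beta)\}], \qquad (3.2)$$

where $e_i(\beta) = \log \tilde T_i - \beta^T X_i$, $\bar X(\beta, t) = S^{(1)}(\beta, t)/S^{(0)}(\beta, t)$, $S^{(0)}(\beta, t) = n^{-1} \sum_{j} 1\{e_j(\beta) \geq t\}$, and $S^{(1)}(\beta, t) = n^{-1} \sum_{j} X_j 1\{e_j(\beta) \geq t\}$. The corresponding estimate $\hat\beta_\phi$ solves $U_\phi(\beta) = 0$.

In general, the $U_\phi(\beta)$ can be neither smooth nor monotone. Step 1 in the previous section can be easily implemented as the variance of $n^{1/2} U_\phi(\beta_0)$; $V$ can be estimated through the usual plug-in estimator

$$\hat V = n^{-1} \sum_{i=1}^{n} \Delta_i \phi^2\{\hat\beta_\phi, e_i(\hat\beta_\phi)\} [X_i - \bar X\{\hat\beta_\phi, e_i(\hat\beta_\phi)\}]^{\otimes 2},$$

where $a^{\otimes 2} = a a^T$ for a matrix $a$; cf. Ying and others (1992). Steps 2–5 can also be implemented straightforwardly.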
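To make the point concrete that only numerical evaluation of the estimating function is needed, the sketch below evaluates a weighted log-rank estimating function for two standard weights: the constant weight (log-rank) and the risk-set proportion (Gehan). The function name, argument layout, and the division by `n` are assumptions of this illustration, and ties among residuals are handled only in the naive way implied by the `>=` comparison.

```python
import numpy as np

def weighted_logrank_U(beta, log_time, delta, X, weight="gehan"):
    """Scaled weighted log-rank estimating function evaluated at beta.

    log_time : log of the observed (possibly censored) times, shape (n,)
    delta    : event indicators (1 = failure observed, 0 = censored), shape (n,)
    X        : covariate matrix, shape (n, p)
    weight   : "logrank" (weight 1) or "gehan" (weight = risk-set proportion)
    """
    beta = np.asarray(beta, dtype=float)
    n = len(log_time)
    e = log_time - X @ beta                               # residuals e_i(beta)
    at_risk = (e[None, :] >= e[:, None]).astype(float)    # [i, j] = 1{e_j >= e_i}
    S0 = at_risk.sum(axis=1)                              # risk-set size at e_i (>= 1)
    X_bar = (at_risk @ X) / S0[:, None]                   # risk-set covariate mean
    if weight == "logrank":
        phi = np.ones(n)
    elif weight == "gehan":
        phi = S0 / n
    else:
        raise ValueError("weight must be 'logrank' or 'gehan'")
    return ((delta * phi)[:, None] * (X - X_bar)).sum(axis=0) / n
```

A function of this kind can be passed, with the data fixed, as the estimating function in a Monte Carlo or quadrature routine for the slope matrix. The pairwise comparison costs on the order of `n**2` operations per evaluation, which is acceptable for moderate sample sizes.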

3.2. Buckley–James estimators

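For the censored linear regression model in Section 3.1, Buckley and James (1979) considered an estimating equation based on the least squares principle with

$$U_{BJ}(\beta) = n^{-1} \sum_{i=1}^{n} (X_i - \bar X)\{\hat Y_i(\beta) - \beta^T X_i\}, \qquad \hat Y_i(\beta) = \Delta_i \log \tilde T_i + (1 - \Delta_i)\left[\frac{\int_{e_i(\beta)}^{\infty} u \, d\hat F_\beta(u)}{1 - \hat F_\beta\{e_i(\beta)\}} + \beta^T X_i\right],$$

where $\bar X = n^{-1} \sum_{i=1}^{n} X_i$ and $\hat F_\beta$ is the left-continuous version of the Kaplan–Meier estimate of $F$ based on $\{e_i(\beta), \Delta_i\}$. The estimator $\hat\beta_{BJ}$ can be obtained by the method of Jin and others (2006a).

Step 1 in Section 2 can be easily implemented as the variance of $n^{1/2} U_{BJ}(\beta_0)$; $V$ can be estimated by the method in Ying and others (1992).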

The remaining steps are straightforward.
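For readers who want to see how a Buckley–James type estimating function can be evaluated numerically, here is a compact sketch. It is an illustration under explicit assumptions, not the implementation of Jin and others (2006a): the function name is hypothetical, ties among residuals are ignored, the largest residual is treated as uncensored so that the Kaplan–Meier masses sum to one (a common convention), and each censored residual is replaced by its estimated conditional mean given that it exceeds the observed value.

```python
import numpy as np

def buckley_james_U(beta, log_time, delta, X):
    """Scaled Buckley-James type estimating function evaluated at beta."""
    beta = np.asarray(beta, dtype=float)
    n = len(log_time)
    e = log_time - X @ beta                      # residuals e_i(beta)
    order = np.argsort(e, kind="stable")
    e_s = e[order]
    d_s = np.asarray(delta)[order].astype(float)
    d_s[-1] = 1.0                                # assumption: largest residual uncensored
    at_risk = n - np.arange(n)                   # number at risk at each sorted residual
    surv = np.cumprod(1.0 - d_s / at_risk)       # Kaplan-Meier survival after each point
    mass = np.concatenate(([1.0], surv[:-1])) - surv   # Kaplan-Meier jump sizes
    # Sums over sorted residuals strictly beyond each position.
    tail_mass = np.concatenate((np.cumsum(mass[::-1])[::-1][1:], [0.0]))
    tail_sum = np.concatenate((np.cumsum((e_s * mass)[::-1])[::-1][1:], [0.0]))
    safe_mass = np.where(tail_mass > 0, tail_mass, 1.0)
    cond_mean = np.where(tail_mass > 0, tail_sum / safe_mass, e_s)
    e_star_sorted = np.where(d_s == 1.0, e_s, cond_mean)   # impute censored residuals
    e_star = np.empty(n)
    e_star[order] = e_star_sorted                # back to the original ordering
    Xc = X - X.mean(axis=0)
    return Xc.T @ e_star / n
```

With no censoring the imputation step does nothing and the function reduces to the centred least-squares normal equations, which is a convenient sanity check before using it with censored data.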

3.3. Rank estimation for multiple event times data

Jin and others (2006b) considered the extension of the rank estimation to multiple events data, recurrent events data, and clustered failure time data and developed resampling approaches to estimate the limiting covariance matrices without non-parametric density estimation or evaluation of numerical derivatives. However, their implementation is computationally intensive. The approach developed in this paper offers a rather simple way of estimating the limiting covariance matrices. We illustrate the use of the proposed method with the rank estimation in multiple events data.

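Suppose that a subject can potentially experience $K$ types of events. For the $i$th subject, $i = 1, \ldots, n$ and $k = 1, \ldots, K$, let $T_{ki}$ be the time to the $k$th event, $C_{ki}$ be the corresponding censoring time, and $X_{ki}$ be the corresponding vector of covariates. The observed data consist of $(\tilde T_{ki}, \Delta_{ki}, X_{ki})$, where $\tilde T_{ki} = \min(T_{ki}, C_{ki})$ and $\Delta_{ki} = 1(T_{ki} \leq C_{ki})$.

Jin and others (2006b) considered the marginal distributions of the $K$ types of events with AFT models while leaving the dependence structures unspecified:

$$\log T_{ki} = \beta_k^T X_{ki} + \epsilon_{ki}, \qquad k = 1, \ldots, K, \quad i = 1, \ldots, n,$$

where $\beta_k$ is a vector of unknown regression parameters, and $(\epsilon_{1i}, \ldots, \epsilon_{Ki})$, $i = 1, \ldots, n$, are independent random vectors with a common, but completely unspecified, joint distribution that are independent of the $X_{ki}$.

Let $e_{ki}(\beta) = \log \tilde T_{ki} - \beta^T X_{ki}$ and $S_k^{(r)}(\beta, t) = n^{-1} \sum_{i=1}^{n} X_{ki}^{r} 1\{e_{ki}(\beta) \geq t\}$, $r = 0, 1$. The weighted log-rank estimating function for $\beta_k$ is given by

$$U_{k,\phi}(\beta) = n^{-1} \sum_{i=1}^{n} \Delta_{ki} \phi_k\{\beta, e_{ki}(\beta)\} [X_{ki} - \bar X_k\{\beta, e_{ki}(\beta)\}],$$

where $\bar X_k(\beta, t) = S_k^{(1)}(\beta, t)/S_k^{(0)}(\beta, t)$, and $\phi_k$ is a weight function. The resulting estimator is denoted by $\hat\beta_k$. Note that the choices of $\phi_k = 1$, $\phi_k = S_k^{(0)}$, and $\phi_k$ being the Kaplan–Meier estimator based on $\{e_{ki}(\beta), \Delta_{ki}\}$ correspond to the log-rank, Gehan–Wilcoxon, and Prentice–Wilcoxon statistics, respectively.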

Let Inline graphic and . The random vector is asymptotically zero-mean normal with covariance matrix .

The Inline graphic can be estimated by the empirical estimator of covariance matrix between and .

Let Inline graphic , . Denote the empirical estimator of covariance matrix of as ; then the can be estimated as follows.

Step 1: Generate Inline graphic , from -dimensional multivariate normal distribution .

Step 3: Choose Inline graphic . Then estimate by

Step 4: Calculate Inline graphic and denote .

Step 5: Replace Inline graphic with ; then iterate between Steps 3 and 4 until converges.

The covariance matrix of Inline graphic will be .

4. Simulations and application to real data

Simulation studies were conducted to assess the performance of the proposed methods. Here we present simulation results for censored linear regression model (3.1) using the Gehan, the log-rank, and the Buckley–James least squares estimating equations. Following Jin and others (2006a), we generate failure times from the model

where $X_1$ is Bernoulli with success probability 0.5, $X_2$ is normal with mean 0 and standard deviation 0.5, and $\epsilon$ has the standard normal or extreme-value distribution. The censoring times were generated from the uniform $\mathrm{Un}(0, c)$ distribution, where $c$ was chosen to yield a desired level of censoring. We estimated the regression coefficients of $X_1$ and $X_2$ with the log-rank estimation method as in Jin and others (2003) and the least squares method as in Jin and others (2006a). A total of 1000 Monte Carlo standard bivariate normal random vectors were used, and the initial value was set to be ….

The results for a sample size of 100 based on 1000 simulated datasets are summarized in Tables 1 and 2. In all cases, the proposed procedure accurately estimates the variability of the parameter estimator, and the confidence intervals have proper coverage probabilities.

Table 1.

Summary statistics for the simulation studies: normal error

		Gehan estimator				Log-rank				Least squares
Parameter	Censoring (%)	Bias	SE	SEE	CP	Bias	SE	SEE	CP	Bias	SE	SEE	CP
	0	0.002	0.207	0.209	0.945	0.006	0.222	0.225	0.941	0.003	0.202	0.213	0.957
	25		0.230	0.227	0.947	0.001	0.246	0.245	0.940	0.000	0.226	0.235	0.949
	50		0.257	0.260	0.949		0.274	0.280	0.948		0.254	0.267	0.953

	0		0.213	0.211	0.943		0.223	0.227	0.955		0.208	0.215	0.954
	25		0.233	0.231	0.938		0.242	0.247	0.951	0.000	0.226	0.239	0.959
	50	0.005	0.263	0.267	0.955		0.277	0.284	0.948	0.003	0.258	0.274	0.963

Bias, bias of the parameter estimator; SE, standard error of the parameter estimator; SEE, mean of the standard error estimator; CP, coverage probability of the 95% confidence interval.
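The four summary columns can be computed from simulation replicates as in the sketch below. The function name and the normal critical value of 1.96 for the 95% Wald interval are assumptions of this illustration, and SE is taken here as the sample standard deviation of the estimates across replicates.

```python
import numpy as np

def summarize_replicates(estimates, se_estimates, truth, z=1.96):
    """Bias, SE, SEE and CP for one parameter across simulated datasets.

    estimates    : parameter estimate from each replicate
    se_estimates : estimated standard error from each replicate
    truth        : true parameter value used to generate the data
    """
    estimates = np.asarray(estimates, dtype=float)
    se_estimates = np.asarray(se_estimates, dtype=float)
    lower = estimates - z * se_estimates       # Wald interval limits
    upper = estimates + z * se_estimates
    return {
        "Bias": estimates.mean() - truth,
        "SE": estimates.std(ddof=1),
        "SEE": se_estimates.mean(),
        "CP": np.mean((lower <= truth) & (truth <= upper)),
    }

summary = summarize_replicates([0.9, 1.1, 1.0, 1.2], [0.1, 0.1, 0.1, 0.1], truth=1.0)
print(summary)
```

A variance estimator is doing its job when SEE is close to SE and CP is close to the nominal 0.95.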

Table 2.

Summary statistics for the simulation studies: extreme-value error

		Gehan estimator				Log-rank				Least squares
Parameter	Censoring (%)	Bias	SE	SEE	CP	Bias	SE	SEE	CP	Bias	SE	SEE	CP
	0		0.236	0.238	0.953		0.204	0.211	0.952		0.262	0.278	0.957
	25	0.000	0.289	0.282	0.941		0.243	0.247	0.957		0.306	0.307	0.949
	50	0.000	0.368	0.362	0.951		0.318	0.321	0.956		0.379	0.370	0.945

	0	0.005	0.243	0.241	0.945		0.216	0.213	0.938	0.005	0.266	0.281	0.956
	25	0.003	0.286	0.288	0.945		0.249	0.250	0.949	0.002	0.300	0.314	0.957
	50	0.008	0.361	0.373	0.945	0.004	0.318	0.324	0.947	0.007	0.366	0.381	0.955

Bias, bias of the parameter estimator; SE, standard error of the parameter estimator; SEE, mean of the standard error estimator; CP, coverage probability of the 95% confidence interval.

We applied the method to the data on multiple myeloma reported by Krall and others (1975), which is the main example in SAS PROC PHREG (SAS Institute, 1999). Two standardized covariates, $\log$(BUN) and hemoglobin at diagnosis (HGB), were considered for the censored regression model in Section 3.1. A total of 10 000 Monte Carlo standard bivariate normal random vectors were used, the initial value was set to be …, and 0.0001 was used as the convergence criterion between successive estimates. Convergence was reached after three or four iterations. It yielded standard errors (0.142, 0.168) for the Gehan estimate (…, 0.292), (0.173, 0.158) for the log-rank estimate (…, 0.268), and (0.122, 0.146) for the least-squares estimate (…, 0.281). The results are similar to those obtained with the resampling approach in Jin and others (2003, 2006a).

We also reanalyzed the Stanford heart transplantation data in Miller and Halpern (1982) by regressing the base-10 logarithm of the survival time on the patient's age and the T5 mismatch score for the 157 patients with complete records on the T5 mismatch score. A total of 10 000 Monte Carlo standard bivariate normal random vectors were used, the initial value was set to be …, and 0.0001 was used as the convergence criterion between successive estimates. After convergence in four iterations, it yielded standard errors (0.0090, 0.1565) for the Gehan estimate (…, …) and (0.0089, 0.1540) for the least-squares estimate (…, …). The results are similar to those obtained with the resampling approach in Jin and others (2006a).

In our numerical studies, the proposed MCM used significantly less computational time compared with the resampling method for a similar accuracy in results.

5. Discussion

Variance estimation is an important aspect in semiparametric inference. It can be a thorny issue when the corresponding estimating functions are non-smooth. The Brown–Wang approach provides a simple solution through an induced smoothing, and can be easily implemented and justified when the closed form of the induced smoothed estimating function is available.

The present paper expands the scope and applicability of the Brown–Wang approach by recognizing that smoothing can be carried out via Monte Carlo approximations. This is especially crucial when the underlying estimating equations involve the empirical version of the infinite-dimensional parameter in the semiparametric model, as demonstrated through several examples that are common in semiparametric analysis of failure time data.

The paper focuses on the parametric component of the semiparametric model. It is certainly of interest to extend the approach so that inference for the non-parametric component can be carried out properly. This may require effective handling of estimation for the non-parametric part without creating a large number of estimating equations that increases with the sample size.

Supplementary material

Supplementary Material is available at http://biostatistics.oxfordjournals.org.

Funding

Y.S.'s research is partially supported by the NYU Cancer Center Support Grant 2P30 CA16087.

Acknowledgments

We thank an associate editor and two referees for their careful reading and valuable comments. Conflict of Interest: None declared.

References

Bloomfield P., Steiger W. L. (1983). Least Absolute Deviations. Theory, Applications, and Algorithms. Progress in Probability and Statistics 6. Boston, MA: Birkhäuser Boston.
Brown B. M., Wang Y.-G. (2005). Standard errors and covariance matrices for smoothed rank estimators. Biometrika 92, 732–746.
Brown B. M., Wang Y.-G. (2007). Induced smoothing for rank regression with censored survival times. Statistics in Medicine 26, 828–836.
Buckley J., James I. (1979). Linear regression with censored data. Biometrika 66, 429–436.
Fu L., Wang Y., Bai Z. (2010). Rank regression for analysis of clustered data: a natural induced smoothing approach. Computational Statistics and Data Analysis 54, 1036–1050.
Hettmansperger T. P., McKean J. W. (1998). Robust Nonparametric Statistical Methods. London: Arnold.
Jin Z., Lin D. Y., Wei L. J., Ying Z. (2003). Rank-based inference for the accelerated failure time model. Biometrika 90, 341–353.
Jin Z., Lin D. Y., Ying Z. (2006a). On least-squares regression with censored data. Biometrika 93, 147–161.
Jin Z., Lin D. Y., Ying Z. (2006b). Rank regression analysis of multivariate failure time data based on marginal linear models. Scandinavian Journal of Statistics 33, 1–23.
Jin Z., Ying Z., Wei L. J. (2001). A simple resampling method by perturbing the minimand. Biometrika 88, 381–390.
Johnson L. M., Strawderman R. L. (2009). Induced smoothing for the semiparametric accelerated failure time model: asymptotics and extensions to clustered data. Biometrika 96, 577–590.
Kalbfleisch J. D., Prentice R. L. (2002). The Statistical Analysis of Failure Time Data, 2nd edition. Hoboken: John Wiley.
Koenker R., Bassett G. (1978). Regression quantiles. Econometrica 46, 33–50.
Krall J. M., Uthoff V. A., Harley J. B. (1975). A step-up procedure for selecting variables associated with survival. Biometrics 31, 49–57.
Lai T. L., Ying Z. (1991). Large sample theory of a modified Buckley–James estimator for regression analysis with censored data. The Annals of Statistics 19, 1370–1402.
Li H., Zhang J., Tang Y. (2012). Induced smoothing for the semiparametric accelerated hazards model. Computational Statistics and Data Analysis 56, 4312–4319.
Lin H., Peng H. (2013). Smoothed rank correlation of the linear transformation regression model. Computational Statistics and Data Analysis 57, 615–630.
Miller R., Halpern J. (1982). Regression with censored data. Biometrika 69, 521–531.
Pang L., Lu W., Wang H. J. (2012). Variance estimation in censored quantile regression via induced smoothing. Computational Statistics and Data Analysis 56, 785–796.
Powell J. L. (1984). Least absolute deviations estimation for the censored regression model. Journal of Econometrics 25, 303–325.
Ritov Y. (1990). Estimation in a linear regression model with censored data. The Annals of Statistics 18, 303–328.
SAS Institute. (1999). SAS/STAT User's Guide, Version 8. Cary, NC: SAS Institute Inc.
Stein C. M. (1981). Estimation of the mean of a multivariate normal distribution. The Annals of Statistics 9, 1135–1151.
Tsiatis A. A. (1990). Estimating regression parameters using linear rank tests for censored data. The Annals of Statistics 18, 354–372.
Wang Y.-G., Zhao Y. (2008). Weighted rank regression for clustered data analysis. Biometrics 64, 39–45.
Ying Z. (1993). A large sample study of rank estimation for censored regression data. The Annals of Statistics 21, 76–99.
Ying Z., Wei L. J., Lin J. S. (1992). Prediction of survival probability based on a linear regression model. Biometrika 79, 205–209.
