Abstract
The Receiver Operating Characteristic (ROC) curve is a widely used measure to assess the diagnostic accuracy of biomarkers for diseases. Biomarker tests can be affected by subject characteristics, the experience of testers, or the environment in which tests are carried out, so it is important to understand and determine the conditions for evaluating biomarkers. In this paper, we focus on assessing the effects of covariates on the performance of the ROC curves. In particular, we develop an accelerated ROC model by assuming that the effect of covariates relates to rescaling a baseline ROC curve. The proposed model generalizes the accelerated failure time model in the survival context to ROC analysis. An innovative method is developed to construct estimation and inference for model parameters. The obtained parameter estimators are shown to be asymptotically normal. We demonstrate the proposed method via a number of simulation studies, and apply it to analyze data from a prostate cancer study.
Keywords: Accelerated failure time model, asymptotic normality, receiver operating characteristic curve, regression models
1. Introduction
In medical studies, noninvasive and accurate biomarkers are widely used for evaluating patients’ disease status or their responses to treatments. Examples include the use of prostate-specific antigen and CA-125 to detect the presence of prostate cancer and ovarian cancer, respectively. To assess the accuracy of biomarkers for diagnosis and prognosis of disease, one of the most popular tools is the analysis of the Receiver Operating Characteristic (ROC) curve (Swets and Pickett (1982); Hanley (1989)). The definition of a ROC curve is as follows: let Y1 denote the biomarker for a diseased subject and Y0 denote the biomarker for a non-diseased subject. For any threshold value c for which any test results greater than c are considered to be positive, the true positive and false positive rates are defined as S1(c) = P(Y1 ≥ c) and S0(c) = P(Y0 ≥ c). The ROC curve is defined as the plot of the true positive rate versus the false positive rate, (S0(c), S1(c)), when the threshold value c varies from –∞ to ∞. Equivalently, the ROC curve is a function
$$\mathrm{ROC}(t) = S_1\{S_0^{-1}(t)\}, \qquad t \in (0, 1),$$
where $S_0^{-1}$ denotes the inverse of $S_0$.
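The definition above can be illustrated with a minimal sketch of the empirical ROC curve, in which both survival functions are replaced by their sample versions. The function name `empirical_roc` and the use of a sample quantile for the inverse of S0 are illustrative assumptions, not part of the paper.

```python
import numpy as np

def empirical_roc(y1, y0, t):
    """Empirical ROC(t) = S1(S0^{-1}(t)) at false positive rates t in (0, 1).

    y1: biomarker values for diseased subjects
    y0: biomarker values for non-diseased subjects
    """
    y1 = np.asarray(y1, dtype=float)
    y0 = np.asarray(y0, dtype=float)
    t = np.atleast_1d(np.asarray(t, dtype=float))
    # Threshold whose empirical false positive rate is about t:
    # the (1 - t) sample quantile of the non-diseased values.
    c = np.quantile(y0, 1.0 - t)
    # True positive rate at each threshold: share of diseased values >= c.
    return (y1[None, :] >= c[:, None]).mean(axis=1)
```

Because the threshold decreases as t increases, the returned true positive rates are non-decreasing in t.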
In practice, the diagnostic performance of biomarkers can vary under different conditions. They may be accurate for predicting diseases for certain patients, but may not perform well for others. Biomarker performance may also depend on the particular conditions under which biomarker tests are carried out, including the level of experience of the tester. In order to evaluate the diagnostic performance of biomarkers, it is important to understand how the performance depends on patient characteristics or test conditions.
In the existing literature, three approaches to incorporate covariate effects into the ROC analysis have been suggested (c.f., Pepe (1998, 2003); Zhou, Obuchowski, and McClish (2002)). The first approach is to model the ROC curve summary indices as a function of covariates. Particularly, Dorfman, Berbaum, and Metz (1992) and Obuchowski and Rockette (1995) suggested modeling the area under the curve (AUC), while Thompson and Zucchini (1989) recommended modeling the partial area under the curve (pAUC). This approach is feasible only when covariates are discrete, and there are enough patients in each covariate combination to permit the reliable calculation of the summary accuracy measure. The second approach is to model the distributions of test results as a function of disease status and covariates. Tosteson and Begg (1998) described the use of an ordinal regression model to induce the regression models for the ROC curve for tests with ordinal outcomes. Their method has been extended to random effects models (Beam (1995); Gatsonis (1995)) and Bayesian methods (Peng and Hall (1996); Hellmich et al. (1998); Ishwaran and Gatsonis (2000)). However, in this approach, the parameter estimates do not reflect the covariate effects on the ROC curve, so it is difficult to examine how the ROC curves can vary over different covariates. Instead, the third approach directly models covariate effects on the ROC curve (Pepe (1997, 2000); Alonzo and Pepe (2002)). Sometimes, this approach is called a parametric distribution free approach since it only assumes a parametric model for the ROC curve, but is distribution-free regarding the distribution of the test results. The most important advantage of this approach is that the interpretation of model parameters pertains directly to the covariate effects on the ROC curves. Specifically, in this approach, Pepe (1997, 2000) proposed parametric ROC regression models of the generalized linear model (GLM) form by assuming,
$$\mathrm{ROC}_X(t) = g\{h(t) + \beta^T X\}, \qquad t \in (0, 1), \tag{1.1}$$
where $\mathrm{ROC}_X(t)$ denotes the ROC curve at a false positive rate t associated with covariates X, g(·) is a known link function, and h(·) is a baseline function specified up to a finite number of parameters. Here, the baseline function h defines the location and shape of the ROC curve, and β quantifies covariate effects. Pepe constructed estimating equations for β based on the binary indicator variables $U_{ij} = I(Y_{i1} \ge Y_{j0})$, which compare diseased subject i with non-diseased subject j. Later, Cai and Pepe (2002) extended this parametric ROC regression model to a semiparametric approach by allowing an arbitrary nonparametric baseline function h. They assumed a semiparametric location model for S0(y|X) (Pepe (1998); Heagerty and Pepe (1999)), and constructed high-dimensional estimating equations for estimating β and h. We emphasize that the last two models both assume that covariates act through a location shift of a baseline ROC curve. This may not be true in some situations.
In this article, we develop an alternative regression model, namely the accelerated ROC model, by adjusting for covariates that can influence the performance of biomarkers. We consider modeling covariates directly on the ROC curve and our model generalizes the usual accelerated failure time model (Kalbfleisch and Prentice (2002)) in the survival context to the ROC analysis. A practical advantage of the proposed approach is that the estimation for the regression parameters only requires solving a small number of equations compared to the estimation techniques by Cai and Pepe (2002). In Section 2, we describe an accelerated ROC model and the procedures for estimating parameters of covariates β as well as the ROC function. The asymptotic properties of β and the ROC function are given in Section 3, and simulation studies are provided in Section 4. As an example, we apply our method to a prostate cancer dataset in Section 5, and a discussion is given in Section 6. All technical proofs are given in the Appendix.
2. Models and Inference Procedure
To model the covariate effects on the ROC curve, we propose the semiparametric ROC model
$$\mathrm{ROC}_X(t) = G(e^{\beta^T X}\log t), \qquad t \in (0, 1), \tag{2.1}$$
where $\mathrm{ROC}_X(t)$ denotes the ROC curve at a false positive rate t associated with covariates X, and G(·) is an unknown increasing function satisfying G(–∞) = 0 and G(0) = 1, because $\mathrm{ROC}_X(0) = 0$ and $\mathrm{ROC}_X(1) = 1$ for any fixed value of X. Note that (2.1) is an ROC-GLM model of the form (1.1) with link function $g(u) = G(-\exp(u))$ and baseline function $h(t) = \log(-\log t)$.
In model (2.1), a negative value for β indicates that discrimination improves as X increases, since log t is negative for t ∈ (0, 1) and G(·) is an increasing function. For example, if $G(t) = \exp(\alpha t)$, then $\mathrm{ROC}_X(t) = \exp\{\alpha e^{\beta^T X}\log t\}$, 0 < t < 1. If α = 1 and β = 0, the ROC curve is the 45-degree line, indicating that the biomarker has no discriminatory ability, while if 0 < α < 1 and β = 0, the ROC curve lies above the diagonal line, and the biomarker is considered to have reasonable ability to discriminate between patients with and without the disease. If α = 0.5 and β = –0.8, then $\mathrm{ROC}_X(t) = \exp\{0.5e^{-0.8X}\log t\}$; in this case, discrimination improves as X increases (see Figure 1).
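The exponential example above can be checked numerically. The sketch below evaluates the accelerated ROC curve for G(u) = exp(αu) with a scalar covariate. The function names are illustrative. The closed-form AUC follows from integrating $t^{k}$ over (0, 1) with $k = \alpha e^{\beta x}$; it is a consequence of this particular choice of G rather than a formula stated in the paper.

```python
import numpy as np

def accelerated_roc(t, x, alpha, beta):
    """ROC_X(t) = G(exp(beta * x) * log t) with G(u) = exp(alpha * u)."""
    t = np.asarray(t, dtype=float)
    return np.exp(alpha * np.exp(beta * x) * np.log(t))

def accelerated_auc(x, alpha, beta):
    """AUC for the exponential G: integral of t**k on (0, 1), k = alpha * exp(beta * x)."""
    return 1.0 / (1.0 + alpha * np.exp(beta * x))

# With alpha = 0.5 and beta = -0.8, the AUC rises with x.
aucs = [accelerated_auc(x, alpha=0.5, beta=-0.8) for x in (0.0, 1.0, 2.0)]
```

With α = 1 and β = 0 the curve reduces to the diagonal, and for β < 0 the exponent k shrinks as x grows, which pushes the curve toward the upper-left corner.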
Figure 1.
(Left) Parametric ROC-GLM, $\mathrm{ROC}_1(t) = \Phi(0.6X + 0.8\Phi^{-1}(t))$; (Right) Accelerated ROC model, $\mathrm{ROC}_2(t) = \exp(0.5e^{-0.8X}\log t)$
The parameter β characterizes the shape of the ROC curve for X, where the effects of X in the proposed ROC model relate to rescaling a baseline ROC curve. To see how this is different from the parametric ROC regression model in Pepe (1997, 2000), we plot the ROC curves using the two models in Figure 1. Clearly, the covariate affects true positive rates more dramatically for low false positive rates based on our model.
Suppose we observe n1 biomarker measurements from diseased subjects and n0 biomarker measurements from non-diseased subjects. Let Yi1 (i = 1, ..., n1) denote the biomarker measurement for diseased subject i and Yj0 (j = 1, ..., n0) denote the biomarker measurement for non-diseased subject j. We assume that each subject may have one or more covariates and denote them as Xi1 and Xj0 for diseased subject i and non-diseased subject j, respectively. In many applications, the measurements of a biomarker are subject to a finite upper detection limit, denoted by τ, where the test results above τ are not quantifiable and are considered to be censored. Thus, the observed data can be represented as
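$$\{(\min(Y_{i1}, \tau),\ \Delta_{i1},\ X_{i1}),\ i = 1, \dots, n_1\}$$
for diseased subjects and
$$\{(\min(Y_{j0}, \tau),\ \Delta_{j0},\ X_{j0}),\ j = 1, \dots, n_0\}$$
for non-diseased subjects, where $\Delta_{i1} = I(Y_{i1} \le \tau)$ and $\Delta_{j0} = I(Y_{j0} \le \tau)$.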
By the definition of ROCX(t), the model (2.1) can be rewritten as
$$S_1(y \mid X) = G\{e^{\beta^T X}\log S_0(y \mid X)\}. \tag{2.2}$$
It is to be noted that we make no assumptions on the model for S0(t|X). To estimate β, we take Zi1 = –log S0(Yi1|Xi1). Using (2.2), it can be shown that
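$$P(Z_{i1} \le z \mid X_{i1}) = F(z e^{\beta^T X_{i1}}),$$
with $F(x) = 1 - G(-x)$; this follows because $P(Z_{i1} \ge z \mid X_{i1}) = \mathrm{ROC}_{X_{i1}}(e^{-z}) = G(-z e^{\beta^T X_{i1}})$. Hence, $Z_{i1}$ satisfies the accelerated failure time (AFT) model, so inference for β can be conducted by solving the log-rank estimating equation that is commonly used for estimation in the AFT model. Specifically, the log-rank estimating equation is
$$\sum_{i=1}^{n_1} \Delta_{i1}\left\{X_{i1} - \frac{\sum_{k=1}^{n_1} X_{k1}\, I(\log Z_{k1} + \beta^T X_{k1} \ge \log Z_{i1} + \beta^T X_{i1})}{\sum_{k=1}^{n_1} I(\log Z_{k1} + \beta^T X_{k1} \ge \log Z_{i1} + \beta^T X_{i1})}\right\} = 0. \tag{2.3}$$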
Alternatively, other methods, such as the Gehan-rank estimating equation (Jin et al. (2003)) or nonparametric maximum likelihood estimation (Zeng and Lin (2007)), can be applied. Because $Z_{i1} = -\log S_0(Y_{i1} \mid X_{i1})$ follows an AFT model, we call the proposed ROC model (2.1) the accelerated ROC model.
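As a rough illustration of the log-rank estimating equation, the sketch below evaluates the score for a scalar covariate and finds its sign change by bisection. It assumes the standard log-rank form with residuals log Z + βX, no tied residuals, and a score that is positive at the lower end of the bracket and negative at the upper end. The function names, the default bracket, and the tolerance are illustrative choices.

```python
import numpy as np

def logrank_score(beta, z, x, delta):
    """Log-rank estimating function for scalar beta.

    z: transformed responses Z = -log S0(Y | X) (positive)
    x: scalar covariate per subject
    delta: 1 if the response is observed, 0 if censored
    """
    e = np.log(z) + beta * x                      # residuals under candidate beta
    at_risk = e[None, :] >= e[:, None]            # at_risk[i, k] = I(e_k >= e_i)
    xbar = (at_risk * x[None, :]).sum(axis=1) / at_risk.sum(axis=1)
    return np.sum(delta * (x - xbar))

def solve_beta(z, x, delta, lo=-5.0, hi=5.0, tol=1e-4):
    """Bisection for the sign change of the score.

    Assumes score(lo) > 0 > score(hi); the score is a step function,
    so the result is a crossing point rather than an exact root.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if logrank_score(mid, z, x, delta) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

The pairwise risk-set matrix costs O(n²) memory, which is acceptable for the sample sizes discussed here; a sorted cumulative-sum version would scale better.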
Since S0 is unknown, we estimate S0 nonparametrically using the smoothed Breslow estimator as follows:
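$$\hat S_0(y \mid x) = \exp\left\{-\sum_{j=1}^{n_0} \frac{\Delta_{j0}\, I(Y_{j0} \le y)\, K_{a_n}(X_{j0} - x)}{\sum_{k=1}^{n_0} I(Y_{k0} \ge Y_{j0})\, K_{a_n}(X_{k0} - x)}\right\}, \tag{2.4}$$
where $K_{a_n}(x) = a_n^{-d} K(x/a_n)$, with K(·) a kernel function, $a_n$ the bandwidth, and d the dimension of X. Alternatively, one may use the Kaplan-Meier type estimator.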
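A minimal sketch of a kernel-smoothed Breslow estimator for a single continuous covariate is given below. It assumes the usual kernel-weighted form (each observed value contributes its kernel weight divided by the kernel-weighted size of the risk set) and a Gaussian kernel whose normalizing constant cancels in the ratio. The function name and argument names are hypothetical.

```python
import numpy as np

def smoothed_breslow(y, x, y0, x0, delta0, bandwidth):
    """Kernel-weighted Breslow estimate of S0(y | x) from non-diseased data.

    y0, x0, delta0: observed values, covariates, and observation indicators
    bandwidth: kernel bandwidth (a_n in the text)
    """
    y0 = np.asarray(y0, dtype=float)
    x0 = np.asarray(x0, dtype=float)
    delta0 = np.asarray(delta0, dtype=float)
    # Gaussian kernel weights centered at the target covariate value x.
    w = np.exp(-0.5 * ((x0 - x) / bandwidth) ** 2)
    at_risk = y0[None, :] >= y0[:, None]          # at_risk[j, k] = I(Y_k0 >= Y_j0)
    denom = (at_risk * w[None, :]).sum(axis=1)    # weighted risk-set size at Y_j0
    # Hazard increments; left at zero where the weighted risk set is empty.
    increments = np.divide(delta0 * w, denom, out=np.zeros_like(denom), where=denom > 0)
    cumhaz = np.sum(increments[y0 <= y])
    return np.exp(-cumhaz)
```

For multivariate X, a product kernel could replace the single Gaussian weight, although, as Remark 2 notes, kernel estimates may be unreliable with several continuous covariates.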
We suggest using the optimal bandwidth selection method of Wang and Shen (2008) for $a_n$ in (2.4). First, we obtain a smoothed Breslow estimator $\hat S(y \mid x, a_n)$ using a reasonable initial bandwidth $a_n$. We also repeatedly generate B bootstrap samples from $\{(x_{i0}, y_{i0}),\ i = 1, \dots, n_0\}$ and obtain the bootstrapped smoothed Breslow estimators,
Then, we select ân by minimizing the bootstrapped mean integrated squared error (MISE)
| (2.5) |
over the candidate bandwidths. If the difference between $\hat a_n$ and the initial value $a_n$ is small enough, the process stops and the optimal bandwidth is set to $\hat a_n$. Otherwise, we replace the initial bandwidth with $\hat a_n$ and repeat the procedure until it converges to an optimal value.
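$Z_{i1}$ is estimated by $\hat Z_{i1} = -\log \hat S_0(Y_{i1} \mid X_{i1})$ using (2.4) and, after plugging $\hat Z_{i1}$ into (2.3), $\hat\beta$ is calculated by solving
$$\sum_{i=1}^{n_1} \Delta_{i1}\left\{X_{i1} - \frac{\sum_{k=1}^{n_1} X_{k1}\, I(\log \hat Z_{k1} + \beta^T X_{k1} \ge \log \hat Z_{i1} + \beta^T X_{i1})}{\sum_{k=1}^{n_1} I(\log \hat Z_{k1} + \beta^T X_{k1} \ge \log \hat Z_{i1} + \beta^T X_{i1})}\right\} = 0.$$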
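Remark 1. When X is discrete, the estimator $\hat S_0(y \mid x)$ in (2.4) can be replaced by the Breslow estimator using the data with $X_{j0} = x$, i.e.,
$$\hat S_0(y \mid x) = \exp\left\{-\sum_{j=1}^{n_0} \frac{\Delta_{j0}\, I(Y_{j0} \le y,\ X_{j0} = x)}{\sum_{k=1}^{n_0} I(Y_{k0} \ge Y_{j0},\ X_{k0} = x)}\right\}.$$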
Remark 2. When X has more than one continuous covariate, the kernel estimate $\hat S_0$ may not perform well with a moderate sample size. In this case, we suggest estimating S0(y|x) based on the Cox regression model using the non-diseased data. That is,
$$\hat S_0(y \mid x) = \exp\{-\hat\Lambda_0(y)\, e^{\hat\theta^T x}\},$$
where $\hat\Lambda_0$ is the estimate of the cumulative baseline hazard function and $\hat\theta$ is the regression parameter estimate. An alternative approach is to use the single index model, which is more flexible than the Cox regression model; the estimators from the Cox model, however, are easier to compute.
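We next describe the procedures for estimating G and the ROC function specified in (2.1). Clearly, $P(Z_{i1}e^{\beta^T X_{i1}} \le z \mid X_{i1}) = 1 - G(-z)$. Therefore, $Z_{i1}e^{\beta^T X_{i1}}$ is independent of $X_{i1}$ and has distribution function $1 - G(-z)$. This implies that we can estimate G consistently by using the empirical distribution of $W_{i1} \equiv Z_{i1}e^{\beta^T X_{i1}}$. In light of a possible upper detection limit in practice, we specifically use the Kaplan-Meier estimator to estimate the survival function of $W_{i1}$. After replacing $W_{i1}$ with its estimate
$$\hat W_{i1} = \hat Z_{i1}\, e^{\hat\beta^T X_{i1}},$$
we estimate G(·) using
$$\hat G(x) = \prod_{i:\ \hat W_{i1} \le -x} \left\{1 - \frac{\Delta_{i1}}{\sum_{k=1}^{n_1} I(\hat W_{k1} \ge \hat W_{i1})}\right\}, \qquad x \le 0. \tag{2.6}$$
Finally, the ROC curve for fixed covariates X is estimated by
$$\widehat{\mathrm{ROC}}_X(t) = \hat G(e^{\hat\beta^T X}\log t), \qquad t \in (0, 1), \tag{2.7}$$
and the corresponding AUC estimate is
$$\widehat{\mathrm{AUC}}_X = \int_0^1 \hat G(e^{\hat\beta^T X}\log t)\, dt, \tag{2.8}$$
which can be calculated via trapezoidal numerical integration.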
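The estimation of G, the ROC curve, and the AUC described above can be sketched as follows for a scalar covariate. The sketch assumes no tied values of W, takes G(u) to be the Kaplan-Meier survival estimate of W evaluated at −u, and uses a fixed grid for the trapezoidal rule. All function names and the grid size are illustrative.

```python
import numpy as np

def km_survival(w, delta):
    """Kaplan-Meier survival estimate; returns sorted times and S just after each time."""
    w = np.asarray(w, dtype=float)
    delta = np.asarray(delta, dtype=float)
    order = np.argsort(w)
    w, delta = w[order], delta[order]
    n = len(w)
    at_risk = n - np.arange(n)                    # risk-set sizes, assuming no ties
    surv = np.cumprod(1.0 - delta / at_risk)
    return w, surv

def g_hat(u, w_sorted, surv):
    """Estimate G(u) = S_W(-u) for u <= 0 as a right-continuous step function."""
    idx = np.searchsorted(w_sorted, -np.asarray(u, dtype=float), side="right")
    return np.concatenate(([1.0], surv))[idx]

def roc_hat(t, x, beta, w_sorted, surv):
    """Estimated ROC_X(t) = G_hat(exp(beta * x) * log t)."""
    return g_hat(np.exp(beta * x) * np.log(t), w_sorted, surv)

def auc_hat(x, beta, w_sorted, surv, grid_size=2001):
    """Trapezoidal approximation of the integral of the estimated ROC over (0, 1)."""
    t = np.linspace(1e-6, 1.0, grid_size)
    r = roc_hat(t, x, beta, w_sorted, surv)
    return float(np.sum(0.5 * (r[1:] + r[:-1]) * np.diff(t)))
```

Here `w` would hold the estimated values of W for the diseased subjects and `delta` their detection-limit indicators.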
Although the asymptotic variance of $\hat\beta$ has an analytic expression (see the Appendix), directly estimating its variance involves estimating some derivatives and can be computationally tedious. Thus, we propose to estimate the variances of $\hat\beta$ and $\hat G$ using the bootstrap method in order to make inferences. Bootstrap samples are drawn repeatedly with replacement from the dataset, and β and G are estimated for each bootstrap sample. We then use the variances of these $\hat\beta$'s and $\hat G$'s as our estimates.
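A generic sketch of the bootstrap standard error computation is shown below. Resampling diseased and non-diseased subjects separately, the number of replicates, and the callable interface `estimator(diseased, nondiseased)` are assumptions made for illustration; the text specifies only that samples are drawn with replacement from the dataset.

```python
import numpy as np

def bootstrap_se(estimator, diseased, nondiseased, n_boot=500, seed=0):
    """Bootstrap standard error of estimator(diseased, nondiseased).

    diseased, nondiseased: arrays with one row per subject
    estimator: callable returning a scalar or a 1-D array of estimates
    """
    rng = np.random.default_rng(seed)
    n1, n0 = len(diseased), len(nondiseased)
    stats = []
    for _ in range(n_boot):
        i1 = rng.integers(0, n1, n1)              # resample diseased subjects
        i0 = rng.integers(0, n0, n0)              # resample non-diseased subjects
        stats.append(estimator(diseased[i1], nondiseased[i0]))
    # Sample standard deviation across bootstrap replicates.
    return np.std(np.asarray(stats), axis=0, ddof=1)
```

In the setting of this section, `estimator` would refit the estimate of S0, solve the estimating equation for β, and return the estimate of β together with the estimate of G at selected points.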
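The confidence region for $\widehat{\mathrm{ROC}}_X(t)$ in (2.7) can be calculated in the following manner. For 0 < ξ < 1 and 0 ≤ a < b ≤ 1, we first find $C_\xi$ such that
$$P\left(\sup_{t \in [a, b]} \frac{|\widehat{\mathrm{ROC}}_X(t) - \mathrm{ROC}_X(t)|}{\hat\sigma_X(t)} \le C_\xi\right) = 1 - \xi,$$
where $\hat\sigma_X(t)$ is the estimated standard deviation of $\widehat{\mathrm{ROC}}_X(t)$. Then, a 100(1 – ξ)% confidence region for $\mathrm{ROC}_X(t)$ over [a, b] is
$$\{\widehat{\mathrm{ROC}}_X(t) \pm C_\xi\, \hat\sigma_X(t):\ t \in [a, b]\}.$$
Specifically, we generate K (e.g., K = 500) bootstrap samples consisting of biomarker measurements and corresponding covariates in the diseased and non-diseased groups and compute $\hat G_k$ (k = 1, ..., K) for each sample. Then, $\hat\sigma_X(t)$ can be calculated as the sample standard deviation of the resulting K estimated ROC values at t, and $C_\xi$ can be computed as the 100(1 – ξ)th percentile of the K standardized maximum deviations over [a, b].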
Remark 3. The proposed approach can be generalized to handle the situation in which each subject may have multiple or repeated biomarkers. We assume the marginal ROC model of such multivariate biomarkers. In this case, the estimating equation for β is replaced by
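$$\sum_{i=1}^{n_1}\sum_{j=1}^{m_i} \Delta_{ij1}\left\{X_{ij1} - \frac{\sum_{k=1}^{n_1}\sum_{l=1}^{m_k} X_{kl1}\, I(\log \hat Z_{kl1} + \beta^T X_{kl1} \ge \log \hat Z_{ij1} + \beta^T X_{ij1})}{\sum_{k=1}^{n_1}\sum_{l=1}^{m_k} I(\log \hat Z_{kl1} + \beta^T X_{kl1} \ge \log \hat Z_{ij1} + \beta^T X_{ij1})}\right\} = 0,$$
where $m_i$ is the number of measurements for diseased subject i, and $\Delta_{ij1}$, $\hat Z_{ij1}$, and $X_{ij1}$ are the observations of the jth measurement for subject i in the diseased group; $\hat Z_{ij1}$ is obtained in the same way as $\hat Z_{i1}$. The bootstrap method can still be used for inference by randomly selecting subjects, rather than individual measurements, for each bootstrap sample.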
Remark 4. We suggest using the following procedure to check model adequacy. First, we stratify the data based on covariates X to obtain L groups. Let $Y_{i1}^{(l)}$ be the biomarker measurement for diseased subject i, and $Y_{j0}^{(l)}$ be the biomarker measurement for non-diseased subject j, in group l (l = 1, ..., L). Next, we compare the empirical ROC curve with the proposed ROC curve in each group, where X takes the mean value from the group. A well-fitting ROC model should be reasonably consistent with the empirical ROC curve. Finally, we check whether the proposed AUC estimate (2.8) is close to the empirical AUC within each stratum.
3. Asymptotic Properties
In this section, we derive the asymptotic properties of $\hat\beta$ and $\hat G$. Consider the following conditions.
(C.1) The true parameter value, β0, belongs to a compact set.
(C.2) The true densities with respect to a dominating measure for (Y1, X1) and (Y0, X0) are (χ + 1)-continuously differentiable, where χ > d/2 with d the dimension of X0. Additionally, X1 and X0 have bounded support.
(C.3) The matrix [1, X1] is linearly independent with positive probability.
(C.4) The kernel function K(·) is differentiable with bounded symmetric support and first (χ – 1) moments being zero. Moreover, and .
(C.5) n0/n → ν ∈ (0, 1), where n = n0 + n1.
(C.1) and (C.5) are standard conditions for this type of problem. (C.3) ensures the identifiability of the regression parameters, and (C.4) states the restrictions on the choice of possible kernel functions. For example, when d = 2, the kernel function can be chosen to be the Gaussian kernel or the Epanechnikov kernel. Both (C.2) and (C.4) are needed to derive the asymptotic distribution of $\hat\beta$. Obviously, if S0 is estimated using the Breslow method with discrete X1 or from the Cox regression method, (C.4) is not needed.
Theorem 1. Under Conditions (C.1)–(C.5), $\hat\beta \to \beta_0$ almost surely as n → ∞.
Theorem 2. Under Conditions (C.1)–(C.5), $\sqrt{n}(\hat\beta - \beta_0)$ converges in distribution to a mean-zero normal random vector as n → ∞.
Theorem 3. Under Conditions (C1)–(C5), converges weakly to a zero mean Gaussian process in l∞([0,1]).
The proofs of Theorems 1–3 are provided in the Appendix. For the proof of Theorem 1, we use the fact that Ŝ0(y; x) converges uniformly in (y, x) to S0(y; x) as n goes to ∞, which is given in Zeng (2004). We then apply Theorem 2.10.3 of van der Vaart and Wellner (1996) and Theorem 5.9 of van der Vaart (1998). The proofs of Theorems 2 and 3 follow the same arguments as in Zeng (2004), and we use the central limit theorems for the empirical process indexed by classes depending on samples (Theorem 2.11.23, van der Vaart and Wellner (1996)).
4. Simulation Studies
Simulation studies were conducted to examine the performance of the proposed method. First, we took as true G(x) = exp(αx), so from (2.1),
$$\mathrm{ROC}_X(t) = \exp\{\alpha e^{\beta^T X}\log t\}, \qquad t \in (0, 1). \tag{4.1}$$
The biomarker values for diseased and non-diseased subjects, Y1 and Y0, were generated by
| (4.2) |
where U0 and U1 are Uniform(0, 1) random variables. It is easy to check that such (Y0, Y1) give the ROC function specified in (4.1). We used an equal number of diseased and non-diseased subjects but varied the total sample size n from 200 to 400. Additionally, we set the upper detection limit τ to the 95th percentile of the biomarker in the non-diseased group.
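One way to generate biomarker values whose ROC curve has the form (4.1) is sketched below. The exponential form of S0 used here is an assumption chosen for illustration and is not necessarily the generating model (4.2) of the paper: with $S_0(y \mid x) = \exp(-\lambda e^{\gamma x} y)$ and $S_1(y \mid x) = S_0(y \mid x)^{\alpha e^{\beta x}}$, the induced ROC curve is $t^{\alpha e^{\beta x}}$. The function name and the binary covariate are also illustrative.

```python
import numpy as np

def simulate(n, alpha, beta, gamma, lam, rng):
    """Draw (x1, y1) for diseased and (x0, y0) for non-diseased subjects.

    Assumed model: Y0 | x ~ Exponential(rate = lam * exp(gamma * x)),
    Y1 | x ~ Exponential(rate = alpha * lam * exp((beta + gamma) * x)).
    """
    x1 = rng.binomial(1, 0.5, n).astype(float)
    x0 = rng.binomial(1, 0.5, n).astype(float)
    u1 = 1.0 - rng.uniform(size=n)                # values in (0, 1], avoids log(0)
    u0 = 1.0 - rng.uniform(size=n)
    y0 = -np.log(u0) / (lam * np.exp(gamma * x0))
    y1 = -np.log(u1) / (alpha * lam * np.exp((beta + gamma) * x1))
    return x1, y1, x0, y0
```

Under this construction, $W = -\log S_0(Y_1 \mid X_1)\, e^{\beta X_1}$ is exponential with rate α and independent of the covariate, which matches the AFT structure used for estimation.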
We conducted three simulations with different types of covariates. For the first simulation, a binary covariate X was generated from a Bernoulli distribution with probability 0.5, and true parameters in (4.2) were set to β = 0.5, γ = –0.5, λ = 1, and α = 1.2. Because X was discrete, we estimated S0(y|x) using the Breslow estimator given in Remark 1. In the second simulation, we used a continuous covariate generated from the Beta(4, 2) distribution, and true parameters were set to β = –1, γ = –0.2, λ = 0.5, and α = 0.5. In this simulation, S0(y|x) was estimated using the smoothed Breslow estimator in (2.4), where the Gaussian kernel function $K(x) = (2\pi)^{-1/2}\exp(-x^2/2)$ was applied. The initial bandwidth was set to $a_n = n_1^{-1/3}$, and optimal bandwidths $\hat a_n$ were chosen such that the bootstrapped MISE (2.5) attained its minimum. Specifically, we used the optimal bandwidths at the average values of the covariates as shown in Table 1. Our simulation studies showed, however, that the optimal bandwidths and the initial bandwidth $a_n = n_1^{-1/3}$ resulted in very similar β estimates. For the third simulation, two continuous covariates, generated as Uniform(0,1), were used with β = (–1.3, –1.8)T, γ = (–0.2, –0.25)T, λ = (1,1)T, and α = 7. We then fit the Cox model to estimate S0 as described in Remark 2. In all the simulation studies, we obtained $\hat\beta$ by solving the log-rank estimating equation (2.3) through bisection search.
Table 1.
Estimates of optimal bandwidths hopt and bootstrapped MISEs when X = 0.54 in Simulation 2 based on 1,000 simulations.
| Scenario | n1 = n0 | hopt | MISE |
|---|---|---|---|
| Simulation 2 | 100 | 0.1706 | 0.619 |
| | 200 | 0.1212 | 0.624 |
Table 2 summarizes the simulation results based on 1,000 replicates. Column “Est” is the average value of the estimates from 1,000 replicates; column “ASE” is the average of the estimated standard errors by the bootstrap method with 1,000 replicates; column “SE” is the standard deviation of the estimates; column “CP” gives the coverage proportion (in percent) of the 95% confidence intervals based on asymptotic normality. Overall, the estimates for β are very close to the actual values, and the estimated standard errors using the bootstrap method approximate the empirical standard errors fairly well. In addition, the coverage proportions of the 95% CIs are close to the nominal level of 95% across sample sizes. In the same table, we present the true and estimated G at three fixed points chosen to be the quartiles of the true distribution of $-W_1 = e^{\beta^T X_1}\log S_0(Y_1; X_1)$. For all simulations, the estimated values of G are very close to the actual values at all three points. Moreover, the fitted semiparametric ROC curves obtained from the three simulation studies were extremely close to the true ROC curves (e.g., Figure 2).
Table 2.
Summary results from simulation studies.
| | | n1 = n0 = 100 | | | | n1 = n0 = 200 | | | |
|---|---|---|---|---|---|---|---|---|---|
| Par. | True | Est | ASE | SE | CP | Est | ASE | SE | CP |
| Simulation Study 1. X: 0 or 1 | |||||||||
| β | 0.5 | 0.479 | 0.309 | 0.302 | 95.7 | 0.517 | 0.219 | 0.210 | 94.9 |
| G(–2.7) | 0.259 | 0.255 | 0.081 | 0.080 | 95.6 | 0.262 | 0.060 | 0.060 | 93.0 |
| G(–1.3) | 0.522 | 0.519 | 0.084 | 0.082 | 94.8 | 0.524 | 0.060 | 0.059 | 94.0 |
| G(–0.5) | 0.779 | 0.774 | 0.063 | 0.060 | 96.0 | 0.778 | 0.044 | 0.042 | 94.3 |
| Simulation Study 2. X ~ Beta(4,2) | |||||||||
| β | −1 | −1.080 | 0.967 | 0.936 | 94.9 | −1.057 | 0.703 | 0.694 | 95.8 |
| G(–0.89) | 0.663 | 0.630 | 0.160 | 0.165 | 92.2 | 0.641 | 0.125 | 0.128 | 93.4 |
| G(–0.542) | 0.763 | 0.727 | 0.141 | 0.142 | 94.3 | 0.741 | 0.106 | 0.106 | 94.2 |
| G(–0.23) | 0.891 | 0.865 | 0.098 | 0.091 | 95.0 | 0.877 | 0.066 | 0.061 | 95.6 |
| Simulation Study 3. X1 ~ Uniform(0,1), X2 ~ Uniform(0,1) | |||||||||
| β 1 | −1.3 | −1.300 | 0.571 | 0.543 | 95.6 | −1.297 | 0.384 | 0.367 | 94.7 |
| β 2 | −1.8 | −1.847 | 0.587 | 0.582 | 94.5 | −1.817 | 0.395 | 0.391 | 95.2 |
| G(–0.162) | 0.321 | 0.319 | 0.144 | 0.150 | 92.7 | 0.325 | 0.108 | 0.107 | 94.1 |
| G(–0.116) | 0.442 | 0.430 | 0.151 | 0.158 | 93.2 | 0.441 | 0.112 | 0.110 | 94.2 |
| G(–0.056) | 0.675 | 0.649 | 0.137 | 0.137 | 95.2 | 0.668 | 0.096 | 0.091 | 94.9 |
Figure 2.
Semiparametric ROC curve (···), misspecified parametric ROC curve (– · – ·), and true ROC curve (—)
We next investigated the robustness of the β estimates conducted in the third simulation study by misspecifying the model for S0. Specifically, Y0 was generated from the log-normal model of the form
with $Z \sim \mathrm{Normal}(0, 0.6^2)$, and Y1 was generated as in (4.2),
with β = (–1.2, –2)T, γ = (–2, 1.95)T, and α = 7. As shown in Table 3, the estimates of β using the accelerated ROC model are fairly robust to the choice of distributions for S0.
Table 3.
Estimates of β under the misspecified model for S0.
| | | n1 = n0 = 100 | | | n1 = n0 = 200 | | |
|---|---|---|---|---|---|---|---|
| Par | True | Est | SE | MSE | Est | SE | MSE |
| β 1 | −1.3 | −1.271 | 0.688 | 0.448 | −1.342 | 0.462 | 0.215 |
| β 2 | −1.8 | −1.788 | 0.724 | 0.524 | −1.840 | 0.503 | 0.254 |
Furthermore, we compared the performance of the proposed semiparametric approach, based on the accelerated ROC model, to the parametric ROC-GLM approach (Alonzo and Pepe (2002)),
$$\mathrm{ROC}_X(t) = \Phi\{\theta_0 + \theta_1\Phi^{-1}(t) + \theta_2 X\}.$$
Specifically, the three simulation scenarios were considered (n1 = n0 = 100) and the two approaches were compared with respect to the MISEs shown in Table 4. First, we used the data obtained from Simulation 1, and the semiparametric and parametric ROC curves were estimated at X = 0, 1. Figure 2 and the MISEs in Table 4 indicate that the fitted semiparametric ROC curve $\hat G(e^{0.479X}\log t)$ is slightly closer to the true ROC curve $\exp(0.5e^{0.5X}\log t)$ than the misspecified parametric ROC curve $\Phi(-0.463 + 1.449\Phi^{-1}(t) + 0.56X)$ at both points.
Table 4.
Average of estimated MISE based on 1,000 simulated datasets by the proposed semiparametric and parametric ROC-GLM approaches.
| Figure | Model | MISE (Covariate) | MISE (Covariate) |
|---|---|---|---|
| 1 | Semiparametric | $2.32 \times 10^{-5}$ (X = 0) | $1.94 \times 10^{-4}$ (X = 1) |
| | Misspecified ROC-GLM | 0.0013 (X = 0) | 0.0014 (X = 1) |
| 2 | Misspecified semiparametric | 0.0131 (X = 0) | 0.0451 (X = 0.5) |
| | Misspecified ROC-GLM | 0.0599 (X = 0) | 0.0635 (X = 0.5) |
| 3 | Misspecified semiparametric | 0.0035 (X = 0) | 0.0044 (X = 1) |
| | ROC-GLM | $6.75 \times 10^{-5}$ (X = 0) | $2.72 \times 10^{-4}$ (X = 1) |
Second, we simulated the biomarker values from
where X and U were generated from Uniform(0,1) and ε from $N(0, 0.6^2)$. The induced ROC curve is $\mathrm{ROC}_X(t) = \exp[-\exp(-1.2 + 1.6X - 0.6\Phi^{-1}(t))]$, which has neither the parametric ROC-GLM form nor the form of our accelerated ROC model. Figure 3 suggests that the proposed semiparametric ROC curve $\hat G(e^{1.39X}\log t)$ is closer to the truth than is the parametric curve $\Phi(0.058 + 0.916\Phi^{-1}(t) - 0.24X)$ at both points. Interestingly, the ROC-GLM approach gives estimates that differ markedly from the truth.
Figure 3.
Misspecified semiparametric ROC curve (···), misspecified parametric ROC Curve (– · – ·), and true ROC curve (—)
Finally, we simulated the biomarker values from
where X is a Bernoulli random variable with probability 0.5, and ε1 and ε0 have the standard normal distributions. Then the corresponding covariate-specific curve is
which is exactly the form in the ROC-GLM approach. Undoubtedly, the ROC-GLM approach fits data well but our accelerated ROC model, even though biased, is still not far from the truth; see Figure 4.
Figure 4.
Misspecified semiparametric ROC curve (···), parametric ROC curve (– · – ·), and true ROC curve (—)
5. Application
We illustrate our approach by utilizing a prostate cancer dataset. Prostate-specific antigen (PSA) is a protein produced by the prostate gland, and the PSA test measures the level of PSA in the blood. Most healthy men have PSA levels under 4 nanograms per milliliter (ng/mL) of blood, and the chance of having prostate cancer rises as the PSA level increases. PSA occurs in two major forms in the blood. One form is attached to blood proteins while the other freely circulates. The free PSA is the ratio of how much PSA circulates freely compared to the total PSA level. Low free PSA may indicate prostate cancer, and most men with prostate cancer have a free PSA below 15%. According to the American Cancer Society and National Cancer Institute, men with free PSA at 7% or lower should undergo a biopsy as a precaution. We used a dataset of 71 prostate cancer subjects and 68 controls, all of whom participated in the Beta-Carotene and Retinol Efficacy Trial (CARET), a randomized lung cancer prevention study including 12,025 men (Goodman et al. (1993); Etzioni et al. (1999)).
The objective of this analysis was to evaluate the capacity of free PSA levels to discriminate men with prostate cancer from those with no malignancy prior to the onset of clinical symptoms. Subjects who participated in CARET had serum samples drawn at baseline and at two-year intervals thereafter. Blood samples drawn after a diagnosis of prostate cancer were excluded from this analysis, leaving 1-7 blood samples per subject. Previous studies have suggested that age and the time at which PSA was measured prior to diagnosis may affect the detection of prostate cancer. Let X be the age at which PSA was measured, and T be the time (years) from the onset of symptoms to the time at which the serum sample was drawn, so that time is negative and increases to 0, the time of clinical diagnosis. Accuracy would be expected to increase with increasing values of T. The average age of participants was 63.7 (range from 46.7 to 80.8), and the average time was −3.06 years (range from −9.008 to −0.003 years). We fitted an accelerated ROC model adjusting for age X and time T,
$$\mathrm{ROC}_{X,T}(t) = G\{e^{\beta_1 X + \beta_2 T}\log t\}.$$
Since each subject may have more than one measurement, the estimating equations in Remark 3 were solved to estimate the β's.
We found with SE 0.0248 (p-value 0.0505) and with SE 0.0442 (p-value 0.1841). The positive coefficient for age suggested that discrimination of disease is more efficient in younger men, and the negative coefficient for time implies that discrimination improves as PSA levels are measured closer to actual diagnosis although the time T was not found to be significant.
The estimated AUCs based on the accelerated ROC model were 0.8579, 0.8103, and 0.7623 using the median time T = –2.82 and the respective mean ages for the groups age ≤ 61, 61 < age ≤ 65, and age > 65. On the other hand, the estimated parametric binormal model was $\mathrm{ROC}_{T,X}(t) = \Phi(4.82 + 0.715\Phi^{-1}(t) - 0.051X + 0.166T)$ and the corresponding estimated AUCs were 0.8788, 0.8220, and 0.7634. Our AUC estimates turned out to be closer to the empirical AUCs, which were calculated as 0.8575, 0.8062, and 0.7527. Additionally, Figure 5 shows that the fitted ROC curves based on the accelerated ROC model match the empirical curves well, supporting the adequacy of the proposed model; refer to Remark 4.
Figure 5.
Estimated Semiparametric ROC Curves for PSA Adjusted for Age and Time (– · – ·) and their 95 % Confidence Regions over [0, 0.6] (···), Corresponding Empirical ROC Curves (—).
6. Discussion
We have proposed a semiparametric method to assess the accuracy of biomarkers by adjusting for covariates that could influence their performance. We developed an accelerated ROC model by employing the properties of the AFT model and showed that β can be estimated by solving the log-rank estimating equation. The function G was estimated using the empirical distribution of $Z_{i1}e^{\beta^T X_{i1}}$ without making any assumptions about the form of G. We demonstrated that $\hat G$ derived using the Kaplan-Meier estimator of $Z_{i1}e^{\beta^T X_{i1}}$ is a good fit to the true function G. The bootstrap method was used for inference, and the asymptotic properties of $\hat\beta$ and $\hat G$ were presented.
In our proposed method, the parameter estimates of covariates based on the log-rank estimating equations may not be efficient. Other estimation approaches such as described by Jin et al. (2003) and Jin, Lin, and Ying (2006) can be applied. For future work, we will examine whether a semiparametrically efficient estimator can be obtained.
Both our model and Pepe's (1997; 2000) directly model the effects of covariates on the ROC curves. These two models parallel the AFT model and the proportional hazards model in the survival context. In survival analysis, a number of approaches have been developed for model diagnostics and model checking. It would be interesting to see how those approaches can be extended to the ROC regression models. Another possibility is to consider an even more general model by assuming
where both G and h are unknown functions. This general model includes our model and Pepe's model as special cases with h(t) = log t and G(·) a known link, respectively. However, it is unclear how reliably the model parameters can be estimated in practice.
Appendix
A.1. Proof of Theorem 1
With direct calculations, (C.3) implies that
is positive for , where Q1(x) = E[X1I(log Z1 + βTX1 ≥ x)] and Q0 = E[I(log Z1 + βTX1 ≥ x)]. Therefore, β0 must be the unique solution to
We introduce some notation. We use and to denote the empirical measure and expectation based on i.i.d. observations in the diseased group, (Yi1,Xi1), i = 1,..., n1. Similarly, we use and to denote the empirical measure and expectation based on i.i.d. observations in the non-diseased group, (Yj0, Xj0), j = 1,..., n0. Moreover, and denote the empirical processes and , respectively. Thus, by definition, should solve
We show the consistency of $\hat\beta$. First, conditional on non-diseased data, $(\hat Z_{i1}, X_{i1})$ are i.i.d. Therefore, the class
is the VC-class, so is Donsker. Note that the random functions
where here and later, E* and E** denote the expectation with respect to those random variables with asterisk and double asterisk respectively, can be expressed as the limit of the convex combinations of and are bounded from above. Thus, they belong to , which is a Donsker class by Theorem 2.10.3 of van der Vaart and Wellner (1996). Therefore, by the Glivenko-Cantelli Theorem, it is easy to see that
Furthermore, as n goes to ∞, Ŝ0(y; x) converges uniformly in (y,x) to S0(y; x), as shown in Zeng (2004). Thus, the limit function
converges uniformly in β to
The latter has a unique minimum zero at β0 by (C.1). Additionally, it satisfies the separability at β0 by (C.3). Therefore, by Theorem 5.9 of van der Vaart (1998), converges almost surely to β0.
A.2. Proof of Theorem 2
We derive the asymptotic distribution of $\hat\beta$. From
| (A.1) |
if we define
then we obtain
From the Donsker theorem, we have
| (A.2) |
where
On the other hand, from (C.2),
where denotes the inverse of H0(y; x) ≡ – log S0(y; x) for given x. Thus, if f1(y|x) is the conditional density of Y1 given X1, then
By slightly modifying Lemma 3.9.20 of van der Vaart and Wellner (1996), we can show
and that it holds uniformly in x, , and X1. Moreover, since Ĥ0(·; x) converges to H0(·; x) in D[0, τ] uniformly in x, we obtain
The last term on the right-hand side can be further approximated by
Similarly, we can expand the numerator term in the left-hand side of (A.2) to eventually obtain that (A.2) is equivalent to
(A.3)
for some differentiable functions σ2 and σ3. In particular, σ1 has the same expression as given in (C.3) with β = β0, so σ1 is non-singular.
Using the same arguments as in Zeng (2004) and (C.4), we can show that, uniformly in x and y ∈ [0, τ],
We plug the above expression into (A.3), then (A.2). From (C.4), we obtain
(A.4)
Finally, we apply Theorem 2.11.23 in van der Vaart and Wellner (1996) to the last two terms on the right-hand side of (A.4). In particular, its conditions are satisfied by observing that, after integration by parts, both
and E[σ3(Y1, X1)qn(Y1, X1, Y0, X0)] converge uniformly in (Y0, X0) to
and E[σ3(Y1, X0)q(Y1, X0, Y0, X0)|X1 = X0], respectively, where q(Y1, X1, Y0, X0) = I(Y0 ≤ Y1)/S0(Y0|X = x). Furthermore, they have bounded total variation in Y0 uniformly in X0, and are Lipschitz continuous in X0. The latter implies the entropy condition in Theorem 2.11.23. Therefore, combining the above results and the non-singularity of σ1 in (A.4), we obtain
where
Hence, β̂, suitably centered and normalized, converges in distribution to a normal distribution with mean zero and variance (1 – ν)Var (g1) + νVar (g2).
Remark A2.1 When the X's take discrete values, the proof simplifies considerably. In particular, we can set an = 1/n and Kan(x) = I(x = 0) in the above arguments.
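To see why the indicator kernel in Remark A2.1 simplifies matters, consider a kernel-weighted estimate of the conditional survival function S0(y; x). The sketch below assumes a simple weighted-average form with uncensored data, which is an illustrative assumption rather than the exact estimator used in the paper; `kernel_survival` and `indicator_kernel` are hypothetical names. With K(u) = I(u = 0), the weighted estimate coincides with the empirical survival function within the stratum X0 = x.

```python
import numpy as np

def kernel_survival(y, x, y0, x0, kernel):
    """Kernel-weighted estimate of P(Y0 >= y | X0 = x) from non-diseased data.

    Assumed form (not taken from the paper):
    sum_j K(X_j - x) I(Y_j >= y) / sum_j K(X_j - x).
    """
    w = kernel(x0 - x)
    return np.sum(w * (y0 >= y)) / np.sum(w)

def indicator_kernel(u):
    """K(u) = I(u = 0): only observations with exactly matching covariate count."""
    return (u == 0).astype(float)

# Toy non-diseased sample with a binary covariate
x0 = np.array([0, 0, 0, 1, 1, 1, 1])
y0 = np.array([1.2, 0.7, 2.5, 3.1, 0.4, 1.9, 2.2])

est = kernel_survival(2.0, 1, y0, x0, indicator_kernel)
stratum = np.mean(y0[x0 == 1] >= 2.0)  # plain empirical survival in stratum x = 1
print(est, stratum)
```

Both quantities are the same stratum-specific proportion, so no smoothing bias needs to be controlled in the discrete case.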
Remark A2.2 When S0(y|x) is estimated by the Cox model, the only difference lies in the expression for Ĥ0(y; x) – H0(y; x); the influence function qn(y, x, Y0, X0) is given by the influence function of , where is the nonparametric maximum likelihood estimator in the Cox model.
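The proof of Theorem 2 works with the transform H0(y; x) ≡ – log S0(y; x) and its inverse. As a rough numerical illustration, the sketch below computes both from a single uncensored sample with no covariates, using the fact that H(y) = h is equivalent to S(y) = exp(–h), so the inverse is a quantile. The Exp(scale = 2) data, the function names, and the use of `np.quantile` are all illustrative assumptions.

```python
import numpy as np

def cumulative_hazard(y, sample):
    """H(y) = -log S(y), with S(y) = P_n(Y >= y) the empirical survival function."""
    s = np.mean(sample >= y)
    return -np.log(s)

def inverse_cumulative_hazard(h, sample):
    """Approximate inverse of H: the (1 - exp(-h)) sample quantile."""
    return np.quantile(sample, 1.0 - np.exp(-h))

rng = np.random.default_rng(1)
sample = rng.exponential(scale=2.0, size=5000)  # true H(y) = y / 2

h = 0.8
y_h = inverse_cumulative_hazard(h, sample)
round_trip = cumulative_hazard(y_h, sample)
print(y_h, round_trip)
```

For this distribution the true inverse at h = 0.8 is 1.6, and applying the transform to the computed inverse should return a value close to h, up to the discreteness of the empirical survival function.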
A.3. Proof of Theorem 3
The asymptotic property of Ĝ(t) follows from the same expansion as in the proof of Theorem 2, except that we use the differentiability of the product-limit function. Let SW denote the survival function of W1, and HW the cumulative hazard function of W1. We have
We further expand the second term in the right-hand side as in the previous section to obtain
Hence, from the same arguments as in Theorem 2, we obtain
for some g3(Y0, X0). Therefore, converges in distribution to a Gaussian process in l∞[0, τ], and the covariance function is equal to the covariance of
Contributor Information
Eunhee Kim, Department of Biostatistics and Center for Statistical Sciences, Brown University, Providence, Rhode Island 02912, U.S.A. ekim@stat.brown.edu.
Donglin Zeng, Department of Biostatistics, University of North Carolina at Chapel Hill, North Carolina 27599, U.S.A. dzeng@bios.unc.edu.
References
- Alonzo TA, Pepe MS. Distribution-free ROC analysis using binary regression techniques. Biostatistics. 2002;3:421–433. doi: 10.1093/biostatistics/3.3.421.
- Beam CA. Random-effects models in the receiver operating characteristic curve-based assessment of the effectiveness of diagnostic imaging technology: concepts, approaches, and issues. Academic Radiology. 1995;2:4–13.
- Cai T, Pepe MS. Semi-parametric ROC analysis to evaluate biomarkers for disease. J. Amer. Statist. Assoc. 2002;97:1099–1107.
- Dorfman DD, Berbaum KS, Metz CE. ROC rating analysis: generalization to the population of readers and cases with the jackknife method. Invest. Radiol. 1992;27:723–731.
- Etzioni R, Pepe MS, Longton G, Hu C, Goodman G. Incorporating the time dimension in receiver operating characteristic curves: a case study of prostate cancer. Med. Decis. Making. 1999;19:242–251. doi: 10.1177/0272989X9901900303.
- Gatsonis CA. Random effects models for diagnostic test accuracy. Academic Radiology. 1995;2:S14–S21.
- Goodman G, Omenn GS, Thornquist M, Lund B, Metch B, Gylys-Colwell I. The Carotene and Retinol Efficacy Trial (CARET) to prevent lung cancer in high-risk populations: pilot study with cigarette smokers. Cancer Epidemiol. Biomarkers Prev. 1993;2:389–396.
- Hanley JA. Receiver operating characteristic (ROC) methodology: the state of the art. Crit. Rev. Diagn. Imag. 1989;29:307–335.
- Heagerty PJ, Pepe MS. Semiparametric estimation of regression quantiles with application to standardizing weight for height and age in US children. J. Roy. Statist. Soc. Ser. C. 1999;48:533–551.
- Hellmich M, Abrams KR, Jones DR, Lambert PC. A Bayesian approach to a general regression model for ROC curves. Med. Decis. Making. 1998;18:436–443. doi: 10.1177/0272989X9801800412.
- Ishwaran H, Gatsonis C. A general class of hierarchical ordinal regression models with applications to correlated ROC analysis. Canad. J. Statist. 2000;28:731–750.
- Jin Z, Lin DY, Wei LJ, Ying Z. Rank-based inference for the accelerated failure time model. Biometrika. 2003;90:341–353.
- Jin Z, Lin DY, Ying Z. On least-squares regression with censored data. Biometrika. 2006;93:147–161.
- Kalbfleisch JD, Prentice RL. The Statistical Analysis of Failure Time Data. Wiley; New York: 2002.
- Obuchowski NA, Rockette HE. Hypothesis testing of diagnostic accuracy for multiple readers and multiple tests: An ANOVA approach with dependent observations. Commun. Stat. B.-Simul. 1995;24:285–308.
- Peng F, Hall WJ. Bayesian analysis of ROC curves using Markov-Chain Monte Carlo methods. Med. Decis. Making. 1996;16:404–411. doi: 10.1177/0272989X9601600411.
- Pepe MS. A regression modelling framework for receiver operating characteristic curves in medical diagnostic testing. Biometrika. 1997;84:595–608.
- Pepe MS. Three approaches to regression analysis of receiver operating characteristic curves for continuous test results. Biometrics. 1998;54:124–135.
- Pepe MS. An interpretation for the ROC curve and inference using GLM procedures. Biometrics. 2000;56:352–359. doi: 10.1111/j.0006-341x.2000.00352.x.
- Pepe MS. The Statistical Evaluation of Medical Tests for Classification and Prediction. Oxford University Press; Oxford: 2003.
- Swets JA, Pickett RM. Evaluation of Diagnostic Systems: Methods From Signal Detection Theory. Academic Press; New York: 1982.
- Thompson ML, Zucchini W. On the statistical analysis of ROC curves. Statist. Medicine. 1989;8:1277–1290. doi: 10.1002/sim.4780081011.
- Tosteson A, Begg C. A general regression methodology for ROC curve estimations. Med. Decis. Making. 1988;8:204–215. doi: 10.1177/0272989X8800800309.
- van der Vaart AW. Asymptotic Statistics. Cambridge University Press; Cambridge: 1998.
- van der Vaart AW, Wellner JA. Weak Convergence and Empirical Processes. Springer-Verlag; New York: 1996.
- Wang Q, Shen J. Estimation and confidence bands of a conditional survival function with censoring indicators missing at random. J. Multivariate Anal. 2008;99:928–948.
- Zeng D. Estimating marginal survival function by adjusting for dependent censoring using many covariates. Ann. Statist. 2004;32:1533–1555.
- Zeng D, Lin DY. Efficient estimation for the accelerated failure time model. J. Amer. Statist. Assoc. 2007;102:1387–1396.
- Zhou XH, Obuchowski NA, McClish DK. Statistical Methods in Diagnostic Medicine. Wiley; New York: 2002.





