Summary
In this article, we consider comparing the areas under correlated receiver operating characteristic (ROC) curves of diagnostic biomarkers whose measurements are subject to a limit of detection (LOD), a source of measurement error from instruments’ sensitivity in epidemiological studies. We propose and examine the likelihood ratio tests with operating characteristics that are easily obtained by classical maximum likelihood methodology.
Keywords: Area under curve (AUC), Censoring, Hypothesis testing, Limit of detection (LOD), Maximum likelihood, Receiver operating characteristics (ROC)
1. Introduction
The receiver operating characteristic (ROC) curve is a well-accepted statistical tool for evaluating the discriminatory ability of biomarkers (e.g., Shapiro, 1999). An ROC curve plots the true-positive rates of a biomarker versus its false-positive rates for various thresholds of the test result. It is a convenient way to compare diagnostic biomarkers because the ROC curve places tests on the same scale, where they can be compared for accuracy.
The area under the ROC curve (AUC) is a common index of the diagnostic performance of a biomarker. Bamber (1975) showed that AUC = Pr(X > Y), where X and Y represent values of the biomarker from diseased and healthy populations, respectively. Obviously, the closer the AUC is to 1, the better the diagnostic accuracy of the biomarker. In a parametric setting the AUCs can generally be expressed as a function of unknown parameters and thus can be evaluated via estimation of these parameters. Nonparametric estimation of the AUC has also been well addressed in the biostatistical and epidemiological literature. However, the test-scores of biomarkers are frequently associated with measurement error, and in this article we focus on measurement errors due to the limit of detection (LOD).
The LOD is a source of bias in many experiments and is usually caused by the limitation of instruments in measuring very high or low concentrations (e.g., Lyles, Williams, and Chuachoowong, 2001; Lubin et al., 2004; Mumford et al., 2006; Schisterman et al., 2006; Vexler, Liu, and Schisterman, 2006). This inability to accurately determine values of biomarkers introduces bias in the analysis of data from such experiments. For example, biomarkers for polychlorinated biphenyl (PCB), which are associated with endometriosis (Louis et al., 2005), are limited by instrument sensitivity (e.g., Lubin et al., 2004). The LOD issue can be considered as a problem of censored data analysis (e.g., Vexler et al., 2006). Perkins, Schisterman, and Vexler (2007) as well as Mumford et al. (2006) have proposed methods for estimation of ROC curves based on samples with LOD.
Often it is necessary to determine whether a biomarker has satisfactory accuracy in correctly discriminating between cases and controls, for example, testing for AUC = 0.5 (i.e., a biomarker has no discriminatory ability), or whether one biomarker has better diagnostic accuracy than another (e.g., Molodianovitch, Faraggi, and Reiser, 2006). This can be achieved by comparing the AUCs of these biomarkers. The present article addresses these issues when the measurements of the biomarkers are subject to an LOD. We investigate the maximum likelihood ratio test (MLRT), utilizing the likelihood function proposed by Lyles et al. (2001). Operating characteristics of the proposed tests (e.g., significance level and power) can be obtained from classical results of the maximum likelihood method.
The article is organized as follows. Section 2 introduces the MLRT for comparing AUCs. Section 3 presents Monte Carlo simulation results. In Section 4, we apply the proposed tests to data from two studies to evaluate the AUCs of several biomarkers, and Section 5 gives some concluding remarks. The first example is from a study conducted in Birmingham, Alabama, to investigate whether intrauterine inflammation is associated with neurodevelopmental abnormalities in early childhood, so that certain educational methods for improvement can be utilized. In this example the levels of intrauterine inflammation biomarkers are observed only if they are above the detection limits. The second example uses data from a study of atherosclerotic coronary heart disease to test for the discriminatory ability of several biomarkers. This study sampled residents of Niagara and Erie counties in New York who were between the ages of 35 and 79. Adults between the ages of 35 and 65 were randomly selected using the New York State Department of Motor Vehicles drivers’ license rolls. Individuals between 65 and 79 years of age were sampled randomly from the Health Care Financing Administration database. A cohort of 942 individuals consisted of 143 people with myocardial infarction (cases) and 799 controls. The purpose of the study was to determine whether biomarkers that measure individuals’ oxidative stress and antioxidant status are good at determining an individual's disease status.
2. Maximum Likelihood Ratio Tests
2.1 Test Based on Complete Data
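Let Xk and Yk represent the values of biomarker k (= 1, 2) associated with a diseased (X) and healthy (Y) population, respectively, and {xk1, . . . , xkn} and {yk1, . . . , ykm} be the corresponding test-scores. Suppose the independent vectors (x1i, x2i)T follow a normal distribution
$$(x_{1i}, x_{2i})^T \sim N\left((\mu_{x1}, \mu_{x2})^T, \Sigma_x\right), \quad \Sigma_x = \begin{pmatrix} \sigma_{x1}^2 & \rho_x \sigma_{x1} \sigma_{x2} \\ \rho_x \sigma_{x1} \sigma_{x2} & \sigma_{x2}^2 \end{pmatrix}, \quad i = 1, \ldots, n,$$
and similarly,
$$(y_{1j}, y_{2j})^T \sim N\left((\mu_{y1}, \mu_{y2})^T, \Sigma_y\right), \quad \Sigma_y = \begin{pmatrix} \sigma_{y1}^2 & \rho_y \sigma_{y1} \sigma_{y2} \\ \rho_y \sigma_{y1} \sigma_{y2} & \sigma_{y2}^2 \end{pmatrix}, \quad j = 1, \ldots, m.$$
Following Bamber (1975), the AUCs of the biomarkers are AUC1 = P(X1 > Y1) and AUC2 = P(X2 > Y2), respectively.
In this section, we formally consider testing the hypothesis
$$H_0: AUC_1 = AUC_2 \quad \text{versus} \quad H_1: AUC_1 \neq AUC_2. \tag{1}$$
It is clear that $AUC_k = \Phi\left((\mu_{xk} - \mu_{yk}) / (\sigma_{xk}^2 + \sigma_{yk}^2)^{1/2}\right)$, k = 1, 2, where Φ is the standard normal distribution function, and therefore H0 is equivalent to
$$\frac{\mu_{x1} - \mu_{y1}}{(\sigma_{x1}^2 + \sigma_{y1}^2)^{1/2}} = \frac{\mu_{x2} - \mu_{y2}}{(\sigma_{x2}^2 + \sigma_{y2}^2)^{1/2}}.$$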
In a simple case, where all the parameters are known and there is no measurement error, that is, (X1, X2) and (Y1, Y2) are observed completely, we can utilize the classical MLRT for testing H0. To this end, note that under H1 and H0 the likelihood function has the form
respectively, where the vectors of parameters are
and the density function f is f(x1, x2, y1, y2; ΘX, ΘY) = ϕ(x1, x2; ΘX) ϕ(y1, y2; ΘY), where, with Θ = (θ1, θ2, θ3, θ4, θ5),
Therefore the classical likelihood ratio test-statistic is
Thus, we reject the null hypothesis iff z > zα, where the threshold zα corresponds to type I error α. It is clear that this test is the most powerful unbiased test; see, for example, Lehmann (1997).
When the parameters are unknown, Molodianovitch et al. (2006) propose the transformed normal approach by normalizing data through transformation and then applying the parametric test proposed by Wieand et al. (1989) in order to test for hypothesis (1). (This test is based on confidence intervals (CIs) of AUCs [e.g., Reiser and Faraggi, 1997]. We will investigate this method in detail in Section 4.2.) Alternatively, we can apply the maximum likelihood estimation and obtain the test-statistic:
where
It is well known (e.g., Lehmann, 1997) that under H0, the statistic 2 log z asymptotically has a $\chi^2_1$ distribution (chi-square with one degree of freedom), and therefore the threshold zα can be easily obtained from Pr(z > zα) = α, as n, m → ∞. Moreover, this test is asymptotically most powerful (e.g., Choi, Hall, and Schick, 1996).
2.2 Test Based on Data Subject to Limit of Detection
If measurements of the biomarkers are subject to an LOD, then instead of observing x1i, x2i, y1j, y2j completely, we observe each value numerically only when it is not below the corresponding LOD and record it as not available (NA) otherwise; that is, xki is observed if xki ≥ dx and ykj is observed if ykj ≥ dy,
where k = 1, 2, i = 1, . . . , n, j = 1, . . . , m and dx, dy are the values of the LOD (e.g., Lynn, 2001; Lubin et al., 2004; Mumford et al., 2006; Schisterman et al., 2006; Vexler et al., 2006). In the present article we assume, without loss of generality, that dx = dy = d and d is known (if d is unknown, it can be easily estimated, for example, by the minimum over i, j, k of the numerically observed values of xki and ykj). We can still obtain the MLRT statistic based on the left-censored data. Following Lyles et al. (2001), we write the likelihood functions based on the left-censored observations from the diseased and the healthy populations; these are formally defined in Appendix A.
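The left-censoring mechanism described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `apply_lod` and `estimate_lod` are hypothetical, and NA is represented by `numpy.nan`, which the article does not prescribe.

```python
import numpy as np

def apply_lod(values, d):
    """Return a copy in which every value below the LOD d is replaced by NaN (NA)."""
    values = np.asarray(values, dtype=float)
    return np.where(values >= d, values, np.nan)

def estimate_lod(*censored_samples):
    """Estimate an unknown LOD by the smallest numerically observed value."""
    return min(np.nanmin(sample) for sample in censored_samples)

# Illustrative numbers only (not data from the article).
x_obs = apply_lod([-1.0, 0.5, 2.0], d=0.0)   # first value becomes NA
y_obs = apply_lod([0.2, -3.0], d=0.0)        # second value becomes NA
d_hat = estimate_lod(x_obs, y_obs)           # smallest observed value
```

The minimum of the observed values can only overestimate the true LOD, so with small samples this estimate should be treated with some care.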
Thus the MLRT statistic is given by
Subsequently, the test threshold zα can be obtained by the MLRT's asymptotic result: under H0, 2 log z(LOD) converges in distribution to $\chi^2_1$ as n, m → ∞ with d fixed.
Remark 1. Numerical calculations
Note that statistical software such as R or SPlus allows us to calculate the test-statistics z and z(LOD) numerically, without using closed forms of the estimators of the unknown parameters. A schematic example of programming in R is available upon request from the first author.
Remark 2. Transformed normal approach
The proposed method is based on the MLRT technique and hence parametric assumptions regarding the data points are required. In order to relax the normal distribution assumptions, following Molodianovitch et al. (2006) we can fit the data to a Box–Cox power transformation model to better achieve normality and then test for (1). Note that Molodianovitch et al. (2006) have concluded that the transformed normal approach is efficient and robust when AUCs are compared. We present a modification of the proposed test in Appendix B.
3. Simulation
We conducted Monte Carlo simulations to examine the performance of the proposed method. To this end, we generated values of {x1i, i = 1, . . . , n} from the normal distribution with mean μx1 and variance 1, and {x2i = ax1i + εi, i = 1, . . . , n}, where the εi ∼ N(0, 1) are independent and identically distributed (i.i.d.) random variables. Similarly, $y_{1j} \sim N(1, 0.5^2)$ and y2j = by1j + εj (j = 1, . . . , m) were generated. Hence, $x_{2i} \sim N(a\mu_{x1}, a^2 + 1)$ and $y_{2j} \sim N(b, 0.25b^2 + 1)$, where a and b are specified below.
Significance level of the test
Setting a = 0.7, b = 0.5, μx1 = 1.274, we have AUC1 = AUC2 = 0.597. For each value of d = −3, −1, −0.5, 0, 0.5, and 0.75 we generated 10,000 samples of {x1i, x2i, i = 1, . . . , n} and {y1j, y2j, j = 1, . . . , m}. Based on the generated samples, in each repetition we calculated the values of the test-statistic z(LOD).
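The stated AUC values follow from the binormal formula AUC = Φ((μx − μy)/(σx² + σy²)^{1/2}) applied to the simulation design. The sketch below is a check under assumptions, not the authors' code: the helper names are hypothetical, and y1 is read as normal with mean 1 and standard deviation 0.5, a reading that agrees with the censoring proportions reported in Table 1.

```python
import numpy as np
from scipy.stats import norm

def design_moments(mu_x1, a, b):
    """Means and standard deviations implied by x2 = a*x1 + eps, y2 = b*y1 + eps."""
    means = {"x1": mu_x1, "x2": a * mu_x1, "y1": 1.0, "y2": b * 1.0}
    sds = {"x1": 1.0, "x2": np.sqrt(a**2 + 1.0),
           "y1": 0.5, "y2": np.sqrt(0.25 * b**2 + 1.0)}
    return means, sds

def binormal_auc(mu_x, sd_x, mu_y, sd_y):
    """AUC = Pr(X > Y) for independent normal X and Y."""
    return norm.cdf((mu_x - mu_y) / np.hypot(sd_x, sd_y))

def proportion_below(d, mean, sd):
    """Theoretical proportion of a normal marker falling below the LOD d."""
    return norm.cdf((d - mean) / sd)

means, sds = design_moments(mu_x1=1.274, a=0.7, b=0.5)
auc1 = binormal_auc(means["x1"], sds["x1"], means["y1"], sds["y1"])
auc2 = binormal_auc(means["x2"], sds["x2"], means["y2"], sds["y2"])
below = {k: proportion_below(0.75, means[k], sds[k]) for k in means}
print(round(auc1, 3), round(auc2, 3))  # both round to 0.597
```

The dictionary `below` holds the theoretical proportions of each marker below d = 0.75, which can be compared with the corresponding row of Table 1.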
In Table 1 we present the Monte Carlo estimates of the type I error, where the test thresholds 2 log zα are 3.84 and 6.63. These thresholds correspond to Pr(ξ > 3.84) = 0.05 and Pr(ξ > 6.63) = 0.01, where $\xi \sim \chi^2_1$. Table 1 also provides the theoretical proportions of observations of X1, X2, Y1, and Y2 that are below the LOD value d. As can be seen, asymptotically, the type I error of the proposed test can be obtained from the distribution of the 2 log z(LOD) statistic. However, if, for example, n = m = 150 and d = 0.75 (in which case about 60% of the Y2's are not observed numerically), then this asymptotic approximation is dubious. (Note that for a nominal type I error α, the Monte Carlo estimate based on 10,000 repetitions has the approximate 95% CI α ± 1.96{α(1 − α)/10,000}^{1/2}.)
Table 1.
n = m | d | P̂r(2 log z(LOD) > 3.84) | P̂r(2 log z(LOD) > 6.63) | P(x1 < d) | P(y1 < d) | P(x2 < d) | P(y2 < d) |
---|---|---|---|---|---|---|---|
150 | −3 | 0.0504 | 0.0098 | 9.5 × 10^−6 | 6.22 × 10^−16 | 0.0007 | 0.0003 |
150 | −1 | 0.0510 | 0.0100 | 0.0115 | 3.17 × 10^−5 | 0.0606 | 0.0728 |
150 | −0.5 | 0.0535 | 0.0114 | 0.0380 | 0.0013 | 0.1271 | 0.1660 |
150 | 0 | 0.0573 | 0.0146 | 0.1013 | 0.0228 | 0.2325 | 0.3138 |
150 | 0.5 | 0.0601 | 0.0153 | 0.2194 | 0.1587 | 0.3740 | 0.5000 |
150 | 0.75 | 0.0634 | 0.0241 | 0.3000 | 0.3085 | 0.4537 | 0.5958 |
30 | −3 | 0.0510 | 0.0110 | | | | |
30 | −0.5 | 0.0507 | 0.0120 | | | | |
30 | 0 | 0.0593 | 0.0150 | | | | |
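The Monte Carlo confidence interval quoted in the text, α ± 1.96{α(1 − α)/10,000}^{1/2}, can be evaluated directly and compared with the estimated type I errors for n = m = 150 at the 3.84 threshold. The following is a small arithmetic sketch; the function name `monte_carlo_ci` is hypothetical and the dictionary simply copies the relevant Table 1 column.

```python
import math

def monte_carlo_ci(alpha, reps, z=1.96):
    """Approximate 95% interval for a Monte Carlo estimate of a rate alpha."""
    half_width = z * math.sqrt(alpha * (1.0 - alpha) / reps)
    return alpha - half_width, alpha + half_width

lo, hi = monte_carlo_ci(alpha=0.05, reps=10_000)

# Estimated type I errors from Table 1 (n = m = 150, threshold 3.84), keyed by d.
table1_n150 = {-3: 0.0504, -1: 0.0510, -0.5: 0.0535,
               0: 0.0573, 0.5: 0.0601, 0.75: 0.0634}
outside = [d for d, est in table1_n150.items() if not lo <= est <= hi]
print(round(lo, 4), round(hi, 4), outside)
```

Under this interval the estimates for d = −3, −1, and −0.5 are compatible with the nominal 0.05 level, whereas those for d = 0, 0.5, and 0.75 lie above its upper limit.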
Power of the test
Here we examine the power of the test for situations where {AUC1 = 0.5, AUC2 = 0.6} and {AUC1 = 0.6, AUC2 = 0.9}. For the first case, we set μx1 = 1, a = 0.7, b = 0.3, and for the second μx1 = 1.3, a = 0.5, b = −1.5 (these are the settings that reproduce the stated AUCs and the censoring proportions in Table 2). For both cases n = m = 150. Table 2 displays the Monte Carlo estimates of the test's power for different values of d. The power of the test depends on the proportions of X1's, X2's, Y1's, and Y2's below d.
Table 2.
d | AUC1 | AUC2 | Estimated power | P(x1 < d) | P(y1 < d) | P(x2 < d) | P(y2 < d) |
---|---|---|---|---|---|---|---|
−3 | 0.5 | 0.6 | 0.8394 | 3.2 × 10^−5 | 6.2 × 10^−16 | 1.1 × 10^−3 | 5.5 × 10^−4 |
−1 | 0.5 | 0.6 | 0.8284 | 2.3 × 10^−2 | 3.2 × 10^−5 | 8.2 × 10^−2 | 9.9 × 10^−2 |
0 | 0.5 | 0.6 | 0.8374 | 0.16 | 2.3 × 10^−2 | 0.28 | 0.38 |
0.75 | 0.5 | 0.6 | 0.7372 | 0.40 | 0.31 | 0.52 | 0.67 |
−3 | 0.6 | 0.9 | 0.9995 | 8.5 × 10^−6 | 6.2 × 10^−16 | 5.5 × 10^−4 | 0.12 |
−1 | 0.6 | 0.9 | 0.9985 | 1.1 × 10^−2 | 3.2 × 10^−5 | 0.07 | 0.66 |
0 | 0.6 | 0.9 | 0.9973 | 0.08 | 0.02 | 0.28 | 0.89 |
Table 2 demonstrates high power even in the situation where AUC1 is close to AUC2 and the proportions of biomarker values below d are high (d = 0, 0.75).
Robustness
The simulations thus far assume that the samples follow normal distributions. In order to illustrate the robustness of our method, we performed the following Monte Carlo simulations. Suppose that, instead of following normal distributions, the diagnostic markers satisfy , where , and ε(df) are independent identically t-distributed random variables with df degrees of freedom, mean 0 and variance 1, 1 ≤ i, j ≤ 150. Thus, AUC1 = AUC2 = 0.5. Here we ran 10,000 repetitions of the sample (X′, Y′) at each df = 5, 10, 15 and d = −3, −1, 0 (d is the value of the LOD). We examined the significance level of the proposed test under this incorrect distributional assumption. Table 3 corresponds to the case when we expect the type I error to be 0.05 (the test threshold 2 log zα is 3.84).
Table 3.
df | d | P̂r(2 log z(LOD) > 3.84) |
---|---|---|
15 | −3 | 0.0506 |
10 | −3 | 0.0524 |
5 | −3 | 0.0518 |
15 | −1 | 0.0590 |
10 | −1 | 0.0610 |
5 | −1 | 0.0651 |
15 | 0 | 0.0958 |
10 | 0 | 0.1507 |
5 | 0 | 0.3288 |
From these results we conclude that the proposed method is reasonable even when the distributional assumptions do not exactly satisfy normality. However, the accuracy of the expected significance level is poor when d = 0 (about 50% of the data are below the detection limit). In contrast, Table 1 indicates that under the correct distributional assumption this proportion of observations below the LOD is not critical.
Imputation method
Conventional approaches to dealing with data below the LOD include omission, resulting in a truncated data set, and imputation with a constant, such as d or a fraction thereof (e.g., d/2, d/√2); or the observed values may be used directly or indirectly (e.g., Lubin et al., 2004; Schisterman et al., 2006). Perkins et al. (2007) showed that the imputation method can lead to biased parametric/nonparametric estimation of AUCs. Here we report results of the Monte Carlo simulation corresponding to Table 1, where the test based on CIs (e.g., Wieand et al., 1989; Reiser and Faraggi, 1997; for details, see Section 4.2) has been calculated for observations in which every value of xki and ykj below d is replaced by the constant Imp,
k = 1, 2, i = 1, . . . , n = 150, j = 1, . . . , m = 150, and Imp = d/2, d/√2.
In contrast with Table 1, Table 4 demonstrates that when d = 0, −3, 0.75 these conventional approaches should not be recommended. (Investigation of the test based on the samples ignoring the NA values and of the nonparametric test [Wieand et al., 1989] based on the imputed observations, k = 1, 2, i = 1, . . . , 150, j = 1, . . . , 150, led to similar conclusions.)
Table 4.
d | Imp = d/2 | Imp = d/√2 |
---|---|---|
−3 | 0.0549 | 0.0546 |
0 | 0.0689 | 0.0687 |
0.5 | 0.1098 | 0.1455 |
0.75 | 0.1539 | 0.2273 |
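One way to see why constant imputation distorts inference is to compute the mean of a normal marker after values below d are replaced by a constant. For X ∼ N(μ, σ²) and α = (d − μ)/σ, the imputed variable has mean μ{1 − Φ(α)} + σϕ(α) + Imp·Φ(α), which in general differs from μ. The sketch below evaluates this standard truncated-normal identity; it is not taken from the article, and the function name is hypothetical.

```python
import math
from scipy.stats import norm

def mean_after_imputation(mu, sigma, d, imp):
    """Mean of X* = X if X >= d, else imp, where X ~ N(mu, sigma^2)."""
    alpha = (d - mu) / sigma
    p_below = norm.cdf(alpha)
    # E[X * 1{X >= d}] for a normal variable
    observed_part = mu * (1.0 - p_below) + sigma * norm.pdf(alpha)
    return observed_part + imp * p_below

# Healthy-population marker y1 of Section 3 (mean 1, sd 0.5) with d = 0.75.
d = 0.75
bias_half = mean_after_imputation(1.0, 0.5, d, d / 2) - 1.0
bias_root = mean_after_imputation(1.0, 0.5, d, d / math.sqrt(2)) - 1.0
print(bias_half, bias_root)
```

The two biases need not have the same sign, and the variance of the imputed variable is distorted as well, so AUC estimates built from imputed samples inherit errors in both numerator and denominator.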
4. Examples
We exemplify the proposed method with data from the two studies briefly described in the introduction.
4.1 The IQ Study
Here we examine whether biomarker IL8 has the ability to discriminate between low and high levels of IQ. The data include 369 subjects. The IQ indicator full-scale IQ (FSIQ) has values ranging from 46 to 118 with an average equal to 82.57. We split our data into two populations, where population A includes those with IQ less than 82.57 and population B includes those with IQ greater than 82.57. We associate biomarker IL8 with both populations separately. Denote X, Y as biomarker values related to population A and B, respectively. The total number of Xs is 189 and the total number of Ys is 180. According to the instrument manual, the LOD for IL8 is d = 3.2, yielding the numbers of NAs to be 95 and 108 for X and Y, respectively. The logarithmic values of the biomarkers are used in order to better achieve normality. The empirical histograms of the log-transformed biomarker corresponding to high and low levels of IQ are depicted in Figure 1.
Under the assumption that log X and log Y have normal distributions, applying the maximum likelihood estimation proposed by Vexler et al. (2006) (or estimation based on censored data, see, e.g., Gupta [1952]) leads to estimated means of log X and log Y of 1.02 and 0.38, respectively. The corresponding standard deviations are 2.60 and 2.88. In this case the estimated AUC is $\Phi\left((1.02 - 0.38)/(2.60^2 + 2.88^2)^{1/2}\right) \approx 0.57$, where Φ is the standard normal distribution function.
Now, we test for AUC = 0.5 under the ROC curve of IL8 (i.e., no discriminatory ability of the biomarker). This is a particular case of the testing procedure considered in Section 2. Specifically, because the AUC = 0.5 iff E log X = E log Y, the test statistic has the form
where
and the function exp(l) is proportional to the likelihood based on censored data. For details regarding this maximum likelihood function see Vexler et al. (2006). The value of the test-statistic is computed to be 2.97. Because the value of z0.05 corresponding to PrH0(2 log z > z0.05) ≃ 0.05 (from the $\chi^2_1$ distribution) is 3.84, we do not reject H0. Therefore we conclude that the discriminatory ability of biomarker IL8 is not significant.
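A test of this type can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: it assumes two independent normal samples left-censored at a common known d, maximizes the censored likelihoods with `scipy.optimize.minimize`, and refers 2 log z to the χ² distribution with one degree of freedom. The function names and the simulated data are hypothetical; the IQ study data are not reproduced here.

```python
import numpy as np
from scipy import optimize, stats

def censored_normal_loglik(observed, n_below, d, mu, sigma):
    """Log-likelihood of a normal sample with n_below values left-censored at d."""
    return (stats.norm.logpdf(observed, mu, sigma).sum()
            + n_below * stats.norm.logcdf(d, mu, sigma))

def mlrt_equal_means(x, y, d):
    """Return (2 log z, p-value) for H0: E X = E Y, i.e. AUC = 0.5 under normality."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    x_obs, y_obs = x[x >= d], y[y >= d]          # NaN or values below d count as NA
    nx_na, ny_na = x.size - x_obs.size, y.size - y_obs.size

    def nll_h1(p):                               # p = (mu_x, log sd_x, mu_y, log sd_y)
        return -(censored_normal_loglik(x_obs, nx_na, d, p[0], np.exp(p[1]))
                 + censored_normal_loglik(y_obs, ny_na, d, p[2], np.exp(p[3])))

    def nll_h0(p):                               # p = (common mu, log sd_x, log sd_y)
        return nll_h1([p[0], p[1], p[0], p[2]])

    start1 = [x_obs.mean(), np.log(x_obs.std()), y_obs.mean(), np.log(y_obs.std())]
    start0 = [(start1[0] + start1[2]) / 2.0, start1[1], start1[3]]
    opts = {"xatol": 1e-8, "fatol": 1e-8, "maxiter": 5000}
    fit1 = optimize.minimize(nll_h1, start1, method="Nelder-Mead", options=opts)
    fit0 = optimize.minimize(nll_h0, start0, method="Nelder-Mead", options=opts)
    stat = max(2.0 * (fit0.fun - fit1.fun), 0.0)
    return stat, stats.chi2.sf(stat, df=1)

# Simulated illustration only.
rng = np.random.default_rng(0)
x = rng.normal(1.0, 1.0, 200)
y = rng.normal(0.0, 1.0, 200)
stat, p_value = mlrt_equal_means(x, y, d=0.0)
print(stat, p_value)
```

Standard deviations are optimized on the log scale so that the search is unconstrained; a derivative-free optimizer is used here for robustness, and a gradient-based method may be faster for larger problems.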
4.2 Evaluating Biomarkers for Coronary Heart Disease
For this example, we compare the diagnostic accuracy of two biomarkers, cholesterol and hdl-cholesterol. To normalize the data, we log-transform the values of both biomarkers. It is obvious from a biological standpoint that the levels of cholesterol and hdl-cholesterol are correlated. Denote by X1, Y1 the log-transformed values of cholesterol for the cases and controls, respectively, and similarly, X2, Y2 the log-transformed hdl-cholesterol levels from cases and controls. The estimated means of X1, X2, Y1, and Y2 are 5.63, 4.15, 5.47, 4.13, and the estimated standard deviations are 0.18, 0.24, 0.30, 0.25, respectively. The estimators of the correlation between X1 and X2, as well as between Y1 and Y2 are and , respectively. Figure 2 introduces empirical histograms of X1, X2, Y1, and Y2.
Assume that the values of the log-transformed biomarkers are normally distributed. Simulation studies were conducted for each of d = 0, 3, 3.25, 3.5, 3.75, 4, and 4.25. Table 5 presents estimators of the correlated AUCs and p-values obtained based on values of the test-statistic z for different d (theoretically 2 log z(LOD) is approximately $\chi^2_1$ distributed); NX1, NX2, NY1, and NY2 denote the numbers of observations of X1, X2, Y1, and Y2 below d. (Note that the situation d = 0 corresponds to no LOD effect.)
Table 5.
d | NX1 | NX2 | NY1 | NY2 | AÛC1 | AÛC2 | 2 log z(LOD) | p-value |
---|---|---|---|---|---|---|---|---|
0.00 | 0 | 0 | 0 | 0 | 0.671 | 0.524 | 24.291 | 10.20 × 10^−7 |
3.00 | 0 | 0 | 0 | 1 | 0.671 | 0.524 | 24.049 | 9.39 × 10^−7 |
3.25 | 0 | 0 | 0 | 2 | 0.671 | 0.524 | 22.156 | 2.51 × 10^−6 |
3.50 | 0 | 2 | 1 | 8 | 0.671 | 0.524 | 21.969 | 2.77 × 10^−6 |
3.75 | 0 | 8 | 1 | 40 | 0.672 | 0.525 | 22.570 | 2.03 × 10^−6 |
4.00 | 0 | 34 | 2 | 207 | 0.672 | 0.530 | 22.941 | 1.67 × 10^−6 |
4.25 | 0 | 88 | 5 | 553 | 0.673 | 0.583 | 21.862 | 2.93 × 10^−6 |
From Table 5, the null hypothesis H0 is rejected for every selected value of d; the p-values stay below 3 × 10^−6 and tend to be larger for larger d.
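The thresholds 3.84 and 6.63 used throughout, and p-values of the kind reported in Table 5, come from the chi-square distribution with one degree of freedom. A short sketch using `scipy.stats.chi2` is given below; the two statistics are copied from the d = 3.00 and d = 3.25 rows of Table 5, and the variable names are illustrative.

```python
from scipy.stats import chi2

# Critical values of 2 log z for type I errors 0.05 and 0.01.
threshold_05 = chi2.ppf(0.95, df=1)
threshold_01 = chi2.ppf(0.99, df=1)

# p-values for two of the 2 log z(LOD) statistics reported in Table 5.
p_d300 = chi2.sf(24.049, df=1)
p_d325 = chi2.sf(22.156, df=1)
print(threshold_05, threshold_01, p_d300, p_d325)
```

Using the survival function `sf` rather than `1 - cdf` avoids loss of precision for p-values this small.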
Although standard SPSS output gives the asymptotic 95% CI of AUC1 as (0.628,0.708) and of AUC2 as (0.481,0.585), in the simple case where d = 0, we cannot conclude that H0 : AUC1 = AUC2 is rejected because the estimators of AUC1 and AUC2 are correlated. We utilize a method proposed by Wieand et al. (1989, p. 587). Following these authors, we have, if biomarkers’ values are normally distributed, the test-statistic
where
Thus, because the zCI calculated from the data is 4.71, the p-value of the test |zCI| > zα is 0.0021, whereas our proposed method has p-value 10.20 × 10^−7; see Table 5 (d = 0).
5. Discussion
In the present article, we have shown that the maximum likelihood ratio approach serves as a method of testing for the hypothesis regarding the comparison of AUCs. Such an approach yields a powerful test with characteristics that can be obtained by the well-established maximum likelihood theory. We used real data examples to illustrate how easily the MLRT method can be carried out in order to compare two biomarkers and to determine whether a biomarker has discriminatory ability.
The article assumes normal distributions for the values of the biomarkers when LOD is present. However, the proposed approach can be extended to other commonly used distributions, for example, gamma, lognormal, etc. Similarly, we can perform hypothesis testing for AUCs based on right, double-censored, or truncated data. We have focused on comparing paired correlated areas, but the proposed method can be adapted to multivariate cases as well.
Our article presented a method dealing with data subject to LOD with broad validity under a reasonable set of assumptions. Sensitivity analysis, though beyond the scope of the present article, is important to assess these distributional assumptions. This topic can be discussed in the general context of missing data analysis; see Molenberghs and Kenward (2007). However, one must bear in mind that data below the LOD are informatively missing, in the sense that they are unobservable only if the actual values are below the detection limit.
We briefly investigated several imputation methods that are commonly applied among epidemiologists in dealing with LOD data. These methods, however, are not statistically justified and should not be confused with the popular method of multiple imputation (e.g., Rubin, 1987) in the missing data analysis literature. The use of multiple imputation in the analysis of LOD data deserves further investigation.
Note that nonparametric distribution function estimation based on censored data can be obtained and hence Kolmogorov–Smirnov-(or Shapiro–Wilk)-type tests for correctness of parametric assumptions can be evaluated (e.g., Verrill and Johnson, 1988). In the context of the ROC curves and Box–Cox power transformation models based on data subject to LOD, we will address nonparametric and semi-parametric methods in a subsequent article.
The proposed approach preserves the efficiency of the MLRT when applied to testing for biomarkers’ diagnostic accuracy subject to the LOD. When an additive measurement error is in effect, the appropriate maximum likelihood approach can also be utilized following a method similar to that of Section 2.
Acknowledgements
This research was supported by the Intramural Research Program of the National Institute of Child Health and Human Development, National Institutes of Health. The opinions expressed are those of the authors and not necessarily of the National Institutes of Health. The authors would like to thank Margaret Hillier for providing the intrauterine inflammation data. We are grateful to the co-editor, associate editor, and referee for their helpful comments that clearly improved this article.
Appendix A
The Likelihood Functions Based on Data Subject to LOD
The likelihood function based on has the form of
where
and n1, n2, n3, n4 (n1 + n2 + n3 + n4 = n) are the numbers of events , and , respectively. Here, the term corresponds to situations where x1i and x2i are observed completely; relates to situations where x1i is observed, whereas x2i is below the detection limit d and thus is NA (the opposite case where x1i is unobserved whereas x2i is available, matches ). Finally, when both x1 and x2 are not observed numerically, we have (i.e., represents the probability of both and to be NA).
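The four kinds of contributions described above (both markers observed; only x1 observed; only x2 observed; neither observed) can be sketched as a log-likelihood for one population. This is an illustration under assumptions rather than the article's own formula: the function name and argument order are hypothetical, NA is coded as `numpy.nan`, and the partially observed terms use the standard factorization of a bivariate normal density into a marginal density times a conditional distribution function.

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

def lod_bivariate_loglik(x1, x2, d, mu1, mu2, sd1, sd2, rho):
    """Log-likelihood of bivariate normal pairs left-censored at a common LOD d."""
    x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
    seen1, seen2 = x1 >= d, x2 >= d              # NaN (NA) compares as False
    mean = [mu1, mu2]
    cov = [[sd1**2, rho * sd1 * sd2], [rho * sd1 * sd2, sd2**2]]
    cond_sd1 = sd1 * np.sqrt(1.0 - rho**2)       # sd of x1 given x2
    cond_sd2 = sd2 * np.sqrt(1.0 - rho**2)       # sd of x2 given x1
    loglik = 0.0

    both = seen1 & seen2                         # n1 pairs: joint density
    if both.any():
        pts = np.column_stack([x1[both], x2[both]])
        loglik += np.sum(multivariate_normal.logpdf(pts, mean=mean, cov=cov))

    only1 = seen1 & ~seen2                       # n2 pairs: x1 observed, x2 < d
    cond_mean2 = mu2 + rho * sd2 / sd1 * (x1[only1] - mu1)
    loglik += np.sum(norm.logpdf(x1[only1], mu1, sd1)
                     + norm.logcdf(d, cond_mean2, cond_sd2))

    only2 = ~seen1 & seen2                       # n3 pairs: x2 observed, x1 < d
    cond_mean1 = mu1 + rho * sd1 / sd2 * (x2[only2] - mu2)
    loglik += np.sum(norm.logpdf(x2[only2], mu2, sd2)
                     + norm.logcdf(d, cond_mean1, cond_sd1))

    n_neither = int(np.sum(~seen1 & ~seen2))     # n4 pairs: Pr(x1 < d, x2 < d)
    if n_neither:
        loglik += n_neither * np.log(
            multivariate_normal.cdf([d, d], mean=mean, cov=cov))
    return float(loglik)
```

Summing this function over the diseased and healthy samples and maximizing it with and without the constraint AUC1 = AUC2 would give a statistic of the form 2 log z(LOD); the optimization itself is omitted here.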
The likelihood function based on is defined in a similar manner:
Appendix B
Transformed Normal Approach
We denote the function
and extend the likelihoods and by Appendix A to the forms of
where for z = x, y
and it is assumed that (T(z1, λ1), T(z2, λ2)) are jointly normally distributed. Thus, in this case, the MLR test-statistic is
References
- Bamber D. The area above the ordinal dominance graph and the area below the receiver operating characteristic graph. Journal of Mathematical Psychology. 1975;12:387–415.
- Choi S, Hall WJ, Schick A. Asymptotically uniformly most powerful tests in parametric and semi-parametric models. Annals of Statistics. 1996;24:841–861.
- Gupta AK. Estimation of the mean and standard deviation of a normal population from a censored sample. Biometrika. 1952;39:260–273.
- Lehmann EL. Testing Statistical Hypotheses. 2nd edition. John Wiley and Sons; New York: 1997.
- Louis GM, Weiner JM, Whitcomb BW, Sperrazza R, Schisterman EF, Lobdell DT, Crickard K, Greizerstein H, Kostyniak PJ. Environmental PCB exposure and risk of endometriosis. Human Reproduction. 2005;20:279–285. doi: 10.1093/humrep/deh575.
- Lubin JH, Colt JS, Camann D, Davis S, Cerhan JR, Severson RK, Bernstein L, Hartge P. Epidemiological evaluation of measurement data in the presence of detection limits. Environmental Health Perspectives. 2004;112:1691–1696. doi: 10.1289/ehp.7199.
- Lyles RH, Williams JK, Chuachoowong R. Correlating two viral load assays with known detection limits. Biometrics. 2001;57:1238–1244. doi: 10.1111/j.0006-341x.2001.01238.x.
- Lynn HS. Maximum likelihood inference for left-censored HIV RNA data. Statistics in Medicine. 2001;20:33–45. doi: 10.1002/1097-0258(20010115)20:1<33::aid-sim640>3.0.co;2-o.
- Molenberghs G, Kenward MG. Missing Data in Clinical Studies. Wiley; Chichester, UK: 2007.
- Molodianovitch K, Faraggi D, Reiser B. Comparing the areas under two correlated ROC curves: Parametric and non-parametric approaches. Biometrical Journal. 2006;48:745–757. doi: 10.1002/bimj.200610223.
- Mumford SL, Schisterman EF, Vexler A, Liu A. Pooling biospecimens and limits of detection: Effects on ROC curve analysis. Biostatistics. 2006;7:585–598. doi: 10.1093/biostatistics/kxj027.
- Perkins NJ, Schisterman EF, Vexler A. Receiver operating characteristic curve inference from a sample with a limit of detection. American Journal of Epidemiology. 2007;165:325–333. doi: 10.1093/aje/kwk011.
- Reiser B, Faraggi D. Confidence intervals for the generalized ROC criterion. Biometrics. 1997;53:644–652.
- Rubin DB. Multiple Imputation for Nonresponse in Surveys. Wiley; New York: 1987.
- Schisterman EF, Vexler A, Whitcomb BW, Liu A. The limitations due to exposure detection limits for regression models. American Journal of Epidemiology. 2006;163:374–383. doi: 10.1093/aje/kwj039.
- Shapiro D. The interpretation of diagnostic tests. Statistical Methods in Medical Research. 1999;8:113–134. doi: 10.1177/096228029900800203.
- Verrill S, Johnson RA. Tables and large-sample distribution theory for censored-data correlation statistics for testing normality. Journal of the American Statistical Association. 1988;83:1192–1197.
- Vexler A, Liu A, Schisterman EF. Efficient design and analysis of biospecimens with measurements subject to detection limit. Biometrical Journal. 2006;48:780–791. doi: 10.1002/bimj.200610266.
- Wieand S, Gail MH, James BR, James KL. A family of nonparametric statistics for comparing diagnostic markers with paired or unpaired data. Biometrika. 1989;76:585–592.