Abstract
Three recent nonparametric methodologies for estimating a monotone regression function F and its inverse F−1 are (1) the inverse kernel method DNP (Dette et al. (2005), Dette and Scheder (2010)), (2) the monotone spline (Kong and Eubank (2006)) and (3) the data adaptive method NAM (Bhattacharya and Lin (2010), (2011)), with roots in isotonic regression (Ayer et al. (1955), Bhattacharya and Kong (2007)). All three have asymptotically optimal error rates. In this article their finite sample performances are compared using extensive simulation from diverse models of interest, and by analysis of real data. Let there be m distinct values of the independent variable x among N observations y. The results show that if m is relatively small compared to N then generally the NAM performs best, while the DNP outperforms the other methods when m is O(N) unless there is a substantial clustering of the values of the independent variable x.
1. Introduction
Consider the estimation of a monotone increasing regression function F on an interval [a, b], F′ > 0, based on observations (xj, yj), j = 1, ⋯ , N, satisfying
yj = F(xj) + εj, j = 1, ⋯ , N,    (1.1)
where a = x1 ≤ x2 ≤ ⋯ ≤ xN = b are nonstochastic and εj (j = 1, ⋯ , N) are independent mean zero random variables; the distribution of εj may depend on xj. Suppose there are m distinct values of x, say a = z1 < ⋯ < zm = b, with ni observations yj at x = zi, n1 + ⋯ + nm = N, m → ∞. In particular, one may allow m = N. Assume F has a continuous second derivative. It is known from the general theory of nonparametric regression without an order restriction that the minimum MISE, or mean integrated squared error, of an estimate of F cannot be smaller than O(N−4/5), and that an estimate F̂(x) of F(x) with an asymptotic Normal distribution N(F(x), σN²(x)) can only attain a variance σN²(x) of order slightly larger than O(N−4/5), e.g., O(N−4/5 log log N) (See Eubank (1999) or Tsybakov (2010)). The optimal estimators commonly used are Nadaraya-Watson type kernel estimators with optimally chosen bandwidths. Optimal cubic splines may also be used (Wahba (1990)), although fully nonparametric frequentist inference based on them has been problematic (See, e.g., Eubank (1999) and Kelly and Rice (1990)).
It is a more delicate problem to construct optimal monotone estimates of F and, especially, of the inverse F−1. An important problem in bioassay and environmental risk assessment is to estimate a level x = F−1(p) of a drug or a chemical agent for which the response probability F(x) is a given quantity p. In quantal bioassay, or environmental risk assessment, for the level xj of a drug or a chemical agent, yj is recorded as 1 for response and 0 for nonresponse. In the next section we provide outlines of a number of recent methods which yield asymptotically optimal rates of MISE and asymptotic variances for the estimation of F and F−1. In particular, the recent nonparametric adaptive method NAM developed by the authors (Bhattacharya and Lin (2010), (2011), and Lin (2012)) and its smoother version SNAM introduced here are compared with an interesting kernel based methodology DNP due to Dette et al. (2005) and Dette and Scheder (2010), and with cubic splines. All these methods attain asymptotically optimal MISEs under appropriate conditions. But asymptotics only provide broad guidelines for large sample sizes; they cannot predict actual finite sample performance for small and moderately large sample sizes. Such comparative performances may only be studied by extensive simulation from diverse important models and by data analysis. This is our focus in the present article. For relatively small and moderate sample sizes, the classical CLT-based confidence intervals are difficult to compute, as the estimation of the standard error of the estimate requires estimating the derivative F′ of F, an object which is highly sensitive to minor changes in the data. But we can show that in most cases Efron’s bootstrap procedure (Efron (1979), (1981)) provides a valid and computationally feasible alternative. In particular, for these cases the use of the bootstrap provides the first fully nonparametric frequentist inference for monotone cubic spline estimates.
The extensive simulation and data analysis carried out in this article provide the following broad lessons. If ni (i = 1, ⋯ , m) are large relative to m then the NAM/SNAM outperforms DNP and the monotone spline. If on the other hand m is very large, say m = N or O(N), then the NAM is not quite applicable and the DNP outperforms a substitute of NAM, namely, the adaptive average grouping method AAGM whose roots lie in a paper by Wright (1982) but extended, made data adaptive and subjected to bootstrapping for inference in the present article and in Lin (2012). For small and moderate sample sizes (n = 5, 10) the NAM and DNP both mostly outperform the MLE (for the correctly specified model).
The following section provides outlines of the different methods under comparison. Section 3 exhibits results of extensive simulation with binary responses from four important models (logistic, probit, beta, and Weibull), with values 5 and 10 of m and values 5, 10, 25 of a common n = ni (i = 1, ⋯ , m). The comparisons among NAM, DNP, Spline and MLE are in terms of lengths of confidence intervals for F−1(p) for an equidistant set of 11 values of p in [0,1]. This section also includes simulation results on the biases and variances of the upper and lower confidence bounds for F−1(p) estimated by the NAM and DNP using bootstrapping; it uses 1000 samples from each model and for each of the 11 values of p. This provides information on the stability of bootstrapping in the present context. The last section, Section 4, contains some examples of data analysis.
A finite sample comparison among various kernel methods for monotone estimates of F−1 has been carried out in Dette and Scheder (2010). Apart from the DNP, these methods include those of Müller and Schmitt (1988) and Park and Park (2006). Although all these methods attain asymptotically optimal rates, and the performance of Müller and Schmitt (1988) is reasonably good, the DNP seems to do best.
The results in Sections 3 and 4 show that if m is small relative to n then the NAM and its smooth version SNAM outperform the other two main methods, namely, the DNP and the monotone spline. But if m is large relative to n, e.g., m = N or O(N), while n = O(1), then the DNP outperforms its competitors, the AAGM developed in Lin (2012), Chapter 6, and the monotone spline due to Kong and Eubank (2006), here used with bootstrapping as in Lin (2012).
Efron’s bootstrap may be proved valid for constructing confidence intervals for the three methods considered here, and it seems to have been applied for the first time in these contexts in Bhattacharya and Lin (2010), (2011), Lin (2012), and the present article.
It may be noted that most available data for environmental risk assessment involve very small m but fairly large n. For such cases the NAM is the same as the method due to Bhattacharya and Kong (2007), sometimes referred to as the BK method. The estimation of the so-called benchmark dosage, or BMD, may then be based on the BK method (See, e.g., Piegorsch et al. (2012)).
In a past study (See Bhattacharya and Lin (2011)), comparisons among NAM, DNP and the MLE were carried out using bootstrapping from a single sample in each case. In contrast, the present study is much more comprehensive and provides precise comparisons. First, it includes the method of monotone splines with bootstrapping, rendering it fully nonparametric. Also included in the comparisons are the kernel-smoothed version (SNAM) of the NAM, and the AAGM (for m = O(N)).
Second, comparisons are made precise by looking at true confidence intervals approximated for each of four important models (logistic, probit, beta, Weibull), for each of eleven dosage levels, and for three different values of n, using 4 × 11 × 3 sets of 1000 simulations each (Tables 1–24 and Figures 1–11). Third, since in dealing with real data only one sample is available, we carry out in Section 3.2 a study of the stability of the bootstrap based on 1000 simulations from each of eleven dosage levels from the logistic model. For each of these 11 sets of 1000 simulations a bootstrap resample of size 1000 is used to compute the average bias and variance of the lower and upper confidence bounds of the NAM and the DNP (Tables 28–35). Fourth, the comparison between the NAM and its smoother version SNAM is provided in Tables 25–27 and Figures 12–14, for ten dosage levels for the logistic model, for n = 5, 10, 25 and m = 10, based on 10 × 3 new sets of 1000 simulations each. The SNAM seems to reduce the bias for small p, as expected under theoretical considerations (See Bhattacharya and Lin (2011), Remark 3). Finally, to deal with large m, MISEs of the DNP and the newly developed adaptive method AAGM are compared by extensive simulations from the four models mentioned above for the case m = 20, and for n = 1, 5, 10, 25, 50. Bootstrapping is used in the two data examples in Section 4 to compare the MISEs of the relevant methods, namely, DNP and AAGM in Example I, and NAM, DNP, monotone Spline and the MLE (under the logistic model assumption for the latter) in Example II. Curiously, the AAGM holds its own against the DNP in Example I, which seems to be in contrast with the findings based on simulations from the four models in Section 3.3 for the case m = N.
Table 1.
[Logistic(m=5,n=5)] Lengths of Confidence Intervals for DNP, NAM(r=2) and Spline and Their Differences
| NAM: | 0.2562 | 0.2925 | 0.2526 | 0.2481 | 0.2534 | 0.2530 | 0.2465 | 0.2428 | 0.2351 | 0.2483 | 0.3016 |
| DNP: | 0.1857 | 0.2621 | 0.2524 | 0.2488 | 0.2331 | 0.2396 | 0.2289 | 0.2277 | 0.2291 | 0.2397 | 0.2830 |
| Spline: | 0.2972 | 0.3530 | 0.3748 | 0.3812 | 0.3400 | 0.2801 | 0.2759 | 0.2649 | 0.3067 | 0.3344 | 0.7350 |
| D1: | −0.0705 | −0.0304 | −0.0002 | 0.0007 | −0.0203 | −0.0134 | −0.0176 | −0.0151 | −0.0060 | −0.0086 | −0.0186 |
| D2: | 0.1115 | 0.0909 | 0.1224 | 0.1324 | 0.1069 | 0.0405 | 0.0469 | 0.0372 | 0.0776 | 0.0947 | 0.4520 |
Table 24.
[Weibull (m=10,n=25)] Lengths of Confidence Intervals for DNP, NAM(r=3) and Spline and Their Differences
| NAM: | 0.2672 | 0.3900 | 0.4080 | 0.4489 | 0.4933 | 0.5273 | 0.5798 | 0.6388 | 0.7249 | 0.8204 | 0.9159 |
| DNP: | 0.3644 | 0.4817 | 0.5750 | 0.6135 | 0.6636 | 0.6875 | 0.7515 | 0.8263 | 0.8802 | 0.9672 | 1.1384 |
| Spline: | 0.4724 | 0.6704 | 0.8248 | 0.8549 | 0.7569 | 0.7691 | 0.8664 | 0.9039 | 1.0354 | 1.0535 | 1.2535 |
| D1: | 0.0973 | 0.0917 | 0.1670 | 0.1646 | 0.1703 | 0.1602 | 0.1716 | 0.1874 | 0.1552 | 0.1468 | 0.2225 |
| D2: | 0.1080 | 0.1887 | 0.2499 | 0.2415 | 0.0932 | 0.0815 | 0.1150 | 0.0776 | 0.1552 | 0.0862 | 0.1150 |
Figure 1. [Probit].
[Probit Data] 95% CI for NAM, DNP and MLE (m=5,n=5).
Figure 11. [Weibull].
[Weibull Data] 95% CI for NAM and SP (m=10,n=25).
Table 28.
[Logistic(m=10,n=5)] The Variance of confidence limits for DNP and NAM
| Var(NAML): | 0.0038 | 0.0043 | 0.0046 | 0.0044 | 0.0033 | 0.0024 | 0.0021 | 0.0020 | 0.0020 | 0.0020 | 0.0020 |
| Var(DNPL): | 0.0022 | 0.0042 | 0.0046 | 0.0036 | 0.0028 | 0.0022 | 0.0021 | 0.0020 | 0.0020 | 0.0020 | 0.0021 |
| Var(NAMU): | 0.0023 | 0.0021 | 0.0020 | 0.0020 | 0.0020 | 0.0022 | 0.0025 | 0.0037 | 0.0038 | 0.0048 | 0.0045 |
| Var(DNPU): | 0.0018 | 0.0020 | 0.0020 | 0.0020 | 0.0020 | 0.0020 | 0.0022 | 0.0026 | 0.0035 | 0.0044 | 0.0047 |
Table 35.
[Logistic(m=10,n=50)] The Bias of confidence limits for DNP and NAM
| Bias(NAML): | 0.0251 | −0.0003 | 0.0018 | −0.0003 | −0.0002 | −0.0005 | −0.0013 | 0.0003 | −0.0019 | 0.0004 | −0.0035 |
| Bias(DNPL): | 0.0336 | 0.0063 | 0.0037 | 0.0044 | 0.0025 | 0.0011 | 0.0038 | 0.0048 | −0.0056 | −0.0004 | 0.0005 |
| Bias(NAMU): | 0.0039 | 0.0013 | −0.0030 | −0.0002 | −0.0008 | −0.0003 | −0.0002 | −0.0002 | 0.0003 | 0.0018 | 0.0019 |
| Bias(DNPU): | −0.0064 | 0.0004 | −0.0107 | −0.0035 | 0.0017 | 0.0031 | 0.0024 | −0.0070 | −0.0009 | −0.0015 | −0.0040 |
Table 12.
[Probit (m=10,n=25)] Lengths of Confidence Intervals for DNP, NAM(r=3) and Spline and Their Differences
| NAM: | 0.0641 | 0.0569 | 0.0556 | 0.0531 | 0.0522 | 0.0509 | 0.0498 | 0.0504 | 0.0513 | 0.0544 | 0.0587 |
| DNP: | 0.0835 | 0.0787 | 0.0667 | 0.0643 | 0.0655 | 0.0637 | 0.0657 | 0.0634 | 0.0655 | 0.0675 | 0.0724 |
| Spline: | 0.1029 | 0.0900 | 0.0767 | 0.0734 | 0.0735 | 0.0730 | 0.0743 | 0.0725 | 0.0772 | 0.0741 | 0.0850 |
| D1: | 0.0194 | 0.0218 | 0.0111 | 0.0112 | 0.0134 | 0.0128 | 0.0159 | 0.0130 | 0.0142 | 0.0131 | 0.0137 |
| D2: | 0.0194 | 0.0114 | 0.0100 | 0.0091 | 0.0080 | 0.0093 | 0.0086 | 0.0090 | 0.0116 | 0.0066 | 0.0126 |
Table 14.
[Beta (m=5,n=10)] Lengths of Confidence Intervals for DNP, NAM(r=2) and Spline and Their Differences
| NAM: | 0.1123 | 0.1711 | 0.1883 | 0.1838 | 0.1848 | 0.2028 | 0.2022 | 0.1991 | 0.2246 | 0.2155 | 0.2132 |
| DNP: | 0.1238 | 0.1763 | 0.1854 | 0.2015 | 0.2019 | 0.2102 | 0.2163 | 0.2218 | 0.2233 | 0.2380 | 0.2506 |
| Spline: | 0.1238 | 0.1763 | 0.1854 | 0.2015 | 0.2019 | 0.2102 | 0.2163 | 0.2218 | 0.2233 | 0.2380 | 0.2506 |
| D1: | 0.0116 | 0.0052 | −0.0029 | 0.0178 | 0.0171 | 0.0074 | 0.0141 | 0.0227 | −0.0013 | 0.0225 | 0.0374 |
| D2: | 0.0545 | 0.0656 | 0.0853 | 0.0758 | 0.0744 | 0.0580 | 0.0375 | 0.0291 | 0.0336 | 0.0637 | 0.0720 |
One may extend the results to the case of a stochastic variable x under appropriate assumptions; but for simplicity we assume x to be nonstochastic.
2. Descriptions of different methods
The recent methods we compare are (1) the NAM, due to Bhattacharya and Lin (2010), which has its origin in isotonic regression and the pool adjacent violators algorithm, or PAV; this is also true of the AAGM, which is applicable for large m. (2) The DNP, due to Dette et al. (2005) and Dette and Scheder (2010), is an inverse kernel method. (3) Apart from the use of bootstrapping, which ensures a fully nonparametric inference, the monotone spline methodology used here is due to Kong and Eubank (2006).
We give detailed statements of the NAM and AAGM in the first subsection because they have not appeared in the literature in this generality earlier.
2.1. A Nonparametric Adaptive Method NAM and An Adaptive Average Grouping Method AAGM
It was shown by Ayer et al. (1955) that, for a given set of weights wi (i = 1, ⋯ , m), in the isotonic regression problem (1.1) the minimizer of
∑1≤i≤m wi {F̂(zi) − F(zi)}²    (2.1)
over the class of all monotone nondecreasing F is given by
F̃(zi) = max1≤s≤i mini≤t≤m (∑s≤u≤t wu F̂(zu)) / (∑s≤u≤t wu).    (2.2)
The estimate of the whole curve F is obtained by linear interpolation between F̃(zi) and F̃(zi+1), i = 1, ⋯ , m − 1. This also allows one to obtain an estimate F̃−1 of the inverse curve F−1. Consider also the usual estimate F̂(zi) of F(zi) as the mean of those observations yj for a given x value zi. That is,
F̂(zi) = Si/ni (i = 1, ⋯ , m),    (2.3)
where Si is the sum of those yj with xj = zi. We make the following assumption throughout.
Assumption 2.1. mini ni / maxi ni is bounded away from zero, and m(zi+1 − zi) (i = 1, ⋯ , m−1) are bounded away from zero and infinity.
In addition, the following assumption is used in this subsection.
Assumption 2.2. There is a neighborhood of zero on which the moment generating function of εj is uniformly bounded for all j.
For the quantal bioassay problem described in the Introduction, write n = max ni. It was shown in Bhattacharya and Kong (2007) that if m/n1/4 → ∞ but m/(n log n)1/2 is bounded then, with weights wi = ni, F̃−1(p) is asymptotically Normal N(F−1(p), σN²). It turns out that σN² attains essentially the optimal rate O(N−4/5(log log N)6/5) only for designs in which m is of order slightly larger than n1/4. Since such a restriction on the design may not be met in practice, a new procedure was developed in Bhattacharya and Lin (2010) for the quantal bioassay problem, which we now extend to the more general case of monotone regression (1.1).
For values of m such that m/n1/4 → ∞, divide the set of m levels (z1, ⋯ , zm) into r adjacent nearly disjoint subgroups of approximately equal size s(n) each, where r and s(n) will be specified later. They satisfy the approximate equality
r·s(n) ≈ m.    (2.4)
For example, Group 1 comprises the z values (z1, zr+1, z2r+1, ⋯ , z(s(n)−1)r+1, zm), Group 2 is (z1, z2, zr+2, z2r+2, ⋯ , z(s(n)−1)r+2, zm), ⋯ , Group r is (z1, zr, z2r, ⋯ , zs(n)r, zm). Now construct the linearly interpolated PAV estimate F̃t of F as above, but using only the yj’s belonging to the t-th Group of z levels (t = 1, ⋯ , r). Then define the NAM estimates of F and F−1 as F̃ = (1/r) ∑1≤t≤r F̃t and ζ = (1/r) ∑1≤t≤r (F̃t)−1, respectively.
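For concreteness, here is a minimal numpy sketch (our own illustration, not the authors' code; the names pav, nam_curve and the evaluation grid are our choices) of the weighted PAV fit (2.2) and the NAM averaging over r interleaved subgroups; the inverse estimate ζ is obtained analogously by interpolating each fitted curve with the roles of the axes exchanged.

```python
import numpy as np

def pav(ybar, w):
    """Weighted pool adjacent violators fit of the means ybar (Ayer et al. (1955)).
    Adjacent blocks are pooled (weighted-averaged) while they violate monotonicity."""
    v, wt, n = [], [], []                       # block means, weights, sizes
    for yi, wi in zip(ybar, w):
        v.append(float(yi)); wt.append(float(wi)); n.append(1)
        while len(v) > 1 and v[-2] > v[-1]:     # pool the violators
            y2, w2, n2 = v.pop(), wt.pop(), n.pop()
            y1, w1, n1 = v.pop(), wt.pop(), n.pop()
            v.append((w1 * y1 + w2 * y2) / (w1 + w2))
            wt.append(w1 + w2); n.append(n1 + n2)
    return np.repeat(v, n)                      # isotonic fit at each site

def nam_curve(z, ybar, w, r, grid):
    """NAM estimate of F on `grid`: average of r interleaved PAV fits,
    each group augmented by the endpoints z1 and zm."""
    m = len(z)
    fits = []
    for t in range(r):
        idx = sorted({0, m - 1} | set(range(t, m, r)))
        fits.append(np.interp(grid, z[idx], pav(ybar[idx], w[idx])))
    return np.mean(fits, axis=0)
```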
For the statement of the theorems below, the sign ≈ indicates that the ratio of its two sides is bounded away from zero and infinity.
Theorem 2.3. Let F be a function on [a, b], F′ > 0, and F″ continuous. Let Assumptions 2.1 and 2.2 hold. (a) If m/n1/4 ≈ 1 then, with r = 1, the MISE of F̃ has the optimal rate O(N−4/5). (b) If m/n1/4 → ∞ and m = o(n3/2/(log n)5/2) then, with r ≈ (m4/n)1/5, the MISE of F̃ is O(N−4/5).
Theorem 2.4. Make the same assumptions as in Theorem 2.3 (a) If m/n1/4 ≈ 1 then, with r = 1, the MISE of ζ has the optimal rate O(N−4/5). (b) If m/n1/4 → ∞, but m/n2/3 is bounded away from infinity then, with r ≈ (m4/n)1/5, the MISE of ζ is O(N−4/5).
Theorem 2.5. Make the same assumptions as in Theorem 2.3. If m/n1/4 → ∞ but m = o(n2/3/ log log n) then, with r ≈ (m4/n)1/5/(log log n)6/5, ζ is asymptotically Normal N(F−1(p), σN²), where σN² is of the order O(N−4/5(log log N)6/5).
The proofs of these theorems are omitted, as they are analogous to those for the special case of Bernoulli distributed binary observations yj, in which case F(x) is the probability of response (i.e., y = 1) at a given dose level x (See Bhattacharya and Lin (2010) and Lin (2012)).
For m of larger order than considered in Theorems 2.3–2.5, a different method, namely, the AAGM (the adaptive average grouping method) is employed. The original idea for this is due to Wright (1982). By extending it, making it data adaptive, and using the bootstrap, the results are made applicable in Lin (2012). Here one considers a = x1 ≤ x2 ≤ ⋯ ≤ xN = b in the general model (1.1), with m distinct values of x, namely, a = z1 < ⋯ < zm = b, assumed to be equidistant for simplicity: zi+1 − zi = (b − a)/m. Assume ni = n for all i = 1, ⋯ , m. Let k be a positive integer, k < m. We divide the m dosages into approximately m/k groups of k adjacent dosages (z1, z2, ⋯ , zk), (zk+1, ⋯ , z2k), ⋯ , etc. If k is odd, then the group centers are z̄1 = z(k+1)/2, z̄2 = zk+(k+1)/2, ⋯ , z̄j = z(j−1)k+(k+1)/2, ⋯ , etc. If k is even, then z̄j = (z(j−1)k+k/2 + z(j−1)k+k/2+1)/2, ⋯ .
Our new estimate of F at the design points z̄j (j = 1, ⋯ , m/k, approximately) is F̃(z̄j), obtained by applying the PAV algorithm to the pairs (z̄j, ȳj), 1 ≤ j ≤ m/k, where ȳj is the average of the observations y in the j-th group.
We now define F̃ on [a, b] by letting F̃(z̄j) be as above and linearly interpolating in (z̄j, z̄j+1) for all j.
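A minimal sketch of this construction, under the stated simplifications (equidistant z, a group size k dividing m; the function name aagm_curve is ours), might look as follows; in practice k may be chosen by comparing estimated MISEs (see Remark 2.2 below).

```python
import numpy as np

def aagm_curve(z, ybar, k, grid):
    """AAGM sketch: average k adjacent dosages and their responses, apply the
    (unit-weight) PAV algorithm to the group means at the centers z̄_j, then
    interpolate linearly. For equidistant z, the mean of a group equals z̄_j
    both for odd and even k."""
    g = len(z) // k
    zbar = z[:g * k].reshape(g, k).mean(axis=1)   # group centers z̄_j
    gbar = ybar[:g * k].reshape(g, k).mean(axis=1)
    v, n = [], []                                 # simple PAVA on the g means
    for y in gbar:
        v.append(float(y)); n.append(1)
        while len(v) > 1 and v[-2] > v[-1]:
            y2, n2 = v.pop(), n.pop()
            y1, n1 = v.pop(), n.pop()
            v.append((n1 * y1 + n2 * y2) / (n1 + n2)); n.append(n1 + n2)
    return np.interp(grid, zbar, np.repeat(v, n))
```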
Theorem 2.6. If F′ > 0 and F″ is continuous on [a, b] and Assumptions 2.1 and 2.2 hold, then with
k ≈ (m4/n)1/5,    (2.5)
the estimates F̃ and F̃−1 of F and F−1, respectively, attain asymptotically minimal error rates for m satisfying
m ≥ αN3/5 log N,    (2.6)
for an appropriate constant α.
Proof. We refer to Wright (1982) and Lin (2012) for a proof.
Remark 2.1. The assumption m ≥ αN3/5 log N in Theorem 2.6 complements the assumption m = o(N3/5) in Theorem 2.5.
Remark 2.2. In applications with real data the group size k may be decided on the basis of estimates of the MISE for a range of reasonable values of k.
2.2. Smoothed Nonparametric Adaptive method (SNAM)
The smoothed nonparametric adaptive method (SNAM) is a smoothed version of the NAM. Like the NAM, SNAM divides the distinct z values into r groups in the same way. For example, Group 1 comprises the z values (z1, zr+1, z2r+1, ⋯ , z(s(n)−1)r+1, zm), Group 2 is (z1, z2, zr+2, z2r+2, ⋯ , z(s(n)−1)r+2, zm), ⋯ , Group r is (z1, zr, z2r, ⋯ , zs(n)r, zm). The SNAM estimate of the monotone regression function F(x) is given by

F̃S(x) = (1/r) ∑1≤t≤r F̃t(x),

where F̃t is the smoothed B-K estimate for the t-th group, obtained by kernel smoothing the B-K (PAV) curve. Note that such estimates are monotone so long as one picks a log-concave kernel function K(x) (See Mukerjee (1988)). The estimates F̃t (t = 1, ⋯ , r) are asymptotically independent, since the only common points among the r Groups are z1 and zm (See Bhattacharya and Lin (2010)).
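For illustration, the kernel smoothing step might be sketched as follows (our own minimal version: f_tilde is a B-K/PAV curve evaluated on a fine grid, the bandwidth h is assumed given, and a Gaussian kernel, which is log-concave, is used in the spirit of Mukerjee (1988)).

```python
import numpy as np

def kernel_smooth_monotone(grid, f_tilde, h):
    """Nadaraya-Watson smoothing of a monotone curve evaluated on `grid`,
    with Gaussian (log-concave) kernel weights."""
    out = np.empty_like(f_tilde, dtype=float)
    for i, x in enumerate(grid):
        w = np.exp(-0.5 * ((x - grid) / h) ** 2)   # Gaussian kernel weights
        out[i] = np.dot(w, f_tilde) / w.sum()
    return out
```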
2.3. Kernel Based DNP Method
An important kernel-based method for estimating the effective dosage curve is the DNP method (following the terminology in Dette and Scheder (2010)). Let the response to xi be yi (0 or 1). The local linear estimator is obtained by first solving the following minimization problem: for a small h > 0, find the minimizer (β̂1(x), β̂2(x)) of
∑1≤j≤N {yj − β1 − β2(x − xj)}² K((x − xj)/h),    (2.7)
where K(x) is a symmetric density on the real line ℝ with a finite second moment, and h is the bandwidth. The estimator β̂1(x) of β1 is the estimator of F(x). The p-th quantile EDp = F−1(p) is then estimated as
ÊDp = a + ∫[a,b] (1/hd) [∫(−∞,p] Kd((F̂(x) − u)/hd) du] dx,    (2.8)
where hd is small. Here Kd is a symmetric kernel with the same properties as K (e.g., Kd = K). But h and hd are not of the same order, as shown below. Note that as hd ↓ 0, ÊDp converges to EDp. To understand this, observe that (1/hd) Kd((F̂(x) − u)/hd) du converges to the Dirac measure δF̂(x)(du) as hd ↓ 0, so that the inner integral converges to the indicator function 1{F̂(x) ≤ p}. The outer integral of this limit is the Lebesgue measure of the set {x ∈ [a, b] : F̂(x) ≤ p}, which equals the length of this interval. This method of monotonization of a function is called monotone or measure-preserving rearrangement in Hardy et al. (1952).
With optimal choices of the bandwidths, the estimate of F−1 attains asymptotically optimal error rates.
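The two steps can be sketched as follows (a minimal illustration, not the authors' implementation: Gaussian kernels for K and Kd are our choices, so the inner integral in (2.8) becomes the Gaussian CDF; the outer integral is approximated by the trapezoidal rule, and the bandwidths h and hd are assumed given).

```python
import numpy as np
from math import erf, sqrt

def local_linear(x0, x, y, h):
    """Local linear estimate F̂(x0) = β̂1, the minimizer in (2.7)."""
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)          # kernel weights K(.)
    X = np.column_stack([np.ones_like(x), x - x0])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta[0]

def dnp_edp(p, x, y, h, hd, a, b, ngrid=400):
    """DNP estimate of ED_p via the rearrangement integral (2.8)."""
    grid = np.linspace(a, b, ngrid)
    Fhat = np.array([local_linear(t, x, y, h) for t in grid])
    # Inner integral for Gaussian Kd: Phi((p - F̂(x)) / hd)
    inner = np.array([0.5 * (1 + erf(v / sqrt(2))) for v in (p - Fhat) / hd])
    return a + np.trapz(inner, grid)
```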
2.4. Monotone B-Spline Smoothing
Given the regression model (1.1), the general smoothing spline problem is to find the estimate F̂h of the function F that minimizes the objective function (over the class of twice differentiable functions)
∑1≤j≤N wj {yj − F(xj)}² + h ∫[a,b] {F″(t)}² dt,    (2.9)
where wj are positive weights and h is a smoothing parameter which controls the trade-off between the smoothness of the curve and fidelity to the data.
The existence and characterizations of the solution to (2.9) without any shape constraint on the regression function F are derived in Wahba (1990) (Also see Eubank (1999)). When F is assumed to be monotone, the approach of Kong and Eubank (2006) and Kelly and Rice (1990) uses the so-called B-spline basis Bj,4 to represent a monotone F as a linear combination of the basis functions Bj,4, with coefficients βj increasing in j. The existence of such an F as a solution minimizing (2.9) is easily shown.
For constructing confidence intervals using monotone spline estimates in quantal bioassay, Kong and Eubank (2006) proposed a form of parametric Bayesian inference. Our article constructs confidence intervals using the nonparametric bootstrap, which may be shown to be valid and which makes the procedure fully nonparametric. The optimal estimate of the smoothing parameter h is given by the GCV (generalized cross-validation) algorithm (See Eubank (1999)).
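To make the construction explicit, here is a simplified regression-spline sketch (our own simplification: the roughness penalty and GCV are omitted, a fixed knot vector is used, and monotonicity of the βj is imposed by reparametrizing them as cumulative sums of nonnegative increments, which reduces the constrained least squares to an NNLS problem).

```python
import numpy as np
from scipy.interpolate import BSpline
from scipy.optimize import nnls

def monotone_spline_fit(x, y, n_knots=6):
    """Cubic B-spline basis B_{j,4} with nondecreasing coefficients β_j.
    β nondecreasing  <=>  β = Cγ with γ_i >= 0 for i >= 1, C lower-triangular
    of ones; the free intercept γ_0 is split into a positive and negative part."""
    a, b = x.min(), x.max()
    t = np.r_[[a] * 4, np.linspace(a, b, n_knots)[1:-1], [b] * 4]  # knot vector
    nb = len(t) - 4                                # number of basis functions
    B = np.column_stack([BSpline(t, np.eye(nb)[j], 3)(x) for j in range(nb)])
    C = np.tril(np.ones((nb, nb)))
    A = B @ C
    A0 = np.column_stack([A[:, :1], -A[:, :1], A[:, 1:]])
    g, _ = nnls(A0, y)                             # nonnegative least squares
    gamma = np.r_[g[0] - g[1], g[2:]]
    beta = C @ gamma                               # nondecreasing coefficients
    return t, beta                                 # evaluate via BSpline(t, beta, 3)
```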
3. A summary of computation results
3.1. Comparison in Terms of True Confidence Intervals of Effective Dosages ζ
3.1.1. Comparisons among NAM, DNP, Spline and MLE
In this subsection, 95% ‘true’ confidence intervals are constructed for the NAM, DNP and Spline estimates and the MLE of the effective dosages ζp, using 1000 samples of data simulated from some important parametric models. For each sample, the NAM, DNP, Spline and MLE estimates of the effective dosage curve are obtained for 11 equidistant response levels in [0.05, 0.85]. For each response level p, the lower confidence limit is given by the 2.5 percent quantile of the 1000 estimates and the upper confidence limit by the 97.5 percent quantile.
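In code, this construction is just an empirical-quantile computation over repeated simulations (a sketch; simulate_sample and estimate_edp are hypothetical placeholders for a draw from one of the models below and for one of the four estimators, respectively).

```python
import numpy as np

def true_ci(simulate_sample, estimate_edp, p, n_sim=1000):
    # Re-estimate ED_p on n_sim independent simulated samples and take the
    # empirical 2.5% and 97.5% quantiles as the 'true' confidence limits.
    est = np.array([estimate_edp(*simulate_sample(), p) for _ in range(n_sim)])
    return np.percentile(est, 2.5), np.percentile(est, 97.5)
```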
The data are simulated from the following four parametric models for the case m = 5, n = 5, 10, 25 and m = 10, n = 5, 10, 25. Here m stands for the number of dosages and n is the number of observations at each dosage. The parametric models are:
Logistic model F(x) = (1 + e−(α+βx))−1 with α = −20, β = 10.
Probit model F(x) = Φ((x − μ)/σ) with μ = 0.5 and σ = 0.3.
Beta model, F(x) the Beta distribution function with α = 2 and β = 3.
Weibull model F(x) = 1 − exp(−(x/α)β) with α = 2 and β = 1.5.
The comparisons are mainly carried out in terms of the length of the confidence interval for each method. The first three rows of the following tables record the lengths of the confidence intervals for the nonparametric estimates by NAM, DNP and Spline; the fourth row records the difference D1 = (length of the CI for DNP) − (length of the CI for NAM); and the last row records the difference D2 = (length of the CI for Spline) − (length of the CI for DNP).
As one can see from the results given in the following tables, for the most part the NAM method yields the narrowest confidence intervals when n = 10, 25, for both m = 5 and m = 10 and for all four models. When n = 5, the DNP method works better in most cases. The NAM and DNP methods in general perform better than the monotone Spline method.
At the end of this subsection, some plots of the true confidence intervals for the different estimates are given; the lengths of the confidence intervals are recorded in the preceding tables. For the case m = 5, only the plots from the Probit model are given, while for the case m = 10 only the graphs from the Weibull model are provided, due to limited space. The graphs exhibit similar patterns for the other models. The blue line in the middle is the true effective dosage curve, the red line with circles represents the confidence interval for the NAM estimate, the green line represents the confidence intervals for the MLE, and the black line represents the confidence interval for the DNP estimate. In the graphs where the NAM confidence limits and the Spline (SP) confidence limits are compared, the blue lines represent the confidence intervals for the Spline estimates.
3.1.2. Comparisons Between NAM and SNAM
Here we compare the true confidence intervals of ζp obtained by the NAM and the SNAM. We only include the results for the Logistic model; the results are similar for the other models. The blue lines are the confidence limits for the SNAM estimates while the red lines with circles are the confidence limits for the NAM estimates. The SNAM yields slightly narrower confidence intervals except for the lower values of the response p. For a relatively large number of subgroups in the adaptive estimate of ζ, namely r = 3, the SNAM seems to provide a bias correction, especially for small p, but with a slightly larger variance (Also see Remark 3 in Bhattacharya and Lin (2011)).
3.2. Bootstrap Stability
In this subsection, we record the variance and bias of the bootstrap confidence limits with data simulated from the Logistic model. First, we simulate 1000 samples of data from the true Logistic model. For each simulated sample, we calculate the bootstrap confidence limits for the NAM and DNP methods for 11 equidistant response levels p in [0.05, 0.85]. Therefore, for each p, 1000 estimates of both the lower and the upper confidence limits are obtained. We then calculate the variances of the lower and upper limits of the 95% confidence intervals. The biases relative to the true confidence limits recorded in Section 3.1 are also obtained.
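A sketch of the resampling step used for each simulated sample (Efron (1979); estimate_edp is again a hypothetical placeholder, and the responses are resampled with replacement within each dosage level):

```python
import numpy as np
rng = np.random.default_rng(0)

def bootstrap_ci(z, y_by_dose, estimate_edp, p, n_boot=1000):
    """Percentile bootstrap 95% CI for ED_p: resample the n_i responses at
    each dosage level z_i with replacement and re-estimate."""
    est = []
    for _ in range(n_boot):
        y_star = [rng.choice(yi, size=len(yi), replace=True) for yi in y_by_dose]
        est.append(estimate_edp(z, y_star, p))
    return np.percentile(est, 2.5), np.percentile(est, 97.5)
```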
We have the following notation for the tables below.
Var(NAML): the variance of the NAM estimates of the lower confidence limit.
Var(NAMU): the variance of the NAM estimates of the upper confidence limit.
Var(DNPL): the variance of the DNP estimates of the lower confidence limit.
Var(DNPU): the variance of the DNP estimates of the upper confidence limit.
Bias(NAML): the bias of the NAM estimates of the lower confidence limit.
Bias(NAMU): the bias of the NAM estimates of the upper confidence limit.
Bias(DNPL): the bias of the DNP estimates of the lower confidence limit.
Bias(DNPU): the bias of the DNP estimates of the upper confidence limit.
As one can see from the results, the bias and variance are fairly small except in a few cases where p is very small.
3.3. True MISE for DNP and AAGM (the Adaptive Average Grouping Method)
In this subsection, we compare the DNP method and the AAGM in terms of true MISE. These two methods are suited to designs with relatively large m. Therefore, the number of dosage levels is taken to be fairly large (m = 20) in this comparison, and n is taken to be 1, 5, 10, 25 and 50, with data simulated from the four parametric models described in Section 3.1. For the adaptive grouping, the group size k is taken to be 2 or 3. The results show that when n = 1, 5, 10 the DNP method yields smaller MISE, while when n = 25 and n = 50 the MISEs of the AAGM estimates are smaller.
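The "true" MISE here is again a Monte Carlo average (a sketch; simulate_sample and fit_curve are hypothetical placeholders for the model draw and for the DNP or AAGM fit on a grid).

```python
import numpy as np

def true_mise(simulate_sample, fit_curve, F_true, a, b, n_sim=1000, ngrid=200):
    grid = np.linspace(a, b, ngrid)
    truth = F_true(grid)
    ise = [np.trapz((fit_curve(*simulate_sample(), grid) - truth) ** 2, grid)
           for _ in range(n_sim)]
    return np.mean(ise)   # Monte Carlo approximation of the MISE
```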
4. Analysis of Data Examples
4.1. Data Example: The space shuttle problem
We apply the DNP method and the AAGM (k = 2, 3, 4) to estimate the probability of O-ring failure at a given temperature for the space shuttle disaster problem (See Rogers Commission Report (1986)). We first describe the statistical problem.
It was determined that in the disaster of the space shuttle Challenger, the explosion of the shuttle was the result of O-ring failure, a splitting of a ring of rubber that seals different parts of the external rocket motors together. The accident was believed to have been caused by the unusually cold weather (31°F at the time of the launch). O-ring failure data along with launch temperatures were collected for 23 prior flights. The object of the study is to estimate the probability of O-ring failure at a given temperature and to carry out inference in terms of confidence intervals. Specifically, given the response probability 0.9 of “no O-ring failure”, or probability 0.1 of “O-ring failure”, we aim to estimate the corresponding temperature using the DNP method. We denote this target temperature as T0.1. We also wish to calculate the 95% bootstrap confidence interval for T0.1. The results are compared for the DNP method and the AAGM.
The estimates of T0.1, together with both the one-sided and two-sided 95% bootstrap confidence intervals given a probability of ‘no O-ring failure’ of 0.9, are recorded in the following table for the DNP method and the AAGM. We also calculated the mean squared error of the estimate of T0.1; the case k = 4 of the AAGM yields the smallest MSE, 1.5555, which is a little smaller than that of the DNP method, 1.7076. This appears to be at variance with the finding in Subsection 3.3. A possible explanation is that even with large m the AAGM performs very well if the x values are clustered.
| T0.1 | two-sided CI | one-sided CI | MSE | |
|---|---|---|---|---|
| DNP | 76.4760 | [66.8798, 81] | [69.3642, | 1.7076 |
| AAGM (k = 2) | 75.5333 | [69.1000, 81] | [70.0000, | 6.7941 |
| AAGM (k = 3) | 75.7333 | [70.1667, 81] | [73.0133, | 3.5065 |
| AAGM (k = 4) | 77.2889 | [69.6500, 81] | [72.9500, | 1.5555* |
4.2. Data Example II
In this data example, the data set has m = 15 dosage levels of the agent potassium bromate (KBrO3), with x = [0, 0.00625, 0.0125, 0.025, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1]. The data set is from a micronucleus assay in which the DNA damage of cells from KBrO3 is recorded (See Platel et al. (2009)). A certain type of cell is exposed to the 15 different concentrations of KBrO3 and the numbers of cells with DNA damage are recorded. The proportions of responses are given by p̂i = Si/ni, in the notation of (2.3).
We obtain the estimates of EDp using NAM, DNP, Spline and MLE (Logistic model). The estimated MISE for each method is calculated using the bootstrap. The results are recorded in the following table. The MLE from the Logistic model yields the smallest MISE; however, the estimates are not consistent, as one can see from the plots of the confidence intervals from each method (blue lines). The NAM (r=3) yields the smallest MISE among the nonparametric methods.
Figure 2. [Probit].
[Probit Data] 95% CI for NAM, DNP and MLE (m=5,n=10).
Figure 3. [Probit].
[Probit Data] 95% CI for NAM and SP (m=5,n=10).
Figure 4. [Probit].
[Probit Data] 95% CI for NAM, DNP and MLE (m=5,n=25).
Figure 5. [Probit].
[Probit Data] 95% CI for NAM and SP (m=5,n=25).
Figure 6. [Weibull].
[Weibull Data] 95% CI for NAM, DNP, SP and MLE (m=10,n=5).
Figure 7. [Weibull].
[Weibull Data] 95% CI for NAM and SP (m=10,n=5).
Figure 8. [Weibull].
[Weibull Data] 95% CI for NAM, DNP, SP and MLE (m=10,n=10).
Figure 9. [Weibull].
[Weibull Data] 95% CI for NAM and SP (m=10,n=10).
Figure 10. [Weibull].
[Weibull Data] 95% CI for NAM, DNP, SP and MLE (m=10,n=25).
Figure 12. [Logit].
[Logistic Data] 95% CI for m=10, n=5 for Smoothing curve and NAM curve (r=3).
Figure 13. [Logit].
[Logistic Data] 95% CI for m=10, n=10 for Smoothing curve and NAM curve (r=3).
Figure 14. [Logit].
[Logistic Data] 95% CI for m=10, n=25 for Smoothing curve and NAM curve (r=3).
Table 2.
[Logistic(m=5,n=10)] Lengths of Confidence Intervals for DNP, NAM(r=2) and Spline and Their Differences
| NAM: | 0.1686 | 0.2293 | 0.1859 | 0.1769 | 0.1838 | 0.1784 | 0.1764 | 0.1759 | 0.1746 | 0.1807 | 0.1022 |
| DNP: | 0.1838 | 0.2114 | 0.1917 | 0.1850 | 0.1815 | 0.1852 | 0.1794 | 0.1755 | 0.1827 | 0.1954 | 0.2101 |
| Spline: | 0.2719 | 0.2852 | 0.2715 | 0.2031 | 0.2076 | 0.2182 | 0.2163 | 0.2050 | 0.2045 | 0.2427 | 0.2867 |
| D1: | 0.0152 | −0.0179 | 0.0059 | 0.0081 | −0.0023 | 0.0069 | 0.0029 | −0.0004 | 0.0082 | 0.0147 | 0.0078 |
| D2: | 0.0881 | 0.0738 | 0.0798 | 0.0181 | 0.0261 | 0.0330 | 0.0370 | 0.0295 | 0.0218 | 0.0473 | 0.0766 |
Table 3.
[Logistic(m=5,n=25)] Lengths of Confidence Intervals for DNP, NAM(r=2) and Spline and Their Differences
| NAM: | 0.1217 | 0.1347 | 0.1282 | 0.1174 | 0.1138 | 0.1123 | 0.1116 | 0.1143 | 0.1164 | 0.1282 | 0.1421 |
| DNP: | 0.1757 | 0.1664 | 0.1521 | 0.1314 | 0.1231 | 0.1290 | 0.1262 | 0.1248 | 0.1290 | 0.1449 | 0.1692 |
| Spline: | 0.2216 | 0.2368 | 0.1578 | 0.1285 | 0.1318 | 0.1352 | 0.1337 | 0.1299 | 0.1307 | 0.1495 | 0.2049 |
| D1: | 0.0540 | 0.0317 | 0.0239 | 0.0140 | 0.0093 | 0.0167 | 0.0146 | 0.0104 | 0.0126 | 0.0166 | 0.0272 |
| D2: | 0.0459 | 0.0704 | 0.0057 | −0.0029 | 0.0086 | 0.0061 | 0.0076 | 0.0051 | 0.0016 | 0.0046 | 0.0356 |
Table 4.
[Logistic(m=10,n=5)] Lengths of Confidence Intervals for DNP, NAM(r=3) and Spline and Their Differences
| NAM: | 0.2504 | 0.2219 | 0.2032 | 0.1918 | 0.1904 | 0.1913 | 0.1871 | 0.1849 | 0.1944 | 0.2076 | 0.2286 |
| DNP: | 0.1874 | 0.2398 | 0.1855 | 0.1726 | 0.1701 | 0.1700 | 0.1675 | 0.1657 | 0.1730 | 0.1830 | 0.2198 |
| Spline: | 0.3164 | 0.3396 | 0.2493 | 0.2339 | 0.2353 | 0.2143 | 0.2265 | 0.2285 | 0.2353 | 0.2449 | 0.6741 |
| D1: | −0.0630 | 0.0179 | −0.0177 | −0.0192 | −0.0203 | −0.0214 | −0.0196 | −0.0191 | −0.0214 | −0.0246 | −0.0088 |
| D2: | 0.1290 | 0.0998 | 0.0638 | 0.0613 | 0.0651 | 0.0444 | 0.0590 | 0.0627 | 0.0623 | 0.0619 | 0.4543 |
Table 5.
[Logistic(m=10,n=10)] Lengths of Confidence Intervals for DNP, NAM(r=3) and Spline and Their Differences
| NAM: | 0.1820 | 0.1778 | 0.1537 | 0.1366 | 0.1288 | 0.1290 | 0.1286 | 0.1277 | 0.1334 | 0.1400 | 0.1545 |
| DNP: | 0.1842 | 0.1924 | 0.1569 | 0.1390 | 0.1314 | 0.1322 | 0.1322 | 0.1344 | 0.1417 | 0.1518 | 0.1694 |
| Spline: | 0.2849 | 0.2423 | 0.2024 | 0.1754 | 0.1716 | 0.1738 | 0.1713 | 0.1657 | 0.1742 | 0.1993 | 0.2141 |
| D1: | 0.0022 | 0.0147 | 0.0032 | 0.0024 | 0.0025 | 0.0032 | 0.0036 | 0.0068 | 0.0084 | 0.0118 | 0.0149 |
| D2: | 0.1007 | 0.0499 | 0.0455 | 0.0364 | 0.0402 | 0.0416 | 0.0391 | 0.0313 | 0.0325 | 0.0475 | 0.0447 |
Table 6.
[Logistic(m=10,n=25)] Lengths of Confidence Intervals for DNP, NAM(r=3) and Spline and Their Differences
| NAM: | 0.1416 | 0.1156 | 0.0990 | 0.0882 | 0.0836 | 0.0798 | 0.0791 | 0.0802 | 0.0818 | 0.0909 | 0.1103 |
| DNP: | 0.1733 | 0.1440 | 0.1223 | 0.1087 | 0.1047 | 0.0993 | 0.0975 | 0.0954 | 0.1043 | 0.1147 | 0.1391 |
| Spline: | 0.2109 | 0.1736 | 0.1371 | 0.1305 | 0.1140 | 0.1091 | 0.1124 | 0.1089 | 0.1263 | 0.1304 | 0.1525 |
| D1: | 0.0316 | 0.0283 | 0.0232 | 0.0204 | 0.0210 | 0.0195 | 0.0184 | 0.0152 | 0.0226 | 0.0238 | 0.0288 |
| D2: | 0.0376 | 0.0296 | 0.0148 | 0.0219 | 0.0093 | 0.0099 | 0.0149 | 0.0135 | 0.0220 | 0.0156 | 0.0134 |
Table 7.
[Probit(m=5,n=5)] Lengths of Confidence Intervals for DNP, NAM(r=2) and Spline and Their Differences
| NAM: | 0.1352 | 0.1544 | 0.1386 | 0.1407 | 0.1407 | 0.1369 | 0.1349 | 0.1373 | 0.1417 | 0.1390 | 0.1643 |
| DNP: | 0.0980 | 0.1405 | 0.1371 | 0.1324 | 0.1309 | 0.1290 | 0.1268 | 0.1308 | 0.1282 | 0.1387 | 0.1513 |
| Spline: | 0.1560 | 0.1849 | 0.2003 | 0.2053 | 0.1911 | 0.1631 | 0.1495 | 0.1488 | 0.2037 | 0.2024 | 0.3962 |
| D1: | −0.0372 | −0.0139 | −0.0015 | −0.0083 | −0.0098 | −0.0079 | −0.0081 | −0.0065 | −0.0135 | −0.0002 | −0.0130 |
| D2: | 0.0580 | 0.0444 | 0.0632 | 0.0729 | 0.0602 | 0.0341 | 0.0227 | 0.0181 | 0.0756 | 0.0636 | 0.2449 |
Table 8.
[Probit (m=5,n=10)] Lengths of Confidence Intervals for DNP, NAM(r=2) and Spline and Their Differences
| NAM: | 0.0719 | 0.1177 | 0.0996 | 0.1012 | 0.1056 | 0.1011 | 0.0984 | 0.1009 | 0.0978 | 0.0973 | 0.1095 |
| DNP: | 0.0869 | 0.1083 | 0.1091 | 0.1037 | 0.1028 | 0.1039 | 0.1031 | 0.0998 | 0.1018 | 0.1055 | 0.1079 |
| Spline: | 0.1379 | 0.1479 | 0.1479 | 0.1346 | 0.1209 | 0.1206 | 0.1200 | 0.1199 | 0.1249 | 0.1429 | 0.1487 |
| D1: | 0.0151 | −0.0094 | 0.0096 | 0.0025 | −0.0027 | 0.0028 | 0.0047 | −0.0011 | 0.0040 | 0.0082 | −0.0016 |
| D2: | 0.0509 | 0.0396 | 0.0388 | 0.0309 | 0.0180 | 0.0167 | 0.0169 | 0.0201 | 0.0231 | 0.0374 | 0.0408 |
Table 9.
[Probit(m=5,n=25)] Lengths of Confidence Intervals for DNP, NAM(r=2) and Spline and Their Differences
| NAM: | 0.0626 | 0.0714 | 0.0738 | 0.0658 | 0.0664 | 0.0665 | 0.0657 | 0.0651 | 0.0653 | 0.0704 | 0.0745 |
| DNP: | 0.0890 | 0.0904 | 0.0950 | 0.0797 | 0.0720 | 0.0790 | 0.0809 | 0.0744 | 0.0742 | 0.0851 | 0.0919 |
| Spline: | 0.1070 | 0.1275 | 0.1132 | 0.0812 | 0.0757 | 0.0819 | 0.0823 | 0.0783 | 0.0781 | 0.0955 | 0.1221 |
| D1: | 0.0264 | 0.0190 | 0.0212 | 0.0140 | 0.0056 | 0.0125 | 0.0152 | 0.0093 | 0.0089 | 0.0148 | 0.0173 |
| D2: | 0.0180 | 0.0371 | 0.0182 | 0.0015 | 0.0037 | 0.0029 | 0.0014 | 0.0039 | 0.0039 | 0.0104 | 0.0302 |
Table 10.
[Probit (m=10,n=5)] Lengths of Confidence Intervals for DNP, NAM(r=3) and Spline and Their Differences
| NAM: | 0.1087 | 0.1268 | 0.1173 | 0.1093 | 0.1066 | 0.1052 | 0.1020 | 0.1066 | 0.1040 | 0.1145 | 0.1173 |
| DNP: | 0.0940 | 0.1263 | 0.1116 | 0.1005 | 0.0971 | 0.0968 | 0.0953 | 0.0985 | 0.1049 | 0.1105 | 0.1234 |
| Spline: | 0.1550 | 0.1796 | 0.1538 | 0.1417 | 0.1387 | 0.1322 | 0.1326 | 0.1348 | 0.1370 | 0.1506 | 0.1671 |
| D1: | −0.0147 | −0.0005 | −0.0057 | −0.0087 | −0.0095 | −0.0084 | −0.0067 | −0.0081 | 0.0010 | −0.0040 | 0.0062 |
| D2: | 0.0611 | 0.0533 | 0.0422 | 0.0412 | 0.0417 | 0.0353 | 0.0374 | 0.0363 | 0.0321 | 0.0401 | 0.0437 |
Table 11.
[Probit (m=10,n=10)] Lengths of Confidence Intervals for DNP, NAM(r=3) and Spline and Their Differences
| NAM: | 0.0847 | 0.0940 | 0.0856 | 0.0803 | 0.0788 | 0.0779 | 0.0751 | 0.0772 | 0.0827 | 0.0870 | 0.0898 |
| DNP: | 0.0890 | 0.1044 | 0.0905 | 0.0830 | 0.0797 | 0.0812 | 0.0809 | 0.0799 | 0.0837 | 0.0922 | 0.0987 |
| Spline: | 0.1309 | 0.1416 | 0.1101 | 0.1030 | 0.1061 | 0.1042 | 0.1014 | 0.0999 | 0.1026 | 0.1177 | 0.1329 |
| D1: | 0.0043 | 0.0104 | 0.0050 | 0.0027 | 0.0009 | 0.0033 | 0.0058 | 0.0027 | 0.0010 | 0.0053 | 0.0090 |
| D2: | 0.0419 | 0.0372 | 0.0195 | 0.0200 | 0.0264 | 0.0230 | 0.0205 | 0.0201 | 0.0189 | 0.0255 | 0.0342 |
Table 13.
[Beta (m=5,n=5)] Lengths of Confidence Intervals for DNP, NAM(r=2) and Spline and Their Differences
| NAM: | 0.1787 | 0.2262 | 0.2414 | 0.2431 | 0.2598 | 0.2852 | 0.2871 | 0.2804 | 0.2926 | 0.3061 | 0.3232 |
| DNP: | 0.1375 | 0.2155 | 0.2508 | 0.2422 | 0.2522 | 0.2621 | 0.2702 | 0.2768 | 0.2899 | 0.3046 | 0.3127 |
| Spline: | 0.2642 | 0.3008 | 0.3093 | 0.3624 | 0.3863 | 0.3905 | 0.3552 | 0.3202 | 0.3643 | 0.6655 | 0.7029 |
| D1: | −0.0412 | −0.0107 | 0.0093 | −0.0009 | −0.0076 | −0.0230 | −0.0168 | −0.0036 | −0.0028 | −0.0015 | −0.0105 |
| D2: | 0.1267 | 0.0853 | 0.0585 | 0.1202 | 0.1341 | 0.1284 | 0.0849 | 0.0434 | 0.0744 | 0.0891 | 0.0925 |
Table 15.
[Beta (m=5,n=25)] Lengths of Confidence Intervals for DNP, NAM(r=2) and Spline and Their Differences
| NAM: | 0.0814 | 0.1013 | 0.1120 | 0.1274 | 0.1257 | 0.1246 | 0.1333 | 0.1392 | 0.1414 | 0.1393 | 0.1530 |
| DNP: | 0.0967 | 0.1360 | 0.1449 | 0.1682 | 0.1575 | 0.1471 | 0.1685 | 0.1890 | 0.1714 | 0.1754 | 0.1914 |
| Spline: | 0.1264 | 0.1646 | 0.2063 | 0.2143 | 0.1739 | 0.1593 | 0.1779 | 0.1853 | 0.1824 | 0.1837 | 0.2117 |
| D1: | 0.0153 | 0.0347 | 0.0329 | 0.0408 | 0.0317 | 0.0226 | 0.0352 | 0.0498 | 0.0300 | 0.0361 | 0.0384 |
| D2: | 0.0297 | 0.0286 | 0.0613 | 0.0461 | 0.0164 | 0.0121 | 0.0094 | −0.0037 | 0.0110 | 0.0083 | 0.0203 |
Table 16.
[Beta (m=10,n=5)] Lengths of Confidence Intervals for DNP, NAM(r=3) and Spline and Their Differences
| NAM: | 0.1743 | 0.1919 | 0.2047 | 0.2059 | 0.2141 | 0.2216 | 0.2324 | 0.2443 | 0.2398 | 0.2476 | 0.2535 |
| DNP: | 0.1301 | 0.1929 | 0.1956 | 0.2041 | 0.2034 | 0.2095 | 0.2142 | 0.2149 | 0.2275 | 0.2297 | 0.2660 |
| Spline: | 0.2261 | 0.2815 | 0.3117 | 0.2872 | 0.2824 | 0.2899 | 0.2964 | 0.3039 | 0.3092 | 0.3257 | 0.3335 |
| D1: | −0.0442 | 0.0010 | −0.0090 | −0.0019 | −0.0107 | −0.0121 | −0.0182 | −0.0293 | −0.0123 | −0.0179 | 0.0126 |
| D2: | 0.0960 | 0.0886 | 0.1160 | 0.0831 | 0.0790 | 0.0804 | 0.0822 | 0.0890 | 0.0817 | 0.0960 | 0.0674 |
Table 17.
[Beta (m=10,n=10)] Lengths of Confidence Intervals for DNP, NAM(r=3) and Spline and Their Differences
| NAM: | 0.1090 | 0.1520 | 0.1468 | 0.1535 | 0.1545 | 0.1602 | 0.1651 | 0.1750 | 0.1812 | 0.1790 | 0.1855 |
| DNP: | 0.1148 | 0.1720 | 0.1617 | 0.1626 | 0.1645 | 0.1675 | 0.1713 | 0.1824 | 0.1917 | 0.1992 | 0.2040 |
| Spline: | 0.1893 | 0.2274 | 0.2457 | 0.2184 | 0.2159 | 0.2105 | 0.2216 | 0.2308 | 0.2498 | 0.2536 | 0.2716 |
| D1: | 0.0058 | 0.0201 | 0.0150 | 0.0091 | 0.0100 | 0.0073 | 0.0061 | 0.0075 | 0.0105 | 0.0202 | 0.0185 |
| D2: | 0.0745 | 0.0553 | 0.0839 | 0.0558 | 0.0514 | 0.0430 | 0.0503 | 0.0484 | 0.0581 | 0.0545 | 0.0676 |
Table 18.
[Beta (m=10,n=25)] Lengths of Confidence Intervals for DNP, NAM(r=3) and Spline and Their Differences
| NAM: | 0.0732 | 0.0889 | 0.0968 | 0.0991 | 0.1022 | 0.1075 | 0.1117 | 0.1116 | 0.1149 | 0.1235 | 0.1251 |
| DNP: | 0.0946 | 0.1184 | 0.1244 | 0.1347 | 0.1334 | 0.1363 | 0.1365 | 0.1436 | 0.1456 | 0.1518 | 0.1621 |
| Spline: | 0.1240 | 0.1594 | 0.1499 | 0.1508 | 0.1500 | 0.1545 | 0.1512 | 0.1611 | 0.1675 | 0.1691 | 0.1790 |
| D1: | 0.0214 | 0.0295 | 0.0276 | 0.0356 | 0.0312 | 0.0288 | 0.0248 | 0.0320 | 0.0307 | 0.0283 | 0.0370 |
| D2: | 0.0294 | 0.0410 | 0.0255 | 0.0161 | 0.0166 | 0.0182 | 0.0147 | 0.0175 | 0.0219 | 0.0172 | 0.0170 |
Table 19.
[Weibull (m=5,n=5)] Lengths of Confidence Intervals for DNP, NAM(r=2) and Spline and Their Differences
| NAM: | 0.1787 | 0.2262 | 0.2414 | 0.2431 | 0.2598 | 0.2852 | 0.2871 | 0.2804 | 0.2926 | 0.3061 | 0.3232 |
| DNP: | 0.1375 | 0.2155 | 0.2508 | 0.2422 | 0.2522 | 0.2621 | 0.2702 | 0.2768 | 0.2899 | 0.3046 | 0.3127 |
| Spline: | 0.2642 | 0.3008 | 0.3093 | 0.3624 | 0.3863 | 0.3905 | 0.3552 | 0.3202 | 0.3643 | 0.6655 | 0.7029 |
| D1: | −0.0412 | −0.0107 | 0.0093 | −0.0009 | −0.0076 | −0.0230 | −0.0168 | −0.0036 | −0.0028 | −0.0015 | −0.0105 |
| D2: | 0.1267 | 0.0853 | 0.0585 | 0.1202 | 0.1341 | 0.1284 | 0.0849 | 0.0434 | 0.0744 | 0.0891 | 0.0925 |
Table 20.
[Weibull (m=5,n=10)] Lengths of Confidence Intervals for DNP, NAM(r=2) and Spline and Their Differences
| NAM: | 0.1123 | 0.1711 | 0.1883 | 0.1838 | 0.1848 | 0.2028 | 0.2022 | 0.1991 | 0.2246 | 0.2155 | 0.2132 |
| DNP: | 0.1238 | 0.1763 | 0.1854 | 0.2015 | 0.2019 | 0.2102 | 0.2163 | 0.2218 | 0.2233 | 0.2380 | 0.2506 |
| Spline: | 0.1238 | 0.1763 | 0.1854 | 0.2015 | 0.2019 | 0.2102 | 0.2163 | 0.2218 | 0.2233 | 0.2380 | 0.2506 |
| D1: | 0.0116 | 0.0052 | −0.0029 | 0.0178 | 0.0171 | 0.0074 | 0.0141 | 0.0227 | −0.0013 | 0.0225 | 0.0374 |
| D2: | 0.0545 | 0.0656 | 0.0853 | 0.0758 | 0.0744 | 0.0580 | 0.0375 | 0.0291 | 0.0336 | 0.0637 | 0.0720 |
Table 21.
[Weibull (m=5,n=25)] Lengths of Confidence Intervals for DNP, NAM(r=2) and Spline and Their Differences
| NAM: | 0.0814 | 0.1013 | 0.1120 | 0.1274 | 0.1257 | 0.1246 | 0.1333 | 0.1392 | 0.1414 | 0.1393 | 0.1530 |
| DNP: | 0.0967 | 0.1360 | 0.1449 | 0.1682 | 0.1575 | 0.1471 | 0.1685 | 0.1890 | 0.1714 | 0.1754 | 0.1914 |
| Spline: | 0.1264 | 0.1646 | 0.2063 | 0.2143 | 0.1739 | 0.1593 | 0.1779 | 0.1853 | 0.1824 | 0.1837 | 0.2117 |
| D1: | 0.0153 | 0.0347 | 0.0329 | 0.0408 | 0.0317 | 0.0226 | 0.0352 | 0.0498 | 0.0300 | 0.0361 | 0.0384 |
| D2: | 0.0297 | 0.0286 | 0.0613 | 0.0461 | 0.0164 | 0.0121 | 0.0094 | −0.0037 | 0.0110 | 0.0083 | 0.0203 |
Table 22.
[Weibull (m=10,n=5)] Lengths of Confidence Intervals for DNP, NAM(r=3) and Spline and Their Differences
| NAM: | 0.6268 | 0.8851 | 0.9274 | 0.9700 | 1.0744 | 1.1611 | 1.2431 | 1.4632 | 1.5249 | 1.6640 | 1.9524 |
| DNP: | 0.5311 | 0.8156 | 0.9539 | 0.9713 | 1.0605 | 1.1368 | 1.1874 | 1.2617 | 1.3447 | 1.5111 | 1.8156 |
| Spline: | 0.8892 | 1.2904 | 1.3509 | 1.4651 | 1.5616 | 1.5528 | 1.7029 | 1.7617 | 1.9285 | 2.0212 | 2.6157 |
| D1: | −0.0957 | −0.0695 | 0.0264 | 0.0013 | −0.0139 | −0.0244 | −0.0556 | −0.2015 | −0.1802 | −0.1528 | −0.1368 |
| D2: | 0.3581 | 0.4748 | 0.3970 | 0.4937 | 0.5011 | 0.4160 | 0.5155 | 0.5000 | 0.5838 | 0.5101 | 0.8001 |
Table 23.
[Weibull (m=10,n=10)] Lengths of Confidence Intervals for DNP, NAM(r=3) and Spline and Their Differences
| NAM: | 0.3905 | 0.6339 | 0.7121 | 0.7498 | 0.7925 | 0.8316 | 0.8952 | 0.9822 | 1.0641 | 1.2010 | 1.4386 |
| DNP: | 0.4510 | 0.6948 | 0.7144 | 0.7759 | 0.8158 | 0.8971 | 0.9377 | 1.0168 | 1.1194 | 1.3422 | 1.5087 |
| Spline: | 0.7749 | 0.9332 | 1.0090 | 1.2055 | 1.1872 | 1.1504 | 1.1786 | 1.3059 | 1.3999 | 1.6432 | 1.8403 |
| D1: | 0.0606 | 0.0609 | 0.0023 | 0.0261 | 0.0233 | 0.0655 | 0.0425 | 0.0346 | 0.0553 | 0.1411 | 0.0701 |
| D2: | 0.3239 | 0.2384 | 0.2946 | 0.4296 | 0.3714 | 0.2533 | 0.2409 | 0.2891 | 0.2805 | 0.3010 | 0.3317 |
Table 25.
The Lengths of Confidence Intervals for NAM (r=3) and SNAM for m=10,n=5
| NAM: | 0.1561 | 0.2033 | 0.2269 | 0.2304 | 0.2320 | 0.2175 | 0.1998 | 0.1892 | 0.1754 |
| SNAM: | 0.2562 | 0.2683 | 0.2568 | 0.2023 | 0.1877 | 0.1863 | 0.1880 | 0.1809 | 0.1691 |
Table 26.
The Lengths of Confidence Intervals for NAM (r=3) and SNAM for m=10,n=10
| NAM: | 0.1137 | 0.1361 | 0.1595 | 0.1448 | 0.1226 | 0.1284 | 0.1740 | 0.1844 | 0.1828 |
| SNAM: | 0.2086 | 0.1221 | 0.1050 | 0.1019 | 0.1082 | 0.1257 | 0.1539 | 0.1839 | 0.1837 |
Table 27.
The Lengths of Confidence Intervals for NAM (r=3) and SNAM for m=10,n=25
| NAM: | 0.0747 | 0.1146 | 0.1016 | 0.0812 | 0.0756 | 0.0732 | 0.0794 | 0.0984 | 0.1118 |
| SNAM: | 0.1074 | 0.0900 | 0.0773 | 0.0773 | 0.0757 | 0.0742 | 0.0792 | 0.0870 | 0.0980 |
Table 29.
[Logistic(m=10,n=5)] The Bias of confidence limits for DNP and NAM
| Bias(NAML): | 0.1018 | 0.0711 | 0.0219 | 0.0072 | 0.0023 | 0.0027 | 0.0030 | 0.0000 | 0.0018 | −0.0025 | −0.0084 |
| Bias(DNPL): | 0.0537 | 0.0641 | −0.0002 | −0.0004 | 0.0063 | 0.0082 | 0.0081 | 0.0085 | 0.0085 | 0.0085 | 0.0113 |
| Bias(NAMU): | −0.0051 | 0.0102 | 0.0044 | −0.0001 | −0.0043 | −0.0062 | −0.0036 | −0.0015 | −0.0125 | −0.0281 | −0.0749 |
| Bias(DNPU): | −0.0092 | −0.0107 | −0.0145 | −0.0126 | −0.0111 | −0.0132 | −0.0114 | −0.0065 | −0.0069 | −0.0086 | −0.0405 |
Table 30.
[Logistic(m=10,n=10)] The Variance of confidence limits for DNP and NAM
| Var(NAML): | 0.0022 | 0.0033 | 0.0022 | 0.0017 | 0.0014 | 0.0013 | 0.0012 | 0.0012 | 0.0012 | 0.0011 | 0.0011 |
| Var(DNPL): | 0.0020 | 0.0033 | 0.0026 | 0.0018 | 0.0015 | 0.0013 | 0.0012 | 0.0012 | 0.0012 | 0.0012 | 0.0013 |
| Var(NAMU): | 0.0013 | 0.0012 | 0.0012 | 0.0011 | 0.0011 | 0.0011 | 0.0012 | 0.0013 | 0.0016 | 0.0025 | 0.0029 |
| Var(DNPU): | 0.0015 | 0.0014 | 0.0012 | 0.0012 | 0.0012 | 0.0012 | 0.0012 | 0.0013 | 0.0016 | 0.0023 | 0.0032 |
Table 31.
[Logistic(m=10,n=10)] The Bias of confidence limits for DNP and NAM
| Bias(NAML): | 0.0678 | 0.0238 | 0.0030 | −0.0080 | −0.0084 | −0.0087 | −0.0067 | −0.0077 | −0.0060 | −0.0065 | −0.0054 |
| Bias(DNPL): | 0.0424 | 0.0210 | 0.0006 | 0.0008 | 0.0002 | −0.0003 | 0.0018 | 0.0032 | 0.0029 | 0.0026 | 0.0042 |
| Bias(NAMU): | 0.0056 | −0.0022 | 0.0007 | 0.0017 | 0.0028 | 0.0003 | 0.0025 | 0.0061 | 0.0070 | 0.0076 | −0.0056 |
| Bias(DNPU): | −0.0076 | −0.0084 | −0.0086 | −0.0043 | −0.0038 | −0.0078 | −0.0054 | −0.0030 | −0.0040 | −0.0022 | −0.0013 |
Table 32.
[Logistic(m=10,n=25)] The Variance of confidence limits for DNP and NAM
| Var(NAML): | 0.0012 | 0.0014 | 0.0007 | 0.0006 | 0.0005 | 0.0005 | 0.0005 | 0.0005 | 0.0005 | 0.0006 | 0.0006 |
| Var(DNPL): | 0.0014 | 0.0019 | 0.0013 | 0.0010 | 0.0008 | 0.0007 | 0.0007 | 0.0006 | 0.0006 | 0.0007 | 0.0008 |
| Var(NAMU): | 0.0008 | 0.0007 | 0.0006 | 0.0005 | 0.0005 | 0.0005 | 0.0005 | 0.0005 | 0.0006 | 0.0007 | 0.0012 |
| Var(DNPU): | 0.0012 | 0.0008 | 0.0007 | 0.0006 | 0.0006 | 0.0006 | 0.0007 | 0.0008 | 0.0010 | 0.0012 | 0.0017 |
Table 33.
[Logistic(m=10,n=25)] The Bias of confidence limits for DNP and NAM
| Bias(NAML): | 0.0380 | 0.0031 | −0.0001 | −0.0024 | −0.0026 | −0.0028 | −0.0022 | −0.0032 | −0.0041 | −0.0038 | 0.0018 |
| Bias(DNPL): | 0.0361 | 0.0023 | 0.0008 | 0.0039 | 0.0028 | 0.0016 | −0.0002 | −0.0018 | 0.0006 | 0.0012 | 0.0046 |
| Bias(NAMU): | 0.0059 | 0.0022 | 0.0018 | 0.0024 | 0.0024 | 0.0036 | 0.0043 | 0.0040 | 0.0055 | 0.0035 | 0.0020 |
| Bias(DNPU): | −0.0012 | −0.0063 | −0.0049 | 0.0007 | −0.0028 | −0.0011 | −0.0007 | 0.0019 | −0.0008 | −0.0023 | −0.0061 |
Table 34.
[Logistic(m=10,n=50)] The Variance of confidence limits for DNP and NAM
| Var(NAML): | 0.0009 | 0.0006 | 0.0003 | 0.0003 | 0.0002 | 0.0002 | 0.0002 | 0.0002 | 0.0002 | 0.0002 | 0.0003 |
| Var(DNPL): | 0.0011 | 0.0011 | 0.0008 | 0.0007 | 0.0005 | 0.0005 | 0.0004 | 0.0004 | 0.0004 | 0.0004 | 0.0005 |
| Var(NAMU): | 0.0005 | 0.0003 | 0.0003 | 0.0002 | 0.0002 | 0.0002 | 0.0002 | 0.0002 | 0.0002 | 0.0003 | 0.0004 |
| Var(DNPU): | 0.0009 | 0.0006 | 0.0005 | 0.0004 | 0.0004 | 0.0004 | 0.0005 | 0.0005 | 0.0006 | 0.0008 | 0.0011 |
Table 36.
[Logistic(m=20)] The true MISE comparison for DNP and AAGM
| n = 1 | n = 5 | n = 10 | n = 25 | n = 50 | |
|---|---|---|---|---|---|
| DNP: | 0.0059 | 0.0021 | 0.0014 | 8.2723e-04 | 5.2839e-04 |
| AAGM(k=2): | 0.0126 | 0.0034 | 0.0022 | 0.0010 | 5.4949e-04 |
| AAGM(k=3): | 0.0117* | 0.0031* | 0.0018* | 9.0159e-04* | 5.1024e-04* |
Table 37.
[Probit(m=20)] The true MISE comparison for DNP and AAGM
| n = 1 | n = 5 | n = 10 | n = 25 | n = 50 | |
|---|---|---|---|---|---|
| DNP: | 0.0151 | 0.0053 | 0.0035 | 0.0022 | 0.0015 |
| AAGM(k=2): | 0.0340 | 0.0095 | 0.0058 | 0.0027 | 0.0014 |
| AAGM(k=3): | 0.0312* | 0.0086* | 0.0048* | 0.0022* | 0.0011* |
Table 38.
[Beta (m=20)] The true MISE comparison for DNP and AAGM
| n = 1 | n = 5 | n = 10 | n = 25 | n = 50 | |
|---|---|---|---|---|---|
| DNP: | 0.0063 | 0.0021 | 0.0014 | 9.0506e-04 | 6.4270e-04 |
| AAGM(k=2): | 0.0141 | 0.0041 | 0.0023 | 0.0011 | 5.8343e-04 |
| AAGM(k=3): | 0.0129* | 0.0036* | 0.0019* | 8.1738e-04* | 4.3970e-04* |
Table 39.
[Weibull(m=20)] The true MISE comparison for DNP and AAGM
| n = 1 | n = 5 | n = 10 | n = 25 | n = 50 | |
|---|---|---|---|---|---|
| DNP: | 0.3000 | 0.0781 | 0.0479 | 0.0297 | 0.0200 |
| AAGM(k=2): | 0.4853 | 0.1251 | 0.0738 | 0.0352 | 0.0193 |
| AAGM(k=3): | 0.4556* | 0.1148* | 0.0602* | 0.0271* | 0.0140* |
Table 40.
Estimated MISE for NAM, DNP, Spline and MLE
| NAM (r = 2) | NAM (r = 3) | NAM (r = 4) | DNP | Spline | MLE (Logistic) | |
|---|---|---|---|---|---|---|
| MISE: | 0.0050 | 0.0044* | 0.0062 | 0.0081 | 0.0121 | 0.00045079 |
Acknowledgments
This research was supported by NIH grant R21-ES016791 and NSF grant DMS 1107053.
Contributor Information
Rabi Bhattacharya, Email: rabi@math.arizona.edu.
Lizhen Lin, Email: lizhen@stat.duke.edu.
References
- 1. Ayer M, Brunk HD, Ewing GM, Reid WT, Silverman E. An empirical distribution function for sampling with incomplete information. Ann. Math. Statist. 1955;26:641–647.
- 2. Bhattacharya R, Kong M. Consistency and asymptotic normality of the estimated effective dose in bioassay. J. Statist. Plan. Inf. 2007;137:643–658.
- 3. Bhattacharya R, Lin L. An adaptive nonparametric method in benchmark analysis for bioassay and environmental studies. Statist. Probab. Letters. 2010;80:1947–1953. doi: 10.1016/j.spl.2010.08.024.
- 4. Bhattacharya R, Lin L. Nonparametric benchmark analysis in risk assessment: a comparative study by simulation and data analysis. Sankhya Ser. B. 2011;73(1):144–163. doi: 10.1007/s13571-011-0019-7.
- 5. Dette H, Scheder R. A finite sample comparison of nonparametric estimates of the effective dose in quantal bioassay. J. Statist. Comput. Simulation. 2010;80(5):527–544.
- 6. Dette H, Neumeyer N, Pilz KF. A note on nonparametric estimation of the effective dose in quantal bioassay. J. Amer. Statist. Assoc. 2005;100:503–510.
- 7. Efron B. Bootstrap methods: another look at the jackknife. Ann. Statist. 1979;7(1):1–26.
- 8. Efron B. Nonparametric standard errors and confidence intervals. With discussion and a reply by the author. Canad. J. Statist. 1981;9(2):139–172.
- 9. Efron B, Tibshirani RJ. An Introduction to the Bootstrap. London: Chapman & Hall; 1993.
- 10. Eubank RL. Nonparametric Regression and Spline Smoothing. 2nd edition. New York: Marcel Dekker; 1999.
- 11. Hardy GH, Littlewood JE, Pólya G. Inequalities. Cambridge: Cambridge University Press; 1952.
- 12. Kelly C, Rice J. Monotone smoothing with application to dose response curves and the assessment of synergism. Biometrics. 1990;46:1071–1085.
- 13. Kong M, Eubank RL. Monotone smoothing with application to dose-response curve. Comm. Statist. Simulation Comput. 2006;35(4):991–1004.
- 14. Lin L. Nonparametric Inference for Bioassay. Thesis, University of Arizona; 2012.
- 15. Müller HG, Schmitt T. Kernel and probit estimation in quantal bioassay. J. Amer. Statist. Assoc. 1988;83(403):750–759.
- 16. Mukerjee H. Monotone nonparametric regression. Ann. Statist. 1988;16(2):741–750.
- 17. Park D, Park S. Parametric and nonparametric estimators of ED100α. J. Statist. Comput. Simulation. 2006;76(8):661–672.
- 18. Piegorsch W, Xiong H, Bhattacharya R, Lin L. Nonparametric estimation of benchmark doses in environmental risk assessment. Environmetrics. 2012. Accepted. doi: 10.1002/env.2175.
- 19. Platel A, Nesslany F, Gervais V, Marzin D. Study of oxidative damage in TK6 human lymphoblastoid cells by use of the in vitro micronucleus test: determination of No-Observed-Effect Levels. Mutation Research. 2009;678:30–37. doi: 10.1016/j.mrgentox.2009.06.006.
- 20. Rogers Commission Report. Report of the Presidential Commission on the Space Shuttle Challenger Accident. 1986.
- 21. Tsybakov AB. Introduction to Nonparametric Estimation. Springer Series in Statistics. New York: Springer; 2010.
- 22. Utreras FI. Smoothing noisy data under monotonicity constraints: existence, characterization and convergence rates. Numer. Math. 1985;47:611–625.
- 23. Wahba G. Spline Models for Observational Data. CBMS-NSF Series, Vol. 59. Philadelphia, PA: SIAM; 1990.
- 24. Wright FT. The asymptotic behavior of monotone regression estimates. Ann. Statist. 1981;9:443–448.
- 25. Wright FT. Monotone regression estimates for grouped observations. Ann. Statist. 1982;10:278–286.