Journal of Applied Statistics. 2019 Aug 1;47(13-15):2431–2442. doi: 10.1080/02664763.2019.1649375

Bias induced by adaptive dose-finding designs

Nancy Flournoy and Assaf P. Oron
PMCID: PMC9041908  PMID: 35707439

Abstract

There is a long literature on bias in maximum likelihood estimators. Here we demonstrate that adaptive dose-finding procedures (such as Continual Reassessment Methods, Up-and-Down and Interval Designs) themselves induce bias. In particular, with Bernoulli responses and dose assignments that depend on prior responses, we provide an explicit formula for the bias of observed response rates. We illustrate the patterns of bias for designs that aim to concentrate dose allocations around a target dose, which represents a specific quantile of a cumulative response-threshold distribution. For such designs, bias tends to be positive above the target dose and negative below it. To our knowledge, this property of dose-finding designs has not previously been recognized by design developers. We discuss the implications of this bias and suggest a simple shrinkage mitigation formula that improves estimation at doses away from the target.

Keywords: Bernoulli regression analysis, up-and-down procedures, continual reassessment method, interval designs, phase I clinical trials, inference following stochastic processes

1. Introduction

Bias in maximum likelihood estimators is well known, but in regular exponential family models this bias goes to zero with increasing sample size (see, e.g. Firth [11]). With regard to adaptive sampling procedures, Coad and Ivanova [6] found bias and variance expressions for maximum likelihood estimators (MLEs) for five popular urn models building on prior work by Woodroofe and Coad [5,7,36]. The bias and mean squared error (MSE) of MLEs following sample size re-calculation has been explored by Liu et al. [23], Brannath [3], Graf et al. [15] and Tarima and Flournoy [31]. Bias in group sequential designs is discussed by Jennison and Turnbull [19]. Bias induced by multi-arm bandit selection procedures was studied by Xu et al. [38].

But bias in dose-finding experiments has heretofore been overlooked. Dose-finding designs (such as Continual Reassessment Methods [4,25], Up-and-Down [10,12,17,28] and Interval Designs [18,21]) are routine in many fields in which subjects arrive sequentially or in batches, including Phase I cancer trials, sensory studies, acute toxicity and psychometric testing. Adaptation in these designs refers to the use of prior outcome information to determine the stimulus or treatment X for the current subject or cohort.

Even though the explicit bias formula derived in Section 2 applies to a much broader family of models and goals, for simplicity of exposition we restrict this article to the case of a univariate treatment $X$ and binary response $Y$. Without loss of generality, we call the magnitudes of the stimuli doses and positive responses toxicities. Doses $X_1,\dots,X_n$ are sequentially selected from a discrete ordered finite set denoted by $\mathcal{X} = \{d_1,\dots,d_M : d_1 < \dots < d_M\}$. Assuming

$$F(x) := P\{Y=1 \mid X=x\} = E[Y \mid X=x]$$

is monotone increasing in $x$, it follows that $F_m := F(d_m)$ is also monotone increasing in $m \in \{1,\dots,M\}$; this renders the region of interest interpretable as a neighborhood around a particular quantile of a cumulative distribution function of toxicity thresholds. The goal of the experiment is either (1) to estimate some pre-specified $\Gamma$th quantile of $F(x)$ [i.e. $F^{-1}(\Gamma)$] or (2) to select the dose in $\mathcal{X}$ that is closest to $F^{-1}(\Gamma)$.

Most dose-finding designs attempt to concentrate dose allocations around $F^{-1}(\Gamma)$, which is called the target dose, rather than spread them over $\mathcal{X}$ (e.g. randomly or using the local D-optimality criterion). To achieve this concentration, the dose-assignment rule may use likelihood-based estimation, employing a parametric or nonparametric model for $F(x)$. The likelihood-based estimates are constructed from the observed toxicity rates at the doses in $\mathcal{X}$ [30]. Some dose-assignment rules that are not based explicitly on models (such as the increasingly popular family of interval designs, e.g. [18,21]) use the observed rates directly for dosing decisions. Even designs that do not rely upon estimation to allocate doses (such as the Markovian family of up-and-down designs, e.g. [10,12,17,28]) may use some type of binary regression to estimate $F^{-1}(\Gamma)$ at the end of the experiment, and binary regression also utilizes the observed toxicity rates.

As the sample size at a specific $x \in \mathcal{X}$ goes to $\infty$, the observed response rate converges to its true value [26,33]. Design developers and analysts (including the authors of this paper) have, it seems, made the implicit assumption that these rates are unbiased with finite dose-specific sample sizes as well. This is trivially true when dose assignments do not depend on outcomes, but unfortunately the assumption breaks down for adaptive designs. The very adaptive property that makes dose-finding designs useful and popular, namely their tendency to concentrate allocations in a certain region of the $F(x)$ curve, induces bias in the observed dose-specific response rates.

In Section 2, we introduce the sufficient statistics for common dose-finding experiments and relate them to $\{F_m\}$. We then show the existence and form of the bias of the observed toxicity rate at $d_m$ for $F_m$, $m=1,\dots,M$, and provide numerical illustrations of the bias for some common designs. In Section 3, we propose an adjustment to the observed toxicity rates that mitigates the bias. Section 4 includes a discussion of potential practical implications of the bias.

2. The bias of observed dose-specific toxicity rates

2.1. Preliminaries

Each treatment $X_i$ triggers a binary response $Y_i \in \{0,1\}$, $i=1,\dots,n$. Let $\delta_{im}=1$ if $X_i=d_m$ and 0 otherwise, $m=1,\dots,M$. Dose-specific treatment and toxicity counts (ignoring their dependence on $n$ for notational simplicity) are

$$N_m = \sum_{i=1}^{n} \delta_{im} \quad\text{and}\quad T_m = \sum_{i=1}^{n} Y_i\,\delta_{im}, \qquad m=1,\dots,M. \tag{1}$$

Applying the factorization theorem to the likelihood $\prod_m F_m^{T_m}(1-F_m)^{N_m-T_m}$, with mild assumptions that typically hold as given in [30], one sees that the dose-specific summary statistics $\{T_m, N_m\}_1^M$ are sufficient for commonly used dose-finding experiments. But the most common statistics used in analyses are the observed dose-specific toxicity rates $\{T_m/N_m, N_m;\ N_m>0\}_1^M$, ignoring doses in $\mathcal{X}$ that are not used.

The ratio of two random variables $R_m := T_m/N_m$, $N_m>0$, is commonly thought of as an estimate of $F_m$, but it is $E[T_m]/E[N_m]$ that equals $F_m$, as can be seen from the following equivalences:

$$E[T_m] = \sum_{i=1}^{n} E[Y_i\,\delta_{im}] = \sum_{i=1}^{n} P\{Y_i=1 \mid \delta_{im}=1\}\, P\{\delta_{im}=1\} = F_m \sum_{i=1}^{n} P\{\delta_{im}=1\} = F_m\, E[N_m].$$

Therefore, Fm is the ratio of expectations,

$$F_m = E[T_m]/E[N_m], \qquad m=1,\dots,M. \tag{2}$$

We study when this ratio of expectations does or does not equal the expectation of the ratio $T_m/N_m$.
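To make the distinction concrete, here is a small simulation sketch (our own illustration, not one of the designs studied below): a simple up-and-down rule on five doses under an assumed logistic curve. The ratio of ensemble means recovers $F_m$, per equation (2), while the ensemble mean of the per-run ratios $T_m/N_m$ does not.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed setup: five doses with a logistic toxicity curve, and a simple
# up-and-down rule (step down after a toxicity, up after a non-toxicity).
doses = np.arange(1, 6)
F = 1.0 / (1.0 + np.exp(-(doses - 3.0)))        # true toxicity rates F_m

def run_trial(n=15):
    """One adaptive run; returns per-dose sample sizes N and toxicity counts T."""
    N = np.zeros(len(doses))
    T = np.zeros(len(doses))
    m = 0                                        # start at the lowest dose
    for _ in range(n):
        y = rng.random() < F[m]
        N[m] += 1
        T[m] += y
        m = max(m - 1, 0) if y else min(m + 1, len(doses) - 1)
    return N, T

reps = 20000
N = np.zeros((reps, len(doses)))
T = np.zeros((reps, len(doses)))
for i in range(reps):
    N[i], T[i] = run_trial()

# Ratio of ensemble means: recovers F_m, as in equation (2)
ratio_of_means = T.sum(axis=0) / N.sum(axis=0)
# Ensemble mean of per-run ratios R_m = T_m/N_m (runs with N_m > 0): biased
R = np.where(N > 0, T / np.where(N > 0, N, 1), np.nan)
mean_of_ratios = np.nanmean(R, axis=0)
```

Here `ratio_of_means` tracks the true `F` closely, while `mean_of_ratios` deviates from it at some doses; the next subsection derives the direction and size of that deviation.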

2.2. An expression for the bias of observed toxicity rates

Start with the definition of the covariance function for two random variables $U$ and $V$, with $V \neq 0$:

$$\mathrm{Cov}\!\left[\frac{U}{V}, V\right] = E\!\left[\frac{U}{V}\,V\right] - E\!\left[\frac{U}{V}\right]E[V] = E[U] - E\!\left[\frac{U}{V}\right]E[V].$$

Rearranging terms (as observed in [16]) yields $E[U/V] = \{E[U] - \mathrm{Cov}[U/V, V]\}/E[V]$; substituting $T_m$, $N_m$ and $R_m$ for $U$, $V$ and $U/V$, and using (2), we arrive at

$$E[R_m] = \frac{E[T_m]}{E[N_m]} - \frac{\mathrm{Cov}[R_m, N_m]}{E[N_m]} = F_m - \frac{\mathrm{Cov}[R_m, N_m]}{E[N_m]}. \tag{3}$$

So $R_m$ is a biased estimator of $F_m$ unless its covariance with $N_m$ is zero. The bias is proportional, and opposite in sign, to $\mathrm{Cov}[R_m, N_m]$, although $R_m - F_m$ always lies between $\pm 1$.

For adaptive dose-finding designs, $R_m$ is consistent if $N_m \to \infty$ [26]; but if $E[N_m] \to \infty$ while $N_m$ itself remains finite with non-vanishing probability, consistency is not guaranteed. Therefore, care must be taken in using asymptotic approximations. Here we add another path to consistency via an order-of-magnitude analysis of equation (3). Because correlations lie in $[-1, 1]$, the bias magnitude is bounded as

$$\left|E[R_m] - F_m\right| = \frac{\left|\mathrm{Cov}[R_m, N_m]\right|}{E[N_m]} \le \frac{\sqrt{\mathrm{Var}[R_m]\,\mathrm{Var}[N_m]}}{E[N_m]} \le \tfrac{1}{2}\, CV[N_m],$$

where $CV[N_m]$ is the coefficient of variation of $N_m$ (the last inequality uses $\mathrm{Var}[R_m] \le 1/4$, which holds because $R_m \in [0,1]$).
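Both the identity (3) and this bound can be checked numerically. The sketch below (our own illustration; the design, curve, and starting dose are arbitrary choices) conditions on $N_m > 0$, i.e. all moments are taken over runs in which dose $m$ was used at least once; relations (2) and (3) continue to hold in that conditional form, since $T_m = N_m = 0$ whenever dose $m$ is unused.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed setup: a simple up-and-down rule on five doses with a logistic
# toxicity curve, starting mid-range.
doses = np.arange(1, 6)
F = 1.0 / (1.0 + np.exp(-(doses - 3.0)))

def run_trial(n=20):
    N = np.zeros(len(doses))
    T = np.zeros(len(doses))
    m = 2
    for _ in range(n):
        y = rng.random() < F[m]
        N[m] += 1
        T[m] += y
        m = max(m - 1, 0) if y else min(m + 1, len(doses) - 1)
    return N, T

reps = 20000
N = np.zeros((reps, len(doses)))
T = np.zeros((reps, len(doses)))
for i in range(reps):
    N[i], T[i] = run_trial()

bias_lhs, bias_rhs, bound = [], [], []
for m in range(len(doses)):
    used = N[:, m] > 0                            # condition on N_m > 0
    R, Nm = T[used, m] / N[used, m], N[used, m]
    cov = np.cov(R, Nm, ddof=0)[0, 1]
    bias_lhs.append(R.mean() - F[m])              # E[R_m] - F_m
    bias_rhs.append(-cov / Nm.mean())             # -Cov[R_m, N_m]/E[N_m], eq. (3)
    bound.append(0.5 * Nm.std() / Nm.mean())      # (1/2) CV[N_m]
```

The two sides of (3) agree up to Monte Carlo error, and the empirical bias stays within the coefficient-of-variation bound at every dose.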

With Markovian up-and-down designs, $E[N_m/n]$ converges to a constant while $\mathrm{Var}[N_m/n] \to 0$ (see, e.g. [10,12,17,28]), and therefore $R_m$ is consistent for all $d_m$. In contrast, when using interval designs (e.g. [18,20]) and likelihood-derived allocation procedures (e.g. [25]) with small $n$, the $\{N_m\}$ are known to have high variability [29]. As $n \to \infty$, these designs generally come to allocate treatments exclusively to a single dose or to doses in a small interval [22,26]. Under such designs, the bias at doses outside this interval will not go away.

A straightforward rearrangement of terms in (3) reveals that

$$\mathrm{Cor}[R_m, N_m] \;\begin{cases} >0 & \text{iff } E[R_m] < F_m;\\ =0 & \text{iff } E[R_m] = F_m;\\ <0 & \text{iff } E[R_m] > F_m. \end{cases} \tag{4}$$

Figure 1 demonstrates this relationship between the bias and $\mathrm{Cor}[R_m, N_m]$ for selected dose-finding designs that are described in Section 2.3. The points show roughly the same pattern of association regardless of design.

Figure 1. Toxicity rate bias ($\bar R_m - F_m$) plotted against the correlation between $R_m$ and $N_m$ for selected dose-finding designs; $n=30$ (left) and $n=120$ (right).

Now consider how bias might change as a function of dose. Suppose a dose-finding design causes $\{N_m\}_1^M$ to have a unimodal distribution around $F^{-1}(\Gamma)$ (as a function of $m$), and that $\{R_m\}_1^M$ is increasing with values that bound $\Gamma$. Then $\{R_m\}$ and $\{N_m\}$ increase together up to the point $[F^{-1}(\Gamma), \Gamma]$, after which $R_m$ continues to increase while $N_m$ decreases. That is, $\{R_m\}$ and $\{N_m\}$ are positively correlated for doses below $F^{-1}(\Gamma)$ and negatively correlated for larger doses. If a design had these characteristics, then by (4), $R_m$ would have negative bias for $d_m$ below the target dose and positive bias above it.

In addition, because $E[N_m]$ (in the denominator of the bias) will tend to decrease as $d_m$ moves away from $F^{-1}(\Gamma)$, one expects the magnitude of the bias to increase as $|F_m - \Gamma|$ increases. Thus we expect dose-finding designs to induce a 'flaring out' of the bias, pushing $\{R_m\}$ away from $F_m$, and even farther away from $\Gamma$, in both directions.

2.3. Numerical illustration of observed toxicity rate bias as a function of dose

To illustrate the bias of observed toxicity rates, five common dose-finding designs are simulated, with each simulation run comprising 10,000 replications and operating on the dose set $\mathcal{X} = \{1,\dots,10\}$. To keep the focus on bias, all simulations use the same simple logistic function defined by $\mathrm{logit}[F(x)] = (x - 5.6)/2$. The parameters of four of the designs studied are set with the experimental aim of estimating the 30th percentile of $F(x)$ [which is $F^{-1}(0.3) \approx 3.9$]. Studied are the Durham–Flournoy Biased Coin Design (BCD) [9,12], the Continual Reassessment Method (CRM) [4,25], the Cumulative Cohort Design (CCD) [18] and the K-in-a-row Design (KRD) [14,28,35] with $K=2$. For comparison, we also simulate the Classical Up-and-Down Design (CUDD) [8], whose aim is to estimate the 50th percentile [here $F^{-1}(0.5) = 5.6$]. Experiments of size $n=30$ and $n=120$ are simulated for each design.

Let $\bar R_m$ denote the ensemble average of simulated $R_m$ values. Figure 1 plots the bias, $\bar R_m - F_m$, versus the correlation between $R_m$ and $N_m$ for $m=1,\dots,M$ for all five dose-finding designs. The correlation is positive when the bias is negative and vice versa. Plots for $n=30$ and $n=120$ are quite similar, except that the points of extreme bias under the CRM are tempered at the larger sample size.

Figure 2 plots the simulated bias versus dose. For all designs, the bias generally tends from negative to positive as the dose increases, crossing zero near $F^{-1}(\Gamma)$. The trend is more gradual and more clearly monotone under the Markovian designs (BCD, KRD, CUDD), with the long-memory CRM and CCD displaying greater bias magnitudes and a steeper, more localized change from negative to positive near the target.

Figure 2. Observed toxicity rate bias ($\bar R_m - F_m$) by dose for selected dose-finding designs; $n=30$ (left) and $n=120$ (right). The target dose is 3.9 for BCD, KRD, CRM and CCD, and 5.6 for CUDD.

3. Mitigating the bias of observed toxicity rates

3.1. Potential mitigation approaches

The expression for bias (3) appears simple and straightforward at face value, suggesting there might be a way to mitigate the bias in real time, possibly even by direct substitution. Dose-specific covariances can only be observed on populations of experiments, such as we synthesized via simulation ensembles in the previous section. However, a rough on-the-fly estimate of $\mathrm{Cov}[R_m, N_m]$ is a sample covariance treating $R_{m-1}, N_{m-1}$ and $R_{m+1}, N_{m+1}$ as replicates of $R_m, N_m$. The denominator of the bias, $E[N_m]$, is more problematic with small overall $n$, because the obvious substitute, $N_m$, will be very small except near the apparent target, making the overall correction rather unstable. We explored the use of this substitution and found that it tends to over-correct and to have high variance.

A far simpler approach is inspired by the work of Firth [11] on bias reduction, but makes use of our observation that the bias 'flares out' away from the target. The fix is to shrink all the $R_m$ slightly towards the target using the formula:

$$\tilde R_m = \frac{T_m + \Gamma}{N_m + 1} = \frac{N_m R_m + \Gamma}{N_m + 1}. \tag{5}$$

The magnitude of the shrinkage is approximately inversely proportional to $N_m$, in agreement with the bias expression (3). When $\Gamma = 0.5$, this formula is identical to the commonly used correction for calculating the empirical logit in the presence of zero cell counts [2,37]. Firth's correction has been used to analyze results from a Markovian up-and-down design with $\Gamma = 0.5$ in anesthesiology research [1].
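For concreteness, formula (5) can be applied as follows (a minimal sketch; the function name `shrink_rates` and the example counts are ours):

```python
import numpy as np

def shrink_rates(T, N, Gamma):
    """Shrink observed toxicity rates toward the target rate Gamma,
    per formula (5): (T_m + Gamma) / (N_m + 1)."""
    T = np.asarray(T, dtype=float)
    N = np.asarray(N, dtype=float)
    return (T + Gamma) / (N + 1.0)

# Assumed example: toxicity counts at four doses, with target rate Gamma = 0.3
T = np.array([0, 2, 5, 3])
N = np.array([4, 8, 10, 4])
raw = T / N                           # 0.00, 0.25, 0.50, 0.75
shrunk = shrink_rates(T, N, 0.3)      # every rate pulled toward 0.3,
                                      # more strongly where N_m is small
```

Each shrunk value lies between the raw rate and $\Gamma$, an unused dose ($N_m = 0$) simply returns $\Gamma$, and the pull decays roughly as $1/N_m$, in line with the bias expression (3).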

This mitigation formula performs rather acceptably in the simulations shown below.

3.2. Numerical examination of the shrinkage formula

We applied the shrinkage formula (5) to the simulation data that produced the figures in Section 2.3. Plots from two representative designs (KRD and CRM) in Figure 2 are repeated in Figure 3 (top left, dashed curves), augmented with curves created by applying formula (5) (solid curves). The bias is mitigated fairly well. There is some over-correction at extreme doses, particularly far above the target, where sample sizes are small. However, we found that the overall best performance by far is obtained when the shrinkage is applied without regard to dose-specific sample sizes.

Figure 3. The impact of shrinking the observed toxicity rates, by dose, for the KRD (red–orange) and CRM (blue–green) designs. Top row: ensemble bias (left) and root-mean-square error (right). Bottom row: ensemble bias² (left) and variance (right). Raw observed toxicity rates ($R_m$) are connected by dashed lines, while bias-mitigated rates ($\tilde R_m$) are connected by solid lines.

Figure 3 (top right) shows that the application of formula (5) reduces root-mean-square error (RMSE): $\{\tilde R_m\}$ stay generally closer than $\{R_m\}$ to the true $\{F_m\}$. RMSE is graphed to make the magnitudes of mitigation in the two figures comparable. The two graphs in the bottom row of Figure 3 clarify the bias/variance tradeoff across doses, showing the mitigating effect of (5) on both.

3.3. Impact of bias and shrinkage on parametric MLEs

We assumed the two-parameter function $\mathrm{logit}[F(x)] = (x - 5.6)/2$ in order to examine how bias in the observed toxicity rates affects parametric estimates following the KRD and CRM designs. Figure 4 shows the effect of using the shrinkage formula on estimates of the location, slope and target-dose parameters, which here are $F^{-1}(0.5) = 5.6$, 0.5 and $F^{-1}(0.3) \approx 3.9$, respectively. For reference, we include estimates generated by the D-optimal design, which assigns an equal number of samples to $F^{-1}(0.176)$ and $F^{-1}(0.824)$ [34]. The D-optimal design provides an important benchmark in that it minimizes the asymptotic confidence ellipsoid around the MLEs.

Figure 4. The impact of bias and shrinkage upon parametric estimates from simulated KRD and CRM experiments with $n=30$. The D-optimal design experiment is included for reference. True values are indicated by horizontal red lines.

CRM and KRD both underestimate the location and overestimate the slope, with CRM's estimates being more extreme. The shrinkage mitigates those biases, but not completely. Encouragingly, the third panel shows that the adaptive designs perform well in estimating the target dose, with shrinkage not offering any improvement there. Note that KRD generally produces better estimates than CRM, for both parameters and target dose. Note also that in a very small fraction of adaptive-design runs, very extreme parametric estimates are generated that fall outside the range of values shown in Figure 4, usually because all observations are concentrated at one or two doses.

3.4. Revisiting coverage of isotonic regression interval estimates of the target dose

In our presentation of centered isotonic regression [CIR, 27], we also developed analytical confidence intervals (CIs) for the target dose. Those intervals tended to be conservative for $F^{-1}(\Gamma)$ with fixed non-adaptive designs. However, when applied to simulated KRD dose-finding experiments, coverage was rather deficient at $n=20$: only 80% for the nominal 90% CI. An empirical modification, assuming a simple random draw of doses, only increased coverage by 1% [27, Table 7]. Coverage was still somewhat deficient at $n=40$, but somewhat conservative at $n=80$, indicating a sample-size-related issue. The deficiency in coverage was clearly asymmetric, but at the time we could not find the root cause.

Table 1 uses the same dataset that produced Table 7 in [27], showing interval coverage rates and widths calculated with and without shrinking the observed response rates toward the target rate according to (5). At $n=20$, coverage, while still somewhat deficient, is much closer to the nominal 90% level with bias mitigation. Coverage attains the nominal level before $n=40$, and the change in coverage with increasing sample size is less pronounced. Clearly, more work is needed on interval estimation procedures following adaptive designs.

Table 1. Coverage rates (interval widths)a for the target dose $F^{-1}(0.30)$ using the $K=2$ KRD, with and without bias mitigation.

Family n Using raw observed rates Using rates after shrinkage
Logistic 20 0.80 (2.14) 0.88 (2.68)
  40 0.88 (2.02) 0.92 (2.30)
  80 0.92 (1.63) 0.94 (1.75)
Weibull 20 0.80 (1.93) 0.88 (2.44)
  40 0.88 (1.85) 0.92 (2.11)
  80 0.91 (1.57) 0.93 (1.68)

aThe nominal coverage rate is 90% for interval estimates after simulated KRD (K=2) experiments with n=20,40,80. Average interval widths, given in parentheses, are on the dose scale. Simulated experiments were run assuming randomly selected toxicity curves from the Logistic (top) and Weibull (bottom) family. Raw observed rates were previously published in Table 7 in [27].

4. Discussion

A pattern of bias induced by adaptive dose-finding designs has been hiding in plain sight. This article exposes and explains this discovery in the context of dose-finding experiments. It is interesting to note that a formula similar to (3) appears in the group sequential design Example 3 of [23]. Formula (3) can also be applied when adapting treatment assignments to independent groups instead of doses (ordered points); in that case, for example, it reveals that increasing treatment assignments to groups showing less toxicity causes positive bias, because the correlation between the toxicity rate and the sample size is negative. For continuous responses, re-defining $R_m$ to be the observed group-specific sample mean and $T_m$ to be the observed group-specific sum, formula (3) reveals that the sample mean will be negatively biased for the population mean when using greedy sampling, Thompson sampling [32], Randomly Reinforced Urns [13] or other schemes that increase the probability of sampling from high-performing groups. The derivation of formula (3) provides an alternative proof of Theorem 1 in [24] and demonstrates that their conclusion that 'adaptively collected data have negative bias' applies to the class of adaptations they considered, and not generally.
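The continuous-response claim can be checked with a small simulation (our own hypothetical two-group setup with greedy allocation; both groups share the same true mean of zero, so the average observed group means estimate the bias directly):

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumed setup: two groups with equal true means (zero), greedy allocation
# after one forced observation per group.
def greedy_run(mu=(0.0, 0.0), sigma=1.0, n=30):
    s = [rng.normal(mu[0], sigma), rng.normal(mu[1], sigma)]   # running sums
    c = [1, 1]                                                 # sample sizes
    for _ in range(n - 2):
        g = 0 if s[0] / c[0] >= s[1] / c[1] else 1             # greedy choice
        s[g] += rng.normal(mu[g], sigma)
        c[g] += 1
    return s[0] / c[0], s[1] / c[1]                            # observed means

reps = 10000
means = np.array([greedy_run() for _ in range(reps)])
avg_bias = means.mean(axis=0)        # both entries come out negative
```

Greedy allocation makes a group's observed mean and its sample size positively correlated, so by (3) the observed mean of each group is negatively biased: allocation to a group pauses precisely when its running mean looks low.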

Figure 4 (right panel) demonstrates that adaptive designs can produce unbiased MLEs of the target dose despite producing biased parameter estimates.

If isotonic regression or CIR is used to produce a point estimate for the target dose, then the shrinkage helps ensure that the confidence interval of this estimate has sufficient coverage. Figure 4 demonstrates that parametric estimates of percentiles not too distant from the target [compare $F^{-1}(0.5)$ on the left with $F^{-1}(0.3)$ on the right] and estimates of the slope (center panel) are much less reliable without shrinkage. It is of concern that data generated by dose-finding designs of all kinds (including the long-memory CRM and CCD) are increasingly being used to estimate $F$ more broadly and to make decisions about dropping doses away from the target (e.g. [39]). Safety exclusion rules are important, but without effective bias mitigation, these decision rules are likely to be overly aggressive. This work reinforces the warning that data from dose-finding designs that concentrate treatments at doses near a target are of limited value for making predictions except near that target.

The small-sample dose-finding problem is difficult, and some have shown more complacency than is warranted regarding the quality and reliability of existing methods. The truth is that, at present, no single design and estimation combination can provide the quality of results that some applied researchers, particularly in Phase I clinical trials, have come to expect from dose-finding experiments. Whether the expectations are unrealistic, or further breakthroughs await discovery, remains to be seen.

Markovian up-and-down designs, with their more tractable analytical properties and more balanced treatment allocations, seem to have less bias away from the target, and lower RMSE, than long-memory designs like the CRM (Figures 2–4). However, parametric estimates following a CRM benefit greatly from bias-mitigating shrinkage (Figures 3 and 4).

We have been fortunate to find a simple mitigating fix for the observed toxicity rate bias that works reasonably well. Shrinkage estimators often reduce variance at the cost of some additional bias, but with dose-finding designs one can leverage knowledge of the patterns shown in Section 2.3 to reduce both estimation bias and variance via shrinkage. The shrinkage formula (5) is now implemented in the R package cir on GitHub, via the logical argument adaptiveShrink, and will soon be incorporated into the CRAN version as well.

Acknowledgments

We would like to thank the reviewers for excellent questions and comments that have improved this manuscript. We also thank David Azriel for discussions regarding consistency.

Disclosure statement

No potential conflict of interest was reported by the authors.

References

  • 1.Albrecht E., Kirkham K.R., Taffe P., Endersby R.V., Tse C., Chan V.W. and Brull R., The maximum effective needle-to-nerve distance for ultrasound-guided interscalene block, in Swiss Medical Weekly, Vol. 143, EMH Swiss Medical Publishers LTD Farnsburgerster 8, CH-4132 Muttenz, Switzerland, 2013, pp. 6S–6S.
  • 2.Anscombe F., On estimating binomial response relations, Biometrika 43 (1956), pp. 461–464. doi: 10.1093/biomet/43.3-4.461 [DOI] [Google Scholar]
  • 3.Brannath W., König F. and Bauer P., Estimation in flexible two stage designs, Stat. Med. 25 (2006), pp. 3366–3381. doi: 10.1002/sim.2258 [DOI] [PubMed] [Google Scholar]
  • 4.Cheung Y., Dose Finding by the Continual Reassessment Method, Chapman and Hall/CRC, New York, 2011. [Google Scholar]
  • 5.Coad D., Estimation following sequential tests involving data-dependent allocation, Stat. Sin. 4 (1994), pp. 693–700. [Google Scholar]
  • 6.Coad D. and Ivanova A., Bias calculations for adaptive urn designs, Seq. Anal. 20 (2001), pp. 91–116. doi: 10.1081/SQA-100106051 [DOI] [Google Scholar]
  • 7.Coad D. and Woodroofe M., Approximate bias calculations for sequentially designed experiments, Seq. Anal. 17 (1998), pp. 1–31. doi: 10.1080/07474949808836396 [DOI] [Google Scholar]
  • 8.Dixon W. and Mood A., A method for obtaining and analyzing sensitivity data, J. Am. Statist. Assoc. 43 (1948), pp. 109–126. doi: 10.1080/01621459.1948.10483254 [DOI] [Google Scholar]
  • 9.Durham S. and Flournoy N., Random walks for quantile estimation, in Statistical Decision Theory and Related Topics V (West Lafayette, IN, 1992), Springer, New York, 1994, pp. 467–476. QA279.4.S745 1994. MR MR1286322 (95d:62049).
  • 10.Durham S. and Flournoy N., Up-and-down designs. I. Stationary treatment distributions, in Adaptive designs, N. Flournoy and W. Rosenberger, eds., IMS Lecture Notes Monogr. Ser. Vol. 25, Inst. Math. Statist., Hayward, CA, 1995, pp. 139–157. MR MR1477678
  • 11.Firth D., Bias reduction of maximum likelihood estimates, Biometrika 80 (1993), pp. 27–38. doi: 10.1093/biomet/80.1.27 [DOI] [Google Scholar]
  • 12.Flournoy N. and Oron A., Up-and-down designs for dose-finding, in Handbook of Design and Analysis of Experiments, D. Bingham, A. Dean, M. Morris, and J. Stufken, eds., chap. 24, CRC Press, Chapman Hall, 2015, pp. 862–898
  • 13.Flournoy N., May C. and Secchi P., Asymptotically optimal response-adaptive designs for allocating the best treatment: an overview, Int. Stat. Rev. 80 (2012), pp. 293–305. doi: 10.1111/j.1751-5823.2011.00173.x [DOI] [Google Scholar]
  • 14.Gezmu M., The geometric up-and-down design for allocating dosage levels, Ph.D. diss., American University, Washington, DC, 1996.
  • 15.Graf A.C., Gutjahr G. and Brannath W., Precision of maximum likelihood estimation in adaptive designs, Stat. Med. 35 (2016), pp. 922–941. doi: 10.1002/sim.6761 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16.Heijmans R., When does the expectation of a ratio equal the ratio of expectations?, Stat. Papers 40 (1999), pp. 107–115. doi: 10.1007/BF02927114 [DOI] [Google Scholar]
  • 17.Ivanova A., Dose escalation and up-and-down designs, Wiley Encyclopedia of Clinical Trials, 2007, pp. 1–4.
  • 18.Ivanova A., Flournoy N. and Chung Y., Cumulative cohort design for dose-finding, J. Statist. Plan. Inference 137 (2007), pp. 2316–2327. doi: 10.1016/j.jspi.2006.07.009 [DOI] [Google Scholar]
  • 19.Jennison C. and Turnbull B.W., Group Sequential and Adaptive Methods for Clinical Trials, 2nd ed., Chapman & Hall/CRC, Boca Raton, 2002. [Google Scholar]
  • 20.Ji Y., Li Y. and Bekele B.N., Dose-finding in phase I clinical trials based on toxicity probability intervals, Clin. Trials 4 (2007), pp. 235–244. doi: 10.1177/1740774507079442 [DOI] [PubMed] [Google Scholar]
  • 21.Ji Y., Liu P., Li Y. and Nebiyou Bekele B., A modified toxicity probability interval method for dose-finding trials, Clin. Trials 7 (2010), pp. 653–663. doi: 10.1177/1740774510382799 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 22.Lee S.M. and Cheung Y.K., Model calibration in the continual reassessment method, Clin. Trials 6 (2009), pp. 227–238. doi: 10.1177/1740774509105076 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 23.Liu Q., Proschan M.A. and Pledger G.W., A unified theory of two-stage adaptive designs, J. Am. Stat. Assoc. 97 (2002), pp. 1034–1041. doi: 10.1198/016214502388618852 [DOI] [Google Scholar]
  • 24.Nie X., Tian X., Taylor J. and Zou J., Why adaptively collected data have negative bias and how to correct for it, preprint (2017). Available at arXiv:1708.01977.
  • 25.O'Quigley J., Pepe M. and Fisher L., Continual reassessment method: a practical design for Phase I clinical trials in cancer, Biometrics 46 (1990), pp. 33–48. MR 1059105. doi: 10.2307/2531628 [DOI] [PubMed] [Google Scholar]
  • 26.Oron A., Azriel D. and Hoff P., Dose-finding designs: the role of convergence properties, Int J Biostat 7 (2011), pp. 1–17. Article doi: 10.2202/1557-4679.1298 [DOI] [PubMed] [Google Scholar]
  • 27.Oron A. and Flournoy N., Centered isotonic regression: point and interval estimation for dose-response studies, Stat. Biopharm. Res. 9 (2017), pp. 258–267. doi: 10.1080/19466315.2017.1286256 [DOI] [Google Scholar]
  • 28.Oron A. and Hoff P., The k-in-a-row up-and-down design, revisited, Stat. Med. 28 (2009), pp. 1805–1820. doi: 10.1002/sim.3590 [DOI] [PubMed] [Google Scholar]
  • 29.Oron A. and Hoff P., Small-sample behavior of novel Phase I cancer trial designs, Clin. Trials 10 (2013), pp. 63–80. doi: 10.1177/1740774512469311 [DOI] [PubMed] [Google Scholar]
  • 30.Rosenberger W., Flournoy N. and Durham S.D., Asymptotic normality of maximum likelihood estimators from multiparameter response-driven designs, J. Statist. Plan. Inference 60 (1997), pp. 69–76. doi: 10.1016/S0378-3758(96)00120-6 [DOI] [Google Scholar]
  • 31.Tarima S. and Flournoy N., Distribution theory following blinded and unblinded sample size re-estimation under parametric models, Communications in Statistics, revision under review (2019). [DOI] [PMC free article] [PubMed]
  • 32.Thompson W.R., On the likelihood that one unknown probability exceeds another in view of the evidence of two samples, Biometrika 25 (1933), pp. 285–294. doi: 10.1093/biomet/25.3-4.285 [DOI] [Google Scholar]
  • 33.Tsutakawa R., Asymptotic properties of block up-and-down methods in bio-assay, Ann. Math. Statist. 38 (1967), pp. 1822–1828. doi: 10.1214/aoms/1177698615 [DOI] [Google Scholar]
  • 34.Wetherill G.B., Sequential estimation of quantal response curves, J Roy. Stat. Soc. B 25 (1963), pp. 1–48. [Google Scholar]
  • 35.Wetherill G., Sequential estimation of quantal response curves, J. Royal. Stat. Soc. B 25 (1963), pp. 1–48. [Google Scholar]
  • 36.Woodroofe M., On stopping times and stochastic monotonicity, Seq. Anal. 9 (1990), pp. 335–342. doi: 10.1080/07474949008836216 [DOI] [Google Scholar]
  • 37.Woolf B., On estimating the relation between blood group and disease, Ann. Hum. Genet. 19 (1955), pp. 251–253. doi: 10.1111/j.1469-1809.1955.tb01348.x [DOI] [PubMed] [Google Scholar]
  • 38.Xu M., Qin T. and Liu T.Y., Estimation bias in multi-armed bandit algorithms for search advertising, in Advances in Neural Information Processing Systems, Vol. 26, 2013, pp. 2400–2408.
  • 39.Yang S., Wang S.J. and Ji Y., An integrated dose-finding tool for phase I trials in oncology, Contemp. Clin. Trials. 45 (2015), pp. 426–434. doi: 10.1016/j.cct.2015.09.019 [DOI] [PubMed] [Google Scholar]
