Author manuscript; available in PMC: 2015 Jun 30.
Published in final edited form as: Biometrics. 2010 Aug 19;67(2):596–603. doi: 10.1111/j.1541-0420.2010.01471.x

Incorporating Individual and Collective Ethics into Phase I Cancer Trial Designs

Jay Bartroff 1, Tze Leung Lai 2
PMCID: PMC4485382  NIHMSID: NIHMS691939  PMID: 20731643

Summary

A general framework is proposed for Bayesian model based designs of Phase I cancer trials, in which a general criterion for coherence of a design is also developed. This framework can incorporate both “individual” and “collective” ethics into the design of the trial. We propose a new design that minimizes a risk function composed of two terms, with one representing the individual risk of the current dose and the other representing the collective risk. The performance of this design, which is measured in terms of the accuracy of the estimated target dose at the end of the trial, the toxicity and overdose rates, and certain loss functions reflecting the individual and collective ethics, is studied and compared with existing Bayesian model based designs and is shown to have better performance than existing designs.

Keywords: Cancer trials, Coherence, Dose-finding, Logistic regression, Markov decision problem, Phase I

1. Introduction

A Phase I trial for a new treatment is generally intended to determine a dose to use in subsequent Phase II and III testing. Phase I cancer trials have the additional complexity that the treatment in question is usually a cytotoxic agent and the efficacy usually increases with dose, and therefore it is widely accepted that some degree of toxicity must be tolerated to experience any substantial therapeutic effects. Hence, an acceptable proportion p of patients experiencing dose limiting toxicities (DLTs) is generally agreed on before the trial, which depends on the type and severity of the DLT; the dose resulting in this proportion is thus referred to as the maximum tolerated dose (MTD). In addition to the explicitly stated objective of determining the MTD, a Phase I cancer trial also has the implicit goal of safe treatment of the patients in the trial. However, the aims of treating patients in the trial and generating an efficient design to estimate the MTD for future patients often run counter to each other. Commonly used designs in Phase I cancer trials implicitly place their focus on the safety of the patients in the trial, beginning from a conservatively low starting dose and escalating cautiously. Escalation is further slowed by the assignment of the same dose to groups of consecutive patients, as in the widely used 3-plus-3 design, which is convenient to administer and shortens trial duration by simultaneously following patients in groups of 3. Von Hoff and Turner (1991) have documented that the overall response rates in these Phase I trials are low, and substantial numbers of patients are treated at doses that are retrospectively found to be nontherapeutic. Moreover, as pointed out by O'Quigley, Pepe, and Fisher (1990), these designs are very inefficient for estimating the MTD, which is implied by the 3-plus-3 design to correspond to the case p = 1/3. 
They proposed a Bayesian model-based design, called the “continual reassessment method” (CRM), to choose the dose levels sequentially, making use of all past data at each stage.

More than 90 new Phase I methods were published between 1991 and 2006 (Rogatko et al., 2007), and there have been several reviews of the new methods (e.g., Rosenberger and Haines, 2002). In this article, we focus on Bayesian model based designs and Section 2 describes a general framework to develop and analyze them. As shown in Sections 2 and 3, this framework allows one to incorporate the competing aims of a Phase I cancer trial by choosing the loss function accordingly. It also enables one to derive certain desirable properties of the design, such as coherence (Cheung, 2005), from the loss function, or to enforce them by using simple reformulations in this framework. Section 4 provides implementation details and gives a simulation study comparing Bayesian designs that correspond to different loss functions in the setting of a colon cancer trial considered by Babb, Rogatko, and Zacks (1998).

2. Posterior Distributions, Loss Functions, and Sequential Dose Determination

A commonly used model-based approach to Phase I cancer clinical trial design assumes the usual logistic regression model for the probability Fθ (x) of DLT at dose level x:

$$F_\theta(x) = \frac{1}{1 + e^{-(\alpha + \beta x)}}, \tag{1}$$

in which β > 0 and θ = (α, β) is unknown and to be estimated from the observed pairs (xi, yi), where yi = 1 if the ith subject, treated at dose xi, experiences DLT and yi = 0 otherwise. The frequentist approach to inference on θ uses the likelihood function and estimates θ by maximum likelihood, while the Bayesian approach assumes a prior distribution of θ and uses the posterior distribution for inference on θ.

Denote the MTD by $\eta = F_\theta^{-1}(p)$ and the posterior distribution of $\theta$ based on $(x_1, y_1), \ldots, (x_k, y_k)$ by $\Pi_k$, and let $\Pi_0$ denote the prior distribution. The Bayes estimate of $\eta$ with respect to squared error loss is the posterior mean $E_{\Pi_k}(\eta)$, and the CRM proposed by O'Quigley et al. (1990) uses this posterior mean to set the dose for the next patient, that is, $x_{k+1} = E_{\Pi_k}(\eta)$. Instead of the posterior mean, Babb et al. (1998) proposed to set $x_{k+1}$ equal to the $\omega$-quantile of the posterior distribution, where $0 < \omega < 1/2$ is chosen to be slightly less than $p$ in their examples. This design is called “escalation with overdose control” (EWOC) and $\omega$ is called the “feasibility bound.” A sequence of doses $x_n$ is called “Bayesian feasible” at level $1 - \omega$ if $P_{\Pi_{n-1}}(\eta \ge x_n) \ge 1 - \omega$ for all $n \ge 1$, and the EWOC doses are optimal Bayesian-feasible ones; see Zacks, Rogatko, and Babb (1998).
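As a minimal sketch of model (1) and the definition of the MTD, the Python snippet below evaluates the DLT probability and inverts it at the target rate $p$. The function names `dlt_prob` and `mtd` and the parameter values are illustrative assumptions, not taken from the paper.

```python
import math


def dlt_prob(alpha: float, beta: float, x: float) -> float:
    """Logistic DLT probability F_theta(x) from equation (1)."""
    return 1.0 / (1.0 + math.exp(-(alpha + beta * x)))


def mtd(alpha: float, beta: float, p: float) -> float:
    """MTD eta = F_theta^{-1}(p): the dose whose DLT probability equals p."""
    if beta <= 0 or not 0 < p < 1:
        raise ValueError("need beta > 0 and 0 < p < 1")
    logit_p = math.log(p / (1.0 - p))
    return (logit_p - alpha) / beta


# Made-up parameter values, used only for illustration.
alpha, beta, p = -6.0, 0.02, 1.0 / 3.0
eta = mtd(alpha, beta, p)
print(round(eta, 2), round(dlt_prob(alpha, beta, eta), 4))
```

Because $\beta > 0$, the curve is increasing in dose, so doses above `eta` have DLT probability above $p$ and doses below it have DLT probability below $p$.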

Note that the dose for the $n$th patient in CRM or EWOC depends only on the posterior distribution $\Pi_{n-1}$, that is, $x_n$ is a functional $f(\Pi_{n-1})$ of $\Pi_{n-1}$. This functional defines $\{\Pi_k : k \ge 0\}$ as a Markov chain whose states are distributions on the parameter space $\Theta$ and whose state transitions are given by the following.

Bayesian updating scheme

Given current state Π (which is a prior distribution of θ), let x = f(Π) and generate first θ from Π and then y ~ Bern(Fθ (x)). The new state is the posterior distribution of θ given (x, y).

The functional $x = f(\Pi)$ for CRM is $E_\Pi(\eta)$, which minimizes the expected squared error loss $E_\Pi[(\eta - x)^2]$. As pointed out by Babb et al. (1998), the symmetric nature of the squared error loss may not be appropriate for modeling the toxic response to a cancer treatment. Instead of squared error loss, EWOC with feasibility bound $\omega$ uses the functional $x = x(\Pi)$ that minimizes the asymmetric loss function $E_\Pi[\ell(\eta, x)]$, where

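$$\ell(\eta, x) = \begin{cases} \omega(\eta - x), & \text{if } x \le \eta, \\ (1 - \omega)(x - \eta), & \text{if } x > \eta. \end{cases} \tag{2}$$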
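The link between the loss (2) and the $\omega$-quantile can be checked numerically. The sketch below is an illustration under stated assumptions: the "posterior" is a made-up uniform sample of $\eta$ values, and the names `ewoc_loss` and `expected_loss` are hypothetical. It minimizes the sample average of (2) and compares the minimizer with the empirical $\omega$-quantile.

```python
import random


def ewoc_loss(eta: float, x: float, omega: float) -> float:
    """Asymmetric EWOC loss (2)."""
    return omega * (eta - x) if x <= eta else (1.0 - omega) * (x - eta)


def expected_loss(samples, x, omega):
    """Sample average of the EWOC loss at dose x."""
    return sum(ewoc_loss(eta, x, omega) for eta in samples) / len(samples)


random.seed(0)
omega = 0.25
# Stand-in draws of eta (made up; not a real posterior).
samples = sorted(random.uniform(140.0, 425.0) for _ in range(401))

# The average loss is piecewise linear in x with kinks at the sample points,
# so searching over the sample points is enough to find a minimizer.
best_x = min(samples, key=lambda x: expected_loss(samples, x, omega))
quantile = samples[int(omega * (len(samples) - 1))]
print(best_x == quantile)
```

The same reasoning in the continuous case sets the derivative $-\omega P_\Pi(\eta > x) + (1 - \omega) P_\Pi(\eta < x)$ to zero, which gives $P_\Pi(\eta < x) = \omega$.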

More generally, we can consider other loss functions $\ell(\theta, x)$ and define $x(\Pi)$ that attains $\min_x E_\Pi[\ell(\theta, x)]$. In particular, the following example gives a response-based version of EWOC.

Example 1: Inverted overdose control

The EWOC loss function (2) penalizes an overdose $x > \eta$ by the amount $(1 - \omega)(x - \eta)$, and an under-dose $x < \eta$ by the amount $\omega(\eta - x)$. However, a dose $x$ deemed “too large” on this scale may actually correspond to a probability of DLT not much larger than the target rate $p$ depending on the dose–response curve, making $x$ a relatively desirable dose. Likewise, a small value of $|x - \eta|$ may correspond to a large discrepancy between the actual DLT probability $F_\theta(x)$ and $p$. Hardwick and Stout (2001) suggest measuring the excess/deficit of the DLT rate on the “probability scale.” Taking $0 < \gamma < 1/2$, this leads to the “inverted” loss function

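$$\ell(\theta, x) = \begin{cases} \gamma(p - F_\theta(x)), & \text{if } x \le \eta, \\ (1 - \gamma)(F_\theta(x) - p), & \text{if } x > \eta. \end{cases} \tag{3}$$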
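To illustrate how the dose scale and the probability scale can disagree, the following sketch compares the two losses for one dose under a flat and a steep dose–response curve sharing the same MTD. All parameter values and function names are illustrative assumptions.

```python
import math


def dlt_prob(alpha, beta, x):
    """Logistic DLT probability from equation (1)."""
    return 1.0 / (1.0 + math.exp(-(alpha + beta * x)))


def ewoc_loss(eta, x, omega):
    """Dose-scale loss (2)."""
    return omega * (eta - x) if x <= eta else (1.0 - omega) * (x - eta)


def inverted_loss(alpha, beta, x, p, gamma):
    """Probability-scale loss (3)."""
    f = dlt_prob(alpha, beta, x)
    # With beta > 0, F is increasing, so x <= eta exactly when F(x) <= p.
    return gamma * (p - f) if f <= p else (1.0 - gamma) * (f - p)


p, gamma, eta, x = 1.0 / 3.0, 0.25, 250.0, 270.0  # made-up values
logit_p = math.log(p / (1.0 - p))
flat_beta, steep_beta = 0.005, 0.05
# Choose alpha so that both curves have the same MTD eta.
flat_loss = inverted_loss(logit_p - flat_beta * eta, flat_beta, x, p, gamma)
steep_loss = inverted_loss(logit_p - steep_beta * eta, steep_beta, x, p, gamma)
dose_scale_loss = ewoc_loss(eta, x, gamma)  # identical for both curves
print(dose_scale_loss, round(flat_loss, 3), round(steep_loss, 3))
```

The dose-scale loss is the same for both curves because it depends only on $x - \eta$, whereas the probability-scale loss is roughly ten times larger under the steep curve in this made-up example.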

Remark

The loss when the dose falls below $\eta$, measured by the difference $\omega(\eta - x)$ in (2) and $\gamma(p - F_\theta(x))$ in (3), should ideally be measured by the difference in response rates at $\eta$ and $x$, respectively, when efficacy data, taking the value 1 if the patient responds to the treatment and 0 otherwise, are also available besides the toxicity data. Note, however, that this involves bivariate efficacy-toxicity data. While many existing designs solely consider toxicity outcomes and the MTD, designs that incorporate efficacy responses as well have been proposed by a number of authors, including Li, Durham, and Flournoy (1995); Hardwick and Stout (2001); Kpamegan and Flournoy (2001); Thall and Cook (2004); Dragalin and Fedorov (2006); Dragalin, Fedorov, and Wu (2008); and Pronzato (2010). When efficacy responses are available, the minimum effective dose (MED) is of interest, that is, the lowest dose at which some desired proportion of positive efficacy responses is attained. When both efficacy and toxicity data are available, the optimal safe dose, which is the dose between the MED and the MTD maximizing the probability of simultaneous efficacy and nontoxicity, is of interest. Since this article focuses on univariate toxicity data, we consider elsewhere better alternatives to (3) for $x \le \eta$ that also require efficacy data.

Noting that the explicitly stated objective of a Phase I cancer trial is to estimate the MTD, Whitehead and Brunier (1995) considered Bayesian sequential designs that are optimal, in some sense, for this estimation problem. Haines, Perevozskaya, and Rosenberger (2003) made use of the theory of optimal design of experiments (Fedorov, 1972; Atkinson and Donev, 1992; Dette, Melas, and Pepelyshev, 2004) to construct Bayesian c- and D-optimal designs, and further imposed a relaxed Bayesian feasibility constraint on the design to avoid highly toxic doses. Optimal design theory involves a design measure $\xi$ on the dose space $X$, and a sequential design updates the empirical design measure $\xi_{n-1}$ at stage $n$ by changing it to $\xi_n$ with the addition of the dose $x_n$. The empirical measure $\xi_n$ of the doses $x_1, \ldots, x_n$ up to stage $n$ can be represented by $\xi_n = n^{-1} \sum_{i=1}^n \delta_{x_i}$, where $\delta_x$ is the probability measure degenerate at $x$. We let $\|\xi\|$ denote the number of $x_i$ (not necessarily distinct) in the support of $\xi$. Thus $\|\xi_n\| = n$ and $\|\xi_0\| = 0$, with $\xi_0$ being the zero measure on $X$. To include the construction of sequential Bayesian optimal designs as a special case of our general approach, we can modify the preceding procedure that minimizes $E_\Pi[\ell(\theta, x)]$ to choose the next dose based on the current posterior distribution $\Pi$, by including the current design measure $\xi$ in the loss function.

Example 2: Bayesian c- or D-optimal designs

As described by Haines et al. (2003), optimal design theory is concerned with choosing a design measure $\xi$ to minimize a convex function $\Psi$ of the information matrix $M(\theta, \xi) = \int I(\theta, x)\, d\xi(x)$, where $I(\theta, x)$ is the Fisher information matrix at design point $x$:

$$I(\theta, x) = \frac{e^{\alpha + \beta x}}{(1 + e^{\alpha + \beta x})^2} \begin{pmatrix} 1 & x \\ x & x^2 \end{pmatrix}.$$

The convex function $\Psi$ is associated with the optimality criterion, for example, $\Psi(M) = -\log \det(M)$ for D-optimality and $\Psi(M) = c' M^{-1} c$ for c-optimality. Since $\theta = (\alpha, \beta)$ is unknown, the frequentist approach uses a sequential design that replaces $\theta$ in $M(\theta, \xi_t)$ by its maximum likelihood estimate at every stage $t$. The Bayesian approach puts a prior distribution $\Pi_0$ on $\theta$ and minimizes $\int \Psi(M(\theta, \xi_t))\, d\Pi_0(\theta)$. Noting that this Bayesian approach does not accommodate the fact that patients are assigned doses sequentially in Phase I trials, Haines et al. (2003, Section 5) propose to start the optimal design after an initial sample of $k$ patients so that the dose $x$ of a patient after this initial sample can be determined by minimizing

$$\int \Psi\!\left(\frac{k M(\theta, \xi_k) + I(\theta, x)}{k + 1}\right) d\Pi_k(\theta), \tag{4}$$

where ξk is the empirical measure of the initial sample of design points and Πk is the posterior distribution of θ based on the initial sample.
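The ingredients of (4) can be sketched for a single fixed value of $\theta$. The snippet below computes the Fisher information of one dose, the information matrix of an empirical design measure, and the D-criterion $-\log\det(M)$. It does not perform the posterior averaging in (4); the function names and parameter values are assumptions made for illustration.

```python
import math


def fisher_info(alpha, beta, x):
    """2x2 Fisher information I(theta, x) for the logistic model (1)."""
    e = math.exp(alpha + beta * x)
    w = e / (1.0 + e) ** 2  # equals F(x) * (1 - F(x))
    return [[w, w * x], [w * x, w * x * x]]


def info_matrix(alpha, beta, doses):
    """M(theta, xi_n) for the empirical measure xi_n: the average of I(theta, x_i)."""
    n = len(doses)
    m = [[0.0, 0.0], [0.0, 0.0]]
    for x in doses:
        info = fisher_info(alpha, beta, x)
        for r in range(2):
            for c in range(2):
                m[r][c] += info[r][c] / n
    return m


def d_criterion(m):
    """Psi(M) = -log det(M); returns +inf when det(M) is not positive."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return -math.log(det) if det > 0 else math.inf


alpha, beta = -6.0, 0.02  # made-up parameter values
spread = d_criterion(info_matrix(alpha, beta, [200.0, 300.0]))
clustered = d_criterion(info_matrix(alpha, beta, [249.0, 251.0]))
print(spread < clustered)
```

For a two-point design with equal weights, $\det M = \tfrac{1}{4} w_1 w_2 (x_1 - x_2)^2$, which is why the widely spaced design scores better on the D-criterion even though such doses may be far from the MTD.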

We can easily extend our loss function approach to Bayes sequential designs by including ξ as an argument of the loss function in this setting. Let Π be the current posterior distribution of θ and ξ be the current empirical design measure. Define

$$\ell(\theta, x; \xi) = \Psi\big(M(\theta, \xi + \{x\})\big), \quad \text{where } \xi + \{x\} = \frac{\|\xi\|\, \xi + \delta_x}{\|\xi\| + 1}. \tag{5}$$

The sequential Bayes optimal design chooses the next design level $x$ that minimizes $E_\Pi \ell(\theta, x; \xi)$. The measure $\xi + \{x\}$ in (5) represents the new empirical measure obtained by adding $x$ to the support of $\xi$, with $\|\xi + \{x\}\| = \|\xi\| + 1$. We can also impose a relaxed feasibility constraint in the choice of $x$:

$$\text{Minimize } E_\Pi \ell(\theta, x; \xi) \text{ subject to } P_\Pi(\tilde\eta < x) \le \omega, \tag{6}$$

as in Haines et al. (2003), where $\tilde\eta = F_\theta^{-1}(q)$ with $q \ge p$ and $\omega$ is a prescribed positive constant. If $q = p$, then $\tilde\eta = \eta$ and the constraint corresponds to requiring the doses to be Bayesian feasible (see the description of EWOC above).

3. Coherence and Dilemma Between Individual and Collective Ethics

The preceding section has focused on determining the next dose by minimizing $E_\Pi[\ell(\theta, x)]$, where $\Pi$ is the current posterior distribution and $\ell$ is a loss function incorporating the trial's main objective into the Bayes sequential design. In Example 2, we have shown how additional information, such as the empirical measure of previous design points, can be included in the minimization problem to determine the dose. The following subsections extend this idea to address two important issues in Phase I cancer clinical trial designs.

3.1 Coherence and Its Enforcement

Motivated by ethical concerns, Cheung (2005) introduced coherence principles for sequential dose escalation or de-escalation. A dose sequence is said to be “coherent” if a higher (respectively, lower) dose is not given to the next patient when the current patient experiences (respectively, does not experience) DLT. In particular, CRM and EWOC are coherent and the following theorem, whose proof is given in the Appendix, provides conditions for the coherence of a Bayes sequential design that minimizes the posterior loss at every stage.

Theorem 1

Suppose that the dose space is a finite interval and that $\ell(\eta, x)$ is convex in $x$ for every fixed $\eta$. Assume that for fixed $x > x'$, $\ell(\eta, x) - \ell(\eta, x')$ is nonincreasing in $\eta$. Then the dose sequence $x_n = \arg\min_x E_{\Pi_{n-1}} \ell(\eta, x)$ is coherent.

Theorem 1 shows that CRM is coherent since $\ell(\eta, x) = (\eta - x)^2$ is convex and

$$\ell(\eta, x) - \ell(\eta, x') = -2\eta(x - x') + x^2 - (x')^2$$

is nonincreasing in $\eta$ for $x > x'$. The loss function (2) associated with EWOC also satisfies the assumption of Theorem 1, which therefore shows the coherence of EWOC. The loss functions in Examples 1 and 2, however, may not satisfy the assumptions of Theorem 1. Moreover, a modification of EWOC recommended by its proponents (Babb and Rogatko, 2004), in which the feasibility bound is escalated throughout the trial from a low starting value to 1/2 at the end of the trial, does not satisfy the assumptions of Theorem 1 and it indeed exhibits slight incoherence in the simulation studies in Section 4. This can be understood by noting that, toward the end of the trial, the posterior distribution does not change much from patient to patient, and that an increase in the feasibility bound may overwhelm the slight downward shift in the posterior following an outcome $y = 0$, causing a dose higher than the previous one to be assigned. Cheung (2005, p. 865) also found a certain two-stage modification of CRM to be incoherent. On the other hand, we can enforce coherence by modifying $x_n = f(\Pi_{n-1})$ into $x_n = f(\Pi_{n-1}, x_{n-1}, y_{n-1})$, where

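$$f(\Pi, x, y) = \begin{cases} \arg\min_{x' \le x} E_\Pi \ell(\eta, x'), & \text{if } y = 1, \\ \arg\min_{x' \ge x} E_\Pi \ell(\eta, x'), & \text{if } y = 0. \end{cases} \tag{7}$$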
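The restricted minimization in (7) can be sketched on a finite dose grid as follows. The function name `coherent_dose`, the grid, and the stand-in posterior expected loss are assumptions made for illustration; in practice the expected loss would come from the current posterior distribution.

```python
def coherent_dose(expected_loss, grid, prev_dose, prev_dlt):
    """Restricted arg min as in (7).

    After a DLT the search is limited to doses <= prev_dose (no escalation);
    after a non-DLT it is limited to doses >= prev_dose (no de-escalation).
    """
    if prev_dlt:
        allowed = [x for x in grid if x <= prev_dose]
    else:
        allowed = [x for x in grid if x >= prev_dose]
    if not allowed:
        raise ValueError("prev_dose lies outside the dose grid")
    return min(allowed, key=expected_loss)


# Stand-in posterior expected loss with unrestricted minimizer at 300 (made up).
loss = lambda x: (x - 300.0) ** 2
grid = [float(d) for d in range(140, 426, 5)]

print(coherent_dose(loss, grid, prev_dose=250.0, prev_dlt=True))   # stays at 250.0
print(coherent_dose(loss, grid, prev_dose=250.0, prev_dlt=False))  # moves to 300.0
```

When the unrestricted minimizer already respects the coherence restriction, the rule returns it unchanged; otherwise, for a convex expected loss, it returns the previous dose.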

3.2 Treatment of Current Patient versus Information for Future Patients

We have noted in Section 2 that CRM or EWOC treats the next patient at the dose $x$ that minimizes $E_\Pi[\ell(\theta, x)]$ for $\ell(\eta, x)$ given by $(\eta - x)^2$ or by (2), where $\Pi$ is the current posterior distribution. This is tantamount to dosing the next patient at the best guess of $\eta$, where “best” means “closest” according to some measure of distance from $\eta$. On the other hand, a Bayesian c- or D-optimal design aims at generating doses that provide most information, as measured by the Fisher information matrix of a design measure, for estimating the dose-toxicity curve to benefit future patients. To resolve this dilemma between treatment of patients in the trial and efficient experimental design for posttrial parameter estimation, Bartroff and Lai (2010) considered the finite-horizon optimization problem of choosing the dose levels $x_1, x_2, \ldots, x_n$ sequentially to minimize the “global risk”

$$E_{\Pi_0}\left[\sum_{i=1}^n h(\eta, x_i) + g(\hat\eta_n, \eta)\right], \tag{8}$$

in which $\Pi_0$ denotes the prior distribution of $\theta$, $h(\eta, x_i)$ represents the loss for the $i$th patient in the trial, $\hat\eta_n$ is the terminal estimate of the MTD, and $g$ represents a terminal loss function. The optimizing doses $x_i$ depend on $n - i$, where the horizon $n$ is the sample size of the trial, and therefore are not of the form $x_i = f(\Pi_{i-1})$ considered in Section 2. In terms of “individual” and “collective” ethics, note that (8) measures the individual effect of the dose $x_k$ on the $k$th patient through $h(\eta, x_k)$, and its collective effect on future patients through $\sum_{i>k} h(\eta, x_i) + g(\hat\eta_n, \eta)$.

By using a discounted infinite-horizon version of (8), we can still have solutions of the form $x_i = f(\Pi_{i-1})$ for some functional $f$ that only depends on $\Pi_{i-1}$. Specifically, take a discount factor $0 < \delta < 1$ and replace (8) by

$$E_{\Pi_0}\left[\sum_{i=1}^\infty h(\eta, x_i)\, \delta^{i-1}\right] \tag{9}$$

as the definition of global risk. Note that this global risk measures the individual effect of the dose $x_k$ on the $k$th patient through $h(\eta, x_k)$, and its collective effect on future patients through $\sum_{i>k} h(\eta, x_i)\, \delta^{i-k}$. This means the myopic dose $x_k$ that minimizes $E_{\Pi_{k-1}}[h(\eta, x)]$ for treating the $k$th patient has to be perturbed such that it also helps to create a more informative posterior distribution $\Pi_k$ that is used for dosing future patients. Note that (9) does not have the term $g(\hat\eta_n, \eta)$ appearing in the finite-horizon problem (8), but even without this term, the global risk (9) still captures the collective effect of the doses, as indicated above. As we have pointed out in Section 2, if $x_i$ is of the form $f(\Pi_{i-1})$ for all $i$, then $\{\Pi_k : k \ge 0\}$ is a Markov chain whose states are distributions of $\theta$ and undergo Markovian dynamics described by the updating scheme for posterior distributions. In the context of the present problem of minimizing (9), the optimal expected loss $V(\Pi)$ at state $\Pi$ (posterior distribution of $\theta$) satisfies Bellman's dynamic programming equation

$$V(\Pi) = \inf_x E_\Pi\{h(\eta, x) + \delta E_\Pi V(\Pi + \{x\})\}, \tag{10}$$

where Π+{x} is the new posterior distribution of θ after (x, y) is observed, with y ~ Bern(Fθ(x)) and θ ~ Π; see the Bayesian updating scheme in Section 2. For finite-state controlled Markov chains, iteration is a commonly used method to solve (10); see Bertsekas (2007, Section 1.3). In the present case, not only is the state space infinite, but it is also infinite-dimensional (space of all posterior distributions of θ), making dynamic programming intractable.

The main complexity of the infinite-horizon problem is that the dose $x$ for the next patient involves also consideration for future patients who will receive optimal doses themselves; these future doses depend on the future posterior distributions. A simple way to reduce the complexity is to consider two (instead of infinitely many) future patients. This amounts to choosing the next dose $x$ to minimize $E_\Pi \ell(\eta, x; \Pi)$ when the current posterior distribution of $\theta$ is $\Pi$, where

$$\ell(\eta, x; \Pi) = h(\eta, x) + \lambda E_\Pi\{E_{\Pi'}[h(\eta', x') \mid x_1 = x, y_1]\}, \tag{11}$$

in which $\eta' = F_{\theta'}^{-1}(p)$ with $\theta' \sim \Pi'$, and $\Pi'$ and $x'$ are defined below. The first summand in (11) measures the (toxicity) effect of the dose $x$ on the patient receiving it. The second summand considers the patient who follows and receives a myopic dose $x'$ that minimizes the patient's posterior loss; the myopic dose is optimal because there are no more patients involved in (11). The effect of $x$ on this second patient is through the posterior distribution $\Pi'$ that updates $\Pi$ after observing $(x_1, y_1)$, with $x_1 = x$. Since $y_1$ is not yet observed, the expectation outside the curly brackets is taken over $y_1 \sim \mathrm{Bern}(F_\theta(x))$, with $\theta \sim \Pi$. For example, when implemented with $h(\eta, x)$ given by the EWOC loss function (2), this proposal can be viewed as a modification of EWOC since it utilizes its loss function but adds an additional term to represent the effect on future patients.

Unlike 0 < δ < 1 in the discounted infinite-horizon problem, the choice of λ > 0 in (11) can exceed 1 and reflects the balance between the collective ethics in generating information for future patients and the individual ethics for the patient receiving the dose. Although we use here a single patient to represent all patients following the one receiving the next dose, because the posterior distributions also change successively, the doses are functionals of these posterior distributions.
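A rough sketch of the one-step look-ahead rule based on (11) is given below, with $h$ taken to be the EWOC loss (2). It is an illustration under several assumptions that are not part of the design itself: the posterior is represented by weights on a coarse grid of $(\rho, \eta)$ points (using the parametrization of Section 4.1), doses are restricted to a finite grid, and all function names and grid sizes are hypothetical.

```python
import math

P, X_MIN, X_MAX, OMEGA = 1.0 / 3.0, 140.0, 425.0, 0.25


def dlt_prob(x, rho, eta):
    """F_theta(x) in the (rho, eta) parametrization of Section 4.1."""
    lr, lp = math.log(1.0 / rho - 1.0), math.log(1.0 / P - 1.0)
    g = ((x - eta) * lr - (x - X_MIN) * lp) / (eta - X_MIN)
    return 1.0 / (1.0 + math.exp(-g))


def h(eta, x):
    """EWOC loss (2), used as the individual loss."""
    return OMEGA * (eta - x) if x <= eta else (1.0 - OMEGA) * (x - eta)


def normalize(w):
    s = sum(w)
    return [v / s for v in w]


def update(points, w, x, y):
    """Posterior weights after observing outcome y at dose x."""
    like = [dlt_prob(x, r, e) if y == 1 else 1.0 - dlt_prob(x, r, e) for r, e in points]
    return normalize([wi * li for wi, li in zip(w, like)])


def exp_h(points, w, x):
    """Posterior expected individual loss at dose x."""
    return sum(wi * h(e, x) for wi, (_, e) in zip(w, points))


def myopic_dose(points, w, doses):
    return min(doses, key=lambda x: exp_h(points, w, x))


def lookahead_risk(points, w, x, doses, lam):
    """Posterior expectation of (11) at dose x."""
    q = sum(wi * dlt_prob(x, r, e) for wi, (r, e) in zip(w, points))  # P(y = 1 | x)
    future = 0.0
    for y, prob_y in ((1, q), (0, 1.0 - q)):
        w_new = update(points, w, x, y)
        x_next = myopic_dose(points, w_new, doses)  # next patient's myopic dose
        future += prob_y * exp_h(points, w_new, x_next)
    return exp_h(points, w, x) + lam * future


def lookahead_dose(points, w, doses, lam):
    return min(doses, key=lambda x: lookahead_risk(points, w, x, doses, lam))


# Coarse midpoint grid with uniform prior weights; the grid sizes are arbitrary.
points = [((i + 0.5) * P / 10, X_MIN + (j + 0.5) * (X_MAX - X_MIN) / 30)
          for i in range(10) for j in range(30)]
prior = normalize([1.0] * len(points))
doses = [X_MIN + k * (X_MAX - X_MIN) / 19 for k in range(20)]
print(myopic_dose(points, prior, doses), lookahead_dose(points, prior, doses, 0.4))
```

Setting `lam` to zero recovers the myopic dose, which gives a simple sanity check on an implementation of this kind.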

4. Implementation and a Simulation Study

In this section we first describe three main components in the implementation of the above Bayesian sequential designs and then evaluate their performance in a simulation study.

4.1 Updating the Posterior Distribution

Letting η denote the MTD and ρ = Fθ (xmin), we follow Babb et al. (1998) to transform (α, β) in (1) to (ρ, η) via the formulas

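$$\alpha = \frac{x_{\min} \log(p^{-1} - 1) - \eta \log(\rho^{-1} - 1)}{\eta - x_{\min}}, \qquad \beta = \frac{\log(\rho^{-1} - 1) - \log(p^{-1} - 1)}{\eta - x_{\min}},$$

and therefore

$$\alpha + \beta x = \frac{(x - \eta) \log(\rho^{-1} - 1) - (x - x_{\min}) \log(p^{-1} - 1)}{\eta - x_{\min}} = G(x, \rho, \eta).$$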
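The reparametrization can be checked numerically. The sketch below maps $(\rho, \eta)$ to $(\alpha, \beta)$ and confirms that $F_\theta(x_{\min}) = \rho$, $F_\theta(\eta) = p$, and $\alpha + \beta x = G(x, \rho, \eta)$. The values of $\rho$ and $\eta$ are those of the Freq2 setting in Table 1; the function names are assumptions.

```python
import math


def to_alpha_beta(rho, eta, p, x_min):
    """Map (rho, eta) to the logistic parameters (alpha, beta)."""
    lp = math.log(1.0 / p - 1.0)
    lr = math.log(1.0 / rho - 1.0)
    alpha = (x_min * lp - eta * lr) / (eta - x_min)
    beta = (lr - lp) / (eta - x_min)
    return alpha, beta


def g_func(x, rho, eta, p, x_min):
    """G(x, rho, eta), which equals alpha + beta * x."""
    lp = math.log(1.0 / p - 1.0)
    lr = math.log(1.0 / rho - 1.0)
    return ((x - eta) * lr - (x - x_min) * lp) / (eta - x_min)


def dlt_prob(alpha, beta, x):
    return 1.0 / (1.0 + math.exp(-(alpha + beta * x)))


rho, eta, p, x_min = 0.19, 269.1, 1.0 / 3.0, 140.0
alpha, beta = to_alpha_beta(rho, eta, p, x_min)
print(round(dlt_prob(alpha, beta, x_min), 4), round(dlt_prob(alpha, beta, eta), 4))
```

Note that $\beta > 0$ exactly when $\rho < p$, which is consistent with restricting $\rho$ to $[0, p]$.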

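We assume that the joint prior distribution of $(\rho, \eta)$ has density $\pi(\rho, \eta)$ with support on $[0, p] \times [x_{\min}, x_{\max}]$. Therefore the $\mathcal{F}_{k-1}$-posterior density of $(\rho, \eta)$, that is, the posterior density given the first $k - 1$ dose–response pairs, is

$$\pi_{k-1}(\rho, \eta) = C \prod_{i=1}^{k-1} \left[\frac{1}{1 + e^{-G(x_i, \rho, \eta)}}\right]^{y_i} \times \left[\frac{1}{1 + e^{G(x_i, \rho, \eta)}}\right]^{1 - y_i} \pi(\rho, \eta), \tag{12}$$

where

$$C^{-1} = \int_{x_{\min}}^{x_{\max}} \int_0^p \prod_{i=1}^{k-1} \left[\frac{1}{1 + e^{-G(x_i, \rho, \eta)}}\right]^{y_i} \times \left[\frac{1}{1 + e^{G(x_i, \rho, \eta)}}\right]^{1 - y_i} \pi(\rho, \eta)\, d\rho\, d\eta.$$

The marginal $\mathcal{F}_{k-1}$-posterior density of $\eta$ is then $\int_0^p \pi_{k-1}(\rho, \eta)\, d\rho$, and the CRM and EWOC doses based on $\mathcal{F}_{k-1}$ are the mean and $\omega$-quantile of this distribution, respectively.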
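The posterior (12) and the resulting CRM and EWOC doses can be approximated on a grid. The sketch below uses a simple midpoint rule rather than the Gaussian quadrature mentioned in the next subsection, assumes a uniform prior, and runs on a made-up data set; the function names and grid sizes are illustrative assumptions.

```python
import math

P, X_MIN, X_MAX = 1.0 / 3.0, 140.0, 425.0  # setting of Section 4.4


def g_lin(x, rho, eta):
    """G(x, rho, eta) = alpha + beta * x in the (rho, eta) parametrization."""
    lr = math.log(1.0 / rho - 1.0)
    lp = math.log(1.0 / P - 1.0)
    return ((x - eta) * lr - (x - X_MIN) * lp) / (eta - X_MIN)


def likelihood(data, rho, eta):
    """Product over (dose, outcome) pairs appearing in (12)."""
    like = 1.0
    for x, y in data:
        f = 1.0 / (1.0 + math.exp(-g_lin(x, rho, eta)))
        like *= f if y == 1 else 1.0 - f
    return like


def eta_marginal(data, n_rho=60, n_eta=285):
    """Midpoint-grid approximation to the marginal posterior of eta (uniform prior)."""
    rhos = [(i + 0.5) * P / n_rho for i in range(n_rho)]
    etas = [X_MIN + (j + 0.5) * (X_MAX - X_MIN) / n_eta for j in range(n_eta)]
    weights = [sum(likelihood(data, r, e) for r in rhos) for e in etas]
    total = sum(weights)
    return etas, [w / total for w in weights]


def crm_dose(etas, probs):
    """Posterior mean of eta."""
    return sum(e * q for e, q in zip(etas, probs))


def ewoc_dose(etas, probs, omega):
    """Smallest grid value whose cumulative posterior probability reaches omega."""
    cum = 0.0
    for e, q in zip(etas, probs):
        cum += q
        if cum >= omega:
            return e
    return etas[-1]


# Made-up (dose, DLT indicator) data, for illustration only.
data = [(140, 0), (180, 0), (220, 0), (260, 1), (240, 0), (250, 1)]
etas, probs = eta_marginal(data)
crm, ewoc = crm_dose(etas, probs), ewoc_dose(etas, probs, omega=0.25)
print(round(crm, 1), round(ewoc, 1))
```

With $\omega = .25$ the EWOC dose sits in the lower quartile of the posterior and is therefore more conservative than the posterior mean used by CRM in this example.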

4.2 Computation of $E_\Pi \ell(\eta, x)$ and its Minimizer in Sections 2 and 3.1

The integrals in (12) can be evaluated by using a numerical double-integration routine involving Gaussian quadrature in MATLAB. This can be used to evaluate $E_\Pi \ell(\eta, x)$ for a posterior distribution $\Pi$. We can find the minimum of $E_\Pi \ell(\eta, x)$ over $x$ by a grid search in $[x_{\min}, x_{\max}]$, or by using gradient descent if $\ell$ is smooth. For computation of the constrained Bayesian optimal design (6), a constrained nonlinear optimization routine in MATLAB can be used in conjunction with numerical integration, as outlined in Haines et al. (2003, p. 593).

4.3 Minimization of $E_\Pi \ell(\eta, x; \Pi)$ in Section 3.2

While MCMC or rejection sampling can be used to compute (11) for any candidate dose x, importance sampling (e.g., Robert and Casella, 2004, Chapter 3.3) is a simple, robust alternative that takes advantage of the fact that just an expectation with respect to the posterior distribution is needed. Letting Π0 denote the uniform distribution of the transformed coordinates (ρ, η) over [0, p] × [xmin, xmax], we have

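$$E_\Pi \ell(\eta, x; \Pi) \approx B^{-1} \sum_{b=1}^B \ell(\eta_b, x; \Pi)\, \frac{\pi(\rho_b, \eta_b)}{\pi_0(\rho_b, \eta_b)} \tag{13}$$

for large $B$, where $(\rho_b, \eta_b)$, $b = 1, \ldots, B$, are i.i.d. and generated from $\Pi_0$. Letting $\Pi + \{x, y\}$ denote the posterior distribution obtained from $\Pi$ by including $(x, y)$ and letting $x' = x'(\Pi + \{x, y\})$, the nested expectation in (11) can be similarly approximated by using

$$E_{\Pi'}[h(\eta', x') \mid x_1 = x, y] = E_{\Pi + \{x, y\}} h(\eta', x') \approx B^{-1} \sum_{b=1}^B h(\eta'_b, x')\, \frac{\pi_{+\{x, y\}}(\rho'_b, \eta'_b)}{\pi_0(\rho'_b, \eta'_b)}, \tag{14}$$

$$P_\Pi(y = 1 \mid x) = \int F_\theta(x)\, d\Pi(\theta) \approx B^{-1} \sum_{b=1}^B F_{\theta''_b}(x)\, \frac{\pi(\rho''_b, \eta''_b)}{\pi_0(\rho''_b, \eta''_b)}, \tag{15}$$

where $(\rho'_b, \eta'_b), (\rho''_b, \eta''_b) \sim \Pi_0$, $\theta''_b = \theta(\rho''_b, \eta''_b)$, and $\pi_{+\{x, y\}}$ denotes the density of $\Pi + \{x, y\}$. Let $H_\Pi(x, y)$ and $Q_\Pi(x)$ denote the right-hand sides of (14) and (15), respectively. Combining (13)–(15) gives

$$E_\Pi \ell(\eta, x; \Pi) \approx B^{-1} \sum_{b=1}^B \Big\{h(\eta_b, x) + \lambda \big[H_\Pi(x, 0)(1 - Q_\Pi(x)) + H_\Pi(x, 1) Q_\Pi(x)\big]\Big\}\, \frac{\pi(\rho_b, \eta_b)}{\pi_0(\rho_b, \eta_b)}. \tag{16}$$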

We can minimize the right-hand side of (16) over x ∈ [xmin, xmax] by using a bounded minimization routine in MATLAB.
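The importance-sampling idea can be sketched in a few lines. Note one deliberate difference from (13): the snippet uses a self-normalized (ratio) estimator, so the posterior normalizing constant $C$ in (12) never has to be computed. This is a substitution made for simplicity, not the procedure described above. The uniform prior, the data, the sample size, and the function names are all assumptions.

```python
import math
import random

P, X_MIN, X_MAX = 1.0 / 3.0, 140.0, 425.0


def likelihood(data, rho, eta):
    """Likelihood of (dose, outcome) pairs at (rho, eta), using G from Section 4.1."""
    lr, lp = math.log(1.0 / rho - 1.0), math.log(1.0 / P - 1.0)
    like = 1.0
    for x, y in data:
        g = ((x - eta) * lr - (x - X_MIN) * lp) / (eta - X_MIN)
        f = 1.0 / (1.0 + math.exp(-g))
        like *= f if y == 1 else 1.0 - f
    return like


def posterior_mean(func, data, B=20000, seed=0):
    """Self-normalized importance-sampling estimate of E_Pi[func(rho, eta)].

    Draws come from the uniform proposal Pi_0; under a uniform prior the
    importance weight pi / pi_0 is proportional to the likelihood.
    """
    rng = random.Random(seed)
    num = den = 0.0
    for _ in range(B):
        rho = P * (1.0 - rng.random())                    # in (0, p]
        eta = X_MAX - (X_MAX - X_MIN) * rng.random()      # in (x_min, x_max]
        w = likelihood(data, rho, eta)
        num += w * func(rho, eta)
        den += w
    return num / den


prior_mean = posterior_mean(lambda r, e: e, [])
post_mean = posterior_mean(lambda r, e: e, [(200, 1), (220, 1)])  # made-up data
print(round(prior_mean, 1), round(post_mean, 1))
```

With no data the weights are all equal and the estimate is close to the prior mean of $\eta$, which is 282.5 for the uniform distribution on $[140, 425]$; two DLTs at low doses pull the estimate down.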

4.4 Simulation Study

To compare the proposed procedure in Section 3.2 to EWOC, CRM, and the inverted overdose control (IVOC) design in Example 1, a simulation study was performed in the setting of the trial to determine the MTD of the antimetabolite 5-fluorouracil (5-FU) for treating solid tumors in the colon, as described in Babb et al. (1998). Based on previous studies of 5-FU, a dose of 140 mg/m² of 5-FU was believed to be safe, and the MTD was believed to be no greater than 425 mg/m²; thus the dose space was taken to be the interval $[x_{\min}, x_{\max}] = [140, 425]$. The two-parameter logistic model (1) was chosen based on previous experience with the agent, and the uniform distribution over $[0, p] \times [x_{\min}, x_{\max}]$ was chosen as the prior distribution $\Pi_0$ for $(\rho, \eta)$, with $p = 1/3$. The feasibility bound of $\omega = .25$ was chosen, which was also used here for the IVOC weight $\gamma$ in (3). In a trial of length $n = 24$, Table 1 compares EWOC that uses a linearly escalated feasibility bound (Babb and Rogatko, 2004), denoted by EWOC*, with IVOC, CRM, and the proposed design in Section 3.2 with $h$ in (11) given by the EWOC loss function (and denoted by EWOC+, in which + signifies an additional future patient considered by (11)), for two different values of the weight $\lambda$ in (11). Each entry in the table was calculated from 10,000 simulated trials. The first set of rows is a Bayesian setting in which, for each replication, a pair $(\rho, \eta)$ is drawn from $\Pi_0$, and the next three sets of rows are frequentist settings (denoted Freq1, Freq2, Freq3) where the true values $(\rho, \eta)$ are set at fixed values for all 10,000 replications; these three pairs of fixed values were drawn from $\Pi_0$. A comprehensive comparison of EWOC, CRM, sequential c-optimal, constrained D-optimal, ADP and other designs has been given by Bartroff and Lai (2010), who use approximate dynamic programming (ADP) to minimize the finite-horizon risk (8).

Table 1.

Risk1, Risk2, bias, and RMSE of the final MTD estimate, DLT rate, MTD overdose rate (OD), excess DLT rate $E[F_\theta(x) - p]_+$ (OD*), and coherence violation rate (ChV), with SEs in parentheses, of various designs

| Statistic | EWOC* | IVOC | CRM | EWOC+, λ = .1 | EWOC+, λ = .4 |
|---|---|---|---|---|---|
| Bayesian: (ρ, η) ~ Π0 | | | | | |
| Risk1 | 485.5 (3.6) | 723.2 (3.6) | 986.1 (45.9) | 469.5 (3.0) | 454.8 (2.8) |
| Risk2 | 1.03 (0.01) | 1.44 (0.02) | 1.54 (0.02) | 0.85 (0.01) | 0.73 (0.007) |
| Bias | −9.67 (0.6) | −50.2 (0.8) | 22.3 (1.6) | 2.42 (0.7) | −5.9 (0.6) |
| RMSE | 61.5 (0.7) | 75.8 (0.2) | 157.3 (0.8) | 66.9 (0.2) | 58.6 (0.7) |
| DLT (%) | 33.5 (0.001) | 26.6 (9 × 10⁻⁴) | 39.1 (0.001) | 29.3 (0.0009) | 29.1 (9 × 10⁻⁴) |
| OD (%) | 37.4 (0.001) | 17.5 (8 × 10⁻⁴) | 55.6 (0.001) | 29.6 (9 × 10⁻⁴) | 27.0 (9 × 10⁻⁴) |
| OD* | 0.043 (2 × 10⁻⁴) | 0.043 (3 × 10⁻⁴) | 0.078 (3 × 10⁻⁴) | 0.029 (2 × 10⁻⁴) | 0.021 (1 × 10⁻⁴) |
| ChV (%) | 0 (0) | 0.6 (0.04) | 0 (0) | 15.0 (8 × 10⁻⁴) | 14.4 (8 × 10⁻⁴) |
| Freq1: ρ = .07, η = 403.9 | | | | | |
| Risk1 | 585.6 (1.3) | 1302.2 (0.8) | 368.8 (1.1) | 175.9 (1.0) | 170.3 (1.1) |
| Risk2 | 0.78 (0.002) | 1.40 (6 × 10⁻⁴) | 0.53 (0.001) | 0.29 (0.001) | 0.25 (0.001) |
| Bias | −51.1 (0.2) | −151.7 (0.2) | −30.1 (0.4) | −23.2 (0.3) | −48.7 (0.2) |
| RMSE | 58.8 (0.8) | 156.1 (0.6) | 44.9 (0.1) | 49.0 (0.5) | 19.2 (0.4) |
| DLT (%) | 20.6 (8 × 10⁻⁴) | 10.3 (7 × 10⁻⁴) | 24.9 (9 × 10⁻⁴) | 18.7 (8 × 10⁻⁴) | 18.6 (8 × 10⁻⁴) |
| OD (%) | 0 (0) | 0 (0) | 2.3 (3 × 10⁻⁴) | 1.2 (2 × 10⁻⁴) | 1.1 (2 × 10⁻⁴) |
| OD* | 0 (0) | 0 (0) | 4 × 10⁻⁴ (1 × 10⁻⁵) | 0.001 (1 × 10⁻⁵) | 0.001 (1 × 10⁻⁵) |
| ChV (%) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| Freq2: ρ = .19, η = 269.1 | | | | | |
| Risk1 | 402.8 (0.8) | 423.1 (0.9) | 313.7 | 275.0 (3.7) | 266.2 (3.8) |
| Risk2 | 0.53 (0.003) | 0.49 (0.001) | 0.99 (0.006) | 0.43 (0.001) | 0.26 (0.001) |
| Bias | 15.0 (0.5) | −28.9 (0.1) | 32.5 (0.4) | 17.3 (0.7) | 12.3 (0.4) |
| RMSE | 47.5 (0.2) | 32.0 (0.7) | 52.1 (0.7) | 38.2 (0.3) | 25.2 (0.3) |
| DLT (%) | 32.6 (0.001) | 25.4 (9 × 10⁻⁴) | 38.2 (0.001) | 28.2 (9 × 10⁻⁴) | 24.8 (9 × 10⁻⁴) |
| OD (%) | 32.8 (0.001) | 0 (0) | 83.1 (8 × 10⁻⁴) | 27.1 (9 × 10⁻⁴) | 26.9 (9 × 10⁻⁴) |
| OD* | 0.02 (7 × 10⁻⁵) | 0 (0) | 0.053 (8 × 10⁻⁵) | 0.03 (4 × 10⁻⁵) | 0.01 (4 × 10⁻⁵) |
| ChV (%) | 4.3 (4 × 10⁻⁴) | 3.1 (4 × 10⁻⁴) | 0 (0) | 17.6 (8 × 10⁻⁴) | 12.7 (7 × 10⁻⁴) |
| Freq3: ρ = .30, η = 226.7 | | | | | |
| Risk1 | 674.3 (4.9) | 232.1 (2.9) | 1444.9 (6.3) | 158.9 (4.9) | 146.1 (3.5) |
| Risk2 | 0.27 (0.002) | 0.09 (4 × 10⁻⁴) | 0.59 (0.003) | 0.05 (4 × 10⁻⁴) | 0.05 (6 × 10⁻⁴) |
| Bias | 42.9 (0.62) | 10.3 (0.14) | 81.4 (0.5) | 54.9 (0.54) | 47.4 (0.49) |
| RMSE | 74.2 (0.7) | 17.4 (0.5) | 96.8 (0.8) | 18.0 (0.5) | 13.2 (0.2) |
| DLT (%) | 34.2 (0.001) | 32.7 (0.001) | 36.6 (0.001) | 35.8 (0.001) | 35.7 (0.001) |
| OD (%) | 73.2 (0.001) | 54.3 (0.001) | 93.5 (5 × 10⁻⁴) | 88.8 (6 × 10⁻⁴) | 81.0 (6 × 10⁻⁴) |
| OD* | 0.014 (3 × 10⁻⁵) | 0.002 (2 × 10⁻⁵) | 0.032 (4 × 10⁻⁵) | 0.031 (4 × 10⁻⁵) | 0.030 (4 × 10⁻⁵) |
| ChV (%) | 0 (0) | 3.9 (5 × 10⁻⁴) | 0 (0) | 10.0 (6 × 10⁻⁴) | 9.6 (6 × 10⁻⁴) |

Table 1 reports two different risk measures. Since the length of the trial is fixed at $n = 24$ in the simulation study, Risk1 is the finite-horizon analog of (9), which is (8) with $h$ given by the EWOC loss function (2) and no terminal loss (i.e., $g = 0$), and Risk2 is the same risk function but with $h$ given by the “inverted” loss function (3). Also reported are the bias and RMSE (root mean squared error $\{E(\hat\eta_n - \eta)^2\}^{1/2}$) of the terminal MTD estimate $\hat\eta_n$ (which is the mean of the terminal posterior distribution of $\eta$), the DLT rate $P(y = 1)$ (denoted DLT), the overdose rate $P(x > \eta)$ (denoted OD), the excess DLT rate $E[F_\theta(x) - p]_+$ (denoted OD*), and the coherence violation rate (denoted ChV)

$$(n - 1)^{-1} \sum_{i=1}^{n-1} \{P(y_i = 0, x_{i+1} < x_i) + P(y_i = 1, x_{i+1} > x_i)\}.$$

In these expressions, P and E denote the probability and expectation, respectively, with respect to the prior distribution in the Bayesian setting, or with respect to the appropriate fixed values of (ρ, η) in the frequentist settings, and are computed by Monte Carlo.
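For a single simulated trial, the empirical counterpart of the coherence violation rate can be computed as in the sketch below; averaging it over many simulated trials would give a Monte Carlo estimate of ChV. The function name and the example dose sequence are made up for illustration.

```python
def coherence_violation_rate(doses, outcomes):
    """Fraction of the n - 1 transitions that violate coherence.

    A violation is a de-escalation right after a non-DLT (y = 0)
    or an escalation right after a DLT (y = 1).
    """
    n = len(doses)
    if n < 2 or len(outcomes) != n:
        raise ValueError("need at least two doses and one outcome per dose")
    violations = 0
    for i in range(n - 1):
        if outcomes[i] == 0 and doses[i + 1] < doses[i]:
            violations += 1
        elif outcomes[i] == 1 and doses[i + 1] > doses[i]:
            violations += 1
    return violations / (n - 1)


# Made-up trial: the 2nd, 3rd, and 4th transitions are incoherent.
doses = [140, 160, 150, 170, 180]
outcomes = [0, 0, 1, 1, 0]
print(coherence_violation_rate(doses, outcomes))  # 0.75
```

Repeating the previous dose never counts as a violation under this definition, since both inequalities are strict.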

In terms of Risk1 and Risk2, EWOC+ performs better in the Bayesian setting than the myopic designs EWOC*, IVOC, and CRM, in that order. This occurs in the frequentist settings as well, although the ordering of the myopic designs varies depending on the particular parameter values. Even though it myopically minimizes the posterior risk at every stage, IVOC performs poorly in terms of the cumulative risk, Risk2, in the Bayesian setting. A possible explanation is that its loss function (3) is a function of $F_\theta(x)$, whose posterior distribution (induced by the posterior distribution of $\theta$) has relatively large variance toward the middle of the interval $(0, 1)$ in which $F_\theta(x)$ takes values and, in particular, near $p = 1/3$, resulting in low initial doses observed in the simulations. On the other hand, in the Freq3 setting, where $\eta$ is relatively small and the dose–response curve is relatively flat (e.g., $\rho$ large), IVOC performs well in terms of the risks. In terms of estimation, EWOC+ with $\lambda = .4$ has the smallest RMSE, with EWOC* and EWOC+ with $\lambda = .1$ both comparable in the Bayesian setting. Moreover, EWOC+ with $\lambda = .4$ has uniformly the smallest RMSE in the frequentist settings, with IVOC comparable to it in Freq2 and IVOC and EWOC+ with $\lambda = .1$ comparable to it in Freq3.

5. Conclusion and Discussion

In this article, we present a general formulation of Bayesian sequential design of Phase I cancer trials. This formulation enables us to prove a general coherence result in Theorem 1 applicable to any design that can be defined as the minimizer of the posterior risk when the loss function satisfies some mild conditions. Although the theorem is proved for the widely used logistic regression model (1), the last paragraph of its proof in the Appendix shows that it is applicable to any dose-response model that is nonincreasing in the MTD, such as the model $F_\theta(x) = \{(\tanh x + 1)/2\}^\theta$, which is also popular.

In Section 3.2, we propose a new design that incorporates both the individual ethics of the current patient being administered the dose, through a given loss function such as the EWOC loss (2), and the collective ethics of all future patients by including an additional term in the overall loss function to represent the dose's information content for determining another dose for the next patient. The simulation study in Section 4.4 shows that this new design is indeed an improvement over myopic designs in terms of global risk minimization, posttrial estimation of the MTD, and DLT and OD rates. This design provides a practical alternative to the optimal design associated with the intractable Markov decision problem of minimizing (9), which requires at each stage the daunting consideration of all future posterior distributions and calculating their associated optimal doses. For the finite-horizon problem of minimizing (8), Bartroff and Lai (2010) have developed an approximate solution that is a time-varying mixture of myopic and c-optimal designs. The new design in Section 3.2, which can be described by a time-invariant functional of the posterior distribution at each stage, is substantially simpler computationally and provides substantial improvement over the myopic designs. We conjecture that with suitably chosen λ (depending on δ), its global risk (9) can approximate that of the optimal design minimizing (9). Instead of minimizing (9) directly, it may be possible to obtain a good lower bound for (9). Such a bound, which can provide a benchmark for assessing the proposed design, is a topic for future work.

We also consider an “inverted” loss function (3), which measures deviation from the target DLT rate p on the probability scale rather than on the dose scale, and the associated myopic design IVOC. Even though IVOC minimizes the myopic posterior expected loss (3) at each stage, its cumulative global loss in the Bayesian setting (Risk2 in Table 1) is far from optimal; it exceeds even that of EWOC, which uses a completely different loss function. On the other hand, the design proposed in Section 3.2 can be applied with the IVOC loss function (3) to yield a substantially improved design IVOC+.

Acknowledgements

This work was supported in part by National Science Foundation grants DMS-0907241 at the University of Southern California and DMS-0805879 at Stanford University. The authors thank the associate editor and two referees for their helpful comments.

Appendix

Proof of Theorem 1

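We prove coherence in de-escalation; the proof for escalation is similar. Let $x_{\min} < x_{\max}$ be the boundaries of the dose space, which is assumed to be a finite interval. For fixed $\eta$, since $\ell(\eta, x)$ is a convex function of $x$, its right derivative $\ell_x(\eta, x)$ with respect to $x$ is nondecreasing for $x_{\min} \le x < x_{\max}$, and the same is also true for the left derivative for $x_{\min} < x \le x_{\max}$. Moreover, the left and right derivatives are equal and continuous except for at most countably many points; see Rockafellar (1970, pages 214, 228, 244). Let $x_\Pi = \arg\min_x E_\Pi \ell(\eta, x)$, let $\tilde\Pi$ be the posterior distribution obtained from $\Pi$ and the additional dose–response pair $(x, y) = (x_\Pi, 1)$, and let $L(x) = E_{\tilde\Pi} \ell(\eta, x)$. Since $\ell(\eta, x)$ is convex in $x$ for every $\eta$, so is $L(x)$; moreover, its right derivative is given by $\dot L_+(x) = E_{\tilde\Pi} \ell_x(\eta, x)$. To show that $x_{\tilde\Pi} \le x_\Pi$, we will assume that $x_\Pi < x_{\max}$ because the case $x_\Pi = x_{\max}$ is trivial. It suffices to show that $\dot L_+(x_\Pi) \ge 0$ because $L$ is convex and has minimizer $x_{\tilde\Pi}$. Since $E_\Pi \ell_x(\eta, x_\Pi) \ge 0$ and $d\tilde\Pi(\theta) = F_\theta(x_\Pi)\, d\Pi(\theta) \big/ \int F_\theta(x_\Pi)\, d\Pi(\theta)$, recalling that $(x, y) = (x_\Pi, 1)$, it follows that

$$\dot L_+(x_\Pi) \ge E_{\tilde\Pi} \ell_x(\eta, x_\Pi) - E_\Pi \ell_x(\eta, x_\Pi) = \frac{\int \ell_x(\eta, x_\Pi)\, F_\theta(x_\Pi)\, d\Pi(\theta)}{\int F_\theta(x_\Pi)\, d\Pi(\theta)} - \frac{\int \ell_x(\eta, x_\Pi)\, d\Pi(\theta)}{\int d\Pi(\theta)} = \frac{A}{\int F_\theta(x_\Pi)\, d\Pi(\theta)}, \tag{17}$$

where $A = \iint \ell_x(\eta, x_\Pi)\, [F_\theta(x_\Pi) - F_{\theta'}(x_\Pi)]\, d\Pi(\theta)\, d\Pi(\theta')$. A change of variables also yields $A = \iint \ell_x(\eta', x_\Pi)\, [F_{\theta'}(x_\Pi) - F_\theta(x_\Pi)]\, d\Pi(\theta)\, d\Pi(\theta')$. Hence

$$2A = \iint [\ell_x(\eta, x_\Pi) - \ell_x(\eta', x_\Pi)] \times [F_\theta(x_\Pi) - F_{\theta'}(x_\Pi)]\, d\Pi(\theta)\, d\Pi(\theta') \ge 0, \tag{18}$$

in which the inequality follows from

$$[\ell_x(\eta, x_\Pi) - \ell_x(\eta', x_\Pi)]\, [F_\theta(x_\Pi) - F_{\theta'}(x_\Pi)] \ge 0 \tag{19}$$

for all $x$, $\theta$ and $\theta'$, as will be shown below. Combining (17) and (18) yields $\dot L_+(x_\Pi) \ge 0$, completing the proof of the theorem.

From the assumption that $\ell(\eta, x) - \ell(\eta, x')$ is nonincreasing in $\eta$ for any $x > x'$, it follows that $\ell_x(\eta, x)$ is nonincreasing in $\eta$ for fixed $x$. It therefore suffices for the proof of (19) to show that $F_\theta(x)$ is nonincreasing in $\eta = F_\theta^{-1}(p)$. Since $p^{-1} = 1/F_\theta(\eta) = 1 + e^{-(\alpha + \beta\eta)}$, $F_\theta(x) = 1/[1 + \exp\{\log(p^{-1} - 1) + \beta\eta - \beta x\}]$, which is nonincreasing in $\eta$ since $\beta > 0$.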
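As a numerical sanity check of the argument above, the following sketch verifies its two ingredients on a toy example: that the reparameterized logistic curve $F_\theta(x)$ decreases in $\eta$, and that a DLT at the current dose does not raise the next dose. It rests on assumptions that are not in the article: the slope $\beta$ is treated as known so that $\theta$ is indexed by $\eta$ alone, the prior is a uniform grid over $\eta$, the dose space is taken to contain that grid, and squared-error loss $(x - \eta)^2$ stands in for the loss function (it is convex in $x$, and its difference between doses $x > x'$ is nonincreasing in $\eta$, so it meets the conditions used in the proof). All function names and numerical values are hypothetical.

```python
import math

P_TARGET = 0.25   # hypothetical target DLT probability p
BETA = 1.0        # assumed known slope (beta > 0); not part of the article's setup


def dlt_prob(x, eta, p=P_TARGET, beta=BETA):
    """F_theta(x) written in terms of the MTD eta, as in the last step of the proof."""
    return 1.0 / (1.0 + math.exp(math.log(1.0 / p - 1.0) + beta * eta - beta * x))


def next_dose(weights, etas):
    """Minimizer of posterior expected squared-error loss: the posterior mean of eta."""
    return sum(w * e for w, e in zip(weights, etas)) / sum(weights)


def update(weights, etas, x, y):
    """Bayes update of the grid weights after response y (1 = DLT, 0 = no DLT) at dose x."""
    likelihood = [dlt_prob(x, e) if y == 1 else 1.0 - dlt_prob(x, e) for e in etas]
    return [w * lik for w, lik in zip(weights, likelihood)]


etas = [0.5 * k for k in range(1, 13)]   # hypothetical grid of candidate MTDs
prior = [1.0] * len(etas)                # uniform (unnormalized) prior weights

# Ingredient 1: at a fixed dose, F_theta(x) does not increase as eta increases.
probs = [dlt_prob(3.0, e) for e in etas]
is_nonincreasing = all(a >= b for a, b in zip(probs, probs[1:]))

# Ingredient 2: the next dose moves down after a DLT and up after a non-DLT.
x_now = next_dose(prior, etas)
x_after_dlt = next_dose(update(prior, etas, x_now, 1), etas)
x_after_no_dlt = next_dose(update(prior, etas, x_now, 0), etas)

print(f"F decreasing in eta: {is_nonincreasing}")
print(f"current dose {x_now:.3f}")
print(f"after DLT    {x_after_dlt:.3f} (coherent de-escalation: {x_after_dlt <= x_now})")
print(f"after no DLT {x_after_no_dlt:.3f} (coherent escalation: {x_after_no_dlt >= x_now})")
```

With the slope fixed, the likelihood of a DLT is a decreasing function of $\eta$, so reweighting the prior by it can only pull the posterior mean of $\eta$ downward; this is the same mechanism that inequality (18) captures for a general loss.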

Contributor Information

Jay Bartroff, Department of Mathematics, University of Southern California, 3620 South Vermont Avenue, KAP 108, Los Angeles, California 90089, U.S.A. bartroff@usc.edu.

Tze Leung Lai, Department of Statistics, Sequoia Hall, Stanford University, Stanford, California 94305, U.S.A. lait@stat.stanford.edu.

References

  1. Atkinson AC, Donev AN. Optimum Experimental Designs. Oxford University Press; Oxford: 1992.
  2. Babb J, Rogatko A. Bayesian methods for cancer phase I clinical trials. In: Geller NL, editor. Advances in Clinical Trial Biostatistics. Marcel Dekker; New York: 2004. pp. 1–40.
  3. Babb J, Rogatko A, Zacks S. Cancer phase I clinical trials: Efficient dose escalation with overdose control. Statistics in Medicine. 1998;17:1103–1120. doi: 10.1002/(sici)1097-0258(19980530)17:10<1103::aid-sim793>3.0.co;2-9.
  4. Bartroff J, Lai TL. Approximate dynamic programming and its applications to the design of phase I cancer trials. To appear in Statistical Science. 2010;25
  5. Bertsekas DP. Dynamic Programming and Optimal Control. 3rd edition, Vol. 2. Athena Scientific; Belmont, Massachusetts: 2007.
  6. Cheung YK. Coherence principles in dose-finding studies. Biometrika. 2005;92:863–873.
  7. Dette H, Melas VB, Pepelyshev A. Optimal designs for a class of nonlinear regression models. The Annals of Statistics. 2004;32:2142–2167.
  8. Dragalin V, Fedorov V. Adaptive designs for dose-finding based on efficacy-toxicity response. Journal of Statistical Planning and Inference. 2006;136:1800–1823.
  9. Dragalin V, Fedorov V, Wu Y. Adaptive designs for selecting drug combinations based on efficacy-toxicity response. Journal of Statistical Planning and Inference. 2008;138:352–373.
  10. Fedorov VV. Theory of Optimal Experiments. Academic Press; New York: 1972.
  11. Haines LM, Perevozskaya I, Rosenberger WF. Bayesian optimal design for phase I clinical trials. Biometrics. 2003;59:591–600. doi: 10.1111/1541-0420.00069.
  12. Hardwick J, Stout QF. Optimizing a unimodal response function for binary variables. In: Atkinson A, Bogacka B, Zhigljavsky A, editors. Optimum Design 2000. Kluwer Academic Publishers; Dordrecht: 2001. pp. 195–210.
  13. Kpamegan EE, Flournoy N. An optimizing up-and-down design. In: Atkinson A, Bogacka B, Zhigljavsky A, editors. Optimum Design 2000. Kluwer Academic Publishers; Dordrecht: 2001. pp. 211–224.
  14. Li Z, Durham SD, Flournoy N. An adaptive design for maximization of a contingent binary response. In: Flournoy N, Rosenberger WF, editors. Adaptive Designs. Institute of Mathematical Statistics; Beachwood, Ohio: 1995. pp. 179–196.
  15. O'Quigley J, Pepe M, Fisher L. Continual reassessment method: A practical design for phase I clinical trials in cancer. Biometrics. 1990;46:33–48.
  16. Pronzato L. Penalized optimal designs for dose-finding. Journal of Statistical Planning and Inference. 2010;140:283–296.
  17. Robert CP, Casella G. Monte Carlo Statistical Methods. 2nd edition. Springer-Verlag; New York: 2004.
  18. Rockafellar RT. Convex Analysis. Princeton Mathematical Series, No. 28. Princeton University Press; Princeton, NJ: 1970.
  19. Rogatko A, Schoeneck D, Jonas W, Tighiouart M, Khuri F, Porter A. Translation of innovative designs into phase I trials. Journal of Clinical Oncology. 2007;25:4982–4986. doi: 10.1200/JCO.2007.12.1012.
  20. Rosenberger WF, Haines LM. Competing designs for phase I clinical trials: A review. Statistics in Medicine. 2002;21:2757–2770. doi: 10.1002/sim.1229.
  21. Thall PF, Cook JD. Dose-finding based on efficacy-toxicity trade-offs. Biometrics. 2004;60:684–693. doi: 10.1111/j.0006-341X.2004.00218.x.
  22. Von Hoff D, Turner J. Response rates, duration of response, and dose response effects in phase I studies of antineoplastics. Investigational New Drugs. 1991;9:115–122. doi: 10.1007/BF00194562.
  23. Whitehead J, Brunier H. Bayesian decision procedures for dose determining experiments. Statistics in Medicine. 1995;14:885–893. doi: 10.1002/sim.4780140904.
  24. Zacks S, Rogatko A, Babb J. Optimal Bayesian-feasible dose escalation for cancer phase I trials. Statistics and Probability Letters. 1998;38:215–220.
