The Randomized CRM: An Approach to Overcoming the Long-Memory Property of the CRM

Joseph S Koopmeiners; Andrew Wey

doi:10.1080/10543406.2017.1293076

Author manuscript; available in PMC: 2018 Mar 25.

Published in final edited form as: J Biopharm Stat. 2017 Mar 25;27(6):1028–1042. doi: 10.1080/10543406.2017.1293076

PMCID: PMC5581285 NIHMSID: NIHMS890952 PMID: 28340333

Abstract

The primary objective of a phase I clinical trial is to determine the maximum tolerated dose (MTD). Typically, the MTD is identified using a dose-escalation study, where initial subjects are treated at the lowest dose level and subsequent subjects are treated at progressively higher dose levels until the MTD is identified. The continual reassessment method (CRM) is a popular model-based dose-escalation design, which utilizes a formal model for the relationship between dose and toxicity to guide dose-finding. Recently, it was shown that the CRM has a tendency to get “stuck” on a dose-level, with little escalation or de-escalation in the late stages of the trial, due to the long-memory property of the CRM. We propose the randomized CRM (rCRM), which introduces random escalation and de-escalation into the standard CRM dose-finding algorithm, as well as a hybrid approach that incorporates escalation and de-escalation only when certain criteria are met. Our simulation results show that both the rCRM and the hybrid approach reduce the trial-to-trial variability in the number of cohorts treated at the MTD but that the hybrid approach has a more favorable trade-off with respect to the average number treated at the MTD.

Keywords: Phase I clinical trial, Dose Finding, Continual Reassessment Method, Long-memory property

1 Introduction

The primary objective of phase I clinical trials is to evaluate the safety of a novel therapeutic agent and identify the maximum tolerated dose (MTD). The MTD is defined as the maximum dose with probability of dose limiting toxicity (DLT) less than some pre-specified threshold (typically 0.33 in phase I oncology trials). The MTD is identified in a phase I clinical trial using a dose escalation study, where initial subjects are treated at the lowest dose level and subsequent subjects are treated at progressively higher dose levels until the MTD is identified. A wide variety of dose escalation designs for identifying the MTD have been discussed in the statistical literature.

A commonly used dose escalation design in Phase I oncology trials is the 3+3 design (Storer, 1989), which utilizes a simple algorithm for dose escalation where subjects are assigned a dose based on the outcomes for the previous subject or cohort. An alternate approach is to specify a fully parametric model for the dose-toxicity relationship and base dose-finding on the estimated dose-toxicity curve. Parameter estimation is typically completed under the Bayesian paradigm but other approaches have also been considered. The most common example of this type of design is the continual reassessment method (CRM) (O’Quigley et al., 1990), which treats subjects at the current estimate of the MTD under some restrictions described by Goodman et al. (1995) when escalating. Other examples include escalation with overdose control (EWOC) (Babb et al., 1998) and the EffTox design (Thall and Cook, 2004), which considers the trade-off between toxicity and efficacy in phase I.

Oron and Hoff (2013) investigated the long-memory property of several phase I clinical trial designs, including the CRM. They pointed out that, in the CRM and other fully-parametric designs, the relative amount of information provided by each additional cohort diminishes as the trial proceeds and, as a result, the CRM has a tendency to get “stuck” on a dose-level after only a few cohorts with little escalation or de-escalation late in the trial. As a result, the variability in the number of subjects treated at the MTD is higher for the CRM than for other designs, and the CRM only achieves a high average number of subjects treated at the MTD by averaging over realizations of the trial where a large number of subjects are treated at the MTD and realizations of the trial where a small number of subjects (or no subjects at all) are treated at the MTD. Oron and Hoff (2013) argue that this issue represents a limitation of the CRM and should be considered when evaluating the CRM and other model-based designs.

The arguments of Oron and Hoff (2013) are not without controversy. In commentaries, several authors (Carlin et al., 2013; Cheung, 2013) argued that the CRM’s high probability of correctly identifying the MTD (Iasonos et al., 2008) outweighs the long-memory property of the CRM and the large number of subjects treated at the true MTD when the true MTD is correctly identified early in the trial represents a strength, not a weakness, of the design. Nevertheless, given the issues raised by Oron and Hoff (2013), further investigation of approaches to overcome the long-memory property of the CRM and a thorough evaluation of the associated trade-offs for other key operating characteristics is warranted.

In this manuscript, we propose the randomized CRM (rCRM) and discuss how it can be used to overcome the long-memory property of the CRM. In the CRM, initial cohorts are treated at the lowest dose level and subsequent cohorts are treated at the current estimate of the MTD under some restrictions when escalating. For the rCRM, we alter the standard CRM dose-finding algorithm to allow random escalation or de-escalation in the event that the current estimate of the MTD is the same as the estimate of the MTD before the previous cohort was enrolled. We investigate two schemes for random escalation/de-escalation where the probability of escalation/de-escalation is proportional to the posterior probability that each dose is the MTD. This is a natural approach to overcoming the long-memory property of the CRM and has the added benefit of acknowledging the uncertainty in our estimate of the MTD at any point in the trial. Finally, we investigate a hybrid approach that only incorporates random escalation/de-escalation when several consecutive cohorts are treated at the same dose but there is still uncertainty as to the true MTD. Our simulation results illustrate that the two rCRM algorithms, as well as the hybrid algorithm, decrease the trial-to-trial variability in the CRM but the hybrid approach retains more of the positive characteristics of the CRM than the two rCRM algorithms.

The remainder of this manuscript proceeds as follows. In Section 2, we provide a brief overview of the CRM and introduce the rCRM as an approach to overcoming the long-memory property of the CRM. In Section 3, we present simulation results evaluating the operating characteristics of the rCRM and illustrate how the rCRM decreases the trial-to-trial variability in the number of subjects treated at the MTD compared to the standard CRM. We then present and evaluate the hybrid approach in Section 4 and conclude with a brief discussion in Section 5.

2 Study Design

In this section, we provide a brief overview of the CRM and then proceed to introduce the rCRM. Let X be the primary outcome, DLT, which is a Bernoulli random variable with probability θ (d) at dose level d. The CRM is a model-based method and requires that a model be specified for the relationship between dose and the probability of DLT. A simple, one-parameter model that is commonly used with the CRM is the power model:

P (DLT | dose = d) = p_{d}^{e^{α}},

where p₁, …, p_D take values between 0 and 1 and are monotonically increasing from p₁ to p_D. (p₁, …, p_D) is known as the skeleton and must be specified before the study begins. The dose-response relationship for the probability of DLT is controlled by the parameter α and the probability of DLT for all dose levels decreases as α increases. Other, more flexible models can also be used for the association between dose and the probability of DLT, such as a two-parameter logistic model or curve-free methods for modeling the association between dose and the probability of DLT (Gasparini and Eisele, 2000). For the purposes of this manuscript, we will proceed assuming the power model for the relationship between dose level and the probability of DLT but note that changing the model for the probability of DLT would not alter the proposed dose-finding algorithm.
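As a quick numerical illustration of the power model, the following Python sketch evaluates p_d^{exp(α)} for a few values of α. The function name `power_model_dlt` is ours, the skeleton is the one used later in the simulation study of Section 3, and the α values are arbitrary choices for illustration.

```python
import math

def power_model_dlt(skeleton, alpha):
    """Return P(DLT | dose = d) = p_d ** exp(alpha) for each skeleton value p_d."""
    exponent = math.exp(alpha)
    return [p ** exponent for p in skeleton]

# Skeleton from the simulation study; the alpha values below are illustrative only.
skeleton = [0.05, 0.10, 0.20, 0.35, 0.55]

for alpha in (-0.5, 0.0, 0.5):
    curve = power_model_dlt(skeleton, alpha)
    print(f"alpha = {alpha:+.1f}:", [round(v, 3) for v in curve])
```

At α = 0 the curve equals the skeleton, and larger α lowers the probability of DLT at every dose level.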

Let θ* be the pre-specified target probability of DLT that is used to identify the MTD. Typical values for θ* are 0.20 or 0.33. In the context of the trial, the current estimate of the MTD is the dose level with estimated probability of DLT closest to the target probability of DLT, regardless of direction. The CRM begins by treating the first cohort of (typically three) subjects at the lowest dose level. Outcomes for the first cohort are used to update the posterior for α and estimate the MTD. The trial terminates if the posterior probability that θ (1) (the probability of DLT at the lowest dose level) exceeds θ* is greater than some pre-specified threshold, π. That is, the trial terminates if,

P (θ (1) > θ^{*} | \vec{X}) > π .    (1)

Typical values for π are 0.90 or 0.95 and π is chosen to achieve the desired operating characteristics for the trial. Otherwise, the next cohort of subjects are treated at the dose with estimated probability of DLT closest to the target under the restriction that no dose levels may be skipped when escalating. This continues until the maximum sample size has been reached. The current estimate of the MTD at study completion is declared the MTD.
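The updating and stopping steps can be sketched in Python as follows. This is a minimal illustration, not the implementation used for the simulations in this manuscript (which relied on JAGS): it approximates the posterior for α on a grid, assumes the N(0, 2²) prior from Section 3, and uses the posterior mean of the DLT probability as the point estimate at each dose. The function names, the grid limits and the example data are all our own assumptions.

```python
import math

def grid_posterior(skeleton, doses, dlts, prior_sd=2.0, lo=-6.0, hi=6.0, n=2401):
    """Normalized posterior weights for alpha on an evenly spaced grid.

    doses: 1-based dose level given to each subject; dlts: matching 0/1 outcomes.
    """
    alphas = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    log_post = []
    for a in alphas:
        lp = -0.5 * (a / prior_sd) ** 2  # normal prior, up to a constant
        for d, x in zip(doses, dlts):
            log_theta = math.exp(a) * math.log(skeleton[d - 1])  # log of p_d ** exp(a)
            lp += log_theta if x else math.log1p(-math.exp(log_theta))
        log_post.append(lp)
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return alphas, [w / total for w in weights]

def crm_summary(skeleton, doses, dlts, target):
    """Return (estimated DLT probabilities, estimated MTD, P(theta(1) > target | data))."""
    alphas, weights = grid_posterior(skeleton, doses, dlts)
    est = [
        sum(w * math.exp(math.exp(a) * math.log(p)) for a, w in zip(alphas, weights))
        for p in skeleton
    ]
    # Estimated MTD: dose whose estimated DLT probability is closest to the target.
    mtd = min(range(len(skeleton)), key=lambda k: abs(est[k] - target)) + 1
    # theta(1) = p_1 ** exp(alpha) exceeds the target exactly when alpha < cut.
    cut = math.log(math.log(target) / math.log(skeleton[0]))
    p_overtoxic = sum(w for a, w in zip(alphas, weights) if a < cut)
    return est, mtd, p_overtoxic

skeleton = [0.05, 0.10, 0.20, 0.35, 0.55]
# Hypothetical first cohort: three subjects at dose 1, one DLT.
est, mtd, p_over = crm_summary(skeleton, doses=[1, 1, 1], dlts=[0, 0, 1], target=0.30)
print("estimated DLT probabilities:", [round(v, 3) for v in est])
print("estimated MTD:", mtd, " P(theta(1) > target | data):", round(p_over, 3))
```

The trial would stop if the last quantity exceeded π; otherwise the next cohort would be assigned the estimated MTD, subject to the no-skipping restriction.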

2.1 The Randomized CRM

A limitation to the standard CRM dose-finding procedure is that it has a tendency to get “stuck” on a dose-level early in the trial with very little escalation or de-escalation in the later stages of the trial (Oron and Hoff, 2013). As a result, the standard CRM exhibits substantial variability in the number of subjects treated at the true MTD with a large number of subjects treated at the true MTD in some trials but only a small number of subjects treated at the true MTD in others. We will overcome this problem by introducing random escalation or de-escalation to situations where the standard CRM has a tendency to get “stuck” on a dose-level.

The rCRM dose-finding algorithm starts in the same manner as the standard CRM dose-finding algorithm. The first cohort of subjects are treated at the lowest dose level and outcomes for the first cohort are used to estimate the posterior for α and estimate the MTD. The trial terminates if there is strong evidence that the lowest dose level is excessively toxic. Otherwise, the next cohort is enrolled at the current estimate of the MTD under the restriction that dose levels may not be skipped when escalating. The difference between the rCRM and the CRM is that the dose will be randomly escalated or de-escalated if the new cohort is to be treated at the same dose-level as the previous cohort. We discuss two approaches to random escalation or de-escalation below. This process continues until the maximum sample size has been reached and the current estimate of the MTD at study completion is declared the MTD.

We now discuss two approaches to random escalation or de-escalation when the new cohort is to be treated at the same dose-level as the previous cohort. Let d^j be the dose level assigned to the jth cohort, d^max be the maximum dose level tried in the first j cohorts and d* be the estimate of the MTD after the outcomes are observed for the first j cohorts. In the first randomization scheme (rCRM1), if d* = d^j, the next cohort is randomly assigned to d^j − 1, d^j or d^j + 1 with the following probabilities:

P (d^{j + 1} = d^{j} + i) = \frac{P (MTD = d^{j} + i | \vec{x})}{\sum_{k = - 1}^{1} P (MTD = d^{j} + k | \vec{x})}

for i = −1,0,1, where x⃗ are the outcomes for subjects in the first j cohorts. Here, P (MTD = d^j + i|x⃗) is the posterior probability that dose level d^j + i is the true MTD conditional on all available data and we are escalating, de-escalating or staying at the current dose-level with probability proportional to the posterior probability that the three dose-levels are truly the MTD. If d^j is the lowest or highest dose level, then P (MTD = d^j − 1|x⃗) or P (MTD = d^j + 1|x⃗), respectively, are not defined. These probabilities can be treated as being equal to 0 when determining the probability of escalating or de-escalating and the rCRM1 proceeds with no other changes.
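The rCRM1 probabilities can be sketched in Python as follows. We assume that posterior draws of α are available (for example, from an MCMC sampler) and estimate P (MTD = d | x⃗) as the fraction of draws for which dose d has DLT probability closest to the target. The function names and the four example draws are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math
import random

def posterior_mtd_probs(skeleton, alpha_draws, target):
    """Estimate P(MTD = d | data) as the share of alpha draws for which dose d is closest to target."""
    counts = [0] * len(skeleton)
    for a in alpha_draws:
        curve = [p ** math.exp(a) for p in skeleton]
        best = min(range(len(curve)), key=lambda k: abs(curve[k] - target))
        counts[best] += 1
    return [c / len(alpha_draws) for c in counts]

def rcrm1_probs(post_mtd, current):
    """Randomization probabilities over current - 1, current, current + 1 (1-based doses).

    Neighbours outside the dose range are dropped, i.e. given weight 0.
    """
    candidates = [d for d in (current - 1, current, current + 1) if 1 <= d <= len(post_mtd)]
    total = sum(post_mtd[d - 1] for d in candidates)
    return {d: post_mtd[d - 1] / total for d in candidates}

def draw_dose(probs, rng=random):
    """Draw one dose level according to the given {dose: probability} mapping."""
    doses = list(probs)
    return rng.choices(doses, weights=[probs[d] for d in doses], k=1)[0]

skeleton = [0.05, 0.10, 0.20, 0.35, 0.55]
alpha_draws = [0.0, 0.0, -0.3, 1.0]  # hypothetical posterior draws
post_mtd = posterior_mtd_probs(skeleton, alpha_draws, target=0.30)
probs = rcrm1_probs(post_mtd, current=4)
print(post_mtd, probs, draw_dose(probs))
```

In practice many more draws would be used; with only four, the estimated posterior places mass 0.25, 0.5 and 0.25 on doses 3, 4 and 5.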

The rCRM1 only allows for escalating or de-escalating one dose-level at a time. An alternate randomization scheme (rCRM2) would be to randomize among all possible dose-levels under the restriction that untried dose-levels cannot be skipped. That is, if d* = d^j, we randomize the next cohort among dose-levels {1, …, d^max + 1}, with the following probabilities:

P (d^{j + 1} = i) = \frac{P (MTD = i | \vec{x})}{\sum_{k = 1}^{d^{max} + 1} P (MTD = k | \vec{x})}

for i = 1, …, d^max + 1. In this case, if d^max is equal to the maximum dose level, D, we are randomizing subjects to all dose levels with randomization probabilities proportional to the posterior probability that each dose is the MTD.
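A corresponding sketch for the rCRM2 takes the vector of posterior probabilities P (MTD = k | x⃗) as given and renormalizes over dose levels 1 through d^max + 1. The function name and the posterior values below are hypothetical, and capping the range at the highest dose level D is our reading of the case d^max = D.

```python
import random

def rcrm2_probs(post_mtd, d_max):
    """Randomization probabilities over doses 1..d_max + 1 (capped at the top dose).

    post_mtd[k - 1] = P(MTD = dose k | data); d_max = highest dose tried so far (1-based).
    """
    top = min(d_max + 1, len(post_mtd))  # untried dose levels may not be skipped
    total = sum(post_mtd[:top])
    return {d: post_mtd[d - 1] / total for d in range(1, top + 1)}

post_mtd = [0.10, 0.30, 0.40, 0.15, 0.05]  # hypothetical posterior, not from the simulations
probs = rcrm2_probs(post_mtd, d_max=3)
next_dose = random.choices(list(probs), weights=list(probs.values()), k=1)[0]
print(probs, next_dose)
```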

There are two primary advantages to this approach. First, we eliminate the possibility that the dose-finding algorithm gets “stuck” on a dose level by introducing random escalation and de-escalation. Regardless of how many consecutive cohorts have been treated at a single dose level, there will always be some probability of escalating or de-escalating, which makes it unlikely that, say, ten consecutive cohorts would all be treated at the same dose level. Second, the randomized CRM dose-finding algorithm more accurately reflects the uncertainty in our estimate of the MTD. In the standard CRM, subjects are assigned to the current estimate, which is based on only a single summary of the posterior distribution (mean, median, etc.). In contrast, the randomized CRM accounts for the entire posterior distribution and differentiates between precise estimates of the MTD, where there will be little chance of random escalation or de-escalation, and highly variable estimates of the MTD, where there will be a much higher chance of random escalation or de-escalation. As the trial continues and more data are collected, our estimate of the MTD should become more precise and the probability of random escalation or de-escalation should decrease.

In conclusion, our proposed dose finding algorithm is as follows:

1. Treat the first cohort of m patients at the lowest dose level.
2. Update the posterior distribution for α after outcomes are observed for the first cohort.
3. The trial terminates if the lowest dose level has unacceptable toxicity as determined by Equation 1. Otherwise, the next cohort will be treated at the current estimate of the MTD.
4. If this does not result in escalation or de-escalation, the next cohort will be randomly assigned a dose using either the rCRM1 or rCRM2 randomization scheme.
5. The trial continues until termination or until the maximum sample size is reached. The estimated MTD at study completion is declared the MTD.
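The decision made after each cohort can be sketched as a single Python function. This is only an outline of the control flow: `randomize` stands for whichever randomization scheme is in use and is supplied by the caller, and applying the no-skipping restriction as min(d*, d^max + 1) is our interpretation of the restriction on escalation.

```python
def next_dose(prev_dose, d_max, mtd_estimate, p_overtoxic, randomize, pi=0.90):
    """Return the dose for the next cohort, or None if the trial stops for toxicity.

    randomize(prev_dose, d_max) is a caller-supplied function implementing
    rCRM1 or rCRM2 and returning a dose level.
    """
    if p_overtoxic > pi:
        return None  # stopping rule of Equation 1
    proposed = min(mtd_estimate, d_max + 1)  # do not skip untried dose levels
    if proposed != prev_dose:
        return proposed  # ordinary CRM escalation or de-escalation
    return randomize(prev_dose, d_max)  # same dose again: hand over to the randomizer

# Stand-in randomizer that always stays at the current dose (for illustration only).
stay = lambda prev_dose, d_max: prev_dose
print(next_dose(prev_dose=1, d_max=1, mtd_estimate=4, p_overtoxic=0.05, randomize=stay))  # 2
print(next_dose(prev_dose=3, d_max=3, mtd_estimate=3, p_overtoxic=0.05, randomize=stay))  # 3
print(next_dose(prev_dose=1, d_max=1, mtd_estimate=1, p_overtoxic=0.95, randomize=stay))  # None
```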

3 Simulation Study

We completed a small simulation study to evaluate the operating characteristics of the rCRM. Data were simulated from six hypothetical scenarios (described below) to evaluate the performance of the rCRM in a variety of settings. For each scenario, we simulated 1000 trials using the CRM, rCRM1 and rCRM2. We considered trials with a maximum sample size of 21 or 30 subjects and cohorts of three subjects. The target probability of DLT, θ*, was set equal to 0.30 and trials were terminated for overtoxicity if the posterior probability that θ (1) > θ* exceeded π = 0.90. The association between dose and the probability of DLT was modeled using the power model with a skeleton of (0.05, 0.10, 0.20, 0.35, 0.55). A normal prior with a mean of 0 and standard deviation of 2 was used for α. Simulations were completed in R (R Core Team, 2013) and the posterior for α was calculated using JAGS called from R using rjags (Plummer, 2013).

Figure 1 presents the dose-toxicity relationship for the six scenarios considered in our simulation study. In Scenarios 1 – 3, dose 4 is the MTD, and in Scenarios 4 – 6, dose 2 is the MTD. The dose-toxicity relationships for Scenarios 1 and 4 correspond to the case where the power model skeleton is correct with the power parameter, α, equal to 1.15 for Scenario 1 and 0.52 for Scenario 4, which were set to achieve the desired MTD. The remaining scenarios correspond to logistic regression models set to achieve the desired MTD (dose 4 for Scenarios 2 and 3 and dose 2 for Scenarios 5 and 6) with slopes that are either steeper (Scenarios 2 and 5) or more gradual (Scenarios 3 and 6) than the power model skeleton used for fitting the dose-toxicity model.
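The dose-toxicity curves for Scenarios 1 and 4 can be checked numerically. The sketch below reads the reported values 1.15 and 0.52 as the exponent applied to the skeleton (that is, e^α); this reading is an assumption on our part, adopted because it places doses 4 and 2, respectively, closest to the target of 0.30. The helper names are hypothetical.

```python
skeleton = [0.05, 0.10, 0.20, 0.35, 0.55]
target = 0.30

def true_curve(exponent):
    """True DLT probabilities when the skeleton is raised to the given exponent."""
    return [p ** exponent for p in skeleton]

def closest_dose(curve):
    """1-based dose level whose DLT probability is closest to the target."""
    return min(range(len(curve)), key=lambda k: abs(curve[k] - target)) + 1

for label, exponent in (("Scenario 1", 1.15), ("Scenario 4", 0.52)):
    curve = true_curve(exponent)
    print(label, [round(v, 3) for v in curve], "-> dose", closest_dose(curve))
```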

Typically, phase I study designs are evaluated by considering the probability of correctly identifying the MTD, the average number of subjects treated at the MTD and the average number of DLTs observed in the course of the trial. Oron and Hoff (2013) note that the long-memory property of the CRM results in substantial variability in the number of subjects treated at the true MTD and argue that a broader set of summaries should be considered, including the distribution of the number of subjects treated at the MTD across realizations of the trial. Our objective is to overcome the long-memory property of the CRM and, as a result, decrease the variability in the number of subjects treated at the MTD. Therefore, we will evaluate the rCRM by presenting histograms of the number of cohorts treated at the MTD and compare designs by a measure of the variability in the number of subjects treated at the MTD, specifically, the standard deviation of the number of subjects treated at the MTD.

Ideally, the rCRM will decrease the standard deviation of the number of subjects treated at the MTD but we realize that there is “no free lunch” and, therefore, we will evaluate the cost of reducing the trial-to-trial variability on other operating characteristics of the study. With this in mind, we consider the following trial operating characteristics: selection probability of each dose (including the probability of correctly identifying the MTD), average number of DLTs, average number of subjects underdosed, average number of subjects overdosed, probability of a trial treating no subjects at the MTD and the probability of a nearly optimal trial. A nearly optimal trial was defined as a trial that came within one cohort of treating the maximum possible number of cohorts at the MTD. For example, with a maximum sample size of 21 subjects (7 cohorts) and an MTD at dose level 2, at most six of the seven cohorts can be treated at the MTD. In this case, a nearly optimal trial would include the cases where either 5 or 6 cohorts were treated at the MTD.
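The definition of a nearly optimal trial can be written as a short Python check. The formula for the maximum number of cohorts at the MTD (one cohort is spent at each dose level below the MTD) is our generalization of the example above, and the function names are hypothetical.

```python
def max_cohorts_at_mtd(n_cohorts, mtd):
    """Most cohorts that can be treated at the MTD when the trial starts at dose 1
    and no dose level may be skipped (assumed generalization of the example)."""
    return n_cohorts - (mtd - 1)

def is_nearly_optimal(assigned, mtd, n_cohorts):
    """True if the trial is within one cohort of the maximum possible number at the MTD.

    assigned: 1-based dose level of each treated cohort; n_cohorts: planned number of cohorts.
    """
    at_mtd = sum(1 for d in assigned if d == mtd)
    return at_mtd >= max_cohorts_at_mtd(n_cohorts, mtd) - 1

# Seven cohorts (21 subjects), MTD at dose level 2: at most six cohorts at the MTD.
print(max_cohorts_at_mtd(7, 2))                           # 6
print(is_nearly_optimal([1, 2, 2, 2, 2, 2, 3], 2, 7))     # True (5 cohorts at the MTD)
print(is_nearly_optimal([1, 2, 2, 2, 2, 3, 3], 2, 7))     # False (4 cohorts at the MTD)
```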

Finally, we also evaluate the bias and MSE of the estimated MTD. Our simulation study was completed using five dose-levels but without a specific dosage assigned to each dose-level. For the purposes of calculating bias and MSE, we assume an additive dosing scheme, where dose-levels 1 through 5 correspond to dosages of 20, 40, 60, 80 and 100.
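A minimal Python sketch of the bias and MSE calculation under the additive dosing scheme is given below. It assumes that bias is the average difference between the dosage of the declared MTD and the dosage of the true MTD across simulated trials, and that only trials declaring an MTD are included; both are assumptions made for illustration, and the selected dose levels in the example are invented.

```python
# Additive dosing scheme: dose levels 1 through 5 map to dosages 20 through 100.
DOSAGES = {1: 20, 2: 40, 3: 60, 4: 80, 5: 100}

def bias_and_mse(selected_levels, true_mtd):
    """Bias and MSE of the declared MTD on the dosage scale.

    selected_levels: declared MTD (1-based dose level) from each simulated trial.
    """
    errors = [DOSAGES[s] - DOSAGES[true_mtd] for s in selected_levels]
    bias = sum(errors) / len(errors)
    mse = sum(e * e for e in errors) / len(errors)
    return bias, mse

# Four hypothetical trials with a true MTD at dose level 4.
print(bias_and_mse([4, 4, 3, 5], true_mtd=4))  # (0.0, 200.0)
```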

In addition to the standard CRM algorithm and the two rCRM algorithms, we also present operating characteristics for the CRM if the trial were to terminate once the trial is “stuck” on a dose (CRM-ET). We define the trial to be “stuck” on a dose if three consecutive cohorts are treated at the same dose level. That is, if cohorts j − 2 and j − 1 were treated at dose level i and the current estimate of the MTD remains dose level i, the trial terminates and dose level i is declared the MTD. This allows us to evaluate the gain associated with using the maximum sample size compared to terminating the trial early once it is “stuck” on a dose level.

Histograms, along with the mean and standard deviation, of the number of cohorts treated at the MTD for a maximum sample size of 21 subjects are presented in Figure 2 for Scenarios 1 through 3 (dose level 4 is the true MTD) and Figure 3 for Scenarios 4 through 6 (dose level 2 is the true MTD). Also presented are results for a hybrid approach that will be discussed in Section 4. We see that the rCRM achieves the primary objective of decreasing the standard deviation in the number of subjects treated at the MTD. Specifically, we observe a 20 to 30% decrease in the standard deviation compared to the standard CRM algorithm with the largest decrease observed in Scenarios 2 and 5, where the true dose-toxicity curve is steeper than the underlying skeleton, and the smallest reduction observed for Scenarios 3 and 6, where the true dose-toxicity curve is more gradual than the underlying skeleton. Reducing the trial-to-trial variability did, though, come at the cost of a 5 to 10% reduction in the average number of subjects treated at the MTD.

Figure 2. Histogram of the number of cohorts treated at the MTD for Scenarios 1 through 3 and a maximum sample size of N = 21 for each of the six dose-finding algorithms. Also presented are the mean and standard deviation of the number of cohorts treated at the MTD.

Figure 3. Histogram of the number of cohorts treated at the MTD for Scenarios 4 through 6 and a maximum sample size of N = 21 for each of the six dose-finding algorithms. Also presented are the mean and standard deviation of the number of cohorts treated at the MTD.

Tables 1 and 2 present other operating characteristics for the six scenarios. Results for the hybrid approach are again reported and will be discussed in Section 4. There is little difference between the two rCRM algorithms and the standard CRM algorithm in the probability of correctly identifying the MTD. The rCRM is slightly more likely to correctly identify the MTD than the standard CRM algorithm in Scenarios 1 and 3, whereas the CRM does slightly better in Scenarios 2 and 4, but, in general, both algorithms perform similarly in their ability to correctly identify the MTD. With respect to key safety outcomes, the rCRM does result in small increases in the number of subjects treated above the MTD and the average number of DLTs, although the overall impact is small, with no more than a 0.2 DLT increase in the average number of DLTs observed across all scenarios. Finally, the rCRM resulted in a decrease in the probability of treating 0 subjects at the MTD but also decreased the probability of observing a nearly optimal trial. We note, however, that the decrease in the probability of achieving a nearly optimal trial was larger, in general, than the decrease in the probability of treating 0 patients at the MTD, which explains the previously mentioned decrease in the average number of subjects treated at the MTD.

Table 1.

Simulation results evaluating the operating characteristics of the six dose-finding algorithms for Scenarios 1 through 3 and a maximum sample size of N = 21. Presented are the selection probability for each dose, the probability of early termination for excess toxicity (Overtoxic), the average number of DLTs, the average number of subjects under-dosed, the average number of subjects over-dosed, the probability that 0 subjects will be treated at the MTD and the probability of a nearly optimal trial. 1000 simulations were completed for each scenario.

Scenario 1	Dose 1	Dose 2	Dose 3	Dose 4	Dose 5	Overtoxic	Avg # DLTs	Avg # Under	Avg # Over	Prob 0 at MTD	Prob Nearly Opt. Trial

CRM	0	0.02	0.23	0.57	0.19	0	4.3	12.3	2.3	0.13	0.44
rCRM1	0	0.01	0.23	0.59	0.17	0	4.4	12.4	2.6	0.09	0.31
rCRM2	0	0.01	0.23	0.6	0.16	0	4.4	12.4	2.6	0.08	0.3
CRM - ET	0.01	0.02	0.28	0.5	0.19	0	3.5	11	1.9	0.2	0.16
Hybrid - co 3	0	0.02	0.23	0.58	0.17	0	4.4	12.3	2.5	0.07	0.36
Hybrid - co 2	0	0.02	0.22	0.57	0.19	0	4.4	12.1	2.6	0.06	0.34

Scenario 2	Dose 1	Dose 2	Dose 3	Dose 4	Dose 5	Overtoxic	Avg # DLTs	Avg # Under	Avg # Over	Prob 0 at MTD	Prob Nearly Opt. Trial

CRM	0	0	0.13	0.73	0.14	0	4.6	10.6	3	0.03	0.51
rCRM1	0	0	0.11	0.72	0.17	0	4.7	10.8	3.5	0.01	0.35
rCRM2	0	0	0.13	0.7	0.17	0	4.7	11	3.5	0.01	0.33
CRM - ET	0	0	0.14	0.67	0.2	0	3.8	10	2.5	0.06	0.24
Hybrid - co 3	0	0	0.12	0.73	0.15	0	4.6	10.8	3	0.01	0.48
Hybrid - co 2	0	0	0.12	0.72	0.15	0	4.6	10.7	3	0	0.46

Scenario 3	Dose 1	Dose 2	Dose 3	Dose 4	Dose 5	Overtoxic	Avg # DLTs	Avg # Under	Avg # Over	Prob 0 at MTD	Prob Nearly Opt. Trial

CRM	0.02	0.12	0.39	0.33	0.11	0.02	4.3	15.7	1.2	0.47	0.25
rCRM1	0.03	0.15	0.33	0.36	0.11	0.02	4.4	15.2	1.5	0.32	0.2
rCRM2	0.02	0.15	0.36	0.34	0.11	0.01	4.3	15.6	1.4	0.35	0.16
CRM - ET	0.13	0.1	0.36	0.3	0.1	0.01	3.2	12	0.8	0.52	0.05
Hybrid - co 3	0.02	0.14	0.35	0.33	0.14	0.02	4.4	15.2	1.6	0.34	0.18
Hybrid - co 2	0.03	0.13	0.36	0.34	0.12	0.02	4.4	15.1	1.3	0.3	0.18

Table 2.

Simulation results evaluating the operating characteristics of the six dose-finding algorithms for Scenarios 4 through 6 and a maximum sample size of N = 21. Presented are the selection probability for each dose, the probability of early termination for excess toxicity (Overtoxic), the average number of DLTs, the average number of subjects under-dosed, the average number of subjects over-dosed, the probability that 0 subjects will be treated at the MTD and the probability of a nearly optimal trial. 1000 simulations were completed for each scenario.

Scenario 4	Dose 1	Dose 2	Dose 3	Dose 4	Dose 5	Overtoxic	Avg # DLTs	Avg # Under	Avg # Over	Prob 0 at MTD	Prob Nearly Opt. Trial

CRM	0.25	0.4	0.25	0.02	0	0.08	6	8.7	5	0.17	0.05
rCRM1	0.27	0.39	0.22	0.03	0	0.09	6	8.5	5	0.1	0
rCRM2	0.26	0.38	0.23	0.03	0	0.09	5.9	8.8	4.8	0.11	0
CRM - ET	0.4	0.26	0.23	0.03	0	0.07	4.2	5.7	3.6	0.32	0
Hybrid - co 3	0.24	0.44	0.21	0.02	0	0.09	6	8.4	4.6	0.09	0
Hybrid - co 2	0.25	0.4	0.22	0.03	0	0.1	5.9	8.2	4.9	0.1	0

Scenario 5	Dose 1	Dose 2	Dose 3	Dose 4	Dose 5	Overtoxic	Avg # DLTs	Avg # Under	Avg # Over	Prob 0 at MTD	Prob Nearly Opt. Trial

CRM	0.22	0.56	0.2	0	0	0.02	6.2	7.7	4.8	0.05	0.07
rCRM1	0.23	0.55	0.2	0	0	0.02	6.4	7.9	5.3	0.02	0
rCRM2	0.2	0.58	0.18	0.01	0	0.03	6.3	8	5.4	0.04	0
CRM - ET	0.3	0.46	0.21	0.01	0	0.02	4.9	6	4	0.17	0
Hybrid - co 3	0.22	0.59	0.16	0	0	0.02	6.3	7.5	4.9	0.02	0
Hybrid - co 2	0.22	0.59	0.16	0	0	0.02	6.4	7.6	5	0.02	0

Scenario 6	Dose 1	Dose 2	Dose 3	Dose 4	Dose 5	Overtoxic	Avg # DLTs	Avg # Under	Avg # Over	Prob 0 at MTD	Prob Nearly Opt. Trial

CRM	0.27	0.31	0.24	0.07	0	0.12	5.8	9.4	4.9	0.24	0.04
rCRM1	0.26	0.31	0.23	0.07	0.01	0.13	5.8	8.6	5.1	0.14	0
rCRM2	0.27	0.29	0.24	0.07	0	0.13	5.7	8.9	5	0.15	0
CRM - ET	0.42	0.16	0.24	0.07	0.01	0.1	3.7	5.5	3.5	0.4	0
Hybrid - co 3	0.27	0.28	0.25	0.06	0.01	0.12	5.7	8.3	4.9	0.12	0
Hybrid - co 2	0.28	0.29	0.21	0.07	0.01	0.14	5.7	8.7	4.7	0.14	0

Comparing the CRM-ET to the CRM and rCRM, the CRM-ET reduced the trial-to-trial variability relative to the CRM but not to the same extent as the rCRM and also reduced the probability of correctly identifying the MTD relative to the other designs. In addition, allowing early termination decreases the total number of subjects in the trial, which results in fewer patients underdosed or overdosed, but also fewer patients treated at the MTD and an increased probability that no subjects would be treated at the MTD. This suggests that the improved performance of the CRM compared to other phase I designs (i.e. the 3+3 design) is, in part, due to increasing the number of patients treated in phase I and not simply due to applying a dose-response model, while the rCRM gains efficiency by both increasing the sample size and by increasing the probability of exploring additional dose-levels when the CRM and CRM-ET might not. We also note that the expected sample size for the CRM-ET when the model is misspecified is similar to when the model is correct. This implies that the rCRM randomly escalates and de-escalates at similar rates regardless of model misspecification because termination of the CRM-ET algorithm corresponds to situations of random escalation or de-escalation for the rCRM algorithms. Finally, it is worth noting that, while the CRM-ET may not treat as many subjects at the MTD as other designs, it also results in fewer total DLTs, which is an advantage compared to the CRM or rCRM.

Estimated bias and MSE can be found in Table 3. The CRM has less bias and smaller MSE than the two rCRM algorithms in Scenarios 1, 2, 3 and 6, while the rCRM1 has the smallest bias and MSE in Scenario 4 and the rCRM2 has the smallest bias and MSE in Scenario 5. Comparing the two rCRM algorithms, the rCRM2 has smaller bias and MSE than the rCRM1 in four of six scenarios (Scenarios 1, 2, 5 and 6).

Table 3.

Simulation results evaluating the bias and MSE of the estimated MTD for Scenarios 1 through 6 assuming an additive dosing scheme (Dosing Scheme 1) and a geometric dosing scheme (Dosing Scheme 2) for a maximum sample size of N = 21.

	Scenario 1		Scenario 2		Scenario 3		Scenario 4		Scenario 5		Scenario 6
Method	Bias	MSE	Bias	MSE	Bias	MSE	Bias	MSE	Bias	MSE	Bias	MSE

Dosing Scheme 1
CRM	−1.4	2	0.2	0.1	−11.9	140.6	1.1	1.1	−0.4	0.2	2.7	7.1
rCRM1	−1.7	2.8	1.1	1.3	−12.3	150.4	0.1	0.02	−0.6	0.3	2.9	8.3
rCRM2	−1.6	2.6	0.8	0.7	−12.6	159.4	0.6	0.4	0	0.002	2.8	8
CRM − ET	−2.8	7.7	1.2	1.4	−17	287.8	−2	4.1	−1.6	2.4	−0.1	0.01
Hybrid − co 3	−1.7	3	0.5	0.2	−11.5	131.9	0.3	0.1	−1	1.1	3.2	10
Hybrid − co 2	−1.2	1.5	0.6	0.3	−12.2	149.7	0.7	0.5	−1.1	1.1	2.2	4.7

Additional simulation results can be found in the Supplementary Materials. Included are additional scenarios, two with discrete jumps in the probability of DLT (Scenarios 10 and 11) and one where the true MTD is between dose-levels (Scenario 12), and simulation results with a maximum sample size of 30 subjects. Results for our additional scenarios are consistent with the results discussed above. The results for a maximum sample size of 30 subjects were similar to the results with a maximum sample size of 21 subjects. The two rCRM algorithms decreased the standard deviation in the number of subjects treated at the MTD and the probability of 0 subjects being treated at the MTD. In addition, the rCRM had little impact on the probability of correctly identifying the MTD but there was a slight increase in the average number of DLTs (less than 2.5% in all cases). However, our results indicate a greater reduction in the average number of subjects treated at the MTD and the probability of a nearly optimal trial than was observed with a maximum sample size of 21 subjects.

4 Hybrid Approaches

The simulation results presented in Section 3 are promising but they also illustrate the trade-offs associated with the rCRM. The rCRM decreases the variability in the number of subjects treated at the MTD but this comes at the cost of reducing the probability of observing a nearly optimal trial, where a large number of subjects are treated at the MTD. Additional investigation of our simulation results suggests that the CRM does quite well when there is little ambiguity as to the MTD and we have high posterior probability that the current dose is the MTD. In contrast, there are also situations where the CRM stays on a single dose for several consecutive cohorts but the posterior probability of that dose being the MTD remains relatively low. For example, the CRM might stay at a single dose for several consecutive cohorts but the posterior probability of that dose being the MTD may only be 0.50.

This suggests that alternate approaches to random escalation should be considered that might better maintain the properties of the standard CRM, while reducing the trial-to-trial variability. We first considered random escalation based on a power of the probability that a dose is the MTD in an approach similar to that described by Thall and Wathen (2007). A power greater than one increases the probability that the current estimate of the MTD would be assigned, while a power less than one would decrease the probability that the current estimate of the MTD would be assigned. This approach resulted in similar performance to the rCRM algorithms discussed in Section 2. For the remainder of this section, we will instead discuss a hybrid dose-finding approach that allows for several consecutive cohorts to be treated at the same dose if the posterior probability of that dose being the MTD is high but random escalation or de-escalation is introduced if there is ambiguity as to the true MTD.

Our hybrid dose-finding algorithm proceeds as follows. Unlike the rCRM, we only implement random escalation or de-escalation if the next cohort is to be treated at the same dose as the previous two cohorts (i.e. the CRM is “stuck on a dose”) and the posterior probability of the MTD equaling the current dose is less than some pre-defined threshold, γ. The second criterion is put in place to distinguish between the situation where there is high posterior probability that the MTD is equal to the current dose, in which case we do not want to escalate or de-escalate, and the case where there is ambiguity as to the true MTD, in which case we would like to explore other dose levels. If both criteria are met, we randomly escalate or de-escalate, with the probability of escalation or de-escalation proportional to the posterior probability that the MTD is equal to the dose immediately above or immediately below the current dose, respectively. Note that this differs from the algorithm we proposed in Section 2 in that we force escalation or de-escalation and do not allow the next cohort to be treated at the same dose as the previous two cohorts. Intuitively, our algorithm prohibits randomization when the dose recommendation is based on either too little or too much information and forces randomization when evaluation of adjacent doses is most likely to inform the dose-response model. This algorithm directly addresses the primary limitation of the CRM, which is that the standard CRM algorithm may treat several consecutive cohorts at the same dose, even when there is ambiguity as to the true MTD.

In summary, using the notation from Section 2, our hybrid approach will escalate or de-escalate as follows:

If d* = d^j = d^{j−1} and P(MTD = d^j | x⃗) < γ, then
- Assign cohort j + 1 to dose d^j + 1 with probability $\frac{P (MTD = d^{j} + 1 | \vec{x})}{P (MTD = d^{j} + 1 | \vec{x}) + P (MTD = d^{j} - 1 | \vec{x})}$
- Assign cohort j + 1 to dose d^j − 1 with probability $\frac{P (MTD = d^{j} - 1 | \vec{x})}{P (MTD = d^{j} + 1 | \vec{x}) + P (MTD = d^{j} - 1 | \vec{x})}$.

If d^j = 1, we would always escalate to dose-level 2, and if d^j = D (i.e. the maximum dose level, as defined in Section 2), we would always de-escalate to dose-level D − 1. This algorithm is more aggressive than the rCRM in forcing a random escalation or de-escalation conditional on the CRM being “stuck” on a dose but this only occurs if the posterior probability of the MTD being equal to the current dose is less than γ, in which case the form of the power model implies that there should be a reasonably high posterior probability that either the dose immediately above or immediately below the current dose is the MTD. Finally, to this point, we have always assumed cohorts of 3 subjects for our simulations but we will also investigate the impact of switching to cohorts of 2 subjects after transitioning to random escalation and de-escalation. This will provide more opportunities for adaptation, which is beneficial when there is ambiguity as to the true MTD.
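The decision rule above, including the boundary cases, can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' code: the function name `hybrid_next_dose`, its arguments, and the tie-breaking choice when both neighbouring posterior probabilities are zero are assumptions introduced here, and the posterior probabilities P(MTD = d | x⃗) are taken as given rather than computed from the CRM model.

```python
import random


def hybrid_next_dose(crm_dose, prev_doses, post, gamma=0.7, rng=random):
    """Return the dose level (1-based) for the next cohort.

    crm_dose   : dose the standard CRM recommends for the next cohort (d*).
    prev_doses : dose levels given to the cohorts treated so far.
    post       : post[k - 1] = P(MTD = k | data), supplied by the caller.
    gamma      : posterior-probability threshold for forcing a move.
    rng        : any object with a random() method (hypothetical hook).
    """
    n_doses = len(post)
    # "Stuck": the CRM recommends the dose given to the last two cohorts.
    stuck = (len(prev_doses) >= 2
             and crm_dose == prev_doses[-1] == prev_doses[-2])
    if not stuck or post[crm_dose - 1] >= gamma or n_doses < 2:
        return crm_dose                   # follow the standard CRM
    if crm_dose == 1:
        return 2                          # lowest dose: always escalate
    if crm_dose == n_doses:
        return n_doses - 1                # highest dose: always de-escalate
    p_up = post[crm_dose]                 # P(MTD = d^j + 1 | data)
    p_down = post[crm_dose - 2]           # P(MTD = d^j - 1 | data)
    total = p_up + p_down
    # Assumption of this sketch: split evenly if both neighbours have mass 0.
    prob_up = 0.5 if total == 0 else p_up / total
    return crm_dose + 1 if rng.random() < prob_up else crm_dose - 1


# Hypothetical posterior: dose 3 is the CRM estimate but P(MTD = 3) < gamma.
post = [0.10, 0.30, 0.40, 0.20]
next_dose = hybrid_next_dose(3, prev_doses=[2, 3, 3], post=post)
```

In this example the next cohort is moved to dose 4 with probability 0.2 / (0.2 + 0.3) = 0.4 and to dose 2 otherwise; with a posterior probability of at least γ at dose 3, the function would simply return the CRM recommendation.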

We completed additional simulations to evaluate the performance of our proposed hybrid dose-finding algorithm with γ = 0.7, using the same trial parameters and scenarios as were used to evaluate the rCRM in Section 3. Additional simulations illustrating the impact of varying γ were also completed and are discussed below. Simulation results for the hybrid algorithm with post-transition cohorts of 3 (hybrid - co 3) and the hybrid algorithm with post-transition cohorts of 2 (hybrid - co 2) with a maximum sample size of 21 subjects are presented in Figure 2 and Table 1 for Scenarios 1 through 3 (dose level 4 is the true MTD) and Figure 3 and Table 2 for Scenarios 4 through 6 (dose level 2 is the true MTD). The two hybrid approaches perform well, providing similar benefits to the rCRM, but the trade-offs associated with these benefits are less dramatic. As with the rCRM, the two hybrid methods reduce the standard deviation of the number of subjects treated at the MTD and reduce the proportion of trials where no subjects are treated at the MTD. The primary benefit of the hybrid approaches is that they treat more subjects at the MTD, on average, than the rCRM. In fact, the hybrid approaches, on average, treat more subjects at the MTD than the rCRM in all cases and treat more subjects at the MTD than the CRM in Scenarios 3, 4 and 6. In addition, in Scenarios 1 through 3, where the CRM achieves a nearly optimal trial in a large number of cases, the hybrid approaches are far more likely to achieve a nearly optimal trial than the rCRM. In general, the two hybrid approaches result in an MSE that is smaller than that of the rCRM algorithms but larger than that of the CRM. Finally, we note that the hybrid approaches correctly identify the MTD at similar rates to both the standard CRM and rCRM and, like the rCRM, have only a minor increase in the number of DLTs compared to the standard CRM algorithm.

Additional simulation results can be found in the Supplementary Materials. Results for the additional scenarios are consistent with the results discussed above. We see that increasing from a maximum sample size of 21 subjects to a maximum sample size of 30 subjects results in a similar reduction in the trial-to-trial variability with a minimal increase in the probability of DLTs due to the hybrid approach but a less favorable trade-off when considering the average number of subjects treated at the MTD and the probability of a nearly optimal design. The most notable difference is that the probability of a nearly perfect trial decreases to 0 when N = 30 for Scenarios 1 through 3. This suggests that, as the sample size increases, the probability of the hybrid approach forcing escalation or de-escalation increases, thus dramatically decreasing the probability of a nearly perfect trial. Also presented in the Supplementary Materials are simulation results evaluating the impact of varying γ on the operating characteristics of the hybrid approach. The results indicate that γ had little impact with N = 21 but that a larger value for γ might be preferable with larger sample sizes to avoid unnecessarily forcing escalation or de-escalation, which will actually increase the standard deviation of the number of cohorts treated at the MTD, as well as negatively impact other trial operating characteristics.

5 Discussion

We discussed two approaches to overcoming the long-memory property of the CRM. First, we discussed the rCRM, which introduces random escalation or de-escalation in cases where the standard CRM would assign a cohort the same dose as the previous cohort. We investigated two algorithms for random escalation or de-escalation. The rCRM1 randomly assigns a cohort to either escalate one dose level, de-escalate one dose level or stay the same relative to the previous cohort with probabilities proportional to the posterior probability that each dose is the MTD. In contrast, the rCRM2 randomly assigns a cohort to any dose from the minimum dose level to a dose level representing a one dose escalation relative to the maximum dose-level tried in the previous cohorts with probabilities proportional to the posterior probability that each dose is the MTD. Both algorithms exhibited similar performance and achieved the primary objective of reducing the standard deviation of the number of subjects treated at the MTD and the probability that 0 subjects would be treated at the MTD throughout the trial, while having little impact on the probability of correctly identifying the MTD. However, this came at the cost of reducing the average number of subjects treated at the MTD and the probability of treating a high number of subjects at the MTD.
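The difference between the two candidate sets can be made concrete with a short sketch based only on the verbal description above. The function names, the dictionary return type, and the example posterior are assumptions introduced here; the sketch does not model when randomization is triggered, and the details given in Section 2 may differ.

```python
def _normalise(candidates, post):
    """Map each candidate dose to its posterior MTD probability, rescaled to sum to 1."""
    total = sum(post[d - 1] for d in candidates)
    if total == 0:
        raise ValueError("candidate doses have zero posterior probability")
    return {d: post[d - 1] / total for d in candidates}


def rcrm1_probs(prev_dose, post):
    """rCRM1-style candidates: one level down, stay, or one level up."""
    n_doses = len(post)
    candidates = [d for d in (prev_dose - 1, prev_dose, prev_dose + 1)
                  if 1 <= d <= n_doses]
    return _normalise(candidates, post)


def rcrm2_probs(max_tried, post):
    """rCRM2-style candidates: lowest dose up to one above the highest dose tried."""
    n_doses = len(post)
    candidates = list(range(1, min(max_tried + 1, n_doses) + 1))
    return _normalise(candidates, post)


# Hypothetical posterior P(MTD = k | data) for dose levels k = 1..4.
post = [0.10, 0.30, 0.40, 0.20]
p1 = rcrm1_probs(prev_dose=3, post=post)   # doses 2, 3, 4
p2 = rcrm2_probs(max_tried=2, post=post)   # doses 1, 2, 3
```

Both functions use the same proportional weighting; they differ only in which doses are eligible, which is why rCRM2 can move more than one level below the previous dose while rCRM1 cannot.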

To rectify this problem, we also investigated a hybrid approach to dose-finding. The hybrid approach allows consecutive cohorts to be treated at the same dose but forces escalation or de-escalation after two consecutive cohorts if the estimated MTD remains the same but there remains ambiguity as to the true MTD. Specifically, if the estimated MTD is unchanged after two consecutive cohorts have been treated at the same dose and the posterior probability of that dose equaling the MTD is less than γ, then the third cohort will be treated at either one dose below or one dose above the current estimate of the MTD. The hybrid approach achieved the primary objective of reducing the trial-to-trial variability of the CRM and had little impact on the probability of correctly identifying the MTD, but had a more favorable trade-off profile than the rCRM with respect to the average number of subjects treated at the MTD and the probability of treating a high number of subjects at the MTD. However, the hybrid approach had similar performance to the rCRM when a larger sample size was considered.

The rCRM and hybrid approaches discussed in this manuscript reduce the excess trial-to-trial variability in the number of subjects treated at the MTD, which results from the long-memory property of the CRM, as discussed by Oron and Hoff (2013). In many ways, the arguments of Oron and Hoff (2013) parallel the ongoing controversy surrounding adaptive randomization in phase II and III clinical trials (Thall et al., 2015), where there is a similar trade-off between maximizing the number of patients receiving the optimal treatment and the potentially large trial-to-trial variability in the number of patients randomized to the optimal treatment. In the case of phase I clinical trials, our results suggest that a researcher who is unconcerned with the trial-to-trial variability in the number of subjects treated at the MTD should use the CRM, because the CRM maximizes the probability of correctly identifying the optimal dose and the number of patients treated at the MTD. In contrast, if the variability in the number of patients treated at the MTD is of concern, the approaches discussed in this manuscript provide alternatives that address this problem and maintain most of the positive qualities of the standard CRM algorithm.

The primary objective of this manuscript was to develop a novel dose-finding algorithm that overcomes the long-memory property of the CRM but an additional advantage of the rCRM is that it allows us to incorporate the uncertainty in our estimate of the MTD into dose-finding. The standard CRM treats each cohort at the current estimate of the MTD regardless of the uncertainty in that estimate. In contrast, the rCRM considers the entire posterior distribution of the MTD and incorporates the uncertainty in our estimate into dose-finding. As a result, the rCRM is less likely to get “stuck” on a dose-level than the standard CRM and also has the added advantage of more thoroughly exploring the dose-response relationship of toxicity without sacrificing the probability of correctly identifying the MTD.

Finally, we have considered random escalation and de-escalation in standard phase I clinical trials to evaluate the toxicity of a single agent. There exists a vast literature of Bayesian adaptive trial designs that consider extensions of phase I dose-escalation studies to the case of multiple agents (Thall et al., 2003; Yin and Yuan, 2009) and designs that consider efficacy, as well as toxicity, in phase I (Braun, 2002; Thall and Cook, 2004). These designs are also likely to exhibit the long-memory property, and incorporating random escalation or de-escalation into their dose-finding algorithms could help overcome it in those settings as well.

Supplementary Material

NIHMS890952-supplement-Supplementary_Material.pdf (175.6 KB, PDF)

Acknowledgments

This work was partially supported by a research grant from Medtronic Inc and NIH grant P30-CA077598.

References

Babb J, Rogatko A, Zacks S. Cancer phase I clinical trials: Efficient dose escalation with overdose control. Statistics in Medicine. 1998;17:1103–1120. doi: 10.1002/(sici)1097-0258(19980530)17:10<1103::aid-sim793>3.0.co;2-9.
Braun TM. The bivariate continual reassessment method: extending the CRM to phase I trials of two competing outcomes. Controlled Clinical Trials. 2002;23:240–256. doi: 10.1016/s0197-2456(01)00205-7.
Carlin BP, Zhong W, Koopmeiners JS. Discussion of 'Small-sample behavior of novel phase I cancer trial designs' by Assaf P Oron and Peter D Hoff. Clinical Trials. 2013;10:81–85. doi: 10.1177/1740774512469313.
Cheung YK. Commentary on 'Small-sample behavior of novel phase I cancer trial designs'. Clinical Trials. 2013;10:86–87. doi: 10.1177/1740774512470221.
Gasparini M, Eisele J. A curve-free method for phase I clinical trials. Biometrics. 2000;56:609–615. doi: 10.1111/j.0006-341x.2000.00609.x.
Goodman SN, Zahurak ML, Piantadosi S. Some practical improvements in the continual reassessment method for phase I studies. Statistics in Medicine. 1995;14:1149–1161. doi: 10.1002/sim.4780141102.
Iasonos A, Wilton AS, Riedel ER, Seshan VE, Spriggs DR. A comprehensive comparison of the continual reassessment method to the standard 3 + 3 dose escalation scheme in phase I dose-finding studies. Clinical Trials. 2008;5:465–477. doi: 10.1177/1740774508096474.
O’Quigley J, Pepe M, Fisher L. Continual reassessment method: A practical design for phase 1 clinical trials in cancer. Biometrics. 1990;46:33–48.
Oron AP, Hoff PD. Small-sample behavior of novel phase I cancer trial designs. Clinical Trials. 2013;10:63–80. doi: 10.1177/1740774512469311.
Plummer M. rjags: Bayesian graphical models using MCMC. R package version 3-10. 2013.
R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing; Vienna, Austria: 2013.
Storer BE. Design and analysis of phase I clinical trials. Biometrics. 1989;45:925–937.
Thall P, Fox P, Wathen J. Statistical controversies in clinical research: scientific and ethical problems with adaptive randomization in comparative clinical trials. Annals of Oncology. 2015;26:1621–1628. doi: 10.1093/annonc/mdv238.
Thall PF, Cook JD. Dose-finding based on efficacy/toxicity trade-offs. Biometrics. 2004;60:684–693. doi: 10.1111/j.0006-341X.2004.00218.x.
Thall PF, Millikan RE, Mueller P, Lee S-J. Dose-finding with two agents in phase I oncology trials. Biometrics. 2003;59:487–496. doi: 10.1111/1541-0420.00058.
Thall PF, Wathen JK. Practical Bayesian adaptive randomisation in clinical trials. European Journal of Cancer. 2007;43:859–866. doi: 10.1016/j.ejca.2007.01.006.
Yin G, Yuan Y. A latent contingency table approach to dose finding for combinations of two agents. Biometrics. 2009;65:866–875. doi: 10.1111/j.1541-0420.2008.01119.x.
