Abstract
Combinational therapy, which combines two or more therapeutic agents, is very common in cancer treatment. Currently, many clinical trials aim to assess the feasibility, safety, and activity of combinational therapeutics to achieve a synergistic response. Dose-finding for combinational agents is considerably more complex than for a single agent because only the partial order of the dose combinations' toxicities is known. Conventional phase I designs may not adequately capture this complexity, thus limiting identification of the maximum tolerated dose (MTD) of combinational agents. In response, many novel phase I clinical trial designs for combinational agents have been proposed. However, studies that compare the performance of these designs, explore the impact of design parameters, and provide recommendations are limited. We evaluate available phase I designs that identify a single MTD for combinational agents using simulation studies under various conditions. We also explore the influence of different design parameters and summarize the risks and benefits of each design to provide general guidance for design selection.
Keywords: Phase I clinical trial, combinational agents, dose-finding, guidance for phase I
1. Introduction
Clinical trials investigating combinational therapies that combine two or more therapeutic agents have garnered renewed attention with the development of cancer therapy. Novel combinational agents require identification of a maximum tolerated dose (MTD), the highest dose level that leads to a pre-specified target toxicity probability. However, dose-finding for combinational agents is challenging because the complete ordering of the dose combinations' toxicities is unknown. To fill this gap, several phase I study designs for combinational agents have been proposed.
Overall, there are three categories of designs: algorithm- or rule-based, model-based, and model-assisted. Algorithm-based designs do not assume any parametric relationship between dose combinations and their toxicity probabilities. Ivanova and Wang [13] proposed an up-and-down design and used isotonic regression to estimate the MTD. Later, Ivanova and Kim [12] updated the previous up-and-down design using a T-statistic. Lee and Fan [16] proposed a two-dimensional search algorithm to identify the MTD. Algorithm-based designs usually lack a statistical foundation, and their escalation/de-escalation rules are ad hoc. Therefore, their performance is not guaranteed.
Model-based designs assume a parametric dose-toxicity relationship. To account for the uncertainty in estimating the design parameters at the beginning of the trial, a start-up phase is usually used before transitioning to the model-based part. The main differences among model-based methods are the choice of dose-toxicity relationship and the scheme of the start-up phase. Thall et al. [30] proposed to identify the MTD contour with a six-parameter logistic regression. Wang and Ivanova [36] proposed a three-parameter model to link doses and toxicity probabilities. Yin and Yuan proposed a latent contingency table method [38] and another method that uses a copula to model toxicity probabilities through the marginal toxicity profiles of the individual agents [39]. Riviere et al. [28] developed a method based on Bayesian logistic regression. Braun and Jia [2] fitted a proportional odds logistic regression model within each 'row' of the dose combination matrix and later joined the models together. Braun and Wang [3] proposed a hierarchical model that links the effective doses to the hyperparameters of the dose toxicity probabilities. Tighiouart et al. [31] extended the escalation with overdose control (EWOC) method [1] to the two-dimensional setting to identify the MTD curve. Some other designs were motivated by the main difficulty of two-dimensional dose-finding, namely that only the partial order of the dose combinations' toxicities is known. In this case, we know that doses (j + 1, k) and (j, k + 1) are more toxic than dose (j, k), but we do not know the toxicity order of doses (j + 1, k) and (j, k + 1), where (j, k) denotes the combination of the jth dose of one agent and the kth dose of the second agent. Therefore, several possible toxicity orderings that satisfy the partial order exist. Conaway et al. [5] proposed to identify all possible orderings of dose combination toxicities so that two-dimensional dose-finding can be solved by the continual reassessment method (CRM) [23]. Building upon this methodology, Wages et al. [33,34] proposed to use a subset of possible orderings, which is more feasible especially when the total number of possible orderings is large. To avoid pre-specifying orderings, Lin and Yin [17] proposed to update the ordering dynamically. However, model-based designs have several limitations in practice: (1) they are relatively complicated and require constant model updating by statisticians, which poses barriers for clinicians to understand and implement them; (2) most designs need parameter calibration; (3) some designs require prior knowledge about the agents (e.g. guesses of the dose combinations' toxicities), which is not easy for clinicians to provide.
Model-assisted designs still utilize statistical models in decision making but focus on easier implementation by pre-tabulating escalation and de-escalation rules before trial conduct [37]. Therefore, they combine the advantages of algorithm-based and model-based designs. The BOIN design [20,40] and the Keyboard design [37] are representative model-assisted designs. Later, Lin and Yin [18] extended the BOIN design and Pan et al. [24] extended the Keyboard design to handle two-dimensional dose-finding.
In addition, there are designs that incorporate special features while conducting two-dimensional dose-finding. Liu and Ning [19] proposed a design that can handle trials with delayed toxicities. Diniz et al. [6] built a Bayesian design for combinational doses upon the escalation with overdose control (EWOC) method [1], accounting for patient heterogeneity by incorporating baseline covariates.
With so many novel methods available yet very limited implementation in oncology trials, we have relatively little knowledge about which methods are superior under which scenarios. Riviere et al. [26] compared six phase I designs for combinational agents that are either algorithm-based or model-based. Their paper claimed that all designs were optimized to improve the percentage of correct MTD selection before comparison. However, such a claim is itself questionable, as it is impossible to achieve optimization under diverse scenarios using a universal set of parameters. Based on the sensitivity analyses the authors conducted, different scenarios actually required different parameter sets to achieve optimization. Therefore, simply comparing results using a single set of design parameters makes the comparison less meaningful. Another limitation of this study is that it did not discuss in detail the influence of different design settings, although sensitivity analyses were presented. Hirakawa et al. [10] compared the performance of five model-based designs for combinational agents, but did not explore the effects of design parameters beyond the cohort size. Harrington et al. [8] reviewed some algorithm-based and model-based combinational-agent designs and discussed their advantages and limitations without simulation studies. None of the above papers included the recently developed model-assisted designs, as they were published earlier. Moreover, the feasibility of parameter tuning and the influence of design parameters for the model-based designs were investigated only in a limited fashion. Later, Pan et al. [24] compared the two-dimensional BOIN design, the two-dimensional Keyboard design, and the continual reassessment method for partial ordering (POCRM) [33,34]. However, POCRM was the only model-based design in that comparison, and, similar to the review studies mentioned above, the effects of design parameters for POCRM were not explored. To provide a more recent view of phase I clinical trial designs for combinational agents, we conducted a simulation study to evaluate the performance of various designs under comprehensive scenarios.
Our study differs from previous review papers in several aspects: (1) we included two recently developed model-assisted designs; (2) for design parameters with no clear recommendations, we examined multiple sets to investigate their influence instead of using a single, subjectively selected set; (3) we used different sample sizes in the simulations to check whether our findings remain valid across trial sizes; and (4) in addition to summarizing each design's characteristics, we discuss putative reasons for their performance.
Specifically, in this paper we focus on designs that (1) utilize toxicity information only, (2) identify a single MTD instead of an MTD contour or curve, (3) assume a monotonic dose-toxicity relationship within each individual agent, and (4) have program code or software available. As a result, we selected nine designs: dose finding in discrete dose space [36], Bayesian dose finding by copula regression [39], the continual reassessment method for partial ordering [33,34], the hierarchical Bayesian design [3], the logistic model-based Bayesian dose-finding design [28], the generalized continual reassessment method [2], the bootstrap aggregating continual reassessment method [17], the combinational Bayesian optimal interval design [18], and the combinational Keyboard design [24]. We did not include the few currently available algorithm-based designs, as there is consensus that algorithm-based designs usually have inferior performance compared with model-based ones [14,22,26].
The rest of the paper is organized as follows: we first review the nine designs included in our evaluation, then present our simulation studies and results, and finally discuss our findings.
2. Review of designs
2.1. Notations
Here we define some common notations used in these methods. As most of the designs we selected apply to dual-agent dose-finding only (Copula has been extended to handle more than two agents), we assume two agents A and B, with J and K doses, respectively. Define π_jk to be the true toxicity probability of the dose combination (j, k), j = 1, …, J, k = 1, …, K; define p_j to be the true toxicity probability of dose j of agent A when used as a monotherapy, j = 1, …, J, and q_k to be the true toxicity probability of dose k of agent B when used as a monotherapy, k = 1, …, K. Define ϕ to be the pre-specified target toxicity probability. Define N to be the maximum sample size of the trial. Define n_jk to be the number of subjects that received dose (j, k) and y_jk to be the number of dose-limiting toxicities (DLTs) observed among those patients.
2.2. Model-based: dose finding in discrete dose space (I2D) [36]
This method is a Bayesian design that extends CRM to accommodate dose-finding in dual agents.
Dose-toxicity model:
| (1) |
where the unknown model parameters are constrained to satisfy the assumption of toxicity monotonicity, and the dose values entering the model are standardized constants rather than the actual doses of the agents. If no interaction between the two agents exists in Equation (1), the model reduces to
| (2) |
Start-up phase: the trial is initiated at the lowest dose combination. Agent A is escalated while agent B is maintained at its lowest dose as long as no DLT is observed. If no DLT has been observed when agent A reaches its maximum dose, agent B is escalated to its second-lowest dose, combined with a lower dose of agent A. If no DLT is observed, we continue to a combination where agent B is at its third-lowest dose and agent A is at the dose used in the previous combination minus two levels. When agent B reaches its maximum dose, the combinations not yet explored at that dose of agent B are evaluated. The start-up phase ends as soon as at least one DLT is observed.
Post start-up escalation/de-escalation dose set: if the current combination is (j, k), I2D only considers the current combination and its adjacent doses (j ± 1, k) and (j, k ± 1) for the next subject, prohibiting diagonal moves.
Post start-up trial conduct: after the start-up phase ends, the working model in Equation (1) or Equation (2) is used to obtain toxicity estimates for all dose combinations. Due to safety concerns, the model-based part starts at the combination where agent B is at its lowest dose and agent A is at the dose whose combination has estimated toxicity probability closest to ϕ.
MTD determination: the dose combination whose posterior probability of toxicity is closest to ϕ will be selected as the MTD.
2.3. Model-based: Bayesian dose finding by copula regression Copula [39]
This method utilizes a copula to model the dose-toxicity relationship, because a copula links the joint distribution to the marginal distributions via a dependence parameter.
Dose-toxicity model:
| π_jk = 1 − {(1 − p_j^α)^(−γ) + (1 − q_k^β)^(−γ) − 1}^(−1/γ) | (3) |
where α and β are power parameters, as in the CRM, that accommodate the uncertainty in the prior guesses p_j and q_k, and γ > 0 characterizes the interaction between the two agents. Informative prior distributions with prior mean 1 and relatively small variance (e.g. Gamma(2, 2)) are assigned to α and β. A Gamma distribution with a large variance is usually chosen as the non-informative prior for γ. If only one agent is involved, this approach reduces to the regular CRM.
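To make the copula link concrete, the R sketch below evaluates this Clayton-type combination of two marginal toxicity profiles; the marginal guesses p and q, the parameter values, and the function name copula_tox are illustrative assumptions rather than the authors' implementation.

```r
# Clayton-type copula combining marginal toxicity profiles, in the spirit of [39];
# alpha, beta are the power parameters and gamma is the interaction parameter.
copula_tox <- function(p, q, alpha = 1, beta = 1, gamma = 1) {
  # p: prior toxicity guesses for agent A (length J); q: for agent B (length K)
  outer(p, q, function(pj, qk) {
    1 - ((1 - pj^alpha)^(-gamma) + (1 - qk^beta)^(-gamma) - 1)^(-1 / gamma)
  })
}

# Example: 5 doses of agent A and 3 doses of agent B (assumed marginal profiles)
p <- c(0.05, 0.10, 0.15, 0.30, 0.45)
q <- c(0.10, 0.20, 0.30)
round(copula_tox(p, q), 3)   # J x K matrix of joint toxicity probabilities
```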
Start-up phase: the start-up phase begins at the lowest dose combination (1, 1). It proceeds vertically until the first toxicity is observed, and then proceeds horizontally until the first toxicity is observed. Once a toxicity has been observed in both directions, the formal model-based stage starts.
Post start-up escalation/de-escalation dose set: if the current combination is (j, k), the dose escalation set is defined as {(j + 1, k), (j, k + 1), (j + 1, k − 1), (j − 1, k + 1)} and the dose de-escalation set as {(j − 1, k), (j, k − 1), (j + 1, k − 1), (j − 1, k + 1)}. As we only know the partial order of dose toxicity in combinational agents, we do not know whether dose combinations (j + 1, k − 1) and (j − 1, k + 1) are more or less toxic than dose combination (j, k). Therefore, the authors included (j + 1, k − 1) and (j − 1, k + 1) in both the escalation and de-escalation sets.
Post start-up trial conduct: this design involves two fixed probability cut-offs, c_e for dose escalation and c_d for dose de-escalation, whose sum must exceed 1 so that escalation and de-escalation cannot be triggered simultaneously. The detailed algorithm is laid out below.
If, at the current dose combination (j, k), the posterior probability that π_jk is less than ϕ exceeds c_e, we escalate to the dose that belongs to the escalation set and has estimated toxicity probability closest to ϕ and higher than that of (j, k). If the current dose is (J, K), then stay at the same combination.
If, at the current dose combination (j, k), the posterior probability that π_jk is greater than ϕ exceeds c_d, we de-escalate to the dose that belongs to the de-escalation set and has estimated toxicity probability closest to ϕ and lower than that of (j, k). If the current dose is (1, 1), the trial is terminated.
Otherwise, stay at the same dose combination.
MTD determination: after N subjects are exhausted, the MTD is determined as the dose combination with the estimated probability of toxicity closest to ϕ.
2.4. Model-based: continual reassessment method for partial ordering (POCRM) [33,34]
To solve the problem of only knowing partial order of dose toxicity, POCRM proposes to pre-specify a subset of possible orderings, then utilize the CRM on each of them. This way, two-dimensional dose-finding is reduced to a one-dimensional problem.
Dose-toxicity model: define T as the total number of dose combinations, d_1, …, d_T; x_n as the dose combination assigned to subject n; R(x_n) as the true toxicity probability of x_n; y_n as a binary indicator of whether subject n has a toxicity or not, y_n ∈ {0, 1}; and Ω_n = {(x_1, y_1), …, (x_n, y_n)} as the data collected after n subjects. Assume we have M possible partial orderings in total. For a specific ordering m, m = 1, …, M, the toxicity probability of each combination is modeled as below, similarly to the CRM,
| R(d_t) ≈ ψ_m(d_t, a) = α_m(d_t)^{exp(a)}, t = 1, …, T | (4) |
where ψ_m is the working dose-toxicity model under ordering m, α_m(d_t) is the skeleton value assigned to combination d_t under that ordering, and a is the model parameter. After having n patients, the likelihood under partial ordering m is
| L_m(a | Ω_n) = ∏_{i=1}^{n} ψ_m(x_i, a)^{y_i} {1 − ψ_m(x_i, a)}^{1 − y_i} | (5) |
Then, the estimate of parameter a under ordering m, â_m, can be obtained by maximizing Equation (5). We can then obtain the posterior probability of partial ordering m:
| p(m | Ω_n) = p(m) L_m(â_m | Ω_n) / ∑_{m′=1}^{M} p(m′) L_{m′}(â_{m′} | Ω_n) | (6) |
where p(m) is the prior probability of ordering m.
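To illustrate this ordering-weighting step, the R sketch below fits the one-parameter power model under each of two candidate orderings by maximum likelihood and normalizes the prior times the maximized likelihood as in Equation (6); the skeleton, the orderings, the data, and the function name ordering_weights are illustrative assumptions, not the pocrm package implementation.

```r
# POCRM ordering-selection sketch: fit a power-model CRM under each candidate
# ordering and weight the orderings by prior * maximized likelihood.
skeleton  <- c(0.05, 0.10, 0.17, 0.25, 0.30, 0.36, 0.42, 0.47, 0.52)  # 3 x 3 grid, flattened
orderings <- rbind(1:9, c(1, 2, 4, 3, 5, 7, 6, 8, 9))                 # two candidate orderings

ordering_weights <- function(y, x, skeleton, orderings,
                             prior = rep(1 / nrow(orderings), nrow(orderings))) {
  max_lik <- sapply(seq_len(nrow(orderings)), function(m) {
    skel_m <- numeric(length(skeleton))
    skel_m[orderings[m, ]] <- skeleton                  # skeleton arranged under ordering m
    loglik <- function(a) {
      prob <- skel_m[x]^exp(a)                          # power-model toxicity probabilities
      sum(y * log(prob) + (1 - y) * log(1 - prob))
    }
    exp(optimize(loglik, c(-5, 5), maximum = TRUE)$objective)
  })
  w <- prior * max_lik
  w / sum(w)                                            # normalized weight of each ordering
}

# Example: six subjects with flattened combination indices x and DLT indicators y
x <- c(1, 2, 3, 5, 5, 6); y <- c(0, 0, 0, 1, 0, 1)
ordering_weights(y, x, skeleton, orderings)
```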
Start-up phase: when POCRM was first proposed, it was a single stage method [33]. Later it was extended to include a start-up phase [34]. The start-up phase partitions the dose combination matrix to different ‘zones’ and starts the first cohort with zone 1, which is the lowest dose combination. If no DLT is observed, then it assigns next cohort to doses in zone 2. If there are multiple combinations in zone 2, it randomly selects one of them. If no DLT is observed, it continues to assign the next cohort to other combinations in the same zone. Moving to the next zone is only allowed when all the dose combinations have been explored in lower zones. The start-up phase ends when one DLT is observed. POCRM also allows users to specify their own scheme in the start-up phase as they see appropriate.
Post start-up escalation/de-escalation dose set: since POCRM pre-specifies a subset of possible orderings, the escalation and de-escalation dose are known within each ordering.
Post start-up trial conduct: after the start-up phase ends, the authors use weighted randomization to select the working partial ordering, with the posterior probabilities from Equation (6) serving as the weights. After selecting the working partial ordering m, R(d_t) is estimated for all combinations d_t through Equation (4), and the dose combination whose estimated toxicity probability is closest to ϕ is assigned to the next subject. For the final MTD determination, however, the ordering with the maximum posterior probability is chosen among all candidate orderings.
MTD determination: after N subjects are exhausted, the MTD is determined as the dose combination with the estimated probability of toxicity closest to ϕ, given the ordering with maximum posterior probability.
2.5. Model-based: hierarchical Bayesian design (Hierarchy) [3]
Dose-toxicity model: This method employs a hierarchical model:
| (7) |
| (8) |
| (9) |
where follows a multivariate normal distribution with mean and variance covariance matrix where is a identity matrix; follows a multivariate normal distribution with mean and the same variance covariance matrix; and are ‘effective doses’ instead of actual clinical values. This method omits the interaction effects between two agents.
The authors provided recommendations about selecting priors and methods to calculate ‘effective doses’. They used the fact that to obtain the solutions for and :
They suggested setting and selecting . They set , then
where
Therefore, this design needs inputs of and , from the clinicians.
Start-up phase: this method is a single stage design without a start-up phase.
Escalation/de-escalation dose set: if the current combination is (j, k), the acceptable dose combination set S for the next subject is defined as {(j, k), (j ± 1, k), (j, k ± 1), (j + 1, k + 1), (j − 1, k − 1), (j + 1, k − 1), (j − 1, k + 1)}. As the dose combinations (j + 1, k + 1) and (j − 1, k − 1) are included, this design allows simultaneous dose escalation or de-escalation of both agents.
Trial conduct:
Compute a CI for overall toxicity rate from cumulative data of currently recruited subjects.
If the lower bound of this CI is greater than ϕ, terminate the trial.
If the lower bound of this CI is less than or equal to ϕ, use all accumulated information to obtain the posterior means of the toxicity probabilities of all dose combinations.
Select the dose combination that belongs to set S and whose estimated toxicity probability is closest to ϕ. Assign this dose combination to the next patient.
Continue until all N subjects are exhausted.
MTD determination: Use the outcomes and assignments of all N subjects to derive the posterior means of the toxicity probabilities of all dose combinations. If the last subject received dose (j, k), then the dose combination that is among the set S of (j, k) and has estimated toxicity closest to ϕ will be selected as the MTD.
2.6. Model-based: logistic model-based Bayesian dose finding design DFCOMB [28]
Dose-toxicity model: this method uses logistic regression to link doses and toxicities of the two agents:
| (10) |
where and are ‘effective doses’ instead of actual clinical values, , , , to ensure monotonicity. The authors define ‘effective doses’ as , and recommend a vague normal prior for and , an informative prior exp for and .
Start-up phase: the start-up phase starts from the lowest dose combination (1, 1). If no toxicity is observed, the dose is escalated along the diagonal until at least one agent reaches its maximum dose. If still no toxicity is observed when one agent reaches its maximum dose, we increase the dose of the other agent until both agents reach their maximum doses. The start-up phase ends once the first toxicity is observed, and the model-based design starts.
Post start-up escalation/de-escalation dose set: if the current combination is (j, k), the dose escalation set is defined as {(j + 1, k), (j, k + 1), (j + 1, k − 1), (j − 1, k + 1)} and the dose de-escalation set as {(j − 1, k), (j, k − 1), (j + 1, k − 1), (j − 1, k + 1)}. As we only know the partial order of dose toxicity in combinational agents, we do not know whether dose combinations (j + 1, k − 1) and (j − 1, k + 1) are more or less toxic than dose combination (j, k). Therefore, the authors included (j + 1, k − 1) and (j − 1, k + 1) in both the escalation and de-escalation sets.
Post start-up trial conduct: in the model-based design part, the escalation and de-escalation rule is the same as in Copula design [39].
MTD determination: DFCOMB utilizes a different method to identify the MTD after the trial is completed. The dose combination that has the largest posterior probability of its toxicity lying within (ϕ − δ, ϕ + δ) and that has been used to treat at least one cohort will be selected as the MTD, where the parameter δ defines the length of the interval around the target toxicity probability.
2.7. Model-based: a generalized continual reassessment method gCRM [2]
This method is another generalization of the CRM.
Dose-toxicity model: it uses proportional odds logistic regression to model the dose-toxicity relationship:
| (11) |
where each dose level of agent B has its own intercept; β is a coefficient common to all models; and the 'effective dose' of agent A enters as the covariate. For example, if agent B has three dose levels, then gCRM will need three models, one for each level of agent B. These 'sub' models are later aggregated through a joint prior distribution that forces correlation among the intercepts. As observed, gCRM assumes no interaction between the two agents as well.
In terms of parameters and β, their paper assumes that β follows a Gamma distribution with mean and variance , , and defines for so the joint distribution of is multivariate normal. If one assumes that , can be obtained. One can approximately obtain . The authors recommend setting , , . Therefore, with the inputs of and for from clinicians, all parameters can be calculated.
Start-up phase: this method is a single stage design without a start-up phase.
Escalation/de-escalation dose set: if the current combination is (j, k), define the acceptable dose combination set S for the next subject to be {(j, k), (j ± 1, k), (j, k ± 1), (j + 1, k + 1), (j − 1, k − 1), (j + 1, k − 1), (j − 1, k + 1)}. As the dose combinations (j + 1, k + 1) and (j − 1, k − 1) are included, this design allows simultaneous dose escalation or de-escalation of both agents.
Trial conduct:
Treat the first patient at the lowest dose combination (1, 1).
For each subsequent patient, compute the estimated toxicity probability of every dose combination from the fitted proportional odds logit model, plugging in the posterior means of the parameters.
As the posterior distributions are updated continually, check whether the stopping rule has been reached. If yes, then terminate the trial; otherwise assign the next subject to the dose in set S whose estimated toxicity probability is closest to ϕ.
Continue until all N subjects are exhausted.
MTD determination: if the last subject received dose (j, k), then the dose combination that is among the set S of (j, k) and has estimated toxicity closest to ϕ will be selected as the MTD.
2.8. Model-based: bootstrap aggregating continual reassessment method bCRM [17]
Bootstrap aggregating CRM is similar to POCRM as they both use one-dimensional CRM to identify the MTD. However, bCRM keeps updating the toxicity ordering of dose combinations rather than pre-specifying them.
Dose-toxicity model: bCRM assigns a beta prior to the toxicity probability of each dose combination and obtains its posterior mean. bCRM then applies the two-dimensional pool-adjacent-violators algorithm (PAVA) [4] to these posterior means so that the smoothed estimates satisfy the partial ordering. To avoid ties among the smoothed estimates, a small perturbation term is added that is proportional to the rank of each dose combination and scaled by a small positive number ϵ. The resulting estimates induce a complete toxicity ordering of the dose combinations. As noted by the authors, such orderings could vary dramatically due to data sparsity. Therefore, they bootstrapped B samples of the data to obtain the corresponding orderings and toxicity probability estimates. The final estimate of each combination's toxicity probability is
| (12) |
where the cumulative data up to the current subject record, for each subject i, whether subject i experienced a DLT and which dose combination subject i received.
Start-up phase: the start-up phase is similar to DFCOMB.
Post start-up escalation/de-escalation dose set: similar to POCRM, bCRM uses one-dimensional CRM in the dose-finding process, therefore, the escalation and de-escalation dose is certain within each ordering.
Post start-up trial conduct: the trial conduct procedures are similar to DFCOMB.
MTD determination: After the trial is completed, one could select the combination that has been administered to patients and has the largest posterior probability of falling into the ε-neighborhood of ϕ, where ε is a small positive number.
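To illustrate the ε-neighborhood criterion, the R sketch below scores each administered combination by its posterior probability of lying within (ϕ − ε, ϕ + ε). bCRM computes this probability under its bagged model-based estimates; the simple per-combination Beta posterior and the function name select_eps_neighborhood used here are stand-in assumptions.

```r
# Select, among administered combinations, the one with the largest posterior
# probability of its toxicity lying in (phi - eps, phi + eps); a Beta(0.5, 0.5)
# prior per combination is used here purely as an illustration.
select_eps_neighborhood <- function(y, n, phi = 0.3, eps = 0.05) {
  tried <- which(n > 0)                                   # combinations given to patients
  prob_in <- pbeta(phi + eps, 0.5 + y[tried], 0.5 + n[tried] - y[tried]) -
             pbeta(phi - eps, 0.5 + y[tried], 0.5 + n[tried] - y[tried])
  tried[which.max(prob_in)]                               # flattened index of the selection
}

# Example: four combinations, with DLT counts y and patient counts n
y <- c(0, 1, 3, 0); n <- c(3, 6, 9, 0)
select_eps_neighborhood(y, n)   # index of the combination with most posterior mass near 0.3
```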
2.9. Model-assisted: combinational Bayesian optimal interval design cBOIN [18]
Combinational BOIN is a model-assisted design that is generalized from the single agent BOIN design [20,40].
Dose escalation and de-escalation rule: BOIN mainly involves two important parameters, λ_e and λ_d, which are the lower and upper interval boundaries. At the current dose j, the escalation and de-escalation rules are as follows:
if λ_e < p̂_j < λ_d, then the next cohort stays at the current dose;
if p̂_j ≤ λ_e, then the next cohort escalates to dose j + 1;
if p̂_j ≥ λ_d, then the next cohort de-escalates to dose j − 1;
where p̂_j is the estimated toxicity probability of dose j in single-agent dose-finding, which is simply the proportion of patients experiencing toxicities among those who received dose j.
In two-dimensional dose-finding, p̂_jk is calculated in the same way: p̂_jk = y_jk / n_jk.
An important task of cBOIN is to determine λ_e and λ_d. Minimizing the probability of an incorrect dose-assignment decision given the data at the current dose yields
λ_e = log{(1 − ϕ_1)/(1 − ϕ)} / log{ϕ(1 − ϕ_1)/(ϕ_1(1 − ϕ))} and λ_d = log{(1 − ϕ)/(1 − ϕ_2)} / log{ϕ_2(1 − ϕ)/(ϕ(1 − ϕ_2))},
where ϕ_1 is the highest toxicity probability deemed sub-therapeutic and ϕ_2 is the lowest toxicity probability deemed overly toxic. The authors suggested using ϕ_1 = 0.6ϕ and ϕ_2 = 1.4ϕ based on their simulation calibration.
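As a concrete check, the R sketch below computes these boundaries and applies the interval rule; the defaults ϕ_1 = 0.6ϕ and ϕ_2 = 1.4ϕ are used, and the function names are ours. For ϕ = 0.3 this gives λ_e ≈ 0.236 and λ_d ≈ 0.358.

```r
# BOIN interval boundaries and the escalation/de-escalation rule.
boin_boundaries <- function(phi, phi1 = 0.6 * phi, phi2 = 1.4 * phi) {
  lambda_e <- log((1 - phi1) / (1 - phi)) / log(phi * (1 - phi1) / (phi1 * (1 - phi)))
  lambda_d <- log((1 - phi) / (1 - phi2)) / log(phi2 * (1 - phi) / (phi * (1 - phi2)))
  c(lambda_e = lambda_e, lambda_d = lambda_d)
}

boin_decision <- function(n_dlt, n_pat, phi = 0.3) {
  p_hat <- n_dlt / n_pat                 # observed toxicity rate at the current combination
  b <- boin_boundaries(phi)
  if (p_hat <= b["lambda_e"]) "escalate"
  else if (p_hat >= b["lambda_d"]) "de-escalate"
  else "stay"
}

boin_boundaries(0.3)    # approximately 0.236 and 0.358
boin_decision(2, 9)     # 2/9 = 0.22 <= 0.236, so escalate
```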
Start-up phase: this method does not have a start-up phase.
Escalation/de-escalation dose set: the admissible dose escalation set is defined as {(j + 1, k), (j, k + 1)} and the admissible de-escalation set as {(j − 1, k), (j, k − 1)}.
Trial conduct:
Treat the first cohort at the lowest dose combination (1, 1).
Suppose that the current dose is combination (j, k). If p̂_jk ≤ λ_e, escalate to the dose combination that belongs to the admissible escalation set and has the largest posterior probability of its toxicity lying inside the interval (λ_e, λ_d).
If p̂_jk ≥ λ_d, de-escalate to the dose combination that belongs to the admissible de-escalation set and has the largest posterior probability of its toxicity lying inside the interval (λ_e, λ_d).
Otherwise, if λ_e < p̂_jk < λ_d, stay at the current dose.
Dose combinations whose posterior probability of the toxicity exceeding ϕ is greater than λ will be permanently excluded, where λ is a pre-specified threshold probability. If the lowest dose combination (1, 1) satisfies this stopping rule, the trial will be terminated early.
Continue until all N subjects are exhausted.
MTD determination: after the trial is completed, isotonic regression is applied to the observed toxicity rates p̂_jk to obtain estimates that satisfy the monotonic dose-toxicity assumption within each agent when the other agent's dose is fixed. The MTD is the dose combination whose isotonic estimate is closest to ϕ.
2.10. Model-assisted: combinational keyboard design cKeyboard [24]
Similar to combinational BOIN, combinational Keyboard is a model-assisted design as well. The combinational Keyboard design starts by specifying a target toxicity interval (ϕ − ε_1, ϕ + ε_2), where ε_1 and ε_2 are tolerable deviations from ϕ. This interval is called the target key. Then a series of keys of the same width are identified along both sides of the target key.
Dose escalation and de-escalation rule: in the single-agent setting, the escalation and de-escalation rules are straightforward. Define the 'strongest key' to be the key holding the largest posterior probability mass of the toxicity probability at the current dose j (see the sketch after these rules):
if the strongest key is below the target key, then the next cohort escalates to dose j + 1;
if the strongest key is the target key, then the next cohort stays at the current dose;
if the strongest key is above the target key, then the next cohort de-escalates to dose j − 1.
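The R sketch below illustrates the strongest-key calculation for a single dose: with a Beta(1, 1) prior, the posterior of the toxicity probability is Beta(1 + y, 1 + n − y), and the strongest key is the equal-width key holding the most posterior mass. The target key (0.25, 0.35), the handling of the partial end keys, and the function name keyboard_decision are illustrative assumptions.

```r
# Keyboard decision for one dose: find the key with the largest posterior mass
# and compare it with the target key. Partial keys at the two ends are ignored.
keyboard_decision <- function(y, n, target_key = c(0.25, 0.35)) {
  width  <- diff(target_key)
  breaks <- unique(c(rev(seq(target_key[1], 0, by = -width)),
                     seq(target_key[2], 1, by = width)))       # keys aligned with target key
  mass <- pbeta(breaks[-1], 1 + y, 1 + n - y) -
          pbeta(head(breaks, -1), 1 + y, 1 + n - y)            # posterior mass of each key
  k <- which.max(mass)
  strongest <- breaks[c(k, k + 1)]
  if (strongest[2] <= target_key[1]) "escalate"
  else if (strongest[1] >= target_key[2]) "de-escalate"
  else "stay"
}

keyboard_decision(y = 1, n = 6)   # 1 DLT in 6 patients: strongest key below the target key
```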
To address two-dimensional dose-finding, the authors define five strategies of admissible escalation and de-escalation sets. Based on simulations, the strategy whose admissible escalation and de-escalation sets are the same as those of the combinational BOIN design is recommended.
Start-up phase: this method does not have a start-up phase.
Escalation/de-escalation dose set: several dose assignment algorithms have been proposed for the Keyboard design, and the authors recommend defining the admissible dose escalation set to be {(j + 1, k), (j, k + 1)} and the admissible de-escalation set to be {(j − 1, k), (j, k − 1)}.
Trial conduct:
Treat the first cohort at the lowest dose combination (1, 1).
Suppose that the current dose is combination (j, k). If the strongest key is below the target key, escalate to the dose combination that belongs to the admissible escalation set and has the largest posterior probability of its toxicity lying in the target key.
If the strongest key is above the target key, de-escalate to the dose combination that belongs to the admissible de-escalation set and has the largest posterior probability of its toxicity lying in the target key.
Otherwise, if the strongest key is the target key, stay at the current dose.
Dose combinations whose posterior probability of the toxicity exceeding ϕ is greater than λ will be permanently excluded, where λ is a pre-specified threshold probability. If the lowest dose combination (1, 1) satisfies this stopping rule, the trial will be terminated early.
Continue until all N subjects are exhausted.
MTD determination: after the trial is completed, isotonic regression will be used to identify the MTD.
3. Simulation studies
In the simulation studies, our goal is to identify a single MTD of two combined agents. Simulation settings are borrowed from previous studies [9,26] and shown in Table 2. The target toxicity probability is 0.3. All designs started with the lowest dose combination. The cohort size was set to 3 for all designs that use cohorts as the dose-assignment unit, unless otherwise specified. Two thousand simulation runs were generated for each scenario.
Table 2.
Simulation settings.
| Agent A | |||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Agent B | 1 | 2 | 3 | 4 | 5 | 1 | 2 | 3 | 4 | 5 | 1 | 2 | 3 | 4 | 5 |
| Scenario 1 | Scenario 2 | Scenario 3 | |||||||||||||
| 1 | 0.05 | 0.1 | 0.15 | 0.3 | 0.45 | 0.15 | 0.3 | 0.45 | 0.5 | 0.6 | 0.02 | 0.07 | 0.1 | 0.15 | 0.3 |
| 2 | 0.1 | 0.15 | 0.3 | 0.45 | 0.55 | 0.3 | 0.45 | 0.5 | 0.6 | 0.75 | 0.07 | 0.1 | 0.15 | 0.3 | 0.45 |
| 3 | 0.15 | 0.3 | 0.45 | 0.5 | 0.6 | 0.45 | 0.55 | 0.6 | 0.7 | 0.8 | 0.1 | 0.15 | 0.3 | 0.45 | 0.55 |
| Scenario 4 | Scenario 5 | Scenario 6 | |||||||||||||
| 1 | 0.3 | 0.45 | 0.6 | 0.7 | 0.8 | 0.01 | 0.02 | 0.08 | 0.1 | 0.11 | 0.05 | 0.08 | 0.1 | 0.13 | 0.15 |
| 2 | 0.45 | 0.55 | 0.65 | 0.75 | 0.85 | 0.03 | 0.05 | 0.1 | 0.13 | 0.15 | 0.09 | 0.12 | 0.15 | 0.3 | 0.45 |
| 3 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 | 0.07 | 0.09 | 0.12 | 0.15 | 0.3 | 0.15 | 0.3 | 0.45 | 0.5 | 0.6 |
| Scenario 7 | Scenario 8 | Scenario 9 | |||||||||||||
| 1 | 0.07 | 0.1 | 0.12 | 0.15 | 0.3 | 0.02 | 0.1 | 0.15 | 0.5 | 0.6 | 0.005 | 0.01 | 0.02 | 0.04 | 0.07 |
| 2 | 0.15 | 0.3 | 0.45 | 0.52 | 0.6 | 0.05 | 0.12 | 0.3 | 0.55 | 0.7 | 0.02 | 0.05 | 0.08 | 0.12 | 0.15 |
| 3 | 0.3 | 0.5 | 0.6 | 0.65 | 0.75 | 0.08 | 0.15 | 0.45 | 0.6 | 0.8 | 0.15 | 0.3 | 0.45 | 0.55 | 0.65 |
| Scenario 10 | Scenario 11 | Scenario 12 | |||||||||||||
| 1 | 0.05 | 0.1 | 0.15 | 0.3 | 0.45 | 0.08 | 0.14 | 0.19 | 0.3 | 0.05 | 0.1 | 0.2 | 0.3 | ||
| 2 | 0.45 | 0.5 | 0.6 | 0.65 | 0.7 | 0.1 | 0.2 | 0.3 | 0.55 | 0.08 | 0.3 | 0.45 | 0.5 | ||
| 3 | 0.7 | 0.75 | 0.8 | 0.85 | 0.9 | 0.15 | 0.3 | 0.52 | 0.6 | 0.15 | 0.35 | 0.5 | 0.55 | ||
| 4 | 0.3 | 0.5 | 0.6 | 0.7 | 0.3 | 0.5 | 0.6 | 0.7 | |||||||
| Scenario 13 | Scenario 14 | Scenario 15 | |||||||||||||
| 1 | 0.05 | 0.08 | 0.1 | 0.3 | 0.01 | 0.05 | 0.1 | 0.3 | 0.01 | 0.1 | 0.15 | 0.45 | |||
| 2 | 0.08 | 0.1 | 0.2 | 0.35 | 0.05 | 0.1 | 0.45 | 0.5 | 0.03 | 0.3 | 0.4 | 0.5 | |||
| 3 | 0.1 | 0.2 | 0.3 | 0.4 | 0.1 | 0.45 | 0.5 | 0.6 | 0.05 | 0.5 | 0.55 | 0.65 | |||
| 4 | 0.3 | 0.35 | 0.4 | 0.6 | 0.3 | 0.5 | 0.6 | 0.65 | 0.08 | 0.55 | 0.6 | 0.75 | |||
3.1. Simulation scenarios
A total of 15 scenarios are displayed in Table 2. In the first 10 scenarios, agent A has 5 dose levels and agent B has 3. In scenarios 11 to 15, both agents have 4 dose levels. Target toxicity rate 0.3 is bolded. Among the first ten matrices: scenario 1 contains multiple MTD locations that are in the middle of matrix and diagonally connected; scenarios 2 and 4 represent over-toxic situations while scenario 4 is more extreme; scenarios 3 and 5 represent over-conservative situations while scenario 5 is more extreme; scenarios 6 and 7 contain multiple MTD locations but those locations are more scattered; scenarios 8, 9, and 10 contain single MTD at different locations. Among the last five square matrices: scenario 11 contains multiple MTD locations that are in the middle of matrix and diagonally connected; scenarios 12 and 13 contain multiple but more scattered MTD locations; scenario 14 contains two MTD locations that are at the bottom left and top right; scenario 15 has single MTD location.
3.2. Evaluation metrics
Four evaluation metrics are used: (1) correct MTD selection, defined as the proportion of simulation runs that correctly identified the MTD among all 2000 simulations; (2) over-toxic MTD selection, defined as the proportion of simulation runs that identified over-toxic doses as the MTD among all 2000 simulations; (3) correct patient assignment, defined as the average proportion of patients assigned to the MTD during the trial across all 2000 simulations; and (4) over-toxic patient assignment, defined as the average proportion of patients assigned to over-toxic doses during the trial across all 2000 simulations. The first two metrics evaluate the performance of the designs in terms of MTD selection: the larger the correct MTD selection, the more accurate the design is in selecting the correct MTD; the larger the over-toxic MTD selection, the more aggressive the design is in selecting the MTD. The last two metrics characterize the designs during trial conduct: the larger the correct patient assignment, the more accurate patient assignment is during the trial; the larger the over-toxic patient assignment, the more aggressive the design is in dose escalation during the trial. Ideally, a design should show relatively large correct MTD selection and correct patient assignment but small over-toxic MTD selection and over-toxic patient assignment.
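The R sketch below computes these four metrics from simulated trial output; the input format (a vector of selected combination indices and a trials-by-combinations matrix of patient counts) and the function name trial_metrics are our own assumptions about how simulation results are stored.

```r
# Compute the four operating characteristics from simulated trials.
# selected: selected MTD index per trial (NA if no selection); alloc: matrix of patient
# counts (one row per trial, one column per combination); mtd_set / overtox_set: index
# sets of the true MTDs and of the over-toxic combinations.
trial_metrics <- function(selected, alloc, mtd_set, overtox_set) {
  c(correct_selection    = mean(selected %in% mtd_set),
    overtoxic_selection  = mean(selected %in% overtox_set),
    correct_assignment   = mean(rowSums(alloc[, mtd_set, drop = FALSE]) / rowSums(alloc)),
    overtoxic_assignment = mean(rowSums(alloc[, overtox_set, drop = FALSE]) / rowSums(alloc)))
}

# Example with three simulated trials on four combinations (combination 2 is the MTD,
# combination 4 is over-toxic)
selected <- c(2, 2, 4)
alloc    <- rbind(c(3, 9, 3, 0), c(0, 12, 3, 0), c(3, 3, 6, 3))
trial_metrics(selected, alloc, mtd_set = 2, overtox_set = 4)
```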
3.3. Design specifications
For I2D, we implemented the published R code [7]. We set the cohort size of the start-up phase to 1, based on suggestions from simulation studies when the target toxicity rate is 0.3 [11,36], and the interaction term to 0 to be consistent with the paper's focus. The prior of the parameters is the product of two independent exponential distributions with mean 1, the same as that used in the I2D study [36].
For Copula, the website (http://www.blackwellpublishing.com/rss) where the simulation programs were originally published is no longer accessible, so we used the executable file on the website https://odin.mdacc.tmc.edu/yyuan/index_code.html. The executable file uses escalation and de-escalation probability boundaries fixed at 0.8 and 0.45, respectively.
Hierarchy was implemented via R codes from website http://www-personal.umich.edu/tombraun/software.html. We set to be 10 based on the suggestion in the paper [3]. Together with other recommendations from the authors, we could obtain priors for all involved parameters. Details are described in Section 2.5.
POCRM was implemented via the R package pocrm. We utilized six possible partial orderings as suggested [32,35]: across rows, across columns, up diagonals, down diagonals, up-down diagonals, and down-up diagonals. As we do not have information about which partial ordering is more likely, the prior probabilities of all six partial orderings were set to be equal. The skeleton required by the program was obtained using the getprior function in the package dfcrm, which implements the algorithm of Lee and Cheung [15], as suggested [35]. For the start-up phase, we used the 'zoning' method as suggested [33,34].
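For instance, a skeleton matching the 'Main setting' in Table 1 for the 5 × 3 scenarios (half-width 0.05 and prior MTD position 11 among the 15 combinations) can be generated as below; the call shown is an illustration of how getprior was used.

```r
library(dfcrm)
# Skeleton for 15 dose combinations: half-width 0.05, target 0.3, prior MTD at position 11
skeleton <- getprior(halfwidth = 0.05, target = 0.3, nu = 11, nlevel = 15)
round(skeleton, 3)
```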
Design cBOIN was implemented via R package BOIN. For cBOIN, the interval boundaries were set to be 0.18 and 0.42 as suggested [18].
Design cKeyboard was implemented via R package Keyboard.
DFCOMB was implemented via the R package dfcomb. As recommended by the authors, we used a vague normal prior for the intercept and interaction parameters and informative exponential priors for the slope parameters [28]. For DFCOMB, we set the target toxicity boundaries as 0.18 and 0.42 to be consistent with cBOIN. As one of our reviewers suggested, we also tried setting these toxicity boundaries as 0.25 and 0.35.
Design gCRM was implemented via R codes from website http://www-personal.umich.edu/tombraun/software.html. As the authors suggested, we used a Gamma prior with mean 1 and variance 1 for β, normal prior with mean -8 and variance 1 for , normal prior with mean and variance 2 for , where [2].
Design bCRM was implemented via R codes from the authors. Its skeleton setting is the same as POCRM.
For most model-based designs, several design parameters are involved. Some of these parameters have recommended specifications provided by the authors; for example, the interval boundaries ϕ_1 and ϕ_2 in the cBOIN design are recommended to be set to 0.6ϕ and 1.4ϕ, where ϕ is the target toxicity probability [18]. However, some design parameters lack authors' suggested specifications, and their influence on design performance is not clear. Therefore, we list such design parameters and the corresponding designs in Table 1, where the columns 'Main setting' and 'Alternative setting' contain the parameter specifications used in our simulation studies.
Table 1.
Explored design parameters.
| Main Setting | Alternative Setting | ||||
|---|---|---|---|---|---|
| Design | Parameter | ||||
| I2D Copula DFCOMB | and | : 0.1, 0.2, 0.25, 0.3, 0.35 | : 0.1, 0.2, 0.25, 0.3 | : 0.05, 0.1, 0.2, 0.25, 0.3 | : 0.05, 0.1, 0.2, 0.22 |
| : 0.1, 0.3, 0.35 | : 0.1, 0.2, 0.25, 0.3 | : 0.1, 0.2, 0.25 | : 0.05, 0.1, 0.2, 0.22 | ||
| POCRM bCRM | Skeleton setting | half width: 0.05 | half width: 0.05 | half width: 0.03 | half width: 0.03 |
| MTD position: 11 | MTD position: 12 | MTD position: 13 | MTD position: 15 | ||
| DFCOMB bCRM | Escalation/de-escalation probability cutoff | 0.85 and 0.45 | 0.6 and 0.6 | ||
| Hierarchy gCRM | and | truth of each scenario | incorrect guess | ||
For all designs and scenarios, the maximum sample size was set to be 60. This number is widely used in other studies [2,17,18,24,26,39]. To verify that our conclusions are still valid with a different sample size, we repeated all simulations with a maximum sample size of 30.
4. Results
4.1. When maximum sample size is 60
Tables 3-6 display design performances in terms of correct MTD selection, over-toxic MTD selection, correct patient assignment, and over-toxic patient assignment, respectively. In these tables, I2D with the p_j and q_k specification in column 'Main setting' of Table 1 is denoted as 'I2D'; I2D with the p_j and q_k specification in column 'Alternative setting' is denoted as 'I2D.pq'. Copula with the p_j and q_k specification in column 'Main setting' is denoted as 'Copula'; Copula with the p_j and q_k specification in column 'Alternative setting' is denoted as 'Copula.pq'. Hierarchy with the prior toxicity guesses specified in column 'Main setting' is denoted as 'Hierarchy'; Hierarchy with the prior toxicity guesses specified in column 'Alternative setting' is denoted as 'Hierarchy.pi'. POCRM with the skeleton specification in column 'Main setting' is denoted as 'POCRM'; POCRM with the skeleton specification in column 'Alternative setting' is denoted as 'POCRM.skeleton'. DFCOMB with the p_j and q_k specification and the escalation/de-escalation probability cutoffs in column 'Main setting' is denoted as 'DFCOMB'; DFCOMB with the p_j and q_k specification in column 'Alternative setting' is denoted as 'DFCOMB.pq'; DFCOMB with the escalation/de-escalation probability cutoff specification in column 'Alternative setting' is denoted as 'DFCOMB.cut'; DFCOMB with the p_j and q_k specification and the escalation/de-escalation probability cutoffs in column 'Main setting', but with toxicity boundaries of 0.25 and 0.35 as suggested by one of our reviewers, is denoted as 'DFCOMB.sensitive'. Design gCRM with the prior toxicity guesses specified in column 'Main setting' is denoted as 'gCRM'; gCRM with the prior toxicity guesses specified in column 'Alternative setting' is denoted as 'gCRM.pi'. Design bCRM with the skeleton and escalation/de-escalation probability cutoff specifications in column 'Main setting' is denoted as 'bCRM'; bCRM with the skeleton specification in column 'Alternative setting' is denoted as 'bCRM.skeleton'; bCRM with the escalation/de-escalation probability cutoff specification in column 'Alternative setting' is denoted as 'bCRM.cut'. We marked design metrics that are outstandingly poor in red, and those that are poor but not outstandingly different from the others in magenta.
Table 4.
Performance of MTD selection of designs across scenarios when maximum sample size is 60.
| Simulation scenario | |||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | |
| Design | Selection of over-toxic MTD | ||||||||||||||
| I2D | 0.18 | 0.18 | 0.18 | 0.10 | 0 | 0.29 | 0.07 | 0.26 | 0.39 | 0.32 | 0.12 | 0.18 | 0.43 | 0.31 | 0.28 |
| I2D.pq | 0.19 | 0.19 | 0.18 | 0.19 | 0 | 0.25 | 0.17 | 0.30 | 0.42 | 0.41 | 0.19 | 0.32 | 0.43 | 0.35 | 0.45 |
| Copula | 0.32 | 0.19 | 0.36 | 0.09 | 0 | 0.43 | 0.22 | 0.58 | 0.55 | 0.63 | 0.19 | 0.43 | 0.66 | 0.69 | 0.50 |
| Copula.pq | 0.30 | 0.16 | 0.33 | 0.09 | 0 | 0.46 | 0.33 | 0.44 | 0.53 | 0.78 | 0.22 | 0.51 | 0.41 | 0.62 | 0.54 |
| Hierarchy | 0.22 | 0.18 | 0.21 | 0.16 | 0 | 0.30 | 0.31 | 0.30 | 0.17 | 0.20 | 0.13 | 0.37 | 0.39 | 0.41 | 0.22 |
| Hierarchy.pi | 0.22 | 0.23 | 0.14 | 0.24 | 0 | 0.29 | 0.24 | 0.27 | 0.40 | 0.39 | 0.13 | 0.43 | 0.44 | 0.54 | 0.47 |
| POCRM | 0.12 | 0.24 | 0.08 | 0.22 | 0 | 0.11 | 0.19 | 0.17 | 0.18 | 0.26 | 0.04 | 0.30 | 0.28 | 0.32 | 0.31 |
| POCRM.skeleton | 0.15 | 0.24 | 0.12 | 0.20 | 0 | 0.17 | 0.21 | 0.18 | 0.23 | 0.27 | 0.06 | 0.34 | 0.32 | 0.34 | 0.39 |
| DFCOMB | 0.18 | 0.08 | 0.17 | 0.06 | 0 | 0.27 | 0.09 | 0.27 | 0.57 | 0.32 | 0.08 | 0.19 | 0.25 | 0.61 | 0.21 |
| DFCOMB.pq | 0.10 | 0.06 | 0.14 | 0.08 | 0 | 0.31 | 0.13 | 0.37 | 0.62 | 0.52 | 0.08 | 0.30 | 0.28 | 0.51 | 0.45 |
| DFCOMB.cut | 0.08 | 0.07 | 0.11 | 0.08 | 0 | 0.07 | 0.05 | 0.19 | 0.37 | 0.20 | 0.02 | 0.12 | 0.16 | 0.53 | 0.23 |
| DFCOMB.sensitive | 0.25 | 0.10 | 0.21 | 0.09 | 0 | 0.36 | 0.13 | 0.36 | 0.65 | 0.37 | 0.12 | 0.25 | 0.32 | 0.66 | 0.27 |
| gCRM | 0.15 | 0.13 | 0.13 | 0.10 | 0 | 0.18 | 0.10 | 0.31 | 0.11 | 0.33 | 0.06 | 0.32 | 0.31 | 0.52 | 0.29 |
| gCRM.pi | 0.18 | 0.14 | 0.17 | 0.10 | 0 | 0.19 | 0.08 | 0.29 | 0.13 | 0.26 | 0.07 | 0.22 | 0.38 | 0.45 | 0.44 |
| cBOIN | 0.16 | 0.21 | 0.15 | 0.17 | 0 | 0.19 | 0.13 | 0.21 | 0.13 | 0.31 | 0.08 | 0.29 | 0.43 | 0.34 | 0.29 |
| cKeyboard | 0.17 | 0.21 | 0.14 | 0.17 | 0 | 0.20 | 0.14 | 0.21 | 0.12 | 0.31 | 0.09 | 0.27 | 0.43 | 0.34 | 0.30 |
| bCRM | 0.08 | 0.22 | 0.05 | 0.24 | 0 | 0.11 | 0.15 | 0.20 | 0.20 | 0.38 | 0.03 | 0.27 | 0.19 | 0.53 | 0.42 |
| bCRM.skeleton | 0.12 | 0.22 | 0.08 | 0.16 | 0 | 0.17 | 0.21 | 0.29 | 0.43 | 0.50 | 0.05 | 0.33 | 0.21 | 0.58 | 0.46 |
| bCRM.cut | 0.12 | 0.30 | 0.07 | 0.29 | 0 | 0.12 | 0.19 | 0.19 | 0.17 | 0.36 | 0.04 | 0.32 | 0.24 | 0.53 | 0.44 |
Notes: I2D: design I2D with parameter and specified in ‘Main setting’ of Table 1.
I2D.pq: design I2D with parameter and specified in ‘Alternative setting’.
Copula: design Copula with parameter and specified in ‘Main setting’.
Copula.pq: design Copula with parameter and specified in ‘Alternative setting’.
Hierarchy: design Hierarchy with and specified in ‘Main setting’.
Hierarchy.pi: design Hierarchy with and specified in ‘Alternative setting’.
POCRM: design POCRM with skeleton specified in ‘Main setting’.
POCRM.skeleton: design POCRM with skeleton specified in ‘Alternative setting’.
DFCOMB: design DFCOMB with and , and escalation/de-escalation probability cutoff specified in ‘Main setting’.
DFCOMB.pq: design DFCOMB with and specified in ‘Alternative setting’, and escalation/de-escalation probability cutoff specified in ‘Main setting’.
DFCOMB.cut: design DFCOMB with and specified in ‘Main setting’, and escalation/de-escalation probability cutoff specified in ‘Alternative setting’.
DFCOMB.sensitive: design DFCOMB with and , and escalation/de-escalation probability cutoff specified in ‘Main setting’, but target toxicity interval boundaries suggested by one of our reviewers.
gCRM: design gCRM with and specified in ‘Main setting’.
gCRM.pi: design gCRM with and specified in ‘Alternative setting’.
bCRM: design bCRM with skeleton, and escalation/de-escalation probability cutoff specified in ‘Main setting’.
bCRM.skeleton: design bCRM with skeleton specified in ‘Alternative setting’, and escalation/de-escalation probability cutoff specified in ‘Main setting’.
bCRM.cut: design bCRM with skeleton specified in ‘Main setting’, and escalation/de-escalation probability cutoff specified in ‘Alternative setting’.
Table 5.
Performance of patient assignment of designs across scenarios when maximum sample size is 60.
| Simulation scenario | |||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | |
| Design | Patient receiving correct MTD during trials | ||||||||||||||
| I2D | 0.38 | 0.57 | 0.45 | 0.71 | 0.58 | 0.23 | 0.50 | 0.07 | 0.05 | 0.33 | 0.44 | 0.39 | 0.25 | 0.27 | 0.18 |
| I2D.pq | 0.45 | 0.56 | 0.44 | 0.55 | 0.61 | 0.26 | 0.42 | 0.15 | 0.03 | 0.28 | 0.50 | 0.34 | 0.27 | 0.28 | 0.11 |
| Copula | 0.33 | 0.57 | 0.31 | 0.72 | 0.46 | 0.29 | 0.41 | 0.12 | 0.31 | 0.09 | 0.32 | 0.33 | 0.08 | 0.07 | 0.34 |
| Copula.pq | 0.41 | 0.55 | 0.37 | 0.63 | 0.49 | 0.27 | 0.43 | 0.26 | 0.18 | 0.00 | 0.40 | 0.23 | 0.27 | 0.07 | 0.12 |
| Hierarchy | 0.42 | 0.50 | 0.43 | 0.68 | 0.68 | 0.31 | 0.35 | 0.22 | 0.30 | 0.35 | 0.43 | 0.27 | 0.38 | 0.15 | 0.39 |
| Hierarchy.pi | 0.42 | 0.40 | 0.52 | 0.49 | 0.65 | 0.30 | 0.41 | 0.26 | 0.07 | 0.04 | 0.43 | 0.23 | 0.27 | 0.10 | 0.16 |
| POCRM | 0.53 | 0.53 | 0.47 | 0.64 | 0.33 | 0.37 | 0.43 | 0.35 | 0.30 | 0.36 | 0.57 | 0.39 | 0.33 | 0.36 | 0.33 |
| POCRM.skeleton | 0.52 | 0.48 | 0.48 | 0.68 | 0.47 | 0.36 | 0.38 | 0.35 | 0.29 | 0.33 | 0.56 | 0.38 | 0.34 | 0.35 | 0.20 |
| DFCOMB | 0.23 | 0.45 | 0.37 | 0.91 | 0.39 | 0.16 | 0.33 | 0.23 | 0.13 | 0.16 | 0.20 | 0.28 | 0.12 | 0.11 | 0.35 |
| DFCOMB.pq | 0.34 | 0.48 | 0.37 | 0.91 | 0.37 | 0.17 | 0.33 | 0.20 | 0.06 | 0.04 | 0.28 | 0.29 | 0.19 | 0.23 | 0.28 |
| DFCOMB.cut | 0.34 | 0.46 | 0.39 | 0.77 | 0.51 | 0.23 | 0.41 | 0.17 | 0.17 | 0.28 | 0.39 | 0.33 | 0.21 | 0.11 | 0.26 |
| DFCOMB.sensitive | 0.23 | 0.45 | 0.37 | 0.91 | 0.39 | 0.16 | 0.33 | 0.23 | 0.13 | 0.16 | 0.20 | 0.28 | 0.12 | 0.11 | 0.35 |
| gCRM.b | 0.46 | 0.45 | 0.49 | 0.75 | 0.69 | 0.36 | 0.46 | 0.27 | 0.29 | 0.36 | 0.44 | 0.34 | 0.34 | 0.14 | 0.33 |
| gCRM.pi | 0.37 | 0.44 | 0.41 | 0.75 | 0.68 | 0.32 | 0.54 | 0.16 | 0.25 | 0.31 | 0.43 | 0.40 | 0.25 | 0.18 | 0.18 |
| cBOIN | 0.43 | 0.49 | 0.40 | 0.72 | 0.43 | 0.34 | 0.46 | 0.21 | 0.26 | 0.20 | 0.44 | 0.37 | 0.23 | 0.21 | 0.25 |
| cKeyboard | 0.42 | 0.49 | 0.40 | 0.72 | 0.43 | 0.33 | 0.44 | 0.21 | 0.25 | 0.20 | 0.43 | 0.37 | 0.23 | 0.22 | 0.24 |
| bCRM | 0.43 | 0.52 | 0.37 | 0.70 | 0.24 | 0.26 | 0.35 | 0.24 | 0.18 | 0.24 | 0.36 | 0.37 | 0.26 | 0.21 | 0.30 |
| bCRM.skeleton | 0.46 | 0.49 | 0.44 | 0.77 | 0.34 | 0.28 | 0.37 | 0.25 | 0.16 | 0.19 | 0.42 | 0.37 | 0.31 | 0.20 | 0.28 |
| bCRM.cut | 0.47 | 0.44 | 0.47 | 0.52 | 0.36 | 0.30 | 0.35 | 0.26 | 0.15 | 0.25 | 0.47 | 0.35 | 0.29 | 0.21 | 0.24 |
Notes: I2D: design I2D with parameter and specified in ‘Main setting’ of Table 1.
I2D.pq: design I2D with parameter and specified in ‘Alternative setting’.
Copula: design Copula with parameter and specified in ‘Main setting’.
Copula.pq: design Copula with parameter and specified in ‘Alternative setting’.
Hierarchy: design Hierarchy with and specified in ‘Main setting’.
Hierarchy.pi: design Hierarchy with and specified in ‘Alternative setting’.
POCRM: design POCRM with skeleton specified in ‘Main setting’.
POCRM.skeleton: design POCRM with skeleton specified in ‘Alternative setting’.
DFCOMB: design DFCOMB with and , and escalation/de-escalation probability cutoff specified in ‘Main setting’.
DFCOMB.pq: design DFCOMB with and specified in ‘Alternative setting’, and escalation/de-escalation probability cutoff specified in ‘Main setting’.
DFCOMB.cut: design DFCOMB with and specified in ‘Main setting’, and escalation/de-escalation probability cutoff specified in ‘Alternative setting’.
DFCOMB.sensitive: design DFCOMB with and , and escalation/de-escalation probability cutoff specified in ‘Main setting’, but target toxicity interval boundaries suggested by one of our reviewers.
gCRM: design gCRM with and specified in ‘Main setting’.
gCRM.pi: design gCRM with and specified in ‘Alternative setting’.
bCRM: design bCRM with skeleton, and escalation/de-escalation probability cutoff specified in ‘Main setting’.
bCRM.skeleton: design bCRM with skeleton specified in ‘Alternative setting’, and escalation/de-escalation probability cutoff specified in ‘Main setting’.
bCRM.cut: design bCRM with skeleton specified in ‘Main setting’, and escalation/de-escalation probability cutoff specified in ‘Alternative setting’.
Table 3.
Performance of MTD selection of designs across scenarios when maximum sample size is 60.
| Simulation scenario | |||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | |
| Design | Selection of correct MTD | ||||||||||||||
| I2D | 0.43 | 0.71 | 0.41 | 0.9 | 0.86 | 0.34 | 0.59 | 0.11 | 0.15 | 0.43 | 0.47 | 0.47 | 0.23 | 0.25 | 0.24 |
| I2D.pq | 0.58 | 0.73 | 0.54 | 0.81 | 0.88 | 0.35 | 0.54 | 0.2 | 0.09 | 0.3 | 0.58 | 0.45 | 0.28 | 0.4 | 0.16 |
| Copula | 0.53 | 0.6 | 0.52 | 0.1 | 0.9 | 0.44 | 0.65 | 0.15 | 0.32 | 0.26 | 0.56 | 0.45 | 0.12 | 0.12 | 0.42 |
| Copula.pq | 0.59 | 0.6 | 0.58 | 0.06 | 0.91 | 0.37 | 0.55 | 0.4 | 0.12 | 0.01 | 0.53 | 0.26 | 0.4 | 0.05 | 0.13 |
| Hierarchy | 0.61 | 0.64 | 0.62 | 0.61 | 0.81 | 0.45 | 0.45 | 0.3 | 0.47 | 0.52 | 0.6 | 0.28 | 0.47 | 0.24 | 0.52 |
| Hierarchy.pi | 0.61 | 0.5 | 0.71 | 0.32 | 0.78 | 0.42 | 0.52 | 0.42 | 0.09 | 0.11 | 0.6 | 0.24 | 0.39 | 0.11 | 0.17 |
| POCRM | 0.75 | 0.71 | 0.69 | 0.78 | 0.54 | 0.59 | 0.56 | 0.59 | 0.52 | 0.58 | 0.74 | 0.52 | 0.46 | 0.57 | 0.48 |
| POCRM.skeleton | 0.74 | 0.67 | 0.71 | 0.8 | 0.73 | 0.57 | 0.51 | 0.6 | 0.49 | 0.53 | 0.77 | 0.5 | 0.44 | 0.54 | 0.35 |
| DFCOMB | 0.54 | 0.76 | 0.66 | 0.65 | 0.54 | 0.33 | 0.69 | 0.48 | 0.15 | 0.47 | 0.44 | 0.63 | 0.37 | 0.18 | 0.67 |
| DFCOMB.pq | 0.69 | 0.8 | 0.65 | 0.64 | 0.52 | 0.34 | 0.71 | 0.3 | 0.09 | 0.14 | 0.57 | 0.56 | 0.33 | 0.35 | 0.45 |
| DFCOMB.cut | 0.59 | 0.79 | 0.64 | 0.65 | 0.54 | 0.26 | 0.70 | 0.36 | 0.24 | 0.57 | 0.51 | 0.63 | 0.36 | 0.15 | 0.60 |
| DFCOMB.sensitive | 0.54 | 0.78 | 0.66 | 0.62 | 0.58 | 0.33 | 0.69 | 0.44 | 0.15 | 0.47 | 0.46 | 0.61 | 0.38 | 0.20 | 0.64 |
| gCRM | 0.69 | 0.65 | 0.71 | 0.42 | 0.81 | 0.59 | 0.67 | 0.34 | 0.47 | 0.55 | 0.64 | 0.47 | 0.49 | 0.17 | 0.48 |
| gCRM.pi | 0.58 | 0.64 | 0.61 | 0.47 | 0.81 | 0.5 | 0.74 | 0.26 | 0.48 | 0.51 | 0.61 | 0.57 | 0.34 | 0.22 | 0.23 |
| cBOIN | 0.7 | 0.69 | 0.7 | 0.62 | 0.72 | 0.58 | 0.74 | 0.38 | 0.4 | 0.45 | 0.75 | 0.57 | 0.38 | 0.4 | 0.37 |
| cKeyboard | 0.67 | 0.7 | 0.7 | 0.6 | 0.72 | 0.56 | 0.71 | 0.38 | 0.4 | 0.45 | 0.73 | 0.58 | 0.38 | 0.43 | 0.36 |
| bCRM | 0.72 | 0.75 | 0.66 | 0.76 | 0.52 | 0.51 | 0.63 | 0.51 | 0.37 | 0.5 | 0.62 | 0.54 | 0.39 | 0.35 | 0.47 |
| bCRM.skeleton | 0.75 | 0.72 | 0.73 | 0.84 | 0.69 | 0.59 | 0.64 | 0.51 | 0.36 | 0.36 | 0.69 | 0.52 | 0.47 | 0.33 | 0.45 |
| bCRM.cut | 0.75 | 0.67 | 0.71 | 0.71 | 0.59 | 0.56 | 0.63 | 0.54 | 0.40 | 0.52 | 0.71 | 0.53 | 0.40 | 0.37 | 0.43 |
Notes: I2D: design I2D with parameter and specified in ‘Main setting’ of Table 1.
I2D.pq: design I2D with parameter and specified in ‘Alternative setting’.
Copula: design Copula with parameter and specified in ‘Main setting’.
Copula.pq: design Copula with parameter and specified in ‘Alternative setting’.
Hierarchy: design Hierarchy with and specified in ‘Main setting’.
Hierarchy.pi: design Hierarchy with and specified in ‘Alternative setting’.
POCRM: design POCRM with skeleton specified in ‘Main setting’.
POCRM.skeleton: design POCRM with skeleton specified in ‘Alternative setting’.
DFCOMB: design DFCOMB with and , and escalation/de-escalation probability cutoff specified in ‘Main setting’.
DFCOMB.pq: design DFCOMB with and specified in ‘Alternative setting’, and escalation/de-escalation probability cutoff specified in ‘Main setting’.
DFCOMB.cut: design DFCOMB with and specified in ‘Main setting’, and escalation/de-escalation probability cutoff specified in ‘Alternative setting’.
DFCOMB.sensitive: design DFCOMB with and , and escalation/de-escalation probability cutoff specified in ‘Main setting’, but target toxicity interval boundaries suggested by one of our reviewers.
gCRM: design gCRM with and specified in ‘Main setting’.
gCRM.pi: design gCRM with and specified in ‘Alternative setting’.
bCRM: design bCRM with skeleton, and escalation/de-escalation probability cutoff specified in ‘Main setting’.
bCRM.skeleton: design bCRM with skeleton specified in ‘Alternative setting’, and escalation/de-escalation probability cutoff specified in ‘Main setting’.
bCRM.cut: design bCRM with skeleton specified in ‘Main setting’, and escalation/de-escalation probability cutoff specified in ‘Alternative setting’.
Table 6.
Performance of patient assignment of designs across scenarios when maximum sample size is 60.
| Simulation scenario | |||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | |
| Design | Patient receiving over-toxic doses during trials | ||||||||||||||
| I2D | 0.29 | 0.27 | 0.19 | 0.29 | 0 | 0.26 | 0.16 | 0.37 | 0.35 | 0.31 | 0.15 | 0.28 | 0.36 | 0.38 | 0.42 |
| I2D.pq | 0.26 | 0.35 | 0.22 | 0.45 | 0 | 0.30 | 0.25 | 0.33 | 0.36 | 0.34 | 0.20 | 0.38 | 0.40 | 0.44 | 0.55 |
| Copula | 0.17 | 0.13 | 0.19 | 0.28 | 0 | 0.18 | 0.13 | 0.31 | 0.23 | 0.41 | 0.13 | 0.28 | 0.39 | 0.44 | 0.30 |
| Copula.pq | 0.17 | 0.20 | 0.16 | 0.37 | 0 | 0.18 | 0.17 | 0.26 | 0.23 | 0.45 | 0.14 | 0.36 | 0.27 | 0.44 | 0.41 |
| Hierarchy | 0.35 | 0.35 | 0.31 | 0.32 | 0 | 0.36 | 0.36 | 0.43 | 0.30 | 0.44 | 0.24 | 0.44 | 0.44 | 0.60 | 0.38 |
| Hierarchy.pi | 0.34 | 0.46 | 0.23 | 0.51 | 0 | 0.33 | 0.33 | 0.38 | 0.38 | 0.52 | 0.24 | 0.48 | 0.47 | 0.58 | 0.51 |
| POCRM | 0.16 | 0.32 | 0.11 | 0.36 | 0 | 0.15 | 0.24 | 0.24 | 0.21 | 0.38 | 0.10 | 0.33 | 0.25 | 0.36 | 0.34 |
| POCRM.skeleton | 0.20 | 0.37 | 0.16 | 0.29 | 0 | 0.21 | 0.27 | 0.29 | 0.26 | 0.40 | 0.12 | 0.34 | 0.29 | 0.38 | 0.45 |
| DFCOMB | 0.11 | 0.05 | 0.14 | 0.09 | 0 | 0.14 | 0.08 | 0.23 | 0.25 | 0.23 | 0.06 | 0.15 | 0.21 | 0.45 | 0.26 |
| DFCOMB.pq | 0.09 | 0.07 | 0.14 | 0.09 | 0 | 0.17 | 0.13 | 0.29 | 0.28 | 0.33 | 0.06 | 0.18 | 0.19 | 0.36 | 0.30 |
| DFCOMB.cut | 0.21 | 0.17 | 0.24 | 0.23 | 0 | 0.22 | 0.19 | 0.35 | 0.42 | 0.30 | 0.10 | 0.24 | 0.28 | 0.55 | 0.44 |
| DFCOMB.sensitive | 0.11 | 0.05 | 0.14 | 0.09 | 0 | 0.14 | 0.08 | 0.23 | 0.25 | 0.23 | 0.06 | 0.15 | 0.21 | 0.45 | 0.26 |
| gCRM.b | 0.27 | 0.27 | 0.24 | 0.25 | 0 | 0.29 | 0.21 | 0.34 | 0.23 | 0.42 | 0.16 | 0.37 | 0.38 | 0.54 | 0.35 |
| gCRM.pi | 0.31 | 0.28 | 0.27 | 0.25 | 0 | 0.28 | 0.17 | 0.36 | 0.23 | 0.37 | 0.17 | 0.31 | 0.41 | 0.47 | 0.46 |
| cBOIN | 0.20 | 0.27 | 0.17 | 0.28 | 0 | 0.22 | 0.20 | 0.27 | 0.21 | 0.38 | 0.15 | 0.28 | 0.33 | 0.37 | 0.32 |
| cKeyboard | 0.20 | 0.27 | 0.17 | 0.28 | 0 | 0.22 | 0.21 | 0.27 | 0.20 | 0.38 | 0.15 | 0.28 | 0.33 | 0.37 | 0.32 |
| bCRM | 0.11 | 0.27 | 0.06 | 0.30 | 0 | 0.11 | 0.17 | 0.23 | 0.16 | 0.33 | 0.07 | 0.21 | 0.15 | 0.40 | 0.33 |
| bCRM.skeleton | 0.15 | 0.26 | 0.08 | 0.23 | 0 | 0.14 | 0.19 | 0.27 | 0.22 | 0.34 | 0.08 | 0.24 | 0.18 | 0.43 | 0.36 |
| bCRM.cut | 0.23 | 0.45 | 0.12 | 0.48 | 0 | 0.19 | 0.30 | 0.34 | 0.28 | 0.45 | 0.13 | 0.35 | 0.23 | 0.50 | 0.47 |
Notes: I2D: design I2D with parameter and specified in ‘Main setting’ of Table 1.
I2D.pq: design I2D with parameter and specified in ‘Alternative setting’.
Copula: design Copula with parameter and specified in ‘Main setting’.
Copula.pq: design Copula with parameter and specified in ‘Alternative setting’.
Hierarchy: design Hierarchy with and specified in ‘Main setting’.
Hierarchy.pi: design Hierarchy with and specified in ‘Alternative setting’.
POCRM: design POCRM with skeleton specified in ‘Main setting’.
POCRM.skeleton: design POCRM with skeleton specified in ‘Alternative setting’.
DFCOMB: design DFCOMB with and , and escalation/de-escalation probability cutoff specified in ‘Main setting’.
DFCOMB.pq: design DFCOMB with and specified in ‘Alternative setting’, and escalation/de-escalation probability cutoff specified in ‘Main setting’.
DFCOMB.cut: design DFCOMB with and specified in ‘Main setting’, and escalation/de-escalation probability cutoff specified in ‘Alternative setting’.
DFCOMB.sensitive: design DFCOMB with and , and escalation/de-escalation probability cutoff specified in ‘Main setting’, but target toxicity interval boundaries suggested by one of our reviewers.
gCRM: design gCRM with and specified in ‘Main setting’.
gCRM.pi: design gCRM with and specified in ‘Alternative setting’.
bCRM: design bCRM with skeleton, and escalation/de-escalation probability cutoff specified in ‘Main setting’.
bCRM.skeleton: design bCRM with skeleton specified in ‘Alternative setting’, and escalation/de-escalation probability cutoff specified in ‘Main setting’.
bCRM.cut: design bCRM with skeleton specified in ‘Main setting’, and escalation/de-escalation probability cutoff specified in ‘Alternative setting’.
I2D shows unstable performance in MTD identification across the simulation scenarios. Under extreme conditions such as scenarios 4 and 5, its correct MTD selection is among the best, whereas under scenarios such as 1, 3, 8, 9, and 15, its correct MTD selection is among the worst. With the alternative toxicity profiles of the individual agents (p_j and q_k), some scenarios had improved performance while others had worse, but its overall characteristics remain the same. Part of the reason for the unstable performance is that I2D always starts its model-based part with agent B at its lowest dose, no matter what happens in the start-up phase. The starting dose combination of the model-based part can therefore be very far from the location of the true MTDs, which makes it harder for I2D to identify them. Overall, I2D is not an aggressive design, as its over-toxic MTD selection and over-toxic patient assignment are relatively small under most scenarios.
Copula performed poorly in MTD identification, and its percentages of over-toxic dose selection and of patients assigned to over-toxic doses are among the worst in several scenarios. This indicates that Copula is quite aggressive, as it is more likely to select higher dose combinations as the MTD. With the alternative toxicity profiles of the individual agents (p and q), Copula's performance fluctuated across scenarios, but the overall poor performance and aggressiveness persisted. One potential reason for the unsatisfactory performance is the limited flexibility of the executable file, which does not allow parameters such as the escalation and de-escalation probability cutoffs to be changed; it is therefore reasonable to argue that the default values (0.8 and 0.45) are not optimal for some scenarios. On the other hand, since the true dose-toxicity matrix is unknown in real life, it is not feasible to obtain a single parameter set that achieves the best performance under all scenarios through simulation-based calibration.
Hierarchy is quite aggressive overall in trial conduct, as it has the worst proportion of patients assigned to over-toxic doses under several scenarios. Incorrect prior guesses of the dose combinations' toxicity probabilities (π) performed much worse than correct ones in most scenarios. A possible reason for the aggressiveness is that, unlike most other designs, Hierarchy allows simultaneous dose escalation of both agents during trial conduct. Another point worth emphasizing is that, despite a high proportion of patients assigned to over-toxic doses, Hierarchy did not outperform other designs in correct MTD selection. Features of Hierarchy such as omitting the interaction effect between agents and the absence of a start-up phase may also contribute to this poor performance.
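To make the contrast with one-agent-at-a-time escalation concrete, the short Python sketch below (our own illustration, not code from any of the evaluated designs; the matrix dimensions and current dose indices are hypothetical) enumerates the dose combinations reachable by a single escalation step, with and without allowing both agents to be escalated simultaneously.

```python
# Illustrative only: admissible escalation moves from combination (j, k) in a
# J x K dose matrix. Most designs escalate one agent at a time; Hierarchy-type
# designs additionally allow the diagonal move that escalates both agents at once.
def admissible_escalations(j, k, J, K, allow_simultaneous=False):
    """Return the dose combinations reachable by one escalation step (0-indexed)."""
    moves = []
    if j + 1 < J:
        moves.append((j + 1, k))          # escalate agent A only
    if k + 1 < K:
        moves.append((j, k + 1))          # escalate agent B only
    if allow_simultaneous and j + 1 < J and k + 1 < K:
        moves.append((j + 1, k + 1))      # escalate both agents simultaneously
    return moves

# Hypothetical 3 x 5 dose matrix, current combination (1, 2):
print(admissible_escalations(1, 2, 3, 5))                           # [(2, 2), (1, 3)]
print(admissible_escalations(1, 2, 3, 5, allow_simultaneous=True))  # adds (2, 3)
```

Allowing the diagonal move enlarges the jump in toxicity between consecutive assignments, which is one mechanical explanation for the more aggressive behavior observed above.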
POCRM shows satisfactory characteristics across all scenarios except for a low correct MTD selection percentage in scenario 5. The alternative skeleton setting in Table 1 improved this percentage from 0.54 to 0.73 in scenario 5, but it led to much worse performance metrics in scenario 15. Another finding is that POCRM performed well in scenarios (e.g. scenario 9) where the underlying true toxicity ordering of the dose combinations is not among the six orderings we used. Such results ‘validate’ the idea of POCRM: in practice we do not need to specify the correct toxicity ordering; providing orderings close to the correct one is sufficient.
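This robustness to imperfect orderings can be illustrated with a small sketch of the ordering-selection step on which POCRM is built. The Python code below is a simplified, self-contained illustration rather than the pocrm package [35] implementation: the skeleton, the two candidate orderings, the prior standard deviation, and the interim DLT data are all hypothetical. Each ordering maps the common skeleton onto the dose combinations, a one-parameter power working model is assumed, and the ordering with the largest marginal likelihood given the accumulated data (equivalently, the largest posterior weight under equal prior weights) drives the next CRM-style update.

```python
# Simplified illustration of POCRM-style ordering selection (not the pocrm package).
import numpy as np
from scipy import integrate, stats

skeleton = np.array([0.05, 0.10, 0.20, 0.30, 0.40, 0.50])   # hypothetical skeleton
# Two hypothetical candidate orderings: entry i gives the toxicity rank of combination i.
orderings = [np.array([0, 1, 2, 3, 4, 5]),
             np.array([0, 2, 1, 3, 5, 4])]
# Hypothetical interim data: patients treated and DLTs observed at each combination.
n_treated = np.array([3, 3, 3, 0, 0, 0])
n_dlt     = np.array([0, 0, 1, 0, 0, 0])

def marginal_likelihood(order, sigma=1.15):
    """Binomial likelihood under p_i = s_i^exp(a), integrated over a N(0, sigma^2) prior on a."""
    s = skeleton[order]                    # skeleton value assigned to each combination
    def integrand(a):
        p = s ** np.exp(a)
        lik = np.prod(p ** n_dlt * (1.0 - p) ** (n_treated - n_dlt))
        return lik * stats.norm.pdf(a, scale=sigma)
    value, _ = integrate.quad(integrand, -10.0, 10.0)
    return value

# With equal prior weights on the orderings, posterior weights are proportional
# to the marginal likelihoods.
weights = np.array([marginal_likelihood(o) for o in orderings])
print("posterior ordering weights:", weights / weights.sum())
print("ordering selected for the next CRM update:", int(np.argmax(weights)))
```

Because several plausible orderings compete at every update, a single mis-specified ordering is simply down-weighted rather than derailing the trial, which is consistent with the behavior observed in scenario 9.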
DFCOMB performed poorly in correct MTD selection under several scenarios, but overall it is not an aggressive design, as its percentages of over-toxic dose selection and of patients assigned to over-toxic doses are relatively small under most scenarios. With the alternative escalation and de-escalation probability cutoffs, correct MTD selection improved in some scenarios (e.g. scenarios 9 and 10) and worsened in others (e.g. scenarios 6 and 8). In addition, with the alternative specification (0.6 and 0.6 as the escalation and de-escalation probability cutoffs) we observed a higher proportion of patients assigned to over-toxic doses but better MTD identification. The worse safety is expected, as the alternative cutoff pair makes DFCOMB more willing to escalate and less willing to de-escalate; with more patients assigned to over-toxic doses, the dose toxicity probabilities are estimated more accurately, which in turn improves MTD identification. With the alternative toxicity profiles of the individual agents (p and q), a few scenarios showed obvious impact: correct MTD selection improved from 0.54 to 0.69 in scenario 1, from 0.44 to 0.57 in scenario 11, and from 0.18 to 0.35 in scenario 14, but it dropped sharply from 0.47 to 0.14 in scenario 10 and from 0.67 to 0.45 in scenario 15. With the ‘sensitivity run’ of target toxicity interval boundaries suggested by one of our reviewers, we observed slightly worse correct MTD selection in some scenarios. These results imply that the optimal design parameters are scenario-dependent; it is therefore not feasible to calibrate them through simulation in real-life clinical trials.
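To make the effect of the cutoff pair explicit, the following sketch is a deliberately simplified, model-free illustration (a Beta posterior for a single combination, not the joint two-agent model of DFCOMB or Copula) of the generic rule shared by these designs: escalate when the posterior probability that the current combination is below the target toxicity exceeds the escalation cutoff, and de-escalate when the posterior probability of exceeding the target passes the de-escalation cutoff. The cutoff pairs (0.80, 0.45) and (0.60, 0.60) echo the values discussed above; the interim data are hypothetical.

```python
# Simplified, model-free illustration of a cutoff-based escalation/de-escalation rule.
# A Beta(1, 1) prior is updated with the DLTs observed at the current combination only;
# the joint two-agent models used by the actual designs are deliberately omitted.
from scipy import stats

def decision(n_dlt, n_treated, target=0.30, c_e=0.80, c_d=0.45):
    """Return 'escalate', 'de-escalate', or 'stay' for the current combination."""
    posterior = stats.beta(1 + n_dlt, 1 + n_treated - n_dlt)
    p_under = posterior.cdf(target)        # Pr(toxicity < target | data)
    p_over = 1.0 - p_under                 # Pr(toxicity > target | data)
    if p_under > c_e:
        return "escalate"
    if p_over > c_d:
        return "de-escalate"
    return "stay"

# Same interim data (1 DLT among 6 patients), two cutoff pairs:
print(decision(1, 6, c_e=0.80, c_d=0.45))  # -> 'stay'
print(decision(1, 6, c_e=0.60, c_d=0.60))  # -> 'escalate' (looser escalation cutoff)
```

Lowering the escalation cutoff while raising the de-escalation cutoff pushes the same data toward escalation, which is consistent with the higher proportion of over-dosed patients observed under the alternative specification.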
Design gCRM performed well in most scenarios. Comparing results under correct and incorrect prior guesses of the dose combinations' toxicity probabilities (π), incorrect inputs led to worse operating characteristics in some scenarios and similar performance in the others. Interestingly, although gCRM allows simultaneous dose escalation of both agents, it did not show much aggressiveness: its percentages of over-toxic dose selection and of patients assigned to over-toxic doses are not among the largest under most scenarios. One possible reason is that gCRM assigns single patients rather than cohorts during trial conduct, so each time an over-toxic dose combination is assigned, only one patient rather than a cohort of several patients receives it. From this perspective, gCRM can be viewed as more ‘flexible’ in the dose-finding process, and this flexibility may dilute its aggressiveness.
Design bCRM is another design whose performance is unstable across scenarios: its operating characteristics in scenarios 4 and 5 are among the worst, while its metrics in the other scenarios are acceptable. Similar to DFCOMB, the alternative escalation and de-escalation probability cutoffs led to more patients assigned to over-toxic doses, but correct MTD selection did not improve. The alternative skeleton setting also influences bCRM: it improved performance in scenarios 4 and 5 but worsened it in scenarios 9 and 10. Therefore, unstable performance remains an issue even with the alternative skeleton setting or escalation and de-escalation probability cutoffs.
Both cBOIN and cKeyboard performed well across all scenarios. They are not always the top performers, but their operating characteristics are never among the worst. This property is especially important in real-life clinical trials, where we do not know which scenario reflects the truth; cBOIN and cKeyboard can therefore be expected to deliver satisfactory performance in practice.
4.2. When the maximum sample size is 30
Results using 30 as the maximum sample size are shown in Supplementary Materials Tables 5 to 12. All of the findings discussed above persist when the maximum sample size is reduced from 60 to 30. Additionally, in Supplementary Materials Table 5, POCRM with the alternative skeleton had an unsatisfactory correct MTD selection percentage in scenario 15 with a maximum sample size of 30, indicating that parameter calibration is not feasible for POCRM either.
5. Discussion
Despite recent advances in novel statistical designs for combinational agents, we found that these designs are seldom cited or used in ongoing clinical trials. In dose-finding studies of ‘combinational agents’, investigators often conduct dose-finding for one agent while keeping the second agent fixed.
Riviere et al. [27] reviewed 543 clinical trial papers published between 2011 and 2013 that investigated combinational agents. Among these papers, 162 dose-escalated at least two agents, and the remaining 381 dose-escalated only one agent while keeping the others fixed. Only one of the 543 papers used a design intended for combinational agents. On https://ClinicalTrials.gov/, we found 591 phase I/early phase I interventional studies of combinational agents in the U.S. with posted trial results and primary completion dates after 1/1/2010; however, the nine designs we evaluate here were cited in fewer than five trial papers. Although these 591 trials include some that have yet to be published, optimal designs for combination therapies appear to be underutilized. The discrepancy between the low uptake of novel designs in clinical practice and the effort devoted to developing better designs should be reconciled.
There are several barriers to implementing better designs in clinical trials of combinational agents. First, investigators have little practical guidance on design selection. Second, model-based designs are not easily understood and are relatively complicated to implement, as they usually require strong assumptions, parameter calibration, and ongoing statistical support to update the toxicity probability estimates. In addition, the start-up phases of different model-based designs can differ considerably, and some designs' start-up phases can substantially influence their operating characteristics. Such complexity adds another barrier to the broader use of model-based designs. Motivated by these hurdles, our simulation study aims to provide practical recommendations to investigators designing phase I clinical trials and to explore the impact of different design parameters when running model-based designs.
From our simulation results, we observed considerable performance fluctuations for several model-based designs across scenarios. Such instability may stem from their specific parametric dose-toxicity assumptions: when the assumed relationship fits the true scenario, these designs can perform favorably, and vice versa. Overall, POCRM, gCRM, cBOIN, and cKeyboard perform better than the other designs with respect to our evaluation metrics, and we recommend them for future combination dose-finding studies.
From a practical perspective, we would like to promote broader use of cBOIN and cKeyboard for combination trials, for several reasons. First, cBOIN and cKeyboard showed stable operating characteristics in all scenarios, which is crucial in practice where the true dose-toxicity profile is unknown. Second, they are convenient, requiring neither parameter calibration nor prior information about the agents. Finally, they are much easier to implement because the dose escalation/de-escalation rule can be tabulated before trial conduct (similar to the conventional 3+3 design). This feature is ideal for investigators who prefer the 3+3 design over model-based designs for its simplicity, even though the 3+3 design has lower accuracy in MTD identification and exposes more patients to subtherapeutic doses [25,29]. Our findings are consistent with a recently published study from the ASA biopharmaceutical working group [21], which evaluated the accuracy and safety of various phase I designs for combinational agents. For the setting of finding a single MTD, that study also concludes that combination BOIN is more attractive than algorithm-based and model-based designs when planning phase I clinical trials for combinational agents.
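To illustrate why such a table can be produced before the trial, the sketch below computes the BOIN escalation/de-escalation boundaries following Liu and Yuan [20], assuming the common defaults φ1 = 0.6φ and φ2 = 1.4φ, and prints the resulting decision table; cBOIN [18] applies the same boundaries within the dose-combination matrix, with an additional rule for choosing among admissible neighboring combinations. The target of 0.30 and the patient counts shown are hypothetical.

```python
# Sketch of the BOIN boundary calculation and the resulting pre-trial decision table.
import numpy as np

def boin_boundaries(phi, phi1=None, phi2=None):
    """Escalation (lambda_e) and de-escalation (lambda_d) boundaries of BOIN."""
    phi1 = 0.6 * phi if phi1 is None else phi1   # highest sub-therapeutic toxicity rate
    phi2 = 1.4 * phi if phi2 is None else phi2   # lowest over-toxic toxicity rate
    lam_e = np.log((1 - phi1) / (1 - phi)) / np.log(phi * (1 - phi1) / (phi1 * (1 - phi)))
    lam_d = np.log((1 - phi) / (1 - phi2)) / np.log(phi2 * (1 - phi) / (phi * (1 - phi2)))
    return lam_e, lam_d

phi = 0.30                                  # hypothetical target DLT probability
lam_e, lam_d = boin_boundaries(phi)         # approximately 0.236 and 0.358
print(f"escalate if observed DLT rate <= {lam_e:.3f}, de-escalate if >= {lam_d:.3f}")

# Decision table by number of patients treated at the current dose combination.
for n in range(3, 16, 3):
    escalate_if = int(np.floor(lam_e * n))      # largest DLT count still allowing escalation
    deescalate_if = int(np.ceil(lam_d * n))     # smallest DLT count forcing de-escalation
    print(f"n = {n:2d}: escalate if DLTs <= {escalate_if}, de-escalate if DLTs >= {deescalate_if}")
```

Because the thresholds depend only on the target toxicity probability and the number of patients treated at the current combination, the entire rule can be tabulated and handed to the clinical team before enrollment begins.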
In addition to evaluating the nine designs, we explored the impact of four design parameters that are commonly encountered in model-based designs and for which no general recommendations exist: monotherapy toxicity profiles, skeleton settings, dose escalation/de-escalation probability cutoffs, and prior guesses of the dose combinations' toxicity probabilities. After re-running the designs with alternative parameter settings, we found that the dose escalation/de-escalation probability cutoffs had negligible impact on operating characteristics in all scenarios, whereas all other parameters affected design performance. When monotherapy toxicity profiles are available, their impact on design performance is not a major concern. However, accurate guesses of the dose combinations' toxicity probabilities are often difficult to provide, and it is almost impossible to calibrate the skeleton setting through simulation in practice without knowing the true toxicity profile of the dose combinations.
Lastly, we repeated all simulations with a different maximum sample size and observed that almost all findings are consistent.
Our simulation study also has limitations. Some designs, such as gCRM, fix their cohort size at one, making the performance comparison with designs that allow flexible cohort sizes less fair.
Our hope is that this paper will contribute to the appropriate and responsible use of study designs for phase I trials of combinational agents. Heightened awareness of these new designs can only improve trial outcomes.
Funding Statement
Dr. Sayour is supported by an NCI award R37CA251978.
Disclosure statement
No conflict of interest was reported by the authors.
References
1. Babb J., Rogatko A., and Zacks S., Cancer phase I clinical trials: Efficient dose escalation with overdose control, Stat. Med. 17 (1998), pp. 1103–1120.
2. Braun T.M. and Jia N., A generalized continual reassessment method for two-agent phase I trials, Stat. Biopharm. Res. 5 (2013), pp. 105–115.
3. Braun T.M. and Wang S., A hierarchical Bayesian design for phase I trials of novel combinations of cancer therapeutic agents, Biometrics 66 (2010), pp. 805–812.
4. Bril G., Dykstra R., Pillers C., and Robertson T., Algorithm AS 206: Isotonic regression in two independent variables, J. R. Stat. Soc. Ser. C (Appl. Stat.) 33 (1984), pp. 352–357.
5. Conaway M.R., Dunbar S., and Peddada S.D., Designs for single- or multiple-agent phase I trials, Biometrics 60 (2004), pp. 661–669.
6. Diniz M.A., Kim S., and Tighiouart M., A Bayesian adaptive design in cancer phase I trials using dose combinations in the presence of a baseline covariate, J. Probab. Stat. 2018 (2018), pp. 1–11.
7. Ezzalfani M., How to design a dose-finding study on combined agents: Choice of design and development of R functions, PLoS ONE 14 (2019), Article ID e0224940.
8. Harrington J.A., Wheeler G.M., Sweeting M.J., Mander A.P., and Jodrell D.I., Adaptive designs for dual-agent phase I dose-escalation studies, Nat. Rev. Clin. Oncol. 10 (2013), p. 277.
9. Hirakawa A., Hamada C., and Matsui S., A dose-finding approach based on shrunken predictive probability for combinations of two agents in phase I trials, Stat. Med. 32 (2013), pp. 4515–4525.
10. Hirakawa A., Wages N.A., Sato H., and Matsui S., A comparative study of adaptive dose-finding designs for phase I oncology trials of combination therapies, Stat. Med. 34 (2015), pp. 3194–3213.
11. Ivanova A., Montazer-Haghighi A., Mohanty S.G., and Durham S.D., Improved up-and-down designs for phase I trials, Stat. Med. 22 (2003), pp. 69–82.
12. Ivanova A. and Kim S.H., Dose finding for continuous and ordinal outcomes with a monotone objective function: A unified approach, Biometrics 65 (2009), pp. 307–315.
13. Ivanova A. and Wang K., A non-parametric approach to the design and analysis of two-dimensional dose-finding trials, Stat. Med. 23 (2004), pp. 1861–1870.
14. Korn E.L. and Simon R., Using the tolerable-dose diagram in the design of phase I combination chemotherapy trials, J. Clin. Oncol. 11 (1993), pp. 794–801.
15. Lee S.M. and Cheung Y.K., Model calibration in the continual reassessment method, Clin. Trials 6 (2009), pp. 227–238.
16. Lee B.L. and Fan S.K., A two-dimensional search algorithm for dose-finding trials of two agents, J. Biopharm. Stat. 22 (2012), pp. 802–818.
17. Lin R. and Yin G., Bootstrap aggregating continual reassessment method for dose finding in drug-combination trials, Ann. Appl. Stat. 10 (2016), pp. 2349–2376.
18. Lin R. and Yin G., Bayesian optimal interval design for dose finding in drug-combination trials, Stat. Methods Med. Res. 26 (2017), pp. 2155–2167.
19. Liu S. and Ning J., A Bayesian dose-finding design for drug combination trials with delayed toxicities, Bayesian Anal. 8 (2013), p. 703.
20. Liu S. and Yuan Y., Bayesian optimal interval designs for phase I clinical trials, J. R. Stat. Soc. Ser. C (Appl. Stat.) 64 (2015), pp. 507–523.
21. Liu R., Yuan Y., Sen S., Yang X., Jiang Q., Li X., Lu C., Göneng M., Tian H., Zhou H., and Lin R., Accuracy and safety of novel designs for phase I drug-combination oncology trials, Stat. Biopharm. Res. (2022), pp. 1–19.
22. Love S.B., Brown S., Weir C.J., Harbron C., Yap C., Gaschler-Markefski B., Matcham J., Caffrey L., McKevitt C., Clive S., and Craddock C., Embracing model-based designs for dose-finding trials, Br. J. Cancer 117 (2017), pp. 332–339.
23. O'Quigley J., Pepe M., and Fisher L., Continual reassessment method: A practical design for phase 1 clinical trials in cancer, Biometrics 46 (1990), pp. 33–48.
24. Pan H., Lin R., Zhou Y., and Yuan Y., Keyboard design for phase I drug-combination trials, Contemp. Clin. Trials 92 (2020), p. 105972.
25. Reiner E., Paoletti X., and O'Quigley J., Operating characteristics of the standard phase I clinical trial design, Comput. Stat. Data Anal. 30 (1999), pp. 303–315.
26. Riviere M.K., Dubois F., and Zohar S., Competing designs for drug combination in phase I dose-finding clinical trials, Stat. Med. 34 (2015), pp. 1–12.
27. Riviere M.K., Le Tourneau C., Paoletti X., Dubois F., and Zohar S., Designs of drug-combination phase I trials in oncology: A systematic review of the literature, Ann. Oncol. 26 (2015), pp. 669–674.
28. Riviere M.K., Yuan Y., Dubois F., and Zohar S., A Bayesian dose-finding design for drug combination clinical trials based on the logistic model, Pharm. Stat. 13 (2014), pp. 247–257.
29. Simon R., Rubinstein L., Arbuck S.G., Christian M.C., Freidlin B., and Collins J., Accelerated titration designs for phase I clinical trials in oncology, J. Natl. Cancer Inst. 89 (1997), pp. 1138–1147.
30. Thall P.F., Millikan R.E., Mueller P., and Lee S.J., Dose-finding with two agents in phase I oncology trials, Biometrics 59 (2003), pp. 487–496.
31. Tighiouart M., Li Q., and Rogatko A., A Bayesian adaptive design for estimating the maximum tolerated dose curve using drug combinations in cancer phase I clinical trials, Stat. Med. 36 (2017), pp. 280–290.
32. Wages N.A. and Conaway M.R., Specifications of a continual reassessment method design for phase I trials of combined drugs, Pharm. Stat. 12 (2013), pp. 217–224.
33. Wages N.A., Conaway M.R., and O'Quigley J., Continual reassessment method for partial ordering, Biometrics 67 (2011), pp. 1555–1563.
34. Wages N.A., Conaway M.R., and O'Quigley J., Dose-finding design for multi-drug combinations, Clin. Trials 8 (2011), pp. 380–389.
35. Wages N.A. and Varhegyi N., pocrm: An R package for phase I trials of combinations of agents, Comput. Methods Programs Biomed. 112 (2013), pp. 211–218.
36. Wang K. and Ivanova A., Two-dimensional dose finding in discrete dose space, Biometrics 61 (2005), pp. 217–222.
37. Yan F., Mandrekar S.J., and Yuan Y., Keyboard: A novel Bayesian toxicity probability interval design for phase I clinical trials, Clin. Cancer Res. 23 (2017), pp. 3994–4003.
38. Yin G. and Yuan Y., A latent contingency table approach to dose finding for combinations of two agents, Biometrics 65 (2009), pp. 866–875.
39. Yin G. and Yuan Y., Bayesian dose finding in oncology for drug combinations by copula regression, J. R. Stat. Soc. Ser. C (Appl. Stat.) 58 (2009), pp. 211–224.
40. Yuan Y., Hess K.R., Hilsenbeck S.G., and Gilbert M.R., Bayesian optimal interval design: A simple and well-performing design for phase I oncology trials, Clin. Cancer Res. 22 (2016), pp. 4291–4301.